[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-advimman--lama":3,"tool-advimman--lama":64},[4,17,27,35,43,56],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":16},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,3,"2026-04-05T11:01:52",[13,14,15],"开发框架","图像","Agent","ready",{"id":18,"name":19,"github_repo":20,"description_zh":21,"stars":22,"difficulty_score":23,"last_commit_at":24,"category_tags":25,"status":16},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",138956,2,"2026-04-05T11:33:21",[13,15,26],"语言模型",{"id":28,"name":29,"github_repo":30,"description_zh":31,"stars":32,"difficulty_score":23,"last_commit_at":33,"category_tags":34,"status":16},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",107662,"2026-04-03T11:11:01",[13,14,15],{"id":36,"name":37,"github_repo":38,"description_zh":39,"stars":40,"difficulty_score":23,"last_commit_at":41,"category_tags":42,"status":16},3704,"NextChat","ChatGPTNextWeb\u002FNextChat","NextChat 是一款轻量且极速的 AI 助手，旨在为用户提供流畅、跨平台的大模型交互体验。它完美解决了用户在多设备间切换时难以保持对话连续性，以及面对众多 AI 模型不知如何统一管理的痛点。无论是日常办公、学习辅助还是创意激发，NextChat 都能让用户随时随地通过网页、iOS、Android、Windows、MacOS 或 Linux 端无缝接入智能服务。\n\n这款工具非常适合普通用户、学生、职场人士以及需要私有化部署的企业团队使用。对于开发者而言，它也提供了便捷的自托管方案，支持一键部署到 Vercel 或 Zeabur 等平台。\n\nNextChat 的核心亮点在于其广泛的模型兼容性，原生支持 Claude、DeepSeek、GPT-4 及 Gemini Pro 等主流大模型，让用户在一个界面即可自由切换不同 AI 能力。此外，它还率先支持 MCP（Model Context Protocol）协议，增强了上下文处理能力。针对企业用户，NextChat 提供专业版解决方案，具备品牌定制、细粒度权限控制、内部知识库整合及安全审计等功能，满足公司对数据隐私和个性化管理的高标准要求。",87618,"2026-04-05T07:20:52",[13,26],{"id":44,"name":45,"github_repo":46,"description_zh":47,"stars":48,"difficulty_score":23,"last_commit_at":49,"category_tags":50,"status":16},2268,"ML-For-Beginners","microsoft\u002FML-For-Beginners","ML-For-Beginners 是由微软推出的一套系统化机器学习入门课程，旨在帮助零基础用户轻松掌握经典机器学习知识。这套课程将学习路径规划为 12 周，包含 26 节精炼课程和 52 道配套测验，内容涵盖从基础概念到实际应用的完整流程，有效解决了初学者面对庞大知识体系时无从下手、缺乏结构化指导的痛点。\n\n无论是希望转型的开发者、需要补充算法背景的研究人员，还是对人工智能充满好奇的普通爱好者，都能从中受益。课程不仅提供了清晰的理论讲解，还强调动手实践，让用户在循序渐进中建立扎实的技能基础。其独特的亮点在于强大的多语言支持，通过自动化机制提供了包括简体中文在内的 50 
多种语言版本，极大地降低了全球不同背景用户的学习门槛。此外，项目采用开源协作模式，社区活跃且内容持续更新，确保学习者能获取前沿且准确的技术资讯。如果你正寻找一条清晰、友好且专业的机器学习入门之路，ML-For-Beginners 将是理想的起点。",84991,"2026-04-05T10:45:23",[14,51,52,53,15,54,26,13,55],"数据工具","视频","插件","其他","音频",{"id":57,"name":58,"github_repo":59,"description_zh":60,"stars":61,"difficulty_score":10,"last_commit_at":62,"category_tags":63,"status":16},3128,"ragflow","infiniflow\u002Fragflow","RAGFlow 是一款领先的开源检索增强生成（RAG）引擎，旨在为大语言模型构建更精准、可靠的上下文层。它巧妙地将前沿的 RAG 技术与智能体（Agent）能力相结合，不仅支持从各类文档中高效提取知识，还能让模型基于这些知识进行逻辑推理和任务执行。\n\n在大模型应用中，幻觉问题和知识滞后是常见痛点。RAGFlow 通过深度解析复杂文档结构（如表格、图表及混合排版），显著提升了信息检索的准确度，从而有效减少模型“胡编乱造”的现象，确保回答既有据可依又具备时效性。其内置的智能体机制更进一步，使系统不仅能回答问题，还能自主规划步骤解决复杂问题。\n\n这款工具特别适合开发者、企业技术团队以及 AI 研究人员使用。无论是希望快速搭建私有知识库问答系统，还是致力于探索大模型在垂直领域落地的创新者，都能从中受益。RAGFlow 提供了可视化的工作流编排界面和灵活的 API 接口，既降低了非算法背景用户的上手门槛，也满足了专业开发者对系统深度定制的需求。作为基于 Apache 2.0 协议开源的项目，它正成为连接通用大模型与行业专有知识之间的重要桥梁。",77062,"2026-04-04T04:44:48",[15,14,13,26,54],{"id":65,"github_repo":66,"name":67,"description_en":68,"description_zh":69,"ai_summary_zh":69,"readme_en":70,"readme_zh":71,"quickstart_zh":72,"use_case_zh":73,"hero_image_url":74,"owner_login":75,"owner_name":76,"owner_avatar_url":77,"owner_bio":78,"owner_company":79,"owner_location":79,"owner_email":79,"owner_twitter":79,"owner_website":79,"owner_url":80,"languages":81,"stars":98,"forks":99,"last_commit_at":100,"license":101,"difficulty_score":10,"env_os":102,"env_gpu":103,"env_ram":104,"env_deps":105,"category_tags":113,"github_topics":114,"view_count":10,"oss_zip_url":79,"oss_zip_packed_at":79,"status":16,"created_at":133,"updated_at":134,"faqs":135,"releases":166},1006,"advimman\u002Flama","lama","🦙  LaMa Image Inpainting, Resolution-robust Large Mask Inpainting with Fourier Convolutions, WACV 2022","LaMa是一款开源的图像修复工具，专攻填补照片中大面积缺失或损坏区域的任务。它解决了传统方法在处理高分辨率图像（如2000像素以上）或大块缺失区域（比如移除物体后留下的空白）时效果不佳的痛点——许多工具只能处理小范围修复，而LaMa凭借独特的傅里叶卷积技术，即使在训练时仅接触256x256小图，也能在超高分辨率下精准还原细节，尤其擅长修复栅栏、纹理等周期性结构。开发者可轻松集成它到应用中；研究人员能用于图像生成领域的创新实验；设计师可快速清理照片瑕疵；普通用户则能通过Cleanup.pictures等第三方工具一键体验。作为WACV 2022的学术成果，LaMa以高效、易用的特性，为各类用户提供了专业级的图像修复方案，让复杂修复变得简单可靠。","# 🦙 LaMa: Resolution-robust Large Mask Inpainting with Fourier Convolutions\n\nby Roman Suvorov, Elizaveta Logacheva, Anton Mashikhin, \nAnastasia Remizova, Arsenii Ashukha, Aleksei Silvestrov, Naejin Kong, Harshith Goka, Kiwoong Park, Victor Lempitsky.\n\n\u003Cp align=\"center\" style=\"font-size:30px;\">\n  🔥🔥🔥\n  \u003Cbr>\n  \u003Cb>\nLaMa generalizes surprisingly well to much higher resolutions (~2k❗️) than it saw during training (256x256), and achieves excellent performance even in challenging scenarios, e.g. 
completion of periodic structures.\u003C\u002Fb>\n\u003C\u002Fp>\n\n[[Project page](https:\u002F\u002Fadvimman.github.io\u002Flama-project\u002F)] [[arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F2109.07161)] [[Supplementary](https:\u002F\u002Fashukha.com\u002Fprojects\u002Flama_21\u002Flama_supmat_2021.pdf)] [[BibTeX](https:\u002F\u002Fsenya-ashukha.github.io\u002Fprojects\u002Flama_21\u002Fpaper.txt)] [[Casual GAN Papers Summary](https:\u002F\u002Fwww.casualganpapers.com\u002Flarge-masks-fourier-convolutions-inpainting\u002FLaMa-explained.html)]\n \n\u003Cp align=\"center\">\n  \u003Ca href=\"https:\u002F\u002Fcolab.research.google.com\u002Fdrive\u002F15KTEIScUbVZtUP6w2tCDMVpE-b1r9pkZ?usp=drive_link\">\n  \u003Cimg src=\"https:\u002F\u002Fcolab.research.google.com\u002Fassets\u002Fcolab-badge.svg\"\u002F>\n  \u003C\u002Fa>\n      \u003Cbr>\n   Try out in Google Colab \n  \u003Cbr>\n  All Yandex disk links have gone bad; you can download the model from https:\u002F\u002Fdrive.google.com\u002Fdrive\u002Ffolders\u002F1B2x7eQDgecTL0oh3LSIBDGj0fTxs6Ips?usp=sharing \n\u003C\u002Fp>\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fadvimman_lama_readme_e06eeccf0efa.gif\" \u002F>\n\u003C\u002Fp>\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fadvimman_lama_readme_e5fb4788742e.gif\" \u002F>\n\u003C\u002Fp>\n\n\n\n# LaMa development\n(Feel free to share your paper by creating an issue)\n- https:\u002F\u002Fgithub.com\u002Fgeekyutao\u002FInpaint-Anything --- Inpaint Anything: Segment Anything Meets Image Inpainting\n\u003Cp align=\"center\">\n  \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fadvimman_lama_readme_e37234404f87.png\" \u002F>\n\u003C\u002Fp>\n\n- [Feature Refinement to Improve High Resolution Image Inpainting](https:\u002F\u002Farxiv.org\u002Fabs\u002F2206.13644) \u002F [video](https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=gEukhOheWgE) \u002F code https:\u002F\u002Fgithub.com\u002Fadvimman\u002Flama\u002Fpull\u002F112 \u002F by Geomagical Labs ([geomagical.com](https:\u002F\u002Fgeomagical.com))\n\u003Cp align=\"center\">\n  \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fadvimman_lama_readme_0f9fbaf86b7c.png\" \u002F>\n\u003C\u002Fp>\n\n# Non-official 3rd party apps:\n(Feel free to share your app\u002Fimplementation\u002Fdemo by creating an issue)\n\n- https:\u002F\u002Fgithub.com\u002Fenesmsahin\u002Fsimple-lama-inpainting - a simple pip package for LaMa inpainting.\n- https:\u002F\u002Fgithub.com\u002Fmallman\u002FCoreMLaMa - LaMa in Apple's Core ML model format\n- [https:\u002F\u002Fcleanup.pictures](https:\u002F\u002Fcleanup.pictures\u002F) - a simple interactive object removal tool by [@cyrildiagne](https:\u002F\u002Ftwitter.com\u002Fcyrildiagne)\n    - [lama-cleaner](https:\u002F\u002Fgithub.com\u002FSanster\u002Flama-cleaner) by [@Sanster](https:\u002F\u002Fgithub.com\u002FSanster\u002Flama-cleaner) is a self-hosted version of [https:\u002F\u002Fcleanup.pictures](https:\u002F\u002Fcleanup.pictures\u002F)\n- Integrated into [Huggingface Spaces](https:\u002F\u002Fhuggingface.co\u002Fspaces) with [Gradio](https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Fgradio). See demo: [![Hugging Face Spaces](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F%F0%9F%A4%97%20Hugging%20Face-Spaces-blue)](https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fakhaliq\u002Flama) by [@AK391](https:\u002F\u002Fgithub.com\u002FAK391)\n- Telegram bot [@MagicEraserBot](https:\u002F\u002Ft.me\u002FMagicEraserBot) by [@Moldoteck](https:\u002F\u002Fgithub.com\u002FMoldoteck), [code](https:\u002F\u002Fgithub.com\u002FMoldoteck\u002FMagicEraser)\n- [Auto-LaMa](https:\u002F\u002Fgithub.com\u002Fandy971022\u002Fauto-lama) = DE:TR object detection + LaMa inpainting by [@andy971022](https:\u002F\u002Fgithub.com\u002Fandy971022)\n- [LAMA-Magic-Eraser-Local](https:\u002F\u002Fgithub.com\u002Fzhaoyun0071\u002FLAMA-Magic-Eraser-Local) = a standalone inpainting application built with PyQt5 by [@zhaoyun0071](https:\u002F\u002Fgithub.com\u002Fzhaoyun0071)\n- [Hama](https:\u002F\u002Fwww.hama.app\u002F) - object removal with a smart brush which simplifies mask drawing.\n- [ModelScope](https:\u002F\u002Fwww.modelscope.cn\u002Fmodels\u002Fdamo\u002Fcv_fft_inpainting_lama\u002Fsummary) = the largest model community in China, by [@chenbinghui1](https:\u002F\u002Fgithub.com\u002Fchenbinghui1).\n- [LaMa with MaskDINO](https:\u002F\u002Fgithub.com\u002Fqwopqwop200\u002Flama-with-maskdino) = MaskDINO object detection + LaMa inpainting with refinement by [@qwopqwop200](https:\u002F\u002Fgithub.com\u002Fqwopqwop200).\n- [CoreMLaMa](https:\u002F\u002Fgithub.com\u002Fmallman\u002FCoreMLaMa) - a script to convert Lama Cleaner's port of LaMa to Apple's Core ML model format.\n\n
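As a quick way to call LaMa from Python, the `simple-lama-inpainting` package listed above wraps the pretrained model behind a one-call interface. A minimal sketch, assuming that package's documented API (`SimpleLama` called with a PIL RGB image and an L-mode mask) and hypothetical file names:\n\n```\n# pip install simple-lama-inpainting\nfrom PIL import Image\nfrom simple_lama_inpainting import SimpleLama\n\nsimple_lama = SimpleLama()  # fetches the big-lama checkpoint on first use\n\nimage = Image.open('scene.png').convert('RGB')\nmask = Image.open('scene_mask001.png').convert('L')  # white = pixels to inpaint\n\nresult = simple_lama(image, mask)  # returns a PIL image of the same size\nresult.save('scene_inpainted.png')\n```\n\n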
# Environment setup\n\n❗️❗️❗️ All Yandex disk links have gone bad; you can download the model from [Google Drive](https:\u002F\u002Fdrive.google.com\u002Fdrive\u002Ffolders\u002F1B2x7eQDgecTL0oh3LSIBDGj0fTxs6Ips?usp=sharing) ❗️❗️❗️\n\nClone the repo:\n`git clone https:\u002F\u002Fgithub.com\u002Fadvimman\u002Flama.git`\n\nThere are three options for setting up the environment:\n\n1. Python virtualenv:\n\n    ```\n    virtualenv inpenv --python=\u002Fusr\u002Fbin\u002Fpython3\n    source inpenv\u002Fbin\u002Factivate\n    pip install torch==1.8.0 torchvision==0.9.0\n    \n    cd lama\n    pip install -r requirements.txt \n    ```\n\n2. Conda\n    \n    ```\n    # Install conda for Linux; for other OSes, download miniconda at https:\u002F\u002Fdocs.conda.io\u002Fen\u002Flatest\u002Fminiconda.html\n    wget https:\u002F\u002Frepo.anaconda.com\u002Fminiconda\u002FMiniconda3-latest-Linux-x86_64.sh\n    bash Miniconda3-latest-Linux-x86_64.sh -b -p $HOME\u002Fminiconda\n    $HOME\u002Fminiconda\u002Fbin\u002Fconda init bash\n\n    cd lama\n    conda env create -f conda_env.yml\n    conda activate lama\n    conda install pytorch torchvision torchaudio cudatoolkit=10.2 -c pytorch -y\n    pip install pytorch-lightning==1.2.9\n    ```\n \n3. Docker: No actions are needed 🎉.\n\n# Inference \u003Ca name=\"prediction\">\u003C\u002Fa>\n\nRun\n```\ncd lama\nexport TORCH_HOME=$(pwd) && export PYTHONPATH=$(pwd)\n```\n\n**1. Download pre-trained models**\n\nThe best model (Places2, Places Challenge):\n    \n```    \ncurl -LJO https:\u002F\u002Fhuggingface.co\u002Fsmartywu\u002Fbig-lama\u002Fresolve\u002Fmain\u002Fbig-lama.zip\nunzip big-lama.zip\n```\n\nAll models (Places & CelebA-HQ):\n\n```\n# download lama-models.zip manually from\n# https:\u002F\u002Fdrive.google.com\u002Fdrive\u002Ffolders\u002F1B2x7eQDgecTL0oh3LSIBDGj0fTxs6Ips?usp=drive_link\nunzip lama-models.zip\n```\n\n**2. 
Prepare images and masks**\n\nDownload test images:\n\n```\nunzip LaMa_test_images.zip\n```\n\u003Cdetails>\n \u003Csummary>OR prepare your own data:\u003C\u002Fsummary>\n1) Create masks named `[image_name]_maskXXX[image_suffix]` and put the images and masks in the same folder. \n\n- You can use the [script](https:\u002F\u002Fgithub.com\u002Fadvimman\u002Flama\u002Fblob\u002Fmain\u002Fbin\u002Fgen_mask_dataset.py) for random mask generation. \n- Check the format of the files:\n    ```    \n    image1_mask001.png\n    image1.png\n    image2_mask001.png\n    image2.png\n    ```\n\n2) Specify `image_suffix`, e.g. `.png`, `.jpg`, or `_input.jpg`, in `configs\u002Fprediction\u002Fdefault.yaml`.\n\n\u003C\u002Fdetails>\n\n\n**3. Predict**\n\nOn the host machine:\n\n    python3 bin\u002Fpredict.py model.path=$(pwd)\u002Fbig-lama indir=$(pwd)\u002FLaMa_test_images outdir=$(pwd)\u002Foutput\n\n**OR** in Docker\n  \nThe following command pulls the docker image from Docker Hub and executes the prediction script:\n```\nbash docker\u002F2_predict.sh $(pwd)\u002Fbig-lama $(pwd)\u002FLaMa_test_images $(pwd)\u002Foutput device=cpu\n```\nDocker with CUDA:\n```\nbash docker\u002F2_predict_with_gpu.sh $(pwd)\u002Fbig-lama $(pwd)\u002FLaMa_test_images $(pwd)\u002Foutput\n```\n\n**4. Predict with Refinement**\n\nOn the host machine:\n\n    python3 bin\u002Fpredict.py refine=True model.path=$(pwd)\u002Fbig-lama indir=$(pwd)\u002FLaMa_test_images outdir=$(pwd)\u002Foutput\n\n
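If you prefer to script step 2 instead of downloading the test set, the sketch below writes a binary mask that follows the `[image_name]_maskXXX[image_suffix]` convention; per the maintainers' advice in issue #103, masks should be strictly binary (0 or 255), not smoothed. The input file name is hypothetical:\n\n```\nimport numpy as np\nfrom PIL import Image\n\n# Your own photo (hypothetical name); keep the image and mask in the same folder.\nimage = Image.open('image1.png').convert('RGB')\nw, h = image.size\n\n# 255 marks the region to inpaint, 0 keeps the original pixels.\nmask = np.zeros((h, w), dtype=np.uint8)\nmask[h \u002F\u002F 3 : 2 * h \u002F\u002F 3, w \u002F\u002F 3 : 2 * w \u002F\u002F 3] = 255  # central rectangle\n\n# Name matches the [image_name]_maskXXX[image_suffix] convention above.\nImage.fromarray(mask).save('image1_mask001.png')\n```\n\n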
# Train and Eval\n\nMake sure you run:\n\n```\ncd lama\nexport TORCH_HOME=$(pwd) && export PYTHONPATH=$(pwd)\n```\n\nThen download the models for the _perceptual loss_:\n\n    mkdir -p ade20k\u002Fade20k-resnet50dilated-ppm_deepsup\u002F\n    wget -P ade20k\u002Fade20k-resnet50dilated-ppm_deepsup\u002F http:\u002F\u002Fsceneparsing.csail.mit.edu\u002Fmodel\u002Fpytorch\u002Fade20k-resnet50dilated-ppm_deepsup\u002Fencoder_epoch_20.pth\n\n\n## Places\n\n⚠️ NB: the FID\u002FSSIM\u002FLPIPS metric values for Places reported in the LaMa paper are computed on the 30000 images produced in the evaluation section below.\nFor more details on the evaluation data, check [[Section 3. Dataset splits in Supplementary](https:\u002F\u002Fashukha.com\u002Fprojects\u002Flama_21\u002Flama_supmat_2021.pdf#subsection.3.1)]  ⚠️\n\nOn the host machine:\n\n    # Download data from http:\u002F\u002Fplaces2.csail.mit.edu\u002Fdownload.html\n    # Places365-Standard: Train(105GB)\u002FTest(19GB)\u002FVal(2.1GB) from the High-resolution images section\n    wget http:\u002F\u002Fdata.csail.mit.edu\u002Fplaces\u002Fplaces365\u002Ftrain_large_places365standard.tar\n    wget http:\u002F\u002Fdata.csail.mit.edu\u002Fplaces\u002Fplaces365\u002Fval_large.tar\n    wget http:\u002F\u002Fdata.csail.mit.edu\u002Fplaces\u002Fplaces365\u002Ftest_large.tar\n\n    # Unpack train\u002Ftest\u002Fval data and create a .yaml config for it\n    bash fetch_data\u002Fplaces_standard_train_prepare.sh\n    bash fetch_data\u002Fplaces_standard_test_val_prepare.sh\n    \n    # Sample images for test and viz at the end of each epoch\n    bash fetch_data\u002Fplaces_standard_test_val_sample.sh\n    bash fetch_data\u002Fplaces_standard_test_val_gen_masks.sh\n\n    # Run training\n    python3 bin\u002Ftrain.py -cn lama-fourier location=places_standard\n\n    # To evaluate the trained model and report metrics as in our paper,\n    # we need to sample 30k previously unseen images and generate masks for them\n    bash fetch_data\u002Fplaces_standard_evaluation_prepare_data.sh\n    \n    # Infer the model on thick\u002Fthin\u002Fmedium masks at 256 and 512 and run evaluation \n    # like this:\n    python3 bin\u002Fpredict.py \\\n    model.path=$(pwd)\u002Fexperiments\u002F\u003Cuser>_\u003Cdate:time>_lama-fourier_\u002F \\\n    indir=$(pwd)\u002Fplaces_standard_dataset\u002Fevaluation\u002Frandom_thick_512\u002F \\\n    outdir=$(pwd)\u002Finference\u002Frandom_thick_512 model.checkpoint=last.ckpt\n\n    python3 bin\u002Fevaluate_predicts.py \\\n    $(pwd)\u002Fconfigs\u002Feval2_gpu.yaml \\\n    $(pwd)\u002Fplaces_standard_dataset\u002Fevaluation\u002Frandom_thick_512\u002F \\\n    $(pwd)\u002Finference\u002Frandom_thick_512 \\\n    $(pwd)\u002Finference\u002Frandom_thick_512_metrics.csv\n\nDocker: TODO\n\n
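`bin\u002Fevaluate_predicts.py` writes its scores to the CSV path given as the last argument. A minimal sketch for inspecting that file, assuming a standard comma-separated layout (the column set depends on the metrics enabled in `eval2_gpu.yaml`, so none are hard-coded here):\n\n```\nimport csv\n\n# Path produced by the evaluation command above.\nwith open('inference\u002Frandom_thick_512_metrics.csv', newline='') as f:\n    for row in csv.DictReader(f):\n        print(row)  # one dict of metric columns per row\n```\n\n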
## CelebA\nOn the host machine:\n\n    # Make sure you are in the lama folder\n    cd lama\n    export TORCH_HOME=$(pwd) && export PYTHONPATH=$(pwd)\n\n    # Download the CelebA-HQ dataset\n    # Download data256x256.zip from https:\u002F\u002Fdrive.google.com\u002Fdrive\u002Ffolders\u002F11Vz0fqHS2rXDb5pprgTjpD7S2BAJhi1P\n    \n    # unzip & split into train\u002Ftest\u002Fvisualization & create a config for it\n    bash fetch_data\u002Fcelebahq_dataset_prepare.sh\n\n    # generate masks for test and visual_test at the end of each epoch\n    bash fetch_data\u002Fcelebahq_gen_masks.sh\n\n    # Run training\n    python3 bin\u002Ftrain.py -cn lama-fourier-celeba data.batch_size=10\n\n    # Infer the model on thick\u002Fthin\u002Fmedium masks at 256 and run evaluation \n    # like this:\n    python3 bin\u002Fpredict.py \\\n    model.path=$(pwd)\u002Fexperiments\u002F\u003Cuser>_\u003Cdate:time>_lama-fourier-celeba_\u002F \\\n    indir=$(pwd)\u002Fceleba-hq-dataset\u002Fvisual_test_256\u002Frandom_thick_256\u002F \\\n    outdir=$(pwd)\u002Finference\u002Fceleba_random_thick_256 model.checkpoint=last.ckpt\n    \n    \nDocker: TODO\n\n## Places Challenge \n\nOn the host machine:\n\n    # This script downloads multiple .tar files in parallel and unpacks them\n    # Places365-Challenge: Train(476GB) from High-resolution images (to train Big-Lama) \n    bash places_challenge_train_download.sh\n    \n    TODO: prepare\n    TODO: train \n    TODO: eval\n      \nDocker: TODO\n\n## Create your own data\n\nIf you get stuck at one of the following steps, check the bash scripts for data preparation and mask generation in the CelebA-HQ section.\n\n\nOn the host machine:\n\n    # Make sure you are in the lama folder\n    cd lama\n    export TORCH_HOME=$(pwd) && export PYTHONPATH=$(pwd)\n\n    # You need to prepare the following image folders:\n    $ ls my_dataset\n    train\n    val_source # 2000 or more images\n    visual_test_source # 100 or more images\n    eval_source # 2000 or more images\n\n    # LaMa generates random masks for the train data on the fly,\n    # but needs fixed masks for test and visual_test for consistency of evaluation.\n\n    # Suppose we want to evaluate and pick the best models \n    # on a 512x512 val dataset with thick\u002Fthin\u002Fmedium masks, \n    # and your images have the .jpg extension:\n\n    python3 bin\u002Fgen_mask_dataset.py \\\n    $(pwd)\u002Fconfigs\u002Fdata_gen\u002Frandom_\u003Csize>_512.yaml \\ # thick, thin, medium\n    my_dataset\u002Fval_source\u002F \\\n    my_dataset\u002Fval\u002Frandom_\u003Csize>_512\u002F \\ # thick, thin, medium\n    --ext jpg\n\n    # The mask generator will: \n    # 1. resize and crop val images and save them as .png\n    # 2. generate masks\n    \n    ls my_dataset\u002Fval\u002Frandom_medium_512\u002F\n    image1_crop000_mask000.png\n    image1_crop000.png\n    image2_crop000_mask000.png\n    image2_crop000.png\n    ...\n\n    # Generate thick, thin, medium masks for the visual_test folder:\n\n    python3 bin\u002Fgen_mask_dataset.py \\\n    $(pwd)\u002Fconfigs\u002Fdata_gen\u002Frandom_\u003Csize>_512.yaml \\  # thick, thin, medium\n    my_dataset\u002Fvisual_test_source\u002F \\\n    my_dataset\u002Fvisual_test\u002Frandom_\u003Csize>_512\u002F \\ # thick, thin, medium\n    --ext jpg\n    \n\n    ls my_dataset\u002Fvisual_test\u002Frandom_thick_512\u002F\n    image1_crop000_mask000.png\n    image1_crop000.png\n    image2_crop000_mask000.png\n    image2_crop000.png\n    ...\n\n    # Same process for the eval_source image folder:\n    \n    python3 bin\u002Fgen_mask_dataset.py \\\n    $(pwd)\u002Fconfigs\u002Fdata_gen\u002Frandom_\u003Csize>_512.yaml \\  # thick, thin, medium\n    my_dataset\u002Feval_source\u002F \\\n    my_dataset\u002Feval\u002Frandom_\u003Csize>_512\u002F \\ # thick, thin, medium\n    --ext jpg\n    \n\n\n    # Generate a location config file which locates these folders:\n    \n    touch my_dataset.yaml\n    echo \"data_root_dir: $(pwd)\u002Fmy_dataset\u002F\" >> my_dataset.yaml\n    echo \"out_root_dir: $(pwd)\u002Fexperiments\u002F\" >> my_dataset.yaml\n    echo \"tb_dir: $(pwd)\u002Ftb_logs\u002F\" >> my_dataset.yaml\n    mv my_dataset.yaml ${PWD}\u002Fconfigs\u002Ftraining\u002Flocation\u002F\n\n\n    # Check the data config for consistency with the my_dataset folder structure:\n    $ cat ${PWD}\u002Fconfigs\u002Ftraining\u002Fdata\u002Fabl-04-256-mh-dist\n    ...\n    train:\n      indir: ${location.data_root_dir}\u002Ftrain\n      ...\n    val:\n      indir: ${location.data_root_dir}\u002Fval\n      img_suffix: .png\n    visual_test:\n      indir: ${location.data_root_dir}\u002Fvisual_test\n      img_suffix: .png\n\n\n    # Run training\n    python3 bin\u002Ftrain.py -cn lama-fourier location=my_dataset data.batch_size=10\n\n    # Evaluation: the LaMa training procedure picks the best few models according to \n    # scores on my_dataset\u002Fval\u002F \n\n    # To evaluate one of your best models (e.g. 
at epoch=32) \n    # on previously unseen my_dataset\u002Feval, do the following \n    # for thin, thick and medium masks:\n\n    # infer:\n    python3 bin\u002Fpredict.py \\\n    model.path=$(pwd)\u002Fexperiments\u002F\u003Cuser>_\u003Cdate:time>_lama-fourier_\u002F \\\n    indir=$(pwd)\u002Fmy_dataset\u002Feval\u002Frandom_\u003Csize>_512\u002F \\\n    outdir=$(pwd)\u002Finference\u002Fmy_dataset\u002Frandom_\u003Csize>_512 \\\n    model.checkpoint=epoch32.ckpt\n\n    # metrics calculation:\n    python3 bin\u002Fevaluate_predicts.py \\\n    $(pwd)\u002Fconfigs\u002Feval2_gpu.yaml \\\n    $(pwd)\u002Fmy_dataset\u002Feval\u002Frandom_\u003Csize>_512\u002F \\\n    $(pwd)\u002Finference\u002Fmy_dataset\u002Frandom_\u003Csize>_512 \\\n    $(pwd)\u002Finference\u002Fmy_dataset\u002Frandom_\u003Csize>_512_metrics.csv\n\n    \n**OR** in Docker:\n\n    TODO: train\n    TODO: eval\n    \n# Hints\n\n### Generate different kinds of masks\nThe following command will execute a script that generates random masks.\n\n    bash docker\u002F1_generate_masks_from_raw_images.sh \\\n        configs\u002Fdata_gen\u002Frandom_medium_512.yaml \\\n        \u002Fdirectory_with_input_images \\\n        \u002Fdirectory_where_to_store_images_and_masks \\\n        --ext png\n\nThe test data generation command stores images in the format suitable for [prediction](#prediction).\n\nThe table below describes which configs we used to generate the different test sets in the paper.\nNote that we *do not fix a random seed*, so the results will be slightly different each time.\n\n|        | Places 512x512         | CelebA 256x256         |\n|--------|------------------------|------------------------|\n| Narrow | random_thin_512.yaml   | random_thin_256.yaml   |\n| Medium | random_medium_512.yaml | random_medium_256.yaml |\n| Wide   | random_thick_512.yaml  | random_thick_256.yaml  |\n\nFeel free to change the config path (argument #1) to any other config in `configs\u002Fdata_gen` \nor adjust the config files themselves.\n\n### Override parameters in configs\nYou can also override config parameters like this (the .yaml file extension is omitted):\n\n    python3 bin\u002Ftrain.py -cn \u003Cconfig> data.batch_size=10 run_title=my-title\n\n### Model options \nConfig names for the models from the paper (substitute into the training command): \n\n    * big-lama\n    * big-lama-regular\n    * lama-fourier\n    * lama-regular\n    * lama_small_train_masks\n\nThese are located in the configs\u002Ftraining\u002F folder.\n\n
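To sweep several of the configs listed above programmatically, a small driver can shell out to `bin\u002Ftrain.py` with the same Hydra-style overrides. A minimal sketch, assuming it runs from the repo root with a prepared `my_dataset` location config as described earlier:\n\n```\nimport os\nimport subprocess\n\n# Same environment the README exports before training.\nenv = dict(os.environ, TORCH_HOME=os.getcwd(), PYTHONPATH=os.getcwd())\n\n# Config names from the list above; the .yaml extension is omitted.\nfor config in ['lama-fourier', 'lama-regular']:\n    subprocess.run(\n        ['python3', 'bin\u002Ftrain.py', '-cn', config,\n         'location=my_dataset', 'data.batch_size=10',\n         'run_title=sweep-' + config],\n        env=env, check=True)\n```\n\n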
### Links\n- All the data (models, test images, etc.) https:\u002F\u002Fdisk.yandex.ru\u002Fd\u002FAmdeG-bIjmvSug\n- Test images from the paper https:\u002F\u002Fdisk.yandex.ru\u002Fd\u002FxKQJZeVRk5vLlQ\n- The pre-trained models https:\u002F\u002Fdisk.yandex.ru\u002Fd\u002FEgqaSnLohjuzAg\n- The models for perceptual loss https:\u002F\u002Fdisk.yandex.ru\u002Fd\u002FncVmQlmT_kTemQ\n- Our training logs are available at https:\u002F\u002Fdisk.yandex.ru\u002Fd\u002F9Bt1wNSDS4jDkQ\n\n\n### Training time & resources\n\nTODO\n\n## Acknowledgments\n\n* Segmentation code and models are from [CSAILVision](https:\u002F\u002Fgithub.com\u002FCSAILVision\u002Fsemantic-segmentation-pytorch).\n* The LPIPS metric is from [richzhang](https:\u002F\u002Fgithub.com\u002Frichzhang\u002FPerceptualSimilarity)\n* SSIM is from [Po-Hsun-Su](https:\u002F\u002Fgithub.com\u002FPo-Hsun-Su\u002Fpytorch-ssim)\n* FID is from [mseitzer](https:\u002F\u002Fgithub.com\u002Fmseitzer\u002Fpytorch-fid)\n\n## Citation\nIf you found this code helpful, please consider citing: \n```\n@article{suvorov2021resolution,\n  title={Resolution-robust Large Mask Inpainting with Fourier Convolutions},\n  author={Suvorov, Roman and Logacheva, Elizaveta and Mashikhin, Anton and Remizova, Anastasia and Ashukha, Arsenii and Silvestrov, Aleksei and Kong, Naejin and Goka, Harshith and Park, Kiwoong and Lempitsky, Victor},\n  journal={arXiv preprint arXiv:2109.07161},\n  year={2021}\n}\n```\n","# 🦙 LaMa：基于傅里叶卷积（Fourier Convolutions）的分辨率鲁棒大遮罩修复\n\n作者：Roman Suvorov, Elizaveta Logacheva, Anton Mashikhin, \nAnastasia Remizova, Arsenii Ashukha, Aleksei Silvestrov, Naejin Kong, Harshith Goka, Kiwoong Park, Victor Lempitsky.\n\n\u003Cp align=\"center\" style=\"font-size:30px;\">\n  🔥🔥🔥\n  \u003Cbr>\n  \u003Cb>\nLaMa 能出人意料地良好泛化到远高于训练分辨率（256x256）的场景（~2k❗️），即使在复杂场景（如周期性结构修复）中也表现出色。\u003C\u002Fb>\n\u003C\u002Fp>\n\n[[项目主页](https:\u002F\u002Fadvimman.github.io\u002Flama-project\u002F)] [[论文](https:\u002F\u002Farxiv.org\u002Fabs\u002F2109.07161)] [[补充材料](https:\u002F\u002Fashukha.com\u002Fprojects\u002Flama_21\u002Flama_supmat_2021.pdf)] [[BibTeX](https:\u002F\u002Fsenya-ashukha.github.io\u002Fprojects\u002Flama_21\u002Fpaper.txt)] [[Casual GAN Papers 摘要](https:\u002F\u002Fwww.casualganpapers.com\u002Flarge-masks-fourier-convolutions-inpainting\u002FLaMa-explained.html)]\n \n\u003Cp align=\"center\">\n  \u003Ca href=\"https:\u002F\u002Fcolab.research.google.com\u002Fdrive\u002F15KTEIScUbVZtUP6w2tCDMVpE-b1r9pkZ?usp=drive_link\">\n  \u003Cimg src=\"https:\u002F\u002Fcolab.research.google.com\u002Fassets\u002Fcolab-badge.svg\"\u002F>\n  \u003C\u002Fa>\n      \u003Cbr>\n   在 Google Colab 中体验 \n  \u003Cbr>\n  所有 Yandex 分发链接已失效，可从 https:\u002F\u002Fdrive.google.com\u002Fdrive\u002Ffolders\u002F1B2x7eQDgecTL0oh3LSIBDGj0fTxs6Ips?usp=sharing 下载模型\n\u003C\u002Fp>\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fadvimman_lama_readme_e06eeccf0efa.gif\" \u002F>\n\u003C\u002Fp>\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fadvimman_lama_readme_e5fb4788742e.gif\" \u002F>\n\u003C\u002Fp>\n\n\n\n# LaMa 开发进展\n(欢迎通过创建 issue 分享您的论文)\n- https:\u002F\u002Fgithub.com\u002Fgeekyutao\u002FInpaint-Anything --- Inpaint Anything：Segment Anything 与图像修复的结合\n\u003Cp align=\"center\">\n  \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fadvimman_lama_readme_e37234404f87.png\" \u002F>\n\u003C\u002Fp>\n\n- [提升高分辨率图像修复的特征优化方法](https:\u002F\u002Farxiv.org\u002Fabs\u002F2206.13644) \u002F 
[视频](https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=gEukhOheWgE) \u002F 代码 https:\u002F\u002Fgithub.com\u002Fadvimman\u002Flama\u002Fpull\u002F112 \u002F 作者 Geomagical Labs ([geomagical.com](https:\u002F\u002Fgeomagical.com))\n\u003Cp align=\"center\">\n  \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fadvimman_lama_readme_0f9fbaf86b7c.png\" \u002F>\n\u003C\u002Fp>\n\n# 非官方第三方应用：\n(欢迎通过创建 issue 分享您的应用\u002F实现\u002F演示)\n\n- https:\u002F\u002Fgithub.com\u002Fenesmsahin\u002Fsimple-lama-inpainting - LaMa 修复的简易 pip 包\n- https:\u002F\u002Fgithub.com\u002Fmallman\u002FCoreMLaMa - 苹果 Core ML 模型格式（Core ML model format）的 LaMa\n- [https:\u002F\u002Fcleanup.pictures](https:\u002F\u002Fcleanup.pictures\u002F) - [@cyrildiagne](https:\u002F\u002Ftwitter.com\u002Fcyrildiagne) 开发的简易交互式物体移除工具\n    - [lama-cleaner](https:\u002F\u002Fgithub.com\u002FSanster\u002Flama-cleaner) 由 [@Sanster](https:\u002F\u002Fgithub.com\u002FSanster\u002Flama-cleaner) 开发，是 [https:\u002F\u002Fcleanup.pictures](https:\u002F\u002Fcleanup.pictures\u002F) 的自托管版本\n- 集成至 [Huggingface Spaces](https:\u002F\u002Fhuggingface.co\u002Fspaces) 平台，基于 [Gradio](https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Fgradio)。查看演示：[![Hugging Face Spaces](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F%F0%9F%A4%97%20Hugging%20Face-Spaces-blue)](https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fakhaliq\u002Flama) 作者 [@AK391](https:\u002F\u002Fgithub.com\u002FAK391)\n- Telegram 机器人 [@MagicEraserBot](https:\u002F\u002Ft.me\u002FMagicEraserBot) 作者 [@Moldoteck](https:\u002F\u002Fgithub.com\u002FMoldoteck)，[代码](https:\u002F\u002Fgithub.com\u002FMoldoteck\u002FMagicEraser)\n- [Auto-LaMa](https:\u002F\u002Fgithub.com\u002Fandy971022\u002Fauto-lama) = DE:TR 物体检测 + LaMa 修复 作者 [@andy971022](https:\u002F\u002Fgithub.com\u002Fandy971022)\n- [LAMA-Magic-Eraser-Local](https:\u002F\u002Fgithub.com\u002Fzhaoyun0071\u002FLAMA-Magic-Eraser-Local) = 基于 PyQt5 构建的独立修复应用 作者 [@zhaoyun0071](https:\u002F\u002Fgithub.com\u002Fzhaoyun0071)\n- [Hama](https:\u002F\u002Fwww.hama.app\u002F) - 采用智能画笔简化遮罩绘制的物体移除工具\n- [ModelScope](https:\u002F\u002Fwww.modelscope.cn\u002Fmodels\u002Fdamo\u002Fcv_fft_inpainting_lama\u002Fsummary) = 最大的中文模型社区 作者 [@chenbinghui1](https:\u002F\u002Fgithub.com\u002Fchenbinghui1)\n- [LaMa with MaskDINO](https:\u002F\u002Fgithub.com\u002Fqwopqwop200\u002Flama-with-maskdino) = MaskDINO 物体检测 + 带优化的 LaMa 修复 作者 [@qwopqwop200](https:\u002F\u002Fgithub.com\u002Fqwopqwop200)\n- [CoreMLaMa](https:\u002F\u002Fgithub.com\u002Fmallman\u002FCoreMLaMa) - 将 Lama Cleaner 移植的 LaMa 转换为苹果 Core ML 模型格式的脚本\n\n# 环境配置\n\n❗️❗️❗️ 所有 Yandex 分发链接已失效，可从 [Google Drive](https:\u002F\u002Fdrive.google.com\u002Fdrive\u002Ffolders\u002F1B2x7eQDgecTL0oh3LSIBDGj0fTxs6Ips?usp=sharing) 下载模型 ❗️❗️❗️\n\n克隆仓库：\n`git clone https:\u002F\u002Fgithub.com\u002Fadvimman\u002Flama.git`\n\n提供三种环境配置选项：\n\n1. Python 虚拟环境（virtualenv）:\n\n    ```\n    virtualenv inpenv --python=\u002Fusr\u002Fbin\u002Fpython3\n    source inpenv\u002Fbin\u002Factivate\n    pip install torch==1.8.0 torchvision==0.9.0\n    \n    cd lama\n    pip install -r requirements.txt \n    ```\n\n2. 
Conda\n    \n    ```\n    # 安装适用于 Linux 的 conda，其他操作系统请从 https:\u002F\u002Fdocs.conda.io\u002Fen\u002Flatest\u002Fminiconda.html 下载 miniconda\n    wget https:\u002F\u002Frepo.anaconda.com\u002Fminiconda\u002FMiniconda3-latest-Linux-x86_64.sh\n    bash Miniconda3-latest-Linux-x86_64.sh -b -p $HOME\u002Fminiconda\n    $HOME\u002Fminiconda\u002Fbin\u002Fconda init bash\n\n    cd lama\n    conda env create -f conda_env.yml\n    conda activate lama\n    conda install pytorch torchvision torchaudio cudatoolkit=10.2 -c pytorch -y\n    pip install pytorch-lightning==1.2.9\n    ```\n \n3. Docker：无需额外操作 🎉.\n\n# 推理 \u003Ca name=\"prediction\">\u003C\u002Fa>\n\n运行\n```\ncd lama\nexport TORCH_HOME=$(pwd) && export PYTHONPATH=$(pwd)\n```\n\n**1. 下载预训练模型（pre-trained models）**\n\n最佳模型（Places数据集和Places Challenge数据集）：\n    \n```    \ncurl -LJO https:\u002F\u002Fhuggingface.co\u002Fsmartywu\u002Fbig-lama\u002Fresolve\u002Fmain\u002Fbig-lama.zip\nunzip big-lama.zip\n```\n\n所有模型（Places数据集和CelebA-HQ数据集）：\n\n```\n# 从 https:\u002F\u002Fdrive.google.com\u002Fdrive\u002Ffolders\u002F1B2x7eQDgecTL0oh3LSIBDGj0fTxs6Ips?usp=drive_link 手动下载 lama-models.zip\nunzip lama-models.zip\n```\n\n**2. 准备图像和掩码（masks）**\n\n下载测试图像：\n\n```\nunzip LaMa_test_images.zip\n```\n\u003Cdetails>\n \u003Csummary>或者准备您的数据：\u003C\u002Fsummary>\n1) 创建掩码文件，命名格式为`[图像名称]_maskXXX[图像后缀]`，将图像和掩码文件放在同一文件夹中。\n\n- 可使用[脚本](https:\u002F\u002Fgithub.com\u002Fadvimman\u002Flama\u002Fblob\u002Fmain\u002Fbin\u002Fgen_mask_dataset.py)生成随机掩码。\n- 检查文件格式：\n    ```    \n    image1_mask001.png\n    image1.png\n    image2_mask001.png\n    image2.png\n    ```\n\n2) 在`configs\u002Fprediction\u002Fdefault.yaml`中指定`image_suffix`，例如`.png`或`.jpg`或`_input.jpg`。\n\n\u003C\u002Fdetails>\n\n\n**3. 预测（Predict）**\n\n在主机上运行：\n\n    python3 bin\u002Fpredict.py model.path=$(pwd)\u002Fbig-lama indir=$(pwd)\u002FLaMa_test_images outdir=$(pwd)\u002Foutput\n\n**或者** 在Docker中运行\n  \n以下命令将从Docker Hub拉取镜像并执行预测脚本\n```\nbash docker\u002F2_predict.sh $(pwd)\u002Fbig-lama $(pwd)\u002FLaMa_test_images $(pwd)\u002Foutput device=cpu\n```\nDocker CUDA版本：\n```\nbash docker\u002F2_predict_with_gpu.sh $(pwd)\u002Fbig-lama $(pwd)\u002FLaMa_test_images $(pwd)\u002Foutput\n```\n\n**4. 
带优化的预测（Predict with Refinement）**\n\n在主机上运行：\n\n    python3 bin\u002Fpredict.py refine=True model.path=$(pwd)\u002Fbig-lama indir=$(pwd)\u002FLaMa_test_images outdir=$(pwd)\u002Foutput\n\n# 训练与评估（Train and Eval）\n\n确保已执行：\n\n```\ncd lama\nexport TORCH_HOME=$(pwd) && export PYTHONPATH=$(pwd)\n```\n\n然后下载用于_感知损失（perceptual loss）_的模型：\n\n    mkdir -p ade20k\u002Fade20k-resnet50dilated-ppm_deepsup\u002F\n    wget -P ade20k\u002Fade20k-resnet50dilated-ppm_deepsup\u002F http:\u002F\u002Fsceneparsing.csail.mit.edu\u002Fmodel\u002Fpytorch\u002Fade20k-resnet50dilated-ppm_deepsup\u002Fencoder_epoch_20.pth\n\n\n## Places数据集\n\n⚠️ 注意：我们在LaMa论文中看到的Places数据集FID（Fréchet Inception Distance）\u002FSSIM（结构相似性，Structural Similarity）\u002FLPIPS（学习型感知图像块相似度，Learned Perceptual Image Patch Similarity）指标值，是基于评估部分生成的30000张图像计算得出。\n有关评估数据的更多细节，请查阅[[附录第3节：数据集划分](https:\u002F\u002Fashukha.com\u002Fprojects\u002Flama_21\u002Flama_supmat_2021.pdf#subsection.3.1)] ⚠️\n\n在主机上运行：\n\n    # 从 http:\u002F\u002Fplaces2.csail.mit.edu\u002Fdownload.html 下载数据\n    # Places365-Standard：从高分辨率图像部分下载训练集(105GB)\u002F测试集(19GB)\u002F验证集(2.1GB)\n    wget http:\u002F\u002Fdata.csail.mit.edu\u002Fplaces\u002Fplaces365\u002Ftrain_large_places365standard.tar\n    wget http:\u002F\u002Fdata.csail.mit.edu\u002Fplaces\u002Fplaces365\u002Fval_large.tar\n    wget http:\u002F\u002Fdata.csail.mit.edu\u002Fplaces\u002Fplaces365\u002Ftest_large.tar\n\n    # 解压训练\u002F测试\u002F验证数据并创建.yaml配置文件\n    bash fetch_data\u002Fplaces_standard_train_prepare.sh\n    bash fetch_data\u002Fplaces_standard_test_val_prepare.sh\n    \n    # 为测试和每轮次（epoch）结束时的可视化采样图像\n    bash fetch_data\u002Fplaces_standard_test_val_sample.sh\n    bash fetch_data\u002Fplaces_standard_test_val_gen_masks.sh\n\n    # 启动训练\n    python3 bin\u002Ftrain.py -cn lama-fourier location=places_standard\n\n    # 为评估训练模型并报告论文中的指标\n    # 需要采样30000张未见过的图像并生成掩码\n    bash fetch_data\u002Fplaces_standard_evaluation_prepare_data.sh\n    \n    # 在256和512分辨率的厚\u002F薄\u002F中等掩码上进行推理并运行评估 \n    # 示例：\n    python3 bin\u002Fpredict.py \\\n    model.path=$(pwd)\u002Fexperiments\u002F\u003Cuser>_\u003Cdate:time>_lama-fourier_\u002F \\\n    indir=$(pwd)\u002Fplaces_standard_dataset\u002Fevaluation\u002Frandom_thick_512\u002F \\\n    outdir=$(pwd)\u002Finference\u002Frandom_thick_512 model.checkpoint=last.ckpt\n\n    python3 bin\u002Fevaluate_predicts.py \\\n    $(pwd)\u002Fconfigs\u002Feval2_gpu.yaml \\\n    $(pwd)\u002Fplaces_standard_dataset\u002Fevaluation\u002Frandom_thick_512\u002F \\\n    $(pwd)\u002Finference\u002Frandom_thick_512 \\\n    $(pwd)\u002Finference\u002Frandom_thick_512_metrics.csv\n\n    \n    \nDocker：待补充\n    \n## CelebA数据集\n在主机上运行：\n\n    # 确保在lama目录下\n    cd lama\n    export TORCH_HOME=$(pwd) && export PYTHONPATH=$(pwd)\n\n    # 下载CelebA-HQ数据集\n    # 从 https:\u002F\u002Fdrive.google.com\u002Fdrive\u002Ffolders\u002F11Vz0fqHS2rXDb5pprgTjpD7S2BAJhi1P 下载data256x256.zip\n    \n    # 解压并拆分为训练\u002F测试\u002F可视化数据集，创建配置文件\n    bash fetch_data\u002Fcelebahq_dataset_prepare.sh\n\n    # 为测试和可视化测试生成掩码\n    bash fetch_data\u002Fcelebahq_gen_masks.sh\n\n    # 启动训练\n    python3 bin\u002Ftrain.py -cn lama-fourier-celeba data.batch_size=10\n\n    # 在256分辨率的厚\u002F薄\u002F中等掩码上进行推理并运行评估 \n    # 示例：\n    python3 bin\u002Fpredict.py \\\n    model.path=$(pwd)\u002Fexperiments\u002F\u003Cuser>_\u003Cdate:time>_lama-fourier-celeba_\u002F \\\n    indir=$(pwd)\u002Fceleba-hq-dataset\u002Fvisual_test_256\u002Frandom_thick_256\u002F \\\n    outdir=$(pwd)\u002Finference\u002Fceleba_random_thick_256 
model.checkpoint=last.ckpt\n    \n    \nDocker：待补充\n\n## Places Challenge数据集 \n\n在主机上运行：\n\n    # 该脚本并行下载多个.tar文件并解压\n    # Places365-Challenge：从高分辨率图像下载训练集(476GB)（用于训练Big-Lama）\n    bash places_challenge_train_download.sh\n    \n    待补充：数据准备\n    待补充：训练 \n    待补充：评估\n      \nDocker：待补充\n\n## 创建你的数据\n\n如果在以下步骤中遇到问题，请参考 CelebA-HQ（名人面部高清数据集）部分用于数据准备和掩码（mask）生成的 bash 脚本。\n\n\n在主机上：\n\n    # Make sure you are in the lama folder\n    cd lama\n    export TORCH_HOME=$(pwd) && export PYTHONPATH=$(pwd)\n\n    # You need to prepare the following image folders:\n    $ ls my_dataset\n    train\n    val_source # 2000 or more images\n    visual_test_source # 100 or more images\n    eval_source # 2000 or more images\n\n    # LaMa generates random masks for the train data on the fly,\n    # but needs fixed masks for test and visual_test for consistency of evaluation.\n\n    # Suppose we want to evaluate and pick the best models \n    # on a 512x512 val dataset with thick\u002Fthin\u002Fmedium masks, \n    # and your images have the .jpg extension:\n\n    python3 bin\u002Fgen_mask_dataset.py \\\n    $(pwd)\u002Fconfigs\u002Fdata_gen\u002Frandom_\u003Csize>_512.yaml \\ # thick, thin, medium\n    my_dataset\u002Fval_source\u002F \\\n    my_dataset\u002Fval\u002Frandom_\u003Csize>_512\u002F \\ # thick, thin, medium\n    --ext jpg\n\n    # The mask generator will: \n    # 1. resize and crop val images and save them as .png\n    # 2. generate masks\n    \n    ls my_dataset\u002Fval\u002Frandom_medium_512\u002F\n    image1_crop000_mask000.png\n    image1_crop000.png\n    image2_crop000_mask000.png\n    image2_crop000.png\n    ...\n\n    # Generate thick, thin, medium masks for the visual_test folder:\n\n    python3 bin\u002Fgen_mask_dataset.py \\\n    $(pwd)\u002Fconfigs\u002Fdata_gen\u002Frandom_\u003Csize>_512.yaml \\  # thick, thin, medium\n    my_dataset\u002Fvisual_test_source\u002F \\\n    my_dataset\u002Fvisual_test\u002Frandom_\u003Csize>_512\u002F \\ # thick, thin, medium\n    --ext jpg\n    \n\n    ls my_dataset\u002Fvisual_test\u002Frandom_thick_512\u002F\n    image1_crop000_mask000.png\n    image1_crop000.png\n    image2_crop000_mask000.png\n    image2_crop000.png\n    ...\n\n    # Same process for the eval_source image folder:\n    \n    python3 bin\u002Fgen_mask_dataset.py \\\n    $(pwd)\u002Fconfigs\u002Fdata_gen\u002Frandom_\u003Csize>_512.yaml \\  # thick, thin, medium\n    my_dataset\u002Feval_source\u002F \\\n    my_dataset\u002Feval\u002Frandom_\u003Csize>_512\u002F \\ # thick, thin, medium\n    --ext jpg\n    \n\n\n    # Generate a location config file which locates these folders:\n    \n    touch my_dataset.yaml\n    echo \"data_root_dir: $(pwd)\u002Fmy_dataset\u002F\" >> my_dataset.yaml\n    echo \"out_root_dir: $(pwd)\u002Fexperiments\u002F\" >> my_dataset.yaml\n    echo \"tb_dir: $(pwd)\u002Ftb_logs\u002F\" >> my_dataset.yaml\n    mv my_dataset.yaml ${PWD}\u002Fconfigs\u002Ftraining\u002Flocation\u002F\n\n\n    # Check the data config for consistency with the my_dataset folder structure:\n    $ cat ${PWD}\u002Fconfigs\u002Ftraining\u002Fdata\u002Fabl-04-256-mh-dist\n    ...\n    train:\n      indir: ${location.data_root_dir}\u002Ftrain\n      ...\n    val:\n      indir: ${location.data_root_dir}\u002Fval\n      img_suffix: .png\n    visual_test:\n      indir: ${location.data_root_dir}\u002Fvisual_test\n      img_suffix: .png\n\n\n    # Run training\n    python3 bin\u002Ftrain.py -cn lama-fourier location=my_dataset data.batch_size=10\n\n    # Evaluation: the LaMa training procedure picks the best few models according to \n    # 
scores on my_dataset\u002Fval\u002F \n\n    # To evaluate one of your best models (i.e. at epoch=32) \n    # on previously unseen my_dataset\u002Feval do the following \n    # for thin, thick and medium:\n\n    # infer:\n    python3 bin\u002Fpredict.py \\\n    model.path=$(pwd)\u002Fexperiments\u002F\u003Cuser>_\u003Cdate:time>_lama-fourier_\u002F \\\n    indir=$(pwd)\u002Fmy_dataset\u002Feval\u002Frandom_\u003Csize>_512\u002F \\\n    outdir=$(pwd)\u002Finference\u002Fmy_dataset\u002Frandom_\u003Csize>_512 \\\n    model.checkpoint=epoch32.ckpt\n\n    # metrics calculation:\n    python3 bin\u002Fevaluate_predicts.py \\\n    $(pwd)\u002Fconfigs\u002Feval2_gpu.yaml \\\n    $(pwd)\u002Fmy_dataset\u002Feval\u002Frandom_\u003Csize>_512\u002F \\\n    $(pwd)\u002Finference\u002Fmy_dataset\u002Frandom_\u003Csize>_512 \\\n    $(pwd)\u002Finference\u002Fmy_dataset\u002Frandom_\u003Csize>_512_metrics.csv\n\n    \n**或者** 在 Docker 中：\n\n    TODO: train\n    TODO: eval\n    \n# 提示\n\n### 生成不同类型的掩码（mask）\n以下命令将执行一个生成随机掩码（mask）的脚本。\n\n    bash docker\u002F1_generate_masks_from_raw_images.sh \\\n        configs\u002Fdata_gen\u002Frandom_medium_512.yaml \\\n        \u002Fdirectory_with_input_images \\\n        \u002Fdirectory_where_to_store_images_and_masks \\\n        --ext png\n\n测试数据生成命令以适合 [预测](#prediction) 的格式存储图像。\n\n下表描述了我们用于从论文中生成不同测试集的配置。请注意，我们*不固定随机种子*，因此每次结果会略有不同。\n\n|        | Places 512x512         | CelebA 256x256         |\n|--------|------------------------|------------------------|\n| 窄掩码 | random_thin_512.yaml   | random_thin_256.yaml   |\n| 中掩码 | random_medium_512.yaml | random_medium_256.yaml |\n| 宽掩码 | random_thick_512.yaml  | random_thick_256.yaml  |\n\n你可以随意将配置路径（参数 #1）更改为 `configs\u002Fdata_gen` 中的任何其他配置，或调整配置文件本身。\n\n### 覆盖配置中的参数\n你也可以像这样覆盖配置中的参数：\n\n    python3 bin\u002Ftrain.py -cn \u003Cconfig> data.batch_size=10 run_title=my-title\n\n其中 .yaml 文件扩展名被省略。\n\n### 模型选项 \n论文中模型的配置名称（替换到训练命令中）： \n\n    * big-lama\n    * big-lama-regular\n    * lama-fourier\n    * lama-regular\n    * lama_small_train_masks\n\n这些配置位于 configs\u002Ftraining\u002F 文件夹中。\n\n### 链接\n- 所有数据（模型、测试图像等） https:\u002F\u002Fdisk.yandex.ru\u002Fd\u002FAmdeG-bIjmvSug\n- 论文中的测试图像 https:\u002F\u002Fdisk.yandex.ru\u002Fd\u002FxKQJZeVRk5vLlQ\n- 预训练模型 https:\u002F\u002Fdisk.yandex.ru\u002Fd\u002FEgqaSnLohjuzAg\n- 用于感知损失的模型 https:\u002F\u002Fdisk.yandex.ru\u002Fd\u002FncVmQlmT_kTemQ\n- 我们的训练日志可在 https:\u002F\u002Fdisk.yandex.ru\u002Fd\u002F9Bt1wNSDS4jDkQ 获取\n\n\n### 训练时间与资源\n\nTODO\n\n## 致谢\n\n* 分割代码和模型来自 [CSAILVision](https:\u002F\u002Fgithub.com\u002FCSAILVision\u002Fsemantic-segmentation-pytorch)。\n* LPIPS 指标来自 [richzhang](https:\u002F\u002Fgithub.com\u002Frichzhang\u002FPerceptualSimilarity)\n* SSIM 来自 [Po-Hsun-Su](https:\u002F\u002Fgithub.com\u002FPo-Hsun-Su\u002Fpytorch-ssim)\n* FID 来自 [mseitzer](https:\u002F\u002Fgithub.com\u002Fmseitzer\u002Fpytorch-fid)\n\n## 引用\n如果本代码对你有所帮助，请考虑引用： \n```\n@article{suvorov2021resolution,\n  title={Resolution-robust Large Mask Inpainting with Fourier Convolutions},\n  author={Suvorov, Roman and Logacheva, Elizaveta and Mashikhin, Anton and Remizova, Anastasia and Ashukha, Arsenii and Silvestrov, Aleksei and Kong, Naejin and Goka, Harshith and Park, Kiwoong and Lempitsky, Victor},\n  journal={arXiv preprint arXiv:2109.07161},\n  year={2021}\n}\n```","# LaMa 快速上手指南\n\n## 环境准备\n- **系统要求**：Linux 或 macOS（Windows 未官方支持）\n- **前置依赖**：\n  - Python 3.6+\n  - pip\n  - Git\n  - （推荐）科学上网工具（用于下载模型资源）\n- **注意**：模型文件较大（约 1.5GB），建议预留 2GB 以上磁盘空间\n\n## 安装步骤\n1. 
克隆仓库：\n   ```bash\n   git clone https:\u002F\u002Fgithub.com\u002Fadvimman\u002Flama.git\n   cd lama\n   ```\n\n2. 创建虚拟环境并安装依赖（推荐清华源加速）：\n   ```bash\n   virtualenv inpenv --python=python3\n   source inpenv\u002Fbin\u002Factivate\n   pip install torch==1.8.0 torchvision==0.9.0 -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple\n   pip install -r requirements.txt\n   ```\n\n## 基本使用\n1. **下载预训练模型**（推荐 Hugging Face 链接，国内访问更稳定）：\n   ```bash\n   curl -LJO https:\u002F\u002Fhuggingface.co\u002Fsmartywu\u002Fbig-lama\u002Fresolve\u002Fmain\u002Fbig-lama.zip\n   unzip big-lama.zip\n   ```\n\n2. **准备测试图像**：\n   - 下载示例图像（可选）：\n     ```bash\n     wget https:\u002F\u002Fgithub.com\u002Fadvimman\u002Flama\u002Freleases\u002Fdownload\u002Fv1.0\u002FLaMa_test_images.zip\n     unzip LaMa_test_images.zip\n     ```\n   - 或使用自定义图像：将图像和掩码放在同一目录，掩码命名格式为 `[image_name]_mask001.png`\n\n3. **运行图像修复**：\n   ```bash\n   export TORCH_HOME=$(pwd) && export PYTHONPATH=$(pwd)\n   python3 bin\u002Fpredict.py model.path=$(pwd)\u002Fbig-lama indir=$(pwd)\u002FLaMa_test_images outdir=$(pwd)\u002Foutput\n   ```\n   - 修复结果将保存在 `output` 目录\n   - 支持高达 2K 分辨率图像处理（无需额外配置）\n\n> **提示**：若需更高精度修复，添加 `refine=True` 参数：\n> ```bash\n> python3 bin\u002Fpredict.py refine=True model.path=$(pwd)\u002Fbig-lama indir=$(pwd)\u002FLaMa_test_images outdir=$(pwd)\u002Foutput\n> ```","一位历史档案馆的数字化专员正在修复一张2000×3000像素的1940年代老照片，照片因潮湿导致右下角50%区域布满霉斑和划痕，需快速恢复原貌用于博物馆线上展览。\n\n### 没有 lama 时\n- 传统修复工具（如Photoshop内容识别）处理高分辨率图像时频繁崩溃，强行操作后修复区域模糊失真，细节严重丢失。\n- 大面积霉斑（覆盖人物主体）导致背景生成断裂，出现明显拼接痕迹，需手动逐像素修补耗时超3小时。\n- 老照片特有的纸张网格纹理属于周期性结构，修复后网格错位不连贯，破坏历史真实性。\n- 商业软件授权费用高昂（单套年费超万元），且对超大遮罩支持薄弱，反复调整参数仍难达专业要求。\n- 修复结果无法通过博物馆质检，常需返工重做，延误展览上线进度。\n\n### 使用 lama 后\n- lama的分辨率鲁棒性使其能直接处理2K图像，修复区域清晰锐利，保留原始照片的胶片颗粒质感。\n- 傅里叶卷积技术无缝融合霉斑区域，人物与背景自然过渡，5分钟内完成高质量修复。\n- 周期性网格纹理修复连贯精准，纸张褶皱和印刷图案完全复原，符合历史档案标准。\n- 开源免费，可直接集成到工作流，零成本替代商业软件，单张照片处理效率提升36倍。\n- 修复结果一次性通过质检，确保展览按时上线，避免项目延期损失。\n\nlama让高分辨率历史影像的大面积修复从耗时费力的难题变为高效精准的标准化流程。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fadvimman_lama_0f9fbaf8.png","advimman","Image Manipulation","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002Fadvimman_978d60da.png","",null,"https:\u002F\u002Fgithub.com\u002Fadvimman",[82,86,90,94],{"name":83,"color":84,"percentage":85},"Jupyter Notebook","#DA5B0B",86.1,{"name":87,"color":88,"percentage":89},"Python","#3572A5",13.4,{"name":91,"color":92,"percentage":93},"Shell","#89e051",0.5,{"name":95,"color":96,"percentage":97},"Dockerfile","#384d54",0,9831,1038,"2026-04-04T19:38:51","Apache-2.0","Linux, macOS","推荐 NVIDIA GPU，CUDA 10.2（非必需，支持 CPU 模式）","未说明",{"notes":106,"python":107,"dependencies":108},"预训练模型需从 Google Drive 下载（原 Yandex 链接失效）；训练需大量存储空间（如 Places365 训练集 105GB）；建议使用 conda 或 Docker 管理环境以避免依赖冲突","3.6+",[109,110,111,112],"torch==1.8.0","torchvision==0.9.0","torchaudio","pytorch-lightning==1.2.9",[13,14],[115,116,117,118,119,120,121,122,123,124,125,126,127,128,129,130,131,132],"inpainting","inpainting-methods","inpainting-algorithm","computer-vision","cnn","deep-learning","deep-neural-networks","image-inpainting","fourier","fourier-transform","fourier-convolutions","colab-notebook","colab","high-resolution","generative-adversarial-network","gan","generative-adversarial-networks","pytorch","2026-03-27T02:49:30.150509","2026-04-06T05:18:00.213832",[136,141,146,151,156,161],{"id":137,"question_zh":138,"answer_zh":139,"source_url":140},4477,"big-lama 模型是否在 places-challenge 数据集上训练？能否获取完整检查点用于微调？","维护者已上传完整检查点（包含鉴别器和 SegmPL 权重）至 https:\u002F\u002Fdisk.yandex.ru\u002Fd\u002FwJ2Ee0f1HvasDQ。AchoWu 
分享了训练经验：需修改 PyTorch Lightning 代码处理检查点加载，具体步骤包括在 `pytorch_lightning\u002Ftrainer\u002Fconnectors\u002Fcheckpoint_connector.py` 第 106 行添加异常处理（替换为 `try: self.restore_training_state(checkpoint) except KeyError: ...`），并在 `saicinpainting\u002Ftraining\u002Ftrainers\u002Fbase.py` 中调整损失函数配置。微调效果需用户反馈验证。","https:\u002F\u002Fgithub.com\u002Fadvimman\u002Flama\u002Fissues\u002F96",{"id":142,"question_zh":143,"answer_zh":144,"source_url":145},4478,"是否有简单的模型推理代码示例，不使用项目提供的复杂工具？","可以使用社区项目 simple-lama-inpainting，它提供了简化版的推理代码实现，无需 Hydra 配置或命令行工具。直接克隆仓库并运行：`git clone https:\u002F\u002Fgithub.com\u002Fenesmsahin\u002Fsimple-lama-inpainting`，按照 README 中的示例代码调用模型。","https:\u002F\u002Fgithub.com\u002Fadvimman\u002Flama\u002Fissues\u002F227",{"id":147,"question_zh":148,"answer_zh":149,"source_url":150},4479,"如何将 LaMa 模型成功导出为 ONNX 格式？","由于 ONNX 不支持 `torch.fft.rfftn` 操作，原生导出会失败。Carve-Photos 团队已成功导出并提供 ONNX 模型：https:\u002F\u002Fhuggingface.co\u002FCarve\u002FLaMa-ONNX。适配动态轴时需修改 FourierUnit 以支持动态尺寸（如调整 `irfftn` 和 `rfft` 的填充逻辑），具体实现参考其仓库代码。","https:\u002F\u002Fgithub.com\u002Fadvimman\u002Flama\u002Fissues\u002F84",{"id":152,"question_zh":153,"answer_zh":154,"source_url":155},4480,"在 Google Colab 中运行时出现 'ModuleNotFoundError: No module named torchtext.legacy' 错误如何解决？","该问题已在代码库修复（commit 8ba8381）。原因是 Colab 默认安装了较新版本的 PyTorch（1.11.0+cu113），与 LaMa 代码不兼容。解决方案：安装兼容版本，执行 `!pip install torch==1.8.0 torchvision==0.9.0 torchaudio==0.8.0` 以匹配 CUDA 11.1 环境，避免版本冲突。","https:\u002F\u002Fgithub.com\u002Fadvimman\u002Flama\u002Fissues\u002F114",{"id":157,"question_zh":158,"answer_zh":159,"source_url":160},4481,"如何为自定义图像生成随机 masks？","predict.py 脚本会自动对输入 masks 进行二值化处理（参考代码行 https:\u002F\u002Fgithub.com\u002Fsaic-mdal\u002Flama\u002Fblob\u002Fmain\u002Fbin\u002Fpredict.py#L75）。建议手动创建二值 masks（值为 0 或 255），避免使用平滑 masks，因为平滑 masks 在图像修复中无实际意义。可使用 NumPy 等库生成随机二值 masks，示例代码：`mask = np.random.randint(0, 2, (h, w), dtype=np.uint8) * 255`。","https:\u002F\u002Fgithub.com\u002Fadvimman\u002Flama\u002Fissues\u002F103",{"id":162,"question_zh":163,"answer_zh":164,"source_url":165},4482,"如何在训练时使用自定义 masks 而不是随机生成的 masks？","参考 NilsBochow 的 fork 仓库，它支持自定义 masks 训练：https:\u002F\u002Fgithub.com\u002FNilsBochow\u002Flama_reconstruction。使用方法：将自定义 masks 存放于指定目录（如 `data\u002Fmasks`），并在配置文件中设置 `dataset.masks` 路径指向该目录。训练时模型会优先读取自定义 masks 而非随机生成。","https:\u002F\u002Fgithub.com\u002Fadvimman\u002Flama\u002Fissues\u002F119",[]]