[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-River-Zhang--ICEdit":3,"tool-River-Zhang--ICEdit":61},[4,18,28,37,45,53],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":17},4358,"openclaw","openclaw\u002Fopenclaw","OpenClaw 是一款专为个人打造的本地化 AI 助手，旨在让你在自己的设备上拥有完全可控的智能伙伴。它打破了传统 AI 助手局限于特定网页或应用的束缚，能够直接接入你日常使用的各类通讯渠道，包括微信、WhatsApp、Telegram、Discord、iMessage 等数十种平台。无论你在哪个聊天软件中发送消息，OpenClaw 都能即时响应，甚至支持在 macOS、iOS 和 Android 设备上进行语音交互，并提供实时的画布渲染功能供你操控。\n\n这款工具主要解决了用户对数据隐私、响应速度以及“始终在线”体验的需求。通过将 AI 部署在本地，用户无需依赖云端服务即可享受快速、私密的智能辅助，真正实现了“你的数据，你做主”。其独特的技术亮点在于强大的网关架构，将控制平面与核心助手分离，确保跨平台通信的流畅性与扩展性。\n\nOpenClaw 非常适合希望构建个性化工作流的技术爱好者、开发者，以及注重隐私保护且不愿被单一生态绑定的普通用户。只要具备基础的终端操作能力（支持 macOS、Linux 及 Windows WSL2），即可通过简单的命令行引导完成部署。如果你渴望拥有一个懂你",349277,3,"2026-04-06T06:32:30",[13,14,15,16],"Agent","开发框架","图像","数据工具","ready",{"id":19,"name":20,"github_repo":21,"description_zh":22,"stars":23,"difficulty_score":24,"last_commit_at":25,"category_tags":26,"status":17},9989,"n8n","n8n-io\u002Fn8n","n8n 是一款面向技术团队的公平代码（fair-code）工作流自动化平台，旨在让用户在享受低代码快速构建便利的同时，保留编写自定义代码的灵活性。它主要解决了传统自动化工具要么过于封闭难以扩展、要么完全依赖手写代码效率低下的痛点，帮助用户轻松连接 400 多种应用与服务，实现复杂业务流程的自动化。\n\nn8n 特别适合开发者、工程师以及具备一定技术背景的业务人员使用。其核心亮点在于“按需编码”：既可以通过直观的可视化界面拖拽节点搭建流程，也能随时插入 JavaScript 或 Python 代码、调用 npm 包来处理复杂逻辑。此外，n8n 原生集成了基于 LangChain 的 AI 能力，支持用户利用自有数据和模型构建智能体工作流。在部署方面，n8n 提供极高的自由度，支持完全自托管以保障数据隐私和控制权，也提供云端服务选项。凭借活跃的社区生态和数百个现成模板，n8n 让构建强大且可控的自动化系统变得简单高效。",184740,2,"2026-04-19T23:22:26",[16,14,13,15,27],"插件",{"id":29,"name":30,"github_repo":31,"description_zh":32,"stars":33,"difficulty_score":10,"last_commit_at":34,"category_tags":35,"status":17},10095,"AutoGPT","Significant-Gravitas\u002FAutoGPT","AutoGPT 是一个旨在让每个人都能轻松使用和构建 AI 的强大平台，核心功能是帮助用户创建、部署和管理能够自动执行复杂任务的连续型 AI 智能体。它解决了传统 AI 应用中需要频繁人工干预、难以自动化长流程工作的痛点，让用户只需设定目标，AI 即可自主规划步骤、调用工具并持续运行直至完成任务。\n\n无论是开发者、研究人员，还是希望提升工作效率的普通用户，都能从 AutoGPT 中受益。开发者可利用其低代码界面快速定制专属智能体；研究人员能基于开源架构探索多智能体协作机制；而非技术背景用户也可直接选用预置的智能体模板，立即投入实际工作场景。\n\nAutoGPT 的技术亮点在于其模块化“积木式”工作流设计——用户通过连接功能块即可构建复杂逻辑，每个块负责单一动作，灵活且易于调试。同时，平台支持本地自托管与云端部署两种模式，兼顾数据隐私与使用便捷性。配合完善的文档和一键安装脚本，即使是初次接触的用户也能在几分钟内启动自己的第一个 AI 智能体。AutoGPT 正致力于降低 AI 应用门槛，让人人都能成为 AI 的创造者与受益者。",183572,"2026-04-20T04:47:55",[13,36,27,14,15],"语言模型",{"id":38,"name":39,"github_repo":40,"description_zh":41,"stars":42,"difficulty_score":10,"last_commit_at":43,"category_tags":44,"status":17},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,"2026-04-05T11:01:52",[14,15,13],{"id":46,"name":47,"github_repo":48,"description_zh":49,"stars":50,"difficulty_score":24,"last_commit_at":51,"category_tags":52,"status":17},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 
工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",161147,"2026-04-19T23:31:47",[14,13,36],{"id":54,"name":55,"github_repo":56,"description_zh":57,"stars":58,"difficulty_score":24,"last_commit_at":59,"category_tags":60,"status":17},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",109154,"2026-04-18T11:18:24",[14,15,13],{"id":62,"github_repo":63,"name":64,"description_en":65,"description_zh":66,"ai_summary_zh":66,"readme_en":67,"readme_zh":68,"quickstart_zh":69,"use_case_zh":70,"hero_image_url":71,"owner_login":72,"owner_name":73,"owner_avatar_url":74,"owner_bio":75,"owner_company":76,"owner_location":77,"owner_email":78,"owner_twitter":79,"owner_website":80,"owner_url":81,"languages":82,"stars":97,"forks":98,"last_commit_at":99,"license":100,"difficulty_score":10,"env_os":101,"env_gpu":102,"env_ram":103,"env_deps":104,"category_tags":110,"github_topics":112,"view_count":24,"oss_zip_url":79,"oss_zip_packed_at":79,"status":17,"created_at":122,"updated_at":123,"faqs":124,"releases":155},8983,"River-Zhang\u002FICEdit","ICEdit","[NeurIPS 2025] Image editing is worth a single LoRA! 0.1% training data for fantastic image editing! Surpasses GPT-4o in ID persistence~ MoE ckpt released! Only 4GB VRAM is enough to run! 
","ICEdit 是一款基于大规模扩散 Transformer 的指令式图像编辑工具，旨在让用户通过简单的文字指令精准修改图片内容。它有效解决了传统图像编辑方法对海量训练数据依赖度高、计算资源消耗大，以及在多轮编辑中难以保持人物或物体身份一致性（ID Persistence）的痛点。\n\n无论是希望快速实现创意构思的设计师、需要高效工作流的普通用户，还是致力于探索模型效率的研究人员与开发者，都能从 ICEdit 中受益。其最显著的技术亮点在于极高的效率：仅需以往最先进方法 1% 的参数量（采用单 LoRA 架构）和 0.5% 的训练数据，即可达成卓越的编辑效果。官方特别发布的 MoE（混合专家）版本更是将显存需求降低至 4GB，使得在消费级显卡上流畅运行成为可能。此外，ICEdit 在多轮连续编辑中表现出惊人的身份保持能力，甚至超越了部分闭源商业模型。目前，该项目已开源训练代码，并提供了 ComfyUI 节点、Gradio 演示及华为昇腾 NPU 适配版本，方便不同技术背景的用户轻松上手体验。","\u003Cdiv align=\"center\">\n\n\u003Ch1>In-Context Edit: Enabling Instructional Image Editing with In-Context Generation in Large Scale Diffusion Transformer\u003C\u002Fh1>\n\n\u003Cdiv>\n    \u003Ca href=\"https:\u002F\u002Friver-zhang.github.io\u002Fzechuanzhang\u002F\u002F\" target=\"_blank\">Zechuan Zhang\u003C\u002Fa>&emsp;\n    \u003Ca href=\"https:\u002F\u002Fhorizonwind2004.github.io\u002F\" target=\"_blank\">Ji Xie\u003C\u002Fa>&emsp;\n    \u003Ca href=\"https:\u002F\u002Fyulu.net.cn\u002F\" target=\"_blank\">Yu Lu\u003C\u002Fa>&emsp;\n    \u003Ca href=\"https:\u002F\u002Fz-x-yang.github.io\u002F\" target=\"_blank\">Zongxin Yang\u003C\u002Fa>&emsp;\n    \u003Ca href=\"https:\u002F\u002Fscholar.google.com\u002Fcitations?user=RMSuNFwAAAAJ&hl=zh-CN&oi=ao\" target=\"_blank\">Yi Yang✉\u003C\u002Fa>&emsp;\n\u003C\u002Fdiv>\n\u003Cdiv>\n    ReLER, CCAI, Zhejiang University; Harvard University\n\u003C\u002Fdiv>\n\u003Cdiv>\n     \u003Csup>✉\u003C\u002Fsup>Corresponding Author\n\u003C\u002Fdiv>\n\u003Cdiv>\n    \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2504.20690\" target=\"_blank\">Arxiv\u003C\u002Fa>&emsp;\n    \u003Ca href=\"https:\u002F\u002Fhuggingface.co\u002Fspaces\u002FRiverZ\u002FICEdit\" target=\"_blank\">Huggingface Demo 🤗\u003C\u002Fa>&emsp;\n    \u003Ca href=\"https:\u002F\u002Fhuggingface.co\u002Fsanaka87\u002FICEdit-MoE-LoRA\u002Ftree\u002Fmain\" target=\"_blank\">Model 🤗\u003C\u002Fa>&emsp;\n    \u003Ca href=\"https:\u002F\u002Friver-zhang.github.io\u002FICEdit-gh-pages\u002F\" target=\"_blank\">Project Page\u003C\u002Fa>\n\u003C\u002Fdiv>\n\n\n\u003Cdiv style=\"width: 80%; margin:auto;\">\n    \u003Cimg style=\"width:100%; display: block; margin: auto;\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FRiver-Zhang_ICEdit_readme_515b75bb2956.png\">\n    \u003Cp style=\"text-align: left;\">\u003Cstrong>Image Editing is worth a single LoRA!\u003C\u002Fstrong> We present In-Context Edit, a novel approach that achieves state-of-the-art instruction-based editing \u003Cb>using just 0.5% of the training data and 1% of the parameters required by prior SOTA methods\u003C\u002Fb>. The first row illustrates a series of multi-turn edits, executed with high precision, while the second and third rows highlight diverse, visually impressive single-turn editing results from our method.\u003C\u002Fp>\n\u003C\u002Fdiv>\n\n:open_book: For more visual results, go checkout our \u003Ca href=\"https:\u002F\u002Friver-zhang.github.io\u002FICEdit-gh-pages\u002F\" target=\"_blank\">project page\u003C\u002Fa>\n\n\n\u003Cdiv align=\"left\">\n\n\n# 🎆 News \n- **[2025\u002F9\u002F19]** 🔥 We have open-sourced our [MoE version ICEdit and ckpt](#for-the-usage-of-moe-lora-version). Have a try!🚀\n- **[2025\u002F9\u002F18]** 🌟 ICEdit has been accepted by NeurIPS 2025!🎉 See you in San Diego!\n- **[2025\u002F8\u002F21]** 🌟 We have released an [Ascend (Huawei NPU)-powered version of ICEdit](https:\u002F\u002Fgithub.com\u002F2018liuzhiyuan\u002FICEdit-on-Ascend-NPU). Now you can run ICEdit on Ascend NPU! 
Many thanks to [Zhiyuan](https:\u002F\u002Fgithub.com\u002F2018liuzhiyuan)！\n- **[2025\u002F5\u002F16]** 🌟 Many thanks to [gluttony-10 (十字鱼)](https:\u002F\u002Fgithub.com\u002FRiver-Zhang\u002FICEdit\u002Fpull\u002F47#issue-3067039788) for adapting Gradio demo with [GGUF quantization](#inference-in-gradio-demo), further reducing memory usage to **10GB**.\n- **[2025\u002F5\u002F14]** 🔥 With the help of the [official comfy-org](https:\u002F\u002Fwww.comfy.org\u002Fzh-cn\u002F), we have integrated our ComfyUI nodes into [Comfy Registry](https:\u002F\u002Fregistry.comfy.org\u002Fnodes\u002FICEdit)! \n- **[2025\u002F5\u002F13]** 🔥 We have released the [training code](.\u002Ftrain\u002F)! Train your own editing LoRAs now!\n- **[2025\u002F5\u002F11]** 🌟 Great thanks to [gluttony-10 (十字鱼)](https:\u002F\u002Fgithub.com\u002FRiver-Zhang\u002FICEdit\u002Fissues\u002F23#issue-3050804566) for making a [windows gradio demo](#inference-in-gradio-demo-on-windows) to use our project on Windows!\n- **[2025\u002F5\u002F8]** 🔥 We have released our **[official ComfyUI workflow](#official-comfyui-workflow)**! 🚀 Check the repository and have a try!\n\n\u003Cdetails>\n\u003Csummary>\u003Cstrong>Click to expand\u002Fcollapse news\u003C\u002Fstrong>\u003C\u002Fsummary>\n\n- **[2025\u002F5\u002F8]** 🔥 We have added LoRA scale slider in the gradio demo. You can try to discover more interesting demo with different scale! \n\u003Cdiv align=\"center\">\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FRiver-Zhang_ICEdit_readme_cb46b441b1fb.png\" width=\"70%\" style=\"display: block; margin: auto;\">\n\u003Cdiv align=\"left\">\n\n- **[2025\u002F5\u002F7]** 🌟 We update some notes when using the ComfyUI workflow to avoid unsatisfactory results! \n- **[2025\u002F5\u002F6]** 🔥 ICEdit currently ranks **2nd** on the overall\u002Fweekly trending list of [Hugging Face space](https:\u002F\u002Fhuggingface.co\u002Fspaces). Thank you all for your support and love!🤗\n- **[2025\u002F5\u002F5]** 🌟 Heartfelt thanks to [Datou](https:\u002F\u002Fx.com\u002FDatou) for creating a fantastic [ComfyUI workflow](https:\u002F\u002Fopenart.ai\u002Fworkflows\u002Fdatou\u002Ficedit-moe-lora-flux-fill\u002FQFmaWNKsQo3P5liYz4RB) on OpenArt! 🚀 Have a try!\n- **[2025\u002F5\u002F2]** 🌟 Heartfelt thanks to [judian17](https:\u002F\u002Fgithub.com\u002FRiver-Zhang\u002FICEdit\u002Fissues\u002F1#issuecomment-2846568411) for crafting an amazing [ComfyUI-nunchaku demo](https:\u002F\u002Fgithub.com\u002FRiver-Zhang\u002FICEdit\u002Fissues\u002F1#issuecomment-2846568411)! Only **4GB VRAM GPU** is enough to run with ComfyUI-nunchaku!🚀 Dive in and give it a spin!\n- **[2025\u002F4\u002F30]** 🔥 We release the [Huggingface Demo](https:\u002F\u002Fhuggingface.co\u002Fspaces\u002FRiverZ\u002FICEdit) 🤗! Have a try!\n- **[2025\u002F4\u002F30]** 🔥 We release the [paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2504.20690) on arXiv!\n- **[2025\u002F4\u002F29]** We release the [project page](https:\u002F\u002Friver-zhang.github.io\u002FICEdit-gh-pages\u002F) and demo video! 
Codes will be made available in next week~ Happy Labor Day!\n\n\u003C\u002Fdetails>\n\n# 🎈 Tutorial on Bilibili or Youtube\n\n### 👑 Feel free to share your results in this [Gallery](https:\u002F\u002Fgithub.com\u002FRiver-Zhang\u002FICEdit\u002Fdiscussions\u002F21)!\n- **[2025\u002F5\u002F15]** 🌟 We find that [啦啦啦的小黄瓜](https:\u002F\u002Fspace.bilibili.com\u002F219572544) has made a detailed [bilibili tutorial](https:\u002F\u002Fwww.bilibili.com\u002Fvideo\u002FBV1tSEqzJE7q\u002F?share_source=copy_web&vd_source=8fcb933ee576af56337afc41509fa095) introducing our model! What a great video!\n- **[2025\u002F5\u002F14]** 🌟 We find that [Nenly同学](https:\u002F\u002Fspace.bilibili.com\u002F1814756990) has made a fantastic [bilibili tutorial](https:\u002F\u002Fwww.bilibili.com\u002Fvideo\u002FBV1bNEvzrEn1\u002F?share_source=copy_web&vd_source=8fcb933ee576af56337afc41509fa095) on how to use our repository! Great thanks to him!\n- **[2025\u002F5\u002F10]** 🌟 Great thanks to [月下Hugo](https:\u002F\u002Fwww.bilibili.com\u002Fvideo\u002FBV1JZVRzuE12\u002F?share_source=copy_web&vd_source=8fcb933ee576af56337afc41509fa095) for making a [Chinese tutorial](https:\u002F\u002Fwww.bilibili.com\u002Fvideo\u002FBV1JZVRzuE12\u002F?share_source=copy_web&vd_source=8fcb933ee576af56337afc41509fa095) on how to use our official workflow!\n- **[2025\u002F5\u002F7]** 🌟 Heartfelt thanks to [T8star](https:\u002F\u002Fx.com\u002FT8star_Aix) for making a [tutorial](https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=s6GMKL-Jjos) and [ComfyUI workflow](https:\u002F\u002Fwww.runninghub.cn\u002Fpost\u002F1920075398585974786\u002F?utm_source=kol01-RH099) on how to **increase the editing success to 100%**!🚀 Have a try!\n- **[2025\u002F5\u002F3]** 🌟 Heartfelt thanks to [softicelee2](https:\u002F\u002Fgithub.com\u002Fsofticelee2) for making a [Youtube video](https:\u002F\u002Fyoutu.be\u002FrRMc5DE4qMo) on how to use our model!\n# 📖 Table of Contents\n\n- [🎆 News](#-news)\n- [🎈 Tutorial on Bilibili or Youtube](#-tutorial-on-bilibili-or-youtube)\n    - [👑 Feel free to share your results in this Gallery!](#-feel-free-to-share-your-results-in-this-gallery)\n- [📖 Table of Contents](#-table-of-contents)\n    - [📢 Attention All: Incorrect ComfyUI Workflow Usage Alert!](#-attention-all-incorrect-comfyui-workflow-usage-alert)\n- [💼 Installation](#-installation)\n  - [Conda environment setup](#conda-environment-setup)\n  - [Download pretrained weights](#download-pretrained-weights)\n  - [Inference in bash (w\u002Fo VLM Inference-time Scaling)](#inference-in-bash-wo-vlm-inference-time-scaling)\n      - [For the usage of MoE-LoRA version](#for-the-usage-of-moe-lora-version)\n  - [Inference in Gradio Demo](#inference-in-gradio-demo)\n  - [💼 Windows one-click package](#-windows-one-click-package)\n- [🔧 Training](#-training)\n- [🎨ComfyUI Workflow](#comfyui-workflow)\n    - [Official ComfyUI-workflow](#official-comfyui-workflow)\n    - [ComfyUI-workflow for increased editing success rate](#comfyui-workflow-for-increased-editing-success-rate)\n    - [ComfyUI-nunchaku](#comfyui-nunchaku)\n    - [ComfyUI-workflow](#comfyui-workflow-1)\n- [⚠️ Tips](#️-tips)\n    - [If you encounter such a failure case, please **try again with a different seed**!](#if-you-encounter-such-a-failure-case-please-try-again-with-a-different-seed)\n    - [⚠️ Clarification](#️-clarification)\n- [💪 To Do List](#-to-do-list)\n- [💪 Comparison with Commercial Models](#-comparison-with-commercial-models)\n- [🌟 Star History](#-star-history)\n- [Bibtex](#bibtex)\n\n\n\n### 📢 Attention All: 
Incorrect ComfyUI Workflow Usage Alert!\n- ### We have released our **[official ComfyUI workflow](#official-comfyui-workflow)** for proper usage! Check our repository and have a try!\n- You need to **add the fixed pre-prompt \"A diptych with two side-by-side images of the same scene. On the right, the scene is exactly the same as on the left but {instruction}\"** before inputting the edit instructions, otherwise you may get bad results! (This is mentioned in the paper! The code for the Hugging Face gradio demo already embeds this prompt, so you can simply input the editing instructions without additional setup.)\n- The width of the input image must be resized to **512** (there is no restriction on the height).\n- Please **[use the Normal LoRA](https:\u002F\u002Fhuggingface.co\u002FRiverZ\u002Fnormal-lora\u002Ftree\u002Fmain)**, not the MoE-LoRA, because the MoE-LoRA cannot be loaded correctly by the ComfyUI LoRA loader.\n- 🔥💐🎆 Welcome to share your **creative workflows** (such as combining Redux, ACE, etc.) in the Issues section and showcase the results! We will include references so that more people can see your creativity.\n\n\n\n# 💼 Installation\n\n## Conda environment setup\n\n```bash\nconda create -n icedit python=3.10\nconda activate icedit\npip install -r requirements.txt\npip install -U huggingface_hub\n```\n\n## Download pretrained weights\n\nIf you can connect to Hugging Face, you don't need to download the weights. Otherwise, you need to download the weights locally:\n\n- [Flux.1-fill-dev](https:\u002F\u002Fhuggingface.co\u002Fblack-forest-labs\u002Fflux.1-fill-dev).\n- [ICEdit-normal-LoRA](https:\u002F\u002Fhuggingface.co\u002FRiverZ\u002Fnormal-lora\u002Ftree\u002Fmain).\n- [ICEdit-MoE-LoRA](https:\u002F\u002Fhuggingface.co\u002Fsanaka87\u002FICEdit-MoE-LoRA\u002Ftree\u002Fmain)\n\n~~Note: Due to some cooperation permission issues, we had to withdraw the weights and code of the MoE-LoRA temporarily. What is released currently is just the ordinary LoRA, but it still has powerful performance. If you urgently need the MoE-LoRA weights from the paper, please email the author.~~\n\n## Inference in bash (w\u002Fo VLM Inference-time Scaling)\n\nNow you can have a try!\n\n> Our model can **only edit images with a width of 512 pixels** (there is no restriction on the height). If you pass in an image with a width other than 512 pixels, the model will automatically resize it to 512 pixels.\n\n> If you find that the model fails to generate the expected results, please try changing the `--seed` parameter. Inference-time scaling with a VLM can do much to improve the results.\n\n```bash\npython scripts\u002Finference.py --image assets\u002Fgirl.png \\\n                            --instruction \"Make her hair dark green and her clothes checked.\" \\\n                            --seed 304897401\n```\n\nEditing a 512×768 image requires 35 GB of GPU memory. 
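\n\nFor readers who want to see what this step does conceptually, here is a minimal sketch of the diptych mechanism described above, written against the `diffusers` `FluxFillPipeline`: the original image becomes the left panel, the right panel is masked for inpainting, and the fixed pre-prompt carries the instruction. The exact names and arguments here are assumptions for illustration, not the repository's `scripts\u002Finference.py`.\n\n```python\n# Illustrative sketch only -- mirrors the diptych trick described in this README,\n# under the assumption that the right panel is the masked inpainting region.\nimport torch\nfrom PIL import Image\nfrom diffusers import FluxFillPipeline\n\ninstruction = \"Make her hair dark green and her clothes checked.\"\nprompt = (\"A diptych with two side-by-side images of the same scene. On the right, \"\n          f\"the scene is exactly the same as on the left but {instruction}\")\n\nsrc = Image.open(\"assets\u002Fgirl.png\").convert(\"RGB\")\nw = 512                                         # the width must be 512\nh = src.height * w \u002F\u002F src.width \u002F\u002F 16 * 16      # round height to a multiple of 16\nsrc = src.resize((w, h))\n\ndiptych = Image.new(\"RGB\", (2 * w, h))\ndiptych.paste(src, (0, 0))                      # left panel: the original image\nmask = Image.new(\"L\", (2 * w, h), 0)\nmask.paste(255, (w, 0, 2 * w, h))               # right panel: region to generate\n\npipe = FluxFillPipeline.from_pretrained(\n    \"black-forest-labs\u002FFLUX.1-Fill-dev\", torch_dtype=torch.bfloat16\n).to(\"cuda\")\npipe.load_lora_weights(\"RiverZ\u002Fnormal-lora\")    # ICEdit-normal-LoRA\n\nresult = pipe(prompt=prompt, image=diptych, mask_image=mask,\n              height=h, width=2 * w,\n              generator=torch.Generator(\"cuda\").manual_seed(304897401)).images[0]\nresult.crop((w, 0, 2 * w, h)).save(\"edited.png\")  # keep the edited right panel\n```\n\n(The repository's script takes care of the resizing, prompting, and cropping for you; the sketch is only meant to make the mechanism concrete.)\n\n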
If you need to run on a system with 24 GB of GPU memory (for example, an NVIDIA RTX 3090), you can add the `--enable-model-cpu-offload` parameter.\n\n```bash\npython scripts\u002Finference.py --image assets\u002Fgirl.png \\\n                            --instruction \"Make her hair dark green and her clothes checked.\" \\\n                            --enable-model-cpu-offload\n```\n\nIf you have downloaded the pretrained weights locally, please pass the paths during inference, as in:\n\n```bash\npython scripts\u002Finference.py --image assets\u002Fgirl.png \\\n                            --instruction \"Make her hair dark green and her clothes checked.\" \\\n                            --flux-path \u002Fpath\u002Fto\u002Fflux.1-fill-dev \\\n                            --lora-path \u002Fpath\u002Fto\u002FICEdit-normal-LoRA\n```\n\n#### For the usage of MoE-LoRA version\n```bash\npython scripts\u002Finference_moe.py --image assets\u002Fgirl.png \\\n                            --instruction \"Make her hair dark green and her clothes checked.\" \\\n                            --seed 42\n```\n\n```bash\npython scripts\u002Finference_moe.py --image assets\u002Fgirl.png \\\n                            --instruction \"Make her hair dark green and her clothes checked.\" \\\n                            --enable-model-cpu-offload\n```\n\n```bash\npython scripts\u002Finference_moe.py --image assets\u002Fgirl.png \\\n                            --instruction \"Make her hair dark green and her clothes checked.\" \\\n                            --flux-path \u002Fpath\u002Fto\u002Fflux.1-fill-dev \\\n                            --lora-path \u002Fpath\u002Fto\u002FICEdit-MoE-LoRA\n```\n\n## Inference in Gradio Demo\n\nWe provide a Gradio demo so you can edit images in a more user-friendly way. Run the following command to start the demo.\n\n```bash\npython scripts\u002Fgradio_demo.py --port 7860\n\n## for MoE version\npython scripts\u002Fgradio_demo_moe.py --port 7860\n```\n\nLike the inference script, if you want to run the demo on a system with 24 GB of GPU memory, you can add the `--enable-model-cpu-offload` parameter. 
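\n\nTo make the shape of such a demo concrete, here is a compact sketch of how an editing function can be wrapped in Gradio. The actual `scripts\u002Fgradio_demo.py` adds more controls (a LoRA scale slider, offload and path flags, and so on), so the names below are illustrative assumptions rather than the repository's code.\n\n```python\n# Minimal illustrative Gradio wrapper -- not the repository's gradio_demo.py.\n# `edit_image` stands in for the FluxFillPipeline-based routine sketched above.\nimport gradio as gr\n\ndef edit_image(image, instruction, seed):\n    # ...build the diptych, run the pipeline with the given seed, crop...\n    return image  # placeholder: return the edited PIL image instead\n\ndemo = gr.Interface(\n    fn=edit_image,\n    inputs=[gr.Image(type=\"pil\", label=\"Input image\"),\n            gr.Textbox(label=\"Edit instruction\"),\n            gr.Number(value=42, label=\"Seed\")],\n    outputs=gr.Image(label=\"Edited result\"),\n    title=\"ICEdit demo (sketch)\",\n)\ndemo.launch(server_port=7860)\n```\n\n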
If you have downloaded the pretrained weights locally, please pass the paths during inference, as in:\n\n```bash\npython scripts\u002Fgradio_demo.py --port 7860 \\\n                              --flux-path \u002Fpath\u002Fto\u002Fflux.1-fill-dev (optional) \\\n                              --lora-path \u002Fpath\u002Fto\u002FICEdit-normal-LoRA (optional) \\\n                              --enable-model-cpu-offload (optional)\n\n## for MoE version\npython scripts\u002Fgradio_demo_moe.py --port 7860 \\\n                              --flux-path \u002Fpath\u002Fto\u002Fflux.1-fill-dev (optional) \\\n                              --lora-path \u002Fpath\u002Fto\u002FICEdit-normal-LoRA (optional) \\\n                              --enable-model-cpu-offload (optional)\n```\n\nOr, if you want to run the demo on a system with 10 GB of GPU memory, you can download the GGUF models from [FLUX.1-Fill-dev-gguf](https:\u002F\u002Fhuggingface.co\u002FYarvixPA\u002FFLUX.1-Fill-dev-gguf) and [t5-v1_1-xxl-encoder-gguf](https:\u002F\u002Fhuggingface.co\u002Fcity96\u002Ft5-v1_1-xxl-encoder-gguf) and pass them during inference, as in:\n\n```bash\npython scripts\u002Fgradio_demo.py --port 7861 \\\n                              --flux-path models\u002Fflux.1-fill-dev \\\n                              --lora-path models\u002FICEdit-normal-LoRA \\\n                              --transformer models\u002Fflux1-fill-dev-Q4_0.gguf \\\n                              --text_encoder_2 models\u002Ft5-v1_1-xxl-encoder-Q8_0.gguf \\\n                              --enable-model-cpu-offload\n```\n\nThen you can open the link in your browser to edit images.\n\n\u003Cdiv align=\"center\">\n\u003Cdiv style=\"width: 80%; text-align: left; margin:auto;\">\n    \u003Cimg style=\"width:100%\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FRiver-Zhang_ICEdit_readme_4519622a8c26.png\">\n    \u003Cp style=\"text-align: left;\">Gradio Demo: just input the instruction and wait for the result!\u003C\u002Fp>\n\u003C\u002Fdiv>\n\n\u003Cdiv align=\"left\">\n\nHere is also a Chinese tutorial [YouTube video](https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=rRMc5DE4qMo) on how to install and use ICEdit, created by [softicelee2](https:\u002F\u002Fgithub.com\u002Fsofticelee2). It's definitely worth a watch!\n\n## 💼 Windows one-click package\n\nGreat thanks to [gluttony-10](https:\u002F\u002Fgithub.com\u002FRiver-Zhang\u002FICEdit\u002Fissues\u002F23#issue-3050804566), a famous [Bilibili UP](https:\u002F\u002Fspace.bilibili.com\u002F893892)! He made a tutorial ([YouTube](https:\u002F\u002Fyoutu.be\u002FC-OpWlJi424) and [Bilibili](https:\u002F\u002Fwww.bilibili.com\u002Fvideo\u002FBV1oT5uzzEbs)) on how to install our project on Windows, plus a one-click package for Windows! **Just unzip it and it's ready to use**. The package has been quantized, so 
it only takes up 14 GB of space and supports 50-series graphics cards.\n\nDownload link: [Google Drive](https:\u002F\u002Fdrive.google.com\u002Fdrive\u002Ffolders\u002F16j3wQvWjuzCRKnVolszLmhCtc_yOCqcx?usp=sharing) or [Baidu Wangpan](https:\u002F\u002Fwww.bilibili.com\u002Fvideo\u002FBV1oT5uzzEbs\u002F?vd_source=2a911c0bc75f6d9b9d056bf0e7410d45) (refer to the comment section of the video)\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FRiver-Zhang_ICEdit_readme_14bc04e1025d.png\" width=\"80%\" style=\"display: block; margin: auto;\">\n\n\n# 🔧 Training\n\nFind more details here: [Training Code](.\u002Ftrain\u002F)\n\n\n# 🎨ComfyUI Workflow\n\n\n### Official ComfyUI-workflow\nWe have released our **official ComfyUI workflow** in this repository for correct usage of our model! **We have embedded the prompt \"A diptych with two side-by-side images of the same scene ... but\" into our nodes**, so you just need to input an edit instruction such as \"make the girl wear pink sunglasses\". We have also added a high-resolution refinement module for better image quality! The total VRAM consumption is about 14 GB. Use this [workflow](https:\u002F\u002Fgithub.com\u002Fhayd-zju\u002FICEdit-ComfyUI-official) and the [ICEdit-normal-lora](https:\u002F\u002Fhuggingface.co\u002FRiverZ\u002Fnormal-lora\u002Ftree\u002Fmain) to fulfill your creative ideas!\n\nWe have specially created [a repository for the workflow](https:\u002F\u002Fgithub.com\u002Fhayd-zju\u002FICEdit-ComfyUI-official), and you can **install it directly in ComfyUI**. Just open the manager tab, click **'Install via Git URL'**, and copy in the following URL. For more details, please refer to this [issue](https:\u002F\u002Fgithub.com\u002FRiver-Zhang\u002FICEdit\u002Fissues\u002F22#issuecomment-2864977880).\n\n**URL:** [https:\u002F\u002Fgithub.com\u002Fhayd-zju\u002FICEdit-ComfyUI-official](https:\u002F\u002Fgithub.com\u002Fhayd-zju\u002FICEdit-ComfyUI-official)\n\n \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FRiver-Zhang_ICEdit_readme_c3ddde9581d2.png\" width=\"80%\" style=\"display: block; margin: auto;\">\n \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FRiver-Zhang_ICEdit_readme_a9a466a49d78.png\" width=\"80%\" style=\"display: block; margin: auto;\">\n\nGreat thanks to [月下Hugo](https:\u002F\u002Fwww.bilibili.com\u002Fvideo\u002FBV1JZVRzuE12\u002F?share_source=copy_web&vd_source=8fcb933ee576af56337afc41509fa095) for making a [Chinese tutorial](https:\u002F\u002Fwww.bilibili.com\u002Fvideo\u002FBV1JZVRzuE12\u002F?share_source=copy_web&vd_source=8fcb933ee576af56337afc41509fa095) on how to use our official workflow!\n\n### ComfyUI-workflow for increased editing success rate\nThanks to [T8star](https:\u002F\u002Fx.com\u002FT8star_Aix)! He made a tutorial ([YouTube](https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=s6GMKL-Jjos) and [Bilibili](https:\u002F\u002Fwww.bilibili.com\u002Fvideo\u002FBV11HVhz1Eky\u002F?spm_id_from=333.40164.top_right_bar_window_dynamic.content.click&vd_source=2a911c0bc75f6d9b9d056bf0e7410d45)) and a creative workflow ([OpenArt](https:\u002F\u002Fopenart.ai\u002Fworkflows\u002Ft8star\u002Ficedit100v1\u002FHN4EZ2Cej98ZX8CC1RK5) and [RunningHub](https:\u002F\u002Fwww.runninghub.cn\u002Fpost\u002F1920075398585974786\u002F?utm_source=kol01-RH099)) that can greatly increase the editing success rate (to about 100%)! 
Have a try with it!\n\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FRiver-Zhang_ICEdit_readme_a23195273e01.png\" width=\"80%\" style=\"display: block; margin: auto;\">\n\n\n### ComfyUI-nunchaku\n\nWe extend our heartfelt thanks to @[judian17](https:\u002F\u002Fgithub.com\u002Fjudian17) for crafting a ComfyUI [workflow](https:\u002F\u002Fgithub.com\u002FRiver-Zhang\u002FICEdit\u002Fissues\u002F1#issuecomment-2846568411) that facilitates seamless usage of our model. Explore this excellent [workflow](https:\u002F\u002Fgithub.com\u002FRiver-Zhang\u002FICEdit\u002Fissues\u002F1#issuecomment-2846568411) to effortlessly run our model within ComfyUI. A GPU with only **4 GB of VRAM** is enough when running with ComfyUI-nunchaku!\n\nThis workflow incorporates high-definition refinement, yielding remarkably good results. Moreover, integrating this LoRA with Redux enables outfit changes to a certain degree. Once again, a huge thank you to @[judian17](https:\u002F\u002Fgithub.com\u002Fjudian17) for his innovative contributions!\n\n![comfyui image](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FRiver-Zhang_ICEdit_readme_4faa9284acda.png)\n\n\n### ComfyUI-workflow\n\nThanks to [Datou](https:\u002F\u002Fx.com\u002FDatou), a ComfyUI workflow for ICEdit can also be downloaded [here](https:\u002F\u002Fopenart.ai\u002Fworkflows\u002Fdatou\u002Ficedit-moe-lora-flux-fill\u002FQFmaWNKsQo3P5liYz4RB). Try it with the [normal LoRA ckpt](https:\u002F\u002Fhuggingface.co\u002FRiverZ\u002Fnormal-lora\u002Ftree\u002Fmain).\n\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FRiver-Zhang_ICEdit_readme_c5e0a63935bf.png\" width=\"80%\" style=\"display: block; margin: auto;\">\n\n\n\n\n\n\n# ⚠️ Tips\n\n### If you encounter such a failure case, please **try again with a different seed**!\n\n- Our base model, FLUX, does not inherently support a wide range of styles, so a large portion of our dataset involves style transfer. As a result, the model **may sometimes inexplicably change your artistic style**.\n\n- Our training dataset is **mostly targeted at realistic images**. For non-realistic images, such as **anime** or **blurry pictures**, the editing success rate **drops, which could affect the final image quality**.\n\n- While the success rates for adding objects, modifying color attributes, applying style transfer, and changing backgrounds are high, the success rate for object removal is relatively lower due to the low quality of the removal dataset we use.\n\nThe current model is the one used in the experiments in the paper, trained on only 4 A800 GPUs (total `batch_size` = 2 x 2 x 4 = 16). In the future, we will enhance the dataset, scale up training, and finally release a more powerful model.\n\n### ⚠️ Clarification\n\nWe've noticed numerous web pages related to ICEdit, including [https:\u002F\u002Ficedit.net\u002F](https:\u002F\u002Ficedit.net\u002F) and [https:\u002F\u002Ficedit.org\u002F](https:\u002F\u002Ficedit.org\u002F). Kudos to those who built these pages!\n\nHowever, we'd like to emphasize two important points:\n- **No Commercial Use**: Our project **cannot** be used for commercial purposes. 
Please check the [LICENSE](https:\u002F\u002Fgithub.com\u002FRiver-Zhang\u002FICEdit\u002Fblob\u002Fmain\u002FLICENSE) for details.\n- **Official Page**: The official project page is [https:\u002F\u002Friver-zhang.github.io\u002FICEdit-gh-pages\u002F](https:\u002F\u002Friver-zhang.github.io\u002FICEdit-gh-pages\u002F).\n\n\n\n\n\n# 💪 To Do List\n\n- [x] Inference Code\n- [ ] Inference-time Scaling with VLM\n- [x] Pretrained Weights\n- [x] More Inference Demos\n- [x] Gradio demo\n- [x] Comfy UI demo (by @[judian17](https:\u002F\u002Fgithub.com\u002FRiver-Zhang\u002FICEdit\u002Fissues\u002F1#issuecomment-2846568411), compatible with [nunchaku](https:\u002F\u002Fgithub.com\u002Fmit-han-lab\u002FComfyUI-nunchaku), support high-res refinement and FLUX Redux. Only 4GB VRAM GPU is enough to run!)\n- [x] Comfy UI demo with normal lora (by @[Datou](https:\u002F\u002Fopenart.ai\u002Fworkflows\u002Fdatou\u002Ficedit-moe-lora-flux-fill\u002FQFmaWNKsQo3P5liYz4RB) in openart)\n- [x] Official ComfyUI workflow\n- [x] Training Code\n- [ ] LoRA for higher image resolution (768, 1024)\n\n\n\n# 💪 Comparison with Commercial Models\n\n\u003Cdiv align=\"center\">\n\u003Cdiv style=\"width: 80%; text-align: left; margin:auto;\">\n    \u003Cimg style=\"width:100%\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FRiver-Zhang_ICEdit_readme_545b98c5b1cc.png\">\n    \u003Cp style=\"text-align: left;\">Compared with commercial models such as Gemini and GPT-4o, our methods are comparable to and even superior to these commercial models in terms of character ID preservation and instruction following. \u003Cb>We are more open-source than them, with lower costs, faster speed (it takes about 9 seconds to process one image), and powerful performance\u003C\u002Fb>.\u003C\u002Fp>\n\u003C\u002Fdiv>\n\n\n\u003Cdiv align=\"left\">\n\n\n# 🌟 Star History\n\n[![Star History Chart](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FRiver-Zhang_ICEdit_readme_4543fbb1f228.png)](https:\u002F\u002Fwww.star-history.com\u002F#River-Zhang\u002FICEdit&Date)\n\n# Bibtex\nIf this work is helpful for your research, please consider citing the following BibTeX entry.\n\n```\n@article{zhang2025context,\n  title={In-context edit: Enabling instructional image editing with in-context generation in large scale diffusion transformer},\n  author={Zhang, Zechuan and Xie, Ji and Lu, Yu and Yang, Zongxin and Yang, Yi},\n  journal={arXiv preprint arXiv:2504.20690},\n  year={2025}\n}\n\n@inproceedings{zhang2025icedit,\n  title     = {In-Context Edit: Enabling Instructional Image Editing with In-Context Generation in Large-Scale Diffusion Transformers},\n  author    = {Zhang, Zechuan and Xie, Ji and Lu, Yu and Yang, Zongxin and Yang, Yi},\n  booktitle = {Advances in Neural Information Processing Systems (NeurIPS)},\n  year      = {2025},\n  note      = {arXiv:2504.20690}\n}\n\n```\n","\u003Cdiv align=\"center\">\n\n\u003Ch1>上下文编辑：利用大规模扩散Transformer中的上下文生成实现指令式图像编辑\u003C\u002Fh1>\n\n\u003Cdiv>\n    \u003Ca href=\"https:\u002F\u002Friver-zhang.github.io\u002Fzechuanzhang\u002F\u002F\" target=\"_blank\">张泽川\u003C\u002Fa>&emsp;\n    \u003Ca href=\"https:\u002F\u002Fhorizonwind2004.github.io\u002F\" target=\"_blank\">谢骥\u003C\u002Fa>&emsp;\n    \u003Ca href=\"https:\u002F\u002Fyulu.net.cn\u002F\" target=\"_blank\">陆宇\u003C\u002Fa>&emsp;\n    \u003Ca href=\"https:\u002F\u002Fz-x-yang.github.io\u002F\" target=\"_blank\">杨宗鑫\u003C\u002Fa>&emsp;\n    \u003Ca href=\"https:\u002F\u002Fscholar.google.com\u002Fcitations?user=RMSuNFwAAAAJ&hl=zh-CN&oi=ao\" 
target=\"_blank\">杨毅✉\u003C\u002Fa>&emsp;\n\u003C\u002Fdiv>\n\u003Cdiv>\n    浙江大学ReLER实验室、CCAI；哈佛大学\n\u003C\u002Fdiv>\n\u003Cdiv>\n     \u003Csup>✉\u003C\u002Fsup>通讯作者\n\u003C\u002Fdiv>\n\u003Cdiv>\n    \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2504.20690\" target=\"_blank\">Arxiv\u003C\u002Fa>&emsp;\n    \u003Ca href=\"https:\u002F\u002Fhuggingface.co\u002Fspaces\u002FRiverZ\u002FICEdit\" target=\"_blank\">Huggingface演示 🤗\u003C\u002Fa>&emsp;\n    \u003Ca href=\"https:\u002F\u002Fhuggingface.co\u002Fsanaka87\u002FICEdit-MoE-LoRA\u002Ftree\u002Fmain\" target=\"_blank\">模型 🤗\u003C\u002Fa>&emsp;\n    \u003Ca href=\"https:\u002F\u002Friver-zhang.github.io\u002FICEdit-gh-pages\u002F\" target=\"_blank\">项目页面\u003C\u002Fa>\n\u003C\u002Fdiv>\n\n\n\u003Cdiv style=\"width: 80%; margin:auto;\">\n    \u003Cimg style=\"width:100%; display: block; margin: auto;\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FRiver-Zhang_ICEdit_readme_515b75bb2956.png\">\n    \u003Cp style=\"text-align: left;\">\u003Cstrong>图像编辑只需一个LoRA就够了！\u003C\u002Fstrong>我们提出了上下文编辑方法，这是一种新颖的基于指令的编辑技术，\u003Cb>仅使用先前SOTA方法所需训练数据的0.5%和参数量的1%\u003C\u002Fb>,却达到了最先进的水平。第一行展示了一系列高精度的多轮编辑结果，而第二、三行则展示了我们方法在单轮编辑中产生的多样化且视觉效果惊艳的结果。\u003C\u002Fp>\n\u003C\u002Fdiv>\n\n:open_book: 更多可视化结果，请访问我们的\u003Ca href=\"https:\u002F\u002Friver-zhang.github.io\u002FICEdit-gh-pages\u002F\" target=\"_blank\">项目页面\u003C\u002Fa>\n\n\n\u003Cdiv align=\"left\">\n\n\n# 🎆 新闻 \n- **[2025\u002F9\u002F19]** 🔥 我们已开源了我们的[MoE版本ICEdit及检查点](#for-the-usage-of-moe-lora-version)。快来试试吧！🚀\n- **[2025\u002F9\u002F18]** 🌟 ICEdit已被NeurIPS 2025接收！🎉 圣地亚哥见！\n- **[2025\u002F8\u002F21]** 🌟 我们发布了[基于Ascend（华为NPU）的ICEdit版本](https:\u002F\u002Fgithub.com\u002F2018liuzhiyuan\u002FICEdit-on-Ascend-NPU)。现在你可以在Ascend NPU上运行ICEdit了！非常感谢[Zhiyuan](https:\u002F\u002Fgithub.com\u002F2018liuzhiyuan)！\n- **[2025\u002F5\u002F16]** 🌟 非常感谢[gluttony-10 (十字鱼)](https:\u002F\u002Fgithub.com\u002FRiver-Zhang\u002FICEdit\u002Fpull\u002F47#issue-3067039788)将Gradio演示适配为[GGUF量化](#inference-in-gradio-demo)，进一步将内存占用降低至**10GB**。\n- **[2025\u002F5\u002F14]** 🔥 在[官方Comfy-org](https:\u002F\u002Fwww.comfy.org\u002Fzh-cn\u002F)的帮助下，我们已将我们的ComfyUI节点集成到[Comfy Registry](https:\u002F\u002Fregistry.comfy.org\u002Fnodes\u002FICEdit)中！\n- **[2025\u002F5\u002F13]** 🔥 我们发布了[训练代码](.\u002Ftrain\u002F)！现在就训练属于你的编辑LoRA吧！\n- **[2025\u002F5\u002F11]** 🌟 非常感谢[gluttony-10 (十字鱼)](https:\u002F\u002Fgithub.com\u002FRiver-Zhang\u002FICEdit\u002Fissues\u002F23#issue-3050804566)制作了[Windows版Gradio演示](#inference-in-gradio-demo-on-windows)，让你可以在Windows上使用我们的项目！\n- **[2025\u002F5\u002F8]** 🔥 我们发布了我们的**[官方ComfyUI工作流](#official-comfyui-workflow)**！🚀 快去仓库看看并试一试吧！\n\n\u003Cdetails>\n\u003Csummary>\u003Cstrong>点击展开\u002F收起新闻\u003C\u002Fstrong>\u003C\u002Fsummary>\n\n- **[2025\u002F5\u002F8]** 🔥 我们在Gradio演示中添加了LoRA缩放滑块。你可以尝试用不同的缩放比例来发现更多有趣的演示效果！ \n\u003Cdiv align=\"center\">\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FRiver-Zhang_ICEdit_readme_cb46b441b1fb.png\" width=\"70%\" style=\"display: block; margin: auto;\">\n\u003Cdiv align=\"left\">\n\n- **[2025\u002F5\u002F7]** 🌟 我们更新了一些使用ComfyUI工作流时的注意事项，以避免出现不理想的效果！ \n- **[2025\u002F5\u002F6]** 🔥 ICEdit目前在[Hugging Face space](https:\u002F\u002Fhuggingface.co\u002Fspaces)的总体\u002F每周趋势榜单上排名**第2位**。感谢大家的支持与喜爱！🤗\n- **[2025\u002F5\u002F5]** 🌟 衷心感谢[Datou](https:\u002F\u002Fx.com\u002FDatou)在OpenArt平台上创建了一个精彩的[ComfyUI工作流](https:\u002F\u002Fopenart.ai\u002Fworkflows\u002Fdatou\u002Ficedit-moe-lora-flux-fill\u002FQFmaWNKsQo3P5liYz4RB)！🚀 赶快试试吧！\n- 
**[2025\u002F5\u002F2]** 🌟 衷心感谢[judian17](https:\u002F\u002Fgithub.com\u002FRiver-Zhang\u002FICEdit\u002Fissues\u002F1#issuecomment-2846568411)打造了一个令人惊叹的[ComfyUI双节棍演示](https:\u002F\u002Fgithub.com\u002FRiver-Zhang\u002FICEdit\u002Fissues\u002F1#issuecomment-2846568411)！只需**4GB显存的GPU**就能运行ComfyUI双节棍！🚀 快来体验一下吧！\n- **[2025\u002F4\u002F30]** 🔥 我们发布了[Huggingface演示](https:\u002F\u002Fhuggingface.co\u002Fspaces\u002FRiverZ\u002FICEdit) 🤗！快来试试吧！\n- **[2025\u002F4\u002F30]** 🔥 我们在arXiv上发布了[论文](https:\u002F\u002Farxiv.org\u002Fabs\u002F2504.20690)！\n- **[2025\u002F4\u002F29]** 我们发布了[项目页面](https:\u002F\u002Friver-zhang.github.io\u002FICEdit-gh-pages\u002F)和演示视频！代码将在下周公开~ 祝大家劳动节快乐！\n\n\u003C\u002Fdetails>\n\n# 🎈 Bilibili或Youtube教程\n\n### 👑 欢迎在本[图库](https:\u002F\u002Fgithub.com\u002FRiver-Zhang\u002FICEdit\u002Fdiscussions\u002F21)分享你的成果！\n- **[2025\u002F5\u002F15]** 🌟 我们发现[啦啦啦的小黄瓜](https:\u002F\u002Fspace.bilibili.com\u002F219572544)制作了一段详细的[Bilibili教程](https:\u002F\u002Fwww.bilibili.com\u002Fvideo\u002FBV1tSEqzJE7q\u002F?share_source=copy_web&vd_source=8fcb933ee576af56337afc41509fa095)介绍我们的模型！真是太棒的视频了！\n- **[2025\u002F5\u002F14]** 🌟 我们发现[Nenly同学](https:\u002F\u002Fspace.bilibili.com\u002F1814756990)制作了一段精彩的[Bilibili教程](https:\u002F\u002Fwww.bilibili.com\u002Fvideo\u002FBV1bNEvzrEn1\u002F?share_source=copy_web&vd_source=8fcb933ee576af56337afc41509fa095)讲解如何使用我们的仓库！非常感谢他！\n- **[2025\u002F5\u002F10]** 🌟 非常感谢[月下Hugo](https:\u002F\u002Fwww.bilibili.com\u002Fvideo\u002FBV1JZVRzuE12\u002F?share_source=copy_web&vd_source=8fcb933ee576af56337afc41509fa095)制作了一段[中文教程](https:\u002F\u002Fwww.bilibili.com\u002Fvideo\u002FBV1JZVRzuE12\u002F?share_source=copy_web&vd_source=8fcb933ee576af56337afc41509fa095)介绍如何使用我们的官方工作流！\n- **[2025\u002F5\u002F7]** 🌟 衷心感谢[T8star](https:\u002F\u002Fx.com\u002FT8star_Aix)制作了一段[教程](https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=s6GMKL-Jjos)和一个[ComfyUI工作流](https:\u002F\u002Fwww.runninghub.cn\u002Fpost\u002F1920075398585974786\u002F?utm_source=kol01-RH099)讲解如何**将编辑成功率提升至100%**！🚀 赶快试试吧！\n- **[2025\u002F5\u002F3]** 🌟 衷心感谢[softicelee2](https:\u002F\u002Fgithub.com\u002Fsofticelee2)制作了一段[Youtube视频](https:\u002F\u002Fyoutu.be\u002FrRMc5DE4qMo)讲解如何使用我们的模型！\n\n# 📖 目录\n\n- [🎆 新闻](#-news)\n- [🎈 Bilibili 或 YouTube 教程](#-tutorial-on-bilibili-or-youtube)\n    - [👑 欢迎在本画廊分享你的成果！](#-feel-free-to-share-your-results-in-this-gallery)\n- [📖 目录](#-table-of-contents)\n    - [📢 全体注意：ComfyUI 工作流使用错误提醒！](#-attention-all-incorrect-comfyui-workflow-usage-alert)\n- [💼 安装](#-installation)\n  - [Conda 环境搭建](#conda-environment-setup)\n  - [下载预训练权重](#download-pretrained-weights)\n  - [在 Bash 中进行推理（不使用 VLM 推理时缩放）](#inference-in-bash-wo-vlm-inference-time-scaling)\n      - [关于 MoE-LoRA 版本的使用说明](#for-the-usage-of-moe-lora-version)\n  - [在 Gradio Demo 中进行推理](#inference-in-gradio-demo)\n  - [💼 Windows 一键安装包](#-windows-one-click-package)\n- [🔧 训练](#-training)\n- [🎨 ComfyUI 工作流](#comfyui-workflow)\n    - [官方 ComfyUI 工作流](#official-comfyui-workflow)\n    - [提高编辑成功率的 ComfyUI 工作流](#comfyui-workflow-for-increased-editing-success-rate)\n    - [ComfyUI-nunchaku](#comfyui-nunchaku)\n    - [ComfyUI 工作流](#comfyui-workflow-1)\n- [⚠️ 小贴士](#️-tips)\n    - [如果你遇到这样的失败情况，请**尝试使用不同的随机种子重试**！](#if-you-encounter-such-a-failure-case-please-try-again-with-a-different-seed)\n    - [⚠️ 说明](#️-clarification)\n- [💪 待办事项](#-to-do-list)\n- [💪 与商业模型的对比](#-comparison-with-commercial-models)\n- [🌟 星标历史](#-star-history)\n- [Bibtex](#bibtex)\n\n\n\n### 📢 全体注意：ComfyUI 工作流使用错误提醒！\n- ### 我们已发布用于正确使用的**[官方 ComfyUI 
工作流](#official-comfyui-workflow)**！请查看我们的仓库并尝试一下！\n- 在输入编辑指令之前，你需要**添加固定的前置提示“一幅由两幅并排图像组成的双联画，场景完全相同。右侧的场景与左侧完全一致，但{instruction}”**，否则可能会得到较差的结果！（这一点在论文中已有提及！Hugging Face Gradio 演示代码已经嵌入了这个提示。因此，你只需直接输入编辑指令即可，无需额外设置。）\n- 输入图像的宽度必须调整为**512**（高度无限制）。\n- 请**使用 Normal LoRA**（https:\u002F\u002Fhuggingface.co\u002FRiverZ\u002Fnormal-lora\u002Ftree\u002Fmain），而不是 MoE-LoRA，因为 MoE-LoRA 无法通过 ComfyUI 的 LoRA 加载器正确加载。\n- 🔥💐🎆 欢迎在 Issues 栏目中分享你的**创意工作流**（例如结合 Redux、ACE 等），并展示成果！我们会附上引用链接，让更多人看到你的创意。\n\n\n\n# 💼 安装\n\n## Conda 环境搭建\n\n```bash\nconda create -n icedit python=3.10\nconda activate icedit\npip install -r requirements.txt\npip install -U huggingface_hub\n```\n\n## 下载预训练权重\n\n如果你可以连接到 Hugging Face，则无需下载权重。否则，你需要将权重下载到本地。\n\n- [Flux.1-fill-dev](https:\u002F\u002Fhuggingface.co\u002Fblack-forest-labs\u002Fflux.1-fill-dev)。\n- [ICEdit-normal-LoRA](https:\u002F\u002Fhuggingface.co\u002FRiverZ\u002Fnormal-lora\u002Ftree\u002Fmain)。\n- [ICEdit-MoE-LoRA](https:\u002F\u002Fhuggingface.co\u002Fsanaka87\u002FICEdit-MoE-LoRA\u002Ftree\u002Fmain)\n\n~~注：由于一些合作权限问题，我们暂时撤回了 moe-lora 的权重和代码。目前发布的只是普通的 LoRA，但它仍然具有强大的性能。如果你急需原文中的 moe lora 权重，请联系作者。~~\n\n## 在 Bash 中进行推理（不使用 VLM 推理时缩放）\n\n现在你可以尝试一下了！\n\n> 我们的模型**只能编辑宽度为 512 像素的图像**（高度没有限制）。如果你输入的图像宽度不是 512 像素，模型会自动将其调整为 512 像素。\n\n> 如果你发现模型未能生成预期结果，请尝试更改 `--seed` 参数。使用 VLM 进行推理时缩放可以显著改善结果。\n\n```bash\npython scripts\u002Finference.py --image assets\u002Fgirl.png \\\n                            --instruction \"让她把头发染成深绿色，衣服换成格子图案。\" \\\n                            --seed 304897401 \\\n```\n\n编辑一张 512×768 的图像需要 35 GB 的显存。如果你需要在只有 24 GB 显存的系统上运行（例如 NVIDIA RTX3090），可以添加 `--enable-model-cpu-offload` 参数。\n\n```bash\npython scripts\u002Finference.py --image assets\u002Fgirl.png \\\n                            --instruction \"让她把头发染成深绿色，衣服换成格子图案。\" \\\n                            --enable-model-cpu-offload\n```\n\n如果你已将预训练权重下载到本地，请在推理时传递参数，如下所示：\n\n```bash\npython scripts\u002Finference.py --image assets\u002Fgirl.png \\\n                            --instruction \"让她把头发染成深绿色，衣服换成格子图案。\" \\\n                            --flux-path \u002Fpath\u002Fto\u002Fflux.1-fill-dev \\\n                            --lora-path \u002Fpath\u002Fto\u002FICEdit-normal-LoRA\n```\n\n#### 关于 MoE-LoRA 版本的使用方法\n```bash\npython scripts\u002Finference_moe.py --image assets\u002Fgirl.png \\\n                                --instruction \"让她把头发染成深绿色，衣服换成格子图案。\" \\\n                                --seed 42 \\\n```\n\n```bash\npython scripts\u002Finference_moe.py --image assets\u002Fgirl.png \\\n                                --instruction \"让她把头发染成深绿色，衣服换成格子图案。\" \\\n                                --enable-model-cpu-offload\n```\n\n```bash\npython scripts\u002Finference_moe.py --image assets\u002Fgirl.png \\\n                                --instruction \"让她把头发染成深绿色，衣服换成格子图案。\" \\\n                                --flux-path \u002Fpath\u002Fto\u002Fflux.1-fill-dev \\\n                                --lora-path \u002Fpath\u002Fto\u002FICEdit-MoE-LoRA\n```\n\n## 在 Gradio Demo 中进行推理\n\n我们提供了一个 Gradio 演示，方便你以更友好的方式编辑图像。你可以运行以下命令来启动演示。\n\n```bash\npython scripts\u002Fgradio_demo.py --port 7860\n\n\n\n## 对于 MoE 版本\npython scripts\u002Fgradio_demo_moe.py --port 7860\n\n```\n\n与推理脚本类似，如果你想在只有 24 GB 显存的系统上运行演示，可以添加 `--enable-model-cpu-offload` 参数。如果你已将预训练权重下载到本地，请在推理时传递参数，如下所示：\n\n```bash\npython scripts\u002Fgradio_demo.py --port 7860 \\\n                              --flux-path \u002Fpath\u002Fto\u002Fflux.1-fill-dev（可选） \\\n                              --lora-path 
\u002Fpath\u002Fto\u002FICEdit-normal-LoRA（可选） \\\n                              --enable-model-cpu-offload（可选） \\\n\n## 用于 MoE 版本\npython scripts\u002Fgradio_demo_moe.py --port 7860 \\\n                              --flux-path \u002Fpath\u002Fto\u002Fflux.1-fill-dev (可选) \\\n                              --lora-path \u002Fpath\u002Fto\u002FICEdit-normal-LoRA (可选) \\\n                              --enable-model-cpu-offload (可选) \\\n```\n\n或者，如果你想在只有 10 GB 显存的设备上运行演示，可以从 [FLUX.1-Fill-dev-gguf](https:\u002F\u002Fhuggingface.co\u002FYarvixPA\u002FFLUX.1-Fill-dev-gguf) 和 [t5-v1_1-xxl-encoder-gguf](https:\u002F\u002Fhuggingface.co\u002Fcity96\u002Ft5-v1_1-xxl-encoder-gguf) 下载 gguf 模型，并在推理时传递这些参数，如下所示：\n\n```bash\npython scripts\u002Fgradio_demo.py --port 7861 \\\n                              --flux-path models\u002Fflux.1-fill-dev \\\n                              --lora-path models\u002FICEdit-normal-LoRA \\\n                              --transformer models\u002Fflux1-fill-dev-Q4_0.gguf \\\n                              --text_encoder_2 models\u002Ft5-v1_1-xxl-encoder-Q8_0.gguf \\\n                              --enable-model-cpu-offload \\\n```\n\n然后你可以在浏览器中打开链接来编辑图片。\n\n\u003Cdiv align=\"center\">\n\u003Cdiv style=\"width: 80%; text-align: left; margin:auto;\">\n    \u003Cimg style=\"width:100%\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FRiver-Zhang_ICEdit_readme_4519622a8c26.png\">\n    \u003Cp style=\"text-align: left;\">Gradio 演示：只需输入指令并等待结果！\u003C\u002Fb>.\u003C\u002Fp>\n\u003C\u002Fdiv>\n\n\u003Cdiv align=\"left\">\n\n这里还有一段由 [softicelee2](https:\u002F\u002Fgithub.com\u002Fsofticelee2) 制作的 ICEdit 安装与使用中文教程 [YouTube 视频](https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=rRMc5DE4qMo)，非常值得一看！\n\n## 💼 Windows 一键安装包\n\n非常感谢 [gluttony-10](https:\u002F\u002Fgithub.com\u002FRiver-Zhang\u002FICEdit\u002Fissues\u002F23#issue-3050804566)，这位著名的 [Bilibili UP 主](https:\u002F\u002Fspace.bilibili.com\u002F893892)! 
他制作了关于如何在 Windows 上安装我们项目的教程（[YouTube](https:\u002F\u002Fyoutu.be\u002FC-OpWlJi424) 和 [Bilibili](https:\u002F\u002Fwww.bilibili.com\u002Fvideo\u002FBV1oT5uzzEbs))，以及一个 Windows 一键安装包！**只需解压即可使用**。该安装包经过量化处理，仅占用 14GB 磁盘空间，并支持 50 系列显卡。\n\n下载链接：[Google Drive](https:\u002F\u002Fdrive.google.com\u002Fdrive\u002Ffolders\u002F16j3wQvWjuzCRKnVolszLmhCtc_yOCqcx?usp=sharing) 或 [百度网盘](https:\u002F\u002Fwww.bilibili.com\u002Fvideo\u002FBV1oT5uzzEbs\u002F?vd_source=2a911c0bc75f6d9b9d056bf0e7410d45)（请参考视频评论区）\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FRiver-Zhang_ICEdit_readme_14bc04e1025d.png\" width=\"80%\" style=\"display: block; margin: auto;\">\n\n\n# 🔧 训练\n\n更多细节请参见：[训练代码](.\u002Ftrain\u002F)\n\n\n# 🎨ComfyUI 工作流\n\n\n### 官方 ComfyUI 工作流\n我们在本仓库中发布了我们的**官方 ComfyUI 工作流**，以便正确使用我们的模型！**我们已将提示词“一幅由两幅并排图像组成的二联画，描绘同一场景……但”嵌入到我们的节点中**，你只需输入编辑指令，例如“让女孩戴上粉色太阳镜”。此外，我们还添加了一个高分辨率细化模块，以获得更好的图像质量！整个工作流的显存消耗约为 14GB。使用这个[工作流](https:\u002F\u002Fgithub.com\u002Fhayd-zju\u002FICEdit-ComfyUI-official)和[ICEdit-normal-lora](https:\u002F\u002Fhuggingface.co\u002FRiverZ\u002Fnormal-lora\u002Ftree\u002Fmain)，尽情实现你的创作想法吧！\n\n我们专门创建了一个[工作流仓库](https:\u002F\u002Fgithub.com\u002Fhayd-zju\u002FICEdit-ComfyUI-official)，你可以直接在 ComfyUI 中**通过 Git URL 安装**。只需打开管理器选项卡，点击“通过 Git URL 安装”，复制以下 URL 即可使用。更多详情请参阅此[议题](https:\u002F\u002Fgithub.com\u002FRiver-Zhang\u002FICEdit\u002Fissues\u002F22#issuecomment-2864977880)\n\n**URL:** [https:\u002F\u002Fgithub.com\u002Fhayd-zju\u002FICEdit-ComfyUI-official](https:\u002F\u002Fgithub.com\u002Fhayd-zju\u002FICEdit-ComfyUI-official)\n\n \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FRiver-Zhang_ICEdit_readme_c3ddde9581d2.png\" width=\"80%\" style=\"display: block; margin: auto;\">\n \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FRiver-Zhang_ICEdit_readme_a9a466a49d78.png\" width=\"80%\" style=\"display: block; margin: auto;\">\n\n 非常感谢 [月下Hugo](https:\u002F\u002Fwww.bilibili.com\u002Fvideo\u002FBV1JZVRzuE12\u002F?share_source=copy_web&vd_source=8fcb933ee576af56337afc41509fa095) 制作了关于如何使用我们官方工作流的[中文教程](https:\u002F\u002Fwww.bilibili.com\u002Fvideo\u002FBV1JZVRzuE12\u002F?share_source=copy_web&vd_source=8fcb933ee576af56337afc41509fa095)！\n\n### 提高编辑成功率的 ComfyUI 工作流\n感谢 [T8star](https:\u002F\u002Fx.com\u002FT8star_Aix)! 
他制作了教程（[YouTube](https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=s6GMKL-Jjos) 和 [Bilibili](https:\u002F\u002Fwww.bilibili.com\u002Fvideo\u002FBV11HVhz1Eky\u002F?spm_id_from=333.40164.top_right_bar_window_dynamic.content.click&vd_source=2a911c0bc75f6d9b9d056bf0e7410d45)) 以及一个创意工作流（[OpenArt](https:\u002F\u002Fopenart.ai\u002Fworkflows\u002Ft8star\u002Ficedit100v1\u002FHN4EZ2Cej98ZX8CC1RK5) 和 [RunningHub](https:\u002F\u002Fwww.runninghub.cn\u002Fpost\u002F1920075398585974786\u002F?utm_source=kol01-RH099)），能够极大提高编辑的成功率（约 100%）！不妨试试看！\n\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FRiver-Zhang_ICEdit_readme_a23195273e01.png\" width=\"80%\" style=\"display: block; margin: auto;\">\n\n\n### ComfyUI-nunchaku\n\n我们衷心感谢 @[judian17](https:\u002F\u002Fgithub.com\u002Fjudian17) 打造的 ComfyUI [工作流](https:\u002F\u002Fgithub.com\u002FRiver-Zhang\u002FICEdit\u002Fissues\u002F1#issuecomment-2846568411)，它使得我们的模型能够更顺畅地使用。探索这个优秀的[工作流](https:\u002F\u002Fgithub.com\u002FRiver-Zhang\u002FICEdit\u002Fissues\u002F1#issuecomment-2846568411)，就能轻松地在 ComfyUI 中运行我们的模型。只需**4GB 显存的 GPU** 就足以配合 ComfyUI-nunchaku 运行！\n\n该工作流集成了高清细化功能，效果非常出色。此外，将此 LoRA 与 Redux 结合使用，还能在一定程度上实现服装更换。再次向 @[judian17](https:\u002F\u002Fgithub.com\u002Fjudian17) 致以诚挚的感谢，感谢他的创新贡献！ \n\n![comfyui 图片](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FRiver-Zhang_ICEdit_readme_4faa9284acda.png)\n\n\n### ComfyUI 工作流\n\n感谢 [Datou](https:\u002F\u002Fx.com\u002FDatou)，ICEdit 在 ComfyUI 中的工作流也可以从这里下载[链接](https:\u002F\u002Fopenart.ai\u002Fworkflows\u002Fdatou\u002Ficedit-moe-lora-flux-fill\u002FQFmaWNKsQo3P5liYz4RB)。可以搭配[普通 LoRA ckpt](https:\u002F\u002Fhuggingface.co\u002FRiverZ\u002Fnormal-lora\u002Ftree\u002Fmain)一起使用。\n\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FRiver-Zhang_ICEdit_readme_c5e0a63935bf.png\" width=\"80%\" style=\"display: block; margin: auto;\">\n\n\n\n\n\n\n# ⚠️ 小贴士\n\n### 如果您遇到此类失败情况，请**尝试使用不同的随机种子重试**！\n\n- 我们的基模型 FLUX 本身并不支持广泛的风格，因此我们的数据集中很大一部分涉及风格迁移。结果是，该模型**有时可能会莫名其妙地改变您的艺术风格**。\n\n- 我们的训练数据集**主要针对写实图像**。对于非写实图像，例如**动漫风格**或**模糊图片**，编辑的成功率会**显著下降，并可能影响最终的图像质量**。\n\n- 虽然添加物体、修改颜色属性、应用风格迁移和更换背景的成功率较高，但由于我们使用的移除数据集质量较低，物体移除的成功率相对较低。\n\n当前的模型是我们论文实验中使用的版本，仅使用4张A800显卡进行训练（总`batch_size` = 2 x 2 x 4 = 16）。未来，我们将扩充数据集并进行规模扩展，最终发布更强大的模型。\n\n### ⚠️ 说明\n\n我们注意到有许多与 ICEdit 相关的网页，包括 [https:\u002F\u002Ficedit.net\u002F](https:\u002F\u002Ficedit.net\u002F) 和 [https:\u002F\u002Ficedit.org\u002F](https:\u002F\u002Ficedit.org\u002F)。向这些页面的开发者致以敬意！\n\n然而，我们想强调两点：\n- **禁止商业使用**：我们的项目**不能**用于商业用途。详情请参阅 [LICENSE](https:\u002F\u002Fgithub.com\u002FRiver-Zhang\u002FICEdit\u002Fblob\u002Fmain\u002FLICENSE)。\n- **官方页面**：项目的官方页面是 [https:\u002F\u002Friver-zhang.github.io\u002FICEdit-gh-pages\u002F](https:\u002F\u002Friver-zhang.github.io\u002FICEdit-gh-pages\u002F)。\n\n\n\n# 💪 待办事项\n\n- [x] 推理代码\n- [ ] 使用 VLM 进行推理时的缩放\n- [x] 预训练权重\n- [x] 更多推理演示\n- [x] Gradio 演示\n- [x] Comfy UI 演示（由 @[judian17](https:\u002F\u002Fgithub.com\u002FRiver-Zhang\u002FICEdit\u002Fissues\u002F1#issuecomment-2846568411) 提供，兼容 [nunchaku](https:\u002F\u002Fgithub.com\u002Fmit-han-lab\u002FComfyUI-nunchaku)，支持高分辨率细化和 FLUX Redux。只需4GB显存的GPU即可运行！）\n- [x] 带普通 LoRA 的 Comfy UI 演示（由 @[Datou](https:\u002F\u002Fopenart.ai\u002Fworkflows\u002Fdatou\u002Ficedit-moe-lora-flux-fill\u002FQFmaWNKsQo3P5liYz4RB) 在 openart 上提供）\n- [x] 官方 ComfyUI 工作流\n- [x] 训练代码\n- [ ] 用于更高分辨率图像的 LoRA（768、1024）\n\n\n\n# 💪 与商用模型的对比\n\n\u003Cdiv align=\"center\">\n\u003Cdiv style=\"width: 80%; text-align: left; margin:auto;\">\n    \u003Cimg style=\"width:100%\" 
src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FRiver-Zhang_ICEdit_readme_545b98c5b1cc.png\">\n    \u003Cp style=\"text-align: left;\">与 Gemini 和 GPT-4o 等商用模型相比，我们在人物身份保持和指令遵循方面不相上下，甚至更胜一筹。\u003Cb>我们的项目更加开源，成本更低、速度更快（处理一张图像约需9秒），且性能强大\u003C\u002Fb>。\u003C\u002Fp>\n\u003C\u002Fdiv>\n\n\n\u003Cdiv align=\"left\">\n\n\n# 🌟 星标历史\n\n[![星标历史图](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FRiver-Zhang_ICEdit_readme_4543fbb1f228.png)](https:\u002F\u002Fwww.star-history.com\u002F#River-Zhang\u002FICEdit&Date)\n\n# Bibtex\n如果本工作对您的研究有所帮助，请考虑引用以下 BibTeX 条目。\n\n```\n@article{zhang2025context,\n  title={In-context edit: Enabling instructional image editing with in-context generation in large scale diffusion transformer},\n  author={Zhang, Zechuan and Xie, Ji and Lu, Yu and Yang, Zongxin and Yang, Yi},\n  journal={arXiv preprint arXiv:2504.20690},\n  year={2025}\n}\n\n@inproceedings{zhang2025icedit,\n  title     = {In-Context Edit: Enabling Instructional Image Editing with In-Context Generation in Large-Scale Diffusion Transformers},\n  author    = {Zhang, Zechuan and Xie, Ji and Lu, Yu and Yang, Zongxin and Yang, Yi},\n  booktitle = {Advances in Neural Information Processing Systems (NeurIPS)},\n  year      = {2025},\n  note      = {arXiv:2504.20690}\n}\n\n```","# ICEdit 快速上手指南\n\nICEdit (In-Context Edit) 是一种基于指令的图像编辑工具，利用大规模扩散 Transformer 中的上下文生成技术，仅需极少的训练数据和参数量即可实现 SOTA 级别的编辑效果。支持多轮编辑和复杂的指令操作。\n\n## 环境准备\n\n### 系统要求\n- **操作系统**: Linux 或 Windows (Windows 需特定配置，推荐 Linux)\n- **GPU**: 建议显存 ≥ 24GB (如 RTX 3090\u002F4090)。若显存较小，需开启 CPU 卸载模式。\n- **Python**: 3.10\n\n### 前置依赖\n确保已安装 `conda` 和 `git`。\n\n## 安装步骤\n\n### 1. 创建并激活 Conda 环境\n```bash\nconda create -n icedit python=3.10\nconda activate icedit\n```\n\n### 2. 安装依赖包\n建议使用国内镜像源加速下载（如清华源）：\n```bash\npip install -r requirements.txt -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple\npip install -U huggingface_hub -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple\n```\n\n### 3. 
下载预训练权重\n模型依赖以下权重，若网络通畅可自动下载，否则需手动下载并指定路径：\n- **底模**: [Flux.1-fill-dev](https:\u002F\u002Fhuggingface.co\u002Fblack-forest-labs\u002Fflux.1-fill-dev)\n- **LoRA 权重**: [ICEdit-normal-LoRA](https:\u002F\u002Fhuggingface.co\u002FRiverZ\u002Fnormal-lora\u002Ftree\u002Fmain) (推荐使用)\n- **MoE LoRA 权重**: [ICEdit-MoE-LoRA](https:\u002F\u002Fhuggingface.co\u002Fsanaka87\u002FICEdit-MoE-LoRA\u002Ftree\u002Fmain) (可选，需专用脚本)\n\n## 基本使用\n\n### 注意事项\n- **图像尺寸**: 输入图像宽度必须为 **512 像素**（高度不限）。若非 512，模型会自动缩放。\n- **提示词格式**: 命令行使用时，无需手动添加前缀提示词（代码内部已处理）；若使用 ComfyUI 需手动添加特定前缀。\n- **失败重试**: 若结果不理想，请尝试更换 `--seed` 参数。\n\n### 命令行推理示例\n\n#### 标准模式 (推荐)\n适用于显存充足的环境：\n```bash\npython scripts\u002Finference.py --image assets\u002Fgirl.png \\\n                            --instruction \"Make her hair dark green and her clothes checked.\" \\\n                            --seed 304897401\n```\n\n#### 低显存模式 (CPU Offload)\n适用于 24GB 或更小显存显卡（如 RTX 3090），通过卸载部分模型到内存降低显存占用：\n```bash\npython scripts\u002Finference.py --image assets\u002Fgirl.png \\\n                            --instruction \"Make her hair dark green and her clothes checked.\" \\\n                            --enable-model-cpu-offload\n```\n\n#### 指定本地权重路径\n若已手动下载权重到本地：\n```bash\npython scripts\u002Finference.py --image assets\u002Fgirl.png \\\n                            --instruction \"Make her hair dark green and her clothes checked.\" \\\n                            --flux-path \u002Fpath\u002Fto\u002Fflux.1-fill-dev \\\n                            --lora-path \u002Fpath\u002Fto\u002FICEdit-normal-LoRA\n```\n\n#### MoE 版本使用 (可选)\n如果使用 MoE-LoRA 版本，请使用专用脚本：\n```bash\npython scripts\u002Finference_moe.py --image assets\u002Fgirl.png \\\n                                --instruction \"Make her hair dark green and her clothes checked.\" \\\n                                --seed 42\n```","一位电商设计师需要为同一款新品背包快速生成多张不同场景（如雪山、海滩、城市街头）的营销海报，且必须严格保持背包的外观细节、Logo 位置和材质质感完全一致。\n\n### 没有 ICEdit 时\n- **身份一致性难以维持**：使用常规重绘或提示词修改时，背包的拉链形状、品牌 Logo 极易发生变形或丢失，导致商品特征不统一。\n- **训练成本高昂**：若要固定主体特征，通常需收集大量该背包的多角度照片训练专属 LoRA，耗时数小时且对数据量要求高。\n- **硬件门槛限制**：现有的高精度编辑模型往往显存占用巨大，普通工作站的显卡无法流畅运行，只能依赖昂贵的云端算力。\n- **多轮编辑累积误差**：在进行“先换背景、再调光影、最后加配饰”的多步操作时，图像质量会逐次下降，最终结果模糊失真。\n\n### 使用 ICEdit 后\n- **完美的 ID 持久性**：凭借超越 GPT-4o 的身份保持能力，无论背景如何剧烈变化，背包的每一个像素级细节都精准锁定，无需反复微调。\n- **极低的数据与参数需求**：仅需极少量参考数据（0.1% 级别）和单个轻量级 LoRA 即可实现顶级编辑效果，省去了繁琐的大规模训练过程。\n- **亲民的资源消耗**：优化后的架构仅需 4GB 显存即可运行，设计师在本地笔记本上也能实时完成高质量的多轮指令编辑。\n- **稳定的多轮交互**：支持高精度的多回合修改指令，连续调整场景风格与光照时，画面依然清晰锐利，无累积噪点或结构崩坏。\n\nICEdit 通过极低的资源消耗和数据门槛，彻底解决了商业修图中“主体一致性”与“编辑灵活性”不可兼得的行业痛点。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FRiver-Zhang_ICEdit_515b75bb.png","River-Zhang","Zechuan Zhang","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002FRiver-Zhang_1def3c47.jpg","CS PhD student at RELER, CCAI, Zhejiang University; Visiting scholar at Broad Institute of Harvard and MIT\r\n","Zhejiang University; Broad Institute","Cambridge, MA","zechuan@zju.edu.cn",null,"https:\u002F\u002Friver-zhang.github.io\u002Fzechuanzhang\u002F\u002F","https:\u002F\u002Fgithub.com\u002FRiver-Zhang",[83,87,91,94],{"name":84,"color":85,"percentage":86},"Python","#3572A5",99.9,{"name":88,"color":89,"percentage":90},"Shell","#89e051",0,{"name":92,"color":93,"percentage":90},"Cuda","#3A4E3A",{"name":95,"color":96,"percentage":90},"C++","#f34b7d",2093,115,"2026-04-16T20:10:48","NOASSERTION","Linux, Windows","需要 NVIDIA GPU。编辑 512×768 图像默认需 35GB 显存；使用 --enable-model-cpu-offload 参数可在 24GB 显存（如 RTX 3090）上运行；配合 ComfyUI-nunchaku 方案最低仅需 4GB 显存。未明确提及具体 CUDA 
版本。","未说明",{"notes":105,"python":106,"dependencies":107},"1. 输入图像宽度必须调整为 512 像素（高度不限），否则模型会自动调整。2. 若生成结果不理想，建议更换随机种子（seed）。3. Windows 用户可使用一键安装包或 Gradio Demo。4. 支持华为 Ascend NPU 版本（需单独仓库）。5. 使用 ComfyUI 时需注意添加固定的前缀提示词，且普通版 LoRA 与 MoE-LoRA 加载方式不同。","3.10",[108,109],"huggingface_hub","requirements.txt 中定义的依赖包",[111,15,36],"视频",[113,114,115,116,117,118,119,120,121],"diffusion","diffusion-models","diffusion-transformer","editing-image","image-editing","dit","in-context","gpt4o","gpt4oimage","2026-03-27T02:49:30.150509","2026-04-20T16:23:06.980675",[125,130,135,140,145,150],{"id":126,"question_zh":127,"answer_zh":128,"source_url":129},40312,"训练代码何时发布或如何获取？","训练代码现已发布。用户可以训练分辨率为 512、768 或 1024 像素的自定义模型。请查看项目最新更新以获取代码。","https:\u002F\u002Fgithub.com\u002FRiver-Zhang\u002FICEdit\u002Fissues\u002F14",{"id":131,"question_zh":132,"answer_zh":133,"source_url":134},40313,"在 Windows 上运行时提示找不到模型路径（ValueError: ... is neither a valid local path nor a valid repo id），如何解决？","该错误通常是因为使用了相对路径。解决方法是将模型路径改为绝对的完整路径。例如，将命令中的路径修改为：`C:\u002FUsers\u002F你的用户名\u002FICEdit\u002Fmodels\u002FFLUX.1-Fill-dev`。完整的运行命令示例如下：\n`python scripts\u002Fgradio_demo_windows.py --server_name 127.0.0.1 --port 7860 --flux-path C:\u002FUsers\u002F你的用户名\u002FICEdit\u002Fmodels\u002FFLUX.1-Fill-dev --lora-path C:\u002FUsers\u002F你的用户名\u002FICEdit\u002Fmodels\u002Fnormal-lora --enable-model-cpu-offload --int8`","https:\u002F\u002Fgithub.com\u002FRiver-Zhang\u002FICEdit\u002Fissues\u002F15",{"id":136,"question_zh":137,"answer_zh":138,"source_url":139},40314,"使用 RTX 3090 等显卡运行时出现显存不足（CUDA Out of Memory）错误，怎么办？","可以通过启用模型 CPU 卸载来解决显存不足的问题。请在运行脚本时添加 `--enable-model-cpu-offload True` 参数。例如：\n`python scripts\u002Fgradio_demo.py --port 7860 --enable-model-cpu-offload True`\n此外，也有用户反馈使用 CUDA 11.8 版本可能有助于解决此问题。","https:\u002F\u002Fgithub.com\u002FRiver-Zhang\u002FICEdit\u002Fissues\u002F25",{"id":141,"question_zh":142,"answer_zh":143,"source_url":144},40315,"模型在物体移除（Object Removal）任务上效果不佳，有什么优化建议？","如果是自定义训练场景，建议尝试使用约 1,000 到 2,000 条数据进行训练，并设置较小的 LoRA rank（如 4、8 或 16），这通常能取得较好的效果。如果数据量过大（如 5,000-10,000 条）难以收集，可以先用小数据集和小 rank 进行测试。","https:\u002F\u002Fgithub.com\u002FRiver-Zhang\u002FICEdit\u002Fissues\u002F12",{"id":146,"question_zh":147,"answer_zh":148,"source_url":149},40316,"Windows 一键包中使用了混淆代码（obfuscated code），这是出于什么考虑？","这是因为在 Windows 环境下直接运行未量化的 Transformer 部分大约需要占用 24GB 显存，加上系统开销会导致使用共享显存从而大幅降低速度。开发者利用 torchao 的 int8 量化技术对 Transformer 进行处理，以在降低显存占用的同时保持较好的效果。由于量化和封装的需要，暂时使用了混淆处理，未来官方可能会开设 Windows 分支并精简代码。","https:\u002F\u002Fgithub.com\u002FRiver-Zhang\u002FICEdit\u002Fissues\u002F23",{"id":151,"question_zh":152,"answer_zh":153,"source_url":154},40317,"该 LoRA 模型是否支持基于 Schnell 版本的 Flux 模型？","目前暂不支持。因为 Black Forest Lab 尚未正式发布 Flux Fill 的 Schnell 版本模型，所以需要等待官方发布相应版本后才能适配。","https:\u002F\u002Fgithub.com\u002FRiver-Zhang\u002FICEdit\u002Fissues\u002F46",[]]