[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-JoePenna--Dreambooth-Stable-Diffusion":3,"tool-JoePenna--Dreambooth-Stable-Diffusion":61},[4,18,26,36,44,53],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":17},4358,"openclaw","openclaw\u002Fopenclaw","OpenClaw 是一款专为个人打造的本地化 AI 助手，旨在让你在自己的设备上拥有完全可控的智能伙伴。它打破了传统 AI 助手局限于特定网页或应用的束缚，能够直接接入你日常使用的各类通讯渠道，包括微信、WhatsApp、Telegram、Discord、iMessage 等数十种平台。无论你在哪个聊天软件中发送消息，OpenClaw 都能即时响应，甚至支持在 macOS、iOS 和 Android 设备上进行语音交互，并提供实时的画布渲染功能供你操控。\n\n这款工具主要解决了用户对数据隐私、响应速度以及“始终在线”体验的需求。通过将 AI 部署在本地，用户无需依赖云端服务即可享受快速、私密的智能辅助，真正实现了“你的数据，你做主”。其独特的技术亮点在于强大的网关架构，将控制平面与核心助手分离，确保跨平台通信的流畅性与扩展性。\n\nOpenClaw 非常适合希望构建个性化工作流的技术爱好者、开发者，以及注重隐私保护且不愿被单一生态绑定的普通用户。只要具备基础的终端操作能力（支持 macOS、Linux 及 Windows WSL2），即可通过简单的命令行引导完成部署。如果你渴望拥有一个懂你",349277,3,"2026-04-06T06:32:30",[13,14,15,16],"Agent","开发框架","图像","数据工具","ready",{"id":19,"name":20,"github_repo":21,"description_zh":22,"stars":23,"difficulty_score":10,"last_commit_at":24,"category_tags":25,"status":17},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,"2026-04-05T11:01:52",[14,15,13],{"id":27,"name":28,"github_repo":29,"description_zh":30,"stars":31,"difficulty_score":32,"last_commit_at":33,"category_tags":34,"status":17},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",147882,2,"2026-04-09T11:32:47",[14,13,35],"语言模型",{"id":37,"name":38,"github_repo":39,"description_zh":40,"stars":41,"difficulty_score":32,"last_commit_at":42,"category_tags":43,"status":17},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",108111,"2026-04-08T11:23:26",[14,15,13],{"id":45,"name":46,"github_repo":47,"description_zh":48,"stars":49,"difficulty_score":32,"last_commit_at":50,"category_tags":51,"status":17},4721,"markitdown","microsoft\u002Fmarkitdown","MarkItDown 是一款由微软 AutoGen 团队打造的轻量级 Python 工具，专为将各类文件高效转换为 Markdown 格式而设计。它支持 PDF、Word、Excel、PPT、图片（含 OCR）、音频（含语音转录）、HTML 乃至 YouTube 链接等多种格式的解析，能够精准提取文档中的标题、列表、表格和链接等关键结构信息。\n\n在人工智能应用日益普及的今天，大语言模型（LLM）虽擅长处理文本，却难以直接读取复杂的二进制办公文档。MarkItDown 
恰好解决了这一痛点，它将非结构化或半结构化的文件转化为模型“原生理解”且 Token 效率极高的 Markdown 格式，成为连接本地文件与 AI 分析 pipeline 的理想桥梁。此外，它还提供了 MCP（模型上下文协议）服务器，可无缝集成到 Claude Desktop 等 LLM 应用中。\n\n这款工具特别适合开发者、数据科学家及 AI 研究人员使用，尤其是那些需要构建文档检索增强生成（RAG）系统、进行批量文本分析或希望让 AI 助手直接“阅读”本地文件的用户。虽然生成的内容也具备一定可读性，但其核心优势在于为机器",93400,"2026-04-06T19:52:38",[52,14],"插件",{"id":54,"name":55,"github_repo":56,"description_zh":57,"stars":58,"difficulty_score":10,"last_commit_at":59,"category_tags":60,"status":17},4487,"LLMs-from-scratch","rasbt\u002FLLMs-from-scratch","LLMs-from-scratch 是一个基于 PyTorch 的开源教育项目，旨在引导用户从零开始一步步构建一个类似 ChatGPT 的大型语言模型（LLM）。它不仅是同名技术著作的官方代码库，更提供了一套完整的实践方案，涵盖模型开发、预训练及微调的全过程。\n\n该项目主要解决了大模型领域“黑盒化”的学习痛点。许多开发者虽能调用现成模型，却难以深入理解其内部架构与训练机制。通过亲手编写每一行核心代码，用户能够透彻掌握 Transformer 架构、注意力机制等关键原理，从而真正理解大模型是如何“思考”的。此外，项目还包含了加载大型预训练权重进行微调的代码，帮助用户将理论知识延伸至实际应用。\n\nLLMs-from-scratch 特别适合希望深入底层原理的 AI 开发者、研究人员以及计算机专业的学生。对于不满足于仅使用 API，而是渴望探究模型构建细节的技术人员而言，这是极佳的学习资源。其独特的技术亮点在于“循序渐进”的教学设计：将复杂的系统工程拆解为清晰的步骤，配合详细的图表与示例，让构建一个虽小但功能完备的大模型变得触手可及。无论你是想夯实理论基础，还是为未来研发更大规模的模型做准备",90106,"2026-04-06T11:19:32",[35,15,13,14],{"id":62,"github_repo":63,"name":64,"description_en":65,"description_zh":66,"ai_summary_zh":67,"readme_en":68,"readme_zh":69,"quickstart_zh":70,"use_case_zh":71,"hero_image_url":72,"owner_login":73,"owner_name":74,"owner_avatar_url":75,"owner_bio":76,"owner_company":77,"owner_location":77,"owner_email":77,"owner_twitter":78,"owner_website":77,"owner_url":79,"languages":80,"stars":93,"forks":94,"last_commit_at":95,"license":96,"difficulty_score":10,"env_os":97,"env_gpu":98,"env_ram":99,"env_deps":100,"category_tags":106,"github_topics":107,"view_count":32,"oss_zip_url":77,"oss_zip_packed_at":77,"status":17,"created_at":117,"updated_at":118,"faqs":119,"releases":145},5896,"JoePenna\u002FDreambooth-Stable-Diffusion","Dreambooth-Stable-Diffusion","Implementation of Dreambooth (https:\u002F\u002Farxiv.org\u002Fabs\u002F2208.12242) by way of Textual Inversion (https:\u002F\u002Farxiv.org\u002Fabs\u002F2208.01618) for Stable Diffusion (https:\u002F\u002Farxiv.org\u002Fabs\u002F2112.10752). 
Tweaks focused on training faces, objects, and styles.","Dreambooth-Stable-Diffusion 是一个基于 Stable Diffusion 模型的开源训练工具，旨在让用户能够轻松定制专属的 AI 图像生成模型。它通过结合 Dreambooth 和 Textual Inversion 技术，解决了通用大模型难以精准还原特定人物面部特征、独特物体细节或专属艺术风格的痛点。用户只需提供少量目标图片，即可训练出能高度还原特定主体，同时又能灵活适应不同场景和画风的新模型。\n\n该项目由电影导演 Joe Penna（MysteryGuitarMan）发起并优化，最初是为了满足影视制作中对特定演员、道具及场景的高精度生成需求，因此特别强化了对人脸训练的支持。相比原始方案，它在保持主体一致性的同时，更好地平衡了风格迁移的能力，有效避免了生成图像过于僵化或丢失特征的问题。\n\nDreambooth-Stable-Diffusion 非常适合概念艺术家、漫画设计师、影视从业者以及希望创作个人化内容的开发者使用。项目提供了详尽的教程，支持在 Google Colab、Vast.ai 云平台以及本地 Windows 或 Ubuntu 环境中部署，降低","Dreambooth-Stable-Diffusion 是一个基于 Stable Diffusion 模型的开源训练工具，旨在让用户能够轻松定制专属的 AI 图像生成模型。它通过结合 Dreambooth 和 Textual Inversion 技术，解决了通用大模型难以精准还原特定人物面部特征、独特物体细节或专属艺术风格的痛点。用户只需提供少量目标图片，即可训练出能高度还原特定主体，同时又能灵活适应不同场景和画风的新模型。\n\n该项目由电影导演 Joe Penna（MysteryGuitarMan）发起并优化，最初是为了满足影视制作中对特定演员、道具及场景的高精度生成需求，因此特别强化了对人脸训练的支持。相比原始方案，它在保持主体一致性的同时，更好地平衡了风格迁移的能力，有效避免了生成图像过于僵化或丢失特征的问题。\n\nDreambooth-Stable-Diffusion 非常适合概念艺术家、漫画设计师、影视从业者以及希望创作个人化内容的开发者使用。项目提供了详尽的教程，支持在 Google Colab、Vast.ai 云平台以及本地 Windows 或 Ubuntu 环境中部署，降低了技术门槛。无论是想将自家宠物融入奇幻场景，还是为电影预演快速生成特定角色的概念图，它都能提供强大而灵活的助力。","\n# Extended Dreambooth How-To Guides by Yushan\n[For Running On Vast.ai](https:\u002F\u002Fmedium.com\u002F@yushantripleseven\u002Fdreambooth-training-joepenna-on-vast-ai-5f1018239820)\u003Cbr>\n[For Running On Google Colab](https:\u002F\u002Fmedium.com\u002F@yushantripleseven\u002Fdreambooth-training-joepenna-on-google-colab-63ec6e6cf050)\u003Cbr>\n[For Running On a Local PC (Windows)](https:\u002F\u002Fmedium.com\u002F@yushantripleseven\u002Fdreambooth-training-joepenna-on-a-local-pc-windows-f00a4fd11dfd)\u003Cbr>\n[For Running On a Local PC (Ubuntu)](https:\u002F\u002Fmedium.com\u002F@yushantripleseven\u002Fdreambooth-training-joepenna-on-a-local-pc-ubuntu-a2bf796430d2)\u003Cbr>\n[Adapting Corridor Digital's Dreambooth Tutorial To JoePenna's Repo](https:\u002F\u002Fmedium.com\u002F@yushantripleseven\u002Fadapting-corridor-digitals-dreambooth-tutorial-to-joepenna-s-repo-d82bfbe0bfd2)\u003Cbr>\n[Using Captions in JoePenna's Dreambooth](https:\u002F\u002Fmedium.com\u002F@yushantripleseven\u002Fusing-captions-with-dreambooth-joepenna-dreambooth-716f5b9e9866)\u003Cbr>\n\n# Index\n\n- [Notes by Joe Penna](#notes-by-joe-penna)\n- [Setup](#setup)\n  - [Easy RunPod Instructions](#easy-runpod-instructions)\n  - [Vast.AI Setup](#vast-ai-setup)\n  - [Run Locally](#running-locally)\n    - [venv](#running-locally-venv)\n    - [Conda](#running-locally-conda)\n  - [Configuration File and Command Line Reference](#config-file-and-command-line-reference)\n- [Captions & Multiple Subject\u002FConcept Support](#captions-and-multi-concept)\n- [Textual Inversion vs. Dreambooth](#text-vs-dreamb)\n- [Using the Generated Model](#using-the-generated-model)\n- [Debugging Your Results](#debugging-your-results)\n  - [They don't look like you at all!](#they-dont-look-like-you)\n  - [They sorta look like you, but exactly like your training images](#they-sorta-look-like-you-but-exactly-like-your-training-images)\n  - [They look like you, but not when you try different styles](#they-look-like-you-but-not-when-you-try-different-styles)\n- [Hugging Face Diffusers](#hugging-face-diffusers)\n\n# The Repo Formerly Known As \"Dreambooth\"\n![image](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FJoePenna_Dreambooth-Stable-Diffusion_readme_935348b6b48b.png)\n\n## \u003Ca name=\"notes-by-joe-penna\">\u003C\u002Fa>  Notes by Joe Penna\n### **INTRODUCTIONS!**\nHi! 
My name is Joe Penna.\n\nYou might have seen a few YouTube videos of mine under *MysteryGuitarMan*. I'm now a feature film director. You might have seen [ARCTIC](https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=N5aD9ppoQIo&t=6s) or [STOWAWAY](https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=A_apvQkWsVY).\n\nFor my movies, I need to be able to train specific actors, props, locations, etc. So, I made a bunch of changes to @XavierXiao's repo in order to train people's faces.\n\nI can't release all the tests for the movie I'm working on, but when I test with my own face, I release those on my Twitter page - [@MysteryGuitarM](https:\u002F\u002Ftwitter.com\u002FMysteryGuitarM).\n\nLots of these tests were done with a buddy of mine -- Niko from CorridorDigital. It might be how you found this repo!\n\nI'm not really a coder. I'm just stubborn, and I'm not afraid of googling. So, eventually, some really smart folks joined in and have been contributing. In this repo, specifically: [@djbielejeski](https:\u002F\u002Fgithub.com\u002Fdjbielejeski) @gammagec @MrSaad –– but so many others in our Discord!\n\nThis is no longer my repo. This is the people-who-wanna-see-Dreambooth-on-SD-working-well's repo!\n\nNow, if you wanna try to do this... please read the warnings below first:\n\n### **WARNING!**\n\n- Let's respect the hard work and creativity of people who have spent years honing their skills.\n  - This iteration of Dreambooth was specifically designed for digital artists to train their own characters and styles into a Stable Diffusion model, as well as for people to train their own likenesses. My main goal is to make a tool for filmmakers to interact with concept artists that they've hired -- to generate the seed of an initial idea, so that they can then communicate visually. Meant to be used by filmmakers, concept artists, comic book designers, etc.\n  - One day, there'll be a Stable Diffusion trained on perfect datasets. In the meantime, for moral \u002F ethical \u002F potentially legal reasons, I strongly discourage training someone else's art into these models (unless you've obtained explicit permission, or they've made a public statement about this technology). For similar reasons, I recommend against using artists' names in your prompts. Don't put the people who made this possible out of a job!\n\n- Onto the technical side:\n  - You can now run this on a GPU with **24GB of VRAM** (e.g. 3090). Training will be slower, and you'll need to be sure this is the *only* program running.\n  - If, like myself, you don't happen to own one of those, I'm including a Jupyter notebook here to help you run it on a rented cloud computing platform. \n  - It's currently tailored to [runpod.io](https:\u002F\u002Frunpod.io?ref=n8yfwyum) and [vast.ai](http:\u002F\u002Fconsole.vast.ai\u002F?ref=47390) \n  - We do support a colab notebook as well: [![Open In Colab](https:\u002F\u002Fcolab.research.google.com\u002Fassets\u002Fcolab-badge.svg)](https:\u002F\u002Fcolab.research.google.com\u002Fgithub\u002FJoePenna\u002FDreambooth-Stable-Diffusion\u002Fblob\u002Fmain\u002Fdreambooth_google_colab_joepenna.ipynb)\n  \n- This implementation does not fully implement Google's ideas on how to preserve the latent space.\n\n  - Most images that are similar to what you're training will be shifted towards that.\n  - e.g. If you're training a person, all people will look like you. 
If you're training an object, anything in that class will look like your object.\n\n- There doesn't seem to be an easy way to train two subjects consecutively. You will end up with an `11-12GB` file before pruning.\n  - The provided notebook has a pruner that crunches it down to `~2GB`\n  \n- Best practice is to change the **token** to a celebrity name (*note: token, not class* -- so your prompt would be something like: `Chris Evans person`). Here's [my wife trained with the exact same settings, except for the token](#using-the-generated-model)\n\n\n# \u003Ca name=\"setup\">\u003C\u002Fa> Setup\n## \u003Ca name=\"easy-runpod-instructions\">\u003C\u002Fa> Easy RunPod Instructions\n\n**Note: RunPod periodically upgrades their base Docker image, which can lead to the repo not working. None of the YouTube videos are up to date, but you can still follow them as a guide. Follow along with the typical RunPod YouTube videos\u002Ftutorials, with the following changes:**\n\nFrom within the My Pods page,\n\n- Click the menu button (to the left of the purple play button)\n- Click Edit Pod\n- Update \"Docker Image Name\" to one of the following (tested 2023\u002F06\u002F27):\n  - `runpod\u002Fpytorch:3.10-2.0.1-120-devel`\n  - `runpod\u002Fpytorch:3.10-2.0.1-118-runtime`\n  - `runpod\u002Fpytorch:3.10-2.0.0-117`\n  - `runpod\u002Fpytorch:3.10-1.13.1-116`\n- Click Save.\n- Restart your pod\n\n### Carry on with the rest of the guide:\n\n- Sign up for RunPod. Feel free to use my [referral link here](https:\u002F\u002Frunpod.io?ref=n8yfwyum), so that I don't have to pay for it (but you do).\n- After logging in, select either `SECURE CLOUD` or `COMMUNITY CLOUD`\n- Make sure you find a \"High\" internet speed so you're not wasting time and money on slow downloads\n- Select something with at **least 24GB VRAM** like an RTX 3090, RTX 4090 or RTX A5000\n\n- Follow these video instructions below:\n\n[![VIDEO INSTRUCTIONS](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FJoePenna_Dreambooth-Stable-Diffusion_readme_134ad10ce0f1.jpg)](https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=7m__xadX0z0#t=5m33.1s)\n\n## \u003Ca name=\"vast-ai-setup\">\u003C\u002Fa>  Vast.AI Instructions\n- Sign up for [Vast.AI](http:\u002F\u002Fconsole.vast.ai\u002F?ref=47390) (Referral Links by David Bielejeski)\n- Add some funds (I typically add them in $10 increments)\n- Navigate to the [Client - Create page](https:\u002F\u002Fvast.ai\u002Fconsole\u002Fcreate\u002F?ref=47390)\n  - Select pytorch\u002Fpytorch as your docker image, and the buttons \"Use Jupyter Lab Interface\" and \"Jupyter direct HTTPS\"\n  - ![img.png](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FJoePenna_Dreambooth-Stable-Diffusion_readme_b98b2c93c9a0.png)\n- You will want to increase your disk space, and filter on GPU RAM (2GB checkpoint files + 2-8GB model file + regularization images + other stuff adds up fast)\n  - I typically allocate 150GB\n  - ![img.png](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FJoePenna_Dreambooth-Stable-Diffusion_readme_2c5ea7a56299.png)\n  - Also good to check the Upload\u002FDownload speed for enough bandwidth so you don't spend all your money waiting for things to download.\n- Select the instance you want, and click `Rent`, then head over to your [Instances](https:\u002F\u002Fvast.ai\u002Fconsole\u002Finstances\u002F?ref=47390) page and click `Open`\n  - ![img.png](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FJoePenna_Dreambooth-Stable-Diffusion_readme_6a2f78177fe9.png)\n  - You will get an unsafe certificate warning. 
Click past the warning or install the [Vast cert](https:\u002F\u002Fvast.ai\u002Fstatic\u002Fjvastai_root.cer).\n- Click `Notebook -> Python 3` (You can do this next step a number of ways, but I typically do this)\n  - ![img.png](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FJoePenna_Dreambooth-Stable-Diffusion_readme_82e300612c40.png)\n- Clone Joe's repo with this command\n  - `!git clone https:\u002F\u002Fgithub.com\u002FJoePenna\u002FDreambooth-Stable-Diffusion.git`\n  - Click `run`\n  - ![img.png](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FJoePenna_Dreambooth-Stable-Diffusion_readme_78b9a04cdeb2.png)\n- Navigate into the new `Dreambooth-Stable-Diffusion` directory on the left and open either the `dreambooth_simple_joepenna.ipynb` or `dreambooth_runpod_joepenna.ipynb` file\n  - ![img.png](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FJoePenna_Dreambooth-Stable-Diffusion_readme_0e26619d826b.png)\n- Follow the instructions in the workbook and start training\n\n## \u003Ca name=\"running-locally\">\u003C\u002Fa> Running Locally Instructions\n\n### \u003Ca name=\"running-locally-venv\">\u003C\u002Fa> Setup - Virtual Environment\n\n### Pre-Requisites\n1. [Git](https:\u002F\u002Fgitforwindows.org\u002F)\n2. [Python 3.10](https:\u002F\u002Fwww.python.org\u002Fdownloads\u002F)\n3. Open `cmd`\n4. Clone the repository\n   1. `C:\\>git clone https:\u002F\u002Fgithub.com\u002FJoePenna\u002FDreambooth-Stable-Diffusion`\n5. Navigate into the repository\n   1. `C:\\>cd Dreambooth-Stable-Diffusion`\n\n### Install Dependencies and Activate Environment\n```cmd\ncmd> python -m venv dreambooth_joepenna\ncmd> dreambooth_joepenna\\Scripts\\activate.bat\ncmd> pip install torch==1.13.1+cu117 torchvision==0.14.1+cu117 --extra-index-url https:\u002F\u002Fdownload.pytorch.org\u002Fwhl\u002Fcu117\ncmd> pip install -r requirements.txt\n```\n\n#### Run\n`cmd> python \"main.py\" --project_name \"ProjectName\" --training_model \"C:\\v1-5-pruned-emaonly-pruned.ckpt\" --regularization_images \"C:\\regularization_images\" --training_images \"C:\\training_images\" --max_training_steps 2000 --class_word \"person\" --token \"zwx\" --flip_p 0 --learning_rate 1.0e-06 --save_every_x_steps 250`\n\n#### Cleanup\n```cmd\ncmd> deactivate \n```\n\n### \u003Ca name=\"running-locally-conda\">\u003C\u002Fa>  Setup - Conda\n\n### Pre-Requisites\n1. [Git](https:\u002F\u002Fgitforwindows.org\u002F)\n2. [Python 3.10](https:\u002F\u002Fwww.python.org\u002Fdownloads\u002F)\n3. [miniconda3](https:\u002F\u002Fdocs.conda.io\u002Fen\u002Flatest\u002Fminiconda.html)\n4. Open `Anaconda Prompt (miniconda3)`\n5. Clone the repository\n   1. `(base) C:\\>git clone https:\u002F\u002Fgithub.com\u002FJoePenna\u002FDreambooth-Stable-Diffusion`\n6. Navigate into the repository\n   1. 
`(base) C:\\>cd Dreambooth-Stable-Diffusion`\n\n### Install Dependencies and Activate Environment\n\n```cmd\n(base) C:\\Dreambooth-Stable-Diffusion> conda env create -f environment.yaml\n(base) C:\\Dreambooth-Stable-Diffusion> conda activate dreambooth_joepenna\n```\n\n##### Run\n`cmd> python \"main.py\" --project_name \"ProjectName\" --training_model \"C:\\v1-5-pruned-emaonly-pruned.ckpt\" --regularization_images \"C:\\regularization_images\" --training_images \"C:\\training_images\" --max_training_steps 2000 --class_word \"person\" --token \"zwx\" --flip_p 0 --learning_rate 1.0e-06 --save_every_x_steps 250`\n\n##### Cleanup\n```cmd\ncmd> conda deactivate\n```\n\n# \u003Ca name=\"config-file-and-command-line-reference\">\u003C\u002Fa>  Configuration File and Command Line Reference\n\n## Example Configuration File\n\n```\n{\n    \"class_word\": \"woman\",\n    \"config_date_time\": \"2023-04-08T16-54-00\",\n    \"debug\": false,\n    \"flip_percent\": 0.0,\n    \"gpu\": 0,\n    \"learning_rate\": 1e-06,\n    \"max_training_steps\": 3500,\n    \"model_path\": \"D:\\\\stable-diffusion\\\\models\\\\v1-5-pruned-emaonly-pruned.ckpt\",\n    \"model_repo_id\": \"\",\n    \"project_config_filename\": \"my-config.json\",\n    \"project_name\": \"\u003Ctoken> project\",\n    \"regularization_images_folder_path\": \"D:\\\\stable-diffusion\\\\regularization_images\\\\Stable-Diffusion-Regularization-Images-person_ddim\\\\person_ddim\",\n    \"save_every_x_steps\": 250,\n    \"schema\": 1,\n    \"seed\": 23,\n    \"token\": \"\u003Ctoken>\",\n    \"token_only\": false,\n    \"training_images\": [\n        \"001@a photo of \u003Ctoken> looking down.png\",\n        \"002-DUPLICATE@a close photo of \u003Ctoken> smiling wearing a black sweatshirt.png\",\n        \"002@a photo of \u003Ctoken> wearing a black sweatshirt sitting on a blue couch.png\",\n        \"003@a photo of \u003Ctoken> smiling wearing a red flannel shirt with a door in the background.png\",\n        \"004@a photo of \u003Ctoken> wearing a purple sweater dress standing with her arms crossed in front of a piano.png\",\n        \"005@a close photo of \u003Ctoken> with her hand on her chin.png\",\n        \"005@a photo of \u003Ctoken> with her hand on her chin wearing a dark green coat and a red turtleneck.png\",\n        \"006@a close photo of \u003Ctoken>.png\",\n        \"007@a close photo of \u003Ctoken>.png\",\n        \"008@a photo of \u003Ctoken> wearing a purple turtleneck and earings.png\",\n        \"009@a close photo of \u003Ctoken> wearing a red flannel shirt with her hand on her head.png\",\n        \"011@a close photo of \u003Ctoken> wearing a black shirt.png\",\n        \"012@a close photo of \u003Ctoken> smirking wearing a gray hooded sweatshirt.png\",\n        \"013@a photo of \u003Ctoken> standing in front of a desk.png\",\n        \"014@a close photo of \u003Ctoken> standing in a kitchen.png\",\n        \"015@a photo of \u003Ctoken> wearing a pink sweater with her hand on her forehead sitting on a couch with leaves in the background.png\",\n        \"016@a photo of \u003Ctoken> wearing a black shirt standing in front of a door.png\",\n        \"017@a photo of \u003Ctoken> smiling wearing a black v-neck sweater sitting on a couch in front of a lamp.png\",\n        \"019@a photo of \u003Ctoken> wearing a blue v-neck shirt in front of a door.png\",\n        \"020@a photo of \u003Ctoken> looking down with her hand on her face wearing a black sweater.png\",\n        \"021@a close photo of \u003Ctoken> pursing her lips wearing a 
pink hooded sweatshirt.png\",\n        \"022@a photo of \u003Ctoken> looking off into the distance wearing a striped shirt.png\",\n        \"023@a photo of \u003Ctoken> smiling wearing a blue beanie holding a wine glass with a kitchen table in the background.png\",\n        \"024@a close photo of \u003Ctoken> looking at the camera.png\"\n    ],\n    \"training_images_count\": 24,\n    \"training_images_folder_path\": \"D:\\\\stable-diffusion\\\\training_images\\\\24 Images - captioned\"\n}\n```\n\n### Using your configuration for training\n\n```\npython \"main.py\" --config_file_path \"path\u002Fto\u002Fthe\u002Fmy-config.json\"\n```\n\n## Command Line Parameters\n\n[dreambooth_helpers\\arguments.py](https:\u002F\u002Fgithub.com\u002FJoePenna\u002FDreambooth-Stable-Diffusion\u002Fblob\u002Fmain\u002Fdreambooth_helpers\u002Farguments.py)\n\n| Command | Type | Example | Description |\n| ------- | ---- | ------- | ----------- |\n| `--config_file_path` | string | `\"C:\\\\Users\\\\David\\\\Dreambooth Configs\\\\my-config.json\"` | The path the configuration file to use |\n| `--project_name` | string | `\"My Project Name\"` | Name of the project |\n| `--debug` | bool | `False` | *Optional* Defaults to `False`. Enable debug logging |\n| `--seed` | int | `23` | *Optional* Defaults to `23`. Seed for seed_everything |\n| `--max_training_steps` | int | `3000` | Number of training steps to run |\n| `--token` | string | `\"owhx\"` | Unique token you want to represent your trained model. |\n| `--token_only` | bool | `False` | *Optional* Defaults to `False`. Train only using the token and no class. |\n| `--training_model` | string | `\"D:\\\\stable-diffusion\\\\models\\\\v1-5-pruned-emaonly-pruned.ckpt\"` | Path to model to train (model.ckpt) |\n| `--training_images` | string | `\"D:\\\\stable-diffusion\\\\training_images\\\\24 Images - captioned\"` | Path to training images directory |\n| `--regularization_images` | string | `\"D:\\\\stable-diffusion\\\\regularization_images\\\\Stable-Diffusion-Regularization-Images-person_ddim\\\\person_ddim\"` | Path to directory with regularization images |\n| `--class_word` | string | `\"woman\"` | Match class_word to the category of images you want to train. Example: `man`, `woman`, `dog`, or `artstyle`. |\n| `--flip_p` | float | `0.0` | *Optional* Defaults to `0.5`. Flip Percentage. Example: if set to `0.5`, will flip (mirror) your training images 50% of the time. This helps expand your dataset without needing to include more training images. This can lead to worse results for face training since most people's faces are not perfectly symmetrical. |\n| `--learning_rate` | float | `1.0e-06` | *Optional* Defaults to `1.0e-06` (0.000001). Set the learning rate. Accepts scientific notation. |\n| `--save_every_x_steps` | int | `250` | *Optional* Defaults to `0`. Saves a checkpoint every x steps.   At `0` only saves at the end of training when `max_training_steps` is reached. |\n| `--gpu` | int | `0` | *Optional* Defaults to `0`. Specify a GPU other than 0 to use for training.  
Multi-GPU support is not currently implemented.\n\n### Using your configuration for training\n\n```\npython \"main.py\" --project_name \"My Project Name\" --max_training_steps 3000 --token \"owhx\" --training_model \"D:\\\\stable-diffusion\\\\models\\\\v1-5-pruned-emaonly-pruned.ckpt\" --training_images \"D:\\\\stable-diffusion\\\\training_images\\\\24 Images - captioned\" --regularization_images \"D:\\\\stable-diffusion\\\\regularization_images\\\\Stable-Diffusion-Regularization-Images-person_ddim\\\\person_ddim\" --class_word \"woman\" --flip_p 0.0 --save_every_x_steps 500\n```\n\n# \u003Ca name=\"captions-and-multi-concept\">\u003C\u002Fa>  Captions and Multiple Subject\u002FConcept Support\n\nCaptions are supported.  Here is the [guide](https:\u002F\u002Fdiscord.com\u002Fchannels\u002F1023277529424986162\u002F1029222282511515678) on how we implemented them.\n\nLet's say that your token is effy, your class is person, and your data root is \u002Ftrain. Then:\n\n`training_images\u002Fimg-001.jpg` is captioned with `effy person`\n\nYou can customize the captioning by adding it after an `@` symbol in the filename.\n\n`\u002Ftraining_images\u002Fimg-001@a photo of effy` => `a photo of effy`\n\nYou can use two tokens in your captions `S` - uppercase S - and `C` - uppercase C - to indicate subject and class.\n\n`\u002Ftraining_images\u002Fimg-001@S being a good C.jpg` => `effy being a good person`\n\nTo create a new subject, you just need to create a folder for it. So:\n\n`\u002Ftraining_images\u002Fbingo\u002Fimg-001.jpg` => `bingo person`\n\nThe class stays the same, but now the subject has changed.\n\nAgain - the token S is now bingo:\n\n`\u002Ftraining_images\u002Fbingo\u002Fimg-001@S is being silly.jpg` => `bingo is being silly`\n\nOne folder deeper and you can change the class: `\u002Ftraining_images\u002Fbingo\u002Fdog\u002Fimg-001@S being a good C.jpg` => `bingo being a good dog`\n\nNow comes the kicker: one level deeper and you can caption a group of images: `\u002Ftraining_images\u002Feffy\u002Fperson\u002Fa picture of\u002Fimg-001.jpg` => `a picture of effy person`\n\n\n# \u003Ca name=\"text-vs-dreamb\">\u003C\u002Fa>  Textual Inversion vs. Dreambooth\nThe majority of the code in this repo was written by Rinon Gal et al., the authors of the Textual Inversion research paper. 
Though a few ideas about regularization images and prior loss preservation (ideas from \"Dreambooth\") were added in, out of respect to both the MIT team and the Google researchers, I'm renaming this fork to:\n*\"The Repo Formerly Known As \"Dreambooth\"\"*.\n\nFor an alternate implementation, please see [\"Alternate Option\"](#hugging-face-diffusers) below.\n\n\n# \u003Ca name=\"using-the-generated-model\">\u003C\u002Fa> Using the generated model\nThe `ground truth` (real picture, caution: very beautiful woman)\n\u003Cbr>\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FJoePenna_Dreambooth-Stable-Diffusion_readme_d8a449d41a16.png\" width=\"200\">\n\nSame prompt for all of these images below:\n\n| `sks person` | `woman person` | `Natalie Portman person` | `Kate Mara person` |\n| ----- | ------- | ----------------- | ----------- |\n| \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FJoePenna_Dreambooth-Stable-Diffusion_readme_a9da7c4323b9.png\" width=\"200\"> | \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FJoePenna_Dreambooth-Stable-Diffusion_readme_a2945b0a7f9b.png\" width=\"200\"> | \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FJoePenna_Dreambooth-Stable-Diffusion_readme_b87aed1fcfca.png\" width=\"200\"> | \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FJoePenna_Dreambooth-Stable-Diffusion_readme_ac30e5e06ed0.png\" width=\"200\"> |   \n\n# \u003Ca name=\"debugging-your-results\">\u003C\u002Fa> Debugging your results\n### ❗❗ THE NUMBER ONE MISTAKE PEOPLE MAKE ❗❗\n\n**Prompting with just your token, i.e. \"joepenna\" instead of \"joepenna person\"**\n\n\nIf you trained with `joepenna` under the class `person`, the model should only know your face as:\n\n```\njoepenna person\n```\n\nExample Prompts:\n\n🚫 Incorrect (missing `person` following `joepenna`)\n```\nportrait photograph of joepenna 35mm film vintage glass\n```\n\n✅ This is right (`person` is included after `joepenna`)\n```\nportrait photograph of joepenna person 35mm film vintage glass\n```\n\nYou might sometimes get someone who kinda looks like you with joepenna (especially if you trained for too many steps), but that's only because this current iteration of Dreambooth overtrains that token so much that it bleeds into that token.\n\n---\n\n#### ☢ Be careful with the types of images you train\n\nWhile training, Stable doesn't know that you're a person. It's just going to mimic what it sees.\n\nSo, if your training images look like this:\n\n![](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FJoePenna_Dreambooth-Stable-Diffusion_readme_42742602b882.png)\n\nYou're only going to get generations of you outside next to a spiky tree, wearing a white-and-gray shirt, in the style of... well, a selfie photograph.\n\nInstead, this training set is much better:\n\n![](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FJoePenna_Dreambooth-Stable-Diffusion_readme_5221e245bd3c.png)\n\nThe only thing that is consistent between images is the subject. So, Stable will look through the images and learn only your face, which will make \"editing\" it into other styles possible.\n\n## Oh no! You're not getting good generations!\n\n#### \u003Ca name=\"they-dont-look-like-you\">\u003C\u002Fa> OPTION 1: They're not looking like you at all! (Train longer, or get better training images)\n\nAre you sure you're prompting it right?\n\nIt should be `\u003Ctoken> \u003Cclass>`, not just `\u003Ctoken>`. 
For example:\n\n`JoePenna person, portrait photograph, 85mm medium format photo`\n\n\nIf it still doesn't look like you, you didn't train long enough.\n\n----\n\n#### \u003Ca name=\"they-sorta-look-like-you-but-exactly-like-your-training-images\">\u003C\u002Fa> OPTION 2: They're looking like you, but are all looking like your training images. (Train for less steps, get better training images, fix with prompting)\n\nOkay, a few reasons why: you might have trained too long... or your images were too similar... or you didn't train with enough images.\n\nNo problem. We can fix that with the prompt. Stable Diffusion puts a LOT of merit to whatever you type first. So save it for later:\n\n`an exquisite portrait photograph, 85mm medium format photo of JoePenna person with a classic haircut`\n\n\n----\n\n#### \u003Ca name=\"they-look-like-you-but-not-when-you-try-different-styles\">\u003C\u002Fa> OPTION 3: They're looking like you, but not when you try different styles. (Train longer, get better training images)\n\nYou didn't train long enough...\n\nNo problem. We can fix that with the prompt:\n\n`JoePenna person in a portrait photograph, JoePenna person in a 85mm medium format photo of JoePenna person`\n\n\n### More tips and help here: [Stable Diffusion Dreambooth Discord](https:\u002F\u002Fdiscord.com\u002Finvite\u002FqbMuXBXyHA)\n\n# \u003Ca name=\"hugging-face-diffusers\">\u003C\u002Fa> Hugging Face Diffusers - Alternate Option\n\nDreambooth is now supported in HuggingFace Diffusers for training with Stable Diffusion.\n\nTry it out here:\n\n[![Open In Colab](https:\u002F\u002Fcolab.research.google.com\u002Fassets\u002Fcolab-badge.svg)](https:\u002F\u002Fcolab.research.google.com\u002Fgithub\u002Fhuggingface\u002Fnotebooks\u002Fblob\u002Fmain\u002Fdiffusers\u002Fsd_dreambooth_training.ipynb)\n","# 由Yushan提供的扩展Dreambooth操作指南\n[在Vast.ai上运行](https:\u002F\u002Fmedium.com\u002F@yushantripleseven\u002Fdreambooth-training-joepenna-on-vast-ai-5f1018239820)\u003Cbr>\n[在Google Colab上运行](https:\u002F\u002Fmedium.com\u002F@yushantripleseven\u002Fdreambooth-training-joepenna-on-google-colab-63ec6e6cf050)\u003Cbr>\n[在本地PC（Windows）上运行](https:\u002F\u002Fmedium.com\u002F@yushantripleseven\u002Fdreambooth-training-joepenna-on-a-local-pc-windows-f00a4fd11dfd)\u003Cbr>\n[在本地PC（Ubuntu）上运行](https:\u002F\u002Fmedium.com\u002F@yushantripleseven\u002Fdreambooth-training-joepenna-on-a-local-pc-ubuntu-a2bf796430d2)\u003Cbr>\n[将Corridor Digital的Dreambooth教程适配到JoePenna的仓库](https:\u002F\u002Fmedium.com\u002F@yushantripleseven\u002Fadapting-corridor-digitals-dreambooth-tutorial-to-joepenna-s-repo-d82bfbe0bfd2)\u003Cbr>\n[在JoePenna的Dreambooth中使用标题](https:\u002F\u002Fmedium.com\u002F@yushantripleseven\u002Fusing-captions-with-dreambooth-joepenna-dreambooth-716f5b9e9866)\u003Cbr>\n\n# 索引\n\n- [Joe Penna的笔记](#notes-by-joe-penna)\n- [设置](#setup)\n  - [简易RunPod说明](#easy-runpod-instructions)\n  - [Vast.AI设置](#vast-ai-setup)\n  - [本地运行](#running-locally)\n    - [venv](#running-locally-venv)\n    - [Conda](#running-locally-conda)\n  - [配置文件与命令行参考](#config-file-and-command-line-reference)\n- [标题及多主体\u002F概念支持](#captions-and-multi-concept)\n- [文本反转 vs. 
Dreambooth](#text-vs-dreamb)\n- [使用生成的模型](#using-the-generated-model)\n- [调试你的结果](#debugging-your-results)\n  - [它们完全不像你！](#they-dont-look-like-you)\n  - [它们有点像你，但又完全像你的训练图片](#they-sorta-look-like-you-but-exactly-like-your-training-images)\n  - [它们看起来像你，但在尝试不同风格时却不再如此](#they-look-like-you-but-not-when-you-try-different-styles)\n- [Hugging Face Diffusers](#hugging-face-diffusers)\n\n# 曾被称为“Dreambooth”的仓库\n![image](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FJoePenna_Dreambooth-Stable-Diffusion_readme_935348b6b48b.png)\n\n## \u003Ca name=\"notes-by-joe-penna\">\u003C\u002Fa> Joe Penna的笔记\n### **介绍！**\n你好！我叫乔·佩纳。\n\n你可能在YouTube上看过我以“MysteryGuitarMan”名义发布的一些视频。现在我是一名故事片导演。你或许已经看过《极地》（ARCTIC）或《偷渡者》（STOWAWAY）。\n\n为了我的电影制作，我需要能够训练特定的演员、道具、场景等。因此，我对@XavierXiao的仓库进行了大量修改，以便用于训练人脸。\n\n虽然我无法公开当前正在拍摄的电影的所有测试内容，但我会在自己的Twitter页面——[@MysteryGuitarM](https:\u002F\u002Ftwitter.com\u002FMysteryGuitarM)——分享使用自己脸部进行的测试结果。这些测试大多是在我的朋友Niko（来自CorridorDigital）的帮助下完成的。也许这就是你找到这个仓库的原因！\n\n我并不是一名程序员。我只是比较固执，并且不怕去谷歌搜索。最终，一些非常聪明的人加入了进来并持续贡献代码。在这个仓库中，特别要感谢：[@djbielejeski](https:\u002F\u002Fgithub.com\u002Fdjbielejeski)、@gammagec、MrSaad——当然还有我们Discord社区中的许多其他成员！\n\n现在，这已经不再是我一个人的仓库了，而是所有希望看到Dreambooth在Stable Diffusion上良好运行的人们的共同成果！\n\n如果你打算尝试这个项目，请先阅读以下警告：\n\n### **警告！**\n\n- 让我们尊重那些花费多年时间磨练技艺的创作者们的辛勤劳动和创造力。\n  - 这个版本的Dreambooth专为数字艺术家设计，旨在将他们自己的角色和风格融入Stable Diffusion模型中，同时也允许用户训练自己的肖像。我的主要目标是为电影制作人提供一个工具，让他们能够与所雇佣的概念艺术家进行互动，从而生成初始创意的种子，以便更直观地沟通。它适用于电影制作人、概念艺术家、漫画设计师等。\n  - 未来可能会出现基于完美数据集训练而成的Stable Diffusion模型。然而，在此之前，出于道德、伦理以及潜在法律方面的考虑，我强烈建议不要将他人的作品训练进这些模型中（除非你已获得明确许可，或者对方已公开表示接受这项技术）。同样地，我也不建议在提示词中使用艺术家的名字。请不要让那些使这一切成为可能的人失去工作机会！\n\n- 技术方面：\n  - 你现在可以在配备**24GB显存**的GPU上运行（例如3090）。不过训练速度会较慢，而且你需要确保这是唯一正在运行的程序。\n  - 如果你像我一样没有这样的设备，我在这里附上了一个Jupyter笔记本，帮助你在租用的云计算平台上运行。\n  - 目前该笔记本已针对[runpod.io](https:\u002F\u002Frunpod.io?ref=n8yfwyum)和[vast.ai](http:\u002F\u002Fconsole.vast.ai\u002F?ref=47390)进行了优化。\n  - 我们也支持Colab笔记本：[![在Colab中打开](https:\u002F\u002Fcolab.research.google.com\u002Fassets\u002Fcolab-badge.svg)](https:\u002F\u002Fcolab.research.google.com\u002Fgithub\u002FJoePenna\u002FDreambooth-Stable-Diffusion\u002Fblob\u002Fmain\u002Fdreambooth_google_colab_joepenna.ipynb)\n\n- 此实现并未完全遵循Google关于如何保持潜在空间完整性的理念。\n  - 大多数与你训练内容相似的图像都会向你所训练的内容靠拢。\n  - 例如，如果你在训练一个人物，那么所有人物都会看起来像你；如果你在训练一件物品，那么同类物品都会呈现出你所训练的外观。\n\n- 目前似乎还没有简单的方法可以连续训练两个主题。在修剪之前，你会得到一个大小约为`11-12GB`的文件。\n  - 提供的笔记本包含一个修剪工具，可以将其压缩至约`~2gb`。\n\n- 最佳实践是将**token**替换为某个名人的名字（注意：这里指的是token，而非class——因此你的提示词可能是这样的：“Chris Evans person”）。以下是[我妻子使用完全相同的设置进行训练的结果，唯一的区别就是token](#using-the-generated-model)\n\n\n# \u003Ca name=\"setup\">\u003C\u002Fa> 设置\n## \u003Ca name=\"easy-runpod-instructions\">\u003C\u002Fa> 简易RunPod说明\n\n**请注意，RunPod会定期升级其基础Docker镜像，这可能导致仓库无法正常运行。尽管目前所有的YouTube视频都不再是最新的，但你仍然可以将其作为参考。按照常规的RunPod YouTube视频或教程操作，只需做以下更改：**\n\n在“My Pods”页面中，\n\n- 点击菜单按钮（位于紫色播放按钮左侧）\n- 选择“编辑Pod”\n- 将“Docker镜像名称”更新为以下之一（经2023年6月27日测试）：\n  - `runpod\u002Fpytorch:3.10-2.0.1-120-devel`\n  - `runpod\u002Fpytorch:3.10-2.0.1-118-runtime`\n  - `runpod\u002Fpytorch:3.10-2.0.0-117`\n  - `runpod\u002Fpytorch:3.10-1.13.1-116`\n- 点击“保存”。\n- 重启你的Pod\n\n### 继续阅读指南的其余部分：\n\n- 注册 RunPod。欢迎使用我的[推荐链接](https:\u002F\u002Frunpod.io?ref=n8yfwyum)，这样我就不用为此付费了（但你需要支付）。\n- 登录后，选择 `SECURE CLOUD` 或 `COMMUNITY CLOUD`。\n- 确保选择网络速度“高”的选项，以免因下载缓慢而浪费时间和金钱。\n- 选择显存至少为 **24GB** 的设备，例如 RTX 3090、RTX 4090 或 RTX A5000。\n\n- 
按照下面的视频教程操作：\n\n[![视频教程](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FJoePenna_Dreambooth-Stable-Diffusion_readme_134ad10ce0f1.jpg)](https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=7m__xadX0z0#t=5m33.1s)\n\n## \u003Ca name=\"vast-ai-setup\">\u003C\u002Fa> Vast.AI 使用说明\n- 注册 [Vast.AI](http:\u002F\u002Fconsole.vast.ai\u002F?ref=47390)（由 David Bielejeski 提供的推荐链接）。\n- 充值一些资金（我通常每次充值 10 美元）。\n- 前往 [客户端 - 创建页面](https:\u002F\u002Fvast.ai\u002Fconsole\u002Fcreate\u002F?ref=47390)：\n  - 选择 pytorch\u002Fpytorch 作为 Docker 镜像，并勾选“使用 Jupyter Lab 界面”和“Jupyter 直接 HTTPS”按钮。\n  - ![img.png](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FJoePenna_Dreambooth-Stable-Diffusion_readme_b98b2c93c9a0.png)\n- 建议增加磁盘空间，并按 GPU 显存进行筛选（2GB 的检查点文件 + 2-8GB 的模型文件 + 正则化图像 + 其他内容会迅速占用大量空间）。\n  - 我通常分配 150GB。\n  - ![img.png](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FJoePenna_Dreambooth-Stable-Diffusion_readme_2c5ea7a56299.png)\n  - 同时也要检查上传\u002F下载速度，确保带宽足够，以免在等待下载时耗尽资金。\n- 选择所需的实例，点击“租用”，然后前往你的 [实例页面](https:\u002F\u002Fvast.ai\u002Fconsole\u002Finstances\u002F?ref=47390)，点击“打开”。\n  - ![img.png](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FJoePenna_Dreambooth-Stable-Diffusion_readme_6a2f78177fe9.png)\n  - 你可能会收到不安全证书警告。点击忽略该警告，或安装[Vast 证书](https:\u002F\u002Fvast.ai\u002Fstatic\u002Fjvastai_root.cer)。\n- 点击 `Notebook -> Python 3`（你可以通过多种方式完成这一步，但我通常这样做）。\n  - ![img.png](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FJoePenna_Dreambooth-Stable-Diffusion_readme_82e300612c40.png)\n- 使用以下命令克隆 Joe 的仓库：\n  - `!git clone https:\u002F\u002Fgithub.com\u002FJoePenna\u002FDreambooth-Stable-Diffusion.git`\n  - 点击“运行”。\n  - ![img.png](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FJoePenna_Dreambooth-Stable-Diffusion_readme_78b9a04cdeb2.png)\n- 在左侧导航栏中进入新创建的 `Dreambooth-Stable-Diffusion` 目录，打开 `dreambooth_simple_joepenna.ipynb` 或 `dreambooth_runpod_joepenna.ipynb` 文件。\n  - ![img.png](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FJoePenna_Dreambooth-Stable-Diffusion_readme_0e26619d826b.png)\n- 按照笔记本中的说明开始训练。\n\n## \u003Ca name=\"running-locally\">\u003C\u002Fa> 本地运行说明\n\n### \u003Ca name=\"running-locally-venv\">\u003C\u002Fa> 设置 - 虚拟环境\n\n### 先决条件\n1. [Git](https:\u002F\u002Fgitforwindows.org\u002F)\n2. [Python 3.10](https:\u002F\u002Fwww.python.org\u002Fdownloads\u002F)\n3. 打开 `cmd`。\n4. 克隆仓库：\n   1. `C:\\>git clone https:\u002F\u002Fgithub.com\u002FJoePenna\u002FDreambooth-Stable-Diffusion`\n5. 进入仓库目录：\n   1. `C:\\>cd Dreambooth-Stable-Diffusion`\n\n### 安装依赖并激活环境\n```cmd\ncmd> python -m venv dreambooth_joepenna\ncmd> dreambooth_joepenna\\Scripts\\activate.bat\ncmd> pip install torch==1.13.1+cu117 torchvision==0.14.1+cu117 --extra-index-url https:\u002F\u002Fdownload.pytorch.org\u002Fwhl\u002Fcu117\ncmd> pip install -r requirements.txt\n```\n\n#### 运行\n`cmd> python \"main.py\" --project_name \"ProjectName\" --training_model \"C:\\v1-5-pruned-emaonly-pruned.ckpt\" --regularization_images \"C:\\regularization_images\" --training_images \"C:\\training_images\" --max_training_steps 2000 --class_word \"person\" --token \"zwx\" --flip_p 0 --learning_rate 1.0e-06 --save_every_x_steps 250`\n\n#### 清理\n```cmd\ncmd> deactivate \n```\n\n### \u003Ca name=\"running-locally-conda\">\u003C\u002Fa> 设置 - Conda\n\n### 先决条件\n1. [Git](https:\u002F\u002Fgitforwindows.org\u002F)\n2. [Python 3.10](https:\u002F\u002Fwww.python.org\u002Fdownloads\u002F)\n3. [miniconda3](https:\u002F\u002Fdocs.conda.io\u002Fen\u002Flatest\u002Fminiconda.html)\n4. 打开 `Anaconda Prompt (miniconda3)`。\n5. 克隆仓库：\n   1. 
`(base) C:\\>git clone https:\u002F\u002Fgithub.com\u002FJoePenna\u002FDreambooth-Stable-Diffusion`\n6. 进入仓库目录：\n   1. `(base) C:\\>cd Dreambooth-Stable-Diffusion`\n\n### 安装依赖并激活环境\n\n```cmd\n(base) C:\\Dreambooth-Stable-Diffusion> conda env create -f environment.yaml\n(base) C:\\Dreambooth-Stable-Diffusion> conda activate dreambooth_joepenna\n```\n\n##### 运行\n`cmd> python \"main.py\" --project_name \"ProjectName\" --training_model \"C:\\v1-5-pruned-emaonly-pruned.ckpt\" --regularization_images \"C:\\regularization_images\" --training_images \"C:\\training_images\" --max_training_steps 2000 --class_word \"person\" --token \"zwx\" --flip_p 0 --learning_rate 1.0e-06 --save_every_x_steps 250`\n\n##### 清理\n```cmd\ncmd> conda deactivate\n```\n\n# \u003Ca name=\"config-file-and-command-line-reference\">\u003C\u002Fa> 配置文件与命令行参考\n\n## 示例配置文件\n\n```\n{\n    \"class_word\": \"woman\",\n    \"config_date_time\": \"2023-04-08T16-54-00\",\n    \"debug\": false,\n    \"flip_percent\": 0.0,\n    \"gpu\": 0,\n    \"learning_rate\": 1e-06,\n    \"max_training_steps\": 3500,\n    \"model_path\": \"D:\\\\stable-diffusion\\\\models\\\\v1-5-pruned-emaonly-pruned.ckpt\",\n    \"model_repo_id\": \"\",\n    \"project_config_filename\": \"my-config.json\",\n    \"project_name\": \"\u003Ctoken> project\",\n    \"regularization_images_folder_path\": \"D:\\\\stable-diffusion\\\\regularization_images\\\\Stable-Diffusion-Regularization-Images-person_ddim\\\\person_ddim\",\n    \"save_every_x_steps\": 250,\n    \"schema\": 1,\n    \"seed\": 23,\n    \"token\": \"\u003Ctoken>\",\n    \"token_only\": false,\n    \"training_images\": [\n        \"001@a photo of \u003Ctoken> looking down.png\",\n        \"002-DUPLICATE@a close photo of \u003Ctoken> smiling wearing a black sweatshirt.png\",\n        \"002@a photo of \u003Ctoken> wearing a black sweatshirt sitting on a blue couch.png\",\n        \"003@a photo of \u003Ctoken> smiling wearing a red flannel shirt with a door in the background.png\",\n        \"004@a photo of \u003Ctoken> wearing a purple sweater dress standing with her arms crossed in front of a piano.png\",\n        \"005@a close photo of \u003Ctoken> with her hand on her chin.png\",\n        \"005@a photo of \u003Ctoken> with her hand on her chin wearing a dark green coat and a red turtleneck.png\",\n        \"006@a close photo of \u003Ctoken>.png\",\n        \"007@a close photo of \u003Ctoken>.png\",\n        \"008@a photo of \u003Ctoken> wearing a purple turtleneck and earings.png\",\n        \"009@a close photo of \u003Ctoken> wearing a red flannel shirt with her hand on her head.png\",\n        \"011@a close photo of \u003Ctoken> wearing a black shirt.png\",\n        \"012@a close photo of \u003Ctoken> smirking wearing a gray hooded sweatshirt.png\",\n        \"013@a photo of \u003Ctoken> standing in front of a desk.png\",\n        \"014@a close photo of \u003Ctoken> standing in a kitchen.png\",\n        \"015@a photo of \u003Ctoken> wearing a pink sweater with her hand on her forehead sitting on a couch with leaves in the background.png\",\n        \"016@a photo of \u003Ctoken> wearing a black shirt standing in front of a door.png\",\n        \"017@a photo of \u003Ctoken> smiling wearing a black v-neck sweater sitting on a couch in front of a lamp.png\",\n        \"019@a photo of \u003Ctoken> wearing a blue v-neck shirt in front of a door.png\",\n        \"020@a photo of \u003Ctoken> looking down with her hand on her face wearing a black sweater.png\",\n        \"021@a close photo of \u003Ctoken> pursing 
her lips wearing a pink hooded sweatshirt.png\",\n        \"022@a photo of \u003Ctoken> looking off into the distance wearing a striped shirt.png\",\n        \"023@a photo of \u003Ctoken> smiling wearing a blue beanie holding a wine glass with a kitchen table in the background.png\",\n        \"024@a close photo of \u003Ctoken> looking at the camera.png\"\n    ],\n    \"training_images_count\": 24,\n    \"training_images_folder_path\": \"D:\\\\stable-diffusion\\\\training_images\\\\24 Images - captioned\"\n}\n```\n\n### 使用您的配置进行训练\n\n```\npython \"main.py\" --config_file_path \"path\u002Fto\u002Fthe\u002Fmy-config.json\"\n```\n\n## 命令行参数\n\n[dreambooth_helpers\\arguments.py](https:\u002F\u002Fgithub.com\u002FJoePenna\u002FDreambooth-Stable-Diffusion\u002Fblob\u002Fmain\u002Fdreambooth_helpers\u002Farguments.py)\n\n| 命令 | 类型 | 示例 | 描述 |\n| ------- | ---- | ------- | ----------- |\n| `--config_file_path` | 字符串 | `\"C:\\\\Users\\\\David\\\\Dreambooth Configs\\\\my-config.json\"` | 要使用的配置文件路径 |\n| `--project_name` | 字符串 | `\"My Project Name\"` | 项目名称 |\n| `--debug` | 布尔值 | `False` | *可选* 默认为 `False`。启用调试日志记录 |\n| `--seed` | 整数 | `23` | *可选* 默认为 `23`。用于 seed_everything 的随机种子 |\n| `--max_training_steps` | 整数 | `3000` | 要运行的训练步数 |\n| `--token` | 字符串 | `\"owhx\"` | 您希望用来代表训练后模型的唯一标记。 |\n| `--token_only` | 布尔值 | `False` | *可选* 默认为 `False`。仅使用标记进行训练，不使用类别词。 |\n| `--training_model` | 字符串 | `\"D:\\\\stable-diffusion\\\\models\\\\v1-5-pruned-emaonly-pruned.ckpt\"` | 要训练的模型路径（model.ckpt） |\n| `--training_images` | 字符串 | `\"D:\\\\stable-diffusion\\\\training_images\\\\24 Images - captioned\"` | 训练图像目录的路径 |\n| `--regularization_images` | 字符串 | `\"D:\\\\stable-diffusion\\\\regularization_images\\\\Stable-Diffusion-Regularization-Images-person_ddim\\\\person_ddim\"` | 包含正则化图像的目录路径 |\n| `--class_word` | 字符串 | `\"woman\"` | 将 class_word 与您想要训练的图像类别匹配。例如：`man`、`woman`、`dog` 或 `artstyle`。 |\n| `--flip_p` | 浮点数 | `0.0` | *可选* 默认为 `0.5`。翻转百分比。例如，如果设置为 `0.5`，则会在 50% 的时间里翻转（镜像）您的训练图像。这有助于在无需增加更多训练图像的情况下扩展数据集。不过，对于人脸训练来说，这种做法可能会导致效果变差，因为大多数人的脸并不完全对称。 |\n| `--learning_rate` | 浮点数 | `1.0e-06` | *可选* 默认为 `1.0e-06`（0.000001）。设置学习率。支持科学计数法。 |\n| `--save_every_x_steps` | 整数 | `250` | *可选* 默认为 `0`。每 x 步保存一次检查点。当设置为 `0` 时，仅在达到 `max_training_steps` 后的训练结束时保存。 |\n| `--gpu` | 整数 | `0` | *可选* 默认为 `0`。指定除 GPU 0 外的其他 GPU 进行训练。目前尚未实现多 GPU 支持。\n\n### 使用您的配置进行训练\n\n```\npython \"main.py\" --project_name \"My Project Name\" --max_training_steps 3000 --token \"owhx\" --training_model \"D:\\\\stable-diffusion\\\\models\\\\v1-5-pruned-emaonly-pruned.ckpt\" --training_images \"D:\\\\stable-diffusion\\\\training_images\\\\24 Images - captioned\" --regularization_images \"D:\\\\stable-diffusion\\\\regularization_images\\\\Stable-Diffusion-Regularization-Images-person_ddim\\\\person_ddim\" --class_word \"woman\" --flip_p 0.0 --save_every_x_steps 500\n```\n\n# \u003Ca name=\"captions-and-multi-concept\">\u003C\u002Fa> 字幕与多主体\u002F概念支持\n\n字幕功能已支持。关于我们如何实现字幕的指南，请参阅[这里](https:\u002F\u002Fdiscord.com\u002Fchannels\u002F1023277529424986162\u002F1029222282511515678)。\n\n假设你的标记是“effy”，类别是“person”，数据根目录为\u002Ftrain，那么：\n\n`training_images\u002Fimg-001.jpg` 的字幕为 `effy person`\n\n你可以通过在文件名中添加 `@` 符号来自定义字幕。\n\n`\u002Ftraining_images\u002Fimg-001@a photo of effy` => `a photo of effy`\n\n你可以在字幕中使用两个标记 `S`（大写 S）和 `C`（大写 C），分别表示主体和类别。\n\n`\u002Ftraining_images\u002Fimg-001@S being a good C.jpg` => `effy being a good person`\n\n要创建一个新的主体，只需为其创建一个文件夹即可。例如：\n\n`\u002Ftraining_images\u002Fbingo\u002Fimg-001.jpg` => `bingo person`\n\n类别的部分保持不变，但主体已经改变。\n\n此时，标记 
S 变为 bingo：\n\n`\u002Ftraining_images\u002Fbingo\u002Fimg-001@S is being silly.jpg` => `bingo is being silly`\n\n再深入一层，你就可以更改类别：`\u002Ftraining_images\u002Fbingo\u002Fdog\u002Fimg-001@S being a good C.jpg` => `bingo being a good dog`\n\n更进一步：再深入一层，你还可以为一组图片添加字幕：`\u002Ftraining_images\u002Feffy\u002Fperson\u002Fa picture of\u002Fimg-001.jpg` => `a picture of effy person`\n\n\n# \u003Ca name=\"text-vs-dreamb\">\u003C\u002Fa> 文本反转 vs. Dreambooth\n这个仓库中的大部分代码是由 Rinon Gal 等人编写的，他们是文本反转研究论文的作者。尽管其中加入了一些关于正则化图像和先验损失保留的想法（来自“Dreambooth”），但为了尊重 MIT 团队和 Google 的研究人员，我将这个分支重命名为：\n*“曾经被称为‘Dreambooth’的仓库”*。\n\n如需其他实现方式，请参阅下方的[替代方案](#hugging-face-diffusers)。\n\n\n# \u003Ca name=\"using-the-generated-model\">\u003C\u002Fa> 使用生成的模型\n`ground truth`（真实照片，注意：非常美丽的女性）\n\u003Cbr>\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FJoePenna_Dreambooth-Stable-Diffusion_readme_d8a449d41a16.png\" width=\"200\">\n\n以下所有图片使用相同的提示词：\n\n| `sks person` | `woman person` | `Natalie Portman person` | `Kate Mara person` |\n| ----- | ------- | ----------------- | ----------- |\n| \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FJoePenna_Dreambooth-Stable-Diffusion_readme_a9da7c4323b9.png\" width=\"200\"> | \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FJoePenna_Dreambooth-Stable-Diffusion_readme_a2945b0a7f9b.png\" width=\"200\"> | \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FJoePenna_Dreambooth-Stable-Diffusion_readme_b87aed1fcfca.png\" width=\"200\"> | \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FJoePenna_Dreambooth-Stable-Diffusion_readme_ac30e5e06ed0.png\" width=\"200\"> |   \n\n# \u003Ca name=\"debugging-your-results\">\u003C\u002Fa> 调试你的生成结果\n### ❗❗ 人们常犯的第一大错误 ❗❗\n\n**仅用你的标记进行提示，例如“joepenna”而不是“joepenna person”**\n\n\n如果你以 `joepenna` 作为标记，并将其归入 `person` 类别进行训练，那么模型只会将你的脸识别为：\n\n```\njoepenna person\n```\n\n示例提示词：\n\n🚫 错误（`joepenna` 后缺少 `person`）\n```\nportrait photograph of joepenna 35mm film vintage glass\n```\n\n✅ 正确（`joepenna` 后包含 `person`）\n```\nportrait photograph of joepenna person 35mm film vintage glass\n```\n\n有时你可能会得到一些长得有点像你的结果（尤其是在训练步数过多时），但这只是因为当前版本的 Dreambooth 过度训练了该标记，导致其泛化到了其他内容上。\n\n---\n\n#### ☢ 训练时请注意使用的图片类型\n\n在训练过程中，Stable Diffusion 并不知道你是一个人。它只会模仿所看到的内容。\n\n因此，如果你的训练图片看起来像这样：\n\n![](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FJoePenna_Dreambooth-Stable-Diffusion_readme_42742602b882.png)\n\n那么你最终生成的结果很可能都是你在带刺树旁、穿着白灰相间衬衫的样子，风格也只会是……嗯，自拍照片。\n\n相比之下，这样的训练集会更好：\n\n![](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FJoePenna_Dreambooth-Stable-Diffusion_readme_5221e245bd3c.png)\n\n这些图片之间唯一一致的地方就是主体。这样一来，Stable Diffusion 将只学习你的面部特征，从而可以将其“编辑”成其他风格。\n\n## 哦不！生成效果并不理想！\n\n#### \u003Ca name=\"they-dont-look-like-you\">\u003C\u002Fa> 方案 1：生成结果完全不像你！（继续训练，或使用更好的训练图片）\n\n你确定自己的提示词写对了吗？\n\n正确的格式应该是 `\u003C标记> \u003C类别>`，而不是仅仅使用 `\u003C标记>`。例如：\n\n`JoePenna person, portrait photograph, 85mm medium format photo`\n\n\n如果仍然不像你，那说明你训练的时间还不够长。\n\n----\n\n#### \u003Ca name=\"they-sorta-look-like-you-but-exactly-like-your-training-images\">\u003C\u002Fa> 方案 2：生成结果有点像你，但都和你的训练图片一模一样。（减少训练步数，使用更好的训练图片，或通过调整提示词解决）\n\n可能有以下几个原因：你可能训练得太久了……或者你的图片过于相似……又或者你用于训练的图片数量不足。\n\n没关系，我们可以通过调整提示词来解决这个问题。Stable Diffusion 对你输入内容的优先级非常高，所以请把关键信息放在前面：\n\n`an exquisite portrait photograph, 85mm medium format photo of JoePenna person with a classic haircut`\n\n\n----\n\n#### \u003Ca name=\"they-look-like-you-but-not-when-you-try-different-styles\">\u003C\u002Fa> 方案 
3：生成结果像你，但在尝试不同风格时却不像。（继续训练，使用更好的训练图片）\n\n看来你还是训练得不够充分……\n\n没关系，我们可以通过调整提示词来解决：\n\n`JoePenna person in a portrait photograph, JoePenna person in a 85mm medium format photo of JoePenna person`\n\n\n### 更多技巧和帮助请访问：[Stable Diffusion Dreambooth Discord](https:\u002F\u002Fdiscord.com\u002Finvite\u002FqbMuXBXyHA)\n\n# \u003Ca name=\"hugging-face-diffusers\">\u003C\u002Fa> Hugging Face Diffusers - 替代方案\n\n现在，HuggingFace Diffusers 已经支持使用 Stable Diffusion 进行 Dreambooth 训练。\n\n你可以在以下链接中尝试：\n\n[![Open In Colab](https:\u002F\u002Fcolab.research.google.com\u002Fassets\u002Fcolab-badge.svg)](https:\u002F\u002Fcolab.research.google.com\u002Fgithub\u002Fhuggingface\u002Fnotebooks\u002Fblob\u002Fmain\u002Fdiffusers\u002Fsd_dreambooth_training.ipynb)","# Dreambooth-Stable-Diffusion 快速上手指南\n\nDreambooth-Stable-Diffusion 是一个基于 Stable Diffusion 的微调工具，允许用户通过少量图片训练模型，以生成特定人物、物体或风格的图像。本指南基于 JoePenna 优化版本，帮助开发者快速部署并运行训练任务。\n\n## 环境准备\n\n### 系统要求\n- **操作系统**：Windows 10\u002F11 或 Ubuntu Linux\n- **GPU**：推荐 NVIDIA RTX 3090\u002F4090 或同等算力显卡（显存至少 24GB）。若显存不足，建议使用云端 GPU 服务（如 RunPod, Vast.AI）。\n- **Python 版本**：3.10\n\n### 前置依赖\n请确保已安装以下软件：\n1. **Git**：[下载地址 (Windows)](https:\u002F\u002Fgitforwindows.org\u002F)\n2. **Python 3.10**：[官方下载](https:\u002F\u002Fwww.python.org\u002Fdownloads\u002F)\n3. **CUDA 驱动**：确保显卡驱动支持 CUDA 11.7（对应 PyTorch 1.13.1）\n\n> **国内加速建议**：\n> - 使用清华源或阿里源加速 Python 包安装。\n> - 若访问 GitHub 缓慢，可使用镜像站克隆仓库。\n\n## 安装步骤\n\n你可以选择使用 **虚拟环境 (venv)** 或 **Conda** 进行安装。以下以 Windows 为例，Linux 用户请将路径分隔符 `\\` 改为 `\u002F`。\n\n### 方式一：使用 venv (推荐)\n\n1. **克隆仓库**\n   打开命令行工具 (`cmd`)，执行：\n   ```cmd\n   git clone https:\u002F\u002Fgithub.com\u002FJoePenna\u002FDreambooth-Stable-Diffusion.git\n   cd Dreambooth-Stable-Diffusion\n   ```\n\n2. **创建并激活虚拟环境**\n   ```cmd\n   python -m venv dreambooth_joepenna\n   dreambooth_joepenna\\Scripts\\activate.bat\n   ```\n\n3. **安装 PyTorch (国内加速)**\n   推荐使用清华源安装指定版本的 PyTorch：\n   ```cmd\n   pip install torch==1.13.1+cu117 torchvision==0.14.1+cu117 --extra-index-url https:\u002F\u002Fdownload.pytorch.org\u002Fwhl\u002Fcu117 -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple\n   ```\n\n4. **安装其他依赖**\n   ```cmd\n   pip install -r requirements.txt -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple\n   ```\n\n### 方式二：使用 Conda\n\n1. **克隆仓库** (同上)\n\n2. 
**创建环境**\n   打开 `Anaconda Prompt`，进入项目目录后执行：\n   ```cmd\n   conda env create -f environment.yaml\n   conda activate dreambooth_joepenna\n   ```\n   > 若 `environment.yaml` 下载依赖过慢，可手动编辑该文件，将 `pip` 部分替换为国内镜像源地址。\n\n## 基本使用\n\n训练前请准备好以下资源：\n- **基础模型**：例如 `v1-5-pruned-emaonly-pruned.ckpt`\n- **训练图片**：放入文件夹，例如 `C:\\training_images`\n- **正则化图片**：用于防止过拟合，放入文件夹，例如 `C:\\regularization_images`\n\n### 运行训练命令\n\n在激活的环境中执行以下命令（请根据实际路径修改参数）：\n\n```cmd\npython \"main.py\" --project_name \"MyProject\" --training_model \"C:\\models\\v1-5-pruned-emaonly-pruned.ckpt\" --regularization_images \"C:\\regularization_images\" --training_images \"C:\\training_images\" --max_training_steps 2000 --class_word \"person\" --token \"zwx\" --flip_p 0 --learning_rate 1.0e-06 --save_every_x_steps 250\n```\n\n### 参数说明\n- `--project_name`: 项目名称，生成的模型将保存在此文件夹下。\n- `--training_model`: 原始 Stable Diffusion 模型路径 (.ckpt)。\n- `--training_images`: 包含你想要训练的特定主体（如人脸、物体）的图片文件夹路径。\n- `--regularization_images`: 正则化图片文件夹路径（通常使用同类别的通用图片，如“人”、“狗”等）。\n- `--class_word`: 类别词，例如训练人像填 `person`，训练狗填 `dog`。\n- `--token`: 触发词，一个独特的标识符（建议生僻词或名字），用于在生成时调用训练结果。\n- `--max_training_steps`: 最大训练步数，通常 2000-3500 步。\n- `--learning_rate`: 学习率，默认 `1.0e-06` 即可。\n\n训练完成后，生成的模型文件位于 `projects\u002FMyProject\u002Fcheckpoints` 目录下，可直接加载到 Stable Diffusion WebUI 中使用。","独立游戏开发者小林正在为一款赛博朋克风格的游戏制作宣传图，需要让主角“艾拉”以不同姿态出现在各种复杂场景中。\n\n### 没有 Dreambooth-Stable-Diffusion 时\n- 每次生成主角都必须依赖繁琐的后期修图，因为通用模型无法稳定还原“艾拉”独特的面部特征和发型细节。\n- 试图通过提示词描述角色外貌时，生成的图像往往面目全非，或者每次生成的“艾拉”长得都不一样，缺乏一致性。\n- 若要获得高质量且角色统一的素材，只能聘请画师手绘大量原画，成本高昂且修改迭代周期长达数天。\n- 调整角色动作或背景风格时，经常导致角色崩坏，难以在保持人物特征的同时自由切换艺术风格。\n\n### 使用 Dreambooth-Stable-Diffusion 后\n- 仅需上传十几张“艾拉”的概念设计图进行微调训练，Dreambooth-Stable-Diffusion 就能将角色特征深度植入模型，实现“一键召唤”固定角色。\n- 输入简单的提示词（如“艾拉在雨夜霓虹灯下”），即可生成高度一致的主角形象，彻底解决了角色长相随机变化的问题。\n- 开发者可以低成本快速产出数十种不同姿势、光影和场景的草图供团队筛选，将创意验证时间从几天缩短至几分钟。\n- 在保持“艾拉”面容不变的前提下，能自由让她穿梭于写实、动漫或油画等多种艺术风格中，极大丰富了视觉表现力。\n\nDreambooth-Stable-Diffusion 通过将特定角色或风格“刻入”模型基因，让创作者能以极低的成本实现高一致性的个性化内容批量生产。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FJoePenna_Dreambooth-Stable-Diffusion_935348b6.png","JoePenna","Joe Penna","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002FJoePenna_fae75b21.jpg","Hosting page for SD Dreambooth repo",null,"MysteryGuitarM","https:\u002F\u002Fgithub.com\u002FJoePenna",[81,85,89],{"name":82,"color":83,"percentage":84},"Jupyter Notebook","#DA5B0B",89.4,{"name":86,"color":87,"percentage":88},"Python","#3572A5",10.6,{"name":90,"color":91,"percentage":92},"Shell","#89e051",0.1,3218,534,"2026-04-08T16:37:16","MIT","Windows, Linux","必需 NVIDIA GPU。推荐显存 24GB (如 RTX 3090, 4090, A5000)；最低支持 24GB VRAM 运行（文中强调需独占资源），云端实例建议至少 24GB。本地运行命令指定 CUDA 11.7 (cu117)。","未说明 (但建议云实例磁盘空间至少 150GB 以容纳模型和正则化图像)",{"notes":101,"python":102,"dependencies":103},"1. 支持通过 RunPod、Vast.ai 云端或本地 (Windows\u002FUbuntu) 运行。2. 训练两个人物主体连续进行较困难，生成的模型文件在修剪前约为 11-12GB，修剪后约 2GB。3. 强烈建议使用 24GB 显存的显卡，若使用此类显卡，需确保其为系统中唯一运行的程序。4. 提供了针对云平台的 Jupyter Notebook 脚本。5. 需要手动准备基础模型文件 (.ckpt) 和正则化图像。","3.10",[104,105],"torch==1.13.1+cu117","torchvision==0.14.1+cu117",[15,13,14],[108,109,110,111,112,113,114,115,116],"ai","txt2img","artificial-intelligence","image-generation","machine-learning","model-training","img2img","latent-diffusion","stable-diffusion","2026-03-27T02:49:30.150509","2026-04-09T21:33:59.907080",[120,125,130,135,140],{"id":121,"question_zh":122,"answer_zh":123,"source_url":124},26763,"训练过程中进程被\"Killed\"或突然终止是什么原因？","这通常由两个主要原因导致：\n1. GPU 未正确配置：如果在 RunPod 等平台启动实例时，默认可能设置为 0 GPU。必须手动选择至少 1x GPU。\n2. 
内存（RAM）不足：更常见的原因是机器的主内存（RAM）耗尽，而非显存（VRAM）。解决方法包括释放内存、增加交换空间（swap space）、升级硬件，或租用拥有更大内存的机器。","https:\u002F\u002Fgithub.com\u002FJoePenna\u002FDreambooth-Stable-Diffusion\u002Fissues\u002F15",{"id":126,"question_zh":127,"answer_zh":128,"source_url":129},26764,"遇到\"RuntimeError: indices should be either on cpu or on the same device as the indexed tensor\"错误如何解决？","这是一个设备匹配错误。可以通过修改代码将索引强制转换到 CPU 来解决。具体做法是将相关代码行修改为：\nlogvar_t = self.logvar[t.cpu()].to(self.device)\n确保在访问 tensor 之前，索引 t 先通过 .cpu() 转移到 CPU 上。","https:\u002F\u002Fgithub.com\u002FJoePenna\u002FDreambooth-Stable-Diffusion\u002Fissues\u002F50",{"id":131,"question_zh":132,"answer_zh":133,"source_url":134},26765,"如何在 RunPod 上解决 Hugging Face 登录时的 Javascript 错误？","如果遇到 Hugging Face 登录单元格的 Javascript 错误（如模块版本不匹配），可以尝试以下替代登录方法：\n不要使用默认的登录单元格，而是运行以下 Python 代码进行登录：\nfrom huggingface_hub import interpreter_login\ninterpreter_login()\n此外，也可以尝试在启动 RunPod 时选择带有 SD 1.5 的版本作为变通方案。","https:\u002F\u002Fgithub.com\u002FJoePenna\u002FDreambooth-Stable-Diffusion\u002Fissues\u002F134",{"id":136,"question_zh":137,"answer_zh":138,"source_url":139},26766,"如何同时训练人脸和全身以避免生成扭曲的脸部或无法更换服装？","Dreambooth 在同时处理人脸和全身时存在局限性。如果仅用面部图片训练，生成全身像时脸部可能扭曲；如果仅用全身像训练，则难以自由改变发型和服装。\n建议的优化方法是：在完成初步训练后，加载生成的检查点（ckpts），使用较小的学习率（learning rate）再额外训练约 500 步，这有助于改善细节一致性，但完全自由地控制服装和保持完美脸型仍受限于模型能力。","https:\u002F\u002Fgithub.com\u002FJoePenna\u002FDreambooth-Stable-Diffusion\u002Fissues\u002F93",{"id":141,"question_zh":142,"answer_zh":143,"source_url":144},26767,"该项目是否支持 Stable Diffusion 2.0 模型？","该仓库目前已不再维护（dead），原生不支持 SD 2.0。如果需要支持 SD 2.0 或其衍生模型（如 depth, inpainting 等），建议转向其他活跃的社区项目，例如 TheLastBen 的 fast-stable-diffusion Colab 笔记本，或者查看 huggingface\u002Fdiffusers 仓库中关于 Dreambooth 的实现尝试。","https:\u002F\u002Fgithub.com\u002FJoePenna\u002FDreambooth-Stable-Diffusion\u002Fissues\u002F112",[]]