[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-guoyww--AnimateDiff":3,"tool-guoyww--AnimateDiff":62},[4,18,26,35,44,53],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":17},4358,"openclaw","openclaw\u002Fopenclaw","OpenClaw 是一款专为个人打造的本地化 AI 助手，旨在让你在自己的设备上拥有完全可控的智能伙伴。它打破了传统 AI 助手局限于特定网页或应用的束缚，能够直接接入你日常使用的各类通讯渠道，包括微信、WhatsApp、Telegram、Discord、iMessage 等数十种平台。无论你在哪个聊天软件中发送消息，OpenClaw 都能即时响应，甚至支持在 macOS、iOS 和 Android 设备上进行语音交互，并提供实时的画布渲染功能供你操控。\n\n这款工具主要解决了用户对数据隐私、响应速度以及“始终在线”体验的需求。通过将 AI 部署在本地，用户无需依赖云端服务即可享受快速、私密的智能辅助，真正实现了“你的数据，你做主”。其独特的技术亮点在于强大的网关架构，将控制平面与核心助手分离，确保跨平台通信的流畅性与扩展性。\n\nOpenClaw 非常适合希望构建个性化工作流的技术爱好者、开发者，以及注重隐私保护且不愿被单一生态绑定的普通用户。只要具备基础的终端操作能力（支持 macOS、Linux 及 Windows WSL2），即可通过简单的命令行引导完成部署。如果你渴望拥有一个懂你",349277,3,"2026-04-06T06:32:30",[13,14,15,16],"Agent","开发框架","图像","数据工具","ready",{"id":19,"name":20,"github_repo":21,"description_zh":22,"stars":23,"difficulty_score":10,"last_commit_at":24,"category_tags":25,"status":17},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,"2026-04-05T11:01:52",[14,15,13],{"id":27,"name":28,"github_repo":29,"description_zh":30,"stars":31,"difficulty_score":32,"last_commit_at":33,"category_tags":34,"status":17},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 
绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",108322,2,"2026-04-10T11:39:34",[14,15,13],{"id":36,"name":37,"github_repo":38,"description_zh":39,"stars":40,"difficulty_score":32,"last_commit_at":41,"category_tags":42,"status":17},6121,"gemini-cli","google-gemini\u002Fgemini-cli","gemini-cli 是一款由谷歌推出的开源 AI 命令行工具，它将强大的 Gemini 大模型能力直接集成到用户的终端环境中。对于习惯在命令行工作的开发者而言，它提供了一条从输入提示词到获取模型响应的最短路径，无需切换窗口即可享受智能辅助。\n\n这款工具主要解决了开发过程中频繁上下文切换的痛点，让用户能在熟悉的终端界面内直接完成代码理解、生成、调试以及自动化运维任务。无论是查询大型代码库、根据草图生成应用，还是执行复杂的 Git 操作，gemini-cli 都能通过自然语言指令高效处理。\n\n它特别适合广大软件工程师、DevOps 人员及技术研究人员使用。其核心亮点包括支持高达 100 万 token 的超长上下文窗口，具备出色的逻辑推理能力；内置 Google 搜索、文件操作及 Shell 命令执行等实用工具；更独特的是，它支持 MCP（模型上下文协议），允许用户灵活扩展自定义集成，连接如图像生成等外部能力。此外，个人谷歌账号即可享受免费的额度支持，且项目基于 Apache 2.0 协议完全开源，是提升终端工作效率的理想助手。",100752,"2026-04-10T01:20:03",[43,13,15,14],"插件",{"id":45,"name":46,"github_repo":47,"description_zh":48,"stars":49,"difficulty_score":10,"last_commit_at":50,"category_tags":51,"status":17},4487,"LLMs-from-scratch","rasbt\u002FLLMs-from-scratch","LLMs-from-scratch 是一个基于 PyTorch 的开源教育项目，旨在引导用户从零开始一步步构建一个类似 ChatGPT 的大型语言模型（LLM）。它不仅是同名技术著作的官方代码库，更提供了一套完整的实践方案，涵盖模型开发、预训练及微调的全过程。\n\n该项目主要解决了大模型领域“黑盒化”的学习痛点。许多开发者虽能调用现成模型，却难以深入理解其内部架构与训练机制。通过亲手编写每一行核心代码，用户能够透彻掌握 Transformer 架构、注意力机制等关键原理，从而真正理解大模型是如何“思考”的。此外，项目还包含了加载大型预训练权重进行微调的代码，帮助用户将理论知识延伸至实际应用。\n\nLLMs-from-scratch 特别适合希望深入底层原理的 AI 开发者、研究人员以及计算机专业的学生。对于不满足于仅使用 
API，而是渴望探究模型构建细节的技术人员而言，这是极佳的学习资源。其独特的技术亮点在于“循序渐进”的教学设计：将复杂的系统工程拆解为清晰的步骤，配合详细的图表与示例，让构建一个虽小但功能完备的大模型变得触手可及。无论你是想夯实理论基础，还是为未来研发更大规模的模型做准备",90106,"2026-04-06T11:19:32",[52,15,13,14],"语言模型",{"id":54,"name":55,"github_repo":56,"description_zh":57,"stars":58,"difficulty_score":10,"last_commit_at":59,"category_tags":60,"status":17},4292,"Deep-Live-Cam","hacksider\u002FDeep-Live-Cam","Deep-Live-Cam 是一款专注于实时换脸与视频生成的开源工具，用户仅需一张静态照片，即可通过“一键操作”实现摄像头画面的即时变脸或制作深度伪造视频。它有效解决了传统换脸技术流程繁琐、对硬件配置要求极高以及难以实时预览的痛点，让高质量的数字内容创作变得触手可及。\n\n这款工具不仅适合开发者和技术研究人员探索算法边界，更因其极简的操作逻辑（仅需三步：选脸、选摄像头、启动），广泛适用于普通用户、内容创作者、设计师及直播主播。无论是为了动画角色定制、服装展示模特替换，还是制作趣味短视频和直播互动，Deep-Live-Cam 都能提供流畅的支持。\n\n其核心技术亮点在于强大的实时处理能力，支持口型遮罩（Mouth Mask）以保留使用者原始的嘴部动作，确保表情自然精准；同时具备“人脸映射”功能，可同时对画面中的多个主体应用不同面孔。此外，项目内置了严格的内容安全过滤机制，自动拦截涉及裸露、暴力等不当素材，并倡导用户在获得授权及明确标注的前提下合规使用，体现了技术发展与伦理责任的平衡。",88924,"2026-04-06T03:28:53",[14,15,13,61],"视频",{"id":63,"github_repo":64,"name":65,"description_en":66,"description_zh":67,"ai_summary_zh":68,"readme_en":69,"readme_zh":70,"quickstart_zh":71,"use_case_zh":72,"hero_image_url":73,"owner_login":74,"owner_name":75,"owner_avatar_url":76,"owner_bio":77,"owner_company":78,"owner_location":79,"owner_email":78,"owner_twitter":78,"owner_website":80,"owner_url":81,"languages":82,"stars":87,"forks":88,"last_commit_at":89,"license":90,"difficulty_score":10,"env_os":91,"env_gpu":92,"env_ram":91,"env_deps":93,"category_tags":104,"github_topics":78,"view_count":32,"oss_zip_url":78,"oss_zip_packed_at":78,"status":17,"created_at":105,"updated_at":106,"faqs":107,"releases":137},6397,"guoyww\u002FAnimateDiff","AnimateDiff","Official implementation of AnimateDiff.","AnimateDiff 是一款创新的开源模块，旨在让现有的文本生成图像模型轻松具备制作动画的能力。它巧妙地将静态图片生成转化为动态视频创作，用户无需对模型进行额外的复杂训练或微调，即可直接利用社区中丰富的个性化模型（如 ToonYou、Realistic Vision 等）生成流畅的视频片段。\n\n这一工具主要解决了传统 AI 视频生成门槛高、需要大量算力重新训练模型的痛点。通过“即插即用”的设计，AnimateDiff 极大地降低了动画制作的成本和技术难度，让创作者能够专注于创意本身，而非繁琐的工程调整。其核心技术亮点在于能够兼容多种主流扩散模型架构（包括 Stable Diffusion V1.5 和 SDXL），并支持通过 MotionLoRA 
等技术精细控制镜头运动，从而在保证画面风格一致性的同时实现自然的动态效果。\n\nAnimateDiff 非常适合各类人群使用：设计师和艺术创作者可以利用它快速将概念图转化为动态演示；研究人员可以将其作为探索视频生成机制的高效基线；而熟悉命令行操作的开发者则能通过简单的脚本配置，灵活集成到自己的工作流中。无论是想尝试 AI 动画的爱好者，还是寻","AnimateDiff 是一款创新的开源模块，旨在让现有的文本生成图像模型轻松具备制作动画的能力。它巧妙地将静态图片生成转化为动态视频创作，用户无需对模型进行额外的复杂训练或微调，即可直接利用社区中丰富的个性化模型（如 ToonYou、Realistic Vision 等）生成流畅的视频片段。\n\n这一工具主要解决了传统 AI 视频生成门槛高、需要大量算力重新训练模型的痛点。通过“即插即用”的设计，AnimateDiff 极大地降低了动画制作的成本和技术难度，让创作者能够专注于创意本身，而非繁琐的工程调整。其核心技术亮点在于能够兼容多种主流扩散模型架构（包括 Stable Diffusion V1.5 和 SDXL），并支持通过 MotionLoRA 等技术精细控制镜头运动，从而在保证画面风格一致性的同时实现自然的动态效果。\n\nAnimateDiff 非常适合各类人群使用：设计师和艺术创作者可以利用它快速将概念图转化为动态演示；研究人员可以将其作为探索视频生成机制的高效基线；而熟悉命令行操作的开发者则能通过简单的脚本配置，灵活集成到自己的工作流中。无论是想尝试 AI 动画的爱好者，还是寻求高效生产力的专业人士，AnimateDiff 都提供了一个强大且易用的解决方案，助力大家轻松开启动态视觉创作之旅。","# AnimateDiff\n\nThis repository is the official implementation of [AnimateDiff](https:\u002F\u002Farxiv.org\u002Fabs\u002F2307.04725) [ICLR2024 Spotlight].\nIt is a plug-and-play module turning most community text-to-image models into animation generators, without the need of additional training.\n\n**[AnimateDiff: Animate Your Personalized Text-to-Image Diffusion Models without Specific Tuning](https:\u002F\u002Farxiv.org\u002Fabs\u002F2307.04725)** \n\u003C\u002Fbr>\n[Yuwei Guo](https:\u002F\u002Fguoyww.github.io\u002F),\n[Ceyuan Yang✝](https:\u002F\u002Fceyuan.me\u002F),\n[Anyi Rao](https:\u002F\u002Fanyirao.com\u002F),\n[Zhengyang Liang](https:\u002F\u002Fmaxleung99.github.io\u002F),\n[Yaohui Wang](https:\u002F\u002Fwyhsirius.github.io\u002F),\n[Yu Qiao](https:\u002F\u002Fscholar.google.com.hk\u002Fcitations?user=gFtI-8QAAAAJ),\n[Maneesh Agrawala](https:\u002F\u002Fgraphics.stanford.edu\u002F~maneesh\u002F),\n[Dahua Lin](http:\u002F\u002Fdahua.site),\n[Bo Dai](https:\u002F\u002Fdaibo.info)\n(✝Corresponding Author)  \n[![arXiv](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2307.04725-b31b1b.svg)](https:\u002F\u002Farxiv.org\u002Fabs\u002F2307.04725)\n[![Project 
Page](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FProject-Website-green)](https:\u002F\u002Fanimatediff.github.io\u002F)\n[![Open in OpenXLab](https:\u002F\u002Fcdn-static.openxlab.org.cn\u002Fapp-center\u002Fopenxlab_app.svg)](https:\u002F\u002Fopenxlab.org.cn\u002Fapps\u002Fdetail\u002FMasbfca\u002FAnimateDiff)\n[![Hugging Face Spaces](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F%F0%9F%A4%97%20Hugging%20Face-Spaces-yellow)](https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fguoyww\u002FAnimateDiff)\n\n***Note:*** The `main` branch is for [Stable Diffusion V1.5](https:\u002F\u002Fhuggingface.co\u002Frunwayml\u002Fstable-diffusion-v1-5); for [Stable Diffusion XL](https:\u002F\u002Fhuggingface.co\u002Fstabilityai\u002Fstable-diffusion-xl-base-1.0), please refer to the `sdxl-beta` branch.\n\n\n## Quick Demos\nMore results can be found in the [Gallery](__assets__\u002Fdocs\u002Fgallery.md).\nSome of them are contributed by the community.\n\n\u003Ctable class=\"center\">\n    \u003Ctr>\n    \u003Ctd>\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_7b1d4ddb2807.gif\">\u003C\u002Ftd>\n    \u003Ctd>\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_a1ab43ed31f9.gif\">\u003C\u002Ftd>\n    \u003Ctd>\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_49f030238598.gif\">\u003C\u002Ftd>\n    \u003Ctd>\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_b4a3d259293b.gif\">\u003C\u002Ftd>\n    \u003C\u002Ftr>\n\u003C\u002Ftable>\n\u003Cp style=\"margin-left: 2em; margin-top: -1em\">Model：\u003Ca href=\"https:\u002F\u002Fcivitai.com\u002Fmodels\u002F30240\u002Ftoonyou\">ToonYou\u003C\u002Fa>\u003C\u002Fp>\n\n\u003Ctable>\n    \u003Ctr>\n    \u003Ctd>\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_45c0638d55b7.gif\">\u003C\u002Ftd>\n    \u003Ctd>\u003Cimg 
src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_27bbf246622f.gif\">\u003C\u002Ftd>\n    \u003Ctd>\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_3c2330e66e73.gif\">\u003C\u002Ftd>\n    \u003Ctd>\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_0ab3171d601c.gif\">\u003C\u002Ftd>\n    \u003C\u002Ftr>\n\u003C\u002Ftable>\n\u003Cp style=\"margin-left: 2em; margin-top: -1em\">Model：\u003Ca href=\"https:\u002F\u002Fcivitai.com\u002Fmodels\u002F4201\u002Frealistic-vision-v20\">Realistic Vision V2.0\u003C\u002Fa>\u003C\u002Fp>\n\n\n## Quick Start\n***Note:*** AnimateDiff is also officially supported by Diffusers.\nVisit [AnimateDiff Diffusers Tutorial](https:\u002F\u002Fhuggingface.co\u002Fdocs\u002Fdiffusers\u002Fapi\u002Fpipelines\u002Fanimatediff) for more details.\n*The following instructions are for working with this repository*.\n\n***Note:*** For all scripts, checkpoint downloading is handled *automatically*, so a script may take longer to run the first time it is executed.\n\n### 1. Set up the repository and environment\n\n```\ngit clone https:\u002F\u002Fgithub.com\u002Fguoyww\u002FAnimateDiff.git\ncd AnimateDiff\n\npip install -r requirements.txt\n```\n\n### 2. 
Launch the sampling script!\nThe generated samples can be found in the `samples\u002F` folder.\n\n#### 2.1 Generate animations with community models\n```\npython -m scripts.animate --config configs\u002Fprompts\u002F1_animate\u002F1_1_animate_RealisticVision.yaml\npython -m scripts.animate --config configs\u002Fprompts\u002F1_animate\u002F1_2_animate_FilmVelvia.yaml\npython -m scripts.animate --config configs\u002Fprompts\u002F1_animate\u002F1_3_animate_ToonYou.yaml\npython -m scripts.animate --config configs\u002Fprompts\u002F1_animate\u002F1_4_animate_MajicMix.yaml\npython -m scripts.animate --config configs\u002Fprompts\u002F1_animate\u002F1_5_animate_RcnzCartoon.yaml\npython -m scripts.animate --config configs\u002Fprompts\u002F1_animate\u002F1_6_animate_Lyriel.yaml\npython -m scripts.animate --config configs\u002Fprompts\u002F1_animate\u002F1_7_animate_Tusun.yaml\n```\n\n#### 2.2 Generate animation with MotionLoRA control\n```\npython -m scripts.animate --config configs\u002Fprompts\u002F2_motionlora\u002F2_motionlora_RealisticVision.yaml\n```\n\n#### 2.3 More control with SparseCtrl RGB and sketch\n```\npython -m scripts.animate --config configs\u002Fprompts\u002F3_sparsectrl\u002F3_1_sparsectrl_i2v.yaml\npython -m scripts.animate --config configs\u002Fprompts\u002F3_sparsectrl\u002F3_2_sparsectrl_rgb_RealisticVision.yaml\npython -m scripts.animate --config configs\u002Fprompts\u002F3_sparsectrl\u002F3_3_sparsectrl_sketch_RealisticVision.yaml\n```\n\n#### 2.4 Gradio app\nWe created a Gradio demo to make AnimateDiff easier to use. 
\nBy default, the demo will run at `localhost:7860`.\n```\npython -u app.py\n```\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_ebd82a29b023.jpg\" style=\"width: 75%\">\n\n\n## Technical Explanation\n\u003Cdetails close>\n\u003Csummary>Technical Explanation\u003C\u002Fsummary>\n\n### AnimateDiff\n\n**AnimateDiff aims to learn transferable motion priors that can be applied to other variants of the Stable Diffusion family.**\nTo this end, we design the following training pipeline consisting of three stages.\n\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_0897bb3aa5ee.png\" style=\"width:100%\">\n\n- In the **1. Alleviate Negative Effects** stage, we train the **domain adapter**, e.g., `v3_sd15_adapter.ckpt`, to fit defective visual artifacts (e.g., watermarks) in the training dataset.\nThis can also benefit the disentangled learning of motion and spatial appearance.\nBy default, the adapter can be removed at inference. It can also be integrated into the model, and its effects can be adjusted by a LoRA scaler.\n\n- In the **2. Learn Motion Priors** stage, we train the **motion module**, e.g., `v3_sd15_mm.ckpt`, to learn real-world motion patterns from videos.\n\n- In the **3. 
(optional) Adapt to New Patterns** stage, we train **MotionLoRA**, e.g., `v2_lora_ZoomIn.ckpt`, to efficiently adapt the motion module to specific motion patterns (camera zooming, rolling, etc.).\n\n### SparseCtrl\n\n**SparseCtrl aims to add more control to text-to-video models by adopting some sparse inputs (e.g., a few RGB images or sketch inputs).**\nIts technical details can be found in the following paper:\n\n**[SparseCtrl: Adding Sparse Controls to Text-to-Video Diffusion Models](https:\u002F\u002Farxiv.org\u002Fabs\u002F2311.16933)**  \n[Yuwei Guo](https:\u002F\u002Fguoyww.github.io\u002F),\n[Ceyuan Yang✝](https:\u002F\u002Fceyuan.me\u002F),\n[Anyi Rao](https:\u002F\u002Fanyirao.com\u002F),\n[Maneesh Agrawala](https:\u002F\u002Fgraphics.stanford.edu\u002F~maneesh\u002F),\n[Dahua Lin](http:\u002F\u002Fdahua.site),\n[Bo Dai](https:\u002F\u002Fdaibo.info)\n(✝Corresponding Author)  \n[![arXiv](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2311.16933-b31b1b.svg)](https:\u002F\u002Farxiv.org\u002Fabs\u002F2311.16933)\n[![Project Page](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FProject-Website-green)](https:\u002F\u002Fguoyww.github.io\u002Fprojects\u002FSparseCtrl\u002F)\n\n\u003C\u002Fdetails>\n\n\n## Model Versions\n\u003Cdetails close>\n\u003Csummary>Model Versions\u003C\u002Fsummary>\n\n### AnimateDiff v3 and SparseCtrl (2023.12)\n\nIn this version, we use **Domain Adapter LoRA** for image model finetuning, which provides more flexibility at inference.\nWe also implement two (RGB image\u002Fscribble) [SparseCtrl](https:\u002F\u002Farxiv.org\u002Fabs\u002F2311.16933) encoders, which can take an arbitrary number of condition maps to control the animation contents.\n\n\u003Cdetails close>\n\u003Csummary>AnimateDiff v3 Model Zoo\u003C\u002Fsummary>\n\n| Name | HuggingFace | Type | Storage | Description |\n| - | - | - | - | - |\n| `v3_sd15_adapter.ckpt` | 
[Link](https:\u002F\u002Fhuggingface.co\u002Fguoyww\u002Fanimatediff\u002Fblob\u002Fmain\u002Fv3_sd15_adapter.ckpt) | Domain Adapter | 97.4 MB | |\n| `v3_sd15_mm.ckpt` | [Link](https:\u002F\u002Fhuggingface.co\u002Fguoyww\u002Fanimatediff\u002Fblob\u002Fmain\u002Fv3_sd15_mm.ckpt) | Motion Module | 1.56 GB | |\n| `v3_sd15_sparsectrl_scribble.ckpt` | [Link](https:\u002F\u002Fhuggingface.co\u002Fguoyww\u002Fanimatediff\u002Fblob\u002Fmain\u002Fv3_sd15_sparsectrl_scribble.ckpt) | SparseCtrl Encoder | 1.86 GB | scribble condition |\n| `v3_sd15_sparsectrl_rgb.ckpt` | [Link](https:\u002F\u002Fhuggingface.co\u002Fguoyww\u002Fanimatediff\u002Fblob\u002Fmain\u002Fv3_sd15_sparsectrl_rgb.ckpt) | SparseCtrl Encoder | 1.85 GB | RGB image condition |\n\u003C\u002Fdetails>\n\n#### Limitations\n1. Slight flickering is noticeable;\n2. To stay compatible with community models, there are no specific optimizations for general T2V, leading to limited visual quality under this setting;\n3. **(Style Alignment) For usage such as image animation\u002Finterpolation, it's recommended to use images generated by the same community model.**\n\n#### Demos\n\u003Ctable class=\"center\">\n    \u003Ctr style=\"line-height: 0\">\n    \u003Ctd width=25% style=\"border: none; text-align: center\">Input (by RealisticVision)\u003C\u002Ftd>\n    \u003Ctd width=25% style=\"border: none; text-align: center\">Animation\u003C\u002Ftd>\n    \u003Ctd width=25% style=\"border: none; text-align: center\">Input\u003C\u002Ftd>\n    \u003Ctd width=25% style=\"border: none; text-align: center\">Animation\u003C\u002Ftd>\n    \u003C\u002Ftr>\n    \u003Ctr>\n    \u003Ctd width=25% style=\"border: none\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_b33e188c339b.png\" style=\"width:100%\">\u003C\u002Ftd>\n    \u003Ctd width=25% style=\"border: none\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_f97e730b531d.gif\" 
style=\"width:100%\">\u003C\u002Ftd>\n    \u003Ctd width=25% style=\"border: none\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_162d5cb96ecb.png\" style=\"width:100%\">\u003C\u002Ftd>\n    \u003Ctd width=25% style=\"border: none\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_5d94a644a4e0.gif\" style=\"width:100%\">\u003C\u002Ftd>\n    \u003C\u002Ftr>\n\u003C\u002Ftable>\n\n\u003Ctable class=\"center\">\n    \u003Ctr style=\"line-height: 0\">\n    \u003Ctd width=25% style=\"border: none; text-align: center\">Input Scribble\u003C\u002Ftd>\n    \u003Ctd width=25% style=\"border: none; text-align: center\">Output\u003C\u002Ftd>\n    \u003Ctd width=25% style=\"border: none; text-align: center\">Input Scribbles\u003C\u002Ftd>\n    \u003Ctd width=25% style=\"border: none; text-align: center\">Output\u003C\u002Ftd>\n    \u003C\u002Ftr>\n    \u003Ctr>\n      \u003Ctd width=25% style=\"border: none\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_34f58fa231fa.png\" style=\"width:100%\">\u003C\u002Ftd>\n      \u003Ctd width=25% style=\"border: none\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_2794b05a58f9.gif\" style=\"width:100%\">\u003C\u002Ftd>\n      \u003Ctd width=25% style=\"border: none\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_2f412112cd0c.png\" style=\"width:100%\">\u003C\u002Ftd>\n      \u003Ctd width=25% style=\"border: none\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_aa19862f25c3.gif\" style=\"width:100%\">\u003C\u002Ftd>\n    \u003C\u002Ftr>\n\u003C\u002Ftable>\n\n\n### AnimateDiff SDXL-Beta (2023.11)\n\nRelease the Motion Module (beta version) on SDXL, available at [Google 
Drive](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1EK_D9hDOPfJdK4z8YDB8JYvPracNx2SX\u002Fview?usp=share_link\n) \u002F [HuggingFace](https:\u002F\u002Fhuggingface.co\u002Fguoyww\u002Fanimatediff\u002Fblob\u002Fmain\u002Fmm_sdxl_v10_beta.ckpt\n) \u002F [CivitAI](https:\u002F\u002Fcivitai.com\u002Fmodels\u002F108836\u002Fanimatediff-motion-modules). High-resolution videos (i.e., 1024x1024x16 frames with various aspect ratios) can be produced **with\u002Fwithout** personalized models. Inference usually requires ~13GB VRAM and tuned hyperparameters (e.g., sampling steps), depending on the chosen personalized models.  \nCheck out the [sdxl](https:\u002F\u002Fgithub.com\u002Fguoyww\u002FAnimateDiff\u002Ftree\u002Fsdxl) branch for more details on inference.\n\n\u003Cdetails close>\n\u003Csummary>AnimateDiff SDXL-Beta Model Zoo\u003C\u002Fsummary>\n\n| Name | HuggingFace | Type | Storage Space |\n| - | - | - | - |\n| `mm_sdxl_v10_beta.ckpt` | [Link](https:\u002F\u002Fhuggingface.co\u002Fguoyww\u002Fanimatediff\u002Fblob\u002Fmain\u002Fmm_sdxl_v10_beta.ckpt) | Motion Module | 950 MB |\n\u003C\u002Fdetails>\n\n#### Demos\n\u003Ctable class=\"center\">\n    \u003Ctr style=\"line-height: 0\">\n    \u003Ctd width=52% style=\"border: none; text-align: center\">Original SDXL\u003C\u002Ftd>\n    \u003Ctd width=30% style=\"border: none; text-align: center\">Community SDXL\u003C\u002Ftd>\n    \u003Ctd width=18% style=\"border: none; text-align: center\">Community SDXL\u003C\u002Ftd>\n    \u003C\u002Ftr>\n    \u003Ctr>\n    \u003Ctd width=52% style=\"border: none\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_3836d18e29d6.gif\" style=\"width:100%\">\u003C\u002Ftd>\n    \u003Ctd width=30% style=\"border: none\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_12907e4011a5.gif\" style=\"width:100%\">\u003C\u002Ftd>\n    \u003Ctd width=18% style=\"border: 
none\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_af795ab41ce2.gif\" style=\"width:100%\">\u003C\u002Ftd>\n    \u003C\u002Ftr>\n\u003C\u002Ftable>\n\n\n### AnimateDiff v2 (2023.09)\n\nIn this version, the motion module `mm_sd_v15_v2.ckpt` ([Google Drive](https:\u002F\u002Fdrive.google.com\u002Fdrive\u002Ffolders\u002F1EqLC65eR1-W-sGD0Im7fkED6c8GkiNFI?usp=sharing) \u002F [HuggingFace](https:\u002F\u002Fhuggingface.co\u002Fguoyww\u002Fanimatediff) \u002F [CivitAI](https:\u002F\u002Fcivitai.com\u002Fmodels\u002F108836\u002Fanimatediff-motion-modules)) is trained upon larger resolution and batch size.\nWe found that the scale-up training significantly helps improve the motion quality and diversity.  \nWe also support **MotionLoRA** of eight basic camera movements.\nMotionLoRA checkpoints take up only **77 MB storage per model**, and are available at [Google Drive](https:\u002F\u002Fdrive.google.com\u002Fdrive\u002Ffolders\u002F1EqLC65eR1-W-sGD0Im7fkED6c8GkiNFI?usp=sharing) \u002F [HuggingFace](https:\u002F\u002Fhuggingface.co\u002Fguoyww\u002Fanimatediff) \u002F [CivitAI](https:\u002F\u002Fcivitai.com\u002Fmodels\u002F108836\u002Fanimatediff-motion-modules).\n\n\u003Cdetails close>\n\u003Csummary>AnimateDiff v2 Model Zoo\u003C\u002Fsummary>\n\n| Name | HuggingFace | Type | Parameter | Storage |\n| - | - | - | - | - |\n| `mm_sd_v15_v2.ckpt` | [Link](https:\u002F\u002Fhuggingface.co\u002Fguoyww\u002Fanimatediff\u002Fblob\u002Fmain\u002Fmm_sd_v15_v2.ckpt) | Motion Module | 453 M | 1.7 GB |\n| `v2_lora_ZoomIn.ckpt` | [Link](https:\u002F\u002Fhuggingface.co\u002Fguoyww\u002Fanimatediff\u002Fblob\u002Fmain\u002Fv2_lora_ZoomIn.ckpt) | MotionLoRA | 19 M | 74 MB |\n| `v2_lora_ZoomOut.ckpt` | [Link](https:\u002F\u002Fhuggingface.co\u002Fguoyww\u002Fanimatediff\u002Fblob\u002Fmain\u002Fv2_lora_ZoomOut.ckpt) | MotionLoRA | 19 M | 74 MB |\n| `v2_lora_PanLeft.ckpt` | 
[Link](https:\u002F\u002Fhuggingface.co\u002Fguoyww\u002Fanimatediff\u002Fblob\u002Fmain\u002Fv2_lora_PanLeft.ckpt) | MotionLoRA | 19 M | 74 MB |\n| `v2_lora_PanRight.ckpt` | [Link](https:\u002F\u002Fhuggingface.co\u002Fguoyww\u002Fanimatediff\u002Fblob\u002Fmain\u002Fv2_lora_PanRight.ckpt) | MotionLoRA | 19 M | 74 MB |\n| `v2_lora_TiltUp.ckpt` | [Link](https:\u002F\u002Fhuggingface.co\u002Fguoyww\u002Fanimatediff\u002Fblob\u002Fmain\u002Fv2_lora_TiltUp.ckpt) | MotionLoRA | 19 M | 74 MB |\n| `v2_lora_TiltDown.ckpt` | [Link](https:\u002F\u002Fhuggingface.co\u002Fguoyww\u002Fanimatediff\u002Fblob\u002Fmain\u002Fv2_lora_TiltDown.ckpt) | MotionLoRA | 19 M | 74 MB |\n| `v2_lora_RollingClockwise.ckpt` | [Link](https:\u002F\u002Fhuggingface.co\u002Fguoyww\u002Fanimatediff\u002Fblob\u002Fmain\u002Fv2_lora_RollingClockwise.ckpt) | MotionLoRA | 19 M | 74 MB |\n| `v2_lora_RollingAnticlockwise.ckpt` | [Link](https:\u002F\u002Fhuggingface.co\u002Fguoyww\u002Fanimatediff\u002Fblob\u002Fmain\u002Fv2_lora_RollingAnticlockwise.ckpt) | MotionLoRA | 19 M | 74 MB |\n\u003C\u002Fdetails>\n\n\n#### Demos (MotionLoRA)\n\u003Ctable class=\"center\">\n  \u003Ctr style=\"line-height: 0\">\n    \u003Ctd colspan=\"2\" style=\"border: none; text-align: center\">Zoom In\u003C\u002Ftd>\n    \u003Ctd colspan=\"2\" style=\"border: none; text-align: center\">Zoom Out\u003C\u002Ftd>\n    \u003Ctd colspan=\"2\" style=\"border: none; text-align: center\">Zoom Pan Left\u003C\u002Ftd>\n    \u003Ctd colspan=\"2\" style=\"border: none; text-align: center\">Zoom Pan Right\u003C\u002Ftd>\n  \u003C\u002Ftr>\n  \u003Ctr>\n    \u003Ctd style=\"border: none\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_d3b5cf28a738.gif\">\u003C\u002Ftd>\n    \u003Ctd style=\"border: none\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_2a2f1b9b78b9.gif\">\u003C\u002Ftd>\n    \u003Ctd style=\"border: none\">\u003Cimg 
src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_83bf4eece884.gif\">\u003C\u002Ftd>\n    \u003Ctd style=\"border: none\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_bd9d648762b6.gif\">\u003C\u002Ftd>\n    \u003Ctd style=\"border: none\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_e4a2e4e95a65.gif\">\u003C\u002Ftd>\n    \u003Ctd style=\"border: none\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_cc775ad7ea1b.gif\">\u003C\u002Ftd>\n    \u003Ctd style=\"border: none\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_e2d0a392e509.gif\">\u003C\u002Ftd>\n    \u003Ctd style=\"border: none\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_125a2708b833.gif\">\u003C\u002Ftd>\n  \u003C\u002Ftr>\n  \u003Ctr style=\"line-height: 0\">\n    \u003Ctd colspan=\"2\" style=\"border: none; text-align: center\">Tilt Up\u003C\u002Ftd>\n    \u003Ctd colspan=\"2\" style=\"border: none; text-align: center\">Tilt Down\u003C\u002Ftd>\n    \u003Ctd colspan=\"2\" style=\"border: none; text-align: center\">Rolling Anti-Clockwise\u003C\u002Ftd>\n    \u003Ctd colspan=\"2\" style=\"border: none; text-align: center\">Rolling Clockwise\u003C\u002Ftd>\n  \u003C\u002Ftr>\n  \u003Ctr>\n    \u003Ctd style=\"border: none\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_85b8dbe350df.gif\">\u003C\u002Ftd>\n    \u003Ctd style=\"border: none\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_6a6a024ed0f0.gif\">\u003C\u002Ftd>\n    \u003Ctd style=\"border: none\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_ffb8d4b42af2.gif\">\u003C\u002Ftd>\n    \u003Ctd style=\"border: 
none\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_0eefc409d05d.gif\">\u003C\u002Ftd>\n    \u003Ctd style=\"border: none\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_1faaff305cb2.gif\">\u003C\u002Ftd>\n    \u003Ctd style=\"border: none\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_ff4c7aeb49b0.gif\">\u003C\u002Ftd>\n    \u003Ctd style=\"border: none\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_c1feae0692ba.gif\">\u003C\u002Ftd>\n    \u003Ctd style=\"border: none\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_704087580b80.gif\">\u003C\u002Ftd>\n  \u003C\u002Ftr>\n\u003C\u002Ftable>\n\n\n#### Demos (Improved Motions)\nHere's a comparison between `mm_sd_v15.ckpt` (left) and improved `mm_sd_v15_v2.ckpt` (right).\n\n\u003Ctable class=\"center\">\n  \u003Ctr>\n    \u003Ctd>\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_92fa32531c54.gif\">\u003C\u002Ftd>\n    \u003Ctd>\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_47947bcb12c8.gif\">\u003C\u002Ftd>\n    \u003Ctd>\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_04f96a486271.gif\">\u003C\u002Ftd>\n    \u003Ctd>\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_933bd611c7d1.gif\">\u003C\u002Ftd>\n    \u003Ctd>\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_ea5661852519.gif\">\u003C\u002Ftd>\n    \u003Ctd>\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_f3ceb5e36464.gif\">\u003C\u002Ftd>\n    \u003Ctd>\u003Cimg 
src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_d27149973451.gif\">\u003C\u002Ftd>\n    \u003Ctd>\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_df03b2d9e362.gif\">\u003C\u002Ftd>\n  \u003C\u002Ftr>\n\u003C\u002Ftable>\n\n\n### AnimateDiff v1 (2023.07)\n\nThe first version of AnimateDiff!\n\n\u003Cdetails close>\n\u003Csummary>AnimateDiff v1 Model Zoo\u003C\u002Fsummary>\n\n| Name | HuggingFace | Parameter | Storage Space |\n| - | - | - | - |\n| mm_sd_v14.ckpt | [Link](https:\u002F\u002Fhuggingface.co\u002Fguoyww\u002Fanimatediff\u002Fblob\u002Fmain\u002Fmm_sd_v14.ckpt) | 417 M | 1.6 GB |\n| mm_sd_v15.ckpt | [Link](https:\u002F\u002Fhuggingface.co\u002Fguoyww\u002Fanimatediff\u002Fblob\u002Fmain\u002Fmm_sd_v15.ckpt) | 417 M | 1.6 GB |\n\u003C\u002Fdetails>\n\n\u003C\u002Fdetails>\n\n\n## Training\nPlease check [Steps for Training](__assets__\u002Fdocs\u002Fanimatediff.md) for details.\n\n\n## Related Resources\n\nAnimateDiff for Stable Diffusion WebUI: [sd-webui-animatediff](https:\u002F\u002Fgithub.com\u002Fcontinue-revolution\u002Fsd-webui-animatediff) (by [@continue-revolution](https:\u002F\u002Fgithub.com\u002Fcontinue-revolution))  \nAnimateDiff for ComfyUI: [ComfyUI-AnimateDiff-Evolved](https:\u002F\u002Fgithub.com\u002FKosinkadink\u002FComfyUI-AnimateDiff-Evolved) (by [@Kosinkadink](https:\u002F\u002Fgithub.com\u002FKosinkadink))  \nGoogle Colab: [Colab](https:\u002F\u002Fcolab.research.google.com\u002Fgithub\u002Fcamenduru\u002FAnimateDiff-colab\u002Fblob\u002Fmain\u002FAnimateDiff_colab.ipynb) (by [@camenduru](https:\u002F\u002Fgithub.com\u002Fcamenduru))\n\n\n## Disclaimer\nThis project is released for academic use.\nWe disclaim responsibility for user-generated content.\nAlso, please be advised that our only official websites are https:\u002F\u002Fgithub.com\u002Fguoyww\u002FAnimateDiff and https:\u002F\u002Fanimatediff.github.io, and all the other websites are NOT 
associated with us at AnimateDiff. \n\n\n## Contact Us\nYuwei Guo: [guoyw@ie.cuhk.edu.hk](mailto:guoyw@ie.cuhk.edu.hk)  \nCeyuan Yang: [limbo0066@gmail.com](mailto:limbo0066@gmail.com)  \nBo Dai: [doubledaibo@gmail.com](mailto:doubledaibo@gmail.com)\n\n\n## BibTeX\n```\n@article{guo2023animatediff,\n  title={AnimateDiff: Animate Your Personalized Text-to-Image Diffusion Models without Specific Tuning},\n  author={Guo, Yuwei and Yang, Ceyuan and Rao, Anyi and Liang, Zhengyang and Wang, Yaohui and Qiao, Yu and Agrawala, Maneesh and Lin, Dahua and Dai, Bo},\n  journal={International Conference on Learning Representations},\n  year={2024}\n}\n\n@article{guo2023sparsectrl,\n  title={SparseCtrl: Adding Sparse Controls to Text-to-Video Diffusion Models},\n  author={Guo, Yuwei and Yang, Ceyuan and Rao, Anyi and Agrawala, Maneesh and Lin, Dahua and Dai, Bo},\n  journal={arXiv preprint arXiv:2311.16933},\n  year={2023}\n}\n```\n\n\n## Acknowledgements\nCodebase built upon [Tune-a-Video](https:\u002F\u002Fgithub.com\u002Fshowlab\u002FTune-A-Video).\n","# AnimateDiff\n\n本仓库是 [AnimateDiff](https:\u002F\u002Farxiv.org\u002Fabs\u002F2307.04725) [ICLR2024 Spotlight] 的官方实现。它是一个即插即用的模块，可以将大多数社区提供的文生图模型转化为动画生成器，而无需额外训练。\n\n**[AnimateDiff：无需特定微调即可为您的个性化文生图扩散模型添加动画效果](https:\u002F\u002Farxiv.org\u002Fabs\u002F2307.04725)**  \n\u003C\u002Fbr>\n[Yuwei Guo](https:\u002F\u002Fguoyww.github.io\u002F)、\n[Ceyuan Yang✝](https:\u002F\u002Fceyuan.me\u002F)、\n[Anyi Rao](https:\u002F\u002Fanyirao.com\u002F)、\n[Zhengyang Liang](https:\u002F\u002Fmaxleung99.github.io\u002F)、\n[Yaohui Wang](https:\u002F\u002Fwyhsirius.github.io\u002F)、\n[Yu Qiao](https:\u002F\u002Fscholar.google.com.hk\u002Fcitations?user=gFtI-8QAAAAJ)、\n[Maneesh Agrawala](https:\u002F\u002Fgraphics.stanford.edu\u002F~maneesh\u002F)、\n[Dahua Lin](http:\u002F\u002Fdahua.site)、\n[Bo Dai](https:\u002F\u002Fdaibo.info)  \n(✝通讯作者)  
\n[![arXiv](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2307.04725-b31b1b.svg)](https:\u002F\u002Farxiv.org\u002Fabs\u002F2307.04725)\n[![项目主页](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FProject-Website-green)](https:\u002F\u002Fanimatediff.github.io\u002F)\n[![在 OpenXLab 中打开](https:\u002F\u002Fcdn-static.openxlab.org.cn\u002Fapp-center\u002Fopenxlab_app.svg)](https:\u002F\u002Fopenxlab.org.cn\u002Fapps\u002Fdetail\u002FMasbfca\u002FAnimateDiff)\n[![Hugging Face Spaces](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F%F0%9F%A4%97%20Hugging%20Face-Spaces-yellow)](https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fguoyww\u002FAnimateDiff)\n\n***注意：*** `main` 分支适用于 [Stable Diffusion V1.5](https:\u002F\u002Fhuggingface.co\u002Frunwayml\u002Fstable-diffusion-v1-5)；对于 [Stable Diffusion XL](https:\u002F\u002Fhuggingface.co\u002Fstabilityai\u002Fstable-diffusion-xl-base-1.0)，请参考 `sdxl-beta` 分支。\n\n\n## 快速演示\n更多结果可在 [Gallery](__assets__\u002Fdocs\u002Fgallery.md) 中找到。其中部分由社区贡献。\n\n\u003Ctable class=\"center\">\n    \u003Ctr>\n    \u003Ctd>\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_7b1d4ddb2807.gif\">\u003C\u002Ftd>\n    \u003Ctd>\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_a1ab43ed31f9.gif\">\u003C\u002Ftd>\n    \u003Ctd>\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_49f030238598.gif\">\u003C\u002Ftd>\n    \u003Ctd>\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_b4a3d259293b.gif\">\u003C\u002Ftd>\n    \u003C\u002Ftr>\n\u003C\u002Ftable>\n\u003Cp style=\"margin-left: 2em; margin-top: -1em\">模型：\u003Ca href=\"https:\u002F\u002Fcivitai.com\u002Fmodels\u002F30240\u002Ftoonyou\">ToonYou\u003C\u002Fa>\u003C\u002Fp>\n\n\u003Ctable>\n    \u003Ctr>\n    \u003Ctd>\u003Cimg 
src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_45c0638d55b7.gif\">\u003C\u002Ftd>\n    \u003Ctd>\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_27bbf246622f.gif\">\u003C\u002Ftd>\n    \u003Ctd>\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_3c2330e66e73.gif\">\u003C\u002Ftd>\n    \u003Ctd>\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_0ab3171d601c.gif\">\u003C\u002Ftd>\n    \u003C\u002Ftr>\n\u003C\u002Ftable>\n\u003Cp style=\"margin-left: 2em; margin-top: -1em\">模型：\u003Ca href=\"https:\u002F\u002Fcivitai.com\u002Fmodels\u002F4201\u002Frealistic-vision-v20\">Realistic Vision V2.0\u003C\u002Fa>\u003C\u002Fp>\n\n\n## 快速入门\n***注意：*** AnimateDiff 也得到了 Diffusers 的官方支持。访问 [AnimateDiff Diffusers 教程](https:\u002F\u002Fhuggingface.co\u002Fdocs\u002Fdiffusers\u002Fapi\u002Fpipelines\u002Fanimatediff) 以获取更多详情。*以下说明适用于本仓库的操作*。\n\n***注意：*** 对于所有脚本，检查点下载将被 *自动* 处理，因此首次运行时可能需要更长时间。\n\n### 1. 设置仓库和环境\n\n```\ngit clone https:\u002F\u002Fgithub.com\u002Fguoyww\u002FAnimateDiff.git\ncd AnimateDiff\n\npip install -r requirements.txt\n```\n\n### 2. 
启动采样脚本！\n生成的样本可以在 `samples\u002F` 文件夹中找到。\n\n#### 2.1 使用社区模型生成动画\n```\npython -m scripts.animate --config configs\u002Fprompts\u002F1_animate\u002F1_1_animate_RealisticVision.yaml\npython -m scripts.animate --config configs\u002Fprompts\u002F1_animate\u002F1_2_animate_FilmVelvia.yaml\npython -m scripts.animate --config configs\u002Fprompts\u002F1_animate\u002F1_3_animate_ToonYou.yaml\npython -m scripts.animate --config configs\u002Fprompts\u002F1_animate\u002F1_4_animate_MajicMix.yaml\npython -m scripts.animate --config configs\u002Fprompts\u002F1_animate\u002F1_5_animate_RcnzCartoon.yaml\npython -m scripts.animate --config configs\u002Fprompts\u002F1_animate\u002F1_6_animate_Lyriel.yaml\npython -m scripts.animate --config configs\u002Fprompts\u002F1_animate\u002F1_7_animate_Tusun.yaml\n```\n\n#### 2.2 使用 MotionLoRA 控制生成动画\n```\npython -m scripts.animate --config configs\u002Fprompts\u002F2_motionlora\u002F2_motionlora_RealisticVision.yaml\n```\n\n#### 2.3 通过 SparseCtrl RGB 和草图获得更多控制\n```\npython -m scripts.animate --config configs\u002Fprompts\u002F3_sparsectrl\u002F3_1_sparsectrl_i2v.yaml\npython -m scripts.animate --config configs\u002Fprompts\u002F3_sparsectrl\u002F3_2_sparsectrl_rgb_RealisticVision.yaml\npython -m scripts.animate --config configs\u002Fprompts\u002F3_sparsectrl\u002F3_3_sparsectrl_sketch_RealisticVision.yaml\n```\n\n#### 2.4 Gradio 应用程序\n我们创建了一个 Gradio 演示，使 AnimateDiff 更易于使用。默认情况下，该演示将在 `localhost:7860` 上运行。\n```\npython -u app.py\n```\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_ebd82a29b023.jpg\" style=\"width: 75%\">\n\n\n## 技术说明\n\u003Cdetails close>\n\u003Csummary>技术说明\u003C\u002Fsummary>\n\n### AnimateDiff\n\n**AnimateDiff 的目标是学习可迁移的运动先验，这些先验可以应用于 Stable Diffusion 系列的其他变体。**为此，我们设计了由三个阶段组成的训练流程。\n\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_0897bb3aa5ee.png\" style=\"width:100%\">\n\n- 在 **1. 
减轻负面影响** 阶段，我们训练 **领域适配器**，例如 `v3_sd15_adapter.ckpt`，以适应训练数据集中存在的视觉瑕疵（如水印）。这也有助于分离运动与空间外观的学习。默认情况下，适配器可以在推理时移除。它也可以集成到模型中，并通过 LoRA 缩放器调整其效果。\n\n- 在 **2. 学习运动先验** 阶段，我们训练 **运动模块**，例如 `v3_sd15_mm.ckpt`，以从视频中学习真实世界的运动模式。\n\n- 在 **3. （可选）适应新模式** 阶段，我们训练 **MotionLoRA**，例如 `v2_lora_ZoomIn.ckpt`，以高效地使运动模块适应特定的运动模式（如镜头拉近、滚动等）。\n\n### SparseCtrl\n\n**SparseCtrl 的目标是通过采用一些稀疏输入（如少量 RGB 图像或草图输入）来为文生视频模型增加更多控制。** 其技术细节可在以下论文中找到：\n\n**[SparseCtrl：为文生视频扩散模型添加稀疏控制](https:\u002F\u002Farxiv.org\u002Fabs\u002F2311.16933)**  \n[Yuwei Guo](https:\u002F\u002Fguoyww.github.io\u002F)、\n[Ceyuan Yang✝](https:\u002F\u002Fceyuan.me\u002F)、\n[Anyi Rao](https:\u002F\u002Fanyirao.com\u002F)、\n[Maneesh Agrawala](https:\u002F\u002Fgraphics.stanford.edu\u002F~maneesh\u002F)、\n[Dahua Lin](http:\u002F\u002Fdahua.site)、\n[Bo Dai](https:\u002F\u002Fdaibo.info)  \n(✝通讯作者)  \n[![arXiv](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2311.16933-b31b1b.svg)](https:\u002F\u002Farxiv.org\u002Fabs\u002F2311.16933)\n[![项目主页](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FProject-Website-green)](https:\u002F\u002Fguoyww.github.io\u002Fprojects\u002FSparseCtrl\u002F)\n\n\u003C\u002Fdetails>\n\n\n## 模型版本\n\u003Cdetails close>\n\u003Csummary>模型版本\u003C\u002Fsummary>\n\n### AnimateDiff v3 和 SparseCtrl（2023.12）\n\n在这一版本中，我们使用**领域适配器 LoRA** 对图像模型进行微调，这在推理时提供了更高的灵活性。我们还实现了两种（RGB 图像\u002F涂鸦）[SparseCtrl](https:\u002F\u002Farxiv.org\u002Fabs\u002F2311.16933) 编码器，它们可以接受任意数量的条件图来控制动画的内容。\n\n\u003Cdetails close>\n\u003Csummary>AnimateDiff v3 模型库\u003C\u002Fsummary>\n\n| 名称 | HuggingFace | 类型 | 存储空间 | 描述 |\n| - | - | - | - | - |\n| `v3_sd15_adapter.ckpt` | [链接](https:\u002F\u002Fhuggingface.co\u002Fguoyww\u002Fanimatediff\u002Fblob\u002Fmain\u002Fv3_sd15_adapter.ckpt) | 领域适配器 | 97.4 MB | |\n| `v3_sd15_mm.ckpt` | [链接](https:\u002F\u002Fhuggingface.co\u002Fguoyww\u002Fanimatediff\u002Fblob\u002Fmain\u002Fv3_sd15_mm.ckpt) | 运动模块 | 1.56 GB | |\n| `v3_sd15_sparsectrl_scribble.ckpt` | 
[链接](https:\u002F\u002Fhuggingface.co\u002Fguoyww\u002Fanimatediff\u002Fblob\u002Fmain\u002Fv3_sd15_sparsectrl_scribble.ckpt) | SparseCtrl 编码器 | 1.86 GB | 涂鸦条件 |\n| `v3_sd15_sparsectrl_rgb.ckpt` | [链接](https:\u002F\u002Fhuggingface.co\u002Fguoyww\u002Fanimatediff\u002Fblob\u002Fmain\u002Fv3_sd15_sparsectrl_rgb.ckpt) | SparseCtrl 编码器 | 1.85 GB | RGB 图像条件 |\n\u003C\u002Fdetails>\n\n#### 局限性\n1. 细小的抖动较为明显；\n2. 为了保持与社区模型的兼容性，针对通用的 T2V 并没有进行专门优化，因此在这种设置下的视觉质量有限；\n3. **（风格对齐）对于图像动画\u002F插值等用途，建议使用由同一社区模型生成的图像。**\n\n#### 示例\n\u003Ctable class=\"center\">\n    \u003Ctr style=\"line-height: 0\">\n    \u003Ctd width=25% style=\"border: none; text-align: center\">输入（由 RealisticVision 提供）\u003C\u002Ftd>\n    \u003Ctd width=25% style=\"border: none; text-align: center\">动画\u003C\u002Ftd>\n    \u003Ctd width=25% style=\"border: none; text-align: center\">输入\u003C\u002Ftd>\n    \u003Ctd width=25% style=\"border: none; text-align: center\">动画\u003C\u002Ftd>\n    \u003C\u002Ftr>\n    \u003Ctr>\n    \u003Ctd width=25% style=\"border: none\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_b33e188c339b.png\" style=\"width:100%\">\u003C\u002Ftd>\n    \u003Ctd width=25% style=\"border: none\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_f97e730b531d.gif\" style=\"width:100%\">\u003C\u002Ftd>\n    \u003Ctd width=25% style=\"border: none\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_162d5cb96ecb.png\" style=\"width:100%\">\u003C\u002Ftd>\n    \u003Ctd width=25% style=\"border: none\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_5d94a644a4e0.gif\" style=\"width:100%\">\u003C\u002Ftd>\n    \u003C\u002Ftr>\n\u003C\u002Ftable>\n\n\u003Ctable class=\"center\">\n    \u003Ctr style=\"line-height: 0\">\n    \u003Ctd width=25% style=\"border: none; text-align: center\">输入涂鸦\u003C\u002Ftd>\n    
\u003Ctd width=25% style=\"border: none; text-align: center\">输出\u003C\u002Ftd>\n    \u003Ctd width=25% style=\"border: none; text-align: center\">输入涂鸦\u003C\u002Ftd>\n    \u003Ctd width=25% style=\"border: none; text-align: center\">输出\u003C\u002Ftd>\n    \u003C\u002Ftr>\n    \u003Ctr>\n      \u003Ctd width=25% style=\"border: none\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_34f58fa231fa.png\" style=\"width:100%\">\u003C\u002Ftd>\n      \u003Ctd width=25% style=\"border: none\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_2794b05a58f9.gif\" style=\"width:100%\">\u003C\u002Ftd>\n      \u003Ctd width=25% style=\"border: none\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_2f412112cd0c.png\" style=\"width:100%\">\u003C\u002Ftd>\n      \u003Ctd width=25% style=\"border: none\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_aa19862f25c3.gif\" style=\"width:100%\">\u003C\u002Ftd>\n    \u003C\u002Ftr>\n\u003C\u002Ftable>\n\n\n### AnimateDiff SDXL-Beta（2023.11）\n\n我们在 SDXL 上发布了运动模块（beta 版本），可通过以下途径获取：[Google Drive](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1EK_D9hDOPfJdK4z8YDB8JYvPracNx2SX\u002Fview?usp=share_link) \u002F [HuggingFace](https:\u002F\u002Fhuggingface.co\u002Fguoyww\u002Fanimatediff\u002Fblob\u002Fmain\u002Fmm_sdxl_v10_beta.ckpt) \u002F [CivitAI](https:\u002F\u002Fcivitai.com\u002Fmodels\u002F108836\u002Fanimatediff-motion-modules)。无论是**有无**个性化模型，都可以生成高分辨率视频（例如 1024×1024、16 帧，支持多种长宽比）。推理通常需要约 13GB 显存，并且需要调整超参数（如采样步数），具体取决于所选的个性化模型。更多关于推理的细节，请查看 [sdxl](https:\u002F\u002Fgithub.com\u002Fguoyww\u002FAnimateDiff\u002Ftree\u002Fsdxl) 分支。\n\n\u003Cdetails close>\n\u003Csummary>AnimateDiff SDXL-Beta 模型库\u003C\u002Fsummary>\n\n| 名称 | HuggingFace | 类型 | 存储空间 |\n| - | - | - | - |\n| `mm_sdxl_v10_beta.ckpt` | 
[链接](https:\u002F\u002Fhuggingface.co\u002Fguoyww\u002Fanimatediff\u002Fblob\u002Fmain\u002Fmm_sdxl_v10_beta.ckpt) | 运动模块 | 950 MB |\n\u003C\u002Fdetails>\n\n#### 示例\n\u003Ctable class=\"center\">\n    \u003Ctr style=\"line-height: 0\">\n    \u003Ctd width=52% style=\"border: none; text-align: center\">原始 SDXL\u003C\u002Ftd>\n    \u003Ctd width=30% style=\"border: none; text-align: center\">社区 SDXL\u003C\u002Ftd>\n    \u003Ctd width=18% style=\"border: none; text-align: center\">社区 SDXL\u003C\u002Ftd>\n    \u003C\u002Ftr>\n    \u003Ctr>\n    \u003Ctd width=52% style=\"border: none\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_3836d18e29d6.gif\" style=\"width:100%\">\u003C\u002Ftd>\n    \u003Ctd width=30% style=\"border: none\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_12907e4011a5.gif\" style=\"width:100%\">\u003C\u002Ftd>\n    \u003Ctd width=18% style=\"border: none\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_af795ab41ce2.gif\" style=\"width:100%\">\u003C\u002Ftd>\n    \u003C\u002Ftr>\n\u003C\u002Ftable>\n\n### AnimateDiff v2（2023年9月）\n\n在这一版本中，运动模块 `mm_sd_v15_v2.ckpt`（[Google Drive](https:\u002F\u002Fdrive.google.com\u002Fdrive\u002Ffolders\u002F1EqLC65eR1-W-sGD0Im7fkED6c8GkiNFI?usp=sharing) \u002F [HuggingFace](https:\u002F\u002Fhuggingface.co\u002Fguoyww\u002Fanimatediff) \u002F [CivitAI](https:\u002F\u002Fcivitai.com\u002Fmodels\u002F108836\u002Fanimatediff-motion-modules)）是在更高分辨率和更大批量下训练得到的。\n我们发现，这种规模化的训练显著提升了运动质量和多样性。\n\n此外，我们还支持八种基础相机运动的 **MotionLoRA**。每个 MotionLoRA 检查点仅占用 **77 MB 存储空间**，可在 [Google Drive](https:\u002F\u002Fdrive.google.com\u002Fdrive\u002Ffolders\u002F1EqLC65eR1-W-sGD0Im7fkED6c8GkiNFI?usp=sharing) \u002F [HuggingFace](https:\u002F\u002Fhuggingface.co\u002Fguoyww\u002Fanimatediff) \u002F [CivitAI](https:\u002F\u002Fcivitai.com\u002Fmodels\u002F108836\u002Fanimatediff-motion-modules) 
上获取。\n\n\u003Cdetails close>\n\u003Csummary>AnimateDiff v2 模型库\u003C\u002Fsummary>\n\n| 名称 | HuggingFace | 类型 | 参数量 | 存储空间 |\n| - | - | - | - | - |\n| `mm_sd_v15_v2.ckpt` | [链接](https:\u002F\u002Fhuggingface.co\u002Fguoyww\u002Fanimatediff\u002Fblob\u002Fmain\u002Fmm_sd_v15_v2.ckpt) | 运动模块 | 453 M | 1.7 GB |\n| `v2_lora_ZoomIn.ckpt` | [链接](https:\u002F\u002Fhuggingface.co\u002Fguoyww\u002Fanimatediff\u002Fblob\u002Fmain\u002Fv2_lora_ZoomIn.ckpt) | MotionLoRA | 19 M | 74 MB |\n| `v2_lora_ZoomOut.ckpt` | [链接](https:\u002F\u002Fhuggingface.co\u002Fguoyww\u002Fanimatediff\u002Fblob\u002Fmain\u002Fv2_lora_ZoomOut.ckpt) | MotionLoRA | 19 M | 74 MB |\n| `v2_lora_PanLeft.ckpt` | [链接](https:\u002F\u002Fhuggingface.co\u002Fguoyww\u002Fanimatediff\u002Fblob\u002Fmain\u002Fv2_lora_PanLeft.ckpt) | MotionLoRA | 19 M | 74 MB |\n| `v2_lora_PanRight.ckpt` | [链接](https:\u002F\u002Fhuggingface.co\u002Fguoyww\u002Fanimatediff\u002Fblob\u002Fmain\u002Fv2_lora_PanRight.ckpt) | MotionLoRA | 19 M | 74 MB |\n| `v2_lora_TiltUp.ckpt` | [链接](https:\u002F\u002Fhuggingface.co\u002Fguoyww\u002Fanimatediff\u002Fblob\u002Fmain\u002Fv2_lora_TiltUp.ckpt) | MotionLoRA | 19 M | 74 MB |\n| `v2_lora_TiltDown.ckpt` | [链接](https:\u002F\u002Fhuggingface.co\u002Fguoyww\u002Fanimatediff\u002Fblob\u002Fmain\u002Fv2_lora_TiltDown.ckpt) | MotionLoRA | 19 M | 74 MB |\n| `v2_lora_RollingClockwise.ckpt` | [链接](https:\u002F\u002Fhuggingface.co\u002Fguoyww\u002Fanimatediff\u002Fblob\u002Fmain\u002Fv2_lora_RollingClockwise.ckpt) | MotionLoRA | 19 M | 74 MB |\n| `v2_lora_RollingAnticlockwise.ckpt` | [链接](https:\u002F\u002Fhuggingface.co\u002Fguoyww\u002Fanimatediff\u002Fblob\u002Fmain\u002Fv2_lora_RollingAnticlockwise.ckpt) | MotionLoRA | 19 M | 74 MB |\n\u003C\u002Fdetails>\n\n\n#### 示例（MotionLoRA）\n\u003Ctable class=\"center\">\n  \u003Ctr style=\"line-height: 0\">\n    \u003Ctd colspan=\"2\" style=\"border: none; text-align: center\">变焦拉近\u003C\u002Ftd>\n    \u003Ctd colspan=\"2\" style=\"border: none; text-align: 
center\">变焦推远\u003C\u002Ftd>\n    \u003Ctd colspan=\"2\" style=\"border: none; text-align: center\">向左平移\u003C\u002Ftd>\n    \u003Ctd colspan=\"2\" style=\"border: none; text-align: center\">向右平移\u003C\u002Ftd>\n  \u003C\u002Ftr>\n  \u003Ctr>\n    \u003Ctd style=\"border: none\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_d3b5cf28a738.gif\">\u003C\u002Ftd>\n    \u003Ctd style=\"border: none\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_2a2f1b9b78b9.gif\">\u003C\u002Ftd>\n    \u003Ctd style=\"border: none\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_83bf4eece884.gif\">\u003C\u002Ftd>\n    \u003Ctd style=\"border: none\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_bd9d648762b6.gif\">\u003C\u002Ftd>\n    \u003Ctd style=\"border: none\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_e4a2e4e95a65.gif\">\u003C\u002Ftd>\n    \u003Ctd style=\"border: none\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_cc775ad7ea1b.gif\">\u003C\u002Ftd>\n    \u003Ctd style=\"border: none\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_e2d0a392e509.gif\">\u003C\u002Ftd>\n    \u003Ctd style=\"border: none\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_125a2708b833.gif\">\u003C\u002Ftd>\n  \u003C\u002Ftr>\n  \u003Ctr style=\"line-height: 0\">\n    \u003Ctd colspan=\"2\" style=\"border: none; text-align: center\">仰角提升\u003C\u002Ftd>\n    \u003Ctd colspan=\"2\" style=\"border: none; text-align: center\">俯角下降\u003C\u002Ftd>\n    \u003Ctd colspan=\"2\" style=\"border: none; text-align: center\">逆时针旋转\u003C\u002Ftd>\n    \u003Ctd colspan=\"2\" style=\"border: none; text-align: 
center\">顺时针旋转\u003C\u002Ftd>\n  \u003C\u002Ftr>\n  \u003Ctr>\n    \u003Ctd style=\"border: none\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_85b8dbe350df.gif\">\u003C\u002Ftd>\n    \u003Ctd style=\"border: none\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_6a6a024ed0f0.gif\">\u003C\u002Ftd>\n    \u003Ctd style=\"border: none\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_ffb8d4b42af2.gif\">\u003C\u002Ftd>\n    \u003Ctd style=\"border: none\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_0eefc409d05d.gif\">\u003C\u002Ftd>\n    \u003Ctd style=\"border: none\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_1faaff305cb2.gif\">\u003C\u002Ftd>\n    \u003Ctd style=\"border: none\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_ff4c7aeb49b0.gif\">\u003C\u002Ftd>\n    \u003Ctd style=\"border: none\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_c1feae0692ba.gif\">\u003C\u002Ftd>\n    \u003Ctd style=\"border: none\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_704087580b80.gif\">\u003C\u002Ftd>\n  \u003C\u002Ftr>\n\u003C\u002Ftable>\n\n\n#### 示例（改进后的运动效果）\n以下是 `mm_sd_v15.ckpt`（左）与改进后的 `mm_sd_v15_v2.ckpt`（右）的对比。\n\n\u003Ctable class=\"center\">\n  \u003Ctr>\n    \u003Ctd>\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_92fa32531c54.gif\">\u003C\u002Ftd>\n    \u003Ctd>\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_47947bcb12c8.gif\">\u003C\u002Ftd>\n    \u003Ctd>\u003Cimg 
src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_04f96a486271.gif\">\u003C\u002Ftd>\n    \u003Ctd>\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_933bd611c7d1.gif\">\u003C\u002Ftd>\n    \u003Ctd>\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_ea5661852519.gif\">\u003C\u002Ftd>\n    \u003Ctd>\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_f3ceb5e36464.gif\">\u003C\u002Ftd>\n    \u003Ctd>\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_d27149973451.gif\">\u003C\u002Ftd>\n    \u003Ctd>\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_readme_df03b2d9e362.gif\">\u003C\u002Ftd>\n  \u003C\u002Ftr>\n\u003C\u002Ftable>\n\n\n### AnimateDiff v1（2023年7月）\n\nAnimateDiff 的首个版本！\n\n\u003Cdetails close>\n\u003Csummary>AnimateDiff v1 模型库\u003C\u002Fsummary>\n\n| 名称 | HuggingFace | 参数量 | 存储空间 |\n| - | - | - | - |\n| mm_sd_v14.ckpt | [链接](https:\u002F\u002Fhuggingface.co\u002Fguoyww\u002Fanimatediff\u002Fblob\u002Fmain\u002Fmm_sd_v14.ckpt) | 417 M | 1.6 GB |\n| mm_sd_v15.ckpt | [链接](https:\u002F\u002Fhuggingface.co\u002Fguoyww\u002Fanimatediff\u002Fblob\u002Fmain\u002Fmm_sd_v15.ckpt) | 417 M | 1.6 GB |\n\u003C\u002Fdetails>\n\n\u003C\u002Fdetails>\n\n\n## 训练\n详细信息请参阅 [训练步骤](__assets__\u002Fdocs\u002Fanimatediff.md)。\n\n\n## 相关资源\n\n适用于 Stable Diffusion WebUI 的 AnimateDiff：[sd-webui-animatediff](https:\u002F\u002Fgithub.com\u002Fcontinue-revolution\u002Fsd-webui-animatediff)（由 [@continue-revolution](https:\u002F\u002Fgithub.com\u002Fcontinue-revolution) 提供）  \n适用于 ComfyUI 的 AnimateDiff：[ComfyUI-AnimateDiff-Evolved](https:\u002F\u002Fgithub.com\u002FKosinkadink\u002FComfyUI-AnimateDiff-Evolved)（由 [@Kosinkadink](https:\u002F\u002Fgithub.com\u002FKosinkadink) 提供）  \nGoogle 
Colab：[Colab](https:\u002F\u002Fcolab.research.google.com\u002Fgithub\u002Fcamenduru\u002FAnimateDiff-colab\u002Fblob\u002Fmain\u002FAnimateDiff_colab.ipynb)（由 [@camenduru](https:\u002F\u002Fgithub.com\u002Fcamenduru) 提供）\n\n\n## 免责声明\n本项目仅供学术研究使用。\n我们对用户生成的内容不承担任何责任。\n此外，请注意，我们的官方网站仅为 https:\u002F\u002Fgithub.com\u002Fguoyww\u002FAnimateDiff 和 https:\u002F\u002Fanimatediff.github.io，其他所有网站均与 AnimateDiff 无关。\n\n\n## 联系我们\nYuwei Guo：[guoyw@ie.cuhk.edu.hk](mailto:guoyw@ie.cuhk.edu.hk)  \nCeyuan Yang：[limbo0066@gmail.com](mailto:limbo0066@gmail.com)  \nBo Dai：[doubledaibo@gmail.com](mailto:doubledaibo@gmail.com)\n\n## BibTeX\n```\n@article{guo2023animatediff,\n  title={AnimateDiff: Animate Your Personalized Text-to-Image Diffusion Models without Specific Tuning},\n  author={Guo, Yuwei and Yang, Ceyuan and Rao, Anyi and Liang, Zhengyang and Wang, Yaohui and Qiao, Yu and Agrawala, Maneesh and Lin, Dahua and Dai, Bo},\n  journal={International Conference on Learning Representations},\n  year={2024}\n}\n\n@article{guo2023sparsectrl,\n  title={SparseCtrl: Adding Sparse Controls to Text-to-Video Diffusion Models},\n  author={Guo, Yuwei and Yang, Ceyuan and Rao, Anyi and Agrawala, Maneesh and Lin, Dahua and Dai, Bo},\n  journal={arXiv preprint arXiv:2311.16933},\n  year={2023}\n}\n```\n\n\n## 致谢\n代码库基于 [Tune-a-Video](https:\u002F\u002Fgithub.com\u002Fshowlab\u002FTune-A-Video) 构建。","# AnimateDiff 快速上手指南\n\nAnimateDiff 是一个即插即用模块，可将大多数社区版的文本生成图像（Text-to-Image）模型转化为动画生成器，无需额外训练即可让静态模型“动”起来。本项目官方支持 Stable Diffusion V1.5，SDXL 版本请参考 `sdxl-beta` 分支。\n\n## 1. 环境准备\n\n在开始之前，请确保您的开发环境满足以下要求：\n\n*   **操作系统**: Linux 或 macOS (Windows 用户建议使用 WSL2)\n*   **Python**: 3.8 或更高版本\n*   **GPU**: 推荐 NVIDIA GPU，显存建议 8GB 以上（运行高分辨率或复杂控制需更多显存）\n*   **依赖管理**: 已安装 `git` 和 `pip`\n\n> **国内开发者提示**：为避免下载依赖包超时，建议在安装前配置国内镜像源（如清华源或阿里源）。\n\n## 2. 安装步骤\n\n### 2.1 克隆仓库\n首先从 GitHub 克隆项目代码：\n\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002Fguoyww\u002FAnimateDiff.git\ncd AnimateDiff\n```\n\n### 2.2 安装依赖\n使用 pip 安装所需依赖。**国内用户推荐使用以下命令加速下载**：\n\n```bash\npip install -r requirements.txt -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple\n```\n\n> **注意**：首次运行脚本时，程序会自动下载所需的模型检查点（Checkpoints），这可能需要一些时间，请耐心等待。\n\n## 3. 
基本使用\n\n安装完成后，您可以通过命令行脚本或 Gradio 界面生成动画。生成的样本默认保存在 `samples\u002F` 文件夹中。\n\n### 3.1 命令行生成动画（推荐）\n\n以下是使用不同社区模型生成动画的最简示例。脚本会自动处理模型下载。\n\n**生成写实风格动画 (Realistic Vision):**\n```bash\npython -m scripts.animate --config configs\u002Fprompts\u002F1_animate\u002F1_1_animate_RealisticVision.yaml\n```\n\n**生成卡通风格动画 (ToonYou):**\n```bash\npython -m scripts.animate --config configs\u002Fprompts\u002F1_animate\u002F1_3_animate_ToonYou.yaml\n```\n\n**其他预设模型:**\n项目内置了多种风格配置，可直接替换配置文件运行：\n```bash\n# FilmVelvia, MajicMix, RcnzCartoon, Lyriel, Tusun 等\npython -m scripts.animate --config configs\u002Fprompts\u002F1_animate\u002F1_2_animate_FilmVelvia.yaml\n```\n\n### 3.2 进阶控制 (可选)\n\n如果您需要更精细的运动控制或图像引导，可以使用以下功能：\n\n*   **MotionLoRA 控制镜头运动** (如推拉镜头):\n    ```bash\n    python -m scripts.animate --config configs\u002Fprompts\u002F2_motionlora\u002F2_motionlora_RealisticVision.yaml\n    ```\n\n*   **SparseCtrl 稀疏控制** (使用参考图或草图控制内容):\n    ```bash\n    # 图生视频 (Image-to-Video)\n    python -m scripts.animate --config configs\u002Fprompts\u002F3_sparsectrl\u002F3_1_sparsectrl_i2v.yaml\n    \n    # 使用草图控制\n    python -m scripts.animate --config configs\u002Fprompts\u002F3_sparsectrl\u002F3_3_sparsectrl_sketch_RealisticVision.yaml\n    ```\n\n### 3.3 使用 Gradio 可视化界面\n\n为了方便调试和预览，项目提供了一个本地 Web 界面：\n\n```bash\npython -u app.py\n```\n\n运行后，在浏览器中访问 `http:\u002F\u002Flocalhost:7860` 即可通过图形化界面调整参数并生成动画。\n\n---\n**提示**：对于图像动画或插值任务，为了保持风格一致性，建议使用由同一个社区模型生成的图片作为输入。","一位独立游戏开发者希望将自己精心训练的二次元风格角色模型转化为动态立绘，用于游戏的开场动画展示。\n\n### 没有 AnimateDiff 时\n- **训练成本极高**：若要生成连贯动画，必须收集大量该角色的连续动作帧数据，并重新进行耗时数天的全量微调训练。\n- **风格难以保持一致**：传统视频生成模型往往无法完美复现个人定制模型（如 ToonYou）的独特画风，导致角色“变脸”或风格漂移。\n- **技术门槛过高**：开发者需要深入理解视频扩散模型的复杂架构，编写繁琐的代码来对齐时间帧，调试难度极大。\n- **资源消耗巨大**：反复试错训练需要占用多张高端显卡，对于个人开发者或小团队而言，算力成本难以承受。\n\n### 使用 AnimateDiff 后\n- **即插即用无需重训**：直接将现有的静态文生图模型加载为插件，无需任何额外训练数据或微调过程，几分钟内即可生成动画。\n- **完美继承个性化风格**：生成的视频帧严格遵循原模型的特征，确保角色外貌、上色风格与静态设定图高度一致，无违和感。\n- **操作简便快速落地**：只需修改简单的 YAML 配置文件提示词，调用预设脚本即可输出流畅视频，大幅降低开发复杂度。\n- 
**高效利用现有资源**：在单张消费级显卡上即可运行推理，显著降低了时间和金钱成本，让创意验证变得轻而易举。\n\nAnimateDiff 通过零样本迁移能力，打破了静态图像与动态视频之间的壁垒，让个性化 AI 模型瞬间具备“动起来”的生命力。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fguoyww_AnimateDiff_7b1d4ddb.gif","guoyww","Yuwei Guo","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002Fguoyww_e8206f83.jpg","PhD Student@MMLAB, CUHK",null,"Hong Kong","guoyww.github.io","https:\u002F\u002Fgithub.com\u002Fguoyww",[83],{"name":84,"color":85,"percentage":86},"Python","#3572A5",100,12096,1062,"2026-04-10T11:48:36","Apache-2.0","未说明","需要 NVIDIA GPU。SDXL Beta 版本推理通常要求约 13GB 显存；SD v1.5 版本未明确具体数值，但运行深度学习模型通常建议 8GB+ 显存。",{"notes":94,"python":91,"dependencies":95},"该工具主要基于 Stable Diffusion v1.5（主分支）和 SDXL（sdxl-beta 分支）。首次运行脚本时会自动下载模型文件（单个运动模块约 1.56GB，适配器约 97MB，SparseCtrl 编码器约 1.8GB），可能导致启动时间较长。官方推荐使用 Diffusers 库或本仓库提供的 Gradio 界面进行操作。若使用 SDXL 版本，可能需要调整超参数（如采样步数）以适配不同的个性化模型。",[96,97,98,99,100,101,102,103],"torch","diffusers","transformers","accelerate","gradio","opencv-python","einops","safetensors",[15,61],"2026-03-27T02:49:30.150509","2026-04-11T04:54:49.120693",[108,113,118,122,127,132],{"id":109,"question_zh":110,"answer_zh":111,"source_url":112},28980,"生成视频时遇到 'CUDA error: invalid configuration argument' 错误怎么办？","这是一个常见的配置冲突问题。解决方法是在 WebUI 的设置界面中，找到并取消勾选特定的选项（通常与显存优化或特定加速功能有关）。多位用户确认，在 WebUI 设置中取消勾选相关选项后，问题即可解决。","https:\u002F\u002Fgithub.com\u002Fguoyww\u002FAnimateDiff\u002Fissues\u002F155",{"id":114,"question_zh":115,"answer_zh":116,"source_url":117},28981,"在 Windows 上通过 Conda 安装时遇到 'ResolvePackageNotFound: xformers' 错误如何解决？","由于 Conda 渠道可能无法直接找到匹配的 xformers 版本，建议按以下步骤操作：\n1. 从 environment.yaml 文件中移除 xformers 依赖。\n2. 运行 `conda env create -f environment.yaml` 创建基础环境。\n3. 激活环境：`conda activate animatediff`。\n4. 前往 https:\u002F\u002Fanaconda.org\u002Fxformers\u002Fxformers\u002Ffiles 下载与您 PyTorch 和 CUDA 版本匹配的离线包（例如：xformers-0.0.16-py310_cu11.3_pyt1.12.1.tar.bz2）。\n5. 
使用命令 `conda install --use-local \u003C下载的文件名>` 进行本地安装。\n如果仍有其他依赖错误，可尝试使用 pip 安装。","https:\u002F\u002Fgithub.com\u002Fguoyww\u002FAnimateDiff\u002Fissues\u002F33",{"id":119,"question_zh":120,"answer_zh":121,"source_url":117},28982,"如何在 Windows 上正确配置 environment.yaml 以成功运行 AnimateDiff？","对于拥有 NVIDIA 3090 显卡的 Windows 用户，可以使用以下经过测试的 environment.yaml 配置：\nname: animatediff\nchannels:\n  - pytorch\n  - nvidia\ndependencies:\n  - python=3.10\n  - pytorch=1.13.1\n  - torchvision=0.14.1\n  - torchaudio=0.13.1\n  - pytorch-cuda=11.7\n  - pip\n  - pip:\n    - diffusers==0.11.1\n    - transformers==4.25.1\n    - xformers==0.0.16\n    - imageio==2.27.0\n    - gdown\n    - einops\n    - omegaconf\n    - safetensors\n    - gradio\n注意：如果运行时报错 'No module named triton'，可能需要额外安装 triton 或忽略该警告（部分优化将不可用）。",{"id":123,"question_zh":124,"answer_zh":125,"source_url":126},28983,"项目是否支持图生视频（Image-to-Video, I2V）功能？","原生仓库目前主要关注文生视频。对于图生视频需求，维护者建议参考第三方插件 https:\u002F\u002Fgithub.com\u002Fcontinue-revolution\u002Fsd-webui-animatediff，该插件实现了相关功能。技术原理上，需要在训练过程或扩散网络中加入图像监督（如光流、深度图或 VAE 嵌入），但这通常需要修改训练代码而非仅推理。","https:\u002F\u002Fgithub.com\u002Fguoyww\u002FAnimateDiff\u002Fissues\u002F103",{"id":128,"question_zh":129,"answer_zh":130,"source_url":131},28984,"如何在使用 AnimateDiff 时指定初始图像（Init Image）？","主仓库正在开发原生支持初始图像的功能。在此之前，用户可以参考分支项目 https:\u002F\u002Fgithub.com\u002Ftalesofai\u002FAnimateDiff 获取相关代码实现。若想在主仓库实现类似效果，目前社区建议结合 img2img 或 depth2img 技术先对输入图像进行处理，使其结构符合动画要求，然后再交由 AnimateDiff 进行生成。","https:\u002F\u002Fgithub.com\u002Fguoyww\u002FAnimateDiff\u002Fissues\u002F150",{"id":133,"question_zh":134,"answer_zh":135,"source_url":136},28985,"在非标准架构（如 Android\u002FARM64）或 CPU 模式下如何安装和运行？","对于 linux-aarch64 (ARM64) 架构或希望在 Android Termux 上运行的用户，标准的 conda 包（如 torchaudio=0.13.1）可能不可用。社区成员提供了针对 CPU 模式在 Android 上的部署指南，可以参考该项目：https:\u002F\u002Fgithub.com\u002FKintCark\u002FAnimatediff-Android-Termux 获取具体的安装步骤和配置方法。","https:\u002F\u002Fgithub.com\u002Fguoyww\u002FAnimateDiff\u002Fissues\u002F358",[]]