[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-showlab--MotionDirector":3,"tool-showlab--MotionDirector":64},[4,17,26,40,48,56],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":16},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,3,"2026-04-05T11:01:52",[13,14,15],"开发框架","图像","Agent","ready",{"id":18,"name":19,"github_repo":20,"description_zh":21,"stars":22,"difficulty_score":23,"last_commit_at":24,"category_tags":25,"status":16},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",107662,2,"2026-04-03T11:11:01",[13,14,15],{"id":27,"name":28,"github_repo":29,"description_zh":30,"stars":31,"difficulty_score":23,"last_commit_at":32,"category_tags":33,"status":16},2268,"ML-For-Beginners","microsoft\u002FML-For-Beginners","ML-For-Beginners 是由微软推出的一套系统化机器学习入门课程，旨在帮助零基础用户轻松掌握经典机器学习知识。这套课程将学习路径规划为 12 周，包含 26 节精炼课程和 52 
道配套测验，内容涵盖从基础概念到实际应用的完整流程，有效解决了初学者面对庞大知识体系时无从下手、缺乏结构化指导的痛点。\n\n无论是希望转型的开发者、需要补充算法背景的研究人员，还是对人工智能充满好奇的普通爱好者，都能从中受益。课程不仅提供了清晰的理论讲解，还强调动手实践，让用户在循序渐进中建立扎实的技能基础。其独特的亮点在于强大的多语言支持，通过自动化机制提供了包括简体中文在内的 50 多种语言版本，极大地降低了全球不同背景用户的学习门槛。此外，项目采用开源协作模式，社区活跃且内容持续更新，确保学习者能获取前沿且准确的技术资讯。如果你正寻找一条清晰、友好且专业的机器学习入门之路，ML-For-Beginners 将是理想的起点。",84991,"2026-04-05T10:45:23",[14,34,35,36,15,37,38,13,39],"数据工具","视频","插件","其他","语言模型","音频",{"id":41,"name":42,"github_repo":43,"description_zh":44,"stars":45,"difficulty_score":10,"last_commit_at":46,"category_tags":47,"status":16},3128,"ragflow","infiniflow\u002Fragflow","RAGFlow 是一款领先的开源检索增强生成（RAG）引擎，旨在为大语言模型构建更精准、可靠的上下文层。它巧妙地将前沿的 RAG 技术与智能体（Agent）能力相结合，不仅支持从各类文档中高效提取知识，还能让模型基于这些知识进行逻辑推理和任务执行。\n\n在大模型应用中，幻觉问题和知识滞后是常见痛点。RAGFlow 通过深度解析复杂文档结构（如表格、图表及混合排版），显著提升了信息检索的准确度，从而有效减少模型“胡编乱造”的现象，确保回答既有据可依又具备时效性。其内置的智能体机制更进一步，使系统不仅能回答问题，还能自主规划步骤解决复杂问题。\n\n这款工具特别适合开发者、企业技术团队以及 AI 研究人员使用。无论是希望快速搭建私有知识库问答系统，还是致力于探索大模型在垂直领域落地的创新者，都能从中受益。RAGFlow 提供了可视化的工作流编排界面和灵活的 API 接口，既降低了非算法背景用户的上手门槛，也满足了专业开发者对系统深度定制的需求。作为基于 Apache 2.0 协议开源的项目，它正成为连接通用大模型与行业专有知识之间的重要桥梁。",77062,"2026-04-04T04:44:48",[15,14,13,38,37],{"id":49,"name":50,"github_repo":51,"description_zh":52,"stars":53,"difficulty_score":10,"last_commit_at":54,"category_tags":55,"status":16},519,"PaddleOCR","PaddlePaddle\u002FPaddleOCR","PaddleOCR 是一款基于百度飞桨框架开发的高性能开源光学字符识别工具包。它的核心能力是将图片、PDF 等文档中的文字提取出来，转换成计算机可读取的结构化数据，让机器真正“看懂”图文内容。\n\n面对海量纸质或电子文档，PaddleOCR 解决了人工录入效率低、数字化成本高的问题。尤其在人工智能领域，它扮演着连接图像与大型语言模型（LLM）的桥梁角色，能将视觉信息直接转化为文本输入，助力智能问答、文档分析等应用场景落地。\n\nPaddleOCR 适合开发者、算法研究人员以及有文档自动化需求的普通用户。其技术优势十分明显：不仅支持全球 100 多种语言的识别，还能在 Windows、Linux、macOS 等多个系统上运行，并灵活适配 CPU、GPU、NPU 等各类硬件。作为一个轻量级且社区活跃的开源项目，PaddleOCR 既能满足快速集成的需求，也能支撑前沿的视觉语言研究，是处理文字识别任务的理想选择。",74939,"2026-04-05T23:16:38",[38,14,13,37],{"id":57,"name":58,"github_repo":59,"description_zh":60,"stars":61,"difficulty_score":23,"last_commit_at":62,"category_tags":63,"status":16},2471,"tesseract","tesseract-ocr\u002Ftesseract","Tesseract 
是一款历史悠久且备受推崇的开源光学字符识别（OCR）引擎，最初由惠普实验室开发，后由 Google 维护，目前由全球社区共同贡献。它的核心功能是将图片中的文字转化为可编辑、可搜索的文本数据，有效解决了从扫描件、照片或 PDF 文档中提取文字信息的难题，是数字化归档和信息自动化的重要基础工具。\n\n在技术层面，Tesseract 展现了强大的适应能力。从版本 4 开始，它引入了基于长短期记忆网络（LSTM）的神经网络 OCR 引擎，显著提升了行识别的准确率；同时，为了兼顾旧有需求，它依然支持传统的字符模式识别引擎。Tesseract 原生支持 UTF-8 编码，开箱即用即可识别超过 100 种语言，并兼容 PNG、JPEG、TIFF 等多种常见图像格式。输出方面，它灵活支持纯文本、hOCR、PDF、TSV 等多种格式，方便后续数据处理。\n\nTesseract 主要面向开发者、研究人员以及需要构建文档处理流程的企业用户。由于它本身是一个命令行工具和库（libtesseract），不包含图形用户界面（GUI），因此最适合具备一定编程能力的技术人员集成到自动化脚本或应用程序中",73286,"2026-04-03T01:56:45",[13,14],{"id":65,"github_repo":66,"name":67,"description_en":68,"description_zh":69,"ai_summary_zh":70,"readme_en":71,"readme_zh":72,"quickstart_zh":73,"use_case_zh":74,"hero_image_url":75,"owner_login":76,"owner_name":77,"owner_avatar_url":78,"owner_bio":79,"owner_company":80,"owner_location":80,"owner_email":80,"owner_twitter":80,"owner_website":81,"owner_url":82,"languages":83,"stars":88,"forks":89,"last_commit_at":90,"license":91,"difficulty_score":10,"env_os":92,"env_gpu":93,"env_ram":92,"env_deps":94,"category_tags":107,"github_topics":108,"view_count":23,"oss_zip_url":80,"oss_zip_packed_at":80,"status":16,"created_at":115,"updated_at":116,"faqs":117,"releases":153},4025,"showlab\u002FMotionDirector","MotionDirector","[ECCV 2024 Oral] MotionDirector: Motion Customization of Text-to-Video Diffusion Models.","MotionDirector 是一款专为文生视频扩散模型设计的运动定制工具，由新加坡国立大学 Show Lab 团队研发。它的核心功能是让 AI 学会特定的“动作套路”：用户只需提供一组展示相同运动概念的视频片段（如某种独特的舞蹈步伐或运镜方式），MotionDirector 就能调整现有的生成模型，使其在创作新视频时精准复现这种运动风格，同时保持画面内容的多样性。\n\n这一工具有效解决了当前文生视频模型难以精确控制复杂动态、往往只能生成通用运动的痛点。通过它，创作者可以将参考视频中的动作特征迁移到全新的角色或场景中，实现外观与运动的双重自定义。例如，输入兵马俑的静态图片作为外观参考，再结合一段骑马的运动视频，即可生成“兵马俑在古代战场骑马驰骋”的逼真画面。\n\nMotionDirector 特别适合需要精细控制视频动态的研究人员、AI 开发者以及追求创意表达的数字艺术家使用。其技术亮点在于能够解耦并独立定制视频中的“外观”与“运动”，在 ECCV 2024 会议上获得了口头报告荣誉。无论是希望深入探索视频生成机制的极客，还是想要为作品注入独特动态灵感的设计师，都能利用 MotionDir","MotionDirector 是一款专为文生视频扩散模型设计的运动定制工具，由新加坡国立大学 Show Lab 团队研发。它的核心功能是让 AI 学会特定的“动作套路”：用户只需提供一组展示相同运动概念的视频片段（如某种独特的舞蹈步伐或运镜方式），MotionDirector 
就能调整现有的生成模型，使其在创作新视频时精准复现这种运动风格，同时保持画面内容的多样性。\n\n这一工具有效解决了当前文生视频模型难以精确控制复杂动态、往往只能生成通用运动的痛点。通过它，创作者可以将参考视频中的动作特征迁移到全新的角色或场景中，实现外观与运动的双重自定义。例如，输入兵马俑的静态图片作为外观参考，再结合一段骑马的运动视频，即可生成“兵马俑在古代战场骑马驰骋”的逼真画面。\n\nMotionDirector 特别适合需要精细控制视频动态的研究人员、AI 开发者以及追求创意表达的数字艺术家使用。其技术亮点在于能够解耦并独立定制视频中的“外观”与“运动”，在 ECCV 2024 会议上获得了口头报告荣誉。无论是希望深入探索视频生成机制的极客，还是想要为作品注入独特动态灵感的设计师，都能利用 MotionDirector 轻松打破创意边界，高效产出符合预期的高质量视频内容。","\u003Cp align=\"center\">\n\n  \u003Ch2 align=\"center\">MotionDirector: Motion Customization of Text-to-Video Diffusion Models\u003C\u002Fh2>\n  \u003Cp align=\"center\">\n    \u003Ca href=\"https:\u002F\u002Fruizhaocv.github.io\u002F\">\u003Cstrong>Rui Zhao\u003C\u002Fstrong>\u003C\u002Fa>\n    ·\n    \u003Ca href=\"https:\u002F\u002Fycgu.site\u002F\">\u003Cstrong>Yuchao Gu\u003C\u002Fstrong>\u003C\u002Fa>\n    ·\n    \u003Ca href=\"https:\u002F\u002Fzhangjiewu.github.io\u002F\">\u003Cstrong>Jay Zhangjie Wu\u003C\u002Fstrong>\u003C\u002Fa>\n    ·\n    \u003Ca href=\"https:\u002F\u002Fjunhaozhang98.github.io\u002F\u002F\">\u003Cstrong>David Junhao Zhang\u003C\u002Fstrong>\u003C\u002Fa>\n    ·\n    \u003Ca href=\"https:\u002F\u002Fjia-wei-liu.github.io\u002F\">\u003Cstrong>Jia-Wei Liu\u003C\u002Fstrong>\u003C\u002Fa>\n    ·\n    \u003Ca href=\"https:\u002F\u002Fweijiawu.github.io\u002F\">\u003Cstrong>Weijia Wu\u003C\u002Fstrong>\u003C\u002Fa>\n    ·\n    \u003Ca href=\"https:\u002F\u002Fwww.jussikeppo.com\u002F\">\u003Cstrong>Jussi Keppo\u003C\u002Fstrong>\u003C\u002Fa>\n    ·\n    \u003Ca href=\"https:\u002F\u002Fsites.google.com\u002Fview\u002Fshowlab\">\u003Cstrong>Mike Zheng Shou\u003C\u002Fstrong>\u003C\u002Fa>\n    \u003Cbr>\n    \u003Cbr>\n        \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2310.08465\">\u003Cimg src='https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2310.08465-b31b1b.svg'>\u003C\u002Fa>\n        \u003Ca href='https:\u002F\u002Fshowlab.github.io\u002FMotionDirector'>\u003Cimg 
src='https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FProject_Page-MotionDirector-blue'>\u003C\u002Fa>\n        \u003Ca href='https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fruizhaocv\u002FMotionDirector'>\u003Cimg src='https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F%F0%9F%A4%97%20Hugging%20Face-Spaces-yellow'>\u003C\u002Fa>\n        \u003Ca href='https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=Wq93zi8bE3U'>\u003Cimg src='https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FDemo_Video-MotionDirector-red'>\u003C\u002Fa>\n    \u003Cbr>\n    \u003Cb>Show Lab, National University of Singapore\u003C\u002Fb>\n  \u003C\u002Fp>\n\n\n\u003Cp align=\"center\">\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fshowlab_MotionDirector_readme_bd41f63b8917.gif\" width=\"1080px\"\u002F>  \n\u003Cbr>\n\u003Cem>MotionDirector can customize text-to-video diffusion models to generate videos with desired motions.\u003C\u002Fem>\n\u003C\u002Fp>\n\n## Task Definition\nMotion Customization of Text-to-Video Diffusion Models: \u003C\u002Fbr>\nGiven a set of video clips of the same motion concept, the task of **Motion Customization** is to adapt existing text-to-video diffusion\nmodels to generate diverse videos with this motion.\n\n\n## Demos\n### Demo Video:\n[![Demo Video of MotionDirector](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fshowlab_MotionDirector_readme_adf765f2f697.jpg)](https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=Wq93zi8bE3U \"Demo Video of MotionDirector\")\n\n### Customize both Appearance and Motion: \u003Ca name=\"Customize_both_Appearance_and_Motion\">\u003C\u002Fa>\n\u003Ctable class=\"center\"> \n\u003Ctr>\n  \u003Ctd style=\"text-align:center;\">\u003Cb>Reference images or videos\u003C\u002Fb>\u003C\u002Ftd>\n  \u003Ctd style=\"text-align:center;\" colspan=\"3\">\u003Cb>Videos generated by MotionDirector\u003C\u002Fb>\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n  \u003Ctd>\u003Cimg 
src=assets\u002Fcustomized_appearance_results\u002Freference_images.png>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fcustomized_appearance_results\u002FA_Terracotta_Warrior_is_riding_a_horse_through_an_ancient_battlefield_1455028.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fcustomized_appearance_results\u002FA_Terracotta_Warrior_is_playing_golf_in_front_of_the_Great_Wall_5804477.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fcustomized_appearance_results\u002FA_Terracotta_Warrior_is_walking_cross_the_ancient_army_captured_with_a_reverse_follow_cinematic_shot_653658.gif>\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n  \u003Ctd width=25% style=\"text-align:center;color:gray;\">Reference images for appearance customization: \"A Terracotta Warrior on a pure color background.\"\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">\"A Terracotta Warrior is riding a horse through an ancient battlefield.\"\u003C\u002Fbr> seed: 1455028\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">\"A Terracotta Warrior is playing golf in front of the Great Wall.\" \u003C\u002Fbr> seed: 5804477\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">\"A Terracotta Warrior is walking cross the ancient army captured with a reverse follow cinematic shot.\" \u003C\u002Fbr> seed: 653658\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n  \u003Ctd>\u003Cimg src=assets\u002Fmulti_videos_results\u002Freference_videos.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fcustomized_appearance_results\u002FA_Terracotta_Warrior_is_riding_a_bicycle_past_an_ancient_Chinese_palace_166357.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fcustomized_appearance_results\u002FA_Terracotta_Warrior_is_lifting_weights_in_front_of_the_Great_Wall_5635982.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg 
src=assets\u002Fcustomized_appearance_results\u002FA_Terracotta_Warrior_is_skateboarding_9033688.gif>\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n  \u003Ctd width=25% style=\"text-align:center;color:gray;\">Reference videos for motion customization: \"A person is riding a bicycle.\"\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">\"A Terracotta Warrior is riding a bicycle past an ancient Chinese palace.\"\u003C\u002Fbr> seed: 166357\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">\"A Terracotta Warrior is lifting weights in front of the Great Wall.\" \u003C\u002Fbr> seed: 5635982\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">\"A Terracotta Warrior is skateboarding.\" \u003C\u002Fbr> seed: 9033688\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003C\u002Ftable>\n\n## News\n- [2024.02.03] [MotionDirector for AnimateDiff](https:\u002F\u002Fgithub.com\u002FExponentialML\u002FAnimateDiff-MotionDirector) is available. Thanks to [ExponentialML](https:\u002F\u002Fgithub.com\u002FExponentialML).\n- [2023.12.27] [MotionDirector with Customized Appearance](#motiondirector-with-customized-appearance-) released. Now, you can customize both appearance and motion in video generation.\n- [2023.12.27] [MotionDirector for Image Animation](#motiondirector-for-image-animation-) released.\n- [2023.12.23] MotionDirector has been featured in Hugging Face's '[Spaces of the Week](https:\u002F\u002Fhuggingface.co\u002Fspaces) 🔥' trending list! \n- [2023.12.13] Online Gradio demo released @ [Hugging Face Spaces](https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fruizhaocv\u002FMotionDirector)! Welcome to try it.\n- [2023.12.06] [MotionDirector for Sports](#motiondirector-for-sports-) released! Lifting weights, riding a horse, playing golf, etc.\n- [2023.12.05] [Colab demo](https:\u002F\u002Fgithub.com\u002Fcamenduru\u002FMotionDirector-colab) is available. 
Thanks to [Camenduru](https:\u002F\u002Ftwitter.com\u002Fcamenduru).\n- [2023.12.04] [MotionDirector for Cinematic Shots](#motiondirector-for-cinematic-shots-) released. Now, you can make AI films with professional cinematic shots!\n- [2023.12.02] Code and model weights released!\n- [2023.10.12] Paper and project page released.\n\n## ToDo\n- [x] Gradio Demo\n- [ ] More trained weights of MotionDirector\n\n## Model List\n\n| Type | Training Data | Description | Link |\n| :---: | :---: | :---: | :---: |\n| MotionDirector for Sports | Multiple videos for each model. | Learn motion concepts of sports, e.g. lifting weights, riding a horse, playing golf. | [Link](#motiondirector-for-sports-) |\n| MotionDirector for Cinematic Shots | A single video for each model. | Learn motion concepts of cinematic shots, e.g. dolly zoom, zoom in, zoom out. | [Link](#motiondirector-for-cinematic-shots-) |\n| MotionDirector for Image Animation | A single image for the spatial path, and a single video or multiple videos for the temporal path. | Animate the given image with learned motions. | [Link](#motiondirector-for-image-animation-) |\n| MotionDirector with Customized Appearance | A single image or multiple images for the spatial path, and a single video or multiple videos for the temporal path. 
| Customize both appearance and motion in video generation. | [Link](#motiondirector-with-customized-appearance-) |\n\n\n\n\n## Setup\n### Requirements\n\n```shell\n# create virtual environment\nconda create -n motiondirector python=3.8\nconda activate motiondirector\n# install packages\npip install -r requirements.txt\n```\n\n### Weights of Foundation Models\n```shell\ngit lfs install\n## You can choose the ModelScopeT2V or ZeroScope, etc., as the foundation model.\n## ZeroScope\ngit clone https:\u002F\u002Fhuggingface.co\u002Fcerspense\u002Fzeroscope_v2_576w .\u002Fmodels\u002Fzeroscope_v2_576w\u002F\n## ModelScopeT2V\ngit clone https:\u002F\u002Fhuggingface.co\u002Fdamo-vilab\u002Ftext-to-video-ms-1.7b .\u002Fmodels\u002Fmodel_scope\u002F\n```\n### Weights of trained MotionDirector \u003Ca name=\"download_weights\">\u003C\u002Fa>\n```shell\n# Make sure you have git-lfs installed (https:\u002F\u002Fgit-lfs.com)\ngit lfs install\ngit clone https:\u002F\u002Fhuggingface.co\u002Fruizhaocv\u002FMotionDirector_weights .\u002Foutputs\n\n# More and better trained MotionDirector weights are released in a new repo:\ngit clone https:\u002F\u002Fhuggingface.co\u002Fruizhaocv\u002FMotionDirector .\u002Foutputs\n# Their usage is slightly different; instructions will be updated later.\n```\n\n## Usage\n### Training\n\n#### Train MotionDirector on multiple videos:\n```bash\npython MotionDirector_train.py --config .\u002Fconfigs\u002Fconfig_multi_videos.yaml\n```\n#### Train MotionDirector on a single video:\n```bash\npython MotionDirector_train.py --config .\u002Fconfigs\u002Fconfig_single_video.yaml\n```\n\nNote:  \n- Before running the commands above, \nmake sure you replace the paths to the foundation model weights and training data with your own in the config file `config_multi_videos.yaml` or `config_single_video.yaml`.\n- Training on multiple 16-frame videos usually takes `300~500` steps, about `9~16` minutes using one A5000 GPU. 
Training on a single video takes `50~150` steps, about `1.5~4.5` minutes using one A5000 GPU. The required VRAM for training is around `14GB`.\n- Reduce `n_sample_frames` if your GPU memory is limited.\n- Reduce the learning rate and increase the training steps for better performance.\n\n\n### Inference\n```bash\npython MotionDirector_inference.py --model \u002Fpath\u002Fto\u002Fthe\u002Ffoundation\u002Fmodel  --prompt \"Your prompt\" --checkpoint_folder \u002Fpath\u002Fto\u002Fthe\u002Ftrained\u002FMotionDirector --checkpoint_index 300 --noise_prior 0.\n```\nNote: \n- Replace `\u002Fpath\u002Fto\u002Fthe\u002Ffoundation\u002Fmodel` with your own path to the foundation model, such as ZeroScope.\n- The value of `checkpoint_index` selects which saved checkpoint to load, identified by the training step at which it was saved.\n- The value of `noise_prior` indicates how much the inversion noise of the reference video affects the generation. \nWe recommend setting it to `0` for MotionDirector trained on multiple videos to obtain the most diverse generations, and to `0.1~0.5` for MotionDirector trained on a single video for faster convergence and better alignment with the reference video.\n\n\n## Inference with pre-trained MotionDirector\nAll available weights are in the official [Hugging Face repo](https:\u002F\u002Fhuggingface.co\u002Fruizhaocv\u002FMotionDirector_weights).\nRun the [download command](#download_weights) to download the weights into the `outputs` folder, then run the following inference command to generate videos.\n\n### MotionDirector trained on multiple videos:\n```bash\npython MotionDirector_inference.py --model \u002Fpath\u002Fto\u002Fthe\u002FZeroScope  --prompt \"A person is riding a bicycle past the Eiffel Tower.\" --checkpoint_folder .\u002Foutputs\u002Ftrain\u002Friding_bicycle\u002F --checkpoint_index 300 --noise_prior 0. --seed 7192280\n```\nNote:  \n- Replace `\u002Fpath\u002Fto\u002Fthe\u002FZeroScope` with your own path to the foundation model, i.e. 
the ZeroScope.\n- Change the `prompt` to generate different videos. \n- The `seed` is set to a random value by default. Setting it to a specific value reproduces the corresponding result shown in the table below.\n\nResults:\n\n\u003Ctable class=\"center\">\n\u003Ctr>\n  \u003Ctd style=\"text-align:center;\">\u003Cb>Reference Videos\u003C\u002Fb>\u003C\u002Ftd>\n  \u003Ctd style=\"text-align:center;\" colspan=\"3\">\u003Cb>Videos Generated by MotionDirector\u003C\u002Fb>\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n  \u003Ctd>\u003Cimg src=assets\u002Fmulti_videos_results\u002Freference_videos.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fmulti_videos_results\u002FA_person_is_riding_a_bicycle_past_the_Eiffel_Tower_7192280.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fmulti_videos_results\u002FA_panda_is_riding_a_bicycle_in_a_garden_2178639.gif>\u003C\u002Ftd>              \n  \u003Ctd>\u003Cimg src=assets\u002Fmulti_videos_results\u002FAn_alien_is_riding_a_bicycle_on_Mars_2390886.gif>\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n  \u003Ctd width=25% style=\"text-align:center;color:gray;\">\"A person is riding a bicycle.\"\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">\"A person is riding a bicycle past the Eiffel Tower.” \u003C\u002Fbr> seed: 7192280\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">\"A panda is riding a bicycle in a garden.\"  \u003C\u002Fbr> seed: \u003Cs>2178639\u003C\u002Fs> \u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">\"An alien is riding a bicycle on Mars.\"  \u003C\u002Fbr> seed: 2390886\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003C\u002Ftable>\n\n### MotionDirector trained on a single video:\n16 frames:\n```bash\npython MotionDirector_inference.py --model \u002Fpath\u002Fto\u002Fthe\u002FZeroScope  --prompt \"A tank is running on the moon.\" --checkpoint_folder .\u002Foutputs\u002Ftrain\u002Fcar_16\u002F --checkpoint_index 150 --noise_prior 0.5 --seed 8551187\n```\n\u003Ctable 
class=\"center\">\n\u003Ctr>\n  \u003Ctd style=\"text-align:center;\">\u003Cb>Reference Video\u003C\u002Fb>\u003C\u002Ftd>\n  \u003Ctd style=\"text-align:center;\" colspan=\"3\">\u003Cb>Videos Generated by MotionDirector\u003C\u002Fb>\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n  \u003Ctd>\u003Cimg src=assets\u002Fsingle_video_results\u002Freference_video.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fsingle_video_results\u002FA_tank_is_running_on_the_moon_8551187.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fsingle_video_results\u002FA_lion_is_running_past_the_pyramids_431554.gif>\u003C\u002Ftd>              \n  \u003Ctd>\u003Cimg src=assets\u002Fsingle_video_results\u002FA_spaceship_is_flying_past_Mars_8808231.gif>\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n  \u003Ctd width=25% style=\"text-align:center;color:gray;\">\"A car is running on the road.\"\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">\"A tank is running on the moon.” \u003C\u002Fbr> seed: 8551187\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">\"A lion is running past the pyramids.\" \u003C\u002Fbr> seed: 431554\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">\"A spaceship is flying past Mars.\"  \u003C\u002Fbr> seed: 8808231\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003C\u002Ftable>\n\n24 frames:\n```bash\npython MotionDirector_inference.py --model \u002Fpath\u002Fto\u002Fthe\u002FZeroScope  --prompt \"A truck is running past the Arc de Triomphe.\" --checkpoint_folder .\u002Foutputs\u002Ftrain\u002Fcar_24\u002F --checkpoint_index 150 --noise_prior 0.5 --width 576 --height 320 --num-frames 24 --seed 34543\n```\n\u003Ctable class=\"center\">\n\u003Ctr>\n  \u003Ctd style=\"text-align:center;\">\u003Cb>Reference Video\u003C\u002Fb>\u003C\u002Ftd>\n  \u003Ctd style=\"text-align:center;\" colspan=\"3\">\u003Cb>Videos Generated by MotionDirector\u003C\u002Fb>\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n  \u003Ctd>\u003Cimg 
src=assets\u002Fsingle_video_results\u002F24_frames\u002Freference_video.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fsingle_video_results\u002F24_frames\u002FA_truck_is_running_past_the_Arc_de_Triomphe_34543.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fsingle_video_results\u002F24_frames\u002FAn_elephant_is_running_in_a_forest_2171736.gif>\u003C\u002Ftd>              \n\u003C\u002Ftr>\n\u003Ctr>\n  \u003Ctd width=25% style=\"text-align:center;color:gray;\">\"A car is running on the road.\"\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">\"A truck is running past the Arc de Triomphe.” \u003C\u002Fbr> seed: 34543\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">\"An elephant is running in a forest.\" \u003C\u002Fbr> seed: 2171736\u003C\u002Ftd>\n \u003C\u002Ftr>\n\u003Ctr>\n  \u003Ctd>\u003Cimg src=assets\u002Fsingle_video_results\u002F24_frames\u002Freference_video.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fsingle_video_results\u002F24_frames\u002FA_person_on_a_camel_is_running_past_the_pyramids_4904126.gif>\u003C\u002Ftd>              \n  \u003Ctd>\u003Cimg src=assets\u002Fsingle_video_results\u002F24_frames\u002FA_spacecraft_is_flying_past_the_Milky_Way_galaxy_3235677.gif>\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n  \u003Ctd width=25% style=\"text-align:center;color:gray;\">\"A car is running on the road.\"\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">\"A person on a camel is running past the pyramids.\" \u003C\u002Fbr> seed: 4904126\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">\"A spacecraft is flying past the Milky Way galaxy.\"  \u003C\u002Fbr> seed: 3235677\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003C\u002Ftable>\n\n## MotionDirector for Sports \u003Ca name=\"MotionDirector_for_Sports\">\u003C\u002Fa>\n\n```bash\npython MotionDirector_inference.py --model \u002Fpath\u002Fto\u002Fthe\u002FZeroScope  --prompt \"A panda is lifting weights in a 
garden.\" --checkpoint_folder .\u002Foutputs\u002Ftrain\u002Flifting_weights\u002F --checkpoint_index 300 --noise_prior 0. --seed 9365597\n```\n\n\u003Ctable class=\"center\">\n\u003Ctr>\n  \u003Ctd style=\"text-align:center;\" colspan=\"4\">\u003Cb>Videos Generated by MotionDirector\u003C\u002Fb>\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd style=\"text-align:center;\" colspan=\"2\">\u003Cb>Lifting Weights\u003C\u002Fb>\u003C\u002Ftd>\n\u003Ctd style=\"text-align:center;\" colspan=\"2\">\u003Cb>Riding Bicycle\u003C\u002Fb>\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n  \u003Ctd>\u003Cimg src=assets\u002Fsports_results\u002Flifting_weights\u002FA_panda_is_lifting_weights_in_a_garden_1699276.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fsports_results\u002Flifting_weights\u002FA_police_officer_is_lifting_weights_in_front_of_the_police_station_6804745.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fmulti_videos_results\u002FA_panda_is_riding_a_bicycle_in_a_garden_2178639.gif>\u003C\u002Ftd>              \n  \u003Ctd>\u003Cimg src=assets\u002Fmulti_videos_results\u002FAn_alien_is_riding_a_bicycle_on_Mars_2390886.gif>\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n  \u003Ctd width=25% style=\"text-align:center;\">\"A panda is lifting weights in a garden.” \u003C\u002Fbr> seed: 1699276\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">\"A police officer is lifting weights in front of the police station.” \u003C\u002Fbr> seed: 6804745\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">\"A panda is riding a bicycle in a garden.\"  \u003C\u002Fbr> seed: \u003Cs>2178639\u003C\u002Fs> \u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">\"An alien is riding a bicycle on Mars.\"  \u003C\u002Fbr> seed: 2390886\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd style=\"text-align:center;\" colspan=\"2\">\u003Cb>Riding Horse\u003C\u002Fb>\u003C\u002Ftd>\n\u003Ctd style=\"text-align:center;\" 
colspan=\"2\">\u003Cb>Riding Horse\u003C\u002Fb>\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n  \u003Ctd>\u003Cimg src=assets\u002Fsports_results\u002Friding_horse\u002FA_knight_riding_on_horseback_passing_by_a_castle_6491893.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fsports_results\u002Friding_horse\u002FA_man_riding_an_elephant_through_the_jungle_6230765.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fsports_results\u002Friding_horse\u002FA_girl_riding_a_unicorn_galloping_under_the_moonlight_6940542.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fsports_results\u002Friding_horse\u002FAn_adventurer_riding_a_dinosaur_exploring_through_the_rainforest_6972276.gif>\u003C\u002Ftd> \n\u003C\u002Ftr>\n\u003Ctr>\n  \u003Ctd width=25% style=\"text-align:center;\">\"A knight riding on horseback passing by a castle.” \u003C\u002Fbr> seed: 6491893\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">\"A man riding an elephant through the jungle.” \u003C\u002Fbr> seed: 6230765\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">\"A girl riding a unicorn galloping under the moonlight.\"  \u003C\u002Fbr> seed: 6940542\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">\"An adventurer riding a dinosaur exploring through the rainforest.\"  \u003C\u002Fbr> seed: 6972276\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd style=\"text-align:center;\" colspan=\"2\">\u003Cb>Skateboarding\u003C\u002Fb>\u003C\u002Ftd>\n\u003Ctd style=\"text-align:center;\" colspan=\"2\">\u003Cb>Playing Golf\u003C\u002Fb>\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n  \u003Ctd>\u003Cimg src=assets\u002Fsports_results\u002Fskateboarding\u002FA_robot_is_skateboarding_in_a_cyberpunk_city_1020673.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fsports_results\u002Fskateboarding\u002FA_teddy_bear_skateboarding_in_Times_Square_New_York_3306353.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg 
src=assets\u002Fsports_results\u002Fplaying_golf\u002FA_man_is_playing_golf_in_front_of_the_White_House_8870450.gif>\u003C\u002Ftd>              \n  \u003Ctd>\u003Cimg src=assets\u002Fsports_results\u002Fplaying_golf\u002FA_monkey_is_playing_golf_on_a_field_full_of_flowers_2989633.gif>\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n  \u003Ctd width=25% style=\"text-align:center;\">\"A robot is skateboarding in a cyberpunk city.” \u003C\u002Fbr> seed: 1020673\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">\"A teddy bear skateboarding in Times Square New York.” \u003C\u002Fbr> seed: 3306353\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">\"A man is playing golf in front of the White House.\"  \u003C\u002Fbr> seed: 8870450\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">\"A monkey is playing golf on a field full of flowers.\"  \u003C\u002Fbr> seed: 2989633\u003C\u002Ftd>\n\u003Ctr>\n\u003C\u002Ftable>\n\nMore sports, to be continued ...\n\n## MotionDirector for Cinematic Shots \u003Ca name=\"MotionDirector_for_Cinematic_Shots\">\u003C\u002Fa>\n\n### 1. 
Zoom\n#### 1.1 Dolly Zoom (Hitchcockian Zoom)\n```bash\npython MotionDirector_inference.py --model \u002Fpath\u002Fto\u002Fthe\u002FZeroScope  --prompt \"A firefighter standing in front of a burning forest captured with a dolly zoom.\" --checkpoint_folder .\u002Foutputs\u002Ftrain\u002Fdolly_zoom\u002F --checkpoint_index 150 --noise_prior 0.5 --seed 9365597\n```\n\u003Ctable class=\"center\">\n\u003Ctr>\n  \u003Ctd style=\"text-align:center;\">\u003Cb>Reference Video\u003C\u002Fb>\u003C\u002Ftd>\n  \u003Ctd style=\"text-align:center;\" colspan=\"3\">\u003Cb>Videos Generated by MotionDirector\u003C\u002Fb>\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n  \u003Ctd>\u003Cimg src=assets\u002Fcinematic_shots_results\u002Fdolly_zoom_16.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fcinematic_shots_results\u002FA_firefighter_standing_in_front_of_a_burning_forest_captured_with_a_dolly_zoom_9365597.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fcinematic_shots_results\u002FA_lion_sitting_on_top_of_a_cliff_captured_with_a_dolly_zoom_1675932.gif>\u003C\u002Ftd>              \n  \u003Ctd>\u003Cimg src=assets\u002Fcinematic_shots_results\u002FA_Roman_soldier_standing_in_front_of_the_Colosseum_captured_with_a_dolly_zoom_2310805.gif>\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n  \u003Ctd width=25% style=\"text-align:center;color:gray;\">\"A man standing in room captured with a dolly zoom.\"\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">\"A firefighter standing in front of a burning forest captured with a dolly zoom.\" \u003C\u002Fbr> seed: 9365597 \u003C\u002Fbr> noise_prior: 0.5\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">\"A lion sitting on top of a cliff captured with a dolly zoom.\" \u003C\u002Fbr> seed: 1675932 \u003C\u002Fbr> noise_prior: 0.5\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">\"A Roman soldier standing in front of the Colosseum captured with a dolly zoom.\"  \u003C\u002Fbr> seed: 
2310805 \u003C\u002Fbr> noise_prior: 0.5 \u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n  \u003Ctd>\u003Cimg src=assets\u002Fcinematic_shots_results\u002Fdolly_zoom_16.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fcinematic_shots_results\u002FA_firefighter_standing_in_front_of_a_burning_forest_captured_with_a_dolly_zoom_4615820.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fcinematic_shots_results\u002FA_lion_sitting_on_top_of_a_cliff_captured_with_a_dolly_zoom_4114896.gif>\u003C\u002Ftd>              \n  \u003Ctd>\u003Cimg src=assets\u002Fcinematic_shots_results\u002FA_Roman_soldier_standing_in_front_of_the_Colosseum_captured_with_a_dolly_zoom_7492004.gif>\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n  \u003Ctd width=25% style=\"text-align:center;color:gray;\">\"A man standing in room captured with a dolly zoom.\"\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">\"A firefighter standing in front of a burning forest captured with a dolly zoom.\" \u003C\u002Fbr> seed: 4615820 \u003C\u002Fbr> noise_prior: 0.3\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">\"A lion sitting on top of a cliff captured with a dolly zoom.\" \u003C\u002Fbr> seed: 4114896 \u003C\u002Fbr> noise_prior: 0.3\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">\"A Roman soldier standing in front of the Colosseum captured with a dolly zoom.\"  \u003C\u002Fbr> seed: 7492004\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003C\u002Ftable>\n\n#### 1.2 Zoom In\nThe reference video is shot with my own water cup. You can also pick up your cup or any other object to practice camera movements and turn it into imaginative videos. 
Create your AI films with customized camera movements!\n\n```bash\npython MotionDirector_inference.py --model \u002Fpath\u002Fto\u002Fthe\u002FZeroScope  --prompt \"A firefighter standing in front of a burning forest captured with a zoom in.\" --checkpoint_folder .\u002Foutputs\u002Ftrain\u002Fzoom_in\u002F --checkpoint_index 150 --noise_prior 0.3 --seed 1429227\n```\n\u003Ctable class=\"center\">\n\u003Ctr>\n  \u003Ctd style=\"text-align:center;\">\u003Cb>Reference Video\u003C\u002Fb>\u003C\u002Ftd>\n  \u003Ctd style=\"text-align:center;\" colspan=\"3\">\u003Cb>Videos Generated by MotionDirector\u003C\u002Fb>\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n  \u003Ctd>\u003Cimg src=assets\u002Fcinematic_shots_results\u002Fzoom_in_16.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fcinematic_shots_results\u002FA_firefighter_standing_in_front_of_a_burning_forest_captured_with_a_zoom_in_1429227.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fcinematic_shots_results\u002FA_lion_sitting_on_top_of_a_cliff_captured_with_a_zoom_in_487239.gif>\u003C\u002Ftd>              \n  \u003Ctd>\u003Cimg src=assets\u002Fcinematic_shots_results\u002FA_Roman_soldier_standing_in_front_of_the_Colosseum_captured_with_a_zoom_in_1393184.gif>\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n  \u003Ctd width=25% style=\"text-align:center;color:gray;\">\"A cup in a lab captured with a zoom in.\"\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">\"A firefighter standing in front of a burning forest captured with a zoom in.\" \u003C\u002Fbr> seed: 1429227\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">\"A lion sitting on top of a cliff captured with a zoom in.\" \u003C\u002Fbr> seed: 487239 \u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">\"A Roman soldier standing in front of the Colosseum captured with a zoom in.\"  \u003C\u002Fbr> seed: 1393184\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003C\u002Ftable>\n\n#### 1.3 Zoom 
Out\n```bash\npython MotionDirector_inference.py --model \u002Fpath\u002Fto\u002Fthe\u002FZeroScope  --prompt \"A firefighter standing in front of a burning forest captured with a zoom out.\" --checkpoint_folder .\u002Foutputs\u002Ftrain\u002Fzoom_out\u002F --checkpoint_index 150 --noise_prior 0.3 --seed 4971910\n```\n\u003Ctable class=\"center\">\n\u003Ctr>\n  \u003Ctd style=\"text-align:center;\">\u003Cb>Reference Video\u003C\u002Fb>\u003C\u002Ftd>\n  \u003Ctd style=\"text-align:center;\" colspan=\"3\">\u003Cb>Videos Generated by MotionDirector\u003C\u002Fb>\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n  \u003Ctd>\u003Cimg src=assets\u002Fcinematic_shots_results\u002Fzoom_out_16.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fcinematic_shots_results\u002FA_firefighter_standing_in_front_of_a_burning_forest_captured_with_a_zoom_out_4971910.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fcinematic_shots_results\u002FA_lion_sitting_on_top_of_a_cliff_captured_with_a_zoom_out_1767994.gif>\u003C\u002Ftd>              \n  \u003Ctd>\u003Cimg src=assets\u002Fcinematic_shots_results\u002FA_Roman_soldier_standing_in_front_of_the_Colosseum_captured_with_a_zoom_out_8203639.gif>\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n  \u003Ctd width=25% style=\"text-align:center;color:gray;\">\"A cup in a lab captured with a zoom out.\"\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">\"A firefighter standing in front of a burning forest captured with a zoom out.\" \u003C\u002Fbr> seed: 4971910\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">\"A lion sitting on top of a cliff captured with a zoom out.\" \u003C\u002Fbr> seed: 1767994 \u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">\"A Roman soldier standing in front of the Colosseum captured with a zoom out.\"  \u003C\u002Fbr> seed: 8203639\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003C\u002Ftable>\n\n### 2. 
Advanced Cinematic Shots\n\n\u003Ctable class=\"center\">\n\u003Ctr>\n\u003Ctd style=\"text-align:center;\" colspan=\"2\">\u003Cb>Follow\u003C\u002Fb>\u003C\u002Ftd>\n\u003Ctd style=\"text-align:center;\" colspan=\"2\">\u003Cb>Reverse Follow\u003C\u002Fb>\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n  \u003Ctd>\u003Cimg src=assets\u002Fcinematic_shots_results\u002Fmore_results\u002FA_fireman_is_walking_through_fire_captured_with_a_follow_cinematic_shot_4926511.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fcinematic_shots_results\u002Fmore_results\u002FA_spaceman_is_walking_on_the_moon_with_a_follow_cinematic_shot_7594623.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fcinematic_shots_results\u002Fmore_results\u002FA_fireman_is_walking_through_fire_captured_with_a_reverse_follow_cinematic_shot_9759630.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fcinematic_shots_results\u002Fmore_results\u002FA_spaceman_walking_on_the_moon_captured_with_a_reverse_follow_cinematic_shot_4539309.gif>\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n  \u003Ctd width=25% style=\"text-align:center;\">\"A fireman is walking through fire captured with a follow cinematic shot.\" \u003C\u002Fbr> seed: 4926511\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">\"A spaceman is walking on the moon with a follow cinematic shot.\" \u003C\u002Fbr> seed: 7594623\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">\"A fireman is walking through fire captured with a reverse follow cinematic shot.\"  \u003C\u002Fbr> seed: 9759630\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">\"A spaceman walking on the moon captured with a reverse follow cinematic shot.\"  \u003C\u002Fbr> seed: 4539309\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd style=\"text-align:center;\" colspan=\"2\">\u003Cb>Chest Transition\u003C\u002Fb>\u003C\u002Ftd>\n\u003Ctd style=\"text-align:center;\" colspan=\"2\">\u003Cb>Mini Jib Reveal: Foot-to-Head 
Shot\u003C\u002Fb>\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n  \u003Ctd>\u003Cimg src=assets\u002Fcinematic_shots_results\u002Fmore_results\u002FA_fireman_is_walking_through_the_burning_forest_captured_with_a_chest_transition_cinematic_shot_5236349.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fcinematic_shots_results\u002Fmore_results\u002FAn_ancient_Roman_soldier_walks_through_the_crowd_on_the_street_captured_with_a_chest_transition_cinematic_shot_3982271.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fcinematic_shots_results\u002Fmore_results\u002FAn_ancient_Roman_soldier_walks_through_the_crowd_on_the_street_captured_with_a_mini_jib_reveal_cinematic_shot_654178.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fcinematic_shots_results\u002Fmore_results\u002FA_British_Redcoat_soldier_is_walking_through_the_mountains_captured_with_a_mini_jib_reveal_cinematic_shot_566917.gif>\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n  \u003Ctd width=25% style=\"text-align:center;\">\"A fireman is walking through the burning forest captured with a chest transition cinematic shot.\" \u003C\u002Fbr> seed: 5236349\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">\"An ancient Roman soldier walks through the crowd on the street captured with a chest transition cinematic shot.\" \u003C\u002Fbr> seed: 3982271\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">\"An ancient Roman soldier walks through the crowd on the street captured with a mini jib reveal cinematic shot.\"  \u003C\u002Fbr> seed: 654178\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">\"A British Redcoat soldier is walking through the mountains captured with a mini jib reveal cinematic shot.\"  \u003C\u002Fbr> seed: 566917\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd style=\"text-align:center;\" colspan=\"2\">\u003Cb>Pull Back: Subject Enters from the Left\u003C\u002Fb>\u003C\u002Ftd>\n\u003Ctd style=\"text-align:center;\" 
colspan=\"2\">\u003Cb>Orbit\u003C\u002Fb>\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n  \u003Ctd>\u003Cimg src=assets\u002Fcinematic_shots_results\u002Fmore_results\u002FA_robot_looks_at_a_distant_cyberpunk_city_captured_with_a_pull_back_cinematic_shot_9342597.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fcinematic_shots_results\u002Fmore_results\u002FA_woman_looks_at_a_distant_erupting_volcano_captured_with_a_pull_back_cinematic_shot_4197508.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fcinematic_shots_results\u002Fmore_results\u002FA_fireman_in_the_burning_forest_captured_with_an_orbit_cinematic_shot_8450300.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fcinematic_shots_results\u002Fmore_results\u002FA_spaceman_on_the_moon_captured_with_an_orbit_cinematic_shot_5899496.gif>\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n  \u003Ctd width=25% style=\"text-align:center;\">\"A robot looks at a distant cyberpunk city captured with a pull back cinematic shot.\" \u003C\u002Fbr> seed: 9342597\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">\"A woman looks at a distant erupting volcano captured with a pull back cinematic shot.\" \u003C\u002Fbr> seed: 4197508\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">\"A fireman in the burning forest captured with an orbit cinematic shot.\"  \u003C\u002Fbr> seed: 8450300\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">\"A spaceman on the moon captured with an orbit cinematic shot.\"  \u003C\u002Fbr> seed: 5899496\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003C\u002Ftable>\n\n\nMore cinematic shots, to be continued ...\n\n## MotionDirector for Image Animation \u003Ca name=\"MotionDirector_for_Image_Animation\">\u003C\u002Fa>\n### Train\nTrain the spatial path with the reference image.\n```bash\npython MotionDirector_train.py --config .\u002Fconfigs\u002Fconfig_single_image.yaml\n```\nThen train the temporal path to learn the motion in the reference 
video.\n```bash\npython MotionDirector_train.py --config .\u002Fconfigs\u002Fconfig_single_video.yaml\n```\n\n### Inference\nRun inference with the spatial path learned from the reference image and the temporal path learned from the reference video.\n```bash\npython MotionDirector_inference_multi.py --model \u002Fpath\u002Fto\u002Fthe\u002Ffoundation\u002Fmodel  --prompt \"Your prompt\" --spatial_path_folder \u002Fpath\u002Fto\u002Fthe\u002Ftrained\u002FMotionDirector\u002Fspatial\u002Flora\u002F --temporal_path_folder \u002Fpath\u002Fto\u002Fthe\u002Ftrained\u002FMotionDirector\u002Ftemporal\u002Flora\u002F --noise_prior 0.\n```\n### Example\nDownload the pre-trained weights.\n```bash\ngit clone https:\u002F\u002Fhuggingface.co\u002Fruizhaocv\u002FMotionDirector .\u002Foutputs\n```\nRun the following command.\n```bash\npython MotionDirector_inference_multi.py --model \u002Fpath\u002Fto\u002Fthe\u002FZeroScope  --prompt \"A car is running on the road.\" --spatial_path_folder .\u002Foutputs\u002Ftrain\u002Fimage_animation\u002Ftrain_2023-12-26T14-37-16\u002Fcheckpoint-300\u002Fspatial\u002Flora\u002F --temporal_path_folder .\u002Foutputs\u002Ftrain\u002Fimage_animation\u002Ftrain_2023-12-26T13-08-20\u002Fcheckpoint-300\u002Ftemporal\u002Flora\u002F --noise_prior 0.5 --seed 5057764\n```\n\u003Ctable class=\"center\">\n\u003Ctr>\n  \u003Ctd style=\"text-align:center;\">\u003Cb>Reference Image\u003C\u002Fb>\u003C\u002Ftd>\n  \u003Ctd style=\"text-align:center;\">\u003Cb>Reference Video\u003C\u002Fb>\u003C\u002Ftd>\n  \u003Ctd style=\"text-align:center;\" colspan=\"2\">\u003Cb>Videos Generated by MotionDirector\u003C\u002Fb>\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n  \u003Ctd>\u003Cimg src=test_data\u002Fimg_car\u002Fcar.jpg>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fimage_animation_results\u002Fcar-turn-original.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fimage_animation_results\u002FA_car_is_running_on_the_road_5057764.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg 
src=assets\u002Fimage_animation_results\u002FA_car_is_running_on_the_road_covered_with_snow_4904543.gif>\u003C\u002Ftd>     \n\u003C\u002Ftr>\n\u003Ctr>\n  \u003Ctd width=25% style=\"text-align:center;color:gray;\">\"A car is running on the road.\"\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;color:gray;\">\"A car is running on the road.\"\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">\"A car is running on the road.\" \u003C\u002Fbr> seed: 5057764\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">\"A car is running on the road covered with snow.\" \u003C\u002Fbr> seed: 4904543\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003C\u002Ftable>\n\n\n## MotionDirector with Customized Appearance \u003Ca name=\"MotionDirector_with_Customized_Appearance\">\u003C\u002Fa>\n### Train\nTrain the spatial path with the reference images.\n```bash\npython MotionDirector_train.py --config .\u002Fconfigs\u002Fconfig_multi_images.yaml\n```\nThen train the temporal path to learn the motions in the reference videos.\n```bash\npython MotionDirector_train.py --config .\u002Fconfigs\u002Fconfig_multi_videos.yaml\n```\n\n### Inference\nRun inference with the spatial path learned from the reference images and the temporal path learned from the reference videos.\n```bash\npython MotionDirector_inference_multi.py --model \u002Fpath\u002Fto\u002Fthe\u002Ffoundation\u002Fmodel  --prompt \"Your prompt\" --spatial_path_folder \u002Fpath\u002Fto\u002Fthe\u002Ftrained\u002FMotionDirector\u002Fspatial\u002Flora\u002F --temporal_path_folder \u002Fpath\u002Fto\u002Fthe\u002Ftrained\u002FMotionDirector\u002Ftemporal\u002Flora\u002F --noise_prior 0.\n```\n### Example\nDownload the pre-trained weights.\n```bash\ngit clone https:\u002F\u002Fhuggingface.co\u002Fruizhaocv\u002FMotionDirector .\u002Foutputs\n```\nRun the following command.\n```bash\npython MotionDirector_inference_multi.py --model \u002Fpath\u002Fto\u002Fthe\u002FZeroScope  --prompt \"A Terracotta Warrior is riding a horse 
through an ancient battlefield.\" --spatial_path_folder .\u002Foutputs\u002Ftrain\u002Fcustomized_appearance\u002Fterracotta_warrior\u002Fcheckpoint-default\u002Fspatial\u002Flora --temporal_path_folder .\u002Foutputs\u002Ftrain\u002Friding_horse\u002Fcheckpoint-default\u002Ftemporal\u002Flora\u002F --noise_prior 0. --seed 1455028\n```\nResults are shown in the [table](#customize-both-appearance-and-motion-).\n\n## More results\n\nIf you have a more impressive MotionDirector or generated videos, please feel free to open an issue and share them with us. We would greatly appreciate it.\nImprovements to the code are also highly welcome.\n\nPlease refer to [Project Page](https:\u002F\u002Fshowlab.github.io\u002FMotionDirector) for more results.\n\n### Astronaut's daily life on Mars:\n\u003Ctable class=\"center\">\n\u003Ctr>\n  \u003Ctd style=\"text-align:center;\" colspan=\"4\">\u003Cb>Astronaut's daily life on Mars (Motion concepts learned by MotionDirector)\u003C\u002Fb>\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd style=\"text-align:center;\">\u003Cb>Lifting Weights\u003C\u002Fb>\u003C\u002Ftd>\n\u003Ctd style=\"text-align:center;\">\u003Cb>Playing Golf\u003C\u002Fb>\u003C\u002Ftd>\n\u003Ctd style=\"text-align:center;\">\u003Cb>Riding Horse\u003C\u002Fb>\u003C\u002Ftd>\n\u003Ctd style=\"text-align:center;\">\u003Cb>Riding Bicycle\u003C\u002Fb>\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n  \u003Ctd>\u003Cimg src=assets\u002Fastronaut_mars\u002FAn_astronaut_is_lifting_weights_on_Mars_4K_high_quailty_highly_detailed_4008521.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fastronaut_mars\u002FAstronaut_playing_golf_on_Mars_659514.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fastronaut_mars\u002FAn_astronaut_is_riding_a_horse_on_Mars_4K_high_quailty_highly_detailed_1913261.gif>\u003C\u002Ftd>              \n  \u003Ctd>\u003Cimg 
src=assets\u002Fastronaut_mars\u002FAn_astronaut_is_riding_a_bicycle_past_the_pyramids_Mars_4K_high_quailty_highly_detailed_5532778.gif>\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n  \u003Ctd width=25% style=\"text-align:center;\">\"An astronaut is lifting weights on Mars, 4K, high quailty, highly detailed.\" \u003C\u002Fbr> seed: 4008521\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">\"Astronaut playing golf on Mars\" \u003C\u002Fbr> seed: 659514\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">\"An astronaut is riding a horse on Mars, 4K, high quailty, highly detailed.\"  \u003C\u002Fbr> seed: 1913261\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">\"An astronaut is riding a bicycle past the pyramids Mars, 4K, high quailty, highly detailed.\"  \u003C\u002Fbr> seed: 5532778\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd style=\"text-align:center;\">\u003Cb>Skateboarding\u003C\u002Fb>\u003C\u002Ftd>\n\u003Ctd style=\"text-align:center;\">\u003Cb>Cinematic Shot: \"Reverse Follow\"\u003C\u002Fb>\u003C\u002Ftd>\n\u003Ctd style=\"text-align:center;\">\u003Cb>Cinematic Shot: \"Follow\"\u003C\u002Fb>\u003C\u002Ftd>\n\u003Ctd style=\"text-align:center;\">\u003Cb>Cinematic Shot: \"Orbit\"\u003C\u002Fb>\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n  \u003Ctd>\u003Cimg src=assets\u002Fastronaut_mars\u002FAn_astronaut_is_skateboarding_on_Mars_6615212.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fastronaut_mars\u002FAn_astronaut_is_walking_on_Mars_captured_with_a_reverse_follow_cinematic_shot_1224445.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fastronaut_mars\u002FAn_astronaut_is_walking_on_Mars_captured_with_a_follow_cinematic_shot_6191674.gif>\u003C\u002Ftd>              \n  \u003Ctd>\u003Cimg src=assets\u002Fastronaut_mars\u002FAn_astronaut_is_standing_on_Mars_captured_with_an_orbit_cinematic_shot_7483453.gif>\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n  \u003Ctd width=25% 
style=\"text-align:center;\">\"An astronaut is skateboarding on Mars\"\u003C\u002Fbr> seed: 6615212\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">\"An astronaut is walking on Mars captured with a reverse follow cinematic shot.\" \u003C\u002Fbr> seed: 1224445\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">\"An astronaut is walking on Mars captured with a follow cinematic shot.\" \u003C\u002Fbr> seed: 6191674\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">\"An astronaut is standing on Mars captured with an orbit cinematic shot.\" \u003C\u002Fbr> seed: 7483453\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003C\u002Ftable>\n\n## Citation\n\n\n```bibtex\n\n@article{zhao2023motiondirector,\n  title={MotionDirector: Motion Customization of Text-to-Video Diffusion Models},\n  author={Zhao, Rui and Gu, Yuchao and Wu, Jay Zhangjie and Zhang, David Junhao and Liu, Jiawei and Wu, Weijia and Keppo, Jussi and Shou, Mike Zheng},\n  journal={arXiv preprint arXiv:2310.08465},\n  year={2023}\n}\n\n```\n\n## Shoutouts\n\n- This code builds on [diffusers](https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Fdiffusers), [Tune-A-Video](https:\u002F\u002Fgithub.com\u002Fshowlab\u002FTune-A-Video) and [Text-To-Video-Finetuning](https:\u002F\u002Fgithub.com\u002FExponentialML\u002FText-To-Video-Finetuning). 
Thanks for open-sourcing!\n- Thanks to [camenduru](https:\u002F\u002Ftwitter.com\u002Fcamenduru) for the [colab demo](https:\u002F\u002Fgithub.com\u002Fcamenduru\u002FMotionDirector-colab).\n- Thanks to [yhyu13](https:\u002F\u002Fgithub.com\u002Fyhyu13) for the [Huggingface Repo](https:\u002F\u002Fhuggingface.co\u002FYhyu13\u002FMotionDirector_LoRA).\n- We would like to thank [AK(@_akhaliq)](https:\u002F\u002Ftwitter.com\u002F_akhaliq?lang=en) and the Hugging Face team for their help in setting up the online Gradio demo.\n- Thanks to [MagicAnimate](https:\u002F\u002Fgithub.com\u002Fmagic-research\u002Fmagic-animate\u002F) for the Gradio demo template.\n- Thanks to [deepbeepmeep](https:\u002F\u002Fgithub.com\u002Fdeepbeepmeep) and [XiaominLi](https:\u002F\u002Fgithub.com\u002FXiaominLi1997) for improving the code.\n","\u003Cp align=\"center\">\n\n  \u003Ch2 align=\"center\">MotionDirector: Motion Customization of Text-to-Video Diffusion Models\u003C\u002Fh2>\n  \u003Cp align=\"center\">\n    \u003Ca href=\"https:\u002F\u002Fruizhaocv.github.io\u002F\">\u003Cstrong>Rui Zhao\u003C\u002Fstrong>\u003C\u002Fa>\n    ·\n    \u003Ca href=\"https:\u002F\u002Fycgu.site\u002F\">\u003Cstrong>Yuchao Gu\u003C\u002Fstrong>\u003C\u002Fa>\n    ·\n    \u003Ca href=\"https:\u002F\u002Fzhangjiewu.github.io\u002F\">\u003Cstrong>Jay Zhangjie Wu\u003C\u002Fstrong>\u003C\u002Fa>\n    ·\n    \u003Ca href=\"https:\u002F\u002Fjunhaozhang98.github.io\u002F\u002F\">\u003Cstrong>David Junhao Zhang\u003C\u002Fstrong>\u003C\u002Fa>\n    ·\n    \u003Ca href=\"https:\u002F\u002Fjia-wei-liu.github.io\u002F\">\u003Cstrong>Jiawei Liu\u003C\u002Fstrong>\u003C\u002Fa>\n    ·\n    \u003Ca href=\"https:\u002F\u002Fweijiawu.github.io\u002F\">\u003Cstrong>Weijia Wu\u003C\u002Fstrong>\u003C\u002Fa>\n    ·\n    \u003Ca href=\"https:\u002F\u002Fwww.jussikeppo.com\u002F\">\u003Cstrong>Jussi Keppo\u003C\u002Fstrong>\u003C\u002Fa>\n    ·\n    \u003Ca href=\"https:\u002F\u002Fsites.google.com\u002Fview\u002Fshowlab\">\u003Cstrong>Mike Zheng Shou\u003C\u002Fstrong>\u003C\u002Fa>\n    \u003Cbr>\n    \u003Cbr>\n        
\u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2310.08465\">\u003Cimg src='https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2310.08465-b31b1b.svg'>\u003C\u002Fa>\n        \u003Ca href='https:\u002F\u002Fshowlab.github.io\u002FMotionDirector'>\u003Cimg src='https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FProject_Page-MotionDirector-blue'>\u003C\u002Fa>\n        \u003Ca href='https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fruizhaocv\u002FMotionDirector'>\u003Cimg src='https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F%F0%9F%A4%97%20Hugging%20Face-Spaces-yellow'>\u003C\u002Fa>\n        \u003Ca href='https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=Wq93zi8bE3U'>\u003Cimg src='https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FDemo_Video-MotionDirector-red'>\u003C\u002Fa>\n    \u003Cbr>\n    \u003Cb>Show Lab, National University of Singapore\u003C\u002Fb>\n  \u003C\u002Fp>\n\n\n\u003Cp align=\"center\">\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fshowlab_MotionDirector_readme_bd41f63b8917.gif\" width=\"1080px\"\u002F>  \n\u003Cbr>\n\u003Cem>MotionDirector can customize text-to-video diffusion models to generate videos with desired motions.\u003C\u002Fem>\n\u003C\u002Fp>\n\n## Task Definition\nMotion Customization of Text-to-Video Diffusion Models: \u003C\u002Fbr>\nGiven a set of video clips that share the same motion concept, the task of **Motion Customization** is to adapt an existing text-to-video diffusion model so that it generates diverse videos with this motion.\n\n\n## Demos\n### Demo Video:\n[![Demo Video of MotionDirector](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fshowlab_MotionDirector_readme_adf765f2f697.jpg)](https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=Wq93zi8bE3U \"Demo Video of MotionDirector\")\n\n### Customize both Appearance and Motion: \u003Ca name=\"Customize_both_Appearance_and_Motion\">\u003C\u002Fa>\n\u003Ctable class=\"center\"> \n\u003Ctr>\n  \u003Ctd style=\"text-align:center;\">\u003Cb>Reference Images or Videos\u003C\u002Fb>\u003C\u002Ftd>\n  \u003Ctd style=\"text-align:center;\" colspan=\"3\">\u003Cb>Videos Generated by MotionDirector\u003C\u002Fb>\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n  \u003Ctd>\u003Cimg src=assets\u002Fcustomized_appearance_results\u002Freference_images.png>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg 
src=assets\u002Fcustomized_appearance_results\u002FA_Terracotta_Warrior_is_riding_a_horse_through_an_ancient_battlefield_1455028.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fcustomized_appearance_results\u002FA_Terracotta_Warrior_is_playing_golf_in_front_of_the_Great_Wall_5804477.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fcustomized_appearance_results\u002FA_Terracotta_Warrior_is_walking_cross_the_ancient_army_captured_with_a_reverse_follow_cinematic_shot_653658.gif>\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n  \u003Ctd width=25% style=\"text-align:center;color:gray;\">Reference images for appearance customization: \"A Terracotta Warrior standing in front of a solid-color background.\"\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">\"A Terracotta Warrior is riding a horse through an ancient battlefield.\"\u003C\u002Fbr> seed: 1455028\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">\"A Terracotta Warrior is playing golf in front of the Great Wall.\" \u003C\u002Fbr> seed: 5804477\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">\"A Terracotta Warrior is walking cross the ancient army captured with a reverse follow cinematic shot.\" \u003C\u002Fbr> seed: 653658\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n  \u003Ctd>\u003Cimg src=assets\u002Fmulti_videos_results\u002Freference_videos.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fcustomized_appearance_results\u002FA_Terracotta_Warrior_is_riding_a_bicycle_past_an_ancient_Chinese_palace_166357.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fcustomized_appearance_results\u002FA_Terracotta_Warrior_is_lifting_weights_in_front_of_the_Great_Wall_5635982.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fcustomized_appearance_results\u002FA_Terracotta_Warrior_is_skateboarding_9033688.gif>\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n  \u003Ctd width=25% style=\"text-align:center;color:gray;\">Reference videos for motion customization: \"A person is riding a bicycle.\"\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">\"A Terracotta Warrior is riding a bicycle past an ancient Chinese palace.\"\u003C\u002Fbr> seed: 166357\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">\"A Terracotta Warrior is lifting weights in front of the Great Wall.\" \u003C\u002Fbr> seed: 5635982\u003C\u002Ftd>\n  \u003Ctd width=25% 
style=\"text-align:center;\">\"A Terracotta Warrior is skateboarding.\" \u003C\u002Fbr> seed: 9033688\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003C\u002Ftable>\n\n## News\n- [2024.02.03] [MotionDirector for AnimateDiff](https:\u002F\u002Fgithub.com\u002FExponentialML\u002FAnimateDiff-MotionDirector) is available. Thanks to [ExponentialML](https:\u002F\u002Fgithub.com\u002FExponentialML).\n- [2023.12.27] [MotionDirector with Customized Appearance](#motiondirector-with-customized-appearance-) released. Now, you can customize both appearance and motion in video generation.\n- [2023.12.27] [MotionDirector for Image Animation](#motiondirector-for-image-animation-) released.\n- [2023.12.23] MotionDirector was featured in Hugging Face's \"Spaces of the Week\" 🔥 trending list!\n- [2023.12.13] Online Gradio demo released on [Hugging Face Spaces](https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fruizhaocv\u002FMotionDirector)! Welcome to try it.\n- [2023.12.06] [MotionDirector for Sports](#motiondirector-for-sports-) released! It covers lifting weights, riding a horse, playing golf, etc.\n- [2023.12.05] [Colab demo](https:\u002F\u002Fgithub.com\u002Fcamenduru\u002FMotionDirector-colab) is available. Thanks to [Camenduru](https:\u002F\u002Ftwitter.com\u002Fcamenduru).\n- [2023.12.04] [MotionDirector for Cinematic Shots](#motiondirector-for-cinematic-shots-) released. Now, you can make AI films with professional cinematic shots!\n- [2023.12.02] Code and model weights released!\n- [2023.10.12] Paper and project page released.\n\n## ToDo\n- [x] Gradio demo\n- [ ] More trained weights of MotionDirector\n\n## Model List\n\n| Type | Training Data | Description | Link |\n| :---: | :---: | :---: | :---: |\n| MotionDirector for Sports | Multiple videos for each model. | Learns motion concepts of sports, e.g. lifting weights, riding a horse, playing golf, etc. | [Link](#motiondirector-for-sports-) |\n| MotionDirector for Cinematic Shots | A single video for each model. | Learns motion concepts of cinematic shots, e.g. dolly zoom, zoom in, zoom out, etc. | 
[Link](#motiondirector-for-cinematic-shots-) |\n| MotionDirector for Image Animation | A single image for the spatial path, and a single video or multiple videos for the temporal path. | Animates the given image with the learned motions. | [Link](#motiondirector-for-image-animation-) |\n| MotionDirector with Customized Appearance | A single image or multiple images for the spatial path, and a single video or multiple videos for the temporal path. | Customizes both appearance and motion in video generation. | [Link](#motiondirector-with-customized-appearance-) |\n\n\n\n\n## Setup\n### Requirements\n\n```shell\n# create a virtual environment\nconda create -n motiondirector python=3.8\nconda activate motiondirector\n# install the dependencies\npip install -r requirements.txt\n```\n\n### Weights of Foundation Models\n```shell\ngit lfs install\n## You can choose ModelScopeT2V, ZeroScope, etc. as the foundation model.\n## ZeroScope\ngit clone https:\u002F\u002Fhuggingface.co\u002Fcerspense\u002Fzeroscope_v2_576w .\u002Fmodels\u002Fzeroscope_v2_576w\u002F\n## ModelScopeT2V\ngit clone https:\u002F\u002Fhuggingface.co\u002Fdamo-vilab\u002Ftext-to-video-ms-1.7b .\u002Fmodels\u002Fmodel_scope\u002F\n```\n\n### Weights of Trained MotionDirector \u003Ca name=\"download_weights\">\u003C\u002Fa>\n```shell\n# Make sure git-lfs is installed (https:\u002F\u002Fgit-lfs.com)\ngit lfs install\ngit clone https:\u002F\u002Fhuggingface.co\u002Fruizhaocv\u002FMotionDirector_weights .\u002Foutputs\n\n# More and better trained MotionDirector weights are released in a new repo:\ngit clone https:\u002F\u002Fhuggingface.co\u002Fruizhaocv\u002FMotionDirector .\u002Foutputs\n# The usage is slightly different; instructions will be updated later.\n```\n\n## Usage\n### Training\n\n#### Train MotionDirector on multiple videos:\n```bash\npython MotionDirector_train.py --config .\u002Fconfigs\u002Fconfig_multi_videos.yaml\n```\n#### Train MotionDirector on a single video:\n```bash\npython MotionDirector_train.py --config .\u002Fconfigs\u002Fconfig_single_video.yaml\n```\n\nNote:  \n- Before running the above commands, \nmake sure you replace the paths to the foundation model weights and the training data with your own in the config file `config_multi_videos.yaml` or `config_single_video.yaml`.\n- Generally, training on multiple 16-frame videos takes `300~500` steps, about `9~16` minutes on one A5000 GPU. Training on a single video takes `50~150` steps, about `1.5~4.5` minutes on one A5000 GPU. Training requires around `14GB` of VRAM.\n- If your GPU memory is limited, reduce `n_sample_frames`.\n- For better results, reduce the learning rate and increase the number of training steps.\n\n\n### Inference\n```bash\npython MotionDirector_inference.py --model 
\u002Fpath\u002Fto\u002Fthe\u002Ffoundation\u002Fmodel  --prompt \"Your prompt\" --checkpoint_folder \u002Fpath\u002Fto\u002Fthe\u002Ftrained\u002FMotionDirector --checkpoint_index 300 --noise_prior 0.\n```\nNote: \n- Replace `\u002Fpath\u002Fto\u002Fthe\u002Ffoundation\u002Fmodel` with your own path to the foundation model, such as ZeroScope.\n- The value of `checkpoint_index` selects which checkpoint, saved at that training step, is used for inference.\n- The value of `noise_prior` controls how much the inversion noise of the reference video affects the generation. \nWe recommend setting it to `0` for a MotionDirector trained on multiple videos, to achieve the highest diversity, and to `0.1~0.5` for a MotionDirector trained on a single video, for faster convergence and better alignment with the reference video.\n\n\n## Inference with pre-trained MotionDirector\nAll available weights are in the official [Huggingface Repo](https:\u002F\u002Fhuggingface.co\u002Fruizhaocv\u002FMotionDirector_weights). Run the [download command](#download_weights) to download the weights into the `outputs` folder, then run the following inference commands to generate videos.\n\n### MotionDirector trained on multiple videos:\n```bash\npython MotionDirector_inference.py --model \u002Fpath\u002Fto\u002Fthe\u002FZeroScope  --prompt \"A person is riding a bicycle past the Eiffel Tower.\" --checkpoint_folder .\u002Foutputs\u002Ftrain\u002Friding_bicycle\u002F --checkpoint_index 300 --noise_prior 0. --seed 7192280\n```\nNote:  \n- Replace `\u002Fpath\u002Fto\u002Fthe\u002FZeroScope` with your own path to the foundation model, i.e. ZeroScope.\n- Change the `prompt` to generate different videos. \n- The `seed` is random by default. Setting it to a specific value reproduces the corresponding result shown in the table below.\n\nResults:\n\n\u003Ctable class=\"center\">\n\u003Ctr>\n  \u003Ctd style=\"text-align:center;\">\u003Cb>Reference Videos\u003C\u002Fb>\u003C\u002Ftd>\n  \u003Ctd style=\"text-align:center;\" colspan=\"3\">\u003Cb>Videos Generated by MotionDirector\u003C\u002Fb>\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n  \u003Ctd>\u003Cimg src=assets\u002Fmulti_videos_results\u002Freference_videos.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fmulti_videos_results\u002FA_person_is_riding_a_bicycle_past_the_Eiffel_Tower_7192280.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fmulti_videos_results\u002FA_panda_is_riding_a_bicycle_in_a_garden_2178639.gif>\u003C\u002Ftd>              \n  \u003Ctd>\u003Cimg src=assets\u002Fmulti_videos_results\u002FAn_alien_is_riding_a_bicycle_on_Mars_2390886.gif>\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n  \u003Ctd width=25% 
style=\"text-align:center;color:gray;\">“一个人正在骑自行车。”\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">“一个人正骑着自行车经过埃菲尔铁塔。” \u003C\u002Fbr> 种子：7192280\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">“一只熊猫正在花园里骑自行车。”  \u003C\u002Fbr> 种子：2178639\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">“一个外星人正骑着自行车在火星上行驶。”  \u003C\u002Fbr> 种子：2390886\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003C\u002Ftable>\n\n### 在单段视频上训练的 MotionDirector：\n16 帧：\n```bash\npython MotionDirector_inference.py --model \u002Fpath\u002Fto\u002Fthe\u002FZeroScope  --prompt \"A tank is running on the moon.\" --checkpoint_folder .\u002Foutputs\u002Ftrain\u002Fcar_16\u002F --checkpoint_index 150 --noise_prior 0.5 --seed 8551187\n```\n\u003Ctable class=\"center\">\n\u003Ctr>\n  \u003Ctd style=\"text-align:center;\">\u003Cb>参考视频\u003C\u002Fb>\u003C\u002Ftd>\n  \u003Ctd style=\"text-align:center;\" colspan=\"3\">\u003Cb>MotionDirector 生成的视频\u003C\u002Fb>\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n  \u003Ctd>\u003Cimg src=assets\u002Fsingle_video_results\u002Freference_video.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fsingle_video_results\u002FA_tank_is_running_on_the_moon_8551187.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fsingle_video_results\u002FA_lion_is_running_past_the_pyramids_431554.gif>\u003C\u002Ftd>              \n  \u003Ctd>\u003Cimg src=assets\u002Fsingle_video_results\u002FA_spaceship_is_flying_past_Mars_8808231.gif>\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n  \u003Ctd width=25% style=\"text-align:center;color:gray;\">“一辆汽车正在公路上行驶。”\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">“一辆坦克正在月球上行驶。” \u003C\u002Fbr> 种子：8551187\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">“一头狮子正跑过金字塔。” \u003C\u002Fbr> 种子：431554\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">“一艘宇宙飞船正飞越火星。”  \u003C\u002Fbr> 种子：8808231\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003C\u002Ftable>\n\n24 帧：\n```bash\npython 
MotionDirector_inference.py --model \u002Fpath\u002Fto\u002Fthe\u002FZeroScope  --prompt \"A truck is running past the Arc de Triomphe.\" --checkpoint_folder .\u002Foutputs\u002Ftrain\u002Fcar_24\u002F --checkpoint_index 150 --noise_prior 0.5 --width 576 --height 320 --num-frames 24 --seed 34543\n```\n\u003Ctable class=\"center\">\n\u003Ctr>\n  \u003Ctd style=\"text-align:center;\">\u003Cb>参考视频\u003C\u002Fb>\u003C\u002Ftd>\n  \u003Ctd style=\"text-align:center;\" colspan=\"3\">\u003Cb>MotionDirector 生成的视频\u003C\u002Fb>\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n  \u003Ctd>\u003Cimg src=assets\u002Fsingle_video_results\u002F24_frames\u002Freference_video.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fsingle_video_results\u002F24_frames\u002FA_truck_is_running_past_the_Arc_de_Triomphe_34543.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fsingle_video_results\u002F24_frames\u002FAn_elephant_is_running_in_a_forest_2171736.gif>\u003C\u002Ftd>              \n\u003C\u002Ftr>\n\u003Ctr>\n  \u003Ctd width=25% style=\"text-align:center;color:gray;\">“一辆汽车正在公路上行驶。”\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">“一辆卡车正驶过凯旋门。” \u003C\u002Fbr> 种子：34543\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">“一头大象正在森林里奔跑。” \u003C\u002Fbr> 种子：2171736\u003C\u002Ftd>\n \u003C\u002Ftr>\n\u003Ctr>\n  \u003Ctd>\u003Cimg src=assets\u002Fsingle_video_results\u002F24_frames\u002Freference_video.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fsingle_video_results\u002F24_frames\u002FA_person_on_a_camel_is_running_past_the_pyramids_4904126.gif>\u003C\u002Ftd>              \n  \u003Ctd>\u003Cimg src=assets\u002Fsingle_video_results\u002F24_frames\u002FA_spacecraft_is_flying_past_the_Milky_Way_galaxy_3235677.gif>\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n  \u003Ctd width=25% style=\"text-align:center;color:gray;\">“一辆汽车正在公路上行驶。”\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">“一位骑骆驼的人正经过金字塔。” \u003C\u002Fbr> 种子：4904126\u003C\u002Ftd>\n  \u003Ctd 
width=25% style=\"text-align:center;\">“一艘航天器正飞越银河系。”  \u003C\u002Fbr> 种子：3235677\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003C\u002Ftable>\n\n## 用于体育运动的 MotionDirector \u003Ca name=\"MotionDirector_for_Sports\">\u003C\u002Fa>\n\n```bash\npython MotionDirector_inference.py --model \u002Fpath\u002Fto\u002Fthe\u002FZeroScope  --prompt \"A panda is lifting weights in a garden.\" --checkpoint_folder .\u002Foutputs\u002Ftrain\u002Flifting_weights\u002F --checkpoint_index 300 --noise_prior 0. --seed 9365597\n```\n\n\u003Ctable class=\"center\">\n\u003Ctr>\n  \u003Ctd style=\"text-align:center;\" colspan=\"4\">\u003Cb>MotionDirector 生成的视频\u003C\u002Fb>\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd style=\"text-align:center;\" colspan=\"2\">\u003Cb>举重\u003C\u002Fb>\u003C\u002Ftd>\n\u003Ctd style=\"text-align:center;\" colspan=\"2\">\u003Cb>骑自行车\u003C\u002Fb>\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n  \u003Ctd>\u003Cimg src=assets\u002Fsports_results\u002Flifting_weights\u002FA_panda_is_lifting_weights_in_a_garden_1699276.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fsports_results\u002Flifting_weights\u002FA_police_officer_is_lifting_weights_in_front_of_the_police_station_6804745.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fmulti_videos_results\u002FA_panda_is_riding_a_bicycle_in_a_garden_2178639.gif>\u003C\u002Ftd>              \n  \u003Ctd>\u003Cimg src=assets\u002Fmulti_videos_results\u002FAn_alien_is_riding_a_bicycle_on_Mars_2390886.gif>\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n  \u003Ctd width=25% style=\"text-align:center;\">“一只熊猫正在花园里举重。” \u003C\u002Fbr> 种子：1699276\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">“一名警察正在警局前举重。” \u003C\u002Fbr> 种子：6804745\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">“一只熊猫正在花园里骑自行车。”  \u003C\u002Fbr> 种子：\u003Cs>2178639\u003C\u002Fs> \u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">“一个外星人正骑着自行车在火星上行驶。”  \u003C\u002Fbr> 
种子：2390886\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd style=\"text-align:center;\" colspan=\"2\">\u003Cb>骑马\u003C\u002Fb>\u003C\u002Ftd>\n\u003Ctd style=\"text-align:center;\" colspan=\"2\">\u003Cb>骑马\u003C\u002Fb>\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n  \u003Ctd>\u003Cimg src=assets\u002Fsports_results\u002Friding_horse\u002FA_knight_riding_on_horseback_passing_by_a_castle_6491893.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fsports_results\u002Friding_horse\u002FA_man_riding_an_elephant_through_the_jungle_6230765.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fsports_results\u002Friding_horse\u002FA_girl_riding_a_unicorn_galloping_under_the_moonlight_6940542.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fsports_results\u002Friding_horse\u002FAn_adventurer_riding_a_dinosaur_exploring_through_the_rainforest_6972276.gif>\u003C\u002Ftd> \n\u003C\u002Ftr>\n\u003Ctr>\n  \u003Ctd width=25% style=\"text-align:center;\">“一名骑士骑马经过城堡。” \u003C\u002Fbr> 种子：6491893\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">“一名男子骑着大象穿越丛林。” \u003C\u002Fbr> 种子：6230765\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">“一名女孩骑着独角兽在月光下奔驰。”  \u003C\u002Fbr> 种子：6940542\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">“一名冒险家骑着恐龙探索雨林。”  \u003C\u002Fbr> 种子：6972276\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd style=\"text-align:center;\" colspan=\"2\">\u003Cb>滑板\u003C\u002Fb>\u003C\u002Ftd>\n\u003Ctd style=\"text-align:center;\" colspan=\"2\">\u003Cb>打高尔夫\u003C\u002Fb>\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n  \u003Ctd>\u003Cimg src=assets\u002Fsports_results\u002Fskateboarding\u002FA_robot_is_skateboarding_in_a_cyberpunk_city_1020673.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fsports_results\u002Fskateboarding\u002FA_teddy_bear_skateboarding_in_Times_Square_New_York_3306353.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg 
src=assets\u002Fsports_results\u002Fplaying_golf\u002FA_man_is_playing_golf_in_front_of_the_White_House_8870450.gif>\u003C\u002Ftd>              \n  \u003Ctd>\u003Cimg src=assets\u002Fsports_results\u002Fplaying_golf\u002FA_monkey_is_playing_golf_on_a_field_full_of_flowers_2989633.gif>\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n  \u003Ctd width=25% style=\"text-align:center;\">“一个机器人正在赛博朋克城市里玩滑板。” \u003C\u002Fbr> 种子：1020673\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">“一只泰迪熊正在纽约时代广场玩滑板。” \u003C\u002Fbr> 种子：3306353\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">“一名男子正在白宫前打高尔夫。”  \u003C\u002Fbr> 种子：8870450\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">“一只猴子正在开满鲜花的田野上打高尔夫。”  \u003C\u002Fbr> 种子：2989633\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003C\u002Ftable>\n\n更多体育运动，敬请期待……\n\n## 用于电影镜头的 MotionDirector \u003Ca name=\"MotionDirector_for_Cinematic_Shots\">\u003C\u002Fa>\n\n### 1. 变焦\n#### 1.1 多莉变焦（希区柯克式变焦）\n```bash\npython MotionDirector_inference.py --model \u002Fpath\u002Fto\u002Fthe\u002FZeroScope  --prompt \"A firefighter standing in front of a burning forest captured with a dolly zoom.\" --checkpoint_folder .\u002Foutputs\u002Ftrain\u002Fdolly_zoom\u002F --checkpoint_index 150 --noise_prior 0.5 --seed 9365597\n```\n\u003Ctable class=\"center\">\n\u003Ctr>\n  \u003Ctd style=\"text-align:center;\">\u003Cb>参考视频\u003C\u002Fb>\u003C\u002Ftd>\n  \u003Ctd style=\"text-align:center;\" colspan=\"3\">\u003Cb>MotionDirector 生成的视频\u003C\u002Fb>\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n  \u003Ctd>\u003Cimg src=assets\u002Fcinematic_shots_results\u002Fdolly_zoom_16.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fcinematic_shots_results\u002FA_firefighter_standing_in_front_of_a_burning_forest_captured_with_a_dolly_zoom_9365597.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fcinematic_shots_results\u002FA_lion_sitting_on_top_of_a_cliff_captured_with_a_dolly_zoom_1675932.gif>\u003C\u002Ftd>              \n  \u003Ctd>\u003Cimg 
src=assets\u002Fcinematic_shots_results\u002FA_Roman_soldier_standing_in_front_of_the_Colosseum_captured_with_a_dolly_zoom_2310805.gif>\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n  \u003Ctd width=25% style=\"text-align:center;color:gray;\">“一个男人站在房间里，采用多莉变焦拍摄。”\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">“一名消防员站在燃烧的森林前，采用多莉变焦拍摄。” \u003C\u002Fbr> 种子：9365597 \u003C\u002Fbr> 噪声先验：0.5\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">“一只狮子坐在悬崖顶上，采用多莉变焦拍摄。” \u003C\u002Fbr> 种子：1675932 \u003C\u002Fbr> 噪声先验：0.5\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">“一名罗马士兵站在斗兽场前，采用多莉变焦拍摄。”  \u003C\u002Fbr> 种子：2310805 \u003C\u002Fbr> 噪声先验：0.5 \u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n  \u003Ctd>\u003Cimg src=assets\u002Fcinematic_shots_results\u002Fdolly_zoom_16.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fcinematic_shots_results\u002FA_firefighter_standing_in_front_of_a_burning_forest_captured_with_a_dolly_zoom_4615820.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fcinematic_shots_results\u002FA_lion_sitting_on_top_of_a_cliff_captured_with_a_dolly_zoom_4114896.gif>\u003C\u002Ftd>              \n  \u003Ctd>\u003Cimg src=assets\u002Fcinematic_shots_results\u002FA_Roman_soldier_standing_in_front_of_the_Colosseum_captured_with_a_dolly_zoom_7492004.gif>\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n  \u003Ctd width=25% style=\"text-align:center;color:gray;\">“一个男人站在房间里，采用多莉变焦拍摄。”\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">“一名消防员站在燃烧的森林前，采用多莉变焦拍摄。” \u003C\u002Fbr> 种子：4615820 \u003C\u002Fbr> 噪声先验：0.3\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">“一只狮子坐在悬崖顶上，采用多莉变焦拍摄。” \u003C\u002Fbr> 种子：4114896 \u003C\u002Fbr> 噪声先验：0.3\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">“一名罗马士兵站在斗兽场前，采用多莉变焦拍摄。”  \u003C\u002Fbr> 种子：7492004\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003C\u002Ftable>\n\n#### 1.2 
变焦推近\n参考视频是用我自己的水杯拍摄的。你也可以拿起自己的杯子或其他任何物体来练习摄像机运动，并把它变成充满想象力的视频。用自定义的摄像机运动创作属于你的 AI 电影吧！\n\n```bash\npython MotionDirector_inference.py --model \u002Fpath\u002Fto\u002Fthe\u002FZeroScope  --prompt \"A firefighter standing in front of a burning forest captured with a zoom in.\" --checkpoint_folder .\u002Foutputs\u002Ftrain\u002Fzoom_in\u002F --checkpoint_index 150 --noise_prior 0.3 --seed 1429227\n```\n\u003Ctable class=\"center\">\n\u003Ctr>\n  \u003Ctd style=\"text-align:center;\">\u003Cb>参考视频\u003C\u002Fb>\u003C\u002Ftd>\n  \u003Ctd style=\"text-align:center;\" colspan=\"3\">\u003Cb>MotionDirector 生成的视频\u003C\u002Fb>\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n  \u003Ctd>\u003Cimg src=assets\u002Fcinematic_shots_results\u002Fzoom_in_16.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fcinematic_shots_results\u002FA_firefighter_standing_in_front_of_a_burning_forest_captured_with_a_zoom_in_1429227.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fcinematic_shots_results\u002FA_lion_sitting_on_top_of_a_cliff_captured_with_a_zoom_in_487239.gif>\u003C\u002Ftd>              \n  \u003Ctd>\u003Cimg src=assets\u002Fcinematic_shots_results\u002FA_Roman_soldier_standing_in_front_of_the_Colosseum_captured_with_a_zoom_in_1393184.gif>\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n  \u003Ctd width=25% style=\"text-align:center;color:gray;\">“实验室里的一个杯子，采用变焦推近拍摄。”\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">“一名消防员站在燃烧的森林前，采用变焦推近拍摄。” \u003C\u002Fbr> 种子：1429227\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">“一只狮子坐在悬崖顶上，采用变焦推近拍摄。” \u003C\u002Fbr> 种子：487239 \u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">“一名罗马士兵站在斗兽场前，采用变焦推近拍摄。”  \u003C\u002Fbr> 种子：1393184\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003C\u002Ftable>\n\n#### 1.3 变焦拉远\n```bash\npython MotionDirector_inference.py --model \u002Fpath\u002Fto\u002Fthe\u002FZeroScope  --prompt \"A firefighter standing in front of a burning forest captured with a zoom out.\" --checkpoint_folder .\u002Foutputs\u002Ftrain\u002Fzoom_out\u002F --checkpoint_index 150 --noise_prior 0.3 
--seed 4971910\n```\n\u003Ctable class=\"center\">\n\u003Ctr>\n  \u003Ctd style=\"text-align:center;\">\u003Cb>参考视频\u003C\u002Fb>\u003C\u002Ftd>\n  \u003Ctd style=\"text-align:center;\" colspan=\"3\">\u003Cb>MotionDirector 生成的视频\u003C\u002Fb>\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n  \u003Ctd>\u003Cimg src=assets\u002Fcinematic_shots_results\u002Fzoom_out_16.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fcinematic_shots_results\u002FA_firefighter_standing_in_front_of_a_burning_forest_captured_with_a_zoom_out_4971910.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fcinematic_shots_results\u002FA_lion_sitting_on_top_of_a_cliff_captured_with_a_zoom_out_1767994.gif>\u003C\u002Ftd>              \n  \u003Ctd>\u003Cimg src=assets\u002Fcinematic_shots_results\u002FA_Roman_soldier_standing_in_front_of_the_Colosseum_captured_with_a_zoom_out_8203639.gif>\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n  \u003Ctd width=25% style=\"text-align:center;color:gray;\">“实验室里的一个杯子，采用变焦拉远拍摄。”\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">“一名消防员站在燃烧的森林前，采用变焦拉远拍摄。” \u003C\u002Fbr> 种子：4971910\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">“一只狮子坐在悬崖顶上，采用变焦拉远拍摄。” \u003C\u002Fbr> 种子：1767994 \u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">“一名罗马士兵站在斗兽场前，采用变焦拉远拍摄。”  \u003C\u002Fbr> 种子：8203639\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003C\u002Ftable>\n\n### 2. 
高级电影镜头\n\n\u003Ctable class=\"center\">\n\u003Ctr>\n\u003Ctd style=\"text-align:center;\" colspan=\"2\">\u003Cb>跟随镜头\u003C\u002Fb>\u003C\u002Ftd>\n\u003Ctd style=\"text-align:center;\" colspan=\"2\">\u003Cb>反向跟随镜头\u003C\u002Fb>\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n  \u003Ctd>\u003Cimg src=assets\u002Fcinematic_shots_results\u002Fmore_results\u002FA_fireman_is_walking_through_fire_captured_with_a_follow_cinematic_shot_4926511.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fcinematic_shots_results\u002Fmore_results\u002FA_spaceman_is_walking_on_the_moon_with_a_follow_cinematic_shot_7594623.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fcinematic_shots_results\u002Fmore_results\u002FA_fireman_is_walking_through_fire_captured_with_a_reverse_follow_cinematic_shot_9759630.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fcinematic_shots_results\u002Fmore_results\u002FA_spaceman_walking_on_the_moon_captured_with_a_reverse_follow_cinematic_shot_4539309.gif>\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n  \u003Ctd width=25% style=\"text-align:center;\">“一名消防员正在火海中行走，由跟随式电影镜头捕捉。” \u003C\u002Fbr> 种子：4926511\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">“一名宇航员正在月球上行走，由跟随式电影镜头拍摄。” \u003C\u002Fbr> 种子：7594623\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">“一名消防员正在火海中行走，由反向跟随式电影镜头捕捉。”  \u003C\u002Fbr> 种子：9759630\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">“一名宇航员在月球上行走，由反向跟随式电影镜头拍摄。”  \u003C\u002Fbr> 种子：4539309\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd style=\"text-align:center;\" colspan=\"2\">\u003Cb>胸部过渡镜头\u003C\u002Fb>\u003C\u002Ftd>\n\u003Ctd style=\"text-align:center;\" colspan=\"2\">\u003Cb>迷你摇臂揭幕：从脚到头的镜头\u003C\u002Fb>\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n  \u003Ctd>\u003Cimg src=assets\u002Fcinematic_shots_results\u002Fmore_results\u002FA_fireman_is_walking_through_the_burning_forest_captured_with_a_chest_transition_cinematic_shot_5236349.gif>\u003C\u002Ftd>\n  
\u003Ctd>\u003Cimg src=assets\u002Fcinematic_shots_results\u002Fmore_results\u002FAn_ancient_Roman_soldier_walks_through_the_crowd_on_the_street_captured_with_a_chest_transition_cinematic_shot_3982271.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fcinematic_shots_results\u002Fmore_results\u002FAn_ancient_Roman_soldier_walks_through_the_crowd_on_the_street_captured_with_a_mini_jib_reveal_cinematic_shot_654178.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fcinematic_shots_results\u002Fmore_results\u002FA_British_Redcoat_soldier_is_walking_through_the_mountains_captured_with_a_mini_jib_reveal_cinematic_shot_566917.gif>\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n  \u003Ctd width=25% style=\"text-align:center;\">“一名消防员正在燃烧的森林中行走，由胸部过渡式电影镜头拍摄。” \u003C\u002Fbr> 种子：5236349\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">“一名古罗马士兵在街道人群中穿行，由胸部过渡式电影镜头拍摄。” \u003C\u002Fbr> 种子：3982271\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">“一名古罗马士兵在街道人群中穿行，由迷你摇臂揭幕式电影镜头拍摄。”  \u003C\u002Fbr> 种子：654178\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">“一名英国红衣士兵正在山间行走，由迷你摇臂揭幕式电影镜头拍摄。”  \u003C\u002Fbr> 种子：566917\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd style=\"text-align:center;\" colspan=\"2\">\u003Cb>拉远镜头：主体从左侧进入\u003C\u002Fb>\u003C\u002Ftd>\n\u003Ctd style=\"text-align:center;\" colspan=\"2\">\u003Cb>环绕镜头\u003C\u002Fb>\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n  \u003Ctd>\u003Cimg src=assets\u002Fcinematic_shots_results\u002Fmore_results\u002FA_robot_looks_at_a_distant_cyberpunk_city_captured_with_a_pull_back_cinematic_shot_9342597.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fcinematic_shots_results\u002Fmore_results\u002FA_woman_looks_at_a_distant_erupting_volcano_captured_with_a_pull_back_cinematic_shot_4197508.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg 
src=assets\u002Fcinematic_shots_results\u002Fmore_results\u002FA_fireman_in_the_burning_forest_captured_with_an_orbit_cinematic_shot_8450300.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fcinematic_shots_results\u002Fmore_results\u002FA_spaceman_on_the_moon_captured_with_an_orbit_cinematic_shot_5899496.gif>\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n  \u003Ctd width=25% style=\"text-align:center;\">“一个机器人凝视着远处的赛博朋克城市，由拉远式电影镜头拍摄。” \u003C\u002Fbr> 种子：9342597\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">“一位女性凝视着远处喷发的火山，由拉远式电影镜头拍摄。” \u003C\u002Fbr> 种子：4197508\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">“一名消防员身处燃烧的森林中，由环绕式电影镜头拍摄。”  \u003C\u002Fbr> 种子：8450300\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">“一名宇航员在月球上，由环绕式电影镜头拍摄。”  \u003C\u002Fbr> 种子：5899496\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003C\u002Ftable>\n\n\n更多电影镜头，敬请期待……\n\n## 用于图像动画的 MotionDirector \u003Ca name=\"MotionDirector_for_Image_Animation\">\u003C\u002Fa>\n### 训练\n使用参考图像训练空间路径。\n```bash\npython MotionDirector_train.py --config .\u002Fconfigs\u002Fconfig_single_image.yaml\n```\n然后训练时间路径，以学习参考视频中的运动。\n```bash\npython MotionDirector_train.py --config .\u002Fconfigs\u002Fconfig_single_video.yaml\n```\n\n### 推理\n结合从参考图像中学到的空间路径和从参考视频中学到的时间路径进行推理。\n```bash\npython MotionDirector_inference_multi.py --model \u002Fpath\u002Fto\u002Fthe\u002Ffoundation\u002Fmodel  --prompt \"Your prompt\" --spatial_path_folder \u002Fpath\u002Fto\u002Fthe\u002Ftrained\u002FMotionDirector\u002Fspatial\u002Flora\u002F --temporal_path_folder \u002Fpath\u002Fto\u002Fthe\u002Ftrained\u002FMotionDirector\u002Ftemporal\u002Flora\u002F --noise_prior 0.\n```\n### 示例\n下载预训练权重。\n```bash\ngit clone https:\u002F\u002Fhuggingface.co\u002Fruizhaocv\u002FMotionDirector .\u002Foutputs\n```\n运行以下命令。\n```bash\npython MotionDirector_inference_multi.py --model \u002Fpath\u002Fto\u002Fthe\u002FZeroScope  --prompt \"A car is running on the road.\" --spatial_path_folder 
.\u002Foutputs\u002Ftrain\u002Fimage_animation\u002Ftrain_2023-12-26T14-37-16\u002Fcheckpoint-300\u002Fspatial\u002Flora\u002F --temporal_path_folder .\u002Foutputs\u002Ftrain\u002Fimage_animation\u002Ftrain_2023-12-26T13-08-20\u002Fcheckpoint-300\u002Ftemporal\u002Flora\u002F --noise_prior 0.5 --seed 5057764\n```\n\u003Ctable class=\"center\">\n\u003Ctr>\n  \u003Ctd style=\"text-align:center;\">\u003Cb>参考图像\u003C\u002Fb>\u003C\u002Ftd>\n  \u003Ctd style=\"text-align:center;\">\u003Cb>参考视频\u003C\u002Fb>\u003C\u002Ftd>\n  \u003Ctd style=\"text-align:center;\" colspan=\"2\">\u003Cb>MotionDirector 生成的视频\u003C\u002Fb>\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n  \u003Ctd>\u003Cimg src=test_data\u002Fimg_car\u002Fcar.jpg>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fimage_animation_results\u002Fcar-turn-original.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fimage_animation_results\u002FA_car_is_running_on_the_road_5057764.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fimage_animation_results\u002FA_car_is_running_on_the_road_covered_with_snow_4904543.gif>\u003C\u002Ftd>     \n\u003C\u002Ftr>\n\u003Ctr>\n  \u003Ctd width=25% style=\"text-align:center;color:gray;\">“一辆汽车正在公路上行驶。”\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;color:gray;\">“一辆汽车正在公路上行驶。”\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">“一辆汽车正在公路上行驶。” \u003C\u002Fbr> 种子：5057764\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">“一辆汽车正在被雪覆盖的公路上行驶。” \u003C\u002Fbr> 种子：4904543\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003C\u002Ftable>\n\n\n## 自定义外观的 MotionDirector \u003Ca name=\"MotionDirector_with_Customized_Appearance\">\u003C\u002Fa>\n\n### 训练\n使用参考图像训练空间路径。\n```bash\npython MotionDirector_train.py --config .\u002Fconfigs\u002Fconfig_multi_images.yaml\n```\n然后训练时间路径，以学习参考视频中的运动。\n```bash\npython MotionDirector_train.py --config .\u002Fconfigs\u002Fconfig_multi_videos.yaml\n```\n\n### 推理\n结合从参考图像中学到的空间路径和从参考视频中学到的时间路径进行推理。\n```bash\npython 
MotionDirector_inference_multi.py --model \u002Fpath\u002Fto\u002Fthe\u002Ffoundation\u002Fmodel  --prompt \"Your prompt\" --spatial_path_folder \u002Fpath\u002Fto\u002Fthe\u002Ftrained\u002FMotionDirector\u002Fspatial\u002Flora\u002F --temporal_path_folder \u002Fpath\u002Fto\u002Fthe\u002Ftrained\u002FMotionDirector\u002Ftemporal\u002Flora\u002F --noise_prior 0.\n```\n### 示例\n下载预训练权重。\n```bash\ngit clone https:\u002F\u002Fhuggingface.co\u002Fruizhaocv\u002FMotionDirector .\u002Foutputs\n```\n运行以下命令。\n```bash\npython MotionDirector_inference_multi.py --model \u002Fpath\u002Fto\u002Fthe\u002FZeroScope  --prompt \"A Terracotta Warrior is riding a horse through an ancient battlefield.\" --spatial_path_folder .\u002Foutputs\u002Ftrain\u002Fcustomized_appearance\u002Fterracotta_warrior\u002Fcheckpoint-default\u002Fspatial\u002Flora --temporal_path_folder .\u002Foutputs\u002Ftrain\u002Friding_horse\u002Fcheckpoint-default\u002Ftemporal\u002Flora\u002F --noise_prior 0. --seed 1455028\n```\n结果如[表格](#customize-both-appearance-and-motion-)所示。\n\n## 更多结果\n\n如果你训练出了效果更好的 MotionDirector，或生成了更精彩的视频，欢迎提交 issue 与我们分享，我们将不胜感激。同时，也非常欢迎对代码进行改进。\n\n更多结果请参阅 [项目页面](https:\u002F\u002Fshowlab.github.io\u002FMotionDirector)。\n\n### 火星上的宇航员日常生活：\n\u003Ctable class=\"center\">\n\u003Ctr>\n  \u003Ctd style=\"text-align:center;\" colspan=\"4\">\u003Cb>火星上的宇航员日常生活（由 MotionDirector 学习的运动概念）\u003C\u002Fb>\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd style=\"text-align:center;\">\u003Cb>举重\u003C\u002Fb>\u003C\u002Ftd>\n\u003Ctd style=\"text-align:center;\">\u003Cb>打高尔夫球\u003C\u002Fb>\u003C\u002Ftd>\n\u003Ctd style=\"text-align:center;\">\u003Cb>骑马\u003C\u002Fb>\u003C\u002Ftd>\n\u003Ctd style=\"text-align:center;\">\u003Cb>骑自行车\u003C\u002Fb>\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n  \u003Ctd>\u003Cimg src=assets\u002Fastronaut_mars\u002FAn_astronaut_is_lifting_weights_on_Mars_4K_high_quailty_highly_detailed_4008521.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg 
src=assets\u002Fastronaut_mars\u002FAstronaut_playing_golf_on_Mars_659514.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fastronaut_mars\u002FAn_astronaut_is_riding_a_horse_on_Mars_4K_high_quailty_highly_detailed_1913261.gif>\u003C\u002Ftd>              \n  \u003Ctd>\u003Cimg src=assets\u002Fastronaut_mars\u002FAn_astronaut_is_riding_a_bicycle_past_the_pyramids_Mars_4K_high_quailty_highly_detailed_5532778.gif>\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n  \u003Ctd width=25% style=\"text-align:center;\">“一名宇航员正在火星上举重，4K，高质量，高度细节化。” \u003C\u002Fbr> 种子：4008521\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">“宇航员在火星上打高尔夫球” \u003C\u002Fbr> 种子：659514\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">“一名宇航员正在火星上骑马，4K，高质量，高度细节化。”  \u003C\u002Fbr> 种子：1913261\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">“一名宇航员正在火星上骑自行车经过金字塔，4K，高质量，高度细节化。”  \u003C\u002Fbr> 种子：5532778\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd style=\"text-align:center;\">\u003Cb>滑板\u003C\u002Fb>\u003C\u002Ftd>\n\u003Ctd style=\"text-align:center;\">\u003Cb>电影镜头：“反向跟随”\u003C\u002Fb>\u003C\u002Ftd>\n\u003Ctd style=\"text-align:center;\">\u003Cb>电影镜头：“跟随”\u003C\u002Fb>\u003C\u002Ftd>\n\u003Ctd style=\"text-align:center;\">\u003Cb>电影镜头：“环绕”\u003C\u002Fb>\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n  \u003Ctd>\u003Cimg src=assets\u002Fastronaut_mars\u002FAn_astronaut_is_skateboarding_on_Mars_6615212.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fastronaut_mars\u002FAn_astronaut_is_walking_on_Mars_captured_with_a_reverse_follow_cinematic_shot_1224445.gif>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=assets\u002Fastronaut_mars\u002FAn_astronaut_is_walking_on_Mars_captured_with_a_follow_cinematic_shot_6191674.gif>\u003C\u002Ftd>              \n  \u003Ctd>\u003Cimg src=assets\u002Fastronaut_mars\u002FAn_astronaut_is_standing_on_Mars_captured_with_an_orbit_cinematic_shot_7483453.gif>\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n  \u003Ctd 
width=25% style=\"text-align:center;\">“一名宇航员正在火星上滑板”\u003C\u002Fbr> 种子：6615212\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">“一名宇航员在火星上行走，采用反向跟随的电影镜头拍摄。” \u003C\u002Fbr> 种子：1224445\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">“一名宇航员在火星上行走，采用跟随的电影镜头拍摄。” \u003C\u002Fbr> 种子：6191674\u003C\u002Ftd>\n  \u003Ctd width=25% style=\"text-align:center;\">“一名宇航员站在火星上，采用环绕的电影镜头拍摄。” \u003C\u002Fbr> 种子：7483453\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003C\u002Ftable>\n\n## 引用\n\n\n```bibtex\n\n@article{zhao2023motiondirector,\n  title={MotionDirector: Motion Customization of Text-to-Video Diffusion Models},\n  author={Zhao, Rui and Gu, Yuchao and Wu, Jay Zhangjie and Zhang, David Junhao and Liu, Jiawei and Wu, Weijia and Keppo, Jussi and Shou, Mike Zheng},\n  journal={arXiv preprint arXiv:2310.08465},\n  year={2023}\n}\n\n```\n\n## 致谢\n\n- 本代码基于 [diffusers](https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Fdiffusers)、[Tune-a-video](https:\u002F\u002Fgithub.com\u002Fshowlab\u002FTune-A-Video) 和 [Text-To-Video-Finetuning](https:\u002F\u002Fgithub.com\u002FExponentialML\u002FText-To-Video-Finetuning) 构建。感谢开源！\n- 感谢 [camenduru](https:\u002F\u002Ftwitter.com\u002Fcamenduru) 提供的 [colab 演示](https:\u002F\u002Fgithub.com\u002Fcamenduru\u002FMotionDirector-colab)。\n- 感谢 [yhyu13](https:\u002F\u002Fgithub.com\u002Fyhyu13) 创建的 [Huggingface 仓库](https:\u002F\u002Fhuggingface.co\u002FYhyu13\u002FMotionDirector_LoRA)。\n- 我们要特别感谢 [AK(@_akhaliq)](https:\u002F\u002Ftwitter.com\u002F_akhaliq?lang=en) 和 Huggingface 团队，帮助我们搭建了在线 Gradio 演示。\n- 感谢 [MagicAnimate](https:\u002F\u002Fgithub.com\u002Fmagic-research\u002Fmagic-animate\u002F) 提供的 Gradio 演示模板。\n- 感谢 [deepbeepmeep](https:\u002F\u002Fgithub.com\u002Fdeepbeepmeep) 和 [XiaominLi](https:\u002F\u002Fgithub.com\u002FXiaominLi1997) 对代码的改进。","# MotionDirector 快速上手指南\n\nMotionDirector 是一个用于文本生成视频（Text-to-Video）扩散模型的运动定制工具。它允许用户通过提供少量参考视频，让模型学习特定的运动概念（如骑自行车、电影运镜等），从而生成具有相同运动风格但内容多样的新视频。\n\n## 1. 
环境准备\n\n在开始之前，请确保您的开发环境满足以下要求：\n\n*   **操作系统**: Linux (推荐 Ubuntu)\n*   **Python 版本**: 3.8\n*   **GPU 要求**: 建议显存 ≥ 14GB (如 NVIDIA A5000 或更高)。若显存有限，需在配置中减少帧数。\n*   **依赖工具**: `conda`, `git`, `git-lfs`\n\n**前置操作：**\n请确保已安装 `git-lfs` 以正确下载模型权重：\n```bash\ngit lfs install\n```\n\n## 2. 安装步骤\n\n### 2.1 创建虚拟环境并安装依赖\n```bash\n# 创建名为 motiondirector 的 conda 环境\nconda create -n motiondirector python=3.8\nconda activate motiondirector\n\n# 安装项目依赖\npip install -r requirements.txt\n```\n> **提示**：国内用户建议使用清华或阿里镜像源加速安装：\n> `pip install -r requirements.txt -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple`\n\n### 2.2 下载基础模型权重\n您可以选择 **ZeroScope** 或 **ModelScopeT2V** 作为基础模型。以下以 ZeroScope 为例：\n\n```bash\n# 克隆 ZeroScope 模型到本地 models 目录\ngit clone https:\u002F\u002Fhuggingface.co\u002Fcerspense\u002Fzeroscope_v2_576w .\u002Fmodels\u002Fzeroscope_v2_576w\u002F\n```\n> **国内加速**：如果 Hugging Face 连接缓慢，可使用镜像站（如 `hf-mirror.com`）：\n> `export HF_ENDPOINT=https:\u002F\u002Fhf-mirror.com`\n> 然后重新运行上述 git clone 命令。\n\n### 2.3 下载预训练 MotionDirector 权重\n下载官方提供的预训练运动模型（如运动、运镜等）：\n\n```bash\n# 克隆预训练权重到 outputs 目录\ngit clone https:\u002F\u002Fhuggingface.co\u002Fruizhaocv\u002FMotionDirector_weights .\u002Foutputs\n```\n\n## 3. 基本使用\n\n本部分演示如何使用已下载的预训练模型进行视频推理（Inference）。假设您已下载了“骑自行车”运动的预训练权重。\n\n### 3.1 运行推理命令\n将 `\u002Fpath\u002Fto\u002Fthe\u002FZeroScope` 替换为您本地实际的基础模型路径（即 `.\u002Fmodels\u002Fzeroscope_v2_576w\u002F`）。\n\n```bash\npython MotionDirector_inference.py \\\n  --model .\u002Fmodels\u002Fzeroscope_v2_576w\u002F \\\n  --prompt \"A person is riding a bicycle past the Eiffel Tower.\" \\\n  --checkpoint_folder .\u002Foutputs\u002Ftrain\u002Friding_bicycle\u002F \\\n  --checkpoint_index 300 \\\n  --noise_prior 0. 
\\\n  --seed 7192280\n```\n\n### 3.2 参数说明\n*   `--model`: 基础文生视频模型的路径。\n*   `--prompt`: 生成视频的文本提示词（可自由修改主体和场景，运动风格将由模型自动保持）。\n*   `--checkpoint_folder`: 训练好的 MotionDirector 权重文件夹路径。\n*   `--checkpoint_index`: 选择训练过程中的哪个步数进行检查点加载（预训练模型通常推荐使用 300）。\n*   `--noise_prior`: \n    *   设为 `0.`：适用于多视频训练的模型，可生成多样性最高的结果。\n    *   设为 `0.1~0.5`：适用于单视频训练的模型，能更快收敛并更贴近参考视频的运动细节。\n*   `--seed`: 随机种子。固定该值可复现相同的结果；不设置或更改该值可生成不同变体。\n\n### 3.3 输出结果\n运行完成后，生成的视频文件将保存在输出目录中。您可以尝试修改 `--prompt` 中的主体（例如将 \"A person\" 改为 \"A panda\" 或 \"An alien\"），即可看到相同的骑行动作应用到了不同的角色上。","一家独立游戏工作室正在为新品宣传制作短视频，需要将游戏中特有的“机械蜘蛛爬行”动作应用到不同角色和场景中，以快速生成多样化的营销素材。\n\n### 没有 MotionDirector 时\n- **动作难以复现**：通用的文生视频模型无法理解“机械蜘蛛”这种非生物的特殊步态，生成的视频往往只是普通昆虫爬行或完全错误的滑动。\n- **训练成本高昂**：若要定制动作，团队需收集大量该动作的视频数据并重新训练整个模型，耗时数天且需要昂贵的 GPU 资源。\n- **角色与动作耦合**：一旦模型学会了动作，往往只能生成原始参考视频中的特定角色，无法将“机械蜘蛛步态”迁移到主角或其他怪物身上。\n- **试错效率低下**：通过反复调整提示词（Prompt）来“碰运气”寻找正确动作，不仅成功率低，还导致创意迭代周期被无限拉长。\n\n### 使用 MotionDirector 后\n- **精准动作定制**：只需提供几段“机械蜘蛛爬行”的参考视频，MotionDirector 就能让模型精准掌握这一独特运动规律，无需海量数据。\n- **高效微调适配**：采用高效的微调技术，在极短时间内即可完成动作概念的注入，大幅降低了算力门槛和时间成本。\n- **动作与外观解耦**：成功将“机械蜘蛛步态”从参考视频中剥离，自由应用到“赛博朋克风格的主角”或“古代机关兽”等全新角色上。\n- **创意快速落地**：团队可以立即生成“主角在废墟中像机械蜘蛛一样攀爬”或“机关兽在长城上移动”等多种高质量变体，加速内容产出。\n\nMotionDirector 的核心价值在于它将复杂的动作定制转化为简单的概念注入，让创作者能像搭积木一样，自由地将任意独特动作赋予任何角色与场景。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fshowlab_MotionDirector_ac89cb85.png","showlab","Show Lab","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002Fshowlab_fc159bb4.png","",null,"https:\u002F\u002Fsites.google.com\u002Fview\u002Fshowlab","https:\u002F\u002Fgithub.com\u002Fshowlab",[84],{"name":85,"color":86,"percentage":87},"Python","#3572A5",100,1048,61,"2026-04-01T09:36:39","Apache-2.0","未说明","需要 NVIDIA GPU，训练显存需求约 14GB（文中提及使用 A5000 GPU），推理显存需求未明确但通常低于训练",{"notes":95,"python":96,"dependencies":97},"1. 建议使用 conda 创建虚拟环境。2. 需安装 git-lfs 以下载模型权重。3. 基础模型可选择 ZeroScope 或 ModelScopeT2V。4. 训练多视频概念约需 300-500 步（单卡 A5000 约 9-16 分钟），单视频约需 50-150 步（约 1.5-4.5 分钟）。5. 
If VRAM is insufficient, reduce the n_sample_frames parameter in the config.","3.8",[98,99,100,101,102,103,104,105,106],"torch","diffusers","transformers","accelerate","xformers","gradio","opencv-python","decord","einops",[14,35],[109,110,111,112,113,114],"diffusion-models","text-to-motion","text-to-video","text-to-video-generation","video-generation","motion-customization","2026-03-27T02:49:30.150509","2026-04-06T08:44:29.147261",[118,123,128,133,138,143,148],{"id":119,"question_zh":120,"answer_zh":121,"source_url":122},18323,"Why do my inference results differ from the ones the authors show?","This is usually because different model versions use different random seeds. For example, the newer model in the Hugging Face Space uses seed 2178639. Try entering the same text prompt and seed in the Hugging Face Space demo to check that the results match. If a particular prompt (such as a panda riding a bicycle) gives odd results, check that you are using the correct model version and its corresponding seed.","https:\u002F\u002Fgithub.com\u002Fshowlab\u002FMotionDirector\u002Fissues\u002F32",{"id":124,"question_zh":125,"answer_zh":126,"source_url":127},18324,"Why are my inference results blurry or not as expected?","Try using the latest checkpoint and adjusting the random seed. The steps are:\n1. Clone the latest weights: `git clone https:\u002F\u002Fhuggingface.co\u002Fruizhaocv\u002FMotionDirector_weights.git .\u002Foutputs`\n2. 
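Run the inference command (example): `python MotionDirector_inference.py --model \u002Fpath\u002Fto\u002Fthe\u002FZeroScope --prompt \"A tank is running on the moon.\" --checkpoint_folder .\u002Foutputs\u002Ftrain\u002Fcar_16\u002F --checkpoint_index 200 --noise_prior 0.5 --seed 2424022`\nNote: fixing the seed at inference time ensures the same result is generated each time. Training and inference can still produce different results even with the same seed, because the seed is consumed and advanced many times during training.","https:\u002F\u002Fgithub.com\u002Fshowlab\u002FMotionDirector\u002Fissues\u002F9",{"id":129,"question_zh":130,"answer_zh":131,"source_url":132},18325,"The code masks a LoRA by setting lora_scale=0. Does that really \"freeze\" the parameters?","Yes. In the forward pass, setting lora_i = 0 ensures the corresponding LoRA does not affect the output or the loss; in the backward pass, the corresponding optimizer does not update those parameters, so they are effectively frozen. This is a trick to speed up training. If you want to disable the behavior entirely, set the masking probability to 0%.","https:\u002F\u002Fgithub.com\u002Fshowlab\u002FMotionDirector\u002Fissues\u002F29",{"id":134,"question_zh":135,"answer_zh":136,"source_url":137},18326,"How do I use MotionDirector to animate a single image (image animation)?","The code has been released; see the \"MotionDirector for Image Animation\" section of the README for the exact instructions. This feature reproduces the results in row 4 of Figure 2 in the paper. It usually requires the inversion latents of the reference video; see the official documentation for details.","https:\u002F\u002Fgithub.com\u002Fshowlab\u002FMotionDirector\u002Fissues\u002F13",{"id":139,"question_zh":140,"answer_zh":141,"source_url":142},18327,"Does MotionDirector support Stable Video Diffusion (SVD)?","SVD support is still on the to-do list, with no specific timeline yet. For AnimateDiff training support, see the community-contributed repository: https:\u002F\u002Fgithub.com\u002FExponentialML\u002FAnimateDiff-MotionDirector. Community members are welcome to adapt MotionDirector to other base models and collaborate.","https:\u002F\u002Fgithub.com\u002Fshowlab\u002FMotionDirector\u002Fissues\u002F10",{"id":144,"question_zh":145,"answer_zh":146,"source_url":147},18328,"Why do inference results from my saved training weights differ from the final validation results during training?","This is usually because the noise randomly sampled at inference differs from the noise sampled during training. In addition, custom DreamBooth weights that fine-tune only the spatial layers usually work fine, but if the temporal layers are changed as well, results may be affected. Check that the training and inference settings match exactly.","https:\u002F\u002Fgithub.com\u002Fshowlab\u002FMotionDirector\u002Fissues\u002F18",{"id":149,"question_zh":150,"answer_zh":151,"source_url":152},18329,"When will the project code and model weights be released?","The code and model weights have already been released. Visit the project homepage or the related links to download the latest code and pretrained weights.","https:\u002F\u002Fgithub.com\u002Fshowlab\u002FMotionDirector\u002Fissues\u002F1",[]]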