[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-duixcom--Duix-Avatar":3,"tool-duixcom--Duix-Avatar":65},[4,23,32,40,49,57],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":22},2268,"ML-For-Beginners","microsoft\u002FML-For-Beginners","ML-For-Beginners 是由微软推出的一套系统化机器学习入门课程，旨在帮助零基础用户轻松掌握经典机器学习知识。这套课程将学习路径规划为 12 周，包含 26 节精炼课程和 52 道配套测验，内容涵盖从基础概念到实际应用的完整流程，有效解决了初学者面对庞大知识体系时无从下手、缺乏结构化指导的痛点。\n\n无论是希望转型的开发者、需要补充算法背景的研究人员，还是对人工智能充满好奇的普通爱好者，都能从中受益。课程不仅提供了清晰的理论讲解，还强调动手实践，让用户在循序渐进中建立扎实的技能基础。其独特的亮点在于强大的多语言支持，通过自动化机制提供了包括简体中文在内的 50 多种语言版本，极大地降低了全球不同背景用户的学习门槛。此外，项目采用开源协作模式，社区活跃且内容持续更新，确保学习者能获取前沿且准确的技术资讯。如果你正寻找一条清晰、友好且专业的机器学习入门之路，ML-For-Beginners 将是理想的起点。",84991,2,"2026-04-05T10:45:23",[13,14,15,16,17,18,19,20,21],"图像","数据工具","视频","插件","Agent","其他","语言模型","开发框架","音频","ready",{"id":24,"name":25,"github_repo":26,"description_zh":27,"stars":28,"difficulty_score":29,"last_commit_at":30,"category_tags":31,"status":22},3128,"ragflow","infiniflow\u002Fragflow","RAGFlow 是一款领先的开源检索增强生成（RAG）引擎，旨在为大语言模型构建更精准、可靠的上下文层。它巧妙地将前沿的 RAG 技术与智能体（Agent）能力相结合，不仅支持从各类文档中高效提取知识，还能让模型基于这些知识进行逻辑推理和任务执行。\n\n在大模型应用中，幻觉问题和知识滞后是常见痛点。RAGFlow 通过深度解析复杂文档结构（如表格、图表及混合排版），显著提升了信息检索的准确度，从而有效减少模型“胡编乱造”的现象，确保回答既有据可依又具备时效性。其内置的智能体机制更进一步，使系统不仅能回答问题，还能自主规划步骤解决复杂问题。\n\n这款工具特别适合开发者、企业技术团队以及 AI 研究人员使用。无论是希望快速搭建私有知识库问答系统，还是致力于探索大模型在垂直领域落地的创新者，都能从中受益。RAGFlow 提供了可视化的工作流编排界面和灵活的 API 接口，既降低了非算法背景用户的上手门槛，也满足了专业开发者对系统深度定制的需求。作为基于 Apache 2.0 协议开源的项目，它正成为连接通用大模型与行业专有知识之间的重要桥梁。",77062,3,"2026-04-04T04:44:48",[17,13,20,19,18],{"id":33,"name":34,"github_repo":35,"description_zh":36,"stars":37,"difficulty_score":29,"last_commit_at":38,"category_tags":39,"status":22},519,"PaddleOCR","PaddlePaddle\u002FPaddleOCR","PaddleOCR 是一款基于百度飞桨框架开发的高性能开源光学字符识别工具包。它的核心能力是将图片、PDF 等文档中的文字提取出来，转换成计算机可读取的结构化数据，让机器真正“看懂”图文内容。\n\n面对海量纸质或电子文档，PaddleOCR 解决了人工录入效率低、数字化成本高的问题。尤其在人工智能领域，它扮演着连接图像与大型语言模型（LLM）的桥梁角色，能将视觉信息直接转化为文本输入，助力智能问答、文档分析等应用场景落地。\n\nPaddleOCR 适合开发者、算法研究人员以及有文档自动化需求的普通用户。其技术优势十分明显：不仅支持全球 100 多种语言的识别，还能在 Windows、Linux、macOS 等多个系统上运行，并灵活适配 CPU、GPU、NPU 等各类硬件。作为一个轻量级且社区活跃的开源项目，PaddleOCR 既能满足快速集成的需求，也能支撑前沿的视觉语言研究，是处理文字识别任务的理想选择。",74913,"2026-04-05T10:44:17",[19,13,20,18],{"id":41,"name":42,"github_repo":43,"description_zh":44,"stars":45,"difficulty_score":46,"last_commit_at":47,"category_tags":48,"status":22},3215,"awesome-machine-learning","josephmisiti\u002Fawesome-machine-learning","awesome-machine-learning 是一份精心整理的机器学习资源清单，汇集了全球优秀的机器学习框架、库和软件工具。面对机器学习领域技术迭代快、资源分散且难以甄选的痛点，这份清单按编程语言（如 Python、C++、Go 等）和应用场景（如计算机视觉、自然语言处理、深度学习等）进行了系统化分类，帮助使用者快速定位高质量项目。\n\n它特别适合开发者、数据科学家及研究人员使用。无论是初学者寻找入门库，还是资深工程师对比不同语言的技术选型，都能从中获得极具价值的参考。此外，清单还延伸提供了免费书籍、在线课程、行业会议、技术博客及线下聚会等丰富资源，构建了从学习到实践的全链路支持体系。\n\n其独特亮点在于严格的维护标准：明确标记已停止维护或长期未更新的项目，确保推荐内容的时效性与可靠性。作为机器学习领域的“导航图”，awesome-machine-learning 以开源协作的方式持续更新，旨在降低技术探索门槛，让每一位从业者都能高效地站在巨人的肩膀上创新。",72149,1,"2026-04-03T21:50:24",[20,18],{"id":50,"name":51,"github_repo":52,"description_zh":53,"stars":54,"difficulty_score":46,"last_commit_at":55,"category_tags":56,"status":22},2234,"scikit-learn","scikit-learn\u002Fscikit-learn","scikit-learn 是一个基于 Python 构建的开源机器学习库，依托于 SciPy、NumPy 等科学计算生态，旨在让机器学习变得简单高效。它提供了一套统一且简洁的接口，涵盖了从数据预处理、特征工程到模型训练、评估及选择的全流程工具，内置了包括线性回归、支持向量机、随机森林、聚类等在内的丰富经典算法。\n\n对于希望快速验证想法或构建原型的数据科学家、研究人员以及 Python 开发者而言，scikit-learn 是不可或缺的基础设施。它有效解决了机器学习入门门槛高、算法实现复杂以及不同模型间调用方式不统一的痛点，让用户无需重复造轮子，只需几行代码即可调用成熟的算法解决分类、回归、聚类等实际问题。\n\n其核心技术亮点在于高度一致的 API 设计风格，所有估算器（Estimator）均遵循相同的调用逻辑，极大地降低了学习成本并提升了代码的可读性与可维护性。此外，它还提供了强大的模型选择与评估工具，如交叉验证和网格搜索，帮助用户系统地优化模型性能。作为一个由全球志愿者共同维护的成熟项目，scikit-learn 以其稳定性、详尽的文档和活跃的社区支持，成为连接理论学习与工业级应用的最",65628,"2026-04-05T10:10:46",[20,18,14],{"id":58,"name":59,"github_repo":60,"description_zh":61,"stars":62,"difficulty_score":10,"last_commit_at":63,"category_tags":64,"status":22},3364,"keras","keras-team\u002Fkeras","Keras 是一个专为人类设计的深度学习框架，旨在让构建和训练神经网络变得简单直观。它解决了开发者在不同深度学习后端之间切换困难、模型开发效率低以及难以兼顾调试便捷性与运行性能的痛点。\n\n无论是刚入门的学生、专注算法的研究人员，还是需要快速落地产品的工程师，都能通过 Keras 轻松上手。它支持计算机视觉、自然语言处理、音频分析及时间序列预测等多种任务。\n\nKeras 3 的核心亮点在于其独特的“多后端”架构。用户只需编写一套代码，即可灵活选择 TensorFlow、JAX、PyTorch 或 OpenVINO 作为底层运行引擎。这一特性不仅保留了 Keras 一贯的高层易用性，还允许开发者根据需求自由选择：利用 JAX 或 PyTorch 的即时执行模式进行高效调试，或切换至速度最快的后端以获得最高 350% 的性能提升。此外，Keras 具备强大的扩展能力，能无缝从本地笔记本电脑扩展至大规模 GPU 或 TPU 集群，是连接原型开发与生产部署的理想桥梁。",63927,"2026-04-04T15:24:37",[20,14,18],{"id":66,"github_repo":67,"name":68,"description_en":69,"description_zh":70,"ai_summary_zh":70,"readme_en":71,"readme_zh":72,"quickstart_zh":73,"use_case_zh":74,"hero_image_url":75,"owner_login":76,"owner_name":77,"owner_avatar_url":78,"owner_bio":79,"owner_company":80,"owner_location":80,"owner_email":81,"owner_twitter":82,"owner_website":83,"owner_url":84,"languages":85,"stars":110,"forks":111,"last_commit_at":112,"license":113,"difficulty_score":29,"env_os":114,"env_gpu":115,"env_ram":116,"env_deps":117,"category_tags":125,"github_topics":126,"view_count":135,"oss_zip_url":80,"oss_zip_packed_at":80,"status":22,"created_at":136,"updated_at":137,"faqs":138,"releases":167},805,"duixcom\u002FDuix-Avatar","Duix-Avatar","🚀 Truly open-source AI avatar(digital human) toolkit for offline video generation and digital human cloning.","Duix-Avatar 是一款完全开源的 AI 数字人工具包，致力于实现离线的视频生成与数字人克隆。长期以来，制作逼真的数字人往往需要高昂的 3D 建模费用或依赖云端服务，这不仅增加了成本，还带来了隐私泄露的风险。Duix-Avatar 通过先进的 AI 算法，能够精准复刻用户的外貌特征与声音特质，支持文本输入或语音驱动来生成自然的口播视频，并实现了高精度的音画同步。\n\nDuix-Avatar 最大的亮点在于其完全本地化的运行模式，无需联网即可在 Windows 系统上完成所有操作，有效保障了数据安全。同时，Duix-Avatar 支持中英法德等多国语言，界面简洁易用。无论是希望探索技术的开发者、研究人员，还是需要高效生产内容的创作者、教育工作者，甚至是关注隐私保护的普通用户，都可以免费下载并使用。Duix-Avatar 旨在打破技术门槛，让每个人都能零成本拥有专属的数字分身。","# 🚀🚀🚀 Duix Avatar — Truly open-source AI avatar toolkit for offline video generation and digital human cloning\n\n🔗 **Office website:** [www.duix.com](http:\u002F\u002Fwww.duix.com)\n\n# Table of Contents\n\n1. [What's Duix.Avatar](#1-whats-Duix.Avatar)\n2. [Introduction](#2-introduction)\n3. [How to Run Locally](#3-how-to-run-locally)\n4. [Open APIs](#4-open-apis)\n5. [What's New](#5-whats-new)\n6. [FAQ](#6-faq)\n7. [How to Interact in real time](#7-how-to-interact-in-real-time)\n8. [Contact](#8-contact)\n9. [License](#9-license)\n10. [Acknowledgments](#10-acknowledgments)\n11. [Star History](#11-star-history)\n\n------\n\n## 1. What's Duix.Avatar\n\n**Duix.Avatar** is a free and open-source AI avatar project developed by **Duix.com**.\n\nSeven years ago, a group of young pioneers chose an unconventional technical path, developing a method to train digital human models using real-person video data. Unlike traditional costly 3D digital human approaches, we leveraged AI-generated technology to create ultra-realistic digital humans, slashing production costs from hundreds of thousands of dollars to just $1,000. This innovation has empowered over 10,000 enterprises and generated over 500,000 personalized avatars for professionals across fields – educators, content creators, legal experts, medical practitioners, and entrepreneurs – dramatically enhancing their video production efficiency. However, our vision extends beyond commercial applications. We believe this transformative technology should be accessible to everyone. To democratize digital human creation, we've open-sourced our cloning technology and video production framework. Our commitment remains: breaking down technological barriers to make cutting-edge tools available to all. Now, anyone with a computer can freely craft their own AI Avatar and produce videos at zero cost – this is the essence of  **Duix.Avatar**.\n\n## 2. Introduction\n\n![img](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fduixcom_Duix-Avatar_readme_ca5212cf2071.png)\n\nDuix.Avatar is a fully offline video synthesis tool designed for Windows systems that can precisely clone your appearance and voice, digitalizing your image. You can create videos by driving virtual avatars through text and voice. No internet connection is required, protecting your privacy while enjoying convenient and efficient digital experiences.\n\n- Core Features\n  - Precise Appearance and Voice Cloning: Using advanced AI algorithms to capture human facial features with high precision, including facial features, contours, etc., to build realistic virtual models. It can also precisely clone voices, capturing and reproducing subtle characteristics of human voices, supporting various voice parameter settings to create highly similar cloning effects.\n  - Text and Voice-Driven Virtual Avatars: Understanding text content through natural language processing technology, converting text into natural and fluent speech to drive virtual avatars. Voice input can also be used directly, allowing virtual avatars to perform corresponding actions and facial expressions based on the rhythm and intonation of the voice, making the virtual avatar's performance more natural and vivid.\n  - Efficient Video Synthesis: Highly synchronizing digital human video images with sound, achieving natural and smooth lip-syncing, intelligently optimizing audio-video synchronization effects.\n  - Multi-language Support: Scripts support eight languages - English, Japanese, Korean, Chinese, French, German, Arabic, and Spanish.\n- Key Advantages\n  - Fully Offline Operation: No internet connection required, effectively protecting user privacy, allowing users to create in a secure, independent environment, avoiding potential data leaks during network transmission.\n  - User-Friendly: Clean and intuitive interface, easy to use even for beginners with no technical background, quickly mastering the software's usage to start their digital human creation journey.\n  - Multiple Model Support: Supports importing multiple models and managing them through one-click startup packages, making it convenient for users to choose suitable models based on different creative needs and application scenarios.\n- Technical Support\n  - Voice Cloning Technology: Using advanced technologies like artificial intelligence to generate similar or identical voices based on given voice samples, covering context, intonation, speed, and other aspects of speech.\n  - Automatic Speech Recognition: Technology that converts human speech vocabulary content into computer-readable input (text format), enabling computers to \"understand\" human speech.\n  - Computer Vision Technology: Used in video synthesis for visual processing, including facial recognition and lip movement analysis, ensuring virtual avatar lip movements match voice and text content.\n\n\n\n## 3. How to Run Locally\n\nDuix.Avatar supports Docker-based rapid deployment. Prior to deployment, ensure your hardware and software environments meet the specified requirements.\n\nDuix.Avatar support two deployment modes：Windows \u002F Ubuntu 22.04 Installation\n\n### **Dependencies**\n\n1. Nodejs 18\n2. Docker Images\n   - docker pull guiji2025\u002Ffun-asr\n   - docker pull guiji2025\u002Ffish-speech-ziming\n   - docker pull guiji2025\u002Fduix.avatar\n\n\n\n### Mode 1：Windows Installation\n\n**System Requirements:**\n\n- Currently supports Windows 10 19042.1526 or higher\n\n**Hardware Requirements：**\n\n- Must have D Drive: Mainly used for storing digital human and project data\n  - Free space requirement: More than 30GB\n- C Drive: Used for storing service image files\n  - Free space requirement: More than 100GB\n  - If less than 100GB is available, after installing Docker, you can choose a different disk folder with more than 100GB of remaining space at the location shown below.\n\n    ![img](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fduixcom_Duix-Avatar_readme_07da571277dd.png)\n\n- Recommended Configuration:\n  - CPU: 13th Gen Intel Core i5-13400F\n  - Memory: 32GB\n  - Graphics Card: RTX 4070\n- Ensure you have an NVIDIA graphics card with properly installed drivers\n\n  > NVIDIA driver download link: https:\u002F\u002Fwww.nvidia.cn\u002Fdrivers\u002Flookup\u002F\n\n  ![img](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fduixcom_Duix-Avatar_readme_3e6a4ac16729.png)\n\n\n#### **Installing Windows Docker**\n\n1. Use the command `wsl --list --verbose` to check if WSL is installed. If it shows as below, it's already installed and no further installation is needed.\n\n    ![img](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fduixcom_Duix-Avatar_readme_34ea940b6148.png)\n\n\n\n2. Update WSL using `wsl --update`.\n\n    ![img](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fduixcom_Duix-Avatar_readme_61950961790b.png)\n\n3. [Download Docker for Windows](https:\u002F\u002Fwww.docker.com\u002F), choose the appropriate installation package based on your CPU architecture.\n4. When you see this interface, installation is successful.\n\n    ![img](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fduixcom_Duix-Avatar_readme_13edad5b850b.png)\n\n5. Run Docker\n\n    ![img](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fduixcom_Duix-Avatar_readme_0e7077eba800.png)\n\n6. Accept the agreement and skip login on first run\n\n    ![img](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fduixcom_Duix-Avatar_readme_8a162647f4e1.png)\n\n    ![img](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fduixcom_Duix-Avatar_readme_b1d94f235543.png)\n\n    ![img](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fduixcom_Duix-Avatar_readme_b838ce0e327b.png)\n\n\n#### **Installing the Server**\n\nInstallation using Docker, docker-compose as follows:\n\n1. The `docker-compose.yml` file is in the `\u002Fdeploy` directory.\n2. Execute `docker-compose up -d` in the `\u002Fdeploy` directory, if you want to use the lite version, execute `docker-compose -f docker-compose-lite.yml up -d`\n3. Wait patiently (about half an hour, speed depends on network), download will consume about 70GB of traffic, make sure to use WiFi\n4. When you see three services in Docker, it indicates success (the lite version has only one service `Duix.Avatar-gen-video`)\n\n    ![img](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fduixcom_Duix-Avatar_readme_c3686bdda9e2.png)\n\n#### **Server Deployment Solution for NVIDIA 50 Series Graphics Cards**\n\nFor 50 series graphics cards (tested and also works for 30\u002F40 series with CUDA 12.8) Uses the official preview version of PyTorch\n\n#### **Client**\n\n1. Directly download the [officially built installation package](https:\u002F\u002Fgithub.com\u002Fduixcom\u002FDuix.Avatar\u002Freleases)\n2. Double-click `Duix.Avatar-x.x.x-setup.exe` to install\n\n### Mode 2：Ubuntu 22.04 Installation\n\n**System Requirements：**\n\nWe have conducted a complete test on **Ubuntu 22.04**. However, theoretically, it supports desktop Linux distributions.\n\n**Hardware Requirements：**\n\n- Recommended Configuration\n- CPU: 13th Generation Intel Core i5 - 13400F\n- Memory: 32G or more (necessary)\n- Graphics Card: RTX - 4070 (Ensure you have an NVIDIA graphics card and the graphics card driver is correctly installed)\n- Hard Disk: Free space greater than 100G\n\n**Install Docker:**\n\nFirst, use` docker --version` to check if Docker is installed. If it is installed, skip the following steps.\n\n```\nsudo apt update\nsudo apt install docker.io\nsudo apt install docker-compose\n```\n\n**Install the graphics card driver:**\n\n1. Install the graphics card driver by referring to the official documentation(https:\u002F\u002Fwww.nvidia.cn\u002Fdrivers\u002Flookup\u002F).\n\nAfter installation, execute the `nvidia-smi` command. If the graphics card information is displayed, the installation is successful.\n\n2. Install the NVIDIA Container Toolkit\n\n​    The NVIDIA Container Toolkit is a necessary tool for Docker to use NVIDIA GPUs. The installation steps are as follows:\n\n- Add the NVIDIA package repository:\n\n```\ndistribution=$(. \u002Fetc\u002Fos-release;echo $ID$VERSION_ID) \\\n  && curl -s -L https:\u002F\u002Fnvidia.github.io\u002Flibnvidia-container\u002Fgpgkey | sudo apt-key add - \\\n  && curl -s -L https:\u002F\u002Fnvidia.github.io\u002Flibnvidia-container\u002F$distribution\u002Flibnvidia-container.list | sudo tee \u002Fetc\u002Fapt\u002Fsources.list.d\u002Fnvidia-container-toolkit.list\n```\n\n- Update the package list and install the toolkit:\n\n```\nsudo apt-get update\nsudo apt-get install -y nvidia-container-toolkit\n```\n\n- Configure Docker to use the NVIDIA runtime:\n\n```\nsudo nvidia-ctk runtime configure --runtime=docker\n```\n\n- Restart the Docker service:\n\n```\nsudo systemctl restart docker\n```\n\n#### **Install the server**\n\n```\ncd \u002Fdeploy\ndocker-compose -f docker-compose-linux.yml up -d\n```\n\n#### **Install the client**\n\n1. Directly download the Linux version of the [officially built installation package](https:\u002F\u002Fgithub.com\u002Fduixcom\u002FDuix.Avatar\u002Freleases).\n2. Double click `Duix.Avatar-x.x.x.AppImage` to launch it. No installation is required.\n\nReminder: In the Ubuntu system, if you enter the desktop as the `root` user, directly double - clicking `Duix.Avatar - x.x.x.AppImage` may not work. You need to execute `.\u002FDuix.Avatar - x.x.x.AppImage --no - sandbox` in the command - line terminal. Adding the `--no - sandbox` parameter will do the trick.\n\n\n\n## 4. Open APIs\n\nWe have opened APIs for model training and video synthesis. After Docker starts, several ports will be exposed locally, accessible through `http:\u002F\u002F127.0.0.1`.\n\nFor specific code, refer to:\n\n- src\u002Fmain\u002Fservice\u002Fmodel.js\n- src\u002Fmain\u002Fservice\u002Fvideo.js\n- src\u002Fmain\u002Fservice\u002Fvoice.js\n\n### **Model Training**\n\n1. Separate video into silent video + audio\n2. Place audio in\n\n    `D:\\duix_avatar_data\\voice\\data` is agreed with the `guiji2025\u002Ffish-speech-ziming` service, can be modified in docker-compose\n\n3. Call the\n\n    Parameter example:Response example:**Record the response results as they will be needed for subsequent audio synthesis**\n\n### **Audio Synthesis**\n\nInterface: `http:\u002F\u002F127.0.0.1:18180\u002Fv1\u002Finvoke`\n\n```\n\u002F\u002F Request parameters\n{\n  \"speaker\": \"{uuid}\", \u002F\u002F A unique UUID\n  \"text\": \"xxxxxxxxxx\", \u002F\u002F Text content to synthesize\n  \"format\": \"wav\", \u002F\u002F Fixed parameter\n  \"topP\": 0.7, \u002F\u002F Fixed parameter\n  \"max_new_tokens\": 1024, \u002F\u002F Fixed parameter\n  \"chunk_length\": 100, \u002F\u002F Fixed parameter\n  \"repetition_penalty\": 1.2, \u002F\u002F Fixed parameter\n  \"temperature\": 0.7, \u002F\u002F Fixed parameter\n  \"need_asr\": false, \u002F\u002F Fixed parameter\n  \"streaming\": false, \u002F\u002F Fixed parameter\n  \"is_fixed_seed\": 0, \u002F\u002F Fixed parameter\n  \"is_norm\": 0, \u002F\u002F Fixed parameter\n  \"reference_audio\": \"{voice.asr_format_audio_url}\", \u002F\u002F Return value from previous \"Model Training\" step\n  \"reference_text\": \"{voice.reference_audio_text}\" \u002F\u002F Return value from previous \"Model Training\" step\n}\n```\n\n### **Video Synthesis**\n\n- Synthesis interface: `http:\u002F\u002F127.0.0.1:8383\u002Feasy\u002Fsubmit`\n\n  ```\n  \u002F\u002F Request parameters\n  {\n    \"audio_url\": \"{audioPath}\", \u002F\u002F Audio path\n    \"video_url\": \"{videoPath}\", \u002F\u002F Video path\n    \"code\": \"{uuid}\", \u002F\u002F Unique key\n    \"chaofen\": 0, \u002F\u002F Fixed value\n    \"watermark_switch\": 0, \u002F\u002F Fixed value\n    \"pn\": 1 \u002F\u002F Fixed value\n  }\n  ```\n\n- Progress query: `http:\u002F\u002F127.0.0.1:8383\u002Feasy\u002Fquery?code=${taskCode}`\n\nGET request, the parameter `taskCode` is the `code` from the synthesis interface input above\n\n### **Important Notice to Developer Partners**\n\nwe are now announcing two parallel service solutions:\n\n| **Project**              | **Duix.Avatar Open Source Local Deployment**                      | **Digital Human\u002FClone Voice API Service**                    |\n| ------------------------ | ------------------------------------------------------------ | ------------------------------------------------------------ |\n| Usage                    | Open Source Local Deployment                                 | Rapid Clone API Service                                      |\n| Recommended              | Technical Users                                              | Business Users                                               |\n| Technical Threshold      | Developers with deep learning framework experience\u002Fpursuing deep customization\u002Fwishing to participate in community co-construction | Quick business integration\u002Ffocus on upper-level application development\u002Fneed enterprise-level SLA assurance for commercial scenarios |\n| Hardware Requirements    | Need to purchase GPU server                                  | No need to purchase GPU server                               |\n| Customization            | Can modify and extend the code according to your needs, fully controlling the software's functions and behavior | Cannot directly modify the source code, can only extend functions through API-provided interfaces, less flexible than open source projects |\n| Technical Support        | Community Support                                            | Dynamic expansion support + professional technical response team |\n| Maintenance Cost         | High maintenance cost                                        | Simple maintenance                                           |\n| Lip Sync Effect          | Usable effect                                                | Stunning and higher definition effect                        |\n| Commercial Authorization | Supports global free commercial use (enterprises with more than 100,000 users or annual revenue exceeding 10 million USD need to sign a commercial license agreement) | Commercial use allowed                                       |\n| Iteration Speed          | Slow updates, bug fixes depend on the community              | Latest models\u002Falgorithms are prioritized, fast problem resolution |\n\nWe always adhere to the open source spirit, and the launch of the API service aims to provide a more complete solution matrix for developers with different needs. No matter which method you choose, you can always obtain technical support documents through [https:\u002F\u002Fduix.com](https:\u002F\u002Fduix.com\u002F)\n\n\n\nWe look forward to working with you to promote the inclusive development of digital human technology!\n\n\n\nYou can chat with Duix.Avatar Digital Human on the official website: https:\u002F\u002Fduix.com\u002F\n\nWe also provide  APl at DUIX Platform: https:\u002F\u002Fdocs.duix.com\u002Fapi-reference\u002Fapi\u002FIntroduction\n\n## 5. What's New\n\n### **[Nvidia 50 Series GPU Version Notice]**\n\n1. Tested and verified on 5090 GPU\n2. For installation instructions, see [Server Deployment Solution for NVIDIA 50 Series Graphics Cards](#Server-Deployment-Solution-for-NVIDIA-50-Series-Graphics-Cards)\n\n### **[New Ubuntu Version Notice]**\n\n**Ubuntu Version Officially Released**\n\n1. Adaptation and verification work for Ubuntu 22.04 Desktop version (kernel 6.8.0-52-generic) has been completed. Compatibility testing for other Linux versions has not yet been conducted.\n2. Added internationalization (English) for the client program interface.\n3. Fixed some known issues\n   - \\#304\n   - \\#292\n4. [Ubuntu22.04 Installation Documentation](https:\u002F\u002Fgithub.com\u002Fduixcom\u002FDuix.Avatar?tab=readme-ov-file#ubuntu-2204-installation)\n\n## 6. FAQ\n\n### **Self-Check Steps Before Asking Questions**\n\n1. Check if all three services are in Running status\n\n    ![img](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fduixcom_Duix-Avatar_readme_c3686bdda9e2.png)\n\n2. Confirm that your machine has an NVIDIA graphics card and drivers are correctly installed.\n\nAll computing power for this project is local. The three services won't start without an NVIDIA graphics card or proper drivers.\n\n3. Ensure both server and client are updated to the latest version. The project is newly open-sourced, the community is very active, and updates are frequent. Your issue might have been resolved in a new version.\n   - Server: Go to `\u002Fdeploy` directory and re-execute `docker-compose up -d`\n   - Client: `pull` code and re-`build`\n4. [GitHub Issues](https:\u002F\u002Fgithub.com\u002Fduixcom\u002FDuix.Avatar\u002Fissues) are continuously updated, issues are being resolved and closed daily. Check frequently, your issue might already be resolved.\n\n### **Question Template**\n\n1. Problem Description\n\nDescribe the reproduction steps in detail, with screenshots if possible.\n\n2. Provide Error Logs\n    - How to get client logs:\n\n      ![img](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fduixcom_Duix-Avatar_readme_d093440deb4c.jpeg)\n\n    - Server logs:\n\n      Find the key location, or click on our three Docker services, and \"Copy\" as shown below.\n\n      ![img](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fduixcom_Duix-Avatar_readme_b586c6459533.png)\n\n## 7. How to Interact in real time\n\nDuix.Avatar's digital human realizes digital human cloning and non-real-time video synthesis.\n\nIf you want a digital human to support interaction, you can visit [duix.com](www.duix.com) to experience the free test.\n\n## 8. Contact\n\nIf you have any questions, please raise an issue or contact us at james@duix.com\n\n## 9. License\n\nhttps:\u002F\u002Fgithub.com\u002Fduixcom\u002FDuix.Avatar\u002Fblob\u002Fmain\u002FLICENSE\n\n## 10. Acknowledgments\n\n- ASR based on fun-asr\n- TTS based on fish-speech-ziming\n\n## 11. Star History\n\n[GitHub Star History](https:\u002F\u002Fwww.star-history.com\u002F#duixcom\u002FDuix.Avatar&Date)\n","# 🚀🚀🚀 Duix Avatar — 真正开源的离线视频生成和数字人克隆 AI 头像工具包\n\n🔗 **官方网站：** [www.duix.com](http:\u002F\u002Fwww.duix.com)\n\n# 目录\n\n1. [什么是 Duix.Avatar](#1-whats-Duix.Avatar)\n2. [简介](#2-introduction)\n3. [本地运行指南](#3-how-to-run-locally)\n4. [开放 API](#4-open-apis)\n5. [更新日志](#5-whats-new)\n6. [常见问题](#6-faq)\n7. [如何实时交互](#7-how-to-interact-in-real-time)\n8. [联系方式](#8-contact)\n9. [许可证](#9-license)\n10. [致谢](#10-acknowledgments)\n11. [星标历史](#11-star-history)\n\n------\n\n## 1. 什么是 Duix.Avatar\n\n**Duix.Avatar** 是由 **Duix.com** 开发的一款免费开源的 AI (人工智能) 头像项目。\n\n七年前，一群年轻的先驱者选择了一条非传统的技术路径，开发了一种使用真人视频数据训练数字人模型的方法。与传统的昂贵 3D 数字人方法不同，我们利用 AI 生成技术创建了超逼真的数字人，将制作成本从数十万美元降低到仅需 1000 美元。这一创新已赋能超过 10,000 家企业，并为教育者、内容创作者、法律专家、医疗从业者及企业家等各行各业的专业人士生成了超过 500,000 个个性化头像，极大地提高了他们的视频制作效率。然而，我们的愿景不仅限于商业应用。我们相信这项变革性技术应惠及每个人。为了普及数字人创作，我们开源了我们的克隆技术和视频制作框架。我们的承诺始终如一：打破技术壁垒，让尖端工具触手可及。现在，任何人只要有电脑，就可以免费制作自己的 AI 头像并零成本制作视频——这就是 **Duix.Avatar** 的核心精髓。\n\n## 2. 简介\n\n![img](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fduixcom_Duix-Avatar_readme_ca5212cf2071.png)\n\nDuix.Avatar 是一款专为 Windows 系统设计的完全离线视频合成工具，能够精确克隆您的外貌和声音，实现您形象的数字化。您可以通过文本和语音驱动虚拟头像来创建视频。无需互联网连接，在享受便捷高效数字体验的同时保护您的隐私。\n\n- 核心功能\n  - 精确外貌与声音克隆：利用先进的 AI 算法高精度捕捉人类面部特征，包括五官、轮廓等，构建逼真的虚拟模型。它还能精确克隆声音，捕捉并重现人类声音的细微特征，支持多种声音参数设置，以创造高度相似的克隆效果。\n  - 文本与语音驱动的虚拟头像：通过自然语言处理 (NLP) 技术理解文本内容，将文本转换为自然流畅的语音以驱动虚拟头像。也可以直接使用语音输入，允许虚拟头像根据语音的节奏和语调执行相应的动作和面部表情，使虚拟头像的表演更加自然生动。\n  - 高效视频合成：高度同步数字人视频图像与声音，实现自然流畅的口型同步，智能优化音视频同步效果。\n  - 多语言支持：脚本支持八种语言——英语、日语、韩语、中文、法语、德语、阿拉伯语和西班牙语。\n- 关键优势\n  - 完全离线运行：无需互联网连接，有效保护用户隐私，允许用户在安全独立的环境中创作，避免网络传输过程中潜在的数据泄露。\n  - 用户友好：界面简洁直观，即使是没有技术背景的初学者也能轻松上手，快速掌握软件用法，开启数字人创作之旅。\n  - 多模型支持：支持导入多个模型并通过一键启动包进行管理，方便用户根据不同的创作需求和应用场景选择合适的模型。\n- 技术支持\n  - 声音克隆技术：利用人工智能等先进技术，基于给定的声音样本生成相似或相同的声音，涵盖语境、语调、语速等语音方面。\n  - 自动语音识别 (ASR)：将人类语音词汇内容转换为计算机可读的输入（文本格式），使计算机能够“理解”人类语音的技术。\n  - 计算机视觉技术：用于视频合成中的视觉处理，包括人脸识别和唇部运动分析，确保虚拟头像的唇部运动与语音和文本内容相匹配。\n\n\n\n## 3. 本地运行指南\n\nDuix.Avatar 支持基于 Docker (容器引擎) 的快速部署。部署前，请确保您的硬件和软件环境满足指定要求。\n\nDuix.Avatar 支持两种部署模式：Windows \u002F Ubuntu 22.04 安装\n\n### **依赖项**\n\n1. Nodejs 18\n2. Docker 镜像\n   - docker pull guiji2025\u002Ffun-asr\n   - docker pull guiji2025\u002Ffish-speech-ziming\n   - docker pull guiji2025\u002Fduix.avatar\n\n### 模式 1：Windows 安装\n\n**系统要求：**\n\n- 目前支持 Windows 10 19042.1526 或更高版本\n\n**硬件要求：**\n\n- 必须拥有 D 盘：主要用于存储数字人和项目数据\n  - 剩余空间要求：大于 30GB\n- C 盘：用于存储服务镜像文件\n  - 剩余空间要求：大于 100GB\n  - 如果可用空间少于 100GB，在安装 Docker（容器引擎）后，您可以选择下方所示位置的其他磁盘文件夹，该文件夹需有超过 100GB 的剩余空间。\n\n    ![img](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fduixcom_Duix-Avatar_readme_07da571277dd.png)\n\n- 推荐配置：\n  - 处理器：第 13 代 Intel Core i5-13400F\n  - 内存：32GB\n  - 显卡：RTX 4070\n- 确保您拥有 NVIDIA 显卡且驱动程序已正确安装\n\n  > NVIDIA 驱动程序下载链接：https:\u002F\u002Fwww.nvidia.cn\u002Fdrivers\u002Flookup\u002F\n\n  ![img](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fduixcom_Duix-Avatar_readme_3e6a4ac16729.png)\n\n\n#### **安装 Windows Docker**\n\n1. 使用命令 `wsl --list --verbose` 检查是否已安装 WSL（Windows 子系统 for Linux）。如果显示如下，则表示已安装，无需进一步安装。\n\n    ![img](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fduixcom_Duix-Avatar_readme_34ea940b6148.png)\n\n\n\n2. 使用 `wsl --update` 更新 WSL。\n\n    ![img](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fduixcom_Duix-Avatar_readme_61950961790b.png)\n\n3. [下载 Docker for Windows](https:\u002F\u002Fwww.docker.com\u002F)，根据您的 CPU 架构选择合适的安装包。\n4. 看到此界面时，表示安装成功。\n\n    ![img](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fduixcom_Duix-Avatar_readme_13edad5b850b.png)\n\n5. 运行 Docker\n\n    ![img](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fduixcom_Duix-Avatar_readme_0e7077eba800.png)\n\n6. 首次运行时接受协议并跳过登录\n\n    ![img](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fduixcom_Duix-Avatar_readme_8a162647f4e1.png)\n\n    ![img](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fduixcom_Duix-Avatar_readme_b1d94f235543.png)\n\n    ![img](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fduixcom_Duix-Avatar_readme_b838ce0e327b.png)\n\n\n#### **安装服务器**\n\n使用 Docker 和 docker-compose（编排工具）进行安装，步骤如下：\n\n1. `docker-compose.yml` 文件位于 `\u002Fdeploy` 目录中。\n2. 在 `\u002Fdeploy` 目录下执行 `docker-compose up -d`，如果您想使用精简版，请执行 `docker-compose -f docker-compose-lite.yml up -d`\n3. 请耐心等待（约半小时，速度取决于网络），下载将消耗约 70GB 流量，请确保使用 WiFi\n4. 当您在 Docker 中看到三个服务时，表示成功（精简版只有一个服务 `Duix.Avatar-gen-video`）\n\n    ![img](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fduixcom_Duix-Avatar_readme_c3686bdda9e2.png)\n\n#### **NVIDIA 50 系列显卡的服务器部署方案**\n\n对于 50 系列显卡（经测试，配合 CUDA 12.8 也适用于 30\u002F40 系列），使用 PyTorch（深度学习框架）官方预览版。\n\n#### **客户端**\n\n1. 直接下载 [官方构建的安装包](https:\u002F\u002Fgithub.com\u002Fduixcom\u002FDuix.Avatar\u002Freleases)\n2. 双击 `Duix.Avatar-x.x.x-setup.exe` 进行安装\n\n### 模式 2：Ubuntu 22.04 安装\n\n**系统要求：**\n\n我们在 **Ubuntu 22.04** 上进行了完整测试。理论上，它支持桌面 Linux 发行版。\n\n**硬件要求：**\n\n- 推荐配置\n- 处理器：第 13 代 Intel Core i5 - 13400F\n- 内存：32G 或更多（必需）\n- 显卡：RTX - 4070（确保您拥有 NVIDIA 显卡且显卡驱动程序已正确安装）\n- 硬盘：剩余空间大于 100G\n\n**安装 Docker：**\n\n首先，使用 `docker --version` 检查是否已安装 Docker。如果已安装，请跳过以下步骤。\n\n```\nsudo apt update\nsudo apt install docker.io\nsudo apt install docker-compose\n```\n\n**安装显卡驱动：**\n\n1. 参考官方文档安装显卡驱动 (https:\u002F\u002Fwww.nvidia.cn\u002Fdrivers\u002Flookup\u002F)。\n\n安装完成后，执行 `nvidia-smi` 命令。如果显示了显卡信息，则安装成功。\n\n2. 安装 NVIDIA Container Toolkit\n\nNVIDIA Container Toolkit 是 Docker 使用 NVIDIA GPU（图形处理器）所必需的工具。安装步骤如下：\n\n- 添加 NVIDIA 软件源仓库：\n\n```\ndistribution=$(. \u002Fetc\u002Fos-release;echo $ID$VERSION_ID) \\\n  && curl -s -L https:\u002F\u002Fnvidia.github.io\u002Flibnvidia-container\u002Fgpgkey | sudo apt-key add - \\\n  && curl -s -L https:\u002F\u002Fnvidia.github.io\u002Flibnvidia-container\u002F$distribution\u002Flibnvidia-container.list | sudo tee \u002Fetc\u002Fapt\u002Fsources.list.d\u002Fnvidia-container-toolkit.list\n```\n\n- 更新软件包列表并安装工具包：\n\n```\nsudo apt-get update\nsudo apt-get install -y nvidia-container-toolkit\n```\n\n- 配置 Docker 使用 NVIDIA 运行时：\n\n```\nsudo nvidia-ctk runtime configure --runtime=docker\n```\n\n- 重启 Docker 服务：\n\n```\nsudo systemctl restart docker\n```\n\n#### **安装服务器**\n\n```\ncd \u002Fdeploy\ndocker-compose -f docker-compose-linux.yml up -d\n```\n\n#### **安装客户端**\n\n1. 直接下载 Linux 版本的 [官方构建的安装包](https:\u002F\u002Fgithub.com\u002Fduixcom\u002FDuix.Avatar\u002Freleases)。\n2. 双击 `Duix.Avatar-x.x.x.AppImage` 启动它。无需安装。\n\n注意：在 Ubuntu 系统中，如果您以 `root` 用户身份进入桌面，直接双击 `Duix.Avatar - x.x.x.AppImage` 可能无法运行。您需要在命令行终端中执行 `.\u002FDuix.Avatar - x.x.x.AppImage --no - sandbox`。添加 `--no - sandbox` 参数即可解决。（注：沙箱模式限制）\n\n\n\n## 4. 开放 API\n\n我们已开放用于模型训练和视频合成的 API。Docker 启动后，将暴露几个本地端口，可通过 `http:\u002F\u002F127.0.0.1` 访问。\n\n具体代码请参考：\n\n- src\u002Fmain\u002Fservice\u002Fmodel.js\n- src\u002Fmain\u002Fservice\u002Fvideo.js\n- src\u002Fmain\u002Fservice\u002Fvoice.js\n\n### **模型训练**\n\n1. 将视频分离为无声视频 + 音频\n2. 将音频放置于\n\n    `D:\\duix_avatar_data\\voice\\data` 与 `guiji2025\u002Ffish-speech-ziming` 服务约定一致，可在 docker-compose 中修改\n\n3. 调用接口\n\n    参数示例：响应示例：**记录响应结果，因为后续音频合成需要用到**\n\n### **音频合成**\n\n接口：`http:\u002F\u002F127.0.0.1:18180\u002Fv1\u002Finvoke`\n\n```\n\u002F\u002F 请求参数\n{\n  \"speaker\": \"{uuid}\", \u002F\u002F 唯一的 UUID（通用唯一识别码）\n  \"text\": \"xxxxxxxxxx\", \u002F\u002F 要合成的文本内容\n  \"format\": \"wav\", \u002F\u002F 固定参数\n  \"topP\": 0.7, \u002F\u002F 固定参数\n  \"max_new_tokens\": 1024, \u002F\u002F 固定参数\n  \"chunk_length\": 100, \u002F\u002F 固定参数\n  \"repetition_penalty\": 1.2, \u002F\u002F 固定参数\n  \"temperature\": 0.7, \u002F\u002F 固定参数\n  \"need_asr\": false, \u002F\u002F 固定参数\n  \"streaming\": false, \u002F\u002F 固定参数\n  \"is_fixed_seed\": 0, \u002F\u002F 固定参数\n  \"is_norm\": 0, \u002F\u002F 固定参数\n  \"reference_audio\": \"{voice.asr_format_audio_url}\", \u002F\u002F 前一步“模型训练”步骤的返回值\n  \"reference_text\": \"{voice.reference_audio_text}\" \u002F\u002F 前一步“模型训练”步骤的返回值\n}\n```\n\n### **视频合成**\n\n- 合成接口：`http:\u002F\u002F127.0.0.1:8383\u002Feasy\u002Fsubmit`\n\n  ```\n  \u002F\u002F Request parameters\n  {\n    \"audio_url\": \"{audioPath}\", \u002F\u002F Audio path\n    \"video_url\": \"{videoPath}\", \u002F\u002F Video path\n    \"code\": \"{uuid}\", \u002F\u002F Unique key\n    \"chaofen\": 0, \u002F\u002F Fixed value\n    \"watermark_switch\": 0, \u002F\u002F Fixed value\n    \"pn\": 1 \u002F\u002F Fixed value\n  }\n  ```\n\n- 进度查询：`http:\u002F\u002F127.0.0.1:8383\u002Feasy\u002Fquery?code=${taskCode}`\n\nGET 请求，参数 `taskCode` 为上述合成接口输入中的 `code`\n\n### **致开发者合作伙伴的重要通知**\n\n我们现在宣布两种并行的服务解决方案：\n\n| **项目**              | **Duix.Avatar Open Source（开源）本地部署**                      | **数字人\u002F克隆语音 API（应用程序编程接口）服务**                    |\n| ------------------------ | ------------------------------------------------------------ | ------------------------------------------------------------ |\n| 用途                    | Open Source（开源）本地部署                                 | 快速克隆 API 服务                                      |\n| 推荐人群              | 技术用户                                              | 商业用户                                               |\n| 技术门槛      | 具有深度学习框架经验的开发者\u002F追求深度定制\u002F希望参与社区共建 | 快速业务集成\u002F专注于上层应用开发\u002F需要企业级 SLA（服务等级协议）保障的商业场景 |\n| 硬件要求    | 需购买 GPU（图形处理器）服务器                                  | 无需购买 GPU 服务器                               |\n| 定制化能力            | 可根据需求修改和扩展代码，完全控制软件的功能和行为 | 无法直接修改源代码，只能通过 API 提供的接口扩展功能，灵活性不如开源项目 |\n| 技术支持        | 社区支持                                            | 动态扩展支持 + 专业技术响应团队 |\n| 维护成本         | 高维护成本                                        | 简单维护                                           |\n| 唇形同步效果          | 可用效果                                                | 惊艳且更高清的效果                        |\n| 商业授权 | 支持全球免费商用（用户超过 10 万或年营收超过 1000 万美元的企业需签署商业许可协议） | 允许商用                                       |\n| 迭代速度          | 更新慢，bug 修复依赖社区              | 优先使用最新模型\u002F算法，问题解决速度快 |\n\n我们始终秉持开源精神，推出 API 服务旨在为不同需求的开发者提供更完整的解决方案矩阵。无论您选择哪种方式，都可以通过 [https:\u002F\u002Fduix.com](https:\u002F\u002Fduix.com\u002F) 获取技术支持文档\n\n\n\n我们期待与您携手，共同推动数字人技术的普惠发展！\n\n\n\n您可以在官网与 Duix.Avatar 数字人聊天：https:\u002F\u002Fduix.com\u002F\n\n我们也在 DUIX 平台提供 API：https:\u002F\u002Fdocs.duix.com\u002Fapi-reference\u002Fapi\u002FIntroduction\n\n## 5. 更新内容\n\n### **[NVIDIA（英伟达）50 系列 GPU（图形处理器）版本通知]**\n\n1. 已在 5090 GPU 上测试验证\n2. 安装说明请参见 [NVIDIA 50 系列显卡服务器部署方案](#Server-Deployment-Solution-for-NVIDIA-50-Series-Graphics-Cards)\n\n### **[新版 Ubuntu 版本通知]**\n\n**Ubuntu 版本正式发布**\n\n1. 已完成对 Ubuntu 22.04 桌面版（内核 6.8.0-52-generic）的适配和验证工作。尚未对其他 Linux 版本进行兼容性测试。\n2. 客户端程序界面增加了国际化（英文）支持。\n3. 修复了一些已知问题\n   - \\#304\n   - \\#292\n4. [Ubuntu22.04 安装文档](https:\u002F\u002Fgithub.com\u002Fduixcom\u002FDuix.Avatar?tab=readme-ov-file#ubuntu-2204-installation)\n\n## 6. 常见问题解答\n\n### **提问前自查步骤**\n\n1. 检查所有三项服务是否处于 Running（运行）状态\n\n    ![img](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fduixcom_Duix-Avatar_readme_c3686bdda9e2.png)\n\n2. 确认您的机器拥有 NVIDIA 显卡且驱动程序正确安装。\n\n本项目的所有算力均为本地计算。如果没有 NVIDIA 显卡或正确的驱动程序，这三项服务将无法启动。\n\n3. 确保服务端和客户端均已更新至最新版本。该项目刚开源，社区非常活跃，更新频繁。您的问题可能在新版本中已得到解决。\n   - 服务端：进入 `\u002Fdeploy` 目录并重新执行 `docker-compose up -d`\n   - 客户端：`pull` 代码并重新 `build`\n4. [GitHub Issue（问题单）](https:\u002F\u002Fgithub.com\u002Fduixcom\u002FDuix.Avatar\u002Fissues) 持续更新，每天都有问题被解决和关闭。请经常查看，您的问题可能已经解决了。\n\n### **问题模板**\n\n1. 问题描述\n\n详细描述复现步骤，如有可能请附上截图。\n\n2. 提供错误日志\n    - 如何获取客户端日志：\n\n      ![img](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fduixcom_Duix-Avatar_readme_d093440deb4c.jpeg)\n\n    - 服务端日志：\n\n      找到关键位置，或者点击我们的三个 Docker 服务，并按下图所示“复制”。\n\n      ![img](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fduixcom_Duix-Avatar_readme_b586c6459533.png)\n\n## 7. 如何实时交互\n\nDuix.Avatar 数字人实现了数字人克隆和非实时视频合成。\n\n如果您希望数字人支持交互，可以访问 [duix.com](www.duix.com) 体验免费试用。\n\n## 8. 联系方式\n\n如有任何问题，请提交 Issue 或联系我们 james@duix.com\n\n## 9. 许可证\n\nhttps:\u002F\u002Fgithub.com\u002Fduixcom\u002FDuix.Avatar\u002Fblob\u002Fmain\u002FLICENSE\n\n## 10. 致谢\n\n- ASR（自动语音识别）基于 fun-asr\n- TTS（文本转语音）基于 fish-speech-ziming\n\n## 11. Star 历史\n\n[GitHub Star 历史](https:\u002F\u002Fwww.star-history.com\u002F#duixcom\u002FDuix.Avatar&Date)","# Duix-Avatar 快速上手指南\n\n**Duix-Avatar** 是一款完全离线的 AI 数字人视频合成工具，支持 Windows 和 Ubuntu 系统。它利用 AI 技术克隆您的外貌与声音，通过文本或语音驱动虚拟形象生成视频，无需联网即可保护隐私。\n\n## 1. 环境准备\n\n在开始之前，请确保您的硬件和软件环境满足以下要求：\n\n### 系统要求\n- **操作系统**：Windows 10 (版本 19042.1526 及以上) 或 Ubuntu 22.04。\n- **显卡**：NVIDIA 显卡（推荐 RTX 4070 或更高），需正确安装驱动。\n- **内存**：建议 32GB 或以上。\n- **硬盘空间**：\n  - C 盘：剩余空间大于 100GB（用于存储服务镜像）。\n  - D 盘：剩余空间大于 30GB（用于存储数字人和项目数据）。\n\n### 前置依赖\n- **Node.js**：版本 18。\n- **Docker**：已安装并运行。\n- **Docker Compose**：已安装。\n\n> **注意**：首次拉取镜像约消耗 70GB 流量，建议使用 WiFi 网络。\n\n---\n\n## 2. 安装步骤\n\n### 第一步：配置 Docker 与驱动\n\n#### Windows 用户\n1. 检查并更新 WSL：\n   ```bash\n   wsl --list --verbose\n   wsl --update\n   ```\n2. 下载并安装 [Docker Desktop](https:\u002F\u002Fwww.docker.com\u002F)。\n3. 启动 Docker，接受协议并跳过登录。\n\n#### Ubuntu 用户\n1. 安装 Docker：\n   ```bash\n   sudo apt update\n   sudo apt install docker.io\n   sudo apt install docker-compose\n   ```\n2. 安装 NVIDIA 驱动及容器工具包：\n   ```bash\n   distribution=$(. \u002Fetc\u002Fos-release;echo $ID$VERSION_ID) \\\n     && curl -s -L https:\u002F\u002Fnvidia.github.io\u002Flibnvidia-container\u002Fgpgkey | sudo apt-key add - \\\n     && curl -s -L https:\u002F\u002Fnvidia.github.io\u002Flibnvidia-container\u002F$distribution\u002Flibnvidia-container.list | sudo tee \u002Fetc\u002Fapt\u002Fsources.list.d\u002Fnvidia-container-toolkit.list\n   sudo apt-get update\n   sudo apt-get install -y nvidia-container-toolkit\n   sudo nvidia-ctk runtime configure --runtime=docker\n   sudo systemctl restart docker\n   ```\n\n### 第二步：部署服务器端\n进入项目 `\u002Fdeploy` 目录，执行以下命令启动服务：\n\n```bash\ncd \u002Fdeploy\ndocker-compose up -d\n```\n*(如需使用精简版 lite version，可执行 `docker-compose -f docker-compose-lite.yml up -d`)*\n\n等待约半小时，直到 Docker 显示三个服务运行成功（Lite 版为一个服务）。\n\n### 第三步：安装客户端\n- **Windows**：下载官方安装包 `Duix.Avatar-x.x.x-setup.exe` 并双击安装。\n- **Ubuntu**：下载 `Duix.Avatar-x.x.x.AppImage`。若以 root 用户运行失败，请在终端执行：\n  ```bash\n  .\u002FDuix.Avatar-x.x.x.AppImage --no-sandbox\n  ```\n\n---\n\n## 3. 基本使用\n\n部署完成后，所有功能均为本地离线运行。您可通过以下方式交互：\n\n### 1. 访问本地接口\n服务启动后，API 将在本地暴露，默认地址为 `http:\u002F\u002F127.0.0.1`。\n\n### 2. 核心 API 调用示例\n参考源码 `src\u002Fmain\u002Fservice\u002F` 获取详细参数，以下为音频合成接口示例：\n\n**接口地址**：`http:\u002F\u002F127.0.0.1:18180\u002Fv1\u002Finvoke`\n\n**请求参数**：\n```json\n{\n  \"speaker\": \"{uuid}\", \n  \"text\": \"xxxxxxxxxx\", \n  \"format\": \"wav\", \n  \"topP\": 0.7, \n  \"max_new_tokens\": 1024, \n  \"chunk_length\": 100, \n  \"repetition_penalty\": 1.2, \n  \"temperature\": 0.7, \n  \"need_asr\": false, \n  \"streaming\": false, \n  \"is_fixed_seed\": 0, \n  \"is_norm\": 0, \n  \"reference_audio\": \"{voice.asr_format_audio_url}\", \n  \"reference_text\": \"{voice.reference_audio_text}\"\n}\n```\n\n### 3. 视频合成\n视频合成接口位于 `http:\u002F\u002F127.0.0.1:8383\u002Feasy\u002Fsubmit`，需传入生成的音频路径和视频路径进行驱动合成。\n\n> **提示**：首次使用前，请确保按照文档指引完成模型训练（将音频放入 `D:\\duix_avatar_data\\voice\\data` 等指定目录）。","某独立编程教育博主计划每周更新三次技术教程视频，但受限于时间精力和预算，急需高效解决方案。\n\n### 没有 Duix-Avatar 时\n- 传统拍摄需搭建专业影棚，单次录制准备及打光时间长达两小时。\n- 发现口误必须重新整段录制，后期剪辑效率极低，难以快速响应反馈。\n- 使用在线 SaaS 平台存在个人生物特征数据上传云端泄露风险。\n- 订阅制付费高昂，且不支持多语言切换，难以覆盖海外学员群体。\n\n### 使用 Duix-Avatar 后\n- Duix-Avatar 支持本地部署，只需少量样本即可克隆形象，无需反复出镜。\n- 通过文本驱动数字人口型，修改脚本后秒级重新生成，极大提升迭代速度。\n- 全离线模式确保所有训练数据保留在本地电脑，彻底杜绝隐私隐患。\n- 内置八国语言支持，一键切换中英双语输出，轻松触达全球学习者。\n\nDuix-Avatar 以零成本、高隐私的本地化方案，彻底解决了数字人视频制作的效率与数据安全难题。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fduixcom_Duix-Avatar_ca5212cf.png","duixcom","Duix","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002Fduixcom_4cf6bef7.png","Duix is a global leader in digital human(AI avatar) solutions, offering open-source technologies for video-based avatars, real-time interactive avatars.",null,"support@duix.com","Duix_official","https:\u002F\u002Fwww.duix.com\u002F","https:\u002F\u002Fgithub.com\u002Fduixcom",[86,90,94,98,102,106],{"name":87,"color":88,"percentage":89},"C","#555555",85,{"name":91,"color":92,"percentage":93},"Vue","#41b883",10.3,{"name":95,"color":96,"percentage":97},"JavaScript","#f1e05a",4.3,{"name":99,"color":100,"percentage":101},"Less","#1d365d",0.2,{"name":103,"color":104,"percentage":105},"CSS","#663399",0.1,{"name":107,"color":108,"percentage":109},"HTML","#e34c26",0,12666,2086,"2026-04-05T10:33:38","NOASSERTION","Windows, Linux","需要 NVIDIA 显卡，推荐 RTX 4070，需安装正确驱动程序，50 系显卡需使用官方预览版 PyTorch","32GB",{"notes":118,"python":119,"dependencies":120},"Windows 需安装 WSL；C 盘需 100GB+ 空闲空间，D 盘需 30GB+；首次部署下载约 70GB 流量；Ubuntu 若以 root 用户运行客户端需添加 --no-sandbox 参数","未说明",[121,122,123,124],"Nodejs 18","Docker","Docker Compose","NVIDIA Container Toolkit",[18,15],[127,128,129,130,131,132,133,134],"ai-avatar","ai-avatars","digital-human","video-generation","cloning","cloning-tool","multimodal-ai","video-synthesis",10,"2026-03-27T02:49:30.150509","2026-04-06T06:46:11.684378",[139,144,148,153,158,163],{"id":140,"question_zh":141,"answer_zh":142,"source_url":143},3467,"heygem-tts 启动报错‘No module named tools.api’怎么办？","在 docker-compose.yaml 文件中修改启动命令。将原来的 \"\u002Fopt\u002Fconda\u002Fenvs\u002Fpython310\u002Fbin\u002Fpython3 -m tools.api --listen 0.0.0.0:8080\" 替换为 \"\u002Fopt\u002Fconda\u002Fenvs\u002Fpython310\u002Fbin\u002Fpython3 tools\u002Fapi_server.py --listen 0.0.0.0:8080\" 即可解决。","https:\u002F\u002Fgithub.com\u002Fduixcom\u002FDuix-Avatar\u002Fissues\u002F5",{"id":145,"question_zh":146,"answer_zh":147,"source_url":143},3468,"F2F 模块启动后一直不运行或报错如何处理？","在 docker-compose.yaml 中将 face2face 相关的配置文件挂载行注释掉。具体操作是将以下两行加上#号注释：\n - d:\u002Fheygem_data\u002Fface2face\u002Fsdk\u002Fconfig.ini:\u002Fcode\u002Fconfig\u002Fconfig.ini\n - d:\u002Fheygem_data\u002Fface2face\u002Fsdk\u002Flicense.txt:\u002Fcode\u002Flicense.txt",{"id":149,"question_zh":150,"answer_zh":151,"source_url":152},3469,"Docker 镜像体积太大（70G）如何优化？","使用官方推出的 Lite 版本。Lite 版可以减少两个服务（heygem-tts 和 heygem-asr），安装体积从 70G 减小到 13.5G，定制形象和视频生成都更快。注意：Lite 版没有文字生成视频功能，只能用上传音频的方式生成视频。","https:\u002F\u002Fgithub.com\u002Fduixcom\u002FDuix-Avatar\u002Fissues\u002F10",{"id":154,"question_zh":155,"answer_zh":156,"source_url":157},3470,"默认使用第一张 GPU，修改配置不生效，支持多卡调用吗？","常见问题是 CUDA 版本和 Docker 版本不匹配导致无法识别多卡。请检查宿主机的 NVIDIA 驱动版本以及 Docker 容器内的 CUDA 版本是否一致，确保版本匹配后才能正常调用多张 GPU。","https:\u002F\u002Fgithub.com\u002Fduixcom\u002FDuix-Avatar\u002Fissues\u002F355",{"id":159,"question_zh":160,"answer_zh":161,"source_url":162},3471,"提交定制时报错 SQLite3 can only bind numbers... 且内存不足如何解决？","确保物理内存至少 32G。如果是 WSL2 环境，需在用户目录（%UserProfile%）创建 .wslconfig 文件，内容为 [wsl2] memory=32GB。保存后在 cmd 输入 wsl --shutdown，然后重启 Docker 即可生效。","https:\u002F\u002Fgithub.com\u002Fduixcom\u002FDuix-Avatar\u002Fissues\u002F149",{"id":164,"question_zh":165,"answer_zh":166,"source_url":162},3472,"遇到 SQLite 错误时，Docker 配置和模型加载需要注意什么？","1. 确保没有开启 Docker 的省资源模式。\n2. 启动或重启后需等待约 5 分钟，因为 heygem-asr 需要下载模型文件，过早请求会导致连接异常。",[168,173],{"id":169,"version":170,"summary_zh":171,"released_at":172},103084,"v1.0.6","It's just a name change. If the version you're using has no issues, you can continue to use it. If you're a new user, please use this version to match the latest server.","2025-09-28T09:47:14",{"id":174,"version":175,"summary_zh":176,"released_at":177},103085,"v1.0.5","The original project name \"HeyGem\" has now been officially changed to \"Duix.Avatar\".","2025-08-15T08:33:49"]