[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"tool-cleardusk--3DDFA_V2":3,"similar-cleardusk--3DDFA_V2":146},{"id":4,"github_repo":5,"name":6,"description_en":7,"description_zh":8,"ai_summary_zh":8,"readme_en":9,"readme_zh":10,"quickstart_zh":11,"use_case_zh":12,"hero_image_url":13,"owner_login":14,"owner_name":15,"owner_avatar_url":16,"owner_bio":17,"owner_company":18,"owner_location":19,"owner_email":20,"owner_twitter":19,"owner_website":21,"owner_url":22,"languages":23,"stars":51,"forks":52,"last_commit_at":53,"license":54,"difficulty_score":55,"env_os":56,"env_gpu":57,"env_ram":58,"env_deps":59,"category_tags":67,"github_topics":71,"view_count":85,"oss_zip_url":19,"oss_zip_packed_at":19,"status":86,"created_at":87,"updated_at":88,"faqs":89,"releases":130},7042,"cleardusk\u002F3DDFA_V2","3DDFA_V2","The official PyTorch implementation of Towards Fast, Accurate and Stable 3D Dense Face Alignment, ECCV 2020.","3DDFA_V2 是一款基于 PyTorch 开源的 3D 人脸密集对齐工具，源自 ECCV 2020 的研究成果。它的核心功能是从单张图片或实时视频流中，快速、精准地重建出包含丰富细节的 3D 人脸模型，并估算头部姿态。\n\n针对传统方法在速度、精度和稳定性上的不足，3DDFA_V2 进行了全面升级。它摒弃了较慢的检测器，集成了高效的 FaceBoxes 人脸检测算法，并引入了优化的 C++\u002FCython 渲染模块，显著提升了处理效率。其独特的技术亮点在于支持 ONNX Runtime 加速，使得在普通 CPU 上推理 3DMM 参数的延迟低至约 1.35 毫秒\u002F图，实现了真正的实时高性能运行。此外，它还支持导出多种 3D 格式（如 .ply, .obj）及纹理映射，功能十分全面。\n\n这款工具非常适合计算机视觉领域的研究人员、需要集成人脸 3D 功能的开发者，以及对数字人建模感兴趣的技术爱好者使用。无论是用于学术研究、AR\u002FVR 应用开发，还是作为学习 3D 人脸重建的入门项目，3DDFA_V2 都提供了稳定且易于上手的官方实现，甚至无需昂贵显卡即可在 CPU 环境中流畅体验。","# Towards Fast, Accurate and Stable 3D Dense Face Alignment\n\n[![License](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Flicense-MIT-yellow.svg)](LICENSE)\n![GitHub repo size](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Frepo-size\u002Fcleardusk\u002F3DDFA_V2.svg)\n[![](https:\u002F\u002Fcolab.research.google.com\u002Fassets\u002Fcolab-badge.svg)](https:\u002F\u002Fcolab.research.google.com\u002Fdrive\u002F1OKciI0ETCpWdRjP-VOGpBulDJojYfgWv)\n\nBy [Jianzhu Guo](https:\u002F\u002Fguojianzhu.com), [Xiangyu 
Zhu](http:\u002F\u002Fwww.cbsr.ia.ac.cn\u002Fusers\u002Fxiangyuzhu\u002F), [Yang Yang](http:\u002F\u002Fwww.cbsr.ia.ac.cn\u002Fusers\u002Fyyang\u002Fmain.htm), Fan Yang, [Zhen Lei](http:\u002F\u002Fwww.cbsr.ia.ac.cn\u002Fusers\u002Fzlei\u002F) and [Stan Z. Li](https:\u002F\u002Fscholar.google.com\u002Fcitations?user=Y-nyLGIAAAAJ).\nThe code repo is owned and maintained by **[Jianzhu Guo](https:\u002F\u002Fguojianzhu.com)**.\n\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcleardusk_3DDFA_V2_readme_3a445f5812e0.gif\" alt=\"demo\" width=\"512px\">\n\u003C\u002Fp>\n\n\n**\\[Updates\\]**\n - `2021.7.10`: Run 3DDFA_V2 online on [Gradio](https:\u002F\u002Fgradio.app\u002Fhub\u002FAK391\u002F3DDFA_V2).\n - `2021.1.15`: Borrow the implementation of [Dense-Head-Pose-Estimation](https:\u002F\u002Fgithub.com\u002F1996scarlet\u002FDense-Head-Pose-Estimation) for the faster mesh rendering (speedup about 3x, 15ms -> 4ms), see [utils\u002Frender_ctypes.py](.\u002Futils\u002Frender_ctypes.py) for details.\n - `2020.10.7`: Add the latency evaluation of the full pipeline in [latency.py](.\u002Flatency.py), just run by `python3 latency.py --onnx`, see [Latency](#Latency) evaluation for details.\n - `2020.10.6`: Add onnxruntime support for FaceBoxes to reduce the face detection latency, just append the `--onnx` action to activate it, see [FaceBoxes_ONNX.py](FaceBoxes\u002FFaceBoxes_ONNX.py) for details.\n - `2020.10.2`: **Add onnxruntime support to greatly reduce the 3dmm parameters inference latency**, just append the `--onnx` action when running `demo.py`, see [TDDFA_ONNX.py](.\u002FTDDFA_ONNX.py) for details.\n - `2020.9.20`: Add features including pose estimation and serializations to .ply and .obj, see `pose`, `ply`, `obj` options in [demo.py](.\u002Fdemo.py).\n - `2020.9.19`: Add PNCC (Projected Normalized Coordinate Code), uv texture mapping features, see `pncc`, `uv_tex` options in [demo.py](.\u002Fdemo.py).\n\n\n## 
Introduction\n\nThis work extends [3DDFA](https:\u002F\u002Fgithub.com\u002Fcleardusk\u002F3DDFA), named **3DDFA_V2**, titled [Towards Fast, Accurate and Stable 3D Dense Face Alignment](https:\u002F\u002Fguojianzhu.com\u002Fassets\u002Fpdfs\u002F3162.pdf), accepted by [ECCV 2020](https:\u002F\u002Feccv2020.eu\u002F). The supplementary material is [here](https:\u002F\u002Fguojianzhu.com\u002Fassets\u002Fpdfs\u002F3162-supp.pdf). The [gif](.\u002Fhttps:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcleardusk_3DDFA_V2_readme_3a445f5812e0.gif) above shows a webcam demo of the tracking result, in the scenario of my lab. This repo is the official implementation of 3DDFA_V2.\n\nCompared to [3DDFA](https:\u002F\u002Fgithub.com\u002Fcleardusk\u002F3DDFA), 3DDFA_V2 achieves better performance and stability. Besides, 3DDFA_V2 incorporates the fast face detector [FaceBoxes](https:\u002F\u002Fgithub.com\u002Fzisianw\u002FFaceBoxes.PyTorch) instead of Dlib. A simple 3D render written by c++ and cython is also included. This repo supports the onnxruntime, and the latency of regressing 3DMM parameters using the default backbone is about **1.35ms\u002Fimage on CPU** with a single image as input. If you are interested in this repo, just try it on this **[google colab](https:\u002F\u002Fcolab.research.google.com\u002Fdrive\u002F1OKciI0ETCpWdRjP-VOGpBulDJojYfgWv)**! Welcome for valuable issues, PRs and discussions 😄\n\n\u003C!-- Currently, the pre-trained model, inference code and some utilities are released.  -->\n\n## Getting started\n\n### Requirements\nSee [requirements.txt](.\u002Frequirements.txt), tested on macOS and Linux platforms. The Windows users may refer to [FQA](#FQA) for building issues. Note that this repo uses Python3. The major dependencies are PyTorch, numpy, opencv-python and onnxruntime, etc. If you run the demos with `--onnx` flag to do acceleration, you may need to install `libomp` first, i.e., `brew install libomp` on macOS.\n\n### Usage\n\n1. 
Clone this repo\n   \n```shell script\ngit clone https:\u002F\u002Fgithub.com\u002Fcleardusk\u002F3DDFA_V2.git\ncd 3DDFA_V2\n```\n\n2. Build the cython version of NMS, Sim3DR, and the faster mesh render\n\u003C!-- ```shell script\ncd FaceBoxes\nsh .\u002Fbuild_cpu_nms.sh\ncd ..\n\ncd Sim3DR\nsh .\u002Fbuild_sim3dr.sh\ncd ..\n\n# the faster mesh render\ncd utils\u002Fasset\ngcc -shared -Wall -O3 render.c -o render.so -fPIC\ncd ..\u002F..\n```\n\nor simply build them by -->\n```shell script\nsh .\u002Fbuild.sh\n```\n\n3. Run demos\n\n```shell script\n# 1. running on still image, the options include: 2d_sparse, 2d_dense, 3d, depth, pncc, pose, uv_tex, ply, obj\npython3 demo.py -f examples\u002Finputs\u002Femma.jpg --onnx # -o [2d_sparse, 2d_dense, 3d, depth, pncc, pose, uv_tex, ply, obj]\n\n# 2. running on videos\npython3 demo_video.py -f examples\u002Finputs\u002Fvideos\u002F214.avi --onnx\n\n# 3. running on videos smoothly by looking ahead by `n_next` frames\npython3 demo_video_smooth.py -f examples\u002Finputs\u002Fvideos\u002F214.avi --onnx\n\n# 4. running on webcam\npython3 demo_webcam_smooth.py --onnx\n```\n\nThe implementation of tracking is simply by alignment. If the head pose > 90° or the motion is too fast, the alignment may fail. 
A threshold is used to roughly check the tracking state, but this check is unstable.\n\nYou can refer to [demo.ipynb](.\u002Fdemo.ipynb) or [google colab](https:\u002F\u002Fcolab.research.google.com\u002Fdrive\u002F1OKciI0ETCpWdRjP-VOGpBulDJojYfgWv) for a step-by-step tutorial on running on a still image.\n\nFor example, running `python3 demo.py -f examples\u002Finputs\u002Femma.jpg -o 3d` will give the result below:\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcleardusk_3DDFA_V2_readme_5cfd1279d001.jpg\" alt=\"demo\" width=\"640px\">\n\u003C\u002Fp>\n\nAnother example:\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcleardusk_3DDFA_V2_readme_9d7dc6e9f897.jpg\" alt=\"demo\" width=\"640px\">\n\u003C\u002Fp>\n\nRunning on a video will give:\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcleardusk_3DDFA_V2_readme_8f88e33884c0.gif\" alt=\"demo\" width=\"512px\">\n\u003C\u002Fp>\n\nMore results and demos: [Hathaway](https:\u002F\u002Fguojianzhu.com\u002Fassets\u002Fvideos\u002Fhathaway_3ddfa_v2.mp4).\n\n\u003C!-- Obviously, the eyes parts are not good. 
-->\n\n### Features (up to now)\n\n\n\u003Ctable>\n  \u003Ctr>\n    \u003Cth>2D sparse\u003C\u002Fth>\n    \u003Cth>2D dense\u003C\u002Fth>\n    \u003Cth>3D\u003C\u002Fth>\n  \u003C\u002Ftr>\n\n  \u003Ctr>\n    \u003Ctd>\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcleardusk_3DDFA_V2_readme_2b1e45dfe9d7.jpg\" width=\"360\" alt=\"2d sparse\">\u003C\u002Ftd>\n    \u003Ctd>\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcleardusk_3DDFA_V2_readme_9a5cc3a4ebf8.jpg\"  width=\"360\" alt=\"2d dense\">\u003C\u002Ftd>\n    \u003Ctd>\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcleardusk_3DDFA_V2_readme_48f3e8a93e46.jpg\"        width=\"360\" alt=\"3d\">\u003C\u002Ftd>\n  \u003C\u002Ftr>\n\n  \u003Ctr>\n    \u003Cth>Depth\u003C\u002Fth>\n    \u003Cth>PNCC\u003C\u002Fth>\n    \u003Cth>UV texture\u003C\u002Fth>\n  \u003C\u002Ftr>\n\n  \u003Ctr>\n    \u003Ctd>\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcleardusk_3DDFA_V2_readme_ce6c0523fb57.jpg\"     width=\"360\" alt=\"depth\">\u003C\u002Ftd>\n    \u003Ctd>\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcleardusk_3DDFA_V2_readme_13e924297835.jpg\"      width=\"360\" alt=\"pncc\">\u003C\u002Ftd>\n    \u003Ctd>\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcleardusk_3DDFA_V2_readme_948b1fa279c0.jpg\"    width=\"360\" alt=\"uv_tex\">\u003C\u002Ftd>\n  \u003C\u002Ftr>\n\n  \u003Ctr>\n    \u003Cth>Pose\u003C\u002Fth>\n    \u003Cth>Serialization to .ply\u003C\u002Fth>\n    \u003Cth>Serialization to .obj\u003C\u002Fth>\n  \u003C\u002Ftr>\n\n  \u003Ctr>\n    \u003Ctd>\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcleardusk_3DDFA_V2_readme_e025c41d2328.jpg\"      width=\"360\" alt=\"pose\">\u003C\u002Ftd>\n    \u003Ctd>\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcleardusk_3DDFA_V2_readme_9ae14d02c517.jpg\"                     width=\"360\" 
alt=\"ply\">\u003C\u002Ftd>\n    \u003Ctd>\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcleardusk_3DDFA_V2_readme_383bdf492eae.jpg\"                     width=\"360\" alt=\"obj\">\u003C\u002Ftd>\n  \u003C\u002Ftr>\n\n\u003C\u002Ftable>\n\n### Configs\n\nThe default backbone is MobileNet_V1 with input size 120x120 and the default pre-trained weight is `weights\u002Fmb1_120x120.pth`, shown in [configs\u002Fmb1_120x120.yml](configs\u002Fmb1_120x120.yml). This repo provides another config in [configs\u002Fmb05_120x120.yml](configs\u002Fmb05_120x120.yml), with the widen factor 0.5, being smaller and faster. You can specify the config by `-c` or `--config` option. The released models are shown in the below table. Note that the inference time on CPU in the paper is evaluated using TensorFlow.\n\n| Model | Input | #Params | #Macs | Inference (TF) |\n| :-: | :-: | :-: | :-: | :-: |\n| MobileNet  | 120x120 | 3.27M | 183.5M | ~6.2ms |\n| MobileNet x0.5 | 120x120 | 0.85M | 49.5M | ~2.9ms |\n\n\n**Surprisingly**, the latency of [onnxruntime](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fonnxruntime) is much smaller. The inference time on CPU with different threads is shown below. The results are tested on my MBP (i5-8259U CPU @ 2.30GHz on 13-inch MacBook Pro), with the `1.5.1` version of onnxruntime. The thread number is set by `os.environ[\"OMP_NUM_THREADS\"]`, see [speed_cpu.py](.\u002Fspeed_cpu.py) for more details.\n\n| Model | THREAD=1 | THREAD=2 | THREAD=4 |\n| :-: | :-: | :-: | :-: |\n| MobileNet  | 4.4ms  | 2.25ms | 1.35ms |\n| MobileNet x0.5 | 1.37ms | 0.7ms | 0.5ms |\n\n### Latency\n\nThe `onnx` option greatly reduces the overall **CPU** latency, but face detection still takes up most of the latency time, e.g., 15ms for a 720p image. 3DMM parameters regression takes about 1~2ms for one face, and the dense reconstruction (more than 30,000 points, i.e. 38,365) is about 1ms for one face. 
Tracking applications may benefit from the fast 3DMM regression speed, since detection is not needed for every frame. The latency was measured on my 13-inch MacBook Pro (i5-8259U CPU @ 2.30GHz).\n\nThe default `OMP_NUM_THREADS` is set to 4; you can change it by setting `os.environ['OMP_NUM_THREADS'] = '$NUM'` or by running `export OMP_NUM_THREADS=$NUM` before launching the Python script.\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcleardusk_3DDFA_V2_readme_360da3d56223.gif\" alt=\"demo\" width=\"640px\">\n\u003C\u002Fp>\n\n## FQA\n\n1. What is the training data?\n\n    We use [300W-LP](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F0B7OEHD3T4eCkVGs0TkhUWFN6N1k\u002Fview?usp=sharing) for training. You can refer to our [paper](https:\u002F\u002Fguojianzhu.com\u002Fassets\u002Fpdfs\u002F3162.pdf) for more details about the training. Because few images in the 300W-LP training data show closed eyes, the eye landmarks are not accurate when the eyes are closed. The eye region in the webcam demo is also not good.\n\n2. 
Running on Windows.\n\n    You can refer to [this comment](https:\u002F\u002Fgithub.com\u002Fcleardusk\u002F3DDFA_V2\u002Fissues\u002F12#issuecomment-697479173) for building NMS on Windows.\n\n## Acknowledgement\n\n* The FaceBoxes module is modified from [FaceBoxes.PyTorch](https:\u002F\u002Fgithub.com\u002Fzisianw\u002FFaceBoxes.PyTorch).\n* A list of previous works on 3D dense face alignment or reconstruction: [3DDFA](https:\u002F\u002Fgithub.com\u002Fcleardusk\u002F3DDFA), [face3d](https:\u002F\u002Fgithub.com\u002FYadiraF\u002Fface3d), [PRNet](https:\u002F\u002Fgithub.com\u002FYadiraF\u002FPRNet).\n* Thank [AK391](https:\u002F\u002Fgithub.com\u002FAK391) for hosting the Gradio web app.\n\n## Other implementations or applications\n\n* [Dense-Head-Pose-Estimation](https:\u002F\u002Fgithub.com\u002F1996scarlet\u002FDense-Head-Pose-Estimation): Tensorflow Lite framework for face mesh, head pose, landmarks, and more.\n* [HeadPoseEstimate](https:\u002F\u002Fgithub.com\u002Fbubingy\u002FHeadPoseEstimate): Head pose estimation system based on 3d facial landmarks.\n* [img2pose](https:\u002F\u002Fgithub.com\u002Fvitoralbiero\u002Fimg2pose): Borrow the renderer implementation of Sim3DR in this repo.\n\n## Citation\n\nIf your work or research benefits from this repo, please cite two bibs below : ) and 🌟 this repo.\n\n    @inproceedings{guo2020towards,\n        title =        {Towards Fast, Accurate and Stable 3D Dense Face Alignment},\n        author =       {Guo, Jianzhu and Zhu, Xiangyu and Yang, Yang and Yang, Fan and Lei, Zhen and Li, Stan Z},\n        booktitle =    {Proceedings of the European Conference on Computer Vision (ECCV)},\n        year =         {2020}\n    }\n\n    @misc{3ddfa_cleardusk,\n        author =       {Guo, Jianzhu and Zhu, Xiangyu and Lei, Zhen},\n        title =        {3DDFA},\n        howpublished = {\\url{https:\u002F\u002Fgithub.com\u002Fcleardusk\u002F3DDFA}},\n        year =         {2018}\n    }\n\n## Contact\n**Jianzhu Guo (郭建珠)** 
[[Homepage](https:\u002F\u002Fguojianzhu.com), [Google Scholar](https:\u002F\u002Fscholar.google.com\u002Fcitations?user=W8_JzNcAAAAJ&hl=en&oi=ao)]: **guojianzhu1994@foxmail.com** or **guojianzhu1994@gmail.com** or **jianzhu.guo@nlpr.ia.ac.cn** (this email will be invalid soon).\n","# 致力于快速、准确且稳定的3D密集人脸对齐\n\n[![许可证](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Flicense-MIT-yellow.svg)](LICENSE)\n![GitHub 仓库大小](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Frepo-size\u002Fcleardusk\u002F3DDFA_V2.svg)\n[![](https:\u002F\u002Fcolab.research.google.com\u002Fassets\u002Fcolab-badge.svg)](https:\u002F\u002Fcolab.research.google.com\u002Fdrive\u002F1OKciI0ETCpWdRjP-VOGpBulDJojYfgWv)\n\n作者：[郭建竹](https:\u002F\u002Fguojianzhu.com)、[朱向宇](http:\u002F\u002Fwww.cbsr.ia.ac.cn\u002Fusers\u002Fxiangyuzhu\u002F)、[杨洋](http:\u002F\u002Fwww.cbsr.ia.ac.cn\u002Fusers\u002Fyyang\u002Fmain.htm)、杨帆、[雷震](http:\u002F\u002Fwww.cbsr.ia.ac.cn\u002Fusers\u002Fzlei\u002F) 和 [Stan Z. Li](https:\u002F\u002Fscholar.google.com\u002Fcitations?user=Y-nyLGIAAAAJ)。  \n代码仓库由 **[郭建竹](https:\u002F\u002Fguojianzhu.com)** 拥有并维护。\n\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcleardusk_3DDFA_V2_readme_3a445f5812e0.gif\" alt=\"demo\" width=\"512px\">\n\u003C\u002Fp>\n\n\n**\\[更新\\]**\n - `2021.7.10`：在 [Gradio](https:\u002F\u002Fgradio.app\u002Fhub\u002FAK391\u002F3DDFA_V2) 上在线运行 3DDFA_V2。\n - `2021.1.15`：借鉴了 [Dense-Head-Pose-Estimation](https:\u002F\u002Fgithub.com\u002F1996scarlet\u002FDense-Head-Pose-Estimation) 的实现，以加快网格渲染速度（提速约 3 倍，从 15ms 降至 4ms），详情请参阅 [utils\u002Frender_ctypes.py](.\u002Futils\u002Frender_ctypes.py)。\n - `2020.10.7`：在 [latency.py](.\u002Flatency.py) 中添加了完整流水线的延迟评估，只需运行 `python3 latency.py --onnx` 即可，详情请参阅 [Latency](#Latency) 评估。\n - `2020.10.6`：为 FaceBoxes 添加了 onnxruntime 支持，以降低人脸检测延迟，只需添加 `--onnx` 参数即可启用，详情请参阅 [FaceBoxes_ONNX.py](FaceBoxes\u002FFaceBoxes_ONNX.py)。\n - `2020.10.2`：**添加了 onnxruntime 支持，大幅降低 3DMM 参数推断延迟**，只需在运行 
`demo.py` 时添加 `--onnx` 参数即可，详情请参阅 [TDDFA_ONNX.py](.\u002FTDDFA_ONNX.py)。\n - `2020.9.20`：新增姿态估计以及导出为 .ply 和 .obj 格式的功能，详见 [demo.py](.\u002Fdemo.py) 中的 `pose`、`ply`、`obj` 选项。\n - `2020.9.19`：新增 PNCC（投影归一化坐标编码）和 UV 纹理映射功能，详见 [demo.py](.\u002Fdemo.py) 中的 `pncc` 和 `uv_tex` 选项。\n\n\n## 简介\n\n本工作是对 [3DDFA](https:\u002F\u002Fgithub.com\u002Fcleardusk\u002F3DDFA) 的扩展，命名为 **3DDFA_V2**，论文标题为 [Towards Fast, Accurate and Stable 3D Dense Face Alignment](https:\u002F\u002Fguojianzhu.com\u002Fassets\u002Fpdfs\u002F3162.pdf)，已被 [ECCV 2020](https:\u002F\u002Feccv2020.eu\u002F) 接受。补充材料请见 [这里](https:\u002F\u002Fguojianzhu.com\u002Fassets\u002Fpdfs\u002F3162-supp.pdf)。上方的 [gif](.\u002Fhttps:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcleardusk_3DDFA_V2_readme_3a445f5812e0.gif) 展示了实验室场景下的摄像头跟踪演示结果。本仓库是 3DDFA_V2 的官方实现。\n\n与 [3DDFA](https:\u002F\u002Fgithub.com\u002Fcleardusk\u002F3DDFA) 相比，3DDFA_V2 在性能和稳定性上都有所提升。此外，3DDFA_V2 使用了快速人脸检测器 [FaceBoxes](https:\u002F\u002Fgithub.com\u002Fzisianw\u002FFaceBoxes.PyTorch) 替代 Dlib，并包含用 C++ 和 Cython 编写的简单 3D 渲染器。本仓库支持 onnxruntime，在使用默认主干网络回归 3DMM 参数时，单张图像输入的延迟约为 **CPU 上 1.35ms\u002F帧**。如果您对本仓库感兴趣，不妨在 [google colab](https:\u002F\u002Fcolab.research.google.com\u002Fdrive\u002F1OKciI0ETCpWdRjP-VOGpBulDJojYfgWv) 上试一试！欢迎提出宝贵的问题、PR 和讨论 😄\n\n\u003C!-- 目前已发布预训练模型、推理代码及部分工具。 -->\n\n## 快速入门\n\n### 需求\n请参阅 [requirements.txt](.\u002Frequirements.txt)，已在 macOS 和 Linux 平台上测试通过。Windows 用户如遇构建问题，可参考 [FQA](#FQA)。请注意，本仓库使用 Python3。主要依赖包括 PyTorch、numpy、opencv-python 和 onnxruntime 等。若使用 `--onnx` 标志运行演示以加速，可能需要先安装 `libomp`，例如在 macOS 上运行 `brew install libomp`。\n\n### 使用方法\n\n1. 克隆本仓库\n   \n```shell script\ngit clone https:\u002F\u002Fgithub.com\u002Fcleardusk\u002F3DDFA_V2.git\ncd 3DDFA_V2\n```\n\n2. 
构建 NMS、Sim3DR 的 Cython 版本以及更快的网格渲染器\n\u003C!-- ```shell script\ncd FaceBoxes\nsh .\u002Fbuild_cpu_nms.sh\ncd ..\n\ncd Sim3DR\nsh .\u002Fbuild_sim3dr.sh\ncd ..\n\n# 更快的网格渲染器\ncd utils\u002Fasset\ngcc -shared -Wall -O3 render.c -o render.so -fPIC\ncd ..\u002F..\n```\n\n或者直接运行 -->\n```shell script\nsh .\u002Fbuild.sh\n```\n\n3. 运行演示\n\n```shell script\n# 1. 对静态图像进行处理，可选参数包括：2d_sparse、2d_dense、3d、depth、pncc、pose、uv_tex、ply、obj\npython3 demo.py -f examples\u002Finputs\u002Femma.jpg --onnx # -o [2d_sparse、2d_dense、3d、depth、pncc、pose、uv_tex、ply、obj]\n\n# 2. 对视频进行处理\npython3 demo_video.py -f examples\u002Finputs\u002Fvideos\u002F214.avi --onnx\n\n# 3. 通过提前查看 `n_next` 帧平滑处理视频\npython3 demo_video_smooth.py -f examples\u002Finputs\u002Fvideos\u002F214.avi --onnx\n\n# 4. 使用摄像头实时处理\npython3 demo_webcam_smooth.py --onnx\n```\n\n跟踪的实现方式很简单，即通过对齐完成。如果头部姿态超过 90° 或运动过快，对齐可能会失败。我们使用一个阈值来粗略判断跟踪状态，但这种方法并不稳定。\n\n您可以参考 [demo.ipynb](.\u002Fdemo.ipynb) 或 [google colab](https:\u002F\u002Fcolab.research.google.com\u002Fdrive\u002F1OKciI0ETCpWdRjP-VOGpBulDJojYfgWv) 获取关于如何处理静态图像的分步教程。\n\n例如，运行 `python3 demo.py -f examples\u002Finputs\u002Femma.jpg -o 3d` 将得到如下结果：\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcleardusk_3DDFA_V2_readme_5cfd1279d001.jpg\" alt=\"demo\" width=\"640px\">\n\u003C\u002Fp>\n\n另一个例子：\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcleardusk_3DDFA_V2_readme_9d7dc6e9f897.jpg\" alt=\"demo\" width=\"640px\">\n\u003C\u002Fp>\n\n对视频进行处理的结果如下：\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcleardusk_3DDFA_V2_readme_8f88e33884c0.gif\" alt=\"demo\" width=\"512px\">\n\u003C\u002Fp>\n\n更多结果或演示请参阅：[Hathaway](https:\u002F\u002Fguojianzhu.com\u002Fassets\u002Fvideos\u002Fhathaway_3ddfa_v2.mp4)。\n\n\u003C!-- 显然，眼睛部分的效果还不够好。 -->\n\n### 功能（截至目前）\n\n\n\u003Ctable>\n  \u003Ctr>\n    \u003Cth>2D稀疏点\u003C\u002Fth>\n    
\u003Cth>2D密集点\u003C\u002Fth>\n    \u003Cth>3D\u003C\u002Fth>\n  \u003C\u002Ftr>\n\n  \u003Ctr>\n    \u003Ctd>\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcleardusk_3DDFA_V2_readme_2b1e45dfe9d7.jpg\" width=\"360\" alt=\"2D稀疏点\">\u003C\u002Ftd>\n    \u003Ctd>\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcleardusk_3DDFA_V2_readme_9a5cc3a4ebf8.jpg\"  width=\"360\" alt=\"2D密集点\">\u003C\u002Ftd>\n    \u003Ctd>\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcleardusk_3DDFA_V2_readme_48f3e8a93e46.jpg\"        width=\"360\" alt=\"3D\">\u003C\u002Ftd>\n  \u003C\u002Ftr>\n\n  \u003Ctr>\n    \u003Cth>深度\u003C\u002Fth>\n    \u003Cth>PNCC\u003C\u002Fth>\n    \u003Cth>UV纹理\u003C\u002Fth>\n  \u003C\u002Ftr>\n\n  \u003Ctr>\n    \u003Ctd>\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcleardusk_3DDFA_V2_readme_ce6c0523fb57.jpg\"     width=\"360\" alt=\"深度\">\u003C\u002Ftd>\n    \u003Ctd>\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcleardusk_3DDFA_V2_readme_13e924297835.jpg\"      width=\"360\" alt=\"PNCC\">\u003C\u002Ftd>\n    \u003Ctd>\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcleardusk_3DDFA_V2_readme_948b1fa279c0.jpg\"    width=\"360\" alt=\"UV纹理\">\u003C\u002Ftd>\n  \u003C\u002Ftr>\n\n  \u003Ctr>\n    \u003Cth>姿态\u003C\u002Fth>\n    \u003Cth>导出为.ply\u003C\u002Fth>\n    \u003Cth>导出为.obj\u003C\u002Fth>\n  \u003C\u002Ftr>\n\n  \u003Ctr>\n    \u003Ctd>\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcleardusk_3DDFA_V2_readme_e025c41d2328.jpg\"      width=\"360\" alt=\"姿态\">\u003C\u002Ftd>\n    \u003Ctd>\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcleardusk_3DDFA_V2_readme_9ae14d02c517.jpg\"                     width=\"360\" alt=\"导出为.ply\">\u003C\u002Ftd>\n    \u003Ctd>\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcleardusk_3DDFA_V2_readme_383bdf492eae.jpg\"              
       width=\"360\" alt=\"导出为.obj\">\u003C\u002Ftd>\n  \u003C\u002Ftr>\n\n\u003C\u002Ftable>\n\n### 配置\n\n默认的主干网络是 MobileNet_V1，输入尺寸为 120x120，预训练权重为 `weights\u002Fmb1_120x120.pth`，配置文件位于 [configs\u002Fmb1_120x120.yml](configs\u002Fmb1_120x120.yml)。本仓库还提供了另一个配置文件 [configs\u002Fmb05_120x120.yml](configs\u002Fmb05_120x120.yml)，其宽度因子为 0.5，模型更小、速度更快。您可以通过 `-c` 或 `--config` 选项指定配置文件。已发布的模型如下表所示。请注意，论文中 CPU 上的推理时间是使用 TensorFlow 评估的。\n\n| 模型         | 输入       | 参数量   | MACs    | 推理时间 (TF) |\n| ------------ | ---------- | -------- | ------- | ------------- |\n| MobileNet    | 120x120    | 3.27M    | 183.5M  | ~6.2ms        |\n| MobileNet x0.5 | 120x120    | 0.85M    | 49.5M   | ~2.9ms        |\n\n**令人惊讶的是**，[onnxruntime](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fonnxruntime) 的延迟要小得多。以下是不同线程数下 CPU 上的推理时间。测试在 MacBook Pro（i5-8259U，2.30GHz）上进行，使用的 onnxruntime 版本为 1.5.1。线程数通过 `os.environ[\"OMP_NUM_THREADS\"]` 设置，更多细节请参见 [speed_cpu.py](.\u002Fspeed_cpu.py)。\n\n| 模型         | THREAD=1   | THREAD=2 | THREAD=4 |\n| ------------ | ---------- | -------- | -------- |\n| MobileNet    | 4.4ms      | 2.25ms   | 1.35ms   |\n| MobileNet x0.5 | 1.37ms     | 0.7ms    | 0.5ms    |\n\n### 延迟\n\n使用 `onnx` 选项可以显著降低整体 **CPU** 延迟，但人脸检测仍然占据了大部分延迟时间，例如对一张 720p 图像的检测需要约 15ms。3DMM 参数回归每张人脸大约需要 1~2ms，而密集重建（超过 3 万个点，即 38,365 个点）每张人脸大约需要 1ms。对于跟踪应用来说，快速的3DMM回归速度会很有帮助，因为并非每一帧都需要进行检测。延迟测试是在我的 13 英寸 MacBook Pro（i5-8259U，2.30GHz）上进行的。\n\n默认的 `OMP_NUM_THREADS` 被设置为 4，您可以通过设置 `os.environ['OMP_NUM_THREADS'] = '$NUM'` 或在运行 Python 脚本前插入 `export OMP_NUM_THREADS=$NUM` 来指定线程数。\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcleardusk_3DDFA_V2_readme_360da3d56223.gif\" alt=\"demo\" width=\"640px\">\n\u003C\u002Fp>\n\n## 常见问题解答\n\n1. 
训练数据是什么？\n\n    我们使用 [300W-LP](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F0B7OEHD3T4eCkVGs0TkhUWFN6N1k\u002Fview?usp=sharing) 进行训练。更多关于训练的细节，请参阅我们的 [论文](https:\u002F\u002Fguojianzhu.com\u002Fassets\u002Fpdfs\u002F3162.pdf)。由于训练数据 300W-LP 中闭眼图像较少，因此在闭眼时眼睛的关键点定位不够准确。网络摄像头演示中的眼部效果也不太理想。\n\n2. 在 Windows 上运行。\n\n    您可以参考 [此评论](https:\u002F\u002Fgithub.com\u002Fcleardusk\u002F3DDFA_V2\u002Fissues\u002F12#issuecomment-697479173) 来了解如何在 Windows 上构建 NMS。\n\n## 致谢\n\n* FaceBoxes 模块修改自 [FaceBoxes.PyTorch](https:\u002F\u002Fgithub.com\u002Fzisianw\u002FFaceBoxes.PyTorch)。\n* 之前关于 3D 密集人脸对齐或重建的工作列表：[3DDFA](https:\u002F\u002Fgithub.com\u002Fcleardusk\u002F3DDFA)、[face3d](https:\u002F\u002Fgithub.com\u002FYadiraF\u002Fface3d)、[PRNet](https:\u002F\u002Fgithub.com\u002FYadiraF\u002FPRNet)。\n* 感谢 [AK391](https:\u002F\u002Fgithub.com\u002FAK391) 托管 Gradio Web 应用。\n\n## 其他实现或应用\n\n* [Dense-Head-Pose-Estimation](https:\u002F\u002Fgithub.com\u002F1996scarlet\u002FDense-Head-Pose-Estimation)：基于 Tensorflow Lite 的框架，用于人脸网格、头部姿态、关键点等。\n* [HeadPoseEstimate](https:\u002F\u002Fgithub.com\u002Fbubingy\u002FHeadPoseEstimate)：基于 3D 人脸关键点的头部姿态估计系统。\n* [img2pose](https:\u002F\u002Fgithub.com\u002Fvitoralbiero\u002Fimg2pose)：借用了本仓库中 Sim3DR 的渲染器实现。\n\n## 引用\n\n如果您在工作或研究中受益于本仓库，请引用以下两篇文献，并为本仓库点赞 : )。\n\n    @inproceedings{guo2020towards,\n        title =        {Towards Fast, Accurate and Stable 3D Dense Face Alignment},\n        author =       {Guo, Jianzhu and Zhu, Xiangyu and Yang, Yang and Yang, Fan and Lei, Zhen and Li, Stan Z},\n        booktitle =    {Proceedings of the European Conference on Computer Vision (ECCV)},\n        year =         {2020}\n    }\n\n    @misc{3ddfa_cleardusk,\n        author =       {Guo, Jianzhu and Zhu, Xiangyu and Lei, Zhen},\n        title =        {3DDFA},\n        howpublished = {\\url{https:\u002F\u002Fgithub.com\u002Fcleardusk\u002F3DDFA}},\n        year =         {2018}\n    }\n\n## 联系方式\n**郭建珠** [[主页](https:\u002F\u002Fguojianzhu.com), [Google 
学术](https:\u002F\u002Fscholar.google.com\u002Fcitations?user=W8_JzNcAAAAJ&hl=en&oi=ao)]：**guojianzhu1994@foxmail.com** 或 **guojianzhu1994@gmail.com** 或 **jianzhu.guo@nlpr.ia.ac.cn**（此邮箱即将失效）。","# 3DDFA_V2 快速上手指南\n\n3DDFA_V2 是一个快速、准确且稳定的 3D 密集人脸对齐工具，支持从单张图像或视频中重建高密度 3D 人脸网格、估计姿态及生成纹理映射。\n\n## 环境准备\n\n### 系统要求\n- **操作系统**：macOS 或 Linux（Windows 用户需参考构建脚本自行配置编译环境）\n- **Python 版本**：Python 3.x\n\n### 前置依赖\n主要依赖包括 PyTorch, numpy, opencv-python, onnxruntime 等。\n建议在安装前确保系统已安装基础编译工具（如 `gcc`, `make`）。\n\n若使用 `--onnx` 加速标志运行演示，macOS 用户需预先安装 `libomp`：\n```bash\nbrew install libomp\n```\n\n其他依赖可通过以下命令安装：\n```bash\npip install -r requirements.txt\n```\n> **提示**：国内用户可使用清华或阿里镜像源加速安装，例如：\n> `pip install -r requirements.txt -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple`\n\n## 安装步骤\n\n1. **克隆仓库**\n   ```bash\n   git clone https:\u002F\u002Fgithub.com\u002Fcleardusk\u002F3DDFA_V2.git\n   cd 3DDFA_V2\n   ```\n\n2. **构建核心模块**\n   项目包含 NMS、Sim3DR 及加速网格渲染的 C++\u002FCython 模块，需编译后使用。执行一键构建脚本：\n   ```bash\n   sh .\u002Fbuild.sh\n   ```\n   *注：若自动脚本失败，可参照 README 手动进入 `FaceBoxes`, `Sim3DR`, `utils\u002Fasset` 目录分别执行对应的 build 脚本。*\n\n3. 
**下载预训练模型**\n   确保 `weights\u002F` 目录下存在预训练权重文件（如 `mb1_120x120.pth`）。若仓库未自动包含，请根据项目说明从发布页或 Google Drive 下载并放入对应目录。\n\n## 基本使用\n\n以下示例展示如何对单张图片进行 3D 人脸重建。推荐使用 `--onnx` 参数以启用 ONNX Runtime 加速，显著降低 CPU 推理延迟。\n\n### 单张图片推理\n运行以下命令处理示例图片 `emma.jpg`，输出 3D 人脸结果：\n```bash\npython3 demo.py -f examples\u002Finputs\u002Femma.jpg --onnx -o 3d\n```\n\n**参数说明：**\n- `-f`: 输入文件路径（图片或视频）。\n- `--onnx`: 启用 ONNX 加速（推荐）。\n- `-o`: 输出模式，可选值包括：\n  - `2d_sparse`: 2D 稀疏关键点\n  - `2d_dense`: 2D 密集关键点\n  - `3d`: 3D 人脸网格（默认可视化）\n  - `depth`: 深度图\n  - `pncc`: 投影归一化坐标码\n  - `pose`: 头部姿态估计\n  - `uv_tex`: UV 纹理映射\n  - `ply` \u002F `obj`: 导出 3D 模型文件\n\n### 其他场景\n- **视频处理**：\n  ```bash\n  python3 demo_video.py -f examples\u002Finputs\u002Fvideos\u002F214.avi --onnx\n  ```\n- **摄像头实时演示**：\n  ```bash\n  python3 demo_webcam_smooth.py --onnx\n  ```\n\n运行成功后，结果图像将保存至输出目录，终端会显示处理耗时及详细信息。","某影视特效团队正在为一款低成本网络剧制作虚拟角色，需要将演员的面部表情实时映射到 3D 数字人模型上。\n\n### 没有 3DDFA_V2 时\n- **重建精度不足**：传统算法仅能捕捉稀疏的关键点，导致生成的 3D 面部网格在脸颊和额头区域细节丢失，表情显得僵硬且不自然。\n- **运行速度缓慢**：旧方案依赖 CPU 密集计算，单帧处理耗时超过 15 毫秒，无法在普通开发机上实现流畅的实时预览，严重拖慢调试进度。\n- **稳定性差**：当演员快速转头或光线变化时，跟踪极易丢失或产生剧烈抖动，后期需要人工逐帧修复，增加了大量重复劳动。\n- **部署门槛高**：缺乏高效的推理加速支持，难以将算法轻量化部署到边缘设备或集成到现有的实时渲染管线中。\n\n### 使用 3DDFA_V2 后\n- **高密度对齐**：3DDFA_V2 利用稠密人脸对齐技术，精准还原了包括细微皱纹在内的面部几何细节，数字人表情细腻逼真。\n- **极速推理性能**：借助 ONNX Runtime 加速，3DDFA_V2 将 3DMM 参数回归延迟压缩至约 1.35 毫秒（CPU 单图），实现了丝滑的实时驱动效果。\n- **鲁棒性强**：内置的 FaceBoxes 检测器与优化算法确保了在大角度姿态和复杂光照下依然保持稳定跟踪，彻底消除了画面抖动。\n- **灵活集成**：支持导出 .obj\u002F.ply 格式及 UV 纹理映射，3DDFA_V2 可无缝对接主流渲染引擎，大幅降低了从采集到合成的工程难度。\n\n3DDFA_V2 通过兼顾高精度与毫秒级低延迟，让中小团队也能在普通硬件上轻松实现电影级的实时面部捕捉与驱动。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcleardusk_3DDFA_V2_5cfd1279.jpg","cleardusk","Jianzhu Guo","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002Fcleardusk_65351b89.jpg","Focus on GenAI","Previously, @KlingTeam, Kuaishou \u003C\u003C 
PhD@CASIA",null,"guojianzhu1994@gmail.com","https:\u002F\u002Fguojianzhu.com","https:\u002F\u002Fgithub.com\u002Fcleardusk",[24,28,32,36,40,44,48],{"name":25,"color":26,"percentage":27},"Python","#3572A5",69.8,{"name":29,"color":30,"percentage":31},"C++","#f34b7d",16.4,{"name":33,"color":34,"percentage":35},"Cython","#fedf5b",6.8,{"name":37,"color":38,"percentage":39},"C","#555555",4.2,{"name":41,"color":42,"percentage":43},"Jupyter Notebook","#DA5B0B",2.4,{"name":45,"color":46,"percentage":47},"CMake","#DA3434",0.2,{"name":49,"color":50,"percentage":47},"Shell","#89e051",3123,550,"2026-04-10T09:38:14","MIT",4,"Linux, macOS","未说明 (主要基于 CPU 优化，支持 ONNX Runtime 加速)","未说明",{"notes":60,"python":61,"dependencies":62},"Windows 用户需参考 FQA 部分解决编译问题；若在 macOS 上使用 --onnx 标志加速，需先安装 libomp (brew install libomp)；项目包含需要编译的 C++\u002FCython 模块 (NMS, Sim3DR, render)，可通过运行 build.sh 脚本构建；默认骨干网络为 MobileNet_V1，也提供更小的 MobileNet x0.5 配置。","Python 3",[63,64,65,66,33],"PyTorch","numpy","opencv-python","onnxruntime",[68,69,70],"开发框架","图像","其他",[72,73,74,75,76,77,78,79,80,81,82,83,84],"eccv","3d-face-alignment","pytorch","face-alignment","3d-face","3dmm","alignment","3d","computer-vision","onnx","3d-landmarks","single-image-reconstruction","eccv-2020",2,"ready","2026-03-27T02:49:30.150509","2026-04-13T16:33:44.272011",[90,95,100,105,110,115,120,125],{"id":91,"question_zh":92,"answer_zh":93,"source_url":94},31699,"如何解决 'ModuleNotFoundError: No module named FaceBoxes.utils.nms.cpu_nms' 错误？","需要在 FaceBoxes\u002Futils\u002Fnms 目录下手动编译 cpu_nms 模块。具体步骤如下：\n1. 在该目录创建 setup.py 文件，内容如下：\n```python\nfrom setuptools import setup, Extension\nfrom Cython.Build import cythonize\nimport numpy as np\n\nextensions = [\n    Extension(\n        \"cpu_nms\",\n        sources=[\"cpu_nms.pyx\", \"cpu_nms.c\"],\n        include_dirs=[np.get_include()],\n    )\n]\n\nsetup(\n    name=\"cpu_nms\",\n    ext_modules=cythonize(extensions)\n)\n```\n2. 
运行命令进行编译：`python setup.py build_ext --inplace`\n或者直接进入对应目录运行构建脚本（如 `bash xxx.sh`）。","https:\u002F\u002Fgithub.com\u002Fcleardusk\u002F3DDFA_V2\u002Fissues\u002F25",{"id":96,"question_zh":97,"answer_zh":98,"source_url":99},31700,"运行 demo.py 时遇到 scipy.io.loadmat 解压错误 (zlib.error: inconsistent stream state) 怎么办？","这通常是由于导入顺序或文件损坏导致的。尝试以下解决方案：\n1. 调整 demo.py 和 uv.py 中的包导入顺序。\n2. 确保相关的 .mat 文件下载完整且未损坏。\n有用户反馈调整导入顺序后问题解决。","https:\u002F\u002Fgithub.com\u002Fcleardusk\u002F3DDFA_V2\u002Fissues\u002F44",{"id":101,"question_zh":102,"answer_zh":103,"source_url":104},31701,"如何获取包含颈部和耳朵区域的 3DDFA_V2 模型？","默认的 3DDFA_V2 模型不包含耳朵和颈部区域。维护者提供了一个包含颈部区域的版本，可以通过百度网盘下载（链接通常在相关 Issue 评论中提供，如 issue #10 评论所示）。注意：原链接可能需要中国大陆手机号注册，非大陆用户需寻找替代下载源或联系维护者获取非百度网盘链接。","https:\u002F\u002Fgithub.com\u002Fcleardusk\u002F3DDFA_V2\u002Fissues\u002F10",{"id":106,"question_zh":107,"answer_zh":108,"source_url":109},31702,"Sim3DR 是否支持可微渲染 (differentiable render)？如果不支持，有什么替代方案？","Sim3DR 不支持可微渲染。维护者推荐使用以下替代库：\n1. PyTorch3D (https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002Fpytorch3d)\n2. 
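Kaolin (https:\u002F\u002Fgithub.com\u002FNVIDIAGameWorks\u002Fkaolin)\nThese libraries support generating normal maps from 3D shapes as well as differentiable rendering.","https:\u002F\u002Fgithub.com\u002Fcleardusk\u002F3DDFA_V2\u002Fissues\u002F16",{"id":111,"question_zh":112,"answer_zh":113,"source_url":114},31703,"How were the hyperparameters in `parse_roi_box_from_bbox` (e.g. 0.14, 1.58) chosen? Is this cropping strategy mandatory?","These hyperparameters are empirical values used to adjust the face detection box (the detector tends to cover the upper part of the face, so the bounding box is shifted downward). This step is not mandatory; it is simply one cropping strategy. The best practice is to keep the cropping strategy used in training consistent with the one used in evaluation (inference) for the best results.","https:\u002F\u002Fgithub.com\u002Fcleardusk\u002F3DDFA_V2\u002Fissues\u002F6",{"id":116,"question_zh":117,"answer_zh":118,"source_url":119},31704,"Will the training code be released?","As of the current issue discussions, the maintainer has not officially released the complete training code. Many users have asked in the issues for the training code and for training instructions covering different architectures (such as MobileNet) and input sizes. Watch the repository for updates, or check whether an unofficial training implementation has been shared.","https:\u002F\u002Fgithub.com\u002Fcleardusk\u002F3DDFA_V2\u002Fissues\u002F1",{"id":121,"question_zh":122,"answer_zh":123,"source_url":124},31705,"Are pretrained weights for a ResNet backbone provided?","Currently the weights provided are mainly for the default backbone. Regarding ResNet weights (such as ResNet-22), the maintainer has said that users with sufficiently diverse data (for example, lip data) can try training the model themselves. Users in the community have asked for .pkl weight files in PyTorch format, but there is no clear official release plan.","https:\u002F\u002Fgithub.com\u002Fcleardusk\u002F3DDFA_V2\u002Fissues\u002F19",{"id":126,"question_zh":127,"answer_zh":128,"source_url":129},31706,"How can I retrain with the ground truth (GT) of the 300W-LP dataset, and how should its overly large Shape_Para values be handled?","The 300W-LP GT data cannot be used directly with this project because its parameter scale (e.g. Shape_Para) does not match what the project expects. It must be preprocessed and converted to fit the project's parameter space (for example, 199-dimensional shape parameters and 29-dimensional expression parameters). Visualizing the raw GT directly produces incorrect results; adapt it by following the parameter parsing logic in the project.","https:\u002F\u002Fgithub.com\u002Fcleardusk\u002F3DDFA_V2\u002Fissues\u002F134",[131,136,141],{"id":132,"version":133,"summary_zh":134,"released_at":135},238894,"v0.12","* Some fixes and updates","2020-11-17T10:58:29",{"id":137,"version":138,"summary_zh":139,"released_at":140},238895,"v0.11","New features:\n\n* PNCC\n* UV texture\n* Refactored documentation and code","2020-09-19T16:46:59",{"id":142,"version":143,"summary_zh":144,"released_at":145},238896,"v0.1","The initial release of 3DDFA_V2 supports:\n* Sparse landmarks\n* Dense landmarks\n* Mesh rendering\n* Depth map rendering\n* 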
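Video tracking","2020-08-31T15:55:20",[147,158,166,175,183,192],{"id":148,"name":149,"github_repo":150,"description_zh":151,"stars":152,"difficulty_score":153,"last_commit_at":154,"category_tags":155,"status":86},4358,"openclaw","openclaw\u002Fopenclaw","OpenClaw is a local AI assistant built for individuals, designed to give you a fully controllable intelligent companion on your own devices. It breaks the limitation of traditional AI assistants being confined to a particular web page or app, and can connect directly to the messaging channels you use every day, including dozens of platforms such as WeChat, WhatsApp, Telegram, Discord, and iMessage. Whichever chat app you send a message from, OpenClaw responds immediately; it also supports voice interaction on macOS, iOS, and Android devices and provides real-time canvas rendering that you can control.\n\nThe tool mainly addresses users' needs for data privacy, response speed, and an “always-on” experience. By deploying the AI locally, users can enjoy fast, private assistance without relying on cloud services, truly achieving “your data, your rules”. Its distinctive technical highlight is a powerful gateway architecture that separates the control plane from the core assistant, ensuring smooth and extensible cross-platform communication.\n\nOpenClaw is well suited to tech enthusiasts and developers who want to build personalized workflows, as well as ordinary users who care about privacy and do not want to be tied to a single ecosystem. With basic terminal skills (macOS, Linux, and Windows WSL2 are supported), deployment can be completed through a simple guided command-line setup. If you are eager to have one that understands you",349277,3,"2026-04-06T06:32:30",[156,68,69,157],"Agent","Data tools",{"id":159,"name":160,"github_repo":161,"description_zh":162,"stars":163,"difficulty_score":153,"last_commit_at":164,"category_tags":165,"status":86},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui is a web interface built on Gradio that lets users easily run and use the powerful Stable Diffusion image generation model locally. It addresses the pain points of the original model, which relies on the command line, has a high barrier to entry, and has scattered functionality, by integrating the complex AI image workflow into an intuitive graphical platform.\n\nWhether you are a casual creator who wants to get started quickly, a designer who needs fine control over image details, or a developer or researcher who wants to explore the model's potential, you can benefit from it. Its core strength is its breadth of features: it supports basic modes such as text-to-image, image-to-image, inpainting, and outpainting, and also introduces advanced features such as attention adjustment, prompt matrix, negative prompts, and “highres fix”. In addition, it includes face restoration tools such as GFPGAN and CodeFormer, supports several neural network upscaling algorithms, and lets users extend its capabilities through a plugin system. Even for devices with limited VRAM, stable-diffusion-webui provides corresponding optimization options, putting high-quality AI art creation within reach.",162132,"2026-04-05T11:01:52",[68,69,156],{"id":167,"name":168,"github_repo":169,"description_zh":170,"stars":171,"difficulty_score":85,"last_commit_at":172,"category_tags":173,"status":86},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code is a high-performance optimization system built for AI coding assistants (such as Claude Code, Codex, and Cursor). It is more than a set of configuration files: it is a complete framework refined through long-term practical use, aimed at the core pain points AI agents face in real development, such as low efficiency, memory loss, security risks, and the lack of continuous learning.\n\nBy introducing modular skills, enhanced instincts, memory persistence, and built-in security scanning, everything-claude-code can significantly improve AI 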
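performance on complex tasks, helping developers build more stable and more capable production-grade AI agents. Its distinctive “research-first” development philosophy and optimization strategies for token consumption make model responses faster and cheaper while effectively defending against potential attack vectors.\n\nThis toolkit is especially suitable for software developers, AI researchers, and technical teams that want to deeply customize their AI workflows. Whether you are building a large codebase or need AI to assist with security audits and automated testing, everything-claude-code provides strong underlying support. As an open-source project that won an Anthropic hackathon award, it combines multi-language support with a rich set of practical hooks, letting AI truly grow into one that understands",152630,"2026-04-12T23:33:54",[68,156,174],"Language models",{"id":176,"name":177,"github_repo":178,"description_zh":179,"stars":180,"difficulty_score":85,"last_commit_at":181,"category_tags":182,"status":86},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI is a powerful, highly modular visual AI engine built for designing and executing complex Stable Diffusion image generation pipelines. It sets aside the traditional code-writing approach in favor of an intuitive node-based flowchart interface, letting users build personalized generation pipelines by connecting functional modules.\n\nThis design neatly addresses the pain points of advanced AI image workflows: complex configuration and limited flexibility. Without a programming background, users can freely combine models, adjust parameters, and preview results in real time, handling anything from basic text-to-image to multi-step high-resolution upscaling. ComfyUI has excellent compatibility: it supports Windows, macOS, and Linux, works with a wide range of hardware including NVIDIA, AMD, Intel, and Apple Silicon, and was early to support cutting-edge models such as SDXL, Flux, and SD3.\n\nWhether you are a researcher or developer who wants to explore the potential of the algorithms, or a designer or experienced AI art enthusiast seeking maximum creative freedom, ComfyUI offers strong support. Its modular architecture allows the community to keep adding new features, making it one of the most flexible open-source diffusion model tools with one of the richest ecosystems, helping users turn ideas into reality efficiently.",108322,"2026-04-10T11:39:34",[68,69,156],{"id":184,"name":185,"github_repo":186,"description_zh":187,"stars":188,"difficulty_score":85,"last_commit_at":189,"category_tags":190,"status":86},6121,"gemini-cli","google-gemini\u002Fgemini-cli","gemini-cli is an open-source AI command-line tool from Google that brings the capabilities of the Gemini models directly into the user's terminal. For developers used to working on the command line, it offers the shortest path from entering a prompt to getting a model response, with intelligent assistance available without switching windows.\n\nThe tool mainly addresses the pain point of frequent context switching during development, letting users handle code understanding, generation, debugging, and automated operations tasks directly in a familiar terminal. Whether querying a large codebase, generating an app from a sketch, or performing complex Git operations, gemini-cli can handle it efficiently through natural-language instructions.\n\nIt is especially suitable for software engineers, DevOps staff, and technical researchers. Its core highlights include support for a context window of up to 1 million tokens and strong logical reasoning; built-in tools such as Google Search, file operations, and shell command execution; and, more distinctively, support for MCP (Model Context Protocol), which lets users flexibly add custom integrations and connect external capabilities such as image generation. In addition, a personal Google account comes with a free usage quota, and the project is fully open source under the Apache 2.0 license, making it an ideal assistant for improving terminal productivity.",100752,"2026-04-10T01:20:03",[191,156,69,68],"Plugins",{"id":193,"name":194,"github_repo":195,"description_zh":196,"stars":197,"difficulty_score":85,"last_commit_at":198,"category_tags":199,"status":86},4721,"markitdown","microsoft\u002Fmarkitdown","MarkItDown, built by Microsoft's AutoGen team, is a lightweight Python 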
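tool designed to efficiently convert various kinds of files to Markdown. It supports parsing many formats, including PDF, Word, Excel, PPT, images (with OCR), audio (with speech transcription), HTML, and even YouTube links, and can accurately extract key structural information such as headings, lists, tables, and links.\n\nAs AI applications become increasingly widespread, large language models (LLMs) are good at processing text but struggle to read complex binary office documents directly. MarkItDown addresses exactly this pain point: it converts unstructured or semi-structured files into Markdown, a format that models “natively understand” and that is highly token-efficient, making it an ideal bridge between local files and AI analysis pipelines. It also provides an MCP (Model Context Protocol) server that integrates seamlessly into LLM applications such as Claude Desktop.\n\nThis tool is especially suitable for developers, data scientists, and AI researchers, particularly those who need to build document retrieval-augmented generation (RAG) systems, run batch text analysis, or want an AI assistant to “read” local files directly. Although the generated content is also reasonably human-readable, its core advantage lies in giving machines",93400,"2026-04-06T19:52:38",[191,68]]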