[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-emilianavt--OpenSeeFace":3,"tool-emilianavt--OpenSeeFace":61},[4,18,26,36,44,53],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":17},4358,"openclaw","openclaw\u002Fopenclaw","OpenClaw 是一款专为个人打造的本地化 AI 助手，旨在让你在自己的设备上拥有完全可控的智能伙伴。它打破了传统 AI 助手局限于特定网页或应用的束缚，能够直接接入你日常使用的各类通讯渠道，包括微信、WhatsApp、Telegram、Discord、iMessage 等数十种平台。无论你在哪个聊天软件中发送消息，OpenClaw 都能即时响应，甚至支持在 macOS、iOS 和 Android 设备上进行语音交互，并提供实时的画布渲染功能供你操控。\n\n这款工具主要解决了用户对数据隐私、响应速度以及“始终在线”体验的需求。通过将 AI 部署在本地，用户无需依赖云端服务即可享受快速、私密的智能辅助，真正实现了“你的数据，你做主”。其独特的技术亮点在于强大的网关架构，将控制平面与核心助手分离，确保跨平台通信的流畅性与扩展性。\n\nOpenClaw 非常适合希望构建个性化工作流的技术爱好者、开发者，以及注重隐私保护且不愿被单一生态绑定的普通用户。只要具备基础的终端操作能力（支持 macOS、Linux 及 Windows WSL2），即可通过简单的命令行引导完成部署。如果你渴望拥有一个懂你",349277,3,"2026-04-06T06:32:30",[13,14,15,16],"Agent","开发框架","图像","数据工具","ready",{"id":19,"name":20,"github_repo":21,"description_zh":22,"stars":23,"difficulty_score":10,"last_commit_at":24,"category_tags":25,"status":17},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,"2026-04-05T11:01:52",[14,15,13],{"id":27,"name":28,"github_repo":29,"description_zh":30,"stars":31,"difficulty_score":32,"last_commit_at":33,"category_tags":34,"status":17},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",150037,2,"2026-04-10T23:33:47",[14,13,35],"语言模型",{"id":37,"name":38,"github_repo":39,"description_zh":40,"stars":41,"difficulty_score":32,"last_commit_at":42,"category_tags":43,"status":17},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",108322,"2026-04-10T11:39:34",[14,15,13],{"id":45,"name":46,"github_repo":47,"description_zh":48,"stars":49,"difficulty_score":32,"last_commit_at":50,"category_tags":51,"status":17},6121,"gemini-cli","google-gemini\u002Fgemini-cli","gemini-cli 是一款由谷歌推出的开源 AI 命令行工具，它将强大的 Gemini 大模型能力直接集成到用户的终端环境中。对于习惯在命令行工作的开发者而言，它提供了一条从输入提示词到获取模型响应的最短路径，无需切换窗口即可享受智能辅助。\n\n这款工具主要解决了开发过程中频繁上下文切换的痛点，让用户能在熟悉的终端界面内直接完成代码理解、生成、调试以及自动化运维任务。无论是查询大型代码库、根据草图生成应用，还是执行复杂的 Git 操作，gemini-cli 都能通过自然语言指令高效处理。\n\n它特别适合广大软件工程师、DevOps 
人员及技术研究人员使用。其核心亮点包括支持高达 100 万 token 的超长上下文窗口，具备出色的逻辑推理能力；内置 Google 搜索、文件操作及 Shell 命令执行等实用工具；更独特的是，它支持 MCP（模型上下文协议），允许用户灵活扩展自定义集成，连接如图像生成等外部能力。此外，个人谷歌账号即可享受免费的额度支持，且项目基于 Apache 2.0 协议完全开源，是提升终端工作效率的理想助手。",100752,"2026-04-10T01:20:03",[52,13,15,14],"插件",{"id":54,"name":55,"github_repo":56,"description_zh":57,"stars":58,"difficulty_score":32,"last_commit_at":59,"category_tags":60,"status":17},4721,"markitdown","microsoft\u002Fmarkitdown","MarkItDown 是一款由微软 AutoGen 团队打造的轻量级 Python 工具，专为将各类文件高效转换为 Markdown 格式而设计。它支持 PDF、Word、Excel、PPT、图片（含 OCR）、音频（含语音转录）、HTML 乃至 YouTube 链接等多种格式的解析，能够精准提取文档中的标题、列表、表格和链接等关键结构信息。\n\n在人工智能应用日益普及的今天，大语言模型（LLM）虽擅长处理文本，却难以直接读取复杂的二进制办公文档。MarkItDown 恰好解决了这一痛点，它将非结构化或半结构化的文件转化为模型“原生理解”且 Token 效率极高的 Markdown 格式，成为连接本地文件与 AI 分析 pipeline 的理想桥梁。此外，它还提供了 MCP（模型上下文协议）服务器，可无缝集成到 Claude Desktop 等 LLM 应用中。\n\n这款工具特别适合开发者、数据科学家及 AI 研究人员使用，尤其是那些需要构建文档检索增强生成（RAG）系统、进行批量文本分析或希望让 AI 助手直接“阅读”本地文件的用户。虽然生成的内容也具备一定可读性，但其核心优势在于为机器处理而优化。",93400,"2026-04-06T19:52:38",[52,14],{"id":62,"github_repo":63,"name":64,"description_en":65,"description_zh":66,"ai_summary_zh":66,"readme_en":67,"readme_zh":68,"quickstart_zh":69,"use_case_zh":70,"hero_image_url":71,"owner_login":72,"owner_name":73,"owner_avatar_url":74,"owner_bio":75,"owner_company":75,"owner_location":75,"owner_email":75,"owner_twitter":75,"owner_website":76,"owner_url":77,"languages":78,"stars":103,"forks":104,"last_commit_at":105,"license":106,"difficulty_score":32,"env_os":107,"env_gpu":108,"env_ram":107,"env_deps":109,"category_tags":117,"github_topics":119,"view_count":139,"oss_zip_url":75,"oss_zip_packed_at":75,"status":17,"created_at":140,"updated_at":141,"faqs":142,"releases":172},681,"emilianavt\u002FOpenSeeFace","OpenSeeFace","Robust realtime face and facial landmark tracking on CPU with Unity integration","OpenSeeFace 是一款基于 CPU 的实时面部及关键点追踪库，旨在为虚拟形象驱动提供稳定的面部数据输入。它解决了在普通硬件环境下，如何低成本且高效地完成高精度面部动作捕捉的难题。即使在光线昏暗或背景嘈杂的场景中，OpenSeeFace 依然能保持较高的追踪稳定性，并能准确还原多种嘴型变化。\n\n这款库非常适合 Unity 游戏开发者、虚拟主播技术团队以及计算机视觉研究人员。事实上，许多知名软件如 VSeeFace 和 VTube Studio 的核心追踪功能都源自 OpenSeeFace。\n\n技术层面，OpenSeeFace 采用 MobileNetV3 模型并优化为 ONNX 格式，确保在 Windows 系统上能达到 30 至 60 帧的流畅运行。其独特的 UDP 数据传输设计，允许将追踪计算与图形渲染分离到不同设备，既提升了整体性能，也有效保护了用户的摄像头隐私。对于需要集成面部动捕功能的开发者而言，OpenSeeFace 是一个可靠且灵活的基础选择。","![OSF.png](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Femilianavt_OpenSeeFace_readme_b2a824e1e2ba.png)\n\n# Overview\n\n**Note**: This is a tracking library, **not** a stand-alone avatar puppeteering program. I'm also working on [VSeeFace](https:\u002F\u002Fwww.vseeface.icu\u002F), which allows animating [VRM](https:\u002F\u002Fvrm.dev\u002Fen\u002Fhow_to_make_vrm\u002F) and [VSFAvatar](https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=jhQ8DF87I5I) 3D models by using OpenSeeFace tracking. [VTube Studio](https:\u002F\u002Fdenchisoft.com\u002F) uses OpenSeeFace for webcam based tracking to animate Live2D models. A renderer for the Godot engine can be found [here](https:\u002F\u002Fgithub.com\u002Fvirtual-puppet-project\u002Fvpuppr).\n\nThis project implements a facial landmark detection model based on MobileNetV3.\n\nAs Pytorch 1.3 CPU inference speed on Windows is very low, the model was converted to ONNX format. Using [onnxruntime](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fonnxruntime) it can run at 30 - 60 fps tracking a single face. There are five models, with different speed to tracking quality trade-offs.\n\nIf anyone is curious, the name is a silly pun on the open seas and seeing faces. There's no deeper meaning.
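\n\nAs a rough illustration of the onnxruntime based setup described above, the following minimal sketch loads one of the converted landmark models and runs a dummy inference. The file name `models\u002Flm_model3_opt.onnx` and the 1x3x224x224 input shape are assumptions for illustration; `tracker.py` contains the authoritative model loading and pre-processing code:\n\n    import numpy as np\n    import onnxruntime\n\n    # Model file name and input shape are assumptions for illustration;\n    # see tracker.py for the actual model handling and pre-processing.\n    session = onnxruntime.InferenceSession(\n        \"models\u002Flm_model3_opt.onnx\", providers=[\"CPUExecutionProvider\"])\n    input_name = session.get_inputs()[0].name\n\n    # A dummy 224x224 RGB face crop in NCHW layout.\n    face_crop = np.zeros((1, 3, 224, 224), dtype=np.float32)\n    outputs = session.run(None, {input_name: face_crop})\n    print([output.shape for output in outputs])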
\n\nAn up to date sample video can be found [here](https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=AaNap_ud_3I&vq=hd1080), showing the default tracking model's performance under different noise and light levels.\n\n# Tracking quality\n\nSince the landmarks used by OpenSeeFace are a bit different from those used by other approaches (they are close to iBUG 68, with two fewer points in the mouth corners and quasi-3D face contours instead of face contours that follow the visible outline) it is hard to numerically compare its accuracy to that of other approaches found commonly in scientific literature. The tracking performance is also more optimized for making landmarks that are useful for animating an avatar than for exactly fitting the face image. For example, as long as the eye landmarks show whether the eyes are opened or closed, even if their location is somewhat off, they can still be useful for this purpose.\n\nFrom general observation, OpenSeeFace performs well in adverse conditions (low light, high noise, low resolution) and keeps tracking faces through a very wide range of head poses with relatively high stability of landmark positions. Compared to MediaPipe, OpenSeeFace landmarks remain more stable in challenging conditions and it accurately represents a wider range of mouth poses. However, tracking of the eye region can be less accurate.\n\nI ran OpenSeeFace on a sample clip from the video presentation for [3D Face Reconstruction with Dense Landmarks](https:\u002F\u002Fmicrosoft.github.io\u002FDenseLandmarks\u002F) by Wood et al. to compare it to MediaPipe and their approach. You can watch the result [here](https:\u002F\u002Fwww.vseeface.icu\u002Fassets\u002Fmedia\u002FOSFMediaPipe3DFR.mp4).\n\n# Usage\n\nA sample Unity project for VRM based avatar animation can be found [here](https:\u002F\u002Fgithub.com\u002Femilianavt\u002FOpenSeeFaceSample).\n\nThe face tracking itself is done by the `facetracker.py` Python 3.7 script. It is a commandline program, so you should start it manually from cmd or write a batch file to start it. If you downloaded a release and are on Windows, you can run the `facetracker.exe` inside the `Binary` folder without having Python installed. You can also use the `run.bat` inside the `Binary` folder for a basic demonstration of the tracker.\n\nThe script will perform the tracking on webcam input or a video file and send the tracking data over UDP. This design also allows tracking to be done on a separate PC from the one that uses the tracking information. This can be useful to enhance performance and to avoid accidentally revealing camera footage.\n\nThe provided `OpenSee` Unity component can receive these UDP packets and provides the received information through a public field called `trackingData`. The `OpenSeeShowPoints` component can visualize the landmark points of a detected face. It also serves as an example. Please look at it to see how to properly make use of the `OpenSee` component. Further examples are included in the `Examples` folder. The UDP packets are received in a separate thread, so any components using the `trackingData` field of the `OpenSee` component should first copy the field and access this copy, because otherwise the information may get overwritten during processing. This design also means that the field will keep updating, even if the `OpenSee` component is disabled.
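\n\nIf you want to consume the tracking data outside of Unity, you can also read the UDP packets directly. The following is a minimal sketch that only inspects packet sizes; it assumes the tracker's default target of 127.0.0.1 on port 11573 (verify the actual defaults with `--help`), and the binary packet layout itself is defined in `facetracker.py` and parsed by the `OpenSee` component:\n\n    import socket\n\n    # Listen where the tracker sends its packets; 127.0.0.1:11573 is an\n    # assumed default, check `python facetracker.py --help` to verify.\n    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)\n    sock.bind((\"127.0.0.1\", 11573))\n\n    while True:\n        # One packet per tracked face per frame; see facetracker.py for\n        # the field layout mirrored by the OpenSee Unity component.\n        packet, address = sock.recvfrom(65535)\n        print(f\"{len(packet)} bytes of tracking data from {address}\")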
\n\nRun the python script with `--help` to learn about the possible options you can set.\n\n    python facetracker.py --help\n\nA simple demonstration can be achieved by creating a new scene in Unity, adding an empty game object and both the `OpenSee` and `OpenSeeShowPoints` components to it. While the scene is playing, run the face tracker on a video file:\n\n    python facetracker.py --visualize 3 --pnp-points 1 --max-threads 4 -c video.mp4\n\n__Note__: If dependencies were installed using [poetry](https:\u002F\u002Fpython-poetry.org\u002F), the commands have to be executed from a `poetry shell` or have to be prefixed with `poetry run`.\n\nThis way the tracking script will output its own tracking visualization while also demonstrating the transmission of tracking data to Unity.\n\nThe included `OpenSeeLauncher` component allows starting the face tracker program from Unity. It is designed to work with the pyinstaller-created executable distributed in the binary release bundles. It provides three public API functions:\n\n* `public string[] ListCameras()` returns the names of available cameras. The index of the camera in the array corresponds to its ID for the `cameraIndex` field. Setting the `cameraIndex` to `-1` will disable webcam capturing.\n* `public bool StartTracker()` will start the tracker. If it is already running, it will shut down the running instance and start a new one with the current settings.\n* `public void StopTracker()` will stop the tracker. The tracker is stopped automatically when the application is terminated or the `OpenSeeLauncher` object is destroyed.\n\nThe `OpenSeeLauncher` component uses WinAPI job objects to ensure that the tracker child process is terminated if the application crashes or closes without terminating the tracker process first.\n\nAdditional custom commandline arguments should be added one by one into the elements of the `commandlineArguments` array. For example `-v 1` should be added as two elements, one element containing `-v` and one containing `1`, not a single one containing both parts, as shown in the sketch below.
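\n\nOutside of Unity, the same convention can be sketched with Python's standard library. This is a minimal, hypothetical launcher reusing the exact flags from the demonstration command above; note how every flag and its value is its own list element:\n\n    import subprocess\n\n    # Each flag and value is a separate list element, matching the\n    # commandlineArguments convention (\"-v\", \"1\" rather than \"-v 1\").\n    tracker = subprocess.Popen([\n        \"python\", \"facetracker.py\",\n        \"--visualize\", \"3\",\n        \"--pnp-points\", \"1\",\n        \"--max-threads\", \"4\",\n        \"-c\", \"video.mp4\",\n    ])\n    try:\n        tracker.wait()\n    except KeyboardInterrupt:\n        # Make sure the tracker child process does not outlive this script.\n        tracker.terminate()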
\n\nThe included `OpenSeeIKTarget` component can be used in conjunction with FinalIK or other IK solutions to animate head motion.\n\n## Expression detection\n\nThe `OpenSeeExpression` component can be added to the same game object as the `OpenSee` component to detect specific facial expressions. It has to be calibrated on a per-user basis. It can be controlled either through the checkboxes in the Unity Editor or through the equivalent public methods that can be found in its source code.\n\nTo calibrate this system, you have to gather example data for each expression. If the capture process is going too fast, you can use the `recordingSkip` option to slow it down.\n\nThe general process is as follows:\n\n* Type in a name for the expression you want to calibrate.\n* Make the expression and hold it, then tick the recording box.\n* Keep holding the expression and move your head around and turn it in various directions.\n* After a short while, start talking while doing so if the expression should be compatible with talking.\n* After doing this for a while, untick the recording box and work on capturing another expression.\n* Tick the train box and see if the expressions you gathered data for are detected accurately.\n* You should also get some statistics in the lower part of the component.\n* If there are issues with any expression being detected, keep adding data to it.\n\nTo delete the captured data for an expression, type in its name and tick the \"Clear\" box.\n\nTo save both the trained model and the captured training data, type in a filename including its full path in the \"Filename\" field and tick the \"Save\" box. To load it, enter the filename and tick the \"Load\" box.\n\n### Hints\n\n* A reasonable number of expressions is six, including the neutral one.\n* Before starting to capture expressions, make some faces and wiggle your eyebrows around, to warm up the feature detection part of the tracker.\n* Once you have a detection model that works decently, take a moment when using it to check that all the expressions work as intended, and add a little more data if not.
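\n\nThe expression detection itself is built on LIBSVM (see the References section), but the calibrate-train-predict loop described above can be sketched with scikit-learn standing in for the actual classifier. The feature vectors below are random placeholders for the landmark-derived features gathered while recording:\n\n    import numpy as np\n    from sklearn.svm import SVC\n\n    # Random placeholders for the features gathered while \"recording\"\n    # each expression; the real component derives them from tracking data.\n    rng = np.random.default_rng(0)\n    expressions = [\"neutral\", \"smile\", \"surprised\"]\n    features = np.vstack(\n        [rng.normal(i, 0.5, size=(100, 16)) for i in range(len(expressions))])\n    labels = np.repeat(expressions, 100)\n\n    # \"Train\": fit a classifier on the captured examples.\n    classifier = SVC().fit(features, labels)\n\n    # \"Detect\": classify the features extracted from a new frame.\n    new_frame = rng.normal(1.0, 0.5, size=(1, 16))\n    print(classifier.predict(new_frame)[0])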
\n\n# General notes\n\n* The tracking seems to be quite robust even with partial occlusion of the face, glasses or bad lighting conditions.\n* The highest quality model is selected with `--model 3`, the fastest model with the lowest tracking quality is `--model 0`.\n* Lower tracking quality mainly means more rigid tracking, making it harder to detect blinking and eyebrow motion.\n* Depending on the frame rate, face tracking can easily use up a whole CPU core. At 30fps for a single face, it should still use less than 100% of one core on a decent CPU. If tracking uses too much CPU, try lowering the frame rate. A frame rate of 20 is probably fine and anything above 30 should rarely be necessary.\n* When setting the number of faces to track to a higher number than the number of faces actually in view, the face detection model will run every `--scan-every` frames. This can slow things down, so try to set `--faces` no higher than the actual number of faces you are tracking.\n\n# Models\n\nFive pretrained face landmark models are included. Using the `--model` switch, it is possible to select them for tracking. The given fps values are for running the model on a single face video on a single CPU core. Lowering the frame rate would reduce CPU usage by a corresponding degree.\n\n* Model **-1**: This model is for running on toasters, so it's a very very fast and very low accuracy model. (213fps without gaze tracking)\n* Model **0**: This is a very fast, low accuracy model. (68fps)\n* Model **1**: This is a slightly slower model with better accuracy. (59fps)\n* Model **2**: This is a slower model with good accuracy. (50fps)\n* Model **3** (default): This is the slowest and highest accuracy model. (44fps)\n\nFPS measurements are from running on one core of my CPU.\n\nPytorch weights for use with `model.py` can be found [here](https:\u002F\u002Fmega.nz\u002Ffile\u002FvvYXlYQT#h7FpEg4tmOCJNxjpsDEw0JomJIkVGKwrt4OUV0RNDDU). Some unoptimized ONNX models can be found [here](https:\u002F\u002Fgithub.com\u002Femilianavt\u002FOpenSeeFace\u002Fissues\u002F48).\n\n# Results\n\n## Landmarks\n\n![Results1.png](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Femilianavt_OpenSeeFace_readme_89fb1f23fd86.png)\n\n![Results2.png](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Femilianavt_OpenSeeFace_readme_51edddfaa10f.png)\n\nMore samples: [Results3.png](https:\u002F\u002Fraw.githubusercontent.com\u002Femilianavt\u002FOpenSeeFace\u002Fmaster\u002FImages\u002FResults3.png), [Results4.png](https:\u002F\u002Fraw.githubusercontent.com\u002Femilianavt\u002FOpenSeeFace\u002Fmaster\u002FImages\u002FResults4.png)\n\n## Face detection\n\nThe landmark model is quite robust with respect to the size and orientation of the faces, so the custom face detection model gets away with rougher bounding boxes than other approaches. It has a favorable speed to accuracy ratio for the purposes of this project.\n\n![EmiFace.png](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Femilianavt_OpenSeeFace_readme_12f0b180a4c7.png)\n\n# Release builds\n\nThe builds in the release section of this repository contain a `facetracker.exe` inside a `Binary` folder that was built using `pyinstaller` and contains all required dependencies.\n\nTo run it, at least the `models` folder has to be placed in the same folder as `facetracker.exe`. Placing it in a common parent folder should work too.\n\nWhen distributing it, you should also distribute the `Licenses` folder along with it to make sure you conform to requirements set forth by some of the third party libraries. Unused models can be removed from redistributed packages without issue.\n\nThe release builds contain a custom build of ONNX Runtime without telemetry.\n\n# Dependencies (Python 3.6 - 3.9)\n\n* ONNX Runtime\n* OpenCV\n* Pillow\n* Numpy\n\nThe required libraries can be installed using pip:\n\n    pip install onnxruntime opencv-python pillow numpy\n\nAlternatively, poetry can be used to install all dependencies for this project in a separate virtual env:\n\n    poetry install\n\n# References\n\n## Training dataset\n\nThe model was trained on a 66 point version of the [LS3D-W](https:\u002F\u002Fwww.adrianbulat.com\u002Fface-alignment) dataset.\n\n    @inproceedings{bulat2017far,\n      title={How far are we from solving the 2D \\& 3D Face Alignment problem? (and a dataset of 230,000 3D facial landmarks)},\n      author={Bulat, Adrian and Tzimiropoulos, Georgios},\n      booktitle={International Conference on Computer Vision},\n      year={2017}\n    }\n\nAdditional training has been done on the WFLW dataset after reducing it to 66 points and replacing the contour points and tip of the nose with points predicted by the model trained up to this point. 
This additional training is done to improve fitting to eyes and eyebrows.\n\n    @inproceedings{wayne2018lab,\n      author = {Wu, Wayne and Qian, Chen and Yang, Shuo and Wang, Quan and Cai, Yici and Zhou, Qiang},\n      title = {Look at Boundary: A Boundary-Aware Face Alignment Algorithm},\n      booktitle = {CVPR},\n      month = June,\n      year = {2018}\n    }\n\nFor training the gaze and blink detection model, the [MPIIGaze](https:\u002F\u002Fwww.mpi-inf.mpg.de\u002Fdepartments\u002Fcomputer-vision-and-machine-learning\u002Fresearch\u002Fgaze-based-human-computer-interaction\u002Fappearance-based-gaze-estimation-in-the-wild\u002F) dataset was used. Additionally, around 125,000 synthetic eyes generated with [UnityEyes](https:\u002F\u002Fwww.cl.cam.ac.uk\u002Fresearch\u002Frainbow\u002Fprojects\u002Funityeyes\u002F) were used during training.\n\nIt should be noted that additional custom data was also used during the training process and that the reference landmarks from the original datasets have been modified in certain ways to address various issues. It is likely not possible to reproduce these models with just the original LS3D-W and WFLW datasets; however, the additional data is not redistributable.\n\nThe heatmap regression based face detection model was trained on random 224x224 crops from the WIDER FACE dataset.\n\n    @inproceedings{yang2016wider,\n      Author = {Yang, Shuo and Luo, Ping and Loy, Chen Change and Tang, Xiaoou},\n      Booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},\n      Title = {WIDER FACE: A Face Detection Benchmark},\n      Year = {2016}\n    }\n\n## Algorithm\n\nThe algorithm is inspired by:\n\n* [Designing Neural Network Architectures for Different Applications: From Facial Landmark Tracking to Lane Departure Warning System](https:\u002F\u002Fwww.synopsys.com\u002Fdesignware-ip\u002Ftechnical-bulletin\u002Fulsee-designing-neural-network.html) by YiTa Wu, Vice President of Engineering, ULSee\n* [Real-time Human Pose Estimation in the Browser with TensorFlow.js](https:\u002F\u002Fblog.tensorflow.org\u002F2018\u002F05\u002Freal-time-human-pose-estimation-in.html)\n* [U-Net: Convolutional Networks for Biomedical Image Segmentation](https:\u002F\u002Flmb.informatik.uni-freiburg.de\u002Fpeople\u002Fronneber\u002Fu-net\u002F) by Olaf Ronneberger, Philipp Fischer, Thomas Brox\n* [MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications](https:\u002F\u002Farxiv.org\u002Fabs\u002F1704.04861) by Andrew G. Howard, Menglong Zhu, Bo Chen, Dmitry Kalenichenko, Weijun Wang, Tobias Weyand, Marco Andreetto, Hartwig Adam\n* [Searching for MobileNetV3](https:\u002F\u002Farxiv.org\u002Fabs\u002F1905.02244) by Andrew Howard, Mark Sandler, Grace Chu, Liang-Chieh Chen, Bo Chen, Mingxing Tan, Weijun Wang, Yukun Zhu, Ruoming Pang, Vijay Vasudevan, Quoc V. 
Le, Hartwig Adam\n\nThe MobileNetV3 code was taken from [here](https:\u002F\u002Fgithub.com\u002Frwightman\u002Fgen-efficientnet-pytorch).\n\nFor all training a modified version of [Adaptive Wing Loss](https:\u002F\u002Fgithub.com\u002Ftankrant\u002FAdaptive-Wing-Loss) was used.\n\n* [Adaptive Wing Loss for Robust Face Alignment via Heatmap Regression](https:\u002F\u002Farxiv.org\u002Fabs\u002F1904.07399) by Xinyao Wang, Liefeng Bo, Li Fuxin\n\nFor expression detection, [LIBSVM](https:\u002F\u002Fwww.csie.ntu.edu.tw\u002F~cjlin\u002Flibsvm\u002F) is used.\n\nFace detection is done using a custom heatmap regression based face detection model or RetinaFace.\n\n    @inproceedings{deng2019retinaface,\n      title={RetinaFace: Single-stage Dense Face Localisation in the Wild},\n      author={Deng, Jiankang and Guo, Jia and Yuxiang, Zhou and Jinke Yu and Irene Kotsia and Zafeiriou, Stefanos},\n      booktitle={arxiv},\n      year={2019}\n    }\n\nRetinaFace detection is based on [this](https:\u002F\u002Fgithub.com\u002Fbiubug6\u002FPytorch_Retinaface) implementation. The pretrained model was modified to remove unnecessary landmark detection and converted to ONNX format for a resolution of 640x640.\n\n# Thanks!\n\nMany thanks to everyone who helped me test things!\n\n* [@Virtual_Deat](https:\u002F\u002Ftwitter.com\u002FVirtual_Deat), who also inspired me to start working on this.\n* [@ENiwatori](https:\u002F\u002Ftwitter.com\u002Feniwatori) and family.\n* [@ArgamaWitch](https:\u002F\u002Ftwitter.com\u002FArgamaWitch)\n* [@AngelVayuu](https:\u002F\u002Ftwitter.com\u002FAngelVayuu)\n* [@DapperlyYours](https:\u002F\u002Ftwitter.com\u002FDapperlyYours)\n* [@comdost_art](https:\u002F\u002Ftwitter.com\u002Fcomdost_art)\n* [@Ponoki_Chan](https:\u002F\u002Ftwitter.com\u002FPonoki_Chan)\n\n# License\n\nThe code and models are distributed under the BSD 2-clause license. 
\n\nYou can find licenses of third party libraries used for binary builds in the `Licenses` folder.\n\n","![OSF.png](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Femilianavt_OpenSeeFace_readme_b2a824e1e2ba.png)\n\n# 概述\n\n**注意**：这是一个跟踪库（tracking library），**并非**独立的虚拟角色操控程序（stand-alone avatar puppeteering program）。我也正在开发 [VSeeFace](https:\u002F\u002Fwww.vseeface.icu\u002F)，它允许使用 OpenSeeFace 跟踪来动画化 [VRM](https:\u002F\u002Fvrm.dev\u002Fen\u002Fhow_to_make_vrm\u002F) 和 [VSFAvatar](https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=jhQ8DF87I5I) 3D 模型。[VTube Studio](https:\u002F\u002Fdenchisoft.com\u002F) 使用 OpenSeeFace 进行基于网络摄像头的跟踪以动画化 Live2D 模型。Godot 引擎的渲染器可以在 [这里](https:\u002F\u002Fgithub.com\u002Fvirtual-puppet-project\u002Fvpuppr) 找到。\n\n本项目实现了一个基于 MobileNetV3 的面部关键点检测模型（facial landmark detection model）。\n\n由于 Pytorch 1.3 在 Windows 上的 CPU 推理速度（CPU inference speed）非常低，该模型已转换为 ONNX 格式（ONNX format）。使用 [onnxruntime](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fonnxruntime)，它可以以 30 - 60 fps（帧每秒）的速度跟踪单张人脸。共有五个模型，具有不同的速度与跟踪质量权衡。\n\n如果有人好奇，这个名字是对“开阔海洋”（open seas）和“看见面孔”（seeing faces）的一个无厘头双关语。没有更深层的含义。\n\n最新的示例视频可以在 [这里](https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=AaNap_ud_3I&vq=hd1080) 找到，展示了默认跟踪模型在不同噪声和光照水平下的性能。\n\n# 跟踪质量\n\n由于 OpenSeeFace 使用的关键点（landmarks）与其他方法略有不同（它们接近 iBUG 68，嘴角少了两个点，并且是准 3D 面部轮廓而不是跟随可见轮廓的面部轮廓），很难将其精度与科学文献中常见的其他方法进行数值比较。跟踪性能也更多地优化为生成对动画化虚拟角色有用的关键点，而不是精确拟合面部图像。例如，只要眼部关键点能显示眼睛是睁开还是闭合，即使位置有些偏差，它们仍然可以用于此目的。\n\n从一般观察来看，OpenSeeFace 在恶劣条件下（低光、高噪声、低分辨率）表现良好，能在很宽的头部姿态范围内持续跟踪人脸，并保持相对较高的关键点位置稳定性。与 MediaPipe 相比，OpenSeeFace 的关键点在挑战性条件下保持更稳定，并且能更准确地表示更广泛的嘴部姿态。然而，眼部区域的跟踪可能不够准确。\n\n我在 Wood 等人关于 [3D Face Reconstruction with Dense Landmarks](https:\u002F\u002Fmicrosoft.github.io\u002FDenseLandmarks\u002F) 的视频演示中的样本片段上运行了 OpenSeeFace，以将其与 MediaPipe 及其方法进行比较。你可以在 [这里](https:\u002F\u002Fwww.vseeface.icu\u002Fassets\u002Fmedia\u002FOSFMediaPipe3DFR.mp4) 观看结果。\n\n# 使用方法\n\n一个用于基于 VRM 的虚拟角色动画的示例 Unity 项目可以在 [这里](https:\u002F\u002Fgithub.com\u002Femilianavt\u002FOpenSeeFaceSample) 找到。\n\n面部跟踪本身由 `facetracker.py` Python 3.7 脚本完成。它是一个命令行程序（commandline program），因此你应该手动从 cmd 启动它或编写批处理文件（batch file）来启动它。如果你下载了发行版并在 Windows 上，你可以运行 `Binary` 文件夹内的 `facetracker.exe`，而无需安装 Python。你也可以使用 `Binary` 文件夹内的 `run.bat` 来演示跟踪器的基本功能。\n\n该脚本将在网络摄像头输入或视频文件上执行跟踪，并通过 UDP 发送跟踪数据。这种设计还允许在另一台计算机上进行跟踪，而不必与使用跟踪信息的计算机是同一台。这对于增强性能和避免意外泄露摄像头画面很有用。\n\n提供的 `OpenSee` Unity 组件（Unity component）可以接收这些 UDP 数据包，并通过名为 `trackingData` 的公共字段（public field）提供接收到的信息。`OpenSeeShowPoints` 组件可以可视化检测到的人脸的关键点。它也作为一个示例。请查看它以了解如何正确使用 `OpenSee` 组件。更多示例包含在 `Examples` 文件夹中。UDP 数据包是在单独的线程（thread）中接收的，因此任何使用 `OpenSee` 组件 `trackingData` 字段的组件都应首先复制该字段并访问此副本，否则信息可能在处理过程中被覆盖。这种设计也意味着，即使 `OpenSee` 组件被禁用，该字段也会继续更新。\n\n使用 `--help` 运行 python 脚本以了解可设置的选项。\n\n    python facetracker.py --help\n\n可以通过在 Unity 中创建一个新场景，添加一个空的游戏对象并将 `OpenSee` 和 `OpenSeeShowPoints` 组件都添加到其中来实现简单的演示。当场景播放时，在视频文件上运行面部跟踪器：\n\n    python facetracker.py --visualize 3 --pnp-points 1 --max-threads 4 -c video.mp4\n\n__注意__：如果使用 [poetry](https:\u002F\u002Fpython-poetry.org\u002F) 安装了依赖项，则命令必须从 `poetry shell` 执行，或者必须以 `poetry run` 开头。\n\n这样跟踪脚本将输出其自己的跟踪可视化，同时演示跟踪数据传输到 Unity。\n\n包含的 `OpenSeeLauncher` 组件允许从 Unity 启动面部跟踪程序。它是专为配合二进制发布包中分发的 pyinstaller 创建的可执行文件而设计的。它提供了三个公共 API 函数：\n\n* `public string[] ListCameras()` 返回可用摄像头的名称。数组中摄像头的索引对应于 `cameraIndex` 字段的 ID。将 `cameraIndex` 设置为 `-1` 将禁用网络摄像头捕获。\n* `public bool StartTracker()` 将启动跟踪器。如果它已经在运行，它将关闭当前运行的实例并使用当前设置启动一个新实例。\n* `public void StopTracker()` 将停止跟踪器。当应用程序终止或 `OpenSeeLauncher` 对象被销毁时，跟踪器会自动停止。\n\n`OpenSeeLauncher` 组件使用 WinAPI 作业对象（WinAPI job objects），以确保在应用程序崩溃或未先终止跟踪进程就退出时，跟踪器子进程也会被终止。
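\n\n在 Unity 之外，也可以用 Python 标准库勾勒同样的启动与停止模式。下面是一个最小示意，复用了上文演示命令中的参数（仅作示例）；注意每个标志及其值都必须是独立的列表元素，与下文 `commandlineArguments` 的约定一致：\n\n    import subprocess\n\n    # 每个标志及其值都是独立的列表元素（\"-v\" 和 \"1\" 两个元素，\n    # 而不是单个 \"-v 1\" 字符串）；标志与路径仅为示例。\n    tracker = subprocess.Popen([\n        \"python\", \"facetracker.py\",\n        \"--visualize\", \"3\",\n        \"--pnp-points\", \"1\",\n        \"--max-threads\", \"4\",\n        \"-c\", \"video.mp4\",\n    ])\n    try:\n        tracker.wait()\n    except KeyboardInterrupt:\n        # 确保跟踪器子进程不会在脚本退出后残留。\n        tracker.terminate()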
\n\n额外的自定义命令行参数（commandline arguments）应逐个添加到 `commandlineArguments` 数组的元素中。例如 `-v 1` 应作为两个元素添加，一个元素包含 `-v`，另一个包含 `1`，而不是包含两部分的一个单一元素（如上文的示意所示）。\n\n包含的 `OpenSeeIKTarget` 组件可与 FinalIK 或其他 IK（反向动力学）解决方案结合使用，以动画化头部运动。\n\n## 表情检测\n\n`OpenSeeExpression` 组件可以添加到挂载 `OpenSee` 组件的同一游戏对象上，以检测特定的面部表情。它必须针对每个用户进行校准。可以通过 Unity Editor 中的复选框或源代码中可用的等效公共方法进行控制。\n\n要校准此系统，您需要为每种表情收集示例数据。如果捕获过程太快，可以使用 `recordingSkip` 选项来减慢速度。\n\n一般流程如下：\n\n* 输入您要校准的表情名称。\n* 做出该表情并保持住，然后勾选录制框。\n* 保持表情并四处移动头部，将其转向各个方向。\n* 过一会儿，如果该表情应与说话兼容，则在保持的同时开始说话。\n* 这样做一段时间后，取消勾选录制框并开始捕获另一种表情。\n* 勾选训练框，查看您收集数据的表情是否被准确检测。\n* 组件的下半部分还会显示一些统计数据。\n* 如果任何表情的检测有问题，请继续向其添加数据。\n\n要删除表情的捕获数据，输入其名称并勾选“Clear”框。\n\n要保存训练模型和捕获的训练数据，在“Filename”字段中输入包含完整路径的文件名，并勾选“Save”框。要加载它，输入文件名并勾选“Load”框。\n\n### 提示\n\n* 合理的表情数量是六个，包括中性表情。\n* 在开始捕获表情之前，做一些面部动作并摆动眉毛，以预热跟踪器的特征检测部分。\n* 一旦您有了一个工作良好的检测模型，在使用它时花点时间检查所有表情是否按预期工作，如果不是则添加少量数据。\n\n# 通用说明\n\n* 即使存在面部部分遮挡、眼镜或光照条件不佳的情况，跟踪似乎也相当稳健。\n* 最高质量的模型通过 `--model 3` 选择，最快但跟踪质量最低的模型是 `--model 0`。\n* 较低的跟踪质量主要意味着更刚性的跟踪，使得检测眨眼和眉毛运动更加困难。\n* 根据帧率不同，人脸跟踪很容易占满整个 CPU 核心。以 30fps 跟踪单张人脸时，在性能尚可的 CPU 上占用应低于一个核心的 100%。如果跟踪占用过多 CPU，请尝试降低帧率。20 的帧率可能就足够了，高于 30 的情况很少需要。\n* 当将跟踪的人脸数量设置为高于实际可见的人脸数量时，人脸检测模型将每 `--scan-every` 帧运行一次。这可能会减慢速度，因此尽量将 `--faces` 设置为不超过您实际跟踪的人脸数量。\n\n# 模型\n\n包含了五个预训练的人脸关键点模型。使用 `--model` 开关，可以选择它们进行跟踪。给定的 fps 值是在单核 CPU 上运行单张人脸视频时测得的。降低帧率会相应减少 CPU 使用量。\n\n* 模型 **-1**：此模型专为烤面包机等设备运行设计，因此是一个非常非常快且精度很低的模型。(213fps，无视线跟踪)\n* 模型 **0**：这是一个非常快、精度低的模型。(68fps)\n* 模型 **1**：这是一个稍慢但精度更好的模型。(59fps)\n* 模型 **2**：这是一个较慢但精度好的模型。(50fps)\n* 模型 **3** (默认)：这是最慢但精度最高的模型。(44fps)\n\nFPS 测量值来自在我的 CPU 的一个核心上运行。\n\n可用于 `model.py` 的 Pytorch 权重可在 [此处](https:\u002F\u002Fmega.nz\u002Ffile\u002FvvYXlYQT#h7FpEg4tmOCJNxjpsDEw0JomJIkVGKwrt4OUV0RNDDU) 找到。一些未优化的 ONNX 模型可在 [此处](https:\u002F\u002Fgithub.com\u002Femilianavt\u002FOpenSeeFace\u002Fissues\u002F48) 找到。\n\n# 结果\n\n## 关键点\n\n![Results1.png](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Femilianavt_OpenSeeFace_readme_89fb1f23fd86.png)\n\n![Results2.png](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Femilianavt_OpenSeeFace_readme_51edddfaa10f.png)\n\n更多样本：[Results3.png](https:\u002F\u002Fraw.githubusercontent.com\u002Femilianavt\u002FOpenSeeFace\u002Fmaster\u002FImages\u002FResults3.png), [Results4.png](https:\u002F\u002Fraw.githubusercontent.com\u002Femilianavt\u002FOpenSeeFace\u002Fmaster\u002FImages\u002FResults4.png)\n\n## 人脸检测\n\n关键点模型对于人脸的大小和方向相当稳健，因此自定义人脸检测模型可以使用比其他方法更粗糙的边界框。就本项目的用途而言，它在速度与精度之间具有良好的平衡。\n\n![EmiFace.png](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Femilianavt_OpenSeeFace_readme_12f0b180a4c7.png)\n\n# 发布构建\n\n本仓库发布区中的构建包含一个位于 `Binary` 文件夹内的 `facetracker.exe`，它使用 `pyinstaller` 构建，包含所有必需的依赖项。\n\n要运行它，至少需要将 `models` 文件夹放置在 `facetracker.exe` 所在的同一文件夹中。将其放在公共父文件夹中也可以。\n\n分发时，还应同时分发 `Licenses` 文件夹，以确保符合某些第三方库提出的要求。未使用的模型可以从重新分发的包中移除而不引起问题。\n\n发布构建包含一个没有遥测功能的自定义版本 ONNX Runtime。\n\n# 依赖项 (Python 3.6 - 3.9)\n\n* ONNX Runtime\n* OpenCV\n* Pillow\n* Numpy\n\n所需的库可以使用 pip 安装：\n\n    pip install onnxruntime opencv-python pillow numpy\n\n或者可以使用 poetry 在单独的虚拟环境中安装此项目的所有依赖项：\n\n    poetry install\n\n# 参考文献\n\n## 训练数据集\n\n该模型是在 [LS3D-W](https:\u002F\u002Fwww.adrianbulat.com\u002Fface-alignment) 数据集的 66 点版本上进行训练的。\n\n    @inproceedings{bulat2017far,\n      title={How far are we from solving the 2D \\& 3D Face Alignment problem? 
(and a dataset of 230,000 3D facial landmarks)},\n      author={Bulat, Adrian and Tzimiropoulos, Georgios},\n      booktitle={International Conference on Computer Vision},\n      year={2017}\n    }\n\n在将 WFLW 数据集缩减为 66 个点，并用此前训练好的模型预测的轮廓点和鼻尖点替换原有对应点后，进行了额外的训练。进行此项额外训练是为了提高对眼睛和眉毛的拟合效果。\n\n    @inproceedings{wayne2018lab,\n      author = {Wu, Wayne and Qian, Chen and Yang, Shuo and Wang, Quan and Cai, Yici and Zhou, Qiang},\n      title = {Look at Boundary: A Boundary-Aware Face Alignment Algorithm},\n      booktitle = {CVPR},\n      month = June,\n      year = {2018}\n    }\n\n用于训练眼动和眨眼检测模型的是 [MPIIGaze](https:\u002F\u002Fwww.mpi-inf.mpg.de\u002Fdepartments\u002Fcomputer-vision-and-machine-learning\u002Fresearch\u002Fgaze-based-human-computer-interaction\u002Fappearance-based-gaze-estimation-in-the-wild\u002F) 数据集。此外，训练过程中还使用了约 125000 个使用 [UnityEyes](https:\u002F\u002Fwww.cl.cam.ac.uk\u002Fresearch\u002Frainbow\u002Fprojects\u002Funityeyes\u002F) 生成的合成眼睛图像。\n\n需要注意的是，训练过程中还使用了额外的自定义数据，并且原始数据集中的参考关键点 (landmarks) 经过了一定程度的修改以解决各种问题。仅使用原始的 LS3D-W 和 WFLW 数据集可能无法复现这些模型，然而额外数据不可重新分发。\n\n基于热力图回归 (heatmap regression) 的面部检测模型是在 WIDER FACE 数据集的随机 224x224 裁剪区域上训练的。\n\n\t@inproceedings{yang2016wider,\n\t  Author = {Yang, Shuo and Luo, Ping and Loy, Chen Change and Tang, Xiaoou},\n\t  Booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},\n\t  Title = {WIDER FACE: A Face Detection Benchmark},\n\t  Year = {2016}\n    }\n\n## 算法\n\n该算法灵感来源于：\n\n* [设计针对不同应用的神经网络架构：从面部关键点追踪到车道偏离预警系统](https:\u002F\u002Fwww.synopsys.com\u002Fdesignware-ip\u002Ftechnical-bulletin\u002Fulsee-designing-neural-network.html) by YiTa Wu, ULSee 工程副总裁\n* [浏览器中的实时人体姿态估计与 TensorFlow.js](https:\u002F\u002Fblog.tensorflow.org\u002F2018\u002F05\u002Freal-time-human-pose-estimation-in.html)\n* [U-Net：用于生物医学图像分割的卷积网络](https:\u002F\u002Flmb.informatik.uni-freiburg.de\u002Fpeople\u002Fronneber\u002Fu-net\u002F) by Olaf Ronneberger, Philipp Fischer, Thomas Brox\n* [MobileNets：面向移动视觉应用的高效卷积神经网络](https:\u002F\u002Farxiv.org\u002Fabs\u002F1704.04861) by Andrew G. Howard, Menglong Zhu, Bo Chen, Dmitry Kalenichenko, Weijun Wang, Tobias Weyand, Marco Andreetto, Hartwig Adam\n* [搜索 MobileNetV3](https:\u002F\u002Farxiv.org\u002Fabs\u002F1905.02244) by Andrew Howard, Mark Sandler, Grace Chu, Liang-Chieh Chen, Bo Chen, Mingxing Tan, Weijun Wang, Yukun Zhu, Ruoming Pang, Vijay Vasudevan, Quoc V. 
Le, Hartwig Adam\n\nMobileNetV3 代码取自 [此处](https:\u002F\u002Fgithub.com\u002Frwightman\u002Fgen-efficientnet-pytorch)。\n\n所有训练均使用了修改版的 [自适应翼损失 (Adaptive Wing Loss)](https:\u002F\u002Fgithub.com\u002Ftankrant\u002FAdaptive-Wing-Loss)。\n\n* [自适应翼损失用于通过热力图回归实现鲁棒的面部对齐](https:\u002F\u002Farxiv.org\u002Fabs\u002F1904.07399) by Xinyao Wang, Liefeng Bo, Li Fuxin\n\n表情检测使用了 [LIBSVM](https:\u002F\u002Fwww.csie.ntu.edu.tw\u002F~cjlin\u002Flibsvm\u002F)。\n\n面部检测使用自定义的热力图回归面部检测模型或 RetinaFace 完成。\n\n    @inproceedings{deng2019retinaface,\n      title={RetinaFace: Single-stage Dense Face Localisation in the Wild},\n      author={Deng, Jiankang and Guo, Jia and Yuxiang, Zhou and Jinke Yu and Irene Kotsia and Zafeiriou, Stefanos},\n      booktitle={arxiv},\n      year={2019}\n    }\n\nRetinaFace 检测基于 [此](https:\u002F\u002Fgithub.com\u002Fbiubug6\u002FPytorch_Retinaface) 实现。预训练模型经过修改，移除了不必要的关键点检测，并转换为 ONNX 格式，分辨率为 640x640。\n\n# 致谢！\n\n非常感谢所有帮助我测试的人！\n\n* [@Virtual_Deat](https:\u002F\u002Ftwitter.com\u002FVirtual_Deat)，他也激励我开始这项工作。\n* [@ENiwatori](https:\u002F\u002Ftwitter.com\u002Feniwatori) 及其家人。\n* [@ArgamaWitch](https:\u002F\u002Ftwitter.com\u002FArgamaWitch)\n* [@AngelVayuu](https:\u002F\u002Ftwitter.com\u002FAngelVayuu)\n* [@DapperlyYours](https:\u002F\u002Ftwitter.com\u002FDapperlyYours)\n* [@comdost_art](https:\u002F\u002Ftwitter.com\u002Fcomdost_art)\n* [@Ponoki_Chan](https:\u002F\u002Ftwitter.com\u002FPonoki_Chan)\n\n# 许可证\n\n代码和模型均在 BSD 2-Clause 许可证下分发。 \n\n你可以在 `Licenses` 文件夹中找到用于二进制构建的第三方库的许可证。","# OpenSeeFace 快速上手指南\n\nOpenSeeFace 是一个基于 MobileNetV3 的面部特征点检测库，支持实时追踪面部关键点并通过 UDP 传输数据，常用于驱动 VRM 或 Live2D 虚拟形象。\n\n## 环境准备\n\n- **操作系统**：Windows \u002F Linux \u002F macOS\n- **Python 环境**：3.7+（仅在使用 Python 脚本时必需）\n- **硬件要求**：普通 CPU 即可运行，单面追踪在 30fps 下约占单核不到 100% 资源。\n- **注意**：Windows 用户强烈推荐使用预编译的 `facetracker.exe`，无需配置 Python 环境。\n\n## 安装步骤\n\n### 方式一：使用预编译版本（推荐）\n1. 前往 GitHub Releases 页面下载最新发布的压缩包。\n2. 解压后，确保 `Binary` 文件夹内的 `facetracker.exe` 与 `models` 文件夹位于同一目录（或 `models` 放在其上级目录）。\n3. 无需额外安装依赖，双击或命令行运行即可。\n\n### 方式二：源码部署\n1. 克隆项目仓库。\n2. 安装依赖（官方推荐使用 Poetry）：\n   ```bash\n   poetry install\n   ```\n3. 若使用其他包管理器，请根据项目依赖文件安装 `onnxruntime` 等相关库。\n\n## 基本使用\n\n### 命令行追踪\n程序支持摄像头输入或视频文件输入，追踪数据将通过 UDP 协议发送。\n\n**查看可用参数：**\n```bash\npython facetracker.py --help\n```\n\n**视频文件测试示例：**\n```bash\npython facetracker.py --visualize 3 --pnp-points 1 --max-threads 4 -c video.mp4\n```\n*(注：若使用 Poetry 环境，请在命令前添加 `poetry run`)*\n\n**模型选择说明：**\n通过 `--model` 参数切换不同精度与速度的模型：\n- `--model 0`：极速模式，精度较低（约 68fps）\n- `--model 3`：默认模式，高精度（约 44fps）\n- `--model -1`：超低性能设备专用，极低精度\n\n### Unity 集成简述\n1. 在 Unity 场景中创建空物体，添加 `OpenSee` 和 `OpenSeeShowPoints` 组件。\n2. 运行上述追踪命令（指定 `-c` 为摄像头 ID 或视频路径）。\n3. Unity 组件将自动接收 UDP 数据包中的 `trackingData` 字段。\n4. 
如需从 Unity 内部启动追踪器，可添加 `OpenSeeLauncher` 组件并使用 API 控制。","一位独立游戏开发者正在为他的虚拟主播角色搭建实时面部捕捉系统，但受限于笔记本电脑的硬件配置，无法负担高性能显卡带来的功耗和发热问题。\n\n### 没有 OpenSeeFace 时\n- 依赖 GPU 加速的模型导致笔记本风扇狂转，游戏运行时画面严重卡顿掉帧。\n- 在室内灯光不足的环境下，面部特征点频繁丢失或剧烈抖动，模型表情显得僵硬。\n- 需要编写复杂的中间件代码才能将摄像头数据传入 Unity 引擎驱动 3D 模型。\n- 高延迟传输导致虚拟角色的口型与语音不同步，严重影响直播互动体验。\n\n### 使用 OpenSeeFace 后\n- OpenSeeFace 基于 CPU 运行，完全释放了显卡资源，确保游戏后台流畅度不受影响。\n- 即使在昏暗房间也能稳定追踪，面部关键点位置保持平滑，准确还原细微表情变化。\n- 通过 UDP 协议直接发送数据给 Unity 组件，集成过程简单快捷，无需额外安装 Python 环境。\n- 低延迟传输让虚拟角色的表情反应几乎与真人动作同步，显著提升了观众的沉浸感。\n\nOpenSeeFace 以轻量级 CPU 方案解决了低配设备下虚拟形象实时驱动的稳定性难题。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Femilianavt_OpenSeeFace_89fb1f23.png","emilianavt","Emiliana","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002Femilianavt_29c79746.png",null,"https:\u002F\u002Ftwitter.com\u002FEmiliana_vt","https:\u002F\u002Fgithub.com\u002Femilianavt",[79,83,87,91,95,99],{"name":80,"color":81,"percentage":82},"Python","#3572A5",42.5,{"name":84,"color":85,"percentage":86},"C#","#178600",33.7,{"name":88,"color":89,"percentage":90},"C++","#f34b7d",22.4,{"name":92,"color":93,"percentage":94},"C","#555555",0.9,{"name":96,"color":97,"percentage":98},"Batchfile","#C1F12E",0.3,{"name":100,"color":101,"percentage":102},"Shell","#89e051",0.2,1842,186,"2026-04-10T08:00:43","BSD-2-Clause","未说明","不需要 GPU，基于 CPU 运行",{"notes":110,"python":111,"dependencies":112},"需将 models 文件夹与程序放在同目录；推荐使用 CPU 运行；提供 Windows 可执行文件（无需 Python）；支持通过 UDP 传输追踪数据；表情检测需针对用户校准；高帧率追踪可能占用单核 CPU 资源","3.7",[113,114,115,116],"onnxruntime","opencv-python","numpy","torch",[118,15,14],"其他",[120,121,122,123,124,125,126,127,128,113,129,130,131,132,133,134,135,136,137,138],"face-tracking","face-landmarks","depth-estimation","unity","unity3d","python","csharp","udp","onnx","virtual-youtuber","vtuber","mobilenetv3","pytorch","openseeface","face-detection","detection-model","landmark-model","tracker","cpu",4,"2026-03-27T02:49:30.150509","2026-04-11T16:54:11.256571",[143,148,153,158,163,167],{"id":144,"question_zh":145,"answer_zh":146,"source_url":147},2837,"如何在 OpenSeeFace 中实现 VMC 协议支持？","如果仅作为 'VMC Assistent' 实现，不需要逆向运动学 (IK)。可以直接发送头部目标数据到 `\u002FVMC\u002FExt\u002FTra`。但如果需要将数据应用到骨骼并发送 `\u002FVMC\u002FExt\u002FBone`，则必须使用 IK。注意并非所有应用都能处理接收到的数据，例如 VSeeFace 可能会忽略 `\u002FVMC\u002FExt\u002FTra` 数据。四元数没有固定的最小\u002F最大范围，特别是未归一化时。","https:\u002F\u002Fgithub.com\u002Femilianavt\u002FOpenSeeFace\u002Fissues\u002F38",{"id":149,"question_zh":150,"answer_zh":151,"source_url":152},2838,"如何为自定义相机（如 Oak-D）集成 OpenSeeFace？","主要难点在于解码模型输出。可以参考项目代码中的 `model.py` 和 `tracker.py` 文件。具体解码逻辑位于 `tracker.py` 的特定行（例如 tracker.py#L105-L111 和 tracker.py#L641-L660）。如果模型不直接兼容，可以尝试反向工程值来源并匹配 OpenSeeFace 协议，或者使用现有的 gaze 和 headtracking 示例进行适配。","https:\u002F\u002Fgithub.com\u002Femilianavt\u002FOpenSeeFace\u002Fissues\u002F32",{"id":154,"question_zh":155,"answer_zh":156,"source_url":157},2839,"如何理解或创建偏移量图（Offset Maps）？","偏移量图的工作原理可参考 TensorFlow 博客关于实时人体姿态估计的技术深入部分。Sigmoid 函数和缩放因子用于调整偏移掩码上的梯度陡峭度。可以使用 `logit_arr` 函数来反转 Sigmoid 函数的效果。具体的形状处理可以参考相关讨论中的 `landmarks` 函数。","https:\u002F\u002Fgithub.com\u002Femilianavt\u002FOpenSeeFace\u002Fissues\u002F70",{"id":159,"question_zh":160,"answer_zh":161,"source_url":162},2840,"是否支持将模型转换为 TFLite 格式？","目前不支持。尝试使用 `onnx-tensorflow` 将 ONNX 转换为 TFLite 时会报错 `BackendIsNotSupposedToImplementIt: FusedConv is not implemented`。维护者明确表示无法支持此情况。","https:\u002F\u002Fgithub.com\u002Femilianavt\u002FOpenSeeFace\u002Fissues\u002F69",{"id":164,"question_zh":165,"answer_zh":166,"source_url":162},2841,"Model 4 在闭眼状态下的表现及数据集情况？","Model 4 
在双眼闭合时的准确率不应高于其他模型，它仅在解耦眼睛闭合状态方面表现更好。关于训练数据集，维护者表示可能需要时间才能找到原始的 checkpoint，暂时无法提供大量正确标注的闭眼图像数据集。",{"id":168,"question_zh":169,"answer_zh":170,"source_url":171},2842,"输入图像尺寸较小时是否需要重缩放网络？","如果输入图像接近 100x100，通常不需要从 224x224 估算面部特征点。尝试动态量化可能会导致运行速度变慢。有用户反馈 `modelU` 与 `model0` 性能相似但在应用中更快，建议自行尝试不同模型配置。","https:\u002F\u002Fgithub.com\u002Femilianavt\u002FOpenSeeFace\u002Fissues\u002F9",[173,178,183,188,193,198,203,208,213,218,223,228,233,238,243,248,253,258,263,268],{"id":174,"version":175,"summary_zh":176,"released_at":177},200926,"v1.20.4","Changed the gaze tracking model to always run single threaded.\r\n\r\nUpdate: If you have trouble running it, please replace the `dshowcapture` DLL files inside the `Binary` folder with the updated ones attached to this release.","2021-09-17T20:20:39",{"id":179,"version":180,"summary_zh":181,"released_at":182},200927,"v1.20.3","Fixed various issues, made other small improvements and added a new, experimental tracking model with improved wink support.","2021-08-07T16:45:28",{"id":184,"version":185,"summary_zh":186,"released_at":187},200928,"v1.20.2","Fixed some bugs.","2020-12-13T12:33:02",{"id":189,"version":190,"summary_zh":191,"released_at":192},200929,"v1.20.1","Reduced the impact of eye blinks and jaw movement on head pose estimation and fixed various things.","2020-12-07T17:20:27",{"id":194,"version":195,"summary_zh":196,"released_at":197},200930,"v1.20.0","This release contains various fixes and adjustments as well as two new tracking models with different quality and speed tradeoffs.","2020-11-28T15:11:33",{"id":199,"version":200,"summary_zh":201,"released_at":202},200931,"v1.19.0","This release adds support for jaw bone animation to `OpenSeeVRMDriver`, dynamic port selection support to `OpenSeeLauncher` and updates the binary build of the face tracker to use onnxruntime 1.5.1, fixing some performance issues caused by the tracker not respecting thread limits.","2020-10-16T13:01:54",{"id":204,"version":205,"summary_zh":206,"released_at":207},200932,"v1.18.3","This release decodes landmarks from the python side again, as somehow using the decoded landmarks from setting `inference=True` on the models caused some issues","2020-10-08T11:51:50",{"id":209,"version":210,"summary_zh":211,"released_at":212},200933,"v1.18.2","With this release, the landmark models decode landmarks within the ONNX models again. The included binary should run with lower CPU utilization at the same speed when enabling multithreading.","2020-10-05T18:04:47",{"id":214,"version":215,"summary_zh":216,"released_at":217},200934,"v1.17.0","This release fixes a bug with the `OpenSeeLauncher` and adds support for selecting device capability lines for direct show cameras and a `--benchmark` option.","2020-09-29T00:03:02",{"id":219,"version":220,"summary_zh":221,"released_at":222},200935,"v1.16.0","The main new feature with this release is a very fast, less accurate thirty point tracking model that can be activated with `--model -1`.","2020-09-05T19:23:11",{"id":224,"version":225,"summary_zh":226,"released_at":227},200936,"v1.15.3","This release contains bug fixes and mostly small improvements. It also contains an optional, simplified expression detection method.","2020-08-21T21:25:29",{"id":229,"version":230,"summary_zh":231,"released_at":232},200937,"v1.15.2","Further reliability improvements. 
Landmark decoding has been moved from the ONNX model back into `tracker.py` in one of the recent releases, but `model.py` still contains suitable decoding code for exporting your own ONNX model containing the landmark decoding.","2020-08-17T09:37:10",{"id":234,"version":235,"summary_zh":236,"released_at":237},200938,"v1.15.1","Improved reliability and logging.","2020-08-11T11:44:57",{"id":239,"version":240,"summary_zh":241,"released_at":242},200939,"v1.15.0","This release includes various small optimizations that can lead to a 20-25% performance gain under certain conditions.","2020-08-10T16:36:47",{"id":244,"version":245,"summary_zh":246,"released_at":247},200940,"v1.14.2","This release fixes further issues with the new camera code.","2020-08-08T15:24:04",{"id":249,"version":250,"summary_zh":251,"released_at":252},200941,"v1.14.0","This release accumulates a lot of random changes, mainly to the Unity parts. One big change is that escapi is no longer used for webcam reading on Windows and has been replaced with libdshowcamera, which is also used in OBS.","2020-08-06T18:23:53",{"id":254,"version":255,"summary_zh":256,"released_at":257},200942,"v1.13.0","This release introduces a new model for `--model 2`. Unity components relying on 3D points will now skip frames with very bad 3D fits.","2020-03-30T15:04:28",{"id":259,"version":260,"summary_zh":261,"released_at":262},200943,"v1.12.0","This release contains various improvements, such as improved models for `--model 0` and `--model 1`.","2020-03-28T14:21:47",{"id":264,"version":265,"summary_zh":266,"released_at":267},200944,"v1.11.0","The main new feature of this release is a new default face detection model that is used to lock onto yet untracked faces for tracking. There are also some other small changes and fixes. The chin tracking point is used for fitting the 3D points once more to prevent the `solvePnP` function from choosing an upside down fit.","2020-03-22T13:27:03",{"id":269,"version":270,"summary_zh":271,"released_at":272},200945,"v1.10.0","New in this version:\r\n\r\n* 3D points are now scaled according to the reference model along X and Y axis to make them more consistent. The tip of the nose should always be at (0, 0) and the nose should point straight up.\r\n* `OpenSeeVRMDriver` can now animate mouth and eyes using the face tracking information.\r\n* `OpenSeeVRMDriver` supports global hotkeys to override expressions.\r\n* Small fixes and changes.","2020-03-08T22:28:39"]