[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-alex-petrenko--sample-factory":3,"tool-alex-petrenko--sample-factory":64},[4,17,27,35,43,56],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":16},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,3,"2026-04-05T11:01:52",[13,14,15],"开发框架","图像","Agent","ready",{"id":18,"name":19,"github_repo":20,"description_zh":21,"stars":22,"difficulty_score":23,"last_commit_at":24,"category_tags":25,"status":16},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",138956,2,"2026-04-05T11:33:21",[13,15,26],"语言模型",{"id":28,"name":29,"github_repo":30,"description_zh":31,"stars":32,"difficulty_score":23,"last_commit_at":33,"category_tags":34,"status":16},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 
绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",107662,"2026-04-03T11:11:01",[13,14,15],{"id":36,"name":37,"github_repo":38,"description_zh":39,"stars":40,"difficulty_score":23,"last_commit_at":41,"category_tags":42,"status":16},3704,"NextChat","ChatGPTNextWeb\u002FNextChat","NextChat 是一款轻量且极速的 AI 助手，旨在为用户提供流畅、跨平台的大模型交互体验。它完美解决了用户在多设备间切换时难以保持对话连续性，以及面对众多 AI 模型不知如何统一管理的痛点。无论是日常办公、学习辅助还是创意激发，NextChat 都能让用户随时随地通过网页、iOS、Android、Windows、MacOS 或 Linux 端无缝接入智能服务。\n\n这款工具非常适合普通用户、学生、职场人士以及需要私有化部署的企业团队使用。对于开发者而言，它也提供了便捷的自托管方案，支持一键部署到 Vercel 或 Zeabur 等平台。\n\nNextChat 的核心亮点在于其广泛的模型兼容性，原生支持 Claude、DeepSeek、GPT-4 及 Gemini Pro 等主流大模型，让用户在一个界面即可自由切换不同 AI 能力。此外，它还率先支持 MCP（Model Context Protocol）协议，增强了上下文处理能力。针对企业用户，NextChat 提供专业版解决方案，具备品牌定制、细粒度权限控制、内部知识库整合及安全审计等功能，满足公司对数据隐私和个性化管理的高标准要求。",87618,"2026-04-05T07:20:52",[13,26],{"id":44,"name":45,"github_repo":46,"description_zh":47,"stars":48,"difficulty_score":23,"last_commit_at":49,"category_tags":50,"status":16},2268,"ML-For-Beginners","microsoft\u002FML-For-Beginners","ML-For-Beginners 是由微软推出的一套系统化机器学习入门课程，旨在帮助零基础用户轻松掌握经典机器学习知识。这套课程将学习路径规划为 12 周，包含 26 节精炼课程和 52 道配套测验，内容涵盖从基础概念到实际应用的完整流程，有效解决了初学者面对庞大知识体系时无从下手、缺乏结构化指导的痛点。\n\n无论是希望转型的开发者、需要补充算法背景的研究人员，还是对人工智能充满好奇的普通爱好者，都能从中受益。课程不仅提供了清晰的理论讲解，还强调动手实践，让用户在循序渐进中建立扎实的技能基础。其独特的亮点在于强大的多语言支持，通过自动化机制提供了包括简体中文在内的 50 多种语言版本，极大地降低了全球不同背景用户的学习门槛。此外，项目采用开源协作模式，社区活跃且内容持续更新，确保学习者能获取前沿且准确的技术资讯。如果你正寻找一条清晰、友好且专业的机器学习入门之路，ML-For-Beginners 将是理想的起点。",84991,"2026-04-05T10:45:23",[14,51,52,53,15,54,26,13,55],"数据工具","视频","插件","其他","音频",{"id":57,"name":58,"github_repo":59,"description_zh":60,"stars":61,"difficulty_score":10,"last_commit_at":62,"category_tags":63,"status":16},3128,"ragflow","infiniflow\u002Fragflow","RAGFlow 
是一款领先的开源检索增强生成（RAG）引擎，旨在为大语言模型构建更精准、可靠的上下文层。它巧妙地将前沿的 RAG 技术与智能体（Agent）能力相结合，不仅支持从各类文档中高效提取知识，还能让模型基于这些知识进行逻辑推理和任务执行。\n\n在大模型应用中，幻觉问题和知识滞后是常见痛点。RAGFlow 通过深度解析复杂文档结构（如表格、图表及混合排版），显著提升了信息检索的准确度，从而有效减少模型“胡编乱造”的现象，确保回答既有据可依又具备时效性。其内置的智能体机制更进一步，使系统不仅能回答问题，还能自主规划步骤解决复杂问题。\n\n这款工具特别适合开发者、企业技术团队以及 AI 研究人员使用。无论是希望快速搭建私有知识库问答系统，还是致力于探索大模型在垂直领域落地的创新者，都能从中受益。RAGFlow 提供了可视化的工作流编排界面和灵活的 API 接口，既降低了非算法背景用户的上手门槛，也满足了专业开发者对系统深度定制的需求。作为基于 Apache 2.0 协议开源的项目，它正成为连接通用大模型与行业专有知识之间的重要桥梁。",77062,"2026-04-04T04:44:48",[15,14,13,26,54],{"id":65,"github_repo":66,"name":67,"description_en":68,"description_zh":69,"ai_summary_zh":70,"readme_en":71,"readme_zh":72,"quickstart_zh":73,"use_case_zh":74,"hero_image_url":75,"owner_login":76,"owner_name":77,"owner_avatar_url":78,"owner_bio":79,"owner_company":80,"owner_location":81,"owner_email":82,"owner_twitter":83,"owner_website":84,"owner_url":85,"languages":86,"stars":102,"forks":103,"last_commit_at":104,"license":105,"difficulty_score":23,"env_os":106,"env_gpu":107,"env_ram":108,"env_deps":109,"category_tags":115,"github_topics":116,"view_count":23,"oss_zip_url":118,"oss_zip_packed_at":118,"status":16,"created_at":119,"updated_at":120,"faqs":121,"releases":149},2456,"alex-petrenko\u002Fsample-factory","sample-factory","High throughput synchronous and asynchronous reinforcement learning","Sample Factory 是一款专注于高性能强化学习（RL）的开源代码库，旨在提供极高效的同步与异步策略梯度算法（如 PPO）实现。简单来说，它就像是一个为人工智能代理打造的“超级训练工厂”，能够以惊人的速度处理海量数据，让 AI 在复杂环境中快速学会如何完成任务。\n\n在传统的强化学习训练中，速度慢和资源消耗大往往是两大痛点。Sample Factory 正是为了解决这些问题而生。它通过高度优化的算法架构，大幅提升了数据吞吐率，从而在显著缩短训练时间的同时，降低了对硬件配置的要求。无论是简单的 Atari 游戏，还是复杂的 ViZDoom、IsaacGym 机器人仿真或 Mujoco 物理环境，Sample Factory 都能帮助模型以更低的成本达到业界领先（SOTA）的性能表现。\n\n这款工具特别适合 AI 研究人员、机器学习工程师以及需要大规模训练智能体的开发者使用。如果你正在探索强化学习的前沿应用，或者苦恼于训练过程过于漫长，Sample Factory 将是一个得力的助手。其核心技术亮点在于灵活支持同步和异步两种训练模式，并提供了从单进程串行到大规模并行计算的多种选择，兼顾","Sample Factory 是一款专注于高性能强化学习（RL）的开源代码库，旨在提供极高效的同步与异步策略梯度算法（如 PPO）实现。简单来说，它就像是一个为人工智能代理打造的“超级训练工厂”，能够以惊人的速度处理海量数据，让 AI 
在复杂环境中快速学会如何完成任务。\n\n在传统的强化学习训练中，速度慢和资源消耗大往往是两大痛点。Sample Factory 正是为了解决这些问题而生。它通过高度优化的算法架构，大幅提升了数据吞吐率，从而在显著缩短训练时间的同时，降低了对硬件配置的要求。无论是简单的 Atari 游戏，还是复杂的 ViZDoom、IsaacGym 机器人仿真或 Mujoco 物理环境，Sample Factory 都能帮助模型以更低的成本达到业界领先（SOTA）的性能表现。\n\n这款工具特别适合 AI 研究人员、机器学习工程师以及需要大规模训练智能体的开发者使用。如果你正在探索强化学习的前沿应用，或者苦恼于训练过程过于漫长，Sample Factory 将是一个得力的助手。其核心技术亮点在于灵活支持同步和异步两种训练模式，并提供了从单进程串行到大规模并行计算的多种选择，兼顾了调试的便捷性与生产环境的高效率。目前，Sample Factory 已更新至版本 2，拥有完善的文档和社区支持，代码风格规范且经过严格测试，是追求高效能强化学习解决方案的理想选择。","[![tests](https:\u002F\u002Fgithub.com\u002Falex-petrenko\u002Fsample-factory\u002Factions\u002Fworkflows\u002Ftest-ci.yml\u002Fbadge.svg?branch=master)](https:\u002F\u002Fgithub.com\u002Falex-petrenko\u002Fsample-factory\u002Factions\u002Fworkflows\u002Ftest-ci.yml)\n[![codecov](https:\u002F\u002Fcodecov.io\u002Fgh\u002Falex-petrenko\u002Fsample-factory\u002Fbranch\u002Fmaster\u002Fgraph\u002Fbadge.svg?token=9EHMIU5WYV)](https:\u002F\u002Fcodecov.io\u002Fgh\u002Falex-petrenko\u002Fsample-factory)\n[![pre-commit](https:\u002F\u002Fgithub.com\u002Falex-petrenko\u002Fsample-factory\u002Factions\u002Fworkflows\u002Fpre-commit.yml\u002Fbadge.svg?branch=master)](https:\u002F\u002Fgithub.com\u002Falex-petrenko\u002Fsample-factory\u002Factions\u002Fworkflows\u002Fpre-commit.yml)\n[![docs](https:\u002F\u002Fgithub.com\u002Falex-petrenko\u002Fsample-factory\u002Factions\u002Fworkflows\u002Fdocs.yml\u002Fbadge.svg)](https:\u002F\u002Fsamplefactory.dev)\n[![Code style: black](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fcode%20style-black-000000.svg)](https:\u002F\u002Fgithub.com\u002Fpsf\u002Fblack)\n[![Imports: isort](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F%20imports-isort-%231674b1?style=flat&labelColor=ef8336)](https:\u002F\u002Fpycqa.github.io\u002Fisort\u002F)\n[![GitHub 
license](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Flicense-MIT-blue.svg)](https:\u002F\u002Fgithub.com\u002Falex-petrenko\u002Fsample-factory\u002Fblob\u002Fmaster\u002FLICENSE)\n[![Downloads](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Falex-petrenko_sample-factory_readme_a699eb45bbed.png)](https:\u002F\u002Fpepy.tech\u002Fproject\u002Fsample-factory)\n[\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fdiscord\u002F987232982798598164?label=discord\">](https:\u002F\u002Fdiscord.gg\u002FBCfHWaSMkr)\n\u003C!-- [![pre-commit.ci status](https:\u002F\u002Fresults.pre-commit.ci\u002Fbadge\u002Fgithub\u002FwmFrank\u002Fsample-factory\u002Fmaster.svg)](https:\u002F\u002Fresults.pre-commit.ci\u002Flatest\u002Fgithub\u002FwmFrank\u002Fsample-factory\u002Fmaster)-->\n\u003C!-- [![wakatime](https:\u002F\u002Fwakatime.com\u002Fbadge\u002Fgithub\u002Falex-petrenko\u002Fsample-factory.svg)](https:\u002F\u002Fwakatime.com\u002Fbadge\u002Fgithub\u002Falex-petrenko\u002Fsample-factory)-->\n\n\n# Sample Factory\n\nHigh-throughput reinforcement learning codebase. Version **2** is out! 🤗\n\n**Resources:**\n\n* **Documentation:** [https:\u002F\u002Fsamplefactory.dev](https:\u002F\u002Fsamplefactory.dev) \n\n* **Paper:** https:\u002F\u002Farxiv.org\u002Fabs\u002F2006.11751\n\n* **Citation:** [BibTeX](https:\u002F\u002Fgithub.com\u002Falex-petrenko\u002Fsample-factory#citation)\n\n* **Discord:** [https:\u002F\u002Fdiscord.gg\u002FBCfHWaSMkr](https:\u002F\u002Fdiscord.gg\u002FBCfHWaSMkr)\n\n* **Twitter (for updates):** [@petrenko_ai](https:\u002F\u002Ftwitter.com\u002Fpetrenko_ai)\n\n* **Talk (circa 2021):** https:\u002F\u002Fyoutu.be\u002FlLG17LKKSZc\n\n### What is Sample Factory?\n\nSample Factory is one of the fastest RL libraries focused on very efficient synchronous and asynchronous implementations of policy gradients (PPO). 
\n\nSample Factory is thoroughly tested and used by many researchers and practitioners.\nOur implementation is known to reach state-of-the-art (SOTA) performance across a wide range of domains, while minimizing the required training time and hardware requirements.\nClips below demonstrate ViZDoom, IsaacGym, DMLab-30, Megaverse, Mujoco, and Atari agents trained with Sample Factory:\n\n\u003Cp align=\"middle\">\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Falex-petrenko_sample-factory_readme_7b4e5f74b2a2.gif\" width=\"360\" alt=\"VizDoom agents trained using Sample Factory 2.0\">\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Falex-petrenko_sample-factory_readme_b7ba2e340337.gif\" width=\"360\" alt=\"IsaacGym agents trained using Sample Factory 2.0\">\n\u003Cbr\u002F>\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Falex-petrenko_sample-factory_readme_d323331beb85.gif\" width=\"380\" alt=\"DMLab-30 agents trained using Sample Factory 2.0\">\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Falex-petrenko_sample-factory_readme_c760c106e239.gif\" width=\"340\" alt=\"Megaverse agents trained using Sample Factory 2.0\">\n\u003Cbr\u002F>\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Falex-petrenko_sample-factory_readme_a690ae6c6309.gif\" width=\"390\" alt=\"Mujoco agents trained using Sample Factory 2.0\">\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Falex-petrenko_sample-factory_readme_04de05c5fbed.gif\" width=\"330\" alt=\"Atari agents trained using Sample Factory 2.0\">\n\u003C\u002Fp>\n\n**Key features:**\n\n* Highly optimized algorithm [architecture](https:\u002F\u002Fwww.samplefactory.dev\u002F06-architecture\u002Foverview\u002F) for maximum learning throughput\n* [Synchronous and asynchronous](https:\u002F\u002Fwww.samplefactory.dev\u002F07-advanced-topics\u002Fsync-async\u002F) training regimes\n* [Serial (single-process) 
mode](https:\u002F\u002Fwww.samplefactory.dev\u002F07-advanced-topics\u002Fserial-mode\u002F) for easy debugging\n* Optimal performance in both CPU-based and [GPU-accelerated environments](https:\u002F\u002Fwww.samplefactory.dev\u002F09-environment-integrations\u002Fisaacgym\u002F)\n* Single- & multi-agent training, self-play, supports [training multiple policies](https:\u002F\u002Fwww.samplefactory.dev\u002F07-advanced-topics\u002Fmulti-policy-training\u002F) at once on one or many GPUs\n* Population-Based Training ([PBT](https:\u002F\u002Fwww.samplefactory.dev\u002F07-advanced-topics\u002Fmulti-policy-training\u002F))\n* Discrete, continuous, hybrid action spaces\n* Vector-based, image-based, dictionary observation spaces\n* Automatically creates a model architecture by parsing action\u002Fobservation space specification. Supports [custom model architectures](https:\u002F\u002Fwww.samplefactory.dev\u002F03-customization\u002Fcustom-models\u002F)\n* Library is designed to be imported into other projects, [custom environments](https:\u002F\u002Fwww.samplefactory.dev\u002F03-customization\u002Fcustom-environments\u002F) are first-class citizens\n* Detailed [WandB and Tensorboard summaries](https:\u002F\u002Fwww.samplefactory.dev\u002F05-monitoring\u002Fmetrics-reference\u002F), [custom metrics](https:\u002F\u002Fwww.samplefactory.dev\u002F05-monitoring\u002Fcustom-metrics\u002F)\n* [HuggingFace 🤗 integration](https:\u002F\u002Fwww.samplefactory.dev\u002F10-huggingface\u002Fhuggingface\u002F) (upload trained models and metrics to the Hub)\n* [Multiple](https:\u002F\u002Fwww.samplefactory.dev\u002F09-environment-integrations\u002Fmujoco\u002F) [example](https:\u002F\u002Fwww.samplefactory.dev\u002F09-environment-integrations\u002Fatari\u002F) [environment](https:\u002F\u002Fwww.samplefactory.dev\u002F09-environment-integrations\u002Fvizdoom\u002F) [integrations](https:\u002F\u002Fwww.samplefactory.dev\u002F09-environment-integrations\u002Fdmlab\u002F) with tuned 
parameters and trained models\n\nThis Readme provides only a brief overview of the library.\nVisit full documentation at [https:\u002F\u002Fsamplefactory.dev](https:\u002F\u002Fsamplefactory.dev) for more details.\n\n## Installation\n\nJust install from PyPI:\n\n```pip install sample-factory```\n\nSF is known to work on Linux and macOS. There is no Windows support at this time.\nPlease refer to the [documentation](https:\u002F\u002Fsamplefactory.dev) for additional environment-specific installation notes.\n\n## Quickstart\n\nUse command line to train an agent using one of the existing integrations, e.g. Mujoco (might need to run `pip install sample-factory[mujoco]`):\n\n```bash\npython -m sf_examples.mujoco.train_mujoco --env=mujoco_ant --experiment=Ant --train_dir=.\u002Ftrain_dir\n```\n\nStop the experiment (Ctrl+C) when the desired performance is reached and then evaluate the agent:\n\n```bash\npython -m sf_examples.mujoco.enjoy_mujoco --env=mujoco_ant --experiment=Ant --train_dir=.\u002Ftrain_dir\n\n# Or use an alternative eval script, no rendering but much faster! 
(use `sample_env_episodes` >= `num_workers` * `num_envs_per_worker`).\npython -m sf_examples.mujoco.fast_eval_mujoco --env=mujoco_ant --experiment=Ant --train_dir=.\u002Ftrain_dir --sample_env_episodes=128 --num_workers=16 --num_envs_per_worker=2\n```\n\nDo the same in a pixel-based VizDoom environment (might need to run `pip install sample-factory[vizdoom]`, please also see docs for VizDoom-specific instructions):\n\n```bash\npython -m sf_examples.vizdoom.train_vizdoom --env=doom_basic --experiment=DoomBasic --train_dir=.\u002Ftrain_dir --num_workers=16 --num_envs_per_worker=10 --train_for_env_steps=1000000\npython -m sf_examples.vizdoom.enjoy_vizdoom --env=doom_basic --experiment=DoomBasic --train_dir=.\u002Ftrain_dir\n```\n\nMonitor any running or completed experiment with Tensorboard:\n\n```bash\ntensorboard --logdir=.\u002Ftrain_dir\n```\n(or see the docs for WandB integration).\n\nTo continue from here, copy and modify one of the existing env integrations to train agents in your own custom environment. We provide\nexamples for all kinds of supported environments, please refer to the [documentation](https:\u002F\u002Fsamplefactory.dev) for more details.\n\n## Acknowledgements\n\nThis project would not be possible without amazing contributions from many people. 
I would like to thank:\n\n* [Vladlen Koltun](https:\u002F\u002Fvladlen.info) for amazing guidance and support, especially in the early stages of the project, for\nhelping me solidify the ideas that eventually became this library.\n* My academic advisor [Gaurav Sukhatme](https:\u002F\u002Fviterbi.usc.edu\u002Fdirectory\u002Ffaculty\u002FSukhatme\u002FGaurav) for supporting this project\nover the years of my PhD and for being overall an awesome mentor.\n* [Zhehui Huang](https:\u002F\u002Fzhehui-huang.github.io\u002F) for his contributions to the original ICML submission, his diligent work on\ntesting and evaluating the library and for adopting it in his own research.\n* [Edward Beeching](https:\u002F\u002Fedbeeching.github.io\u002F) for his numerous awesome contributions to the codebase, including\nhybrid action distributions, new version of the custom model builder, multiple environment integrations, and also\nfor promoting the library through the HuggingFace integration!\n* [Andrew Zhang](https:\u002F\u002Fandrewzhang505.github.io\u002F) and [Ming Wang](https:\u002F\u002Fwww.mingwang.me\u002F) for numerous contributions to the codebase and documentation during their HuggingFace internships!\n* [Thomas Wolf](https:\u002F\u002Fthomwolf.io\u002F) and others at HuggingFace for the incredible (and unexpected) support and for the amazing\nwork they are doing for the open-source community.\n* [Erik Wijmans](https:\u002F\u002Fwijmans.xyz\u002F) for feedback and insights and for his awesome implementation of RNN backprop using PyTorch's `PackedSequence`, multi-layer RNNs, and other features!\n* [Tushar Kumar](https:\u002F\u002Fwww.linkedin.com\u002Fin\u002Ftushartk\u002F) for contributing to the original paper and for his help\nwith the [fast queue implementation](https:\u002F\u002Fgithub.com\u002Falex-petrenko\u002Ffaster-fifo).\n* [Costa Huang](https:\u002F\u002Fcosta.sh\u002F) for developing CleanRL, for his work on benchmarking RL algorithms, and for awesome 
feedback\nand insights!\n* [Denys Makoviichuk](https:\u002F\u002Fgithub.com\u002FDenys88\u002Frl_games) for developing rl_games, a very fast RL library, for inspiration and \nfeedback on numerous features of this library (such as return normalizations, adaptive learning rate, and others).\n* [Eugene Vinitsky](https:\u002F\u002Feugenevinitsky.github.io\u002F) for adopting this library in his own research and for his valuable feedback.\n* All my labmates at RESL who used Sample Factory in their projects and provided feedback and insights!\n\nHuge thanks to all the people who are not mentioned here for your code contributions, PRs, issues, and questions!\nThis project would not be possible without a community!\n\n## Citation\n\nIf you use this repository in your work or otherwise wish to cite it, please make reference to our ICML2020 paper.\n\n```\n@inproceedings{petrenko2020sf,\n  author    = {Aleksei Petrenko and\n               Zhehui Huang and\n               Tushar Kumar and\n               Gaurav S. Sukhatme and\n               Vladlen Koltun},\n  title     = {Sample Factory: Egocentric 3D Control from Pixels at 100000 {FPS}\n               with Asynchronous Reinforcement Learning},\n  booktitle = {Proceedings of the 37th International Conference on Machine Learning,\n               {ICML} 2020, 13-18 July 2020, Virtual Event},\n  series    = {Proceedings of Machine Learning Research},\n  volume    = {119},\n  pages     = {7652--7662},\n  publisher = {{PMLR}},\n  year      = {2020},\n  url       = {http:\u002F\u002Fproceedings.mlr.press\u002Fv119\u002Fpetrenko20a.html},\n  biburl    = {https:\u002F\u002Fdblp.org\u002Frec\u002Fconf\u002Ficml\u002FPetrenkoHKSK20.bib},\n  bibsource = {dblp computer science bibliography, https:\u002F\u002Fdblp.org}\n}\n```\n\nFor questions, issues, inquiries please join Discord. \nGithub issues and pull requests are welcome! 
Check out the [contribution guidelines](https:\u002F\u002Fwww.samplefactory.dev\u002Fcommunity\u002Fcontribution\u002F).\n","[![tests](https:\u002F\u002Fgithub.com\u002Falex-petrenko\u002Fsample-factory\u002Factions\u002Fworkflows\u002Ftest-ci.yml\u002Fbadge.svg?branch=master)](https:\u002F\u002Fgithub.com\u002Falex-petrenko\u002Fsample-factory\u002Factions\u002Fworkflows\u002Ftest-ci.yml)\n[![codecov](https:\u002F\u002Fcodecov.io\u002Fgh\u002Falex-petrenko\u002Fsample-factory\u002Fbranch\u002Fmaster\u002Fgraph\u002Fbadge.svg?token=9EHMIU5WYV)](https:\u002F\u002Fcodecov.io\u002Fgh\u002Falex-petrenko\u002Fsample-factory)\n[![pre-commit](https:\u002F\u002Fgithub.com\u002Falex-petrenko\u002Fsample-factory\u002Factions\u002Fworkflows\u002Fpre-commit.yml\u002Fbadge.svg?branch=master)](https:\u002F\u002Fgithub.com\u002Falex-petrenko\u002Fsample-factory\u002Factions\u002Fworkflows\u002Fpre-commit.yml)\n[![docs](https:\u002F\u002Fgithub.com\u002Falex-petrenko\u002Fsample-factory\u002Factions\u002Fworkflows\u002Fdocs.yml\u002Fbadge.svg)](https:\u002F\u002Fsamplefactory.dev)\n[![Code style: black](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fcode%20style-black-000000.svg)](https:\u002F\u002Fgithub.com\u002Fpsf\u002Fblack)\n[![Imports: isort](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F%20imports-isort-%231674b1?style=flat&labelColor=ef8336)](https:\u002F\u002Fpycqa.github.io\u002Fisort\u002F)\n[![GitHub license](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Flicense-MIT-blue.svg)](https:\u002F\u002Fgithub.com\u002Falex-petrenko\u002Fsample-factory\u002Fblob\u002Fmaster\u002FLICENSE)\n[![Downloads](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Falex-petrenko_sample-factory_readme_a699eb45bbed.png)](https:\u002F\u002Fpepy.tech\u002Fproject\u002Fsample-factory)\n[\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fdiscord\u002F987232982798598164?label=discord\">](https:\u002F\u002Fdiscord.gg\u002FBCfHWaSMkr)\n\u003C!-- [![pre-commit.ci 
status](https:\u002F\u002Fresults.pre-commit.ci\u002Fbadge\u002Fgithub\u002FwmFrank\u002Fsample-factory\u002Fmaster.svg)](https:\u002F\u002Fresults.pre-commit.ci\u002Flatest\u002Fgithub\u002FwmFrank\u002Fsample-factory\u002Fmaster)-->\n\u003C!-- [![wakatime](https:\u002F\u002Fwakatime.com\u002Fbadge\u002Fgithub\u002Falex-petrenko\u002Fsample-factory.svg)](https:\u002F\u002Fwakatime.com\u002Fbadge\u002Fgithub\u002Falex-petrenko\u002Fsample-factory)-->\n\n\n# Sample Factory\n\n高吞吐量强化学习代码库。**版本 2** 已发布！🤗\n\n**资源：**\n\n* **文档：** [https:\u002F\u002Fsamplefactory.dev](https:\u002F\u002Fsamplefactory.dev)\n\n* **论文：** https:\u002F\u002Farxiv.org\u002Fabs\u002F2006.11751\n\n* **引用：** [BibTeX](https:\u002F\u002Fgithub.com\u002Falex-petrenko\u002Fsample-factory#citation)\n\n* **Discord：** [https:\u002F\u002Fdiscord.gg\u002FBCfHWaSMkr](https:\u002F\u002Fdiscord.gg\u002FBCfHWaSMkr)\n\n* **Twitter（更新）：** [@petrenko_ai](https:\u002F\u002Ftwitter.com\u002Fpetrenko_ai)\n\n* **演讲（约 2021 年）：** https:\u002F\u002Fyoutu.be\u002FlLG17LKKSZc\n\n### Sample Factory 是什么？\n\nSample Factory 是最快的强化学习库之一，专注于高效同步和异步的策略梯度算法实现（PPO）。\n\nSample Factory 经过全面测试，被众多研究人员和从业者广泛使用。\n我们的实现以在多种任务领域中达到最先进（SOTA）性能而闻名，同时最大限度地减少所需的训练时间和硬件需求。\n以下视频展示了使用 Sample Factory 训练的 ViZDoom、IsaacGym、DMLab-30、Megaverse、Mujoco 和 Atari 智能体：\n\n\u003Cp align=\"middle\">\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Falex-petrenko_sample-factory_readme_7b4e5f74b2a2.gif\" width=\"360\" alt=\"使用 Sample Factory 2.0 训练的 VizDoom 智能体\">\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Falex-petrenko_sample-factory_readme_b7ba2e340337.gif\" width=\"360\" alt=\"使用 Sample Factory 2.0 训练的 IsaacGym 智能体\">\n\u003Cbr\u002F>\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Falex-petrenko_sample-factory_readme_d323331beb85.gif\" width=\"380\" alt=\"使用 Sample Factory 2.0 训练的 DMLab-30 智能体\">\n\u003Cimg 
src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Falex-petrenko_sample-factory_readme_c760c106e239.gif\" width=\"340\" alt=\"使用 Sample Factory 2.0 训练的 Megaverse 智能体\">\n\u003Cbr\u002F>\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Falex-petrenko_sample-factory_readme_a690ae6c6309.gif\" width=\"390\" alt=\"使用 Sample Factory 2.0 训练的 Mujoco 智能体\">\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Falex-petrenko_sample-factory_readme_04de05c5fbed.gif\" width=\"330\" alt=\"使用 Sample Factory 2.0 训练的 Atari 智能体\">\n\u003C\u002Fp>\n\n**主要特性：**\n\n* 高度优化的算法[架构](https:\u002F\u002Fwww.samplefactory.dev\u002F06-architecture\u002Foverview\u002F)，实现最大学习吞吐量\n* [同步与异步](https:\u002F\u002Fwww.samplefactory.dev\u002F07-advanced-topics\u002Fsync-async\u002F)训练模式\n* [串行（单进程）模式](https:\u002F\u002Fwww.samplefactory.dev\u002F07-advanced-topics\u002Fserial-mode\u002F)便于调试\n* 在基于 CPU 和 [GPU 加速环境](https:\u002F\u002Fwww.samplefactory.dev\u002F09-environment-integrations\u002Fisaacgym\u002F)中均表现出色\n* 支持单智能体与多智能体训练、自我对弈，并可同时在一台或多台 GPU 上训练 [多个策略](https:\u002F\u002Fwww.samplefactory.dev\u002F07-advanced-topics\u002Fmulti-policy-training\u002F)\n* 基于群体的训练（[PBT](https:\u002F\u002Fwww.samplefactory.dev\u002F07-advanced-topics\u002Fmulti-policy-training\u002F)）\n* 支持离散、连续及混合动作空间\n* 支持向量型、图像型和字典型观测空间\n* 可自动根据动作\u002F观测空间规范构建模型架构，同时也支持 [自定义模型架构](https:\u002F\u002Fwww.samplefactory.dev\u002F03-customization\u002Fcustom-models\u002F)\n* 库设计为可导入其他项目，[自定义环境](https:\u002F\u002Fwww.samplefactory.dev\u002F03-customization\u002Fcustom-environments\u002F)被视为一等公民\n* 提供详细的 [WandB 和 TensorBoard 摘要](https:\u002F\u002Fwww.samplefactory.dev\u002F05-monitoring\u002Fmetrics-reference\u002F)以及 [自定义指标](https:\u002F\u002Fwww.samplefactory.dev\u002F05-monitoring\u002Fcustom-metrics\u002F)\n* [HuggingFace 🤗 集成](https:\u002F\u002Fwww.samplefactory.dev\u002F10-huggingface\u002Fhuggingface\u002F)（可将训练好的模型和指标上传至 Hub）\n* 提供多种 
[示例](https:\u002F\u002Fwww.samplefactory.dev\u002F09-environment-integrations\u002Fmujoco) [环境](https:\u002F\u002Fwww.samplefactory.dev\u002F09-environment-integrations\u002Fatari) [集成](https:\u002F\u002Fwww.samplefactory.dev\u002F09-environment-integrations\u002Fvizdoom) [与调优参数和训练好的模型](https:\u002F\u002Fwww.samplefactory.dev\u002F09-environment-integrations\u002Fdmlab\u002F)\n\n本 README 仅提供库的简要概述。\n更多详细信息请访问完整文档：[https:\u002F\u002Fsamplefactory.dev](https:\u002F\u002Fsamplefactory.dev)。\n\n## 安装\n\n只需从 PyPI 安装即可：\n\n```pip install sample-factory```\n\nSF 已知可在 Linux 和 macOS 上运行。目前暂不支持 Windows。\n有关特定于环境的安装说明，请参阅 [文档](https:\u002F\u002Fsamplefactory.dev)。\n\n## 快速入门\n\n使用命令行通过现有集成之一训练智能体，例如 Mujoco（可能需要先运行 `pip install sample-factory[mujoco]`）：\n\n```bash\npython -m sf_examples.mujoco.train_mujoco --env=mujoco_ant --experiment=Ant --train_dir=.\u002Ftrain_dir\n```\n\n当达到预期性能时，按 Ctrl+C 停止实验，然后评估智能体：\n\n```bash\npython -m sf_examples.mujoco.enjoy_mujoco --env=mujoco_ant --experiment=Ant --train_dir=.\u002Ftrain_dir\n\n# 或者使用替代的评估脚本，虽然不进行渲染，但速度更快！（请确保 `sample_env_episodes` 大于等于 `num_workers` * `num_envs_per_worker`）。\npython -m sf_examples.mujoco.fast_eval_mujoco --env=mujoco_ant --experiment=Ant --train_dir=.\u002Ftrain_dir --sample_env_episodes=128 --num_workers=16 --num_envs_per_worker=2\n```\n\n在基于像素的 VizDoom 环境中执行相同的操作（可能需要运行 `pip install sample-factory[vizdoom]`，请同时参阅 VizDoom 特定说明文档）：\n\n```bash\npython -m sf_examples.vizdoom.train_vizdoom --env=doom_basic --experiment=DoomBasic --train_dir=.\u002Ftrain_dir --num_workers=16 --num_envs_per_worker=10 --train_for_env_steps=1000000\npython -m sf_examples.vizdoom.enjoy_vizdoom --env=doom_basic --experiment=DoomBasic --train_dir=.\u002Ftrain_dir\n```\n\n使用 TensorBoard 监控任何正在运行或已完成的实验：\n\n```bash\ntensorboard --logdir=.\u002Ftrain_dir\n```\n（或参阅文档了解 WandB 集成方法）。\n\n要继续深入，请复制并修改现有的环境集成代码，以在您自定义的环境中训练智能体。我们提供了适用于各种支持环境的示例，请参阅[文档](https:\u002F\u002Fsamplefactory.dev)获取更多详细信息。\n\n## 致谢\n\n没有众多人士的杰出贡献，本项目将无法实现。在此特别感谢：\n\n* 
[Vladlen Koltun](https:\u002F\u002Fvladlen.info)，尤其是在项目早期阶段，他为我提供了卓越的指导与支持，帮助我巩固了最终形成该库的核心理念。\n* 我的学术导师 [Gaurav Sukhatme](https:\u002F\u002Fviterbi.usc.edu\u002Fdirectory\u002Ffaculty\u002FSukhatme\u002FGaurav)，多年来一直支持我的博士研究，并始终是一位出色的导师。\n* [Zhehui Huang](https:\u002F\u002Fzhehui-huang.github.io\u002F) 对原始 ICML 论文投稿的贡献、对库的严谨测试与评估工作，以及将其应用于自身研究的努力。\n* [Edward Beeching](https:\u002F\u002Fedbeeching.github.io\u002F) 对代码库的诸多出色贡献，包括混合动作分布、自定义模型构建器的新版本、多种环境集成，以及通过 HuggingFace 集成推广该库！\n* [Andrew Zhang](https:\u002F\u002Fandrewzhang505.github.io\u002F) 和 [Ming Wang](https:\u002F\u002Fwww.mingwang.me\u002F) 在 HuggingFace 实习期间为代码库和文档做出的大量贡献！\n* HuggingFace 的 [Thomas Wolf](https:\u002F\u002Fthomwolf.io\u002F) 及其团队提供的难以置信（且出乎意料的）支持，以及他们为开源社区所做的卓越工作。\n* [Erik Wijmans](https:\u002F\u002Fwijmans.xyz\u002F) 提供的反馈与洞见，尤其是他利用 PyTorch 的 `PackedSequence` 实现 RNN 反向传播、多层 RNN 等功能！\n* [Tushar Kumar](https:\u002F\u002Fwww.linkedin.com\u002Fin\u002Ftushartk\u002F) 对原始论文的贡献，以及他在[快速队列实现](https:\u002F\u002Fgithub.com\u002Falex-petrenko\u002Ffaster-fifo)方面的帮助。\n* [Costa Huang](https:\u002F\u002Fcosta.sh\u002F) 开发了 CleanRL，致力于强化学习算法的基准测试，并提供了宝贵的反馈与见解！\n* [Denys Makoviichuk](https:\u002F\u002Fgithub.com\u002FDenys88\u002Frl_games) 开发了 rl_games 这一极速强化学习库，为本库的诸多特性（如回报归一化、自适应学习率等）提供了灵感与反馈。\n* [Eugene Vinitsky](https:\u002F\u002Feugenevinitsky.github.io\u002F) 将本库应用于其研究，并提供了宝贵的意见。\n* RESL 实验室的所有同事，他们在各自项目中使用了 Sample Factory，并给予了反馈与洞见！\n\n衷心感谢所有未在此提及的各位，感谢你们的代码贡献、PR、问题与疑问！正是有了社区的支持，才成就了这个项目！\n\n## 引用\n\n如果您在工作中使用了本仓库，或希望引用它，请参考我们的 ICML 2020 论文。\n\n```\n@inproceedings{petrenko2020sf,\n  author    = {Aleksei Petrenko and\n               Zhehui Huang and\n               Tushar Kumar and\n               Gaurav S. 
Sukhatme and\n               Vladlen Koltun},\n  title     = {Sample Factory: Egocentric 3D Control from Pixels at 100000 {FPS}\n               with Asynchronous Reinforcement Learning},\n  booktitle = {Proceedings of the 37th International Conference on Machine Learning,\n               {ICML} 2020, 13-18 July 2020, Virtual Event},\n  series    = {Proceedings of Machine Learning Research},\n  volume    = {119},\n  pages     = {7652--7662},\n  publisher = {{PMLR}},\n  year      = {2020},\n  url       = {http:\u002F\u002Fproceedings.mlr.press\u002Fv119\u002Fpetrenko20a.html},\n  biburl    = {https:\u002F\u002Fdblp.org\u002Frec\u002Fconf\u002Ficml\u002FPetrenkoHKSK20.bib},\n  bibsource = {dblp computer science bibliography, https:\u002F\u002Fdblp.org}\n}\n```\n\n如有任何问题、意见或咨询，请加入 Discord 社区。欢迎提交 GitHub 问题和拉取请求！请查看[贡献指南](https:\u002F\u002Fwww.samplefactory.dev\u002Fcommunity\u002Fcontribution\u002F)。","# Sample Factory 快速上手指南\n\nSample Factory 是一个专注于高吞吐量的强化学习代码库，提供了策略梯度（PPO）的高效同步和异步实现。它以极低的训练时间和硬件需求，在多个领域达到了最先进（SOTA）的性能。\n\n## 环境准备\n\n*   **操作系统**：支持 Linux 和 macOS。**暂不支持 Windows**。\n*   **Python 环境**：建议安装 Python 3.7+。\n*   **前置依赖**：确保已安装 `pip`。根据你要运行的具体环境示例（如 Mujoco 或 VizDoom），可能需要额外的系统级依赖库，请参考[官方文档](https:\u002F\u002Fsamplefactory.dev)中的环境特定安装说明。\n\n## 安装步骤\n\n通过 PyPI 直接安装核心库：\n\n```bash\npip install sample-factory\n```\n\n> **提示**：如果你计划运行特定的示例环境，建议安装对应的额外依赖。例如：\n> *   Mujoco 示例：`pip install sample-factory[mujoco]`\n> *   VizDoom 示例：`pip install sample-factory[vizdoom]`\n\n## 基本使用\n\n以下以 Mujoco 环境为例，展示如何训练和评估智能体。\n\n### 1. 训练智能体\n\n运行以下命令开始训练。该示例使用 `mujoco_ant` 环境，实验命名为 `Ant`，模型保存在 `.\u002Ftrain_dir` 目录下。\n\n```bash\npython -m sf_examples.mujoco.train_mujoco --env=mujoco_ant --experiment=Ant --train_dir=.\u002Ftrain_dir\n```\n\n当达到预期性能后，按 `Ctrl+C` 停止训练。\n\n### 2. 
评估智能体\n\n训练完成后，可以使用以下命令可视化评估智能体的表现：\n\n```bash\npython -m sf_examples.mujoco.enjoy_mujoco --env=mujoco_ant --experiment=Ant --train_dir=.\u002Ftrain_dir\n```\n\n如果需要更快速的非渲染评估（适用于批量测试），可以使用：\n\n```bash\npython -m sf_examples.mujoco.fast_eval_mujoco --env=mujoco_ant --experiment=Ant --train_dir=.\u002Ftrain_dir --sample_env_episodes=128 --num_workers=16 --num_envs_per_worker=2\n```\n\n### 3. 监控训练过程\n\n你可以使用 TensorBoard 实时监控训练指标：\n\n```bash\ntensorboard --logdir=.\u002Ftrain_dir\n```\n\n也可以在浏览器中打开 TensorBoard 显示的地址查看损失函数、奖励曲线等详细数据。\n\n---\n\n**更多资源：**\n*   完整文档：[https:\u002F\u002Fsamplefactory.dev](https:\u002F\u002Fsamplefactory.dev)\n*   社区交流：[Discord](https:\u002F\u002Fdiscord.gg\u002FBCfHWaSMkr)","一家游戏工作室的 AI 研发团队正致力于训练一个能在复杂 3D 环境中自主导航并完成任务的智能体（基于 ViZDoom 或类似仿真器），需要在有限的项目周期内完成大规模迭代。\n\n### 没有 sample-factory 时\n- **训练效率低下**：使用传统 RL 库时，数据采样吞吐量低，GPU 经常处于等待 CPU 处理数据的闲置状态，导致硬件利用率不足 30%。\n- **迭代周期漫长**：为了达到可用的智能水平，模型需要数周时间才能收敛，严重拖慢了算法验证和功能开发的节奏。\n- **资源成本高昂**：为了缩短训练时间，团队不得不申请昂贵的多节点集群资源，且因代码并行效率低，造成计算资源的极大浪费。\n- **调优难度极大**：同步与异步训练的实现复杂，手动优化数据管道容易引入 Bug，研究人员大量时间耗费在工程调试而非算法创新上。\n\n### 使用 sample-factory 后\n- **吞吐量显著提升**：凭借高度优化的架构，sample-factory 实现了极高的数据采样率，GPU 利用率飙升至 90% 以上，充分释放硬件潜能。\n- **研发速度飞跃**：得益于高吞吐特性，模型收敛时间从数周缩短至数天甚至数小时，团队每天可进行多次完整的实验迭代。\n- **硬件成本降低**：在单台或多台标准服务器上即可实现此前需要大型集群才能达到的训练规模，大幅降低了算力预算。\n- **专注核心算法**：sample-factory 提供了开箱即用的同步\u002F异步训练模式及完善的 API，研究人员无需关心底层并行细节，可专注于奖励函数设计和策略优化。\n\nsample-factory 通过极致的工程优化解决了强化学习中的“数据饥饿”问题，让团队以更低的成本和更快的速度实现 SOTA 级别的智能体训练。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Falex-petrenko_sample-factory_b7ba2e34.gif","alex-petrenko","Aleksei Petrenko","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002Falex-petrenko_8b027664.jpg","Research Scientist at Apple. CS PhD, USC 2023. 
Interests: Deep Reinforcement Learning, Simulation, Optimization, Robotics, LLMs.","Apple","Cupertino, USA","apetrenko1991@gmail.com","petrenko_ai","https:\u002F\u002Falex-petrenko.github.io\u002F","https:\u002F\u002Fgithub.com\u002Falex-petrenko",[87,91,95,99],{"name":88,"color":89,"percentage":90},"Python","#3572A5",98.8,{"name":92,"color":93,"percentage":94},"Jupyter Notebook","#DA5B0B",0.8,{"name":96,"color":97,"percentage":98},"Shell","#89e051",0.2,{"name":100,"color":101,"percentage":98},"Makefile","#427819",981,148,"2026-03-23T22:41:58","MIT","Linux, macOS","未说明（支持 CPU 和 GPU 加速环境，具体取决于所选环境如 IsaacGym 等）","未说明",{"notes":110,"python":108,"dependencies":111},"不支持 Windows 系统。可通过 pip install sample-factory 安装。针对特定环境（如 Mujoco, VizDoom）需安装额外依赖（例如 pip install sample-factory[mujoco]）。支持同步和异步训练模式，以及单进程调试模式。",[112,113,114],"torch","gym","numpy",[13],[117],"reinforcement-learning",null,"2026-03-27T02:49:30.150509","2026-04-06T05:37:27.957534",[122,127,132,137,141,145],{"id":123,"question_zh":124,"answer_zh":125,"source_url":126},11303,"如何查看 Sample Factory 的所有可用参数及其详细说明？","所有参数在源代码中都有详细的帮助字符串。你可以查看 `algorithm.py` 和 `appo.py` 文件以及 README 中的配置部分。\n\n获取特定算法和环境组合的所有参数列表的一个技巧是传递一个错误的参数，例如：\n`python -m algorithms.appo.train_appo --algo APPO --env doom_basic --experiment=test --x`\n这将触发错误并列出所有可用参数。你也可以在代码库中搜索参数名称以获取解释。","https:\u002F\u002Fgithub.com\u002Falex-petrenko\u002Fsample-factory\u002Fissues\u002F23",{"id":128,"question_zh":129,"answer_zh":130,"source_url":131},11304,"如何在 Windows 10 (Nvidia GPU) 上运行 Sample Factory？","Sample Factory 官方未在 Windows 上进行测试，主要依赖 Linux 特有的高性能 FIFO 队列。如果你必须在 Windows 上运行，可以创建一个名为 `faster_fifo` 的 Python 模块作为存根（stub），将调用路由到标准的 `multiprocessing.Queue`。\n\n由于接口相同，只需实现 `get_many()` 方法即可，示例代码如下：\n```python\nfrom queue import Empty  # multiprocessing.Queue.get_nowait 在队列为空时抛出 queue.Empty\n\ndef get_many(q):\n    # 反复非阻塞地取消息，直到队列为空，一次性返回全部积压消息\n    msgs = []\n    while True:\n        try:\n            msgs.append(q.get_nowait())\n        except Empty:\n            break\n    return msgs\n```\n注意：这种替代方案性能会下降（不再“更快”），但在某些配置下
影响可以忽略不计。此外，你可能会遇到其他兼容性问题。","https:\u002F\u002Fgithub.com\u002Falex-petrenko\u002Fsample-factory\u002Fissues\u002F29",{"id":133,"question_zh":134,"answer_zh":135,"source_url":136},11305,"为什么增加 `num_epoch` 会导致奖励（Reward）下降，而在其他库（如 SB3）中通常会上升？","这通常与超参数配置有关。在 Sample Factory 中，建议保持较小的 `num_epochs`（例如 2），并调整 `rollout` 和 `batch_size` 以匹配数据量。\n\n维护者建议的有效配置示例：\n`--num_workers=8 --num_epochs=2 --rollout=64 --batch_size=1024 --num_batches_per_epoch=8`\n或者：\n`--num_workers=8 --num_epochs=2 --rollout=32 --batch_size=1024 --num_batches_per_epoch=4`\n\n如果增加 epoch 数导致性能下降，可能是因为过拟合或策略滞后（policy lag）问题。尝试减少 epoch 数并增加 rollout 长度或 batch size 往往能获得更稳定的训练效果。","https:\u002F\u002Fgithub.com\u002Falex-petrenko\u002Fsample-factory\u002Fissues\u002F276",{"id":138,"question_zh":139,"answer_zh":140,"source_url":136},11306,"在对比实验结果时，应该使用哪个指标作为 X 轴（Step vs Time）？","不建议使用默认的 \"step\"，因为它可能是系统内部的任意度量（如梯度步数），不同库之间定义不一致，难以公平比较。\n\n维护者强烈建议使用 \"Relative time (Wall)\"（相对墙钟时间）作为 X 轴。你可以在 WandB 工作区右上角点击 `x->` 图标进行更改。\n\n如果你关注样本效率（sample efficiency），可以使用 \"global_step\"（环境交互步数）。但在大多数实际应用场景中，单位时间内的性能提升（Wall time）是衡量算法效率更直观的指标。",{"id":142,"question_zh":143,"answer_zh":144,"source_url":131},11307,"如何解决 \"Learner accumulated too much experience\" 或 \"Waiting for trajectory buffer\" 的性能瓶颈？","这些警告表明数据生成（Worker）和数据消费（Learner）之间不平衡。\n\n1. 如果出现 \"Waiting for trajectory buffer\"，说明 Worker 太快，Learner 处理不过来。可以尝试增加 `batch_size` 或减少 `num_envs_per_worker`。\n2. 如果出现 \"Learner accumulated too much experience\"，说明 Learner 是瓶颈。可以尝试增加 `--learner_main_loop_num_cores`（例如从 2 增加到 10）来分配更多 CPU 核心给学习循环。\n3. 
确保硬件资源匹配：在低配机器上（如双核），过高的 `num_envs_per_worker` 会导致 CPU 争用，反而降低 FPS。需要根据 CPU 核心数和内存调整并行度。",{"id":146,"question_zh":147,"answer_zh":148,"source_url":126},11308,"Sample Factory 是否支持非图像输入的 OpenAI Gym 环境（如 CartPole）？","是的，Sample Factory 支持非图像输入的 Gym 环境。你需要正确配置编码器类型。\n\n对于类似 CartPole 的低维状态输入，应使用 MLP 编码器。关键参数设置如下：\n`--encoder_type=mlp`\n`--encoder_subtype=mlp_mujoco` (或其他适合低维输入的 MLP 子类型)\n`--use_rnn=False` (如果不需要记忆功能)\n\n确保你的环境符合 Gym 接口规范，并在启动训练时指定正确的环境 ID。",[150],{"id":151,"version":152,"summary_zh":153,"released_at":154},61800,"1.0.0","ICML2020定稿提交时的版本","2020-06-24T00:47:05"]