[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-proroklab--VectorizedMultiAgentSimulator":3,"tool-proroklab--VectorizedMultiAgentSimulator":62},[4,18,26,36,46,54],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":17},4358,"openclaw","openclaw\u002Fopenclaw","OpenClaw 是一款专为个人打造的本地化 AI 助手，旨在让你在自己的设备上拥有完全可控的智能伙伴。它打破了传统 AI 助手局限于特定网页或应用的束缚，能够直接接入你日常使用的各类通讯渠道，包括微信、WhatsApp、Telegram、Discord、iMessage 等数十种平台。无论你在哪个聊天软件中发送消息，OpenClaw 都能即时响应，甚至支持在 macOS、iOS 和 Android 设备上进行语音交互，并提供实时的画布渲染功能供你操控。\n\n这款工具主要解决了用户对数据隐私、响应速度以及“始终在线”体验的需求。通过将 AI 部署在本地，用户无需依赖云端服务即可享受快速、私密的智能辅助，真正实现了“你的数据，你做主”。其独特的技术亮点在于强大的网关架构，将控制平面与核心助手分离，确保跨平台通信的流畅性与扩展性。\n\nOpenClaw 非常适合希望构建个性化工作流的技术爱好者、开发者，以及注重隐私保护且不愿被单一生态绑定的普通用户。只要具备基础的终端操作能力（支持 macOS、Linux 及 Windows WSL2），即可通过简单的命令行引导完成部署。如果你渴望拥有一个懂你",349277,3,"2026-04-06T06:32:30",[13,14,15,16],"Agent","开发框架","图像","数据工具","ready",{"id":19,"name":20,"github_repo":21,"description_zh":22,"stars":23,"difficulty_score":10,"last_commit_at":24,"category_tags":25,"status":17},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,"2026-04-05T11:01:52",[14,15,13],{"id":27,"name":28,"github_repo":29,"description_zh":30,"stars":31,"difficulty_score":32,"last_commit_at":33,"category_tags":34,"status":17},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",158594,2,"2026-04-16T23:34:05",[14,13,35],"语言模型",{"id":37,"name":38,"github_repo":39,"description_zh":40,"stars":41,"difficulty_score":42,"last_commit_at":43,"category_tags":44,"status":17},8272,"opencode","anomalyco\u002Fopencode","OpenCode 是一款开源的 AI 编程助手（Coding Agent），旨在像一位智能搭档一样融入您的开发流程。它不仅仅是一个代码补全插件，而是一个能够理解项目上下文、自主规划任务并执行复杂编码操作的智能体。无论是生成全新功能、重构现有代码，还是排查难以定位的 Bug，OpenCode 都能通过自然语言交互高效完成，显著减少开发者在重复性劳动和上下文切换上的时间消耗。\n\n这款工具专为软件开发者、工程师及技术研究人员设计，特别适合希望利用大模型能力来提升编码效率、加速原型开发或处理遗留代码维护的专业人群。其核心亮点在于完全开源的架构，这意味着用户可以审查代码逻辑、自定义行为策略，甚至私有化部署以保障数据安全，彻底打破了传统闭源 AI 助手的“黑盒”限制。\n\n在技术体验上，OpenCode 提供了灵活的终端界面（Terminal UI）和正在测试中的桌面应用程序，支持 macOS、Windows 及 Linux 全平台。它兼容多种包管理工具，安装便捷，并能无缝集成到现有的开发环境中。无论您是追求极致控制权的资深极客，还是渴望提升产出的独立开发者，OpenCode 都提供了一个透明、可信",144296,1,"2026-04-16T14:50:03",[13,45],"插件",{"id":47,"name":48,"github_repo":49,"description_zh":50,"stars":51,"difficulty_score":32,"last_commit_at":52,"category_tags":53,"status":17},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 
绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",108322,"2026-04-10T11:39:34",[14,15,13],{"id":55,"name":56,"github_repo":57,"description_zh":58,"stars":59,"difficulty_score":32,"last_commit_at":60,"category_tags":61,"status":17},6121,"gemini-cli","google-gemini\u002Fgemini-cli","gemini-cli 是一款由谷歌推出的开源 AI 命令行工具，它将强大的 Gemini 大模型能力直接集成到用户的终端环境中。对于习惯在命令行工作的开发者而言，它提供了一条从输入提示词到获取模型响应的最短路径，无需切换窗口即可享受智能辅助。\n\n这款工具主要解决了开发过程中频繁上下文切换的痛点，让用户能在熟悉的终端界面内直接完成代码理解、生成、调试以及自动化运维任务。无论是查询大型代码库、根据草图生成应用，还是执行复杂的 Git 操作，gemini-cli 都能通过自然语言指令高效处理。\n\n它特别适合广大软件工程师、DevOps 人员及技术研究人员使用。其核心亮点包括支持高达 100 万 token 的超长上下文窗口，具备出色的逻辑推理能力；内置 Google 搜索、文件操作及 Shell 命令执行等实用工具；更独特的是，它支持 MCP（模型上下文协议），允许用户灵活扩展自定义集成，连接如图像生成等外部能力。此外，个人谷歌账号即可享受免费的额度支持，且项目基于 Apache 2.0 协议完全开源，是提升终端工作效率的理想助手。",100752,"2026-04-10T01:20:03",[45,13,15,14],{"id":63,"github_repo":64,"name":65,"description_en":66,"description_zh":67,"ai_summary_zh":67,"readme_en":68,"readme_zh":69,"quickstart_zh":70,"use_case_zh":71,"hero_image_url":72,"owner_login":73,"owner_name":73,"owner_avatar_url":74,"owner_bio":75,"owner_company":76,"owner_location":76,"owner_email":76,"owner_twitter":76,"owner_website":77,"owner_url":78,"languages":79,"stars":88,"forks":89,"last_commit_at":90,"license":91,"difficulty_score":32,"env_os":92,"env_gpu":93,"env_ram":94,"env_deps":95,"category_tags":104,"github_topics":106,"view_count":32,"oss_zip_url":76,"oss_zip_packed_at":76,"status":17,"created_at":127,"updated_at":128,"faqs":129,"releases":160},8285,"proroklab\u002FVectorizedMultiAgentSimulator","VectorizedMultiAgentSimulator","VMAS is a vectorized differentiable simulator designed for efficient Multi-Agent Reinforcement Learning benchmarking. It is comprised of a vectorized 2D physics engine written in PyTorch and a set of challenging multi-robot scenarios. 
Additional scenarios can be implemented through a simple and modular interface.

VectorizedMultiAgentSimulator (VMAS for short) is an efficient simulation platform built for multi-agent reinforcement learning (MARL). It targets the weaknesses of traditional simulators in large-scale agent training, namely slow speed, poor scalability, and awkward integration with deep-learning frameworks, so that researchers can validate algorithms quickly and reproduce complex scenarios.

It suits researchers in multi-robot cooperative control and AI algorithms, as well as developers who want to stand up multi-agent training environments fast. VMAS's core strength is its vectorized 2D physics engine built entirely on PyTorch: it supports differentiable simulation and GPU acceleration, running tens of thousands of environments in parallel and greatly improving training throughput. It also ships a rich set of preset challenge scenarios and supports custom sensors (such as LIDAR), inter-agent communication, and complex physical interactions (such as elastic collisions and joints). Its interface is compatible with mainstream frameworks such as OpenAI Gym, Gymnasium, and TorchRL, and it pairs with the BenchMARL library, so users can work out of the box and focus on algorithmic innovation rather than environment plumbing.

---

# VectorizedMultiAgentSimulator (VMAS)
<a href="https://pypi.org/project/vmas"><img src="https://img.shields.io/pypi/v/vmas" alt="pypi version"></a>
[![Downloads](https://oss.gittoolsai.com/images/proroklab_VectorizedMultiAgentSimulator_readme_33bb5a560cbf.png)](https://pepy.tech/project/vmas)
![tests](https://github.com/proroklab/VectorizedMultiAgentSimulator/actions/workflows/tests-linux.yml/badge.svg)
[![codecov](https://codecov.io/github/proroklab/VectorizedMultiAgentSimulator/coverage.svg?branch=main)](https://codecov.io/gh/proroklab/VectorizedMultiAgentSimulator)
[![Documentation Status](https://oss.gittoolsai.com/images/proroklab_VectorizedMultiAgentSimulator_readme_13d664e1afd7.png)](https://vmas.readthedocs.io/en/latest/?badge=latest)
[![Python](https://img.shields.io/badge/python-3.8%20%7C%203.9%20%7C%203.10%20%7C%203.11-blue.svg)](https://www.python.org/downloads/)
[![GitHub license](https://img.shields.io/badge/license-GPLv3.0-blue.svg)](https://github.com/proroklab/VectorizedMultiAgentSimulator/blob/main/LICENSE)
[![arXiv](https://img.shields.io/badge/arXiv-2207.03530-b31b1b.svg)](https://arxiv.org/abs/2207.03530)
[![Discord Shield](https://dcbadge.limes.pink/api/server/https://discord.gg/dg8txxDW5t)](https://discord.gg/dg8txxDW5t)
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/proroklab/VectorizedMultiAgentSimulator/blob/main/notebooks/Simulation_and_training_in_VMAS_and_BenchMARL.ipynb)

<p align="center">
<img src="https://oss.gittoolsai.com/images/proroklab_VectorizedMultiAgentSimulator_readme_a5f0d93f05b1.gif" alt="drawing"/>
</p>

> [!NOTE]
> We have released [BenchMARL](https://github.com/facebookresearch/BenchMARL), a benchmarking library where you
> can train VMAS tasks using TorchRL!
> Check out [how easy it is to use it.](https://colab.research.google.com/github/facebookresearch/BenchMARL/blob/main/notebooks/run.ipynb)

## Welcome to VMAS!

This repository contains the code for the Vectorized Multi-Agent Simulator (VMAS).

VMAS is a vectorized differentiable simulator designed for efficient MARL benchmarking.
It comprises a fully differentiable, vectorized 2D physics engine written in PyTorch and a set of challenging multi-robot scenarios.
Scenario creation is made simple and modular to incentivize contributions.
VMAS simulates agents
and landmarks of different shapes and supports rotations, elastic collisions, joints, and custom gravity.
Holonomic motion models are used for the agents to simplify simulation.
Custom sensors such as LIDARs are available and the simulator supports inter-agent communication.
Vectorization in [PyTorch](https://pytorch.org/) allows VMAS to perform simulations in a batch, seamlessly scaling to tens of thousands of parallel environments on accelerated hardware.
VMAS has an interface compatible with [OpenAI Gym](https://github.com/openai/gym), with [Gymnasium](https://gymnasium.farama.org/), with [RLlib](https://docs.ray.io/en/latest/rllib/index.html), with [torchrl](https://github.com/pytorch/rl) and its MARL training library [BenchMARL](https://github.com/facebookresearch/BenchMARL),
enabling out-of-the-box integration with a wide range of RL algorithms.
The implementation is inspired by [OpenAI's MPE](https://github.com/openai/multiagent-particle-envs).
Alongside VMAS's scenarios, we port and vectorize all the scenarios in MPE.

### [Paper](https://arxiv.org/abs/2207.03530)
The arXiv paper can be found [here](https://arxiv.org/abs/2207.03530).

If you use VMAS in your research, **cite** it using:
```
@article{bettini2022vmas,
  title = {VMAS: A Vectorized Multi-Agent Simulator for Collective Robot Learning},
  author = {Bettini, Matteo and Kortvelesy, Ryan and Blumenkamp, Jan and Prorok, Amanda},
  year = {2022},
  journal = {The 16th International Symposium on Distributed Autonomous Robotic Systems},
  publisher = {Springer}
}
```

### Video
Watch the presentation video of VMAS, showing its structure, scenarios, and experiments.
<p align="center">

[![VMAS Video](https://oss.gittoolsai.com/images/proroklab_VectorizedMultiAgentSimulator_readme_fd2dcbcaccd3.jpg)](https://www.youtube.com/watch?v=aaDRYfiesAY)
</p>

Watch the talk at DARS 2022 about VMAS.
<p align="center">

[![VMAS Video](https://oss.gittoolsai.com/images/proroklab_VectorizedMultiAgentSimulator_readme_b240046ffeb6.jpg)](https://www.youtube.com/watch?v=boViBY7Woqg)
</p>

Watch the lecture on creating a custom scenario in VMAS and training it in BenchMARL.
<p align="center">

[![VMAS Video](https://oss.gittoolsai.com/images/proroklab_VectorizedMultiAgentSimulator_readme_73b9cf0ec012.jpg)](https://www.youtube.com/watch?v=mIb1uGeRJsg)
</p>

## Table of contents
- [VectorizedMultiAgentSimulator (VMAS)](#vectorizedmultiagentsimulator-vmas)
  * [Welcome to VMAS!](#welcome-to-vmas)
    + [Paper](#paper)
    + [Video](#video)
  * [Table of contents](#table-of-contents)
  * [How to use](#how-to-use)
    + [Notebooks](#notebooks)
    + [Install](#install)
    + [Run](#run)
      - [RLlib](#rllib)
      - [TorchRL](#torchrl)
    + [Input and output spaces](#input-and-output-spaces)
      - [Output spaces](#output-spaces)
      - [Input action space](#input-action-space)
  * [Simulator features](#simulator-features)
  * [Creating a new scenario](#creating-a-new-scenario)
  * [Play a scenario](#play-a-scenario)
  * [Rendering](#rendering)
    + [Plot function under rendering](#plot-function-under-rendering)
    + [Rendering on server machines](#rendering-on-server-machines)
  * [List of environments](#list-of-environments)
    + [VMAS](#vmas)
      - [Main scenarios](#main-scenarios)
      - [Debug scenarios](#debug-scenarios)
    + [MPE](#mpe)
  * [Our papers using VMAS](#our-papers-using-vmas)
  * [TODOS](#todos)


## How to use
### Notebooks
- [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/proroklab/VectorizedMultiAgentSimulator/blob/main/notebooks/VMAS_Use_vmas_environment.ipynb) &ensp; **Using a VMAS environment**. Here is a simple notebook that you can run to create, step, and render any scenario in VMAS. It reproduces the `use_vmas_env.py` script in the `examples` folder.
- [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/proroklab/VectorizedMultiAgentSimulator/blob/main/notebooks/Simulation_and_training_in_VMAS_and_BenchMARL.ipynb) &ensp; **Creating a VMAS scenario and training it in [BenchMARL](https://github.com/facebookresearch/BenchMARL)**. We will create a scenario where multiple robots with different embodiments need to navigate to their goals while avoiding each other (as well as obstacles) and train it using MAPPO and MLP/GNN policies.
- [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/facebookresearch/BenchMARL/blob/main/notebooks/run.ipynb) &ensp; **Training VMAS in BenchMARL (suggested)**. In this notebook, we show how to use VMAS in [BenchMARL](https://github.com/facebookresearch/BenchMARL), TorchRL's MARL training library.
- [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/pytorch/rl/blob/gh-pages/_downloads/a977047786179278d12b52546e1c0da8/multiagent_ppo.ipynb) &ensp; **Training VMAS in TorchRL**. In this notebook, [available in the TorchRL docs](https://pytorch.org/rl/stable/tutorials/multiagent_ppo.html), we show how to use any VMAS scenario in TorchRL. It will guide you through the full pipeline needed to train agents using MAPPO/IPPO.
- [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/pytorch/rl/blob/gh-pages/_downloads/d30bb6552cc07dec0f1da33382d3fa02/multiagent_competitive_ddpg.py) &ensp; **Training competitive VMAS MPE in TorchRL**. In this notebook, [available in the TorchRL docs](https://pytorch.org/rl/stable/tutorials/multiagent_competitive_ddpg.html), we show how to solve a competitive Multi-Agent Reinforcement Learning (MARL) problem using MADDPG/IDDPG.
- [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/proroklab/VectorizedMultiAgentSimulator/blob/main/notebooks/VMAS_RLlib.ipynb) &ensp; **Training VMAS in RLlib**. In this notebook, we show how to use any VMAS scenario in RLlib. It reproduces the `rllib.py` script in the `examples` folder.
### Install

To install the simulator, you can use pip to get the latest release:
```bash
pip install vmas
```
If you want to install the current master version (more up to date than the latest release), you can do:
```bash
git clone https://github.com/proroklab/VectorizedMultiAgentSimulator.git
cd VectorizedMultiAgentSimulator
pip install -e .
```
By default, vmas has only the core requirements. To install further dependencies that enable training with the [Gymnasium](https://gymnasium.farama.org/) wrappers and the [RLlib](https://docs.ray.io/en/latest/rllib/index.html) wrapper, as well as rendering and testing, you can install these further options:
```bash
# install gymnasium for gymnasium wrappers
pip install vmas[gymnasium]

# install rllib for rllib wrapper
pip install vmas[rllib]

# install rendering dependencies
pip install vmas[render]

# install testing dependencies
pip install vmas[test]

# install all dependencies
pip install vmas[all]
```

You can also install the following training libraries:

```bash
pip install benchmarl  # For training in BenchMARL
pip install torchrl  # For training in TorchRL
pip install "ray[rllib]"==2.1.0  # For training in RLlib. We support versions "ray[rllib]<=2.2,>=1.13"
```

### Run

To use the simulator, simply create an environment by passing the name of the scenario
you want (from the `scenarios` folder) to the `make_env` function.
The function arguments are explained in the documentation. The function returns an environment object with the VMAS interface.

Here is an example:
```python
env = vmas.make_env(
    scenario="waterfall",  # Can be a scenario name or a BaseScenario class
    num_envs=32,
    device="cpu",  # Or "cuda" for GPU
    continuous_actions=True,
    wrapper=None,  # One of: None, "rllib", "gym", "gymnasium", "gymnasium_vec"
    max_steps=None,  # Defines the horizon. None is infinite horizon.
    seed=None,  # Seed of the environment
    dict_spaces=False,  # By default, tuple spaces are used, with each element in the tuple being an agent.
    # If dict_spaces=True, the spaces will become Dict, with each key being the agent's name
    grad_enabled=False,  # If grad_enabled, the simulator is differentiable and gradients can flow from output to input
    terminated_truncated=False,  # If terminated_truncated, the simulator will return separate `terminated` and `truncated` flags in the `done()`, `step()`, and `get_from_scenario()` functions instead of a single `done` flag
    **kwargs,  # Additional arguments you want to pass to the scenario initialization
)
```
A further example that you can run is contained in `use_vmas_env.py` in the `examples` directory.
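A minimal rollout loop in the spirit of that example might look as follows; the random 2D force actions and the use of `env.agents` to enumerate agents are illustrative assumptions (see the action space description below):
```python
import torch
import vmas

num_envs = 32
env = vmas.make_env(
    scenario="waterfall",
    num_envs=num_envs,
    device="cpu",
    continuous_actions=True,
)
obs = env.reset()

for _ in range(100):
    # One action tensor per agent, batched over all 32 environments.
    # Shape [num_envs, 2]: a random 2D force in [-1, 1] stands in for a policy.
    actions = [torch.rand(num_envs, 2) * 2 - 1 for _ in env.agents]
    obs, rews, dones, info = env.step(actions)

print(dones.shape)  # torch.Size([32]): one done flag per vectorized environment
```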
With the `terminated_truncated` flag set to `True`, the simulator will return separate `terminated` and `truncated` flags
in the `done()`, `step()`, and `get_from_scenario()` functions instead of a single `done` flag.
This is useful when you want to know whether the environment is done because the episode has ended or
because the maximum episode length (the timestep horizon) has been reached.
See [the Gymnasium documentation](https://gymnasium.farama.org/tutorials/gymnasium_basics/handling_time_limits/) for more details on this.

#### RLlib

To see how to use VMAS in RLlib, check out the script in `examples/rllib.py`.

You can find more examples of multi-agent training in VMAS in the [HetGPPO repository](https://github.com/proroklab/HetGPPO).

#### TorchRL

VMAS is supported by [TorchRL](https://github.com/pytorch/rl) and its MARL training library [BenchMARL](https://github.com/facebookresearch/BenchMARL).

Check out how simple it is to use VMAS in [BenchMARL](https://github.com/facebookresearch/BenchMARL) with this [notebook](https://colab.research.google.com/github/facebookresearch/BenchMARL/blob/main/notebooks/run.ipynb).

We provide a [notebook](https://pytorch.org/rl/tutorials/multiagent_ppo.html) which guides you through a full
multi-agent reinforcement learning pipeline for training VMAS scenarios in TorchRL using MAPPO/IPPO.

You can find **example scripts** in the TorchRL repo [here](https://github.com/pytorch/rl/tree/main/sota-implementations/multiagent)
on how to run MAPPO, IPPO, MADDPG, QMIX, and VDN using the [VMAS wrapper](https://github.com/pytorch/rl/blob/main/torchrl/envs/libs/vmas.py).


### Input and output spaces

VMAS uses gym spaces for input and output spaces.
By default, action and observation spaces are tuples:
```python
spaces.Tuple(
    [agent_space for agent in agents]
)
```
When creating the environment, by setting `dict_spaces=True`, tuples can be changed to dictionaries:
```python
spaces.Dict(
    {agent.name: agent_space for agent in agents}
)
```

#### Output spaces

If `dict_spaces=False`, the observations, infos, and rewards returned by the environment will each be a list, with each element being the value for that agent.

If `dict_spaces=True`, the observations, infos, and rewards returned by the environment will each be a dictionary, with each key being an agent's name and each value being the value for that agent.

Each agent **observation** in either of these structures is (depending on how you implement the scenario) either:
- a tensor with shape `[num_envs, observation_size]`, where `observation_size` is the size of the agent's observation:
```python
def observation(self, agent: Agent):
    return torch.cat([agent.state.pos, agent.state.vel], dim=-1)
```
- or a dictionary of such tensors:
```python
def observation(self, agent: Agent):
    return {
        "pos": agent.state.pos,
        "nested": {"vel": agent.state.vel},
    }
```

Each agent **reward** in either of these structures is a tensor with shape `[num_envs]`.

Each agent **info** in either of these structures is a dictionary where each entry has a key representing the name of that info and a value that is a tensor with shape `[num_envs, info_size]`, where `info_size` is the size of that info for that agent.

Done is a tensor of shape `[num_envs]`.
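As a quick illustration of these structures, here is a hedged sketch that inspects the reset observations under both space types (the `navigation` scenario and the printed sizes are examples, not guarantees):
```python
import vmas

# Default tuple spaces: outputs are lists with one entry per agent.
env = vmas.make_env(scenario="navigation", num_envs=16, device="cpu")
obs = env.reset()
print(len(obs))      # number of agents
print(obs[0].shape)  # torch.Size([16, observation_size]) for tensor observations

# With dict_spaces=True, outputs are keyed by agent name instead.
env = vmas.make_env(scenario="navigation", num_envs=16, device="cpu", dict_spaces=True)
obs = env.reset()
print(sorted(obs.keys()))  # one key per agent name
```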
#### Input action space

Each agent in vmas has to provide an action tensor with shape `[num_envs, action_size]`, where `num_envs` is the number of vectorized environments and `action_size` is the size of the agent's action.

The agents' actions can be provided to `env.step()` in two ways:
- A **List** of length equal to the number of agents, which looks like `[tensor_action_agent_0, ..., tensor_action_agent_n]`
- A **Dict** of length equal to the number of agents, with each entry looking like `{agent_0.name: tensor_action_agent_0, ..., agent_n.name: tensor_action_agent_n}`

Users can use either of the two formats interchangeably and even change formats during execution; vmas will always perform all sanity checks.
Each format works regardless of whether tuple or dictionary spaces have been chosen.
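A short sketch of both call styles; the zero actions, the 2-dimensional action size, and the `balance` scenario are illustrative assumptions:
```python
import torch
import vmas

num_envs = 8
env = vmas.make_env(scenario="balance", num_envs=num_envs, device="cpu", continuous_actions=True)
env.reset()

# List format: one tensor per agent, in agent order (zero forces for brevity).
list_actions = [torch.zeros(num_envs, 2) for _ in env.agents]
obs, rews, dones, info = env.step(list_actions)

# Dict format: keyed by agent name; equivalent to the list above.
dict_actions = {agent.name: torch.zeros(num_envs, 2) for agent in env.agents}
obs, rews, dones, info = env.step(dict_actions)
```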
## Simulator features

- **Vectorized**: VMAS vectorization can step any number of environments in parallel. This significantly reduces the time needed to collect rollouts for training in MARL.
- **Simple**: Complex vectorized physics engines exist (e.g., [Brax](https://github.com/google/brax)), but they do not scale efficiently when dealing with multiple agents. This defeats the computational speed goal set by vectorization. VMAS uses a simple custom 2D dynamics engine written in PyTorch to provide fast simulation.
- **General**: The core of VMAS is structured so that it can be used to implement general high-level multi-robot problems in 2D. It can support adversarial as well as cooperative scenarios. Holonomic point-robot simulation has been chosen to focus on general high-level problems, without learning low-level custom robot controls through MARL.
- **Extensible**: VMAS is not just a simulator with a set of environments. It is a framework that can be used to create new multi-agent scenarios in a format that is usable by the whole MARL community. For this purpose, we have modularized the process of creating a task and introduced interactive rendering to debug it. You can define your own scenario in minutes. Have a look at the dedicated section in this document.
- **Compatible**: VMAS has wrappers for [RLlib](https://docs.ray.io/en/latest/rllib/index.html), [torchrl](https://pytorch.org/rl/reference/generated/torchrl.envs.libs.vmas.VmasEnv.html), [OpenAI Gym](https://github.com/openai/gym) and [Gymnasium](https://gymnasium.farama.org/). RLlib and torchrl have a large number of already implemented RL algorithms. Keep in mind that this interface is less efficient than the unwrapped version. For an example of wrapping, see the main of `make_env`.
- **Tested**: Our scenarios come with tests which run a custom-designed heuristic on each scenario.
- **Entity shapes**: Our entities (agents and landmarks) can have different customizable shapes (spheres, boxes, lines). All these shapes are supported for elastic collisions.
- **Faster than physics engines**: Our simulator is extremely lightweight, using only tensor operations. It is perfect for running MARL training at scale with multi-agent collisions and interactions.
- **Customizable**: When creating a new scenario of your own, the world, agents, and landmarks are highly customizable. Examples are: drag, friction, gravity, simulation timestep, non-differentiable communication, agent sensors (e.g., LIDAR), and masses.
- **Non-differentiable communication**: Scenarios can require agents to perform discrete or continuous communication actions.
- **Gravity**: VMAS supports customizable gravity.
- **Sensors**: Our simulator implements ray casting, which can be used to simulate a wide range of distance-based sensors that can be added to agents. We currently support LIDARs. To see available sensors, have a look at the `sensors` script.
- **Joints**: Our simulator supports joints. Joints are constraints that keep entities at a specified distance. The user can specify the anchor points on the two objects, the distance (including 0), the thickness of the joint, whether the joint is allowed to rotate at either anchor point, and whether the joint is collidable. See the `waterfall` and `joint_passage` scenarios for examples of how to use joints.
- **Agent actions**: Agents' physical actions are 2D forces for holonomic motion. Agent rotation can also be controlled through a torque action (activated by setting `agent.action.u_rot_range` at agent creation time). Agents can also be equipped with continuous or discrete communication actions.
- **Action preprocessing**: By implementing the `process_action` function of a scenario, you can modify the agents' actions before they are passed to the simulator. This is used in `controllers` (where we provide different types of controllers to use) and `dynamics` (where we provide custom robot dynamic models).
- **Controllers**: Controllers are components that can be appended to the neural network policy or replace it completely. We provide a `VelocityController` which can be used to treat input actions as velocities (instead of the default vmas input forces). This PID controller takes velocities and outputs the forces which are fed to the simulator. See the `vel_control` debug scenario for an example.
- **Dynamic models**: VMAS simulates holonomic dynamics models by default. Custom dynamics can be chosen at agent creation time. Implementations now include `DiffDriveDynamics` for differential drive robots, `KinematicBicycleDynamics` for the kinematic bicycle model, and `Drone` for quadcopter dynamics. See the `diff_drive`, `kinematic_bicycle` and `drone` debug scenarios for examples.
- **Differentiable**: By setting `grad_enabled=True` when creating an environment, the simulator will be differentiable, allowing gradients to flow through any of its functions.
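To make the last point concrete, here is a hedged sketch of backpropagating through a short rollout; it assumes a scenario (such as `navigation`) whose reward depends on the agents' states, so that a gradient path from actions to rewards exists, and reuses the same leaf action tensors at every step:
```python
import torch
import vmas

num_envs = 4
env = vmas.make_env(scenario="navigation", num_envs=num_envs, device="cpu", grad_enabled=True)
env.reset()

# Leaf action tensors with requires_grad: gradients will accumulate here.
actions = [torch.zeros(num_envs, 2, requires_grad=True) for _ in env.agents]

total_reward = 0
for _ in range(10):
    obs, rews, dones, info = env.step(actions)
    total_reward = total_reward + torch.stack(rews).sum()

total_reward.backward()       # gradients flow back through the physics steps
print(actions[0].grad.shape)  # torch.Size([4, 2])
```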
## Creating a new scenario

To create a new scenario, just extend the `BaseScenario` class in `scenario.py`.

You will need to implement at least `make_world`, `reset_world_at`, `observation`, and `reward`. Optionally, you can also implement `done`, `info`, `process_action`, and `extra_render`.

You can also change the viewer size, zoom, and enable a background rendered grid by changing these inherited attributes in the `make_world` function.

To learn how, just read the documentation of `BaseScenario` in `scenario.py` and look at the implemented scenarios.
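As a rough sketch of the shape a scenario takes: the class and the four required methods follow this README, while the world contents, spawn positions, and reward are purely illustrative:
```python
import torch

from vmas.simulator.core import Agent, World
from vmas.simulator.scenario import BaseScenario


class MyScenario(BaseScenario):
    def make_world(self, batch_dim: int, device: torch.device, **kwargs) -> World:
        # One world object holds all `batch_dim` vectorized environments.
        world = World(batch_dim, device)
        for i in range(2):
            world.add_agent(Agent(name=f"agent_{i}"))
        return world

    def reset_world_at(self, env_index=None):
        # env_index=None resets every environment; an int resets just one.
        for agent in self.world.agents:
            agent.set_pos(
                torch.zeros(
                    (1, self.world.dim_p)
                    if env_index is not None
                    else (self.world.batch_dim, self.world.dim_p),
                    device=self.world.device,
                ),
                batch_index=env_index,
            )

    def observation(self, agent: Agent):
        # Shape [num_envs, observation_size], as described above.
        return torch.cat([agent.state.pos, agent.state.vel], dim=-1)

    def reward(self, agent: Agent):
        # Shape [num_envs]; here: stay close to the origin (illustrative only).
        return -torch.linalg.vector_norm(agent.state.pos, dim=-1)
```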
## Play a scenario

You can play with a scenario interactively! **Just execute its script!**

Alternatively, use the `render_interactively` function in the `interactive_rendering.py` script. Relevant values will be plotted to the screen.
Move the agent with the arrow keys and switch agents with TAB. You can reset the environment by pressing R.
If you have more than one agent, you can control a second one with W, A, S, D and switch between the two controlled agents using LSHIFT. To do this, just set `control_two_agents=True`.
If the agents also have rotational actions, you can control them with M, N for the first agent and with Q, E for the second (for example in the `diff_drive` scenario).

On the screen you will see some data from the agent controlled with the arrow keys. This data includes: name, current obs, current reward, total reward so far, and the environment done flag.

Here is an overview of what it looks like:

<p align="center">
<img src="https://oss.gittoolsai.com/images/proroklab_VectorizedMultiAgentSimulator_readme_bc22df4b5996.png" alt="drawing" width="500"/>
</p>

## Rendering

To render the environment, just call the `render` or the `try_render_at` functions (depending on the environment wrapping).

Example:
```python
env.render(
    mode="rgb_array",  # "rgb_array" returns an image, "human" renders in a display
    agent_index_focus=4,  # If None, keep all agents in the camera; else focus the camera on a specific agent
    index=0,  # Index of the batched environment to render
    visualize_when_rgb=False,  # Also run human visualization when mode=="rgb_array"
)
```

You can also change the viewer size, zoom, and enable a background rendered grid by changing these inherited attributes in the scenario `make_world` function.

| Gif | Agent focus |
|:---:|:---:|
| <img src="https://oss.gittoolsai.com/images/proroklab_VectorizedMultiAgentSimulator_readme_4c5662bb48f8.gif" alt="drawing" width="260"/> | With `agent_index_focus=None` the camera keeps focus on all agents |
| <img src="https://oss.gittoolsai.com/images/proroklab_VectorizedMultiAgentSimulator_readme_a6f17aef6b30.gif" alt="drawing" width="260"/> | With `agent_index_focus=0` the camera follows agent 0 |
| <img src="https://oss.gittoolsai.com/images/proroklab_VectorizedMultiAgentSimulator_readme_aac6f159aa0b.gif" alt="drawing" width="260"/> | With `agent_index_focus=4` the camera follows agent 4 |

### Plot function under rendering

It is possible to plot a function under the rendering of the agents by providing a function `f` to the `render` function:
```python
env.render(
    plot_position_function=f,
)
```
The function takes a numpy array with shape `(n_points, 2)`, which represents a set of x, y values to evaluate `f` over for plotting.
`f` outputs either an array with shape `(n_points, 1)`, which will be plotted as a colormap,
or an array with shape `(n_points, 4)`, which will be plotted as RGBA values.

See the `sampling.py` scenario for more info.

<p align="center">
<img src="https://oss.gittoolsai.com/images/proroklab_VectorizedMultiAgentSimulator_readme_0f7801f2b8f8.png" alt="drawing" width="400"/>
</p>
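For example, a function plotting a radial colormap under the agents might look like this (given an `env` created as above; the Gaussian-like surface is arbitrary):
```python
import numpy as np

def f(pos: np.ndarray) -> np.ndarray:
    # pos has shape (n_points, 2): the x, y query points.
    # Returning shape (n_points, 1) yields a colormap rendering.
    return np.exp(-(pos ** 2).sum(-1, keepdims=True))

env.render(plot_position_function=f)
```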
src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_797105841d5c.gif\"\u002F> | **\u003Cp align=\"center\">flocking\u003C\u002Fp>** \u003Cbr\u002F> \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_d61ed0755b01.gif\"\u002F>             | **\u003Cp align=\"center\">discovery\u003C\u002Fp>** \u003Cbr\u002F> \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_3322093dfdd7.gif\"\u002F>                       | \n| **\u003Cp align=\"center\">joint_passage\u003C\u002Fp>** \u003Cbr\u002F> \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_5341ca4763cc.gif\"\u002F>           | **\u003Cp align=\"center\">ball_passage\u003C\u002Fp>** \u003Cbr\u002F> \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_8da73fd9c5ef.gif\"\u002F>     | **\u003Cp align=\"center\">ball_trajectory\u003C\u002Fp>** \u003Cbr\u002F> \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_47433f1600e9.gif\"\u002F>           |\n| **\u003Cp align=\"center\">buzz_wire\u003C\u002Fp>** \u003Cbr\u002F> \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_8e533e17f7e1.gif\"\u002F>                   | **\u003Cp align=\"center\">multi_give_way\u003C\u002Fp>** \u003Cbr\u002F> \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_9d2e413f783c.gif\"\u002F> | **\u003Cp align=\"center\">navigation\u003C\u002Fp>** \u003Cbr\u002F> \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_2a5a00a73f4c.gif\"\u002F>                     |\n| **\u003Cp align=\"center\">sampling\u003C\u002Fp>** \u003Cbr\u002F> \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_ab9d1bde5e69.gif\"\u002F>                     | **\u003Cp align=\"center\">wind_flocking\u003C\u002Fp>** \u003Cbr\u002F> \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_4ff9195efe30.gif\"\u002F>   | **\u003Cp align=\"center\">road_traffic\u003C\u002Fp>** \u003Cbr\u002F> \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_82971b43cccc.gif\"\u002F>         |\n\n#### Main scenarios\n\n| Env name                | Description                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         
| `dropout.py` | In this scenario, `n_agents` agents and a goal are spawned at random positions between -1 and 1. Agents cannot collide with each other or with the goal. The reward is shared among all agents: the team receives a reward of 1 when at least one agent reaches the goal. A penalty proportional to the sum of the magnitudes of every agent's actions is given to the team, penalising agents for moving. The impact of the energy penalty can be tuned by setting `energy_coeff`; the default coefficient of 0.02 makes it always worth it for one agent to reach the goal. The optimal policy is to send only the agent closest to the goal, saving as much energy as possible. Every agent observes its position, velocity, relative position to the goal, and a flag that is set when someone reaches the goal. The environment terminates when someone reaches the goal. Solving this environment requires communication. | <img src="https://oss.gittoolsai.com/images/proroklab_VectorizedMultiAgentSimulator_readme_bcf2311275d4.gif" alt="drawing" width="300"/> |
| `dispersion.py` | In this scenario, `n_agents` agents and as many goals are spawned. All agents spawn in [0,0] and the goals spawn at random positions between -1 and 1. Agents cannot collide with each other or with the goals. Agents are tasked with reaching the goals. When a goal is reached, the team gets a reward of 1 if `share_reward` is true; otherwise, the agents which reach that goal in the same step split the reward of 1. If `penalise_by_time` is true, every agent gets an additional reward of -0.01 at each step. The optimal policy is for the agents to disperse and each tackle a different goal. This requires high coordination and diversity. Every agent observes its position and velocity. For every goal it also observes the relative position and a flag indicating whether the goal has already been reached by someone. The environment terminates when all the goals are reached. | <img src="https://oss.gittoolsai.com/images/proroklab_VectorizedMultiAgentSimulator_readme_f63a89a7931f.gif" alt="drawing" width="300"/> |
| `transport.py` | In this scenario, `n_agents` agents, `n_packages` packages (default 1), and a goal are spawned at random positions between -1 and 1. Packages are boxes with mass `package_mass` (default 50 times an agent's mass) and sizes `package_width` and `package_length`. The goal is for agents to push all packages to the goal. When all packages overlap with the goal, the scenario ends. Each agent receives the same reward, which is proportional to the sum of the distance variations between the packages and the goal. In other words, pushing a package towards the goal gives a positive reward, while pushing it away gives a negative one. Once a package overlaps with the goal, it becomes green and its contribution to the reward becomes 0. Each agent observes its position, velocity, relative positions to the packages, package velocities, relative positions between packages and the goal, and a flag for each package indicating whether it is on the goal. By default, packages are very heavy and one agent is barely able to push them. Agents need to collaborate and push packages together to move them faster. | <img src="https://oss.gittoolsai.com/images/proroklab_VectorizedMultiAgentSimulator_readme_1a55da5023bf.gif" alt="drawing" width="300"/> |
| `reverse_transport.py` | This is exactly the same as transport, except the `n_agents` agents are spawned inside a single package. Everything else is the same. | <img src="https://oss.gittoolsai.com/images/proroklab_VectorizedMultiAgentSimulator_readme_b191d1d9b094.gif" alt="drawing" width="300"/> |
                                                                                                                                                                                               | \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_b191d1d9b094.gif\" alt=\"drawing\" width=\"300\"\u002F>    |\n| `give_way.py`           | In this scenario, two agents and two goals are spawned in a narrow corridor. The agents need to reach the goal with their color. The agents are standing in front of each other's goal and thus need to swap places. In the middle of the corridor there is an asymmetric opening which fits one agent only. Therefore the optimal policy is for one agent to give way to the other. This requires heterogeneous behaviour. Each agent observes its position, velocity and the relative position to its goal. The scenario terminates when both agents reach their goals.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        | \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_e2555f0224d8.gif\" alt=\"drawing\" width=\"300\"\u002F>             |\n| `wheel.py`              | In this scenario, `n_agents` are spawned at random positions between -1 and 1. One line with `line_length` and `line_mass` is spawned in the middle. The line is constrained in the origin and can rotate. The goal of the agents is to make the absolute angular velocity of the line match `desired_velocity`. Therefore, it is not sufficient for the agents to all push in the extrema of the line, but they need to organize to achieve, and not exceed, the desired velocity. Each agent observes its position, velocity, the current angle of the line module pi, the absolute difference between the current angular velocity of the line and the desired one, and the relative position to the two line extrema. The reward is shared and it is the absolute difference between the current angular velocity of the line and the desired one.                                                                                                                                                                                                                                                                                                                                                                           | \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_83c4bfedd4d0.gif\" alt=\"drawing\" width=\"300\"\u002F>                |  \n| `balance.py`            | In this scenario, `n_agents` are spawned uniformly spaced out under a line upon which lies a spherical package of mass `package_mass`. The team and the line are spawned at a random X position at the bottom of the environment. The environment has vertical gravity. 
If `random_package_pos_on_line` is True (default), the relative X position of the package on the line is random. In the top half of the environment a goal is spawned. The agents have to carry the package to the goal.  Each agent receives the same reward which is proportional to the distance variation between the package and the goal. In other words, getting the package closer to the goal will give a positive reward, while moving it away, a negative one. The team receives a negative reward of -10 for making the package or the line fall to the floor. The observations for each agent are: its position, velocity, relative position to the package, relative position to the line, relative position between package and goal, package velocity, line velocity, line angular velocity, and line rotation mod pi. The environment is done either when the package or the line fall or when the package touches the goal.            | \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_8136de98b37b.gif\" alt=\"drawing\" width=\"300\"\u002F>              |  \n| `football.py`           | In this scenario, a team of `n_blue_agents` play football against a team of `n_red_agents`. The boolean parameters `ai_blue_agents` and `ai_red_agents` specify whether each team is controlled by action inputs or a programmed AI. Consequently, football can be treated as either a cooperative or competitive task. The reward in this scenario can be tuned with `dense_reward_ratio`, where a value of 0 denotes a fully sparse reward (1 for a goal scored, -1 for a goal conceded), and 1 denotes a fully dense reward (based on the the difference of the \"attacking value\" of each team, which considers the distance from the ball to the goal and the presence of open dribbling\u002Fshooting lanes to the goal). Every agent observes its position, velocity, relative position to the ball, and relative velocity to the ball. The episode terminates when one team scores a goal.                                                                                                                                                                                                                                                                                                                                     | \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_c99ea49a03a4.gif\" alt=\"drawing\" width=\"300\"\u002F>             | \n| `discovery.py`          | In this scenario, a team of `n_agents` has to coordinate to cover `n_targets` targets as quickly as possible while avoiding collisions. A target is considered covered if `agents_per_target` agents have approached a target at a distance of at least `covering_range`. After a target is covered, the `agents_per_target` each receive a reward and the target is respawned to a new random position. Agents receive a penalty if they collide with each other. Every agent observes its position, velocity, LIDAR range measurements to other agents and targets (independently). The episode terminates after a fixed number of time steps.                                                                                                                                                                                                                                                                                                                                                                                          
                                                                                                                                                                                       | \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_3322093dfdd7.gif\" alt=\"drawing\" width=\"300\"\u002F>            | \n| `flocking.py`           | In this scenario, a team of `n_agents` has to flock around a target while staying together and maximising their velocity without colliding with each other and a number of `n_obstacles` obstacles. Agents are penalized for colliding with each other and with obstacles, and are rewarded for maximising velocity and minimising the span of the flock (cohesion). Every agent observes its position, velocity, and LIDAR range measurements to other agents. The episode terminates after a fixed number of time steps.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       | \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_d61ed0755b01.gif\" alt=\"drawing\" width=\"300\"\u002F>             |\n| `passage.py`            | In this scenario, a team of 5 robots is spawned in formation at a random location in the bottom part of the environment. A simular formation of goals is spawned at random in the top part. Each robot has to reach its corresponding goal. In the middle of the environment there is a wall with `n_passages`. Each passage is large enough to fit one robot at a time. Each agent receives a reward which is proportional to the distance variation between itself and the goal. In other words, getting closer to the goal will give a positive reward, while moving it away, a negative one. This reward will be shared in case `shared_reward` is true. If collisions among robots occur, each robot involved will get a reward of -10. Each agent observes: its position, velocity, relative position to the goal and relative position to the center of each passage. The environment terminates when all the robots reach their goal.                                                                                                                                                                                                                                                                                    | \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_9c9990512c4f.gif\" alt=\"drawing\" width=\"300\"\u002F>              |\n| `joint_passage_size.py` | Here, two robots of different sizes (blue circles),connected by a linkage through two revolute joints, need to cross a passage while keeping the linkage parallel to it and then match the desired goal position (green circles) on the other side. 
The passage comprises a bigger and a smaller gap, which are spawned at a random position and order on the wall, but always at a fixed distance from each other. The team is spawned in a random order and position on the lower side, with the linkage always perpendicular to the passage. The goal is spawned horizontally in a random position on the upper side. Each robot observes its velocity, relative position to each gap, and relative position to the goal center. The robots receive a shaped global reward that guides them to the goal without colliding with the passage.            | \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_797105841d5c.gif\" alt=\"drawing\" width=\"300\"\u002F>   |\n| `joint_passage.py`      | This is the same as `joint_passage_size.py`, with the difference that the robots are now physically identical, but the linkage has an asymmetric mass (black circle). The passage is a single gap, positioned randomly on the wall. The agents need to cross it while keeping the linkage perpendicular to the wall and avoiding collisions. The team and the goal are spawned in a random position, order, and rotation on opposite sides of the passage.            | \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_5341ca4763cc.gif\" alt=\"drawing\" width=\"300\"\u002F>        |\n| `ball_passage.py`       | This is the same as `joint_passage.py`, except now the agents are not connected by linkages and need to push a ball through the passage. The reward depends only on the ball and is shaped to guide it through the passage.
                                                                                                                                                                                 | \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_8da73fd9c5ef.gif\" alt=\"drawing\" width=\"300\"\u002F>         |\n| `ball_trajectory.py`    | This is the same as `circle_trajectory.py` except the trajectory reward is now dependent on a ball object. Two agents need to drive the ball in a circular trajectory. If `joints=True` the agents are connected to the ball with linkages.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      | \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_47433f1600e9.gif\" alt=\"drawing\" width=\"300\"\u002F>      |\n| `buzz_wire.py`          | Two agents are connected to a mass through linkages and need to play the [Buzz Wire game](https:\u002F\u002Fen.wikipedia.org\u002Fwiki\u002FWire_loop_game) in a straight corridor. Be careful not to touch the borders, or the episode ends!                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        | \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_8e533e17f7e1.gif\" alt=\"drawing\" width=\"300\"\u002F>            |\n| `multi_give_way.py`     | This scenario is an extension of `give_way.py` where four agents have to reach their goal by giving way to each other.                                                                                                                                                                                                                                    
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       | \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_9d2e413f783c.gif\" alt=\"drawing\" width=\"300\"\u002F>       |\n| `navigation.py`         | Randomly spawned agents need to navigate to their goal. Collisions can be turned on and agents can use LIDARs to avoid running into each other. Rewards can be shared or individual. Apart from position, velocity, and lidar readings, each agent can be set up to observe just the relative distance to its goal, or its relative distance to *all* goals (in this case the task needs heterogeneous behavior to be solved). The scenario can also be set up so that multiple agents share the same goal.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      | \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_2a5a00a73f4c.gif\" alt=\"drawing\" width=\"300\"\u002F>           |\n| `sampling.py`           | `n_agents` are spawned randomly in a workspace with an underlying gaussian density function composed of `n_gaussians` modes. Agents need to collect samples by moving in this field. The field is discretized to a grid and once an agent visits a cell its sample is collected without replacement and given as reward to the whole team (or just to the agent if `shared_rew=False`). Agents can use a lidar to sens each other. Apart from lidar, position and velocity observations, each agent observes the values of samples in the 3x3 grid around it.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  
| \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_ab9d1bde5e69.gif\" alt=\"drawing\" width=\"300\"\u002F>             |\n| `wind_flocking.py`      | Two agents need to flock at a specified distance northwards. They are rewarded for keeping the specified distance and for the alignment of their velocity vectors to the reference. The scenario presents wind from north to south. The agents are physically heterogeneous: the smaller one has some aerodynamic properties and can shield the bigger one from the wind, thereby optimizing the flocking performance. The optimal solution to this task therefore consists in the agents performing heterogeneous wind shielding. See the [SND paper](https:\u002F\u002Fmatteobettini.github.io\u002Fpublication\u002Fsystem-neural-diversity-measuring-behavioral-heterogeneity-in-multi-agent-learning\u002F) for more info.            | \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_4ff9195efe30.gif\" alt=\"drawing\" width=\"300\"\u002F>        |\n| `road_traffic.py`       | This scenario provides a MARL benchmark for Connected and Automated Vehicles (CAVs) using a High-Definition (HD) map from the Cyber-Physical Mobility Lab ([CPM Lab](https:\u002F\u002Fcpm.embedded.rwth-aachen.de\u002F)), an open-source testbed for CAVs. The map features an eight-lane intersection and a loop-shaped highway with multiple merge-ins and merge-outs, offering a range of challenging traffic conditions. Forty loop-shaped reference paths are predefined, allowing for simulations of unlimited duration. You can initialize up to 100 agents, with a default of 20. In the event of collisions during training, the scenario reinitializes all agents, randomly assigning them new reference paths, initial positions, and speeds. This setup is designed to simulate the unpredictability of real-world driving. In addition, the observations are designed to promote sample efficiency and generalization (i.e., the agents' ability to generalize to unseen scenarios). Both an ego view and a bird's-eye view are implemented; partial observation is also supported to simulate partially observable Markov decision processes. See [this paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2408.07644) for more info. | \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_82971b43cccc.gif\" alt=\"drawing\" width=\"300\"\u002F> |
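All the parameters named in the table above (`n_agents`, `shared_rew`, `covering_range`, football's `dense_reward_ratio`, and so on) are plain scenario kwargs forwarded through `make_env`. As a quick sanity check of how a scenario and its arguments fit together, here is a minimal, hedged sketch (the kwarg values are illustrative, and the zero-force actions assume the default 2D holonomic action space):

```python
import torch
import vmas

# Create 32 vectorized copies of the discovery scenario, tuning the
# scenario kwargs documented in the table above (values are illustrative).
env = vmas.make_env(
    scenario="discovery",
    num_envs=32,
    device="cpu",
    continuous_actions=True,
    n_agents=5,
    n_targets=7,
    agents_per_target=2,
)
obs = env.reset()
# One tensor of shape [num_envs, action_size] per agent; zero forces here,
# just to show the stepping interface (action_size=2 for holonomic agents).
actions = [torch.zeros(32, 2) for _ in env.agents]
obs, rews, dones, info = env.step(actions)
```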
\n\n#### Debug scenarios\n\n| Env name               | Description                                                                                                                                                                                                                                                                                                                                                                             | GIF                                                                                                                                         |\n|------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------|\n| `waterfall.py`         | `n_agents` agents are spawned at the top of the environment. They are all connected to each other through collidable linkages. The last agent is connected to a box. Each agent is rewarded based on how close it is to the center of the black line at the bottom. Agents have to reach the line and in doing so they might collide with each other and with boxes in the environment. | \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_48c5d5d7546b.gif\" alt=\"drawing\" width=\"300\"\u002F>         |\n| `asym_joint.py`        | Two agents are connected by a linkage with an asymmetric mass. The agents are rewarded for bringing the linkage to a vertical position while consuming the least team energy possible.                                                                                                                                                                                                  | \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_7d36dd4027a8.gif\" alt=\"drawing\" width=\"300\"\u002F>        |\n| `vel_control.py`       | Example scenario where three agents have velocity controllers with different acceleration constraints.                                                                                                                                                                                                                                                                                  | \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_72b6aba27348.gif\" alt=\"drawing\" width=\"300\"\u002F>       |\n| `goal.py`              | An agent with a velocity controller is spawned at random in the workspace. It is rewarded for moving to a randomly initialised goal while consuming the least energy. The agent observes its velocity and the relative position to the goal.
| \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_5de27ccc7fc7.gif\" alt=\"drawing\" width=\"300\"\u002F>              | \n| `het_mass.py`          | Two agents with different masses are spawned randomly in the workspace. They are rewarded for maximising the team maximum speed while minimizing the team energy expenditure. The optimal policy requires the heavy agent to stay still while the light agent moves at maximum speed.                                                                                                   | \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_351306cf3f59.gif\" alt=\"drawing\" width=\"300\"\u002F>          |\n| `line_trajectory.py`   | One agent is rewarded for moving along a line trajectory.                                                                                                                                                                                                                                                                                                                               | \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_13e38d4636f1.gif\" alt=\"drawing\" width=\"300\"\u002F>   |\n| `circle_trajectory.py` | One agent is rewarded for moving along a circular trajectory with the `desired_radius`.                                                                                                                                                                                                                                                                                                 | \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_822742ecdf42.gif\" alt=\"drawing\" width=\"300\"\u002F> |\n| `diff_drive.py`        | An example of the `diff_drive` dynamic model constraint. Both agents have rotational actions which can be controlled interactively. The first agent has differential drive dynamics. The second agent has standard VMAS holonomic dynamics.                                                                                                                                             | \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_5a45022c3846.gif\" alt=\"drawing\" width=\"300\"\u002F>        |\n| `kinematic_bicycle.py` | An example of the `kinematic_bicycle` dynamic model constraint. Both agents have rotational actions which can be controlled interactively. The first agent has kinematic bicycle model dynamics. The second agent has standard VMAS holonomic dynamics.                                                                                                                                 | \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_e0b71eb68c18.gif\" alt=\"drawing\" width=\"300\"\u002F> |\n| `drone.py`             | An example of the `drone` dynamic model.                                                                                                                                                                                                                                                                                                                                                | \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_41a8996e0438.gif\" alt=\"drawing\" width=\"300\"\u002F>             |
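The debug scenarios above are the quickest way to get a feel for the dynamics models and controllers. Here is a hedged sketch of trying one interactively (this assumes `render_interactively` is importable from the package top level, as in the repository's examples; the keyboard controls are described in the play-a-scenario section above):

```python
from vmas import render_interactively

# Drive the diff_drive debug scenario yourself: the arrow keys move the first
# agent; with control_two_agents=True a second agent is moved with WASD.
render_interactively("diff_drive", control_two_agents=True)
```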
\n\n### [MPE](https:\u002F\u002Fgithub.com\u002Fopenai\u002Fmultiagent-particle-envs)\n\n| Env name in code (name in paper)                         | Communication? | Competitive? | Notes                                                                                                                                                                                                                                                                                                                                                                                                                                                             |\n|----------------------------------------------------------|----------------|--------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|\n| `simple.py`                                              | N              | N            | Single agent sees landmark position, rewarded based on how close it gets to landmark. Not a multi-agent environment -- used for debugging policies.                                                                                                                                                                                                                                                                                                               |\n| `simple_adversary.py` (Physical deception)               | N              | Y            | 1 adversary (red), N good agents (green), N landmarks (usually N=2). All agents observe position of landmarks and other agents. One landmark is the ‘target landmark’ (colored green). Good agents rewarded based on how close one of them is to the target landmark, but negatively rewarded if the adversary is close to target landmark. Adversary is rewarded based on how close it is to the target, but it doesn’t know which landmark is the target landmark. So good agents have to learn to ‘split up’ and cover all landmarks to deceive the adversary. |\n| `simple_crypto.py` (Covert communication)                | Y              | Y            | Two good agents (alice and bob), one adversary (eve). Alice must send a private message to bob over a public channel. Alice and bob are rewarded based on how well bob reconstructs the message, but negatively rewarded if eve can reconstruct the message. Alice and bob have a private key (randomly generated at beginning of each episode), which they must learn to use to encrypt the message.
|\n| `simple_push.py` (Keep-away)                             | N              | Y            | 1 agent, 1 adversary, 1 landmark. Agent is rewarded based on distance to landmark. Adversary is rewarded if it is close to the landmark, and if the agent is far from the landmark. So the adversary learns to push agent away from the landmark.                                                                                                                                                                                                                 |\n| `simple_reference.py`                                    | Y              | N            | 2 agents, 3 landmarks of different colors. Each agent wants to get to their target landmark, which is known only by the other agent. Reward is collective. So agents have to learn to communicate the goal of the other agent, and navigate to their landmark. This is the same as the simple_speaker_listener scenario where both agents are simultaneous speakers and listeners.                                                                                |\n| `simple_speaker_listener.py` (Cooperative communication) | Y              | N            | Same as simple_reference, except one agent is the ‘speaker’ (gray) that does not move (observes goal of other agent), and other agent is the listener (cannot speak, but must navigate to correct landmark).                                                                                                                                                                                                                                                      |\n| `simple_spread.py` (Cooperative navigation)              | N              | N            | N agents, N landmarks. Agents are rewarded based on how far any agent is from each landmark. Agents are penalized if they collide with other agents. So, agents have to learn to cover all the landmarks while avoiding collisions.                                                                                                                                                                                                                               |\n| `simple_tag.py` (Predator-prey)                          | N              | Y            | Predator-prey environment. Good agents (green) are faster and want to avoid being hit by adversaries (red). Adversaries are slower and want to hit good agents. Obstacles (large black circles) block the way.                                                                                                                                                                                                                                                    |\n| `simple_world_comm.py`                                   | Y              | Y            | Environment seen in the video accompanying the paper. Same as simple_tag, except (1) there is food (small blue balls) that the good agents are rewarded for being near, (2) we now have ‘forests’ that hide agents inside from being seen from outside; (3) there is a ‘leader adversary’ that can see the agents at all times, and can communicate with the other adversaries to help coordinate the chase. |
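All of these MPE ports are vectorized and use the same entry point as the native VMAS scenarios: pass the file name (without `.py`) from the table above as the scenario string. A minimal, hedged sketch:

```python
import vmas

# The ported MPE scenarios are created exactly like native VMAS ones.
env = vmas.make_env(scenario="simple_tag", num_envs=64, device="cpu")
obs = env.reset()
print(len(env.agents))  # good agents plus adversaries
```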
\n\n## Our papers using VMAS\n\n- [VMAS](https:\u002F\u002Fmatteobettini.github.io\u002Fpublication\u002Fvmas-a-vectorized-multi-agent-simulator-for-collective-robot-learning\u002F) features training of `balance`, `transport`, `give_way`, `wheel`\n- [HetGPPO](https:\u002F\u002Fmatteobettini.github.io\u002Fpublication\u002Fheterogeneous-multi-robot-reinforcement-learning\u002F) features training of `het_mass`, `give_way`, `joint_passage`, `joint_passage_size`\n- [SND](https:\u002F\u002Fmatteobettini.github.io\u002Fpublication\u002Fsystem-neural-diversity-measuring-behavioral-heterogeneity-in-multi-agent-learning\u002F) features training of `navigation`, `joint_passage`, `joint_passage_size`, `wind`\n- [TorchRL](https:\u002F\u002Fmatteobettini.com\u002Fpublication\u002Ftorchrl-a-data-driven-decision-making-library-for-pytorch\u002F) features training of `navigation`, `sampling`, `balance`\n- [BenchMARL](https:\u002F\u002Fmatteobettini.com\u002Fpublication\u002Fbenchmarl\u002F) features training of `navigation`, `sampling`, `balance`\n- [The Cambridge RoboMaster](https:\u002F\u002Fmatteobettini.com\u002Fpublication\u002Frobomaster\u002F) features training of `navigation`\n- [DiversityControl (DiCo)](https:\u002F\u002Fmatteobettini.com\u002Fpublication\u002Fcontrolling-behavioral-diversity-in-multi-agent-reinforcement-learning\u002F) features training of `navigation`, `sampling`, `dispersion`, `simple_tag`\n\n## TODOS\n\nTODOs are now listed [here](https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fissues\u002F116). 
\n\n- [X] Improve test efficiency and add new tests\n- [X] Implement 2D drone dynamics\n- [X] Allow any number of actions\n- [X] Improve VMAS performance\n- [X] Dict obs support in torchrl\n- [X] Make TextLine a Geom usable in a scenario\n- [X] Notebook on how to use torch rl with vmas\n- [X] Allow dict obs spaces and multidim obs\n- [X] Talk about action preprocessing and velocity controller\n- [X] New envs from joint project with their descriptions\n- [X] Talk about navigation \u002F multi_goal\n- [X] Link video of experiments\n- [X] Add LIDAR section\n- [X] Implement LIDAR\n- [X] Rewrite all MPE scenarios\n  - [X] simple\n  - [x] simple_adversary\n  - [X] simple_crypto\n  - [X] simple_push\n  - [X] simple_reference\n  - [X] simple_speaker_listener\n  - [X] simple_spread\n  - [X] simple_tag\n  - [X] simple_world_comm\n","# 向量化多智能体模拟器 (VMAS)\n\u003Ca href=\"https:\u002F\u002Fpypi.org\u002Fproject\u002Fvmas\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fpypi\u002Fv\u002Fvmas\" alt=\"pypi版本\">\u003C\u002Fa>\n[![下载量](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_33bb5a560cbf.png)](https:\u002F\u002Fpepy.tech\u002Fproject\u002Fvmas)\n![测试](https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Factions\u002Fworkflows\u002Ftests-linux.yml\u002Fbadge.svg)\n[![codecov](https:\u002F\u002Fcodecov.io\u002Fgithub\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fcoverage.svg?branch=main)](https:\u002F\u002Fcodecov.io\u002Fgh\u002Fproroklab\u002FVectorizedMultiAgentSimulator)\n[![文档状态](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_13d664e1afd7.png)](https:\u002F\u002Fvmas.readthedocs.io\u002Fen\u002Flatest\u002F?badge=latest)\n[![Python](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fpython-3.8%20%7C%203.9%20%7C%203.10%20%7C%203.11%20%7C%203.12-blue.svg)](https:\u002F\u002Fwww.python.org\u002Fdownloads\u002F)\n[![GitHub许可证](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Flicense-GPLv3.0-blue.svg)](https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fblob\u002Fmain\u002FLICENSE)\n[![arXiv](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2207.03530-b31b1b.svg)](https:\u002F\u002Farxiv.org\u002Fabs\u002F2207.03530)\n[![Discord盾牌](https:\u002F\u002Fdcbadge.limes.pink\u002Fapi\u002Fserver\u002Fhttps:\u002F\u002Fdiscord.gg\u002Fdg8txxDW5t)](https:\u002F\u002Fdiscord.gg\u002Fdg8txxDW5t)\n[![在Colab中打开](https:\u002F\u002Fcolab.research.google.com\u002Fassets\u002Fcolab-badge.svg)](https:\u002F\u002Fcolab.research.google.com\u002Fgithub\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fblob\u002Fmain\u002Fnotebooks\u002FSimulation_and_training_in_VMAS_and_BenchMARL.ipynb)\n\n\u003Cp align=\"center\">\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_a5f0d93f05b1.gif\" alt=\"drawing\"\u002F>  \n\u003C\u002Fp>\n\n> [!NOTE]  \n> 我们发布了[BenchMARL](https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002FBenchMARL)，这是一个基准测试库，您可以在其中使用TorchRL训练VMAS任务！\n> 请查看[如何轻松使用它。](https:\u002F\u002Fcolab.research.google.com\u002Fgithub\u002Ffacebookresearch\u002FBenchMARL\u002Fblob\u002Fmain\u002Fnotebooks\u002Frun.ipynb)\n\n## 
欢迎来到VMAS！\n\n本仓库包含向量化多智能体模拟器（VMAS）的代码。\n\nVMAS是一个向量化可微分模拟器，专为高效的多智能体强化学习基准测试而设计。\n它由一个完全可微的PyTorch编写的向量化2D物理引擎和一组具有挑战性的多机器人场景组成。\n场景的创建简单且模块化，以鼓励贡献。\nVMAS可以模拟不同形状的智能体和地标，并支持旋转、弹性碰撞、关节以及自定义重力。\n为了简化仿真，智能体采用了全向运动模型。还提供了诸如激光雷达之类的自定义传感器，且该模拟器支持智能体间的通信。\n通过[PyTorch](https:\u002F\u002Fpytorch.org\u002F)中的向量化功能，VMAS能够批量执行仿真，在加速硬件上无缝扩展至数万个并行环境。\nVMAS拥有与[OpenAI Gym](https:\u002F\u002Fgithub.com\u002Fopenai\u002Fgym)、[Gymnasium](https:\u002F\u002Fgymnasium.farama.org\u002F)、[RLlib](https:\u002F\u002Fdocs.ray.io\u002Fen\u002Flatest\u002Frllib\u002Findex.html)、[torchrl](https:\u002F\u002Fgithub.com\u002Fpytorch\u002Frl)及其多智能体强化学习训练库：[BenchMARL](https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002FBenchMARL)兼容的接口，\n从而实现与多种强化学习算法的开箱即用集成。\n其实现灵感来源于[OpenAI的MPE](https:\u002F\u002Fgithub.com\u002Fopenai\u002Fmultiagent-particle-envs)。\n除了VMAS自带的场景外，我们还移植并将其余所有MPE场景进行了向量化处理。\n\n### [论文](https:\u002F\u002Farxiv.org\u002Fabs\u002F2207.03530)\n该arXiv论文可在[此处](https:\u002F\u002Farxiv.org\u002Fabs\u002F2207.03530)找到。\n\n如果您在研究中使用了VMAS，请**引用**它：\n```\n@article{bettini2022vmas,\n  title = {VMAS: A Vectorized Multi-Agent Simulator for Collective Robot Learning},\n  author = {Bettini, Matteo and Kortvelesy, Ryan and Blumenkamp, Jan and Prorok, Amanda},\n  year = {2022},\n  journal={The 16th International Symposium on Distributed Autonomous Robotic Systems},\n  publisher={Springer}\n}\n```\n\n### 视频\n观看VMAS的演示视频，展示其结构、场景和实验。\n\n\u003Cp align=\"center\">\n\n[![VMAS视频](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_fd2dcbcaccd3.jpg)](https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=aaDRYfiesAY)\n\u003C\u002Fp>\n观看DARS 2022关于VMAS的演讲。\n\u003Cp align=\"center\">\n\n[![VMAS视频](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_b240046ffeb6.jpg)](https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=boViBY7Woqg)\n\u003C\u002Fp>\n观看关于如何在VMAS中创建自定义场景并在BenchMARL中进行训练的讲座。\n\u003Cp align=\"center\">\n\n[![VMAS视频](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_73b9cf0ec012.jpg)](https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=mIb1uGeRJsg)\n\u003C\u002Fp>\n\n## 目录\n- [向量化多智能体模拟器 (VMAS)](#vectorizedmultiagentsimulator-vmas)\n  * [欢迎来到VMAS！](#welcome-to-vmas)\n    + [论文](#paper)\n    + [视频](#video)\n  * [目录](#table-of-contents)\n  * [使用方法](#how-to-use)\n    + [笔记本](#notebooks)\n    + [安装](#install)\n    + [运行](#run)\n      - [RLlib](#rllib)\n      - [TorchRL](#torchrl)\n    + [输入与输出空间](#input-and-output-spaces)\n      - [输出空间](#output-spaces)\n      - [输入动作空间](#input-action-space)\n  * [模拟器特性](#simulator-features)\n  * [创建新场景](#creating-a-new-scenario)\n  * [游玩场景](#play-a-scenario)\n  * [渲染](#rendering)\n    + [渲染下的绘图函数](#plot-function-under-rendering)\n    + [服务器上的渲染](#rendering-on-server-machines)\n  * [环境列表](#list-of-environments)\n    + [VMAS](#vmas)\n       - [主要场景](#main-scenarios)\n       - [调试场景](#debug-scenarios)\n    + [MPE](#mpe)\n  * [我们使用VMAS的论文](#our-papers-using-vmas)\n  * [待办事项](#todos)\n\n\n## 使用方法\n\n### 笔记本\n-  [![在 Colab 中打开](https:\u002F\u002Fcolab.research.google.com\u002Fassets\u002Fcolab-badge.svg)](https:\u002F\u002Fcolab.research.google.com\u002Fgithub\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fblob\u002Fmain\u002Fnotebooks\u002FVMAS_Use_vmas_environment.ipynb) &ensp; **使用 VMAS 环境**。\n  这是一个简单的笔记本，你可以运行它来创建、逐步执行并渲染 VMAS 中的任何场景。它复现了 `examples` 文件夹中的 `use_vmas_env.py` 脚本。\n- [![在 Colab 
中打开](https:\u002F\u002Fcolab.research.google.com\u002Fassets\u002Fcolab-badge.svg)](https:\u002F\u002Fcolab.research.google.com\u002Fgithub\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fblob\u002Fmain\u002Fnotebooks\u002FSimulation_and_training_in_VMAS_and_BenchMARL.ipynb) &ensp;  **创建 VMAS 场景并在 [BenchMARL](https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002FBenchMARL) 中进行训练**。 我们将创建一个场景，其中多个具有不同形态的机器人需要在避开彼此（以及障碍物）的同时导航到各自的目标，并使用 MAPPO 和 MLP\u002FGNN 策略对其进行训练。\n- [![在 Colab 中打开](https:\u002F\u002Fcolab.research.google.com\u002Fassets\u002Fcolab-badge.svg)](https:\u002F\u002Fcolab.research.google.com\u002Fgithub\u002Ffacebookresearch\u002FBenchMARL\u002Fblob\u002Fmain\u002Fnotebooks\u002Frun.ipynb) &ensp;  **在 BenchMARL 中训练 VMAS（推荐）**。 在这个笔记本中，我们展示了如何在 TorchRL 的 MARL 训练库 [BenchMARL](https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002FBenchMARL) 中使用 VMAS。\n- [![在 Colab 中打开](https:\u002F\u002Fcolab.research.google.com\u002Fassets\u002Fcolab-badge.svg)](https:\u002F\u002Fcolab.research.google.com\u002Fgithub\u002Fpytorch\u002Frl\u002Fblob\u002Fgh-pages\u002F_downloads\u002Fa977047786179278d12b52546e1c0da8\u002Fmultiagent_ppo.ipynb)  &ensp;  **在 TorchRL 中训练 VMAS**。 在这个笔记本中，[可在 TorchRL 文档中找到](https:\u002F\u002Fpytorch.org\u002Frl\u002Fstable\u002Ftutorials\u002Fmultiagent_ppo.html#)，我们展示了如何在 TorchRL 中使用任何 VMAS 场景。它将引导你完成使用 MAPPO\u002FIPPO 训练智能体所需的完整流程。\n- [![在 Colab 中打开](https:\u002F\u002Fcolab.research.google.com\u002Fassets\u002Fcolab-badge.svg)](https:\u002F\u002Fcolab.research.google.com\u002Fgithub\u002Fpytorch\u002Frl\u002Fblob\u002Fgh-pages\u002F_downloads\u002Fd30bb6552cc07dec0f1da33382d3fa02\u002Fmultiagent_competitive_ddpg.py)  &ensp;  **在 TorchRL 中训练竞争性 VMAS MPE**。 在这个笔记本中，[可在 TorchRL 文档中找到](https:\u002F\u002Fpytorch.org\u002Frl\u002Fstable\u002Ftutorials\u002Fmultiagent_competitive_ddpg.html)，我们展示了如何使用 MADDPG\u002FIDDPG 解决竞争性多智能体强化学习 (MARL) 问题。\n- [![在 Colab 中打开](https:\u002F\u002Fcolab.research.google.com\u002Fassets\u002Fcolab-badge.svg)](https:\u002F\u002Fcolab.research.google.com\u002Fgithub\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fblob\u002Fmain\u002Fnotebooks\u002FVMAS_RLlib.ipynb)  &ensp;  **在 RLlib 中训练 VMAS**。 在这个笔记本中，我们展示了如何在 RLlib 中使用任何 VMAS 场景。它复现了 `examples` 文件夹中的 `rllib.py` 脚本。\n\n\n\n### 安装\n\n要安装模拟器，你可以使用 pip 获取最新版本：\n```bash\npip install vmas\n```\n如果你想安装当前的 master 版本（比最新版本更新），可以这样做：\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator.git\ncd VectorizedMultiAgentSimulator\npip install -e .\n```\n默认情况下，vmas 只包含核心依赖项。若要安装更多依赖项以支持使用 [Gymnasium](https:\u002F\u002Fgymnasium.farama.org\u002F) 包装器、[RLLib](https:\u002F\u002Fdocs.ray.io\u002Fen\u002Flatest\u002Frllib\u002Findex.html) 包装器进行训练、渲染和测试，你可以安装以下可选依赖项：\n```bash\n# 安装 gymnasium 以使用 gymnasium 包装器\npip install vmas[gymnasium]\n\n# 安装 rllib 以使用 rllib 包装器\npip install vmas[rllib]\n\n# 安装渲染依赖项\npip install vmas[render]\n\n# 安装测试依赖项\npip install vmas[test]\n\n# 安装所有依赖项\npip install vmas[all]\n```\n\n你还可以安装以下训练库：\n\n```bash\npip install benchmarl # 用于在 BenchMARL 中训练\npip install torchrl # 用于在 TorchRL 中训练\npip install \"ray[rllib]\"==2.1.0 # 用于在 RLlib 中训练。我们支持 \"ray[rllib]\u003C=2.2,>=1.13\" 版本\n```\n\n### 运行\n\n要使用模拟器，只需通过将您想要的场景名称（来自 `scenarios` 文件夹）传递给 `make_env` 函数来创建环境。\n函数参数在文档中已详细说明。该函数会返回一个具有 VMAS 接口的环境对象：\n\n以下是一个示例：\n```python\n env = vmas.make_env(\n        scenario=\"waterfall\", # 可以是场景名称或 BaseScenario 类\n        num_envs=32,\n        device=\"cpu\", # 或者使用 \"cuda\" 表示 GPU\n        continuous_actions=True,\n        wrapper=None,  # 
可选值：None、\"rllib\"、\"gym\"、\"gymnasium\"、\"gymnasium_vec\"\n        max_steps=None, # 定义时间步长上限。None 表示无上限。\n        seed=None, # 环境的随机种子\n        dict_spaces=False, # 默认使用元组空间，每个元素对应一个智能体。\n        # 如果 dict_spaces=True，则空间会变为字典形式，键为智能体名称。\n        grad_enabled=False, # 如果启用梯度计算，模拟器将具备可微性，输出到输入之间可以传播梯度。\n        terminated_truncated=False, # 如果启用此选项，模拟器将在 `done()`、`step()` 和 `get_from_scenario()` 函数中分别返回 `terminated` 和 `truncated` 标志，而不是单一的 `done` 标志。\n        **kwargs # 您希望传递给场景初始化的其他参数\n    )\n```\n您可以在 `examples` 目录下的 `use_vmas_env.py` 中找到另一个可运行的示例。\n\n当 `terminated_truncated` 标志设置为 `True` 时，模拟器会在 `done()`、`step()` 和 `get_from_scenario()` 函数中分别返回 `terminated` 和 `truncated` 标志，而不是单一的 `done` 标志。\n这在您需要区分环境结束是因为回合已经结束，还是因为达到了最大回合长度\u002F时间步长上限时非常有用。\n有关此功能的更多详细信息，请参阅 [Gymnasium 文档](https:\u002F\u002Fgymnasium.farama.org\u002Ftutorials\u002Fgymnasium_basics\u002Fhandling_time_limits\u002F)。\n\n#### RLlib\n\n要了解如何在 RLlib 中使用 VMAS，请查看 `examples\u002Frllib.py` 脚本。\n\n您还可以在 [HetGPPO 仓库](https:\u002F\u002Fgithub.com\u002Fproroklab\u002FHetGPPO)中找到更多关于 VMAS 多智能体训练的示例。\n\n#### TorchRL\n\nVMAS 很好地支持 [TorchRL](https:\u002F\u002Fgithub.com\u002Fpytorch\u002Frl) 及其多智能体强化学习训练库 [BenchMARL](https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002FBenchMARL)。\n\n请参阅此 [笔记本](https:\u002F\u002Fcolab.research.google.com\u002Fgithub\u002Ffacebookresearch\u002FBenchMARL\u002Fblob\u002Fmain\u002Fnotebooks\u002Frun.ipynb)，了解在 [BenchMARL](https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002FBenchMARL) 中使用 VMAS 是何等简单。\n\n我们提供了一个 [笔记本](https:\u002F\u002Fpytorch.org\u002Frl\u002Ftutorials\u002Fmultiagent_ppo.html)，它将引导您完成一个完整的多智能体强化学习流程，使用 MAPPO\u002FIPPO 在 TorchRL 中训练 VMAS 场景。\n\n您可以在 TorchRL 仓库的 [这里](https:\u002F\u002Fgithub.com\u002Fpytorch\u002Frl\u002Ftree\u002Fmain\u002Fsota-implementations\u002Fmultiagent) 找到 **示例脚本**，演示如何利用 [VMAS 封装](https:\u002F\u002Fgithub.com\u002Fpytorch\u002Frl\u002Fblob\u002Fmain\u002Ftorchrl\u002Fenvs\u002Flibs\u002Fvmas.py)运行 MAPPO-IPPO-MADDPG-QMIX-VDN。\n\n\n\n### 输入与输出空间\n\nVMAS 使用 gym 空间来定义输入和输出空间。\n默认情况下，动作和观测空间是元组形式：\n```python\nspaces.Tuple(\n    [agent_space for agent in agents]\n)\n```\n在创建环境时，通过将 `dict_spaces` 设置为 `True`，可以将元组转换为字典形式：\n```python\nspaces.Dict(\n  {agent.name: agent_space for agent in agents}\n)\n```\n\n#### 输出空间\n\n如果 `dict_spaces=False`，环境返回的观测、信息和奖励将是一个列表，每个元素对应某个智能体的值。\n\n如果 `dict_spaces=True`，环境返回的观测、信息和奖励将是一个字典，其中每个键是智能体名称，对应的值则是该智能体的相应数值。\n\n无论采用哪种结构，每个智能体的 **观测** 都可能是（取决于场景的具体实现）：\n  - 形状为 `[num_envs, observation_size]` 的张量，其中 `observation_size` 是该智能体的观测维度。\n```python\n def observation(self, agent: Agent):\n        return torch.cat([agent.state.pos, agent.state.vel], dim=-1)\n```\n  - 或者由多个此类张量组成的字典。\n```python\n def observation(self, agent: Agent):\n        return {\n            \"pos\": agent.state.pos,\n            \"nested\": {\"vel\": agent.state.vel},\n        }\n```\n\n每个智能体的 **奖励** 都是一个形状为 `[num_envs]` 的张量。\n\n每个智能体的 **信息** 则是一个字典，其中每个条目都包含一个键（表示信息名称）和一个形状为 `[num_envs, info_size]` 的张量，`info_size` 是该智能体相关信息的维度。\n\n`done` 是一个形状为 `[num_envs]` 的张量。\n\n\n#### 输入动作空间\n\nVMAS 中的每个智能体都需要提供一个形状为 `[num_envs, action_size]` 的动作张量，其中 `num_envs` 是向量化环境的数量，`action_size` 是该智能体的动作维度。\n\n智能体的动作可以通过两种方式传递给 `env.step()`：\n- 一个长度等于智能体数量的 **列表**，形式为 `[tensor_action_agent_0, ..., tensor_action_agent_n]`；\n- 一个长度等于智能体数量的 **字典**，形式为 `{agent_0.name: tensor_action_agent_0, ..., agent_n.name: tensor_action_agent_n}`。\n\n用户可以在这两种格式之间自由切换，甚至在执行过程中更改格式，VMAS 始终会进行所有必要的合法性检查。无论选择元组空间还是字典空间，这两种格式都能正常工作。\n\n## 模拟器特性\n\n- **向量化**：VMAS 
的向量化功能可以并行推进任意数量的环境。这显著减少了在多智能体强化学习中收集训练轨迹所需的时间。\n- **简单**：虽然存在复杂的向量化物理引擎（例如 [Brax](https:\u002F\u002Fgithub.com\u002Fgoogle\u002Fbrax)），但在处理多个智能体时，它们的扩展效率并不高。这违背了向量化所追求的计算速度目标。VMAS 使用一个用 PyTorch 编写的简单自定义 2D 动力学引擎，以提供快速的仿真。\n- **通用性**：VMAS 的核心设计使其能够用于实现 2D 空间中的通用高层次多机器人问题。它既支持对抗性场景，也支持合作性场景。我们选择了全向点机器人仿真，以便专注于高层次的通用问题，而不通过 MARL 学习低层次的自定义机器人控制。\n- **可扩展性**：VMAS 不仅仅是一个包含一组环境的模拟器，它还是一种框架，可用于创建新的多智能体场景，并以整个 MARL 社区都能使用的方式呈现这些场景。为此，我们将任务创建过程模块化，并引入了交互式渲染功能来调试场景。您可以在几分钟内定义自己的场景。请参阅本文档中的专门章节。\n- **兼容性**：VMAS 提供了针对 [RLlib](https:\u002F\u002Fdocs.ray.io\u002Fen\u002Flatest\u002Frllib\u002Findex.html)、[torchrl](https:\u002F\u002Fpytorch.org\u002Frl\u002Freference\u002Fgenerated\u002Ftorchrl.envs.libs.vmas.VmasEnv.html)、[OpenAI Gym](https:\u002F\u002Fgithub.com\u002Fopenai\u002Fgym) 和 [Gymnasium](https:\u002F\u002Fgymnasium.farama.org\u002F) 的封装接口。RLlib 和 torchrl 已经实现了大量的强化学习算法。\n请注意，这种封装接口的效率低于未封装版本。封装示例请参见 `make_env` 的主函数。\n- **经过测试**：我们的场景附带测试，会在每个场景上运行一个定制的启发式算法。\n- **实体形状**：我们的实体（智能体和地标）可以具有不同的可定制形状（球体、盒子、线段等）。\n所有这些形状都支持弹性碰撞。\n- **比物理引擎更快**：我们的模拟器极其轻量，仅使用张量运算。它非常适合大规模运行包含多智能体碰撞和交互的 MARL 训练。\n- **可定制性**：在创建您自己的新场景时，世界、智能体和地标都可以高度定制。例如：阻力、摩擦力、重力、仿真时间步长、不可微分的通信、智能体传感器（如激光雷达）以及质量等。\n- **不可微分的通信**：场景可能要求智能体执行离散或连续的通信动作。\n- **重力**：VMAS 支持可定制的重力。\n- **传感器**：我们的模拟器实现了光线投射功能，可用于模拟多种基于距离的传感器，并将其添加到智能体上。目前我们支持激光雷达。有关可用传感器的信息，请参阅 `sensors` 脚本。\n- **关节**：我们的模拟器支持关节。关节是一种约束，用于将两个实体保持在指定的距离。用户可以指定两个物体上的锚定点、距离（包括 0）、关节的厚度、是否允许在任一锚点旋转，以及是否希望关节参与碰撞。有关如何使用关节的示例，请参阅 `waterfall` 和 `joint_passage` 场景。\n- **智能体动作**：智能体的物理动作是用于全向运动的 2D 力。智能体的旋转也可以通过扭矩动作来控制（在创建智能体时设置 `agent.action.u_rot_range` 即可启用）。智能体还可以配备连续或离散的通信动作。\n- **动作预处理**：通过实现场景的 `process_action` 函数，您可以在动作传递给模拟器之前对其进行修改。这在 `controllers`（我们提供了不同类型的控制器供使用）和 `dynamics`（我们提供了自定义的机器人动力学模型）中都有应用。\n- **控制器**：控制器是可以附加到神经网络策略上的组件，也可以完全替代该策略。我们提供了一个 `VelocityController`，它可以将输入动作视为速度（而不是默认的 VMAS 输入力）。这个 PID 控制器接收速度指令，输出力并将其作为输入传递给模拟器。有关示例，请参阅 `vel_control` 调试场景。\n- **动力学模型**：VMAS 默认模拟全向动力学模型。在创建智能体时可以选择自定义动力学模型。目前实现的模型包括用于差速驱动机器人的 `DiffDriveDynamics`、用于运动学自行车模型的 `KinematicBicycleDynamics` 以及用于四旋翼无人机的动力学模型 `Drone`。有关示例，请参阅 `diff_drive`、`kinematic_bicycle` 和 `drone` 调试场景。\n- **可微分性**：在创建环境时设置 `grad_enabled=True`，模拟器将变为可微分的，从而允许梯度在其任何函数中流动。\n\n## 创建新场景\n\n要创建新场景，只需在 `scenario.py` 中扩展 `BaseScenario` 类即可。\n\n您至少需要实现 `make_world`、`reset_world_at`、`observation` 和 `reward`。此外，您还可以选择实现 `done`、`info`、`process_action` 和 `extra_render`。\n\n您还可以通过修改 `make_world` 函数中的继承属性来调整视图大小、缩放比例，并启用背景网格渲染。\n\n具体操作方法，请阅读 `scenario.py` 中 `BaseScenario` 的文档，并参考已实现的场景。下文给出一个最小的骨架示例。
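下面是一个最小的场景骨架草图，仅实现上述四个必需方法（注意：这只是示意代码；导入路径以及 `set_pos`、`dim_p` 等辅助接口均按仓库当前布局假定，具体请以 `scenario.py` 中 `BaseScenario` 的文档为准）：

```python
import torch
from vmas.simulator.core import Agent, World
from vmas.simulator.scenario import BaseScenario


class MyScenario(BaseScenario):
    def make_world(self, batch_dim: int, device: torch.device, **kwargs):
        # 创建向量化世界，并添加两个默认的全向智能体
        world = World(batch_dim, device)
        for i in range(2):
            world.add_agent(Agent(name=f"agent_{i}"))
        return world

    def reset_world_at(self, env_index=None):
        # env_index 为 None 时重置所有并行环境，否则只重置指定的那个环境；
        # 简单起见，这里把所有智能体都放回原点
        for agent in self.world.agents:
            agent.set_pos(
                torch.zeros(
                    (1, self.world.dim_p)
                    if env_index is not None
                    else (self.world.batch_dim, self.world.dim_p),
                    device=self.world.device,
                ),
                batch_index=env_index,
            )

    def observation(self, agent: Agent):
        # 返回形状为 [num_envs, obs_size] 的张量（与上文的观测示例一致）
        return torch.cat([agent.state.pos, agent.state.vel], dim=-1)

    def reward(self, agent: Agent):
        # 返回形状为 [num_envs] 的张量：离原点越近，奖励越高
        return -torch.linalg.vector_norm(agent.state.pos, dim=-1)
```

之后即可把它传入 `make_env`（如上文示例注释所述，`scenario` 参数既接受场景名称，也接受 `BaseScenario`），例如 `vmas.make_env(scenario=MyScenario(), num_envs=32, device="cpu")`。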
\n\n## 交互式游玩场景\n\n您可以交互式地体验场景！**只需运行其脚本即可！**\n\n使用 `interactive_rendering.py` 脚本中的 `render_interactively` 函数即可。相关数值会实时显示在屏幕上。\n使用方向键移动智能体，按 TAB 键切换智能体。按下 R 键可重置环境。\n如果您有多个智能体，可以使用 W、A、S、D 键控制另一个智能体，并按 LSHIFT 键切换第二个智能体。为此，只需设置 `control_two_agents=True` 即可。如果智能体还具备旋转动作，您可以使用 M、N 键控制第一个智能体，使用 Q、E 键控制第二个智能体（例如在 `diff_drive` 场景中）。\n屏幕上会显示由方向键控制的智能体的相关数据，包括：名称、当前观测、当前奖励、累计奖励以及环境是否结束的标志。\n以下是其界面概览：\n\n\u003Cp align=\"center\">\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_bc22df4b5996.png\"  alt=\"drawing\" width=\"500\"\u002F>\n\u003C\u002Fp>\n\n## 渲染\n\n要渲染环境，只需调用 `render` 或 `try_render_at` 函数（具体取决于环境的包装方式）。\n\n示例：\n```\nenv.render(\n    mode=\"rgb_array\", # \"rgb_array\" 返回图像，\"human\" 在显示窗口中渲染\n    agent_index_focus=4, # 如果为 None，则相机保持所有智能体在视野内；否则将相机聚焦到特定智能体\n    index=0, # 要渲染的批处理环境的索引\n    visualize_when_rgb=False, # 当 mode==\"rgb_array\" 时，同时运行人类可读的可视化\n)\n```\n\n你还可以通过修改场景 `make_world` 函数中的这些继承属性来调整查看器大小、缩放比例，并启用背景网格渲染。\n\n|                                                                    动画                                                                    |                             智能体聚焦                             |\n|:-----------------------------------------------------------------------------------------------------------------------------------------:|:-------------------------------------------------------------------:|\n|        \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_4c5662bb48f8.gif\" alt=\"drawing\" width=\"260\"\u002F>        | 当 `agent_index_focus=None` 时，相机保持所有智能体在视野内 |\n| \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_a6f17aef6b30.gif\" alt=\"drawing\" width=\"260\"\u002F> |       当 `agent_index_focus=0` 时，相机跟随智能体 0        |\n| \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_aac6f159aa0b.gif\" alt=\"drawing\" width=\"260\"\u002F> |       当 `agent_index_focus=4` 时，相机跟随智能体 4        |\n\n### 渲染下的绘图函数\n\n可以通过向 `render` 函数提供一个函数 `f`，在智能体的渲染画面下方绘制函数图形。\n```\nenv.render(\n    plot_position_function=f\n)\n```\n该函数接受一个形状为 `(n_points, 2)` 的 NumPy 数组，表示一组用于评估并绘制函数 `f` 的 x, y 值。`f` 可以输出形状为 `(n_points, 1)` 的数组，作为颜色映射进行绘制；也可以输出形状为 `(n_points, 4)` 的数组，作为 RGBA 值进行绘制。\n\n更多信息请参阅 `sampling.py` 场景。\n\n\u003Cp align=\"center\">\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_0f7801f2b8f8.png\"  alt=\"drawing\" width=\"400\"\u002F>\n\u003C\u002Fp>\n\n### 在服务器机器上渲染\n在没有显示设备的机器上进行渲染时，请使用 `mode=\"rgb_array\"`。请确保已安装 OpenGL 和 Pyglet。若要使用 GPU 进行无头渲染，可以安装 EGL 库。\n\n如果没有 EGL，需要创建一个虚拟屏幕。可以在脚本运行前执行以下命令：\n```\nexport DISPLAY=':99.0'\nXvfb :99 -screen 0 1400x900x24 > \u002Fdev\u002Fnull 2>&1 &\n```\n或者采用另一种方式：\n```\nxvfb-run -s \\\"-screen 0 1400x900x24\\\" python \u003Cyour_script.py>\n```\n\n创建虚拟屏幕需要先安装 `Xvfb`。\n\n## 环境列表\n\n### VMAS\n|                                                                                                                                                                       |                                                                                                                                                               |                                                                                                                                                                           |\n|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------|\n| **\u003Cp align=\"center\">dropout\u003C\u002Fp>** \u003Cbr\u002F> \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_bcf2311275d4.gif\"\u002F>                       | **\u003Cp align=\"center\">足球\u003C\u002Fp>** \u003Cbr\u002F> \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_c99ea49a03a4.gif\"\u002F>             | **\u003Cp align=\"center\">交通\u003C\u002Fp>** \u003Cbr\u002F> \u003Cimg 
src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_1a55da5023bf.gif\"\u002F>                       |\n| **\u003Cp align=\"center\">轮子\u003C\u002Fp>** \u003Cbr\u002F> \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_83c4bfedd4d0.gif\"\u002F>                           | **\u003Cp align=\"center\">平衡\u003C\u002Fp>** \u003Cbr\u002F> \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_8136de98b37b.gif\"\u002F>               | **\u003Cp align=\"center\">反向\u003Cbr\u002F>交通\u003C\u002Fp>** \u003Cbr\u002F> \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_b191d1d9b094.gif\"\u002F> |\n| **\u003Cp align=\"center\">让行\u003C\u002Fp>** \u003Cbr\u002F> \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_e2555f0224d8.gif\"\u002F>                     | **\u003Cp align=\"center\">通道\u003C\u002Fp>** \u003Cbr\u002F> \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_9c9990512c4f.gif\"\u002F>               | **\u003Cp align=\"center\">分散\u003C\u002Fp>** \u003Cbr\u002F> \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_f63a89a7931f.gif\"\u002F>                     |\n| **\u003Cp align=\"center\">联合通道大小\u003C\u002Fp>** \u003Cbr\u002F> \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_797105841d5c.gif\"\u002F> | **\u003Cp align=\"center\"> flocking\u003C\u002Fp>** \u003Cbr\u002F> \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_d61ed0755b01.gif\"\u002F>             | **\u003Cp align=\"center\">发现\u003C\u002Fp>** \u003Cbr\u002F> \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_3322093dfdd7.gif\"\u002F>                       | \n| **\u003Cp align=\"center\">联合通道\u003C\u002Fp>** \u003Cbr\u002F> \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_5341ca4763cc.gif\"\u002F>           | **\u003Cp align=\"center\">球类通道\u003C\u002Fp>** \u003Cbr\u002F> \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_8da73fd9c5ef.gif\"\u002F>     | **\u003Cp align=\"center\">球的轨迹\u003C\u002Fp>** \u003Cbr\u002F> \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_47433f1600e9.gif\"\u002F>           |\n| **\u003Cp align=\"center\">嗡嗡线\u003C\u002Fp>** \u003Cbr\u002F> \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_8e533e17f7e1.gif\"\u002F>                   | **\u003Cp align=\"center\">多重让行\u003C\u002Fp>** \u003Cbr\u002F> \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_9d2e413f783c.gif\"\u002F> | **\u003Cp align=\"center\">导航\u003C\u002Fp>** \u003Cbr\u002F> \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_2a5a00a73f4c.gif\"\u002F>                     |\n| **\u003Cp align=\"center\">采样\u003C\u002Fp>** 
\u003Cbr\u002F> \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_ab9d1bde5e69.gif\"\u002F>                     | **\u003Cp align=\"center\">风力flocking\u003C\u002Fp>** \u003Cbr\u002F> \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_4ff9195efe30.gif\"\u002F>   | **\u003Cp align=\"center\">道路交通\u003C\u002Fp>** \u003Cbr\u002F> \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_82971b43cccc.gif\"\u002F>         |\n\n#### 主要场景\n\n| 环境名称                | 描述                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      | 动图                                                                                                                                            |\n|-------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------|\n| `dropout.py`            | 在此场景中，`n_agents` 个智能体和一个目标会随机生成在 -1 到 1 之间的位置。智能体之间以及与目标之间不能发生碰撞。奖励由所有智能体共享。当至少有一个智能体到达目标时，团队将获得 1 分的奖励。同时，团队还会根据每个智能体动作大小的总和受到惩罚，以此来鼓励智能体尽量减少移动。能量奖励的影响可以通过设置 `energy_coeff` 来调整。默认系数为 
0.02，这意味着对于单个智能体来说，始终值得去到达目标。最优策略是让离目标最近的智能体独自前往目标，从而尽可能节省能量。每个智能体能够观察到自己的位置、速度、相对于目标的位置，以及是否已有人到达目标的标志。当有智能体到达目标时，环境结束。解决该环境需要智能体之间的通信。 | \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_bcf2311275d4.gif\" alt=\"drawing\" width=\"300\"\u002F> |\n| `dispersion.py` | 在此场景中，会生成 `n_agents` 个智能体和若干目标。所有智能体初始位于 [0,0]，而目标则随机分布在 -1 到 1 之间。智能体之间及与目标之间均不能发生碰撞。智能体的任务是各自到达不同的目标。当某个目标被成功到达时，如果 `share_reward` 为真，则整个团队获得 1 分的奖励；否则，同一时间步内到达该目标的所有智能体平分这 1 分。若 `penalise_by_time` 为真，每一步每个智能体会额外获得 -0.01 的奖励。最优策略是智能体分散开来，各自负责一个不同的目标。这需要高度的协调与行为多样性。每个智能体可以观察到自身的位置和速度，以及每个目标的相对位置和是否已被其他智能体到达的标志。当所有目标都被到达时，环境结束。 | \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_f63a89a7931f.gif\" alt=\"drawing\" width=\"300\"\u002F> |\n| `transport.py` | 在此场景中，`n_agents` 个智能体、`n_packages` 个包裹（默认为 1）以及一个目标会随机生成在 -1 到 1 之间的位置。包裹是具有 `package_mass` 质量（默认为智能体质量的 50 倍）和指定尺寸的方块。任务是让智能体将所有包裹推送到目标位置。当所有包裹都与目标重叠时，场景结束。每个智能体获得的奖励与其推动包裹靠近目标的距离变化成正比。也就是说，向目标方向推动包裹会得到正奖励，而将其推开则会得到负奖励。一旦包裹与目标重叠，它会变成绿色，不再对奖励做出贡献。每个智能体可以观察到自己的位置、速度、相对于包裹的位置、包裹的速度、包裹与目标之间的相对位置，以及每个包裹是否已到达目标的标志。默认情况下，包裹非常重，单个智能体几乎无法推动它们。因此，智能体需要协作共同推动包裹，才能更快地完成任务。 | \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_1a55da5023bf.gif\" alt=\"drawing\" width=\"300\"\u002F> |\n| `reverse_transport.py` | 此场景与 `transport.py` 基本相同，唯一的区别是所有智能体都被困在一个包裹内部。 | \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_b191d1d9b094.gif\" alt=\"drawing\" width=\"300\"\u002F> |\n| `give_way.py` | 
在此场景中，两个智能体和两个目标被放置在一个狭窄的走廊里。每个智能体需要到达与其颜色对应的目标。由于两个智能体分别站在对方目标的前方，因此它们必须交换位置才能各自到达目标。走廊中间有一个不对称的开口，仅能容纳一个智能体通过。因此，最优策略是其中一个智能体主动让路给另一个智能体。这需要智能体表现出异质的行为。每个智能体可以观察到自己的位置、速度以及相对于目标的位置。当两个智能体都到达各自的目标时，场景结束。 | \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_e2555f0224d8.gif\" alt=\"drawing\" width=\"300\"\u002F> |\n| `wheel.py` | 在此场景中，`n_agents` 个智能体随机生成在 -1 到 1 之间的位置。一条长度为 `line_length`、质量为 `line_mass` 的杆状物被放置在场景中央，其一端固定在原点，可以自由旋转。智能体的目标是使杆的绝对角速度达到 `desired_velocity`。因此，仅仅让智能体在杆的两端施力是不够的：它们需要协同合作，使角速度精确达到而不超过目标值。每个智能体可以观察到自己的位置、速度、杆当前角度（模 π）、当前角速度与目标角速度之间的绝对差，以及相对于杆两端的位置。奖励由所有智能体共享，其数值为当前角速度与目标角速度之间绝对差的相反数（差距越小，奖励越高）。 | \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_83c4bfedd4d0.gif\" alt=\"drawing\" width=\"300\"\u002F> |\n| `balance.py` | 在此场景中，`n_agents` 个智能体均匀分布在线条下方，线上放置着一个质量为 `package_mass` 的球形包裹。团队和线条会在环境底部的随机 X 位置生成。环境中存在竖直方向的重力。如果 `random_package_pos_on_line` 为真（默认），包裹在线上的相对 X 位置则是随机的。环境的上半部分会生成一个目标。智能体的任务是将包裹运送到目标处。每个智能体获得的奖励与其推动包裹靠近目标的距离变化成正比。换句话说，越接近目标，奖励越高；反之则越低。如果包裹或线条掉落到地面，团队将受到 -10 的惩罚。每个智能体可以观察到自己的位置、速度、相对于包裹和线条的位置、包裹与目标之间的相对位置、包裹和线条的速度、线条的角速度以及线条旋转的角度（模 π）。当包裹或线条掉落，或者包裹触碰到目标时，场景结束。 | \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_8136de98b37b.gif\" alt=\"drawing\" width=\"300\"\u002F> |\n| `football.py` | 在此场景中，一支由 `n_blue_agents` 组成的蓝队与一支由 `n_red_agents` 组成的红队进行足球比赛。布尔参数 `ai_blue_agents` 和 `ai_red_agents` 决定每支球队是由玩家输入控制，还是由预设的 AI 控制。因此，足球既可以被视为合作任务，也可以被视为竞争任务。场景中的奖励可以根据 `dense_reward_ratio` 进行调整：值为 0 表示完全稀疏的奖励系统（进球得 1 分，失球得 -1 分）；值为 1 则表示完全密集的奖励系统（基于双方“进攻价值”的差异，该值考虑了球与球门的距离，以及是否存在畅通的带球或射门路线）。每个智能体可以观察到自己的位置、速度、相对于球的位置以及相对于球的速度。当一方球队进球时，比赛结束。 | \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_c99ea49a03a4.gif\" alt=\"drawing\" width=\"300\"\u002F> |\n| `discovery.py` | 在此场景中，`n_agents` 名智能体需要协调合作，在避免碰撞的同时尽快覆盖 `n_targets` 个目标。当每个目标都有 `agents_per_target` 名智能体接近到至少 `covering_range` 
的距离时，该目标即被视为被覆盖。目标被覆盖后，参与的智能体将获得奖励，而目标会重新出现在一个新的随机位置。如果智能体之间发生碰撞，将会受到惩罚。每个智能体可以观察到自己的位置、速度，以及使用 LIDAR 测量到其他智能体和目标的距离。场景将在固定的时间步数后结束。                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 | \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_3322093dfdd7.gif\" alt=\"drawing\" width=\"300\"\u002F>            | \n| `flocking.py`           | 在此场景中，`n_agents` 名智能体需要围绕一个目标进行群集飞行，同时保持彼此间的紧密联系并最大化速度，且不得与其他智能体或 `n_obstacles` 个障碍物发生碰撞。智能体因相互碰撞或与障碍物碰撞而受到惩罚，同时因提高速度和缩小群集范围（增强凝聚力）而获得奖励。每个智能体可以观察到自己的位置、速度，以及使用 LIDAR 测量到其他智能体的距离。场景将在固定的时间步数后结束。                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       | \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_d61ed0755b01.gif\" alt=\"drawing\" width=\"300\"\u002F>             |\n| `passage.py`            | 在此场景中，5 个机器人以编队形式随机生成在环境的底部区域。同样数量的目标也随机生成在环境的顶部区域。每个机器人需要到达与其对应的目标。环境中部有一堵墙，上面开有 `n_passages` 个通道。每个通道一次只能通过一个机器人。每个智能体获得的奖励与其与目标之间的距离变化成正比。也就是说，越靠近目标，奖励越高；反之则越低。如果设置了 `shared_reward`，这些奖励将由所有智能体共享。如果机器人之间发生碰撞，涉及的每个机器人将受到 -10 的惩罚。每个智能体可以观察到自己的位置、速度、相对于目标的位置，以及相对于每个通道中心的位置。当所有机器人到达各自的目标时，场景结束。                                                                                                                                                                                                                                                                                    | \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_9c9990512c4f.gif\" alt=\"drawing\" width=\"300\"\u002F>              |\n| `joint_passage_size.py` | 在这里，两个不同大小的机器人（蓝色圆圈）通过两个转动关节连接成一个联动装置，需要穿过一个通道，同时保持联动装置与通道平行，并最终到达另一侧的目标位置（绿色圆圈）。通道由一大一小两个缝隙组成，它们以随机顺序和位置出现在墙上，但两者之间的距离始终保持不变。团队以随机顺序和位置生成在通道的下侧，联动装置始终垂直于通道。目标则水平放置在通道的另一侧的随机位置。每个机器人可以观察自己的速度、相对于每个缝隙的位置，以及相对于目标中心的位置。机器人会收到一种特殊的全局奖励，引导他们安全穿过通道而不与通道发生碰撞。                                                                                                                                                                                                                                                                                                                                                                          | \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_797105841d5c.gif\" 
alt=\"drawing\" width=\"300\"\u002F>   |\n| `joint_passage.py`      | 此场景与 `joint_passage_size.py` 相同，唯一的区别在于机器人现在物理上完全相同，但联动装置的质量分布不均匀（黑色圆圈）。通道是一个单独的缝隙，随机出现在墙上。智能体需要穿过这个通道，同时保持联动装置垂直于墙壁，并避免碰撞。团队和目标分别以随机的位置、顺序和旋转角度生成在通道的两侧。                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        | \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_5341ca4763cc.gif\" alt=\"drawing\" width=\"300\"\u002F>        |\n| `ball_passage.py`       | 此场景与 `joint_passage.py` 相同，只是这次智能体之间没有联动装置，而是需要推动一个球穿过通道。奖励只与球相关，并经过特殊设计以引导球顺利通过通道。                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               | \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_8da73fd9c5ef.gif\" alt=\"drawing\" width=\"300\"\u002F>         |\n| `ball_trajectory.py`    | 此场景与 `circle_trajectory.py` 类似，唯一的区别在于轨迹奖励现在依赖于一个球体对象。两名智能体需要将球沿着圆形轨迹推动。如果设置 `joints=True`，智能体将通过联动装置与球相连。                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      | \u003Cimg 
src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_47433f1600e9.gif\" alt=\"drawing\" width=\"300\"\u002F>      |\n| `buzz_wire.py`          | 两个智能体通过联动装置与一个质量相连，需要在一个直线型走廊中玩“嗡嗡线游戏”（Buzz Wire game）。注意不要触碰边界，否则游戏立即结束！                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        | \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_8e533e17f7e1.gif\" alt=\"drawing\" width=\"300\"\u002F>            |\n| `multi_give_way.py`     | 此场景是 `give_way.py` 的扩展版本，其中四名智能体需要通过互相让路来各自到达目标。                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           | \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_9d2e413f783c.gif\" alt=\"drawing\" width=\"300\"\u002F>       |\n| `navigation.py`         | 随机生成的智能体需要导航到各自的目标。可以开启碰撞检测功能，智能体也可以使用 LIDAR 来避免相互碰撞。奖励可以选择共享或个体奖励。除了位置、速度和 LIDAR 数据外，每个智能体还可以选择只观察自己与目标之间的相对距离，或者观察自己与所有目标之间的相对距离（在这种情况下，需要智能体表现出异质性才能解决问题）。此外，该场景还允许多个智能体共享同一个目标。                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
| \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_2a5a00a73f4c.gif\" alt=\"drawing\" width=\"300\"\u002F> |\n| `sampling.py` | `n_agents` 个智能体随机生成在一个具有高斯分布密度函数的工作空间中，该密度函数由 `n_gaussians` 个模式组成。智能体需要在这个场中移动并收集样本。整个区域被划分为网格，一旦智能体访问某个格子，就会采集该格子的样本（样本不会重复出现），并将其作为奖励分配给整个团队（若 `shared_rew=False`，则仅分配给该智能体）。智能体可以使用 LIDAR 来感知彼此。除了 LIDAR、位置和速度观测之外，每个智能体还能观察到周围 3x3 格子内的样本数值。 | \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_ab9d1bde5e69.gif\" alt=\"drawing\" width=\"300\"\u002F> |\n| `wind_flocking.py` | 两名智能体需要以指定的距离向北方向进行群集飞行。它们将根据与参考方向的一致性以及自身速度向量的对齐程度获得奖励。场景中存在从北向南吹来的风。这两名智能体在物理特性上存在差异：较小的智能体具有一定的空气动力学特性，可以为较大的智能体遮挡风力，从而优化群集效果。因此，解决这一任务的最佳方案是智能体之间进行异质性的风阻协作。更多信息请参阅 [SND 论文](https:\u002F\u002Fmatteobettini.github.io\u002Fpublication\u002Fsystem-neural-diversity-measuring-behavioral-heterogeneity-in-multi-agent-learning\u002F)。 | \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_4ff9195efe30.gif\" alt=\"drawing\" width=\"300\"\u002F> |\n| `road_traffic.py` | 此场景提供了一个用于联网自动驾驶车辆（CAVs）的 MARL 基准测试，使用来自网络物理移动实验室（CPM Lab）的高清地图，该实验室是一个面向 CAVs 的开源测试平台。地图包含一个八车道交叉路口和一个环形高速公路，设有多个汇入和驶出匝道，能够模拟多种复杂的交通状况。预先定义了四十条环形参考路径，支持无限时长的仿真运行。您可以初始化最多 100 个智能体，但默认数量为 20。训练过程中如果发生碰撞，场景会重新初始化所有智能体，随机分配新的参考路径、初始位置和速度。这种设置旨在模拟真实驾驶环境中的不可预测性。此外，观测设计注重提升样本效率和泛化能力（即智能体对未见场景的适应能力）。同时，实现了自车视角和鸟瞰视角两种观测方式，并支持部分可观测的马尔可夫决策过程建模。更多信息请参阅 [这篇论文](https:\u002F\u002Farxiv.org\u002Fabs\u002F2408.07644)。 | \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_82971b43cccc.gif\" alt=\"drawing\" width=\"300\"\u002F> |
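\n\n上表中的大多数场景参数（如 `n_agents`、`package_mass`、`energy_coeff` 等）都可以在调用 `vmas.make_env` 时通过关键字参数传入场景。下面是一个示意草图（以 `transport` 为例；参数名称与默认值以各场景源码为准，数值仅作演示）：\n\n```python\nimport vmas\n\n# 通过 **kwargs 向场景传递初始化参数（较新版本在收到未知参数时会发出警告）\nenv = vmas.make_env(\n    scenario=\"transport\",\n    num_envs=32,\n    device=\"cpu\",\n    continuous_actions=True,\n    # 以下为 transport 场景的参数（见上表描述）\n    n_agents=4,\n    n_packages=1,\n    package_mass=50.0,\n)\nobs = env.reset()\n```\n\n#### 调试场景\n\n| 环境名称 | 描述 | 动图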
                                                                               |\n|------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------|\n| `waterfall.py`         | 在环境顶部生成 `n_agents` 个智能体。它们之间通过可碰撞的连杆相互连接，最后一个智能体连接着一个方块。每个智能体的奖励取决于其与底部黑线中心的距离。智能体需要到达这条线，在此过程中可能会相互碰撞以及与环境中的方块发生碰撞。 | \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_48c5d5d7546b.gif\" alt=\"drawing\" width=\"300\"\u002F>         |\n| `asym_joint.py`        | 两个智能体通过一个质量不对称的连杆相连。智能体的奖励是将连杆摆放到垂直位置，同时尽量减少团队的能量消耗。                                                                                                                                                                                                                                                                                  | \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_7d36dd4027a8.gif\" alt=\"drawing\" width=\"300\"\u002F>        |\n| `vel_control.py`       | 示例场景：三个智能体分别具有不同加速度约束的速度控制器。                                                                                                                                                                                                                                                                                                                                   | \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_72b6aba27348.gif\" alt=\"drawing\" width=\"300\"\u002F>       |\n| `goal.py`              | 在工作空间中随机生成一个带有速度控制器的智能体。它的奖励是移动到一个随机初始化的目标位置，同时尽量减少能量消耗。该智能体可以观察自己的速度以及相对于目标的位置。                                                                                                                                            | \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_5de27ccc7fc7.gif\" alt=\"drawing\" width=\"300\"\u002F>              | \n| `het_mass.py`          | 在工作空间中随机生成两个质量不同的智能体。它们的奖励是最大化团队的最大速度，同时最小化团队的能量消耗。最优策略要求重的智能体静止不动，而轻的智能体以最大速度运动。                                                                                                   | \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_351306cf3f59.gif\" alt=\"drawing\" width=\"300\"\u002F>          |\n| `line_trajectory.py`   | 一个智能体被奖励沿着直线轨迹运动。                                                                                                                                                                                                                                                                                                                                     | \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_13e38d4636f1.gif\" alt=\"drawing\" width=\"300\"\u002F>   |\n| `circle_trajectory.py` | 一个智能体被奖励以 `desired_radius` 为半径做圆周运动。                                                  
| \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_822742ecdf42.gif\" alt=\"drawing\" width=\"300\"\u002F> |\n| `diff_drive.py` | `diff_drive` 动力学模型约束的示例。两个智能体均可交互控制，并带有旋转动作。第一个智能体采用差速驱动动力学，第二个智能体则采用标准的 VMAS 全向动力学。 | \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_5a45022c3846.gif\" alt=\"drawing\" width=\"300\"\u002F> |\n| `kinematic_bicycle.py` | `kinematic_bicycle` 动力学模型约束的示例。两个智能体均可交互控制，并带有旋转动作。第一个智能体采用运动学自行车模型动力学，第二个智能体则采用标准的 VMAS 全向动力学。 | \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_e0b71eb68c18.gif\" alt=\"drawing\" width=\"300\"\u002F> |\n| `drone.py` | `drone` 动力学模型的示例。 | \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_readme_41a8996e0438.gif\" alt=\"drawing\" width=\"300\"\u002F> |
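\n\n上述调试场景很适合用交互模式快速试玩。下面是一个示意草图（假设可从包顶层导入 `render_interactively`，入口与参数以官方示例 `interactive_rendering.py` 为准）：\n\n```python\nfrom vmas import render_interactively\n\n# 交互式渲染一个调试场景，可用键盘实时控制智能体（示意）\nrender_interactively(\"diff_drive\", control_two_agents=True)\n```\n\n\n\n### [MPE](https:\u002F\u002Fgithub.com\u002Fopenai\u002Fmultiagent-particle-envs)\n\n| 代码中的环境名称（论文中名称） | 是否有通信？ | 是否竞争性？ | 备注 |\n|---|---|---|---|\n| `simple.py` | 否 | 否 | 单智能体观察地标位置，奖励基于其与地标之间的距离。这不是多智能体环境——用于策略调试。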
                                                                                                                                                                                          |\n| `simple_adversary.py`（物理欺骗）                       | 否              | 是            | 1个敌对智能体（红色），N个良性智能体（绿色），N个地标（通常N=2）。所有智能体都能观察地标和其他智能体的位置。其中一个地标是“目标地标”（绿色）。良性智能体的奖励基于其中任意一个接近目标地标的程度，但如果敌对智能体靠近目标地标，则会受到惩罚。敌对智能体的奖励基于其与目标地标的距离，但它并不知道哪个地标是目标地标。因此，良性智能体需要学会“分散”并覆盖所有地标，以迷惑敌对智能体。 |\n| `simple_crypto.py`（隐秘通信）                          | 是              | 是            | 两个良性智能体（alice和bob），一个敌对智能体（eve）。alice必须通过公共信道向bob发送一条秘密消息。alice和bob的奖励基于bob成功还原消息的程度，但如果eve也能还原消息，则会受到惩罚。alice和bob拥有一把在每轮开始时随机生成的密钥，他们必须学会利用这把密钥来加密消息。                                                                                                                                                             |\n| `simple_push.py`（保持距离）                            | 否              | 是            | 1个智能体、1个敌对智能体、1个地标。智能体的奖励基于其与地标的距离。敌对智能体则会在靠近地标且智能体远离地标时获得奖励。因此，敌对智能体会学习将智能体推离地标。                                                                                                                                                                                                                                                                                                                 |\n| `simple_reference.py`                                    | 是              | 否            | 2个智能体，3个不同颜色的地标。每个智能体都想到达自己专属的目标地标，而这个目标地标只有另一方智能体才知道。奖励是集体性的。因此，智能体需要学会相互沟通对方的目标，并导航到各自的地标。这与simple_speaker_listener场景相同，即两个智能体既是说话者又是倾听者。                                                                                                                                                                                    |\n| `simple_speaker_listener.py`（合作性通信）              | 是              | 否            | 与simple_reference相同，区别在于：一个智能体是“说话者”（灰色），不移动，负责观察另一智能体的目标；另一个智能体是“倾听者”（不能说话，但必须导航到正确的目标地标）。                                                                                                                                                                                                                                                                                                                                                      |\n| `simple_spread.py`（合作性导航）                        | 否              | 否            | N个智能体，N个地标。智能体的奖励基于任意一个智能体与每个地标之间的距离。如果智能体之间发生碰撞，则会受到惩罚。因此，智能体需要学会在避免碰撞的同时覆盖所有地标。                                                                                                                                                                                                                                                                                                                               |\n| `simple_tag.py`（捕食者-猎物）                          | 否              | 是            | 捕食者-猎物环境。良性智能体（绿色）速度较快，希望避免被敌对智能体（红色）击中。敌对智能体速度较慢，希望击中良性智能体。障碍物（大黑圆圈）会阻挡路径。                                                                                                                                                                                                                                                                                                                                                    |\n| `simple_world_comm.py`                                   | 是              | 是            | 论文中附带视频中展示的环境。与simple_tag相同，区别在于：(1) 存在食物（小蓝球），良性智能体靠近食物时会获得奖励；(2) 现在有“森林”，可以将智能体隐藏起来，使其从外部无法被看到；(3) 存在一名“领导者敌对智能体”，它可以始终看到所有智能体，并能与其他敌对智能体通信，以协调追捕行动。                                           
                                                                                                           |\n\n## 我们使用 VMAS 的论文\n\n- [VMAS](https:\u002F\u002Fmatteobettini.github.io\u002Fpublication\u002Fvmas-a-vectorized-multi-agent-simulator-for-collective-robot-learning\u002F) 支持 `balance`、`transport`、`give_way`、`wheel` 等任务的训练。\n- [HetGPPO](https:\u002F\u002Fmatteobettini.github.io\u002Fpublication\u002Fheterogeneous-multi-robot-reinforcement-learning\u002F) 支持 `het_mass`、`give_way`、`joint_passage`、`joint_passage_size` 等任务的训练。\n- [SND](https:\u002F\u002Fmatteobettini.github.io\u002Fpublication\u002Fsystem-neural-diversity-measuring-behavioral-heterogeneity-in-multi-agent-learning\u002F) 支持 `navigation`、`joint_passage`、`joint_passage_size`、`wind` 等任务的训练。\n- [TorchRL](https:\u002F\u002Fmatteobettini.com\u002Fpublication\u002Ftorchrl-a-data-driven-decision-making-library-for-pytorch\u002F) 支持 `navigation`、`sampling`、`balance` 等任务的训练。\n- [BenchMARL](https:\u002F\u002Fmatteobettini.com\u002Fpublication\u002Fbenchmarl\u002F) 支持 `navigation`、`sampling`、`balance` 等任务的训练。\n- [剑桥 RoboMaster](https:\u002F\u002Fmatteobettini.com\u002Fpublication\u002Frobomaster\u002F) 支持 `navigation` 任务的训练。\n- [DiversityControl (DiCo)](https:\u002F\u002Fmatteobettini.com\u002Fpublication\u002Fcontrolling-behavioral-diversity-in-multi-agent-reinforcement-learning\u002F) 支持 `navigation`、`sampling`、`dispersion`、`simple_tag` 等任务的训练。\n\n## 待办事项\n\n待办事项现已列在 [这里](https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fissues\u002F116)。\n\n- [X] 提升测试效率并添加新测试\n- [X] 实现 2D 无人机动力学\n- [X] 允许任意数量的动作\n- [X] 提高 VMAS 性能\n- [X] 在 torchrl 中支持字典型观测\n- [X] 将 TextLine 使能为场景中可用的 Geom\n- [X] 编写关于如何将 torch rl 与 vmas 结合使用的笔记本\n- [X] 允许字典型观测空间和多维观测\n- [X] 讨论动作预处理和速度控制器\n- [X] 来自联合项目的全新环境及其描述\n- [X] 讨论导航 \u002F 多目标问题\n- [X] 链接实验视频\n- [X] 添加 LiDAR 章节\n- [X] 实现 LiDAR\n- [X] 重写所有 MPE 场景\n  - [X] simple\n  - [x] simple_adversary\n  - [X] simple_crypto\n  - [X] simple_push\n  - [X] simple_reference\n  - [X] simple_speaker_listener\n  - [X] simple_spread\n  - [X] simple_tag\n  - [X] simple_world_comm","# VectorizedMultiAgentSimulator (VMAS) 快速上手指南\n\nVMAS 是一个基于 PyTorch 构建的向量化多智能体模拟器，专为高效的多智能体强化学习（MARL）基准测试设计。它支持在加速硬件上并行运行数万个环境，并兼容 Gymnasium、RLlib 和 TorchRL 等主流框架。\n\n## 环境准备\n\n*   **操作系统**: Linux, macOS, Windows\n*   **Python 版本**: 3.8, 3.9, 3.10, 3.11\n*   **核心依赖**: PyTorch\n*   **可选依赖**:\n    *   `gymnasium`: 用于 Gym 接口封装\n    *   `ray[rllib]`: 用于 RLlib 训练\n    *   `torchrl` \u002F `benchmarl`: 用于 TorchRL 生态训练\n    *   `pygame` 等：用于可视化渲染\n\n> **提示**：国内开发者建议使用清华源或阿里源加速 PyTorch 及相关依赖的安装。\n\n## 安装步骤\n\n### 1. 基础安装\n通过 pip 安装最新稳定版：\n```bash\npip install vmas -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple\n```\n\n### 2. 安装开发版（可选）\n如需使用最新功能（master 分支）：\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator.git\ncd VectorizedMultiAgentSimulator\npip install -e . -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple\n```\n\n### 3. 安装扩展功能\n根据需求安装额外依赖（推荐按需安装）：\n\n```bash\n# 安装 Gymnasium 封装支持\npip install vmas[gymnasium] -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple\n\n# 安装 RLlib 封装支持\npip install vmas[rllib] -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple\n\n# 安装渲染依赖（可视化需要）\npip install vmas[render] -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple\n\n# 一次性安装所有依赖\npip install vmas[all] -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple\n```\n\n### 4. 
安装训练框架（可选）\n```bash\npip install benchmarl -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple  # BenchMARL\npip install torchrl -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple   # TorchRL\npip install \"ray[rllib]\"==2.1.0 -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple # RLlib\n```\n\n## 基本使用\n\n以下示例展示如何创建一个包含 32 个并行环境的\"waterfall\"场景，并在 CPU 上运行一步仿真。\n\n```python\nimport vmas\n\n# 创建环境\nenv = vmas.make_env(\n    scenario=\"waterfall\",       # 场景名称（位于 scenarios 文件夹）\n    num_envs=32,                # 并行环境数量\n    device=\"cpu\",               # 运行设备：\"cpu\" 或 \"cuda\"\n    continuous_actions=True,    # 是否使用连续动作空间\n    wrapper=None,               # 封装类型：None, \"rllib\", \"gym\", \"gymnasium\" 等\n    max_steps=None,             # 最大步数，None 表示无限\n    seed=None,                  # 随机种子\n    dict_spaces=False,          # False: 元组空间 (tuple); True: 字典空间 (dict)\n    grad_enabled=False,         # 是否启用微分（允许梯度回传）\n    terminated_truncated=False, # 是否区分 terminated 和 truncated 标志\n    # **kwargs 可传递特定场景的初始化参数\n)\n\n# 重置环境\nobs = env.reset()\n\n# 执行一步仿真：原生（wrapper=None）环境的动作是“每个智能体一个张量”的列表，\n# 每个张量形状为 (num_envs, action_dim)。这里假设使用环境提供的\n# get_random_action 辅助方法采样随机动作；也可自行构造形状正确的张量\nactions = [env.get_random_action(agent) for agent in env.agents]\nobs, rews, dones, info = env.step(actions)\n# obs 与 rews 按智能体组织，dones 为形状 (num_envs,) 的张量\n\n# 如需可视化，可调用 env.render()（需安装渲染依赖）\n```\n\n**关键参数说明：**\n*   `scenario`: 指定要加载的场景（如 `\"balance\"`, `\"navigation\"`, `\"waterfall\"` 等）。\n*   `num_envs`: 利用向量化特性，可轻松设置为数千以充分利用 GPU。\n*   `device`: 设为 `\"cuda\"` 可大幅加速大规模并行仿真。\n*   `wrapper`: 若需对接特定算法库，可设置为 `\"gymnasium\"` 或 `\"rllib\"`。\n\n更多详细示例请查看项目根目录下的 `examples\u002Fuse_vmas_env.py` 脚本或官方 Colab 笔记本。
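\n\n若需对接仅支持单环境 Gym 风格接口的代码，可将 `wrapper` 设为 `\"gymnasium\"`。下面是一个最小示意草图（假设已安装 `vmas[gymnasium]`；gymnasium 封装面向单个环境，通常需要 `num_envs=1` 并启用 `terminated_truncated`，批量环境另有向量化封装，细节以官方文档为准）：\n\n```python\nimport vmas\n\n# 使用 gymnasium 封装创建单个环境（示意）\nenv = vmas.make_env(\n    scenario=\"balance\",\n    num_envs=1,\n    device=\"cpu\",\n    continuous_actions=True,\n    wrapper=\"gymnasium\",\n    terminated_truncated=True,  # gymnasium 风格的五元组返回\n)\n\nobs, info = env.reset()\naction = env.action_space.sample()  # 封装后可直接从动作空间采样\nobs, reward, terminated, truncated, info = env.step(action)\nenv.close()\n```","某机器人实验室的研究团队正在开发一套多无人机协同避障与编队控制算法，需要在大规模并行环境中验证强化学习策略的泛化能力。\n\n### 没有 VectorizedMultiAgentSimulator 时\n- **训练效率极低**：传统模拟器只能串行运行少量环境，收集一次有效训练数据需要数小时，导致算法迭代周期长达数周。\n- **物理引擎不可微**：底层物理计算不支持梯度回传，研究人员无法利用基于梯度的优化方法，限制了高级控制策略的探索。\n- **场景定制困难**：每增加一种新的传感器（如激光雷达）或修改碰撞逻辑，都需要重写大量底层代码，开发门槛高且易出错。\n- **框架兼容性差**：模拟器接口与主流强化学习库（如 TorchRL、RLlib）不匹配，团队需花费大量时间编写适配层而非核心算法。\n\n### 使用 VectorizedMultiAgentSimulator 后\n- **万级环境并行**：借助 PyTorch 向量化特性，单张 GPU 可同时运行上万个并行环境，数据采集速度提升数百倍，训练周期从数周缩短至数小时。\n- **端到端可微分**：内置的可微分 2D 物理引擎支持直接反向传播，团队成功实现了更精细的基于梯度的策略优化，显著提升了控制精度。\n- **模块化快速扩展**：通过简单的模块化接口，研究人员在几行代码内就添加了自定义 Lidar 传感器和弹性碰撞规则，新场景验证变得轻而易举。\n- **无缝生态集成**：原生兼容 Gymnasium 和 BenchMARL 等标准接口，团队直接调用现成的 MARL 算法库进行训练，实现了“开箱即用”。\n\nVectorizedMultiAgentSimulator 通过向量化加速与可微分物理引擎，将多智能体强化学习的研发效率从“手工小作坊”带入了“工业化量产”时代。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fproroklab_VectorizedMultiAgentSimulator_fd2dcbca.jpg","proroklab","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002Fproroklab_a47708b6.png","",null,"www.proroklab.org","https:\u002F\u002Fgithub.com\u002Fproroklab",[80,84],{"name":81,"color":82,"percentage":83},"Python","#3572A5",99.1,{"name":85,"color":86,"percentage":87},"TeX","#3D6117",0.9,555,106,"2026-04-15T03:55:40","GPL-3.0","Linux, macOS, Windows","非必需。支持 CPU 和 CUDA GPU（通过 device='cuda' 参数启用），利用 PyTorch 进行向量化加速，具体显存和 CUDA 版本取决于安装的 PyTorch 版本。","未说明（取决于并行环境数量 num_envs 和场景复杂度）",{"notes":96,"python":97,"dependencies":98},"该工具是一个基于 PyTorch 的向量化多智能体物理模拟器。默认安装仅包含核心依赖，若需渲染、测试或使用特定强化学习库（如 Gymnasium, RLlib），需通过 pip extras 单独安装（例如 pip install vmas[render]）。支持在 CPU 或 GPU 上运行大规模并行环境模拟。","3.8, 3.9, 3.10, 3.11",[99,100,101,102,103],"torch","gymnasium (可选)","ray[rllib] (可选，>=1.13, \u003C=2.2)","torchrl (可选)","benchmarl 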
(可选)",[14,13,105],"其他",[107,108,109,110,111,112,113,114,115,116,117,118,119,120,121,122,123,124,125,126],"gym","gym-environment","marl","multi-agent","multi-agent-learning","multi-agent-reinforcement-learning","multi-agent-simulation","multi-agent-systems","multi-robot","multi-robot-framework","multi-robot-sim","multi-robot-simulator","multi-robot-systems","pytorch","rllib","simulator","vectorization","vectorized","robotics","simulation","2026-03-27T02:49:30.150509","2026-04-17T08:22:56.132980",[130,135,140,145,150,155],{"id":131,"question_zh":132,"answer_zh":133,"source_url":134},37097,"如何在 VMAS 场景中启用稀疏奖励（Sparse Reward）设置？","可以通过修改场景参数来禁用密集奖励 shaping，从而只保留稀疏奖励。具体步骤是将 `pos_shaping_factor` 设置为 0。设置后，智能体将仅获得碰撞惩罚（agent_collision_penalty）和到达目标的最终奖励（final_reward），不再包含基于位置变化的稠密奖励。","https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fissues\u002F152",{"id":136,"question_zh":137,"answer_zh":138,"source_url":139},37098,"论文中提到的“平均回合奖励”（episode reward mean）是如何计算的？","“平均回合奖励”是指环境返回的奖励总和，计算范围是从回合开始直到 `done=True` 为止。为了得到单个数值，会对所有智能体的奖励求平均（如果是全局奖励则保持不变）。这里的“平均”指的是对训练批次中的一组回合（batch of training episodes）取平均值。当 `done=True` 返回后，不再统计后续动作的奖励。","https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fissues\u002F33",{"id":141,"question_zh":142,"answer_zh":143,"source_url":144},37099,"如何在交互式模式（interactive mode）下自定义固定的视图边界（viewer bound），以适配非中心原点的地图？","可以在 Scenario 类的 `make_world` 函数中添加一个自定义属性 `viewer_bound`，例如：`self.viewer_bound = torch.tensor([0, world_x_dim, 0, world_y_dim], device=device, dtype=torch.float32)`。然后修改源码 `vmas\u002Fsimulator\u002Fenvironment\u002Fenvironment.py` 中的渲染逻辑，检查该属性是否存在。如果存在，则使用 `self.scenario.viewer_bound` 设定的左右上下边界调用 `set_bounds`；否则沿用默认的对称边界设置。这允许地图原点位于左下角等非中心位置时正确渲染。","https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fissues\u002F92",{"id":146,"question_zh":147,"answer_zh":148,"source_url":149},37100,"如何使用 TorchRL 解决涉及通信的 MPE 环境（如 simple_reference）时遇到奖励数值异常或难以收敛的问题？","建议参考 BenchMARL 项目中针对 VMAS 微调过的超参数配置，特别是使用 MADDPG 算法且不共享策略参数（no policy parameter sharing）的情况。可以查看 BenchMARL 仓库中 `fine_tuned\u002Fvmas\u002Fconf` 目录下的配置文件。默认的 `maddpg_iddpg.yaml` 基线参数可能不足以解决通信任务，需要根据具体任务进行超参数调整或采用经过验证的微调参数。","https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fissues\u002F62",{"id":151,"question_zh":152,"answer_zh":153,"source_url":154},37101,"如果智能体的动力学模型不在库中（如离散动作或非物理变量输入），如何自定义环境的 step 函数？","虽然目前不能直接重写核心的 `step` 函数，但可以通过实现自定义的动力学模型来解决。例如，对于特殊的运动模型（如运动学自行车模型），社区已通过 Pull Request #68 将其集成。用户可以参考该实现，通过自定义 `process_action` 或在 Scenario 中定义特定的状态更新逻辑来适配非标准动力学，而无需修改底层步进函数。如果遇到特定用例，欢迎贡献代码合并到主库。","https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fissues\u002F64",{"id":156,"question_zh":157,"answer_zh":158,"source_url":159},37102,"在实现差速驱动机器人（differential drive robot）时，为什么方向约束在物理子步长（substeps）中被忽略导致仿真效果不佳？","这是因为核心的 `_integrate_state` 函数默认假设实体的方向与力（速度向量）是独立的。对于差速驱动机器人，旋转会同时改变速度向量的方向，而模拟器仅在每一步之前应用约束，忽略了子步长内的方向约束。解决方法是需要“黑客”式地修改积分逻辑，或者等待官方针对差速驱动动态模型的专门支持。用户需注意在前端提到的差速驱动模型可能存在此限制，需自行处理子步长内的方向一致性。","https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fissues\u002F46",[161,166,171,176,181,186,191,196,201,206,211,216,221,226,231,236,241,246,251,256],{"id":162,"version":163,"summary_zh":164,"released_at":165},297570,"VMAS-1.5.2","## 变更内容\n* [CI] 由 @matteobettini 在 https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fpull\u002F165 中引入 Matplotlib\n* 修复：修正联合通行奖励函数（问题 #145），由 @Benabdellah22 在 
https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fpull\u002F168 中完成\n* [版权] 由 @matteobettini 在 https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fpull\u002F171 中修复文件头\n* [体验] 由 @matteobettini 在 https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fpull\u002F172 中禁用 gymnasium 通知\n\n## 新贡献者\n* @Benabdellah22 在 https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fpull\u002F168 中完成了首次贡献\n\n**完整变更日志**：https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fcompare\u002F1.5.1...VMAS-1.5.2","2025-11-10T10:08:01",{"id":167,"version":168,"summary_zh":169,"released_at":170},297571,"1.5.1","## 变更内容\n* [教程] @matteobettini 在 https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fpull\u002F151 中新增了关于创建场景并进行训练的教程\n* [功能] @matteobettini 在 https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fpull\u002F154 中实现了本地种子功能\n* [Bug修复] @matteobettini 在 https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fpull\u002F156 中修复了足球环境中的 `u_shoot_multiplier` 问题\n* @Schopenhauer-loves-Hegel 在 https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fpull\u002F157 中修复了在使用 CUDA 时 road_traffic 场景的 bug\n* @itwasabhi 在 https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fpull\u002F160 中确保 `pos_shaping_factor` 能够正确传递到 buzz_wire 环境\n\n## 新贡献者\n* @Schopenhauer-loves-Hegel 在 https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fpull\u002F157 中完成了首次贡献\n* @itwasabhi 在 https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fpull\u002F160 中完成了首次贡献\n\n**完整变更日志**: https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fcompare\u002F1.5.0...1.5.1","2025-09-30T18:02:43",{"id":172,"version":173,"summary_zh":174,"released_at":175},297572,"1.5.0","# 足球场景升级\r\n![football_vmas-ezgif com-optimize](https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F2bff7aef-6fac-4865-bd2a-be385a8cf740)\r\n\r\n我们刚刚对 VMAS 足球场景进行了全面升级！\r\n\r\n新增了奖励机制、观测信息、球员体型差异选项、阵型生成方式以及价值函数可视化功能。\r\n\r\n快来体验一场紧张刺激的 N 对 M 连续控制博弈吧！\r\n\r\n此外，我们还提供了一个可调强度的人工对手，其精准度、决策能力和速度均可调节。\r\n\r\n以下是本次更新的重点内容：\r\n- **全新奖励机制**：稀疏奖励用于进球，密集奖励用于将球向球门推进，以及密集奖励用于至少让一名队员靠近皮球。这些奖励可以在课程式训练中进行调整或移除。\r\n- **可调强度的人工对手**：您可以使用该人工对手作为基准对手，以公平地评估不同算法的性能。该对手（在场景代码中称为 AI）具有三种可调参数，每种参数的取值范围为 0 到 1，分别是 **速度强度、决策强度和精准强度**。其中，**速度强度**直接决定所有动作的速度，包括带球和无球跑动。当取值为 0 时，队员完全静止；取值为 1 时，则会以最大允许速度持续行动。**决策强度**通过向决策所依据的价值函数添加噪声来影响队员的决策行为，这不仅会影响移动方向的选择（例如，队员可能会选择前往价值较低的位置，尤其是在各价值相近时），还会影响控球权的判断（例如，队员可能无法准确判断球权归属，从而错误地将球传给队友）。最后，**精准强度**则控制队员执行预设动作的能力。\r\n- **球员体型差异**：您可以开启球员体型差异选项，使不同角色的球员具备不同的身体特征。守门员会更高大且移动较慢，而前锋则更矮小且速度更快。\r\n- **阵型生成**：您可以让队员随机分布，也可以按照预设的阵型进行生成。\r\n- **踢球与非踢球模式**：默认情况下，队员仅通过二维移动和碰撞来移动及推动皮球。若开启射门动作，队员将新增两个连续力控制的动作，分别用于旋转并踢出皮球。（所有动作均可像其他 VMAS 场景一样离散化）\r\n- **自我对弈**：两支球队的观测信息和奖励机制均采用镜像设计。因此，如果您不想使用人工对手，而是希望采用自我对弈或其他同时控制双方球队的框架，也是完全可行的。\r\n\r\n有关该场景的更多技术细节，请参阅 [这篇论文](https:\u002F\u002Farxiv.org\u002Fabs\u002F2412.16244) 的 C.1 节。\n\n## 变更内容\r\n* [BugFix] Windows 路径问题，由 @matteobettini 在 https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fpull\u002F148 中修复\r\n* [Minor] 更新 CPM 道路交通场景的链接，由 @Jianye-Xu 在 https:\u002F","2025-02-02T17:15:16",{"id":177,"version":178,"summary_zh":179,"released_at":180},297573,"1.4.3","一如既往，我们带来了大量新更新，而这次的更新大多由社区驱动。\n\n## 概述\n\n### 终止\u002F截断支持与 Gymnasium 封装 
#143\n\n感谢 @LukasSchaefer，我们现在可以选择从 vmas 环境中返回终止和截断状态，而不是 done 标志。\n\n此外，我们还提供了针对 gymnasium 的封装（单个 vmas 环境）以及向量化 gymnasium 封装，后者可以包装一批 vmas 环境，并在数据中保留 `n_envs` 维度。\n\n### 自定义连续动作的离散化方式 #119\n\n此前，连续动作仅提供三种离散化选项：增加、减少、保持不变。\n\n现在，得益于 @rikifunt 的贡献，你可以自由选择将连续动作范围划分为多少个离散选项！\n\n### 向量化激光雷达 #124\n\n激光雷达具有多条射线，能够感知多个实体。感谢 @Zartris 的工作，我们现在对这两个维度都实现了向量化处理，从而使得射线数量的扩展变得无缝衔接。\n\n### 可视化世界边界 #142\n\n感谢 @Giovannibriglia，如果你在场景中设置了世界边界，VMAS 现在会自动绘制这些边界。\n\n### 更多动力学模式 #125\n\n- 静态：无移动动作\n- 前进：仅沿朝向方向施加前进或后退的力\n- 旋转：仅执行旋转动作\n\n## 破坏性变更\n\n- `env.unwrapped()` -> `env.unwrapped` 在 `GymWrapper` 中\n- 从观测中移除了冗余的 `agent.state.pos`\n\n## 已更改内容\n* [功能] 关节旋转偏移及更多动力学模式，由 @matteobettini 在 https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fpull\u002F125 中实现\n* [BugFix] 修复关节推断角度问题，由 @matteobettini 在 https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fpull\u002F127 中完成\n* [功能] 如果实体生成耗时过长则发出警告，由 @matteobettini 在 https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fpull\u002F128 中实现\n* [功能] 为每个动作维度设置不同数量的动作选项（离散动作），由 @rikifunt 在 https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fpull\u002F119 中实现\n* 引入 `x_semidim` 和 `y_semidim` 参数，以支持自定义环境尺寸，由 @Giovannibriglia 在 https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fpull\u002F133 中实现\n* [BugFix] 修复发现任务中的观测问题，由 @matteobettini 在 https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fpull\u002F137 中完成\n* [Bug fix] 导航场景：确保实体放置在受限环境边界内，由 @Giovannibriglia 在 https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fpull\u002F139 中完成\n* [功能] 允许在导航、发现和群集场景中设置激光雷达射线数量，由 @Giovannibriglia 在 https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fpull\u002F141 中实现\n* [功能] 为有限大小环境添加边界可视化功能，由 @Giovannibriglia 在 https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fpull\u002F142 中实现\n* [功能] 终止\u002F截断支持及 Gymnasium 封装，由 @LukasSchaefer 在 https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002F 中实现","2024-09-21T14:21:27",{"id":182,"version":183,"summary_zh":184,"released_at":185},297574,"1.4.2","# VMAS 1.4.2\n\n这次我们带来了**大量**令人兴奋的更新和社区贡献。\n\n## 新的道路交通场景\n\n首先，也是最重要的，得益于 @Jianye-Xu 的出色工作，我们新增了一个 `road_traffic` 场景！\n\n这里有一张动图先睹为快：\n\n\u003Cimg src=\"https:\u002F\u002Fgithub.com\u002Fmatteobettini\u002Fvmas-media\u002Fblob\u002Fmain\u002Fmedia\u002Fscenarios\u002Froad_traffic_cpm_lab.gif?raw=true\" alt=\"drawing\" width=\"500\"\u002F>\n\n## 关节功能更加强大\n\n在 https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fpull\u002F113 中，我们对关节功能进行了重大改进。\n\n现在你可以将多个刚体连接起来，形成新的物理体，并共享所有的力和扭矩。\n\n请查看 https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fpull\u002F113 中的渲染示例，了解目前在 VMAS 中使用关节可以实现的效果。\n\n这里有一个预览：\n\n![342393854-cc920e53-cb03-4bc4-99ce-0c58e27ef08e-ezgif com-video-to-gif-converter (1)](https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fassets\u002F55539777\u002F87b5a4a5-88ba-4445-b443-9c6e4867c495)\n\n## LiDAR 功能改进\n\n@Zartris 将 LiDAR 作为自己的主要任务，并推动了许多实用的 LiDAR 功能开发：\n\n- LiDAR 角度现在相对于智能体的旋转角度进行计算 https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fpull\u002F101\n- LiDAR 渲染现可选择是否显示 https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fpull\u002F108\n- LiDAR 渲染与透明度进一步优化 
https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fpull\u002F115\n\n## 新增 `pre_step` 和 `post_step` 场景函数\n\n同样来自 @Zartris 的贡献，通过添加这两个函数，创建场景变得更加简单便捷。\n\n详情请参阅 https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fpull\u002F112。\n\n## 场景中的关键字参数检查\n\n感谢 @KaleabTessera 的工作，我们现在有了更好的关键字参数检查机制。如果你向场景传递了意外的参数，系统会发出警告。\n\n更多信息请见：https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fpull\u002F117。\n\n## 变更内容\n* [渲染] 仅在必要时导入 `matplotlib`，由 @matteobettini 在 https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fpull\u002F96 中实现。\n* [功能] 使 LiDAR 角度相对于智能体旋转角度，由 @matteobettini 在 https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fpull\u002F101 中实现。\n* [BugFix] 修复动力学问题，由 @matteobettini 在 https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fpull\u002F102 中完成。\n* [测试] 为 Apple Silicon 架构添加 macOS 运行器，由 @matteobettini 在 https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fpull\u002F103 中实现。\n* [CI] 修复 CI 流程，由 @matteobettini 在 https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fpull\u002F105 中完成。\n* [功能] 添加传感器渲染的可选开关，由 @matteobettini 在 https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fpull\u002F108 中实现。\n* [功能] 移除多余的 `observation` 调用，由 @matteobettini 在 https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fpull\u002F111 中完成。\n* [功能] 向 BaseScenario 添加 `pre_step` 和 `post_step` 函数，由 @Zartris 在 https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectoriz","2024-07-10T09:49:21",{"id":187,"version":188,"summary_zh":189,"released_at":190},297575,"1.4.1","# 亮点\n\n- @gy2256 在 https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fpull\u002F83 中实现了无人机动力学！！！\n- @kfu02 在 https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fpull\u002F91 中修复了一个错误：运输场景中未将包裹质量传递给构造函数。\n- @matteobettini 在 https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fpull\u002F93 中引入了 `scenario.render_origin`。\n- @matteobettini 在 https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fpull\u002F95 中增加了对 Python 3.11 的兼容性。\n- 更完善的测试和跨平台 CI 测试，详见 https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fpull\u002F89。\n- VMAS 现在有了文档！\n\n## 变更内容\n* [动力学] @gy2256 在 https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fpull\u002F83 中实现了无人机动力学。\n* [代码质量] @matteobettini 在 https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fpull\u002F87 中进行了 pre-commit 格式化。\n* [文档] @matteobettini 在 https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fpull\u002F85 中首次添加了基础文档。\n* [文档] @matteobettini 在 https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fpull\u002F88 中编写了使用说明。\n* [测试] @matteobettini 在 https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fpull\u002F89 中改进了测试并支持多操作系统。\n* [笔记本] @matteobettini 在 https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fpull\u002F90 中更新了笔记本。\n* [Bug] @kfu02 在 https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fpull\u002F91 中确保包裹质量参数现在会传递到构造函数中。\n* [渲染] @matteobettini 在 https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fpull\u002F93 中引入了 
`scenario.render_origin`。\n* [质量] @matteobettini 在 https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fpull\u002F95 中增加了对 Python 3.11 的支持。\n\n## 新贡献者\n* @kfu02 在 https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fpull\u002F91 中完成了他们的首次贡献。\n\n**完整变更日志**: https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fcompare\u002FVMAS-1.4.0...VMAS-1.4.1","2024-05-20T13:41:13",{"id":192,"version":193,"summary_zh":194,"released_at":195},297576,"VMAS-1.4.0","# 可微分的 VMAS\n\n**没错，VMAS 现在已经完全可微了！**\n\n## 我该如何使用它？\n\n只需在环境初始化时传入 `grad_enabled=True`，并确保输入中有需要计算梯度的张量即可。这些输入可以是动作或场景参数，VMAS 会持续跟踪其上的计算图。\n\n## 这意味着什么？\n\n这意味着你可以对 VMAS 的任何输出进行求导，从而实现对转移动态、奖励函数和观测函数的微分。\n\n## 为什么这很有用？\n\n现在，你可以在 VMAS 场景中优化参数（例如各种场景函数的参数，或者仅仅是初始状态值），方法是基于奖励或观测计算损失。此外，它还允许你通过时间（即模拟步骤）进行反向传播。
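\n\n下面是一个最小示意草图（假设使用 `balance` 场景、每个智能体的动作维度为 2；仅演示梯度能够穿过 `env.step` 回传）：\n\n```python\nimport torch\nimport vmas\n\nenv = vmas.make_env(\n    scenario=\"balance\",\n    num_envs=8,\n    device=\"cpu\",\n    continuous_actions=True,\n    grad_enabled=True,  # 保留计算图\n)\nobs = env.reset()\n\n# 假设性的可学习参数：为每个智能体设置一个恒定动作偏置（仅作演示）\nbiases = [torch.zeros(8, 2, requires_grad=True) for _ in env.agents]\n\n# 动作经 tanh 限幅后送入环境；奖励对偏置可导\nobs, rews, dones, info = env.step([b.tanh() for b in biases])\nloss = -torch.stack(rews).sum()  # 以负的总奖励作为损失\nloss.backward()                  # 通过模拟步反向传播\nprint(biases[0].grad)            # 偏置获得了来自物理引擎的梯度\n```\n\n## 变更内容\n* [特性] 可微分的 VMAS，由 @matteobettini 在 https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fpull\u002F80 中实现\n* [重构] 驱动动力学的可微化，由 @matteobettini 在 https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fpull\u002F81 中实现\n\n\n**完整变更日志**：https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fcompare\u002FVMAS-1.3.4...VMAS-1.4.0","2024-02-07T10:37:57",{"id":197,"version":198,"summary_zh":199,"released_at":200},297577,"VMAS-1.3.4","# 亮点\n\n- **VMAS 中的动作现已与物理引擎解耦**。这使得在创建智能体时可以定义自定义的动力学模型（`Holonomic`、`HolonomicWithRotation`、`DiffDrive`、`KinematicBicycle`），并支持额外的自定义动作。详情请参阅：https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fpull\u002F76\n- **控制李雅普诺夫控制器现已作为启发式策略集成到 VMAS 中**！@gy2256 在 https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fpull\u002F74 中实现了这一功能。\n- **修复了重置后仍可执行的场景问题**（例如运输、丢包等），并在 https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fpull\u002F78 中为场景添加了更多关键字参数。\n\n## 变更内容\n* 添加了由 @gy2256 实现的带有控制李雅普诺夫控制器的启发式策略，详见 https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fpull\u002F74\n* 【特性】动作与物理引擎解耦，并支持任意数量的动作，由 @matteobettini 在 https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fpull\u002F76 中实现\n* 【特性】更新场景功能，由 @matteobettini 在 https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fpull\u002F78 中实现\n\n## 新贡献者\n* @gy2256 在 https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fpull\u002F74 中完成了首次贡献\n\n**完整变更日志**：https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fcompare\u002FVMAS-1.3.3...VMAS-1.3.4","2024-01-19T11:30:28",{"id":202,"version":203,"summary_zh":204,"released_at":205},297578,"VMAS-1.3.3","修复了与新的向量化约束相关的错误\n\n**完整更新日志**：https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fcompare\u002FVMAS-1.3.0...VMAS-1.3.3","2023-12-19T09:25:10",{"id":207,"version":208,"summary_zh":209,"released_at":210},297579,"VMAS-1.3.0","# 向量化物理引擎\n\n**如果您正在使用 vmas，现在正是拉取代码的时候。**\n\n物理引擎已被重写，能够在智能体维度上进行向量化运行（而不仅仅是环境维度），同时对 vmas 的逻辑和接口没有任何改动。\n\n## 变化\n\n- _之前_：vmas 在环境维度上实现了向量化，但用于解决碰撞和约束的物理引擎仍然使用双重循环遍历所有实体。\n- _现在_：双重循环遍历实体的方式已被向量化替代，使得 vmas 在智能体和环境两个维度上都实现了向量化，同时仍保留了智能体之间异构性的可能性。\n\n## 意义\n\n- 对于普通环境，性能提升可达约 10 倍。\n- 对于可碰撞实体密度较高的环境，性能提升将非常显著（尤其是在 GPU 上），最高可达 10000 倍。\n- 使用大量不同形状的智能体和实体来模拟环境时，性能将显著提升，并且随着实体数量的增加，计算扩展性也会更好。\n\n# 类汽车机器人动力学\n\n感谢 @Jianye-Xu，vmas 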
现在支持类汽车机器人的动力学模型。这意味着您可以创建自己喜爱的交通场景。请查看[专用场景](https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fblob\u002Fmain\u002Fvmas\u002Fscenarios\u002Fdebug\u002Fkinematic_bicycle.py)，了解其具体实现方式。\n\n## 变更内容\n* [重构] `use_vmas_env` 示例兼容 3D 动作，由 @matteobettini 完成，详见 https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fpull\u002F69\n* [新功能] 新增类汽车机器人动力学：运动学自行车模型，由 @Jianye-Xu 实现，详见 https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fpull\u002F68\n* [渲染] 支持选择颜色映射进行绘图，由 @matteobettini 完成，详见 https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fpull\u002F70\n* [性能优化] 向量化碰撞及其他重大性能改进，由 @matteobettini 完成，详见 https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fpull\u002F71\n\n## 新贡献者\n* @Jianye-Xu 在 https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fpull\u002F68 中完成了首次贡献。\n\n**完整变更日志**：https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fcompare\u002FVMAS-1.2.13...VMAS-1.3.0","2023-12-02T12:11:15",{"id":212,"version":213,"summary_zh":214,"released_at":215},297580,"VMAS-1.2.13","- Fixes to MPE resetting\r\n- Introduces naming convention `\u003Cname>_\u003Cint>` for all scenarios (to be used in torchrl for automatic grouping https:\u002F\u002Fgithub.com\u002Fpytorch\u002Frl\u002Fpull\u002F1658)","2023-11-02T16:05:41",{"id":217,"version":218,"summary_zh":219,"released_at":220},297581,"VMAS-1.2.12","## What's Changed\r\n* [Feature] Text is a Geom and can be returned by scenarios by @matteobettini in https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fpull\u002F54\r\n* [detach() on input actions](https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fcommit\u002Fdfbf19342fda9bccc6591668c964ccae77cd9eef)\r\n* [DOCS] Torchrl notebook link by @matteobettini in https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fpull\u002F59\r\n\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fcompare\u002FVMAS-1.2.11...VMAS-1.2.12","2023-09-20T21:32:53",{"id":222,"version":223,"summary_zh":224,"released_at":225},297582,"VMAS-1.2.11","- Differential drive robot dynamics now available b9199e8b8365b74f2f6ec2d7ffca3ed74ac44941\r\n- Observations can be also dictionaries of tensors now 25250a0d011bca5de083710cd44e7a03cf0d7259","2023-05-24T09:18:35",{"id":227,"version":228,"summary_zh":229,"released_at":230},297583,"VMAS-1.2.10","Many new exciting things in vmas!:\r\n\r\n- [New scenarios: sampling, navigation, wind_flocking](https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fcommit\u002Ff6b73596cf46283b7aea35b84861d26560a7044f)\r\n- Fixed a rendering memory leak\r\n- [interactive_rendering.py accepts both scenario names and scenario class](https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator\u002Fcommit\u002F88d252a1d5bacc60e3b2f1f2475f92f23c50e798)\r\n- external forces (e.g. 
gravity) can be defined per agent\r\n- improved flocking scenario","2023-05-10T16:40:18",{"id":232,"version":233,"summary_zh":234,"released_at":235},297584,"VMAS-1.2.9","- Fix a bug on the viewer device\r\n- Now cloning all output and input from vmas simulator\r\n- Ready for MAPPO IPPO example in torch rl (https:\u002F\u002Fgithub.com\u002Fpytorch\u002Frl\u002Fpull\u002F1027)","2023-04-06T08:33:25",{"id":237,"version":238,"summary_zh":239,"released_at":240},297585,"VMAS-1.2.8","- vmas can now also use dictionaries with agent names as keys for input and output spaces (instead of tuples) https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator#input-and-output-spaces","2023-03-20T14:54:35",{"id":242,"version":243,"summary_zh":244,"released_at":245},297586,"VMAS-1.2.7","- Possibility of plotting a function under rendering https:\u002F\u002Fgithub.com\u002Fproroklab\u002FVectorizedMultiAgentSimulator#plot-function-under-rendering\r\n- Possibility of passing a scenario class to `make_env` instead of only being able to pass a scenario name","2023-03-20T11:46:13",{"id":247,"version":248,"summary_zh":249,"released_at":250},297587,"VMAS-1.2.6","- now env has `done()` function which wraps horizon (step counting)","2023-01-20T08:55:29",{"id":252,"version":253,"summary_zh":254,"released_at":255},297588,"VMAS-1.2.5","- introduced `vmas_env.to(device)` which allows to change the device a vmas environment is using during execution","2023-01-19T23:30:15",{"id":257,"version":258,"summary_zh":259,"released_at":260},297589,"VMAS-1.2.4","- fix a packaging issue on pypi","2023-01-18T14:26:10"]