[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"tool-PKU-MARL--HARL":3,"similar-PKU-MARL--HARL":63},{"id":4,"github_repo":5,"name":6,"description_en":7,"description_zh":8,"ai_summary_zh":8,"readme_en":9,"readme_zh":10,"quickstart_zh":11,"use_case_zh":12,"hero_image_url":13,"owner_login":14,"owner_name":14,"owner_avatar_url":15,"owner_bio":16,"owner_company":17,"owner_location":17,"owner_email":18,"owner_twitter":17,"owner_website":17,"owner_url":19,"languages":20,"stars":33,"forks":34,"last_commit_at":35,"license":17,"difficulty_score":36,"env_os":37,"env_gpu":38,"env_ram":39,"env_deps":40,"category_tags":54,"github_topics":17,"view_count":57,"oss_zip_url":17,"oss_zip_packed_at":17,"status":58,"created_at":59,"updated_at":60,"faqs":61,"releases":62},8514,"PKU-MARL\u002FHARL","HARL","Official implementation of HARL algorithms based on PyTorch.","HARL 是一个基于 PyTorch 开发的开源强化学习算法库，专注于解决多智能体系统中的“异构”协作难题。在传统方法中，为了让不同特性的智能体协同工作，往往需要强制它们共享参数，这不仅限制了模型能力，也难以应对复杂的现实场景。HARL 摒弃了这种限制，提出了一套无需参数共享即可实现高效合作的解决方案。\n\n其核心亮点在于采用了独特的“顺序更新机制”，区别于主流算法的同步更新方式。这一设计不仅拥有严格的数学理论支撑，保证了训练过程的单调改进和收敛性，还在多个高难度基准测试中展现出超越经典算法（如 MAPPO）的性能。库内集成了 HAPPO、HASAC 等七种前沿算法，并预置了包括星际争霸、多足机器人控制及无人机对抗在内的七个主流实验环境接口，方便用户直接上手。\n\nHARL 非常适合人工智能领域的研究人员、算法工程师以及高校师生使用。无论是希望复现顶级会议论文结果，还是需要将先进的多智能体协作技术应用到机器人控制、游戏博弈或复杂调度系统中，HARL 都能提供坚实的理论保障与便捷的代码实现，是探索异构多智能体强化学习的得力助手。","\u003Cdiv align=\"center\">\n  \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FPKU-MARL_HARL_readme_3f39e2b6bf27.jpg\" width=\"300px\" height=\"auto\"\u002F>\n\u003C\u002Fdiv>\n\n\u003Ch1 align=\"center\"> Heterogeneous-Agent Reinforcement Learning \u003C\u002Fh1>\n\nThis repository contains the **official implementation** of **Heterogeneous-Agent Reinforcement Learning (HARL)** algorithms, including **HAPPO**, **HATRPO**, **HAA2C**, **HADDPG**, **HATD3**, **HAD3QN**, and **HASAC**, based on PyTorch. 
***HARL algorithms are markedly different from MAPPO in that they are generally applicable to heterogeneous agents and are supported by rigorous theories, often achieving superior performance.*** This repository allows researchers and practitioners to easily reproduce our results on seven challenging benchmarks or apply HARL algorithms to their intended applications.\n\n\n\n## Overview\n\nHARL algorithms are our novel solutions to achieving effective multi-agent cooperation in the general *heterogeneous-agent* settings, without relying on the restrictive *parameter-sharing* trick. \n\n### Key features\n\n- HARL algorithms achieve coordinated agent updates by employing the *sequential update scheme*, different from the *simultaneous update scheme* utilized by MAPPO and MADDPG.\n- HARL algorithms enjoy theoretical guarantees of **monotonic improvement** and **convergence to equilibrium**, ensuring their efficacy in promoting cooperative behavior among agents.\n- Both on-policy and off-policy HARL algorithms, exemplified by **HAPPO** and **HASAC** respectively, demonstrate superior performance across a wide range of benchmarks.\n\nThe following figure is an illustration of the *sequential update scheme*\n\u003Cdiv align=\"center\">\n  \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FPKU-MARL_HARL_readme_66d9af49fb12.jpg\" width=\"100%\"\u002F>\n\u003C\u002Fdiv>\n\nFor more details, please refer to our [HARL](https:\u002F\u002Fjmlr.org\u002Fpapers\u002Fv25\u002F23-0488.html) (JMLR 2024) and [MEHARL](https:\u002F\u002Fopenreview.net\u002Fforum?id=tmqOhBC4a5) (ICLR 2024 spotlight) papers.\n\n\n## Installation\n\n### Install HARL\n\n```shell\nconda create -n harl python=3.8\nconda activate harl\n# Install pytorch>=1.9.0 (CUDA>=11.0) manually\ngit clone https:\u002F\u002Fgithub.com\u002FPKU-MARL\u002FHARL.git\ncd HARL\npip install -e .\n```\n\n\n\n### Install Environments Dependencies\n\nAlong with HARL algorithms, we also implement the interfaces for 
seven common environments ([SMAC](https:\u002F\u002Fgithub.com\u002Foxwhirl\u002Fsmac), [SMACv2](https:\u002F\u002Fgithub.com\u002Foxwhirl\u002Fsmacv2), [MAMuJoCo](https:\u002F\u002Fgithub.com\u002Fschroederdewitt\u002Fmultiagent_mujoco), [MPE](https:\u002F\u002Fpettingzoo.farama.org\u002Fenvironments\u002Fmpe\u002F), [Google Research Football](https:\u002F\u002Fgithub.com\u002Fgoogle-research\u002Ffootball), [Bi-DexterousHands](https:\u002F\u002Fgithub.com\u002FPKU-MARL\u002FDexterousHands), [Light Aircraft Game](https:\u002F\u002Fgithub.com\u002Fliuqh16\u002FCloseAirCombat)), and they can be used directly. (We also implement the interface for [Gym](https:\u002F\u002Fwww.gymlibrary.dev\u002F). Gym is a single-agent environment, which can be seen as a special case of multi-agent environments. It is included mainly for reference purposes.) You may choose to install the dependencies of the environments you want to use. **After the installation, please follow the steps in the [\"Solve Dependencies\"](https:\u002F\u002Fgithub.com\u002FPKU-MARL\u002FHARL?tab=readme-ov-file#solve-dependencies) section.**\n\n**Install Dependencies of Bi-DexterousHands**\n\nBi-DexterousHands depends on IsaacGym, whose hardware requirements have to be satisfied. To install IsaacGym, download the IsaacGym Preview 4 release from [its official website](https:\u002F\u002Fdeveloper.nvidia.com\u002Fisaac-gym\u002Fdownload), then run `pip install -e .` under its `python` folder.\n\n**Install Light Aircraft Game**\n\n[Light Aircraft Game](https:\u002F\u002Fgithub.com\u002Fliuqh16\u002FCloseAirCombat) (LAG) is a recently developed cooperative-competitive environment for red and blue aircraft games, offering various settings such as single control, 1v1, and 2v2 scenarios. In the context of multi-agent scenarios, LAG currently supports self-play only for 2v2 settings. 
To address this limitation, we introduce novel cooperative non-weapon and shoot-missile tasks where two agents collaborate to combat two opponents controlled by the built-in AI. In the non-weapon task, agents are trained to fly towards the opponents' tails while maintaining a suitable distance. In the shoot-missile task, agents learn to dodge opponent missiles and launch their own missiles to destroy the opponents.\n\nTo install LAG, run the following command:\n```shell\n# Install dependencies\npip install torch pymap3d jsbsim==1.1.6 geographiclib gym==0.21.0 wandb icecream setproctitle\n# Initialize submodules(*JSBSim-Team\u002Fjsbsim*)\ngit submodule init\ngit submodule update\n```\n\n**Install Google Research Football**\n\nPlease follow [the official instructions](https:\u002F\u002Fgithub.com\u002Fgoogle-research\u002Ffootball) to install Google Research Football.\n\n**Install SMAC**\n\nPlease follow [the official instructions](https:\u002F\u002Fgithub.com\u002Foxwhirl\u002Fsmac) to install SMAC. 
We use StarCraft II version 4.10 on Linux.\n\n**Install SMACv2**\n\nPlease follow [the official instructions](https:\u002F\u002Fgithub.com\u002Foxwhirl\u002Fsmacv2) to install SMACv2.\n\n**Install MPE**\n\n```shell\npip install pettingzoo==1.22.2\npip install supersuit==3.7.0\n```\n\n**Install Gym Suite (Except MuJoCo)**\n\n```shell\n# Install gym\npip install gym\n# Install classic control\npip install gym[classic_control]\n# Install box2d\nconda install -c anaconda swig\npip install gym[box2d]\n# Install atari\npip install --upgrade pip setuptools wheel\npip install opencv-python\npip install atari-py\npip install gym[atari]\npip install autorom[accept-rom-license]\n```\n\n**Install MuJoCo**\n\nFirst, follow the instructions on https:\u002F\u002Fgithub.com\u002Fopenai\u002Fmujoco-py, https:\u002F\u002Fwww.roboti.us\u002F, and https:\u002F\u002Fgithub.com\u002Fdeepmind\u002Fmujoco to download the right version of mujoco you need.\n\nSecond, `mkdir ~\u002F.mujoco`.\n\nThird, move the .tar.gz or .zip to `~\u002F.mujoco`, and extract it using `tar -zxvf` or `unzip`.\n\nFourth, add the following line to the `.bashrc`:\n\n```shell\nexport LD_LIBRARY_PATH=$LD_LIBRARY_PATH:\u002Fhome\u002F\u003Cuser>\u002F.mujoco\u002F\u003Cfolder-name, e.g. mujoco210, mujoco-2.2.1>\u002Fbin\n```\n\nFifth, run the following command:\n\n```shell\nsudo apt install libosmesa6-dev libgl1-mesa-glx libglfw3\npip install mujoco\npip install gym[mujoco]\nsudo apt-get update -y\nsudo apt-get install -y patchelf\n```\n\n**Install Dependencies of MAMuJoCo**\n\nFirst follow the instructions above to install MuJoCo. Then run the following commands.\n\n```shell\npip install \"mujoco-py>=2.1.2.14\"\npip install \"Jinja2>=3.0.3\"\npip install \"glfw>=2.5.1\"\npip install \"Cython>=0.29.28\"\n```\n\nNote that [mujoco-py](https:\u002F\u002Fgithub.com\u002Fopenai\u002Fmujoco-py) is compatible with `mujoco210` (see [this](https:\u002F\u002Fgithub.com\u002Fopenai\u002Fmujoco-py#install-mujoco)). 
So please make sure to download `mujoco210` and extract it into the right place.\n\n\n\n### Solve Dependencies\n\nAfter the installation above, run the following commands to solve dependencies.\n\n```shell\npip install gym==0.21.0\npip install pyglet==1.5.0\npip install importlib-metadata==4.13.0\n```\n\nIf you encounter issues when using `pip install gym==0.21.0`, try using the following command instead:\n\n```\nconda install -c conda-forge gym=0.21.0\n```\n\n\n\n\n## Usage\n\n### Training on Existing Environments\n\nTo train an algorithm on a provided environment, users can modify the YAML configuration files of the corresponding algorithm and environment under `harl\u002Fconfigs\u002Falgos_cfgs` and `harl\u002Fconfigs\u002Fenvs_cfgs` as they wish, go to the `examples` folder, and then start training with the one-liner `python train.py --algo \u003CALGO> --env \u003CENV> --exp_name \u003CEXPERIMENT NAME>` or `python train.py --load_config \u003CCONFIG FILE PATH> --exp_name \u003CEXPERIMENT NAME>`, where the latter is mostly used when reproducing an experiment. We provide the **tuned configurations** for algorithms in each environment under the `tuned_configs` folder. Users can **reproduce our results** by running `python train.py --load_config \u003CTUNED CONFIG PATH> --exp_name \u003CEXPERIMENT NAME>`, changing `\u003CTUNED CONFIG PATH>` to the absolute path of the tuned config file on their machine.\n\nDuring training, users receive continuous logging feedback in the terminal.\n\nAfter training, users can check the log file, TensorBoard output, experiment configuration, and saved models under the generated `results` folder. Moreover, users can also render the trained models by setting `use_render: True` and `model_dir: \u003Cpath to trained models>` in the algorithm configuration file (for football, users also need to set `render: True` in the environment configuration file) and then running the same training command as above again. 
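For reference, the relevant fields in the algorithm's YAML config might look like the following (the flat layout and the model path are illustrative, not the repository's exact config structure):

```yaml
# rendering settings (illustrative values; point model_dir at your own results)
use_render: True
model_dir: /home/user/HARL/results/experiment/models
```
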
For SMAC and SMACv2, rendering comes in the form of video replays automatically saved to the `StarCraftII\u002FReplays` folder (more details can be found [here](https:\u002F\u002Fgithub.com\u002Foxwhirl\u002Fsmac#saving-and-watching-starcraft-ii-replays)).\n\nTo enable batch running, we allow users to override YAML config values on the command line: for each training command, users specify parameters with the same names as in the config files. For example, if you want to run HAPPO on SMAC tasks under three random seeds, you can customize the configs and replace `train.sh` with the following commands:\n\n```shell\nfor seed in $(seq 1 3)\ndo\n\tpython train.py --algo happo --env smac --exp_name test --seed $seed\ndone\n```\n\n\n\n### Applying to New Environments\n\nIf you want to apply HA series algorithms to solve new tasks, you need to implement environment interfaces, following the examples of the seven environments provided. The simplest interface may look like:\n\n```python\nclass Env:\n    def __init__(self, args):\n        self.env = ...\n        self.n_agents = ...\n        self.share_observation_space = ...\n        self.observation_space = ...\n        self.action_space = ...\n\n    def step(self, actions):\n        return obs, state, rewards, dones, info, available_actions\n\n    def reset(self):\n        return obs, state, available_actions\n\n    def seed(self, seed):\n        pass\n\n    def render(self):\n        pass\n\n    def close(self):\n        self.env.close()\n```\n\nThe purpose of the interface is to *hide environment-specific details and expose a unified interaction protocol, so that other modules can process data uniformly*.\n\nYou may also want to produce continuous logging output during training. If you intend to use an on-policy algorithm, you need to implement a logger for your environment. 
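A logger along the following lines is usually enough; note that the class name, the `env_args` attribute, and the import fallback are illustrative assumptions rather than HARL's actual API:

```python
# Sketch of a custom logger. BaseLogger normally comes from
# harl/common/base_logger.py inside the HARL repo.
try:
    from harl.common.base_logger import BaseLogger
except ImportError:  # stand-in so the sketch runs outside the repo
    class BaseLogger:
        def __init__(self, env_args=None):
            self.env_args = env_args or {}

class MyEnvLogger(BaseLogger):
    def get_task_name(self):
        # illustrative: derive the task name from the environment arguments
        return self.env_args.get("task", "myenv")
```
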
The simplest logger can inherit from `BaseLogger` in `harl\u002Fcommon\u002Fbase_logger.py` and implement the `get_task_name` function. More customised logging requirements can be met by extending or overriding additional functions. We recommend referring to the existing loggers in each environment directory to see how they are written. If an off-policy algorithm is used, you can customise logging directly by modifying the off-policy runner code. Finally, register the logger (if any), add a YAML config file, and add the necessary code to `examples\u002Ftrain.py`, `harl\u002Futils\u002Fconfigs_tool.py`, and `harl\u002Futils\u002Fenvs_tool.py`. Again, following the existing seven environment examples will be convenient.\n\nAfter these steps, you can apply the algorithms immediately as above.\n\n\n\n### Application Scope of Algorithms\n\n|        | Continuous action space | Discrete action space | Multi Discrete action space |\n| :----: | :---------------------: | :-------------------: | :-------------------------: |\n| HAPPO  |            √            |           √           |              √              |\n| HATRPO |            √            |           √           |                             |\n| HAA2C  |            √            |           √           |              √              |\n| HADDPG |            √            |                       |                             |\n| HATD3  |            √            |                       |                             |\n| HAD3QN |                         |           √           |                             |\n| HASAC  |            √            |           √           |              √              |\n| MAPPO  |            √            |           √           |              √              |\n| MADDPG |            √            |                       |                             |\n| MATD3  |            √            |                       |                             |\n\n\n\n## Performance on Cooperative 
MARL Benchmarks\n\n### MPE\n\n\u003Cdiv align=\"center\">\n  \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FPKU-MARL_HARL_readme_e4b5186a4a41.jpg\" width=\"100%\"\u002F>\n\u003C\u002Fdiv>\n\n### MAMuJoCo\n\nHAPPO, HADDPG, and HATD3 outperform MAPPO, MADDPG, and MATD3; HAPPO and HATD3 are the most effective methods for heterogeneous-agent cooperation tasks.\n\n\u003Cdiv align=\"center\">\n  \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FPKU-MARL_HARL_readme_84c0a68c922f.jpg\" width=\"100%\"\u002F>\n\u003C\u002Fdiv>\n\n\u003Cdiv align=\"center\">\n  \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FPKU-MARL_HARL_readme_d458114ef9d3.jpg\" width=\"100%\"\u002F>\n\u003C\u002Fdiv>\n\u003Cdiv align=\"center\">\n  \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FPKU-MARL_HARL_readme_024057ea9992.jpg\" width=\"40%\"\u002F>\n\u003C\u002Fdiv>\n\n### SMAC & SMACv2\n\nHAPPO and HATRPO are comparable to or better than MAPPO and QMIX in SMAC and SMACv2, demonstrating their capability in mostly homogeneous-agent settings.\n\n\u003Cdiv align=\"center\">\n  \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FPKU-MARL_HARL_readme_34cd60a47f59.png\" width=\"100%\"\u002F>\n\u003C\u002Fdiv>\n\n\u003Cdiv align=\"center\">\n  \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FPKU-MARL_HARL_readme_ec65ffbdf1c8.jpg\" width=\"100%\"\u002F>\n\u003C\u002Fdiv>\n\n### GRF\n\nHAPPO consistently outperforms MAPPO and QMIX on GRF, and the performance gap increases as the number and heterogeneity of agents increase.\n\n\u003Cdiv align=\"center\">\n  \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FPKU-MARL_HARL_readme_505d589dadac.png\" width=\"50%\"\u002F>\n\u003C\u002Fdiv>\n\u003Cdiv align=\"center\">\n  \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FPKU-MARL_HARL_readme_5344924163dd.jpg\" 
width=\"100%\"\u002F>\n\u003C\u002Fdiv>\n\n### Bi-DexterousHands\n\nHAPPO consistently outperforms MAPPO, and is also better than the single-agent baseline PPO, while also showing less variance.\n\n\u003Cdiv align=\"center\">\n  \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FPKU-MARL_HARL_readme_18a61c29a9c7.jpg\" width=\"100%\"\u002F>\n\u003C\u002Fdiv>\n\n*The experiment results of HASAC can be found at https:\u002F\u002Fsites.google.com\u002Fview\u002Fmeharl*\n\n## Citation\n\nThis repository is affiliated with [Peking University](https:\u002F\u002Fwww.pku.edu.cn\u002F\u002F) and [BIGAI](https:\u002F\u002Fwww.bigai.ai\u002F). If you find our paper or this repository helpful in your research or project, please consider citing our works using the following BibTeX citation:\n\n```tex\n@article{JMLR:v25:23-0488,\n  author  = {Yifan Zhong and Jakub Grudzien Kuba and Xidong Feng and Siyi Hu and Jiaming Ji and Yaodong Yang},\n  title   = {Heterogeneous-Agent Reinforcement Learning},\n  journal = {Journal of Machine Learning Research},\n  year    = {2024},\n  volume  = {25},\n  number  = {32},\n  pages   = {1--67},\n  url     = {http:\u002F\u002Fjmlr.org\u002Fpapers\u002Fv25\u002F23-0488.html}\n}\n```\n\n```tex\n@inproceedings{\nliu2024maximum,\ntitle={Maximum Entropy Heterogeneous-Agent Reinforcement Learning},\nauthor={Jiarong Liu and Yifan Zhong and Siyi Hu and Haobo Fu and QIANG FU and Xiaojun Chang and Yaodong Yang},\nbooktitle={The Twelfth International Conference on Learning Representations},\nyear={2024},\nurl={https:\u002F\u002Fopenreview.net\u002Fforum?id=tmqOhBC4a5}\n}\n```\n","\u003Cdiv align=\"center\">\n  \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FPKU-MARL_HARL_readme_3f39e2b6bf27.jpg\" width=\"300px\" height=\"auto\"\u002F>\n\u003C\u002Fdiv>\n\n\u003Ch1 align=\"center\"> 异构智能体强化学习 \u003C\u002Fh1>\n\n本仓库包含基于 PyTorch 的 **异构智能体强化学习（HARL）** 算法的 **官方实现**，其中包括 
**HAPPO**、**HATRPO**、**HAA2C**、**HADDPG**、**HATD3**、**HAD3QN** 和 **HASAC**。***HARL 算法与 MAPPO 显著不同，它们普遍适用于异构智能体，并具有严格的理论支持，通常能够取得更优的性能。*** 本仓库使研究人员和从业者能够轻松复现我们在七个具有挑战性的基准测试上的结果，或将 HARL 算法应用于其目标应用场景。\n\n\n\n## 概述\n\nHARL 算法是我们针对一般 *异构智能体* 场景下实现有效多智能体协作的新颖解决方案，无需依赖限制性较强的 *参数共享* 技巧。\n\n### 主要特点\n\n- HARL 算法采用 *顺序更新机制* 实现智能体的协同更新，这与 MAPPO 和 MADDPG 所采用的 *同步更新机制* 不同。\n- HARL 算法在理论上保证了 **单调改进** 和 **收敛至均衡状态**，从而确保其在促进智能体间合作行为方面的有效性。\n- 同策略（on-policy）与异策略（off-policy）的 HARL 算法（分别以 **HAPPO** 和 **HASAC** 为代表）在广泛的基准测试中均表现出优越的性能。\n\n下图展示了 *顺序更新机制*：\n\u003Cdiv align=\"center\">\n  \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FPKU-MARL_HARL_readme_66d9af49fb12.jpg\" width=\"100%\"\u002F>\n\u003C\u002Fdiv>\n\n更多详细信息，请参阅我们的 [HARL](https:\u002F\u002Fjmlr.org\u002Fpapers\u002Fv25\u002F23-0488.html)（JMLR 2024）和 [MEHARL](https:\u002F\u002Fopenreview.net\u002Fforum?id=tmqOhBC4a5)（ICLR 2024 Spotlight）论文。\n\n\n## 安装\n\n### 安装 HARL\n\n```shell\nconda create -n harl python=3.8\nconda activate harl\n# 手动安装 pytorch>=1.9.0 (CUDA>=11.0)\ngit clone https:\u002F\u002Fgithub.com\u002FPKU-MARL\u002FHARL.git\ncd HARL\npip install -e .\n```\n\n\n\n### 安装环境依赖\n\n除了 HARL 算法之外，我们还实现了七个常用环境的接口（[SMAC](https:\u002F\u002Fgithub.com\u002Foxwhirl\u002Fsmac)、[SMACv2](https:\u002F\u002Fgithub.com\u002Foxwhirl\u002Fsmacv2)、[MAMuJoCo](https:\u002F\u002Fgithub.com\u002Fschroederdewitt\u002Fmultiagent_mujoco)、[MPE](https:\u002F\u002Fpettingzoo.farama.org\u002Fenvironments\u002Fmpe\u002F)、[Google Research Football](https:\u002F\u002Fgithub.com\u002Fgoogle-research\u002Ffootball)、[Bi-DexterousHands](https:\u002F\u002Fgithub.com\u002FPKU-MARL\u002FDexterousHands)、[Light Aircraft Game](https:\u002F\u002Fgithub.com\u002Fliuqh16\u002FCloseAirCombat)），可以直接使用。（我们还实现了 [Gym](https:\u002F\u002Fwww.gymlibrary.dev\u002F) 的接口。Gym 是一个单智能体环境，可以看作是多智能体环境的一种特殊情况，主要出于参考目的而包含在内。）您可以选择安装所需环境的依赖项。**安装完成后，请按照 [\"解决依赖\"](https:\u002F\u002Fgithub.com\u002FPKU-MARL\u002FHARL?tab=readme-ov-file#solve-dependencies) 
部分中的步骤操作。**\n\n**安装 Bi-DexterousHands 的依赖**\n\nBi-DexterousHands 依赖于 IsaacGym。必须满足 IsaacGym 的硬件要求。要安装 IsaacGym，从其[官方网站](https:\u002F\u002Fdeveloper.nvidia.com\u002Fisaac-gym\u002Fdownload)下载 IsaacGym Preview 4 版本，然后在其 `python` 文件夹下运行 `pip install -e .`。\n\n**安装 Light Aircraft Game**\n\n[Light Aircraft Game](https:\u002F\u002Fgithub.com\u002Fliuqh16\u002FCloseAirCombat)（LAG）是一个新近开发的红蓝双方飞机博弈的合作竞争环境，提供了单人控制、1v1 和 2v2 等多种场景设置。在多智能体场景中，LAG 目前仅支持 2v2 场景下的自对弈。为了解决这一局限性，我们引入了新颖的合作型无武器任务和发射导弹任务：两名智能体协作对抗由内置 AI 控制的两名对手。在无武器任务中，智能体被训练为飞向对手机尾并保持适当距离；而在发射导弹任务中，智能体则学习躲避对手导弹，并发射自己的导弹摧毁对手。\n\n要安装 LAG，运行以下命令：\n```shell\n# 安装依赖\npip install torch pymap3d jsbsim==1.1.6 geographiclib gym==0.21.0 wandb icecream setproctitle\n# 初始化子模块(*JSBSim-Team\u002Fjsbsim*)\ngit submodule init\ngit submodule update\n```\n\n**安装 Google Research Football**\n\n请按照 [官方说明](https:\u002F\u002Fgithub.com\u002Fgoogle-research\u002Ffootball)安装 Google Research Football。\n\n**安装 SMAC**\n\n请按照 [官方说明](https:\u002F\u002Fgithub.com\u002Foxwhirl\u002Fsmac)安装 SMAC。我们使用 Linux 上的 StarCraft II 版本 4.10。\n\n**安装 SMACv2**\n\n请按照 [官方说明](https:\u002F\u002Fgithub.com\u002Foxwhirl\u002Fsmacv2)安装 SMACv2。\n\n**安装 MPE**\n\n```shell\npip install pettingzoo==1.22.2\npip install supersuit==3.7.0\n```\n\n**安装 Gym 套件（除 MuJoCo 外）**\n\n```shell\n# 安装 gym\npip install gym\n# 安装经典控制\npip install gym[classic_control]\n# 安装 box2d\nconda install -c anaconda swig\npip install gym[box2d]\n# 安装 atari\npip install --upgrade pip setuptools wheel\npip install opencv-python\npip install atari-py\npip install gym[atari]\npip install autorom[accept-rom-license]\n```\n\n**安装 MuJoCo**\n\n首先，按照 https:\u002F\u002Fgithub.com\u002Fopenai\u002Fmujoco-py、https:\u002F\u002Fwww.roboti.us\u002F 和 https:\u002F\u002Fgithub.com\u002Fdeepmind\u002Fmujoco 上的说明，下载您所需的正确版本的 MuJoCo。\n\n其次，创建 `~\u002F.mujoco` 目录。\n\n第三，将 `.tar.gz` 或 `.zip` 文件移动到 `~\u002F.mujoco` 中，并使用 `tar -zxvf` 或 `unzip` 解压。\n\n第四，在 `.bashrc` 文件中添加以下行：\n\n```shell\nexport 
LD_LIBRARY_PATH=$LD_LIBRARY_PATH:\u002Fhome\u002F\u003Cuser>\u002F.mujoco\u002F\u003C文件夹名称，例如 mujoco210、mujoco-2.2.1>\u002Fbin\n```\n\n第五，运行以下命令：\n\n```shell\nsudo apt install libosmesa6-dev libgl1-mesa-glx libglfw3\npip install mujoco\npip install gym[mujoco]\nsudo apt-get update -y\nsudo apt-get install -y patchelf\n```\n\n**安装 MAMuJoCo 的依赖**\n\n首先按照上述说明安装 MuJoCo。然后运行以下命令。\n\n```shell\npip install \"mujoco-py>=2.1.2.14\"\npip install \"Jinja2>=3.0.3\"\npip install \"glfw>=2.5.1\"\npip install \"Cython>=0.29.28\"\n```\n\n需要注意的是，[mujoco-py](https:\u002F\u002Fgithub.com\u002Fopenai\u002Fmujoco-py) 与 `mujoco210` 兼容（详见 [此处](https:\u002F\u002Fgithub.com\u002Fopenai\u002Fmujoco-py#install-mujoco)）。因此，请务必下载 `mujoco210` 并将其解压到正确的位置。\n\n### 解决依赖项\n\n在完成上述安装后，运行以下命令以解决依赖项。\n\n```shell\npip install gym==0.21.0\npip install pyglet==1.5.0\npip install importlib-metadata==4.13.0\n```\n\n如果在使用 `pip install gym==0.21.0` 时遇到问题，请尝试使用以下命令：\n\n```\nconda install -c conda-forge gym=0.21.0\n```\n\n\n\n\n## 使用方法\n\n### 在现有环境中训练\n\n要在提供的环境中训练算法，用户可以根据需要修改 `harl\u002Fconfigs\u002Falgos_cfgs` 和 `harl\u002Fconfigs\u002Fenvs_cfgs` 目录下的相应算法和环境的 YAML 配置文件，然后进入 `examples` 文件夹，使用一行命令开始训练：`python train.py --algo \u003CALGO> --env \u003CENV> --exp_name \u003CEXPERIMENT NAME>` 或者 `python train.py --load_config \u003CCONFIG FILE PATH> --exp_name \u003CEXPERIMENT NAME>`。后者通常用于复现实验。我们在 `tuned_configs` 文件夹中为每个环境提供了算法的**调优配置**。用户可以通过运行 `python train.py --load_config \u003CTUNED CONFIG PATH> --exp_name \u003CEXPERIMENT NAME>` 来**复现我们的结果**，只需将 `\u003CTUNED CONFIG PATH>` 替换为本地机器上该调优配置文件的绝对路径即可。\n\n训练过程中，用户将在终端中持续获得日志反馈。\n\n训练完成后，用户可以在生成的 `results` 文件夹中查看日志文件、TensorBoard 输出、实验配置以及保存的模型。此外，用户还可以通过在算法配置文件中设置 `use_render: True` 和 `model_dir: \u003Cpath to trained models>` 来渲染训练好的模型（对于足球环境，还需在环境配置文件中设置 `render: True`），然后再次使用上述相同的训练命令。对于 SMAC 和 SMACv2 环境，渲染将以视频回放的形式自动保存到 `StarCraftII\u002FReplays` 文件夹中（更多详情请参阅 
[这里](https:\u002F\u002Fgithub.com\u002Foxwhirl\u002Fsmac#saving-and-watching-starcraft-ii-replays)）。\n\n为了支持批量运行，我们允许用户在命令行中直接修改 YAML 配置文件。每次训练时，用户只需在命令中指定与配置文件中同名的特殊参数即可。例如，如果您想在 SMAC 任务上使用 HAPPO 算法，并采用三个随机种子进行训练，可以自定义配置并用以下命令替换 `train.sh`：\n\n```shell\nfor seed in $(seq 1 3)\ndo\n\tpython train.py --algo happo --env smac --exp_name test --seed $seed\ndone\n```\n\n\n\n### 应用于新环境\n\n若要将 HA 系列算法应用于解决新任务，您需要按照已提供的七个环境的示例实现环境接口。一个最简单的接口可能如下所示：\n\n```python\nclass Env:\n    def __init__(self, args):\n        self.env = ...\n        self.n_agents = ...\n        self.share_observation_space = ...\n        self.observation_space = ...\n        self.action_space = ...\n\n    def step(self, actions):\n        return obs, state, rewards, dones, info, available_actions\n\n    def reset(self):\n        return obs, state, available_actions\n\n    def seed(self, seed):\n        pass\n\n    def render(self):\n        pass\n\n    def close(self):\n        self.env.close()\n```\n\n接口的作用是*隐藏环境特定的细节，暴露统一的交互协议，从而使其他模块能够以统一的方式处理数据*。\n\n您可能还希望在训练过程中产生持续的日志输出。如果您打算使用同策略（on-policy）算法，则需要为您的环境实现一个日志记录器。最简单的日志记录器可以继承 `harl\u002Fcommon\u002Fbase_logger.py` 中的 `BaseLogger` 类，并实现 `get_task_name` 函数。对于更定制化的日志需求，可以通过扩展或重写更多函数来满足。我们建议参考各环境目录中的现有日志记录器，了解其编写方式。如果使用的是异策略（off-policy）算法，则可以直接通过修改异策略运行器代码来定制日志记录。最后，请注册日志记录器（如有），添加 YAML 配置文件，并在 `examples\u002Ftrain.py`、`harl\u002Futils\u002Fconfigs_tool.py` 和 `harl\u002Futils\u002Fenvs_tool.py` 中添加必要的代码。同样，参照现有的七个环境示例将非常方便。\n\n完成这些步骤后，您就可以像之前一样立即应用这些算法了。\n\n\n\n### 算法适用范围\n\n|        | 连续动作空间 | 离散动作空间 | 多离散动作空间 |\n| :----: | :---------------------: | :-------------------: | :-------------------------: |\n| HAPPO  |            √            |           √           |              √              |\n| HATRPO |            √            |           √           |                             |\n| HAA2C  |            √            |           √           |              √              |\n| HADDPG |            √            |                       |                             |\n| 
HATD3  |            √            |                       |                             |\n| HAD3QN |                         |           √           |                             |\n| HASAC  |            √            |           √           |              √              |\n| MAPPO  |            √            |           √           |              √              |\n| MADDPG |            √            |                       |                             |\n| MATD3  |            √            |                       |                             |\n\n\n\n## 在合作型多智能体强化学习基准上的性能\n\n### MPE\n\n\u003Cdiv align=\"center\">\n  \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FPKU-MARL_HARL_readme_e4b5186a4a41.jpg\" width=\"100%\"\u002F>\n\u003C\u002Fdiv>\n\n### MAMuJoCo\n\nHAPPO、HADDPG 和 HATD3 的表现优于 MAPPO、MADDPG 和 MATD3；其中，HAPPO 和 HATD3 是处理异构智能体合作任务最有效的方法。\n\n\u003Cdiv align=\"center\">\n  \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FPKU-MARL_HARL_readme_84c0a68c922f.jpg\" width=\"100%\"\u002F>\n\u003C\u002Fdiv>\n\n\u003Cdiv align=\"center\">\n  \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FPKU-MARL_HARL_readme_d458114ef9d3.jpg\" width=\"100%\"\u002F>\n\u003C\u002Fdiv>\n\u003Cdiv align=\"center\">\n  \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FPKU-MARL_HARL_readme_024057ea9992.jpg\" width=\"40%\"\u002F>\n\u003C\u002Fdiv>\n\n### SMAC & SMACv2\n\nHAPPO 和 HATRPO 在 SMAC 和 SMACv2 中的表现与 MAPPO 和 QMIX 不相上下甚至更好，这表明它们在以同质智能体为主的场景中具有很强的能力。\n\n\u003Cdiv align=\"center\">\n  \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FPKU-MARL_HARL_readme_34cd60a47f59.png\" width=\"100%\"\u002F>\n\u003C\u002Fdiv>\n\n\u003Cdiv align=\"center\">\n  \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FPKU-MARL_HARL_readme_ec65ffbdf1c8.jpg\" width=\"100%\"\u002F>\n\u003C\u002Fdiv>\n\n### GRF\n\nHAPPO 在 GRF 上的表现始终优于 MAPPO 和 QMIX，且随着智能体数量和异质性的增加，性能差距也在不断扩大。\n\n\u003Cdiv 
align=\"center\">\n  \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FPKU-MARL_HARL_readme_505d589dadac.png\" width=\"50%\"\u002F>\n\u003C\u002Fdiv>\n\u003Cdiv align=\"center\">\n  \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FPKU-MARL_HARL_readme_5344924163dd.jpg\" width=\"100%\"\u002F>\n\u003C\u002Fdiv>\n\n### 双灵巧手\n\nHAPPO 始终优于 MAPPO，同时也优于单智能体基线 PPO，并且表现出更低的方差。\n\n\u003Cdiv align=\"center\">\n  \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FPKU-MARL_HARL_readme_18a61c29a9c7.jpg\" width=\"100%\"\u002F>\n\u003C\u002Fdiv>\n\n*HASAC 的实验结果可在 https:\u002F\u002Fsites.google.com\u002Fview\u002Fmeharl 查看*\n\n## 引用\n\n本仓库隶属于 [北京大学](https:\u002F\u002Fwww.pku.edu.cn\u002F\u002F) 和 [BIGAI](https:\u002F\u002Fwww.bigai.ai\u002F)。如果您在研究或项目中使用了我们的论文或本仓库，请考虑使用以下 BibTeX 格式的引用：\n\n```tex\n@article{JMLR:v25:23-0488,\n  author  = {Yifan Zhong 和 Jakub Grudzien Kuba 和 Xidong Feng 和 Siyi Hu 和 Jiaming Ji 和 Yaodong Yang},\n  title   = {异构智能体强化学习},\n  journal = {机器学习研究期刊},\n  year    = {2024},\n  volume  = {25},\n  number  = {32},\n  pages   = {1--67},\n  url     = {http:\u002F\u002Fjmlr.org\u002Fpapers\u002Fv25\u002F23-0488.html}\n}\n```\n\n```tex\n@inproceedings{\nliu2024maximum,\ntitle={最大熵异构智能体强化学习},\nauthor={Jiarong Liu 和 Yifan Zhong 和 Siyi Hu 和 Haobo Fu 和 QIANG FU 和 Xiaojun Chang 和 Yaodong Yang},\nbooktitle={第十二届国际学习表征会议},\nyear={2024},\nurl={https:\u002F\u002Fopenreview.net\u002Fforum?id=tmqOhBC4a5}\n}\n```","# HARL 快速上手指南\n\nHARL (Heterogeneous-Agent Reinforcement Learning) 是一个基于 PyTorch 的异质智能体强化学习算法库，包含 HAPPO、HATRPO、HASAC 等先进算法。相比 MAPPO，HARL 不依赖参数共享技巧，适用于异质智能体场景，并具备单调改进和收敛的理论保证。\n\n## 环境准备\n\n### 系统要求\n- **操作系统**: Linux (推荐，部分环境如 SMAC 仅支持 Linux)\n- **Python**: 3.8\n- **PyTorch**: >= 1.9.0 (需手动安装，建议搭配 CUDA >= 11.0)\n- **硬件**: \n  - 常规训练：标准 GPU\n  - Bi-DexterousHands 环境：需满足 IsaacGym 硬件要求\n\n### 前置依赖\n在开始之前，请确保已安装 `conda` 和 `git`。若使用 MuJoCo 或特定游戏环境（如 StarCraft II），需提前下载对应软件及授权文件。\n\n## 安装步骤\n\n### 1. 
创建虚拟环境并安装 HARL\n```shell\nconda create -n harl python=3.8\nconda activate harl\n# 请手动安装 pytorch>=1.9.0 (CUDA>=11.0)\n# 例如: pip install torch torchvision torchaudio --index-url https:\u002F\u002Fdownload.pytorch.org\u002Fwhl\u002Fcu118\n\ngit clone https:\u002F\u002Fgithub.com\u002FPKU-MARL\u002FHARL.git\ncd HARL\npip install -e .\n```\n\n### 2. 解决核心依赖冲突\n安装完主程序后，必须执行以下命令以固定关键库版本，避免兼容性问题：\n```shell\npip install gym==0.21.0\npip install pyglet==1.5.0\npip install importlib-metadata==4.13.0\n```\n*注：若 `pip install gym==0.21.0` 失败，可尝试 `conda install -c conda-forge gym=0.21.0`。*\n\n### 3. 安装特定环境依赖（按需）\nHARL 支持 7 种主流多智能体环境，请根据需求选择安装：\n\n- **MPE (Multi-Particle Environments)**\n  ```shell\n  pip install pettingzoo==1.22.2\n  pip install supersuit==3.7.0\n  ```\n\n- **Light Aircraft Game (LAG)**\n  ```shell\n  pip install torch pymap3d jsbsim==1.1.6 geographiclib gym==0.21.0 wandb icecream setproctitle\n  git submodule init\n  git submodule update\n  ```\n\n- **MuJoCo \u002F MAMuJoCo**\n  需先前往 [MuJoCo 官网](https:\u002F\u002Fwww.roboti.us\u002F) 或 GitHub 下载对应版本（如 `mujoco210`），解压至 `~\u002F.mujoco` 并配置 `LD_LIBRARY_PATH`，然后运行：\n  ```shell\n  sudo apt install libosmesa6-dev libgl1-mesa-glx libglfw3\n  pip install mujoco\n  pip install \"mujoco-py>=2.1.2.14\"\n  pip install \"Jinja2>=3.0.3\" \"glfw>=2.5.1\" \"Cython>=0.29.28\"\n  ```\n\n- **其他环境 (SMAC, SMACv2, Google Football, Bi-DexterousHands)**\n  请参考各环境官方仓库的安装说明进行安装。例如 SMAC 需要安装 StarCraft II 4.10 版本。\n\n## 基本使用\n\n### 1. 快速开始训练\nHARL 提供了调优好的配置文件。你可以直接使用一行命令在指定环境中训练算法。\n\n**示例：在 SMAC 环境中使用 HAPPO 算法进行训练**\n```shell\ncd examples\npython train.py --algo happo --env smac --exp_name my_first_experiment\n```\n\n**示例：复现论文结果（使用预调优配置）**\n```shell\npython train.py --load_config \u002F绝对路径\u002Fto\u002Ftuned_configs\u002Fsmac\u002Fhappo.yaml --exp_name reproduce_result\n```\n\n### 2. 
批量运行实验\n可以通过 Shell 脚本循环修改随机种子进行批量实验：\n```shell\nfor seed in $(seq 1 3)\ndo\n\tpython train.py --algo happo --env smac --exp_name test_batch --seed $seed\ndone\n```\n\n### 3. 查看结果与渲染\n- **日志与模型**：训练结束后，检查 `results` 文件夹，内含 TensorBoard 日志、配置文件及保存的模型。\n- **渲染演示**：修改算法配置文件，设置 `use_render: True` 和 `model_dir: \u003C模型路径>`，再次运行训练命令即可渲染。\n  - *注意*：对于 Football 环境，还需在环境配置中设置 `render: True`；SMAC 环境会自动生成视频回放至 `StarCraftII\u002FReplays`。\n\n### 4. 适配新环境\n若需在新任务上应用 HARL，需实现统一的环境接口类：\n```python\nclass Env:\n    def __init__(self, args):\n        self.env = ...\n        self.n_agents = ...\n        self.share_observation_space = ...\n        self.observation_space = ...\n        self.action_space = ...\n\n    def step(self, actions):\n        return obs, state, rewards, dones, info, available_actions\n\n    def reset(self):\n        return obs, state, available_actions\n\n    def seed(self, seed):\n        pass\n\n    def render(self):\n        pass\n\n    def close(self):\n        self.env.close()\n```\n实现后，参照现有环境示例注册 Logger、添加 YAML 配置并在 `examples\u002Ftrain.py` 中接入即可。","某无人机编队研发团队正在训练异构智能体集群，以执行复杂的协同侦察与打击任务，其中包含高速侦察机与重型攻击机等不同能力的单位。\n\n### 没有 HARL 时\n- **策略同质化严重**：被迫使用参数共享技巧（Parameter-sharing），导致不同类型的无人机学习到相同的策略，无法发挥各自独特的机动或火力优势。\n- **协作效率低下**：采用传统的同时更新机制，智能体在更新策略时相互干扰，难以在动态战场中形成稳定的配合默契。\n- **训练收敛困难**：缺乏理论保证，多智能体环境下的非平稳性问题导致训练过程震荡剧烈，经常陷入局部最优甚至无法收敛。\n- **场景适应性差**：一旦任务场景从“同型机编队”变为“混编作战”，原有算法性能急剧下降，需重新设计复杂的奖励函数。\n\n### 使用 HARL 后\n- **异构策略专精**：利用序列更新方案，HARL 允许侦察机与攻击机独立学习专属策略，完美适配各自的速度、载荷等异构特性。\n- **协同单调提升**：基于严格的单调改进理论，智能体按顺序更新策略，确保了团队协作能力随训练步数稳定增强，避免了相互干扰。\n- **收敛有迹可循**：凭借收敛至均衡点的理论保障，训练曲线平滑且高效，显著缩短了从仿真到实战部署的验证周期。\n- **通用性强**：无需针对特定编队结构调整网络架构，直接复现于多种高难度基准测试，轻松应对从纯合作到混合博弈的各类任务。\n\nHARL 通过严谨的序列更新机制打破了异构多智能体协作的理论瓶颈，让无人机编队在复杂动态环境中实现了真正的高效协同与稳定收敛。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FPKU-MARL_HARL_3f39e2b6.jpg","PKU-MARL","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002FPKU-MARL_95f1bd1d.png","RL Research Group, Institute for AI @ Peking 
University",null,"yaodong.yang@pku.edu.cn","https:\u002F\u002Fgithub.com\u002FPKU-MARL",[21,25,29],{"name":22,"color":23,"percentage":24},"Python","#3572A5",99.4,{"name":26,"color":27,"percentage":28},"HTML","#e34c26",0.4,{"name":30,"color":31,"percentage":32},"CMake","#DA3434",0.2,895,128,"2026-04-16T09:03:40",4,"Linux","部分环境必需（如 Bi-DexterousHands\u002FIsaacGym, MuJoCo）。需支持 CUDA >= 11.0 的 NVIDIA GPU。IsaacGym 对硬件有特定要求（未详述具体型号，通常需高性能显卡）。","未说明",{"notes":41,"python":42,"dependencies":43},"1. 核心算法基于 PyTorch (>=1.9.0) 且需手动安装，要求 CUDA >= 11.0。\n2. 不同环境依赖差异大：Bi-DexterousHands 需单独下载并安装 IsaacGym Preview 4；SMAC 需 StarCraft II 4.10 版本（仅支持 Linux）；MAMuJoCo 需特定版本的 mujoco210。\n3. 安装完主要依赖后，必须执行额外步骤解决版本冲突（强制安装 gym==0.21.0, pyglet==1.5.0, importlib-metadata==4.13.0）。\n4. Google Research Football、SMAC、SMACv2 需遵循其官方仓库的安装指令。","3.8",[44,45,46,47,48,49,50,51,52,53],"torch>=1.9.0","gym==0.21.0","pettingzoo==1.22.2","supersuit==3.7.0","mujoco","mujoco-py>=2.1.2.14","jsbsim==1.1.6","pyglet==1.5.0","wandb","Jinja2>=3.0.3",[55,56],"开发框架","Agent",2,"ready","2026-03-27T02:49:30.150509","2026-04-18T00:45:52.764355",[],[],[64,75,83,92,102,110],{"id":65,"name":66,"github_repo":67,"description_zh":68,"stars":69,"difficulty_score":70,"last_commit_at":71,"category_tags":72,"status":58},4358,"openclaw","openclaw\u002Fopenclaw","OpenClaw 是一款专为个人打造的本地化 AI 助手，旨在让你在自己的设备上拥有完全可控的智能伙伴。它打破了传统 AI 助手局限于特定网页或应用的束缚，能够直接接入你日常使用的各类通讯渠道，包括微信、WhatsApp、Telegram、Discord、iMessage 等数十种平台。无论你在哪个聊天软件中发送消息，OpenClaw 都能即时响应，甚至支持在 macOS、iOS 和 Android 设备上进行语音交互，并提供实时的画布渲染功能供你操控。\n\n这款工具主要解决了用户对数据隐私、响应速度以及“始终在线”体验的需求。通过将 AI 部署在本地，用户无需依赖云端服务即可享受快速、私密的智能辅助，真正实现了“你的数据，你做主”。其独特的技术亮点在于强大的网关架构，将控制平面与核心助手分离，确保跨平台通信的流畅性与扩展性。\n\nOpenClaw 非常适合希望构建个性化工作流的技术爱好者、开发者，以及注重隐私保护且不愿被单一生态绑定的普通用户。只要具备基础的终端操作能力（支持 macOS、Linux 及 Windows 
WSL2），即可通过简单的命令行引导完成部署。如果你渴望拥有一个懂你",349277,3,"2026-04-06T06:32:30",[56,55,73,74],"图像","数据工具",{"id":76,"name":77,"github_repo":78,"description_zh":79,"stars":80,"difficulty_score":70,"last_commit_at":81,"category_tags":82,"status":58},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,"2026-04-05T11:01:52",[55,73,56],{"id":84,"name":85,"github_repo":86,"description_zh":87,"stars":88,"difficulty_score":57,"last_commit_at":89,"category_tags":90,"status":58},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",159267,"2026-04-17T11:29:14",[55,56,91],"语言模型",{"id":93,"name":94,"github_repo":95,"description_zh":96,"stars":97,"difficulty_score":98,"last_commit_at":99,"category_tags":100,"status":58},8272,"opencode","anomalyco\u002Fopencode","OpenCode 是一款开源的 AI 编程助手（Coding Agent），旨在像一位智能搭档一样融入您的开发流程。它不仅仅是一个代码补全插件，而是一个能够理解项目上下文、自主规划任务并执行复杂编码操作的智能体。无论是生成全新功能、重构现有代码，还是排查难以定位的 Bug，OpenCode 
都能通过自然语言交互高效完成，显著减少开发者在重复性劳动和上下文切换上的时间消耗。\n\n这款工具专为软件开发者、工程师及技术研究人员设计，特别适合希望利用大模型能力来提升编码效率、加速原型开发或处理遗留代码维护的专业人群。其核心亮点在于完全开源的架构，这意味着用户可以审查代码逻辑、自定义行为策略，甚至私有化部署以保障数据安全，彻底打破了传统闭源 AI 助手的“黑盒”限制。\n\n在技术体验上，OpenCode 提供了灵活的终端界面（Terminal UI）和正在测试中的桌面应用程序，支持 macOS、Windows 及 Linux 全平台。它兼容多种包管理工具，安装便捷，并能无缝集成到现有的开发环境中。无论您是追求极致控制权的资深极客，还是渴望提升产出的独立开发者，OpenCode 都提供了一个透明、可信",144296,1,"2026-04-16T14:50:03",[56,101],"插件",{"id":103,"name":104,"github_repo":105,"description_zh":106,"stars":107,"difficulty_score":57,"last_commit_at":108,"category_tags":109,"status":58},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",108322,"2026-04-10T11:39:34",[55,73,56],{"id":111,"name":112,"github_repo":113,"description_zh":114,"stars":115,"difficulty_score":57,"last_commit_at":116,"category_tags":117,"status":58},6121,"gemini-cli","google-gemini\u002Fgemini-cli","gemini-cli 是一款由谷歌推出的开源 AI 命令行工具，它将强大的 Gemini 大模型能力直接集成到用户的终端环境中。对于习惯在命令行工作的开发者而言，它提供了一条从输入提示词到获取模型响应的最短路径，无需切换窗口即可享受智能辅助。\n\n这款工具主要解决了开发过程中频繁上下文切换的痛点，让用户能在熟悉的终端界面内直接完成代码理解、生成、调试以及自动化运维任务。无论是查询大型代码库、根据草图生成应用，还是执行复杂的 Git 操作，gemini-cli 都能通过自然语言指令高效处理。\n\n它特别适合广大软件工程师、DevOps 人员及技术研究人员使用。其核心亮点包括支持高达 100 万 token 的超长上下文窗口，具备出色的逻辑推理能力；内置 Google 搜索、文件操作及 Shell 命令执行等实用工具；更独特的是，它支持 MCP（模型上下文协议），允许用户灵活扩展自定义集成，连接如图像生成等外部能力。此外，个人谷歌账号即可享受免费的额度支持，且项目基于 Apache 2.0 协议完全开源，是提升终端工作效率的理想助手。",100752,"2026-04-10T01:20:03",[101,56,73,55]]