[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"tool-google-deepmind--reverb":3,"similar-google-deepmind--reverb":95},{"id":4,"github_repo":5,"name":6,"description_en":7,"description_zh":8,"ai_summary_zh":8,"readme_en":9,"readme_zh":10,"quickstart_zh":11,"use_case_zh":12,"hero_image_url":13,"owner_login":14,"owner_name":15,"owner_avatar_url":16,"owner_bio":17,"owner_company":18,"owner_location":18,"owner_email":18,"owner_twitter":18,"owner_website":19,"owner_url":20,"languages":21,"stars":48,"forks":49,"last_commit_at":50,"license":51,"difficulty_score":52,"env_os":53,"env_gpu":54,"env_ram":54,"env_deps":55,"category_tags":62,"github_topics":18,"view_count":52,"oss_zip_url":18,"oss_zip_packed_at":18,"status":65,"created_at":66,"updated_at":67,"faqs":68,"releases":94},2415,"google-deepmind\u002Freverb","reverb","Reverb is an efficient and easy-to-use data storage and transport system designed for machine learning research","Reverb 是一款专为机器学习研究打造的高效数据存储与传输系统，由 DeepMind 开源。它主要解决了分布式强化学习训练中“经验回放”环节的瓶颈问题，能够流畅地在数据生成者与模型训练者之间搬运和缓存海量样本，显著提升训练效率。\n\n除了核心的回放功能，Reverb 还内置了丰富的数据结构支持，如先进先出（FIFO）、后进先出（LIFO）及优先队列，并提供了灵活的采样策略、速率限制、数据分片及检查点机制。这些特性让研究人员可以轻松定制数据管理逻辑，无需重复造轮子去构建复杂的底层通信架构。\n\n这款工具非常适合从事强化学习算法研究的科研人员、AI 工程师以及需要处理大规模流式数据的开发者使用。虽然目前主要支持 Linux 环境且定位偏向科研实验而非生产级部署，但其简洁的 API 设计（仅需几行代码即可启动服务）以及与 TensorFlow 的深度集成，使其成为构建高性能分布式训练流水线的得力助手。如果你正在探索复杂的 RL 算法并受限于数据吞吐效率，Reverb 值得尝试。","# Reverb\n![PyPI - Python Version](https:\u002F\u002Fimg.shields.io\u002Fpypi\u002Fpyversions\u002Fdm-reverb)\n[![PyPI version](https:\u002F\u002Fbadge.fury.io\u002Fpy\u002Fdm-reverb.svg)](https:\u002F\u002Fbadge.fury.io\u002Fpy\u002Fdm-reverb)\n\nReverb is an efficient and easy-to-use data storage and transport system\ndesigned for machine learning research. 
Reverb is primarily used as an\nexperience replay system for distributed reinforcement learning algorithms, but\nthe system also supports multiple data structure representations such as FIFO,\nLIFO, and priority queues.\n\n## Table of Contents\n\n-   [Installation](#installation)\n-   [Quick Start](#quick-start)\n-   [Detailed Overview](#detailed-overview)\n    -   [Tables](#tables)\n    -   [Item Selection Strategies](#item-selection-strategies)\n    -   [Rate Limiting](#rate-limiting)\n    -   [Sharding](#sharding)\n    -   [Checkpointing](#checkpointing)\n-   [Citation](#citation)\n\n## Installation\n\nPlease keep in mind that Reverb is not hardened for production use, and while we\ndo our best to keep things in working order, things may break or segfault.\n\n> :warning: Reverb currently only supports Linux-based OSes.\n\nThe recommended way to install Reverb is with `pip`. We also provide instructions\nto build from source using the same Docker images we use for releases.\n\nTensorFlow can be installed separately or as part of the `pip` install.\nInstalling TensorFlow as part of the install ensures compatibility.\n\n```shell\n$ pip install dm-reverb[tensorflow]\n\n# Without TensorFlow install and version dependency check.\n$ pip install dm-reverb\n```\n\n### Nightly builds\n\n[![PyPI version](https:\u002F\u002Fbadge.fury.io\u002Fpy\u002Fdm-reverb-nightly.svg)](https:\u002F\u002Fbadge.fury.io\u002Fpy\u002Fdm-reverb-nightly)\n\n```shell\n$ pip install dm-reverb-nightly[tensorflow]\n\n# Without TensorFlow install and version dependency check.\n$ pip install dm-reverb-nightly\n```\n\n### Build from source\n\n[This guide](reverb\u002Fpip_package\u002FREADME.md#how-to-develop-and-build-reverb-with-the-docker-containers)\ndetails how to build Reverb from source.\n\n\n### Reverb Releases\n\nDue to some underlying libraries such as `protoc` and `absl`, Reverb has to be\npaired with a specific version of TensorFlow. 
If you install Reverb with\n`pip install dm-reverb[tensorflow]`, the correct version of TensorFlow is\ninstalled as well. The table below lists the TensorFlow version paired with\neach release of Reverb, along with some versions of interest:\n\n  * 0.13.0 dropped Python 3.8 support.\n  * 0.11.0 was the first version to support Python 3.11.\n  * 0.10.0 was the last version to support Python 3.7.\n\n\nRelease | Branch \u002F Tag                                               | TensorFlow Version\n------- | ---------------------------------------------------------- | ------------------\nNightly | [master](https:\u002F\u002Fgithub.com\u002Fdeepmind\u002Freverb)               | tf-nightly\n0.14.0  | [v0.14.0](https:\u002F\u002Fgithub.com\u002Fdeepmind\u002Freverb\u002Ftree\u002Fv0.14.0) | 2.14.0\n0.13.0  | [v0.13.0](https:\u002F\u002Fgithub.com\u002Fdeepmind\u002Freverb\u002Ftree\u002Fv0.13.0) | 2.14.0\n0.12.0  | [v0.12.0](https:\u002F\u002Fgithub.com\u002Fdeepmind\u002Freverb\u002Ftree\u002Fv0.12.0) | 2.13.0\n0.11.0  | [v0.11.0](https:\u002F\u002Fgithub.com\u002Fdeepmind\u002Freverb\u002Ftree\u002Fv0.11.0) | 2.12.0\n0.10.0  | [v0.10.0](https:\u002F\u002Fgithub.com\u002Fdeepmind\u002Freverb\u002Ftree\u002Fv0.10.0) | 2.11.0\n0.9.0   | [v0.9.0](https:\u002F\u002Fgithub.com\u002Fdeepmind\u002Freverb\u002Ftree\u002Fv0.9.0)   | 2.10.0\n0.8.0   | [v0.8.0](https:\u002F\u002Fgithub.com\u002Fdeepmind\u002Freverb\u002Ftree\u002Fv0.8.0)   | 2.9.0\n0.7.x   | [v0.7.0](https:\u002F\u002Fgithub.com\u002Fdeepmind\u002Freverb\u002Ftree\u002Fv0.7.0)   | 2.8.0\n\n## Quick Start\n\nStarting a Reverb server is as simple as:\n\n```python\nimport reverb\n\nserver = reverb.Server(tables=[\n    reverb.Table(\n        name='my_table',\n        sampler=reverb.selectors.Uniform(),\n        remover=reverb.selectors.Fifo(),\n        max_size=100,\n        rate_limiter=reverb.rate_limiters.MinSize(1)),\n    ],\n)\n```\n\nCreate a client to communicate with the server:\n\n```python\nclient = 
reverb.Client(f'localhost:{server.port}')\nprint(client.server_info())\n```\n\nWrite some data to the table:\n\n```python\n# Creates a single item and data element [0, 1].\nclient.insert([0, 1], priorities={'my_table': 1.0})\n```\n\nAn item can also reference multiple data elements:\n\n```python\n# Appends three data elements and inserts a single item which references all\n# of them as {'a': [2, 3, 4], 'b': [12, 13, 14]}.\nwith client.trajectory_writer(num_keep_alive_refs=3) as writer:\n  writer.append({'a': 2, 'b': 12})\n  writer.append({'a': 3, 'b': 13})\n  writer.append({'a': 4, 'b': 14})\n\n  # Create an item referencing all the data.\n  writer.create_item(\n      table='my_table',\n      priority=1.0,\n      trajectory={\n          'a': writer.history['a'][:],\n          'b': writer.history['b'][:],\n      })\n\n  # Block until the item has been inserted and confirmed by the server.\n  writer.flush()\n```\n\nThe items we have added to Reverb can be read by sampling them:\n\n```python\n# client.sample() returns a generator.\nprint(list(client.sample('my_table', num_samples=2)))\n```\n\nContinue with the\n[Reverb Tutorial](https:\u002F\u002Fgithub.com\u002Fdeepmind\u002Freverb\u002Ftree\u002Fmaster\u002Fexamples\u002Fdemo.ipynb)\nfor an interactive walkthrough.\n\n## Detailed overview\n\nExperience replay has become an important tool for training off-policy\nreinforcement learning policies. It is used by algorithms such as\n[Deep Q-Networks (DQN)][DQN], [Soft Actor-Critic (SAC)][SAC],\n[Deep Deterministic Policy Gradients (DDPG)][DDPG], and\n[Hindsight Experience Replay][HER], among others. However, building an\nefficient, easy-to-use, and scalable replay system can be challenging. For good\nperformance, Reverb is implemented in C++, and to enable distributed usage it\nprovides a gRPC service for adding, sampling, and updating the contents of the\ntables. 
Python clients\nexpose the full functionality of the service in an easy-to-use fashion.\nFurthermore, native TensorFlow ops are available for performant integration with\nTensorFlow and `tf.data`.\n\nAlthough originally designed for off-policy reinforcement learning, Reverb's\nflexibility makes it just as useful for on-policy reinforcement -- or even\n(un)supervised learning. Creative users have even used Reverb to store and\ndistribute frequently updated data (such as model weights), acting as a\nlightweight in-memory alternative to a distributed file system where each table\nrepresents a file.\n\n### Tables\n\nA Reverb `Server` consists of one or more tables. A table holds items, and each\nitem references one or more data elements. Tables also define sample and\nremoval [selection strategies](#item-selection-strategies), a maximum item\ncapacity, and a [rate limiter](#rate-limiting).\n\nMultiple items can reference the same data element, even if these items exist in\ndifferent tables. This is because items only contain references to data elements\n(as opposed to a copy of the data itself). This also means that a data element\nis only removed when there exists no item that contains a reference to it.\n\nFor example, it is possible to set up one Table as a Prioritized Experience\nReplay (PER) for transitions (sequences of length 2), and another Table as a\n(FIFO) queue of sequences of length 3. In this case the PER data could be used\nto train DQN, and the FIFO data to train a transition model for the environment.\n\n![Using multiple tables](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fgoogle-deepmind_reverb_readme_ddebbd6c7a57.png)\n\nItems are automatically removed from the Table when one of two conditions is\nmet:\n\n1.  Inserting a new item would cause the number of items in the Table to exceed\n    its maximum capacity. The Table's removal strategy is used to determine\n    which item to remove.\n\n1.  
An item has been sampled more than the maximum number of times permitted by\n    the Table's rate limiter. Such an item is deleted.\n\nData elements that are no longer referenced by any item are also deleted.\n\nUsers have full control over how data is sampled and removed from Reverb\ntables. The behavior is primarily controlled by the\n[item selection strategies](#item-selection-strategies) provided to the `Table`\nas the `sampler` and `remover`. In combination with the\n[`rate_limiter`](#rate-limiting) and `max_times_sampled`, a wide range of\nbehaviors can be achieved. Some commonly used configurations include:\n\n**Uniform Experience Replay**\n\nA set of the `N=1000` most recently inserted items is maintained. By setting\n`sampler=reverb.selectors.Uniform()`, the probability of selecting an item is\nthe same for all items. Due to `reverb.rate_limiters.MinSize(100)`, sampling\nrequests will block until 100 items have been inserted. By setting\n`remover=reverb.selectors.Fifo()`, the oldest item is removed first whenever an\nitem needs to be removed.\n\n```python\nreverb.Table(\n     name='my_uniform_experience_replay_buffer',\n     sampler=reverb.selectors.Uniform(),\n     remover=reverb.selectors.Fifo(),\n     max_size=1000,\n     rate_limiter=reverb.rate_limiters.MinSize(100),\n)\n```\n\nExamples of algorithms that make use of uniform experience replay include [SAC]\nand [DDPG].\n\n**Prioritized Experience Replay**\n\nA set of the `N=1000` most recently inserted items is maintained. 
By setting\n`sampler=reverb.selectors.Prioritized(priority_exponent=0.8)`, the probability\nof selecting an item is proportional to the item's priority.\n\nNote: See [Schaul, Tom, et al.][PER] for the algorithm used in this\nimplementation of Prioritized Experience Replay.\n\n```python\nreverb.Table(\n     name='my_prioritized_experience_replay_buffer',\n     sampler=reverb.selectors.Prioritized(0.8),\n     remover=reverb.selectors.Fifo(),\n     max_size=1000,\n     rate_limiter=reverb.rate_limiters.MinSize(100),\n)\n```\n\nExamples of algorithms that make use of Prioritized Experience Replay are DQN\n(and its variants) and\n[Distributed Distributional Deterministic Policy Gradients][D4PG].\n\n**Queue**\n\nA collection of up to `N=1000` items where the oldest item is selected and\nremoved in the same operation. If the collection contains 1000 items, then\ninsert calls are blocked until it is no longer full; if the collection is empty,\nthen sample calls are blocked until there is at least one item.\n\n```python\nreverb.Table(\n    name='my_queue',\n    sampler=reverb.selectors.Fifo(),\n    remover=reverb.selectors.Fifo(),\n    max_size=1000,\n    max_times_sampled=1,\n    rate_limiter=reverb.rate_limiters.Queue(size=1000),\n)\n\n# Or use the helper classmethod `.queue`.\nreverb.Table.queue(name='my_queue', max_size=1000)\n```\n\nExamples of algorithms that make use of Queues are\n[IMPALA](https:\u002F\u002Farxiv.org\u002Fabs\u002F1802.01561) and asynchronous implementations of\n[Proximal Policy Optimization](https:\u002F\u002Farxiv.org\u002Fabs\u002F1707.06347).\n\n### Item selection strategies\n\nReverb defines several selectors that can be used for item sampling or removal:\n\n-   **Uniform:** Samples uniformly among all items.\n-   **Prioritized:** Samples proportionally to stored priorities.\n-   **FIFO:** Selects the oldest data.\n-   **LIFO:** Selects the newest data.\n-   **MinHeap:** Selects data with the lowest priority.\n-   **MaxHeap:** Selects data with the 
highest priority.\n\nAny of these strategies can be used for sampling or removing items from a\nTable. This gives users the flexibility to create customized Tables that best\nfit their needs.\n\n### Rate Limiting\n\nRate limiters allow users to enforce conditions on when items can be inserted\nand\u002For sampled from a Table. Here is a list of the rate limiters that are\ncurrently available in Reverb:\n\n-   **MinSize:** Sets a minimum number of items that must be in the Table before\n    anything can be sampled.\n-   **SampleToInsertRatio:** Enforces an average ratio of inserts to samples\n    by blocking insert and\u002For sample requests. This is useful for controlling\n    the number of times each item is sampled before being removed.\n-   **Queue:** Items are sampled exactly once before being removed.\n-   **Stack:** Items are sampled exactly once before being removed.\n\n### Sharding\n\nReverb servers are unaware of each other, and when scaling a system up to a\nmulti-server setup, data is not replicated across more than one node. This makes\nReverb unsuitable as a traditional database but has the benefit of making it\ntrivial to scale up systems where some level of data loss is acceptable.\n\nDistributed systems can be horizontally scaled by simply increasing the number\nof Reverb servers. When used in combination with a gRPC-compatible load\nbalancer, the address of the load-balanced target can simply be provided to a\nReverb client, and operations will automatically be distributed across the\ndifferent nodes. You'll find details about the specific behaviors in the\ndocumentation of the relevant methods and classes.\n\nIf a load balancer is not available in your setup, or if more control is\nrequired, then systems can still be scaled in almost the same way. 
Simply increase the\nnumber of Reverb servers and create separate clients for each server.\n\n### Checkpointing\n\nReverb supports checkpointing; the state and content of Reverb servers can be\nstored to permanent storage. While checkpointing, the `Server` serializes all of\nits data and metadata needed to reconstruct it. During this process the `Server`\nblocks all incoming insert, sample, update, and delete requests.\n\nCheckpointing is done with a call from the Reverb `Client`:\n\n```python\n# client.checkpoint() returns the path the checkpoint was written to.\ncheckpoint_path = client.checkpoint()\n```\n\nTo restore the `reverb.Server` from a checkpoint:\n\n```python\n# The checkpointer accepts the path of the root directory in which checkpoints\n# are written. If we pass the root directory of the checkpoints written above\n# then the new server will load the most recent checkpoint written from the old\n# server.\ncheckpointer = reverb.platform.checkpointers_lib.DefaultCheckpointer(\n  path=checkpoint_path.rsplit('\u002F', 1)[0])\n\n# The arguments passed to `tables=` must be the same as those used by the\n# `Server` that wrote the checkpoint.\nserver = reverb.Server(tables=[...], checkpointer=checkpointer)\n```\n\nRefer to\n[tfrecord_checkpointer.h](https:\u002F\u002Fgithub.com\u002Fdeepmind\u002Freverb\u002Ftree\u002Fmaster\u002Freverb\u002Fcc\u002Fplatform\u002Ftfrecord_checkpointer.h)\nfor details on the implementation of checkpointing in Reverb.\n\n## Starting Reverb using `reverb_server` (beta)\n\nInstalling `dm-reverb` using `pip` will install a `reverb_server` script, which\naccepts its config as a textproto. 
For example:\n\n```bash\n$ reverb_server --config=\"\nport: 8000\ntables: {\n  table_name: \\\"my_table\\\"\n  sampler: {\n    fifo: true\n  }\n  remover: {\n    fifo: true\n  }\n  max_size: 200 max_times_sampled: 5\n  rate_limiter: {\n    min_size_to_sample: 1\n    samples_per_insert: 1\n    min_diff: $(python3 -c \"import sys; print(-sys.float_info.max)\")\n    max_diff: $(python3 -c \"import sys; print(sys.float_info.max)\")\n  }\n}\"\n```\n\nThe `rate_limiter` config is equivalent to the Python expression `MinSize(1)`,\nsee `rate_limiters.py`.\n\n\n## Citation\n\nIf you use this code, please cite the\n[Reverb paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2102.04736) as\n\n```\n@misc{cassirer2021reverb,\n      title={Reverb: A Framework For Experience Replay},\n      author={Albin Cassirer and Gabriel Barth-Maron and Eugene Brevdo and Sabela Ramos and Toby Boyd and Thibault Sottiaux and Manuel Kroiss},\n      year={2021},\n      eprint={2102.04736},\n      archivePrefix={arXiv},\n      primaryClass={cs.LG}\n}\n```\n\n\u003C!-- Links to papers go here -->\n\n[D4PG]: https:\u002F\u002Farxiv.org\u002Fabs\u002F1804.08617\n[DDPG]: https:\u002F\u002Farxiv.org\u002Fabs\u002F1509.02971\n[DQN]: https:\u002F\u002Fwww.nature.com\u002Farticles\u002Fnature14236\n[HER]: https:\u002F\u002Farxiv.org\u002Fabs\u002F1707.01495\n[PER]: https:\u002F\u002Farxiv.org\u002Fabs\u002F1511.05952\n[SAC]: https:\u002F\u002Farxiv.org\u002Fabs\u002F1801.01290\n","# Reverb\n![PyPI - Python Version](https:\u002F\u002Fimg.shields.io\u002Fpypi\u002Fpyversions\u002Fdm-reverb)\n[![PyPI version](https:\u002F\u002Fbadge.fury.io\u002Fpy\u002Fdm-reverb.svg)](https:\u002F\u002Fbadge.fury.io\u002Fpy\u002Fdm-reverb)\n\nReverb 是一个高效且易于使用的数据存储与传输系统，专为机器学习研究设计。Reverb 主要用作分布式强化学习算法中的经验回放系统，但它也支持多种数据结构表示，例如 FIFO、LIFO 和优先级队列。\n\n## 目录\n\n-   [安装](#installation)\n-   [快速入门](#quick-start)\n-   [详细概述](#detailed-overview)\n    -   [表](#tables)\n    -   [项选择策略](#item-selection-strategies)\n    -   
[速率限制](#rate-limiting)\n    -   [分片](#sharding)\n    -   [检查点](#checkpointing)\n-   [引用](#citation)\n\n## 安装\n\n请注意，Reverb 尚未针对生产环境进行充分优化，尽管我们尽力保持其正常运行，但仍可能出现故障或段错误。\n\n> :warning: Reverb 目前仅支持基于 Linux 的操作系统。\n\n推荐使用 `pip` 安装 Reverb。我们还提供了使用与发布版本相同的 Docker 镜像从源代码构建的说明。\n\nTensorFlow 可以单独安装，也可以作为 `pip` 安装的一部分一起安装。将 TensorFlow 作为安装的一部分可以确保兼容性。\n\n```shell\n$ pip install dm-reverb[tensorflow]\n\n# 不包含 TensorFlow 安装及版本依赖检查。\n$ pip install dm-reverb\n```\n\n### 每日构建\n\n[![PyPI version](https:\u002F\u002Fbadge.fury.io\u002Fpy\u002Fdm-reverb-nightly.svg)](https:\u002F\u002Fbadge.fury.io\u002Fpy\u002Fdm-reverb-nightly)\n\n```shell\n$ pip install dm-reverb-nightly[tensorflow]\n\n# 不包含 TensorFlow 安装及版本依赖检查。\n$ pip install dm-reverb-nightly\n\n```\n\n### 从源代码构建\n\n[本指南](reverb\u002Fpip_package\u002FREADME.md#how-to-develop-and-build-reverb-with-the-docker-containers) 详细介绍了如何从源代码构建 Reverb。\n\n\n### Reverb 发布版本\n\n由于一些底层库（如 `protoc` 和 `absl`），Reverb 必须与特定版本的 TensorFlow 搭配使用。如果通过 `pip install dm-reverb[tensorflow]` 安装 Reverb，将会自动安装正确版本的 TensorFlow。下表列出了每个 Reverb 版本所对应的 TensorFlow 版本以及一些值得关注的版本：\n\n  * 0.13.0 放弃了对 Python 3.8 的支持。\n  * 0.11.0 是首个支持 Python 3.11 的版本。\n  * 0.10.0 是最后一个支持 Python 3.7 的版本。\n\n\n版本 | 分支 \u002F 标签                                               | TensorFlow 版本\n------- | ---------------------------------------------------------- | ------------------\n每日构建 | [master](https:\u002F\u002Fgithub.com\u002Fdeepmind\u002Freverb)               | tf-nightly\n0.14.0  | [v0.14.0](https:\u002F\u002Fgithub.com\u002Fdeepmind\u002Freverb\u002Ftree\u002Fv0.14.0) | 2.14.0\n0.13.0  | [v0.13.0](https:\u002F\u002Fgithub.com\u002Fdeepmind\u002Freverb\u002Ftree\u002Fv0.13.0) | 2.14.0\n0.12.0  | [v0.12.0](https:\u002F\u002Fgithub.com\u002Fdeepmind\u002Freverb\u002Ftree\u002Fv0.12.0) | 2.13.0\n0.11.0  | [v0.11.0](https:\u002F\u002Fgithub.com\u002Fdeepmind\u002Freverb\u002Ftree\u002Fv0.11.0) | 2.12.0\n0.10.0  | 
[v0.10.0](https:\u002F\u002Fgithub.com\u002Fdeepmind\u002Freverb\u002Ftree\u002Fv0.10.0) | 2.11.0\n0.9.0  | [v0.9.0](https:\u002F\u002Fgithub.com\u002Fdeepmind\u002Freverb\u002Ftree\u002Fv0.9.0)   | 2.10.0\n0.8.0  | [v0.8.0](https:\u002F\u002Fgithub.com\u002Fdeepmind\u002Freverb\u002Ftree\u002Fv0.8.0)   | 2.9.0\n0.7.x  | [v0.7.0](https:\u002F\u002Fgithub.com\u002Fdeepmind\u002Freverb\u002Ftree\u002Fv0.7.0)   | 2.8.0\n\n## 快速入门\n\n启动一个 Reverb 服务器非常简单：\n\n```python\nimport reverb\n\nserver = reverb.Server(tables=[\n    reverb.Table(\n        name='my_table',\n        sampler=reverb.selectors.Uniform(),\n        remover=reverb.selectors.Fifo(),\n        max_size=100,\n        rate_limiter=reverb.rate_limiters.MinSize(1)),\n    ],\n)\n```\n\n创建一个客户端与服务器通信：\n\n```python\nclient = reverb.Client(f'localhost:{server.port}')\nprint(client.server_info())\n```\n\n向表中写入一些数据：\n\n```python\n# 创建一个单条目，并包含数据元素 [0, 1]。\nclient.insert([0, 1], priorities={'my_table': 1.0})\n```\n\n一个条目也可以引用多个数据元素：\n\n```python\n# 追加三个数据元素，并插入一个单条目，该条目引用所有这些元素，形式为 {'a': [2, 3, 4], 'b': [12, 13, 14]}。\nwith client.trajectory_writer(num_keep_alive_refs=3) as writer:\n  writer.append({'a': 2, 'b': 12})\n  writer.append({'a': 3, 'b': 13})\n  writer.append({'a': 4, 'b': 14})\n\n  # 创建一个引用所有数据的条目。\n  writer.create_item(\n      table='my_table',\n      priority=1.0,\n      trajectory={\n          'a': writer.history['a'][:],\n          'b': writer.history['b'][:],\n      })\n\n  # 等待条目被插入并由服务器确认。\n  writer.flush()\n```\n\n我们添加到 Reverb 中的条目可以通过采样来读取：\n\n```python\n# client.sample() 返回一个生成器。\nprint(list(client.sample('my_table', num_samples=2)))\n```\n\n欲了解更多信息，请参阅 [Reverb 教程](https:\u002F\u002Fgithub.com\u002Fdeepmind\u002Freverb\u002Ftree\u002Fmaster\u002Fexamples\u002Fdemo.ipynb)，其中包含交互式教程。\n\n## 详细概述\n\n经验回放已成为训练离策略强化学习策略的重要工具。它被用于诸如 [深度 Q 网络 (DQN)][DQN]、[软演员-评论家 (SAC)][SAC]、[深度确定性策略梯度 (DDPG)][DDPG] 以及 [事后经验回放 (HER)][HER] 等算法中。然而，构建一个高效、易用且可扩展的经验回放系统可能颇具挑战性。为了获得良好的性能，Reverb 使用 C++ 实现，并通过 gRPC 
服务提供分布式使用的功能，允许用户添加、采样和更新表中的内容。Python 客户端以易于使用的方式暴露了该服务的全部功能。此外，还提供了原生 TensorFlow 操作，以便与 TensorFlow 和 `tf.data` 进行高效的集成。\n\n尽管 Reverb 最初是为离策略强化学习设计的，但其灵活性使其同样适用于在线策略强化学习，甚至无监督或半监督学习。一些富有创造力的用户甚至利用 Reverb 来存储和分发频繁更新的数据（如模型权重），将其作为一种内存中的轻量级替代方案，取代分布式文件系统，其中每个表代表一个文件。\n\n### 表\n\nReverb `Server` 由一个或多个表组成。表用于存储数据项，每个数据项引用一个或多个数据元素。表还定义了采样和移除策略（[数据项选择策略](#item-selection-strategies)）、最大数据项容量以及[速率限制器](#rate-limiting)。\n\n多个数据项可以引用同一个数据元素，即使这些数据项位于不同的表中。这是因为数据项仅包含对数据元素的引用，而不是数据本身的副本。这也意味着只有当没有任何数据项再引用某个数据元素时，该数据元素才会被移除。\n\n例如，可以设置一个表作为优先级经验回放（PER）用于处理长度为2的转移序列，而另一个表则作为长度为3的序列的先进先出（FIFO）队列。在这种情况下，PER中的数据可用于训练DQN，而FIFO中的数据则可用于训练环境的转移模型。\n\n![使用多个表](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fgoogle-deepmind_reverb_readme_ddebbd6c7a57.png)\n\n当满足以下两种条件之一时，数据项会自动从表中移除：\n\n1. 插入新数据项会导致表中数据项数量超过其最大容量。此时将根据表的移除策略决定移除哪一项。\n2. 某个数据项被采样次数超过了表的速率限制器所允许的最大次数，该数据项将被删除。\n\n不再被任何数据项引用的数据元素也会被删除。\n\n用户可以完全控制如何从 Reverb 表中采样和移除数据。这一行为主要由提供给 `Table` 的 `sampler` 和 `remover` 控制，即[数据项选择策略](#item-selection-strategies)。结合[`rate_limiter`](#rate-limiting)和`max_times_sampled`，可以实现多种不同的行为。一些常用的配置包括：\n\n**均匀经验回放**\n\n维护一组最近插入的 `N=1000` 个数据项。通过设置 `sampler=reverb.selectors.Uniform()`，所有数据项被选中的概率相同。由于设置了 `reverb.rate_limiters.MinSize(100)`，在插入到100个数据项之前，采样请求会被阻塞。同时，通过设置 `remover=reverb.selectors.Fifo()`，当需要移除数据项时，最早插入的数据项将被优先移除。\n\n```python\nreverb.Table(\n     name='my_uniform_experience_replay_buffer',\n     sampler=reverb.selectors.Uniform(),\n     remover=reverb.selectors.Fifo(),\n     max_size=1000,\n     rate_limiter=reverb.rate_limiters.MinSize(100),\n)\n```\n\n使用均匀经验回放的算法示例包括 [SAC] 和 [DDPG]。\n\n**优先级经验回放**\n\n同样维护一组最近插入的 `N=1000` 个数据项。通过设置 `sampler=reverb.selectors.Prioritized(priority_exponent=0.8)`，数据项被选中的概率与其优先级成正比。\n\n注意：有关此实现中优先级经验回放所使用的算法，请参阅 [Schaul, Tom, et al.][PER]。\n\n```python\nreverb.Table(\n     name='my_prioritized_experience_replay_buffer',\n     sampler=reverb.selectors.Prioritized(0.8),\n     remover=reverb.selectors.Fifo(),\n     max_size=1000,\n     
rate_limiter=reverb.rate_limiters.MinSize(100),\n)\n```\n\n使用优先级经验回放的算法示例有 DQN 及其变体，以及[Distributed Distributional Deterministic Policy Gradients][D4PG]。\n\n**队列**\n\n最多可容纳 `N=1000` 个数据项的集合，其中最旧的数据项会在同一操作中被选中并移除。如果集合中已有1000个数据项，则插入操作将被阻塞，直到集合未满；若集合为空，则采样操作也将被阻塞，直到至少有一个数据项存在。\n\n```python\nreverb.Table(\n    name='my_queue',\n    sampler=reverb.selectors.Fifo(),\n    remover=reverb.selectors.Fifo(),\n    max_size=1000,\n    max_times_sampled=1,\n    rate_limiter=reverb.rate_limiters.Queue(size=1000),\n)\n\n# 或者使用辅助类方法 `.queue`。\nreverb.Table.queue(name='my_queue', max_size=1000)\n```\n\n使用队列的算法示例有 [IMPALA](https:\u002F\u002Farxiv.org\u002Fabs\u002F1802.01561) 以及异步实现的 [Proximal Policy Optimization](https:\u002F\u002Farxiv.org\u002Fabs\u002F1707.06347)。\n\n### 数据项选择策略\n\nReverb 定义了几种可用于数据项采样或移除的选择器：\n\n-   **Uniform:** 在所有数据项中均匀采样。\n-   **Prioritized:** 根据存储的优先级按比例采样。\n-   **FIFO:** 选择最早的数据。\n-   **LIFO:** 选择最新的数据。\n-   **MinHeap:** 选择优先级最低的数据。\n-   **MaxHeap:** 选择优先级最高的数据。\n\n以上任何一种策略都可以用于从表中采样或移除数据项。这为用户提供了灵活性，可以根据自身需求创建定制化的表。\n\n### 速率限制\n\n速率限制器允许用户强制执行某些条件，以控制何时可以从表中插入和\u002F或采样数据项。以下是 Reverb 中当前可用的速率限制器列表：\n\n-   **MinSize:** 设置表中必须至少有多少个数据项才能进行采样。\n-   **SampleToInsertRatio:** 通过阻塞插入和\u002F或采样请求来设定平均插入与采样的比率。这对于控制每个数据项在被移除前被采样的次数非常有用。\n-   **Queue:** 数据项在被移除前恰好被采样一次。\n-   **Stack:** 数据项在被移除前恰好被采样一次。\n\n### 分片\n\nReverb 服务器彼此独立，在将系统扩展到多服务器架构时，数据不会跨多个节点复制。这使得 Reverb 不适合作为传统数据库使用，但同时也带来了易于扩展的优势——在可以接受一定程度数据丢失的情况下，系统可以轻松地水平扩展。\n\n分布式系统只需增加 Reverb 服务器的数量即可实现水平扩展。当与兼容 gRPC 的负载均衡器配合使用时，只需将负载均衡目标地址提供给 Reverb 客户端，操作就会自动分配到不同的节点上。具体行为细节请参阅相关方法和类的文档。\n\n如果您的环境中没有负载均衡器，或者需要更精细的控制，仍然可以通过几乎相同的方式扩展系统。只需增加 Reverb 服务器的数量，并为每台服务器创建单独的客户端即可。\n\n### 检查点\n\nReverb 支持检查点功能，可以将 Reverb 服务器的状态和内容保存到持久化存储中。在创建检查点的过程中，`Server` 会序列化所有必要的数据和元数据，以便后续重建。在此期间，`Server` 会阻止所有传入的插入、采样、更新和删除请求。\n\n创建检查点的操作由 Reverb 客户端发起：\n\n```python\n\n# client.checkpoint() 返回检查点写入的路径。\ncheckpoint_path = client.checkpoint()\n```\n\n从检查点恢复 `reverb.Server`：\n\n```python\n# 检查点管理器接受检查点写入的根目录路径。如果我们传递上述写入的检查点的根目录，\n# 
那么新的服务器将加载旧服务器写入的最新检查点。\ncheckpointer = reverb.platform.checkpointers_lib.DefaultCheckpointer(\n  path=checkpoint_path.rsplit('\u002F', 1)[0])\n\n# 传递给 `tables=` 的参数必须与写入检查点的 `Server` 所使用的参数相同。\nserver = reverb.Server(tables=[...], checkpointer=checkpointer)\n```\n\n有关 Reverb 中检查点实现的详细信息，请参阅\n[tfrecord_checkpointer.h](https:\u002F\u002Fgithub.com\u002Fdeepmind\u002Freverb\u002Ftree\u002Fmaster\u002Freverb\u002Fcc\u002Fplatform\u002Ftfrecord_checkpointer.h)。\n\n## 使用 `reverb_server` 启动 Reverb（测试版）\n\n使用 `pip` 安装 `dm-reverb` 将会安装一个 `reverb_server` 脚本，该脚本以 textproto 格式接收其配置。例如：\n\n```bash\n$ reverb_server --config=\"\nport: 8000\ntables: {\n  table_name: \\\"my_table\\\"\n  sampler: {\n    fifo: true\n  }\n  remover: {\n    fifo: true\n  }\n  max_size: 200 max_times_sampled: 5\n  rate_limiter: {\n    min_size_to_sample: 1\n    samples_per_insert: 1\n    min_diff: $(python3 -c \"import sys; print(-sys.float_info.max)\")\n    max_diff: $(python3 -c \"import sys; print(sys.float_info.max)\")\n  }\n}\"\n```\n\n`rate_limiter` 配置等价于 Python 表达式 `MinSize(1)`，详情请参阅 `rate_limiters.py`。\n\n## 引用\n\n如果您使用此代码，请引用\n[Reverb 论文](https:\u002F\u002Farxiv.org\u002Fabs\u002F2102.04736)，格式如下：\n\n```\n@misc{cassirer2021reverb,\n      title={Reverb: A Framework For Experience Replay},\n      author={Albin Cassirer and Gabriel Barth-Maron and Eugene Brevdo and Sabela Ramos and Toby Boyd and Thibault Sottiaux and Manuel Kroiss},\n      year={2021},\n      eprint={2102.04736},\n      archivePrefix={arXiv},\n      primaryClass={cs.LG}\n}\n```\n\n\u003C!-- 论文链接放在这里 -->\n\n[D4PG]: https:\u002F\u002Farxiv.org\u002Fabs\u002F1804.08617\n[DDPG]: https:\u002F\u002Farxiv.org\u002Fabs\u002F1509.02971\n[DQN]: https:\u002F\u002Fwww.nature.com\u002Farticles\u002Fnature14236\n[HER]: https:\u002F\u002Farxiv.org\u002Fabs\u002F1707.01495\n[PER]: https:\u002F\u002Farxiv.org\u002Fabs\u002F1511.05952\n[SAC]: https:\u002F\u002Farxiv.org\u002Fabs\u002F1801.01290","# Reverb 快速上手指南\n\nReverb 是一个高效且易用的数据存储与传输系统，专为机器学习研究设计。它主要用作分布式强化学习算法的经验回放（Experience 
Replay）系统，同时也支持 FIFO、LIFO 和优先队列等多种数据结构。\n\n## 环境准备\n\n在开始之前，请确保满足以下系统要求和依赖条件：\n\n*   **操作系统**：仅支持 **Linux** 系统（Windows 和 macOS 暂不支持）。\n*   **Python 版本**：支持 Python 3.9 - 3.11（注意：0.13.0 及以上版本已放弃对 Python 3.8 的支持）。\n*   **核心依赖**：\n    *   TensorFlow（推荐安装以确保兼容性，版本需与 Reverb 对应）。\n    *   底层库依赖：`protoc` 和 `absl`。\n\n> **注意**：Reverb 目前尚未针对生产环境进行加固，开发过程中可能会遇到崩溃或段错误（segfault），建议仅在研究和实验环境中使用。\n\n### TensorFlow 版本对应表\n若通过 `pip install dm-reverb[tensorflow]` 安装，系统将自动匹配正确的 TensorFlow 版本。常见版本对应关系如下：\n\n| Reverb 版本 | TensorFlow 版本 | 备注 |\n| :--- | :--- | :--- |\n| Nightly | tf-nightly | 开发版 |\n| 0.14.0 \u002F 0.13.0 | 2.14.0 | |\n| 0.12.0 | 2.13.0 | |\n| 0.11.0 | 2.12.0 | 首个支持 Python 3.11 的版本 |\n| 0.10.0 | 2.11.0 | 最后一个支持 Python 3.7 的版本 |\n\n## 安装步骤\n\n推荐使用 `pip` 进行安装。国内用户若遇网络问题，可配置国内镜像源（如清华源、阿里源）加速下载。\n\n### 1. 标准安装（推荐）\n安装包含 TensorFlow 依赖的完整版本，确保兼容性：\n\n```shell\npip install dm-reverb[tensorflow]\n```\n\n如果已单独安装 TensorFlow 或不需要版本检查，可运行：\n\n```shell\npip install dm-reverb\n```\n\n### 2. 安装夜间构建版 (Nightly Builds)\n如需体验最新功能（可能不稳定）：\n\n```shell\npip install dm-reverb-nightly[tensorflow]\n```\n\n### 3. 国内镜像加速示例\n使用清华大学镜像源加速安装：\n\n```shell\npip install dm-reverb[tensorflow] -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple\n```\n\n### 4. 源码编译\n如需从源码构建（例如需要特定定制或使用 Docker 环境），请参考官方指南：\n[如何构建 Reverb](https:\u002F\u002Fgithub.com\u002Fdeepmind\u002Freverb\u002Ftree\u002Fmaster\u002Freverb\u002Fpip_package\u002FREADME.md#how-to-develop-and-build-reverb-with-the-docker-containers)\n\n## 基本使用\n\n以下是启动服务器、创建客户端、写入数据及采样数据的最小化工作流程。\n\n### 1. 
启动 Reverb 服务器\n定义一个表（Table），配置采样策略（Sampler）、移除策略（Remover）、最大容量及速率限制器。\n\n```python\nimport reverb\n\nserver = reverb.Server(tables=[\n    reverb.Table(\n        name='my_table',\n        sampler=reverb.selectors.Uniform(),       # 均匀采样\n        remover=reverb.selectors.Fifo(),          # 先进先出移除\n        max_size=100,                             # 最大容量 100\n        rate_limiter=reverb.rate_limiters.MinSize(1)), # 至少 1 条数据才允许采样\n    ],\n)\n```\n\n### 2. 创建客户端并连接\n连接到本地运行的服务器。\n\n```python\nclient = reverb.Client(f'localhost:{server.port}')\nprint(client.server_info())\n```\n\n### 3. 写入数据\n**方式 A：插入单个条目**\n直接插入包含数据元素 `[0, 1]` 的单个条目。\n\n```python\n# 创建单个条目和数据元素 [0, 1]\nclient.insert([0, 1], priorities={'my_table': 1.0})\n```\n\n**方式 B：写入轨迹（Trajectory）**\n适用于序列数据，一个条目可引用多个数据元素。\n\n```python\n# 追加三个数据元素，并创建一个引用所有元素的条目 {'a': [2, 3, 4], 'b': [12, 13, 14]}\nwith client.trajectory_writer(num_keep_alive_refs=3) as writer:\n  writer.append({'a': 2, 'b': 12})\n  writer.append({'a': 3, 'b': 13})\n  writer.append({'a': 4, 'b': 14})\n\n  # 创建引用所有数据的条目\n  writer.create_item(\n      table='my_table',\n      priority=1.0,\n      trajectory={\n          'a': writer.history['a'][:],\n          'b': writer.history['b'][:],\n      })\n\n  # 阻塞直到条目被服务器确认插入\n  writer.flush()\n```\n\n### 4. 
采样数据\n从表中采样数据进行训练。`client.sample()` 返回一个生成器。\n\n```python\n# 从 'my_table' 采样 2 个样本\nprint(list(client.sample('my_table', num_samples=2)))\n```\n\n> **提示**：更多高级用法（如优先经验回放、队列模式等）及交互式教程，请参阅 [Reverb Tutorial](https:\u002F\u002Fgithub.com\u002Fdeepmind\u002Freverb\u002Ftree\u002Fmaster\u002Fexamples\u002Fdemo.ipynb)。","某自动驾驶研发团队正在构建分布式强化学习系统，需让多台采集车实时上传驾驶经验，供中央训练集群高效采样以优化决策模型。\n\n### 没有 reverb 时\n- **数据同步延迟高**：各节点通过共享文件系统或消息队列传输经验数据，高并发下 I\u002FO 阻塞严重，导致训练样本更新滞后。\n- **采样策略单一僵化**：难以动态切换 FIFO（先进先出）或优先队列等结构，无法根据算法需求灵活调整旧经验的保留与剔除逻辑。\n- **资源竞争频繁**：写入与读取操作缺乏精细的流控机制，常因数据量突发导致内存溢出或训练进程等待数据而空转。\n- **断点恢复困难**：缺乏原生的检查点机制，服务重启后历史经验数据丢失，需重新预热收集，极大浪费算力资源。\n\n### 使用 reverb 后\n- **传输效率显著提升**：reverb 作为专用数据存储与传输系统，利用底层优化实现低延迟通信，确保训练集群毫秒级获取最新驾驶经验。\n- **策略配置灵活多样**：原生支持 Uniform、FIFO 及优先级队列等多种采样与移除策略，团队可一键切换以适应不同强化学习算法的收敛需求。\n- **流控机制稳定可靠**：内置速率限制器（Rate Limiter）自动平衡读写速度，防止缓冲区爆炸，保障分布式环境下的高吞吐稳定性。\n- **状态持久化无忧**：借助内置检查点功能，意外中断后可快速恢复至断开前的数据状态，确保持续学习过程的连贯性与数据完整性。\n\nreverb 通过提供高效、灵活且稳定的经验回放机制，将分布式强化学习的研发迭代周期从“天”级缩短至“小时”级。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fgoogle-deepmind_reverb_ab8a08f8.png","google-deepmind","Google DeepMind","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002Fgoogle-deepmind_06b1dd17.png","",null,"https:\u002F\u002Fwww.deepmind.com\u002F","https:\u002F\u002Fgithub.com\u002Fgoogle-deepmind",[22,26,30,34,38,42,45],{"name":23,"color":24,"percentage":25},"C++","#f34b7d",73.5,{"name":27,"color":28,"percentage":29},"Python","#3572A5",19,{"name":31,"color":32,"percentage":33},"Starlark","#76d275",6.7,{"name":35,"color":36,"percentage":37},"Shell","#89e051",0.3,{"name":39,"color":40,"percentage":41},"C","#555555",0.2,{"name":43,"color":44,"percentage":41},"Dockerfile","#384d54",{"name":46,"color":18,"percentage":47},"Linker Script",0,770,108,"2026-04-02T20:19:43","Apache-2.0",2,"Linux","未说明",{"notes":56,"python":57,"dependencies":58},"该工具目前仅支持基于 Linux 的操作系统，不支持 macOS 和 Windows。Reverb 尚未针对生产环境进行加固，可能会出现崩溃或段错误。安装时建议通过 'pip install dm-reverb[tensorflow]' 以确保 
TensorFlow 版本兼容，不同版本的 Reverb 强制绑定特定版本的 TensorFlow（例如 Reverb 0.14.0 对应 TF 2.14.0）。底层依赖包括 protoc 和 absl。","3.9, 3.10, 3.11 (0.13.0 及以上版本不支持 Python 3.8；0.10.0 及以下版本支持 Python 3.7)",[59,60,61],"tensorflow>=2.8.0 (需与 Reverb 版本严格对应)","protoc","absl",[63,64],"开发框架","数据工具","ready","2026-03-27T02:49:30.150509","2026-04-06T05:27:03.142060",[69,74,79,84,89],{"id":70,"question_zh":71,"answer_zh":72,"source_url":73},11117,"如何在 TPU 上使用 ReverbDataset？遇到 'Op type not registered' 错误怎么办？","在 Cloud TPU VM 上，系统预装了自定义版本的 tf-nightly，这可能导致 Reverb 操作无法注册。解决方案是安装与预装 TensorFlow 版本匹配的 Reverb 版本，或者重新构建 Reverb 以匹配预装的 TensorFlow 版本。如果安装默认的 tensorflow 2.6.0 和 reverb 0.4.0，虽然导入正常，但可能会丢失 TPU 特定的操作（如 ConfigureDistributedTPU）。因此，通常需要从源码构建 Reverb，确保其编译环境与 TPU VM 上的 TensorFlow 版本一致。","https:\u002F\u002Fgithub.com\u002Fgoogle-deepmind\u002Freverb\u002Fissues\u002F29",{"id":75,"question_zh":76,"answer_zh":77,"source_url":78},11118,"导入 Reverb 时出现 'undefined symbol' 或 protobuf 相关错误如何解决？","这通常是由于编译时的 ABI 设置与系统环境不匹配导致的。在自行构建 Reverb wheel 包时，需要修改 `.bazelrc` 文件中的编译选项。具体来说，将 `build --cxxopt=\"-D_GLIBCXX_USE_CXX11_ABI=0\"` 改为 `build --cxxopt=\"-D_GLIBCXX_USE_CXX11_ABI=1\"`（或者根据你系统中 TensorFlow 的 ABI 设置进行调整），并注释掉特定的 manylinux 工具链配置，然后重新构建即可解决符号未定义的问题。","https:\u002F\u002Fgithub.com\u002Fgoogle-deepmind\u002Freverb\u002Fissues\u002F53",{"id":80,"question_zh":81,"answer_zh":82,"source_url":83},11119,"从 Reverb 0.2.0 升级到 0.3.1 后性能大幅下降怎么办？","性能下降可能与特定提交中状态处理方式的变更有关。如果遇到此问题，建议尝试使用维护者提供的修复分支进行构建。例如，可以从 `ebrevdo\u002Freverb` 仓库的 `after` 分支拉取代码，并针对你的 TensorFlow 版本（如 2.4.1）进行编译。该分支包含了性能优化的修复，测试显示可以恢复到甚至超过升级前的吞吐量水平。","https:\u002F\u002Fgithub.com\u002Fgoogle-deepmind\u002Freverb\u002Fissues\u002F57",{"id":85,"question_zh":86,"answer_zh":87,"source_url":88},11120,"如何实现 Reverb 服务的分片（Sharding）以支持分布式优先回放缓冲区？","官方文档中关于分片的细节较少。一种实践方案是结合 Ray 框架使用：不在 Learner 进程内直接运行所有 Actor，而是使用 `@ray.remote` 装饰器重写 Actor 类（如 `FeedForwardActor`），使其分布在集群的不同 CPU 上运行。同时，定义一个远程变量客户端（RemoteVariableClient）在 Learner 外部异步共享参数。这样可以避免大量模型参数传输阻塞 Learner 
的更新过程，从而实现高效的分布式训练架构（如 APEX 或 R2D2）。","https:\u002F\u002Fgithub.com\u002Fgoogle-deepmind\u002Freverb\u002Fissues\u002F3",{"id":90,"question_zh":91,"answer_zh":92,"source_url":93},11121,"遇到 'Too many open files' 错误或 Checkpoint 超过 2GB 限制如何处理？","这类错误通常与底层日志后端（envlogger）的文件管理策略有关。可以通过配置每个文件的最大轨迹数（max number of trajectories per file）来减少生成的文件数量，从而避免达到操作系统文件句柄限制或单个文件过大。具体配置可参考 envlogger 的 Riegeli 后端写入器代码。如果问题持续，建议在 envlogger 仓库中提单，因为这是其内部机制导致的问题。","https:\u002F\u002Fgithub.com\u002Fgoogle-deepmind\u002Freverb\u002Fissues\u002F86",[],[96,107,116,124,132,144],{"id":97,"name":98,"github_repo":99,"description_zh":100,"stars":101,"difficulty_score":102,"last_commit_at":103,"category_tags":104,"status":65},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,3,"2026-04-05T11:01:52",[63,105,106],"图像","Agent",{"id":108,"name":109,"github_repo":110,"description_zh":111,"stars":112,"difficulty_score":52,"last_commit_at":113,"category_tags":114,"status":65},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 
AI 真正成长为懂上下文的智能助手。",138956,"2026-04-05T11:33:21",[63,106,115],"语言模型",{"id":117,"name":118,"github_repo":119,"description_zh":120,"stars":121,"difficulty_score":52,"last_commit_at":122,"category_tags":123,"status":65},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",107662,"2026-04-03T11:11:01",[63,105,106],{"id":125,"name":126,"github_repo":127,"description_zh":128,"stars":129,"difficulty_score":52,"last_commit_at":130,"category_tags":131,"status":65},3704,"NextChat","ChatGPTNextWeb\u002FNextChat","NextChat 是一款轻量且极速的 AI 助手，旨在为用户提供流畅、跨平台的大模型交互体验。它完美解决了用户在多设备间切换时难以保持对话连续性，以及面对众多 AI 模型不知如何统一管理的痛点。无论是日常办公、学习辅助还是创意激发，NextChat 都能让用户随时随地通过网页、iOS、Android、Windows、MacOS 或 Linux 端无缝接入智能服务。\n\n这款工具非常适合普通用户、学生、职场人士以及需要私有化部署的企业团队使用。对于开发者而言，它也提供了便捷的自托管方案，支持一键部署到 Vercel 或 Zeabur 等平台。\n\nNextChat 的核心亮点在于其广泛的模型兼容性，原生支持 Claude、DeepSeek、GPT-4 及 Gemini Pro 等主流大模型，让用户在一个界面即可自由切换不同 AI 能力。此外，它还率先支持 MCP（Model Context Protocol）协议，增强了上下文处理能力。针对企业用户，NextChat 提供专业版解决方案，具备品牌定制、细粒度权限控制、内部知识库整合及安全审计等功能，满足公司对数据隐私和个性化管理的高标准要求。",87618,"2026-04-05T07:20:52",[63,115],{"id":133,"name":134,"github_repo":135,"description_zh":136,"stars":137,"difficulty_score":52,"last_commit_at":138,"category_tags":139,"status":65},2268,"ML-For-Beginners","microsoft\u002FML-For-Beginners","ML-For-Beginners 是由微软推出的一套系统化机器学习入门课程，旨在帮助零基础用户轻松掌握经典机器学习知识。这套课程将学习路径规划为 12 周，包含 26 节精炼课程和 52 道配套测验，内容涵盖从基础概念到实际应用的完整流程，有效解决了初学者面对庞大知识体系时无从下手、缺乏结构化指导的痛点。\n\n无论是希望转型的开发者、需要补充算法背景的研究人员，还是对人工智能充满好奇的普通爱好者，都能从中受益。课程不仅提供了清晰的理论讲解，还强调动手实践，让用户在循序渐进中建立扎实的技能基础。其独特的亮点在于强大的多语言支持，通过自动化机制提供了包括简体中文在内的 
50 多种语言版本，极大地降低了全球不同背景用户的学习门槛。此外，项目采用开源协作模式，社区活跃且内容持续更新，确保学习者能获取前沿且准确的技术资讯。如果你正寻找一条清晰、友好且专业的机器学习入门之路，ML-For-Beginners 将是理想的起点。",84991,"2026-04-05T10:45:23",[105,64,140,141,106,142,115,63,143],"视频","插件","其他","音频",{"id":145,"name":146,"github_repo":147,"description_zh":148,"stars":149,"difficulty_score":102,"last_commit_at":150,"category_tags":151,"status":65},3128,"ragflow","infiniflow\u002Fragflow","RAGFlow 是一款领先的开源检索增强生成（RAG）引擎，旨在为大语言模型构建更精准、可靠的上下文层。它巧妙地将前沿的 RAG 技术与智能体（Agent）能力相结合，不仅支持从各类文档中高效提取知识，还能让模型基于这些知识进行逻辑推理和任务执行。\n\n在大模型应用中，幻觉问题和知识滞后是常见痛点。RAGFlow 通过深度解析复杂文档结构（如表格、图表及混合排版），显著提升了信息检索的准确度，从而有效减少模型“胡编乱造”的现象，确保回答既有据可依又具备时效性。其内置的智能体机制更进一步，使系统不仅能回答问题，还能自主规划步骤解决复杂问题。\n\n这款工具特别适合开发者、企业技术团队以及 AI 研究人员使用。无论是希望快速搭建私有知识库问答系统，还是致力于探索大模型在垂直领域落地的创新者，都能从中受益。RAGFlow 提供了可视化的工作流编排界面和灵活的 API 接口，既降低了非算法背景用户的上手门槛，也满足了专业开发者对系统深度定制的需求。作为基于 Apache 2.0 协议开源的项目，它正成为连接通用大模型与行业专有知识之间的重要桥梁。",77062,"2026-04-04T04:44:48",[106,105,63,115,142]]