[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-allenai--tango":3,"tool-allenai--tango":61},[4,18,26,36,44,53],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":17},4358,"openclaw","openclaw\u002Fopenclaw","OpenClaw 是一款专为个人打造的本地化 AI 助手，旨在让你在自己的设备上拥有完全可控的智能伙伴。它打破了传统 AI 助手局限于特定网页或应用的束缚，能够直接接入你日常使用的各类通讯渠道，包括微信、WhatsApp、Telegram、Discord、iMessage 等数十种平台。无论你在哪个聊天软件中发送消息，OpenClaw 都能即时响应，甚至支持在 macOS、iOS 和 Android 设备上进行语音交互，并提供实时的画布渲染功能供你操控。\n\n这款工具主要解决了用户对数据隐私、响应速度以及“始终在线”体验的需求。通过将 AI 部署在本地，用户无需依赖云端服务即可享受快速、私密的智能辅助，真正实现了“你的数据，你做主”。其独特的技术亮点在于强大的网关架构，将控制平面与核心助手分离，确保跨平台通信的流畅性与扩展性。\n\nOpenClaw 非常适合希望构建个性化工作流的技术爱好者、开发者，以及注重隐私保护且不愿被单一生态绑定的普通用户。只要具备基础的终端操作能力（支持 macOS、Linux 及 Windows WSL2），即可通过简单的命令行引导完成部署。如果你渴望拥有一个懂你",349277,3,"2026-04-06T06:32:30",[13,14,15,16],"Agent","开发框架","图像","数据工具","ready",{"id":19,"name":20,"github_repo":21,"description_zh":22,"stars":23,"difficulty_score":10,"last_commit_at":24,"category_tags":25,"status":17},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,"2026-04-05T11:01:52",[14,15,13],{"id":27,"name":28,"github_repo":29,"description_zh":30,"stars":31,"difficulty_score":32,"last_commit_at":33,"category_tags":34,"status":17},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 
代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",150720,2,"2026-04-11T11:33:10",[14,13,35],"语言模型",{"id":37,"name":38,"github_repo":39,"description_zh":40,"stars":41,"difficulty_score":32,"last_commit_at":42,"category_tags":43,"status":17},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",108322,"2026-04-10T11:39:34",[14,15,13],{"id":45,"name":46,"github_repo":47,"description_zh":48,"stars":49,"difficulty_score":32,"last_commit_at":50,"category_tags":51,"status":17},6121,"gemini-cli","google-gemini\u002Fgemini-cli","gemini-cli 是一款由谷歌推出的开源 AI 命令行工具，它将强大的 Gemini 大模型能力直接集成到用户的终端环境中。对于习惯在命令行工作的开发者而言，它提供了一条从输入提示词到获取模型响应的最短路径，无需切换窗口即可享受智能辅助。\n\n这款工具主要解决了开发过程中频繁上下文切换的痛点，让用户能在熟悉的终端界面内直接完成代码理解、生成、调试以及自动化运维任务。无论是查询大型代码库、根据草图生成应用，还是执行复杂的 Git 操作，gemini-cli 都能通过自然语言指令高效处理。\n\n它特别适合广大软件工程师、DevOps 人员及技术研究人员使用。其核心亮点包括支持高达 100 万 token 的超长上下文窗口，具备出色的逻辑推理能力；内置 Google 搜索、文件操作及 Shell 命令执行等实用工具；更独特的是，它支持 MCP（模型上下文协议），允许用户灵活扩展自定义集成，连接如图像生成等外部能力。此外，个人谷歌账号即可享受免费的额度支持，且项目基于 Apache 2.0 
协议完全开源，是提升终端工作效率的理想助手。",100752,"2026-04-10T01:20:03",[52,13,15,14],"插件",{"id":54,"name":55,"github_repo":56,"description_zh":57,"stars":58,"difficulty_score":32,"last_commit_at":59,"category_tags":60,"status":17},4721,"markitdown","microsoft\u002Fmarkitdown","MarkItDown 是一款由微软 AutoGen 团队打造的轻量级 Python 工具，专为将各类文件高效转换为 Markdown 格式而设计。它支持 PDF、Word、Excel、PPT、图片（含 OCR）、音频（含语音转录）、HTML 乃至 YouTube 链接等多种格式的解析，能够精准提取文档中的标题、列表、表格和链接等关键结构信息。\n\n在人工智能应用日益普及的今天，大语言模型（LLM）虽擅长处理文本，却难以直接读取复杂的二进制办公文档。MarkItDown 恰好解决了这一痛点，它将非结构化或半结构化的文件转化为模型“原生理解”且 Token 效率极高的 Markdown 格式，成为连接本地文件与 AI 分析 pipeline 的理想桥梁。此外，它还提供了 MCP（模型上下文协议）服务器，可无缝集成到 Claude Desktop 等 LLM 应用中。\n\n这款工具特别适合开发者、数据科学家及 AI 研究人员使用，尤其是那些需要构建文档检索增强生成（RAG）系统、进行批量文本分析或希望让 AI 助手直接“阅读”本地文件的用户。虽然生成的内容也具备一定可读性，但其核心优势在于为机器",93400,"2026-04-06T19:52:38",[52,14],{"id":62,"github_repo":63,"name":64,"description_en":65,"description_zh":66,"ai_summary_zh":66,"readme_en":67,"readme_zh":68,"quickstart_zh":69,"use_case_zh":70,"hero_image_url":71,"owner_login":72,"owner_name":73,"owner_avatar_url":74,"owner_bio":75,"owner_company":76,"owner_location":76,"owner_email":77,"owner_twitter":76,"owner_website":78,"owner_url":79,"languages":80,"stars":100,"forks":101,"last_commit_at":102,"license":103,"difficulty_score":32,"env_os":104,"env_gpu":105,"env_ram":104,"env_deps":106,"category_tags":113,"github_topics":114,"view_count":32,"oss_zip_url":76,"oss_zip_packed_at":76,"status":17,"created_at":121,"updated_at":122,"faqs":123,"releases":153},6626,"allenai\u002Ftango","tango","Organize your experiments into discrete steps that can be cached and reused throughout the lifetime of your research project.","Tango 是由 AllenAI 开发的一款开源实验管理工具，旨在帮助研究人员将复杂的 AI 实验流程拆解为独立、可复用的步骤。在长期的科研项目中，研究者常面临目录结构混乱、文件版本难以追踪以及重复计算浪费资源等痛点。Tango 通过引入“步骤化”的管理理念，自动缓存每个步骤的运行结果。当实验配置未发生变化时，它能直接调用缓存数据跳过重复执行，从而显著节省计算时间和成本，让实验过程更加整洁有序。\n\n这款工具特别适合从事机器学习、深度学习及相关领域研究的开发者与科研人员。无论是调试模型超参数，还是构建复杂的数据处理流水线，Tango 都能提供清晰的管理视图。其核心技术亮点在于灵活的缓存机制与模块化设计：用户只需通过简单的 Python 装饰器定义步骤，并配合声明式的配置文件（支持 
Jsonnet）即可启动实验。此外，Tango 还具备良好的扩展性，支持与 PyTorch、Weights & Biases 等主流生态无缝集成，并提供 Docker 部署方案，能够轻松适应从本地调试到大规模集群训练的各种场景，是提升科研效率的得力助手。","\u003Cdiv align=\"center\">\n\u003Cbr>\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fallenai_tango_readme_9b7ad81e7ff0.png\" width=\"600\"\u002F>\n\u003Cbr>\n\u003Cbr>\n\u003Cp>\n\u003C!-- start tagline -->\nAI2 Tango replaces messy directories and spreadsheets full of file versions by organizing experiments into discrete steps that can be cached and reused throughout the lifetime of a research project.\n\u003C!-- end tagline -->\n\u003C\u002Fp>\n\u003Chr\u002F>\n\u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fallenai\u002Ftango\u002Factions\">\n    \u003Cimg alt=\"CI\" src=\"https:\u002F\u002Fgithub.com\u002Fallenai\u002Ftango\u002Fworkflows\u002FCI\u002Fbadge.svg?event=push&branch=main\">\n\u003C\u002Fa>\n\u003Ca href=\"https:\u002F\u002Fpypi.org\u002Fproject\u002Fai2-tango\u002F\">\n    \u003Cimg alt=\"PyPI\" src=\"https:\u002F\u002Fimg.shields.io\u002Fpypi\u002Fv\u002Fai2-tango\">\n\u003C\u002Fa>\n\u003Ca href=\"https:\u002F\u002Fai2-tango.readthedocs.io\u002Fen\u002Flatest\u002F?badge=latest\">\n    \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fallenai_tango_readme_13d664e1afd7.png\" alt=\"Documentation Status\" \u002F>\n\u003C\u002Fa>\n\u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fallenai\u002Ftango\u002Fblob\u002Fmain\u002FLICENSE\">\n    \u003Cimg alt=\"License\" src=\"https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Flicense\u002Fallenai\u002Ftango.svg?color=blue&cachedrop\">\n\u003C\u002Fa>\n\u003Cbr\u002F>\n\u003C\u002Fdiv>\n\n## Quick links\n\n- [Documentation](https:\u002F\u002Fai2-tango.readthedocs.io\u002F)\n- [PyPI Package](https:\u002F\u002Fpypi.org\u002Fproject\u002Fai2-tango\u002F)\n- [Contributing](https:\u002F\u002Fgithub.com\u002Fallenai\u002Ftango\u002Fblob\u002Fmain\u002FCONTRIBUTING.md)\n- 
[License](https:\u002F\u002Fgithub.com\u002Fallenai\u002Ftango\u002Fblob\u002Fmain\u002FLICENSE)\n\n## In this README\n\n- [Quick start](#quick-start)\n- [Installation](#installation)\n  - [Installing with PIP](#installing-with-pip)\n  - [Installing with Conda](#installing-with-conda)\n  - [Installing from source](#installing-from-source)\n  - [Checking your installation](#checking-your-installation)\n  - [Docker image](#docker-image)\n- [FAQ](#faq)\n- [Team](#team)\n- [License](#license)\n\n## Quick start\n\nCreate a Tango step:\n\n```python\n# hello.py\n\nfrom tango import step\n\n@step()\ndef hello(name: str) -> str:\n    message = f\"Hello, {name}!\"\n    print(message)\n    return message\n```\n\nAnd create a corresponding experiment configuration file:\n\n```jsonnet\n\u002F\u002F hello.jsonnet\n\n{\n  steps: {\n    hello: {\n      type: \"hello\",\n      name: \"World\",\n    }\n  }\n}\n```\n\nThen run the experiment using a local workspace to cache the result:\n\n```bash\ntango run hello.jsonnet -w \u002Ftmp\u002Fworkspace\n```\n\nYou'll see something like this in the output:\n\n```\nStarting new run expert-llama\n● Starting step \"hello\"...\nHello, World!\n✓ Finished step \"hello\"\n✓ Finished run expert-llama\n```\n\nIf you run this a second time the output will now look like this:\n\n```\nStarting new run open-crab\n✓ Found output for step \"hello\" in cache...\n✓ Finished run open-crab\n```\n\nYou won't see \"Hello, World!\" this time because the result of the step was found in the cache, so it wasn't run again.\n\nFor a more detailed introduction check out the [First Steps](https:\u002F\u002Fai2-tango.readthedocs.io\u002Fen\u002Flatest\u002Ffirst_steps.html) walk-through.\n\n## Installation\n\n\u003C!-- start install -->\n\n**ai2-tango** requires Python 3.8 or later.\n\n### Installing with `pip`\n\n**ai2-tango** is available [on PyPI](https:\u002F\u002Fpypi.org\u002Fproject\u002Fai2-tango\u002F). 
Just run\n\n```bash\npip install ai2-tango\n```\n\nTo install with a specific integration, such as `torch` for example, run\n\n```bash\npip install 'ai2-tango[torch]'\n```\n\nTo install with all integrations, run\n\n```bash\npip install 'ai2-tango[all]'\n```\n\n### Installing with `conda`\n\n**ai2-tango** is available on conda-forge. You can install just the base package with\n\n```bash\nconda install tango -c conda-forge\n```\n\nYou can pick and choose from the integrations with one of these:\n\n```bash\nconda install tango-datasets -c conda-forge\nconda install tango-torch -c conda-forge\nconda install tango-wandb -c conda-forge\n```\n\nYou can also install everything:\n\n```bash\nconda install tango-all -c conda-forge\n```\n\nEven though **ai2-tango** itself is quite small, installing everything will pull in a lot of dependencies.\nDon't be surprised if this takes a while!\n\n### Installing from source\n\nTo install **ai2-tango** from source, first clone [the repository](https:\u002F\u002Fgithub.com\u002Fallenai\u002Ftango):\n\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002Fallenai\u002Ftango.git\ncd tango\n```\n\nThen run\n\n```bash\npip install -e '.[all]'\n```\n\nTo install with only a specific integration, such as `torch` for example, run\n\n```bash\npip install -e '.[torch]'\n```\n\nOr to install just the base tango library, you can run\n\n```bash\npip install -e .\n```\n\n### Checking your installation\n\nRun\n\n```bash\ntango info\n```\n\nto check your installation.\n\n### Docker image\n\nYou can build a Docker image suitable for tango projects by using [the official Dockerfile](https:\u002F\u002Fgithub.com\u002Fallenai\u002Ftango\u002Fblob\u002Fmain\u002FDockerfile) as a starting point for your own Dockerfile, or you can simply use one of our [prebuilt images](https:\u002F\u002Fgithub.com\u002Fallenai\u002Ftango\u002Fpkgs\u002Fcontainer\u002Ftango) as a base image in your Dockerfile. 
For example:\n\n```Dockerfile\n# Start from a prebuilt tango base image.\n# You can choose the right tag from the available options here:\n# https:\u002F\u002Fgithub.com\u002Fallenai\u002Ftango\u002Fpkgs\u002Fcontainer\u002Ftango\u002Fversions\nFROM ghcr.io\u002Fallenai\u002Ftango:cuda11.3\n\n# Install your project's additional requirements.\nCOPY requirements.txt .\nRUN \u002Fopt\u002Fconda\u002Fbin\u002Fpip install --no-cache-dir -r requirements.txt\n\n# Install source code.\n# This instruction copies EVERYTHING in the current directory (build context),\n# which may not be what you want. Consider using a \".dockerignore\" file to\n# exclude files and directories that you don't want on the image.\nCOPY . .\n```\n\nMake sure to choose the right base image for your use case depending on the version of tango you're using and the CUDA version that your host machine supports.\nYou can see a list of all available image tags [on GitHub](https:\u002F\u002Fgithub.com\u002Fallenai\u002Ftango\u002Fpkgs\u002Fcontainer\u002Ftango\u002Fversions).\n\n\u003C!-- end install -->\n\n## FAQ\n\n\u003C!-- start faq -->\n\n### Why is the library named Tango?\n\nThe motivation behind this library is that we can make research easier by composing it into well-defined steps.  What happens when you choreograph a number of steps together?  Well, you get a dance.  And since our [team's leader](https:\u002F\u002Fnasmith.github.io\u002F) is part of a tango band, \"AI2 Tango\" was an obvious choice!\n\n### How can I debug my steps through the Tango CLI?\n\nYou can run the `tango` command through [pdb](https:\u002F\u002Fdocs.python.org\u002F3\u002Flibrary\u002Fpdb.html). 
For example:\n\n```bash\npython -m pdb -m tango run config.jsonnet\n```\n\n### How is Tango different from [Metaflow](https:\u002F\u002Fmetaflow.org), [Airflow](https:\u002F\u002Fairflow.apache.org), or [redun](https:\u002F\u002Fgithub.com\u002Finsitro\u002Fredun)?\n\nWe've found that existing DAG execution engines like these tools are great for production workflows but not as well suited for messy, collaborative research projects\nwhere code is changing constantly. AI2 Tango was built *specifically* for these kinds of research projects.\n\n### How does Tango's caching mechanism work?\n\nAI2 Tango caches the results of steps based on the `unique_id` of the step. The `unique_id` is essentially a hash of all of the inputs to the step along with:\n\n1. the step class's fully qualified name, and\n2. the step class's `VERSION` class variable (an arbitrary string).\n\nUnlike other workflow engines like [redun](https:\u002F\u002Fgithub.com\u002Finsitro\u002Fredun), Tango does *not* take into account the source code of the class itself (other than its fully qualified name) because we've found that using a hash of the source code bytes is way too sensitive and less transparent for users.\nWhen you change the source code of your step in a meaningful way you can just manually change the `VERSION` class variable to indicate to Tango\nthat the step has been updated.\n\n\u003C!-- end faq -->\n\n## Team\n\n\u003C!-- start team -->\n\n**ai2-tango** is developed and maintained by the AllenNLP team, backed by [the Allen Institute for Artificial Intelligence (AI2)](https:\u002F\u002Fallenai.org\u002F).\nAI2 is a non-profit institute with the mission to contribute to humanity through high-impact AI research and engineering.\nTo learn more about who specifically contributed to this codebase, see [our contributors](https:\u002F\u002Fgithub.com\u002Fallenai\u002Ftango\u002Fgraphs\u002Fcontributors) page.\n\n\u003C!-- end team -->\n\n## License\n\n\u003C!-- start license 
-->\n\n**ai2-tango** is licensed under [Apache 2.0](https:\u002F\u002Fwww.apache.org\u002Flicenses\u002FLICENSE-2.0).\nA full copy of the license can be found [on GitHub](https:\u002F\u002Fgithub.com\u002Fallenai\u002Ftango\u002Fblob\u002Fmain\u002FLICENSE).\n\n\u003C!-- end license -->\n","\u003Cdiv align=\"center\">\n\u003Cbr>\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fallenai_tango_readme_9b7ad81e7ff0.png\" width=\"600\"\u002F>\n\u003Cbr>\n\u003Cbr>\n\u003Cp>\n\u003C!-- start tagline -->\nAI2 Tango 通过将实验组织成可在整个研究项目生命周期中缓存和重用的离散步骤，取代了杂乱无章的目录和充斥着文件版本的电子表格。\n\u003C!-- end tagline -->\n\u003C\u002Fp>\n\u003Chr\u002F>\n\u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fallenai\u002Ftango\u002Factions\">\n    \u003Cimg alt=\"CI\" src=\"https:\u002F\u002Fgithub.com\u002Fallenai\u002Ftango\u002Fworkflows\u002FCI\u002Fbadge.svg?event=push&branch=main\">\n\u003C\u002Fa>\n\u003Ca href=\"https:\u002F\u002Fpypi.org\u002Fproject\u002Fai2-tango\u002F\">\n    \u003Cimg alt=\"PyPI\" src=\"https:\u002F\u002Fimg.shields.io\u002Fpypi\u002Fv\u002Fai2-tango\">\n\u003C\u002Fa>\n\u003Ca href=\"https:\u002F\u002Fai2-tango.readthedocs.io\u002Fen\u002Flatest\u002F?badge=latest\">\n    \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fallenai_tango_readme_13d664e1afd7.png\" alt=\"Documentation Status\" \u002F>\n\u003C\u002Fa>\n\u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fallenai\u002Ftango\u002Fblob\u002Fmain\u002FLICENSE\">\n    \u003Cimg alt=\"License\" src=\"https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Flicense\u002Fallenai\u002Ftango.svg?color=blue&cachedrop\">\n\u003C\u002Fa>\n\u003Cbr\u002F>\n\u003C\u002Fdiv>\n\n## 快速链接\n\n- [文档](https:\u002F\u002Fai2-tango.readthedocs.io\u002F)\n- [PyPI 包](https:\u002F\u002Fpypi.org\u002Fproject\u002Fai2-tango\u002F)\n- [贡献指南](https:\u002F\u002Fgithub.com\u002Fallenai\u002Ftango\u002Fblob\u002Fmain\u002FCONTRIBUTING.md)\n- 
[许可证](https:\u002F\u002Fgithub.com\u002Fallenai\u002Ftango\u002Fblob\u002Fmain\u002FLICENSE)\n\n## 在本 README 中\n\n- [快速入门](#quick-start)\n- [安装](#installation)\n  - [使用 pip 安装](#installing-with-pip)\n  - [使用 Conda 安装](#installing-with-conda)\n  - [从源码安装](#installing-from-source)\n  - [检查安装](#checking-your-installation)\n  - [Docker 镜像](#docker-image)\n- [常见问题](#faq)\n- [团队](#team)\n- [许可证](#license)\n\n## 快速入门\n\n创建一个 Tango 步骤：\n\n```python\n# hello.py\n\nfrom tango import step\n\n@step()\ndef hello(name: str) -> str:\n    message = f\"Hello, {name}!\"\n    print(message)\n    return message\n```\n\n并创建相应的实验配置文件：\n\n```jsonnet\n\u002F\u002F hello.jsonnet\n\n{\n  steps: {\n    hello: {\n      type: \"hello\",\n      name: \"World\",\n    }\n  }\n}\n```\n\n然后使用本地工作区运行实验以缓存结果：\n\n```bash\ntango run hello.jsonnet -w \u002Ftmp\u002Fworkspace\n```\n\n你将在输出中看到类似以下的内容：\n\n```\nStarting new run expert-llama\n● Starting step \"hello\"...\nHello, World!\n✓ Finished step \"hello\"\n✓ Finished run expert-llama\n```\n\n如果第二次运行，输出将变为：\n\n```\nStarting new run open-crab\n✓ Found output for step \"hello\" in cache...\n✓ Finished run open-crab\n```\n\n这次你不会看到“Hello, World!”，因为该步骤的结果已存在于缓存中，因此未再次执行。\n\n如需更详细的介绍，请参阅[第一步](https:\u002F\u002Fai2-tango.readthedocs.io\u002Fen\u002Flatest\u002Ffirst_steps.html)教程。\n\n## 安装\n\n\u003C!-- start install -->\n\n**ai2-tango** 需要 Python 3.8 或更高版本。\n\n### 使用 `pip` 安装\n\n**ai2-tango** 已在 [PyPI](https:\u002F\u002Fpypi.org\u002Fproject\u002Fai2-tango\u002F) 上发布。只需运行：\n\n```bash\npip install ai2-tango\n```\n\n若需安装特定集成（例如 `torch`），可运行：\n\n```bash\npip install 'ai2-tango[torch]'\n```\n\n若需安装所有集成，可运行：\n\n```bash\npip install 'ai2-tango[all]'\n```\n\n### 使用 `conda` 安装\n\n**ai2-tango** 可在 conda-forge 上获取。仅安装基础包时，可运行：\n\n```bash\nconda install tango -c conda-forge\n```\n\n也可根据需要选择安装特定集成：\n\n```bash\nconda install tango-datasets -c conda-forge\nconda install tango-torch -c conda-forge\nconda install tango-wandb -c 
conda-forge\n```\n\n或直接安装所有集成：\n\n```bash\nconda install tango-all -c conda-forge\n```\n\n尽管 **ai2-tango** 本身体积较小，但安装全部集成会引入大量依赖项。请做好耐心等待的准备！\n\n### 从源码安装\n\n要从源码安装 **ai2-tango**，首先克隆 [仓库](https:\u002F\u002Fgithub.com\u002Fallenai\u002Ftango)：\n\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002Fallenai\u002Ftango.git\ncd tango\n```\n\n然后运行：\n\n```bash\npip install -e '.[all]'\n```\n\n若仅需安装特定集成（如 `torch`），可运行：\n\n```bash\npip install -e '.[torch]'\n```\n\n或者仅安装基础库，可运行：\n\n```bash\npip install -e .\n```\n\n### 检查安装\n\n运行以下命令以验证安装是否成功：\n\n```bash\ntango info\n```\n\n### Docker 镜像\n\n你可以基于 [官方 Dockerfile](https:\u002F\u002Fgithub.com\u002Fallenai\u002Ftango\u002Fblob\u002Fmain\u002FDockerfile) 构建适合 Tango 项目的 Docker 镜像，也可以直接使用我们提供的 [预构建镜像](https:\u002F\u002Fgithub.com\u002Fallenai\u002Ftango\u002Fpkgs\u002Fcontainer\u002Ftango) 作为你的 Dockerfile 的基础镜像。例如：\n\n```Dockerfile\n# 从预构建的 Tango 基础镜像开始。\n# 你可以从这里选择合适的标签：\n# https:\u002F\u002Fgithub.com\u002Fallenai\u002Ftango\u002Fpkgs\u002Fcontainer\u002Ftango\u002Fversions\nFROM ghcr.io\u002Fallenai\u002Ftango:cuda11.3\n\n# 安装项目所需的额外依赖。\nCOPY requirements.txt .\nRUN \u002Fopt\u002Fconda\u002Fbin\u002Fpip install --no-cache-dir -r requirements.txt\n\n# 安装源代码。\n# 此指令会复制当前目录（构建上下文）中的所有内容，\n# 这可能并非你所期望的行为。建议使用 `.dockerignore` 文件来排除不需要包含在镜像中的文件和目录。\nCOPY . 
.\n```\n\n请根据你使用的 Tango 版本以及宿主机支持的 CUDA 版本，选择合适的基础镜像。\n所有可用镜像标签列表可在 [GitHub](https:\u002F\u002Fgithub.com\u002Fallenai\u002Ftango\u002Fpkgs\u002Fcontainer\u002Ftango\u002Fversions) 上查看。\n\n\u003C!-- end install -->\n\n## 常见问题\n\n\u003C!-- start faq -->\n\n### 为什么这个库叫 Tango？\n\n我们开发这个库的初衷是希望通过将研究分解为清晰定义的步骤，使其变得更加简单。那么，当你把许多步骤编排在一起时会发生什么呢？答案就是一场舞蹈。而由于我们的 [团队负责人](https:\u002F\u002Fnasmith.github.io\u002F) 是一支探戈乐队的成员，“AI2 Tango”便成了一个显而易见的选择！\n\n### 如何通过 Tango CLI 调试我的步骤？\n\n你可以通过 [pdb](https:\u002F\u002Fdocs.python.org\u002F3\u002Flibrary\u002Fpdb.html) 来运行 `tango` 命令进行调试。例如：\n\n```bash\npython -m pdb -m tango run config.jsonnet\n```\n\n### Tango 与 [Metaflow](https:\u002F\u002Fmetaflow.org)、[Airflow](https:\u002F\u002Fairflow.apache.org) 或 [redun](https:\u002F\u002Fgithub.com\u002Finsitro\u002Fredun) 有何不同？\n\n我们发现，像这些现有的 DAG 执行引擎非常适合生产工作流，但在处理不断变化代码的混乱协作型研究项目中并不那么适用。AI2 Tango 就是专门为这类研究项目设计的。\n\n### Tango 的缓存机制是如何工作的？\n\nAI2 Tango 会根据步骤的 `unique_id` 来缓存步骤的结果。`unique_id` 实际上是该步骤所有输入的哈希值，再加上：\n\n1. 步骤类的完全限定名；  \n2. 
步骤类的 `VERSION` 类变量（一个任意字符串）。\n\n与其他工作流引擎（如 [redun](https:\u002F\u002Fgithub.com\u002Finsitro\u002Fredun)）不同，Tango 并不会考虑类本身的源代码内容（除了其完全限定名），因为我们发现使用源代码字节的哈希值过于敏感，且对用户不够透明。  \n\n当你以有意义的方式修改了步骤的源代码时，只需手动更改 `VERSION` 类变量，即可告知 Tango 该步骤已更新。\n\n\u003C!-- end faq -->\n\n## 团队\n\n\u003C!-- start team -->\n\n**ai2-tango** 由 AllenNLP 团队开发并维护，背后支持机构为 [艾伦人工智能研究所 (AI2)](https:\u002F\u002Fallenai.org\u002F)。AI2 是一家非营利性研究机构，致力于通过具有重大影响力的 AI 研究与工程为人类社会作出贡献。若想了解具体为本代码库做出贡献的人员，请参阅我们的 [贡献者页面](https:\u002F\u002Fgithub.com\u002Fallenai\u002Ftango\u002Fgraphs\u002Fcontributors)。\n\n\u003C!-- end team -->\n\n## 许可证\n\n\u003C!-- start license -->\n\n**ai2-tango** 采用 [Apache 2.0 许可证](https:\u002F\u002Fwww.apache.org\u002Flicenses\u002FLICENSE-2.0) 许可。许可证全文可在 [GitHub](https:\u002F\u002Fgithub.com\u002Fallenai\u002Ftango\u002Fblob\u002Fmain\u002FLICENSE) 上找到。\n\n\u003C!-- end license -->","# AI2 Tango 快速上手指南\n\nAI2 Tango 是一个专为科研实验设计的工具，它将复杂的实验流程组织成可缓存、可复用的离散步骤，替代混乱的文件目录和版本表格。\n\n## 环境准备\n\n- **操作系统**：Linux, macOS, Windows (推荐 Linux\u002FmacOS)\n- **Python 版本**：3.8 或更高\n- **前置依赖**：\n  - `pip` 或 `conda` 包管理器\n  - (可选) Docker：如需使用容器化部署\n\n> **注意**：国内用户建议使用清华源或阿里源加速 Python 包下载。\n\n## 安装步骤\n\n### 方式一：使用 pip 安装（推荐）\n\n基础安装：\n```bash\npip install ai2-tango -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple\n```\n\n安装特定集成（例如 PyTorch）：\n```bash\npip install 'ai2-tango[torch]' -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple\n```\n\n安装所有集成（体积较大）：\n```bash\npip install 'ai2-tango[all]' -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple\n```\n\n### 方式二：使用 Conda 安装\n\n基础安装：\n```bash\nconda install tango -c conda-forge\n```\n\n安装特定组件：\n```bash\nconda install tango-torch -c conda-forge\n```\n\n安装全套组件：\n```bash\nconda install tango-all -c conda-forge\n```\n\n### 验证安装\n\n运行以下命令检查安装是否成功：\n```bash\ntango info\n```\n\n## 基本使用\n\n### 1. 
定义实验步骤\n\n创建一个名为 `hello.py` 的文件，定义一个简单的处理步骤：\n\n```python\n# hello.py\n\nfrom tango import step\n\n@step()\ndef hello(name: str) -> str:\n    message = f\"Hello, {name}!\"\n    print(message)\n    return message\n```\n\n### 2. 创建实验配置\n\n创建一个名为 `hello.jsonnet` 的配置文件，描述实验流程：\n\n```jsonnet\n\u002F\u002F hello.jsonnet\n\n{\n  steps: {\n    hello: {\n      type: \"hello\",\n      name: \"World\",\n    }\n  }\n}\n```\n\n### 3. 运行实验\n\n使用本地工作空间运行实验，结果将被自动缓存：\n\n```bash\ntango run hello.jsonnet -w \u002Ftmp\u002Fworkspace\n```\n\n**首次运行输出：**\n```text\nStarting new run expert-llama\n● Starting step \"hello\"...\nHello, World!\n✓ Finished step \"hello\"\n✓ Finished run expert-llama\n```\n\n### 4. 体验缓存机制\n\n再次运行相同的命令：\n\n```bash\ntango run hello.jsonnet -w \u002Ftmp\u002Fworkspace\n```\n\n**第二次运行输出：**\n```text\nStarting new run open-crab\n✓ Found output for step \"hello\" in cache...\n✓ Finished run open-crab\n```\n\n可以看到，由于步骤输入未发生变化，Tango 直接从缓存中读取了结果，跳过了实际执行过程，从而节省了大量时间。","某 AI 实验室的研究团队正在迭代训练一个大型语言模型，需要频繁调整数据预处理参数并重新运行部分训练流程。\n\n### 没有 tango 时\n- 研究人员依靠手动命名文件夹（如 `run_v2_final_really`）和 Excel 表格来记录实验版本，极易混淆且难以追溯具体配置。\n- 修改了数据清洗逻辑后，即使模型架构未变，也不得不重新运行耗时的预处理步骤，浪费大量 GPU 算力和时间。\n- 团队成员间协作困难，无法直接复用他人已完成的中间结果，导致重复劳动频发。\n- 实验流程缺乏标准化，新人上手成本高，经常因遗漏某个脚本步骤而导致实验结果不可复现。\n- 调试错误时，难以确定是哪一步骤的数据出了问题，只能在杂乱的目录中人工排查文件版本。\n\n### 使用 tango 后\n- tango 将实验拆解为独立的“步骤”并自动缓存结果，通过配置文件即可清晰管理整个流水线，彻底告别混乱的文件目录。\n- 当仅调整下游任务参数时，tango 自动识别上游数据预处理步骤未变动，直接从缓存加载结果，跳过冗余计算，显著缩短迭代周期。\n- 团队成员只需引用相同的步骤配置，即可瞬间获取同事已计算好的中间数据，实现高效的成果共享与协作。\n- 所有实验均通过声明式配置运行，确保了流程的标准化和可复现性，新成员也能快速构建可靠的实验环境。\n- 每一步都有明确的输入输出依赖关系，出错时可精准定位到具体步骤，并利用缓存机制快速重试，无需从头开始。\n\ntango 
通过将实验流程模块化与智能化缓存，让研究人员从繁琐的文件管理中解放出来，专注于核心算法创新。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fallenai_tango_88d32030.png","allenai","Ai2","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002Fallenai_65c450d5.png","",null,"ai2-info@allenai.org","http:\u002F\u002Fwww.allenai.org","https:\u002F\u002Fgithub.com\u002Fallenai",[81,85,88,92,96],{"name":82,"color":83,"percentage":84},"Python","#3572A5",97.3,{"name":86,"color":87,"percentage":32},"Jsonnet","#0064bd",{"name":89,"color":90,"percentage":91},"Shell","#89e051",0.6,{"name":93,"color":94,"percentage":95},"Makefile","#427819",0.1,{"name":97,"color":98,"percentage":99},"Dockerfile","#384d54",0,572,54,"2026-04-10T04:56:40","Apache-2.0","未说明","非必需。提供预构建的 Docker 镜像支持 CUDA 11.3 (ghcr.io\u002Fallenai\u002Ftango:cuda11.3)，适用于需要 GPU 加速的场景（如安装 torch 集成时）。",{"notes":107,"python":108,"dependencies":109},"该工具核心库较小，但若选择安装所有集成（all），会拉取大量依赖，安装时间较长。支持通过 pip 或 conda 安装，也提供基于 CUDA 11.3 的预构建 Docker 镜像。缓存机制基于步骤输入哈希和版本号，不直接监控源代码变动，修改逻辑需手动更新 VERSION 变量。","3.8+",[110,111,112],"torch (可选集成)","datasets (可选集成)","wandb (可选集成)",[35,14,13,15],[115,116,117,118,119,120],"python","python3","machine-learning","nlp","ai","pytorch","2026-03-27T02:49:30.150509","2026-04-11T21:48:19.528074",[124,129,134,138,143,148],{"id":125,"question_zh":126,"answer_zh":127,"source_url":128},29911,"如何在 Step 中使用多进程（multiprocessing）时避免 ModuleNotFoundError？","在 Tango 的 Step 中使用 joblib 或 multiprocessing 时，如果子进程无法找到当前模块，通常会报错。该问题已在 PR #406 中修复。请确保升级到包含该修复的版本。如果问题仍然存在，请检查项目结构是否包含 __init__.py 文件，并确保使用 `-i` 参数正确导入模块路径，例如：`tango run config.jsonnet -i your_package -w workspace`。","https:\u002F\u002Fgithub.com\u002Fallenai\u002Ftango\u002Fissues\u002F404",{"id":130,"question_zh":131,"answer_zh":132,"source_url":133},29912,"运行 tango info 时出现 OpenMP 初始化错误（OMP: Error #15）怎么办？","该错误是由于多个 OpenMP 运行时副本被链接到程序中导致的。虽然官方建议避免静态链接，但作为一个临时的、未文档化的变通方法，你可以设置环境变量 `KMP_DUPLICATE_LIB_OK=TRUE` 来允许程序继续执行。请注意，这可能会导致崩溃或产生不正确的结果。该问题已在版本更新（参考 PR #447）中得到修复，建议升级 Tango 
到最新版本。","https:\u002F\u002Fgithub.com\u002Fallenai\u002Ftango\u002Fissues\u002F338",{"id":135,"question_zh":136,"answer_zh":137,"source_url":133},29913,"安装 ai2-tango[transformers] 后运行 tango 提示找不到 wandb 模块如何解决？","即使安装了 `ai2-tango[transformers]`，Tango 在初始化时仍可能尝试加载所有集成（包括 wandb），从而导致 `ModuleNotFoundError: No module named 'wandb'`。解决方法是显式安装缺失的依赖包：`pip install wandb`。或者，如果你不需要 wandb 功能，可以关注后续版本是否支持按需加载集成以避免此类错误。",{"id":139,"question_zh":140,"answer_zh":141,"source_url":142},29914,"Wandb 回调在恢复训练（非从头开始）时打印步数警告怎么办？","当训练任务恢复运行时，Wandb 回调可能会打印警告：`WARNING Step must only increase in log calls. Step X \u003C Y; dropping`。这是因为 Wandb 要求步数必须单调递增，而恢复运行时的起始步数可能小于已记录的步数。目前无法直接覆盖旧值，但维护者表示可以考虑抑制这些警告。暂时可以通过忽略该警告或手动调整日志步数来应对。","https:\u002F\u002Fgithub.com\u002Fallenai\u002Ftango\u002Fissues\u002F152",{"id":144,"question_zh":145,"answer_zh":146,"source_url":147},29915,"Tango 项目的开发状态如何？还会继续维护吗？","Tango 核心开发团队表示，库已达到稳定的状态，因此最近的提交频率有所降低，团队重心转向了使用该库的研究项目。但这并不意味着停止开发新功能。团队承诺在可预见的未来将继续积极维护 Tango，特别是因为 AI2 内部有许多用户依赖它。新用户可以放心将其用于 NLP 研究工作。","https:\u002F\u002Fgithub.com\u002Fallenai\u002Ftango\u002Fissues\u002F567",{"id":149,"question_zh":150,"answer_zh":151,"source_url":152},29916,"Beaker Executor 在 pip 暂时失去连接时会失败吗？如何应对？","是的，在使用 Beaker Executor 时，如果 pip 在安装依赖过程中暂时失去网络连接，任务可能会失败并抛出 urllib3 相关错误。建议在 Beaker 配置中增加网络重试机制，或预先构建包含所需依赖的自定义 Docker 镜像，以减少运行时对网络的依赖。此外，确保 Beaker 任务所在的集群网络稳定性也是关键。","https:\u002F\u002Fgithub.com\u002Fallenai\u002Ftango\u002Fissues\u002F395",[154,159,164,169,174,179,184,189,194,199,204,209,214,219,224,229,234,239,244,249],{"id":155,"version":156,"summary_zh":157,"released_at":158},206498,"v1.3.2","## 新增内容\n\n### 修复 ✅\n\n- 修复了在 Beaker 执行器中使用 gcloud auth 时出现的问题。\n\n## 提交记录\n\n2ad79f8 修复 gcloud auth 相关问题 (#609)\n4a1ebea 修复 Read the Docs 配置 (#608)\n\n","2023-10-27T17:47:24",{"id":160,"version":161,"summary_zh":162,"released_at":163},206499,"v1.3.1","## 新增内容\n\n### 修复 ✅\n\n- 修复了 `GSWorkspace()` 中的 minor bugs。\n\n### 变更 ⚠️\n\n- 为 Python 定义的实验添加了 CLI 风格的执行函数。\n- 向 `ExecutorOutput` 
中添加了 `display()` 方法，用于生成总结运行情况的表格。\n\n## 提交记录\n\n4c8ae5a 为 Python 定义的实验添加 CLI 风格的执行 (#600)\n8bb3472 修复 GS 工作区的 bug (#607)\n\n","2023-10-25T23:25:06",{"id":165,"version":166,"summary_zh":167,"released_at":168},206500,"v1.3.0","## 新增内容\n\n### 新增 🎉\n- 添加了 `Workspace.remove_step()` 方法，用于安全地移除步骤。\n- 现在可以使用 Google Cloud 存储桶的子文件夹来初始化 `GSWorkspace()`。\n\n### 变更 ⚠️\n\n- `BeakerExecutor` 现在会使用执行器实例化时的 HEAD 提交来执行步骤，而不是步骤运行时的 HEAD 提交。\n\n### 修复 ✅\n\n- 移除了不必要的代码覆盖率开发依赖。\n- 修复了新版本 PyTorch 导致所有学习率调度器未注册的问题。\n- 更新了 jax、jaxlib 和 flax 的固定版本。\n\n## 提交记录\n\ned72140 准备发布 v1.3.0\n5d776d5 撤销 “准备发布 v1.3.0”\n8eec3df 撤销 “移除文件夹”\n671525f 移除文件夹\n1a384f7 准备发布 v1.3.0\n56c1476 使用执行器实例化时的提交 (#605)\n11b5229 为本地工作空间添加“移除步骤”功能 (#588)\n3857415 GS 工作空间可以是存储桶的子文件夹 (#604)\nb955ef7 CI 错误消失 (#601)\n01077eb 学习率调度器 (#573)\n3a19688 修复从 GS 工作空间获取结果的 bug (#582)\n4c3edce 样式：迁移到 ruff (#562)\n416ffa6 添加 CITATION.cff 文件 (#572)\n70f2681 将 PyTorch 要求从 \u003C1.14,>=1.9 更新为 >=1.9,\u003C2.1 (#539)\n1ee2c56 将 Weights & Biases 要求从 \u003C0.13.11,>=0.12 更新为 >=0.12,\u003C0.14.3 (#547)\nfcf1010 将 sentencepiece 从 0.1.97 升级到 0.1.98 (#548)\n7d04cde 将 mypy 从 1.0.1 升级到 1.2.0 (#555)\nfbb068b 将 allenai\u002Fbeaker-run-action 从 1.1 升级到 1.2 (#534)\na717d70 将 black 从 23.1.0 升级到 23.3.0 (#554)\n42322bb 修复 readthedocs\n825ec19 删除多余的 pkl 文件\n3638cdc 重构：将打包信息移至 `pyproject.toml` 文件中 (#549)\ne86ad65 将 black 从 23.1.0 升级到 23.3.0 (#543)\n69e7574 移除不必要的代码覆盖率依赖 (#550)\n\n","2023-10-13T21:20:36",{"id":170,"version":171,"summary_zh":172,"released_at":173},206501,"v1.2.1","## 新增内容\n\n### 新增 🎉\n\n- 添加了以下工作空间方法，以支持 Tango 可视化界面：`Workspace.search_registered_runs()`、`Workspace.search_step_info()`、`Workspace.num_registered_runs()` 和 `Workspace.num_steps()`。\n\n### 修复 ✅\n\n- 修复了一个 bug：当对象直接接受 `Step` 参数时，`FromParams` 无法正确解析。\n- 更改了一个名称，以避免覆盖内置名称 `set`。\n- 修复了一个在密集步骤图中会导致 O(n^2) 内存消耗的 bug。\n\n## 提交记录\n\n258f440 只有一个步骤对象 (#545)\n07abee5 不要覆盖名称 `set` (#544)\n4784bab 从 `Workspace.search_registered_runs()` 返回更多信息 (#536)\n3d2d890 修复 `FromParams` 对象直接接收 `Step` 
参数时的 bug (#535)\n38561d0 新的分页式工作空间搜索方法 (#489)\n7a25e3e 修复数据集类型问题 (#531)\n810b742 对 CI 的小幅更新 (#529)\n\n","2023-04-07T00:05:24",{"id":175,"version":176,"summary_zh":177,"released_at":178},206502,"v1.2.0","## 新增内容\n\n### 新增 🎉\n\n- 现在可以在不使缓存失效的情况下为步骤添加参数。请参阅 `Step.SKIP_DEFAULT_ARGUMENTS`。\n- 修复了 `tango info` 命令中的集成状态消息。\n- 添加了 `RemoteClient`、`RemoteStepCache` 和 `RemoteWorkspace` 的抽象层。\n- 添加了一个 GS 集成，附带 `GSWorkspace`——一个使用 Google Cloud Storage 的远程 `Workspace` 实现。\n- 现在可以使用 `@step(bind=True)` 将函数型步骤绑定到底层的 `Step` 实例上，这意味着该函数的第一个参数将是一个 `Step` 对象。\n- 添加了 `ShellStep`，用于运行任意 Shell 命令。\n- 添加了 `@make_registrable` 装饰器，使任意函数可注册，从而更方便在 Tango 配置中引用它们。\n\n### 修复 ✅\n\n- Jsonnet 解析现在速度大幅提升，并且在 Windows 上也能正常工作。\n- 锁相关的警告现在每 30 秒会可靠地打印一次。\n- 我们现在确保 Beaker 作业使用最新版本的 beaker-py，以兼容最新的 API 变更。\n- 当指标完全不变时，提前停止功能现已正常工作。\n- 修复了 `FromParams` 中对可变长度元组处理不当的 bug。\n\n### 变更 ⚠️\n\n- Tango 的默认日志级别现已改为 `warning`。\n- 在 `tango run` 命令中，可以使用 `-s` 指定多个步骤。\n\n## 提交记录\n\n985f6fa 修复 lint\nf77c0e0 修复 release_notes 脚本\n2c9456d 准备发布 v1.2.0\nf1dc63d 更新 wandb 依赖范围，由 \u003C=0.13.5,>=0.12 改为 >=0.12,\u003C0.13.11 (#523)\n32598c4 更新 rich 依赖范围，由 \u003C13.0,>=12.3 改为 >=12.3,\u003C14.0 (#498)\n35ca0f0 将 actions\u002Fcheckout 从 1 升级到 3 (#524)\n739e40c 各种依赖项更新 (#525)\n49f3afc 修复可变长度元组相关 bug (#527)\n28ea796 工作空间的小幅修复 (#526)\n379095d GCSWorkspace (#417)\nc949416 Shell 步骤 + 可注册函数 (#521)\n308689b 允许在 `tango run` 中使用 `-s` 指定多个步骤 (#516)\n9002169 当指标相同时表示提前停止 (#515)\n4a21132 为 `@step` 装饰器添加 `bind` 选项 (#512)\n8a6775e 在 Beaker 作业中升级 beaker-py (#509)\n679700b Rjsonnet (#505)\n3760419 默认日志级别设置为 warning (#508)\n8d29321 锁相关警告改进 (#506)\n521de99 在问题修复之前固定 wandb 依赖版本 (#507)\n6618a04 为 beaker-py 升级提供支持 (#504)\n34dbec4 修复 `tango info` 命令中的集成状态消息 (#502)\nfbb7581 快速修复 beaker-py 升级问题\na10fb3b 克隆到 src 目录 (#495)\n65f699d 修复 #483 问题 (#484)\n4a183de 修复额外不可缓存依赖导致的 bug (#480)\nac0a193 跳过默认参数 (#481)\n\n","2023-02-10T20:02:05",{"id":180,"version":181,"summary_zh":182,"released_at":183},206503,"v1.1.0","## 新增内容\n\n### 新增 🎉\n\n- 在 `StepResources` 中新增了 
`gpu_type` 字段。`BeakerExecutor` 可以使用该字段来决定将步骤提交到哪些集群。\n- 在 `StepResources` 中新增了 `machine` 字段。在使用 `BeakerExecutor` 时，可以将其设置为 `\"local\"`，以强制在本地运行该步骤。\n- 为 `tango run` 命令新增了 `--ext-var` 参数，用于在加载实验配置时设置 JSONNET 外部变量。\n- 新增了 `@step()` 装饰器，用于从函数创建 `Step` 类。\n- 添加了 `transformers::with_soft_prompt` 集成，使带软提示前缀的 transformer 模型更加易于使用。\n\n### 移除 👋\n\n- 移除了 PyTorch Lightning 集成。\n- 移除了 `tango server` 命令以及 `tango run` 的 `--serve\u002F--no-serve` 选项。\n- 移除了意外提交的 `source_release.py` 文件。\n\n### 修复 ✅\n\n- 修复了 Tango 设置文件中 Executor 的 `parallelism` 选项会被忽略的问题。\n- 修复了一个 bug：如果依赖于另一个步骤结果键值的步骤名称发生变化，该步骤的唯一 ID 也会随之改变。\n- 修复了一个 bug：导入某些库（如 `torchmetrics`）会导致我们的异常处理机制出现问题，因为这些库出于某种原因设置了 `sys.excepthook`。现在我们在导入后始终会重置 `sys.excepthook`。\n- Flax 训练器的类型提示曾暗示训练集是可选的，但实际上它是必需的。\n- 提升了 `BeakerWorkspace` 和 `BeakerStepLock` 在作业被抢占时的健壮性。\n- 对 Beaker 执行器和工作空间进行了小幅性能优化。\n\n## 提交记录\n\n73bfa86 软提示 (#231)  \n79b7d01 Beaker 集成性能改进  \nd455541 训练并非可选 (#474)  \n3eab580 Beaker 集成性能改进 (#475)  \n241b4eb 移除本不该存在的步骤 (#478)  \n5f5ba41 添加 `@step()` 装饰器 (#476)  \nc0c4ae0 使 `BeakerStepLock` 对被抢占的作业更具鲁棒性  \n39d3d66 移除 tango 服务器 (#470)  \nfef9bba 导入其他模块后重置 sys.excepthook，并添加 `--ext-var` CLI 选项 (#471)  \nccebb4c 自动移除临时 Beaker 数据集  \n81d773e `BeakerExecutor` 改进，修复 `StepIndexer` 的 bug (#469)  \nc05a80a 将 sphinx-copybutton 从 0.5.0 升级到 0.5.1 (#467)  \n147e408 移除 PyTorch Lightning 集成 (#468)  \ncd4f626 更新 more-itertools 的依赖范围，从 `\u003C9.0,>=8.0` 改为 `>=8.0,\u003C10.0` (#456)  \nd219dff 修复文档构建失败问题  \nee71c09 修复设置文件中并行度设置的 bug (#466)  \n1609eb4 将 mypy 从 0.982 升级到 0.991 (#465)","2022-12-01T19:01:11",{"id":185,"version":186,"summary_zh":187,"released_at":188},206504,"v1.0.2","## 新增内容\n\n### 变更 ⚠️\n\n- `BeakerScheduler` 现在可以返回集群列表。\n\n## 提交记录\n\n64215d8 将 mypy 从 0.982 升级到 0.990 (#463)\nacc1082 更新 torch 依赖版本，由 \u003C1.13,>=1.9 改为 >=1.9,\u003C1.14 (#460)\n570d24e 使用新的 `constraints` 字段进行集群分配 (#462)\n5afd89f 将 Beaker 客户端的 User-Agent 头设置为 Tango v*.*.* (#459)\n9d9628f 修复 BeakerExecutor 中的进度日志语句 
(#458)\n\n","2022-11-15T18:09:52",{"id":190,"version":191,"summary_zh":192,"released_at":193},206505,"v1.0.1","## 新增内容\n\n### 修复 ✅\n\n- `LightningTrainStep` 现在可以接受 `Lazy` 模型对象，从而确保哈希值的确定性。\n- 修复了远程 `Workspace` 实现（如 `WandbWorkspace` 和 `BeakerWorkspace`）在使用不同 W&B 或 Beaker 工作空间时仍会共享同一本地缓存的问题。\n- 修复了在构造回调时 `TorchEvalStep` 的 bug。\n- 修复了因未安装相关集成而导致的一些导入错误问题。\n- 修正了 `MulticoreExecutor` 中最终结果报告不正确的问题。\n\n### 变更 ⚠️\n\n- Wandb 步骤缓存会在超时时重试 API 调用。\n- 需要 `beaker-py >= 1.11`。\n\n## 提交记录\n\n26e6416 将 sphinx 版本从 5.2.3 升级到 5.3.0 (#455)\nc10cec5 重试 wandb 调用 (#450)\nf950b12 为每个工作空间使用独立的本地缓存目录 (#451)\n4c70161 应报告不可缓存步骤的失败 (#453)\n9476942 将 black 版本从 22.8.0 升级到 22.10.0 (#445)\n821c0fc 将 furo 版本从 2022.9.15 升级到 2022.9.29 (#435)\n8d0b146 允许 `LightningTrainStep` 接受 `Lazy` 模型输入 (#448)\n12eec56 修复缺失集成时的导入问题 (#447)\nd16997c 修复 `TorchEvalStep` 的 bug (#442)\n9a5f792 将 beaker-py 升级到 ≥1.11 版本 (#443)\nf5c2e52 添加到常见问题解答 (#440)\n\n","2022-10-20T17:02:01",{"id":195,"version":196,"summary_zh":197,"released_at":198},206506,"v1.0.0","这是 AI2 Tango 的首个稳定版本，凝聚了一年多的努力和近 1000 次提交！\n\n我们 AllenNLP 团队一直在低调地推进这个项目，它目前已被这里的研究人员日常使用。因此，当我们对 API 感到满意时，非常高兴能够正式发布它 🎉\n\n## 自上一版以来的新变化\n\n### 新增 🎉\n\n- 向 `Step` 类添加了 `step_extra_dependencies` 输入字段，可用于强制依赖另一个步骤，即使当前步骤并不直接依赖该步骤的输出。更多背景信息请参阅 [#418](https:\u002F\u002Fgithub.com\u002Fallenai\u002Ftango\u002Fissues\u002F418)。\n\n### 变更 ⚠️\n\n- 现需 `beaker-py >= 1.10`。\n\n### 修复 ✅\n\n- 长日志行将被软换行，以确保链接可点击。\n- 修复了一个 bug：如果某个步骤的 `Format` 在 `Workspace.step_finished()` 中未能序列化步骤结果，某些工作空间可能会处于错误状态。\n- 有时函数和方法会作为步骤的参数传入，这就需要对其进行哈希处理。现在不再对函数本身进行哈希，而是对其所属模块和名称进行哈希。\n- 修复了 Beaker 执行器的一个问题：当某个依赖于其他步骤的步骤失败时，执行器会在运行结束时挂起。\n- 更新测试以兼容新版本的 transformers。\n- 修复了 `Executor.execute_sub_graph_for_step()`，使其能够并行运行步骤的依赖项。\n\n## 提交记录\n\na48a825 仅在需要时才在入口点安装 `gh` (#439)\n69b948d 改进步骤状态边界情况的错误处理 (#429)\n39b3132 修复 `Executor.execute_sub_graph_for_step()` (#438)\n408944e 放宽部分依赖的版本范围 (#437)\n76cd46d 对 #401 的后续修复 (#434)\n04e7963 修复 Beaker 执行器的 bug (#430)\nfb077e6 将 fairscale 从 0.4.9 升级至 0.4.11 
(#432)\n25f7373 将 furo 从 2022.6.21 升级至 2022.9.15 (#410)\nf11925c 将 mypy 从 0.971 升级至 0.982 (#433)\n4845317 将 sphinx 从 5.1.1 升级至 5.2.3 (#431)\n572f83b 将 myst-parser 从 0.18.0 升级至 0.18.1 (#426)\ne1fc2d3 向 `Step` 类添加 `step_extra_dependencies` 选项 (#419)\n19927d3 不再固定 protobuf 版本 (#428)\n3dbe8c7 我们需要 click 8 才能正常工作。(#427)\ncbdbe68 对函数进行哈希处理 (#424)\n747469f 将 `Format` 序列化失败标记为步骤失败 (#421)\nb1d6431 确保包含的模块的基础路径保留在 `sys.path` 中 (#406)\ne1a1cd1 对 `BeakerExecutor` 内部实现进行小幅改进 (#416)\nd3e1891 确保日志行被软换行，以便链接始终可点击 (#415)\n\n","2022-10-05T16:57:21",{"id":200,"version":201,"summary_zh":202,"released_at":203},206507,"v0.14.0","## 新增功能\n\n### 新增 🎉\n\n- 添加了使用 IA3 适配器修改 Hugging Face transformer 模型的功能。\n- 添加了一个可注册的 `BeakerScheduler` 类，作为 `BeakerExecutor` 的 `scheduler` 参数传入，用于控制分配给在 Beaker 上运行的步骤的资源。用户可以实现自己的 `BeakerScheduler` 子类来自定义资源分配行为。\n\n### 变更 ⚠️\n\n- 在 `tango run` 命令中，`--no-server` 现已成为默认选项。如需启动服务器，请使用 `--server`。\n\n### 修复 ✅\n\n- 提升了 `BeakerExecutor` 对连接、超时、SSL 以及其他可恢复 HTTP 错误的健壮性。\n- 改进了 `BeakerStepLock` 的健壮性，从而使得 `BeakerWorkspace` 更加稳定，减少了因锁处于不良状态而需要手动干预的情况。\n- 修复了 `BeakerExecutor` 内部调度逻辑中的一个 bug，该 bug 可能会导致部分并行步骤提交延迟。\n- 修复了一个从参数创建 `StepInfo` 对象时可能导致不必要的导入的 bug。\n- 修复了取消 `BeakerExecutor` 可能无法正常工作的问题。\n- 修复了当设置了 `train_epochs` 并且使用梯度累积时，训练器会过度训练的 bug。\n- 修正了 `tango run` 显示不可缓存步骤结果的方式。\n- `BeakerExecutor` 不会同时运行重复的可缓存步骤。\n\n## 提交记录\n\n0828adc `BeakerExecutor` 不会运行重复的可缓存步骤 (#414)\n7382019 IA3 适配器 (#403)\nd498cf7 修复最终输出的紧急补丁\nbff9ebf 当步骤还无法运行时添加警告，并进行错误修复 (#408)\nc72552e 默认不启动服务器 (#409)\nd34fe09 添加用于处理资源分配的 `BeakerScheduler` 类 (#407)\n5dcbb56 梯度累积与 `train_epochs` (#402)\nd27bbef 提升 `BeakerStepLock` 的健壮性 (#401)\ncd9b5fd 修复 `StepInfo.from_params()` 中的 bug、`BeakerExecutor` 取消操作问题，并保留 `ref` 名称 (#400)\n6ff6b9e 修复调度逻辑中的 bug (#399)\n15196f2 为张量实现确定性哈希 (#398)\n230d78e 提升 `BeakerExecutor` 对所有可恢复错误类型（连接、HTTP、SSL、超时等）的健壮性 (#397)\n5f63a27 将 fairscale 从 0.4.8 升级到 0.4.9 (#391)\n\n","2022-09-20T18:49:01",{"id":205,"version":206,"summary_zh":207,"released_at":208},206508,"v0.13.0","## What's new\n\n### Added 
🎉\n\n- You can now reference into a particular index of the result of another step in a config. For example: `{type: \"ref\", ref: \"some_previous_step\", key: 0}`.\n  The `key` field can be an integer if the result of the referenced step is a list or tuple, or a string if the result of the referenced step is a dictionary.\n- Added `priority` parameter to Beaker executor for setting the default task priority for Beaker jobs.\n- Added `Workspace.step_result()` method for getting a step's result from the latest\n  run.\n- `tango run` will now display a URL to the logs for failed steps when you use the `BeakerExecutor`.\n\n### Changed ⚠️\n\n- The `TorchTrainStep` now enables monitoring arbitrary model outputs during training. `TorchTrainEngine.forward_train` now returns a tuple `loss, model_outputs` for each micro batch, and the list of model outputs for all micro batches in a batch is passed to `TrainCallback.log_batch` and `TrainCallback.post_batch`.\n- Tango will now automatically search Python modules in the current working directory\n  for registered classes so that you don't always need to use the `--include-package` setting.\n- The minimum supported Python version is now 3.8.\n- Added support for PyTorch Lightning 1.7.x.\n- The Beaker Executor will no longer live-stream logs from Beaker jobs, but logs will be viewable on Beaker and more readable.\n- Only the Beaker executor requires a clean working directory.\n\n### Fixed ✅\n\n- Fixed a bug that did not allow a wandb artifact's type to be set from a step's metadata dictionary. 
\n- Fixed a bug with how the Beaker executor streams log lines from Beaker which sometimes resulted in messages missing some starting characters, and tqdm lines being duplicated.\n- Fixed a bug in the Beaker workspace where the lock dataset wouldn't be removed if the step\n  was found to be in an invalid state.\n- Improved cluster choice logic in `BeakerExecutor` to ensure greater diversity of clusters when submitting many steps at once.\n- Fixed bug where sub-processes of the multicore executor would use the wrong executor if `executor` was defined in a `tango.yml` file.\n\n## Commits\n\n4f89d55 Improve Beaker cluster choice logic (#392)\ne1ceae2 Display URL to logs for failed steps  (#390)\n3dc9591 Bump black from 22.6.0 to 22.8.0 (#380)\nc9ce257 Catch when Beaker experiments are stopped (#389)\n0fe12e9 Fix issues with WandbWorkspace causing CI crash (#388)\n342eb26 Keep parameters in `Params` objects to make error messages more readable (#375)\n92f0354 Simplified beaker logging (#383)\nfd9d3cc Only the Beaker executor needs clean working directories (#373)\n06f26ae Update wandb artifact type (#378)\nf6a6b70 Update base images, get us out of the latest infinite loop of pip madness (#382)\n306986b Catch all errors when attempting log record decode (#379)\n628caff Allowing indexing into step results in config (#371)\n858cef8 Minor improvement to Beaker logging (#377)\n7a5619e Add `Workspace.step_result()` method (#374)\n0750d76 Fix bugs with how `BeakerExecutor` streams logs (#372)\n6e8b107 Detailed train outputs (#369)\nbcd50d8 Update pytorch-lightning requirement from \u003C1.7,>=1.6 to >=1.6,\u003C1.8 (#349)\n8ed0c86 Bump fairscale from 0.4.6 to 0.4.8 (#347)\n62f2746 Python minimum version is 3.8 (#368)\n45e02fe Auto import local Python modules when searching for registered classes (#367)\n\n","2022-09-08T00:39:04",{"id":210,"version":211,"summary_zh":212,"released_at":213},206509,"v0.12.0","## What's new\n\n### Added 🎉\n\n- **Step resources:**\n  - Added a 
`step_resources` parameter to the `Step` class which should be used to describe the computational resources required to run a step.\n    `Executor` implementations can use this information. For example, if your step needs 2 GPUs, you should set\n    `step_resources=StepResources(gpu_count=2)` (`\"step_resources\": {\"gpu_count\": 2}` in the configuration language).\n  - Added a `Step.resources()` property method. By default this returns the value specified by the `step_resources` parameter.\n    If your step implementation always requires the same resources, you can just override this method so you don't have to provide\n    the `step_resources` parameter.\n- **Step execution:**\n  - Added an `executor` field to the `tango.yml` settings. You can use this to define the executor you want to use by default.\n  - Added a Beaker `Executor` to the Beaker integration, registered as an `Executor` with the name \"beaker\".\n    To use this executor, add these lines to your `tango.yml` file:\n    ```yaml\n    executor:\n      type: beaker\n      beaker_workspace: ai2\u002Fmy-workspace\n      clusters:\n        - ai2\u002Fgeneral-cirrascale\n    ```\n    See the docs for the `BeakerExecutor` for more information on the input parameters.\n- **Step class:**\n  - Added a metadata field to the step class API. 
This can be set through the class\n    variable `METADATA` or through the constructor argument `step_metadata`.\n- **Weights & Biases integration:**\n  - You can now change the artifact kind for step result artifacts by adding a field\n    called \"artifact_kind\" to a step's metadata.\n    For models, setting \"artifact_kind\" to \"model\" will add the corresponding artifact to W&B's new model zoo.\n\n### Changed ⚠️\n\n- **CLI:**\n  - The `tango run` command will throw an error if you have uncommitted changes in your repository, unless\n    you use the `--allow-dirty` flag.\n  - The `tango run` command will use the lightweight base executor (single process) by default.\n    To use the multi-process executor, set `-j\u002F--parallelism` to 1 or higher or -1 to use all available CPU cores.\n\n### Fixed ✅\n\n- Fixed bug where `StepInfo` environment and platform metadata could be out-of-date if a step is run again due to failure.\n- Fixed a bug where an unfortunate combination of early stopping and decreasing model performance could result in a crash in the torch trainer.\n\n## Commits\n\nbefb00a Add `workspace_metadata` arg to `Step` class, allow changing artifact kind in W&B workspace (#363)\n5ab1c2a Fix undefined behavior with `TorchTrainStep` (#366)\nbf3c1a0 Update filelock requirement from \u003C3.8,>=3.4 to >=3.4,\u003C3.9 (#354)\nb4e48a7 Update jsonpickle requirement from \u003C2.2.0,>=2.1.0 to >=2.1.0,\u003C2.3.0 (#351)\n1c491f0 Update wandb requirement from \u003C0.13,>=0.12 to >=0.12,\u003C0.14 (#350)\n93d5eb4 Bump allenai\u002Fsetup-beaker from 1 to 2 (#359)\ndc0f89a Fix #355 - ensure git metadata is up-to-date (#361)\n258e880 Raise better error msg from `step_result_for_run()` (#360)\n43916d1 Print debugging information about the repo used. 
(#353)\n928aa7a Add `BeakerExecutor` (#340)\n\n","2022-08-24T00:06:13",{"id":215,"version":216,"summary_zh":217,"released_at":218},206510,"v0.11.0","## What's new\n\n### Added 🎉\n- Added a [Flax](https:\u002F\u002Fflax.readthedocs.io\u002Fen\u002Flatest\u002F) integration along with an example config.\n\n## Commits\n\nb4cd2b3 Flax Integration (#313)\nb9a7422 Bump sphinx from 5.0.2 to 5.1.1 (#346)\nd7952ef Bump mypy from 0.961 to 0.971 (#339)\n6a58bfd Put PIP install instructions first (#348)\n\n","2022-08-04T22:32:34",{"id":220,"version":221,"summary_zh":222,"released_at":223},206511,"v0.10.1","## What's new\n\n### Fixed ✅\n\n- Fixed issue where the StepInfo config argument could be parsed into a Step. \n- Restored capability to run tests out-of-tree.\n\n## Commits\n\n2498318 Fix issue where StepInfo config could be parsed into a Step (#344)\n57096b2 Make tests runnable out-of-tree for help with conda-packaging (#307)\n\n","2022-07-26T20:57:30",{"id":225,"version":226,"summary_zh":227,"released_at":228},206512,"v0.10.0","## What's new\n\n### Changed ⚠️\n\n- Renamed `workspace` parameter of `BeakerWorkspace` class to `beaker_workspace`.\n- `Executor` class is now a `Registrable` base class. `MulticoreExecutor` is registered as \"multicore\".\n\n### Removed 👋\n\n- Removed `StepExecutionMetadata`. 
Its fields have been absorbed into `StepInfo`.\n\n### Fixed ✅\n\n- Improved `Step.ensure_result()` such that the step's result doesn't have to be read from the cache.\n- Fixed an issue with the output from `MulticoreExecutor` such that it's now consistent with the default `Executor` for steps that were found in the cache.\n- One of our error messages referred to a configuration file that no longer exists.\n- Improved performance of `BeakerWorkspace`.\n\n### Added 🎉\n\n- Added the ability to train straight `Model` instead of just `Lazy[Model]`\n\n## Commits\n\n4e809f5 Eager models (#319)\n361777b Metadata changes, make executor registrable (#331)\na6b0be9 Beaker workspace performance (#328)\nf43e5ea Update torch requirement from \u003C1.12,>=1.9 to >=1.9,\u003C1.13 (#330)\n8495c64 update dev dependencies (#333)\n712d862 Make multicore executor output consistent with default (#325)\n903569c Refer to the right config file (#324)\nbd9e4be Modernize our issue templates (#323)\n\n","2022-07-07T21:04:18",{"id":230,"version":231,"summary_zh":232,"released_at":233},206513,"v0.9.1","## What's new\n\n### Fixed ✅\n\n- Fixed non-deterministic behavior in `TorchTrainStep`.\n- Fixed bug in `BeakerWorkspace` where `.step_info(step)` would raise a `KeyError` if the step hasn't been registered as part of a run yet.\n- Fixed a bug in `BeakerWorkspace` where it would send too many requests to the beaker service.\n- Fixed a bug where `WandbWorkspace.step_finished()` or `.step_failed()` would crash if called\n  from a different process than `.step_starting()`.\n- Fixed a bug in `WandbWorkspace.step_finished()` which led to a `RuntimeError` sometimes while\n  caching the result of a step.\n\n## Commits\n\nc6fc5be Fix bugs with `Workspace` and `WandbWorkspace`, specifically (#321)\n80c90ca Beaker DOS fix (#315)\n8b75591 Log from `BeakerStepLock` at WARNING level (#316)\n4d46d67 fix non-deterministic behavior in TorchTrainStep (#314)\nc59b6b3 Bump actions\u002Fsetup-python from 3 to 4 
(#311)\nb02cf40 Bump sphinx from 4.5.0 to 5.0.1 (#305)\n4501815 Bump furo from 2022.6.4 to 2022.6.4.1 (#309)\nda9c29c Fix bug in Beaker workspace (#312)\ne8422cb Bump mypy from 0.960 to 0.961 (#308)\n8256a74 Bump myst-parser from 0.17.2 to 0.18.0 (#310)\n44ae92e Bump furo from 2022.4.7 to 2022.6.4 (#306)\n39923ae Update protobuf requirement from \u003C=3.20.0 to \u003C4.22.0 (#301)\ne7ef1f5 Registerables first steps eg (#304)\n\n","2022-06-24T16:05:38",{"id":235,"version":236,"summary_zh":237,"released_at":238},206514,"v0.9.0","## What's new\n\n### Added 🎉\n\n- Added a [Beaker](https:\u002F\u002Fbeaker.org) integration that comes with `BeakerWorkspace`, a remote `Workspace` implementation that uses Beaker Datasets under the hood.\n- Added a `datasets::dataset_remix` step that provides the split remixing functionality of `tango.steps.dataset_remix.DatasetRemixStep` now for Huggingface `DatasetDict`.\n\n### Changed ⚠️\n\n- If you try to import something from a tango integration that is not fully installed due to missing dependencies, an `IntegrationMissingError` will be raised\ninstead of `ModuleNotFoundError`.\n- You can now set `-j 0` in `tango run` to disable multicore execution altogether.\n\n### Fixed ✅\n\n- Improved how steps and workspaces handle race conditions when different processes are competing to execute the same step. 
This would result in a `RuntimeError` before with most workspaces, but now it's handled gracefully.\n- Fixed bug which caused GradScaler state to not be saved and loaded with checkpoints.\n\n## Commits\n\n0ddd2ac Add Beaker integration (#296)\n6bdd1dd Updates the Euler example (#297)\nbc89470 GradScaler state saving and loading (#293)\nb8562db fix old filename in CONTRIBUTING.md (#300)\n4aff1bb Dataset remix (#298)\neb1fcd8 Bump mypy from 0.950 to 0.960 (#295)\n903741e Update filelock requirement from \u003C3.7,>=3.4 to >=3.4,\u003C3.8 (#284)\nb58b823 Handle missing integrations (#292)\n\n","2022-06-01T21:34:20",{"id":240,"version":241,"summary_zh":242,"released_at":243},206515,"v0.8.0","## What's new\n\n### Added 🎉\n\n- Added a Weights & Biases remote `Workspace` implementation: `WandbWorkspace`, registered as \"wandb\".\n  This can be instantiated from a workspace URL in the form \"wandb:\u002F\u002Fentity\u002Fproject\".\n- Added a method `Workspace.step_result_for_run` which gives the result of a step given the run name and step name within that run.\n- Added property `Workspace.url`, which returns a URL for the workspace that can be used to instantiate the exact same workspace using `Workspace.from_url()`. 
Subclasses must implement this.\n\n### Changed ⚠️\n\n- `StepInfo` start and end times will always be in UTC now.\n- `WandbTrainCallback` now logs system metrics from each worker process in distributed training.\n- `StepCache.__contains__()` and `StepCache.__getitem__()` now accept either a `Step` or `StepInfo` as an argument (`Union[Step, StepInfo]`).\n- Refactored `tango.step_graph.StepGraph` to allow initialization from a `Dict[str, Step]`.\n- `Executor.execute_step_graph()` now attempts to execute all steps and summarizes success\u002Ffailures.\n\n### Fixed ✅\n\n- Fixed bug with `LocalWorkspace.from_parsed_url()` ([#278](https:\u002F\u002Fgithub.com\u002Fallenai\u002Ftango\u002Fissues\u002F278)).\n- Deprecation warnings will now be logged from `tango` CLI.\n- Fixed the text format in the case of serializing an iterator of strings.\n- Added missing default value of `None` to `TangoGlobalSettings.find_or_default()`.\n- Mypy has become incompatible with transformers and datasets, so we have to disable the checks in some places.\n- The `VERSION` member of step arguments that were wrapped in `Lazy` was not respected. Now it is.\n\n## Commits\n\n3069226 Makes sure the `VERSION` parameter of classes is respected even when we construct them inside of a `Lazy` object. (#289)\ndd71446 Add Weights & Baises remote workspace (#232)\ne3f2bd2 Adds a dependency that's missing from transformers (#285)\n25919e1 Fixes the text format (#283)\n381de74 Add missing default to `TangoGlobalSettings.find_or_default()` (#282)\n9ac708a Update click requirement from \u003C8.1.3,>=7.0 to >=7.0,\u003C8.1.4 (#277)\n749357e Bump mypy from 0.942 to 0.950 (#276)\n2c59c96 Bump allenai\u002Fbeaker-run-action from 1.0 to 1.1 (#274)\n53ffe80 refactor (#275)\n\n","2022-05-20T00:52:35",{"id":245,"version":246,"summary_zh":247,"released_at":248},206516,"v0.7.0","## What's new\n\n### Added 🎉\n\n- Added the \"-n\u002F--name\" option to `tango run`. 
This option allows the user to give the run an arbitrary name.\n- Added a convenience property `.workspace` to `Step` class that can be called from a step's `.run()` method to get the current `Workspace` being used.\n- Gave `FromParams` objects (which includes all `Registrable` objects) the ability to version themselves.\n- Added CLI option to run a single step in a config using `--step-name` or `-s`.\n- Added a `MultiCoreExecutor` that executes steps in parallel.\n- Added an `ExecutorOutput` dataclass that is returned by `Executor.execute_step_graph()`.\n- `StepGraph` now prints itself in a readable way.\n- Tango now automatically detects when it's running under a debugger, and disables multicore support accordingly. Many debuggers can't properly follow sub-processes, so this is a convenience for people who love debuggers.\n- Added more models to the stuff we can import from the transformers library.\n- Added new example for finetuning text-to-text models.\n\n### Changed ⚠️\n\n- Renamed `click_logger` to `cli_logger`, and we now use [rich](https:\u002F\u002Fgithub.com\u002FTextualize\u002Frich)'s logging `Handler` as the default handler, which means prettier output, better tracebacks, and you can use rich's markup syntax with the `cli_logger` to easily add style to text.\n- Refactored `tango.step_graph.StepGraph` to allow initialization from a `Dict[str, Step]`.\n- `Executor.execute_step_graph()` now attempts to execute all steps and summarizes success\u002Ffailures.\n- Upgraded PyTorch version in `tango` Docker image to latest `v1.11.0+cu113`.\n- `RunGeneration` now allows model object as input.\n\n### Fixed ✅\n\n- Fixed bug that mistakenly disallowed fully-qualified names containing `\"_\"` (underscores) in the config.\n- Fixed bug where `TorchTrainStep` working directory would be left in an unrecoverable state if training failed after saving the final model weights.\n- Fixed bug in `FromParams` where `**kwargs` might be passed down to the constructors of 
arguments.\n- Fixed bug in the way dependencies are tracked between steps.\n- Fixed bug that caused `MulticoreExecutor` to hang in case of a failing step that was required recursively (not directly) downstream.\n- Compatibility with PyTorch Lightning 1.6.\n\n## Commits\n\n1083049 Finetuning (#255)\n42b1dba Bug fix with failing steps (#257)\n7bd251a Bump myst-parser from 0.17.0 to 0.17.2 (#273)\ncc9a1dd Bump actions\u002Fupload-artifact from 2 to 3 (#262)\n66777d9 Bump actions\u002Fdownload-artifact from 2 to 3 (#261)\n14d4adb use new beaker-action for building test image (#265)\naf47287 Update pytorch-lightning requirement from \u003C1.6,>=1.5 to >=1.5,\u003C1.7 (#248)\nb1df9a4 use beaker-run action for GPU Tests (#263)\n0a7468e fix release job (#260)\nc1b16b2 Bump furo from 2022.3.4 to 2022.4.7 (#259)\nb55aaf2 use beaker-py to submit GPU tests (#258)\nb2a93a9 Logging part 2: denoising run logging and making Dirk happy (#252)\nff6be8d Update click requirement from \u003C=8.0.4,>=7.0 to >=7.0,\u003C8.1.3 (#254)\n83d78cc Bump mypy from 0.941 to 0.942 (#243)\n3769327 Bump sphinx from 4.4.0 to 4.5.0 (#245)\n81fc5c5 Bump black from 21.12b0 to 22.3.0 (#246)\ne46059b Update tqdm requirement from \u003C4.64,>=4.62 to >=4.62,\u003C4.65 (#256)\nbbdeb6f Revert \"Set `$TEMP` (#241)\"\nb9fd9e9 Fix tracking dependencies between steps (#249)\n53502e1 Pretty-print a step graph (#250)\nd5328c9 Fix dissimilar objects hashing to the same thing (#240)\nccc37ce Autodetect debugger and turn off multicore (#251)\n5c39f61 Pin click\n5bb0fad Logging improvements (#233)\n037e4a0 fix bug with FromParams (#242)\ne142530 Bump actions\u002Fcache from 2 to 3 (#236)\n878402d Set `$TEMP` (#241)\n2d9fa0c fix bug w\u002F TorchTrainStep working dir (#238)\n410faeb Multicore Parallelism (#204)\n9e8e99f Update datasets requirement from \u003C2,>=1.12 to >=1.12,\u003C3 (#234)\n40e0a1a Bump mypy from 0.940 to 0.941 (#230)\nede7428 add name to 
changelog workflow\n4bb659b Bump actions\u002Fsetup-python from 2 to 3 (#229)\n5db1a6a Bump actions\u002Fcheckout from 1 to 3 (#228)\n8049104 Update torch version where it's hard-coded, add an automatic remind to do this stuff in the future (#227)\nfe05449 add back intersphinx inventory links for HF libraries (#222)\n9927749 Bump mypy from 0.931 to 0.940 (#226)\n29ab68b Update torch requirement from \u003C1.11,>=1.9 to >=1.9,\u003C1.12 (#225)\na3fc83b Bump furo from 2022.2.23 to 2022.3.4 (#218)\n28e839e Bump fairscale from 0.4.5 to 0.4.6 (#224)\nf18d393 Update tqdm requirement from \u003C4.63,>=4.62 to >=4.62,\u003C4.64 (#213)\n54c4a8d automatically keep copyright up-to-date (#221)\n06adb07 Allow setting the run name as a command-line option (#212)\n71e0639 Update cached-path requirement from \u003C1.1,>=1.0 to >=1.0,\u003C1.2 (#217)\n5d4660a Temporarily remove intersphinx links to HF docs (#220)\n13c7f3f Merge pull request #216 from allenai\u002FVersionForFromParams\n0027cb2 Merge pull request #215 from allenai\u002Ffix-fully-qualified-name-recognition\n76f9922 Add \"Step.workspace\" property (#210)\n\n","2022-04-19T22:52:53",{"id":250,"version":251,"summary_zh":252,"released_at":253},206517,"v0.6.0","## What's new\n\n### Added 🎉\n\n- New example that finetunes a pre-trained ResNet model on the Cats & Dogs dataset.\n- Added a '@requires_gpus' decorator for marking tests as needing GPUs. Tests marked with this will be run in the \"GPU Tests\" workflow\n  on dual k80 GPUs via Beaker.\n- Added the \"-w\u002F--workspace\" option to `tango run` and `tango server` commands. 
This option takes a path or URL, and instantiates the workspace from the URL using the newly added `Workspace.from_url()` method.\n- Added the \"workspace\" field to `TangoGlobalSettings`.\n- Added the \"environment\" field to `TangoGlobalSettings` for setting environment variables each\n  time `tango` is run.\n- Added a utility function to get a `StepGraph` directly from a file.\n- Added `tango.settings` module and `tango settings` group of commands.\n- A format for storing sequences as `SqliteSparseSequence`\n- A way to massage kwargs before they determine the unique ID of a `Step`\n\n### Changed ⚠️\n\n- `local_workspace.ExecutorMetadata` renamed to `StepExecutionMetadata` and now saved as `execution-metadata.json`.\n- `tango run` without the option \"-w\u002F--workspace\" or \"-d\u002F--workspace-dir\" will now use a `MemoryWorkspace` instead of a `LocalWorkspace` in a temp directory, unless you've specified\n  a default workspace in a `TangoGlobalSettings` file.\n- Moved `tango.workspace.MemoryWorkspace` and `tango.local_workspace.LocalWorkspace` to `tango.workspaces.*`.\n- Moved `tango.step_cache.MemoryStepCache` and `tango.step_cache.LocalStepCache` to `tango.step_caches.*`.\n- Deprecated the `-d\u002F--workspace-dir` command-line option. 
Please use `-w\u002F--workspace` instead.\n\n### Fixed ✅\n\n- Fixed a small bug where `LocalWorkspace` would fail to capture the conda environment in our Docker image.\n- Fixed activation of `FILE_FRIENDLY_LOGGING` when set from the corresponding environment variable.\n- Fixed setting log level via the environment variable `TANGO_LOG_LEVEL`.\n- Use relative paths within the `work_dir` for symbolic links to the latest and the best checkpoints in `TorchTrainStep`.\n- Fixed some scenarios where Tango can hang after finishing all steps.\n- `distributed_port` and `log_every` parameters won't factor into `TorchTrainStep`'s unique ID.\n- `MappedSequence` now works with slicing.\n- `MappedSequence` now works with Huggingface `Dataset`.\n- Uncacheable steps are now visible in Tango UI.\n- Fixed bug in `Registrable.list_available()` where an error might be raised if the default implementation hadn't been explicitly imported.\n- Fixed issue where having a default argument to the `run()` method wasn't getting applied to the step's unique ID.\n\n## Commits\n\nf9da0af Merge pull request #211 from allenai\u002FMassage\ne78dcbe Allow setting environment variables in tango settings, fix bug with TANGO_LOG_LEVEL env var (#209)\n82404b6 Re-create LICENSE so GitHub will show it (#208)\n0fadecf Bump furo from 2022.2.14.1 to 2022.2.23 (#207)\n787b6e6 Merge pull request #206 from allenai\u002Fsettings\nc3401f2 Merge pull request #205 from allenai\u002FRobustnessFixes\n7ceda9c Merge pull request #201 from allenai\u002Fworkspace-prep\n6dd7d86 Merge pull request #200 from allenai\u002Funcacheable-steps-in-server\n5ad3f44 Bump furo from 2022.1.2 to 2022.2.14.1 (#199)\n3528230 Update filelock requirement from \u003C3.5,>=3.4 to >=3.4,\u003C3.7 (#202)\n21d6d40 Merge pull request #193 from allenai\u002FStepGraphFromFile\n258a7d2 skip 'distributed_port' and 'log_every' in unique ID (#197)\ndd4c47f Merge pull request #192 from allenai\u002FCloseSqliteHarder\ncc94e1c Merge pull request #156 from 
allenai\u002FDocumentationRefresh\n5cc86b8 Rename \"ExecutorMetadata\" -> \"StepExecutionMetadata\" (#195)\n6aecab7 Bump myst-parser from 0.16.1 to 0.17.0 (#191)\n6478293 make pushing test image to Beaker more robust (#190)\n7c1ac5b Finetune resnet Example for Tango (#150)\n5187b01 update docs for integration tests and gpu tests timeout\n95b78b5 Add new manually triggered workflow for integration tests, other bug fixes (#188)\n19f7b31 Merge pull request #189 from allenai\u002Ffix-checkpoint-path-link\na438b26 Workflow quickfix\n671a6dc verify exit code of beaker job (#187)\n7ccad94 Merge pull request #186 from allenai\u002Fadd-tests\nbf6ecd0 Run GPU tests on Beaker (#183)\n\n","2022-02-25T21:35:00"]