**AIStore: High-Performance, Scalable Storage for AI Workloads**

![License](https://img.shields.io/badge/license-MIT-blue.svg)
![Version](https://img.shields.io/badge/version-v4.4-green.svg)
![Go Report Card](https://oss.gittoolsai.com/images/NVIDIA_aistore_readme_2b4a70945b89.png)

AIStore (AIS) is a lightweight distributed storage stack tailored for AI applications. It's an elastic cluster that can grow and shrink at runtime and can be ad-hoc deployed, with or without Kubernetes, anywhere from a single Linux machine to a bare-metal cluster of any size. Built from scratch, AIS provides linear scale-out, consistent performance, and a flexible deployment model.

AIS is a reliable storage cluster that can natively operate on both in-cluster and remote data, without treating either as a cache.

AIS consistently shows [balanced I/O distribution and linear scalability](https://aistore.nvidia.com/blog/2025/07/26/smooth-max-line-speed) across an arbitrary number of clustered nodes. The system supports fast data access, reliability, and rich customization for data transformation workloads.

## Features

* ✅ **Multi-Cloud Access:** Seamlessly access and manage content across multiple [cloud backends](/docs/overview.md#at-a-glance) (including AWS S3, GCS, Azure, and OCI), with fast-tier performance, configurable redundancy, and namespace-aware bucket identity (same-name buckets can coexist across accounts, endpoints, and providers).
* ✅ **Deploy Anywhere:** AIS runs on any Linux machine, virtual or physical. Deployment options range from a [minimal container-based deployment](https://github.com/NVIDIA/aistore/blob/main/deploy/prod/docker/compose/README.md) and [Google Colab](https://aistore.nvidia.com/blog/2024/09/18/google-colab-aistore) to petascale [Kubernetes clusters](https://github.com/NVIDIA/ais-k8s). There are [no built-in limitations](https://github.com/NVIDIA/aistore/blob/main/docs/overview.md#no-limitations-principle) on deployment size or functionality.
* ✅ **High Availability:** Redundant control and data planes. Self-healing, end-to-end protection, n-way mirroring, and erasure coding. Arbitrary number of lightweight access points (AIS proxies).
* ✅ **HTTP-based API:** A feature-rich, native API (with user-friendly SDKs for Go and Python), and compliant [Amazon S3 API](/docs/s3compat.md) for running unmodified S3 clients.
* ✅ **Monitoring:** Comprehensive observability with integrated Prometheus metrics, Grafana dashboards, detailed logs with configurable verbosity, and CLI-based performance tracking for complete cluster visibility and troubleshooting. See [AIStore Observability](/docs/monitoring-overview.md) for details.
* ✅ **Chunked Objects:** High-performance chunked object representation, with independently retrievable chunks, metadata v2, and checksum-protected manifests.
Supports rechunking, parallel reads, and seamless integration with [Get-Batch](/docs/get_batch.md), [blob-downloader](/docs/blob_downloader.md), and multipart uploads to supported cloud backends.
* ✅ **JWT Authentication and Authorization:** [Validates request JWTs](/docs/auth_validation.md) to provide cluster- and bucket-level access control using static keys or dynamic OIDC issuer JWKS lookup.
* ✅ **Secure Redirects:** Configurable cryptographic signing of redirect URLs using HMAC-SHA256 with a versioned cluster key (distributed via metasync, stored in memory only).
* ✅ **Load-Aware Throttling:** Dynamic request throttling based on a multi-dimensional load vector (CPU, memory, disk, file descriptors, goroutines) to protect AIS clusters under stress.
* ✅ **Unified Namespace:** Attach AIS clusters together to provide unified access to datasets across independent clusters, allowing users to reference shared buckets with cluster-specific identifiers.
* ✅ **Turn-key Cache:** In addition to robust data protection features, AIS offers a per-bucket configurable LRU-based cache with eviction thresholds and storage capacity watermarks.
* ✅ **ETL Offload:** Execute I/O intensive data transformations [close to the data](/docs/etl.md), either inline (on-the-fly as part of each read request) or offline (batch processing, with the destination bucket populated with transformed results).
* ✅ **Get-Batch:** Retrieve multiple objects and/or [archived files](/docs/archive.md) with a single call. Designed for ML/AI pipelines, [Get-Batch](/docs/get_batch.md) fetches an entire training batch in one operation, assembling a TAR (or other supported [serialization formats](/docs/archive.md)) that contains all requested items in the exact user-specified order ([paper](https://arxiv.org/abs/2602.22434)).
* ✅ **Data Consistency:** Guaranteed [consistency](/docs/terminology.md#read-after-write-consistency) across all gateways, with [write-through](/docs/terminology.md#write-through) semantics in the presence of [remote backends](/docs/terminology.md#backend-provider).
* ✅ **Serialization & Sharding:** Native, first-class support for TAR, TGZ, TAR.LZ4, and ZIP [archives](/docs/archive.md) for efficient storage and processing of small-file datasets. Features include seamless integration with existing unmodified workflows across all APIs and subsystems.
* ✅ **Kubernetes:** For production, AIS runs natively on Kubernetes. The dedicated [ais-k8s](https://github.com/NVIDIA/ais-k8s) repository includes the AIS K8s Operator, Ansible playbooks, Helm charts, and deployment guidance.
* ✅ **Batch Jobs:** More than 30 cluster-wide [batch operations](/docs/batch.md) that you can start, monitor, and control.
The list currently includes:

```console
$ ais show job --help

NAME:
    archive        blob-download  cleanup       copy-bucket    copy-objects      delete-objects
    download       dsort          ec-bucket     ec-get         ec-put            ec-resp
    elect-primary  etl-bucket     etl-inline    etl-objects    evict-objects     evict-remote-bucket
    get-batch      list           lru-eviction  mirror         prefetch-objects  promote-files
    put-copies     rebalance      rechunk       rename-bucket  resilver          summary
    warm-up-metadata
```

> The feature set continues to grow and also includes: [native bucket inventory (NBI)](/docs/nbi.md); [blob-downloader](/docs/blob_downloader.md); [AuthN - authentication and authorization server](/docs/authn.md); runtime management of [TLS certificates](/docs/cli/x509.md); full support for [adding/removing nodes at runtime](/docs/lifecycle_node.md); adaptive [rate limiting](/docs/rate_limit.md); and more.

> For the original **white paper** and design philosophy, please see [AIStore Overview](/docs/overview.md), which also includes a high-level block diagram, terminology, APIs, CLI, and more.
> For our 2024 KubeCon presentation, please see [AIStore: Enhancing petascale Deep Learning across Cloud backends](https://www.youtube.com/watch?v=N-d9cbROndg).

## CLI

AIS includes an integrated, scriptable [CLI](/docs/cli.md) for managing clusters, buckets, and objects, running and monitoring batch jobs, viewing and downloading logs, generating performance reports, and more:

```console
$ ais <TAB-TAB>

advanced         cluster          etl              ls               prefetch         search           tls
alias            config           evict            ml               put              show             wait
archive          cp               get              mpu              remote-cluster   space-cleanup
auth             create           help             nbi              rmb              start
blob-download    download         job              object           rmo              stop
bucket           dsort            log              performance      scrub            storage
```

## Developer Tools

AIS runs natively on Kubernetes and features an open data format, so you are free to copy or move your data out of AIS at any time using the familiar Linux `tar(1)`, `scp(1)`, `rsync(1)`, and similar tools.

For developers and data scientists, there's also:

* [Go API](https://github.com/NVIDIA/aistore/tree/main/api) used in [CLI](/docs/cli.md) and [benchmarking tools](/docs/aisloader.md)
* [Python SDK](https://github.com/NVIDIA/aistore/tree/main/python/aistore/sdk) + [Reference Guide](/docs/python_sdk.md)
* [PyTorch integration](https://github.com/NVIDIA/aistore/tree/main/python/aistore/pytorch) and usage examples
* [Boto3 support](https://github.com/NVIDIA/aistore/tree/main/python/aistore/botocore_patch)

## Quick Start

1. Read the [Getting Started Guide](/docs/getting_started.md) for a 5-minute local install, or
2. Run a [minimal container-based](https://github.com/NVIDIA/aistore/tree/main/deploy/prod/docker/compose) AIS cluster consisting of a single gateway and a single storage node, or
3. Clone the repo and run `make kill cli aisloader deploy` followed by `ais show cluster`

---------------------
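The Serialization & Sharding feature above stores small-file datasets as TAR/TGZ/ZIP shards (see also the ishard blog post linked under Existing Datasets). As a standalone illustration of the grouping step behind that idea, here is a greedy, order-preserving packer that assigns files to shards under a size budget. The function name and sample files are invented for this sketch; this is not AIS or ishard code.

```python
def pack_into_shards(files, max_bytes):
    """Greedily pack (name, size) pairs into shards of at most max_bytes each,
    preserving input order; a file larger than the budget gets a shard of its
    own. Illustrative sketch only, not the AIS/ishard algorithm."""
    shards, current, current_size = [], [], 0
    for name, size in files:
        # Close the current shard if adding this file would overflow it.
        if current and current_size + size > max_bytes:
            shards.append(current)
            current, current_size = [], 0
        current.append(name)
        current_size += size
    if current:
        shards.append(current)
    return shards

# Hypothetical small-file dataset: (object name, size in bytes).
dataset = [("a.jpg", 40), ("b.jpg", 50), ("c.jpg", 30), ("d.bin", 120)]
print(pack_into_shards(dataset, max_bytes=100))
# [['a.jpg', 'b.jpg'], ['c.jpg'], ['d.bin']]
```

Order preservation matters here because downstream readers (and AIS's archive APIs) address shard members by position within a known naming scheme.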
## Deployment options

AIS deployment options, as well as intended (development vs. production vs. first-time) usages, are all [summarized here](https://github.com/NVIDIA/aistore/blob/main/deploy/README.md).

Prerequisites essentially boil down to having Linux with a disk.
Deployment options range from a [minimal container-based deployment](https://github.com/NVIDIA/aistore/tree/main/deploy/prod/docker/compose) to petascale bare-metal clusters of any size, and from a single VM to multiple racks of high-end servers.
Practical use cases require, of course, further consideration.

Some of the most popular deployment options include:

| Option | Use Case |
|--------|----------|
| [Local playground](https://github.com/NVIDIA/aistore/blob/main/docs/getting_started.md#local-playground) | AIS developers or first-time users, Linux or Mac OS. Run `make kill cli aisloader deploy <<< $'N\nM'`, where `N` is the number of [targets](/docs/terminology.md#target) and `M` is the number of [gateways](/docs/terminology.md#proxy) |
| [Minimal container-based deployment](https://github.com/NVIDIA/aistore/tree/main/deploy/prod/docker/compose) | Quick testing and evaluation; single-node setup |
| [GCP/GKE automated install](https://github.com/NVIDIA/aistore/blob/main/docs/getting_started.md#kubernetes-playground) | Developers, first-time users, AI researchers |
| [Large-scale production deployment](https://github.com/NVIDIA/ais-k8s) | Requires Kubernetes; provided via [ais-k8s](https://github.com/NVIDIA/ais-k8s) |

> For performance tuning, see [performance](/docs/performance.md) and [AIS K8s Playbooks](https://github.com/NVIDIA/ais-k8s/tree/main/playbooks/host-config).

## Existing Datasets

AIS supports multiple ingestion modes:

* ✅ **On Demand:** Transparent cloud access during workloads.
* ✅ **PUT:** Locally accessible files and directories.
* ✅ **Promote:** Import local target directories and/or NFS/SMB shares mounted on AIS targets.
* ✅ **Copy:** Full buckets, virtual subdirectories (recursively or non-recursively), lists
or ranges (via Bash expansion).
* ✅ **Download:** HTTP(S)-accessible datasets and objects.
* ✅ **Prefetch:** Remote buckets or selected objects (from remote buckets), including subdirectories, lists, and/or ranges.
* ✅ **Archive:** [Group and store](https://aistore.nvidia.com/blog/2024/08/16/ishard) related small files from an original dataset.

## Install from Release Binaries

You can install the CLI and benchmarking tools using:

```console
./scripts/install_from_binaries.sh --help
```

The script installs [aisloader](/docs/aisloader.md) and [CLI](/docs/cli.md) from the latest or previous GitHub [release](https://github.com/NVIDIA/aistore/releases) and enables CLI auto-completions.

## PyTorch integration

PyTorch integration is a growing set of datasets (both iterable and map-style), samplers, and dataloaders:

* [Taxonomy of abstractions and API reference](/docs/pytorch.md)
* [AIS plugin for PyTorch: usage examples](https://github.com/NVIDIA/aistore/tree/main/python/aistore/pytorch/README.md)
* [Jupyter notebook examples](https://github.com/NVIDIA/aistore/tree/main/python/examples/pytorch/)

## AIStore Badge

Let others know your project is powered by high-performance AI storage:

[![aistore](https://img.shields.io/badge/powered%20by-AIStore-76B900?style=flat&labelColor=000000)](https://github.com/NVIDIA/aistore)

```markdown
[![aistore](https://img.shields.io/badge/powered%20by-AIStore-76B900?style=flat&labelColor=000000)](https://github.com/NVIDIA/aistore)
```

## More Docs & Guides

* [Overview and Design](/docs/overview.md)
* [Terminology and Core Abstractions](/docs/terminology.md)
* [Networking Model](/docs/networking.md)
* [Getting Started](/docs/getting_started.md)
* [AIS Buckets: Design and Operations](/docs/bucket.md)
* [Observability](/docs/monitoring-overview.md)
* [Technical Blog](https://aistore.nvidia.com/blog)
* [S3 Compatibility](/docs/s3compat.md)
* [Batch Jobs](/docs/batch.md)
* [Performance](/docs/performance.md) and [CLI: performance](/docs/cli/performance.md)
* [CLI Reference](/docs/cli.md)
* [Production Deployment: Kubernetes Operator, Ansible Playbooks, Helm Charts, Monitoring](https://github.com/NVIDIA/ais-k8s)

### How to find information

* See [Extended Index](/docs/docs.md)
* Use the CLI `search` command, e.g.: `ais search copy`
* Clone the repository and run `git grep`, e.g.: `git grep -n out-of-band -- "*.md"`

## License

MIT

## Author

Alex Aizman (NVIDIA)
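The Secure Redirects feature listed above signs redirect URLs with HMAC-SHA256 under a versioned cluster key. A minimal standalone sketch of that signing scheme follows; the `path|expiry|version` message layout and the key table are assumptions made for illustration, and AIS's actual wire format and metasync-based key distribution are not shown.

```python
import hmac
import hashlib

# Hypothetical versioned cluster keys; AIS distributes its real key via
# metasync and keeps it in memory only.
KEYS = {1: b"cluster-secret-v1"}

def sign_redirect(path: str, expires_at: int, key_version: int) -> str:
    """HMAC-SHA256 over an assumed 'path|expiry|version' message layout."""
    msg = f"{path}|{expires_at}|{key_version}".encode()
    return hmac.new(KEYS[key_version], msg, hashlib.sha256).hexdigest()

def verify_redirect(path: str, expires_at: int, key_version: int,
                    signature: str, now: int) -> bool:
    """Reject expired or unknown-key requests, then compare in constant time."""
    if now >= expires_at or key_version not in KEYS:
        return False
    expected = sign_redirect(path, expires_at, key_version)
    return hmac.compare_digest(expected, signature)

sig = sign_redirect("/v1/objects/my-bck/obj.tar", 2_000, 1)
assert verify_redirect("/v1/objects/my-bck/obj.tar", 2_000, 1, sig, now=1_000)
assert not verify_redirect("/v1/objects/other", 2_000, 1, sig, now=1_000)
```

Versioning the key lets a cluster rotate secrets without invalidating in-flight redirects: the verifier simply looks up the key version carried with the request.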
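Get-Batch, described in the README above, returns a single TAR containing all requested items in the exact user-specified order. The assembly idea can be illustrated with Python's `tarfile` module; the object names and payloads below are invented, and this is a sketch of the concept, not the AIS implementation.

```python
import io
import tarfile

def assemble_batch(items):
    """Pack an ordered list of (name, payload) pairs into one in-memory TAR,
    preserving the caller's order -- conceptually like a Get-Batch response."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, payload in items:
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()

# The caller requests samples in a specific (e.g. shuffled) order.
batch = assemble_batch([("sample-07.jpg", b"\xff\xd8"), ("sample-02.jpg", b"\xff\xd8")])
with tarfile.open(fileobj=io.BytesIO(batch)) as tar:
    assert tar.getnames() == ["sample-07.jpg", "sample-02.jpg"]  # order preserved
```

Because member order inside the TAR matches the request order, a training loop can stream the archive sequentially without re-sorting or buffering the whole batch.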
示例](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Ftree\u002Fmain\u002Fpython\u002Fexamples\u002Fpytorch\u002F)\n\n## AIStore 徽章\n\n让其他人知道您的项目由高性能 AI 存储提供支持：\n\n[![aistore](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fpowered%20by-AIStore-76B900?style=flat&labelColor=000000)](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore)\n\n```markdown\n[![aistore](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fpowered%20by-AIStore-76B900?style=flat&labelColor=000000)](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore)\n```\n\n## 更多文档与指南\n\n* [概述与设计](\u002Fdocs\u002Foverview.md)\n* [术语与核心抽象](\u002Fdocs\u002Fterminology.md)\n* [网络模型](\u002Fdocs\u002Fnetworking.md)\n* [快速入门](\u002Fdocs\u002Fgetting_started.md)\n* [AIS 存储桶：设计与操作](\u002Fdocs\u002Fbucket.md)\n* [可观测性](\u002Fdocs\u002Fmonitoring-overview.md)\n* [技术博客](https:\u002F\u002Faistore.nvidia.com\u002Fblog)\n* [S3 兼容性](\u002Fdocs\u002Fs3compat.md)\n* [批处理作业](\u002Fdocs\u002Fbatch.md)\n* [性能](\u002Fdocs\u002Fperformance.md) 和 [CLI 性能](\u002Fdocs\u002Fcli\u002Fperformance.md)\n* [CLI 参考](\u002Fdocs\u002Fcli.md)\n* [生产部署：Kubernetes Operator、Ansible 剧本、Helm 图表、监控](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Fais-k8s)\n\n### 如何查找信息\n\n* 请参阅[扩展索引](\u002Fdocs\u002Fdocs.md)\n* 使用 CLI 的 `search` 命令，例如：`ais search copy`\n* 克隆仓库并运行 `git grep`，例如：`git grep -n out-of-band -- \"*.md\"`\n\n## 许可证\n\nMIT\n\n## 作者\n\nAlex Aizman (NVIDIA)","# AIStore 快速上手指南\n\nAIStore (AIS) 是由 NVIDIA 开源的高性能、可扩展分布式存储系统，专为 AI\u002FML 工作负载设计。它支持线性扩展、多云接入（AWS S3, GCS, Azure 等），并可在单台 Linux 机器到大规模 Kubernetes 集群间灵活部署。\n\n## 环境准备\n\n### 系统要求\n*   **操作系统**: Linux (推荐 Ubuntu 20.04+ 或 CentOS 7+)。macOS 仅适用于本地开发测试（Local Playground）。\n*   **硬件**: 至少需要一块磁盘。生产环境建议配置多块磁盘以提升 I\u002FO 吞吐。\n*   **容器环境** (可选): 若使用 Docker 部署，需安装 Docker Engine 和 Docker Compose。\n*   **Kubernetes** (可选): 生产环境部署需具备 K8s 集群及 `kubectl` 工具。\n\n### 前置依赖\n*   **Go 语言**: 版本 1.21+ (若需从源码编译)。\n*   **Make**: 用于执行构建脚本。\n*   **网络**: 确保节点间网络互通，若访问公有云后端需配置相应网络策略。\n\n> **注意**: 
目前官方未提供特定的中国镜像源。国内用户若遇到拉取 Docker 镜像或 Go 模块超时问题，建议配置通用的 Docker 镜像加速器（如阿里云、腾讯云镜像站）及 `GOPROXY` 环境变量：\n> ```bash\n> export GOPROXY=https:\u002F\u002Fgoproxy.cn,direct\n> ```\n\n## 安装步骤\n\n根据使用场景，选择以下任一方式快速启动：\n\n### 方案一：本地快速体验 (Local Playground)\n适合开发者在单机（Linux\u002FMac）上快速验证功能。此命令会自动下载二进制文件并启动一个包含 N 个存储节点和 M 个网关的临时集群。\n\n```bash\n# 克隆仓库\ngit clone https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore.git\ncd aistore\n\n# 启动集群 (例如：3 个目标节点，1 个代理节点)\n# 系统将自动处理依赖下载和进程启动\nmake kill cli aisloader deploy \u003C\u003C\u003C $'3\\n1'\n```\n\n### 方案二：Docker Compose 最小化部署\n适合在单台 Linux 服务器上通过容器快速搭建生产级架构的最小集合（1 个 Gateway + 1 个 Storage Node）。\n\n```bash\n# 进入部署目录\ncd deploy\u002Fprod\u002Fdocker\u002Fcompose\n\n# 启动服务\ndocker compose up -d\n\n# 验证集群状态\ndocker compose exec ais-aisnode ais show cluster\n```\n\n### 方案三：Kubernetes 生产部署\n适合大规模生产环境。请使用专用的 `ais-k8s` 仓库进行部署（需预先配置好 K8s 环境）：\n\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Fais-k8s.git\ncd ais-k8s\n# 参考该仓库 README 使用 Helm 或 Operator 进行部署\n```\n\n## 基本使用\n\n安装完成后，您可以使用内置的 `ais` 命令行工具或 Python SDK 进行操作。\n\n### 1. 检查集群状态\n查看集群节点、容量及健康状态：\n```bash\nais show cluster\n```\n\n### 2. 创建 Bucket\n创建一个名为 `my-ai-data` 的存储桶：\n```bash\nais create bucket my-ai-data\n```\n\n### 3. 上传数据\n将本地文件上传至 Bucket：\n```bash\nais put .\u002Flocal_dataset.tar.gz ais:\u002F\u002Fmy-ai-data\u002Fdataset.tar.gz\n```\n或者上传整个目录：\n```bash\nais put .\u002Fdata_folder\u002F ais:\u002F\u002Fmy-ai-data\u002F\n```\n\n### 4. 列出对象\n查看 Bucket 中的文件列表：\n```bash\nais ls ais:\u002F\u002Fmy-ai-data\u002F\n```\n\n### 5. 高性能批量读取 (Get-Batch)\nAIStore 的核心特性之一是将多个对象打包为 TAR 流一次性读取，极大优化 ML 训练的数据加载效率：\n```bash\n# 获取指定文件列表并打包输出到 stdout 或本地文件\nais get batch --to-file training_batch.tar ais:\u002F\u002Fmy-ai-data\u002Ffile1.jpg ais:\u002F\u002Fmy-ai-data\u002Ffile2.jpg ...\n```\n\n### 6. 
Python SDK 示例\n在 AI 训练脚本中直接集成（示例基于 aistore Python SDK 的常用接口；`list_all_objects` 返回完整的对象条目列表，便于切片）：\n\n```python\nfrom aistore.sdk import Client\n\n# 初始化客户端 (默认连接本地)\nclient = Client(\"http:\u002F\u002Flocalhost:8080\")\n\n# 列出 bucket 内容\nbucket = client.bucket(\"my-ai-data\")\nentries = bucket.list_all_objects()\n\n# 读取数据 (返回字节流)\nfor entry in entries[:5]:\n    data = bucket.object(entry.name).get().read_all()\n    # 此处可接入 PyTorch DataLoader 进行处理\n    print(f\"Read {entry.name}: {len(data)} bytes\")\n```\n\n### 7. 清理资源\n测试结束后，若使用 `make` 方式部署，可运行以下命令停止并清理集群：\n```bash\nmake kill\n```","某自动驾驶研发团队需要在本地 GPU 集群与多个公有云存储之间，高效调度 PB 级图像数据以训练大规模感知模型。\n\n### 没有 aistore 时\n- **数据孤岛严重**：训练数据分散在 AWS S3、Azure 和本地磁盘，工程师需编写复杂脚本手动同步，经常因网络波动导致中断。\n- **训练频繁卡顿**：直接读取云端数据时 I\u002FO 延迟高且不稳定，GPU 经常因等待数据而闲置，算力利用率不足 60%。\n- **扩容维护困难**：随着数据量激增，存储架构无法弹性伸缩，增加节点往往需要停机迁移数据，严重影响迭代进度。\n- **权限管理混乱**：缺乏统一的访问控制，不同项目间数据隔离靠文件名约定，存在误删或泄露风险。\n\n### 使用 aistore 后\n- **统一命名空间**：aistore 将多云和本地存储聚合为单一视图，团队可直接通过标准 S3 API 透明访问所有数据，无需关心底层位置。\n- **极致 I\u002FO 性能**：利用 aistore 的分块对象技术和智能缓存，数据读取吞吐量线性增长，GPU 利用率稳定提升至 95% 以上。\n- **弹性无缝扩展**：集群支持运行时动态增减节点，新服务器加入后自动平衡负载，无需停机即可应对数据量爆发。\n- **安全精细管控**：基于 JWT 的认证机制实现了桶级权限隔离，确保各算法组只能访问授权数据，保障核心资产安全。\n\naistore 通过构建高性能、可弹性伸缩的统一存储层，彻底消除了 AI 训练中的数据供给瓶颈，让算力真正专注于模型迭代。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FNVIDIA_aistore_07a6b9d9.png","NVIDIA","NVIDIA Corporation","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002FNVIDIA_7dcf6000.png","",null,"https:\u002F\u002Fnvidia.com","https:\u002F\u002Fgithub.com\u002FNVIDIA",[80,84,88,92,96,100,104,107,111,115],{"name":81,"color":82,"percentage":83},"Go","#00ADD8",73.7,{"name":85,"color":86,"percentage":87},"Python","#3572A5",17.4,{"name":89,"color":90,"percentage":91},"Shell","#89e051",4.1,{"name":93,"color":94,"percentage":95},"Jupyter 
Notebook","#DA5B0B",4,{"name":97,"color":98,"percentage":99},"Makefile","#427819",0.4,{"name":101,"color":102,"percentage":103},"CSS","#663399",0.2,{"name":105,"color":106,"percentage":103},"TypeScript","#3178c6",{"name":108,"color":109,"percentage":110},"Dockerfile","#384d54",0.1,{"name":112,"color":113,"percentage":114},"Mustache","#724b3b",0,{"name":116,"color":117,"percentage":114},"Jinja","#a52a22",1820,246,"2026-04-17T11:59:00","MIT","Linux","未说明",{"notes":125,"python":123,"dependencies":126},"AIStore 是一个用 Go 语言编写的分布式存储系统，而非单纯的 Python AI 库。它原生运行在 Linux 上（支持物理机、虚拟机或容器），官方文档明确指出先决条件本质上是“拥有磁盘的 Linux 机器”。虽然提供了 Python SDK 和 PyTorch 集成用于开发，但核心服务不依赖特定版本的 Python 或 GPU。生产环境推荐使用 Kubernetes 部署，也支持单机 Docker Compose 快速启动。",[127,128,129],"Go (用于构建核心)","Kubernetes (生产环境可选)","Docker (容器化部署可选)",[16,14],[132,133,134,135,136,137,138,139,140,141,142,143],"object-storage","etl-offload","linear-scalability","batch-jobs","kubernetes","distributed-storage","high-availability","high-performance","ml-training","multi-cloud","multipart-upload","s3-compatible","2026-03-27T02:49:30.150509","2026-04-18T03:35:36.262005",[147,152,156,161,166,170],{"id":148,"question_zh":149,"answer_zh":150,"source_url":151},38686,"ETL WebDataset 连接超时或无法获取结果怎么办？","首先，检查 ETL Pod 的日志以定位初始化问题，命令为 `kubectl logs \u003Cpod_name>`。若需查看集群侧更详细的 ETL 设置日志，可执行 `ais config cluster log.modules etl` 开启详细模式。\n\n关于无法从集群外部获取结果的问题：ETL Pod 的 NodePort 设计初衷并非供集群外部访问，而是仅供 AIS 目标节点（target node）直接请求对象转换时使用。请确保 Python 版本的 ETL 任务客户端与 ETL 运行时版本匹配，并使用支持 URL ARG_TYPE 的 AIS 版本。","https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fissues\u002F136",{"id":153,"question_zh":154,"answer_zh":155,"source_url":151},38687,"如何调试和监控 ETL 任务的运行状态？","调试 ETL 任务的常用方法是：先运行设置 ETL 的命令，然后观察生成了哪些 Pod。接着对所有新生成的 Pod 使用日志命令（`kubectl logs`）查看是否有初始化错误。90% 的问题可以通过这些 Pod 的日志找到原因。此外，可以在转换函数中直接使用 print 打印日志，并通过 `kubectl logs` 查看。",{"id":157,"question_zh":158,"answer_zh":159,"source_url":160},38688,"AIStore 是否完全兼容 Amazon S3 并支持所有 S3 客户端的身份验证？","AIStore 提供兼容 
Amazon S3 的 API，旨在运行未经修改的 S3 客户端和应用，但它并非 Amazon S3 的直接替代品，某些例外情况已在文档中说明。\n\n关于身份验证：如果启用了 AIStore 认证，请求必须包含带有 `Bearer \u003Ctoken>` 的 `Authorization` 头。不同的 S3 兼容客户端构建请求的方式不同，部分客户端可能无法自动发送该头部。如果遇到此类问题，建议直接使用 AIStore 的原生 Go 或 Python SDK，这通常比尝试修补第三方 S3 客户端更简单高效。","https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fissues\u002F181",{"id":162,"question_zh":163,"answer_zh":164,"source_url":165},38689,"Docker Compose 重启后集群配置（如 Cluster ID、认证信息）丢失如何解决？","配置丢失通常是因为状态未持久化或容器重启后内部 IP 变化导致节点无法加入集群。解决方案如下：\n1. 确保将配置和数据目录挂载为 Docker 卷（volumes），例如 `\u002Fdata\u002Fais:\u002Fais\u002Fdisk0`。\n2. 由于状态持久化后节点会尝试连接旧的内部 IP，建议在 `docker-compose.yml` 中配置固定的子网和静态 IP。示例配置：\n```yaml\nnetworks:\n  vlan:\n    ipam:\n      config:\n        - subnet: 10.5.0.0\u002F16\n          gateway: 10.5.0.1\nservices:\n  ais:\n    networks:\n      vlan:\n        ipv4_address: 10.5.0.2\n```\n这样可以防止因 IP 变更导致的 `connection refused` 错误。","https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fissues\u002F194",{"id":167,"question_zh":168,"answer_zh":169,"source_url":165},38690,"重启持久化状态的 Docker 容器时出现 'failed to join cluster: connection refused' 错误怎么办？","这是因为容器重启后内部 IP 地址发生了变化，而持久化的状态中仍记录着旧的 IP 地址，导致节点试图连接旧地址失败。\n\n解决方法是强制在 Docker Compose 中使用固定 IP 地址。你需要定义一个自定义网络并指定 `ipv4_address`，确保每次重启容器时其内部 IP 保持一致。同时，务必将配置目录挂载为卷以持久化状态。",{"id":171,"question_zh":172,"answer_zh":173,"source_url":151},38691,"ETL 任务客户端版本与运行时版本需要匹配吗？","是的，Python 版本的 ETL 任务客户端需要与 ETL 运行时版本相匹配。版本不匹配可能导致通信问题或转换失败。请确保使用的 AIS 版本支持相应的 URL 参数类型（ARG_TYPE），并且客户端代码与部署的 ETL 运行时版本一致。",[175,180,185,190,195,200,205,210,215,220,225,230,235,240,245,250,255,260,265,270],{"id":176,"version":177,"summary_zh":178,"released_at":179},314621,"v1.4.4","AIStore **4.4** 是一个短周期发布版本，重点在于运行时正确性、容器感知能力以及功能整合。\n\n本次发布首次实现了对 cgroup v2 的完整支持，从而提升了 AIS 在受限环境和容器化部署中的行为表现。CPU 和内存的统计现在具备 cgroup 感知能力，容器检测与启动初始化更加稳健，而 CPU 使用率则以平滑移动平均值而非瞬时采样值来报告。\n\n此外，4.4 版本还将原生桶清单（NBI）正式纳入 S3 兼容清单工作流的默认路径。该版本新增了非递归的清单列出功能，移除了旧有的 S3 特定清单实现，并更新了 S3 兼容层，使其将基于清单的请求路由至 NBI。目前，NBI 
已可被视为稳定功能。\n\n其他改进包括：优化 S3 桶区域发现与错误报告机制，为每个桶添加 OCI 区域及实例主身份验证支持，引入每桶用户自定义元数据，并更新了 CLI、文档、追踪系统以及构建工具链。\n\n本次发布自 [4.3](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Freleases\u002Ftag\u002Fv1.4.3) 以来共包含约 50 个提交，且完全向后兼容。\n\n---\n\n## 目录\n\n1. [运行时：cgroup v2、CPU 与内存](#runtime-cgroup-v2-cpu-and-memory)\n2. [原生桶清单（NBI）](#native-bucket-inventory-nbi)\n3. [S3 兼容性与远程桶发现](#s3-compatibility-and-remote-bucket-discovery)\n4. [OCI 后端](#oci-backend)\n5. [桶元数据与 BMD 变更](#bucket-metadata-and-bmd-changes)\n6. [CLI](#cli)\n7. [Python SDK 与 ETL](#python-sdk-and-etl)\n8. [追踪](#tracing)\n9. [官网](#website)\n10. [文档](#documentation)\n11. [构建、CI 与工具](#build-ci-and-tools)\n12. [升级注意事项](#upgrade-notes)\n\n---\n\n\u003Ca name=\"runtime-cgroup-v2-cpu-and-memory\">\u003C\u002Fa>\n## 运行时：cgroup v2、CPU 与内存\n\n4.4 版本的主要特性是全面支持 **cgroup v2**，并进行了更广泛的运行时重构，以确保 AIS 在标准容器及受限制的 Kubernetes 风格环境中能够正常运行。\n\n### 参阅\n\n* [AIS 在容器化环境中的使用](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fblob\u002Fmain\u002Fdocs\u002Fcontainerized.md)\n\n相关工作首先从启动流程和环境检测入手。AIS 现在会通过 `sys.Init()` 更早、更明确地初始化 CPU 和内存统计；同时，借助额外的启发式方法改进了容器自动检测逻辑，并新增了 `ForceContainerCPUMem` 功能标志，以便在检测失败时强制覆盖。此外，初始化路径还应用了容器感知的 CPU 计数和 `GOMAXPROCS` 设置，并避免在运行时再次尝试解析失败的 cgroup 数据。\n\n在内存方面，AIS 现在支持基于 `memory.max`、`memory.current` 和 `memory.stat` 的 **cgroup v2 内存报告**，同时保留对 `memory.stat` 的尽力而为处理，以防止缺失或无法读取的辅助信息导致严重的运行时故障。主机和容器的内存读取逻辑已被重构为 stat","2026-04-08T17:14:24",{"id":181,"version":182,"summary_zh":183,"released_at":184},314622,"v1.4.3","AIStore **4.3** 为远程存储桶引入了原生存储桶清单（NBI），并由全新的系统存储桶命名空间提供支持。如今，包含数百万甚至数千万对象的存储桶，可以直接从 AIS 管理的清单快照中列出，而无需在每次调用时遍历后端。\n\n本次发布还新增了对所有三个逻辑网络及传输流媒体层的完整 IPv6 网络支持，具备集群范围内的地址族配置和自动 IPv4 回退功能。\n\n4.3 版本扩展了身份验证和密钥管理能力，支持 JWKS 持久化、手动密钥轮换、标准 JWT 声明以及新的密钥提供商抽象层。客户端多部分下载功能在 Python SDK、Go API 和 aisloader 中得到了显著提升，新增基于多进程的并行下载器，并采用共享内存环形缓冲区作为支撑。Python SDK 和 ETL 工具链增加了对多种服务器风格的流式转换支持，并改进了重试处理机制。\n\n其他改进包括：面向媒体的工作者并行度优化、全局再平衡机制的强化、按存储桶配置的 GCP 凭证、`rechunk` 对远程存储桶的支持、多部分上传的整对象校验和，以及多项后端和可靠性修复。\n\n在 4.2 版本中以实验性推出的 
Object HEAD v2 现已稳定。\n\n本次发布自 [4.2](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Freleases\u002Ftag\u002Fv1.4.2) 以来包含了超过 300 个提交，并且完全向后兼容。\n\n---\n\n## 目录\n\n1. [原生存储桶清单（NBI）](#native-bucket-inventory-nbi)\n2. [系统存储桶](#system-buckets)\n3. [IPv6 网络](#ipv6-networking)\n4. [多部分下载（MPD）](#multi-part-download)\n5. [多部分上传：整对象校验和](#multi-part-upload-whole-object-checksum)\n6. [全局再平衡](#global-rebalance)\n7. [多凭证 GCP 后端](#multi-credential-gcp-backend)\n8. [身份验证与密钥管理](#authentication-and-key-management)\n9. [面向媒体的工作者并行度](#media-aware-worker-parallelism)\n10. [Object HEAD v2](#object-head-v2)\n11. [ETL](#etl)\n12. [Rechunk：远程存储桶支持与 `--sync-remote`](#rechunk-remote-bucket-sync)\n13. [Python SDK](#python-sdk)\n14. [CLI](#cli)\n15. [运行时、后端及可靠性修复](#runtime-backend-and-reliability-fixes)\n16. [文档](#documentation)\n17. [构建、CI 及工具](#build-ci-and-tools)\n18. [升级注意事项](#upgrade-notes)\n\n---\n\n\u003Ca name=\"native-bucket-inventory-nbi\">\u003C\u002Fa>\n## 原生存储桶清单（NBI）\n\n[原生存储桶清单](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fblob\u002Fmain\u002Fdocs\u002Fnbi.md) 在 4.3 版本中仍处于实验阶段。它为 AIS 提供了一种内置方式，用于创建和复用**远程存储桶**的清单快照。\n\n目前，NBI 仅适用于远程存储桶：云存储桶（包括 S3、GCS、Azure 和 OCI）以及远程 AIS 存储桶。\n\n其目标非常明确：大幅加快大型远程存储桶的列举速度——尤其是那些包含数百万或数千万对象的存储桶。在如此规模下，直接通过后端进行列举可能需要数分钟甚至更长时间，并且…","2026-03-25T22:39:15",{"id":186,"version":187,"summary_zh":188,"released_at":189},314623,"v1.4.2","AIStore **4.2** 专注于正确性与可靠性、身份验证与可观测性、API 现代化，以及后端和工具链中的各项运维修复。\n\n[resilver](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fblob\u002Fmain\u002Fdocs\u002Fresilver.md) 子系统经过大幅重写，引入了对 [mountpath](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fblob\u002Fmain\u002Fdocs\u002Foverview.md#mountpath) 事件的显式抢占机制，并全面支持分块对象的重新定位。\n\n安全方面的增强包括为 [AuthN](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fblob\u002Fmain\u002Fdocs\u002Fauthn.md) 提供 Prometheus 指标、持久化的 RSA 密钥对，以及兼容 OIDC 的发现端点和 JWKS 端点。\n\n新 API 用显式的条件等待替代了传统的轮询方式来处理批处理作业，并引入了感知分块的 `HEAD(object)` 
接口，从而实现更可扩展的任务监控和高效的大对象访问。\n\n针对 AIS 存储桶的非递归遍历，修正了列出对象的功能；同时，Python SDK 和 CLI 增加了更快、感知分块的下载路径，支持并行下载和进度报告。\n\n其他改进和修复涵盖：云存储桶命名空间、分片上传的重试行为、后端互操作性，以及过早报告全局再平衡已完成的问题。\n\n尤其值得一提的是，4.2 版本新增了对 [命名空间作用域](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fblob\u002Fmain\u002Fdocs\u002Fbucket.md#namespaces) 云存储桶的支持（例如 `s3:\u002F\u002F#prod\u002Fdata`、`s3:\u002F\u002F#dev\u002Fdata` 等）。这使得多租户场景成为可能：不同用户或账户可以通过各自的配置文件和\u002F或终端节点，在不同的云账号中访问同名存储桶。\n\nAIStore 4.2 保持与 [v4.1](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Freleases\u002Ftag\u002Fv1.4.1) 及更早版本的完全向后兼容。总体而言，此版本在存在 [磁盘故障](#fshc) 的情况下提升了系统的可用性，在负载下增强了可观测性和正确性，并对长期使用的 API 进行了现代化改造。\n\n---\n\n**目录**\n\n1. [Resilver](#resilver)\n2. [身份验证与可观测性](#authentication-and-observability)\n3. [新 API](#new-apis)\n   - 3.1 [Xaction v2](#xaction-v2)\n   - 3.2 [Object HEAD v2](#object-head-v2)\n4. [列出对象：非递归遍历](#list-objects-non-recursive-walks)\n5. [分片传输（下载、上传及后端互操作性）](#multipart-transfers)\n6. [全局再平衡](#global-rebalance)\n7. [文件系统健康检查器 (FSHC)](#fshc)\n8. [ETag 和 Last-Modified 标准化](#etag-and-last-modified-normalization)\n9. [Python SDK](#python-sdk)\n10. [CLI](#cli)\n11. [文档](#documentation)\n12. [构建与 CI](#build-and-ci)\n13. 
[各子系统中的杂项修复](#miscellaneous-fixes)\n\n---\n\n\u003Ca name=\"resilver\">\u003C\u002Fa>\n## Resilver\n\n[Resilver](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fblob\u002Fmain\u002Fdocs\u002Fresilver.md) 是 AIStore 中节点本地版的 [全局再平衡](#global-rebalance)：它会在卷组发生变化（如挂载…","2026-01-15T17:01:56",{"id":191,"version":192,"summary_zh":193,"released_at":194},314624,"v1.4.1","AIStore **4.1** 在检索、安全性和集群运维方面带来了多项升级。针对机器学习训练工作负载，GetBatch API 得到了显著扩展，新增客户端侧流式传输功能，并在资源短缺情况下提升了系统的容错能力和错误处理机制。认证机制经过重新设计，支持 OIDC、结构化 JWT 验证以及基于集群密钥的 HMAC 签名用于 HTTP 重定向。此外，本次发布还新增了 rechunk 作业，用于在单体布局与分块布局之间转换数据集；统一了所有云后端的分片上传行为；并增强了 Blob 下载器的负载感知限速功能。Python SDK 升级至 v1.18，重构了 Batch API 并优化了超时配置。整体配置验证能力也得到加强，包括从 v4.0 认证设置的自动迁移。\n\n本版本自 [v4.0](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Freleases\u002Ftag\u002Fv1.4.0) 以来累计提交超过 200 条，同时保持向后兼容性，支持滚动升级。\n\n**目录**\n\n1. [GetBatch：分布式多对象检索](#getbatch-distributed-multi-object-retrieval)\n2. [认证与安全](#authentication-and-security)\n3. [分块对象](#chunked-objects)\n4. [Blob 下载器](#blob-downloader)\n5. [Rechunk 作业](#rechunk-job)\n6. [统一的负载与限速机制](#unified-load-and-throttling)\n7. [传输层](#transport-layer)\n8. [分片上传](#multipart-upload)\n9. [Python SDK](#python-sdk)\n10. [S3 兼容性](#s3-compatibility)\n11. [构建系统与工具链](#build-system-and-tooling)\n12. [Xaction 生命周期](#xaction-lifecycle)\n13. [ETL 与转换管道](#etl-and-transform-pipeline)\n14. [可观测性](#observability)\n15. [配置变更](#configuration-changes)\n16. 
[工具：`aisloader`](#tools-aisloader)\n\n---\n\n\u003Ca name=\"getbatch-distributed-multi-object-retrieval\">\u003C\u002Fa>\n## GetBatch：分布式多对象检索\n\nGetBatch 工作流现已在集群范围内实现稳健运行。该流程以流式传输为核心，支持跨多个存储桶的批量操作，并提供可调的软错误处理机制。请求路径引入了基于负载的限速策略，当系统压力过大时，可能会返回 HTTP 429（“请求过多”）响应。系统会综合考虑内存和磁盘压力，并对连接重置进行透明处理。\n\n可通过新的 “get_batch” 配置节进行相关设置：\n\n```json\n{\n  \"max_wait\": \"30s\",           \u002F\u002F 等待远程目标的时间（范围：1 秒至 1 分钟）\n  \"warmup_workers\": 2,         \u002F\u002F 页面缓存预读线程数（-1=禁用，0-10）\n  \"max_soft_errs\": 6           \u002F\u002F 每次请求允许的最大可恢复错误数\n}\n```\n\n通过整合计数器、Prometheus 指标以及更清晰的状态报告，可观测性得到了进一步提升。客户端及工具方面的更新包括 Python SDK 中新增的 `Batch` API、aisloader 的扩展支持，以及用于分布式基准测试的 Ansible Composer 剧本。\n\n> 参考文档：[https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fblob\u002Fmain\u002Fdocs\u002Fget_b","2025-12-05T23:10:01",{"id":196,"version":197,"summary_zh":198,"released_at":199},314625,"v1.4.0","AIStore **4.0** 是一次重大发布，引入了 v2 版本的对象元数据格式以及分块对象表示方式（即以多个数据块存储和管理对象）。此外，该版本还新增了原生的分片上传功能，以及一个新的 [GetBatch API](#getbatch-api-ml-endpoint)，用于高吞吐量地批量检索对象和\u002F或归档文件。\n\n集群内 ETL 功能现已扩展为 _ETL 管道_，允许用户在无需中间存储桶的情况下串联多个转换步骤。在可观测性方面，4.0 版本将监控后端统一为 Prometheus，并新增了磁盘级别容量告警。\n\n所有子系统、扩展模块均已更新，以支持新功能。CLI 命令行工具新增了集群仪表板、改进了特性开关管理，并进行了多项易用性优化。配置更新包括新增 `chunks` 部分，以及更多可调参数，以支持包含数亿个对象的大型集群，并优化运行时限流机制。\n\n本次发布自上一版本 [3.31](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Freleases\u002Ftag\u002Fv1.3.31) 以来，累计提交近 300 条。同时，新版本保持与旧版本的兼容性，支持滚动升级。\n\n## 目录\n\n- [对象元数据版本 2](#object-metadata-version-2)\n- [分块对象](#chunked-objects)\n- [原生 API：分片上传（统一实现）](#native-api-multipart-upload-unified-implementation)\n- [GetBatch API（机器学习端点）](#getbatch-api-ml-endpoint)\n- [ETL](#etl)\n- [可观测性与指标](#observability-and-metrics)\n- [Python SDK 1.16](#python-sdk-116)\n- [CLI](#cli)\n- [配置变更](#configuration-changes)\n- [性能优化](#performance-optimizations)\n- [构建、CI\u002FCD 和测试](#build-cicd-and-testing)\n- [文档](#documentation)\n\n---\n\n\u003Ca name=\"object-metadata-version-2\">\u003C\u002Fa>\n## 对象元数据版本 2\n\n自 AIStore 于 
2018 年诞生以来，这是首次对磁盘上的对象元数据格式进行升级。\n\n目前，v2 元数据已成为所有新写入操作的默认格式，而 v1 仍完全兼容，以确保向后兼容性。系统能够无缝读取这两种格式。\n\n### 新增内容\n\n此次升级引入了持久化的存储桶标识符和持久化标志位。每个对象现在都会在其元数据中存储所属存储桶的 ID（BID）。加载时，AIStore 会将存储的 BID 与当前存储桶元数据（BMD）中的 BID 进行比对。若两者不一致，则该对象会被标记为失效并从缓存中移除，从而强制实现每个对象与其写入的确切存储桶版本之间的强引用完整性。\n\n旧版元数据格式在扩展新功能方面存在局限性。而 v2 版本则专门预留了 8 字节字段，用于未来可能添加的功能（如存储类别、压缩、加密、回写等）。当元数据损坏或无法解析时，系统会返回明确的类型化错误信息（例如 `ErrLmetaCorrupted`），从而提升故障排查和调试能力。\n\n原有的标志位（如“文件名过长”、“LOM 为分块对象”）仍保留其原有的磁盘位置。","2025-10-07T20:20:25",{"id":201,"version":202,"summary_zh":203,"released_at":204},314626,"v1.3.31","## 更改记录\n\n### 核心功能\n\n- [63367e4](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fcommit\u002F63367e48ea890e9fd0f9e1b890796a4802e5b06f): 不再将流量反向代理到自身\n- [aeac54b](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fcommit\u002Faeac54bbadd9f898d2bcbbf9a79359f13e4c6bc8): 反向流量现使用集群内控制网络（不包括 S3）\n- [803fc4d](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fcommit\u002F803fc4d26041288a24e8a5d069c1ec5a6d543198): 移除旧版 CONNECT 隧道\n\n### 全局再平衡\n\n- [1b310f1](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fcommit\u002F1b310f19051280465187db277b845c6c141b9990): 添加操作范围内的 `--latest` 和 `--sync`\n- [62793b1](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fcommit\u002F62793b17966a5d368b76747d5e98a2c0a36865fc): 修复空桶的有限范围问题\n- [cf31b87](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fcommit\u002Fcf31b87518493ef0af2297ea5b79f09316b6d8b7): 引入三重版本冲突解决机制：本地\u002F发送方\u002F云\n\n### S3 分片上传\n\n- [d830bbf](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fcommit\u002Fd830bbff330cb66b7f56c9db1966c9606ee42f50): 添加并检查 `NoSuchUpload` 错误\n- [7e13f86](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fcommit\u002F7e13f862bc80468dc66f6145b18f44a8867f4fd9): 修正分片上传的错误处理\n\n### CLI\n\n- 
[45d625c](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fcommit\u002F45d625c9880a9142c3791ce72cca83819323e974)、[59c9bd8](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fcommit\u002F59c9bd809874af2e79946880c5b0473a3f54d737): 新增 `ais show dashboard` 命令：集群概览仪表板——节点数量、容量、性能、健康状况及版本信息\n- [19a83de](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fcommit\u002F19a83de216ffbd2509a274f2e037f5da1f5b2b91): 修复当远程集群使用不同 HTTP(s) 协议时的 `ais show remote-cluster` 命令\n\n> 参阅：[`ais cluster` 命令](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fblob\u002Fmain\u002Fdocs\u002Fcli\u002Fcluster.md)\n\n### ETL\n\n- [b98e109](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fcommit\u002Fb98e10900c9f576c3ba2fe965a9f72d9d9b91b42): 在单对象转换流程中添加对 `ETLArgs` 的支持\n\n> 参阅：[单对象复制\u002F转换能力](https:\u002F\u002Faistore.nvidia.com\u002Fblog\u002F2025\u002F07\u002F25\u002Fsingle-object-copy-transformation-capability)\n\n### 部署与监控\n\n- [85ad0a4](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fcommit\u002F85ad0a4a18c4c36f9861c4a9ccf842dd0545e7a5): 降低推荐的文件描述符限制；更新文档\n- [59ec6b9](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fcommit\u002F59ec6b9d26006db520251fecdc389532da149c55): 检查并定期记录文件描述符表的使用情况\n\n> 参阅：[最大打开文件数](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fblob\u002Fmain\u002Fdocs\u002Fperformance.md#maximum-number-of-open-files)\n\n### 重构与代码检查\n\n- [bea853c](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fcommit\u002Fbea853c46cc693dfde12ffa2931e8d347b8169b0): 升级 golangci-lint 版本\n- [1b67381](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fcommit\u002F1b673811884ed99cfdbca1a8bfc4f682de7eb64a): 修复 `noctx` 代码检查工具的错误\n- [3373ae3](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fcommit\u002F3373ae37f17c7bf2bb94bb2257ef8f201d80b813): 重构 `ErrBucketNotFound`\n\n### 文档","2025-07-25T16:39:16",{"id":206,"version":207,"summary_zh":208,"released_at":209},314627,"v1.3.30","本次 AIStore 发布版本为 
**3.30**，距离上一版本已过去两个月，期间累计超过 300 次提交。与以往版本一样，3.30 保持向后兼容，并支持滚动升级。\n\n此版本新增了对 [批处理工作流](#batch-workflows) 的支持。其核心理念是在单个序列化流式（或分块）响应中提供成百上千个对象（或归档文件）。\n\nAIStore 3.30 在多个子系统中实现了性能提升，尤其在 I\u002FO 效率、连接管理和 [ETL](#etl) 操作方面。更新并重构后的 ETL 子系统现支持 ETL 容器直接访问文件系统，消除了 WebSocket 通信器中 io.Pipe 的瓶颈，并使容器能够直接执行 PUT 操作。此外，它还简化了配置方式，用精简的运行时规范替代了完整的 Kubernetes Pod YAML 文件。\n\n[Python SDK 1.15](#python-sdk) 引入了针对大型归档文件的高性能流式解压批处理功能，以及强大的全新 ETL 能力。该版本为破坏性更新，移除了已弃用的 `init_code` ETL API，同时通过改进的重试逻辑提升了系统的容错能力。\n\n在可观测性方面，Prometheus 现可导出磁盘写入延迟和待处理 I\u002FO 深度指标，并在触发磁盘告警时自动刷新容量信息。尽管 StatsD 导出器仍可用，但现已默认禁用，因为我们正逐步将 Prometheus 和 OpenTelemetry 打造成一流的监控解决方案。\n\n工具方面，CLI 新增了一个 [`ml` 命名空间](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fblob\u002Fmain\u002Fdocs\u002Fcli\u002Fml.md)，其中包含用于机器学习流水线的 [Lhotse](https:\u002F\u002Fgithub.com\u002Flhotse-speech\u002Flhotse) CutSet 工具集。此次 CLI 升级还实现了与 Hugging Face (https:\u002F\u002Fhuggingface.co\u002Fapi\u002Fdatasets) 数据集仓库的集成（包括批量下载），并进行了多项易用性改进。\n\n云后端方面的增强包括对 Oracle Cloud Infrastructure 分块上传的支持，使得 S3 客户端（如 boto3、s3cmd、AWS CLI 等）无需修改代码即可在 OCI 后端执行分块上传；此外还优化了 AWS 配置管理，并修复了相关问题。\n\n新增的 `ais object cp` 和 `ais object etl` 命令（及其对应的 API）提供了同步的复制和转换操作，而无需启动异步的多对象事务（即批处理作业）。\n\n[文档](#documentation) 更新包括对 ETL CLI 文档的全面重写、新增的连接管理和机器学习工作流操作指南、强化的 Python SDK 文档，以及改进的 AWS 后端配置说明。\n\n基础设施方面的改进则涵盖了 GitHub 发布版中 macOS\u002Farm64 平台 CLI 二进制文件的自动构建，以及对所有开源依赖项（除 Kubernetes 客户端库外）的升级，从而为整个代码库带来了安全补丁和性能提升。\n\n## 目录\n\n1.  [批处理工作流](#batch-workflows)\n2.  [ETL](#etl)\n3.  [性能与可扩展性](#performance)\n4.  
[","2025-07-21T21:43:19",{"id":211,"version":212,"summary_zh":213,"released_at":214},314628,"v1.3.29","## 更改日志\n\n### 核心功能\n- [d41ca5d](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fcommit\u002Fd41ca5d)：禁用替代的冷获取逻辑；暂不移除，但仍标记为已弃用\n- [aa7c05e](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fcommit\u002Faa7c05e)：缩短冷获取冲突超时时间；修复抖动问题（拼写错误）\n- [2ec8b22](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fcommit\u002F2ec8b22)：在“高协程数”情况下校准‘num-workers’参数（修复）\n- [0292c1c](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fcommit\u002F0292c1c)：多对象复制\u002F转换：当目标运行不同 UUID 时（修复）\n\n### CLI 工具\n- [3bfe3d6](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fcommit\u002F3bfe3d6)：新增 `ais advanced check-lock` 命令（与 `CheckObjectLock` API 相关）\n- [2e4e6c1](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fcommit\u002F2e4e6c1)：修复 ETL TCO（多对象复制\u002F转换）作业的进度监控提示\n- [4ef854b](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fcommit\u002F4ef854b)：在 `etl init` CLI 命令中添加 `object-timeout` 参数\n\n### API 和配置变更\n- [b3bc65d](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fcommit\u002Fb3bc65d)：新增 `CheckObjectLock` API（高级用法）\n- [2507229](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fcommit\u002F2507229)：添加可配置的冷获取冲突超时时间——新配置项 `timeout.cold_get_conflict`（默认 5 秒）\n- [312af7b](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fcommit\u002F312af7b)：引入新的 ETL 初始化规范 YAML 格式\n- [25513e7](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fcommit\u002F25513e7)：为单个对象转换请求添加超时选项\n- [82fcb58](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fcommit\u002F82fcb58)：在 WebSocket 处理程序参数中包含来自控制消息的对象路径\n\n### Python SDK\n- [e25c8b9](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fcommit\u002Fe25c8b9)：为 `ETLServer` 子类添加序列化工具\n- [1016bda](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fcommit\u002F1016bda)：为 `ObjectFileReader` 中的“流式冷获取”限制添加 workaround\n- 
[4e0f356](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fcommit\u002F4e0f356)：改进重试行为和日志记录；将 Python 版本升级至 1.13.8\n- [8503952](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fcommit\u002F8503952)：修复 `ObjectFile` 错误处理\n- [197f866](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fcommit\u002F197f866)：对 `ObjectFileReader` 进行小幅后续修复\n\n### 文档\n- [d3bf4f0](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fcommit\u002Fd3bf4f0)：新增可观测性指南（第四部分）\n- [0d03b71](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fcommit\u002F0d03b71)：统一 S3 兼容性文档并进行整合\n- [4d6c7c9](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fcommit\u002F4d6c7c9)：更新交叉引用\n- [e45a4f3](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fcommit\u002Fe45a4f3)：改进入门指南（第三部分）\n- [8b2a1da](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fcommit\u002F8b2a1da)：概述\u002FCLI 文档；移除旧演示；修复传输包自述文件\n- [804abda](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fcommit\u002F804abda)：修复 ETL 主文档中的外部链接\n- [8a4eea5](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fcommit\u002F8a4eea5)：修正少量拼写错误\n\n### 官网和博客\n- [011ff05](https:\u002F\u002Fgith","2025-05-23T13:37:53",{"id":216,"version":217,"summary_zh":218,"released_at":219},314629,"v1.3.28","最新的 AIStore 发布版本 **3.28** 距离上一个版本已近 [三个月](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fcompare\u002Fv1.3.27...v1.3.28)。与往常一样，v3.28 保持与先前版本的兼容性，我们完全预期它能够从早期版本无缝升级。\n\n本次发布显著提升了 [ETL 卸载](https:\u002F\u002Faistore.nvidia.com\u002Fblog\u002F2025\u002F05\u002F09\u002Fetl-optimized-data-movement-and-server-framework)，新增了 [WebSocket 通信器](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fblob\u002Fmain\u002Fdocs\u002Fetl.md#communication-mechanisms) 和 Kubernetes 集群中 ETL Pod 与 AIS 目标之间的 [优化数据流](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fblob\u002Fmain\u002Fdocs\u002Fetl.md#direct-put-optimization)。\n\n对于 Python 用户，我们添加了 
[健壮的重试逻辑](https:\u002F\u002Faistore.nvidia.com\u002Fblog\u002F2025\u002F04\u002F02\u002Fpython-retry)，可在生命周期事件期间维持无缝连接——这一能力在运行持续数小时的训练工作负载时尤为重要。此外，我们还改进了 `JobStats` 和 `JobSnapshot` 模型，新增了 `MultiJobSnapshot`，扩展并修复了 URL 编码，并为 Object 类添加了 `props` 访问方法。\n\nPython SDK 的 ETL 功能也得到了扩展，推出了一套全新的 [ETL 服务器框架](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fblob\u002Fmain\u002Fpython\u002Faistore\u002Fsdk\u002Fetl\u002Fwebserver\u002FREADME.md#quickstart)，提供了三种基于 Python 的 Web 服务器实现：`FastAPI`、`Flask` 和 `HTTPMultiThreadedServer`。\n\n与此同时，v3.28 还引入了 [双层限速](https:\u002F\u002Faistore.nvidia.com\u002Fblog\u002F2025\u002F03\u002F19\u002Frate-limit-blog) 功能，可配置地支持前端（面向客户端）和后端（面向云、自适应）两种模式。\n\n在 CLI 方面，以下列出了多项易用性改进。用户现在可以更高效地执行列出对象的操作 (`ais ls`)；内联帮助信息和 [CLI 文档](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Ftree\u002Fmain\u002Fdocs\u002Fcli) 已经修订并优化。`ais show job` 命令现可显示分布式作业的集群范围对象总数和字节数总量。\n\n可观测性的增强同样详述如下，包括用于跟踪限速操作的新指标以及扩展的作业统计信息。大多数受支持的作业现在都会报告 _j-w-f_ 指标：即 [mountpath](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fblob\u002Fmain\u002Fdocs\u002Foverview.md#mountpath) 运行者数量、用户指定的工作线程数量，以及工作通道满队列次数。\n\n其他改进还包括新的（且更快的）内容校验和、Go API 中的快速 URL 解析、针对多对象操作和 ETL 的优化缓冲区分配，以及对对象名称中 Unicode 和特殊字符的支持。我们对众多组件进行了重构和微调，并修订了多份文档，其中包括 [主 README](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fblob\u002Fmain\u002FREADME.md) 和 [概览文档](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fblob\u002Fmain\u002Fdocs\u002Foverview.md)。\n\n最后但同样重要的是，为了提升网络并行性，我们现在支持多个长期存在的点对点连接。","2025-05-10T18:05:11",{"id":221,"version":222,"summary_zh":223,"released_at":224},314630,"v1.3.27","## 更改日志\n\n### 列出对象\n* “跳过查找后进行远程列表（修复）” [7762faf2c](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fcommit\u002F7762faf2c)\n  - 重构并修复相关代码片段\n  - 主循环：在获取下一页后也检查“已中止”状态\n\n### CLI\n\n* “显示所有支持的功能标志及其描述（易用性）” [9f222a215](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fcommit\u002F9f222a215)\n  - 集群范围和桶范围\n  - 设置和显示操作\n  - 对当前（已设置）功能进行颜色标记\n  - 更新自述文件\n* 
“彩色帮助（所有变体）” [e95f3ac7f628](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fcommit\u002Fe95f3ac7f628)\n  - 命令、子命令以及应用程序本身（`ais --help`）\n  - 大量彩色模板及配置\n  - 另外，`more` 分页：将 memsys 替换为简单缓冲区\n  - 同时进行了重构和清理\n\n### Python & ETL\n\n* “修复 ETL 测试” [b639c0d68](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fcommit\u002Fb639c0d68)\n* “功能：向 Client 和 SessionManager 添加 ‘max_pool_size’ 参数” [8630b853b](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fcommit\u002F8630b853b)\n* “[Go API 变更] 扩展 `api.ETLObject` — 添加转换参数” [4b434184a](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fcommit\u002F4b434184a)\n  - 添加 `etl_args` 参数\n  - 添加 `TestETLInlineObjWithMetadata` 集成测试\n* “添加 ETL 转换参数 QParam 支持” [8d9f2d11ae7d](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fcommit\u002F8d9f2d11ae7d)\n  - 引入 `ETLConfig` 数据类，用于封装 ETL 相关参数。\n  - 更新 `get_reader` 和 `get` 方法以支持 `ETLConfig`，确保对 ETL 元数据的一致处理。\n  - 在 Python SDK 中添加与 ETL 相关的查询参数 (`QPARAM_ETL_ARGS`)。\n  - 重构 `get_reader` 和 `get`，采用新的 ETL 配置方式。\n\n### 构建与 Lint\n\n* “升级所有 OSS 包” [1b65a37a6](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fcommit\u002F1b65a37a6)\n* “各种 lint 检查；对齐字段” [a5f7cfea1](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fcommit\u002Fa5f7cfea1)\n  - 构建：添加 trimpath\n  - 生产模式：`go build -trimpath`\n    - 包括 aisnode 和 cli 在内的所有可执行文件\n* “降级所有 aws-sdk-go-v2 包” [462d7f45fb78](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fcommit\u002F462d7f45fb78)","2025-02-15T22:34:59",{"id":226,"version":227,"summary_zh":228,"released_at":229},314631,"v1.3.26","**Version 3.26** arrives 4 months after the previous release and contains more than [400 commits](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fcompare\u002Fv1.3.25...v1.3.26).\r\n\r\nThe core changes in v3.26 address the last remaining [limitations](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fblob\u002Fmain\u002Fdocs\u002Foverview.md#no-limitations-principle). 
A new `scrub` capability has been added, supporting bidirectional diffing to detect remote out-of-band deletions and version changes. The cluster can now also reload updated user credentials at runtime without requiring downtime.\r\n\r\nEnhancements to observability are detailed below, and performance improvements include memory pooling for HTTP requests, global rebalance optimizations, and micro-optimizations across the codebase. Key fixes include better error-handling logic (with a new category for IO errors and improvements to the filesystem health checker) and enhanced object metadata caching.\r\n\r\nThe release also introduces the ability to [resolve split-brain](https:\u002F\u002Faistore.nvidia.com\u002Fblog\u002F2025\u002F02\u002F16\u002Fsplit-brain-blog) scenarios by merging splintered clusters. When and if a network partition occurs and two islands of nodes independently elect primaries, the \"set primary with force\" feature enables the administrative action of joining one cluster to another, effectively restoring the original node count. This functionality provides greater control for handling extreme and unlikely events that involve network partitioning.\r\n\r\nOn the CLI side, users can now view not only the fact that a specific UUID-ed instance of operations like `prefetch`, `copy`, `etl`, or `rebalance` is running, but also the exact command line that was used to launch the batch operation. 
This makes it easier to track and understand batch job context.\r\n\r\nFor the detailed changelog, please see [link](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fblob\u002Fmain\u002Fdocs\u002Fchangelog\u002Fv3.26.md).\r\n\r\n#### Table of Contents\r\n- [CLI](#cli)\r\n- [Observability](#observability)\r\n- [Python SDK](#python-sdk)\r\n- [Erasure Coding](#erasure-coding)\r\n- [Oracle (OCI) Object Storage](#oracle-oci-object-storage)\r\n- [Kubernetes Operator](#kubernetes-operator)\r\n- [ETL](#etl)\r\n\r\n---\r\n\r\n\u003Ca name=\"cli\">\u003C\u002Fa>\r\n## CLI\r\n\r\nThe CLI in v3.26 features revamped inline help, reorganized command-line options with clearer descriptions, and added usage examples. Fixes include support for multi-object PUT with client-side checksumming and universal prefix support for all multi-object commands. \r\n\r\nA notable new feature is the [`ais scrub`](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fblob\u002Fmain\u002Fdocs\u002Fcli\u002Fstorage.md#validate-in-cluster-content-for-misplaced-objects-and-missing-copies) command for validating in-cluster content. Additionally, the [`ais performance`](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fblob\u002Fmain\u002Fdocs\u002Fcli\u002Fperformance.md) command has received several updates, including improved calculation of cluster-wide throughput. Top-level commands and their options have been reorganized for better clarity.\r\n\r\nThe [`ais scrub`](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fblob\u002Fmain\u002Fdocs\u002Fcli\u002Fstorage.md#validate-in-cluster-content-for-misplaced-objects-and-missing-copies) command in v3.26 focuses on detection rather than correction. 
It detects:\r\n* Misplaced objects (cluster-wide or within a specific multi-disk target)\r\n* Objects missing from the remote backend, and vice versa\r\n* In-cluster objects that no longer exist remotely\r\n* Objects with insufficient replicas\r\n* Objects larger or smaller than a specified size\r\n\r\nThe command generates both summary statistics and detailed reports for each identified issue. However, it does not attempt to fix misplaced or corrupted objects (those with invalid checksums). The ability to correct such issues is planned for v3.27.\r\n\r\nFor more details, see the full changelog [here](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fblob\u002Fmain\u002Fdocs\u002Fchangelog\u002Fv3.26.md#cli).\r\n\r\n---\r\n\r\n\u003Ca name=\"observability\">\u003C\u002Fa>\r\n## Observability\r\n\r\nVersion 3.26 includes several important updates. Prometheus metrics are now updated in real-time, eliminating the previous periodic updates via the `prometheus.Collect` interface.\r\n\r\nLatencies and throughputs are no longer published as internally computed metrics; instead, `.ns.total` (nanoseconds) and `.size` (bytes) metrics are used to compute latency and throughput based on time intervals controlled by the monitoring client.\r\n\r\nDefault Prometheus `go_*` counters and gauges, including metrics for tracking goroutines and garbage collection, have been removed. 
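The counter-ratio scheme above can be sketched briefly. Below is a minimal Python sketch, assuming two successive scrapes of cumulative counters taken by the monitoring client; the dictionary keys (`count`, `ns_total`, `size`) are illustrative placeholders standing in for the `.ns.total` and `.size` metrics, not actual AIS metric names:

```python
def derive(prev: dict, curr: dict, interval_s: float) -> tuple[float, float]:
    """Compute average latency (ms) and throughput (MB/s) from two
    successive scrapes of cumulative counters.

    Illustrative keys (not actual AIS metric names):
      'count'    - number of operations (e.g., GETs)
      'ns_total' - total cumulative operation time, nanoseconds
      'size'     - total cumulative bytes transferred
    """
    d_count = curr["count"] - prev["count"]
    d_ns = curr["ns_total"] - prev["ns_total"]
    d_bytes = curr["size"] - prev["size"]
    # average per-op latency over the interval, in milliseconds
    avg_latency_ms = (d_ns / d_count) / 1e6 if d_count else 0.0
    # throughput over the interval, in MB/s
    throughput_mbs = d_bytes / interval_s / 1e6
    return avg_latency_ms, throughput_mbs

# two scrapes taken 10 seconds apart:
prev = {"count": 100, "ns_total": 5_000_000_000, "size": 1_000_000_000}
curr = {"count": 150, "ns_total": 7_500_000_000, "size": 1_600_000_000}
lat_ms, tput_mbs = derive(prev, curr, interval_s=10.0)
# 50 additional ops took 2.5 s in aggregate -> 50 ms average latency;
# 600 MB moved in 10 s -> 60 MB/s
```

In Prometheus terms this is simply the ratio of two `rate()` expressions over the same window, with the interval chosen by the monitoring client rather than by the cluster.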
\r\n\r\nIn addition to the total aggregated metrics, separate latency and throughput metrics are now included for each [backend](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fblob\u002Fmain\u002Fdocs\u002Foverview.md#at-a-glance).\r\n\r\nMetrics resulting from actions on a specific bucket now include the bucket name as a Prometheus variable label.\r\n\r\nIn-cluster writes generated by xactions (jobs) now also carry xaction labels (the respective kind and ID), which results in additional PUT metrics beyond those generated by user PUT requests.\r\n\r\nF","2025-02-08T01:37:03",{"id":231,"version":232,"summary_zh":233,"released_at":234},314632,"v1.3.25","## Changelog\r\n\r\n* \"S3 compatibility API: add missing access control\" c046cb8f1\r\n\r\n* \"core: async shutdown\u002Fdecommission; _primary_ to reject node-level requests\" 2e17aaf75\r\n| * _primary_ will now fail node-level decommission and similar lifecycle and cluster membership (changing) requests\r\n| * keeping shutdown-cluster exception when `forced` (in re: local playground)\r\n| * when shutting down or decommissioning an entire cluster _primary_ will now perform the final step asynchronously\r\n| * (so that the API caller receives ok)\r\n\r\n* \"python\u002Fsdk: improve error handling and logging for `ObjectFile`\" b61b3dbf5\r\n\r\n* \"core: cold-GET vs upgrading _rlock_ to _wlock_\" 9857e789c\r\n| * remove all `sync.Cond` related state and logic\r\n| * reduce low-level `lock-info` to just rc and wlock\r\n| * poll for up to `host-busy` timeout\r\n| * return `err-busy` if unsuccessful\r\n\r\n* \"CLI `show cluster` to sort rows by POD names with _primary_ on top\" e46968408\r\n\r\n* \"health check to be forwarded to _primary_ when invoked with 'primary-ready-to-rebalance' query param\" a59f92177\r\n| * (previously, non-primary would fail the request)\r\n\r\n* \"python: avoid module level import of webds; remove 'webds' dependency 228f23f54\r\n| * refactor dataset_config.py: 
avoid module-level import of ShardWriter\r\n| * update pyproject.toml: add webdataset==0.2.86 as an optional dependency\"\r\n\r\n* \"aisloader: '--subdir' vs prefix (clarify)\" 7e7e8e49c\r\n\r\n* \"CLI: directory walk: do not call `lstat` on every entry (optimize)\" 4a22b8869\r\n| * skip errors _iff_ \"continue-on-error\"\r\n| * add verbose mode to see all warnings - especially when invoked with the \"continue-on-error\" option\r\n| * otherwise, stop walking and return the error in question\r\n| * with partial rewrite\r\n\r\n* \"docs: add tips for copying files from Lustre; `ais put` vs `ais promote`\" 3cb20f69d\r\n\r\n* \"CLI: `--num-workers` option ('ais put', 'ais archive', and more)\" d5e6fbc26\r\n| * add; amend\r\n| * an option to execute serially (consistent with aistore)\r\n| * limit not to exceed (2 * num-CPUs)\r\n| * remove `--conc` flag (obsolete)\r\n| * fix inline help\r\n\r\n* \"CLI: PUT and archive files from multiple matching directories\" 16edff70a\r\n| * `GLOB`alize\r\n| * PUT: add back `--include-src-dir` option\r\n\r\n* \"trim prefix: list-objects; bucket-summary; multi-obj operations\" 7cf15465b\r\n| * `rtrim(prefix, '*')` to satisfy one common expectation\r\n| * proxy only (leaving CLI intact)\r\n\r\n* \"unify 'validate-prefix' & 'validate-objname'; count list-objects errors\" 57892739d\r\n| * add `ErrInvalidPrefix` (type-code)\r\n| * refactor and micro-optimize `validate-*` helpers; unify\r\n| * move object name validation to proxies; proxies to (also) count `err.list.n`\r\n| * refactor `ver-changed` and `obj-move`","2024-10-07T13:52:37",{"id":236,"version":237,"summary_zh":238,"released_at":239},314633,"v1.3.24","Version 3.24 arrives nearly 4 months after the previous one and contains more than [400 commits](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fblob\u002Fmain\u002Fdocs\u002Fchangelog\u002Fv3.24.md) that fall into several main categories, topics, and sub-topics:\r\n\r\n## 1. 
Core\r\n\r\n#### 1.1 [Observability](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fblob\u002Fmain\u002Fdocs\u002Fchangelog\u002Fv3.24.md#core-observability)\r\n\r\nWe improved and optimized stats-reporting logic and introduced multiple [new metrics](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fblob\u002Fmain\u002Fdocs\u002Fmetrics-reference.md) and [new management alerts](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fblob\u002Fmain\u002Fcmn\u002Fcos\u002Fnode_state.go#L19-L38).\r\n\r\nThere's now an easy way to **observe** per-backend performance and errors, if any. Instead of (or rather, in addition to) a single _combined_ counter or latency, the system separately tracks requests that utilize AWS, GCP, and\u002For Azure backends.\r\n\r\nFor latencies, we additionally added cumulative \"total-time\" metrics:\r\n* \"GET: total cumulative time (nanoseconds)\"\r\n* \"PUT: total cumulative time (nanoseconds)\"\r\n* and [more](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fblob\u002Fmain\u002Fstats\u002Fcommon.go)\r\n\r\nTogether with respective counters, those _total-times_ can be used to compute precise latencies and throughputs over arbitrary time intervals - either on a per-backend basis or averaged across all remote backends, if any.\r\n\r\nNew management alerts include `keep-alive`, `tls-cert-will-soon-expire` (see next section), `low-memory`, `low-capacity`, and more.\r\n\r\nBuild-wise, `aisnode` with StatsD will now require the corresponding build tag.\r\nPrometheus is effectively the default; for details, see:\r\n\r\n* [build tags and examples](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fblob\u002Fmain\u002Fdocs\u002Fenvironment-vars.md#package-stats)\r\n\r\n#### 1.2 [HTTPS; TLS](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fblob\u002Fmain\u002Fdocs\u002Fchangelog\u002Fv3.24.md#core-https-tls)\r\n\r\nHTTPS deployment implies (and requires) that each AIS node 
(`aisnode`) has a valid TLS ([X.509](https:\u002F\u002Fwww.ssl.com\u002Ffaqs\u002Fwhat-is-an-x-509-certificate\u002F)) certificate.\r\n\r\nEvery TLS certificate eventually expires; the standard-defined maximum validity period is 13 months (roughly 397 days).\r\n\r\nAIS v3.24 automatically reloads updated certificates, tracks expiration times, and reports any inconsistencies between certificates in a cluster:\r\n\r\n* [Loading, reloading, and generating certificates; switching cluster between HTTP and HTTPS](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fblob\u002Fmain\u002Fdocs\u002Fhttps.md)\r\n\r\nAssociated Grafana and CLI-visible management alerts:\r\n\r\n   | alert | comment |\r\n   | -- | -- |\r\n   | `tls-cert-will-soon-expire` | Warning: less than 3 days remain until the current X.509 cert expires |\r\n   | `tls-cert-expired` | Critical (red) alert (as the name implies) |\r\n   | `tls-cert-invalid` | ditto |\r\n\r\nFinally, there's a brand-new management API and [`ais tls`](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fblob\u002Fmain\u002Fdocs\u002Fcli\u002Fx509.md) CLI.\r\n\r\n#### 1.3 [Filesystem Health Checker (FSHC)](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fblob\u002Fmain\u002Fdocs\u002Fchangelog\u002Fv3.24.md#core-filesystem-health-checker-fshc)\r\n\r\nThe [FSHC](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fblob\u002Fmain\u002Fais\u002Ftgtfshc.go) component detects disk faults, raises associated alerts, and disables degraded mountpaths.\r\n\r\nAIS v3.24 comes with a major FSHC update (version 2), with new capabilities that include:\r\n\r\n* detect _mountpath changed at runtime_;\r\n* differentiate in-cluster IO errors from network and remote backend errors;\r\n* support associated configuration (section \"API changes; Config changes\" below);\r\n* resolve (mountpath, filesystem) to disk(s), and handle:\r\n  - no-disks exception;\r\n  - disk loss, disk 
fault;\r\n  - new disk attachments.\r\n\r\n#### 1.4 [Keep-Alive; Primary Election](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fblob\u002Fmain\u002Fdocs\u002Fchangelog\u002Fv3.24.md#core-keep-alive-primary-election)\r\n\r\nIn-cluster keep-alive mechanism (a.k.a. heartbeat) was generally micro-optimized and improved. In particular, when and if failing to ping [primary](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fblob\u002Fmain\u002Fdocs\u002Foverview.md#terminology) via [intra-cluster control](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fblob\u002Fmain\u002Fdocs\u002Foverview.md#networking), an AIS node will now utilize its [public network](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fblob\u002Fmain\u002Fdocs\u002Foverview.md#networking), if available.\r\n\r\nAnd vice versa.\r\n\r\n> As an aside, AIS does not require provisioning **3** different networks at deployment time. This has always been and remains a recommended option. But our experience running Kubernetes clusters in production environments proves that it is, well, highly recommended.\r\n\r\n#### 1.5 [Rebalance; Erasure Coding: Intra-Cluster streams](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fblob\u002Fmain\u002Fdocs\u002Fchangelog\u002Fv3.24.md#core-rebalance-erasure-coding-intra-cluster-streams)\r\n\r\nNeedless to say, erasure coding produces a lot of in-cluster traffic. For all those erasure-coded slice-sending-receiving transactions, AIS targets establish long-living peer-to-peer connections dubbed [streams](https:\u002F\u002Fgithu","2024-09-27T19:36:04",{"id":241,"version":242,"summary_zh":243,"released_at":244},314634,"v1.3.23","Version 3.23 arrives three months after the previous one. 
In addition to datapath optimizations and bug fixes, most of the other changes are enumerated in the following\r\n\r\n**Table of Contents**\r\n- **List Objects; Bucket Inventory**\r\n- **Selecting `Primary` at startup; Restarting cluster when node IPs change (K8s)**\r\n- **S3 (backend, frontend)**\r\n- **BLOBs**\r\n- **Mountpath labels**\r\n- **Reading shards; Reading from shards**\r\n\r\nSee also:\r\n- [Blog: AIS on NFS](https:\u002F\u002Fstoragetarget.com\u002F2024\u002F03\u002F30\u002Fais-on-nfs\u002F)\r\n- [Blog: Very large](https:\u002F\u002Fstoragetarget.com\u002F2024\u002F05\u002F20\u002Fvery-large\u002F)\r\n\r\n## List Objects; Bucket Inventory\r\n- S3 backend: S3 ListObjectsV2 _may_ return a directory !6672\r\n- list very large buckets using _bucket inventory_ !6682, !6684, !6686, !6689, !6692\r\n- list-objects: optimize for prefix; add 'dont-optimize' feature flag !6685\r\n- list very large buckets using _bucket inventory_ (major update, API changes) !6695, !6698\r\n- list very large buckets using _bucket inventory_ !6704\r\n- list-objects: support non-recursive operation (new) !6711, !6712\r\n- refactor and code-generate (message pack) list-objects results !6714\r\n- _bucket inventory_; generic no-recursion helper !6715\r\n- _bucket inventory_: support arbitrary schema; add validation !6769\r\n- list-objects: micro-optimize setting custom properties of remote objects !6770\r\n- list very large buckets using _bucket inventory_ !6775, !6776, !6777, !6778\r\n- list very large buckets using _bucket inventory_ (major) !6810, !6811\r\n- list very large buckets using _bucket inventory_ !6815\r\n- list-objects: skip virtual directories !6835\r\n- list very large buckets using _bucket inventory_ !6847, !6851, !6853\r\n\r\n## Selecting `Primary` at startup; Restarting cluster when node IPs change (K8s)\r\n- primary role: add 'is-secondary' environment; precedence !6746\r\n- 'original' & 'discovery' URLs (major) !6747, !6749\r\n- cluster config: new convention for 
primary URL; role of the primary during: initial deployment, cluster restart !6752, !6755\r\n- cluster restart with simultaneous change of primary (major) !6758, !6760, !6761\r\n- primary startup: always update node net-infos !6762\r\n- all proxies to store `RMD` (previously, only primary) !6764\r\n- node join: remove duplicate IP check (is redundant) !6783\r\n- K8s startup with proxies change their network infos !6785\r\n- primary startup: initial version of the cluster map !6787\r\n- non-primary startup: retry and refactor; factor in !6788\r\n- K8s: primary startup when net-infos change !6789\r\n\r\n## S3 (backend, frontend)\r\n- backend put-object interface; presigned S3 (refactoring & cleanup) !6662\r\n- default AWS region (cleanup) !6679\r\n- `s3cmd`: add negative testing !6681\r\n- backend: S3 ListObjectsV2 _may_ return a directory !6672\r\n- backend: consolidate environment and defaults !6678\r\n- backend: retain S3-specific error code !6688, !6691\r\n- move presigned URLs code to `backend` package !6801\r\n- multipart upload: read and send next part in parallel !6803\r\n- backend: refactor and simplify !6819\r\n- new feature flag to enable (older) path-style addressing !6821\r\n\r\n## BLOBs\r\n- [config change](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fblob\u002Fmain\u002Fcmn\u002Ffeat\u002Ffeat.go#L62): assorted feature flags now have bucket scope (major) !6664, !6666\r\n- Python: blob-download API !6687\r\n- Python: get and prefetch with blob-download !6708\r\n- blob downloader (minor ref) !6793\r\n- blob-downloader: finalize control structures; refactor !6812\r\n- GET via blob-download !6873\r\n- multiple blob-download jobs (fixes) !6876\r\n- prefetch via blob-downloader !6882\r\n\r\n## Mountpath labels\r\n- override-config, `fspaths` section (minor ref) !6718\r\n- [config change, API change](https:\u002F\u002Fstoragetarget.com\u002F2024\u002F03\u002F30\u002Fais-on-nfs\u002F): mountpath labels (major) !6721, !6722, !6725, !6726, !6733, 
!6734, !6735, !6736, !6738\r\n- [backward compatibility](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fcommit\u002F85e15b9c6c9854ab) v3.22 and prior; bump CLI version !6740, !6742\r\n- log: mountpath labels vs shared filesystems; memory pressure !6744\r\n\r\n## Reading shards; Reading from shards\r\n- reading (from) shards: add read-until, read-one, and read-regex methods !6823\r\n- reading shards: read-until, read-one, read-regex !6824\r\n- WebDataset: add `wds-key`; add comments !6826\r\n- reading .TAR, .TGZ, etc. formatted objects (a.k.a. _shards_) - multiple selection !6827\r\n- GET request to select multiple archived files (feature) !6859\r\n- GET multiple archived files in one shot (feature) !6861, !6862, !6863, !6864, !6866\r\n- Python: GET multiple files from an archive (shard) !6860\r\n\r\n## Core\r\n- backend put-object interface (refactoring & cleanup) !6662\r\n- get-stats API vs attach\u002Fdetach mountpaths !6669\r\n- unwrap URL errors; remove `mux.unhandle`; CLI: more tips !6673\r\n- removing a node from a 2-node cluster (in re: rebalance) !6674\r\n- POST \u002Fv1\u002Fbuckets handler: add one more check to URI validation !6690\r\n- last byte (minor ref) !6694\r\n- project layout: move and consolidate all scripts !6699\r\n- extend RMD to reinforce cluster integrity check","2024-05-28T14:49:06",{"id":246,"version":247,"summary_zh":248,"released_at":249},314635,"v1.3.22","## Highlights\r\n- [**Blob downloader**](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fblob\u002Fmain\u002Fdocs\u002Fblob_downloader.md)\r\n- [**Multi-homing**: support multiple user-facing network interfaces](https:\u002F\u002Faiatscale.org\u002Fblog\u002F2024\u002F02\u002F16\u002Fmultihome-bench)\r\n- **Versioning and remote sync**\r\n  - execute in presence of out-of-band changes\u002Fdeletions\r\n  - support _latest version_: the capability to check in-cluster metadata and, possibly, `GET`, download, prefetch, and\u002For copy the latest remote (object) 
version\r\n  - _remote synch_: same as above, plus: remove in-cluster object if its remote counterpart is not present (any longer)\r\n   - both _latest version_ and _remote sync_ are supported in a variety of APIs (including `GET` primitive) and tools (CLI, `aisloader`)\r\n- **Intra-cluster n-way mirroring**\r\n   - to withstand a loss of node(s) erasure coding is now optional\r\n- **AWS S3 (frontend) API**\r\n   - multipart  V2 (major upgrade); other productization\r\n   - listing very large S3 datasets\r\n   - support [presigned S3](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fblob\u002Fmain\u002Fdocs\u002Fs3compat.md) requests (beta)\r\n- **List objects** (job): show diff: in-cluster vs. remote\r\n- **Prefetch** (job): V2 (major upgrade)\r\n- **Copy\u002Ftransform** (jobs): V2 (major upgrade)\r\n- AWS S3: migrate AWS backend to AWS SDK V2\r\n- Azure Blob Storage: transition to latest stable native SDK\r\n\r\n> See also: [aistore features](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore?tab=readme-ov-file#features) and brief [overview](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fblob\u002Fmain\u002Fdocs\u002Foverview.md#at-a-glance).\r\n\r\n## Core\r\n- NVMe multipathing: pick alternative block-stats location !6432\r\n- rotate logs; remove redundant interfaces, other refactoring !6433\r\n- cold `GET`: add stats !6435\r\n- http(s) clients: unify naming, construction; reduce code !6438, !6439\r\n- don't escape URL paths; up cli !6441\r\n- `dsort`: sort records (minor) !6445\r\n- core: micro-optimize copy-buffer !6447\r\n- `list-objects` utilities and helpers; rerun `list-objects` code-gen: refactor and optimize; cleanup !6450, !6451\r\n- intra-cluster transport: zero-copy header !6455\r\n- Go API: (object, multi-object): ref !6456\r\n- add 'read header timeout'; docs: aistore environment variables !6459\r\n- core: support target multi-homing - comma-separated IPs (part one) !6464\r\n- package 'ais': continued refactoring; up cli 
!6466\r\n- support multiple user-facing network interfaces (multi-homing) !6467, !6468\r\n- when setting backend two (or more) times a row !6469\r\n- core: (begin, abort, commit) job - corner cases !6470\r\n- in-cluster K8s environment: prune and cleanup, comment, and document !6471\r\n- multi-object `PUT` - variations !6473, !6474\r\n- unify `PUT` and `PROMOTE` destination naming !6475\r\n- `APPEND` (verb) to append if exists; amend metadata (major) !6476\r\n- EC: refactor and simplify erasure-coding datapath; docs: remove all gitlab references !6477\r\n- `list-objects`: enforce intra-cluster access, validate !6480\r\n- EC: remove redundant state; simplify !6481\r\n- Go API `get-bmd`; follow-up !6483\r\n- EC: cleanup manager: remove rlock and unused map - micro-optimize !6490\r\n- copy bucket: extend the command to sync remote bucket !6491\r\n- extend 'copy bucket' to sync remote !6494, !6495, !6497, !6498, !6499\r\n- don't compare checksums of different (checksum) types !6496\r\n- when deleting non-present (remote) object !6502\r\n- move transform\u002Fcopy-bucket from 'mirror' package to 'xs' !6503\r\n- don't create data mover in a single-node cluster !6504\r\n- multi-object transform\u002Fcopy (job): add missing cleanup !6506\r\n- multi-object transform & copy !6507\r\n- core: abort all (jobs) of a given kind; CLI 'ais stop'; strings: Damerau-Levensthein !6508\r\n- revamp target initialization !6509\r\n- copy\u002Ftransform remote, non-present !6510\r\n- revamp target initialization !6512, !6513\r\n- [API change] get latest version (feature) !6516\r\n- amend Prefetch; flush `atime` cache when shutting down !6517\r\n- amend metadata cache flushing logic (`atime`, `prefetch`, `is-dirty`) !6518\r\n- core: remote reader to support 'latest version' !6519\r\n- extend config ROM; follow-up !6520\r\n- Prefetch v2 !6521\r\n- backend error formatting; `notification-listener` name !6522\r\n- [API change] Prefetch v2; multi-object operations !6523\r\n- Prefetch v2; 
cold-get stats; put size !6524\r\n- [config change] versioning vs remote version changed or deleted !6525, !6526\r\n- add 'remote-deleted' stats counter; Prefetch: test more !6528\r\n- AWS backend `not-found`; job status; other cleanup !6529\r\n- core: refactor 'copy-object' interface, prep to sync remote => in-cluster !6531\r\n- [Cluster Config change] versioning vs remote version: remote _changed_, _deleted_ !6532\r\n- copy\u002Ftransform (bucket | multi-object); intra-cluster notifications !6533\r\n- revise\u002Fsimplify 'is-not-exist' check; `ldp.reader` to honor `sync-remote` option !6537\r\n- pre-parse (`log-modules`, `log-level`); micro-optimize !6538\r\n- amend error handling: `not-found` vs list iterator; OOS !6539\r\n- jobs (\"xactions\"): add and log non-critical errors; `join(error)`","2024-02-25T18:14:53",{"id":251,"version":252,"summary_zh":253,"released_at":254},314636,"v1.3.21","## Highlights\r\n- cold GET: extract and micro-optimize the flow\r\n- sync Cloud bucket\r\n  - leverage [validate-warm-GET](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Faistore\u002Fblob\u002Fmaster\u002Fdocs\u002Fvalidate_warm_get.md) bucket config, and\r\n  - extend it to support non-versioned Cloud buckets, and\r\n  - optionally, delete (remotely deleted) objects\r\n- bucket sizing and counting:\r\n   - support very large buckets that are not necessarily _present_ in the cluster;\r\n   - unify `ais ls --summary` and `ais storage summary` to utilize the same control message and flags\r\n- list, _summarize_, and lookup the properties of remote buckets without adding them to cluster's BMD\r\n- HTTPS:\r\n   - support TLS configuration to authenticate clients\r\n   - switch cluster from HTTP to HTTPS, and vice versa\r\n- optimize metadata cache\r\n- optimize capacity management\r\n- bug fixes, performance improvements\r\n\r\n## Core\r\n- set `prime-time` to amend local generation of globally unique IDs !6325\r\n- multi-object (archive, copy, transform) jobs: transport 
endpoint !6326\r\n- core: (maintenance, decommission, shutdown) transition w\u002F rebalancing !6327\r\n- core: (maintenance, decommission, shutdown) transition w\u002F rebalancing !6328\r\n- intra-cluster transport: make receive-side stats optional !6329\r\n- intra-cluster transport: reduce receive side contention !6330\r\n- fix _channel full_ condition; rebalance-cluster; transport !6331\r\n- feature flags: add `limited-coexistence`; transport: track closed endpoints !6334\r\n- fix `prime-time`: add `caller-is-primary`; up cli module !6335\r\n- switch existing cluster between HTTPS and HTTP !6336\r\n- Go 1.21: use built-in `min` and `max` functions !6337\r\n- `list-objects`(remote-bucket-and-only-remote-props); Go 1.21 `clear` built-in !6339\r\n- Go 1.20: use typed atomic pointer, remove unsafe !6343\r\n- core: assorted micro-optimizations; remove read locks !6346\r\n- tweak multi-error `join-err`, remove error channel (minor) !6347\r\n- [API change] capacity management !6348\r\n- xxhash; field-align `vol` package !6349\r\n- bucket: new-query help; silent GET; test tools !6350\r\n- etl: adding `fqn` param to spec templates !6351\r\n- low-level control structs: `bucket`, `namespace` !6352\r\n- etl: Keras template fix !6355\r\n- etl: fix hello-world ais-etl tests !6356\r\n- core: don't recompute `uname` hash !6359\r\n- repackage HRW methods !6361\r\n- core: lom cache v2 (major update) !6362\r\n- refactor: downloader's diff resolver; control plane (receive BMD) !6363\r\n- core: lom metadata cache (cont-ed) !6365\r\n- `dsort`: error handling, assorted cleanups, more scripted tests !6366\r\n- core transactions: concurrency !6368\r\n- downloader: throttle; wait !6369\r\n- optimize cold GET !6370\r\n- global rebalance: log; minor edits !6373\r\n- core: update backend 'get-reader' API (all supported backends) !6374\r\n- core: `validate-warm-get` to support non-versioned buckets, and more !6375\r\n- `validate-warm-get` to support non-versioned buckets !6376\r\n- [API 
change] silent `HEAD(object)` request !6378\r\n- core: add `load-unsafe` (the faster way to load local metadata) !6382\r\n- total disk size: compute at startup, recompute on change !6383\r\n- [API change] new bucket summary; unify `list-objects` and `summary` !6384, !6386, !6387\r\n- add `config.Rom` to consolidate assorted \"read-mostly\" config values; refactor and unify !6388\r\n- [API change] new bucket summary (major update) !6390\r\n- mountpath jogger: support bucket query !6392\r\n- backend providers: do not include (`checksum`, `version`) if not asked to !6394\r\n- python: updated bucket info API !6395\r\n- feature flags: don't-add-remote & don't-head-remote; log: add s3 module; verbosity; !6398\r\n- support listing remote buckets without adding them to cluster's BMD !6399\r\n- concurrent HEAD(object) vs evict\u002Fcreate bucket - fix the race !6400\r\n- [API change] list and _summarize_ remote buckets without adding remote buckets to cluster's BMD !6401\r\n- datapath query (`dpq`) !6402\r\n- Go-based API: response header to error message !6403\r\n- [API change] new bucket summary !6405, !6406\r\n- downloader: streamline and cleanup initialization sequence !6409\r\n- HTTPS: support TLS configuration !6410, !6411, !6412, !6413, !6414, !6415, !6416\r\n- assorted minor fixes !6417, !6418\r\n- core: cold GET: fast path & slow path !6419\r\n- cluster configuration: flip `validate-cold-get` !6420\r\n- downloader (major update); [API change]: xaction registry !6422\r\n- `validate-warm-get`: add scripted test utilizing remote ais cluster !6423\r\n- core: cold GET: fast path & slow path !6424, !6427\r\n- feature flags: add `disable-fast-cold-get`; `show performance latency`; up cli module !6425\r\n- refactor ais\u002Futils !6429\r\n\r\n## Bench: `aisloader` and `aisloader-composer`\r\n- skip list objects for 100% put load !6332\r\n- `composer`: add playbook and script for intial `aisloader` copy !6333\r\n- `composer`: add support for `aisloader --filelist` option 
… !6345
- default value for duration should be infinite if num-epochs value is defined !6353
- `composer`: add epochs option for GET workloads !6354

(released 2023-11-05)

# v1.3.20 (2023-09-12)

## Core
- tweak stop-maintenance logic; rebalance: cleanup log messages; assorted minor fixes !6288
- do not timestamp `err-aborted` message !6290
- [API change] dsort: remove _extended metrics_; add new counters; revise and refactor !6297
- list-objects; house-keeper; `aisloader`, logger (assorted fixes) !6298
- core stats: remove mutex and work channel - speed up !6299
- slab allocator: remove stats mutex, do not sort !6300
- consolidate and revise OOM handling !6301
- ETL: require admin access to create & delete; add feature flag !6302
- remove unused heartbeat tracker w/ minor ref !6308
- reimplement keep-alive mechanism (major) !6309
- keep-alive v2 (major update) !6312
- keep-alive v2: remove timeout stats (control structure and code) !6317
- keep-alive v2: add fast path !6320
- micro-optimize get-all-running (jobs); atomic heard-from/timed-out !6321
- node-restarted: remove 'lsof', use net dialer; fix node-decommissioning tests !6322

## Tools and tests
- CI: update _fspath_ (aka _mountpath_) config for minikube-based aistore deployments !6289
- `aisloader`: list and read s3 buckets _directly_ !6291
- `aisloader`: list, read, and write s3 buckets directly !6292
- tests: K8s long tests (EchoGolang) fix !6293
- `aisloader`: fix cleanup option for s3 bucket benchmarks !6294
- `aisloader`: reimplement direct get from s3 - use SDK !6295
- `aisloader`: show progress when listing s3 directly !6296
- CLI: add show details param to etl !6304
- tools: add check for ais etl deployment !6305
- tools: add `ETL_NAME` var for CLI tests !6310
- `aisloader-composer`: add playbook and script for clearing Linux Page Cache on all AIS targets !6311
- `aisloader-composer`: add playbook for copying aws credentials !6314
- tools: update check for aistore Kubernetes deployment !6315
- CI: update github action version (all modules) !6316
- CLI/ETL: support enumerated `arg-type` !6287, !6323

## Build
- upgrade all OSS packages (minor versions) !6313
- transition to Go 1.21 !6318

# v1.3.19 (2023-08-29)

## Core
- [API change] archive and download logs (feature) !6172, !6175
- [API change] dsort: extend input format !6181
- [API change] dsort spec; CLI: print job spec !6204
- [API change] revise request spec (major upd) !6217
- [API change] dsort: is now 'xaction' as well !6253
- (downloader, dsort, ETL): disallow to run when out of space !6235
- handle "DNS lookup fail" as one of the _unreachable_ err types; nlog flush-exit !6164
- when electing new primary; when joining nodes at startup !6165
- k8s: change prod k8s and docker default to not log all to stderr !6166
- revise GFN !6167
- stats runner is now responsible for periodically flushing logs !6170
- core: fail user attempt to abort global rebalance when !6184
- new Go API; assorted fixes !6189
- metasync BMD; up modules !6190
- downloader: return not-found when not found !6196
- start using scripted integration tests; CLI: 'dsort src dst spec' !6198
- support S3 AWS profiles with alternative creds (feature) !6214
- core: state transition => rebalance => (point of no return) !6216
- amend low-level Go API check-response routine; add error type-code !6228, !6229
- control plane: deserialize original error from call result !6230
- xactions: when checking inactivity ("is idle") !6242, !6243
- primary readiness vs cluster shutdown !6244
- Go API: wait for xaction-related conditions !6245
- assorted tuneups: space cleanup; housekeeping (HK) callback; log !6246
- access control: when copying/transforming/dsorting to non-existing 'ais://' destination !6255
- core: a call to update stats should never block !6257
- core stats: add _fast_ counters !6258, !6259, !6261
- sparsify latency stats !6260
- ETL: refactor and cleanup construction !6267
- deploy/dev: updated minikube scripts !6272
- new option to add Cloud bucket to aistore without checking accessibility !6275, !6277
- un-throttle PUT mirroring; assorted changes !6278
- feature: local generation of global (job) IDs !6280, !6282

## Performance
- add distributed loader scripts and playbooks for using aisloader with multiple hosts !6156
- pyaisloader: usability improvements !6215
- update Grafana dashboard to include latency statistics !6249
- reorganize benchmarks and related tools !6254
- aisloader: no need to call `rand` for 100% or 50% read/write workloads !6256
- aisloader-composer: add dashboard for DC network and disk !6266
- aisloader: add an option to randomize gateways !6279
- aisloader-composer: fix output files for GET bench !6283

## Python
- sdk: update ETL templates (docker migration) !6168
- sdk: release version 1.4.1 !6169
- sdk: ETL templates (compress + ffmpeg decode) !6185
- sdk: ETL templates (imagepullpolicy as always) !6191
- sdk: add keras_transform template !6200
- sdk: ETL templates fix !6201
- sdk: ETL templates (ffmpeg decode transformer) !6205
- sdk: compress ETL template (updated usage) !6211
- sdk: torchvision sample transformer ETL template !6221
- sdk: fix comments (minor) !6240
- sdk: update version !6248
- sdk: increase timeout for torchvision transformer template (large image) !6252
- sdk: updated torchvision transform ETL !6262
- sdk: update dsort job info query and related tests !6265
- sdk: switch ETL init code 'transform_url' boolean flag to 'arg_type' string !6269
- docs: update ETL dev deployment for macOS !6163
- ETL: keras template minor fix !6213
- ETL: remove incorrect reference !6268
- ETL: add 'arg-type=FQN' (new) !6271

## Datasets (resize, resort, and shuffle)
- [API change] dsort: extend input format !6181
- dsort input format: iterate list, iterate range !6186, !6187
- start using scripted integration tests; CLI: 'dsort src dst spec' !6198
- add test scripts; memsys: init gmm only once !6192
- refactoring and renaming !6193
- move/consolidate error types; continued refactoring !6202
- Go API change; add dsort/api.go; CLI: print job spec !6203
- [API change]: dsort spec; CLI: print job spec !6204
- CLI/dsort: extend inline help, pretty-print job spec; update docs !6206
- dsort: continued refactoring (major update) !6208, !6209, !6210
- free sgl on error; feature: _any_ extension !6212
- [API change] revise request spec (major upd) !6217
- create destination on the fly !6218
- record content path to retain full shard name !6219
- output shard size estimation (rewrite) !6223
- add is-compressed; refactor dsort-mem !6227
- compressible shards (major) !6231
- output ext; rcb buffer; fixes !6232
- duplicated records (full coverage & stress); fixes !6233
- fix tests; add stress !6234
- rename subpackage, fix comments, refactor !6237
- remove dsort-context, rewrite initialization !6238
- static/stateless shard readers/writers; refactor and simplify !6239
- two goroutines per shard-distributing request !6241
- [API change]: dsort: is now 'xaction' as well !6253
- dsort: support generic abort-xaction API !6264
- no need to block when sending shard records !6286

## CLI
…

# v1.3.18 (2023-07-09)

## Core
- add `htext` to track restarted state; target run and misc !5966
- cluster rebalance (scenarios) !5969, !5971, !5973, !5974, !5975, !5977, !5980, !5983, !5986, !5987, !5989, !5991, !5992, !5993, !5995, !6002
- add 'cluster-ready' helper; use it to reinforce !5976
- cleanup better when decommissioning; previous BMD at startup !5979
- fs: reliable remove-all !5981, !5982, !5984
- yet another buf pool !5985
- do not modify cluster map when starting up; always skip logging idle disks !5988
- rebalance (scenarios, major update) !5992
- [API change]: core: rebalance (scenarios) !5993
- rebalance (major update); when receiving new cluster map !5995
- up modules; handle housekeeper registration race !5994
- 'not present in the loaded cluster map' and similar startup validation !5996
- shutdown or decommission a node that's already in maintenance !5998
- transport: never establish a streaming connection to the peer that's in maintenance (or will be) !5999
- `metasync` just-in-time; assorted refactoring (minor) !6001
- maintenance mode: pre & post vs keepalive & metasync; CLI: more colored cues !6004
- shutdown is also 'maintenance'; docs: adding-removing intro !6005
- add `meta` package !6006, !6007
- ETL: add arg-type parameter when initializing with code !6008
- archive v2: support empty template (tar entire bucket); atime !6013
- keep poi.atime in nanoseconds !6015
- archive v2: append to arch; refactoring !6017
- archive v2: up modules !6018
- archive v2: part four (major) !6019
- archive v2: detect an empty tar when appending, and handle !6020
- archive v2: part six !6022
- archive v2: mime detection !6024
- archive v2: extend 'append-to-arch' to support tar.gz !6025, !6027
- archive v2: tar and tgz append; fixes !6028
- log filenames; overlapping run vs node-restarted !6029
- archive v2: multi-object append-to-arch !6030, !6033, !6034
- cleanup disk utils (minor) !6035
- ios startup: run the command only once !6036
- hide AuthN secret !6038
- archive v2: append to zip !6041
- archive v2: append to msgpack !6043
- add `cmn/archive` package !6044
- archive v2: write and copy via new 'cmn/archive' !6045
- archive v2: append via new 'cmn/archive' !6046
- [API change] archive v2: MIME vs file extensions !6047
- ios: cleanup lsblk cache; CLI: refactor get-node-arg; up modules !6048
- archive v2: remove msgpack; refactor !6051
- archive v2: add '.tar.lz4' serialization (new) !6053
- archive v2: tar.lz4 cont-d !6054
- archive v2: lz4 features; checksum !6055
- s3 compat: run E2E tests with correct HTTP/HTTPS mode !6057
- [API change]: append to arch if exists !6062
- [API change] append to arch if doesn't exist; CLI cont-d !6064
- checksumming and buffering vs reader-from !6074
- core: content-length universally; revise write-json and friends !6075, !6076
- archive v2: [API change] put (files, dirs) with an option to append !6081, !6082
- archive v2: quiesce faster, refine continue-on-error logic !6083
- core: double-check target-in-maintenance, quiesce faster !6084
- archive v2: finalize cmn/archive package !6085, !6086
- log verbosity: core and modules !6087
- http client: disable compression; core: undefer & micro-optimize !6066
- append to (non-existing) arch: an option to create !6068
- mem-pool alloc/free symmetry: copy/transform & archive !6069
- copy/transform, multi-archive: refactor Rx logic and error handling !6071
- log verbosity: core and modules; remote cluster !6089, !6091
- ec: minor refactoring !6092
- archive v2: WD basename; get with extraction; Range !6093, !6094
- archive v2: tools/archive utils !6095, !6096
- compile-out asserts; super-verbose logging; log module 'mirror' (ref) !6097
- log verbosity at runtime; log modules; remove glog; unify (major update) !6099
- fields iterator; size converter; log rotation (fixes) !6101
- [API change] get-bucket-info to count remote objects !6102, !6103
- [API change] get-bucket-info (part three); docs and CLI !6106
- list-objects vs buckets: revise and refactor, add validation, clarify !6107
- list-objects: introduce optional args (ref, cleanup) !6108
- list-objects: mem-pool msgpack buffers !6109
- kvdb: remove redundant err-not-found; amend dsort, downloader, authn !6110
- x-lso must idle more time !6111
- log modules (part three) !6112
- add `nlog` (new logger) !6113, !6122, !6124
- do not log perf counters when there's no change; sort the names !6119
- fix disk usage call for clusters on macOS !6121
- log etl events: spec parsed, pod ready, hpull/hpush !6125
- cleanup `fs.PathError`; add object name validation !6127
- extend dsort to support `.tar.lz4`

# v1.3.17 (2023-04-11)

### Table of Contents
- **CLI v1.2**
- **Python SDK v1.1.2**
- **S3 compatibility and [Botocore](https://github.com/NVIDIA/aistore/blob/master/python/aistore/botocore_patch/README.md)**
- **API changes**
- **Tests and Documentation**
- **Core: bug fixes and improvements**
- **Build and Continuous Integration**
- **Extensions: Downloader, dSort, ETL**

See also:

* [Blog: Transforming non-existing datasets](https://aiatscale.org/blog/2023/04/10/tco-any-to-any)
* [Blog: Python SDK: transform and load into PyTorch](https://aiatscale.org/blog/2023/04/03/transform-images-with-python-sdk)

## CLI
- show all jobs !5645
- start/stop job/xaction !5660
- refresh rate and countdown; long-running 'show job' and friends !5651
- 'show log node-name' to mimic 'tail -f' !5652, !5654
- add custom duration flag and logic !5655
- 'ais config (cluster|node|cli)', 'ais config reset', and friends !5656
- bucket completions !5657
- set-config to show all updates; tweak iter-fields reflection !5658
- 'show job' to aggregate all categories and support all selections !5661
- transition to using job display names (major) !5663
- start, stop, show jobs and xactions (cont-d) !5665
- amend and restructure jobs !5666
- running xactions (completions) !5672
- tweak config json printout; get-config from memory !5673
- update backend config !5674
- update backend config (part two) !5677
- add footnote, marshal message only once !5678
- remove `xaction` term and subcommand (everything is `job` now) !5692
- suggest (targets, proxies, nodes) !5694, !5696
- revise bash completions script !5697
- remove 'xaction' (term and subcommand) !5698, !5699, !5700
- 'show cluster': separate cluster nodes from all other (tab-tab) completions !5701
- consolidate and refactor cluster map access !5704
- tweak `ais create bucket --props` & `ais bucket props set` !5706
- extend 'job start' to support (resilver, copy-bucket, rename-bucket) !5715
- tweak listed props !5719
- remove (cleanup) download and dsort jobs !5721
- extend 'ais stop' to support --all|--regex !5722
- 'show job' verbose option; unify usage args; ref PUT/APPEND !5723
- rewrite command-not-found logic; add similar commands !5724
- `show jobs` (major) !5726, !5727
- bash autocomplete ordering improvements !5728
- improvements (usability) !5729
- add `bucket cp` alias !5730
- flag printable name; split 'show job' in parts; usability !5736
- further unify stopping, waiting-for, and showing jobs !5744
- revise & amend 'show rebalance' - all permutations !5761
- universal start-end formatting; template refactoring !5762
- jobs grouping by name and, within name, by UUID !5764
- complete `etl-name` transition !5767
- ETL tools, UUID (part one) !5745, !5746, !5749, !5753, !5754, !5763
- fix download/dsort progress !5769
- new table to show target statistics !5788
- 'ais show performance' (new) !5791, !5793, !5800, !5802, !5803, !5809, !5810, !5811, !5812, !5816
- IEC, SI, and raw (bytes, nanoseconds) formatting (major) !5820
- reduce code, simplify, cleanup !5821
- IEC, SI, and raw (bytes, nanoseconds) formatting (major) !5823
- disk stats: add average read/write sizes !5824
- amend existing mountpath tab and add a new one !5833, !5834
- expect node unreachable when iterating '--refresh' !5837
- assorted usability; add 'no-color' config !5839
- 'ais show performance': average (GET, PUT, etc.) sizes on the fly !5840
- support new API to reset stats !5841, !5843
- 'ais show performance': refactor throughput, add latency !5844
- 'ais show performance': finalize latency tab !5847
- 'ais show performance' cont-d !5848
- 'ais show performance': finalize top-level tab !5850
- 'ais show performance': add cluster-level throughput, beautify !5852
- 'ais show performance': alias 'stats' and remove older code !5853
- 'ais show performance': disk table v2 !5855
- 'ais show performance': finalize disk table !5857
- 'ais show performance': new mountpaths/disks/capacity table !5858, !5859
- 'ais show performance': finalize capacity table !5861
- refactor and cleanup multi-object put !5862
- multi-object PUT: source dir, list/range; matching pattern !5865
- fix concatenation logic, refactor progress bar !5867
- copy bucket: support progress bars (copied objects and size) !5870
- consistent timeout management !5871
- copy/transform a list or range of objects: add progress bar !5873
- copy/transform with progress bar: style, reuse !5874
- multi-object PUT !5876
- progress bar: all multi-object operations; universal 'wait-for' !5879
- PUT multi-object - all flavors !5880
- get multiple objects in one shot ("multi-object GET") !5884
- GET destination & assorted fixes !5882
- copy-bucket: prepend prefix, command helps, examples !5889, !5891
- more inline help !5892
- assorted improvements (minor) !5900
- fix downloading with progress bar enabled !5903
- how-to text: how to reconfigure remote ais cluster !5932
- add CLI compatibi
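The v1.3.19 item "support S3 AWS profiles with alternative creds (feature)" (!6214) builds on the standard AWS shared-credentials file, where each named profile carries its own key pair. The fragment below shows only that standard file format; the profile names are illustrative, and how aistore binds a bucket to a particular profile is documented in the aistore S3 backend docs, not here:

```ini
# ~/.aws/credentials -- standard AWS shared-credentials format.
# Profile names below are illustrative.
[default]
aws_access_key_id     = <default-access-key>
aws_secret_access_key = <default-secret-key>

[team-b]
aws_access_key_id     = <team-b-access-key>
aws_secret_access_key = <team-b-secret-key>
```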
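The v1.3.19 dsort items "extend input format" and "input format: iterate list, iterate range" (!6181, !6186, !6187) concern templated shard names. As a rough illustration of the idea — assuming the bash-style `{lo..hi}` range syntax that aistore templates use, and ignoring the richer grammar (steps, multiple ranges, explicit lists) — a range template expands like this; `expand_range_template` is a hypothetical helper, not aistore code:

```python
import re

def expand_range_template(template: str) -> list[str]:
    """Expand a bash-style numeric range template such as
    'shard-{000..002}.tar' into the full list of names.
    Sketch only: aistore's real template grammar is richer."""
    m = re.search(r"\{(\d+)\.\.(\d+)\}", template)
    if not m:
        return [template]
    lo, hi = m.group(1), m.group(2)
    width = len(lo)  # preserve zero-padding width from the template
    return [
        template[:m.start()] + str(i).zfill(width) + template[m.end():]
        for i in range(int(lo), int(hi) + 1)
    ]

print(expand_range_template("shard-{000..002}.tar"))
# -> ['shard-000.tar', 'shard-001.tar', 'shard-002.tar']
```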
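Several v1.3.18 items revolve around appending to an existing archive, including "detect an empty tar when appending, and handle" (!6020): a zero-length file is not a valid tar, so blind append fails. A minimal sketch of that general technique using Python's stdlib `tarfile` — an illustration of the problem, not aistore's `cmn/archive` implementation:

```python
import io
import os
import tarfile

def append_to_tar(path: str, name: str, data: bytes) -> None:
    """Append one member to a tar archive. An empty (zero-byte) file is
    not a valid tar, so in that case create the archive instead of
    appending -- tarfile mode 'a' would otherwise raise ReadError."""
    exists = os.path.exists(path) and os.path.getsize(path) > 0
    mode = "a" if exists else "w"
    with tarfile.open(path, mode) as tf:
        info = tarfile.TarInfo(name=name)
        info.size = len(data)
        tf.addfile(info, io.BytesIO(data))
```

The empty-file check is the whole point: without it, appending to a freshly `touch`-ed shard fails on the missing end-of-archive marker.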
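The v1.3.18 items "archive v2: mime detection" (!6024) and "[API change] archive v2: MIME vs file extensions" (!6047) are about identifying an archive by content rather than trusting its extension. A sketch of the underlying technique — magic-number detection for the formats this changelog mentions (tar, zip, gzip, lz4); the actual aistore detection logic may differ:

```python
def detect_archive_type(header: bytes) -> "str | None":
    """Guess an archive format from leading bytes (magic numbers),
    independent of the file extension. Sketch, not aistore code."""
    if header[:2] == b"\x1f\x8b":
        return "gzip"  # covers .gz / .tar.gz / .tgz
    if header[:4] == b"PK\x03\x04":
        return "zip"
    if header[:4] == b"\x04\x22\x4d\x18":
        return "lz4"   # LZ4 frame magic, e.g. .tar.lz4
    if len(header) >= 263 and header[257:262] == b"ustar":
        return "tar"   # POSIX tar: 'ustar' magic at offset 257
    return None
```

Note the asymmetry that motivates content-based detection: `.tar.gz` looks like gzip on the wire, and only after decompression does the tar structure appear.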
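The v1.3.17 CLI work on "IEC, SI, and raw (bytes, nanoseconds) formatting" (!5820, !5823) hinges on a distinction worth spelling out: IEC units step by 1024 (KiB, MiB, ...) while SI units step by 1000 (kB, MB, ...). A minimal sketch of that distinction — an illustration only, not aistore's actual formatter, and `format_size` is a hypothetical name:

```python
# IEC (base-1024) vs SI (base-1000) size formatting, illustrated.
IEC = ["B", "KiB", "MiB", "GiB", "TiB"]
SI = ["B", "kB", "MB", "GB", "TB"]

def format_size(n: int, iec: bool = True) -> str:
    """Format a byte count with IEC (1024) or SI (1000) units."""
    base = 1024 if iec else 1000
    units = IEC if iec else SI
    value = float(n)
    for unit in units:
        if value < base or unit == units[-1]:
            # raw byte counts stay integral; scaled values get one decimal
            return f"{int(value)}B" if unit == "B" else f"{value:.1f}{unit}"
        value /= base

print(format_size(1536))              # -> 1.5KiB
print(format_size(1536, iec=False))   # -> 1.5kB
print(format_size(10**9, iec=False))  # -> 1.0GB
```

The same 1536 bytes thus renders as `1.5KiB` (IEC) but `1.5kB` (SI) — exactly the kind of ambiguity the CLI's explicit IEC/SI/raw modes resolve.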