[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-llamastack--llama-stack":3,"tool-llamastack--llama-stack":64},[4,17,27,35,43,56],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":16},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,3,"2026-04-05T11:01:52",[13,14,15],"开发框架","图像","Agent","ready",{"id":18,"name":19,"github_repo":20,"description_zh":21,"stars":22,"difficulty_score":23,"last_commit_at":24,"category_tags":25,"status":16},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",140436,2,"2026-04-05T23:32:43",[13,15,26],"语言模型",{"id":28,"name":29,"github_repo":30,"description_zh":31,"stars":32,"difficulty_score":23,"last_commit_at":33,"category_tags":34,"status":16},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 
绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",107662,"2026-04-03T11:11:01",[13,14,15],{"id":36,"name":37,"github_repo":38,"description_zh":39,"stars":40,"difficulty_score":23,"last_commit_at":41,"category_tags":42,"status":16},3704,"NextChat","ChatGPTNextWeb\u002FNextChat","NextChat 是一款轻量且极速的 AI 助手，旨在为用户提供流畅、跨平台的大模型交互体验。它完美解决了用户在多设备间切换时难以保持对话连续性，以及面对众多 AI 模型不知如何统一管理的痛点。无论是日常办公、学习辅助还是创意激发，NextChat 都能让用户随时随地通过网页、iOS、Android、Windows、MacOS 或 Linux 端无缝接入智能服务。\n\n这款工具非常适合普通用户、学生、职场人士以及需要私有化部署的企业团队使用。对于开发者而言，它也提供了便捷的自托管方案，支持一键部署到 Vercel 或 Zeabur 等平台。\n\nNextChat 的核心亮点在于其广泛的模型兼容性，原生支持 Claude、DeepSeek、GPT-4 及 Gemini Pro 等主流大模型，让用户在一个界面即可自由切换不同 AI 能力。此外，它还率先支持 MCP（Model Context Protocol）协议，增强了上下文处理能力。针对企业用户，NextChat 提供专业版解决方案，具备品牌定制、细粒度权限控制、内部知识库整合及安全审计等功能，满足公司对数据隐私和个性化管理的高标准要求。",87618,"2026-04-05T07:20:52",[13,26],{"id":44,"name":45,"github_repo":46,"description_zh":47,"stars":48,"difficulty_score":23,"last_commit_at":49,"category_tags":50,"status":16},2268,"ML-For-Beginners","microsoft\u002FML-For-Beginners","ML-For-Beginners 是由微软推出的一套系统化机器学习入门课程，旨在帮助零基础用户轻松掌握经典机器学习知识。这套课程将学习路径规划为 12 周，包含 26 节精炼课程和 52 道配套测验，内容涵盖从基础概念到实际应用的完整流程，有效解决了初学者面对庞大知识体系时无从下手、缺乏结构化指导的痛点。\n\n无论是希望转型的开发者、需要补充算法背景的研究人员，还是对人工智能充满好奇的普通爱好者，都能从中受益。课程不仅提供了清晰的理论讲解，还强调动手实践，让用户在循序渐进中建立扎实的技能基础。其独特的亮点在于强大的多语言支持，通过自动化机制提供了包括简体中文在内的 50 多种语言版本，极大地降低了全球不同背景用户的学习门槛。此外，项目采用开源协作模式，社区活跃且内容持续更新，确保学习者能获取前沿且准确的技术资讯。如果你正寻找一条清晰、友好且专业的机器学习入门之路，ML-For-Beginners 将是理想的起点。",84991,"2026-04-05T10:45:23",[14,51,52,53,15,54,26,13,55],"数据工具","视频","插件","其他","音频",{"id":57,"name":58,"github_repo":59,"description_zh":60,"stars":61,"difficulty_score":10,"last_commit_at":62,"category_tags":63,"status":16},3128,"ragflow","infiniflow\u002Fragflow","RAGFlow 
是一款领先的开源检索增强生成（RAG）引擎，旨在为大语言模型构建更精准、可靠的上下文层。它巧妙地将前沿的 RAG 技术与智能体（Agent）能力相结合，不仅支持从各类文档中高效提取知识，还能让模型基于这些知识进行逻辑推理和任务执行。\n\n在大模型应用中，幻觉问题和知识滞后是常见痛点。RAGFlow 通过深度解析复杂文档结构（如表格、图表及混合排版），显著提升了信息检索的准确度，从而有效减少模型“胡编乱造”的现象，确保回答既有据可依又具备时效性。其内置的智能体机制更进一步，使系统不仅能回答问题，还能自主规划步骤解决复杂问题。\n\n这款工具特别适合开发者、企业技术团队以及 AI 研究人员使用。无论是希望快速搭建私有知识库问答系统，还是致力于探索大模型在垂直领域落地的创新者，都能从中受益。RAGFlow 提供了可视化的工作流编排界面和灵活的 API 接口，既降低了非算法背景用户的上手门槛，也满足了专业开发者对系统深度定制的需求。作为基于 Apache 2.0 协议开源的项目，它正成为连接通用大模型与行业专有知识之间的重要桥梁。",77062,"2026-04-04T04:44:48",[15,14,13,26,54],{"id":65,"github_repo":66,"name":67,"description_en":68,"description_zh":69,"ai_summary_zh":69,"readme_en":70,"readme_zh":71,"quickstart_zh":72,"use_case_zh":73,"hero_image_url":74,"owner_login":75,"owner_name":76,"owner_avatar_url":77,"owner_bio":78,"owner_company":79,"owner_location":79,"owner_email":79,"owner_twitter":79,"owner_website":79,"owner_url":80,"languages":81,"stars":119,"forks":120,"last_commit_at":121,"license":122,"difficulty_score":10,"env_os":123,"env_gpu":123,"env_ram":123,"env_deps":124,"category_tags":127,"github_topics":79,"view_count":10,"oss_zip_url":79,"oss_zip_packed_at":79,"status":16,"created_at":128,"updated_at":129,"faqs":130,"releases":159},1031,"llamastack\u002Fllama-stack","llama-stack","Composable building blocks to build LLM Apps","Llama Stack 是一个开源的代理式 API 服务器，旨在帮助开发者轻松构建大语言模型应用。作为 OpenAI API 的兼容替代方案，Llama Stack 允许你在任何环境中运行——无论是本地笔记本、数据中心还是云端。Llama Stack 主要解决了模型切换和基础设施绑定的问题，开发者无需修改代码，即可在 Llama、GPT、Gemini 等不同模型之间自由替换，也能灵活选择 Ollama、vLLM 等推理后端。\n\nLlama Stack 非常适合需要灵活部署 AI 应用的开发者和技术团队。除了提供标准的聊天、嵌入、向量存储及批量处理接口外，Llama Stack 独特的可插拔架构支持将推理引擎、向量数据库（如 FAISS、Milvus）及工具连接器（如 MCP 服务器）进行模块化组合。此外，Llama Stack 还内置了服务端代理编排能力，支持工具调用和文件搜索（RAG），让构建复杂的智能体应用变得更加简单高效。","# Llama Stack\n\n[![PyPI version](https:\u002F\u002Fimg.shields.io\u002Fpypi\u002Fv\u002Fllama_stack.svg)](https:\u002F\u002Fpypi.org\u002Fproject\u002Fllama_stack\u002F)\n[![PyPI - 
Downloads](https:\u002F\u002Fimg.shields.io\u002Fpypi\u002Fdm\u002Fllama-stack)](https:\u002F\u002Fpypi.org\u002Fproject\u002Fllama-stack\u002F)\n[![Docker Hub - Pulls](https:\u002F\u002Fimg.shields.io\u002Fdocker\u002Fpulls\u002Fllamastack\u002Fdistribution-starter)](https:\u002F\u002Fhub.docker.com\u002Fu\u002Fllamastack)\n[![License](https:\u002F\u002Fimg.shields.io\u002Fpypi\u002Fl\u002Fllama_stack.svg)](https:\u002F\u002Fgithub.com\u002Fmeta-llama\u002Fllama-stack\u002Fblob\u002Fmain\u002FLICENSE)\n[![Discord](https:\u002F\u002Fimg.shields.io\u002Fdiscord\u002F1257833999603335178?color=6A7EC2&logo=discord&logoColor=ffffff)](https:\u002F\u002Fdiscord.gg\u002Fllama-stack)\n[![Unit Tests](https:\u002F\u002Fgithub.com\u002Fmeta-llama\u002Fllama-stack\u002Factions\u002Fworkflows\u002Funit-tests.yml\u002Fbadge.svg?branch=main)](https:\u002F\u002Fgithub.com\u002Fmeta-llama\u002Fllama-stack\u002Factions\u002Fworkflows\u002Funit-tests.yml?query=branch%3Amain)\n[![Integration Tests](https:\u002F\u002Fgithub.com\u002Fmeta-llama\u002Fllama-stack\u002Factions\u002Fworkflows\u002Fintegration-tests.yml\u002Fbadge.svg?branch=main)](https:\u002F\u002Fgithub.com\u002Fmeta-llama\u002Fllama-stack\u002Factions\u002Fworkflows\u002Fintegration-tests.yml?query=branch%3Amain)\n\n[**Quick Start**](https:\u002F\u002Fllamastack.github.io\u002Fdocs\u002Fgetting_started\u002Fquickstart) | [**Documentation**](https:\u002F\u002Fllamastack.github.io\u002Fdocs) | [**OpenAI API Compatibility**](https:\u002F\u002Fllamastack.github.io\u002Fdocs\u002Fapi-openai) | [**Discord**](https:\u002F\u002Fdiscord.gg\u002Fllama-stack)\n\n**Open-source agentic API server for building AI applications. OpenAI-compatible. Any model, any infrastructure.**\n\nLlama Stack is a drop-in replacement for the OpenAI API that you can run anywhere — your laptop, your datacenter, or the cloud. Use any OpenAI-compatible client or agentic framework. 
Swap between Llama, GPT, Gemini, Mistral, or any model without changing your application code.\n\n```python\nfrom openai import OpenAI\n\nclient = OpenAI(base_url=\"http:\u002F\u002Flocalhost:8321\u002Fv1\", api_key=\"fake\")\nresponse = client.chat.completions.create(\n    model=\"llama-3.3-70b\",\n    messages=[{\"role\": \"user\", \"content\": \"Hello\"}],\n)\n```\n\n## What you get\n\n- **Chat Completions & Embeddings** — standard `\u002Fv1\u002Fchat\u002Fcompletions`, `\u002Fv1\u002Fcompletions`, and `\u002Fv1\u002Fembeddings` endpoints, compatible with any OpenAI client\n- **Responses API** — server-side agentic orchestration with tool calling, MCP server integration, and built-in file search (RAG) in a single API call ([learn more](https:\u002F\u002Fllamastack.github.io\u002Fdocs\u002Fapi-openai))\n- **Vector Stores & Files** — `\u002Fv1\u002Fvector_stores` and `\u002Fv1\u002Ffiles` for managed document storage and search\n- **Batches** — `\u002Fv1\u002Fbatches` for offline batch processing\n- **[Open Responses](https:\u002F\u002Fwww.openresponses.org\u002F) conformant** — the Responses API implementation passes the Open Responses conformance test suite\n\n## Use any model, use any infrastructure\n\nLlama Stack has a pluggable provider architecture. 
Develop locally with Ollama, deploy to production with vLLM, or connect to a managed service — the API stays the same.\n\n```text\n┌─────────────────────────────────────────────────────────────────────────┐\n│                          Llama Stack Server                             │\n│               (same API, same code, any environment)                    │\n│                                                                         │\n│  \u002Fv1\u002Fchat\u002Fcompletions  \u002Fv1\u002Fresponses  \u002Fv1\u002Fvector_stores  \u002Fv1\u002Ffiles      │\n│  \u002Fv1\u002Fembeddings        \u002Fv1\u002Fbatches    \u002Fv1\u002Fmodels         \u002Fv1\u002Fconnectors │\n├───────────────────┬──────────────────┬──────────────────────────────────┤\n│  Inference        │  Vector stores   │  Tools & connectors              │\n│    Ollama         │    FAISS         │    MCP servers                   │\n│    vLLM, TGI      │    Milvus        │    Brave, Tavily (web search)    │\n│    AWS Bedrock    │    Qdrant        │    File search (built-in RAG)    │\n│    Azure OpenAI   │    PGVector      │                                  │\n│    Fireworks      │    ChromaDB      │  File storage & processing       │\n│    Together       │    Weaviate      │    Local filesystem, S3          │\n│    ...15+ more    │    Elasticsearch │    PDF, HTML (file processors)   │\n│                   │    SQLite-vec    │                                  │\n└───────────────────┴──────────────────┴──────────────────────────────────┘\n```\n\nSee the [provider documentation](https:\u002F\u002Fllamastack.github.io\u002Fdocs\u002Fproviders) for the full list.\n\n## Get started\n\nInstall and run a Llama Stack server:\n\n```bash\n# One-line install\ncurl -LsSf https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fraw\u002Fmain\u002Fscripts\u002Finstall.sh | bash\n\n# Or install via uv\nuv pip install llama-stack\n\n# Start the server (uses the starter distribution with Ollama)\nllama stack 
run\n```\n\nThen connect with any OpenAI client — [Python](https:\u002F\u002Fgithub.com\u002Fopenai\u002Fopenai-python), [TypeScript](https:\u002F\u002Fgithub.com\u002Fopenai\u002Fopenai-node), [curl](https:\u002F\u002Fplatform.openai.com\u002Fdocs\u002Fapi-reference), or any framework that speaks the OpenAI API.\n\nSee the [Quick Start guide](https:\u002F\u002Fllamastack.github.io\u002Fdocs\u002Fgetting_started\u002Fquickstart) for detailed setup.\n\n## Resources\n\n- [Documentation](https:\u002F\u002Fllamastack.github.io\u002Fdocs) — full reference\n- [OpenAI API Compatibility](https:\u002F\u002Fllamastack.github.io\u002Fdocs\u002Fapi-openai) — endpoint coverage and provider matrix\n- [Getting Started Notebook](.\u002Fdocs\u002Fgetting_started.ipynb) — text and vision inference walkthrough\n- [Contributing](CONTRIBUTING.md) — how to contribute\n\n**Client SDKs:**\n\n|  Language |  SDK | Package |\n| :----: | :----: | :----: |\n| Python |  [llama-stack-client-python](https:\u002F\u002Fgithub.com\u002Fmeta-llama\u002Fllama-stack-client-python) | [![PyPI version](https:\u002F\u002Fimg.shields.io\u002Fpypi\u002Fv\u002Fllama_stack_client.svg)](https:\u002F\u002Fpypi.org\u002Fproject\u002Fllama_stack_client\u002F) |\n| TypeScript   | [llama-stack-client-typescript](https:\u002F\u002Fgithub.com\u002Fmeta-llama\u002Fllama-stack-client-typescript) | [![NPM version](https:\u002F\u002Fimg.shields.io\u002Fnpm\u002Fv\u002Fllama-stack-client.svg)](https:\u002F\u002Fnpmjs.org\u002Fpackage\u002Fllama-stack-client) |\n\n## Community\n\nWe hold regular community calls every Thursday at 09:00 AM PST — see the [Community Event on Discord](https:\u002F\u002Fdiscord.com\u002Fevents\u002F1257833999603335178\u002F1413266296748900513) for details.\n\n[![Star History Chart](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fllamastack_llama-stack_readme_ecc467d6e068.png)](https:\u002F\u002Fwww.star-history.com\u002F#meta-llama\u002Fllama-stack&Date)\n\nThanks to all our amazing 
contributors!\n\n\u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fmeta-llama\u002Fllama-stack\u002Fgraphs\u002Fcontributors\">\n  \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fllamastack_llama-stack_readme_5b97e69f1f20.png\" alt=\"Llama Stack contributors\" \u002F>\n\u003C\u002Fa>\n","# Llama Stack\n\n[![PyPI version](https:\u002F\u002Fimg.shields.io\u002Fpypi\u002Fv\u002Fllama_stack.svg)](https:\u002F\u002Fpypi.org\u002Fproject\u002Fllama_stack\u002F)\n[![PyPI - Downloads](https:\u002F\u002Fimg.shields.io\u002Fpypi\u002Fdm\u002Fllama-stack)](https:\u002F\u002Fpypi.org\u002Fproject\u002Fllama-stack\u002F)\n[![Docker Hub - Pulls](https:\u002F\u002Fimg.shields.io\u002Fdocker\u002Fpulls\u002Fllamastack\u002Fdistribution-starter)](https:\u002F\u002Fhub.docker.com\u002Fu\u002Fllamastack)\n[![License](https:\u002F\u002Fimg.shields.io\u002Fpypi\u002Fl\u002Fllama_stack.svg)](https:\u002F\u002Fgithub.com\u002Fmeta-llama\u002Fllama-stack\u002Fblob\u002Fmain\u002FLICENSE)\n[![Discord](https:\u002F\u002Fimg.shields.io\u002Fdiscord\u002F1257833999603335178?color=6A7EC2&logo=discord&logoColor=ffffff)](https:\u002F\u002Fdiscord.gg\u002Fllama-stack)\n[![Unit Tests](https:\u002F\u002Fgithub.com\u002Fmeta-llama\u002Fllama-stack\u002Factions\u002Fworkflows\u002Funit-tests.yml\u002Fbadge.svg?branch=main)](https:\u002F\u002Fgithub.com\u002Fmeta-llama\u002Fllama-stack\u002Factions\u002Fworkflows\u002Funit-tests.yml?query=branch%3Amain)\n[![Integration Tests](https:\u002F\u002Fgithub.com\u002Fmeta-llama\u002Fllama-stack\u002Factions\u002Fworkflows\u002Fintegration-tests.yml\u002Fbadge.svg?branch=main)](https:\u002F\u002Fgithub.com\u002Fmeta-llama\u002Fllama-stack\u002Factions\u002Fworkflows\u002Fintegration-tests.yml?query=branch%3Amain)\n\n[**快速开始**](https:\u002F\u002Fllamastack.github.io\u002Fdocs\u002Fgetting_started\u002Fquickstart) | [**文档**](https:\u002F\u002Fllamastack.github.io\u002Fdocs) | [**OpenAI API 
兼容性**](https:\u002F\u002Fllamastack.github.io\u002Fdocs\u002Fapi-openai) | [**Discord**](https:\u002F\u002Fdiscord.gg\u002Fllama-stack)\n\n**用于构建 AI 应用的开源智能体（Agentic）API（应用程序接口）服务器。兼容 OpenAI。支持任意模型，任意基础设施。**\n\nLlama Stack 是 OpenAI API 的即用型替代品，您可以在任何地方运行它——您的笔记本电脑、数据中心或云端。使用任何兼容 OpenAI 的客户端或智能体框架。无需更改应用程序代码，即可在 Llama、GPT、Gemini、Mistral 或任何模型之间切换。\n\n```python\nfrom openai import OpenAI\n\nclient = OpenAI(base_url=\"http:\u002F\u002Flocalhost:8321\u002Fv1\", api_key=\"fake\")\nresponse = client.chat.completions.create(\n    model=\"llama-3.3-70b\",\n    messages=[{\"role\": \"user\", \"content\": \"Hello\"}],\n)\n```\n\n## 您将获得什么\n\n- **聊天补全与嵌入向量（Embeddings）** — 标准 `\u002Fv1\u002Fchat\u002Fcompletions`、`\u002Fv1\u002Fcompletions` 和 `\u002Fv1\u002Fembeddings` 端点，兼容任何 OpenAI 客户端\n- **Responses API（响应 API）** — 服务器端智能体编排，包含工具调用、MCP（Model Context Protocol）服务器集成以及内置文件搜索（检索增强生成（RAG）），仅需单次 API 调用 ([了解更多](https:\u002F\u002Fllamastack.github.io\u002Fdocs\u002Fapi-openai))\n- **向量存储（Vector Stores）与文件** — `\u002Fv1\u002Fvector_stores` 和 `\u002Fv1\u002Ffiles` 用于托管文档存储和搜索\n- **批处理（Batches）** — `\u002Fv1\u002Fbatches` 用于离线批处理\n- 符合 **[Open Responses](https:\u002F\u002Fwww.openresponses.org\u002F)** 标准 — Responses API 实现通过了 Open Responses 一致性测试套件\n\n## 使用任意模型，使用任意基础设施\n\nLlama Stack 拥有可插拔的提供者（Provider）架构。本地开发使用 Ollama，生产部署使用 vLLM，或连接托管服务——API 保持不变。\n\n```text\n┌─────────────────────────────────────────────────────────────────────────┐\n│                          Llama Stack Server                             │\n│               (same API, same code, any environment)                    │\n│                                                                         │\n│  \u002Fv1\u002Fchat\u002Fcompletions  \u002Fv1\u002Fresponses  \u002Fv1\u002Fvector_stores  \u002Fv1\u002Ffiles      │\n│  \u002Fv1\u002Fembeddings        \u002Fv1\u002Fbatches    \u002Fv1\u002Fmodels         \u002Fv1\u002Fconnectors │\n├───────────────────┬──────────────────┬──────────────────────────────────┤\n│  
Inference        │  Vector stores   │  Tools & connectors              │\n│    Ollama         │    FAISS         │    MCP servers                   │\n│    vLLM, TGI      │    Milvus        │    Brave, Tavily (web search)    │\n│    AWS Bedrock    │    Qdrant        │    File search (built-in RAG)    │\n│    Azure OpenAI   │    PGVector      │                                  │\n│    Fireworks      │    ChromaDB      │  File storage & processing       │\n│    Together       │    Weaviate      │    Local filesystem, S3          │\n│    ...15+ more    │    Elasticsearch │    PDF, HTML (file processors)   │\n│                   │    SQLite-vec    │                                  │\n└───────────────────┴──────────────────┴──────────────────────────────────┘\n```\n\n查看 [提供者文档](https:\u002F\u002Fllamastack.github.io\u002Fdocs\u002Fproviders) 获取完整列表。\n\n## 开始使用\n\n安装并运行 Llama Stack 服务器：\n\n```bash\n# One-line install\ncurl -LsSf https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fraw\u002Fmain\u002Fscripts\u002Finstall.sh | bash\n\n# Or install via uv\nuv pip install llama-stack\n\n# Start the server (uses the starter distribution with Ollama)\nllama stack run\n```\n\n然后使用任何 OpenAI 客户端连接——[Python](https:\u002F\u002Fgithub.com\u002Fopenai\u002Fopenai-python)、[TypeScript](https:\u002F\u002Fgithub.com\u002Fopenai\u002Fopenai-node)、[curl](https:\u002F\u002Fplatform.openai.com\u002Fdocs\u002Fapi-reference) 或任何使用 OpenAI API 的框架。\n\n查看 [快速开始指南](https:\u002F\u002Fllamastack.github.io\u002Fdocs\u002Fgetting_started\u002Fquickstart) 获取详细设置。\n\n## 资源\n\n- [文档](https:\u002F\u002Fllamastack.github.io\u002Fdocs) — 完整参考\n- [OpenAI API 兼容性](https:\u002F\u002Fllamastack.github.io\u002Fdocs\u002Fapi-openai) — 端点覆盖范围和提供者矩阵\n- [入门交互式笔记本（Notebook）](.\u002Fdocs\u002Fgetting_started.ipynb) — 文本和视觉推理演练\n- [贡献指南](CONTRIBUTING.md) — 如何贡献\n\n**客户端软件开发工具包（SDK）：**\n\n|  语言 |  SDK | 包 |\n| :----: | :----: | :----: |\n| Python |  
[llama-stack-client-python](https:\u002F\u002Fgithub.com\u002Fmeta-llama\u002Fllama-stack-client-python) | [![PyPI version](https:\u002F\u002Fimg.shields.io\u002Fpypi\u002Fv\u002Fllama_stack_client.svg)](https:\u002F\u002Fpypi.org\u002Fproject\u002Fllama_stack_client\u002F) |\n| TypeScript   | [llama-stack-client-typescript](https:\u002F\u002Fgithub.com\u002Fmeta-llama\u002Fllama-stack-client-typescript) | [![NPM version](https:\u002F\u002Fimg.shields.io\u002Fnpm\u002Fv\u002Fllama-stack-client.svg)](https:\u002F\u002Fnpmjs.org\u002Fpackage\u002Fllama-stack-client) |\n\n## 社区\n\n我们每周四太平洋标准时间上午 09:00 举行定期社区会议——详见 [Discord 社区活动](https:\u002F\u002Fdiscord.com\u002Fevents\u002F1257833999603335178\u002F1413266296748900513)。\n\n[![Star History Chart](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fllamastack_llama-stack_readme_ecc467d6e068.png)](https:\u002F\u002Fwww.star-history.com\u002F#meta-llama\u002Fllama-stack&Date)\n\n感谢所有出色的贡献者！\n\n\u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fmeta-llama\u002Fllama-stack\u002Fgraphs\u002Fcontributors\">\n  \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fllamastack_llama-stack_readme_5b97e69f1f20.png\" alt=\"Llama Stack contributors\" \u002F>\n\u003C\u002Fa>","# Llama Stack 快速上手指南\n\nLlama Stack 是一个开源的代理 API 服务器，完全兼容 OpenAI API。它支持在任何环境（本地、数据中心或云端）运行，允许你在不更改应用代码的情况下无缝切换 Llama、GPT、Gemini 等模型。\n\n## 环境准备\n\n- **操作系统**：Linux、macOS 或 Windows（Windows 推荐使用 WSL）\n- **运行环境**：Python 环境\n- **依赖工具**：`curl`（用于一键安装）或 `uv`（用于包管理）\n- **模型后端**：默认启动配置依赖 Ollama（也可配置 vLLM、AWS Bedrock 等其他提供者）\n\n## 安装步骤\n\n选择以下任一方式安装 Llama Stack：\n\n**方式一：一键安装脚本**\n\n```bash\ncurl -LsSf https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fraw\u002Fmain\u002Fscripts\u002Finstall.sh | bash\n```\n\n**方式二：使用 uv 安装**\n\n```bash\nuv pip install llama-stack\n```\n\n**启动服务器**\n\n安装完成后，运行以下命令启动服务器（默认使用带有 Ollama 的 starter 分布）：\n\n```bash\nllama stack run\n```\n\n## 基本使用\n\nLlama Stack 启动后，可使用任何兼容 OpenAI 的客户端进行连接。默认接口地址为 
`http:\u002F\u002Flocalhost:8321\u002Fv1`。\n\n**Python 调用示例**\n\n使用官方 `openai` 库即可直接调用：\n\n```python\nfrom openai import OpenAI\n\nclient = OpenAI(base_url=\"http:\u002F\u002Flocalhost:8321\u002Fv1\", api_key=\"fake\")\nresponse = client.chat.completions.create(\n    model=\"llama-3.3-70b\",\n    messages=[{\"role\": \"user\", \"content\": \"Hello\"}],\n)\n```\n\n**核心功能**\n- **标准接口**：支持 `\u002Fv1\u002Fchat\u002Fcompletions`、`\u002Fv1\u002Fembeddings` 等\n- **代理编排**：Responses API 支持工具调用、MCP 集成及内置文件搜索（RAG）\n- **数据存储**：支持 `\u002Fv1\u002Fvector_stores` 和 `\u002Fv1\u002Ffiles` 进行文档管理与检索\n\n更多详细配置与提供者列表请参考 [官方文档](https:\u002F\u002Fllamastack.github.io\u002Fdocs)。","某金融科技团队正在构建一款智能投顾助手，需要在本地隐私环境下调试模型，同时在生产环境调用高性能云端 API，并具备文档检索能力。\n\n### 没有 llama-stack 时\n- 代码深度耦合特定厂商 SDK，若从 GPT 切换至开源 Llama 模型，需重构大量请求逻辑。\n- 本地开发使用 Ollama，生产部署使用 AWS Bedrock，接口差异导致环境配置繁琐且易出错。\n- 实现 RAG 功能需自行搭建向量数据库并编写文件解析管道，开发周期长达数周。\n- 缺乏统一的标准来集成外部工具，每次新增联网搜索功能都要单独适配 API。\n\n### 使用 llama-stack 后\n- 基于 OpenAI 兼容接口，只需修改配置即可在 Llama、GPT 或 Mistral 间无缝切换，代码零改动。\n- 本地与生产环境沿用同一套 API 标准，从笔记本调试到数据中心部署无需调整代码。\n- 直接调用内置的 `\u002Fv1\u002Fvector_stores` 接口，自动处理文件存储与检索，RAG 功能即刻可用。\n- 通过标准化 MCP 集成，快速连接 Brave 搜索等外部工具，大幅缩短智能体功能开发时间。\n\nllama-stack 通过屏蔽底层基础设施差异，让团队能灵活选择模型与环境，显著提升 AI 应用交付效率。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fllamastack_llama-stack_fe0b1243.png","llamastack","Llama Stack","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002Fllamastack_aeead1b7.png","Building blocks for an AI 
Platform",null,"https:\u002F\u002Fgithub.com\u002Fllamastack",[82,86,90,94,98,102,106,109,112,115],{"name":83,"color":84,"percentage":85},"Python","#3572A5",83.3,{"name":87,"color":88,"percentage":89},"TypeScript","#3178c6",10.8,{"name":91,"color":92,"percentage":93},"Mustache","#724b3b",3.7,{"name":95,"color":96,"percentage":97},"Shell","#89e051",1.6,{"name":99,"color":100,"percentage":101},"Swift","#F05138",0.2,{"name":103,"color":104,"percentage":105},"Dockerfile","#384d54",0.1,{"name":107,"color":108,"percentage":105},"JavaScript","#f1e05a",{"name":110,"color":111,"percentage":105},"Makefile","#427819",{"name":113,"color":114,"percentage":105},"CSS","#663399",{"name":116,"color":117,"percentage":118},"Objective-C","#438eff",0,8308,1296,"2026-04-05T22:33:10","MIT","未说明",{"notes":125,"python":123,"dependencies":126},"该工具为开源代理 API 服务器，可替代 OpenAI API，支持多种推理后端（Ollama, vLLM, AWS Bedrock 等）及向量存储。安装可通过 curl 脚本或 uv pip。本地运行默认使用带 Ollama 的 starter 分布。具体硬件需求取决于所选后端提供商，支持 Docker 部署。",[67],[26,13,15],"2026-03-27T02:49:30.150509","2026-04-06T08:09:07.463855",[131,136,140,145,149,154],{"id":132,"question_zh":133,"answer_zh":134,"source_url":135},4597,"Llama Stack 的 CLI 构建和运行命令在最新版本中有什么变化？","在 v0.3.0 版本中，`llama stack build` 命令已被移除。现在请使用 `llama stack build-container` 来构建容器镜像，并使用 `llama stack run` 来直接运行服务器（替代了之前的 `build --run` 参数），以改善用户体验。","https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fissues\u002F2878",{"id":137,"question_zh":138,"answer_zh":139,"source_url":135},4598,"如何查看或列出 Llama Stack 发行版的依赖项？","目前推荐使用 `list-deps` 命令。维护者表示这比 `show` 命令能提供更好的用户体验，特别是便于使用 `jq` 提取依赖并管道传输到安装方法。未来可能会保留该命令或将其与 `show` 合并。",{"id":141,"question_zh":142,"answer_zh":143,"source_url":144},4599,"支持在不重启服务器的情况下动态管理提供者（Provider）连接吗？","目前不支持。提供者必须在 `run.yaml` 文件中配置，任何更改都需要重启服务器。动态管理（通过 API 注册、更新、移除提供者）已在规划中（参考 issue 
#4163），未来将支持热重载和持久化。","https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fissues\u002F3809",{"id":146,"question_zh":147,"answer_zh":148,"source_url":144},4600,"Llama Stack 支持哪些多租户部署场景？","主要支持两种场景：(1) PaaS 场景：每个租户独立运行一个 Llama Stack 实例；(2) SaaS 场景：多个租户共享同一个服务器实例。在 SaaS 场景中，配置（如模型、提供者）可被视为运行时状态而非静态配置。",{"id":150,"question_zh":151,"answer_zh":152,"source_url":153},4601,"Llama Stack 是否支持 Arm 架构或 Apple Silicon？","支持。社区已确认 Llama Stack 支持 Arm 架构及 Apple Silicon。","https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fissues\u002F6",{"id":155,"question_zh":156,"answer_zh":157,"source_url":158},4602,"如何为 vector-io 提供者配置默认的嵌入模型（Embedding Model）？","建议在 `run.yaml` 的 vector-io 提供者配置中指定 `embedding_model` 和 `embedding_dimension`。若未指定，系统将默认使用 `run.yaml` 中出现的第一个嵌入模型，这可能导致重启后行为不一致。","https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fissues\u002F2729",[160,165,170,175,180,185,190,195,200,205,210,215,220,225,230,235,240,245,250,255],{"id":161,"version":162,"summary_zh":163,"released_at":164},113743,"v0.7.0","## What's Changed\r\n* fix: exclude informational checks from ci-status aggregation by @leseb in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F5105\r\n* feat: add Responses API test coverage analyzer and conformance annotations by @leseb in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F5101\r\n* refactor!: remove fine_tuning API by @leseb in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F5104\r\n* fix!: remove duplicate dataset_id parameter in append-rows endpoint by @eoinfennessy in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4849\r\n* fix: Multi-worker cache synchronization for vector stores by @elinacse in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F5076\r\n* feat: Add integration test for service_tier with openai client by @gyliu513 in 
https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F5103\r\n* feat: test responses API integration tests against Azure AI Foundry by @iamemilio in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F5107\r\n* fix(security): add path traversal and header injection defenses by @rhdedgar in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F5086\r\n* feat!: Part 2 - implement inline neural rerank for RAG by @r3v5 in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4877\r\n* feat: add provider compatibility matrix for Responses API by @leseb in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F5113\r\n* perf: lazy-load braintrust autoevals to reduce idle memory (~63MB) by @leseb in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F5078\r\n* feat: add provider version tracking to compatibility matrix by @leseb in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F5115\r\n* perf: lazy-load torch in embedding_mixin to reduce startup memory by @leseb in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F5116\r\n* perf: lazy-load torch and transformers in prompt_guard by @leseb in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F5117\r\n* perf: lazy-load numpy, faiss, and sqlite_vec in vector_io providers by @leseb in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F5118\r\n* fix(CI): reduce Mergify PR update frequency by @gyliu513 in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F5106\r\n* feat: Add support for filters in PGVector and replace f-string usage in table name by @franciscojavierarceo in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F5111\r\n* fix: bump pyjwt to 2.12.0 (CVE-2026-32597) by @eoinfennessy in 
https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F5127\r\n* fix(inference): improve chat completions OpenAI conformance by @cdoern in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F5108\r\n* fix(storage): resolve asyncio event loop mismatch via operation deferral by @derekhiggins in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F5130\r\n* fix(ci): use RELEASE_PAT and PRs in post-release workflow by @cdoern in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F5132\r\n* chore: bump fallback_version to 0.6.1.dev0 by @cdoern in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F5136\r\n* fix: remove UV_EXTRA_INDEX_URL from Release branch ci by @cdoern in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F5138\r\n* fix(ci): add uv lock to post-release workflow to update stale lockfile by @cdoern in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F5139\r\n* chore(github-deps): bump stainless-api\u002Fupload-openapi-spec-action from 1.11.6 to 1.13.0 by @dependabot[bot] in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F5148\r\n* chore(github-deps): bump docker\u002Fsetup-buildx-action from 3.12.0 to 4.0.0 by @dependabot[bot] in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F5142\r\n* chore(github-deps): bump astral-sh\u002Fsetup-uv from 7.3.1 to 7.5.0 by @dependabot[bot] in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F5143\r\n* feat(blog): Agentic flows tutorial by @raghotham in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F5035\r\n* chore(github-deps): bump docker\u002Flogin-action from 3.7.0 to 4.0.0 by @dependabot[bot] in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F5146\r\n* chore(github-deps): bump llamastack\u002Fllama-stack 
from ce063acfe127393537cb0a5deb29cd20063c76af to 2157c0903ed748e95f41a2fc3b3a75cb4c469a40 by @dependabot[bot] in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F5145\r\n* feat: Add OpenAI client integration test for top_logprobs by @gyliu513 in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F5124\r\n* ci(mergify): skip conflict comments on stale PRs by @leseb in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F5156\r\n* feat: Add stream_options parameter support by @gyliu513 in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4815\r\n* feat: promote connector API from v1alpha to v1beta by @leseb in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F5129\r\n* refactor: replace LiteLLM with OpenAI mixin for WatsonX provider by @leseb in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F5133\r\n* fix: optimize connector listing by @gyliu513 in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F5164\r\n* feat: Add OpenAI client integration test for incomplete_details by @gyliu513 in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F5157\r\n* refactor!: rename meta-reference providers to builtin by @leseb in ","2026-04-01T20:52:28",{"id":166,"version":167,"summary_zh":168,"released_at":169},113744,"v0.6.1","## What's Changed\r\n* fix: remove UV_EXTRA_INDEX_URL from Release branch ci (backport #5138) by @mergify[bot] in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F5140\r\n* chore: update llama-stack-client to ^0.6.0 in UI lockfile by @cdoern in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F5137\r\n* fix(storage): resolve asyncio event loop mismatch via operation deferral (#5130) by @derekhiggins in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F5135\r\n* 
feat(blog): Agentic flows tutorial (backport #5035) by @mergify[bot] in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F5167\r\n* fix: milvus hybrid ranker usage (backport #5312) by @mergify[bot] in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F5368\r\n\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fcompare\u002Fv0.6.0...v0.6.1","2026-03-30T13:09:35",{"id":171,"version":172,"summary_zh":173,"released_at":174},113745,"v0.6.0","## What's Changed\r\n* chore: update convert_tooldef_to_openai_tool to match its usage by @mattf in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4837\r\n* feat!: improve consistency of post-training API endpoints by @eoinfennessy in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4606\r\n* fix: Arbitrary file write via a non-default configuration by @VaishnaviHire in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4844\r\n* chore: reduce uses of models.llama.datatypes by @mattf in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4847\r\n* docs: add technical release steps and improvements to RELEASE_PROCESS.md by @cdoern in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4792\r\n* chore: bump fallback version to 0.5.1 by @cdoern in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4846\r\n* fix: Exclude null 'strict' field in function tools to prevent OpenAI … by @gyliu513 in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4795\r\n* chore(test): add test to verify responses params make it to backend service by @mattf in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4850\r\n* chore: revert \"fix: disable together banner (#4517)\" by @mattf in 
https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4856\r\n* fix: update together to work with latest api.together.xyz service (circa feb 2026) by @mattf in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4857\r\n* chore(github-deps): bump astral-sh\u002Fsetup-uv from 7.2.0 to 7.3.0 by @dependabot[bot] in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4867\r\n* chore(github-deps): bump github\u002Fcodeql-action from 4.32.0 to 4.32.2 by @dependabot[bot] in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4861\r\n* chore(github-deps): bump actions\u002Fcache from 5.0.2 to 5.0.3 by @dependabot[bot] in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4859\r\n* chore(github-deps): bump llamastack\u002Fllama-stack from 76bcb6657de312160c726fbe069275cd5537b702 to c518b35a65f8bd1370c938c688dfb2e2a00cceab by @dependabot[bot] in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4858\r\n* fix(ci): ensure oasdiff is available for openai-coverage hook by @EleanorWho in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4835\r\n* fix: Deprecate items when create conversation by @gyliu513 in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4765\r\n* chore: refactor chunking to use configurable tiktoken encoding and document tokenizer limits by @mattf in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4870\r\n* chore: prune unused parts of models packages (checkpoint, tokenizer, prompt templates, datatypes) by @mattf in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4871\r\n* chore: prune unused utils from utils.memory.vector_store by @mattf in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4873\r\n* fix: Escape special characters in auto-generated provider documentati… by 
@gyliu513 in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4822\r\n* chore(docs): Use starter for opentelemetry integration test by @gyliu513 in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4875\r\n* fix: kvstore should call shutdown but not close by @gyliu513 in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4872\r\n* fix: uvicorn log ambiguity by @cdoern in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4522\r\n* chore(github-deps): bump actions\u002Fcheckout from 4.2.2 to 6.0.2 by @dependabot[bot] in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4865\r\n* chore: cleanup mypy excludes by @mattf in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4876\r\n* feat: add integration test for max_output_tokens by @gyliu513 in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4825\r\n* chore(test): add test to verify responses params make it to backend s… by @gyliu513 in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4852\r\n* ci: add Docker image publishing to release workflow by @cdoern in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4882\r\n* feat: add ProcessFileRequest model to file_processors API by @alinaryan in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4885\r\n* docs: update responses api known limitations doc by @jaideepr97 in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4845\r\n* fix(vector_io): align Protocol signatures with request models by @skamenan7 in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4747\r\n* fix: add _ExceptionTranslatingRoute to prevent keep-alive breakage on Linux by @iamemilio in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4886\r\n* docs: add 
release notes for version 0.5 by @rhuss in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4855\r\n* fix(ci): disable uv cache cleanup when UV_NO_CACHE is set by @cdoern in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4889\r\n* feat: Add truncation parameter support by @gyliu513 in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4813\r\n* chore(ci): bump pinned action commit hashes in integration-tests.yml by @cdoern in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4895\r\n* docs: Add README for running observability test by @gyliu513 in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4884\r\n* fix: update rerank routing to match params by @mattf in https:\u002F\u002Fgithub.","2026-03-11T15:01:41",{"id":176,"version":177,"summary_zh":178,"released_at":179},113746,"v0.5.2","## What's Changed\r\n* chore: bump llama-stack-client to 0.5.1 by @cdoern in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4957\r\n* ci: add arm64 image manifest publishing to release workflow by @rhdedgar in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F5006\r\n* feat(ci): automate post-release and pre-release version management (backport #4938) by @mergify[bot] in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F5032\r\n* fix(llama-guard): less strict parsing of safety categories (backport #5045) by @mergify[bot] in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F5053\r\n* fix: OCI26ai sql query patches (backport #5046) by @mergify[bot] in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F5054\r\n\r\n\r\n**Full Changelog**: 
https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fcompare\u002Fv0.5.1...v0.5.2","2026-03-06T13:21:59",{"id":181,"version":182,"summary_zh":183,"released_at":184},113747,"v0.5.1","## What's Changed\r\n* fix: [release-0.5.x] Arbitrary file write via a non-default configuration (#4844) by @VaishnaviHire in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4869\r\n* fix(vertexai): raise descriptive error on auth failure instead of silent empty string (backport #4909) by @mergify[bot] in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4923\r\n* fix: resolve StorageConfig default env vars at construction time (backport #4897) by @mergify[bot] in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4924\r\n* feat: add opentelemetry-distro to core dependencies (backport #4935) by @mergify[bot] in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4943\r\n* fix(vector_io): eliminate duplicate call for vector store registration (backport #4925) by @mergify[bot] in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4941\r\n* chore: bump version to 0.5.1 for release by @cdoern in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4955\r\n\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fcompare\u002Fv0.5.0...v0.5.1","2026-02-19T19:01:18",{"id":186,"version":187,"summary_zh":188,"released_at":189},113748,"v0.4.5","## What's Changed\r\n* chore: bump llama-stack-client to 0.4.4 in UI lockfile by @cdoern in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4791\r\n* fix: MCP CPU spike by using context manager for session cleanup by @derekhiggins in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4851\r\n* fix(vector_io): eliminate duplicate call for vector store registration (backport #4925) by @mergify[bot] 
in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4944\r\n* chore: bump version to 0.4.5 for release by @cdoern in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4954\r\n\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fcompare\u002Fv0.4.4...v0.4.5","2026-02-19T18:55:49",{"id":191,"version":192,"summary_zh":193,"released_at":194},113749,"v0.5.0","## What's Changed\r\n* docs: Added a new oci-llamastack notebook for how to build agents with OCI and llama stack by @omaryashraf5 in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4418\r\n* docs: Add guide to migrating from Agents to Responses by @jwm4 in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4375\r\n* feat: convert models API to use a FastAPI router by @nathan-weinberg in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4407\r\n* chore: update mcp dependency constraint to >=1.23.0 by @derekhiggins in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4457\r\n* feat(ci): added codeql scanning workflow by @gmatuz in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4462\r\n* feat: migrate Conversations API to FastAPI router by @leseb in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4342\r\n* fix(faiss): add backward compatibility for EmbeddedChunk deserialization by @leseb in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4463\r\n* fix: Removed duplicate parameters from integration test by @gyliu513 in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4461\r\n* fix: removed scan on push by @gmatuz in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4466\r\n* chore: Document release process by @raghotham in 
https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4470\r\n* feat: build ARM64-based UBI starter image by @rhdedgar in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4474\r\n* chore: add \"Discussion\" issue template by @nathan-weinberg in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4469\r\n* chore: Updated test integration guide by @gyliu513 in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4460\r\n* fix: update `CONTRIBUTING.md` to reflect pre-commit version used in CI by @eoinfennessy in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4468\r\n* fix: skip resources with empty IDs from conditional env vars in config processing by @Elbehery in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4455\r\n* fix: Fix Vector Store Integration Tests by @franciscojavierarceo in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4472\r\n* chore: Delete CHANGELOG.md by @terrytangyuan in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4480\r\n* ci: run ARM64 builds on nightly schedule only by @rhdedgar in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4479\r\n* chore(github-deps): bump actions\u002Fcheckout from 4.3.1 to 6.0.1 by @dependabot[bot] in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4491\r\n* chore(github-deps): bump astral-sh\u002Fsetup-uv from 7.1.6 to 7.2.0 by @dependabot[bot] in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4490\r\n* chore(github-deps): bump docker\u002Fsetup-qemu-action from 3.2.0 to 3.7.0 by @dependabot[bot] in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4489\r\n* chore(github-deps): bump github\u002Fcodeql-action from 3.31.9 to 4.31.9 by @dependabot[bot] in 
https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4488\r\n* chore(github-deps): bump stainless-api\u002Fupload-openapi-spec-action from 1.9.0 to 1.10.0 by @dependabot[bot] in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4487\r\n* chore: Add backwards compatibility for Milvus Chunks by @franciscojavierarceo in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4484\r\n* fix: aiohttp HTTP Parser auto_decompress feature susceptible to zip bomb by @leseb in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4494\r\n* chore: Add backwards compatibility for qdrant chunks by @Ygnas in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4495\r\n* chore: Updated CONTRIBUTING guidance for integration test by @gyliu513 in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4459\r\n* fix: fonttools security advisory by @leseb in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4503\r\n* chore: Add backwards compatibility for pgvector chunks by @Ygnas in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4506\r\n* refactor!: change image_name to distro_name in StackConfig by @cdoern in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4396\r\n* fix: Add backwards compatibility for sqlite-vec, chroma, and weaviate chunks by @ChristianZaccaria in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4502\r\n* fix: urllib3 vulnerable to decompression-bomb safeguard bypass by @leseb in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4512\r\n* fix: disable together banner by @cdoern in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4517\r\n* chore: switch to monthly minor release by @leseb in 
https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4518\r\n* chore: change discussion template label by @nathan-weinberg in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4525\r\n* chore: add maintenance policy to release doc by @leseb in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4514\r\n* docs: fixed outdated links for api overview, routed to the updated links by @lalexandrh in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4524\r\n* chore: upgrade virtualenv by @raghotham in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4585\r\n* chore: resync client dep with main by @leseb in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4591\r\n* fix: llama-stack-api packaging by @cdoern in https:\u002F\u002Fgithub.com\u002Fllamast","2026-02-05T17:20:42",{"id":196,"version":197,"summary_zh":198,"released_at":199},113750,"v0.4.4","## What's Changed\r\n* fix: Enable session polling during streaming responses (backport #4738) by @mergify[bot] in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4756\r\n* feat: add scheduled CI workflow for release branches (backport #4510) by @mergify[bot] in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4769\r\n* fix: make release-branch-scheduled-ci compatible with older branches (backport #4753) by @mergify[bot] in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4767\r\n* fix: pass branch explicitly to install-llama-stack-client action (backport #4759) by @mergify[bot] in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4763\r\n* fix: llama-stack-api packaging by @cdoern in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4777\r\n* feat(ci): unify PyPI\u002Fnpm release workflow with dry-run support (backport #4774) by 
@mergify[bot] in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4785\r\n* fix: install setuptools-scm in CI (backport #4782) by @mergify[bot] in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4786\r\n* build: bump llama-stack-client to 0.4.4 for release by @cdoern in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4787\r\n* fix: override version from release tag for all packages (backport #4788) by @mergify[bot] in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4789\r\n\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fcompare\u002Fv0.4.3...v0.4.4","2026-01-30T16:25:49",{"id":201,"version":202,"summary_zh":203,"released_at":204},113751,"v0.4.3","## What's Changed\r\n* fix: enable vector store registration from config with OpenAI metadata (backport #4616) by @mergify[bot] in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4631\r\n* fix: Fix redundant MCP tools\u002Flist calls (backport #4634) by @mergify[bot] in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4663\r\n* fix: file_search_call results missing document attributes\u002Fmetadata (backport #4680) by @mergify[bot] in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4686\r\n* fix: Concurrent calls into SentenceTransformer() cause failures of client.vector_stores.file_batches.create() (backport #4636) by @mergify[bot] in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4698\r\n* feat: Add shutdown functionality to LlamaStackAsLibraryClient and AsyncLlamaStackAsLibraryClient (backport #4642) by @mergify[bot] in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4733\r\n* feat(PGVector): implement automatic creation of vector extension during initialization of PGVectorVectorIOAdapter (backport #4660) by 
@mergify[bot] in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4740\r\n\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fcompare\u002Fv0.4.2...v0.4.3","2026-01-26T21:51:10",{"id":206,"version":207,"summary_zh":208,"released_at":209},113752,"v0.4.2","## What's Changed\r\n* fix: disable together banner (backport #4517) by @mergify[bot] in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4519\r\n* fix: llama-stack-api packaging (backport #4593) by @mergify[bot] in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4596\r\n* fix(memory\u002Frag): remove file:\u002F\u002F uri prefix (backport #4286) by @mergify[bot] in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4603\r\n* fix: benchmark registration via registered_resources config (backport #4600) by @mergify[bot] in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4604\r\n\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fcompare\u002Fv0.4.1...v0.4.2","2026-01-16T14:44:59",{"id":211,"version":212,"summary_zh":213,"released_at":214},113753,"v0.4.1","## What's Changed\r\n* fix(faiss): add backward compatibility for EmbeddedChunk deserialization (backport #4463) by @mergify[bot] in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4464\r\n* fix: skip resources with empty IDs from conditional env vars in config processing (backport #4455) by @mergify[bot] in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4475\r\n* chore: Add backwards compatibility for Milvus Chunks (backport #4484) by @mergify[bot] in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4493\r\n* fix: Fix Vector Store Integration Tests (backport #4472) by @mergify[bot] in 
https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4481\r\n* chore: Add backwards compatibility for qdrant chunks (backport #4495) by @mergify[bot] in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4499\r\n* fix: aiohttp HTTP Parser auto_decompress feature susceptible to zip bomb (backport #4494) by @mergify[bot] in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4498\r\n* fix: fonttools security advisory (backport #4503) by @mergify[bot] in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4504\r\n* chore: Add backwards compatibility for pgvector chunks (backport #4506) by @mergify[bot] in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4507\r\n* fix: Add backwards compatibility for sqlite-vec, chroma, and weaviate chunks (backport #4502) by @mergify[bot] in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4513\r\n* fix: urllib3 vulnerable to decompression-bomb safeguard bypass (backport #4512) by @mergify[bot] in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4515\r\n\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fcompare\u002Fv0.4.0...v0.4.1","2026-01-13T16:04:22",{"id":216,"version":217,"summary_zh":218,"released_at":219},113754,"v0.4.0","# llama-stack v0.4.0 \r\n\r\n## 🎯 Notable Features in llama-stack v0.4.0\r\n\r\n### ⚠️ Breaking Changes + Noteworthy Functional Changes to be Aware of\r\n\r\n####  Configuration Files\r\n\r\n  - StackRunConfig → StackConfig: Configuration class renamed\r\n  - run.yaml → config.yaml: Runtime configuration file renamed\r\n  - build.yaml removed: Build configuration consolidated into config.yaml\r\n  - New required config sections: vector_stores, safety, and structured storage.stores configuration\r\n\r\n#### Vector Store API\r\n\r\n  - vector_db_id → vector_store_id: Renamed across all 
endpoints and data structures (#3923)\r\n  - ChunkMetadata now required: No longer optional in Chunk class (#4413)\r\n  - Embeddings refactored: Removed from base Chunk class, added new EmbeddedChunk class (#4413)\r\n  - Removed chunk_id property: Dropped from Chunk class (#3954)\r\n  - Search response structure changed: Updated return format (#4080)\r\n\r\n #### API Removals\r\n\r\n  - Agents API removed: Sessions and turns endpoints deleted—use Responses + Conversations instead (#4055)\r\n  - SDG API Stubs removed: Synthetic data generation endpoints deleted (#4035)\r\n  - \u002Fv1\u002Fopenai\u002Fv1\u002F* routes removed: Use \u002Fv1\u002F* routes instead (#4054)\r\n  - Deprecated v1 routes with v1alpha equivalents were removed (#4054)\r\n  - Register\u002Funregister resources deprecated: Resource registration APIs deprecated (#4099)\r\n\r\n#### Provider Configuration\r\n\r\n  - Bedrock env variable: Token variable renamed to match AWS\u002Fboto3 conventions—update your environment (#4152)\r\n  - Inference base_url standardized: Unified configuration format across providers (#4177)\r\n\r\n#### API Behavior Changes\r\n\r\n  - Parallel tool calls: New parallel_tool_calls parameter affects execution flow (#4124)\r\n  - Logprobs parameter: Use include parameter instead of direct field (#4261)\r\n  - Telemetry architecture: Complete redesign around OpenTelemetry auto-instrumentation (#4127)\r\n\r\n### 🏗️ Architecture Improvements\r\n\r\n  - API\u002FProvider Separation: Split API and provider specs into separate llama-stack-api package for better modularity (#3895)\r\n  - FastAPI Migration: Converted multiple APIs to FastAPI router system (Files, Providers, Inspect, Datasets, Benchmarks) for improved performance and maintainability\r\n  - Inspect API Update: \u002Fv1\u002Finspect now only lists v1 APIs by default (#3948)\r\n\r\n### 🔍 Vector Store Enhancements\r\n\r\n  - Query Rewrite Support: Added query rewrite capabilities in vector_store.search (#4171)\r\n  - 
Qdrant Improvements: Hybrid and keyword search support (#4006)\r\n  - ChromaDB Enhancements: Keyword search and delete_chunk implementation (#3057)\r\n  - Persistence: Vector stores now persist across server restarts (#3977)\r\n  - Metadata & Embeddings: Return embeddings and metadata from vector store methods (#4046)\r\n\r\n### 🤖 Model & Inference\r\n\r\n  - Model Discovery: List available models via provider_data header (#3968, #3928)\r\n  - New Providers:\r\n    - OpenAI-compatible Bedrock provider (#3748)\r\n    - OCI GenAI service integration (#3876)\r\n    - OCI embeddings support (#4300)\r\n  - Standardized Configuration: Unified base_url for inference providers (#4177)\r\n\r\n### 📋 Responses API Improvements\r\n\r\n  - Parallel tool calls support (#4124)\r\n  - tool_choice parameter (#4106)\r\n  - max_tool_calls parameter (#4062)\r\n  - Logprobs via include parameter (#4261)\r\n\r\n### 🆕 New APIs\r\n\r\n  - Read-only Connectors API (#4258)\r\n  - File Processor API skeleton (#4113)\r\n  - Admin API\r\n    - Stack Administration: New \u002Fadmin API (v1alpha) for administrative operations (#4401)\r\n    - Endpoints: Provider management, health checks, version info, and route listing\r\n    - Deprecates: Standalone \u002Fproviders and \u002Finspect APIs (still functional for backward compatibility)\r\n\r\n### 🗑️ Deprecations & Removals\r\n\r\n  - Deprecated register\u002Funregister resource APIs (#4099)\r\n  - Removed Agents (sessions\u002Fturns) API (#4055)\r\n  - Removed SDG Stub API (#4035)\r\n\r\n### 🔒 Security Fixes\r\n\r\n  - Fixed RBAC bypass vulnerabilities in model access (#4270)\r\n  - Prevented ABAC bypass in vector store operations (#4394)\r\n  - JWT token redaction in logs (#4325)\r\n\r\n### 🛠️ Infrastructure & DX\r\n\r\n  - Multi-architecture builds with ARM compatibility (#4290)\r\n  - OpenTelemetry auto-instrumentation support (#4281)\r\n  - SQLite WAL mode to prevent database locking (#4048)\r\n  - File deletion permission enforcement 
(#4275)\r\n \r\n## What's Changed\r\n\r\n* feat: Adding Demo script  by @franciscojavierarceo in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F3870\r\n* chore: use --no-cache in Containerfile by @ehhuang in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F3884\r\n* feat: Add rerank models and rerank API change by @jiayin-nvidia in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F3831\r\n* fix(conversations)!: update Conversations API definitions (was: bump openai from 1.107.0 to 2.5.0) by @dependabot[bot] in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F3847\r\n* fix(logging): ensure logs go to stderr, loggers obey levels by @ashwinb in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F3885\r\n* fix(respon","2026-01-06T21:23:26",{"id":221,"version":222,"summary_zh":223,"released_at":224},113755,"v0.3.5","## What's Changed\r\n* chore(docs): Remove Llama 4 support details from README (backport #4178) by @mergify[bot] in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4323\r\n* fix(inference): respect table_name config in InferenceStore (backport #4371) by @mergify[bot] in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4372\r\n* fix: InferenceStore workers being cancelled on event loop change by @leseb in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4373\r\n\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fcompare\u002Fv0.3.4...v0.3.5","2025-12-15T14:41:05",{"id":226,"version":227,"summary_zh":228,"released_at":229},113756,"v0.3.4","## What's Changed\r\n* chore: bump starlette version (backport #4158) by @mergify[bot] in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4248\r\n* fix: uninitialised enable_write_queue by @leseb in 
https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4264\r\n* fix: Add policies to adapters (backport #4277) by @mergify[bot] in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4279\r\n* fix: Avoid model_limits KeyError (backport #4060) by @mergify[bot] in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4283\r\n* chore: bump mcp package version (backport #4287) by @mergify[bot] in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4288\r\n* fix: RBAC bypass vulnerabilities in model access (backport #4270) by @mergify[bot] in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4285\r\n\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fcompare\u002Fv0.3.3...v0.3.4","2025-12-03T19:05:47",{"id":231,"version":232,"summary_zh":233,"released_at":234},113757,"v0.3.3","## What's Changed\r\n* fix: allowed_models config did not filter models (backport #4030) by @mergify[bot] in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4223\r\n* fix: Vector store persistence across server restarts (backport #3977) by @mergify[bot] in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4225\r\n* fix: enable SQLite WAL mode to prevent database locking errors (backport #4048) by @mergify[bot] in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4226\r\n* fix(docs): fix glob vulnerability (backport #4193) by @mergify[bot] in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4227\r\n* fix: enforce allowed_models during inference requests (backport #4197) by @mergify[bot] in https:\u002F\u002Fgithub.com\u002Fllamastack\u002Fllama-stack\u002Fpull\u002F4228\r\n* fix: update hard-coded google model names (backport #4212) by @mergify[bot] in 
https://github.com/llamastack/llama-stack/pull/4229

**Full Changelog**: https://github.com/llamastack/llama-stack/compare/v0.3.2...v0.3.3

_Released 2025-11-24_

# v0.3.2

## What's Changed
* fix: only set UV_INDEX_STRATEGY when UV_EXTRA_INDEX_URL is present by @ashwinb in https://github.com/llamastack/llama-stack/pull/4017
* fix(ci): export UV_INDEX_STRATEGY to current shell before running uv sync by @ashwinb in https://github.com/llamastack/llama-stack/pull/4019
* fix: print help for list-deps if no args (backport #4078) by @mergify[bot] in https://github.com/llamastack/llama-stack/pull/4083
* docs: use 'uv pip' to avoid pitfalls of using 'pip' in virtual environment (backport #4122) by @mergify[bot] in https://github.com/llamastack/llama-stack/pull/4136
* docs: clarify model identification uses provider_model_id not model_id (backport #4128) by @mergify[bot] in https://github.com/llamastack/llama-stack/pull/4137
* chore(ci): remove unused recordings (backport #4074) by @mergify[bot] in https://github.com/llamastack/llama-stack/pull/4141
* fix: harden storage semantics (backport #4118) by @mergify[bot] in https://github.com/llamastack/llama-stack/pull/4138
* fix(inference): enable routing of models with provider_data alone (backport #3928) by @mergify[bot] in https://github.com/llamastack/llama-stack/pull/4142

**Full Changelog**: https://github.com/llamastack/llama-stack/compare/v0.3.1...v0.3.2

_Released 2025-11-12_

# v0.3.1

## What's Changed
* feat(cherry-pick): fixes for 0.3.1 release by @ashwinb in https://github.com/llamastack/llama-stack/pull/3998
* fix(ci): install client from release branch before uv sync by @ashwinb in https://github.com/llamastack/llama-stack/pull/4002
* chore(release-0.3.x): handle missing external_providers_dir by @ashwinb in https://github.com/llamastack/llama-stack/pull/4011
* fix(ci): unset empty UV index env vars to prevent uv errors by @ashwinb in https://github.com/llamastack/llama-stack/pull/4013
* feat: support `workers` in run config by @ashwinb in https://github.com/llamastack/llama-stack/pull/4014
* docs: A getting started notebook featuring simple agent examples by @ashwinb in https://github.com/llamastack/llama-stack/pull/4015

**Full Changelog**: https://github.com/llamastack/llama-stack/compare/v0.3.0...v0.3.1

_Released 2025-10-31_

# v0.3.0

## Highlights

* Stable OpenAI-Compatible APIs
* Llama Stack now separates APIs into stable (/v1/), experimental (/v1alpha/ and /v1beta/) and deprecated (`deprecated = True`)
* extra_body/metadata support for APIs which support extra functionality compared to the OpenAI implementation
* Documentation overhaul: Migration to Docusaurus, modern formatting, and improved API docs

## What's Changed
* feat(internal): add image_url download feature to OpenAIMixin by @mattf in https://github.com/llamastack/llama-stack/pull/3516
* chore(api): remove batch inference by @mattf in https://github.com/llamastack/llama-stack/pull/3261
* chore(apis): unpublish deprecated /v1/inference apis by @mattf in https://github.com/llamastack/llama-stack/pull/3297
* chore: recordings for fireworks (inference + openai) by @mattf in https://github.com/llamastack/llama-stack/pull/3573
* chore: remove extra logging by @ehhuang in https://github.com/llamastack/llama-stack/pull/3574
* chore: MANIFEST maintenance by @leseb in https://github.com/llamastack/llama-stack/pull/3454
* feat: Add items and title to ToolParameter/ToolParamDefinition by @TamiTakamiya in https://github.com/llamastack/llama-stack/pull/3003
* feat(ci): use @next branch from llama-stack-client by @ashwinb in https://github.com/llamastack/llama-stack/pull/3576
* chore(ui-deps): bump shiki from 1.29.2 to 3.13.0 in /llama_stack/ui by @dependabot[bot] in https://github.com/llamastack/llama-stack/pull/3585
* chore(ui-deps): bump tw-animate-css from 1.2.9 to 1.4.0 in /llama_stack/ui by @dependabot[bot] in https://github.com/llamastack/llama-stack/pull/3583
* chore(github-deps): bump actions/cache from 4.2.4 to 4.3.0 by @dependabot[bot] in https://github.com/llamastack/llama-stack/pull/3577
* chore: skip nvidia datastore tests when nvidia datastore is not enabled by @mattf in https://github.com/llamastack/llama-stack/pull/3590
* chore: introduce write queue for response_store by @ehhuang in https://github.com/llamastack/llama-stack/pull/3497
* revert: feat(ci): use @next branch from llama-stack-client by @ashwinb in https://github.com/llamastack/llama-stack/pull/3593
* fix: adding mime type of application/json support by @wukaixingxp in https://github.com/llamastack/llama-stack/pull/3452
* chore(api): remove deprecated embeddings impls by @mattf in https://github.com/llamastack/llama-stack/pull/3301
* feat(api): level inference/rerank and remove experimental by @cdoern in https://github.com/llamastack/llama-stack/pull/3565
* chore: skip safety tests when shield not available by @mattf in https://github.com/llamastack/llama-stack/pull/3592
* feat: update eval runner to use openai endpoints by @mattf in https://github.com/llamastack/llama-stack/pull/3588
* docs: update image paths by @reluctantfuturist in https://github.com/llamastack/llama-stack/pull/3599
* fix: remove inference.completion from docs by @mattf in https://github.com/llamastack/llama-stack/pull/3589
* fix: Remove deprecated user param in OpenAIResponseObject by @slekkala1 in https://github.com/llamastack/llama-stack/pull/3596
* fix: ensure usage is requested if telemetry is enabled by @mhdawson in https://github.com/llamastack/llama-stack/pull/3571
* feat(openai_movement): Change URL structures to kill /openai/v1 (part 1) by @ashwinb in https://github.com/llamastack/llama-stack/pull/3587
* feat(files): fix expires_after API shape by @ashwinb in https://github.com/llamastack/llama-stack/pull/3604
* feat(openai_movement)!: Change URL structures to kill /openai/v1 (part 2) by @ashwinb in https://github.com/llamastack/llama-stack/pull/3605
* fix: mcp tool with array type should include items by @ehhuang in https://github.com/llamastack/llama-stack/pull/3602
* feat: add llamastack + CrewAI integration example notebook by @wukaixingxp in https://github.com/llamastack/llama-stack/pull/3275
* chore: unpublish /inference/chat-completion by @mattf in https://github.com/llamastack/llama-stack/pull/3609
* feat: use /v1/chat/completions for safety model inference by @mattf in https://github.com/llamastack/llama-stack/pull/3591
* feat(api): level /agents as `v1alpha` by @cdoern in https://github.com/llamastack/llama-stack/pull/3610
* feat(api): Add Vector Store File batches api stub by @slekkala1 in https://github.com/llamastack/llama-stack/pull/3615
* fix(expires_after): make sure multipart/form-data is properly parsed by @ashwinb in https://github.com/llamastack/llama-stack/pull/3612
* docs: frontpage update by @reluctantfuturist in https://github.com/llamastack/llama-stack/pull/3620
* docs: update safety notebook by @reluctantfuturist in https://github.com/llamastack/llama-stack/pull/3617
* feat: add support for require_approval argument when creating response by @grs in https://github.com/llamastack/llama-stack/pull/3608
* fix: don't pass default response format in Responses by @ehhuang in https://github.com/llamastack/llama-stack/pull/3614
* fix(logging): disable console tel

_Released 2025-10-22_

# v0.2.23

## Highlights
* Overhauls documentation with Docusaurus migration and modern formatting.
* Standardizes Ollama and Fireworks provider with OpenAI compatibility layer.
* Combines dynamic model discovery with static embedding metadata for better model information.
* Refactors server.main for better code organization.
* Introduces API leveling with post_training and eval promoted to v1alpha.

## What's Changed
* fix: Added a bug fix when registering new models by @omaryashraf5 in https://github.com/llamastack/llama-stack/pull/3453
* fix: unbound variable PR_HEAD_REPO by @derekhiggins in https://github.com/llamastack/llama-stack/pull/3469
* fix: Fixing prompts import warning by @franciscojavierarceo in https://github.com/llamastack/llama-stack/pull/3455
* docs: update documentation links by @reluctantfuturist in https://github.com/llamastack/llama-stack/pull/3459
* fix: Set provider_id in NVIDIA notebook when registering dataset by @JashG in https://github.com/llamastack/llama-stack/pull/3472
* feat: update qdrant hash function from SHA-1 to SHA-256 by @rhdedgar in https://github.com/llamastack/llama-stack/pull/3477
* feat: Add dynamic authentication token forwarding support for vLLM by @akram in https://github.com/llamastack/llama-stack/pull/3388
* feat: include all models from provider's /v1/models by @mattf in https://github.com/llamastack/llama-stack/pull/3471
* chore: update the ollama inference impl to use OpenAIMixin for openai-compat functions by @mattf in https://github.com/llamastack/llama-stack/pull/3395
* fix: add missing files provider to NVIDIA distribution by @jiayin-nvidia in https://github.com/llamastack/llama-stack/pull/3479
* feat: combine ProviderSpec datatypes by @cdoern in https://github.com/llamastack/llama-stack/pull/3378
* chore: refactor server.main by @ehhuang in https://github.com/llamastack/llama-stack/pull/3462
* docs: Fix incorrect vector_db_id usage in RAG tutorial by @adam-d-young in https://github.com/llamastack/llama-stack/pull/3444
* fix: force milvus-lite installation for inline::milvus by @leseb in https://github.com/llamastack/llama-stack/pull/3488
* chore: simplify authorized sqlstore by @ehhuang in https://github.com/llamastack/llama-stack/pull/3496
* chore: remove duplicate AnthropicProviderDataValidator by @mattf in https://github.com/llamastack/llama-stack/pull/3512
* fix: Update inference recorder to handle both Ollama and OpenAI model by @derekhiggins in https://github.com/llamastack/llama-stack/pull/3470
* fix: handle missing API keys gracefully in model refresh by @derekhiggins in https://github.com/llamastack/llama-stack/pull/3493
* chore: remove duplicate OpenAI and Gemini data validators by @mattf in https://github.com/llamastack/llama-stack/pull/3513
* chore(github-deps): bump astral-sh/setup-uv from 6.6.1 to 6.7.0 by @dependabot[bot] in https://github.com/llamastack/llama-stack/pull/3502
* chore(ui-deps): bump remeda from 2.30.0 to 2.32.0 in /llama_stack/ui by @dependabot[bot] in https://github.com/llamastack/llama-stack/pull/3511
* chore(ui-deps): bump @radix-ui/react-dialog from 1.1.13 to 1.1.15 in /llama_stack/ui by @dependabot[bot] in https://github.com/llamastack/llama-stack/pull/3510
* chore(ui-deps): bump jest-environment-jsdom from 29.7.0 to 30.1.2 in /llama_stack/ui by @dependabot[bot] in https://github.com/llamastack/llama-stack/pull/3509
* fix: change ModelRegistryHelper to use ProviderModelEntry instead of hardcoded ModelType.llm by @wukaixingxp in https://github.com/llamastack/llama-stack/pull/3451
* chore: Refactor fireworks to use OpenAIMixin by @slekkala1 in https://github.com/llamastack/llama-stack/pull/3480
* chore: fix build by @ehhuang in https://github.com/llamastack/llama-stack/pull/3522
* fix: return llama stack model id from embeddings by @mattf in https://github.com/llamastack/llama-stack/pull/3525
* fix(dev): fix vllm inference recording (await models.list) by @mattf in https://github.com/llamastack/llama-stack/pull/3524
* chore: refactor tracingmiddelware by @ehhuang in https://github.com/llamastack/llama-stack/pull/3520
* feat: (re-)enable Databricks inference adapter by @mattf in https://github.com/llamastack/llama-stack/pull/3500
* feat: update Cerebras inference provider to support dynamic model listing by @mattf in https://github.com/llamastack/llama-stack/pull/3481
* docs: fix typos in RAG docs by @nathan-weinberg in https://github.com/llamastack/llama-stack/pull/3530
* chore(perf): run guidellm benchmarks by @ehhuang in https://github.com/llamastack/llama-stack/pull/3421
* fix: fix API docstrings for proper MDX parsing by @reluctantfuturist in https://github.com/llamastack/llama-stack/pull/3526
* fix: update OpenAPI generator by @reluctantfuturist in https://github.com/llamastack/llama-stack/pull/3527
* fix: update API conformance test to point to new schema location by @reluctantfuturist in https://github.com/llamastack/llama-stack/pull/3528
* docs: provider and distro codegen

_Released 2025-09-26_

# v0.2.22

## Highlights
* Migrated to unified "setups" system for test config
* Added default inference store automatically during llama stack build
* Introduced write queue for inference store
* Proposed API leveling framework
* Enhanced Together provider with embedding and dynamic model support

## What's Changed
* feat(tests): migrate to global "setups" system for test configuration by @ashwinb in https://github.com/llamastack/llama-stack/pull/3390
* chore: remove unused variable by @ehhuang in https://github.com/llamastack/llama-stack/pull/3389
* feat: include a default inference store during llama stack build by @mattf in https://github.com/llamastack/llama-stack/pull/3373
* feat: Add vector_db_id to chunk metadata by @are-ces in https://github.com/llamastack/llama-stack/pull/3304
* fix: Add missing files_api parameter to MemoryToolRuntimeImpl test by @akram in https://github.com/llamastack/llama-stack/pull/3394
* fix: pre-commit issues: non executable shebang file and removal of @pytest.mark.asyncio decorator by @akram in https://github.com/llamastack/llama-stack/pull/3397
* chore: update the vertexai inference impl to use openai-python for openai-compat functions by @mattf in https://github.com/llamastack/llama-stack/pull/3377
* ci: Re-enable pre-commit to fail by @leseb in https://github.com/llamastack/llama-stack/pull/3399
* fix: Fireworks chat completion broken due to telemetry by @slekkala1 in https://github.com/llamastack/llama-stack/pull/3392
* chore: logging perf improvments by @ehhuang in https://github.com/llamastack/llama-stack/pull/3393
* revert: Fireworks chat completion broken due to telemetry by @franciscojavierarceo in https://github.com/llamastack/llama-stack/pull/3402
* fix: unbound variable error in schedule-record-workflow.sh by @derekhiggins in https://github.com/llamastack/llama-stack/pull/3401
* chore: introduce write queue for inference_store by @ehhuang in https://github.com/llamastack/llama-stack/pull/3383
* docs: horizontal nav bar by @reluctantfuturist in https://github.com/llamastack/llama-stack/pull/3407
* chore(python-deps): bump pytest from 8.4.1 to 8.4.2 by @dependabot[bot] in https://github.com/llamastack/llama-stack/pull/3359
* chore(python-deps): bump locust from 2.39.1 to 2.40.1 by @dependabot[bot] in https://github.com/llamastack/llama-stack/pull/3358
* chore(python-deps): bump openai from 1.102.0 to 1.106.1 by @dependabot[bot] in https://github.com/llamastack/llama-stack/pull/3356
* chore(ui-deps): bump tailwindcss from 4.1.6 to 4.1.13 in /llama_stack/ui by @dependabot[bot] in https://github.com/llamastack/llama-stack/pull/3362
* chore: telemetry test by @ehhuang in https://github.com/llamastack/llama-stack/pull/3405
* chore: move benchmarking related code by @ehhuang in https://github.com/llamastack/llama-stack/pull/3406
* fix(inference_store): on duplicate chat completion IDs, replace by @ashwinb in https://github.com/llamastack/llama-stack/pull/3408
* chore: remove openai dependency from providers by @leseb in https://github.com/llamastack/llama-stack/pull/3398
* fix: AWS Bedrock inference profile ID conversion for region-specific endpoints by @skamenan7 in https://github.com/llamastack/llama-stack/pull/3386
* chore(replay): improve replay robustness with un-validated construction by @mattf in https://github.com/llamastack/llama-stack/pull/3414
* feat: add Azure OpenAI inference provider support by @leseb in https://github.com/llamastack/llama-stack/pull/3396
* chore: Updating documentation, adding exception handling for Vector Stores in RAG Tool, more tests on migration, and migrate off of inference_api for context_retriever for RAG by @franciscojavierarceo in https://github.com/llamastack/llama-stack/pull/3367
* chore: update the vLLM inference impl to use OpenAIMixin for openai-compat functions by @mattf in https://github.com/llamastack/llama-stack/pull/3404
* chore(unit tests): remove network use, update async test by @mattf in https://github.com/llamastack/llama-stack/pull/3418
* feat: Add langchain llamastack Integration example notebook by @slekkala1 in https://github.com/llamastack/llama-stack/pull/3314
* fix: oasdiff enhancements and stability by @cdoern in https://github.com/llamastack/llama-stack/pull/3419
* fix: Improve pre-commit workflow error handling and feedback by @akram in https://github.com/llamastack/llama-stack/pull/3400
* feat: migrate to FIPS-validated cryptographic algorithms by @rhdedgar in https://github.com/llamastack/llama-stack/pull/3423
* chore(recorder, tests): add test for openai /v1/models by @mattf in https://github.com/llamastack/llama-stack/pull/3426
* chore(tests): always show slowest tests by @mattf in https://github.com/llamastack/llama-stack/pull/3431
* chore(recorder): add support for NOT_GIVEN by @mattf in https://github.com/llamastack/llama-stack/pull/3430
* chore(ui-deps): bump next from 15.3.3 to 15.5.3 in /llama_stack/ui by @dependabot[bot] in https://github.com/llamastack/llama-s

_Released 2025-09-16_
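Several entries above introduce a "write queue" for the response and inference stores (PRs #3497 and #3383, and the v0.2.22 highlight "Introduced write queue for inference store"). As a rough illustration of that general pattern, not the project's actual implementation, here is a minimal asyncio sketch: producers enqueue records without blocking on storage, and a single background task drains the queue and performs the writes. The `WriteQueue` class, the in-memory `store` list, and the record shape are all invented for this sketch.

```python
import asyncio


class WriteQueue:
    """Sketch of a write queue: enqueue is cheap and non-blocking;
    a background task performs the (potentially slow) writes."""

    def __init__(self) -> None:
        self._queue: asyncio.Queue = asyncio.Queue()
        self.store: list = []  # stand-in for the real backing store
        self._worker: asyncio.Task | None = None

    async def start(self) -> None:
        # Launch the single consumer that drains the queue.
        self._worker = asyncio.create_task(self._drain())

    async def _drain(self) -> None:
        while True:
            item = await self._queue.get()
            if item is None:  # sentinel: shut down after flushing
                self._queue.task_done()
                break
            self.store.append(item)  # the real write would happen here
            self._queue.task_done()

    def enqueue(self, record: dict) -> None:
        # Returns immediately; the write happens asynchronously.
        self._queue.put_nowait(record)

    async def stop(self) -> None:
        # Enqueue the sentinel and wait for the worker to finish,
        # so all pending writes are flushed before shutdown.
        self._queue.put_nowait(None)
        if self._worker is not None:
            await self._worker


async def main() -> list:
    wq = WriteQueue()
    await wq.start()
    for i in range(3):
        wq.enqueue({"chat_completion_id": f"cc-{i}"})
    await wq.stop()
    return wq.store


records = asyncio.run(main())
```

The key property is that the request path only pays the cost of `put_nowait`, while ordering is preserved because a single consumer drains the queue FIFO; the duplicate-ID replacement mentioned in PR #3408 would live inside the consumer's write step.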