[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-mudler--LocalAI":3,"tool-mudler--LocalAI":64},[4,17,27,35,43,56],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":16},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,3,"2026-04-05T11:01:52",[13,14,15],"开发框架","图像","Agent","ready",{"id":18,"name":19,"github_repo":20,"description_zh":21,"stars":22,"difficulty_score":23,"last_commit_at":24,"category_tags":25,"status":16},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",138956,2,"2026-04-05T11:33:21",[13,15,26],"语言模型",{"id":28,"name":29,"github_repo":30,"description_zh":31,"stars":32,"difficulty_score":23,"last_commit_at":33,"category_tags":34,"status":16},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 
绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",107662,"2026-04-03T11:11:01",[13,14,15],{"id":36,"name":37,"github_repo":38,"description_zh":39,"stars":40,"difficulty_score":23,"last_commit_at":41,"category_tags":42,"status":16},3704,"NextChat","ChatGPTNextWeb\u002FNextChat","NextChat 是一款轻量且极速的 AI 助手，旨在为用户提供流畅、跨平台的大模型交互体验。它完美解决了用户在多设备间切换时难以保持对话连续性，以及面对众多 AI 模型不知如何统一管理的痛点。无论是日常办公、学习辅助还是创意激发，NextChat 都能让用户随时随地通过网页、iOS、Android、Windows、MacOS 或 Linux 端无缝接入智能服务。\n\n这款工具非常适合普通用户、学生、职场人士以及需要私有化部署的企业团队使用。对于开发者而言，它也提供了便捷的自托管方案，支持一键部署到 Vercel 或 Zeabur 等平台。\n\nNextChat 的核心亮点在于其广泛的模型兼容性，原生支持 Claude、DeepSeek、GPT-4 及 Gemini Pro 等主流大模型，让用户在一个界面即可自由切换不同 AI 能力。此外，它还率先支持 MCP（Model Context Protocol）协议，增强了上下文处理能力。针对企业用户，NextChat 提供专业版解决方案，具备品牌定制、细粒度权限控制、内部知识库整合及安全审计等功能，满足公司对数据隐私和个性化管理的高标准要求。",87618,"2026-04-05T07:20:52",[13,26],{"id":44,"name":45,"github_repo":46,"description_zh":47,"stars":48,"difficulty_score":23,"last_commit_at":49,"category_tags":50,"status":16},2268,"ML-For-Beginners","microsoft\u002FML-For-Beginners","ML-For-Beginners 是由微软推出的一套系统化机器学习入门课程，旨在帮助零基础用户轻松掌握经典机器学习知识。这套课程将学习路径规划为 12 周，包含 26 节精炼课程和 52 道配套测验，内容涵盖从基础概念到实际应用的完整流程，有效解决了初学者面对庞大知识体系时无从下手、缺乏结构化指导的痛点。\n\n无论是希望转型的开发者、需要补充算法背景的研究人员，还是对人工智能充满好奇的普通爱好者，都能从中受益。课程不仅提供了清晰的理论讲解，还强调动手实践，让用户在循序渐进中建立扎实的技能基础。其独特的亮点在于强大的多语言支持，通过自动化机制提供了包括简体中文在内的 50 多种语言版本，极大地降低了全球不同背景用户的学习门槛。此外，项目采用开源协作模式，社区活跃且内容持续更新，确保学习者能获取前沿且准确的技术资讯。如果你正寻找一条清晰、友好且专业的机器学习入门之路，ML-For-Beginners 将是理想的起点。",84991,"2026-04-05T10:45:23",[14,51,52,53,15,54,26,13,55],"数据工具","视频","插件","其他","音频",{"id":57,"name":58,"github_repo":59,"description_zh":60,"stars":61,"difficulty_score":10,"last_commit_at":62,"category_tags":63,"status":16},3128,"ragflow","infiniflow\u002Fragflow","RAGFlow 
是一款领先的开源检索增强生成（RAG）引擎，旨在为大语言模型构建更精准、可靠的上下文层。它巧妙地将前沿的 RAG 技术与智能体（Agent）能力相结合，不仅支持从各类文档中高效提取知识，还能让模型基于这些知识进行逻辑推理和任务执行。\n\n在大模型应用中，幻觉问题和知识滞后是常见痛点。RAGFlow 通过深度解析复杂文档结构（如表格、图表及混合排版），显著提升了信息检索的准确度，从而有效减少模型“胡编乱造”的现象，确保回答既有据可依又具备时效性。其内置的智能体机制更进一步，使系统不仅能回答问题，还能自主规划步骤解决复杂问题。\n\n这款工具特别适合开发者、企业技术团队以及 AI 研究人员使用。无论是希望快速搭建私有知识库问答系统，还是致力于探索大模型在垂直领域落地的创新者，都能从中受益。RAGFlow 提供了可视化的工作流编排界面和灵活的 API 接口，既降低了非算法背景用户的上手门槛，也满足了专业开发者对系统深度定制的需求。作为基于 Apache 2.0 协议开源的项目，它正成为连接通用大模型与行业专有知识之间的重要桥梁。",77062,"2026-04-04T04:44:48",[15,14,13,26,54],{"id":65,"github_repo":66,"name":67,"description_en":68,"description_zh":69,"ai_summary_zh":70,"readme_en":71,"readme_zh":72,"quickstart_zh":73,"use_case_zh":74,"hero_image_url":75,"owner_login":76,"owner_name":77,"owner_avatar_url":78,"owner_bio":79,"owner_company":80,"owner_location":81,"owner_email":80,"owner_twitter":82,"owner_website":83,"owner_url":84,"languages":85,"stars":126,"forks":127,"last_commit_at":128,"license":129,"difficulty_score":10,"env_os":130,"env_gpu":131,"env_ram":132,"env_deps":133,"category_tags":144,"github_topics":145,"view_count":164,"oss_zip_url":80,"oss_zip_packed_at":80,"status":16,"created_at":165,"updated_at":166,"faqs":167,"releases":196},2375,"mudler\u002FLocalAI","LocalAI","LocalAI is the open-source AI engine. Run any model - LLMs, vision, voice, image, video - on any hardware. 
No GPU required.","LocalAI 是一款开源的本地人工智能引擎，旨在让用户在任意硬件上轻松运行各类 AI 模型，包括大语言模型、图像生成、语音识别及视频处理等。它的核心优势在于彻底打破了高性能计算的门槛，无需昂贵的专用 GPU，仅凭普通 CPU 或常见的消费级显卡（如 NVIDIA、AMD、Intel 及 Apple Silicon）即可部署和运行复杂的 AI 任务。\n\n对于担心数据隐私的用户而言，LocalAI 提供了“隐私优先”的解决方案，确保所有数据处理均在本地基础设施内完成，无需上传至云端。同时，它完美兼容 OpenAI、Anthropic 等主流 API 接口，这意味着开发者可以无缝迁移现有应用，直接利用本地资源替代云服务，既降低了成本又提升了可控性。\n\nLocalAI 内置了超过 35 种后端支持（如 llama.cpp、vLLM、Whisper 等），并集成了自主 AI 代理、工具调用及检索增强生成（RAG）等高级功能，且具备多用户管理与权限控制能力。无论是希望保护敏感数据的企业开发者、进行算法实验的研究人员，还是想要在个人电脑上体验最新 AI 技术的极客玩家，都能通过 LocalAI 获","LocalAI 是一款开源的本地人工智能引擎，旨在让用户在任意硬件上轻松运行各类 AI 模型，包括大语言模型、图像生成、语音识别及视频处理等。它的核心优势在于彻底打破了高性能计算的门槛，无需昂贵的专用 GPU，仅凭普通 CPU 或常见的消费级显卡（如 NVIDIA、AMD、Intel 及 Apple Silicon）即可部署和运行复杂的 AI 任务。\n\n对于担心数据隐私的用户而言，LocalAI 提供了“隐私优先”的解决方案，确保所有数据处理均在本地基础设施内完成，无需上传至云端。同时，它完美兼容 OpenAI、Anthropic 等主流 API 接口，这意味着开发者可以无缝迁移现有应用，直接利用本地资源替代云服务，既降低了成本又提升了可控性。\n\nLocalAI 内置了超过 35 种后端支持（如 llama.cpp、vLLM、Whisper 等），并集成了自主 AI 代理、工具调用及检索增强生成（RAG）等高级功能，且具备多用户管理与权限控制能力。无论是希望保护敏感数据的企业开发者、进行算法实验的研究人员，还是想要在个人电脑上体验最新 AI 技术的极客玩家，都能通过 LocalAI 获得灵活、安全且高效的本地化 AI 部署体验。","\u003Ch1 align=\"center\">\n  \u003Cbr>\n  \u003Cimg width=\"300\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fmudler_LocalAI_readme_aabfa243e427.png\"> \u003Cbr>\n\u003Cbr>\n\u003C\u002Fh1>\n\n\u003Cp align=\"center\">\n\u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fgo-skynet\u002FLocalAI\u002Fstargazers\" target=\"blank\">\n\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fgo-skynet\u002FLocalAI?style=for-the-badge\" alt=\"LocalAI stars\"\u002F>\n\u003C\u002Fa>\n\u003Ca href='https:\u002F\u002Fgithub.com\u002Fgo-skynet\u002FLocalAI\u002Freleases'>\n\u003Cimg src='https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Frelease\u002Fgo-skynet\u002FLocalAI?&label=Latest&style=for-the-badge'>\n\u003C\u002Fa>\n\u003Ca href=\"LICENSE\" target=\"blank\">\n\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FLicense-MIT-yellow.svg?style=for-the-badge\" alt=\"LocalAI 
License\"\u002F>\n\u003C\u002Fa>\n\u003C\u002Fp>\n\n\u003Cp align=\"center\">\n\u003Ca href=\"https:\u002F\u002Ftwitter.com\u002FLocalAI_API\" target=\"blank\">\n\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FX-%23000000.svg?style=for-the-badge&logo=X&logoColor=white&label=LocalAI_API\" alt=\"Follow LocalAI_API\"\u002F>\n\u003C\u002Fa>\n\u003Ca href=\"https:\u002F\u002Fdiscord.gg\u002FuJAeKSAGDy\" target=\"blank\">\n\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fdynamic\u002Fjson?color=blue&label=Discord&style=for-the-badge&query=approximate_member_count&url=https%3A%2F%2Fdiscordapp.com%2Fapi%2Finvites%2FuJAeKSAGDy%3Fwith_counts%3Dtrue&logo=discord\" alt=\"Join LocalAI Discord Community\"\u002F>\n\u003C\u002Fa>\n\u003C\u002Fp>\n\n\u003Cp align=\"center\">\n\u003Ca href=\"https:\u002F\u002Ftrendshift.io\u002Frepositories\u002F5539\" target=\"_blank\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fmudler_LocalAI_readme_4a68feb902da.png\" alt=\"mudler%2FLocalAI | Trendshift\" style=\"width: 250px; height: 55px;\" width=\"250\" height=\"55\"\u002F>\u003C\u002Fa>\n\u003C\u002Fp>\n\n**LocalAI** is the open-source AI engine. Run any model - LLMs, vision, voice, image, video - on any hardware. 
No GPU required.\n\n- **Drop-in API compatibility** — OpenAI, Anthropic, ElevenLabs APIs\n- **35+ backends** — llama.cpp, vLLM, transformers, whisper, diffusers, MLX...\n- **Any hardware** — NVIDIA, AMD, Intel, Apple Silicon, Vulkan, or CPU-only\n- **Multi-user ready** — API key auth, user quotas, role-based access\n- **Built-in AI agents** — autonomous agents with tool use, RAG, MCP, and skills\n- **Privacy-first** — your data never leaves your infrastructure\n\nCreated and maintained by [Ettore Di Giacinto](https:\u002F\u002Fgithub.com\u002Fmudler).\n\n> [:book: Documentation](https:\u002F\u002Flocalai.io\u002F) | [:speech_balloon: Discord](https:\u002F\u002Fdiscord.gg\u002FuJAeKSAGDy) | [💻 Quickstart](https:\u002F\u002Flocalai.io\u002Fbasics\u002Fgetting_started\u002F) | [🖼️ Models](https:\u002F\u002Fmodels.localai.io\u002F) | [❓FAQ](https:\u002F\u002Flocalai.io\u002Ffaq\u002F)\n\n## Screenshots\n\n### Chat, Model gallery\n\nhttps:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F08cbb692-57da-48f7-963d-2e7b43883c18\n\n### Agents\n\nhttps:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F6270b331-e21d-4087-a540-6290006b381a\n\n## Quickstart\n\n### macOS\n\n\u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Freleases\u002Flatest\u002Fdownload\u002FLocalAI.dmg\">\n  \u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FDownload-macOS-blue?style=for-the-badge&logo=apple&logoColor=white\" alt=\"Download LocalAI for macOS\"\u002F>\n\u003C\u002Fa>\n\n> **Note:** The DMG is not signed by Apple. After installing, run: `sudo xattr -d com.apple.quarantine \u002FApplications\u002FLocalAI.app`. See [#6268](https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fissues\u002F6268) for details.\n\n### Containers (Docker, podman, ...)\n\n> Already ran LocalAI before? 
Use `docker start -i local-ai` to restart an existing container.\n\n#### CPU only:\n\n```bash\ndocker run -ti --name local-ai -p 8080:8080 localai\u002Flocalai:latest\n```\n\n#### NVIDIA GPU:\n\n```bash\n# CUDA 13\ndocker run -ti --name local-ai -p 8080:8080 --gpus all localai\u002Flocalai:latest-gpu-nvidia-cuda-13\n\n# CUDA 12\ndocker run -ti --name local-ai -p 8080:8080 --gpus all localai\u002Flocalai:latest-gpu-nvidia-cuda-12\n\n# NVIDIA Jetson ARM64 (CUDA 12, for AGX Orin and similar)\ndocker run -ti --name local-ai -p 8080:8080 --gpus all localai\u002Flocalai:latest-nvidia-l4t-arm64\n\n# NVIDIA Jetson ARM64 (CUDA 13, for DGX Spark)\ndocker run -ti --name local-ai -p 8080:8080 --gpus all localai\u002Flocalai:latest-nvidia-l4t-arm64-cuda-13\n```\n\n#### AMD GPU (ROCm):\n\n```bash\ndocker run -ti --name local-ai -p 8080:8080 --device=\u002Fdev\u002Fkfd --device=\u002Fdev\u002Fdri --group-add=video localai\u002Flocalai:latest-gpu-hipblas\n```\n\n#### Intel GPU (oneAPI):\n\n```bash\ndocker run -ti --name local-ai -p 8080:8080 --device=\u002Fdev\u002Fdri\u002Fcard1 --device=\u002Fdev\u002Fdri\u002FrenderD128 localai\u002Flocalai:latest-gpu-intel\n```\n\n#### Vulkan GPU:\n\n```bash\ndocker run -ti --name local-ai -p 8080:8080 localai\u002Flocalai:latest-gpu-vulkan\n```\n\n### Loading models\n\n```bash\n# From the model gallery (see available models with `local-ai models list` or at https:\u002F\u002Fmodels.localai.io)\nlocal-ai run llama-3.2-1b-instruct:q4_k_m\n# From Huggingface\nlocal-ai run huggingface:\u002F\u002FTheBloke\u002Fphi-2-GGUF\u002Fphi-2.Q8_0.gguf\n# From the Ollama OCI registry\nlocal-ai run ollama:\u002F\u002Fgemma:2b\n# From a YAML config\nlocal-ai run https:\u002F\u002Fgist.githubusercontent.com\u002F...\u002Fphi-2.yaml\n# From a standard OCI registry (e.g., Docker Hub)\nlocal-ai run oci:\u002F\u002Flocalai\u002Fphi-2:latest\n```\n\n> **Automatic Backend Detection**: LocalAI automatically detects your GPU capabilities and downloads the appropriate 
backend. For advanced options, see [GPU Acceleration](https:\u002F\u002Flocalai.io\u002Ffeatures\u002Fgpu-acceleration\u002F).\n\nFor more details, see the [Getting Started guide](https:\u002F\u002Flocalai.io\u002Fbasics\u002Fgetting_started\u002F).\n\n## Latest News\n\n- **March 2026**: [Agent management](https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F8820), [New React UI](https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F8772), [WebRTC](https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F8790), [MLX-distributed via P2P and RDMA](https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F8801), [MCP Apps, MCP Client-side](https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F8947)\n- **February 2026**: [Realtime API for audio-to-audio with tool calling](https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6245), [ACE-Step 1.5 support](https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F8396)\n- **January 2026**: **LocalAI 3.10.0** — Anthropic API support, Open Responses API, video & image generation (LTX-2), unified GPU backends, tool streaming, Moonshine, Pocket-TTS. 
[Release notes](https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Freleases\u002Ftag\u002Fv3.10.0)\n- **December 2025**: [Dynamic Memory Resource reclaimer](https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F7583), [Automatic multi-GPU model fitting (llama.cpp)](https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F7584), [Vibevoice backend](https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F7494)\n- **November 2025**: [Import models via URL](https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F7245), [Multiple chats and history](https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F7325)\n- **October 2025**: [Model Context Protocol (MCP)](https:\u002F\u002Flocalai.io\u002Fdocs\u002Ffeatures\u002Fmcp\u002F) support for agentic capabilities\n- **September 2025**: New Launcher for macOS and Linux, extended backend support for Mac and Nvidia L4T, MLX-Audio, WAN 2.2\n- **August 2025**: MLX, MLX-VLM, Diffusers, llama.cpp now supported on Apple Silicon\n- **July 2025**: All backends migrated outside the main binary — [lightweight, modular architecture](https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Freleases\u002Ftag\u002Fv3.2.0)\n\nFor older news and full release notes, see [GitHub Releases](https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Freleases) and the [News page](https:\u002F\u002Flocalai.io\u002Fbasics\u002Fnews\u002F).\n\n## Features\n\n- [Text generation](https:\u002F\u002Flocalai.io\u002Ffeatures\u002Ftext-generation\u002F) (`llama.cpp`, `transformers`, `vllm` ... 
[and more](https:\u002F\u002Flocalai.io\u002Fmodel-compatibility\u002F))\n- [Text to Audio](https:\u002F\u002Flocalai.io\u002Ffeatures\u002Ftext-to-audio\u002F)\n- [Audio to Text](https:\u002F\u002Flocalai.io\u002Ffeatures\u002Faudio-to-text\u002F)\n- [Image generation](https:\u002F\u002Flocalai.io\u002Ffeatures\u002Fimage-generation)\n- [OpenAI-compatible tools API](https:\u002F\u002Flocalai.io\u002Ffeatures\u002Fopenai-functions\u002F)\n- [Realtime API](https:\u002F\u002Flocalai.io\u002Ffeatures\u002Fopenai-realtime\u002F) (Speech-to-speech)\n- [Embeddings generation](https:\u002F\u002Flocalai.io\u002Ffeatures\u002Fembeddings\u002F)\n- [Constrained grammars](https:\u002F\u002Flocalai.io\u002Ffeatures\u002Fconstrained_grammars\u002F)\n- [Download models from Huggingface](https:\u002F\u002Flocalai.io\u002Fmodels\u002F)\n- [Vision API](https:\u002F\u002Flocalai.io\u002Ffeatures\u002Fgpt-vision\u002F)\n- [Object Detection](https:\u002F\u002Flocalai.io\u002Ffeatures\u002Fobject-detection\u002F)\n- [Reranker API](https:\u002F\u002Flocalai.io\u002Ffeatures\u002Freranker\u002F)\n- [P2P Inferencing](https:\u002F\u002Flocalai.io\u002Ffeatures\u002Fdistribute\u002F)\n- [Distributed Mode](https:\u002F\u002Flocalai.io\u002Ffeatures\u002Fdistributed-mode\u002F) — Horizontal scaling with PostgreSQL + NATS\n- [Model Context Protocol (MCP)](https:\u002F\u002Flocalai.io\u002Fdocs\u002Ffeatures\u002Fmcp\u002F)\n- [Built-in Agents](https:\u002F\u002Flocalai.io\u002Ffeatures\u002Fagents\u002F) — Autonomous AI agents with tool use, RAG, skills, SSE streaming, and [Agent Hub](https:\u002F\u002Fagenthub.localai.io)\n- [Backend Gallery](https:\u002F\u002Flocalai.io\u002Fbackends\u002F) — Install\u002Fremove backends on the fly via OCI images\n- Voice Activity Detection (Silero-VAD)\n- Integrated WebUI\n\n## Supported Backends & Acceleration\n\nLocalAI supports **35+ backends** including llama.cpp, vLLM, transformers, whisper.cpp, diffusers, MLX, MLX-VLM, and many more. 
Hardware acceleration is available for **NVIDIA** (CUDA 12\u002F13), **AMD** (ROCm), **Intel** (oneAPI\u002FSYCL), **Apple Silicon** (Metal), **Vulkan**, and **NVIDIA Jetson** (L4T). All backends can be installed on the fly from the [Backend Gallery](https:\u002F\u002Flocalai.io\u002Fbackends\u002F).\n\nSee the full [Backend & Model Compatibility Table](https:\u002F\u002Flocalai.io\u002Fmodel-compatibility\u002F) and [GPU Acceleration guide](https:\u002F\u002Flocalai.io\u002Ffeatures\u002Fgpu-acceleration\u002F).\n\n## Resources\n\n- [Documentation](https:\u002F\u002Flocalai.io\u002F)\n- [LLM fine-tuning guide](https:\u002F\u002Flocalai.io\u002Fdocs\u002Fadvanced\u002Ffine-tuning\u002F)\n- [Build from source](https:\u002F\u002Flocalai.io\u002Fbasics\u002Fbuild\u002F)\n- [Kubernetes installation](https:\u002F\u002Flocalai.io\u002Fbasics\u002Fgetting_started\u002F#run-localai-in-kubernetes)\n- [Integrations & community projects](https:\u002F\u002Flocalai.io\u002Fdocs\u002Fintegrations\u002F)\n- [Media & blog posts](https:\u002F\u002Flocalai.io\u002Fbasics\u002Fnews\u002F#media-blogs-social)\n- [Examples](https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI-examples)\n\n## Autonomous Development Team\n\nLocalAI is maintained with the help of a team of autonomous AI agents led by an AI Scrum Master.\n\n- **Live Reports**: [reports.localai.io](http:\u002F\u002Freports.localai.io)\n- **Project Board**: [Agent task tracking](https:\u002F\u002Fgithub.com\u002Fusers\u002Fmudler\u002Fprojects\u002F6)\n- **Blog Post**: [Learn about the experiment](https:\u002F\u002Fmudler.pm\u002Fposts\u002F2026\u002F02\u002F28\u002Fa-call-to-open-source-maintainers-stop-babysitting-ai-how-i-built-a-100-local-autonomous-dev-team-to-maintain-localai-and-why-you-should-too\u002F)\n\n## Citation\n\nIf you use this repository or its data in a downstream project, please consider citing it with:\n\n```\n@misc{localai,\n  author = {Ettore Di Giacinto},\n  title = {LocalAI: The free, Open source OpenAI 
alternative},\n  year = {2023},\n  publisher = {GitHub},\n  journal = {GitHub repository},\n  howpublished = {\\url{https:\u002F\u002Fgithub.com\u002Fgo-skynet\u002FLocalAI}},\n}\n```\n\n## Sponsors\n\n> Do you find LocalAI useful?\n\nSupport the project by becoming [a backer or sponsor](https:\u002F\u002Fgithub.com\u002Fsponsors\u002Fmudler). Your logo will show up here with a link to your website.\n\nA huge thank you to our generous sponsors, who support this project by covering CI expenses; see our [sponsor list](https:\u002F\u002Fgithub.com\u002Fsponsors\u002Fmudler):\n\n\u003Cp align=\"center\">\n  \u003Ca href=\"https:\u002F\u002Fwww.spectrocloud.com\u002F\" target=\"blank\">\n    \u003Cimg height=\"200\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fmudler_LocalAI_readme_5ac1f6180ce5.png\">\n  \u003C\u002Fa>\n  \u003Ca href=\"https:\u002F\u002Fwww.premai.io\u002F\" target=\"blank\">\n    \u003Cimg height=\"200\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fmudler_LocalAI_readme_b80a16a54ad7.png\"> \u003Cbr>\n  \u003C\u002Fa>\n\u003C\u002Fp>\n\n### Individual sponsors\n\nSpecial thanks to our individual sponsors; a full list is on [GitHub](https:\u002F\u002Fgithub.com\u002Fsponsors\u002Fmudler) and [buymeacoffee](https:\u002F\u002Fbuymeacoffee.com\u002Fmudler). Special shout out to [drikster80](https:\u002F\u002Fgithub.com\u002Fdrikster80) for being generous. Thank you, everyone!\n\n## Star history\n\n[![LocalAI Star history Chart](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fmudler_LocalAI_readme_a97aecd2da90.png)](https:\u002F\u002Fstar-history.com\u002F#go-skynet\u002FLocalAI&Date)\n\n## License\n\nLocalAI is a community-driven project created by [Ettore Di Giacinto](https:\u002F\u002Fgithub.com\u002Fmudler\u002F).\n\nMIT - Author Ettore Di Giacinto \u003Cmudler@localai.io>\n\n## Acknowledgements\n\nLocalAI couldn't have been built without the help of great software already available from the community. 
Thank you!\n\n- [llama.cpp](https:\u002F\u002Fgithub.com\u002Fggerganov\u002Fllama.cpp)\n- https:\u002F\u002Fgithub.com\u002Ftatsu-lab\u002Fstanford_alpaca\n- https:\u002F\u002Fgithub.com\u002Fcornelk\u002Fllama-go for the initial ideas\n- https:\u002F\u002Fgithub.com\u002Fantimatter15\u002Falpaca.cpp\n- https:\u002F\u002Fgithub.com\u002FEdVince\u002FStable-Diffusion-NCNN\n- https:\u002F\u002Fgithub.com\u002Fggerganov\u002Fwhisper.cpp\n- https:\u002F\u002Fgithub.com\u002Frhasspy\u002Fpiper\n- [exo](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo) for the MLX distributed auto-parallel sharding implementation\n\n## Contributors\n\nThis is a community project, a special thanks to our contributors!\n\u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fgo-skynet\u002FLocalAI\u002Fgraphs\u002Fcontributors\">\n  \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fmudler_LocalAI_readme_484175fce90d.png\" \u002F>\n\u003C\u002Fa>\n","\u003Ch1 align=\"center\">\n  \u003Cbr>\n  \u003Cimg width=\"300\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fmudler_LocalAI_readme_aabfa243e427.png\"> \u003Cbr>\n\u003Cbr>\n\u003C\u002Fh1>\n\n\u003Cp align=\"center\">\n\u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fgo-skynet\u002FLocalAI\u002Fstargazers\" target=\"blank\">\n\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fgo-skynet\u002FLocalAI?style=for-the-badge\" alt=\"LocalAI 星标数\"\u002F>\n\u003C\u002Fa>\n\u003Ca href='https:\u002F\u002Fgithub.com\u002Fgo-skynet\u002FLocalAI\u002Freleases'>\n\u003Cimg src='https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Frelease\u002Fgo-skynet\u002FLocalAI?&label=最新版本&style=for-the-badge'>\n\u003C\u002Fa>\n\u003Ca href=\"LICENSE\" target=\"blank\">\n\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FLicense-MIT-yellow.svg?style=for-the-badge\" alt=\"LocalAI 许可证\"\u002F>\n\u003C\u002Fa>\n\u003C\u002Fp>\n\n\u003Cp align=\"center\">\n\u003Ca 
href=\"https:\u002F\u002Ftwitter.com\u002FLocalAI_API\" target=\"blank\">\n\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FX-%23000000.svg?style=for-the-badge&logo=X&logoColor=white&label=LocalAI_API\" alt=\"关注 LocalAI_API\"\u002F>\n\u003C\u002Fa>\n\u003Ca href=\"https:\u002F\u002Fdiscord.gg\u002FuJAeKSAGDy\" target=\"blank\">\n\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fdynamic\u002Fjson?color=blue&label=Discord&style=for-the-badge&query=approximate_member_count&url=https%3A%2F%2Fdiscordapp.com%2Fapi%2Finvites%2FuJAeKSAGDy%3Fwith_counts%3Dtrue&logo=discord\" alt=\"加入 LocalAI Discord 社区\"\u002F>\n\u003C\u002Fa>\n\u003C\u002Fp>\n\n\u003Cp align=\"center\">\n\u003Ca href=\"https:\u002F\u002Ftrendshift.io\u002Frepositories\u002F5539\" target=\"_blank\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fmudler_LocalAI_readme_4a68feb902da.png\" alt=\"mudler%2FLocalAI | Trendshift\" style=\"width: 250px; height: 55px;\" width=\"250\" height=\"55\"\u002F>\u003C\u002Fa>\n\u003C\u002Fp>\n\n**LocalAI** 是一款开源的 AI 引擎。你可以在任何硬件上运行任意模型——LLM、视觉、语音、图像、视频等，无需 GPU。\n\n- **即插即用的 API 兼容性** —— OpenAI、Anthropic、ElevenLabs 等 API\n- **35+ 后端支持** —— llama.cpp、vLLM、transformers、whisper、diffusers、MLX…\n- **兼容多种硬件** —— NVIDIA、AMD、Intel、Apple Silicon、Vulkan，甚至仅使用 CPU\n- **多用户支持** —— API 密钥认证、用户配额、基于角色的访问控制\n- **内置 AI 代理** —— 具备工具使用、RAG、MCP 和技能的自主代理\n- **隐私优先** —— 您的数据绝不会离开您的基础设施\n\n由 [Ettore Di Giacinto](https:\u002F\u002Fgithub.com\u002Fmudler) 创建并维护。\n\n> [:book: 文档](https:\u002F\u002Flocalai.io\u002F) | [:speech_balloon: Discord](https:\u002F\u002Fdiscord.gg\u002FuJAeKSAGDy) | [💻 快速入门](https:\u002F\u002Flocalai.io\u002Fbasics\u002Fgetting_started\u002F) | [🖼️ 模型库](https:\u002F\u002Fmodels.localai.io\u002F) | [❓常见问题解答](https:\u002F\u002Flocalai.io\u002Ffaq\u002F)\n\n## 截图\n\n### 聊天与模型展示\n\nhttps:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F08cbb692-57da-48f7-963d-2e7b43883c18\n\n### 
代理\n\nhttps:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F6270b331-e21d-4087-a540-6290006b381a\n\n## 快速入门\n\n### macOS\n\n\u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Freleases\u002Flatest\u002Fdownload\u002FLocalAI.dmg\">\n  \u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F下载-macOS-blue?style=for-the-badge&logo=apple&logoColor=white\" alt=\"下载 LocalAI for macOS\"\u002F>\n\u003C\u002Fa>\n\n> **注意**：该 DMG 文件未经过 Apple 签名。安装后，请运行：`sudo xattr -d com.apple.quarantine \u002FApplications\u002FLocalAI.app`。详情请参阅 [#6268](https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fissues\u002F6268)。\n\n### 容器（Docker、podman 等）\n\n> 已经运行过 LocalAI？使用 `docker start -i local-ai` 即可重启现有容器。\n\n#### 仅 CPU：\n\n```bash\ndocker run -ti --name local-ai -p 8080:8080 localai\u002Flocalai:latest\n```\n\n#### NVIDIA GPU：\n\n```bash\n# CUDA 13\ndocker run -ti --name local-ai -p 8080:8080 --gpus all localai\u002Flocalai:latest-gpu-nvidia-cuda-13\n\n# CUDA 12\ndocker run -ti --name local-ai -p 8080:8080 --gpus all localai\u002Flocalai:latest-gpu-nvidia-cuda-12\n\n# NVIDIA Jetson ARM64（CUDA 12，适用于 AGX Orin 等）\ndocker run -ti --name local-ai -p 8080:8080 --gpus all localai\u002Flocalai:latest-nvidia-l4t-arm64\n\n# NVIDIA Jetson ARM64（CUDA 13，适用于 DGX Spark）\ndocker run -ti --name local-ai -p 8080:8080 --gpus all localai\u002Flocalai:latest-nvidia-l4t-arm64-cuda-13\n```\n\n#### AMD GPU（ROCm）：\n\n```bash\ndocker run -ti --name local-ai -p 8080:8080 --device=\u002Fdev\u002Fkfd --device=\u002Fdev\u002Fdri --group-add=video localai\u002Flocalai:latest-gpu-hipblas\n```\n\n#### Intel GPU（oneAPI）：\n\n```bash\ndocker run -ti --name local-ai -p 8080:8080 --device=\u002Fdev\u002Fdri\u002Fcard1 --device=\u002Fdev\u002Fdri\u002FrenderD128 localai\u002Flocalai:latest-gpu-intel\n```\n\n#### Vulkan GPU：\n\n```bash\ndocker run -ti --name local-ai -p 8080:8080 localai\u002Flocalai:latest-gpu-vulkan\n```\n\n### 加载模型\n\n```bash\n# 从模型库加载（可通过 `local-ai models 
list` 查看可用模型，或访问 https:\u002F\u002Fmodels.localai.io）\nlocal-ai run llama-3.2-1b-instruct:q4_k_m\n# 从 Hugging Face 加载\nlocal-ai run huggingface:\u002F\u002FTheBloke\u002Fphi-2-GGUF\u002Fphi-2.Q8_0.gguf\n# 从 Ollama OCI 注册表加载\nlocal-ai run ollama:\u002F\u002Fgemma:2b\n# 从 YAML 配置文件加载\nlocal-ai run https:\u002F\u002Fgist.githubusercontent.com\u002F...\u002Fphi-2.yaml\n# 从标准 OCI 注册表（如 Docker Hub）加载\nlocal-ai run oci:\u002F\u002Flocalai\u002Fphi-2:latest\n```\n\n> **自动后端检测**：LocalAI 会自动检测你的 GPU 性能并下载合适的后端。如需高级选项，请参阅 [GPU 加速指南](https:\u002F\u002Flocalai.io\u002Ffeatures\u002Fgpu-acceleration\u002F)。\n\n更多详细信息，请参阅 [入门指南](https:\u002F\u002Flocalai.io\u002Fbasics\u002Fgetting_started\u002F)。\n\n## 最新消息\n\n- **2026年3月**：[代理管理](https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F8820)、[全新 React UI](https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F8772)、[WebRTC](https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F8790)、[MLX 通过 P2P 和 RDMA 分布式计算](https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F8801)、[MCP 应用程序与客户端支持](https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F8947)\n- **2026年2月**：[支持实时音频到音频的 API，并可调用工具](https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6245)、[支持 ACE-Step 1.5](https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F8396)\n- **2026年1月**：**LocalAI 3.10.0** — 支持 Anthropic API、Open Responses API、视频与图像生成（LTX-2）、统一 GPU 后端、工具流媒体、Moonshine、Pocket-TTS。[发布说明](https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Freleases\u002Ftag\u002Fv3.10.0)\n- **2025年12月**：[动态内存资源回收器](https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F7583)、[自动多 GPU 模型适配（llama.cpp）](https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F7584)、[Vibevoice 后端](https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F7494)\n- **2025年11月**：[通过 URL 
导入模型](https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F7245)、[多聊天窗口与历史记录](https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F7325)\n- **2025年10月**：[支持模型上下文协议（MCP）](https:\u002F\u002Flocalai.io\u002Fdocs\u002Ffeatures\u002Fmcp\u002F)以实现代理功能\n- **2025年9月**：为 macOS 和 Linux 推出全新启动器，扩展对 Mac 和 Nvidia L4T 的后端支持，新增 MLX-Audio 和 WAN 2.2\n- **2025年8月**：MLX、MLX-VLM、Diffusers、llama.cpp 现已支持 Apple Silicon\n- **2025年7月**：所有后端均已迁出主二进制文件——[轻量级、模块化架构](https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Freleases\u002Ftag\u002Fv3.2.0)\n\n更多过往新闻及完整发布说明，请查看 [GitHub 发布页面](https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Freleases)和 [新闻页面](https:\u002F\u002Flocalai.io\u002Fbasics\u002Fnews\u002F)。\n\n## 功能\n\n- [文本生成](https:\u002F\u002Flocalai.io\u002Ffeatures\u002Ftext-generation\u002F)（`llama.cpp`、`transformers`、`vllm`……[更多](https:\u002F\u002Flocalai.io\u002Fmodel-compatibility\u002F)）\n- [文本转语音](https:\u002F\u002Flocalai.io\u002Ffeatures\u002Ftext-to-audio\u002F)\n- [语音转文本](https:\u002F\u002Flocalai.io\u002Ffeatures\u002Faudio-to-text\u002F)\n- [图像生成](https:\u002F\u002Flocalai.io\u002Ffeatures\u002Fimage-generation)\n- [与 OpenAI 兼容的工具 API](https:\u002F\u002Flocalai.io\u002Ffeatures\u002Fopenai-functions\u002F)\n- [实时 API](https:\u002F\u002Flocalai.io\u002Ffeatures\u002Fopenai-realtime\u002F)（语音到语音）\n- [嵌入生成](https:\u002F\u002Flocalai.io\u002Ffeatures\u002Fembeddings\u002F)\n- [约束语法](https:\u002F\u002Flocalai.io\u002Ffeatures\u002Fconstrained_grammars\u002F)\n- [从 Hugging Face 下载模型](https:\u002F\u002Flocalai.io\u002Fmodels\u002F)\n- [视觉 API](https:\u002F\u002Flocalai.io\u002Ffeatures\u002Fgpt-vision\u002F)\n- [目标检测](https:\u002F\u002Flocalai.io\u002Ffeatures\u002Fobject-detection\u002F)\n- [重排序器 API](https:\u002F\u002Flocalai.io\u002Ffeatures\u002Freranker\u002F)\n- [P2P 推理](https:\u002F\u002Flocalai.io\u002Ffeatures\u002Fdistribute\u002F)\n- [分布式模式](https:\u002F\u002Flocalai.io\u002Ffeatures\u002Fdistributed-mode\u002F)——使用 
PostgreSQL + NATS 实现水平扩展\n- [模型上下文协议 (MCP)](https:\u002F\u002Flocalai.io\u002Fdocs\u002Ffeatures\u002Fmcp\u002F)\n- [内置智能体](https:\u002F\u002Flocalai.io\u002Ffeatures\u002Fagents\u002F)——具备工具使用、RAG、技能、SSE 流式传输等功能的自主 AI 智能体，以及 [Agent Hub](https:\u002F\u002Fagenthub.localai.io)\n- [后端库](https:\u002F\u002Flocalai.io\u002Fbackends\u002F)——通过 OCI 镜像即时安装或移除后端\n- 语音活动检测（Silero-VAD）\n- 集成 WebUI\n\n## 支持的后端与加速\n\nLocalAI 支持 **35+ 后端**，包括 llama.cpp、vLLM、transformers、whisper.cpp、diffusers、MLX、MLX-VLM 等。硬件加速适用于 **NVIDIA**（CUDA 12\u002F13）、**AMD**（ROCm）、**Intel**（oneAPI\u002FSYCL）、**Apple Silicon**（Metal）、**Vulkan** 以及 **NVIDIA Jetson**（L4T）。所有后端均可通过 [后端库](https:\u002F\u002Flocalai.io\u002Fbackends\u002F) 即时安装。\n\n请参阅完整的 [后端与模型兼容性表](https:\u002F\u002Flocalai.io\u002Fmodel-compatibility\u002F) 和 [GPU 加速指南](https:\u002F\u002Flocalai.io\u002Ffeatures\u002Fgpu-acceleration\u002F)。\n\n## 资源\n\n- [文档](https:\u002F\u002Flocalai.io\u002F)\n- [LLM 微调指南](https:\u002F\u002Flocalai.io\u002Fdocs\u002Fadvanced\u002Ffine-tuning\u002F)\n- [从源码构建](https:\u002F\u002Flocalai.io\u002Fbasics\u002Fbuild\u002F)\n- [Kubernetes 安装](https:\u002F\u002Flocalai.io\u002Fbasics\u002Fgetting_started\u002F#run-localai-in-kubernetes)\n- [集成与社区项目](https:\u002F\u002Flocalai.io\u002Fdocs\u002Fintegrations\u002F)\n- [媒体与博客文章](https:\u002F\u002Flocalai.io\u002Fbasics\u002Fnews\u002F#media-blogs-social)\n- [示例](https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI-examples)\n\n## 自主开发团队\n\nLocalAI 由一个以 AI Scrum 主管为首的自主 AI 智能体团队维护。\n\n- **实时报告**：[reports.localai.io](http:\u002F\u002Freports.localai.io)\n- **项目看板**：[智能体任务跟踪](https:\u002F\u002Fgithub.com\u002Fusers\u002Fmudler\u002Fprojects\u002F6)\n- **博客文章**：[了解实验详情](https:\u002F\u002Fmudler.pm\u002Fposts\u002F2026\u002F02\u002F28\u002Fa-call-to-open-source-maintainers-stop-babysitting-ai-how-i-built-a-100-local-autonomous-dev-team-to-maintain-localai-and-why-you-should-too\u002F)\n\n## 引用\n\n如果您在下游项目中使用了本仓库或其中的数据，请考虑以下引用方式：\n\n```\n@misc{localai,\n  author = {Ettore 
Di Giacinto},\n  title = {LocalAI：免费的开源 OpenAI 替代方案},\n  year = {2023},\n  publisher = {GitHub},\n  journal = {GitHub 仓库},\n  howpublished = {\\url{https:\u002F\u002Fgithub.com\u002Fgo-skynet\u002FLocalAI}},\n}\n```\n\n## 赞助商\n\n> 您觉得 LocalAI 有用吗？\n\n请通过成为 [支持者或赞助商](https:\u002F\u002Fgithub.com\u002Fsponsors\u002Fmudler) 来支持该项目。您的 logo 将在此处展示，并附上您网站的链接。\n\n衷心感谢慷慨赞助我们项目的各位，他们为 CI 开支提供了支持，以下是我们的 [赞助商名单](https:\u002F\u002Fgithub.com\u002Fsponsors\u002Fmudler)：\n\n\u003Cp align=\"center\">\n  \u003Ca href=\"https:\u002F\u002Fwww.spectrocloud.com\u002F\" target=\"blank\">\n    \u003Cimg height=\"200\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fmudler_LocalAI_readme_5ac1f6180ce5.png\">\n  \u003C\u002Fa>\n  \u003Ca href=\"https:\u002F\u002Fwww.premai.io\u002F\" target=\"blank\">\n    \u003Cimg height=\"200\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fmudler_LocalAI_readme_b80a16a54ad7.png\"> \u003Cbr>\n  \u003C\u002Fa>\n\u003C\u002Fp>\n\n### 个人赞助者\n\n特别感谢各位个人赞助者，完整名单可在 [GitHub](https:\u002F\u002Fgithub.com\u002Fsponsors\u002Fmudler) 和 [buymeacoffee](https:\u002F\u002Fbuymeacoffee.com\u002Fmudler) 上查看。特别鸣谢 [drikster80](https:\u002F\u002Fgithub.com\u002Fdrikster80)，感谢他的慷慨支持！谢谢大家！\n\n## 星标历史\n\n[![LocalAI 星标历史图](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fmudler_LocalAI_readme_a97aecd2da90.png)](https:\u002F\u002Fstar-history.com\u002F#go-skynet\u002FLocalAI&Date)\n\n## 许可证\n\nLocalAI 是由 [Ettore Di Giacinto](https:\u002F\u002Fgithub.com\u002Fmudler\u002F) 创建的社区驱动项目。\n\nMIT 许可证——作者 Ettore Di Giacinto \u003Cmudler@localai.io>\n\n## 致谢\n\n没有社区中已有的优秀软件的帮助，LocalAI 根本无法诞生。在此表示感谢！\n\n- [llama.cpp](https:\u002F\u002Fgithub.com\u002Fggerganov\u002Fllama.cpp)\n- https:\u002F\u002Fgithub.com\u002Ftatsu-lab\u002Fstanford_alpaca\n- https:\u002F\u002Fgithub.com\u002Fcornelk\u002Fllama-go 提供了最初的想法\n- https:\u002F\u002Fgithub.com\u002Fantimatter15\u002Falpaca.cpp\n- https:\u002F\u002Fgithub.com\u002FEdVince\u002FStable-Diffusion-NCNN\n- 
https:\u002F\u002Fgithub.com\u002Fggerganov\u002Fwhisper.cpp\n- https:\u002F\u002Fgithub.com\u002Frhasspy\u002Fpiper\n- [exo](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo) 提供了 MLX 分布式自动并行分片实现\n\n## 贡献者\n\n这是一个社区项目，特别感谢所有贡献者！\n\u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fgo-skynet\u002FLocalAI\u002Fgraphs\u002Fcontributors\">\n  \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fmudler_LocalAI_readme_484175fce90d.png\" \u002F>\n\u003C\u002Fa>","# LocalAI 快速上手指南\n\nLocalAI 是一个开源的 AI 引擎，允许你在任何硬件（包括 CPU）上本地运行各种模型（LLM、语音、图像、视频等），并提供与 OpenAI API 兼容的接口。\n\n## 环境准备\n\n### 系统要求\n- **操作系统**：Linux、macOS、Windows (通过 WSL2 或 Docker)\n- **硬件**：\n  - **CPU**：任意现代 x86_64 或 ARM64 处理器（无需 GPU 即可运行）\n  - **GPU（可选加速）**：NVIDIA (CUDA 12\u002F13), AMD (ROCm), Intel (oneAPI), Apple Silicon (Metal), Vulkan\n- **内存**：建议至少 4GB RAM（运行大模型需更多）\n\n### 前置依赖\n- **容器方案（推荐）**：安装 Docker 或 Podman\n- **二进制方案**：无特殊依赖，直接下载可执行文件\n- **网络**：首次运行需联网下载模型后端和模型文件\n\n> **注意**：国内用户若遇到 Docker 拉取缓慢，建议配置 Docker 镜像加速器。\n\n---\n\n## 安装步骤\n\n### 方式一：使用 Docker（推荐）\n\nLocalAI 提供针对不同硬件优化的镜像，请根据你的硬件选择对应命令。\n\n#### 1. 仅使用 CPU\n```bash\ndocker run -ti --name local-ai -p 8080:8080 localai\u002Flocalai:latest\n```\n\n#### 2. NVIDIA GPU (CUDA)\n根据你安装的 CUDA 版本选择（推荐 CUDA 12 或 13）：\n```bash\n# CUDA 12\ndocker run -ti --name local-ai -p 8080:8080 --gpus all localai\u002Flocalai:latest-gpu-nvidia-cuda-12\n\n# CUDA 13\ndocker run -ti --name local-ai -p 8080:8080 --gpus all localai\u002Flocalai:latest-gpu-nvidia-cuda-13\n```\n\n#### 3. AMD GPU (ROCm)\n```bash\ndocker run -ti --name local-ai -p 8080:8080 --device=\u002Fdev\u002Fkfd --device=\u002Fdev\u002Fdri --group-add=video localai\u002Flocalai:latest-gpu-hipblas\n```\n\n#### 4. Intel GPU (oneAPI)\n```bash\ndocker run -ti --name local-ai -p 8080:8080 --device=\u002Fdev\u002Fdri\u002Fcard1 --device=\u002Fdev\u002Fdri\u002FrenderD128 localai\u002Flocalai:latest-gpu-intel\n```\n\n#### 5. macOS (DMG 安装包)\n1. 
下载最新版的 `.dmg` 文件：[Download for macOS](https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Freleases\u002Flatest\u002Fdownload\u002FLocalAI.dmg)\n2. 安装后，由于未签名，需在终端执行以下命令解除隔离限制：\n   ```bash\n   sudo xattr -d com.apple.quarantine \u002FApplications\u002FLocalAI.app\n   ```\n3. 启动应用程序即可。\n\n---\n\n## 基本使用\n\n安装完成后，LocalAI 默认在 `http:\u002F\u002Flocalhost:8080` 运行。它内置了 WebUI 和兼容 OpenAI 的 API。\n\n### 1. 加载并运行模型\n\n你可以直接使用 `local-ai` 命令（如果在宿主机安装了二进制）或通过 API 动态加载。以下是通过命令行快速运行模型的示例（需确保在容器内或已安装 CLI）：\n\n**从官方模型库运行 (推荐):**\n```bash\nlocal-ai run llama-3.2-1b-instruct:q4_k_m\n```\n\n**从 HuggingFace 运行:**\n```bash\nlocal-ai run huggingface:\u002F\u002FTheBloke\u002Fphi-2-GGUF\u002Fphi-2.Q8_0.gguf\n```\n\n**从 Ollama 注册表运行:**\n```bash\nlocal-ai run ollama:\u002F\u002Fgemma:2b\n```\n\n> **提示**：LocalAI 会自动检测你的硬件并下载合适的后端（如 llama.cpp, vLLM 等）。\n\n### 2. 调用 API (OpenAI 兼容)\n\nLocalAI 完全兼容 OpenAI API 格式。你可以使用任何支持 OpenAI 的客户端工具（如 Python SDK, Curl, LangChain 等）。\n\n**使用 Curl 测试文本生成：**\n\n```bash\ncurl http:\u002F\u002Flocalhost:8080\u002Fv1\u002Fchat\u002Fcompletions \\\n  -H \"Content-Type: application\u002Fjson\" \\\n  -d '{\n    \"model\": \"llama-3.2-1b-instruct:q4_k_m\",\n    \"messages\": [{\"role\": \"user\", \"content\": \"你好，请介绍一下你自己\"}]\n  }'\n```\n\n**使用 Python SDK 测试：**\n\n```python\nfrom openai import OpenAI\n\nclient = OpenAI(\n    base_url=\"http:\u002F\u002Flocalhost:8080\u002Fv1\",\n    api_key=\"not-needed\" # 本地运行通常不需要真实的 API Key\n)\n\nresponse = client.chat.completions.create(\n    model=\"llama-3.2-1b-instruct:q4_k_m\",\n    messages=[{\"role\": \"user\", \"content\": \"写一首关于春天的短诗\"}]\n)\n\nprint(response.choices[0].message.content)\n```\n\n### 3. 
访问内置 WebUI\n\n直接在浏览器打开：\n```\nhttp:\u002F\u002Flocalhost:8080\n```\n你可以在界面中浏览模型画廊、管理对话历史记录以及配置 Agent。\n\n---\n\n**下一步**：\n- 查看完整文档：[https:\u002F\u002Flocalai.io\u002F](https:\u002F\u002Flocalai.io\u002F)\n- 浏览可用模型：[https:\u002F\u002Fmodels.localai.io\u002F](https:\u002F\u002Fmodels.localai.io\u002F)","某初创医疗团队需要在内部旧服务器上部署一套患者咨询助手，要求数据完全本地化以符合隐私法规，但团队仅配备无独立显卡的普通办公电脑。\n\n### 没有 LocalAI 时\n- **硬件门槛高**：主流大模型依赖高性能 NVIDIA GPU，团队现有的 CPU 服务器无法运行，被迫申请高昂的云端算力预算。\n- **数据泄露风险**：使用公有云 API 意味着患者敏感病历需上传至第三方服务器，严重违反医疗数据合规要求。\n- **集成成本巨大**：不同功能（如语音问诊、影像分析）需对接多家厂商接口，协议不统一导致开发周期延长数周。\n- **网络依赖性强**：一旦外网波动或云服务中断，整个咨询系统即刻瘫痪，无法保障急诊场景下的连续性。\n\n### 使用 LocalAI 后\n- **利旧降本**：LocalAI 直接利用现有 CPU 资源运行量化后的 LLM 和语音模型，无需采购任何新显卡，启动成本降为零。\n- **隐私闭环**：所有推理过程均在局域网内完成，患者数据从未离开内部基础设施，轻松通过安全审计。\n- **统一接口开发**：LocalAI 提供兼容 OpenAI 的标准 API，团队只需修改一行代码即可让原有系统同时支持文本、语音和多模态输入。\n- **离线稳定运行**：部署为本地 Docker 容器后，系统完全脱离外网依赖，即使在网络隔离环境下也能 7x24 小时稳定响应。\n\nLocalAI 让资源受限的团队也能在零信任架构下，以最低成本构建全功能、自主可控的私有化 AI 应用。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fmudler_LocalAI_aabfa243.png","mudler","Ettore Di Giacinto","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002Fmudler_457f3f3a.jpg","ex-SUSE\u002FRancher, ex-gentoo, ex-Sabayon, 
@mocaccinoOS",null,"Italy","mudler_it","https:\u002F\u002Fmudler.pm","https:\u002F\u002Fgithub.com\u002Fmudler",[86,90,94,98,102,106,110,114,118,122],{"name":87,"color":88,"percentage":89},"Go","#00ADD8",69.8,{"name":91,"color":92,"percentage":93},"JavaScript","#f1e05a",11.8,{"name":95,"color":96,"percentage":97},"HTML","#e34c26",7.2,{"name":99,"color":100,"percentage":101},"Python","#3572A5",6.1,{"name":103,"color":104,"percentage":105},"C++","#f34b7d",1.9,{"name":107,"color":108,"percentage":109},"CSS","#663399",1.3,{"name":111,"color":112,"percentage":113},"Shell","#89e051",1,{"name":115,"color":116,"percentage":117},"Makefile","#427819",0.7,{"name":119,"color":120,"percentage":121},"Dockerfile","#384d54",0.2,{"name":123,"color":124,"percentage":125},"CMake","#DA3434",0.1,44782,3844,"2026-04-02T22:14:26","MIT","Linux, macOS","非必需。支持多种硬件加速：NVIDIA (CUDA 12\u002F13), AMD (ROCm), Intel (oneAPI\u002FSYCL), Apple Silicon (Metal), Vulkan, NVIDIA Jetson (L4T)。若无 GPU 可仅使用 CPU 运行。","未说明（取决于加载的模型大小）",{"notes":134,"python":135,"dependencies":136},"1. 提供 Docker 镜像，推荐通过 Docker\u002FPodman 部署，可根据硬件自动选择镜像（如 CPU 版、NVIDIA CUDA 版、AMD ROCm 版等）。2. macOS 用户可下载 DMG 安装包，首次运行需执行命令移除隔离属性。3. 支持从 HuggingFace、Ollama、OCI 仓库等多种来源动态加载模型。4. 架构模块化，后端组件可按需安装。5. 内置 WebUI 和 API 兼容层（OpenAI\u002FAnthropic 等）。","未说明",[137,138,139,140,141,142,143],"llama.cpp","vLLM","transformers","whisper","diffusers","MLX","torch",[14,55,26,15,13,51,53],[146,147,148,149,150,151,152,153,154,155,156,157,158,159,160,161,162,163],"llama","ai","llm","stable-diffusion","api","tts","musicgen","mamba","audio-generation","image-generation","text-generation","rerank","distributed","libp2p","decentralized","object-detection","mcp","agents",13,"2026-03-27T02:49:30.150509","2026-04-06T07:01:03.615331",[168,173,178,183,188,192],{"id":169,"question_zh":170,"answer_zh":171,"source_url":172},10922,"遇到 'rpc error: code = Unavailable ... connection refused' gRPC 连接错误怎么办？","这是一个常见的后端服务启动失败问题，可能由多种原因引起（如模型加载超时、端口冲突或架构不匹配）。\n1. 
检查日志确认后端进程是否成功启动。\n2. 确保使用的 Docker 镜像或二进制文件与你的硬件架构（如 Apple Silicon, NVIDIA GPU）匹配。\n3. 如果是 macOS Apple Silicon 用户，尝试通过 Homebrew 安装或确保使用正确的 arm64 镜像。\n4. 增加模型加载的超时时间，因为大模型在慢速硬盘上可能需要更长时间启动。\n5. 检查防火墙或网络设置是否阻止了本地回环接口（127.0.0.1）的通信。","https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fissues\u002F771",{"id":174,"question_zh":175,"answer_zh":176,"source_url":177},10923,"如何在非 Arch Linux 系统（如 RHEL, SLES, Ubuntu）上运行带有 ROCm 支持的 LocalAI？","目前官方对 ROCm (AMD GPU) 的原生支持主要集中在 Arch Linux 上，其他企业级系统（如 RHEL, SLES）直接构建较为困难。\n建议方案：\n1. 优先尝试使用社区维护的或特定配置的 Docker 容器，这些容器可能已经预装了 ROCm 依赖。\n2. 参考 `llama.cpp` 项目的 Docker 实现，它们通常能开箱即用支持 ROCm。\n3. 如果必须自行构建，请确保安装了与你的操作系统版本严格匹配的 ROCm 驱动程序和开发包。\n4. 关注项目路线图中的 'area\u002Fcontainer' 标签更新，以获取更好的多发行版支持。","https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fissues\u002F1592",{"id":179,"question_zh":180,"answer_zh":181,"source_url":182},10924,"在 Windows WSL 环境下编译或运行失败，有什么推荐的解决方案？","在 Ubuntu WSL 上可能会遇到兼容性问题。一个有效的解决方案是切换到 Debian WSL 环境。\n具体步骤：\n1. 卸载当前的 Ubuntu WSL 实例，安装 Debian WSL。\n2. 下载对应的预编译二进制文件（例如 `local-ai-avx-Linux-x86_64`）。\n3. 创建一个启动脚本（如 `localai.sh`），配置必要的环境变量：\n   ```bash\n   export GALLERIES='[{\"url\": \"github:go-skynet\u002Fmodel-gallery\u002Fhuggingface.yaml\",\"name\":\"huggingface\"}]'\n   export CORS=true\n   export ADDRESS=localhost:4040\n   export DEBUG=true\n   export PARALLEL_REQUESTS=true\n   export THREADS=18\n   .\u002Flocal-ai-avx-Linux-x86_64\n   ```\n4. 赋予脚本执行权限并运行。这种方法避免了复杂的源码编译过程。","https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fissues\u002F1196",{"id":184,"question_zh":185,"answer_zh":186,"source_url":187},10925,"更新显卡驱动后 CUDA 加速失效或报错，如何处理？","当主机显卡驱动更新后，Docker 容器内的 CUDA 版本可能与新驱动不兼容，导致加速失效。\n解决方法：\n1. 确认宿主机安装的 NVIDIA 驱动版本支持的 CUDA 版本。\n2. 更换 LocalAI 的 Docker 镜像标签，使其内部的 CUDA 版本与宿主机驱动匹配。例如，如果驱动较新，尝试使用 `latest-aio-gpu-nvidia-cuda-12` 或更高版本的镜像。\n3. 
运行命令示例：\n   ```bash\n   docker run --rm -ti --gpus all -p 8080:8080 -e DEBUG=true -v $PWD\u002Fmodels:\u002Fmodels localai\u002Flocalai:latest-aio-gpu-nvidia-cuda-12 --models-path \u002Fmodels\n   ```\n4. 如果不确定版本，尝试删除旧容器并重新拉取最新镜像。","https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fissues\u002F2394",{"id":189,"question_zh":190,"answer_zh":191,"source_url":177},10926,"显存不足（VRAM OOM）导致图像生成或模型加载失败怎么办？","当模型过大无法完全放入显存时，会引发错误。\n解决方案：\n1. 更换更小的模型版本（例如从 `sd3-medium` 换到 `dreamshaper` 或其他量化版本）。\n2. 使用量化模型（如 Q4_K_M, Q5_K_M 等 GGUF 格式），它们占用的显存更少。\n3. 调整启动参数，限制上下文大小（`--context-size`）或批次大小。\n4. 对于图像生成，尝试降低分辨率或步数。\n5. 如果必须使用大模型，考虑启用 CPU 卸载（虽然速度会变慢），或者升级硬件。",{"id":193,"question_zh":194,"answer_zh":195,"source_url":182},10927,"如何正确配置环境变量来运行 LocalAI 并启用模型画廊？","可以通过设置环境变量来灵活配置 LocalAI，而无需修改配置文件。\n常用配置示例：\n- `GALLERIES`: 定义模型来源，例如 `[{'url': 'github:go-skynet\u002Fmodel-gallery\u002Fhuggingface.yaml','name':'huggingface'}]`。\n- `ADDRESS`: 设置监听地址，如 `localhost:4040` 或 `0.0.0.0:8080`。\n- `THREADS`: 设置 CPU 线程数，根据核心数调整（如 `18`）。\n- `DEBUG`: 设为 `true` 以输出详细日志排查问题。\n- `CORS`: 设为 `true` 允许跨域请求。\n- `PARALLEL_REQUESTS`: 设为 `true` 启用并行处理。\n可以在运行二进制文件或 Docker 容器时通过 `-e` 参数传递这些变量。",[197,202,207,212,217,222,227,232,237,242,247,252,257,262,267,272,277,282,287,292],{"id":198,"version":199,"summary_zh":200,"released_at":201},53349,"v4.1.0","# 🎉 LocalAI 4.1.0 发布！🚀\n\n\u003Ch1 align=\"center\">\n  \u003Cbr>\n  \u003Cimg height=\"300\" src=\"https:\u002F\u002Fraw.githubusercontent.com\u002Fmudler\u002FLocalAI\u002Frefs\u002Fheads\u002Fmaster\u002Fcore\u002Fhttp\u002Fstatic\u002Flogo.png\">\n  \u003Cbr>\n  \u003Cbr>\n\u003C\u002Fh1>\n\nLocalAI 4.1.0 正式发布！🔥\n\n就在具有里程碑意义的 4.0 版本发布仅几周后，我们再次带来一次重磅更新。本次发布将 LocalAI 打造成一个 **生产级 AI 平台**：轻松搭建具备智能路由和自动扩缩容功能的分布式集群，通过内置的身份验证和用户配额机制确保安全性，甚至无需离开 UI 即可对模型进行微调，还有更多强大功能等你探索。如果说 4.0 是基石，那么 **4.1 就是控制中枢**。\n\n| 功能 | 简介 |\n|--------|--------|\n| 🌐 **分布式模式** | 将 LocalAI 部署为集群——智能路由、节点分组、优雅下线\u002F恢复、最小\u002F最大自动扩缩容。 |\n| 🔐 **用户与认证** | 内置 OIDC 用户管理、邀请模式、API 
密钥及管理员模拟登录功能。 |\n| 📊 **配额系统** | 基于预测分析和细分仪表盘的用户级使用配额。 |\n| 🧪 **微调** | （实验性）支持通过 TRL 对模型进行微调，并可自动导出为 GGUF 格式后再导入——全程在 UI 中完成。 |\n| ⚗️ **量化** | （实验性）新增用于实时模型量化的后端功能。 |\n| 🔧 **管道编辑器** | 在 React UI 中提供可视化模型管道编辑器。 |\n| 🤖 **独立代理** | 可通过 CLI 运行代理，命令为 `local-ai agent run`。 |\n| 🧠 **智能推理** | 自动推理默认采用 Unsloth 模型，工具解析作为后备方案，并支持 `min_p` 参数。 |\n| 🎬 **媒体历史** | 在 Studio 页面中浏览过往生成的图片及其他媒体内容。 |\n\n\n**全新**（详细版）完整部署教程：https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=cMVNnlqwfw4\n---\n\n## 🚀 核心特性\n\n### 🌐 分布式模式：水平扩展 LocalAI\n\n现在你可以将 LocalAI 部署为 **分布式集群**，让系统自动决定将请求路由到哪个节点，彻底告别单节点瓶颈。\n\n- **智能路由**：请求会根据各节点可用的 VRAM 大小进行排序，优先分配给显存最充足、空闲的 GPU。\n- **节点分组**：可将特定模型固定到不同的节点分组，实现工作负载隔离（例如，“GPU 密集型”与“CPU 轻量型”）。\n- **自动扩缩容**：内置最小\u002F最大扩缩容策略，并配备节点协调器，自动管理节点生命周期。\n- **优雅下线与恢复**：只需一条 API 请求，即可优雅地将节点下线进行维护，并在完成后快速恢复上线。\n- **集群仪表盘**：从首页即可一目了然地查看整个集群的状态。\n- **智能模型迁移**：支持通过 S3 或点对点方式传输模型。\n\nhttps:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F2ae64094-b8df-4ec2-accf-0b8afd5f9e79\n\n---\n\n### 🔐 用户、认证与配额\n\nLocalAI 现已自带一套 **完整的多用户平台**，非常适合团队、课堂或其他需要共享部署的场景。\n\n- **用户管理**：可在 React UI 中创建、编辑和管理用户。\n- **OIDC\u002FOAuth**：可接入任意身份提供商实现单点登录——Google、Keycloak、Authentik 等，任君选择。\n- **邀请模式**：限制注册仅限受邀用户，并需管理员批准。\n- **API 密钥**：为每个用户生成唯一的 API 密钥。","2026-04-02T22:14:48",{"id":203,"version":204,"summary_zh":205,"released_at":206},53350,"v4.0.0","\u003C!-- 发布说明由 .github\u002Frelease.yml 配置在 master 分支上生成 -->\n\n---\n# 🎉 LocalAI 4.0.0 正式发布！🚀\n\n\u003Ch1 align=\"center\">\n  \u003Cbr>\n  \u003Cimg height=\"300\" src=\"https:\u002F\u002Fraw.githubusercontent.com\u002Fmudler\u002FLocalAI\u002Frefs\u002Fheads\u002Fmaster\u002Fcore\u002Fhttp\u002Fstatic\u002Flogo.png\">\n  \u003Cbr>\n  \u003Cbr>\n\u003C\u002Fh1>\n\nLocalAI 4.0.0 已正式发布！\n\n本次重大版本更新将 LocalAI 打造成一个完整的 AI 编排平台。我们直接将智能体和混合搜索功能嵌入核心，使用 React 全面重构了用户界面，带来现代化的使用体验；同时，我们非常高兴地推出 **Agenthub**（[链接](https:\u002F\u002Fagenthub.localai.io)），这是一个全新的社区中心，方便用户轻松分享和导入智能体。除了这些重大更新外，我们还引入了诸如用于代码工件的画布模式、MCP 应用以及对 MCP 客户端的全面支持等强大新功能。\n\n| 功能 | 简介 
|\n|--------|--------|\n| **智能体编排与 Agenthub** | 原生智能体管理，支持记忆、技能，并通过新的 Agenthub 实现社区共享。 |\n| **焕然一新的 React UI** | 完整前端重写，性能极速提升，用户体验更加现代化。 |\n| **画布模式** | 在聊天界面中并排预览代码块和生成的工件。 |\n| **MCP 客户端支持** | 完全支持模型上下文协议、MCP 应用以及聊天中的工具流式传输。 |\n| **WebRTC 实时通信** | 支持 WebRTC，实现低延迟的实时音频对话。 |\n| **新增后端** | 新增实验性后端 **MLX Distributed**、fish-speech、ace-step.cpp 以及 faster-qwen3-tts。 |\n| **基础设施** | 提供 Podman 文档、Shell 自动补全功能，并实现了持久化数据路径的分离。 |\n\n---\n\n## 🚀 核心特性\n\n### 🤖 原生智能体编排与 Agenthub\nLocalAI 现在内置了原生的智能体能力，直接集成在核心中。您可以通过全新界面管理和导入智能体，启动或停止它们。\n- 🌐 **Agenthub：** 我们隆重推出了 **[Agenthub](https:\u002F\u002Fagenthub.localai.io\u002F)**！这是一个集中式的社区空间，用于分享通用智能体，并可轻松导入到您的 LocalAI 实例中。\n- **智能体管理：** 通过 React UI 实现智能体的全生命周期管理。您可以创建智能体、将其连接到 Slack、配置 MCP 服务器和技能。\n- **技能管理：** 中央化的智能体技能数据库。\n- **记忆：** 智能体可以利用混合搜索（PostgreSQL）或嵌入式内存存储（Chromem）来管理记忆。\n- **可观测性：** 智能体列表中新增“事件”列，用于跟踪可观测指标和状态。\n- 📚 **文档：** 欢迎查阅我们的[官方智能体文档](https:\u002F\u002Flocalai.io\u002Ffeatures\u002Fagents)，深入了解这些新功能。\n\nhttps:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F6270b331-e21d-4087-a540-6290006b381a\n\n### 🎨 界面焕新与画布模式\nWeb 界面已完全迁移到 **React**，带来了更流畅的使用体验和强大的新功能：\n- **画布模式：** 在聊天中启用“画布模式”，即可在右侧专用预览栏中查看 LLM 生成的代码块和工件。\n- **系统","2026-03-14T18:18:41",{"id":208,"version":209,"summary_zh":210,"released_at":211},53351,"v3.12.1","\u003C!-- 发布说明由 .github\u002Frelease.yml 中的配置在 master 分支上生成 -->\n\n这是一个补丁版本，用于标记新的 llama.cpp 版本，该版本修复了与通义千问 3 coder 的不兼容问题。\n\n## 变更内容\n### 其他变更\n* docs：:arrow_up: 更新文档版本 mudler\u002FLocalAI，由 @localai-bot 在 https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F8611 中完成\n* feat(traces)：添加后端追踪功能，由 @richiejp 在 https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F8609 中完成\n* chore：:arrow_up: 将 ggml-org\u002Fllama.cpp 更新至 `b908baf1825b1a89afef87b09e22c32af2ca6548`，由 @localai-bot 在 https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F8612 中完成\n* chore：从管道中移除 bark.cpp 的残留文件，由 @mudler 在 
https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F8614 中完成\n* fix：合并 openresponses 消息，由 @mudler 在 https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F8615 中完成\n* chore：:arrow_up: 将 ggml-org\u002Fllama.cpp 更新至 `ba3b9c8844aca35ecb40d31886686326f22d2214`，由 @localai-bot 在 https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F8613 中完成\n\n\n**完整变更日志**：https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fcompare\u002Fv3.12.0...v3.12.1","2026-02-21T13:49:24",{"id":213,"version":214,"summary_zh":215,"released_at":216},53352,"v3.12.0","# 🎉 LocalAI 3.12.0 发布！🚀\n\n\u003Ch1 align=\"center\">\n  \u003Cbr>\n  \u003Cimg height=\"300\" src=\"https:\u002F\u002Fraw.githubusercontent.com\u002Fmudler\u002FLocalAI\u002Frefs\u002Fheads\u002Fmaster\u002Fcore\u002Fhttp\u002Fstatic\u002Flogo.png\">\n  \u003Cbr>\n  \u003Cbr>\n\u003C\u002Fh1>\n\nLocalAI 3.12.0 已发布！\n\n| 功能 | 简介 |\n|--------|--------|\n| **多模态实时交互** | 在实时对话中发送文本、图像和音频，实现更丰富的交互体验。 |\n| **Voxtral 后端** | 新增高质量的文本转语音后端。 |\n| **多 GPU 支持** | 通过多 GPU 提升 Diffusers 的性能。 |\n| **旧版 CPU 优化** | 增强对老旧处理器的兼容性。 |\n| **UI 主题与布局** | 改进了 UI 主题（深色\u002F浅色版本）及导航方式。 |\n| **实时稳定性** | 针对音频、图像和模型处理进行了多项修复。 |\n| **日志改进** | 减少了过多的日志输出，并优化了日志处理流程。 |\n\n---\n\n## Local Stack 家族\n\n喜欢 LocalAI 吗？LocalAI 是一套集成式 AI 基础设施工具的一部分，你可能也会对以下项目感兴趣：\n\n- **[LocalAGI](https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAGI)** - 兼容 OpenAI Responses API 的 AI 代理编排平台，具备先进的代理能力。\n- **[LocalRecall](https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalRecall)** - 提供持久化记忆与存储的 MCP\u002FREST API 知识库系统，专为 AI 代理设计。\n- 🆕 **[Cogito](https:\u002F\u002Fgithub.com\u002Fmudler\u002Fcogito)** - 用于构建智能协作型代理软件及 LLM 驱动工作流的 Go 库，专注于提升小型开源语言模型的表现，并可扩展至任何 LLM。它为 LocalAGI 和 LocalAI 的 MCP\u002F代理功能提供支持。\n- 🆕 **[Wiz](https:\u002F\u002Fgithub.com\u002Fmudler\u002Fwiz)** - 通过 Ctrl+Space 快捷键访问的终端 AI 代理。这是一款便携、本地 LLM 友好的 Shell 助手，支持 TUI\u002FCLI 模式、经批准的工具执行、MCP 协议以及多 shell 兼容性（zsh、bash、fish）。\n- 🆕 
**[SkillServer](https:\u002F\u002Fgithub.com\u002Fmudler\u002Fskillserver)** - 通过 MCP 为 AI 代理提供的简单集中式技能数据库。以 Markdown 文件管理技能，集成 MCP 服务器、Web 界面编辑、Git 同步及全文搜索功能。\n\n## ❤️ 感谢\n\nLocalAI 是一场真正的 FOSS 运动——由贡献者共建，由社区驱动。\n\n如果你认同隐私优先的 AI：\n- ✅ **点赞** 本仓库\n- 💬 **贡献** 代码、文档或反馈\n- 📣 **分享** 给更多人\n\n你的支持让这一技术栈持续发展。\n\n---\n\n## ✅ 完整变更日志\n\n\u003Cdetails>\n\u003Csummary>📋 点击展开完整变更日志\u003C\u002Fsummary>\n\n## 变更内容\n### Bug 修复 :bug:\n* 安全性：验证 URL 以防止内容获取端点中的 SSRF 攻击，由 @kolega-ai-dev 在 https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F8476 中完成。\n* 修复（实时）：使用用户提供的语音，并允许管道模型无需后端，由 @richiejp 在 https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F8415 中完成。\n* 修复（实时）：采样与 WebSocket 锁定问题，由 @richiejp 在 https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F8521 中修复。\n* 修复（实时）：正确发送图像数据","2026-02-20T18:16:23",{"id":218,"version":219,"summary_zh":220,"released_at":221},53353,"v3.11.0","# 🎉 LocalAI 3.11.0 发布！🚀\n\n\u003Ch1 align=\"center\">\n  \u003Cbr>\n  \u003Cimg height=\"300\" src=\"https:\u002F\u002Fraw.githubusercontent.com\u002Fmudler\u002FLocalAI\u002Frefs\u002Fheads\u002Fmaster\u002Fcore\u002Fhttp\u002Fstatic\u002Flogo.png\">\n  \u003Cbr>\n  \u003Cbr>\n\u003C\u002Fh1>\n\nLocalAI 3.11.0 是一次针对 **音频与多模态能力** 的重大更新。\n\n我们推出了 **实时音频对话**、专用的 **音乐生成界面**，以及对 **ASR（语音转文本）** 和 **TTS** 后端的大规模扩展。无论您是想与 AI 对话、克隆声音、进行带有说话人识别的转写，还是生成歌曲，本次发布都能满足您的需求。\n\n请查看下方的亮点内容！\n\n---\n\n## 📌 简要概述\n\n| 功能 | 概述 |\n|--------|--------|\n| **实时音频** | 原生支持 **音频对话**，实现类似 OpenAI Realtime API 的流畅语音交互。[文档](https:\u002F\u002Flocalai.io\u002Ffeatures\u002Fopenai-realtime\u002F)|\n| **音乐生成界面** | 针对 **MusicGen**（Ace-Step）的新 UI 界面，允许您直接在浏览器中通过文本提示生成音乐。|\n| **新增 ASR 后端** | 新增了 **WhisperX**（带说话人分离功能）、**VibeVoice**、**Qwen-ASR** 和 **Nvidia NeMo**。|\n| **TTS 流式传输** | 文本转语音现支持 **流式模式**，以降低响应延迟。（目前仅限 VoxCPM）|\n| **vLLM Omni** | 新增对 **vLLM Omni** 的支持，进一步扩展了我们的高性能推理能力。|\n| **说话人分离** | 原生支持通过 **WhisperX** 在转写中识别不同说话人。|\n| **硬件支持扩展** | 扩展了对 CUDA 12\u002F13、L4T（Jetson）、SBSA 的构建支持，并增强了与 MLX 后端的 
Metal（Apple Silicon）集成。|\n| **破坏性变更** | 已移除 **ExLlama**（已弃用）和 **Bark**（未维护）后端。|\n\n---\n\n## 🚀 新特性与重大改进\n\n### 🎙️ **实时音频对话**\n\nLocalAI 3.11.0 引入了对 **实时音频对话** 的原生支持。\n\n- 实现与智能体之间流畅、低延迟的语音交互。\n- 逻辑直接在 LocalAI 流程中处理，确保音频输入\u002F输出工作流无缝衔接。\n- 支持 STT\u002FTTS 以及语音到语音模型（实验性功能）。\n- 支持工具调用。\n\n> 🗣️ **与您的 LocalAI 对话**：这使我们距离完全本地化、语音原生的助手体验更近了一步，且兼容标准客户端实现。\n\n详细文档请参阅 [这里](https:\u002F\u002Flocalai.io\u002Ffeatures\u002Fopenai-realtime\u002F)。\n\n---\n\n### 🎵 **音乐生成界面与 Ace-Step**\n\n我们新增了一个专门用于音乐生成的界面！\n\n- **新后端**：通过 `ace-step` 后端支持 **Ace-Step**（MusicGen）。\n- **Web UI 集成**：可直接从 LocalAI Web UI 生成音乐片段。\n- 简单的文本到音乐工作流（例如：“适合学习的低保真嘻哈节拍”）。\n\n\n\u003Cimg width=\"1920\" height=\"1820\" alt=\"截图 2026-02-07 23:32:00 LocalAI - 使用 ace-step-turbo 生成声音\" src=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F6cd1b169-f1","2026-02-07T21:31:43",{"id":223,"version":224,"summary_zh":225,"released_at":226},53354,"v3.10.1","这是一个小型补丁版本，旨在修复一些 bug 并进行小幅优化。此外，我们还新增了对昨日刚发布的 Qwen-TTS 的支持。\n\n- 修复推理和指令模型上的推理检测问题\n- 支持带有 openresponses 的推理块\n- 修复 API，使其能够正确运行 LTX-2\n- 支持 Qwen3-TTS！\n\n## 变更内容\n### Bug 修复 :bug:\n* fix(reasoning): 支持无需以“thinking”标签开头的推理模型，由 @mudler 在 https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F8132 中实现\n* fix(tracing): 在首次请求时创建追踪缓冲区，以便在运行时启用追踪功能，由 @richiejp 在 https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F8148 中实现\n* fix(videogen): 移除不完整的端点，并为 LTX-2 添加 GGUF 支持，由 @mudler 在 https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F8160 中实现\n### 令人兴奋的新特性 🎉\n* feat(openresponses): 支持推理块，由 @mudler 在 https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F8133 中实现\n* feat: 如果未显式设置，则从后端自动检测是否支持“thinking”功能，由 @mudler 在 https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F8167 中实现\n* feat(qwen-tts): 添加 Qwen-tts 后端，由 @mudler 在 https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F8163 中实现\n### 🧠 模型\n* chore(model gallery): :robot: 通过画廊代理添加 1 个新模型，由 @localai-bot 在 
https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F8128 中实现\n* chore(model gallery): 添加 flux 2 和 flux 2 klein，由 @mudler 在 https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F8141 中实现\n* chore(model-gallery): :arrow_up: 更新校验和，由 @localai-bot 在 https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F8153 中实现\n* chore(model gallery): :robot: 通过画廊代理添加 1 个新模型，由 @localai-bot 在 https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F8157 中实现\n* chore(model gallery): :robot: 通过画廊代理添加 1 个新模型，由 @localai-bot 在 https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F8170 中实现\n### 👒 依赖项\n* chore(deps): 将 github.com\u002Fmudler\u002Fcogito 从 0.7.2 升级至 0.8.1，由 @dependabot[bot] 在 https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F8124 中实现\n### 其他变更\n* feat(swagger): 更新 Swagger 文档，由 @localai-bot 在 https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F8098 中实现\n* chore: :arrow_up: 将 ggml-org\u002Fllama.cpp 更新至 `287a33017b32600bfc0e81feeb0ad6e81e0dd484`，由 @localai-bot 在 https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F8100 中实现\n* chore: :arrow_up: 将 leejet\u002Fstable-diffusion.cpp 更新至 `2efd19978dd4164e387bf226025c9666b6ef35e2`，由 @localai-bot 在 https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F8099 中实现\n* docs: :arrow_up: 更新 mudler\u002FLocalAI 的文档版本，由 @localai-bot 在 https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F8120 中实现\n* chore: :arrow_up: 将 leejet\u002Fstable-diffusion.cpp 更新至 `a48b4a3ade9972faf0adcad47e51c6fc03f0e46d`，由 @localai-bot 在 https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F8121 中实现\n* chore: :arrow_up: 将 ggml-org\u002Fllama.cpp 更新至 `959ecf7f234dc0bc0cd6829b25cb0ee1481aa78a`，由 @localai-bot 在 https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F8122 中实现\n* chore(deps): 将 llama.cpp 升级至 `1c7cf94b22a9dc6b1d32422f72a627787a4783a3`，由 @mudler 在 
https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F8136 中实现\n* chore: 移除冗余日志，由 @mudler 在 https:\u002F\u002Fgithub.c","2026-01-23T14:21:45",{"id":228,"version":229,"summary_zh":230,"released_at":231},53355,"v3.10.0","# 🎉 LocalAI 3.10.0 发布！🚀  \n\n\u003Ch1 align=\"center\">\n  \u003Cbr>\n  \u003Cimg height=\"300\" src=\"https:\u002F\u002Fraw.githubusercontent.com\u002Fmudler\u002FLocalAI\u002Frefs\u002Fheads\u002Fmaster\u002Fcore\u002Fhttp\u002Fstatic\u002Flogo.png\">\n  \u003Cbr>\n  \u003Cbr>\n\u003C\u002Fh1>\n\nLocalAI 3.10.0 在 **代理能力、多模态支持和跨平台可靠性** 方面有了重大提升。  \n\n我们新增了原生的 **Anthropic API 支持**，推出了全新的 **视频生成 UI**，引入了 **Open Responses API 兼容性**，并借助 **统一的 GPU 后端系统** 提升了性能。  \n\n欲了解完整更新内容，请参阅下文！\n\n---\n\n## 📌 简要概览\n\n| 功能 | 概述 |\n|--------|--------|\n| **Anthropic API 支持** | 完全兼容 `\u002Fv1\u002Fmessages` 端点，可无缝替代 Claude。 |\n| **Open Responses API** | 原生支持带工具调用、流式传输、后台模式和多轮对话的状态感知型代理，并通过了所有 [官方合规测试](https:\u002F\u002Fwww.openresponses.org\u002Fcompliance)。 |\n| **视频与图像生成套件** | 新增视频生成 UI + LTX-2 支持，实现文本到视频及图像到视频的转换。 |\n| **统一 GPU 后端** | 将 GPU 库（CUDA、ROCm、Vulkan）打包到后端容器中——在 Nvidia、AMD 和 ARM64 上均可 **开箱即用**（实验性）。 |\n| **工具流式传输与 XML 解析** | 完全支持工具调用的流式传输以及 XML 格式的工具输出。 |\n| **系统感知后端库列表** | 只显示您的系统能够运行的后端（例如，在 Linux 上隐藏 MLX）。 |\n| **崩溃修复** | 防止仅支持 AVX 的 CPU（Intel Sandy\u002FIvy Bridge）发生崩溃，并修复 AMD GPU 的显存报告问题。 |\n| **请求追踪** | 通过基于内存的请求\u002F响应日志来调试代理和微调过程。 |\n| **Moonshine 后端** | 面向低配置设备的超快速转录引擎。 |\n| **Pocket-TTS** | 轻量级、高保真度的文字转语音功能，支持声音克隆。 |\n| **Vulkan arm64 构建** | 我们现在也为 arm64 平台上的 Vulkan 构建后端和镜像。 |\n\n---\n\n## 🚀 新特性与重大改进\n\n### 🤖 **Open Responses API：构建更智能、自主的代理**  \n\nLocalAI 现已支持 **Open Responses API**，可在本地实现强大的代理工作流。  \n\n- 通过 `response_id` 实现 **状态感知的对话**——可恢复并管理长时间运行的代理会话。  \n- **后台模式**：以异步方式运行代理，稍后再获取结果。  \n- 对工具、图像和音频提供 **流式支持**。  \n- **内置工具**：网络搜索、文件搜索以及通过 MCP 集成使用计算机。  \n- 支持带有动态上下文和工具使用的 **多轮交互**。  \n\n> ✅ 非常适合开发人员构建能够在本地机器上完成浏览网页、分析文件或与系统交互的代理。  \n\n> 🔧 **使用方法**：  \n> - 在请求中设置 `response_id`，以在多次调用之间保持会话状态。  \n> - 使用 `background: true` 
以异步方式运行代理。  \n> - 通过 `GET \u002Fapi\u002Fv1\u002Fresponses\u002F{response_id}` 获取结果。  \n> - 启用 `stream: true` 即可实时接收部分响应和工具调用。  \n\n> 📌 **小贴士**：","2026-01-18T21:00:06",{"id":233,"version":234,"summary_zh":235,"released_at":236},53356,"v3.9.0","# 圣诞发布 :santa:  **LocalAI 3.9.0**! 🚀\n\n\u003Ch1 align=\"center\">\n  \u003Cbr>\n  \u003Cimg height=\"300\" src=\"https:\u002F\u002Fraw.githubusercontent.com\u002Fmudler\u002FLocalAI\u002Frefs\u002Fheads\u002Fmaster\u002Fcore\u002Fhttp\u002Fstatic\u002Flogo.png\">\n  \u003Cbr>\n  \u003Cbr>\n\u003C\u002Fh1>\n\nLocalAI 3.9.0 专注于**稳定性、资源效率以及更智能的代理工作流**。我们修复了模型加载中的关键问题，优化了系统资源管理，并引入了全新的**代理作业面板**，用于调度和管理后台代理任务。无论您是在本地运行模型，还是编排复杂的代理工作流，此次发布都能让操作更快速、更可靠、更易于管理。\n\n## 📌 简要总结\n\n| 功能 | 概述 |\n|--------|--------|\n| **代理作业面板** | 可通过 cron 表达式或 API 调度并运行后台任务——非常适合自动化工作流。 |\n| **智能内存回收器** | 当内存不足时，自动释放 GPU\u002FVRAM，驱逐最近最少使用的模型。 |\n| **LRU 模型驱逐机制** | 根据使用情况自动将模型从内存中卸载，以防止崩溃。 |\n| **MLX 和 CUDA 13 支持** | 新增模型后端，并增强了对现代硬件的 GPU 兼容性。 |\n| **UI 优化与修复** | 优化了导航界面，修复了布局溢出问题，并进行了多项改进。 |\n| **Vibevoice** | 新增对 vibevoice 后端的支持！ |\n\n---\n\n## 🚀 新特性\n\n### 🤖 **代理作业面板：调度与自动化任务**\n\nLocalAI 3.9.0 在 Web UI 和 API 中引入了**新的代理作业面板**，允许您创建、运行和调度后台代理任务，这些任务可以通过 API 或 Web 界面以编程方式启动。\n\n- 使用 cron 语法或 API 按计划运行代理提示。\n- 代理可通过模型设置定义，支持 MCP。\n- 通过 API 触发作业，以便集成到 CI\u002FCD 或外部工具中。\n- 可选择将结果发送至 webhook 进行后处理。\n- 模板和提示可动态填充变量。\n\n> ✅ 应用场景：每日报告、CI 集成、自动化数据处理、定期模型评估。\n\n\u003Cimg width=\"1576\" height=\"767\" alt=\"2025-12-24 15:26:32 LocalAI - 代理作业截图\" src=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F9c170d93-fedc-48b8-be30-14330b1052a3\" \u002F>\n\n---\n\n### 🧠 **智能内存回收器：自动优化 GPU 资源**\n\n我们引入了**新的内存回收器**，可监控系统内存使用情况，并在需要时自动释放 GPU\u002FVRAM。\n\n\u003Cimg width=\"975\" height=\"670\" alt=\"2025-12-24 15:25:30 LocalAI API 截图 - 8b3e0eb (8b3e0ebf8aab4071ef7721121f04081c32a5c9da)\" src=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F36f76486-0123-4480-8386-d66d43a96fa1\" \u002F>\n\n- 跟踪所有后端的内存消耗。\n- 
当使用量超过配置阈值时，会驱逐**最近最少使用的（LRU）**模型。\n- 防止因内存不足而导致的崩溃，确保系统在高负载下保持稳定。\n\n这是迈向**自适应资源管理**的重要一步，未来…","2025-12-24T14:31:28",{"id":238,"version":239,"summary_zh":240,"released_at":241},53357,"v3.8.0","\u003Ch1 align=\"center\">\r\n  \u003Cbr>\r\n  \u003Cimg height=\"300\" src=\"https:\u002F\u002Fraw.githubusercontent.com\u002Fmudler\u002FLocalAI\u002Frefs\u002Fheads\u002Fmaster\u002Fcore\u002Fhttp\u002Fstatic\u002Flogo.png\"> \u003Cbr>\r\n  \u003Cbr>\r\n\u003C\u002Fh1>\r\n\r\n欢迎来到 **LocalAI 3.8.0**！\r\n\r\nLocalAI 3.8.0 致力于优化用户体验，在无需重启或复杂配置文件的情况下，为用户提供更强大的功能。本次发布引入了全新的引导流程和通用模型加载器，能够无缝处理从 Hugging Face URL 到本地文件的各种输入格式。\n\n我们还改进了聊天界面，解决了长期以来关于 OpenAI API 兼容性的需求（特别是 SSE 流式传输标准），并为部分后端（llama.cpp）及后端管理提供了更为精细的控制选项。\n\n## 📌 简要总结\n\n| 功能 | 概述 |\n|--------|--------|\n| **通用模型导入** | 可直接从 Hugging Face、Ollama、OCI 或本地路径导入模型。自动检测后端并处理聊天模板。 |\n| **UI 和索引全面升级** | 新增引导向导、启动时自动选择模型，以及更简洁的表格式模型管理视图。 |\n| **MCP 实时流式传输** | **新增：** 代理动作和工具调用现在可通过模型上下文协议实时流式传输——您可实时查看推理过程。 |\n| **热重载设置** | 无需重启容器即可修改监控规则、API 密钥、P2P 设置和默认配置。 |\n| **聊天增强功能** | 聊天历史和并行对话现持久化存储于本地。 |\n| **严格遵循 SSE 标准** | 修复了流式传输格式，使其完全符合 OpenAI 规范（解决了与 LangChain\u002FJS 客户端的兼容性问题）。 |\n| **高级配置** | 通过 YAML 选项微调 `context_shift`、`cache_ram` 和并行工作线程数。 |\n| **Logprobs 和 Logitbias** | 增加了 token 级别的概率支持，以提升代理和评估工作流的表现。 |\n\n## 功能详解\n\n### 🚀 通用模型导入（基于 URL）\n\n我们重构了模型导入的方式。对于常见场景，您不再需要手动编写配置文件。新的导入工具支持来自 **Hugging Face、Ollama 和 OCI 注册表** 的 URL，或者直接通过 Web 界面上传本地文件路径。\n\nhttps:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F230576c2-2abe-4b20-97c0-935d4ed6e7e7\n\n* **自动检测：** 系统会尝试识别正确的后端（例如 `llama.cpp` 对比 `diffusers`），并通过读取模型元数据自动应用原生聊天模板（如 `llama-3`、`mistral` 等）。\n* **导入时自定义：** 您可以在导入过程中立即覆盖默认设置，例如强制对 GGUF 文件进行特定量化，或选择 `vLLM` 而不是 `transformers`。\n* **多模态支持：** 视觉组件（`mmproj`）会被自动检测并配置。\n* **文件安全机制：** 我们添加了一项保护措施，防止在多个模型配置共享同一份模型文件（blob）时误删文件。\n\n### 🎨 UI 全面升级\n\nWeb 
界面经过重新设计，旨在提升易用性和清晰度。\n\nhttps:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F260a","2025-11-26T20:22:49",{"id":243,"version":244,"summary_zh":245,"released_at":246},53358,"v3.7.0","\u003Ch1 align=\"center\">\r\n  \u003Cbr>\r\n  \u003Cimg height=\"300\" src=\"https:\u002F\u002Fraw.githubusercontent.com\u002Fmudler\u002FLocalAI\u002Frefs\u002Fheads\u002Fmaster\u002Fcore\u002Fhttp\u002Fstatic\u002Flogo.png\"> \u003Cbr>\r\n  \u003Cbr>\r\n\u003C\u002Fh1>\n\n欢迎来到 **LocalAI 3.7.0** :wave:  \n\n本次发布引入了 **支持 Agentic MCP 并与 WebUI 完全集成的功能**、全新的 **neutts TTS 后端**、**模糊模型搜索**、针对 Chatterbox 的 **长文本 TTS 分块处理**，以及一次彻底的 **WebUI 全面重构**。  \n\n我们还修复了关键性 bug，提升了系统稳定性，并增强了与 OpenAI API 的兼容性。\n\n---\n\n## 📌 简而言之——LocalAI 3.7.0 有哪些新特性？\n\n| 功能 | 概述 |\n|---|---|\n| 🤖 **Agentic MCP 支持（已接入 WebUI）** | 构建能够使用真实工具（如网络搜索、代码执行）的 AI 代理。完全兼容 OpenAI 标准，并无缝集成到 WebUI 中。 |\n| 🎙️ **neutts TTS 后端（由 Neuphonic 提供支持）** | 生成自然、高质量且低延迟的语音——非常适合用于语音助手场景。 |\n| 🖼️ **WebUI 优化** | 更快速、更简洁的用户界面，支持实时更新，并可完全通过 YAML 文件控制模型配置。 |\n| 💬 **Chatterbox 长文本 TTS 分块处理** | 通过智能分割文本并保持上下文连贯性，生成自然流畅的长音频内容。 |\n| 🧩 **高级代理控制功能** | 提供重试、推理和重新评估等新选项，帮助您精细调整代理行为。 |\n| 📸 **新增视频生成接口** | 现在支持与 OpenAI 兼容的 `\u002Fv1\u002Fvideos` 接口，用于文本转视频功能。 |\n| :snake: **Whisper 兼容性增强** | Whisper.cpp 现已支持多种 CPU 指令集（AVX、AVX2 等），有效避免因 `非法指令` 导致的崩溃问题。 |\n| 🔍 **画廊模糊搜索** | 即使输入有拼写错误，也能在模型画廊中找到所需模型（例如，输入 `gema` 仍可匹配到 `gemma`）。 |\n| 📦 **更便捷的模型与后端管理** | 您可以直接在 WebUI 中通过清晰的 YAML 配置文件导入、编辑和删除模型。 |\n| ▶️ **实时示例** | 请查看新的 [实时语音助手示例](https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI-examples\u002Ftree\u002Fmain\u002Frealtime)（多语言支持）。 |\n| ⚠️ **安全性、稳定性与 API 兼容性** | 修复了关键性崩溃、死锁、会话事件、OpenAI 兼容性问题以及 JSON Schema 相关的 panic 异常。 |\n| :brain: **Qwen 3 VL** | 支持 Qwen 3 VL 模型，基于 llama.cpp\u002Fgguf 格式 |\n\n## 🔥 详细更新内容\n\n### 🤖 **Agentic MCP 支持——构建具备工具使用能力的智能 AI 代理**\n\n我们非常自豪地宣布推出 **全面的 Agentic MCP 支持**，这一功能旨在帮助您构建能够 **进行推理、规划，并利用外部工具执行任务**的 AI 代理，例如网络搜索、代码执行和数据检索。您可以继续使用标准的 `chat\u002Fcompletions` 接口，但其背后将由一个代理引擎驱动。\n\n完整文档请参见 
[这里](https:\u002F\u002Flocalai.io\u002Fdocs\u002Ffeatures\u002Fmcp\u002F)\n\n> ✅ **现已接入 WebUI**：当所选模型支持 MCP 时，聊天界面中会显示一个专用切换按钮。只需点击即可启用代理模式。\n\n#### ✨ 核心特性：\n- **全新接口**：`POST \u002Fmcp\u002Fv1\u002Fchat\u002Fcompletions`（与 OpenAI 标准兼容）。\n- **灵活的工具配置**：\n  ```yaml\n  mcp:\n    stdio: |\n    ","2025-10-31T21:34:21",{"id":248,"version":249,"summary_zh":250,"released_at":251},53359,"v3.6.0","\u003C!-- Release notes generated using configuration in .github\u002Frelease.yml at master -->\r\n\r\n## What's Changed\r\n### Bug fixes :bug:\r\n* fix: reranking models limited to 512 tokens in llama.cpp backend by @jongames in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6344\r\n### Exciting New Features 🎉\r\n* feat(kokoro): add support for l4t devices by @mudler in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6322\r\n* feat(chatterbox): support multilingual by @mudler in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6240\r\n### 🧠 Models\r\n* chore(model gallery): add qwen-image-edit-2509 by @mudler in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6336\r\n* chore(models): add whisper-turbo via whisper.cpp by @mudler in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6340\r\n* chore(model gallery): add ibm-granite_granite-4.0-h-small by @mudler in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6373\r\n* chore(model gallery): add ibm-granite_granite-4.0-h-tiny by @mudler in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6374\r\n* chore(model gallery): add ibm-granite_granite-4.0-h-micro by @mudler in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6375\r\n* chore(model gallery): add ibm-granite_granite-4.0-micro by @mudler in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6376\r\n### 👒 Dependencies\r\n* chore(deps): bump grpcio from 1.74.0 to 1.75.0 in 
\u002Fbackend\u002Fpython\u002Ftransformers by @dependabot[bot] in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6332\r\n* chore(deps): bump securego\u002Fgosec from 2.22.8 to 2.22.9 by @dependabot[bot] in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6324\r\n* chore(deps): bump llama.cpp to '72b24d96c6888c609d562779a23787304ae4609c' by @mudler in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6349\r\n* chore(deps): bump grpcio from 1.74.0 to 1.75.1 in \u002Fbackend\u002Fpython\u002Fcoqui by @dependabot[bot] in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6353\r\n* chore(deps): bump transformers from 4.48.3 to 4.56.2 in \u002Fbackend\u002Fpython\u002Fcoqui by @dependabot[bot] in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6330\r\n* chore(deps): bump grpcio from 1.74.0 to 1.75.1 in \u002Fbackend\u002Fpython\u002Fdiffusers by @dependabot[bot] in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6361\r\n* chore(deps): bump grpcio from 1.74.0 to 1.75.1 in \u002Fbackend\u002Fpython\u002Frerankers by @dependabot[bot] in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6360\r\n* chore(deps): bump grpcio from 1.74.0 to 1.75.1 in \u002Fbackend\u002Fpython\u002Fcommon\u002Ftemplate by @dependabot[bot] in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6358\r\n* chore(deps): bump grpcio from 1.74.0 to 1.75.1 in \u002Fbackend\u002Fpython\u002Fvllm by @dependabot[bot] in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6357\r\n* chore(deps): bump grpcio from 1.74.0 to 1.75.1 in \u002Fbackend\u002Fpython\u002Fbark by @dependabot[bot] in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6359\r\n* chore(deps): bump grpcio from 1.75.0 to 1.75.1 in \u002Fbackend\u002Fpython\u002Ftransformers by @dependabot[bot] in 
https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6362\r\n* chore(deps): bump grpcio from 1.74.0 to 1.75.1 in \u002Fbackend\u002Fpython\u002Fexllama2 by @dependabot[bot] in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6356\r\n### Other Changes\r\n* chore: :arrow_up: Update ggml-org\u002Fllama.cpp to `7f766929ca8e8e01dcceb1c526ee584f7e5e1408` by @localai-bot in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6319\r\n* docs: :arrow_up: update docs version mudler\u002FLocalAI by @localai-bot in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6318\r\n* chore: :arrow_up: Update ggml-org\u002Fllama.cpp to `da30ab5f8696cabb2d4620cdc0aa41a298c54fd6` by @localai-bot in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6321\r\n* chore: :arrow_up: Update ggml-org\u002Fllama.cpp to `1d0125bcf1cbd7195ad0faf826a20bc7cec7d3f4` by @localai-bot in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6335\r\n* chore(cudss): add cudds to l4t images by @mudler in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6338\r\n* chore: :arrow_up: Update ggml-org\u002Fllama.cpp to `4ae88d07d026e66b41e85afece74e88af54f4e66` by @localai-bot in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6339\r\n* CI: disable build-testing on PRs against arm64 by @mudler in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6341\r\n* chore(deps): bump llama.cpp to '835b2b915c52bcabcd688d025eacff9a07b65f52' by @mudler in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6347\r\n* chore: :arrow_up: Update ggml-org\u002Fllama.cpp to `4807e8f96a61b2adccebd5e57444c94d18de7264` by @localai-bot in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6350\r\n* chore: :arrow_up: Update ggml-org\u002Fllama.cpp to `bd0af02fc96c2057726f33c0f0daf7bb8f3e462a` by @localai-bot in 
https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6352\r\n* Revert \"chore(deps): bump transformers from 4.48.3 to 4.56.2 in \u002Fbackend\u002Fpython\u002Fcoqui\" by @mudler in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6363\r\n* chore: :arrow_up: Update ggml-org\u002Fwhisper.cpp to `32be14f8ebfc0498c2c619182f0d7f4c822d52c4` by @localai-bot in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6354\r\n* chore: :arrow_up: Update ggml-org\u002Fllama.cpp to `5f7e166cbf7b9ca928c7fad990098ef32358ac75` by @localai-bot in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6355\r\n* chore: :arrow_up: Update ggml-org\u002Fllama.cpp to `b2ba81dbe07b6dbea9c96b13346c66973dede32c` by @localai-bot in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6366\r\n* chore: :arrow_up: Update ggml-or","2025-10-03T13:08:57",{"id":253,"version":254,"summary_zh":255,"released_at":256},53360,"v3.5.4","\u003C!-- Release notes generated using configuration in .github\u002Frelease.yml at master -->\r\n\r\n## What's Changed\r\n### Bug fixes :bug:\r\n* fix(python): make option check uniform across backends by @mudler in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6314\r\n### Other Changes\r\n* chore: :arrow_up: Update ggml-org\u002Fwhisper.cpp to `44fa2f647cf2a6953493b21ab83b50d5f5dbc483` by @localai-bot in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6317\r\n* chore: :arrow_up: Update ggml-org\u002Fllama.cpp to `f432d8d83e7407073634c5e4fd81a3d23a10827f` by @localai-bot in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6316\r\n* docs: :arrow_up: update docs version mudler\u002FLocalAI by @localai-bot in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6315\r\n\r\n\r\n**Full Changelog**: 
https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fcompare\u002Fv3.5.3...v3.5.4","2025-09-20T07:49:10",{"id":258,"version":259,"summary_zh":260,"released_at":261},53361,"v3.5.3","\u003C!-- Release notes generated using configuration in .github\u002Frelease.yml at master -->\r\n\r\n## What's Changed\r\n### Bug fixes :bug:\r\n* fix(diffusers): fix float detection by @mudler in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6313\r\n### 🧠 Models\r\n* chore(model gallery): add mistralai_magistral-small-2509 by @mudler in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6309\r\n* chore(model gallery): add impish_qwen_14b-1m by @mudler in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6310\r\n* chore(model gallery): add aquif-3.5-a4b-think by @mudler in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6311\r\n### 👒 Dependencies\r\n* chore: :arrow_up: Update ggml-org\u002Fllama.cpp to `3edd87cd055a45d885fa914d879d36d33ecfc3e1` by @localai-bot in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6308\r\n### Other Changes\r\n* docs: :arrow_up: update docs version mudler\u002FLocalAI by @localai-bot in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6307\r\n\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fcompare\u002Fv3.5.2...v3.5.3","2025-09-19T17:10:03",{"id":263,"version":264,"summary_zh":265,"released_at":266},53362,"v3.5.2","\u003C!-- Release notes generated using configuration in .github\u002Frelease.yml at master -->\r\n\r\n## What's Changed\r\n### 👒 Dependencies\r\n* Revert \"feat(nvidia-gpu): bump images to cuda 12.8\" by @mudler in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6303\r\n### Other Changes\r\n* docs: :arrow_up: update docs version mudler\u002FLocalAI by @localai-bot in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6305\r\n* chore: 
:arrow_up: Update ggml-org\u002Fllama.cpp to `0320ac5264279d74f8ee91bafa6c90e9ab9bbb91` by @localai-bot in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6306\r\n\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fcompare\u002Fv3.5.1...v3.5.2","2025-09-18T07:37:15",{"id":268,"version":269,"summary_zh":270,"released_at":271},53363,"v3.5.1","\u003C!-- Release notes generated using configuration in .github\u002Frelease.yml at master -->\r\n\r\n## What's Changed\r\n### Bug fixes :bug:\r\n* fix: make sure to turn down all processes on exit by @mudler in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6200\r\n* fix(p2p): automatically install llama-cpp for p2p workers by @mudler in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6199\r\n* Point to LocalAI-examples repo for llava by @mauromorales in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6241\r\n* fix: runtime capability detection for backends by @sozercan in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6149\r\n* fix(chat): use proper finish_reason for tool\u002Ffunction calling by @imkira in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6243\r\n* fix(rocm): Rename tag suffix for hipblas whisper build to match backend config by @KingJ in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6247\r\n* fix(llama-cpp): correctly calculate embeddings by @mudler in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6259\r\n### Exciting New Features 🎉\r\n* feat(launcher): show welcome page by @mudler in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6234\r\n* feat: support HF_ENDPOINT env for the HuggingFace endpoint by @qxo in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6220\r\n### 🧠 Models\r\n* chore(model gallery): add nousresearch_hermes-4-14b by @mudler in 
https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6197\r\n* chore(model gallery): add MiniCPM-V-4.5-8b-q4_K_M by @M0Rf30 in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6205\r\n* chore(model-gallery): :arrow_up: update checksum by @localai-bot in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6211\r\n* feat(whisper): Add diarization (tinydiarize) by @richiejp in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6184\r\n* chore(model gallery): add baidu_ernie-4.5-21b-a3b-thinking by @mudler in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6267\r\n* chore(model gallery): add aquif-ai_aquif-3.5-8b-think by @mudler in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6269\r\n* chore(model gallery): add qwen3-stargate-sg1-uncensored-abliterated-8b-i1 by @mudler in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6270\r\n* chore(model gallery): add k2-think-i1 by @mudler in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6288\r\n* chore(model gallery): add holo1.5-72b by @mudler in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6289\r\n* chore(model gallery): add holo1.5-7b by @mudler in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6290\r\n* chore(model gallery): add holo1.5-3b by @mudler in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6291\r\n* chore(model gallery): add alibaba-nlp_tongyi-deepresearch-30b-a3b by @mudler in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6295\r\n* chore(model gallery): add webwatcher-7b by @mudler in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6297\r\n* chore(model gallery): add webwatcher-32b by @mudler in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6298\r\n* chore(model gallery): add websailor-32b by @mudler in 
https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6299\r\n* chore(model gallery): add websailor-7b by @mudler in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6300\r\n### 📖 Documentation and examples\r\n* chore(docs): add MacOS dmg download button by @mudler in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6233\r\n### 👒 Dependencies\r\n* chore(deps): bump github.com\u002Fopencontainers\u002Fimage-spec from 1.1.0 to 1.1.1 by @dependabot[bot] in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6223\r\n* chore(deps): bump actions\u002Fstale from 9.1.0 to 10.0.0 by @dependabot[bot] in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6227\r\n* chore(deps): bump go.opentelemetry.io\u002Fotel\u002Fexporters\u002Fprometheus from 0.50.0 to 0.60.0 by @dependabot[bot] in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6226\r\n* chore(deps): bump oras.land\u002Foras-go\u002Fv2 from 2.5.0 to 2.6.0 by @dependabot[bot] in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6225\r\n* chore(deps): bump github.com\u002Fswaggo\u002Fswag from 1.16.3 to 1.16.6 by @dependabot[bot] in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6222\r\n* chore(deps): bump actions\u002Flabeler from 5 to 6 by @dependabot[bot] in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6229\r\n* feat(nvidia-gpu): bump images to cuda 12.8 by @mudler in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6239\r\n* feat(chatterbox): add MPS, and CPU, pin version by @mudler in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6242\r\n### Other Changes\r\n* chore: :arrow_up: Update ggml-org\u002Fllama.cpp to `0fce7a1248b74148c1eb0d368b7e18e8bcb96809` by @localai-bot in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6193\r\n* chore: :arrow_up: Update 
leejet\u002Fstable-diffusion.cpp to `2eb3845df5675a71565d5a9e13b7bad0881fafcd` by @localai-bot in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6192\r\n* docs: :arrow_up: update docs version mudler\u002FLocalAI by @localai-bot in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6201\r\n* chore: :arrow_up: Update ggml-org\u002Fllama.cpp to `fb15d649ed14ab447eeab911e0c9d21e35fb243e` by @localai-bot in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6202\r\n* Fix Typos in Docs by @alizfara112 in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6204\r\n* chore: :arrow_up: Update ggml-org\u002Fwhisper.cpp to `bb0e1fc60f26a707cabf724edcf7cfcab2a269b6` by @localai-bot in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6203\r\n* chore: :arrow_up: Update ","2025-09-17T17:03:06",{"id":273,"version":274,"summary_zh":275,"released_at":276},53364,"v3.5.0","\u003Ch1 align=\"center\">\r\n  \u003Cbr>\r\n  \u003Cimg height=\"300\" src=\"https:\u002F\u002Fraw.githubusercontent.com\u002Fmudler\u002FLocalAI\u002Frefs\u002Fheads\u002Fmaster\u002Fcore\u002Fhttp\u002Fstatic\u002Flogo.png\"> \u003Cbr>\r\n\u003Cbr>\r\n🚀 LocalAI 3.5.0\r\n\u003C\u002Fh1>\r\n\r\nWelcome to LocalAI 3.5.0! This release focuses on expanding backend support, improving usability, refining the overall experience, and keeping reducing footprint of LocalAI, to make it a truly portable, privacy-focused AI stack. We’ve added several new backends, enhanced the WebUI with new features, made significant performance improvements under the hood, and simplified LocalAI management with a new Launcher app (Alpha) available for Linux and MacOS.\r\n\r\n## TL;DR – What’s New in LocalAI 3.5.0 🎉\r\n\r\n- 🖼️ **Expanded Backend Support:** Welcome to MLX!  **mlx**, **mlx-audio**, **mlx-vlm** are now all available in LocalAI. We also added support to WAN for video generation, and a CPU and MPS version of the diffusers backend! 
Now you can generate and edit images from MacOS or if you don't have any GPU (albeit slow).\r\n- ✨ **WebUI Enhancements:** Download model configurations, a manual model refresh button, streamlined error streaming during SSE events, and a stop button for running backends. Models now can also be imported and edited via the WebUI.\r\n- 🚀 **Performance & Architecture:** Whisper backend has been rewritten in Purego with integrated Voice Activity Detection (VAD) for improved efficiency and stability. Stablediffusion also benefits from the Purego conversion.\r\n- 🛠️ **Simplified Management:** New LocalAI Launcher App (Alpha) for easy installation, startup, updates, and access to the WebUI.\r\n- ✅ **Bug Fixes & Stability:** Resolutions to AMD RX 9060XT ROCm errors, libomp linking issues, model loading problems on macOS, CUDA device detection improvements, and more.\r\n- **Enhanced support for MacOS**: whisper, diffusers, llama.cpp, MLX (VLM, Audio, LLM), stable-diffusion.cpp will now work on MacOS!\r\n\r\n\r\n## What’s New in Detail\r\n\r\n### 🚀 New Backends and Model Support\r\n\r\nWe've significantly expanded the range of models you can run with LocalAI!\r\n\r\n*   **mlx-audio:** Bring text to life with Kokoro’s voice models on MacOS with the power of MLX!. Install with the `mlx-audio` backend. Example configuration:\r\n    ```yaml\r\n    backend: mlx-audio\r\n    name: kokoro-mlx\r\n    parameters:\r\n      model: prince-canuma\u002FKokoro-82M\r\n      voice: \"af_heart\"\r\n      known_usecases:\r\n        - tts\r\n    ```\r\n*   **mlx-vlm:** Experiment with the latest VLM models. 
While we don't have any models in the gallery, it's really easy to configure, see https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6119 for more details.\r\n    ```yaml\r\n    name: mlx-gemma\r\n    backend: mlx-vlm\r\n    parameters:\r\n      model: \"mlx-community\u002Fgemma-3n-E2B-it-4bit\"\r\n    template:\r\n      use_tokenizer_template: true\r\n    known_usecases:\r\n    - chat\r\n    ```\r\n*   **WAN:** Generate videos with Wan2.1 or Wan 2.2 models using the `diffusers` backend, supporting both I2V and T2V.  Example configuration:\r\n    ```yaml\r\n    name: wan21\r\n    f16: true\r\n    backend: diffusers\r\n    known_usecases:\r\n      - video\r\n    parameters:\r\n      model: Wan-AI\u002FWan2.1-T2V-1.3B-Diffusers\r\n    diffusers:\r\n      cuda: true\r\n      pipeline_type: WanPipeline\r\n      step: 40\r\n    options:\r\n        - guidance_scale:5.0\r\n        - num_frames:81\r\n        - torch_dtype:bf16\r\n    ```\r\n*   **Diffusers CPU and MacOS Support:** Run diffusers models directly on your CPU without a GPU or with a Mac! This opens up LocalAI to a wider range of hardware configurations.\r\n\r\n### ✨ WebUI Improvements\r\n\r\nWe've added several new features to make using LocalAI even easier:\r\n\r\n*   **Download Model Config:** A \"Get Config\" button in the model gallery lets you download a model’s configuration file without installing the full model. 
This is perfect for custom setups and easier integration.\r\n*   **Manual Model Refresh:** A new button allows you to manually refresh the on-disk YAML configuration, ensuring the WebUI always has the latest model information.\r\n*   **Streamlined Error Handling:** Errors during SSE streaming events are now displayed directly to the user, providing better visibility and debugging information.\r\n*   **Backend Stop Button:** Quickly stop running backends directly from the WebUI.\r\n\r\n\u003Cimg width=\"1009\" height=\"262\" alt=\"Screenshot From 2025-08-15 22-25-52\" src=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F84a9c0bf-5fc0-4436-b830-2e8bf03c04ea\" \u002F>\r\n\r\n*   **Model import and edit:** Now models can be edited and imported directly from the WebUI.\r\n\r\n\u003Cimg width=\"1920\" height=\"917\" alt=\"Screenshot 2025-08-14 at 22-28-59 LocalAI - Import Model\" src=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002Fd7b1b499-c30b-448e-831a-a80589ba4a27\" \u002F>\r\n\u003Cimg width=\"1920\" height=\"917\" alt=\"Screenshot 2025-08-14 at 22-28-47 LocalAI - Edit Model gpt-oss-20b\" src=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F49e2961f-7cec-4fcd-a09a-2239256099aa\" \u002F>\r\n\r\n*   **Installed Backend List:** Now displays installed backends in the WebUI for easier access and management.\r\n","2025-09-03T20:23:15",{"id":278,"version":279,"summary_zh":280,"released_at":281},53365,"v3.4.0","\u003Ch1 align=\"center\">\r\n  \u003Cbr>\r\n  \u003Cimg height=\"300\" src=\"https:\u002F\u002Fraw.githubusercontent.com\u002Fmudler\u002FLocalAI\u002Frefs\u002Fheads\u002Fmaster\u002Fcore\u002Fhttp\u002Fstatic\u002Flogo.png\"> \u003Cbr>\r\n\u003Cbr>\r\n🚀 LocalAI 3.4.0\r\n\u003C\u002Fh1>\r\n\r\n## What’s New in LocalAI 3.4.0 🎉\r\n\r\n- WebUI improvements: now size can be set during image generation\r\n- New backends: [KittenTTS](github.com\u002FKittenML\u002FKittenTTS), 
[kokoro](https:\u002F\u002Fgithub.com\u002Fhexgrad\u002Fkokoro) and [dia](https:\u002F\u002Fgithub.com\u002Fnari-labs\u002Fdia) are now available as backends and models can be installed directly from the gallery\r\n  Note: these backends need to be warmed up during the first call to download the model files.\r\n- Support for reasoning effort in the OpenAI chat completion\r\n- Diffusers backend is now available for l4t images and devices\r\n- When installing a backend from the CLI, an alias and name can be supplied (`--alias` and `--name`) to override configurations\r\n- Backends can now be sideloaded from the system: you can drag-and-drop the backends in the backends folder and they will just work! \r\n\r\n## The Complete Local Stack for Privacy-First AI\r\n\r\n\u003Ctable>\r\n  \u003Ctr>\r\n    \u003Ctd width=\"30%\" valign=\"top\" align=\"center\">\r\n      \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\">\r\n        \u003Cimg src=\"https:\u002F\u002Fraw.githubusercontent.com\u002Fmudler\u002FLocalAI\u002Frefs\u002Fheads\u002Fmaster\u002Fcore\u002Fhttp\u002Fstatic\u002Flogo.png\" width=\"200\" alt=\"LocalAI Logo\">\r\n        \u003Ch3>LocalAI\u003C\u002Fh3>\r\n      \u003C\u002Fa>\r\n    \u003C\u002Ftd>\r\n    \u003Ctd width=\"70%\" valign=\"top\">\r\n      \u003Cp>The free, Open Source OpenAI alternative. Acts as a drop-in replacement REST API compatible with OpenAI specifications for local AI inferencing. 
No GPU required.\u003C\u002Fp>\r\n      \u003Cp>\u003Cem>Link:\u003C\u002Fem> \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\">https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u003C\u002Fa>\u003C\u002Fp>\r\n    \u003C\u002Ftd>\r\n  \u003C\u002Ftr>\r\n  \u003Ctr>\r\n    \u003Ctd width=\"30%\" valign=\"top\" align=\"center\">\r\n      \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAGI\">\r\n         \u003Cimg src=\"https:\u002F\u002Fraw.githubusercontent.com\u002Fmudler\u002FLocalAGI\u002Frefs\u002Fheads\u002Fmain\u002Fwebui\u002Freact-ui\u002Fpublic\u002Flogo_2.png\" width=\"200\" alt=\"LocalAGI Logo\">\r\n         \u003Ch3>LocalAGI\u003C\u002Fh3>\r\n      \u003C\u002Fa>\r\n    \u003C\u002Ftd>\r\n    \u003Ctd width=\"70%\" valign=\"top\">\r\n      \u003Cp>A powerful Local AI agent management platform. Serves as a drop-in replacement for OpenAI's Responses API, supercharged with advanced agentic capabilities and a no-code UI.\u003C\u002Fp>\r\n      \u003Cp>\u003Cem>Link:\u003C\u002Fem> \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAGI\">https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAGI\u003C\u002Fa>\u003C\u002Fp>\r\n    \u003C\u002Ftd>\r\n  \u003C\u002Ftr>\r\n  \u003Ctr>\r\n    \u003Ctd width=\"30%\" valign=\"top\" align=\"center\">\r\n      \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalRecall\">\r\n         \u003Cimg src=\"https:\u002F\u002Fraw.githubusercontent.com\u002Fmudler\u002FLocalRecall\u002Frefs\u002Fheads\u002Fmain\u002Fstatic\u002Flocalrecall_horizontal.png\" width=\"200\" alt=\"LocalRecall Logo\">\r\n         \u003Ch3>LocalRecall\u003C\u002Fh3>\r\n      \u003C\u002Fa>\r\n    \u003C\u002Ftd>\r\n    \u003Ctd width=\"70%\" valign=\"top\">\r\n      \u003Cp>A RESTful API and knowledge base management system providing persistent memory and storage capabilities for AI agents. 
Designed to work alongside LocalAI and LocalAGI.\u003C\u002Fp>\r\n      \u003Cp>\u003Cem>Link:\u003C\u002Fem> \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalRecall\">https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalRecall\u003C\u002Fa>\u003C\u002Fp>\r\n    \u003C\u002Ftd>\r\n  \u003C\u002Ftr>\r\n\u003C\u002Ftable>\r\n\r\n## Thank you! ❤️\r\n\r\nA massive **THANK YOU** to our incredible community and our sponsors! LocalAI has over **34,500 stars**, and LocalAGI has already rocketed past **1k+ stars**!\r\n\r\nAs a reminder, LocalAI is real FOSS (Free and Open Source Software) and its sibling projects are community-driven and not backed by VCs or a company. We rely on contributors donating their spare time and our sponsors to provide us the hardware! If you love open-source, privacy-first AI, please consider starring the repos, contributing code, reporting bugs, or spreading the word!\r\n\r\n👉 **Check out the reborn LocalAGI v2 today:** [https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAGI](https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAGI)\r\n\r\n\r\n## Full changelog :point_down: \r\n\r\n\u003Cdetails>\r\n\r\n\u003Csummary>\r\n:point_right: Click to expand :point_left: \r\n\u003C\u002Fsummary>\r\n\r\n\r\n## What's Changed\r\n### Bug fixes :bug:\r\n* fix(llama.cpp): do not default to linear rope by @mudler in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F5982\r\n### Exciting New Features 🎉\r\n* feat(webui): allow to specify image size by @mudler in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F5976\r\n* feat(backends): add KittenTTS by @mudler in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F5977\r\n* feat(kokoro): complete kokoro integration by @mudler in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F5978\r\n* feat: add reasoning effort and metadata to template by @mudler in 
https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F5981\r\n* feat(transformers): add support to Dia by @mudler in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F5991\r\n* feat(diffusers): add builds for nvidia-l4t by @mudler in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F6004\r\n* feat(backends install): allow to specify name and alias during manual installation by @mudler in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F5971\r\n### 🧠 Models\r\n* chore(models): add gpt-oss-20b by @mudler in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F5973\r\n* chore(models): add gpt-oss-120b by @mudler in https:\u002F\u002Fgithub.com\u002F","2025-08-12T07:13:55",{"id":283,"version":284,"summary_zh":285,"released_at":286},53366,"v3.3.2","\u003C!-- Release notes generated using configuration in .github\u002Frelease.yml at master -->\r\n\r\n## What's Changed\r\n### Exciting New Features 🎉\r\n* feat(backends): install from local path by @mudler in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F5962\r\n* feat(backends): allow backends to not have a metadata file by @mudler in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F5963\r\n### 📖 Documentation and examples\r\n* fix(docs): Improve responsiveness of tables by @dedyf5 in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F5954\r\n### 👒 Dependencies\r\n* chore(stable-diffusion): bump, set GGML_MAX_NAME by @mudler in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F5961\r\n* chore(build): Rename sycl to intel by @richiejp in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F5964\r\n### Other Changes\r\n* docs: :arrow_up: update docs version mudler\u002FLocalAI by @localai-bot in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F5956\r\n* chore: :arrow_up: Update ggml-org\u002Fwhisper.cpp to 
`0becabc8d68d9ffa6ddfba5240e38cd7a2642046` by @localai-bot in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F5958\r\n* chore: :arrow_up: Update ggml-org\u002Fllama.cpp to `5c0eb5ef544aeefd81c303e03208f768e158d93c` by @localai-bot in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F5959\r\n* chore: :arrow_up: Update ggml-org\u002Fllama.cpp to `d31192b4ee1441bbbecd3cbf9e02633368bdc4f5` by @localai-bot in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F5965\r\n\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fcompare\u002Fv3.3.1...v3.3.2","2025-08-04T14:52:43",{"id":288,"version":289,"summary_zh":290,"released_at":291},53367,"v3.3.1","\u003C!-- Release notes generated using configuration in .github\u002Frelease.yml at master -->\r\n\r\nThis is a minor release, however we have addressed some important bug regarding Intel-GPU Images, and we have changed naming of the container images.\r\n\r\nThis release also adds support for Flux Kontext and Flux krea!\r\n\r\n## :warning:  Breaking change\r\n\r\nIntel GPU images has been renamed from `latest-gpu-intel-f32` and `latest-gpu-intel-f16` to a single one, `latest-gpu-intel`, for example:\r\n\r\n```bash\r\ndocker run -ti --name local-ai -p 8080:8080 --device=\u002Fdev\u002Fdri\u002Fcard1 --device=\u002Fdev\u002Fdri\u002FrenderD128 localai\u002Flocalai:latest-gpu-intel\r\n```\r\n\r\nand for AIO (All-In-One) images:\r\n\r\n```bash\r\ndocker run -ti --name local-ai -p 8080:8080 localai\u002Flocalai:latest-aio-gpu-intel\r\n```\r\n\r\n## :framed_picture:  Flux kontext\r\n\r\nFrom this release LocalAI supports Flux Kontext and can be used to edit images via the API:\r\n\r\nInstall with:\r\n\r\n```bash\r\nlocal-ai run flux.1-kontext-dev\r\n```\r\n\r\nTo test:\r\n\r\n```bash\r\ncurl http:\u002F\u002Flocalhost:8080\u002Fv1\u002Fimages\u002Fgenerations -H \"Content-Type: application\u002Fjson\" -d '{\r\n  \"model\": 
\"flux.1-kontext-dev\",\r\n  \"prompt\": \"change 'flux.cpp' to 'LocalAI'\",\r\n  \"size\": \"256x256\",\r\n  \"ref_images\": [\r\n  \t\"https:\u002F\u002Fraw.githubusercontent.com\u002Fleejet\u002Fstable-diffusion.cpp\u002Fmaster\u002Fassets\u002Fflux\u002Fflux1-dev-q8_0.png\"\r\n  ]\r\n}'\r\n```\r\n\r\n\u003Cimg width=\"256\" height=\"256\" alt=\"b64567298114 (1)\" src=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F5f659956-9e7e-4517-998b-7b373d0c081c\" \u002F>\r\n\u003Cimg width=\"256\" height=\"256\" alt=\"b641424088517\" src=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F045cb064-015b-4393-8b08-1c90c1f79005\" \u002F>\r\n\r\n\r\n## What's Changed\r\n### Breaking Changes 🛠\r\n* fix(intel): Set GPU vendor on Intel images and cleanup by @richiejp in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F5945\r\n### Exciting New Features 🎉\r\n* feat(stablediffusion-ggml): add support to ref images (flux Kontext) by @mudler in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F5935\r\n### 🧠 Models\r\n* chore(model gallery): add qwen_qwen3-30b-a3b-instruct-2507 by @mudler in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F5936\r\n* chore(model gallery): add arcee-ai_afm-4.5b by @mudler in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F5938\r\n* chore(model gallery): add qwen_qwen3-30b-a3b-thinking-2507 by @mudler in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F5939\r\n* chore(model gallery): add flux.1-dev-ggml-q8_0 by @mudler in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F5947\r\n* chore(model gallery): add flux.1-dev-ggml-abliterated-v2-q8_0 by @mudler in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F5948\r\n* chore(model gallery): add flux.1-krea-dev-ggml by @mudler in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F5949\r\n### Other Changes\r\n* docs: 
:arrow_up: update docs version mudler\u002FLocalAI by @localai-bot in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F5929\r\n* chore: :arrow_up: Update ggml-org\u002Fllama.cpp to `8ad7b3e65b5834e5574c2f5640056c9047b5d93b` by @localai-bot in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F5931\r\n* chore: :arrow_up: Update leejet\u002Fstable-diffusion.cpp to `f6b9aa1a4373e322ff12c15b8a0749e6dd6f0253` by @localai-bot in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F5930\r\n* chore: :arrow_up: Update ggml-org\u002Fwhisper.cpp to `d0a9d8c7f8f7b91c51d77bbaa394b915f79cde6b` by @localai-bot in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F5932\r\n* chore: :arrow_up: Update ggml-org\u002Fllama.cpp to `aa79524c51fb014f8df17069d31d7c44b9ea6cb8` by @localai-bot in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F5934\r\n* chore: :arrow_up: Update ggml-org\u002Fllama.cpp to `e9192bec564780bd4313ad6524d20a0ab92797db` by @localai-bot in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F5940\r\n* chore: :arrow_up: Update ggml-org\u002Fwhisper.cpp to `f7502dca872866a310fe69d30b163fa87d256319` by @localai-bot in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F5941\r\n* chore: update swagger by @mudler in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F5946\r\n* feat(stablediffusion-ggml): allow to load loras by @mudler in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F5943\r\n* chore(capability): improve messages by @mudler in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F5944\r\n* feat(swagger): update swagger by @localai-bot in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F5950\r\n* chore: :arrow_up: Update ggml-org\u002Fllama.cpp to `daf2dd788066b8b239cb7f68210e090c2124c199` by @localai-bot in 
https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F5951\r\n\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fcompare\u002Fv3.3.0...v3.3.1","2025-08-01T13:02:00",{"id":293,"version":294,"summary_zh":295,"released_at":296},53368,"v3.3.0","\u003Ch1 align=\"center\">\r\n  \u003Cbr>\r\n  \u003Cimg height=\"300\" src=\"https:\u002F\u002Fraw.githubusercontent.com\u002Fmudler\u002FLocalAI\u002Frefs\u002Fheads\u002Fmaster\u002Fcore\u002Fhttp\u002Fstatic\u002Flogo.png\"> \u003Cbr>\r\n\u003Cbr>\r\n🚀 LocalAI 3.3.0\r\n\u003C\u002Fh1>\r\n\r\n## What’s New in LocalAI 3.3.0 🎉\r\n\r\n\r\n- Object detection! Starting with 3.3.0, LocalAI also supports fast object detection through a new API. Just install the `rfdetr-base` model. See [the documentation](https:\u002F\u002Flocalai.io\u002Ffeatures\u002Fobject-detection\u002F) to learn more\r\n- Backends now have defined mirrors for download, which helps when primary registries fail during download\r\n- Bug fixes: we worked hard on squashing bugs in this release, ranging from container images to backends and installation scripts\r\n\r\n\r\n## The Complete Local Stack for Privacy-First AI\r\n\r\n\u003Ctable>\r\n  \u003Ctr>\r\n    \u003Ctd width=\"30%\" valign=\"top\" align=\"center\">\r\n      \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\">\r\n        \u003Cimg src=\"https:\u002F\u002Fraw.githubusercontent.com\u002Fmudler\u002FLocalAI\u002Frefs\u002Fheads\u002Fmaster\u002Fcore\u002Fhttp\u002Fstatic\u002Flogo.png\" width=\"200\" alt=\"LocalAI Logo\">\r\n        \u003Ch3>LocalAI\u003C\u002Fh3>\r\n      \u003C\u002Fa>\r\n    \u003C\u002Ftd>\r\n    \u003Ctd width=\"70%\" valign=\"top\">\r\n      \u003Cp>The free, Open Source OpenAI alternative. Acts as a drop-in replacement REST API compatible with OpenAI specifications for local AI inferencing. 
No GPU required.\u003C\u002Fp>\r\n      \u003Cp>\u003Cem>Link:\u003C\u002Fem> \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\">https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u003C\u002Fa>\u003C\u002Fp>\r\n    \u003C\u002Ftd>\r\n  \u003C\u002Ftr>\r\n  \u003Ctr>\r\n    \u003Ctd width=\"30%\" valign=\"top\" align=\"center\">\r\n      \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAGI\">\r\n         \u003Cimg src=\"https:\u002F\u002Fraw.githubusercontent.com\u002Fmudler\u002FLocalAGI\u002Frefs\u002Fheads\u002Fmain\u002Fwebui\u002Freact-ui\u002Fpublic\u002Flogo_2.png\" width=\"200\" alt=\"LocalAGI Logo\">\r\n         \u003Ch3>LocalAGI\u003C\u002Fh3>\r\n      \u003C\u002Fa>\r\n    \u003C\u002Ftd>\r\n    \u003Ctd width=\"70%\" valign=\"top\">\r\n      \u003Cp>A powerful Local AI agent management platform. Serves as a drop-in replacement for OpenAI's Responses API, supercharged with advanced agentic capabilities and a no-code UI.\u003C\u002Fp>\r\n      \u003Cp>\u003Cem>Link:\u003C\u002Fem> \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAGI\">https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAGI\u003C\u002Fa>\u003C\u002Fp>\r\n    \u003C\u002Ftd>\r\n  \u003C\u002Ftr>\r\n  \u003Ctr>\r\n    \u003Ctd width=\"30%\" valign=\"top\" align=\"center\">\r\n      \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalRecall\">\r\n         \u003Cimg src=\"https:\u002F\u002Fraw.githubusercontent.com\u002Fmudler\u002FLocalRecall\u002Frefs\u002Fheads\u002Fmain\u002Fstatic\u002Flocalrecall_horizontal.png\" width=\"200\" alt=\"LocalRecall Logo\">\r\n         \u003Ch3>LocalRecall\u003C\u002Fh3>\r\n      \u003C\u002Fa>\r\n    \u003C\u002Ftd>\r\n    \u003Ctd width=\"70%\" valign=\"top\">\r\n      \u003Cp>A RESTful API and knowledge base management system providing persistent memory and storage capabilities for AI agents. 
Designed to work alongside LocalAI and LocalAGI.\u003C\u002Fp>\r\n      \u003Cp>\u003Cem>Link:\u003C\u002Fem> \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalRecall\">https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalRecall\u003C\u002Fa>\u003C\u002Fp>\r\n    \u003C\u002Ftd>\r\n  \u003C\u002Ftr>\r\n\u003C\u002Ftable>\r\n\r\n## Thank you! ❤️\r\n\r\nA massive **THANK YOU** to our incredible community and our sponsors! LocalAI has over **34,100 stars**, and LocalAGI has already rocketed past **900+ stars**!\r\n\r\nAs a reminder, LocalAI is real FOSS (Free and Open Source Software) and its sibling projects are community-driven and not backed by VCs or a company. We rely on contributors donating their spare time and our sponsors to provide us the hardware! If you love open-source, privacy-first AI, please consider starring the repos, contributing code, reporting bugs, or spreading the word!\r\n\r\n👉 **Check out the reborn LocalAGI v2 today:** [https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAGI](https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAGI)\r\n\r\n\r\n## Full changelog :point_down: \r\n\r\n\u003Cdetails>\r\n\r\n\u003Csummary>\r\n:point_right: Click to expand :point_left: \r\n\u003C\u002Fsummary>\r\n\r\n## What's Changed\r\n### Bug fixes :bug:\r\n* fix(backend gallery): intel images for python-based backends, re-add exllama2 by @mudler in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F5928\r\n### Exciting New Features 🎉\r\n* feat: normalize search by @mudler in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F5925\r\n* feat(rfdetr): add object detection API by @mudler in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F5923\r\n### Other Changes\r\n* docs: :arrow_up: update docs version mudler\u002FLocalAI by @localai-bot in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F5920\r\n* chore: :arrow_up: Update ggml-org\u002Fwhisper.cpp to 
`e7bf0294ec9099b5fc21f5ba969805dfb2108cea` by @localai-bot in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F5922\r\n* chore: :arrow_up: Update ggml-org\u002Fllama.cpp to `11dd5a44eb180e1d69fac24d3852b5222d66fb7f` by @localai-bot in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F5921\r\n* chore: drop assistants endpoint by @mudler in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F5926\r\n* chore: :arrow_up: Update ggml-org\u002Fllama.cpp to `bf78f5439ee8e82e367674043303ebf8e92b4805` by @localai-bot in https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fpull\u002F5927\r\n\r\n\u003C\u002Fdetails>\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fmudler\u002FLocalAI\u002Fcompare\u002Fv3.2.3...v3.3.0","2025-07-28T15:03:28"]