[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-Trans-N-ai--swama":3,"tool-Trans-N-ai--swama":64},[4,17,27,35,43,56],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":16},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,3,"2026-04-05T11:01:52",[13,14,15],"开发框架","图像","Agent","ready",{"id":18,"name":19,"github_repo":20,"description_zh":21,"stars":22,"difficulty_score":23,"last_commit_at":24,"category_tags":25,"status":16},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",138956,2,"2026-04-05T11:33:21",[13,15,26],"语言模型",{"id":28,"name":29,"github_repo":30,"description_zh":31,"stars":32,"difficulty_score":23,"last_commit_at":33,"category_tags":34,"status":16},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 
绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",107662,"2026-04-03T11:11:01",[13,14,15],{"id":36,"name":37,"github_repo":38,"description_zh":39,"stars":40,"difficulty_score":23,"last_commit_at":41,"category_tags":42,"status":16},3704,"NextChat","ChatGPTNextWeb\u002FNextChat","NextChat 是一款轻量且极速的 AI 助手，旨在为用户提供流畅、跨平台的大模型交互体验。它完美解决了用户在多设备间切换时难以保持对话连续性，以及面对众多 AI 模型不知如何统一管理的痛点。无论是日常办公、学习辅助还是创意激发，NextChat 都能让用户随时随地通过网页、iOS、Android、Windows、MacOS 或 Linux 端无缝接入智能服务。\n\n这款工具非常适合普通用户、学生、职场人士以及需要私有化部署的企业团队使用。对于开发者而言，它也提供了便捷的自托管方案，支持一键部署到 Vercel 或 Zeabur 等平台。\n\nNextChat 的核心亮点在于其广泛的模型兼容性，原生支持 Claude、DeepSeek、GPT-4 及 Gemini Pro 等主流大模型，让用户在一个界面即可自由切换不同 AI 能力。此外，它还率先支持 MCP（Model Context Protocol）协议，增强了上下文处理能力。针对企业用户，NextChat 提供专业版解决方案，具备品牌定制、细粒度权限控制、内部知识库整合及安全审计等功能，满足公司对数据隐私和个性化管理的高标准要求。",87618,"2026-04-05T07:20:52",[13,26],{"id":44,"name":45,"github_repo":46,"description_zh":47,"stars":48,"difficulty_score":23,"last_commit_at":49,"category_tags":50,"status":16},2268,"ML-For-Beginners","microsoft\u002FML-For-Beginners","ML-For-Beginners 是由微软推出的一套系统化机器学习入门课程，旨在帮助零基础用户轻松掌握经典机器学习知识。这套课程将学习路径规划为 12 周，包含 26 节精炼课程和 52 道配套测验，内容涵盖从基础概念到实际应用的完整流程，有效解决了初学者面对庞大知识体系时无从下手、缺乏结构化指导的痛点。\n\n无论是希望转型的开发者、需要补充算法背景的研究人员，还是对人工智能充满好奇的普通爱好者，都能从中受益。课程不仅提供了清晰的理论讲解，还强调动手实践，让用户在循序渐进中建立扎实的技能基础。其独特的亮点在于强大的多语言支持，通过自动化机制提供了包括简体中文在内的 50 多种语言版本，极大地降低了全球不同背景用户的学习门槛。此外，项目采用开源协作模式，社区活跃且内容持续更新，确保学习者能获取前沿且准确的技术资讯。如果你正寻找一条清晰、友好且专业的机器学习入门之路，ML-For-Beginners 将是理想的起点。",84991,"2026-04-05T10:45:23",[14,51,52,53,15,54,26,13,55],"数据工具","视频","插件","其他","音频",{"id":57,"name":58,"github_repo":59,"description_zh":60,"stars":61,"difficulty_score":10,"last_commit_at":62,"category_tags":63,"status":16},3128,"ragflow","infiniflow\u002Fragflow","RAGFlow 
是一款领先的开源检索增强生成（RAG）引擎，旨在为大语言模型构建更精准、可靠的上下文层。它巧妙地将前沿的 RAG 技术与智能体（Agent）能力相结合，不仅支持从各类文档中高效提取知识，还能让模型基于这些知识进行逻辑推理和任务执行。\n\n在大模型应用中，幻觉问题和知识滞后是常见痛点。RAGFlow 通过深度解析复杂文档结构（如表格、图表及混合排版），显著提升了信息检索的准确度，从而有效减少模型“胡编乱造”的现象，确保回答既有据可依又具备时效性。其内置的智能体机制更进一步，使系统不仅能回答问题，还能自主规划步骤解决复杂问题。\n\n这款工具特别适合开发者、企业技术团队以及 AI 研究人员使用。无论是希望快速搭建私有知识库问答系统，还是致力于探索大模型在垂直领域落地的创新者，都能从中受益。RAGFlow 提供了可视化的工作流编排界面和灵活的 API 接口，既降低了非算法背景用户的上手门槛，也满足了专业开发者对系统深度定制的需求。作为基于 Apache 2.0 协议开源的项目，它正成为连接通用大模型与行业专有知识之间的重要桥梁。",77062,"2026-04-04T04:44:48",[15,14,13,26,54],{"id":65,"github_repo":66,"name":67,"description_en":68,"description_zh":69,"ai_summary_zh":69,"readme_en":70,"readme_zh":71,"quickstart_zh":72,"use_case_zh":73,"hero_image_url":74,"owner_login":75,"owner_name":76,"owner_avatar_url":77,"owner_bio":78,"owner_company":78,"owner_location":78,"owner_email":78,"owner_twitter":78,"owner_website":78,"owner_url":79,"languages":80,"stars":89,"forks":90,"last_commit_at":91,"license":92,"difficulty_score":23,"env_os":93,"env_gpu":94,"env_ram":95,"env_deps":96,"category_tags":103,"github_topics":78,"view_count":23,"oss_zip_url":78,"oss_zip_packed_at":78,"status":16,"created_at":104,"updated_at":105,"faqs":106,"releases":107},3651,"Trans-N-ai\u002Fswama","swama","High-performance MLX-based LLM inference engine for macOS with native Swift implementation","Swama 是一款专为 macOS 打造的高性能本地大语言模型推理引擎，完全采用 Swift 原生编写并基于 Apple MLX 框架构建。它旨在解决用户在苹果设备上运行大型语言模型（LLM）和视觉语言模型（VLM）时面临的配置复杂、性能不足及依赖云端服务等痛点，让本地 AI 推理变得简单、快速且隐私安全。\n\n无论是希望保护数据隐私的普通用户、需要快速验证原型的开发者，还是进行本地模型研究的研究人员，都能从中受益。Swama 不仅提供优雅的菜单栏应用和完整的命令行工具，还兼容 OpenAI API 标准，方便无缝集成现有工作流。其独特亮点在于深度优化了 Apple Silicon 芯片性能，支持文本、图像输入及本地语音识别（Whisper），并具备智能模型管理功能——用户只需输入简短别名（如\"qwen3\"），系统即可自动下载并缓存模型。此外，它还内置了向量嵌入生成能力，轻松支持语义搜索与 RAG 应用，是 Mac 用户探索本地 AI 的理想选择。","# 
Swama\n\n[![Swift](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FSwift-6.2-orange.svg)](https:\u002F\u002Fswift.org)\n[![macOS](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FmacOS-15.0+-blue.svg)](https:\u002F\u002Fwww.apple.com\u002Fmacos\u002F)\n[![MLX](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FMLX-Swift-green.svg)](https:\u002F\u002Fgithub.com\u002Fml-explore\u002Fmlx-swift)\n[![License](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FLicense-MIT-yellow.svg)](LICENSE)\n\n> English | [中文](README_CN.md) | [日本語](README_JA.md)\n\n**Swama** is a high-performance machine learning runtime written in pure Swift, designed specifically for macOS and built on Apple's MLX framework. It provides a powerful and easy-to-use solution for local LLM (Large Language Model) and VLM (Vision Language Model) inference.\n\n## ✨ Features\n\n- 🚀 **High Performance**: Built on Apple MLX framework, optimized for Apple Silicon\n- 🔌 **OpenAI Compatible API**: Standard `\u002Fv1\u002Fchat\u002Fcompletions`, `\u002Fv1\u002Fembeddings`, `\u002Fv1\u002Faudio\u002Ftranscriptions`, and `\u002Fv1\u002Faudio\u002Fspeech` (experimental) endpoint support with tool calling\n- 📱 **Menu Bar App**: Elegant macOS native menu bar integration\n- 💻 **Command Line Tools**: Complete CLI support for model management and inference\n- 🖼️ **Multimodal Support**: Support for both text and image inputs\n- 🎤 **Local Audio Transcription**: Built-in speech recognition with Whisper (no cloud required)\n- 🔍 **Text Embeddings**: Built-in embedding generation for semantic search and RAG applications\n- 📦 **Smart Model Management**: Automatic downloading, caching, and version management\n- 🔄 **Streaming Responses**: Real-time streaming text generation support\n- 🌍 **HuggingFace Integration**: Direct model downloads from HuggingFace Hub\n\n## 🏗️ Architecture\n\nSwama features a modular architecture design:\n\n- **SwamaKit**: Core framework library containing all business logic\n- **Swama CLI**: Command-line tool 
providing complete model management and inference functionality\n- **Swama.app**: macOS menu bar application with graphical interface and background services\n\n## 📋 System Requirements\n\n- macOS 15.0 or later (Sequoia)\n- Apple Silicon (M1\u002FM2\u002FM3\u002FM4)\n- Xcode 16.0+ (for compilation)\n- Swift 6.2+\n\n## 🛠️ Installation\n\n### 🍺 Homebrew (Recommended)\n\n```bash\nbrew install swama\n```\n\n### 📱 Download Pre-built App\n\n1. **Download the latest release**\n   - Go to [Releases](https:\u002F\u002Fgithub.com\u002FTrans-N-ai\u002Fswama\u002Freleases)\n   - Download `Swama.dmg` from the latest release\n\n2. **Install the app**\n   - Double-click `Swama.dmg` to mount the disk image\n   - Drag `Swama.app` to the `Applications` folder\n   - Launch Swama from Applications or Spotlight\n\n   **Note**: On first launch, macOS may show a security warning. If this happens:\n   - Go to **System Settings > Privacy & Security**\n   - Click **\"Open Anyway\"** next to the Swama app message\n   - Or right-click the app and select **\"Open\"** from the context menu\n\n3. **Install CLI tools**\n   - Open Swama from the menu bar\n   - Click \"Install Command Line Tool…\" to add the `swama` command to your PATH\n\n### 🔧 Build from Source (Advanced)\n\nFor developers who want to build from source:\n\n```bash\n# Clone the repository\ngit clone https:\u002F\u002Fgithub.com\u002FTrans-N-ai\u002Fswama.git\ncd swama\n\n# Build CLI tool\ncd swama\nswift build -c release\nmv .build\u002Frelease\u002Fswama .build\u002Frelease\u002Fswama-bin\n\n# Build macOS app (requires Xcode)\ncd ..\u002Fswama-macos\u002FSwama\nxcodebuild -project Swama.xcodeproj -scheme Swama -configuration Release\n```\n\n## 🚀 Quick Start\n\nAfter installing Swama.app, you can use either the menu bar app or the command line:\n\n### 1. 
Instant Inference with Model Aliases\n\n```bash\n# Use short aliases instead of full model names - auto-downloads if needed!\nswama run qwen3 \"Hello, AI\"\nswama run llama3.2 \"Tell me a joke\"\nswama run gemma3 \"What's in this image?\" -i \u002Fpath\u002Fto\u002Fimage.jpg\n\n# Traditional way (also works)\nswama run mlx-community\u002FLlama-3.2-1B-Instruct-4bit \"Hello, how are you?\"\n\n# List downloaded models\nswama list\n```\n\n**✨ Smart Features:**\n- **Model Aliases**: Use friendly names like `qwen3`, `llama3.2`, `deepseek-r1`, `gpt-oss` instead of long URLs\n- **Auto-Download**: Models are automatically downloaded on first use - no need to `pull` first!\n- **Cache Management**: Downloaded models are cached for future use\n\n### 2. Available Model Aliases\n\n#### Language Models (LLM)\n\n| Alias | Full Model Name | Size | Description |\n|-------|-----------------|------|-------------|\n| `qwen3` | `mlx-community\u002FQwen3-8B-4bit` | 4.3 GB | Qwen3 8B (default) |\n| `qwen3-1.7b` | `mlx-community\u002FQwen3-1.7B-4bit` | 938.4 MB | Qwen3 1.7B (lightweight) |\n| `qwen3-30b` | `mlx-community\u002FQwen3-30B-A3B-4bit` | 16.0 GB | Qwen3 30B (high-capacity) |\n| `qwen3-32b` | `mlx-community\u002FQwen3-32B-4bit` | 17.2 GB | Qwen3 32B (ultra-scale) |\n| `qwen3-235b` | `mlx-community\u002FQwen3-235B-A22B-4bit` | 123.2 GB | Qwen3 235B (trillion-scale) |\n| `llama3.2` | `mlx-community\u002FLlama-3.2-3B-Instruct-4bit` | 1.7 GB | Llama 3.2 3B (default) |\n| `llama3.2-1b` | `mlx-community\u002FLlama-3.2-1B-Instruct-4bit` | 876.3 MB | Llama 3.2 1B (fastest) |\n| `deepseek-r1` | `mlx-community\u002FDeepSeek-R1-0528-4bit` | ~32 GB | DeepSeek R1 (reasoning model) |\n| `deepseek-r1-8b` | `mlx-community\u002FDeepSeek-R1-0528-Qwen3-8B-8bit` | 8.6 GB | DeepSeek R1 based on Qwen3-8B |\n| `qwen2.5` | `mlx-community\u002FQwen2.5-7B-Instruct-4bit` | 4.0 GB | Qwen 2.5 7B |\n| `gpt-oss` | `lmstudio-community\u002Fgpt-oss-20b-MLX-8bit` | ~20 GB | GPT-OSS 20B (21B params, 3.6B active) 
|\n| `gpt-oss-120b` | `lmstudio-community\u002Fgpt-oss-120b-MLX-8bit` | ~120 GB | GPT-OSS 120B (117B params, 5.1B active) |\n\n#### Vision Language Models (VLM)\n\n| Alias | Full Model Name | Size | Description |\n|-------|-----------------|------|-------------|\n| `qwen3.5` | `mlx-community\u002FQwen3.5-35B-A3B-4bit` | ~21 GB | Qwen3.5 35B-A3B (default) |\n| `qwen3.5-0.8b` | `mlx-community\u002FQwen3.5-0.8B-4bit` | ~0.6 GB | Qwen3.5 0.8B |\n| `qwen3.5-2b` | `mlx-community\u002FQwen3.5-2B-4bit` | ~1.4 GB | Qwen3.5 2B |\n| `qwen3.5-4b` | `mlx-community\u002FQwen3.5-4B-4bit` | ~2.4 GB | Qwen3.5 4B |\n| `qwen3.5-9b` | `mlx-community\u002FQwen3.5-9B-4bit` | ~6.0 GB | Qwen3.5 9B |\n| `qwen3.5-27b` | `mlx-community\u002FQwen3.5-27B-4bit` | ~16 GB | Qwen3.5 27B |\n| `qwen3.5-35b-a3b` | `mlx-community\u002FQwen3.5-35B-A3B-4bit` | ~21 GB | Qwen3.5 35B-A3B |\n| `qwen3.5-122b-a10b` | `mlx-community\u002FQwen3.5-122B-A10B-4bit` | ~68 GB | Qwen3.5 122B-A10B |\n| `qwen3.5-397b-a17b` | `mlx-community\u002FQwen3.5-397B-A17B-4bit` | ~220 GB | Qwen3.5 397B-A17B |\n| `gemma3` | `mlx-community\u002Fgemma-3-4b-it-4bit` | 3.2 GB | Gemma 3 4B (default VLM) |\n| `gemma3-27b` | `mlx-community\u002Fgemma-3-27b-it-4bit` | 15.7 GB | Gemma 3 27B (large-scale VLM) |\n| `qwen3-vl` | `mlx-community\u002FQwen3-VL-4B-Instruct-4bit` | ~4 GB | Qwen3-VL 4B (default VLM) |\n| `qwen3-vl-2b` | `mlx-community\u002FQwen3-VL-2B-Instruct-4bit` | ~2 GB | Qwen3-VL 2B (lightweight) |\n| `qwen3-vl-8b` | `mlx-community\u002FQwen3-VL-8B-Instruct-4bit` | ~8 GB | Qwen3-VL 8B (balanced) |\n\n#### Audio Models (Speech Recognition)\n\n| Alias | Full Model Name | Size | Description |\n|-------|-----------------|------|-------------|\n| `whisper-large` | `mlx-community\u002Fwhisper-large-v3-4bit` | 1.6 GB | Whisper Large v3 (highest accuracy) |\n| `whisper-medium` | `mlx-community\u002Fwhisper-medium-4bit` | 791.1 MB | Whisper Medium (balanced) |\n| `whisper-small` | `mlx-community\u002Fwhisper-small-4bit` | 251.7 MB | 
Whisper Small (fast) |\n| `whisper-base` | `mlx-community\u002Fwhisper-base-4bit` | 77.2 MB | Whisper Base (faster) |\n| `whisper-tiny` | `mlx-community\u002Fwhisper-tiny-4bit` | 40.1 MB | Whisper Tiny (fastest) |\n| `funasr` | `mlx-community\u002FFun-ASR-Nano-2512-4bit` | ~200 MB | FunASR Nano (multilingual) |\n| `funasr-mlt` | `mlx-community\u002FFun-ASR-MLT-Nano-2512-4bit` | ~200 MB | FunASR MLT (multilingual transcription) |\n\n#### Text-to-Speech Models (experimental)\n\n| Alias | Full Model Name | Size | Description |\n|-------|-----------------|------|-------------|\n| `orpheus` | `mlx-community\u002Forpheus-3b-0.1-ft-4bit` | - | - |\n| `marvis` | `Marvis-AI\u002Fmarvis-tts-100m-v0.2-MLX-6bit` | - | - |\n| `chatterbox` | `mlx-community\u002FChatterbox-TTS-q4` | - | - |\n| `chatterbox-turbo` | `mlx-community\u002FChatterbox-Turbo-TTS-q4` | - | - |\n| `outetts` | `mlx-community\u002FLlama-OuteTTS-1.0-1B-4bit` | - | - |\n| `cosyvoice2` | `mlx-community\u002FCosyVoice2-0.5B-4bit` | - | - |\n| `cosyvoice3` | `mlx-community\u002FFun-CosyVoice3-0.5B-2512-4bit` | - | - |\n\n### 3. Start API Service\n\n```bash\n# Or start without specifying model (can switch via API)\nswama serve --host 0.0.0.0 --port 28100\n```\n\n### 4. 
API Usage\n\n#### 🔌 OpenAI Compatible API\n\nSwama provides a fully OpenAI-compatible API endpoint, allowing you to use it with existing tools and integrations:\n\nNote: `\u002Fv1\u002Faudio\u002Fspeech` is experimental.\n\n```bash\n# Get available models\ncurl http:\u002F\u002Flocalhost:28100\u002Fv1\u002Fmodels\n\n# Chat completion using aliases (auto-downloads if needed)\ncurl -X POST http:\u002F\u002Flocalhost:28100\u002Fv1\u002Fchat\u002Fcompletions \\\n  -H \"Content-Type: application\u002Fjson\" \\\n  -d '{\n    \"model\": \"qwen3\",\n    \"messages\": [\n      {\"role\": \"user\", \"content\": \"Hello!\"}\n    ],\n    \"temperature\": 0.7,\n    \"max_tokens\": 100\n  }'\n\n# Streaming response with DeepSeek R1\ncurl -X POST http:\u002F\u002Flocalhost:28100\u002Fv1\u002Fchat\u002Fcompletions \\\n  -H \"Content-Type: application\u002Fjson\" \\\n  -d '{\n    \"model\": \"deepseek-r1\",\n    \"messages\": [\n      {\"role\": \"user\", \"content\": \"Solve this step by step: What is 15% of 240?\"}\n    ],\n    \"stream\": true\n  }'\n\n# Generate text embeddings\ncurl -X POST http:\u002F\u002Flocalhost:28100\u002Fv1\u002Fembeddings \\\n  -H \"Content-Type: application\u002Fjson\" \\\n  -d '{\n    \"input\": [\"Hello world\", \"Text embeddings\"],\n    \"model\": \"mlx-community\u002FQwen3-Embedding-0.6B-4bit-DWQ\"\n  }'\n\n# Transcribe audio files (local processing)\ncurl -X POST http:\u002F\u002Flocalhost:28100\u002Fv1\u002Faudio\u002Ftranscriptions \\\n  -F \"file=@audio.wav\" \\\n  -F \"model=whisper-large\" \\\n  -F \"response_format=json\"\n\n# Text-to-speech (TTS, experimental)\ncurl -X POST http:\u002F\u002Flocalhost:28100\u002Fv1\u002Faudio\u002Fspeech \\\n  -H \"Content-Type: application\u002Fjson\" \\\n  -d '{\n    \"model\": \"orpheus\",\n    \"input\": \"Hello from Swama TTS\",\n    \"voice\": \"tara\",\n    \"response_format\": \"wav\"\n  }' --output speech.wav\n\n# TTS models: orpheus, marvis, chatterbox, chatterbox-turbo, outetts, cosyvoice2, 
cosyvoice3\n# Voice-supported models: orpheus, marvis\n# Orpheus voices: tara, leah, jess, leo, dan, mia, zac, zoe\n# Marvis voices: conversational_a, conversational_b\n# CosyVoice uses a cached default reference audio when no reference is provided\n\n# Tool calling (function calling)\ncurl -X POST http:\u002F\u002Flocalhost:28100\u002Fv1\u002Fchat\u002Fcompletions \\\n  -H \"Content-Type: application\u002Fjson\" \\\n  -d '{\n    \"model\": \"qwen3\",\n    \"messages\": [{\"role\": \"user\", \"content\": \"What is the weather in Tokyo?\"}],\n    \"tools\": [\n      {\n        \"type\": \"function\",\n        \"function\": {\n          \"name\": \"get_weather\",\n          \"description\": \"Get current weather\",\n          \"parameters\": {\n            \"type\": \"object\",\n            \"properties\": {\n              \"location\": {\"type\": \"string\", \"description\": \"City name\"}\n            },\n            \"required\": [\"location\"]\n          }\n        }\n      }\n    ],\n    \"tool_choice\": \"auto\"\n  }'\n\n# Multimodal support (vision language models)\ncurl -X POST http:\u002F\u002Flocalhost:28100\u002Fv1\u002Fchat\u002Fcompletions \\\n  -H \"Content-Type: application\u002Fjson\" \\\n  -d '{\n    \"model\": \"gemma3\",\n    \"messages\": [\n      {\n        \"role\": \"user\",\n        \"content\": [\n          {\"type\": \"text\", \"text\": \"What do you see in this image?\"},\n          {\"type\": \"image_url\", \"image_url\": {\"url\": \"https:\u002F\u002Fexample.com\u002Fimage.jpg\"}}\n        ]\n      }\n    ]\n  }'\n```\n\n## 📚 Command Reference\n\n### Model Management\n\n```bash\n# Download model (supports both aliases and full names)\nswama pull qwen3                    # Using alias\nswama pull whisper-large            # Download speech recognition model\nswama pull mlx-community\u002FQwen3-8B-4bit  # Using full name\n\n# List local models and available aliases\nswama list [--format json]\n\n# Run inference (auto-downloads if model not 
found locally)\nswama run qwen3 \"Your prompt here\"              # Using alias - downloads automatically!\nswama run deepseek-coder \"Write a Python function\"  # Another alias\nswama run \u003Cfull-model-name> \u003Cprompt> [options]      # Using full name\n\n# Transcribe audio files\nswama transcribe audio.wav --model whisper-large --language en\n```\n\n### Server\n\n```bash\n# Start API server\nswama serve [--host HOST] [--port PORT]\n```\n\n### Model Aliases\n\nSwama supports convenient aliases for popular models. Use these short names instead of full model URLs:\n\n```bash\n# Examples with different model families\nswama run qwen3 \"Explain machine learning\"           # Qwen3 8B\nswama run llama3.2-1b \"Quick question: what is AI?\"  # Llama 3.2 1B (fastest)\nswama run deepseek-r1 \"Think step by step: 2+2*3\"    # DeepSeek R1 (reasoning)\n```\n\n### Options\n\n- `--temperature \u003Cvalue>`: Sampling temperature (0.0-2.0)\n- `--top-p \u003Cvalue>`: Nucleus sampling parameter (0.0-1.0)\n- `--max-tokens \u003Cnumber>`: Maximum number of tokens to generate\n- `--repetition-penalty \u003Cvalue>`: Repetition penalty factor\n\n## 🔧 Development\n\n### Dependencies\n\n- [swift-nio](https:\u002F\u002Fgithub.com\u002Fapple\u002Fswift-nio) - High-performance networking framework\n- [swift-argument-parser](https:\u002F\u002Fgithub.com\u002Fapple\u002Fswift-argument-parser) - Command-line argument parsing\n- [mlx-swift](https:\u002F\u002Fgithub.com\u002Fml-explore\u002Fmlx-swift) - Apple MLX Swift bindings\n- [mlx-swift-lm](https:\u002F\u002Fgithub.com\u002Fml-explore\u002Fmlx-swift-lm) - MLX Swift language models\n- [mlx-swift-audio](https:\u002F\u002Fgithub.com\u002FDePasqualeOrg\u002Fmlx-swift-audio) - MLX Swift audio processing (Whisper, FunASR)\n\n### Building\n\n```bash\n# Development build\nswift build\n\n# Release build\nswift build -c release\n\n# Run tests\nswift test\n\n# Generate Xcode project\nswift package generate-xcodeproj\n```\n\n## 🤝 Contributing\n\nWe 
welcome community contributions! Please follow these steps:\n\n1. Fork this repository\n2. Create a feature branch (`git checkout -b feature\u002Famazing-feature`)\n3. Commit your changes (`git commit -m 'Add some amazing feature'`)\n4. Push to the branch (`git push origin feature\u002Famazing-feature`)\n5. Open a Pull Request\n\n### Development Guidelines\n\n- Follow Swift coding style guidelines\n- Add tests for new features\n- Update relevant documentation\n- Ensure all tests pass\n\n## 📝 License\n\nThis project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.\n\n## 🙏 Acknowledgments\n\n- [Apple MLX](https:\u002F\u002Fgithub.com\u002Fml-explore\u002Fmlx) team for the excellent machine learning framework\n- [Swift NIO](https:\u002F\u002Fgithub.com\u002Fapple\u002Fswift-nio) for high-performance networking support\n- All contributors and community members\n\n## 📞 Support\n\n- 📝 [Issue Tracker](https:\u002F\u002Fgithub.com\u002FTrans-N-ai\u002Fswama\u002Fissues)\n- 💬 [Discussions](https:\u002F\u002Fgithub.com\u002FTrans-N-ai\u002Fswama\u002Fdiscussions)\n\n## 🗺️ Roadmap\n\n- TODO\n\n---\n\n**Swama** - Bringing the best local AI experience to macOS users 🚀\n","# Swama\n\n[![Swift](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FSwift-6.2-orange.svg)](https:\u002F\u002Fswift.org)\n[![macOS](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FmacOS-15.0+-blue.svg)](https:\u002F\u002Fwww.apple.com\u002Fmacos\u002F)\n[![MLX](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FMLX-Swift-green.svg)](https:\u002F\u002Fgithub.com\u002Fml-explore\u002Fmlx-swift)\n[![License](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FLicense-MIT-yellow.svg)](LICENSE)\n\n> 英文 | [中文](README_CN.md) | [日语](README_JA.md)\n\n**Swama** 是一款纯 Swift 编写的高性能机器学习运行时，专为 macOS 设计，并基于 Apple 的 MLX 框架构建。它提供了一种强大且易于使用的本地 LLM（大型语言模型）和 VLM（视觉语言模型）推理解决方案。\n\n## ✨ 特性\n\n- 🚀 **高性能**: 基于 Apple MLX 框架，针对 Apple Silicon 进行优化\n- 🔌 **兼容 OpenAI API**: 支持标准的 
`\u002Fv1\u002Fchat\u002Fcompletions`、`\u002Fv1\u002Fembeddings`、`\u002Fv1\u002Faudio\u002Ftranscriptions` 以及实验性的 `\u002Fv1\u002Faudio\u002Fspeech` 端点，并支持工具调用\n- 📱 **菜单栏应用**: 优雅的 macOS 原生菜单栏集成\n- 💻 **命令行工具**: 完整的 CLI 支持，用于模型管理和推理\n- 🖼️ **多模态支持**: 同时支持文本和图像输入\n- 🎤 **本地音频转录**: 内置 Whisper 语音识别功能（无需云端）\n- 🔍 **文本嵌入**: 内置嵌入生成功能，适用于语义搜索和 RAG 应用\n- 📦 **智能模型管理**: 自动下载、缓存和版本管理\n- 🔄 **流式响应**: 支持实时流式文本生成\n- 🌍 **HuggingFace 集成**: 直接从 HuggingFace Hub 下载模型\n\n## 🏗️ 架构\n\nSwama 采用模块化架构设计：\n\n- **SwamaKit**: 包含所有业务逻辑的核心框架库\n- **Swama CLI**: 提供完整模型管理和推理功能的命令行工具\n- **Swama.app**: 带有图形界面和后台服务的 macOS 菜单栏应用\n\n## 📋 系统要求\n\n- macOS 15.0 或更高版本（Sequoia）\n- Apple Silicon (M1\u002FM2\u002FM3\u002FM4)\n- Xcode 16.0+（用于编译）\n- Swift 6.2+\n\n## 🛠️ 安装\n\n### 🍺 Homebrew（推荐）\n\n```bash\nbrew install swama\n```\n\n### 📱 下载预编译应用\n\n1. **下载最新版本**\n   - 访问 [Releases](https:\u002F\u002Fgithub.com\u002FTrans-N-ai\u002Fswama\u002Freleases)\n   - 下载最新版本中的 `Swama.dmg`\n\n2. **安装应用**\n   - 双击 `Swama.dmg` 挂载磁盘映像\n   - 将 `Swama.app` 拖到 `Applications` 文件夹\n   - 从 Applications 或 Spotlight 启动 Swama\n\n   **注意**: 首次启动时，macOS 可能会显示安全警告。如果出现这种情况：\n   - 打开 **系统偏好设置 > 安全性与隐私 > 通用**\n   - 点击 Swama 应用提示旁边的 **“仍要打开”**\n   - 或右键点击应用并从上下文菜单中选择 **“打开”**\n\n3. **安装 CLI 工具**\n   - 从菜单栏打开 Swama\n   - 点击“安装命令行工具…”以将 `swama` 命令添加到你的 PATH 中\n\n### 🔧 从源码构建（高级）\n\n对于希望从源码构建的开发者：\n\n```bash\n# 克隆仓库\ngit clone https:\u002F\u002Fgithub.com\u002FTrans-N-ai\u002Fswama.git\ncd swama\n\n# 构建 CLI 工具\ncd swama\nswift build -c release\nmv .build\u002Frelease\u002Fswama .build\u002Frelease\u002Fswama-bin\n\n# 构建 macOS 应用（需要 Xcode）\ncd ..\u002Fswama-macos\u002FSwama\nxcodebuild -project Swama.xcodeproj -scheme Swama -configuration Release\n```\n\n## 🚀 快速入门\n\n安装 Swama.app 后，你可以使用菜单栏应用或命令行：\n\n### 1. 
使用模型别名进行快速推理\n\n```bash\n# 使用简短别名代替完整模型名称 - 如需则自动下载！\nswama run qwen3 \"你好，AI\"\nswama run llama3.2 \"给我讲个笑话\"\nswama run gemma3 \"这张图片里有什么？\" -i \u002Fpath\u002Fto\u002Fimage.jpg\n\n# 传统方式（同样适用）\nswama run mlx-community\u002FLlama-3.2-1B-Instruct-4bit \"你好，最近怎么样？\"\n\n# 列出已下载的模型\nswama list\n```\n\n**✨ 智能特性:**\n- **模型别名**: 使用友好的名称如 `qwen3`、`llama3.2`、`deepseek-r1`、`gpt-oss`，而不用冗长的 URL\n- **自动下载**: 模型在首次使用时会自动下载，无需提前 `pull`\n- **缓存管理**: 已下载的模型会被缓存以供后续使用\n\n### 2. 可用模型别名\n\n#### 语言模型 (LLM)\n\n| 别名 | 完整模型名称 | 大小 | 描述 |\n|-------|-----------------|------|-------------|\n| `qwen3` | `mlx-community\u002FQwen3-8B-4bit` | 4.3 GB | 通义千问3 8B（默认） |\n| `qwen3-1.7b` | `mlx-community\u002FQwen3-1.7B-4bit` | 938.4 MB | 通义千问3 1.7B（轻量级） |\n| `qwen3-30b` | `mlx-community\u002FQwen3-30B-A3B-4bit` | 16.0 GB | 通义千问3 30B（高容量） |\n| `qwen3-32b` | `mlx-community\u002FQwen3-32B-4bit` | 17.2 GB | 通义千问3 32B（超大规模） |\n| `qwen3-235b` | `mlx-community\u002FQwen3-235B-A22B-4bit` | 123.2 GB | 通义千问3 235B（万亿规模） |\n| `llama3.2` | `mlx-community\u002FLlama-3.2-3B-Instruct-4bit` | 1.7 GB | Llama 3.2 3B（默认） |\n| `llama3.2-1b` | `mlx-community\u002FLlama-3.2-1B-Instruct-4bit` | 876.3 MB | Llama 3.2 1B（最快） |\n| `deepseek-r1` | `mlx-community\u002FDeepSeek-R1-0528-4bit` | ~32 GB | DeepSeek R1（推理模型） |\n| `deepseek-r1-8b` | `mlx-community\u002FDeepSeek-R1-0528-Qwen3-8B-8bit` | 8.6 GB | 基于Qwen3-8B的DeepSeek R1 |\n| `qwen2.5` | `mlx-community\u002FQwen2.5-7B-Instruct-4bit` | 4.0 GB | 通义千问2.5 7B |\n| `gpt-oss` | `lmstudio-community\u002Fgpt-oss-20b-MLX-8bit` | ~20 GB | GPT-OSS 20B（21B参数，3.6B活跃） |\n| `gpt-oss-120b` | `lmstudio-community\u002Fgpt-oss-120b-MLX-8bit` | ~120 GB | GPT-OSS 120B（117B参数，5.1B活跃） |\n\n#### 视觉语言模型 (VLM)\n\n| 别名 | 完整模型名称 | 大小 | 描述 |\n|-------|-----------------|------|-------------|\n| `qwen3.5` | `mlx-community\u002FQwen3.5-35B-A3B-4bit` | ~21 GB | 通义千问3.5 35B-A3B（默认） |\n| `qwen3.5-0.8b` | `mlx-community\u002FQwen3.5-0.8B-4bit` | ~0.6 GB | 通义千问3.5 0.8B |\n| `qwen3.5-2b` | 
`mlx-community\u002FQwen3.5-2B-4bit` | ~1.4 GB | 通义千问3.5 2B |\n| `qwen3.5-4b` | `mlx-community\u002FQwen3.5-4B-4bit` | ~2.4 GB | 通义千问3.5 4B |\n| `qwen3.5-9b` | `mlx-community\u002FQwen3.5-9B-4bit` | ~6.0 GB | 通义千问3.5 9B |\n| `qwen3.5-27b` | `mlx-community\u002FQwen3.5-27B-4bit` | ~16 GB | 通义千问3.5 27B |\n| `qwen3.5-35b-a3b` | `mlx-community\u002FQwen3.5-35B-A3B-4bit` | ~21 GB | 通义千问3.5 35B-A3B |\n| `qwen3.5-122b-a10b` | `mlx-community\u002FQwen3.5-122B-A10B-4bit` | ~68 GB | 通义千问3.5 122B-A10B |\n| `qwen3.5-397b-a17b` | `mlx-community\u002FQwen3.5-397B-A17B-4bit` | ~220 GB | 通义千问3.5 397B-A17B |\n| `gemma3` | `mlx-community\u002Fgemma-3-4b-it-4bit` | 3.2 GB | Gemma 3 4B（默认VLM） |\n| `gemma3-27b` | `mlx-community\u002Fgemma-3-27b-it-4bit` | 15.7 GB | Gemma 3 27B（大型VLM） |\n| `qwen3-vl` | `mlx-community\u002FQwen3-VL-4B-Instruct-4bit` | ~4 GB | 通义千问3-VL 4B（默认VLM） |\n| `qwen3-vl-2b` | `mlx-community\u002FQwen3-VL-2B-Instruct-4bit` | ~2 GB | 通义千问3-VL 2B（轻量级） |\n| `qwen3-vl-8b` | `mlx-community\u002FQwen3-VL-8B-Instruct-4bit` | ~8 GB | 通义千问3-VL 8B（平衡型） |\n\n#### 音频模型（语音识别）\n\n| 别名 | 完整模型名称 | 大小 | 描述 |\n|-------|-----------------|------|-------------|\n| `whisper-large` | `mlx-community\u002Fwhisper-large-v3-4bit` | 1.6 GB | Whisper Large v3（最高精度） |\n| `whisper-medium` | `mlx-community\u002Fwhisper-medium-4bit` | 791.1 MB | Whisper Medium（平衡型） |\n| `whisper-small` | `mlx-community\u002Fwhisper-small-4bit` | 251.7 MB | Whisper Small（快速） |\n| `whisper-base` | `mlx-community\u002Fwhisper-base-4bit` | 77.2 MB | Whisper Base（更快速） |\n| `whisper-tiny` | `mlx-community\u002Fwhisper-tiny-4bit` | 40.1 MB | Whisper Tiny（最快） |\n| `funasr` | `mlx-community\u002FFun-ASR-Nano-2512-4bit` | ~200 MB | FunASR Nano（多语言） |\n| `funasr-mlt` | `mlx-community\u002FFun-ASR-MLT-Nano-2512-4bit` | ~200 MB | FunASR MLT（多语言转录） |\n\n#### 文本到语音模型（实验性）\n\n| 别名 | 完整模型名称 | 大小 | 描述 |\n|-------|-----------------|------|-------------|\n| `orpheus` | `mlx-community\u002Forpheus-3b-0.1-ft-4bit` | - | - |\n| `marvis` 
| `Marvis-AI\u002Fmarvis-tts-100m-v0.2-MLX-6bit` | - | - |\n| `chatterbox` | `mlx-community\u002FChatterbox-TTS-q4` | - | - |\n| `chatterbox-turbo` | `mlx-community\u002FChatterbox-Turbo-TTS-q4` | - | - |\n| `outetts` | `mlx-community\u002FLlama-OuteTTS-1.0-1B-4bit` | - | - |\n| `cosyvoice2` | `mlx-community\u002FCosyVoice2-0.5B-4bit` | - | - |\n| `cosyvoice3` | `mlx-community\u002FFun-CosyVoice3-0.5B-2512-4bit` | - | - |\n\n### 3. 启动 API 服务\n\n```bash\n# 或不指定模型直接启动（可通过 API 切换）\nswama serve --host 0.0.0.0 --port 28100\n```\n\n### 4. API 使用\n\n#### 🔌 OpenAI 兼容 API\n\nSwama 提供完全兼容 OpenAI 的 API 端点，可与现有工具和集成无缝对接：\n\n注意：`\u002Fv1\u002Faudio\u002Fspeech` 为实验性功能。\n\n```bash\n# 获取可用模型\ncurl http:\u002F\u002Flocalhost:28100\u002Fv1\u002Fmodels\n\n# 使用别名进行对话补全（如需会自动下载）\ncurl -X POST http:\u002F\u002Flocalhost:28100\u002Fv1\u002Fchat\u002Fcompletions \\\n  -H \"Content-Type: application\u002Fjson\" \\\n  -d '{\n    \"model\": \"qwen3\",\n    \"messages\": [\n      {\"role\": \"user\", \"content\": \"你好！\"}\n    ],\n    \"temperature\": 0.7,\n    \"max_tokens\": 100\n  }'\n\n# 使用 DeepSeek R1 进行流式响应\ncurl -X POST http:\u002F\u002Flocalhost:28100\u002Fv1\u002Fchat\u002Fcompletions \\\n  -H \"Content-Type: application\u002Fjson\" \\\n  -d '{\n    \"model\": \"deepseek-r1\",\n    \"messages\": [\n      {\"role\": \"user\", \"content\": \"请逐步解答：240 的 15% 是多少？\"}\n    ],\n    \"stream\": true\n  }'\n\n# 生成文本嵌入\ncurl -X POST http:\u002F\u002Flocalhost:28100\u002Fv1\u002Fembeddings \\\n  -H \"Content-Type: application\u002Fjson\" \\\n  -d '{\n    \"input\": [\"Hello world\", \"文本嵌入\"],\n    \"model\": \"mlx-community\u002FQwen3-Embedding-0.6B-4bit-DWQ\"\n  }'\n\n# 转录音频文件（本地处理）\ncurl -X POST http:\u002F\u002Flocalhost:28100\u002Fv1\u002Faudio\u002Ftranscriptions \\\n  -F \"file=@audio.wav\" \\\n  -F \"model=whisper-large\" \\\n  -F \"response_format=json\"\n\n# 文本到语音（TTS，实验性）\ncurl -X POST http:\u002F\u002Flocalhost:28100\u002Fv1\u002Faudio\u002Fspeech \\\n  -H \"Content-Type: 
application\u002Fjson\" \\\n  -d '{\n    \"model\": \"orpheus\",\n    \"input\": \"来自 Swama TTS 的问候\",\n    \"voice\": \"tara\",\n    \"response_format\": \"wav\"\n  }' --output speech.wav\n\n# TTS 模型：orpheus、marvis、chatterbox、chatterbox-turbo、outetts、cosyvoice2、cosyvoice3\n# 支持语音的模型：orpheus、marvis\n# Orpheus 的声音选项：tara、leah、jess、leo、dan、mia、zac、zoe\n# Marvis 的声音选项：conversational_a、conversational_b\n# CosyVoice 在未提供参考音频时会使用缓存的默认参考音频\n\n# 工具调用（函数调用）\ncurl -X POST http:\u002F\u002Flocalhost:28100\u002Fv1\u002Fchat\u002Fcompletions \\\n  -H \"Content-Type: application\u002Fjson\" \\\n  -d '{\n    \"model\": \"qwen3\",\n    \"messages\": [{\"role\": \"user\", \"content\": \"东京现在的天气如何？\"}],\n    \"tools\": [\n      {\n        \"type\": \"function\",\n        \"function\": {\n          \"name\": \"get_weather\",\n          \"description\": \"获取当前天气\",\n          \"parameters\": {\n            \"type\": \"object\",\n            \"properties\": {\n              \"location\": {\"type\": \"string\", \"描述\": \"城市名称\"}\n            },\n            \"required\": [\"location\"]\n          }\n        }\n      }\n    ],\n    \"tool_choice\": \"auto\"\n  }'\n\n# 多模态支持（视觉语言模型）\ncurl -X POST http:\u002F\u002Flocalhost:28100\u002Fv1\u002Fchat\u002Fcompletions \\\n  -H \"Content-Type: application\u002Fjson\" \\\n  -d '{\n    \"model\": \"gemma3\",\n    \"messages\": [\n      {\n        \"role\": \"user\",\n        \"content\": [\n          {\"type\": \"text\", \"text\": \"这张图片里有什么？\"},\n          {\"type\": \"image_url\", \"image_url\": {\"url\": \"https:\u002F\u002Fexample.com\u002Fimage.jpg\"}}\n        ]\n      }\n    ]\n  }'\n```\n\n## 📚 命令参考\n\n### 模型管理\n\n```bash\n# 下载模型（支持别名和完整名称）\nswama pull qwen3                    # 使用别名\nswama pull whisper-large            # 下载语音识别模型\nswama pull mlx-community\u002FQwen3-8B-4bit  # 使用完整名称\n\n# 列出本地模型及可用别名\nswama list [--format json]\n\n# 运行推理（若本地未找到模型则自动下载）\nswama run qwen3 \"您的提示在此\"              # 使用别名 - 自动下载！\nswama run deepseek-coder 
\"编写一个 Python 函数\"  # 另一个别名\nswama run \u003C完整模型名称> \u003C提示> [选项]      # 使用完整名称\n\n# 转录音频文件\nswama transcribe audio.wav --model whisper-large --language en\n```\n\n### 服务器\n\n```bash\n# 启动 API 服务器\nswama serve [--host HOST] [--port PORT]\n```\n\n### 模型别名\n\nSwama 支持常用模型的便捷别名。您可以使用这些简短名称代替完整的模型 URL：\n\n```bash\n# 不同模型系列示例\nswama run qwen3 \"解释机器学习\"           # Qwen3 8B\nswama run llama3.2-1b \"快速问题：什么是 AI？\"  # Llama 3.2 1B（最快）\nswama run deepseek-r1 \"逐步思考：2+2*3\"    # DeepSeek R1（推理能力）\n```\n\n### 选项\n\n- `--temperature \u003Cvalue>`: 采样温度（0.0-2.0）\n- `--top-p \u003Cvalue>`: 核采样参数（0.0-1.0）\n- `--max-tokens \u003Cnumber>`: 最大生成标记数\n- `--repetition-penalty \u003Cvalue>`: 重复惩罚因子\n\n## 🔧 开发\n\n### 依赖项\n\n- [swift-nio](https:\u002F\u002Fgithub.com\u002Fapple\u002Fswift-nio) - 高性能网络框架\n- [swift-argument-parser](https:\u002F\u002Fgithub.com\u002Fapple\u002Fswift-argument-parser) - 命令行参数解析\n- [mlx-swift](https:\u002F\u002Fgithub.com\u002Fml-explore\u002Fmlx-swift) - Apple MLX Swift 绑定\n- [mlx-swift-lm](https:\u002F\u002Fgithub.com\u002Fml-explore\u002Fmlx-swift-lm) - MLX Swift 语言模型\n- [mlx-swift-audio](https:\u002F\u002Fgithub.com\u002FDePasqualeOrg\u002Fmlx-swift-audio) - MLX Swift 音频处理（Whisper、FunASR）\n\n### 构建\n\n```bash\n# 开发构建\nswift build\n\n# 发布构建\nswift build -c release\n\n# 运行测试\nswift test\n\n# 生成 Xcode 项目\nswift package generate-xcodeproj\n```\n\n## 🤝 贡献\n\n我们欢迎社区贡献！请按照以下步骤操作：\n\n1. 分支本仓库\n2. 创建特性分支 (`git checkout -b feature\u002Famazing-feature`)\n3. 提交更改 (`git commit -m '添加一些很棒的功能'`)\n4. 推送到分支 (`git push origin feature\u002Famazing-feature`)\n5. 
打开拉取请求\n\n### 开发指南\n\n- 遵循 Swift 编码风格指南\n- 为新功能添加测试\n- 更新相关文档\n- 确保所有测试通过\n\n## 📝 许可证\n\n本项目采用 MIT 许可证授权，详情请参阅 [LICENSE](LICENSE) 文件。\n\n## 🙏 致谢\n\n- [Apple MLX](https:\u002F\u002Fgithub.com\u002Fml-explore\u002Fmlx) 团队提供的优秀机器学习框架\n- [Swift NIO](https:\u002F\u002Fgithub.com\u002Fapple\u002Fswift-nio) 提供的高性能网络支持\n- 所有贡献者及社区成员\n\n## 📞 支持\n\n- 📝 [问题追踪器](https:\u002F\u002Fgithub.com\u002FTrans-N-ai\u002Fswama\u002Fissues)\n- 💬 [讨论区](https:\u002F\u002Fgithub.com\u002FTrans-N-ai\u002Fswama\u002Fdiscussions)\n\n## 🗺️ 路线图\n\n- 待办事项\n\n---\n\n**Swama** - 为 macOS 用户带来最佳本地 AI 体验 🚀","# Swama 快速上手指南\n\nSwama 是一款基于 Apple MLX 框架、使用纯 Swift 编写的高性能机器学习运行时，专为 macOS 和 Apple Silicon 芯片优化。它支持本地运行大语言模型（LLM）、视觉语言模型（VLM）及语音识别，并提供兼容 OpenAI 标准的 API 接口。\n\n## 环境准备\n\n在开始之前，请确保您的设备满足以下要求：\n\n*   **操作系统**：macOS 15.0 (Sequoia) 或更高版本\n*   **硬件架构**：Apple Silicon 芯片 (M1 \u002F M2 \u002F M3 \u002F M4 系列)\n*   **开发工具**（仅源码编译需要）：Xcode 16.0+ 和 Swift 6.2+\n*   **包管理器**：推荐安装 [Homebrew](https:\u002F\u002Fbrew.sh) 以便快速部署\n\n## 安装步骤\n\n推荐使用 Homebrew 进行安装，也可选择下载图形化应用。\n\n### 方式一：通过 Homebrew 安装（推荐）\n\n打开终端，执行以下命令即可一键安装 CLI 工具：\n\n```bash\nbrew install swama\n```\n\n### 方式二：安装 macOS 图形化应用\n\n如果您更喜欢菜单栏应用或图形界面：\n\n1.  **下载安装包**：访问 [Releases 页面](https:\u002F\u002Fgithub.com\u002FTrans-N-ai\u002Fswama\u002Freleases)，下载最新的 `Swama.dmg` 文件。\n2.  **安装应用**：双击挂载 DMG 文件，将 `Swama.app` 拖入 `Applications` 文件夹。\n3.  **首次运行**：\n    *   若在启动时遇到安全警告，请前往 **系统设置 > 隐私与安全性 > 通用**，点击“仍要打开”。\n    *   或者右键点击应用图标选择“打开”。\n4.  **配置 CLI**：启动菜单栏中的 Swama 应用，点击 **\"Install Command Line Tool…\"** 即可将 `swama` 命令添加到系统路径。\n\n## 基本使用\n\n安装完成后，您可以直接使用命令行进行模型推理，无需预先手动下载模型，Swama 会在首次运行时自动处理。\n\n### 1. 
快速推理（使用模型别名）\n\nSwama 支持友好的模型别名，自动匹配并下载对应的量化模型。\n\n**文本对话：**\n```bash\n# 使用 Qwen3 模型进行对话（自动下载）\nswama run qwen3 \"你好，请介绍一下你自己\"\n\n# 使用 Llama 3.2 模型\nswama run llama3.2 \"讲个笑话\"\n```\n\n**多模态识图：**\n```bash\n# 使用 Gemma3 视觉模型分析图片\nswama run gemma3 \"这张图片里有什么？\" -i \u002Fpath\u002Fto\u002Fimage.jpg\n```\n\n**查看已下载模型：**\n```bash\nswama list\n```\n\n> **常用模型别名参考**：\n> *   **轻量级**：`llama3.2-1b`, `qwen3-1.7b`\n> *   **均衡型**：`qwen3`, `llama3.2`, `gemma3` (默认 VLM)\n> *   **高性能**：`deepseek-r1`, `qwen3-32b`\n> *   **语音识别**：`whisper-large`, `whisper-small`\n\n### 2. 启动本地 API 服务\n\nSwama 提供完全兼容 OpenAI 格式的 API 端点，可轻松集成到现有应用中。\n\n**启动服务：**\n```bash\n# 默认监听 localhost:28100\nswama serve\n\n# 指定主机和端口\nswama serve --host 0.0.0.0 --port 28100\n```\n\n**调用示例（curl）：**\n\n*   **聊天补全**：\n    ```bash\n    curl -X POST http:\u002F\u002Flocalhost:28100\u002Fv1\u002Fchat\u002Fcompletions \\\n      -H \"Content-Type: application\u002Fjson\" \\\n      -d '{\n        \"model\": \"qwen3\",\n        \"messages\": [{\"role\": \"user\", \"content\": \"Hello!\"}]\n      }'\n    ```\n\n*   **流式输出**：\n    ```bash\n    curl -X POST http:\u002F\u002Flocalhost:28100\u002Fv1\u002Fchat\u002Fcompletions \\\n      -H \"Content-Type: application\u002Fjson\" \\\n      -d '{\n        \"model\": \"deepseek-r1\",\n        \"messages\": [{\"role\": \"user\", \"content\": \"逐步计算 15% of 240\"}],\n        \"stream\": true\n      }'\n    ```\n\n*   **语音转文字 (Whisper)**：\n    ```bash\n    curl -X POST http:\u002F\u002Flocalhost:28100\u002Fv1\u002Faudio\u002Ftranscriptions \\\n      -F \"file=@audio.wav\" \\\n      -F \"model=whisper-large\"\n    ```\n\n*   **工具调用 (Function Calling)**：\n    支持在请求中定义 `tools` 参数，模型可自动返回结构化函数调用指令。\n\n*   **多模态输入**：\n    在 `messages` 中混合传入 `text` 和 `image_url` 即可实现识图对话。","一位 macOS 开发者需要在本地快速验证多模态大模型能力，并构建一个支持语音输入和语义搜索的原型应用。\n\n### 没有 swama 时\n- **环境配置繁琐**：需要手动安装 Python 依赖、配置 MLX 环境，且常因版本冲突导致推理引擎无法启动。\n- **多模态支持割裂**：处理图像输入需单独编写预处理代码，语音转录必须依赖外部云服务或复杂的 Whisper 本地部署方案。\n- 
**模型管理混乱**：每次切换模型都要手动下载权重文件、记录存储路径，缺乏统一的版本管理和缓存机制。\n- **集成开发成本高**：若要嵌入现有 Swift 项目，需通过桥接调用 Python 脚本，导致延迟高且调试困难。\n- **缺乏原生体验**：无法利用 macOS 原生菜单栏进行后台常驻服务，难以实现“随时唤起”的交互流程。\n\n### 使用 swama 后\n- **开箱即用**：通过 `brew install swama` 一键安装，基于纯 Swift 和 Apple MLX 框架，完美适配 M 系列芯片，无需配置复杂环境。\n- **全能多模态**：直接通过 CLI 或 API 传入图片路径即可实现图文对话，内置 Whisper 引擎支持离线语音转文字，无需额外服务。\n- **智能模型管家**：支持使用 `qwen3` 等简短别名自动下载和管理模型，自动处理缓存与版本更新，彻底告别手动文件操作。\n- **原生无缝集成**：提供标准的 OpenAI 兼容 API 和 Swift 原生 SDK，可直接在 Xcode 项目中调用，实现低延迟流式响应。\n- **优雅的系统融合**：通过菜单栏应用常驻后台，开发者可随时通过快捷键唤起对话或执行命令，工作流丝滑流畅。\n\nswama 将原本碎片化、高门槛的本地大模型部署过程，转化为符合 macOS 开发者习惯的原生、高效且全能的智能基础设施。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FTrans-N-ai_swama_2af6a615.png","Trans-N-ai","Trans-N.ai","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002FTrans-N-ai_d5598b25.jpg",null,"https:\u002F\u002Fgithub.com\u002FTrans-N-ai",[81,85],{"name":82,"color":83,"percentage":84},"Swift","#F05138",95.6,{"name":86,"color":87,"percentage":88},"Shell","#89e051",4.4,539,28,"2026-04-03T13:26:05","MIT","macOS","不需要独立显卡，但必须使用 Apple Silicon 芯片 (M1\u002FM2\u002FM3\u002FM4)，基于 Apple MLX 框架优化","未说明 (取决于运行的模型大小，从 1GB 到 220GB+ 不等)",{"notes":97,"python":98,"dependencies":99},"1. 仅支持 macOS 15.0 (Sequoia) 及以上版本。2. 必须使用 Apple Silicon 架构电脑，不支持 Intel Mac。3. 无需安装 Python 环境，核心由 Swift 编写。4. 支持自动下载和管理模型，首次运行特定模型时会自动拉取。5. 
提供兼容 OpenAI 格式的 API 接口。","未说明 (基于 Swift 语言，非 Python)",[100,101,102],"Swift 6.2+","Apple MLX Framework","Xcode 16.0+ (编译需要)",[26,14,55,13,53,54],"2026-03-27T02:49:30.150509","2026-04-06T07:13:44.560948",[],[108,113,118,123,128,133,138,143,148,153,158,163,168],{"id":109,"version":110,"summary_zh":111,"released_at":112},98994,"v2.1.1","## 变更内容\n* 由 @sxy-trans-n 在 https:\u002F\u002Fgithub.com\u002FTrans-N-ai\u002Fswama\u002Fpull\u002F105 中更新 metallib bundle\n\n\n**完整变更日志**: https:\u002F\u002Fgithub.com\u002FTrans-N-ai\u002Fswama\u002Fcompare\u002Fv2.1.0...v2.1.1","2026-03-10T08:18:35",{"id":114,"version":115,"summary_zh":116,"released_at":117},98995,"v2.1.0","## 变更内容\n* 由 @zhaopengme 在 https:\u002F\u002Fgithub.com\u002FTrans-N-ai\u002Fswama\u002Fpull\u002F100 中更新提交版本\n* 由 @sxy-trans-n 在 https:\u002F\u002Fgithub.com\u002FTrans-N-ai\u002Fswama\u002Fpull\u002F101 中实现嵌入批处理\n* 由 @sxy-trans-n 在 https:\u002F\u002Fgithub.com\u002FTrans-N-ai\u002Fswama\u002Fpull\u002F103 中引入 Qwen35 模型\n* 由 @sxy-trans-n 在 https:\u002F\u002Fgithub.com\u002FTrans-N-ai\u002Fswama\u002Fpull\u002F104 中修复依赖问题\n\n\n**完整变更日志**: https:\u002F\u002Fgithub.com\u002FTrans-N-ai\u002Fswama\u002Fcompare\u002Fv2.0.1...v2.1.0","2026-03-03T06:49:38",{"id":119,"version":120,"summary_zh":121,"released_at":122},98996,"v2.0.1","## 变更内容\n* 由 @sxy-trans-n 在 https:\u002F\u002Fgithub.com\u002FTrans-N-ai\u002Fswama\u002Fpull\u002F98 中实现，在 ModelPool 中初始化缓存限制配置。\n* 由 @Copilot 在 https:\u002F\u002Fgithub.com\u002FTrans-N-ai\u002Fswama\u002Fpull\u002F97 中为 Mac 应用程序菜单添加了 100 万 token 的上下文限制选项。\n* 由 @sxy-trans-n 在 https:\u002F\u002Fgithub.com\u002FTrans-N-ai\u002Fswama\u002Fpull\u002F99 中为 HTTPHandler 的请求处理添加了 URI 规范化功能。\n\n## 新贡献者\n* @Copilot 在 https:\u002F\u002Fgithub.com\u002FTrans-N-ai\u002Fswama\u002Fpull\u002F97 中完成了首次贡献。\n\n**完整变更日志**: https:\u002F\u002Fgithub.com\u002FTrans-N-ai\u002Fswama\u002Fcompare\u002Fv2.0.0...v2.0.1","2026-01-19T03:44:20",{"id":124,"version":125,"summary_zh":126,"released_at":127},98997,"v2.0.0","## 
变更内容\n- 将 `mlx-swift-example` 替换为 [mlx-swift-lm](https:\u002F\u002Fgithub.com\u002Fml-explore\u002Fmlx-swift-lm)\n- 将 `whisper-kit` 替换为 [mlx-swift-audio](https:\u002F\u002Fgithub.com\u002FDePasqualeOrg\u002Fmlx-swift-audio)\n- 将 `mlx_embeddings` 迁移到 [MLXEmbedders](https:\u002F\u002Fgithub.com\u002Fml-explore\u002Fmlx-swift-lm\u002Ftree\u002Fmain\u002FLibraries\u002FEmbedders)\n- 添加实验性 TTS 端点\n- 添加上下文长度限制（可通过 UI 或 CLI 配置）\n- 增加对更多模型的支持（例如 Qwen3-VL）\n\n## 重大变更（whisper-kit）\n如果您之前下载过 Whisper 模型，请务必使用 Swama 1.5.x 版本删除旧模型，然后在 Swama 2.0.0 中重新下载。  \n这样做可以避免因 whisper-kit 更新而导致的兼容性问题。\n\n**完整变更日志**：https:\u002F\u002Fgithub.com\u002FTrans-N-ai\u002Fswama\u002Fcompare\u002Fv1.5.0...v2.0.0","2025-12-31T14:09:15",{"id":129,"version":130,"summary_zh":131,"released_at":132},98998,"v1.5.0","## 变更内容\n* 修复：确保工具调用在流式响应中被正确累积，由 @sxy-trans-n 在 https:\u002F\u002Fgithub.com\u002FTrans-N-ai\u002Fswama\u002Fpull\u002F70 中完成\n* 提升清晰度：在 fetchModel 中记录当前使用的模型，由 @kakiloki 在 https:\u002F\u002Fgithub.com\u002FTrans-N-ai\u002Fswama\u002Fpull\u002F72 中完成\n* 新功能：支持 CORS，由 @Rin-Li 在 https:\u002F\u002Fgithub.com\u002FTrans-N-ai\u002Fswama\u002Fpull\u002F69 中实现\n* 添加 gpt-oss，由 @sxy-trans-n 在 https:\u002F\u002Fgithub.com\u002FTrans-N-ai\u002Fswama\u002Fpull\u002F79 中完成\n\n\n**完整变更日志**：https:\u002F\u002Fgithub.com\u002FTrans-N-ai\u002Fswama\u002Fcompare\u002Fv1.4.3...v1.5.0","2025-09-19T09:01:19",{"id":134,"version":135,"summary_zh":136,"released_at":137},98999,"v1.4.3","## 变更内容\n* 功能：新增 `rm` 命令，用于从本地存储中移除模型，由 @nova28 在 https:\u002F\u002Fgithub.com\u002FTrans-N-ai\u002Fswama\u002Fpull\u002F65 中实现。\n* 修复：使 `ModelPaths.customModelsDirectory` 对于测试而言变为动态，由 @sxy-trans-n 在 https:\u002F\u002Fgithub.com\u002FTrans-N-ai\u002Fswama\u002Fpull\u002F66 中完成。\n* 修复：提升 OpenAI API 对流式响应的兼容性，由 @sxy-trans-n 在 https:\u002F\u002Fgithub.com\u002FTrans-N-ai\u002Fswama\u002Fpull\u002F67 中完成。\n* 运行：当模型缺失时添加自动下载功能，由 @kakiloki 在 https:\u002F\u002Fgithub.com\u002FTrans-N-ai\u002Fswama\u002Fpull\u002F52 中实现。\n* 功能：新增从本地路径创建模型的功能，由 
@Rin-Li 在 https:\u002F\u002Fgithub.com\u002FTrans-N-ai\u002Fswama\u002Fpull\u002F63 中实现。\n* 新增通义千问 Qwen3 30B 2507 别名，由 @sxy-trans-n 在 https:\u002F\u002Fgithub.com\u002FTrans-N-ai\u002Fswama\u002Fpull\u002F68 中完成。\n\n## 新贡献者\n* @nova28 在 https:\u002F\u002Fgithub.com\u002FTrans-N-ai\u002Fswama\u002Fpull\u002F65 中完成了首次贡献。\n\n**完整变更日志**：https:\u002F\u002Fgithub.com\u002FTrans-N-ai\u002Fswama\u002Fcompare\u002Fv1.4.2...v1.4.3","2025-07-30T09:23:44",{"id":139,"version":140,"summary_zh":141,"released_at":142},99000,"v1.4.2","## 变更内容\n* 修复了对Gemma 3系列模型的支持\n* @Rin-Li 在 https:\u002F\u002Fgithub.com\u002FTrans-N-ai\u002Fswama\u002Fpull\u002F58 中修复了 README 文件的问题\n* @sxy-trans-n 在 https:\u002F\u002Fgithub.com\u002FTrans-N-ai\u002Fswama\u002Fpull\u002F61 中优化了 Modelpool 的多模态模型检测功能\n\n\n**完整变更日志**: https:\u002F\u002Fgithub.com\u002FTrans-N-ai\u002Fswama\u002Fcompare\u002Fv1.4.1...v1.4.2","2025-07-24T07:13:40",{"id":144,"version":145,"summary_zh":146,"released_at":147},99001,"v1.4.1","## 变更内容\n* API 错误信息，由 @sxy-trans-n 在 https:\u002F\u002Fgithub.com\u002FTrans-N-ai\u002Fswama\u002Fpull\u002F44 中实现\n* 修复：无法从 HuggingFace 下载器获取文件大小的问题，由 @djx-trans-n 在 https:\u002F\u002Fgithub.com\u002FTrans-N-ai\u002Fswama\u002Fpull\u002F45 中修复\n* 添加功能检查封装脚本，由 @Rin-Li 在 https:\u002F\u002Fgithub.com\u002FTrans-N-ai\u002Fswama\u002Fpull\u002F47 中添加\n* 添加模型大小信息，并更新多语言模型表格说明，由 @kakiloki 在 https:\u002F\u002Fgithub.com\u002FTrans-N-ai\u002Fswama\u002Fpull\u002F46 中完成\n* 修复：工具调用 Schema 问题，由 @subnix 在 https:\u002F\u002Fgithub.com\u002FTrans-N-ai\u002Fswama\u002Fpull\u002F51 中修复\n\n## 新贡献者\n* @kakiloki 在 https:\u002F\u002Fgithub.com\u002FTrans-N-ai\u002Fswama\u002Fpull\u002F46 中完成了首次贡献\n* @subnix 在 https:\u002F\u002Fgithub.com\u002FTrans-N-ai\u002Fswama\u002Fpull\u002F51 中完成了首次贡献\n\n**完整变更日志**：https:\u002F\u002Fgithub.com\u002FTrans-N-ai\u002Fswama\u002Fcompare\u002Fv1.4.0...v1.4.1","2025-07-15T08:57:45",{"id":149,"version":150,"summary_zh":151,"released_at":152},99002,"v1.4.0","# Swama v1.4.0 发行说明\n\n## 🆕 
新增功能\n\n### OpenAI 兼容的工具调用支持\n- **函数调用 API** - 完整的 OpenAI 兼容工具调用功能，使 AI 模型能够与外部函数交互\n- **灵活的工具选择** - 支持所有工具选择模式：`\"none\"`、`\"auto\"`、`\"required\"` 以及按具体函数选择\n- **流式与非流式** - 统一处理两种响应模式下的工具调用，通过 SSE 实时传输工具调用数据块\n- **完整的消息支持** - 支持所有消息角色，包括 `system`、`user`、`assistant` 和 `tool`\n- **MLX 集成** - 在 OpenAI 工具规范与 MLX 的 `ToolSpec` 格式之间实现无缝转换，并自动处理参数\n\n### Gemma3 视觉语言模型支持\n- **新模型别名** - 为 `mlx-community\u002Fgemma-3-27b-it-4bit` 视觉语言模型添加了 `gemma3` 别名\n- **多模态推理** - 原生支持文本和图像输入，CLI 使用简便\n- **服务器优先架构** - CLI 现在优先使用 HTTP API 调用 Swama.app 后端，以提升性能\n- **自动启动能力** - 如果 Swama.app 未运行，则静默启动，并优雅地回退到直接执行\n- **增强的 CLI 选项** - 添加了 `--server-host` 和 `--server-port` 配置，便于灵活部署\n\n### ModelScope 注册中心支持\n- **双注册中心支持** - 同时支持 Hugging Face 和 ModelScope 的模型下载\n- **环境配置** - 设置 `SWAMA_REGISTRY=MODEL_SCOPE` 即可使用 ModelScope，默认为 `HUGGING_FACE`\n- **更适合中国用户的访问** - 为中国用户提供更快、更便捷的模型下载服务\n- **无缝切换** - 相同的 CLI 命令可在两个注册中心下使用，无需修改代码\n- **统一的模型管理** - 来自两个注册中心的模型都会出现在 `swama list` 中，并带有明确标识\n\n## 🚀 使用方法\n\n### 工具调用\n```bash\n# 通过 HTTP API 进行工具调用（OpenAI 兼容）\ncurl -X POST http:\u002F\u002Flocalhost:28100\u002Fv1\u002Fchat\u002Fcompletions \\\n  -H \"Content-Type: application\u002Fjson\" \\\n  -d '{\n    \"model\": \"qwen3\",\n    \"messages\": [{\"role\": \"user\", \"content\": \"东京现在的天气如何？\"}],\n    \"tools\": [\n      {\n        \"type\": \"function\",\n        \"function\": {\n          \"name\": \"get_weather\",\n          \"description\": \"获取当前天气\",\n          \"parameters\": {\n            \"type\": \"object\",\n            \"properties\": {\n              \"location\": {\"type\": \"string\", \"description\": \"城市名称\"}\n            },\n            \"required\": [\"location\"]\n          }\n        }\n      }\n    ],\n    \"tool_choice\": \"auto\"\n  }'\n```\n\n### Gemma3 多模态推理\n```bash\n# 带图像输入的视觉语言模型\nswama run gemma3 \"这张图片里有什么？\" -i \u002Fpath\u002Fto\u002Fimage.jpg\n```\n\n### ModelScope 注册中心\n```bash\n# 中国用户使用 ModelScope 注册中心\nexport SWAMA_REGISTRY=MODEL_SCOPE\nswama pull qwen3\n\n# 或者使用默认的 Hugging 
Face 注册中心\nexport SWAMA_REGISTRY=HUGGING_FACE\ns","2025-06-27T06:08:59",{"id":154,"version":155,"summary_zh":156,"released_at":157},99003,"v1.3.0","# Swama v1.3.0 发行说明\n\n## 🆕 新增功能\n### 兼容 OpenAI 的音频 API\n- **`\u002Fv1\u002Faudio\u002Ftranscriptions` 端点** - 完全兼容 OpenAI API，实现无缝集成\n- **多部分表单数据支持** - 正确处理文件上传，并进行音频格式验证\n- **多种响应格式** - 支持 JSON、文本和详细 JSON 输出选项\n- **强大的错误处理** - 提供全面的错误响应，并附带正确的 HTTP 状态码\n\n### 增强的 CLI 工具与音频命令\n- **新增 `transcribe` 命令** - 提供全面的音频转录功能，支持自定义选项\n- **增强的 `pull` 命令** - 统一下载 MLX 和 WhisperKit 模型\n- **丰富的 CLI 选项** - 支持模型选择、语言、温度、提示词及输出格式\n- **智能模型验证** - 自动检测并验证 WhisperKit 模型\n\n## 🚀 使用方法\n\n### 音频转录\n```bash\n# 基本转录\nswama transcribe audio.wav\n\n# 指定模型和语言\nswama transcribe audio.wav -m whisper-base -l en\n\n# 获取带时间戳的详细输出\nswama transcribe audio.wav --verbose\n\n# JSON 格式输出，便于程序调用\nswama transcribe audio.wav -f json\n\n# 通过温度和提示词微调\nswama transcribe audio.wav -t 0.2 -p \"关于 AI 的技术讨论\"\n```\n\n### WhisperKit 模型管理\n```bash\n# 下载 WhisperKit 模型\nswama pull whisper-tiny\nswama pull whisper-base \nswama pull whisper-small\nswama pull whisper-large\n\n# 列出所有可用模型（包括 WhisperKit）\nswama list\n```\n\n### 音频 API 集成\n```bash\n# 通过 HTTP API 进行转录（兼容 OpenAI）\ncurl -X POST http:\u002F\u002Flocalhost:28100\u002Fv1\u002Faudio\u002Ftranscriptions \\\n  -F \"file=@meeting.wav\" \\\n  -F \"model=whisper-large\" \\\n  -F \"language=en\" \\\n  -F \"response_format=verbose_json\"\n\n# 简单文本响应\ncurl -X POST http:\u002F\u002Flocalhost:28100\u002Fv1\u002Faudio\u002Ftranscriptions \\\n  -F \"file=@audio.wav\" \\\n  -F \"model=whisper-large\" \\\n  -F \"response_format=text\"\n```\n\n## 📦 下载\n\n**[下载 Swama v1.3.0](https:\u002F\u002Fgithub.com\u002FTrans-N-ai\u002Fswama\u002Freleases\u002Ftag\u002Fv1.3.0)**\n\n提供以下格式：\n- **DMG 安装包** - 适用于 macOS 的简单拖放安装\n- **ZIP 压缩包** - 直接可运行的应用程序包\n\n## 🔄 升级说明\n- **从旧版本升级时**：安装新版本后，请从菜单栏打开 Swama，点击“安装命令行工具…”以更新 CLI 工具\n- **新增音频功能**：升级后，`transcribe` 命令和音频 API 即可立即使用\n- **模型存储位置**：WhisperKit 模型将存储在 `~\u002F.swama\u002Fmodels\u002Fwhisperkit` 
目录下，以便与 MLX 模型分开管理\n- **API 兼容性**：现有的 `\u002Fv1\u002Fchat\u002Fcompletions` 和 `\u002Fv1\u002Fembeddings` 端点将继续正常工作，无需更改\n\n## 🔧 系统要求\n- macOS 14.0 或更高版本\n- Apple Silicon 处理器（M1\u002FM2\u002FM3\u002FM4）\n- 对于音频转录：需支持的音频格式（推荐 WAV，其他格式会自动转换）\n\n## 🎯 主要优势\n- **Pri","2025-06-19T09:06:28",{"id":159,"version":160,"summary_zh":161,"released_at":162},99004,"v1.2.0","# Swama v1.2.0 Release Notes\r\n\r\n## 🆕 What's New\r\n\r\n### Enhanced Model Path Management\r\n- **New centralized model storage** - Models now downloaded to `~\u002F.swama\u002Fmodels` by default\r\n- **Backward compatibility maintained** - Existing models in `~\u002FDocuments\u002Fhuggingface\u002Fmodels` still work\r\n- **Intelligent model discovery** - Automatically finds models across multiple storage locations\r\n- **Improved organization** - Better structure for different model types\r\n\r\n### Expanded Vision-Language (VL) Model Support\r\n- **Comprehensive VL model detection** - Smart pattern-based recognition (`-VL-`, `vision`, `Visual`, etc.)\r\n- **Enhanced model registry** - Improved caching and lookup performance for VL models\r\n- **Dual-path loading** - Support for both registry-based and locally stored VL models\r\n- **Better offline capabilities** - Enhanced model discovery for air-gapped environments\r\n\r\n### Robust Offline Model Support\r\n- **Local-first workflows** - Load models from local directories without network dependency\r\n- **Multi-location discovery** - Intelligent detection across preferred and legacy paths\r\n- **Air-gapped compatibility** - Full offline model resolution for network-constrained environments\r\n- **Improved local model configuration** - Streamlined process for locally stored models\r\n\r\n### Performance & Memory Optimizations\r\n- **Smart model type caching** - Avoid repeated registry lookups with intelligent caching\r\n- **Lazy-loaded VLM registry** - Better startup performance with on-demand loading\r\n- **Optimized filesystem operations** - Reduced 
overhead for offline model discovery\r\n- **Enhanced concurrent loading** - Type-aware caching for parallel model processing\r\n\r\n## 🚀 Usage\r\n### Load Local Models (Offline)\r\n```bash\r\n# Swama automatically discovers models in:\r\n# ~\u002F.swama\u002Fmodels\u002F\r\n# ~\u002FDocuments\u002Fhuggingface\u002Fmodels\u002F\r\nswama list  # Shows all discovered models\r\nswama run your-local-model  # Works without internet\r\n```\r\n\r\n### Improved Model Organization\r\n```bash\r\n# Models are now organized in the new preferred location\r\nls ~\u002F.swama\u002Fmodels\u002F\r\n# But legacy models still work\r\nls ~\u002FDocuments\u002Fhuggingface\u002Fmodels\u002F\r\n```\r\n\r\n## 📦 Download\r\n\r\n**[Download Swama v1.2.0](https:\u002F\u002Fgithub.com\u002FTrans-N-ai\u002Fswama\u002Freleases\u002Ftag\u002Fv1.2.0)**\r\n\r\nAvailable formats:\r\n- **DMG installer** - Easy drag-and-drop installation for macOS\r\n- **ZIP archive** - Direct application bundle\r\n\r\n## 🔄 Upgrade Notes\r\n- **If upgrading from a previous version**: After installing the new version, open Swama from the menu bar and click \"Install Command Line Tool…\" to update the CLI tools\r\n- **Model storage**: New models will be downloaded to `~\u002F.swama\u002Fmodels` by default, but existing models in the legacy location continue to work seamlessly\r\n- **VL model users**: Enhanced support means better compatibility and performance for vision-language models\r\n\r\n## 🔧 Requirements\r\n- macOS 14.0+\r\n- Apple Silicon (M1\u002FM2\u002FM3\u002FM4)\r\n\r\n## What's Changed\r\n* Fix the model name in the script by @Rin-Li in https:\u002F\u002Fgithub.com\u002FTrans-N-ai\u002Fswama\u002Fpull\u002F12\r\n* Enhanced Model Path Management and VL Model Support by @sxy-trans-n in https:\u002F\u002Fgithub.com\u002FTrans-N-ai\u002Fswama\u002Fpull\u002F20\r\n\r\n## New Contributors\r\n* @Rin-Li made their first contribution in 
https:\u002F\u002Fgithub.com\u002FTrans-N-ai\u002Fswama\u002Fpull\u002F12\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002FTrans-N-ai\u002Fswama\u002Fcompare\u002Fv1.1.0...v1.2.0","2025-06-12T09:45:30",{"id":164,"version":165,"summary_zh":166,"released_at":167},99005,"v1.1.0","## 🆕 What's New\r\n\r\n### Text Embeddings Support\r\n- **New `\u002Fv1\u002Fembeddings` API endpoint** - Full OpenAI compatibility\r\n- **Built-in embedding generation** for semantic search and RAG applications\r\n- **Batch processing** with automatic padding and optimization\r\n\r\n### Enhanced Chat Capabilities\r\n- **System prompts support** - Define AI assistant behavior with system messages\r\n- **Multi-turn conversations** - Maintain conversation history across requests\r\n- **Full OpenAI ChatGPT API compatibility** - Drop-in replacement for chat applications\r\n\r\n### Intelligent Memory Management\r\n- **Automatic model eviction** - Prevents GPU memory exhaustion with smart cleanup\r\n- **Usage-based prioritization** - Keeps frequently used models in memory\r\n- **Concurrent inference control** - Safe parallel processing with per-model locking\r\n\r\n## 🚀 Usage\r\n\r\n### Generate Embeddings\r\n```bash\r\ncurl -X POST http:\u002F\u002Flocalhost:28100\u002Fv1\u002Fembeddings \\\r\n  -H \"Content-Type: application\u002Fjson\" \\\r\n  -d '{\r\n    \"input\": [\"Hello world\", \"Text embeddings\"],\r\n    \"model\": \"mlx-community\u002FQwen3-Embedding-0.6B-4bit-DWQ\"\r\n  }'\r\n```\r\n\r\n### Chat with System Prompts\r\n```bash\r\ncurl -X POST http:\u002F\u002Flocalhost:28100\u002Fv1\u002Fchat\u002Fcompletions \\\r\n  -H \"Content-Type: application\u002Fjson\" \\\r\n  -d '{\r\n    \"messages\": [\r\n      {\"role\": \"system\", \"content\": \"You are a helpful math tutor.\"},\r\n      {\"role\": \"user\", \"content\": \"Explain quadratic equations.\"}\r\n    ],\r\n    \"model\": \"qwen3\"\r\n  }'\r\n```\r\n\r\n### Multi-turn Conversations\r\n```bash\r\ncurl -X POST 
http:\u002F\u002Flocalhost:28100\u002Fv1\u002Fchat\u002Fcompletions \\\r\n  -H \"Content-Type: application\u002Fjson\" \\\r\n  -d '{\r\n    \"messages\": [\r\n      {\"role\": \"user\", \"content\": \"My name is Alice.\"},\r\n      {\"role\": \"assistant\", \"content\": \"Nice to meet you, Alice!\"},\r\n      {\"role\": \"user\", \"content\": \"What is my name?\"}\r\n    ],\r\n    \"model\": \"qwen3\"\r\n  }'\r\n```\r\n\r\n## 📦 Download\r\n\r\n**[Download Swama v1.1.0](https:\u002F\u002Fgithub.com\u002FTrans-N-ai\u002Fswama\u002Freleases\u002Ftag\u002Fv1.1.0)**\r\n\r\n## 🔄 Upgrade Notes\r\n- **If upgrading from a previous version**: After installing the new version, open Swama from the menu bar and click \"Install Command Line Tool…\" to update the CLI tools\r\n\r\n## 🔧 Requirements\r\n- macOS 14.0+\r\n- Apple Silicon (M1\u002FM2\u002FM3\u002FM4)\r\n\r\n\r\n## What's Changed\r\n* Fix: Correct command examples in README files by @djx-trans-n in https:\u002F\u002Fgithub.com\u002FTrans-N-ai\u002Fswama\u002Fpull\u002F2\r\n* Fix Chinese ReadMe for downloading by @mmRose-MIAO in https:\u002F\u002Fgithub.com\u002FTrans-N-ai\u002Fswama\u002Fpull\u002F4\r\n* FEAT:Support model aliases for api model handler by @mmRose-MIAO in https:\u002F\u002Fgithub.com\u002FTrans-N-ai\u002Fswama\u002Fpull\u002F6\r\n* feat: Add comprehensive AI model performance benchmark script (Ollama vs Swama) by @syh-trans-n in https:\u002F\u002Fgithub.com\u002FTrans-N-ai\u002Fswama\u002Fpull\u002F7\r\n* feat: add comprehensive AI model benchmark script with expanded model support by @syh-trans-n in https:\u002F\u002Fgithub.com\u002FTrans-N-ai\u002Fswama\u002Fpull\u002F9\r\n* Fix support info readme by @Li-Haojie-1106 in https:\u002F\u002Fgithub.com\u002FTrans-N-ai\u002Fswama\u002Fpull\u002F11\r\n* add embedding models support by @sxy-trans-n in https:\u002F\u002Fgithub.com\u002FTrans-N-ai\u002Fswama\u002Fpull\u002F13\r\n* Add Chat Message Support and Memory Management by @sxy-trans-n in 
https:\u002F\u002Fgithub.com\u002FTrans-N-ai\u002Fswama\u002Fpull\u002F14\r\n\r\n## New Contributors\r\n* @djx-trans-n made their first contribution in https:\u002F\u002Fgithub.com\u002FTrans-N-ai\u002Fswama\u002Fpull\u002F2\r\n* @mmRose-MIAO made their first contribution in https:\u002F\u002Fgithub.com\u002FTrans-N-ai\u002Fswama\u002Fpull\u002F4\r\n* @syh-trans-n made their first contribution in https:\u002F\u002Fgithub.com\u002FTrans-N-ai\u002Fswama\u002Fpull\u002F7\r\n* @Li-Haojie-1106 made their first contribution in https:\u002F\u002Fgithub.com\u002FTrans-N-ai\u002Fswama\u002Fpull\u002F11\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002FTrans-N-ai\u002Fswama\u002Fcompare\u002Fv1.0.0...v1.1.0","2025-06-10T08:56:48",{"id":169,"version":170,"summary_zh":171,"released_at":172},99006,"v1.0.0","## What's Changed\r\n* add swama by @sxy-trans-n in https:\u002F\u002Fgithub.com\u002FTrans-N-ai\u002Fswama\u002Fpull\u002F1\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002FTrans-N-ai\u002Fswama\u002Fcommits\u002Fv1.0.0","2025-06-04T09:41:25"]