[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-k2-fsa--sherpa-onnx":3,"tool-k2-fsa--sherpa-onnx":61},[4,18,26,36,44,53],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":17},4358,"openclaw","openclaw\u002Fopenclaw","OpenClaw 是一款专为个人打造的本地化 AI 助手，旨在让你在自己的设备上拥有完全可控的智能伙伴。它打破了传统 AI 助手局限于特定网页或应用的束缚，能够直接接入你日常使用的各类通讯渠道，包括微信、WhatsApp、Telegram、Discord、iMessage 等数十种平台。无论你在哪个聊天软件中发送消息，OpenClaw 都能即时响应，甚至支持在 macOS、iOS 和 Android 设备上进行语音交互，并提供实时的画布渲染功能供你操控。\n\n这款工具主要解决了用户对数据隐私、响应速度以及“始终在线”体验的需求。通过将 AI 部署在本地，用户无需依赖云端服务即可享受快速、私密的智能辅助，真正实现了“你的数据，你做主”。其独特的技术亮点在于强大的网关架构，将控制平面与核心助手分离，确保跨平台通信的流畅性与扩展性。\n\nOpenClaw 非常适合希望构建个性化工作流的技术爱好者、开发者，以及注重隐私保护且不愿被单一生态绑定的普通用户。只要具备基础的终端操作能力（支持 macOS、Linux 及 Windows WSL2），即可通过简单的命令行引导完成部署。如果你渴望拥有一个懂你",349277,3,"2026-04-06T06:32:30",[13,14,15,16],"Agent","开发框架","图像","数据工具","ready",{"id":19,"name":20,"github_repo":21,"description_zh":22,"stars":23,"difficulty_score":10,"last_commit_at":24,"category_tags":25,"status":17},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,"2026-04-05T11:01:52",[14,15,13],{"id":27,"name":28,"github_repo":29,"description_zh":30,"stars":31,"difficulty_score":32,"last_commit_at":33,"category_tags":34,"status":17},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",145895,2,"2026-04-08T11:32:59",[14,13,35],"语言模型",{"id":37,"name":38,"github_repo":39,"description_zh":40,"stars":41,"difficulty_score":32,"last_commit_at":42,"category_tags":43,"status":17},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",108111,"2026-04-08T11:23:26",[14,15,13],{"id":45,"name":46,"github_repo":47,"description_zh":48,"stars":49,"difficulty_score":32,"last_commit_at":50,"category_tags":51,"status":17},4721,"markitdown","microsoft\u002Fmarkitdown","MarkItDown 是一款由微软 AutoGen 团队打造的轻量级 Python 工具，专为将各类文件高效转换为 Markdown 格式而设计。它支持 PDF、Word、Excel、PPT、图片（含 OCR）、音频（含语音转录）、HTML 乃至 YouTube 链接等多种格式的解析，能够精准提取文档中的标题、列表、表格和链接等关键结构信息。\n\n在人工智能应用日益普及的今天，大语言模型（LLM）虽擅长处理文本，却难以直接读取复杂的二进制办公文档。MarkItDown 恰好解决了这一痛点，它将非结构化或半结构化的文件转化为模型“原生理解”且 Token 效率极高的 Markdown 
格式，成为连接本地文件与 AI 分析 pipeline 的理想桥梁。此外，它还提供了 MCP（模型上下文协议）服务器，可无缝集成到 Claude Desktop 等 LLM 应用中。\n\n这款工具特别适合开发者、数据科学家及 AI 研究人员使用，尤其是那些需要构建文档检索增强生成（RAG）系统、进行批量文本分析或希望让 AI 助手直接“阅读”本地文件的用户。虽然生成的内容也具备一定可读性，但其核心优势在于为机器",93400,"2026-04-06T19:52:38",[52,14],"插件",{"id":54,"name":55,"github_repo":56,"description_zh":57,"stars":58,"difficulty_score":10,"last_commit_at":59,"category_tags":60,"status":17},4487,"LLMs-from-scratch","rasbt\u002FLLMs-from-scratch","LLMs-from-scratch 是一个基于 PyTorch 的开源教育项目，旨在引导用户从零开始一步步构建一个类似 ChatGPT 的大型语言模型（LLM）。它不仅是同名技术著作的官方代码库，更提供了一套完整的实践方案，涵盖模型开发、预训练及微调的全过程。\n\n该项目主要解决了大模型领域“黑盒化”的学习痛点。许多开发者虽能调用现成模型，却难以深入理解其内部架构与训练机制。通过亲手编写每一行核心代码，用户能够透彻掌握 Transformer 架构、注意力机制等关键原理，从而真正理解大模型是如何“思考”的。此外，项目还包含了加载大型预训练权重进行微调的代码，帮助用户将理论知识延伸至实际应用。\n\nLLMs-from-scratch 特别适合希望深入底层原理的 AI 开发者、研究人员以及计算机专业的学生。对于不满足于仅使用 API，而是渴望探究模型构建细节的技术人员而言，这是极佳的学习资源。其独特的技术亮点在于“循序渐进”的教学设计：将复杂的系统工程拆解为清晰的步骤，配合详细的图表与示例，让构建一个虽小但功能完备的大模型变得触手可及。无论你是想夯实理论基础，还是为未来研发更大规模的模型做准备",90106,"2026-04-06T11:19:32",[35,15,13,14],{"id":62,"github_repo":63,"name":64,"description_en":65,"description_zh":66,"ai_summary_zh":67,"readme_en":68,"readme_zh":69,"quickstart_zh":70,"use_case_zh":71,"hero_image_url":72,"owner_login":73,"owner_name":73,"owner_avatar_url":74,"owner_bio":75,"owner_company":76,"owner_location":76,"owner_email":76,"owner_twitter":76,"owner_website":76,"owner_url":77,"languages":78,"stars":119,"forks":120,"last_commit_at":121,"license":122,"difficulty_score":32,"env_os":123,"env_gpu":124,"env_ram":125,"env_deps":126,"category_tags":132,"github_topics":134,"view_count":32,"oss_zip_url":76,"oss_zip_packed_at":76,"status":17,"created_at":155,"updated_at":156,"faqs":157,"releases":191},5635,"k2-fsa\u002Fsherpa-onnx","sherpa-onnx","Speech-to-text, text-to-speech, speaker diarization, speech enhancement, source separation, and VAD using next-gen Kaldi with onnxruntime without Internet connection. 
### Supported functions

| Speech recognition | [Speech synthesis][tts-url] | [Source separation][ss-url] |
|--------------------|-----------------------------|-----------------------------|
| ✔️ | ✔️ | ✔️ |

| Speaker identification | [Speaker diarization][sd-url] | Speaker verification |
|------------------------|-------------------------------|----------------------|
| ✔️ | ✔️ | ✔️ |

| [Spoken language identification][slid-url] | [Audio tagging][at-url] | [Voice activity detection][vad-url] |
|--------------------------------------------|-------------------------|-------------------------------------|
| ✔️ | ✔️ | ✔️ |

| [Keyword spotting][kws-url] | [Add punctuation][punct-url] | [Speech enhancement][se-url] |
|-----------------------------|------------------------------|------------------------------|
| ✔️ | ✔️ | ✔️ |

### Supported platforms

| Architecture | Android | iOS | Windows | macOS | Linux | HarmonyOS |
|--------------|---------|-----|---------|-------|-------|-----------|
| x64          | ✔️      |     | ✔️      | ✔️    | ✔️    | ✔️        |
| x86          | ✔️      |     | ✔️      |       |       |           |
| arm64        | ✔️      | ✔️  | ✔️      | ✔️    | ✔️    | ✔️        |
| arm32        | ✔️      |     |         |       | ✔️    | ✔️        |
| riscv64      |         |     |         |       | ✔️    |           |

### Supported programming languages

| 1. C++ | 2. C | 3. Python | 4. JavaScript |
|--------|------|-----------|---------------|
| ✔️     | ✔️   | ✔️        | ✔️            |

| 5. Java | 6. C# | 7. Kotlin | 8. Swift |
|---------|-------|-----------|----------|
| ✔️      | ✔️    | ✔️        | ✔️       |

| 9. Go | 10. Dart | 11. Rust | 12. Pascal |
|-------|----------|----------|------------|
| ✔️    | ✔️       | ✔️       | ✔️         |

It also supports WebAssembly.

### Supported NPUs

| [1. Rockchip NPU (RKNN)][rknpu-doc] | [2. Qualcomm NPU (QNN)][qnn-doc] | [3. Ascend NPU][ascend-doc] |
|-------------------------------------|----------------------------------|-----------------------------|
| ✔️ | ✔️ | ✔️ |

| [4. Axera NPU][axera-npu] |
|---------------------------|
| ✔️ |

[Join our discord](https://discord.gg/fJdxzg2VbG)

## Introduction

This repository supports running the following functions **locally**

  - Speech-to-text (i.e., ASR); both streaming and non-streaming are supported
  - Text-to-speech (i.e., TTS)
  - Speaker diarization
  - Speaker identification
  - Speaker verification
  - Spoken language identification
  - Audio tagging
  - VAD (e.g., [silero-vad][silero-vad])
  - Speech enhancement (e.g., [gtcrn][gtcrn], [DPDFNet](https://github.com/ceva-ip/DPDFNet))
  - Keyword spotting
  - Source separation (e.g., [spleeter][spleeter], [UVR][UVR])

on the following platforms and operating systems:

  - x86, ``x86_64``, 32-bit ARM, 64-bit ARM (arm64, aarch64), RISC-V (riscv64), **RK NPU**, **Ascend NPU**
  - Linux, macOS, Windows, openKylin
  - Android, WearOS
  - iOS
  - HarmonyOS
  - NodeJS
  - WebAssembly
  - [NVIDIA Jetson Orin NX][NVIDIA Jetson Orin NX] (runs on both CPU and GPU)
  - [NVIDIA Jetson Nano B01][NVIDIA Jetson Nano B01] (runs on both CPU and GPU)
  - [Raspberry Pi][Raspberry Pi]
  - [RV1126][RV1126]
  - [LicheePi4A][LicheePi4A]
  - [VisionFive 2][VisionFive 2]
  - [旭日X3派][旭日X3派]
  - [爱芯派][爱芯派]
  - [RK3588][RK3588]
  - [SpacemiT-K1][SpacemiT-K1]
  - [SpacemiT-K3][SpacemiT-K3]
  - etc.

with the following APIs

  - C++, C, Python, Go, ``C#``
  - Java, Kotlin, JavaScript
  - Swift, Rust
  - Dart, Object Pascal

A minimal Python sketch of the non-streaming API follows this list.
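To make the list above concrete, here is a minimal non-streaming decoding sketch in Python. It assumes `pip install sherpa-onnx` plus NumPy, and a pretrained transducer archive unpacked locally; every path is a placeholder for the files shipped in such an archive, and the exact arguments follow the project's Python examples, so treat the details as assumptions rather than a definitive recipe.

```python
# Minimal non-streaming (offline) decoding sketch for the sherpa-onnx Python API.
# Paths are placeholders for files from a pretrained transducer archive.
import wave

import numpy as np
import sherpa_onnx

recognizer = sherpa_onnx.OfflineRecognizer.from_transducer(
    encoder="encoder.onnx",
    decoder="decoder.onnx",
    joiner="joiner.onnx",
    tokens="tokens.txt",
    num_threads=2,
    decoding_method="greedy_search",
)

# Read a 16-bit mono PCM wav file and normalize to float32 in [-1, 1].
with wave.open("test.wav") as f:
    assert f.getnchannels() == 1
    sample_rate = f.getframerate()
    pcm = np.frombuffer(f.readframes(f.getnframes()), dtype=np.int16)
samples = pcm.astype(np.float32) / 32768.0

stream = recognizer.create_stream()
stream.accept_waveform(sample_rate, samples)
recognizer.decode_stream(stream)
print(stream.result.text)
```

`OfflineRecognizer` also provides factory helpers for other offline model types (for example Whisper and Paraformer); the maintained versions of such scripts live in the repository's `python-api-examples` directory.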
### Links for Huggingface Spaces

<details>
<summary>You can visit the following Huggingface spaces to try sherpa-onnx without installing anything. All you need is a browser.</summary>

| Description                                            | URL                                      | China mirror                            |
|--------------------------------------------------------|------------------------------------------|-----------------------------------------|
| Speaker diarization                                    | [Click me][hf-space-speaker-diarization] | [mirror][hf-space-speaker-diarization-cn] |
| Speech recognition                                     | [Click me][hf-space-asr]                 | [mirror][hf-space-asr-cn]               |
| Speech recognition with [Whisper][Whisper]             | [Click me][hf-space-asr-whisper]         | [mirror][hf-space-asr-whisper-cn]       |
| Speech synthesis                                       | [Click me][hf-space-tts]                 | [mirror][hf-space-tts-cn]               |
| Generate subtitles                                     | [Click me][hf-space-subtitle]            | [mirror][hf-space-subtitle-cn]          |
| Audio tagging                                          | [Click me][hf-space-audio-tagging]       | [mirror][hf-space-audio-tagging-cn]     |
| Source separation                                      | [Click me][hf-space-source-separation]   | [mirror][hf-space-source-separation-cn] |
| Spoken language identification with [Whisper][Whisper] | [Click me][hf-space-slid-whisper]        | [mirror][hf-space-slid-whisper-cn]      |

We also have spaces built using WebAssembly. They are listed below:

| Description | Huggingface space | ModelScope space |
|-------------|-------------------|------------------|
| Voice activity detection with [silero-vad][silero-vad] | [Click me][wasm-hf-vad] | [link][wasm-ms-vad] |
| Real-time speech recognition (Chinese + English) with Zipformer | [Click me][wasm-hf-streaming-asr-zh-en-zipformer] | [link][wasm-ms-streaming-asr-zh-en-zipformer] |
| Real-time speech recognition (Chinese + English) with Paraformer | [Click me][wasm-hf-streaming-asr-zh-en-paraformer] | [link][wasm-ms-streaming-asr-zh-en-paraformer] |
| Real-time speech recognition (Chinese + English + Cantonese) with [Paraformer-large][Paraformer-large] | [Click me][wasm-hf-streaming-asr-zh-en-yue-paraformer] | [link][wasm-ms-streaming-asr-zh-en-yue-paraformer] |
| Real-time speech recognition (English) | [Click me][wasm-hf-streaming-asr-en-zipformer] | [link][wasm-ms-streaming-asr-en-zipformer] |
| VAD + speech recognition (Chinese) with [Zipformer CTC](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-ctc/icefall/zipformer.html#sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03-chinese) | [Click me][wasm-hf-vad-asr-zh-zipformer-ctc-07-03] | [link][wasm-ms-vad-asr-zh-zipformer-ctc-07-03] |
| VAD + speech recognition (Chinese + English + Korean + Japanese + Cantonese) with [SenseVoice][SenseVoice] | [Click me][wasm-hf-vad-asr-zh-en-ko-ja-yue-sense-voice] | [link][wasm-ms-vad-asr-zh-en-ko-ja-yue-sense-voice] |
| VAD + speech recognition (English) with [Whisper][Whisper] tiny.en | [Click me][wasm-hf-vad-asr-en-whisper-tiny-en] | [link][wasm-ms-vad-asr-en-whisper-tiny-en] |
| VAD + speech recognition (English) with [Moonshine tiny][Moonshine tiny] | [Click me][wasm-hf-vad-asr-en-moonshine-tiny-en] | [link][wasm-ms-vad-asr-en-moonshine-tiny-en] |
| VAD + speech recognition (English) with Zipformer trained with [GigaSpeech][GigaSpeech] | [Click me][wasm-hf-vad-asr-en-zipformer-gigaspeech] | [link][wasm-ms-vad-asr-en-zipformer-gigaspeech] |
| VAD + speech recognition (Chinese) with Zipformer trained with [WenetSpeech][WenetSpeech] | [Click me][wasm-hf-vad-asr-zh-zipformer-wenetspeech] | [link][wasm-ms-vad-asr-zh-zipformer-wenetspeech] |
| VAD + speech recognition (Japanese) with Zipformer trained with [ReazonSpeech][ReazonSpeech] | [Click me][wasm-hf-vad-asr-ja-zipformer-reazonspeech] | [link][wasm-ms-vad-asr-ja-zipformer-reazonspeech] |
| VAD + speech recognition (Thai) with Zipformer trained with [GigaSpeech2][GigaSpeech2] | [Click me][wasm-hf-vad-asr-th-zipformer-gigaspeech2] | [link][wasm-ms-vad-asr-th-zipformer-gigaspeech2] |
| VAD + speech recognition (Chinese, many dialects) with a [TeleSpeech-ASR][TeleSpeech-ASR] CTC model | [Click me][wasm-hf-vad-asr-zh-telespeech] | [link][wasm-ms-vad-asr-zh-telespeech] |
| VAD + speech recognition (English + Chinese, incl. many Chinese dialects) with Paraformer-large | [Click me][wasm-hf-vad-asr-zh-en-paraformer-large] | [link][wasm-ms-vad-asr-zh-en-paraformer-large] |
| VAD + speech recognition (English + Chinese, incl. many Chinese dialects) with Paraformer-small | [Click me][wasm-hf-vad-asr-zh-en-paraformer-small] | [link][wasm-ms-vad-asr-zh-en-paraformer-small] |
| VAD + speech recognition (multilingual, incl. many Chinese dialects) with [Dolphin][Dolphin]-base | [Click me][wasm-hf-vad-asr-multi-lang-dolphin-base] | [link][wasm-ms-vad-asr-multi-lang-dolphin-base] |
| Speech synthesis (Piper, English) | [Click me][wasm-hf-tts-piper-en] | [link][wasm-ms-tts-piper-en] |
| Speech synthesis (Piper, German) | [Click me][wasm-hf-tts-piper-de] | [link][wasm-ms-tts-piper-de] |
| Speech synthesis (Matcha, Chinese) | [Click me][wasm-hf-tts-matcha-zh] | [link][wasm-ms-tts-matcha-zh] |
| Speech synthesis (Matcha, English) | [Click me][wasm-hf-tts-matcha-en] | [link][wasm-ms-tts-matcha-en] |
| Speech synthesis (Matcha, Chinese+English) | [Click me][wasm-hf-tts-matcha-zh-en] | [link][wasm-ms-tts-matcha-zh-en] |
| Speaker diarization | [Click me][wasm-hf-speaker-diarization] | [link][wasm-ms-speaker-diarization] |
| Voice cloning with ZipVoice (Chinese+English) | [Click me][wasm-hf-voice-cloning-zipvoice] | [link][wasm-ms-voice-cloning-zipvoice] |
| Voice cloning with Pocket TTS (English) | [Click me][wasm-hf-voice-cloning-pocket] | [link][wasm-ms-voice-cloning-pocket] |

</details>

### Links for pre-built Android APKs

<details>

<summary>You can find pre-built Android APKs for this repository in the following table</summary>

| Description                            | URL                                 | China users                          |
|----------------------------------------|-------------------------------------|--------------------------------------|
| Speaker diarization                    | [Address][apk-speaker-diarization]  | [here][apk-speaker-diarization-cn]   |
| Streaming speech recognition           | [Address][apk-streaming-asr]        | [here][apk-streaming-asr-cn]         |
| Simulated-streaming speech recognition | [Address][apk-simula-streaming-asr] | [here][apk-simula-streaming-asr-cn]  |
| Text-to-speech                         | [Address][apk-tts]                  | [here][apk-tts-cn]                   |
| Voice activity detection (VAD)         | [Address][apk-vad]                  | [here][apk-vad-cn]                   |
| VAD + non-streaming speech recognition | [Address][apk-vad-asr]              | [here][apk-vad-asr-cn]               |
| Two-pass speech recognition            | [Address][apk-2pass]                | [here][apk-2pass-cn]                 |
| Audio tagging                          | [Address][apk-at]                   | [here][apk-at-cn]                    |
| Audio tagging (WearOS)                 | [Address][apk-at-wearos]            | [here][apk-at-wearos-cn]             |
| Speaker identification                 | [Address][apk-sid]                  | [here][apk-sid-cn]                   |
| Spoken language identification         | [Address][apk-slid]                 | [here][apk-slid-cn]                  |
| Keyword spotting                       | [Address][apk-kws]                  | [here][apk-kws-cn]                   |

</details>
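Many of the demos and APKs above pair voice activity detection with a recognizer. Below is a minimal sketch of the VAD side using silero-vad through the Python API; the model path is a placeholder (see the pre-trained model links later in this page), and the config fields follow the project's VAD examples, so treat the details as assumptions.

```python
# Sketch: segment audio with silero-vad via sherpa-onnx, the same
# "VAD + speech recognition" pattern as the demos above.
import numpy as np
import sherpa_onnx

config = sherpa_onnx.VadModelConfig()
config.silero_vad.model = "silero_vad.onnx"  # placeholder path
config.sample_rate = 16000

vad = sherpa_onnx.VoiceActivityDetector(config, buffer_size_in_seconds=30)

# Stand-in for real 16 kHz mono float32 audio (e.g., read from a wav file).
samples = np.zeros(16000 * 60, dtype=np.float32)

window = 512  # default silero-vad window size at 16 kHz
for i in range(0, len(samples), window):
    vad.accept_waveform(samples[i : i + window])
    while not vad.empty():
        seg = vad.front
        start = seg.start / config.sample_rate
        dur = len(seg.samples) / config.sample_rate
        print(f"speech segment at {start:.2f}s, {dur:.2f}s long")
        # seg.samples could now be fed to an offline recognizer stream.
        vad.pop()
```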
### Links for pre-built Flutter APPs

<details>

#### Real-time speech recognition

| Description                  | URL                                  | China users                           |
|------------------------------|--------------------------------------|---------------------------------------|
| Streaming speech recognition | [Address][apk-flutter-streaming-asr] | [here][apk-flutter-streaming-asr-cn]  |

#### Text-to-speech

| Description                              | URL                                | China users                       |
|------------------------------------------|------------------------------------|-----------------------------------|
| Android (arm64-v8a, armeabi-v7a, x86_64) | [Address][flutter-tts-android]     | [here][flutter-tts-android-cn]    |
| Linux (x64)                              | [Address][flutter-tts-linux]       | [here][flutter-tts-linux-cn]      |
| macOS (x64)                              | [Address][flutter-tts-macos-x64]   | [here][flutter-tts-macos-x64-cn]  |
| macOS (arm64)                            | [Address][flutter-tts-macos-arm64] | [here][flutter-tts-macos-arm64-cn]|
| Windows (x64)                            | [Address][flutter-tts-win-x64]     | [here][flutter-tts-win-x64-cn]    |

> Note: You need to build from source for iOS.

</details>

### Links for pre-built Lazarus APPs

<details>

#### Generating subtitles

| Description        | URL                         | China users                 |
|--------------------|-----------------------------|-----------------------------|
| Generate subtitles | [Address][lazarus-subtitle] | [here][lazarus-subtitle-cn] |

</details>

### Links for pre-trained models

<details>

| Description                                  | URL                                                                                    |
|----------------------------------------------|----------------------------------------------------------------------------------------|
| Speech recognition (speech to text, ASR)     | [Address][asr-models]                                                                  |
| Text-to-speech (TTS)                         | [Address][tts-models]                                                                  |
| VAD                                          | [Address][vad-models]                                                                  |
| Keyword spotting                             | [Address][kws-models]                                                                  |
| Audio tagging                                | [Address][at-models]                                                                   |
| Speaker identification (Speaker ID)          | [Address][sid-models]                                                                  |
| Spoken language identification (Language ID) | See multi-lingual [Whisper][Whisper] ASR models from [Speech recognition][asr-models]  |
| Punctuation                                  | [Address][punct-models]                                                                |
| Speaker segmentation                         | [Address][speaker-segmentation-models]                                                 |
| Speech enhancement                           | [Address][speech-enhancement-models]                                                   |
| Source separation                            | [Address][source-separation-models]                                                    |

</details>
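The TTS archives linked above can be driven from the Python API as well. A minimal sketch follows, assuming a VITS-style model archive (for example a Piper voice converted for sherpa-onnx); the paths are placeholders and the config nesting follows the project's offline TTS example, so treat the exact field names as assumptions.

```python
# Sketch: offline text-to-speech with a VITS/Piper-style model archive.
# Paths are placeholders for files from an unpacked TTS model archive.
import wave

import numpy as np
import sherpa_onnx

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="model.onnx",
            lexicon="",
            tokens="tokens.txt",
            data_dir="espeak-ng-data",  # some archives ship espeak-ng data
        ),
        num_threads=2,
    ),
)
tts = sherpa_onnx.OfflineTts(config)

audio = tts.generate("Hello from sherpa-onnx.", sid=0, speed=1.0)

# Write 16-bit PCM with the stdlib to avoid extra dependencies.
pcm = (np.asarray(audio.samples) * 32767).astype(np.int16)
with wave.open("out.wav", "wb") as f:
    f.setnchannels(1)
    f.setsampwidth(2)
    f.setframerate(audio.sample_rate)
    f.writeframes(pcm.tobytes())
```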
#### Some pre-trained ASR models (Streaming)

<details>

Please see

  - <https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-transducer/index.html>
  - <https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-paraformer/index.html>
  - <https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-ctc/index.html>

for more models. The following table lists only **SOME** of them.

| Name | Supported Languages | Description |
|------|---------------------|-------------|
| [sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20][sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20] | Chinese, English | See [also](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-transducer/zipformer-transducer-models.html#csukuangfj-sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20-bilingual-chinese-english) |
| [sherpa-onnx-streaming-zipformer-small-bilingual-zh-en-2023-02-16][sherpa-onnx-streaming-zipformer-small-bilingual-zh-en-2023-02-16] | Chinese, English | See [also](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-transducer/zipformer-transducer-models.html#sherpa-onnx-streaming-zipformer-small-bilingual-zh-en-2023-02-16-bilingual-chinese-english) |
| [sherpa-onnx-streaming-zipformer-zh-14M-2023-02-23][sherpa-onnx-streaming-zipformer-zh-14M-2023-02-23] | Chinese | Suitable for Cortex A7 CPU. See [also](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-transducer/zipformer-transducer-models.html#sherpa-onnx-streaming-zipformer-zh-14m-2023-02-23) |
| [sherpa-onnx-streaming-zipformer-en-20M-2023-02-17][sherpa-onnx-streaming-zipformer-en-20M-2023-02-17] | English | Suitable for Cortex A7 CPU. See [also](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-transducer/zipformer-transducer-models.html#sherpa-onnx-streaming-zipformer-en-20m-2023-02-17) |
| [sherpa-onnx-streaming-zipformer-korean-2024-06-16][sherpa-onnx-streaming-zipformer-korean-2024-06-16] | Korean | See [also](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-transducer/zipformer-transducer-models.html#sherpa-onnx-streaming-zipformer-korean-2024-06-16-korean) |
| [sherpa-onnx-streaming-zipformer-fr-2023-04-14][sherpa-onnx-streaming-zipformer-fr-2023-04-14] | French | See [also](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-transducer/zipformer-transducer-models.html#shaojieli-sherpa-onnx-streaming-zipformer-fr-2023-04-14-french) |

</details>
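A streaming model from the table above can be decoded incrementally. The sketch below feeds a wav file in 200 ms chunks to imitate a live microphone; paths are placeholders for the files in a streaming-model archive, and the usage mirrors the project's online-decode examples, so treat the details as assumptions.

```python
# Sketch: streaming (online) decoding with a streaming Zipformer transducer,
# feeding audio in small chunks the way a microphone callback would.
import wave

import numpy as np
import sherpa_onnx

recognizer = sherpa_onnx.OnlineRecognizer.from_transducer(
    tokens="tokens.txt",
    encoder="encoder.onnx",
    decoder="decoder.onnx",
    joiner="joiner.onnx",
    num_threads=2,
    decoding_method="greedy_search",
)

with wave.open("test.wav") as f:  # 16-bit mono PCM
    sample_rate = f.getframerate()
    pcm = np.frombuffer(f.readframes(f.getnframes()), dtype=np.int16)
samples = pcm.astype(np.float32) / 32768.0

stream = recognizer.create_stream()
chunk = int(0.2 * sample_rate)  # 200 ms chunks
for i in range(0, len(samples), chunk):
    stream.accept_waveform(sample_rate, samples[i : i + chunk])
    while recognizer.is_ready(stream):
        recognizer.decode_stream(stream)
    print("partial:", recognizer.get_result(stream))

# Signal end of input and flush with a little trailing silence.
stream.accept_waveform(sample_rate, np.zeros(int(0.5 * sample_rate), np.float32))
stream.input_finished()
while recognizer.is_ready(stream):
    recognizer.decode_stream(stream)
print("final:", recognizer.get_result(stream))
```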
#### Some pre-trained ASR models (Non-Streaming)

<details>

Please see

  - <https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-transducer/index.html>
  - <https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-paraformer/index.html>
  - <https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-ctc/index.html>
  - <https://k2-fsa.github.io/sherpa/onnx/pretrained_models/telespeech/index.html>
  - <https://k2-fsa.github.io/sherpa/onnx/pretrained_models/whisper/index.html>

for more models. The following table lists only **SOME** of them.

| Name | Supported Languages | Description |
|------|---------------------|-------------|
| [sherpa-onnx-nemo-parakeet-tdt-0.6b-v2-int8](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-transducer/nemo-transducer-models.html#sherpa-onnx-nemo-parakeet-tdt-0-6b-v2-int8-english) | English | It is converted from <https://huggingface.co/nvidia/parakeet-tdt-0.6b-v2> |
| [Whisper tiny.en](https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-whisper-tiny.en.tar.bz2) | English | See [also](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/whisper/tiny.en.html) |
| [Moonshine tiny][Moonshine tiny] | English | See [also](https://github.com/usefulsensors/moonshine) |
| [sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-ctc/icefall/zipformer.html#sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03-chinese) | Chinese | A Zipformer CTC model |
| [sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17][sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17] | Chinese, Cantonese, English, Korean, Japanese | Supports many Chinese dialects. See [also](https://k2-fsa.github.io/sherpa/onnx/sense-voice/index.html) |
| [sherpa-onnx-paraformer-zh-2024-03-09][sherpa-onnx-paraformer-zh-2024-03-09] | Chinese, English | Also supports many Chinese dialects. See [also](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-paraformer/paraformer-models.html#csukuangfj-sherpa-onnx-paraformer-zh-2024-03-09-chinese-english) |
| [sherpa-onnx-zipformer-ja-reazonspeech-2024-08-01][sherpa-onnx-zipformer-ja-reazonspeech-2024-08-01] | Japanese | See [also](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-transducer/zipformer-transducer-models.html#sherpa-onnx-zipformer-ja-reazonspeech-2024-08-01-japanese) |
| [sherpa-onnx-nemo-transducer-giga-am-russian-2024-10-24][sherpa-onnx-nemo-transducer-giga-am-russian-2024-10-24] | Russian | See [also](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-transducer/nemo-transducer-models.html#sherpa-onnx-nemo-transducer-giga-am-russian-2024-10-24-russian) |
| [sherpa-onnx-nemo-ctc-giga-am-russian-2024-10-24][sherpa-onnx-nemo-ctc-giga-am-russian-2024-10-24] | Russian | See [also](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-ctc/nemo/russian.html#sherpa-onnx-nemo-ctc-giga-am-russian-2024-10-24) |
| [sherpa-onnx-zipformer-ru-2024-09-18][sherpa-onnx-zipformer-ru-2024-09-18] | Russian | See [also](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-transducer/zipformer-transducer-models.html#sherpa-onnx-zipformer-ru-2024-09-18-russian) |
| [sherpa-onnx-zipformer-korean-2024-06-24][sherpa-onnx-zipformer-korean-2024-06-24] | Korean | See [also](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-transducer/zipformer-transducer-models.html#sherpa-onnx-zipformer-korean-2024-06-24-korean) |
| [sherpa-onnx-zipformer-thai-2024-06-20][sherpa-onnx-zipformer-thai-2024-06-20] | Thai | See [also](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-transducer/zipformer-transducer-models.html#sherpa-onnx-zipformer-thai-2024-06-20-thai) |
| [sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04][sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04] | Chinese | Supports many dialects. See [also](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/telespeech/models.html#sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04) |

</details>

### Useful links

- Documentation: https://k2-fsa.github.io/sherpa/onnx/
- Bilibili demo videos (Chinese): https://search.bilibili.com/all?keyword=%E6%96%B0%E4%B8%80%E4%BB%A3Kaldi

### How to reach us

Please see https://k2-fsa.github.io/sherpa/social-groups.html for the next-gen Kaldi **WeChat** and **QQ** groups.

## Projects using sherpa-onnx

### [Speed of Sound](https://github.com/zugaldia/speedofsound)

> A voice-typing application for the Linux desktop (GTK4/Adwaita).
> It captures microphone audio, transcribes it offline using Sherpa ONNX ASR models,
> optionally polishes the text with an LLM, and types the result into the active window
> via XDG Remote Desktop Portal keyboard simulation.

### [VoxSherpa TTS](https://github.com/CodeBySonu95/VoxSherpa-TTS)

> VoxSherpa TTS is a 100% offline Android Text-to-Speech app powered by Sherpa-ONNX.
> It supports Kokoro-82M, Piper, and VITS engines with multilingual support including
> Hindi, English, British English, Japanese, Chinese and 50+ more languages.

- [Download APK v1.0-beta](https://huggingface.co/CodeBySonu95/Sherpa-onnx-models/resolve/main/VoxSherpa-TTS_test.apk)
- Android 11+ · 100% offline · No telemetry

<div align="center">

| Generate | Models | Library | Settings |
|:---:|:---:|:---:|:---:|
| <img src="https://oss.gittoolsai.com/images/k2-fsa_sherpa-onnx_readme_e7e7f7bdc274.jpg" width="180"/> | <img src="https://oss.gittoolsai.com/images/k2-fsa_sherpa-onnx_readme_8283cb377ccf.jpg" width="180"/> | <img src="https://oss.gittoolsai.com/images/k2-fsa_sherpa-onnx_readme_423e1defa720.jpg" width="180"/> | <img src="https://oss.gittoolsai.com/images/k2-fsa_sherpa-onnx_readme_5271cacf4ed2.jpg" width="180"/> |

</div>

---
### [BreezeApp](https://github.com/mtkresearch/BreezeApp) from [MediaTek Research](https://github.com/mtkresearch)

> BreezeApp is a mobile AI application developed for both Android and iOS platforms.
> Users can download it directly from the App Store and enjoy a variety of features
> offline, including speech-to-text, text-to-speech, text-based chatbot interactions,
> and image question-answering.

- [Download APK for BreezeApp](https://huggingface.co/MediaTek-Research/BreezeApp/resolve/main/BreezeApp.apk)
- [APK China mirror](https://hf-mirror.com/MediaTek-Research/BreezeApp/blob/main/BreezeApp.apk)

| 1 | 2 | 3 |
|---|---|---|
|![](https://oss.gittoolsai.com/images/k2-fsa_sherpa-onnx_readme_ede1edb0ad13.png)|![](https://oss.gittoolsai.com/images/k2-fsa_sherpa-onnx_readme_db9ed11ff886.png)|![](https://oss.gittoolsai.com/images/k2-fsa_sherpa-onnx_readme_c90af9b707a7.png)|

### [Open-LLM-VTuber](https://github.com/t41372/Open-LLM-VTuber)

Talk to any LLM with hands-free voice interaction, voice interruption, and a Live2D talking
face, running locally across platforms.

See also <https://github.com/t41372/Open-LLM-VTuber/pull/50>

### [voiceapi](https://github.com/ruzhila/voiceapi)

<details>
  <summary>Streaming ASR and TTS based on FastAPI</summary>

It shows how to use the ASR and TTS Python APIs with FastAPI.
</details>

### [TMSpeech](https://github.com/jxlpzqc/TMSpeech) (a Tencent Meeting live-caption tool)

Uses streaming ASR in C# with a graphical user interface.

Video demo in Chinese: [【开源】Windows实时字幕软件（网课/开会必备）](https://www.bilibili.com/video/BV1rX4y1p7Nx)

### [lol互动助手](https://github.com/l1veIn/lol-wom-electron)

It uses the JavaScript API of sherpa-onnx along with [Electron](https://electronjs.org/).

Video demo in Chinese: [爆了！炫神教你开打字挂！真正影响胜率的英雄联盟工具！英雄联盟的最后一块拼图！和游戏中的每个人无障碍沟通！](https://www.bilibili.com/video/BV142tje9E74)

### [Sherpa-ONNX speech-recognition server](https://github.com/hfyydd/sherpa-onnx-server)

A Node.js-based server providing a RESTful API for speech recognition.

### [QSmartAssistant](https://github.com/xinhecuican/QSmartAssistant)

A modular, fully-offline-capable, low-footprint chatbot / smart speaker.

It uses Qt. Both [ASR](https://github.com/xinhecuican/QSmartAssistant/blob/master/doc/%E5%AE%89%E8%A3%85.md#asr)
and [TTS](https://github.com/xinhecuican/QSmartAssistant/blob/master/doc/%E5%AE%89%E8%A3%85.md#tts)
are used.

### [Flutter-EasySpeechRecognition](https://github.com/Jason-chen-coder/Flutter-EasySpeechRecognition)

It extends [./flutter-examples/streaming_asr](./flutter-examples/streaming_asr) by
downloading models inside the app to reduce the size of the app.

Note: [[Team B] Sherpa AI backend](https://github.com/umgc/spring2025/pull/82) also uses
sherpa-onnx in a Flutter APP.

### [sherpa-onnx-unity](https://github.com/xue-fei/sherpa-onnx-unity)

sherpa-onnx in Unity. See also [#1695](https://github.com/k2-fsa/sherpa-onnx/issues/1695),
[#1892](https://github.com/k2-fsa/sherpa-onnx/issues/1892), and [#1859](https://github.com/k2-fsa/sherpa-onnx/issues/1859).

### [xiaozhi-esp32-server](https://github.com/xinnan-tech/xiaozhi-esp32-server)

Backend service for xiaozhi-esp32; helps you quickly build an ESP32 device-control server.

See also

  - [ASR新增轻量级sherpa-onnx-asr](https://github.com/xinnan-tech/xiaozhi-esp32-server/issues/315)
  - [feat: ASR增加sherpa-onnx模型](https://github.com/xinnan-tech/xiaozhi-esp32-server/pull/379)

### [KaithemAutomation](https://github.com/EternityForest/KaithemAutomation)

Pure Python, GUI-focused home automation / consumer-grade SCADA.

It uses TTS from sherpa-onnx. See also [✨ Speak command that uses the new globally configured TTS model.](https://github.com/EternityForest/KaithemAutomation/commit/8e64d2b138725e426532f7d66bb69dd0b4f53693)

### [Open-XiaoAI KWS](https://github.com/idootop/open-xiaoai-kws)

Enable custom wake words for XiaoAi speakers.

Video demo in Chinese: [小爱同学启动～˶╹ꇴ╹˶！](https://www.bilibili.com/video/BV1YfVUz5EMj)
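Wake-word projects like the one above build on the keyword-spotting API. A hedged sketch follows: the `KeywordSpotter` constructor arguments and file names are assumptions based on the project's keyword-spotter Python example, and a KWS model archive (see the pre-trained model links) plus a `keywords.txt` file listing the wake phrases are required.

```python
# Sketch: custom wake-word detection with the keyword-spotting API.
# Paths are placeholders from a KWS model archive; keywords.txt lists
# the wake phrases, tokenized the way the model expects.
import numpy as np
import sherpa_onnx

kws = sherpa_onnx.KeywordSpotter(
    tokens="tokens.txt",
    encoder="encoder.onnx",
    decoder="decoder.onnx",
    joiner="joiner.onnx",
    keywords_file="keywords.txt",
    num_threads=2,
)

stream = kws.create_stream()
sample_rate = 16000
audio = np.zeros(sample_rate * 10, dtype=np.float32)  # stand-in for mic input

chunk = int(0.1 * sample_rate)
for i in range(0, len(audio), chunk):
    stream.accept_waveform(sample_rate, audio[i : i + chunk])
    while kws.is_ready(stream):
        kws.decode_stream(stream)
        result = kws.get_result(stream)
        if result:
            print("detected keyword:", result)
```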
Devs get to write 1 app that runs on\n> any pair of smart glasses.\n\nIt uses sherpa-onnx for real-time speech recognition on iOS and Android devices.\nSee also \u003Chttps:\u002F\u002Fgithub.com\u002FMentra-Community\u002FMentraOS\u002Fpull\u002F861>\n\nIt uses Swift for iOS and Java for Android.\n\n### [flet_sherpa_onnx](https:\u002F\u002Fgithub.com\u002FSamYuan1990\u002Fflet_sherpa_onnx)\n\nFlet ASR\u002FSTT component based on sherpa-onnx.\nExample [a chat box agent](https:\u002F\u002Fgithub.com\u002FSamYuan1990\u002Fi18n-agent-action)\n\n### [achatbot-go](https:\u002F\u002Fgithub.com\u002Fai-bot-pro\u002Fachatbot-go)\n\na multimodal chatbot based on go with sherpa-onnx's speech lib api.\n\n### [fcitx5-vinput](https:\u002F\u002Fgithub.com\u002Fxifan2333\u002Ffcitx5-vinput)\n\nLocal offline voice input plugin for [Fcitx5](https:\u002F\u002Fgithub.com\u002Ffcitx\u002Ffcitx5) (Linux input method framework).\nIt uses C++ with offline ASR for speech recognition, supporting push-to-talk,\ncommand mode, and optional LLM post-processing.\n\nVideo demo in Chinese: [fcitx5-vinput](https:\u002F\u002Fwww.bilibili.com\u002Fvideo\u002FBV1a6cUzVE6F)\n\n### [Wake Word](https:\u002F\u002Fgithub.com\u002Fanalyticsinmotion\u002Fwake-word)\n\nA VS Code extension for hands-free voice-activated coding. It uses sherpa-onnx for real-time\nkeyword spotting (KWS) to detect custom wake phrases and trigger VS Code commands by voice.\nAudio capture is handled by [decibri](https:\u002F\u002Fgithub.com\u002Fanalyticsinmotion\u002Fdecibri), a\ncross-platform Node.js microphone streaming library with prebuilt native binaries.\n\n- [VS Code Marketplace](https:\u002F\u002Fmarketplace.visualstudio.com\u002Fitems?itemName=analytics-in-motion.wake-word)\n- [Open VSX](https:\u002F\u002Fopen-vsx.org\u002Fextension\u002Fanalytics-in-motion\u002Fwake-word)\n- [decibri integration guides for sherpa-onnx](https:\u002F\u002Fdecibri.dev\u002Fdocs\u002Fnode\u002Fintegrations\u002Fsherpa-onnx-stt.html)\n\n[silero-vad]: https:\u002F\u002Fgithub.com\u002Fsnakers4\u002Fsilero-vad\n[Raspberry Pi]: https:\u002F\u002Fwww.raspberrypi.com\u002F\n[RV1126]: https:\u002F\u002Fwww.rock-chips.com\u002Fuploads\u002Fpdf\u002F2022.8.26\u002F191\u002FRV1126%20Brief%20Datasheet.pdf\n[LicheePi4A]: https:\u002F\u002Fsipeed.com\u002Flicheepi4a\n[VisionFive 2]: https:\u002F\u002Fwww.starfivetech.com\u002Fen\u002Fsite\u002Fboards\n[旭日X3派]: https:\u002F\u002Fdeveloper.horizon.ai\u002Fapi\u002Fv1\u002FfileData\u002Fdocuments_pi\u002Findex.html\n[爱芯派]: https:\u002F\u002Fwiki.sipeed.com\u002Fhardware\u002Fzh\u002FmaixIII\u002Fax-pi\u002Faxpi.html\n[hf-space-speaker-diarization]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fspeaker-diarization\n[hf-space-speaker-diarization-cn]: https:\u002F\u002Fhf.qhduan.com\u002Fspaces\u002Fk2-fsa\u002Fspeaker-diarization\n[hf-space-asr]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fautomatic-speech-recognition\n[hf-space-asr-cn]: https:\u002F\u002Fhf.qhduan.com\u002Fspaces\u002Fk2-fsa\u002Fautomatic-speech-recognition\n[Whisper]: https:\u002F\u002Fgithub.com\u002Fopenai\u002Fwhisper\n[hf-space-asr-whisper]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fautomatic-speech-recognition-with-whisper\n[hf-space-asr-whisper-cn]: https:\u002F\u002Fhf.qhduan.com\u002Fspaces\u002Fk2-fsa\u002Fautomatic-speech-recognition-with-whisper\n[hf-space-tts]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Ftext-to-speech\n[hf-space-tts-cn]: 
https:\u002F\u002Fhf.qhduan.com\u002Fspaces\u002Fk2-fsa\u002Ftext-to-speech\n[hf-space-subtitle]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fgenerate-subtitles-for-videos\n[hf-space-subtitle-cn]: https:\u002F\u002Fhf.qhduan.com\u002Fspaces\u002Fk2-fsa\u002Fgenerate-subtitles-for-videos\n[hf-space-audio-tagging]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Faudio-tagging\n[hf-space-audio-tagging-cn]: https:\u002F\u002Fhf.qhduan.com\u002Fspaces\u002Fk2-fsa\u002Faudio-tagging\n[hf-space-source-separation]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fsource-separation\n[hf-space-source-separation-cn]: https:\u002F\u002Fhf.qhduan.com\u002Fspaces\u002Fk2-fsa\u002Fsource-separation\n[hf-space-slid-whisper]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fspoken-language-identification\n[hf-space-slid-whisper-cn]: https:\u002F\u002Fhf.qhduan.com\u002Fspaces\u002Fk2-fsa\u002Fspoken-language-identification\n[wasm-hf-vad]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fweb-assembly-vad-sherpa-onnx\n[wasm-ms-vad]: https:\u002F\u002Fmodelscope.cn\u002Fstudios\u002Fcsukuangfj\u002Fweb-assembly-vad-sherpa-onnx\n[wasm-hf-streaming-asr-zh-en-zipformer]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fweb-assembly-asr-sherpa-onnx-zh-en\n[wasm-ms-streaming-asr-zh-en-zipformer]: https:\u002F\u002Fmodelscope.cn\u002Fstudios\u002Fk2-fsa\u002Fweb-assembly-asr-sherpa-onnx-zh-en\n[wasm-hf-streaming-asr-zh-en-paraformer]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fweb-assembly-asr-sherpa-onnx-zh-en-paraformer\n[wasm-ms-streaming-asr-zh-en-paraformer]: https:\u002F\u002Fmodelscope.cn\u002Fstudios\u002Fk2-fsa\u002Fweb-assembly-asr-sherpa-onnx-zh-en-paraformer\n[Paraformer-large]: https:\u002F\u002Fwww.modelscope.cn\u002Fmodels\u002Fdamo\u002Fspeech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch\u002Fsummary\n[wasm-hf-streaming-asr-zh-en-yue-paraformer]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fweb-assembly-asr-sherpa-onnx-zh-cantonese-en-paraformer\n[wasm-ms-streaming-asr-zh-en-yue-paraformer]: https:\u002F\u002Fmodelscope.cn\u002Fstudios\u002Fk2-fsa\u002Fweb-assembly-asr-sherpa-onnx-zh-cantonese-en-paraformer\n[wasm-hf-streaming-asr-en-zipformer]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fweb-assembly-asr-sherpa-onnx-en\n[wasm-ms-streaming-asr-en-zipformer]: https:\u002F\u002Fmodelscope.cn\u002Fstudios\u002Fk2-fsa\u002Fweb-assembly-asr-sherpa-onnx-en\n[SenseVoice]: https:\u002F\u002Fgithub.com\u002FFunAudioLLM\u002FSenseVoice\n[wasm-hf-vad-asr-zh-zipformer-ctc-07-03]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fweb-assembly-vad-asr-sherpa-onnx-zh-zipformer-ctc\n[wasm-ms-vad-asr-zh-zipformer-ctc-07-03]: https:\u002F\u002Fmodelscope.cn\u002Fstudios\u002Fcsukuangfj\u002Fweb-assembly-vad-asr-sherpa-onnx-zh-zipformer-ctc\u002Fsummary\n[wasm-hf-vad-asr-zh-en-ko-ja-yue-sense-voice]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fweb-assembly-vad-asr-sherpa-onnx-zh-en-ja-ko-cantonese-sense-voice\n[wasm-ms-vad-asr-zh-en-ko-ja-yue-sense-voice]: https:\u002F\u002Fwww.modelscope.cn\u002Fstudios\u002Fcsukuangfj\u002Fweb-assembly-vad-asr-sherpa-onnx-zh-en-jp-ko-cantonese-sense-voice\n[wasm-hf-vad-asr-en-whisper-tiny-en]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fweb-assembly-vad-asr-sherpa-onnx-en-whisper-tiny\n[wasm-ms-vad-asr-en-whisper-tiny-en]: 
https:\u002F\u002Fwww.modelscope.cn\u002Fstudios\u002Fcsukuangfj\u002Fweb-assembly-vad-asr-sherpa-onnx-en-whisper-tiny\n[wasm-hf-vad-asr-en-moonshine-tiny-en]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fweb-assembly-vad-asr-sherpa-onnx-en-moonshine-tiny\n[wasm-ms-vad-asr-en-moonshine-tiny-en]: https:\u002F\u002Fwww.modelscope.cn\u002Fstudios\u002Fcsukuangfj\u002Fweb-assembly-vad-asr-sherpa-onnx-en-moonshine-tiny\n[wasm-hf-vad-asr-en-zipformer-gigaspeech]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fweb-assembly-vad-asr-sherpa-onnx-en-zipformer-gigaspeech\n[wasm-ms-vad-asr-en-zipformer-gigaspeech]: https:\u002F\u002Fwww.modelscope.cn\u002Fstudios\u002Fk2-fsa\u002Fweb-assembly-vad-asr-sherpa-onnx-en-zipformer-gigaspeech\n[wasm-hf-vad-asr-zh-zipformer-wenetspeech]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fweb-assembly-vad-asr-sherpa-onnx-zh-zipformer-wenetspeech\n[wasm-ms-vad-asr-zh-zipformer-wenetspeech]: https:\u002F\u002Fwww.modelscope.cn\u002Fstudios\u002Fk2-fsa\u002Fweb-assembly-vad-asr-sherpa-onnx-zh-zipformer-wenetspeech\n[reazonspeech]: https:\u002F\u002Fresearch.reazon.jp\u002F_static\u002Freazonspeech_nlp2023.pdf\n[wasm-hf-vad-asr-ja-zipformer-reazonspeech]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fweb-assembly-vad-asr-sherpa-onnx-ja-zipformer\n[wasm-ms-vad-asr-ja-zipformer-reazonspeech]: https:\u002F\u002Fwww.modelscope.cn\u002Fstudios\u002Fcsukuangfj\u002Fweb-assembly-vad-asr-sherpa-onnx-ja-zipformer\n[gigaspeech2]: https:\u002F\u002Fgithub.com\u002Fspeechcolab\u002Fgigaspeech2\n[wasm-hf-vad-asr-th-zipformer-gigaspeech2]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fweb-assembly-vad-asr-sherpa-onnx-th-zipformer\n[wasm-ms-vad-asr-th-zipformer-gigaspeech2]: https:\u002F\u002Fwww.modelscope.cn\u002Fstudios\u002Fcsukuangfj\u002Fweb-assembly-vad-asr-sherpa-onnx-th-zipformer\n[telespeech-asr]: https:\u002F\u002Fgithub.com\u002Ftele-ai\u002Ftelespeech-asr\n[wasm-hf-vad-asr-zh-telespeech]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fweb-assembly-vad-asr-sherpa-onnx-zh-telespeech\n[wasm-ms-vad-asr-zh-telespeech]: https:\u002F\u002Fwww.modelscope.cn\u002Fstudios\u002Fk2-fsa\u002Fweb-assembly-vad-asr-sherpa-onnx-zh-telespeech\n[wasm-hf-vad-asr-zh-en-paraformer-large]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fweb-assembly-vad-asr-sherpa-onnx-zh-en-paraformer\n[wasm-ms-vad-asr-zh-en-paraformer-large]: https:\u002F\u002Fwww.modelscope.cn\u002Fstudios\u002Fk2-fsa\u002Fweb-assembly-vad-asr-sherpa-onnx-zh-en-paraformer\n[wasm-hf-vad-asr-zh-en-paraformer-small]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fweb-assembly-vad-asr-sherpa-onnx-zh-en-paraformer-small\n[wasm-ms-vad-asr-zh-en-paraformer-small]: https:\u002F\u002Fwww.modelscope.cn\u002Fstudios\u002Fk2-fsa\u002Fweb-assembly-vad-asr-sherpa-onnx-zh-en-paraformer-small\n[dolphin]: https:\u002F\u002Fgithub.com\u002Fdataoceanai\u002Fdolphin\n[wasm-ms-vad-asr-multi-lang-dolphin-base]: https:\u002F\u002Fmodelscope.cn\u002Fstudios\u002Fcsukuangfj\u002Fweb-assembly-vad-asr-sherpa-onnx-multi-lang-dophin-ctc\n[wasm-hf-vad-asr-multi-lang-dolphin-base]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fweb-assembly-vad-asr-sherpa-onnx-multi-lang-dophin-ctc\n\n[wasm-hf-tts-matcha-zh-en]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fweb-assembly-zh-en-tts-matcha\n[wasm-hf-tts-matcha-zh]: 
https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fweb-assembly-zh-tts-matcha\n[wasm-ms-tts-matcha-zh-en]: https:\u002F\u002Fmodelscope.cn\u002Fstudios\u002Fcsukuangfj\u002Fweb-assembly-zh-en-tts-matcha\n[wasm-ms-tts-matcha-zh]: https:\u002F\u002Fmodelscope.cn\u002Fstudios\u002Fcsukuangfj\u002Fweb-assembly-zh-tts-matcha\n[wasm-hf-tts-matcha-en]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fweb-assembly-en-tts-matcha\n[wasm-ms-tts-matcha-en]: https:\u002F\u002Fmodelscope.cn\u002Fstudios\u002Fcsukuangfj\u002Fweb-assembly-en-tts-matcha\n[wasm-hf-tts-piper-en]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fweb-assembly-tts-sherpa-onnx-en\n[wasm-ms-tts-piper-en]: https:\u002F\u002Fmodelscope.cn\u002Fstudios\u002Fk2-fsa\u002Fweb-assembly-tts-sherpa-onnx-en\n[wasm-hf-tts-piper-de]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fweb-assembly-tts-sherpa-onnx-de\n[wasm-ms-tts-piper-de]: https:\u002F\u002Fmodelscope.cn\u002Fstudios\u002Fk2-fsa\u002Fweb-assembly-tts-sherpa-onnx-de\n[wasm-hf-speaker-diarization]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fweb-assembly-speaker-diarization-sherpa-onnx\n[wasm-ms-speaker-diarization]: https:\u002F\u002Fwww.modelscope.cn\u002Fstudios\u002Fcsukuangfj\u002Fweb-assembly-speaker-diarization-sherpa-onnx\n[wasm-hf-voice-cloning-zipvoice]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fweb-assembly-zh-en-tts-zipvoice\n[wasm-ms-voice-cloning-zipvoice]: https:\u002F\u002Fmodelscope.cn\u002Fstudios\u002Fcsukuangfj\u002Fweb-assembly-zh-en-tts-zipvoice\n[wasm-hf-voice-cloning-pocket]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fweb-assembly-en-tts-pocket\n[wasm-ms-voice-cloning-pocket]: https:\u002F\u002Fmodelscope.cn\u002Fstudios\u002Fcsukuangfj\u002Fweb-assembly-en-tts-pocket\n[apk-speaker-diarization]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fspeaker-diarization\u002Fapk.html\n[apk-speaker-diarization-cn]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fspeaker-diarization\u002Fapk-cn.html\n[apk-streaming-asr]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fandroid\u002Fapk.html\n[apk-streaming-asr-cn]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fandroid\u002Fapk-cn.html\n[apk-simula-streaming-asr]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fandroid\u002Fapk-simulate-streaming-asr.html\n[apk-simula-streaming-asr-cn]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fandroid\u002Fapk-simulate-streaming-asr-cn.html\n[apk-tts]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Ftts\u002Fapk-engine.html\n[apk-tts-cn]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Ftts\u002Fapk-engine-cn.html\n[apk-vad]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fvad\u002Fapk.html\n[apk-vad-cn]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fvad\u002Fapk-cn.html\n[apk-vad-asr]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fvad\u002Fapk-asr.html\n[apk-vad-asr-cn]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fvad\u002Fapk-asr-cn.html\n[apk-2pass]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fandroid\u002Fapk-2pass.html\n[apk-2pass-cn]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fandroid\u002Fapk-2pass-cn.html\n[apk-at]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Faudio-tagging\u002Fapk.html\n[apk-at-cn]: 
https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Faudio-tagging\u002Fapk-cn.html\n[apk-at-wearos]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Faudio-tagging\u002Fapk-wearos.html\n[apk-at-wearos-cn]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Faudio-tagging\u002Fapk-wearos-cn.html\n[apk-sid]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fspeaker-identification\u002Fapk.html\n[apk-sid-cn]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fspeaker-identification\u002Fapk-cn.html\n[apk-slid]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fspoken-language-identification\u002Fapk.html\n[apk-slid-cn]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fspoken-language-identification\u002Fapk-cn.html\n[apk-kws]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fkws\u002Fapk.html\n[apk-kws-cn]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fkws\u002Fapk-cn.html\n[apk-flutter-streaming-asr]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fflutter\u002Fpre-built-app.html#streaming-speech-recognition-stt-asr\n[apk-flutter-streaming-asr-cn]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fflutter\u002Fpre-built-app.html#streaming-speech-recognition-stt-asr\n[flutter-tts-android]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fflutter\u002Ftts-android.html\n[flutter-tts-android-cn]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fflutter\u002Ftts-android-cn.html\n[flutter-tts-linux]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fflutter\u002Ftts-linux.html\n[flutter-tts-linux-cn]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fflutter\u002Ftts-linux-cn.html\n[flutter-tts-macos-x64]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fflutter\u002Ftts-macos-x64.html\n[flutter-tts-macos-arm64-cn]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fflutter\u002Ftts-macos-arm64-cn.html\n[flutter-tts-macos-arm64]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fflutter\u002Ftts-macos-arm64.html\n[flutter-tts-macos-x64-cn]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fflutter\u002Ftts-macos-x64-cn.html\n[flutter-tts-win-x64]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fflutter\u002Ftts-win.html\n[flutter-tts-win-x64-cn]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fflutter\u002Ftts-win-cn.html\n[lazarus-subtitle]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Flazarus\u002Fdownload-generated-subtitles.html\n[lazarus-subtitle-cn]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Flazarus\u002Fdownload-generated-subtitles-cn.html\n[asr-models]: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Ftag\u002Fasr-models\n[tts-models]: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Ftag\u002Ftts-models\n[vad-models]: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Fdownload\u002Fasr-models\u002Fsilero_vad.onnx\n[kws-models]: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Ftag\u002Fkws-models\n[at-models]: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Ftag\u002Faudio-tagging-models\n[sid-models]: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Ftag\u002Fspeaker-recongition-models\n[slid-models]: 
https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Ftag\u002Fspeaker-recongition-models\n[punct-models]: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Ftag\u002Fpunctuation-models\n[speaker-segmentation-models]: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Ftag\u002Fspeaker-segmentation-models\n[GigaSpeech]: https:\u002F\u002Fgithub.com\u002FSpeechColab\u002FGigaSpeech\n[WenetSpeech]: https:\u002F\u002Fgithub.com\u002Fwenet-e2e\u002FWenetSpeech\n[sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20]: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Fdownload\u002Fasr-models\u002Fsherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\n[sherpa-onnx-streaming-zipformer-small-bilingual-zh-en-2023-02-16]: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Fdownload\u002Fasr-models\u002Fsherpa-onnx-streaming-zipformer-small-bilingual-zh-en-2023-02-16.tar.bz2\n[sherpa-onnx-streaming-zipformer-korean-2024-06-16]: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Fdownload\u002Fasr-models\u002Fsherpa-onnx-streaming-zipformer-korean-2024-06-16.tar.bz2\n[sherpa-onnx-streaming-zipformer-zh-14M-2023-02-23]: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Fdownload\u002Fasr-models\u002Fsherpa-onnx-streaming-zipformer-zh-14M-2023-02-23.tar.bz2\n[sherpa-onnx-streaming-zipformer-en-20M-2023-02-17]: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Fdownload\u002Fasr-models\u002Fsherpa-onnx-streaming-zipformer-en-20M-2023-02-17.tar.bz2\n[sherpa-onnx-zipformer-ja-reazonspeech-2024-08-01]: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Fdownload\u002Fasr-models\u002Fsherpa-onnx-zipformer-ja-reazonspeech-2024-08-01.tar.bz2\n[sherpa-onnx-zipformer-ru-2024-09-18]: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Fdownload\u002Fasr-models\u002Fsherpa-onnx-zipformer-ru-2024-09-18.tar.bz2\n[sherpa-onnx-zipformer-korean-2024-06-24]: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Fdownload\u002Fasr-models\u002Fsherpa-onnx-zipformer-korean-2024-06-24.tar.bz2\n[sherpa-onnx-zipformer-thai-2024-06-20]: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Fdownload\u002Fasr-models\u002Fsherpa-onnx-zipformer-thai-2024-06-20.tar.bz2\n[sherpa-onnx-nemo-transducer-giga-am-russian-2024-10-24]: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Fdownload\u002Fasr-models\u002Fsherpa-onnx-nemo-transducer-giga-am-russian-2024-10-24.tar.bz2\n[sherpa-onnx-paraformer-zh-2024-03-09]: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Fdownload\u002Fasr-models\u002Fsherpa-onnx-paraformer-zh-2024-03-09.tar.bz2\n[sherpa-onnx-nemo-ctc-giga-am-russian-2024-10-24]: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Fdownload\u002Fasr-models\u002Fsherpa-onnx-nemo-ctc-giga-am-russian-2024-10-24.tar.bz2\n[sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04]: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Fdownload\u002Fasr-models\u002Fsherpa-onnx-telespeech-ctc-int8-zh-2024-06-04.tar.bz2\n[sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17]: 
https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Fdownload\u002Fasr-models\u002Fsherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\n[sherpa-onnx-streaming-zipformer-fr-2023-04-14]: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Fdownload\u002Fasr-models\u002Fsherpa-onnx-streaming-zipformer-fr-2023-04-14.tar.bz2\n[Moonshine tiny]: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Fdownload\u002Fasr-models\u002Fsherpa-onnx-moonshine-tiny-en-int8.tar.bz2\n[NVIDIA Jetson Orin NX]: https:\u002F\u002Fdeveloper.download.nvidia.com\u002Fassets\u002Fembedded\u002Fsecure\u002Fjetson\u002Forin_nx\u002Fdocs\u002FJetson_Orin_NX_DS-10712-001_v0.5.pdf?RCPGu9Q6OVAOv7a7vgtwc9-BLScXRIWq6cSLuditMALECJ_dOj27DgnqAPGVnT2VpiNpQan9SyFy-9zRykR58CokzbXwjSA7Gj819e91AXPrWkGZR3oS1VLxiDEpJa_Y0lr7UT-N4GnXtb8NlUkP4GkCkkF_FQivGPrAucCUywL481GH_WpP_p7ziHU1Wg==&t=eyJscyI6ImdzZW8iLCJsc2QiOiJodHRwczovL3d3dy5nb29nbGUuY29tLmhrLyJ9\n[NVIDIA Jetson Nano B01]: https:\u002F\u002Fwww.seeedstudio.com\u002Fblog\u002F2020\u002F01\u002F16\u002Fnew-revision-of-jetson-nano-dev-kit-now-supports-new-jetson-nano-module\u002F\n[speech-enhancement-models]: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Ftag\u002Fspeech-enhancement-models\n[source-separation-models]: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Ftag\u002Fsource-separation-models\n[RK3588]: https:\u002F\u002Fwww.rock-chips.com\u002Fuploads\u002Fpdf\u002F2022.8.26\u002F192\u002FRK3588%20Brief%20Datasheet.pdf\n[spleeter]: https:\u002F\u002Fgithub.com\u002Fdeezer\u002Fspleeter\n[UVR]: https:\u002F\u002Fgithub.com\u002FAnjok07\u002Fultimatevocalremovergui\n[gtcrn]: https:\u002F\u002Fgithub.com\u002FXiaobin-Rong\u002Fgtcrn\n[tts-url]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Ftts\u002Fall-in-one.html\n[ss-url]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fsource-separation\u002Findex.html\n[sd-url]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fspeaker-diarization\u002Findex.html\n[slid-url]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fspoken-language-identification\u002Findex.html\n[at-url]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Faudio-tagging\u002Findex.html\n[vad-url]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fvad\u002Findex.html\n[kws-url]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fkws\u002Findex.html\n[punct-url]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fpunctuation\u002Findex.html\n[se-url]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fspeech-enhancement\u002Findex.html\n[rknpu-doc]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Frknn\u002Findex.html\n[qnn-doc]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fqnn\u002Findex.html\n[ascend-doc]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fascend\u002Findex.html\n[axera-npu]: https:\u002F\u002Faxera-tech.com\u002FSkill\u002F166.html\n[SpacemiT-K1]: https:\u002F\u002Fcdn-resource.spacemit.com\u002Ffile\u002Fchip\u002FK1\u002FK1_brief_zh.pdf\n[SpacemiT-K3]: https:\u002F\u002Fcdn-resource.spacemit.com\u002Ffile\u002Fchip\u002FK3\u002FK3_brief_zh.pdf\n","### 支持的功能\n\n|语音识别| [语音合成][tts-url] | [声源分离][ss-url] |\n|------------------|------------------|-------------------|\n|   ✔️              |         ✔️        |       ✔️           |\n\n|说话人辨识| [说话人日志][sd-url] | 说话人验证 
|\n|----------------------|-------------------- |------------------------|\n|   ✔️                  |         ✔️           |            ✔️           |\n\n| [口语语言辨识][slid-url] | [音频标签][at-url] | [语音活动检测][vad-url] |\n|--------------------------------|---------------|--------------------------|\n|                 ✔️              |          ✔️    |                ✔️         |\n\n| [关键词检测][kws-url] | [添加标点符号][punct-url] | [语音增强][se-url] |\n|------------------|-----------------|--------------------|\n|     ✔️            |       ✔️         |      ✔️             |\n\n\n### 支持的平台\n\n|架构| 安卓 | iOS     | Windows    | macOS | Linux | 鸿蒙OS |\n|------------|---------|---------|------------|-------|-------|-----------|\n|   x64      |  ✔️      |         |   ✔️      | ✔️    |  ✔️    |   ✔️   |\n|   x86      |  ✔️      |         |   ✔️      |       |        |        |\n|   arm64    |  ✔️      | ✔️      |   ✔️      | ✔️    |  ✔️    |   ✔️   |\n|   arm32    |  ✔️      |         |           |       |  ✔️    |   ✔️   |\n|   riscv64  |          |         |           |       |  ✔️    |        |\n\n### 支持的编程语言\n\n| 1. C++ | 2. C  | 3. Python | 4. JavaScript |\n|--------|-------|-----------|---------------|\n|   ✔️    | ✔️     | ✔️         |    ✔️          |\n\n|5. Java | 6. C# | 7. Kotlin | 8. Swift |\n|--------|-------|-----------|----------|\n| ✔️      |  ✔️    | ✔️         |  ✔️       |\n\n| 9. Go | 10. Dart | 11. Rust | 12. Pascal |\n|-------|----------|----------|------------|\n| ✔️     |  ✔️       |   ✔️      |    ✔️       |\n\n\n它还支持 WebAssembly。\n\n### 支持的 NPU\n\n| [1. 瑞芯微 NPU (RKNN)][rknpu-doc] | [2. 高通 NPU (QNN)][qnn-doc]  | [3. 华为昇腾 NPU][ascend-doc] |\n|-------------------------------------|-----------------------------------|-----------------------------|\n|     ✔️                              |                  ✔️               |     ✔️                      |\n\n| [4. 
爱芯元智 NPU][axera-npu] |\n|---------------------------|\n|     ✔️                    |\n\n[加入我们的 Discord](https:\u002F\u002Fdiscord.gg\u002FfJdxzg2VbG)\n\n\n## 简介\n\n本仓库支持在本地运行以下功能：\n\n  - 语音转文本（即 ASR）；支持流式和非流式处理\n  - 文本转语音（即 TTS）\n  - 说话人日志\n  - 说话人辨识\n  - 说话人验证\n  - 口语语言辨识\n  - 音频标签\n  - VAD（例如 [silero-vad][silero-vad]）\n  - 语音增强（例如 [gtcrn][gtcrn]、[DPDFNet](https:\u002F\u002Fgithub.com\u002Fceva-ip\u002FDPDFNet)）\n  - 关键词检测\n  - 声源分离（例如 [spleeter][spleeter]、[UVR][UVR]）\n\n可在以下平台和操作系统上运行：\n\n  - x86、x86_64、32 位 ARM、64 位 ARM（arm64、aarch64）、RISC-V（riscv64）、**RK NPU**、**昇腾 NPU**\n  - Linux、macOS、Windows、openKylin\n  - 安卓、WearOS\n  - iOS\n  - 鸿蒙OS\n  - NodeJS\n  - WebAssembly\n  - [NVIDIA Jetson Orin NX][NVIDIA Jetson Orin NX]（支持在 CPU 和 GPU 上运行）\n  - [NVIDIA Jetson Nano B01][NVIDIA Jetson Nano B01]（支持在 CPU 和 GPU 上运行）\n  - [树莓派][Raspberry Pi]\n  - [RV1126][RV1126]\n  - [LicheePi4A][LicheePi4A]\n  - [VisionFive 2][VisionFive 2]\n  - [旭日X3派][旭日X3派]\n  - [爱芯派][爱芯派]\n  - [RK3588][RK3588]\n  - [SpacemiT-K1][SpacemiT-K1]\n  - [SpacemiT-K3][SpacemiT-K3]\n  - 等等\n\n并提供以下 API：\n\n  - C++、C、Python、Go、C#\n  - Java、Kotlin、JavaScript\n  - Swift、Rust\n  - Dart、Object Pascal\n\n### Hugging Face Spaces 链接\n\n\u003Cdetails>\n\u003Csummary>您可以通过访问以下 Hugging Face Spaces 来试用 sherpa-onnx，无需任何安装，只需要一个浏览器即可。\u003C\u002Fsummary>\n\n| 描述 | URL | 中国镜像 |\n|------|-----|----------|\n| 说话人日志 | [点击我][hf-space-speaker-diarization] | [镜像][hf-space-speaker-diarization-cn] |\n| 语音识别 | [点击我][hf-space-asr] | [镜像][hf-space-asr-cn] |\n| 使用 [Whisper][Whisper] 的语音识别 | [点击我][hf-space-asr-whisper] | [镜像][hf-space-asr-whisper-cn] |\n| 语音合成 | [点击我][hf-space-tts] | [镜像][hf-space-tts-cn] |\n| 生成字幕 | [点击我][hf-space-subtitle] | [镜像][hf-space-subtitle-cn] |\n| 音频标签 | [点击我][hf-space-audio-tagging] | [镜像][hf-space-audio-tagging-cn] |\n| 声源分离 | [点击我][hf-space-source-separation] | [镜像][hf-space-source-separation-cn] |\n| 使用 [Whisper][Whisper] 的口语语言辨识 | [点击我][hf-space-slid-whisper] | [镜像][hf-space-slid-whisper-cn] |\n\n我们还有使用 WebAssembly 构建的空间，列表如下：\n\n| 描述 | Hugging Face Space | ModelScope Space |\n|------|--------------------|------------------|\n| 使用 [silero-vad][silero-vad] 的语音活动检测 | [点击我][wasm-hf-vad] | [地址][wasm-ms-vad] |\n| 使用 Zipformer 的实时语音识别（中文 + 英文） | [点击我][wasm-hf-streaming-asr-zh-en-zipformer] | [地址][wasm-ms-streaming-asr-zh-en-zipformer] |\n| 使用 Paraformer 的实时语音识别（中文 + 英文） | [点击我][wasm-hf-streaming-asr-zh-en-paraformer] | [地址][wasm-ms-streaming-asr-zh-en-paraformer] |\n| 使用 [Paraformer-large][Paraformer-large] 的实时语音识别（中文 + 英文 + 粤语） | [点击我][wasm-hf-streaming-asr-zh-en-yue-paraformer] | [地址][wasm-ms-streaming-asr-zh-en-yue-paraformer] |\n| 实时语音识别（英文） | [点击我][wasm-hf-streaming-asr-en-zipformer] | [地址][wasm-ms-streaming-asr-en-zipformer] |\n| VAD + 
语音识别（中文）与 [Zipformer CTC](https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fpretrained_models\u002Foffline-ctc\u002Ficefall\u002Fzipformer.html#sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03-chinese) | [点击我][wasm-hf-vad-asr-zh-zipformer-ctc-07-03] | [地址][wasm-ms-vad-asr-zh-zipformer-ctc-07-03] |\n| VAD + 语音识别（中文 + 英文 + 韩语 + 日语 + 粤语）与 [SenseVoice][SenseVoice] | [点击我][wasm-hf-vad-asr-zh-en-ko-ja-yue-sense-voice] | [地址][wasm-ms-vad-asr-zh-en-ko-ja-yue-sense-voice] |\n| VAD + 语音识别（英文）与 [Whisper][Whisper] tiny.en | [点击我][wasm-hf-vad-asr-en-whisper-tiny-en] | [地址][wasm-ms-vad-asr-en-whisper-tiny-en] |\n| VAD + 语音识别（英文）与 [Moonshine tiny][Moonshine tiny] | [点击我][wasm-hf-vad-asr-en-moonshine-tiny-en] | [地址][wasm-ms-vad-asr-en-moonshine-tiny-en] |\n| VAD + 语音识别（英文）与使用 [GigaSpeech][GigaSpeech] 训练的 Zipformer | [点击我][wasm-hf-vad-asr-en-zipformer-gigaspeech] | [地址][wasm-ms-vad-asr-en-zipformer-gigaspeech] |\n| VAD + 语音识别（中文）与使用 [WenetSpeech][WenetSpeech] 训练的 Zipformer | [点击我][wasm-hf-vad-asr-zh-zipformer-wenetspeech] | [地址][wasm-ms-vad-asr-zh-zipformer-wenetspeech] |\n| VAD + 语音识别（日语）与使用 [ReazonSpeech][ReazonSpeech] 训练的 Zipformer | [点击我][wasm-hf-vad-asr-ja-zipformer-reazonspeech] | [地址][wasm-ms-vad-asr-ja-zipformer-reazonspeech] |\n| VAD + 语音识别（泰语）与使用 [GigaSpeech2][GigaSpeech2] 训练的 Zipformer | [点击我][wasm-hf-vad-asr-th-zipformer-gigaspeech2] | [地址][wasm-ms-vad-asr-th-zipformer-gigaspeech2] |\n| VAD + 语音识别（多种中文方言）与 [TeleSpeech-ASR][TeleSpeech-ASR] CTC 模型 | [点击我][wasm-hf-vad-asr-zh-telespeech] | [地址][wasm-ms-vad-asr-zh-telespeech] |\n| VAD + 语音识别（英文 + 中文，及多种中文方言）与 Paraformer-large | [点击我][wasm-hf-vad-asr-zh-en-paraformer-large] | [地址][wasm-ms-vad-asr-zh-en-paraformer-large] |\n| VAD + 语音识别（英文 + 中文，及多种中文方言）与 Paraformer-small | [点击我][wasm-hf-vad-asr-zh-en-paraformer-small] | [地址][wasm-ms-vad-asr-zh-en-paraformer-small] |\n| VAD + 语音识别（多语种及多种中文方言）与 [Dolphin][Dolphin]-base | [点击我][wasm-hf-vad-asr-multi-lang-dolphin-base] | [地址][wasm-ms-vad-asr-multi-lang-dolphin-base] |\n| 语音合成（Piper，英文） | [点击我][wasm-hf-tts-piper-en] | [地址][wasm-ms-tts-piper-en] |\n| 语音合成（Piper，德语） | [点击我][wasm-hf-tts-piper-de] | [地址][wasm-ms-tts-piper-de] |\n| 语音合成（Matcha，中文） | [点击我][wasm-hf-tts-matcha-zh] | [地址][wasm-ms-tts-matcha-zh] |\n| 语音合成（Matcha，英文） | [点击我][wasm-hf-tts-matcha-en] | [地址][wasm-ms-tts-matcha-en] |\n| 语音合成（Matcha，中英双语） | [点击我][wasm-hf-tts-matcha-zh-en] | [地址][wasm-ms-tts-matcha-zh-en] |\n| 说话人日志 | [点击我][wasm-hf-speaker-diarization] | [地址][wasm-ms-speaker-diarization] |\n| 使用 ZipVoice（中文 + 英文）进行声音克隆 | [点击我][wasm-hf-voice-cloning-zipvoice] | [地址][wasm-ms-voice-cloning-zipvoice] |\n| 使用 Pocket TTS（英文）进行声音克隆 | [点击我][wasm-hf-voice-cloning-pocket] | [地址][wasm-ms-voice-cloning-pocket] |\n\n\u003C\u002Fdetails>\n\n### 预编译的 Android APK 下载链接\n\n\u003Cdetails>\n\n\u003Csummary>您可以在下表中找到此仓库的预编译 Android APK\u003C\u002Fsummary>\n\n| 描述 | URL | 中国用户 |\n|------|-----|----------|\n| 说话人日志 | 
[地址][apk-speaker-diarization] | [点击此处][apk-speaker-diarization-cn] |\n| 流式语音识别 | [地址][apk-streaming-asr] | [点击此处][apk-streaming-asr-cn] |\n| 模拟流式语音识别 | [地址][apk-simula-streaming-asr] | [点击此处][apk-simula-streaming-asr-cn] |\n| 文本转语音 | [地址][apk-tts] | [点击此处][apk-tts-cn] |\n| 语音活动检测 (VAD) | [地址][apk-vad] | [点击此处][apk-vad-cn] |\n| VAD + 非流式语音识别 | [地址][apk-vad-asr] | [点击此处][apk-vad-asr-cn] |\n| 两步法语音识别 | [地址][apk-2pass] | [点击此处][apk-2pass-cn] |\n| 音频标签 | [地址][apk-at] | [点击此处][apk-at-cn] |\n| 音频标签（WearOS） | [地址][apk-at-wearos] | [点击此处][apk-at-wearos-cn] |\n| 说话人辨识 | [地址][apk-sid] | [点击此处][apk-sid-cn] |\n| 口语语言辨识 | [地址][apk-slid] | [点击此处][apk-slid-cn] |\n| 关键词检测 | [地址][apk-kws] | [点击此处][apk-kws-cn] |\n\n\u003C\u002Fdetails>\n\n### 预编译的 Flutter APP 下载链接\n\n\u003Cdetails>\n\n#### 实时语音识别\n\n| 描述 | URL | 中国用户 |\n|------|-----|----------|\n| 流式语音识别 | [地址][apk-flutter-streaming-asr] | [点击此处][apk-flutter-streaming-asr-cn] |\n\n#### 文本转语音\n\n| 描述 | URL | 中国用户 |\n|------|-----|----------|\n| 安卓（arm64-v8a、armeabi-v7a、x86_64） | [地址][flutter-tts-android] | [点击此处][flutter-tts-android-cn] |\n| Linux（x64） | [地址][flutter-tts-linux] | [点击此处][flutter-tts-linux-cn] |\n| macOS（x64） | [地址][flutter-tts-macos-x64] | [点击此处][flutter-tts-macos-x64-cn] |\n| macOS（arm64） | [地址][flutter-tts-macos-arm64] | [点击此处][flutter-tts-macos-arm64-cn] |\n| Windows（x64） | [地址][flutter-tts-win-x64] | [点击此处][flutter-tts-win-x64-cn] |\n\n> 注：iOS 需要从源码构建。\n\n\u003C\u002Fdetails>\n\n### 预编译的 Lazarus APP 下载链接\n\n\u003Cdetails>\n\n#### 生成字幕\n\n| 描述 | URL | 中国用户 |\n|------|-----|----------|\n| 生成字幕 | [地址][lazarus-subtitle] | [点击此处][lazarus-subtitle-cn] |\n\n\u003C\u002Fdetails>\n\n### 预训练模型下载链接\n\n\u003Cdetails>\n\n| 描述 | URL |\n|------|-----|\n| 语音识别（语音转文本，ASR） | [地址][asr-models] |\n| 文本转语音（TTS） | [地址][tts-models] |\n| VAD | [地址][vad-models] |\n| 关键词检测 | [地址][kws-models] |\n| 音频标签 | [地址][at-models] 
|\n| 说话人辨识（Speaker ID） | [地址][sid-models] |\n| 口语语言辨识（Language ID） | 参见 [ASR 模型][asr-models] 中的多语言 [Whisper][Whisper] 模型 |\n| 标点符号 | [地址][punct-models] |\n| 说话人分割 | [地址][speaker-segmentation-models] |\n| 语音增强 | [地址][speech-enhancement-models] |\n| 声源分离 | [地址][source-separation-models] |\n\n\u003C\u002Fdetails>\n\n#### 部分预训练 ASR 模型（流式）\n\n\u003Cdetails>\n\n请参阅：\n\n  - \u003Chttps:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fpretrained_models\u002Fonline-transducer\u002Findex.html>\n  - \u003Chttps:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fpretrained_models\u002Fonline-paraformer\u002Findex.html>\n  - \u003Chttps:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fpretrained_models\u002Fonline-ctc\u002Findex.html>\n\n以获取更多模型。下表仅列出其中的 **部分**。\n\n|名称 | 支持的语言| 描述|\n|-----|-----|----|\n|[sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20][sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20]| 中文、英语| 参见 [此处](https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fpretrained_models\u002Fonline-transducer\u002Fzipformer-transducer-models.html#csukuangfj-sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20-bilingual-chinese-english)|\n|[sherpa-onnx-streaming-zipformer-small-bilingual-zh-en-2023-02-16][sherpa-onnx-streaming-zipformer-small-bilingual-zh-en-2023-02-16]| 中文、英语| 参见 [此处](https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fpretrained_models\u002Fonline-transducer\u002Fzipformer-transducer-models.html#sherpa-onnx-streaming-zipformer-small-bilingual-zh-en-2023-02-16-bilingual-chinese-english)|\n|[sherpa-onnx-streaming-zipformer-zh-14M-2023-02-23][sherpa-onnx-streaming-zipformer-zh-14M-2023-02-23]|中文|适用于 Cortex A7 CPU。参见 [此处](https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fpretrained_models\u002Fonline-transducer\u002Fzipformer-transducer-models.html#sherpa-onnx-streaming-zipformer-zh-14m-2023-02-23)|\n|[sherpa-onnx-streaming-zipformer-en-20M-2023-02-17][sherpa-onnx-streaming-zipformer-en-20M-2023-02-17]|英语|适用于 Cortex A7 CPU。参见 [此处](https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fpretrained_models\u002Fonline-transducer\u002Fzipformer-transducer-models.html#sherpa-onnx-streaming-zipformer-en-20m-2023-02-17)|\n|[sherpa-onnx-streaming-zipformer-korean-2024-06-16][sherpa-onnx-streaming-zipformer-korean-2024-06-16]|韩语| 参见 [此处](https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fpretrained_models\u002Fonline-transducer\u002Fzipformer-transducer-models.html#sherpa-onnx-streaming-zipformer-korean-2024-06-16-korean)|\n|[sherpa-onnx-streaming-zipformer-fr-2023-04-14][sherpa-onnx-streaming-zipformer-fr-2023-04-14]|法语| 参见 [此处](https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fpretrained_models\u002Fonline-transducer\u002Fzipformer-transducer-models.html#shaojieli-sherpa-onnx-streaming-zipformer-fr-2023-04-14-french)|\n\n\u003C\u002Fdetails>\n\n\n#### 部分预训练 ASR 模型（非流式）\n\n\u003Cdetails>\n\n请参阅：\n\n  - \u003Chttps:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fpretrained_models\u002Foffline-transducer\u002Findex.html>\n  - 
\u003Chttps:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fpretrained_models\u002Foffline-paraformer\u002Findex.html>\n  - \u003Chttps:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fpretrained_models\u002Foffline-ctc\u002Findex.html>\n  - \u003Chttps:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fpretrained_models\u002Ftelespeech\u002Findex.html>\n  - \u003Chttps:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fpretrained_models\u002Fwhisper\u002Findex.html>\n\n以获取更多模型。下表仅列出其中的 **部分**。\n\n|名称 | 支持的语言| 描述|\n|-----|-----|----|\n|[sherpa-onnx-nemo-parakeet-tdt-0.6b-v2-int8](https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fpretrained_models\u002Foffline-transducer\u002Fnemo-transducer-models.html#sherpa-onnx-nemo-parakeet-tdt-0-6b-v2-int8-english)| 英语 | 由 \u003Chttps:\u002F\u002Fhuggingface.co\u002Fnvidia\u002Fparakeet-tdt-0.6b-v2> 转换而来|\n|[Whisper tiny.en](https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Fdownload\u002Fasr-models\u002Fsherpa-onnx-whisper-tiny.en.tar.bz2)|英语| 参见 [此处](https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fpretrained_models\u002Fwhisper\u002Ftiny.en.html)|\n|[Moonshine tiny][Moonshine tiny]|英语|参见 [此处](https:\u002F\u002Fgithub.com\u002Fusefulsensors\u002Fmoonshine)|\n|[sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03](https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fpretrained_models\u002Foffline-ctc\u002Ficefall\u002Fzipformer.html#sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03-chinese)|中文| 一个 Zipformer CTC 模型|\n|[sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17][sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17]|中文、粤语、英语、韩语、日语| 支持多种中文方言。参见 [此处](https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fsense-voice\u002Findex.html)|\n|[sherpa-onnx-paraformer-zh-2024-03-09][sherpa-onnx-paraformer-zh-2024-03-09]|中文、英语| 同样支持多种中文方言。参见 [此处](https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fpretrained_models\u002Foffline-paraformer\u002Fparaformer-models.html#csukuangfj-sherpa-onnx-paraformer-zh-2024-03-09-chinese-english)|\n|[sherpa-onnx-zipformer-ja-reazonspeech-2024-08-01][sherpa-onnx-zipformer-ja-reazonspeech-2024-08-01]|日语|参见 [此处](https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fpretrained_models\u002Foffline-transducer\u002Fzipformer-transducer-models.html#sherpa-onnx-zipformer-ja-reazonspeech-2024-08-01-japanese)|\n|[sherpa-onnx-nemo-transducer-giga-am-russian-2024-10-24][sherpa-onnx-nemo-transducer-giga-am-russian-2024-10-24]|俄语|参见 [此处](https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fpretrained_models\u002Foffline-transducer\u002Fnemo-transducer-models.html#sherpa-onnx-nemo-transducer-giga-am-russian-2024-10-24-russian)|\n|[sherpa-onnx-nemo-ctc-giga-am-russian-2024-10-24][sherpa-onnx-nemo-ctc-giga-am-russian-2024-10-24]|俄语|参见 [此处](https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fpretrained_models\u002Foffline-ctc\u002Fnemo\u002Frussian.html#sherpa-onnx-nemo-ctc-giga-am-russian-2024-10-24)|\n|[sherpa-onnx-zipformer-ru-2024-09-18][sherpa-onnx-zipformer-ru-2024-09-18]|俄语|参见 [此处](https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fpretrained_models\u002Foffline-transducer\u002Fzipformer-transducer-models.html#sherpa-onnx-zipformer-ru-2024-09-18-russian)|\n|[sherpa-onnx-zipformer-korean-2024-06-24][sherpa-onnx-zipformer-korean-2024-06-24]|韩语|参见 
[此处](https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fpretrained_models\u002Foffline-transducer\u002Fzipformer-transducer-models.html#sherpa-onnx-zipformer-korean-2024-06-24-korean)|\n|[sherpa-onnx-zipformer-thai-2024-06-20][sherpa-onnx-zipformer-thai-2024-06-20]|泰语|参见 [此处](https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fpretrained_models\u002Foffline-transducer\u002Fzipformer-transducer-models.html#sherpa-onnx-zipformer-thai-2024-06-20-thai)|\n|[sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04][sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04]|中文|支持多种方言。参见 [此处](https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fpretrained_models\u002Ftelespeech\u002Fmodels.html#sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04)|\n\n\u003C\u002Fdetails>\n\n\n\n### 有用链接\n\n- 文档：https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002F\n- Bilibili 演示视频：https:\u002F\u002Fsearch.bilibili.com\u002Fall?keyword=%E6%96%B0%E4%B8%80%E4%BB%A3Kaldi\n\n### 如何联系我们\n\n请访问\nhttps:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fsocial-groups.html\n以加入新一代 Kaldi **微信交流群** 和 **QQ 交流群**。\n\n## 使用 sherpa-onnx 的项目\n\n### [Speed of Sound](https:\u002F\u002Fgithub.com\u002Fzugaldia\u002Fspeedofsound)\n\n> 一款用于 Linux 桌面（GTK4\u002FAdwaita）的语音输入应用。\n> 它捕获麦克风音频，使用 Sherpa ONNX ASR 模型进行离线转录，\n> 可选地通过 LLM 对文本进行润色，并通过 XDG 远程桌面门户的键盘模拟功能将结果输入到当前活动窗口中。\n\n### [VoxSherpa TTS](https:\u002F\u002Fgithub.com\u002FCodeBySonu95\u002FVoxSherpa-TTS)\n\n> VoxSherpa TTS 是一款 100% 离线的 Android 文本转语音应用，由 Sherpa-ONNX 提供支持。\n> 它支持 Kokoro-82M、Piper 和 VITS 引擎，并提供多语言支持，包括印地语、英语、英式英语、日语、中文以及 50 多种其他语言。\n\n- [下载 APK v1.0-beta 版](https:\u002F\u002Fhuggingface.co\u002FCodeBySonu95\u002FSherpa-onnx-models\u002Fresolve\u002Fmain\u002FVoxSherpa-TTS_test.apk)\n- 需 Android 11 或更高版本 · 100% 离线 · 无遥测数据\n\n\u003Cdiv align=\"center\">\n\n| 生成 | 模型 | 库 | 设置 |\n|:---:|:---:|:---:|:---:|\n| \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fk2-fsa_sherpa-onnx_readme_e7e7f7bdc274.jpg\" width=\"180\"\u002F> | \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fk2-fsa_sherpa-onnx_readme_8283cb377ccf.jpg\" width=\"180\"\u002F> | \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fk2-fsa_sherpa-onnx_readme_423e1defa720.jpg\" width=\"180\"\u002F> | \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fk2-fsa_sherpa-onnx_readme_5271cacf4ed2.jpg\" width=\"180\"\u002F> |\n\n\u003C\u002Fdiv>\n\n---\n### [BreezeApp](https:\u002F\u002Fgithub.com\u002Fmtkresearch\u002FBreezeApp) 来自 [MediaTek Research](https:\u002F\u002Fgithub.com\u002Fmtkresearch)\n\n> BreezeApp 是一款为 Android 和 iOS 平台开发的移动 AI 应用程序。\n> 用户可以直接从 App Store 下载，并在离线状态下享受多种功能，\n> 包括语音转文本、文本转语音、基于文本的聊天机器人交互以及图像问答。\n\n  - [下载 BreezeApp 的 APK](https:\u002F\u002Fhuggingface.co\u002FMediaTek-Research\u002FBreezeApp\u002Fresolve\u002Fmain\u002FBreezeApp.apk)\n  - [中国镜像 APK](https:\u002F\u002Fhf-mirror.com\u002FMediaTek-Research\u002FBreezeApp\u002Fblob\u002Fmain\u002FBreezeApp.apk)\n\n| 1 | 2 | 3 |\n|---|---|---|\n|![](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fk2-fsa_sherpa-onnx_readme_ede1edb0ad13.png)|![](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fk2-fsa_sherpa-onnx_readme_db9ed11ff886.png)|![](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fk2-fsa_sherpa-onnx_readme_c90af9b707a7.png)|\n\n### [Open-LLM-VTuber](https:\u002F\u002Fgithub.com\u002Ft41372\u002FOpen-LLM-VTuber)\n\n支持免提语音交互、语音打断以及 Live2D 虚拟形象，可在本地跨平台与任意 LLM 进行语音对话。\n\n更多信息请参见 
\u003Chttps:\u002F\u002Fgithub.com\u002Ft41372\u002FOpen-LLM-VTuber\u002Fpull\u002F50>\n\n### [voiceapi](https:\u002F\u002Fgithub.com\u002Fruzhila\u002Fvoiceapi)\n\n\u003Cdetails>\n  \u003Csummary>基于 FastAPI 的流式 ASR 和 TTS\u003C\u002Fsummary>\n\n\n展示了如何使用 FastAPI 结合 ASR 和 TTS 的 Python API。\n\u003C\u002Fdetails>\n\n### [腾讯会议摸鱼工具 TMSpeech](https:\u002F\u002Fgithub.com\u002Fjxlpzqc\u002FTMSpeech)\n\n采用 C# 实现流式 ASR，并配有图形用户界面。\n\n中文视频演示：[【开源】Windows实时字幕软件（网课\u002F开会必备）](https:\u002F\u002Fwww.bilibili.com\u002Fvideo\u002FBV1rX4y1p7Nx)\n\n### [lol互动助手](https:\u002F\u002Fgithub.com\u002Fl1veIn\u002Flol-wom-electron)\n\n该应用使用 sherpa-onnx 的 JavaScript API，并结合 [Electron](https:\u002F\u002Felectronjs.org\u002F)。\n\n中文视频演示：[爆了！炫神教你开打字挂！真正影响胜率的英雄联盟工具！英雄联盟的最后一块拼图！和游戏中的每个人无障碍沟通！](https:\u002F\u002Fwww.bilibili.com\u002Fvideo\u002FBV142tje9E74)\n\n### [Sherpa-ONNX 语音识别服务器](https:\u002F\u002Fgithub.com\u002Fhfyydd\u002Fsherpa-onnx-server)\n\n基于 Node.js 的服务器，提供用于语音识别的 RESTful API。\n\n### [QSmartAssistant](https:\u002F\u002Fgithub.com\u002Fxinhecuican\u002FQSmartAssistant)\n\n一个模块化、全程可离线、低资源占用的对话机器人\u002F智能音箱。\n\n它使用 Qt 框架。其中既包含 [ASR](https:\u002F\u002Fgithub.com\u002Fxinhecuican\u002FQSmartAssistant\u002Fblob\u002Fmaster\u002Fdoc\u002F%E5%AE%89%E8%A3%85.md#asr)，\n也包含 [TTS](https:\u002F\u002Fgithub.com\u002Fxinhecuican\u002FQSmartAssistant\u002Fblob\u002Fmaster\u002Fdoc\u002F%E5%AE%89%E8%A3%85.md#tts)。\n\n### [Flutter-EasySpeechRecognition](https:\u002F\u002Fgithub.com\u002FJason-chen-coder\u002FFlutter-EasySpeechRecognition)\n\n它扩展了 [.\u002Fflutter-examples\u002Fstreaming_asr](.\u002Fflutter-examples\u002Fstreaming_asr)，通过在应用内下载模型来减小应用体积。\n\n注：[[Team B] Sherpa AI 后端](https:\u002F\u002Fgithub.com\u002Fumgc\u002Fspring2025\u002Fpull\u002F82) 也在 Flutter 应用中使用了 sherpa-onnx。\n\n### [sherpa-onnx-unity](https:\u002F\u002Fgithub.com\u002Fxue-fei\u002Fsherpa-onnx-unity)\n\n在 Unity 中使用 sherpa-onnx。更多信息请参见 [#1695](https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fissues\u002F1695)、[#1892](https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fissues\u002F1892) 和 [#1859](https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fissues\u002F1859)。\n\n### [xiaozhi-esp32-server](https:\u002F\u002Fgithub.com\u002Fxinnan-tech\u002Fxiaozhi-esp32-server)\n\n该项目为 xiaozhi-esp32 提供后端服务，帮助您快速搭建 ESP32 设备控制服务器。\n\n更多信息请参见：\n\n  - [ASR 新增轻量级 sherpa-onnx-asr](https:\u002F\u002Fgithub.com\u002Fxinnan-tech\u002Fxiaozhi-esp32-server\u002Fissues\u002F315)\n  - [特性：ASR 增加 sherpa-onnx 模型](https:\u002F\u002Fgithub.com\u002Fxinnan-tech\u002Fxiaozhi-esp32-server\u002Fpull\u002F379)\n\n### [KaithemAutomation](https:\u002F\u002Fgithub.com\u002FEternityForest\u002FKaithemAutomation)\n\n纯 Python 编写、专注于 GUI 的家庭自动化\u002F消费级 SCADA 系统。\n\n它使用 sherpa-onnx 的 TTS 功能。更多信息请参见 [✨ 使用全新全局配置的 TTS 模型发出语音指令。](https:\u002F\u002Fgithub.com\u002FEternityForest\u002FKaithemAutomation\u002Fcommit\u002F8e64d2b138725e426532f7d66bb69dd0b4f53693)\n\n### [Open-XiaoAI KWS](https:\u002F\u002Fgithub.com\u002Fidootop\u002Fopen-xiaoai-kws)\n\n为小爱音箱启用自定义唤醒词。\n\n中文视频演示：[小爱同学启动～˶╹ꇴ╹˶！](https:\u002F\u002Fwww.bilibili.com\u002Fvideo\u002FBV1YfVUz5EMj)\n\n### [C++ WebSocket ASR 服务器](https:\u002F\u002Fgithub.com\u002Fmawwalker\u002Fstt-server)\n\n它基于 C++ 构建了一个 WebSocket 服务器，使用 sherpa-onnx 进行语音识别。\n\n### [Go WebSocket 服务器](https:\u002F\u002Fgithub.com\u002Fbbeyondllove\u002Fasr_server)\n\n它基于 Go 语言构建了一个 WebSocket 服务器，专用于 sherpa-onnx。\n\n### [制作机器人派蒙，第 10 集“AI 部分 1”](https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=KxPKkwxGWZs)\n\n这是一段 [YouTube 
视频](https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=KxPKkwxGWZs)，\n展示了作者如何尝试利用 AI 与派蒙进行对话。\n\n它使用 sherpa-onnx 进行语音转文本和文本转语音。\n\n|1|\n|---|\n|![](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fk2-fsa_sherpa-onnx_readme_4c04d35098b4.png)|\n\n### [TtsReader - 桌面应用](https:\u002F\u002Fgithub.com\u002Fys-pro-duction\u002FTtsReader)\n\n一款使用 Kotlin Multiplatform 构建的桌面文本转语音应用程序。\n\n### [MentraOS](https:\u002F\u002Fgithub.com\u002FMentra-Community\u002FMentraOS)\n\n> 智能眼镜操作系统，内置数十款应用。用户可以获得 AI 助手、通知、翻译、屏幕镜像、字幕等功能。开发者只需编写一次应用，即可在任何一副智能眼镜上运行。\n\n它使用 sherpa-onnx 在 iOS 和 Android 设备上进行实时语音识别。\n更多信息请参见 \u003Chttps:\u002F\u002Fgithub.com\u002FMentra-Community\u002FMentraOS\u002Fpull\u002F861>\n\n其 iOS 版本使用 Swift 开发，Android 版本使用 Java 开发。\n\n### [flet_sherpa_onnx](https:\u002F\u002Fgithub.com\u002FSamYuan1990\u002Fflet_sherpa_onnx)\n\n基于 sherpa-onnx 的 Flet ASR\u002FSTT 组件。\n示例：[聊天框代理](https:\u002F\u002Fgithub.com\u002FSamYuan1990\u002Fi18n-agent-action)\n\n### [achatbot-go](https:\u002F\u002Fgithub.com\u002Fai-bot-pro\u002Fachatbot-go)\n\n一款基于 Go 语言的多模态聊天机器人，使用 sherpa-onnx 的语音库 API。\n\n### [fcitx5-vinput](https:\u002F\u002Fgithub.com\u002Fxifan2333\u002Ffcitx5-vinput)\n\n为 [Fcitx5](https:\u002F\u002Fgithub.com\u002Ffcitx\u002Ffcitx5)（Linux 输入法框架）打造的本地离线语音输入插件。\n它使用 C++ 和离线 ASR 实现语音识别，支持按住说话（push-to-talk）、命令模式以及可选的 LLM 后处理。\n\n中文视频演示：[fcitx5-vinput](https:\u002F\u002Fwww.bilibili.com\u002Fvideo\u002FBV1a6cUzVE6F)\n\n### [Wake Word](https:\u002F\u002Fgithub.com\u002Fanalyticsinmotion\u002Fwake-word)\n\n一款支持免提语音操控的 VS Code 扩展。它使用 sherpa-onnx 进行实时关键词检测（KWS），以识别自定义唤醒短语并通过语音触发 VS Code 命令。\n音频采集由 [decibri](https:\u002F\u002Fgithub.com\u002Fanalyticsinmotion\u002Fdecibri) 负责，后者是一个自带预编译原生二进制文件的跨平台 Node.js 麦克风流式采集库。\n\n- [VS Code Marketplace](https:\u002F\u002Fmarketplace.visualstudio.com\u002Fitems?itemName=analytics-in-motion.wake-word)\n- [Open VSX](https:\u002F\u002Fopen-vsx.org\u002Fextension\u002Fanalytics-in-motion\u002Fwake-word)\n- [decibri 与 sherpa-onnx 的集成指南](https:\u002F\u002Fdecibri.dev\u002Fdocs\u002Fnode\u002Fintegrations\u002Fsherpa-onnx-stt.html)\n\n[silero-vad]: https:\u002F\u002Fgithub.com\u002Fsnakers4\u002Fsilero-vad\n[Raspberry Pi]: https:\u002F\u002Fwww.raspberrypi.com\u002F\n[RV1126]: https:\u002F\u002Fwww.rock-chips.com\u002Fuploads\u002Fpdf\u002F2022.8.26\u002F191\u002FRV1126%20Brief%20Datasheet.pdf\n[LicheePi4A]: https:\u002F\u002Fsipeed.com\u002Flicheepi4a\n[VisionFive 2]: https:\u002F\u002Fwww.starfivetech.com\u002Fen\u002Fsite\u002Fboards\n[旭日X3派]: https:\u002F\u002Fdeveloper.horizon.ai\u002Fapi\u002Fv1\u002FfileData\u002Fdocuments_pi\u002Findex.html\n[爱芯派]: https:\u002F\u002Fwiki.sipeed.com\u002Fhardware\u002Fzh\u002FmaixIII\u002Fax-pi\u002Faxpi.html\n[hf-space-speaker-diarization]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fspeaker-diarization\n[hf-space-speaker-diarization-cn]: https:\u002F\u002Fhf.qhduan.com\u002Fspaces\u002Fk2-fsa\u002Fspeaker-diarization\n[hf-space-asr]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fautomatic-speech-recognition\n[hf-space-asr-cn]: https:\u002F\u002Fhf.qhduan.com\u002Fspaces\u002Fk2-fsa\u002Fautomatic-speech-recognition\n[Whisper]: https:\u002F\u002Fgithub.com\u002Fopenai\u002Fwhisper\n[hf-space-asr-whisper]: 
https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fautomatic-speech-recognition-with-whisper\n[hf-space-asr-whisper-cn]: https:\u002F\u002Fhf.qhduan.com\u002Fspaces\u002Fk2-fsa\u002Fautomatic-speech-recognition-with-whisper\n[hf-space-tts]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Ftext-to-speech\n[hf-space-tts-cn]: https:\u002F\u002Fhf.qhduan.com\u002Fspaces\u002Fk2-fsa\u002Ftext-to-speech\n[hf-space-subtitle]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fgenerate-subtitles-for-videos\n[hf-space-subtitle-cn]: https:\u002F\u002Fhf.qhduan.com\u002Fspaces\u002Fk2-fsa\u002Fgenerate-subtitles-for-videos\n[hf-space-audio-tagging]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Faudio-tagging\n[hf-space-audio-tagging-cn]: https:\u002F\u002Fhf.qhduan.com\u002Fspaces\u002Fk2-fsa\u002Faudio-tagging\n[hf-space-source-separation]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fsource-separation\n[hf-space-source-separation-cn]: https:\u002F\u002Fhf.qhduan.com\u002Fspaces\u002Fk2-fsa\u002Fsource-separation\n[hf-space-slid-whisper]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fspoken-language-identification\n[hf-space-slid-whisper-cn]: https:\u002F\u002Fhf.qhduan.com\u002Fspaces\u002Fk2-fsa\u002Fspoken-language-identification\n[wasm-hf-vad]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fweb-assembly-vad-sherpa-onnx\n[wasm-ms-vad]: https:\u002F\u002Fmodelscope.cn\u002Fstudios\u002Fcsukuangfj\u002Fweb-assembly-vad-sherpa-onnx\n[wasm-hf-streaming-asr-zh-en-zipformer]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fweb-assembly-asr-sherpa-onnx-zh-en\n[wasm-ms-streaming-asr-zh-en-zipformer]: https:\u002F\u002Fmodelscope.cn\u002Fstudios\u002Fk2-fsa\u002Fweb-assembly-asr-sherpa-onnx-zh-en\n[wasm-hf-streaming-asr-zh-en-paraformer]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fweb-assembly-asr-sherpa-onnx-zh-en-paraformer\n[wasm-ms-streaming-asr-zh-en-paraformer]: https:\u002F\u002Fmodelscope.cn\u002Fstudios\u002Fk2-fsa\u002Fweb-assembly-asr-sherpa-onnx-zh-en-paraformer\n[Paraformer-large]: https:\u002F\u002Fwww.modelscope.cn\u002Fmodels\u002Fdamo\u002Fspeech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch\u002Fsummary\n[wasm-hf-streaming-asr-zh-en-yue-paraformer]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fweb-assembly-asr-sherpa-onnx-zh-cantonese-en-paraformer\n[wasm-ms-streaming-asr-zh-en-yue-paraformer]: https:\u002F\u002Fmodelscope.cn\u002Fstudios\u002Fk2-fsa\u002Fweb-assembly-asr-sherpa-onnx-zh-cantonese-en-paraformer\n[wasm-hf-streaming-asr-en-zipformer]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fweb-assembly-asr-sherpa-onnx-en\n[wasm-ms-streaming-asr-en-zipformer]: https:\u002F\u002Fmodelscope.cn\u002Fstudios\u002Fk2-fsa\u002Fweb-assembly-asr-sherpa-onnx-en\n[SenseVoice]: https:\u002F\u002Fgithub.com\u002FFunAudioLLM\u002FSenseVoice\n[wasm-hf-vad-asr-zh-zipformer-ctc-07-03]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fweb-assembly-vad-asr-sherpa-onnx-zh-zipformer-ctc\n[wasm-ms-vad-asr-zh-zipformer-ctc-07-03]: https:\u002F\u002Fmodelscope.cn\u002Fstudios\u002Fcsukuangfj\u002Fweb-assembly-vad-asr-sherpa-onnx-zh-zipformer-ctc\u002Fsummary\n[wasm-hf-vad-asr-zh-en-ko-ja-yue-sense-voice]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fweb-assembly-vad-asr-sherpa-onnx-zh-en-ja-ko-cantonese-sense-voice\n[wasm-ms-vad-asr-zh-en-ko-ja-yue-sense-voice]: 
https:\u002F\u002Fwww.modelscope.cn\u002Fstudios\u002Fcsukuangfj\u002Fweb-assembly-vad-asr-sherpa-onnx-zh-en-jp-ko-cantonese-sense-voice\n[wasm-hf-vad-asr-en-whisper-tiny-en]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fweb-assembly-vad-asr-sherpa-onnx-en-whisper-tiny\n[wasm-ms-vad-asr-en-whisper-tiny-en]: https:\u002F\u002Fwww.modelscope.cn\u002Fstudios\u002Fcsukuangfj\u002Fweb-assembly-vad-asr-sherpa-onnx-en-whisper-tiny\n[wasm-hf-vad-asr-en-moonshine-tiny-en]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fweb-assembly-vad-asr-sherpa-onnx-en-moonshine-tiny\n[wasm-ms-vad-asr-en-moonshine-tiny-en]: https:\u002F\u002Fwww.modelscope.cn\u002Fstudios\u002Fcsukuangfj\u002Fweb-assembly-vad-asr-sherpa-onnx-en-moonshine-tiny\n[wasm-hf-vad-asr-en-zipformer-gigaspeech]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fweb-assembly-vad-asr-sherpa-onnx-en-zipformer-gigaspeech\n[wasm-ms-vad-asr-en-zipformer-gigaspeech]: https:\u002F\u002Fwww.modelscope.cn\u002Fstudios\u002Fk2-fsa\u002Fweb-assembly-vad-asr-sherpa-onnx-en-zipformer-gigaspeech\n[wasm-hf-vad-asr-zh-zipformer-wenetspeech]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fweb-assembly-vad-asr-sherpa-onnx-zh-zipformer-wenetspeech\n[wasm-ms-vad-asr-zh-zipformer-wenetspeech]: https:\u002F\u002Fwww.modelscope.cn\u002Fstudios\u002Fk2-fsa\u002Fweb-assembly-vad-asr-sherpa-onnx-zh-zipformer-wenetspeech\n[reazonspeech]: https:\u002F\u002Fresearch.reazon.jp\u002F_static\u002Freazonspeech_nlp2023.pdf\n[wasm-hf-vad-asr-ja-zipformer-reazonspeech]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fweb-assembly-vad-asr-sherpa-onnx-ja-zipformer\n[wasm-ms-vad-asr-ja-zipformer-reazonspeech]: https:\u002F\u002Fwww.modelscope.cn\u002Fstudios\u002Fcsukuangfj\u002Fweb-assembly-vad-asr-sherpa-onnx-ja-zipformer\n[gigaspeech2]: https:\u002F\u002Fgithub.com\u002Fspeechcolab\u002Fgigaspeech2\n[wasm-hf-vad-asr-th-zipformer-gigaspeech2]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fweb-assembly-vad-asr-sherpa-onnx-th-zipformer\n[wasm-ms-vad-asr-th-zipformer-gigaspeech2]: https:\u002F\u002Fwww.modelscope.cn\u002Fstudios\u002Fcsukuangfj\u002Fweb-assembly-vad-asr-sherpa-onnx-th-zipformer\n[telespeech-asr]: https:\u002F\u002Fgithub.com\u002Ftele-ai\u002Ftelespeech-asr\n[wasm-hf-vad-asr-zh-telespeech]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fweb-assembly-vad-asr-sherpa-onnx-zh-telespeech\n[wasm-ms-vad-asr-zh-telespeech]: https:\u002F\u002Fwww.modelscope.cn\u002Fstudios\u002Fk2-fsa\u002Fweb-assembly-vad-asr-sherpa-onnx-zh-telespeech\n[wasm-hf-vad-asr-zh-en-paraformer-large]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fweb-assembly-vad-asr-sherpa-onnx-zh-en-paraformer\n[wasm-ms-vad-asr-zh-en-paraformer-large]: https:\u002F\u002Fwww.modelscope.cn\u002Fstudios\u002Fk2-fsa\u002Fweb-assembly-vad-asr-sherpa-onnx-zh-en-paraformer\n[wasm-hf-vad-asr-zh-en-paraformer-small]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fweb-assembly-vad-asr-sherpa-onnx-zh-en-paraformer-small\n[wasm-ms-vad-asr-zh-en-paraformer-small]: https:\u002F\u002Fwww.modelscope.cn\u002Fstudios\u002Fk2-fsa\u002Fweb-assembly-vad-asr-sherpa-onnx-zh-en-paraformer-small\n[dolphin]: https:\u002F\u002Fgithub.com\u002Fdataoceanai\u002Fdolphin\n[wasm-ms-vad-asr-multi-lang-dolphin-base]: https:\u002F\u002Fmodelscope.cn\u002Fstudios\u002Fcsukuangfj\u002Fweb-assembly-vad-asr-sherpa-onnx-multi-lang-dophin-ctc\n[wasm-hf-vad-asr-multi-lang-dolphin-base]: 
https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fweb-assembly-vad-asr-sherpa-onnx-multi-lang-dophin-ctc\n\n[wasm-hf-tts-matcha-zh-en]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fweb-assembly-zh-en-tts-matcha\n[wasm-hf-tts-matcha-zh]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fweb-assembly-zh-tts-matcha\n[wasm-ms-tts-matcha-zh-en]: https:\u002F\u002Fmodelscope.cn\u002Fstudios\u002Fcsukuangfj\u002Fweb-assembly-zh-en-tts-matcha\n[wasm-ms-tts-matcha-zh]: https:\u002F\u002Fmodelscope.cn\u002Fstudios\u002Fcsukuangfj\u002Fweb-assembly-zh-tts-matcha\n[wasm-hf-tts-matcha-en]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fweb-assembly-en-tts-matcha\n[wasm-ms-tts-matcha-en]: https:\u002F\u002Fmodelscope.cn\u002Fstudios\u002Fcsukuangfj\u002Fweb-assembly-en-tts-matcha\n[wasm-hf-tts-piper-en]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fweb-assembly-tts-sherpa-onnx-en\n[wasm-ms-tts-piper-en]: https:\u002F\u002Fmodelscope.cn\u002Fstudios\u002Fk2-fsa\u002Fweb-assembly-tts-sherpa-onnx-en\n[wasm-hf-tts-piper-de]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fweb-assembly-tts-sherpa-onnx-de\n[wasm-ms-tts-piper-de]: https:\u002F\u002Fmodelscope.cn\u002Fstudios\u002Fk2-fsa\u002Fweb-assembly-tts-sherpa-onnx-de\n[wasm-hf-speaker-diarization]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fweb-assembly-speaker-diarization-sherpa-onnx\n[wasm-ms-speaker-diarization]: https:\u002F\u002Fwww.modelscope.cn\u002Fstudios\u002Fcsukuangfj\u002Fweb-assembly-speaker-diarization-sherpa-onnx\n[wasm-hf-voice-cloning-zipvoice]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fweb-assembly-zh-en-tts-zipvoice\n[wasm-ms-voice-cloning-zipvoice]: https:\u002F\u002Fmodelscope.cn\u002Fstudios\u002Fcsukuangfj\u002Fweb-assembly-zh-en-tts-zipvoice\n[wasm-hf-voice-cloning-pocket]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fweb-assembly-en-tts-pocket\n[wasm-ms-voice-cloning-pocket]: https:\u002F\u002Fmodelscope.cn\u002Fstudios\u002Fcsukuangfj\u002Fweb-assembly-en-tts-pocket\n[apk-speaker-diarization]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fspeaker-diarization\u002Fapk.html\n[apk-speaker-diarization-cn]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fspeaker-diarization\u002Fapk-cn.html\n[apk-streaming-asr]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fandroid\u002Fapk.html\n[apk-streaming-asr-cn]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fandroid\u002Fapk-cn.html\n[apk-simula-streaming-asr]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fandroid\u002Fapk-simulate-streaming-asr.html\n[apk-simula-streaming-asr-cn]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fandroid\u002Fapk-simulate-streaming-asr-cn.html\n[apk-tts]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Ftts\u002Fapk-engine.html\n[apk-tts-cn]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Ftts\u002Fapk-engine-cn.html\n[apk-vad]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fvad\u002Fapk.html\n[apk-vad-cn]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fvad\u002Fapk-cn.html\n[apk-vad-asr]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fvad\u002Fapk-asr.html\n[apk-vad-asr-cn]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fvad\u002Fapk-asr-cn.html\n[apk-2pass]: 
https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fandroid\u002Fapk-2pass.html\n[apk-2pass-cn]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fandroid\u002Fapk-2pass-cn.html\n[apk-at]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Faudio-tagging\u002Fapk.html\n[apk-at-cn]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Faudio-tagging\u002Fapk-cn.html\n[apk-at-wearos]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Faudio-tagging\u002Fapk-wearos.html\n[apk-at-wearos-cn]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Faudio-tagging\u002Fapk-wearos-cn.html\n[apk-sid]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fspeaker-identification\u002Fapk.html\n[apk-sid-cn]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fspeaker-identification\u002Fapk-cn.html\n[apk-slid]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fspoken-language-identification\u002Fapk.html\n[apk-slid-cn]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fspoken-language-identification\u002Fapk-cn.html\n[apk-kws]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fkws\u002Fapk.html\n[apk-kws-cn]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fkws\u002Fapk-cn.html\n[apk-flutter-streaming-asr]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fflutter\u002Fpre-built-app.html#streaming-speech-recognition-stt-asr\n[apk-flutter-streaming-asr-cn]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fflutter\u002Fpre-built-app.html#streaming-speech-recognition-stt-asr\n[flutter-tts-android]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fflutter\u002Ftts-android.html\n[flutter-tts-android-cn]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fflutter\u002Ftts-android-cn.html\n[flutter-tts-linux]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fflutter\u002Ftts-linux.html\n[flutter-tts-linux-cn]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fflutter\u002Ftts-linux-cn.html\n[flutter-tts-macos-x64]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fflutter\u002Ftts-macos-x64.html\n[flutter-tts-macos-arm64-cn]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fflutter\u002Ftts-macos-arm64-cn.html\n[flutter-tts-macos-arm64]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fflutter\u002Ftts-macos-arm64.html\n[flutter-tts-macos-x64-cn]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fflutter\u002Ftts-macos-x64-cn.html\n[flutter-tts-win-x64]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fflutter\u002Ftts-win.html\n[flutter-tts-win-x64-cn]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fflutter\u002Ftts-win-cn.html\n[lazarus-subtitle]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Flazarus\u002Fdownload-generated-subtitles.html\n[lazarus-subtitle-cn]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Flazarus\u002Fdownload-generated-subtitles-cn.html\n[asr-models]: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Ftag\u002Fasr-models\n[tts-models]: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Ftag\u002Ftts-models\n[vad-models]: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Fdownload\u002Fasr-models\u002Fsilero_vad.onnx\n[kws-models]: 
https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Ftag\u002Fkws-models\n[at-models]: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Ftag\u002Faudio-tagging-models\n[sid-models]: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Ftag\u002Fspeaker-recongition-models\n[slid-models]: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Ftag\u002Fspeaker-recongition-models\n[punct-models]: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Ftag\u002Fpunctuation-models\n[speaker-segmentation-models]: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Ftag\u002Fspeaker-segmentation-models\n[GigaSpeech]: https:\u002F\u002Fgithub.com\u002FSpeechColab\u002FGigaSpeech\n[WenetSpeech]: https:\u002F\u002Fgithub.com\u002Fwenet-e2e\u002FWenetSpeech\n[sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20]: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Fdownload\u002Fasr-models\u002Fsherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\n[sherpa-onnx-streaming-zipformer-small-bilingual-zh-en-2023-02-16]: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Fdownload\u002Fasr-models\u002Fsherpa-onnx-streaming-zipformer-small-bilingual-zh-en-2023-02-16.tar.bz2\n[sherpa-onnx-streaming-zipformer-korean-2024-06-16]: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Fdownload\u002Fasr-models\u002Fsherpa-onnx-streaming-zipformer-korean-2024-06-16.tar.bz2\n[sherpa-onnx-streaming-zipformer-zh-14M-2023-02-23]: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Fdownload\u002Fasr-models\u002Fsherpa-onnx-streaming-zipformer-zh-14M-2023-02-23.tar.bz2\n[sherpa-onnx-streaming-zipformer-en-20M-2023-02-17]: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Fdownload\u002Fasr-models\u002Fsherpa-onnx-streaming-zipformer-en-20M-2023-02-17.tar.bz2\n[sherpa-onnx-zipformer-ja-reazonspeech-2024-08-01]: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Fdownload\u002Fasr-models\u002Fsherpa-onnx-zipformer-ja-reazonspeech-2024-08-01.tar.bz2\n[sherpa-onnx-zipformer-ru-2024-09-18]: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Fdownload\u002Fasr-models\u002Fsherpa-onnx-zipformer-ru-2024-09-18.tar.bz2\n[sherpa-onnx-zipformer-korean-2024-06-24]: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Fdownload\u002Fasr-models\u002Fsherpa-onnx-zipformer-korean-2024-06-24.tar.bz2\n[sherpa-onnx-zipformer-thai-2024-06-20]: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Fdownload\u002Fasr-models\u002Fsherpa-onnx-zipformer-thai-2024-06-20.tar.bz2\n[sherpa-onnx-nemo-transducer-giga-am-russian-2024-10-24]: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Fdownload\u002Fasr-models\u002Fsherpa-onnx-nemo-transducer-giga-am-russian-2024-10-24.tar.bz2\n[sherpa-onnx-paraformer-zh-2024-03-09]: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Fdownload\u002Fasr-models\u002Fsherpa-onnx-paraformer-zh-2024-03-09.tar.bz2\n[sherpa-onnx-nemo-ctc-giga-am-russian-2024-10-24]: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Fdownload\u002Fasr-models\u002Fsherpa-onnx-nemo-ctc-giga-am-russian-2024-10-24.tar.bz2\n[sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04]: 
https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Fdownload\u002Fasr-models\u002Fsherpa-onnx-telespeech-ctc-int8-zh-2024-06-04.tar.bz2\n[sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17]: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Fdownload\u002Fasr-models\u002Fsherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\n[sherpa-onnx-streaming-zipformer-fr-2023-04-14]: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Fdownload\u002Fasr-models\u002Fsherpa-onnx-streaming-zipformer-fr-2023-04-14.tar.bz2\n[Moonshine tiny]: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Fdownload\u002Fasr-models\u002Fsherpa-onnx-moonshine-tiny-en-int8.tar.bz2\n[NVIDIA Jetson Orin NX]: https:\u002F\u002Fdeveloper.download.nvidia.com\u002Fassets\u002Fembedded\u002Fsecure\u002Fjetson\u002Forin_nx\u002Fdocs\u002FJetson_Orin_NX_DS-10712-001_v0.5.pdf?RCPGu9Q6OVAOv7a7vgtwc9-BLScXRIWq6cSLuditMALECJ_dOj27DgnqAPGVnT2VpiNpQan9SyFy-9zRykR58CokzbXwjSA7Gj819e91AXPrWkGZR3oS1VLxiDEpJa_Y0lr7UT-N4GnXtb8NlUkP4GkCkkF_FQivGPrAucCUywL481GH_WpP_p7ziHU1Wg==&t=eyJscyI6ImdzZW8iLCJsc2QiOiJodHRwczovL3d3dy5nb29nbGUuY29tLmhrLyJ9\n[NVIDIA Jetson Nano B01]: https:\u002F\u002Fwww.seeedstudio.com\u002Fblog\u002F2020\u002F01\u002F16\u002Fnew-revision-of-jetson-nano-dev-kit-now-supports-new-jetson-nano-module\u002F\n[speech-enhancement-models]: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Ftag\u002Fspeech-enhancement-models\n[source-separation-models]: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Ftag\u002Fsource-separation-models\n[RK3588]: https:\u002F\u002Fwww.rock-chips.com\u002Fuploads\u002Fpdf\u002F2022.8.26\u002F192\u002FRK3588%20Brief%20Datasheet.pdf\n[spleeter]: https:\u002F\u002Fgithub.com\u002Fdeezer\u002Fspleeter\n[UVR]: https:\u002F\u002Fgithub.com\u002FAnjok07\u002Fultimatevocalremovergui\n[gtcrn]: https:\u002F\u002Fgithub.com\u002FXiaobin-Rong\u002Fgtcrn\n[tts-url]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Ftts\u002Fall-in-one.html\n[ss-url]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fsource-separation\u002Findex.html\n[sd-url]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fspeaker-diarization\u002Findex.html\n[slid-url]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fspoken-language-identification\u002Findex.html\n[at-url]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Faudio-tagging\u002Findex.html\n[vad-url]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fvad\u002Findex.html\n[kws-url]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fkws\u002Findex.html\n[punct-url]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fpunctuation\u002Findex.html\n[se-url]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fspeech-enhancement\u002Findex.html\n[rknpu-doc]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Frknn\u002Findex.html\n[qnn-doc]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fqnn\u002Findex.html\n[ascend-doc]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fascend\u002Findex.html\n[axera-npu]: https:\u002F\u002Faxera-tech.com\u002FSkill\u002F166.html\n[SpacemiT-K1]: https:\u002F\u002Fcdn-resource.spacemit.com\u002Ffile\u002Fchip\u002FK1\u002FK1_brief_zh.pdf\n[SpacemiT-K3]: https:\u002F\u002Fcdn-resource.spacemit.com\u002Ffile\u002Fchip\u002FK3\u002FK3_brief_zh.pdf","# 
Sherpa-onnx 快速上手指南\n\nSherpa-onnx 是一个专注于本地运行的开源语音处理工具库，支持语音识别（ASR）、语音合成（TTS）、说话人日志、声纹识别等多种功能。它基于 ONNX Runtime，无需联网即可在多种架构（x86、ARM、RISC-V）和平台（Linux、Windows、macOS、Android、iOS、HarmonyOS）上高效运行，并支持国产 NPU（如瑞芯微 RKNN、华为昇腾 Ascend）。\n\n## 环境准备\n\n### 系统要求\nSherpa-onnx 支持广泛的操作系统和硬件架构：\n- **操作系统**: Linux（包括 openKylin）、macOS、Windows、Android、iOS、HarmonyOS。\n- **硬件架构**: x86、x86_64、ARM32、ARM64（aarch64）、RISC-V（riscv64）。\n- **NPU 加速**: 支持 Rockchip（RKNN）、Qualcomm（QNN）、Ascend（昇腾）、Axera 等 NPU。\n- **特殊设备**: 树莓派（Raspberry Pi）、NVIDIA Jetson 系列，以及各类国产开发板（如 RV1126、RK3588、旭日 X3 派等）。\n\n### 前置依赖\n最快捷的上手方式是使用 **Python** 接口。请确保您的环境中已安装：\n- Python 3.8 或更高版本\n- pip 包管理工具\n\n> **提示**：如果您在中国大陆，建议配置 pip 国内镜像源以加速下载：\n> ```bash\n> pip config set global.index-url https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple\n> ```\n\n## 安装步骤\n\n### 方法一：通过 PyPI 安装（推荐）\n这是最简单的安装方式，适用于大多数通用平台（Linux、Windows、macOS）。\n\n```bash\npip install sherpa-onnx\npython -c \"import sherpa_onnx\"  # 无报错即表示安装成功\n```\n\n### 方法二：预编译模型下载\nSherpa-onnx 本身是推理引擎，使用时需要配合具体的模型文件。您可以从 GitHub Releases、Hugging Face 或 ModelScope（魔搭）下载预训练模型。例如，下载官方转换好的 SenseVoice 多语种识别模型：\n\n```bash\n# 下载并解压 SenseVoice 非流式识别模型（中、英、日、韩、粤语）\nwget https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Fdownload\u002Fasr-models\u002Fsherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\ntar xvf sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\n```\n*注：国内用户可改用 ModelScope（魔搭）或 Hugging Face 镜像获取模型，速度更快；不同任务（ASR\u002FTTS\u002FVAD）对应不同模型，具体仓库地址请参考官方文档。*\n\n## 基本使用\n\n以下是一个使用 Python 进行**离线语音识别**的最简示例，直接使用上一步下载的 SenseVoice 模型（模型文件名以实际解压内容为准）。示例使用 `soundfile` 读取音频，可通过 `pip install soundfile` 安装；若您使用的是 zipformer transducer 等其他类型的模型，请改用 `OfflineRecognizer.from_transducer()` 等对应的构造方法。\n\n### 1. 语音识别 (ASR) 示例\n\n```python\nimport sherpa_onnx\nimport soundfile as sf\n\n# 使用上一步解压得到的 SenseVoice 模型创建离线识别器\nrecognizer = sherpa_onnx.OfflineRecognizer.from_sense_voice(\n    model=\".\u002Fsherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17\u002Fmodel.int8.onnx\",\n    tokens=\".\u002Fsherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17\u002Ftokens.txt\",\n    num_threads=4,\n    provider=\"cpu\",  # 如有 GPU 可改为 \"cuda\"\n    use_itn=True,  # 启用逆文本归一化（自动输出数字、标点等）\n)\n\n# 读取音频文件（推荐 16kHz 单声道 WAV）\nsamples, sample_rate = sf.read(\".\u002Ftest.wav\", dtype=\"float32\")\n\nstream = recognizer.create_stream()\nstream.accept_waveform(sample_rate, samples)\nrecognizer.decode_stream(stream)\n\nprint(\"识别结果:\", stream.result.text)\n```\n\n### 2. 语音合成 (TTS) 示例\n\n```python\nimport sherpa_onnx\nimport soundfile as sf\n\n# 配置合成器参数（以 VITS 类 TTS 模型为例，路径请替换为实际下载的模型）\nconfig = sherpa_onnx.OfflineTtsConfig(\n    model=sherpa_onnx.OfflineTtsModelConfig(\n        vits=sherpa_onnx.OfflineTtsVitsModelConfig(\n            model=\".\u002Fpath\u002Fto\u002Fyour\u002Ftts_model.onnx\",\n            tokens=\".\u002Fpath\u002Fto\u002Fyour\u002Ftts_tokens.txt\",\n        ),\n        provider=\"cpu\",\n        num_threads=4,\n    ),\n    max_num_sentences=2,\n)\n\ntts = sherpa_onnx.OfflineTts(config)\n\n# 合成语音\ntext = \"你好，这是一个测试。\"\naudio = tts.generate(text, sid=0, speed=1.0)\n\n# 保存为 WAV 文件\nsf.write(\"output.wav\", audio.samples, samplerate=audio.sample_rate)\nprint(f\"合成完成，采样率：{audio.sample_rate}，时长：{len(audio.samples)\u002Faudio.sample_rate:.2f} 秒\")\n```\n\n*注：部分 VITS 模型还需要 lexicon 或 data_dir 等配套文件，请根据模型自带的说明填写 OfflineTtsVitsModelConfig 的相应字段。*
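\n\n### 3. 流式语音识别（实时）示例\n\n前两个示例均为非流式（离线）调用。sherpa-onnx 同样提供流式接口，可边送入音频边得到中间结果。下面是一个最小示意（假设已按上文方式下载并解压流式模型 sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20，其中的文件名以实际解压内容为准），此处用整段 WAV 按小块送入来模拟实时音频流：\n\n```python\nimport numpy as np\nimport sherpa_onnx\nimport soundfile as sf\n\n# 创建流式识别器（路径与文件名仅为示意，请按实际模型调整）\nrecognizer = sherpa_onnx.OnlineRecognizer.from_transducer(\n    tokens=\".\u002Fsherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20\u002Ftokens.txt\",\n    encoder=\".\u002Fsherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20\u002Fencoder-epoch-99-avg-1.onnx\",\n    decoder=\".\u002Fsherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20\u002Fdecoder-epoch-99-avg-1.onnx\",\n    joiner=\".\u002Fsherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20\u002Fjoiner-epoch-99-avg-1.onnx\",\n    num_threads=4,\n    decoding_method=\"greedy_search\",\n)\n\nsamples, sample_rate = sf.read(\".\u002Ftest.wav\", dtype=\"float32\")\nstream = recognizer.create_stream()\n\n# 每次送入 0.2 秒音频，模拟实时输入\nchunk = int(0.2 * sample_rate)\nfor i in range(0, len(samples), chunk):\n    stream.accept_waveform(sample_rate, samples[i : i + chunk])\n    while recognizer.is_ready(stream):\n        recognizer.decode_stream(stream)\n    print(\"中间结果:\", recognizer.get_result(stream))\n\n# 输入结束：补一小段静音冲刷内部缓存，再取最终结果\nstream.accept_waveform(sample_rate, np.zeros(int(0.66 * sample_rate), dtype=np.float32))\nstream.input_finished()\nwhile recognizer.is_ready(stream):\n    recognizer.decode_stream(stream)\nprint(\"最终结果:\", recognizer.get_result(stream))\n```\n\n实际应用中，只需把麦克风采集到的音频块依次送入同一个 stream，即可获得“边说边出”的识别效果。\n\n### 4. 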
## 在线体验与更多资源
如果您不想在本地配置环境，可以直接访问以下国内镜像站点体验相应功能（均可在 ModelScope 创空间中搜索 sherpa-onnx 找到对应应用）：
- **语音识别体验**: [ModelScope 创空间 - 语音识别](https://modelscope.cn/studios)
- **语音合成体验**: [ModelScope 创空间 - 语音合成](https://modelscope.cn/studios)
- **说话人日志**: [ModelScope 创空间 - 说话人日志](https://modelscope.cn/studios)

对于 Android 开发者，项目提供了预编译的 APK 演示程序，可在相关发布页面直接下载安装测试。","某智能家居团队正在为一款离线语音助手开发核心交互模块，要求设备在无网络环境下也能精准识别指令并区分不同家庭成员。

### 没有 sherpa-onnx 时
- **依赖云端导致延迟高**：语音必须上传至服务器处理，网络波动时响应慢甚至超时，用户体验割裂。
- **隐私泄露风险大**：家庭对话录音需传输到第三方云端，存在敏感数据被截获或滥用的隐患。
- **硬件适配成本极高**：需在 ARM 架构的开发板、RISC-V 芯片及各类 NPU 上分别移植不同的语音引擎，维护多套代码库。
- **功能集成复杂**：想同时实现“人声分离”和“说话人区分”，不得不拼凑多个互不兼容的开源项目，导致系统臃肿不稳定。

### 使用 sherpa-onnx 后
- **毫秒级本地响应**：利用 ONNX Runtime 在设备端直接运行下一代 Kaldi 模型，断网状态下也能即时识别指令并合成回复。
- **数据完全本地闭环**：所有语音识别、声纹验证及对话内容均在芯片内部处理，彻底杜绝隐私外传风险。
- **一次开发多端部署**：凭借对 Android、iOS、HarmonyOS 及 RK/Ascend 等 NPU 的广泛支持，同一套 C++ 或 Python 代码可无缝运行于从树莓派到高端网关的各种设备。
- **全能型单库集成**：单个库即可搞定语音转文字、文本转语音、说话人日志及背景降噪，大幅简化了工程架构与测试流程。

sherpa-onnx 通过强大的跨平台离线能力，让开发者能以最低成本构建出既保护隐私又响应迅速的嵌入式智能语音应用。","https://oss.gittoolsai.com/images/k2-fsa_sherpa-onnx_e7e7f7bd.jpg","k2-fsa","https://oss.gittoolsai.com/avatars/k2-fsa_fda504ee.png","",null,"https://github.com/k2-fsa",[79,83,87,91,95,99,103,107,111,115],{"name":80,"color":81,"percentage":82},"C++","#f34b7d",38.8,{"name":84,"color":85,"percentage":86},"Python","#3572A5",17.1,{"name":88,"color":89,"percentage":90},"Shell","#89e051",6.8,{"name":92,"color":93,"percentage":94},"JavaScript","#f1e05a",4.9,{"name":96,"color":97,"percentage":98},"Dart","#00B4AB",4.5,{"name":100,"color":101,"percentage":102},"Kotlin","#A97BFF",4,{"name":104,"color":105,"percentage":106},"C","#555555",3.6,{"name":108,"color":109,"percentage":110},"Java","#b07219",3.5,{"name":112,"color":113,"percentage":114},"Rust","#dea584",3.3,{"name":116,"color":117,"percentage":118},"Pascal","#E3F171",3.2,11422,1300,"2026-04-08T14:29:12","Apache-2.0","Linux, macOS, Windows, Android, iOS, HarmonyOS, openKylin","非必需。支持在 CPU 上运行。可选支持 NVIDIA GPU（如 Jetson Orin NX、Jetson Nano），也支持多种 NPU（Rockchip RKNN、Qualcomm QNN、Ascend、Axera）。未指定具体显存大小或 CUDA 版本要求。","未说明",{"notes":127,"python":128,"dependencies":129},"该工具主打本地离线运行，架构兼容性极强，支持 x86/x64、ARM（32/64 位）、RISC-V 等多种指令集。提供 C++, C, Python, JavaScript, Java, C#, Kotlin, Swift, Go, Dart, Rust, Pascal 等十余种语言接口。支持在浏览器中通过 WebAssembly 直接运行，无需安装。针对特定硬件（如瑞芯微、华为昇腾、高通等）有专门的 NPU 加速支持。","支持 Python，但未指定具体版本要求",[130,131],"onnxruntime (隐含于 onnx 体系)","WebAssembly (可选)",[133,14],"音频",[135,136,137,138,139,140,141,142,143,144,145,146,147,148,149,150,151,152,153,154],"asr","onnx","windows","linux","macos","cpp","android","ios","raspberry-pi","aarch64","arm32","csharp","dotnet","mfc","speech-to-text","text-to-speech","vits","risc-v","lazarus","object-pascal","2026-03-27T02:49:30.150509","2026-04-09T05:47:59.832409",[158,163,168,172,177,182,187],{"id":159,"question_zh":160,"answer_zh":161,"source_url":162},25566,"在嵌入式 Linux（如 RK3568）上部署 sherpa-onnx 时，运行命令报错或找不到库文件怎么办？","确保交叉编译生成的动态库路径与运行时加载路径一致，并确认 lib 库文件已正确复制到嵌入式系统且权限无误。如果是自定义模型（如阿里 SenseVoice），请参考官方 C API 示例代码来构建正确的调用命令。对于模型加载慢的问题，可以编写一个简单的 C++ 程序，实测加载大文件（如 100 MB）到内存所需的时间，以排查是硬件 I/O 性能瓶颈还是路径配置问题。","https://github.com/k2-fsa/sherpa-onnx/issues/1430",
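上面第一个 FAQ 建议通过实测“把大文件完整读入内存的耗时”来排查嵌入式设备上模型加载慢的原因。原讨论建议用 C++ 编写，这里给出思路相同的 Python 最小草图（文件路径为示意假设）：

```python
import time

# 实测把一个大文件（如约 100 MB 的 ONNX 模型）完整读入内存的耗时，
# 用于判断模型加载慢是源于存储 I/O 瓶颈还是路径/配置问题
path = "./model.onnx"  # 假设的文件名，请替换为实际模型文件

t0 = time.perf_counter()
with open(path, "rb") as f:
    data = f.read()
elapsed = time.perf_counter() - t0

print(f"读取 {len(data) / 1e6:.1f} MB 耗时 {elapsed:.2f} 秒，"
      f"约 {len(data) / 1e6 / elapsed:.1f} MB/s")
```

如果实测的读取速度远低于存储介质的标称速度，说明瓶颈在硬件 I/O；如果读取很快但加载仍然慢，则应重点检查库路径与模型配置。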
{"id":164,"question_zh":165,"answer_zh":166,"source_url":167},25567,"安装 sherpa-onnx-gpu 后导入时报错：'libonnxruntime_providers_cuda.so: cannot open shared object file'，如何解决？","这通常是环境变量或 CUDA 版本不匹配导致的。请严格按照 ONNX Runtime 官方文档安装对应的 CUDA 和 cuDNN 版本（推荐 CUDA 11.8 和 cuDNN 8.2.4 或更高兼容版本）。即使 PyTorch 在该环境下能正常工作，sherpa-onnx 依赖的 ONNX Runtime 也可能需要特定版本的 CUDA 驱动和库文件。建议检查 LD_LIBRARY_PATH 是否包含 onnxruntime 的库路径，并重新安装相匹配的 CUDA 和 cuDNN。","https://github.com/k2-fsa/sherpa-onnx/issues/1030",{"id":169,"question_zh":170,"answer_zh":171,"source_url":162},25568,"如何将非官方提供的模型（如魔搭社区 ModelScope 上的 ONNX 模型）在 sherpa-onnx 中使用？","如果模型已经是 ONNX 格式，理论上可以尝试加载，但需要确保模型结构符合 sherpa-onnx 支持的算子和输入输出规范。对于暂不支持的模型类型（如某些特定的流式 CTC 模型），目前可能无法直接使用；维护者表示对流式 CTC 模型的支持已在计划中。如果模型导出时使用了特殊的 opset_version 或自定义算子，可能需要自行修改导出脚本或等待官方支持。",{"id":173,"question_zh":174,"answer_zh":175,"source_url":176},25569,"如何为 sherpa-onnx 添加新的 TTS 模型（如 Matcha-TTS）并处理元数据和采样率问题？","添加新模型时，需要参考现有脚本为模型添加元数据（metadata），例如使用类似 add_sherpa_metadata_to_matcha.py 的脚本。关于采样率，如果训练数据是 24kHz 而默认代码硬编码为 22050Hz，则需要修改相关代码以支持 24kHz。对于 Vocoder，既可以将其嵌入到 ONNX 模型中实现端到端推理（类似 VITS），也可以单独提供。HiFi-GAN 的版本选择（v1/v2/v3）取决于训练效果，sherpa-onnx 本身对版本没有强制限制，只要能导出为兼容的 ONNX 格式即可。","https://github.com/k2-fsa/sherpa-onnx/issues/1779",{"id":178,"question_zh":179,"answer_zh":180,"source_url":181},25570,"在导出 ONNX 模型时遇到 opset_version 报错，或者模型在特定环境下无法运行怎么办？","检查导出脚本中的 opset_version 设置。有些官方提供的模型在导出时可能使用了特定的参数（如 dynamic_axes 或其他优化选项），导致在某些环境下不兼容。如果遇到此类问题，可以尝试去掉导出脚本中相应的优化行（例如 run.sh 中的某些参数），然后重新导出模型。通常将 opset_version 设置为 13 是比较通用的选择。如果问题依旧，建议对照官方导出脚本的最新版本逐项比对。","https://github.com/k2-fsa/sherpa-onnx/issues/1536",{"id":183,"question_zh":184,"answer_zh":185,"source_url":186},25571,"sherpa-onnx 是否支持流式语音识别（Streaming ASR）以及哪些模型可以使用？","sherpa-onnx 支持流式语音识别，但要求模型本身是为流式任务训练的（如 zipformer streaming 模型）。对于其他架构（如 QuartzNet 或 Conformer），如果它们支持缓存感知流式（cache-aware streaming）或分块推理（chunked inference），理论上可以通过导出为 ONNX 并在 sherpa-onnx 中适配来实现。目前官方正在计划增加对流式 CTC 模型的支持。如果不确定某个模型是否支持流式，可以查看其训练配置中是否包含 streaming 相关的参数。","https://github.com/k2-fsa/sherpa-onnx/issues/178",{"id":188,"question_zh":189,"answer_zh":190,"source_url":181},25572,"如何在低版本 CUDA（如 CUDA 10.2）环境下编译和使用 sherpa-onnx GPU 版本？","sherpa-onnx 的预编译包通常依赖较新的 CUDA 版本（如 11.8）。如果环境受限只能使用 CUDA 10.2，可能需要从源码重新编译，并修改构建配置以适配旧版 CUDA，但这可能会遇到算子不支持或性能下降的问题。建议尽量升级 CUDA 环境，或者改用 CPU 版本进行推理。如果必须使用旧版 CUDA，需仔细检查 ONNX Runtime 对该版本的支持情况，并可能需要手动调整 CMake 配置。",[192,197,202,207,212,217,222,227,232,237,242,247,251,256,261,266,271,276,281,286],{"id":193,"version":194,"summary_zh":195,"released_at":196},162890,"v1.12.36","## 变更内容
* @wjddd 在 https://github.com/k2-fsa/sherpa-onnx/pull/3468 中为 OfflineQwen3ASRModelConfig 添加了热词参数
* @kapitalismho 在 https://github.com/k2-fsa/sherpa-onnx/pull/3472 中为 Qwen3-ASR 添加了按流设置语言提示的功能，并优化了脚手架清理逻辑
* @csukuangfj 在 https://github.com/k2-fsa/sherpa-onnx/pull/3475 中修复了 vad+asr 的构建问题
* @csukuangfj 在 https://github.com/k2-fsa/sherpa-onnx/pull/3476 中更新了 Qwen3 ASR 模型
* @csukuangfj 在 https://github.com/k2-fsa/sherpa-onnx/pull/3477 中修复了 Qwen3-ASR 的热词处理问题
* @csukuangfj 在 https://github.com/k2-fsa/sherpa-onnx/pull/3480 中修复了 fp32 版本 Qwen3-ASR 模型的初始化问题
* @alex-spacemit 在 https://github.com/k2-fsa/sherpa-onnx/pull/3481 中更新了 riscv64-spacemit 架构下的 onnxruntime 包名和哈希值
* @csukuangfj 在 https://github.com/k2-fsa/sherpa-onnx/pull/3482 中使用模型文件路径来初始化 ONNX Runtime 会话
* @GLM-FM 在 https://github.com/k2-fsa/sherpa-onnx/pull/3485 中实现了 .NET Android 
目标平台支持\n* @GLM-FM 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3487 中将 CI 环境升级至 .NET 10\n* @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3488 中发布了 v1.12.36 版本\n\n## 新贡献者\n* @wjddd 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3468 中完成了首次贡献\n* @kapitalismho 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3472 中完成了首次贡献\n* @GLM-FM 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3485 中完成了首次贡献\n\n**完整变更日志**: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fcompare\u002Fv1.12.35...v1.12.36","2026-04-08T13:05:30",{"id":198,"version":199,"summary_zh":200,"released_at":201},162891,"v1.12.35","## 变更内容\n* 由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3429 中修复 CI 测试\n* 由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3426 中添加声源分离的 Swift API\n* 由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3430 中修复读取多声道波形文件的 C API\n* 由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3432 中添加声源分离的 Go API\n* spacemit 提供者：添加使用配置文件的提供者，并更新至 2.0.2 版本，由 @alex-spacemit 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3421 中完成\n* 由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3433 中修复 Swift 示例测试\n* 由 @zugaldia 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3438 中为使用 sherpa-onnx 的项目添加声速参数\n* 由 @hani-hj1908619 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3448 中移除 OnlineRecognizer 模型配置中的 num_threads 断言\n* 由 @rossarmstrong 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3447 中将 Wake Word 和 decibri 添加到使用 sherpa-onnx 的项目中\n* 由 @rossarmstrong 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3451 中修复 Wake Word 项目条目中的 URL\n* 由 @Wasser1462 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3434 中为 Qwen3-ASR 添加热词支持\n* 由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3452 中添加问题模板\n* 由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3453 中上传用于 https:\u002F\u002Fhuggingface.co\u002FCohereLabs\u002Fcohere-transcribe-03-2026 的模型\n* 由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3456 中添加 Cohere Transcribe 的 C++ 运行时和 Python API\n* 由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3457 中添加 Cohere transcribe 模型的 C API 和 CXX API\n* 由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3460 中添加 Cohere Transcribe 的 Swift API\n* 由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3458 中添加 Cohere Transcribe 的 JavaScript API\n* 由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3461 中添加 Cohere Transcribe 的 Java 和 Kotlin API\n* 由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3463 中添加 Cohere Transcribe 的 Pascal API\n* 由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3462 中添加 Cohere Transcribe 的 C# API\n* 由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3464 中添加 Cohere Transcribe 的 Dart API\n* 由 @csukuangfj 在 
https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3465 中添加 Cohere Transcribe 的 Rust API\n* 由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3466 中添加 Cohere Transcribe 的 Go API\n* 由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3467 中发布 v1.12.35 版本\n\n## 新贡献者\n* @zugaldia 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3438 中完成了首次贡献\n* @hani-hj1908619 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3448 中完成了首次贡献\n* @rossarmstrong 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3447 中完成了首次贡献\n\n**完整变更日志*","2026-04-03T03:58:45",{"id":203,"version":204,"summary_zh":205,"released_at":206},162892,"v1.12.34","## 变更内容\n* 更新 Python 字幕脚本，支持 FireRedASR ctc 和 FunASR Nano。由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3400 中完成。\n* 修复 Rust 文档构建问题。由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3403 中完成。\n* 添加源分离的 C API。由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3404 中完成。\n* 添加源分离的 C++ API。由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3405 中完成。\n* 添加源分离的 C# API。由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3406 中完成。\n* 修复 TTS 已弃用警告。由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3407 中完成。\n* 添加 Qwen3-ASR 支持。由 @Wasser1462 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3399 中完成。\n* 上传 Qwen3 ASR 0.6B int8 模型。由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3409 中完成。\n* 更新 Qwen3 ASR 的测试用例。由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3410 中完成。\n* 更新 Qwen3 ASR 的 Swift API 示例。由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3411 中完成。\n* 添加 Qwen3 ASR 的 Go API。由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3412 中完成。\n* 添加 Qwen3 ASR 的 C# API。由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3413 中完成。\n* 修复 Go API 中的警告信息。由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3414 中完成。\n* 重构 Go API 示例，改用 sherpa_onnx.ReadWave() 方法。由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3415 中完成。\n* 添加 Qwen3 ASR 的 JavaScript API（WebAssembly）。由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3416 中完成。\n* 添加 Qwen3 ASR 的 JavaScript API（node-addon）。由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3419 中完成。\n* 添加 Qwen3 ASR 的 Java API 和 Kotlin API。由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3420 中完成。\n* 添加 Qwen3 ASR 的 Dart API。由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3422 中完成。\n* 添加 Qwen3 ASR 的 Rust API。由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3423 中完成。\n* 添加 Qwen3 ASR 的 Pascal API。由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3424 中完成。\n* 发布 v1.12.34 版本。由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3425 中完成。\n\n\n**完整变更日志**: 
https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fcompare\u002Fv1.12.33...v1.12.34","2026-03-26T11:35:56",{"id":208,"version":209,"summary_zh":210,"released_at":211},162893,"v1.12.33","## 变更内容\n* 由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3388 中修复了 TTS 的 MFC 示例构建问题。\n* 由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3394 中添加了标点符号相关的示例。\n* 由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3395 中向 Go API 添加了 OnlineRecognizerResult 缺失的字段。\n* 由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3396 中修复了麦克风采样率的打印问题。\n* 由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3397 中添加了实时 ASR + VAD 的 Rust 示例。\n* 由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3398 中发布了 v1.12.33 版本。\n\n\n**完整变更日志**: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fcompare\u002Fv1.12.32...v1.12.33","2026-03-24T08:07:44",{"id":213,"version":214,"summary_zh":215,"released_at":216},162894,"v1.12.32","## 变更内容\n* @CodeBySonu95 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3384 中将 VoxSherpa TTS 添加到使用 sherpa-onnx 的项目中\n* @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3385 中测试了 Windows 上的 Rust API\n* @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3386 中支持 Rust 包的静态链接\n* @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3387 中发布了 v1.12.32 版本\n\n## 新贡献者\n* @CodeBySonu95 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3384 中完成了首次贡献\n\n**完整变更日志**: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fcompare\u002Fv1.12.31...v1.12.32","2026-03-22T01:27:12",{"id":218,"version":219,"summary_zh":220,"released_at":221},162895,"v1.12.31","## 变更内容\n* 由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3361 中修复了为 OHOS 构建 har 文件的问题。\n* 由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3362 中重构了 MatchaTTS，使其使用新的 Generate API。\n* 由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3363 中重构了 Kokoro TTS，使其使用新的 Generate API。\n* 由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3364 中重构了 KittenTTS，使其使用新的 Generate API。\n* 由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3365 中重构了 VITS，使其使用新的 Generate API。\n* 由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3366 中添加了 TTS 的 Rust API 示例。\n* 由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3367 中修复了 Swift 测试。\n* 由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3368 中添加了用于音频标签的 Rust API。\n* 由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3369 中添加了用于说话人嵌入提取和管理的 Rust API。\n* 由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3370 中添加了用于说话人日区分的 Rust API。\n* 由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3371 中重构了语音降噪器的 Rust API。\n* 由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3374 中添加了 C API 和 C++ API 的文档。\n* 由 @csukuangfj 在 
https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3375 中添加了指向 C API 文档的链接。\n* 由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3376 中添加了 Rust API 的文档。\n* 由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3372 中添加了用于关键词检测、离线标点符号添加和口语语言识别的 Rust API。\n* 由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3377 中添加了 Dart API 的文档。\n* 由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3378 中进一步补充了 Rust API 的文档。\n* 由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3379 中发布了 v1.12.31 版本。\n\n\n**完整变更日志**: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fcompare\u002Fv1.12.30...v1.12.31","2026-03-20T11:24:18",{"id":223,"version":224,"summary_zh":225,"released_at":226},162896,"v1.12.30","## 变更内容\n* 由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3293 中修复项目中的拼写错误\n* 由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3294 中修复 WebAssembly JavaScript API\n* 由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3295 中移除 C\u002FC++ API 中不必要的 SHERPA_ONNX_API\n* 由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3296 中修复 CXX API 中的 bug\n* 结果输出到标准输出，由 @phillc 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3274 中实现\n* 对在线识别器 C++ 代码进行小幅修复，由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3297 中完成\n* 对 JNI 封装进行小幅修复，由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3298 中完成\n* 由 SanMse 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3318 中修复 test-onnx-streaming.py 中针对长音频的填充 bug\n* 由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3321 中修复样式问题\n* 由 danielr-ceva 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3276 中添加 DPDFNet 语音降噪器对离线和流式处理的支持\n* 由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3322 中上传 DPDFNet 模型\n* 由 Wasser1462 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3323 中添加在线标点符号的 C API 示例\n* 由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3324 中添加 GTCRN 的在线语音降噪器及示例\n* 由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3328 中发布用于离线\u002F在线语音降噪器的 Rust 包\n* 由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3329 中重构 Dart API 以检查空指针\n* 由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3330 中为 Android 使用 onnxruntime 1.23.2\n* 由 Wasser1462 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3325 中添加在线标点符号的 Go API 示例\n* 由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3332 中重构 ZipVoice TTS 以支持回调\n* 由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3333 中添加 ZipVoice 的 C 和 CXX API 示例\n* 由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3334 中添加 ZipVoice 的 Go API 示例\n* 由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3335 中添加 ZipVoice TTS 的 Python API 示例\n* 由 @csukuangfj 在 
https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3337 中添加 ZipVoice 的 WebAssembly 示例\n* WebAssembly：将下载进度文本更新为显示 MB 而不是原始字节数，由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3338 中完成\n* 由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3340 中添加 PocketTTS 的 WebAssembly 示例\n* 由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3341 中添加 ZipVoice TTS 的 JavaScript (WebAssembly) 示例\n* 由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3342 中添加 ZipVoice TTS 的 JavaScript (node-addon) 示例\n* 由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F33 中添加 Pocket 和 Supertonic TTS 的 JavaScript 播放示例","2026-03-19T08:09:46",{"id":228,"version":229,"summary_zh":230,"released_at":231},162897,"v1.12.29","## 变更内容\n* 在 Windows 上为 Debug 和 RelWithDebInfo 构建发布 pdb 文件，由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3252 中完成\n* 修复 WebAssembly 中 TTS 的内存泄漏问题，由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3259 中完成\n* 重构 WebAssembly TTS API，由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3260 中完成\n* 添加 Supertonic2 TTS 支持，由 @Wasser1462 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3094 中完成\n* 上传 supertonic tts 模型，由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3263 中完成\n* 为 Supertonic TTS 添加 Python API 示例，由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3264 中完成\n* 在 canary 模型运行时支持动态解码器层数，由 @mm65x 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3268 中完成\n* 为 supertonic TTS 添加 C++ API，由 @Wasser1462 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3280 中完成\n* 为 SupertonicTTS 添加 C# API，由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3283 中完成\n* 为 SupertonicTTS 添加 Go API，由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3284 中完成\n* 为文本转语音添加 Rust API，由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3285 中完成\n* 为 SupertonicTTS 添加 Swift API，由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3286 中完成\n* 为 Supertonic TTS 添加 Java 和 Kotlin API，由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3289 中完成\n* 为 SupertonicTTS 添加 Dart API，由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3288 中完成\n* 为 Supertonic TTS 添加 Pascal API 及示例，由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3290 中完成\n* 为 Supertonic TTS 添加 JavaScript API，由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3287 中完成\n* 发布 v1.12.29 版本，由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3292 中完成\n\n## 新贡献者\n* @mm65x 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3268 中完成了首次贡献\n\n**完整变更日志**: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fcompare\u002Fv1.12.28...v1.12.29","2026-03-12T02:20:50",{"id":233,"version":234,"summary_zh":235,"released_at":236},162898,"v1.12.28","## 变更内容\n* 由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3232 中添加了对 Moonshine v2 的 
C++ 运行时支持\n* 由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3234 中将 Moonshine v2 模型导出至 sherpa-onnx\n* 由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3235 中更新了 Moonshine v2 模型的 Python API\n* 由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3237 中添加了 Moonshine v2 模型的 Kotlin 和 Java API\n* 由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3238 中添加了 Moonshine v2 模型的 C 和 C++ API\n* 由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3240 中添加了 Moonshine v2 模型的 Swift API\n* 由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3241 中添加了 Moonshine v2 模型的 JavaScript API（WebAssembly）\n* 由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3242 中添加了 Moonshine v2 模型的 JavaScript API（node-addon）\n* 由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3243 中添加了 Moonshine v2 的 C# API\n* 由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3244 中添加了 Moonshine v2 的 Go API\n* 由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3245 中添加了 Moonshine v2 的 Dart API\n* 由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3247 中添加了 Moonshine v2 的 Rust API\n* 由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3248 中添加了 Moonshine v2 的 Pascal API\n* 由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3249 中使用 WebAssembly 构建了 Moonshine v2 的 Hugging Face Spaces\n* 由 @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3250 中发布了 v1.12.28 版本\n\n\n**完整变更日志**: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fcompare\u002Fv1.12.27...v1.12.28","2026-02-28T08:08:03",{"id":238,"version":239,"summary_zh":240,"released_at":241},162899,"v1.12.27","## 变更内容\n* @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3213 中添加了用于 VAD 的 Rust API\n* @suykerbuyk 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3214 中将已弃用的 std::istrstream 替换为 std::istringstream\n* @suykerbuyk 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3215 中将已弃用的 std::wstring_convert 替换为手动实现的 UTF-8 编码解码器\n* @suykerbuyk 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3217 中修复了 CMake 警告：可选特性消息级别及策略版本最低要求\n* @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3220 中上传了 FireRedASR v2 模型（AED 和 CTC）\n* @suykerbuyk 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3216 中修复了 hclust_cpp 构建时的警告信息：FetchContent_Populate 已弃用以及 #pragma message 问题\n* @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3221 中支持了 FireRedASR CTC 模型\n* @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3224 中更新了 FireRedASR CTC 模型的语言绑定\n* @csukuangfj 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3225 中发布了 v1.12.27 版本\n\n## 新贡献者\n* @suykerbuyk 在 https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3214 中完成了首次贡献\n\n**完整变更日志**: 
https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fcompare\u002Fv1.12.26...v1.12.27","2026-02-26T10:14:33",{"id":243,"version":244,"summary_zh":245,"released_at":246},162900,"v1.12.26","## What's Changed\r\n* Fix CI by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3192\r\n* Fix heap-buffer-overflow in ReadWaveImpl when data chunk size is odd by @zhangjy1014 in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3195\r\n* [PocketTTS] Add seed support and voice embedding caching for consiste… by @ramishi in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3189\r\n* Feat\u002Fpocket tts cache config by @ramishi in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3200\r\n* 3197: enhanced java binding for voice_embedding_cache_capacity by @albertbolt1 in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3201\r\n* Dart, flutter, go, c-api blinding and example by @ramishi in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3202\r\n* Begin to add Rust API by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3203\r\n* Add Rust API for streaming speech recognition by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3204\r\n* Add a real-time speech recognition example with microphone for Rust API. by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3205\r\n* Add Rust API for offline ASR by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3207\r\n* feat: Add PocketTTS cache & seed support to Node.js Addon and WASM APIs by @ramishi in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3206\r\n* Add more examples for offline ASR models with Rust API. by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3209\r\n* Update C#\u002FSwift\u002FPascal API for PocketTTS' VoiceEmbeddingCacheCapacity. by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3211\r\n* Release v1.12.26 by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3212\r\n\r\n## New Contributors\r\n* @zhangjy1014 made their first contribution in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3195\r\n* @ramishi made their first contribution in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3189\r\n* @albertbolt1 made their first contribution in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3201\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fcompare\u002Fv1.12.25...v1.12.26","2026-02-24T10:52:53",{"id":248,"version":249,"summary_zh":76,"released_at":250},162901,"v1.12.25","2026-02-14T14:47:22",{"id":252,"version":253,"summary_zh":254,"released_at":255},162902,"v1.12.24","## What's Changed\r\n* Fix UnicodeDecodeError when accessing tokens in FunASR-nano tokenizer by @Wasser1462 in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3058\r\n* Use more jobs for building VAD ASR APKs by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3068\r\n* Add export CGO_ENABLED=1 to all GO examples. 
by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3069\r\n* Support BPE tokenizer by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3078\r\n* Add C++ runtime and Python support PocketTTS for streaming voice cloning on CPU by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3083\r\n* JS: addon static import logic for bundlers by @alfredomariamilano in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3075\r\n* Update C++ binary for PocketTTS by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3087\r\n* Add Python API examples for PocketTTS by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3088\r\n* Limit text length for PocketTTS. by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3089\r\n* Add CI for PocketTTS. by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3090\r\n* Fix Python CI by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3091\r\n* Fix build error by @eij-iew in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3096\r\n* Add Java and Kotlin API for PocketTTS by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3095\r\n* Refactor JNI to remove casting. by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3103\r\n* Refactor JNI by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3107\r\n* Support MD and MT MSVC runtime libraries (CRT) for Windows x64 static build by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3111\r\n* Fix MSVC CRT for Windows x64 shared build. by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3114\r\n* Fix MSVC CRT for Windows arm64 by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3117\r\n* Fix MSVC CRT for Windows x86 by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3118\r\n* Refactor CI for Windows x64 by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3119\r\n* Fix CI for Windows x64 by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3123\r\n* Upload WenetSpeech-Wu u2pp ASR models. by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3125\r\n* add TTS generation with GenerationConfig params C API by @seven1240 in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3115\r\n* Refactor TTS C API by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3127\r\n* Add CXX API for PocketTTS by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3128\r\n* Add Swift API for PocketTTS by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3129\r\n* fix(android): Optimize UI updates and remove dead code in MainActivity by @Doheon-Kim in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3130\r\n* Change RPATH for sherpa-onnx.node by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3131\r\n* Add async js API for tts generate. 
by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3133\r\n* fix(android): Initialize models in background coroutine to avoid UI blocking by @Doheon-Kim in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3132\r\n* Add hotword support for FunASR-Nano by @Wasser1462 in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3122\r\n* Provide async JS API to create TTS. by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3134\r\n* demo with UI and Web Worker to avoid main-thread blocking by @yuiyideyui in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3120\r\n* Add support for Meta Omnilingual ASR v2 models by @Edison2ST in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3138\r\n* Export omnilingualASR v2 by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3140\r\n* feat: Add ys_log_probs to NeMo transducer greedy search decoder by @Jua004 in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3105\r\n* Add modified beam search and hotwords support for NeMo transducer models by @nullbio in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3077\r\n* Fix ORT Value default construction for Android build by @uhuntu in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3141\r\n* Whisper timestamps by @jcheng5 in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2945\r\n* Add node-addon JavaScript API for PocketTTS by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3139\r\n* fix: Downgrade lifecycle to 2.5.1 to fix build error by @Doheon-Kim in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3143\r\n* Enable return value in callback for TTS in Go API. by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3150\r\n* Refactor Go API for TTS by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3151\r\n* Export models for CANN 8.1 by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3152\r\n* Add Go API for PocketTTS by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3153\r\n* Export models for CANN 8.3 and 8.5 by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F315","2026-02-10T03:51:22",{"id":257,"version":258,"summary_zh":259,"released_at":260},162903,"v1.12.23","## What's Changed\r\n* Node addon api jsdoc by @alfredomariamilano in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3005\r\n* Add JavaScript async API for OfflineRecongizer decodeStream. by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3049\r\n* Support creating OfflineRecognizer asynchronously in JavaScript. 
by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3050\r\n* Fix uploading files to huggingface by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3054\r\n* Add Dart API for FunASR Nano by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3055\r\n* Fix uploading APK files by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3056\r\n* Release v1.12.23 by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3057\r\n\r\n## New Contributors\r\n* @alfredomariamilano made their first contribution in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3005\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fcompare\u002Fv1.12.22...v1.12.23","2026-01-15T08:23:35",{"id":262,"version":263,"summary_zh":264,"released_at":265},162904,"v1.12.22","## What's Changed\r\n* Update wav files for FunASR Nano by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3038\r\n* Fix onnxruntime SHA256 by @Wasser1462 in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3042\r\n* Fix checking funasr nano tokenizer on Windows by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3043\r\n* Support nemotron-speech-streaming-en-0.6b by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3044\r\n* Build Android APK for nemotron-speech-streaming-en-0.6b by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3045\r\n* Fix building Linux arm wheels by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3047\r\n* Release v1.12.22 by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3048\r\n\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fcompare\u002Fv1.12.21...v1.12.22","2026-01-14T14:54:54",{"id":267,"version":268,"summary_zh":269,"released_at":270},162905,"v1.12.21","## What's Changed\r\n* Fix publishing NPM packages by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2909\r\n* Refactor ZipVoice C++ code by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2911\r\n* Epoxrt more zipformer ctc models to qnn by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2921\r\n* [KWS] Add phone+ppinyin tokenization with lexicon support (for zh-en model) by @pkufool in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2922\r\n* Export Paraformer ASR models to QNN by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2925\r\n* Add Transpose for a 2-D matrix. by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2926\r\n* Optimize computation with Eigen. by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2928\r\n* Add C++ runtime for Paraformer ASR models with Qualcomm NPU using QNN by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2931\r\n* Add Android demo for Paraformer ASR with Qualcomm NPU. 
by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2932\r\n* Export Google MedASR to sherpa-onnx by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2934\r\n* Add C++ runtime and Python API for Google MedASR models by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2935\r\n* Fix creating a view of an Ort::Value tensor. by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2939\r\n* Add C and CXX API for Google MedASR model by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2946\r\n* [TTS Engine] Fix engine speed by @SergioRt1 in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2895\r\n* Add Swift API for Google MedASR model by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2947\r\n* Add C# API for Google MedASR model by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2949\r\n* Add Pascal API for Google MedASR model by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2950\r\n* Add Go API for Google MedAsr model by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2952\r\n* Add Dart API for Google MedAsr model by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2953\r\n* Add JavaScript API (WebAssembly) for Google MedAsr model by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2954\r\n* Add JavaScript API (node-addon) for Google MedAsr model by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2955\r\n* Add Kotlin and Java API for Google MedAsr model by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2956\r\n* Add funASR-Nano support by @Wasser1462 in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2936\r\n* Fix building for Windows by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2964\r\n* Fix building for HarmonyOS by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2972\r\n* [feature] add FunASRNano config into golang api by @ilibx in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2974\r\n* Update FunAsr-Nano CTC model by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2978\r\n* [opt] opt free pointer function by @ilibx in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2975\r\n* [feature] use jinja2 to generate sherpa-onnx-go lib by @ilibx in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2976\r\n* Reformat Go API code by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2979\r\n* Fix building for onnxruntime >= 1.11.0 by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2981\r\n* Export Whisper to RK NPU by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2983\r\n* Test Whisper on Ascend NPU using ACL Python API by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2986\r\n* FunASR-nano: switch to unified KV-cache LLM by @Wasser1462 in 
https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2995\r\n* Remove filesystem header by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2998\r\n* Fix(csrc\u002Fmelotts): Map 'v' to 'V' phoneme for English MeloTTS by @pqsworld in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3002\r\n* Upload FunASR Nano ASR models with LLM by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3003\r\n* Fix test wav files in FunASR Nano models. by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3004\r\n* Use onnxruntime 1.23.2 for Windows by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3007\r\n* Add CI to export Whisper models to Ascend NPU by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3008\r\n* Add C++ runtime for Whisper with Ascend NPU by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3009\r\n* Use onnxruntime v1.23.2 for Linux aarch64 by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3016\r\n* Use onnxruntime v1.23.2 for Linux arm by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3017\r\n* Start to switch from onnxruntime 1.17.1 to v1.23.2 by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2993\r\n* Use onnxruntime 1.23.2 for Linux x64 + NVIDIA GPU by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F3018\r\n* Update CI test for FunASR Nano C\u002FC++ API by @csukuangfj ","2026-01-12T11:17:57",{"id":272,"version":273,"summary_zh":274,"released_at":275},162906,"v1.12.20","## What's Changed\r\n* Refactor axcl examples. 
by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2867\r\n* Update README to include Axera NPU by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2870\r\n* Add CI for Axera NPU by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2872\r\n* Refactor sense voice impl by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2873\r\n* Refactor Paraformer Impl by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2874\r\n* Remove unused lock file by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2875\r\n* Load QNN context binary for faster startup by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2877\r\n* Export models for Ascend 910B4 by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2878\r\n* Optimize streaming output results when VAD does not detect human voice for a long time by @zhouyongxyz in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2876\r\n* Build APKs for MatchaTTS Chinese+English by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2882\r\n* Publish WASM spaces for MatchaTTS Chinese+English model by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2885\r\n* Add script for testing zipvoice onnx models by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2887\r\n* Upload zipvoice onnx models by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2890\r\n* Remove cppinyin from zipvoice by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2892\r\n* Fix building errors by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2893\r\n* Use a shorter name for Zipvoice models. 
by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2894\r\n* Export GigaAM v3 to sherpa-onnx by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2901\r\n* Fix typos in URL by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2905\r\n* Support Fun-ASR-Nano-2512 by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2906\r\n* Release v1.12.20 by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2907\r\n\r\n## New Contributors\r\n* @zhouyongxyz made their first contribution in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2876\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fcompare\u002Fv1.12.19...v1.12.20","2025-12-17T12:38:26",{"id":277,"version":278,"summary_zh":279,"released_at":280},162907,"asr-models-qnn-binary","Context binary files in this page are generated using qnn sdk 2.40.0.251030\r\n\r\nYou can download the requires `.so` lib files from qnn sdk 2.40.0.251030 from \r\nhttps:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Fdownload\u002Fasr-models-qnn\u002Fqnn-libs-2.40.0.251030.tar.bz2\r\n\r\nPlease refer to\r\n  - https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fqnn\u002Frun-executables-on-your-phone-binary.html\r\n  - https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fqnn\u002Fmodels.html\r\nfor usages.\r\n\r\nSee\r\nhttps:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fblob\u002Fmaster\u002Fsherpa-onnx\u002Fkotlin-api\u002FOfflineRecognizer.kt#L1119\r\nfor how to use within Android.\r\n```kotlin\r\n        9022 -> {\r\n            \u002F\u002F for my Xiaomi 17 Pro\r\n            val modelDir = \"sherpa-onnx-qnn-SM8850-binary-10-seconds-sense-voice-zh-en-ja-ko-yue-2024-07-17-int8\"\r\n            return OfflineModelConfig(\r\n                provider = \"qnn\",\r\n                senseVoice = OfflineSenseVoiceModelConfig(\r\n                    qnnConfig = QnnConfig(\r\n                        \u002F\u002F Please copy libQnnHtp.so and libQnnSystem.so to jniLibs\u002Farm64-v8a by yourself\r\n                        backendLib = \"libQnnHtp.so\",\r\n                        systemLib = \"libQnnSystem.so\",\r\n                        contextBinary = \"$modelDir\u002Fmodel.bin\",\r\n                    ),\r\n                ),\r\n                tokens = \"$modelDir\u002Ftokens.txt\",\r\n                debug = true,\r\n            )\r\n        }\r\n```","2025-12-09T06:14:11",{"id":282,"version":283,"summary_zh":284,"released_at":285},162908,"v1.12.19","## What's Changed\r\n* Fix building without TTS for C API by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2838\r\n* [ZipVoice] Fix english tokenization error by @pkufool in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2834\r\n* Add simulate streaming ASR Python example for Paraformer by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2839\r\n* Fix building JNI for Windows. by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2840\r\n* Avoid NaN in NeMo speaker embedding models. 
by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2844\r\n* add spacemit ort ep for spacemit riscv cpus by @alex-spacemit in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2837\r\n* Add token-level confidence scores (ys_probs) for offline transducer models by @Mahmoud-ghareeb in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2843\r\n* Fix token log probabilities in offline transducer modified beam search decoder by @Mahmoud-ghareeb in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2846\r\n* Support AXERA ax630, ax650, and axcl backends. by @Abandon-ht in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2849\r\n* Refactor axera npu examples by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2850\r\n* fix matcha tts zh-en model by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2851\r\n* Fix the English part for Matcha TTS. by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2853\r\n* Refactor text-utils by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2855\r\n* Fix matcha tts by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2856\r\n* Add a space between English words for Matcha zh-en TTS by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2858\r\n* Fix punctuations in matcha zh-en tts by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2859\r\n* Upload matcha tts zh-en model by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2865\r\n* Fix the discrepancy with the Silero VAD isSpeech logic by @ming030890 in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2863\r\n* Release v1.12.19 by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2868\r\n\r\n## New Contributors\r\n* @alex-spacemit made their first contribution in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2837\r\n* @Mahmoud-ghareeb made their first contribution in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2843\r\n* @Abandon-ht made their first contribution in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2849\r\n* @ming030890 made their first contribution in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2863\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fcompare\u002Fv1.12.18...v1.12.19","2025-12-05T12:48:35",{"id":287,"version":288,"summary_zh":289,"released_at":290},162909,"v1.12.18","## What's Changed\r\n* Fix CI by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2786\r\n* export omniASR_CTC_1B from https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002Fomnilingual-asr by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2788\r\n* Add C++ QNN support for SenseVoice by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2793\r\n* Export models for CANN toolkit 7.0 by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2795\r\n* Support hotwords with byte level bpe by @csukuangfj in 
https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2802\r\n* Add Android demo with QNN (Qualcomm NPU) for SenseVoice ASR by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2803\r\n* Export zipformer ctc models to QNN by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2815\r\n* Add spaces between English words for Homophone replacer. by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2817\r\n* Add C++ QNN support for Zipformer CTC models. by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2809\r\n* Limit symbol visibility in the shared libraries by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2822\r\n* Fix warnings for initializing tts lexicon. by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2823\r\n* Export zipformer ctc models to Ascend NPU by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2824\r\n* Refactor scripts for exporting models to Ascend NPU. by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2825\r\n* Add C++ support for Zipformer CTC on Ascend NPU by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2826\r\n* Fix segfault when non-wav file is passed to ReadWave by @jcheng5 in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2821\r\n* Avoid calling rknn_dup_context(). by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2828\r\n* Add C++ support for Paraformer with RK NPU by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2829\r\n* Update README to include NPU support by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2830\r\n* Support running whisper large v3 with external data weight by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2807\r\n* Release v1.12.18 by @csukuangfj in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2831\r\n\r\n## New Contributors\r\n* @jcheng5 made their first contribution in https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fpull\u002F2821\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fcompare\u002Fv1.12.17...v1.12.18","2025-11-27T10:20:50"]