[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-gradio-app--fastrtc":3,"tool-gradio-app--fastrtc":61},[4,18,28,37,45,53],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":17},4358,"openclaw","openclaw\u002Fopenclaw","OpenClaw 是一款专为个人打造的本地化 AI 助手，旨在让你在自己的设备上拥有完全可控的智能伙伴。它打破了传统 AI 助手局限于特定网页或应用的束缚，能够直接接入你日常使用的各类通讯渠道，包括微信、WhatsApp、Telegram、Discord、iMessage 等数十种平台。无论你在哪个聊天软件中发送消息，OpenClaw 都能即时响应，甚至支持在 macOS、iOS 和 Android 设备上进行语音交互，并提供实时的画布渲染功能供你操控。\n\n这款工具主要解决了用户对数据隐私、响应速度以及“始终在线”体验的需求。通过将 AI 部署在本地，用户无需依赖云端服务即可享受快速、私密的智能辅助，真正实现了“你的数据，你做主”。其独特的技术亮点在于强大的网关架构，将控制平面与核心助手分离，确保跨平台通信的流畅性与扩展性。\n\nOpenClaw 非常适合希望构建个性化工作流的技术爱好者、开发者，以及注重隐私保护且不愿被单一生态绑定的普通用户。只要具备基础的终端操作能力（支持 macOS、Linux 及 Windows WSL2），即可通过简单的命令行引导完成部署。如果你渴望拥有一个懂你",349277,3,"2026-04-06T06:32:30",[13,14,15,16],"Agent","开发框架","图像","数据工具","ready",{"id":19,"name":20,"github_repo":21,"description_zh":22,"stars":23,"difficulty_score":24,"last_commit_at":25,"category_tags":26,"status":17},9989,"n8n","n8n-io\u002Fn8n","n8n 是一款面向技术团队的公平代码（fair-code）工作流自动化平台，旨在让用户在享受低代码快速构建便利的同时，保留编写自定义代码的灵活性。它主要解决了传统自动化工具要么过于封闭难以扩展、要么完全依赖手写代码效率低下的痛点，帮助用户轻松连接 400 多种应用与服务，实现复杂业务流程的自动化。\n\nn8n 特别适合开发者、工程师以及具备一定技术背景的业务人员使用。其核心亮点在于“按需编码”：既可以通过直观的可视化界面拖拽节点搭建流程，也能随时插入 JavaScript 或 Python 代码、调用 npm 包来处理复杂逻辑。此外，n8n 原生集成了基于 LangChain 的 AI 能力，支持用户利用自有数据和模型构建智能体工作流。在部署方面，n8n 提供极高的自由度，支持完全自托管以保障数据隐私和控制权，也提供云端服务选项。凭借活跃的社区生态和数百个现成模板，n8n 让构建强大且可控的自动化系统变得简单高效。",184740,2,"2026-04-19T23:22:26",[16,14,13,15,27],"插件",{"id":29,"name":30,"github_repo":31,"description_zh":32,"stars":33,"difficulty_score":10,"last_commit_at":34,"category_tags":35,"status":17},10095,"AutoGPT","Significant-Gravitas\u002FAutoGPT","AutoGPT 是一个旨在让每个人都能轻松使用和构建 AI 的强大平台，核心功能是帮助用户创建、部署和管理能够自动执行复杂任务的连续型 AI 智能体。它解决了传统 AI 应用中需要频繁人工干预、难以自动化长流程工作的痛点，让用户只需设定目标，AI 即可自主规划步骤、调用工具并持续运行直至完成任务。\n\n无论是开发者、研究人员，还是希望提升工作效率的普通用户，都能从 AutoGPT 
中受益。开发者可利用其低代码界面快速定制专属智能体；研究人员能基于开源架构探索多智能体协作机制；而非技术背景用户也可直接选用预置的智能体模板，立即投入实际工作场景。\n\nAutoGPT 的技术亮点在于其模块化“积木式”工作流设计——用户通过连接功能块即可构建复杂逻辑，每个块负责单一动作，灵活且易于调试。同时，平台支持本地自托管与云端部署两种模式，兼顾数据隐私与使用便捷性。配合完善的文档和一键安装脚本，即使是初次接触的用户也能在几分钟内启动自己的第一个 AI 智能体。AutoGPT 正致力于降低 AI 应用门槛，让人人都能成为 AI 的创造者与受益者。",183572,"2026-04-20T04:47:55",[13,36,27,14,15],"语言模型",{"id":38,"name":39,"github_repo":40,"description_zh":41,"stars":42,"difficulty_score":10,"last_commit_at":43,"category_tags":44,"status":17},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,"2026-04-05T11:01:52",[14,15,13],{"id":46,"name":47,"github_repo":48,"description_zh":49,"stars":50,"difficulty_score":24,"last_commit_at":51,"category_tags":52,"status":17},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",161692,"2026-04-20T11:33:57",[14,13,36],{"id":54,"name":55,"github_repo":56,"description_zh":57,"stars":58,"difficulty_score":24,"last_commit_at":59,"category_tags":60,"status":17},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 
是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",109154,"2026-04-18T11:18:24",[14,15,13],{"id":62,"github_repo":63,"name":64,"description_en":65,"description_zh":66,"ai_summary_zh":66,"readme_en":67,"readme_zh":68,"quickstart_zh":69,"use_case_zh":70,"hero_image_url":71,"owner_login":72,"owner_name":73,"owner_avatar_url":74,"owner_bio":75,"owner_company":76,"owner_location":76,"owner_email":77,"owner_twitter":76,"owner_website":78,"owner_url":79,"languages":80,"stars":104,"forks":105,"last_commit_at":106,"license":107,"difficulty_score":24,"env_os":108,"env_gpu":108,"env_ram":108,"env_deps":109,"category_tags":120,"github_topics":122,"view_count":24,"oss_zip_url":76,"oss_zip_packed_at":76,"status":17,"created_at":131,"updated_at":132,"faqs":133,"releases":134},10224,"gradio-app\u002Ffastrtc","fastrtc","The python library for real-time communication","fastrtc 是一款专为 Python 打造的实时通信库，旨在让开发者轻松将普通的 Python 函数转化为支持音频和视频流的实时应用。它主要解决了构建低延迟互动系统时面临的技术门槛高、配置复杂等痛点，让用户无需深入钻研 WebRTC 或 WebSocket 的底层细节，即可快速实现流畅的音视频交互。\n\n这款工具非常适合 AI 开发者、研究人员以及希望集成实时对话功能的后端工程师使用。无论是构建智能语音助手、实时视频聊天室，还是连接大模型（如 Gemini、OpenAI）的语音接口，fastrtc 都能提供简洁高效的解决方案。\n\n其核心亮点在于高度自动化的功能设计：内置了语音活动检测与自动轮转机制，能精准识别用户何时说完话并触发回应；同时提供开箱即用的 Gradio 界面，一键启动网页端演示。此外，fastrtc 具备极强的扩展性，可无缝挂载到 FastAPI 应用中，支持自定义前端开发，甚至能通过简单命令生成临时电话号码，直接打通传统电话网络。通过 fastrtc，创作者可以将更多精力集中在业务逻辑与创新体验上，而非繁琐的通信协议实现。","\u003Cdiv style='text-align: center; margin-bottom: 1rem; display: flex; justify-content: center; align-items: center;'>\n    \u003Ch1 style='color: white; margin: 0;'>FastRTC\u003C\u002Fh1>\n    \u003Cimg 
src='https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fgradio-app_fastrtc_readme_dca7d2e62f26.png'\n         alt=\"FastRTC Logo\" \n         style=\"margin-right: 10px;\">\n\u003C\u002Fdiv>\n\n\u003Cdiv style=\"display: flex; flex-direction: row; justify-content: center\">\n\u003Cimg style=\"display: block; padding-right: 5px; height: 20px;\" alt=\"Static Badge\" src=\"https:\u002F\u002Fimg.shields.io\u002Fpypi\u002Fv\u002Ffastrtc\"> \n\u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\" target=\"_blank\">\u003Cimg alt=\"Static Badge\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fgithub-white?logo=github&logoColor=black\">\u003C\u002Fa>\n\u003C\u002Fdiv>\n\n\u003Ch3 style='text-align: center'>\nThe Real-Time Communication Library for Python. \n\u003C\u002Fh3>\n\nTurn any python function into a real-time audio and video stream over WebRTC or WebSockets.\n\n## Installation\n\n```bash\npip install fastrtc\n```\n\nto use built-in pause detection (see [ReplyOnPause](https:\u002F\u002Ffastrtc.org\u002Fuserguide\u002Faudio\u002F#reply-on-pause)), and text to speech (see [Text To Speech](https:\u002F\u002Ffastrtc.org\u002Fuserguide\u002Faudio\u002F#text-to-speech)), install the `vad` and `tts` extras:\n\n```bash\npip install \"fastrtc[vad, tts]\"\n```\n\n## Key Features\n\n- 🗣️ Automatic Voice Detection and Turn Taking built-in, only worry about the logic for responding to the user.\n- 💻 Automatic UI - Use the `.ui.launch()` method to launch the webRTC-enabled built-in Gradio UI.\n- 🔌 Automatic WebRTC Support - Use the `.mount(app)` method to mount the stream on a FastAPI app and get a webRTC endpoint for your own frontend! \n- ⚡️ Websocket Support - Use the `.mount(app)` method to mount the stream on a FastAPI app and get a websocket endpoint for your own frontend! 
\n- 📞 Automatic Telephone Support - Use the `fastphone()` method of the stream to launch the application and get a free temporary phone number!\n- 🤖 Completely customizable backend - A `Stream` can easily be mounted on a FastAPI app so you can easily extend it to fit your production application. See the [Talk To Claude](https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Ffastrtc\u002Ftalk-to-claude) demo for an example of how to serve a custom JS frontend.\n\n## Docs\n\n[https:\u002F\u002Ffastrtc.org](https:\u002F\u002Ffastrtc.org)\n\n## Examples\nSee the [Cookbook](https:\u002F\u002Ffastrtc.org\u002Fcookbook\u002F) for examples of how to use the library.\n\n\u003Ctable>\n\u003Ctr>\n\u003Ctd width=\"50%\">\n\u003Ch3>🗣️👀 Gemini Audio Video Chat\u003C\u002Fh3>\n\u003Cp>Stream BOTH your webcam video and audio feeds to Google Gemini. You can also upload images to augment your conversation!\u003C\u002Fp>\n\u003Cvideo width=\"100%\" src=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F9636dc97-4fee-46bb-abb8-b92e69c08c71\" controls>\u003C\u002Fvideo>\n\u003Cp>\n\u003Ca href=\"https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Ffreddyaboulton\u002Fgemini-audio-video-chat\">Demo\u003C\u002Fa> |\n\u003Ca href=\"https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Ffreddyaboulton\u002Fgemini-audio-video-chat\u002Fblob\u002Fmain\u002Fapp.py\">Code\u003C\u002Fa>\n\u003C\u002Fp>\n\u003C\u002Ftd>\n\u003Ctd width=\"50%\">\n\u003Ch3>🗣️ Google Gemini Real Time Voice API\u003C\u002Fh3>\n\u003Cp>Talk to Gemini in real time using Google's voice API.\u003C\u002Fp>\n\u003Cvideo width=\"100%\" src=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002Fea6d18cb-8589-422b-9bba-56332d9f61de\" controls>\u003C\u002Fvideo>\n\u003Cp>\n\u003Ca href=\"https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Ffastrtc\u002Ftalk-to-gemini\">Demo\u003C\u002Fa> |\n\u003Ca 
href=\"https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Ffastrtc\u002Ftalk-to-gemini\u002Fblob\u002Fmain\u002Fapp.py\">Code\u003C\u002Fa>\n\u003C\u002Fp>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\n\u003Ctr>\n\u003Ctd width=\"50%\">\n\u003Ch3>🗣️ OpenAI Real Time Voice API\u003C\u002Fh3>\n\u003Cp>Talk to ChatGPT in real time using OpenAI's voice API.\u003C\u002Fp>\n\u003Cvideo width=\"100%\" src=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F178bdadc-f17b-461a-8d26-e915c632ff80\" controls>\u003C\u002Fvideo>\n\u003Cp>\n\u003Ca href=\"https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Ffastrtc\u002Ftalk-to-openai\">Demo\u003C\u002Fa> |\n\u003Ca href=\"https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Ffastrtc\u002Ftalk-to-openai\u002Fblob\u002Fmain\u002Fapp.py\">Code\u003C\u002Fa>\n\u003C\u002Fp>\n\u003C\u002Ftd>\n\u003Ctd width=\"50%\">\n\u003Ch3>🤖 Hello Computer\u003C\u002Fh3>\n\u003Cp>Say computer before asking your question!\u003C\u002Fp>\n\u003Cvideo width=\"100%\" src=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002Fafb2a3ef-c1ab-4cfb-872d-578f895a10d5\" controls>\u003C\u002Fvideo>\n\u003Cp>\n\u003Ca href=\"https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Ffastrtc\u002Fhello-computer\">Demo\u003C\u002Fa> |\n\u003Ca href=\"https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Ffastrtc\u002Fhello-computer\u002Fblob\u002Fmain\u002Fapp.py\">Code\u003C\u002Fa>\n\u003C\u002Fp>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\n\u003Ctr>\n\u003Ctd width=\"50%\">\n\u003Ch3>🤖 Llama Code Editor\u003C\u002Fh3>\n\u003Cp>Create and edit HTML pages with just your voice! 
Powered by SambaNova systems.\u003C\u002Fp>\n\u003Cvideo width=\"100%\" src=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F98523cf3-dac8-4127-9649-d91a997e3ef5\" controls>\u003C\u002Fvideo>\n\u003Cp>\n\u003Ca href=\"https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Ffastrtc\u002Fllama-code-editor\">Demo\u003C\u002Fa> |\n\u003Ca href=\"https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Ffastrtc\u002Fllama-code-editor\u002Fblob\u002Fmain\u002Fapp.py\">Code\u003C\u002Fa>\n\u003C\u002Fp>\n\u003C\u002Ftd>\n\u003Ctd width=\"50%\">\n\u003Ch3>🗣️ Talk to Claude\u003C\u002Fh3>\n\u003Cp>Use the Anthropic and Play.Ht APIs to have an audio conversation with Claude.\u003C\u002Fp>\n\u003Cvideo width=\"100%\" src=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002Ffb6ef07f-3ccd-444a-997b-9bc9bdc035d3\" controls>\u003C\u002Fvideo>\n\u003Cp>\n\u003Ca href=\"https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Ffastrtc\u002Ftalk-to-claude\">Demo\u003C\u002Fa> |\n\u003Ca href=\"https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Ffastrtc\u002Ftalk-to-claude\u002Fblob\u002Fmain\u002Fapp.py\">Code\u003C\u002Fa>\n\u003C\u002Fp>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\n\u003Ctr>\n\u003Ctd width=\"50%\">\n\u003Ch3>🎵 Whisper Transcription\u003C\u002Fh3>\n\u003Cp>Have whisper transcribe your speech in real time!\u003C\u002Fp>\n\u003Cvideo width=\"100%\" src=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F87603053-acdc-4c8a-810f-f618c49caafb\" controls>\u003C\u002Fvideo>\n\u003Cp>\n\u003Ca href=\"https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Ffastrtc\u002Fwhisper-realtime\">Demo\u003C\u002Fa> |\n\u003Ca href=\"https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Ffastrtc\u002Fwhisper-realtime\u002Fblob\u002Fmain\u002Fapp.py\">Code\u003C\u002Fa>\n\u003C\u002Fp>\n\u003C\u002Ftd>\n\u003Ctd width=\"50%\">\n\u003Ch3>📷 Yolov10 Object Detection\u003C\u002Fh3>\n\u003Cp>Run the Yolov10 model on a user webcam stream in real 
time!\u003C\u002Fp>\n\u003Cvideo width=\"100%\" src=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002Ff82feb74-a071-4e81-9110-a01989447ceb\" controls>\u003C\u002Fvideo>\n\u003Cp>\n\u003Ca href=\"https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Ffastrtc\u002Fobject-detection\">Demo\u003C\u002Fa> |\n\u003Ca href=\"https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Ffastrtc\u002Fobject-detection\u002Fblob\u002Fmain\u002Fapp.py\">Code\u003C\u002Fa>\n\u003C\u002Fp>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\n\u003Ctr>\n\u003Ctd width=\"50%\">\n\u003Ch3>🗣️ Kyutai Moshi\u003C\u002Fh3>\n\u003Cp>Kyutai's moshi is a novel speech-to-speech model for modeling human conversations.\u003C\u002Fp>\n\u003Cvideo width=\"100%\" src=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002Fbecc7a13-9e89-4a19-9df2-5fb1467a0137\" controls>\u003C\u002Fvideo>\n\u003Cp>\n\u003Ca href=\"https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Ffreddyaboulton\u002Ftalk-to-moshi\">Demo\u003C\u002Fa> |\n\u003Ca href=\"https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Ffreddyaboulton\u002Ftalk-to-moshi\u002Fblob\u002Fmain\u002Fapp.py\">Code\u003C\u002Fa>\n\u003C\u002Fp>\n\u003C\u002Ftd>\n\u003Ctd width=\"50%\">\n\u003Ch3>🗣️ Hello Llama: Stop Word Detection\u003C\u002Fh3>\n\u003Cp>A code editor built with Llama 3.3 70b that is triggered by the phrase \"Hello Llama\". 
Build a Siri-like coding assistant in 100 lines of code!\u003C\u002Fp>\n\u003Cvideo width=\"100%\" src=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F3e10cb15-ff1b-4b17-b141-ff0ad852e613\" controls>\u003C\u002Fvideo>\n\u003Cp>\n\u003Ca href=\"https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Ffreddyaboulton\u002Fhey-llama-code-editor\">Demo\u003C\u002Fa> |\n\u003Ca href=\"https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Ffreddyaboulton\u002Fhey-llama-code-editor\u002Fblob\u002Fmain\u002Fapp.py\">Code\u003C\u002Fa>\n\u003C\u002Fp>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003C\u002Ftable>\n\n## Usage\n\nThis is a shortened version of the official [usage guide](https:\u002F\u002Ffreddyaboulton.github.io\u002Fgradio-webrtc\u002Fuser-guide\u002F). \n\n- `.ui.launch()`: Launch a built-in UI for easily testing and sharing your stream. Built with [Gradio](https:\u002F\u002Fwww.gradio.app\u002F).\n- `.fastphone()`: Get a free temporary phone number to call into your stream. Hugging Face token required.\n- `.mount(app)`: Mount the stream on a [FastAPI](https:\u002F\u002Ffastapi.tiangolo.com\u002F) app. 
Perfect for integrating with your already existing production system.\n\n\n## Quickstart\n\n### Echo Audio\n\n```python\nfrom fastrtc import Stream, ReplyOnPause\nimport numpy as np\n\ndef echo(audio: tuple[int, np.ndarray]):\n    # The function will be passed the audio until the user pauses\n    # Implement any iterator that yields audio\n    # See \"LLM Voice Chat\" for a more complete example\n    yield audio\n\nstream = Stream(\n    handler=ReplyOnPause(echo),\n    modality=\"audio\", \n    mode=\"send-receive\",\n)\n```\n\n### LLM Voice Chat\n\n```py\nfrom fastrtc import (\n    ReplyOnPause, AdditionalOutputs, Stream,\n    audio_to_bytes, aggregate_bytes_to_16bit\n)\nimport gradio as gr\nimport numpy as np\nfrom groq import Groq\nimport anthropic\nfrom elevenlabs import ElevenLabs\n\ngroq_client = Groq()\nclaude_client = anthropic.Anthropic()\ntts_client = ElevenLabs()\n\n\n# See \"Talk to Claude\" in Cookbook for an example of how to keep\n# track of the chat history.\ndef response(\n    audio: tuple[int, np.ndarray],\n):\n    prompt = groq_client.audio.transcriptions.create(\n        file=(\"audio-file.mp3\", audio_to_bytes(audio)),\n        model=\"whisper-large-v3-turbo\",\n        response_format=\"verbose_json\",\n    ).text\n    response = claude_client.messages.create(\n        model=\"claude-3-5-haiku-20241022\",\n        max_tokens=512,\n        messages=[{\"role\": \"user\", \"content\": prompt}],\n    )\n    response_text = \" \".join(\n        block.text\n        for block in response.content\n        if getattr(block, \"type\", None) == \"text\"\n    )\n    iterator = tts_client.text_to_speech.convert_as_stream(\n        text=response_text,\n        voice_id=\"JBFqnCBsd6RMkjVDRZzb\",\n        model_id=\"eleven_multilingual_v2\",\n        output_format=\"pcm_24000\",\n    )\n    for chunk in aggregate_bytes_to_16bit(iterator):\n        audio_array = np.frombuffer(chunk, dtype=np.int16).reshape(1, -1)\n        yield (24000, audio_array)\n\nstream = Stream(\n 
   modality=\"audio\",\n    mode=\"send-receive\",\n    handler=ReplyOnPause(response),\n)\n```\n\n### Webcam Stream\n\n```python\nfrom fastrtc import Stream\nimport numpy as np\n\n\ndef flip_vertically(image):\n    return np.flip(image, axis=0)\n\n\nstream = Stream(\n    handler=flip_vertically,\n    modality=\"video\",\n    mode=\"send-receive\",\n)\n```\n\n### Object Detection\n\n```python\nfrom fastrtc import Stream\nimport gradio as gr\nimport cv2\nfrom huggingface_hub import hf_hub_download\nfrom .inference import YOLOv10\n\nmodel_file = hf_hub_download(\n    repo_id=\"onnx-community\u002Fyolov10n\", filename=\"onnx\u002Fmodel.onnx\"\n)\n\n# git clone https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Ffastrtc\u002Fobject-detection\n# for the YOLOv10 implementation\nmodel = YOLOv10(model_file)\n\ndef detection(image, conf_threshold=0.3):\n    image = cv2.resize(image, (model.input_width, model.input_height))\n    new_image = model.detect_objects(image, conf_threshold)\n    return cv2.resize(new_image, (500, 500))\n\nstream = Stream(\n    handler=detection,\n    modality=\"video\", \n    mode=\"send-receive\",\n    additional_inputs=[\n        gr.Slider(minimum=0, maximum=1, step=0.01, value=0.3)\n    ]\n)\n```\n\n## Running the Stream\n\n### Gradio\n\n```py\nstream.ui.launch()\n```\n\n### Telephone (Audio Only)\n\n```py\nstream.fastphone()\n```\n\n### FastAPI\n\n```py\nfrom fastapi import FastAPI\nfrom fastapi.responses import HTMLResponse\n\napp = FastAPI()\nstream.mount(app)\n\n# Optional: Add routes\n@app.get(\"\u002F\")\nasync def _():\n    return HTMLResponse(content=open(\"index.html\").read())\n\n# uvicorn app:app --host 0.0.0.0 --port 8000\n```\n","\u003Cdiv style='text-align: center; margin-bottom: 1rem; display: flex; justify-content: center; align-items: center;'>\n    \u003Ch1 style='color: white; margin: 0;'>FastRTC\u003C\u002Fh1>\n    \u003Cimg src='https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fgradio-app_fastrtc_readme_dca7d2e62f26.png'\n         alt=\"FastRTC Logo\" \n         
style=\"margin-right: 10px;\">\n\u003C\u002Fdiv>\n\n\u003Cdiv style=\"display: flex; flex-direction: row; justify-content: center\">\n\u003Cimg style=\"display: block; padding-right: 5px; height: 20px;\" alt=\"Static Badge\" src=\"https:\u002F\u002Fimg.shields.io\u002Fpypi\u002Fv\u002Ffastrtc\"> \n\u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\" target=\"_blank\">\u003Cimg alt=\"Static Badge\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fgithub-white?logo=github&logoColor=black\">\u003C\u002Fa>\n\u003C\u002Fdiv>\n\n\u003Ch3 style='text-align: center'>\n面向 Python 的实时通信库。\n\u003C\u002Fh3>\n\n将任何 Python 函数转换为通过 WebRTC 或 WebSocket 实现的实时音视频流。\n\n## 安装\n\n```bash\npip install fastrtc\n```\n\n若需使用内置的静音检测功能（参见 [ReplyOnPause](https:\u002F\u002Ffastrtc.org\u002Fuserguide\u002Faudio\u002F#reply-on-pause)）以及文本转语音功能（参见 [Text To Speech](https:\u002F\u002Ffastrtc.org\u002Fuserguide\u002Faudio\u002F#text-to-speech)），请安装 `vad` 和 `tts` 附加组件：\n\n```bash\npip install \"fastrtc[vad, tts]\"\n```\n\n## 核心特性\n\n- 🗣️ 内置自动语音检测与发言权管理，您只需关注如何响应用户逻辑。\n- 💻 自动 UI — 使用 `.ui.launch()` 方法即可启动支持 WebRTC 的内置 Gradio 界面。\n- 🔌 自动 WebRTC 支持 — 使用 `.mount(app)` 方法可将流挂载到 FastAPI 应用上，从而为您的前端提供 WebRTC 端点！\n- ⚡️ WebSocket 支持 — 同样使用 `.mount(app)` 方法，您可以将流挂载到 FastAPI 应用上，获取适用于自定义前端的 WebSocket 端点！\n- 📞 自动电话支持 — 调用流的 `fastphone()` 方法即可启动应用，并获得一个免费的临时电话号码！\n- 🤖 完全可定制的后端 — `Stream` 对象可轻松挂载到 FastAPI 应用上，方便您根据生产需求进行扩展。以 [Talk To Claude](https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Ffastrtc\u002Ftalk-to-claude) 演示为例，展示了如何部署自定义 JS 前端。\n\n## 文档\n\n[https:\u002F\u002Ffastrtc.org](https:\u002F\u002Ffastrtc.org)\n\n## 示例\n\n更多使用示例，请参阅 [Cookbook](https:\u002F\u002Ffastrtc.org\u002Fcookbook\u002F)。\n\n\u003Ctable>\n\u003Ctr>\n\u003Ctd width=\"50%\">\n\u003Ch3>🗣️👀 Gemini 音视频聊天\u003C\u002Fh3>\n\u003Cp>将您的摄像头视频和音频同时传输至 Google Gemini。您还可以上传图片来丰富对话内容！\u003C\u002Fp>\n\u003Cvideo width=\"100%\" 
src=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F9636dc97-4fee-46bb-abb8-b92e69c08c71\" controls>\u003C\u002Fvideo>\n\u003Cp>\n\u003Ca href=\"https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Ffreddyaboulton\u002Fgemini-audio-video-chat\">演示\u003C\u002Fa> |\n\u003Ca href=\"https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Ffreddyaboulton\u002Fgemini-audio-video-chat\u002Fblob\u002Fmain\u002Fapp.py\">代码\u003C\u002Fa>\n\u003C\u002Fp>\n\u003C\u002Ftd>\n\u003Ctd width=\"50%\">\n\u003Ch3>🗣️ Google Gemini 实时语音 API\u003C\u002Fh3>\n\u003Cp>利用 Google 的语音 API 与 Gemini 进行实时对话。\u003C\u002Fp>\n\u003Cvideo width=\"100%\" src=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002Fea6d18cb-8589-422b-9bba-56332d9f61de\" controls>\u003C\u002Fvideo>\n\u003Cp>\n\u003Ca href=\"https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Ffastrtc\u002Ftalk-to-gemini\">演示\u003C\u002Fa> |\n\u003Ca href=\"https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Ffastrtc\u002Ftalk-to-gemini\u002Fblob\u002Fmain\u002Fapp.py\">代码\u003C\u002Fa>\n\u003C\u002Fp>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\n\u003Ctr>\n\u003Ctd width=\"50%\">\n\u003Ch3>🗣️ OpenAI 实时语音 API\u003C\u002Fh3>\n\u003Cp>使用 OpenAI 的语音 API 与 ChatGPT 进行实时对话。\u003C\u002Fp>\n\u003Cvideo width=\"100%\" src=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F178bdadc-f17b-461a-8d26-e915c632ff80\" controls>\u003C\u002Fvideo>\n\u003Cp>\n\u003Ca href=\"https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Ffastrtc\u002Ftalk-to-openai\">演示\u003C\u002Fa> |\n\u003Ca href=\"https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Ffastrtc\u002Ftalk-to-openai\u002Fblob\u002Fmain\u002Fapp.py\">代码\u003C\u002Fa>\n\u003C\u002Fp>\n\u003C\u002Ftd>\n\u003Ctd width=\"50%\">\n\u003Ch3>🤖 Hello Computer\u003C\u002Fh3>\n\u003Cp>提问前先说“computer”！\u003C\u002Fp>\n\u003Cvideo width=\"100%\" src=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002Fafb2a3ef-c1ab-4cfb-872d-578f895a10d5\" 
controls>\u003C\u002Fvideo>\n\u003Cp>\n\u003Ca href=\"https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Ffastrtc\u002Fhello-computer\">演示\u003C\u002Fa> |\n\u003Ca href=\"https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Ffastrtc\u002Fhello-computer\u002Fblob\u002Fmain\u002Fapp.py\">代码\u003C\u002Fa>\n\u003C\u002Fp>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\n\u003Ctr>\n\u003Ctd width=\"50%\">\n\u003Ch3>🤖 Llama 代码编辑器\u003C\u002Fh3>\n\u003Cp>仅凭语音即可创建和编辑 HTML 页面！由 SambaNova 系统提供支持。\u003C\u002Fp>\n\u003Cvideo width=\"100%\" src=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F98523cf3-dac8-4127-9649-d91a997e3ef5\" controls>\u003C\u002Fvideo>\n\u003Cp>\n\u003Ca href=\"https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Ffastrtc\u002Fllama-code-editor\">演示\u003C\u002Fa> |\n\u003Ca href=\"https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Ffastrtc\u002Fllama-code-editor\u002Fblob\u002Fmain\u002Fapp.py\">代码\u003C\u002Fa>\n\u003C\u002Fp>\n\u003C\u002Ftd>\n\u003Ctd width=\"50%\">\n\u003Ch3>🗣️ 与 Claude 对话\u003C\u002Fh3>\n\u003Cp>借助 Anthropic 和 Play.Ht 的 API，与 Claude 进行语音对话。\u003C\u002Fp>\n\u003Cvideo width=\"100%\" src=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002Ffb6ef07f-3ccd-444a-997b-9bc9bdc035d3\" controls>\u003C\u002Fvideo>\n\u003Cp>\n\u003Ca href=\"https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Ffastrtc\u002Ftalk-to-claude\">演示\u003C\u002Fa> |\n\u003Ca href=\"https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Ffastrtc\u002Ftalk-to-claude\u002Fblob\u002Fmain\u002Fapp.py\">代码\u003C\u002Fa>\n\u003C\u002Fp>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\n\u003Ctr>\n\u003Ctd width=\"50%\">\n\u003Ch3>🎵 Whisper 转录\u003C\u002Fh3>\n\u003Cp>让 Whisper 实时转录您的语音！\u003C\u002Fp>\n\u003Cvideo width=\"100%\" src=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F87603053-acdc-4c8a-810f-f618c49caafb\" controls>\u003C\u002Fvideo>\n\u003Cp>\n\u003Ca 
href=\"https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Ffastrtc\u002Fwhisper-realtime\">演示\u003C\u002Fa> |\n\u003Ca href=\"https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Ffastrtc\u002Fwhisper-realtime\u002Fblob\u002Fmain\u002Fapp.py\">代码\u003C\u002Fa>\n\u003C\u002Fp>\n\u003C\u002Ftd>\n\u003Ctd width=\"50%\">\n\u003Ch3>📷 Yolov10 目标检测\u003C\u002Fh3>\n\u003Cp>在用户的实时摄像头流上运行 Yolov10 模型。\u003C\u002Fp>\n\u003Cvideo width=\"100%\" src=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002Ff82feb74-a071-4e81-9110-a01989447ceb\" controls>\u003C\u002Fvideo>\n\u003Cp>\n\u003Ca href=\"https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Ffastrtc\u002Fobject-detection\">演示\u003C\u002Fa> |\n\u003Ca href=\"https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Ffastrtc\u002Fobject-detection\u002Fblob\u002Fmain\u002Fapp.py\">代码\u003C\u002Fa>\n\u003C\u002Fp>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\n\u003Ctr>\n\u003Ctd width=\"50%\">\n\u003Ch3>🗣️ Kyutai Moshi\u003C\u002Fh3>\n\u003Cp>Kyutai 的 Moshi 是一种用于模拟人类对话的新型语音到语音模型。\u003C\u002Fp>\n\u003Cvideo width=\"100%\" src=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002Fbecc7a13-9e89-4a19-9df2-5fb1467a0137\" controls>\u003C\u002Fvideo>\n\u003Cp>\n\u003Ca href=\"https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Ffreddyaboulton\u002Ftalk-to-moshi\">演示\u003C\u002Fa> |\n\u003Ca href=\"https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Ffreddyaboulton\u002Ftalk-to-moshi\u002Fblob\u002Fmain\u002Fapp.py\">代码\u003C\u002Fa>\n\u003C\u002Fp>\n\u003C\u002Ftd>\n\u003Ctd width=\"50%\">\n\u003Ch3>🗣️ Hello Llama：停用词检测\u003C\u002Fh3>\n\u003Cp>基于 Llama 3.3 70b 构建的代码编辑器，可通过“Hello Llama”这一短语触发。仅需 100 行代码，即可打造类似 Siri 的编程助手。\u003C\u002Fp>\n\u003Cvideo width=\"100%\" src=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F3e10cb15-ff1b-4b17-b141-ff0ad852e613\" controls>\u003C\u002Fvideo>\n\u003Cp>\n\u003Ca href=\"https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Ffreddyaboulton\u002Fhey-llama-code-editor\">演示\u003C\u002Fa> 
|\n\u003Ca href=\"https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Ffreddyaboulton\u002Fhey-llama-code-editor\u002Fblob\u002Fmain\u002Fapp.py\">代码\u003C\u002Fa>\n\u003C\u002Fp>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003C\u002Ftable>\n\n## 使用方法\n\n这是官方[使用指南](https:\u002F\u002Ffreddyaboulton.github.io\u002Fgradio-webrtc\u002Fuser-guide\u002F)的精简版。\n\n- `.ui.launch()`: 启动一个内置的 UI，方便测试和分享你的流。基于 [Gradio](https:\u002F\u002Fwww.gradio.app\u002F) 构建。\n- `.fastphone()`: 获取一个免费的临时电话号码，用于拨打到你的流中。需要 Hugging Face 的 Token。\n- `.mount(app)`: 将流挂载到一个 [FastAPI](https:\u002F\u002Ffastapi.tiangolo.com\u002F) 应用上。非常适合与你现有的生产系统集成。\n\n\n## 快速入门\n\n### 回声音频\n\n```python\nfrom fastrtc import Stream, ReplyOnPause\nimport numpy as np\n\ndef echo(audio: tuple[int, np.ndarray]):\n    # 该函数会持续接收音频，直到用户暂停\n    # 可以实现任何生成音频的迭代器\n    # 更完整的示例请参见“LLM 语音聊天”\n    yield audio\n\nstream = Stream(\n    handler=ReplyOnPause(echo),\n    modality=\"audio\", \n    mode=\"send-receive\",\n)\n```\n\n### LLM 语音聊天\n\n```py\nfrom fastrtc import (\n    ReplyOnPause, AdditionalOutputs, Stream,\n    audio_to_bytes, aggregate_bytes_to_16bit\n)\nimport gradio as gr\nimport numpy as np\nfrom groq import Groq\nimport anthropic\nfrom elevenlabs import ElevenLabs\n\ngroq_client = Groq()\nclaude_client = anthropic.Anthropic()\ntts_client = ElevenLabs()\n\n\n# 请参阅 Cookbook 中的“与 Claude 对话”部分，了解如何维护聊天历史。\ndef response(\n    audio: tuple[int, np.ndarray],\n):\n    prompt = groq_client.audio.transcriptions.create(\n        file=(\"audio-file.mp3\", audio_to_bytes(audio)),\n        model=\"whisper-large-v3-turbo\",\n        response_format=\"verbose_json\",\n    ).text\n    response = claude_client.messages.create(\n        model=\"claude-3-5-haiku-20241022\",\n        max_tokens=512,\n        messages=[{\"role\": \"user\", \"content\": prompt}],\n    )\n    response_text = \" \".join(\n        block.text\n        for block in response.content\n        if getattr(block, \"type\", None) == \"text\"\n    )\n    iterator = 
tts_client.text_to_speech.convert_as_stream(\n        text=response_text,\n        voice_id=\"JBFqnCBsd6RMkjVDRZzb\",\n        model_id=\"eleven_multilingual_v2\",\n        output_format=\"pcm_24000\",\n    )\n    for chunk in aggregate_bytes_to_16bit(iterator):\n        audio_array = np.frombuffer(chunk, dtype=np.int16).reshape(1, -1)\n        yield (24000, audio_array)\n\nstream = Stream(\n    modality=\"audio\",\n    mode=\"send-receive\",\n    handler=ReplyOnPause(response),\n)\n```\n\n### 网络摄像头流\n\n```python\nfrom fastrtc import Stream\nimport numpy as np\n\n\ndef flip_vertically(image):\n    return np.flip(image, axis=0)\n\n\nstream = Stream(\n    handler=flip_vertically,\n    modality=\"video\",\n    mode=\"send-receive\",\n)\n```\n\n### 物体检测\n\n```python\nfrom fastrtc import Stream\nimport gradio as gr\nimport cv2\nfrom huggingface_hub import hf_hub_download\nfrom .inference import YOLOv10\n\nmodel_file = hf_hub_download(\n    repo_id=\"onnx-community\u002Fyolov10n\", filename=\"onnx\u002Fmodel.onnx\"\n)\n\n# 可克隆 https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Ffastrtc\u002Fobject-detection 获取 YOLOv10 的实现\nmodel = YOLOv10(model_file)\n\ndef detection(image, conf_threshold=0.3):\n    image = cv2.resize(image, (model.input_width, model.input_height))\n    new_image = model.detect_objects(image, conf_threshold)\n    return cv2.resize(new_image, (500, 500))\n\nstream = Stream(\n    handler=detection,\n    modality=\"video\", \n    mode=\"send-receive\",\n    additional_inputs=[\n        gr.Slider(minimum=0, maximum=1, step=0.01, value=0.3)\n    ]\n)\n```\n\n## 运行流\n\n运行方式如下：\n\n### Gradio\n\n```py\nstream.ui.launch()\n```\n\n### 电话（仅音频）\n\n```py\nstream.fastphone()\n```\n\n### FastAPI\n\n```py\nfrom fastapi import FastAPI\nfrom fastapi.responses import HTMLResponse\n\napp = FastAPI()\nstream.mount(app)\n\n# 可选：添加路由\n@app.get(\"\u002F\")\nasync def _():\n    return HTMLResponse(content=open(\"index.html\").read())\n\n# 使用 uvicorn app:app --host 0.0.0.0 --port 8000 启动应用\n```","# FastRTC 快速上手指南\n\nFastRTC 是一个用于 Python 
的实时通信库，可将任意 Python 函数快速转换为基于 WebRTC 或 WebSocket 的实时音视频流。它内置了语音检测、自动 UI 生成及电话接入功能，非常适合构建实时 AI 交互应用。\n\n## 环境准备\n\n*   **系统要求**：支持 Windows、macOS 和 Linux。\n*   **Python 版本**：建议 Python 3.9 及以上版本。\n*   **前置依赖**：\n    *   若需使用内置的暂停检测（Voice Activity Detection）和文本转语音（TTS）功能，需安装额外依赖包。\n    *   国内开发者如遇网络问题，建议使用国内镜像源加速安装。\n\n## 安装步骤\n\n### 1. 基础安装\n仅安装核心功能：\n```bash\npip install fastrtc -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple\n```\n\n### 2. 完整功能安装\n推荐安装包含 `vad` (语音活动检测) 和 `tts` (文本转语音) 的完整版本，以启用“暂停回复”等高级功能：\n```bash\npip install \"fastrtc[vad, tts]\" -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple\n```\n\n## 基本使用\n\nFastRTC 的核心是通过 `Stream` 类包装你的处理函数。以下是三个最典型的使用场景。\n\n### 场景一：音频回声测试 (最简单示例)\n此示例将用户输入的音频直接返回，用于测试连通性。它使用了 `ReplyOnPause` 处理器，即检测到用户停止说话后才开始处理。\n\n```python\nfrom fastrtc import Stream, ReplyOnPause\nimport numpy as np\n\ndef echo(audio: tuple[int, np.ndarray]):\n    # 函数接收音频数据直到用户暂停\n    # 直接 yield 返回相同的音频\n    yield audio\n\nstream = Stream(\n    handler=ReplyOnPause(echo),\n    modality=\"audio\", \n    mode=\"send-receive\",\n)\n\n# 启动内置的 Gradio UI 进行测试\nif __name__ == \"__main__\":\n    stream.ui.launch()\n```\n\n### 场景二：实时视频处理\n此示例读取摄像头画面，将其垂直翻转后实时推流。\n\n```python\nfrom fastrtc import Stream\nimport numpy as np\n\ndef flip_vertically(image):\n    # image 是 numpy 数组\n    return np.flip(image, axis=0)\n\nstream = Stream(\n    handler=flip_vertically,\n    modality=\"video\",\n    mode=\"send-receive\",\n)\n\nif __name__ == \"__main__\":\n    stream.ui.launch()\n```\n\n### 场景三：集成 LLM 语音对话\n结合 Groq (Whisper)、Anthropic (Claude) 和 ElevenLabs (TTS) 实现完整的语音对话流程。\n\n```python\nfrom fastrtc import (\n    ReplyOnPause, Stream,\n    audio_to_bytes, aggregate_bytes_to_16bit\n)\nimport numpy as np\n# 请确保已安装相关 SDK: pip install groq anthropic elevenlabs\nfrom groq import Groq\nimport anthropic\nfrom elevenlabs import ElevenLabs\n\n# 初始化客户端 (请设置环境变量 API KEY)\ngroq_client = Groq()\nclaude_client = anthropic.Anthropic()\ntts_client = 
ElevenLabs()\n\ndef response(audio: tuple[int, np.ndarray]):\n    # 1. 语音转文字 (Whisper)\n    prompt = groq_client.audio.transcriptions.create(\n        file=(\"audio-file.mp3\", audio_to_bytes(audio)),\n        model=\"whisper-large-v3-turbo\",\n        response_format=\"verbose_json\",\n    ).text\n    \n    # 2. 获取 LLM 回复 (Claude)\n    response = claude_client.messages.create(\n        model=\"claude-3-5-haiku-20241022\",\n        max_tokens=512,\n        messages=[{\"role\": \"user\", \"content\": prompt}],\n    )\n    response_text = \" \".join(\n        block.text\n        for block in response.content\n        if getattr(block, \"type\", None) == \"text\"\n    )\n    \n    # 3. 文字转语音 (ElevenLabs) 并流式返回\n    iterator = tts_client.text_to_speech.convert_as_stream(\n        text=response_text,\n        voice_id=\"JBFqnCBsd6RMkjVDRZzb\",\n        model_id=\"eleven_multilingual_v2\",\n        output_format=\"pcm_24000\"\n    )\n    \n    for chunk in aggregate_bytes_to_16bit(iterator):\n        audio_array = np.frombuffer(chunk, dtype=np.int16).reshape(1, -1)\n        yield (24000, audio_array)\n\nstream = Stream(\n    modality=\"audio\",\n    mode=\"send-receive\",\n    handler=ReplyOnPause(response),\n)\n\nif __name__ == \"__main__\":\n    stream.ui.launch()\n```\n\n## 运行方式\n\n创建 `stream` 对象后，可通过以下三种方式运行：\n\n1.  **Web UI 模式 (推荐开发调试)**\n    启动内置的 Gradio 界面，直接在浏览器中测试。\n    ```python\n    stream.ui.launch()\n    ```\n\n2.  **电话接入模式 (仅限音频)**\n    获取一个免费的临时电话号码，拨打该号码即可与你的程序交互（需要 Hugging Face Token）。\n    ```python\n    stream.fastphone()\n    ```\n\n3.  
**生产部署模式 (FastAPI)**\n    将流挂载到 FastAPI 应用，以便集成到现有的前端或系统中。\n    ```python\n    from fastapi import FastAPI\n    from fastapi.responses import HTMLResponse\n\n    app = FastAPI()\n    stream.mount(app)\n\n    # 可选：添加自定义前端路由\n    @app.get(\"\u002F\")\n    async def _():\n        return HTMLResponse(content=\"\u003Ch1>FastRTC Service Running\u003C\u002Fh1>\")\n\n    # 启动命令: uvicorn app:app --host 0.0.0.0 --port 8000\n    ```","一家初创团队正在开发一款基于大模型的实时口语陪练应用，需要让用户通过网页或电话与 AI 进行低延迟的双向语音对话。\n\n### 没有 fastrtc 时\n- 开发者需手动配置复杂的 WebRTC 信令服务器和处理音视频流编解码，耗费数周时间搭建基础架构。\n- 难以实现精准的“语音停顿检测”，导致 AI 经常在用户话未说完时就抢话，或回应延迟过高，对话体验生硬。\n- 若要支持电话接入，必须额外购买昂贵的第三方通信服务（如 Twilio）并编写大量适配代码。\n- 前端展示需要单独开发 React\u002FVue 界面来捕获麦克风权限并推流，无法快速验证后端逻辑。\n\n### 使用 fastrtc 后\n- 只需定义一个普通的 Python 函数处理对话逻辑，fastrtc 自动将其转换为标准的 WebRTC 或 WebSocket 流，半天即可完成部署。\n- 内置自动语音检测（VAD）和轮次切换机制，AI 能精准识别用户何时说完，实现自然流畅的“打断”与“回应”。\n- 调用 `fastphone()` 方法即可一键生成免费临时电话号码，无需额外集成即可直接支持电话端访问。\n- 利用 `.ui.launch()` 直接启动内置的 Gradio 界面，立即获得具备音视频采集能力的测试前端，大幅加速原型迭代。\n\nfastrtc 将原本需要全栈团队耗时数月的实时通信基建，简化为几行 Python 代码，让开发者能专注于核心对话逻辑而非底层传输协议。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fgradio-app_fastrtc_9635e4a1.png","gradio-app","Gradio","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002Fgradio-app_2a90de03.png","Delightfully easy-to-use open-source tools that make machine learning easier and more accessible",null,"admin@gradio.app","www.gradio.app","https:\u002F\u002Fgithub.com\u002Fgradio-app",[81,85,89,93,97,101],{"name":82,"color":83,"percentage":84},"JavaScript","#f1e05a",92.3,{"name":86,"color":87,"percentage":88},"Python","#3572A5",5.7,{"name":90,"color":91,"percentage":92},"Svelte","#ff3e00",1.7,{"name":94,"color":95,"percentage":96},"TypeScript","#3178c6",0.2,{"name":98,"color":99,"percentage":100},"HTML","#e34c26",0.1,{"name":102,"color":103,"percentage":100},"Just","#384d54",4577,430,"2026-04-20T06:01:23","MIT","未说明",{"notes":110,"python":108,"dependencies":111},"该工具是一个实时通信库，可将 Python 函数转换为 WebRTC 或 WebSocket 
音频\u002F视频流。基础安装仅需 'pip install fastrtc'；若需使用内置的语音活动检测 (VAD) 和文本转语音 (TTS) 功能，需安装额外依赖 'fastrtc[vad, tts]'。支持通过 Gradio 快速启动 UI、集成到 FastAPI 后端或获取临时电话号码进行测试。具体运行资源取决于用户在其处理函数中调用的模型（如示例中的 Whisper、YOLOv10 或大语言模型），README 本身未规定统一的硬件门槛。",[64,112,113,114,115,116,117,118,119],"gradio","fastapi","numpy","groq","anthropic","elevenlabs","opencv-python (cv2)","huggingface_hub",[121,36,14],"音频",[123,124,125,126,127,128,129,130],"artificial-intelligence","llm","python","real-time","speech-to-text","text-to-speech","hacktoberfest","hacktoberfest2025","2026-03-27T02:49:30.150509","2026-04-20T22:35:15.610364",[],[135,140,145,150,155,160,165,170,175,180,185,190,195,200,205,210,215,220,225,230],{"id":136,"version":137,"summary_zh":138,"released_at":139},360804,"0.0.34","## 变更内容\n* 功能：支持将 OpenAPI 标签传递给已挂载的端点，由 @AleksanderWWW 在 https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fpull\u002F396 中完成。\n* 将部分打印语句改为使用日志记录器。由 @FabienDanieau 在 https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fpull\u002F397 中完成。\n* 由 @freddyaboulton 发布了一个减少了打印语句的版本，详情见 https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fpull\u002F413。\n\n## 新贡献者\n* @AleksanderWWW 在 https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fpull\u002F396 中完成了首次贡献。\n* @FabienDanieau 在 https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fpull\u002F397 中完成了首次贡献。\n\n**完整变更日志**：https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fcompare\u002F0.0.33...0.0.34\n[fastrtc-0.0.34.tar.gz](https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Ffiles\u002F23728389\u002Ffastrtc-0.0.34.tar.gz)","2025-11-24T17:16:30",{"id":141,"version":142,"summary_zh":143,"released_at":144},360805,"0.0.33","## 变更内容\n* 由 @marcusvaltonen 在 https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fpull\u002F393 中处理 Gradio lite，适用于 5.46.0 之前的版本\n* 由 @freddyaboulton 发布版本 0.0.33，在 https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fpull\u002F394 中完成\n\n\n**完整变更日志**: 
https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fcompare\u002F0.0.32...0.0.33","2025-09-17T12:05:07",{"id":146,"version":147,"summary_zh":148,"released_at":149},360806,"0.0.32","## 变更内容\n* 由 @maradini77 在 https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fpull\u002F375 中更新了 LICENSE 文件\n* 由 @freddyaboulton 在 https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fpull\u002F377 中修复了异步客户端因重复使用同一实例而导致的 bug\n* 由 @freddyaboulton 在 https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fpull\u002F378 中修复了设置输入时的 bug\n* 由 @freddyaboulton 在 https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fpull\u002F379 中修复了全屏模式下 InteractiveAudio 的 Textbox 变体问题\n* 由 @dawoodkhan82 在 https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fpull\u002F374 中修复了 `full_screen` 模式的图标大小问题，并修复了相关演示\n* 由 @freddyaboulton 在 https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fpull\u002F383 中放宽了 Numpy Pin 的限制\n* 由 @freddyaboulton 在 https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fpull\u002F384 中发布了 0.0.32 版本\n\n## 新贡献者\n* @maradini77 在 https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fpull\u002F375 中完成了首次贡献\n\n**完整变更日志**: https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fcompare\u002F0.0.30...0.0.32","2025-09-02T16:23:10",{"id":151,"version":152,"summary_zh":153,"released_at":154},360807,"0.0.30","## 变更内容\n* 由 @dawoodkhan82 在 https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fpull\u002F364 中实现的沉浸式 UI\n* 由 @freddyaboulton 在 https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fpull\u002F370 中将新的编译后 JavaScript 文件添加到源代码中\n* 由 @freddyaboulton 在 https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fpull\u002F372 中添加 GPT-OSS 示例并进行了一些调整\n\n## 新贡献者\n* @dawoodkhan82 在 https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fpull\u002F364 中完成了首次贡献\n\n**完整变更日志**: 
https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fcompare\u002F0.0.29...0.0.30","2025-08-12T10:12:46",{"id":156,"version":157,"summary_zh":158,"released_at":159},360808,"0.0.29","## 变更内容\n* 可以通过 Python 代码关闭连接，由 @freddyaboulton 在 https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fpull\u002F360 中实现\n* 版本升级至 0.0.29，由 @freddyaboulton 在 https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fpull\u002F361 中完成\n\n\n**完整变更日志**: https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fcompare\u002F0.0.28...0.0.29","2025-07-07T08:56:18",{"id":161,"version":162,"summary_zh":163,"released_at":164},360809,"0.0.28","修复了一个导致发送-接收 + 视频功能无法正常工作的 bug。\n\n## 变更内容\n* 由 @freddyaboulton 在 https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fpull\u002F350 中修复了交互式视频功能。\n* 版本 28，由 @freddyaboulton 在 https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fpull\u002F351 中发布。\n\n\n**完整变更日志**: https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fcompare\u002F0.0.27...0.0.28","2025-06-13T16:28:00",{"id":166,"version":167,"summary_zh":168,"released_at":169},360810,"0.0.27","## 变更内容\n* 在文档中添加集成文本框及空格，由 @freddyaboulton 在 https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fpull\u002F343 中完成\n* 杂项：分发 starting_recording 和 stop_recording 事件。由 @AlbertMingXu 在 https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fpull\u002F342 中完成\n* 版本 0.0.27，由 @freddyaboulton 在 https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fpull\u002F348 中发布\n\n## 新贡献者\n* @AlbertMingXu 在 https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fpull\u002F342 中完成了首次贡献\n\n**完整变更日志**：https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fcompare\u002F0.0.26...0.0.27","2025-06-12T22:53:02",{"id":171,"version":172,"summary_zh":173,"released_at":174},360811,"0.0.26","## 变更内容\n* 在未检测到 VAD 暂停的情况下，将语音按 s 分块 — @sofi444，见 https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fpull\u002F328\n* 修复拼写错误 — @omahs，见 
https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fpull\u002F330\n* 修复 Gemini 示例 — @freddyaboulton，见 https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fpull\u002F332\n* 如果可用，则将 WebSocket 传递到上下文中 — @freddyaboulton，见 https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fpull\u002F329\n* 新特性：将 Whisper CPP 语音转文本的 FastRTC 版本添加到文档中 — @mahimairaja，见 https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fpull\u002F324\n* 添加文本模式 — @freddyaboulton，见 https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fpull\u002F321\n* 【修复】在缺少 hf_token 的情况下，允许使用 Cloudflare 令牌 — @sblair12，见 https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fpull\u002F338\n* 版本 0.0.26 — @freddyaboulton，见 https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fpull\u002F339\n\n## 新贡献者\n* @omahs 在 https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fpull\u002F330 中完成了首次贡献\n* @sblair12 在 https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fpull\u002F338 中完成了首次贡献\n\n**完整变更日志**：https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fcompare\u002F0.0.25...0.0.26","2025-06-05T22:58:58",{"id":176,"version":177,"summary_zh":178,"released_at":179},360812,"0.0.25","## 变更内容\n* 由 @freddyaboulton 在 https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fpull\u002F319 中实现的启动日志抑制功能\n* 版本 0.0.25，由 @freddyaboulton 在 https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fpull\u002F322 中发布\n\n\n**完整变更日志**: https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fcompare\u002F0.0.24...0.0.25","2025-05-21T15:10:02",{"id":181,"version":182,"summary_zh":183,"released_at":184},360813,"0.0.24","## 变更内容\n* @freddyaboulton 在 https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fpull\u002F310 中增加了超时时间\n* 修复：即使缺少 HF_TOKEN，也能使用 CLOUDFLARE_TURN_KEY_* 环境变量，由 @tedmeftah 在 https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fpull\u002F307 中完成\n* 修复静态媒体组件中的类型定义，由 @freddyaboulton 在 
https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fpull\u002F311 中完成\n* 修复 UI 中的 WebRTC 错误显示及示例代码，由 @freddyaboulton 在 https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fpull\u002F313 中完成\n\n## 新贡献者\n* @tedmeftah 在 https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fpull\u002F307 中完成了首次贡献\n\n**完整变更日志**: https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fcompare\u002F0.0.23...0.0.24","2025-05-13T16:13:06",{"id":186,"version":187,"summary_zh":188,"released_at":189},360814,"0.0.23","## What's Changed\r\n* Update text_to_speech_gallery.md by @Shubham-Rasal in https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fpull\u002F296\r\n* Fixed path for `telephone\u002Fhandler` in `handle_incoming_call` by @amanchauhan11 in https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fpull\u002F280\r\n* Fix TURN credentials for interactive video + other Gemini Audio Video demo tweaks by @freddyaboulton in https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fpull\u002F297\r\n* Add first-class support for Cartesia text-to-speech by @freddyaboulton in https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fpull\u002F298\r\n* Add ability to Hide Title in Built-in UI + llama 4 cartesia tweaks by @freddyaboulton in https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fpull\u002F299\r\n* Release version 0.0.23 by @freddyaboulton in https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fpull\u002F300\r\n\r\n## New Contributors\r\n* @Shubham-Rasal made their first contribution in https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fpull\u002F296\r\n* @amanchauhan11 made their first contribution in https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fpull\u002F280\r\n\r\n**Full Changelog**: 
https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fcompare\u002F0.0.22...0.0.23","2025-04-23T20:24:16",{"id":191,"version":192,"summary_zh":193,"released_at":194},360822,"0.0.14","## Main Features\r\nMicrophone Muting\r\n![mute_mic](https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F54897379-8df9-4bca-86da-29f68461c3af)\r\n\r\nAbility to load community VAD and Text-to-Speech Models\r\n\r\n## What's Changed\r\n* Adding nextjs + 11labs + openai streaming demo by @rohanprichard in https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F139\r\n* Add Method for loading community Vad Models by @freddyaboulton in https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F136\r\n* Community STT models by @freddyaboulton in https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F147\r\n* Community stt models by @freddyaboulton in https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F149\r\n* fix:  unused user-provided Silero option by @CuriousMonkey7 in https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F150\r\n* Raise WebRTC Errors in \"Receive\" case by @freddyaboulton in https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F151\r\n* Raise error if code in any part of video processing fails by @freddyaboulton in https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F153\r\n* fix: ensure 'model'  is copied in ReplyOnPause.copy() by @CuriousMonkey7 in https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F155\r\n* Fix Warning in Advanced Configuration by @freddyaboulton in https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F157\r\n* Add microphone mute by @freddyaboulton in https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F158\r\n* Added to gallery by @Codeblockz in 
https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F159\r\n* Copy Model on ReplyOnStopWords by @freddyaboulton in https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F160\r\n* Add docs on how to contribute by @freddyaboulton in https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F161\r\n* Release 0.0.14 by @freddyaboulton in https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F164\r\n\r\n## New Contributors\r\n* @rohanprichard made their first contribution in https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F139\r\n* @Codeblockz made their first contribution in https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F159\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fcompare\u002F0.0.13...0.0.14","2025-03-11T17:07:59",{"id":196,"version":197,"summary_zh":198,"released_at":199},360815,"0.0.22","## What's Changed\r\n* Fix audio type conversion by @vvolhejn in https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fpull\u002F259\r\n* Add llama 4 to cookbook by @freddyaboulton in https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fpull\u002F267\r\n* Update old links in pyproject.toml by @marcusvaltonen in https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fpull\u002F270\r\n* Fix warning\u002Ferror messages in gradio UI by @freddyaboulton in https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fpull\u002F275\r\n* Add docs for outbound calls with twilio by @shaon72 in https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fpull\u002F273\r\n* Be able to get current context from a websocket connection by @freddyaboulton in https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fpull\u002F276\r\n* OpenAI Demo Minor Fixes by @freddyaboulton in https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fpull\u002F279\r\n* Add a 
Medical Agent Example to showcase function calling by @freddyaboulton in https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fpull\u002F281\r\n* Change Repo URL by @freddyaboulton in https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fpull\u002F282\r\n* Set ice candidates server by @freddyaboulton in https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fpull\u002F285\r\n* Fix Websocket Client Processing by @freddyaboulton in https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fpull\u002F286\r\n* Fix websocket interruption by @freddyaboulton in https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fpull\u002F291\r\n* Release by @freddyaboulton in https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fpull\u002F293\r\n* Release by @freddyaboulton in https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fpull\u002F294\r\n\r\n## New Contributors\r\n* @shaon72 made their first contribution in https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fpull\u002F273\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Ffastrtc\u002Fcompare\u002F0.0.20...0.0.22","2025-04-22T18:46:22",{"id":201,"version":202,"summary_zh":203,"released_at":204},360816,"0.0.20","## What's Changed\r\n* Improve error message if track kind and modality mismatch by @marcusvaltonen in https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F230\r\n* Ignore output_frame_size parameter by @vvolhejn in https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F210\r\n* Improve error handling for websockets by @vvolhejn in https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F238\r\n* Add track_constraints to Stream class by @freddyaboulton in https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F241\r\n* Support CloseStream in Websocket by @freddyaboulton in 
https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F242\r\n* MIT license by @freddyaboulton in https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F246\r\n* Allow extra tracks (#231) by @marcusvaltonen in https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F249\r\n* Add ability to trigger ReplyOnPause without waiting for pause by @freddyaboulton in https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F250\r\n* Fix bug where additional outputs were not fetched immediately on subsequent connections by @freddyaboulton in https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F254\r\n* Introduce automatic linting with Github workflows by @marcusvaltonen in https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F251\r\n* Add API Reference and llms.txt by @freddyaboulton in https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F256\r\n* Dont run docs ci on prs from forks by @freddyaboulton in https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F257\r\n* Introduce static type checking with pyright by @marcusvaltonen in https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F255\r\n* Introduce unit tests by @marcusvaltonen in https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F248\r\n* Add started_talking log message in ReplyOnPause and in api.md by @erwaen in https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F260\r\n* Enforce modern typing by @marcusvaltonen in https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F258\r\n* Cloudflare turn integration by @freddyaboulton in https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F264\r\n* 0.0.20 release by @freddyaboulton in https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F265\r\n\r\n## New 
Contributors\r\n* @erwaen made their first contribution in https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F260\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fcompare\u002F0.0.19...0.0.20","2025-04-09T13:39:12",{"id":206,"version":207,"summary_zh":208,"released_at":209},360817,"0.0.19","## What's Changed\r\n* Add text-to-speech-gallery + reword galleries to be \"Plugin Ecosystem\" by @freddyaboulton in https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F218\r\n* Remove twice instantiated event by @marcusvaltonen in https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F221\r\n* Add Kroko-ASR model to STT gallery by @sgarg26 in https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F219\r\n* Close Stream from Backend by @freddyaboulton in https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F222\r\n* Add get_context function that can be used to retrieve the webrtc id by @freddyaboulton in https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F223\r\n* Trigger 0.0.19 release by @freddyaboulton in https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F226\r\n\r\n## New Contributors\r\n* @marcusvaltonen made their first contribution in https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F221\r\n* @sgarg26 made their first contribution in https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F219\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fcompare\u002F0.0.18...0.0.19","2025-03-31T17:05:13",{"id":211,"version":212,"summary_zh":213,"released_at":214},360818,"0.0.18","## What's Changed\r\n* Fix gradio concurrency issue in fetching additional outputs by @freddyaboulton in https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F211\r\n* Trigger 0.0.18 
by @freddyaboulton in https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F212\r\n\r\nhttps:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F1709d7d3-02c4-42db-b508-8daad0c67746\r\n\r\n\r\n\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fcompare\u002F0.0.17...0.0.18","2025-03-25T18:46:01",{"id":216,"version":217,"summary_zh":218,"released_at":219},360819,"0.0.17","## What's Changed\r\n* Add js assets by @freddyaboulton in https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F192\r\n* Create py.typed by @vvolhejn in https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F196\r\n* Added HumAwareVAD to VAD Gallery by @CuriousMonkey7 in https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F194\r\n* Minor formatting change to the turn gallery by @freddyaboulton in https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F197\r\n* Some Video Fixes by @freddyaboulton in https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F200\r\n* Add support for trickle ice by @freddyaboulton in https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F193\r\n* trigger release by @freddyaboulton in https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F201\r\n\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fcompare\u002F0.0.16...0.0.17","2025-03-21T01:01:42",{"id":221,"version":222,"summary_zh":223,"released_at":224},360820,"0.0.16","## What's Changed\r\n* Rename the Vad Gallery to Turn Taking Gallery by @freddyaboulton in https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F173\r\n* add fastrtc with Electron app example to cookbook by @swairshah in https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F178\r\n* Add on-device whisper example to cookbook by @sofi444 in 
https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F179\r\n* Add example for \"Talk to Azure OpenAi\" by @MLYengineering in https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F181\r\n* Fix Fastphone bug with latest gradio version by @freddyaboulton in https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F183\r\n* Fix outdated import in docs by @vvolhejn in https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F185\r\n* Fix issue when the audio stream mixes sample rates and numpy array data types by @freddyaboulton in https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F188\r\n* Add path argument to Stream.mount to let developers mount multiple streams in the same app easily by @freddyaboulton in https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F189\r\n* Bump version to 0.0.16 by @freddyaboulton in https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F190\r\n\r\n## New Contributors\r\n* @swairshah made their first contribution in https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F178\r\n* @sofi444 made their first contribution in https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F179\r\n* @MLYengineering made their first contribution in https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F181\r\n* @vvolhejn made their first contribution in https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F185\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fcompare\u002F0.0.15...0.0.16","2025-03-19T01:44:55",{"id":226,"version":227,"summary_zh":228,"released_at":229},360821,"0.0.15","## What's Changed\r\n* feat: Add optional startup function to ReplyOnPause by @Ryu1845 in https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F170\r\n\r\nWe've added a 
new parameter to `ReplyOnPause` and `ReplyOnStopWords` called `startup_fn`. Pass in a generator that yields any output data and the assistant will run that generator as soon as a user connects.\r\n\r\nMore information here: https:\u002F\u002Ffastrtc.org\u002Fuserguide\u002Faudio\u002F#startup-function\r\n\r\n## Audio Response\r\nhttps:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F6d9d9246-f874-4030-aafa-c570527472ac\r\n\r\n## Text Response\r\nhttps:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002Faea98394-64b7-44a9-aea1-c572af2d049f\r\n\r\n\r\n## New Contributors\r\n* @Ryu1845 made their first contribution in https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F170\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fcompare\u002F0.0.14...0.0.15","2025-03-13T23:08:40",{"id":231,"version":232,"summary_zh":233,"released_at":234},360823,"0.0.13","## What's Changed\r\n* Fix kokoro batch issue by @freddyaboulton in https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F128\r\n* Fix kokoro batch issue by @freddyaboulton in https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F133\r\n* UnboundLocalError: local variable 'button' referenced before assignment by @akjava in https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F126\r\n* Improve Interruption Handling by @freddyaboulton in https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F134\r\n* feat: Added documentation for twilio integration by @mahimairaja in https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F125\r\n* Simplify Cloudflare config with new endpoint by @mhart in https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F135\r\n* Add subtitle to UIArgs by @freddyaboulton in https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F137\r\n* Some video send-receive bug 
fixes by @freddyaboulton in https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F145\r\n* Release 0.0.13 by @freddyaboulton in https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F146\r\n\r\n## New Contributors\r\n* @akjava made their first contribution in https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F126\r\n* @mahimairaja made their first contribution in https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fpull\u002F125\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Ffreddyaboulton\u002Ffastrtc\u002Fcompare\u002F0.0.11...0.0.13","2025-03-07T19:22:42"]