[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-kxxt--aspeak":3,"tool-kxxt--aspeak":64},[4,23,32,40,48,56],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":22},2268,"ML-For-Beginners","microsoft\u002FML-For-Beginners","ML-For-Beginners 是由微软推出的一套系统化机器学习入门课程，旨在帮助零基础用户轻松掌握经典机器学习知识。这套课程将学习路径规划为 12 周，包含 26 节精炼课程和 52 道配套测验，内容涵盖从基础概念到实际应用的完整流程，有效解决了初学者面对庞大知识体系时无从下手、缺乏结构化指导的痛点。\n\n无论是希望转型的开发者、需要补充算法背景的研究人员，还是对人工智能充满好奇的普通爱好者，都能从中受益。课程不仅提供了清晰的理论讲解，还强调动手实践，让用户在循序渐进中建立扎实的技能基础。其独特的亮点在于强大的多语言支持，通过自动化机制提供了包括简体中文在内的 50 多种语言版本，极大地降低了全球不同背景用户的学习门槛。此外，项目采用开源协作模式，社区活跃且内容持续更新，确保学习者能获取前沿且准确的技术资讯。如果你正寻找一条清晰、友好且专业的机器学习入门之路，ML-For-Beginners 将是理想的起点。",84991,2,"2026-04-05T10:45:23",[13,14,15,16,17,18,19,20,21],"图像","数据工具","视频","插件","Agent","其他","语言模型","开发框架","音频","ready",{"id":24,"name":25,"github_repo":26,"description_zh":27,"stars":28,"difficulty_score":29,"last_commit_at":30,"category_tags":31,"status":22},2181,"OpenHands","OpenHands\u002FOpenHands","OpenHands 是一个专注于 AI 驱动开发的开源平台，旨在让智能体（Agent）像人类开发者一样理解、编写和调试代码。它解决了传统编程中重复性劳动多、环境配置复杂以及人机协作效率低等痛点，通过自动化流程显著提升开发速度。\n\n无论是希望提升编码效率的软件工程师、探索智能体技术的研究人员，还是需要快速原型验证的技术团队，都能从中受益。OpenHands 提供了灵活多样的使用方式：既可以通过命令行（CLI）或本地图形界面在个人电脑上轻松上手，体验类似 Devin 的流畅交互；也能利用其强大的 Python SDK 自定义智能体逻辑，甚至在云端大规模部署上千个智能体并行工作。\n\n其核心技术亮点在于模块化的软件智能体 SDK，这不仅构成了平台的引擎，还支持高度可组合的开发模式。此外，OpenHands 在 SWE-bench 基准测试中取得了 77.6% 的优异成绩，证明了其解决真实世界软件工程问题的能力。平台还具备完善的企业级功能，支持与 Slack、Jira 等工具集成，并提供细粒度的权限管理，适合从个人开发者到大型企业的各类用户场景。",70612,3,"2026-04-05T11:12:22",[19,17,20,16],{"id":33,"name":34,"github_repo":35,"description_zh":36,"stars":37,"difficulty_score":10,"last_commit_at":38,"category_tags":39,"status":22},3074,"gpt4free","xtekky\u002Fgpt4free","gpt4free 是一个由社区驱动的开源项目，旨在聚合多种可访问的大型语言模型（LLM）和媒体生成接口，让用户能更灵活、便捷地使用前沿 AI 能力。它解决了直接调用各类模型时面临的接口分散、门槛高或成本昂贵等痛点，通过统一的标准将不同提供商的资源整合在一起。\n\n无论是希望快速集成 AI 功能的开发者、需要多模型对比测试的研究人员，还是想免费体验最新技术的普通用户，都能从中受益。gpt4free 提供了丰富的使用方式：既包含易于上手的 Python 和 JavaScript 客户端库，也支持部署本地图形界面（GUI），更提供了兼容 OpenAI 标准的 REST API，方便无缝替换现有应用后端。\n\n其技术亮点在于强大的多提供商支持架构，能够动态调度包括 Opus、Gemini、DeepSeek 等多种主流模型资源，并支持 Docker 一键部署及本地推理。项目秉持社区优先原则，在降低使用门槛的同时，也为贡献者提供了扩展新接口的便利框架，是探索和利用多样化 AI 资源的实用工具。",65970,"2026-04-04T01:02:03",[16,19,17],{"id":41,"name":42,"github_repo":43,"description_zh":44,"stars":45,"difficulty_score":10,"last_commit_at":46,"category_tags":47,"status":22},51,"gstack","garrytan\u002Fgstack","gstack 是 Y Combinator CEO Garry Tan 亲自开源的一套 AI 工程化配置，旨在将 Claude Code 升级为你的虚拟工程团队。面对单人开发难以兼顾产品战略、架构设计、代码审查及质量测试的挑战，gstack 提供了一套标准化解决方案，帮助开发者实现堪比二十人团队的高效产出。\n\n这套配置特别适合希望提升交付效率的创始人、技术负责人，以及初次尝试 Claude Code 的开发者。gstack 的核心亮点在于内置了 15 个具有明确职责的 AI 角色工具，涵盖 CEO、设计师、工程经理、QA 等职能。用户只需通过简单的斜杠命令（如 `\u002Freview` 进行代码审查、`\u002Fqa` 执行测试、`\u002Fplan-ceo-review` 规划功能），即可自动化处理从需求分析到部署上线的全链路任务。\n\n所有操作基于 Markdown 和斜杠命令，无需复杂配置，完全免费且遵循 MIT 协议。gstack 不仅是一套工具集，更是一种现代化的软件工厂实践，让单人开发者也能拥有严谨的工程流程。",64261,"2026-04-05T11:08:43",[17,16],{"id":49,"name":50,"github_repo":51,"description_zh":52,"stars":53,"difficulty_score":10,"last_commit_at":54,"category_tags":55,"status":22},193,"meilisearch","meilisearch\u002Fmeilisearch","Meilisearch 是一个开源的极速搜索服务，专为现代应用和网站打造，开箱即用。它能帮助开发者快速集成高质量的搜索功能，无需复杂的配置或额外的数据预处理。传统搜索方案往往需要大量调优才能实现准确结果，而 Meilisearch 内置了拼写容错、同义词识别、即时响应等实用特性，并支持 AI 驱动的混合搜索（结合关键词与语义理解），显著提升用户查找信息的体验。\n\nMeilisearch 特别适合 Web 开发者、产品团队或初创公司使用，尤其适用于需要快速上线搜索功能的场景，如电商网站、内容平台或 SaaS 应用。它提供简洁的 RESTful API 和多种语言 SDK，部署简单，资源占用低，本地开发或生产环境均可轻松运行。对于希望在不依赖大型云服务的前提下，为用户提供流畅、智能搜索体验的团队来说，Meilisearch 
是一个高效且友好的选择。",56964,"2026-04-05T08:19:14",[13,17,14,20,16,18],{"id":57,"name":58,"github_repo":59,"description_zh":60,"stars":61,"difficulty_score":29,"last_commit_at":62,"category_tags":63,"status":22},4128,"GPT-SoVITS","RVC-Boss\u002FGPT-SoVITS","GPT-SoVITS 是一款强大的开源语音合成与声音克隆工具，旨在让用户仅需极少量的音频数据即可训练出高质量的个性化语音模型。它核心解决了传统语音合成技术依赖海量录音数据、门槛高且成本大的痛点，实现了“零样本”和“少样本”的快速建模：用户只需提供 5 秒参考音频即可即时生成语音，或使用 1 分钟数据进行微调，从而获得高度逼真且相似度极佳的声音效果。\n\n该工具特别适合内容创作者、独立开发者、研究人员以及希望为角色配音的普通用户使用。其内置的友好 WebUI 界面集成了人声伴奏分离、自动数据集切片、中文语音识别及文本标注等辅助功能，极大地降低了数据准备和模型训练的技术门槛，让非专业人士也能轻松上手。\n\n在技术亮点方面，GPT-SoVITS 不仅支持中、英、日、韩、粤语等多语言跨语种合成，还具备卓越的推理速度，在主流显卡上可实现实时甚至超实时的生成效率。无论是需要快速制作视频配音，还是进行多语言语音交互研究，GPT-SoVITS 都能以极低的数据成本提供专业级的语音合成体验。",56375,"2026-04-05T22:15:46",[21],{"id":65,"github_repo":66,"name":67,"description_en":68,"description_zh":69,"ai_summary_zh":69,"readme_en":70,"readme_zh":71,"quickstart_zh":72,"use_case_zh":73,"hero_image_url":74,"owner_login":75,"owner_name":76,"owner_avatar_url":77,"owner_bio":78,"owner_company":79,"owner_location":80,"owner_email":81,"owner_twitter":81,"owner_website":82,"owner_url":83,"languages":84,"stars":101,"forks":102,"last_commit_at":103,"license":104,"difficulty_score":10,"env_os":105,"env_gpu":106,"env_ram":107,"env_deps":108,"category_tags":115,"github_topics":116,"view_count":10,"oss_zip_url":81,"oss_zip_packed_at":81,"status":22,"created_at":124,"updated_at":125,"faqs":126,"releases":161},3784,"kxxt\u002Faspeak","aspeak","A simple text-to-speech client for Azure TTS API. ","aspeak 是一款专为调用微软 Azure 语音服务（TTS）而设计的轻量级命令行工具，能将文本快速转换为自然流畅的语音。它主要解决了开发者在集成 Azure TTS 时面临的配置繁琐问题，通过简洁的命令参数即可直接生成音频，无需编写复杂的调用代码。\n\n这款工具特别适合需要高效处理语音合成任务的开发者、研究人员以及熟悉命令行操作的普通用户。无论是用于自动化脚本、有声书制作，还是辅助功能开发，aspeak 都能提供便捷的支持。其独特的技术亮点在于从 4.0 版本起完全使用 Rust 语言重写，显著提升了运行效率与稳定性；同时支持 RESTful 和 WebSocket 两种通信模式，用户可根据网络环境灵活切换。此外，aspeak 提供了多种安装方式，包括直接下载二进制文件、通过 Python pip 安装或从源码编译，并兼容 Linux、macOS 及 Windows 主流系统。配合 Azure 提供的免费额度，用户可以低成本地体验高质量的语音合成服务。","# :speaking_head: aspeak\n\n[![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fkxxt\u002Faspeak)](https:\u002F\u002Fgithub.com\u002Fkxxt\u002Faspeak\u002Fstargazers)\n[![GitHub issues](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fissues\u002Fkxxt\u002Faspeak)](https:\u002F\u002Fgithub.com\u002Fkxxt\u002Faspeak\u002Fissues)\n[![GitHub forks](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fforks\u002Fkxxt\u002Faspeak)](https:\u002F\u002Fgithub.com\u002Fkxxt\u002Faspeak\u002Fnetwork)\n[![GitHub license](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Flicense\u002Fkxxt\u002Faspeak)](https:\u002F\u002Fgithub.com\u002Fkxxt\u002Faspeak\u002Fblob\u002Fv6\u002FLICENSE)\n\n\u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fkxxt\u002Faspeak\u002Fgraphs\u002Fcontributors\" alt=\"Contributors\">\n    \u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fcontributors\u002Fkxxt\u002Faspeak\" \u002F>\n\u003C\u002Fa>\n\u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fkxxt\u002Faspeak\u002Fpulse\" alt=\"Activity\">\n    \u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fcommit-activity\u002Fm\u002Fkxxt\u002Faspeak\" \u002F>\n\u003C\u002Fa>\n\nA simple text-to-speech client for Azure TTS API. :laughing:\n\n## Note\n\nStarting from version 6.0.0, `aspeak` by default uses the RESTful API of Azure TTS. If you want to use the WebSocket API,\nyou can specify `--mode websocket` when invoking `aspeak` or set `mode = \"websocket\"` in the `auth` section of your profile.\n\nStarting from version 4.0.0, `aspeak` is rewritten in rust. 
The old Python version is available on the `python` branch.

You can sign up for an Azure account and then [choose a payment plan as needed (or stick to the free tier)](https://azure.microsoft.com/en-us/pricing/details/cognitive-services/speech-services/). The free tier includes a quota of 0.5 million characters per month, free of charge.

Please refer to the [Authentication section](#authentication) to learn how to set up authentication for aspeak.

## Installation

### Download from GitHub Releases (recommended for most users)

Download the latest release from [here](https://github.com/kxxt/aspeak/releases/latest).

After downloading, extract the archive and you will get a binary executable file. Put it in a directory that is in your `PATH` environment variable so that you can run it from anywhere.

### Install from AUR (recommended for Arch Linux users)

From v4.1.0, you can install `aspeak-bin` from the AUR.

### Install from PyPI

Installing from PyPI will also install the Python binding of `aspeak`. Check [Library Usage#Python](#Python) for more information on using the Python binding.

```bash
pip install -U aspeak==6.0.0
```

Prebuilt wheels are currently only available for the x86_64 architecture. Due to some technical issues, the source distribution has not been uploaded to PyPI yet, so to build a wheel from source you need to follow the instructions in [Install from Source](#Install-from-Source). Because of manylinux compatibility issues, the wheels for Linux are not available on PyPI either (but you can still build them from source).

### Install from Source

#### CLI Only

The easiest way to install `aspeak` from source is to use cargo:

```bash
cargo install aspeak -F binary
```

Alternatively, you can also install `aspeak` from the AUR.

#### Python Wheel

To build the Python wheel, you need to install `maturin` first:

```bash
pip install maturin
```

After cloning the repository and `cd`-ing into the directory, build the wheel by running:

```bash
maturin build --release --strip -F python --bindings pyo3 --interpreter python --manifest-path Cargo.toml --out dist-pyo3
maturin build --release --strip --bindings bin -F binary --interpreter python --manifest-path Cargo.toml --out dist-bin
bash merge-wheel.bash
```

If everything goes well, you will get a wheel file in the `dist` directory.

## Usage

Run `aspeak help` to see the help message, or `aspeak help <subcommand>` for the help message of a subcommand.

### Authentication

The authentication options should be placed before any subcommand. For example, to use your subscription key with an official endpoint designated by a region, run:

```sh
$ aspeak --region <YOUR_REGION> --key <YOUR_SUBSCRIPTION_KEY> text "Hello World"
```

If you are using a custom endpoint, use the `--endpoint` option instead of `--region`.

To avoid repetition, you can store your authentication details in your aspeak profile; see the [Configuration](#configuration) section for details.

From v5.2.0, you can also set the authentication secrets via the following environment variables:

- `ASPEAK_AUTH_KEY` for authentication using a subscription key
- `ASPEAK_AUTH_TOKEN` for authentication using an authorization token

From v4.3.0, you can let aspeak use a proxy server to connect to the endpoint. For now, only http and socks5 proxies are supported (no https support yet). For example:

```sh
$ aspeak --proxy http://your_proxy_server:port text "Hello World"
$ aspeak --proxy socks5://your_proxy_server:port text "Hello World"
```

aspeak also respects the `HTTP_PROXY` (or `http_proxy`) environment variable.

### Configuration

aspeak v4 introduced profiles. A profile is a configuration file where you can specify default values for the command-line options.

Run the following command to create your default profile:

```sh
$ aspeak config init
```

To edit the profile, run:

```sh
$ aspeak config edit
```

If you have trouble running the above command, you can edit the profile manually. First get the path of the profile by running:

```sh
$ aspeak config where
```

Then edit the file with your favorite text editor.

The profile is a TOML file. The default profile looks like this; check the comments in the config file for more information about the available options:

```toml
# Profile for aspeak
# GitHub: https://github.com/kxxt/aspeak

# Output verbosity
# 0   - Default
# 1   - Verbose
# The following output verbosity levels are only supported on debug builds
# 2   - Debug
# >=3 - Trace
verbosity = 0

#
# Authentication configuration
#

[auth]
# Endpoint for TTS
# endpoint = "wss://eastus.tts.speech.microsoft.com/cognitiveservices/websocket/v1"

# Alternatively, you can specify the region if you are using official endpoints
# region = "eastus"

# Synthesizer Mode, "rest" or "websocket"
# mode = "rest"

# Azure Subscription Key
# key = "YOUR_KEY"

# Authentication Token
# token = "Your Authentication Token"

# Extra http headers (for experts)
# headers = [["X-My-Header", "My-Value"], ["X-My-Header2", "My-Value2"]]

# Proxy
# proxy = "socks5://127.0.0.1:7890"

# Voice list API url
# voice_list_api = "Custom voice list API url"

#
# Configuration for the text subcommand
#

[text]
# Voice to use. Note that it takes precedence over the locale
# voice = "en-US-JennyNeural"
# Locale to use
locale = "en-US"
# Rate
# rate = 0
# Pitch
# pitch = 0
# Role
# role = "Boy"
# Style, "general" by default
# style = "general"
# Style degree, a floating-point number between 0.1 and 2.0
# style_degree = 1.0

#
# Output Configuration
#

[output]
# Container Format. Only wav/mp3/ogg/webm is supported.
container = "wav"
# Audio Quality. Run `aspeak list-qualities` to see available qualities.
#
# If you choose a container format that does not support the quality level
# you specified here, the closest level will be selected automatically.
quality = 0
# Audio Format (for experts). Run `aspeak list-formats` to see available formats.
# Note that it takes precedence over container and quality!
# format = "audio-16khz-128kbitrate-mono-mp3"
```

If you want to use a profile other than your default profile, use the `--profile` argument:

```sh
aspeak --profile <PATH_TO_A_PROFILE> text "Hello"
```

If you want to temporarily disable the profile, use the `--no-profile` argument:

```sh
aspeak --no-profile --region eastus --key <YOUR_KEY> text "Hello"
```

### Pitch and Rate

- `rate`: the speaking rate of the voice.
  - If you use a float value (say `0.5`), the value will be multiplied by 100% and become `50.00%`.
  - You can use the following keywords as well: `x-slow`, `slow`, `medium`, `fast`, `x-fast`, `default`.
  - You can also use percentage values directly: `+10%`.
  - You can also use a relative float value (with an `f` postfix), e.g. `1.2f`. According to the [Azure documentation](https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/speech-synthesis-markup?tabs=csharp#adjust-prosody), such a relative value acts as a multiplier of the default: `1f` results in no change in the rate, `0.5f` halves the rate, and `3f` triples it.
- `pitch`: the pitch of the voice.
  - If you use a float value (say `-0.5`), the value will be multiplied by 100% and become `-50.00%`.
  - You can use the following keywords as well: `x-low`, `low`, `medium`, `high`, `x-high`, `default`.
  - You can also use percentage values directly: `+10%`.
  - You can also use a relative value, e.g. `-2st` or `+80Hz`. According to the [Azure documentation](https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/speech-synthesis-markup?tabs=csharp#adjust-prosody), a relative value is a number preceded by "+" or "-" and followed by "Hz" or "st" that specifies an amount to change the pitch; "st" means the unit is a semitone, half of a tone (a half step) on the standard diatonic scale.
  - You can also use an absolute value, e.g. `600Hz`.

**Note**: Unreasonably high/low values will be clipped to reasonable values by Azure Cognitive Services.
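The same `rate`/`pitch` string formats are accepted by the Python binding described under [Library Usage](#Python). A minimal sketch (assuming authentication via `region` and `key`, as in the README's own example):

```python
# Sketch: exercising the rate/pitch string formats documented above
# through the Python binding's speak_text(), which takes them as strings.
from aspeak import SpeechService

service = SpeechService(region="eastus", key="YOUR_AZURE_SUBSCRIPTION_KEY")

service.speak_text("A bit faster", rate="+10%")          # percentage form
service.speak_text("Half speed", rate="0.5f")            # relative float form
service.speak_text("Two semitones lower", pitch="-2st")  # semitone form
service.speak_text("Slow and high", rate="x-slow", pitch="x-high")  # keywords
```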
### Examples

The following examples assume that you have already set up authentication in your profile.

#### Speak "Hello, world!" to the default speaker

```sh
$ aspeak text "Hello, world"
```

#### SSML to speech

```sh
$ aspeak ssml << EOF
<speak version='1.0' xmlns='http://www.w3.org/2001/10/synthesis' xml:lang='en-US'><voice name='en-US-JennyNeural'>Hello, world!</voice></speak>
EOF
```

#### List all available voices

```sh
$ aspeak list-voices
```

#### List all available voices for Chinese

```sh
$ aspeak list-voices -l zh-CN
```

#### Get information about a voice

```sh
$ aspeak list-voices -v en-US-SaraNeural
```

Output:

```
Microsoft Server Speech Text to Speech Voice (en-US, SaraNeural)
Display name: Sara
Local name: Sara @ en-US
Locale: English (United States)
Gender: Female
ID: en-US-SaraNeural
Voice type: Neural
Status: GA
Sample rate: 48000Hz
Words per minute: 157
Styles: ["angry", "cheerful", "excited", "friendly", "hopeful", "sad", "shouting", "terrified", "unfriendly", "whispering"]
```

#### Save synthesized speech to a file

```sh
$ aspeak text "Hello, world" -o output.wav
```

If you prefer mp3/ogg/webm, use the `-c mp3`/`-c ogg`/`-c webm` option:

```sh
$ aspeak text "Hello, world" -o output.mp3 -c mp3
$ aspeak text "Hello, world" -o output.ogg -c ogg
$ aspeak text "Hello, world" -o output.webm -c webm
```

#### List available quality levels

```sh
$ aspeak list-qualities
```

Output:

```
Qualities for MP3:
  3: audio-48khz-192kbitrate-mono-mp3
  2: audio-48khz-96kbitrate-mono-mp3
 -3: audio-16khz-64kbitrate-mono-mp3
  1: audio-24khz-160kbitrate-mono-mp3
 -2: audio-16khz-128kbitrate-mono-mp3
 -4: audio-16khz-32kbitrate-mono-mp3
 -1: audio-24khz-48kbitrate-mono-mp3
  0: audio-24khz-96kbitrate-mono-mp3

Qualities for WAV:
 -2: riff-8khz-16bit-mono-pcm
  1: riff-24khz-16bit-mono-pcm
  0: riff-24khz-16bit-mono-pcm
 -1: riff-16khz-16bit-mono-pcm

Qualities for OGG:
  0: ogg-24khz-16bit-mono-opus
 -1: ogg-16khz-16bit-mono-opus
  1: ogg-48khz-16bit-mono-opus

Qualities for WEBM:
  0: webm-24khz-16bit-mono-opus
 -1: webm-16khz-16bit-mono-opus
  1: webm-24khz-16bit-24kbps-mono-opus
```

#### List available audio formats (for expert users)

```sh
$ aspeak list-formats
```

Output:

```
amr-wb-16000hz
audio-16khz-128kbitrate-mono-mp3
audio-16khz-16bit-32kbps-mono-opus
audio-16khz-32kbitrate-mono-mp3
audio-16khz-64kbitrate-mono-mp3
audio-24khz-160kbitrate-mono-mp3
audio-24khz-16bit-24kbps-mono-opus
audio-24khz-16bit-48kbps-mono-opus
audio-24khz-48kbitrate-mono-mp3
audio-24khz-96kbitrate-mono-mp3
audio-48khz-192kbitrate-mono-mp3
audio-48khz-96kbitrate-mono-mp3
ogg-16khz-16bit-mono-opus
ogg-24khz-16bit-mono-opus
ogg-48khz-16bit-mono-opus
raw-16khz-16bit-mono-pcm
raw-16khz-16bit-mono-truesilk
raw-22050hz-16bit-mono-pcm
raw-24khz-16bit-mono-pcm
raw-24khz-16bit-mono-truesilk
raw-44100hz-16bit-mono-pcm
raw-48khz-16bit-mono-pcm
raw-8khz-16bit-mono-pcm
raw-8khz-8bit-mono-alaw
raw-8khz-8bit-mono-mulaw
riff-16khz-16bit-mono-pcm
riff-22050hz-16bit-mono-pcm
riff-24khz-16bit-mono-pcm
riff-44100hz-16bit-mono-pcm
riff-48khz-16bit-mono-pcm
riff-8khz-16bit-mono-pcm
riff-8khz-8bit-mono-alaw
riff-8khz-8bit-mono-mulaw
webm-16khz-16bit-mono-opus
webm-24khz-16bit-24kbps-mono-opus
webm-24khz-16bit-mono-opus
```

#### Increase/decrease audio quality

```sh
# Lower than default quality.
$ aspeak text "Hello, world" -o output.mp3 -c mp3 -q=-1
# Best quality for mp3.
$ aspeak text "Hello, world" -o output.mp3 -c mp3 -q=3
```

#### Read text from a file and speak it

```sh
$ cat input.txt | aspeak text
```

or

```sh
$ aspeak text -f input.txt
```

with a custom encoding:

```sh
$ aspeak text -f input.txt -e gbk
```

#### Read from stdin and speak it

```sh
$ aspeak text
```

or, with a heredoc:

```sh
$ aspeak text -l zh-CN << EOF
我能吞下玻璃而不伤身体。
EOF
```

#### Speak Chinese

```sh
$ aspeak text "你好，世界！" -l zh-CN
```

#### Use a custom voice

```sh
$ aspeak text "你好，世界！" -v zh-CN-YunjianNeural
```

#### Custom pitch, rate and style

```sh
$ aspeak text "你好，世界！" -v zh-CN-XiaoxiaoNeural -p 1.5 -r 0.5 -S sad
$ aspeak text "你好，世界！" -v zh-CN-XiaoxiaoNeural -p=-10% -r=+5% -S cheerful
$ aspeak text "你好，世界！" -v zh-CN-XiaoxiaoNeural -p=+40Hz -r=1.2f -S fearful
$ aspeak text "你好，世界！" -v zh-CN-XiaoxiaoNeural -p=high -r=x-slow -S calm
$ aspeak text "你好，世界！" -v zh-CN-XiaoxiaoNeural -p=+1st -r=-7% -S lyrical
```

### Advanced Usage

#### Use a custom audio format for output

**Note**: Some audio formats are not supported when outputting to the speaker.

```sh
$ aspeak text "Hello World" -F riff-48khz-16bit-mono-pcm -o high-quality.wav
```

## Library Usage

### Python

The new version of `aspeak` is written in Rust, and the Python binding is provided by PyO3.

Here is a simple example:

```python
from aspeak import SpeechService

service = SpeechService(region="eastus", key="YOUR_AZURE_SUBSCRIPTION_KEY")
service.speak_text("Hello, world")
```

First you need to create a `SpeechService` instance. When creating one, you can specify the following parameters:

- `audio_format` (positional argument): the audio format of the output audio. Default is `AudioFormat.Riff24KHz16BitMonoPcm`. You can get an audio format by providing a container format and a quality level: `AudioFormat("mp3", 2)`.
- `endpoint`: the endpoint of the speech service.
- `region`: alternatively, you can specify the region of the speech service instead of typing out the full endpoint URL.
- `key`: the subscription key of the speech service.
- `token`: the auth token for the speech service. If you provide a token, the subscription key will be ignored.
- `headers`: additional HTTP headers for the speech service.
- `mode`: the synthesizer to use, either `rest` or `websocket`. In websocket mode, the synthesizer connects to the endpoint when the `SpeechService` instance is created.

After that, you can call `speak_text()` to speak text or `speak_ssml()` to speak SSML, or call `synthesize_text()`/`synthesize_ssml()` to get the audio data. For `synthesize_text()` and `synthesize_ssml()`, if you provide an `output`, the audio data is written to that file and the function returns `None`; otherwise the function returns the audio data.

Common options for `speak_text()` and `synthesize_text()`:

- `locale`: the locale of the voice. Default is `en-US`.
- `voice`: the voice name. Default is `en-US-JennyNeural`.
- `rate`: the speaking rate of the voice. It must be a string that fits the requirements documented in [Pitch and Rate](#pitch-and-rate).
- `pitch`: the pitch of the voice, in the same string format as `rate`.
- `style`: the style of the voice. You can list the available styles for a specific voice by executing `aspeak -L -v <VOICE_ID>`. The default value is `general`.
- `style_degree`: the intensity of the speaking style. According to the [Azure documentation](https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/speech-synthesis-markup?tabs=csharp#adjust-speaking-styles), it is a floating-point number between 0.01 and 2, inclusive. At the time of writing, style-degree adjustments are supported only for Chinese (Mandarin, Simplified) neural voices.
- `role`: the speaking role-play. According to the [Azure documentation](https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/speech-synthesis-markup?tabs=csharp#adjust-speaking-styles), the voice acts as a different age and gender, but the voice name isn't changed. At the time of writing, role adjustments are supported only for these Chinese (Mandarin, Simplified) neural voices: `zh-CN-XiaomoNeural`, `zh-CN-XiaoxuanNeural`, `zh-CN-YunxiNeural`, and `zh-CN-YunyeNeural`.
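Pulling those parameters together, here is a sketch of file output via the binding. It assumes `AudioFormat` is importable from the `aspeak` module alongside `SpeechService`, as the default-value notation above suggests; the voice, style, and output names are illustrative.

```python
# Sketch: synthesize to an mp3 file and fetch raw bytes, using the
# SpeechService options documented above.
from aspeak import AudioFormat, SpeechService

service = SpeechService(
    AudioFormat("mp3", 2),           # positional audio_format: container + quality
    region="eastus",
    key="YOUR_AZURE_SUBSCRIPTION_KEY",
    mode="rest",                     # or "websocket"
)

# With `output`, the audio is written to the file and None is returned.
# style/style_degree are limited to zh-CN neural voices, per the note above.
service.synthesize_text(
    "你好，世界！",
    output="hello.mp3",
    voice="zh-CN-XiaoxiaoNeural",
    style="cheerful",
    style_degree=1.5,
)

# Without `output`, the audio data comes back instead.
audio = service.synthesize_text("Hello, world", locale="en-US")
```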
### Rust

Add `aspeak` to your `Cargo.toml`:

```bash
$ cargo add aspeak
```

Then follow the [documentation](https://docs.rs/aspeak) of the `aspeak` crate.

There are 4 examples for quick reference:

- [Simple usage of RestSynthesizer](https://github.com/kxxt/aspeak/blob/v6/examples/03-rest-synthesizer-simple.rs)
- [Simple usage of WebsocketSynthesizer](https://github.com/kxxt/aspeak/blob/v6/examples/04-websocket-synthesizer-simple.rs)
- [Synthesize all txt files in a given directory](https://github.com/kxxt/aspeak/blob/v6/examples/01-synthesize-txt-files.rs)
- [Read-Synthesize-Speak-Loop: read text from stdin line by line and speak it](https://github.com/kxxt/aspeak/blob/v6/examples/02-rssl.rs)

## Quickstart

aspeak is a lightweight, Rust-based command-line tool for the Azure text-to-speech (TTS) API. It supports RESTful and WebSocket modes and offers flexible voice configuration and multiple output audio formats.

### Prerequisites

- **Operating system**: Linux (x86_64), macOS, Windows
- **Dependencies**:
  - Prebuilt binary: none
  - Building from source: [Rust](https://rustup.rs/) (installing via `rustup` is recommended)
  - Building the Python wheel: `maturin` (`pip install maturin`)
- **Azure account**: register an Azure account and obtain a subscription key or authorization token. The free tier includes 0.5 million characters per month.

### Installation

#### Option 1: Download a prebuilt binary (recommended)

1. Download the latest release for your system from [GitHub Releases](https://github.com/kxxt/aspeak/releases/latest).
2. Extract the archive to get the `aspeak` executable.
3. (Optional) Move it into a directory on your `PATH` so it can be invoked from anywhere:
   ```bash
   # Linux/macOS example
   chmod +x aspeak
   sudo mv aspeak /usr/local/bin/
   ```

#### Option 2: Install from source with Cargo (for developers)

With Rust installed, run:

```bash
cargo install aspeak -F binary
```

#### Option 3: Install from PyPI (includes the Python binding)

```bash
pip install -U aspeak==6.0.0
```

> Note: PyPI currently only offers prebuilt wheels for the x86_64 architecture. Linux users who hit compatibility issues should use Option 1 or build from source.

#### Option 4: Arch Linux (AUR)

```bash
yay -S aspeak-bin
# or
paru -S aspeak-bin
```

### Basic usage

#### 1. Configure authentication

To avoid typing your key with every command, initialize a profile:

```bash
aspeak config init
```

Then edit the profile (find its path with `aspeak config where`) and fill in your region and key under `[auth]`:

```toml
[auth]
region = "eastus"
key = "YOUR_SUBSCRIPTION_KEY"
```

> Tip: authentication can also be set via environment variables:
> - `export ASPEAK_AUTH_KEY="YOUR_KEY"`
> - `export ASPEAK_AUTH_TOKEN="YOUR_TOKEN"`

#### 2. Synthesize and play speech

The simplest usage converts text to speech and plays it through the default speaker:

```bash
aspeak text "你好，世界！"
```

Without a default profile, pass the parameters on the command line:

```bash
aspeak --region eastus --key <YOUR_KEY> text "Hello World"
```

#### 3. Save the audio to a file

Save the synthesized result as a WAV file:

```bash
aspeak text "Hello World" -o output.wav
```

Other formats (MP3, OGG, WEBM) are supported too:

```bash
aspeak text "Hello World" -o output.mp3 -c mp3
```
#### 4. Browse available voices

List all supported voices:

```bash
aspeak list-voices
```

Filter by language (e.g. Chinese):

```bash
aspeak list-voices -l zh-CN
```

Show details for a specific voice:

```bash
aspeak list-voices -v zh-CN-XiaoxiaoNeural
```

## Project details

- **Repository**: https://github.com/kxxt/aspeak (★500, 61 forks, MIT license)
- **Author**: Levi Zim ([kxxt](https://github.com/kxxt), https://www.kxxt.dev): "FOSS should be all you want."
- **Languages**: Rust 95.8%, Python 3.3%, Shell 0.9%, Makefile 0.1%
- **Platforms**: Linux, macOS, Windows; no GPU required
- **Environment notes**: aspeak is a client for the Azure TTS API and performs no local synthesis, so it has no special hardware requirements. The usual way to run it is the prebuilt binary. Building from source requires Rust (`cargo`); building the Python wheel additionally requires `maturin` (with `pyo3` as an internal dependency of the binding). The CLI needs no Python; the Python binding needs Python (no version pinned originally, 3.8+ commonly suggested, and 3.10+ since v6.1.0). Azure keys can be managed via environment variables or the profile, and HTTP and SOCKS5 proxies are supported.
- **Topics**: azure-cognitive-services, cli, python, speech-synthesis, text-to-speech, tts, tts-engine

## Use case

An independent developer is building a command-line news-reading assistant for visually impaired users and needs to convert freshly scraped text into natural, fluent speech every day.

### Without aspeak

- **High integration cost**: calling the Azure TTS API directly means hand-rolling RESTful request headers, authentication signatures, and WebSocket connection logic; the code is verbose and error-prone.
- **Hard to deploy across platforms**: a hand-written Python script needs environment dependencies configured per operating system and is difficult to package into a single distributable executable.
- **Slow iteration**: testing a different voice or speaking rate requires editing code and rerunning the script instead of tweaking a command-line flag.
- **Resource-hungry**: a home-grown conversion script that runs for long stretches tends to be unoptimized and memory-heavy, a poor fit for always-on, low-spec devices.

### With aspeak

- **Works out of the box**: a single command such as `aspeak --region <REGION> --key <KEY> text "news content"` reaches Azure's high-quality voices with no networking code at all.
- **Easy to distribute**: the prebuilt binary runs on any mainstream system, eliminating environment setup and slotting directly into users' workflows.
- **Fast experimentation**: voices, pitch, and rate switch instantly via CLI flags, so different configurations can be auditioned in seconds, shortening the iteration loop.
- **Lightweight at runtime**: the Rust rewrite keeps memory usage and startup time low, letting the news reader run in the background without dragging down the system.

aspeak wraps a complex cloud speech API in a minimal command-line experience, letting developers focus on application logic rather than transport details. A small glue script in this spirit is sketched below.
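A hypothetical sketch of that glue for the news-reader scenario: it shells out to the real aspeak CLI with flags from the README (`text`, `-l`, `-v`, `-o`, `-c`), assuming authentication is already configured via the profile or `ASPEAK_AUTH_KEY`. `fetch_today_articles()` is a placeholder for the developer's own scraper.

```python
# Sketch: batch-convert scraped articles to mp3 with the aspeak CLI.
import subprocess

def read_aloud(text: str, out_path: str | None = None) -> None:
    """Speak `text`, or save it to `out_path` as mp3 when a path is given."""
    cmd = ["aspeak", "text", text, "-l", "zh-CN", "-v", "zh-CN-YunjianNeural"]
    if out_path:
        cmd += ["-o", out_path, "-c", "mp3"]
    subprocess.run(cmd, check=True)  # raises if aspeak exits non-zero

def fetch_today_articles() -> list[str]:
    """Placeholder for the scraper; returns article bodies as strings."""
    return ["今日头条新闻正文……", "第二条新闻正文……"]

for i, article in enumerate(fetch_today_articles()):
    read_aloud(article, out_path=f"news_{i}.mp3")
```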
## FAQ

**What should I do about error 429 (Too Many Requests) or WebSocket limits?**
The free trial API is strictly rate-limited by Microsoft. If you hit 429 errors frequently, register an Azure account and use a subscription key (a free tier is available). Alternatively, try the edge-tts endpoint (requires aspeak v4.2.0 or later), which has looser limits but slightly fewer features. Note that v5.0 removed the trial endpoint. ([#56](https://github.com/kxxt/aspeak/issues/56))

**How do I call aspeak directly from a Python script?**
aspeak stabilized its functional API in v2.0.0.dev2. See the official docs (DEVELOP.md) or the sample code under src/examples for passing text straight to the API without going through the CLI. ([#11](https://github.com/kxxt/aspeak/issues/11))

**Why do I get an `AssertionError` saying the token is None?**
This usually means an old version's token-extraction logic broke, or the trial endpoint was removed. Upgrade to a recent version (v4.0.0 or later): v4.0.0 is the Rust rewrite, which no longer depends on a Python environment and is smaller and more stable. Download the binary for your platform from GitHub Releases and put it on your `PATH`. ([#42](https://github.com/kxxt/aspeak/issues/42))

**How do I fix "WebSocket upgrade failed" or connection errors when configuring an endpoint?**
Check that your endpoint URL is correct. REST and WebSocket endpoints differ subtly: a WebSocket URL must contain the `/websocket/` path, e.g. `wss://eastasia.tts.speech.microsoft.com/cognitiveservices/websocket/v1` rather than the plain `cognitiveservices/v1`. Some versions also need an `Origin: https://azure.microsoft.com` request header, which is handled automatically after updating to a fixed release (v3.1.0 or later). ([#33](https://github.com/kxxt/aspeak/issues/33))

**What can I do when long text fails or is cut off mid-synthesis?**
Microsoft's free service limits input length (the web demo caps at 1000 characters), and the WebSocket connection may be dropped once roughly 2 MB of audio has been received. aspeak itself does not auto-split text, because naive character-level truncation breaks sentence structure. Split long text on sentence punctuation in your own code before calling aspeak, then send the chunks one by one. ([#24](https://github.com/kxxt/aspeak/issues/24))

**Can I get real-time output with latency under 200 ms?**
End-to-end latency under 200 ms is not realistic. Tests show around 268 ms even when hitting Microsoft's official endpoint directly from a browser; that covers network transfer, server processing, and audio generation. Budget for this physical and network baseline in real-time conversational scenarios. ([#19](https://github.com/kxxt/aspeak/issues/19))

**How do I install the v4.0.0 alpha / Rust version?**
The v4.0.0 alphas ship mainly as prebuilt binaries on GitHub Releases (Windows and Linux). Download the archive for your system, extract it, and place the executable in a folder on your `PATH`; then `aspeak` runs directly in a terminal. This Rust rewrite needs no Python environment. Once the Python binding is finished, releases will return to PyPI as well. ([#42](https://github.com/kxxt/aspeak/issues/42))
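The long-input FAQ above recommends splitting text on sentence punctuation before invoking aspeak. A minimal sketch of that pre-chunking; the 1000-character budget and the punctuation set are illustrative, not part of aspeak itself, and `article.txt` stands in for whatever long document needs speaking:

```python
# Sketch: greedily pack whole sentences into <= budget-sized chunks,
# then feed each chunk to the aspeak CLI separately.
import re
import subprocess
from pathlib import Path

# Zero-width split points just after CJK or Latin sentence-ending punctuation.
SENTENCE_END = re.compile(r"(?<=[。！？.!?])")

def chunks(text: str, budget: int = 1000):
    buf = ""
    for sentence in SENTENCE_END.split(text):
        if buf and len(buf) + len(sentence) > budget:
            yield buf
            buf = ""
        buf += sentence
    if buf:
        yield buf

for part in chunks(Path("article.txt").read_text(encoding="utf-8")):
    subprocess.run(["aspeak", "text", part], check=True)
```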
## Releases

### v6.1.0 (2025-03-28)

Notice: the minimum supported Python version was raised from 3.8 to 3.10.

New features:

- aspeak can now write raw audio bytes to stdout when `--output` is set to `-`.
- Wheels are now published for Apple Silicon Macs.

Fixes:

- Dependency updates to resolve security alerts.
- Typo fixes.

Internal:

- Tests now run in GitHub CI.
- Refactor: migrated to the Rust 2024 edition.

### v6.1.0-rc.1 (2025-03-28)

Same changes as v6.1.0: stdout output via `--output -`, Apple Silicon wheels, dependency and typo fixes, CI tests, and the Rust 2024 migration.

### v6.0.1 (2023-10-03)

Dependency updates to resolve security alerts.

### v6.0.0 (2023-06-29)

aspeak v6.0 is finally released 🎉🎉🎉! This is a major release with some breaking changes; read the notes below carefully.

General:

- GitHub branches: the main branch has been deleted. The default branch is now `v6` and will change with the major version.
- Dependency upgrades (resolving security alert #77).
- Internal refactoring.

For CLI users:

- The CLI now uses the REST API by default instead of the WebSocket API. Use the `--mode websocket` flag, or set the `mode` field to `websocket` in the auth section of your profile, to default to WebSocket.
- When the TTS API returns empty audio, aspeak no longer reports a cryptic unrecognized-format error; it now warns that an empty audio buffer was received and there is nothing to play.
- The voice-list command no longer fails against this API endpoint: https://speech.platform.bing.com/consumer/speech/synthesize/readaloud/voices/list
- Performance: unnecessary memory copies eliminated.

For Rust crate users, this release contains many breaking changes:

- Some fields of the `Voice` struct are now optional.
- Error handling now follows a [modular approach](https://sabrinajewson.org/blog/errors) instead of one giant enum covering every error. (#66)
- There are now two synthesizers: `RestSynthesizer` for the REST API and `WebSocketSynthesizer` for the WebSocket API. (#71) The REST synthesizer gains two methods that return the underlying `Bytes` object rather than `Vec<u8>`.
- A `UnifiedSynthesizer` trait provides a unified interface over both synthesizers.
- Some methods have been renamed; for example, `Synthesizer::connect` is now `Synthesizer::connect_websocket`.
- Four new feature flags: `rest-synthesizer` (enables `RestSynthesizer`), `websocket-synthesizer` (enables `WebSocketSynthesizer`), `unified-synthesizer` (enables the trait), and `synthesizers` (enables all synthesizers; on by default). Disabling `websocket-synthesizer` drops a large set of dependencies (about 0.8 MB off `aspeak.rlib` in release mode).
- Improved doc comments.
- TLS feature flags: the crate uses `native-tls` by default. To use another TLS implementation, enable one of: `native-tls-vendored` (vendored `native-tls`), `rustls-tls-native-roots`, or `rustls-tls-webpki-roots`.
- Four new examples for quick reference:
  - [Simple usage of RestSynthesizer](https://github.com/kxxt/aspeak/blob/v6/examples/03-rest-synthesizer-simple.rs)
  - [Simple usage of WebsocketSynthesizer](https://github.com/kxxt/aspeak/blob/v6/examples/04-websocket-synthesizer-simple.rs)
  - [Synthesize all txt files in a given directory](https://github.com/kxxt/aspeak/blob/v6/examples/01-synthesize-txt-files.rs)
  - [Read-Synthesize-Speak loop: read text from stdin line by line and speak it](https://github.com/kxxt/aspeak/blob/v6/examples/02-rssl.rs)

### v6.0.0-rc.1 (2023-06-28)

Changes after v6.0.0-beta.3:

- Rust crate: all items are now visible from the root module (flat is better than nested).
- GitHub branches: the main branch has been deleted; the default branch is `v5` and will switch to `v6` when v6 releases.
- Python binding: type hints are now provided for better IDE completion.

### v6.0.0-beta.3 (2023-06-27)

Changes after v6.0.0-beta.2:

- Improved error messages in the Python binding.
- Enabled abi3 wheels so separate builds per Python version are no longer needed.

### v6.0.0-beta.2 (2023-06-26)

Changes after v6.0.0-beta.1:

- CLI performance: unnecessary memory copies eliminated.
- Docs (crate): two new examples and more doc comments.

### v6.0.0-beta.1 (2023-06-24)

Changes after v6.0.0-alpha.3:

- Feature: two methods added to `RestSynthesizer` that return `Bytes` instead of `Vec<u8>`.
- openssl dependency upgraded (security alert #77).
- Two examples added to the Rust crate:
  - 01-synthesize-txt-files.rs: synthesize speech from \*.txt files in a directory.
  - 02-rssl.rs: RSSL, a Read-Synthesize-Speak Loop (like a REPL) that reads text from stdin line by line, synthesizes it, and plays it.
- Internal refactoring.

### v6.0.0-alpha.3 (2023-06-21)

Changes after v6.0.0-alpha.2:

- Improved doc comments.
- `strum` upgraded to 0.25, by @attila-lin.
- Crate: TLS feature flags supported.
- Crate: `synthesizers` feature added, enabling all synthesizers.

### v6.0.0-alpha.2 (2023-06-12)

For CLI users, no breaking changes, but some differences:

- The CLI now uses the REST API by default instead of the WebSocket API. Opt back in with the `--mode websocket` flag, or set `mode` to `websocket` in the profile's auth section.
- Bug fixes: empty audio from the TTS API now produces a clear warning (empty audio buffer, nothing to play) instead of a cryptic unrecognized-format error, and the voice-list command no longer fails against https://speech.platform.bing.com/consumer/speech/synthesize/readaloud/voices/list

For Rust crate users, many breaking changes:

- Some fields of the `Voice` struct are now optional.
- Modular error handling instead of one large enum covering every error.
- Two synthesizers (`RestSynthesizer` for REST, `WebSocketSynthesizer` for WebSocket) plus a `UnifiedSynthesizer` trait providing a unified interface.
- Some methods renamed; for example, `Synthesizer::connect` is now `Synthesizer::connect_websocket`.
- Three new features, all on by default: `rest-synthesizer`, `websocket-synthesizer`, `unified-synthesizer`. Disabling `websocket-synthesizer` drops many dependencies (about 0.8 MB off `aspeak.rlib` in release mode).
- Some other minor changes.

For Python binding users, breaking changes:

- `SpeechService` now connects automatically on construction; the `connect` method has been removed.
- The REST API is now the default.
- The `SpeechService` constructor takes a new keyword argument `mode`, either `rest` or `websocket` (default `rest`).

### v5.2.0 (2023-05-05)

CLI: you can now set the authentication secrets via the following environment variables:

- `ASPEAK_AUTH_KEY` for authentication using a subscription key
- `ASPEAK_AUTH_TOKEN` for authentication using an authorization token

Rust API: you can now use `Voice::request_available_voices` (or `Voice::request_available_voices_with_additional_headers`) to get the list of available voices.

### v5.1.0 (2023-04-20)

- Added a `binary` feature to the aspeak crate to make the Rust lib less bloated. From now on, building the CLI requires the `-F binary` flag.

### v5.0.1-alpha.2 (2023-04-20)

- Added a binary feature to make the Rust lib less bloated.

### v5.0.0 (2023-04-18)

Enhancements:

- Support for the `--color={auto,always,never}` options; `aspeak` also respects the `NO_COLOR` environment variable. (One edge case: `aspeak` uses `clap` to parse command-line options, so `--color=never` only takes effect if parsing succeeds; an invalid option still prints a colored error message.)
- More documentation for the Rust crate.
- Minor performance improvements.
- The custom voice-list API URL can now be set in your profile (field `voice_list_api` in section `auth`).

Breaking changes:

- The default trial endpoint has been removed because Microsoft shut it down; you must now set up authentication to use `aspeak`. The default voice-list API URL was removed for the same reason.
- Rust API changes: `Synthesizer` is now `Send`, and its various `synthesize_*` methods take `&mut self` instead of `&self`; options such as `TextOptions` are now created via the builder pattern; fields of the `Voice` struct are now private, with accessor methods.

Other changes:

- The PKGBUILDs for Arch Linux are no longer stored in this repository; find them in the [AUR](https://aur.archlinux.org/packages/aspeak).

### v4.3.1 (2023-04-05)

- Fixed a bug that made the `endpoint` and `region` settings in the profile ineffective.

### v4.3.0 (2023-04-04)

- Support for http and socks5 proxies via the `--proxy` command-line option or the `http_proxy` (or `HTTP_PROXY`) environment variable, e.g. `aspeak --proxy "socks5://127.0.0.1:7890" text "Hello World"`. The proxy can also be set in the profile's `auth` section. Connections to https proxy servers are not yet supported. For the Python binding, use the `proxy` keyword argument of the `SpeechService` constructor.
- Fix: the `list-voices` command now correctly handles the auth settings (region, token, key).
- The voice-list API URL can now be specified when using the `list-voices` command.

### v4.3.0-beta.2 (2023-03-31)

- Changed the socks5 proxy implementation.
- The `list-voices` command now respects the proxy settings.
- Fix: the `list-voices` command now correctly handles the auth settings (region, token, key).
- The voice-list API URL can now be specified when using the `list-voices` command.

### v4.3.0-beta.1 (2023-03-30)

- Support for http and socks5 proxies via `--proxy` or the `http_proxy` (or `HTTP_PROXY`) environment variable, e.g. `aspeak --proxy "socks5://127.0.0.1:7890" text "Hello World"`; also settable in the profile's `auth` section. Https proxy servers are not yet supported. For the Python binding, use the `proxy` keyword argument of the `SpeechService` constructor.

### v4.2.0 (2023-03-25)

- Detailed error messages are now shown in the Python bindings.
- Fix: the `role` field in the default profile template was previously not commented out and was set to `Boy`; you may want to comment it out if you are using an unmodified default profile. The `role`, `style`, and `style_degree` fields are now commented out in the template.
- Feature: the `--no-rich-ssml` flag disables rich SSML features such as `role`, `style`, and `style_degree`, which is useful for endpoints that don't support them.
- Fix (Python bindings): the `SpeechService` constructor now correctly accepts an iterable (not just an iterator) for the `headers` keyword argument.
- Fix: aspeak now correctly handles endpoint URLs that contain query parameters.

### v4.1.0 (2023-03-09)

- You can now authenticate with your Azure subscription key. Special thanks to @yhmickey for trusting me and providing his subscription key for testing.