[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-greyblake--whatlang-rs":3,"tool-greyblake--whatlang-rs":64},[4,17,27,35,43,56],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":16},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,3,"2026-04-05T11:01:52",[13,14,15],"开发框架","图像","Agent","ready",{"id":18,"name":19,"github_repo":20,"description_zh":21,"stars":22,"difficulty_score":23,"last_commit_at":24,"category_tags":25,"status":16},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",138956,2,"2026-04-05T11:33:21",[13,15,26],"语言模型",{"id":28,"name":29,"github_repo":30,"description_zh":31,"stars":32,"difficulty_score":23,"last_commit_at":33,"category_tags":34,"status":16},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 
绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",107662,"2026-04-03T11:11:01",[13,14,15],{"id":36,"name":37,"github_repo":38,"description_zh":39,"stars":40,"difficulty_score":23,"last_commit_at":41,"category_tags":42,"status":16},3704,"NextChat","ChatGPTNextWeb\u002FNextChat","NextChat 是一款轻量且极速的 AI 助手，旨在为用户提供流畅、跨平台的大模型交互体验。它完美解决了用户在多设备间切换时难以保持对话连续性，以及面对众多 AI 模型不知如何统一管理的痛点。无论是日常办公、学习辅助还是创意激发，NextChat 都能让用户随时随地通过网页、iOS、Android、Windows、MacOS 或 Linux 端无缝接入智能服务。\n\n这款工具非常适合普通用户、学生、职场人士以及需要私有化部署的企业团队使用。对于开发者而言，它也提供了便捷的自托管方案，支持一键部署到 Vercel 或 Zeabur 等平台。\n\nNextChat 的核心亮点在于其广泛的模型兼容性，原生支持 Claude、DeepSeek、GPT-4 及 Gemini Pro 等主流大模型，让用户在一个界面即可自由切换不同 AI 能力。此外，它还率先支持 MCP（Model Context Protocol）协议，增强了上下文处理能力。针对企业用户，NextChat 提供专业版解决方案，具备品牌定制、细粒度权限控制、内部知识库整合及安全审计等功能，满足公司对数据隐私和个性化管理的高标准要求。",87618,"2026-04-05T07:20:52",[13,26],{"id":44,"name":45,"github_repo":46,"description_zh":47,"stars":48,"difficulty_score":23,"last_commit_at":49,"category_tags":50,"status":16},2268,"ML-For-Beginners","microsoft\u002FML-For-Beginners","ML-For-Beginners 是由微软推出的一套系统化机器学习入门课程，旨在帮助零基础用户轻松掌握经典机器学习知识。这套课程将学习路径规划为 12 周，包含 26 节精炼课程和 52 道配套测验，内容涵盖从基础概念到实际应用的完整流程，有效解决了初学者面对庞大知识体系时无从下手、缺乏结构化指导的痛点。\n\n无论是希望转型的开发者、需要补充算法背景的研究人员，还是对人工智能充满好奇的普通爱好者，都能从中受益。课程不仅提供了清晰的理论讲解，还强调动手实践，让用户在循序渐进中建立扎实的技能基础。其独特的亮点在于强大的多语言支持，通过自动化机制提供了包括简体中文在内的 50 多种语言版本，极大地降低了全球不同背景用户的学习门槛。此外，项目采用开源协作模式，社区活跃且内容持续更新，确保学习者能获取前沿且准确的技术资讯。如果你正寻找一条清晰、友好且专业的机器学习入门之路，ML-For-Beginners 将是理想的起点。",84991,"2026-04-05T10:45:23",[14,51,52,53,15,54,26,13,55],"数据工具","视频","插件","其他","音频",{"id":57,"name":58,"github_repo":59,"description_zh":60,"stars":61,"difficulty_score":10,"last_commit_at":62,"category_tags":63,"status":16},3128,"ragflow","infiniflow\u002Fragflow","RAGFlow 
是一款领先的开源检索增强生成（RAG）引擎，旨在为大语言模型构建更精准、可靠的上下文层。它巧妙地将前沿的 RAG 技术与智能体（Agent）能力相结合，不仅支持从各类文档中高效提取知识，还能让模型基于这些知识进行逻辑推理和任务执行。\n\n在大模型应用中，幻觉问题和知识滞后是常见痛点。RAGFlow 通过深度解析复杂文档结构（如表格、图表及混合排版），显著提升了信息检索的准确度，从而有效减少模型“胡编乱造”的现象，确保回答既有据可依又具备时效性。其内置的智能体机制更进一步，使系统不仅能回答问题，还能自主规划步骤解决复杂问题。\n\n这款工具特别适合开发者、企业技术团队以及 AI 研究人员使用。无论是希望快速搭建私有知识库问答系统，还是致力于探索大模型在垂直领域落地的创新者，都能从中受益。RAGFlow 提供了可视化的工作流编排界面和灵活的 API 接口，既降低了非算法背景用户的上手门槛，也满足了专业开发者对系统深度定制的需求。作为基于 Apache 2.0 协议开源的项目，它正成为连接通用大模型与行业专有知识之间的重要桥梁。",77062,"2026-04-04T04:44:48",[15,14,13,26,54],{"id":65,"github_repo":66,"name":67,"description_en":68,"description_zh":69,"ai_summary_zh":69,"readme_en":70,"readme_zh":71,"quickstart_zh":72,"use_case_zh":73,"hero_image_url":74,"owner_login":75,"owner_name":76,"owner_avatar_url":77,"owner_bio":78,"owner_company":79,"owner_location":80,"owner_email":79,"owner_twitter":75,"owner_website":81,"owner_url":82,"languages":83,"stars":100,"forks":101,"last_commit_at":102,"license":103,"difficulty_score":104,"env_os":105,"env_gpu":105,"env_ram":105,"env_deps":106,"category_tags":112,"github_topics":113,"view_count":10,"oss_zip_url":79,"oss_zip_packed_at":79,"status":16,"created_at":127,"updated_at":128,"faqs":129,"releases":160},670,"greyblake\u002Fwhatlang-rs","whatlang-rs","Natural language detection library for Rust. 
Try demo online: https:\u002F\u002Fwhatlang.org\u002F","whatlang-rs 是一款专注于简洁与性能的自然语言检测库，专为 Rust 生态打造。当程序需要自动识别文本所属的语言时，它能提供高效可靠的解决方案。无论是处理用户输入、内容过滤还是多语言路由，whatlang-rs 都能准确判断文本是英语、中文还是其他语种。\n\n它解决了传统方案可能存在的依赖复杂、运行缓慢等问题。作为纯 Rust 实现，whatlang-rs 不仅轻量快速，还支持超过 70 种语言及多种书写系统（如拉丁文、西里尔文等）。更值得一提的是，它会返回检测的可信度评分，帮助开发者评估结果的准确性。其核心算法基于三元组语言模型，在保证精度的同时兼顾了效率。\n\n对于 Rust 后端开发者、NLP 研究人员以及需要集成语言识别功能的工程师而言，whatlang-rs 是一个优秀的选择。它无需引入庞大的外部依赖，就能轻松融入现有的搜索或数据处理流程中，满足对高性能和内存安全有要求的场景。","\u003Cp align=\"center\">\u003Cimg width=\"160\" src=\"https:\u002F\u002Fraw.githubusercontent.com\u002Fgreyblake\u002Fwhatlang-rs\u002Fmaster\u002Fmisc\u002Flogo\u002Fwhatlang-logo.svg\" alt=\"Whatlang - rust library for natural language detection\">\u003C\u002Fp>\n\n\u003Ch1 align=\"center\">Whatlang\u003C\u002Fh1>\n\n\u003Cp align=\"center\">Natural language detection for Rust with focus on simplicity and performance.\u003C\u002Fp>\n\u003Cp align=\"center\">\u003Ca href=\"https:\u002F\u002Fwhatlang.org\u002F\" target=\"_blank\">Try online demo.\u003C\u002Fa>\u003C\u002Fp>\n\n\u003Cp align=\"center\">\n\u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fgreyblake\u002Fwhatlang-rs\u002Factions\u002Fworkflows\u002Fci.yml\" rel=\"nofollow\">\u003Cimg src=\"https:\u002F\u002Fgithub.com\u002Fgreyblake\u002Fwhatlang-rs\u002Factions\u002Fworkflows\u002Fci.yml\u002Fbadge.svg\" alt=\"Build Status\">\u003C\u002Fa>\n\u003Ca href=\"https:\u002F\u002Fraw.githubusercontent.com\u002Fgreyblake\u002Fwhatlang-rs\u002Fmaster\u002FLICENSE\" rel=\"nofollow\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Flicense-MIT-blue.svg\" alt=\"License\">\u003C\u002Fa>\n\u003Ca href=\"https:\u002F\u002Fdocs.rs\u002Fwhatlang\" rel=\"nofollow\">\u003Cimg src=\"https:\u002F\u002Fdocs.rs\u002Fwhatlang\u002Fbadge.svg\" alt=\"Documentation\">\u003C\u002Fa>\n\u003Cp>\n\n[![Stand With 
Ukraine](https:\u002F\u002Fraw.githubusercontent.com\u002Fvshymanskyy\u002FStandWithUkraine\u002Fmain\u002Fbanner2-direct.svg)](https:\u002F\u002Fstand-with-ukraine.pp.ua\u002F)\n\n## Content\n* [Features](#features)\n* [Get started](#get-started)\n* [Who uses Whatlang?](#who-uses-whatlang)\n* [Documentation](https:\u002F\u002Fdocs.rs\u002Fwhatlang)\n* [Supported languages](https:\u002F\u002Fgithub.com\u002Fgreyblake\u002Fwhatlang-rs\u002Fblob\u002Fmaster\u002FSUPPORTED_LANGUAGES.md)\n* [Feature toggles](#feature-toggles)\n* [How does it work?](#how-does-it-work)\n* [Make tasks](#make-tasks)\n* [Comparison with alternatives](#comparison-with-alternatives)\n* [Ports and clones](#ports-and-clones)\n* [Donations](#donations)\n* [Derivation](#derivation)\n* [License](#license)\n* [Contributors](#contributors)\n\n\n## Features\n* Supports [70 languages](https:\u002F\u002Fgithub.com\u002Fgreyblake\u002Fwhatlang-rs\u002Fblob\u002Fmaster\u002FSUPPORTED_LANGUAGES.md)\n* 100% written in Rust\n* Lightweight, fast and simple\n* Recognizes not only a language, but also a script (Latin, Cyrillic, etc)\n* Provides reliability information\n\n## Get started\n\nExample:\n\n```rust\nuse whatlang::{detect, Lang, Script};\n\nfn main() {\n    let text = \"Ĉu vi ne volas eklerni Esperanton? Bonvolu! Estas unu de la plej bonaj aferoj!\";\n\n    let info = detect(text).unwrap();\n    assert_eq!(info.lang(), Lang::Epo);\n    assert_eq!(info.script(), Script::Latin);\n    assert_eq!(info.confidence(), 1.0);\n    assert!(info.is_reliable());\n}\n```\n\nFor more details (e.g. 
how to blacklist some languages) please check the [documentation](https:\u002F\u002Fdocs.rs\u002Fwhatlang).\n\n## Who uses Whatlang?\n\nWhatlang is used within the following big projects as a direct or indirect dependency for language recognition.\nYou're gonna be in great company using Whatlang:\n\n* [Sonic](https:\u002F\u002Fgithub.com\u002Fvaleriansaliou\u002Fsonic) - fast, lightweight and schema-less search backend in Rust.\n* [Meilisearch](https:\u002F\u002Fgithub.com\u002Fmeilisearch) - an open-source, easy-to-use, blazingly fast, and hyper-relevant search engine built in Rust.\n\n## Feature toggles\n\n| Feature     | Description                                                                           |\n|-------------|---------------------------------------------------------------------------------------|\n| `enum-map`  | `Lang` and `Script` implement `Enum` trait from [enum-map](https:\u002F\u002Fdocs.rs\u002Fenum-map\u002F) |\n| `arbitrary` | Support [Arbitrary](https:\u002F\u002Fcrates.io\u002Fcrates\u002Farbitrary)                               |\n| `serde`     | Implements `Serialize` and `Deserialize` for `Lang` and `Script`                      |\n| `dev`       | Enables `whatlang::dev` module which provides some internal API.\u003Cbr\u002F> It exists for profiling purposes and normal users are discouraged from relying on this API.  |\n\n## How does it work?\n\n### How does the language recognition work?\n\nThe algorithm is based on trigram language models, which are a particular case of n-gram models.\nTo understand the idea, please check the original whitepaper [Cavnar and Trenkle '94: N-Gram-Based Text Categorization'](https:\u002F\u002Fwww.researchgate.net\u002Fpublication\u002F2375544_N-Gram-Based_Text_Categorization).\n\n### How is `is_reliable` calculated?\n\nIt is based on the following factors:\n* How many unique trigrams are in the given text\n* How big is the difference between the first and the second (not returned) detected languages? 
This metric is called `rate` in the code base.\n\nTherefore, it can be presented as a 2D space with a threshold function that splits it into \"Reliable\" and \"Not reliable\" areas.\nThis function is a hyperbola and looks like the following:\n\n\u003Cimg alt=\"Language recognition whatlang rust\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fgreyblake_whatlang-rs_readme_d5997628afe3.png\" width=\"450\" height=\"300\" \u002F>\n\nFor more details, please check the blog article [Introduction to Rust Whatlang Library and Natural Language Identification Algorithms](https:\u002F\u002Fwww.greyblake.com\u002Fblog\u002Fintroduction-to-rust-whatlang-library-and-natural-language-identification-algorithms\u002F).\n\n## Make tasks\n\n* `make bench` - run performance benchmarks\n* `make doc` - generate and open the docs\n* `make test` - run tests\n* `make watch` - watch changes and run tests\n\n## Comparison with alternatives\n\n|                           | Whatlang   | CLD2        | CLD3           |\n| ------------------------- | ---------- | ----------- | -------------- |\n| Implementation language   | Rust       | C++         | C++            |\n| Languages                 | 70         | 83          | 107            |\n| Algorithm                 | trigrams   | quadgrams   | neural network |\n| Supported Encoding        | UTF-8      | UTF-8       | ?              |\n| HTML support              | no         | yes         | ?              
|\n\n\n## Ports and clones\n\n* [whatlang-ffi](https:\u002F\u002Fgithub.com\u002Fgreyblake\u002Fwhatlang-ffi) - C bindings\n* [whatlanggo](https:\u002F\u002Fgithub.com\u002Fabadojack\u002Fwhatlanggo) - whatlang clone for Go language\n* [whatlang-py](https:\u002F\u002Fgithub.com\u002Fcathalgarvey\u002Fwhatlang-py) - bindings for Python\n* [whatlang-rb](https:\u002F\u002Fgitlab.com\u002FKitaitiMakoto\u002Fwhatlang-rb) - bindings for Ruby\n* [whatlangex](https:\u002F\u002Fgithub.com\u002Fpierrelegall\u002Fwhatlangex) - bindings for Elixir\n\n## Donations\n\nYou can support the project by donating [NEAR tokens](https:\u002F\u002Fnear.org).\n\nOur NEAR wallet address is `whatlang.near`\n\n## Derivation\n\n**Whatlang** is a derivative work from [Franc](https:\u002F\u002Fgithub.com\u002Fwooorm\u002Ffranc) (JavaScript, MIT) by [Titus Wormer](https:\u002F\u002Fgithub.com\u002Fwooorm).\n\n## License\n\n[MIT](https:\u002F\u002Fgithub.com\u002Fgreyblake\u002Fwhatlang-rs\u002Fblob\u002Fmaster\u002FLICENSE) © [Sergey Potapov](http:\u002F\u002Fgreyblake.com\u002F)\n\n\n## Contributors\n\n- [greyblake](https:\u002F\u002Fgithub.com\u002Fgreyblake) Potapov Sergey - creator, maintainer.\n- [Dr-Emann](https:\u002F\u002Fgithub.com\u002FDr-Emann) Zachary Dremann - optimization and improvements\n- [BaptisteGelez](https:\u002F\u002Fgithub.com\u002FBaptisteGelez) Baptiste Gelez - improvements\n- [Vishesh Chopra](https:\u002F\u002Fgithub.com\u002FKarmicKonquest) - designed the logo\n- [Joel Natividad](https:\u002F\u002Fgithub.com\u002Fjqnatividad) - support of Tagalog\n- [ManyTheFish](https:\u002F\u002Fgithub.com\u002FManyTheFish) - crazy optimization\n- [Kerollmops](https:\u002F\u002Fgithub.com\u002FKerollmops) Clément Renault - crazy optimization\n","\u003Cp align=\"center\">\u003Cimg width=\"160\" src=\"https:\u002F\u002Fraw.githubusercontent.com\u002Fgreyblake\u002Fwhatlang-rs\u002Fmaster\u002Fmisc\u002Flogo\u002Fwhatlang-logo.svg\" alt=\"Whatlang - 用于自然语言检测的 Rust 
库\">\u003C\u002Fp>\n\n\u003Ch1 align=\"center\">Whatlang\u003C\u002Fh1>\n\n\u003Cp align=\"center\">专注于简洁和性能的 Rust 自然语言检测库。\u003C\u002Fp>\n\u003Cp align=\"center\">\u003Ca href=\"https:\u002F\u002Fwhatlang.org\u002F\" target=\"_blank\">在线试用演示。\u003C\u002Fa>\u003C\u002Fp>\n\n\u003Cp align=\"center\">\n\u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fgreyblake\u002Fwhatlang-rs\u002Factions\u002Fworkflows\u002Fci.yml\" rel=\"nofollow\">\u003Cimg src=\"https:\u002F\u002Fgithub.com\u002Fgreyblake\u002Fwhatlang-rs\u002Factions\u002Fworkflows\u002Fci.yml\u002Fbadge.svg\" alt=\"构建状态\">\u003C\u002Fa>\n\u003Ca href=\"https:\u002F\u002Fraw.githubusercontent.com\u002Fgreyblake\u002Fwhatlang-rs\u002Fmaster\u002FLICENSE\" rel=\"nofollow\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Flicense-MIT-blue.svg\" alt=\"许可证\">\u003C\u002Fa>\n\u003Ca href=\"https:\u002F\u002Fdocs.rs\u002Fwhatlang\" rel=\"nofollow\">\u003Cimg src=\"https:\u002F\u002Fdocs.rs\u002Fwhatlang\u002Fbadge.svg\" alt=\"文档\">\u003C\u002Fa>\n\u003Cp>\n\n[![与乌克兰站在一起](https:\u002F\u002Fraw.githubusercontent.com\u002Fvshymanskyy\u002FStandWithUkraine\u002Fmain\u002Fbanner2-direct.svg)](https:\u002F\u002Fstand-with-ukraine.pp.ua\u002F)\n\n## 目录\n* [功能](#features)\n* [入门指南](#get-started)\n* [谁在使用 Whatlang？](#who-uses-whatlang)\n* [文档](https:\u002F\u002Fdocs.rs\u002Fwhatlang)\n* [支持的语言](https:\u002F\u002Fgithub.com\u002Fgreyblake\u002Fwhatlang-rs\u002Fblob\u002Fmaster\u002FSUPPORTED_LANGUAGES.md)\n* [功能开关](#feature-toggles)\n* [工作原理](#how-does-it-work)\n* [Make 任务](#make-tasks)\n* [与替代方案的比较](#comparison-with-alternatives)\n* [移植版本和克隆](#ports-and-clones)\n* [捐赠](#donations)\n* [衍生来源](#derivation)\n* [许可证](#license)\n* [贡献者](#contributors)\n\n\n## 功能\n* 支持 [70 种语言](https:\u002F\u002Fgithub.com\u002Fgreyblake\u002Fwhatlang-rs\u002Fblob\u002Fmaster\u002FSUPPORTED_LANGUAGES.md)\n* 100% 使用 Rust 编写\n* 轻量、快速且简单\n* 不仅能识别语言，还能识别书写系统（拉丁文、西里尔文等）\n* 提供可靠性信息\n\n## 入门指南\n\n示例：\n\n```rust\nuse whatlang::{detect, Lang, 
Script};\n\nfn main() {\n    let text = \"Ĉu vi ne volas eklerni Esperanton? Bonvolu! Estas unu de la plej bonaj aferoj!\";\n\n    let info = detect(text).unwrap();\n    assert_eq!(info.lang(), Lang::Epo);\n    assert_eq!(info.script(), Script::Latin);\n    assert_eq!(info.confidence(), 1.0);\n    assert!(info.is_reliable());\n}\n```\n\n更多详情（例如如何屏蔽某些语言），请查看 [文档](https:\u002F\u002Fdocs.rs\u002Fwhatlang)。\n\n## 谁在使用 Whatlang？\n\nWhatlang 被用于以下大型项目中，作为语言识别的直接或间接依赖。\n使用 Whatlang，你将与优秀的公司为伍：\n\n* [Sonic](https:\u002F\u002Fgithub.com\u002Fvaleriansaliou\u002Fsonic) - 快速、轻量且无模式的 Rust 搜索后端。\n* [Meilisearch](https:\u002F\u002Fgithub.com\u002Fmeilisearch) - 一个开源、易于使用、极速且高度相关的 Rust 搜索引擎。\n\n## 功能开关\n\n| 功能     | 描述                                                                           |\n|-------------|---------------------------------------------------------------------------------------|\n| `enum-map`  | `Lang` 和 `Script` 实现来自 [enum-map](https:\u002F\u002Fdocs.rs\u002Fenum-map\u002F) 的 `Enum` 特征 |\n| `arbitrary` | 支持 [Arbitrary](https:\u002F\u002Fcrates.io\u002Fcrates\u002Farbitrary)                               |\n| `serde`     | 为 `Lang` 和 `Script` 实现 `Serialize` 和 `Deserialize`                      |\n| `dev`       | 启用 `whatlang::dev` 模块，该模块提供一些内部 API。\u003Cbr\u002F>它仅用于性能分析目的，不建议普通用户依赖此 API。  |\n\n## 工作原理\n\n### 语言识别是如何工作的？\n\n该算法基于三元组语言模型（trigram language models），这是 n-gram（n-元语法）的一个特例。\n要理解这一概念，请查阅原始白皮书 [Cavnar and Trenkle '94: N-Gram-Based Text Categorization'](https:\u002F\u002Fwww.researchgate.net\u002Fpublication\u002F2375544_N-Gram-Based_Text_Categorization)。\n\n### `is_reliable` 是如何计算的？\n\n它基于以下因素：\n* 给定文本中有多少个唯一的三元组（unique trigrams）\n* 第一个和第二个（未返回的）检测到的语言之间的差异有多大？代码库中将此指标称为 `rate`。\n\n因此，它可以表示为一个带有阈值函数的二维空间，将其划分为“可靠”和“不可靠”区域。\n这个函数是一个双曲线，看起来像下面这样：\n\n\u003Cimg alt=\"语言识别 whatlang rust\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fgreyblake_whatlang-rs_readme_d5997628afe3.png\" width=\"450\" height=\"300\" \u002F>\n\n更多详情，请查看博客文章 [Rust 
Whatlang 库与自然语言识别算法介绍](https:\u002F\u002Fwww.greyblake.com\u002Fblog\u002Fintroduction-to-rust-whatlang-library-and-natural-language-identification-algorithms\u002F)。\n\n## Make 任务\n\n* `make bench` - 运行性能基准测试\n* `make doc` - 生成并打开文档\n* `make test` - 运行测试\n* `make watch` - 监控变化并运行测试\n\n## 与替代方案的比较\n\n|                           | Whatlang   | CLD2        | CLD3           |\n| ------------------------- | ---------- | ----------- | -------------- |\n| 实现语言                  | Rust       | C++         | C++            |\n| 语言数量                  | 70         | 83          | 107            |\n| 算法                      | 三元组     | 四元组      | 神经网络       |\n| 支持的编码                | UTF-8      | UTF-8       | ?              |\n| HTML 支持                 | 否         | 是          | ?              |\n\n\n## 移植版本和克隆\n\n* [whatlang-ffi](https:\u002F\u002Fgithub.com\u002Fgreyblake\u002Fwhatlang-ffi) - C 绑定\n* [whatlanggo](https:\u002F\u002Fgithub.com\u002Fabadojack\u002Fwhatlanggo) - 用于 Go 语言的 whatlang 克隆版\n* [whatlang-py](https:\u002F\u002Fgithub.com\u002Fcathalgarvey\u002Fwhatlang-py) - Python 绑定\n* [whatlang-rb](https:\u002F\u002Fgitlab.com\u002FKitaitiMakoto\u002Fwhatlang-rb) - Ruby 绑定\n* [whatlangex](https:\u002F\u002Fgithub.com\u002Fpierrelegall\u002Fwhatlangex) - Elixir 绑定\n\n## 捐赠\n\n你可以通过捐赠 [NEAR 代币](https:\u002F\u002Fnear.org) 来支持该项目。\n\n我们的 NEAR 钱包地址是 `whatlang.near`\n\n## 衍生来源\n\n**Whatlang** 是基于 [Franc](https:\u002F\u002Fgithub.com\u002Fwooorm\u002Ffranc)（JavaScript，MIT）由 [Titus Wormer](https:\u002F\u002Fgithub.com\u002Fwooorm) 开发的衍生作品。\n\n## 许可证\n\n[MIT](https:\u002F\u002Fgithub.com\u002Fgreyblake\u002Fwhatlang-rs\u002Fblob\u002Fmaster\u002FLICENSE) © [Sergey Potapov](http:\u002F\u002Fgreyblake.com\u002F)\n\n## 贡献者\n\n- [greyblake](https:\u002F\u002Fgithub.com\u002Fgreyblake) Potapov Sergey - 创建者，维护者。\n- [Dr-Emann](https:\u002F\u002Fgithub.com\u002FDr-Emann) Zachary Dremann - 优化和改进\n- [BaptisteGelez](https:\u002F\u002Fgithub.com\u002FBaptisteGelez) Baptiste Gelez - 
改进\n- [Vishesh Chopra](https:\u002F\u002Fgithub.com\u002FKarmicKonquest) - 设计了标志\n- [Joel Natividad](https:\u002F\u002Fgithub.com\u002Fjqnatividad) - 支持塔加洛语\n- [ManyTheFish](https:\u002F\u002Fgithub.com\u002FManyTheFish) - 疯狂优化\n- [Kerollmops](https:\u002F\u002Fgithub.com\u002FKerollmops) Clément Renault - 疯狂优化","# whatlang-rs 快速上手指南\n\n**whatlang-rs** 是一个专注于简洁性和性能的 Rust 自然语言检测库。它支持 70 种语言，不仅能识别语言，还能识别脚本（如拉丁文、西里尔文等），并提供可靠性评估。\n\n## 环境准备\n\n*   **操作系统**: Linux, macOS, Windows 等支持 Rust 的系统。\n*   **开发语言**: Rust (建议版本 1.60+)。\n*   **前置依赖**: 确保已安装 Rust 工具链 (`rustup`)。\n\n## 安装步骤\n\n在您的 Rust 项目根目录下，通过 Cargo 添加依赖：\n\n```bash\ncargo add whatlang\n```\n\n或者手动编辑 `Cargo.toml` 文件：\n\n```toml\n[dependencies]\nwhatlang = \"0.16\" # 请根据 crates.io 最新版本调整\n```\n\n> **提示**: 如需序列化支持，可在依赖中添加特性 `serde`。例如：`whatlang = { version = \"0.16\", features = [\"serde\"] }`。\n\n## 基本使用\n\n引入 `whatlang` 模块并调用 `detect` 函数即可开始检测文本语言。\n\n```rust\nuse whatlang::{detect, Lang, Script};\n\nfn main() {\n    let text = \"Ĉu vi ne volas eklerni Esperanton? Bonvolu! 
Estas unu de la plej bonaj aferoj!\";\n\n    let info = detect(text).unwrap();\n    \n    \u002F\u002F 获取语言代码\n    assert_eq!(info.lang(), Lang::Epo);\n    \n    \u002F\u002F 获取脚本类型\n    assert_eq!(info.script(), Script::Latin);\n    \n    \u002F\u002F 获取置信度 (0.0 - 1.0)\n    assert_eq!(info.confidence(), 1.0);\n    \n    \u002F\u002F 判断检测结果是否可靠\n    assert!(info.is_reliable());\n}\n```\n\n### 核心 API 说明\n\n| 方法 | 描述 |\n| :--- | :--- |\n| `detect(text)` | 执行语言检测，返回 `Option\u003CInfo>`。 |\n| `info.lang()` | 返回检测到的语言枚举 (`Lang`)。 |\n| `info.script()` | 返回使用的字符集脚本 (`Script`)。 |\n| `info.confidence()` | 返回检测的置信度分数。 |\n| `info.is_reliable()` | 基于唯一三元组（trigram）数量和语言差异率判断结果是否可信。 |\n\n更多高级功能（如黑名单过滤）请参考 [官方文档](https:\u002F\u002Fdocs.rs\u002Fwhatlang)。","某团队正在开发基于 Rust 的国际社区后端，需要自动识别用户评论的语言以便路由到对应的翻译服务。\n\n### 没有 whatlang-rs 时\n- 依赖硬编码的关键词匹配，遇到生僻词或混合语言时准确率极低。\n- 无法区分拉丁文与西里尔文等相似字符，导致路由到错误的翻译接口。\n- 缺乏置信度反馈，系统难以判断是否需要人工复核低质量检测结果。\n- 集成外部 API 成本高且响应慢，严重影响高并发下的系统吞吐量。\n\n### 使用 whatlang-rs 后\n- whatlang-rs 直接支持 70 多种语言，无需维护庞大的关键词库即可精准识别。\n- 能同时检测脚本类型，有效防止因字符集混淆导致的处理逻辑错误。\n- 返回可靠的置信度评分，让系统自动过滤掉低把握的评论进入人工审核队列。\n- 纯 Rust 实现零依赖且速度极快，显著降低了服务器 CPU 开销和延迟。\n\nwhatlang-rs 以轻量高效的特性，为 Rust 应用提供了生产级的多语言处理能力。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fgreyblake_whatlang-rs_116f8403.png","greyblake","Serhii Potapov","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002Fgreyblake_99d7e84e.png","La suerte se burla de mi otra vez... 
En forma de  m̶u̶j̶e̶r̶    JS...",null,"Berlin","https:\u002F\u002Fgreyblake.com","https:\u002F\u002Fgithub.com\u002Fgreyblake",[84,88,92,96],{"name":85,"color":86,"percentage":87},"Rust","#dea584",98.1,{"name":89,"color":90,"percentage":91},"HTML","#e34c26",1.2,{"name":93,"color":94,"percentage":95},"Ruby","#701516",0.7,{"name":97,"color":98,"percentage":99},"Makefile","#427819",0,1066,121,"2026-04-04T22:43:50","MIT",1,"未说明",{"notes":107,"python":105,"dependencies":108},"纯 Rust 编写，基于三词法（trigram）算法，轻量快速。支持 70 种语言和脚本识别。无需 GPU。Python 用户需使用独立的 whatlang-py 绑定库。",[109,110,111],"enum-map","serde","arbitrary",[14,13,26,15],[114,115,116,117,118,119,120,121,122,123,124,125,126],"language","rust","nlp","text-analysis","text-classification","classifier","ai","detect-language","language-recognition","whatlang","algorithm","text-classifier","rustlang","2026-03-27T02:49:30.150509","2026-04-06T05:37:17.056663",[130,135,140,145,150,155],{"id":131,"question_zh":132,"answer_zh":133,"source_url":134},2779,"更新版本后编译报错‘枚举未覆盖’怎么办？","维护者已将旧版本 0.7.4 下架并发布 0.8.0 修复此问题，建议升级至最新版本。若需避免此类破坏性变更，可将 `Lang` 枚举标记为 `#[non_exhaustive]` 以表达添加枚举项为非破坏性更改。","https:\u002F\u002Fgithub.com\u002Fgreyblake\u002Fwhatlang-rs\u002Fissues\u002F50",{"id":136,"question_zh":137,"answer_zh":138,"source_url":139},2780,"如何为新语言准备高质量的 n-grams 数据？","在生成 n-grams 时，注意单引号、斜杠及其他标点符号和数字不应被用于三元组（trigrams）。建议参考现有语料库格式自行生成或查找合适资源，例如使用大语料库生成自己的 n-grams 集。","https:\u002F\u002Fgithub.com\u002Fgreyblake\u002Fwhatlang-rs\u002Fissues\u002F55",{"id":141,"question_zh":142,"answer_zh":143,"source_url":144},2781,"库未内置文本方向检测，如何扩展此功能？","可以在应用层通过自定义 Trait 扩展。例如定义 `trait ScriptDirection { fn direction(&self) -> Direction; }` 并为 `whatlang::Script` 实现该 Trait，根据匹配项返回 `Direction::Ltr` 或 `Direction::Rtl`，然后在模块中导入该 Trait 
即可使用。","https:\u002F\u002Fgithub.com\u002Fgreyblake\u002Fwhatlang-rs\u002Fissues\u002F41",{"id":146,"question_zh":147,"answer_zh":148,"source_url":149},2782,"为什么日语检测准确率相对较低？","主要是因为日语大量使用汉字，与中文难以区分。改进算法建议：若文本仅含汉字则判为中文；若含汉字且片假名\u002F平假名占比达 25% 以上则判为日语。目前已有相关优化计划实施。","https:\u002F\u002Fgithub.com\u002Fgreyblake\u002Fwhatlang-rs\u002Fissues\u002F88",{"id":151,"question_zh":152,"answer_zh":153,"source_url":154},2783,"为什么使用 `hashbrown` 而非标准库 `HashMap`？","基准测试表明，使用 `std::collections::HashMap` 会导致 `detect()` 函数性能下降约 2 倍。为了保持高性能，项目选择保留 `hashbrown` 依赖。","https:\u002F\u002Fgithub.com\u002Fgreyblake\u002Fwhatlang-rs\u002Fissues\u002F98",{"id":156,"question_zh":157,"answer_zh":158,"source_url":159},2784,"如何使用新 API 进行语言检测及设置白名单？","可使用 `whatlang::new(text).detect()` 获取结果，或通过 `.whitelist(&list)` 指定允许的语言列表。也可单独调用 `detect_lang()` 获取语言或 `detect_script()` 获取脚本。新 API 已实现，支持一行代码获取结果。","https:\u002F\u002Fgithub.com\u002Fgreyblake\u002Fwhatlang-rs\u002Fissues\u002F5",[161,166,171,176],{"id":162,"version":163,"summary_zh":164,"released_at":165},102262,"v0.16.2","Changes:\r\n* Add optional [Arbitrary](https:\u002F\u002Fcrates.io\u002Fcrates\u002Farbitrary) support ","2022-10-23T14:15:03",{"id":167,"version":168,"summary_zh":169,"released_at":170},102263,"v0.7.0","A new version of Whatlang (library for natural language recognition in rust) released.\r\n\r\n### Changes\r\n* Support Afrikaans language (`afr`)\r\n* Get rid of build dependencies: installation is much faster now\r\n","2019-03-03T16:17:34",{"id":172,"version":173,"summary_zh":174,"released_at":175},102264,"v0.6.0","* Use hashbrown instead of fnv (detect() is 30% faster)\r\n* Use array on stack instead of vector for detect_script (1-2% faster)\r\n* Use build.rs to generate `lang.rs` file\r\n* Add property based testing\r\n","2019-03-03T16:10:15",{"id":177,"version":178,"summary_zh":179,"released_at":180},102265,"v0.5.0","* (breaking) Rename `Lang::to_code(&self)` to `Lang::code(&self)`\r\n* (fix) Fix bug with zero division in 
confidence calculation\r\n* (fix) Confidence can not exceed 1.0\r\n* Implement `Lang::eng_name(&self) -> &str` function\r\n* Implement `Lang::name(&self) -> &str` function\r\n* Implement `Script::name(&self) -> &str` function\r\n* Implement trait `Display` for `Script`\r\n* Implement `Display` trait for `Lang`\r\n","2019-03-03T16:12:25"]