[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-stanford-oval--WikiChat":3,"tool-stanford-oval--WikiChat":61},[4,18,26,36,44,53],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":17},4358,"openclaw","openclaw\u002Fopenclaw","OpenClaw 是一款专为个人打造的本地化 AI 助手，旨在让你在自己的设备上拥有完全可控的智能伙伴。它打破了传统 AI 助手局限于特定网页或应用的束缚，能够直接接入你日常使用的各类通讯渠道，包括微信、WhatsApp、Telegram、Discord、iMessage 等数十种平台。无论你在哪个聊天软件中发送消息，OpenClaw 都能即时响应，甚至支持在 macOS、iOS 和 Android 设备上进行语音交互，并提供实时的画布渲染功能供你操控。\n\n这款工具主要解决了用户对数据隐私、响应速度以及“始终在线”体验的需求。通过将 AI 部署在本地，用户无需依赖云端服务即可享受快速、私密的智能辅助，真正实现了“你的数据，你做主”。其独特的技术亮点在于强大的网关架构，将控制平面与核心助手分离，确保跨平台通信的流畅性与扩展性。\n\nOpenClaw 非常适合希望构建个性化工作流的技术爱好者、开发者，以及注重隐私保护且不愿被单一生态绑定的普通用户。只要具备基础的终端操作能力（支持 macOS、Linux 及 Windows WSL2），即可通过简单的命令行引导完成部署。如果你渴望拥有一个懂你",349277,3,"2026-04-06T06:32:30",[13,14,15,16],"Agent","开发框架","图像","数据工具","ready",{"id":19,"name":20,"github_repo":21,"description_zh":22,"stars":23,"difficulty_score":10,"last_commit_at":24,"category_tags":25,"status":17},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,"2026-04-05T11:01:52",[14,15,13],{"id":27,"name":28,"github_repo":29,"description_zh":30,"stars":31,"difficulty_score":32,"last_commit_at":33,"category_tags":34,"status":17},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 
代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",142651,2,"2026-04-06T23:34:12",[14,13,35],"语言模型",{"id":37,"name":38,"github_repo":39,"description_zh":40,"stars":41,"difficulty_score":32,"last_commit_at":42,"category_tags":43,"status":17},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",107888,"2026-04-06T11:32:50",[14,15,13],{"id":45,"name":46,"github_repo":47,"description_zh":48,"stars":49,"difficulty_score":32,"last_commit_at":50,"category_tags":51,"status":17},4721,"markitdown","microsoft\u002Fmarkitdown","MarkItDown 是一款由微软 AutoGen 团队打造的轻量级 Python 工具，专为将各类文件高效转换为 Markdown 格式而设计。它支持 PDF、Word、Excel、PPT、图片（含 OCR）、音频（含语音转录）、HTML 乃至 YouTube 链接等多种格式的解析，能够精准提取文档中的标题、列表、表格和链接等关键结构信息。\n\n在人工智能应用日益普及的今天，大语言模型（LLM）虽擅长处理文本，却难以直接读取复杂的二进制办公文档。MarkItDown 恰好解决了这一痛点，它将非结构化或半结构化的文件转化为模型“原生理解”且 Token 效率极高的 Markdown 格式，成为连接本地文件与 AI 分析 pipeline 的理想桥梁。此外，它还提供了 MCP（模型上下文协议）服务器，可无缝集成到 Claude Desktop 等 LLM 应用中。\n\n这款工具特别适合开发者、数据科学家及 AI 研究人员使用，尤其是那些需要构建文档检索增强生成（RAG）系统、进行批量文本分析或希望让 AI 
助手直接“阅读”本地文件的用户。虽然生成的内容也具备一定可读性，但其核心优势在于为机器",93400,"2026-04-06T19:52:38",[52,14],"插件",{"id":54,"name":55,"github_repo":56,"description_zh":57,"stars":58,"difficulty_score":10,"last_commit_at":59,"category_tags":60,"status":17},4487,"LLMs-from-scratch","rasbt\u002FLLMs-from-scratch","LLMs-from-scratch 是一个基于 PyTorch 的开源教育项目，旨在引导用户从零开始一步步构建一个类似 ChatGPT 的大型语言模型（LLM）。它不仅是同名技术著作的官方代码库，更提供了一套完整的实践方案，涵盖模型开发、预训练及微调的全过程。\n\n该项目主要解决了大模型领域“黑盒化”的学习痛点。许多开发者虽能调用现成模型，却难以深入理解其内部架构与训练机制。通过亲手编写每一行核心代码，用户能够透彻掌握 Transformer 架构、注意力机制等关键原理，从而真正理解大模型是如何“思考”的。此外，项目还包含了加载大型预训练权重进行微调的代码，帮助用户将理论知识延伸至实际应用。\n\nLLMs-from-scratch 特别适合希望深入底层原理的 AI 开发者、研究人员以及计算机专业的学生。对于不满足于仅使用 API，而是渴望探究模型构建细节的技术人员而言，这是极佳的学习资源。其独特的技术亮点在于“循序渐进”的教学设计：将复杂的系统工程拆解为清晰的步骤，配合详细的图表与示例，让构建一个虽小但功能完备的大模型变得触手可及。无论你是想夯实理论基础，还是为未来研发更大规模的模型做准备",90106,"2026-04-06T11:19:32",[35,15,13,14],{"id":62,"github_repo":63,"name":64,"description_en":65,"description_zh":66,"ai_summary_zh":66,"readme_en":67,"readme_zh":68,"quickstart_zh":69,"use_case_zh":70,"hero_image_url":71,"owner_login":72,"owner_name":73,"owner_avatar_url":74,"owner_bio":75,"owner_company":76,"owner_location":76,"owner_email":77,"owner_twitter":78,"owner_website":79,"owner_url":80,"languages":81,"stars":98,"forks":99,"last_commit_at":100,"license":101,"difficulty_score":102,"env_os":103,"env_gpu":104,"env_ram":105,"env_deps":106,"category_tags":120,"github_topics":121,"view_count":32,"oss_zip_url":76,"oss_zip_packed_at":76,"status":17,"created_at":129,"updated_at":130,"faqs":131,"releases":162},4723,"stanford-oval\u002FWikiChat","WikiChat","WikiChat is an improved RAG. 
It stops the hallucination of large language models by retrieving data from a corpus.","WikiChat 是一款由斯坦福大学研发的开源项目，旨在解决大型语言模型（LLM）在对话中频繁产生“幻觉”或编造事实的痛点。无论是面对近期发生的新闻事件，还是较为冷门的知识点，传统 AI 聊天机器人往往容易给出错误信息，而 WikiChat 通过引入维基百科作为权威数据源，确保回答的真实可靠。\n\n其核心亮点在于独特的七阶段处理流水线。该系统并非简单地将检索结果丢给模型，而是通过多轮大模型调用，依次执行检索、去噪、证据提取及事实核查等步骤，严格将生成的回答“锚定”在维基百科的真实内容上。这种机制有效过滤了模型自身的臆造，显著提升了回复的准确性。此外，WikiChat 还支持包括英语、法语、德语在内的 25 种语言版本，并提供了从本地部署到多用户云端访问的灵活配置方案。\n\n这款工具非常适合希望构建高可信度问答系统的开发者、从事自然语言处理研究的研究人员，以及对事实准确性有严苛要求的企业用户。对于普通用户而言，它也是一个验证信息真伪、获取可靠知识的强力助手。通过结合检索增强生成（RAG）技术与严谨的核查流程，WikiChat 让 AI 对话真正做到了“言之有据”。","\u003Cp align=\"center\">\n    \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fstanford-oval_WikiChat_readme_fd2e8b698a8b.png\" width=\"120px\" alt=\"WikiChat Logo\" style=\"display: block; margin: 0 auto;\" \u002F>\n    \u003Ch1 align=\"center\">\n        \u003Cb>WikiChat\u003C\u002Fb>\n        \u003Cbr>\n        \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2305.14292\">\n            \u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fcs.CL-2305.14292-b31b1b\" alt=\"arXiv\">\n        \u003C\u002Fa>\n        \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fstanford-oval\u002FWikiChat\u002Fstargazers\">\n            \u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fstanford-oval\u002FWikiChat?style=social\" alt=\"Github Stars\">\n        \u003C\u002Fa>\n    \u003C\u002Fh1>\n\u003C\u002Fp>\n\u003Cp align=\"center\">\n    Stopping the Hallucination of Large Language Models\n\u003C\u002Fp>\n\u003Cp align=\"center\">\n    \u003C!-- \u003Ca href=\"https:\u002F\u002Fstanford.edu\" target=\"_blank\">\n        \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fstanford-oval_WikiChat_readme_159a5334f870.png\" width=\"140px\" alt=\"Stanford University\" \u002F>\n    \u003C\u002Fa> -->\n\u003C\u002Fp>\n\u003Cp align=\"center\">\n    Online demo:\n    \u003Ca href=\"https:\u002F\u002Fwikichat.genie.stanford.edu\" 
target=\"_blank\">\n        https:\u002F\u002Fwikichat.genie.stanford.edu\n    \u003C\u002Fa>\n    \u003Cbr>\n\u003C\u002Fp>\n\n\n\n\nhttps:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F3ac856ba-682c-4aed-9271-ce2f6a27cd5e\n\n\n\n# Table of Contents\n- [Introduction](#introduction)\n  - [🚨 Announcements](#-announcements)\n- [Installation](#installation)\n  - [System Requirements](#system-requirements)\n  - [Install Dependencies](#install-dependencies)\n  - [Configure the LLM of Your Choice](#configure-the-llm-of-your-choice)\n  - [Configure Information Retrieval](#configure-information-retrieval)\n    - [Option 1 (Default): Use our free rate-limited Wikipedia search API](#option-1-default-use-our-free-rate-limited-wikipedia-search-api)\n    - [Option 2: Build your own index](#option-2-build-your-own-index)\n      - [To build a Wikipedia index](#to-build-a-wikipedia-index)\n      - [To index custom documents](#to-index-custom-documents)\n      - [To use an Azure AI deployment of the embedding model instead of a local one](#to-use-an-azure-ai-deployment-of-the-embedding-model-instead-of-a-local-one)\n      - [To upload a Qdrant index to 🤗 Hub](#to-upload-a-qdrant-index-to--hub)\n  - [Run WikiChat in Terminal](#run-wikichat-in-terminal)\n  - [[Optional] Deploy WikiChat for Multi-user Access](#optional-deploy-wikichat-for-multi-user-access)\n    - [Set up Cosmos DB](#set-up-cosmos-db)\n    - [Run Chainlit](#run-chainlit)\n- [The Free Rate-limited Wikipedia Search API](#the-free-rate-limited-wikipedia-search-api)\n- [Wikipedia Preprocessing](#wikipedia-preprocessing)\n- [Other Commands](#other-commands)\n  - [Run a Distilled Model for Lower Latency and Cost](#run-a-distilled-model-for-lower-latency-and-cost)\n  - [Simulate Conversations](#simulate-conversations)\n- [License](#license)\n- [Citation](#citation)\n\n\n\n\u003C!-- \u003Chr \u002F> -->\n\n\n# Introduction\n\nLarge language model (LLM) chatbots like ChatGPT and GPT-4 get things wrong a lot, 
especially if the information you are looking for is recent (\"Tell me about the 2024 Super Bowl.\") or about less popular topics (\"What are some good movies to watch from [insert your favorite foreign director]?\").\nWikiChat uses Wikipedia and the following 7-stage pipeline to make sure its responses are factual. Each numbered stage involves one or more LLM calls.\n\n\n\u003Cp align=\"center\">\n    \u003Cimg src=\".\u002Fpublic\u002Fimg\u002Fpipeline.svg\" width=\"700px\" alt=\"WikiChat Pipeline\" \u002F>\n\u003C\u002Fp>\n\nCheck out our paper for more details:\nSina J. Semnani, Violet Z. Yao*, Heidi C. Zhang*, and Monica S. Lam. 2023. [WikiChat: Stopping the Hallucination of Large Language Model Chatbots by Few-Shot Grounding on Wikipedia](https:\u002F\u002Farxiv.org\u002Fabs\u002F2305.14292). In Findings of the Association for Computational Linguistics: EMNLP 2023, Singapore. Association for Computational Linguistics.\n\n## 🚨 Announcements\n- (April 29, 2025) WikiChat 2.1 is now available! 
Key updates include:\n  - **Improved Multilingual Support**: Now supports 25 different Wikipedias (up from 10) available via web and API at search.genie.stanford.edu\u002Fwikipedia_20250320: 🇺🇸 [English](https:\u002F\u002Fen.wikipedia.org\u002F), 🇫🇷 [French](https:\u002F\u002Ffr.wikipedia.org\u002F), 🇩🇪 [German](https:\u002F\u002Fde.wikipedia.org\u002F), 🇪🇸 [Spanish](https:\u002F\u002Fes.wikipedia.org\u002F), 🇯🇵 [Japanese](https:\u002F\u002Fja.wikipedia.org\u002F), 🇷🇺 [Russian](https:\u002F\u002Fru.wikipedia.org\u002F), 🇵🇹 [Portuguese](https:\u002F\u002Fpt.wikipedia.org\u002F), 🇨🇳 [Chinese](https:\u002F\u002Fzh.wikipedia.org\u002F), 🇮🇹 [Italian](https:\u002F\u002Fit.wikipedia.org\u002F), 🇸🇦 [Arabic](https:\u002F\u002Far.wikipedia.org\u002F), 🇮🇷 [Persian](https:\u002F\u002Ffa.wikipedia.org\u002F), 🇵🇱 [Polish](https:\u002F\u002Fpl.wikipedia.org\u002F), 🇳🇱 [Dutch](https:\u002F\u002Fnl.wikipedia.org\u002F), 🇺🇦 [Ukrainian](https:\u002F\u002Fuk.wikipedia.org\u002F), 🇮🇱 [Hebrew](https:\u002F\u002Fhe.wikipedia.org\u002F), 🇮🇩 [Indonesian](https:\u002F\u002Fid.wikipedia.org\u002F), 🇹🇷 [Turkish](https:\u002F\u002Ftr.wikipedia.org\u002F), 🇨🇿 [Czech](https:\u002F\u002Fcs.wikipedia.org\u002F), 🇸🇪 [Swedish](https:\u002F\u002Fsv.wikipedia.org\u002F), 🇰🇷 [Korean](https:\u002F\u002Fko.wikipedia.org\u002F), 🇫🇮 [Finnish](https:\u002F\u002Ffi.wikipedia.org\u002F), 🇻🇳 [Vietnamese](https:\u002F\u002Fvi.wikipedia.org\u002F), 🇭🇺 [Hungarian](https:\u002F\u002Fhu.wikipedia.org\u002F), [Catalan](https:\u002F\u002Fca.wikipedia.org\u002F), 🇹🇭 [Thai](https:\u002F\u002Fth.wikipedia.org\u002F).\n  - **Improved Information Retrieval**: Improved retrieval accuracy and speed with [the latest Snowflake's Arctic embedding model](https:\u002F\u002Fhuggingface.co\u002FSnowflake\u002Fsnowflake-arctic-embed-l-v2.0).\n  - **Improved Preprocessing of Wikipedia** using [Docling](https:\u002F\u002Fgithub.com\u002Fdocling-project\u002Fdocling). 
As always, preprocessed Wikipedia is available on [HuggingFace](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fstanford-oval\u002Fwikipedia).\n  - **Improved WikiChat Pipeline**:\n    - Added inline citations to the final response.\n    - The 'generate' stage of the pipeline is now always merged with the 'claim extraction' stage, even in the non-distilled setting, for faster and cheaper inference.\n    - Removed date-based reranking in favor of LLM-based reranking.\n  - Switched to using [pixi](https:\u002F\u002Fpixi.sh\u002Flatest\u002F) for package management and [loguru](https:\u002F\u002Fgithub.com\u002FDelgan\u002Floguru) for logging.\n\n- (August 22, 2024) WikiChat 2.0 is now available! Key updates include:\n    - **Multilingual Support**: By default, retrieves information from 10 different Wikipedias: 🇺🇸 English, 🇨🇳 Chinese, 🇪🇸 Spanish, 🇵🇹 Portuguese, 🇷🇺 Russian, 🇩🇪 German, 🇮🇷 Farsi, 🇯🇵 Japanese, 🇫🇷 French, and 🇮🇹 Italian.\n    - **Improved Information Retrieval**\n      - Now supports retrieval from structured data such as tables, infoboxes, and lists, in addition to text.\n      - Has the highest quality public Wikipedia preprocessing scripts\n      - Uses the state-of-the-art multilingual retrieval model [BGE-M3](https:\u002F\u002Fhuggingface.co\u002FBAAI\u002Fbge-m3).\n      - Uses [Qdrant](https:\u002F\u002Fgithub.com\u002Fqdrant\u002Fqdrant) for scalable vector search.\n      - Uses [RankGPT](https:\u002F\u002Fgithub.com\u002Fsunnweiwei\u002FRankGPT) to rerank search results.\n    - **Free Multilingual Wikipedia Search API**: We offer a high-quality, free (but rate-limited) search API for access to 10 Wikipedias, encompassing over 180M vector embeddings.\n\n    - **Expanded LLM Compatibility**: Supports 100+ LLMs through a unified interface, thanks to [LiteLLM](https:\u002F\u002Fgithub.com\u002FBerriAI\u002Flitellm).\n    - **Optimized Pipeline**: Option for a faster and more cost-effective pipeline by merging the \"generate\" and \"extract 
claim\" stages of WikiChat.\n    - **LangChain Compatibility**: Fully compatible with LangChain 🦜️🔗.\n    - **And Much More!**\n- (June 20, 2024) WikiChat won the 2024 Wikimedia Research Award!\n  \u003Cblockquote class=\"twitter-tweet\">\u003Cp lang=\"en\" dir=\"ltr\">The \u003Ca href=\"https:\u002F\u002Ftwitter.com\u002FWikimedia?ref_src=twsrc%5Etfw\">@Wikimedia\u003C\u002Fa> Research Award of the Year 2024 goes to &quot;WikiChat: Stopping the hallucination of large language model chatbots by few-shot grounding on Wikipedia&quot; ⚡\u003Cbr>\u003Cbr>📜 \u003Ca href=\"https:\u002F\u002Ft.co\u002Fd2M8Qrarkw\">https:\u002F\u002Ft.co\u002Fd2M8Qrarkw\u003C\u002Fa> \u003Ca href=\"https:\u002F\u002Ft.co\u002FP2Sh47vkyi\">pic.twitter.com\u002FP2Sh47vkyi\u003C\u002Fa>\u003C\u002Fp>&mdash; Wiki Workshop 2024 (@wikiworkshop) \u003Ca href=\"https:\u002F\u002Ftwitter.com\u002Fwikiworkshop\u002Fstatus\u002F1803793163665977481?ref_src=twsrc%5Etfw\">June 20, 2024\u003C\u002Fa>\u003C\u002Fblockquote>\n  \n- (May 16, 2024) Our follow-up paper _\"🍝 SPAGHETTI: Open-Domain Question Answering from Heterogeneous Data Sources with Retrieval and Semantic Parsing\"_ is accepted to the Findings of ACL 2024. This paper adds support for structured data like tables, infoboxes and lists.\n- (January 8, 2024) Distilled LLaMA-2 models are released. You can run these models locally for a cheaper and faster alternative to paid APIs.\n- (December 8, 2023) We present our work at EMNLP 2023.\n- (October 27, 2023) The camera-ready version of our paper is now available on arXiv.\n- (October 06, 2023) Our paper is accepted to the Findings of EMNLP 2023.\n\n\n# Installation\n\nInstalling WikiChat involves the following steps:\n\n1. Install dependencies\n1. Configure the LLM of your choice. WikiChat supports over 100 LLMs, including models from OpenAI, Azure, Anthropic, Mistral, HuggingFace, Together.ai, and Groq.\n1. Select an information retrieval source. 
This can be any HTTP endpoint that conforms to the interface defined in `retrieval\u002Fretriever_server.py`. We provide instructions and scripts for the following options:\n    1. Use our free, rate-limited API for Wikipedia in 25 languages.\n    1. Download and host our provided Wikipedia index yourself.\n    1. Create and run a new custom index from your own documents.\n1. Run WikiChat with your desired configuration.\n1. [Optional] Deploy WikiChat for multi-user access. We provide code to deploy a simple front-end and backend, as well as instructions to connect to an Azure Cosmos DB database for storing conversations.\n\n\n## System Requirements\nThis project has been tested with Python 3.11 on Ubuntu 20.04 LTS (Focal Fossa), but it should be compatible with many other Linux distributions. If you plan to use this on Windows WSL or macOS, or with a different Python version, be prepared for potential troubleshooting during installation.\n\nHardware requirements vary based on your intended use:\n\n1. **Basic Usage**: Running WikiChat with LLM APIs and our Wikipedia search API has minimal hardware requirements and should work on most systems.\n\n1. **Local Search Index**: If you intend to host a search index locally, ensure you have sufficient disk space for the index. For large indices, retrieval latency is heavily dependent on disk speed, so we recommend using SSDs and preferably NVMe drives. For example, storage-optimized VMs like [`Standard_L8s_v3`](https:\u002F\u002Flearn.microsoft.com\u002Fen-us\u002Fazure\u002Fvirtual-machines\u002Flsv3-series) on Azure are suitable for this.\n\n1. **Local LLM**: If you plan to use WikiChat with a local LLM, a GPU is necessary to host the model.\n\n1. **Creating a New Retrieval Index**: If you want to index a collection, you need a GPU to embed documents to vectors. 
The default embedding model requires at least 13GB of GPU memory to run.\n\n\n## Install Dependencies\n\nFirst, clone the repository:\n```\ngit clone https:\u002F\u002Fgithub.com\u002Fstanford-oval\u002FWikiChat.git\ncd WikiChat\n```\n\nWe recommend using the pixi environment specified in `pixi.toml`. This environment includes [Python 3.11](https:\u002F\u002Fdocs.python.org\u002F3.11\u002F), [pip](https:\u002F\u002Fpip.pypa.io\u002Fen\u002Fstable\u002F), [gcc](https:\u002F\u002Fgcc.gnu.org\u002Fonlinedocs\u002F), [g++](https:\u002F\u002Fgcc.gnu.org\u002Fonlinedocs\u002Fgcc-11.2.0\u002Flibstdc++\u002Fmanual\u002F), [make](https:\u002F\u002Fwww.gnu.org\u002Fsoftware\u002Fmake\u002Fmanual\u002Fmake.html), and all required Python packages.\n\n[Pixi](https:\u002F\u002Fpixi.sh\u002F) is a cross-platform package management tool. It is a much faster alternative to conda. To install it, follow the instructions at https:\u002F\u002Fpixi.sh\u002Flatest\u002F#installation. Then create and activate the pixi environment:\n\n```bash\npixi shell\npython -m spacy download en_core_web_sm  # Spacy is only needed for user simulation\n```\n\nBy default, this repository uses [Redis Stack](https:\u002F\u002Fredis.io\u002Fabout\u002Fabout-stack\u002F) via Docker.\nIf you see `Error: Redis lookup failed` after running the chatbot, it probably means Redis is not properly set up. Check the logs of the Redis docker container. Alternatively, you can try installing Redis by following its [official documentation](https:\u002F\u002Fredis.io\u002Fdocs\u002Flatest\u002Foperate\u002Foss_and_stack\u002Finstall\u002Finstall-redis\u002F).\n\nKeep this environment activated for all subsequent commands.\n\nInstall Docker for your operating system by following the instructions at https:\u002F\u002Fdocs.docker.com\u002Fengine\u002Finstall\u002F. 
WikiChat uses Docker primarily for creating and serving vector databases for retrieval, specifically [🤗 Text Embedding Inference](https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Ftext-embeddings-inference) and [Qdrant](https:\u002F\u002Fgithub.com\u002Fqdrant\u002Fqdrant). On recent Ubuntu versions, you can try running `inv install-docker`. For other operating systems, follow the instructions on the docker website.\n\nWikiChat uses [`invoke`](https:\u002F\u002Fwww.pyinvoke.org\u002F) to add custom commands for various purposes. To see all available commands and their descriptions, run:\n```\ninvoke --list\n```\nor the shorthand:\n```\ninv -l\n```\n\nFor more details about a specific command, use:\n```\ninv [command name] --help\n```\n\nThese commands are implemented in the `tasks\u002F` folder.\n\n\n## Configure the LLM of Your Choice\n\nWikiChat is compatible with various LLMs, including models from OpenAI, Azure, Anthropic, Mistral, Together.ai, and Groq.\nYou can also use WikiChat with many locally hosted models via HuggingFace.\n\nTo configure your LLM:\n1. Fill out the appropriate fields in `llm_config.yaml`.\n\n2. Create a file named `API_KEYS` (which is included in `.gitignore`).\n3. In the `API_KEYS` file, set the API key for the LLM endpoint you want to use. The name of the API key should match the name you provided in `llm_config.yaml` under `api_key`.\nFor example, if you're using OpenAI models via openai.com and Mistral endpoints, your `API_KEYS` file might look like this:\n\n```bash\n# Fill in the following values with your API keys. 
Make sure there is not extra space after the key.\n# Changes to this file are ignored by git, so you can safely store your keys here during development.\nOPENAI_API_KEY=[Your OpenAI API key from https:\u002F\u002Fplatform.openai.com\u002Fapi-keys]\nMISTRAL_API_KEY=[Your Mistral API key from https:\u002F\u002Fconsole.mistral.ai\u002Fapi-keys\u002F]\n```\n\nNote that locally hosted models do NOT need an API key, but you need to provide an OpenAI-compatible endpoint in `api_base`. The code has been tested with [🤗 Text Generation Inference](https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Ftext-generation-inference\u002F) endpoints, but you can try other similar endpoints like [vLLM](https:\u002F\u002Fgithub.com\u002Fvllm-project\u002Fvllm), [SGLang](https:\u002F\u002Fgithub.com\u002Fsgl-project\u002Fsglang), etc.\n\n\n## Configure Information Retrieval\n\n### Option 1 (Default): Use our free rate-limited Wikipedia search API\nBy default, WikiChat retrieves information from 25 Wikipedias via the endpoint at https:\u002F\u002Fsearch.genie.stanford.edu\u002Fwikipedia_20250320\u002F. If you want to just try WikiChat, you do not need to modify anything.\n\n### Option 2: Build your own index\n#### To build a Wikipedia index\nThe following command will download, preprocess, and index the latest HTML dump of the [Kurdish Wikipedia](ku.wikipedia.org), which we use in this example for its relatively small size.\n\n```bash\ninv index-wikipedia-dump --workdir .\u002Fworkdir --language ku\n```\n\n#### To index custom documents\n\n1. 
Preprocess your data into a [JSON Lines](https:\u002F\u002Fjsonlines.org\u002F) file (with .jsonl or .jsonl.gz file extension) where each line has the following fields:\n```json\n{\"id\": \"integer\",  \"document_title\": \"string\", \"section_title\": \"string\", \"content\": \"string\", \"block_type\": \"string\", \"language\": \"string\", \"last_edit_date\": \"string (optional)\", \"url\": \"string (optional)\", \"num_tokens\": \"integer (optional)\", \"block_metadata\": \"dict (optional)\"}\n```\n`content` should be the chunked text of your documents. We recommend chunking to less than 500 tokens of the embedding model's tokenizer. See [this](https:\u002F\u002Fdocs.langchain.com\u002Foss\u002Fpython\u002Fintegrations\u002Fsplitters) for an overview on chunking methods.\n`block_type` and `language` are only used to provide filtering on search results. If you do not need them, you can simply set them to `block_type=text` and `language=en`.\nThe script will feed `document_title` > `section_title` and `content` to the embedding model to create embedding vectors.\n\nSee `preprocessing\u002Fpreprocess_wikipedia_html_dump.py` for details on how this is implemented for Wikipedia HTML dumps.\n\n1. Run the indexing command:\n\n```bash\ninv index-collection --collection-path \u003Cpath to preprocessed JSONL> --collection-name \u003Ccollection name>\n```\n\nThis command starts docker containers for [🤗 Text Embedding Inference](https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Ftext-embeddings-inference) (one per available GPU). By default, it uses the docker image compatible with NVIDIA GPUs with Ampere 80 architecture, e.g. A100. Support for some other GPUs is also available, but you would need to choose the right docker image from [available docker images](https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Ftext-embeddings-inference?tab=readme-ov-file#docker-images).\n\n3. 
(Optional) Add a [payload index](https:\u002F\u002Fqdrant.tech\u002Fdocumentation\u002Fconcepts\u002Fpayload\u002F#payload-indexing)\n```bash\npython retrieval\u002Fadd_payload_index.py\n```\nThis will enable queries that filter on `language` or `block_type`. Note that for large indices, it might take several minutes for the index to become available again.\n\n4. After indexing, load and use the index as in option 2. For example:\n```bash\ninv start-retriever --retriever-port \u003Cport number>\ncurl -X POST 0.0.0.0:5100\u002F\u003Ccollection name> -H \"Content-Type: application\u002Fjson\" -d '{\"query\": [\"What is GPT-4?\", \"What is LLaMA-3?\"], \"num_blocks\": 3}'\n```\n\n5. Start WikiChat by passing in the URL of this retriever. For example:\n```bash\ninv demo --retriever-endpoint \"http:\u002F\u002F0.0.0.0:\u003Cport number>\u002F\u003Ccollection name>\" --corpus_id \"the id of the corpus you are using\"\n```\n\nThe `corpus_id` parameter is used to match a short description of the corpus to give to the LLM to help determine if searching the index is beneficial. It should refer to a `Corpus` object containing a short description of the corpus, e.g. \"Kurdish Wikipedia\" or \"Business documents for company X\".\n\n#### To use an Azure AI deployment of the embedding model instead of a local one\nAfter deploying one of the available embedding models via Azure AI, add your endpoint's key as `EMBEDDING_API_KEY` to `API_KEYS`.\nYou can then use the following command to index your collection:\n\n```bash\ninv index-collection --collection-path \u003Cpath to preprocessed JSONL> --collection-name \u003Ccollection name> --embedding-model-name \u003Cmodel name> --embedding-model-url https:\u002F\u002F\u003Cdeployment name>.\u003Cdeployment region>.inference.ml.azure.com --embedding-model-port 443\n```\n\n`embedding-model-name` should be the name of the model you deployed on Azure, e.g. 
`Snowflake\u002Fsnowflake-arctic-embed-l-v2.0`.\nNote that Azure imposes a batch_size limit that depends on the model and deployment hardware. You are responsible for setting up and tearing down the Azure deployment. The URL and key for your endpoint can be found in the Azure portal under the deployment's details.\n\n\n#### To upload a Qdrant index to 🤗 Hub\n1. Split the index into smaller parts:\n```bash\ntar -cvf - \u003Cpath to the Qdrant index folder> | pigz -p 14 | split --bytes=10GB --numeric-suffixes=0 --suffix-length=4 - \u003Cpath to the output folder>\u002Fqdrant_index.tar.gz.part-\n```\n\n2. Upload the resulting parts:\n```bash\npython retrieval\u002Fupload_folder_to_hf_hub.py --folder_path \u003Cpath to the output folder> --repo_id \u003CRepo ID on 🤗 Hub>\n```\n\n\n\n## Run WikiChat in Terminal\n\nYou can run different configurations of WikiChat using commands like these:\n\n```\ninv demo --engine gpt-4o # engine can be any value configured in llm_config, for example, mistral-large, claude-sonnet-35, local\n```\n\nFor a full list of all available options, you can run `inv demo --help`\n\n## [Optional] Deploy WikiChat for Multi-user Access\nThis repository provides code to deploy a web-based chat interface via [Chainlit](https:\u002F\u002Fgithub.com\u002FChainlit\u002Fchainlit), and store user conversations to a [Cosmos DB](https:\u002F\u002Fazure.microsoft.com\u002Fen-us\u002Fproducts\u002Fcosmos-db) database.\nThese are implemented in `backend_server.py` and `database.py` respectively. If you want to use other databases or front-ends, you need to modify these files. 
For development, it should be straightforward to remove the dependency on Cosmos DB and simply store conversations in memory.\nYou can also configure chatbot parameters defined in `backend_server.py`, for example to use a different LLM or add\u002Fremove stages of WikiChat.\n\n### Set up Cosmos DB\nAfter creating an instance via Azure, obtain the connection string and add this value in `API_KEYS`.\n```bash\nCOSMOS_CONNECTION_STRING=[Your Cosmos DB connection string]\n```\n\n### Run Chainlit\nRunning this will start the backend and front-end servers. You can then access the front-end at the specified port (5001 by default).\n`inv chainlit --backend-port 5001`\n\n\n\n# The Free Rate-limited Wikipedia Search API\nYou can use this API endpoint for prototyping high-quality RAG systems.\nSee https:\u002F\u002Fsearch.genie.stanford.edu\u002Fredoc for the full specification.\n\nNote that we do not provide any guarantees about this endpoint, and it is not suitable for production.\n\n\n# Wikipedia Preprocessing\nWe publicly release [preprocessed Wikipedia in 25 languages](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fstanford-oval\u002Fwikipedia).\n\n# Other Commands\n\n## Run a Distilled Model for Lower Latency and Cost\nWikiChat>=2.0 is not compatible with [fine-tuned LLaMA-2 checkpoints released](https:\u002F\u002Fhuggingface.co\u002Fcollections\u002Fstanford-oval\u002Fwikichat-v10-66c580bf15e26b87d622498c). Please refer to v1.0 to run distilled models.\n\n## Simulate Conversations\nTo evaluate a chatbot, you can simulate conversations using a user simulator. The `subset` parameter can be one of `head`, `tail`, or `recent`, corresponding to the three subsets introduced in the WikiChat paper. You can also specify the language of the user (WikiChat always replies in the user's language).\nThis script reads the topic (i.e., a Wikipedia title and article) from the corresponding `benchmark\u002Ftopics\u002F{subset}_articles_{language}.json` file. 
Use `--num-dialogues` to set the number of simulated dialogues to generate, and `--num-turns` to specify the number of turns in each dialogue.\n\n```bash\ninv simulate-users --num-dialogues 1 --num-turns 2 --simulation-mode passage --language en --subset head\n```\nDepending on the engine you are using, this might take some time. The simulated dialogues and log files will be saved in `benchmark\u002Fsimulated_dialogues\u002F`.\nYou can also provide any of the pipeline parameters from above.\nYou can experiment with different user characteristics by modifying `user_characteristics` in `benchmark\u002Fuser_simulator.py`.\n\n# License\nWikiChat code, and models and data are released under Apache-2.0 license.\n\n# Citation\n\nIf you have used code or data from this repository, please cite the following papers:\n\n```bibtex\n@inproceedings{semnani-etal-2023-wikichat,\n    title = \"{W}iki{C}hat: Stopping the Hallucination of Large Language Model Chatbots by Few-Shot Grounding on {W}ikipedia\",\n    author = \"Semnani, Sina  and\n      Yao, Violet  and\n      Zhang, Heidi  and\n      Lam, Monica\",\n    editor = \"Bouamor, Houda  and\n      Pino, Juan  and\n      Bali, Kalika\",\n    booktitle = \"Findings of the Association for Computational Linguistics: EMNLP 2023\",\n    month = dec,\n    year = \"2023\",\n    address = \"Singapore\",\n    publisher = \"Association for Computational Linguistics\",\n    url = \"https:\u002F\u002Faclanthology.org\u002F2023.findings-emnlp.157\",\n    pages = \"2387--2413\",\n}\n\n@inproceedings{zhang-etal-2024-spaghetti,\n    title = \"{SPAGHETTI}: Open-Domain Question Answering from Heterogeneous Data Sources with Retrieval and Semantic Parsing\",\n    author = \"Zhang, Heidi  and\n      Semnani, Sina  and\n      Ghassemi, Farhad  and\n      Xu, Jialiang  and\n      Liu, Shicheng  and\n      Lam, Monica\",\n    editor = \"Ku, Lun-Wei  and\n      Martins, Andre  and\n      Srikumar, Vivek\",\n    booktitle = \"Findings of the Association 
for Computational Linguistics ACL 2024\",\n    month = aug,\n    year = \"2024\",\n    address = \"Bangkok, Thailand and virtual meeting\",\n    publisher = \"Association for Computational Linguistics\",\n    url = \"https:\u002F\u002Faclanthology.org\u002F2024.findings-acl.96\",\n    pages = \"1663--1678\",\n}\n```\n","\u003Cp align=\"center\">\n    \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fstanford-oval_WikiChat_readme_fd2e8b698a8b.png\" width=\"120px\" alt=\"WikiChat Logo\" style=\"display: block; margin: 0 auto;\" \u002F>\n    \u003Ch1 align=\"center\">\n        \u003Cb>WikiChat\u003C\u002Fb>\n        \u003Cbr>\n        \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2305.14292\">\n            \u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fcs.CL-2305.14292-b31b1b\" alt=\"arXiv\">\n        \u003C\u002Fa>\n        \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fstanford-oval\u002FWikiChat\u002Fstargazers\">\n            \u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fstanford-oval\u002FWikiChat?style=social\" alt=\"Github Stars\">\n        \u003C\u002Fa>\n    \u003C\u002Fh1>\n\u003C\u002Fp>\n\u003Cp align=\"center\">\n    阻止大型语言模型的幻觉\n\u003C\u002Fp>\n\u003Cp align=\"center\">\n    \u003C!-- \u003Ca href=\"https:\u002F\u002Fstanford.edu\" target=\"_blank\">\n        \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fstanford-oval_WikiChat_readme_159a5334f870.png\" width=\"140px\" alt=\"斯坦福大学\" \u002F>\n    \u003C\u002Fa> -->\n\u003C\u002Fp>\n\u003Cp align=\"center\">\n    在线演示：\n    \u003Ca href=\"https:\u002F\u002Fwikichat.genie.stanford.edu\" target=\"_blank\">\n        https:\u002F\u002Fwikichat.genie.stanford.edu\n    \u003C\u002Fa>\n    \u003Cbr>\n\u003C\u002Fp>\n\n\n\n\nhttps:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F3ac856ba-682c-4aed-9271-ce2f6a27cd5e\n\n\n\n# 目录\n- [简介](#introduction)\n  - [🚨 公告](#-announcements)\n- 
[安装](#installation)\n  - [系统要求](#system-requirements)\n  - [安装依赖](#install-dependencies)\n  - [配置您选择的语言模型](#configure-the-llm-of-your-choice)\n  - [配置信息检索](#configure-information-retrieval)\n    - [选项1（默认）：使用我们免费的限流维基百科搜索API](#option-1-default-use-our-free-rate-limited-wikipedia-search-api)\n    - [选项2：构建您自己的索引](#option-2-build-your-own-index)\n      - [构建维基百科索引](#to-build-a-wikipedia-index)\n      - [索引自定义文档](#to-index-custom-documents)\n      - [使用Azure AI部署的嵌入模型代替本地模型](#to-use-an-azure-ai-deployment-of-the-embedding-model-instead-of-a-local-one)\n      - [将Qdrant索引上传至🤗 Hub](#to-upload-a-qdrant-index-to--hub)\n  - [在终端中运行WikiChat](#run-wikichat-in-terminal)\n  - [[可选] 部署WikiChat以供多用户访问](#optional-deploy-wikichat-for-multi-user-access)\n    - [设置Cosmos DB](#set-up-cosmos-db)\n    - [运行Chainlit](#run-chainlit)\n- [免费的限流维基百科搜索API](#the-free-rate-limited-wikipedia-search-api)\n- [维基百科预处理](#wikipedia-preprocessing)\n- [其他命令](#other-commands)\n  - [运行蒸馏模型以降低延迟和成本](#run-a-distilled-model-for-lower-latency-and-cost)\n  - [模拟对话](#simulate-conversations)\n- [许可证](#license)\n- [引用](#citation)\n\n\n\n\u003C!-- \u003Chr \u002F> -->\n\n\n# 简介\n\n像ChatGPT和GPT-4这样的大型语言模型聊天机器人经常会出错，尤其是在您寻找的信息比较新（“告诉我关于2024年超级碗的情况”）或涉及不太热门的话题时（“有哪些值得一看的好电影，出自[插入您最喜欢的外国导演]？”）。WikiChat利用维基百科和以下7个阶段的流程来确保其回复内容真实可靠。每个编号阶段都会涉及一次或多次语言模型调用。\n\n\n\u003Cp align=\"center\">\n    \u003Cimg src=\".\u002Fpublic\u002Fimg\u002Fpipeline.svg\" width=\"700px\" alt=\"WikiChat流程\" \u002F>\n\u003C\u002Fp>\n\n更多详情请参阅我们的论文：\nSina J. Semnani, Violet Z. Yao*, Heidi C. Zhang*, and Monica S. Lam. 2023. 
[WikiChat：通过基于维基百科的少样本接地来阻止大型语言模型聊天机器人的幻觉](https:\u002F\u002Farxiv.org\u002Fabs\u002F2305.14292)。载于计算语言学协会会议成果集：EMNLP 2023，新加坡。计算语言学协会。\n\n## 🚨 通知\n- （2025年4月29日）WikiChat 2.1 现已发布！主要更新包括：\n  - **多语言支持增强**：现支持25种不同的维基百科版本（之前为10种），可通过网页和API在 search.genie.stanford.edu\u002Fwikipedia_20250320 访问：🇺🇸 [英语](https:\u002F\u002Fen.wikipedia.org\u002F)、🇫🇷 [法语](https:\u002F\u002Ffr.wikipedia.org\u002F)、🇩🇪 [德语](https:\u002F\u002Fde.wikipedia.org\u002F)、🇪🇸 [西班牙语](https:\u002F\u002Fes.wikipedia.org\u002F)、🇯🇵 [日语](https:\u002F\u002Fja.wikipedia.org\u002F)、🇷🇺 [俄语](https:\u002F\u002Fru.wikipedia.org\u002F)、🇵🇹 [葡萄牙语](https:\u002F\u002Fpt.wikipedia.org\u002F)、🇨🇳 [中文](https:\u002F\u002Fzh.wikipedia.org\u002F)、🇮🇹 [意大利语](https:\u002F\u002Fit.wikipedia.org\u002F)、🇸🇦 [阿拉伯语](https:\u002F\u002Far.wikipedia.org\u002F)、🇮🇷 [波斯语](https:\u002F\u002Ffa.wikipedia.org\u002F)、🇵🇱 [波兰语](https:\u002F\u002Fpl.wikipedia.org\u002F)、🇳🇱 [荷兰语](https:\u002F\u002Fnl.wikipedia.org\u002F)、🇺🇦 [乌克兰语](https:\u002F\u002Fuk.wikipedia.org\u002F)、🇮🇱 [希伯来语](https:\u002F\u002Fhe.wikipedia.org\u002F)、🇮🇩 [印尼语](https:\u002F\u002Fid.wikipedia.org\u002F)、🇹🇷 [土耳其语](https:\u002F\u002Ftr.wikipedia.org\u002F)、🇨🇿 [捷克语](https:\u002F\u002Fcs.wikipedia.org\u002F)、🇸🇪 [瑞典语](https:\u002F\u002Fsv.wikipedia.org\u002F)、🇰🇷 [韩语](https:\u002F\u002Fko.wikipedia.org\u002F)、🇫🇮 [芬兰语](https:\u002F\u002Ffi.wikipedia.org\u002F)、🇻🇳 [越南语](https:\u002F\u002Fvi.wikipedia.org\u002F)、🇭🇺 [匈牙利语](https:\u002F\u002Fhu.wikipedia.org\u002F)、[加泰罗尼亚语](https:\u002F\u002Fca.wikipedia.org\u002F)、🇹🇭 [泰语](https:\u002F\u002Fth.wikipedia.org\u002F)。\n  - **信息检索优化**：借助最新的 Snowflake Arctic 嵌入模型，检索准确性和速度均有所提升。\n  - **维基百科预处理改进**：使用 Docling 进行预处理。一如既往地，预处理后的维基百科数据可在 HuggingFace 上获取。\n  - **WikiChat 流程优化**：\n    - 在最终回复中添加了内联引用。\n    - “生成”阶段现在始终与“观点提取”阶段合并，即使在非蒸馏模式下也是如此，从而实现更快、更经济的推理。\n    - 取消基于日期的重新排序，转而采用基于大语言模型的重新排序。\n  - 切换至使用 pixi 进行包管理，并使用 loguru 进行日志记录。\n\n- （2024年8月22日）WikiChat 2.0 现已发布！主要更新包括：\n    - **多语言支持**：默认从10种不同语言的维基百科中检索信息：🇺🇸 英语、🇨🇳 中文、🇪🇸 西班牙语、🇵🇹 葡萄牙语、🇷🇺 
俄语、🇩🇪 德语、🇮🇷 波斯语、🇯🇵 日语、🇫🇷 法语和🇮🇹 意大利语。\n    - **信息检索优化**\n      - 现在除了文本外，还支持从表格、信息框和列表等结构化数据中检索信息。\n      - 拥有目前质量最高的公开维基百科预处理脚本。\n      - 使用最先进的多语言检索模型 BGE-M3。\n      - 使用 Qdrant 进行可扩展的向量搜索。\n      - 使用 RankGPT 对搜索结果进行重新排序。\n    - **免费多语言维基百科搜索API**：我们提供高质量、免费（但有限速）的搜索API，用于访问10种维基百科，涵盖超过1.8亿条向量嵌入。\n    - **LLM兼容性扩展**：通过 LiteLLM 接口，支持100多种大语言模型。\n    - **流程优化**：可通过合并 WikiChat 的“生成”和“观点提取”阶段，获得更快且更具成本效益的流程。\n    - **LangChain兼容性**：完全兼容 LangChain 🦜️🔗。\n    - **更多功能！**\n\n- （2024年6月20日）WikiChat荣获2024年维基媒体研究奖！\n  \u003Cblockquote class=\"twitter-tweet\">\u003Cp lang=\"en\" dir=\"ltr\">2024年度\u003Ca href=\"https:\u002F\u002Ftwitter.com\u002FWikimedia?ref_src=twsrc%5Etfw\">@Wikimedia\u003C\u002Fa>研究奖授予“WikiChat：通过少量示例将大型语言模型聊天机器人锚定到维基百科以阻止幻觉”⚡\u003Cbr>\u003Cbr>📜 \u003Ca href=\"https:\u002F\u002Ft.co\u002Fd2M8Qrarkw\">https:\u002F\u002Ft.co\u002Fd2M8Qrarkw\u003C\u002Fa> \u003Ca href=\"https:\u002F\u002Ft.co\u002FP2Sh47vkyi\">pic.twitter.com\u002FP2Sh47vkyi\u003C\u002Fa>\u003C\u002Fp>&mdash; Wiki Workshop 2024 (@wikiworkshop) \u003Ca href=\"https:\u002F\u002Ftwitter.com\u002Fwikiworkshop\u002Fstatus\u002F1803793163665977481?ref_src=twsrc%5Etfw\">2024年6月20日\u003C\u002Fa>\u003C\u002Fblockquote>\n  \n- （2024年5月16日）我们的后续论文 _“🍝 SPAGHETTI：基于检索与语义解析的异构数据源开放域问答”_ 被 ACL 2024 接收。该论文增加了对表格、信息框和列表等结构化数据的支持。\n- （2024年1月8日）蒸馏版 LLaMA-2 模型发布。您可以在本地运行这些模型，作为付费API的更便宜、更快速的替代方案。\n- （2023年12月8日）我们在 EMNLP 2023 上展示了我们的工作。\n- （2023年10月27日）我们论文的最终定稿现已在 arXiv 上发布。\n- （2023年10月6日）我们的论文被 EMNLP 2023 接收。\n\n\n# 安装说明\n\n安装 WikiChat 包括以下步骤：\n\n1. 安装依赖项。\n2. 配置您选择的大语言模型。WikiChat 支持超过100种大语言模型，包括来自 OpenAI、Azure、Anthropic、Mistral、HuggingFace、Together.ai 和 Groq 的模型。\n3. 选择信息检索来源。这可以是任何符合 retrieval\u002Fretriever_server.py 中定义接口的 HTTP 端点。我们提供了以下选项的说明和脚本：\n    1. 使用我们提供的25种语言维基百科的免费限速 API。\n    1. 下载并自行托管我们提供的维基百科索引。\n    1. 从您自己的文档中创建并运行新的自定义索引。\n4. 使用您所需的配置运行 WikiChat。\n5. 
【可选】部署 WikiChat 以供多用户访问。我们提供了部署简单前端和后端的代码，以及连接 Azure Cosmos DB 数据库以存储对话记录的说明。\n\n\n## 系统要求\n该项目已在 Ubuntu 20.04 LTS (Focal Fossa) 上使用 Python 3.11 进行测试，但应能兼容许多其他 Linux 发行版。如果您计划在 Windows WSL 或 macOS 上使用，或使用不同的 Python 版本，请做好在安装过程中可能需要排查问题的准备。\n\n硬件要求因您的使用目的而异：\n\n1. **基本使用**：使用 LLM API 和我们的维基百科搜索 API 运行 WikiChat 的硬件要求很低，大多数系统均可运行。\n1. **本地搜索索引**：如果您打算在本地托管搜索索引，请确保有足够的磁盘空间。对于大型索引，检索延迟高度依赖于磁盘速度，因此建议使用 SSD，最好是 NVMe 驱动器。例如，Azure 上的存储优化虚拟机 [`Standard_L8s_v3`](https:\u002F\u002Flearn.microsoft.com\u002Fen-us\u002Fazure\u002Fvirtual-machines\u002Flsv3-series) 就非常适合此用途。\n1. **本地 LLM**：如果您计划使用本地 LLM 运行 WikiChat，则需要 GPU 来承载模型。\n1. **创建新检索索引**：如果您想对某个文档集合建立索引，需要 GPU 将文档嵌入为向量。默认的嵌入模型至少需要 13GB 的 GPU 内存才能运行。\n\n## 安装依赖\n\n首先，克隆仓库：\n```\ngit clone https:\u002F\u002Fgithub.com\u002Fstanford-oval\u002FWikiChat.git\ncd WikiChat\n```\n\n我们建议使用 `pixi.toml` 中指定的 pixi 环境。该环境包含 [Python 3.11](https:\u002F\u002Fdocs.python.org\u002F3.11\u002F)、[pip](https:\u002F\u002Fpip.pypa.io\u002Fen\u002Fstable\u002F)、[gcc](https:\u002F\u002Fgcc.gnu.org\u002Fonlinedocs\u002F)、[g++](https:\u002F\u002Fgcc.gnu.org\u002Fonlinedocs\u002Fgcc-11.2.0\u002Flibstdc++\u002Fmanual\u002F)、[make](https:\u002F\u002Fwww.gnu.org\u002Fsoftware\u002Fmake\u002Fmanual\u002Fmake.html)，以及所有所需的 Python 包。\n\n[Pixi](https:\u002F\u002Fpixi.sh\u002F) 是一个跨平台的包管理工具，它是 conda 的更快替代方案。要安装它，请按照 https:\u002F\u002Fpixi.sh\u002Flatest\u002F#installation 上的说明进行操作。然后创建并激活 pixi 环境：\n\n```bash\npixi shell\npython -m spacy download en_core_web_sm  # Spacy 仅在用户模拟时需要\n```\n\n默认情况下，此仓库通过 Docker 使用 [Redis Stack](https:\u002F\u002Fredis.io\u002Fabout\u002Fabout-stack\u002F)。\n如果运行聊天机器人后出现 `Error: Redis lookup failed`，很可能是因为 Redis 没有正确设置。请检查 Redis Docker 容器的日志。或者，您可以按照其 [官方文档](https:\u002F\u002Fredis.io\u002Fdocs\u002Flatest\u002Foperate\u002Foss_and_stack\u002Finstall\u002Finstall-redis\u002F) 安装 Redis。\n\n在后续的所有命令中，请保持此环境处于激活状态。\n\n根据您的操作系统，按照 https:\u002F\u002Fdocs.docker.com\u002Fengine\u002Finstall\u002F 上的说明安装 Docker。WikiChat 主要使用 Docker 
来创建和提供用于检索的向量数据库，特别是 [🤗 Text Embedding Inference](https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Ftext-embeddings-inference) 和 [Qdrant](https:\u002F\u002Fgithub.com\u002Fqdrant\u002Fqdrant)。在较新的 Ubuntu 版本上，您可以尝试运行 `inv install-docker`。对于其他操作系统，请遵循 Docker 官网上的说明。\n\nWikiChat 使用 [`invoke`](https:\u002F\u002Fwww.pyinvoke.org\u002F) 添加用于各种目的的自定义命令。要查看所有可用命令及其描述，请运行：\n```\ninvoke --list\n```\n或简写形式：\n```\ninv -l\n```\n\n如需了解某个特定命令的详细信息，请使用：\n```\ninv [命令名称] --help\n```\n\n这些命令实现于 `tasks\u002F` 文件夹中。\n\n\n## 配置您选择的 LLM\n\nWikiChat 兼容多种 LLM，包括来自 OpenAI、Azure、Anthropic、Mistral、Together.ai 和 Groq 的模型。\n您也可以通过 HuggingFace 将许多本地托管的模型与 WikiChat 结合使用。\n\n要配置您的 LLM：\n1. 填写 `llm_config.yaml` 中相应的字段。\n\n2. 创建一个名为 `API_KEYS` 的文件（已包含在 `.gitignore` 中）。\n3. 在 `API_KEYS` 文件中，设置您想要使用的 LLM 端点的 API 密钥。API 密钥的名称应与您在 `llm_config.yaml` 的 `api_key` 字段中提供的名称一致。\n例如，如果您通过 openai.com 使用 OpenAI 模型，并使用 Mistral 端点，那么您的 `API_KEYS` 文件可能如下所示：\n\n```bash\n# 请用您的 API 密钥填写以下值。确保密钥后面没有多余的空格。\n# 此文件的更改会被 git 忽略，因此您可以在开发过程中安全地在此处存储您的密钥。\nOPENAI_API_KEY=[您从 https:\u002F\u002Fplatform.openai.com\u002Fapi-keys 获取的 OpenAI API 密钥]\nMISTRAL_API_KEY=[您从 https:\u002F\u002Fconsole.mistral.ai\u002Fapi-keys\u002F 获取的 Mistral API 密钥]\n```\n\n请注意，本地托管的模型不需要 API 密钥，但您需要在 `api_base` 中提供一个兼容 OpenAI 的端点。代码已针对 [🤗 Text Generation Inference](https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Ftext-generation-inference) 端点进行了测试，但您也可以尝试其他类似的端点，如 [vLLM](https:\u002F\u002Fgithub.com\u002Fvllm-project\u002Fvllm)、[SGLang](https:\u002F\u002Fgithub.com\u002Fsgl-project\u002Fsglang) 等。\n\n\n## 配置信息检索\n\n### 选项 1（默认）：使用我们的免费限流 Wikipedia 搜索 API\n默认情况下，WikiChat 通过 https:\u002F\u002Fsearch.genie.stanford.edu\u002Fwikipedia_20250320\u002F 端点从 25 种语言的维基百科中检索信息。如果您只想试用 WikiChat，则无需进行任何修改。\n\n### 选项 2：构建您自己的索引\n#### 构建维基百科索引\n以下命令将下载、预处理并索引 [库尔德语维基百科](ku.wikipedia.org) 的最新 HTML 转储文件，我们在本示例中使用它是因为其规模相对较小。\n\n```bash\ninv index-wikipedia-dump --workdir .\u002Fworkdir --language ku\n```\n\n#### 索引自定义文档\n\n1. 
将您的数据预处理为一个 [JSON Lines](https:\u002F\u002Fjsonlines.org\u002F) 文件（文件扩展名为 .jsonl 或 .jsonl.gz），其中每行包含以下字段：\n```json\n{\"id\": \"integer\",  \"document_title\": \"string\", \"section_title\": \"string\", \"content\": \"string\", \"block_type\": \"string\", \"language\": \"string\", \"last_edit_date\": \"string (optional)\", \"url\": \"string (optional)\", \"num_tokens\": \"integer (optional)\", \"block_metadata\": \"dict (optional)\"}\n```\n`content` 应该是您文档的分块文本。我们建议将每个分块控制在嵌入模型分词器的 500 个标记以内。有关分块方法的概述，请参阅 [此链接](https:\u002F\u002Fdocs.langchain.com\u002Foss\u002Fpython\u002Fintegrations\u002Fsplitters)。\n`block_type` 和 `language` 仅用于在搜索结果中提供过滤功能。如果您不需要它们，可以简单地将其设置为 `block_type=text` 和 `language=en`。\n脚本会将 `document_title` > `section_title` 和 `content` 输入到嵌入模型中以生成嵌入向量。\n\n有关如何针对维基百科 HTML 转储文件实现此操作的详细信息，请参阅 `preprocessing\u002Fpreprocess_wikipedia_html_dump.py`。\n\n1. 运行索引命令：\n\n```bash\ninv index-collection --collection-path \u003C预处理后的 JSONL 文件路径> --collection-name \u003C索引名称>\n```\n\n此命令会为 [🤗 文本嵌入推理](https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Ftext-embeddings-inference) 启动 Docker 容器（每个可用 GPU 一个）。默认情况下，它使用与 NVIDIA Ampere 80 架构 GPU 兼容的 Docker 镜像，例如 A100。对其他一些 GPU 也提供了支持，但您需要从 [可用的 Docker 镜像](https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Ftext-embeddings-inference?tab=readme-ov-file#docker-images) 中选择合适的镜像。\n\n3. （可选）添加 [payload 索引](https:\u002F\u002Fqdrant.tech\u002Fdocumentation\u002Fconcepts\u002Fpayload\u002F#payload-indexing)\n```bash\npython retrieval\u002Fadd_payload_index.py\n```\n这将启用按 `language` 或 `block_type` 进行过滤的查询。请注意，对于大型索引，索引重新可用可能需要几分钟时间。\n\n4. 索引完成后，按照选项 2 的方式加载和使用索引。例如：\n```bash\ninv start-retriever --retriever-port \u003C端口号>\ncurl -X POST 0.0.0.0:5100\u002F\u003C索引名称> -H \"Content-Type: application\u002Fjson\" -d '{\"query\": [\"What is GPT-4?\", \"What is LLaMA-3?\"], \"num_blocks\": 3}'\n```\n\n5. 
通过传递此检索器的 URL 来启动 WikiChat。例如：\n```bash\ninv demo --retriever-endpoint \"http:\u002F\u002F0.0.0.0:\u003C端口号>\u002F\u003C索引名称>\" --corpus_id \"您正在使用的语料库的 ID\"\n```\n\n`corpus_id` 参数用于匹配语料库的简短描述，以便提供给 LLM，帮助其判断是否有必要搜索索引。它应指向一个包含语料库简短描述的 `Corpus` 对象，例如“库尔德语维基百科”或“X 公司的业务文档”。\n\n#### 使用 Azure AI 部署的嵌入模型代替本地模型\n在通过 Azure AI 部署了可用的嵌入模型之一后，将您的端点密钥作为 `EMBEDDING_API_KEY` 添加到 `API_KEYS` 中。\n然后，您可以使用以下命令来索引您的集合：\n\n```bash\ninv index-collection --collection-path \u003C预处理后的 JSONL 文件路径> --collection-name \u003C索引名称> --embedding-model-name \u003C模型名称> --embedding-model-url https:\u002F\u002F\u003C部署名称>.\u003C部署区域>.inference.ml.azure.com --embedding-model-port 443\n```\n\n`embedding-model-name` 应为您在 Azure 上部署的模型名称，例如 `Snowflake\u002Fsnowflake-arctic-embed-l-v2.0`。\n请注意，Azure 会根据模型和部署硬件施加 batch_size 限制。您有责任设置和拆除 Azure 部署。您的端点 URL 和密钥可以在 Azure 门户的部署详情中找到。\n\n\n#### 将 Qdrant 索引上传到 🤗 Hub\n1. 将索引拆分为更小的部分：\n```bash\ntar -cvf - \u003CQdrant 索引文件夹路径> | pigz -p 14 | split --bytes=10GB --numeric-suffixes=0 --suffix-length=4 - \u003C输出文件夹路径>\u002Fqdrant_index.tar.gz.part-\n```\n\n2. 
上传生成的部分：\n```bash\npython retrieval\u002Fupload_folder_to_hf_hub.py --folder_path \u003C输出文件夹路径> --repo_id \u003C🤗 Hub 上的仓库 ID>\n```\n\n\n\n## 在终端中运行 WikiChat\n\n您可以使用类似以下的命令运行不同配置的 WikiChat：\n\n```\ninv demo --engine gpt-4o # engine 可以是 llm_config 中配置的任何值，例如 mistral-large、claude-sonnet-35、local\n```\n\n有关所有可用选项的完整列表，您可以运行 `inv demo --help`。\n\n## 【可选】部署 WikiChat 以供多用户访问\n此仓库提供了通过 [Chainlit](https:\u002F\u002Fgithub.com\u002FChainlit\u002Fchainlit) 部署基于 Web 的聊天界面，并将用户对话存储到 [Cosmos DB](https:\u002F\u002Fazure.microsoft.com\u002Fen-us\u002Fproducts\u002Fcosmos-db) 数据库中的代码。\n这些分别在 `backend_server.py` 和 `database.py` 中实现。如果您想使用其他数据库或前端，需要修改这些文件。在开发阶段，移除对 Cosmos DB 的依赖并直接将对话存储在内存中应该非常简单。\n您还可以配置 `backend_server.py` 中定义的聊天机器人参数，例如使用不同的 LLM 或增减 WikiChat 的阶段。\n\n### 设置 Cosmos DB\n在通过 Azure 创建实例后，获取连接字符串并将该值添加到 `API_KEYS` 中。\n```bash\nCOSMOS_CONNECTION_STRING=[您的 Cosmos DB 连接字符串]\n```\n\n### 运行 Chainlit\n运行此命令将启动后端和前端服务器。然后您可以通过指定的端口（默认为 5001）访问前端。\n`inv chainlit --backend-port 5001`\n\n\n\n# 免费的限速维基百科搜索 API\n您可以使用此 API 端点来原型化高质量的 RAG 系统。\n完整的规范请参见 https:\u002F\u002Fsearch.genie.stanford.edu\u002Fredoc。\n\n请注意，我们不对该端点提供任何保证，且不适合用于生产环境。\n\n\n# 维基百科预处理\n我们公开发布了 [25 种语言的预处理维基百科数据集](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fstanford-oval\u002Fwikipedia)。\n\n# 其他命令\n\n## 运行蒸馏模型以降低延迟和成本\nWikiChat≥2.0 不兼容 [发布的微调 LLaMA-2 检查点](https:\u002F\u002Fhuggingface.co\u002Fcollections\u002Fstanford-oval\u002Fwikichat-v10-66c580bf15e26b87d622498c)。请参考 v1.0 来运行蒸馏模型。\n\n## 模拟对话\n为了评估聊天机器人，你可以使用用户模拟器来模拟对话。`subset` 参数可以是 `head`、`tail` 或 `recent` 中的一个，分别对应 WikiChat 论文中介绍的三个子集。你还可以指定用户的语言（WikiChat 始终以用户使用的语言进行回复）。\n该脚本会从对应的 `benchmark\u002Ftopics\u002F{subset}_articles_{language}.json` 文件中读取主题（即维基百科标题和文章）。使用 `--num-dialogues` 来设置要生成的模拟对话数量，使用 `--num-turns` 来指定每个对话的轮次数。\n\n```bash\ninv simulate-users --num-dialogues 1 --num-turns 2 --simulation-mode passage --language en --subset head\n```\n根据你所使用的引擎不同，这可能需要一些时间。模拟对话和日志文件将保存在 `benchmark\u002Fsimulated_dialogues\u002F` 
目录下。你也可以提供上述管道中的任何参数。\n你可以通过修改 `benchmark\u002Fuser_simulator.py` 文件中的 `user_characteristics` 来尝试不同的用户特征。\n\n# 许可证\nWikiChat 的代码以及模型和数据均采用 Apache-2.0 许可证发布。\n\n# 引用\n如果你使用了本仓库中的代码或数据，请引用以下论文：\n\n```bibtex\n@inproceedings{semnani-etal-2023-wikichat,\n    title = \"{W}iki{C}hat: Stopping the Hallucination of Large Language Model Chatbots by Few-Shot Grounding on {W}ikipedia\",\n    author = \"Semnani, Sina  and\n      Yao, Violet  and\n      Zhang, Heidi  and\n      Lam, Monica\",\n    editor = \"Bouamor, Houda  and\n      Pino, Juan  and\n      Bali, Kalika\",\n    booktitle = \"Findings of the Association for Computational Linguistics: EMNLP 2023\",\n    month = dec,\n    year = \"2023\",\n    address = \"Singapore\",\n    publisher = \"Association for Computational Linguistics\",\n    url = \"https:\u002F\u002Faclanthology.org\u002F2023.findings-emnlp.157\",\n    pages = \"2387--2413\",\n}\n\n@inproceedings{zhang-etal-2024-spaghetti,\n    title = \"{SPAGHETTI}: Open-Domain Question Answering from Heterogeneous Data Sources with Retrieval and Semantic Parsing\",\n    author = \"Zhang, Heidi  and\n      Semnani, Sina  and\n      Ghassemi, Farhad  and\n      Xu, Jialiang  and\n      Liu, Shicheng  and\n      Lam, Monica\",\n    editor = \"Ku, Lun-Wei  and\n      Martins, Andre  and\n      Srikumar, Vivek\",\n    booktitle = \"Findings of the Association for Computational Linguistics ACL 2024\",\n    month = aug,\n    year = \"2024\",\n    address = \"Bangkok, Thailand and virtual meeting\",\n    publisher = \"Association for Computational Linguistics\",\n    url = \"https:\u002F\u002Faclanthology.org\u002F2024.findings-acl.96\",\n    pages = \"1663--1678\",\n}\n```","# WikiChat 快速上手指南\n\nWikiChat 是由斯坦福大学开发的开源项目，旨在通过基于维基百科的检索增强生成（RAG）技术，解决大语言模型（LLM）的“幻觉”问题，确保回答的事实准确性。支持包括中文在内的 25 种语言。\n\n## 环境准备\n\n### 系统要求\n*   **操作系统**: 推荐 Ubuntu 20.04 LTS。Windows 用户建议使用 WSL，macOS 用户可能需要进行额外调试。\n*   **Python 版本**: 3.11\n*   **硬件需求**:\n    *   **基础使用**（调用 API + 官方搜索接口）：普通配置即可，无特殊 GPU 
要求。\n    *   **本地部署索引**：需要充足的磁盘空间，强烈建议使用 SSD 或 NVMe 硬盘以保证检索速度。\n    *   **本地运行 LLM 或构建索引**：需要 GPU。构建索引时，默认嵌入模型至少需要 13GB 显存。\n*   **其他依赖**:\n    *   [Pixi](https:\u002F\u002Fpixi.sh\u002F) (跨平台包管理工具，替代 Conda)\n    *   Docker (用于运行 Redis Stack；自建索引时还用于运行 Qdrant 和 Text Embedding Inference)\n\n### 前置依赖安装\n请确保已安装 Git 和 Docker。接着安装 Pixi 包管理器：\n```bash\n# 安装 Pixi (Linux\u002FmacOS)\ncurl -fsSL https:\u002F\u002Fpixi.sh\u002Finstall.sh | bash\n# Windows PowerShell\npowershell -ExecutionPolicy ByPass -c \"irm -useb https:\u002F\u002Fpixi.sh\u002Finstall.ps1 | iex\"\n```\n\n## 安装步骤\n\n### 1. 克隆项目\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002Fstanford-oval\u002FWikiChat.git\ncd WikiChat\n```\n\n### 2. 创建并激活环境\n使用项目提供的 `pixi.toml` 配置环境，这将自动安装 Python 3.11 及所有必要的依赖库。\n```bash\npixi shell\npython -m spacy download en_core_web_sm  # Spacy 仅在用户模拟时需要\n```\n*注意：后续所有命令均需在此激活的 shell 环境中运行。*\n\n### 3. 配置 Redis\nWikiChat 默认通过 Docker 使用 Redis Stack，请确保 Docker 正在运行。\n如果运行聊天机器人后出现 `Error: Redis lookup failed`，通常是 Redis 未正确设置，请检查 Redis Docker 容器的日志，或按照 Redis 官方文档手动安装 Redis。\n\n### 4. 配置大语言模型 (LLM)\nWikiChat 通过 [LiteLLM](https:\u002F\u002Fgithub.com\u002FBerriAI\u002Flitellm) 支持 100+ 种模型（OpenAI, Azure, Anthropic, Mistral, 本地模型等）。\n\n先在 `llm_config.yaml` 中填写相应字段，再在项目根目录创建名为 `API_KEYS` 的文件（已包含在 `.gitignore` 中）并写入 API Key。密钥名称需与 `llm_config.yaml` 中 `api_key` 字段提供的名称一致：\n```bash\n# 示例：配置 OpenAI，密钥后面不要留多余的空格\nOPENAI_API_KEY=[你的 OpenAI API 密钥]\n# 本地托管的模型不需要 API Key，但需要在 api_base 中提供兼容 OpenAI 的端点\n```\n\n### 5. 配置信息检索源\n你可以选择以下两种方式之一：\n\n*   **选项 A（推荐\u002F默认）：使用官方免费 API**\n    无需额外配置，直接运行即可。该 API 提供 25 种语言的维基百科检索（含速率限制）。\n\n*   **选项 B：自建本地索引**\n    如果你需要更高性能、自定义数据或去除速率限制，需自行构建索引。\n    *   构建维基百科索引或自定义文档索引需要 GPU 将文档嵌入为向量。\n    *   具体命令参考项目中的 `To build a Wikipedia index` 部分。\n\n## 基本使用\n\n### 在终端运行\n完成上述配置后，直接在终端启动对话：\n\n```bash\ninv demo --engine gpt-4o  # engine 可以是 llm_config 中配置的任何值\n```\n\n运行 `inv demo --help` 可查看所有可用选项。系统将加载配置好的 LLM 和检索源，你可以在命令行中与 WikiChat 进行交互。它会基于检索到的维基百科内容生成带有引用来源的回答。\n\n### (可选) 部署多用户 Web 界面\n如果需要提供 Web 界面供多人使用，可以部署 Chainlit 前端并连接数据库：\n\n1.  **设置数据库** (用于存储对话历史):\n    将 Azure Cosmos DB 的连接字符串作为 `COSMOS_CONNECTION_STRING` 添加到 `API_KEYS`。如需使用其他数据库，需要修改 `database.py`。\n2.  
**启动 Web 服务**:\n    ```bash\n    inv chainlit --backend-port 5001\n    ```\n    该命令会同时启动后端和前端服务器，随后可通过指定的端口（默认为 5001）在浏览器中访问前端。\n\n---\n*注：本指南基于 WikiChat 2.1 版本。对话模拟功能请参阅 README 的 `Other Commands` 章节；已发布的蒸馏 LLaMA-2 检查点与 WikiChat≥2.0 不兼容，如需运行请参考 v1.0。*","某科技媒体编辑正在撰写一篇关于“2024 年诺贝尔物理学奖得主最新研究进展”的深度报道，需要确保文中引用的实验数据和理论细节绝对准确。\n\n### 没有 WikiChat 时\n- **事实性幻觉频发**：通用大模型常凭空捏造不存在的实验参数或错误归因研究成果，导致文章出现严重事实错误。\n- **时效信息缺失**：面对近期发生的获奖新闻，模型因训练数据截止而无法提供最新的研究细节，只能给出模糊或过时的回答。\n- **核实成本高昂**：编辑必须手动交叉验证模型生成的每一处引用，耗费大量时间查阅原始论文和维基百科，极大拖慢发稿速度。\n- **冷门领域失准**：在涉及特定物理分支的冷门知识时，模型倾向于“一本正经地胡说八道”，缺乏可靠的知识边界。\n\n### 使用 WikiChat 后\n- **源头事实锚定**：WikiChat 基于从维基百科语料库检索到的内容生成回答，大幅减少了模型凭空捏造数据和理论的幻觉问题。\n- **知识更新及时**：检索基于较新的维基百科快照（默认索引为 `wikipedia_20250320`），可覆盖模型训练截止日期之后的内容。\n- **可信引用溯源**：系统基于检索到的真实段落生成回复，并在最终回复中附带内联引用，编辑可据此快速核对来源。\n- **专业领域严谨**：即使在深奥的物理学细分领域，也会依据收录的维基百科词条作答，有助于保证专业术语和逻辑的准确性。\n\nWikiChat 通过将大模型的生成能力牢牢“锚定”在维基百科的真实数据上，让内容创作者在追求效率的同时不再牺牲信息的真实性。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fstanford-oval_WikiChat_036412ad.png","stanford-oval","Stanford Open Virtual Assistant Lab","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002Fstanford-oval_b4570575.png","Research projects in the Stanford Open Virtual Assistant Lab",null,"genie@cs.stanford.edu","StanfordOVAL","https:\u002F\u002Foval.cs.stanford.edu","https:\u002F\u002Fgithub.com\u002Fstanford-oval",[82,86,90,94],{"name":83,"color":84,"percentage":85},"Python","#3572A5",90.8,{"name":87,"color":88,"percentage":89},"Jinja","#a52a22",6.8,{"name":91,"color":92,"percentage":93},"JavaScript","#f1e05a",2.1,{"name":95,"color":96,"percentage":97},"CSS","#663399",0.2,1567,140,"2026-04-05T17:39:36","Apache-2.0",4,"Linux (Ubuntu 20.04 LTS 已测试), macOS (可能需排查), Windows (仅限 WSL，可能需排查)","本地运行 LLM 或创建检索索引时必需。创建索引时默认嵌入模型需至少 13GB 显存。具体型号未说明，需支持相应显存容量。","未说明（但建议本地托管大型搜索索引时使用高性能存储和充足内存）",{"notes":107,"python":108,"dependencies":109},"1. 推荐使用 pixi 管理环境而非 conda。2. 必须安装并运行 Docker 以支持 Redis Stack。3. 若仅调用 API 和使用官方搜索接口，硬件要求极低；若本地部署模型或构建索引，需高性能 GPU（显存≥13GB）和高速 SSD（推荐 NVMe）。4. 
支持超过 100 种大语言模型及 25 种语言的维基百科检索。","3.11",[110,111,112,113,114,115,116,117,118,119],"pixi (包管理工具)","spacy (en_core_web_sm)","Redis Stack (via Docker)","Qdrant (向量数据库)","LiteLLM","LangChain","Docling","Snowflake Arctic Embed \u002F BGE-M3 (嵌入模型)","RankGPT","Loguru",[35,14],[122,123,124,125,126,127,128],"natural-language-processing","chatbot","nlp","factuality","emnlp2023","rag","llm","2026-03-27T02:49:30.150509","2026-04-07T09:50:08.708680",[132,137,142,147,152,157],{"id":133,"question_zh":134,"answer_zh":135,"source_url":136},21476,"构建 Colbert Index 时在 \"Sorting codes...\" 步骤卡住不动怎么办？","这通常是因为内存（RAM）不足导致的。有用户反馈使用 128GB RAM 会在该步骤卡住超过 50 小时，而升级到 256GB RAM 后问题解决。建议检查您的内存配置，如果可能，请增加系统内存至 256GB 或更高再尝试构建索引。","https:\u002F\u002Fgithub.com\u002Fstanford-oval\u002FWikiChat\u002Fissues\u002F11",{"id":138,"question_zh":139,"answer_zh":140,"source_url":141},21477,"新版界面中找不到 \"Most factual\" 模式或 GPT-4 选项怎么办？","这可能是前端框架的一个 Bug，导致部分组件未加载。解决方法是多次刷新页面。正常情况下，聊天框附近应该有一个 \"settings\"（设置）按钮，点击该按钮即可选择不同版本（包括 Most factual 模式）。","https:\u002F\u002Fgithub.com\u002Fstanford-oval\u002FWikiChat\u002Fissues\u002F12",{"id":143,"question_zh":144,"answer_zh":145,"source_url":146},21478,"调用本地 LLM 服务时出现 \"422 Unprocessable Entity\" 错误如何解决？","目前代码库与 vLLM 兼容性尚有问题。建议暂时改用 Hugging Face 的 text-generation-inference (TGI) 来部署本地模型，测试表明其与 WikiChat 兼容良好。此外，请检查本地模型服务的端口，默认情况下 WikiChat 期望本地模型运行在 5002 端口。如需修改，请编辑 `llm_config.yaml` 文件（参考第 99-103 行）。若问题依旧，可在配置文件中开启 LiteLLM 的详细日志记录以辅助排查。","https:\u002F\u002Fgithub.com\u002Fstanford-oval\u002FWikiChat\u002Fissues\u002F32",{"id":148,"question_zh":149,"answer_zh":150,"source_url":151},21479,"项目是否提供 API 以便在硬件资源有限的情况下使用？","项目暂未提供完整聊天机器人的 API，但在 WikiChat v2.0 中发布了一个免费的、有限流的 Wikipedia 搜索 API。使用该搜索 API 可以显著降低运行 WikiChat 所需的计算资源。具体文档请参阅项目 README 中的 \"The free rate-limited Wikipedia search API\" 部分。","https:\u002F\u002Fgithub.com\u002Fstanford-oval\u002FWikiChat\u002Fissues\u002F14",{"id":153,"question_zh":154,"answer_zh":155,"source_url":156},21480,"在线演示页面显示 \"Oops! 
Something went wrong\" 无法使用怎么办？","这通常是服务器端临时故障。维护者在收到报告后会重启服务器。如果您遇到此问题，请先尝试刷新页面；如果仍然报错，可能是服务器需要重启，请稍后再试或等待维护者处理。","https:\u002F\u002Fgithub.com\u002Fstanford-oval\u002FWikiChat\u002Fissues\u002F8",{"id":158,"question_zh":159,"answer_zh":160,"source_url":161},21481,"如何为 WikiChat 添加新的语言支持（例如加泰罗尼亚语）？","WikiChat 对新语言的支持主要取决于底层大语言模型（如 GPT-4o）的多语言能力。只要底层模型支持该语言，WikiChat 通常就能理解并生成该语言的回复。对于特定语言的维基百科数据支持（如加泰罗尼亚语维基百科），维护者会根据社区需求逐步添加。如果您需要特定语言支持，可以提交 Issue 请求，维护者已表示会考虑添加。","https:\u002F\u002Fgithub.com\u002Fstanford-oval\u002FWikiChat\u002Fissues\u002F35",[163,168,173],{"id":164,"version":165,"summary_zh":166,"released_at":167},127485,"v2.1","WikiChat 2.1现已发布！主要更新包括：\n  - **多语言支持增强**：现支持25种不同的维基百科版本（此前为10种），可通过网页和API在search.genie.stanford.edu\u002Fwikipedia_20250320访问：🇺🇸 [英语](https:\u002F\u002Fen.wikipedia.org\u002F)、🇫🇷 [法语](https:\u002F\u002Ffr.wikipedia.org\u002F)、🇩🇪 [德语](https:\u002F\u002Fde.wikipedia.org\u002F)、🇪🇸 [西班牙语](https:\u002F\u002Fes.wikipedia.org\u002F)、🇯🇵 [日语](https:\u002F\u002Fja.wikipedia.org\u002F)、🇷🇺 [俄语](https:\u002F\u002Fru.wikipedia.org\u002F)、🇵🇹 [葡萄牙语](https:\u002F\u002Fpt.wikipedia.org\u002F)、🇨🇳 [中文](https:\u002F\u002Fzh.wikipedia.org\u002F)、🇮🇹 [意大利语](https:\u002F\u002Fit.wikipedia.org\u002F)、🇸🇦 [阿拉伯语](https:\u002F\u002Far.wikipedia.org\u002F)、🇮🇷 [波斯语](https:\u002F\u002Ffa.wikipedia.org\u002F)、🇵🇱 [波兰语](https:\u002F\u002Fpl.wikipedia.org\u002F)、🇳🇱 [荷兰语](https:\u002F\u002Fnl.wikipedia.org\u002F)、🇺🇦 [乌克兰语](https:\u002F\u002Fuk.wikipedia.org\u002F)、🇮🇱 [希伯来语](https:\u002F\u002Fhe.wikipedia.org\u002F)、🇮🇩 [印尼语](https:\u002F\u002Fid.wikipedia.org\u002F)、🇹🇷 [土耳其语](https:\u002F\u002Ftr.wikipedia.org\u002F)、🇨🇿 [捷克语](https:\u002F\u002Fcs.wikipedia.org\u002F)、🇸🇪 [瑞典语](https:\u002F\u002Fsv.wikipedia.org\u002F)、🇰🇷 [韩语](https:\u002F\u002Fko.wikipedia.org\u002F)、🇫🇮 [芬兰语](https:\u002F\u002Ffi.wikipedia.org\u002F)、🇻🇳 [越南语](https:\u002F\u002Fvi.wikipedia.org\u002F)、🇭🇺 [匈牙利语](https:\u002F\u002Fhu.wikipedia.org\u002F)、[加泰罗尼亚语](https:\u002F\u002Fca.wikipedia.org\u002F)、🇹🇭 
[泰语](https:\u002F\u002Fth.wikipedia.org\u002F)。\n  - **信息检索优化**：借助[Snowflake最新的Arctic嵌入模型](https:\u002F\u002Fhuggingface.co\u002FSnowflake\u002Fsnowflake-arctic-embed-l-v2.0)，检索准确性和速度均有所提升。\n  - **维基百科预处理改进**：采用[Docling](https:\u002F\u002Fgithub.com\u002Fdocling-project\u002Fdocling)进行预处理。一如既往，预处理后的维基百科数据可在[HuggingFace](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fstanford-oval\u002Fwikipedia)上获取。\n  - **WikiChat流水线优化**：\n    - 在最终回复中添加了内联引用。\n    - 流水线的“生成”阶段现在始终与“证据抽取”阶段合并，即使在未蒸馏的设置下也是如此，从而实现更快、更经济的推理。\n    - 取消了基于日期的重排序，转而采用由大语言模型驱动的重排序。\n  - 切换至使用[pixi](https:\u002F\u002Fpixi.sh\u002Flatest\u002F)进行包管理，并采用[loguru](https:\u002F\u002Fgithub.com\u002FDelgan\u002Floguru)进行日志记录。","2025-04-29T10:01:45",{"id":169,"version":170,"summary_zh":171,"released_at":172},127486,"v2.0","- **多语言支持**：默认从10种不同的维基百科中检索信息：🇺🇸 英语、🇨🇳 中文、🇪🇸 西班牙语、🇵🇹 葡萄牙语、🇷🇺 俄语、🇩🇪 德语、🇮🇷 波斯语、🇯🇵 日语、🇫🇷 法语和🇮🇹 意大利语。\n- **信息检索能力提升**\n      - 现在不仅支持文本检索，还支持从表格、信息框和列表等结构化数据中提取信息。\n      - 拥有目前质量最高的公开维基百科预处理脚本。\n      - 使用最先进的多语言检索模型[BGE-M3](https:\u002F\u002Fhuggingface.co\u002FBAAI\u002Fbge-m3)。\n      - 利用[Qdrant](https:\u002F\u002Fgithub.com\u002Fqdrant\u002Fqdrant)实现可扩展的向量搜索。\n      - 使用[RankGPT](https:\u002F\u002Fgithub.com\u002Fsunnweiwei\u002FRankGPT)对搜索结果进行重排序。\n- **免费多语言维基百科搜索API**：我们提供高质量、免费（但有限流）的搜索API，可访问10种语言的维基百科，涵盖超过1.8亿条向量嵌入。详情请参阅其[API文档](https:\u002F\u002Fwikichat.genie.stanford.edu\u002Fsearch\u002Fredoc)。\n- **将WikiChat适配到您自有文档（而非维基百科）的指南。**\n\n- **LLM兼容性扩展**：借助[LiteLLM](https:\u002F\u002Fgithub.com\u002FBerriAI\u002Flitellm)，通过统一接口支持100多种大语言模型。\n- **优化的流水线**：可通过合并WikiChat中的“生成”和“提取主张”两个阶段，打造更快速且更具成本效益的流程。\n- **LangChain兼容性**：与LangChain完全兼容🦜️🔗。\n- **还有更多！**\n\n**完整变更日志**：https:\u002F\u002Fgithub.com\u002Fstanford-oval\u002FWikiChat\u002Fcompare\u002Fv1.0...v2.0","2024-08-23T19:56:33",{"id":174,"version":175,"summary_zh":176,"released_at":177},127487,"v1.0","本版本发布了我们发表于 EMNLP 2023 的论文中的代码，论文标题为：[Findings of EMNLP 2023 
paper](https:\u002F\u002Faclanthology.org\u002F2023.findings-emnlp.157\u002F)。","2024-08-23T04:37:30"]