[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-stanford-oval--storm":3,"tool-stanford-oval--storm":64},[4,17,27,35,44,52],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":16},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,3,"2026-04-05T11:01:52",[13,14,15],"开发框架","图像","Agent","ready",{"id":18,"name":19,"github_repo":20,"description_zh":21,"stars":22,"difficulty_score":23,"last_commit_at":24,"category_tags":25,"status":16},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",140436,2,"2026-04-05T23:32:43",[13,15,26],"语言模型",{"id":28,"name":29,"github_repo":30,"description_zh":31,"stars":32,"difficulty_score":23,"last_commit_at":33,"category_tags":34,"status":16},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",107662,"2026-04-03T11:11:01",[13,14,15],{"id":36,"name":37,"github_repo":38,"description_zh":39,"stars":40,"difficulty_score":10,"last_commit_at":41,"category_tags":42,"status":16},4292,"Deep-Live-Cam","hacksider\u002FDeep-Live-Cam","Deep-Live-Cam 是一款专注于实时换脸与视频生成的开源工具，用户仅需一张静态照片，即可通过“一键操作”实现摄像头画面的即时变脸或制作深度伪造视频。它有效解决了传统换脸技术流程繁琐、对硬件配置要求极高以及难以实时预览的痛点，让高质量的数字内容创作变得触手可及。\n\n这款工具不仅适合开发者和技术研究人员探索算法边界，更因其极简的操作逻辑（仅需三步：选脸、选摄像头、启动），广泛适用于普通用户、内容创作者、设计师及直播主播。无论是为了动画角色定制、服装展示模特替换，还是制作趣味短视频和直播互动，Deep-Live-Cam 都能提供流畅的支持。\n\n其核心技术亮点在于强大的实时处理能力，支持口型遮罩（Mouth Mask）以保留使用者原始的嘴部动作，确保表情自然精准；同时具备“人脸映射”功能，可同时对画面中的多个主体应用不同面孔。此外，项目内置了严格的内容安全过滤机制，自动拦截涉及裸露、暴力等不当素材，并倡导用户在获得授权及明确标注的前提下合规使用，体现了技术发展与伦理责任的平衡。",88924,"2026-04-06T03:28:53",[13,14,15,43],"视频",{"id":45,"name":46,"github_repo":47,"description_zh":48,"stars":49,"difficulty_score":23,"last_commit_at":50,"category_tags":51,"status":16},3704,"NextChat","ChatGPTNextWeb\u002FNextChat","NextChat 是一款轻量且极速的 AI 助手，旨在为用户提供流畅、跨平台的大模型交互体验。它完美解决了用户在多设备间切换时难以保持对话连续性，以及面对众多 AI 模型不知如何统一管理的痛点。无论是日常办公、学习辅助还是创意激发，NextChat 都能让用户随时随地通过网页、iOS、Android、Windows、MacOS 或 Linux 端无缝接入智能服务。\n\n这款工具非常适合普通用户、学生、职场人士以及需要私有化部署的企业团队使用。对于开发者而言，它也提供了便捷的自托管方案，支持一键部署到 Vercel 或 Zeabur 等平台。\n\nNextChat 的核心亮点在于其广泛的模型兼容性，原生支持 Claude、DeepSeek、GPT-4 及 Gemini 
Pro 等主流大模型，让用户在一个界面即可自由切换不同 AI 能力。此外，它还率先支持 MCP（Model Context Protocol）协议，增强了上下文处理能力。针对企业用户，NextChat 提供专业版解决方案，具备品牌定制、细粒度权限控制、内部知识库整合及安全审计等功能，满足公司对数据隐私和个性化管理的高标准要求。",87618,"2026-04-05T07:20:52",[13,26],{"id":53,"name":54,"github_repo":55,"description_zh":56,"stars":57,"difficulty_score":23,"last_commit_at":58,"category_tags":59,"status":16},2268,"ML-For-Beginners","microsoft\u002FML-For-Beginners","ML-For-Beginners 是由微软推出的一套系统化机器学习入门课程，旨在帮助零基础用户轻松掌握经典机器学习知识。这套课程将学习路径规划为 12 周，包含 26 节精炼课程和 52 道配套测验，内容涵盖从基础概念到实际应用的完整流程，有效解决了初学者面对庞大知识体系时无从下手、缺乏结构化指导的痛点。\n\n无论是希望转型的开发者、需要补充算法背景的研究人员，还是对人工智能充满好奇的普通爱好者，都能从中受益。课程不仅提供了清晰的理论讲解，还强调动手实践，让用户在循序渐进中建立扎实的技能基础。其独特的亮点在于强大的多语言支持，通过自动化机制提供了包括简体中文在内的 50 多种语言版本，极大地降低了全球不同背景用户的学习门槛。此外，项目采用开源协作模式，社区活跃且内容持续更新，确保学习者能获取前沿且准确的技术资讯。如果你正寻找一条清晰、友好且专业的机器学习入门之路，ML-For-Beginners 将是理想的起点。",84991,"2026-04-05T10:45:23",[14,60,43,61,15,62,26,13,63],"数据工具","插件","其他","音频",{"id":65,"github_repo":66,"name":67,"description_en":68,"description_zh":69,"ai_summary_zh":69,"readme_en":70,"readme_zh":71,"quickstart_zh":72,"use_case_zh":73,"hero_image_url":74,"owner_login":75,"owner_name":76,"owner_avatar_url":77,"owner_bio":78,"owner_company":79,"owner_location":79,"owner_email":80,"owner_twitter":81,"owner_website":82,"owner_url":83,"languages":84,"stars":89,"forks":90,"last_commit_at":91,"license":92,"difficulty_score":23,"env_os":93,"env_gpu":94,"env_ram":94,"env_deps":95,"category_tags":103,"github_topics":104,"view_count":23,"oss_zip_url":79,"oss_zip_packed_at":79,"status":16,"created_at":114,"updated_at":115,"faqs":116,"releases":146},4234,"stanford-oval\u002Fstorm","storm","An LLM-powered knowledge curation system that researches a topic and generates a full-length report with citations.","STORM 是一款由斯坦福大学研发的智能知识整理系统，能够像人类研究员一样，基于互联网搜索自动撰写带有引用来源的长篇专题报告。它主要解决了用户在面对海量信息时，难以高效完成深度调研、梳理逻辑框架及核实出处的痛点，将繁琐的资料搜集与初稿撰写过程自动化。\n\n该系统特别适合需要快速掌握陌生领域全貌的研究人员、学生、内容创作者，以及希望构建定制化知识引擎的开发者使用。对于普通用户，它能提供结构清晰的百科式文章；对于开发者，其模块化的 Python 包支持灵活接入不同的语言模型和检索源。\n\nSTORM 的核心技术亮点在于其独特的“多视角提问”机制。不同于直接让大模型生成内容，STORM 会在预写作阶段模拟不同专家的视角，主动提出一系列深入且广泛的问题，并通过搜索引擎获取答案，从而构建出逻辑严密的详细大纲。在此基础上，系统再结合收集到的参考文献生成最终报告。此外，最新升级的 Co-STORM 版本还引入了人机协作模式，允许用户在调研过程中介入引导，使生成的内容更符合特定需求。无论是作为个人研究助手，还是作为二次开发的基础设施，STORM 都展现了强大的知识合成能力。","\u003Cp align=\"center\">\n  \u003Cimg src=\"assets\u002Flogo.svg\" style=\"width: 25%; height: auto;\">\n\u003C\u002Fp>\n\n# STORM: Synthesis of Topic Outlines through Retrieval and Multi-perspective Question Asking\n\n\u003Cp align=\"center\">\n| \u003Ca href=\"http:\u002F\u002Fstorm.genie.stanford.edu\">\u003Cb>Research preview\u003C\u002Fb>\u003C\u002Fa> | \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2402.14207\">\u003Cb>STORM Paper\u003C\u002Fb>\u003C\u002Fa>| \u003Ca href=\"https:\u002F\u002Fwww.arxiv.org\u002Fabs\u002F2408.15232\">\u003Cb>Co-STORM Paper\u003C\u002Fb>\u003C\u002Fa>  | \u003Ca href=\"https:\u002F\u002Fstorm-project.stanford.edu\u002F\">\u003Cb>Website\u003C\u002Fb>\u003C\u002Fa> |\n\u003C\u002Fp>\n**Latest News** 🔥\n\n- [2025\u002F01] We add [litellm](https:\u002F\u002Fgithub.com\u002FBerriAI\u002Flitellm) integration for language models and embedding models in `knowledge-storm` v1.1.0.\n\n- [2024\u002F09] Co-STORM codebase is now released and integrated into `knowledge-storm` python package v1.0.0. Run `pip install knowledge-storm --upgrade` to check it out.\n\n- [2024\u002F09] We introduce collaborative STORM (Co-STORM) to support human-AI collaborative knowledge curation! 
[Co-STORM Paper](https:\u002F\u002Fwww.arxiv.org\u002Fabs\u002F2408.15232) has been accepted to EMNLP 2024 main conference.\n\n- [2024\u002F07] You can now install our package with `pip install knowledge-storm`!\n- [2024\u002F07] We add `VectorRM` to support grounding on user-provided documents, complementing existing support of search engines (`YouRM`, `BingSearch`). (check out [#58](https:\u002F\u002Fgithub.com\u002Fstanford-oval\u002Fstorm\u002Fpull\u002F58))\n- [2024\u002F07] We release demo light for developers, a minimal user interface built with the Streamlit framework in Python, handy for local development and demo hosting (check out [#54](https:\u002F\u002Fgithub.com\u002Fstanford-oval\u002Fstorm\u002Fpull\u002F54))\n- [2024\u002F06] We will present STORM at NAACL 2024! Find us at Poster Session 2 on June 17 or check our [presentation material](assets\u002Fstorm_naacl2024_slides.pdf). \n- [2024\u002F05] We add Bing Search support in [rm.py](knowledge_storm\u002Frm.py). Test STORM with `GPT-4o` - we now configure the article generation part in our demo using the `GPT-4o` model.\n- [2024\u002F04] We release a refactored version of the STORM codebase! We define an [interface](knowledge_storm\u002Finterface.py) for the STORM pipeline and reimplement STORM-wiki (check out [`src\u002Fstorm_wiki`](knowledge_storm\u002Fstorm_wiki)) to demonstrate how to instantiate the pipeline. We provide an API to support customization of different language models and retrieval\u002Fsearch integration.\n\n[![Code style: black](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fcode%20style-black-000000.svg)](https:\u002F\u002Fgithub.com\u002Fpsf\u002Fblack)\n\n## Overview [(Try STORM now!)](https:\u002F\u002Fstorm.genie.stanford.edu\u002F)\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\"assets\u002Foverview.svg\" style=\"width: 90%; height: auto;\">\n\u003C\u002Fp>\nSTORM is an LLM system that writes Wikipedia-like articles from scratch based on Internet search. Co-STORM further enhances it by enabling humans to collaborate with the LLM system, supporting more aligned and preferred information seeking and knowledge curation.\n\nWhile the system cannot produce publication-ready articles, which often require a significant number of further edits, experienced Wikipedia editors have found it helpful in their pre-writing stage.\n\n**More than 70,000 people have tried our [live research preview](https:\u002F\u002Fstorm.genie.stanford.edu\u002F). Try it out to see how STORM can help your knowledge exploration journey and please provide feedback to help us improve the system 🙏!**\n\n\n\n## How STORM & Co-STORM work\n\n### STORM\n\nSTORM breaks down generating long articles with citations into two steps:\n\n1. **Pre-writing stage**: The system conducts Internet-based research to collect references and generates an outline.\n2. **Writing stage**: The system uses the outline and references to generate the full-length article with citations.\n\u003Cp align=\"center\">\n  \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fstanford-oval_storm_readme_19b9352d17a2.jpg\" style=\"width: 60%; height: auto;\">\n\u003C\u002Fp>\n\nSTORM identifies the core of automating the research process as automatically coming up with good questions to ask. Directly prompting the language model to ask questions does not work well. To improve the depth and breadth of the questions, STORM adopts two strategies:\n1. 
**Perspective-Guided Question Asking**: Given the input topic, STORM discovers different perspectives by surveying existing articles from similar topics and uses them to control the question-asking process.\n2. **Simulated Conversation**: STORM simulates a conversation between a Wikipedia writer and a topic expert grounded in Internet sources to enable the language model to update its understanding of the topic and ask follow-up questions.\n\n### Co-STORM\n\nCo-STORM proposes **a collaborative discourse protocol**, which implements a turn management policy to support smooth collaboration among:\n\n- **Co-STORM LLM experts**: This type of agent generates answers grounded on external knowledge sources and\u002For raises follow-up questions based on the discourse history.\n- **Moderator**: This agent generates thought-provoking questions inspired by information discovered by the retriever but not directly used in previous turns. Question generation can also be grounded!\n- **Human user**: The human user will take the initiative to either (1) observe the discourse to gain a deeper understanding of the topic, or (2) actively engage in the conversation by injecting utterances to steer the discussion focus.\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fstanford-oval_storm_readme_e16cf41879f9.jpg\" style=\"width: 60%; height: auto;\">\n\u003C\u002Fp>\n\nCo-STORM also maintains a dynamically updated **mind map**, which organizes collected information into a hierarchical concept structure, aiming to **build a shared conceptual space between the human user and the system**. The mind map has been shown to help reduce the mental load as the discourse grows long and in-depth. \n\nBoth STORM and Co-STORM are implemented in a highly modular way using [dspy](https:\u002F\u002Fgithub.com\u002Fstanfordnlp\u002Fdspy).\n\n## Installation\n\nTo install the knowledge-storm library, use `pip install knowledge-storm`. \n\nYou can also install from source, which allows you to modify the behavior of the STORM engine directly.\n1. Clone the git repository.\n    ```shell\n    git clone https:\u002F\u002Fgithub.com\u002Fstanford-oval\u002Fstorm.git\n    cd storm\n    ```\n   \n2. Install the required packages.\n   ```shell\n   conda create -n storm python=3.11\n   conda activate storm\n   pip install -r requirements.txt\n   ```\n   \n\n## API\n\nCurrently, our package supports:\n\n- Language model components: All language models supported by litellm as listed [here](https:\u002F\u002Fdocs.litellm.ai\u002Fdocs\u002Fproviders)\n- Embedding model components: All embedding models supported by litellm as listed [here](https:\u002F\u002Fdocs.litellm.ai\u002Fdocs\u002Fembedding\u002Fsupported_embedding)\n- Retrieval module components: `YouRM`, `BingSearch`, `VectorRM`, `SerperRM`, `BraveRM`, `SearXNG`, `DuckDuckGoSearchRM`, `TavilySearchRM`, `GoogleSearch`, and `AzureAISearch`\n\n:star2: **PRs for integrating more search engines\u002Fretrievers into [knowledge_storm\u002Frm.py](knowledge_storm\u002Frm.py) are highly appreciated!**\n\nBoth STORM and Co-STORM work at the information curation layer; you need to set up an information retrieval module and a language model module to create their respective `Runner` classes.\n\n### STORM\n\nThe STORM knowledge curation engine is defined as a simple Python `STORMWikiRunner` class. 
Here is an example of using the You.com search engine and OpenAI models.\n\n```python\nimport os\nfrom knowledge_storm import STORMWikiRunnerArguments, STORMWikiRunner, STORMWikiLMConfigs\nfrom knowledge_storm.lm import LitellmModel\nfrom knowledge_storm.rm import YouRM\n\nlm_configs = STORMWikiLMConfigs()\nopenai_kwargs = {\n    'api_key': os.getenv(\"OPENAI_API_KEY\"),\n    'temperature': 1.0,\n    'top_p': 0.9,\n}\n# STORM is an LM system, so different components can be powered by different models to reach a good balance between cost and quality.\n# As good practice, choose a cheaper\u002Ffaster model for `conv_simulator_lm`, which is used to split queries and synthesize answers in the conversation.\n# Choose a more powerful model for `article_gen_lm` to generate verifiable text with citations.\ngpt_35 = LitellmModel(model='gpt-3.5-turbo', max_tokens=500, **openai_kwargs)\ngpt_4 = LitellmModel(model='gpt-4o', max_tokens=3000, **openai_kwargs)\nlm_configs.set_conv_simulator_lm(gpt_35)\nlm_configs.set_question_asker_lm(gpt_35)\nlm_configs.set_outline_gen_lm(gpt_4)\nlm_configs.set_article_gen_lm(gpt_4)\nlm_configs.set_article_polish_lm(gpt_4)\n# Check out the STORMWikiRunnerArguments class for more configurations.\nengine_args = STORMWikiRunnerArguments(...)\nrm = YouRM(ydc_api_key=os.getenv('YDC_API_KEY'), k=engine_args.search_top_k)\nrunner = STORMWikiRunner(engine_args, lm_configs, rm)\n```\n\nThe `STORMWikiRunner` instance can be invoked with the simple `run` method:\n```python\ntopic = input('Topic: ')\nrunner.run(\n    topic=topic,\n    do_research=True,\n    do_generate_outline=True,\n    do_generate_article=True,\n    do_polish_article=True,\n)\nrunner.post_run()\nrunner.summary()\n```\n- `do_research`: if True, simulate conversations with different perspectives to collect information about the topic; otherwise, load the results.\n- `do_generate_outline`: if True, generate an outline for the topic; otherwise, load the results.\n- `do_generate_article`: if True, generate an article for the topic based on the outline and the collected information; otherwise, load the results.\n- `do_polish_article`: if True, polish the article by adding a summarization section and (optionally) removing duplicate content; otherwise, load the results.
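\n\nBecause each stage flag can either run its stage or load previously saved results, a run can be split across sessions. Below is a minimal sketch of such a two-session workflow; it is an illustration rather than part of the package, reuses `runner` and `topic` from the example above, and assumes each stage caches its output in the configured output directory:\n\n```python\n# Session 1: pay for research and outline generation once.\nrunner.run(\n    topic=topic,\n    do_research=True,\n    do_generate_outline=True,\n    do_generate_article=False,\n    do_polish_article=False,\n)\n\n# Session 2: reload the saved research and outline, then write and polish.\nrunner.run(\n    topic=topic,\n    do_research=False,\n    do_generate_outline=False,\n    do_generate_article=True,\n    do_polish_article=True,\n)\nrunner.post_run()\nrunner.summary()\n```\n\n### Co-STORM\n\nThe Co-STORM knowledge curation engine is defined as a simple Python `CoStormRunner` class. 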
Here is an example of using the Bing search engine and OpenAI models.\n\n```python\nimport os\nfrom knowledge_storm.collaborative_storm.engine import CollaborativeStormLMConfigs, RunnerArgument, CoStormRunner\nfrom knowledge_storm.lm import LitellmModel\nfrom knowledge_storm.logging_wrapper import LoggingWrapper\nfrom knowledge_storm.rm import BingSearch\n\n# Co-STORM adopts the same multi-LM system paradigm as STORM\nlm_config: CollaborativeStormLMConfigs = CollaborativeStormLMConfigs()\nopenai_kwargs = {\n    \"api_key\": os.getenv(\"OPENAI_API_KEY\"),\n    \"api_provider\": \"openai\",\n    \"temperature\": 1.0,\n    \"top_p\": 0.9,\n    \"api_base\": None,\n}\ngpt_4o_model_name = 'gpt-4o'  # the litellm model identifier shared by all Co-STORM roles below\nquestion_answering_lm = LitellmModel(model=gpt_4o_model_name, max_tokens=1000, **openai_kwargs)\ndiscourse_manage_lm = LitellmModel(model=gpt_4o_model_name, max_tokens=500, **openai_kwargs)\nutterance_polishing_lm = LitellmModel(model=gpt_4o_model_name, max_tokens=2000, **openai_kwargs)\nwarmstart_outline_gen_lm = LitellmModel(model=gpt_4o_model_name, max_tokens=500, **openai_kwargs)\nquestion_asking_lm = LitellmModel(model=gpt_4o_model_name, max_tokens=300, **openai_kwargs)\nknowledge_base_lm = LitellmModel(model=gpt_4o_model_name, max_tokens=1000, **openai_kwargs)\n\nlm_config.set_question_answering_lm(question_answering_lm)\nlm_config.set_discourse_manage_lm(discourse_manage_lm)\nlm_config.set_utterance_polishing_lm(utterance_polishing_lm)\nlm_config.set_warmstart_outline_gen_lm(warmstart_outline_gen_lm)\nlm_config.set_question_asking_lm(question_asking_lm)\nlm_config.set_knowledge_base_lm(knowledge_base_lm)\n\n# Check out Co-STORM's RunnerArgument class for more configurations.\ntopic = input('Topic: ')\nrunner_argument = RunnerArgument(topic=topic, ...)\nlogging_wrapper = LoggingWrapper(lm_config)\nbing_rm = BingSearch(bing_search_api_key=os.environ.get(\"BING_SEARCH_API_KEY\"),\n                     k=runner_argument.retrieve_top_k)\ncostorm_runner = CoStormRunner(lm_config=lm_config,\n                               runner_argument=runner_argument,\n                               logging_wrapper=logging_wrapper,\n                               rm=bing_rm)\n```\n\nThe `CoStormRunner` instance can be invoked with the `warm_start()` and `step(...)` methods.\n\n```python\n# Warm start the system to build shared conceptual space between Co-STORM and users\ncostorm_runner.warm_start()\n\n# Step through the collaborative discourse\n# Run either of the code snippets below in any order, as many times as you'd like\n# To observe the conversation:\nconv_turn = costorm_runner.step()\n# To inject your utterance to actively steer the conversation:\ncostorm_runner.step(user_utterance=\"YOUR UTTERANCE HERE\")\n\n# Generate report based on the collaborative discourse\ncostorm_runner.knowledge_base.reorganize()\narticle = costorm_runner.generate_report()\nprint(article)\n```
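\n\nSince `step(...)` can be called any number of times and in any order, it is natural to wrap it in a small driver loop. The sketch below is an illustration rather than part of the package; it assumes the turn returned by `step()` exposes its text via an `utterance` attribute:\n\n```python\n# Minimal interactive driver: press Enter to observe a system turn,\n# type text to steer the discourse, or type 'q' to finish and report.\nwhile True:\n    user_input = input('Utterance (Enter = observe, q = finish): ').strip()\n    if user_input.lower() == 'q':\n        break\n    if user_input:\n        costorm_runner.step(user_utterance=user_input)  # inject and steer\n    else:\n        conv_turn = costorm_runner.step()  # observe a system turn\n        print(conv_turn.utterance)  # `utterance` attribute is an assumption\n\ncostorm_runner.knowledge_base.reorganize()\nprint(costorm_runner.generate_report())\n```\n\n## Quick Start with Example Scripts\n\nWe provide scripts in our [examples folder](examples) as a quick start to run STORM and Co-STORM with different configurations.\n\nWe suggest using `secrets.toml` to set up the API keys. 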
Create a file `secrets.toml` under the root directory and add the following content:\n\n```shell\n# ============ language model configurations ============ \n# Set up OpenAI API key.\nOPENAI_API_KEY=\"your_openai_api_key\"\n# If you are using the API service provided by OpenAI, include the following line:\nOPENAI_API_TYPE=\"openai\"\n# If you are using the API service provided by Microsoft Azure, include the following lines:\nOPENAI_API_TYPE=\"azure\"\nAZURE_API_BASE=\"your_azure_api_base_url\"\nAZURE_API_VERSION=\"your_azure_api_version\"\n# ============ retriever configurations ============ \nBING_SEARCH_API_KEY=\"your_bing_search_api_key\" # if using bing search\n# ============ encoder configurations ============ \nENCODER_API_TYPE=\"openai\" # if using openai encoder\n```\n\n### STORM examples\n\n**To run STORM with `gpt` family models with default configurations:**\n\nRun the following command:\n```bash\npython examples\u002Fstorm_examples\u002Frun_storm_wiki_gpt.py \\\n    --output-dir $OUTPUT_DIR \\\n    --retriever bing \\\n    --do-research \\\n    --do-generate-outline \\\n    --do-generate-article \\\n    --do-polish-article\n```\n\n**To run STORM using your favorite language models or grounding on your own corpus:** Check out [examples\u002Fstorm_examples\u002FREADME.md](examples\u002Fstorm_examples\u002FREADME.md).\n\n### Co-STORM examples\n\nTo run Co-STORM with `gpt` family models with default configurations,\n\n1. Add `BING_SEARCH_API_KEY=\"xxx\"` and `ENCODER_API_TYPE=\"xxx\"` to `secrets.toml`.\n2. Run the following command:\n\n```bash\npython examples\u002Fcostorm_examples\u002Frun_costorm_gpt.py \\\n    --output-dir $OUTPUT_DIR \\\n    --retriever bing\n```\n\n\n## Customization of the Pipeline\n\n### STORM\n\nIf you have installed the source code, you can customize STORM based on your own use case. The STORM engine consists of four modules:\n\n1. Knowledge Curation Module: Collects a broad coverage of information about the given topic.\n2. Outline Generation Module: Organizes the collected information by generating a hierarchical outline for the curated knowledge.\n3. Article Generation Module: Populates the generated outline with the collected information.\n4. Article Polishing Module: Refines and enhances the written article for better presentation.\n\nThe interface for each module is defined in `knowledge_storm\u002Finterface.py`, while their implementations are instantiated in `knowledge_storm\u002Fstorm_wiki\u002Fmodules\u002F*`. These modules can be customized according to your specific requirements (e.g., generating sections in bullet point format instead of full paragraphs).\n\n### Co-STORM\n\nIf you have installed the source code, you can customize Co-STORM based on your own use case.\n\n1. Co-STORM introduces multiple LLM agent types (i.e., Co-STORM experts and the Moderator). The LLM agent interface is defined in `knowledge_storm\u002Finterface.py`, while its implementation is instantiated in `knowledge_storm\u002Fcollaborative_storm\u002Fmodules\u002Fco_storm_agents.py`. Different LLM agent policies can be customized.\n2. Co-STORM introduces a collaborative discourse protocol, with its core function centered on turn policy management. We provide an example implementation of turn policy management through `DiscourseManager` in `knowledge_storm\u002Fcollaborative_storm\u002Fengine.py`. It can be customized and further improved.
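\n\nWhether you run the stock pipeline or a customized one from source, your own driver script still needs the keys from `secrets.toml` in the environment. Here is a minimal stdlib-only loader sketch (an illustration; the bundled example scripts use their own helper for this, and `tomllib` requires Python 3.11+):\n\n```python\n# Minimal sketch: copy the flat key\u002Fvalue pairs from secrets.toml into\n# os.environ so the LM and retriever clients can pick them up.\nimport os\nimport tomllib  # Python 3.11+ standard library\n\nwith open('secrets.toml', 'rb') as f:\n    secrets = tomllib.load(f)\n\nfor key, value in secrets.items():\n    os.environ.setdefault(key, str(value))\n```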
\n\n## Datasets\nTo facilitate the study of automatic knowledge curation and complex information seeking, our project releases the following datasets:\n\n### FreshWiki\nThe FreshWiki Dataset is a collection of 100 high-quality Wikipedia articles focusing on the most-edited pages from February 2022 to September 2023. See Section 2.1 in the [STORM paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2402.14207) for more details.\n\nYou can download the dataset from [huggingface](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FEchoShao8899\u002FFreshWiki) directly. To mitigate the data contamination issue, we archive the [source code](https:\u002F\u002Fgithub.com\u002Fstanford-oval\u002Fstorm\u002Ftree\u002FNAACL-2024-code-backup\u002FFreshWiki) of the data construction pipeline so that it can be re-run at future dates.\n\n### WildSeek\nTo study users’ interests in complex information seeking tasks in the wild, we utilized data collected from the web research preview to create the WildSeek dataset. We downsampled the data to ensure the diversity of the topics and the quality of the data. Each data point is a pair comprising a topic and the user’s goal for conducting a deep search on the topic. For more details, please refer to Section 2.2 and Appendix A of the [Co-STORM paper](https:\u002F\u002Fwww.arxiv.org\u002Fabs\u002F2408.15232).\n\nThe WildSeek dataset is available [here](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FYuchengJiang\u002FWildSeek).\n\n## Replicate STORM & Co-STORM paper results\n\nFor STORM paper experiments, please switch to the branch `NAACL-2024-code-backup` [here](https:\u002F\u002Fgithub.com\u002Fstanford-oval\u002Fstorm\u002Ftree\u002FNAACL-2024-code-backup).\n\nFor Co-STORM paper experiments, please switch to the branch `EMNLP-2024-code-backup` (placeholder for now, will be updated soon).\n\n## Roadmap & Contributions\nOur team is actively working on:\n1. Human-in-the-Loop Functionalities: Supporting user participation in the knowledge curation process.\n2. Information Abstraction: Developing abstractions for curated information to support presentation formats beyond the Wikipedia-style report.\n\nIf you have any questions or suggestions, please feel free to open an issue or pull request. We welcome contributions to improve the system and the codebase!\n\nContact persons: [Yijia Shao](mailto:shaoyj@stanford.edu) and [Yucheng Jiang](mailto:yuchengj@stanford.edu)\n\n## Acknowledgement\nWe would like to thank Wikipedia for its excellent open-source content. 
The FreshWiki dataset is sourced from Wikipedia, licensed under the Creative Commons Attribution-ShareAlike (CC BY-SA) license.\n\nWe are very grateful to [Michelle Lam](https:\u002F\u002Fmichelle123lam.github.io\u002F) for designing the logo for this project and [Dekun Ma](https:\u002F\u002Fdekun.me) for leading the UI development.\n\nThanks to Vercel for their support of [open-source software](https:\u002F\u002Fstorm.genie.stanford.edu)\n\n## Citation\nPlease cite our paper if you use this code or part of it in your work:\n```bibtex\n@inproceedings{jiang-etal-2024-unknown,\n    title = \"Into the Unknown Unknowns: Engaged Human Learning through Participation in Language Model Agent Conversations\",\n    author = \"Jiang, Yucheng  and\n      Shao, Yijia  and\n      Ma, Dekun  and\n      Semnani, Sina  and\n      Lam, Monica\",\n    editor = \"Al-Onaizan, Yaser  and\n      Bansal, Mohit  and\n      Chen, Yun-Nung\",\n    booktitle = \"Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing\",\n    month = nov,\n    year = \"2024\",\n    address = \"Miami, Florida, USA\",\n    publisher = \"Association for Computational Linguistics\",\n    url = \"https:\u002F\u002Faclanthology.org\u002F2024.emnlp-main.554\u002F\",\n    doi = \"10.18653\u002Fv1\u002F2024.emnlp-main.554\",\n    pages = \"9917--9955\",\n}\n\n@inproceedings{shao-etal-2024-assisting,\n    title = \"Assisting in Writing {W}ikipedia-like Articles From Scratch with Large Language Models\",\n    author = \"Shao, Yijia  and\n      Jiang, Yucheng  and\n      Kanell, Theodore  and\n      Xu, Peter  and\n      Khattab, Omar  and\n      Lam, Monica\",\n    editor = \"Duh, Kevin  and\n      Gomez, Helena  and\n      Bethard, Steven\",\n    booktitle = \"Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)\",\n    month = jun,\n    year = \"2024\",\n    address = \"Mexico City, Mexico\",\n    publisher = \"Association for Computational Linguistics\",\n    url = \"https:\u002F\u002Faclanthology.org\u002F2024.naacl-long.347\u002F\",\n    doi = \"10.18653\u002Fv1\u002F2024.naacl-long.347\",\n    pages = \"6252--6278\",\n}\n```\n","\u003Cp align=\"center\">\n  \u003Cimg src=\"assets\u002Flogo.svg\" style=\"width: 25%; height: auto;\">\n\u003C\u002Fp>\n\n# STORM：基于检索与多视角提问的主题大纲生成\n\n\u003Cp align=\"center\">\n| \u003Ca href=\"http:\u002F\u002Fstorm.genie.stanford.edu\">\u003Cb>研究预览\u003C\u002Fb>\u003C\u002Fa> | \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2402.14207\">\u003Cb>STORM论文\u003C\u002Fb>\u003C\u002Fa>| \u003Ca href=\"https:\u002F\u002Fwww.arxiv.org\u002Fabs\u002F2408.15232\">\u003Cb>Co-STORM论文\u003C\u002Fb>\u003C\u002Fa>  | \u003Ca href=\"https:\u002F\u002Fstorm-project.stanford.edu\u002F\">\u003Cb>官网\u003C\u002Fb>\u003C\u002Fa> |\n\u003C\u002Fp>\n**最新消息** 🔥\n\n- [2025\u002F01] 我们在`knowledge-storm` v1.1.0中加入了对语言模型和嵌入模型的`litellm`（https:\u002F\u002Fgithub.com\u002FBerriAI\u002Flitellm）集成。\n\n- [2024\u002F09] Co-STORM代码库现已发布，并整合到`knowledge-storm` Python包v1.0.0中。运行`pip install knowledge-storm --upgrade`即可体验。\n\n- [2024\u002F09] 我们推出了协作式STORM（Co-STORM），以支持人机协同的知识整理！[Co-STORM论文](https:\u002F\u002Fwww.arxiv.org\u002Fabs\u002F2408.15232)已被EMNLP 2024主会接收。\n\n- [2024\u002F07] 现在您可以通过`pip install knowledge-storm`安装我们的软件包！\n- [2024\u002F07] 
我们新增了`VectorRM`，用于支持基于用户提供的文档进行知识增强，补充了现有的搜索引擎支持（`YouRM`、`BingSearch`）。（参见[#58](https:\u002F\u002Fgithub.com\u002Fstanford-oval\u002Fstorm\u002Fpull\u002F58)）\n- [2024\u002F07] 我们发布了面向开发者的轻量级演示版——一个基于Python Streamlit框架构建的极简用户界面，便于本地开发和演示部署。（参见[#54](https:\u002F\u002Fgithub.com\u002Fstanford-oval\u002Fstorm\u002Fpull\u002F54)）\n- [2024\u002F06] 我们将在NAACL 2024上展示STORM！请于6月17日莅临海报展示环节2，或查看我们的[演示材料](assets\u002Fstorm_naacl2024_slides.pdf)。\n- [2024\u002F05] 我们在[rm.py](knowledge_storm\u002Frm.py)中增加了必应搜索支持。使用`GPT-4o`测试STORM——我们现在已将演示中的文章生成部分配置为使用`GPT-4o`模型。\n- [2024\u002F04] 我们发布了重构后的STORM代码库！我们定义了STORM流水线的[接口](knowledge_storm\u002Finterface.py)，并重新实现了STORM-wiki（参见[`src\u002Fstorm_wiki`](knowledge_storm\u002Fstorm_wiki)），以展示如何实例化该流水线。我们提供了API，支持自定义不同的语言模型以及检索\u002F搜索集成。\n\n[![代码风格：black](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fcode%20style-black-000000.svg)](https:\u002F\u002Fgithub.com\u002Fpsf\u002Fblack)\n\n## 概述 [(立即试用STORM！)](https:\u002F\u002Fstorm.genie.stanford.edu\u002F)\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\"assets\u002Foverview.svg\" style=\"width: 90%; height: auto;\">\n\u003C\u002Fp>\nSTORM是一个基于互联网搜索从零开始撰写维基百科式文章的LLM系统。Co-STORM进一步增强了其功能，允许人类与LLM系统协作，从而更好地满足信息需求和知识整理的目标。\n\n尽管该系统目前还无法生成可以直接发表的文章——这类文章通常需要大量编辑——但经验丰富的维基百科编辑认为它在写作前的准备阶段非常有帮助。\n\n**已有超过7万人试用了我们的[在线研究预览](https:\u002F\u002Fstorm.genie.stanford.edu\u002F)。快来体验STORM如何助力您的知识探索之旅，并欢迎提供反馈以帮助我们改进系统 🙏！**\n\n\n\n## STORM与Co-STORM的工作原理\n\n### STORM\n\nSTORM将带有引用的长篇文章生成过程分解为两个步骤：\n\n1. **写作准备阶段**：系统通过互联网调研收集参考资料，并生成文章大纲。\n2. **写作阶段**：系统利用大纲和参考资料生成包含引用的完整文章。\n\u003Cp align=\"center\">\n  \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fstanford-oval_storm_readme_19b9352d17a2.jpg\" style=\"width: 60%; height: auto;\">\n\u003C\u002Fp>\n\nSTORM认为，自动化研究过程的核心在于能够自动提出高质量的问题。直接让语言模型提问效果并不理想。为了提升问题的深度和广度，STORM采用了两种策略：\n1. **视角引导式提问**：给定输入主题后，STORM会通过调研类似主题的现有文章来发现不同视角，并利用这些视角来指导提问过程。\n2. **模拟对话**：STORM模拟一位维基百科作者与基于互联网资源的主题专家之间的对话，使语言模型能够不断更新对主题的理解，并提出后续问题。\n\n### CO-STORM\n\nCo-STORM提出了一种**协作式话语协议**，通过实施轮次管理策略来支持以下角色之间的顺畅协作：\n\n- **Co-STORM LLM专家**：此类代理会基于外部知识源生成答案，或根据对话历史提出后续问题。\n- **主持人**：该代理会根据检索器发现但尚未在先前轮次中使用的信息，生成具有启发性的提问。问题生成也可以基于知识来源。\n- **人类用户**：人类用户可以选择（1）观察对话以更深入地理解主题，或（2）主动参与对话，通过插入话语来引导讨论方向。\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fstanford-oval_storm_readme_e16cf41879f9.jpg\" style=\"width: 60%; height: auto;\">\n\u003C\u002Fp>\n\nCo-STORM还会维护一张动态更新的**思维导图**，将收集到的信息组织成层次化的概念结构，旨在**在人类用户和系统之间构建共享的概念空间**。实践证明，当对话持续且深入时，思维导图有助于减轻认知负担。\n\nSTORM和Co-STORM均采用高度模块化的方式实现，使用了[dspy](https:\u002F\u002Fgithub.com\u002Fstanfordnlp\u002Fdspy)框架。\n\n## 安装\n\n\n要安装knowledge storm库，请使用`pip install knowledge-storm`。\n\n您也可以安装源代码，以便直接修改STORM引擎的行为。\n1. 克隆Git仓库。\n    ```shell\n    git clone https:\u002F\u002Fgithub.com\u002Fstanford-oval\u002Fstorm.git\n    cd storm\n    ```\n   \n2. 
安装所需依赖。\n   ```shell\n   conda create -n storm python=3.11\n   conda activate storm\n   pip install -r requirements.txt\n   ```\n\n## API\n\n目前，我们的软件包支持：\n\n- 语言模型组件：所有由 litellm 支持的语言模型，详见 [这里](https:\u002F\u002Fdocs.litellm.ai\u002Fdocs\u002Fproviders)\n- 嵌入模型组件：所有由 litellm 支持的嵌入模型，详见 [这里](https:\u002F\u002Fdocs.litellm.ai\u002Fdocs\u002Fembedding\u002Fsupported_embedding)\n- 检索模块组件：`YouRM`、`BingSearch`、`VectorRM`、`SerperRM`、`BraveRM`、`SearXNG`、`DuckDuckGoSearchRM`、`TavilySearchRM`、`GoogleSearch` 和 `AzureAISearch` 等\n\n:star2: **欢迎为将更多搜索引擎\u002F检索器集成到 [knowledge_storm\u002Frm.py](knowledge_storm\u002Frm.py) 中提交 PR！**\n\nSTORM 和 Co-STORM 都工作在信息整理层，您需要分别设置信息检索模块和语言模型模块来创建它们的 `Runner` 类。\n\n### STORM\n\nSTORM 知识整理引擎被定义为一个简单的 Python `STORMWikiRunner` 类。以下是一个使用 You.com 搜索引擎和 OpenAI 模型的示例。\n\n```python\nimport os\nfrom knowledge_storm import STORMWikiRunnerArguments, STORMWikiRunner, STORMWikiLMConfigs\nfrom knowledge_storm.lm import LitellmModel\nfrom knowledge_storm.rm import YouRM\n\nlm_configs = STORMWikiLMConfigs()\nopenai_kwargs = {\n    'api_key': os.getenv(\"OPENAI_API_KEY\"),\n    'temperature': 1.0,\n    'top_p': 0.9,\n}\n# STORM 是一个 LM 系统，因此不同的组件可以由不同的模型驱动，以在成本和质量之间取得良好平衡。\n# 作为最佳实践，为用于拆分查询、在对话中综合答案的 `conv_simulator_lm` 选择更便宜\u002F更快的模型。\n# 为用于生成带有引用的可验证文本的 `article_gen_lm` 选择更强大的模型。\ngpt_35 = LitellmModel(model='gpt-3.5-turbo', max_tokens=500, **openai_kwargs)\ngpt_4 = LitellmModel(model='gpt-4o', max_tokens=3000, **openai_kwargs)\nlm_configs.set_conv_simulator_lm(gpt_35)\nlm_configs.set_question_asker_lm(gpt_35)\nlm_configs.set_outline_gen_lm(gpt_4)\nlm_configs.set_article_gen_lm(gpt_4)\nlm_configs.set_article_polish_lm(gpt_4)\n# 更多配置请参阅 STORMWikiRunnerArguments 类。\nengine_args = STORMWikiRunnerArguments(...)\nrm = YouRM(ydc_api_key=os.getenv('YDC_API_KEY'), k=engine_args.search_top_k)\nrunner = STORMWikiRunner(engine_args, lm_configs, rm)\n```\n\n可以通过简单的 `run` 方法调用 `STORMWikiRunner` 实例：\n```python\ntopic = input('主题: ')\nrunner.run(\n    topic=topic,\n    do_research=True,\n    do_generate_outline=True,\n    do_generate_article=True,\n    do_polish_article=True,\n)\nrunner.post_run()\nrunner.summary()\n```\n- `do_research`: 如果为 True，则模拟与不同观点的对话以收集关于该主题的信息；否则加载已有的结果。\n- `do_generate_outline`: 如果为 True，则为该主题生成大纲；否则加载已有的结果。\n- `do_generate_article`: 如果为 True，则根据大纲和收集到的信息生成该主题的文章；否则加载已有的结果。\n- `do_polish_article`: 如果为 True，则通过添加总结部分并（可选）去除重复内容来润色文章；否则加载已有的结果。\n\n### Co-STORM\n\nCo-STORM 知识整理引擎被定义为一个简单的 Python `CoStormRunner` 类。以下是一个使用 Bing 搜索引擎和 OpenAI 模型的示例。\n\n```python\nimport os\nfrom knowledge_storm.collaborative_storm.engine import CollaborativeStormLMConfigs, RunnerArgument, CoStormRunner\nfrom knowledge_storm.lm import LitellmModel\nfrom knowledge_storm.logging_wrapper import LoggingWrapper\nfrom knowledge_storm.rm import BingSearch\n\n# Co-STORM 采用与 STORM 相同的多 LM 系统范式\nlm_config: CollaborativeStormLMConfigs = CollaborativeStormLMConfigs()\nopenai_kwargs = {\n    \"api_key\": os.getenv(\"OPENAI_API_KEY\"),\n    \"api_provider\": \"openai\",\n    \"temperature\": 1.0,\n    \"top_p\": 0.9,\n    \"api_base\": None,\n}\ngpt_4o_model_name = 'gpt-4o'  # 以下各角色统一使用的 litellm 模型标识\nquestion_answering_lm = LitellmModel(model=gpt_4o_model_name, max_tokens=1000, **openai_kwargs)\ndiscourse_manage_lm = LitellmModel(model=gpt_4o_model_name, max_tokens=500, **openai_kwargs)\nutterance_polishing_lm = LitellmModel(model=gpt_4o_model_name, max_tokens=2000, **openai_kwargs)\nwarmstart_outline_gen_lm = LitellmModel(model=gpt_4o_model_name, max_tokens=500, **openai_kwargs)\nquestion_asking_lm = LitellmModel(model=gpt_4o_model_name, max_tokens=300, 
**openai_kwargs)\nknowledge_base_lm = LitellmModel(model=gpt_4o_model_name, max_tokens=1000, **openai_kwargs)\n\nlm_config.set_question_answering_lm(question_answering_lm)\nlm_config.set_discourse_manage_lm(discourse_manage_lm)\nlm_config.set_utterance_polishing_lm(utterance_polishing_lm)\nlm_config.set_warmstart_outline_gen_lm(warmstart_outline_gen_lm)\nlm_config.set_question_asking_lm(question_asking_lm)\nlm_config.set_knowledge_base_lm(knowledge_base_lm)\n\n# 更多配置请参阅 Co-STORM 的 RunnerArgument 类。\ntopic = input('主题: ')\nrunner_argument = RunnerArgument(topic=topic, ...)\nlogging_wrapper = LoggingWrapper(lm_config)\nbing_rm = BingSearch(bing_search_api_key=os.environ.get(\"BING_SEARCH_API_KEY\"),\n                     k=runner_argument.retrieve_top_k)\ncostorm_runner = CoStormRunner(lm_config=lm_config,\n                               runner_argument=runner_argument,\n                               logging_wrapper=logging_wrapper,\n                               rm=bing_rm)\n```\n\n可以通过 `warm_start()` 和 `step(...)` 方法调用 `CoStormRunner` 实例。\n\n```python\n# 预热系统，在 Co-STORM 与用户之间建立共享的概念空间\ncostorm_runner.warm_start()\n\n# 逐步推进协作性讨论\n# 您可以按任意顺序运行下面任一代码片段，次数不限\n# 要观察对话：\nconv_turn = costorm_runner.step()\n# 要主动引导对话，插入您的发言：\ncostorm_runner.step(user_utterance=\"您的发言\")\n\n# 根据协作性讨论生成报告\ncostorm_runner.knowledge_base.reorganize()\narticle = costorm_runner.generate_report()\nprint(article)\n```\n\n\n\n## 使用示例脚本快速入门\n\n我们在 [examples 文件夹](examples) 中提供了脚本，以便快速启动 STORM 和 Co-STORM，并使用不同的配置进行运行。\n\n我们建议使用 `secrets.toml` 来设置 API 密钥。在根目录下创建一个 `secrets.toml` 文件，并添加以下内容：\n\n```shell\n# ============ 语言模型配置 ============\n# 设置 OpenAI API 密钥。\nOPENAI_API_KEY=\"your_openai_api_key\"\n# 如果您使用的是 OpenAI 提供的 API 服务，请包含以下行：\nOPENAI_API_TYPE=\"openai\"\n# 如果您使用的是 Microsoft Azure 提供的 API 服务，请包含以下行：\nOPENAI_API_TYPE=\"azure\"\nAZURE_API_BASE=\"your_azure_api_base_url\"\nAZURE_API_VERSION=\"your_azure_api_version\"\n# ============ 检索器配置 ============\nBING_SEARCH_API_KEY=\"your_bing_search_api_key\" # 如果使用 bing search\n# ============ 编码器配置 ============ \nENCODER_API_TYPE=\"openai\" # 如果使用 OpenAI 编码器\n```\n\n### STORM 示例\n\n**使用 `gpt` 系列模型及默认配置运行 STORM：**\n\n运行以下命令。\n```bash\npython examples\u002Fstorm_examples\u002Frun_storm_wiki_gpt.py \\\n    --output-dir $OUTPUT_DIR \\\n    --retriever bing \\\n    --do-research \\\n    --do-generate-outline \\\n    --do-generate-article \\\n    --do-polish-article\n```\n\n**使用您喜爱的语言模型或基于您自己的语料库运行 STORM：** 请查看 [examples\u002Fstorm_examples\u002FREADME.md](examples\u002Fstorm_examples\u002FREADME.md)。\n\n### Co-STORM 示例\n\n要使用 `gpt` 系列模型及默认配置运行 Co-STORM：\n\n1. 在 `secrets.toml` 中添加 `BING_SEARCH_API_KEY=\"xxx\"` 和 `ENCODER_API_TYPE=\"xxx\"`。\n2. 运行以下命令：\n\n```bash\npython examples\u002Fcostorm_examples\u002Frun_costorm_gpt.py \\\n    --output-dir $OUTPUT_DIR \\\n    --retriever bing\n```\n\n\n## 流水线的定制化\n\n### STORM\n\n如果您已安装源代码，可以根据自己的使用场景对 STORM 进行定制。STORM 引擎由 4 个模块组成：\n\n1. 知识整理模块：收集关于给定主题的广泛信息。\n2. 大纲生成模块：通过为整理后的知识生成层次化的大纲来组织收集到的信息。\n3. 文章生成模块：将收集到的信息填充到生成的大纲中。\n4. 文章润色模块：对撰写的文章进行优化和提升，以获得更好的呈现效果。\n\n每个模块的接口在 `knowledge_storm\u002Finterface.py` 中定义，其实现则在 `knowledge_storm\u002Fstorm_wiki\u002Fmodules\u002F*` 中实例化。这些模块可以根据您的具体需求进行定制（例如，生成项目符号格式的章节而不是完整的段落）。\n\n### Co-STORM\n\n如果您已安装源代码，可以根据自己的使用场景对 Co-STORM 进行定制。\n\n1. Co-STORM 引入了多种 LLM 代理类型（即 Co-STORM 专家和主持人）。LLM 代理的接口在 `knowledge_storm\u002Finterface.py` 中定义，其实现在 `knowledge_storm\u002Fcollaborative_storm\u002Fmodules\u002Fco_storm_agents.py` 中实例化。可以自定义不同的 LLM 代理策略。\n2. 
Co-STORM 引入了一种协作性话语协议，其核心功能在于轮次策略管理。我们在 `knowledge_storm\u002Fcollaborative_storm\u002Fengine.py` 中提供了通过 `DiscourseManager` 实现轮次策略管理的示例。该部分可以进一步定制和改进。\n\n## 数据集\n为了促进自动知识整理和复杂信息检索的研究，我们的项目发布了以下数据集：\n\n### FreshWiki\nFreshWiki 数据集是一组包含 100 篇高质量维基百科文章的集合，重点涵盖了 2022 年 2 月至 2023 年 9 月期间编辑次数最多的页面。更多详情请参阅 [STORM 论文](https:\u002F\u002Farxiv.org\u002Fabs\u002F2402.14207) 的第 2.1 节。\n\n您可以直接从 [huggingface](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FEchoShao8899\u002FFreshWiki) 下载该数据集。为避免数据污染问题，我们还存档了用于构建数据的 [源代码](https:\u002F\u002Fgithub.com\u002Fstanford-oval\u002Fstorm\u002Ftree\u002FNAACL-2024-code-backup\u002FFreshWiki)，以便在未来重复使用。\n\n### WildSeek\n为了研究用户在真实环境中对复杂信息检索任务的兴趣，我们利用网络研究预览中收集的数据创建了 WildSeek 数据集。我们对数据进行了下采样，以确保主题的多样性和数据的质量。每个数据点都是一对内容，包括一个主题以及用户针对该主题进行深度搜索的目标。更多详情请参阅 [Co-STORM 论文](https:\u002F\u002Fwww.arxiv.org\u002Fabs\u002F2408.15232) 的第 2.2 节和附录 A。\n\nWildSeek 数据集可在 [这里](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FYuchengJiang\u002FWildSeek) 获取。\n\n## 复现 STORM 和 Co-STORM 论文结果\n对于 STORM 论文实验，请切换到分支 `NAACL-2024-code-backup` [这里](https:\u002F\u002Fgithub.com\u002Fstanford-oval\u002Fstorm\u002Ftree\u002FNAACL-2024-code-backup)。\n\n对于 Co-STORM 论文实验，目前暂用分支 `EMNLP-2024-code-backup`（即将更新）。\n\n## 路线图与贡献\n我们的团队正在积极开发：\n1. 人机协作功能：支持用户参与知识整理过程。\n2. 信息抽象：为整理后的信息开发抽象表示，以支持除维基百科式报告之外的其他呈现形式。\n\n如果您有任何问题或建议，请随时提交问题或拉取请求。我们欢迎任何有助于改进系统和代码库的贡献！\n\n联系人：[Yijia Shao](mailto:shaoyj@stanford.edu) 和 [Yucheng Jiang](mailto:yuchengj@stanford.edu)\n\n## 致谢\n我们感谢维基百科提供的优秀开源内容。FreshWiki 数据集来源于维基百科，采用知识共享署名-相同方式共享（CC BY-SA）许可协议。\n\n我们非常感谢 [Michelle Lam](https:\u002F\u002Fmichelle123lam.github.io\u002F) 为本项目设计的标志，以及 [Dekun Ma](https:\u002F\u002Fdekun.me) 主导的 UI 开发工作。\n\n同时感谢 Vercel 对 [开源软件](https:\u002F\u002Fstorm.genie.stanford.edu) 的支持。\n\n## 引用\n如果您在工作中使用了本代码或其中的一部分，请引用我们的论文：\n```bibtex\n@inproceedings{jiang-etal-2024-unknown,\n    title = \"进入未知的未知：通过参与语言模型代理对话实现主动式人类学习\",\n    author = \"Jiang, Yucheng  and\n      Shao, Yijia  and\n      Ma, Dekun  and\n      Semnani, Sina  and\n      Lam, Monica\",\n    editor = \"Al-Onaizan, Yaser  and\n      Bansal, Mohit  and\n      Chen, Yun-Nung\",\n    booktitle = \"2024 年自然语言处理经验方法会议论文集\",\n    month = nov,\n    year = \"2024\",\n    address = \"迈阿密，佛罗里达州，美国\",\n    publisher = \"计算语言学协会\",\n    url = \"https:\u002F\u002Faclanthology.org\u002F2024.emnlp-main.554\u002F\",\n    doi = \"10.18653\u002Fv1\u002F2024.emnlp-main.554\",\n    pages = \"9917--9955\",\n}\n\n@inproceedings{shao-etal-2024-assisting,\n    title = \"借助大型语言模型从零开始协助撰写类似维基百科的文章\",\n    author = \"Shao, Yijia  and\n      Jiang, Yucheng  and\n      Kanell, Theodore  and\n      Xu, Peter  and\n      Khattab, Omar  and\n      Lam, Monica\",\n    editor = \"Duh, Kevin  and\n      Gomez, Helena  and\n      Bethard, Steven\",\n    booktitle = \"2024 年北美计算语言学协会人类语言技术会议论文集（第一卷：长篇论文）\",\n    month = jun,\n    year = \"2024\",\n    address = \"墨西哥城，墨西哥\",\n    publisher = \"计算语言学协会\",\n    url = \"https:\u002F\u002Faclanthology.org\u002F2024.naacl-long.347\u002F\",\n    doi = \"10.18653\u002Fv1\u002F2024.naacl-long.347\",\n    pages = \"6252--6278\",\n}\n```","# STORM & Co-STORM 快速上手指南\n\nSTORM 是一个基于大语言模型（LLM）的系统，能够通过互联网搜索从零开始撰写维基百科风格的文章。Co-STORM 是其进阶版本，支持人机协作进行知识策展，通过动态思维导图构建共享概念空间。\n\n## 环境准备\n\n*   **操作系统**：Linux, macOS, Windows\n*   **Python 版本**：推荐 Python 3.11\n*   **依赖管理**：推荐使用 `conda` 或 `venv` 创建独立虚拟环境\n*   **API 密钥**：\n    *   **大模型**：需准备支持的 LLM API Key（如 OpenAI, 或通过 LiteLLM 接入的其他模型）。\n    *   **搜索引擎**：需准备至少一种检索引擎的 API Key（如 You.com, Bing Search, Serper 等），或使用本地向量库 (`VectorRM`)。\n\n## 
安装步骤\n\n### 方式一：通过 Pip 安装（推荐）\n\n直接安装最新发布的 `knowledge-storm` 包（已包含 STORM 和 Co-STORM 功能）：\n\n```bash\npip install knowledge-storm --upgrade\n```\n\n### 方式二：源码安装（适合开发者）\n\n如需修改引擎行为或参与开发，可克隆源码安装：\n\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002Fstanford-oval\u002Fstorm.git\ncd storm\nconda create -n storm python=3.11\nconda activate storm\npip install -r requirements.txt\n```\n\n> **提示**：国内用户若遇到 pip 下载缓慢，可添加清华或阿里镜像源加速：\n> `pip install knowledge-storm --upgrade -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple`\n\n## 基本使用\n\nSTORM 和 Co-STORM 均基于模块化设计，支持通过 `litellm` 接入多种模型和检索器。以下展示最核心的调用流程。\n\n### 1. 使用 STORM 生成文章\n\nSTORM 分为“预写（调研与大纲）”和“写作（生成全文）”两个阶段。\n\n```python\nimport os\nfrom knowledge_storm import STORMWikiRunnerArguments, STORMWikiRunner, STORMWikiLMConfigs\nfrom knowledge_storm.lm import LitellmModel\nfrom knowledge_storm.rm import YouRM  # 也可替换为 BingSearch, VectorRM 等\n\n# 1. 配置大模型 (利用 LiteLLM 兼容多种提供商)\nlm_configs = STORMWikiLMConfigs()\nopenai_kwargs = {\n    'api_key': os.getenv(\"OPENAI_API_KEY\"),\n    'temperature': 1.0,\n    'top_p': 0.9,\n}\n\n# 最佳实践：低成本模型用于对话模拟\u002F提问，高性能模型用于大纲\u002F文章生成\ngpt_35 = LitellmModel(model='gpt-3.5-turbo', max_tokens=500, **openai_kwargs)\ngpt_4 = LitellmModel(model='gpt-4o', max_tokens=3000, **openai_kwargs)\n\nlm_configs.set_conv_simulator_lm(gpt_35)\nlm_configs.set_question_asker_lm(gpt_35)\nlm_configs.set_outline_gen_lm(gpt_4)\nlm_configs.set_article_gen_lm(gpt_4)\nlm_configs.set_article_polish_lm(gpt_4)\n\n# 2. 配置检索引擎 (此处以 You.com 为例)\n# 请确保环境变量中设置了 YDC_API_KEY 或直接在参数中传入\nrm = YouRM(ydc_api_key=os.getenv('YDC_API_KEY'), k=3) \n\n# 3. 初始化运行器\nengine_args = STORMWikiRunnerArguments(\n    output_dir='.\u002Fstorm_output',\n    search_top_k=3,\n    max_thread_num=10\n)\nrunner = STORMWikiRunner(engine_args, lm_configs, rm)\n\n# 4. 执行任务\ntopic = \"Artificial Intelligence\"  # 替换为你想调研的主题\nrunner.run(\n    topic=topic,\n    do_research=True,       # True: 进行联网调研; False: 加载已有结果\n    do_generate_outline=True,\n    do_generate_article=True,\n    do_polish_article=True,\n)\n\n# 5. 结束并查看摘要\nrunner.post_run()\nrunner.summary()\n```\n\n### 2. 使用 Co-STORM 进行人机协作\n\nCo-STORM 允许用户介入对话过程，系统会维护一个动态更新的思维导图。\n\n```python\nimport os\nfrom knowledge_storm.collaborative_storm.engine import CollaborativeStormLMConfigs, RunnerArgument, CoStormRunner\nfrom knowledge_storm.lm import LitellmModel\nfrom knowledge_storm.logging_wrapper import LoggingWrapper\nfrom knowledge_storm.rm import BingSearch\n\n# 1. 
配置多模型组件\nlm_config = CollaborativeStormLMConfigs()\nopenai_kwargs = {\n    \"api_key\": os.getenv(\"OPENAI_API_KEY\"),\n    \"temperature\": 1.0,\n    \"top_p\": 0.9,\n}\n\n# 为不同角色分配模型 (示例中均使用 gpt-4o，实际可按需区分)\nmodel_name = 'gpt-4o'\nquestion_answering_lm = LitellmModel(model=model_name, max_tokens=1000, **openai_kwargs)\ndiscourse_manage_lm = LitellmModel(model=model_name, max_tokens=500, **openai_kwargs)\nutterance_polishing_lm = LitellmModel(model=model_name, max_tokens=2000, **openai_kwargs)\nwarmstart_outline_gen_lm = LitellmModel(model=model_name, max_tokens=500, **openai_kwargs)\nquestion_asking_lm = LitellmModel(model=model_name, max_tokens=300, **openai_kwargs)\nknowledge_base_lm = LitellmModel(model=model_name, max_tokens=1000, **openai_kwargs)\n\nlm_config.set_question_answering_lm(question_answering_lm)\nlm_config.set_discourse_manage_lm(discourse_manage_lm)\nlm_config.set_utterance_polishing_lm(utterance_polishing_lm)\nlm_config.set_warmstart_outline_gen_lm(warmstart_outline_gen_lm)\nlm_config.set_question_asking_lm(question_asking_lm)\nlm_config.set_knowledge_base_lm(knowledge_base_lm)\n\n# 2. 初始化参数与检索器\ntopic = \"Quantum Computing\"\nrunner_argument = RunnerArgument(\n    topic=topic,\n    retrieve_top_k=3,\n    max_search_queries=2\n)\nlogging_wrapper = LoggingWrapper(lm_config)\nbing_rm = BingSearch(bing_search_api_key=os.environ.get(\"BING_SEARCH_API_KEY\"), k=3)\n\ncostorm_runner = CoStormRunner(\n    lm_config=lm_config,\n    runner_argument=runner_argument,\n    logging_wrapper=logging_wrapper,\n    rm=bing_rm\n)\n\n# 3. 启动协作流程\n# 预热：建立初始共享概念空间\ncostorm_runner.warm_start()\n\n# 步进式交互：\n# 模式 A: 观察系统生成的对话回合\nconv_turn = costorm_runner.step()\nprint(f\"System: {conv_turn.utterance}\")\n\n# 模式 B: 用户主动注入观点引导讨论 (取消注释即可使用)\n# user_input = \"Please focus more on the hardware challenges.\"\n# conv_turn = costorm_runner.step(user_utterance=user_input)\n```\n\n### 输出说明\n运行完成后，文章、大纲及中间调研数据将保存在指定的 `output_dir` 目录中（默认为当前目录下的 `storm_output` 文件夹），格式包含 Markdown 源码及引用信息。","某科技公司的行业分析师需要在两天内完成一份关于“固态电池技术突破与商业化路径”的深度调研报告，以供高层战略会议使用。\n\n### 没有 storm 时\n- **信息搜集碎片化**：分析师需手动在多个搜索引擎和学术库中反复切换关键词，耗时数小时才能拼凑出零散的文献列表，极易遗漏关键视角。\n- **大纲构建困难**：面对海量杂乱信息，难以快速梳理出逻辑严密的多维度大纲，往往陷入“只见树木不见森林”的困境，导致报告结构松散。\n- **引用核对繁琐**：人工整理参考文献和对应引注极易出错，花费大量时间核对来源真实性，且容易因疲劳产生幻觉或张冠李戴。\n- **视角单一局限**：受限于个人知识边界，报告往往缺乏跨学科或多利益相关方（如政策、供应链、技术瓶颈）的深度追问，内容深度不足。\n\n### 使用 storm 后\n- **自动化全景调研**：storm 自动基于互联网进行多轮检索，模拟不同专家视角主动提问，迅速覆盖技术原理、市场障碍及政策环境等全方位信息。\n- **智能生成结构化大纲**：系统先通过“预写作阶段”生成逻辑清晰的多级大纲，确保报告骨架严谨，分析师只需微调即可锁定核心叙事线。\n- **一键生成带引注长文**：storm 直接输出包含准确 citations 的完整长篇报告，所有论据均自动关联原始来源，大幅降低事实核查成本。\n- **深度与广度兼备**：借助多视角问答机制，报告自然融入了原本容易被忽视的边缘视角，显著提升了内容的专业深度和决策参考价值。\n\nstorm 将原本需要数天的人工调研与撰写过程压缩至小时级，让分析师从繁琐的信息搬运工转型为高价值的策略洞察者。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fstanford-oval_storm_7b4a710d.png","stanford-oval","Stanford Open Virtual Assistant Lab","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002Fstanford-oval_b4570575.png","Research projects in the Stanford Open Virtual Assistant Lab",null,"genie@cs.stanford.edu","StanfordOVAL","https:\u002F\u002Foval.cs.stanford.edu","https:\u002F\u002Fgithub.com\u002Fstanford-oval",[85],{"name":86,"color":87,"percentage":88},"Python","#3572A5",100,28052,2552,"2026-04-05T22:51:36","MIT","Linux, macOS, Windows","未说明",{"notes":96,"python":97,"dependencies":98},"该工具主要依赖外部 API（如 OpenAI、Bing Search、You.com 等）进行推理和检索，README 中未提及本地运行大模型所需的 GPU 或显存需求。建议使用 conda 创建 Python 3.11 环境并安装 requirements.txt 中的依赖。需配置相应服务的 API Key 
方可运行。","3.11",[99,100,101,102],"knowledge-storm","litellm","dspy","streamlit",[26,62,13,15],[105,106,107,108,109,110,111,112,113],"large-language-models","nlp","knowledge-curation","naacl","report-generation","retrieval-augmented-generation","emnlp2024","agentic-rag","deep-research","2026-03-27T02:49:30.150509","2026-04-06T14:04:01.070935",[117,122,127,132,137,142],{"id":118,"question_zh":119,"answer_zh":120,"source_url":121},19283,"如何集成本地 LLM（如 Ollama）或使用 Docker 部署？","项目现已支持 Ollama（通过 PR #81 合并），您可以将其作为兼容 OpenAI API 的端点使用。关于 Docker 容器或 Docker Compose 的支持，目前需关注项目后续更新或自行构建镜像。","https:\u002F\u002Fgithub.com\u002Fstanford-oval\u002Fstorm\u002Fissues\u002F2",{"id":123,"question_zh":124,"answer_zh":125,"source_url":126},19284,"为什么无法使用 Azure OpenAI，报错提示没有有效的提供者？","由于包升级，默认对 Azure OpenAI 的支持曾被移除。临时解决方法是修改 `knowledge_storm\u002Fstorm_wiki\u002Fengine.py` 文件中的 `init_openai_model` 函数，显式传入 `azure_api_key`, `api_base`, `api_version` 等参数构建 `azure_kwargs` 字典来初始化模型。官方计划在未来版本重新添加默认支持。","https:\u002F\u002Fgithub.com\u002Fstanford-oval\u002Fstorm\u002Fissues\u002F86",{"id":128,"question_zh":129,"answer_zh":130,"source_url":131},19285,"除了 You.com，还支持哪些搜索 API？","目前除了 You.com，项目还支持 Bing Search 以及基于向量数据库的自定义语料库检索。此外，社区也在讨论支持 Semantic Scholar API 等其他来源。具体配置请参考项目的 API 文档部分。","https:\u002F\u002Fgithub.com\u002Fstanford-oval\u002Fstorm\u002Fissues\u002F8",{"id":133,"question_zh":134,"answer_zh":135,"source_url":136},19286,"运行本地语料库时出现 'IndexError: list index out of range' 或 'hits' 错误怎么办？","这通常是因为向量存储（vector_store）为空，未加载 CSV 数据。请在运行命令时添加 `--update-vector-store` 参数，这将指示模型将 CSV 文件中的文档添加到离线向量存储中。例如：`python examples\u002Frun_storm_wiki_gpt_with_VectorRM.py ... --update-vector-store --csv-file-path \u003C你的 csv 路径> ...`。","https:\u002F\u002Fgithub.com\u002Fstanford-oval\u002Fstorm\u002Fissues\u002F82",{"id":138,"question_zh":139,"answer_zh":140,"source_url":141},19287,"Web 演示版的“创建新文章”功能为什么不可用？","由于用户需求极高，搜索 API 的配额在两天内耗尽，且超出了实验室预算，因此该功能目前已被禁用。开发者正在重构代码库并规划路线图，旨在将 STORM 通用化为易用的知识策划引擎。用户可以订阅邮件通知以获取功能恢复的消息。","https:\u002F\u002Fgithub.com\u002Fstanford-oval\u002Fstorm\u002Fissues\u002F14",{"id":143,"question_zh":144,"answer_zh":145,"source_url":141},19288,"连接 OpenAI API 时遇到连接中断或错误，可能是什么原因？","这可能是与 OpenAI API 端点的网络连接问题。有用户反馈，关闭 VPN 后问题得以解决。建议检查网络环境，尝试断开 VPN 或代理后再进行连接测试。",[147,152,157,162,167,172],{"id":148,"version":149,"summary_zh":150,"released_at":151},117272,"v1.1.0","将知识风暴包升级至 v1.1.0，以包含最新更改。\n\n## 语言模型 API 更新\n- #309 STORM 和 Co-STORM 现在支持由 LiteLLM 提供支持的语言模型\n\n## 检索器 API 更新\n- #236 修复 Serper 检索器的 bug\n- #198 添加 Azure AI 搜索\n\n**完整变更日志**: https:\u002F\u002Fgithub.com\u002Fstanford-oval\u002Fstorm\u002Fcompare\u002Fv1.0.0...v1.1.0","2025-01-23T07:21:44",{"id":153,"version":154,"summary_zh":155,"released_at":156},117273,"v1.0.0","我们非常高兴地宣布Co-STORM的发布——这是STORM项目的重大更新，将**人机协作的知识编纂**推向了前台！新版本让用户能够以更加互动和对齐的方式与语言模型进行交流，从而彻底改变我们共同探索和编纂知识的方式。\n\nknowledge-storm软件包现已升级至v1.0.0——请通过以下命令进行升级：\n```python\npip install knowledge-storm --upgrade\n```\n\n## 最新动态 🔥\n- [2024年9月] Co-STORM现已上线，并已完全集成到knowledge-storm Python软件包中。立即升级软件包即可体验！\n- [2024年9月] 我们关于Co-STORM的论文已被EMNLP 2024接收！您可以通过[这里](https:\u002F\u002Fwww.arxiv.org\u002Fabs\u002F2408.15232)阅读。\n\n## 新特性 🎉\n\n**🚀 Co-STORM引擎**\n\n- Co-STORM现已集成到knowledge-storm软件包中。更多详情请参阅[API文档](https:\u002F\u002Fgithub.com\u002Fstanford-oval\u002Fstorm#co-storm-1)。\n- 我们在Co-STORM中引入了**Agent接口**，提供了一个统一的框架，用于定义不同的语言模型代理策略，以支持信息检索和知识编纂任务。您可以在此处探索该接口：[knowledge_storm\u002Finterface.py](https:\u002F\u002Fgithub.com\u002Fstanford-oval\u002Fstorm\u002Fblob\u002Fmain\u002Fknowledge_storm\u002Finterface.py)。\n  
- Co-STORM LLM专家：这些代理会基于外部知识源生成响应，并根据对话内容提出后续问题。\n  - 主持人代理：通过生成富有洞察力的问题来引导对话，同时关注那些已被发现但尚未充分探索的领域。\n  - 人类参与：用户可以插入自己的话语来引导对话方向，从而实现互动性和协作性的体验。\n\n**🧠 动态思维导图**\n\nCo-STORM引入了**KnowledgeBase**数据类（[dataclass.py](https:\u002F\u002Fgithub.com\u002Fstanford-oval\u002Fstorm\u002Fblob\u002Fmain\u002Fknowledge_storm\u002Fdataclass.py)），它将检索到的信息组织成面向概念的层级结构，从而在用户与系统之间构建一个共享的概念空间。这一结构以图形化的思维导图形式呈现，使用户能够轻松导航并深入探索知识编纂过程。\n\n**💻 图形化UI更新**\n\nCo-STORM的交互式图形化界面即将在我们的[实时研究预览网站](http:\u002F\u002Fstorm.genie.stanford.edu\u002F)上推出。敬请期待后续更新！\n\n**🔧 模块化与灵活性**\n\n- 与STORM一样，Co-STORM同样基于[dspy](https:\u002F\u002Fgithub.com\u002Fstanfordnlp\u002Fdspy)库构建，确保了完全的模块化设计。您可以轻松自定义语言模型（LM）和检索模块（RM），以满足高级用例的需求。有关自定义指南，请参阅[此处](https:\u002F\u002Fgithub.com\u002Fstanford-oval\u002Fstorm#co-storm-2)。\n- Co-STORM支持与STORM相同的语言模型和检索模块。完整的支持模型列表请见[此处](https:\u002F\u002Fgithub.com\u002Fstanford-oval\u002Fstorm#api)。\n\n## 贡献者 🙌\n\n本次发布离不开以下各位的辛勤付出：\n\n- [@Yucheng-Jiang](https:\u002F\u002Fgithub.com\u002FYucheng-Jiang)\n- [@shaoyijia](https:\u002F\u002Fgithub.com\u002Fshaoyijia)\n- [@dekunma](https:\u002F\u002Fgithub.com\u002Fdekunma)","2024-09-25T19:50:16",{"id":158,"version":159,"summary_zh":160,"released_at":161},117274,"v0.2.4","将 `knowledge-storm` 包升级至 v0.2.4，以包含最新更改。对 demo light（轻量演示界面）进行了小幅修复 (#118)。\n\n## 新增模型支持\n- `GoogleModel` (#105)\n- `DeepSeekModel` (#84)\n\n## 新增检索器支持\n- `SerperRM` (#102)\n- `BraveRM` (#134)\n- 通过将向量存储的构建与检索器类解耦，改进了 `VectorRM` (#126)。","2024-08-09T16:57:35",{"id":163,"version":164,"summary_zh":165,"released_at":166},117275,"v0.2.3","## 可在 PyPI 上获取的 Python 包\n- 现在可以通过 `pip install knowledge-storm` 安装 storm。(#80)\n- 我们更新了 README.md，以说明当前的 API。\n- 如果您更倾向于使用源代码，请注意，拉取新版本会导致破坏性变更，因为我们已将 `src\u002F` 重命名为 `knowledge_storm\u002F`。\n\n## Ollama 支持\n#81，感谢：@zhjain\n\n## 路线图与贡献\n我们的团队正在积极开发以下功能：\n1. 人机协作功能：支持用户参与知识梳理过程。\n2. 信息抽象：为梳理后的信息构建抽象层，以支持除维基百科式报告之外的其他呈现形式。\n\n**非常欢迎针对 [knowledge_storm\u002Flm.py](https:\u002F\u002Fgithub.com\u002Fstanford-oval\u002Fstorm\u002Fblob\u002Fmain\u002Fknowledge_storm\u002Flm.py) 集成更多语言模型，以及针对 [knowledge_storm\u002Frm.py](https:\u002F\u002Fgithub.com\u002Fstanford-oval\u002Fstorm\u002Fblob\u002Fmain\u002Fknowledge_storm\u002Frm.py) 集成更多搜索引擎和检索器的 PR！**","2024-07-18T05:02:04",{"id":168,"version":169,"summary_zh":170,"released_at":171},117276,"v0.2.0","## 支持从自定义文档中检索信息\n- 新增 `VectorRM`，以支持基于用户提供的文档进行知识增强，从而补充现有对搜索引擎（`YouRM`、`BingSearch`）的支持。（[#58](https:\u002F\u002Fgithub.com\u002Fstanford-oval\u002Fstorm\u002Fpull\u002F58)）\n- 我们提供了[示例脚本](https:\u002F\u002Fgithub.com\u002Fstanford-oval\u002Fstorm\u002Fblob\u002Fmain\u002Fexamples\u002Frun_storm_wiki_gpt_with_VectorRM.py)以及[详细说明](https:\u002F\u002Fgithub.com\u002Fstanford-oval\u002Fstorm\u002Fblob\u002Fmain\u002Fexamples\u002FREADME.md#run-storm-with-your-own-corpus)。\n\n## UI 更新\n- 我们发布了一个全新的用户界面，供用户直接与 STORM 交互，访问地址为 https:\u002F\u002Fstorm.genie.stanford.edu\u002F。此次更新提升了稳定性，并新增了热门话题发现功能。（感谢来自耶鲁大学的 [dekunma](https:\u002F\u002Fgithub.com\u002Fdekunma) 的协作）\n- 我们还为开发者发布了 [demo light](https:\u002F\u002Fgithub.com\u002Fstanford-oval\u002Fstorm\u002Ftree\u002Fmain\u002Ffrontend\u002Fdemo_light)，这是一个使用 Python 的 Streamlit 框架构建的极简用户界面，便于本地开发和演示部署。（[#54](https:\u002F\u002Fgithub.com\u002Fstanford-oval\u002Fstorm\u002Fpull\u002F54)）\n\n![storm_ui](https:\u002F\u002Fgithub.com\u002Fstanford-oval\u002Fstorm\u002Fassets\u002F51142637\u002F0596fca3-6372-4d81-b07f-d04b58b9f283)\n\n## 其他\n- 旧金山线下交流：我们的团队成员将于 7 月 11 日在 [维基百科人旧金山聚会](https:\u002F\u002Fen.wikipedia.org\u002Fwiki\u002FWikipedia:Meetup\u002FSan_Francisco) 上发表特邀演讲。欢迎前来与我们面对面交流！\n- 
即将发布的重大更新：一个支持人机协作模式的重大更新将于夏季发布。敬请期待！","2024-07-08T15:15:45",{"id":173,"version":174,"summary_zh":175,"released_at":176},117277,"v0.1.0","## 流水线重构与模块接口\n\n我们发布了经过重构的 STORM 流水线代码，旨在让 STORM 的运行、定制和开发更加便捷。现已定义了流水线各模块及数据类的接口，从而提升了模块化程度和可扩展性。\n\n\n## 语言模型定制支持\n\nSTORM 现已支持跨多个系列和客户端的语言模型定制，包括：\n- GPT 系列\n- Claude 系列\n- VLLM 客户端\n- TGI 客户端\n- Together 客户端\n\n`examples\u002F` 目录下提供了演示语言模型定制的示例。更多详情请参阅 README 文件。\n\n\n## 检索器模块定制支持\n\nSTORM 引入了对 `Retriever` 模块进行定制的能力。作为知识编排引擎，STORM 会使用由 `Retriever` 模块提供的信息。\n有关检索器模块定制的更多信息，请参阅 README 文件。\n\n-----\n本次发布重点在于 **提升 STORM 流水线的模块化与灵活性**，使用户能够根据自身需求无缝集成并定制语言模型和检索器模块。**非常欢迎为 STORM 贡献更多语言模型支持以及搜索引擎\u002F检索模型支持！**","2024-04-23T05:27:58"]