[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-OSU-NLP-Group--HippoRAG":3,"tool-OSU-NLP-Group--HippoRAG":61},[4,18,26,36,44,53],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":17},4358,"openclaw","openclaw\u002Fopenclaw","OpenClaw 是一款专为个人打造的本地化 AI 助手，旨在让你在自己的设备上拥有完全可控的智能伙伴。它打破了传统 AI 助手局限于特定网页或应用的束缚，能够直接接入你日常使用的各类通讯渠道，包括微信、WhatsApp、Telegram、Discord、iMessage 等数十种平台。无论你在哪个聊天软件中发送消息，OpenClaw 都能即时响应，甚至支持在 macOS、iOS 和 Android 设备上进行语音交互，并提供实时的画布渲染功能供你操控。\n\n这款工具主要解决了用户对数据隐私、响应速度以及“始终在线”体验的需求。通过将 AI 部署在本地，用户无需依赖云端服务即可享受快速、私密的智能辅助，真正实现了“你的数据，你做主”。其独特的技术亮点在于强大的网关架构，将控制平面与核心助手分离，确保跨平台通信的流畅性与扩展性。\n\nOpenClaw 非常适合希望构建个性化工作流的技术爱好者、开发者，以及注重隐私保护且不愿被单一生态绑定的普通用户。只要具备基础的终端操作能力（支持 macOS、Linux 及 Windows WSL2），即可通过简单的命令行引导完成部署。如果你渴望拥有一个懂你",349277,3,"2026-04-06T06:32:30",[13,14,15,16],"Agent","开发框架","图像","数据工具","ready",{"id":19,"name":20,"github_repo":21,"description_zh":22,"stars":23,"difficulty_score":10,"last_commit_at":24,"category_tags":25,"status":17},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,"2026-04-05T11:01:52",[14,15,13],{"id":27,"name":28,"github_repo":29,"description_zh":30,"stars":31,"difficulty_score":32,"last_commit_at":33,"category_tags":34,"status":17},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 
代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",153609,2,"2026-04-13T11:34:59",[14,13,35],"语言模型",{"id":37,"name":38,"github_repo":39,"description_zh":40,"stars":41,"difficulty_score":32,"last_commit_at":42,"category_tags":43,"status":17},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",108322,"2026-04-10T11:39:34",[14,15,13],{"id":45,"name":46,"github_repo":47,"description_zh":48,"stars":49,"difficulty_score":32,"last_commit_at":50,"category_tags":51,"status":17},6121,"gemini-cli","google-gemini\u002Fgemini-cli","gemini-cli 是一款由谷歌推出的开源 AI 命令行工具，它将强大的 Gemini 大模型能力直接集成到用户的终端环境中。对于习惯在命令行工作的开发者而言，它提供了一条从输入提示词到获取模型响应的最短路径，无需切换窗口即可享受智能辅助。\n\n这款工具主要解决了开发过程中频繁上下文切换的痛点，让用户能在熟悉的终端界面内直接完成代码理解、生成、调试以及自动化运维任务。无论是查询大型代码库、根据草图生成应用，还是执行复杂的 Git 操作，gemini-cli 都能通过自然语言指令高效处理。\n\n它特别适合广大软件工程师、DevOps 人员及技术研究人员使用。其核心亮点包括支持高达 100 万 token 的超长上下文窗口，具备出色的逻辑推理能力；内置 Google 搜索、文件操作及 Shell 命令执行等实用工具；更独特的是，它支持 MCP（模型上下文协议），允许用户灵活扩展自定义集成，连接如图像生成等外部能力。此外，个人谷歌账号即可享受免费的额度支持，且项目基于 Apache 2.0 
协议完全开源，是提升终端工作效率的理想助手。",100752,"2026-04-10T01:20:03",[52,13,15,14],"插件",{"id":54,"name":55,"github_repo":56,"description_zh":57,"stars":58,"difficulty_score":32,"last_commit_at":59,"category_tags":60,"status":17},4721,"markitdown","microsoft\u002Fmarkitdown","MarkItDown 是一款由微软 AutoGen 团队打造的轻量级 Python 工具，专为将各类文件高效转换为 Markdown 格式而设计。它支持 PDF、Word、Excel、PPT、图片（含 OCR）、音频（含语音转录）、HTML 乃至 YouTube 链接等多种格式的解析，能够精准提取文档中的标题、列表、表格和链接等关键结构信息。\n\n在人工智能应用日益普及的今天，大语言模型（LLM）虽擅长处理文本，却难以直接读取复杂的二进制办公文档。MarkItDown 恰好解决了这一痛点，它将非结构化或半结构化的文件转化为模型“原生理解”且 Token 效率极高的 Markdown 格式，成为连接本地文件与 AI 分析 pipeline 的理想桥梁。此外，它还提供了 MCP（模型上下文协议）服务器，可无缝集成到 Claude Desktop 等 LLM 应用中。\n\n这款工具特别适合开发者、数据科学家及 AI 研究人员使用，尤其是那些需要构建文档检索增强生成（RAG）系统、进行批量文本分析或希望让 AI 助手直接“阅读”本地文件的用户。虽然生成的内容也具备一定可读性，但其核心优势在于为机器",93400,"2026-04-06T19:52:38",[52,14],{"id":62,"github_repo":63,"name":64,"description_en":65,"description_zh":66,"ai_summary_zh":66,"readme_en":67,"readme_zh":68,"quickstart_zh":69,"use_case_zh":70,"hero_image_url":71,"owner_login":72,"owner_name":73,"owner_avatar_url":74,"owner_bio":75,"owner_company":76,"owner_location":76,"owner_email":76,"owner_twitter":77,"owner_website":76,"owner_url":78,"languages":79,"stars":84,"forks":85,"last_commit_at":86,"license":87,"difficulty_score":10,"env_os":88,"env_gpu":89,"env_ram":90,"env_deps":91,"category_tags":99,"github_topics":76,"view_count":32,"oss_zip_url":76,"oss_zip_packed_at":76,"status":17,"created_at":100,"updated_at":101,"faqs":102,"releases":132},7114,"OSU-NLP-Group\u002FHippoRAG","HippoRAG","[NeurIPS'24] HippoRAG is a novel RAG framework inspired by human long-term memory that enables LLMs to continuously integrate knowledge across external documents. 
RAG + Knowledge Graphs + Personalized PageRank.","HippoRAG 是一款受人类长期记忆机制启发的新型检索增强生成（RAG）框架，旨在帮助大语言模型持续整合外部文档中的知识。它巧妙融合了知识图谱与个性化页面排名算法，将传统的 RAG 升级为具备“记忆”能力的智能系统。\n\n针对现有系统在应对复杂、多跳推理任务时关联能力不足，以及在处理海量上下文时难以有效“理解”的痛点，HippoRAG 显著提升了模型在事实记忆、逻辑关联及复杂语境整合方面的表现。实验表明，它在增强高级推理能力的同时，并未牺牲简单任务的性能，且在离线索引阶段比 GraphRAG、RAPTOR 等同类图方案更节省资源，在线响应也保持了高效低延迟。\n\n该工具特别适合 AI 研究人员、大模型应用开发者以及需要构建企业级知识库的技术团队使用。其核心技术亮点在于模拟了人脑的海马体功能，通过非参数化的持续学习机制，让模型能够像人类一样识别并利用新知识间的深层联系，从而向真正的“长期记忆”迈进了一大步。无论是处理多文档问答还是复杂叙事分析，HippoRAG 都能提供更精准、连贯的智能支持。","\u003Ch1 align=\"center\">HippoRAG 2: From RAG to Memory\u003C\u002Fh1>\n\u003Cp align=\"center\">\n    \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOSU-NLP-Group_HippoRAG_readme_5c05af34a2ce.png\" width=\"55%\" style=\"max-width: 300px;\">\n\u003C\u002Fp>\n\n[\u003Cimg align=\"center\" src=\"https:\u002F\u002Fcolab.research.google.com\u002Fassets\u002Fcolab-badge.svg\" \u002F>](https:\u002F\u002Fcolab.research.google.com\u002Fdrive\u002F1nuelysWsXL8F5xH6q4JYJI8mvtlmeM9O#scrollTo=TjHdNe2KC81K)\n\n[\u003Cimg align=\"center\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2502.14802 HippoRAG 2-b31b1b\" \u002F>](https:\u002F\u002Farxiv.org\u002Fabs\u002F2502.14802)\n[\u003Cimg align=\"center\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F🤗 Dataset-HippoRAG 2-yellow\" \u002F>](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fosunlp\u002FHippoRAG_2\u002Ftree\u002Fmain)\n[\u003Cimg align=\"center\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2405.14831 HippoRAG 1-b31b1b\" \u002F>](https:\u002F\u002Farxiv.org\u002Fabs\u002F2405.14831)\n[\u003Cimg align=\"center\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FGitHub-HippoRAG 1-blue\" \u002F>](https:\u002F\u002Fgithub.com\u002FOSU-NLP-Group\u002FHippoRAG\u002Ftree\u002Flegacy)\n\n### HippoRAG 2 is a powerful memory framework for LLMs that enhances their ability to recognize and utilize connections in new knowledge—mirroring a key function of human 
long-term memory.\n\nOur experiments show that HippoRAG 2 improves associativity (multi-hop retrieval) and sense-making (the process of integrating large and complex contexts) in even the most advanced RAG systems, without sacrificing their performance on simpler tasks.\n\nLike its predecessor, HippoRAG 2 remains cost- and latency-efficient in online processes, while using significantly fewer resources for offline indexing than other graph-based solutions such as GraphRAG, RAPTOR, and LightRAG.\n\n\u003Cp align=\"center\">\n  \u003Cimg align=\"center\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOSU-NLP-Group_HippoRAG_readme_5b90a821219a.png\" \u002F>\n\u003C\u002Fp>\n\u003Cp align=\"center\">\n  \u003Cb>Figure 1:\u003C\u002Fb> Evaluation of continual learning capabilities across three key dimensions: factual memory (NaturalQuestions, PopQA), sense-making (NarrativeQA), and associativity (MuSiQue, 2Wiki, HotpotQA, and LV-Eval). HippoRAG 2 surpasses other methods across all categories, bringing it one step closer to true long-term memory.\n\u003C\u002Fp>\n\n\u003Cp align=\"center\">\n  \u003Cimg align=\"center\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOSU-NLP-Group_HippoRAG_readme_aea07d15223b.png\" \u002F>\n\u003C\u002Fp>\n\u003Cp align=\"center\">\n  \u003Cb>Figure 2:\u003C\u002Fb> HippoRAG 2 methodology.\n\u003C\u002Fp>\n\n#### Check out our papers to learn more:\n\n* [**HippoRAG: Neurobiologically Inspired Long-Term Memory for Large Language Models**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2405.14831) [NeurIPS '24].\n* [**From RAG to Memory: Non-Parametric Continual Learning for Large Language Models**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2502.14802) [ICML '25].\n\n----\n\n## Installation\n\n```sh\nconda create -n hipporag python=3.10\nconda activate hipporag\npip install hipporag\n```\n\nSet the environment variables and activate the environment:\n\n```sh\nexport CUDA_VISIBLE_DEVICES=0,1,2,3\nexport 
HF_HOME=\u003Cpath to Huggingface home directory>\nexport OPENAI_API_KEY=\u003Cyour openai api key>   # only needed for OpenAI models\n\nconda activate hipporag\n```\n\n## Quick Start\n\n### OpenAI Models\n\nThis example shows how to use `hipporag` with any OpenAI model:\n\n```python\nfrom hipporag import HippoRAG\n\n# Prepare datasets and evaluation\ndocs = [\n    \"Oliver Badman is a politician.\",\n    \"George Rankin is a politician.\",\n    \"Thomas Marwick is a politician.\",\n    \"Cinderella attended the royal ball.\",\n    \"The prince used the lost glass slipper to search the kingdom.\",\n    \"When the slipper fit perfectly, Cinderella was reunited with the prince.\",\n    \"Erik Hort's birthplace is Montebello.\",\n    \"Marina was born in Minsk.\",\n    \"Montebello is a part of Rockland County.\"\n]\n\nsave_dir = 'outputs'  # Save directory for HippoRAG objects (each LLM\u002Fembedding model combination creates a new subdirectory)\nllm_model_name = 'gpt-4o-mini'  # Any OpenAI model name\nembedding_model_name = 'nvidia\u002FNV-Embed-v2'  # Embedding model name (NV-Embed, GritLM or Contriever for now)\n\n# Start a HippoRAG instance\nhipporag = HippoRAG(save_dir=save_dir,\n                    llm_model_name=llm_model_name,\n                    embedding_model_name=embedding_model_name)\n\n# Run indexing\nhipporag.index(docs=docs)\n\n# Separate retrieval and QA\nqueries = [\n    \"What is George Rankin's occupation?\",\n    \"How did Cinderella reach her happy ending?\",\n    \"What county is Erik Hort's birthplace a part of?\"\n]\n\nretrieval_results = hipporag.retrieve(queries=queries, num_to_retrieve=2)\nqa_results = hipporag.rag_qa(retrieval_results)\n\n# Combined retrieval and QA\nrag_results = hipporag.rag_qa(queries=queries)\n\n# For evaluation\nanswers = [\n    [\"Politician\"],\n    [\"By going to the ball.\"],\n    [\"Rockland County\"]\n]\n\ngold_docs = [\n    [\"George Rankin is a politician.\"],\n    [\"Cinderella attended 
the royal ball.\",\n    \"The prince used the lost glass slipper to search the kingdom.\",\n    \"When the slipper fit perfectly, Cinderella was reunited with the prince.\"],\n    [\"Erik Hort's birthplace is Montebello.\",\n    \"Montebello is a part of Rockland County.\"]\n]\n\nrag_results = hipporag.rag_qa(queries=queries,\n                              gold_docs=gold_docs,\n                              gold_answers=answers)\n```\n\n#### Example (OpenAI-Compatible Embeddings)\n\nTo use OpenAI-compatible LLM and embedding endpoints, pass the model names and base URLs when creating the `HippoRAG` instance:\n\n```python\nhipporag = HippoRAG(save_dir=save_dir,\n    llm_model_name='Your LLM Model name',\n    llm_base_url='Your LLM Model url',\n    embedding_model_name='Your Embedding model name',\n    embedding_base_url='Your Embedding model url')\n```\n\n### Local Deployment (vLLM)\n\nThis example shows how to use `hipporag` with any vLLM-compatible locally deployed LLM.\n\n1. Run a local [OpenAI-compatible vLLM server](https:\u002F\u002Fdocs.vllm.ai\u002Fen\u002Flatest\u002Fgetting_started\u002Fquickstart.html#quickstart-online) on the GPUs you specify (make sure you leave enough memory for your embedding model).\n\n```sh\nexport CUDA_VISIBLE_DEVICES=0,1\nexport VLLM_WORKER_MULTIPROC_METHOD=spawn\nexport HF_HOME=\u003Cpath to Huggingface home directory>\n\nconda activate hipporag  # vllm should be in this environment\n\n# Tune gpu-memory-utilization or max_model_len to fit your GPU memory if you run out of memory (OOM)\nvllm serve meta-llama\u002FLlama-3.3-70B-Instruct --tensor-parallel-size 2 --max_model_len 4096 --gpu-memory-utilization 0.95\n```\n\n2. 
Now you can use `hipporag` with code very similar to the example above:\n\n```python\nfrom hipporag import HippoRAG\n\nsave_dir = 'outputs'  # Save directory for HippoRAG objects (each LLM\u002Fembedding model combination creates a new subdirectory)\nllm_model_name = 'meta-llama\u002FLlama-3.3-70B-Instruct'  # Must match the model served by vLLM\nembedding_model_name = 'nvidia\u002FNV-Embed-v2'  # Embedding model name (NV-Embed, GritLM or Contriever for now)\nllm_base_url = 'http:\u002F\u002Flocalhost:8000\u002Fv1'  # Base URL of your deployed LLM\n\nhipporag = HippoRAG(save_dir=save_dir,\n                    llm_model_name=llm_model_name,\n                    embedding_model_name=embedding_model_name,\n                    llm_base_url=llm_base_url)\n\n# Same indexing, retrieval and QA as with the OpenAI models above\n```\n\n## Testing\n\nWhen contributing to HippoRAG, please run the scripts below to ensure that your changes do not cause unexpected behavior in our core modules.\n\nThese scripts test indexing, graph loading, document deletion and incremental updates to a HippoRAG object.\n\n### OpenAI Test\n\nTo test HippoRAG with an OpenAI LLM and embedding model, simply run the following.\nThe cost of this test will be negligible.\n\n```sh\nexport OPENAI_API_KEY=\u003Cyour openai api key>\n\nconda activate hipporag\n\npython tests_openai.py\n```\n\n### Local Test\n\nTo test locally, you must deploy a vLLM instance. 
We deploy the smaller 8B model `Llama-3.1-8B-Instruct` to keep testing cheap.\n\n```sh\nexport CUDA_VISIBLE_DEVICES=0\nexport VLLM_WORKER_MULTIPROC_METHOD=spawn\nexport HF_HOME=\u003Cpath to Huggingface home directory>\n\nconda activate hipporag  # vllm should be in this environment\n\n# Tune gpu-memory-utilization or max_model_len to fit your GPU memory if you run out of memory (OOM)\n# Only one GPU is visible here, so tensor parallelism is set to 1\nvllm serve meta-llama\u002FLlama-3.1-8B-Instruct --tensor-parallel-size 1 --max_model_len 4096 --gpu-memory-utilization 0.95 --port 6578\n```\n\nThen run the following test script:\n\n```sh\nCUDA_VISIBLE_DEVICES=1 python tests_local.py\n```\n\n## Reproducing our Experiments\n\nTo run experiments with our code, we recommend that you clone this repository and follow the structure of the `main.py` script.\n\n### Data for Reproducibility\n\nWe evaluated several sampled datasets in our paper, some of which are already included in the `reproduce\u002Fdataset` directory of this repo. For the complete set of datasets, please visit our [HuggingFace dataset](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fosunlp\u002FHippoRAG_v2) and place them under `reproduce\u002Fdataset`. 
We also provide the OpenIE results for both `gpt-4o-mini` and `Llama-3.3-70B-Instruct` for our `musique` sample under `outputs\u002Fmusique`.\n\nTo check that your environment is properly set up, you can use the small dataset `reproduce\u002Fdataset\u002Fsample.json` for debugging as shown below.\n\n### Running Indexing & QA\n\nSet the environment variables and activate the environment:\n\n```sh\nexport CUDA_VISIBLE_DEVICES=0,1,2,3\nexport HF_HOME=\u003Cpath to Huggingface home directory>\nexport OPENAI_API_KEY=\u003Cyour openai api key>   # only needed for OpenAI models\n\nconda activate hipporag\n```\n\n### Run with OpenAI Model\n\n```sh\ndataset=sample  # or any other dataset under `reproduce\u002Fdataset`\n\n# Run OpenAI model\npython main.py --dataset $dataset --llm_base_url https:\u002F\u002Fapi.openai.com\u002Fv1 --llm_name gpt-4o-mini --embedding_name nvidia\u002FNV-Embed-v2\n```\n\n### Run with vLLM (Llama)\n\n1. As above, run a local [OpenAI-compatible vLLM server](https:\u002F\u002Fdocs.vllm.ai\u002Fen\u002Flatest\u002Fgetting_started\u002Fquickstart.html#quickstart-online) on the GPUs you specify.\n\n```sh\nexport CUDA_VISIBLE_DEVICES=0,1\nexport VLLM_WORKER_MULTIPROC_METHOD=spawn\nexport HF_HOME=\u003Cpath to Huggingface home directory>\n\nconda activate hipporag  # vllm should be in this environment\n\n# Tune gpu-memory-utilization or max_model_len to fit your GPU memory if you run out of memory (OOM)\nvllm serve meta-llama\u002FLlama-3.3-70B-Instruct --tensor-parallel-size 2 --max_model_len 4096 --gpu-memory-utilization 0.95\n```\n\n2. 
In another terminal, run the main program on different GPUs.\n\n```sh\nexport CUDA_VISIBLE_DEVICES=2,3  # use GPUs not occupied by the running vLLM server\nexport HF_HOME=\u003Cpath to Huggingface home directory>\ndataset=sample\n\npython main.py --dataset $dataset --llm_base_url http:\u002F\u002Flocalhost:8000\u002Fv1 --llm_name meta-llama\u002FLlama-3.3-70B-Instruct --embedding_name nvidia\u002FNV-Embed-v2\n```\n\n#### Advanced: Run with vLLM offline batch\n\nvLLM offers an [offline batch mode](https:\u002F\u002Fdocs.vllm.ai\u002Fen\u002Flatest\u002Fgetting_started\u002Fquickstart.html#offline-batched-inference) for faster inference, which can make indexing more than 3x faster than with the vLLM online server.\n\n1. Use the following command to run the main program with vLLM offline batch mode.\n\n```sh\nexport CUDA_VISIBLE_DEVICES=0,1,2,3 # use all GPUs for faster offline indexing\nexport VLLM_WORKER_MULTIPROC_METHOD=spawn\nexport HF_HOME=\u003Cpath to Huggingface home directory>\nexport OPENAI_API_KEY=''\ndataset=sample\n\npython main.py --dataset $dataset --llm_name meta-llama\u002FLlama-3.3-70B-Instruct --openie_mode offline --skip_graph\n```\n\n2. After the first step, the OpenIE results are saved to a file. 
Then start the vLLM online server and run the main program again, as described in the `Run with vLLM (Llama)` section.\n\n## Debugging Note\n\n- `reproduce\u002Fdataset\u002Fsample.json` is a small dataset specifically for debugging.\n- When debugging vLLM offline mode, set `tensor_parallel_size` to `1` in `hipporag\u002Fllm\u002Fvllm_offline.py`.\n- If you want to rerun a particular experiment, remember to clear the saved files, including the OpenIE results and the knowledge graph, e.g.,\n\n```sh\nrm reproduce\u002Fdataset\u002Fopenie_results\u002Fopenie_sample_results_ner_meta-llama_Llama-3.3-70B-Instruct_3.json\nrm -rf outputs\u002Fsample\u002Fsample_meta-llama_Llama-3.3-70B-Instruct_nvidia_NV-Embed-v2\n```\n\n### Custom Datasets\n\nTo set up your own custom dataset for evaluation, follow the format and naming convention shown in `reproduce\u002Fdataset\u002Fsample_corpus.json` (your dataset's name should be followed by `_corpus.json`). If you are running an experiment with pre-defined questions, organize your query corpus according to the query file `reproduce\u002Fdataset\u002Fsample.json`, and be sure to follow our naming convention there as well.\n\nThe corpus and optional query JSON files should have the following format:\n\n#### Retrieval Corpus JSON\n\n```json\n[\n  {\n    \"title\": \"FIRST PASSAGE TITLE\",\n    \"text\": \"FIRST PASSAGE TEXT\",\n    \"idx\": 0\n  },\n  {\n    \"title\": \"SECOND PASSAGE TITLE\",\n    \"text\": \"SECOND PASSAGE TEXT\",\n    \"idx\": 1\n  }\n]\n```\n\n#### (Optional) Query JSON\n\n```json\n\n[\n  {\n    \"id\": \"sample\u002Fquestion_1.json\",\n    \"question\": \"QUESTION\",\n    \"answer\": [\n      \"ANSWER\"\n    ],\n    \"answerable\": true,\n    \"paragraphs\": [\n      {\n        \"title\": \"{FIRST SUPPORTING PASSAGE TITLE}\",\n        \"text\": \"{FIRST SUPPORTING PASSAGE TEXT}\",\n        \"is_supporting\": true,\n        \"idx\": 0\n      },\n      {\n        \"title\": \"{SECOND SUPPORTING PASSAGE TITLE}\",\n        \"text\": \"{SECOND 
SUPPORTING PASSAGE TEXT}\",\n        \"is_supporting\": true,\n        \"idx\": 1\n      }\n    ]\n  }\n]\n```\n\n#### (Optional) Chunking Corpus\n\nWhen preparing your data, you may need to chunk each passage, as longer passages may be too complex for the OpenIE process.\n\n## Code Structure\n\n```\n📦 .\n│-- 📂 src\u002Fhipporag\n│   ├── 📂 embedding_model          # Implementation of all embedding models\n│   │   ├── __init__.py             # Getter function for specific embedding model classes\n│   │   ├── base.py                 # Base embedding model class `BaseEmbeddingModel` to inherit from, and `EmbeddingConfig`\n│   │   ├── NVEmbedV2.py            # Implementation of NV-Embed-v2 model\n│   │   ├── ...\n│   ├── 📂 evaluation               # Implementation of all evaluation metrics\n│   │   ├── __init__.py\n│   │   ├── base.py                 # Base evaluation metric class `BaseMetric` to inherit from\n│   │   ├── qa_eval.py              # Eval metrics for QA\n│   │   ├── retrieval_eval.py       # Eval metrics for retrieval\n│   ├── 📂 information_extraction  # Implementation of all information extraction models\n│   │   ├── __init__.py\n│   │   ├── openie_openai_gpt.py    # Model for OpenIE with OpenAI GPT\n│   │   ├── openie_vllm_offline.py  # Model for OpenIE with LLMs deployed offline with vLLM\n│   ├── 📂 llm                      # Classes for inference with large language models\n│   │   ├── __init__.py             # Getter function\n│   │   ├── base.py                 # Config class for LLM inference and base LLM inference class to inherit from\n│   │   ├── openai_gpt.py           # Class for inference with OpenAI GPT\n│   │   ├── vllm_llama.py           # Class for inference using a local vLLM server\n│   │   ├── vllm_offline.py         # Class for inference using the vLLM API directly\n│   ├── 📂 prompts                  # Prompt templates and prompt template manager class\n│   │   ├── 📂 dspy_prompts         # Prompts for filtering\n│   │   │   ├── ...\n│   │   ├── 📂 
templates            # All prompt templates for template manager to load\n│   │   │   ├── README.md           # Documentation on using the prompt template manager and prompt template files\n│   │   │   ├── __init__.py\n│   │   │   ├── triple_extraction.py\n│   │   │   ├── ...\n│   │   ├── __init__.py\n│   │   ├── linking.py              # Instruction for linking\n│   │   ├── prompt_template_manager.py  # Implementation of prompt template manager\n│   ├── 📂 utils                    # All utility functions used across this repo (each file name indicates its usage)\n│   │   ├── config_utils.py         # We use only one config across all modules and its setup is specified here\n│   │   ├── ...\n│   ├── __init__.py\n│   ├── HippoRAG.py          # Highest level class for initiating retrieval, question answering, and evaluations\n│   ├── embedding_store.py   # Storage database to load, manage and save embeddings for passages, entities and facts.\n│   ├── rerank.py            # Reranking and filtering methods\n│-- 📂 examples\n│   ├── ...\n│   ├── ...\n│-- 📜 README.md\n│-- 📜 requirements.txt   # Dependencies list\n│-- 📜 .gitignore         # Files to exclude from Git\n```\n\n## Contact\n\nQuestions or issues? 
File an issue or contact\n[Bernal Jiménez Gutiérrez](mailto:jimenezgutierrez.1@osu.edu),\n[Yiheng Shu](mailto:shu.251@osu.edu), or\n[Yu Su](mailto:su.809@osu.edu)\nat The Ohio State University.\n\n## Citation\n\nIf you find this work useful, please consider citing our papers:\n\n### HippoRAG 2\n\n```\n@misc{gutiérrez2025ragmemorynonparametriccontinual,\n      title={From RAG to Memory: Non-Parametric Continual Learning for Large Language Models},\n      author={Bernal Jiménez Gutiérrez and Yiheng Shu and Weijian Qi and Sizhe Zhou and Yu Su},\n      year={2025},\n      eprint={2502.14802},\n      archivePrefix={arXiv},\n      primaryClass={cs.CL},\n      url={https:\u002F\u002Farxiv.org\u002Fabs\u002F2502.14802},\n}\n```\n\n### HippoRAG\n\n```\n@inproceedings{gutiérrez2024hipporag,\n      title={HippoRAG: Neurobiologically Inspired Long-Term Memory for Large Language Models},\n      author={Bernal Jiménez Gutiérrez and Yiheng Shu and Yu Gu and Michihiro Yasunaga and Yu Su},\n      booktitle={The Thirty-eighth Annual Conference on Neural Information Processing Systems},\n      year={2024},\n      url={https:\u002F\u002Fopenreview.net\u002Fforum?id=hkujvAPVsg}\n}\n```\n\n## TODO:\n\n- [x] Add support for more embedding models\n- [x] Add support for embedding endpoints\n- [ ] Add support for vector database integration\n\nPlease feel free to open an issue or PR if you have any questions or suggestions.\n","\u003Ch1 align=\"center\">HippoRAG 2：从 RAG 到记忆\u003C\u002Fh1>\n\u003Cp align=\"center\">\n    \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOSU-NLP-Group_HippoRAG_readme_5c05af34a2ce.png\" width=\"55%\" style=\"max-width: 300px;\">\n\u003C\u002Fp>\n\n[\u003Cimg align=\"center\" src=\"https:\u002F\u002Fcolab.research.google.com\u002Fassets\u002Fcolab-badge.svg\" \u002F>](https:\u002F\u002Fcolab.research.google.com\u002Fdrive\u002F1nuelysWsXL8F5xH6q4JYJI8mvtlmeM9O#scrollTo=TjHdNe2KC81K)\n\n[\u003Cimg align=\"center\" 
src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2502.14802 HippoRAG 2-b31b1b\" \u002F>](https:\u002F\u002Farxiv.org\u002Fabs\u002F2502.14802)\n[\u003Cimg align=\"center\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F🤗 Dataset-HippoRAG 2-yellow\" \u002F>](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fosunlp\u002FHippoRAG_2\u002Ftree\u002Fmain)\n[\u003Cimg align=\"center\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2405.14831 HippoRAG 1-b31b1b\" \u002F>](https:\u002F\u002Farxiv.org\u002Fabs\u002F2405.14831)\n[\u003Cimg align=\"center\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FGitHub-HippoRAG 1-blue\" \u002F>](https:\u002F\u002Fgithub.com\u002FOSU-NLP-Group\u002FHippoRAG\u002Ftree\u002Flegacy)\n\n### HippoRAG 2 是一个强大的 LLM 记忆框架，能够增强模型识别和利用新知识中关联的能力——这正是人类长期记忆的核心功能。\n\n我们的实验表明，即使在最先进的 RAG 系统中，HippoRAG 2 也能提升联想能力（多跳检索）和意义构建能力（整合庞大复杂上下文的过程），同时不会降低其在简单任务上的表现。\n\n与前代产品类似，HippoRAG 2 在在线处理过程中保持了较低的成本和延迟，而在离线索引方面，相比 GraphRAG、RAPTOR 和 LightRAG 等基于图的解决方案，所需资源则显著减少。\n\n\u003Cp align=\"center\">\n  \u003Cimg align=\"center\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOSU-NLP-Group_HippoRAG_readme_5b90a821219a.png\" \u002F>\n\u003C\u002Fp>\n\u003Cp align=\"center\">\n  \u003Cb>图 1：\u003C\u002Fb> 对持续学习能力的评估涵盖三个关键维度：事实记忆（NaturalQuestions、PopQA）、意义构建（NarrativeQA）以及联想能力（MuSiQue、2Wiki、HotpotQA 和 LV-Eval）。HippoRAG 2 在所有类别中均优于其他方法，使其距离真正的长期记忆更近一步。\n\u003C\u002Fp>\n\n\u003Cp align=\"center\">\n  \u003Cimg align=\"center\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOSU-NLP-Group_HippoRAG_readme_aea07d15223b.png\" \u002F>\n\u003C\u002Fp>\n\u003Cp align=\"center\">\n  \u003Cb>图 2：\u003C\u002Fb> HippoRAG 2 的方法论。\n\u003C\u002Fp>\n\n#### 更多信息请参阅我们的论文：\n\n* [**HippoRAG：受神经生物学启发的大语言模型长期记忆**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2405.14831) [NeurIPS '24]。\n* [**从 RAG 到记忆：大语言模型的非参数化持续学习**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2502.14802) [ICML '25]。\n\n----\n\n## 安装\n\n```sh\nconda create -n 
hipporag python=3.10\nconda activate hipporag\npip install hipporag\n```\n初始化环境变量并激活环境：\n\n```sh\nexport CUDA_VISIBLE_DEVICES=0,1,2,3\nexport HF_HOME=\u003CHuggingface 主目录路径>\nexport OPENAI_API_KEY=\u003C您的 OpenAI API 密钥>   # 如果您想使用 OpenAI 模型\n\nconda activate hipporag\n```\n\n## 快速入门\n\n### OpenAI 模型\n\n以下示例将展示如何使用任何 OpenAI 模型来运行 `hipporag`：\n\n```python\nfrom hipporag import HippoRAG\n\n# 准备数据集和评估\ndocs = [\n    \"奥利弗·巴德曼是一位政治家。\",\n    \"乔治·兰金是一位政治家。\",\n    \"托马斯·马维克是一位政治家。\",\n    \"灰姑娘参加了王室舞会。\",\n    \"王子用丢失的玻璃鞋在整个王国寻找。\",\n    \"当鞋子完美契合时，灰姑娘与王子重逢了。\",\n    \"埃里克·霍特的出生地是蒙特贝洛。\",\n    \"玛丽娜出生于明斯克。\",\n    \"蒙特贝洛属于洛克兰县。\"\n]\n\nsave_dir = 'outputs' # 定义 HippoRAG 对象的保存目录（每种 LLM\u002F嵌入模型组合都会创建一个新的子目录）\nllm_model_name = 'gpt-4o-mini' # 任何 OpenAI 模型名称\nembedding_model_name = 'nvidia\u002FNV-Embed-v2' # 嵌入模型名称（目前支持 NV-Embed、GritLM 或 Contriever）\n\n# 启动一个 HippoRAG 实例\nhipporag = HippoRAG(save_dir=save_dir, \n                    llm_model_name=llm_model_name,\n                    embedding_model_name=embedding_model_name) \n\n# 运行索引\nhipporag.index(docs=docs)\n\n# 分离式检索与问答\nqueries = [\n    \"乔治·兰金的职业是什么？\",\n    \"灰姑娘是如何迎来幸福结局的？\",\n    \"埃里克·霍特的出生地属于哪个县？\"\n]\n\nretrieval_results = hipporag.retrieve(queries=queries, num_to_retrieve=2)\nqa_results = hipporag.rag_qa(retrieval_results)\n\n# 检索与问答一体化\nrag_results = hipporag.rag_qa(queries=queries)\n\n# 用于评估\nanswers = [\n    [\"政治家\"],\n    [\"通过参加舞会。\"],\n    [\"洛克兰县\"]\n]\n\ngold_docs = [\n    [\"乔治·兰金是一位政治家。\"],\n    [\"灰姑娘参加了王室舞会。\",\n    \"王子用丢失的玻璃鞋在整个王国寻找。\",\n    \"当鞋子完美契合时，灰姑娘与王子重逢了。\"],\n    [\"埃里克·霍特的出生地是蒙特贝洛。\",\n    \"蒙特贝洛属于洛克兰县。\"]\n]\n\nrag_results = hipporag.rag_qa(queries=queries, \n                              gold_docs=gold_docs,\n                              gold_answers=answers)\n```\n\n#### 示例（兼容 OpenAI 的嵌入模型）\n\n如果您希望使用与 OpenAI 兼容的 LLM 和嵌入模型，请采用以下方法。\u003C\u002Fp>\n    \n```python\nhipporag = HippoRAG(save_dir=save_dir, \n    llm_model_name='您的 LLM 模型名称',\n    llm_base_url='您的 LLM 模型 URL',\n    
embedding_model_name='您的嵌入模型名称',  \n    embedding_base_url='您的嵌入模型 URL')\n```\n\n### 本地部署（vLLM）\n\n以下示例将展示如何使用任何兼容 vLLM 的本地部署 LLM 来运行 `hipporag`。\n\n1. 使用指定的 GPU 运行一个本地的 [兼容 OpenAI 的 vLLM 服务器](https:\u002F\u002Fdocs.vllm.ai\u002Fen\u002Flatest\u002Fgetting_started\u002Fquickstart.html#quickstart-online)，并确保为您的嵌入模型留出足够的内存。\n\n```sh\nexport CUDA_VISIBLE_DEVICES=0,1\nexport VLLM_WORKER_MULTIPROC_METHOD=spawn\nexport HF_HOME=\u003CHuggingface 主目录路径>\n\nconda activate hipporag  # vllm 应该在此环境中\n\n# 如果出现 OOM 错误，可调整 gpu-memory-utilization 或 max_model_len 以适应您的 GPU 内存\nvllm serve meta-llama\u002FLlama-3.3-70B-Instruct --tensor-parallel-size 2 --max_model_len 4096 --gpu-memory-utilization 0.95 \n```\n\n2. 现在您可以使用与上述代码非常相似的方式来运行 `hipporag`：\n\n```python\nsave_dir = 'outputs' # 定义 HippoRAG 对象的保存目录（每种 LLM\u002F嵌入模型组合都会创建一个新的子目录）\nllm_model_name = # 任何 OpenAI 模型名称\nembedding_model_name = # 嵌入模型名称（目前支持 NV-Embed、GritLM 或 Contriever）\nllm_base_url= # 您部署的 LLM 的基础 URL（例如 http:\u002F\u002Flocalhost:8000\u002Fv1）\n\nhipporag = HippoRAG(save_dir=save_dir,\n                    llm_model_name=llm_model,\n                    embedding_model_name=embedding_model_name,\n                    llm_base_url=llm_base_url)\n\n# 索引、检索和问答流程与运行 OpenAI 模型时相同\n```\n\n## 测试\n\n在向 HippoRAG 贡献代码时，请运行以下脚本，以确保您的更改不会导致我们核心模块出现意外行为。\n\n这些脚本会测试索引构建、图加载、文档删除以及对 HippoRAG 对象的增量更新功能。\n\n### OpenAI 测试\n\n要使用 OpenAI 的 LLM 和嵌入模型测试 HippoRAG，只需运行以下命令。此测试的成本可以忽略不计。\n\n```sh\nexport OPENAI_API_KEY=\u003C您的 OpenAI API 密钥> \n\nconda activate hipporag\n\npython tests_openai.py\n```\n\n### 本地测试\n\n要在本地进行测试，您需要部署一个 vLLM 实例。为了降低测试成本，我们选择部署较小的 8B 模型 `Llama-3.1-8B-Instruct`。\n\n```sh\nexport CUDA_VISIBLE_DEVICES=0\nexport VLLM_WORKER_MULTIPROC_METHOD=spawn\nexport HF_HOME=\u003CHuggingFace 家目录路径>\n\nconda activate hipporag  # vllm 应该安装在这个环境中\n\n# 如果出现 OOM 错误，请调整 gpu-memory-utilization 或 max_model_len 以适应您的 GPU 内存\nvllm serve meta-llama\u002FLlama-3.1-8B-Instruct --tensor-parallel-size 2 --max_model_len 4096 
--gpu-memory-utilization 0.95 --port 6578\n```\n\n然后，我们运行以下测试脚本：\n\n```sh\nCUDA_VISIBLE=1 python tests_local.py\n```\n\n## 复现我们的实验\n\n要使用我们的代码运行实验，我们建议您克隆此仓库，并按照 `main.py` 脚本的结构进行操作。\n\n### 用于复现的数据\n\n我们在论文中评估了多个采样数据集，其中一些已经包含在本仓库的 `reproduce\u002Fdataset` 目录中。完整的数据集请访问我们的 [HuggingFace 数据集](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fosunlp\u002FHippoRAG_v2)，并将它们放置在 `reproduce\u002Fdataset` 目录下。我们还提供了针对 `musique` 样本的 OpenIE 结果，分别使用 `gpt-4o-mini` 和 `Llama-3.3-70B-Instruct` 模型，存储在 `outputs\u002Fmusique` 目录中。\n\n为了测试您的环境是否正确配置，您可以使用小数据集 `reproduce\u002Fdataset\u002Fsample.json` 进行调试，如下所示。\n\n### 运行索引构建与问答\n\n首先初始化环境变量并激活环境：\n\n```sh\nexport CUDA_VISIBLE_DEVICES=0,1,2,3\nexport HF_HOME=\u003CHuggingFace 家目录路径>\nexport OPENAI_API_KEY=\u003C您的 OpenAI API 密钥>   # 如果您想使用 OpenAI 模型\n\nconda activate hipporag\n```\n\n### 使用 OpenAI 模型运行\n\n```sh\ndataset=sample  # 或者 `reproduce\u002Fdataset` 下的其他数据集\n\n# 运行 OpenAI 模型\npython main.py --dataset $dataset --llm_base_url https:\u002F\u002Fapi.openai.com\u002Fv1 --llm_name gpt-4o-mini --embedding_name nvidia\u002FNV-Embed-v2\n```\n\n### 使用 vLLM (Llama) 运行\n\n1. 如上所述，启动一个本地的 [兼容 OpenAI 的 vLLM 服务器](https:\u002F\u002Fdocs.vllm.ai\u002Fen\u002Flatest\u002Fgetting_started\u002Fquickstart.html#quickstart-online)，并指定 GPU。\n\n```sh\nexport CUDA_VISIBLE_DEVICES=0,1\nexport VLLM_WORKER_MULTIPROC_METHOD=spawn\nexport HF_HOME=\u003CHuggingFace 家目录路径>\n\nconda activate hipporag  # vllm 应该安装在这个环境中\n\n# 如果出现 OOM 错误，请调整 gpu-memory-utilization 或 max_model_len 以适应您的 GPU 内存\nvllm serve meta-llama\u002FLlama-3.3-70B-Instruct --tensor-parallel-size 2 --max_model_len 4096 --gpu-memory-utilization 0.95 \n```\n\n2. 
在另一个终端中，使用其他 GPU 运行主程序。\n\n```sh\nexport CUDA_VISIBLE_DEVICES=2,3  # 设置其他 GPU，同时 vLLM 服务器正在运行\nexport HF_HOME=\u003CHuggingFace 家目录路径>\ndataset=sample\n\npython main.py --dataset $dataset --llm_base_url http:\u002F\u002Flocalhost:8000\u002Fv1 --llm_name meta-llama\u002FLlama-3.3-70B-Instruct --embedding_name nvidia\u002FNV-Embed-v2\n```\n\n#### 高级：使用 vLLM 离线批处理模式\n\nvLLM 提供了 [离线批处理模式](https:\u002F\u002Fdocs.vllm.ai\u002Fen\u002Flatest\u002Fgetting_started\u002Fquickstart.html#offline-batched-inference)，可以实现更快的推理速度，相比在线 vLLM 服务器，索引构建速度可提升 3 倍以上。\n\n1. 使用以下命令以 vLLM 离线批处理模式运行主程序。\n\n```sh\nexport CUDA_VISIBLE_DEVICES=0,1,2,3 # 使用所有 GPU 加速离线索引\nexport VLLM_WORKER_MULTIPROC_METHOD=spawn\nexport HF_HOME=\u003CHuggingFace 家目录路径>\nexport OPENAI_API_KEY=''\ndataset=sample\n\npython main.py --dataset $dataset --llm_name meta-llama\u002FLlama-3.3-70B-Instruct --openie_mode offline --skip_graph\n```\n\n2. 第一步完成后，OpenIE 结果将保存到文件中。随后按照“使用 vLLM (Llama) 运行”部分的说明，重新启动 vLLM 在线服务器和主程序。\n\n## 调试注意事项\n\n- `\u002Freproduce\u002Fdataset\u002Fsample.json` 是专门用于调试的小数据集。\n- 在调试 vLLM 离线模式时，请将 `hipporag\u002Fllm\u002Fvllm_offline.py` 中的 `tensor_parallel_size` 设置为 `1`。\n- 如果您想重新运行某个实验，请务必清除已保存的文件，包括 OpenIE 结果和知识图谱，例如：\n\n```sh\nrm reproduce\u002Fdataset\u002Fopenie_results\u002Fopenie_sample_results_ner_meta-llama_Llama-3.3-70B-Instruct_3.json\nrm -rf outputs\u002Fsample\u002Fsample_meta-llama_Llama-3.3-70B-Instruct_nvidia_NV-Embed-v2\n```\n\n### 自定义数据集\n\n要设置您自己的自定义数据集进行评估，请遵循 `reproduce\u002Fdataset\u002Fsample_corpus.json` 中显示的格式和命名规范（您的数据集名称后应加上 `_corpus.json`）。如果要运行带有预定义问题的实验，应根据查询文件 `reproduce\u002Fdataset\u002Fsample.json` 组织您的查询语料库，并确保也遵循我们的命名规范。\n\n语料库和可选的查询 JSON 文件应具有以下格式：\n\n#### 检索语料库 JSON\n\n```json\n[\n  {\n    \"title\": \"第一条文档标题\",\n    \"text\": \"第一条文档内容\",\n    \"idx\": 0\n  },\n  {\n    \"title\": \"第二条文档标题\",\n    \"text\": \"第二条文档内容\",\n    \"idx\": 1\n  }\n]\n```\n\n#### （可选）查询 JSON\n\n```json\n\n[\n  {\n    \"id\": \"sample\u002Fquestion_1.json\",\n    \"question\": 
\"问题内容\",\n    \"answer\": [\n      \"答案内容\"\n    ],\n    \"answerable\": true,\n    \"paragraphs\": [\n      {\n        \"title\": \"{第一支持文档标题}\",\n        \"text\": \"{第一支持文档内容}\",\n        \"is_supporting\": true,\n        \"idx\": 0\n      },\n      {\n        \"title\": \"{第二支持文档标题}\",\n        \"text\": \"{第二支持文档内容}\",\n        \"is_supporting\": true,\n        \"idx\": 1\n      }\n    ]\n  }\n]\n```\n\n#### （可选）语料分块\n\n在准备数据时，您可能需要对每条文档进行分块，因为较长的文档可能会使 OpenIE 处理变得过于复杂。\n\n## 代码结构\n\n```\n📦 .\n│-- 📂 src\u002Fhipporag\n│   ├── 📂 embedding_model          # 所有嵌入模型的实现\n│   │   ├── __init__.py             # 获取特定嵌入模型类的工厂函数\n|   |   ├── base.py                 # 嵌入模型基类 `BaseEmbeddingModel` 及其配置类 `EmbeddingConfig`\n|   |   ├── NVEmbedV2.py            # NV-Embed-v2 模型的实现\n|   |   ├── ...\n│   ├── 📂 evaluation               # 所有评估指标的实现\n│   │   ├── __init__.py\n|   |   ├── base.py                 # 评估指标基类 `BaseMetric`\n│   │   ├── qa_eval.py              # 问答任务的评估指标\n│   │   ├── retrieval_eval.py       # 检索任务的评估指标\n│   ├── 📂 information_extraction  # 所有信息抽取模型的实现\n│   │   ├── __init__.py\n|   |   ├── openie_openai_gpt.py    # 使用 OpenAI GPT 的 OpenIE 模型\n|   |   ├── openie_vllm_offline.py  # 使用本地部署的 vLLM 模型进行 OpenIE\n│   ├── 📂 llm                      # 大语言模型推理相关的类\n│   │   ├── __init__.py             # 工厂函数\n|   |   ├── base.py                 # 大语言模型推理的配置类及基类\n|   |   ├── openai_gpt.py           # 使用 OpenAI GPT 进行推理的类\n|   |   ├── vllm_llama.py           # 使用本地 vLLM 服务器进行推理的类\n|   |   ├── vllm_offline.py         # 直接使用 vLLM API 进行推理的类\n│   ├── 📂 prompts                  # 提示模板及提示模板管理器类\n|   │   ├── 📂 dspy_prompts         # 过滤相关的提示\n|   │   │   ├── ...\n|   │   ├── 📂 templates            # 提示模板管理器可加载的所有提示模板\n|   │   │   ├── README.md           # 提示模板管理器及提示模板文件的使用说明\n|   │   │   ├── __init__.py\n|   │   │   ├── triple_extraction.py\n|   │   │   ├── ...\n│   │   ├── __init__.py\n|   |   ├── linking.py              # 链接相关的指令\n|   |   ├── prompt_template_manager.py  # 
提示模板管理器的实现\n│   ├── 📂 utils                    # 本仓库中通用的工具函数（文件名表明其用途）\n│   │   ├── config_utils.py         # 所有模块共用一个配置，此处定义其设置\n|   |   ├── ...\n│   ├── __init__.py\n│   ├── HippoRAG.py          # 最高层次的类，用于启动检索、问答和评估流程\n│   ├── embedding_store.py   # 存储库，用于加载、管理和保存段落、实体和事实的嵌入向量。\n│   ├── rerank.py            # 重排序和过滤方法\n│-- 📂 examples\n│   ├── ...\n│   ├── ...\n│-- 📜 README.md\n│-- 📜 requirements.txt   # 依赖列表\n│-- 📜 .gitignore         # 需要从 Git 中排除的文件\n\n\n```\n\n## 联系方式\n\n如有任何问题或疑问，请提交 issue 或联系以下人员：\n\n[Bernal Jiménez Gutiérrez](mailto:jimenezgutierrez.1@osu.edu)、  \n[Yiheng Shu](mailto:shu.251@osu.edu)、  \n[Yu Su](mailto:su.809@osu.edu)，  \n俄亥俄州立大学\n\n## 引用\n\n如果您觉得这项工作对您有所帮助，请考虑引用我们的论文：\n\n### HippoRAG 2\n```\n@misc{gutiérrez2025ragmemorynonparametriccontinual,\n      title={From RAG to Memory: Non-Parametric Continual Learning for Large Language Models}, \n      author={Bernal Jiménez Gutiérrez and Yiheng Shu and Weijian Qi and Sizhe Zhou and Yu Su},\n      year={2025},\n      eprint={2502.14802},\n      archivePrefix={arXiv},\n      primaryClass={cs.CL},\n      url={https:\u002F\u002Farxiv.org\u002Fabs\u002F2502.14802}, \n}\n```\n\n### HippoRAG\n\n```\n@inproceedings{gutiérrez2024hipporag,\n      title={HippoRAG: Neurobiologically Inspired Long-Term Memory for Large Language Models}, \n      author={Bernal Jiménez Gutiérrez and Yiheng Shu and Yu Gu and Michihiro Yasunaga and Yu Su},\n      booktitle={The Thirty-eighth Annual Conference on Neural Information Processing Systems},\n      year={2024},\n      url={https:\u002F\u002Fopenreview.net\u002Fforum?id=hkujvAPVsg}\n}\n```\n\n## 待办事项：\n\n- [x] 增加对更多嵌入模型的支持\n- [x] 增加对嵌入端点的支持\n- [ ] 增加对向量数据库集成的支持\n\n如果您有任何问题或建议，欢迎随时提交 issue 或 pull request。","# HippoRAG 2 快速上手指南\n\nHippoRAG 2 是一个强大的大语言模型（LLM）记忆框架，旨在增强模型识别和利用新知识之间关联的能力。相比传统的 RAG 系统，它在多跳检索（associativity）和复杂上下文整合（sense-making）方面表现更佳，同时保持了高效的索引速度和较低的资源消耗。\n\n## 环境准备\n\n在开始之前，请确保您的开发环境满足以下要求：\n\n*   **操作系统**: Linux (推荐) 或 macOS\n*   **Python 版本**: 3.10\n*   **GPU**: 推荐使用 NVIDIA GPU（用于加速嵌入模型和本地 LLM 部署），需安装 CUDA 驱动。\n*   **API Key**: 如果使用 OpenAI 模型，需准备 `OPENAI_API_KEY`。\n*   **Hugging Face**: 建议配置 `HF_HOME` 环境变量以缓存模型文件。\n\n> **国内开发者提示**：\n> *   下载 Hugging Face 
模型时，如遇网络问题，可设置镜像加速：\n>     ```bash\n>     export HF_ENDPOINT=https:\u002F\u002Fhf-mirror.com\n>     ```\n> *   安装 Python 包时，可使用清华或阿里镜像源加速 `pip` 安装。\n\n## 安装步骤\n\n### 1. 创建并激活 Conda 环境\n\n```sh\nconda create -n hipporag python=3.10\nconda activate hipporag\n```\n\n### 2. 安装 HippoRAG\n\n```sh\npip install hipporag\n```\n*(注：如需加速，可添加 `-i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple`)*\n\n### 3. 配置环境变量\n\n根据您的使用场景（OpenAI API 或本地部署），设置以下变量：\n\n```sh\n# 指定可用的 GPU 设备\nexport CUDA_VISIBLE_DEVICES=0,1,2,3\n\n# 设置 Hugging Face 缓存目录 (可选)\nexport HF_HOME=\u003Cpath to Huggingface home directory>\n\n# 如果使用 OpenAI 模型，填入您的 API Key\nexport OPENAI_API_KEY=\u003Cyour openai api key>\n\n# 激活 conda 环境（如尚未激活）\nconda activate hipporag\n```\n\n## 基本使用\n\n以下示例展示如何使用 HippoRAG 进行文档索引、检索和问答。\n\n### 场景一：使用 OpenAI 模型\n\n这是最简单的上手方式，适用于快速验证功能。\n\n```python\nfrom hipporag import HippoRAG\n\n# 1. 准备数据\ndocs = [\n    \"Oliver Badman is a politician.\",\n    \"George Rankin is a politician.\",\n    \"Thomas Marwick is a politician.\",\n    \"Cinderella attended the royal ball.\",\n    \"The prince used the lost glass slipper to search the kingdom.\",\n    \"When the slipper fit perfectly, Cinderella was reunited with the prince.\",\n    \"Erik Hort's birthplace is Montebello.\",\n    \"Marina is born in Minsk.\",\n    \"Montebello is a part of Rockland County.\"\n]\n\nsave_dir = 'outputs' # 保存目录\nllm_model_name = 'gpt-4o-mini' # LLM 模型名称\nembedding_model_name = 'nvidia\u002FNV-Embed-v2' # 嵌入模型名称\n\n# 2. 初始化 HippoRAG 实例\nhipporag = HippoRAG(save_dir=save_dir, \n                    llm_model_name=llm_model_name,\n                    embedding_model_name=embedding_model_name) \n\n# 3. 执行索引 (Indexing)\nhipporag.index(docs=docs)\n\n# 4. 
执行检索与问答 (Retrieval & QA)\nqueries = [\n    \"What is George Rankin's occupation?\",\n    \"How did Cinderella reach her happy ending?\",\n    \"What county is Erik Hort's birthplace a part of?\"\n]\n\n# 方式 A: 分步执行 (先检索，后问答)\nretrieval_results = hipporag.retrieve(queries=queries, num_to_retrieve=2)\nqa_results = hipporag.rag_qa(retrieval_results)\n\n# 方式 B: 一步执行 (直接获取最终答案)\nrag_results = hipporag.rag_qa(queries=queries)\n\nprint(rag_results)\n```\n\n### 场景二：使用兼容 OpenAI 格式的本地\u002F第三方模型\n\n如果您使用的是本地部署的 vLLM 或其他兼容 OpenAI API 的服务，请按以下方式初始化：\n\n```python\nhipporag = HippoRAG(save_dir=save_dir, \n    llm_model_name='Your LLM Model name',\n    llm_base_url='Your LLM Model url',       # 例如：http:\u002F\u002Flocalhost:8000\u002Fv1\n    embedding_model_name='Your Embedding model name',  \n    embedding_base_url='Your Embedding model url') # 如果嵌入模型也是 API 服务\n```\n\n### 场景三：本地部署 (vLLM + Llama)\n\n对于需要数据隐私或离线运行的场景，可以使用 vLLM 部署本地模型。\n\n**第一步：启动 vLLM 服务**\n在终端中运行以下命令启动本地 LLM 服务器（确保显存充足）：\n\n```sh\nexport CUDA_VISIBLE_DEVICES=0,1\nexport VLLM_WORKER_MULTIPROC_METHOD=spawn\nexport HF_HOME=\u003Cpath to Huggingface home directory>\n\nconda activate hipporag\n\n# 启动服务，根据显存调整 --gpu-memory-utilization\nvllm serve meta-llama\u002FLlama-3.3-70B-Instruct --tensor-parallel-size 2 --max_model_len 4096 --gpu-memory-utilization 0.95 \n```\n\n**第二步：运行 Python 代码**\n在另一个终端或脚本中，指向本地服务地址：\n\n```python\nsave_dir = 'outputs'\nllm_model_name = 'meta-llama\u002FLlama-3.3-70B-Instruct'\nembedding_model_name = 'nvidia\u002FNV-Embed-v2'\nllm_base_url = 'http:\u002F\u002Flocalhost:8000\u002Fv1' # 本地 vLLM 地址\n\nhipporag = HippoRAG(save_dir=save_dir,\n                    llm_model_name=llm_model_name,\n                    embedding_model_name=embedding_model_name,\n                    llm_base_url=llm_base_url)\n\n# 后续操作与 OpenAI 示例相同\nhipporag.index(docs=docs)\nrag_results = hipporag.rag_qa(queries=queries)\n```","某大型法律科技公司的研发团队正在构建一个智能案情分析助手，需要让 AI 从成千上万份分散的法律文书、判例和新闻报告中提取线索，回答如“被告 A 公司与证人 B 
所在的蒙哥马利县有何关联”这类复杂的多跳推理问题。\n\n### 没有 HippoRAG 时\n- **知识碎片化严重**：传统 RAG 只能检索到包含关键词的单一文档，无法自动串联起分散在不同文件中的实体关系（如人物、地点、事件），导致回答断章取义。\n- **多跳推理能力弱**：面对需要跨越多个文档才能推导出的结论（A 认识 B，B 住在 C 地，因此 A 与 C 地有关），系统往往直接返回“未找到相关信息”。\n- **上下文理解混乱**：当强行拼接大量检索片段作为上下文时，模型容易陷入信息过载，产生幻觉或逻辑矛盾，难以形成连贯的案情叙事。\n- **索引成本高昂**：若尝试使用其他基于知识图谱的方案（如 GraphRAG），离线构建索引的过程极其消耗算力和时间，难以适应频繁更新的法律数据库。\n\n### 使用 HippoRAG 后\n- **模拟人类长期记忆**：HippoRAG 利用神经生物学灵感构建动态知识网络，能像人脑一样自动识别并存储文档间隐含的实体关联，将碎片信息织成完整的知识网。\n- **精准多跳检索**：结合个性化 PageRank 算法，系统能顺藤摸瓜，从“被告”推导至“关联公司”再定位到“涉案地点”，准确回答复杂的跨文档推理问题。\n- **增强叙事整合力**：在处理长篇复杂的案情背景时，HippoRAG 显著提升了模型的“意义构建”能力，使其能输出逻辑严密、因果清晰的综合分析报告。\n- **高效低耗部署**：相比同类图技术，HippoRAG 在离线索引阶段大幅降低了资源消耗，同时保持在线查询的低延迟，让大规模法律库的实时更新成为可能。\n\nHippoRAG 通过将静态检索升级为具备关联推理能力的动态记忆系统，真正解决了大模型在处理复杂、分散知识时的“断链”难题。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOSU-NLP-Group_HippoRAG_5c05af34.png","OSU-NLP-Group","OSU Natural Language Processing","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002FOSU-NLP-Group_eb35ecd2.jpg","",null,"osunlp","https:\u002F\u002Fgithub.com\u002FOSU-NLP-Group",[80],{"name":81,"color":82,"percentage":83},"Python","#3572A5",100,3350,335,"2026-04-12T09:31:21","MIT","Linux","本地部署必需 NVIDIA GPU。示例中使用了多卡配置（如 --tensor-parallel-size 2），运行大模型（如 Llama-3.3-70B）需高显存，建议通过调整 --gpu-memory-utilization 适配显存；若使用 OpenAI API 则无需本地 GPU。","未说明（取决于加载的模型大小，运行 70B 模型需大量系统内存）",{"notes":92,"python":93,"dependencies":94},"1. 推荐使用 conda 创建名为 'hipporag' 的虚拟环境。\n2. 支持两种模式：调用 OpenAI API（无需本地显卡）或本地部署 vLLM 服务（需 NVIDIA 显卡）。\n3. 本地部署时需设置环境变量 CUDA_VISIBLE_DEVICES 指定显卡，并可能需要调整 vLLM 的显存利用率参数以防溢出。\n4. 嵌入模型支持 NV-Embed, GritLM 或 Contriever。\n5. 
离线批处理模式可显著提升索引速度（3 倍以上）。","3.10",[95,96,97,98],"hipporag","vllm","transformers (隐含)","CUDA Toolkit (隐含)",[35,14],"2026-03-27T02:49:30.150509","2026-04-13T22:45:50.326421",[103,108,113,117,122,127],{"id":104,"question_zh":105,"answer_zh":106,"source_url":107},31966,"运行测试时遇到 Hugging Face 下载模型报错（404 Not Found）怎么办？","这通常是因为参数配置错误或模型路径不存在。维护者已提交 PR 更新了 QA 的使用方法，请等待合并或查看最新文档。如果是新遇到的具体问题，建议开启新的 Issue 讨论，而不是在旧帖中回复。","https:\u002F\u002Fgithub.com\u002FOSU-NLP-Group\u002FHippoRAG\u002Fissues\u002F6",{"id":109,"question_zh":110,"answer_zh":111,"source_url":112},31967,"运行索引函数时程序卡住或出现 Faiss\u002FCUDA 相关错误如何解决？","这通常是环境配置问题而非数据大小问题。请首先检查您的 conda 环境是否完全符合项目 `requirements.txt` 中的依赖版本要求。此外，部分错误可能源于 ColBERT 的输出而非 HippoRAG 本身，若是新出现的不同错误，请单独新建 Issue 反馈。","https:\u002F\u002Fgithub.com\u002FOSU-NLP-Group\u002FHippoRAG\u002Fissues\u002F15",{"id":114,"question_zh":115,"answer_zh":116,"source_url":112},31968,"使用免费版的 TogetherAI API 导致文件生成失败或超时怎么办？","免费版 API 通常有严格的超时限制，可能导致代码无法生成必要的中间文件（如 `..._facts_and_sim_graph...`）。解决方案是切换到付费版的 OpenAI API 或其他无严格超时限制的提供商，问题即可解决。",{"id":118,"question_zh":119,"answer_zh":120,"source_url":121},31969,"HippoRAG 支持替换 ColBERTv2 使用其他嵌入模型（如 Contriever 或 BGE）吗？","支持。虽然默认实现使用了 ColBERTv2，但用户可以自行选择嵌入模型。作者在研究时也考虑过当时发布的 BGE 等模型，并正在探索这方面的扩展。如果您觉得 ColBERTv2 与 Faiss 耦合过紧或增量更新困难，可以尝试集成其他稠密嵌入模型。","https:\u002F\u002Fgithub.com\u002FOSU-NLP-Group\u002FHippoRAG\u002Fissues\u002F48",{"id":123,"question_zh":124,"answer_zh":125,"source_url":126},31970,"使用 vLLM 运行 Qwen2.5 系列模型时出现“除以零”（division by zero）错误怎么办？","这是一个与 Python 环境及模型依赖相关的问题。测试表明 Qwen2.5-7b-instruct 和 32b-instruct 均可能出现此错误，而较新的 qwen3-8b 可能正常。请仔细查阅 Qwen 官方文档中关于其依赖项的说明，并确保本地环境与模型版本兼容。","https:\u002F\u002Fgithub.com\u002FOSU-NLP-Group\u002FHippoRAG\u002Fissues\u002F93",{"id":128,"question_zh":129,"answer_zh":130,"source_url":131},31971,"使用 Ollama (如 llama3 或 qwen2) 运行时出现 'expected string or bytes-like object' 或 JSON 解析错误如何修复？","这是因为 Ollama 的返回格式无法被默认的 `extract_json_dict` 直接解析。需要修改源码：\n1. 调整解析逻辑以适配 Ollama 的非 JSON 模式返回格式。\n2. 
由于返回值中缺少 `['token_usage']['total_tokens']` 属性，需使用 Huggingface 上的对应模型 tokenizer 对输入输出进行解析以估算 token 数（虽不完全准确但可用）。\n需修改的文件包括 `src\u002Fopenie_with_retrieval_option_parallel.py` 和 `src\u002Fnamed_entity_extraction_parallel.py` 中的三处代码。","https:\u002F\u002Fgithub.com\u002FOSU-NLP-Group\u002FHippoRAG\u002Fissues\u002F42",[133],{"id":134,"version":135,"summary_zh":136,"released_at":137},239226,"v1.0.0","这个历史版本包含了HippoRAG的第一个版本。","2025-02-27T17:29:48"]