[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-superlinear-ai--raglite":3,"tool-superlinear-ai--raglite":61},[4,18,26,36,44,53],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":17},4358,"openclaw","openclaw\u002Fopenclaw","OpenClaw 是一款专为个人打造的本地化 AI 助手，旨在让你在自己的设备上拥有完全可控的智能伙伴。它打破了传统 AI 助手局限于特定网页或应用的束缚，能够直接接入你日常使用的各类通讯渠道，包括微信、WhatsApp、Telegram、Discord、iMessage 等数十种平台。无论你在哪个聊天软件中发送消息，OpenClaw 都能即时响应，甚至支持在 macOS、iOS 和 Android 设备上进行语音交互，并提供实时的画布渲染功能供你操控。\n\n这款工具主要解决了用户对数据隐私、响应速度以及“始终在线”体验的需求。通过将 AI 部署在本地，用户无需依赖云端服务即可享受快速、私密的智能辅助，真正实现了“你的数据，你做主”。其独特的技术亮点在于强大的网关架构，将控制平面与核心助手分离，确保跨平台通信的流畅性与扩展性。\n\nOpenClaw 非常适合希望构建个性化工作流的技术爱好者、开发者，以及注重隐私保护且不愿被单一生态绑定的普通用户。只要具备基础的终端操作能力（支持 macOS、Linux 及 Windows WSL2），即可通过简单的命令行引导完成部署。如果你渴望拥有一个懂你",349277,3,"2026-04-06T06:32:30",[13,14,15,16],"Agent","开发框架","图像","数据工具","ready",{"id":19,"name":20,"github_repo":21,"description_zh":22,"stars":23,"difficulty_score":10,"last_commit_at":24,"category_tags":25,"status":17},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,"2026-04-05T11:01:52",[14,15,13],{"id":27,"name":28,"github_repo":29,"description_zh":30,"stars":31,"difficulty_score":32,"last_commit_at":33,"category_tags":34,"status":17},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 
代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",143909,2,"2026-04-07T11:33:18",[14,13,35],"语言模型",{"id":37,"name":38,"github_repo":39,"description_zh":40,"stars":41,"difficulty_score":32,"last_commit_at":42,"category_tags":43,"status":17},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",107888,"2026-04-06T11:32:50",[14,15,13],{"id":45,"name":46,"github_repo":47,"description_zh":48,"stars":49,"difficulty_score":32,"last_commit_at":50,"category_tags":51,"status":17},4721,"markitdown","microsoft\u002Fmarkitdown","MarkItDown 是一款由微软 AutoGen 团队打造的轻量级 Python 工具，专为将各类文件高效转换为 Markdown 格式而设计。它支持 PDF、Word、Excel、PPT、图片（含 OCR）、音频（含语音转录）、HTML 乃至 YouTube 链接等多种格式的解析，能够精准提取文档中的标题、列表、表格和链接等关键结构信息。\n\n在人工智能应用日益普及的今天，大语言模型（LLM）虽擅长处理文本，却难以直接读取复杂的二进制办公文档。MarkItDown 恰好解决了这一痛点，它将非结构化或半结构化的文件转化为模型“原生理解”且 Token 效率极高的 Markdown 格式，成为连接本地文件与 AI 分析 pipeline 的理想桥梁。此外，它还提供了 MCP（模型上下文协议）服务器，可无缝集成到 Claude Desktop 等 LLM 应用中。\n\n这款工具特别适合开发者、数据科学家及 AI 研究人员使用，尤其是那些需要构建文档检索增强生成（RAG）系统、进行批量文本分析或希望让 AI 
助手直接“阅读”本地文件的用户。虽然生成的内容也具备一定可读性，但其核心优势在于为机器",93400,"2026-04-06T19:52:38",[52,14],"插件",{"id":54,"name":55,"github_repo":56,"description_zh":57,"stars":58,"difficulty_score":10,"last_commit_at":59,"category_tags":60,"status":17},4487,"LLMs-from-scratch","rasbt\u002FLLMs-from-scratch","LLMs-from-scratch 是一个基于 PyTorch 的开源教育项目，旨在引导用户从零开始一步步构建一个类似 ChatGPT 的大型语言模型（LLM）。它不仅是同名技术著作的官方代码库，更提供了一套完整的实践方案，涵盖模型开发、预训练及微调的全过程。\n\n该项目主要解决了大模型领域“黑盒化”的学习痛点。许多开发者虽能调用现成模型，却难以深入理解其内部架构与训练机制。通过亲手编写每一行核心代码，用户能够透彻掌握 Transformer 架构、注意力机制等关键原理，从而真正理解大模型是如何“思考”的。此外，项目还包含了加载大型预训练权重进行微调的代码，帮助用户将理论知识延伸至实际应用。\n\nLLMs-from-scratch 特别适合希望深入底层原理的 AI 开发者、研究人员以及计算机专业的学生。对于不满足于仅使用 API，而是渴望探究模型构建细节的技术人员而言，这是极佳的学习资源。其独特的技术亮点在于“循序渐进”的教学设计：将复杂的系统工程拆解为清晰的步骤，配合详细的图表与示例，让构建一个虽小但功能完备的大模型变得触手可及。无论你是想夯实理论基础，还是为未来研发更大规模的模型做准备",90106,"2026-04-06T11:19:32",[35,15,13,14],{"id":62,"github_repo":63,"name":64,"description_en":65,"description_zh":66,"ai_summary_zh":66,"readme_en":67,"readme_zh":68,"quickstart_zh":69,"use_case_zh":70,"hero_image_url":71,"owner_login":72,"owner_name":73,"owner_avatar_url":74,"owner_bio":75,"owner_company":76,"owner_location":76,"owner_email":77,"owner_twitter":78,"owner_website":79,"owner_url":80,"languages":81,"stars":90,"forks":91,"last_commit_at":92,"license":93,"difficulty_score":32,"env_os":94,"env_gpu":95,"env_ram":96,"env_deps":97,"category_tags":111,"github_topics":113,"view_count":32,"oss_zip_url":76,"oss_zip_packed_at":76,"status":17,"created_at":132,"updated_at":133,"faqs":134,"releases":165},5228,"superlinear-ai\u002Fraglite","raglite","🥤 RAGLite is a Python toolkit for Retrieval-Augmented Generation (RAG) with DuckDB or PostgreSQL","RAGLite 是一款专为检索增强生成（RAG）打造的轻量级 Python 工具包，支持使用 DuckDB 或 PostgreSQL 作为底层数据库。它旨在解决传统 RAG 方案依赖繁重框架、文档处理粗糙以及检索精度不足等痛点，帮助开发者快速构建高效、精准的问答系统。\n\n这款工具特别适合希望摆脱 LangChain 等重型依赖、追求极致性能与灵活控制的 AI 开发者及研究人员。RAGLite 
的核心亮点在于其“无束缚”的设计理念：它不仅兼容各类大模型提供商和本地模型，还引入了多项前沿技术。例如，它利用数学优化算法实现最优语义分块和句子分割，结合“延迟分块”与上下文标题技术提升嵌入质量；支持混合搜索与自适应检索，让大模型自主判断是否需要检索信息；同时针对长上下文提示词进行了专门优化，显著降低延迟并提升输出质量。此外，RAGLite 内置模型上下文协议（MCP）服务器，可轻松对接 Claude 桌面端等客户端，并提供可选的 Web 或 Slack 聊天界面。凭借对 Metal 和 CUDA 加速的原生支持，它在保证开源许可宽松的同时，实现了速度与效果的双重突破。","[![Open in Dev Containers](https:\u002F\u002Fimg.shields.io\u002Fstatic\u002Fv1?label=Dev%20Containers&message=Open&color=blue&logo=data:image\u002Fsvg%2bxml;base64,PHN2ZyB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciIHZpZXdCb3g9IjAgMCAyNCAyNCI+PHBhdGggZmlsbD0iI2ZmZiIgZD0iTTE3IDE2VjdsLTYgNU0yIDlWOGwxLTFoMWw0IDMgOC04aDFsNCAyIDEgMXYxNGwtMSAxLTQgMmgtMWwtOC04LTQgM0gzbC0xLTF2LTFsMy0zIi8+PC9zdmc+)](https:\u002F\u002Fvscode.dev\u002Fredirect?url=vscode:\u002F\u002Fms-vscode-remote.remote-containers\u002FcloneInVolume?url=https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite) [![Open in GitHub Codespaces](https:\u002F\u002Fimg.shields.io\u002Fstatic\u002Fv1?label=GitHub%20Codespaces&message=Open&color=blue&logo=github)](https:\u002F\u002Fgithub.com\u002Fcodespaces\u002Fnew\u002Fsuperlinear-ai\u002Fraglite)\n\n# 🥤 RAGLite\n\nRAGLite is a Python toolkit for Retrieval-Augmented Generation (RAG) with DuckDB or PostgreSQL.\n\n## Features\n\n##### Configurable\n\n- 🧠 Choose any LLM provider with [LiteLLM](https:\u002F\u002Fgithub.com\u002FBerriAI\u002Flitellm), including local [llama-cpp-python](https:\u002F\u002Fgithub.com\u002Fabetlen\u002Fllama-cpp-python) models\n- 💾 Choose either [DuckDB](https:\u002F\u002Fduckdb.org) or [PostgreSQL](https:\u002F\u002Fgithub.com\u002Fpostgres\u002Fpostgres) as a keyword & vector search database\n- 🥇 Choose any reranker with [rerankers](https:\u002F\u002Fgithub.com\u002FAnswerDotAI\u002Frerankers), including multilingual [FlashRank](https:\u002F\u002Fgithub.com\u002FPrithivirajDamodaran\u002FFlashRank) as the default\n\n##### Fast and permissive\n\n- ❤️ Only lightweight and permissive open source dependencies (e.g., no 
[PyTorch](https:\u002F\u002Fgithub.com\u002Fpytorch\u002Fpytorch) or [LangChain](https:\u002F\u002Fgithub.com\u002Flangchain-ai\u002Flangchain))\n- 🚀 Acceleration with Metal on macOS, and CUDA on Linux and Windows\n\n##### Unhobbled\n\n- 📖 PDF to Markdown conversion on top of [pdftext](https:\u002F\u002Fgithub.com\u002FVikParuchuri\u002Fpdftext) and [pypdfium2](https:\u002F\u002Fgithub.com\u002Fpypdfium2-team\u002Fpypdfium2)\n- 🧬 Multi-vector chunk embedding with [late chunking](https:\u002F\u002Fweaviate.io\u002Fblog\u002Flate-chunking) and [contextual chunk headings](https:\u002F\u002Fd-star.ai\u002Fsolving-the-out-of-context-chunk-problem-for-rag)\n- ✏️ Optimal sentence splitting with [wtpsplit-lite](https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fwtpsplit-lite) by solving a [binary integer programming problem](https:\u002F\u002Fen.wikipedia.org\u002Fwiki\u002FInteger_programming)\n- ✂️ Optimal [semantic chunking](https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=8OJC21T2SL4&t=1930s) by solving a [binary integer programming problem](https:\u002F\u002Fen.wikipedia.org\u002Fwiki\u002FInteger_programming)\n- 🔍 [Hybrid search](https:\u002F\u002Fplg.uwaterloo.ca\u002F~gvcormac\u002Fcormacksigir09-rrf.pdf) with the database's native keyword & vector search ([FTS](https:\u002F\u002Fduckdb.org\u002Fdocs\u002Fstable\u002Fextensions\u002Ffull_text_search)+[VSS](https:\u002F\u002Fduckdb.org\u002Fdocs\u002Fstable\u002Fextensions\u002Fvss); [tsvector](https:\u002F\u002Fwww.postgresql.org\u002Fdocs\u002Fcurrent\u002Fdatatype-textsearch.html)+[pgvector](https:\u002F\u002Fgithub.com\u002Fpgvector\u002Fpgvector))\n- 💭 [Adaptive retrieval](https:\u002F\u002Farxiv.org\u002Fabs\u002F2403.14403) where the LLM decides whether to and what to retrieve based on the query\n- 💰 Improved cost and latency with a [prompt caching-aware message array structure](https:\u002F\u002Fplatform.openai.com\u002Fdocs\u002Fguides\u002Fprompt-caching)\n- 🍰 Improved output quality with [Anthropic's 
long-context prompt format](https:\u002F\u002Fdocs.anthropic.com\u002Fen\u002Fdocs\u002Fbuild-with-claude\u002Fprompt-engineering\u002Flong-context-tips)\n- 🌀 Optimal [closed-form linear query adapter](src\u002Fraglite\u002F_query_adapter.py) by solving an [orthogonal Procrustes problem](https:\u002F\u002Fen.wikipedia.org\u002Fwiki\u002FOrthogonal_Procrustes_problem)\n\n##### Extensible\n\n- 🔌 A built-in [Model Context Protocol](https:\u002F\u002Fmodelcontextprotocol.io) (MCP) server that any MCP client like [Claude desktop](https:\u002F\u002Fclaude.ai\u002Fdownload) can connect with\n- 💬 Optional customizable ChatGPT-like frontend for [web](https:\u002F\u002Fdocs.chainlit.io\u002Fdeploy\u002Fcopilot), [Slack](https:\u002F\u002Fdocs.chainlit.io\u002Fdeploy\u002Fslack), and [Teams](https:\u002F\u002Fdocs.chainlit.io\u002Fdeploy\u002Fteams) with [Chainlit](https:\u002F\u002Fgithub.com\u002FChainlit\u002Fchainlit)\n- ✍️ Optional conversion of any input document to Markdown with [Pandoc](https:\u002F\u002Fgithub.com\u002Fjgm\u002Fpandoc)\n- 🔎 Optional high-quality document processing with [Mistral OCR](https:\u002F\u002Fdocs.mistral.ai\u002Fcapabilities\u002Fdocument\u002F) for PDFs, images, DOCX, and PPTX with automatic image descriptions\n- ✅ Optional evaluation of retrieval and generation performance with [Ragas](https:\u002F\u002Fgithub.com\u002Fexplodinggradients\u002Fragas)\n\n## Installing\n\n> [!TIP]\n> 🚀 If you want to use local models, it is recommended to install [an accelerated llama-cpp-python precompiled binary](https:\u002F\u002Fgithub.com\u002Fabetlen\u002Fllama-cpp-python?tab=readme-ov-file#supported-backends) with:\n> ```sh\n> # Configure which llama-cpp-python precompiled binary to install (⚠️ not every combination is available):\n> LLAMA_CPP_PYTHON_VERSION=0.3.9\n> PYTHON_VERSION=310|311|312\n> ACCELERATOR=metal|cu121|cu122|cu123|cu124\n> PLATFORM=macosx_11_0_arm64|linux_x86_64|win_amd64\n> \n> # Install llama-cpp-python:\n> pip install 
\"https:\u002F\u002Fgithub.com\u002Fabetlen\u002Fllama-cpp-python\u002Freleases\u002Fdownload\u002Fv$LLAMA_CPP_PYTHON_VERSION-$ACCELERATOR\u002Fllama_cpp_python-$LLAMA_CPP_PYTHON_VERSION-cp$PYTHON_VERSION-cp$PYTHON_VERSION-$PLATFORM.whl\"\n> ```\n\nInstall RAGLite with:\n\n```sh\npip install raglite\n```\n\nTo add support for a customizable ChatGPT-like frontend, use the `chainlit` extra:\n\n```sh\npip install raglite[chainlit]\n```\n\nTo add support for filetypes other than PDF, use the `pandoc` extra:\n\n```sh\npip install raglite[pandoc]\n```\n\nTo add support for high-quality document processing with [Mistral OCR](https:\u002F\u002Fdocs.mistral.ai\u002Fcapabilities\u002Fdocument\u002F), use the `mistral-ocr` extra:\n\n```sh\npip install raglite[mistral-ocr]\n```\n\nTo add support for evaluation, use the `ragas` extra:\n\n```sh\npip install raglite[ragas]\n```\n\n## Using\n\n### Overview\n\n1. [Configuring RAGLite](#1-configuring-raglite)\n2. [Inserting documents](#2-inserting-documents)\n3. [Retrieval-Augmented Generation (RAG)](#3-retrieval-augmented-generation-rag)\n4. [Computing and using an optimal query adapter](#4-computing-and-using-an-optimal-query-adapter)\n5. [Evaluation of retrieval and generation](#5-evaluation-of-retrieval-and-generation)\n6. [Running a Model Context Protocol (MCP) server](#6-running-a-model-context-protocol-mcp-server)\n7. [Serving a customizable ChatGPT-like frontend](#7-serving-a-customizable-chatgpt-like-frontend)\n\n### 1. Configuring RAGLite\n\n> [!TIP]\n> 🧠 RAGLite extends [LiteLLM](https:\u002F\u002Fgithub.com\u002FBerriAI\u002Flitellm) with support for [llama.cpp](https:\u002F\u002Fgithub.com\u002Fggerganov\u002Fllama.cpp) models using [llama-cpp-python](https:\u002F\u002Fgithub.com\u002Fabetlen\u002Fllama-cpp-python). 
To select a llama.cpp model (e.g., from [Unsloth's collection](https:\u002F\u002Fhuggingface.co\u002Funsloth)), use a model identifier of the form `\"llama-cpp-python\u002F\u003Chugging_face_repo_id>\u002F\u003Cfilename>@\u003Cn_ctx>\"`, where `n_ctx` is an optional parameter that specifies the context size of the model.\n\n> [!TIP]\n> 💾 You can create a PostgreSQL database in a few clicks at [neon.tech](https:\u002F\u002Fneon.tech).\n\nFirst, configure RAGLite with your preferred DuckDB or PostgreSQL database and [any LLM supported by LiteLLM](https:\u002F\u002Fdocs.litellm.ai\u002Fdocs\u002Fproviders\u002Fopenai):\n\n```python\nfrom raglite import RAGLiteConfig\n\n# Example 'remote' config with a PostgreSQL database and an OpenAI LLM:\nmy_config = RAGLiteConfig(\n    db_url=\"postgresql:\u002F\u002Fmy_username:my_password@my_host:5432\u002Fmy_database\",\n    llm=\"gpt-4o-mini\",  # Or any LLM supported by LiteLLM\n    embedder=\"text-embedding-3-large\",  # Or any embedder supported by LiteLLM\n)\n\n# Example 'local' config with a DuckDB database and a llama.cpp LLM:\nmy_config = RAGLiteConfig(\n    db_url=\"duckdb:\u002F\u002F\u002Fraglite.db\",\n    llm=\"llama-cpp-python\u002Funsloth\u002FQwen3-8B-GGUF\u002F*Q4_K_M.gguf@8192\",\n    embedder=\"llama-cpp-python\u002Flm-kit\u002Fbge-m3-gguf\u002F*F16.gguf@512\",  # More than 512 tokens degrades bge-m3's performance\n)\n```\n\nYou can also configure [any reranker supported by rerankers](https:\u002F\u002Fgithub.com\u002FAnswerDotAI\u002Frerankers):\n\n```python\nfrom rerankers import Reranker\n\n# Example remote API-based reranker:\nmy_config = RAGLiteConfig(\n    db_url=\"postgresql:\u002F\u002Fmy_username:my_password@my_host:5432\u002Fmy_database\",\n    reranker=Reranker(\"rerank-v3.5\", model_type=\"cohere\", api_key=COHERE_API_KEY, verbose=0),  # Multilingual\n)\n\n# Example local cross-encoder reranker per language (this is the default):\nmy_config = RAGLiteConfig(\n    
db_url=\"duckdb:\u002F\u002F\u002Fraglite.db\",\n    reranker={\n        \"en\": Reranker(\"ms-marco-MiniLM-L-12-v2\", model_type=\"flashrank\", verbose=0),  # English\n        \"other\": Reranker(\"ms-marco-MultiBERT-L-12\", model_type=\"flashrank\", verbose=0),  # Other languages\n    }\n)\n```\n\nSelf-query is also supported, allowing the LLM to automatically generate and apply metadata filters to refine search results based on the user's input. To enable self-query, set `self_query=True` in your `RAGLiteConfig`:\n\n```python\nmy_config = RAGLiteConfig(\n    db_url=\"duckdb:\u002F\u002F\u002Fraglite.db\",\n    llm=\"gpt-4o-mini\",\n    embedder=\"text-embedding-3-large\",\n    self_query=True,  # Enable self-query\n)\n```\n\n### 2. Inserting documents\n\n> [!TIP]\n> ✍️ To insert documents other than PDF, install the `pandoc` extra with `pip install raglite[pandoc]`.\n\n> [!TIP]\n> 🔎 For higher-quality document processing with automatic image descriptions, install the `mistral-ocr` extra with `pip install raglite[mistral-ocr]` and configure it as follows:\n> ```python\n> from raglite import RAGLiteConfig, MistralOCRConfig\n>\n> my_config = RAGLiteConfig(\n>     document_processor=MistralOCRConfig(\n>         include_image_descriptions=True,  # Describe images, charts, and diagrams as text\n>         image_types=frozenset({\"chart\", \"diagram\", \"photo\", \"table\", \"logo\", \"icon\"}),  # Custom image categories\n>         exclude_image_types=frozenset({\"logo\", \"icon\"}),  # Filter out specific types from the output\n>     ),\n> )\n> ```\n> The `image_types` parameter defines the categories that Mistral classifies each image into — you can use the defaults or provide your own domain-specific types. Use `exclude_image_types` to filter out any classified types that are not useful for retrieval.\n\nNext, insert some documents into the database. 
RAGLite will take care of the [conversion to Markdown](src\u002Fraglite\u002F_markdown.py), [optimal level 4 semantic chunking](src\u002Fraglite\u002F_split_chunks.py), and [multi-vector embedding with late chunking](src\u002Fraglite\u002F_embed.py):\n\n```python\n# Insert documents given their file path\nfrom pathlib import Path\nfrom raglite import Document, insert_documents\n\ndocuments = [\n    Document.from_path(Path(\"On the Measure of Intelligence.pdf\")),\n    Document.from_path(Path(\"Special Relativity.pdf\")),\n]\ninsert_documents(documents, config=my_config)\n\n# Insert documents given their text\u002Fplain or text\u002Fmarkdown content\ncontent = \"\"\"\n# ON THE ELECTRODYNAMICS OF MOVING BODIES\n## By A. EINSTEIN  June 30, 1905\nIt is known that Maxwell...\n\"\"\"\ndocuments = [\n    Document.from_text(content, author=\"Einstein\", topic=\"physics\", year=1905)\n]\ninsert_documents(documents, config=my_config)\n```\n\n> [!TIP]\n> 📝 Documents can include metadata by passing keyword arguments to `Document.from_text()` or `Document.from_path()`. This metadata can later be used for filtering during retrieval.\n> For list values, metadata is stored as-is (e.g., `domain=[\"open\", \"music\"]`).\n\nYou may also want to expand the document metadata before insertion:\n\n```python\nfrom typing import Annotated, Literal\nfrom pydantic import Field\nfrom raglite import expand_document_metadata\n\n# Expand the documents' metadata.\nmetadata_fields = {\n    \"title\": Annotated[str, Field(..., description=\"Document title.\")],\n    \"author\": Annotated[str, Field(..., description=\"Primary author.\")],\n    \"topics\": Annotated[list[Literal[\"A\", \"B\", \"C\"]], Field(..., description=\"Key themes.\")],\n}\ndocuments = list(expand_document_metadata(documents, metadata_fields, config=my_config))\n\n# Insert the documents with their expanded metadata\ninsert_documents(documents, config=my_config)\n```\n\n### 3. 
Retrieval-Augmented Generation (RAG)\n\n#### 3.1 Adaptive RAG\n\nNow you can run an adaptive RAG pipeline that consists of adding the user prompt to the message history and streaming the LLM response:\n\n```python\nfrom raglite import rag\n\n# Create a user message\nmessages = []  # Or start with an existing message history\nmessages.append({\n    \"role\": \"user\",\n    \"content\": \"How is intelligence measured?\"\n})\n\n# Adaptively decide whether to retrieve and then stream the response\nchunk_spans = []\nstream = rag(messages, on_retrieval=lambda x: chunk_spans.extend(x), config=my_config)\nfor update in stream:\n    print(update, end=\"\")\n\n# Access the documents referenced in the RAG context\ndocuments = [chunk_span.document for chunk_span in chunk_spans]\n```\n\nThe LLM will adaptively decide whether to retrieve information based on the complexity of the user prompt. If retrieval is necessary, the LLM generates the search query and RAGLite applies hybrid search and reranking to retrieve the most relevant chunk spans (each of which is a list of consecutive chunks). The retrieval results are sent to the `on_retrieval` callback and are appended to the message history as a tool output. 
Finally, the assistant response is streamed and appended to the message history.\n\n#### 3.2 Programmable RAG\n\nIf you need manual control over the RAG pipeline, you can run a basic but powerful pipeline that consists of retrieving the most relevant chunk spans with hybrid search and reranking, converting the user prompt to a RAG instruction and appending it to the message history, and finally generating the RAG response:\n\n```python\nfrom raglite import add_context, rag, retrieve_context, vector_search\n\n# Choose a search method\nfrom dataclasses import replace\nmy_config = replace(my_config, search_method=vector_search)  # Or `hybrid_search`, `search_and_rerank_chunks`, ...\n\n# Retrieve relevant chunk spans with the configured search method\nuser_prompt = \"How is intelligence measured?\"\nchunk_spans = retrieve_context(\n    query=user_prompt, \n    num_chunks=5, \n    metadata_filter={\"author\": \"Einstein\"},  # Optional: filter by metadata\n    config=my_config\n)\n\n# Append a RAG instruction based on the user prompt and context to the message history\nmessages = []  # Or start with an existing message history\nmessages.append(add_context(user_prompt=user_prompt, context=chunk_spans, config=my_config))\n\n# Stream the RAG response and append it to the message history\nstream = rag(messages, config=my_config)\nfor update in stream:\n    print(update, end=\"\")\n\n# Access the documents referenced in the RAG context\ndocuments = [chunk_span.document for chunk_span in chunk_spans]\n```\n\n> [!TIP]\n> 🥇 Reranking can significantly improve the output quality of a RAG application. To add reranking to your application: first search for a larger set of 20 relevant chunks, then rerank them with a [rerankers](https:\u002F\u002Fgithub.com\u002FAnswerDotAI\u002Frerankers) reranker, and finally keep the top 5 chunks.\n\nRAGLite also offers more advanced control over the individual steps of a full RAG pipeline:\n\n1. 
Searching for relevant chunks with keyword, vector, or hybrid search\n2. Retrieving the chunks from the database\n3. Reranking the chunks and selecting the top 5 results\n4. Extending the chunks with their neighbors and grouping them into chunk spans\n5. Converting the user prompt to a RAG instruction and appending it to the message history\n6. Streaming an LLM response to the message history\n7. Accessing the cited documents from the chunk spans\n\nA full RAG pipeline is straightforward to implement with RAGLite:\n\n```python\n# Search for chunks\nfrom raglite import hybrid_search, keyword_search, vector_search\n\nuser_prompt = \"How is intelligence measured?\"\nchunk_ids_vector, _ = vector_search(user_prompt, num_results=20, config=my_config)\nchunk_ids_keyword, _ = keyword_search(user_prompt, num_results=20, config=my_config)\nchunk_ids_hybrid, _ = hybrid_search(\n    user_prompt, num_results=20, metadata_filter={\"topic\": \"physics\"}, config=my_config\n)  # Filter results to only include chunks from documents with topic=\"physics\" (works with any search method)\n\n# Multi-value filter in one field uses OR semantics:\nchunk_ids_or, _ = hybrid_search(\n    user_prompt,\n    num_results=20,\n    metadata_filter={\"domain\": [\"open\", \"music\"]},\n    config=my_config,\n)  # Returns chunks where domain includes \"open\" OR \"music\".\n\n# Retrieve chunks\nfrom raglite import retrieve_chunks\n\nchunks_hybrid = retrieve_chunks(chunk_ids_hybrid, config=my_config)\n\n# Rerank chunks and keep the top 5 (optional, but recommended)\nfrom raglite import rerank_chunks\n\nchunks_reranked = rerank_chunks(user_prompt, chunks_hybrid, config=my_config)\nchunks_reranked = chunks_reranked[:5]\n\n# Extend chunks with their neighbors and group them into chunk spans\nfrom raglite import retrieve_chunk_spans\n\nchunk_spans = retrieve_chunk_spans(chunks_reranked, config=my_config)\n\n# Append a RAG instruction based on the user prompt and context to the message history\nfrom 
raglite import add_context\n\nmessages = []  # Or start with an existing message history\nmessages.append(add_context(user_prompt=user_prompt, context=chunk_spans, config=my_config))\n\n# Stream the RAG response and append it to the message history\nfrom raglite import rag\n\nstream = rag(messages, config=my_config)\nfor update in stream:\n    print(update, end=\"\")\n\n# Access the documents referenced in the RAG context\ndocuments = [chunk_span.document for chunk_span in chunk_spans]\n```\n\n### 4. Computing and using an optimal query adapter\n\nRAGLite can compute and apply an [optimal closed-form query adapter](src\u002Fraglite\u002F_query_adapter.py) to the prompt embedding to improve the output quality of RAG. To benefit from this, first generate a set of evals with `insert_evals` and then compute and store the optimal query adapter with `update_query_adapter`:\n\n```python\n# Improve RAG with an optimal query adapter\nfrom raglite import insert_evals, update_query_adapter\n\ninsert_evals(num_evals=100, config=my_config)\nupdate_query_adapter(config=my_config)  # From here, every vector search will use the query adapter\n```\n\n### 5. Evaluation of retrieval and generation\n\nIf you installed the `ragas` extra, you can use RAGLite to answer the evals and then evaluate the quality of both the retrieval and generation steps of RAG using [Ragas](https:\u002F\u002Fgithub.com\u002Fexplodinggradients\u002Fragas):\n\n```python\n# Evaluate retrieval and generation\nfrom raglite import answer_evals, evaluate, insert_evals\n\ninsert_evals(num_evals=100, config=my_config)\nanswered_evals_df = answer_evals(num_evals=10, config=my_config)\nevaluation_df = evaluate(answered_evals_df, config=my_config)\n```\n\n### 6. 
Running a Model Context Protocol (MCP) server\n\nRAGLite comes with an [MCP server](https:\u002F\u002Fmodelcontextprotocol.io) implemented with [FastMCP](https:\u002F\u002Fgithub.com\u002Fjlowin\u002Ffastmcp) that exposes a `search_knowledge_base` [tool](https:\u002F\u002Fgithub.com\u002Fjlowin\u002Ffastmcp?tab=readme-ov-file#tools). To use the server:\n\n1. Install [Claude desktop](https:\u002F\u002Fclaude.ai\u002Fdownload)\n2. Install [uv](https:\u002F\u002Fdocs.astral.sh\u002Fuv\u002Fgetting-started\u002Finstallation\u002F) so that Claude desktop can start the server\n3. Configure Claude desktop to use `uv` to start the MCP server with:\n\n```sh\nraglite \\\n    --db-url duckdb:\u002F\u002F\u002Fraglite.db \\\n    --llm llama-cpp-python\u002Funsloth\u002FQwen3-4B-GGUF\u002F*Q4_K_M.gguf@8192 \\\n    --embedder llama-cpp-python\u002Flm-kit\u002Fbge-m3-gguf\u002F*F16.gguf@512 \\\n    mcp install\n```\n\nTo use an API-based LLM, make sure to include your credentials in a `.env` file or supply them inline:\n\n```sh\nexport OPENAI_API_KEY=sk-...\nraglite \\\n    --llm gpt-4o-mini \\\n    --embedder text-embedding-3-large \\\n    mcp install\n```\n\nNow, when you start Claude desktop, you should see a 🔨 icon at the bottom right of your prompt, indicating that Claude has successfully connected to the MCP server.\n\nWhen relevant, Claude will suggest using the `search_knowledge_base` tool that the MCP server provides. You can also explicitly ask Claude to search the knowledge base if you want to be certain that it does.\n\n\u003Cdiv align=\"center\">\u003Cvideo src=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F3a597a17-874e-475f-a6dd-cd3ccf360fb9\" \u002F>\u003C\u002Fdiv>\n\n### 7. 
Serving a customizable ChatGPT-like frontend\n\nIf you installed the `chainlit` extra, you can serve a customizable ChatGPT-like frontend with:\n\n```sh\nraglite chainlit\n```\n\nThe application is also deployable to [web](https:\u002F\u002Fdocs.chainlit.io\u002Fdeploy\u002Fcopilot), [Slack](https:\u002F\u002Fdocs.chainlit.io\u002Fdeploy\u002Fslack), and [Teams](https:\u002F\u002Fdocs.chainlit.io\u002Fdeploy\u002Fteams).\n\nYou can specify the database URL, LLM, and embedder directly in the Chainlit frontend, or with the CLI as follows:\n\n```sh\nraglite \\\n    --db-url duckdb:\u002F\u002F\u002Fraglite.db \\\n    --llm llama-cpp-python\u002Funsloth\u002FQwen3-4B-GGUF\u002F*Q4_K_M.gguf@8192 \\\n    --embedder llama-cpp-python\u002Flm-kit\u002Fbge-m3-gguf\u002F*F16.gguf@512 \\\n    chainlit\n```\n\nTo use an API-based LLM, make sure to include your credentials in a `.env` file or supply them inline:\n\n```sh\nOPENAI_API_KEY=sk-... raglite --llm gpt-4o-mini --embedder text-embedding-3-large chainlit\n```\n\n\u003Cdiv align=\"center\">\u003Cvideo src=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002Fa303ed4a-54cd-45ea-a2b5-86e086053aed\" \u002F>\u003C\u002Fdiv>\n\n## Contributing\n\n\u003Cdetails>\n\u003Csummary>Prerequisites\u003C\u002Fsummary>\n\n1. [Generate an SSH key](https:\u002F\u002Fdocs.github.com\u002Fen\u002Fauthentication\u002Fconnecting-to-github-with-ssh\u002Fgenerating-a-new-ssh-key-and-adding-it-to-the-ssh-agent#generating-a-new-ssh-key) and [add the SSH key to your GitHub account](https:\u002F\u002Fdocs.github.com\u002Fen\u002Fauthentication\u002Fconnecting-to-github-with-ssh\u002Fadding-a-new-ssh-key-to-your-github-account).\n1. Configure SSH to automatically load your SSH keys:\n\n    ```sh\n    cat \u003C\u003C EOF >> ~\u002F.ssh\u002Fconfig\n    \n    Host *\n      AddKeysToAgent yes\n      IgnoreUnknown UseKeychain\n      UseKeychain yes\n      ForwardAgent yes\n    EOF\n    ```\n\n1. 
[Install Docker Desktop](https:\u002F\u002Fwww.docker.com\u002Fget-started).\n1. [Install VS Code](https:\u002F\u002Fcode.visualstudio.com\u002F) and [VS Code's Dev Containers extension](https:\u002F\u002Fmarketplace.visualstudio.com\u002Fitems?itemName=ms-vscode-remote.remote-containers). Alternatively, install [PyCharm](https:\u002F\u002Fwww.jetbrains.com\u002Fpycharm\u002Fdownload\u002F).\n1. _Optional:_ install a [Nerd Font](https:\u002F\u002Fwww.nerdfonts.com\u002Ffont-downloads) such as [FiraCode Nerd Font](https:\u002F\u002Fgithub.com\u002Fryanoasis\u002Fnerd-fonts\u002Ftree\u002Fmaster\u002Fpatched-fonts\u002FFiraCode) and [configure VS Code](https:\u002F\u002Fgithub.com\u002Ftonsky\u002FFiraCode\u002Fwiki\u002FVS-Code-Instructions) or [PyCharm](https:\u002F\u002Fgithub.com\u002Ftonsky\u002FFiraCode\u002Fwiki\u002FIntellij-products-instructions) to use it.\n\n\u003C\u002Fdetails>\n\n\u003Cdetails open>\n\u003Csummary>Development environments\u003C\u002Fsummary>\n\nThe following development environments are supported:\n\n1. ⭐️ _GitHub Codespaces_: click on [Open in GitHub Codespaces](https:\u002F\u002Fgithub.com\u002Fcodespaces\u002Fnew\u002Fsuperlinear-ai\u002Fraglite) to start developing in your browser.\n1. ⭐️ _VS Code Dev Container (with container volume)_: click on [Open in Dev Containers](https:\u002F\u002Fvscode.dev\u002Fredirect?url=vscode:\u002F\u002Fms-vscode-remote.remote-containers\u002FcloneInVolume?url=https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite) to clone this repository in a container volume and create a Dev Container with VS Code.\n1. ⭐️ _uv_: clone this repository and run the following from the root of the repository:\n\n    ```sh\n    # Create and install a virtual environment\n    uv sync --python 3.10 --all-extras\n\n    # Activate the virtual environment\n    source .venv\u002Fbin\u002Factivate\n\n    # Install the pre-commit hooks\n    pre-commit install --install-hooks\n    ```\n\n1. 
_VS Code Dev Container_: clone this repository, open it with VS Code, and run \u003Ckbd>Ctrl\u002F⌘\u003C\u002Fkbd> + \u003Ckbd>⇧\u003C\u002Fkbd> + \u003Ckbd>P\u003C\u002Fkbd> → _Dev Containers: Reopen in Container_.\n1. _PyCharm Dev Container_: clone this repository, open it with PyCharm, [create a Dev Container with Mount Sources](https:\u002F\u002Fwww.jetbrains.com\u002Fhelp\u002Fpycharm\u002Fstart-dev-container-inside-ide.html), and [configure an existing Python interpreter](https:\u002F\u002Fwww.jetbrains.com\u002Fhelp\u002Fpycharm\u002Fconfiguring-python-interpreter.html#widget) at `\u002Fopt\u002Fvenv\u002Fbin\u002Fpython`.\n\n\u003C\u002Fdetails>\n\n\u003Cdetails open>\n\u003Csummary>Developing\u003C\u002Fsummary>\n\n- This project follows the [Conventional Commits](https:\u002F\u002Fwww.conventionalcommits.org\u002F) standard to automate [Semantic Versioning](https:\u002F\u002Fsemver.org\u002F) and [Keep A Changelog](https:\u002F\u002Fkeepachangelog.com\u002F) with [Commitizen](https:\u002F\u002Fgithub.com\u002Fcommitizen-tools\u002Fcommitizen).\n- Run `poe` from within the development environment to print a list of [Poe the Poet](https:\u002F\u002Fgithub.com\u002Fnat-n\u002Fpoethepoet) tasks available to run on this project.\n- Run `uv add {package}` from within the development environment to install a run time dependency and add it to `pyproject.toml` and `uv.lock`. Add `--dev` to install a development dependency.\n- Run `uv sync --upgrade` from within the development environment to upgrade all dependencies to the latest versions allowed by `pyproject.toml`. Add `--only-dev` to upgrade the development dependencies only.\n- Run `cz bump` to bump the package's version, update the `CHANGELOG.md`, and create a git tag. 
Then push the changes and the git tag with `git push origin main --tags`.\n\n\u003C\u002Fdetails>\n\n## Star History\n\n\u003Ca href=\"https:\u002F\u002Fstar-history.com\u002F#superlinear-ai\u002Fraglite&Timeline\">\n \u003Cpicture>\n   \u003Csource media=\"(prefers-color-scheme: dark)\" srcset=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fsuperlinear-ai_raglite_readme_8a1839d1ee2c.png&theme=dark\" \u002F>\n   \u003Csource media=\"(prefers-color-scheme: light)\" srcset=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fsuperlinear-ai_raglite_readme_8a1839d1ee2c.png\" \u002F>\n   \u003Cimg alt=\"Star History Chart\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fsuperlinear-ai_raglite_readme_8a1839d1ee2c.png\" \u002F>\n \u003C\u002Fpicture>\n\u003C\u002Fa>\n","[![在开发容器中打开](https:\u002F\u002Fimg.shields.io\u002Fstatic\u002Fv1?label=Dev%20Containers&message=Open&color=blue&logo=data:image\u002Fsvg%2bxml;base64,PHN2ZyB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciIHZpZXdCb3g9IjAgMCAyNCAyNCI+PHBhdGggZmlsbD0iI2ZmZiIgZD0iTTE3IDE2VjdsLTYgNU0yIDlWOGwxLTFoMWw0IDMgOC04aDFsNCAyIDEgMXYxNGwtMSAxLTQgM2gtMWwtOC04LTQgM0hzbC0xLTF2LTFsMy0zIi8+PC9zdmc+)](https:\u002F\u002Fvscode.dev\u002Fredirect?url=vscode:\u002F\u002Fms-vscode-remote.remote-containers\u002FcloneInVolume?url=https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite) [![在 GitHub Codespaces 中打开](https:\u002F\u002Fimg.shields.io\u002Fstatic\u002Fv1?label=GitHub%20Codespaces&message=Open&color=blue&logo=github)](https:\u002F\u002Fgithub.com\u002Fcodespaces\u002Fnew\u002Fsuperlinear-ai\u002Fraglite)\n\n# 🥤 RAGLite\n\nRAGLite 是一个用于检索增强生成（RAG）的 Python 工具包，支持 DuckDB 或 PostgreSQL。\n\n## 特性\n\n##### 可配置\n\n- 🧠 可通过 [LiteLLM](https:\u002F\u002Fgithub.com\u002FBerriAI\u002Flitellm) 选择任何 LLM 提供商，包括本地的 [llama-cpp-python](https:\u002F\u002Fgithub.com\u002Fabetlen\u002Fllama-cpp-python) 模型\n- 💾 可选择 [DuckDB](https:\u002F\u002Fduckdb.org) 或 
[PostgreSQL](https:\u002F\u002Fgithub.com\u002Fpostgres\u002Fpostgres) 作为关键词和向量搜索数据库\n- 🥇 可通过 [rerankers](https:\u002F\u002Fgithub.com\u002FAnswerDotAI\u002Frerankers) 选择任何重排序器，包括默认的多语言 [FlashRank](https:\u002F\u002Fgithub.com\u002FPrithivirajDamodaran\u002FFlashRank)\n\n##### 快速且宽松\n\n- ❤️ 仅依赖轻量级、许可宽松的开源库（例如，不使用 [PyTorch](https:\u002F\u002Fgithub.com\u002Fpytorch\u002Fpytorch) 或 [LangChain](https:\u002F\u002Fgithub.com\u002Flangchain-ai\u002Flangchain)）\n- 🚀 在 macOS 上使用 Metal 加速，在 Linux 和 Windows 上使用 CUDA 加速\n\n##### 无限制\n\n- 📖 基于 [pdftext](https:\u002F\u002Fgithub.com\u002FVikParuchuri\u002Fpdftext) 和 [pypdfium2](https:\u002F\u002Fgithub.com\u002Fpypdfium2-team\u002Fpypdfium2) 的 PDF 到 Markdown 转换\n- 🧬 使用 [late chunking](https:\u002F\u002Fweaviate.io\u002Fblog\u002Flate-chunking) 和 [contextual chunk headings](https:\u002F\u002Fd-star.ai\u002Fsolving-the-out-of-context-chunk-problem-for-rag) 进行多向量分块嵌入\n- ✏️ 通过解决 [二元整数规划问题](https:\u002F\u002Fen.wikipedia.org\u002Fwiki\u002FInteger_programming) 使用 [wtpsplit-lite](https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fwtpsplit-lite) 实现最优句子分割\n- ✂️ 通过解决 [二元整数规划问题](https:\u002F\u002Fen.wikipedia.org\u002Fwiki\u002FInteger_programming) 实现最优 [语义分块](https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=8OJC21T2SL4&t=1930s)\n- 🔍 使用数据库原生的关键词和向量搜索进行 [混合搜索](https:\u002F\u002Fplg.uwaterloo.ca\u002F~gvcormac\u002Fcormacksigir09-rrf.pdf)（[FTS](https:\u002F\u002Fduckdb.org\u002Fdocs\u002Fstable\u002Fextensions\u002Ffull_text_search)+[VSS](https:\u002F\u002Fduckdb.org\u002Fdocs\u002Fstable\u002Fextensions\u002Fvss); [tsvector](https:\u002F\u002Fwww.postgresql.org\u002Fdocs\u002Fcurrent\u002Fdatatype-textsearch.html)+[pgvector](https:\u002F\u002Fgithub.com\u002Fpgvector\u002Fpgvector)）\n- 💭 [自适应检索](https:\u002F\u002Farxiv.org\u002Fabs\u002F2403.14403)，其中 LLM 根据查询决定是否以及检索什么内容\n- 💰 通过 [提示缓存感知的消息数组结构](https:\u002F\u002Fplatform.openai.com\u002Fdocs\u002Fguides\u002Fprompt-caching) 改善成本和延迟\n- 🍰 通过 [Anthropic 
的长上下文提示格式](https:\u002F\u002Fdocs.anthropic.com\u002Fen\u002Fdocs\u002Fbuild-with-claude\u002Fprompt-engineering\u002Flong-context-tips) 提升输出质量\n- 🌀 通过解 [正交 Procrustes 问题](https:\u002F\u002Fen.wikipedia.org\u002Fwiki\u002FOrthogonal_Procrustes_problem) 得到最优的 [闭式线性查询适配器](src\u002Fraglite\u002F_query_adapter.py)\n\n##### 可扩展\n\n- 🔌 内置 [Model Context Protocol](https:\u002F\u002Fmodelcontextprotocol.io) (MCP) 服务器，任何 MCP 客户端如 [Claude desktop](https:\u002F\u002Fclaude.ai\u002Fdownload) 都可以连接\n- 💬 可选的类 ChatGPT 前端，适用于 [web](https:\u002F\u002Fdocs.chainlit.io\u002Fdeploy\u002Fcopilot)、[Slack](https:\u002F\u002Fdocs.chainlit.io\u002Fdeploy\u002Fslack) 和 [Teams](https:\u002F\u002Fdocs.chainlit.io\u002Fdeploy\u002Fteams)，通过 [Chainlit](https:\u002F\u002Fgithub.com\u002FChainlit\u002Fchainlit) 实现\n- ✍️ 可选的将任意输入文档转换为 Markdown，使用 [Pandoc](https:\u002F\u002Fgithub.com\u002Fjgm\u002Fpandoc)\n- 🔎 可选的高质量文档处理，使用 [Mistral OCR](https:\u002F\u002Fdocs.mistral.ai\u002Fcapabilities\u002Fdocument\u002F) 处理 PDF、图片、DOCX 和 PPTX，并自动生成图像描述\n- ✅ 可选的使用 [Ragas](https:\u002F\u002Fgithub.com\u002Fexplodinggradients\u002Fragas) 评估检索和生成性能\n\n## 安装\n\n> [!TIP]\n> 🚀 如果你想使用本地模型，建议安装加速版的 [llama-cpp-python 预编译二进制文件](https:\u002F\u002Fgithub.com\u002Fabetlen\u002Fllama-cpp-python?tab=readme-ov-file#supported-backends)：\n> ```sh\n> # 配置要安装的 llama-cpp-python 预编译二进制版本（⚠️ 并非所有组合都可用）：\n> LLAMA_CPP_PYTHON_VERSION=0.3.9\n> PYTHON_VERSION=310|311|312\n> ACCELERATOR=metal|cu121|cu122|cu123|cu124\n> PLATFORM=macosx_11_0_arm64|linux_x86_64|win_amd64\n> \n> # 安装 llama-cpp-python：\n> pip install \"https:\u002F\u002Fgithub.com\u002Fabetlen\u002Fllama-cpp-python\u002Freleases\u002Fdownload\u002Fv$LLAMA_CPP_PYTHON_VERSION-$ACCELERATOR\u002Fllama_cpp_python-$LLAMA_CPP_PYTHON_VERSION-cp$PYTHON_VERSION-cp$PYTHON_VERSION-$PLATFORM.whl\"\n> ```\n\n安装 RAGLite：\n```sh\npip install raglite\n```\n\n若需添加类 ChatGPT 的可定制前端支持，可使用 `chainlit` 附加组件：\n```sh\npip install raglite[chainlit]\n```\n\n若需支持除 PDF 之外的其他文件类型，可使用 `pandoc` 
附加组件：\n```sh\npip install raglite[pandoc]\n```\n\n若需支持使用 [Mistral OCR](https:\u002F\u002Fdocs.mistral.ai\u002Fcapabilities\u002Fdocument\u002F) 进行高质量文档处理，可使用 `mistral-ocr` 附加组件：\n```sh\npip install raglite[mistral-ocr]\n```\n\n若需支持评估功能，可使用 `ragas` 附加组件：\n```sh\npip install raglite[ragas]\n```\n\n## 使用\n\n### 概述\n\n1. [配置 RAGLite](#1-configuring-raglite)\n2. [插入文档](#2-inserting-documents)\n3. [检索增强生成（RAG）](#3-retrieval-augmented-generation-rag)\n4. [计算并使用最优查询适配器](#4-computing-and-using-an-optimal-query-adapter)\n5. [检索和生成的评估](#5-evaluation-of-retrieval-and-generation)\n6. [运行 Model Context Protocol (MCP) 服务器](#6-running-a-model-context-protocol-mcp-server)\n7. [提供可定制的类 ChatGPT 前端](#7-serving-a-customizable-chatgpt-like-frontend)\n\n### 1. 配置 RAGLite\n\n> [!TIP]\n> 🧠 RAGLite 扩展了 [LiteLLM](https:\u002F\u002Fgithub.com\u002FBerriAI\u002Flitellm)，支持使用 [llama-cpp-python](https:\u002F\u002Fgithub.com\u002Fabetlen\u002Fllama-cpp-python) 的 [llama.cpp](https:\u002F\u002Fgithub.com\u002Fggerganov\u002Fllama.cpp) 模型。要选择一个 llama.cpp 模型（例如来自 [Unsloth 的集合](https:\u002F\u002Fhuggingface.co\u002Funsloth))，可以使用形如 `\"llama-cpp-python\u002F\u003Chugging_face_repo_id>\u002F\u003Cfilename>@\u003Cn_ctx>\"` 的模型标识符，其中 `n_ctx` 是可选参数，用于指定模型的上下文大小。\n\n> [!TIP]\n> 💾 你可以在 [neon.tech](https:\u002F\u002Fneon.tech) 上轻松创建一个 PostgreSQL 数据库。\n\n首先，使用你喜欢的 DuckDB 或 PostgreSQL 数据库以及 [LiteLLM 支持的任何 LLM](https:\u002F\u002Fdocs.litellm.ai\u002Fdocs\u002Fproviders\u002Fopenai) 配置 RAGLite：\n```python\nfrom raglite import RAGLiteConfig\n\n# 使用 PostgreSQL 数据库和 OpenAI LLM 的示例“远程”配置：\nmy_config = RAGLiteConfig(\n    db_url=\"postgresql:\u002F\u002Fmy_username:my_password@my_host:5432\u002Fmy_database\",\n    llm=\"gpt-4o-mini\",  # 或者任何 LiteLLM 支持的 LLM\n    embedder=\"text-embedding-3-large\",  # 或者任何 LiteLLM 支持的嵌入模型\n)\n\n# 使用 DuckDB 数据库和 llama.cpp LLM 的示例“本地”配置：\nmy_config = RAGLiteConfig(\n    db_url=\"duckdb:\u002F\u002F\u002Fraglite.db\",\n    
llm=\"llama-cpp-python\u002Funsloth\u002FQwen3-8B-GGUF\u002F*Q4_K_M.gguf@8192\",\n    embedder=\"llama-cpp-python\u002Flm-kit\u002Fbge-m3-gguf\u002F*F16.gguf@512\", # 超过 512 个 token 会降低 bge-m3 的性能\n)\n```\n\n你也可以配置 [rerankers 支持的任何重排序器](https:\u002F\u002Fgithub.com\u002FAnswerDotAI\u002Frerankers)：\n\n```python\nfrom rerankers import Reranker\n\n# 基于远程 API 的重排序器示例：\nmy_config = RAGLiteConfig(\n    db_url=\"postgresql:\u002F\u002Fmy_username:my_password@my_host:5432\u002Fmy_database\",\n    reranker=Reranker(\"rerank-v3.5\", model_type=\"cohere\", api_key=COHERE_API_KEY, verbose=0)  # 多语言\n)\n\n# 按语言划分的本地交叉编码器重排序器示例（这是默认设置）：\nmy_config = RAGLiteConfig(\n    db_url=\"duckdb:\u002F\u002F\u002Fraglite.db\",\n    reranker={\n        \"en\": Reranker(\"ms-marco-MiniLM-L-12-v2\", model_type=\"flashrank\", verbose=0),  # 英语\n        \"other\": Reranker(\"ms-marco-MultiBERT-L-12\", model_type=\"flashrank\", verbose=0),  # 其他语言\n    }\n)\n```\n\n自查询功能也已支持，允许 LLM 根据用户输入自动生成并应用元数据过滤器，以优化搜索结果。要启用自查询功能，请在 `RAGLiteConfig` 中将 `self_query` 设置为 `True`：\n\n```python\nmy_config = RAGLiteConfig(\n    db_url=\"duckdb:\u002F\u002F\u002Fraglite.db\",\n    llm=\"gpt-4o-mini\",\n    embedder=\"text-embedding-3-large\",\n    self_query=True,  # 启用自查询\n)\n```\n\n### 2. 
插入文档\n\n> [!TIP]\n> ✍️ 若要插入 PDF 以外的文档，请使用 `pip install raglite[pandoc]` 安装 pandoc 插件。\n\n> [!TIP]\n> 🔎 为了获得更高质量的文档处理效果，并实现自动图像描述功能，可使用 `pip install raglite[mistral-ocr]` 安装 mistral-ocr 插件，并按如下方式配置：\n> ```python\n> from raglite import RAGLiteConfig, MistralOCRConfig\n>\n> my_config = RAGLiteConfig(\n>     document_processor=MistralOCRConfig(\n>         include_image_descriptions=True,  # 将图片、图表和示意图描述为文本\n>         image_types=frozenset({\"chart\", \"diagram\", \"photo\", \"table\", \"logo\", \"icon\"}),  # 自定义图片类别\n>         exclude_image_types=frozenset({\"logo\", \"icon\"}),  # 从输出中过滤掉特定类型的图片\n>     ),\n> )\n> ```\n> `image_types` 参数定义了 Mistral 对每张图片进行分类的类别；你可以使用默认值，也可以提供自己领域的特定类型。使用 `exclude_image_types` 可以过滤掉任何对检索无用的已分类类型。\n\n接下来，将一些文档插入数据库。RAGLite 会负责将其转换为 Markdown 格式（src\u002Fraglite\u002F_markdown.py）、进行最优的四级语义分块（src\u002Fraglite\u002F_split_chunks.py）以及执行延迟分块的多向量嵌入（src\u002Fraglite\u002F_embed.py）：\n\n```python\n# 根据文件路径插入文档\nfrom pathlib import Path\nfrom raglite import Document, insert_documents\n\ndocuments = [\n    Document.from_path(Path(\"论智力的测量.pdf\")),\n    Document.from_path(Path(\"狭义相对论.pdf\")),\n]\ninsert_documents(documents, config=my_config)\n\n# 根据纯文本或 Markdown 内容插入文档\ncontent = \"\"\"\n# 论运动物体的电动力学\n## 爱因斯坦 1905年6月30日\n众所周知，麦克斯韦……\n\"\"\"\ndocuments = [\n    Document.from_text(content, author=\"爱因斯坦\", topic=\"物理学\", year=1905)\n]\ninsert_documents(documents, config=my_config)\n```\n\n> [!TIP]\n> 📝 文档可以通过向 `Document.from_text()` 或 `Document.from_path()` 传递关键字参数来包含元数据。这些元数据随后可用于检索时的过滤。\n> 对于列表类型的值，元数据会原样存储（例如 `domain=[\"open\", \"music\"]`）。\n\n你还可以在插入之前扩展文档的元数据：\n\n```python\nfrom typing import Annotated, Literal\nfrom pydantic import Field\nfrom raglite import expand_document_metadata\n\n# 扩展文档的元数据。\nmetadata_fields = {\n    \"title\": Annotated[str, Field(..., description=\"文档标题。\")],\n    \"author\": Annotated[str, Field(..., description=\"主要作者。\")],\n    \"topics\": Annotated[list[Literal[\"A\", \"B\", \"C\"]], Field(..., 
description=\"关键主题。\")],\n}\ndocuments = list(expand_document_metadata(documents, metadata_fields, config=my_config))\n\n# 根据纯文本或 Markdown 内容插入文档\ninsert_documents(documents, config=my_config)\n```\n\n### 3. 检索增强生成 (RAG)\n\n#### 3.1 自适应 RAG\n\n现在你可以运行一个自适应 RAG 流程，该流程包括将用户提示添加到消息历史记录中，并流式传输 LLM 的响应：\n\n```python\nfrom raglite import rag\n\n# 创建用户消息\nmessages = []  # 或者从现有的消息历史开始\nmessages.append({\n    \"role\": \"user\",\n    \"content\": \"智力是如何衡量的？\"\n})\n\n# 自适应地决定是否需要检索并流式传输响应\nchunk_spans = []\nstream = rag(messages, on_retrieval=lambda x: chunk_spans.extend(x), config=my_config)\nfor update in stream:\n    print(update, end=\"\")\n\n# 获取 RAG 上下文中引用的文档\ndocuments = [chunk_span.document for chunk_span in chunk_spans]\n```\n\nLLM 会根据用户提示的复杂程度自适应地决定是否需要检索信息。如果需要检索，LLM 会生成搜索查询，RAGLite 则会通过混合搜索和重排序来检索最相关的 chunk spans（每个 chunk span 是一系列连续的 chunks）。检索结果会被发送到 `on_retrieval` 回调函数中，并作为工具输出附加到消息历史记录中。最后，助手的响应会被流式传输并附加到消息历史记录中。\n\n#### 3.2 可编程 RAG\n\n如果你需要手动控制 RAG 流程，可以运行一个简单但强大的流程：首先使用混合搜索和重排序检索最相关的 chunk spans，然后将用户提示转换为 RAG 指令并将其附加到消息历史记录中，最后生成 RAG 响应：\n\n```python\nfrom raglite import add_context, rag, retrieve_context, vector_search\n\n# 选择一种搜索方法\nfrom dataclasses import replace\nmy_config = replace(my_config, search_method=vector_search)  # 或者 `hybrid_search`、`search_and_rerank_chunks` 等\nuser_prompt = \"智力是如何衡量的？\"\nchunk_spans = retrieve_context(\n    query=user_prompt, \n    num_chunks=5, \n    metadata_filter={\"author\": \"爱因斯坦\"},  # 可选：按元数据过滤\n    config=my_config\n)\n\n# 根据用户提示和上下文向消息历史中追加一条 RAG 指令\nmessages = []  # 或者从现有的消息历史开始\nmessages.append(add_context(user_prompt=user_prompt, context=chunk_spans, config=my_config))\n\n# 流式输出 RAG 响应并将其追加到消息历史中\nstream = rag(messages, config=my_config)\nfor update in stream:\n    print(update, end=\"\")\n\n# 访问 RAG 上下文中引用的文档\ndocuments = [chunk_span.document for chunk_span in chunk_spans]\n```\n\n> [!TIP]\n> 🥇 重排序可以显著提升 RAG 应用程序的输出质量。要在你的应用程序中添加重排序功能：首先搜索更大规模的 20 个相关块，然后使用 
[rerankers](https:\u002F\u002Fgithub.com\u002FAnswerDotAI\u002Frerankers) 提供的重排序器对这些块进行重排序，最后保留前 5 个块。\n\nRAGLite 还提供了对完整 RAG 流程中各个步骤更高级的控制：\n\n1. 使用关键词、向量或混合搜索查找相关块\n2. 从数据库中检索这些块\n3. 对块进行重排序并选择前 5 个结果\n4. 扩展块及其邻近块，并将它们分组为块跨度\n5. 将用户提示转换为 RAG 指令并将其追加到消息历史中\n6. 将 LLM 响应流式传输到消息历史中\n7. 从块跨度中访问引用的文档\n\n使用 RAGLite 实现完整的 RAG 流程非常简单：\n\n```python\n# 搜索块\nfrom raglite import hybrid_search, keyword_search, vector_search\n\nuser_prompt = \"如何衡量智力？\"\nchunk_ids_vector, _ = vector_search(user_prompt, num_results=20, config=my_config)\nchunk_ids_keyword, _ = keyword_search(user_prompt, num_results=20, config=my_config)\nchunk_ids_hybrid, _ = hybrid_search(\n    user_prompt, num_results=20, metadata_filter={\"topic\": \"physics\"}, config=my_config\n)  # 过滤结果，仅包含主题为“物理”的文档中的块（适用于任何搜索方法）\n\n# 在单个字段中使用多值过滤器时采用 OR 语义：\nchunk_ids_or, _ = hybrid_search(\n    user_prompt,\n    num_results=20,\n    metadata_filter={\"domain\": [\"open\", \"music\"]},\n    config=my_config,\n)  # 返回 domain 包含 \"open\" 或 \"music\" 的块。\n\n# 检索块\nfrom raglite import retrieve_chunks\n\nchunks_hybrid = retrieve_chunks(chunk_ids_hybrid, config=my_config)\n\n# 对块进行重排序并保留前 5 个（可选，但建议）\nfrom raglite import rerank_chunks\n\nchunks_reranked = rerank_chunks(user_prompt, chunks_hybrid, config=my_config)\nchunks_reranked = chunks_reranked[:5]\n\n# 扩展块及其邻近块，并将其分组为块跨度\nfrom raglite import retrieve_chunk_spans\n\nchunk_spans = retrieve_chunk_spans(chunks_reranked, config=my_config)\n\n# 根据用户提示和上下文向消息历史中追加一条 RAG 指令\nfrom raglite import add_context\n\nmessages = []  # 或者从现有的消息历史开始\nmessages.append(add_context(user_prompt=user_prompt, context=chunk_spans, config=my_config))\n\n# 流式输出 RAG 响应并将其追加到消息历史中\nfrom raglite import rag\n\nstream = rag(messages, config=my_config)\nfor update in stream:\n    print(update, end=\"\")\n\n# 访问 RAG 上下文中引用的文档\ndocuments = [chunk_span.document for chunk_span in chunk_spans]\n```\n\n### 4. 
计算并使用最优查询适配器\n\nRAGLite 可以计算并应用一个 [最优闭式查询适配器](src\u002Fraglite\u002F_query_adapter.py) 到提示嵌入中，以提高 RAG 的输出质量。要从中受益，首先使用 `insert_evals` 生成一组评估数据，然后使用 `update_query_adapter` 计算并存储最优查询适配器：\n\n```python\n# 使用最优查询适配器改进 RAG\nfrom raglite import insert_evals, update_query_adapter\n\ninsert_evals(num_evals=100, config=my_config)\nupdate_query_adapter(config=my_config)  # 从此以后，每次向量搜索都会使用该查询适配器\n```\n\n### 5. 检索与生成的评估\n\n如果你安装了 `ragas` 附加组件，就可以使用 RAGLite 回答评估问题，然后使用 [Ragas](https:\u002F\u002Fgithub.com\u002Fexplodinggradients\u002Fragas) 评估 RAG 检索和生成两个步骤的质量：\n\n```python\n# 评估检索和生成\nfrom raglite import answer_evals, evaluate, insert_evals\n\ninsert_evals(num_evals=100, config=my_config)\nanswered_evals_df = answer_evals(num_evals=10, config=my_config)\nevaluation_df = evaluate(answered_evals_df, config=my_config)\n```\n\n### 6. 运行模型上下文协议 (MCP) 服务器\n\nRAGLite 自带一个基于 [FastMCP](https:\u002F\u002Fgithub.com\u002Fjlowin\u002Ffastmcp) 实现的 [MCP 服务器](https:\u002F\u002Fmodelcontextprotocol.io)，它公开了一个 `search_knowledge_base` [工具](https:\u002F\u002Fgithub.com\u002Fjlowin\u002Ffastmcp?tab=readme-ov-file#tools)。要使用该服务器：\n\n1. 安装 [Claude 桌面版](https:\u002F\u002Fclaude.ai\u002Fdownload)\n2. 安装 [uv](https:\u002F\u002Fdocs.astral.sh\u002Fuv\u002Fgetting-started\u002Finstallation\u002F)，以便 Claude 桌面版能够启动服务器\n3. 
配置 Claude 桌面版使用 `uv` 启动 MCP 服务器，命令如下：\n\n```\nraglite \\\n    --db-url duckdb:\u002F\u002F\u002Fraglite.db \\\n    --llm llama-cpp-python\u002Funsloth\u002FQwen3-4B-GGUF\u002F*Q4_K_M.gguf@8192 \\\n    --embedder llama-cpp-python\u002Flm-kit\u002Fbge-m3-gguf\u002F*F16.gguf@512 \\\n    mcp install\n```\n\n如果使用基于 API 的 LLM，请确保将凭据放入 `.env` 文件中，或直接在命令行中提供：\n\n```sh\nexport OPENAI_API_KEY=sk-...\nraglite \\\n    --llm gpt-4o-mini \\\n    --embedder text-embedding-3-large \\\n    mcp install\n```\n\n现在，当你启动 Claude 桌面版时，应该会在提示框的右下角看到一个 🔨 图标，表明 Claude 已成功连接到 MCP 服务器。\n\n在适当的时候，Claude 会建议使用 MCP 服务器提供的 `search_knowledge_base` 工具。你也可以明确要求 Claude 搜索知识库，以确保它一定会执行此操作。\n\n\u003Cdiv align=\"center\">\u003Cvideo src=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F3a597a17-874e-475f-a6dd-cd3ccf360fb9\" \u002F>\u003C\u002Fdiv>\n\n### 7. 提供可自定义的类 ChatGPT 前端\n\n如果你安装了 `chainlit` 附加组件，你可以通过以下命令提供一个可自定义的类 ChatGPT 前端：\n\n```sh\nraglite chainlit\n```\n\n该应用还可以部署到 [web](https:\u002F\u002Fdocs.chainlit.io\u002Fdeploy\u002Fcopilot)、[Slack](https:\u002F\u002Fdocs.chainlit.io\u002Fdeploy\u002Fslack) 和 [Teams](https:\u002F\u002Fdocs.chainlit.io\u002Fdeploy\u002Fteams) 平台。\n\n你可以在 Chainlit 前端直接指定数据库 URL、LLM 和嵌入器，也可以通过 CLI 指定，如下所示：\n\n```sh\nraglite \\\n    --db-url duckdb:\u002F\u002F\u002Fraglite.db \\\n    --llm llama-cpp-python\u002Funsloth\u002FQwen3-4B-GGUF\u002F*Q4_K_M.gguf@8192 \\\n    --embedder llama-cpp-python\u002Flm-kit\u002Fbge-m3-gguf\u002F*F16.gguf@512 \\\n    chainlit\n```\n\n如果要使用基于 API 的 LLM，请确保将你的凭据添加到 `.env` 文件中，或者直接在命令行中提供：\n\n```sh\nOPENAI_API_KEY=sk-... raglite --llm gpt-4o-mini --embedder text-embedding-3-large chainlit\n```\n\n\u003Cdiv align=\"center\">\u003Cvideo src=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002Fa303ed4a-54cd-45ea-a2b5-86e086053aed\" \u002F>\u003C\u002Fdiv>\n\n## 贡献\n\n\u003Cdetails>\n\u003Csummary>先决条件\u003C\u002Fsummary>\n\n1. 
[生成 SSH 密钥](https:\u002F\u002Fdocs.github.com\u002Fen\u002Fauthentication\u002Fconnecting-to-github-with-ssh\u002Fgenerating-a-new-ssh-key-and-adding-it-to-the-ssh-agent#generating-a-new-ssh-key)并 [将 SSH 密钥添加到你的 GitHub 账户](https:\u002F\u002Fdocs.github.com\u002Fen\u002Fauthentication\u002Fconnecting-to-github-with-ssh\u002Fadding-a-new-ssh-key-to-your-github-account)。\n1. 配置 SSH 自动加载你的 SSH 密钥：\n\n    ```sh\n    cat \u003C\u003C EOF >> ~\u002F.ssh\u002Fconfig\n    \n    Host *\n      AddKeysToAgent yes\n      IgnoreUnknown UseKeychain\n      UseKeychain yes\n      ForwardAgent yes\n    EOF\n    ```\n\n1. [安装 Docker Desktop](https:\u002F\u002Fwww.docker.com\u002Fget-started)。\n1. [安装 VS Code](https:\u002F\u002Fcode.visualstudio.com\u002F) 和 [VS Code 的 Dev Containers 扩展](https:\u002F\u002Fmarketplace.visualstudio.com\u002Fitems?itemName=ms-vscode-remote.remote-containers)。或者，也可以安装 [PyCharm](https:\u002F\u002Fwww.jetbrains.com\u002Fpycharm\u002Fdownload\u002F)。\n1. _可选:_ 安装一种 [Nerd Font](https:\u002F\u002Fwww.nerdfonts.com\u002Ffont-downloads)，例如 [FiraCode Nerd Font](https:\u002F\u002Fgithub.com\u002Fryanoasis\u002Fnerd-fonts\u002Ftree\u002Fmaster\u002Fpatched-fonts\u002FFiraCode)，并 [配置 VS Code](https:\u002F\u002Fgithub.com\u002Ftonsky\u002FFiraCode\u002Fwiki\u002FVS-Code-Instructions) 或 [PyCharm](https:\u002F\u002Fgithub.com\u002Ftonsky\u002FFiraCode\u002Fwiki\u002FIntellij-products-instructions) 使用它。\n\n\u003C\u002Fdetails>\n\n\u003Cdetails open>\n\u003Csummary>开发环境\u003C\u002Fsummary>\n\n支持以下开发环境：\n\n1. ⭐️ _GitHub Codespaces_: 点击 [在 GitHub Codespaces 中打开](https:\u002F\u002Fgithub.com\u002Fcodespaces\u002Fnew\u002Fsuperlinear-ai\u002Fraglite) 即可在浏览器中开始开发。\n1. ⭐️ _VS Code Dev Container (带容器卷)_: 点击 [在 Dev Containers 中打开](https:\u002F\u002Fvscode.dev\u002Fredirect?url=vscode:\u002F\u002Fms-vscode-remote.remote-containers\u002FcloneInVolume?url=https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite) 将此仓库克隆到容器卷中，并使用 VS Code 创建一个 Dev Container。\n1. 
⭐️ _uv_: 克隆此仓库，在仓库根目录下运行以下命令：\n\n    ```sh\n    # 创建并安装虚拟环境\n    uv sync --python 3.10 --all-extras\n\n    # 激活虚拟环境\n    source .venv\u002Fbin\u002Factivate\n\n    # 安装 pre-commit 钩子\n    pre-commit install --install-hooks\n    ```\n\n1. _VS Code Dev Container_: 克隆此仓库，用 VS Code 打开，然后按 \u003Ckbd>Ctrl\u002F⌘\u003C\u002Fkbd> + \u003Ckbd>⇧\u003C\u002Fkbd> + \u003Ckbd>P\u003C\u002Fkbd> → _Dev Containers: 在容器中重新打开_。\n1. _PyCharm Dev Container_: 克隆此仓库，用 PyCharm 打开，[创建带有挂载源的 Dev Container](https:\u002F\u002Fwww.jetbrains.com\u002Fhelp\u002Fpycharm\u002Fstart-dev-container-inside-ide.html)，并在 `\u002Fopt\u002Fvenv\u002Fbin\u002Fpython` 处 [配置现有的 Python 解释器](https:\u002F\u002Fwww.jetbrains.com\u002Fhelp\u002Fpycharm\u002Fconfiguring-python-interpreter.html#widget)。\n\n\u003C\u002Fdetails>\n\n\u003Cdetails open>\n\u003Csummary>开发\u003C\u002Fsummary>\n\n- 本项目遵循 [Conventional Commits](https:\u002F\u002Fwww.conventionalcommits.org\u002F) 标准，以借助 [Commitizen](https:\u002F\u002Fgithub.com\u002Fcommitizen-tools\u002Fcommitizen) 自动化 [语义版本控制](https:\u002F\u002Fsemver.org\u002F) 和 [Keep A Changelog](https:\u002F\u002Fkeepachangelog.com\u002F)。\n- 在开发环境中运行 `poe` 可以打印出适用于该项目的 [Poe the Poet](https:\u002F\u002Fgithub.com\u002Fnat-n\u002Fpoethepoet) 任务列表。\n- 在开发环境中运行 `uv add {package}` 可以安装运行时依赖项，并将其添加到 `pyproject.toml` 和 `uv.lock` 中。添加 `--dev` 参数可以安装开发依赖项。\n- 在开发环境中运行 `uv sync --upgrade` 可以将所有依赖项升级到 `pyproject.toml` 允许的最新版本。添加 `--only-dev` 参数则仅升级开发依赖项。\n- 运行 `cz bump` 可以提升软件包版本、更新 `CHANGELOG.md` 并创建一个 Git 标签。随后使用 `git push origin main --tags` 推送更改和 Git 标签。\n\n\u003C\u002Fdetails>\n\n## 星标历史\n\n\u003Ca href=\"https:\u002F\u002Fstar-history.com\u002F#superlinear-ai\u002Fraglite&Timeline\">\n \u003Cpicture>\n   \u003Csource media=\"(prefers-color-scheme: dark)\" srcset=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fsuperlinear-ai_raglite_readme_8a1839d1ee2c.png&theme=dark\" \u002F>\n   \u003Csource media=\"(prefers-color-scheme: light)\" 
srcset=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fsuperlinear-ai_raglite_readme_8a1839d1ee2c.png\" \u002F>\n   \u003Cimg alt=\"星标历史图表\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fsuperlinear-ai_raglite_readme_8a1839d1ee2c.png\" \u002F>\n \u003C\u002Fpicture>\n\u003C\u002Fa>","# RAGLite 快速上手指南\n\nRAGLite 是一个轻量级、高性能的 Python RAG（检索增强生成）工具包，支持 DuckDB 或 PostgreSQL 数据库，兼容本地模型与云端大模型。\n\n## 环境准备\n\n- **系统要求**：macOS、Linux 或 Windows\n- **Python 版本**：3.10 \u002F 3.11 \u002F 3.12\n- **可选加速**：\n  - macOS：Metal 加速\n  - Linux\u002FWindows：CUDA 加速（需安装对应版本的 `llama-cpp-python`）\n\n> 💡 若使用本地大模型（如 llama.cpp），推荐预先安装加速版二进制文件：\n```sh\nLLAMA_CPP_PYTHON_VERSION=0.3.9\nPYTHON_VERSION=312\nACCELERATOR=metal  # 或 cu124（根据平台选择）\nPLATFORM=macosx_11_0_arm64  # 或 linux_x86_64 \u002F win_amd64\n\npip install \"https:\u002F\u002Fgithub.com\u002Fabetlen\u002Fllama-cpp-python\u002Freleases\u002Fdownload\u002Fv$LLAMA_CPP_PYTHON_VERSION-$ACCELERATOR\u002Fllama_cpp_python-$LLAMA_CPP_PYTHON_VERSION-cp$PYTHON_VERSION-cp$PYTHON_VERSION-$PLATFORM.whl\"\n```\n\n## 安装步骤\n\n基础安装：\n```sh\npip install raglite\n```\n\n按需扩展功能：\n```sh\n# 添加 Chainlit 前端支持（类似 ChatGPT 的 Web 界面）\npip install raglite[chainlit]\n\n# 支持更多文档格式（DOCX, PPTX 等，需 Pandoc）\npip install raglite[pandoc]\n\n# 启用高质量 OCR 文档处理（含图像描述）\npip install raglite[mistral-ocr]\n\n# 添加 RAG 性能评估支持\npip install raglite[ragas]\n```\n\n## 基本使用\n\n### 1. 配置 RAGLite\n\n支持远程（PostgreSQL + OpenAI）或本地（DuckDB + llama.cpp）配置：\n\n```python\nfrom raglite import RAGLiteConfig\n\n# 本地配置示例（DuckDB + 本地模型）\nmy_config = RAGLiteConfig(\n    db_url=\"duckdb:\u002F\u002F\u002Fraglite.db\",\n    llm=\"llama-cpp-python\u002Funsloth\u002FQwen3-8B-GGUF\u002F*Q4_K_M.gguf@8192\",\n    embedder=\"llama-cpp-python\u002Flm-kit\u002Fbge-m3-gguf\u002F*F16.gguf@512\",\n)\n```\n\n### 2. 
插入文档\n\n支持 PDF、Markdown 或纯文本，自动完成分块、嵌入与索引：\n\n```python\nfrom pathlib import Path\nfrom raglite import Document, insert_documents\n\n# 从文件路径插入\ndocuments = [\n    Document.from_path(Path(\"example.pdf\")),\n]\ninsert_documents(documents, config=my_config)\n\n# 从文本内容插入（可附带元数据）\ncontent = \"# 相对论\\n由爱因斯坦提出...\"\ndocuments = [\n    Document.from_text(content, author=\"Einstein\", topic=\"physics\", year=1905)\n]\ninsert_documents(documents, config=my_config)\n```\n\n### 3. 执行 RAG 查询\n\n调用 `rag` 函数进行自适应检索与生成，并流式输出响应：\n\n```python\nfrom raglite import rag\n\nmessages = [{\"role\": \"user\", \"content\": \"什么是狭义相对论？\"}]\nstream = rag(messages, config=my_config)\nfor update in stream:\n    print(update, end=\"\")\n```\n\n即可完成一次完整的检索增强生成流程。","某金融科技公司的数据团队需要构建一个内部系统，让分析师能快速从数百份复杂的 PDF 财报和合规文档中检索关键数据并生成摘要。\n\n### 没有 raglite 时\n- **解析质量差**：传统工具处理包含复杂表格的 PDF 时经常乱码，导致关键财务数据丢失或错位。\n- **检索不精准**：仅靠简单的向量搜索，无法有效结合关键词匹配，常漏掉包含特定术语的重要段落。\n- **上下文断裂**：机械式的文本切分破坏了句子完整性，大模型因缺乏前后文背景而产生“幻觉”或错误解读。\n- **部署门槛高**：依赖 PyTorch 等重型框架，导致本地开发环境配置繁琐，且在资源受限的服务器上运行缓慢。\n\n### 使用 raglite 后\n- **高保真文档转换**：利用内置的 pdftext 和 pypdfium2 引擎，完美将含复杂表格的 PDF 转为结构化 Markdown，保留数据原貌。\n- **混合搜索增强**：自动结合数据库原生的全文检索（FTS）与向量搜索（VSS），显著提升了针对专业术语的召回率。\n- **智能语义分块**：通过求解二元整数规划问题实现最优语义切分，并自动添加上下文标题，确保大模型获取的信息完整连贯。\n- **轻量极速运行**：仅依赖 DuckDB 等轻量组件，无需安装庞大的深度学习框架，即可在普通笔记本上利用 Metal\u002FCUDA 加速快速启动服务。\n\nraglite 通过极致的文档解析、智能分块与混合检索策略，让企业能以最低成本构建出高精度、低延迟的专业级 RAG 应用。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fsuperlinear-ai_raglite_90a5e35d.png","superlinear-ai","Superlinear","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002Fsuperlinear-ai_35452e4e.png","Making your AI journey matter",null,"hello@superlinear.eu","superlinear_eu","https:\u002F\u002Fsuperlinear.eu","https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai",[82,86],{"name":83,"color":84,"percentage":85},"Python","#3572A5",99.6,{"name":87,"color":88,"percentage":89},"Dockerfile","#384d54",0.4,1151,101,"2026-04-03T08:05:27","MPL-2.0","Linux, macOS, Windows","非必需。支持可选加速：macOS 使用 
Metal，Linux\u002FWindows 使用 CUDA (支持 cu121, cu122, cu123, cu124)。若运行本地 LLM (llama-cpp-python)，需根据模型大小配置相应显存。","未说明 (取决于所选本地模型大小及文档处理量)",{"notes":98,"python":99,"dependencies":100},"该工具主打轻量级，明确不包含 PyTorch 和 LangChain 依赖。若需使用本地模型，建议安装针对特定平台（Metal\u002FCUDA）预编译的 llama-cpp-python 二进制文件以获得加速。数据库可选择轻量的 DuckDB 或 PostgreSQL。支持通过额外安装包（extras）来启用前端界面、多格式文档转换、高质量 OCR 或评估功能。","3.10, 3.11, 3.12",[101,102,103,104,105,106,107,108,109,110],"litellm","llama-cpp-python (可选)","duckdb 或 postgresql","rerankers","pdftext","pypdfium2","wtpsplit-lite","chainlit (可选)","pandoc (可选)","ragas (可选)",[112,14,35],"其他",[114,115,116,117,118,119,120,121,122,123,124,125,126,127,128,129,130,131],"llm","markdown","pdf","rag","retrieval-augmented-generation","sqlite","vector-search","pgvector","postgres","postgresql","reranking","late-chunking","late-interaction","colbert","evals","query-adapter","chainlit","duckdb","2026-03-27T02:49:30.150509","2026-04-08T03:56:13.797469",[135,140,145,150,155,160],{"id":136,"question_zh":137,"answer_zh":138,"source_url":139},23718,"使用 Ollama 运行本地 LLM 后端时遇到响应格式错误或连接问题怎么办？","该问题已在 LiteLLM v1.72.3 版本中修复。请确保升级依赖：`pip install --upgrade litellm` 或直接更新 RAGLite 到 v1.0 及以上版本。修复后，可以使用如下命令正常运行：`raglite --llm ollama_chat\u002Fqwen3:8b chainlit`。如果使用的是 Qwen3 等具有思维链功能的模型，可能需要添加 `\u002Fnothink` 参数来禁用思考过程以获得正常回复。","https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fissues\u002F108",{"id":141,"question_zh":142,"answer_zh":143,"source_url":144},23719,"转换没有标题（headings）的 PDF 文件时报错 \"ValueError: attempt to get argmin of an empty sequence\" 如何解决？","此错误发生在 PDF 中无法检测到任何标题字体大小时。维护者确认该问题将在后续版本中修复。临时解决方案是修改源码中的 `add_heading_level_metadata` 函数：当 `heading_font_sizes` 数组为空时，跳过最小值计算逻辑，将相关段落视为普通段落处理（例如设置默认层级 idx=6），并继续执行 `pdftext` 解析和 `mdformat` 格式化流程。用户也可以等待官方发布包含此修复的新版本。","https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fissues\u002F88",{"id":146,"question_zh":147,"answer_zh":148,"source_url":149},23720,"RAGLite 是否支持在相似度搜索之前进行元数据过滤（例如多租户场景）？","是的，RAGLite 
支持在相似度搜索前进行元数据过滤。该功能已通过 `_apply_metadata_filter` 实现，并在 PR #160 中引入。使用方法请参考官方 README 中\"插入文档\"章节的示例（见项目主页链接）。无需为每个租户创建独立的数据库，只需在插入或查询时传入相应的元数据过滤条件即可。","https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fissues\u002F159",{"id":151,"question_zh":152,"answer_zh":153,"source_url":154},23721,"使用 markdown_sentence_boundaries 函数时出现 IndexError 越界错误怎么办？","该问题是由于计算出的标题结束索引超出文档实际长度导致的。已在 v0.6.1 版本中修复。修复方案是在获取标题索引时增加边界检查：将 `heading_end` 限制为 `min(char_idx[end_line], len(doc))`，确保不会访问超出 `boundary_probas` 数组范围的索引。建议用户升级到 RAGLite v0.6.1 或更高版本以解决此问题。此外，有用户反馈该问题可能与使用 Microsoft Office 365 转换 PDF 有关，尝试使用 WPS 或 Office 2019 转换可暂时规避。","https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fissues\u002F79",{"id":156,"question_zh":157,"answer_zh":158,"source_url":159},23722,"重复插入相同文档时，如何避免嵌入向量（embeddings）被重新计算以提升效率？","RAGLite 在 v0.6.1 版本中已优化此行为，能够检测并防止不必要的嵌入重计算。当插入已存在的文档时，系统会自动跳过嵌入生成步骤。请确保您使用的是 v0.6.1 或更新版本。如果您仍遇到重复计算问题，请检查是否在每次插入时使用了不同的文档 ID 或元数据，导致系统误判为新文档。","https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fissues\u002F75",{"id":161,"question_zh":162,"answer_zh":163,"source_url":164},23723,"如何评估 RAGLite 中查询适配器（query adapter）的效果？是否有基准测试方法？","目前社区正在进行查询适配器的基准测试实验。推荐方法是：使用合成数据集训练查询适配器（选项 B），将向量搜索改为标准模式（通过 `insert_evals`），并使用强大的 LLM（如 gpt-4o）生成问题。然后在测试集上对比“基线向量搜索”与“向量搜索 + 查询适配器”的效果。完整数据集的基准测试（不含重排序器）正在计划中。更多细节可参考项目 Slack 频道中的讨论记录。","https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fissues\u002F89",[166,171,176,181,186,191,196,201,206,211,216,221,226,231,236,241,246],{"id":167,"version":168,"summary_zh":169,"released_at":170},145205,"v1.0.0","# 🎉  RAGLite v1.0：DuckDB、通义千问3、并行插入、基准测试、更优的检索质量\n\n## 发布亮点\n\n- 🐤 支持 DuckDB (#137)\n- 🐻 支持通义千问3 (#124)\n- ⚡️ 并行文档插入 (#150)\n- 🏁 使用 `raglite bench` 进行基准测试 (#150)\n- 🎯 通过改进多向量搜索、分块质量和分块前言，提升检索质量 (#123, #126, #132)\n- 💎 新增并优化查询适配器算法 (#146, #147, #149)\n\n## 变更内容\n* 修复：避免将 Markdown 转换为 Markdown，由 @joachim-Heirbrant-SL 在 
https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F116 中完成\n* fix: fix single-sentence chunking by @emilradix in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F115\n* fix: consolidate headings and prevent windowed chunking by @emilradix in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F117\n* fix: improve contextual chunk headings by @SimonJasansky in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F118\n* feat: add an option to use single chunk embeddings by @emilradix in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F119\n* feat: add metadata at the document level by @emilradix in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F122\n* feat: add support for reasoning tool use and upgrade to Qwen3 by @lsorber in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F124\n* feat: add a preamble to chunk content by @lsorber in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F126\n* feat: introduce “small chunks” to improve chunking by @lsorber in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F123\n* fix: remove mdformat by @emilradix in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F128\n* feat: rank chunks by the L∞ norm of their multi-vector similarity by @lsorber in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F132\n* feat: enable weighted reciprocal rank fusion by @emilradix in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F136\n* fix: fix off-by-one error in Markdown heading parsing by @joachim-Heirbrant-SL in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F133\n* feat: improve config and API by @lsorber in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F138\n* feat: replace SQLite with DuckDB by @lsorber in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F137\n* ci: skip slow tests in CI by @lsorber in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F139\n* fix: adapt the oversampling strategy to the chunk size by @lsorber in 
https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F140\n* feat: make Pandas an optional dependency by @lsorber in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F141\n* fix: upgrade rerankers and the recommended Cohere models by @lsorber in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F142\n* fix: improve token allocation in late chunking by @lsorber in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F144\n* fix: run a checkpoint after DuckDB inserts by @lsorber in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F145\n* feat: improve the query adapter algorithm by @lsorber in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F146\n* feat: add","2025-06-11T15:55:39",{"id":172,"version":173,"summary_zh":174,"released_at":175},145206,"v0.7.0","## What's Changed\n* feat: make imports faster by @lsorber in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F86\n* fix: avoid chunk ID collisions by @joachim-Heirbrant-SL in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F93\n* feat: add the ability to insert Markdown content directly into the database by @ThomasDelsart in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F96\n* feat: make llama-cpp-python an optional dependency by @rchretien in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F97\n* feat: migrate from poetry-cookiecutter to substrate by @rchretien in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F98\n* chore: upgrade scaffolding by @lsorber in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F105\n* fix: restore the pandoc extra name by @lsorber in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F106\n* docs: improve inline comments by @lsorber in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F107\n* fix: lazily raise module-not-found errors for optional dependencies by @lsorber in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F109\n* feat: compute optimal sentence boundaries by @lsorber in 
https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F110\n* fix: fix CLI entrypoint regression by @lsorber in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F111\n* feat: replace post-processing with declarative optimization by @lsorber in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F112\n\n## New Contributors\n* @joachim-Heirbrant-SL made their first contribution in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F93\n* @ThomasDelsart made their first contribution in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F96\n* @rchretien made their first contribution in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F97\n\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fcompare\u002Fv0.6.2...v0.7.0","2025-03-17T13:22:49",{"id":177,"version":178,"summary_zh":179,"released_at":180},145207,"v0.6.2","## What's Changed\n* fix: remove unnecessary stop sequence by @lsorber in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F84\n\n\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fcompare\u002Fv0.6.1...v0.6.2","2025-01-06T22:27:25",{"id":182,"version":183,"summary_zh":184,"released_at":185},145208,"v0.6.1","## What's Changed\n* fix: conditionally enable `LlamaRAMCache` by @lsorber in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F83\n* fix(deps): exclude litellm versions that break `get_model_info` by @lsorber in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F78\n* fix: improve (re)insertion speed by @lsorber in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F80\n* fix: fix Markdown heading boundaries by @lsorber in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F81\n\n\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fcompare\u002Fv0.6.0...v0.6.1","2025-01-06T14:18:00",{"id":187,"version":188,"summary_zh":189,"released_at":190},145209,"v0.6.0","## What's Changed\n* chore: update _extract.py by @eltociear in 
https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F70\n* feat: improve sentence splitting by @lsorber in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F72\n* feat: add streaming tool use to llama-cpp-python by @lsorber in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F71\n* feat: upgrade from xx_sent_ud_sm to SaT by @lsorber in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F74\n* feat: add support for Python 3.12 by @lsorber in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F69\n* chore: miscellaneous code updates by @lsorber in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F76\n\n## New Contributors\n* @eltociear made their first contribution in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F70\n\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fcompare\u002Fv0.5.1...v0.6.0","2025-01-05T15:39:53",{"id":192,"version":193,"summary_zh":194,"released_at":195},145210,"v0.5.1","## What's Changed\n* fix: improve output for empty databases by @lsorber in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F68\n\n\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fcompare\u002Fv0.5.0...v0.5.1","2024-12-18T15:15:04",{"id":197,"version":198,"summary_zh":199,"released_at":200},145211,"v0.5.0","## What's Changed\n* style: reduce httpx log level by @lsorber in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F59\n* feat: let the LLM decide whether to retrieve context by @lsorber in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F62\n* fix: support pgvector v0.7.0+ by @undo76 in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F63\n* docs: add GitHub star history to README by @MattiaMolon in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F65\n* feat: add MCP server by @lsorber in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F67\n\n## New Contributors\n* @MattiaMolon made their first contribution in 
https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F65\n\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fcompare\u002Fv0.4.1...v0.5.0","2024-12-17T09:49:40",{"id":202,"version":203,"summary_zh":204,"released_at":205},145212,"v0.4.1","## What's Changed\n* fix: support LiteLLM embeddings for Ragas by @undo76 in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F56\n* fix: add and enable OpenAI strict mode by @undo76 in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F55\n\n\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fcompare\u002Fv0.4.0...v0.4.1","2024-12-05T20:50:10",{"id":207,"version":208,"summary_zh":209,"released_at":210},145213,"v0.4.0","## What's Changed\n* feat: improve late chunking and optimize pgvector settings by @lsorber in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F51\n    - Add a workaround for #24 that raises the embedding model's context size from 512 to a user-definable size.\n    - Increase the embedding model's default context size to 1024 tokens (larger values degrade bge-m3's performance).\n    - Upgrade llama-cpp-python to the latest version.\n    - Test rerankers more robustly with Kendall's rank correlation coefficient.\n    - Optimize pgvector's settings.\n    - Offer better control of oversampling in hybrid and vector search.\n    - Upgrade to PostgreSQL 17.\n\n\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fcompare\u002Fv0.3.0...v0.4.0","2024-12-04T16:31:11",{"id":212,"version":213,"summary_zh":214,"released_at":215},145214,"v0.3.0","## What's Changed\n* feat: support prompt caching and apply Anthropic's long-context prompt format by @undo76 in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F52\n\n\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fcompare\u002Fv0.2.1...v0.3.0","2024-12-03T18:26:21",{"id":217,"version":218,"summary_zh":219,"released_at":220},145215,"v0.2.1","## What's Changed\r\n* fix: add fallbacks for model info by @undo76 in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F44\r\n* fix: improve unpacking of keyword search results by @lsorber in 
https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F46\r\n* fix: upgrade rerankers and remove flashrank patch by @lsorber in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F47\r\n* fix: improve structured output extraction and query adapter updates by @emilradix in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F34\r\n\r\n## New Contributors\r\n* @undo76 made their first contribution in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F44\r\n* @emilradix made their first contribution in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F34\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fcompare\u002Fv0.2.0...v0.2.1","2024-11-22T17:12:57",{"id":222,"version":223,"summary_zh":224,"released_at":225},145216,"v0.2.0","## What's Changed\r\n* feat: add Chainlit frontend by @lsorber in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F33\r\n\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fcompare\u002Fv0.1.4...v0.2.0","2024-10-21T15:36:26",{"id":227,"version":228,"summary_zh":229,"released_at":230},145217,"v0.1.4","## What's Changed\r\n* fix: fix optimal chunking edge cases by @lsorber in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F32\r\n\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fcompare\u002Fv0.1.3...v0.1.4","2024-10-15T08:36:58",{"id":232,"version":233,"summary_zh":234,"released_at":235},145218,"v0.1.3","## What's Changed\r\n* fix: improve chunk and segment ordering by @lsorber in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F29\r\n* fix: upgrade pdftext by @lsorber in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F30\r\n* ci: expand test matrix with Python 3.11 by @lsorber in 
https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F31\r\n\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fcompare\u002Fv0.1.2...v0.1.3","2024-10-13T18:25:33",{"id":237,"version":238,"summary_zh":239,"released_at":240},145219,"v0.1.2","## What's Changed\r\n* fix: avoid pdftext v0.3.11 by @lsorber in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F27\r\n\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fcompare\u002Fv0.1.1...v0.1.2","2024-10-08T12:43:49",{"id":242,"version":243,"summary_zh":244,"released_at":245},145220,"v0.1.1","## What's Changed\r\n* docs: improve installation instructions by @lsorber in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F21\r\n* fix: patch rerankers flashrank issue by @lsorber in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F22\r\n\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fcompare\u002Fv0.1.0...v0.1.1","2024-10-07T21:09:48",{"id":247,"version":248,"summary_zh":249,"released_at":250},145221,"v0.1.0","## What's Changed\r\n* feat: implement basic features by @lsorber in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F2\r\n* feat: simplify document insertion by @lsorber in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F6\r\n* feat: optimize config for CPU and GPU by @lsorber in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F7\r\n* fix: improve indexing of multiple documents by @lsorber in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F8\r\n* feat: improve exception feedback for extraction by @lsorber in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F9\r\n* feat: automatically adjust number of RAG contexts by @lsorber in 
https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F10\r\n* fix: lazily import optional dependencies by @lsorber in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F11\r\n* feat: infer missing font sizes by @lsorber in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F12\r\n* feat: add evaluation by @lsorber in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F14\r\n* feat: upgrade default CPU model to Phi-3.5-mini by @lsorber in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F15\r\n* feat: make query adapter minimally invasive by @lsorber in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F16\r\n* feat: add PostgreSQL support by @lsorber in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F18\r\n* feat: add LiteLLM and late chunking by @lsorber in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F19\r\n* feat: add reranking by @lsorber in https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fpull\u002F20\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fsuperlinear-ai\u002Fraglite\u002Fcommits\u002Fv0.1.0","2024-10-07T09:49:03"]