[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-PromtEngineer--localGPT":3,"tool-PromtEngineer--localGPT":64},[4,17,27,35,43,56],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":16},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,3,"2026-04-05T11:01:52",[13,14,15],"开发框架","图像","Agent","ready",{"id":18,"name":19,"github_repo":20,"description_zh":21,"stars":22,"difficulty_score":23,"last_commit_at":24,"category_tags":25,"status":16},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",138956,2,"2026-04-05T11:33:21",[13,15,26],"语言模型",{"id":28,"name":29,"github_repo":30,"description_zh":31,"stars":32,"difficulty_score":23,"last_commit_at":33,"category_tags":34,"status":16},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",107662,"2026-04-03T11:11:01",[13,14,15],{"id":36,"name":37,"github_repo":38,"description_zh":39,"stars":40,"difficulty_score":23,"last_commit_at":41,"category_tags":42,"status":16},3704,"NextChat","ChatGPTNextWeb\u002FNextChat","NextChat 是一款轻量且极速的 AI 助手，旨在为用户提供流畅、跨平台的大模型交互体验。它完美解决了用户在多设备间切换时难以保持对话连续性，以及面对众多 AI 模型不知如何统一管理的痛点。无论是日常办公、学习辅助还是创意激发，NextChat 都能让用户随时随地通过网页、iOS、Android、Windows、MacOS 或 Linux 端无缝接入智能服务。\n\n这款工具非常适合普通用户、学生、职场人士以及需要私有化部署的企业团队使用。对于开发者而言，它也提供了便捷的自托管方案，支持一键部署到 Vercel 或 Zeabur 等平台。\n\nNextChat 的核心亮点在于其广泛的模型兼容性，原生支持 Claude、DeepSeek、GPT-4 及 Gemini Pro 等主流大模型，让用户在一个界面即可自由切换不同 AI 能力。此外，它还率先支持 MCP（Model Context Protocol）协议，增强了上下文处理能力。针对企业用户，NextChat 提供专业版解决方案，具备品牌定制、细粒度权限控制、内部知识库整合及安全审计等功能，满足公司对数据隐私和个性化管理的高标准要求。",87618,"2026-04-05T07:20:52",[13,26],{"id":44,"name":45,"github_repo":46,"description_zh":47,"stars":48,"difficulty_score":23,"last_commit_at":49,"category_tags":50,"status":16},2268,"ML-For-Beginners","microsoft\u002FML-For-Beginners","ML-For-Beginners 是由微软推出的一套系统化机器学习入门课程，旨在帮助零基础用户轻松掌握经典机器学习知识。这套课程将学习路径规划为 12 周，包含 26 节精炼课程和 52 
道配套测验，内容涵盖从基础概念到实际应用的完整流程，有效解决了初学者面对庞大知识体系时无从下手、缺乏结构化指导的痛点。\n\n无论是希望转型的开发者、需要补充算法背景的研究人员，还是对人工智能充满好奇的普通爱好者，都能从中受益。课程不仅提供了清晰的理论讲解，还强调动手实践，让用户在循序渐进中建立扎实的技能基础。其独特的亮点在于强大的多语言支持，通过自动化机制提供了包括简体中文在内的 50 多种语言版本，极大地降低了全球不同背景用户的学习门槛。此外，项目采用开源协作模式，社区活跃且内容持续更新，确保学习者能获取前沿且准确的技术资讯。如果你正寻找一条清晰、友好且专业的机器学习入门之路，ML-For-Beginners 将是理想的起点。",84991,"2026-04-05T10:45:23",[14,51,52,53,15,54,26,13,55],"数据工具","视频","插件","其他","音频",{"id":57,"name":58,"github_repo":59,"description_zh":60,"stars":61,"difficulty_score":10,"last_commit_at":62,"category_tags":63,"status":16},3128,"ragflow","infiniflow\u002Fragflow","RAGFlow 是一款领先的开源检索增强生成（RAG）引擎，旨在为大语言模型构建更精准、可靠的上下文层。它巧妙地将前沿的 RAG 技术与智能体（Agent）能力相结合，不仅支持从各类文档中高效提取知识，还能让模型基于这些知识进行逻辑推理和任务执行。\n\n在大模型应用中，幻觉问题和知识滞后是常见痛点。RAGFlow 通过深度解析复杂文档结构（如表格、图表及混合排版），显著提升了信息检索的准确度，从而有效减少模型“胡编乱造”的现象，确保回答既有据可依又具备时效性。其内置的智能体机制更进一步，使系统不仅能回答问题，还能自主规划步骤解决复杂问题。\n\n这款工具特别适合开发者、企业技术团队以及 AI 研究人员使用。无论是希望快速搭建私有知识库问答系统，还是致力于探索大模型在垂直领域落地的创新者，都能从中受益。RAGFlow 提供了可视化的工作流编排界面和灵活的 API 接口，既降低了非算法背景用户的上手门槛，也满足了专业开发者对系统深度定制的需求。作为基于 Apache 2.0 协议开源的项目，它正成为连接通用大模型与行业专有知识之间的重要桥梁。",77062,"2026-04-04T04:44:48",[15,14,13,26,54],{"id":65,"github_repo":66,"name":67,"description_en":68,"description_zh":69,"ai_summary_zh":69,"readme_en":70,"readme_zh":71,"quickstart_zh":72,"use_case_zh":73,"hero_image_url":74,"owner_login":75,"owner_name":76,"owner_avatar_url":77,"owner_bio":78,"owner_company":79,"owner_location":79,"owner_email":79,"owner_twitter":80,"owner_website":81,"owner_url":82,"languages":83,"stars":107,"forks":108,"last_commit_at":109,"license":110,"difficulty_score":10,"env_os":111,"env_gpu":112,"env_ram":113,"env_deps":114,"category_tags":126,"github_topics":79,"view_count":23,"oss_zip_url":79,"oss_zip_packed_at":79,"status":16,"created_at":127,"updated_at":128,"faqs":129,"releases":159},3300,"PromtEngineer\u002FlocalGPT","localGPT","Chat with your documents on your local device using GPT models. No data leaves your device and 100% private. 
","localGPT 是一款完全私密的本地文档智能平台，让你能在自己的设备上直接与各类文档（如 PDF、Word、TXT 等）进行对话。它彻底解决了用户在使用云端 AI 服务时对数据泄露的担忧，确保所有数据处理均在本地完成，无需联网，实现 100% 隐私安全。\n\n无论是需要快速总结长篇报告、从海量资料中检索关键信息，还是对特定内容进行深度问答，localGPT 都能轻松胜任。它不仅适合注重数据合规的企业用户和研究人员，也面向希望在不依赖外部服务器的情况下探索大模型能力的开发者及普通个人用户。\n\n在技术层面，localGPT 超越了传统的检索增强生成（RAG）方案。它内置混合搜索引擎，巧妙结合了语义相似度、关键词匹配及先进的\"Late Chunking\"技术，以提升长文本处理的精准度。系统还配备智能路由机制，能自动判断是直接由大模型回答还是调用检索流程，并通过上下文剪枝和独立验证环节进一步保障答案质量。此外，它架构轻量模块化，支持 CPU、GPU 等多种硬件加速，并可灵活接入 Ollama 托管的各类开源模型，部署与维护十分简便。","# LocalGPT - Private Document Intelligence Platform\n\n\u003Cdiv align=\"center\">\n\n\u003Cp align=\"center\">\n\u003Ca href=\"https:\u002F\u002Ftrendshift.io\u002Frepositories\u002F2947\" target=\"_blank\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FPromtEngineer_localGPT_readme_4cc089988f35.png\" alt=\"PromtEngineer%2FlocalGPT | Trendshift\" style=\"width: 250px; height: 55px;\" width=\"250\" height=\"55\"\u002F>\u003C\u002Fa>\n\u003C\u002Fp>\n\n[![GitHub Stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FPromtEngineer\u002FlocalGPT?style=flat-square)](https:\u002F\u002Fgithub.com\u002FPromtEngineer\u002FlocalGPT\u002Fstargazers)\n[![GitHub Forks](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fforks\u002FPromtEngineer\u002FlocalGPT?style=flat-square)](https:\u002F\u002Fgithub.com\u002FPromtEngineer\u002FlocalGPT\u002Fnetwork\u002Fmembers)\n[![GitHub Issues](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fissues\u002FPromtEngineer\u002FlocalGPT?style=flat-square)](https:\u002F\u002Fgithub.com\u002FPromtEngineer\u002FlocalGPT\u002Fissues)\n[![GitHub Pull Requests](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fissues-pr\u002FPromtEngineer\u002FlocalGPT?style=flat-square)](https:\u002F\u002Fgithub.com\u002FPromtEngineer\u002FlocalGPT\u002Fpulls)\n[![Python 3.8+](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fpython-3.8+-blue.svg?style=flat-square)](https:\u002F\u002Fwww.python.org\u002Fdownloads\u002F)\n[![License](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Flicense-MIT-green.svg?style=flat-square)](LICENSE)\n[![Docker](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fdocker-supported-blue.svg?style=flat-square)](https:\u002F\u002Fwww.docker.com\u002F)\n\n\u003Cp align=\"center\">\n    \u003Ca href=\"https:\u002F\u002Fx.com\u002Fengineerrprompt\">\n      \u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FFollow%20on%20X-000000?style=for-the-badge&logo=x&logoColor=white\" alt=\"Follow on X\" \u002F>\n    \u003C\u002Fa>\n    \u003Ca href=\"https:\u002F\u002Fdiscord.gg\u002FtUDWAFGc\">\n      \u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FJoin%20our%20Discord-5865F2?style=for-the-badge&logo=discord&logoColor=white\" alt=\"Join our Discord\" \u002F>\n    \u003C\u002Fa>\n  \u003C\u002Fp>\n\u003C\u002Fdiv>\n\n## 🚀 What is LocalGPT?\n\nLocalGPT is a **fully private, on-premise Document Intelligence platform**. Ask questions, summarise, and uncover insights from your files with state-of-the-art AI—no data ever leaves your machine.\n\nMore than a traditional RAG (Retrieval-Augmented Generation) tool, LocalGPT features a **hybrid search engine** that blends semantic similarity, keyword matching, and [Late Chunking](https:\u002F\u002Fjina.ai\u002Fnews\u002Flate-chunking-in-long-context-embedding-models\u002F) for long-context precision. 
A **smart router** automatically selects between RAG and direct LLM answering for every query, while **contextual enrichment** and sentence-level [Context Pruning](https:\u002F\u002Fhuggingface.co\u002Fnaver\u002Fprovence-reranker-debertav3-v1) surface only the most relevant content. An independent **verification** pass adds an extra layer of accuracy.\n\nThe architecture is **modular and lightweight**—enable only the components you need. With a pure-Python core and minimal framework and library dependencies, LocalGPT is simple to deploy, run, and maintain on any infrastructure.\n\n## ▶️ Video\nWatch this [video](https:\u002F\u002Fyoutu.be\u002FJTbtGH3secI) to get started with LocalGPT. \n\n| Home | Create Index | Chat |\n|------|--------------|------|\n| ![](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FPromtEngineer_localGPT_readme_a1cdbea25610.png) | ![](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FPromtEngineer_localGPT_readme_1442ad929804.png) | ![](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FPromtEngineer_localGPT_readme_51c6fdeec29e.png) |\n\n## ✨ Features\n\n- **Utmost Privacy**: Your data remains on your computer, ensuring 100% security.\n- **Versatile Model Support**: Seamlessly integrate a variety of open-source models via Ollama.\n- **Diverse Embeddings**: Choose from a range of open-source embeddings.\n- **Reuse Your LLM**: Once downloaded, reuse your LLM without the need for repeated downloads.\n- **Chat History**: Remembers your previous conversations (in a session).\n- **API**: LocalGPT has an API that you can use for building RAG Applications.\n- **GPU, CPU, HPU & MPS Support**: Supports multiple platforms out of the box: chat with your data using `CUDA`, `CPU`, `HPU (Intel® Gaudi®)`, `MPS`, and more!\n\n### 📖 Document Processing\n- **Multi-format Support**: PDF, DOCX, TXT, Markdown, and more (Currently only PDF is supported)\n- **Contextual Enrichment**: Enhanced document understanding with AI-generated context, inspired by [Contextual Retrieval](https:\u002F\u002Fwww.anthropic.com\u002Fnews\u002Fcontextual-retrieval)\n- **Batch Processing**: Handle multiple documents simultaneously\n\n### 🤖 AI-Powered Chat\n- **Natural Language Queries**: Ask questions in plain English\n- **Source Attribution**: Every answer includes document references\n- **Smart Routing**: Automatically chooses between RAG and direct LLM responses (see the sketch below)\n- **Query Decomposition**: Breaks complex queries into sub-questions for better answers\n- **Semantic Caching**: TTL-based caching with similarity matching for faster responses\n- **Session-Aware History**: Maintains conversation context across interactions\n- **Answer Verification**: Independent verification pass for accuracy\n- **Multiple AI Models**: Ollama for inference, HuggingFace for embeddings and reranking\n\n\n### 🛠️ Developer-Friendly\n- **RESTful APIs**: Complete API access for integration\n- **Real-time Progress**: Live updates during document processing\n- **Flexible Configuration**: Customize models, chunk sizes, and search parameters\n- **Extensible Architecture**: Plugin system for custom components\n\n### 🎨 Modern Interface\n- **Intuitive Web UI**: Clean, responsive design\n- **Session Management**: Organize conversations by topic\n- **Index Management**: Easy document collection management\n- **Real-time Chat**: Streaming responses for immediate 
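feedback\n\nAs a rough illustration of the smart-routing idea described above, here is a minimal triage sketch. It is an illustrative example rather than the repository's actual router: it assumes a local Ollama server on the default port and reuses the lightweight `qwen3:0.6b` model pulled during setup.\n\n```python\nimport requests\n\nOLLAMA_URL = 'http:\u002F\u002Flocalhost:11434\u002Fapi\u002Fgenerate'  # default local Ollama endpoint\n\ndef route(query: str) -> str:\n    # Ask a small local model whether retrieval is needed for this query.\n    prompt = ('Reply with exactly RAG if answering needs the indexed documents, '\n              'or LLM_DIRECT if general knowledge suffices. Question: ' + query)\n    resp = requests.post(OLLAMA_URL, json={\n        'model': 'qwen3:0.6b',  # lightweight routing\u002Fenrichment model\n        'prompt': prompt,\n        'stream': False,\n    })\n    decision = resp.json()['response']\n    return 'RAG' if 'RAG' in decision.upper() else 'LLM_DIRECT'\n\nprint(route('What does section 3 of the uploaded contract say?'))\n```\n\nIn the full system the triage step also sees document overviews and chat history, as the retrieval-agent diagram further down suggests.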
\n\n---\n\n## 🚀 Quick Start\n\nNote: The installation is currently only tested on macOS. \n\n### Prerequisites\n- Python 3.8 or higher (tested with Python 3.11.5)\n- Node.js 16+ and npm (tested with Node.js 23.10.0, npm 10.9.2)\n- Docker (optional, for containerized deployment)\n- 8GB+ RAM (16GB+ recommended)\n- Ollama (required for both deployment approaches)\n\n### ***NOTE***\nBefore this branch is merged into the main branch, please clone this branch for installation:\n\n```bash\ngit clone -b localgpt-v2 https:\u002F\u002Fgithub.com\u002FPromtEngineer\u002FlocalGPT.git\ncd localGPT\n```\n\n### Option 1: Docker Deployment \n\n```bash\n# Clone the repository\ngit clone https:\u002F\u002Fgithub.com\u002FPromtEngineer\u002FlocalGPT.git\ncd localGPT\n\n# Install Ollama locally (required even for Docker)\ncurl -fsSL https:\u002F\u002Follama.ai\u002Finstall.sh | sh\nollama pull qwen3:0.6b\nollama pull qwen3:8b\n\n# Start Ollama\nollama serve\n\n# Start with Docker (in a new terminal)\n.\u002Fstart-docker.sh\n\n# Access the application\nopen http:\u002F\u002Flocalhost:3000\n```\n\n**Docker Management Commands:**\n```bash\n# Check container status\ndocker compose ps\n\n# View logs\ndocker compose logs -f\n\n# Stop containers\n.\u002Fstart-docker.sh stop\n```\n\n### Option 2: Direct Development (Recommended for Development)\n\n```bash\n# Clone the repository\ngit clone https:\u002F\u002Fgithub.com\u002FPromtEngineer\u002FlocalGPT.git\ncd localGPT\n\n# Install Python dependencies\npip install -r requirements.txt\n\n# Key dependencies installed:\n# - torch==2.4.1, transformers==4.51.0 (AI models)\n# - lancedb (vector database)\n# - rank_bm25, fuzzywuzzy (search algorithms)\n# - sentence_transformers, rerankers (embedding\u002Freranking)\n# - docling (document processing)\n# - colpali-engine (multimodal processing - support coming soon)\n\n# Install Node.js dependencies\nnpm install\n\n# Install and start Ollama\ncurl -fsSL https:\u002F\u002Follama.ai\u002Finstall.sh | sh\nollama pull qwen3:0.6b\nollama pull qwen3:8b\nollama serve\n\n# Start the system (in a new terminal)\npython run_system.py\n\n# Access the application\nopen http:\u002F\u002Flocalhost:3000\n```\n\n**System Management:**\n```bash\n# Check system health (comprehensive diagnostics)\npython system_health_check.py\n\n# Check service status and health\npython run_system.py --health\n\n# Start in production mode\npython run_system.py --mode prod\n\n# Skip frontend (backend + RAG API only)\npython run_system.py --no-frontend\n\n# View aggregated logs\npython run_system.py --logs-only\n\n# Stop all services\npython run_system.py --stop\n# Or press Ctrl+C in the terminal running python run_system.py\n```\n\n**Service Architecture:**\nThe `run_system.py` launcher manages four key services:\n- **Ollama Server** (port 11434): AI model serving\n- **RAG API Server** (port 8001): Document processing and retrieval\n- **Backend Server** (port 8000): Session management and API endpoints\n- **Frontend Server** (port 3000): React\u002FNext.js web interface\n\n### Option 3: Manual Component Startup\n\n```bash\n# Terminal 1: Start Ollama\nollama serve\n\n# Terminal 2: Start RAG API\npython -m rag_system.api_server\n\n# Terminal 3: Start Backend\ncd backend && python server.py\n\n# Terminal 4: Start Frontend\nnpm run dev\n\n# Access at http:\u002F\u002Flocalhost:3000\n```\n\n---\n\n### Detailed Installation\n\n#### 1. 
Install System Dependencies\n\n**Ubuntu\u002FDebian:**\n```bash\nsudo apt update\nsudo apt install python3.8 python3-pip nodejs npm docker.io docker-compose\n```\n\n**macOS:**\n```bash\nbrew install python@3.8 node npm docker docker-compose\n```\n\n**Windows:**\n```bash\n# Install Python 3.8+, Node.js, and Docker Desktop\n# Then use PowerShell or WSL2\n```\n\n#### 2. Install AI Models\n\n**Install Ollama (Recommended):**\n```bash\n# Install Ollama\ncurl -fsSL https:\u002F\u002Follama.ai\u002Finstall.sh | sh\n\n# Pull recommended models\nollama pull qwen3:0.6b          # Fast generation model\nollama pull qwen3:8b            # High-quality generation model\n```\n\n#### 3. Configure Environment\n\n```bash\n# Copy environment template\ncp .env.example .env\n\n# Edit configuration\nnano .env\n```\n\n**Key Configuration Options:**\n```env\n# AI Models (referenced in rag_system\u002Fmain.py)\nOLLAMA_HOST=http:\u002F\u002Flocalhost:11434\n\n# Database Paths (used by backend and RAG system)\nDATABASE_PATH=.\u002Fbackend\u002Fchat_data.db\nVECTOR_DB_PATH=.\u002Flancedb\n\n# Server Settings (used by run_system.py)\nBACKEND_PORT=8000\nFRONTEND_PORT=3000\nRAG_API_PORT=8001\n\n# Optional: Override default models\nGENERATION_MODEL=qwen3:8b\nENRICHMENT_MODEL=qwen3:0.6b\nEMBEDDING_MODEL=Qwen\u002FQwen3-Embedding-0.6B\nRERANKER_MODEL=answerdotai\u002Fanswerai-colbert-small-v1\n```\n\n#### 4. Initialize the System\n\n```bash\n# Run system health check\npython system_health_check.py\n\n# Initialize databases\npython -c \"from backend.database import ChatDatabase; ChatDatabase().init_database()\"\n\n# Test installation\npython -c \"from rag_system.main import get_agent; print('✅ Installation successful!')\"\n\n# Validate complete setup\npython run_system.py --health\n```\n\n---\n\n## 🎯 Getting Started\n\n### 1. Create Your First Index\n\nAn **index** is a collection of processed documents that you can chat with.\n\n#### Using the Web Interface:\n1. Open http:\u002F\u002Flocalhost:3000\n2. Click \"Create New Index\"\n3. Upload your documents (PDF, DOCX, TXT)\n4. Configure processing options\n5. Click \"Build Index\"\n\n#### Using Scripts:\n```bash\n# Simple script approach\n.\u002Fsimple_create_index.sh \"My Documents\" \"path\u002Fto\u002Fdocument.pdf\"\n\n# Interactive script\npython create_index_script.py\n```\n\n#### Using API:\n```bash\n# Create index\ncurl -X POST http:\u002F\u002Flocalhost:8000\u002Findexes \\\n  -H \"Content-Type: application\u002Fjson\" \\\n  -d '{\"name\": \"My Index\", \"description\": \"My documents\"}'\n\n# Upload documents\ncurl -X POST http:\u002F\u002Flocalhost:8000\u002Findexes\u002FINDEX_ID\u002Fupload \\\n  -F \"files=@document.pdf\"\n\n# Build index\ncurl -X POST http:\u002F\u002Flocalhost:8000\u002Findexes\u002FINDEX_ID\u002Fbuild\n```\n\n### 2. Start Chatting\n\nOnce your index is built:\n\n1. **Create a Chat Session**: Click \"New Chat\" or use an existing session\n2. **Select Your Index**: Choose which document collection to query\n3. **Ask Questions**: Type natural language questions about your documents\n4. **Get Answers**: Receive AI-generated responses with source citations\n\n### 3. 
Advanced Features\n\n#### Custom Model Configuration\n```bash\n# Use different models for different tasks\ncurl -X POST http:\u002F\u002Flocalhost:8000\u002Fsessions \\\n  -H \"Content-Type: application\u002Fjson\" \\\n  -d '{\n    \"title\": \"High Quality Session\",\n    \"model\": \"qwen3:8b\",\n    \"embedding_model\": \"Qwen\u002FQwen3-Embedding-4B\"\n  }'\n```\n\n#### Batch Document Processing\n```bash\n# Process multiple documents at once\npython demo_batch_indexing.py --config batch_indexing_config.json\n```\n\n#### API Integration\n```python\nimport requests\n\n# Chat with your documents via API\nresponse = requests.post('http:\u002F\u002Flocalhost:8000\u002Fchat', json={\n    'query': 'What are the key findings in the research papers?',\n    'session_id': 'your-session-id',\n    'search_type': 'hybrid',\n    'retrieval_k': 20\n})\n\nprint(response.json()['response'])\n```\n\n---\n\n## 🔧 Configuration\n\n### Model Configuration\n\nLocalGPT supports multiple AI model providers with centralized configuration:\n\n#### Ollama Models (Local Inference)\n```python\nOLLAMA_CONFIG = {\n    \"host\": \"http:\u002F\u002Flocalhost:11434\",\n    \"generation_model\": \"qwen3:8b\",        # Main text generation\n    \"enrichment_model\": \"qwen3:0.6b\"       # Lightweight routing\u002Fenrichment\n}\n```\n\n#### External Models (HuggingFace Direct)\n```python\nEXTERNAL_MODELS = {\n    \"embedding_model\": \"Qwen\u002FQwen3-Embedding-0.6B\",           # 1024 dimensions\n    \"reranker_model\": \"answerdotai\u002Fanswerai-colbert-small-v1\", # ColBERT reranker\n    \"fallback_reranker\": \"BAAI\u002Fbge-reranker-base\"             # Backup reranker\n}\n```\n\n### Pipeline Configuration\n\nLocalGPT offers two main pipeline configurations:\n\n#### Default Pipeline (Production-Ready)\n```python\n\"default\": {\n    \"description\": \"Production-ready pipeline with hybrid search, AI reranking, and verification\",\n    \"storage\": {\n        \"lancedb_uri\": \".\u002Flancedb\",\n        \"text_table_name\": \"text_pages_v3\",\n        \"bm25_path\": \".\u002Findex_store\u002Fbm25\"\n    },\n    \"retrieval\": {\n        \"retriever\": \"multivector\",\n        \"search_type\": \"hybrid\",\n        \"late_chunking\": {\"enabled\": True},\n        \"dense\": {\"enabled\": True, \"weight\": 0.7},\n        \"bm25\": {\"enabled\": True}\n    },\n    \"reranker\": {\n        \"enabled\": True,\n        \"type\": \"ai\",\n        \"strategy\": \"rerankers-lib\",\n        \"model_name\": \"answerdotai\u002Fanswerai-colbert-small-v1\",\n        \"top_k\": 10\n    },\n    \"query_decomposition\": {\"enabled\": True, \"max_sub_queries\": 3},\n    \"verification\": {\"enabled\": True},\n    \"retrieval_k\": 20,\n    \"contextual_enricher\": {\"enabled\": True, \"window_size\": 1}\n}\n```\n\n#### Fast Pipeline (Speed-Optimized)\n```python\n\"fast\": {\n    \"description\": \"Speed-optimized pipeline with minimal overhead\",\n    \"retrieval\": {\n        \"search_type\": \"vector_only\",\n        \"late_chunking\": {\"enabled\": False}\n    },\n    \"reranker\": {\"enabled\": False},\n    \"query_decomposition\": {\"enabled\": False},\n    \"verification\": {\"enabled\": False},\n    \"retrieval_k\": 10,\n    \"contextual_enricher\": {\"enabled\": False}\n}\n```\n\n### Search Configuration\n\n```python\nSEARCH_CONFIG = {\n    'hybrid': {\n        'dense_weight': 0.7,\n        'sparse_weight': 0.3,\n        'retrieval_k': 20,\n        'reranker_top_k': 10\n    }\n}\n```\n---\n\n## 🛠️ Troubleshooting\n\n### Common 
Issues\n\n#### Installation Problems\n```bash\n# Check Python version\npython --version  # Should be 3.8+\n\n# Check dependencies\npip list | grep -E \"(torch|transformers|lancedb)\"\n\n# Reinstall dependencies\npip install -r requirements.txt --force-reinstall\n```\n\n#### Model Loading Issues\n```bash\n# Check Ollama status\nollama list\ncurl http:\u002F\u002Flocalhost:11434\u002Fapi\u002Ftags\n\n# Pull missing models\nollama pull qwen3:0.6b\n```\n\n#### Database Issues\n```bash\n# Check database connectivity\npython -c \"from backend.database import ChatDatabase; db = ChatDatabase(); print('✅ Database OK')\"\n\n# Reset database (WARNING: This deletes all data)\nrm backend\u002Fchat_data.db\npython -c \"from backend.database import ChatDatabase; ChatDatabase().init_database()\"\n```\n\n#### Performance Issues\n```bash\n# Check system resources\npython system_health_check.py\n\n# Monitor memory usage\nhtop  # or Task Manager on Windows\n\n# Optimize for low-memory systems\nexport PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:512\n```\n\n### Getting Help\n\n1. **Check Logs**: The system creates structured logs in the `logs\u002F` directory:\n   - `logs\u002Fsystem.log`: Main system events and errors\n   - `logs\u002Follama.log`: Ollama server logs\n   - `logs\u002Frag-api.log`: RAG API processing logs\n   - `logs\u002Fbackend.log`: Backend server logs\n   - `logs\u002Ffrontend.log`: Frontend build and runtime logs\n\n2. **System Health**: Run comprehensive diagnostics:\n   ```bash\n   python system_health_check.py  # Full system diagnostics\n   python run_system.py --health  # Service status check\n   ```\n\n3. **Health Endpoints**: Check individual service health:\n   - Backend: `http:\u002F\u002Flocalhost:8000\u002Fhealth`\n   - RAG API: `http:\u002F\u002Flocalhost:8001\u002Fhealth`\n   - Ollama: `http:\u002F\u002Flocalhost:11434\u002Fapi\u002Ftags`\n\n4. **Documentation**: Check the [Technical Documentation](TECHNICAL_DOCS.md)\n5. **GitHub Issues**: Report bugs and request features\n6. 
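**Community**: Join our Discord\u002FSlack community\n\nAs a quick programmatic check of the health endpoints listed above, the following minimal sketch polls each service. It is an illustrative example assuming the default ports; `requests` is the only dependency.\n\n```python\nimport requests\n\n# Default service endpoints (adjust the ports if you overrode them in .env)\nENDPOINTS = {\n    'backend': 'http:\u002F\u002Flocalhost:8000\u002Fhealth',\n    'rag-api': 'http:\u002F\u002Flocalhost:8001\u002Fhealth',\n    'ollama': 'http:\u002F\u002Flocalhost:11434\u002Fapi\u002Ftags',\n}\n\nfor name, url in ENDPOINTS.items():\n    try:\n        status = requests.get(url, timeout=5).status_code\n        print(name, 'OK' if status == 200 else 'HTTP ' + str(status))\n    except requests.exceptions.ConnectionError:\n        print(name, 'DOWN')\n```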
\n\n---\n\n## 🔗 API Reference\n\n### Core Endpoints\n\n#### Chat API\n```http\n# Session-based chat (recommended)\nPOST \u002Fsessions\u002F{session_id}\u002Fchat\nContent-Type: application\u002Fjson\n\n{\n  \"query\": \"What are the main topics discussed?\",\n  \"search_type\": \"hybrid\",\n  \"retrieval_k\": 20,\n  \"ai_rerank\": true,\n  \"context_window_size\": 5\n}\n\n# Legacy chat endpoint\nPOST \u002Fchat\nContent-Type: application\u002Fjson\n\n{\n  \"query\": \"What are the main topics discussed?\",\n  \"session_id\": \"uuid\",\n  \"search_type\": \"hybrid\",\n  \"retrieval_k\": 20\n}\n```\n\n#### Index Management\n```http\n# Create index\nPOST \u002Findexes\nContent-Type: application\u002Fjson\n{\n  \"name\": \"My Index\",\n  \"description\": \"Description\",\n  \"config\": \"default\"\n}\n\n# Get all indexes\nGET \u002Findexes\n\n# Get specific index\nGET \u002Findexes\u002F{id}\n\n# Upload documents to index\nPOST \u002Findexes\u002F{id}\u002Fupload\nContent-Type: multipart\u002Fform-data\nfiles: [file1.pdf, file2.pdf, ...]\n\n# Build index (process uploaded documents)\nPOST \u002Findexes\u002F{id}\u002Fbuild\nContent-Type: application\u002Fjson\n{\n  \"config_mode\": \"default\",\n  \"enable_enrich\": true,\n  \"chunk_size\": 512\n}\n\n# Delete index\nDELETE \u002Findexes\u002F{id}\n```\n\n#### Session Management\n```http\n# Create session\nPOST \u002Fsessions\nContent-Type: application\u002Fjson\n{\n  \"title\": \"My Session\",\n  \"model\": \"qwen3:0.6b\"\n}\n\n# Get all sessions\nGET \u002Fsessions\n\n# Get specific session\nGET \u002Fsessions\u002F{session_id}\n\n# Get session documents\nGET \u002Fsessions\u002F{session_id}\u002Fdocuments\n\n# Get session indexes\nGET \u002Fsessions\u002F{session_id}\u002Findexes\n\n# Link index to session\nPOST \u002Fsessions\u002F{session_id}\u002Findexes\u002F{index_id}\n\n# Delete session\nDELETE \u002Fsessions\u002F{session_id}\n\n# Rename session\nPOST \u002Fsessions\u002F{session_id}\u002Frename\nContent-Type: application\u002Fjson\n{\n  \"new_title\": \"Updated Session Name\"\n}\n```\n\n### Advanced Features\n\n#### Query Decomposition\nThe system can break complex queries into sub-questions for better answers:\n```http\nPOST \u002Fsessions\u002F{session_id}\u002Fchat\nContent-Type: application\u002Fjson\n\n{\n  \"query\": \"Compare the methodologies and analyze their effectiveness\",\n  \"query_decompose\": true,\n  \"compose_sub_answers\": true\n}\n```\n\n#### Answer Verification\nIndependent verification pass for accuracy using a separate verification model:\n```http\nPOST \u002Fsessions\u002F{session_id}\u002Fchat\nContent-Type: application\u002Fjson\n\n{\n  \"query\": \"What are the key findings?\",\n  \"verify\": true\n}\n```\n\n#### Contextual Enrichment\nDocument context enrichment during indexing for better understanding:\n```bash\n# Enable during index building\nPOST \u002Findexes\u002F{id}\u002Fbuild\n{\n  \"enable_enrich\": true,\n  \"window_size\": 2\n}\n```\n\n#### Late Chunking\nBetter context preservation by chunking after embedding:\n```bash\n# Configure in pipeline\n\"late_chunking\": {\"enabled\": true}\n```\n\n#### Streaming Chat\n```http\nPOST \u002Fchat\u002Fstream\nContent-Type: application\u002Fjson\n\n{\n  \"query\": \"Explain the methodology\",\n  \"session_id\": \"uuid\",\n  \"stream\": true\n}\n```\n\n#### Batch Processing\n```bash\n# Using the batch indexing script\npython demo_batch_indexing.py --config batch_indexing_config.json\n\n# Example batch 
configuration (batch_indexing_config.json):\n{\n  \"index_name\": \"Sample Batch Index\",\n  \"index_description\": \"Example batch index configuration\",\n  \"documents\": [\n    \".\u002Frag_system\u002Fdocuments\u002Finvoice_1039.pdf\",\n    \".\u002Frag_system\u002Fdocuments\u002Finvoice_1041.pdf\"\n  ],\n  \"processing\": {\n    \"chunk_size\": 512,\n    \"chunk_overlap\": 64,\n    \"enable_enrich\": true,\n    \"enable_latechunk\": true,\n    \"enable_docling\": true,\n    \"embedding_model\": \"Qwen\u002FQwen3-Embedding-0.6B\",\n    \"generation_model\": \"qwen3:0.6b\",\n    \"retrieval_mode\": \"hybrid\",\n    \"window_size\": 2\n  }\n}\n```\n\n```http\n# API endpoint for batch processing\nPOST \u002Fbatch\u002Findex\nContent-Type: application\u002Fjson\n\n{\n  \"file_paths\": [\"doc1.pdf\", \"doc2.pdf\"],\n  \"config\": {\n    \"chunk_size\": 512,\n    \"enable_enrich\": true,\n    \"enable_latechunk\": true,\n    \"enable_docling\": true\n  }\n}\n```\n\nFor complete API documentation, see [API_REFERENCE.md](API_REFERENCE.md).\n\n---\n\n## 🏗️ Architecture\n\nLocalGPT is built with a modular, scalable architecture:\n\n```mermaid\ngraph TB\n    UI[Web Interface] --> API[Backend API]\n    API --> Agent[RAG Agent]\n    Agent --> Retrieval[Retrieval Pipeline]\n    Agent --> Generation[Generation Pipeline]\n\n    Retrieval --> Vector[Vector Search]\n    Retrieval --> BM25[BM25 Search]\n    Retrieval --> Rerank[Reranking]\n\n    Vector --> LanceDB[(LanceDB)]\n    BM25 --> BM25DB[(BM25 Index)]\n\n    Generation --> Ollama[Ollama Models]\n    Generation --> HF[Hugging Face Models]\n\n    API --> SQLite[(SQLite DB)]\n```\n\nOverview of the Retrieval Agent\n\n```mermaid\ngraph TD\n    classDef llmcall fill:#e6f3ff,stroke:#007bff;\n    classDef pipeline fill:#e6ffe6,stroke:#28a745;\n    classDef cache fill:#fff3e0,stroke:#fd7e14;\n    classDef logic fill:#f8f9fa,stroke:#6c757d;\n    classDef thread stroke-dasharray: 5 5;\n\n    A(Start: Agent.run) --> B_asyncio.run(_run_async);\n    B --> C{_run_async};\n\n    C --> C1[Get Chat History];\n    C1 --> T1[Build Triage Prompt \u003Cbr\u002F> Query + Doc Overviews ];\n    T1 --> T2[\"(asyncio.to_thread)\u003Cbr\u002F>LLM Triage: RAG or LLM_DIRECT?\"]; class T2 llmcall,thread;\n    T2 --> T3{Decision?};\n\n    T3 -- RAG --> RAG_Path;\n    T3 -- LLM_DIRECT --> LLM_Path;\n\n    subgraph RAG Path\n        RAG_Path --> R1[Format Query + History];\n        R1 --> R2[\"(asyncio.to_thread)\u003Cbr\u002F>Generate Query Embedding\"]; class R2 pipeline,thread;\n        R2 --> R3{{Check Semantic Cache}}; class R3 cache;\n        R3 -- Hit --> R_Cache_Hit(Return Cached Result);\n        R_Cache_Hit --> R_Hist_Update;\n        R3 -- Miss --> R4{Decomposition \u003Cbr\u002F> Enabled?};\n\n        R4 -- Yes --> R5[\"(asyncio.to_thread)\u003Cbr\u002F>Decompose Raw Query\"]; class R5 llmcall,thread;\n        R5 --> R6{{Run Sub-Queries \u003Cbr\u002F> Parallel RAG Pipeline}}; class R6 pipeline,thread;\n        R6 --> R7[Collect Results & Docs];\n        R7 --> R8[\"(asyncio.to_thread)\u003Cbr\u002F>Compose Final Answer\"]; class R8 llmcall,thread;\n        R8 --> V1(RAG Answer);\n\n        R4 -- No --> R9[\"(asyncio.to_thread)\u003Cbr\u002F>Run Single Query \u003Cbr\u002F>(RAG Pipeline)\"]; class R9 pipeline,thread;\n        R9 --> V1;\n\n        V1 --> V2{{Verification \u003Cbr\u002F> await verify_async}}; class V2 llmcall;\n        V2 --> V3(Final RAG Result);\n        V3 --> R_Cache_Store{{Store in Semantic Cache}}; class R_Cache_Store cache;\n        
R_Cache_Store --> FinalResult;\n    end\n\n    subgraph Direct LLM Path\n        LLM_Path --> L1[Format Query + History];\n        L1 --> L2[\"(asyncio.to_thread)\u003Cbr\u002F>Generate Direct LLM Answer \u003Cbr\u002F> (No RAG)\"]; class L2 llmcall,thread;\n        L2 --> FinalResult(Final Direct Result);\n    end\n\n    FinalResult --> R_Hist_Update(Update Chat History);\n    R_Hist_Update --> ZZZ(End: Return Result);\n```
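\n\nThe semantic cache shown in the diagram (TTL-based with similarity matching, per the feature list) can be sketched roughly as follows. This is a simplified illustration rather than the repository's implementation; the query-embedding function is assumed to be supplied by the pipeline.\n\n```python\nimport math\nimport time\n\ndef cosine(a, b):\n    # Cosine similarity between two equal-length vectors.\n    dot = sum(x * y for x, y in zip(a, b))\n    na = math.sqrt(sum(x * x for x in a))\n    nb = math.sqrt(sum(x * x for x in b))\n    return dot \u002F (na * nb)\n\nclass SemanticCache:\n    # Answers keyed by query embedding; a hit is a close, unexpired neighbour.\n    def __init__(self, ttl_seconds=300, threshold=0.9):\n        self.ttl = ttl_seconds\n        self.threshold = threshold\n        self.entries = []  # (embedding, answer, created_at) triples\n\n    def get(self, query_vec):\n        now = time.time()\n        self.entries = [e for e in self.entries if now - e[2] < self.ttl]\n        for vec, answer, _ in self.entries:\n            if cosine(query_vec, vec) >= self.threshold:\n                return answer  # semantically similar recent query\n        return None\n\n    def put(self, query_vec, answer):\n        self.entries.append((query_vec, answer, time.time()))\n```\n\nOn a miss the agent runs the full RAG path and stores the result with `put`, so near-duplicate questions inside the TTL window skip retrieval entirely.\n\n---\n\n## 🤝 Contributing\n\nWe welcome contributions from developers of all skill levels! LocalGPT is an open-source project that benefits from community involvement.\n\n### 🚀 Quick Start for Contributors\n\n```bash\n# Fork and clone the repository\ngit clone https:\u002F\u002Fgithub.com\u002FPromtEngineer\u002FlocalGPT.git\ncd localGPT\n\n# Set up development environment\npip install -r requirements.txt\nnpm install\n\n# Install Ollama and models\ncurl -fsSL https:\u002F\u002Follama.ai\u002Finstall.sh | sh\nollama pull qwen3:0.6b\nollama pull qwen3:8b\n\n# Verify setup\npython system_health_check.py\npython run_system.py --mode dev\n```\n\n### 📋 How to Contribute\n\n1. **🐛 Report Bugs**: Use our [bug report template](.github\u002FISSUE_TEMPLATE\u002Fbug_report.md)\n2. **💡 Request Features**: Use our [feature request template](.github\u002FISSUE_TEMPLATE\u002Ffeature_request.md)\n3. **🔧 Submit Code**: Follow our [development workflow](CONTRIBUTING.md#development-workflow)\n4. **📚 Improve Docs**: Help make our documentation better\n\n### 📖 Detailed Guidelines\n\nFor comprehensive contributing guidelines, including:\n- Development setup and workflow\n- Coding standards and best practices\n- Testing requirements\n- Documentation standards\n- Release process\n\n**👉 See our [CONTRIBUTING.md](CONTRIBUTING.md) guide**\n\n---\n\n## 📄 License\n\nThis project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details. 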
For models, please check their respective licenses.\n\n---\n\n## 📞 Support\n\n- **Documentation**: [Technical Docs](TECHNICAL_DOCS.md)\n- **Issues**: [GitHub Issues](https:\u002F\u002Fgithub.com\u002FPromtEngineer\u002FlocalGPT\u002Fissues)\n- **Discussions**: [GitHub Discussions](https:\u002F\u002Fgithub.com\u002FPromtEngineer\u002FlocalGPT\u002Fdiscussions)\n- **Business Deployment and Customization**: [Contact Us](https:\u002F\u002Ftally.so\u002Fr\u002Fwv6R2d)\n---\n\n\u003Cdiv align=\"center\">\n\n## Star History\n\n[![Star History Chart](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FPromtEngineer_localGPT_readme_fbad6b3666ff.png)](https:\u002F\u002Fstar-history.com\u002F#PromtEngineer\u002FlocalGPT&Date)\n","# LocalGPT - 私有文档智能平台\n\n\u003Cdiv align=\"center\">\n\n\u003Cp align=\"center\">\n\u003Ca href=\"https:\u002F\u002Ftrendshift.io\u002Frepositories\u002F2947\" target=\"_blank\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FPromtEngineer_localGPT_readme_4cc089988f35.png\" alt=\"PromtEngineer%2FlocalGPT | Trendshift\" style=\"width: 250px; height: 55px;\" width=\"250\" height=\"55\"\u002F>\u003C\u002Fa>\n\u003C\u002Fp>\n\n[![GitHub Stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FPromtEngineer\u002FlocalGPT?style=flat-square)](https:\u002F\u002Fgithub.com\u002FPromtEngineer\u002FlocalGPT\u002Fstargazers)\n[![GitHub Forks](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fforks\u002FPromtEngineer\u002FlocalGPT?style=flat-square)](https:\u002F\u002Fgithub.com\u002FPromtEngineer\u002FlocalGPT\u002Fnetwork\u002Fmembers)\n[![GitHub Issues](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fissues\u002FPromtEngineer\u002FlocalGPT?style=flat-square)](https:\u002F\u002Fgithub.com\u002FPromtEngineer\u002FlocalGPT\u002Fissues)\n[![GitHub Pull Requests](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fissues-pr\u002FPromtEngineer\u002FlocalGPT?style=flat-square)](https:\u002F\u002Fgithub.com\u002FPromtEngineer\u002FlocalGPT\u002Fpulls)\n[![Python 3.8+](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fpython-3.8+-blue.svg?style=flat-square)](https:\u002F\u002Fwww.python.org\u002Fdownloads\u002F)\n[![License](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Flicense-MIT-green.svg?style=flat-square)](LICENSE)\n[![Docker](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fdocker-supported-blue.svg?style=flat-square)](https:\u002F\u002Fwww.docker.com\u002F)\n\n\u003Cp align=\"center\">\n    \u003Ca href=\"https:\u002F\u002Fx.com\u002Fengineerrprompt\">\n      \u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FFollow%20on%20X-000000?style=for-the-badge&logo=x&logoColor=white\" alt=\"Follow on X\" \u002F>\n    \u003C\u002Fa>\n    \u003Ca href=\"https:\u002F\u002Fdiscord.gg\u002FtUDWAFGc\">\n      \u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FJoin%20our%20Discord-5865F2?style=for-the-badge&logo=discord&logoColor=white\" alt=\"Join our Discord\" \u002F>\n    \u003C\u002Fa>\n  \u003C\u002Fp>\n\u003C\u002Fdiv>\n\n## 🚀 
什么是LocalGPT？\n\nLocalGPT是一个**完全私密、本地部署的文档智能平台**。使用最先进的AI技术，您可以对文件进行提问、总结并挖掘洞察——数据永远不会离开您的设备。\n\nLocalGPT不仅是一款传统的RAG（检索增强生成）工具，还配备了一个**混合搜索引擎**，结合语义相似度、关键词匹配以及[延迟分块](https:\u002F\u002Fjina.ai\u002Fnews\u002Flate-chunking-in-long-context-embedding-models\u002F)技术，以实现长上下文的精准检索。一个**智能路由系统**会自动为每个查询选择RAG或直接LLM回答的方式，而**上下文增强**和句子级别的[上下文修剪](https:\u002F\u002Fhuggingface.co\u002Fnaver\u002Fprovence-reranker-debertav3-v1)功能则只会呈现最相关的内容。此外，独立的**验证环节**进一步提升了答案的准确性。\n\n该架构具有**模块化和轻量级**的特点，您只需启用所需的组件即可。凭借纯Python核心和对框架与库的极少依赖，LocalGPT可以在任何基础设施上轻松部署、运行和维护。\n\n## ▶️ 视频\n观看此[视频](https:\u002F\u002Fyoutu.be\u002FJTbtGH3secI)，开始使用LocalGPT吧。\n\n| 首页 | 创建索引 | 聊天 |\n|------|--------------|------|\n| ![](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FPromtEngineer_localGPT_readme_a1cdbea25610.png) | ![](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FPromtEngineer_localGPT_readme_1442ad929804.png) | ![](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FPromtEngineer_localGPT_readme_51c6fdeec29e.png) |\n\n## ✨ 功能\n\n- **极致隐私**：您的数据始终保留在本地计算机上，确保100%的安全性。\n- **多模型支持**：通过Ollama无缝集成多种开源模型。\n- **多样化的嵌入模型**：可选择多种开源嵌入模型。\n- **重复利用LLM**：下载一次LLM后，无需再次下载即可重复使用。\n- **聊天历史**：在会话中记住您之前的对话。\n- **API**：LocalGPT提供API，可用于构建RAG应用。\n- **GPU、CPU、HPU及MPS支持**：开箱即用，支持多种平台。您可以通过`CUDA`、`CPU`、`HPU (Intel® Gaudi®)`或`MPS`等硬件与您的数据进行交互！\n\n### 📖 文档处理\n- **多格式支持**：PDF、DOCX、TXT、Markdown等多种格式（目前仅支持PDF）。\n- **上下文增强**：借助AI生成的上下文信息，提升文档理解能力，灵感来源于[上下文检索](https:\u002F\u002Fwww.anthropic.com\u002Fnews\u002Fcontextual-retrieval)。\n- **批量处理**：可同时处理多个文档。\n\n### 🤖 AI驱动的聊天\n- **自然语言查询**：可以用日常英语提问。\n- **来源标注**：每个答案都会附上文档引用。\n- **智能路由**：自动在RAG和直接LLM回答之间切换。\n- **查询分解**：将复杂查询拆解为子问题，以获得更佳答案。\n- **语义缓存**：基于TTL的缓存机制，结合相似度匹配，加快响应速度。\n- **会话感知历史**：在多次交互中保持对话上下文的一致性。\n- **答案验证**：独立的验证环节确保答案的准确性。\n- **多种AI模型**：使用Ollama进行推理，HuggingFace用于嵌入和重排序。\n\n\n### 🛠️ 开发友好\n- **RESTful API**：提供完整的API访问权限，便于集成。\n- **实时进度**：文档处理过程中提供实时更新。\n- **灵活配置**：可自定义模型、分块大小和搜索参数。\n- **可扩展架构**：支持插件系统，方便添加自定义组件。\n\n### 🎨 现代化界面\n- **直观的Web界面**：设计简洁、响应迅速。\n- **会话管理**：按主题组织对话。\n- **索引管理**：轻松管理文档集合。\n- **实时聊天**：流式响应，即时反馈。\n\n---\n\n## 🚀 快速入门\n\n注意：目前安装仅在macOS上测试过。\n\n### 前置条件\n- Python 3.8及以上版本（已测试Python 3.11.5）\n- Node.js 16及以上版本及npm（已测试Node.js 23.10.0，npm 10.9.2）\n- Docker（可选，用于容器化部署）\n- 8GB以上内存（建议16GB以上）\n- Ollama（两种部署方式均需）\n\n### ***注意***\n在本分支合并到主分支之前，请克隆此分支进行安装：\n\n```bash\ngit clone -b localgpt-v2 https:\u002F\u002Fgithub.com\u002FPromtEngineer\u002FlocalGPT.git\ncd localGPT\n```\n\n### 方法一：Docker部署\n\n```bash\n# 克隆仓库\ngit clone https:\u002F\u002Fgithub.com\u002FPromtEngineer\u002FlocalGPT.git\ncd localGPT\n\n# 在本地安装Ollama（即使使用Docker也需安装）\ncurl -fsSL https:\u002F\u002Follama.ai\u002Finstall.sh | sh\nollama pull qwen3:0.6b\nollama pull qwen3:8b\n\n# 启动Ollama\nollama serve\n\n# 在新终端中启动Docker\n.\u002Fstart-docker.sh\n\n# 访问应用\nopen http:\u002F\u002Flocalhost:3000\n```\n\n**Docker管理命令：**\n```bash\n# 查看容器状态\ndocker compose ps\n\n# 查看日志\ndocker compose logs -f\n\n# 停止容器\n.\u002Fstart-docker.sh stop\n```\n\n### 方法二：直接开发（推荐用于开发环境）\n\n```bash\n# 克隆仓库\ngit clone https:\u002F\u002Fgithub.com\u002FPromtEngineer\u002FlocalGPT.git\ncd localGPT\n\n# 安装Python依赖\npip install -r requirements.txt\n\n# 已安装的关键依赖：\n# - torch==2.4.1, transformers==4.51.0（AI模型）\n# - lancedb（向量数据库）\n# - rank_bm25, fuzzywuzzy（搜索算法）\n# - sentence_transformers, rerankers（嵌入\u002F重排序）\n# - docling（文档处理）\n# - colpali-engine（多模态处理——支持即将推出）\n\n# 安装 Node.js 依赖\nnpm install\n\n# 安装并启动 Ollama\ncurl -fsSL https:\u002F\u002Follama.ai\u002Finstall.sh | sh\nollama pull qwen3:0.6b\nollama 
pull qwen3:8b\nollama serve\n\n# 启动系统（在新的终端中）\npython run_system.py\n\n# 访问应用\nopen http:\u002F\u002Flocalhost:3000\n```\n\n**系统管理：**\n```bash\n# 检查系统健康状况（全面诊断）\npython system_health_check.py\n\n# 检查服务状态和健康状况\npython run_system.py --health\n\n# 以生产模式启动\npython run_system.py --mode prod\n\n# 跳过前端（仅后端 + RAG API）\npython run_system.py --no-frontend\n\n# 查看聚合日志\npython run_system.py --logs-only\n\n# 停止所有服务\npython run_system.py --stop\n# 或者在运行 python run_system.py 的终端中按 Ctrl+C\n```\n\n**服务架构：**\n`run_system.py` 启动脚本管理四个关键服务：\n- **Ollama 服务器**（端口 11434）：AI 模型服务\n- **RAG API 服务器**（端口 8001）：文档处理与检索\n- **后端服务器**（端口 8000）：会话管理及 API 端点\n- **前端服务器**（端口 3000）：React\u002FNext.js Web 界面\n\n### 方法三：手动启动各组件\n\n```bash\n# 终端 1：启动 Ollama\nollama serve\n\n# 终端 2：启动 RAG API\npython -m rag_system.api_server\n\n# 终端 3：启动后端\ncd backend && python server.py\n\n# 终端 4：启动前端\nnpm run dev\n\n# 访问地址 http:\u002F\u002Flocalhost:3000\n```\n\n---\n\n### 详细安装步骤\n\n#### 1. 安装系统依赖\n\n**Ubuntu\u002FDebian：**\n```bash\nsudo apt update\nsudo apt install python3.8 python3-pip nodejs npm docker.io docker-compose\n```\n\n**macOS：**\n```bash\nbrew install python@3.8 node npm docker docker-compose\n```\n\n**Windows：**\n```bash\n# 安装 Python 3.8+、Node.js 和 Docker Desktop\n# 然后使用 PowerShell 或 WSL2\n```\n\n#### 2. 安装 AI 模型\n\n**推荐安装 Ollama：**\n```bash\n# 安装 Ollama\ncurl -fsSL https:\u002F\u002Follama.ai\u002Finstall.sh | sh\n\n# 拉取推荐模型\nollama pull qwen3:0.6b          # 快速生成模型\nollama pull qwen3:8b            # 高质量生成模型\n```\n\n#### 3. 配置环境变量\n\n```bash\n# 复制环境模板\ncp .env.example .env\n\n# 编辑配置\nnano .env\n```\n\n**关键配置选项：**\n```env\n# AI 模型（在 rag_system\u002Fmain.py 中引用）\nOLLAMA_HOST=http:\u002F\u002Flocalhost:11434\n\n# 数据库路径（后端和 RAG 系统使用）\nDATABASE_PATH=.\u002Fbackend\u002Fchat_data.db\nVECTOR_DB_PATH=.\u002Flancedb\n\n# 服务器设置（run_system.py 使用）\nBACKEND_PORT=8000\nFRONTEND_PORT=3000\nRAG_API_PORT=8001\n\n# 可选：覆盖默认模型\nGENERATION_MODEL=qwen3:8b\nENRICHMENT_MODEL=qwen3:0.6b\nEMBEDDING_MODEL=Qwen\u002FQwen3-Embedding-0.6B\nRERANKER_MODEL=answerdotai\u002Fanswerai-colbert-small-v1\n```\n\n#### 4. 初始化系统\n\n```bash\n# 运行系统健康检查\npython system_health_check.py\n\n# 初始化数据库\npython -c \"from backend.database import ChatDatabase; ChatDatabase().init_database()\"\n\n# 测试安装\npython -c \"from rag_system.main import get_agent; print('✅ 安装成功！')\"\n\n# 验证完整设置\npython run_system.py --health\n```\n\n---\n\n## 🎯 开始使用\n\n### 1. 创建第一个索引\n\n**索引** 是一组已处理的文档，您可以与之进行对话。\n\n#### 使用 Web 界面：\n1. 打开 http:\u002F\u002Flocalhost:3000\n2. 点击“创建新索引”\n3. 上传您的文档（PDF、DOCX、TXT）\n4. 配置处理选项\n5. 点击“构建索引”\n\n#### 使用脚本：\n```bash\n# 简单脚本方式\n.\u002Fsimple_create_index.sh \"我的文档\" \"path\u002Fto\u002Fdocument.pdf\"\n\n# 交互式脚本\npython create_index_script.py\n```\n\n#### 使用 API：\n```bash\n# 创建索引\ncurl -X POST http:\u002F\u002Flocalhost:8000\u002Findexes \\\n  -H \"Content-Type: application\u002Fjson\" \\\n  -d '{\"name\": \"我的索引\", \"description\": \"我的文档\"}'\n\n# 上传文档\ncurl -X POST http:\u002F\u002Flocalhost:8000\u002Findexes\u002FINDEX_ID\u002Fupload \\\n  -F \"files=@document.pdf\"\n\n# 构建索引\ncurl -X POST http:\u002F\u002Flocalhost:8000\u002Findexes\u002FINDEX_ID\u002Fbuild\n```\n\n### 2. 开始聊天\n\n当您的索引构建完成后：\n\n1. **创建聊天会话**：点击“新建聊天”或使用现有会话\n2. **选择索引**：选择要查询的文档集合\n3. **提问**：输入关于您文档的自然语言问题\n4. **获取答案**：接收带有来源引用的 AI 生成回复\n\n### 3. 
高级功能\n\n#### 自定义模型配置\n```bash\n# 为不同任务使用不同模型\ncurl -X POST http:\u002F\u002Flocalhost:8000\u002Fsessions \\\n  -H \"Content-Type: application\u002Fjson\" \\\n  -d '{\n    \"title\": \"高质量会话\",\n    \"model\": \"qwen3:8b\",\n    \"embedding_model\": \"Qwen\u002FQwen3-Embedding-4B\"\n  }'\n```\n\n#### 批量文档处理\n```bash\n# 一次性处理多个文档\npython demo_batch_indexing.py --config batch_indexing_config.json\n```\n\n#### API 集成\n```python\nimport requests\n\n# 通过 API 与您的文档对话\nresponse = requests.post('http:\u002F\u002Flocalhost:8000\u002Fchat', json={\n    'query': '研究论文中的主要发现是什么？',\n    'session_id': 'your-session-id',\n    'search_type': 'hybrid',\n    'retrieval_k': 20\n})\n\nprint(response.json()['response'])\n```\n\n---\n\n## 🔧 配置\n\n### 模型配置\n\nLocalGPT 支持多个 AI 模型提供商，并采用集中式配置：\n\n#### Ollama 模型（本地推理）\n```python\nOLLAMA_CONFIG = {\n    \"host\": \"http:\u002F\u002Flocalhost:11434\",\n    \"generation_model\": \"qwen3:8b\",        # 主文本生成\n    \"enrichment_model\": \"qwen3:0.6b\"       # 轻量级路由\u002F增强\n}\n```\n\n#### 外部模型（HuggingFace 直接调用）\n```python\nEXTERNAL_MODELS = {\n    \"embedding_model\": \"Qwen\u002FQwen3-Embedding-0.6B\",           # 1024 维度\n    \"reranker_model\": \"answerdotai\u002Fanswerai-colbert-small-v1\", # ColBERT 重排序器\n    \"fallback_reranker\": \"BAAI\u002Fbge-reranker-base\"             # 备用重排序器\n}\n```\n\n### 管道配置\n\nLocalGPT 提供两种主要的管道配置：\n\n#### 默认管道（生产就绪）\n```python\n\"default\": {\n    \"description\": \"具有混合搜索、AI 重排和验证功能的生产就绪管道\",\n    \"storage\": {\n        \"lancedb_uri\": \".\u002Flancedb\",\n        \"text_table_name\": \"text_pages_v3\",\n        \"bm25_path\": \".\u002Findex_store\u002Fbm25\"\n    },\n    \"retrieval\": {\n        \"retriever\": \"multivector\",\n        \"search_type\": \"hybrid\",\n        \"late_chunking\": {\"enabled\": True},\n        \"dense\": {\"enabled\": True, \"weight\": 0.7},\n        \"bm25\": {\"enabled\": True}\n    },\n    \"reranker\": {\n        \"enabled\": True,\n        \"type\": \"ai\",\n        \"strategy\": \"rerankers-lib\",\n        \"model_name\": \"answerdotai\u002Fanswerai-colbert-small-v1\",\n        \"top_k\": 10\n    },\n    \"query_decomposition\": {\"enabled\": True, \"max_sub_queries\": 3},\n    \"verification\": {\"enabled\": True},\n    \"retrieval_k\": 20,\n    \"contextual_enricher\": {\"enabled\": True, \"window_size\": 1}\n}\n```\n\n#### 快速管道（速度优化）\n```python\n\"fast\": {\n    \"description\": \"具有最小开销的速度优化管道\",\n    \"retrieval\": {\n        \"search_type\": \"vector_only\",\n        \"late_chunking\": {\"enabled\": False}\n    },\n    \"reranker\": {\"enabled\": False},\n    \"query_decomposition\": {\"enabled\": False},\n    \"verification\": {\"enabled\": False},\n    \"retrieval_k\": 10,\n    \"contextual_enricher\": {\"enabled\": False}\n}\n```\n\n### 搜索配置\n\n```python\nSEARCH_CONFIG = {\n    'hybrid': {\n        'dense_weight': 0.7,\n        'sparse_weight': 0.3,\n        'retrieval_k': 20,\n        'reranker_top_k': 10\n    }\n}\n```\n---\n\n## 🛠️ 故障排除\n\n### 常见问题\n\n#### 安装问题\n```bash\n# 检查 Python 版本\npython --version  # 应为 3.8+\n\n# 检查依赖项\npip list | grep -E \"(torch|transformers|lancedb)\"\n\n# 重新安装依赖项\npip install -r requirements.txt --force-reinstall\n```\n\n#### 模型加载问题\n```bash\n# 检查 Ollama 状态\nollama list\ncurl http:\u002F\u002Flocalhost:11434\u002Fapi\u002Ftags\n\n# 拉取缺失的模型\nollama pull qwen3:0.6b\n```\n\n#### 数据库问题\n```bash\n# 检查数据库连接\npython -c \"from backend.database import ChatDatabase; db = ChatDatabase(); print('✅ 数据库正常')\"\n\n# 重置数据库（警告：这将删除所有数据）\nrm backend\u002Fchat_data.db\npython -c 
\"from backend.database import ChatDatabase; ChatDatabase().init_database()\"\n```\n\n#### 性能问题\n```bash\n# 检查系统资源\npython system_health_check.py\n\n# 监控内存使用情况\nhtop  # 或 Windows 上的任务管理器\n\n# 针对低内存系统进行优化\nexport PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:512\n```\n\n### 获取帮助\n\n1. **查看日志**：系统会在 `logs\u002F` 目录下创建结构化日志：\n   - `logs\u002Fsystem.log`：主要系统事件和错误\n   - `logs\u002Follama.log`：Ollama 服务器日志\n   - `logs\u002Frag-api.log`：RAG API 处理日志\n   - `logs\u002Fbackend.log`：后端服务器日志\n   - `logs\u002Ffrontend.log`：前端构建和运行时日志\n\n2. **系统健康检查**：运行全面诊断：\n   ```bash\n   python system_health_check.py  # 全面系统诊断\n   python run_system.py --health  # 服务状态检查\n   ```\n\n3. **健康端点**：检查各个服务的健康状况：\n   - 后端：`http:\u002F\u002Flocalhost:8000\u002Fhealth`\n   - RAG API：`http:\u002F\u002Flocalhost:8001\u002Fhealth`\n   - Ollama：`http:\u002F\u002Flocalhost:11434\u002Fapi\u002Ftags`\n\n4. **文档**：查阅 [技术文档](TECHNICAL_DOCS.md)\n5. **GitHub 问题**：报告 bug 并请求功能\n6. **社区**：加入我们的 Discord\u002FSlack 社区\n\n---\n\n## 🔗 API 参考\n\n### 核心端点\n\n#### 聊天 API\n```http\n# 基于会话的聊天（推荐）\nPOST \u002Fsessions\u002F{session_id}\u002Fchat\nContent-Type: application\u002Fjson\n\n{\n  \"query\": \"讨论的主要议题有哪些？\",\n  \"search_type\": \"hybrid\",\n  \"retrieval_k\": 20,\n  \"ai_rerank\": true,\n  \"context_window_size\": 5\n}\n\n# 旧版聊天端点\nPOST \u002Fchat\nContent-Type: application\u002Fjson\n\n{\n  \"query\": \"讨论的主要议题有哪些？\",\n  \"session_id\": \"uuid\",\n  \"search_type\": \"hybrid\",\n  \"retrieval_k\": 20\n}\n```\n\n#### 索引管理\n```http\n# 创建索引\nPOST \u002Findexes\nContent-Type: application\u002Fjson\n{\n  \"name\": \"我的索引\",\n  \"description\": \"描述\",\n  \"config\": \"default\"\n}\n\n# 获取所有索引\nGET \u002Findexes\n\n# 获取特定索引\nGET \u002Findexes\u002F{id}\n\n# 将文档上传到索引\nPOST \u002Findexes\u002F{id}\u002Fupload\nContent-Type: multipart\u002Fform-data\nfiles: [file1.pdf, file2.pdf, ...]\n\n# 构建索引（处理上传的文档）\nPOST \u002Findexes\u002F{id}\u002Fbuild\nContent-Type: application\u002Fjson\n{\n  \"config_mode\": \"default\",\n  \"enable_enrich\": true,\n  \"chunk_size\": 512\n}\n\n# 删除索引\nDELETE \u002Findexes\u002F{id}\n```\n\n#### 会话管理\n```http\n# 创建会话\nPOST \u002Fsessions\nContent-Type: application\u002Fjson\n{\n  \"title\": \"我的会话\",\n  \"model\": \"qwen3:0.6b\"\n}\n\n# 获取所有会话\nGET \u002Fsessions\n\n# 获取特定会话\nGET \u002Fsessions\u002F{session_id}\n\n# 获取会话文档\nGET \u002Fsessions\u002F{session_id}\u002Fdocuments\n\n# 获取会话索引\nGET \u002Fsessions\u002F{session_id}\u002Findexes\n\n# 将索引关联到会话\nPOST \u002Fsessions\u002F{session_id}\u002Findexes\u002F{index_id}\n\n# 删除会话\nDELETE \u002Fsessions\u002F{session_id}\n\n# 重命名会话\nPOST \u002Fsessions\u002F{session_id}\u002Frename\nContent-Type: application\u002Fjson\n{\n  \"new_title\": \"更新后的会话名称\"\n}\n```\n\n### 高级功能\n\n#### 查询分解\n系统可以将复杂查询拆分为子问题，以获得更好的答案：\n```http\nPOST \u002Fsessions\u002F{session_id}\u002Fchat\nContent-Type: application\u002Fjson\n\n{\n  \"query\": \"比较这些方法论并分析其有效性\",\n  \"query_decompose\": true,\n  \"compose_sub_answers\": true\n}\n```\n\n#### 答案验证\n使用独立的验证模型进行准确性验证：\n```http\nPOST \u002Fsessions\u002F{session_id}\u002Fchat\nContent-Type: application\u002Fjson\n\n{\n  \"query\": \"关键发现是什么？\",\n  \"verify\": true\n}\n```\n\n#### 上下文增强\n在索引构建过程中对文档上下文进行增强，以提高理解能力：\n```bash\n# 在索引构建时启用\nPOST \u002Findexes\u002F{id}\u002Fbuild\n{\n  \"enable_enrich\": true,\n  \"window_size\": 2\n}\n```\n\n#### 晚期分块\n通过在嵌入后再进行分块来更好地保留上下文：\n```bash\n# 在管道中配置\n\"late_chunking\": {\"enabled\": true}\n```\n\n#### 流式聊天\n```http\nPOST \u002Fchat\u002Fstream\nContent-Type: application\u002Fjson\n\n{\n  \"query\": \"请解释该方法论\",\n  
\"session_id\": \"uuid\",\n  \"stream\": true\n}\n```\n\n#### 批量处理\n```bash\n# 使用批量索引脚本\npython demo_batch_indexing.py --config batch_indexing_config.json\n\n# 示例批量配置（batch_indexing_config.json）：\n{\n  \"index_name\": \"示例批量索引\",\n  \"index_description\": \"示例批量索引配置\",\n  \"documents\": [\n    \".\u002Frag_system\u002Fdocuments\u002Finvoice_1039.pdf\",\n    \".\u002Frag_system\u002Fdocuments\u002Finvoice_1041.pdf\"\n  ],\n  \"processing\": {\n    \"chunk_size\": 512,\n    \"chunk_overlap\": 64,\n    \"enable_enrich\": true,\n    \"enable_latechunk\": true,\n    \"enable_docling\": true,\n    \"embedding_model\": \"Qwen\u002FQwen3-Embedding-0.6B\",\n    \"generation_model\": \"qwen3:0.6b\",\n    \"retrieval_mode\": \"hybrid\",\n    \"window_size\": 2\n  }\n}\n```\n\n```http\n# 批量处理的API端点\nPOST \u002Fbatch\u002Findex\nContent-Type: application\u002Fjson\n\n{\n  \"file_paths\": [\"doc1.pdf\", \"doc2.pdf\"],\n  \"config\": {\n    \"chunk_size\": 512,\n    \"enable_enrich\": true,\n    \"enable_latechunk\": true,\n    \"enable_docling\": true\n  }\n}\n```\n\n有关完整的API文档，请参阅[API_REFERENCE.md](API_REFERENCE.md)。\n\n---\n\n## 🏗️ 架构\n\nLocalGPT采用模块化、可扩展的架构：\n\n```mermaid\ngraph TB\n    UI[Web界面] --> API[后端API]\n    API --> Agent[RAG代理]\n    Agent --> Retrieval[检索管道]\n    Agent --> Generation[生成管道]\n\n    Retrieval --> Vector[向量搜索]\n    Retrieval --> BM25[BM25搜索]\n    Retrieval --> Rerank[重排序]\n\n    Vector --> LanceDB[(LanceDB)]\n    BM25 --> BM25DB[(BM25索引)]\n\n    Generation --> Ollama[Ollama模型]\n    Generation --> HF[Hugging Face模型]\n\n    API --> SQLite[(SQLite数据库)]\n```\n\n检索代理概述\n\n```mermaid\ngraph TD\n    classDef llmcall fill:#e6f3ff,stroke:#007bff;\n    classDef pipeline fill:#e6ffe6,stroke:#28a745;\n    classDef cache fill:#fff3e0,stroke:#fd7e14;\n    classDef logic fill:#f8f9fa,stroke:#6c757d;\n    classDef thread stroke-dasharray: 5 5;\n\n    A(开始：Agent.run) --> B_asyncio.run(_run_async);\n    B --> C{_run_async};\n\n    C --> C1[获取聊天历史];\n    C1 --> T1[构建分类提示 \u003Cbr\u002F> 查询 + 文档概览 ];\n    T1 --> T2[\"(asyncio.to_thread)\u003Cbr\u002F>LLM分类：RAG还是LLM_DIRECT？\"]; class T2 llmcall,thread;\n    T2 --> T3{决策?};\n\n    T3 -- RAG --> RAG_Path;\n    T3 -- LLM_DIRECT --> LLM_Path;\n\n    subgraph RAG Path\n        RAG_Path --> R1[格式化查询 + 历史];\n        R1 --> R2[\"(asyncio.to_thread)\u003Cbr\u002F>生成查询嵌入\"]; class R2 pipeline,thread;\n        R2 --> R3{{检查语义缓存}}; class R3 cache;\n        R3 -- 命中 --> R_Cache_Hit(返回缓存结果);\n        R_Cache_Hit --> R_Hist_Update;\n        R3 -- 未命中 --> R4{查询分解 \u003Cbr\u002F> 是否启用？};\n\n        R4 -- 是 --> R5[\"(asyncio.to_thread)\u003Cbr\u002F>分解原始查询\"]; class R5 llmcall,thread;\n        R5 --> R6{{运行子查询 \u003Cbr\u002F> 并行RAG管道}}; class R6 pipeline,thread;\n        R6 --> R7[收集结果和文档];\n        R7 --> R8[\"(asyncio.to_thread)\u003Cbr\u002F>组合最终答案\"]; class R8 llmcall,thread;\n        R8 --> V1(RAG答案);\n\n        R4 -- 否 --> R9[\"(asyncio.to_thread)\u003Cbr\u002F>运行单个查询 \u003Cbr\u002F>(RAG管道)\"]; class R9 pipeline,thread;\n        R9 --> V1;\n\n        V1 --> V2{{验证 \u003Cbr\u002F> await verify_async}}; class V2 llmcall;\n        V2 --> V3(最终RAG结果);\n        V3 --> R_Cache_Store{{存储到语义缓存}}; class R_Cache_Store cache;\n        R_Cache_Store --> FinalResult;\n    end\n\n    subgraph Direct LLM Path\n        LLM_Path --> L1[格式化查询 + 历史];\n        L1 --> L2[\"(asyncio.to_thread)\u003Cbr\u002F>生成直接LLM答案 \u003Cbr\u002F> (无RAG)\"]; class L2 llmcall,thread;\n        L2 --> FinalResult(最终直接结果);\n    end\n\n    FinalResult --> R_Hist_Update(更新聊天历史);\n    R_Hist_Update --> ZZZ(结束：返回结果);\n```
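\n\n上图中的独立验证环节可以用下面的思路粗略示意。这只是一个假设性的示意代码，并非仓库的实际实现；它假定本地 Ollama 服务运行在默认端口，并复用安装步骤中的 `qwen3:0.6b` 模型：\n\n```python\nimport requests\n\nOLLAMA_URL = 'http:\u002F\u002Flocalhost:11434\u002Fapi\u002Fgenerate'  # 默认的本地 Ollama 端点\n\ndef verify(question: str, answer: str, sources: str) -> bool:\n    # 让一个独立的小模型判断答案是否有据可依（仅为示意）\n    prompt = ('请判断下面的答案是否能由给定的资料支持，只回答 YES 或 NO。'\n              '问题：' + question + ' 答案：' + answer + ' 资料：' + sources)\n    resp = requests.post(OLLAMA_URL, json={\n        'model': 'qwen3:0.6b',\n        'prompt': prompt,\n        'stream': False,\n    })\n    return 'YES' in resp.json()['response'].upper()\n```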
\n\n---\n\n## 🤝 贡献\n\n我们欢迎所有技能水平的开发者贡献！LocalGPT是一个开源项目，受益于社区参与。\n\n### 🚀 贡献者快速入门\n\n```bash\n# Fork 并克隆仓库\ngit clone https:\u002F\u002Fgithub.com\u002FPromtEngineer\u002FlocalGPT.git\ncd localGPT\n\n# 设置开发环境\npip install -r requirements.txt\nnpm install\n\n# 安装Ollama及模型\ncurl -fsSL https:\u002F\u002Follama.ai\u002Finstall.sh | sh\nollama pull qwen3:0.6b\nollama pull qwen3:8b\n\n# 验证设置\npython system_health_check.py\npython run_system.py --mode dev\n```\n\n### 📋 如何贡献\n\n1. **🐛 报告Bug**：使用我们的[Bug报告模板](.github\u002FISSUE_TEMPLATE\u002Fbug_report.md)\n2. **💡 请求功能**：使用我们的[功能请求模板](.github\u002FISSUE_TEMPLATE\u002Ffeature_request.md)\n3. **🔧 提交代码**：遵循我们的[开发工作流程](CONTRIBUTING.md#development-workflow)\n4. **📚 改进文档**：帮助完善我们的文档\n\n### 📖 详细指南\n\n关于全面的贡献指南，包括：\n- 开发设置与工作流程\n- 编码标准与最佳实践\n- 测试要求\n- 文档标准\n- 发布流程\n\n**👉 请参阅我们的[CONTRIBUTING.md](CONTRIBUTING.md)指南**\n\n---\n\n## 📄 许可证\n\n本项目采用MIT许可证——详情请参阅[LICENSE](LICENSE)文件。对于模型，请查看其各自的许可证。\n\n---\n\n## 📞 支持\n\n- **文档**：[技术文档](TECHNICAL_DOCS.md)\n- **问题**：[GitHub问题](https:\u002F\u002Fgithub.com\u002FPromtEngineer\u002FlocalGPT\u002Fissues)\n- **讨论**：[GitHub讨论](https:\u002F\u002Fgithub.com\u002FPromtEngineer\u002FlocalGPT\u002Fdiscussions)\n- **商业部署与定制**：[联系我们](https:\u002F\u002Ftally.so\u002Fr\u002Fwv6R2d)\n---\n\n\u003Cdiv align=\"center\">\n\n## 星标历史\n\n[![星标历史图表](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FPromtEngineer_localGPT_readme_fbad6b3666ff.png)](https:\u002F\u002Fstar-history.com\u002F#PromtEngineer\u002FlocalGPT&Date)","# LocalGPT 快速上手指南\n\nLocalGPT 是一个完全私有、本地运行的文档智能平台。它支持在离线环境下对 PDF、DOCX、TXT 等文档进行问答、摘要和洞察分析，确保数据永不离开您的设备。\n\n## 1. 环境准备\n\n### 系统要求\n- **操作系统**: macOS (当前主要测试环境), Linux, Windows (需 WSL2)\n- **内存**: 最低 8GB，推荐 16GB+\n- **Python**: 3.8+ (推荐 3.11.5)\n- **Node.js**: 16+ (推荐 23.10.0)\n- **Ollama**: 必须安装 (用于运行大模型)\n- **Docker**: 可选 (用于容器化部署)\n\n### 前置依赖安装\n\n**macOS (使用 Homebrew):**\n```bash\nbrew install python@3.11 node npm docker docker-compose\n```\n\n**Ubuntu\u002FDebian:**\n```bash\nsudo apt update\nsudo apt install python3.11 python3-pip nodejs npm docker.io docker-compose\n```\n\n**Windows:**\n请安装 Python 3.8+、Node.js 和 Docker Desktop，建议使用 PowerShell 或 WSL2 进行操作。\n\n### 安装 Ollama 并拉取模型\n无论采用哪种部署方式，都必须先安装 Ollama 并下载模型：\n\n```bash\n# 安装 Ollama\ncurl -fsSL https:\u002F\u002Follama.ai\u002Finstall.sh | sh\n\n# 拉取推荐模型 (轻量版与高质量版)\nollama pull qwen3:0.6b\nollama pull qwen3:8b\n\n# 启动 Ollama 服务\nollama serve\n```\n> **提示**: 国内用户若下载缓慢，可配置 `OLLAMA_HOST` 或使用国内镜像源加速模型下载。\n\n---\n\n## 2. 安装步骤\n\n⚠️ **注意**: 目前请使用 `localgpt-v2` 分支进行安装。\n\n```bash\ngit clone -b localgpt-v2 https:\u002F\u002Fgithub.com\u002FPromtEngineer\u002FlocalGPT.git\ncd localGPT\n```\n\n您可以选择以下两种方式之一进行部署：\n\n### 方式一：Docker 部署 (推荐生产环境)\n\n```bash\n# 启动 Docker 容器\n.\u002Fstart-docker.sh\n\n# 访问应用\nopen http:\u002F\u002Flocalhost:3000\n```\n\n**常用管理命令:**\n```bash\ndocker compose ps          # 查看容器状态\ndocker compose logs -f     # 查看实时日志\n.\u002Fstart-docker.sh stop     # 停止服务\n```\n\n### 方式二：直接开发部署 (推荐开发者)\n\n```bash\n# 1. 安装 Python 依赖\npip install -r requirements.txt\n\n# 2. 安装前端依赖\nnpm install\n\n# 3. 确保 Ollama 已在另一终端运行 (ollama serve)\n\n# 4. 启动系统\npython run_system.py\n\n# 5. 访问应用\nopen http:\u002F\u002Flocalhost:3000\n```\n\n**高级启动选项:**\n```bash\npython run_system.py --health       # 检查系统健康状态\npython run_system.py --mode prod    # 生产模式启动\npython run_system.py --no-frontend  # 仅启动后端和 RAG API\npython run_system.py --stop         # 停止所有服务\n```\n\n---\n\n## 3. 基本使用\n\n### 第一步：创建索引 (Index)\n索引是您上传并处理后的文档集合，是与 AI 对话的基础。\n\n**通过 Web 界面操作:**\n1. 
浏览器访问 `http:\u002F\u002Flocalhost:3000`\n2. 点击 **\"Create New Index\"**\n3. 上传文档 (支持 PDF, DOCX, TXT 等)\n4. 配置处理选项后点击 **\"Build Index\"**\n\n**通过命令行脚本:**\n```bash\n# 简单模式\n.\u002Fsimple_create_index.sh \"My Documents\" \"path\u002Fto\u002Fdocument.pdf\"\n\n# 交互模式\npython create_index_script.py\n```\n\n**通过 API:**\n```bash\n# 创建索引\ncurl -X POST http:\u002F\u002Flocalhost:8000\u002Findexes \\\n  -H \"Content-Type: application\u002Fjson\" \\\n  -d '{\"name\": \"My Index\", \"description\": \"My documents\"}'\n\n# 上传文件 (替换 INDEX_ID 为实际返回的 ID)\ncurl -X POST http:\u002F\u002Flocalhost:8000\u002Findexes\u002FINDEX_ID\u002Fupload \\\n  -F \"files=@document.pdf\"\n\n# 构建索引\ncurl -X POST http:\u002F\u002Flocalhost:8000\u002Findexes\u002FINDEX_ID\u002Fbuild\n```\n\n### 第二步：开始对话\n\n索引构建完成后：\n1. 点击 **\"New Chat\"** 创建新会话。\n2. 在会话设置中选择刚才创建的 **Index**。\n3. 在对话框输入自然语言问题（例如：“这份文档的主要结论是什么？”）。\n4. 系统将自动检索相关片段并生成带来源引用的回答。\n\n### 第三步：进阶配置 (可选)\n\n您可以在创建会话时指定不同的模型组合：\n\n```bash\ncurl -X POST http:\u002F\u002Flocalhost:8000\u002Fsessions \\\n  -H \"Content-Type: application\u002Fjson\" \\\n  -d '{\n    \"title\": \"High Quality Session\",\n    \"model\": \"qwen3:8b\",\n    \"embedding_model\": \"Qwen\u002FQwen3-Embedding-4B\"\n  }'\n```\n\nLocalGPT 架构模块化，支持 GPU (CUDA)、CPU、Intel Gaudi (HPU) 及 Apple Silicon (MPS) 等多种硬件加速方案，可根据实际需求灵活调整 `.env` 配置文件。","某金融合规专员需要在离线内网环境中，快速从数百份包含敏感客户数据的 PDF 合同与审计报告中提取关键风险条款。\n\n### 没有 localGPT 时\n- 数据隐私风险极高：将含敏感信息的文档上传至云端 AI 服务违反公司“数据不出域”的合规红线。\n- 检索精度不足：传统关键词搜索无法理解语义，常漏掉表述不同但含义相似的风险描述。\n- 上下文割裂：面对长篇幅合同，人工阅读耗时且容易忽略跨段落的逻辑关联，效率低下。\n- 部署门槛高：现有私有化方案依赖复杂的环境配置和昂贵的专用硬件，难以在普通办公机运行。\n\n### 使用 localGPT 后\n- 实现 100% 数据本地化：所有文档处理与问答均在本地完成，数据绝不离开设备，完美满足合规要求。\n- 混合搜索提升准确率：利用语义相似度与关键词匹配的混合引擎，精准定位隐蔽的风险条款，不再遗漏。\n- 智能长文处理：通过 Late Chunking 技术与上下文剪枝，自动梳理长篇合同的逻辑脉络，直接生成摘要与洞察。\n- 轻量级灵活部署：基于纯 Python 核心，无需复杂依赖即可在普通 CPU 或现有 GPU 工作站上快速启动。\n\nlocalGPT 让金融机构能在绝对安全的前提下，将沉睡的本地文档转化为可即时对话的智能知识库。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FPromtEngineer_localGPT_a1cdbea2.png","PromtEngineer","PromptEngineer","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002FPromtEngineer_a63ace20.png","Building Cool Stuff!",null,"engineerrprompt","https:\u002F\u002Fengineerprompt.ai\u002F","https:\u002F\u002Fgithub.com\u002FPromtEngineer",[84,88,92,96,100,104],{"name":85,"color":86,"percentage":87},"Python","#3572A5",64.4,{"name":89,"color":90,"percentage":91},"TypeScript","#3178c6",30.1,{"name":93,"color":94,"percentage":95},"Shell","#89e051",4.3,{"name":97,"color":98,"percentage":99},"CSS","#663399",0.7,{"name":101,"color":102,"percentage":103},"JavaScript","#f1e05a",0.3,{"name":105,"color":106,"percentage":103},"HTML","#e34c26",22209,2487,"2026-04-04T03:34:38","MIT","Linux, macOS, Windows","非必需。支持 CPU、GPU (CUDA)、Intel Gaudi (HPU) 和 Apple MPS。若使用 GPU，未指定具体型号或显存大小，但建议拥有兼容 CUDA 的 NVIDIA 显卡以获得更好性能。","最低 8GB，推荐 16GB+",{"notes":115,"python":116,"dependencies":117},"必须安装并运行 Ollama 服务（需手动拉取 qwen3:0.6b 和 qwen3:8b 模型）。前端需要 Node.js 16+ (测试版本 23.10.0) 和 npm。目前安装流程主要在 macOS 上经过测试，其他系统可能需要额外配置。支持 Docker 部署。","3.8+ (测试版本 3.11.5)",[118,119,120,121,122,123,124,125],"torch==2.4.1","transformers==4.51.0","lancedb","sentence_transformers","rerankers","docling","rank_bm25","fuzzywuzzy",[26,13],"2026-03-27T02:49:30.150509","2026-04-06T06:44:37.553360",[130,135,140,145,150,155],{"id":131,"question_zh":132,"answer_zh":133,"source_url":134},15156,"模型推理速度非常慢，如何解决？","这通常是因为 llama.cpp 已不再支持旧的 ggml 格式，转而使用 gguf 格式。您可以尝试通过强制安装特定版本的 llama-cpp-python 并启用 CUDA 加速来解决：\n```bash\nCMAKE_ARGS=\"-DLLAMA_CUBLAS=on\" FORCE_CMAKE=1 pip install 
llama-cpp-python==0.1.78 --no-cache-dir\n```\n如果更新后仍无法加载模型，请确保您使用的模型文件格式与当前库版本兼容（建议切换到 GGUF 格式的模型）。","https:\u002F\u002Fgithub.com\u002FPromtEngineer\u002FlocalGPT\u002Fissues\u002F394",{"id":136,"question_zh":137,"answer_zh":138,"source_url":139},15157,"运行时报错 'Chroma collection contains fewer than X elements' 且只能识别到少量数据怎么办？","这通常意味着数据摄入（ingestion）过程未正确完成或索引损坏。虽然具体修复步骤因情况而异，但确认摄入脚本（ingest.py）是否成功执行是关键。如果遇到此错误，建议重新运行数据摄入流程，并确保在运行问答脚本前没有报错。如果问题持续，可能需要删除旧的数据库文件夹（如 DB 目录）后重新摄入数据。","https:\u002F\u002Fgithub.com\u002FPromtEngineer\u002FlocalGPT\u002Fissues\u002F343",{"id":141,"question_zh":142,"answer_zh":143,"source_url":144},15158,"遇到 'AssertionError: Torch not compiled with CUDA enabled' 错误，但已安装 CUDA 版本的 PyTorch 怎么办？","即使系统显示已安装 CUDA 版本的 PyTorch，缓存中的旧版本可能仍在被使用。解决方法是强制重新安装带有 CUDA 支持的 torch 包，并清除缓存：\n```bash\npip install --force-reinstall --no-cache-dir torch torchvision torchaudio --index-url https:\u002F\u002Fdownload.pytorch.org\u002Fwhl\u002Fcu118\n```\n注意：请将 URL 中的 cu118 替换为您实际需要的 CUDA 版本。您也可以访问 PyTorch 官网使用其命令生成器获取适合您操作系统和 CUDA 版本的准确安装命令。","https:\u002F\u002Fgithub.com\u002FPromtEngineer\u002FlocalGPT\u002Fissues\u002F156",{"id":146,"question_zh":147,"answer_zh":148,"source_url":149},15159,"如何为 localGPT 添加类似 text-generation-webui 的 Gradio Web 界面？","社区用户已经提供了相关的修改方案。首先安装 gradio：\n```bash\npip3 install gradio\n```\n然后可以运行修改后的脚本（如 `run_localGPT_WebUI.py` 或社区提供的 `run_localGPT_try_API.py`）。部分实现支持生成长期有效的公共 URL（默认 72 小时），以便在外网访问。具体的修改文件包（如 localGPT-try-API-n-WEBUI-v4.zip）可在相关 Issue 的附件或讨论中找到。","https:\u002F\u002Fgithub.com\u002FPromtEngineer\u002FlocalGPT\u002Fissues\u002F74",{"id":151,"question_zh":152,"answer_zh":153,"source_url":154},15160,"运行时出现 'pydantic.error_wrappers.ValidationError: none is not an allowed value' 错误是什么原因？","该错误通常发生在 LLMChain 初始化时传入了空的 llm 对象。这往往是因为模型加载失败导致 llm 变量为 None。常见原因包括：\n1. 模型文件格式不匹配（如代码期望 GGUF 但提供了 GGML，或反之）；\n2. llama-cpp-python 版本与模型格式不兼容；\n3. 显存不足导致模型加载中断。\n请检查日志中模型加载部分是否有报错，并确认安装的 llama-cpp-python 版本支持您当前使用的模型格式（推荐使用 GGUF 格式及最新版 llama-cpp-python）。","https:\u002F\u002Fgithub.com\u002FPromtEngineer\u002FlocalGPT\u002Fissues\u002F535",{"id":156,"question_zh":157,"answer_zh":158,"source_url":144},15161,"如何在 Windows 上确保 ingest.py 能正确使用 GPU 进行数据摄入？","在 Windows 上，如果 ingest.py 未使用 GPU，通常是因为 PyTorch 未正确安装 CUDA 版本。即使 conda 显示已安装，也可能因缓存问题导致实际使用的是 CPU 版本。请执行以下命令强制重装：\n```bash\npip install --force-reinstall --no-cache-dir torch torchvision torchaudio --index-url https:\u002F\u002Fdownload.pytorch.org\u002Fwhl\u002Fcu118\n```\n安装完成后，再次运行 ingest.py，日志中应显示使用了 CUDA 设备。",[]]