[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-DonTizi--rlama":3,"tool-DonTizi--rlama":62},[4,18,26,36,46,54],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":17},4358,"openclaw","openclaw\u002Fopenclaw","OpenClaw 是一款专为个人打造的本地化 AI 助手，旨在让你在自己的设备上拥有完全可控的智能伙伴。它打破了传统 AI 助手局限于特定网页或应用的束缚，能够直接接入你日常使用的各类通讯渠道，包括微信、WhatsApp、Telegram、Discord、iMessage 等数十种平台。无论你在哪个聊天软件中发送消息，OpenClaw 都能即时响应，甚至支持在 macOS、iOS 和 Android 设备上进行语音交互，并提供实时的画布渲染功能供你操控。\n\n这款工具主要解决了用户对数据隐私、响应速度以及“始终在线”体验的需求。通过将 AI 部署在本地，用户无需依赖云端服务即可享受快速、私密的智能辅助，真正实现了“你的数据，你做主”。其独特的技术亮点在于强大的网关架构，将控制平面与核心助手分离，确保跨平台通信的流畅性与扩展性。\n\nOpenClaw 非常适合希望构建个性化工作流的技术爱好者、开发者，以及注重隐私保护且不愿被单一生态绑定的普通用户。只要具备基础的终端操作能力（支持 macOS、Linux 及 Windows WSL2），即可通过简单的命令行引导完成部署。如果你渴望拥有一个懂你",349277,3,"2026-04-06T06:32:30",[13,14,15,16],"Agent","开发框架","图像","数据工具","ready",{"id":19,"name":20,"github_repo":21,"description_zh":22,"stars":23,"difficulty_score":10,"last_commit_at":24,"category_tags":25,"status":17},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,"2026-04-05T11:01:52",[14,15,13],{"id":27,"name":28,"github_repo":29,"description_zh":30,"stars":31,"difficulty_score":32,"last_commit_at":33,"category_tags":34,"status":17},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 
代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",158594,2,"2026-04-16T23:34:05",[14,13,35],"语言模型",{"id":37,"name":38,"github_repo":39,"description_zh":40,"stars":41,"difficulty_score":42,"last_commit_at":43,"category_tags":44,"status":17},8272,"opencode","anomalyco\u002Fopencode","OpenCode 是一款开源的 AI 编程助手（Coding Agent），旨在像一位智能搭档一样融入您的开发流程。它不仅仅是一个代码补全插件，而是一个能够理解项目上下文、自主规划任务并执行复杂编码操作的智能体。无论是生成全新功能、重构现有代码，还是排查难以定位的 Bug，OpenCode 都能通过自然语言交互高效完成，显著减少开发者在重复性劳动和上下文切换上的时间消耗。\n\n这款工具专为软件开发者、工程师及技术研究人员设计，特别适合希望利用大模型能力来提升编码效率、加速原型开发或处理遗留代码维护的专业人群。其核心亮点在于完全开源的架构，这意味着用户可以审查代码逻辑、自定义行为策略，甚至私有化部署以保障数据安全，彻底打破了传统闭源 AI 助手的“黑盒”限制。\n\n在技术体验上，OpenCode 提供了灵活的终端界面（Terminal UI）和正在测试中的桌面应用程序，支持 macOS、Windows 及 Linux 全平台。它兼容多种包管理工具，安装便捷，并能无缝集成到现有的开发环境中。无论您是追求极致控制权的资深极客，还是渴望提升产出的独立开发者，OpenCode 都提供了一个透明、可信",144296,1,"2026-04-16T14:50:03",[13,45],"插件",{"id":47,"name":48,"github_repo":49,"description_zh":50,"stars":51,"difficulty_score":32,"last_commit_at":52,"category_tags":53,"status":17},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 
都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",108322,"2026-04-10T11:39:34",[14,15,13],{"id":55,"name":56,"github_repo":57,"description_zh":58,"stars":59,"difficulty_score":32,"last_commit_at":60,"category_tags":61,"status":17},6121,"gemini-cli","google-gemini\u002Fgemini-cli","gemini-cli 是一款由谷歌推出的开源 AI 命令行工具，它将强大的 Gemini 大模型能力直接集成到用户的终端环境中。对于习惯在命令行工作的开发者而言，它提供了一条从输入提示词到获取模型响应的最短路径，无需切换窗口即可享受智能辅助。\n\n这款工具主要解决了开发过程中频繁上下文切换的痛点，让用户能在熟悉的终端界面内直接完成代码理解、生成、调试以及自动化运维任务。无论是查询大型代码库、根据草图生成应用，还是执行复杂的 Git 操作，gemini-cli 都能通过自然语言指令高效处理。\n\n它特别适合广大软件工程师、DevOps 人员及技术研究人员使用。其核心亮点包括支持高达 100 万 token 的超长上下文窗口，具备出色的逻辑推理能力；内置 Google 搜索、文件操作及 Shell 命令执行等实用工具；更独特的是，它支持 MCP（模型上下文协议），允许用户灵活扩展自定义集成，连接如图像生成等外部能力。此外，个人谷歌账号即可享受免费的额度支持，且项目基于 Apache 2.0 协议完全开源，是提升终端工作效率的理想助手。",100752,"2026-04-10T01:20:03",[45,13,15,14],{"id":63,"github_repo":64,"name":65,"description_en":66,"description_zh":67,"ai_summary_zh":67,"readme_en":68,"readme_zh":69,"quickstart_zh":70,"use_case_zh":71,"hero_image_url":72,"owner_login":73,"owner_name":73,"owner_avatar_url":74,"owner_bio":75,"owner_company":76,"owner_location":77,"owner_email":76,"owner_twitter":78,"owner_website":79,"owner_url":80,"languages":81,"stars":110,"forks":111,"last_commit_at":112,"license":113,"difficulty_score":32,"env_os":114,"env_gpu":115,"env_ram":116,"env_deps":117,"category_tags":122,"github_topics":76,"view_count":32,"oss_zip_url":76,"oss_zip_packed_at":76,"status":17,"created_at":123,"updated_at":124,"faqs":125,"releases":159},8218,"DonTizi\u002Frlama","rlama","A powerful document AI question-answering tool that connects to your local Ollama models. 
Create, manage, and interact with RAG systems for all your document needs.","rlama 是一款强大的本地文档智能问答工具，旨在帮助用户轻松构建基于检索增强生成（RAG）技术的私有知识库系统。它能无缝连接用户本地运行的 Ollama 大语言模型，让用户直接对各类文档进行提问、管理和交互，无需依赖云端服务即可实现高效的信息检索。\n\n针对个人和企业面临的数据隐私顾虑及高昂的 API 调用成本问题，rlama 提供了一套完整的本地化解决方案。它支持从本地文件夹或指定网站抓取内容，自动处理文档分块并建立索引，让用户能够随时向自己的文档库发起自然语言查询。此外，它还具备目录监控和网站更新追踪功能，确保知识库能随源文件变化自动同步。\n\n这款工具特别适合注重数据安全的开发者、研究人员以及希望搭建私有知识助理的技术爱好者使用。其独特的技术亮点在于完全本地化的部署架构，结合命令行界面的灵活操作，不仅支持多种文档格式，还允许用户直接在 Hugging Face 上浏览并运行 GGUF 格式的模型。尽管目前项目因作者事务暂时暂停维护，但其已有的功能依然为构建离线、安全且可定制的 AI 问答系统提供了极具价值的参考与实践路径。","\u003C!-- Social Links Navigation Bar -->\n\u003Cdiv align=\"center\">\n  \u003Ca href=\"https:\u002F\u002Fx.com\u002FLeDonTizi\" target=\"_blank\">\n    \u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FTwitter-1DA1F2?style=for-the-badge&logo=twitter&logoColor=white\" alt=\"Twitter\">\n  \u003C\u002Fa>\n  \u003Ca href=\"https:\u002F\u002Fdiscord.gg\u002FtP5JB9DR\" target=\"_blank\">\n    \u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FDiscord-5865F2?style=for-the-badge&logo=discord&logoColor=white\" alt=\"Discord\">\n  \u003C\u002Fa>\n  \u003Ca href=\"https:\u002F\u002Fwww.youtube.com\u002F@Dontizi\" target=\"_blank\">\n    \u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FYouTube-FF0000?style=for-the-badge&logo=youtube&logoColor=white\" alt=\"YouTube\">\n  \u003C\u002Fa>\n\u003C\u002Fdiv>\n\n\u003Cbr>\n\n# RLAMA - User Guide\n\n> **⚠️ Project Temporarily Paused**  \n> This project is currently on pause due to my work and university commitments that take up a lot of my time. I am not able to actively maintain this project at the moment. Development will resume when my situation allows it.\n\nRLAMA is a powerful AI-driven question-answering tool for your documents, seamlessly integrating with your local Ollama models. 
It enables you to create, manage, and interact with Retrieval-Augmented Generation (RAG) systems tailored to your documentation needs.\n\n\n[![RLAMA Demonstration](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FDonTizi_rlama_readme_68f4fb920967.jpg)](https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=EIsQnBqeQxQ)\n\n## Table of Contents\n- [Vision & Roadmap](#vision--roadmap)\n- [Installation](#installation)\n- [Available Commands](#available-commands)\n  - [rag - Create a RAG system](#rag---create-a-rag-system)\n  - [crawl-rag - Create a RAG system from a website](#crawl-rag---create-a-rag-system-from-a-website)\n  - [wizard - Create a RAG system with interactive setup](#wizard---create-a-rag-system-with-interactive-setup)\n  - [watch - Set up directory watching for a RAG system](#watch---set-up-directory-watching-for-a-rag-system)\n  - [watch-off - Disable directory watching for a RAG system](#watch-off---disable-directory-watching-for-a-rag-system)\n  - [check-watched - Check a RAG's watched directory for new files](#check-watched---check-a-rags-watched-directory-for-new-files)\n  - [web-watch - Set up website monitoring for a RAG system](#web-watch---set-up-website-monitoring-for-a-rag-system)\n  - [web-watch-off - Disable website monitoring for a RAG system](#web-watch-off---disable-website-monitoring-for-a-rag-system)\n  - [check-web-watched - Check a RAG's monitored website for updates](#check-web-watched---check-a-rags-monitored-website-for-updates)\n  - [run - Use a RAG system](#run---use-a-rag-system)\n  - [api - Start API server](#api---start-api-server)\n  - [list - List RAG systems](#list---list-rag-systems)\n  - [delete - Delete a RAG system](#delete---delete-a-rag-system)\n  - [list-docs - List documents in a RAG](#list-docs---list-documents-in-a-rag)\n  - [list-chunks - Inspect document chunks](#list-chunks---inspect-document-chunks)\n  - [view-chunk - View chunk details](#view-chunk---view-chunk-details)\n  - [add-docs - Add documents to 
RAG](#add-docs---add-documents-to-rag)\n  - [crawl-add-docs - Add website content to RAG](#crawl-add-docs---add-website-content-to-rag)\n  - [update-model - Change LLM model](#update-model---change-llm-model)\n  - [update - Update RLAMA](#update---update-rlama)\n  - [version - Display version](#version---display-version)\n  - [hf-browse - Browse GGUF models on Hugging Face](#hf-browse---browse-gguf-models-on-hugging-face)\n  - [run-hf - Run a Hugging Face GGUF model](#run-hf---run-a-hugging-face-gguf-model)\n- [Uninstallation](#uninstallation)\n- [Supported Document Formats](#supported-document-formats)\n- [Troubleshooting](#troubleshooting)\n- [Using OpenAI Models](#using-openai-models)\n\n## Vision & Roadmap\nRLAMA aims to become the definitive tool for creating local RAG systems that work seamlessly for everyone—from individual developers to large enterprises. Here's our strategic roadmap:\n\n### Completed Features ✅\n- ✅ **Basic RAG System Creation**: CLI tool for creating and managing RAG systems\n- ✅ **Document Processing**: Support for multiple document formats (.txt, .md, .pdf, etc.)\n- ✅ **Document Chunking**: Advanced semantic chunking with multiple strategies (fixed, semantic, hierarchical, hybrid)\n- ✅ **Vector Storage**: Local storage of document embeddings\n- ✅ **Context Retrieval**: Basic semantic search with configurable context size\n- ✅ **Ollama Integration**: Seamless connection to Ollama models\n- ✅ **Cross-Platform Support**: Works on Linux, macOS, and Windows\n- ✅ **Easy Installation**: One-line installation script\n- ✅ **API Server**: HTTP endpoints for integrating RAG capabilities in other applications\n- ✅ **Web Crawling**: Create RAGs directly from websites\n- ✅ **Guided RAG Setup Wizard**: Interactive interface for easy RAG creation\n- ✅ **Hugging Face Integration**: Access to 45,000+ GGUF models from Hugging Face Hub\n\n### Small LLM Optimization (Q2 2025)\n- [ ] **Prompt Compression**: Smart context summarization for limited context 
windows\n- ✅ **Adaptive Chunking**: Dynamic content segmentation based on semantic boundaries and document structure\n- ✅ **Minimal Context Retrieval**: Intelligent filtering to eliminate redundant content\n- [ ] **Parameter Optimization**: Fine-tuned settings for different model sizes\n\n### Advanced Embedding Pipeline (Q2-Q3 2025)\n- [ ] **Multi-Model Embedding Support**: Integration with various embedding models\n- [ ] **Hybrid Retrieval Techniques**: Combining sparse and dense retrievers for better accuracy\n- [ ] **Embedding Evaluation Tools**: Built-in metrics to measure retrieval quality\n- [ ] **Automated Embedding Cache**: Smart caching to reduce computation for similar queries\n\n### User Experience Enhancements (Q3 2025)\n- [ ] **Lightweight Web Interface**: Simple browser-based UI for the existing CLI backend\n- [ ] **Knowledge Graph Visualization**: Interactive exploration of document connections\n- [ ] **Domain-Specific Templates**: Pre-configured settings for different domains\n\n### Enterprise Features (Q4 2025)\n- [ ] **Multi-User Access Control**: Role-based permissions for team environments\n- [ ] **Integration with Enterprise Systems**: Connectors for SharePoint, Confluence, Google Workspace\n- [ ] **Knowledge Quality Monitoring**: Detection of outdated or contradictory information\n- [ ] **System Integration API**: Webhooks and APIs for embedding RLAMA in existing workflows\n- [ ] **AI Agent Creation Framework**: Simplified system for building custom AI agents with RAG capabilities\n\n### Next-Gen Retrieval Innovations (Q1 2026)\n- [ ] **Multi-Step Retrieval**: Using the LLM to refine search queries for complex questions\n- [ ] **Cross-Modal Retrieval**: Support for image content understanding and retrieval\n- [ ] **Feedback-Based Optimization**: Learning from user interactions to improve retrieval\n- [ ] **Knowledge Graphs & Symbolic Reasoning**: Combining vector search with structured knowledge\n\nRLAMA's core philosophy remains unchanged: to 
provide a simple, powerful, local RAG solution that respects privacy, minimizes resource requirements, and works seamlessly across platforms.\n\n## Installation\n\n### Prerequisites\n- [Ollama](https:\u002F\u002Follama.ai\u002F) installed and running\n\n### Installation from terminal\n\n```bash\ncurl -fsSL https:\u002F\u002Fraw.githubusercontent.com\u002Fdontizi\u002Frlama\u002Fmain\u002Finstall.sh | sh\n```\n\n## Tech Stack\n\nRLAMA is built with:\n\n- **Core Language**: Go (chosen for performance, cross-platform compatibility, and single binary distribution)\n- **CLI Framework**: Cobra (for command-line interface structure)\n- **LLM Integration**: Ollama API (for embeddings and completions)\n- **Storage**: Local filesystem-based storage (JSON files for simplicity and portability)\n- **Vector Search**: Custom implementation of cosine similarity for embedding retrieval\n\n## Architecture\n\nRLAMA follows a clean architecture pattern with clear separation of concerns:\n\n```\nrlama\u002F\n├── cmd\u002F                  # CLI commands (using Cobra)\n│   ├── root.go           # Base command\n│   ├── rag.go            # Create RAG systems\n│   ├── run.go            # Query RAG systems\n│   └── ...\n├── internal\u002F\n│   ├── client\u002F           # External API clients\n│   │   └── ollama_client.go # Ollama API integration\n│   ├── domain\u002F           # Core domain models\n│   │   ├── rag.go        # RAG system entity\n│   │   └── document.go   # Document entity\n│   ├── repository\u002F       # Data persistence\n│   │   └── rag_repository.go # Handles saving\u002Floading RAGs\n│   └── service\u002F          # Business logic\n│       ├── rag_service.go      # RAG operations\n│       ├── document_loader.go  # Document processing\n│       └── embedding_service.go # Vector embeddings\n└── pkg\u002F                  # Shared utilities\n    └── vector\u002F           # Vector operations\n```\n\n## Data Flow\n\n1. 
**Document Processing**: Documents are loaded from the file system, parsed based on their type, and converted to plain text.\n2. **Embedding Generation**: Document text is sent to Ollama to generate vector embeddings.\n3. **Storage**: The RAG system (documents + embeddings) is stored in the user's home directory (~\u002F.rlama).\n4. **Query Process**: When a user asks a question, it's converted to an embedding, compared against stored document embeddings, and relevant content is retrieved.\n5. **Response Generation**: Retrieved content and the question are sent to Ollama to generate a contextually-informed response.\n\n## Visual Representation\n\n```\n┌─────────────┐     ┌─────────────┐     ┌─────────────┐\n│  Documents  │────>│  Document   │────>│  Embedding  │\n│  (Input)    │     │  Processing │     │  Generation │\n└─────────────┘     └─────────────┘     └─────────────┘\n                                              │\n                                              ▼\n┌─────────────┐     ┌─────────────┐     ┌─────────────┐\n│   Query     │────>│  Vector     │\u003C────│ Vector Store│\n│  Response   │     │  Search     │     │ (RAG System)│\n└─────────────┘     └─────────────┘     └─────────────┘\n       ▲                   │\n       │                   ▼\n┌─────────────┐     ┌─────────────┐\n│   Ollama    │\u003C────│   Context   │\n│    LLM      │     │  Building   │\n└─────────────┘     └─────────────┘\n```\n\nRLAMA is designed to be lightweight and portable, focusing on providing RAG capabilities with minimal dependencies. 
The entire system runs locally, with the only external dependency being Ollama for LLM capabilities.\n\n## Available Commands\n\nYou can get help on all commands by using:\n\n```bash\nrlama --help\n```\n\n### Global Flags\n\nThese flags can be used with any command:\n\n```bash\n--host string       Ollama host (default: localhost)\n--port string       Ollama port (default: 11434)\n--num-thread int    Number of threads for Ollama to use (default: 0, use Ollama default)\n```\n\n**Performance Optimization:**\n- Use `--num-thread 16` (or your CPU core count) to potentially improve processing speed\n- Ollama often uses half the available cores by default\n- Setting this to your full core count can significantly speed up text generation and embeddings\n\n**Usage Examples:**\n```bash\n# Use 16 threads for better performance\nrlama --num-thread 16 run my-docs\n\n# Create a RAG with optimized thread usage\nrlama --num-thread 16 rag llama3 documentation .\u002Fdocs\n\n# Run with custom host and thread settings\nrlama --host 192.168.1.100 --port 11434 --num-thread 16 run my-rag\n```\n\n### Custom Data Directory\n\nRLAMA stores data in `~\u002F.rlama` by default. To use a different location:\n\n1. **Command-line flag** (highest priority):\n   ```bash\n   # Use with any command\n   rlama --data-dir \u002Fpath\u002Fto\u002Fcustom\u002Fdirectory run my-rag\n   ```\n\n2. 
**Environment variable**:\n   ```bash\n   # Set the environment variable\n   export RLAMA_DATA_DIR=\u002Fpath\u002Fto\u002Fcustom\u002Fdirectory\n   rlama run my-rag\n   ```\n\nThe precedence order is: command-line flag > environment variable > default location.\n\n### rag - Create a RAG system\n\nCreates a new RAG system by indexing all documents in the specified folder.\n\n```bash\nrlama rag [model] [rag-name] [folder-path]\n```\n\n**Parameters:**\n- `model`: Name of the Ollama model to use (e.g., llama3, mistral, gemma) or a Hugging Face model using the format `hf.co\u002Fusername\u002Frepository[:quantization]`.\n- `rag-name`: Unique name to identify your RAG system.\n- `folder-path`: Path to the folder containing your documents.\n\n**Example:**\n\n```bash\n# Using a standard Ollama model\nrlama rag llama3 documentation .\u002Fdocs\n\n# Using a Hugging Face model\nrlama rag hf.co\u002Fbartowski\u002FLlama-3.2-1B-Instruct-GGUF my-rag .\u002Fdocs\n\n# Using a Hugging Face model with specific quantization\nrlama rag hf.co\u002Fmlabonne\u002FMeta-Llama-3.1-8B-Instruct-abliterated-GGUF:Q5_K_M my-rag .\u002Fdocs\n```\n\n### crawl-rag - Create a RAG system from a website\n\nCreates a new RAG system by crawling a website and indexing its content.\n\n```bash\nrlama crawl-rag [model] [rag-name] [website-url]\n```\n\n**Parameters:**\n- `model`: Name of the Ollama model to use (e.g., llama3, mistral, gemma).\n- `rag-name`: Unique name to identify your RAG system.\n- `website-url`: URL of the website to crawl and index.\n\n**Options:**\n- `--max-depth`: Maximum crawl depth (default: 2)\n- `--concurrency`: Number of concurrent crawlers (default: 5)\n- `--exclude-path`: Paths to exclude from crawling (comma-separated)\n- `--chunk-size`: Character count per chunk (default: 1000)\n- `--chunk-overlap`: Overlap between chunks in characters (default: 200)\n- `--chunking-strategy`: Chunking strategy to use (options: \"fixed\", \"semantic\", \"hybrid\", \"hierarchical\", default: 
\"hybrid\")\n\n#### Chunking Strategies\n\nRLAMA offers multiple advanced chunking strategies to optimize document retrieval:\n\n- **Fixed**: Traditional chunking with fixed size and overlap, respecting sentence boundaries when possible.\n- **Semantic**: Intelligently splits documents based on semantic boundaries like headings, paragraphs, and natural topic shifts.\n- **Hybrid**: Automatically selects the best strategy based on document type and content (markdown, HTML, code, or plain text).\n- **Hierarchical**: For very long documents, creates a two-level chunking structure with major sections and sub-chunks.\n\nThe system automatically adapts to different document types:\n- Markdown documents: Split by headers and sections\n- HTML documents: Split by semantic HTML elements\n- Code documents: Split by functions, classes, and logical blocks\n- Plain text: Split by paragraphs with contextual overlap\n\n**Example:**\n\n```bash\n# Create a new RAG from a documentation website\nrlama crawl-rag llama3 docs-rag https:\u002F\u002Fdocs.example.com\n\n# Customize crawling behavior\nrlama crawl-rag llama3 blog-rag https:\u002F\u002Fblog.example.com --max-depth=3 --exclude-path=\u002Farchive,\u002Ftags\n\n# Create a RAG with semantic chunking\nrlama rag llama3 documentation .\u002Fdocs --chunking-strategy=semantic\n\n# Use hierarchical chunking for large documents\nrlama rag llama3 book-rag .\u002Fbooks --chunking-strategy=hierarchical\n```\n\n### wizard - Create a RAG system with interactive setup\n\nProvides an interactive step-by-step wizard for creating a new RAG system.\n\n```bash\nrlama wizard\n```\n\nThe wizard guides you through:\n- Naming your RAG\n- Choosing an Ollama model\n- Selecting document sources (local folder or website)\n- Configuring chunking parameters\n- Setting up file filtering\n\n**Example:**\n\n```bash\nrlama wizard\n# Follow the prompts to create your customized RAG\n```\n\n### watch - Set up directory watching for a RAG system\n\nConfigure a RAG 
system to automatically watch a directory for new files and add them to the RAG.\n\n```bash\nrlama watch [rag-name] [directory-path] [interval]\n```\n\n**Parameters:**\n- `rag-name`: Name of the RAG system to watch.\n- `directory-path`: Path to the directory to watch for new files.\n- `interval`: Time in minutes to check for new files (use 0 to check only when the RAG is used).\n\n**Example:**\n\n```bash\n# Set up directory watching to check every 60 minutes\nrlama watch my-docs .\u002Fwatched-folder 60\n\n# Set up directory watching to only check when the RAG is used\nrlama watch my-docs .\u002Fwatched-folder 0\n\n# Customize what files to watch\nrlama watch my-docs .\u002Fwatched-folder 30 --exclude-dir=node_modules,tmp --process-ext=.md,.txt\n```\n\n### watch-off - Disable directory watching for a RAG system\n\nDisable automatic directory watching for a RAG system.\n\n```bash\nrlama watch-off [rag-name]\n```\n\n**Parameters:**\n- `rag-name`: Name of the RAG system to disable watching.\n\n**Example:**\n\n```bash\nrlama watch-off my-docs\n```\n\n### check-watched - Check a RAG's watched directory for new files\n\nManually check a RAG's watched directory for new files and add them to the RAG.\n\n```bash\nrlama check-watched [rag-name]\n```\n\n**Parameters:**\n- `rag-name`: Name of the RAG system to check.\n\n**Example:**\n\n```bash\nrlama check-watched my-docs\n```\n\n### web-watch - Set up website monitoring for a RAG system\n\nConfigure a RAG system to automatically monitor a website for updates and add new content to the RAG.\n\n```bash\nrlama web-watch [rag-name] [website-url] [interval]\n```\n\n**Parameters:**\n- `rag-name`: Name of the RAG system to monitor.\n- `website-url`: URL of the website to monitor.\n- `interval`: Time in minutes between checks (use 0 to check only when the RAG is used).\n\n**Example:**\n\n```bash\n# Set up website monitoring to check every 60 minutes\nrlama web-watch my-docs https:\u002F\u002Fexample.com 60\n\n# Set up website 
monitoring to only check when the RAG is used\nrlama web-watch my-docs https:\u002F\u002Fexample.com 0\n\n# Customize what content to monitor\nrlama web-watch my-docs https:\u002F\u002Fexample.com 30 --exclude-path=\u002Farchive,\u002Ftags\n```\n\n### web-watch-off - Disable website monitoring for a RAG system\n\nDisable automatic website monitoring for a RAG system.\n\n```bash\nrlama web-watch-off [rag-name]\n```\n\n**Parameters:**\n- `rag-name`: Name of the RAG system whose website monitoring should be disabled.\n\n**Example:**\n\n```bash\nrlama web-watch-off my-docs\n```\n\n### check-web-watched - Check a RAG's monitored website for updates\n\nManually check a RAG's monitored website for updates and add any new content to the RAG.\n\n```bash\nrlama check-web-watched [rag-name]\n```\n\n**Parameters:**\n- `rag-name`: Name of the RAG system to check.\n\n**Example:**\n\n```bash\nrlama check-web-watched my-docs\n```\n\n### run - Use a RAG system\n\nStarts an interactive session for querying an existing RAG system.\n\n```bash\nrlama run [rag-name]\n```\n\n**Parameters:**\n- `rag-name`: Name of the RAG system to use.\n- `--context-size`: (Optional) Number of context chunks to retrieve (default: 20)\n\n**Example:**\n\n```bash\nrlama run documentation\n> How do I install the project?\n> What are the main features?\n> exit\n```\n\n**Context Size Tips:**\n- Smaller values (5-15) for faster responses with key information\n- Medium values (20-40) for balanced performance\n- Larger values (50+) for complex questions needing broad context\n- Consider your model's context window limits\n\n```bash\nrlama run documentation --context-size=50  # Use 50 context chunks\n```\n\n### api - Start API server\n\nStarts an HTTP API server that exposes RLAMA's functionality through RESTful endpoints.\n\n```bash\nrlama api [--port PORT]\n```\n\n**Parameters:**\n- `--port`: (Optional) Port number to run the API server on (default: 11249)\n\n**Example:**\n\n```bash\nrlama api --port 8080\n```\n\n**Available 
Endpoints:**\n\n1. **Query a RAG system** - `POST \u002Frag`\n   ```bash\n   curl -X POST http:\u002F\u002Flocalhost:11249\u002Frag \\\n     -H \"Content-Type: application\u002Fjson\" \\\n     -d '{\n       \"rag_name\": \"documentation\",\n       \"prompt\": \"How do I install the project?\",\n       \"context_size\": 20\n     }'\n   ```\n\n   Request fields:\n   - `rag_name` (required): Name of the RAG system to query\n   - `prompt` (required): Question or prompt to send to the RAG\n   - `context_size` (optional): Number of chunks to include in context\n   - `model` (optional): Override the model used by the RAG\n\n2. **Check server health** - `GET \u002Fhealth`\n   ```bash\n   curl http:\u002F\u002Flocalhost:11249\u002Fhealth\n   ```\n\n**Integration Example:**\n```javascript\n\u002F\u002F Node.js example\nconst response = await fetch('http:\u002F\u002Flocalhost:11249\u002Frag', {\n  method: 'POST',\n  headers: { 'Content-Type': 'application\u002Fjson' },\n  body: JSON.stringify({\n    rag_name: 'my-docs',\n    prompt: 'Summarize the key features'\n  })\n});\nconst data = await response.json();\nconsole.log(data.response);\n```\n\n### list - List RAG systems\n\nDisplays a list of all available RAG systems.\n\n```bash\nrlama list\n```\n\n### delete - Delete a RAG system\n\nPermanently deletes a RAG system and all its indexed documents.\n\n```bash\nrlama delete [rag-name] [--force\u002F-f]\n```\n\n**Parameters:**\n- `rag-name`: Name of the RAG system to delete.\n- `--force` or `-f`: (Optional) Delete without asking for confirmation.\n\n**Example:**\n\n```bash\nrlama delete old-project\n```\n\nOr to delete without confirmation:\n\n```bash\nrlama delete old-project --force\n```\n\n### list-docs - List documents in a RAG\n\nDisplays all documents in a RAG system with metadata.\n\n```bash\nrlama list-docs [rag-name]\n```\n\n**Parameters:**\n- `rag-name`: Name of the RAG system\n\n**Example:**\n\n```bash\nrlama list-docs documentation\n```\n\n### list-chunks - Inspect 
document chunks\n\nList and filter document chunks in a RAG system with various options:\n\n```bash\n# Basic chunk listing\nrlama list-chunks [rag-name]\n\n# With content preview (shows first 100 characters)\nrlama list-chunks [rag-name] --show-content\n\n# Filter by document name\u002FID substring\nrlama list-chunks [rag-name] --document=readme\n\n# Combine options\nrlama list-chunks [rag-name] --document=api --show-content\n```\n\n**Options:**\n- `--show-content`: Display chunk content preview\n- `--document`: Filter by document name\u002FID substring\n\n**Output columns:**\n- Chunk ID (use with view-chunk command)\n- Document Source\n- Chunk Position (e.g., \"2\u002F5\" for second of five chunks)\n- Content Preview (if enabled)\n- Created Date\n\n### view-chunk - View chunk details\n\nDisplay detailed information about a specific chunk.\n\n```bash\nrlama view-chunk [rag-name] [chunk-id]\n```\n\n**Parameters:**\n- `rag-name`: Name of the RAG system\n- `chunk-id`: Chunk identifier from list-chunks\n\n**Example:**\n\n```bash\nrlama view-chunk documentation doc123_chunk_0\n```\n\n### add-docs - Add documents to RAG\n\nAdd new documents to an existing RAG system.\n\n```bash\nrlama add-docs [rag-name] [folder-path] [flags]\n```\n\n**Parameters:**\n- `rag-name`: Name of the RAG system\n- `folder-path`: Path to documents folder\n\n**Example:**\n\n```bash\nrlama add-docs documentation .\u002Fnew-docs --exclude-ext=.tmp\n```\n\n### crawl-add-docs - Add website content to RAG\n\nAdd content from a website to an existing RAG system.\n\n```bash\nrlama crawl-add-docs [rag-name] [website-url]\n```\n\n**Parameters:**\n- `rag-name`: Name of the RAG system\n- `website-url`: URL of the website to crawl and add to the RAG\n\n**Options:**\n- `--max-depth`: Maximum crawl depth (default: 2)\n- `--concurrency`: Number of concurrent crawlers (default: 5)\n- `--exclude-path`: Paths to exclude from crawling (comma-separated)\n- `--chunk-size`: Character count per chunk (default: 1000)\n- 
`--chunk-overlap`: Overlap between chunks in characters (default: 200)\n\n**Example:**\n\n```bash\n# Add blog content to an existing RAG\nrlama crawl-add-docs my-docs https:\u002F\u002Fblog.example.com\n\n# Customize crawling behavior\nrlama crawl-add-docs knowledge-base https:\u002F\u002Fdocs.example.com --max-depth=1 --exclude-path=\u002Fapi\n```\n\n### update-model - Change LLM model\n\nUpdate the LLM model used by a RAG system.\n\n```bash\nrlama update-model [rag-name] [new-model]\n```\n\n**Parameters:**\n- `rag-name`: Name of the RAG system\n- `new-model`: New Ollama model name\n\n**Example:**\n\n```bash\nrlama update-model documentation deepseek-r1:7b-instruct\n```\n\n### update - Update RLAMA\n\nChecks if a new version of RLAMA is available and installs it.\n\n```bash\nrlama update [--force\u002F-f]\n```\n\n**Options:**\n- `--force` or `-f`: (Optional) Update without asking for confirmation.\n\n### version - Display version\n\nDisplays the current version of RLAMA.\n\n```bash\nrlama --version\n```\n\nor\n\n```bash\nrlama -v\n```\n\n### hf-browse - Browse GGUF models on Hugging Face\n\nSearch and browse GGUF models available on Hugging Face.\n\n```bash\nrlama hf-browse [search-term] [flags]\n```\n\n**Parameters:**\n- `search-term`: (Optional) Term to search for (e.g., \"llama3\", \"mistral\")\n\n**Flags:**\n- `--open`: Open the search results in your default web browser\n- `--quant`: Specify quantization type to suggest (e.g., Q4_K_M, Q5_K_M)\n- `--limit`: Limit number of results (default: 10)\n\n**Examples:**\n\n```bash\n# Search for GGUF models and show command-line help\nrlama hf-browse \"llama 3\"\n\n# Open browser with search results\nrlama hf-browse mistral --open\n\n# Search with specific quantization suggestion\nrlama hf-browse phi --quant Q4_K_M\n```\n\n### run-hf - Run a Hugging Face GGUF model\n\nRun a Hugging Face GGUF model directly using Ollama. 
This is useful for testing models before creating a RAG system with them.\n\n```bash\nrlama run-hf [huggingface-model] [flags]\n```\n\n**Parameters:**\n- `huggingface-model`: Hugging Face model path in the format `username\u002Frepository`\n\n**Flags:**\n- `--quant`: Quantization to use (e.g., Q4_K_M, Q5_K_M)\n\n**Examples:**\n\n```bash\n# Try a model in chat mode\nrlama run-hf bartowski\u002FLlama-3.2-1B-Instruct-GGUF\n\n# Specify quantization\nrlama run-hf mlabonne\u002FMeta-Llama-3.1-8B-Instruct-abliterated-GGUF --quant Q5_K_M\n```\n\n## Uninstallation\n\nTo uninstall RLAMA:\n\n### Removing the binary\n\nIf you installed via `go install`:\n\n```bash\nrlama uninstall\n```\n\n### Removing data\n\nRLAMA stores its data in `~\u002F.rlama`. To remove it:\n\n```bash\nrm -rf ~\u002F.rlama\n```\n\n## Supported Document Formats\n\nRLAMA supports many file formats:\n\n- **Text**: `.txt`, `.md`, `.html`, `.json`, `.csv`, `.yaml`, `.yml`, `.xml`, `.org`\n- **Code**: `.go`, `.py`, `.js`, `.java`, `.c`, `.cpp`, `.cxx`, `.h`, `.rb`, `.php`, `.rs`, `.swift`, `.kt`, `.ts`, `.tsx`, `.f`, `.F`, `.F90`, `.el`, `.svelte`\n- **Documents**: `.pdf`, `.docx`, `.doc`, `.rtf`, `.odt`, `.pptx`, `.ppt`, `.xlsx`, `.xls`, `.epub`\n\nInstalling dependencies via `install_deps.sh` is recommended to improve support for certain formats.\n\n## Troubleshooting\n\n### Ollama is not accessible\n\nIf you encounter connection errors to Ollama:\n1. Check that Ollama is running.\n2. By default, Ollama must be accessible at `http:\u002F\u002Flocalhost:11434` or the host and port specified by the OLLAMA_HOST environment variable.\n3. If your Ollama instance is running on a different host or port, use the `--host` and `--port` flags:\n   ```bash\n   rlama --host 192.168.1.100 --port 8000 list\n   rlama --host my-ollama-server --port 11434 run my-rag\n   ```\n4. Check Ollama logs for potential errors.\n\n### Text extraction issues\n\nIf you encounter problems with certain formats:\n1. 
Install dependencies via `.\u002Fscripts\u002Finstall_deps.sh`.\n2. Verify that your system has the required tools (`pdftotext`, `tesseract`, etc.).\n\n### The RAG doesn't find relevant information\n\nIf the answers are not relevant:\n1. Check that the documents are properly indexed with `rlama list`.\n2. Make sure the content of the documents is properly extracted.\n3. Try rephrasing your question more precisely.\n4. Consider adjusting chunking parameters during RAG creation.\n\n### Other issues\n\nFor any other issues, please open an issue on the [GitHub repository](https:\u002F\u002Fgithub.com\u002Fdontizi\u002Frlama\u002Fissues) providing:\n1. The exact command used.\n2. The complete output of the command.\n3. Your operating system and architecture.\n4. The RLAMA version (`rlama --version`).\n\n### Configuring Ollama Connection\n\nRLAMA provides multiple ways to connect to your Ollama instance:\n\n1. **Command-line flags** (highest priority):\n   ```bash\n   rlama --host 192.168.1.100 --port 8080 run my-rag\n   ```\n\n2. **Environment variable**:\n   ```bash\n   # Format: \"host:port\" or just \"host\"\n   export OLLAMA_HOST=remote-server:8080\n   rlama run my-rag\n   ```\n\n3. 
**Default values** (used if no other method is specified):\n   - Host: `localhost`\n   - Port: `11434`\n\nThe precedence order is: command-line flags > environment variable > default values.\n\n## Advanced Usage\n\n### Context Size Management\n\n```bash\n# Quick answers with minimal context\nrlama run my-docs --context-size=10\n\n# Deep analysis with maximum context\nrlama run my-docs --context-size=50\n\n# Balance between speed and depth\nrlama run my-docs --context-size=30\n```\n\n### RAG Creation with Filtering\n```bash\nrlama rag llama3 my-project .\u002Fcode \\\n  --exclude-dir=node_modules,dist \\\n  --process-ext=.go,.ts \\\n  --exclude-ext=.spec.ts\n```\n\n### Chunk Inspection\n```bash\n# List chunks with content preview\nrlama list-chunks my-project --show-content\n\n# Filter chunks from specific document\nrlama list-chunks my-project --document=architecture\n```\n\n## Help System\n\nGet full command help:\n```bash\nrlama --help\n```\n\nCommand-specific help:\n```bash\nrlama rag --help\nrlama list-chunks --help\nrlama update-model --help\n```\n\nAll commands support the global `--host` and `--port` flags for custom Ollama connections.\n\nThe precedence order is: command-line flags > environment variable > default values.\n\n## Hugging Face Integration\n\nRLAMA now supports using GGUF models directly from Hugging Face through Ollama's native integration:\n\n### Browsing Hugging Face Models\n\n```bash\n# Search for GGUF models on Hugging Face\nrlama hf-browse \"llama 3\"\n\n# Open browser with search results\nrlama hf-browse mistral --open\n```\n\n### Testing a Model\n\nBefore creating a RAG, you can test a Hugging Face model directly:\n\n```bash\n# Try a model in chat mode\nrlama run-hf bartowski\u002FLlama-3.2-1B-Instruct-GGUF\n\n# Specify quantization\nrlama run-hf mlabonne\u002FMeta-Llama-3.1-8B-Instruct-abliterated-GGUF --quant Q5_K_M\n```\n\n### Creating a RAG with Hugging Face Models\n\nUse Hugging Face models when creating RAG systems:\n\n```bash\n# 
Create a RAG with a Hugging Face model\nrlama rag hf.co\u002Fbartowski\u002FLlama-3.2-1B-Instruct-GGUF my-rag .\u002Fdocs\n\n# Use specific quantization\nrlama rag hf.co\u002Fmlabonne\u002FMeta-Llama-3.1-8B-Instruct-abliterated-GGUF:Q5_K_M my-rag .\u002Fdocs\n```\n\n## Using OpenAI Models\n\nRLAMA supports using OpenAI models with two approaches:\n\n### Option 1: Default API Keys (Automatic Usage)\n\nSet your default OpenAI API key in the web interface or via environment variable. This key will be automatically used for all RLAMA commands without needing to specify a profile.\n\n**Via Web Interface:**\n1. Navigate to **Settings → Default API Keys**\n2. Enter your OpenAI API key (starts with `sk-`)\n3. Click **Save Default API Keys**\n\n**Via Environment Variable:**\n```bash\nexport OPENAI_API_KEY=\"your-api-key\"\n```\n\n**Usage with default keys:**\n```bash\n# These commands will automatically use your default OpenAI API key\nrlama rag o3-mini my-rag .\u002Fdocuments\nrlama rag gpt-4o another-rag .\u002Fdocs\nrlama update-model my-rag gpt-4o\nrlama run my-rag\n```\n\n### Option 2: Named Profiles (Specific Usage)\n\nCreate named profiles for different OpenAI accounts or organizations. 
Use these when you need to switch between different API keys.\n\n**Create profiles:**\n```bash\n# Create profiles for different accounts\nrlama profile add work-account openai \"sk-work-api-key\"\nrlama profile add personal-account openai \"sk-personal-api-key\"\n```\n\n**Usage with named profiles:**\n```bash\n# Specify profile with --profile flag\nrlama rag o3-mini work-rag .\u002Fdocuments --profile work-account\nrlama rag gpt-4o personal-rag .\u002Fdocs --profile personal-account\nrlama update-model my-rag gpt-4o --profile work-account\n```\n\n### Available OpenAI Models (Updated January 2025)\n\n#### Reasoning Models (o-series)\n| Model | Input Price | Output Price | Context | Description |\n|-------|------------|-------------|---------|-------------|\n| **o3-mini** ⭐ | $1.10\u002F1M | $4.40\u002F1M | 200K | Latest reasoning model, 93% cheaper than o1 |\n| o1-pro | $150.00\u002F1M | $600.00\u002F1M | 200K | Most powerful reasoning model (Enterprise) |\n| o1 | $15.00\u002F1M | $60.00\u002F1M | 200K | Advanced reasoning model |\n\n#### GPT-4 Series  \n| Model | Input Price | Output Price | Context | Description |\n|-------|------------|-------------|---------|-------------|\n| **GPT-4.5** 🆕 | $75.00\u002F1M | $150.00\u002F1M | 128K | Natural conversation, emotional intelligence |\n| **GPT-4.1** 🆕 | $30.00\u002F1M | $60.00\u002F1M | 1M | Latest GPT-4 with 1M context window |\n| **GPT-4.1-nano** 🆕 | $5.00\u002F1M | $15.00\u002F1M | 128K | Lightweight version of GPT-4.1 |\n| **GPT-4o** 🔥 | $5.00\u002F1M | $15.00\u002F1M | 128K | Multimodal with images and audio support |\n| **GPT-4o mini** 💰 | $0.15\u002F1M | $0.60\u002F1M | 128K | Efficient version of GPT-4o |\n\n#### GPT-3.5 Series\n| Model | Input Price | Output Price | Context | Description |\n|-------|------------|-------------|---------|-------------|\n| GPT-3.5 Turbo | $0.50\u002F1M | $1.50\u002F1M | 16K | Fast and economical model |\n\n**Legend:** ⭐ = Recommended, 🆕 = New (2025), 🔥 = Popular, 💰 = 
Budget-friendly\n\n**Cost Optimization Tips:**\n- Use context caching for a 50% cost reduction on repeated content\n- Choose appropriate context window sizes\n- Test multiple models for your specific use case\n- Consider o3-mini for reasoning tasks at reduced cost\n\nNote: Only inference uses the OpenAI API. Document embeddings are still generated through Ollama.\n\n## Managing API Profiles\n\n### Using Default Keys (Recommended for Most Users)\n\nFor most users, setting up default API keys is the simplest approach:\n\n**Via Web Interface:**\n1. Open the RLAMA web interface\n2. Go to **Settings → Default API Keys**\n3. Enter your OpenAI API key\n4. Save the configuration\n\n**Commands will automatically use your default key:**\n```bash\n# No --profile needed - uses default key automatically\nrlama rag o3-mini my-rag .\u002Fdocuments\nrlama update-model my-rag gpt-4o\nrlama run my-rag\n```\n\n### Using Named Profiles (Advanced Users)\n\nFor users managing multiple OpenAI accounts or organizations:\n\n#### Creating Named Profiles\n\n**Via CLI:**\n```bash\n# Create profiles for different environments\nrlama profile add work-openai openai \"sk-work-key...\"\nrlama profile add personal-openai openai \"sk-personal-key...\"\n```\n\n**Via Web Interface:**\n1. Navigate to **Settings → Named Profiles**\n2. Click **\"New Profile\"**\n3. 
Fill in the profile details:\n   - **Name**: Unique identifier (e.g., `work-account`, `personal-account`)\n   - **Provider**: OpenAI (automatically selected)\n   - **API Key**: Your OpenAI API key (starts with `sk-`)\n   - **Description**: Optional description for the profile\n\n#### Managing Profiles\n\n```bash\n# List all profiles\nrlama profile list\n\n# Delete a profile\nrlama profile delete old-profile\n```\n\n#### Using Named Profiles\n\n```bash\n# Specify profile with --profile flag\nrlama rag gpt-4o work-rag .\u002Fdocuments --profile work-openai\nrlama rag o3-mini personal-rag .\u002Fdocuments --profile personal-openai\n\n# Update models with specific profiles\nrlama update-model work-rag gpt-4o --profile work-openai\nrlama update-model personal-rag o3-mini --profile personal-openai\n```\n\n### Web Interface Features\n\nThe RLAMA web interface provides:\n- **Real-time validation** of API key format\n- **Secure storage** with masked key display\n- **Integration examples** showing exact CLI commands\n- **Model pricing table** with latest 2025 rates\n- **Usage guidance** for both default keys and named profiles\n\n### Benefits of Each Approach\n\n**Default API Keys:**\n- ✅ Simple setup - configure once, use everywhere\n- ✅ No need to remember profile names\n- ✅ Automatic usage in all commands\n- ✅ Perfect for single OpenAI account users\n\n**Named Profiles:**\n- ✅ Multiple API keys management\n- ✅ Project-specific configurations\n- ✅ Environment separation (dev\u002Fstaging\u002Fprod)\n- ✅ Organization account switching\n- ✅ Audit trail with usage tracking\n\n### Example Workflows\n\n#### Simple Workflow (Default Keys)\n```bash\n# 1. Set default API key in web interface (one-time setup)\n# 2. Use RLAMA commands directly - no profiles needed\nrlama rag o3-mini my-docs .\u002Fdocs\nrlama run my-docs  # Uses default key automatically\n```\n\n#### Advanced Workflow (Named Profiles)\n```bash\n# 1. 
Create profiles for different environments\nrlama profile add dev-openai openai \"sk-dev-key...\"\nrlama profile add prod-openai openai \"sk-prod-key...\"\n\n# 2. Create RAGs with specific profiles\nrlama rag o3-mini dev-docs .\u002Fdev-docs --profile dev-openai\nrlama rag gpt-4o prod-docs .\u002Fprod-docs --profile prod-openai\n\n# 3. Use RAGs with their associated profiles\nrlama run dev-docs   # Must specify profile or use default\nrlama run prod-docs  # Profile is remembered per RAG\n```\n\nThis dual approach ensures RLAMA works seamlessly for both simple single-account usage and complex multi-account enterprise scenarios.\n","\u003C!-- 社交链接导航栏 -->\n\u003Cdiv align=\"center\">\n  \u003Ca href=\"https:\u002F\u002Fx.com\u002FLeDonTizi\" target=\"_blank\">\n    \u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FTwitter-1DA1F2?style=for-the-badge&logo=twitter&logoColor=white\" alt=\"Twitter\">\n  \u003C\u002Fa>\n  \u003Ca href=\"https:\u002F\u002Fdiscord.gg\u002FtP5JB9DR\" target=\"_blank\">\n    \u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FDiscord-5865F2?style=for-the-badge&logo=discord&logoColor=white\" alt=\"Discord\">\n  \u003C\u002Fa>\n  \u003Ca href=\"https:\u002F\u002Fwww.youtube.com\u002F@Dontizi\" target=\"_blank\">\n    \u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FYouTube-FF0000?style=for-the-badge&logo=youtube&logoColor=white\" alt=\"YouTube\">\n  \u003C\u002Fa>\n\u003C\u002Fdiv>\n\n\u003Cbr>\n\n# RLAMA - 用户指南\n\n> **⚠️ 项目暂时暂停**  \n> 由于我的工作和大学事务占据了大量时间，该项目目前处于暂停状态。我暂时无法积极维护此项目。待情况允许时，开发将重新启动。\n\nRLAMA 是一款功能强大的 AI 驱动文档问答工具，可无缝集成到您本地的 Ollama 模型中。它使您能够创建、管理和交互基于检索增强生成（RAG）的系统，以满足您的文档需求。\n\n\n[![RLAMA 演示](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FDonTizi_rlama_readme_68f4fb920967.jpg)](https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=EIsQnBqeQxQ)\n\n## 目录\n- [愿景与路线图](#vision--roadmap)\n- [安装](#installation)\n- [可用命令](#available-commands)\n  - [rag - 创建 RAG 系统](#rag---create-a-rag-system)\n  - 
[crawl-rag - 从网站创建 RAG 系统](#crawl-rag---create-a-rag-system-from-a-website)\n  - [wizard - 使用交互式设置创建 RAG 系统](#wizard---create-a-rag-system-with-interactive-setup)\n  - [watch - 为 RAG 系统设置目录监控](#watch---set-up-directory-watching-for-a-rag-system)\n  - [watch-off - 停用 RAG 系统的目录监控](#watch-off---disable-directory-watching-for-a-rag-system)\n  - [check-watched - 检查 RAG 的监控目录是否有新文件](#check-watched---check-a-rags-watched-directory-for-new-files)\n  - [web-watch - 为 RAG 系统设置网站监控](#web-watch---set-up-website-monitoring-for-a-rag-system)\n  - [web-watch-off - 停用 RAG 系统的网站监控](#web-watch-off---disable-website-monitoring-for-a-rag-system)\n  - [check-web-watched - 检查 RAG 监控的网站是否有更新](#check-web-watched---check-a-rags-monitored-website-for-updates)\n  - [run - 使用 RAG 系统](#run---use-a-rag-system)\n  - [api - 启动 API 服务器](#api---start-api-server)\n  - [list - 列出 RAG 系统](#list---list-rag-systems)\n  - [delete - 删除 RAG 系统](#delete---delete-a-rag-system)\n  - [list-docs - 列出 RAG 中的文档](#list-docs---list-documents-in-a-rag)\n  - [list-chunks - 检查文档分块](#list-chunks---inspect-document-chunks)\n  - [view-chunk - 查看分块详情](#view-chunk---view-chunk-details)\n  - [add-docs - 向 RAG 添加文档](#add-docs---add-documents-to-rag)\n  - [crawl-add-docs - 向 RAG 添加网站内容](#crawl-add-docs---add-website-content-to-rag)\n  - [update-model - 更改 LLM 模型](#update-model---change-llm-model)\n  - [update - 更新 RLAMA](#update---update-rlama)\n  - [version - 显示版本信息](#version---display-version)\n  - [hf-browse - 浏览 Hugging Face 上的 GGUF 模型](#hf-browse---browse-gguf-models-on-hugging-face)\n  - [run-hf - 运行 Hugging Face GGUF 模型](#run-hf---run-a-hugging-face-gguf-model)\n- [卸载](#uninstallation)\n- [支持的文档格式](#supported-document-formats)\n- [故障排除](#troubleshooting)\n- [使用 OpenAI 模型](#using-openai-models)\n\n## 愿景与路线图\nRLAMA 致力于成为创建本地 RAG 系统的权威工具，让每个人都能无缝使用——无论是个人开发者还是大型企业。以下是我们的战略路线图：\n\n### 已完成的功能 ✅\n- ✅ **基础 RAG 系统创建**：用于创建和管理 RAG 系统的 CLI 工具\n- ✅ **文档处理**：支持多种文档格式（.txt、.md、.pdf 等）\n- ✅ **文档分块**：采用多种策略进行高级语义分块（固定大小、语义、层次化、混合）\n- 
✅ **向量存储**：本地存储文档嵌入\n- ✅ **上下文检索**：基本的语义搜索，可配置上下文大小\n- ✅ **Ollama 集成**：与 Ollama 模型无缝连接\n- ✅ **跨平台支持**：适用于 Linux、macOS 和 Windows\n- ✅ **简易安装**：一行安装脚本\n- ✅ **API 服务器**：HTTP 端点，用于在其他应用程序中集成 RAG 功能\n- ✅ **网页爬取**：直接从网站创建 RAG\n- ✅ **引导式 RAG 设置向导**：交互式界面，方便轻松创建 RAG\n- ✅ **Hugging Face 集成**：可访问 Hugging Face Hub 上的 45,000 多个 GGUF 模型\n\n### 小型 LLM 优化（2025 年第二季度）\n- [ ] **提示压缩**：针对有限上下文窗口的智能上下文摘要\n- ✅ **自适应分块**：根据语义边界和文档结构动态分割内容\n- ✅ **最小化上下文检索**：智能过滤冗余内容\n- [ ] **参数优化**：针对不同模型规模的精细调优设置\n\n### 高级嵌入流水线（2025 年第二至第三季度）\n- [ ] **多模型嵌入支持**：集成多种嵌入模型\n- [ ] **混合检索技术**：结合稀疏和密集检索器以提高准确性\n- [ ] **嵌入评估工具**：内置指标用于衡量检索质量\n- [ ] **自动嵌入缓存**：智能缓存机制，减少相似查询的计算量\n\n### 用户体验提升（2025 年第三季度）\n- [ ] **轻量级 Web 界面**：基于现有 CLI 后端的简单浏览器界面\n- [ ] **知识图可视化**：交互式探索文档之间的关联\n- [ ] **领域特定模板**：针对不同领域的预配置设置\n\n### 企业级功能（2025 年第四季度）\n- [ ] **多用户访问控制**：面向团队环境的角色权限管理\n- [ ] **与企业系统集成**：SharePoint、Confluence、Google Workspace 等的连接器\n- [ ] **知识质量监控**：检测过时或相互矛盾的信息\n- [ ] **系统集成 API**：用于将 RLAMA 嵌入现有工作流程的 Webhook 和 API\n- [ ] **AI 代理创建框架**：简化构建具备 RAG 能力的自定义 AI 代理的系统\n\n### 下一代检索创新（2026年第一季度）\n- [ ] **多步检索**：利用大语言模型对复杂问题的搜索查询进行优化\n- [ ] **跨模态检索**：支持图像内容的理解与检索\n- [ ] **基于反馈的优化**：通过用户交互学习来改进检索效果\n- [ ] **知识图谱与符号推理**：将向量检索与结构化知识相结合\n\nRLAMA 的核心理念始终不变：提供一个简单、强大且本地化的 RAG 解决方案，既尊重隐私、又最大限度地降低资源需求，并能在跨平台环境中无缝运行。\n\n## 安装\n\n### 先决条件\n- 已安装并运行 Ollama (https:\u002F\u002Follama.ai\u002F)\n\n### 通过终端安装\n\n```bash\ncurl -fsSL https:\u002F\u002Fraw.githubusercontent.com\u002Fdontizi\u002Frlama\u002Fmain\u002Finstall.sh | sh\n```\n\n## 技术栈\n\nRLAMA 采用以下技术构建：\n\n- **核心语言**：Go（因其性能、跨平台兼容性以及单二进制分发而被选用）\n- **CLI 框架**：Cobra（用于命令行界面的结构设计）\n- **LLM 集成**：Ollama API（用于嵌入和补全）\n- **存储**：基于本地文件系统的存储（使用 JSON 文件以简化和便于移植）\n- **向量检索**：自定义实现的余弦相似度算法，用于嵌入检索\n\n## 架构\n\nRLAMA 遵循清晰的架构模式，实现了明确的关注点分离：\n\n```\nrlama\u002F\n├── cmd\u002F                  # CLI 命令（使用 Cobra）\n│   ├── root.go           # 基础命令\n│   ├── rag.go            # 创建 RAG 系统\n│   ├── run.go            # 查询 RAG 系统\n│   └── ...\n├── internal\u002F\n│   ├── client\u002F           # 外部 API 
客户端\n│   │   └── ollama_client.go # Ollama API 集成\n│   ├── domain\u002F           # 核心领域模型\n│   │   ├── rag.go        # RAG 系统实体\n│   │   └── document.go   # 文档实体\n│   ├── repository\u002F       # 数据持久化\n│   │   └── rag_repository.go # 负责保存和加载 RAG\n│   └── service\u002F          # 业务逻辑\n│       ├── rag_service.go      # RAG 操作\n│       ├── document_loader.go  # 文档处理\n│       └── embedding_service.go # 向量嵌入\n└── pkg\u002F                  # 共享工具\n    └── vector\u002F           # 向量运算\n```\n\n## 数据流\n\n1. **文档处理**：从文件系统加载文档，根据其类型进行解析，并转换为纯文本。\n2. **嵌入生成**：将文档文本发送至 Ollama 以生成向量嵌入。\n3. **存储**：RAG 系统（文档 + 嵌入）存储在用户的主目录下 (~\u002F.rlama)。\n4. **查询过程**：当用户提出问题时，将其转换为嵌入，与存储的文档嵌入进行比较，并检索相关内容。\n5. **响应生成**：将检索到的内容和问题一起发送至 Ollama，以生成上下文相关的回答。\n\n## 可视化表示\n\n```\n┌─────────────┐     ┌─────────────┐     ┌─────────────┐\n│  Documents  │────>│  Document   │────>│  Embedding  │\n│  (Input)    │     │  Processing │     │  Generation │\n└─────────────┘     └─────────────┘     └─────────────┘\n                                              │\n                                              ▼\n┌─────────────┐     ┌─────────────┐     ┌─────────────┐\n│   Query     │────>│  Vector     │\u003C────│ Vector Store│\n│  Response   │     │  Search     │     │ (RAG System)│\n└─────────────┘     └─────────────┘     └─────────────┘\n       ▲                   │\n       │                   ▼\n┌─────────────┐     ┌─────────────┐\n│   Ollama    │\u003C────│   Context   │\n│    LLM      │     │  Building   │\n└─────────────┘     └─────────────┘\n```\n\nRLAMA 的设计注重轻量化和便携性，旨在以最少的依赖项提供 RAG 功能。整个系统在本地运行，唯一的外部依赖是用于 LLM 功能的 Ollama。\n\n## 可用命令\n\n您可以通过以下命令获取所有命令的帮助信息：\n\n```bash\nrlama --help\n```\n\n### 全局选项\n\n这些选项可与任何命令一起使用：\n\n```bash\n--host string       Ollama 主机地址（默认：localhost）\n--port string       Ollama 端口（默认：11434）\n--num-thread int    Ollama 使用的线程数（默认：0，使用 Ollama 默认值）\n```\n\n**性能优化：**\n- 使用 `--num-thread 16`（或您的 CPU 核心数）可以潜在地提高处理速度\n- Ollama 默认通常只使用一半可用的核心\n- 
将此设置为您全部的核心数可以显著加快文本生成和嵌入的速度\n\n**使用示例：**\n```bash\n# 使用 16 个线程以获得更好的性能\nrlama --num-thread 16 run my-docs\n\n# 创建一个优化线程使用的 RAG\nrlama --num-thread 16 rag llama3 documentation .\u002Fdocs\n\n# 使用自定义主机和线程设置运行\nrlama --host 192.168.1.100 --port 11434 --num-thread 16 run my-rag\n```\n\n### 自定义数据目录\n\nRLAMA 默认将数据存储在 `~\u002F.rlama` 中。如需使用其他位置：\n\n1. **命令行标志**（优先级最高）：\n   ```bash\n   # 与任何命令一起使用\n   rlama --data-dir \u002Fpath\u002Fto\u002Fcustom\u002Fdirectory run my-rag\n   ```\n\n2. **环境变量**：\n   ```bash\n   # 设置环境变量\n   export RLAMA_DATA_DIR=\u002Fpath\u002Fto\u002Fcustom\u002Fdirectory\n   rlama run my-rag\n   ```\n\n优先级顺序为：命令行标志 > 环境变量 > 默认位置。\n\n### rag - 创建 RAG 系统\n\n通过索引指定文件夹中的所有文档来创建一个新的 RAG 系统。\n\n```bash\nrlama rag [model] [rag-name] [folder-path]\n```\n\n**参数：**\n- `model`：要使用的 Ollama 模型名称（例如 llama3、mistral、gemma），或使用 `hf.co\u002F用户名\u002F仓库[:量化]` 格式的 Hugging Face 模型。\n- `rag-name`：用于标识您的 RAG 系统的唯一名称。\n- `folder-path`：包含您文档的文件夹路径。\n\n**示例：**\n\n```bash\n# 使用标准 Ollama 模型\nrlama rag llama3 documentation .\u002Fdocs\n\n# 使用 Hugging Face 模型\nrlama rag hf.co\u002Fbartowski\u002FLlama-3.2-1B-Instruct-GGUF my-rag .\u002Fdocs\n\n# 使用带有特定量化设置的 Hugging Face 模型\nrlama rag hf.co\u002Fmlabonne\u002FMeta-Llama-3.1-8B-Instruct-abliterated-GGUF:Q5_K_M my-rag .\u002Fdocs\n```\n\n### crawl-rag - 从网站创建 RAG 系统\n\n通过抓取网站并对其内容建立索引，创建一个新的 RAG 系统。\n\n```bash\nrlama crawl-rag [模型] [rag-名称] [网站-URL]\n```\n\n**参数：**\n- `模型`: 要使用的 Ollama 模型名称（例如，llama3、mistral、gemma）。\n- `rag-名称`: 用于标识您的 RAG 系统的唯一名称。\n- `网站-URL`: 要抓取和索引的网站 URL。\n\n**选项：**\n- `--max-depth`: 最大抓取深度（默认：2）\n- `--concurrency`: 并发爬虫数量（默认：5）\n- `--exclude-path`: 需要排除的路径（逗号分隔）\n- `--chunk-size`: 每个块的字符数（默认：1000）\n- `--chunk-overlap`: 块之间的重叠字符数（默认：200）\n- `--chunking-strategy`: 使用的分块策略（选项：“fixed”、“semantic”、“hybrid”、“hierarchical”，默认：“hybrid”）\n\n#### 分块策略\n\nRLAMA 提供多种高级分块策略，以优化文档检索：\n\n- **Fixed**: 传统的固定大小和重叠分块方式，尽可能尊重句子边界。\n- **Semantic**: 根据语义边界（如标题、段落和自然的主题转换）智能地拆分文档。\n- **Hybrid**: 
自动根据文档类型和内容选择最佳策略（Markdown、HTML、代码或纯文本）。\n- **Hierarchical**: 对于非常长的文档，创建包含主要章节和子块的两级分块结构。\n\n系统会自动适应不同的文档类型：\n- Markdown 文档：按标题和章节拆分\n- HTML 文档：按语义 HTML 元素拆分\n- 代码文档：按函数、类和逻辑块拆分\n- 纯文本：按段落拆分，并保留上下文重叠。\n\n**示例：**\n\n```bash\n# 从文档网站创建新的 RAG\nrlama crawl-rag llama3 docs-rag https:\u002F\u002Fdocs.example.com\n\n# 自定义抓取行为\nrlama crawl-rag llama3 blog-rag https:\u002F\u002Fblog.example.com --max-depth=3 --exclude-path=\u002Farchive,\u002Ftags\n\n# 使用语义分块创建 RAG\nrlama rag llama3 documentation .\u002Fdocs --chunking-strategy=semantic\n\n# 对于大型文档使用分层分块\nrlama rag llama3 book-rag .\u002Fbooks --chunking-strategy=hierarchical\n```\n\n### wizard - 通过交互式设置创建 RAG 系统\n\n提供一个交互式的逐步向导，用于创建新的 RAG 系统。\n\n```bash\nrlama wizard\n```\n\n向导将引导您完成以下步骤：\n- 为您的 RAG 命名\n- 选择 Ollama 模型\n- 选择文档来源（本地文件夹或网站）\n- 配置分块参数\n- 设置文件过滤\n\n**示例：**\n\n```bash\nrlama wizard\n# 按照提示创建您自定义的 RAG\n```\n\n### watch - 为 RAG 系统设置目录监控\n\n配置 RAG 系统，使其自动监视指定目录中的新文件，并将其添加到 RAG 中。\n\n```bash\nrlama watch [rag-名称] [目录-路径] [间隔]\n```\n\n**参数：**\n- `rag-名称`: 要监视的 RAG 系统名称。\n- `目录-路径`: 要监视新文件的目录路径。\n- `间隔`: 检查新文件的时间间隔（以分钟为单位；使用 0 表示仅在 RAG 被使用时检查）。\n\n**示例：**\n\n```bash\n# 设置每 60 分钟检查一次的目录监控\nrlama watch my-docs .\u002Fwatched-folder 60\n\n# 设置仅在 RAG 被使用时检查的目录监控\nrlama watch my-docs .\u002Fwatched-folder 0\n\n# 自定义要监视的文件\nrlama watch my-docs .\u002Fwatched-folder 30 --exclude-dir=node_modules,tmp --process-ext=.md,.txt\n```\n\n### watch-off - 关闭 RAG 系统的目录监控\n\n关闭 RAG 系统的自动目录监控功能。\n\n```bash\nrlama watch-off [rag-名称]\n```\n\n**参数：**\n- `rag-名称`: 要停止监控的 RAG 系统名称。\n\n**示例：**\n\n```bash\nrlama watch-off my-docs\n```\n\n### check-watched - 检查 RAG 的监视目录中是否有新文件\n\n手动检查 RAG 监视的目录中是否有新文件，并将其添加到 RAG 中。\n\n```bash\nrlama check-watched [rag-名称]\n```\n\n**参数：**\n- `rag-名称`: 要检查的 RAG 系统名称。\n\n**示例：**\n\n```bash\nrlama check-watched my-docs\n```\n\n### web-watch - 为 RAG 系统设置网站监控\n\n配置 RAG 系统，使其自动监控网站更新，并将新内容添加到 RAG 中。\n\n```bash\nrlama web-watch [rag-名称] [网站-URL] [间隔]\n```\n\n**参数：**\n- `rag-名称`: 要监控的 RAG 系统名称。\n- `网站-URL`: 要监控的网站 
URL。\n- `间隔`: 检查之间的时间间隔（以分钟为单位；使用 0 表示仅在 RAG 被使用时检查）。\n\n**示例：**\n\n```bash\n# 设置每 60 分钟检查一次的网站监控\nrlama web-watch my-docs https:\u002F\u002Fexample.com 60\n\n# 设置仅在 RAG 被使用时检查的网站监控\nrlama web-watch my-docs https:\u002F\u002Fexample.com 0\n\n# 自定义要监控的内容\nrlama web-watch my-docs https:\u002F\u002Fexample.com 30 --exclude-path=\u002Farchive,\u002Ftags\n```\n\n### web-watch-off - 关闭 RAG 系统的网站监控\n\n关闭 RAG 系统的自动网站监控功能。\n\n```bash\nrlama web-watch-off [rag-名称]\n```\n\n**参数：**\n- `rag-名称`: 要停止监控的 RAG 系统名称。\n\n**示例：**\n\n```bash\nrlama web-watch-off my-docs\n```\n\n### check-web-watched - 检查 RAG 监控的网站是否有更新\n\n手动检查 RAG 监控的网站是否有新更新，并将其添加到 RAG 中。\n\n```bash\nrlama check-web-watched [rag-名称]\n```\n\n**参数：**\n- `rag-名称`: 要检查的 RAG 系统名称。\n\n**示例：**\n\n```bash\nrlama check-web-watched my-docs\n```\n\n### run - 使用 RAG 系统\n\n启动一个交互式会话，与现有的 RAG 系统进行交互。\n\n```bash\nrlama run [rag-名称]\n```\n\n**参数：**\n- `rag-名称`: 要使用的 RAG 系统名称。\n- `--context-size`: （可选）要检索的上下文块数量（默认：20）。\n\n**示例：**\n\n```bash\nrlama run documentation\n> 如何安装该项目？\n> 主要功能有哪些？\n> exit\n```\n\n**上下文大小提示：**\n- 较小的值（5–15）适合快速响应并获取关键信息。\n- 中等值（20–40）可在性能和信息量之间取得平衡。\n- 较大的值（50+）适用于需要广泛背景信息的复杂问题。\n- 请考虑您的模型的上下文窗口限制。\n\n```bash\nrlama run documentation --context-size=50  # 使用 50 个上下文块\n```\n\n### api - 启动 API 服务器\n\n启动一个 HTTP API 服务器，通过 RESTful 端点暴露 RLAMA 的功能。\n\n```bash\nrlama api [--port PORT]\n```\n\n**参数：**\n- `--port`:（可选）API 服务器运行的端口号（默认：11249）\n\n**示例：**\n\n```bash\nrlama api --port 8080\n```\n\n**可用端点：**\n\n1. **查询 RAG 系统** - `POST \u002Frag`\n   ```bash\n   curl -X POST http:\u002F\u002Flocalhost:11249\u002Frag \\\n     -H \"Content-Type: application\u002Fjson\" \\\n     -d '{\n       \"rag_name\": \"documentation\",\n       \"prompt\": \"如何安装该项目？\",\n       \"context_size\": 20\n     }'\n   ```\n\n   请求字段：\n   - `rag_name`（必填）：要查询的 RAG 系统名称\n   - `prompt`（必填）：发送给 RAG 的问题或提示\n   - `context_size`（可选）：包含在上下文中的块数\n   - `model`（可选）：覆盖 RAG 使用的模型\n\n2. 
**检查服务器健康状态** - `GET \u002Fhealth`\n   ```bash\n   curl http:\u002F\u002Flocalhost:11249\u002Fhealth\n   ```\n\n**集成示例：**\n```javascript\n\u002F\u002F Node.js 示例\nconst response = await fetch('http:\u002F\u002Flocalhost:11249\u002Frag', {\n  method: 'POST',\n  headers: { 'Content-Type': 'application\u002Fjson' },\n  body: JSON.stringify({\n    rag_name: 'my-docs',\n    prompt: '总结主要特性'\n  })\n});\nconst data = await response.json();\nconsole.log(data.response);\n```\n\n### list - 列出 RAG 系统\n\n显示所有可用的 RAG 系统列表。\n\n```bash\nrlama list\n```\n\n### delete - 删除 RAG 系统\n\n永久删除一个 RAG 系统及其所有已索引的文档。\n\n```bash\nrlama delete [rag-name] [--force\u002F-f]\n```\n\n**参数：**\n- `rag-name`：要删除的 RAG 系统名称。\n- `--force` 或 `-f`：（可选）无需确认直接删除。\n\n**示例：**\n\n```bash\nrlama delete old-project\n```\n\n或者不需确认直接删除：\n\n```bash\nrlama delete old-project --force\n```\n\n### list-docs - 列出 RAG 中的文档\n\n显示 RAG 系统中所有文档及其元数据。\n\n```bash\nrlama list-docs [rag-name]\n```\n\n**参数：**\n- `rag-name`：RAG 系统名称\n\n**示例：**\n\n```bash\nrlama list-docs documentation\n```\n\n### list-chunks - 检查文档块\n\n列出并筛选 RAG 系统中的文档块，提供多种选项：\n\n```bash\n# 基本块列表\nrlama list-chunks [rag-name]\n\n# 显示内容预览（前 100 个字符）\nrlama list-chunks [rag-name] --show-content\n\n# 按文档名\u002FID 子串过滤\nrlama list-chunks [rag-name] --document=readme\n\n# 组合选项\nrlama list-chunks [rag-name] --document=api --show-content\n```\n\n**选项：**\n- `--show-content`：显示块内容预览\n- `--document`：按文档名\u002FID 子串过滤\n\n**输出列：**\n- 块 ID（用于 view-chunk 命令）\n- 文档来源\n- 块位置（例如，“2\u002F5”表示五个块中的第二个）\n- 内容预览（如果启用）\n- 创建日期\n\n### view-chunk - 查看块详情\n\n显示特定块的详细信息。\n\n```bash\nrlama view-chunk [rag-name] [chunk-id]\n```\n\n**参数：**\n- `rag-name`：RAG 系统名称\n- `chunk-id`：来自 list-chunks 的块标识符\n\n**示例：**\n\n```bash\nrlama view-chunk documentation doc123_chunk_0\n```\n\n### add-docs - 向 RAG 添加文档\n\n将新文档添加到现有 RAG 系统中。\n\n```bash\nrlama add-docs [rag-name] [folder-path] [flags]\n```\n\n**参数：**\n- `rag-name`：RAG 系统名称\n- `folder-path`：文档文件夹路径\n\n**示例：**\n\n```bash\nrlama add-docs documentation 
.\u002Fnew-docs --exclude-ext=.tmp\n```\n\n### crawl-add-docs - 将网站内容添加到 RAG\n\n将网站内容添加到现有 RAG 系统中。\n\n```bash\nrlama crawl-add-docs [rag-name] [website-url]\n```\n\n**参数：**\n- `rag-name`：RAG 系统名称\n- `website-url`：要抓取并添加到 RAG 的网站 URL\n\n**选项：**\n- `--max-depth`：最大抓取深度（默认：2）\n- `--concurrency`：并发爬虫数量（默认：5）\n- `--exclude-path`：要排除的路径（逗号分隔）\n- `--chunk-size`：每个块的字符数（默认：1000）\n- `--chunk-overlap`：块之间的重叠字符数（默认：200）\n\n**示例：**\n\n```bash\n# 将博客内容添加到现有 RAG\nrlama crawl-add-docs my-docs https:\u002F\u002Fblog.example.com\n\n# 自定义抓取行为\nrlama crawl-add-docs knowledge-base https:\u002F\u002Fdocs.example.com --max-depth=1 --exclude-path=\u002Fapi\n```\n\n### update-model - 更改 LLM 模型\n\n更新 RAG 系统使用的 LLM 模型。\n\n```bash\nrlama update-model [rag-name] [new-model]\n```\n\n**参数：**\n- `rag-name`：RAG 系统名称\n- `new-model`：新的 Ollama 模型名称\n\n**示例：**\n\n```bash\nrlama update-model documentation deepseek-r1:7b-instruct\n```\n\n### update - 更新 RLAMA\n\n检查是否有 RLAMA 的新版本，并进行安装。\n\n```bash\nrlama update [--force\u002F-f]\n```\n\n**选项：**\n- `--force` 或 `-f`：（可选）无需确认直接更新。\n\n### version - 显示版本\n\n显示 RLAMA 的当前版本。\n\n```bash\nrlama --version\n```\n\n或者\n\n```bash\nrlama -v\n```\n\n### hf-browse - 浏览 Hugging Face 上的 GGUF 模型\n\n搜索并浏览 Hugging Face 上可用的 GGUF 模型。\n\n```bash\nrlama hf-browse [search-term] [flags]\n```\n\n**参数：**\n- `search-term`：（可选）搜索关键词（如“llama3”、“mistral”）\n\n**标志：**\n- `--open`：在默认浏览器中打开搜索结果\n- `--quant`：指定量化类型以供推荐（如 Q4_K_M、Q5_K_M）\n- `--limit`：限制结果数量（默认：10）\n\n**示例：**\n\n```bash\n# 搜索 GGUF 模型并显示命令行帮助\nrlama hf-browse \"llama 3\"\n\n# 打开浏览器查看搜索结果\nrlama hf-browse mistral --open\n\n# 指定量化建议进行搜索\nrlama hf-browse phi --quant Q4_K_M\n```\n\n### run-hf - 运行 Hugging Face GGUF 模型\n\n使用 Ollama 直接运行 Hugging Face GGUF 模型。这在用其创建 RAG 系统之前测试模型时非常有用。\n\n```bash\nrlama run-hf [huggingface-model] [flags]\n```\n\n**参数：**\n- `huggingface-model`：Hugging Face 模型路径，格式为“用户名\u002F仓库”\n\n**标志：**\n- `--quant`：要使用的量化方式（如 Q4_K_M、Q5_K_M）\n\n**示例：**\n\n```bash\n# 以聊天模式试用模型\nrlama run-hf 
bartowski\u002FLlama-3.2-1B-Instruct-GGUF\n\n# 指定量化方式\nrlama run-hf mlabonne\u002FMeta-Llama-3.1-8B-Instruct-abliterated-GGUF --quant Q5_K_M\n```\n\n## 卸载\n\n卸载 RLAMA：\n\n### 移除二进制文件\n\n如果您通过 `go install` 安装：\n\n```bash\nrlama uninstall\n```\n\n### 移除数据\n\nRLAMA 的数据存储在 `~\u002F.rlama` 中。要移除它：\n\n```bash\nrm -rf ~\u002F.rlama\n```\n\n## 支持的文档格式\n\nRLAMA 支持多种文件格式：\n\n- **文本**：`.txt`、`.md`、`.html`、`.json`、`.csv`、`.yaml`、`.yml`、`.xml`、`.org`\n- **代码**：`.go`、`.py`、`.js`、`.java`、`.c`、`.cpp`、`.cxx`、`.h`、`.rb`、`.php`、`.rs`、`.swift`、`.kt`、`.ts`、`.tsx`、`.f`、`.F`、`.F90`、`.el`、`.svelte`\n- **文档**：`.pdf`、`.docx`、`.doc`、`.rtf`、`.odt`、`.pptx`、`.ppt`、`.xlsx`、`.xls`、`.epub`\n\n建议通过 `install_deps.sh` 安装依赖项，以提升对某些格式的支持。\n\n## 故障排除\n\n### Ollama 无法访问\n\n如果遇到与 Ollama 的连接错误：\n1. 检查 Ollama 是否正在运行。\n2. 默认情况下，Ollama 必须可通过 `http:\u002F\u002Flocalhost:11434` 或由 `OLLAMA_HOST` 环境变量指定的主机和端口访问。\n3. 如果您的 Ollama 实例运行在不同的主机或端口上，请使用 `--host` 和 `--port` 标志：\n   ```bash\n   rlama --host 192.168.1.100 --port 8000 list\n   rlama --host my-ollama-server --port 11434 run my-rag\n   ```\n4. 检查 Ollama 日志以查找潜在错误。\n\n### 文本提取问题\n\n如果某些格式出现问题：\n1. 通过 `.\u002Fscripts\u002Finstall_deps.sh` 安装依赖项。\n2. 确保系统已安装所需工具（如 `pdftotext`、`tesseract` 等）。\n\n### RAG 未找到相关信息\n\n如果回答不相关：\n1. 使用 `rlama list` 检查文档是否已正确索引。\n2. 确保文档内容已正确提取。\n3. 尝试更精确地重新表述问题。\n4. 考虑在创建 RAG 时调整分块参数。\n\n### 其他问题\n\n如遇其他问题，请在 [GitHub 仓库](https:\u002F\u002Fgithub.com\u002Fdontizi\u002Frlama\u002Fissues) 上提交问题，并提供以下信息：\n1. 执行的确切命令。\n2. 命令的完整输出。\n3. 操作系统及架构。\n4. RLAMA 版本（`rlama --version`）。\n\n### 配置 Ollama 连接\n\nRLAMA 提供了多种连接到 Ollama 实例的方式：\n\n1. **命令行标志**（优先级最高）：\n   ```bash\n   rlama --host 192.168.1.100 --port 8080 run my-rag\n   ```\n\n2. **环境变量**：\n   ```bash\n   # 格式：“host:port”或仅“host”\n   export OLLAMA_HOST=remote-server:8080\n   rlama run my-rag\n   ```\n\n3. 
**默认值**（若未指定其他方法则使用）：\n   - 主机：`localhost`\n   - 端口：`11434`\n\n优先级顺序为：命令行标志 > 环境变量 > 默认值。\n\n## 高级用法\n\n### 上下文大小管理\n\n```bash\n# 以最小上下文快速回答\nrlama run my-docs --context-size=10\n\n# 以最大上下文进行深度分析\nrlama run my-docs --context-size=50\n\n# 在速度与深度之间取得平衡\nrlama run my-docs --context-size=30\n```\n\n### 带过滤条件的 RAG 创建\n```bash\nrlama rag llama3 my-project .\u002Fcode \\\n  --exclude-dir=node_modules,dist \\\n  --process-ext=.go,.ts \\\n  --exclude-ext=.spec.ts\n```\n\n### 分块检查\n```bash\n# 列出分块并预览内容\nrlama list-chunks my-project --show-content\n\n# 筛选特定文档的分块\nrlama list-chunks my-project --document=architecture\n```\n\n## 帮助系统\n\n获取完整命令帮助：\n```bash\nrlama --help\n```\n\n特定命令帮助：\n```bash\nrlama rag --help\nrlama list-chunks --help\nrlama update-model --help\n```\n\n所有命令均支持全局 `--host` 和 `--port` 标志，用于自定义 Ollama 连接。\n\n优先级顺序为：命令行标志 > 环境变量 > 默认值。\n\n## Hugging Face 集成\n\nRLAMA 现在支持通过 Ollama 的原生集成直接使用来自 Hugging Face 的 GGUF 模型：\n\n### 浏览 Hugging Face 模型\n\n```bash\n# 搜索 Hugging Face 上的 GGUF 模型\nrlama hf-browse \"llama 3\"\n\n# 打开浏览器查看搜索结果\nrlama hf-browse mistral --open\n```\n\n### 测试模型\n\n在创建 RAG 之前，您可以直接测试 Hugging Face 模型：\n\n```bash\n# 以聊天模式尝试模型\nrlama run-hf bartowski\u002FLlama-3.2-1B-Instruct-GGUF\n\n# 指定量化方式\nrlama run-hf mlabonne\u002FMeta-Llama-3.1-8B-Instruct-abliterated-GGUF --quant Q5_K_M\n```\n\n### 使用 Hugging Face 模型创建 RAG\n\n在创建 RAG 系统时可使用 Hugging Face 模型：\n\n```bash\n# 使用 Hugging Face 模型创建 RAG\nrlama rag hf.co\u002Fbartowski\u002FLlama-3.2-1B-Instruct-GGUF my-rag .\u002Fdocs\n\n# 使用特定量化\nrlama rag hf.co\u002Fmlabonne\u002FMeta-Llama-3.1-8B-Instruct-abliterated-GGUF:Q5_K_M my-rag .\u002Fdocs\n```\n\n## 使用 OpenAI 模型\n\nRLAMA 支持两种方式使用 OpenAI 模型：\n\n### 方法一：默认 API 密钥（自动使用）\n\n您可以在 Web 界面或通过环境变量设置默认的 OpenAI API 密钥。该密钥将自动用于所有 RLAMA 命令，无需指定配置文件。\n\n**通过 Web 界面：**\n1. 导航至 **设置 → 默认 API 密钥**\n2. 输入您的 OpenAI API 密钥（以 `sk-` 开头）\n3. 
单击 **保存默认 API 密钥**\n\n**通过环境变量：**\n```bash\nexport OPENAI_API_KEY=\"your-api-key\"\n```\n\n**使用默认密钥：**\n```bash\n# 下列命令将自动使用您的默认 OpenAI API 密钥\nrlama rag o3-mini my-rag .\u002Fdocuments\nrlama rag gpt-4o another-rag .\u002Fdocs\nrlama update-model my-rag gpt-4o\nrlama run my-rag\n```\n\n### 方法二：命名配置文件（特定使用）\n\n为不同的 OpenAI 账户或组织创建命名配置文件。当需要在不同 API 密钥之间切换时，可使用这些配置文件。\n\n**创建配置文件：**\n```bash\n# 为不同账户创建配置文件\nrlama profile add work-account openai \"sk-work-api-key\"\nrlama profile add personal-account openai \"sk-personal-api-key\"\n```\n\n**使用命名配置文件：**\n```bash\n# 使用 --profile 标志指定配置文件\nrlama rag o3-mini work-rag .\u002Fdocuments --profile work-account\nrlama rag gpt-4o personal-rag .\u002Fdocs --profile personal-account\nrlama update-model my-rag gpt-4o --profile work-account\n```\n\n### 可用的 OpenAI 模型（2025 年 1 月更新）\n\n#### 推理模型（o 系列）\n| 模型 | 输入价格 | 输出价格 | 上下文长度 | 描述 |\n|-------|------------|-------------|---------|-------------|\n| **o3-mini** ⭐ | $1.10\u002F1M | $4.40\u002F1M | 20 万 | 最新推理模型，比 o1 便宜 93% |\n| o1-pro | $150.00\u002F1M | $600.00\u002F1M | 20 万 | 功能最强大的推理模型（企业版） |\n| o1 | $15.00\u002F1M | $60.00\u002F1M | 20 万 | 高级推理模型 |\n\n#### GPT-4 系列  \n| 模型 | 输入价格 | 输出价格 | 上下文长度 | 描述 |\n|-------|------------|-------------|---------|-------------|\n| **GPT-4.5** 🆕 | $75.00\u002F1M | $150.00\u002F1M | 12.8 万 | 自然对话，具备情感智能 |\n| **GPT-4.1** 🆕 | $30.00\u002F1M | $60.00\u002F1M | 100 万 | 具有 100 万上下文窗口的最新 GPT-4 |\n| **GPT-4.1-nano** 🆕 | $5.00\u002F1M | $15.00\u002F1M | 12.8 万 | GPT-4.1 的轻量级版本 |\n| **GPT-4o** 🔥 | $5.00\u002F1M | $15.00\u002F1M | 12.8 万 | 多模态，支持图像和音频 |\n| **GPT-4o mini** 💰 | $0.15\u002F1M | $0.60\u002F1M | 12.8 万 | GPT-4o 的高效版本 |\n\n#### GPT-3.5 系列\n| 模型 | 输入价格 | 输出价格 | 上下文长度 | 描述 |\n|-------|------------|-------------|---------|-------------|\n| GPT-3.5 Turbo | $0.50\u002F1M | $1.50\u002F1M | 1.6 万 | 快速且经济实惠的模型 |\n\n**图例：** ⭐ = 推荐，🆕 = 新增（2025 年），🔥 = 流行，💰 = 经济实惠\n\n**成本优化建议：**\n- 对于重复内容，使用上下文缓存可降低 50% 的成本\n- 选择合适的上下文窗口大小\n- 针对特定用例测试多种模型\n- 在推理任务中考虑使用 o3-mini 
以降低成本\n\n注意：仅推理调用 OpenAI API。文档嵌入仍使用 Ollama 进行处理。\n\n## 管理 API 配置文件\n\n### 使用默认密钥（大多数用户推荐）\n\n对于大多数用户来说，设置默认 API 密钥是最简单的方法：\n\n**通过 Web 界面：**\n1. 打开 RLAMA Web 界面\n2. 转到 **设置 → 默认 API 密钥**\n3. 输入您的 OpenAI API 密钥\n4. 保存配置\n\n**命令将自动使用您的默认密钥：**\n```bash\n# 无需 --profile 参数 - 自动使用默认密钥\nrlama rag o3-mini my-rag .\u002Fdocuments\nrlama update-model my-rag gpt-4o\nrlama run my-rag\n```\n\n### 使用命名配置文件（高级用户）\n\n适用于管理多个 OpenAI 账户或组织的用户：\n\n#### 创建命名配置文件\n\n**通过 CLI：**\n```bash\n# 为不同环境创建配置文件\nrlama profile add work-openai openai \"sk-work-key...\"\nrlama profile add personal-openai openai \"sk-personal-key...\"\n```\n\n**通过 Web 界面：**\n1. 导航到 **设置 → 命名配置文件**\n2. 点击 **“新建配置文件”**\n3. 填写配置文件详细信息：\n   - **名称**：唯一标识符（例如 `work-account`、`personal-account`）\n   - **提供商**：OpenAI（自动选择）\n   - **API 密钥**：您的 OpenAI API 密钥（以 `sk-` 开头）\n   - **描述**：配置文件的可选描述\n\n#### 管理配置文件\n\n```bash\n# 列出所有配置文件\nrlama profile list\n\n# 删除一个配置文件\nrlama profile delete old-profile\n```\n\n#### 使用命名配置文件\n\n```bash\n# 使用 --profile 标志指定配置文件\nrlama rag gpt-4o work-rag .\u002Fdocuments --profile work-openai\nrlama rag o3-mini personal-rag .\u002Fdocuments --profile personal-openai\n\n# 使用特定配置文件更新模型\nrlama update-model work-rag gpt-4o --profile work-openai\nrlama update-model personal-rag o3-mini --profile personal-openai\n```\n\n### Web 界面功能\n\nRLAMA Web 界面提供：\n- **实时验证** API 密钥格式\n- **安全存储**，密钥显示为掩码形式\n- **集成示例** 显示精确的 CLI 命令\n- **模型定价表**，包含最新的 2025 年费率\n- **使用指南**，适用于默认密钥和命名配置文件\n\n### 各种方法的优势\n\n**默认 API 密钥：**\n- ✅ 设置简单 - 一次配置，处处可用\n- ✅ 无需记住配置文件名称\n- ✅ 所有命令自动使用\n- ✅ 非常适合单个 OpenAI 账户用户\n\n**命名配置文件：**\n- ✅ 可管理多个 API 密钥\n- ✅ 支持项目特定配置\n- ✅ 实现环境分离（开发\u002F预发布\u002F生产）\n- ✅ 方便切换组织账户\n- ✅ 提供使用跟踪审计线索\n\n### 示例工作流程\n\n#### 简单工作流程（默认密钥）\n```bash\n# 1. 在 Web 界面设置默认 API 密钥（一次性设置）\n# 2. 直接使用 RLAMA 命令 - 无需配置文件\nrlama rag o3-mini my-docs .\u002Fdocs\nrlama run my-docs   # 自动使用默认密钥\n```\n\n#### 高级工作流程（命名配置文件）\n```bash\n# 1. 
为不同环境创建配置文件\nrlama profile add dev-openai openai \"sk-dev-key...\"\nrlama profile add prod-openai openai \"sk-prod-key...\"\n\n# 2. 使用特定配置文件创建 RAG\nrlama rag o3-mini dev-docs .\u002Fdev-docs --profile dev-openai\nrlama rag gpt-4o prod-docs .\u002Fprod-docs --profile prod-openai\n\n# 3. 使用与各自配置文件关联的 RAG\nrlama run dev-docs   # 必须指定配置文件或使用默认\nrlama run prod-docs  # 配置文件会为每个 RAG 记住\n```\n\n这种双重方法确保 RLAMA 既能无缝适用于简单的单账户使用场景，也能满足复杂的多账户企业级需求。","# RLAMA 快速上手指南\n\nRLAMA 是一款强大的本地 AI 问答工具，专为文档检索增强生成（RAG）设计。它能无缝集成本地 Ollama 模型，帮助你基于个人文档构建私有的知识库问答系统。\n\n> **⚠️ 项目状态提示**  \n> 目前该项目因作者工作学业原因暂时暂停维护，但现有功能仍可正常使用。\n\n## 环境准备\n\n在开始之前，请确保你的系统满足以下要求：\n\n*   **操作系统**：支持 Linux、macOS 和 Windows。\n*   **核心依赖**：必须安装并运行 [Ollama](https:\u002F\u002Follama.ai\u002F)。\n    *   安装 Ollama 后，请确保至少拉取了一个大语言模型（例如 `llama3` 或 `qwen2`），用于处理嵌入和生成回答。\n    *   验证安装：在终端运行 `ollama list` 确认模型已就绪。\n\n## 安装步骤\n\nRLAMA 提供了一键安装脚本，无需手动编译。\n\n### 1. 执行安装命令\n\n在终端中运行以下命令（国内用户若下载缓慢，可尝试配置代理或使用镜像加速）：\n\n```bash\ncurl -fsSL https:\u002F\u002Fraw.githubusercontent.com\u002Fdontizi\u002Frlama\u002Fmain\u002Finstall.sh | sh\n```\n\n### 2. 
验证安装\n\n安装完成后，输入以下命令查看帮助信息，确认安装成功：\n\n```bash\nrlama --help\n```\n\n## 基本使用\n\n以下是构建一个本地文档问答系统的最简流程。\n\n### 第一步：创建 RAG 系统\n\n使用 `rag` 命令创建一个知识库。该命令会读取指定文件夹下的所有文档（支持 .txt, .md, .pdf 等），利用 Ollama 生成向量嵌入并存储。\n\n**命令格式：**\n```bash\nrlama rag \u003C模型名称> \u003C知识库名称> \u003C文档文件夹路径>\n```\n\n**示例：**\n假设你有一个名为 `my-docs` 的文件夹存放技术文档，想使用 `llama3` 模型构建名为 `tech-kb` 的知识库：\n\n```bash\nrlama rag llama3 tech-kb .\u002Fmy-docs\n```\n\n> **性能优化提示**：如果你的 CPU 核心数较多，可以通过 `--num-thread` 参数加速处理过程（例如设置为 16）：\n> ```bash\n> rlama --num-thread 16 rag llama3 tech-kb .\u002Fmy-docs\n> ```\n\n### 第二步：与知识库对话\n\n创建完成后，使用 `run` 命令即可针对该知识库进行提问。\n\n**命令格式：**\n```bash\nrlama run \u003C知识库名称>\n```\n\n**示例：**\n启动交互式对话：\n\n```bash\nrlama run tech-kb\n```\n\n进入交互模式后，直接输入你的问题（例如：“如何配置环境变量？”），RLAMA 将检索相关文档片段并结合 Ollama 模型生成准确的回答。\n\n### 其他常用操作\n\n*   **列出所有知识库**：\n    ```bash\n    rlama list\n    ```\n*   **向现有知识库添加新文档**：\n    ```bash\n    rlama add-docs tech-kb .\u002Fnew-docs\n    ```\n*   **删除知识库**：\n    ```bash\n    rlama delete tech-kb\n    ```\n\n---\n*注：默认数据存储在 `~\u002F.rlama` 目录。如需自定义存储路径，可使用 `--data-dir` 标志或设置 `RLAMA_DATA_DIR` 环境变量。*","某法律科技公司的初级律师需要快速从数百页不断更新的公司内部合规手册和最新法律法规网页中，查找特定条款的解读依据。\n\n### 没有 rlama 时\n- 面对本地 PDF 手册和外部法规网站的混合资料，只能手动复制粘贴到通用聊天机器人，不仅效率低下，还容易因上下文长度限制遗漏关键信息。\n- 每次法规网站更新或内部手册修订，都需要重新整理文档并手动再次上传，无法实现知识的自动同步，极易导致回答基于过时版本。\n- 缺乏对本地大模型（如 Ollama）的原生支持，若担心数据隐私不敢使用云端 API，就只能放弃智能问答，回归原始的关键词搜索。\n- 无法追溯答案的具体来源段落，难以向合伙人验证引用的准确性，增加了复核成本。\n\n### 使用 rlama 后\n- 利用 `crawl-rag` 和 `add-docs` 命令，一键将本地合规手册与外部法规网站整合成统一的 RAG 系统，直接连接本地 Ollama 模型进行私有化问答。\n- 通过 `web-watch` 和 `watch` 功能设置监控，当法规网站更新或本地文件夹新增文档时，rlama 自动检测并增量更新知识库，确保持续获取最新信息。\n- 全程数据在本地运行，无需上传敏感法律文档至云端，完美兼顾了数据安全与智能化效率。\n- 使用 `list-chunks` 和 `view-chunk` 随时 inspect 文档切片，问答结果自带精确的来源定位，让每一条法律解读都有据可查。\n\nrlama 将分散的静态文档转化为可自动进化、安全可控的本地法律智慧库，极大提升了专业领域的知识检索效率。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FDonTizi_rlama_68f4fb92.jpg","DonTizi","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002FDonTizi_86d80fcf.jpg","Hey!\r\nI'm an AI 
and Machine Learning Engineer from Montreal! ",null,"Montreal","LeDonTizi","www.melboucierayane.com","https:\u002F\u002Fgithub.com\u002FDonTizi",[82,86,90,94,98,102,106],{"name":83,"color":84,"percentage":85},"Go","#00ADD8",52.6,{"name":87,"color":88,"percentage":89},"JavaScript","#f1e05a",30.3,{"name":91,"color":92,"percentage":93},"Python","#3572A5",8,{"name":95,"color":96,"percentage":97},"CSS","#663399",7.5,{"name":99,"color":100,"percentage":101},"Shell","#89e051",1.2,{"name":103,"color":104,"percentage":105},"PowerShell","#012456",0.2,{"name":107,"color":108,"percentage":109},"HTML","#e34c26",0.1,1095,75,"2026-04-14T09:15:00","Apache-2.0","Linux, macOS, Windows","未说明 (依赖 Ollama，通常支持 CPU 运行，也可利用 NVIDIA\u002FAMD\u002FMac GPU)","未说明 (取决于所选 LLM 模型大小)",{"notes":118,"python":119,"dependencies":120},"该工具基于 Go 语言开发，以单二进制文件分发，无需 Python 环境。唯一的外部强依赖是本地运行的 Ollama 服务，用于提供嵌入生成和文本补全能力。数据存储于用户主目录 (~\u002F.rlama)。可通过 --num-thread 参数调整线程数以优化性能。","不需要 (核心语言为 Go)",[121],"Ollama (必须安装并运行)",[35,14,45],"2026-03-27T02:49:30.150509","2026-04-17T09:54:30.685226",[126,131,136,141,146,151,155],{"id":127,"question_zh":128,"answer_zh":129,"source_url":130},36772,"运行 rlama api 时提示 'unknown command' 错误怎么办？","该问题通常出现在旧版本中。请尝试将 rlama 升级到最新版本（如 v0.1.27 或更高），升级后命令即可正常工作。如果问题依旧，可能是超时设置导致，可以尝试修改源码 server.go 中的 ReadTimeout 和 WriteTimeout 参数（例如设置为 1024 * time.Second）来增加超时时间。","https:\u002F\u002Fgithub.com\u002FDonTizi\u002Frlama\u002Fissues\u002F29",{"id":132,"question_zh":133,"answer_zh":134,"source_url":135},36773,"为什么 rlama list 显示有多个文档，但运行时模型只能访问其中一部分？","这是早期版本（如 v0.1.25 及之前）的已知限制。维护者已在 v0.1.26 版本中通过引入 'EnhancedHybridStore'（增强型混合存储）解决了此问题，确保所有文档都能被正确索引和检索。请将 rlama 更新至 v0.1.26 或更高版本以解决该问题。","https:\u002F\u002Fgithub.com\u002FDonTizi\u002Frlama\u002Fissues\u002F8",{"id":137,"question_zh":138,"answer_zh":139,"source_url":140},36774,"rlama rag 无法识别 .cxx、.f90 或 .svelte 等特定扩展名的文件怎么办？","维护者已在后续版本中添加了对多种编程语言扩展名的支持。如果您使用的是旧版本，请升级 rlama。对于需要支持的特殊扩展名，社区建议添加通用开关（如 --extensions 
cxx,f,f90），目前维护者已承诺在更新中加入更多扩展名支持，建议直接使用最新版以获得最广泛的文件类型兼容。","https:\u002F\u002Fgithub.com\u002FDonTizi\u002Frlama\u002Fissues\u002F10",{"id":142,"question_zh":143,"answer_zh":144,"source_url":145},36775,"在 Ubuntu 上安装时遇到 'exec: \"python\": executable file not found in $PATH' 错误如何解决？","该错误是因为系统默认使用 'python3' 而程序硬编码了 'python'，且未正确处理虚拟环境。解决方法如下：\n1. 安装必要依赖：sudo apt-get install -y python3 python3-pip python3-venv poppler-utils tesseract-ocr。\n2. 确保 Ollama 已安装并运行。\n3. 重新运行安装脚本或手动创建虚拟环境。\n维护者已在新版本中修复了硬编码问题并实现了虚拟环境支持，建议直接下载最新版本重试。","https:\u002F\u002Fgithub.com\u002FDonTizi\u002Frlama\u002Fissues\u002F80",{"id":147,"question_zh":148,"answer_zh":149,"source_url":150},36776,"运行特定大模型（如 gemma3:27b）时报错 'model is not supported by your version of Ollama' 怎么办？","该错误表明当前安装的 Ollama 版本过旧，不支持所请求的模型架构。请运行以下命令升级 Ollama 到最新版本：\ncurl -fsSL https:\u002F\u002Follama.com\u002Finstall.sh | sh\n升级完成后，重新启动 Ollama 服务并拉取模型即可正常运行。","https:\u002F\u002Fgithub.com\u002FDonTizi\u002Frlama\u002Fissues\u002F54",{"id":152,"question_zh":153,"answer_zh":154,"source_url":130},36777,"如何处理大型上下文导致的请求超时问题？","当传入的上下文较大时，处理时间可能超过默认超时限制。临时解决方案是修改源码 server.go 文件，增加超时设置：\nReadTimeout: 1024 * time.Second,\nWriteTimeout: 1024 * time.Second,\n编译后即可生效。未来版本可能会将此配置化，目前建议通过修改源码或使用高性能硬件来缓解此问题。",{"id":156,"question_zh":157,"answer_zh":158,"source_url":145},36778,"如何在没有外部提取器（如 pdftotext）的情况下使用 rlama？","如果没有安装外部提取器（如 poppler-utils 中的 pdftotext），rlama 会警告 'Text extraction will be limited'，但仍可运行，只是对 PDF 等格式的支持受限。建议在 Ubuntu\u002FDebian 系统上运行以下命令安装依赖以启用完整功能：\nsudo apt-get install -y poppler-utils tesseract-ocr catdoc unrtf\n安装后重启 rlama 
即可自动识别提取器。",[160,164,168,172,176,180,185,189,193,197,202,207,211,216,221,226,230,234,238,243],{"id":161,"version":162,"summary_zh":76,"released_at":163},297186,"v0.1.39","2025-05-24T17:24:09",{"id":165,"version":166,"summary_zh":76,"released_at":167},297187,"v0.1.38","2025-05-24T04:17:17",{"id":169,"version":170,"summary_zh":76,"released_at":171},297188,"v0.1.37","2025-05-24T03:42:31",{"id":173,"version":174,"summary_zh":76,"released_at":175},297189,"v0.1.36","2025-04-03T15:32:28",{"id":177,"version":178,"summary_zh":76,"released_at":179},297190,"v0.1.35","2025-04-01T23:36:26",{"id":181,"version":182,"summary_zh":183,"released_at":184},297191,"v0.1.34","### v0.1.34 版本更新说明\n\n\n1. **RAG 系统的网站监控**\n    - 新增用于设置和管理网站监控的命令：\n        - `web-watch`：配置 RAG 系统以监控网站更新。\n        - `web-watch-off`：关闭网站监控。\n        - `check-web-watched`：手动检查已监控网站是否有更新。\n\n2. **Hugging Face 集成**\n    - 支持访问 Hugging Face 上的 GGUF 模型：\n        - `hf-browse`：浏览可用模型。\n        - `run-hf`：运行 Hugging Face 的 GGUF 模型。\n    - 示例：\n        - `rlama hf-browse mistral --open`\n        - `rlama run-hf bartowski\u002FLlama-3.2-1B-Instruct-GGUF`\n\n3. **高级分块策略**\n    - 新增多种优化文档检索的分块策略：\n        - `fixed`、`semantic`、`hybrid`、`hierarchical`\n    - 示例：\n        - `rlama rag llama3 documentation .\u002Fdocs --chunking-strategy=semantic`\n        - `rlama rag llama3 book-rag .\u002Fbooks --chunking-strategy=hierarchical`\n\n4. **OpenAI 模型支持**\n    - 可在 RLAMA 中使用 OpenAI 模型进行推理：\n        - 通过 `export OPENAI_API_KEY=\"your-api-key\"` 设置 API 密钥。\n        - 使用 OpenAI 模型创建 RAG：\n            - `rlama rag gpt-4-turbo my-rag .\u002Fdocuments`\n        - 支持的模型包括 `gpt-4-turbo`、`o3-mini` 等。\n\n5. **自定义数据目录**\n    - 可为 RLAMA 设置自定义数据目录：\n        - 命令行参数：`rlama --data-dir \u002Fpath\u002Fto\u002Fdirectory`\n        - 环境变量：`export RLAMA_DATA_DIR=\u002Fpath\u002Fto\u002Fdirectory`\n\n6. 
**重排序功能增强**\n    - 提供重排序模型的配置选项：\n        - `add-reranker`：为 RAG 配置重排序功能。\n        - 参数包括模型、权重和阈值。\n    - 示例：\n        - `rlama add-reranker my-rag --model reranker-model --weight 0.8`\n\n7. **API 配置文件管理**\n    - 支持管理多个 API 配置文件：\n        - `profile add`：添加新的 API 配置文件。\n        - `profile list`：列出所有配置文件。\n        - `profile delete`：删除配置文件。\n    - 示例：\n        - `rlama profile add openai-work openai \"sk-your-api-key\"`\n\n8. **向导功能增强**\n    - 改进了交互式 RAG 系统创建流程：\n        - 支持网站爬取选项。\n        - 优化了分块策略的选择。\n\n9. **文档更新**\n    - 新增关于分块策略和重排序功能的详细指南。\n\n10. **多项改进与错误修复**\n    - 优化了添加文档的相关命令。\n    - 改进了错误处理和用户反馈机制。\n    - 更新了依赖项和内部库，以提升性能和稳定性。","2025-03-22T18:33:14",{"id":186,"version":187,"summary_zh":76,"released_at":188},297192,"v0.1.33","2025-03-21T15:44:15",{"id":190,"version":191,"summary_zh":76,"released_at":192},297193,"v0.1.32","2025-03-16T19:53:29",{"id":194,"version":195,"summary_zh":76,"released_at":196},297194,"v0.1.31","2025-03-16T03:59:42",{"id":198,"version":199,"summary_zh":200,"released_at":201},297195,"v0.1.30","# RLAMA v0.1.30 - 网页爬取与交互式向导\n\n我们很高兴地宣布 RLAMA v0.1.30 的发布，为您的本地 RAG 体验带来了强大的新功能。本次更新新增了网页爬取功能，可直接从网站构建 RAG 系统，并引入了一个交互式向导，让 RAG 的创建更加简便。\n\n## 🔍 新增网页爬取功能\n\n借助全新的爬取功能，您可以直接从网站构建 RAG 系统：\n\n- **`crawl-rag`**：通过爬取网站来创建新的 RAG 系统  \n  ```bash\n  rlama crawl-rag llama3 docs-rag https:\u002F\u002Fdocs.example.com --max-depth=2\n  ```\n\n- **`crawl-add-docs`**：将网站内容添加到现有 RAG 中  \n  ```bash\n  rlama crawl-add-docs my-rag https:\u002F\u002Fblog.example.com --exclude-path=\u002Farchive,\u002Ftags\n  ```\n\n您可以通过灵活的选项控制爬取行为：\n- 使用 `--max-depth` 设置爬取深度；\n- 使用 `--concurrency` 调整并发请求数量；\n- 使用 `--exclude-path` 跳过特定路径；\n- 使用 `--chunk-size` 和 `--chunk-overlap` 细化文档分块设置。\n\n## 🧙‍♂️ 交互式 RAG 创建向导\n\n现在，借助我们的分步向导，搭建 RAG 系统比以往更加简单：\n\n```bash\nrlama wizard\n```\n\n向导将引导您完成以下步骤：\n- 为 RAG 命名；\n- 选择 Ollama 模型；\n- 选择文档文件夹；\n- 配置文档分块参数；\n- 设置文件过滤规则。\n\n无论是新手用户，还是偏好引导式流程的开发者，这款向导都是理想之选！\n\n## 🔧 技术改进\n\n- **依赖库更新**：升级至 Go 1.23.0；\n- **新增库支持**：引入 `goquery` 用于 HTML 
解析，并优化了文档处理逻辑；\n- **性能提升**：增强了并发处理能力，使网页爬取速度更快。\n\n## 使用示例\n\n### 从文档网站创建 RAG：\n```bash\nrlama crawl-rag llama3 product-docs https:\u002F\u002Fproduct.example.com\u002Fdocs --max-depth=3\n```\n\n### 将博客内容添加到现有知识库：\n```bash\nrlama crawl-add-docs knowledge-base https:\u002F\u002Fcompany.blog.com --concurrency=10\n```\n\n### 使用交互式向导：\n```bash\nrlama wizard\n# 按照提示操作，即可创建您专属的 RAG\n```\n\n---\n\n我们始终致力于将 RLAMA 打造成为本地 RAG 系统的最佳工具。一如既往，您的反馈对我们至关重要！","2025-03-16T01:10:33",{"id":203,"version":204,"summary_zh":205,"released_at":206},297196,"v0.1.29","> 本次拉取请求为 `rlama` 项目中的 RAG 系统引入了一项新的目录监听功能。主要变更包括更新 `README.md` 文件以记录新命令、修改命令文件以支持目录监听，以及新增服务和方法来实现该功能。\n> \n> ### 文档更新：\n> * [`README.md`](https:\u002F\u002Fgithub.com\u002FDonTizi\u002Frlama\u002Fpull\u002F42\u002Ffiles#diff-b335630551682c19a781afebcf4d07bf978fb1f8ac04c6bf87428ed5106870f5R27-R29)：添加了对新命令 `watch`、`watch-off` 和 `check-watched` 的说明。[[1]](https:\u002F\u002Fgithub.com\u002FDonTizi\u002Frlama\u002Fpull\u002F42\u002Ffiles#diff-b335630551682c19a781afebcf4d07bf978fb1f8ac04c6bf87428ed5106870f5R27-R29) [[2]](https:\u002F\u002Fgithub.com\u002FDonTizi\u002Frlama\u002Fpull\u002F42\u002Ffiles#diff-b335630551682c19a781afebcf4d07bf978fb1f8ac04c6bf87428ed5106870f5R208-R267)\n> \n> ### 命令新增与修改：\n> * [`cmd\u002Froot.go`](https:\u002F\u002Fgithub.com\u002FDonTizi\u002Frlama\u002Fpull\u002F42\u002Ffiles#diff-ab967ab1a2f3a1b769106eeb7bfe892ef0e81d1d27811fa15be08e6749feee1fR7-R12)：导入了新的 `service` 包，并添加了一个用于启动监听守护进程的函数。[[1]](https:\u002F\u002Fgithub.com\u002FDonTizi\u002Frlama\u002Fpull\u002F42\u002Ffiles#diff-ab967ab1a2f3a1b769106eeb7bfe892ef0e81d1d27811fa15be08e6749feee1fR7-R12) [[2]](https:\u002F\u002Fgithub.com\u002FDonTizi\u002Frlama\u002Fpull\u002F42\u002Ffiles#diff-ab967ab1a2f3a1b769106eeb7bfe892ef0e81d1d27811fa15be08e6749feee1fR84-R101)\n> * [`cmd\u002Frun.go`](https:\u002F\u002Fgithub.com\u002FDonTizi\u002Frlama\u002Fpull\u002F42\u002Ffiles#diff-8146f8148ccbf6711d65f532f6ab9a7c8dfbdc3960c7ffc974f17d6d224dd349R11)：添加了对 
`checkWatchedDirectory` 的调用，在向 RAG 系统查询之前检查是否有新文件。[[1]](https:\u002F\u002Fgithub.com\u002FDonTizi\u002Frlama\u002Fpull\u002F42\u002Ffiles#diff-8146f8148ccbf6711d65f532f6ab9a7c8dfbdc3960c7ffc974f17d6d224dd349R11) [[2]](https:\u002F\u002Fgithub.com\u002FDonTizi\u002Frlama\u002Fpull\u002F42\u002Ffiles#diff-8146f8148ccbf6711d65f532f6ab9a7c8dfbdc3960c7ffc974f17d6d224dd349R61-R62) [[3]](https:\u002F\u002Fgithub.com\u002FDonTizi\u002Frlama\u002Fpull\u002F42\u002Ffiles#diff-8146f8148ccbf6711d65f532f6ab9a7c8dfbdc3960c7ffc974f17d6d224dd349R84-R96)\n> * [`cmd\u002Fwatch.go`](https:\u002F\u002Fgithub.com\u002FDonTizi\u002Frlama\u002Fpull\u002F42\u002Ffiles#diff-b1d36367bc59ae1783324a4e3f2f6d15d421aa8f8a966109b04e3d77058184b8R1-R162)：创建了新命令 `watch`、`watch-off` 和 `check-watched`，用于管理 RAG 系统的目录监听功能。\n> \n> ### 服务与领域层变更：\n> * [`internal\u002Fdomain\u002Frag.go`](https:\u002F\u002Fgithub.com\u002FDonTizi\u002Frlama\u002Fpull\u002F42\u002Ffiles#diff-298fd7511dab8e5e5f020a531394be216f4371e4ebe3624011f1f27e7790509eR19-R33)：在 `RagSystem` 中增加了用于存储目录监听配置的字段，并创建了一个新的 `DocumentWatchOptions` 结构体。\n> * [`internal\u002Fservice\u002Ffile_watcher.go`](https:\u002F\u002Fgithub.com\u002FDonTizi\u002Frlama\u002Fpull\u002F42\u002Ffiles#diff-7330841e716e4324bf9ca9ab22eac1aa66e7c7b6249161706d7ce5b1478747fdR1-R203)：实现了 `FileWatcher` 服务，用于处理目录监听、检测新文件以及更新 RAG 系统。\n> * [`internal\u002Fservice\u002Frag_service.go`](https:\u002F\u002Fgithub.com\u002FDonTizi\u002Frlama\u002Fpull\u002F42\u002Ffiles#diff-cf1dacf22586e45564587ed181f75a38110a790c942","2025-03-13T17:25:39",{"id":208,"version":209,"summary_zh":76,"released_at":210},297197,"v0.1.28","2025-03-12T20:17:47",{"id":212,"version":213,"summary_zh":214,"released_at":215},297198,"v0.1.27","> 这份 Pull Request 包含多项更改，旨在提升 `DocumentLoader` 和 `HNSWStore` 类的功能性和鲁棒性。其中最重要的改动包括新增用于从多种文件类型中提取内容的方法，以及对余弦相似度计算的优化，以更好地处理边界情况。\n> \n> ### DocumentLoader 的改进：\n> * 新增方法 `extractCSVContent`，用于从 CSV 文件中提取内容，支持处理表头和数据行。\n> * 新增方法 `extractExcelContent`，通过 `xlsx2csv` 命令行工具或 
Python 脚本作为备用方案，从 Excel 文件中提取内容。\n> * 新增方法 `extractContent`，用于根据文件扩展名判断文件类型，并调用相应的提取方法。\n> \n> ### HNSWStore 的增强：\n> * 优化了 `computeCosineSimilarity` 函数，增加了对空向量的检查、长度不匹配的日志记录，以及对其中一个范数为零的情况的处理，从而避免错误并提高代码的鲁棒性。[[1]](https:\u002F\u002Fgithub.com\u002FDonTizi\u002Frlama\u002Fpull\u002F34\u002Ffiles#diff-6f072b273c60117726774c1a32b4029c80bda9aaf1aa795e033f9f568d0b2c57R45-R56) [[2]](https:\u002F\u002Fgithub.com\u002FDonTizi\u002Frlama\u002Fpull\u002F34\u002Ffiles#diff-6f072b273c60117726774c1a32b4029c80bda9aaf1aa795e033f9f568d0b2c57R67-R69) 支持 xlsx 格式的嵌入表示。\n","2025-03-12T19:21:02",{"id":217,"version":218,"summary_zh":219,"released_at":220},297199,"v0.1.26","> ### 混合存储集成：\n> * 在 `RagSystem` 中将 `VectorStore` 替换为 `HybridStore`，以支持使用新的 `EnhancedHybridStore` 类进行向量与文本的联合检索。（`internal\u002Fdomain\u002Frag.go`、`internal\u002Frepository\u002Frag_repository.go`、`internal\u002Fservice\u002Frag_service.go`）[[1]](https:\u002F\u002Fgithub.com\u002FDonTizi\u002Frlama\u002Fpull\u002F30\u002Ffiles#diff-298fd7511dab8e5e5f020a531394be216f4371e4ebe3624011f1f27e7790509eL9-R46) [[2]](https:\u002F\u002Fgithub.com\u002FDonTizi\u002Frlama\u002Fpull\u002F30\u002Ffiles#diff-a4036d37933c77737713e4ecbae4960db3610e7f49b2a48a85ada2cdb6a857d9L82-R82) [[3]](https:\u002F\u002Fgithub.com\u002FDonTizi\u002Frlama\u002Fpull\u002F30\u002Ffiles#diff-cf1dacf22586e45564587ed181f75a38110a790c94262a247dc3c46d03ed5e1eL155-R155)\n> * 实现了 `EnhancedHybridStore` 类，该类结合了 HNSW 向量检索和 BM25 文本检索功能，包含添加文档、删除文档以及执行混合检索的方法。（`pkg\u002Fvector\u002Fhybrid_store.go`）\n> \n> ### 元数据处理：\n> * 在 `Document` 结构体中新增了 `Metadata` 字段，并更新了相关方法以支持该新字段。（`internal\u002Fdomain\u002Fdocument.go`）[[1]](https:\u002F\u002Fgithub.com\u002FDonTizi\u002Frlama\u002Fpull\u002F30\u002Ffiles#diff-e55a0e90347377b1ce48ceeb86fcde934bf58619cc6898b1b069e76b936d1024R16) [[2]](https:\u002F\u002Fgithub.com\u002FDonTizi\u002Frlama\u002Fpull\u002F30\u002Ffiles#diff-e55a0e90347377b1ce48ceeb86fcde934bf58619cc6898b1b069e76b936d1024R33)\n> \n> ### 嵌入缓存：\n> * 引入了 
`EmbeddingCache` 类，用于缓存嵌入向量，避免对相同内容重复生成嵌入，包含添加、获取及清理缓存嵌入的方法。（`internal\u002Fservice\u002Fembedding_cache.go`）\n> \n> ### 代码库增强：\n> * 更新了 Go 模块版本，并在 `go.mod` 中添加了几项间接依赖，以支持新功能。（`go.mod`）\n> * 简化了 `main.go` 文件，移除了对根命令执行的错误处理。（`main.go`）\n> \n> ### 新的向量存储实现：\n> * 添加了 `HNSWStore` 类，作为 HNSW 算法的一种更简单的近似实现，用于向量的存储与检索，包含添加、删除及搜索向量的方法。（`pkg\u002Fvector\u002Fhnsw_vector_store.go`）\n\n","2025-03-12T01:05:48",{"id":222,"version":223,"summary_zh":224,"released_at":225},297200,"v0.1.25","> 此次 Pull Request 包含 RLAMA 项目的多项增强和新功能，重点在于改进文档、新增命令以及优化文档处理能力。最重要的变更包括对 README 文件的更新、新命令的实现以及文档加载服务的增强。\n> \n> ### 文档更新：\n> * 在 `README.md` 中新增了多个章节和命令，以提供关于列出文档、检查文档分块、查看分块详情、添加文档以及更新模型的详细使用说明。[[1]](https:\u002F\u002Fgithub.com\u002FDonTizi\u002Frlama\u002Fpull\u002F25\u002Ffiles#diff-b335630551682c19a781afebcf4d07bf978fb1f8ac04c6bf87428ed5106870f5R29-R33) [[2]](https:\u002F\u002Fgithub.com\u002FDonTizi\u002Frlama\u002Fpull\u002F25\u002Ffiles#diff-b335630551682c19a781afebcf4d07bf978fb1f8ac04c6bf87428ed5106870f5R171) [[3]](https:\u002F\u002Fgithub.com\u002FDonTizi\u002Frlama\u002Fpull\u002F25\u002Ffiles#diff-b335630551682c19a781afebcf4d07bf978fb1f8ac04c6bf87428ed5106870f5R182-R191) [[4]](https:\u002F\u002Fgithub.com\u002FDonTizi\u002Frlama\u002Fpull\u002F25\u002Ffiles#diff-b335630551682c19a781afebcf4d07bf978fb1f8ac04c6bf87428ed5106870f5R224-R323) [[5]](https:\u002F\u002Fgithub.com\u002FDonTizi\u002Frlama\u002Fpull\u002F25\u002Ffiles#diff-b335630551682c19a781afebcf4d07bf978fb1f8ac04c6bf87428ed5106870f5L257-R374) [[6]](https:\u002F\u002Fgithub.com\u002FDonTizi\u002Frlama\u002Fpull\u002F25\u002Ffiles#diff-b335630551682c19a781afebcf4d07bf978fb1f8ac04c6bf87428ed5106870f5R405) [[7]](https:\u002F\u002Fgithub.com\u002FDonTizi\u002Frlama\u002Fpull\u002F25\u002Ffiles#diff-b335630551682c19a781afebcf4d07bf978fb1f8ac04c6bf87428ed5106870f5R436-R485)\n> \n> ### 新增命令：\n> * 实现了 `list-chunks` 命令，用于在 RAG 系统中检查文档分块，并提供过滤选项。\n> * 增强了 `add-docs` 
命令，增加了排除目录和文件扩展名的选项，并支持处理特定文件扩展名。[[1]](https:\u002F\u002Fgithub.com\u002FDonTizi\u002Frlama\u002Fpull\u002F25\u002Ffiles#diff-ed61343783bd8d9847378adbac3ba10fdca49161eda0f2f715ff242162c36122L8-R13) [[2]](https:\u002F\u002Fgithub.com\u002FDonTizi\u002Frlama\u002Fpull\u002F25\u002Ffiles#diff-ed61343783bd8d9847378adbac3ba10fdca49161eda0f2f715ff242162c36122L29-R59)\n> * 更新了 `run` 命令，新增了 `--context-size` 参数，用于检索指定数量的上下文分块。[[1]](https:\u002F\u002Fgithub.com\u002FDonTizi\u002Frlama\u002Fpull\u002F25\u002Ffiles#diff-8146f8148ccbf6711d65f532f6ab9a7c8dfbdc3960c7ffc974f17d6d224dd349R13-R16) [[2]](https:\u002F\u002Fgithub.com\u002FDonTizi\u002Frlama\u002Fpull\u002F25\u002Ffiles#diff-8146f8148ccbf6711d65f532f6ab9a7c8dfbdc3960c7ffc974f17d6d224dd349L56-R60) [[3]](https:\u002F\u002Fgithub.com\u002FDonTzi","2025-03-11T05:16:58",{"id":227,"version":228,"summary_zh":76,"released_at":229},297201,"v0.1.24","2025-03-10T14:29:32",{"id":231,"version":232,"summary_zh":76,"released_at":233},297202,"v0.1.23","2025-03-09T04:47:17",{"id":235,"version":236,"summary_zh":76,"released_at":237},297203,"v0.1.22","2025-03-08T20:13:46",{"id":239,"version":240,"summary_zh":241,"released_at":242},297204,"v0.1.21","# 在 bge-m3 不可用时添加嵌入模型的回退机制\n\n## 解决的问题\n此前，当 bge-m3 嵌入模型未安装时，应用程序会完全失败，并向用户显示阻塞型错误。这种情况可能发生在以下两种情形中：\n1. Ollama 无法访问。\n2. 
bge-m3 模型未安装。\n\n## 实现的解决方案\n本 PR 引入了嵌入的回退机制：\n- 系统首先尝试使用专门针对嵌入优化的 bge-m3 模型。\n- 如果失败，则自动切换到为 RAG 指定的 LLM 模型作为替代方案。\n- 同时显示一条提示信息，说明如何提升性能（通过安装 bge-m3）。\n\n## 优势\n- **更好的用户体验**：即使未预装 bge-m3，用户仍可创建并使用 RAG。\n- **无阻塞性问题**：流程将继续运行，并采用可行的替代方案。\n- **明确指引**：用户将收到清晰的说明，指导其如何进一步提升性能。\n\n## 执行的测试\n- 测试了以下场景：\n  - bge-m3 已安装（按预期工作）。\n  - bge-m3 未安装（回退至指定模型）。\n  - Ollama 无法访问（现显示适当的错误信息）。\n\n此次改进使 RLAMA 更加健壮且更易于使用，尤其对于那些可能不了解推荐嵌入模型的新用户而言。","2025-03-08T08:33:24",{"id":244,"version":245,"summary_zh":246,"released_at":247},297205,"v0.1.2","# 新增功能\n\n## 模型更新\n- 可以更改现有 RAG 系统使用的 Ollama 模型\n\n## 文档管理\n- 向现有 RAG 系统添加文档\n- 从 RAG 系统中移除特定文档\n- 列出 RAG 中的所有文档及其详细信息\n\n## 大小报告\n- 显示每个 RAG 系统中文档的总大小\n\n# 新增命令\n\n- **`rlama update-model [rag-name] [new-model]`**: 更改 RAG 使用的模型  \n- **`rlama add-docs [rag-name] [folder-path]`**: 向现有 RAG 添加文档  \n- **`rlama remove-doc [rag-name] [doc-id]`**: 从 RAG 中移除特定文档  \n- **`rlama list-docs [rag-name]`**: 列出 RAG 中的所有文档及其详细信息","2025-03-08T05:19:10"]