[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-katanaml--sparrow":3,"tool-katanaml--sparrow":61},[4,18,26,36,44,53],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":17},4358,"openclaw","openclaw\u002Fopenclaw","OpenClaw 是一款专为个人打造的本地化 AI 助手，旨在让你在自己的设备上拥有完全可控的智能伙伴。它打破了传统 AI 助手局限于特定网页或应用的束缚，能够直接接入你日常使用的各类通讯渠道，包括微信、WhatsApp、Telegram、Discord、iMessage 等数十种平台。无论你在哪个聊天软件中发送消息，OpenClaw 都能即时响应，甚至支持在 macOS、iOS 和 Android 设备上进行语音交互，并提供实时的画布渲染功能供你操控。\n\n这款工具主要解决了用户对数据隐私、响应速度以及“始终在线”体验的需求。通过将 AI 部署在本地，用户无需依赖云端服务即可享受快速、私密的智能辅助，真正实现了“你的数据，你做主”。其独特的技术亮点在于强大的网关架构，将控制平面与核心助手分离，确保跨平台通信的流畅性与扩展性。\n\nOpenClaw 非常适合希望构建个性化工作流的技术爱好者、开发者，以及注重隐私保护且不愿被单一生态绑定的普通用户。只要具备基础的终端操作能力（支持 macOS、Linux 及 Windows WSL2），即可通过简单的命令行引导完成部署。如果你渴望拥有一个懂你",349277,3,"2026-04-06T06:32:30",[13,14,15,16],"Agent","开发框架","图像","数据工具","ready",{"id":19,"name":20,"github_repo":21,"description_zh":22,"stars":23,"difficulty_score":10,"last_commit_at":24,"category_tags":25,"status":17},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,"2026-04-05T11:01:52",[14,15,13],{"id":27,"name":28,"github_repo":29,"description_zh":30,"stars":31,"difficulty_score":32,"last_commit_at":33,"category_tags":34,"status":17},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 
代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",155373,2,"2026-04-14T11:34:08",[14,13,35],"语言模型",{"id":37,"name":38,"github_repo":39,"description_zh":40,"stars":41,"difficulty_score":32,"last_commit_at":42,"category_tags":43,"status":17},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",108322,"2026-04-10T11:39:34",[14,15,13],{"id":45,"name":46,"github_repo":47,"description_zh":48,"stars":49,"difficulty_score":32,"last_commit_at":50,"category_tags":51,"status":17},6121,"gemini-cli","google-gemini\u002Fgemini-cli","gemini-cli 是一款由谷歌推出的开源 AI 命令行工具，它将强大的 Gemini 大模型能力直接集成到用户的终端环境中。对于习惯在命令行工作的开发者而言，它提供了一条从输入提示词到获取模型响应的最短路径，无需切换窗口即可享受智能辅助。\n\n这款工具主要解决了开发过程中频繁上下文切换的痛点，让用户能在熟悉的终端界面内直接完成代码理解、生成、调试以及自动化运维任务。无论是查询大型代码库、根据草图生成应用，还是执行复杂的 Git 操作，gemini-cli 都能通过自然语言指令高效处理。\n\n它特别适合广大软件工程师、DevOps 人员及技术研究人员使用。其核心亮点包括支持高达 100 万 token 的超长上下文窗口，具备出色的逻辑推理能力；内置 Google 搜索、文件操作及 Shell 命令执行等实用工具；更独特的是，它支持 MCP（模型上下文协议），允许用户灵活扩展自定义集成，连接如图像生成等外部能力。此外，个人谷歌账号即可享受免费的额度支持，且项目基于 Apache 2.0 
协议完全开源，是提升终端工作效率的理想助手。",100752,"2026-04-10T01:20:03",[52,13,15,14],"插件",{"id":54,"name":55,"github_repo":56,"description_zh":57,"stars":58,"difficulty_score":32,"last_commit_at":59,"category_tags":60,"status":17},4721,"markitdown","microsoft\u002Fmarkitdown","MarkItDown 是一款由微软 AutoGen 团队打造的轻量级 Python 工具，专为将各类文件高效转换为 Markdown 格式而设计。它支持 PDF、Word、Excel、PPT、图片（含 OCR）、音频（含语音转录）、HTML 乃至 YouTube 链接等多种格式的解析，能够精准提取文档中的标题、列表、表格和链接等关键结构信息。\n\n在人工智能应用日益普及的今天，大语言模型（LLM）虽擅长处理文本，却难以直接读取复杂的二进制办公文档。MarkItDown 恰好解决了这一痛点，它将非结构化或半结构化的文件转化为模型“原生理解”且 Token 效率极高的 Markdown 格式，成为连接本地文件与 AI 分析 pipeline 的理想桥梁。此外，它还提供了 MCP（模型上下文协议）服务器，可无缝集成到 Claude Desktop 等 LLM 应用中。\n\n这款工具特别适合开发者、数据科学家及 AI 研究人员使用，尤其是那些需要构建文档检索增强生成（RAG）系统、进行批量文本分析或希望让 AI 助手直接“阅读”本地文件的用户。虽然生成的内容也具备一定可读性，但其核心优势在于为机器",93400,"2026-04-06T19:52:38",[52,14],{"id":62,"github_repo":63,"name":64,"description_en":65,"description_zh":66,"ai_summary_zh":66,"readme_en":67,"readme_zh":68,"quickstart_zh":69,"use_case_zh":70,"hero_image_url":71,"owner_login":72,"owner_name":73,"owner_avatar_url":74,"owner_bio":75,"owner_company":76,"owner_location":76,"owner_email":76,"owner_twitter":77,"owner_website":78,"owner_url":79,"languages":80,"stars":93,"forks":94,"last_commit_at":95,"license":96,"difficulty_score":10,"env_os":97,"env_gpu":98,"env_ram":99,"env_deps":100,"category_tags":107,"github_topics":108,"view_count":32,"oss_zip_url":76,"oss_zip_packed_at":76,"status":17,"created_at":117,"updated_at":118,"faqs":119,"releases":149},7516,"katanaml\u002Fsparrow","sparrow","Structured data extraction and instruction calling with ML, LLM and Vision LLM","Sparrow 是一款专为结构化数据提取设计的开源工具，能够利用机器学习、大语言模型（LLM）及视觉大模型，将发票、收据、银行对账单、表格及各类图片自动转化为干净的 JSON 格式数据。它有效解决了传统文档处理中非结构化信息难以整理、人工录入效率低且易出错的痛点，让杂乱的文件瞬间变为可查询、可验证的规范数据。\n\n这款工具非常适合开发者、数据工程师以及需要构建自动化文档处理流程的研究人员使用。通过其直观的拖拽式 Web 界面和完善的 RESTful API，用户既能快速上手体验，也能轻松将其集成到现有系统中。Sparrow 的技术亮点在于其灵活的“插件化”架构，支持混合调用多种处理管道；同时兼容 Apple Silicon (MLX)、Ollama、vLLM 等多种后端，并能在本地运行 Mistral、Qwen 等先进的视觉大模型，兼顾了隐私安全与高性能。此外，它还具备基于 JSON 
Schema 的自动数据校验和可视化标注功能，确保提取结果的准确性与可追溯性，是企业级文档智能处理的理想选择。","# Sparrow\n\n[![PyPI - Python](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fpython-v3.12+-blue.svg)](https:\u002F\u002Fgithub.com\u002Fkatanaml\u002Fsparrow)\n[![GitHub Stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fkatanaml\u002Fsparrow.svg)](https:\u002F\u002Fgithub.com\u002Fkatanaml\u002Fsparrow\u002Fstargazers)\n[![GitHub Issues](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fissues\u002Fkatanaml\u002Fsparrow.svg)](https:\u002F\u002Fgithub.com\u002Fkatanaml\u002Fsparrow\u002Fissues)\n[![Current Version](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fversion-0.4.4-green.svg)](https:\u002F\u002Fgithub.com\u002Fkatanaml\u002Fsparrow)\n[![License: GPL v3](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FLicense-GPLv3-blue.svg)](https:\u002F\u002Fwww.gnu.org\u002Flicenses\u002Fgpl-3.0)\n\n**Structured data extraction and instruction calling with ML, LLM and Vision LLM**\n\n\u003Cp align=\"center\">\n  \u003Cimg width=\"300\" height=\"300\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fkatanaml_sparrow_readme_1993b2875611.png\">\n\u003C\u002Fp>\n\n\u003Cp align=\"center\">\n  \u003Cstrong>🚀 \u003Ca href=\"https:\u002F\u002Fsparrow.katanaml.io\">Try Sparrow Online\u003C\u002Fa> | 📖 \u003Ca href=\"#-quickstart\">Quick Start\u003C\u002Fa> | 🛠️ \u003Ca href=\"#️-installation\">Installation\u003C\u002Fa> | 📚 \u003Ca href=\"#-examples\">Examples\u003C\u002Fa> | 🤖 \u003Ca href=\"#-sparrow-agent\">Agents\u003C\u002Fa>\u003C\u002Fstrong>\n\u003C\u002Fp>\n\n---\n\n## 🌟 Sparrow\n\nProduction-ready structured data extraction powered by ML, LLMs & Vision LLMs.\n\nTurn invoices, receipts, statements, forms and images into clean structured data.\n\n[🚀 Try Sparrow Online](https:\u002F\u002Fsparrow.katanaml.io)\n\n![Sparrow UI](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fkatanaml_sparrow_readme_ab573d748d65.png)\n\n### Sparrow UI Features\n- **Drag & Drop**: Upload 
documents directly\n- **Real-time Processing**: See results instantly  \n- **Data Query**: JSON based schema for data query\n- **Structured Output**: JSON structured output\n- **Result Annotation**: View bounding boxes\n\n## 📑 Table of Contents\n\n- [✨ Key Features](#-key-features)\n- [🏗️ Architecture](#️-architecture)\n- [🚀 Quickstart](#-quickstart)\n- [🛠️ Installation](#️-installation)\n- [📚 Examples](#-examples)\n- [💻 CLI Usage](#-cli-usage)\n- [🌐 API Usage](#-api-usage)\n- [🤖 Sparrow Agent](#-sparrow-agent)\n- [📊 Dashboard](#-dashboard)\n- [🔧 Pipeline Comparison](#-pipeline-comparison)\n- [⚡ Performance Tips](#-performance-tips)\n- [🔍 Troubleshooting](#-troubleshooting)\n- [⭐ Star History](#-star-history)\n- [📜 License](#-license)\n\n## ✨ Key Features\n\n🎯 **Universal Document Processing**: Handle invoices, receipts, forms, bank statements, tables    \n🔧 **Pluggable Architecture**: Mix and match different pipelines (Sparrow Parse, Instructor, Agents)  \n🖥️ **Multiple Backends**: MLX (Apple Silicon), Ollama, vLLM, Docker, Hugging Face Cloud GPU  \n📱 **Multi-format Support**: Images (PNG, JPG) and multi-page PDFs  \n🎨 **Schema Validation**: JSON schema-based extraction with automatic validation  \n🌐 **API-First Design**: RESTful APIs for easy integration  \n💬 **Instruction Calling**: Text processing, validation, decision making with GPT-OSS, Mistral, Qwen 3.5, etc.  \n📊 **Visual Monitoring**: Built-in dashboard and agent workflow tracking  \n🔒 **Enterprise Ready**: Rate limiting, usage analytics, commercial licensing available  \n🚀 **Local Vision LLMs**: Mistral, Qwen 3.5, DeepSeek OCR, dots.ocr, dots-mocr, etc.  
\n\n## 🏗️ Architecture\n\n![Sparrow Architecture](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fkatanaml_sparrow_readme_668c0612bb5f.jpeg)\n\n### Core Components\n\n| Component | Purpose | Use Case |\n|-----------|---------|----------|\n| **[Sparrow ML LLM](https:\u002F\u002Fgithub.com\u002Fkatanaml\u002Fsparrow\u002Ftree\u002Fmain\u002Fsparrow-ml\u002Fllm)** | Main API engine | Document processing pipelines |\n| **[Sparrow Parse](https:\u002F\u002Fgithub.com\u002Fkatanaml\u002Fsparrow\u002Ftree\u002Fmain\u002Fsparrow-data\u002Fparse)** | Vision LLM library | Structured JSON extraction |\n| **[Sparrow Agents](https:\u002F\u002Fgithub.com\u002Fkatanaml\u002Fsparrow\u002Ftree\u002Fmain\u002Fsparrow-ml\u002Fagents)** | Workflow orchestration | Complex multi-step processing |\n| **[Sparrow OCR](https:\u002F\u002Fgithub.com\u002Fkatanaml\u002Fsparrow\u002Ftree\u002Fmain\u002Fsparrow-data\u002Focr)** | Text recognition | OCR preprocessing |\n| **[Sparrow UI](https:\u002F\u002Fgithub.com\u002Fkatanaml\u002Fsparrow\u002Ftree\u002Fmain\u002Fsparrow-ui\u002F)** | Web interface | Interactive document processing |\n\n## 🚀 Quickstart\n\n### Prerequisites\n- **Python 3.12.10+** (use `pyenv` for version management)\n- **macOS** (for MLX backend) or **Linux\u002FWindows** (for other backends)\n- **GPU** (make sure the GPU has enough memory to run the selected Vision LLM)\n\n### 30-Second Setup\n\n```bash\n# 1. Install pyenv and Python 3.12.10\npyenv install 3.12.10\npyenv global 3.12.10\n\n# 2. Create virtual environment\npython -m venv .env_sparrow_parse\nsource .env_sparrow_parse\u002Fbin\u002Factivate  # Linux\u002FMac\n# or .env_sparrow_parse\\Scripts\\activate  # Windows\n\n# 3. Install Sparrow Parse pipeline\ngit clone https:\u002F\u002Fgithub.com\u002Fkatanaml\u002Fsparrow.git\ncd sparrow\u002Fsparrow-ml\u002Fllm\npip install -r requirements_sparrow_parse.txt\n\n# 4. For macOS: Install poppler for PDF processing\nbrew install poppler\n\n# 5. 
Start the API server\npython api.py\n```\n\nBefore running `pip install -r requirements_sparrow_parse.txt`, check your platform. If you are on macOS and want to run the MLX backend, open `requirements_sparrow_parse.txt` and make sure the `sparrow-parse[mlx]` library reference is defined. If you are running Sparrow on Linux\u002FWindows, use the `sparrow-parse` library reference instead; this skips the MLX-related libraries.\n\n### First Document Extraction\n\n```bash\n# Extract data from a bonds table\n.\u002Fsparrow.sh '[{\"instrument_name\":\"str\", \"valuation\":0}]' \\\n  --pipeline \"sparrow-parse\" \\\n  --options mlx \\\n  --options mlx-community\u002FQwen2.5-VL-72B-Instruct-4bit \\\n  --file-path \"data\u002Fbonds_table.png\"\n```\n\n**Result:**\n```json\n{\n  \"data\": [\n    {\"instrument_name\": \"UNITS BLACKROCK...\", \"valuation\": 19049},\n    {\"instrument_name\": \"UNITS ISHARES...\", \"valuation\": 83488}\n  ],\n  \"valid\": \"true\"\n}\n```\n\nUse `--options mlx` for the MLX backend, `--options ollama` for the Ollama backend, or `--options vllm` for the vLLM backend. Make sure to provide the correct Vision LLM model name, and download the model separately first with MLX, vLLM or Ollama.\n\n## 🛠️ Installation\n\n### Quick Setup\n\n```bash\n# 1. Clone repository\ngit clone https:\u002F\u002Fgithub.com\u002Fkatanaml\u002Fsparrow.git\ncd sparrow\n```\n\n📖 **For complete installation instructions**, see our [detailed environment setup guide](environment_setup.md).\n\n### Essential Steps Summary\n\n1. **Python Environment**: Install Python 3.12.10 using pyenv\n2. **Virtual Environments**: Create separate environments for different pipelines:\n   - `.env_sparrow_parse` - for Sparrow Parse (Vision LLM)\n   - `.env_instructor` - for Instructor (Text LLM) \n   - `.env_ocr` - for OCR service (optional)\n3. **System Dependencies**: Install poppler for PDF processing\n4. 
**Requirements**: Install pipeline-specific dependencies, for example:\n\n`pip install -r requirements_sparrow_parse.txt`\n\n### Platform-Specific Notes\n\n**macOS:**\n```bash\nbrew install poppler  # Required for PDF processing\n```\n\n**Ubuntu\u002FDebian:**\n```bash\nsudo apt-get install poppler-utils libpoppler-cpp-dev\n```\n\n**Apple Silicon**: MLX backend available for optimal performance  \n**NVIDIA\u002FAMD GPU**: Use vLLM or Ollama backend  \n**CPU Only**: Use smaller models or Hugging Face cloud backend  \n\n### Verification\n\n```bash\n# Test installation\npython api.py --port 8002\n# Visit http:\u002F\u002Flocalhost:8002\u002Fapi\u002Fv1\u002Fsparrow-llm\u002Fdocs\n```\n\n## 📚 Examples\n\n### 🏦 Bank Statement Processing\n\n![Bank Statement](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fkatanaml_sparrow_readme_66e5d8637661.png)\n\n```bash\n# Extract all data from bank statement\n.\u002Fsparrow.sh \"*\" \\\n  --pipeline \"sparrow-parse\" \\\n  --options mlx \\\n  --options mlx-community\u002FQwen2.5-VL-72B-Instruct-4bit \\\n  --file-path \"data\u002Fbank_statement.pdf\"\n```\n\n\u003Cdetails>\n\u003Csummary>\u003Cstrong>📄 View Complete JSON Output\u003C\u002Fstrong>\u003C\u002Fsummary>\n\n```json\n{\n  \"bank\": \"First Platypus Bank\",\n  \"address\": \"1234 Kings St., New York, NY 12123\",\n  \"account_holder\": \"Mary G. 
Orta\",\n  \"account_number\": \"1234567890123\",\n  \"statement_date\": \"3\u002F1\u002F2022\",\n  \"period_covered\": \"2\u002F1\u002F2022 - 3\u002F1\u002F2022\",\n  \"account_summary\": {\n    \"balance_on_march_1\": \"$25,032.23\",\n    \"total_money_in\": \"$10,234.23\",\n    \"total_money_out\": \"$10,532.51\"\n  },\n  \"transactions\": [\n    {\n      \"date\": \"02\u002F01\",\n      \"description\": \"PGD EasyPay Debit\",\n      \"withdrawal\": \"203.24\",\n      \"deposit\": \"\",\n      \"balance\": \"22,098.23\"\n    },\n    {\n      \"date\": \"02\u002F02\",\n      \"description\": \"AB&B Online Payment*****\",\n      \"withdrawal\": \"71.23\",\n      \"deposit\": \"\",\n      \"balance\": \"22,027.00\"\n    },\n    {\n      \"date\": \"02\u002F04\",\n      \"description\": \"Check No. 2345\",\n      \"withdrawal\": \"\",\n      \"deposit\": \"450.00\",\n      \"balance\": \"22,477.00\"\n    },\n    {\n      \"date\": \"02\u002F05\",\n      \"description\": \"Payroll Direct Dep 23422342 Giants\",\n      \"withdrawal\": \"\",\n      \"deposit\": \"2,534.65\",\n      \"balance\": \"25,011.65\"\n    },\n    {\n      \"date\": \"02\u002F06\",\n      \"description\": \"Signature POS Debit - TJP\",\n      \"withdrawal\": \"84.50\",\n      \"deposit\": \"\",\n      \"balance\": \"24,927.15\"\n    },\n    {\n      \"date\": \"02\u002F07\",\n      \"description\": \"Check No. 234\",\n      \"withdrawal\": \"1,400.00\",\n      \"deposit\": \"\",\n      \"balance\": \"23,527.15\"\n    },\n    {\n      \"date\": \"02\u002F08\",\n      \"description\": \"Check No. 342\",\n      \"withdrawal\": \"\",\n      \"deposit\": \"25.00\",\n      \"balance\": \"23,552.15\"\n    },\n    {\n      \"date\": \"02\u002F09\",\n      \"description\": \"FPB AutoPay***** Credit Card\",\n      \"withdrawal\": \"456.02\",\n      \"deposit\": \"\",\n      \"balance\": \"23,096.13\"\n    },\n    {\n      \"date\": \"02\u002F08\",\n      \"description\": \"Check No. 
123\",\n      \"withdrawal\": \"\",\n      \"deposit\": \"25.00\",\n      \"balance\": \"23,552.15\"\n    },\n    {\n      \"date\": \"02\u002F09\",\n      \"description\": \"FPB AutoPay***** Credit Card\",\n      \"withdrawal\": \"156.02\",\n      \"deposit\": \"\",\n      \"balance\": \"23,096.13\"\n    },\n    {\n      \"date\": \"02\u002F08\",\n      \"description\": \"Cash Deposit\",\n      \"withdrawal\": \"\",\n      \"deposit\": \"25.00\",\n      \"balance\": \"23,552.15\"\n    }\n  ],\n  \"valid\": \"true\"\n}\n```\n\n\u003C\u002Fdetails>\n\n### 📊 Financial Tables\n\n![Bonds Table](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fkatanaml_sparrow_readme_980bb8e0a0b5.png)\n\n```bash\n# Extract structured data from financial table\n.\u002Fsparrow.sh '[{\"instrument_name\":\"str\", \"valuation\":0}]' \\\n  --pipeline \"sparrow-parse\" \\\n  --options mlx \\\n  --options mlx-community\u002FQwen2.5-VL-72B-Instruct-4bit \\\n  --file-path \"data\u002Fbonds_table.png\"\n```\n\n\u003Cdetails>\n\u003Csummary>\u003Cstrong>📄 View JSON Output\u003C\u002Fstrong>\u003C\u002Fsummary>\n\n```json\n{\n  \"data\": [\n    {\n      \"instrument_name\": \"UNITS BLACKROCK FIX INC DUB FDS PLC ISHS EUR INV GRD CP BD IDX\u002FINST\u002FE\",\n      \"valuation\": 19049\n    },\n    {\n      \"instrument_name\": \"UNITS ISHARES III PLC CORE EUR GOVT BOND UCITS ETF\u002FEUR\",\n      \"valuation\": 83488\n    },\n    {\n      \"instrument_name\": \"UNITS ISHARES III PLC EUR CORP BOND 1-5YR UCITS ETF\u002FEUR\",\n      \"valuation\": 213030\n    },\n    {\n      \"instrument_name\": \"UNIT ISHARES VI PLC\u002FJP MORGAN USD E BOND EUR HED UCITS ETF DIST\u002FHDGD\u002F\",\n      \"valuation\": 32774\n    },\n    {\n      \"instrument_name\": \"UNITS XTRACKERS II SICAV\u002FEUR HY CORP BOND UCITS ETF\u002F-1D-\u002FDISTR.\",\n      \"valuation\": 23643\n    }\n  ],\n  \"valid\": \"true\"\n}\n```\n\n\u003C\u002Fdetails>\n\n### 🧾 Invoice Processing\n\n```bash\n# Extract invoice with 
cropping for better accuracy\n.\u002Fsparrow.sh \"*\" \\\n  --pipeline \"sparrow-parse\" \\\n  --options mlx \\\n  --options mlx-community\u002FQwen2.5-VL-72B-Instruct-4bit \\\n  --crop-size 60 \\\n  --file-path \"data\u002Finvoice.pdf\"\n```\n\n\u003Cdetails>\n\u003Csummary>\u003Cstrong>📄 View Complete JSON Output\u003C\u002Fstrong>\u003C\u002Fsummary>\n\n```json\n{\n  \"invoice_number\": \"61356291\",\n  \"date_of_issue\": \"09\u002F06\u002F2012\",\n  \"seller\": {\n    \"name\": \"Chapman, Kim and Green\",\n    \"address\": \"64731 James Branch, Smithmouth, NC 26872\",\n    \"tax_id\": \"949-84-9105\",\n    \"iban\": \"GB50ACIE59715038217063\"\n  },\n  \"client\": {\n    \"name\": \"Rodriguez-Stevens\",\n    \"address\": \"2280 Angela Plain, Hortonshire, MS 93248\",\n    \"tax_id\": \"939-98-8477\"\n  },\n  \"items\": [\n    {\n      \"description\": \"Wine Glasses Goblets Pair Clear\",\n      \"quantity\": 5,\n      \"unit\": \"each\",\n      \"net_price\": 12.0,\n      \"net_worth\": 60.0,\n      \"vat_percentage\": 10,\n      \"gross_worth\": 66.0\n    },\n    {\n      \"description\": \"With Hooks Stemware Storage Multiple Uses Iron Wine Rack Hanging\",\n      \"quantity\": 4,\n      \"unit\": \"each\", \n      \"net_price\": 28.08,\n      \"net_worth\": 112.32,\n      \"vat_percentage\": 10,\n      \"gross_worth\": 123.55\n    },\n    {\n      \"description\": \"Replacement Corkscrew Parts Spiral Worm Wine Opener Bottle Houdini\",\n      \"quantity\": 1,\n      \"unit\": \"each\",\n      \"net_price\": 7.5,\n      \"net_worth\": 7.5,\n      \"vat_percentage\": 10,\n      \"gross_worth\": 8.25\n    },\n    {\n      \"description\": \"HOME ESSENTIALS GRADIENT STEMLESS WINE GLASSES SET OF 4 20 FL OZ (591 ml) NEW\",\n      \"quantity\": 1,\n      \"unit\": \"each\",\n      \"net_price\": 12.99,\n      \"net_worth\": 12.99,\n      \"vat_percentage\": 10,\n      \"gross_worth\": 14.29\n    }\n  ],\n  \"summary\": {\n    \"total_net_worth\": 192.81,\n    
\"total_vat\": 19.28,\n    \"total_gross_worth\": 212.09\n  }\n}\n```\n\n\u003C\u002Fdetails>\n\n### 📄 Multi-page PDF Processing\n\n```bash\n# Process multi-page PDF with structured output per page\n.\u002Fsparrow.sh '{\"table\": [{\"description\": \"str\", \"latest_amount\": 0, \"previous_amount\": 0}]}' \\\n  --pipeline \"sparrow-parse\" \\\n  --options mlx \\\n  --options mlx-community\u002FQwen2.5-VL-72B-Instruct-4bit \\\n  --file-path \"data\u002Ffinancial_report.pdf\" \\\n  --debug-dir \"debug\u002F\"\n```\n\n\u003Cdetails>\n\u003Csummary>\u003Cstrong>📄 View JSON Output\u003C\u002Fstrong>\u003C\u002Fsummary>\n\n```json\n[\n    {\n        \"table\": [\n            {\n                \"description\": \"Revenues\",\n                \"latest_amount\": 12453,\n                \"previous_amount\": 11445\n            },\n            {\n                \"description\": \"Operating expenses\",\n                \"latest_amount\": 9157,\n                \"previous_amount\": 8822\n            }\n        ],\n        \"valid\": \"true\",\n        \"page\": 1\n    },\n    {\n        \"table\": [\n            {\n                \"description\": \"Revenues\", \n                \"latest_amount\": 12453,\n                \"previous_amount\": 11445\n            },\n            {\n                \"description\": \"Operating expenses\",\n                \"latest_amount\": 9157,\n                \"previous_amount\": 8822\n            }\n        ],\n        \"valid\": \"true\",\n        \"page\": 2\n    }\n]\n```\n\n\u003C\u002Fdetails>\n\n### 💬 Text Instruction Processing\n\n```bash\n# Instruction-based processing\n.\u002Fsparrow.sh \"instruction: do arithmetic operation, payload: 2+2=\" \\\n  --pipeline \"sparrow-instructor\" \\\n  --options mlx \\\n  --options lmstudio-community\u002FMistral-Small-3.2-24B-Instruct-2506-8bit\n\n# Instruction processing with document input\n.\u002Fsparrow.sh \"check if business entity Chapman, Kim and Green is invoice issuing party\" \n  
--pipeline \"sparrow-parse\" \n  --instruction \n  --options mlx --options lmstudio-community\u002FMistral-Small-3.2-24B-Instruct-2506-8bit \n  --file-path \"invoice_1.jpg\"\n```\n\n\n**JSON Output:**\n```\nThe result of 2 + 2 is:\n\n4\n```\n\n\n### 📈 Stock Data Function Calling\n\n```bash\n# Function calling example\n.\u002Fsparrow.sh assistant --pipeline \"stocks\" --query \"Oracle\"\n```\n\n**JSON Output:**\n```json\n{\n  \"company\": \"Oracle Corporation\",\n  \"ticker\": \"ORCL\"\n}\n```\n\n**Additional Output:**\n```\nThe stock price of the Oracle Corporation is 186.3699951171875. USD\n```\n\n## 💻 CLI Usage\n\n### Basic Syntax\n\n```bash\n.\u002Fsparrow.sh \"\u003CJSON_SCHEMA>\" --pipeline \"\u003CPIPELINE>\" [OPTIONS] --file-path \"\u003CFILE>\"\n```\n\n### Command Line Arguments\n\n| Argument | Type | Description | Example |\n|----------|------|-------------|---------|\n| `query` | JSON\u002FString | Schema or instruction | `'[{\"field\":\"str\"}]'` |\n| `--pipeline` | String | Pipeline to use | `sparrow-parse` |\n| `--file-path` | Path | Input document | `data\u002Finvoice.pdf` |\n| `--hints-file-path` | Path | Query hints | `data\u002Fhints.json` |\n| `--options` | String | Backend configuration | `mlx,model-name` |\n| `--instruction` | Boolean | Sparrow query will be used as instruction | `--instruction` |\n| `--validation` | Boolean | Sparrow query will be used for field validation | `--validation` |\n| `--markdown` | Boolean | Markdown pre-processing | `--markdown` |\n| `--ocr` | Boolean | Experimental functionality | `--ocr` |\n| `--table` | Boolean | Experimental functionality | `--table` |\n| `--table-template` | String | Experimental functionality | `--name` |\n| `--crop-size` | Integer | Border cropping pixels | `60` |\n| `--page-type` | String | Page classification | `financial_table` \n| `--debug` | Boolean | Enable debug mode | `--debug` |\n| `--debug-dir` | Path | Debug output folder | `.\u002Fdebug\u002F` |\n\n### Pipeline Options\n\n#### 
Sparrow Parse (Vision LLM)\n```bash\n# MLX Backend (Apple Silicon)\n.\u002Fsparrow.sh '[{\"instrument_name\":\"str\", \"valuation\":0}]' \\\n  --pipeline \"sparrow-parse\" \\\n  --options mlx \\\n  --options mlx-community\u002FQwen2.5-VL-72B-Instruct-4bit \\\n  --file-path \"data\u002Fbonds_table.png\"\n\n# Hugging Face Cloud GPU\n--options huggingface --options your-space\u002Fmodel-name\n\n# Additional flags\n--options tables_only        # Extract only tables\n--options validation_off     # Disable schema validation\n--options apply_annotation   # Include bounding boxes\n--page-type financial_table  # Classify page type\n```\n\n#### Sparrow Instructor (Text LLM)\n```bash\n# Instruction-based processing\n.\u002Fsparrow.sh \"instruction: do arithmetic operation, payload: 2+2=\" \\\n  --pipeline \"sparrow-instructor\" \\\n  --options mlx \\\n  --options lmstudio-community\u002FMistral-Small-3.2-24B-Instruct-2506-8bit\n```\n\n### Advanced Examples\n\n```bash\n# Multi-page PDF with page classification\n.\u002Fsparrow.sh \"*\" \\\n  --page-type invoice \\\n  --page-type table \\\n  --pipeline \"sparrow-parse\" \\\n  --options mlx \\\n  --options mlx-community\u002FQwen2.5-VL-72B-Instruct-4bit \\\n  --file-path \"multi_page.pdf\"\n\n# Handle missing fields with null values\n.\u002Fsparrow.sh '[{\"required_field\":\"str\", \"optional_field\":\"str or null\"}]' \\\n  --pipeline \"sparrow-parse\" \\\n  --options mlx \\\n  --options mlx-community\u002FQwen2.5-VL-72B-Instruct-4bit \\\n  --file-path \"document.png\"\n\n# Table extraction with cropping\n.\u002Fsparrow.sh '*' \\\n  --pipeline \"sparrow-parse\" \\\n  --options mlx \\\n  --options mlx-community\u002FQwen2.5-VL-72B-Instruct-4bit \\\n  --options tables_only \\\n  --crop-size 100 \\\n  --file-path \"scan.pdf\"\n\n# Instruction execution\n.\u002Fsparrow.sh \"check if business entity Chapman, Kim and Green is invoice issuing party\" \\\n  --pipeline \"sparrow-parse\" \\\n  --instruction \\\n  --options mlx --options 
lmstudio-community\u002FMistral-Small-3.2-24B-Instruct-2506-8bit \\\n  --file-path \"invoice_1.jpg\"\n\n# Field validation\n.\u002Fsparrow.sh \"tax_id,shipment_code,total_gross_worth\" \\\n  --pipeline \"sparrow-parse\" \\\n  --validation \\\n  --options mlx --options lmstudio-community\u002FMistral-Small-3.2-24B-Instruct-2506-8bit \\\n  --file-path \"invoice_1.jpg\"\n\n# Example output:\n# {\n#   \"tax_id\": true,\n#   \"shipment_code\": false,\n#   \"total_gross_worth\": true\n# }\n```\n\n## 🌐 API Usage\n\n### Starting the Server\n\n```bash\n# Default port (8002)\npython api.py\n\n# Custom port\npython api.py --port 8001\n\n# Multiple instances\npython api.py --port 8002 &  # Sparrow Parse\npython api.py --port 8003 &  # Instructor\n```\n\n### API Endpoints\n\n#### Document Extraction (`\u002Finference`)\n\n```bash\ncurl -X POST 'http:\u002F\u002Flocalhost:8002\u002Fapi\u002Fv1\u002Fsparrow-llm\u002Finference' \\\n  -H 'Content-Type: multipart\u002Fform-data' \\\n  -F 'query=[{\"field_name\":\"str\", \"amount\":0}]' \\\n  -F 'pipeline=sparrow-parse' \\\n  -F 'options=mlx,mlx-community\u002FQwen2.5-VL-72B-Instruct-4bit' \\\n  -F 'file=@document.pdf'\n```\n\n#### Text Instructions (`\u002Finstruction-inference`)\n\n```bash\ncurl -X POST 'http:\u002F\u002Flocalhost:8002\u002Fapi\u002Fv1\u002Fsparrow-llm\u002Finstruction-inference' \\\n  -H 'Content-Type: application\u002Fx-www-form-urlencoded' \\\n  -d 'query=instruction: analyze data, payload: {...}' \\\n  -d 'pipeline=sparrow-instructor' \\\n  -d 'options=mlx,mlx-community\u002FQwen2.5-VL-72B-Instruct-4bit'\n```\n\n### API Documentation\n\nVisit `http:\u002F\u002Flocalhost:8002\u002Fapi\u002Fv1\u002Fsparrow-llm\u002Fdocs` for interactive Swagger documentation.\n\n![API Documentation](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fkatanaml_sparrow_readme_54219b8de487.png)\n\n## 🤖 Sparrow Agent\n\n![Sparrow Agents](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fkatanaml_sparrow_readme_246b4808bc8e.png)\n\nOrchestrate complex document 
processing workflows with visual monitoring powered by Prefect.\n\n### Features\n- **Multi-step Workflows**: Chain classification, extraction, and validation\n- **Visual Monitoring**: Real-time pipeline tracking\n- **Error Handling**: Robust failure recovery\n- **Extensible**: Custom agents for specific use cases\n\n### Usage\n\n```bash\n# Start agent server\ncd sparrow-ml\u002Fagents\npython api.py --port 8001\n\n# Process medical prescriptions\ncurl -X POST 'http:\u002F\u002Flocalhost:8001\u002Fapi\u002Fv1\u002Fsparrow-agents\u002Fexecute\u002Ffile' \\\n  -F 'agent_name=medical_prescriptions' \\\n  -F 'extraction_params={\"sparrow_key\":\"123456\"}' \\\n  -F 'file=@prescription.pdf'\n```\n\n## 📊 Dashboard\n\nBuilt-in analytics and monitoring dashboard at [sparrow.katanaml.io](https:\u002F\u002Fsparrow.katanaml.io). The dashboard is part of Sparrow UI and requires a local Oracle Database 23ai Free instance. \n\n![Dashboard](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fkatanaml_sparrow_readme_90fea7deece2.png)\n\n### Features\n- **Usage Analytics**: Track API calls, success rates, performance\n- **Geographic Distribution**: See usage by country\n- **Model Performance**: Compare performance across models\n- **Real-time Monitoring**: Live processing statistics\n\n## 🔧 Pipeline Comparison\n\n| Feature | Sparrow Parse | Sparrow Instructor | Sparrow Agents |\n|---------|---------------|-------------------|----------------|\n| **Input** | Documents + JSON schema | Text instructions | Complex workflows |\n| **Output** | Structured JSON | Free-form text | Multi-step results |\n| **Use Cases** | Data extraction, forms | Summarization, analysis | Enterprise workflows |\n| **Validation** | Schema-based | Manual | Custom rules |\n| **Complexity** | Simple | Medium | High |\n| **Best For** | Invoices, tables, forms | Text processing | Multi-document flows |\n\n### When to Use What\n\n**Sparrow Parse**: Use for structured data extraction from documents  \n**Sparrow Instructor**: Use for 
text analysis, summarization, Q&A  \n**Sparrow Agents**: Use for complex multi-step document processing workflows  \n\n## ⚡ Performance Tips\n\n### Hardware Optimization\n\n**Apple Silicon (MLX)**\n- ✅ Best performance with unified memory\n- ✅ Models: Mistral-Small-3.2-24B, Qwen2.5-VL-72B\n- ⚠️ Requires macOS with Apple Silicon\n\n**NVIDIA GPU**\n- ✅ Use vLLM or Ollama backends\n- ✅ Recommended: NVIDIA DGX Spark with 12GB+ VRAM or AMD GPU\n- ⚠️ Requires CUDA setup\n\n**CPU Only**\n- ⚠️ Significantly slower\n- ✅ Use smaller models (7B parameters max)\n- ✅ Consider Hugging Face cloud backend\n\n### Memory Management\n\n```bash\n# Reduce memory usage\n--crop-size 100        # Crop large images\n--options tables_only  # Process only tables\n\n# For large PDFs\n--debug-dir .\u002Ftemp     # Monitor processing\n# Split large PDFs manually if needed\n```\n\n### Model Selection\n\n| Use Case | Recommended Model | Memory | Speed |\n|----------|------------------|---------|--------|\n| **Forms\u002FInvoices** | Mistral-Small-3.2-24B | 35GB | Fast |\n| **Complex Tables** | Qwen2.5-VL-72B | 50GB | Slower |\n| **Quick Testing** | Qwen2.5-VL-7B | 20GB | Fastest |\n\n## 🔍 Troubleshooting\n\n### Common Issues\n\n\u003Cdetails>\n\u003Csummary>🚫 Installation Problems\u003C\u002Fsummary>\n\n**Python Version Issues:**\n```bash\n# Verify Python version\npython --version  # Should be 3.12.10+\n\n# Fix with pyenv\npyenv install 3.12.10\npyenv global 3.12.10\n```\n\n**MLX Installation (Apple Silicon):**\n```bash\n# If MLX fails to install\npip install --upgrade pip\npip install mlx-vlm --no-cache-dir\n```\n\n```bash\n# If pip install command throws AttributeError: 'NoneType' object has no attribute 'get'\n# POTENTIAL SECURITY RISK - SSL verification is bypassed. 
Apply if you know what you are doing\npip install mlx-vlm --trusted-host pypi.org --trusted-host pypi.python.org --trusted-host files.pythonhosted.org\n```\n\n**Poppler Missing:**\n```bash\n# macOS\nbrew install poppler\n\n# Ubuntu\u002FDebian  \nsudo apt-get install poppler-utils\n\n# Verify installation\npdftoppm -h\n```\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n\u003Csummary>🔧 Runtime Issues\u003C\u002Fsummary>\n\n**Memory Errors:**\n- Use smaller models (7B instead of 72B)\n- Enable image cropping: `--crop-size 100`\n- Process single pages instead of entire PDFs\n\n**Model Loading Fails:**\n```bash\n# Clear model cache\nrm -rf ~\u002F.cache\u002Fhuggingface\u002F\nrm -rf ~\u002F.mlx\u002F\n\n# Redownload models\npython -c \"from mlx_vlm import load; load('model-name')\"\n```\n\n**API Connection Issues:**\n```bash\n# Check if server is running\ncurl http:\u002F\u002Flocalhost:8002\u002Fhealth\n\n# Check logs\npython api.py --debug\n```\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n\u003Csummary>📄 Document Processing Issues\u003C\u002Fsummary>\n\n**Poor Extraction Quality:**\n- Try image cropping: `--crop-size 60`\n- Use `--options tables_only` for table documents\n- Ensure image resolution is adequate (300+ DPI)\n- Use schema validation: avoid `--options validation_off`\n\n**PDF Processing Fails:**\n```bash\n# Test PDF manually\npdftoppm -png input.pdf output\n\n# Check page count\npython -c \"\nimport pypdf\nwith open('file.pdf', 'rb') as f:\n    reader = pypdf.PdfReader(f)\n    print(f'Pages: {len(reader.pages)}')\n\"\n```\n\n**JSON Schema Errors:**\n- Validate JSON syntax: Use [jsonlint.com](https:\u002F\u002Fjsonlint.com)\n- Use proper field types: `\"str\"`, `0`, `0.0`, `\"str or null\"`\n- Test with simple schema first\n\n\u003C\u002Fdetails>\n\n### Getting Help\n\n1. **📖 Check Documentation**: Review this README and component docs\n2. **🐛 Search Issues**: [GitHub Issues](https:\u002F\u002Fgithub.com\u002Fkatanaml\u002Fsparrow\u002Fissues)  \n3. 
**💬 Create Issue**: Provide logs, system info, minimal example\n4. **📧 Commercial Support**: [abaranovskis@redsamuraiconsulting.com](mailto:abaranovskis@redsamuraiconsulting.com)\n\n## ⭐ Star History\n\n[![Star History Chart](https:\u002F\u002Fapi.star-history.com\u002Fsvg?repos=katanaml\u002Fsparrow&type=Date)](https:\u002F\u002Fstar-history.com\u002F#katanaml\u002Fsparrow&Date)\n\n## 📜 License\n\n**Open Source**: Licensed under GPL 3.0. Free for open source projects and organizations under $5M revenue.\n\n**Commercial**: Dual licensing available for proprietary use, enterprise features, and dedicated support.\n\n**Contact**: [abaranovskis@redsamuraiconsulting.com](mailto:abaranovskis@redsamuraiconsulting.com) for commercial licensing and consulting.\n\n## 👥 Authors\n\n- **[Katana ML](https:\u002F\u002Fkatanaml.io)** - AI\u002FML consulting and solutions\n- **[Andrej Baranovskij](https:\u002F\u002Fgithub.com\u002Fabaranovskis-redsamurai)** - Lead developer\n\n---\n\n\u003Cp align=\"center\">\n  \u003Cstrong>⭐ Star us on GitHub if Sparrow is useful for your projects!\u003C\u002Fstrong>\u003Cbr>\n  \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fkatanaml\u002Fsparrow\">github.com\u002Fkatanaml\u002Fsparrow\u003C\u002Fa>\n\u003C\u002Fp>\n","# 麻雀\n\n[![PyPI - Python](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fpython-v3.12+-blue.svg)](https:\u002F\u002Fgithub.com\u002Fkatanaml\u002Fsparrow)\n[![GitHub Stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fkatanaml\u002Fsparrow.svg)](https:\u002F\u002Fgithub.com\u002Fkatanaml\u002Fsparrow\u002Fstargazers)\n[![GitHub Issues](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fissues\u002Fkatanaml\u002Fsparrow.svg)](https:\u002F\u002Fgithub.com\u002Fkatanaml\u002Fsparrow\u002Fissues)\n[![Current Version](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fversion-0.4.4-green.svg)](https:\u002F\u002Fgithub.com\u002Fkatanaml\u002Fsparrow)\n[![License: GPL 
v3](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FLicense-GPLv3-blue.svg)](https:\u002F\u002Fwww.gnu.org\u002Flicenses\u002Fgpl-3.0)\n\n**基于机器学习、大语言模型及视觉大语言模型的结构化数据提取与指令调用**\n\n\u003Cp align=\"center\">\n  \u003Cimg width=\"300\" height=\"300\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fkatanaml_sparrow_readme_1993b2875611.png\">\n\u003C\u002Fp>\n\n\u003Cp align=\"center\">\n  \u003Cstrong>🚀 \u003Ca href=\"https:\u002F\u002Fsparrow.katanaml.io\">在线试用麻雀\u003C\u002Fa> | 📖 \u003Ca href=\"#-quickstart\">快速入门\u003C\u002Fa> | 🛠️ \u003Ca href=\"#️-installation\">安装指南\u003C\u002Fa> | 📚 \u003Ca href=\"#-examples\">示例\u003C\u002Fa> | 🤖 \u003Ca href=\"#-sparrow-agent\">智能代理\u003C\u002Fa>\u003C\u002Fstrong>\n\u003C\u002Fp>\n\n---\n\n## 🌟 麻雀\n\n由机器学习、大语言模型及视觉大语言模型驱动的生产级结构化数据提取工具。\n\n将发票、收据、对账单、表格和图像转化为干净的结构化数据。\n\n[🚀 在线试用麻雀](https:\u002F\u002Fsparrow.katanaml.io)\n\n![麻雀UI](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fkatanaml_sparrow_readme_ab573d748d65.png)\n\n### 麻雀UI功能\n- **拖放上传**：直接上传文档\n- **实时处理**：即时查看结果  \n- **数据查询**：基于JSON的模式进行数据查询\n- **结构化输出**：JSON格式的结构化输出\n- **结果标注**：查看边界框\n\n## 📑 目录\n\n- [✨ 核心特性](#-key-features)\n- [🏗️ 架构](#️-architecture)\n- [🚀 快速入门](#-quickstart)\n- [🛠️ 安装](#️-installation)\n- [📚 示例](#-examples)\n- [💻 CLI使用](#-cli-usage)\n- [🌐 API使用](#-api-usage)\n- [🤖 麻雀智能代理](#-sparrow-agent)\n- [📊 仪表盘](#-dashboard)\n- [🔧 流程对比](#-pipeline-comparison)\n- [⚡ 性能优化建议](#-performance-tips)\n- [🔍 故障排除](#-troubleshooting)\n- [⭐ 星标历史](#-star-history)\n- [📜 许可证](#-license)\n\n## ✨ 核心特性\n\n🎯 **通用文档处理**：支持处理发票、收据、表格、银行对账单、数据表等  \n🔧 **可插拔架构**：灵活组合不同流程（Sparrow Parse、Instructor、Agent）  \n🖥️ **多后端支持**：MLX（Apple Silicon）、Ollama、vLLM、Docker、Hugging Face Cloud GPU  \n📱 **多格式支持**：支持PNG、JPG等图片以及多页PDF  \n🎨 **模式校验**：基于JSON模式的自动校验式提取  \n🌐 **API优先设计**：提供RESTful API，便于集成  \n💬 **指令调用**：利用GPT-OSS、Mistral、Qwen 3.5等进行文本处理、校验和决策  \n📊 **可视化监控**：内置仪表盘和代理工作流追踪  \n🔒 **企业级支持**：提供速率限制、使用分析及商业授权选项  \n🚀 **本地视觉大语言模型**：Mistral、Qwen 3.5、DeepSeek OCR、dots.ocr、dots-mocr等  \n\n## 
🏗️ 架构\n\n![麻雀架构](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fkatanaml_sparrow_readme_668c0612bb5f.jpeg)\n\n### 核心组件\n\n| 组件 | 用途 | 使用场景 |\n|-----------|---------|----------|\n| **[Sparrow ML LLM](https:\u002F\u002Fgithub.com\u002Fkatanaml\u002Fsparrow\u002Ftree\u002Fmain\u002Fsparrow-ml\u002Fllm)** | 主要API引擎 | 文档处理流程 |\n| **[Sparrow Parse](https:\u002F\u002Fgithub.com\u002Fkatanaml\u002Fsparrow\u002Ftree\u002Fmain\u002Fsparrow-data\u002Fparse)** | 视觉大语言模型库 | 结构化JSON提取 |\n| **[Sparrow Agents](https:\u002F\u002Fgithub.com\u002Fkatanaml\u002Fsparrow\u002Ftree\u002Fmain\u002Fsparrow-ml\u002Fagents)** | 工作流编排 | 复杂多步骤处理 |\n| **[Sparrow OCR](https:\u002F\u002Fgithub.com\u002Fkatanaml\u002Fsparrow\u002Ftree\u002Fmain\u002Fsparrow-data\u002Focr)** | 文本识别 | OCR预处理 |\n| **[Sparrow UI](https:\u002F\u002Fgithub.com\u002Fkatanaml\u002Fsparrow\u002Ftree\u002Fmain\u002Fsparrow-ui\u002F)** | Web界面 | 交互式文档处理 |\n\n## 🚀 快速入门\n\n### 前置条件\n- **Python 3.12.10+**（推荐使用`pyenv`管理版本）\n- **macOS**（用于MLX后端）或**Linux\u002FWindows**（用于其他后端）\n- **GPU**（确保显存足以运行所选视觉大语言模型）\n\n### 30秒快速设置\n\n```bash\n# 1. 安装pyenv和Python 3.12.10\npyenv install 3.12.10\npyenv global 3.12.10\n\n# 2. 创建虚拟环境\npython -m venv .env_sparrow_parse\nsource .env_sparrow_parse\u002Fbin\u002Factivate  # Linux\u002FMac\n# 或 .env_sparrow_parse\\Scripts\\activate  # Windows\n\n# 3. 安装Sparrow Parse流程\ngit clone https:\u002F\u002Fgithub.com\u002Fkatanaml\u002Fsparrow.git\ncd sparrow\u002Fsparrow-ml\u002Fllm\npip install -r requirements_sparrow_parse.txt\n\n# 4. 对于macOS：安装poppler以处理PDF\nbrew install poppler\n\n# 5. 
启动API服务器\npython api.py\n```\n\n在运行`pip install -r requirements_sparrow_parse.txt`之前，请确认你的平台。如果你使用的是macOS并希望运行MLX后端，请检查`requirements_sparrow_parse.txt`文件，确保其中包含`sparrow-parse[mlx]`的引用。如果你是在Linux或Windows上运行麻雀，则应使用`sparrow-parse`的引用，这样可以跳过与MLX相关的依赖项。\n\n### 第一次文档提取\n\n```bash\n# 从债券表格中提取数据\n.\u002Fsparrow.sh '[{\"instrument_name\":\"str\", \"valuation\":0}]' \\\n  --pipeline \"sparrow-parse\" \\\n  --options mlx \\\n  --options mlx-community\u002FQwen2.5-VL-72B-Instruct-4bit \\\n  --file-path \"data\u002Fbonds_table.png\"\n```\n\n**结果：**\n```json\n{\n  \"data\": [\n    {\"instrument_name\": \"UNITS BLACKROCK...\", \"valuation\": 19049},\n    {\"instrument_name\": \"UNITS ISHARES...\", \"valuation\": 83488}\n  ],\n  \"valid\": \"true\"\n}\n```\n\n请根据使用的后端选择相应的`--options`参数：`--options mlx`用于MLX后端，`--options ollama`用于Ollama后端，`--options vllm`用于vLLM后端。务必提供正确的视觉大语言模型名称，并提前通过MLX、vLLM或Ollama单独下载该模型。\n\n## 🛠️ 安装\n\n### 快速设置\n\n```bash\n# 1. 克隆仓库\ngit clone https:\u002F\u002Fgithub.com\u002Fkatanaml\u002Fsparrow.git\ncd sparrow\n```\n\n📖 **如需完整的安装说明**，请参阅我们的[详细环境搭建指南](environment_setup.md)。\n\n### 关键步骤概览\n\n1. **Python环境**：使用pyenv安装Python 3.12.10\n2. **虚拟环境**：为不同流程创建独立环境：\n   - `.env_sparrow_parse` - 用于Sparrow Parse（视觉大语言模型）\n   - `.env_instructor` - 用于Instructor（文本大语言模型）\n   - `.env_ocr` - 用于OCR服务（可选）\n3. **系统依赖**：安装poppler以处理PDF\n4. 
**依赖安装**：安装各流程特定的依赖项，例如：\n\n`pip install -r requirements_sparrow_parse.txt`\n\n### 平台特定说明\n\n**macOS:**\n```bash\nbrew install poppler  # 处理 PDF 所需\n```\n\n**Ubuntu\u002FDebian:**\n```bash\nsudo apt-get install poppler-utils libpoppler-cpp-dev\n```\n\n**Apple Silicon**: 可使用 MLX 后端以获得最佳性能  \n**NVIDIA\u002FAMD GPU**: 使用 vLLM 或 Ollama 后端  \n**仅 CPU**: 使用较小模型或 Hugging Face 云后端  \n\n### 验证\n\n```bash\n# 测试安装\npython api.py --port 8002\n# 访问 http:\u002F\u002Flocalhost:8002\u002Fapi\u002Fv1\u002Fsparrow-llm\u002Fdocs\n```\n\n## 📚 示例\n\n### 🏦 银行对账单处理\n\n![银行对账单](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fkatanaml_sparrow_readme_66e5d8637661.png)\n\n```bash\n# 从银行对账单中提取所有数据\n.\u002Fsparrow.sh \"*\" \\\n  --pipeline \"sparrow-parse\" \\\n  --options mlx \\\n  --options mlx-community\u002FQwen2.5-VL-72B-Instruct-4bit \\\n  --file-path \"data\u002Fbank_statement.pdf\"\n```\n\n\u003Cdetails>\n\u003Csummary>\u003Cstrong>📄 查看完整 JSON 输出\u003C\u002Fstrong>\u003C\u002Fsummary>\n\n```json\n{\n  \"bank\": \"第一鸭嘴兽银行\",\n  \"address\": \"纽约市国王街1234号，邮编12123\",\n  \"account_holder\": \"玛丽·G·奥尔塔\",\n  \"account_number\": \"1234567890123\",\n  \"statement_date\": \"2022年3月1日\",\n  \"period_covered\": \"2022年2月1日至2022年3月1日\",\n  \"account_summary\": {\n    \"balance_on_march_1\": \"$25,032.23\",\n    \"total_money_in\": \"$10,234.23\",\n    \"total_money_out\": \"$10,532.51\"\n  },\n  \"transactions\": [\n    {\n      \"date\": \"2月1日\",\n      \"description\": \"PGD EasyPay 借记\",\n      \"withdrawal\": \"203.24\",\n      \"deposit\": \"\",\n      \"balance\": \"22,098.23\"\n    },\n    {\n      \"date\": \"2月2日\",\n      \"description\": \"AB&B 在线支付*****\",\n      \"withdrawal\": \"71.23\",\n      \"deposit\": \"\",\n      \"balance\": \"22,027.00\"\n    },\n    {\n      \"date\": \"2月4日\",\n      \"description\": \"支票 No. 
2345\",\n      \"withdrawal\": \"\",\n      \"deposit\": \"450.00\",\n      \"balance\": \"22,477.00\"\n    },\n    {\n      \"date\": \"2月5日\",\n      \"description\": \"巨人队23422342号工资直接存款\",\n      \"withdrawal\": \"\",\n      \"deposit\": \"2,534.65\",\n      \"balance\": \"25,011.65\"\n    },\n    {\n      \"date\": \"2月6日\",\n      \"description\": \"TJP 签名式 POS 借记\",\n      \"withdrawal\": \"84.50\",\n      \"deposit\": \"\",\n，bal…\n\n# 处理多页PDF，每页输出结构化数据\n.\u002Fsparrow.sh '{\"table\": [{\"description\": \"str\", \"latest_amount\": 0, \"previous_amount\": 0}]}' \\\n  --pipeline \"sparrow-parse\" \\\n  --options mlx \\\n  --options mlx-community\u002FQwen2.5-VL-72B-Instruct-4bit \\\n  --file-path \"data\u002Ffinancial_report.pdf\" \\\n  --debug-dir \"debug\u002F\"\n```\n\n\u003Cdetails>\n\u003Csummary>\u003Cstrong>📄 查看 JSON 输出\u003C\u002Fstrong>\u003C\u002Fsummary>\n\n```json\n[\n    {\n        \"table\": [\n            {\n                \"description\": \"收入\",\n                \"latest_amount\": 12453,\n                \"previous_amount\": 11445\n            },\n            {\n                \"description\": \"运营费用\",\n                \"latest_amount\": 9157,\n                \"previous_amount\": 8822\n            }\n        ],\n        \"valid\": \"true\",\n        \"page\": 1\n    },\n    {\n        \"table\": [\n            {\n                \"description\": \"收入\", \n                \"latest_amount\": 12453,\n                \"previous_amount\": 11445\n            },\n            {\n                \"description\": \"运营费用\",\n                \"latest_amount\": 9157,\n                \"previous_amount\": 8822\n            }\n        ],\n        \"valid\": \"true\",\n        \"page\": 2\n    }\n]\n```\n\n\u003C\u002Fdetails>\n\n### 💬 文本指令处理\n\n```bash\n# 基于指令的处理\n.\u002Fsparrow.sh \"instruction: do arithmetic operation, payload: 2+2=\" \\\n  --pipeline \"sparrow-instructor\" \\\n  --options mlx \\\n  --options 
lmstudio-community\u002FMistral-Small-3.2-24B-Instruct-2506-8bit\n\n# 带文档输入的指令处理\n.\u002Fsparrow.sh \"check if business entity Chapman, Kim and Green is invoice issuing party\" \n  --pipeline \"sparrow-parse\" \n  --instruction \n  --options mlx --options lmstudio-community\u002FMistral-Small-3.2-24B-Instruct-2506-8bit \n  --file-path \"invoice_1.jpg\"\n```\n\n\n**JSON 输出：**\n```\n2 + 2 的结果是：\n\n4\n```\n\n\n### 📈 股票数据函数调用\n\n```bash\n# 函数调用示例\n.\u002Fsparrow.sh assistant --pipeline \"stocks\" --query \"Oracle\"\n```\n\n**JSON 输出：**\n```json\n{\n  \"company\": \"Oracle Corporation\",\n  \"ticker\": \"ORCL\"\n}\n```\n\n**附加输出：**\n```\n甲骨文公司的股价为 186.3699951171875 美元。\n```\n\n## 💻 CLI 使用\n\n### 基本语法\n\n```bash\n.\u002Fsparrow.sh \"\u003CJSON_SCHEMA>\" --pipeline \"\u003CPIPELINE>\" [OPTIONS] --file-path \"\u003CFILE>\"\n```\n\n### 命令行参数\n\n| 参数 | 类型 | 描述 | 示例 |\n|----------|------|-------------|---------|\n| `query` | JSON\u002FString | 模式或指令 | `'[{\"field\":\"str\"}]'` |\n| `--pipeline` | String | 使用的管道 | `sparrow-parse` |\n| `--file-path` | Path | 输入文档 | `data\u002Finvoice.pdf` |\n| `--hints-file-path` | Path | 查询提示 | `data\u002Fhints.json` |\n| `--options` | String | 后端配置 | `mlx,model-name` |\n| `--instruction` | Boolean | Sparrow 查询将作为指令使用 | `--instruction` |\n| `--validation` | Boolean | Sparrow 查询将用于字段验证 | `--validation` |\n| `--markdown` | Boolean | Markdown 预处理 | `--markdown` |\n| `--ocr` | Boolean | 实验性功能 | `--ocr` |\n| `--table` | Boolean | 实验性功能 | `--table` |\n| `--table-template` | String | 实验性功能 | `--name` |\n| `--crop-size` | Integer | 边框裁剪像素 | `60` |\n| `--page-type` | String | 页面分类 | `financial_table` \n| `--debug` | Boolean | 启用调试模式 | `--debug` |\n| `--debug-dir` | Path | 调试输出文件夹 | `.\u002Fdebug\u002F` |\n\n### 管道选项\n\n#### Sparrow Parse（视觉 LLM）\n```bash\n# MLX 后端（Apple Silicon）\n.\u002Fsparrow.sh '[{\"instrument_name\":\"str\", \"valuation\":0}]' \\\n  --pipeline \"sparrow-parse\" \\\n  --options mlx \\\n  --options 
mlx-community\u002FQwen2.5-VL-72B-Instruct-4bit \\\n  --file-path \"data\u002Fbonds_table.png\"\n\n# Hugging Face Cloud GPU\n--options huggingface --options your-space\u002Fmodel-name\n\n# 其他标志\n--options tables_only        # 仅提取表格\n--options validation_off     # 禁用模式验证\n--options apply_annotation   # 包括边界框\n--page-type financial_table  # 分类页面类型\n```\n\n#### Sparrow Instructor（文本 LLM）\n```bash\n# 基于指令的处理\n.\u002Fsparrow.sh \"instruction: do arithmetic operation, payload: 2+2=\" \\\n  --pipeline \"sparrow-instructor\" \\\n  --options mlx \\\n  --options lmstudio-community\u002FMistral-Small-3.2-24B-Instruct-2506-8bit\n```\n\n### 高级示例\n\n```bash\n# 多页 PDF 并进行页面分类\n.\u002Fsparrow.sh \"*\" \\\n  --page-type invoice \\\n  --page-type table \\\n  --pipeline \"sparrow-parse\" \\\n  --options mlx \\\n  --options mlx-community\u002FQwen2.5-VL-72B-Instruct-4bit \\\n  --file-path \"multi_page.pdf\"\n\n# 处理缺失字段并用空值代替\n.\u002Fsparrow.sh '[{\"required_field\":\"str\", \"optional_field\":\"str or null\"}]' \\\n  --pipeline \"sparrow-parse\" \\\n  --options mlx \\\n  --options mlx-community\u002FQwen2.5-VL-72B-Instruct-4bit \\\n  --file-path \"document.png\"\n\n# 带裁剪的表格提取\n.\u002Fsparrow.sh '*' \\\n  --pipeline \"sparrow-parse\" \\\n  --options mlx \\\n  --options mlx-community\u002FQwen2.5-VL-72B-Instruct-4bit \\\n  --options tables_only \\\n  --crop-size 100 \\\n  --file-path \"scan.pdf\"\n\n# 执行指令\n.\u002Fsparrow.sh \"check if business entity Chapman, Kim and Green is invoice issuing party\" \n  --pipeline \"sparrow-parse\" \n  --instruction \n  --options mlx --options lmstudio-community\u002FMistral-Small-3.2-24B-Instruct-2506-8bit \n  --file-path \"invoice_1.jpg\"\n\n# 字段验证\n.\u002Fsparrow.sh \"tax_id,shipment_code,total_gross_worth\" \n  --pipeline \"sparrow-parse\" \n  --validation \n  --options mlx --options lmstudio-community\u002FMistral-Small-3.2-24B-Instruct-2506-8bit \n  --file-path \"invoice_1.jpg\"\n\n{\n  \"tax_id\": true,\n  \"shipment_code\": false,\n  
\"total_gross_worth\": true\n}\n```\n\n## 🌐 API 使用\n\n### 启动服务器\n\n```bash\n# 默认端口 (8002)\npython api.py\n\n# 自定义端口\npython api.py --port 8001\n\n# 多实例\npython api.py --port 8002 &  # Sparrow Parse\npython api.py --port 8003 &  # Instructor\n```\n\n### API 端点\n\n#### 文档提取 (`\u002Finference`)\n\n```bash\ncurl -X POST 'http:\u002F\u002Flocalhost:8002\u002Fapi\u002Fv1\u002Fsparrow-llm\u002Finference' \\\n  -H 'Content-Type: multipart\u002Fform-data' \\\n  -F 'query=[{\"field_name\":\"str\", \"amount\":0}]' \\\n  -F 'pipeline=sparrow-parse' \\\n  -F 'options=mlx,mlx-community\u002FQwen2.5-VL-72B-Instruct-4bit' \\\n  -F 'file=@document.pdf'\n```\n\n#### 文本指令 (`\u002Finstruction-inference`)\n\n```bash\ncurl -X POST 'http:\u002F\u002Flocalhost:8002\u002Fapi\u002Fv1\u002Fsparrow-llm\u002Finstruction-inference' \\\n  -H 'Content-Type: application\u002Fx-www-form-urlencoded' \\\n  -d 'query=instruction: analyze data, payload: {...}' \\\n  -d 'pipeline=sparrow-instructor' \\\n  -d 'options=mlx,mlx-community\u002FQwen2.5-VL-72B-Instruct-4bit'\n```\n\n### API 文档\n\n访问 `http:\u002F\u002Flocalhost:8002\u002Fapi\u002Fv1\u002Fsparrow-llm\u002Fdocs` 查看交互式 Swagger 文档。\n\n![API 文档](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fkatanaml_sparrow_readme_54219b8de487.png)\n\n## 🤖 麻雀代理\n\n![麻雀代理](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fkatanaml_sparrow_readme_246b4808bc8e.png)\n\n借助 Prefect 提供的可视化监控功能，编排复杂的文档处理工作流。\n\n### 功能\n- **多步骤工作流**：串联分类、提取和验证流程\n- **可视化监控**：实时跟踪管道状态\n- **错误处理**：强大的失败恢复机制\n- **可扩展性**：针对特定用例的自定义代理\n\n### 使用方法\n\n```bash\n# 启动代理服务器\ncd sparrow-ml\u002Fagents\npython api.py --port 8001\n\n# 处理医疗处方\ncurl -X POST 'http:\u002F\u002Flocalhost:8001\u002Fapi\u002Fv1\u002Fsparrow-agents\u002Fexecute\u002Ffile' \\\n  -F 'agent_name=medical_prescriptions' \\\n  -F 'extraction_params={\"sparrow_key\":\"123456\"}' \\\n  -F 'file=@prescription.pdf'\n```\n\n## 📊 仪表板\n\n内置分析与监控仪表板，访问地址为 [sparrow.katanaml.io](https:\u002F\u002Fsparrow.katanaml.io)。这是 Sparrow UI 
的一部分，需要本地安装 Oracle Database 23ai Free 版本。\n\n![仪表板](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fkatanaml_sparrow_readme_90fea7deece2.png)\n\n### 功能\n- **使用情况分析**：跟踪 API 调用、成功率及性能指标\n- **地理分布**：按国家查看使用情况\n- **模型性能**：对比不同模型的表现\n- **实时监控**：展示实时处理统计信息\n\n## 🔧 工作流对比\n\n| 特性 | Sparrow Parse | Sparrow Instructor | Sparrow Agents |\n|---------|---------------|-------------------|----------------|\n| **输入** | 文档 + JSON 模式 | 文本指令 | 复杂工作流 |\n| **输出** | 结构化 JSON | 自由格式文本 | 多步骤结果 |\n| **应用场景** | 数据提取、表单 | 摘要生成、分析 | 企业级工作流 |\n| **验证方式** | 基于模式 | 手动 | 自定义规则 |\n| **复杂度** | 简单 | 中等 | 高 |\n| **适用场景** | 发票、表格、表单 | 文本处理 | 多文档流程 |\n\n### 如何选择使用\n\n**Sparrow Parse**：适用于从文档中提取结构化数据  \n**Sparrow Instructor**：适用于文本分析、摘要生成及问答任务  \n**Sparrow Agents**：适用于复杂的多步骤文档处理工作流  \n\n## ⚡ 性能优化建议\n\n### 硬件优化\n\n**Apple Silicon (MLX)**\n- ✅ 统一内存带来最佳性能\n- ✅ 支持模型：Mistral-Small-3.2-24B、Qwen2.5-VL-72B\n- ⚠️ 需 macOS 系统且配备 Apple Silicon 芯片\n\n**NVIDIA GPU**\n- ✅ 推荐使用 vLLM 或 Ollama 后端\n- ✅ 建议使用 Nvidia DGX Spark（显存 12GB 以上）或 AMD GPU\n- ⚠️ 需配置 CUDA 环境\n\n**仅 CPU**\n- ⚠️ 性能显著较低\n- ✅ 宜选用较小规模模型（参数量不超过 7B）\n- ✅ 可考虑使用 Hugging Face 云端后端\n\n### 内存管理\n\n```bash\n# 降低内存占用\n--crop-size 100        # 裁剪大尺寸图片\n--options tables_only  # 仅处理表格\n\n# 处理大型 PDF\n--debug-dir .\u002Ftemp     # 监控处理过程\n# 必要时手动拆分大 PDF\n```\n\n### 模型选择\n\n| 使用场景 | 推荐模型 | 内存需求 | 速度 |\n|----------|------------------|---------|--------|\n| **表单\u002F发票** | Mistral-Small-3.2-24B | 35GB | 快速 |\n| **复杂表格** | Qwen2.5-VL-72B | 50GB | 较慢 |\n| **快速测试** | Qwen2.5-VL-7B | 20GB | 最快 |\n\n## 🔍 故障排除\n\n### 常见问题\n\n\u003Cdetails>\n\u003Csummary>🚫 安装问题\u003C\u002Fsummary>\n\n**Python 版本问题：**\n```bash\n# 检查 Python 版本\npython --version  # 应为 3.12.10+\n\n# 使用 pyenv 修复\npyenv install 3.12.10\npyenv global 3.12.10\n```\n\n**MLX 安装（Apple Silicon）：**\n```bash\n# 若 MLX 安装失败\npip install --upgrade pip\npip install mlx-vlm --no-cache-dir\n```\n\n```bash\n# 若 pip install 报错 AttributeError: 'NoneType' object has no attribute 'get'\n# 存在安全风险——绕过了 SSL 验证。请在了解风险的情况下谨慎操作\npip 
install mlx-vlm --trusted-host pypi.org --trusted-host pypi.python.org --trusted-host files.pythonhosted.org\n```\n\n**Poppler 缺失：**\n```bash\n# macOS\nbrew install poppler\n\n# Ubuntu\u002FDebian  \nsudo apt-get install poppler-utils\n\n# 验证安装\npdftoppm -h\n```\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n\u003Csummary>🔧 运行时问题\u003C\u002Fsummary>\n\n**内存不足错误：**\n- 使用更小规模的模型（如 7B 而不是 72B）\n- 开启图像裁剪功能：`--crop-size 100`\n- 分别处理单页而非整份 PDF\n\n**模型加载失败：**\n```bash\n# 清除模型缓存\nrm -rf ~\u002F.cache\u002Fhuggingface\u002F\nrm -rf ~\u002F.mlx\u002F\n\n# 重新下载模型\npython -c \"from mlx_vlm import load; load('model-name')\"\n```\n\n**API 连接问题：**\n```bash\n# 检查服务是否运行\ncurl http:\u002F\u002Flocalhost:8002\u002Fhealth\n\n# 查看日志\npython api.py --debug\n```\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n\u003Csummary>📄 文档处理问题\u003C\u002Fsummary>\n\n**提取质量不佳：**\n- 尝试图像裁剪：`--crop-size 60`\n- 对于表格类文档，使用 `--options tables_only`\n- 确保图像分辨率足够高（300+ DPI）\n- 避免使用 `--options validation_off`，启用模式校验\n\n**PDF 处理失败：**\n```bash\n# 手动测试 PDF\npdftoppm -png input.pdf output\n\n# 检查页数\npython -c \"\nimport pypdf\nwith open('file.pdf', 'rb') as f:\n    reader = pypdf.PdfReader(f)\n    print(f'Pages: {len(reader.pages)}')\n\"\n```\n\n**JSON 模式错误：**\n- 校验 JSON 语法：使用 [jsonlint.com](https:\u002F\u002Fjsonlint.com)\n- 确保字段类型正确：`\"str\"`、`0`、`0.0`、`\"str or null\"`\n- 先用简单模式进行测试\n\n\u003C\u002Fdetails>\n\n### 获取帮助\n\n1. **📖 查阅文档**：仔细阅读本 README 和各组件文档\n2. **🐛 搜索问题**：访问 [GitHub Issues](https:\u002F\u002Fgithub.com\u002Fkatanaml\u002Fsparrow\u002Fissues)  \n3. **💬 提交新问题**：提供日志、系统信息及最小复现示例\n4. 
**📧 商业支持**：联系 [abaranovskis@redsamuraiconsulting.com](mailto:abaranovskis@redsamuraiconsulting.com)\n\n## ⭐ 星标历史\n\n[![星标历史图](https:\u002F\u002Fapi.star-history.com\u002Fsvg?repos=katanaml\u002Fsparrow&type=Date)](https:\u002F\u002Fstar-history.com\u002F#katanaml\u002Fsparrow&Date)\n\n## 📜 许可证\n\n**开源**：采用 GPL 3.0 许可证。对开源项目及年收入低于 500 万美元的组织免费。\n\n**商业版**：提供双重许可，适用于专有用途、企业级功能及专属支持。\n\n**联系方式**：如需商业许可或咨询，请联系 [abaranovskis@redsamuraiconsulting.com](mailto:abaranovskis@redsamuraiconsulting.com)。\n\n## 👥 作者\n\n- **[Katana ML](https:\u002F\u002Fkatanaml.io)** - AI\u002FML 咨询与解决方案提供商\n- **[Andrej Baranovskij](https:\u002F\u002Fgithub.com\u002Fabaranovskis-redsamurai)** - 主要开发者\n\n---\n\n\u003Cp align=\"center\">\n  \u003Cstrong>⭐ 如果 Sparrow 对您的项目有帮助，请在 GitHub 上为我们点亮星标！\u003C\u002Fstrong>\u003Cbr>\n  \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fkatanaml\u002Fsparrow\">github.com\u002Fkatanaml\u002Fsparrow\u003C\u002Fa>\n\u003C\u002Fp>","# Sparrow 快速上手指南\n\nSparrow 是一个生产级的结构化数据提取工具，利用机器学习 (ML)、大语言模型 (LLM) 和视觉大语言模型 (Vision LLM)，将发票、收据、报表、表格和图像转换为干净的 JSON 数据。\n\n## 环境准备\n\n在开始之前，请确保您的系统满足以下要求：\n\n*   **操作系统**:\n    *   **macOS**: 推荐用于 Apple Silicon (M1\u002FM2\u002FM3) 用户，可使用高性能的 MLX 后端。\n    *   **Linux \u002F Windows**: 支持其他后端（如 Ollama, vLLM）。\n*   **Python 版本**: 必须安装 **Python 3.12.10+**。建议使用 `pyenv` 进行版本管理。\n*   **硬件要求**:\n    *   **GPU**: 运行视觉大模型需要足够的显存。\n    *   **Apple Silicon**: 原生支持 MLX 加速。\n    *   **NVIDIA\u002FAMD GPU**: 推荐使用 vLLM 或 Ollama 后端。\n*   **系统依赖**:\n    *   需要安装 `poppler` 以支持 PDF 处理。\n    *   **macOS**: `brew install poppler`\n    *   **Ubuntu\u002FDebian**: `sudo apt-get install poppler-utils libpoppler-cpp-dev`\n\n## 安装步骤\n\n### 1. 设置 Python 环境\n\n使用 `pyenv` 安装并切换至指定 Python 版本：\n\n```bash\n# 安装 Python 3.12.10\npyenv install 3.12.10\npyenv global 3.12.10\n\n# 创建虚拟环境\npython -m venv .env_sparrow_parse\nsource .env_sparrow_parse\u002Fbin\u002Factivate  # Linux\u002FMac\n# Windows 用户请使用: .env_sparrow_parse\\Scripts\\activate\n```\n\n### 2. 
克隆项目并安装依赖\n\n```bash\n# 克隆仓库\ngit clone https:\u002F\u002Fgithub.com\u002Fkatanaml\u002Fsparrow.git\ncd sparrow\u002Fsparrow-ml\u002Fllm\n\n# 根据平台选择依赖文件\n# macOS (MLX 后端): 确保 requirements_sparrow_parse.txt 中包含 sparrow-parse[mlx]\n# Linux\u002FWindows: 确保使用标准的 sparrow-parse 库\npip install -r requirements_sparrow_parse.txt\n```\n\n> **注意**：在安装前请检查 `requirements_sparrow_parse.txt`。如果您在 macOS 上希望使用 MLX 加速，请确认文件中引用了 `sparrow-parse[mlx]`；若在 Linux\u002FWindows 上，请使用标准引用以跳过 MLX 相关库。\n\n### 3. 启动服务\n\n```bash\npython api.py\n```\n\n启动后，可访问 `http:\u002F\u002Flocalhost:8002\u002Fapi\u002Fv1\u002Fsparrow-llm\u002Fdocs` 查看 API 文档。\n\n## 基本使用\n\nSparrow 提供了命令行脚本 `sparrow.sh` 用于快速提取数据。以下是最简单的使用示例。\n\n### 示例：从债券表格图片中提取数据\n\n此命令将读取 `data\u002Fbonds_table.png`，并根据提供的 JSON Schema 提取仪器名称和估值。\n\n```bash\n.\u002Fsparrow.sh '[{\"instrument_name\":\"str\", \"valuation\":0}]' \\\n  --pipeline \"sparrow-parse\" \\\n  --options mlx \\\n  --options mlx-community\u002FQwen2.5-VL-72B-Instruct-4bit \\\n  --file-path \"data\u002Fbonds_table.png\"\n```\n\n**参数说明：**\n*   `'[{\"instrument_name\":\"str\", \"valuation\":0}]'`: 定义输出数据的 JSON Schema。\n*   `--pipeline \"sparrow-parse\"`: 指定使用 Sparrow Parse 流水线（基于 Vision LLM）。\n*   `--options mlx`: 指定后端为 MLX（macOS Apple Silicon）。如果是其他平台，可替换为 `ollama` 或 `vllm`。\n*   `--options \u003Cmodel_name>`: 指定具体的视觉大模型名称（需预先通过对应后端下载）。\n*   `--file-path`: 输入文件路径（支持 PNG, JPG, PDF）。\n\n**预期输出结果：**\n\n```json\n{\n  \"data\": [\n    {\"instrument_name\": \"UNITS BLACKROCK...\", \"valuation\": 19049},\n    {\"instrument_name\": \"UNITS ISHARES...\", \"valuation\": 83488}\n  ],\n  \"valid\": \"true\"\n}\n```\n\n### 进阶提示\n*   **通配符提取**：如果不确定具体字段，可以使用 `*` 作为 Schema 参数来提取文档中的所有信息（例如处理银行对账单）。\n*   **裁剪优化**：对于复杂文档，可添加 `--crop-size 60` 参数进行预处理裁剪以提高准确率。\n*   **模型准备**：在使用前，请确保已通过 MLX、Ollama 或 vLLM 单独下载了对应的模型文件。","某中型物流公司的财务团队每天需处理数百张来自不同供应商的纸质发票和手写收据，以便录入 ERP 系统进行结算。\n\n### 没有 sparrow 时\n- 财务人员必须手动逐字敲击发票上的金额、日期和税号，耗时且极易因疲劳产生录入错误。\n- 面对模糊的手写收据或复杂排版的 PDF，传统 OCR 软件经常识别错乱，需要人工反复校对修正。\n- 
非结构化的文本数据无法直接对接数据库，开发人员需编写大量定制化正则代码来清洗每种格式的单据。\n- 缺乏可视化的验证手段，当识别出错时，难以快速定位是图片哪个区域导致了提取偏差。\n- 整个流程从扫描到入库平均耗时 3 天，严重拖慢了月度结账和供应商付款进度。\n\n### 使用 sparrow 后\n- 利用 Sparrow 的 Vision LLM 能力，系统自动将发票图像转化为标准 JSON 数据，人工录入工作减少 90%。\n- 即使面对手写体或低质量扫描件，Sparrow 也能精准提取关键字段，并通过 JSON Schema 自动校验数据合法性。\n- 借助 Sparrow 可插拔的架构，团队无需重写代码即可灵活切换不同的解析管道，轻松适配新出现的单据格式。\n- 通过 Sparrow UI 自带的标注功能，财务人员能直接看到识别结果对应的原图边界框，秒级完成异常数据复核。\n- 数据处理实现实时化，单据上传即生成结构化报表，月度结账周期从 3 天缩短至 4 小时。\n\nSparrow 通过将多模态大模型与结构化数据提取深度融合，让企业以极低成本实现了文档处理流程的自动化与智能化闭环。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fkatanaml_sparrow_7475e2d5.png","katanaml","Katana ML","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002Fkatanaml_fc9545ad.png","Machine Learning for Business Automation",null,"andrejusb","https:\u002F\u002Fkatanaml.io","https:\u002F\u002Fgithub.com\u002Fkatanaml",[81,85,89],{"name":82,"color":83,"percentage":84},"Python","#3572A5",99.8,{"name":86,"color":87,"percentage":88},"Shell","#89e051",0.2,{"name":90,"color":91,"percentage":92},"Dockerfile","#384d54",0.1,5151,511,"2026-04-14T11:55:42","GPL-3.0","macOS, Linux, Windows","非绝对必需（支持 CPU 或云端），但推荐 NVIDIA\u002FAMD GPU 以运行 vLLM\u002FOllama 后端，或 Apple Silicon (M1\u002FM2\u002FM3) 以运行 MLX 后端。显存需求取决于所选 Vision LLM 模型大小（例如运行 72B 模型需较大显存）。","未说明（建议根据所选模型大小配置，运行大型 Vision LLM 通常需要 16GB+）",{"notes":101,"python":102,"dependencies":103},"1. 必须使用 pyenv 管理 Python 版本 (3.12.10+)。\n2. 不同后端需安装不同依赖：macOS 使用 MLX 后端需在 requirements 中指定 sparrow-parse[mlx]；Linux\u002FWindows 使用标准 sparrow-parse。\n3. 系统级依赖：macOS 需通过 brew 安装 poppler；Ubuntu\u002FDebian 需安装 poppler-utils 和 libpoppler-cpp-dev 以支持 PDF 处理。\n4. 需预先通过 MLX、vLLM 或 Ollama 单独下载对应的 Vision LLM 模型（如 Qwen2.5-VL-72B-Instruct-4bit）。\n5. 
建议为不同流水线（Parse, Instructor, OCR）创建独立的虚拟环境。","3.12.10+",[104,105,106],"sparrow-parse","poppler-utils","pyenv",[35,13,15,14],[109,110,111,112,113,114,115,116],"machinelearning","huggingface-transformers","nlp-machine-learning","computer-vision","gpt","llm","rag","vllm","2026-03-27T02:49:30.150509","2026-04-15T06:06:32.148662",[120,125,130,135,140,144],{"id":121,"question_zh":122,"answer_zh":123,"source_url":124},33685,"遇到 'Invalid Sparrow key' 错误怎么办？","该错误通常是因为 API URL 指向了官方的 Hugging Face Spaces 实例，而您使用的是本地服务。解决方法：\n1. 如果您在本地运行代码，可以将 Sparrow Key 替换为您自己设置的任意值。\n2. 如果您运行 Sparrow UI 并想测试推理，请修改 `views\u002Fdata_inference.py` 文件中的 `render_results` 函数，将 API URL 更改为指向您的本地实例地址。","https:\u002F\u002Fgithub.com\u002Fkatanaml\u002Fsparrow\u002Fissues\u002F9",{"id":126,"question_zh":127,"answer_zh":128,"source_url":129},33686,"在非 Mac 平台（如 Linux\u002FWindows）使用 Sparrow Parse 时出现 'ModuleNotFoundError: No module named mlx_vlm' 错误如何解决？","这是因为代码在头部直接导入了仅适用于 Mac 的 `mlx_vlm` 模块。维护者已确认这是一个需要修复的问题（将导入移至条件块中以进行延迟加载）。\n临时解决方案或预期行为：该问题已在后续更新中修复。如果您遇到此问题，请确保升级到最新版本，或者检查代码是否已将 `from sparrow_parse.vllm.mlx_inference import MLXInference` 移至针对 MLX 方法的条件判断块内，以避免在非 Mac 平台上加载该模块。","https:\u002F\u002Fgithub.com\u002Fkatanaml\u002Fsparrow\u002Fissues\u002F102",{"id":131,"question_zh":132,"answer_zh":133,"source_url":134},33687,"在 Apple M 系列芯片上安装 sparrow-ocr 依赖项失败或运行时无响应怎么办？","在 Apple M 系列芯片上安装 `requirements.txt` 可能会遇到编译错误（如 PyMuPDF 相关错误），或者运行时进入无限循环。这通常与 PaddleOCR 等库在 M 芯片上的兼容性有关。\n推荐解决方案：最简单的方法是将该服务封装到 Docker 容器中运行，以避开本地环境依赖和架构兼容性问题。","https:\u002F\u002Fgithub.com\u002Fkatanaml\u002Fsparrow\u002Fissues\u002F55",{"id":136,"question_zh":137,"answer_zh":138,"source_url":139},33688,"调用 API 时出现 \"TypeError: 'ModelConfig' object is not subscriptable\" 错误是什么原因？","这通常是由于 `mlx-vlm` 库的版本不兼容导致的。维护者确认在 `mlx-vlm==0.1.4` 版本下（符合 requirements 文件要求）在 M4 机器上可以正常工作。请检查您的环境，确保安装了正确版本的 `mlx-vlm`，或者尝试重新安装依赖：`pip install -r requirements.txt` 
以确保版本匹配。","https:\u002F\u002Fgithub.com\u002Fkatanaml\u002Fsparrow\u002Fissues\u002F78",{"id":141,"question_zh":142,"answer_zh":143,"source_url":139},33689,"如何为不同的多模态模型（如 Qwen2-VL, LLaVA, Phi3-V）构造正确的 Prompt 消息格式？","不同模型需要的消息格式不同，参考以下 `prompt_utils.py` 中的逻辑：\n1. 对于 `idefics2` 和 `qwen2_vl`：内容应为列表格式 `[{\"type\": \"image\"}, {\"type\": \"text\", \"text\": prompt}]`。\n2. 对于 `llava-qwen2`, `llava`, `llava_next`, `bunny-llama`：内容应为字符串格式 `f\"\u003Cimage>\\n{prompt}\"`。\n3. 对于 `phi3_v`：内容应为字符串格式 `f\"\u003C|image_1|>\\n{prompt}\"`。\n请根据您使用的模型名称选择对应的格式构造 JSON 消息。",{"id":145,"question_zh":146,"answer_zh":147,"source_url":148},33690,"安装 sparrow-ocr 时遇到 python-poppler 元数据准备失败（mesonpy 属性错误）怎么办？","该错误通常发生在构建 `python-poppler` 时，原因是构建后端 `mesonpy` 版本不兼容或缺少必要属性。虽然具体讨论在截断前未完全展示，但此类问题的通用解决方案包括：\n1. 升级构建工具：`pip install --upgrade pip setuptools wheel meson-python`。\n2. 如果可能，优先使用 Docker 容器运行 OCR 服务，以避免复杂的系统级依赖编译问题（参考 Issue #55 的建议）。","https:\u002F\u002Fgithub.com\u002Fkatanaml\u002Fsparrow\u002Fissues\u002F46",[150,155,160,165,170,175,180,185,190,195,200,205,210,215,220,225,230,235,240,245],{"id":151,"version":152,"summary_zh":153,"released_at":154},263534,"v0.4.4","## 新的 MLX 后端\r\n\r\n改进了 Sparrow 后端，升级了 MLX 库，新增了 Mistral 3.2 模型，并优化了用户界面。","2025-09-27T15:20:24",{"id":156,"version":157,"summary_zh":158,"released_at":159},263535,"v0.4.3","## 数据标注支持\r\n\r\n通过边界框标注和视觉坐标提取，增强文档处理能力，实现字段位置的精准定位与跟踪。","2025-05-24T18:38:50",{"id":161,"version":162,"summary_zh":163,"released_at":164},263536,"v0.4.2","## 大模型指令调用\n\n基于微服务架构的大模型请求处理与指令调用","2025-05-08T11:52:14",{"id":166,"version":167,"summary_zh":168,"released_at":169},263537,"v0.4.1","## 麻雀UI仪表盘\r\n\r\n在麻雀UI中新增了仪表盘，用于可视化展示系统性能。","2025-04-11T07:08:28",{"id":171,"version":172,"summary_zh":173,"released_at":174},263538,"v0.4.0","## 免费层级与全新视觉后端模型\r\n\r\nSparrow 已更新，新增适用于 https:\u002F\u002Fsparrow.katanaml.io\u002F 的免费层级功能。现已集成多款全新视觉后端模型，例如 Mistral Small 3.1 和 Qwen 2.5 
72B。","2025-03-29T20:13:28",{"id":176,"version":177,"summary_zh":178,"released_at":179},263539,"v0.3.0","## 麻雀代理\n\n新增了麻雀代理功能，用于编排复杂的文档处理任务。这一新组件允许您将多项数据提取操作组合成一个单一的工作流，并通过与 Prefect 的集成实现可视化监控和跟踪。用户现在可以通过一个直观的 API 处理文档，该 API 在一条无缝的流水线中完成分类、提取和验证等步骤。","2025-03-09T14:37:54",{"id":181,"version":182,"summary_zh":183,"released_at":184},263540,"v0.2.4","## 图片裁剪与 UI 框架更新\n\n实现了图片裁剪功能，这在表单页面中非常有用，可以有效减小图片的整体大小。同时，对 Sparrow UI 框架进行了多项界面优化，并启用了更完善的日志记录功能。\n\n本次发布支持在本地 Mac Mini M4 Pro（64GB 内存）上部署 Sparrow —— https:\u002F\u002Fsparrow.katanaml.io","2025-01-23T07:50:47",{"id":186,"version":187,"summary_zh":188,"released_at":189},263541,"v0.2.3","## 表格处理\r\n\r\n新增支持自动检测表格，并将裁剪后的表格图像发送进行推理。","2024-12-16T19:58:28",{"id":191,"version":192,"summary_zh":193,"released_at":194},263542,"v0.2.2","## 多页PDF文档支持\r\n\r\n通过命令行和API新增了对多页PDF文档的支持。","2024-11-24T19:36:15",{"id":196,"version":197,"summary_zh":198,"released_at":199},263543,"v0.2.1","## 依赖项清理\n\n移除了对 LlamaIndex、Haystack、Unstructured 等库的依赖，因为 Sparrow 的主要 focus 是 Sparrow Parse。","2024-11-08T19:28:16",{"id":201,"version":202,"summary_zh":203,"released_at":204},263544,"v0.2.0","## Sparrow Parse with Vision LLM support\r\n\r\nThis release starts new phase in Sparrow development - Vision LLM support for document data processing.\r\n\r\n1. Sparrow Parse library supports Vision LLM\r\n2. Sparrow Parse provides factory class implementation to run inference locally or on cloud GPU\r\n3. Sparrow supports JSON as input query\r\n4. JSON query validation and LLM response JSON validation is performed","2024-10-04T10:37:47",{"id":206,"version":207,"summary_zh":208,"released_at":209},263545,"v0.1.8","## [v0.1.8] - 2024-07-02\r\n\r\n### New Features\r\n\r\n- Sparrow Parse integration\r\n\r\n### What's Changed\r\n\r\n- Sparrow Parse is integrated into Instructor agent. 
README updated with example for Instructor agent","2024-07-02T06:58:07",{"id":211,"version":212,"summary_zh":213,"released_at":214},263546,"v0.1.7","## [v0.1.7] - 2024-04-23\r\n\r\n### New Features\r\n\r\n- New Instructor agent\r\n\r\n### What's Changed\r\n\r\n- Added instructor agent for better JSON response generation","2024-04-23T14:09:32",{"id":216,"version":217,"summary_zh":218,"released_at":219},263547,"v0.1.6","## [v0.1.6] - 2024-04-17\r\n\r\n### New Features\r\n\r\n- New agents with Unstructured\r\n\r\n### What's Changed\r\n\r\n- Added unstructured-light and unstructured agents for better data pre-processing","2024-04-17T07:12:11",{"id":221,"version":222,"summary_zh":223,"released_at":224},263548,"v0.1.5","## [v0.1.5] - 2024-03-27\r\n\r\n### New Features\r\n\r\n- Virtual Environments support\r\n\r\n### What's Changed\r\n\r\n- Fixes in LlamaIndex agent to run with latest LlamaIndex versions\r\n- LLM function calling agent","2024-03-27T13:04:48",{"id":226,"version":227,"summary_zh":228,"released_at":229},263549,"v0.1.4","## [v0.1.4] - 2024-03-07\r\n\r\n### New Features\r\n\r\n- OCR + LLM support, new vprocessor agent\r\n\r\n### What's Changed\r\n\r\n- Improved FastAPI endpoints","2024-03-07T12:10:13",{"id":231,"version":232,"summary_zh":233,"released_at":234},263550,"v0.1.3","## [v0.1.3] - 2024-02-11\r\n\r\n### New Features\r\n\r\n- Added Haystack agent for structured data\r\n\r\n### What's Changed\r\n\r\n- Changed plugins to agents","2024-02-11T18:04:29",{"id":236,"version":237,"summary_zh":238,"released_at":239},263551,"v0.1.2","## [v0.1.2] - 2024-01-31\r\n\r\n### New Features\r\n\r\n- Added support for plugin architecture. 
This allows various toolkits, such as LlamaIndex or Haystack, to be used within Sparrow\r\n\r\n### What's Changed\r\n\r\n- Significant code refactoring","2024-01-31T19:13:47",{"id":241,"version":242,"summary_zh":243,"released_at":244},263552,"v0.1.1","## [v0.1.1] - 2024-01-19\r\n\r\n### New Features\r\n\r\n- Minor improvements related to data ingestion\r\n\r\n### What's Changed\r\n\r\n- Fixed a bug in cleaning the Vector DB when a new document is inserted\r\n- Tested with Notus and Openhermes LLMs\r\n- Tested with longer and more realistic documents\r\n- Upgraded LlamaIndex and LangChain","2024-01-19T19:58:33",{"id":246,"version":247,"summary_zh":248,"released_at":249},263553,"v0.1.0","### New Features\r\n\r\n- Lemming LLM RAG\r\n\r\n### What's Changed\r\n\r\n- ","2024-01-12T08:40:55"]