[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-naiveHobo--InvoiceNet":3,"tool-naiveHobo--InvoiceNet":61},[4,18,26,36,44,53],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":17},4358,"openclaw","openclaw\u002Fopenclaw","OpenClaw 是一款专为个人打造的本地化 AI 助手，旨在让你在自己的设备上拥有完全可控的智能伙伴。它打破了传统 AI 助手局限于特定网页或应用的束缚，能够直接接入你日常使用的各类通讯渠道，包括微信、WhatsApp、Telegram、Discord、iMessage 等数十种平台。无论你在哪个聊天软件中发送消息，OpenClaw 都能即时响应，甚至支持在 macOS、iOS 和 Android 设备上进行语音交互，并提供实时的画布渲染功能供你操控。\n\n这款工具主要解决了用户对数据隐私、响应速度以及“始终在线”体验的需求。通过将 AI 部署在本地，用户无需依赖云端服务即可享受快速、私密的智能辅助，真正实现了“你的数据，你做主”。其独特的技术亮点在于强大的网关架构，将控制平面与核心助手分离，确保跨平台通信的流畅性与扩展性。\n\nOpenClaw 非常适合希望构建个性化工作流的技术爱好者、开发者，以及注重隐私保护且不愿被单一生态绑定的普通用户。只要具备基础的终端操作能力（支持 macOS、Linux 及 Windows WSL2），即可通过简单的命令行引导完成部署。如果你渴望拥有一个懂你",349277,3,"2026-04-06T06:32:30",[13,14,15,16],"Agent","开发框架","图像","数据工具","ready",{"id":19,"name":20,"github_repo":21,"description_zh":22,"stars":23,"difficulty_score":10,"last_commit_at":24,"category_tags":25,"status":17},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,"2026-04-05T11:01:52",[14,15,13],{"id":27,"name":28,"github_repo":29,"description_zh":30,"stars":31,"difficulty_score":32,"last_commit_at":33,"category_tags":34,"status":17},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 
代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",142651,2,"2026-04-06T23:34:12",[14,13,35],"语言模型",{"id":37,"name":38,"github_repo":39,"description_zh":40,"stars":41,"difficulty_score":32,"last_commit_at":42,"category_tags":43,"status":17},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",107888,"2026-04-06T11:32:50",[14,15,13],{"id":45,"name":46,"github_repo":47,"description_zh":48,"stars":49,"difficulty_score":32,"last_commit_at":50,"category_tags":51,"status":17},4721,"markitdown","microsoft\u002Fmarkitdown","MarkItDown 是一款由微软 AutoGen 团队打造的轻量级 Python 工具，专为将各类文件高效转换为 Markdown 格式而设计。它支持 PDF、Word、Excel、PPT、图片（含 OCR）、音频（含语音转录）、HTML 乃至 YouTube 链接等多种格式的解析，能够精准提取文档中的标题、列表、表格和链接等关键结构信息。\n\n在人工智能应用日益普及的今天，大语言模型（LLM）虽擅长处理文本，却难以直接读取复杂的二进制办公文档。MarkItDown 恰好解决了这一痛点，它将非结构化或半结构化的文件转化为模型“原生理解”且 Token 效率极高的 Markdown 格式，成为连接本地文件与 AI 分析 pipeline 的理想桥梁。此外，它还提供了 MCP（模型上下文协议）服务器，可无缝集成到 Claude Desktop 等 LLM 应用中。\n\n这款工具特别适合开发者、数据科学家及 AI 研究人员使用，尤其是那些需要构建文档检索增强生成（RAG）系统、进行批量文本分析或希望让 AI 
助手直接“阅读”本地文件的用户。虽然生成的内容也具备一定可读性，但其核心优势在于为机器",93400,"2026-04-06T19:52:38",[52,14],"插件",{"id":54,"name":55,"github_repo":56,"description_zh":57,"stars":58,"difficulty_score":10,"last_commit_at":59,"category_tags":60,"status":17},4487,"LLMs-from-scratch","rasbt\u002FLLMs-from-scratch","LLMs-from-scratch 是一个基于 PyTorch 的开源教育项目，旨在引导用户从零开始一步步构建一个类似 ChatGPT 的大型语言模型（LLM）。它不仅是同名技术著作的官方代码库，更提供了一套完整的实践方案，涵盖模型开发、预训练及微调的全过程。\n\n该项目主要解决了大模型领域“黑盒化”的学习痛点。许多开发者虽能调用现成模型，却难以深入理解其内部架构与训练机制。通过亲手编写每一行核心代码，用户能够透彻掌握 Transformer 架构、注意力机制等关键原理，从而真正理解大模型是如何“思考”的。此外，项目还包含了加载大型预训练权重进行微调的代码，帮助用户将理论知识延伸至实际应用。\n\nLLMs-from-scratch 特别适合希望深入底层原理的 AI 开发者、研究人员以及计算机专业的学生。对于不满足于仅使用 API，而是渴望探究模型构建细节的技术人员而言，这是极佳的学习资源。其独特的技术亮点在于“循序渐进”的教学设计：将复杂的系统工程拆解为清晰的步骤，配合详细的图表与示例，让构建一个虽小但功能完备的大模型变得触手可及。无论你是想夯实理论基础，还是为未来研发更大规模的模型做准备",90106,"2026-04-06T11:19:32",[35,15,13,14],{"id":62,"github_repo":63,"name":64,"description_en":65,"description_zh":66,"ai_summary_zh":66,"readme_en":67,"readme_zh":68,"quickstart_zh":69,"use_case_zh":70,"hero_image_url":71,"owner_login":72,"owner_name":73,"owner_avatar_url":74,"owner_bio":75,"owner_company":76,"owner_location":77,"owner_email":78,"owner_twitter":77,"owner_website":79,"owner_url":80,"languages":81,"stars":90,"forks":91,"last_commit_at":92,"license":93,"difficulty_score":94,"env_os":95,"env_gpu":96,"env_ram":97,"env_deps":98,"category_tags":108,"github_topics":111,"view_count":32,"oss_zip_url":77,"oss_zip_packed_at":77,"status":17,"created_at":129,"updated_at":130,"faqs":131,"releases":162},4938,"naiveHobo\u002FInvoiceNet","InvoiceNet","Deep neural network to extract intelligent information from invoice documents.","InvoiceNet 是一款基于深度神经网络的开源工具，专为从发票文档中智能提取关键信息而设计。它有效解决了传统人工录入发票数据效率低下、易出错，以及通用自动化方案难以适应不同发票版式的痛点。\n\n通过直观的图形用户界面（UI），用户可以直接查看 PDF 或图片格式的发票，一键提取并保存结构化数据。其核心亮点在于高度的灵活性：不仅支持自定义添加或移除提取字段，还配备了专门的训练界面，允许用户利用自有数据集定制专属模型，从而适应特定业务场景的需求。\n\n目前，InvoiceNet 主要面向开发者和研究人员。虽然官方暂未提供通用的预训练模型，但它开放了完整的数据准备脚本和训练流程，非常适合希望构建私有化发票识别系统、或有意贡献数据以推动建立大规模公共发票数据集的技术团队。对于需要在 Ubuntu 或 Windows 
环境下部署 OCR 与深度学习流程的用户而言，InvoiceNet 提供了一个透明、可扩展且易于上手的基础平台。","![InvoiceNet Logo](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FnaiveHobo_InvoiceNet_readme_054015db09de.png)\n\n--------------------------------------------------------------------------------\n\nDeep neural network to extract intelligent information from invoice documents.\n\n**TL;DR**\n\n* An easy to use UI to view PDF\u002FJPG\u002FPNG invoices and extract information.\n* Train custom models using the Trainer UI on your own dataset.\n* Add or remove invoice fields as per your convenience.\n* Save the extracted information into your system with the click of a button.\n\n:star: We appreciate your star, it helps!\n\nThe InvoiceNet logo was designed by [Sidhant Tibrewal](https:\u002F\u002Fwww.linkedin.com\u002Fin\u002Fsidhant-tibrewal-864058148\u002F).\n[Check out](https:\u002F\u002Fwww.behance.net\u002Ftiber_sid) his work for some more beautiful designs.\n\n---\n\n![InvoiceNet](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FnaiveHobo_InvoiceNet_readme_4a1513a20ebf.png)\n\n---\n\n**DISCLAIMER**: \n\nPre-trained models for some general invoice fields are not available right now but will soon be provided.\nThe training GUI and data preparation scripts have been made available.\n\nInvoice documents contain sensitive information because of which collecting a sizable dataset has proven to be difficult.\nThis makes it difficult for developers like us to train large-scale generalised models and make them available to the community.\n\nIf you have a dataset of invoice documents that you are comfortable sharing with us, please reach out (\u003Csarthakmittal2608@gmail.com>).\nWe have the tools to create the first publicly-available large-scale invoice dataset along with a software platform for structured information extraction.\n\n---\n\n## Installation\n\n#### Ubuntu 20.04\n\nInvoiceNet has been developed and tested on **Ubuntu 20.04** with **CUDA Version: 11.8**, **cuDNN version: 8.9.7**, and 
**TensorFlow v2.13.1**.\n\nTo install InvoiceNet on Ubuntu, run the following commands:\n\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002FnaiveHobo\u002FInvoiceNet.git\ncd InvoiceNet\u002F\n\n# Run installation script\n.\u002Finstall.sh\n```\n\nThe install.sh script will install all the dependencies, create a virtual environment, and install InvoiceNet in the virtual environment.\n\nTo be able to use InvoiceNet, you need to source the virtual environment that the package was installed in.\n\n```bash\n# Source virtual environment\nsource env\u002Fbin\u002Factivate\n```\n\n#### Windows 10\n\nThe recommended way is to install InvoiceNet along with its dependencies in an Anaconda environment:\n\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002FnaiveHobo\u002FInvoiceNet.git\ncd InvoiceNet\u002F\n\n# Create conda environment and activate\nconda create --name invoicenet python=3.7\nconda activate invoicenet\n\n# Install InvoiceNet\npip install .\n\n# Install poppler\nconda install -c conda-forge poppler\n```\n\nSome dependencies also need to be installed separately on Windows 10 before running InvoiceNet:\n\n- [Tesseract 5.0.0](https:\u002F\u002Fgithub.com\u002FUB-Mannheim\u002Ftesseract\u002Fwiki)\n- [ImageMagick 7.0.10](https:\u002F\u002Fimagemagick.org\u002Fscript\u002Fdownload.php#windows)\n- [Ghostscript 9.52](https:\u002F\u002Fwww.ghostscript.com\u002Fdownload\u002Fgsdnld.html)\n\n\n\n## Data Preparation\nThe training data must be arranged in a single directory. The invoice documents are expected to be PDF files and each invoice is expected to have a corresponding JSON label file with the same name. 
Your training data should be in the following format:\n\n```\ntrain_data\u002F\n    invoice1.pdf\n    invoice1.json\n    nike-invoice.pdf\n    nike-invoice.json\n    12345.pdf\n    12345.json\n    ...\n```\n\nThe JSON labels should have the following format:\n```\n{\n \"vendor_name\":\"Nike\",\n \"invoice_date\":\"12-01-2017\",\n \"invoice_number\":\"R0007546449\",\n \"total_amount\":\"137.51\",\n ... other fields\n}\n```\n\nTo begin the data preparation process, click on the \"Prepare Data\" button in the GUI or follow the instructions below if you're using the CLI.\n\n\n## Add Your Own Fields\nTo add your own fields to InvoiceNet, open **invoicenet\u002F\\_\\_init\\_\\_.py**.\n\nThere are 4 pre-defined field types:\n- **FIELD_TYPES[\"general\"]** : General fields such as names, addresses, invoice numbers, etc.\n- **FIELD_TYPES[\"optional\"]** : Optional fields that might not be present in all invoices.\n- **FIELD_TYPES[\"amount\"]** : Fields that represent an amount.\n- **FIELD_TYPES[\"date\"]** : Fields that represent a date.\n\nChoose the appropriate field type for the field and add the line mentioned below.\n\n```python\n# Add the following line at the end of the file\n\n# For example, to add a field total_amount\nFIELDS[\"total_amount\"] = FIELD_TYPES[\"amount\"]\n\n# For example, to add a field invoice_date\nFIELDS[\"invoice_date\"] = FIELD_TYPES[\"date\"]\n\n# For example, to add a field tax_id (which might be optional)\nFIELDS[\"tax_id\"] = FIELD_TYPES[\"optional\"]\n\n# For example, to add a field vendor_name\nFIELDS[\"vendor_name\"] = FIELD_TYPES[\"general\"]\n```\n\n\n## Using the GUI\nInvoiceNet provides you with a GUI to train a model on your data and extract information from invoice documents using this trained model.\n\n![Trainer](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FnaiveHobo_InvoiceNet_readme_dc8902a77a9c.png)\n\n\nRun the following command to run the trainer GUI:\n\n```bash\npython trainer.py\n```\n\nRun the following command to run the 
extractor GUI:\n\n```bash\npython extractor.py\n```\n\nYou need to prepare the data for training first. \nYou can do so by setting the **Data Folder** field to the directory containing your training data and then clicking the **Prepare Data** button.\nOnce the data is prepared, you can start training by clicking the **Start** button.\n\n\n## Using the CLI\n\n### Training \n\nPrepare the data for training first by running the following command:\n```bash\npython prepare_data.py --data_dir train_data\u002F\n```\n\nTrain InvoiceNet using the following command:\n```bash\npython train.py --field enter-field-here --batch_size 8\n\n# For example, for field 'total_amount'\npython train.py --field total_amount --batch_size 8\n```\n\n---\n\n### Prediction\nTo use a different OCR engine, change ocr_engine in the [create_ngrams](https:\u002F\u002Fgithub.com\u002FnaiveHobo\u002FInvoiceNet\u002Fblob\u002Fe883158a690726afd1de5b76b5810287013577c6\u002Finvoicenet\u002Fcommon\u002Futil.py#L193) function before running predict.py.\n\n---\n\n#### Single invoice\nTo extract a field from a single invoice file, run the following command:\n```bash\npython predict.py --field enter-field-here --invoice path-to-invoice-file\n\n# For example, to extract field total_amount from an invoice file invoices\u002F1.pdf\npython predict.py --field total_amount --invoice invoices\u002F1.pdf\n```\n\n---\n\n#### Multiple invoices\nFor extracting information using the trained InvoiceNet model, you just need to place the PDF invoice documents in one directory in the following format:\n\n```\npredict_data\u002F\n    invoice1.pdf\n    invoice2.pdf\n    ...\n```\n\nRun InvoiceNet using the following command:\n```bash\npython predict.py --field enter-field-here --data_dir predict_data\u002F\n\n# For example, for field 'total_amount'\npython predict.py --field total_amount --data_dir predict_data\u002F\n```\n---\n\n## Reference\nThis implementation is largely based on the work of R. 
Palm et al, who should be cited if this is used in a scientific publication (or the preceding conference papers):\n\n[1] Palm, Rasmus Berg, Florian Laws, and Ole Winther. **\"Attend, Copy, Parse End-to-end information extraction from documents.\"** 2019 International Conference on Document Analysis and Recognition (ICDAR). IEEE, 2019.\n\n```bibtex\n@inproceedings{palm2019attend,\n  title={Attend, Copy, Parse End-to-end information extraction from documents},\n  author={Palm, Rasmus Berg and Laws, Florian and Winther, Ole},\n  booktitle={2019 International Conference on Document Analysis and Recognition (ICDAR)},\n  pages={329--336},\n  year={2019},\n  organization={IEEE}\n}\n```\n\n### Note\nAn implementation of an inferior (also slightly broken) invoice handling system based on the paper **\"Cloudscan - A configuration-free invoice analysis system using recurrent neural networks.\"** is available [here](https:\u002F\u002Fgithub.com\u002FnaiveHobo\u002FInvoiceNet\u002Ftree\u002Fcloudscan).\n\n[2] Palm, Rasmus Berg, Ole Winther, and Florian Laws. **\"Cloudscan - A configuration-free invoice analysis system using recurrent neural networks.\"** 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR). Vol. 1. 
IEEE, 2017.\n\n```bibtex\n@inproceedings{palm2017cloudscan,\n  title={Cloudscan-a configuration-free invoice analysis system using recurrent neural networks},\n  author={Palm, Rasmus Berg and Winther, Ole and Laws, Florian},\n  booktitle={2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR)},\n  volume={1},\n  pages={406--413},\n  year={2017},\n  organization={IEEE}\n}\n```\n","![InvoiceNet Logo](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FnaiveHobo_InvoiceNet_readme_054015db09de.png)\n\n--------------------------------------------------------------------------------\n\n基于深度神经网络的发票文档智能信息提取工具。\n\n**简而言之**\n\n* 一个易于使用的界面，用于查看PDF\u002FJPG\u002FPNG格式的发票并提取信息。\n* 使用训练界面在您自己的数据集上训练自定义模型。\n* 根据需要添加或删除发票字段。\n* 一键将提取的信息保存到您的系统中。\n\n:star: 您的点赞对我们非常重要！\n\nInvoiceNet的标志由[Sidhant Tibrewal](https:\u002F\u002Fwww.linkedin.com\u002Fin\u002Fsidhant-tibrewal-864058148\u002F)设计。[查看](https:\u002F\u002Fwww.behance.net\u002Ftiber_sid)他的作品，欣赏更多精美设计。\n\n---\n\n![InvoiceNet](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FnaiveHobo_InvoiceNet_readme_4a1513a20ebf.png)\n\n---\n\n**免责声明**：\n\n目前尚未提供针对一些通用发票字段的预训练模型，但很快就会推出。训练GUI和数据准备脚本已经开放使用。\n\n由于发票文件包含敏感信息，收集足够规模的数据集一直颇具挑战性。这使得像我们这样的开发者难以训练大规模的通用模型并向社区开放。\n\n如果您拥有愿意与我们共享的发票数据集，请联系我们（sarthakmittal2608@gmail.com）。我们具备创建首个公开可用的大规模发票数据集以及结构化信息提取软件平台的工具。\n\n---\n\n## 安装\n\n#### Ubuntu 20.04\n\nInvoiceNet已在**Ubuntu 20.04**上开发并测试完成，使用的CUDA版本为11.8，cuDNN版本为8.9.7，TensorFlow版本为2.13.1。\n\n要在Ubuntu上安装InvoiceNet，请执行以下命令：\n\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002FnaiveHobo\u002FInvoiceNet.git\ncd InvoiceNet\u002F\n\n# 运行安装脚本\n.\u002Finstall.sh\n```\n\ninstall.sh脚本将安装所有依赖项，创建虚拟环境，并在该虚拟环境中安装InvoiceNet。\n\n要使用InvoiceNet，您需要激活安装包所在的虚拟环境。\n\n```bash\n# 激活虚拟环境\nsource env\u002Fbin\u002Factivate\n```\n\n#### Windows 10\n\n推荐的方式是在Anaconda环境中安装InvoiceNet及其依赖项：\n\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002FnaiveHobo\u002FInvoiceNet.git\ncd InvoiceNet\u002F\n\n# 创建并激活conda环境\nconda create 
--name invoicenet python=3.7\nconda activate invoicenet\n\n# 安装InvoiceNet\npip install .\n\n# 安装poppler\nconda install -c conda-forge poppler\n```\n\n此外，在Windows 10上运行InvoiceNet之前，还需要单独安装一些依赖项：\n\n- [Tesseract 5.0.0](https:\u002F\u002Fgithub.com\u002FUB-Mannheim\u002Ftesseract\u002Fwiki)\n- [ImageMagick 7.0.10](https:\u002F\u002Fimagemagick.org\u002Fscript\u002Fdownload.php#windows)\n- [Ghostscript 9.52](https:\u002F\u002Fwww.ghostscript.com\u002Fdownload\u002Fgsdnld.html)\n\n\n\n## 数据准备\n训练数据必须组织在一个目录中。发票文件应为PDF格式，且每张发票都应有一个同名的JSON标签文件。您的训练数据应如下所示：\n\n```\ntrain_data\u002F\n    invoice1.pdf\n    invoice1.json\n    nike-invoice.pdf\n    nike-invoice.json\n    12345.pdf\n    12345.json\n    ...\n```\n\nJSON标签文件应具有以下格式：\n```\n{\n \"vendor_name\":\"Nike\",\n \"invoice_date\":\"12-01-2017\",\n \"invoice_number\":\"R0007546449\",\n \"total_amount\":\"137.51\",\n ... 其他字段\n}\n```\n\n要开始数据准备过程，可在GUI中点击“准备数据”按钮，或者如果您使用命令行界面，请按照以下说明操作。\n\n\n## 添加自定义字段\n要向InvoiceNet添加自定义字段，请打开**invoicenet\u002F__init__.py**。\n\n预定义了4种字段类型：\n- **FIELD_TYPES[\"general\"]**：一般字段，如名称、地址、发票编号等。\n- **FIELD_TYPES[\"optional\"]**：可选字段，可能并非所有发票都包含。\n- **FIELD_TYPES[\"amount\"]**：表示金额的字段。\n- **FIELD_TYPES[\"date\"]**：表示日期的字段。\n\n为字段选择合适的类型，并添加下方所示的代码行。\n\n```python\n# 在文件末尾添加以下行\n\n# 例如，添加total_amount字段\nFIELDS[\"total_amount\"] = FIELD_TYPES[\"amount\"]\n\n# 例如，添加invoice_date字段\nFIELDS[\"invoice_date\"] = FIELD_TYPES[\"date\"]\n\n# 例如，添加tax_id字段（可能是可选的）\nFIELDS[\"tax_id\"] = FIELD_TYPES[\"optional\"]\n\n# 例如，添加vendor_name字段\nFIELDS[\"vendor_name\"] = FIELD_TYPES[\"general\"]\n```\n\n\n## 使用GUI\nInvoiceNet提供了一个图形用户界面，允许您基于自己的数据训练模型，并使用该模型从发票文档中提取信息。\n\n![Trainer](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FnaiveHobo_InvoiceNet_readme_dc8902a77a9c.png)\n\n\n运行以下命令以启动训练GUI：\n\n```bash\npython trainer.py\n```\n\n运行以下命令以启动提取GUI：\n\n```bash\npython extractor.py\n```\n\n首先需要准备好用于训练的数据。您可以通过将**数据文件夹**字段设置为您训练数据所在的目录，然后点击**准备数据**按钮来完成。数据准备好后，即可点击**开始**按钮进行训练。\n\n\n## 使用命令行\n\n### 
训练\n\n首先通过运行以下命令准备训练数据：\n```bash\npython prepare_data.py --data_dir train_data\u002F\n```\n\n然后使用以下命令训练InvoiceNet：\n```bash\npython train.py --field enter-field-here --batch_size 8\n\n# 例如，针对total_amount字段\npython train.py --field total_amount --batch_size 8\n```\n\n---\n\n### 预测\n如果您尝试使用不同的OCR引擎，请在运行predict.py之前更改此函数中的ocr_engine [create_ngrams.py](https:\u002F\u002Fgithub.com\u002FnaiveHobo\u002FInvoiceNet\u002Fblob\u002Fe883158a690726afd1de5b76b5810287013577c6\u002Finvoicenet\u002Fcommon\u002Futil.py#L193)。\n\n---\n\n#### 单张发票\n要从单张发票文件中提取某个字段，运行以下命令：\n```bash\npython predict.py --field enter-field-here --invoice path-to-invoice-file\n\n# 例如，从invoices\u002F1.pdf中提取total_amount字段\npython predict.py --field total_amount --invoice invoices\u002F1.pdf\n```\n\n---\n\n#### 多张发票\n要使用已训练好的InvoiceNet模型提取信息，只需将PDF格式的发票文件按以下格式放置在一个目录中：\n\n```\npredict_data\u002F\n    invoice1.pdf\n    invoice2.pdf\n    ...\n```\n\n然后使用以下命令运行InvoiceNet：\n```bash\npython predict.py --field enter-field-here --data_dir predict_data\u002F\n\n# 例如，针对total_amount字段\npython predict.py --field total_amount --data_dir predict_data\u002F\n```\n---\n\n## 参考文献\n本实现主要基于 R. Palm 等人的工作，若在科学出版物中使用，请引用该工作（或其之前的会议论文）：\n\n[1] Palm, Rasmus Berg, Florian Laws, and Ole Winther. **“Attend, Copy, Parse：端到端的文档信息抽取”**。2019 年国际文档分析与识别会议（ICDAR）。IEEE，2019 年。\n\n```bibtex\n@inproceedings{palm2019attend,\n  title={Attend, Copy, Parse End-to-end information extraction from documents},\n  author={Palm, Rasmus Berg and Laws, Florian and Winther, Ole},\n  booktitle={2019 International Conference on Document Analysis and Recognition (ICDAR)},\n  pages={329--336},\n  year={2019},\n  organization={IEEE}\n}\n```\n\n### 注\n基于论文 **“Cloudscan——一种使用循环神经网络的无配置发票分析系统”** 的较差（且略有缺陷）的发票处理系统实现可在此处获取：[这里](https:\u002F\u002Fgithub.com\u002FnaiveHobo\u002FInvoiceNet\u002Ftree\u002Fcloudscan)。\n\n[2] Palm, Rasmus Berg, Ole Winther, and Florian Laws. 
**“Cloudscan——一种使用循环神经网络的无配置发票分析系统”**。2017 年第 14 届 IAPR 国际文档分析与识别会议（ICDAR）。第 1 卷。IEEE，2017 年。\n\n```bibtex\n@inproceedings{palm2017cloudscan,\n  title={Cloudscan-a configuration-free invoice analysis system using recurrent neural networks},\n  author={Palm, Rasmus Berg and Winther, Ole and Laws, Florian},\n  booktitle={2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR)},\n  volume={1},\n  pages={406--413},\n  year={2017},\n  organization={IEEE}\n}\n```","# InvoiceNet 快速上手指南\n\nInvoiceNet 是一个基于深度神经网络的开源工具，旨在从发票文档（PDF\u002FJPG\u002FPNG）中智能提取关键信息。它提供图形界面（GUI）和命令行（CLI）两种操作方式，支持自定义字段训练。\n\n## 1. 环境准备\n\n### 系统要求\n- **推荐系统**：Ubuntu 20.04\n- **Windows 支持**：Windows 10（需额外安装依赖）\n- **核心依赖版本**（Ubuntu 测试环境）：\n  - CUDA: 11.8\n  - cuDNN: 8.9.7\n  - TensorFlow: v2.13.1\n  - Python: 3.7 (Windows Conda 环境推荐)\n\n### 前置依赖 (Windows 用户必读)\n若在 Windows 10 上运行，除 Anaconda 外，还需手动安装以下软件并配置环境变量：\n- [Tesseract 5.0.0](https:\u002F\u002Fgithub.com\u002FUB-Mannheim\u002Ftesseract\u002Fwiki) (OCR 引擎)\n- [ImageMagick 7.0.10](https:\u002F\u002Fimagemagick.org\u002Fscript\u002Fdownload.php#windows)\n- [Ghostscript 9.52](https:\u002F\u002Fwww.ghostscript.com\u002Fdownload\u002Fgsdnld.html)\n\n> **注意**：目前官方暂未提供通用的预训练模型，用户需使用自己的数据集进行训练。\n\n## 2. 安装步骤\n\n### 方案 A：Ubuntu 20.04 (推荐)\n使用官方提供的脚本自动安装依赖并创建虚拟环境。\n\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002FnaiveHobo\u002FInvoiceNet.git\ncd InvoiceNet\u002F\n\n# 运行安装脚本（自动安装依赖、创建虚拟环境）\n.\u002Finstall.sh\n\n# 激活虚拟环境\nsource env\u002Fbin\u002Factivate\n```\n\n### 方案 B：Windows 10\n推荐使用 Anaconda 环境进行隔离安装。\n\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002FnaiveHobo\u002FInvoiceNet.git\ncd InvoiceNet\u002F\n\n# 创建并激活 conda 环境\nconda create --name invoicenet python=3.7\nconda activate invoicenet\n\n# 安装 InvoiceNet\npip install .\n\n# 安装 poppler (用于 PDF 处理)\nconda install -c conda-forge poppler\n```\n*请确保已按“环境准备”章节手动安装了 Tesseract、ImageMagick 和 Ghostscript。*\n\n## 3. 
基本使用\n\n### 第一步：准备数据\n训练数据需放在同一目录下，包含 PDF 发票文件及其对应的 JSON 标签文件（文件名需一致）。\n\n**目录结构示例：**\n```text\ntrain_data\u002F\n    invoice1.pdf\n    invoice1.json\n    nike-invoice.pdf\n    nike-invoice.json\n```\n\n**JSON 标签格式示例：**\n```json\n{\n \"vendor_name\":\"Nike\",\n \"invoice_date\":\"12-01-2017\",\n \"total_amount\":\"137.51\"\n}\n```\n\n**自定义字段（可选）：**\n若需提取新字段，编辑 `invoicenet\u002F__init__.py`，在文件末尾添加字段定义：\n```python\n# 示例：添加总金额字段\nFIELDS[\"total_amount\"] = FIELD_TYPES[\"amount\"]\n# 示例：添加可选的税号字段\nFIELDS[\"tax_id\"] = FIELD_TYPES[\"optional\"]\n```\n\n### 第二步：训练模型 (GUI 方式)\n启动训练器图形界面，可视化完成数据预处理和模型训练。\n\n```bash\npython trainer.py\n```\n**操作流程：**\n1. 在 GUI 中设置 **Data Folder** 为你的数据目录（如 `train_data\u002F`）。\n2. 点击 **Prepare Data** 按钮预处理数据。\n3. 点击 **Start** 按钮开始训练。\n\n### 第三步：提取信息 (预测)\n训练完成后，可使用 CLI 对单张或多张发票进行信息提取。\n\n**提取单个发票字段：**\n```bash\n# 语法：python predict.py --field \u003C字段名> --invoice \u003C发票路径>\npython predict.py --field total_amount --invoice invoices\u002F1.pdf\n```\n\n**批量提取目录下的发票：**\n将待处理的 PDF 放入同一目录（如 `predict_data\u002F`），然后运行：\n```bash\n# 语法：python predict.py --field \u003C字段名> --data_dir \u003C目录路径>\npython predict.py --field total_amount --data_dir predict_data\u002F\n```\n\n> **提示**：如需更换 OCR 引擎，请在运行预测前修改 `invoicenet\u002Fcommon\u002Futil.py` 中的 `create_ngrams` 函数内的 `ocr_engine` 配置。","某中型电商企业的财务团队每天需处理来自全球供应商的数百份 PDF 和图片格式发票，急需将非结构化文档转化为可录入 ERP 系统的结构化数据。\n\n### 没有 InvoiceNet 时\n- 财务人员必须手动逐张打开发票，肉眼识别并键盘录入供应商名称、日期、金额等关键字段，耗时且枯燥。\n- 面对不同国家、不同版式的发票，人工判断字段位置极易出错，导致入账金额偏差或重复支付风险。\n- 新增一种特殊格式的发票时，需要重新培训员工或编写复杂的正则规则脚本，响应业务变化的周期长达数周。\n- 提取后的数据分散在 Excel 表格中，无法直接通过一键操作同步至公司内部数据库，二次整理成本高昂。\n\n### 使用 InvoiceNet 后\n- 利用其内置的 UI 界面批量上传 PDF 或 JPG 发票，深度神经网络自动在秒级内完成关键信息的智能提取与展示。\n- 通过 Trainer UI 使用企业自有数据集训练定制模型，轻松适应各种非标发票版式，显著降低人为识别错误率。\n- 只需修改配置文件即可灵活增加或删除提取字段（如税号、订单号），无需重写代码即可快速响应新的财务合规需求。\n- 确认提取结果无误后，点击按钮即可将结构化 JSON 数据直接保存并集成到现有系统中，实现从文档到数据库的自动化闭环。\n\nInvoiceNet 
将原本需要数小时的人工核对工作压缩至分钟级，让财务团队从繁琐的数据搬运中解放出来，专注于高价值的财务分析工作。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FnaiveHobo_InvoiceNet_054015db.png","naiveHobo","Sarthak Mittal","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002FnaiveHobo_b266bb3a.jpg","\"I love deadlines. I love the whooshing noise they make as they go by.\"","@Polybee-SG ",null,"sarthakmittal2608@gmail.com","linkedin.com\u002Fin\u002Fsarthak-mittal","https:\u002F\u002Fgithub.com\u002FnaiveHobo",[82,86],{"name":83,"color":84,"percentage":85},"Python","#3572A5",99.9,{"name":87,"color":88,"percentage":89},"Shell","#89e051",0.1,2683,415,"2026-03-25T18:28:03","MIT",4,"Linux (Ubuntu 20.04), Windows 10","需要 NVIDIA GPU，CUDA 版本需为 11.8，配合 cuDNN 8.9.7","未说明",{"notes":99,"python":100,"dependencies":101},"Ubuntu 用户需运行 install.sh 脚本自动配置虚拟环境；Windows 用户建议使用 Anaconda 环境，并需手动安装 Tesseract、ImageMagick 和 Ghostscript 等外部工具。目前暂无通用的预训练模型，需用户使用自己的数据集进行训练。数据准备要求 PDF 发票文件与同名的 JSON 标签文件配对存放。","3.7 (Windows 推荐), 未明确指定 Ubuntu 版本但通过脚本安装",[102,103,104,105,106,107],"Tensorflow v2.13.1","cuDNN 8.9.7","Tesseract 5.0.0","ImageMagick 7.0.10","Ghostscript 9.52","poppler",[109,14,110],"其他","音频",[112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,127,128],"invoice","invoice-management","invoices","invoice-insight","classification","deep-learning","deep-neural-networks","deeplearning","keras","keras-tensorflow","keras-neural-networks","invoice-pdf","invoice-parser","invoice-software","information-retrieval","information-extraction","billing","2026-03-27T02:49:30.150509","2026-04-07T16:51:23.998616",[132,137,142,147,152,157],{"id":133,"question_zh":134,"answer_zh":135,"source_url":136},22425,"如何准备输入数据？支持什么格式？","项目需要 HOCR 格式的输入。维护者已提交包含数据集文件和支持文件的更新（commit: a931196），用户可参考该提交获取示例数据。标签映射关系如下：0: \"Other\", 1: \"Invoice Date\", 2: \"Invoice Number\", 3: \"Buyer GST\", 4: \"Seller GST\", 5: \"Total 
Amount\"。","https:\u002F\u002Fgithub.com\u002FnaiveHobo\u002FInvoiceNet\u002Fissues\u002F1",{"id":138,"question_zh":139,"answer_zh":140,"source_url":141},22426,"训练时出现 'OutOfRangeError: End of sequence' 错误怎么办？","该错误通常是因为训练数据量不足导致的。有用户反馈将训练数据增加到至少 10 组（10 个 PDF + 10 个对应的 JSON 文件）后，训练即可正常运行。即使设置了 Batch Size 为 1，如果只有 1 组数据也可能报错，建议增加数据量。","https:\u002F\u002Fgithub.com\u002FnaiveHobo\u002FInvoiceNet\u002Fissues\u002F32",{"id":143,"question_zh":144,"answer_zh":145,"source_url":146},22427,"在哪里可以获取数据集和 pickle 文件？","维护者已在 commit (a931196) 中添加了数据集文件和一些支持文件。用户可以拉取该版本的代码或查看该提交记录来获取所需的样本数据和 pickle 文件以运行脚本。","https:\u002F\u002Fgithub.com\u002FnaiveHobo\u002FInvoiceNet\u002Fissues\u002F2",{"id":148,"question_zh":149,"answer_zh":150,"source_url":151},22428,"在 Extractor GUI 中添加了自定义标签但无法选中进行标注怎么办？","字段复选框会自动激活的前提是 InvoiceNet 能够找到该字段的已训练模型。如果尚未训练出对应字段的模型，这些选项可能处于不可选状态。需要先训练模型或确保模型能识别该字段。","https:\u002F\u002Fgithub.com\u002FnaiveHobo\u002FInvoiceNet\u002Fissues\u002F82",{"id":153,"question_zh":154,"answer_zh":155,"source_url":156},22429,"遇到 'ValueError: setting an array element with a sequence' 错误如何解决？","这是一个可选解析器（optional parser）中的错误，特别是当某些训练数据中标签字段缺失时。维护者已在修复版本（commit: fe6ad79d53b5cae08dd6383c912543f5b5fb9950）中解决了此问题。对于可选字段（即某些文档中可能不存在的字段），不需要将其设为非可选并填 null，也不需要从训练数据中完全移除，更新代码即可自动处理为 null。","https:\u002F\u002Fgithub.com\u002FnaiveHobo\u002FInvoiceNet\u002Fissues\u002F16",{"id":158,"question_zh":159,"answer_zh":160,"source_url":161},22430,"项目使用什么模型？是否使用 PDF 中的原始文本？","项目不使用 PDF 中的嵌入文本。维护者测试了约 10000 份文档后，决定在所有情况下都使用 OCR 技术。这样做的好处是建立了统一的管道，可同时处理含嵌入文本的 PDF、扫描文档和图像。","https:\u002F\u002Fgithub.com\u002FnaiveHobo\u002FInvoiceNet\u002Fissues\u002F13",[]]