[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-OpenDCAI--Paper2Any":3,"tool-OpenDCAI--Paper2Any":64},[4,17,27,35,43,56],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":16},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,3,"2026-04-05T11:01:52",[13,14,15],"开发框架","图像","Agent","ready",{"id":18,"name":19,"github_repo":20,"description_zh":21,"stars":22,"difficulty_score":23,"last_commit_at":24,"category_tags":25,"status":16},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",140436,2,"2026-04-05T23:32:43",[13,15,26],"语言模型",{"id":28,"name":29,"github_repo":30,"description_zh":31,"stars":32,"difficulty_score":23,"last_commit_at":33,"category_tags":34,"status":16},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 
绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",107662,"2026-04-03T11:11:01",[13,14,15],{"id":36,"name":37,"github_repo":38,"description_zh":39,"stars":40,"difficulty_score":23,"last_commit_at":41,"category_tags":42,"status":16},3704,"NextChat","ChatGPTNextWeb\u002FNextChat","NextChat 是一款轻量且极速的 AI 助手，旨在为用户提供流畅、跨平台的大模型交互体验。它完美解决了用户在多设备间切换时难以保持对话连续性，以及面对众多 AI 模型不知如何统一管理的痛点。无论是日常办公、学习辅助还是创意激发，NextChat 都能让用户随时随地通过网页、iOS、Android、Windows、MacOS 或 Linux 端无缝接入智能服务。\n\n这款工具非常适合普通用户、学生、职场人士以及需要私有化部署的企业团队使用。对于开发者而言，它也提供了便捷的自托管方案，支持一键部署到 Vercel 或 Zeabur 等平台。\n\nNextChat 的核心亮点在于其广泛的模型兼容性，原生支持 Claude、DeepSeek、GPT-4 及 Gemini Pro 等主流大模型，让用户在一个界面即可自由切换不同 AI 能力。此外，它还率先支持 MCP（Model Context Protocol）协议，增强了上下文处理能力。针对企业用户，NextChat 提供专业版解决方案，具备品牌定制、细粒度权限控制、内部知识库整合及安全审计等功能，满足公司对数据隐私和个性化管理的高标准要求。",87618,"2026-04-05T07:20:52",[13,26],{"id":44,"name":45,"github_repo":46,"description_zh":47,"stars":48,"difficulty_score":23,"last_commit_at":49,"category_tags":50,"status":16},2268,"ML-For-Beginners","microsoft\u002FML-For-Beginners","ML-For-Beginners 是由微软推出的一套系统化机器学习入门课程，旨在帮助零基础用户轻松掌握经典机器学习知识。这套课程将学习路径规划为 12 周，包含 26 节精炼课程和 52 道配套测验，内容涵盖从基础概念到实际应用的完整流程，有效解决了初学者面对庞大知识体系时无从下手、缺乏结构化指导的痛点。\n\n无论是希望转型的开发者、需要补充算法背景的研究人员，还是对人工智能充满好奇的普通爱好者，都能从中受益。课程不仅提供了清晰的理论讲解，还强调动手实践，让用户在循序渐进中建立扎实的技能基础。其独特的亮点在于强大的多语言支持，通过自动化机制提供了包括简体中文在内的 50 多种语言版本，极大地降低了全球不同背景用户的学习门槛。此外，项目采用开源协作模式，社区活跃且内容持续更新，确保学习者能获取前沿且准确的技术资讯。如果你正寻找一条清晰、友好且专业的机器学习入门之路，ML-For-Beginners 将是理想的起点。",84991,"2026-04-05T10:45:23",[14,51,52,53,15,54,26,13,55],"数据工具","视频","插件","其他","音频",{"id":57,"name":58,"github_repo":59,"description_zh":60,"stars":61,"difficulty_score":10,"last_commit_at":62,"category_tags":63,"status":16},3128,"ragflow","infiniflow\u002Fragflow","RAGFlow 
是一款领先的开源检索增强生成（RAG）引擎，旨在为大语言模型构建更精准、可靠的上下文层。它巧妙地将前沿的 RAG 技术与智能体（Agent）能力相结合，不仅支持从各类文档中高效提取知识，还能让模型基于这些知识进行逻辑推理和任务执行。\n\n在大模型应用中，幻觉问题和知识滞后是常见痛点。RAGFlow 通过深度解析复杂文档结构（如表格、图表及混合排版），显著提升了信息检索的准确度，从而有效减少模型“胡编乱造”的现象，确保回答既有据可依又具备时效性。其内置的智能体机制更进一步，使系统不仅能回答问题，还能自主规划步骤解决复杂问题。\n\n这款工具特别适合开发者、企业技术团队以及 AI 研究人员使用。无论是希望快速搭建私有知识库问答系统，还是致力于探索大模型在垂直领域落地的创新者，都能从中受益。RAGFlow 提供了可视化的工作流编排界面和灵活的 API 接口，既降低了非算法背景用户的上手门槛，也满足了专业开发者对系统深度定制的需求。作为基于 Apache 2.0 协议开源的项目，它正成为连接通用大模型与行业专有知识之间的重要桥梁。",77062,"2026-04-04T04:44:48",[15,14,13,26,54],{"id":65,"github_repo":66,"name":67,"description_en":68,"description_zh":69,"ai_summary_zh":69,"readme_en":70,"readme_zh":71,"quickstart_zh":72,"use_case_zh":73,"hero_image_url":74,"owner_login":75,"owner_name":75,"owner_avatar_url":76,"owner_bio":77,"owner_company":78,"owner_location":78,"owner_email":79,"owner_twitter":78,"owner_website":78,"owner_url":80,"languages":81,"stars":119,"forks":120,"last_commit_at":121,"license":122,"difficulty_score":123,"env_os":124,"env_gpu":124,"env_ram":124,"env_deps":125,"category_tags":129,"github_topics":130,"view_count":23,"oss_zip_url":78,"oss_zip_packed_at":78,"status":16,"created_at":138,"updated_at":139,"faqs":140,"releases":169},3858,"OpenDCAI\u002FPaper2Any","Paper2Any","Turn paper\u002Ftext\u002Ftopic into editable research figures, technical route diagrams, and presentation slides.","Paper2Any 是一款专注于科研多模态工作流的智能工具，旨在帮助用户将学术论文（PDF）、截图、纯文本或研究主题，一键转化为可编辑的研究图表、技术路线图、实验数据图以及演示文稿（PPT）。\n\n在科研工作中，研究人员往往需要耗费大量时间手动绘制复杂的模型结构图或制作汇报幻灯片。Paper2Any 通过 AI 技术自动化了这一繁琐过程，不仅支持从非结构化文档中提取关键信息生成可视化内容，还确保了输出结果的可编辑性，让用户能轻松进行二次调整。此外，它近期还扩展了视频生成、海报制作及论文回复信起草等功能，全面覆盖科研展示与沟通场景。\n\n这款工具特别适合高校研究人员、研究生、算法工程师以及需要频繁进行学术汇报的教育工作者使用。无论是需要快速梳理论文思路，还是准备会议展示材料，Paper2Any 都能显著提升效率。\n\n其技术亮点在于强大的多文件格式解析能力与灵活的三层模型配置系统，用户可根据需求选择不同的大模型策略。同时，工具集成了 Drawio 支持，能够生成专业的矢量流程图，并具备图像感知修订功能，确保生成的图表既美观又符合学术规范。","\u003Cdiv align=\"center\">\n\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOpenDCAI_Paper2Any_readme_2914b1b5bb3e.png\" 
alt=\"Paper2Any Logo\" width=\"200\"\u002F>\n\n# Paper2Any\n\n[![Python](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FPython-3.11+-3776AB?style=flat-square&logo=python&logoColor=white)](https:\u002F\u002Fwww.python.org\u002F)\n[![License](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FLicense-Apache_2.0-2F80ED?style=flat-square&logo=apache&logoColor=white)](LICENSE)\n[![GitHub Repo](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FGitHub-OpenDCAI%2FPaper2Any-24292F?style=flat-square&logo=github&logoColor=white)](https:\u002F\u002Fgithub.com\u002FOpenDCAI\u002FPaper2Any)\n[![Stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FOpenDCAI\u002FPaper2Any?style=flat-square&logo=github&label=Stars&color=F2C94C)](https:\u002F\u002Fgithub.com\u002FOpenDCAI\u002FPaper2Any\u002Fstargazers)\n\nEnglish | [中文](README_CN.md)\n\n\u003Ca href=\"https:\u002F\u002Ftrendshift.io\u002Frepositories\u002F17634\" target=\"_blank\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOpenDCAI_Paper2Any_readme_d336655397cd.png\" alt=\"OpenDCAI%2FPaper2Any | Trendshift\" style=\"width: 250px; height: 55px;\" width=\"250\" height=\"55\"\u002F>\u003C\u002Fa>\n\n✨ **Focus on paper multimodal workflows: from paper PDFs\u002Fscreenshots\u002Ftext to one-click generation of model diagrams, technical roadmaps, experimental plots, and slide decks** ✨\n\n| 📄 **Universal File Support** &nbsp;|&nbsp; 🎯 **AI-Powered Generation** &nbsp;|&nbsp; 🎨 **Custom Styling** &nbsp;|&nbsp; ⚡ **Lightning Speed** |\n\n\u003Cbr>\n\n\u003Ca href=\"#-quick-start\" target=\"_self\">\n  \u003Cimg alt=\"Quickstart\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F🚀-Quick_Start-2F80ED?style=for-the-badge\" \u002F>\n\u003C\u002Fa>\n\u003Ca href=\"http:\u002F\u002Fdcai-paper2any.nas.cpolar.cn\u002F\" target=\"_blank\">\n  \u003Cimg alt=\"Online Demo\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F🌐-Online_Demo-56CCF2?style=for-the-badge\" \u002F>\n\u003C\u002Fa>\n\u003Ca 
href=\"docs\u002F\" target=\"_blank\">\n  \u003Cimg alt=\"Docs\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F📚-Docs-2D9CDB?style=for-the-badge\" \u002F>\n\u003C\u002Fa>\n\u003Ca href=\"docs\u002Fcontributing.md\" target=\"_blank\">\n  \u003Cimg alt=\"Contributing\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F🤝-Contributing-27AE60?style=for-the-badge\" \u002F>\n\u003C\u002Fa>\n\u003Ca href=\"#wechat-group\" target=\"_self\">\n  \u003Cimg alt=\"WeChat\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F💬-WeChat_Group-07C160?style=for-the-badge\" \u002F>\n\u003C\u002Fa>\n\n\u003Cbr>\n\u003Cbr>\n\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOpenDCAI_Paper2Any_readme_b294d195b8c5.png\" alt=\"Paper2Any Web Interface\" width=\"80%\"\u002F>\n\n\u003C\u002Fdiv>\n\n\n## 📑 Table of Contents\n\n- [🔥 News](#-news)\n- [✨ Core Features](#-core-features)\n- [📸 Showcase](#-showcase)\n- [🧩 Drawio](#-drawio)\n- [🚀 Quick Start](#-quick-start)\n- [📂 Project Structure](#-project-structure)\n- [🗺️ Roadmap](#️-roadmap)\n- [🤝 Contributing](#-contributing)\n\n---\n\n## 🔥 News\n\n> [!TIP]\n> 🆕 \u003Cstrong>2026-03-28 · Editable PPT Showcase Refresh\u003C\u002Fstrong>\u003Cbr>\n> Added two new \u003Cstrong>editable PPT\u003C\u002Fstrong> showcase screenshots for the frontend-deck workflow:\u003Cbr>\n> a generated multi-slide gallery view and the canvas editing workspace with deck theme lock.\n\n> [!TIP]\n> 🆕 \u003Cstrong>2026-03-26 · Workflow Showcase Update\u003C\u002Fstrong>\u003Cbr>\n> Added showcase coverage for \u003Cstrong>Paper2Video\u003C\u002Fstrong>, \u003Cstrong>Paper2Poster\u003C\u002Fstrong>, and \u003Cstrong>Paper2Citation\u003C\u002Fstrong>.\u003Cbr>\n> The README now includes a compressed video demo plus refreshed English\u002FChinese workflow previews.\n\n> [!TIP]\n> 🆕 \u003Cstrong>2026-02-02 · Paper2Rebuttal\u003C\u002Fstrong>\u003Cbr>\n> Added rebuttal drafting support with structured response guidance and image-aware 
revision prompts.\n\n> [!TIP]\n> 🆕 \u003Cstrong>2026-01-28 · Drawio Update\u003C\u002Fstrong>\u003Cbr>\n> Added Drawio support for visual diagram creation and showcase-ready outputs in the workflow.\u003Cbr>\n> KB updates in one line: multi-file PPT generation with doc convert\u002Fmerge, optional image injection, and embedding-assisted retrieval.\n\n> [!TIP]\n> 🆕 \u003Cstrong>2026-01-25 · New Features\u003C\u002Fstrong>\u003Cbr>\n> Added **AI-assisted outline editing**, **three-layer model configuration system** for flexible model selection, and **user points management** with daily quota allocation.\u003Cbr>\n> 🌐 Online Demo: \u003Ca href=\"http:\u002F\u002Fdcai-paper2any.nas.cpolar.cn\u002F\">http:\u002F\u002Fdcai-paper2any.nas.cpolar.cn\u002F\u003C\u002Fa>\n\n> [!TIP]\n> 🆕 \u003Cstrong>2026-01-20 · Bug Fixes\u003C\u002Fstrong>\u003Cbr>\n> Fixed bugs in experimental plot generation (image\u002Ftext) and resolved the missing historical files issue.\u003Cbr>\n> 🌐 Online Demo: \u003Ca href=\"http:\u002F\u002Fdcai-paper2any.nas.cpolar.cn\u002F\">http:\u002F\u002Fdcai-paper2any.nas.cpolar.cn\u002F\u003C\u002Fa>\n\n> [!TIP]\n> 🆕 \u003Cstrong>2026-01-14 · Feature Updates & Backend Architecture Upgrade\u003C\u002Fstrong>\u003Cbr>\n> 1. **Feature Updates**: Added **Image2PPT**, optimized **Paper2Figure** interaction, and improved **PDF2PPT** effects.\u003Cbr>\n> 2. **Standardized API**: Refactored backend interfaces with RESTful `\u002Fapi\u002Fv1\u002F` structure, removing obsolete endpoints for better maintainability.\u003Cbr>\n> 3. 
**Dynamic Configuration**: Supported dynamic model selection (e.g., GPT-4o, Qwen-VL) via API parameters, eliminating hardcoded model dependencies.\u003Cbr>\n> 🌐 Online Demo: \u003Ca href=\"http:\u002F\u002Fdcai-paper2any.nas.cpolar.cn\u002F\">http:\u002F\u002Fdcai-paper2any.nas.cpolar.cn\u002F\u003C\u002Fa>\n\n- 2025-12-12 · Paper2Figure Web public beta is live\n- 2025-10-01 · Released the first version \u003Ccode>0.1.0\u003C\u002Fcode>\n\n---\n\n## ✨ Core Features\n\n> From paper PDFs \u002F images \u002F text to **editable** scientific figures, slide decks, video scripts, academic posters, and other multimodal content in one click.\n\nPaper2Any currently includes the following sub-capabilities:\n\n- **📊 Paper2Figure - Editable Scientific Figures**: Model architecture diagrams, technical roadmaps (PPT + SVG), and experimental plots with editable PPTX output.\n- **🧩 Paper2Diagram \u002F Image2Drawio - Editable Diagrams**: Generate draw.io diagrams from paper\u002Ftext or images, with drawio\u002Fpng\u002Fsvg export and chat-based edits.\n- **🎬 Paper2PPT - Editable Slide Decks**: Paper\u002Ftext\u002Ftopic to PPT, long-doc support, and built-in table\u002Ffigure extraction.\n- **📝 Paper2Rebuttal**: Draft structured rebuttals and revision responses with claims-to-evidence grounding.\n- **🖼️ PDF2PPT - Layout-Preserving Conversion**: Accurate layout retention for PDF → editable PPTX.\n- **🖼️ Image2PPT - Image to Slides**: Convert images or screenshots into structured slides.\n- **🎨 PPTPolish - Smart Beautification**: AI-based layout optimization and style transfer.\n- **🎬 Paper2Video**: Generate video scripts and narration assets.\n- **🖼️ Paper2Poster - Academic Poster**: Turn paper PDFs into poster-ready layouts with configurable sections, logos, and export assets.\n- **🔎 Paper2Citation - Citation Explorer**: Track citing authors, institutions, and notable downstream works from author names or DOI\u002Fpaper URLs.\n- **📝 Paper2Technical**: Produce technical reports 
and method summaries.\n- **📚 Knowledge Base (KB)**: Ingest\u002Fembedding, semantic search, and KB-driven PPT\u002Fpodcast\u002Fmindmap generation.\n\n---\n\n## 📸 Showcase\n\n### 🧩 Drawio\n\n\u003Cdiv align=\"center\">\n\n\u003Ctable>\n  \u003Ctr>\n    \u003Ctd width=\"33%\" align=\"center\" valign=\"top\">\n      \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOpenDCAI_Paper2Any_readme_ac25ef0ee794.png\" width=\"100%\"\u002F>\n      \u003Cbr>\u003Csub>✨ Upload a paper figure or screenshot as the starting point\u003C\u002Fsub>\n    \u003C\u002Ftd>\n    \u003Ctd width=\"34%\" align=\"center\" valign=\"top\">\n      \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOpenDCAI_Paper2Any_readme_1d3af098805d.png\" width=\"100%\"\u002F>\n      \u003Cbr>\u003Csub>✨ Keep the source structure visible before conversion\u003C\u002Fsub>\n    \u003C\u002Ftd>\n    \u003Ctd width=\"33%\" align=\"center\" valign=\"top\">\n      \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOpenDCAI_Paper2Any_readme_e42b63ccc600.gif\" width=\"100%\"\u002F>\n      \u003Cbr>\u003Csub>✨ Convert the image into an editable DrawIO canvas\u003C\u002Fsub>\n    \u003C\u002Ftd>\n  \u003C\u002Ftr>\n\u003C\u002Ftable>\n\n\u003Cbr>\n\n\u003Ctable>\n  \u003Ctr>\n    \u003Ctd width=\"48%\" align=\"center\" valign=\"top\">\n      \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOpenDCAI_Paper2Any_readme_8b59a21aebf0.png\" width=\"100%\"\u002F>\n      \u003Cbr>\u003Csub>✨ Generate a model or system diagram directly inside the DrawIO workbench\u003C\u002Fsub>\n    \u003C\u002Ftd>\n    \u003Ctd width=\"52%\" align=\"center\" valign=\"top\">\n      \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOpenDCAI_Paper2Any_readme_45a2526e477f.gif\" width=\"100%\"\u002F>\n      \u003Cbr>\u003Csub>✨ Refine the generated architecture with chat editing and export-ready layout\u003C\u002Fsub>\n    \u003C\u002Ftd>\n  
\u003C\u002Ftr>\n\u003C\u002Ftable>\n\n\u003C\u002Fdiv>\n\n---\n\n### 📝 Paper2Rebuttal: Rebuttal Drafting\n\n\u003Cdiv align=\"center\">\n\n\u003Cbr>\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOpenDCAI_Paper2Any_readme_cc9cab70a91a.png\" width=\"95%\"\u002F>\n\u003Cbr>\u003Csub>✨ Rebuttal drafting and revision support\u003C\u002Fsub>\n\n\u003C\u002Fdiv>\n\n---\n\n### 📊 Paper2Figure: Scientific Figure Generation\n\n\u003Cdiv align=\"center\">\n\n\u003Cbr>\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOpenDCAI_Paper2Any_readme_076e4d1fbed2.gif\" width=\"90%\"\u002F>\n\u003Cbr>\u003Csub>✨ Model Architecture Diagram Generation\u003C\u002Fsub>\n\n\u003Cbr>\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOpenDCAI_Paper2Any_readme_14afff7223ba.png\" width=\"90%\"\u002F>\n\u003Cbr>\u003Csub>✨ Model Architecture Diagram Generation\u003C\u002Fsub>\n\n\u003Cbr>\u003Cbr>\n\u003Ctable>\n  \u003Ctr>\n    \u003Ctd width=\"56%\" align=\"center\" valign=\"top\">\n      \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOpenDCAI_Paper2Any_readme_29080d5bd025.png\" width=\"100%\"\u002F>\n      \u003Cbr>\u003Csub>✨ Technical roadmap workbench: choose route type, input source, model config, and visual template\u003C\u002Fsub>\n    \u003C\u002Ftd>\n    \u003Ctd width=\"44%\" align=\"center\" valign=\"top\">\n      \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOpenDCAI_Paper2Any_readme_3dc0c587f868.png\" width=\"100%\"\u002F>\n      \u003Cbr>\u003Csub>✨ Generated technical roadmap figure with structured dual-column layout\u003C\u002Fsub>\n    \u003C\u002Ftd>\n  \u003C\u002Ftr>\n\u003C\u002Ftable>\n\n\u003Cbr>\u003Cbr>\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOpenDCAI_Paper2Any_readme_f4bc4088bd34.png\" width=\"90%\"\u002F>\n\u003Cbr>\u003Csub>✨ Experimental Plot Generation (Multiple Styles)\u003C\u002Fsub>\n\n\u003C\u002Fdiv>\n\n---\n\n### 🎬 
Paper2PPT: Paper to Presentation\n\n\u003Cdiv align=\"center\">\n\n\u003Ctable>\n  \u003Ctr>\n    \u003Ctd width=\"50%\" align=\"center\" valign=\"top\">\n      \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOpenDCAI_Paper2Any_readme_a5f3a4fbef11.gif\" width=\"100%\"\u002F>\n      \u003Cbr>\u003Csub>✨ End-to-end PPT generation demo\u003C\u002Fsub>\n    \u003C\u002Ftd>\n    \u003Ctd width=\"50%\" align=\"center\" valign=\"top\">\n      \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOpenDCAI_Paper2Any_readme_3fd4f6369d89.png\" width=\"100%\"\u002F>\n      \u003Cbr>\u003Csub>✨ Paper \u002F text \u002F topic to polished slide deck\u003C\u002Fsub>\n    \u003C\u002Ftd>\n  \u003C\u002Ftr>\n\u003C\u002Ftable>\n\n\u003Cbr>\n\n\u003Ctable>\n  \u003Ctr>\n    \u003Ctd width=\"50%\" align=\"center\" valign=\"top\">\n      \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOpenDCAI_Paper2Any_readme_6db1c04d4eef.gif\" width=\"100%\"\u002F>\n      \u003Cbr>\u003Csub>✨ Edit slide text directly on canvas while keeping the deck theme locked\u003C\u002Fsub>\n    \u003C\u002Ftd>\n    \u003Ctd width=\"50%\" align=\"center\" valign=\"top\">\n      \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOpenDCAI_Paper2Any_readme_4674abcd902c.gif\" width=\"100%\"\u002F>\n      \u003Cbr>\u003Csub>✨ Review the generated multi-page gallery before export\u003C\u002Fsub>\n    \u003C\u002Ftd>\n  \u003C\u002Ftr>\n\u003C\u002Ftable>\n\n\u003Cbr>\n\n\u003Ctable>\n  \u003Ctr>\n    \u003Ctd width=\"50%\" align=\"center\" valign=\"top\">\n      \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOpenDCAI_Paper2Any_readme_c87b5b199e3b.png\" width=\"100%\"\u002F>\n      \u003Cbr>\u003Csub>✨ AI-assisted outline refinement with targeted rewrite prompts\u003C\u002Fsub>\n    \u003C\u002Ftd>\n    \u003Ctd width=\"50%\" align=\"center\" valign=\"top\">\n      \u003Cimg 
src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOpenDCAI_Paper2Any_readme_433628136412.png\" width=\"100%\"\u002F>\n      \u003Cbr>\u003Csub>✨ Structured outline editing down to section and bullet detail\u003C\u002Fsub>\n    \u003C\u002Ftd>\n  \u003C\u002Ftr>\n\u003C\u002Ftable>\n\n\u003Cbr>\n\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOpenDCAI_Paper2Any_readme_f62390a12912.png\" width=\"78%\"\u002F>\n\u003Cbr>\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOpenDCAI_Paper2Any_readme_c7bb0212f9ae.png\" width=\"80%\"\u002F>\n\u003Cbr>\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOpenDCAI_Paper2Any_readme_369cda27c799.png\" width=\"80%\"\u002F>\n\u003Cbr>\u003Csub>✨ Long document support for 40+ slides · Intelligent table extraction and insertion · Version history and iterative deck management\u003C\u002Fsub>\n\n\u003C\u002Fdiv>\n\n---\n\n### 🎬 Paper2Video: PPT to Narrated Video\n\n\u003Cdiv align=\"center\">\n\n\u003Cbr>\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOpenDCAI_Paper2Any_readme_3c1af41b3c34.gif\" width=\"90%\"\u002F>\n\u003Cbr>\u003Csub>✨ PPT \u002F PDF to narrated video with script confirmation, Aliyun TTS voices, and downloadable output\u003C\u002Fsub>\n\n\u003C\u002Fdiv>\n\n---\n\n### 🖼️ Paper2Poster: Paper to Poster\n\n\u003Cdiv align=\"center\">\n\n\u003Cbr>\n\u003Ctable>\n  \u003Ctr>\n    \u003Ctd align=\"center\" width=\"50%\">\n      \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOpenDCAI_Paper2Any_readme_08d36e70b483.png\" width=\"100%\"\u002F>\n      \u003Cbr>\u003Csub>PNG poster result\u003C\u002Fsub>\n    \u003C\u002Ftd>\n    \u003Ctd align=\"center\" width=\"50%\">\n      \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOpenDCAI_Paper2Any_readme_9d5186a63fe2.png\" width=\"100%\"\u002F>\n      \u003Cbr>\u003Csub>PPT poster result\u003C\u002Fsub>\n    \u003C\u002Ftd>\n  
\u003C\u002Ftr>\n\u003C\u002Ftable>\n\u003Cbr>\u003Csub>✨ Paper PDF to academic poster with configurable layout, editable poster output, and one-click export\u003C\u002Fsub>\n\n\u003C\u002Fdiv>\n\n---\n\n### 🔎 Paper2Citation: Citation Explorer\n\n\u003Cdiv align=\"center\">\n\n\u003Cbr>\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOpenDCAI_Paper2Any_readme_5386bf0c7b21.png\" width=\"90%\"\u002F>\n\u003Cbr>\u003Csub>✨ Search authors or papers to inspect citation candidates, institutions, and downstream citation context\u003C\u002Fsub>\n\n\u003C\u002Fdiv>\n\n---\n\n### 🎨 PPT Smart Beautification\n\n\u003Cdiv align=\"center\">\n\n\u003Cbr>\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOpenDCAI_Paper2Any_readme_3539548dfece.gif\" width=\"90%\"\u002F>\n\u003Cbr>\u003Csub>✨ AI-based Layout Optimization\u003C\u002Fsub>\n\n\u003Cbr>\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOpenDCAI_Paper2Any_readme_71d22e550c59.png\" width=\"90%\"\u002F>\n\u003Cbr>\u003Csub>✨ AI-based Layout Optimization & Style Transfer\u003C\u002Fsub>\n\n\u003C\u002Fdiv>\n\n---\n\n### 🖼️ PDF2PPT: Layout-Preserving Conversion\n\n\u003Cdiv align=\"center\">\n\n\u003Cbr>\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOpenDCAI_Paper2Any_readme_5b1c750a53d9.png\" width=\"90%\"\u002F>\n\u003Cbr>\u003Csub>✨ Intelligent Cutout & Layout Preservation\u003C\u002Fsub>\n\n\u003Cbr>\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOpenDCAI_Paper2Any_readme_bc1832666902.png\" width=\"93%\"\u002F>\n\u003Cbr>\u003Csub>✨ Image2PPT\u003C\u002Fsub>\n\n\u003C\u002Fdiv>\n\n---\n\n## 🚀 Quick Start\n\n### 
Requirements\n\n![Python](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FPython-3.11+-3776AB?style=flat-square&logo=python&logoColor=white)\n![pip](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fpip-latest-3776AB?style=flat-square&logo=pypi&logoColor=white)\n\n\u003Cdetails>\n\u003Csummary>\u003Cstrong>🐳 Docker (Recommended) — Deployment & Updates\u003C\u002Fstrong>\u003C\u002Fsummary>\n\n```bash\n# 1. Clone\ngit clone https:\u002F\u002Fgithub.com\u002FOpenDCAI\u002FPaper2Any.git\ncd Paper2Any\n\n# 2. Configure environment variables\ncp fastapi_app\u002F.env.example fastapi_app\u002F.env\ncp frontend-workflow\u002F.env.example frontend-workflow\u002F.env\n```\n\n**Required configuration:**\n\n`fastapi_app\u002F.env` (backend):\n```bash\n# Internal API auth key. Must match frontend VITE_API_KEY.\nBACKEND_API_KEY=your-backend-api-key\n\n# Required: Your LLM API URL (replace with your own)\nDEFAULT_LLM_API_URL=https:\u002F\u002Fapi.openai.com\u002Fv1\u002F\n\n# Optional: DrawIO OCR \u002F VLM service\nPAPER2DRAWIO_OCR_API_URL=https:\u002F\u002Fdashscope.aliyuncs.com\u002Fcompatible-mode\u002Fv1\nPAPER2DRAWIO_OCR_API_KEY=your_dashscope_key\n\n# Optional: MinerU official remote API\nMINERU_API_BASE_URL=https:\u002F\u002Fmineru.net\u002Fapi\u002Fv4\nMINERU_API_KEY=your_mineru_api_key\n\n# Optional: SAM3 segmentation service for PDF2PPT \u002F Image2PPT \u002F Image2Drawio\n# SAM3_SERVER_URLS=http:\u002F\u002FGPU_MACHINE_IP:8001\n# SAM3_SERVER_URLS=http:\u002F\u002FGPU1:8021,http:\u002F\u002FGPU2:8022\n\n# Optional: Supabase (skip for no auth — core features still work)\n# SUPABASE_URL=https:\u002F\u002Fyour-project-id.supabase.co\n# SUPABASE_ANON_KEY=your_supabase_anon_key\n```\n\n`frontend-workflow\u002F.env` (frontend):\n```bash\n# Must match BACKEND_API_KEY in fastapi_app\u002F.env\nVITE_API_KEY=your-backend-api-key\n\n# Required: LLM API URLs available in the UI dropdown (comma 
separated)\nVITE_DEFAULT_LLM_API_URL=https:\u002F\u002Fapi.openai.com\u002Fv1\nVITE_LLM_API_URLS=https:\u002F\u002Fapi.openai.com\u002Fv1\n\n# Optional: DrawIO page model candidates shown in the UI\nVITE_PAPER2DRAWIO_MODEL=claude-sonnet-4-5-20250929,gpt-5.2\n# Optional: Supabase (keep consistent with backend)\n# VITE_SUPABASE_URL=https:\u002F\u002Fyour-project-id.supabase.co\n# VITE_SUPABASE_ANON_KEY=your_supabase_anon_key\n```\n\n```bash\n# 3. Build + run\ndocker compose up -d --build\n```\n\nOpen:\n- Frontend: http:\u002F\u002Flocalhost:3000\n- Backend health: http:\u002F\u002Flocalhost:8000\u002Fhealth\n\n> **GPU services note:** Docker only starts the frontend and backend. No GPU model services are included.\n> - Paper2PPT, Paper2Figure, Knowledge Base, etc. only need LLM APIs and work out of the box.\n> - **PDF2PPT, Image2PPT, Image2Drawio** require the SAM3 segmentation service (needs GPU), deployed separately:\n>   ```bash\n>   # On a machine with GPU\n>   python -m dataflow_agent.toolkits.model_servers.sam3_server \\\n>       --port 8001 --checkpoint models\u002Fsam3\u002Fsam3.pt \\\n>       --bpe models\u002Fsam3\u002Fbpe_simple_vocab_16e6.txt.gz --device cuda\n>   ```\n>   Then add to `fastapi_app\u002F.env`: `SAM3_SERVER_URLS=http:\u002F\u002FGPU_MACHINE_IP:8001`\n>\n> See the \"Advanced Configuration: Local Model Service Load Balancing\" section below for details.\n\nModify & update:\n- After changing code or `.env`, rebuild: `docker compose up -d --build`\n- Pull latest code and rebuild:\n  - `git pull`\n  - `docker compose up -d --build`\n\nCommon commands:\n- View logs: `docker compose logs -f`\n- Stop services: `docker compose down`\n\nNotes:\n- The first build may take a while (system deps + Python deps).\n- Frontend env is baked at build time (compose build args). 
If you change it, rebuild with `docker compose up -d --build`.\n- Outputs\u002Fmodels are mounted to the host (`.\u002Foutputs`, `.\u002Fmodels`) for persistence.\n\n\u003C\u002Fdetails>\n\n### 🐧 Linux Installation\n\n> We recommend using Conda to create an isolated environment (Python 3.11).  \n\n#### 1. Create Environment & Install Base Dependencies\n\n```bash\n# 0. Create and activate a conda environment\nconda create -n paper2any python=3.11 -y\nconda activate paper2any\n\n# 1. Clone repository\ngit clone https:\u002F\u002Fgithub.com\u002FOpenDCAI\u002FPaper2Any.git\ncd Paper2Any\n\n# 2. Install base dependencies\npip install -r requirements-base.txt\n\n# 3. Install in editable (dev) mode\npip install -e .\n```\n\n#### 2. Install Paper2Any-specific Dependencies (Required)\n\nPaper2Any relies on LaTeX rendering, vector graphics processing, and PPT\u002FPDF conversion, all of which require extra dependencies:\n\n```bash\n# 1. Python dependencies\npip install -r requirements-paper.txt || pip install -r requirements-paper-backup.txt\n\n# 2. LaTeX engine (tectonic) - recommended via conda\nconda install -c conda-forge tectonic -y\n\n# 3. Resolve doclayout_yolo dependency conflicts (Important)\npip install doclayout_yolo --no-deps\n\n# 4. System dependencies (Ubuntu example)\nsudo apt-get update\nsudo apt-get install -y inkscape libreoffice poppler-utils wkhtmltopdf\n```\n\n#### 3. Environment Variables\n\n```bash\nexport DF_API_KEY=your_api_key_here\nexport DF_API_URL=xxx  # Optional: if you need a third-party API gateway\nexport MINERU_DEVICES=\"0,1,2,3\" # Optional: MinerU task GPU resource pool\n```\n\n> [!TIP]\n> 📚 **Detailed configuration:** see the [Configuration Guide](docs\u002Fguides\u002Fconfiguration.md) for step-by-step instructions on configuring models, setting environment variables, and starting services.\n\n#### 4. 
Configure Environment Files (Optional)\n\n\u003Cdetails>\n\u003Csummary>\u003Cstrong>📝 Click to expand: Detailed .env Configuration Guide\u003C\u002Fstrong>\u003C\u002Fsummary>\n\nPaper2Any uses two `.env` files for configuration. **Both are optional** - you can run the application without them using default settings.\n\n##### Step 1: Copy Example Files\n\n```bash\n# Copy backend environment file\ncp fastapi_app\u002F.env.example fastapi_app\u002F.env\n\n# Copy frontend environment file\ncp frontend-workflow\u002F.env.example frontend-workflow\u002F.env\n```\n\n##### Step 2: Backend Configuration (`fastapi_app\u002F.env`)\n\n**Supabase (Optional)** - Only needed if you want user authentication and cloud storage:\n```bash\nSUPABASE_URL=https:\u002F\u002Fyour-project-id.supabase.co\nSUPABASE_ANON_KEY=your_supabase_anon_key\n```\n\n**Model Configuration** - Customize which models to use for different workflows:\n```bash\n# Default LLM API URL\nDEFAULT_LLM_API_URL=http:\u002F\u002F123.129.219.111:3000\u002Fv1\u002F\n\n# Workflow-level defaults\nPAPER2PPT_DEFAULT_MODEL=gpt-5.1\nPAPER2PPT_DEFAULT_IMAGE_MODEL=gemini-3-pro-image-preview\nPDF2PPT_DEFAULT_MODEL=gpt-4o\n# ... 
see .env.example for full list\n```\n\n**Service Integration Configuration** - External or local services used by image\u002FPDF workflows:\n```bash\n# DrawIO OCR \u002F VLM\nPAPER2DRAWIO_OCR_API_URL=https:\u002F\u002Fdashscope.aliyuncs.com\u002Fcompatible-mode\u002Fv1\nPAPER2DRAWIO_OCR_API_KEY=your_dashscope_key\n\n# MinerU official remote API; if MINERU_API_KEY is empty, backend falls back to local MINERU_PORT\nMINERU_API_BASE_URL=https:\u002F\u002Fmineru.net\u002Fapi\u002Fv4\nMINERU_API_KEY=your_mineru_api_key\nMINERU_API_MODEL_VERSION=vlm\n\n# SAM3 segmentation service for PDF2PPT \u002F Image2PPT \u002F Image2Drawio\n# One endpoint:\nSAM3_SERVER_URLS=http:\u002F\u002F127.0.0.1:8001\n# Or multiple endpoints for load balancing:\n# SAM3_SERVER_URLS=http:\u002F\u002F127.0.0.1:8021,http:\u002F\u002F127.0.0.1:8022\n```\n\n##### Step 3: Frontend Configuration (`frontend-workflow\u002F.env`)\n\n**LLM Provider Configuration** - Controls the API endpoint dropdown in the UI:\n```bash\n# Default API URL shown in the UI\nVITE_DEFAULT_LLM_API_URL=https:\u002F\u002Fapi.apiyi.com\u002Fv1\n\n# Available API URLs in the dropdown (comma-separated)\nVITE_LLM_API_URLS=https:\u002F\u002Fapi.apiyi.com\u002Fv1,http:\u002F\u002Fb.apiyi.com:16888\u002Fv1,http:\u002F\u002F123.129.219.111:3000\u002Fv1\n```\n\n**What happens when you modify `VITE_LLM_API_URLS`:**\n- The frontend will display a **dropdown menu** with all URLs you specify\n- Users can select different API endpoints without manually typing URLs\n- Useful for switching between OpenAI, local models, or custom API gateways\n\n**Supabase (Optional)** - Uncomment these lines if you want user authentication:\n```bash\nVITE_SUPABASE_URL=https:\u002F\u002Fyour-project.supabase.co\nVITE_SUPABASE_ANON_KEY=your-anon-key\nSUPABASE_SERVICE_ROLE_KEY=your-service-role-key\nSUPABASE_JWT_SECRET=your-jwt-secret\n```\n\n##### Running Without Supabase\n\nIf you skip Supabase configuration:\n- ✅ All core features work normally\n- ✅ CLI scripts 
work without any configuration\n- ❌ No user authentication or quotas\n- ❌ No cloud file storage\n\n\u003C\u002Fdetails>\n\n> [!NOTE]\n> **Quick Start:** You can skip the `.env` configuration entirely and use the CLI scripts directly with the `--api-key` parameter. See the [CLI Scripts](#️-cli-scripts-command-line-interface) section below.\n\n---\n\n\u003Cdetails>\n\u003Csummary>\u003Cstrong>Advanced Configuration: Local Model Service Load Balancing\u003C\u002Fstrong>\u003C\u002Fsummary>\n\nIf you are deploying in a high-concurrency local environment, you can use `script\u002Fstart_model_servers.sh` to start a local model service cluster (MinerU \u002F SAM \u002F OCR).\n\nScript location: `\u002FDataFlow-Agent\u002Fscript\u002Fstart_model_servers.sh`\n\n**Main configuration items:**\n\n- **MinerU (PDF Parsing)**\n  - `MINERU_MODEL_PATH`: Model path (default `models\u002FMinerU2.5-2509-1.2B`)\n  - `MINERU_GPU_UTIL`: GPU memory utilization (default 0.85)\n  - **Instance configuration**: By default, one instance is started on each configured GPU, ports 8011-8013.\n  - **Load Balancer**: Port 8010, automatically dispatches requests.\n\n- **SAM3 (Segment Anything Model 3)**\n  - **Instance configuration**: By default, one instance per configured GPU, ports 8021-8022.\n  - **Model assets**: default paths are `.\u002Fmodels\u002Fsam3\u002Fsam3.pt` and `.\u002Fmodels\u002Fsam3\u002Fbpe_simple_vocab_16e6.txt.gz`.\n  - **Load Balancer**: Port 8020.\n\n- **OCR (PaddleOCR)**\n  - **Config**: Runs on CPU, uses uvicorn's worker mechanism (4 workers by default).\n  - **Port**: 8003.\n\n> Before use, adjust `gpu_id` and the number of instances in the script to match your actual GPU count and memory.\n\nFor a one-command local development test on a single GPU (SAM3 + backend + frontend), run:\n\n```bash\nbash script\u002Fstart_local_sam3_dev.sh\n```\n\n\u003C\u002Fdetails>\n\n---\n\n### 🪟 Windows Installation\n\n> [!NOTE]\n> We currently recommend trying Paper2Any on Linux \u002F 
WSL. If you need to deploy on native Windows, please follow the steps below.\n\n#### 1. Create Environment & Install Base Dependencies\n\n```bash\n# 0. Create and activate a conda environment\nconda create -n paper2any python=3.12 -y\nconda activate paper2any\n\n# 1. Clone repository\ngit clone https:\u002F\u002Fgithub.com\u002FOpenDCAI\u002FPaper2Any.git\ncd Paper2Any\n\n# 2. Install base dependencies\npip install -r requirements-win-base.txt\n\n# 3. Install in editable (dev) mode\npip install -e .\n```\n\n#### 2. Install Paper2Any-specific Dependencies (Recommended)\n\nPaper2Any involves LaTeX rendering and vector graphics processing, which require extra dependencies (see `requirements-paper.txt`):\n\n```bash\n# Python dependencies\npip install -r requirements-paper.txt\n\n# tectonic: LaTeX engine (recommended via conda)\nconda install -c conda-forge tectonic -y\n```\n\n**🎨 Install Inkscape (SVG\u002FVector Graphics Processing | Recommended\u002FRequired)**\n\n1. Download and install (Windows 64-bit MSI): [Inkscape Download](https:\u002F\u002Finkscape.org\u002Frelease\u002Finkscape-1.4.2\u002Fwindows\u002F64-bit\u002Fmsi\u002F?redirected=1)\n2. 
Add the Inkscape executable directory to the system environment variable Path (example): `C:\\Program Files\\Inkscape\\bin\\`\n\n> [!TIP]\n> After configuring the Path, it is recommended to reopen the terminal (or restart VS Code \u002F PowerShell) to ensure the environment variables take effect.\n\n#### ⚡ Install Windows Build of vLLM (Optional | For Local Inference Acceleration)\n\nRelease page: [vllm-windows releases](https:\u002F\u002Fgithub.com\u002FSystemPanic\u002Fvllm-windows\u002Freleases)  \nRecommended version: 0.11.0\n\n```bash\npip install vllm-0.11.0+cu124-cp312-cp312-win_amd64.whl\n```\n\n> [!IMPORTANT]\n> Please make sure the `.whl` matches your current environment:\n> - Python: cp312 (Python 3.12)\n> - Platform: win_amd64\n> - CUDA: cu124 (must match your local CUDA \u002F driver)\n\n#### Launch Application\n\n**Paper2Any - Paper Workflow Web Frontend (Recommended)**\n\n```bash\n# Configure local backend runtime (single source of truth)\n# Edit deploy\u002Fapp_config.sh:\n#   APP_PORT=8000\n#   APP_WORKERS=2\n\n# Start backend API\n.\u002Fdeploy\u002Fstart.sh\n\n# Start frontend (new terminal)\ncd frontend-workflow\nnpm install\nnpm run dev\n```\n\nDefault local addresses:\n- Frontend dev server: http:\u002F\u002Flocalhost:3000\n- Backend health: http:\u002F\u002F127.0.0.1:8000\u002Fhealth\n\nUseful local deploy commands:\n- Start backend: `.\u002Fdeploy\u002Fstart.sh`\n- Stop backend: `.\u002Fdeploy\u002Fstop.sh`\n- Restart backend: `.\u002Fdeploy\u002Frestart.sh`\n\nNotes:\n- `deploy\u002Fstart.sh` and `deploy\u002Fstop.sh` both read the same runtime config from `deploy\u002Fapp_config.sh`.\n- If you change `APP_PORT`, update the frontend proxy target in `frontend-workflow\u002Fvite.config.ts` as well.\n\n**Configure Frontend Proxy**\n\nModify `server.proxy` in `frontend-workflow\u002Fvite.config.ts`:\n\n```typescript\nexport default defineConfig({\n  plugins: [react()],\n  server: {\n    port: 3000,\n    open: true,\n    allowedHosts: true,\n    
proxy: {\n      '\u002Fapi': {\n        target: 'http:\u002F\u002F127.0.0.1:8000',  \u002F\u002F FastAPI backend address\n        changeOrigin: true,\n      },\n      '\u002Foutputs': {\n        target: 'http:\u002F\u002F127.0.0.1:8000',\n        changeOrigin: true,\n      },\n    },\n  },\n})\n```\n\nVisit `http:\u002F\u002Flocalhost:3000`.\n\n**Windows: Load MinerU Pre-trained Model**\n\n```powershell\n# Start in PowerShell\nvllm serve opendatalab\u002FMinerU2.5-2509-1.2B `\n  --host 127.0.0.1 `\n  --port 8010 `\n  --logits-processors mineru_vl_utils:MinerULogitsProcessor `\n  --gpu-memory-utilization 0.6 `\n  --trust-remote-code `\n  --enforce-eager\n```\n\n---\n\n### Launch Application\n\n#### 🎨 Web Frontend (Recommended)\n\n```bash\n# Configure deploy\u002Fapp_config.sh first if you want to change the local port\u002Fworkers\n\n# Start backend API\n.\u002Fdeploy\u002Fstart.sh\n\n# Start frontend (new terminal)\ncd frontend-workflow\nnpm install\nnpm run dev\n```\n\nVisit `http:\u002F\u002Flocalhost:3000`.\nBackend health is available at `http:\u002F\u002F127.0.0.1:8000\u002Fhealth` by default.\n\n---\n\n### 🖥️ CLI Scripts (Command-Line Interface)\n\nPaper2Any provides standalone CLI scripts that accept command-line parameters for direct workflow execution without requiring the web frontend\u002Fbackend.\n\n#### Environment Variables\n\nConfigure API access via environment variables (optional):\n\n```bash\nexport DF_API_URL=https:\u002F\u002Fapi.openai.com\u002Fv1  # LLM API URL\nexport DF_API_KEY=sk-xxx                      # API key\nexport DF_MODEL=gpt-4o                        # Default model\n```\n\n#### Available CLI Scripts\n\n**1. 
Paper2Figure CLI** - Generate scientific figures (3 types)\n\n```bash\n# Generate model architecture diagram from PDF\npython script\u002Frun_paper2figure_cli.py \\\n  --input paper.pdf \\\n  --graph-type model_arch \\\n  --api-key sk-xxx\n\n# Generate technical roadmap from text\npython script\u002Frun_paper2figure_cli.py \\\n  --input \"Transformer architecture with attention mechanism\" \\\n  --input-type TEXT \\\n  --graph-type tech_route\n\n# Generate experimental data visualization\npython script\u002Frun_paper2figure_cli.py \\\n  --input paper.pdf \\\n  --graph-type exp_data\n```\n\n**Graph types:** `model_arch` (model architecture), `tech_route` (technical roadmap), `exp_data` (experimental plots)\n\n**2. Paper2PPT CLI** - Convert papers to PPT presentations\n\n```bash\n# Basic usage\npython script\u002Frun_paper2ppt_cli.py \\\n  --input paper.pdf \\\n  --api-key sk-xxx \\\n  --page-count 15\n\n# With custom style\npython script\u002Frun_paper2ppt_cli.py \\\n  --input paper.pdf \\\n  --style \"Academic style; English; Modern design\" \\\n  --language en\n```\n\n**3. PDF2PPT CLI** - One-click PDF to editable PPT\n\n```bash\n# Basic conversion (no AI enhancement)\npython script\u002Frun_pdf2ppt_cli.py --input slides.pdf\n\n# With AI enhancement\npython script\u002Frun_pdf2ppt_cli.py \\\n  --input slides.pdf \\\n  --use-ai-edit \\\n  --api-key sk-xxx\n```\n\n**4. Image2PPT CLI** - Convert images to editable PPT\n\n```bash\n# Basic conversion\npython script\u002Frun_image2ppt_cli.py --input screenshot.png\n\n# With AI enhancement\npython script\u002Frun_image2ppt_cli.py \\\n  --input diagram.jpg \\\n  --use-ai-edit \\\n  --api-key sk-xxx\n```\n\n**5. 
PPT2Polish CLI** - Beautify existing PPT files\n\n```bash\n# Basic beautification\npython script\u002Frun_ppt2polish_cli.py \\\n  --input old_presentation.pptx \\\n  --style \"Academic style, clean and elegant\" \\\n  --api-key sk-xxx\n\n# With reference image for style consistency\npython script\u002Frun_ppt2polish_cli.py \\\n  --input old_presentation.pptx \\\n  --style \"Modern minimalist style\" \\\n  --ref-img reference_style.png \\\n  --api-key sk-xxx\n```\n\n> [!NOTE]\n> **System Requirements for PPT2Polish:**\n> - LibreOffice: `sudo apt-get install libreoffice` (Ubuntu\u002FDebian)\n> - pdf2image: `pip install pdf2image`\n> - poppler-utils: `sudo apt-get install poppler-utils`\n\n#### Common Options\n\nAll CLI scripts support these common options:\n\n- `--api-url URL` - LLM API URL (default: from `DF_API_URL` env var)\n- `--api-key KEY` - API key (default: from `DF_API_KEY` env var)\n- `--model NAME` - Text model name (default: varies by script)\n- `--output-dir DIR` - Custom output directory (default: `outputs\u002Fcli\u002F{script_name}\u002F{timestamp}`)\n- `--help` - Show detailed help message\n\nFor complete parameter documentation, run any script with `--help`:\n\n```bash\npython script\u002Frun_paper2figure_cli.py --help\n```\n\n---\n\n## 📂 Project Structure\n\n```\nPaper2Any\u002F\n├── dataflow_agent\u002F          # Core codebase\n│   ├── agentroles\u002F         # Agent definitions\n│   │   └── paper2any_agents\u002F # Paper2Any-specific agents\n│   ├── workflow\u002F           # Workflow definitions\n│   ├── promptstemplates\u002F   # Prompt templates\n│   └── toolkits\u002F           # Toolkits (drawing, PPT generation, etc.)\n├── fastapi_app\u002F            # Backend API service\n├── frontend-workflow\u002F      # Frontend web interface\n├── static\u002F                 # Static assets\n├── script\u002F                 # Script tools\n└── tests\u002F                  # Test cases\n```\n\n---\n\n## 🗺️ 
Roadmap\n\n\u003Ctable>\n\u003Ctr>\n\u003Cth width=\"35%\">Feature\u003C\u002Fth>\n\u003Cth width=\"15%\">Status\u003C\u002Fth>\n\u003Cth width=\"50%\">Sub-features\u003C\u002Fth>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd>\u003Cstrong>📊 Paper2Figure\u003C\u002Fstrong>\u003Cbr>\u003Csub>Editable Scientific Figures\u003C\u002Fsub>\u003C\u002Ftd>\n\u003Ctd>\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FProgress-85%25-blue?style=flat-square&logo=progress\" alt=\"85%\"\u002F>\u003C\u002Ftd>\n\u003Ctd>\n\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F✓-Model_Architecture-success?style=flat-square\" alt=\"Done\"\u002F>\u003Cbr>\n\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F✓-Technical_Roadmap-success?style=flat-square\" alt=\"Done\"\u002F>\u003Cbr>\n\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F✓-Experimental_Plots-success?style=flat-square\" alt=\"Done\"\u002F>\u003Cbr>\n\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F✓-Web_Frontend-success?style=flat-square\" alt=\"Done\"\u002F>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd>\u003Cstrong>🧩 Paper2Diagram\u003C\u002Fstrong>\u003Cbr>\u003Csub>Drawio Diagrams\u003C\u002Fsub>\u003C\u002Ftd>\n\u003Ctd>\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FProgress-80%25-blue?style=flat-square&logo=progress\" alt=\"80%\"\u002F>\u003C\u002Ftd>\n\u003Ctd>\n\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F✓-Paper_or_Text_to_Drawio-success?style=flat-square\" alt=\"Done\"\u002F>\u003Cbr>\n\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F✓-Image_to_Drawio-success?style=flat-square\" alt=\"Done\"\u002F>\u003Cbr>\n\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F✓-Chat_Edit-success?style=flat-square\" alt=\"Done\"\u002F>\u003Cbr>\n\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F✓-Export_Drawio_PNG_SVG-success?style=flat-square\" 
alt=\"Done\"\u002F>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd>\u003Cstrong>🎬 Paper2PPT\u003C\u002Fstrong>\u003Cbr>\u003Csub>Editable Slide Decks\u003C\u002Fsub>\u003C\u002Ftd>\n\u003Ctd>\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FProgress-70%25-yellow?style=flat-square&logo=progress\" alt=\"70%\"\u002F>\u003C\u002Ftd>\n\u003Ctd>\n\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F✓-Beamer_Style-success?style=flat-square\" alt=\"Done\"\u002F>\u003Cbr>\n\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F✓-Long_Doc_PPT-success?style=flat-square\" alt=\"Done\"\u002F>\u003Cbr>\n\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F✓-Template_based_PPT_Generation-success?style=flat-square\" alt=\"Done\"\u002F>\u003Cbr>\n\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F✓-KB_based_PPT_Generation-success?style=flat-square\" alt=\"Done\"\u002F>\u003Cbr>\n\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F✓-Table_Extraction-success?style=flat-square\" alt=\"Done\"\u002F>\u003Cbr>\n\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F✓-Figure_Extraction-success?style=flat-square\" alt=\"Done\"\u002F>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd>\u003Cstrong>🖼️ PDF2PPT\u003C\u002Fstrong>\u003Cbr>\u003Csub>Layout-Preserving Conversion\u003C\u002Fsub>\u003C\u002Ftd>\n\u003Ctd>\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FProgress-90%25-green?style=flat-square&logo=progress\" alt=\"90%\"\u002F>\u003C\u002Ftd>\n\u003Ctd>\n\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F✓-Smart_Cutout-success?style=flat-square\" alt=\"Done\"\u002F>\u003Cbr>\n\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F✓-Layout_Preservation-success?style=flat-square\" alt=\"Done\"\u002F>\u003Cbr>\n\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F✓-Editable_PPTX-success?style=flat-square\" 
alt=\"Done\"\u002F>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd>\u003Cstrong>🖼️ Image2PPT\u003C\u002Fstrong>\u003Cbr>\u003Csub>Image to Slides\u003C\u002Fsub>\u003C\u002Ftd>\n\u003Ctd>\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FProgress-85%25-blue?style=flat-square&logo=progress\" alt=\"85%\"\u002F>\u003C\u002Ftd>\n\u003Ctd>\n\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F✓-Single_or_Multi_Image_Input-success?style=flat-square\" alt=\"Done\"\u002F>\u003Cbr>\n\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F✓-Layout_Aware_Slides-success?style=flat-square\" alt=\"Done\"\u002F>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd>\u003Cstrong>🎨 PPTPolish\u003C\u002Fstrong>\u003Cbr>\u003Csub>Smart Beautification\u003C\u002Fsub>\u003C\u002Ftd>\n\u003Ctd>\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FProgress-60%25-yellow?style=flat-square&logo=progress\" alt=\"60%\"\u002F>\u003C\u002Ftd>\n\u003Ctd>\n\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F✓-Style_Transfer-success?style=flat-square\" alt=\"Done\"\u002F>\u003Cbr>\n\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F⚠-Layout_Optimization-yellow?style=flat-square\" alt=\"In_Progress\"\u002F>\u003Cbr>\n\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F⚠-Reference_Image_Polish-yellow?style=flat-square\" alt=\"In_Progress\"\u002F>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd>\u003Cstrong>📚 Knowledge Base\u003C\u002Fstrong>\u003Cbr>\u003Csub>KB Workflows\u003C\u002Fsub>\u003C\u002Ftd>\n\u003Ctd>\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FProgress-75%25-blue?style=flat-square&logo=progress\" alt=\"75%\"\u002F>\u003C\u002Ftd>\n\u003Ctd>\n\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F✓-Ingest_and_Embedding-success?style=flat-square\" alt=\"Done\"\u002F>\u003Cbr>\n\u003Cimg 
src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F✓-Semantic_Search-success?style=flat-square\" alt=\"Done\"\u002F>\u003Cbr>\n\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F✓-KB_PPT_Podcast_Mindmap-success?style=flat-square\" alt=\"Done\"\u002F>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd>\u003Cstrong>🎬 Paper2Video\u003C\u002Fstrong>\u003Cbr>\u003Csub>Video Script Generation\u003C\u002Fsub>\u003C\u002Ftd>\n\u003Ctd>\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FProgress-40%25-yellow?style=flat-square&logo=progress\" alt=\"40%\"\u002F>\u003C\u002Ftd>\n\u003Ctd>\n\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F⚠-Script_and_Narration-yellow?style=flat-square\" alt=\"In_Progress\"\u002F>\u003Cbr>\n\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F⚠-Storyboard_Assets-yellow?style=flat-square\" alt=\"In_Progress\"\u002F>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003C\u002Ftable>\n\n---\n\n## 🤝 Contributing\n\nWe welcome all forms of contribution!\n\n[![Issues](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FIssues-Submit_Bug-red?style=for-the-badge&logo=github)](https:\u002F\u002Fgithub.com\u002FOpenDCAI\u002FPaper2Any\u002Fissues)\n[![Discussions](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FDiscussions-Feature_Request-blue?style=for-the-badge&logo=github)](https:\u002F\u002Fgithub.com\u002FOpenDCAI\u002FPaper2Any\u002Fdiscussions)\n[![PR](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FPR-Submit_Code-green?style=for-the-badge&logo=github)](https:\u002F\u002Fgithub.com\u002FOpenDCAI\u002FPaper2Any\u002Fpulls)\n\n---\n\n## 📄 License\n\nThis project is licensed under [Apache License 2.0](LICENSE).\n\n\u003C!-- ---\n\n## Star History\n\n[![Star History Chart](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOpenDCAI_Paper2Any_readme_a92212198a53.png)](https:\u002F\u002Fstar-history.com\u002F#OpenDCAI\u002FPaper2Any&Date) -->\n\n---\n\n\u003Cdiv align=\"center\">\n\n**If this project 
helps you, please give us a ⭐️ Star!**\n\n[![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FOpenDCAI\u002FPaper2Any?style=social)](https:\u002F\u002Fgithub.com\u002FOpenDCAI\u002FPaper2Any\u002Fstargazers)\n[![GitHub forks](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fforks\u002FOpenDCAI\u002FPaper2Any?style=social)](https:\u002F\u002Fgithub.com\u002FOpenDCAI\u002FPaper2Any\u002Fnetwork\u002Fmembers)\n\n\u003Cbr>\n\n\u003Ca name=\"wechat-group\">\u003C\u002Fa>\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOpenDCAI_Paper2Any_readme_cbdaf7956775.png\" alt=\"DataFlow-Agent WeChat Community\" width=\"300\"\u002F>\n\u003Cbr>\n\u003Csub>Scan to join the community WeChat group\u003C\u002Fsub>\n\n\u003Cp align=\"center\"> \n  \u003Cem>Made with ❤️ by OpenDCAI Team\u003C\u002Fem>\n\u003C\u002Fp>\n\n\u003C\u002Fdiv>\n","\u003Cdiv align=\"center\">\n\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOpenDCAI_Paper2Any_readme_2914b1b5bb3e.png\" alt=\"Paper2Any Logo\" width=\"200\"\u002F>\n\n# Paper2Any\n\n[![Python](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FPython-3.11+-3776AB?style=flat-square&logo=python&logoColor=white)](https:\u002F\u002Fwww.python.org\u002F)\n[![License](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FLicense-Apache_2.0-2F80ED?style=flat-square&logo=apache&logoColor=white)](LICENSE)\n[![GitHub Repo](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FGitHub-OpenDCAI%2FPaper2Any-24292F?style=flat-square&logo=github&logoColor=white)](https:\u002F\u002Fgithub.com\u002FOpenDCAI\u002FPaper2Any)\n[![Stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FOpenDCAI\u002FPaper2Any?style=flat-square&logo=github&label=Stars&color=F2C94C)](https:\u002F\u002Fgithub.com\u002FOpenDCAI\u002FPaper2Any\u002Fstargazers)\n\nEnglish | [中文](README_CN.md)\n\n\u003Ca href=\"https:\u002F\u002Ftrendshift.io\u002Frepositories\u002F17634\" target=\"_blank\">\u003Cimg 
src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOpenDCAI_Paper2Any_readme_d336655397cd.png\" alt=\"OpenDCAI%2FPaper2Any | Trendshift\" style=\"width: 250px; height: 55px;\" width=\"250\" height=\"55\"\u002F>\u003C\u002Fa>\n\n✨ **专注于论文多模态工作流：从论文PDF\u002F截图\u002F文本一键生成模型示意图、技术路线图、实验图表及演示文稿** ✨\n\n| 📄 **通用文件支持** &nbsp;|&nbsp; 🎯 **AI赋能生成** &nbsp;|&nbsp; 🎨 **自定义样式** &nbsp;|&nbsp; ⚡ **闪电般速度** |\n\n\u003Cbr>\n\n\u003Ca href=\"#-quick-start\" target=\"_self\">\n  \u003Cimg alt=\"Quickstart\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F🚀-Quick_Start-2F80ED?style=for-the-badge\" \u002F>\n\u003C\u002Fa>\n\u003Ca href=\"http:\u002F\u002Fdcai-paper2any.nas.cpolar.cn\u002F\" target=\"_blank\">\n  \u003Cimg alt=\"Online Demo\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F🌐-Online_Demo-56CCF2?style=for-the-badge\" \u002F>\n\u003C\u002Fa>\n\u003Ca href=\"docs\u002F\" target=\"_blank\">\n  \u003Cimg alt=\"Docs\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F📚-Docs-2D9CDB?style=for-the-badge\" \u002F>\n\u003C\u002Fa>\n\u003Ca href=\"docs\u002Fcontributing.md\" target=\"_blank\">\n  \u003Cimg alt=\"Contributing\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F🤝-Contributing-27AE60?style=for-the-badge\" \u002F>\n\u003C\u002Fa>\n\u003Ca href=\"#wechat-group\" target=\"_self\">\n  \u003Cimg alt=\"WeChat\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F💬-WeChat_Group-07C160?style=for-the-badge\" \u002F>\n\u003C\u002Fa>\n\n\u003Cbr>\n\u003Cbr>\n\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOpenDCAI_Paper2Any_readme_b294d195b8c5.png\" alt=\"Paper2Any Web Interface\" width=\"80%\"\u002F>\n\n\u003C\u002Fdiv>\n\n\n## 📑 目录\n\n- [🔥 新闻](#-news)\n- [✨ 核心功能](#-core-features)\n- [📸 展示](#-showcase)\n- [🧩 Drawio](#-drawio)\n- [🚀 快速入门](#-quick-start)\n- [📂 项目结构](#-project-structure)\n- [🗺️ 路线图](#️-roadmap)\n- [🤝 贡献](#-contributing)\n\n---\n\n## 🔥 新闻\n\n> [!TIP]\n> 🆕 \u003Cstrong>2026-03-28 · 
可编辑PPT展示更新\u003C\u002Fstrong>\u003Cbr>\n> 为前端演示文稿工作流新增了两张可编辑PPT的展示截图：\u003Cbr>\n> 包括生成的多幻灯片画廊视图，以及带有主题锁定的画布编辑工作区。\n\n> [!TIP]\n> 🆕 \u003Cstrong>2026-03-26 · 工作流展示更新\u003C\u002Fstrong>\u003Cbr>\n> 增加了对\u003Cstrong>Paper2Video\u003C\u002Fstrong>、\u003Cstrong>Paper2Poster\u003C\u002Fstrong>和\u003Cstrong>Paper2Citation\u003C\u002Fstrong>的展示内容。\u003Cbr>\n> README现在包含一个压缩视频演示，以及更新后的英文和中文工作流预览。\n\n> [!TIP]\n> 🆕 \u003Cstrong>2026-02-02 · Paper2Rebuttal\u003C\u002Fstrong>\u003Cbr>\n> 新增反驳草稿支持，提供结构化回复指导和图像感知的修订提示。\n\n> [!TIP]\n> 🆕 \u003Cstrong>2026-01-28 · Drawio更新\u003C\u002Fstrong>\u003Cbr>\n> 在工作流中增加了Drawio支持，用于创建可视化图表，并生成适合展示的输出。\u003Cbr>\n> 知识库更新一句话：支持多文件PPT生成，具备文档转换与合并功能，可选插入图片，并结合嵌入式检索。\n\n> [!TIP]\n> 🆕 \u003Cstrong>2026-01-25 · 新功能\u003C\u002Fstrong>\u003Cbr>\n> 新增了\u003Cstrong>AI辅助大纲编辑\u003C\u002Fstrong>、\u003Cstrong>三层模型配置系统\u003C\u002Fstrong>以实现灵活的模型选择，以及\u003Cstrong>用户积分管理\u003C\u002Fstrong>并每日分配额度。\u003Cbr>\n> 🌐 在线演示： \u003Ca href=\"http:\u002F\u002Fdcai-paper2any.nas.cpolar.cn\u002F\">http:\u002F\u002Fdcai-paper2any.nas.cpolar.cn\u002F\u003C\u002Fa>\n\n> [!TIP]\n> 🆕 \u003Cstrong>2026-01-20 · Bug修复\u003C\u002Fstrong>\u003Cbr>\n> 修复了实验图表生成中的bug（图像\u002F文本相关），并解决了历史文件丢失的问题。\u003Cbr>\n> 🌐 在线演示： \u003Ca href=\"http:\u002F\u002Fdcai-paper2any.nas.cpolar.cn\u002F\">http:\u002F\u002Fdcai-paper2any.nas.cpolar.cn\u002F\u003C\u002Fa>\n\n> [!TIP]\n> 🆕 \u003Cstrong>2026-01-14 · 功能更新与后端架构升级\u003C\u002Fstrong>\u003Cbr>\n> 1. **功能更新**：新增了\u003Cstrong>Image2PPT\u003C\u002Fstrong>,优化了\u003Cstrong>Paper2Figure\u003C\u002Fstrong>的交互体验，并提升了\u003Cstrong>PDF2PPT\u003C\u002Fstrong>的效果。\u003Cbr>\n> 2. **标准化API**：重构了后端接口，采用RESTful `\u002Fapi\u002Fv1\u002F`结构，移除了过时的端点以提高可维护性。\u003Cbr>\n> 3. 
**动态配置**：通过API参数支持动态模型选择（如GPT-4o、Qwen-VL），消除了硬编码的模型依赖。\u003Cbr>\n> 🌐 在线演示： \u003Ca href=\"http:\u002F\u002Fdcai-paper2any.nas.cpolar.cn\u002F\">http:\u002F\u002Fdcai-paper2any.nas.cpolar.cn\u002F\u003C\u002Fa>\n\n- 2025-12-12 · Paper2Figure网页公测上线\n- 2025-10-01 · 发布首个版本 \u003Ccode>0.1.0\u003C\u002Fcode>\n\n---\n\n## ✨ 核心功能\n\n> 从论文PDF\u002F图片\u002F文本出发，一键生成可编辑的科学图表、演示文稿、视频脚本、学术海报等多模态内容。\n\nPaper2Any目前包含以下子能力：\n\n- **📊 Paper2Figure - 可编辑的科学图表**：包括模型架构图、技术路线图（PPT + SVG）以及实验图表，最终输出可编辑的PPTX文件。\n- **🧩 Paper2Diagram \u002F Image2Drawio - 可编辑的图表**：根据论文或文本、图片生成draw.io格式的图表，支持导出为drawio\u002Fpng\u002Fsvg格式，并可通过聊天方式进行编辑。\n- **🎬 Paper2PPT - 可编辑的演示文稿**：将论文、文本或主题内容转化为PPT，支持长文档处理，并内置表格和图表提取功能。\n- **📝 Paper2Rebuttal**：基于证据撰写结构化的反驳意见和修改建议。\n- **🖼️ PDF2PPT - 版面保持转换**：实现PDF到可编辑PPTX的精准版面保留。\n- **🖼️ Image2PPT - 图片转幻灯片**：将图片或截图转化为结构化的幻灯片。\n- **🎨 PPTPolish - 智能美化**：利用AI进行布局优化和风格转换。\n- **🎬 Paper2Video**：生成视频脚本和旁白素材。\n- **🖼️ Paper2Poster - 学术海报**：将论文PDF转化为海报级排版，支持自定义版块、添加Logo，并导出成品。\n- **🔎 Paper2Citation - 引用探索器**：通过作者姓名或DOI\u002F论文链接，追踪引用该论文的作者、机构及后续重要研究成果。\n- **📝 Paper2Technical**：生成技术报告和方法总结。\n- **📚 知识库（KB）**：支持知识导入与嵌入、语义搜索，以及基于知识库生成PPT、播客和思维导图等功能。\n\n---\n\n## 📸 展示\n\n### 🧩 Drawio\n\n\u003Cdiv align=\"center\">\n\n\u003Ctable>\n  \u003Ctr>\n    \u003Ctd width=\"33%\" align=\"center\" valign=\"top\">\n      \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOpenDCAI_Paper2Any_readme_ac25ef0ee794.png\" width=\"100%\"\u002F>\n      \u003Cbr>\u003Csub>✨ 上传纸质图表或截图作为起点\u003C\u002Fsub>\n    \u003C\u002Ftd>\n    \u003Ctd width=\"34%\" align=\"center\" valign=\"top\">\n      \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOpenDCAI_Paper2Any_readme_1d3af098805d.png\" width=\"100%\"\u002F>\n      \u003Cbr>\u003Csub>✨ 在转换前保持源结构可见\u003C\u002Fsub>\n    \u003C\u002Ftd>\n    \u003Ctd width=\"33%\" align=\"center\" valign=\"top\">\n      \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOpenDCAI_Paper2Any_readme_e42b63ccc600.gif\" 
width=\"100%\"\u002F>\n      \u003Cbr>\u003Csub>✨ 将图像转换为可编辑的 DrawIO 画布\u003C\u002Fsub>\n    \u003C\u002Ftd>\n  \u003C\u002Ftr>\n\u003C\u002Ftable>\n\n\u003Cbr>\n\n\u003Ctable>\n  \u003Ctr>\n    \u003Ctd width=\"48%\" align=\"center\" valign=\"top\">\n      \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOpenDCAI_Paper2Any_readme_8b59a21aebf0.png\" width=\"100%\"\u002F>\n      \u003Cbr>\u003Csub>✨ 直接在 DrawIO 工作台中生成模型或系统图\u003C\u002Fsub>\n    \u003C\u002Ftd>\n    \u003Ctd width=\"52%\" align=\"center\" valign=\"top\">\n      \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOpenDCAI_Paper2Any_readme_45a2526e477f.gif\" width=\"100%\"\u002F>\n      \u003Cbr>\u003Csub>✨ 使用聊天编辑功能完善生成的架构，并导出准备就绪的布局\u003C\u002Fsub>\n    \u003C\u002Ftd>\n  \u003C\u002Ftr>\n\u003C\u002Ftable>\n\n\u003C\u002Fdiv>\n\n---\n\n### 📝 Paper2Rebuttal：反驳稿撰写\n\n\u003Cdiv align=\"center\">\n\n\u003Cbr>\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOpenDCAI_Paper2Any_readme_cc9cab70a91a.png\" width=\"95%\"\u002F>\n\u003Cbr>\u003Csub>✨ 反驳稿撰写与修订支持\u003C\u002Fsub>\n\n\u003C\u002Fdiv>\n\n---\n\n### 📊 Paper2Figure：科学图表生成\n\n\u003Cdiv align=\"center\">\n\n\u003Cbr>\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOpenDCAI_Paper2Any_readme_076e4d1fbed2.gif\" width=\"90%\"\u002F>\n\u003Cbr>\u003Csub>✨ 模型架构图生成\u003C\u002Fsub>\n\n\u003Cbr>\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOpenDCAI_Paper2Any_readme_14afff7223ba.png\" width=\"90%\"\u002F>\n\u003Cbr>\u003Csub>✨ 模型架构图生成\u003C\u002Fsub>\n\n\u003Cbr>\u003Cbr>\n\u003Ctable>\n  \u003Ctr>\n    \u003Ctd width=\"56%\" align=\"center\" valign=\"top\">\n      \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOpenDCAI_Paper2Any_readme_29080d5bd025.png\" width=\"100%\"\u002F>\n      \u003Cbr>\u003Csub>✨ 技术路线图工作台：选择路径类型、输入来源、模型配置和可视化模板\u003C\u002Fsub>\n    \u003C\u002Ftd>\n    \u003Ctd width=\"44%\" align=\"center\" valign=\"top\">\n  
    \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOpenDCAI_Paper2Any_readme_3dc0c587f868.png\" width=\"100%\"\u002F>\n      \u003Cbr>\u003Csub>✨ 生成具有结构化双栏布局的技术路线图\u003C\u002Fsub>\n    \u003C\u002Ftd>\n  \u003C\u002Ftr>\n\u003C\u002Ftable>\n\n\u003Cbr>\u003Cbr>\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOpenDCAI_Paper2Any_readme_f4bc4088bd34.png\" width=\"90%\"\u002F>\n\u003Cbr>\u003Csub>✨ 实验曲线图生成（多种风格）\u003C\u002Fsub>\n\n\u003C\u002Fdiv>\n\n---\n\n### 🎬 Paper2PPT：论文转演示文稿\n\n\u003Cdiv align=\"center\">\n\n\u003Ctable>\n  \u003Ctr>\n    \u003Ctd width=\"50%\" align=\"center\" valign=\"top\">\n      \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOpenDCAI_Paper2Any_readme_a5f3a4fbef11.gif\" width=\"100%\"\u002F>\n      \u003Cbr>\u003Csub>✨ 端到端 PPT 生成演示\u003C\u002Fsub>\n    \u003C\u002Ftd>\n    \u003Ctd width=\"50%\" align=\"center\" valign=\"top\">\n      \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOpenDCAI_Paper2Any_readme_3fd4f6369d89.png\" width=\"100%\"\u002F>\n      \u003Cbr>\u003Csub>✨ 从论文\u002F文本\u002F主题生成精美的幻灯片集\u003C\u002Fsub>\n    \u003C\u002Ftd>\n  \u003C\u002Ftr>\n\u003C\u002Ftable>\n\n\u003Cbr>\n\n\u003Ctable>\n  \u003Ctr>\n    \u003Ctd width=\"50%\" align=\"center\" valign=\"top\">\n      \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOpenDCAI_Paper2Any_readme_6db1c04d4eef.gif\" width=\"100%\"\u002F>\n      \u003Cbr>\u003Csub>✨ 在锁定主题的前提下，直接在画布上编辑幻灯片文字\u003C\u002Fsub>\n    \u003C\u002Ftd>\n    \u003Ctd width=\"50%\" align=\"center\" valign=\"top\">\n      \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOpenDCAI_Paper2Any_readme_4674abcd902c.gif\" width=\"100%\"\u002F>\n      \u003Cbr>\u003Csub>✨ 导出前预览生成的多页幻灯片库\u003C\u002Fsub>\n    \u003C\u002Ftd>\n  \u003C\u002Ftr>\n\u003C\u002Ftable>\n\n\u003Cbr>\n\n\u003Ctable>\n  \u003Ctr>\n    \u003Ctd width=\"50%\" align=\"center\" valign=\"top\">\n      \u003Cimg 
src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOpenDCAI_Paper2Any_readme_c87b5b199e3b.png\" width=\"100%\"\u002F>\n      \u003Cbr>\u003Csub>✨ AI 辅助的大纲优化，提供针对性的改写提示\u003C\u002Fsub>\n    \u003C\u002Ftd>\n    \u003Ctd width=\"50%\" align=\"center\" valign=\"top\">\n      \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOpenDCAI_Paper2Any_readme_433628136412.png\" width=\"100%\"\u002F>\n      \u003Cbr>\u003Csub>✨ 大纲编辑细化到章节和要点级别\u003C\u002Fsub>\n    \u003C\u002Ftd>\n  \u003C\u002Ftr>\n\u003C\u002Ftable>\n\n\u003Cbr>\n\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOpenDCAI_Paper2Any_readme_f62390a12912.png\" width=\"78%\"\u002F>\n\u003Cbr>\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOpenDCAI_Paper2Any_readme_c7bb0212f9ae.png\" width=\"80%\"\u002F>\n\u003Cbr>\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOpenDCAI_Paper2Any_readme_369cda27c799.png\" width=\"80%\"\u002F>\n\u003Cbr>\u003Csub>✨ 支持超过 40 张幻灯片的长文档 · 智能表格提取与插入 · 版本历史与迭代式文稿管理\u003C\u002Fsub>\n\n\u003C\u002Fdiv>\n\n---\n\n### 🎬 Paper2Video：PPT 转配音视频\n\n\u003Cdiv align=\"center\">\n\n\u003Cbr>\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOpenDCAI_Paper2Any_readme_3c1af41b3c34.gif\" width=\"90%\"\u002F>\n\u003Cbr>\u003Csub>✨ 将 PPT\u002FPDF 转换为带旁白的视频，支持脚本确认、阿里云 TTS 音色，并可下载输出\u003C\u002Fsub>\n\n\u003C\u002Fdiv>\n\n---\n\n### 🖼️ Paper2Poster：论文转海报\n\n\u003Cdiv align=\"center\">\n\n\u003Cbr>\n\u003Ctable>\n  \u003Ctr>\n    \u003Ctd align=\"center\" width=\"50%\">\n      \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOpenDCAI_Paper2Any_readme_08d36e70b483.png\" width=\"100%\"\u002F>\n      \u003Cbr>\u003Csub>PNG 海报结果\u003C\u002Fsub>\n    \u003C\u002Ftd>\n    \u003Ctd align=\"center\" width=\"50%\">\n      \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOpenDCAI_Paper2Any_readme_9d5186a63fe2.png\" width=\"100%\"\u002F>\n      
\u003Cbr>\u003Csub>PPT 海报结果\u003C\u002Fsub>\n    \u003C\u002Ftd>\n  \u003C\u002Ftr>\n\u003C\u002Ftable>\n\u003Cbr>\u003Csub>✨ 将论文 PDF 转换为学术海报，支持可配置布局、可编辑的海报输出以及一键导出\u003C\u002Fsub>\n\n\u003C\u002Fdiv>\n\n---\n\n### 🔎 Paper2Citation：引用探索器\n\n\u003Cdiv align=\"center\">\n\n\u003Cbr>\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOpenDCAI_Paper2Any_readme_5386bf0c7b21.png\" width=\"90%\"\u002F>\n\u003Cbr>\u003Csub>✨ 搜索作者或论文，查看引用候选、所属机构及下游引用背景\u003C\u002Fsub>\n\n\u003C\u002Fdiv>\n\n---\n\n### 🎨 PPT 智能美化\n\n\u003Cdiv align=\"center\">\n\n\u003Cbr>\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOpenDCAI_Paper2Any_readme_3539548dfece.gif\" width=\"90%\"\u002F>\n\u003Cbr>\u003Csub>✨ 基于 AI 的版面优化\u003C\u002Fsub>\n\n\u003Cbr>\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOpenDCAI_Paper2Any_readme_71d22e550c59.png\" width=\"90%\"\u002F>\n\u003Cbr>\u003Csub>✨ 基于 AI 的版面优化与风格迁移\u003C\u002Fsub>\n\n\u003C\u002Fdiv>\n\n---\n\n### 🖼️ PDF2PPT：保留版面的转换\n\n\u003Cdiv align=\"center\">\n\n\u003Cbr>\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOpenDCAI_Paper2Any_readme_5b1c750a53d9.png\" width=\"90%\"\u002F>\n\u003Cbr>\u003Csub>✨ 智能裁剪与版面保留\u003C\u002Fsub>\n\n\u003Cbr>\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOpenDCAI_Paper2Any_readme_bc1832666902.png\" width=\"93%\"\u002F>\n\u003Cbr>\u003Csub>✨ 图像转 PPT\u003C\u002Fsub>\n\n\u003C\u002Fdiv>\n\n---\n\n## 🚀 快速入门\n\n### 需求\n\n![Python](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FPython-3.11+-3776AB?style=flat-square&logo=python&logoColor=white)\n![pip](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fpip-latest-3776AB?style=flat-square&logo=pypi&logoColor=white)\n\n\u003Cdetails>\n\u003Csummary>\u003Cstrong>🐳 Docker（推荐）— 部署与更新\u003C\u002Fstrong>\u003C\u002Fsummary>\n\n```bash\n# 1. 克隆\ngit clone https:\u002F\u002Fgithub.com\u002FOpenDCAI\u002FPaper2Any.git\ncd Paper2Any\n\n# 2. 
配置环境变量\ncp fastapi_app\u002F.env.example fastapi_app\u002F.env\ncp frontend-workflow\u002F.env.example frontend-workflow\u002F.env\n```\n\n**必需配置：**\n\n`fastapi_app\u002F.env`（后端）：\n```bash\n# 内部 API 认证密钥。必须与前端 VITE_API_KEY 匹配。\nBACKEND_API_KEY=your-backend-api-key\n\n# 必需：您的 LLM API 地址（请替换为您自己的）\nDEFAULT_LLM_API_URL=https:\u002F\u002Fapi.openai.com\u002Fv1\u002F\n\n# 可选：DrawIO OCR \u002F VLM 服务\nPAPER2DRAWIO_OCR_API_URL=https:\u002F\u002Fdashscope.aliyuncs.com\u002Fcompatible-mode\u002Fv1\nPAPER2DRAWIO_OCR_API_KEY=your_dashscope_key\n\n# 可选：MinerU 官方远程 API\nMINERU_API_BASE_URL=https:\u002F\u002Fmineru.net\u002Fapi\u002Fv4\nMINERU_API_KEY=your_mineru_api_key\n\n# 可选：用于 PDF2PPT \u002F Image2PPT \u002F Image2Drawio 的 SAM3 分割服务\n# SAM3_SERVER_URLS=http:\u002F\u002FGPU_MACHINE_IP:8001\n# SAM3_SERVER_URLS=http:\u002F\u002FGPU1:8021,http:\u002F\u002FGPU2:8022\n\n# 可选：Supabase（无需认证也可使用——核心功能仍可正常运行）\n# SUPABASE_URL=https:\u002F\u002Fyour-project-id.supabase.co\n# SUPABASE_ANON_KEY=your_supabase_anon_key\n```\n\n`frontend-workflow\u002F.env`（前端）：\n```bash\n# 必须与 fastapi_app\u002F.env 中的 BACKEND_API_KEY 一致\nVITE_API_KEY=your-backend-api-key\n\n# 必需：UI 下拉菜单中显示的 LLM API 地址（用逗号分隔）\nVITE_DEFAULT_LLM_API_URL=https:\u002F\u002Fapi.openai.com\u002Fv1\nVITE_LLM_API_URLS=https:\u002F\u002Fapi.openai.com\u002Fv1\n\n# 可选：UI 中显示的 Paper2Drawio 模型候选\nVITE_PAPER2DRAWIO_MODEL=claude-sonnet-4-5-20250929,gpt-5.2\n# 可选：Supabase（与后端保持一致）\n# VITE_SUPABASE_URL=https:\u002F\u002Fyour-project-id.supabase.co\n# VITE_SUPABASE_ANON_KEY=your_supabase_anon_key\n```\n\n```bash\n# 3. 
构建并运行\ndocker compose up -d --build\n```\n\n打开：\n- 前端：http:\u002F\u002Flocalhost:3000\n- 后端健康检查：http:\u002F\u002Flocalhost:8000\u002Fhealth\n\n> **关于 GPU 服务的说明**：Docker 只会启动前端和后端，不会包含任何 GPU 模型服务。\n> - Paper2PPT、Paper2Figure、知识库等功能仅需 LLM API 即可开箱即用。\n> - 而 **PDF2PPT、Image2PPT、Image2Drawio** 则需要 SAM3 分割服务（需 GPU），该服务需单独部署：\n>   ```bash\n>   # 在配备 GPU 的机器上\n>   python -m dataflow_agent.toolkits.model_servers.sam3_server \\\n>       --port 8001 --checkpoint models\u002Fsam3\u002Fsam3.pt \\\n>       --bpe models\u002Fsam3\u002Fbpe_simple_vocab_16e6.txt.gz --device cuda\n>   ```\n>   然后在 `fastapi_app\u002F.env` 中添加：`SAM3_SERVER_URLS=http:\u002F\u002FGPU_MACHINE_IP:8001`\n>\n> 更多详情请参阅下方的“高级配置：本地模型服务负载均衡”部分。\n\n修改与更新：\n- 修改代码或 `.env` 文件后，请重新构建：`docker compose up -d --build`\n- 拉取最新代码并重新构建：\n  - `git pull`\n  - `docker compose up -d --build`\n\n常用命令：\n- 查看日志：`docker compose logs -f`\n- 停止服务：`docker compose down`\n\n注意事项：\n- 首次构建可能需要较长时间（系统依赖项 + Python 依赖项）。\n- 前端环境变量在构建时已固化（通过 compose 构建参数）。若需更改，需重新构建：`docker compose up -d --build`。\n- 输出文件和模型会挂载到宿主机目录（`.\u002Foutputs`、`.\u002Fmodels`），以确保数据持久化。\n\n\u003C\u002Fdetails>\n\n### 🐧 Linux 安装\n\n> 我们建议使用 Conda 创建一个隔离的环境（Python 3.11）。\n\n#### 1. 创建环境并安装基础依赖\n\n```bash\n# 0. 创建并激活 Conda 环境\nconda create -n paper2any python=3.11 -y\nconda activate paper2any\n\n# 1. 克隆仓库\ngit clone https:\u002F\u002Fgithub.com\u002FOpenDCAI\u002FPaper2Any.git\ncd Paper2Any\n\n# 2. 安装基础依赖\npip install -r requirements-base.txt\n\n# 3. 以可编辑（开发）模式安装\npip install -e .\n```\n\n#### 2. 安装 Paper2Any 特定依赖（必需）\n\nPaper2Any 涉及 LaTeX 渲染、矢量图形处理以及 PPT\u002FPDF 转换等功能，这些都需要额外的依赖：\n\n```bash\n# 1. Python 依赖\npip install -r requirements-paper.txt || pip install -r requirements-paper-backup.txt\n\n# 2. LaTeX 引擎（tectonic）——推荐通过 Conda 安装\nconda install -c conda-forge tectonic -y\n\n# 3. 解决 doclayout_yolo 依赖冲突（重要）\npip install doclayout_yolo --no-deps\n\n# 4. 
系统依赖（以 Ubuntu 为例）\nsudo apt-get update\nsudo apt-get install -y inkscape libreoffice poppler-utils wkhtmltopdf\n```\n\n#### 3. 环境变量\n\n```bash\nexport DF_API_KEY=your_api_key_here\nexport DF_API_URL=xxx  # 可选：如果您需要第三方 API 网关\nexport MINERU_DEVICES=\"0,1,2,3\" # 可选：MinerU 任务的 GPU 资源池\n```\n\n> [!TIP]\n> 📚 **详细配置指南**，请参阅 [配置指南](docs\u002Fguides\u002Fconfiguration.md)，其中提供了关于配置模型、环境变量以及启动服务的分步说明。\n\n#### 4. 配置环境文件（可选）\n\n\u003Cdetails>\n\u003Csummary>\u003Cstrong>📝 点击展开：详细的 .env 配置指南\u003C\u002Fstrong>\u003C\u002Fsummary>\n\nPaper2Any 使用两个 `.env` 文件进行配置。**两者均为可选**——您可以在不使用它们的情况下直接运行应用程序，采用默认设置。\n\n##### 步骤 1：复制示例文件\n\n```bash\n# 复制后端环境文件\ncp fastapi_app\u002F.env.example fastapi_app\u002F.env\n\n# 复制前端环境文件\ncp frontend-workflow\u002F.env.example frontend-workflow\u002F.env\n```\n\n##### 步骤 2：后端配置（`fastapi_app\u002F.env`）\n\n**Supabase（可选）**——仅当您需要用户认证和云存储时才需配置：\n```bash\nSUPABASE_URL=https:\u002F\u002Fyour-project-id.supabase.co\nSUPABASE_ANON_KEY=your_supabase_anon_key\n```\n\n**模型配置**——自定义不同工作流所使用的模型：\n```bash\n# 默认 LLM API 地址\nDEFAULT_LLM_API_URL=http:\u002F\u002F123.129.219.111:3000\u002Fv1\u002F\n\n# 工作流级别的默认值\nPAPER2PPT_DEFAULT_MODEL=gpt-5.1\nPAPER2PPT_DEFAULT_IMAGE_MODEL=gemini-3-pro-image-preview\nPDF2PPT_DEFAULT_MODEL=gpt-4o\n# …完整列表请参见 .env.example\n```\n\n**服务集成配置**——图像\u002FPDF 工作流中使用的外部或本地服务：\n```bash\n# DrawIO OCR \u002F VLM\nPAPER2DRAWIO_OCR_API_URL=https:\u002F\u002Fdashscope.aliyuncs.com\u002Fcompatible-mode\u002Fv1\nPAPER2DRAWIO_OCR_API_KEY=your_dashscope_key\n\n# MinerU 官方远程 API；如果 MINERU_API_KEY 为空，后端将回退到本地 MINERU_PORT\nMINERU_API_BASE_URL=https:\u002F\u002Fmineru.net\u002Fapi\u002Fv4\nMINERU_API_KEY=your_mineru_api_key\nMINERU_API_MODEL_VERSION=vlm\n\n# SAM3 分割服务，用于 PDF2PPT \u002F Image2PPT \u002F Image2Drawio\n# 单个端点：\nSAM3_SERVER_URLS=http:\u002F\u002F127.0.0.1:8001\n# 或多个端点以实现负载均衡：\n# SAM3_SERVER_URLS=http:\u002F\u002F127.0.0.1:8021,http:\u002F\u002F127.0.0.1:8022\n```\n\n##### 步骤 3：前端配置（`frontend-workflow\u002F.env`）\n\n**LLM 提供商配置**——控制 UI 中的 
API 端点下拉菜单：\n```bash\n# UI 中显示的默认 API 地址\nVITE_DEFAULT_LLM_API_URL=https:\u002F\u002Fapi.apiyi.com\u002Fv1\n\n# 下拉菜单中可用的 API URL（逗号分隔）\nVITE_LLM_API_URLS=https:\u002F\u002Fapi.apiyi.com\u002Fv1,http:\u002F\u002Fb.apiyi.com:16888\u002Fv1,http:\u002F\u002F123.129.219.111:3000\u002Fv1\n```\n\n**修改 `VITE_LLM_API_URLS` 后会发生什么：**\n- 前端会显示一个包含你指定的所有 URL 的 **下拉菜单**\n- 用户无需手动输入 URL，即可选择不同的 API 端点\n- 适用于在 OpenAI、本地模型或自定义 API 网关之间切换\n\n**Supabase（可选）** - 如果需要用户认证，请取消注释以下行：\n```bash\nVITE_SUPABASE_URL=https:\u002F\u002Fyour-project.supabase.co\nVITE_SUPABASE_ANON_KEY=your-anon-key\nSUPABASE_SERVICE_ROLE_KEY=your-service-role-key\nSUPABASE_JWT_SECRET=your-jwt-secret\n```\n\n##### 不使用 Supabase 运行\n\n如果跳过 Supabase 配置：\n- ✅ 所有核心功能正常运行\n- ✅ CLI 脚本无需任何配置即可运行\n- ❌ 无用户认证和配额限制\n- ❌ 无云端文件存储\n\n\u003C\u002Fdetails>\n\n> [!NOTE]\n> **快速入门：** 您可以完全跳过 `.env` 配置，直接通过 `--api-key` 参数使用 CLI 脚本。请参阅下方的 [CLI 脚本](#️-cli-scripts-command-line-interface) 部分。\n\n---\n\n\u003Cdetails>\n\u003Csummary>\u003Cstrong>高级配置：本地模型服务负载均衡\u003C\u002Fstrong>\u003C\u002Fsummary>\n\n如果您在高并发的本地环境中部署，可以使用 `script\u002Fstart_model_servers.sh` 来启动本地模型服务集群（MinerU \u002F SAM \u002F OCR）。\n\n脚本位置：`\u002FDataFlow-Agent\u002Fscript\u002Fstart_model_servers.sh`\n\n**主要配置项：**\n\n- **MinerU（PDF 解析）**\n  - `MINERU_MODEL_PATH`：模型路径（默认为 `models\u002FMinerU2.5-2509-1.2B`）\n  - `MINERU_GPU_UTIL`：GPU 内存利用率（默认 0.85）\n  - **实例配置**：默认情况下，每块已配置的 GPU 上启动一个实例，端口为 8011-8013。\n  - **负载均衡器**：端口 8010，自动分发请求。\n\n- **SAM3（Segment Anything Model 3）**\n  - **实例配置**：默认情况下，每块已配置的 GPU 上启动一个实例，端口为 8021-8022。\n  - **模型资源**：默认路径为 `.\u002Fmodels\u002Fsam3\u002Fsam3.pt` 和 `.\u002Fmodels\u002Fsam3\u002Fbpe_simple_vocab_16e6.txt.gz`。\n  - **负载均衡器**：端口 8020。\n\n- **OCR（PaddleOCR）**\n  - **配置**：在 CPU 上运行，使用 uvicorn 的工作进程机制（默认 4 个工作进程）。\n  - **端口**：8003。\n\n> 在使用前，请根据您实际的 GPU 数量和内存情况，修改脚本中的 `gpu_id` 和实例数量。\n\n若要在单 GPU 上进行本地一键开发测试（SAM3 + 后端 + 前端），请运行：\n\n```bash\nbash script\u002Fstart_local_sam3_dev.sh\n```\n\n\u003C\u002Fdetails>\n\n---\n\n### 🪟 Windows 
安装\n\n> [!NOTE]\n> 目前我们建议在 Linux \u002F WSL 上尝试 Paper2Any。如果您需要在原生 Windows 上部署，请按照以下步骤操作。\n\n#### 1. 创建环境并安装基础依赖\n\n```bash\n# 0. 创建并激活 conda 环境\nconda create -n paper2any python=3.12 -y\nconda activate paper2any\n\n# 1. 克隆仓库\ngit clone https:\u002F\u002Fgithub.com\u002FOpenDCAI\u002FPaper2Any.git\ncd Paper2Any\n\n# 2. 安装基础依赖\npip install -r requirements-win-base.txt\n\n# 3. 以可编辑（开发）模式安装\npip install -e .\n```\n\n#### 2. 安装 Paper2Any 特定依赖（推荐）\n\nPaper2Any 涉及 LaTeX 渲染和矢量图形处理，需要额外的依赖（见 `requirements-paper.txt`）：\n\n```bash\n# Python 依赖\npip install -r requirements-paper.txt\n\n# tectonic：LaTeX 引擎（推荐通过 conda 安装）\nconda install -c conda-forge tectonic -y\n```\n\n**🎨 安装 Inkscape（SVG\u002F矢量图形处理 | 推荐\u002F必需）**\n\n1. 下载并安装（Windows 64 位 MSI）：[Inkscape 下载](https:\u002F\u002Finkscape.org\u002Frelease\u002Finkscape-1.4.2\u002Fwindows\u002F64-bit\u002Fmsi\u002F?redirected=1)\n2. 将 Inkscape 可执行文件目录添加到系统环境变量 Path 中（示例）：`C:\\Program Files\\Inkscape\\bin\\`\n\n> [!TIP]\n> 配置 Path 后，建议重新打开终端（或重启 VS Code \u002F PowerShell），以确保环境变量生效。\n\n#### ⚡ 安装 vLLM 的 Windows 版本（可选 | 用于本地推理加速）\n\n发布页面：[vllm-windows releases](https:\u002F\u002Fgithub.com\u002FSystemPanic\u002Fvllm-windows\u002Freleases)  \n推荐版本：0.11.0\n\n```bash\npip install vllm-0.11.0+cu124-cp312-cp312-win_amd64.whl\n```\n\n> [!IMPORTANT]\n> 请确保 `.whl` 文件与您的当前环境匹配：\n> - Python：cp312（Python 3.12）\n> - 平台：win_amd64\n> - CUDA：cu124（必须与您本地的 CUDA \u002F 驱动程序匹配）\n\n#### 启动应用\n\n**Paper2Any - 论文工作流 Web 前端（推荐）**\n\n```bash\n# 配置本地后端运行时（单一可信来源）\n# 编辑 deploy\u002Fapp_config.sh：\n#   APP_PORT=8000\n#   APP_WORKERS=2\n\n# 启动后端 API\n.\u002Fdeploy\u002Fstart.sh\n\n# 启动前端（新开终端）\ncd frontend-workflow\nnpm install\nnpm run dev\n```\n\n默认本地地址：\n- 前端开发服务器：http:\u002F\u002Flocalhost:3000\n- 后端健康检查：http:\u002F\u002F127.0.0.1:8000\u002Fhealth\n\n有用的本地部署命令：\n- 启动后端：`.\u002Fdeploy\u002Fstart.sh`\n- 停止后端：`.\u002Fdeploy\u002Fstop.sh`\n- 重启后端：`.\u002Fdeploy\u002Frestart.sh`\n\n注意：\n- `deploy\u002Fstart.sh` 和 `deploy\u002Fstop.sh` 都会读取 
`deploy\u002Fapp_config.sh` 中的相同运行时配置。\n- 如果您更改了 `APP_PORT`，还需同时更新 `frontend-workflow\u002Fvite.config.ts` 中的前端代理目标。\n\n**配置前端代理**\n\n修改 `frontend-workflow\u002Fvite.config.ts` 中的 `server.proxy`：\n\n```typescript\nexport default defineConfig({\n  plugins: [react()],\n  server: {\n    port: 3000,\n    open: true,\n    allowedHosts: true,\n    proxy: {\n      '\u002Fapi': {\n        target: 'http:\u002F\u002F127.0.0.1:8000',  \u002F\u002F FastAPI 后端地址\n        changeOrigin: true,\n      },\n      '\u002Foutputs': {\n        target: 'http:\u002F\u002F127.0.0.1:8000',\n        changeOrigin: true,\n      },\n    },\n  },\n})\n```\n\n访问 `http:\u002F\u002Flocalhost:3000`。\n\n**Windows：加载 MinerU 预训练模型**\n\n```powershell\n# 在 PowerShell 中启动\nvllm serve opendatalab\u002FMinerU2.5-2509-1.2B `\n  --host 127.0.0.1 `\n  --port 8010 `\n  --logits-processors mineru_vl_utils:MinerULogitsProcessor `\n  --gpu-memory-utilization 0.6 `\n  --trust-remote-code `\n  --enforce-eager\n```\n\n---\n\n### 启动应用\n\n#### 🎨 Web 前端（推荐）\n\n```bash\n# 如果想更改本地端口\u002F工作进程数，请先配置 deploy\u002Fapp_config.sh\n\n# 启动后端 API\n.\u002Fdeploy\u002Fstart.sh\n\n# 启动前端（新开终端）\ncd frontend-workflow\nnpm install\nnpm run dev\n```\n\n访问 `http:\u002F\u002Flocalhost:3000`。默认情况下，后端健康检查地址为 `http:\u002F\u002F127.0.0.1:8000\u002Fhealth`。\n\n---\n\n### 🖥️ 命令行脚本 (CLI)\n\nPaper2Any 提供了独立的命令行脚本，支持通过命令行参数直接执行工作流，无需使用 Web 前端或后端。\n\n#### 环境变量\n\n可通过环境变量配置 API 访问（可选）：\n\n```bash\nexport DF_API_URL=https:\u002F\u002Fapi.openai.com\u002Fv1  # LLM API 地址\nexport DF_API_KEY=sk-xxx                      # API 密钥\nexport DF_MODEL=gpt-4o                        # 默认模型\n```\n\n#### 可用的 CLI 脚本\n\n**1. 
Paper2Figure CLI** - 生成科学图表（3 种类型）\n\n```bash\n# 从 PDF 文件生成模型架构图\npython script\u002Frun_paper2figure_cli.py \\\n  --input paper.pdf \\\n  --graph-type model_arch \\\n  --api-key sk-xxx\n\n# 从文本生成技术路线图\npython script\u002Frun_paper2figure_cli.py \\\n  --input \"带有注意力机制的 Transformer 架构\" \\\n  --input-type TEXT \\\n  --graph-type tech_route\n\n# 生成实验数据可视化图表\npython script\u002Frun_paper2figure_cli.py \\\n  --input paper.pdf \\\n  --graph-type exp_data\n```\n\n**图表类型：** `model_arch`（模型架构）、`tech_route`（技术路线图）、`exp_data`（实验图表）\n\n**2. Paper2PPT CLI** - 将论文转换为 PPT 演示文稿\n\n```bash\n# 基本用法\npython script\u002Frun_paper2ppt_cli.py \\\n  --input paper.pdf \\\n  --api-key sk-xxx \\\n  --page-count 15\n\n# 自定义风格\npython script\u002Frun_paper2ppt_cli.py \\\n  --input paper.pdf \\\n  --style \"学术风格；英语；现代设计\" \\\n  --language en\n```\n\n**3. PDF2PPT CLI** - 一键将 PDF 转换为可编辑的 PPT\n\n```bash\n# 基本转换（无 AI 增强）\npython script\u002Frun_pdf2ppt_cli.py --input slides.pdf\n\n# 带 AI 增强\npython script\u002Frun_pdf2ppt_cli.py \\\n  --input slides.pdf \\\n  --use-ai-edit \\\n  --api-key sk-xxx\n```\n\n**4. Image2PPT CLI** - 将图片转换为可编辑的 PPT\n\n```bash\n# 基本转换\npython script\u002Frun_image2ppt_cli.py --input screenshot.png\n\n# 带 AI 增强\npython script\u002Frun_image2ppt_cli.py \\\n  --input diagram.jpg \\\n  --use-ai-edit \\\n  --api-key sk-xxx\n```\n\n**5. 
PPT2Polish CLI** - 美化现有 PPT 文件\n\n```bash\n# 基本美化\npython script\u002Frun_ppt2polish_cli.py \\\n  --input old_presentation.pptx \\\n  --style \"学术风格，简洁优雅\" \\\n  --api-key sk-xxx\n\n# 使用参考图片保持风格一致\npython script\u002Frun_ppt2polish_cli.py \\\n  --input old_presentation.pptx \\\n  --style \"现代简约风格\" \\\n  --ref-img reference_style.png \\\n  --api-key sk-xxx\n```\n\n> [!NOTE]\n> **PPT2Polish 的系统要求：**\n> - LibreOffice：`sudo apt-get install libreoffice`（Ubuntu\u002FDebian）\n> - pdf2image：`pip install pdf2image`\n> - poppler-utils：`sudo apt-get install poppler-utils`\n\n#### 共享选项\n\n所有 CLI 脚本均支持以下通用选项：\n\n- `--api-url URL` - LLM API 地址（默认值来自 `DF_API_URL` 环境变量）\n- `--api-key KEY` - API 密钥（默认值来自 `DF_API_KEY` 环境变量）\n- `--model NAME` - 文本模型名称（默认值因脚本而异）\n- `--output-dir DIR` - 自定义输出目录（默认：`outputs\u002Fcli\u002F{脚本名}\u002F{时间戳}`）\n- `--help` - 显示详细帮助信息\n\n如需查看完整的参数说明，可在任意脚本后添加 `--help` 参数：\n\n```bash\npython script\u002Frun_paper2figure_cli.py --help\n```\n\n---\n\n## 📂 项目结构\n\n```\nPaper2Any\u002F\n├── dataflow_agent\u002F          # 核心代码库\n│   ├── agentroles\u002F         # 代理定义\n│   │   └── paper2any_agents\u002F # Paper2Any 特定代理\n│   ├── workflow\u002F           # 工作流定义\n│   ├── promptstemplates\u002F   # 提示模板\n│   └── toolkits\u002F           # 工具包（绘图、PPT 生成等）\n├── fastapi_app\u002F            # 后端 API 服务\n├── frontend-workflow\u002F      # 前端 Web 界面\n├── static\u002F                 # 静态资源\n├── script\u002F                 # 脚本工具\n└── tests\u002F                  # 测试用例\n```\n\n---\n\n## 🗺️ 路线图\n\n\u003Ctable>\n\u003Ctr>\n\u003Cth width=\"35%\">功能\u003C\u002Fth>\n\u003Cth width=\"15%\">状态\u003C\u002Fth>\n\u003Cth width=\"50%\">子功能\u003C\u002Fth>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd>\u003Cstrong>📊 Paper2Figure\u003C\u002Fstrong>\u003Cbr>\u003Csub>可编辑的科学图表\u003C\u002Fsub>\u003C\u002Ftd>\n\u003Ctd>\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FProgress-85%25-blue?style=flat-square&logo=progress\" alt=\"85%\"\u002F>\u003C\u002Ftd>\n\u003Ctd>\n\u003Cimg 
src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F✓-Model_Architecture-success?style=flat-square\" alt=\"完成\"\u002F>\u003Cbr>\n\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F✓-Technical_Roadmap-success?style=flat-square\" alt=\"完成\"\u002F>\u003Cbr>\n\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F✓-Experimental_Plots-success?style=flat-square\" alt=\"完成\"\u002F>\u003Cbr>\n\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F✓-Web_Frontend-success?style=flat-square\" alt=\"完成\"\u002F>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd>\u003Cstrong>🧩 Paper2Diagram\u003C\u002Fstrong>\u003Cbr>\u003Csub>Drawio 图表\u003C\u002Fsub>\u003C\u002Ftd>\n\u003Ctd>\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FProgress-80%25-blue?style=flat-square&logo=progress\" alt=\"80%\"\u002F>\u003C\u002Ftd>\n\u003Ctd>\n\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F✓-Paper_or_Text_to_Drawio-success?style=flat-square\" alt=\"完成\"\u002F>\u003Cbr>\n\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F✓-Image_to_Drawio-success?style=flat-square\" alt=\"完成\"\u002F>\u003Cbr>\n\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F✓-Chat_Edit-success?style=flat-square\" alt=\"完成\"\u002F>\u003Cbr>\n\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F✓-Export_Drawio_PNG_SVG-success?style=flat-square\" alt=\"完成\"\u002F>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd>\u003Cstrong>🎬 Paper2PPT\u003C\u002Fstrong>\u003Cbr>\u003Csub>可编辑的幻灯片演示文稿\u003C\u002Fsub>\u003C\u002Ftd>\n\u003Ctd>\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FProgress-70%25-yellow?style=flat-square&logo=progress\" alt=\"70%\"\u002F>\u003C\u002Ftd>\n\u003Ctd>\n\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F✓-Beamer_Style-success?style=flat-square\" alt=\"完成\"\u002F>\u003Cbr>\n\u003Cimg 
src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F✓-Long_Doc_PPT-success?style=flat-square\" alt=\"完成\"\u002F>\u003Cbr>\n\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F✓-Template_based_PPT_Generation-success?style=flat-square\" alt=\"完成\"\u002F>\u003Cbr>\n\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F✓-KB_based_PPT_Generation-success?style=flat-square\" alt=\"完成\"\u002F>\u003Cbr>\n\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F✓-Table_Extraction-success?style=flat-square\" alt=\"完成\"\u002F>\u003Cbr>\n\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F✓-Figure_Extraction-success?style=flat-square\" alt=\"完成\"\u002F>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd>\u003Cstrong>🖼️ PDF2PPT\u003C\u002Fstrong>\u003Cbr>\u003Csub>保持版面的转换\u003C\u002Fsub>\u003C\u002Ftd>\n\u003Ctd>\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FProgress-90%25-green?style=flat-square&logo=progress\" alt=\"90%\"\u002F>\u003C\u002Ftd>\n\u003Ctd>\n\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F✓-Smart_Cutout-success?style=flat-square\" alt=\"完成\"\u002F>\u003Cbr>\n\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F✓-Layout_Preservation-success?style=flat-square\" alt=\"完成\"\u002F>\u003Cbr>\n\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F✓-Editable_PPTX-success?style=flat-square\" alt=\"完成\"\u002F>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd>\u003Cstrong>🖼️ Image2PPT\u003C\u002Fstrong>\u003Cbr>\u003Csub>图片转幻灯片\u003C\u002Fsub>\u003C\u002Ftd>\n\u003Ctd>\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FProgress-85%25-blue?style=flat-square&logo=progress\" alt=\"85%\"\u002F>\u003C\u002Ftd>\n\u003Ctd>\n\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F✓-Single_or_Multi_Image_Input-success?style=flat-square\" alt=\"完成\"\u002F>\u003Cbr>\n\u003Cimg 
src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F✓-Layout_Aware_Slides-success?style=flat-square\" alt=\"完成\"\u002F>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd>\u003Cstrong>🎨 PPTPolish\u003C\u002Fstrong>\u003Cbr>\u003Csub>智能美化\u003C\u002Fsub>\u003C\u002Ftd>\n\u003Ctd>\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FProgress-60%25-yellow?style=flat-square&logo=progress\" alt=\"60%\"\u002F>\u003C\u002Ftd>\n\u003Ctd>\n\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F✓-Style_Transfer-success?style=flat-square\" alt=\"完成\"\u002F>\u003Cbr>\n\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F⚠-Layout_Optimization-yellow?style=flat-square\" alt=\"进行中\"\u002F>\u003Cbr>\n\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F⚠-Reference_Image_Polish-yellow?style=flat-square\" alt=\"进行中\"\u002F>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd>\u003Cstrong>📚 Knowledge Base\u003C\u002Fstrong>\u003Cbr>\u003Csub>知识库工作流\u003C\u002Fsub>\u003C\u002Ftd>\n\u003Ctd>\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FProgress-75%25-blue?style=flat-square&logo=progress\" alt=\"75%\"\u002F>\u003C\u002Ftd>\n\u003Ctd>\n\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F✓-Ingest_and_Embedding-success?style=flat-square\" alt=\"完成\"\u002F>\u003Cbr>\n\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F✓-Semantic_Search-success?style=flat-square\" alt=\"完成\"\u002F>\u003Cbr>\n\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F✓-KB_PPT_Podcast_Mindmap-success?style=flat-square\" alt=\"完成\"\u002F>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd>\u003Cstrong>🎬 Paper2Video\u003C\u002Fstrong>\u003Cbr>\u003Csub>视频脚本生成\u003C\u002Fsub>\u003C\u002Ftd>\n\u003Ctd>\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FProgress-40%25-yellow?style=flat-square&logo=progress\" alt=\"40%\"\u002F>\u003C\u002Ftd>\n\u003Ctd>\n\u003Cimg 
src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F⚠-Script_and_Narration-yellow?style=flat-square\" alt=\"进行中\"\u002F>\u003Cbr>\n\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F⚠-Storyboard_Assets-yellow?style=flat-square\" alt=\"进行中\"\u002F>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003C\u002Ftable>\n\n---\n\n## 🤝 贡献\n\n我们欢迎任何形式的贡献！\n\n[![Issues](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FIssues-Submit_Bug-red?style=for-the-badge&logo=github)](https:\u002F\u002Fgithub.com\u002FOpenDCAI\u002FPaper2Any\u002Fissues)\n[![Discussions](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FDiscussions-Feature_Request-blue?style=for-the-badge&logo=github)](https:\u002F\u002Fgithub.com\u002FOpenDCAI\u002FPaper2Any\u002Fdiscussions)\n[![PR](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FPR-Submit_Code-green?style=for-the-badge&logo=github)](https:\u002F\u002Fgithub.com\u002FOpenDCAI\u002FPaper2Any\u002Fpulls)\n\n---\n\n## 📄 许可证\n\n本项目采用 [Apache License 2.0](LICENSE) 许可。\n\n\u003C!-- ---\n\n## 星级历史\n\n[![Star History Chart](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOpenDCAI_Paper2Any_readme_a92212198a53.png)](https:\u002F\u002Fstar-history.com\u002F#OpenDCAI\u002FPaper2Any&Date) -->\n\n---\n\n\u003Cdiv align=\"center\">\n\n**如果这个项目对您有帮助，请给我们一颗⭐️星！**\n\n[![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FOpenDCAI\u002FPaper2Any?style=social)](https:\u002F\u002Fgithub.com\u002FOpenDCAI\u002FPaper2Any\u002Fstargazers)\n[![GitHub forks](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fforks\u002FOpenDCAI\u002FPaper2Any?style=social)](https:\u002F\u002Fgithub.com\u002FOpenDCAI\u002FPaper2Any\u002Fnetwork\u002Fmembers)\n\n\u003Cbr>\n\n\u003Ca name=\"wechat-group\">\u003C\u002Fa>\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOpenDCAI_Paper2Any_readme_cbdaf7956775.png\" alt=\"DataFlow-Agent 微信社区\" width=\"300\"\u002F>\n\u003Cbr>\n\u003Csub>扫描加入社区微信群\u003C\u002Fsub>\n\n\u003Cp align=\"center\"> \n  
\u003Cem> ❤️ 由 OpenDCAI 团队制作\u003C\u002Fem>\n\u003C\u002Fp>\n\n\u003C\u002Fdiv>","# Paper2Any 快速上手指南\n\nPaper2Any 是一款专注于论文多模态工作流的 AI 工具，支持从论文 PDF、截图或文本一键生成可编辑的模型图、技术路线图、实验图表、PPT 演示文稿及学术海报等。\n\n## 🛠️ 环境准备\n\n在开始之前，请确保您的开发环境满足以下要求：\n\n*   **操作系统**：Linux \u002F macOS \u002F Windows (推荐 WSL2)\n*   **Python 版本**：3.11 或更高版本\n*   **依赖管理**：pip 或 conda\n*   **网络环境**：需能访问 Hugging Face 或 ModelScope（国内用户建议配置镜像源）\n\n## 📦 安装步骤\n\n### 1. 克隆项目\n首先从 GitHub 获取源代码：\n\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002FOpenDCAI\u002FPaper2Any.git\ncd Paper2Any\n```\n\n### 2. 创建虚拟环境\n推荐使用 `conda` 或 `venv` 隔离环境（以 conda 为例）：\n\n```bash\nconda create -n paper2any python=3.11 -y\nconda activate paper2any\n```\n\n### 3. 安装依赖\n安装基础依赖，并以可编辑模式安装项目本身。国内用户建议使用清华或阿里镜像源加速安装：\n\n```bash\n# 使用 pip 安装（推荐国内镜像）\npip install -r requirements-base.txt -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple\npip install -e .\n```\n\n> **注意**：LaTeX 渲染、矢量图形处理以及 PPT\u002FPDF 转换等功能还需要 `requirements-paper.txt` 中的依赖，以及 tectonic、Inkscape、LibreOffice 等系统组件；分步说明请参考 `docs\u002Fguides\u002Fconfiguration.md`。\n\n## 🚀 基本使用\n\nPaper2Any 提供了 Web 界面和命令行脚本（CLI）两种主要使用方式。以下是启动本地 Web 服务的最简示例。\n\n### 启动 Web 服务\n先在项目根目录下启动后端 API，再在新终端中启动前端开发服务器（端口和工作进程数可在 `deploy\u002Fapp_config.sh` 中调整）：\n\n```bash\n.\u002Fdeploy\u002Fstart.sh\ncd frontend-workflow && npm install && npm run dev\n```\n\n启动成功后，在浏览器中访问前端地址 `http:\u002F\u002Flocalhost:3000`；后端健康检查地址默认为 `http:\u002F\u002F127.0.0.1:8000\u002Fhealth`。\n\n### 核心功能操作示例\n\n打开浏览器访问上述前端地址，您将看到主界面。以下是几个典型工作流的操作逻辑：\n\n1.  **生成可编辑科学图表 (Paper2Figure)**\n    *   上传论文 PDF 或模型架构图截图。\n    *   选择 \"Model Architecture\" 或 \"Technical Roadmap\" 模式。\n    *   点击生成，系统将输出可编辑的 PPTX 或 SVG 文件。\n\n2.  **论文转 PPT (Paper2PPT)**\n    *   上传长篇论文文档或直接输入研究主题。\n    *   利用 \"AI-assisted outline\" 调整大纲结构。\n    *   选择主题风格，一键生成包含图表提取的多页幻灯片。\n\n3.  
**图片转 Drawio (Image2Drawio)**\n    *   上传现有的流程图或架构截图。\n    *   系统自动识别元素并转换为 Drawio 可编辑画布，支持通过对话进一步修改。\n\n### 在线体验\n如果您暂时不想部署本地环境，可以直接访问官方提供的在线演示：\n🌐 [http:\u002F\u002Fdcai-paper2any.nas.cpolar.cn\u002F](http:\u002F\u002Fdcai-paper2any.nas.cpolar.cn\u002F)","某高校计算机视觉实验室的博士生李明，刚完成一篇关于新型神经网络架构的论文初稿，急需将文中复杂的模型结构和实验数据转化为高质量的学术图表与会议汇报 PPT。\n\n### 没有 Paper2Any 时\n- **绘图耗时极长**：需要手动在 Visio 或 Draw.io 中重新绘制模型架构图，反复调整线条对齐和配色，往往耗费数天时间。\n- **信息提取易错**：从 PDF 论文中摘录实验数据并重绘曲线图时，容易因人工抄录导致数据偏差，且风格难以统一。\n- **PPT 制作繁琐**：为了准备组会汇报，需手动将文字内容拆解到幻灯片，再逐一插入图片，排版过程机械且枯燥。\n- **修改成本高昂**：一旦论文逻辑微调，所有相关的图表和幻灯片都需要人工逐个返工，版本管理混乱。\n\n### 使用 Paper2Any 后\n- **一键生成可编辑图表**：直接上传论文 PDF，Paper2Any 自动解析并生成标准的模型架构图和技术路线图，输出为可二次编辑的 Draw.io 格式。\n- **精准还原实验数据**：工具自动识别文中的实验结果，瞬间重绘出高保真的实验对比曲线图，确保数据零误差且风格专业。\n- **自动化构建演示文稿**：输入论文主题，Paper2Any 即刻生成包含完整逻辑链条的多页 PPT，自动匹配学术风格模板并填入核心内容。\n- **敏捷响应迭代需求**：当论文内容更新时，只需重新运行流程，所有图表和幻灯片自动同步最新逻辑，大幅缩短修改周期。\n\nPaper2Any 将研究人员从繁琐的“美工”工作中解放出来，实现了从论文原稿到高质量学术可视化成果的分钟级转化。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOpenDCAI_Paper2Any_2914b1b5.png","OpenDCAI","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002FOpenDCAI_da5c3cd0.png","Define the future of Data-centric AI 
together",null,"opendcai@iaar.ac.cn","https:\u002F\u002Fgithub.com\u002FOpenDCAI",[82,86,90,94,98,102,106,110,113,116],{"name":83,"color":84,"percentage":85},"Python","#3572A5",68.8,{"name":87,"color":88,"percentage":89},"TypeScript","#3178c6",29.3,{"name":91,"color":92,"percentage":93},"Shell","#89e051",0.9,{"name":95,"color":96,"percentage":97},"Jinja","#a52a22",0.5,{"name":99,"color":100,"percentage":101},"PLpgSQL","#336790",0.4,{"name":103,"color":104,"percentage":105},"CSS","#663399",0.1,{"name":107,"color":108,"percentage":109},"Dockerfile","#384d54",0,{"name":111,"color":112,"percentage":109},"JavaScript","#f1e05a",{"name":114,"color":115,"percentage":109},"Makefile","#427819",{"name":117,"color":118,"percentage":109},"HTML","#e34c26",2082,145,"2026-04-05T10:22:53","Apache-2.0",4,"未说明",{"notes":126,"python":127,"dependencies":128},"README 中未详细列出具体的系统依赖库、GPU 型号或内存需求。该工具支持多种大模型（如 GPT-4o, Qwen-VL）并通过 API 动态配置，因此实际运行资源取决于所选用的后端模型服务。项目提供在线演示环境，本地部署需参考 docs\u002F 目录下的详细文档。","3.11+",[124],[14,13,15],[131,132,133,134,135,136,137],"agent","ai","aippt","langgraph","paper2slides","editable-pptx","ppt-generator","2026-03-27T02:49:30.150509","2026-04-06T08:09:02.100404",[141,146,151,156,161,165],{"id":142,"question_zh":143,"answer_zh":144,"source_url":145},17659,"如何配置和使用国产大模型（如 DeepSeek）或其他自定义模型？","在最新代码中，您可以分别在前端和后端的 `.env` 文件中修改模型名称和 API URL 进行配置。请注意，您需要确保所选的服务商支持相应的调用格式。维护者表示后续版本迭代中会进一步优化模型的灵活选择支持。","https:\u002F\u002Fgithub.com\u002FOpenDCAI\u002FPaper2Any\u002Fissues\u002F103",{"id":147,"question_zh":148,"answer_zh":149,"source_url":150},17660,"项目是否支持统一配置全局模型，而不是将模型写死在代码中？","是的，针对模型硬编码的问题，开发团队正在积极修改以灵活支持更多效果更好的模型。目前建议在本地部署时，通过修改配置文件来统一设置所使用的模型。未来版本将提供全局配置界面以支持更灵活的模型切换。","https:\u002F\u002Fgithub.com\u002FOpenDCAI\u002FPaper2Any\u002Fissues\u002F90",{"id":152,"question_zh":153,"answer_zh":154,"source_url":155},17661,"UI 界面中的模型和 URL 是硬编码的吗？如何修改？","此前 UI 中存在硬编码问题，但维护者已确认工作流（wf）中的硬编码问题已经处理完毕。如果您仍遇到相关问题，请确保拉取了最新代码，并检查前后端的 `.env` 配置文件是否正确设置了模型和 
URL。","https:\u002F\u002Fgithub.com\u002FOpenDCAI\u002FPaper2Any\u002Fissues\u002F99",{"id":157,"question_zh":158,"answer_zh":159,"source_url":160},17662,"启动后端服务时出现 'undefined symbol: ncclGroupSimulateEnd' 报错怎么办？","该错误通常与 PyTorch 或 CUDA 环境依赖冲突有关（特别是 `libtorch_cuda.so`）。虽然当前 Issue 中未给出最终确切的解决步骤，但建议检查您的 Python 环境中 `torch` 和 `nccl` 库的版本兼容性，尝试重新安装匹配您 CUDA 版本的 PyTorch，或创建一个新的干净虚拟环境并按文档重新安装依赖。","https:\u002F\u002Fgithub.com\u002FOpenDCAI\u002FPaper2Any\u002Fissues\u002F113",{"id":162,"question_zh":163,"answer_zh":164,"source_url":150},17663,"项目支持哪些具体的模型列表？","目前项目正在积极扩展支持的模型列表，特别是国产模型。具体支持的模型取决于您在 `.env` 文件中配置的服务商及其 API 格式兼容性。建议关注项目的后续更新公告以获取官方推荐的支持模型列表。",{"id":166,"question_zh":167,"answer_zh":168,"source_url":145},17664,"为什么我更新了最新版本仍然无法选择国产模型？","虽然最新代码允许通过修改 `.env` 文件来配置模型，但可能需要手动调整前后端配置才能生效。如果界面仍未显示选项，可能是因为前端缓存或配置未正确加载。请尝试清除浏览器缓存并重启后端服务，同时确认 `.env` 文件中的模型参数格式符合服务商要求。",[]]