[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-vxcontrol--pentagi":3,"tool-vxcontrol--pentagi":61},[4,18,26,36,44,52],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":17},4358,"openclaw","openclaw\u002Fopenclaw","OpenClaw 是一款专为个人打造的本地化 AI 助手，旨在让你在自己的设备上拥有完全可控的智能伙伴。它打破了传统 AI 助手局限于特定网页或应用的束缚，能够直接接入你日常使用的各类通讯渠道，包括微信、WhatsApp、Telegram、Discord、iMessage 等数十种平台。无论你在哪个聊天软件中发送消息，OpenClaw 都能即时响应，甚至支持在 macOS、iOS 和 Android 设备上进行语音交互，并提供实时的画布渲染功能供你操控。\n\n这款工具主要解决了用户对数据隐私、响应速度以及“始终在线”体验的需求。通过将 AI 部署在本地，用户无需依赖云端服务即可享受快速、私密的智能辅助，真正实现了“你的数据，你做主”。其独特的技术亮点在于强大的网关架构，将控制平面与核心助手分离，确保跨平台通信的流畅性与扩展性。\n\nOpenClaw 非常适合希望构建个性化工作流的技术爱好者、开发者，以及注重隐私保护且不愿被单一生态绑定的普通用户。只要具备基础的终端操作能力（支持 macOS、Linux 及 Windows WSL2），即可通过简单的命令行引导完成部署。如果你渴望拥有一个懂你",349277,3,"2026-04-06T06:32:30",[13,14,15,16],"Agent","开发框架","图像","数据工具","ready",{"id":19,"name":20,"github_repo":21,"description_zh":22,"stars":23,"difficulty_score":10,"last_commit_at":24,"category_tags":25,"status":17},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,"2026-04-05T11:01:52",[14,15,13],{"id":27,"name":28,"github_repo":29,"description_zh":30,"stars":31,"difficulty_score":32,"last_commit_at":33,"category_tags":34,"status":17},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",142651,2,"2026-04-06T23:34:12",[14,13,35],"语言模型",{"id":37,"name":38,"github_repo":39,"description_zh":40,"stars":41,"difficulty_score":32,"last_commit_at":42,"category_tags":43,"status":17},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",107888,"2026-04-06T11:32:50",[14,15,13],{"id":45,"name":46,"github_repo":47,"description_zh":48,"stars":49,"difficulty_score":10,"last_commit_at":50,"category_tags":51,"status":17},4487,"LLMs-from-scratch","rasbt\u002FLLMs-from-scratch","LLMs-from-scratch 是一个基于 PyTorch 的开源教育项目，旨在引导用户从零开始一步步构建一个类似 ChatGPT 的大型语言模型（LLM）。它不仅是同名技术著作的官方代码库，更提供了一套完整的实践方案，涵盖模型开发、预训练及微调的全过程。\n\n该项目主要解决了大模型领域“黑盒化”的学习痛点。许多开发者虽能调用现成模型，却难以深入理解其内部架构与训练机制。通过亲手编写每一行核心代码，用户能够透彻掌握 Transformer 
architecture, attention mechanisms, and the other key principles, and come to genuinely understand how large models \"think\". In addition, the project includes code for loading large pretrained weights for fine-tuning, helping you extend theoretical knowledge into real-world application.\n\nLLMs-from-scratch is ideal for AI developers, researchers, and computer science students who want to dig into the underlying principles. For engineers who are not satisfied with merely calling an API and want to explore the details of how models are built, it is an excellent learning resource. Its distinctive strength lies in its step-by-step teaching design: it breaks a complex piece of systems engineering into clear stages, pairing them with detailed diagrams and examples, so that building a small but fully functional large model becomes genuinely attainable. Whether you want to solidify your theoretical foundations or prepare for developing larger-scale models in the future, this project offers a solid starting point.",90106,"2026-04-06T11:19:32",[35,15,13,14],{"id":53,"name":54,"github_repo":55,"description_zh":56,"stars":57,"difficulty_score":10,"last_commit_at":58,"category_tags":59,"status":17},4292,"Deep-Live-Cam","hacksider\u002FDeep-Live-Cam","Deep-Live-Cam is an open-source tool focused on real-time face swapping and video generation: with just a single still photo, a \"one-click\" operation swaps the face in your webcam feed live or produces a deepfake video. It removes the pain points of traditional face-swap workflows, which are cumbersome, demand high-end hardware, and rarely offer real-time preview, putting high-quality digital content creation within easy reach.\n\nThe tool suits developers and researchers exploring the limits of the algorithms, but thanks to its minimal three-step flow (pick a face, pick a camera, start), it also serves everyday users, content creators, designers, and live streamers. Whether for animating custom characters, swapping in clothing models, or making playful short videos and interactive streams, Deep-Live-Cam delivers smooth support.\n\nIts core technical strengths are powerful real-time processing, a Mouth Mask feature that preserves the user's original mouth movements for natural, precise expressions, and face mapping that can apply different faces to multiple subjects in the same frame. The project also ships strict content-safety filtering that automatically blocks inappropriate material such as nudity and violence, and it urges users to obtain consent and label output clearly, balancing technical progress with ethical responsibility.",88924,"2026-04-06T03:28:53",[14,15,13,60],"Video",{"id":62,"github_repo":63,"name":64,"description_en":65,"description_zh":66,"ai_summary_zh":66,"readme_en":67,"readme_zh":68,"quickstart_zh":69,"use_case_zh":70,"hero_image_url":71,"owner_login":72,"owner_name":73,"owner_avatar_url":74,"owner_bio":75,"owner_company":76,"owner_location":76,"owner_email":77,"owner_twitter":76,"owner_website":78,"owner_url":79,"languages":80,"stars":116,"forks":117,"last_commit_at":118,"license":119,"difficulty_score":120,"env_os":121,"env_gpu":122,"env_ram":123,"env_deps":124,"category_tags":133,"github_topics":134,"view_count":32,"oss_zip_url":76,"oss_zip_packed_at":76,"status":17,"created_at":153,"updated_at":154,"faqs":155,"releases":188},4916,"vxcontrol\u002Fpentagi","pentagi","Fully autonomous AI Agents system capable of performing complex penetration testing tasks","PentAGI is an AI-driven, fully autonomous penetration testing system that helps security professionals carry out complex security assessments efficiently. Its self-directed agents automatically plan and execute the full testing workflow, from reconnaissance and vulnerability scanning to exploit validation, addressing the pain points of traditional penetration testing: high labor costs, tedious procedures, and easily missed critical paths.\n\nThe tool is a strong fit for security researchers, ethical hackers, and enterprise teams that need automated security audits. Even practitioners without deep scripting skills can run professional-grade tests through its intelligent workflow.\n\nPentAGI's technical highlights include a tightly isolated Docker sandbox that keeps every operation safe and controlled; a built-in suite of 20+ mainstream security tools (such as Nmap and Metasploit); and a \"team of expert agents\" collaboration model that assigns dedicated AI roles to different tasks. It also integrates a knowledge graph and long-term memory system that accumulates experience from past tests to refine future strategy, plus real-time monitoring and detailed vulnerability report generation, making the whole testing process more transparent, intelligent, and traceable.","# PentAGI\n\n\u003Cdiv align=\"center\" style=\"font-size: 1.5em; margin: 20px 0;\">\n    \u003Cstrong>P\u003C\u002Fstrong>enetration testing \u003Cstrong>A\u003C\u002Fstrong>rtificial \u003Cstrong>G\u003C\u002Fstrong>eneral \u003Cstrong>I\u003C\u002Fstrong>ntelligence\n\u003C\u002Fdiv>\n\u003Cbr>\n\u003Cdiv align=\"center\">\n\n> **Join the Community!** Connect with security researchers, AI enthusiasts, and fellow ethical hackers. 
Get support, share insights, and stay updated with the latest PentAGI developments.\n\n[![Discord](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FDiscord-7289DA?logo=discord&logoColor=white)](https:\u002F\u002Fdiscord.gg\u002F2xrMh7qX6m)⠀[![Telegram](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FTelegram-2CA5E0?logo=telegram&logoColor=white)](https:\u002F\u002Ft.me\u002F+Ka9i6CNwe71hMWQy)\n\n\u003Ca href=\"https:\u002F\u002Ftrendshift.io\u002Frepositories\u002F15161\" target=\"_blank\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fvxcontrol_pentagi_readme_4a68feb902da.png\" alt=\"vxcontrol%2Fpentagi | Trendshift\" style=\"width: 250px; height: 55px;\" width=\"250\" height=\"55\"\u002F>\u003C\u002Fa>\n\n\u003C\u002Fdiv>\n\n## Table of Contents\n\n- [Overview](#-overview)\n- [Features](#-features)\n- [Quick Start](#-quick-start)\n- [API Access](#-api-access)\n- [Advanced Setup](#-advanced-setup)\n- [Development](#-development)\n- [Testing LLM Agents](#-testing-llm-agents)\n- [Embedding Configuration and Testing](#-embedding-configuration-and-testing)\n- [Function Testing with ftester](#-function-testing-with-ftester)\n- [Building](#%EF%B8%8F-building)\n- [Credits](#-credits)\n- [License](#-license)\n\n## Overview\n\nPentAGI is an innovative tool for automated security testing that leverages cutting-edge artificial intelligence technologies. The project is designed for information security professionals, researchers, and enthusiasts who need a powerful and flexible solution for conducting penetration tests.\n\nYou can watch the video **PentAGI overview**:\n[![PentAGI Overview Video](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fvxcontrol_pentagi_readme_51b6624d8609.png)](https:\u002F\u002Fyoutu.be\u002FR70x5Ddzs1o)\n\n## Features\n\n- Secure & Isolated. All operations are performed in a sandboxed Docker environment with complete isolation.\n- Fully Autonomous. AI-powered agent that automatically determines and executes penetration testing steps with optional execution monitoring and intelligent task planning for enhanced reliability.\n- Professional Pentesting Tools. Built-in suite of 20+ professional security tools including nmap, metasploit, sqlmap, and more.\n- Smart Memory System. Long-term storage of research results and successful approaches for future use.\n- Knowledge Graph Integration. Graphiti-powered knowledge graph using Neo4j for semantic relationship tracking and advanced context understanding.\n- Web Intelligence. Built-in browser via [scraper](https:\u002F\u002Fhub.docker.com\u002Fr\u002Fvxcontrol\u002Fscraper) for gathering latest information from web sources.\n- External Search Systems. Integration with advanced search APIs including [Tavily](https:\u002F\u002Ftavily.com), [Traversaal](https:\u002F\u002Ftraversaal.ai), [Perplexity](https:\u002F\u002Fwww.perplexity.ai), [DuckDuckGo](https:\u002F\u002Fduckduckgo.com\u002F), [Google Custom Search](https:\u002F\u002Fprogrammablesearchengine.google.com\u002F), [Sploitus Search](https:\u002F\u002Fsploitus.com) and [Searxng](https:\u002F\u002Fsearxng.org) for comprehensive information gathering.\n- Team of Specialists. Delegation system with specialized AI agents for research, development, and infrastructure tasks, enhanced with optional execution monitoring and intelligent task planning for optimal performance with smaller models.\n- Comprehensive Monitoring. Detailed logging and integration with Grafana\u002FPrometheus for real-time system observation.\n- Detailed Reporting. 
Generation of thorough vulnerability reports with exploitation guides.\n- Smart Container Management. Automatic Docker image selection based on specific task requirements.\n- Modern Interface. Clean and intuitive web UI for system management and monitoring.\n- Comprehensive APIs. Full-featured REST and GraphQL APIs with Bearer token authentication for automation and integration.\n- Persistent Storage. All commands and outputs are stored in PostgreSQL with [pgvector](https:\u002F\u002Fhub.docker.com\u002Fr\u002Fvxcontrol\u002Fpgvector) extension.\n- Scalable Architecture. Microservices-based design supporting horizontal scaling.\n- Self-Hosted Solution. Complete control over your deployment and data.\n- Flexible Authentication. Support for 10+ LLM providers ([OpenAI](https:\u002F\u002Fplatform.openai.com\u002F), [Anthropic](https:\u002F\u002Fwww.anthropic.com\u002F), [Google AI\u002FGemini](https:\u002F\u002Fai.google.dev\u002F), [AWS Bedrock](https:\u002F\u002Faws.amazon.com\u002Fbedrock\u002F), [Ollama](https:\u002F\u002Follama.com\u002F), [DeepSeek](https:\u002F\u002Fwww.deepseek.com\u002Fen\u002F), [GLM](https:\u002F\u002Fz.ai\u002F), [Kimi](https:\u002F\u002Fplatform.moonshot.ai\u002F), [Qwen](https:\u002F\u002Fwww.alibabacloud.com\u002Fen\u002F), Custom) plus aggregators ([OpenRouter](https:\u002F\u002Fopenrouter.ai\u002F), [DeepInfra](https:\u002F\u002Fdeepinfra.com\u002F)). For production local deployments, see our [vLLM + Qwen3.5-27B-FP8 guide](examples\u002Fguides\u002Fvllm-qwen35-27b-fp8.md).\n- API Token Authentication. Secure Bearer token system for programmatic access to REST and GraphQL APIs.\n- Quick Deployment. Easy setup through [Docker Compose](https:\u002F\u002Fdocs.docker.com\u002Fcompose\u002F) with comprehensive environment configuration.\n\n## Architecture\n\n### System Context\n\n```mermaid\nflowchart TB\n    classDef person fill:#08427B,stroke:#073B6F,color:#fff\n    classDef system fill:#1168BD,stroke:#0B4884,color:#fff\n    classDef external fill:#666666,stroke:#0B4884,color:#fff\n\n    pentester[\"👤 Security Engineer\n    (User of the system)\"]\n\n    pentagi[\"✨ PentAGI\n    (Autonomous penetration testing system)\"]\n\n    target[\"🎯 target-system\n    (System under test)\"]\n    llm[\"🧠 llm-provider\n    (OpenAI\u002FAnthropic\u002FOllama\u002FBedrock\u002FGemini\u002FCustom)\"]\n    search[\"🔍 search-systems\n    (Google\u002FDuckDuckGo\u002FTavily\u002FTraversaal\u002FPerplexity\u002FSploitus\u002FSearxng)\"]\n    langfuse[\"📊 langfuse-ui\n    (LLM Observability Dashboard)\"]\n    grafana[\"📈 grafana\n    (System Monitoring Dashboard)\"]\n\n    pentester --> |Uses HTTPS| pentagi\n    pentester --> |Monitors AI HTTPS| langfuse\n    pentester --> |Monitors System HTTPS| grafana\n    pentagi --> |Tests Various protocols| target\n    pentagi --> |Queries HTTPS| llm\n    pentagi --> |Searches HTTPS| search\n    pentagi --> |Reports HTTPS| langfuse\n    pentagi --> |Reports HTTPS| grafana\n\n    class pentester person\n    class pentagi system\n    class target,llm,search,langfuse,grafana external\n\n    linkStyle default stroke:#ffffff,color:#ffffff\n```\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>Container Architecture\u003C\u002Fb> (click to expand)\u003C\u002Fsummary>\n\n```mermaid\ngraph TB\n    subgraph Core Services\n        UI[Frontend UI\u003Cbr\u002F>React + TypeScript]\n        API[Backend API\u003Cbr\u002F>Go + GraphQL]\n        DB[(Vector Store\u003Cbr\u002F>PostgreSQL + pgvector)]\n        MQ[Task Queue\u003Cbr\u002F>Async Processing]\n        
Agent[AI Agents\u003Cbr\u002F>Multi-Agent System]\n    end\n\n    subgraph Knowledge Graph\n        Graphiti[Graphiti\u003Cbr\u002F>Knowledge Graph API]\n        Neo4j[(Neo4j\u003Cbr\u002F>Graph Database)]\n    end\n\n    subgraph Monitoring\n        Grafana[Grafana\u003Cbr\u002F>Dashboards]\n        VictoriaMetrics[VictoriaMetrics\u003Cbr\u002F>Time-series DB]\n        Jaeger[Jaeger\u003Cbr\u002F>Distributed Tracing]\n        Loki[Loki\u003Cbr\u002F>Log Aggregation]\n        OTEL[OpenTelemetry\u003Cbr\u002F>Data Collection]\n    end\n\n    subgraph Analytics\n        Langfuse[Langfuse\u003Cbr\u002F>LLM Analytics]\n        ClickHouse[ClickHouse\u003Cbr\u002F>Analytics DB]\n        Redis[Redis\u003Cbr\u002F>Cache + Rate Limiter]\n        MinIO[MinIO\u003Cbr\u002F>S3 Storage]\n    end\n\n    subgraph Security Tools\n        Scraper[Web Scraper\u003Cbr\u002F>Isolated Browser]\n        PenTest[Security Tools\u003Cbr\u002F>20+ Pro Tools\u003Cbr\u002F>Sandboxed Execution]\n    end\n\n    UI --> |HTTP\u002FWS| API\n    API --> |SQL| DB\n    API --> |Events| MQ\n    MQ --> |Tasks| Agent\n    Agent --> |Commands| PenTest\n    Agent --> |Queries| DB\n    Agent --> |Knowledge| Graphiti\n    Graphiti --> |Graph| Neo4j\n\n    API --> |Telemetry| OTEL\n    OTEL --> |Metrics| VictoriaMetrics\n    OTEL --> |Traces| Jaeger\n    OTEL --> |Logs| Loki\n\n    Grafana --> |Query| VictoriaMetrics\n    Grafana --> |Query| Jaeger\n    Grafana --> |Query| Loki\n\n    API --> |Analytics| Langfuse\n    Langfuse --> |Store| ClickHouse\n    Langfuse --> |Cache| Redis\n    Langfuse --> |Files| MinIO\n\n    classDef core fill:#f9f,stroke:#333,stroke-width:2px,color:#000\n    classDef knowledge fill:#ffa,stroke:#333,stroke-width:2px,color:#000\n    classDef monitoring fill:#bbf,stroke:#333,stroke-width:2px,color:#000\n    classDef analytics fill:#bfb,stroke:#333,stroke-width:2px,color:#000\n    classDef tools fill:#fbb,stroke:#333,stroke-width:2px,color:#000\n\n    class UI,API,DB,MQ,Agent core\n    class Graphiti,Neo4j knowledge\n    class Grafana,VictoriaMetrics,Jaeger,Loki,OTEL monitoring\n    class Langfuse,ClickHouse,Redis,MinIO analytics\n    class Scraper,PenTest tools\n```\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>Entity Relationship\u003C\u002Fb> (click to expand)\u003C\u002Fsummary>\n\n```mermaid\nerDiagram\n    Flow ||--o{ Task : contains\n    Task ||--o{ SubTask : contains\n    SubTask ||--o{ Action : contains\n    Action ||--o{ Artifact : produces\n    Action ||--o{ Memory : stores\n\n    Flow {\n        string id PK\n        string name \"Flow name\"\n        string description \"Flow description\"\n        string status \"active\u002Fcompleted\u002Ffailed\"\n        json parameters \"Flow parameters\"\n        timestamp created_at\n        timestamp updated_at\n    }\n\n    Task {\n        string id PK\n        string flow_id FK\n        string name \"Task name\"\n        string description \"Task description\"\n        string status \"pending\u002Frunning\u002Fdone\u002Ffailed\"\n        json result \"Task results\"\n        timestamp created_at\n        timestamp updated_at\n    }\n\n    SubTask {\n        string id PK\n        string task_id FK\n        string name \"Subtask name\"\n        string description \"Subtask description\"\n        string status \"queued\u002Frunning\u002Fcompleted\u002Ffailed\"\n        string agent_type \"researcher\u002Fdeveloper\u002Fexecutor\"\n        json context \"Agent context\"\n        timestamp created_at\n        timestamp updated_at\n    
}\n\n    Action {\n        string id PK\n        string subtask_id FK\n        string type \"command\u002Fsearch\u002Fanalyze\u002Fetc\"\n        string status \"success\u002Ffailure\"\n        json parameters \"Action parameters\"\n        json result \"Action results\"\n        timestamp created_at\n    }\n\n    Artifact {\n        string id PK\n        string action_id FK\n        string type \"file\u002Freport\u002Flog\"\n        string path \"Storage path\"\n        json metadata \"Additional info\"\n        timestamp created_at\n    }\n\n    Memory {\n        string id PK\n        string action_id FK\n        string type \"observation\u002Fconclusion\"\n        vector embedding \"Vector representation\"\n        text content \"Memory content\"\n        timestamp created_at\n    }\n```\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>Agent Interaction\u003C\u002Fb> (click to expand)\u003C\u002Fsummary>\n\n```mermaid\nsequenceDiagram\n    participant O as Orchestrator\n    participant R as Researcher\n    participant D as Developer\n    participant E as Executor\n    participant VS as Vector Store\n    participant KB as Knowledge Base\n\n    Note over O,KB: Flow Initialization\n    O->>VS: Query similar tasks\n    VS-->>O: Return experiences\n    O->>KB: Load relevant knowledge\n    KB-->>O: Return context\n\n    Note over O,R: Research Phase\n    O->>R: Analyze target\n    R->>VS: Search similar cases\n    VS-->>R: Return patterns\n    R->>KB: Query vulnerabilities\n    KB-->>R: Return known issues\n    R->>VS: Store findings\n    R-->>O: Research results\n\n    Note over O,D: Planning Phase\n    O->>D: Plan attack\n    D->>VS: Query exploits\n    VS-->>D: Return techniques\n    D->>KB: Load tools info\n    KB-->>D: Return capabilities\n    D-->>O: Attack plan\n\n    Note over O,E: Execution Phase\n    O->>E: Execute plan\n    E->>KB: Load tool guides\n    KB-->>E: Return procedures\n    E->>VS: Store results\n    E-->>O: Execution status\n```\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>Memory System\u003C\u002Fb> (click to expand)\u003C\u002Fsummary>\n\n```mermaid\ngraph TB\n    subgraph \"Long-term Memory\"\n        VS[(Vector Store\u003Cbr\u002F>Embeddings DB)]\n        KB[Knowledge Base\u003Cbr\u002F>Domain Expertise]\n        Tools[Tools Knowledge\u003Cbr\u002F>Usage Patterns]\n    end\n\n    subgraph \"Working Memory\"\n        Context[Current Context\u003Cbr\u002F>Task State]\n        Goals[Active Goals\u003Cbr\u002F>Objectives]\n        State[System State\u003Cbr\u002F>Resources]\n    end\n\n    subgraph \"Episodic Memory\"\n        Actions[Past Actions\u003Cbr\u002F>Commands History]\n        Results[Action Results\u003Cbr\u002F>Outcomes]\n        Patterns[Success Patterns\u003Cbr\u002F>Best Practices]\n    end\n\n    Context --> |Query| VS\n    VS --> |Retrieve| Context\n\n    Goals --> |Consult| KB\n    KB --> |Guide| Goals\n\n    State --> |Record| Actions\n    Actions --> |Learn| Patterns\n    Patterns --> |Store| VS\n\n    Tools --> |Inform| State\n    Results --> |Update| Tools\n\n    VS --> |Enhance| KB\n    KB --> |Index| VS\n\n    classDef ltm fill:#f9f,stroke:#333,stroke-width:2px,color:#000\n    classDef wm fill:#bbf,stroke:#333,stroke-width:2px,color:#000\n    classDef em fill:#bfb,stroke:#333,stroke-width:2px,color:#000\n\n    class VS,KB,Tools ltm\n    class Context,Goals,State wm\n    class Actions,Results,Patterns em\n```\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>Chain Summarization\u003C\u002Fb> 
(click to expand)\u003C\u002Fsummary>\n\nThe chain summarization system manages conversation context growth by selectively summarizing older messages. This is critical for preventing token limits from being exceeded while maintaining conversation coherence.\n\n```mermaid\nflowchart TD\n    A[Input Chain] --> B{Needs Summarization?}\n    B -->|No| C[Return Original Chain]\n    B -->|Yes| D[Convert to ChainAST]\n    D --> E[Apply Section Summarization]\n    E --> F[Process Oversized Pairs]\n    F --> G[Manage Last Section Size]\n    G --> H[Apply QA Summarization]\n    H --> I[Rebuild Chain with Summaries]\n    I --> J{Is New Chain Smaller?}\n    J -->|Yes| K[Return Optimized Chain]\n    J -->|No| C\n\n    classDef process fill:#bbf,stroke:#333,stroke-width:2px,color:#000\n    classDef decision fill:#bfb,stroke:#333,stroke-width:2px,color:#000\n    classDef output fill:#fbb,stroke:#333,stroke-width:2px,color:#000\n\n    class A,D,E,F,G,H,I process\n    class B,J decision\n    class C,K output\n```\n\nThe algorithm operates on a structured representation of conversation chains (ChainAST) that preserves message types including tool calls and their responses. All summarization operations maintain critical conversation flow while reducing context size.\n\n### Global Summarizer Configuration Options\n\n| Parameter             | Environment Variable             | Default | Description                                                |\n| --------------------- | -------------------------------- | ------- | ---------------------------------------------------------- |\n| Preserve Last         | `SUMMARIZER_PRESERVE_LAST`       | `true`  | Whether to keep all messages in the last section intact    |\n| Use QA Pairs          | `SUMMARIZER_USE_QA`              | `true`  | Whether to use QA pair summarization strategy              |\n| Summarize Human in QA | `SUMMARIZER_SUM_MSG_HUMAN_IN_QA` | `false` | Whether to summarize human messages in QA pairs            |\n| Last Section Size     | `SUMMARIZER_LAST_SEC_BYTES`      | `51200` | Maximum byte size for last section (50KB)                  |\n| Max Body Pair Size    | `SUMMARIZER_MAX_BP_BYTES`        | `16384` | Maximum byte size for a single body pair (16KB)            |\n| Max QA Sections       | `SUMMARIZER_MAX_QA_SECTIONS`     | `10`    | Maximum QA pair sections to preserve                       |\n| Max QA Size           | `SUMMARIZER_MAX_QA_BYTES`        | `65536` | Maximum byte size for QA pair sections (64KB)              |\n| Keep QA Sections      | `SUMMARIZER_KEEP_QA_SECTIONS`    | `1`     | Number of recent QA sections to keep without summarization |\n\n### Assistant Summarizer Configuration Options\n\nAssistant instances can use customized summarization settings to fine-tune context management behavior:\n\n| Parameter          | Environment Variable                    | Default | Description                                                          |\n| ------------------ | --------------------------------------- | ------- | -------------------------------------------------------------------- |\n| Preserve Last      | `ASSISTANT_SUMMARIZER_PRESERVE_LAST`    | `true`  | Whether to preserve all messages in the assistant's last section     |\n| Last Section Size  | `ASSISTANT_SUMMARIZER_LAST_SEC_BYTES`   | `76800` | Maximum byte size for assistant's last section (75KB)                |\n| Max Body Pair Size | `ASSISTANT_SUMMARIZER_MAX_BP_BYTES`     | `16384` | Maximum byte size for a single body pair in assistant context (16KB) |\n| Max QA 
Sections    | `ASSISTANT_SUMMARIZER_MAX_QA_SECTIONS`  | `7`     | Maximum QA sections to preserve in assistant context                 |\n| Max QA Size        | `ASSISTANT_SUMMARIZER_MAX_QA_BYTES`     | `76800` | Maximum byte size for assistant's QA sections (75KB)                 |\n| Keep QA Sections   | `ASSISTANT_SUMMARIZER_KEEP_QA_SECTIONS` | `3`     | Number of recent QA sections to preserve without summarization       |\n\nThe assistant summarizer configuration provides more memory for context retention compared to the global settings, preserving more recent conversation history while still ensuring efficient token usage.\n\n### Summarizer Environment Configuration\n\n```bash\n# Default values for global summarizer logic\nSUMMARIZER_PRESERVE_LAST=true\nSUMMARIZER_USE_QA=true\nSUMMARIZER_SUM_MSG_HUMAN_IN_QA=false\nSUMMARIZER_LAST_SEC_BYTES=51200\nSUMMARIZER_MAX_BP_BYTES=16384\nSUMMARIZER_MAX_QA_SECTIONS=10\nSUMMARIZER_MAX_QA_BYTES=65536\nSUMMARIZER_KEEP_QA_SECTIONS=1\n\n# Default values for assistant summarizer logic\nASSISTANT_SUMMARIZER_PRESERVE_LAST=true\nASSISTANT_SUMMARIZER_LAST_SEC_BYTES=76800\nASSISTANT_SUMMARIZER_MAX_BP_BYTES=16384\nASSISTANT_SUMMARIZER_MAX_QA_SECTIONS=7\nASSISTANT_SUMMARIZER_MAX_QA_BYTES=76800\nASSISTANT_SUMMARIZER_KEEP_QA_SECTIONS=3\n```\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>Advanced Agent Supervision\u003C\u002Fb> (click to expand)\u003C\u002Fsummary>\n\nPentAGI includes sophisticated multi-layered agent supervision mechanisms to ensure efficient task execution, prevent infinite loops, and provide intelligent recovery from stuck states:\n\n### Execution Monitoring (Beta)\n- **Automatic Mentor Intervention**: Adviser agent (mentor) is automatically invoked when execution patterns indicate potential issues\n- **Pattern Detection**: Monitors identical tool calls (threshold: 5, configurable) and total tool calls (threshold: 10, configurable)\n- **Progress Analysis**: Evaluates whether agent advances toward subtask objective, detects loops and inefficiencies\n- **Alternative Strategies**: Recommends different approaches when current strategy fails\n- **Information Retrieval Guidance**: Suggests searching for established solutions instead of reinventing\n- **Enhanced Response Format**: Tool responses include both `\u003Coriginal_result>` and `\u003Cmentor_analysis>` sections\n- **Configurable**: Enable via `EXECUTION_MONITOR_ENABLED` (default: false), customize thresholds with `EXECUTION_MONITOR_SAME_TOOL_LIMIT` and `EXECUTION_MONITOR_TOTAL_TOOL_LIMIT`\n\n**Best for**: Smaller models (\u003C 32B parameters), complex attack scenarios requiring continuous guidance, preventing agents from getting stuck on single approach\n\n**Performance Impact**: 2-3x increase in execution time and token usage, but delivers **2x improvement in result quality** based on testing with Qwen3.5-27B-FP8\n\n### Intelligent Task Planning (Beta)\n- **Automated Decomposition**: Planner (adviser in planning mode) generates 3-7 specific, actionable steps before specialist agents begin work\n- **Context-Aware Plans**: Analyzes full execution context via enricher agent to create informed plans\n- **Structured Assignment**: Original request wrapped in `\u003Ctask_assignment>` structure with execution plan and instructions\n- **Scope Management**: Prevents scope creep by keeping agents focused on current subtask only\n- **Enriched Instructions**: Plans highlight critical actions, potential pitfalls, and verification points\n- **Configurable**: Enable via 
`AGENT_PLANNING_STEP_ENABLED` (default: false)\n\n**Best for**: Models \u003C 32B parameters, complex penetration testing workflows, improving success rates on sophisticated tasks\n\n**Enhanced Adviser Configuration**: Works exceptionally well when adviser agent uses stronger model or enhanced settings. Example: using same base model with maximum reasoning mode for adviser (see [`vllm-qwen3.5-27b-fp8.provider.yml`](examples\u002Fconfigs\u002Fvllm-qwen3.5-27b-fp8.provider.yml)) enables comprehensive task analysis and strategic planning from identical model architecture.\n\n**Performance Impact**: Adds planning overhead but significantly improves completion rates and reduces redundant work\n\n### Tool Call Limits (Always Active)\n- **Hard Limits**: Prevent runaway executions regardless of supervision mode status\n- **Differentiated by Agent Type**:\n  - General agents (Assistant, Primary Agent, Pentester, Coder, Installer): `MAX_GENERAL_AGENT_TOOL_CALLS` (default: 100)\n  - Limited agents (Searcher, Enricher, Memorist, Generator, Reporter, Adviser, Reflector, Planner): `MAX_LIMITED_AGENT_TOOL_CALLS` (default: 20)\n- **Graceful Termination**: Reflector guides agents to proper completion when approaching limits\n- **Resource Protection**: Ensures system stability and prevents resource exhaustion\n\n### Reflector Integration (Always Active)\n- **Automatic Correction**: Invoked when LLM fails to generate tool calls after 3 attempts\n- **Strategic Guidance**: Analyzes failures and guides agents toward proper tool usage or barrier tools (`done`, `ask`)\n- **Recovery Mechanism**: Provides contextual guidance based on specific failure patterns\n- **Limit Enforcement**: Coordinates graceful termination when tool call limits are reached\n\n### Recommendations for Open Source Models\n\n**Must-Have for Models \u003C 32B Parameters**:\nTesting with Qwen3.5-27B-FP8 demonstrates that enabling both Execution Monitoring and Task Planning is **essential** for smaller open source models:\n- **Quality Improvement**: 2x better results compared to baseline execution without supervision\n- **Loop Prevention**: Significantly reduces infinite loops and redundant work\n- **Attack Diversity**: Encourages exploration of multiple attack vectors instead of fixating on single approach\n- **Air-Gapped Deployments**: Enables production-grade autonomous pentesting in closed network environments with local LLM inference\n\n**Trade-offs**:\n- Token consumption: 2-3x increase due to mentor\u002Fplanner invocations\n- Execution time: 2-3x longer due to analysis and planning steps\n- Result quality: 2x improvement in completeness, accuracy, and attack coverage\n- Model requirements: Works best when adviser uses enhanced configuration (higher reasoning parameters, stronger model variant, or different model)\n\n**Configuration Strategy**:\nFor optimal performance with smaller models, configure adviser agent with enhanced settings:\n- Use same model with maximum reasoning mode (example: [`vllm-qwen3.5-27b-fp8.provider.yml`](examples\u002Fconfigs\u002Fvllm-qwen3.5-27b-fp8.provider.yml))\n- Or use stronger model for adviser while keeping base model for other agents\n- Adjust monitoring thresholds based on task complexity and model capabilities\n\n\n\n\u003C\u002Fdetails>\n\nThe architecture of PentAGI is designed to be modular, scalable, and secure. Here are the key components:\n\n1. 
**Core Services**\n   - Frontend UI: React-based web interface with TypeScript for type safety\n   - Backend API: Go-based REST and GraphQL APIs with Bearer token authentication for programmatic access\n   - Vector Store: PostgreSQL with pgvector for semantic search and memory storage\n   - Task Queue: Async task processing system for reliable operation\n   - AI Agent: Multi-agent system with specialized roles for efficient testing\n\n2. **Knowledge Graph**\n   - Graphiti: Knowledge graph API for semantic relationship tracking and contextual understanding\n   - Neo4j: Graph database for storing and querying relationships between entities, actions, and outcomes\n   - Automatic capturing of agent responses and tool executions for building comprehensive knowledge base\n\n3. **Monitoring Stack**\n   - OpenTelemetry: Unified observability data collection and correlation\n   - Grafana: Real-time visualization and alerting dashboards\n   - VictoriaMetrics: High-performance time-series metrics storage\n   - Jaeger: End-to-end distributed tracing for debugging\n   - Loki: Scalable log aggregation and analysis\n\n4. **Analytics Platform**\n   - Langfuse: Advanced LLM observability and performance analytics\n   - ClickHouse: Column-oriented analytics data warehouse\n   - Redis: High-speed caching and rate limiting\n   - MinIO: S3-compatible object storage for artifacts\n\n5. **Security Tools**\n   - Web Scraper: Isolated browser environment for safe web interaction\n   - Pentesting Tools: Comprehensive suite of 20+ professional security tools\n   - Sandboxed Execution: All operations run in isolated containers\n\n6. **Memory Systems**\n   - Long-term Memory: Persistent storage of knowledge and experiences\n   - Working Memory: Active context and goals for current operations\n   - Episodic Memory: Historical actions and success patterns\n   - Knowledge Base: Structured domain expertise and tool capabilities\n   - Context Management: Intelligently manages growing LLM context windows using chain summarization\n\nThe system uses Docker containers for isolation and easy deployment, with separate networks for core services, monitoring, and analytics to ensure proper security boundaries. Each component is designed to scale horizontally and can be configured for high availability in production environments.\n\n## Quick Start\n\n### System Requirements\n\n- Docker and Docker Compose (or Podman - see [Podman configuration](#running-pentagi-with-podman))\n- Minimum 2 vCPU\n- Minimum 4GB RAM\n- 20GB free disk space\n- Internet access for downloading images and updates\n\n### Using Installer (Recommended)\n\nPentAGI provides an interactive installer with a terminal-based UI for streamlined configuration and deployment. 
The installer guides you through system checks, LLM provider setup, search engine configuration, and security hardening.\n\n**Supported Platforms:**\n- **Linux**: amd64 [download](https:\u002F\u002Fpentagi.com\u002Fdownloads\u002Flinux\u002Famd64\u002Finstaller-latest.zip) | arm64 [download](https:\u002F\u002Fpentagi.com\u002Fdownloads\u002Flinux\u002Farm64\u002Finstaller-latest.zip)\n- **Windows**: amd64 [download](https:\u002F\u002Fpentagi.com\u002Fdownloads\u002Fwindows\u002Famd64\u002Finstaller-latest.zip)\n- **macOS**: amd64 (Intel) [download](https:\u002F\u002Fpentagi.com\u002Fdownloads\u002Fdarwin\u002Famd64\u002Finstaller-latest.zip) | arm64 (M-series) [download](https:\u002F\u002Fpentagi.com\u002Fdownloads\u002Fdarwin\u002Farm64\u002Finstaller-latest.zip)\n\n**Quick Installation (Linux amd64):**\n\n```bash\n# Create installation directory\nmkdir -p pentagi && cd pentagi\n\n# Download installer\nwget -O installer.zip https:\u002F\u002Fpentagi.com\u002Fdownloads\u002Flinux\u002Famd64\u002Finstaller-latest.zip\n\n# Extract\nunzip installer.zip\n\n# Run interactive installer\n.\u002Finstaller\n```\n\n**Prerequisites & Permissions:**\n\nThe installer requires appropriate privileges to interact with the Docker API for proper operation. By default, it uses the Docker socket (`\u002Fvar\u002Frun\u002Fdocker.sock`) which requires either:\n\n- **Option 1 (Recommended for production):** Run the installer as root:\n  ```bash\n  sudo .\u002Finstaller\n  ```\n\n- **Option 2 (Development environments):** Grant your user access to the Docker socket by adding them to the `docker` group:\n  ```bash\n  # Add your user to the docker group\n  sudo usermod -aG docker $USER\n  \n  # Log out and log back in, or activate the group immediately\n  newgrp docker\n  \n  # Verify Docker access (should run without sudo)\n  docker ps\n  ```\n\n  ⚠️ **Security Note:** Adding a user to the `docker` group grants root-equivalent privileges. Only do this for trusted users in controlled environments. For production deployments, consider using rootless Docker mode or running the installer with sudo.\n\nThe installer will:\n1. **System Checks**: Verify Docker, network connectivity, and system requirements\n2. **Environment Setup**: Create and configure `.env` file with optimal defaults\n3. **Provider Configuration**: Set up LLM providers (OpenAI, Anthropic, Gemini, Bedrock, Ollama, Custom)\n4. **Search Engines**: Configure DuckDuckGo, Google, Tavily, Traversaal, Perplexity, Sploitus, Searxng\n5. **Security Hardening**: Generate secure credentials and configure SSL certificates\n6. **Deployment**: Start PentAGI with docker-compose\n\n**For Production & Enhanced Security:**\n\nFor production deployments or security-sensitive environments, we **strongly recommend** using a distributed two-node architecture where worker operations are isolated on a separate server. This prevents untrusted code execution and network access issues on your main system.\n\n**See detailed guide**: [Worker Node Setup](examples\u002Fguides\u002Fworker_node.md)\n\nThe two-node setup provides:\n- **Isolated Execution**: Worker containers run on dedicated hardware\n- **Network Isolation**: Separate network boundaries for penetration testing\n- **Security Boundaries**: Docker-in-Docker with TLS authentication\n- **OOB Attack Support**: Dedicated port ranges for out-of-band techniques\n\n### Manual Installation\n\n1. Create a working directory or clone the repository:\n\n```bash\nmkdir pentagi && cd pentagi\n```\n\n2. 
Copy `.env.example` to `.env` or download it:\n\n```bash\ncurl -o .env https:\u002F\u002Fraw.githubusercontent.com\u002Fvxcontrol\u002Fpentagi\u002Fmaster\u002F.env.example\n```\n\n3. Create the example provider files (`example.custom.provider.yml`, `example.ollama.provider.yml`) or download them:\n\n```bash\ncurl -o example.custom.provider.yml https:\u002F\u002Fraw.githubusercontent.com\u002Fvxcontrol\u002Fpentagi\u002Fmaster\u002Fexamples\u002Fconfigs\u002Fcustom-openai.provider.yml\ncurl -o example.ollama.provider.yml https:\u002F\u002Fraw.githubusercontent.com\u002Fvxcontrol\u002Fpentagi\u002Fmaster\u002Fexamples\u002Fconfigs\u002Follama-llama318b.provider.yml\n```\n\n4. Fill in the required API keys in the `.env` file.\n\n```bash\n# Required: At least one of these LLM providers\nOPEN_AI_KEY=your_openai_key\nANTHROPIC_API_KEY=your_anthropic_key\nGEMINI_API_KEY=your_gemini_key\n\n# Optional: AWS Bedrock provider (enterprise-grade models)\nBEDROCK_REGION=us-east-1\n# Choose one authentication method:\nBEDROCK_DEFAULT_AUTH=true                        # Option 1: Use AWS SDK default credential chain (recommended for EC2\u002FECS)\n# BEDROCK_BEARER_TOKEN=your_bearer_token         # Option 2: Bearer token authentication\n# BEDROCK_ACCESS_KEY_ID=your_aws_access_key      # Option 3: Static credentials\n# BEDROCK_SECRET_ACCESS_KEY=your_aws_secret_key\n\n# Optional: Ollama provider (local or cloud)\n# OLLAMA_SERVER_URL=http:\u002F\u002Follama-server:11434   # Local server\n# OLLAMA_SERVER_URL=https:\u002F\u002Follama.com           # Cloud service\n# OLLAMA_SERVER_API_KEY=your_ollama_cloud_key    # Required for cloud, empty for local\n\n# Optional: Chinese AI providers\n# DEEPSEEK_API_KEY=your_deepseek_key             # DeepSeek (strong reasoning)\n# GLM_API_KEY=your_glm_key                       # GLM (Zhipu AI)\n# KIMI_API_KEY=your_kimi_key                     # Kimi (Moonshot AI, ultra-long context)\n# QWEN_API_KEY=your_qwen_key                     # Qwen (Alibaba Cloud, multimodal)\n\n# Optional: Local LLM provider (zero-cost inference)\nOLLAMA_SERVER_URL=http:\u002F\u002Flocalhost:11434\nOLLAMA_SERVER_MODEL=your_model_name\n\n# Optional: Additional search capabilities\nDUCKDUCKGO_ENABLED=true\nDUCKDUCKGO_REGION=us-en\nDUCKDUCKGO_SAFESEARCH=\nDUCKDUCKGO_TIME_RANGE=\nSPLOITUS_ENABLED=true\nGOOGLE_API_KEY=your_google_key\nGOOGLE_CX_KEY=your_google_cx\nTAVILY_API_KEY=your_tavily_key\nTRAVERSAAL_API_KEY=your_traversaal_key\nPERPLEXITY_API_KEY=your_perplexity_key\nPERPLEXITY_MODEL=sonar-pro\nPERPLEXITY_CONTEXT_SIZE=medium\n\n# Searxng meta search engine (aggregates results from multiple sources)\nSEARXNG_URL=http:\u002F\u002Fyour-searxng-instance:8080\nSEARXNG_CATEGORIES=general\nSEARXNG_LANGUAGE=\nSEARXNG_SAFESEARCH=0\nSEARXNG_TIME_RANGE=\nSEARXNG_TIMEOUT=\n\n## Graphiti knowledge graph settings\nGRAPHITI_ENABLED=true\nGRAPHITI_TIMEOUT=30\nGRAPHITI_URL=http:\u002F\u002Fgraphiti:8000\nGRAPHITI_MODEL_NAME=gpt-5-mini\n\n# Neo4j settings (used by Graphiti stack)\nNEO4J_USER=neo4j\nNEO4J_DATABASE=neo4j\nNEO4J_PASSWORD=devpassword\nNEO4J_URI=bolt:\u002F\u002Fneo4j:7687\n\n# Assistant configuration\nASSISTANT_USE_AGENTS=false         # Default value for agent usage when creating new assistants\n```\n\n
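Before moving on, you may want to confirm which provider keys you have actually filled in; a minimal sketch using standard shell tools (the key names follow the `.env` template above):\n\n```bash\n# Show which provider keys are present without printing their values\ngrep -E '^(OPEN_AI_KEY|ANTHROPIC_API_KEY|GEMINI_API_KEY)=' .env | sed 's\u002F=.*\u002F=\u003Cset>\u002F'\n```\n\n5. 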
Change all security-related environment variables in the `.env` file to improve security.\n\n\u003Cdetails>\n    \u003Csummary>Security related environment variables\u003C\u002Fsummary>\n\n### Main Security Settings\n- `COOKIE_SIGNING_SALT` - Salt for cookie signing; change it to a random value\n- `PUBLIC_URL` - Public URL of your server (e.g. `https:\u002F\u002Fpentagi.example.com`)\n- `SERVER_SSL_CRT` and `SERVER_SSL_KEY` - Custom paths to your existing SSL certificate and key for HTTPS (these paths should be used in the docker-compose.yml file to mount as volumes)\n\n### Scraper Access\n- `SCRAPER_PUBLIC_URL` - Public URL for the scraper if you want to use a different scraper server for public URLs\n- `SCRAPER_PRIVATE_URL` - Private URL for the scraper (the local scraper service in the docker-compose.yml file, used to reach local URLs)\n\n### Access Credentials\n- `PENTAGI_POSTGRES_USER` and `PENTAGI_POSTGRES_PASSWORD` - PostgreSQL credentials\n- `NEO4J_USER` and `NEO4J_PASSWORD` - Neo4j credentials (for the Graphiti knowledge graph)\n\n\u003C\u002Fdetails>\n\n6. Remove all inline comments from the `.env` file if you want to use it in VSCode or other IDEs as an envFile option:\n\n```bash\nperl -i -pe 's\u002F\\s+#.*$\u002F\u002F' .env\n```\n\n7. Run the PentAGI stack:\n\n```bash\ncurl -O https:\u002F\u002Fraw.githubusercontent.com\u002Fvxcontrol\u002Fpentagi\u002Fmaster\u002Fdocker-compose.yml\ndocker compose up -d\n```\n\nVisit [localhost:8443](https:\u002F\u002Flocalhost:8443) to access the PentAGI Web UI (default credentials: `admin@pentagi.com` \u002F `admin`)\n\n> [!NOTE]\n> If you hit an error about `pentagi-network`, `observability-network`, or `langfuse-network`, run `docker-compose.yml` first to create these networks, and only then run `docker-compose-langfuse.yml`, `docker-compose-graphiti.yml`, and `docker-compose-observability.yml` to use the Langfuse, Graphiti, and Observability services.\n>\n> You have to set at least one language model provider (OpenAI, Anthropic, Gemini, AWS Bedrock, or Ollama) to use PentAGI. AWS Bedrock provides enterprise-grade access to multiple foundation models from leading AI companies, while Ollama provides zero-cost local inference if you have sufficient computational resources. Additional API keys for search engines are optional but recommended for better results.\n>\n> **For fully local deployment with advanced models**: See our comprehensive guide on [Running PentAGI with vLLM and Qwen3.5-27B-FP8](examples\u002Fguides\u002Fvllm-qwen35-27b-fp8.md) for a production-grade local LLM setup. This configuration achieves ~13,000 TPS for prompt processing and ~650 TPS for completion on 4× RTX 5090 GPUs, supporting 12+ concurrent flows with complete independence from cloud providers.\n>\n> The `LLM_SERVER_*` environment variables are an experimental feature and may change in the future. For now, you can use them to specify a custom LLM server URL and a single model for all agent types.\n>\n> `PROXY_URL` is a global proxy URL for all LLM providers and external search systems. You can use it to isolate PentAGI from external networks.\n>\n> The `docker-compose.yml` file runs the PentAGI service as the root user because it needs access to docker.sock for container management. If you're using a TCP\u002FIP network connection to Docker instead of the socket file, you can remove root privileges and use the default `pentagi` user for better security.\n\n
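Once the containers are up, you can quickly confirm the stack is healthy before logging in; a short sketch using standard Docker Compose commands (assuming the main service is named `pentagi`, as in the stock compose file):\n\n```bash\n# All services should be in the \"running\" state (or \"healthy\" where healthchecks exist)\ndocker compose ps\n\n# Follow the main application log to watch startup progress (Ctrl+C to stop)\ndocker compose logs -f pentagi\n```\n\n### Accessing PentAGI from External Networks\n\nBy default, PentAGI binds to `127.0.0.1` (localhost only) for security. 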
To access PentAGI from other machines on your network, you need to configure external access.\n\n#### Configuration Steps\n\n1. **Update `.env` file** with your server's IP address:\n\n```bash\n# Network binding - allow external connections\nPENTAGI_LISTEN_IP=0.0.0.0\nPENTAGI_LISTEN_PORT=8443\n\n# Public URL - use your actual server IP or hostname\n# Replace 192.168.1.100 with your server's IP address\nPUBLIC_URL=https:\u002F\u002F192.168.1.100:8443\n\n# CORS origins - list all URLs that will access PentAGI\n# Include localhost for local access AND your server IP for external access\nCORS_ORIGINS=https:\u002F\u002Flocalhost:8443,https:\u002F\u002F192.168.1.100:8443\n```\n\n> [!IMPORTANT]\n> - Replace `192.168.1.100` with your actual server's IP address\n> - Do NOT use `0.0.0.0` in `PUBLIC_URL` or `CORS_ORIGINS` - use the actual IP address\n> - Include both localhost and your server IP in `CORS_ORIGINS` for flexibility\n\n2. **Recreate containers** to apply the changes:\n\n```bash\ndocker compose down\ndocker compose up -d --force-recreate\n```\n\n3. **Verify port binding:**\n\n```bash\ndocker ps | grep pentagi\n```\n\nYou should see `0.0.0.0:8443->8443\u002Ftcp` or `:::8443->8443\u002Ftcp`.\n\nIf you see `127.0.0.1:8443->8443\u002Ftcp`, the environment variable wasn't picked up. In this case, directly edit `docker-compose.yml` line 31:\n\n```yaml\nports:\n  - \"0.0.0.0:8443:8443\"\n```\n\nThen recreate containers again.\n\n4. **Configure firewall** to allow incoming connections on port 8443:\n\n```bash\n# Ubuntu\u002FDebian with UFW\nsudo ufw allow 8443\u002Ftcp\nsudo ufw reload\n\n# CentOS\u002FRHEL with firewalld\nsudo firewall-cmd --permanent --add-port=8443\u002Ftcp\nsudo firewall-cmd --reload\n```\n\n5. **Access PentAGI:**\n\n- **Local access:** `https:\u002F\u002Flocalhost:8443`\n- **Network access:** `https:\u002F\u002Fyour-server-ip:8443`\n\n> [!NOTE]\n> You'll need to accept the self-signed SSL certificate warning in your browser when accessing via IP address.\n\n---\n\n### Running PentAGI with Podman\n\nPentAGI fully supports Podman as a Docker alternative. However, when using **Podman in rootless mode**, the scraper service requires special configuration because rootless containers cannot bind privileged ports (ports below 1024).\n\n#### Podman Rootless Configuration\n\nThe default scraper configuration uses port 443 (HTTPS), which is a privileged port. For Podman rootless, reconfigure the scraper to use a non-privileged port:\n\n**1. Edit `docker-compose.yml`** - modify the `scraper` service (around line 199):\n\n```yaml\nscraper:\n  image: vxcontrol\u002Fscraper:latest\n  restart: unless-stopped\n  container_name: scraper\n  hostname: scraper\n  expose:\n    - 3000\u002Ftcp  # Changed from 443 to 3000\n  ports:\n    - \"${SCRAPER_LISTEN_IP:-127.0.0.1}:${SCRAPER_LISTEN_PORT:-9443}:3000\"  # Map to port 3000\n  environment:\n    - MAX_CONCURRENT_SESSIONS=${LOCAL_SCRAPER_MAX_CONCURRENT_SESSIONS:-10}\n    - USERNAME=${LOCAL_SCRAPER_USERNAME:-someuser}\n    - PASSWORD=${LOCAL_SCRAPER_PASSWORD:-somepass}\n  logging:\n    options:\n      max-size: 50m\n      max-file: \"7\"\n  volumes:\n    - scraper-ssl:\u002Fusr\u002Fsrc\u002Fapp\u002Fssl\n  networks:\n    - pentagi-network\n  shm_size: 2g\n```\n\n**2. 
Update `.env` file** - change the scraper URL to use HTTP and port 3000:\n\n```bash\n# Scraper configuration for Podman rootless\nSCRAPER_PRIVATE_URL=http:\u002F\u002Fsomeuser:somepass@scraper:3000\u002F\nLOCAL_SCRAPER_USERNAME=someuser\nLOCAL_SCRAPER_PASSWORD=somepass\n```\n\n> [!IMPORTANT]\n> Key changes for Podman:\n> - Use **HTTP** instead of HTTPS for `SCRAPER_PRIVATE_URL`\n> - Use port **3000** instead of 443\n> - Change internal `expose` to `3000\u002Ftcp`\n> - Update port mapping to target `3000` instead of `443`\n\n**3. Recreate containers:**\n\n```bash\npodman-compose down\npodman-compose up -d --force-recreate\n```\n\n**4. Test scraper connectivity:**\n\n```bash\n# Test from within the pentagi container\npodman exec -it pentagi wget -O- \"http:\u002F\u002Fsomeuser:somepass@scraper:3000\u002Fhtml?url=http:\u002F\u002Fexample.com\"\n```\n\nIf you see HTML output, the scraper is working correctly.\n\n#### Podman Rootful Mode\n\nIf you're running Podman in rootful mode (with sudo), you can use the default configuration without modifications. The scraper will work on port 443 as intended.\n\n#### Docker Compatibility\n\nAll Podman configurations remain fully compatible with Docker. The non-privileged port approach works identically on both container runtimes.\n\n### Assistant Configuration\n\nPentAGI allows you to configure default behavior for assistants:\n\n| Variable               | Default | Description                                                             |\n| ---------------------- | ------- | ----------------------------------------------------------------------- |\n| `ASSISTANT_USE_AGENTS` | `false` | Controls the default value for agent usage when creating new assistants |\n\nThe `ASSISTANT_USE_AGENTS` setting affects the initial state of the \"Use Agents\" toggle when creating a new assistant in the UI:\n- `false` (default): New assistants are created with agent delegation disabled by default\n- `true`: New assistants are created with agent delegation enabled by default\n\nNote that users can always override this setting by toggling the \"Use Agents\" button in the UI when creating or editing an assistant. This environment variable only controls the initial default state.\n\n## 🔌 API Access\n\nPentAGI provides comprehensive programmatic access through both REST and GraphQL APIs, allowing you to integrate penetration testing workflows into your automation pipelines, CI\u002FCD processes, and custom applications.\n\n### Generating API Tokens\n\nAPI tokens are managed through the PentAGI web interface:\n\n1. Navigate to **Settings** → **API Tokens** in the web UI\n2. Click **Create Token** to generate a new API token\n3. Configure token properties:\n   - **Name** (optional): A descriptive name for the token\n   - **Expiration Date**: When the token will expire (minimum 1 minute, maximum 3 years)\n4. Click **Create** and **copy the token immediately** - it will only be shown once for security reasons\n5. 
Use the token as a Bearer token in your API requests\n\nEach token is associated with your user account and inherits your role's permissions.\n\n### Using API Tokens\n\nInclude the API token in the `Authorization` header of your HTTP requests:\n\n```bash\n# GraphQL API example\ncurl -X POST https:\u002F\u002Fyour-pentagi-instance:8443\u002Fapi\u002Fv1\u002Fgraphql \\\n  -H \"Authorization: Bearer YOUR_API_TOKEN\" \\\n  -H \"Content-Type: application\u002Fjson\" \\\n  -d '{\"query\": \"{ flows { id title status } }\"}'\n\n# REST API example\ncurl https:\u002F\u002Fyour-pentagi-instance:8443\u002Fapi\u002Fv1\u002Fflows \\\n  -H \"Authorization: Bearer YOUR_API_TOKEN\"\n```\n\n### API Exploration and Testing\n\nPentAGI provides interactive documentation for exploring and testing API endpoints:\n\n#### GraphQL Playground\n\nAccess the GraphQL Playground at `https:\u002F\u002Fyour-pentagi-instance:8443\u002Fapi\u002Fv1\u002Fgraphql\u002Fplayground`\n\n1. Click the **HTTP Headers** tab at the bottom\n2. Add your authorization header:\n   ```json\n   {\n     \"Authorization\": \"Bearer YOUR_API_TOKEN\"\n   }\n   ```\n3. Explore the schema, run queries, and test mutations interactively\n\n#### Swagger UI\n\nAccess the REST API documentation at `https:\u002F\u002Fyour-pentagi-instance:8443\u002Fapi\u002Fv1\u002Fswagger\u002Findex.html`\n\n1. Click the **Authorize** button\n2. Enter your token in the format: `Bearer YOUR_API_TOKEN`\n3. Click **Authorize** to apply\n4. Test endpoints directly from the Swagger UI\n\n### Generating API Clients\n\nYou can generate type-safe API clients for your preferred programming language using the schema files included with PentAGI:\n\n#### GraphQL Clients\n\nThe GraphQL schema is available at:\n- **Web UI**: Navigate to Settings to download `schema.graphqls`\n- **Direct file**: `backend\u002Fpkg\u002Fgraph\u002Fschema.graphqls` in the repository\n\nGenerate clients using tools like:\n- **GraphQL Code Generator** (JavaScript\u002FTypeScript): [https:\u002F\u002Fthe-guild.dev\u002Fgraphql\u002Fcodegen](https:\u002F\u002Fthe-guild.dev\u002Fgraphql\u002Fcodegen)\n- **genqlient** (Go): [https:\u002F\u002Fgithub.com\u002FKhan\u002Fgenqlient](https:\u002F\u002Fgithub.com\u002FKhan\u002Fgenqlient)\n- **Apollo iOS** (Swift): [https:\u002F\u002Fwww.apollographql.com\u002Fdocs\u002Fios](https:\u002F\u002Fwww.apollographql.com\u002Fdocs\u002Fios)\n\n#### REST API Clients\n\nThe OpenAPI specification is available at:\n- **Swagger JSON**: `https:\u002F\u002Fyour-pentagi-instance:8443\u002Fapi\u002Fv1\u002Fswagger\u002Fdoc.json`\n- **Swagger YAML**: Available in `backend\u002Fpkg\u002Fserver\u002Fdocs\u002Fswagger.yaml`\n\nGenerate clients using:\n- **OpenAPI Generator**: [https:\u002F\u002Fopenapi-generator.tech](https:\u002F\u002Fopenapi-generator.tech)\n  ```bash\n  openapi-generator-cli generate \\\n    -i https:\u002F\u002Fyour-pentagi-instance:8443\u002Fapi\u002Fv1\u002Fswagger\u002Fdoc.json \\\n    -g python \\\n    -o .\u002Fpentagi-client\n  ```\n\n- **Swagger Codegen**: [https:\u002F\u002Fgithub.com\u002Fswagger-api\u002Fswagger-codegen](https:\u002F\u002Fgithub.com\u002Fswagger-api\u002Fswagger-codegen)\n  ```bash\n  swagger-codegen generate \\\n    -i https:\u002F\u002Fyour-pentagi-instance:8443\u002Fapi\u002Fv1\u002Fswagger\u002Fdoc.json \\\n    -l typescript-axios \\\n    -o .\u002Fpentagi-client\n  ```\n\n- **swagger-typescript-api** (TypeScript): 
[https:\u002F\u002Fgithub.com\u002Facacode\u002Fswagger-typescript-api](https:\u002F\u002Fgithub.com\u002Facacode\u002Fswagger-typescript-api)\n  ```bash\n  npx swagger-typescript-api \\\n    -p https:\u002F\u002Fyour-pentagi-instance:8443\u002Fapi\u002Fv1\u002Fswagger\u002Fdoc.json \\\n    -o .\u002Fsrc\u002Fapi \\\n    -n pentagi-api.ts\n  ```\n\n### API Usage Examples\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>Creating a New Flow (GraphQL)\u003C\u002Fb>\u003C\u002Fsummary>\n\n```graphql\nmutation CreateFlow {\n  createFlow(\n    modelProvider: \"openai\"\n    input: \"Test the security of https:\u002F\u002Fexample.com\"\n  ) {\n    id\n    title\n    status\n    createdAt\n  }\n}\n```\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>Listing Flows (REST API)\u003C\u002Fb>\u003C\u002Fsummary>\n\n```bash\ncurl https:\u002F\u002Fyour-pentagi-instance:8443\u002Fapi\u002Fv1\u002Fflows \\\n  -H \"Authorization: Bearer YOUR_API_TOKEN\" \\\n  | jq '.flows[] | {id, title, status}'\n```\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>Python Client Example\u003C\u002Fb>\u003C\u002Fsummary>\n\n```python\nimport requests\n\nclass PentAGIClient:\n    def __init__(self, base_url, api_token):\n        self.base_url = base_url\n        self.headers = {\n            \"Authorization\": f\"Bearer {api_token}\",\n            \"Content-Type\": \"application\u002Fjson\"\n        }\n    \n    def create_flow(self, provider, target):\n        query = \"\"\"\n        mutation CreateFlow($provider: String!, $input: String!) {\n          createFlow(modelProvider: $provider, input: $input) {\n            id\n            title\n            status\n          }\n        }\n        \"\"\"\n        response = requests.post(\n            f\"{self.base_url}\u002Fapi\u002Fv1\u002Fgraphql\",\n            json={\n                \"query\": query,\n                \"variables\": {\n                    \"provider\": provider,\n                    \"input\": target\n                }\n            },\n            headers=self.headers\n        )\n        return response.json()\n    \n    def get_flows(self):\n        response = requests.get(\n            f\"{self.base_url}\u002Fapi\u002Fv1\u002Fflows\",\n            headers=self.headers\n        )\n        return response.json()\n\n# Usage\nclient = PentAGIClient(\n    \"https:\u002F\u002Fyour-pentagi-instance:8443\",\n    \"your_api_token_here\"\n)\n\n# Create a new flow\nflow = client.create_flow(\"openai\", \"Scan https:\u002F\u002Fexample.com for vulnerabilities\")\nprint(f\"Created flow: {flow}\")\n\n# List all flows\nflows = client.get_flows()\nprint(f\"Total flows: {len(flows['flows'])}\")\n```\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>TypeScript Client Example\u003C\u002Fb>\u003C\u002Fsummary>\n\n```typescript\nimport axios, { AxiosInstance } from 'axios';\n\ninterface Flow {\n  id: string;\n  title: string;\n  status: string;\n  createdAt: string;\n}\n\nclass PentAGIClient {\n  private client: AxiosInstance;\n\n  constructor(baseURL: string, apiToken: string) {\n    this.client = axios.create({\n      baseURL: `${baseURL}\u002Fapi\u002Fv1`,\n      headers: {\n        'Authorization': `Bearer ${apiToken}`,\n        'Content-Type': 'application\u002Fjson',\n      },\n    });\n  }\n\n  async createFlow(provider: string, input: string): Promise\u003CFlow> {\n    const query = `\n      mutation CreateFlow($provider: String!, $input: String!) 
{\n        createFlow(modelProvider: $provider, input: $input) {\n          id\n          title\n          status\n          createdAt\n        }\n      }\n    `;\n\n    const response = await this.client.post('\u002Fgraphql', {\n      query,\n      variables: { provider, input },\n    });\n\n    return response.data.data.createFlow;\n  }\n\n  async getFlows(): Promise\u003CFlow[]> {\n    const response = await this.client.get('\u002Fflows');\n    return response.data.flows;\n  }\n\n  async getFlow(flowId: string): Promise\u003CFlow> {\n    const response = await this.client.get(`\u002Fflows\u002F${flowId}`);\n    return response.data;\n  }\n}\n\n\u002F\u002F Usage\nconst client = new PentAGIClient(\n  'https:\u002F\u002Fyour-pentagi-instance:8443',\n  'your_api_token_here'\n);\n\n\u002F\u002F Create a new flow\nconst flow = await client.createFlow(\n  'openai',\n  'Perform penetration test on https:\u002F\u002Fexample.com'\n);\nconsole.log('Created flow:', flow);\n\n\u002F\u002F List all flows\nconst flows = await client.getFlows();\nconsole.log(`Total flows: ${flows.length}`);\n```\n\n\u003C\u002Fdetails>\n\n### Security Best Practices\n\nWhen working with API tokens:\n\n- **Never commit tokens to version control** - use environment variables or secrets management (see the sketch below)\n- **Rotate tokens regularly** - set appropriate expiration dates and create new tokens periodically\n- **Use separate tokens for different applications** - makes it easier to revoke access if needed\n- **Monitor token usage** - review API token activity in the Settings page\n- **Revoke unused tokens** - disable or delete tokens that are no longer needed\n- **Use HTTPS only** - never send API tokens over unencrypted connections\n\n### Token Management\n\n- **View tokens**: See all your active tokens in Settings → API Tokens\n- **Edit tokens**: Update token names or revoke tokens\n- **Delete tokens**: Permanently remove tokens (this action cannot be undone)\n- **Token ID**: Each token has a unique ID that can be copied for reference\n\nThe token list shows:\n- Token name (if provided)\n- Token ID (unique identifier)\n- Status (active\u002Frevoked\u002Fexpired)\n- Creation date\n- Expiration date
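\n\nAs a minimal sketch of the environment-variable practice recommended above (the storage path and variable name are illustrative, not part of PentAGI):\n\n```bash\n# Illustrative only: keep the token out of your repository and shell history,\n# then reference it through an environment variable.\nexport PENTAGI_API_TOKEN=\"$(cat ~\u002F.config\u002Fpentagi\u002Ftoken)\"  # hypothetical storage location\n\ncurl -s https:\u002F\u002Fyour-pentagi-instance:8443\u002Fapi\u002Fv1\u002Fflows \\\n  -H \"Authorization: Bearer ${PENTAGI_API_TOKEN}\"\n```\n\n### Custom LLM Provider Configuration\n\nWhen using custom LLM providers with the `LLM_SERVER_*` variables, you can fine-tune the reasoning format used in requests.\n\n> [!TIP]\n> For production-grade local deployments, consider using **vLLM** with **Qwen3.5-27B-FP8** for optimal performance. 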
See our [comprehensive deployment guide](examples\u002Fguides\u002Fvllm-qwen35-27b-fp8.md) which includes hardware requirements, configuration templates ([thinking mode](examples\u002Fconfigs\u002Fvllm-qwen3.5-27b-fp8.provider.yml) and [non-thinking mode](examples\u002Fconfigs\u002Fvllm-qwen3.5-27b-fp8-no-think.provider.yml)), and performance benchmarks showing 13K TPS prompt processing on 4× RTX 5090 GPUs.\n\n| Variable                        | Default | Description                                                                             |\n| ------------------------------- | ------- | --------------------------------------------------------------------------------------- |\n| `LLM_SERVER_URL`                |         | Base URL for the custom LLM API endpoint                                                |\n| `LLM_SERVER_KEY`                |         | API key for the custom LLM provider                                                     |\n| `LLM_SERVER_MODEL`              |         | Default model to use (can be overridden in provider config)                             |\n| `LLM_SERVER_CONFIG_PATH`        |         | Path to the YAML configuration file for agent-specific models                           |\n| `LLM_SERVER_PROVIDER`           |         | Provider name prefix for model names (e.g., `openrouter`, `deepseek` for LiteLLM proxy) |\n| `LLM_SERVER_LEGACY_REASONING`   | `false` | Controls reasoning format in API requests                                               |\n| `LLM_SERVER_PRESERVE_REASONING` | `false` | Preserve reasoning content in multi-turn conversations (required by some providers)     |\n\nThe `LLM_SERVER_PROVIDER` setting is particularly useful when using **LiteLLM proxy**, which adds a provider prefix to model names. For example, when connecting to Moonshot API through LiteLLM, models like `kimi-2.5` become `moonshot\u002Fkimi-2.5`. By setting `LLM_SERVER_PROVIDER=moonshot`, you can use the same provider configuration file for both direct API access and LiteLLM proxy access without modifications.\n\nThe `LLM_SERVER_LEGACY_REASONING` setting affects how reasoning parameters are sent to the LLM:\n- `false` (default): Uses modern format where reasoning is sent as a structured object with `max_tokens` parameter\n- `true`: Uses legacy format with string-based `reasoning_effort` parameter\n\nThis setting is important when working with different LLM providers as they may expect different reasoning formats in their API requests. If you encounter reasoning-related errors with custom providers, try changing this setting.\n\nThe `LLM_SERVER_PRESERVE_REASONING` setting controls whether reasoning content is preserved in multi-turn conversations:\n- `false` (default): Reasoning content is not preserved in conversation history\n- `true`: Reasoning content is preserved and sent in subsequent API calls\n\nThis setting is required by some LLM providers (e.g., Moonshot) that return errors like \"thinking is enabled but reasoning_content is missing in assistant tool call message\" when reasoning content is not included in multi-turn conversations. 
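\n\nPutting these variables together, a hedged example of a complete custom-provider block for a LiteLLM proxy fronting Moonshot might look like this (host name and key are placeholders):\n\n```bash\n# Illustrative .env fragment for a custom provider behind a LiteLLM proxy\nLLM_SERVER_URL=http:\u002F\u002Flitellm-proxy:4000        # proxy endpoint (placeholder)\nLLM_SERVER_KEY=your_litellm_key\nLLM_SERVER_MODEL=kimi-2.5                        # addressed as moonshot\u002Fkimi-2.5 via the prefix\nLLM_SERVER_PROVIDER=moonshot\nLLM_SERVER_LEGACY_REASONING=false                # modern structured reasoning format\nLLM_SERVER_PRESERVE_REASONING=true               # Moonshot expects preserved reasoning content\n```\n\n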
Enable this setting if your provider requires reasoning content to be preserved.\n\n### Ollama Provider Configuration\n\nPentAGI supports Ollama for both local LLM inference (zero-cost, enhanced privacy) and Ollama Cloud (managed service with free tier).\n\n#### Configuration Variables\n\n| Variable                            | Default     | Description                               |\n| ----------------------------------- | ----------- | ----------------------------------------- |\n| `OLLAMA_SERVER_URL`                 |             | URL of your Ollama server or Ollama Cloud |\n| `OLLAMA_SERVER_API_KEY`             |             | API key for Ollama Cloud authentication   |\n| `OLLAMA_SERVER_MODEL`               |             | Default model for inference               |\n| `OLLAMA_SERVER_CONFIG_PATH`         |             | Path to custom agent configuration file   |\n| `OLLAMA_SERVER_PULL_MODELS_TIMEOUT` | `600`       | Timeout for model downloads (seconds)     |\n| `OLLAMA_SERVER_PULL_MODELS_ENABLED` | `false`     | Auto-download models on startup           |\n| `OLLAMA_SERVER_LOAD_MODELS_ENABLED` | `false`     | Query server for available models         |\n\n#### Ollama Cloud Configuration\n\nOllama Cloud provides managed inference with a generous free tier and scalable paid plans.\n\n**Free Tier Setup (Single Model)**\n\n```bash\n# Free tier allows one model at a time\nOLLAMA_SERVER_URL=https:\u002F\u002Follama.com\nOLLAMA_SERVER_API_KEY=your_ollama_cloud_api_key\nOLLAMA_SERVER_MODEL=gpt-oss:120b  # Example: OpenAI OSS 120B model\n```\n\n**Paid Tier Setup (Multi-Model with Pre-built Configuration)**\n\nFor paid tiers supporting multiple concurrent models, use the pre-built Ollama Cloud configuration:\n\n```bash\n# Using pre-built Ollama Cloud configuration (included in Docker image)\nOLLAMA_SERVER_URL=https:\u002F\u002Follama.com\nOLLAMA_SERVER_API_KEY=your_ollama_cloud_api_key\nOLLAMA_SERVER_CONFIG_PATH=\u002Fopt\u002Fpentagi\u002Fconf\u002Follama-cloud.provider.yml\n```\n\nThe pre-built `ollama-cloud.provider.yml` configuration includes optimized model assignments for all agent types:\n- **Simple\u002FAssistant**: `nemotron-3-super:cloud` - Fast general-purpose model\n- **Primary Agent**: `qwen3-coder-next:cloud` - Advanced reasoning with high effort mode\n- **Coder\u002FPentester**: `qwen3-coder-next:cloud` - Specialized coding models\n- **Searcher**: `qwen3.5:397b-cloud` - Large context for information gathering\n- **Refiner\u002FRefactor**: `glm-5:cloud` - High-quality text refinement\n- **Adviser\u002FEnricher**: `minimax-m2.7:cloud` - Efficient advisory tasks\n- **Installer**: `devstral-2:123b-cloud` - Installation and setup tasks\n\n**Custom Configuration (Advanced)**\n\nTo create your own agent configuration, mount a custom file from your host filesystem:\n\n```bash\n# Using custom provider configuration\nOLLAMA_SERVER_URL=https:\u002F\u002Follama.com\nOLLAMA_SERVER_API_KEY=your_ollama_cloud_api_key\nOLLAMA_SERVER_CONFIG_PATH=\u002Fopt\u002Fpentagi\u002Fconf\u002Follama.provider.yml\n\n# Mount custom configuration from host filesystem (in .env or docker-compose override)\nPENTAGI_OLLAMA_SERVER_CONFIG_PATH=\u002Fpath\u002Fon\u002Fhost\u002Fmy-ollama-config.yml\n```\n\nThe `PENTAGI_OLLAMA_SERVER_CONFIG_PATH` environment variable maps your host configuration file to `\u002Fopt\u002Fpentagi\u002Fconf\u002Follama.provider.yml` inside the container.\n\n**Example custom configuration** (`my-ollama-config.yml`):\n\n```yaml\nprimary_agent:\n  model: \"qwen3-coder-next:cloud\"\n  
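# the fields below follow the pre-built provider config format: sampling settings plus an optional reasoning block\n  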
temperature: 1.0\n  top_p: 0.9\n  max_tokens: 32768\n  reasoning:\n    effort: high\n\ncoder:\n  model: \"qwen3-coder:32b\"\n  temperature: 1.0\n  max_tokens: 20480\n```\n\n#### Local Ollama Configuration\n\nFor self-hosted Ollama instances:\n\n```bash\n# Basic local Ollama setup\nOLLAMA_SERVER_URL=http:\u002F\u002Flocalhost:11434\nOLLAMA_SERVER_MODEL=llama3.1:8b-instruct-q8_0\n\n# Production setup with auto-pull and model discovery\nOLLAMA_SERVER_URL=http:\u002F\u002Follama-server:11434\nOLLAMA_SERVER_PULL_MODELS_ENABLED=true\nOLLAMA_SERVER_PULL_MODELS_TIMEOUT=900\nOLLAMA_SERVER_LOAD_MODELS_ENABLED=true\n\n# Using pre-built configurations from Docker image\nOLLAMA_SERVER_CONFIG_PATH=\u002Fopt\u002Fpentagi\u002Fconf\u002Follama-llama318b.provider.yml\n# or\nOLLAMA_SERVER_CONFIG_PATH=\u002Fopt\u002Fpentagi\u002Fconf\u002Follama-qwen332b-fp16-tc.provider.yml\n# or\nOLLAMA_SERVER_CONFIG_PATH=\u002Fopt\u002Fpentagi\u002Fconf\u002Follama-qwq32b-fp16-tc.provider.yml\n```\n\n**Performance Considerations:**\n\n- **Model Discovery** (`OLLAMA_SERVER_LOAD_MODELS_ENABLED=true`): Adds 1-2s of startup latency while querying the Ollama API\n- **Auto-pull** (`OLLAMA_SERVER_PULL_MODELS_ENABLED=true`): First startup may take several minutes while models download\n- **Pull timeout** (`OLLAMA_SERVER_PULL_MODELS_TIMEOUT=900`): 900 seconds, i.e. 15 minutes\n- **Static Config**: Disable both flags and specify models in the config file for the fastest startup\n\n#### Creating Custom Ollama Models with Extended Context\n\nPentAGI requires models with larger context windows than the default Ollama configurations provide, so you need to create custom models with an increased `num_ctx` parameter via a Modelfile. While typical agent workflows consume around 64K tokens, PentAGI uses a 110K context size as a safety margin and to handle complex penetration testing scenarios.\n\n**Important**: The `num_ctx` parameter can only be set during model creation via a Modelfile - it cannot be changed after model creation or overridden at runtime.\n\n##### Example: Qwen3 32B FP16 with Extended Context\n\nCreate a Modelfile named `Modelfile_qwen3_32b_fp16_tc`:\n\n```dockerfile\nFROM qwen3:32b-fp16\nPARAMETER num_ctx 110000\nPARAMETER temperature 0.3\nPARAMETER top_p 0.8\nPARAMETER min_p 0.0\nPARAMETER top_k 20\nPARAMETER repeat_penalty 1.1\n```\n\nBuild the custom model:\n\n```bash\nollama create qwen3:32b-fp16-tc -f Modelfile_qwen3_32b_fp16_tc\n```\n\n##### Example: QwQ 32B FP16 with Extended Context\n\nCreate a Modelfile named `Modelfile_qwq_32b_fp16_tc`:\n\n```dockerfile\nFROM qwq:32b-fp16\nPARAMETER num_ctx 110000\nPARAMETER temperature 0.2\nPARAMETER top_p 0.7\nPARAMETER min_p 0.0\nPARAMETER top_k 40\nPARAMETER repeat_penalty 1.2\n```\n\nBuild the custom model:\n\n```bash\nollama create qwq:32b-fp16-tc -f Modelfile_qwq_32b_fp16_tc\n```
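\n\nAfter building, you can sanity-check that the extended context took effect (the `show` subcommand flags reflect current Ollama CLI behavior; verify with `ollama show --help` if your version differs):\n\n```bash\n# Print the resolved parameters; num_ctx should read 110000\nollama show --parameters qwq:32b-fp16-tc\n\n# Inspect the full resolved Modelfile of the custom model\nollama show --modelfile qwq:32b-fp16-tc\n```\n\n> **Note**: The QwQ 32B FP16 model requires approximately **71.3 GB VRAM** for inference. 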
Ensure your system has sufficient GPU memory before attempting to use this model.\n\nThese custom models are referenced in the pre-built provider configuration files (`ollama-qwen332b-fp16-tc.provider.yml` and `ollama-qwq32b-fp16-tc.provider.yml`) that are included in the Docker image at `\u002Fopt\u002Fpentagi\u002Fconf\u002F`.\n\n### OpenAI Provider Configuration\n\nPentAGI integrates with OpenAI's comprehensive model lineup, featuring advanced reasoning capabilities with extended chain-of-thought, agentic models with enhanced tool integration, and specialized code models for security engineering.\n\n#### Configuration Variables\n\n| Variable             | Default                     | Description                 |\n| -------------------- | --------------------------- | --------------------------- |\n| `OPEN_AI_KEY`        |                             | API key for OpenAI services |\n| `OPEN_AI_SERVER_URL` | `https:\u002F\u002Fapi.openai.com\u002Fv1` | OpenAI API endpoint         |\n\n#### Configuration Examples\n\n```bash\n# Basic OpenAI setup\nOPEN_AI_KEY=your_openai_api_key\nOPEN_AI_SERVER_URL=https:\u002F\u002Fapi.openai.com\u002Fv1\n\n# Using with proxy for enhanced security\nOPEN_AI_KEY=your_openai_api_key\nPROXY_URL=http:\u002F\u002Fyour-proxy:8080\n```\n\n#### Supported Models\n\nPentAGI supports 31 OpenAI models with tool calling, streaming, reasoning modes, and prompt caching. Models marked with `*` are used in default configuration.\n\n**GPT-5.2 Series - Latest Flagship Agentic (December 2025)**\n\n| Model ID              | Thinking | Price (Input\u002FOutput\u002FCache) | Use Case                                        |\n| --------------------- | -------- | -------------------------- | ----------------------------------------------- |\n| `gpt-5.2`*            | ✅        | $1.75\u002F$14.00\u002F$0.18         | Latest flagship with enhanced reasoning and tool integration, autonomous security research |\n| `gpt-5.2-pro`         | ✅        | $21.00\u002F$168.00\u002F$0.00       | Premium version with superior agentic coding, mission-critical security research, zero-day discovery |\n| `gpt-5.2-codex`       | ✅        | $1.75\u002F$14.00\u002F$0.18         | Most advanced code-specialized, context compaction, strong cybersecurity capabilities |\n\n**GPT-5\u002F5.1 Series - Advanced Agentic Models**\n\n| Model ID              | Thinking | Price (Input\u002FOutput\u002FCache) | Use Case                                        |\n| --------------------- | -------- | -------------------------- | ----------------------------------------------- |\n| `gpt-5`               | ✅        | $1.25\u002F$10.00\u002F$0.13         | Premier agentic with advanced reasoning, autonomous security research, exploit chain development |\n| `gpt-5.1`             | ✅        | $1.25\u002F$10.00\u002F$0.13         | Enhanced agentic with adaptive reasoning, balanced penetration testing with strong tool coordination |\n| `gpt-5-pro`           | ✅        | $15.00\u002F$120.00\u002F$0.00       | Premium version with major reasoning improvements, reduced hallucinations, critical security operations |\n| `gpt-5-mini`          | ✅        | $0.25\u002F$2.00\u002F$0.03          | Efficient balancing speed and intelligence, automated vulnerability analysis, exploit generation |\n| `gpt-5-nano`          | ✅        | $0.05\u002F$0.40\u002F$0.01          | Fastest for high-throughput scanning, reconnaissance, bulk vulnerability detection |\n\n**GPT-5\u002F5.1 Codex Series - Code-Specialized**\n\n| Model ID              
| Thinking | Price (Input\u002FOutput\u002FCache) | Use Case                                        |\n| --------------------- | -------- | -------------------------- | ----------------------------------------------- |\n| `gpt-5.1-codex-max`   | ✅        | $1.25\u002F$10.00\u002F$0.13         | Enhanced reasoning for sophisticated coding, proven CVE findings, systematic exploit development |\n| `gpt-5.1-codex`       | ✅        | $1.25\u002F$10.00\u002F$0.13         | Standard code-optimized with strong reasoning, exploit generation, vulnerability analysis |\n| `gpt-5-codex`         | ✅        | $1.25\u002F$10.00\u002F$0.13         | Foundational code-specialized, vulnerability scanning, basic exploit generation |\n| `gpt-5.1-codex-mini`  | ✅        | $0.25\u002F$2.00\u002F$0.03          | Compact high-performance, 4x higher capacity, rapid vulnerability detection |\n| `codex-mini-latest`   | ✅        | $1.50\u002F$6.00\u002F$0.38          | Latest compact code model, automated code review, basic vulnerability analysis |\n\n**GPT-4.1 Series - Enhanced Intelligence**\n\n| Model ID              | Thinking | Price (Input\u002FOutput\u002FCache) | Use Case                                        |\n| --------------------- | -------- | -------------------------- | ----------------------------------------------- |\n| `gpt-4.1`             | ❌        | $2.00\u002F$8.00\u002F$0.50          | Enhanced flagship with superior function calling, complex threat analysis, sophisticated exploit development |\n| `gpt-4.1-mini`*       | ❌        | $0.40\u002F$1.60\u002F$0.10          | Balanced performance with improved efficiency, routine security assessments, automated code analysis |\n| `gpt-4.1-nano`        | ❌        | $0.10\u002F$0.40\u002F$0.03          | Ultra-fast lightweight, bulk security scanning, rapid reconnaissance, continuous monitoring |\n\n**GPT-4o Series - Multimodal Flagship**\n\n| Model ID              | Thinking | Price (Input\u002FOutput\u002FCache) | Use Case                                        |\n| --------------------- | -------- | -------------------------- | ----------------------------------------------- |\n| `gpt-4o`              | ❌        | $2.50\u002F$10.00\u002F$1.25         | Multimodal flagship with vision, image analysis, web UI assessment, multi-tool orchestration |\n| `gpt-4o-mini`         | ❌        | $0.15\u002F$0.60\u002F$0.08          | Compact multimodal with strong function calling, high-frequency scanning, cost-effective bulk operations |\n\n**o-Series - Advanced Reasoning Models**\n\n| Model ID              | Thinking | Price (Input\u002FOutput\u002FCache) | Use Case                                        |\n| --------------------- | -------- | -------------------------- | ----------------------------------------------- |\n| `o4-mini`*            | ✅        | $1.10\u002F$4.40\u002F$0.28          | Next-gen reasoning with enhanced speed, methodical security assessments, systematic exploit development |\n| `o3`*                 | ✅        | $2.00\u002F$8.00\u002F$0.50          | Advanced reasoning powerhouse, multi-stage attack chains, deep vulnerability analysis |\n| `o3-mini`             | ✅        | $1.10\u002F$4.40\u002F$0.55          | Compact reasoning with extended thinking, step-by-step attack planning, logical vulnerability chaining |\n| `o1`                  | ✅        | $15.00\u002F$60.00\u002F$7.50        | Premier reasoning with maximum depth, advanced penetration testing, novel exploit research |\n| `o3-pro`              | ✅        | 
$20.00\u002F$80.00\u002F$0.00        | Most advanced reasoning, 80% cheaper than o1-pro, zero-day research, critical security investigations |\n| `o1-pro`              | ✅        | $150.00\u002F$600.00\u002F$0.00      | Previous-gen premium reasoning, exhaustive security analysis, mission-critical challenges |\n\n**Prices**: Per 1M tokens. Reasoning models include thinking tokens in output pricing.\n\n> [!WARNING]\n> **GPT-5* Models - Trusted Access Required**\n>\n> All GPT-5 series models (`gpt-5`, `gpt-5.1`, `gpt-5.2`, `gpt-5-pro`, `gpt-5.2-pro`, and all Codex variants) are **unstable with PentAGI** and may trigger OpenAI's cybersecurity safety mechanisms without verified access.\n>\n> **To use GPT-5* models reliably:**\n> 1. **Individual users**: Verify your identity at [chatgpt.com\u002Fcyber](https:\u002F\u002Fchatgpt.com\u002Fcyber)\n> 2. **Enterprise teams**: Request trusted access through your OpenAI representative\n> 3. **Security researchers**: Apply for the [Cybersecurity Grant Program](https:\u002F\u002Fopenai.com\u002Fform\u002Fcybersecurity-grant-program\u002F) (includes $10M in API credits)\n>\n> **Recommended alternatives without verification:**\n> - Use `o-series` models (o3, o4-mini, o1) for reasoning tasks\n> - Use `gpt-4.1` series for general intelligence and function calling\n> - All o-series and gpt-4.x models work reliably without special access\n\n**Reasoning Effort Levels**:\n- **High**: Maximum reasoning depth (refiner - o3 with high effort)\n- **Medium**: Balanced reasoning (primary_agent, assistant, reflector - o4-mini\u002Fo3 with medium effort)\n- **Low**: Efficient targeted reasoning (coder, installer, pentester - o3\u002Fo4-mini with low effort; adviser - gpt-5.2 with low effort)\n\n**Key Features**:\n- **Extended Reasoning**: o-series models with chain-of-thought for complex security analysis\n- **Agentic Intelligence**: GPT-5\u002F5.1\u002F5.2 series with enhanced tool integration and autonomous capabilities\n- **Prompt Caching**: Cost reduction on repeated context (10-50% of input price)\n- **Code Specialization**: Dedicated Codex models for vulnerability discovery and exploit development\n- **Multimodal Support**: GPT-4o series for vision-based security assessments\n- **Tool Calling**: Robust function calling across all models for pentesting tool orchestration\n- **Streaming**: Real-time response streaming for interactive workflows\n- **Proven Track Record**: Industry-leading models with CVE discoveries and real-world security applications\n\n### Anthropic Provider Configuration\n\nPentAGI integrates with Anthropic's Claude models, featuring advanced extended thinking capabilities, exceptional safety mechanisms, and sophisticated understanding of complex security contexts with prompt caching.\n\n#### Configuration Variables\n\n| Variable               | Default                        | Description                    |\n| ---------------------- | ------------------------------ | ------------------------------ |\n| `ANTHROPIC_API_KEY`    |                                | API key for Anthropic services |\n| `ANTHROPIC_SERVER_URL` | `https:\u002F\u002Fapi.anthropic.com\u002Fv1` | Anthropic API endpoint         |\n\n#### Configuration Examples\n\n```bash\n# Basic Anthropic setup\nANTHROPIC_API_KEY=your_anthropic_api_key\nANTHROPIC_SERVER_URL=https:\u002F\u002Fapi.anthropic.com\u002Fv1\n\n# Using with proxy for secure environments\nANTHROPIC_API_KEY=your_anthropic_api_key\nPROXY_URL=http:\u002F\u002Fyour-proxy:8080\n```
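\n\nBefore pointing PentAGI at the endpoint, you can verify a key with a direct request (the `\u002Fv1\u002Fmodels` route and headers follow Anthropic's public API; treat this as a sketch and check the current docs):\n\n```bash\n# Quick key sanity check against the Anthropic API (sketch)\ncurl -s https:\u002F\u002Fapi.anthropic.com\u002Fv1\u002Fmodels \\\n  -H \"x-api-key: ${ANTHROPIC_API_KEY}\" \\\n  -H \"anthropic-version: 2023-06-01\" | jq -r '.data[].id'\n```\n\n#### Supported Models\n\nPentAGI 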
supports 10 Claude models with tool calling, streaming, extended thinking, adaptive thinking, and prompt caching. Models marked with `*` are used in default configuration.\n\n**Claude 4 Series - Latest Models (2025-2026)**\n\n| Model ID                 | Thinking | Release Date | Price (Input\u002FOutput\u002FCache R\u002FW) | Use Case                                        |\n| ------------------------ | -------- | ------------ | ------------------------------ | ----------------------------------------------- |\n| `claude-opus-4-6`*       | ✅        | May 2025     | $5.00\u002F$25.00\u002F$0.50\u002F$6.25       | Most intelligent model for autonomous agents and coding. Extended + adaptive thinking for complex exploit development, multi-stage attack simulation |\n| `claude-sonnet-4-6`*     | ✅        | Aug 2025     | $3.00\u002F$15.00\u002F$0.30\u002F$3.75       | Best speed\u002Fintelligence balance with adaptive thinking. Multi-phase security assessments, intelligent vulnerability analysis, real-time threat hunting |\n| `claude-haiku-4-5`*      | ✅        | Oct 2025     | $1.00\u002F$5.00\u002F$0.10\u002F$1.25        | Fastest model with near-frontier intelligence. High-frequency scanning, real-time monitoring, bulk automated testing |\n\n**Legacy Models - Still Supported**\n\n| Model ID                 | Thinking | Release Date | Price (Input\u002FOutput\u002FCache R\u002FW) | Use Case                                        |\n| ------------------------ | -------- | ------------ | ------------------------------ | ----------------------------------------------- |\n| `claude-sonnet-4-5`      | ✅        | Sep 2025     | $3.00\u002F$15.00\u002F$0.30\u002F$3.75       | State-of-the-art reasoning (superseded by 4-6). Sophisticated penetration testing, advanced threat analysis |\n| `claude-opus-4-5`        | ✅        | Nov 2025     | $5.00\u002F$25.00\u002F$0.50\u002F$6.25       | Ultimate reasoning (superseded by opus-4-6). Critical security research, zero-day discovery, red team operations |\n| `claude-opus-4-1`        | ✅        | Aug 2025     | $15.00\u002F$75.00\u002F$1.50\u002F$18.75     | Advanced reasoning (superseded). Complex penetration testing, sophisticated threat modeling |\n| `claude-sonnet-4-0`      | ✅        | May 2025     | $3.00\u002F$15.00\u002F$0.30\u002F$3.75       | High-performance reasoning (superseded). Complex threat modeling, multi-tool coordination |\n| `claude-opus-4-0`        | ✅        | May 2025     | $15.00\u002F$75.00\u002F$1.50\u002F$18.75     | First generation Opus (superseded). Multi-step exploit development, autonomous pentesting workflows |\n\n**Deprecated Models - Migrate to Current Models**\n\n| Model ID                     | Thinking | Release Date | Price (Input\u002FOutput\u002FCache R\u002FW) | Notes                                        |\n| ---------------------------- | -------- | ------------ | ------------------------------ | -------------------------------------------- |\n| `claude-3-haiku-20240307`    | ❌        | Mar 2024     | $0.25\u002F$1.25\u002F$0.03\u002F$0.30        | Will be retired April 19, 2026. Migrate to claude-haiku-4-5 |\n\n**Prices**: Per 1M tokens. 
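\n\nAs a rough, hypothetical cost sketch using the `claude-sonnet-4-6` rates above ($3.00 input \u002F $0.30 cache read \u002F $3.75 cache write per 1M tokens):\n\n```bash\n# Hypothetical: a 100K-token context cached once, then reused on nine calls\necho \"scale=4; 0.1 * 3.75\" | bc        # first call writes the cache:   $0.3750\necho \"scale=4; 9 * 0.1 * 0.30\" | bc    # nine cache reads:              $0.2700\necho \"scale=4; 10 * 0.1 * 3.00\" | bc   # same ten calls without cache:  $3.0000\n```\n\n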
Cache pricing includes both Read and Write costs.\n\n**Extended Thinking Configuration**:\n- **Max Tokens 4096**: Generator (claude-opus-4-6) for maximum reasoning depth on complex exploit development\n- **Max Tokens 2048**: Coder (claude-sonnet-4-6) for balanced code analysis and vulnerability research  \n- **Max Tokens 1024**: Primary agent, assistant, refiner, adviser, reflector, searcher, installer, pentester for focused reasoning on specific tasks\n- **Extended Thinking**: All Claude 4.5+ and 4.6 models support configurable extended thinking for deep reasoning tasks\n\n**Key Features**:\n- **Extended Thinking**: All Claude 4.5+ and 4.6 models with configurable chain-of-thought reasoning depths for complex security analysis\n- **Adaptive Thinking**: Claude 4.6 series (Opus\u002FSonnet) dynamically adjusts reasoning depth based on task complexity for optimal performance\n- **Prompt Caching**: Significant cost reduction with separate read\u002Fwrite pricing (10% read, 125% write of input)\n- **Extended Context Window**: 200K tokens standard, up to 1M tokens (beta) for Claude Opus\u002FSonnet 4.6 for comprehensive codebase analysis\n- **Tool Calling**: Robust function calling with exceptional accuracy for security tool orchestration\n- **Streaming**: Real-time response streaming for interactive penetration testing workflows\n- **Safety-First Design**: Built-in safety mechanisms ensuring responsible security testing practices\n- **Multimodal Support**: Vision capabilities in latest models for screenshot analysis and UI security assessment\n- **Constitutional AI**: Advanced safety training providing reliable and ethical security guidance\n\n### Google AI (Gemini) Provider Configuration\n\nPentAGI integrates with Google's Gemini models through the Google AI API, offering state-of-the-art multimodal reasoning capabilities with extended thinking and context caching.\n\n#### Configuration Variables\n\n| Variable            | Default                                     | Description                    |\n| ------------------- | ------------------------------------------- | ------------------------------ |\n| `GEMINI_API_KEY`    |                                             | API key for Google AI services |\n| `GEMINI_SERVER_URL` | `https:\u002F\u002Fgenerativelanguage.googleapis.com` | Google AI API endpoint         |\n\n#### Configuration Examples\n\n```bash\n# Basic Gemini setup\nGEMINI_API_KEY=your_gemini_api_key\nGEMINI_SERVER_URL=https:\u002F\u002Fgenerativelanguage.googleapis.com\n\n# Using with proxy\nGEMINI_API_KEY=your_gemini_api_key\nPROXY_URL=http:\u002F\u002Fyour-proxy:8080\n```\n\n#### Supported Models\n\nPentAGI supports 13 Gemini models with tool calling, streaming, thinking modes, and context caching. 
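\n\nTo confirm a key works before starting PentAGI, you can list the models it exposes (the `v1beta\u002Fmodels` path follows Google's public API; verify against the current docs):\n\n```bash\n# List models visible to your key (sketch)\ncurl -s \"https:\u002F\u002Fgenerativelanguage.googleapis.com\u002Fv1beta\u002Fmodels?key=${GEMINI_API_KEY}\" \\\n  | jq -r '.models[].name'\n```\n\n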
Models marked with `*` are used in default configuration.\n\n**Gemini 3.1 Series - Latest Flagship (February 2026)**\n\n| Model ID                              | Thinking | Context | Price (Input\u002FOutput\u002FCache) | Use Case                                        |\n| ------------------------------------- | -------- | ------- | -------------------------- | ----------------------------------------------- |\n| `gemini-3.1-pro-preview`*             | ✅        | 1M      | $2.00\u002F$12.00\u002F$0.20         | Latest flagship with refined thinking, improved token efficiency, optimized for software engineering and agentic workflows |\n| `gemini-3.1-pro-preview-customtools`  | ✅        | 1M      | $2.00\u002F$12.00\u002F$0.20         | Custom tools endpoint optimized for bash and custom tools (view_file, search_code) prioritization |\n| `gemini-3.1-flash-lite-preview`*      | ✅        | 1M      | $0.25\u002F$1.50\u002F$0.03          | Most cost-efficient with fastest performance for high-volume agentic tasks and low-latency applications |\n\n**Gemini 3 Series (⚠️ gemini-3-pro-preview DEPRECATED - Shutdown March 9, 2026)**\n\n| Model ID                              | Thinking | Context | Price (Input\u002FOutput\u002FCache) | Use Case                                        |\n| ------------------------------------- | -------- | ------- | -------------------------- | ----------------------------------------------- |\n| `gemini-3-pro-preview`                | ✅        | 1M      | $2.00\u002F$12.00\u002F$0.20         | ⚠️ DEPRECATED - Migrate to gemini-3.1-pro-preview before March 9, 2026 |\n| `gemini-3-flash-preview`*             | ✅        | 1M      | $0.50\u002F$3.00\u002F$0.05          | Frontier intelligence with superior search grounding, high-throughput security scanning |\n\n**Gemini 2.5 Series - Advanced Thinking Models**\n\n| Model ID                                 | Thinking | Context | Price (Input\u002FOutput\u002FCache) | Use Case                                        |\n| ---------------------------------------- | -------- | ------- | -------------------------- | ----------------------------------------------- |\n| `gemini-2.5-pro`                         | ✅        | 1M      | $1.25\u002F$10.00\u002F$0.13         | State-of-the-art for complex coding and reasoning, sophisticated threat modeling |\n| `gemini-2.5-flash`                       | ✅        | 1M      | $0.30\u002F$2.50\u002F$0.03          | First hybrid reasoning model with thinking budgets, best price-performance for large-scale assessments |\n| `gemini-2.5-flash-lite`                  | ✅        | 1M      | $0.10\u002F$0.40\u002F$0.01          | Smallest and most cost-effective for at-scale usage, high-throughput scanning |\n| `gemini-2.5-flash-lite-preview-09-2025`  | ✅        | 1M      | $0.10\u002F$0.40\u002F$0.01          | Latest preview optimized for cost-efficiency, high throughput, and quality |\n\n**Gemini 2.0 Series - Balanced Multimodal for Agents**\n\n| Model ID                              | Thinking | Context | Price (Input\u002FOutput\u002FCache) | Use Case                                        |\n| ------------------------------------- | -------- | ------- | -------------------------- | ----------------------------------------------- |\n| `gemini-2.0-flash`                    | ❌        | 1M      | $0.10\u002F$0.40\u002F$0.03          | Balanced multimodal built for agents era, diverse security tasks and real-time monitoring |\n| `gemini-2.0-flash-lite`               | ❌        | 1M      | 
$0.08\u002F$0.30\u002F$0.00          | Lightweight for continuous monitoring, basic scanning, automated alert processing |\n\n**Specialized Open-Source Models (Free)**\n\n| Model ID                              | Thinking | Context | Price (Input\u002FOutput\u002FCache) | Use Case                                        |\n| ------------------------------------- | -------- | ------- | -------------------------- | ----------------------------------------------- |\n| `gemma-3-27b-it`                      | ❌        | 128K    | Free\u002FFree\u002FFree             | Open-source from Gemini tech, on-premises security operations, privacy-sensitive testing |\n| `gemma-3n-4b-it`                      | ❌        | 128K    | Free\u002FFree\u002FFree             | Efficient for edge devices (mobile\u002Flaptops\u002Ftablets), offline vulnerability scanning |\n\n**Prices**: Per 1M tokens (Standard Paid tier). Context window is input token limit.\n\n> [!WARNING]\n> **Gemini 3 Pro Preview Deprecation**\n>\n> `gemini-3-pro-preview` will be **shut down on March 9, 2026**. Migrate to `gemini-3.1-pro-preview` to avoid service disruption. The new model offers:\n>\n> - Refined performance and reliability\n> - Improved thinking and token efficiency\n> - Better grounded, factually consistent responses\n> - Enhanced software engineering behavior\n\n**Key Features**:\n- **Extended Thinking**: Step-by-step reasoning for complex security analysis (all Gemini 3.x and 2.5 series)\n- **Context Caching**: Significant cost reduction on repeated context (10-90% of input price)\n- **Ultra-Long Context**: 1M tokens for comprehensive codebase analysis and documentation review\n- **Multimodal Support**: Text, image, video, audio, and PDF processing for comprehensive assessments\n- **Tool Calling**: Seamless integration with 20+ pentesting tools via function calling\n- **Streaming**: Real-time response streaming for interactive security workflows\n- **Code Execution**: Built-in code execution for offensive tool testing and exploit validation\n- **Search Grounding**: Google Search integration for threat intelligence and CVE research\n- **File Search**: Document retrieval and RAG capabilities for knowledge-based assessments\n- **Batch API**: 50% cost reduction for non-real-time batch processing\n\n**Reasoning Effort Levels**:\n- **High**: Maximum thinking depth for complex multi-step analysis (generator)\n- **Medium**: Balanced reasoning for general agentic tasks (primary_agent, assistant, refiner, adviser)\n- **Low**: Efficient thinking for focused tasks (coder, installer, pentester)\n\n### AWS Bedrock Provider Configuration\n\nPentAGI integrates with Amazon Bedrock, offering access to 20+ foundation models from leading AI companies including Anthropic, Amazon, Cohere, DeepSeek, OpenAI, Qwen, Mistral, and Moonshot.\n\n#### Configuration Variables\n\n| Variable                    | Default     | Description                                                                                         |\n| --------------------------- | ----------- | --------------------------------------------------------------------------------------------------- |\n| `BEDROCK_REGION`            | `us-east-1` | AWS region for Bedrock service                                                                      |\n| `BEDROCK_DEFAULT_AUTH`      | `false`     | Use AWS SDK default credential chain (environment, EC2 role, ~\u002F.aws\u002Fcredentials) - highest priority |\n| `BEDROCK_BEARER_TOKEN`      |             | Bearer token authentication - priority 
over static credentials                                      |\n| `BEDROCK_ACCESS_KEY_ID`     |             | AWS access key ID for static credentials                                                            |\n| `BEDROCK_SECRET_ACCESS_KEY` |             | AWS secret access key for static credentials                                                        |\n| `BEDROCK_SESSION_TOKEN`     |             | AWS session token for temporary credentials (optional, used with static credentials)                |\n| `BEDROCK_SERVER_URL`        |             | Custom Bedrock endpoint (VPC endpoints, local testing)                                              |\n\n**Authentication Priority**: `BEDROCK_DEFAULT_AUTH` → `BEDROCK_BEARER_TOKEN` → `BEDROCK_ACCESS_KEY_ID`+`BEDROCK_SECRET_ACCESS_KEY`\n\n#### Configuration Examples\n\n```bash\n# Recommended: Default AWS SDK authentication (EC2\u002FECS\u002FLambda roles)\nBEDROCK_REGION=us-east-1\nBEDROCK_DEFAULT_AUTH=true\n\n# Bearer token authentication (AWS STS, custom auth)\nBEDROCK_REGION=us-east-1\nBEDROCK_BEARER_TOKEN=your_bearer_token\n\n# Static credentials (development, testing)\nBEDROCK_REGION=us-east-1\nBEDROCK_ACCESS_KEY_ID=your_aws_access_key\nBEDROCK_SECRET_ACCESS_KEY=your_aws_secret_key\n\n# With proxy and custom endpoint\nBEDROCK_REGION=us-east-1\nBEDROCK_DEFAULT_AUTH=true\nBEDROCK_SERVER_URL=https:\u002F\u002Fbedrock-runtime.us-east-1.vpce-xxx.amazonaws.com\nPROXY_URL=http:\u002F\u002Fyour-proxy:8080\n```\n\n#### Supported Models\n\nPentAGI supports 21 AWS Bedrock models with tool calling, streaming, and multimodal capabilities. Models marked with `*` are used in default configuration.\n\n| Model ID                                         | Provider        | Thinking | Multimodal | Price (Input\u002FOutput) | Use Case                                |\n| ------------------------------------------------ | --------------- | -------- | ---------- | -------------------- | --------------------------------------- |\n| `us.amazon.nova-2-lite-v1:0`                     | Amazon Nova     | ❌        | ✅          | $0.33\u002F$2.75          | Adaptive reasoning, efficient thinking  |\n| `us.amazon.nova-premier-v1:0`                    | Amazon Nova     | ❌        | ✅          | $2.50\u002F$12.50         | Complex reasoning, advanced analysis    |\n| `us.amazon.nova-pro-v1:0`                        | Amazon Nova     | ❌        | ✅          | $0.80\u002F$3.20          | Balanced accuracy, speed, cost          |\n| `us.amazon.nova-lite-v1:0`                       | Amazon Nova     | ❌        | ✅          | $0.06\u002F$0.24          | Fast processing, high-volume operations |\n| `us.amazon.nova-micro-v1:0`                      | Amazon Nova     | ❌        | ❌          | $0.035\u002F$0.14         | Ultra-low latency, real-time monitoring |\n| `us.anthropic.claude-opus-4-6-v1`*               | Anthropic       | ✅        | ✅          | $5.00\u002F$25.00         | World-class coding, enterprise agents   |\n| `us.anthropic.claude-sonnet-4-6`                 | Anthropic       | ✅        | ✅          | $3.00\u002F$15.00         | Frontier intelligence, enterprise scale |\n| `us.anthropic.claude-opus-4-5-20251101-v1:0`     | Anthropic       | ✅        | ✅          | $5.00\u002F$25.00         | Multi-day software development          |\n| `us.anthropic.claude-haiku-4-5-20251001-v1:0`*   | Anthropic       | ✅        | ✅          | $1.00\u002F$5.00          | Near-frontier performance, high speed   |\n| `us.anthropic.claude-sonnet-4-5-20250929-v1:0`*  | Anthropic       
| ✅        | ✅          | $3.00\u002F$15.00         | Real-world agents, coding excellence    |\n| `us.anthropic.claude-sonnet-4-20250514-v1:0`     | Anthropic       | ✅        | ✅          | $3.00\u002F$15.00         | Balanced performance, production-ready  |\n| `us.anthropic.claude-3-5-haiku-20241022-v1:0`    | Anthropic       | ❌        | ❌          | $0.80\u002F$4.00          | Fastest model, cost-effective scanning  |\n| `cohere.command-r-plus-v1:0`                     | Cohere          | ❌        | ❌          | $3.00\u002F$15.00         | Large-scale operations, superior RAG    |\n| `deepseek.v3.2`                                  | DeepSeek        | ❌        | ❌          | $0.58\u002F$1.68          | Long-context reasoning, efficiency      |\n| `openai.gpt-oss-120b-1:0`*                       | OpenAI (OSS)    | ✅        | ❌          | $0.15\u002F$0.60          | Strong reasoning, scientific analysis   |\n| `openai.gpt-oss-20b-1:0`                         | OpenAI (OSS)    | ✅        | ❌          | $0.07\u002F$0.30          | Efficient coding, software development  |\n| `qwen.qwen3-next-80b-a3b`                        | Qwen            | ❌        | ❌          | $0.15\u002F$1.20          | Ultra-long context, flagship reasoning  |\n| `qwen.qwen3-32b-v1:0`                            | Qwen            | ❌        | ❌          | $0.15\u002F$0.60          | Balanced reasoning, research use cases  |\n| `qwen.qwen3-coder-30b-a3b-v1:0`                  | Qwen            | ❌        | ❌          | $0.15\u002F$0.60          | Vibe coding, natural-language first     |\n| `qwen.qwen3-coder-next`                          | Qwen            | ❌        | ❌          | $0.45\u002F$1.80          | Tool use, function calling optimized    |\n| `mistral.mistral-large-3-675b-instruct`          | Mistral         | ❌        | ✅          | $4.00\u002F$12.00         | Advanced multimodal, long-context       |\n| `moonshotai.kimi-k2.5`                           | Moonshot        | ❌        | ✅          | $0.60\u002F$3.00          | Vision, language, code in one model     |\n\n**Prices**: Per 1M tokens. Models with thinking\u002Freasoning support additional compute costs during reasoning phase.\n\n#### Tested but Incompatible Models\n\nSome AWS Bedrock models were tested but are **not supported** due to technical limitations:\n\n| Model Family              | Reason for Incompatibility                                                                |\n| ------------------------- | ----------------------------------------------------------------------------------------- |\n| **GLM (Z.AI)**            | Tool calling format incompatible with Converse API (expects string instead of JSON)       |\n| **AI21 Jamba**            | Severe rate limits (1-2 req\u002Fmin) prevent reliable testing and production use              |\n| **Meta Llama 3.3\u002F3.1**    | Unstable tool call result processing, causes unexpected failures in multi-turn workflows  |\n| **Mistral Magistral**     | Tool calling not supported by the model                                                   |\n| **Moonshot K2-Thinking**  | Unstable streaming behavior with tool calls, unreliable in production                     |\n| **Qwen3-VL**              | Unstable streaming with tool calling, multimodal + tools combination fails intermittently |\n\n> [!IMPORTANT]\n> **Rate Limits & Quota Management**\n>\n> Default AWS Bedrock quotas for Claude models are **extremely restrictive** (2-20 requests\u002Fminute for new accounts). 
For production penetration testing:\n>\n> 1. **Request quota increases** through AWS Service Quotas console for models you plan to use\n> 2. **Use Amazon Nova models** - higher default quotas and excellent performance\n> 3. **Enable provisioned throughput** for consistent high-volume testing\n> 4. **Monitor usage** - AWS throttles aggressively at quota limits\n>\n> Without quota increases, expect frequent delays and workflow interruptions.\n\n> [!WARNING]\n> **Converse API Requirements**\n>\n> PentAGI uses Amazon Bedrock **Converse API** for unified model access. All supported models require:\n>\n> - ✅ Converse\u002FConverseStream API support\n> - ✅ Tool use (function calling) for penetration testing workflows\n> - ✅ Streaming tool use for real-time feedback\n>\n> Verify model capabilities at: [AWS Bedrock Model Features](https:\u002F\u002Fdocs.aws.amazon.com\u002Fbedrock\u002Flatest\u002Fuserguide\u002Fconversation-inference-supported-models-features.html)\n\n**Key Features**:\n- **Automatic Prompt Caching**: 40-70% cost reduction on repeated context (Claude 4.x models)\n- **Extended Thinking**: Step-by-step reasoning for complex security analysis (Claude, DeepSeek R1, OpenAI GPT)\n- **Multimodal Analysis**: Process screenshots, diagrams, video for comprehensive testing (Nova, Claude, Mistral, Kimi)\n- **Tool Calling**: Seamless integration with 20+ pentesting tools via function calling\n- **Streaming**: Real-time response streaming for interactive security assessment workflows\n\n### DeepSeek Provider Configuration\n\nPentAGI integrates with DeepSeek, providing access to advanced AI models with strong reasoning, coding capabilities, and context caching at competitive prices.\n\n#### Configuration Variables\n\n| Variable              | Default Value              | Description                                         |\n| --------------------- | -------------------------- | --------------------------------------------------- |\n| `DEEPSEEK_API_KEY`    |                            | DeepSeek API key for authentication                 |\n| `DEEPSEEK_SERVER_URL` | `https:\u002F\u002Fapi.deepseek.com` | DeepSeek API endpoint URL                           |\n| `DEEPSEEK_PROVIDER`   |                            | Provider prefix for LiteLLM integration (optional)  |\n\n#### Configuration Examples\n\n```bash\n# Direct API usage\nDEEPSEEK_API_KEY=your_deepseek_api_key\nDEEPSEEK_SERVER_URL=https:\u002F\u002Fapi.deepseek.com\n\n# With LiteLLM proxy\nDEEPSEEK_API_KEY=your_litellm_key\nDEEPSEEK_SERVER_URL=http:\u002F\u002Flitellm-proxy:4000\nDEEPSEEK_PROVIDER=deepseek  # Adds prefix to model names (deepseek\u002Fdeepseek-chat) for LiteLLM\n```\n\n#### Supported Models\n\nPentAGI supports 2 DeepSeek-V3.2 models with tool calling, streaming, thinking modes, and context caching. Both models are used in default configuration.\n\n| Model ID              | Thinking | Context | Max Output | Price (Input\u002FOutput\u002FCache) | Use Case                                        |\n| --------------------- | -------- | ------- | ---------- | -------------------------- | ----------------------------------------------- |\n| `deepseek-chat`*      | ❌        | 128K    | 8K         | $0.28\u002F$0.42\u002F$0.03          | General dialogue, code generation, tool calling |\n| `deepseek-reasoner`*  | ✅        | 128K    | 64K        | $0.28\u002F$0.42\u002F$0.03          | Advanced reasoning, complex logic, security analysis |\n\n**Prices**: Per 1M tokens. Cache pricing is for prompt caching (10% of input cost). 
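\n\nSince the DeepSeek API is OpenAI-compatible, a key can be checked by listing models (a sketch; route per DeepSeek's public docs):\n\n```bash\n# A valid key should list deepseek-chat and deepseek-reasoner\ncurl -s https:\u002F\u002Fapi.deepseek.com\u002Fmodels \\\n  -H \"Authorization: Bearer ${DEEPSEEK_API_KEY}\" | jq -r '.data[].id'\n```\n\n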
Models with thinking support include reinforcement learning chain-of-thought reasoning.\n\n**Key Features**:\n- **Automatic Prompt Caching**: 40-60% cost reduction on repeated context (10% of input price)\n- **Extended Thinking**: Reinforcement learning CoT for complex security analysis (deepseek-reasoner)\n- **Strong Coding**: Optimized for code generation and exploit development\n- **Tool Calling**: Seamless integration with 20+ pentesting tools via function calling\n- **Streaming**: Real-time response streaming for interactive workflows\n- **Multilingual**: Strong Chinese and English support\n- **Additional Features**: JSON Output, Chat Prefix Completion, FIM (Fill-in-the-Middle) Completion\n\n**LiteLLM Integration**: Set `DEEPSEEK_PROVIDER=deepseek` to enable model name prefixing when using default PentAGI configurations with LiteLLM proxy. Leave empty for direct API usage.\n\n### GLM Provider Configuration\n\nPentAGI integrates with GLM from Zhipu AI (Z.AI), providing advanced language models with MoE architecture, strong reasoning, and agentic capabilities developed by Tsinghua University.\n\n#### Configuration Variables\n\n| Variable          | Default Value                   | Description                                                |\n| ----------------- | ------------------------------- | ---------------------------------------------------------- |\n| `GLM_API_KEY`     |                                 | GLM API key for authentication                             |\n| `GLM_SERVER_URL`  | `https:\u002F\u002Fapi.z.ai\u002Fapi\u002Fpaas\u002Fv4`  | GLM API endpoint URL (international)                       |\n| `GLM_PROVIDER`    |                                 | Provider prefix for LiteLLM integration (optional)         |\n\n#### Configuration Examples\n\n```bash\n# Direct API usage (international endpoint)\nGLM_API_KEY=your_glm_api_key\nGLM_SERVER_URL=https:\u002F\u002Fapi.z.ai\u002Fapi\u002Fpaas\u002Fv4\n\n# Alternative endpoints\nGLM_SERVER_URL=https:\u002F\u002Fopen.bigmodel.cn\u002Fapi\u002Fpaas\u002Fv4  # China\nGLM_SERVER_URL=https:\u002F\u002Fapi.z.ai\u002Fapi\u002Fcoding\u002Fpaas\u002Fv4   # Coding-specific\n\n# With LiteLLM proxy\nGLM_API_KEY=your_litellm_key\nGLM_SERVER_URL=http:\u002F\u002Flitellm-proxy:4000\nGLM_PROVIDER=zai  # Adds prefix to model names (zai\u002Fglm-4) for LiteLLM\n```\n\n#### Supported Models\n\nPentAGI supports 12 GLM models with tool calling, streaming, thinking modes, and prompt caching. 
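\n\nAssuming the endpoint follows the OpenAI-compatible request shape its base URL suggests, a minimal smoke test against the free tier might look like this (a sketch, not an official recipe):\n\n```bash\n# One-shot completion against the free glm-4.5-flash model (sketch)\ncurl -s https:\u002F\u002Fapi.z.ai\u002Fapi\u002Fpaas\u002Fv4\u002Fchat\u002Fcompletions \\\n  -H \"Authorization: Bearer ${GLM_API_KEY}\" \\\n  -H \"Content-Type: application\u002Fjson\" \\\n  -d '{\"model\": \"glm-4.5-flash\", \"messages\": [{\"role\": \"user\", \"content\": \"ping\"}]}'\n```\n\n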
Models marked with `*` are used in default configuration.\n\n**GLM-5 Series - Flagship MoE (744B\u002F40B active)**\n\n| Model ID                | Thinking      | Context | Max Output | Price (Input\u002FOutput\u002FCache) | Use Case                                        |\n| ----------------------- | ------------- | ------- | ---------- | -------------------------- | ----------------------------------------------- |\n| `glm-5`*                | ✅ Forced      | 200K    | 128K       | $1.00\u002F$3.20\u002F$0.20          | Flagship agentic engineering, complex multi-stage tasks |\n| `glm-5-code`†           | ✅ Forced      | 200K    | 128K       | $1.20\u002F$5.00\u002F$0.30          | Code-specialized, exploit development (requires Coding Plan) |\n\n**GLM-4.7 Series - Premium with Interleaved Thinking**\n\n| Model ID                | Thinking      | Context | Max Output | Price (Input\u002FOutput\u002FCache) | Use Case                                        |\n| ----------------------- | ------------- | ------- | ---------- | -------------------------- | ----------------------------------------------- |\n| `glm-4.7`*              | ✅ Forced      | 200K    | 128K       | $0.60\u002F$2.20\u002F$0.11          | Premium with thinking before each response\u002Ftool call |\n| `glm-4.7-flashx`*       | ✅ Hybrid      | 200K    | 128K       | $0.07\u002F$0.40\u002F$0.01          | High-speed with priority GPU, best price\u002Fperformance |\n| `glm-4.7-flash`         | ✅ Hybrid      | 200K    | 128K       | Free\u002FFree\u002FFree             | Free ~30B SOTA model, 1 concurrent request      |\n\n**GLM-4.6 Series - Balanced with Auto-Thinking**\n\n| Model ID                | Thinking      | Context | Max Output | Price (Input\u002FOutput\u002FCache) | Use Case                                        |\n| ----------------------- | ------------- | ------- | ---------- | -------------------------- | ----------------------------------------------- |\n| `glm-4.6`               | ✅ Auto        | 200K    | 128K       | $0.60\u002F$2.20\u002F$0.11          | Balanced, streaming tool calls, 30% token efficient |\n\n**GLM-4.5 Series - Unified Reasoning\u002FCoding\u002FAgents**\n\n| Model ID                | Thinking      | Context | Max Output | Price (Input\u002FOutput\u002FCache) | Use Case                                        |\n| ----------------------- | ------------- | ------- | ---------- | -------------------------- | ----------------------------------------------- |\n| `glm-4.5`               | ✅ Auto        | 128K    | 96K        | $0.60\u002F$2.20\u002F$0.11          | Unified model, MoE 355B\u002F32B active              |\n| `glm-4.5-x`             | ✅ Auto        | 128K    | 96K        | $2.20\u002F$8.90\u002F$0.45          | Ultra-fast premium, lowest latency              |\n| `glm-4.5-air`*          | ✅ Auto        | 128K    | 96K        | $0.20\u002F$1.10\u002F$0.03          | Cost-effective, MoE 106B\u002F12B, best price\u002Fquality |\n| `glm-4.5-airx`          | ✅ Auto        | 128K    | 96K        | $1.10\u002F$4.50\u002F$0.22          | Accelerated Air with priority GPU               |\n| `glm-4.5-flash`         | ✅ Auto        | 128K    | 96K        | Free\u002FFree\u002FFree             | Free with reasoning\u002Fcoding\u002Fagents support       |\n\n**GLM-4 Legacy - Dense Architecture**\n\n| Model ID                | Thinking      | Context | Max Output | Price (Input\u002FOutput\u002FCache) | Use Case                                        |\n| ----------------------- | 
------------- | ------- | ---------- | -------------------------- | ----------------------------------------------- |\n| `glm-4-32b-0414-128k`   | ❌             | 128K    | 16K        | $0.10\u002F$0.10\u002F$0.00          | Ultra-budget dense 32B, high-volume parsing     |\n\n**Prices**: Per 1M tokens. Cache pricing is for prompt caching. † Model requires **Coding Plan subscription**.\n\n> [!WARNING]\n> **Coding Plan Requirement**\n>\n> The `glm-5-code` model requires an active **Coding Plan subscription**. Attempting to use this model without the subscription will result in:\n>\n> ```\n> API returned unexpected status code: 403: You do not have permission to access glm-5-code\n> ```\n>\n> For code-specialized tasks without Coding Plan, use `glm-5` (general flagship) or `glm-4.7` (premium with interleaved thinking) instead.\n\n**Thinking Modes**:\n- **Forced**: Model always uses thinking mode before responding (GLM-5, GLM-4.7)\n- **Hybrid**: Model intelligently decides when to use thinking (GLM-4.7-FlashX, GLM-4.7-Flash)\n- **Auto**: Model automatically determines when reasoning is needed (GLM-4.6, GLM-4.5 series)\n\n**Key Features**:\n- **Prompt Caching**: Significant cost reduction on repeated context (cached input pricing shown)\n- **Interleaved Thinking**: GLM-4.7 thinks before each response and tool call with preserved reasoning across multi-turn dialogues\n- **Ultra-Long Context**: 200K tokens for GLM-5 and GLM-4.7\u002F4.6 series for massive codebase analysis\n- **MoE Architecture**: Efficient 744B parameters with 40B active (GLM-5), 355B\u002F32B (GLM-4.5), 106B\u002F12B (GLM-4.5-Air)\n- **Tool Calling**: Seamless integration with 20+ pentesting tools via function calling\n- **Streaming**: Real-time response streaming with streaming tool calls support (GLM-4.6+)\n- **Multilingual**: Exceptional Chinese and English NLP capabilities\n- **Free Options**: GLM-4.7-Flash and GLM-4.5-Flash for prototyping and experimentation\n\n**LiteLLM Integration**: Set `GLM_PROVIDER=zai` to enable model name prefixing when using default PentAGI configurations with LiteLLM proxy. 
### Kimi Provider Configuration

PentAGI integrates with Kimi from Moonshot AI, providing ultra-long context models with multimodal capabilities perfect for analyzing extensive codebases and documentation.

#### Configuration Variables

| Variable           | Default Value                | Description                                         |
| ------------------ | ---------------------------- | --------------------------------------------------- |
| `KIMI_API_KEY`     |                              | Kimi API key for authentication                     |
| `KIMI_SERVER_URL`  | `https://api.moonshot.ai/v1` | Kimi API endpoint URL (international)               |
| `KIMI_PROVIDER`    |                              | Provider prefix for LiteLLM integration (optional)  |

#### Configuration Examples

```bash
# Direct API usage (international endpoint)
KIMI_API_KEY=your_kimi_api_key
KIMI_SERVER_URL=https://api.moonshot.ai/v1

# Alternative endpoint
KIMI_SERVER_URL=https://api.moonshot.cn/v1  # China

# With LiteLLM proxy
KIMI_API_KEY=your_litellm_key
KIMI_SERVER_URL=http://litellm-proxy:4000
KIMI_PROVIDER=moonshot  # Adds prefix to model names (moonshot/kimi-k2.5) for LiteLLM
```

#### Supported Models

PentAGI supports 12 Kimi/Moonshot models with tool calling, streaming, thinking modes, and multimodal capabilities. Models marked with `*` are used in the default configuration.

**Kimi K2.5 Series - Advanced Multimodal**

| Model ID                   | Thinking | Multimodal | Context | Speed      | Price (Input/Output) | Use Case                                        |
| -------------------------- | -------- | ---------- | ------- | ---------- | -------------------- | ----------------------------------------------- |
| `kimi-k2.5`*               | ✅        | ✅          | 256K    | Standard   | $0.60/$3.00          | Most intelligent, versatile, vision+text+code   |

**Kimi K2 Series - MoE Foundation (1T params, 32B activated)**

| Model ID                   | Thinking | Multimodal | Context | Speed      | Price (Input/Output) | Use Case                                        |
| -------------------------- | -------- | ---------- | ------- | ---------- | -------------------- | ----------------------------------------------- |
| `kimi-k2-0905-preview`*    | ❌        | ❌          | 256K    | Standard   | $0.60/$2.50          | Enhanced agentic coding, improved frontend      |
| `kimi-k2-0711-preview`     | ❌        | ❌          | 128K    | Standard   | $0.60/$2.50          | Powerful code and agent capabilities            |
| `kimi-k2-turbo-preview`*   | ❌        | ❌          | 256K    | Turbo      | $1.15/$8.00          | High-speed version, 60-100 tokens/sec           |
| `kimi-k2-thinking`         | ✅        | ❌          | 256K    | Standard   | $0.60/$2.50          | Long-term thinking, multi-step tool usage       |
| `kimi-k2-thinking-turbo`   | ✅        | ❌          | 256K    | Turbo      | $1.15/$8.00          | High-speed thinking, deep reasoning             |

**Moonshot V1 Series - General Text Generation**

| Model ID                   | Thinking | Multimodal | Context | Speed      | Price (Input/Output) | Use Case                                        |
| -------------------------- | -------- | ---------- | ------- | ---------- | -------------------- | ----------------------------------------------- |
| `moonshot-v1-8k`           | ❌        | ❌          | 8K      | Standard   | $0.20/$2.00          | Short text generation, cost-effective           |
| `moonshot-v1-32k`          | ❌        | ❌          | 32K     | Standard   | $1.00/$3.00          | Long text generation, balanced                  |
| `moonshot-v1-128k`         | ❌        | ❌          | 128K    | Standard   | $2.00/$5.00          | Very long text generation, extensive context    |

**Moonshot V1 Vision Series - Multimodal**

| Model ID                          | Thinking | Multimodal | Context | Speed      | Price (Input/Output) | Use Case                                        |
| --------------------------------- | -------- | ---------- | ------- | ---------- | -------------------- | ----------------------------------------------- |
| `moonshot-v1-8k-vision-preview`   | ❌        | ✅          | 8K      | Standard   | $0.20/$2.00          | Vision understanding, short context             |
| `moonshot-v1-32k-vision-preview`  | ❌        | ✅          | 32K     | Standard   | $1.00/$3.00          | Vision understanding, medium context            |
| `moonshot-v1-128k-vision-preview` | ❌        | ✅          | 128K    | Standard   | $2.00/$5.00          | Vision understanding, long context              |

**Prices**: Per 1M tokens. Turbo models offer 60-100 tokens/sec output speed with higher pricing.

**Key Features**:
- **Ultra-Long Context**: Up to 256K tokens for comprehensive codebase analysis
- **Multimodal Capabilities**: Vision models support image understanding for screenshot analysis (Kimi K2.5, V1 Vision series)
- **Extended Thinking**: Deep reasoning with multi-step tool usage (kimi-k2.5, kimi-k2-thinking models)
- **High-Speed Turbo**: 60-100 tokens/sec output for real-time workflows (Turbo variants)
- **Tool Calling**: Seamless integration with 20+ pentesting tools via function calling
- **Streaming**: Real-time response streaming for interactive security assessment
- **Multilingual**: Strong Chinese and English language support
- **MoE Architecture**: Efficient 1T total parameters with 32B activated for the K2 series

**LiteLLM Integration**: Set `KIMI_PROVIDER=moonshot` to enable model name prefixing when using default PentAGI configurations with LiteLLM proxy. Leave empty for direct API usage.
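Before wiring the key into PentAGI, you can sanity-check it against the endpoint. A sketch assuming `curl` is available and that the standard OpenAI-compatible `/models` route is exposed (the route is an assumption, not documented here):

```bash
# Hypothetical smoke test: list the models visible to your key
curl -s https://api.moonshot.ai/v1/models \
  -H "Authorization: Bearer $KIMI_API_KEY"
```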
### Qwen Provider Configuration

PentAGI integrates with Qwen from Alibaba Cloud Model Studio (DashScope), providing powerful multilingual models with reasoning capabilities and context caching support.

#### Configuration Variables

| Variable           | Default Value                                          | Description                                         |
| ------------------ | ------------------------------------------------------ | --------------------------------------------------- |
| `QWEN_API_KEY`     |                                                        | Qwen API key for authentication                     |
| `QWEN_SERVER_URL`  | `https://dashscope-us.aliyuncs.com/compatible-mode/v1` | Qwen API endpoint URL (international)               |
| `QWEN_PROVIDER`    |                                                        | Provider prefix for LiteLLM integration (optional)  |

#### Configuration Examples

```bash
# Direct API usage (Global/US endpoint)
QWEN_API_KEY=your_qwen_api_key
QWEN_SERVER_URL=https://dashscope-us.aliyuncs.com/compatible-mode/v1

# Alternative endpoints
QWEN_SERVER_URL=https://dashscope-intl.aliyuncs.com/compatible-mode/v1  # International (Singapore)
QWEN_SERVER_URL=https://dashscope.aliyuncs.com/compatible-mode/v1       # Chinese Mainland (Beijing)

# With LiteLLM proxy
QWEN_API_KEY=your_litellm_key
QWEN_SERVER_URL=http://litellm-proxy:4000
QWEN_PROVIDER=dashscope  # Adds prefix to model names (dashscope/qwen-plus) for LiteLLM
```

#### Supported Models

PentAGI supports 33 Qwen models with tool calling, streaming, thinking modes, and context caching. Models marked with `*` are used in the default configuration.

**Wide Availability Models (All Regions)**

| Model ID                     | Thinking | Intl | Global/US | China | Price (Input/Output/Cache) | Use Case                                        |
| ---------------------------- | -------- | ---- | --------- | ----- | -------------------------- | ----------------------------------------------- |
| `qwen3-max`*                 | ✅        | ✅    | ✅         | ✅     | $2.40/$12.00/$0.48         | Flagship reasoning, complex security analysis   |
| `qwen3-max-preview`          | ✅        | ✅    | ✅         | ✅     | $2.40/$12.00/$0.48         | Preview version with extended thinking          |
| `qwen-max`                   | ❌        | ✅    | ❌         | ✅     | $1.60/$6.40/$0.32          | Strong instruction following, legacy flagship   |
| `qwen3.5-plus`*              | ✅        | ✅    | ✅         | ✅     | $0.40/$2.40/$0.08          | Balanced reasoning, general dialogue, coding    |
| `qwen-plus`                  | ✅        | ✅    | ✅         | ✅     | $0.40/$4.00/$0.08          | Cost-effective balanced performance             |
| `qwen3.5-flash`*             | ✅        | ✅    | ✅         | ✅     | $0.10/$0.40/$0.02          | Ultra-fast lightweight, high-throughput         |
| `qwen-flash`                 | ❌        | ✅    | ✅         | ✅     | $0.05/$0.40/$0.01          | Fast with context caching, cost-optimized       |
| `qwen-turbo`                 | ✅        | ✅    | ❌         | ✅     | $0.05/$0.50/$0.01          | Deprecated, use qwen-flash instead              |
| `qwq-plus`                   | ✅        | ✅    | ❌         | ✅     | $0.80/$2.40/$0.16          | Deep reasoning, chain-of-thought analysis       |

**Region-Specific Models**

| Model ID                     | Thinking | Intl | Global/US | China | Price (Input/Output/Cache) | Use Case                                        |
| ---------------------------- | -------- | ---- | --------- | ----- | -------------------------- | ----------------------------------------------- |
| `qwen-plus-us`               | ✅        | ❌    | ✅         | ❌     | $0.40/$4.00/$0.08          | US region optimized balanced model              |
| `qwen-long-latest`           | ❌        | ❌    | ❌         | ✅     | $0.07/$0.29/$0.01          | Ultra-long context (10M tokens)                 |

**Open Source - Qwen3.5 Series**

| Model ID                     | Thinking | Intl | Global/US | China | Price (Input/Output/Cache) | Use Case                                        |
| ---------------------------- | -------- | ---- | --------- | ----- | -------------------------- | ----------------------------------------------- |
| `qwen3.5-397b-a17b`          | ✅        | ✅    | ✅         | ✅     | $0.60/$3.60/$0.12          | Largest 397B parameters, exceptional reasoning  |
| `qwen3.5-122b-a10b`          | ✅        | ✅    | ✅         | ✅     | $0.40/$3.20/$0.08          | Large 122B parameters, strong performance       |
| `qwen3.5-27b`                | ✅        | ✅    | ✅         | ✅     | $0.30/$2.40/$0.06          | Medium 27B parameters, balanced                 |
| `qwen3.5-35b-a3b`            | ✅        | ✅    | ✅         | ✅     | $0.25/$2.00/$0.05          | Efficient 35B with 3B active MoE                |

**Open Source - Qwen3 Series**

| Model ID                       | Thinking | Intl | Global/US | China | Price (Input/Output/Cache) | Use Case                                        |
| ------------------------------ | -------- | ---- | --------- | ----- | -------------------------- | ----------------------------------------------- |
| `qwen3-next-80b-a3b-thinking`  | ✅        | ✅    | ✅         | ✅     | $0.15/$1.43/$0.03          | Next-gen 80B thinking-only mode                 |
| `qwen3-next-80b-a3b-instruct`  | ❌        | ✅    | ✅         | ✅     | $0.15/$1.20/$0.03          | Next-gen 80B instruction following              |
| `qwen3-235b-a22b`              | ✅        | ✅    | ✅         | ✅     | $0.70/$8.40/$0.14          | Dual-mode 235B with 22B active                  |
| `qwen3-32b`                    | ✅        | ✅    | ✅         | ✅     | $0.29/$2.87/$0.06          | Versatile 32B dual-mode                         |
| `qwen3-30b-a3b`                | ✅        | ✅    | ✅         | ✅     | $0.20/$2.40/$0.04          | Efficient 30B MoE architecture                  |
| `qwen3-14b`                    | ✅        | ✅    | ✅         | ✅     | $0.35/$4.20/$0.07          | Medium 14B performance-cost balance             |
| `qwen3-8b`                     | ✅        | ✅    | ✅         | ✅     | $0.18/$2.10/$0.04          | Compact 8B efficiency optimized                 |
| `qwen3-4b`                     | ✅        | ✅    | ❌         | ✅     | $0.11/$1.26/$0.02          | Lightweight 4B for simple tasks                 |
| `qwen3-1.7b`                   | ✅        | ✅    | ❌         | ✅     | $0.11/$1.26/$0.02          | Ultra-compact 1.7B basic tasks                  |
| `qwen3-0.6b`                   | ✅        | ✅    | ❌         | ✅     | $0.11/$1.26/$0.02          | Smallest 0.6B minimal resources                 |

**Open Source - QwQ & Qwen2.5 Series**

| Model ID                     | Thinking | Intl | Global/US | China | Price (Input/Output/Cache) | Use Case                                        |
| ---------------------------- | -------- | ---- | --------- | ----- | -------------------------- | ----------------------------------------------- |
| `qwq-32b`                    | ✅        | ✅    | ✅         | ✅     | $0.29/$0.86/$0.06          | Open 32B reasoning, deep research               |
| `qwen2.5-14b-instruct-1m`    | ❌        | ✅    | ❌         | ✅     | $0.81/$3.22/$0.16          | Extended 1M context, 14B parameters             |
| `qwen2.5-7b-instruct-1m`     | ❌        | ✅    | ❌         | ✅     | $0.37/$1.47/$0.07          | Extended 1M context, 7B parameters              |
| `qwen2.5-72b-instruct`       | ❌        | ✅    | ❌         | ✅     | $1.40/$5.60/$0.28          | Large 72B instruction following                 |
| `qwen2.5-32b-instruct`       | ❌        | ✅    | ❌         | ✅     | $0.70/$2.80/$0.14          | Medium 32B instruction following                |
| `qwen2.5-14b-instruct`       | ❌        | ✅    | ❌         | ✅     | $0.35/$1.40/$0.07          | Compact 14B instruction following               |
| `qwen2.5-7b-instruct`        | ❌        | ✅    | ❌         | ✅     | $0.18/$0.70/$0.04          | Small 7B instruction following                  |
| `qwen2.5-3b-instruct`        | ❌        | ❌    | ❌         | ✅     | $0.04/$0.13/$0.01          | Lightweight 3B Chinese Mainland only            |
**Prices**: Per 1M tokens. Cache pricing is for implicit context caching (20% of input cost). Models with thinking support include additional reasoning computation during the CoT phase.

**Region Availability**:
- **Intl** (International): Singapore region (`dashscope-intl.aliyuncs.com`)
- **Global/US**: US Virginia region (`dashscope-us.aliyuncs.com`)
- **China**: Chinese Mainland Beijing region (`dashscope.aliyuncs.com`)

**Key Features**:
- **Automatic Context Caching**: 30-50% cost reduction on repeated context with implicit cache (20% of input price)
- **Extended Thinking**: Chain-of-thought reasoning for complex security analysis (Qwen3-Max, QwQ, Qwen3.5-Plus)
- **Tool Calling**: Seamless integration with 20+ pentesting tools via function calling
- **Streaming**: Real-time response streaming for interactive workflows
- **Multilingual**: Strong Chinese, English, and multi-language support
- **Ultra-Long Context**: Up to 10M tokens with qwen-long-latest for massive codebase analysis

**LiteLLM Integration**: Set `QWEN_PROVIDER=dashscope` to enable model name prefixing when using default PentAGI configurations with LiteLLM proxy. Leave empty for direct API usage.

## 🔧 Advanced Setup

### Langfuse Integration

Langfuse provides advanced capabilities for monitoring and analyzing AI agent operations.

1. Configure the Langfuse environment variables in the existing `.env` file.

<details>
    <summary>Notable Langfuse environment variables</summary>

### Database Credentials
- `LANGFUSE_POSTGRES_USER` and `LANGFUSE_POSTGRES_PASSWORD` - Langfuse PostgreSQL credentials
- `LANGFUSE_CLICKHOUSE_USER` and `LANGFUSE_CLICKHOUSE_PASSWORD` - ClickHouse credentials
- `LANGFUSE_REDIS_AUTH` - Redis password

### Encryption and Security Keys
- `LANGFUSE_SALT` - Salt for hashing in Langfuse Web UI
- `LANGFUSE_ENCRYPTION_KEY` - Encryption key (32 bytes in hex)
- `LANGFUSE_NEXTAUTH_SECRET` - Secret key for NextAuth

### Admin Credentials
- `LANGFUSE_INIT_USER_EMAIL` - Admin email
- `LANGFUSE_INIT_USER_PASSWORD` - Admin password
- `LANGFUSE_INIT_USER_NAME` - Admin username

### API Keys and Tokens
- `LANGFUSE_INIT_PROJECT_PUBLIC_KEY` - Project public key (used from the PentAGI side too)
- `LANGFUSE_INIT_PROJECT_SECRET_KEY` - Project secret key (used from the PentAGI side too)

### S3 Storage
- `LANGFUSE_S3_ACCESS_KEY_ID` - S3 access key ID
- `LANGFUSE_S3_SECRET_ACCESS_KEY` - S3 secret access key

</details>

2. Enable the Langfuse integration for the PentAGI service in the `.env` file.

```bash
LANGFUSE_BASE_URL=http://langfuse-web:3000
LANGFUSE_PROJECT_ID= # default: value from ${LANGFUSE_INIT_PROJECT_ID}
LANGFUSE_PUBLIC_KEY= # default: value from ${LANGFUSE_INIT_PROJECT_PUBLIC_KEY}
LANGFUSE_SECRET_KEY= # default: value from ${LANGFUSE_INIT_PROJECT_SECRET_KEY}
```
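If you need fresh values for the security keys listed in step 1 (salt, encryption key, NextAuth secret), one common way to generate them, assuming `openssl` is available on the host:

```bash
# Print random secrets for the Langfuse security variables from step 1
echo "LANGFUSE_SALT=$(openssl rand -hex 16)"
echo "LANGFUSE_ENCRYPTION_KEY=$(openssl rand -hex 32)"    # 32 bytes, hex-encoded, as required
echo "LANGFUSE_NEXTAUTH_SECRET=$(openssl rand -base64 32)"
```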
3. Run the Langfuse stack:

```bash
curl -O https://raw.githubusercontent.com/vxcontrol/pentagi/master/docker-compose-langfuse.yml
docker compose -f docker-compose.yml -f docker-compose-langfuse.yml up -d
```

Visit [localhost:4000](http://localhost:4000) to access the Langfuse Web UI with the credentials from the `.env` file:

- `LANGFUSE_INIT_USER_EMAIL` - Admin email
- `LANGFUSE_INIT_USER_PASSWORD` - Admin password

### Monitoring and Observability

For detailed system operation tracking, integration with monitoring tools is available.

1. Enable the integration with OpenTelemetry and all observability services for PentAGI in the `.env` file.

```bash
OTEL_HOST=otelcol:8148
```

2. Run the observability stack:

```bash
curl -O https://raw.githubusercontent.com/vxcontrol/pentagi/master/docker-compose-observability.yml
docker compose -f docker-compose.yml -f docker-compose-observability.yml up -d
```

Visit [localhost:3000](http://localhost:3000) to access the Grafana Web UI.

> [!NOTE]
> If you want to use the Observability stack together with Langfuse, set `LANGFUSE_OTEL_EXPORTER_OTLP_ENDPOINT` to `http://otelcol:4318` in the `.env` file.
>
> To run all available stacks together (Langfuse, Graphiti, and Observability):
>
> ```bash
> docker compose -f docker-compose.yml -f docker-compose-langfuse.yml -f docker-compose-graphiti.yml -f docker-compose-observability.yml up -d
> ```
>
> You can also register aliases for these commands in your shell to run them faster:
>
> ```bash
> alias pentagi="docker compose -f docker-compose.yml -f docker-compose-langfuse.yml -f docker-compose-graphiti.yml -f docker-compose-observability.yml"
> alias pentagi-up="docker compose -f docker-compose.yml -f docker-compose-langfuse.yml -f docker-compose-graphiti.yml -f docker-compose-observability.yml up -d"
> alias pentagi-down="docker compose -f docker-compose.yml -f docker-compose-langfuse.yml -f docker-compose-graphiti.yml -f docker-compose-observability.yml down"
> ```

### Knowledge Graph Integration (Graphiti)

PentAGI integrates with [Graphiti](https://github.com/vxcontrol/pentagi-graphiti), a temporal knowledge graph system powered by Neo4j, to provide advanced semantic understanding and relationship tracking for AI agent operations. The vxcontrol fork provides custom entity and edge types that are specific to pentesting purposes.

#### What is Graphiti?

Graphiti automatically extracts and stores structured knowledge from agent interactions, building a graph of entities, relationships, and temporal context. This enables:

- **Semantic Memory**: Store and recall relationships between tools, targets, vulnerabilities, and techniques
- **Contextual Understanding**: Track how different pentesting actions relate to each other over time
- **Knowledge Reuse**: Learn from past penetration tests and apply insights to new assessments
- **Advanced Querying**: Search for complex patterns like "What tools were effective against similar targets?"
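As a rough illustration of what such exploration can look like once the stack described below is running, you can inspect the stored graph directly in Neo4j. A sketch, assuming `cypher-shell` inside the Neo4j service container (the query is generic Cypher, not a PentAGI-specific API):

```bash
# Count stored nodes per label via cypher-shell (ships with the Neo4j image)
docker compose -f docker-compose.yml -f docker-compose-graphiti.yml exec neo4j \
  cypher-shell -u neo4j -p devpassword \
  "MATCH (n) RETURN labels(n) AS label, count(*) AS nodes ORDER BY nodes DESC LIMIT 10;"
```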
#### Enabling Graphiti

The Graphiti knowledge graph is **optional** and disabled by default. To enable it:

1. Configure the Graphiti environment variables in the `.env` file:

```bash
## Graphiti knowledge graph settings
GRAPHITI_ENABLED=true
GRAPHITI_TIMEOUT=30
GRAPHITI_URL=http://graphiti:8000
GRAPHITI_MODEL_NAME=gpt-5-mini

# Neo4j settings (used by Graphiti stack)
NEO4J_USER=neo4j
NEO4J_DATABASE=neo4j
NEO4J_PASSWORD=devpassword
NEO4J_URI=bolt://neo4j:7687

# OpenAI API key (required by Graphiti for entity extraction)
OPEN_AI_KEY=your_openai_api_key
```

2. Run the Graphiti stack along with the main PentAGI services:

```bash
# Download the Graphiti compose file if needed
curl -O https://raw.githubusercontent.com/vxcontrol/pentagi/master/docker-compose-graphiti.yml

# Start PentAGI with Graphiti
docker compose -f docker-compose.yml -f docker-compose-graphiti.yml up -d
```

3. Verify Graphiti is running:

```bash
# Check service health
docker compose -f docker-compose.yml -f docker-compose-graphiti.yml ps graphiti neo4j

# View Graphiti logs
docker compose -f docker-compose.yml -f docker-compose-graphiti.yml logs -f graphiti

# Access Neo4j Browser (optional)
# Visit http://localhost:7474 and login with NEO4J_USER/NEO4J_PASSWORD

# Access Graphiti API (optional, for debugging)
# Visit http://localhost:8000/docs for Swagger API documentation
```

> [!NOTE]
> The Graphiti service is defined in `docker-compose-graphiti.yml` as a separate stack. You must run both compose files together to enable the knowledge graph functionality. The pre-built Docker image `vxcontrol/graphiti:latest` is used by default.

#### What Gets Stored

When enabled, PentAGI automatically captures:

- **Agent Responses**: All agent reasoning, analysis, and decisions
- **Tool Executions**: Commands executed, tools used, and their results
- **Context Information**: Flow, task, and subtask hierarchy

### GitHub and Google OAuth Integration

OAuth integration with GitHub and Google allows users to authenticate using their existing accounts on these platforms. This provides several benefits:

- Simplified login process without the need to create separate credentials
- Enhanced security through trusted identity providers
- Access to user profile information from GitHub/Google accounts
- Seamless integration with existing development workflows

To use GitHub OAuth, create a new OAuth application in your GitHub account and set `OAUTH_GITHUB_CLIENT_ID` and `OAUTH_GITHUB_CLIENT_SECRET` in the `.env` file.

To use Google OAuth, create a new OAuth application in your Google account and set `OAUTH_GOOGLE_CLIENT_ID` and `OAUTH_GOOGLE_CLIENT_SECRET` in the `.env` file.
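For example, the corresponding `.env` entries (all values are placeholders taken from your OAuth applications):

```bash
# GitHub OAuth application credentials
OAUTH_GITHUB_CLIENT_ID=your_github_client_id
OAUTH_GITHUB_CLIENT_SECRET=your_github_client_secret

# Google OAuth application credentials
OAUTH_GOOGLE_CLIENT_ID=your_google_client_id
OAUTH_GOOGLE_CLIENT_SECRET=your_google_client_secret
```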
### Docker Image Configuration

PentAGI allows you to configure Docker image selection for executing various tasks. The system automatically chooses the most appropriate image based on the task type, but you can constrain this selection by specifying your preferred images:

| Variable                           | Default                | Description                                                 |
| ---------------------------------- | ---------------------- | ----------------------------------------------------------- |
| `DOCKER_DEFAULT_IMAGE`             | `debian:latest`        | Default Docker image for general tasks and ambiguous cases  |
| `DOCKER_DEFAULT_IMAGE_FOR_PENTEST` | `vxcontrol/kali-linux` | Default Docker image for security/penetration testing tasks |

When these environment variables are set, AI agents will be limited to the image choices you specify. This is particularly useful for:

- **Security Enforcement**: Restricting usage to only verified and trusted images
- **Environment Standardization**: Using corporate or customized images across all operations
- **Performance Optimization**: Utilizing pre-built images with necessary tools already installed

Configuration examples:

```bash
# Using a custom image for general tasks
DOCKER_DEFAULT_IMAGE=mycompany/custom-debian:latest

# Using a specialized image for penetration testing
DOCKER_DEFAULT_IMAGE_FOR_PENTEST=mycompany/pentest-tools:v2.0
```

> [!NOTE]
> If a user explicitly specifies a particular Docker image in their task, the system will try to use that exact image, ignoring these settings. These variables only affect the system's automatic image selection process.

## 💻 Development

### Development Requirements

- golang
- nodejs
- docker
- postgres
- commitlint

### Environment Setup

#### Backend Setup

Run `cd backend && go mod download` once to install the needed packages.

To generate the swagger files, first install the `swag` package:

```bash
go install github.com/swaggo/swag/cmd/swag@v1.8.7
```

and then run:

```bash
swag init -g ../../pkg/server/router.go -o pkg/server/docs/ --parseDependency --parseInternal --parseDepth 2 -d cmd/pentagi
```

To generate the graphql resolver files, run:

```bash
go run github.com/99designs/gqlgen --config ./gqlgen/gqlgen.yml
```

after which you can find the generated files in the `pkg/graph` folder.

To generate the ORM methods (database package) from the sqlc configuration:

```bash
docker run --rm -v $(pwd):/src -w /src --network pentagi-network -e DATABASE_URL="{URL}" sqlc/sqlc:1.27.0 generate -f sqlc/sqlc.yml
```

To generate the Langfuse SDK from the OpenAPI specification, install fern-cli:

```bash
npm install -g fern-api
```

and then run:

```bash
fern generate --local
```

#### Testing

To run the tests: `cd backend && go test -v ./...`

#### Frontend Setup

Run `cd frontend && npm install` once to install the needed packages.

To generate the graphql files, run `npm run graphql:generate`, which uses the `graphql-codegen.ts` file.

Make sure you have `graphql-codegen` installed globally:

```bash
npm install -g graphql-codegen
```

After that you can run:
* `npm run prettier` to check if your code is formatted correctly
* `npm run prettier:fix` to fix it
* `npm run lint` to check if your code is linted correctly
* `npm run lint:fix` to fix it

To generate SSL certificates, run `npm run ssl:generate`, which uses the `generate-ssl.ts` file; otherwise they will be generated automatically when you run `npm run dev`.
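Taken together, a one-time tooling setup might look like this (a sketch combining the commands above; run it from the repository root):

```bash
# Install the code-generation tooling used by the backend and frontend
go install github.com/swaggo/swag/cmd/swag@v1.8.7   # swagger generator
npm install -g fern-api graphql-codegen             # Langfuse SDK + GraphQL codegen

# Fetch project dependencies
(cd backend && go mod download)
(cd frontend && npm install)
```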
#### Backend Configuration

Edit the configuration for `backend` in the `.vscode/launch.json` file:
- `DATABASE_URL` - PostgreSQL database URL (e.g., `postgres://postgres:postgres@localhost:5432/pentagidb?sslmode=disable`)
- `DOCKER_HOST` - Docker SDK API (e.g., for macOS `DOCKER_HOST=unix:///Users/<my-user>/Library/Containers/com.docker.docker/Data/docker.raw.sock`) [more info](https://stackoverflow.com/a/62757128/5922857)

Optional:
- `SERVER_PORT` - Port to run the server (default: `8443`)
- `SERVER_USE_SSL` - Enable SSL for the server (default: `false`)

#### Frontend Configuration

Edit the configuration for `frontend` in the `.vscode/launch.json` file:
- `VITE_API_URL` - Backend API URL. *Omit* the URL scheme (e.g., `localhost:8080` *NOT* `http://localhost:8080`)
- `VITE_USE_HTTPS` - Enable SSL for the server (default: `false`)
- `VITE_PORT` - Port to run the server (default: `8000`)
- `VITE_HOST` - Host to run the server (default: `0.0.0.0`)

### Running the Application

#### Backend

Run the command(s) in the `backend` folder:
- Load environment variables from the `.env` file (e.g., `source .env`)
- Run `go run cmd/pentagi/main.go` to start the server

> [!NOTE]
> The first run can take a while as dependencies and Docker images need to be downloaded to set up the backend environment.

#### Frontend

Run the command(s) in the `frontend` folder:
- Run `npm install` to install the dependencies
- Run `npm run dev` to run the web app
- Run `npm run build` to build the web app

Open your browser and visit the web app URL.

## Testing LLM Agents

PentAGI includes a powerful utility called `ctester` for testing and validating LLM agent capabilities. This tool helps ensure your LLM provider configurations work correctly with different agent types, allowing you to optimize model selection for each specific agent role.

The utility features parallel testing of multiple agents, detailed reporting, and flexible configuration options.

### Key Features

- **Parallel Testing**: Tests multiple agents simultaneously for faster results
- **Comprehensive Test Suite**: Evaluates basic completion, JSON responses, function calling, and penetration testing knowledge
- **Detailed Reporting**: Generates markdown reports with success rates and performance metrics
- **Flexible Configuration**: Test specific agents or test groups as needed
- **Specialized Test Groups**: Includes domain-specific tests for cybersecurity and penetration testing scenarios

### Usage Scenarios

#### For Developers (with local Go environment)

If you've cloned the repository and have Go installed:

```bash
# Default configuration with .env file
cd backend
go run cmd/ctester/*.go -verbose

# Custom provider configuration
go run cmd/ctester/*.go -config ../examples/configs/openrouter.provider.yml -verbose

# Generate a report file
go run cmd/ctester/*.go -config ../examples/configs/deepinfra.provider.yml -report ../test-report.md

# Test specific agent types only
go run cmd/ctester/*.go -agents simple,simple_json,primary_agent -verbose

# Test specific test groups only
go run cmd/ctester/*.go -groups basic,advanced -verbose
```

#### For Users (using Docker image)

If you prefer to use the pre-built Docker image without setting up a development environment:

```bash
# Using Docker to test with default environment
docker run --rm -v $(pwd)/.env:/opt/pentagi/.env vxcontrol/pentagi /opt/pentagi/bin/ctester -verbose

# Test with your custom provider configuration
docker run --rm \
  -v $(pwd)/.env:/opt/pentagi/.env \
  -v $(pwd)/my-config.yml:/opt/pentagi/config.yml \
  vxcontrol/pentagi /opt/pentagi/bin/ctester -config /opt/pentagi/config.yml -agents simple,primary_agent,coder -verbose

# Generate a detailed report
docker run --rm \
  -v $(pwd)/.env:/opt/pentagi/.env \
  -v $(pwd):/opt/pentagi/output \
  vxcontrol/pentagi /opt/pentagi/bin/ctester -report /opt/pentagi/output/report.md
```

#### Using Pre-configured Providers

The Docker image comes with built-in support for major providers (OpenAI, Anthropic, Gemini, Ollama) and pre-configured provider files for additional services (OpenRouter, DeepInfra, DeepSeek, Moonshot, Novita):

```bash
# Test with OpenRouter configuration
docker run --rm \
  -v $(pwd)/.env:/opt/pentagi/.env \
  vxcontrol/pentagi /opt/pentagi/bin/ctester -config /opt/pentagi/conf/openrouter.provider.yml

# Test with DeepInfra configuration
docker run --rm \
  -v $(pwd)/.env:/opt/pentagi/.env \
  vxcontrol/pentagi /opt/pentagi/bin/ctester -config /opt/pentagi/conf/deepinfra.provider.yml

# Test with DeepSeek configuration
docker run --rm \
  -v $(pwd)/.env:/opt/pentagi/.env \
  vxcontrol/pentagi /opt/pentagi/bin/ctester -provider deepseek

# Test with GLM configuration
docker run --rm \
  -v $(pwd)/.env:/opt/pentagi/.env \
  vxcontrol/pentagi /opt/pentagi/bin/ctester -provider glm

# Test with Kimi configuration
docker run --rm \
  -v $(pwd)/.env:/opt/pentagi/.env \
  vxcontrol/pentagi /opt/pentagi/bin/ctester -provider kimi

# Test with Qwen configuration
docker run --rm \
  -v $(pwd)/.env:/opt/pentagi/.env \
  vxcontrol/pentagi /opt/pentagi/bin/ctester -provider qwen

# Test with DeepSeek configuration file for custom provider
docker run --rm \
  -v $(pwd)/.env:/opt/pentagi/.env \
  vxcontrol/pentagi /opt/pentagi/bin/ctester -config /opt/pentagi/conf/deepseek.provider.yml

# Test with Moonshot configuration file for custom provider
docker run --rm \
  -v $(pwd)/.env:/opt/pentagi/.env \
  vxcontrol/pentagi /opt/pentagi/bin/ctester -config /opt/pentagi/conf/moonshot.provider.yml

# Test with Novita configuration
docker run --rm \
  -v $(pwd)/.env:/opt/pentagi/.env \
  vxcontrol/pentagi /opt/pentagi/bin/ctester -config /opt/pentagi/conf/novita.provider.yml

# Test with OpenAI configuration
docker run --rm \
  -v $(pwd)/.env:/opt/pentagi/.env \
  vxcontrol/pentagi /opt/pentagi/bin/ctester -type openai

# Test with Anthropic configuration
docker run --rm \
  -v $(pwd)/.env:/opt/pentagi/.env \
  vxcontrol/pentagi /opt/pentagi/bin/ctester -type anthropic

# Test with Gemini configuration
docker run --rm \
  -v $(pwd)/.env:/opt/pentagi/.env \
  vxcontrol/pentagi /opt/pentagi/bin/ctester -type gemini

# Test with AWS Bedrock configuration
docker run --rm \
  -v $(pwd)/.env:/opt/pentagi/.env \
  vxcontrol/pentagi /opt/pentagi/bin/ctester -type bedrock

# Test with Custom OpenAI configuration
docker run --rm \
  -v $(pwd)/.env:/opt/pentagi/.env \
  vxcontrol/pentagi /opt/pentagi/bin/ctester -config /opt/pentagi/conf/custom-openai.provider.yml

# Test with Ollama configuration (local inference)
docker run --rm \
  -v $(pwd)/.env:/opt/pentagi/.env \
  vxcontrol/pentagi /opt/pentagi/bin/ctester -config /opt/pentagi/conf/ollama-llama318b.provider.yml

# Test with Ollama Qwen3 32B configuration (requires custom model creation)
docker run --rm \
  -v $(pwd)/.env:/opt/pentagi/.env \
  vxcontrol/pentagi /opt/pentagi/bin/ctester -config /opt/pentagi/conf/ollama-qwen332b-fp16-tc.provider.yml

# Test with Ollama QwQ 32B configuration (requires custom model creation and 71.3GB VRAM)
docker run --rm \
  -v $(pwd)/.env:/opt/pentagi/.env \
  vxcontrol/pentagi /opt/pentagi/bin/ctester -config /opt/pentagi/conf/ollama-qwq32b-fp16-tc.provider.yml
```

To use these configurations, your `.env` file only needs to contain:
```
LLM_SERVER_URL=https://openrouter.ai/api/v1      # or https://api.deepinfra.com/v1/openai or https://api.openai.com/v1 or https://api.novita.ai/openai
LLM_SERVER_KEY=your_api_key
LLM_SERVER_MODEL=                                # Leave empty, as models are specified in the config
LLM_SERVER_CONFIG_PATH=/opt/pentagi/conf/openrouter.provider.yml  # or deepinfra.provider.yml or custom-openai.provider.yml or novita.provider.yml
LLM_SERVER_PROVIDER=                             # Provider name for LiteLLM proxy (e.g., openrouter, deepseek, moonshot, novita)
LLM_SERVER_LEGACY_REASONING=false                # Controls reasoning format, for OpenAI must be true (default: false)
LLM_SERVER_PRESERVE_REASONING=false              # Preserve reasoning content in multi-turn conversations (required by Moonshot, default: false)

# For OpenAI (official API)
OPEN_AI_KEY=your_openai_api_key                  # Your OpenAI API key
OPEN_AI_SERVER_URL=https://api.openai.com/v1     # OpenAI API endpoint

# For Anthropic (Claude models)
ANTHROPIC_API_KEY=your_anthropic_api_key         # Your Anthropic API key
ANTHROPIC_SERVER_URL=https://api.anthropic.com/v1  # Anthropic API endpoint

# For Gemini (Google AI)
GEMINI_API_KEY=your_gemini_api_key               # Your Google AI API key
GEMINI_SERVER_URL=https://generativelanguage.googleapis.com  # Google AI API endpoint

# For AWS Bedrock (enterprise foundation models)
BEDROCK_REGION=us-east-1                         # AWS region for Bedrock service
# Authentication (choose one method, priority: DefaultAuth > BearerToken > AccessKey):
BEDROCK_DEFAULT_AUTH=false                       # Use AWS SDK credential chain (env vars, EC2 role, ~/.aws/credentials)
BEDROCK_BEARER_TOKEN=                            # Bearer token authentication (takes priority over static credentials)
BEDROCK_ACCESS_KEY_ID=your_aws_access_key        # AWS access key ID (static credentials)
BEDROCK_SECRET_ACCESS_KEY=your_aws_secret_key    # AWS secret access key (static credentials)
BEDROCK_SESSION_TOKEN=                           # AWS session token (optional, for temporary credentials with static auth)
BEDROCK_SERVER_URL=                              # Optional custom Bedrock endpoint (VPC endpoints, local testing)

# For Ollama (local server or cloud)
OLLAMA_SERVER_URL=                               # Local: http://ollama-server:11434, Cloud: https://ollama.com
OLLAMA_SERVER_API_KEY=                           # Required for Ollama Cloud (https://ollama.com/settings/keys), leave empty for local
OLLAMA_SERVER_MODEL=
OLLAMA_SERVER_CONFIG_PATH=
OLLAMA_SERVER_PULL_MODELS_TIMEOUT=
OLLAMA_SERVER_PULL_MODELS_ENABLED=
OLLAMA_SERVER_LOAD_MODELS_ENABLED=

# For DeepSeek (Chinese AI with strong reasoning)
DEEPSEEK_API_KEY=                                # DeepSeek API key
DEEPSEEK_SERVER_URL=https://api.deepseek.com     # DeepSeek API endpoint
DEEPSEEK_PROVIDER=                               # Optional: LiteLLM prefix (e.g., 'deepseek')

# For GLM (Zhipu AI)
GLM_API_KEY=                                     # GLM API key
GLM_SERVER_URL=https://api.z.ai/api/paas/v4      # GLM API endpoint (international)
GLM_PROVIDER=                                    # Optional: LiteLLM prefix (e.g., 'zai')

# For Kimi (Moonshot AI)
KIMI_API_KEY=                                    # Kimi API key
KIMI_SERVER_URL=https://api.moonshot.ai/v1       # Kimi API endpoint (international)
KIMI_PROVIDER=                                   # Optional: LiteLLM prefix (e.g., 'moonshot')

# For Qwen (Alibaba Cloud DashScope)
QWEN_API_KEY=                                    # Qwen API key
QWEN_SERVER_URL=https://dashscope-us.aliyuncs.com/compatible-mode/v1  # Qwen API endpoint (US)
QWEN_PROVIDER=                                   # Optional: LiteLLM prefix (e.g., 'dashscope')

# For Ollama (local inference) use the variables above
OLLAMA_SERVER_URL=http://localhost:11434
OLLAMA_SERVER_MODEL=llama3.1:8b-instruct-q8_0
OLLAMA_SERVER_CONFIG_PATH=/opt/pentagi/conf/ollama-llama318b.provider.yml
OLLAMA_SERVER_PULL_MODELS_ENABLED=false
OLLAMA_SERVER_LOAD_MODELS_ENABLED=false
```

#### Using OpenAI with Unverified Organizations

For OpenAI accounts with unverified organizations that don't have access to the latest reasoning models (o1, o3, o4-mini), you need to use a custom configuration.

To use OpenAI with unverified organization accounts, configure your `.env` file as follows:

```bash
LLM_SERVER_URL=https://api.openai.com/v1
LLM_SERVER_KEY=your_openai_api_key
LLM_SERVER_MODEL=                                # Leave empty, models are specified in config
LLM_SERVER_CONFIG_PATH=/opt/pentagi/conf/custom-openai.provider.yml
LLM_SERVER_LEGACY_REASONING=true                 # Required for OpenAI reasoning format
```

This configuration uses the pre-built `custom-openai.provider.yml` file that maps all agent types to models available for unverified organizations, using `o3-mini` instead of models like `o1`, `o3`, and `o4-mini`.

You can test this configuration using:

```bash
# Test with custom OpenAI configuration for unverified accounts
docker run --rm \
  -v $(pwd)/.env:/opt/pentagi/.env \
  vxcontrol/pentagi /opt/pentagi/bin/ctester -config /opt/pentagi/conf/custom-openai.provider.yml
```

> [!NOTE]
> The `LLM_SERVER_LEGACY_REASONING=true` setting is crucial for OpenAI compatibility as it ensures reasoning parameters are sent in the format expected by OpenAI's API.

#### Using LiteLLM Proxy

When using LiteLLM proxy to access various LLM providers, model names are prefixed with the provider name (e.g., `moonshot/kimi-k2.5` instead of `kimi-k2.5`).
To use the same provider configuration files with both direct API access and LiteLLM proxy, set the `LLM_SERVER_PROVIDER` variable:

```bash
# Direct access to Moonshot API
LLM_SERVER_URL=https://api.moonshot.ai/v1
LLM_SERVER_KEY=your_moonshot_api_key
LLM_SERVER_CONFIG_PATH=/opt/pentagi/conf/moonshot.provider.yml
LLM_SERVER_PROVIDER=                             # Empty for direct access

# Access via LiteLLM proxy
LLM_SERVER_URL=http://litellm-proxy:4000
LLM_SERVER_KEY=your_litellm_api_key
LLM_SERVER_CONFIG_PATH=/opt/pentagi/conf/moonshot.provider.yml
LLM_SERVER_PROVIDER=moonshot                     # Provider prefix for LiteLLM
```

With `LLM_SERVER_PROVIDER=moonshot`, the system automatically prefixes all model names from the configuration file with `moonshot/`, making them compatible with LiteLLM's model naming convention.

**LiteLLM Provider Name Mapping:**

When using LiteLLM proxy, set the corresponding `*_PROVIDER` variable to enable model prefixing:

- `deepseek` - for DeepSeek models (`DEEPSEEK_PROVIDER=deepseek` → `deepseek/deepseek-chat`)
- `zai` - for GLM models (`GLM_PROVIDER=zai` → `zai/glm-4`)
- `moonshot` - for Kimi models (`KIMI_PROVIDER=moonshot` → `moonshot/kimi-k2.5`)
- `dashscope` - for Qwen models (`QWEN_PROVIDER=dashscope` → `dashscope/qwen-plus`)
- `openai`, `anthropic`, `gemini` - for major cloud providers
- `openrouter` - for OpenRouter aggregator
- `deepinfra` - for DeepInfra hosting
- `novita` - for Novita AI
- Any other provider name configured in your LiteLLM instance

**Example with LiteLLM:**
```bash
# Use DeepSeek models via LiteLLM proxy with model prefixing
DEEPSEEK_API_KEY=your_litellm_proxy_key
DEEPSEEK_SERVER_URL=http://litellm-proxy:4000
DEEPSEEK_PROVIDER=deepseek  # Models become deepseek/deepseek-chat, deepseek/deepseek-reasoner for LiteLLM

# Direct DeepSeek API usage (no prefix needed)
DEEPSEEK_API_KEY=your_deepseek_api_key
DEEPSEEK_SERVER_URL=https://api.deepseek.com
# Leave DEEPSEEK_PROVIDER empty
```

This approach allows you to:
- Use the same configuration files for both direct and proxied access
- Switch between providers without modifying configuration files
- Easily test different routing strategies with LiteLLM

#### Running Tests in a Production Environment

If you already have a running PentAGI container and want to test the current configuration:

```bash
# Run ctester in an existing container using current environment variables
docker exec -it pentagi /opt/pentagi/bin/ctester -verbose

# Test specific agent types with deterministic ordering
docker exec -it pentagi /opt/pentagi/bin/ctester -agents simple,primary_agent,pentester -groups basic,knowledge -verbose

# Generate a report file inside the container
docker exec -it pentagi /opt/pentagi/bin/ctester -report /opt/pentagi/data/agent-test-report.md

# Access the report from the host
docker cp pentagi:/opt/pentagi/data/agent-test-report.md ./
```

### Command-line Options

The utility accepts several options:

- `-env <path>` - Path to environment file (default: `.env`)
- `-type <provider>` - Provider type: `custom`, `openai`, `anthropic`, `ollama`, `bedrock`, `gemini` (default: `custom`)
- `-config <path>` - Path to custom provider config (default: from the `LLM_SERVER_CONFIG_PATH` env variable)
- `-tests <path>` - Path to custom tests YAML file (optional)
- `-report <path>` - Path to write the report file (optional)
- `-agents <list>` - Comma-separated list of agent types to test (default: `all`)
- `-groups <list>` - Comma-separated list of test groups to run (default: `all`)
- `-verbose` - Enable verbose output with detailed test results for each agent

### Available Agent Types

Agents are tested in the following deterministic order:

1. **simple** - Basic completion tasks
2. **simple_json** - JSON-structured responses
3. **primary_agent** - Main reasoning agent
4. **assistant** - Interactive assistant mode
5. **generator** - Content generation
6. **refiner** - Content refinement and improvement
7. **adviser** - Expert advice and consultation
8. **reflector** - Self-reflection and analysis
9. **searcher** - Information gathering and search
10. **enricher** - Data enrichment and expansion
11. **coder** - Code generation and analysis
12. **installer** - Installation and setup tasks
13. **pentester** - Penetration testing and security assessment

### Available Test Groups

- **basic** - Fundamental completion and prompt response tests
- **advanced** - Complex reasoning and function calling tests
- **json** - JSON format validation and structure tests (specifically designed for the `simple_json` agent)
- **knowledge** - Domain-specific cybersecurity and penetration testing knowledge tests

> **Note**: The `json` test group is specifically designed for the `simple_json` agent type, while all other agents are tested with the `basic`, `advanced`, and `knowledge` groups. This specialization ensures optimal testing coverage for each agent's intended purpose.

### Example Provider Configuration

Provider configuration defines which models to use for different agent types:

```yaml
simple:
  model: "provider/model-name"
  temperature: 0.7
  top_p: 0.95
  n: 1
  max_tokens: 4000

simple_json:
  model: "provider/model-name"
  temperature: 0.7
  top_p: 1.0
  n: 1
  max_tokens: 4000
  json: true

# ... other agent types ...
```

### Optimization Workflow

1. **Create a baseline**: Run tests with the default configuration to establish benchmark performance
2. **Analyze agent-specific performance**: Review the deterministic agent ordering to identify underperforming agents
3. **Test specialized configurations**: Experiment with different models for each agent type using provider-specific configs
4. **Focus on domain knowledge**: Pay special attention to the knowledge group tests for cybersecurity expertise
5. **Validate function calling**: Ensure tool-based tests pass consistently for critical agent types
6. **Compare results**: Look for the best success rate and performance across all test groups
7. **Deploy the optimal configuration**: Use your optimized setup in production

This tool helps ensure your AI agents are using the most effective models for their specific tasks, improving reliability while optimizing costs.

## Embedding Configuration and Testing

PentAGI uses vector embeddings for semantic search, knowledge storage, and memory management. The system supports multiple embedding providers that can be configured according to your needs and preferences.

### Supported Embedding Providers

PentAGI supports the following embedding providers:

- **OpenAI** (default): Uses OpenAI's text embedding models
- **Ollama**: Local embedding models through Ollama
- **Mistral**: Mistral AI's embedding models
- **Jina**: Jina AI's embedding service
- **HuggingFace**: Models from HuggingFace
- **GoogleAI**: Google's embedding models
- **VoyageAI**: VoyageAI's embedding models

<details>
<summary><b>Embedding Provider Configuration</b> (click to expand)</summary>

### Environment Variables

To configure the embedding provider, set the following environment variables in your `.env` file:

```bash
# Primary embedding configuration
EMBEDDING_PROVIDER=openai       # Provider type (openai, ollama, mistral, jina, huggingface, googleai, voyageai)
EMBEDDING_MODEL=text-embedding-3-small  # Model name to use
EMBEDDING_URL=                  # Optional custom API endpoint
EMBEDDING_KEY=                  # API key for the provider (if required)
EMBEDDING_BATCH_SIZE=100        # Number of documents to process in a batch
EMBEDDING_STRIP_NEW_LINES=true  # Whether to remove new lines from text before embedding

# Advanced settings
PROXY_URL=                      # Optional proxy for all API calls
HTTP_CLIENT_TIMEOUT=600         # Timeout in seconds for external API calls (default: 600, 0 = no timeout)

# SSL/TLS Certificate Configuration (for external communication with LLM backends and tool servers)
EXTERNAL_SSL_CA_PATH=           # Path to custom CA certificate file (PEM format) inside the container
                                # Must point to /opt/pentagi/ssl/ directory (e.g., /opt/pentagi/ssl/ca-bundle.pem)
EXTERNAL_SSL_INSECURE=false     # Skip certificate verification (use only for testing)
```

<details>
<summary><b>How to Add Custom CA Certificates</b> (click to expand)</summary>

If you see this error: `tls: failed to verify certificate: x509: certificate signed by unknown authority`

**Step 1:** Get your CA certificate bundle in PEM format (can contain multiple certificates)

**Step 2:** Place the file in the SSL directory on your host machine:
```bash
# Default location (if PENTAGI_SSL_DIR is not set)
cp ca-bundle.pem ./pentagi-ssl/

# Or custom location (if using PENTAGI_SSL_DIR in docker-compose.yml)
cp ca-bundle.pem /path/to/your/ssl/dir/
```

**Step 3:** Set the path in the `.env` file (the path must be inside the container):
```bash
# The volume pentagi-ssl is mounted to /opt/pentagi/ssl inside the container
EXTERNAL_SSL_CA_PATH=/opt/pentagi/ssl/ca-bundle.pem
EXTERNAL_SSL_INSECURE=false
```

**Step 4:** Restart PentAGI:
```bash
docker compose restart pentagi
```

**Notes:**
- The `pentagi-ssl` volume is mounted to `/opt/pentagi/ssl` inside the container
- You can change the host directory using the `PENTAGI_SSL_DIR` variable in docker-compose.yml
- The file supports multiple certificates and intermediate CAs in one PEM file
- Use `EXTERNAL_SSL_INSECURE=true` only for testing (not recommended for production)

</details>

### Provider-Specific Limitations

Each provider has specific limitations and supported features:

- **OpenAI**: Supports all configuration options
- **Ollama**: Does not support `EMBEDDING_KEY` as it uses local models
- **Mistral**: Does not support `EMBEDDING_MODEL` or a custom HTTP client
- **Jina**: Does not support a custom HTTP client
- **HuggingFace**: Requires `EMBEDDING_KEY` and supports all other options
- **GoogleAI**: Does not support `EMBEDDING_URL`, requires `EMBEDDING_KEY`
- **VoyageAI**: Supports all configuration options

If `EMBEDDING_URL` and `EMBEDDING_KEY` are not specified, the system will attempt to use the corresponding LLM provider settings (e.g., `OPEN_AI_KEY` when `EMBEDDING_PROVIDER=openai`).

### Why Consistent Embedding Providers Matter

It's crucial to use the same embedding provider consistently because:

1. **Vector Compatibility**: Different providers produce vectors with different dimensions and mathematical properties
2. **Semantic Consistency**: Changing providers can break semantic similarity between previously embedded documents
3. **Memory Corruption**: Mixed embeddings can lead to poor search results and broken knowledge base functionality

If you change your embedding provider, you should flush and reindex your entire knowledge base (see the `etester` utility below).

</details>

### Embedding Tester Utility (etester)

PentAGI includes a specialized `etester` utility for testing, managing, and debugging embedding functionality. This tool is essential for diagnosing and resolving issues related to vector embeddings and knowledge storage.

<details>
<summary><b>Etester Commands</b> (click to expand)</summary>

```bash
# Test embedding provider and database connection
cd backend
go run cmd/etester/main.go test -verbose

# Show statistics about the embedding database
go run cmd/etester/main.go info

# Delete all documents from the embedding database (use with caution!)
go run cmd/etester/main.go flush

# Recalculate embeddings for all documents (after changing provider)
go run cmd/etester/main.go reindex

# Search for documents in the embedding database
go run cmd/etester/main.go search -query "How to install PostgreSQL" -limit 5
```

### Using Docker

If you're running PentAGI in Docker, you can use etester from within the container:

```bash
# Test embedding provider
docker exec -it pentagi /opt/pentagi/bin/etester test

# Show detailed database information
docker exec -it pentagi /opt/pentagi/bin/etester info -verbose
```

### Advanced Search Options

The `search` command supports various filters to narrow down results:

```bash
# Filter by document type
docker exec -it pentagi /opt/pentagi/bin/etester search -query "Security vulnerability" -doc_type guide -threshold 0.8

# Filter by flow ID
docker exec -it pentagi /opt/pentagi/bin/etester search -query "Code examples" -doc_type code -flow_id 42

# All available search options
docker exec -it pentagi /opt/pentagi/bin/etester search -help
```

Available search parameters:
- `-query STRING`: Search query text (required)
- `-doc_type STRING`: Filter by document type (answer, memory, guide, code)
- `-flow_id NUMBER`: Filter by flow ID (positive number)
- `-answer_type STRING`: Filter by answer type (guide, vulnerability, code, tool, other)
- `-guide_type STRING`: Filter by guide type (install, configure, use, pentest, development, other)
- `-limit NUMBER`: Maximum number of results (default: 3)
- `-threshold NUMBER`: Similarity threshold (0.0-1.0, default: 0.7)
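These filters can be combined. For example, a stricter lookup of pentest guides (the query text and values here are illustrative):

```bash
# Pentest guides only, five results, higher similarity bar
docker exec -it pentagi /opt/pentagi/bin/etester search \
  -query "privilege escalation checklist" \
  -doc_type guide -guide_type pentest \
  -limit 5 -threshold 0.8
```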
### Common Troubleshooting Scenarios

1. **After changing embedding provider**: Always run `flush` or `reindex` to ensure consistency
2. **Poor search results**: Try adjusting the similarity threshold or check if embeddings are correctly generated
3. **Database connection issues**: Verify PostgreSQL is running with the pgvector extension installed
4. **Missing API keys**: Check the environment variables for your chosen embedding provider

</details>

## 🔍 Function Testing with ftester

PentAGI includes a versatile utility called `ftester` for debugging, testing, and developing specific functions and AI agent behaviors. While `ctester` focuses on testing LLM model capabilities, `ftester` allows you to directly invoke individual system functions and AI agent components with precise control over the execution context.

### Key Features

- **Direct Function Access**: Test individual functions without running the entire system
- **Mock Mode**: Test functions without a live PentAGI deployment using built-in mocks
- **Interactive Input**: Fill function arguments interactively for exploratory testing
- **Detailed Output**: Color-coded terminal output with formatted responses and errors
- **Context-Aware Testing**: Debug AI agents within the context of specific flows, tasks, and subtasks
- **Observability Integration**: All function calls are logged to Langfuse and the Observability stack

### Usage Modes

#### Command Line Arguments

Run ftester with a specific function and arguments directly from the command line:

```bash
# Basic usage with mock mode
cd backend
go run cmd/ftester/main.go [function_name] -[arg1] [value1] -[arg2] [value2]

# Example: Test terminal command in mock mode
go run cmd/ftester/main.go terminal -command "ls -la" -message "List files"

# Using a real flow context
go run cmd/ftester/main.go -flow 123 terminal -command "whoami" -message "Check user"

# Testing AI agent in specific task/subtask context
go run cmd/ftester/main.go -flow 123 -task 456 -subtask 789 pentester -message "Find vulnerabilities"
```

#### Interactive Mode

Run ftester without arguments for a guided interactive experience:

```bash
# Start interactive mode
go run cmd/ftester/main.go [function_name]

# For example, to interactively fill browser tool arguments
go run cmd/ftester/main.go browser
```

<details>
<summary><b>Available Functions</b> (click to expand)</summary>

### Environment Functions
- **terminal**: Execute commands in a container and return the output
- **file**: Perform file operations (read, write, list) in a container

### Search Functions
- **browser**: Access websites and capture screenshots
- **google**: Search the web using Google Custom Search
- **duckduckgo**: Search the web using DuckDuckGo
- **tavily**: Search using Tavily AI search engine
- **traversaal**: Search using Traversaal AI search engine
- **perplexity**: Search using Perplexity AI
- **sploitus**: Search for security exploits, vulnerabilities (CVEs), and pentesting tools
- **searxng**: Search using the Searxng meta search engine (aggregates results from multiple engines)

### Vector Database Functions
- **search_in_memory**: Search for information in vector database
- **search_guide**: Find guidance documents in vector database
- **search_answer**: Find answers to questions in vector database
- **search_code**: Find code examples in vector database

### AI Agent Functions
- **advice**: Get expert advice from an AI agent
- **coder**: Request code generation or modification
- **maintenance**: Run system maintenance tasks
- **memorist**: Store and organize information in vector database
- **pentester**: Perform security tests and vulnerability analysis
- **search**: Complex search across multiple sources

### Utility Functions
- **describe**: Show information about flows, tasks, and subtasks

</details>

<details>
<summary><b>Debugging Flow Context</b> (click to expand)</summary>

The `describe` function provides detailed information about tasks and subtasks within a flow. This is particularly useful for diagnosing issues when PentAGI encounters problems or gets stuck.

```bash
# List all flows in the system
go run cmd/ftester/main.go describe

# Show all tasks and subtasks for a specific flow
go run cmd/ftester/main.go -flow 123 describe

# Show detailed information for a specific task
go run cmd/ftester/main.go -flow 123 -task 456 describe

# Show detailed information for a specific subtask
go run cmd/ftester/main.go -flow 123 -task 456 -subtask 789 describe

# Show verbose output with full descriptions and results
go run cmd/ftester/main.go -flow 123 describe -verbose
```

This function allows you to identify the exact point where a flow might be stuck and resume processing by directly invoking the appropriate agent function.

</details>

<details>
<summary><b>Function Help and Discovery</b> (click to expand)</summary>

Each function has a help mode that shows the available parameters:

```bash
# Get help for a specific function
go run cmd/ftester/main.go [function_name] -help

# Examples:
go run cmd/ftester/main.go terminal -help
go run cmd/ftester/main.go browser -help
go run cmd/ftester/main.go describe -help
```

You can also run ftester without arguments to see a list of all available functions:

```bash
go run cmd/ftester/main.go
```

</details>

<details>
<summary><b>Output Format</b> (click to expand)</summary>

The `ftester` utility uses color-coded output to make interpretation easier:

- **Blue headers**: Section titles and key names
- **Cyan [INFO]**: General information messages
- **Green [SUCCESS]**: Successful operations
- **Red [ERROR]**: Error messages
- **Yellow [WARNING]**: Warning messages
- **Yellow [MOCK]**: Indicates mock mode operation
- **Magenta values**: Function arguments and results

JSON and Markdown responses are automatically formatted for readability.

</details>

<details>
<summary><b>Advanced Usage Scenarios</b> (click to expand)</summary>

### Debugging Stuck AI Flows

When PentAGI gets stuck in a flow:

1. Pause the flow through the UI
2. Use `describe` to identify the current task and subtask
3. Directly invoke the agent function with the same task/subtask IDs
4. Examine the detailed output to identify the issue
5. Resume the flow or manually intervene as needed
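A condensed sketch of this procedure (the IDs 123/456/789 are placeholders; pick the agent function that owns the stuck subtask):

```bash
# Steps 2-3: locate the stuck task/subtask, then re-invoke the agent in that context
go run cmd/ftester/main.go -flow 123 describe -verbose
go run cmd/ftester/main.go -flow 123 -task 456 -subtask 789 pentester -message "Continue the assessment"
```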
Resume the flow or manually intervene as needed\n\n### Testing Environment Variables\n\nVerify that API keys and external services are configured correctly:\n\n```bash\n# Test Google search API configuration\ngo run cmd\u002Fftester\u002Fmain.go google -query \"pentesting tools\"\n\n# Test browser access to external websites\ngo run cmd\u002Fftester\u002Fmain.go browser -url \"https:\u002F\u002Fexample.com\"\n```\n\n### Developing New AI Agent Behaviors\n\nWhen developing new prompt templates or agent behaviors:\n\n1. Create a test flow in the UI\n2. Use ftester to directly invoke the agent with different prompts\n3. Observe responses and adjust prompts accordingly\n4. Check Langfuse for detailed traces of all function calls\n\n### Verifying Docker Container Setup\n\nEnsure containers are properly configured:\n\n```bash\ngo run cmd\u002Fftester\u002Fmain.go -flow 123 terminal -command \"env | grep -i proxy\" -message \"Check proxy settings\"\n```\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>Docker Container Usage\u003C\u002Fb> (click to expand)\u003C\u002Fsummary>\n\nIf you have PentAGI running in Docker, you can use ftester from within the container:\n\n```bash\n# Run ftester inside the running PentAGI container\ndocker exec -it pentagi \u002Fopt\u002Fpentagi\u002Fbin\u002Fftester [arguments]\n\n# Examples:\ndocker exec -it pentagi \u002Fopt\u002Fpentagi\u002Fbin\u002Fftester -flow 123 describe\ndocker exec -it pentagi \u002Fopt\u002Fpentagi\u002Fbin\u002Fftester -flow 123 terminal -command \"ps aux\" -message \"List processes\"\n```\n\nThis is particularly useful for production deployments where you don't have a local development environment.\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>Integration with Observability Tools\u003C\u002Fb> (click to expand)\u003C\u002Fsummary>\n\nAll function calls made through ftester are logged to:\n\n1. **Langfuse**: Captures the entire AI agent interaction chain, including prompts, responses, and function calls\n2. **OpenTelemetry**: Records metrics, traces, and logs for system performance analysis\n3. **Terminal Output**: Provides immediate feedback on function execution\n\nTo access detailed logs:\n\n- Check Langfuse UI for AI agent traces (typically at `http:\u002F\u002Flocalhost:4000`)\n- Use Grafana dashboards for system metrics (typically at `http:\u002F\u002Flocalhost:3000`)\n- Examine terminal output for immediate function results and errors\n\n\u003C\u002Fdetails>\n\n### Command-line Options\n\nThe main utility accepts several options:\n\n- `-env \u003Cpath>` - Path to environment file (optional, default: `.env`)\n- `-provider \u003Ctype>` - Provider type to use (default: `custom`, options: `openai`, `anthropic`, `ollama`, `bedrock`, `gemini`, `custom`)\n- `-flow \u003Cid>` - Flow ID for testing (0 means using mocks, default: `0`)\n- `-task \u003Cid>` - Task ID for agent context (optional)\n- `-subtask \u003Cid>` - Subtask ID for agent context (optional)\n\nFunction-specific arguments are passed after the function name using `-name value` format.\n\n## Building\n\n### Building Docker Image\n\nThe Docker build process automatically embeds version information from git tags. 
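In outline, the two build arguments behave like the following sketch (an illustration of the documented tagging behavior, not the actual contents of `scripts\u002Fversion.sh`):\n\n```bash\n# Hypothetical equivalent of scripts\u002Fversion.sh -- assumes it wraps `git describe`.\n# PACKAGE_VER\u002FPACKAGE_REV are the same names passed as --build-arg below.\nTAG=$(git describe --tags --abbrev=0)      # e.g. v1.1.0\nexport PACKAGE_VER=\"${TAG#v}\"              # strip the leading \"v\" -> 1.1.0\nif [ \"$(git rev-list -n 1 \"$TAG\")\" = \"$(git rev-parse HEAD)\" ]; then\n  export PACKAGE_REV=\"\"                              # release build: HEAD is the tag commit, no suffix\nelse\n  export PACKAGE_REV=\"$(git rev-parse --short HEAD)\" # dev build: image versioned like 1.1.0-bc6e800\nfi\n```\n\n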
To properly version your build, use the provided scripts:\n\n#### Linux\u002FmacOS\n\n```bash\n# Load version variables\nsource .\u002Fscripts\u002Fversion.sh\n\n# Standard build\ndocker build \\\n  --build-arg PACKAGE_VER=$PACKAGE_VER \\\n  --build-arg PACKAGE_REV=$PACKAGE_REV \\\n  -t pentagi:$PACKAGE_VER .\n\n# Multi-platform build\ndocker buildx build \\\n  --platform linux\u002Famd64,linux\u002Farm64 \\\n  --build-arg PACKAGE_VER=$PACKAGE_VER \\\n  --build-arg PACKAGE_REV=$PACKAGE_REV \\\n  -t pentagi:$PACKAGE_VER .\n\n# Build and push\ndocker buildx build \\\n  --platform linux\u002Famd64,linux\u002Farm64 \\\n  --build-arg PACKAGE_VER=$PACKAGE_VER \\\n  --build-arg PACKAGE_REV=$PACKAGE_REV \\\n  -t myregistry\u002Fpentagi:$PACKAGE_VER \\\n  --push .\n```\n\n#### Windows (PowerShell)\n\n```powershell\n# Load version variables\n. .\\scripts\\version.ps1\n\n# Standard build\ndocker build `\n  --build-arg PACKAGE_VER=$env:PACKAGE_VER `\n  --build-arg PACKAGE_REV=$env:PACKAGE_REV `\n  -t pentagi:$env:PACKAGE_VER .\n\n# Multi-platform build\ndocker buildx build `\n  --platform linux\u002Famd64,linux\u002Farm64 `\n  --build-arg PACKAGE_VER=$env:PACKAGE_VER `\n  --build-arg PACKAGE_REV=$env:PACKAGE_REV `\n  -t pentagi:$env:PACKAGE_VER .\n```\n\n#### Quick build without version\n\nFor development builds without version tracking:\n\n```bash\ndocker build -t pentagi:dev .\n```\n\n> [!NOTE]\n> - The build scripts automatically determine version from git tags\n> - Release builds (on tag commit) have no revision suffix\n> - Development builds (after tag) include commit hash as revision (e.g., `1.1.0-bc6e800`)\n> - To use the built image locally, update the image name in `docker-compose.yml` or use the `build` option\n\n## Credits\n\nThis project is made possible thanks to the following research and developments:\n- [Emerging Architectures for LLM Applications](https:\u002F\u002Flilianweng.github.io\u002Fposts\u002F2023-06-23-agent)\n- [A Survey of Autonomous LLM Agents](https:\u002F\u002Farxiv.org\u002Fabs\u002F2403.08299)\n- [Codel](https:\u002F\u002Fgithub.com\u002Fsemanser\u002Fcodel) by Andriy Semenets - initial architectural inspiration for agent-based automation\n\n## License\n\n**PentAGI** is licensed under the [MIT License](LICENSE).\n\nCopyright (c) 2025 PentAGI Development Team\n\n### Third-Party Dependencies\n\nAll third-party dependencies use MIT-compatible licenses. 
See [licenses\u002F](licenses\u002F) directory for detailed license reports.\n\n### VXControl Cloud Services\n\n⚠️ **Note:** While the VXControl Cloud SDK code is MIT licensed, accessing **VXControl Cloud Services** (threat intelligence, AI support, premium features) requires a separate License Key and compliance with [Terms of Service](https:\u002F\u002Fgithub.com\u002Fvxcontrol\u002Fcloud#license-and-terms).\n\nThe SDK code itself is free to use - service access requires registration.\n\nFor questions contact: **info@pentagi.com** or **info@vxcontrol.com**\n","# PentAGI\n\n\u003Cdiv align=\"center\" style=\"font-size: 1.5em; margin: 20px 0;\">\n    渗透测试 \u003Cstrong>人\u003C\u002Fstrong>工通用 \u003Cstrong>智\u003C\u002Fstrong>能\n\u003C\u002Fdiv>\n\u003Cbr>\n\u003Cdiv align=\"center\">\n\n> **加入社区！** 与安全研究人员、AI爱好者及同行的道德黑客们建立联系。获取支持、分享见解，并随时了解PentAGI的最新进展。\n\n[![Discord](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FDiscord-7289DA?logo=discord&logoColor=white)](https:\u002F\u002Fdiscord.gg\u002F2xrMh7qX6m)⠀[![Telegram](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FTelegram-2CA5E0?logo=telegram&logoColor=white)](https:\u002F\u002Ft.me\u002F+Ka9i6CNwe71hMWQy)\n\n\u003Ca href=\"https:\u002F\u002Ftrendshift.io\u002Frepositories\u002F15161\" target=\"_blank\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fvxcontrol_pentagi_readme_4a68feb902da.png\" alt=\"vxcontrol%2Fpentagi | Trendshift\" style=\"width: 250px; height: 55px;\" width=\"250\" height=\"55\"\u002F>\u003C\u002Fa>\n\n\u003C\u002Fdiv>\n\n## 目录\n\n- [概述](#-overview)\n- [功能](#-features)\n- [快速入门](#-quick-start)\n- [API访问](#-api-access)\n- [高级设置](#-advanced-setup)\n- [开发](#-development)\n- [LLM代理测试](#-testing-llm-agents)\n- [嵌入配置与测试](#-embedding-configuration-and-testing)\n- [使用ftester进行函数测试](#-function-testing-with-ftester)\n- [构建](#%EF%B8%8F-building)\n- [致谢](#-credits)\n- [许可证](#-license)\n\n## 概述\n\nPentAGI是一款创新的自动化安全测试工具，利用前沿的人工智能技术。该项目专为信息安全专业人士、研究人员和爱好者设计，旨在提供一个强大而灵活的解决方案，用于执行渗透测试。\n\n您可以观看视频 **PentAGI概览**：\n[![PentAGI概览视频](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fvxcontrol_pentagi_readme_51b6624d8609.png)](https:\u002F\u002Fyoutu.be\u002FR70x5Ddzs1o)\n\n## 功能\n\n- 安全隔离。所有操作均在沙箱化的Docker环境中进行，实现完全隔离。\n- 全自动运行。由AI驱动的代理可自动确定并执行渗透测试步骤，支持可选的执行监控和智能任务规划，以提高可靠性。\n- 专业渗透测试工具。内置超过20种专业安全工具，包括nmap、metasploit、sqlmap等。\n- 智能记忆系统。长期存储研究结果和成功方法，供未来参考。\n- 知识图谱集成。基于Graphiti的知识图谱，采用Neo4j技术，用于语义关系追踪和高级上下文理解。\n- 网络情报功能。通过[scraper](https:\u002F\u002Fhub.docker.com\u002Fr\u002Fvxcontrol\u002Fscraper)内置浏览器，从网络资源中收集最新信息。\n- 外部搜索系统。集成先进的搜索API，包括[Tavily](https:\u002F\u002Ftavily.com)、[Traversaal](https:\u002F\u002Ftraversaal.ai)、[Perplexity](https:\u002F\u002Fwww.perplexity.ai)、[DuckDuckGo](https:\u002F\u002Fduckduckgo.com\u002F)、[Google自定义搜索](https:\u002F\u002Fprogrammablesearchengine.google.com\u002F)、[Sploitus Search](https:\u002F\u002Fsploitus.com)以及[Searxng](https:\u002F\u002Fsearxng.org)，以实现全面的信息收集。\n- 专家团队。配备专门的AI代理，负责研究、开发和基础设施任务，并可通过可选的执行监控和智能任务规划进一步优化性能，尤其适用于小型模型。\n- 全面监控。提供详细的日志记录，并与Grafana\u002FPrometheus集成，实现实时系统观测。\n- 详尽报告。生成包含漏洞利用指南的全面漏洞报告。\n- 智能容器管理。根据具体任务需求自动选择Docker镜像。\n- 现代化界面。简洁直观的Web UI，便于系统管理和监控。\n- 全功能API。提供完整的REST和GraphQL API，支持Bearer令牌认证，方便自动化和集成。\n- 持久化存储。所有命令和输出均存储在PostgreSQL数据库中，并使用[pgvector](https:\u002F\u002Fhub.docker.com\u002Fr\u002Fvxcontrol\u002Fpgvector)扩展。\n- 可扩展架构。基于微服务的设计，支持水平扩展。\n- 自托管方案。完全掌控您的部署和数据。\n- 灵活的身份验证。支持10多家LLM提供商（[OpenAI](https:\u002F\u002Fplatform.openai.com\u002F)、[Anthropic](https:\u002F\u002Fwww.anthropic.com\u002F)、[Google 
AI\u002FGemini](https:\u002F\u002Fai.google.dev\u002F)、[AWS Bedrock](https:\u002F\u002Faws.amazon.com\u002Fbedrock\u002F)、[Ollama](https:\u002F\u002Follama.com\u002F)、[DeepSeek](https:\u002F\u002Fwww.deepseek.com\u002Fen\u002F)、[GLM](https:\u002F\u002Fz.ai\u002F)、[Kimi](https:\u002F\u002Fplatform.moonshot.ai\u002F)、[Qwen](https:\u002F\u002Fwww.alibabacloud.com\u002Fen\u002F)、自定义）以及聚合平台（[OpenRouter](https:\u002F\u002Fopenrouter.ai\u002F)、[DeepInfra](https:\u002F\u002Fdeepinfra.com\u002F)）。对于生产环境的本地部署，请参阅我们的[vLLM + Qwen3.5-27B-FP8指南](examples\u002Fguides\u002Fvllm-qwen35-27b-fp8.md)。\n- API令牌认证。安全的Bearer令牌系统，用于程序化访问REST和GraphQL API。\n- 快速部署。通过[Docker Compose](https:\u002F\u002Fdocs.docker.com\u002Fcompose\u002F)轻松完成设置，并提供全面的环境配置。\n\n## 架构\n\n### 系统上下文\n\n```mermaid\nflowchart TB\n    classDef person fill:#08427B,stroke:#073B6F,color:#fff\n    classDef system fill:#1168BD,stroke:#0B4884,color:#fff\n    classDef external fill:#666666,stroke:#0B4884,color:#fff\n\n    pentester[\"👤 安全工程师\n    （系统用户）\"]\n\n    pentagi[\"✨ PentAGI\n    （自主渗透测试系统）\"]\n\n    target[\"🎯 目标系统\n    （被测试系统）\"]\n    llm[\"🧠 LLM提供商\n    （OpenAI\u002FAnthropic\u002FOllama\u002FBedrock\u002FGemini\u002F自定义）\"]\n    search[\"🔍 搜索系统\n    （Google\u002FDuckDuckGo\u002FTavily\u002FTraversaal\u002FPerplexity\u002FSploitus\u002FSearxng）\"]\n    langfuse[\"📊 LangFuse UI\n    （LLM可观测性仪表盘）\"]\n    grafana[\"📈 Grafana\n    （系统监控仪表盘）\"]\n\n    pentester --> |使用HTTPS| pentagi\n    pentester --> |监控AI HTTPS| langfuse\n    pentester --> |监控系统HTTPS| grafana\n    pentagi --> |测试各种协议| target\n    pentagi --> |查询HTTPS| llm\n    pentagi --> |搜索HTTPS| search\n    pentagi --> |报告HTTPS| langfuse\n    pentagi --> |报告HTTPS| grafana\n\n    class pentester person\n    class pentagi system\n    class target,llm,search,langfuse,grafana external\n\n    linkStyle default stroke:#ffffff,color:#ffffff\n```\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>容器架构\u003C\u002Fb>（点击展开）\u003C\u002Fsummary>\n\n```mermaid\ngraph TB\n    subgraph 核心服务\n        UI[前端UI\u003Cbr\u002F>React + TypeScript]\n        API[后端API\u003Cbr\u002F>Go + GraphQL]\n        DB[向量存储\u003Cbr\u002F>PostgreSQL + pgvector]\n        MQ[任务队列\u003Cbr\u002F>异步处理]\n        Agent[AI代理\u003Cbr\u002F>多智能体系统]\n    end\n\n    subgraph 知识图谱\n        Graphiti[Graphiti\u003Cbr\u002F>知识图谱API]\n        Neo4j[Neo4j\u003Cbr\u002F>图数据库]\n    end\n\n    subgraph 监控\n        Grafana[Grafana\u003Cbr\u002F>仪表盘]\n        VictoriaMetrics[VictoriaMetrics\u003Cbr\u002F>时序数据库]\n        Jaeger[Jaeger\u003Cbr\u002F>分布式追踪]\n        Loki[Loki\u003Cbr\u002F>日志聚合]\n        OTEL[OpenTelemetry\u003Cbr\u002F>数据采集]\n    end\n\n    subgraph 分析\n        Langfuse[Langfuse\u003Cbr\u002F>LLM分析]\n        ClickHouse[ClickHouse\u003Cbr\u002F>分析数据库]\n        Redis[Redis\u003Cbr\u002F>缓存 + 限流器]\n        MinIO[MinIO\u003Cbr\u002F>S3存储]\n    end\n\n    subgraph 安全工具\n        Scraper[网页爬虫\u003Cbr\u002F>隔离浏览器]\n        PenTest[安全工具\u003Cbr\u002F>20+专业工具\u003Cbr\u002F>沙箱执行]\n    end\n\n    UI --> |HTTP\u002FWS| API\n    API --> |SQL| DB\n    API --> |事件| MQ\n    MQ --> |任务| Agent\n    Agent --> |命令| PenTest\n    Agent --> |查询| DB\n    Agent --> |知识| Graphiti\n    Graphiti --> |图| Neo4j\n\n    API --> |遥测| OTEL\n    OTEL --> |指标| VictoriaMetrics\n    OTEL --> |追踪| Jaeger\n    OTEL --> |日志| Loki\n\n    Grafana --> |查询| VictoriaMetrics\n    Grafana --> |查询| Jaeger\n    Grafana --> |查询| Loki\n\n    API --> |分析| Langfuse\n    Langfuse --> |存储| ClickHouse\n    Langfuse --> |缓存| Redis\n    Langfuse --> |文件| MinIO\n\n    classDef core 
fill:#f9f,stroke:#333,stroke-width:2px,color:#000\n    classDef knowledge fill:#ffa,stroke:#333,stroke-width:2px,color:#000\n    classDef monitoring fill:#bbf,stroke:#333,stroke-width:2px,color:#000\n    classDef analytics fill:#bfb,stroke:#333,stroke-width:2px,color:#000\n    classDef tools fill:#fbb,stroke:#333,stroke-width:2px,color:#000\n\n    class UI,API,DB,MQ,Agent core\n    class Graphiti,Neo4j knowledge\n    class Grafana,VictoriaMetrics,Jaeger,Loki,OTEL monitoring\n    class Langfuse,ClickHouse,Redis,MinIO analytics\n    class Scraper,PenTest tools\n```\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>实体关系\u003C\u002Fb> (点击展开)\u003C\u002Fsummary>\n\n```mermaid\nerDiagram\n    Flow ||--o{ Task : 包含\n    Task ||--o{ SubTask : 包含\n    SubTask ||--o{ Action : 包含\n    Action ||--o{ Artifact : 产生\n    Action ||--o{ Memory : 存储\n\n    Flow {\n        string id PK\n        string name \"Flow名称\"\n        string description \"Flow描述\"\n        string status \"active\u002Fcompleted\u002Ffailed\"\n        json parameters \"Flow参数\"\n        timestamp created_at\n        timestamp updated_at\n    }\n\n    Task {\n        string id PK\n        string flow_id FK\n        string name \"Task名称\"\n        string description \"Task描述\"\n        string status \"pending\u002Frunning\u002Fdone\u002Ffailed\"\n        json result \"Task结果\"\n        timestamp created_at\n        timestamp updated_at\n    }\n\n    SubTask {\n        string id PK\n        string task_id FK\n        string name \"子任务名称\"\n        string description \"子任务描述\"\n        string status \"queued\u002Frunning\u002Fcompleted\u002Ffailed\"\n        string agent_type \"researcher\u002Fdeveloper\u002Fexecutor\"\n        json context \"Agent上下文\"\n        timestamp created_at\n        timestamp updated_at\n    }\n\n    Action {\n        string id PK\n        string subtask_id FK\n        string type \"command\u002Fsearch\u002Fanalyze\u002Fetc\"\n        string status \"success\u002Ffailure\"\n        json parameters \"Action参数\"\n        json result \"Action结果\"\n        timestamp created_at\n    }\n\n    Artifact {\n        string id PK\n        string action_id FK\n        string type \"file\u002Freport\u002Flog\"\n        string path \"存储路径\"\n        json metadata \"附加信息\"\n        timestamp created_at\n    }\n\n    Memory {\n        string id PK\n        string action_id FK\n        string type \"observation\u002Fconclusion\"\n        vector embedding \"向量表示\"\n        text content \"记忆内容\"\n        timestamp created_at\n    }\n```\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>代理交互\u003C\u002Fb> (点击展开)\u003C\u002Fsummary>\n\n```mermaid\nsequenceDiagram\n    participant O as Orchestrator\n    participant R as Researcher\n    participant D as Developer\n    participant E as Executor\n    participant VS as Vector Store\n    participant KB as Knowledge Base\n\n    Note over O,KB: Flow Initialization\n    O->>VS: Query similar tasks\n    VS-->>O: Return experiences\n    O->>KB: Load relevant knowledge\n    KB-->>O: Return context\n\n    Note over O,R: Research Phase\n    O->>R: Analyze target\n    R->>VS: Search similar cases\n    VS-->>R: Return patterns\n    R->>KB: Query vulnerabilities\n    KB-->>R: Return known issues\n    R->>VS: Store findings\n    R-->>O: Research results\n\n    Note over O,D: Planning Phase\n    O->>D: Plan attack\n    D->>VS: Query exploits\n    VS-->>D: Return techniques\n    D->>KB: Load tools info\n    KB-->>D: Return capabilities\n    D-->>O: Attack plan\n\n    Note 
over O,E: Execution Phase\n    O->>E: Execute plan\n    E->>KB: Load tool guides\n    KB-->>E: Return procedures\n    E->>VS: Store results\n    E-->>O: Execution状态\n```\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>记忆系统\u003C\u002Fb> (点击展开)\u003C\u002Fsummary>\n\n```mermaid\ngraph TB\n    subgraph \"长期记忆\"\n        VS[(向量存储\u003Cbr\u002F>嵌入数据库)]\n        KB[知识库\u003Cbr\u002F>领域专业知识]\n        Tools[工具知识\u003Cbr\u002F>使用模式]\n    end\n\n    subgraph \"工作记忆\"\n        Context[当前上下文\u003Cbr\u002F>任务状态]\n        Goals[活跃目标\u003Cbr\u002F>目的]\n        State[系统状态\u003Cbr\u002F>资源]\n    end\n\n    subgraph \"情景记忆\"\n        Actions[过去行动\u003Cbr\u002F>命令历史]\n        Results[行动结果\u003Cbr\u002F>成果]\n        Patterns[成功模式\u003Cbr\u002F>最佳实践]\n    end\n\n    Context --> |查询| VS\n    VS --> |检索| Context\n\n    Goals --> |咨询| KB\n    KB --> |指导| Goals\n\n    State --> |记录| Actions\n    Actions --> |学习| Patterns\n    Patterns --> |存储| VS\n\n    Tools --> |告知| State\n    Results --> |更新| Tools\n\n    VS --> |增强| KB\n    KB --> |索引| VS\n\n    classDef ltm fill:#f9f,stroke:#333,stroke-width:2px,color:#000\n    classDef wm fill:#bbf,stroke:#333,stroke-width:2px,color:#000\n    classDef em fill:#bfb,stroke:#333,stroke-width:2px,color:#000\n\n    class VS,KB,Tools ltm\n    class Context,Goals,State wm\n    class Actions,Results,Patterns em\n```\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>链式摘要\u003C\u002Fb> (点击展开)\u003C\u002Fsummary>\n\n链式摘要系统通过有选择地总结较早的消息来管理对话上下文的增长。这对于在保持对话连贯性的同时防止超出令牌限制至关重要。\n\n```mermaid\nflowchart TD\n    A[输入链] --> B{需要摘要吗？}\n    B -->|否| C[返回原始链]\n    B -->|是| D[转换为ChainAST]\n    D --> E[应用段落摘要]\n    E --> F[处理过大的配对]\n    F --> G[管理最后一节大小]\n    G --> H[应用问答摘要]\n    H --> I[用摘要重建链]\n    I --> J{新链是否更小？}\n    J -->|是| K[返回优化后的链]\n    J -->|否| C\n\nclassDef process fill:#bbf,stroke:#333,stroke-width:2px,color:#000\n    classDef decision fill:#bfb,stroke:#333,stroke-width:2px,color:#000\n    classDef output fill:#fbb,stroke:#333,stroke-width:2px,color:#000\n\n    class A,D,E,F,G,H,I process\n    class B,J decision\n    class C,K output\n```\n\n该算法基于对话链的结构化表示（ChainAST）运行，能够保留包括工具调用及其响应在内的消息类型。所有摘要操作在减少上下文大小的同时，仍能保持关键的对话流程。\n\n\n\n### 全局摘要器配置选项\n\n| 参数             | 环境变量             | 默认值 | 描述                                                |\n| ----------------- | -------------------- | ------- | -------------------------------------------------- |\n| 保留最后一段     | `SUMMARIZER_PRESERVE_LAST`       | `true`  | 是否完整保留最后一段的所有消息                    |\n| 使用问答对策略   | `SUMMARIZER_USE_QA`              | `true`  | 是否使用问答对摘要策略                            |\n| 摘要问答中的人类消息 | `SUMMARIZER_SUM_MSG_HUMAN_IN_QA` | `false` | 是否摘要问答中的人类消息                          |\n| 最后一段最大字节数 | `SUMMARIZER_LAST_SEC_BYTES`      | `51200` | 最后一段的最大字节数（50KB）                      |\n| 单个主体对的最大字节数 | `SUMMARIZER_MAX_BP_BYTES`        | `16384` | 单个主体对的最大字节数（16KB）                    |\n| 最大保留的问答段数 | `SUMMARIZER_MAX_QA_SECTIONS`     | `10`    | 最大保留的问答段数量                              |\n| 问答段的最大字节数 | `SUMMARIZER_MAX_QA_BYTES`        | `65536` | 问答段的最大字节数（64KB）                        |\n| 保留的问答段数量 | `SUMMARIZER_KEEP_QA_SECTIONS`    | `1`     | 不进行摘要而直接保留的最近问答段数量            |\n\n### 助手摘要器配置选项\n\n助手实例可以使用自定义的摘要设置来微调上下文管理行为：\n\n| 参数          | 环境变量                    | 默认值 | 描述                                                          |\n| -------------- | --------------------------- | ------- | -------------------------------------------------------------- |\n| 保留最后一段 | `ASSISTANT_SUMMARIZER_PRESERVE_LAST` 
   | `true`  | 是否保留助手最后部分的所有消息                             |\n| 最后一段最大字节数 | `ASSISTANT_SUMMARIZER_LAST_SEC_BYTES`   | `76800` | 助手最后部分的最大字节数（75KB）                           |\n| 单个主体对的最大字节数 | `ASSISTANT_SUMMARIZER_MAX_BP_BYTES`     | `16384` | 助手上下文中单个主体对的最大字节数（16KB）                 |\n| 最大保留的问答段数 | `ASSISTANT_SUMMARIZER_MAX_QA_SECTIONS`  | `7`     | 助手上下文中最多保留的问答段数量                             |\n| 问答段的最大字节数 | `ASSISTANT_SUMMARIZER_MAX_QA_BYTES`     | `76800` | 助手问答段的最大字节数（75KB）                               |\n| 保留的问答段数量 | `ASSISTANT_SUMMARIZER_KEEP_QA_SECTIONS` | `3`     | 不进行摘要而直接保留的最近问答段数量                         |\n\n与全局设置相比，助手摘要器的配置提供了更多的内存用于上下文保留，从而能够在确保高效使用令牌的同时，保留更多近期的对话历史。\n\n### 摘要器环境配置\n\n```bash\n# 全局摘要器逻辑的默认值\nSUMMARIZER_PRESERVE_LAST=true\nSUMMARIZER_USE_QA=true\nSUMMARIZER_SUM_MSG_HUMAN_IN_QA=false\nSUMMARIZER_LAST_SEC_BYTES=51200\nSUMMARIZER_MAX_BP_BYTES=16384\nSUMMARIZER_MAX_QA_SECTIONS=10\nSUMMARIZER_MAX_QA_BYTES=65536\nSUMMARIZER_KEEP_QA_SECTIONS=1\n\n# 助手摘要器逻辑的默认值\nASSISTANT_SUMMARIZER_PRESERVE_LAST=true\nASSISTANT_SUMMARIZER_LAST_SEC_BYTES=76800\nASSISTANT_SUMMARIZER_MAX_BP_BYTES=16384\nASSISTANT_SUMMARIZER_MAX_QA_SECTIONS=7\nASSISTANT_SUMMARIZER_MAX_QA_BYTES=76800\nASSISTANT_SUMMARIZER_KEEP_QA_SECTIONS=3\n```\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>高级代理监督\u003C\u002Fb>（点击展开）\u003C\u002Fsummary>\n\nPentAGI 包含了复杂的多层代理监督机制，以确保任务高效执行、防止无限循环，并从卡住状态中智能恢复：\n\n### 执行监控（测试版）\n- **自动导师干预**：当执行模式表明可能存在潜在问题时，顾问代理（导师）会自动介入。\n- **模式检测**：监控重复的工具调用次数（阈值：5次，可配置）以及总工具调用次数（阈值：10次，可配置）。\n- **进度分析**：评估代理是否朝着子任务目标推进，检测循环和低效情况。\n- **替代策略**：当前策略失败时，推荐不同的方法。\n- **信息检索指导**：建议搜索已有的解决方案，而不是重新发明。\n- **增强的响应格式**：工具响应包含 `\u003Coriginal_result>` 和 `\u003Cmentor_analysis>` 两部分。\n- **可配置**：通过 `EXECUTION_MONITOR_ENABLED` 开启（默认：关闭），并使用 `EXECUTION_MONITOR_SAME_TOOL_LIMIT` 和 `EXECUTION_MONITOR_TOTAL_TOOL_LIMIT` 自定义阈值。\n\n**最适合**：参数量小于 32B 的小型模型、需要持续指导的复杂攻击场景，以及防止代理陷入单一方法的情况。\n\n**性能影响**：执行时间和令牌使用量增加 2-3 倍，但根据 Qwen3.5-27B-FP8 的测试结果，**结果质量提升了 2 倍**。\n\n### 智能任务规划（测试版）\n- **自动化分解**：规划者（处于规划模式的顾问）会在专业代理开始工作之前生成 3-7 个具体且可操作的步骤。\n- **上下文感知计划**：通过丰富者代理分析完整的执行上下文，制定知情的计划。\n- **结构化分配**：原始请求被封装在包含执行计划和指令的 `\u003Ctask_assignment>` 结构中。\n- **范围管理**：防止范围蔓延，使代理仅专注于当前子任务。\n- **强化的指示**：计划突出关键行动、潜在陷阱和验证点。\n- **可配置**：通过 `AGENT_PLANNING_STEP_ENABLED` 开启（默认：关闭）。\n\n**最适合**：参数量小于 32B 的模型、复杂的渗透测试工作流，以及提高复杂任务成功率的场景。\n\n**增强的顾问配置**：当顾问代理使用更强的模型或增强设置时，效果尤为显著。例如，使用相同的基础模型并启用最大推理模式作为顾问（参见 [`vllm-qwen3.5-27b-fp8.provider.yml`](examples\u002Fconfigs\u002Fvllm-qwen3.5-27b-fp8.provider.yml)），即使基于相同的模型架构，也能实现全面的任务分析和战略规划。\n\n**性能影响**：增加了规划开销，但显著提高了完成率并减少了重复工作。\n\n### 工具调用限制（始终启用）\n- **硬性限制**：无论监督模式状态如何，均能防止失控执行\n- **按代理类型区分**：\n  - 通用代理（助理、主代理、渗透测试员、编码员、安装员）：`MAX_GENERAL_AGENT_TOOL_CALLS`（默认：100）\n  - 限定代理（搜索者、丰富者、记忆者、生成器、报告者、顾问、反射者、规划者）：`MAX_LIMITED_AGENT_TOOL_CALLS`（默认：20）\n- **优雅终止**：当接近限制时，反射者会引导代理正确完成任务\n- **资源保护**：确保系统稳定并防止资源耗尽\n\n### 反射者集成（始终启用）\n- **自动纠正**：在大模型连续3次未能生成工具调用时触发\n- **策略性指导**：分析失败原因，并引导代理正确使用工具或采用屏障工具（`done`、`ask`）\n- **恢复机制**：根据具体的失败模式提供上下文指导\n- **限制执行**：在达到工具调用上限时协调进行优雅终止\n\n### 针对开源模型的建议\n\n**参数量小于32B的模型必备**：\n通过使用Qwen3.5-27B-FP8进行测试表明，对于较小的开源模型，同时启用执行监控和任务规划是**必不可少**的：\n- **质量提升**：与无监督的基础执行相比，结果质量提升2倍\n- **循环预防**：显著减少无限循环和重复工作\n- **攻击多样性**：鼓励探索多种攻击向量，而非局限于单一方法\n- **空气隔离部署**：可在本地LLM推理的封闭网络环境中实现生产级自主渗透测试\n\n**权衡**：\n- Token消耗：因导师\u002F规划者的调用而增加2–3倍\n- 执行时间：由于分析和规划步骤，延长2–3倍\n- 结果质量：完整性、准确性和攻击覆盖面提升2倍\n- 模型要求：顾问代理使用增强配置效果最佳（更高的推理参数、更强的模型版本或不同模型）\n\n**配置策略**：\n为使小模型获得最佳性能，应为顾问代理配置增强设置：\n- 
使用相同模型并开启最大推理模式（示例：[`vllm-qwen3.5-27b-fp8.provider.yml`](examples\u002Fconfigs\u002Fvllm-qwen3.5-27b-fp8.provider.yml)）\n- 或为顾问代理选用更强的模型，而其他代理仍使用基础模型\n- 根据任务复杂度和模型能力调整监控阈值\n\n\n\n\u003C\u002Fdetails>\n\nPentAGI的架构设计具有模块化、可扩展性和安全性。以下是其关键组件：\n\n1. **核心服务**\n   - 前端UI：基于React的Web界面，使用TypeScript以确保类型安全\n   - 后端API：基于Go的REST和GraphQL API，采用Bearer令牌认证以支持程序化访问\n   - 向量存储：PostgreSQL结合pgvector，用于语义搜索和记忆存储\n   - 任务队列：异步任务处理系统，确保可靠运行\n   - AI代理：多代理系统，各代理分工明确，以高效执行测试任务\n\n2. **知识图谱**\n   - Graphiti：知识图谱API，用于跟踪语义关系和上下文理解\n   - Neo4j：图数据库，用于存储和查询实体、动作及结果之间的关系\n   - 自动捕获代理响应和工具执行记录，构建全面的知识库\n\n3. **监控堆栈**\n   - OpenTelemetry：统一的可观测性数据收集与关联\n   - Grafana：实时可视化与告警仪表盘\n   - VictoriaMetrics：高性能的时间序列指标存储\n   - Jaeger：端到端分布式追踪，便于调试\n   - Loki：可扩展的日志聚合与分析\n\n4. **分析平台**\n   - Langfuse：先进的LLM可观测性与性能分析工具\n   - ClickHouse：面向列的分析数据仓库\n   - Redis：高速缓存与限流功能\n   - MinIO：兼容S3的对象存储，用于存储各类工件\n\n5. **安全工具**\n   - 网页爬虫：隔离的浏览器环境，确保安全的网页交互\n   - 渗透测试工具：包含20余种专业安全工具的完整套件\n   - 沙箱执行：所有操作均在隔离容器中运行\n\n6. **记忆系统**\n   - 长期记忆：持久存储知识与经验\n   - 工作记忆：当前操作的活跃上下文与目标\n   - 事件记忆：历史行动与成功模式\n   - 知识库：结构化的领域专业知识与工具能力\n   - 上下文管理：通过链式摘要技术智能管理不断增长的LLM上下文窗口\n\n系统采用Docker容器实现隔离与便捷部署，核心服务、监控和分析模块分别使用独立网络，以确保严格的安全边界。每个组件均可水平扩展，并能在生产环境中配置为高可用架构。\n\n## 快速入门\n\n### 系统要求\n\n- Docker与Docker Compose（或Podman——参见[使用Podman运行PentAGI](#running-pentagi-with-podman)）\n- 至少2个vCPU\n- 至少4GB内存\n- 20GB可用磁盘空间\n- 具备互联网连接，以便下载镜像和更新\n\n### 使用安装程序（推荐）\n\nPentAGI提供了一个交互式安装程序，采用终端界面，可简化配置与部署流程。安装程序将引导您完成系统检查、LLM提供商设置、搜索引擎配置以及安全加固等步骤。\n\n**支持平台：**\n- **Linux**：amd64 [下载](https:\u002F\u002Fpentagi.com\u002Fdownloads\u002Flinux\u002Famd64\u002Finstaller-latest.zip) | arm64 [下载](https:\u002F\u002Fpentagi.com\u002Fdownloads\u002Flinux\u002Farm64\u002Finstaller-latest.zip)\n- **Windows**：amd64 [下载](https:\u002F\u002Fpentagi.com\u002Fdownloads\u002Fwindows\u002Famd64\u002Finstaller-latest.zip)\n- **macOS**：amd64（Intel）[下载](https:\u002F\u002Fpentagi.com\u002Fdownloads\u002Fdarwin\u002Famd64\u002Finstaller-latest.zip) | arm64（M系列）[下载](https:\u002F\u002Fpentagi.com\u002Fdownloads\u002Fdarwin\u002Farm64\u002Finstaller-latest.zip)\n\n**快速安装（Linux amd64）：**\n\n```bash\n# 创建安装目录\nmkdir -p pentagi && cd pentagi\n\n# 下载安装程序\nwget -O installer.zip https:\u002F\u002Fpentagi.com\u002Fdownloads\u002Flinux\u002Famd64\u002Finstaller-latest.zip\n\n# 解压\nunzip installer.zip\n\n# 运行交互式安装程序\n.\u002Finstaller\n```\n\n**前提条件与权限：**\n\n安装程序需要适当的权限才能与 Docker API 交互以正常运行。默认情况下，它会使用 Docker 套接字 (`\u002Fvar\u002Frun\u002Fdocker.sock`)，这要求：\n\n- **选项 1（推荐用于生产环境）：** 以 root 用户身份运行安装程序：\n  ```bash\n  sudo .\u002Finstaller\n  ```\n\n- **选项 2（开发环境）：** 将您的用户添加到 `docker` 组，以获得对 Docker 套接字的访问权限：\n  ```bash\n  # 将当前用户添加到 docker 组\n  sudo usermod -aG docker $USER\n  \n  # 注销并重新登录，或立即生效组更改\n  newgrp docker\n  \n  # 验证 Docker 访问权限（无需 sudo 即可运行）\n  docker ps\n  ```\n\n  ⚠️ **安全提示：** 将用户加入 `docker` 组会赋予其与 root 用户相当的权限。请仅在受控环境中为可信用户执行此操作。对于生产部署，建议使用无 root 权限的 Docker 模式，或以 sudo 运行安装程序。\n\n安装程序将执行以下步骤：\n1. **系统检查**：验证 Docker、网络连接及系统要求\n2. **环境设置**：创建并配置 `.env` 文件，使用最优默认值\n3. **提供商配置**：设置 LLM 提供商（OpenAI、Anthropic、Gemini、Bedrock、Ollama、自定义）\n4. **搜索引擎**：配置 DuckDuckGo、Google、Tavily、Traversaal、Perplexity、Sploitus、Searxng\n5. **安全加固**：生成安全凭据并配置 SSL 证书\n6. **部署**：使用 docker-compose 启动 PentAGI\n\n**适用于生产及增强安全性：**\n\n对于生产部署或安全敏感的环境，我们**强烈建议**采用分布式双节点架构，将工作负载隔离在独立服务器上。这样可以防止不可信代码的执行，并避免主系统上的网络访问问题。\n\n**查看详细指南**：[工作节点设置](examples\u002Fguides\u002Fworker_node.md)\n\n双节点架构提供：\n- **隔离执行**：工作容器运行在专用硬件上\n- **网络隔离**：渗透测试拥有独立的网络边界\n- **安全边界**：基于 TLS 认证的 Docker-in-Docker 架构\n- **带外攻击支持**：专用于带外技术的端口范围\n\n### 手动安装\n\n1. 
创建一个工作目录或克隆仓库：\n\n```bash\nmkdir pentagi && cd pentagi\n```\n\n2. 复制 `.env.example` 到 `.env` 或直接下载：\n\n```bash\ncurl -o .env https:\u002F\u002Fraw.githubusercontent.com\u002Fvxcontrol\u002Fpentagi\u002Fmaster\u002F.env.example\n```\n\n3. 创建示例文件（`example.custom.provider.yml`、`example.ollama.provider.yml`）或直接下载：\n\n```bash\ncurl -o example.custom.provider.yml https:\u002F\u002Fraw.githubusercontent.com\u002Fvxcontrol\u002Fpentagi\u002Fmaster\u002Fexamples\u002Fconfigs\u002Fcustom-openai.provider.yml\ncurl -o example.ollama.provider.yml https:\u002F\u002Fraw.githubusercontent.com\u002Fvxcontrol\u002Fpentagi\u002Fmaster\u002Fexamples\u002Fconfigs\u002Follama-llama318b.provider.yml\n```\n\n4. 在 `.env` 文件中填写所需的 API 密钥。\n\n```bash\n# 必需：至少选择一个 LLM 提供商\nOPEN_AI_KEY=your_openai_key\nANTHROPIC_API_KEY=your_anthropic_key\nGEMINI_API_KEY=your_gemini_key\n\n# 可选：AWS Bedrock 提供商（企业级模型）\nBEDROCK_REGION=us-east-1\n# 选择一种认证方式：\nBEDROCK_DEFAULT_AUTH=true                        # 选项 1：使用 AWS SDK 默认凭证链（推荐用于 EC2\u002FECS）\n# BEDROCK_BEARER_TOKEN=your_bearer_token         # 选项 2：Bearer 令牌认证\n# BEDROCK_ACCESS_KEY_ID=your_aws_access_key      # 选项 3：静态凭证\n# BEDROCK_SECRET_ACCESS_KEY=your_aws_secret_key\n\n# 可选：Ollama 提供商（本地或云端）\n# OLLAMA_SERVER_URL=http:\u002F\u002Follama-server:11434   # 本地服务器\n# OLLAMA_SERVER_URL=https:\u002F\u002Follama.com           # 云服务\n# OLLAMA_SERVER_API_KEY=your_ollama_cloud_key    # 云端服务需提供，本地则为空\n\n# 可选：中国 AI 提供商\n# DEEPSEEK_API_KEY=your_deepseek_key             # DeepSeek（强推理能力）\n# GLM_API_KEY=your_glm_key                       # GLM（智谱 AI）\n# KIMI_API_KEY=your_kimi_key                     # Kimi（月之暗面，超长上下文）\n# QWEN_API_KEY=your_qwen_key                     # Qwen（阿里云，多模态）\n\n# 可选：本地 LLM 提供商（零成本推理）\nOLLAMA_SERVER_URL=http:\u002F\u002Flocalhost:11434\nOLLAMA_SERVER_MODEL=your_model_name\n\n# 可选：额外的搜索功能\nDUCKDUCKGO_ENABLED=true\nDUCKDUCKGO_REGION=us-en\nDUCKDUCKGO_SAFESEARCH=\nDUCKDUCKGO_TIME_RANGE=\nSPLOITUS_ENABLED=true\nGOOGLE_API_KEY=your_google_key\nGOOGLE_CX_KEY=your_google_cx\nTAVILY_API_KEY=your_tavily_key\nTRAVERSAAL_API_KEY=your_traversaal_key\nPERPLEXITY_API_KEY=your_perplexity_key\nPERPLEXITY_MODEL=sonar-pro\nPERPLEXITY_CONTEXT_SIZE=medium\n\n# Searxng 元搜索引擎（整合多个来源结果）\nSEARXNG_URL=http:\u002F\u002Fyour-searxng-instance:8080\nSEARXNG_CATEGORIES=general\nSEARXNG_LANGUAGE=\nSEARXNG_SAFESEARCH=0\nSEARXNG_TIME_RANGE=\nSEARXNG_TIMEOUT=\n\n## Graphiti 知识图谱设置\nGRAPHITI_ENABLED=true\nGRAPHITI_TIMEOUT=30\nGRAPHITI_URL=http:\u002F\u002Fgraphiti:8000\nGRAPHITI_MODEL_NAME=gpt-5-mini\n\n# Neo4j 设置（Graphiti 栈使用）\nNEO4J_USER=neo4j\nNEO4J_DATABASE=neo4j\nNEO4J_PASSWORD=devpassword\nNEO4J_URI=bolt:\u002F\u002Fneo4j:7687\n\n# 助手配置\nASSISTANT_USE_AGENTS=false         # 创建新助手时的默认代理使用值\n```\n\n5. 修改 `.env` 文件中的所有安全相关环境变量以提升安全性。\n\n\u003Cdetails>\n    \u003Csummary>安全相关环境变量\u003C\u002Fsummary>\n\n### 主要安全设置\n- `COOKIE_SIGNING_SALT` - 用于 Cookie 签名的盐值，应更改为随机值\n- `PUBLIC_URL` - 您服务器的公开 URL（例如 `https:\u002F\u002Fpentagi.example.com`）\n- `SERVER_SSL_CRT` 和 `SERVER_SSL_KEY` - 您现有 SSL 证书和密钥的自定义路径，用于 HTTPS（这些路径应在 `docker-compose.yml` 文件中作为卷挂载）\n\n### 抓取器访问\n- `SCRAPER_PUBLIC_URL` - 如果您希望为公共 URL 使用不同的抓取服务器，则指定其公开 URL\n- `SCRAPER_PRIVATE_URL` - 抓取器的私有 URL（在 `docker-compose.yml` 文件中配置本地抓取服务器，以便访问本地 URL）\n\n### 访问凭据\n- `PENTAGI_POSTGRES_USER` 和 `PENTAGI_POSTGRES_PASSWORD` - PostgreSQL 凭据\n- `NEO4J_USER` 和 `NEO4J_PASSWORD` - Neo4j 凭据（用于 Graphiti 知识图谱）\n\n\u003C\u002Fdetails>\n\n6. 如果您希望在 VSCode 或其他 IDE 中将 `.env` 文件用作环境变量文件，请移除其中的所有内联注释：\n\n```bash\nperl -i -pe 's\u002F\\s+#.*$\u002F\u002F' .env\n```\n\n7. 
运行 PentAGI 堆栈：\n\n```bash\ncurl -O https:\u002F\u002Fraw.githubusercontent.com\u002Fvxcontrol\u002Fpentagi\u002Fmaster\u002Fdocker-compose.yml\ndocker compose up -d\n```\n\n访问 [localhost:8443](https:\u002F\u002Flocalhost:8443) 即可进入 PentAGI Web UI（默认用户名为 `admin@pentagi.com`，密码为 `admin`）。\n\n> [!NOTE]\n> 如果您遇到关于 `pentagi-network`、`observability-network` 或 `langfuse-network` 的错误，则需要先运行 `docker-compose.yml` 来创建这些网络，然后再分别运行 `docker-compose-langfuse.yml`、`docker-compose-graphiti.yml` 和 `docker-compose-observability.yml`，以使用 Langfuse、Graphiti 和 Observability 服务。\n>\n> 要使用 PentAGI，您必须至少设置一个语言模型提供商（OpenAI、Anthropic、Gemini、AWS Bedrock 或 Ollama）。AWS Bedrock 提供来自领先 AI 公司的多种基础模型的企业级访问权限，而 Ollama 则在您拥有足够计算资源的情况下提供零成本的本地推理能力。此外，为搜索引擎配置额外的 API 密钥是可选的，但建议这样做以获得更好的结果。\n>\n> **对于使用高级模型的完全本地部署**：请参阅我们的综合指南 [使用 vLLM 和 Qwen3.5-27B-FP8 运行 PentAGI](examples\u002Fguides\u002Fvllm-qwen35-27b-fp8.md)，了解生产级别的本地 LLM 配置。此配置在 4 张 RTX 5090 GPU 上实现了约 13,000 TPS 的提示处理速度和约 650 TPS 的完成速度，支持 12 个以上的并发流程，并且完全独立于云服务提供商。\n>\n> `LLM_SERVER_*` 环境变量是一项实验性功能，未来可能会发生变化。目前，您可以使用它们来指定自定义的 LLM 服务器 URL 和适用于所有代理类型的单一模型。\n>\n> `PROXY_URL` 是所有 LLM 提供商和外部搜索系统的全局代理 URL。您可以使用它来实现与外部网络的隔离。\n>\n> `docker-compose.yml` 文件以 root 用户身份运行 PentAGI 服务，因为该服务需要访问 docker.sock 来管理容器。如果您使用 TCP\u002FIP 网络连接到 Docker 而不是套接字文件，则可以移除 root 权限，改用默认的 `pentagi` 用户，从而提高安全性。\n\n### 从外部网络访问 PentAGI\n\n默认情况下，PentAGI 绑定到 `127.0.0.1`（仅限本地访问），以确保安全性。要从您网络中的其他设备访问 PentAGI，您需要配置外部访问权限。\n\n#### 配置步骤\n\n1. 使用您的服务器 IP 地址更新 `.env` 文件：\n\n```bash\n# 网络绑定 - 允许外部连接\nPENTAGI_LISTEN_IP=0.0.0.0\nPENTAGI_LISTEN_PORT=8443\n\n# 公网 URL - 使用您实际的服务器 IP 或主机名\n# 将 192.168.1.100 替换为您的服务器 IP 地址\nPUBLIC_URL=https:\u002F\u002F192.168.1.100:8443\n\n# CORS 源 - 列出所有将访问 PentAGI 的 URL\n# 包括 localhost 以供本地访问，以及您的服务器 IP 以供外部访问\nCORS_ORIGINS=https:\u002F\u002Flocalhost:8443,https:\u002F\u002F192.168.1.100:8443\n```\n\n> [!IMPORTANT]\n> - 请将 `192.168.1.100` 替换为您实际的服务器 IP 地址\n> - 请勿在 `PUBLIC_URL` 或 `CORS_ORIGINS` 中使用 `0.0.0.0`，应使用实际的 IP 地址\n> - 为了灵活性，请在 `CORS_ORIGINS` 中同时包含 localhost 和您的服务器 IP\n\n2. 重新创建容器以应用更改：\n\n```bash\ndocker compose down\ndocker compose up -d --force-recreate\n```\n\n3. 验证端口绑定：\n\n```bash\ndocker ps | grep pentagi\n```\n\n您应该会看到 `0.0.0.0:8443->8443\u002Ftcp` 或 `:::8443->8443\u002Ftcp`。\n\n如果显示的是 `127.0.0.1:8443->8443\u002Ftcp`，则说明环境变量未被正确读取。在这种情况下，请直接编辑 `docker-compose.yml` 文件第 31 行：\n\n```yaml\nports:\n  - \"0.0.0.0:8443:8443\"\n```\n\n然后再次重新创建容器。\n\n4. 配置防火墙以允许端口 8443 的入站连接：\n\n```bash\n# Ubuntu\u002FDebian 使用 UFW\nsudo ufw allow 8443\u002Ftcp\nsudo ufw reload\n\n# CentOS\u002FRHEL 使用 firewalld\nsudo firewall-cmd --permanent --add-port=8443\u002Ftcp\nsudo firewall-cmd --reload\n```\n\n5. 访问 PentAGI：\n\n- **本地访问：** `https:\u002F\u002Flocalhost:8443`\n- **网络访问：** `https:\u002F\u002Fyour-server-ip:8443`\n\n> [!NOTE]\n> 当您通过 IP 地址访问时，浏览器会提示您接受自签名的 SSL 证书警告。\n\n---\n\n### 使用 Podman 运行 PentAGI\n\nPentAGI 完全支持 Podman 作为 Docker 的替代方案。然而，在使用 **Podman 的无根模式** 时，爬虫服务需要特殊配置，因为无根容器无法绑定特权端口（即 1024 以下的端口）。\n\n#### Podman 无根模式配置\n\n默认的爬虫配置使用 443 端口（HTTPS），这是一个特权端口。对于 Podman 无根模式，需要将爬虫配置为使用非特权端口：\n\n**1. 
编辑 `docker-compose.yml`** - 修改 `scraper` 服务（大约在第 199 行）：\n\n```yaml\nscraper:\n  image: vxcontrol\u002Fscraper:latest\n  restart: unless-stopped\n  container_name: scraper\n  hostname: scraper\n  expose:\n    - 3000\u002Ftcp  # 从 443 改为 3000\n  ports:\n    - \"${SCRAPER_LISTEN_IP:-127.0.0.1}:${SCRAPER_LISTEN_PORT:-9443}:3000\"  # 映射到 3000 端口\n  environment:\n    - MAX_CONCURRENT_SESSIONS=${LOCAL_SCRAPER_MAX_CONCURRENT_SESSIONS:-10}\n    - USERNAME=${LOCAL_SCRAPER_USERNAME:-someuser}\n    - PASSWORD=${LOCAL_SCRAPER_PASSWORD:-somepass}\n  logging:\n    options:\n      max-size: 50m\n      max-file: \"7\"\n  volumes:\n    - scraper-ssl:\u002Fusr\u002Fsrc\u002Fapp\u002Fssl\n  networks:\n    - pentagi-network\n  shm_size: 2g\n```\n\n**2. 更新 `.env` 文件** - 将爬虫 URL 更改为使用 HTTP 和 3000 端口：\n\n```bash\n# Podman 无根模式下的爬虫配置\nSCRAPER_PRIVATE_URL=http:\u002F\u002Fsomeuser:somepass@scraper:3000\u002F\nLOCAL_SCRAPER_USERNAME=someuser\nLOCAL_SCRAPER_PASSWORD=somepass\n```\n\n> [!IMPORTANT]\n> Podman 配置的关键变更：\n> - `SCRAPER_PRIVATE_URL` 使用 **HTTP** 而不是 HTTPS\n> - 使用 **3000 端口** 而不是 443\n> - 将内部 `expose` 改为 `3000\u002Ftcp`\n> - 更新端口映射，使其指向 `3000` 而不是 `443`\n\n**3. 重新创建容器：**\n\n```bash\npodman-compose down\npodman-compose up -d --force-recreate\n```\n\n**4. 测试爬虫连通性：**\n\n```bash\n# 在 pentagi 容器内测试\npodman exec -it pentagi wget -O- \"http:\u002F\u002Fsomeuser:somepass@scraper:3000\u002Fhtml?url=http:\u002F\u002Fexample.com\"\n```\n\n如果能看到 HTML 输出，则说明爬虫工作正常。\n\n#### Podman 根模式\n\n如果您以根模式运行 Podman（使用 sudo），则无需修改即可使用默认配置。爬虫将按预期在 443 端口上运行。\n\n#### Docker 兼容性\n\n所有 Podman 配置均与 Docker 完全兼容。非特权端口的方法在两种容器运行时中均可正常使用。\n\n### 助手配置\n\nPentAGI 允许您为助手配置默认行为：\n\n| 变量               | 默认值 | 描述                                                             |\n| ---------------------- | ------- | ----------------------------------------------------------------------- |\n| `ASSISTANT_USE_AGENTS` | `false` | 控制创建新助手时是否使用代理的默认值 |\n\n`ASSISTANT_USE_AGENTS` 设置会影响在 UI 中创建新助手时“使用代理”切换按钮的初始状态：\n- `false`（默认）：新助手默认创建时禁用代理委派\n- `true`：新助手默认创建时启用代理委派\n\n请注意，用户始终可以在创建或编辑助手时通过切换 UI 中的“使用代理”按钮来覆盖此设置。该环境变量仅控制初始默认状态。\n\n## 🔌 API 访问\n\nPentAGI 通过 REST 和 GraphQL API 提供全面的程序化访问功能，使您能够将渗透测试工作流集成到自动化流水线、CI\u002FCD 流程以及自定义应用程序中。\n\n### 生成 API Token\n\nAPI Token 通过 PentAGI Web 界面进行管理：\n\n1. 在 Web UI 中导航至 **设置** → **API Token**\n2. 单击 **创建 Token** 以生成新的 API Token\n3. 配置 Token 属性：\n   - **名称**（可选）：Token 的描述性名称\n   - **过期日期**：Token 的过期时间（最短 1 分钟，最长 3 年）\n4. 单击 **创建** 并立即复制 Token——出于安全考虑，它只会显示一次\n5. 在您的 API 请求中将 Token 作为 Bearer Token 使用\n\n每个 Token 都与您的用户账户关联，并继承您角色的权限。\n\n### 使用 API Token\n\n在您的 HTTP 请求的 `Authorization` 头中包含 API Token：\n\n```bash\n# GraphQL API 示例\ncurl -X POST https:\u002F\u002Fyour-pentagi-instance:8443\u002Fapi\u002Fv1\u002Fgraphql \\\n  -H \"Authorization: Bearer YOUR_API_TOKEN\" \\\n  -H \"Content-Type: application\u002Fjson\" \\\n  -d '{\"query\": \"{ flows { id title status } }\"}'\n\n# REST API 示例\ncurl https:\u002F\u002Fyour-pentagi-instance:8443\u002Fapi\u002Fv1\u002Fflows \\\n  -H \"Authorization: Bearer YOUR_API_TOKEN\"\n```\n\n### API 探索与测试\n\nPentAGI 提供交互式文档，用于探索和测试 API 端点：\n\n#### GraphQL Playground\n\n访问 GraphQL Playground：`https:\u002F\u002Fyour-pentagi-instance:8443\u002Fapi\u002Fv1\u002Fgraphql\u002Fplayground`\n\n1. 点击底部的 **HTTP Headers** 选项卡\n2. 添加您的授权头：\n   ```json\n   {\n     \"Authorization\": \"Bearer YOUR_API_TOKEN\"\n   }\n   ```\n3. 交互式地探索 Schema、运行查询并测试变更操作\n\n#### Swagger UI\n\n访问 REST API 文档：`https:\u002F\u002Fyour-pentagi-instance:8443\u002Fapi\u002Fv1\u002Fswagger\u002Findex.html`\n\n1. 点击 **Authorize** 按钮\n2. 
以 `Bearer YOUR_API_TOKEN` 格式输入您的 Token\n3. 单击 **Authorize** 以应用\n4. 直接从 Swagger UI 测试端点\n\n### 生成 API 客户端\n\n您可以使用 PentAGI 附带的 Schema 文件为首选编程语言生成类型安全的 API 客户端：\n\n#### GraphQL 客户端\n\nGraphQL Schema 可在以下位置获取：\n- **Web UI**：导航至设置下载 `schema.graphqls`\n- **直接文件**：仓库中的 `backend\u002Fpkg\u002Fgraph\u002Fschema.graphqls`\n\n使用以下工具生成客户端：\n- **GraphQL Code Generator**（JavaScript\u002FTypeScript）：[https:\u002F\u002Fthe-guild.dev\u002Fgraphql\u002Fcodegen](https:\u002F\u002Fthe-guild.dev\u002Fgraphql\u002Fcodegen)\n- **genqlient**（Go）：[https:\u002F\u002Fgithub.com\u002FKhan\u002Fgenqlient](https:\u002F\u002Fgithub.com\u002FKhan\u002Fgenqlient)\n- **Apollo iOS**（Swift）：[https:\u002F\u002Fwww.apollographql.com\u002Fdocs\u002Fios](https:\u002F\u002Fwww.apollographql.com\u002Fdocs\u002Fios)\n\n#### REST API 客户端\n\nOpenAPI 规范可在以下位置获取：\n- **Swagger JSON**：`https:\u002F\u002Fyour-pentagi-instance:8443\u002Fapi\u002Fv1\u002Fswagger\u002Fdoc.json`\n- **Swagger YAML**：位于 `backend\u002Fpkg\u002Fserver\u002Fdocs\u002Fswagger.yaml`\n\n使用以下工具生成客户端：\n- **OpenAPI Generator**：[https:\u002F\u002Fopenapi-generator.tech](https:\u002F\u002Fopenapi-generator.tech)\n  ```bash\n  openapi-generator-cli generate \\\n    -i https:\u002F\u002Fyour-pentagi-instance:8443\u002Fapi\u002Fv1\u002Fswagger\u002Fdoc.json \\\n    -g python \\\n    -o .\u002Fpentagi-client\n  ```\n\n- **Swagger Codegen**：[https:\u002F\u002Fgithub.com\u002Fswagger-api\u002Fswagger-codegen](https:\u002F\u002Fgithub.com\u002Fswagger-api\u002Fswagger-codegen)\n  ```bash\n  swagger-codegen generate \\\n    -i https:\u002F\u002Fyour-pentagi-instance:8443\u002Fapi\u002Fv1\u002Fswagger\u002Fdoc.json \\\n    -l typescript-axios \\\n    -o .\u002Fpentagi-client\n  ```\n\n- **swagger-typescript-api**（TypeScript）：[https:\u002F\u002Fgithub.com\u002Facacode\u002Fswagger-typescript-api](https:\u002F\u002Fgithub.com\u002Facacode\u002Fswagger-typescript-api)\n  ```bash\n  npx swagger-typescript-api \\\n    -p https:\u002F\u002Fyour-pentagi-instance:8443\u002Fapi\u002Fv1\u002Fswagger\u002Fdoc.json \\\n    -o .\u002Fsrc\u002Fapi \\\n    -n pentagi-api.ts\n  ```\n\n### API 使用示例\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>创建新流程（GraphQL）\u003C\u002Fb>\u003C\u002Fsummary>\n\n```graphql\nmutation CreateFlow {\n  createFlow(\n    modelProvider: \"openai\"\n    input: \"测试 https:\u002F\u002Fexample.com 的安全性\"\n  ) {\n    id\n    title\n    status\n    createdAt\n  }\n}\n```\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>列出流程（REST API）\u003C\u002Fb>\u003C\u002Fsummary>\n\n```bash\ncurl https:\u002F\u002Fyour-pentagi-instance:8443\u002Fapi\u002Fv1\u002Fflows \\\n  -H \"Authorization: Bearer YOUR_API_TOKEN\" \\\n  | jq '.flows[] | {id, title, status}'\n```\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>Python 客户端示例\u003C\u002Fb>\u003C\u002Fsummary>\n\n```python\nimport requests\n\nclass PentAGIClient:\n    def __init__(self, base_url, api_token):\n        self.base_url = base_url\n        self.headers = {\n            \"Authorization\": f\"Bearer {api_token}\",\n            \"Content-Type\": \"application\u002Fjson\"\n        }\n    \n    def create_flow(self, provider, target):\n        query = \"\"\"\n        mutation CreateFlow($provider: String!, $input: String!) 
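\n        # createFlow：基于指定的模型提供商创建并启动一个新的渗透测试流程\n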
{\n          createFlow(modelProvider: $provider, input: $input) {\n            id\n            title\n            status\n          }\n        }\n        \"\"\"\n        response = requests.post(\n            f\"{self.base_url}\u002Fapi\u002Fv1\u002Fgraphql\",\n            json={\n                \"query\": query,\n                \"variables\": {\n                    \"provider\": provider,\n                    \"input\": target\n                }\n            },\n            headers=self.headers\n        )\n        return response.json()\n    \n    def get_flows(self):\n        response = requests.get(\n            f\"{self.base_url}\u002Fapi\u002Fv1\u002Fflows\",\n            headers=self.headers\n        )\n        return response.json()\n\n# 使用\nclient = PentAGIClient(\n    \"https:\u002F\u002Fyour-pentagi-instance:8443\",\n    \"your_api_token_here\"\n)\n\n# 创建一个新流程\nflow = client.create_flow(\"openai\", \"扫描 https:\u002F\u002Fexample.com 的漏洞\")\nprint(f\"创建的流程：{flow}\")\n\n# 列出所有流程\nflows = client.get_flows()\nprint(f\"总流程数: {len(flows['flows'])}\")\n```\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>TypeScript 客户端示例\u003C\u002Fb>\u003C\u002Fsummary>\n\n```typescript\nimport axios, { AxiosInstance } from 'axios';\n\ninterface Flow {\n  id: string;\n  title: string;\n  status: string;\n  createdAt: string;\n}\n\nclass PentAGIClient {\n  private client: AxiosInstance;\n\n  constructor(baseURL: string, apiToken: string) {\n    this.client = axios.create({\n      baseURL: `${baseURL}\u002Fapi\u002Fv1`,\n      headers: {\n        'Authorization': `Bearer ${apiToken}`,\n        'Content-Type': 'application\u002Fjson',\n      },\n    });\n  }\n\n  async createFlow(provider: string, input: string): Promise\u003CFlow> {\n    const query = `\n      mutation CreateFlow($provider: String!, $input: String!) 
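\n      # 注：服务端据此创建新流程，并返回 id、标题、状态与创建时间字段\n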
{\n        createFlow(modelProvider: $provider, input: $input) {\n          id\n          title\n          status\n          createdAt\n        }\n      }\n    `;\n\n    const response = await this.client.post('\u002Fgraphql', {\n      query,\n      variables: { provider, input },\n    });\n\n    return response.data.data.createFlow;\n  }\n\n  async getFlows(): Promise\u003CFlow[]> {\n    const response = await this.client.get('\u002Fflows');\n    return response.data.flows;\n  }\n\n  async getFlow(flowId: string): Promise\u003CFlow> {\n    const response = await this.client.get(`\u002Fflows\u002F${flowId}`);\n    return response.data;\n  }\n}\n\n\u002F\u002F 使用示例\nconst client = new PentAGIClient(\n  'https:\u002F\u002Fyour-pentagi-instance:8443',\n  'your_api_token_here'\n);\n\n\u002F\u002F 创建新流程\nconst flow = await client.createFlow(\n  'openai',\n  '对 https:\u002F\u002Fexample.com 执行渗透测试'\n);\nconsole.log('创建的流程:', flow);\n\n\u002F\u002F 列出所有流程\nconst flows = await client.getFlows();\nconsole.log(`总流程数: ${flows.length}`);\n```\n\n\u003C\u002Fdetails>\n\n### 安全最佳实践\n\n在使用 API 令牌时：\n\n- **切勿将令牌提交到版本控制系统**——请使用环境变量或密钥管理工具。\n- **定期轮换令牌**——设置适当的过期日期，并定期生成新令牌。\n- **为不同应用使用独立令牌**——这样在需要时更容易撤销访问权限。\n- **监控令牌使用情况**——在“设置”页面查看 API 令牌的活动记录。\n- **撤销未使用的令牌**——禁用或删除不再需要的令牌。\n- **仅使用 HTTPS**——切勿通过未加密的连接发送 API 令牌。\n\n### 令牌管理\n\n- **查看令牌**：在“设置”→“API 令牌”中查看所有有效令牌。\n- **编辑令牌**：更新令牌名称或撤销令牌。\n- **删除令牌**：永久移除令牌（此操作不可撤销）。\n- **令牌 ID**：每个令牌都有一个唯一的 ID，可复制以供参考。\n\n令牌列表显示：\n- 令牌名称（如果已提供）\n- 令牌 ID（唯一标识符）\n- 状态（启用\u002F已撤销\u002F已过期）\n- 创建日期\n- 过期日期\n\n### 自定义 LLM 提供商配置\n\n当使用 `LLM_SERVER_*` 变量配置自定义 LLM 提供商时，可以微调请求中使用的推理格式。\n\n> [!提示]\n> 对于生产级本地部署，建议使用 **vLLM** 结合 **Qwen3.5-27B-FP8** 以获得最佳性能。请参阅我们的[全面部署指南](examples\u002Fguides\u002Fvllm-qwen35-27b-fp8.md)，其中包含硬件要求、配置模板（[思考模式](examples\u002Fconfigs\u002Fvllm-qwen3.5-27b-fp8.provider.yml)和[非思考模式](examples\u002Fconfigs\u002Fvllm-qwen3.5-27b-fp8-no-think.provider.yml)）以及性能基准测试结果，表明在 4 张 RTX 5090 显卡上可实现每秒 13,000 次提示处理。\n\n| 变量                        | 默认值 | 描述                                                                             |\n| ------------------------------- | ------- | --------------------------------------------------------------------------------------- |\n| `LLM_SERVER_URL`                |         | 自定义 LLM API 端点的基础 URL                                                |\n| `LLM_SERVER_KEY`                |         | 自定义 LLM 提供商的 API 密钥                                                     |\n| `LLM_SERVER_MODEL`              |         | 默认使用的模型（可在提供商配置中覆盖）                             |\n| `LLM_SERVER_CONFIG_PATH`        |         | 用于代理特定模型的 YAML 配置文件路径                           |\n| `LLM_SERVER_PROVIDER`           |         | 模型名称前的提供商前缀（例如，LiteLLM 代理中的 `openrouter`、`deepseek`） |\n| `LLM_SERVER_LEGACY_REASONING`   | `false` | 控制 API 请求中的推理格式                                               |\n| `LLM_SERVER_PRESERVE_REASONING` | `false` | 在多轮对话中保留推理内容（某些提供商要求）     |\n\n`LLM_SERVER_PROVIDER` 设置在使用 **LiteLLM 代理** 时特别有用，因为它会在模型名称前添加提供商前缀。例如，通过 LiteLLM 连接到 Moonshot API 时，`kimi-2.5` 等模型会变为 `moonshot\u002Fkimi-2.5`。通过设置 `LLM_SERVER_PROVIDER=moonshot`，您可以使用相同的提供商配置文件来同时支持直接 API 访问和 LiteLLM 代理访问，而无需进行任何修改。\n\n`LLM_SERVER_LEGACY_REASONING` 设置影响推理参数如何发送到 LLM：\n- `false`（默认）：采用现代格式，将推理作为带有 `max_tokens` 参数的结构化对象发送。\n- `true`：采用旧版格式，使用基于字符串的 `reasoning_effort` 参数。\n\n此设置在与不同 LLM 提供商合作时非常重要，因为它们可能期望在 API 请求中使用不同的推理格式。如果您在使用自定义提供商时遇到与推理相关的问题，请尝试更改此设置。\n\n`LLM_SERVER_PRESERVE_REASONING` 设置控制是否在多轮对话中保留推理内容：\n- `false`（默认）：对话历史中不保留推理内容。\n- `true`：保留推理内容并在后续 API 
调用中发送。\n\n此设置对于某些 LLM 提供商（如 Moonshot）是必需的，因为当多轮对话中未包含推理内容时，它们会返回类似“已启用思考功能，但助手工具调用消息中缺少 reasoning_content”的错误。如果您的提供商要求保留推理内容，请启用此设置。\n\n### Ollama 提供商配置\n\nPentAGI 支持使用 Ollama 进行本地 LLM 推理（零成本、隐私性更强）以及 Ollama Cloud（提供免费层级的托管服务）。\n\n#### 配置变量\n\n| 变量                            | 默认值     | 描述                               |\n| ----------------------------------- | ----------- | ----------------------------------------- |\n| `OLLAMA_SERVER_URL`                 |             | 您的 Ollama 服务器或 Ollama Cloud 的 URL |\n| `OLLAMA_SERVER_API_KEY`             |             | 用于 Ollama Cloud 身份验证的 API 密钥   |\n| `OLLAMA_SERVER_MODEL`               |             | 推理的默认模型                       |\n| `OLLAMA_SERVER_CONFIG_PATH`         |             | 自定义代理配置文件路径               |\n| `OLLAMA_SERVER_PULL_MODELS_TIMEOUT` | `600`       | 模型下载的超时时间（秒）             |\n| `OLLAMA_SERVER_PULL_MODELS_ENABLED` | `false`     | 启动时自动下载模型                   |\n| `OLLAMA_SERVER_LOAD_MODELS_ENABLED` | `false`     | 查询服务器以获取可用模型             |\n\n#### Ollama Cloud 配置\n\nOllama Cloud 提供托管推理服务，包含慷慨的免费层级和可扩展的付费方案。\n\n**免费层级设置（单模型）**\n\n```bash\n# 免费层级一次只能使用一个模型\nOLLAMA_SERVER_URL=https:\u002F\u002Follama.com\nOLLAMA_SERVER_API_KEY=your_ollama_cloud_api_key\nOLLAMA_SERVER_MODEL=gpt-oss:120b  # 示例：OpenAI OSS 120B 模型\n```\n\n**付费层级设置（多模型与预建配置）**\n\n对于支持多个并发模型的付费层级，可以使用预建的 Ollama Cloud 配置：\n\n```bash\n# 使用预建的 Ollama Cloud 配置（包含在 Docker 镜像中）\nOLLAMA_SERVER_URL=https:\u002F\u002Follama.com\nOLLAMA_SERVER_API_KEY=your_ollama_cloud_api_key\nOLLAMA_SERVER_CONFIG_PATH=\u002Fopt\u002Fpentagi\u002Fconf\u002Follama-cloud.provider.yml\n```\n\n预建的 `ollama-cloud.provider.yml` 配置包括针对所有代理类型的优化模型分配：\n- **简单\u002F助理**：`nemotron-3-super:cloud` - 快速通用模型\n- **主要代理**：`qwen3-coder-next:cloud` - 高效模式下的高级推理\n- **编码员\u002F渗透测试员**：`qwen3-coder-next:cloud` - 专门的编码模型\n- **搜索者**：`qwen3.5:397b-cloud` - 大上下文信息收集能力\n- **精炼者\u002F重构者**：`glm-5:cloud` - 高质量文本精炼\n- **顾问\u002F增强者**：`minimax-m2.7:cloud` - 高效的咨询任务\n- **安装者**：`devstral-2:123b-cloud` - 安装和设置任务\n\n**自定义配置（高级）**\n\n要创建您自己的代理配置，可以从主机文件系统挂载自定义文件：\n\n```bash\n# 使用自定义提供商配置\nOLLAMA_SERVER_URL=https:\u002F\u002Follama.com\nOLLAMA_SERVER_API_KEY=your_ollama_cloud_api_key\nOLLAMA_SERVER_CONFIG_PATH=\u002Fopt\u002Fpentagi\u002Fconf\u002Follama.provider.yml\n\n# 从主机文件系统挂载自定义配置（在 .env 或 docker-compose 覆盖文件中）\nPENTAGI_OLLAMA_SERVER_CONFIG_PATH=\u002Fpath\u002Fon\u002Fhost\u002Fmy-ollama-config.yml\n```\n\n环境变量 `PENTAGI_OLLAMA_SERVER_CONFIG_PATH` 将您的主机配置文件映射到容器内的 `\u002Fopt\u002Fpentagi\u002Fconf\u002Follama.provider.yml`。\n\n**自定义配置示例**（`my-ollama-config.yml`）：\n\n```yaml\nprimary_agent:\n  model: \"qwen3-coder-next:cloud\"\n  temperature: 1.0\n  top_p: 0.9\n  max_tokens: 32768\n  reasoning:\n    effort: high\n\ncoder:\n  model: \"qwen3-coder:32b\"\n  temperature: 1.0\n  max_tokens: 20480\n```\n\n#### 本地 Ollama 配置\n\n对于自托管的 Ollama 实例：\n\n```bash\n# 基本本地 Ollama 设置\nOLLAMA_SERVER_URL=http:\u002F\u002Flocalhost:11434\nOLLAMA_SERVER_MODEL=llama3.1:8b-instruct-q8_0\n\n# 生产环境设置，启用自动拉取和模型发现\nOLLAMA_SERVER_URL=http:\u002F\u002Follama-server:11434\nOLLAMA_SERVER_PULL_MODELS_ENABLED=true\nOLLAMA_SERVER_PULL_MODELS_TIMEOUT=900\nOLLAMA_SERVER_LOAD_MODELS_ENABLED=true\n\n# 使用 Docker 镜像中的预建配置\nOLLAMA_SERVER_CONFIG_PATH=\u002Fopt\u002Fpentagi\u002Fconf\u002Follama-llama318b.provider.yml\n# 或\nOLLAMA_SERVER_CONFIG_PATH=\u002Fopt\u002Fpentagi\u002Fconf\u002Follama-qwen332b-fp16-tc.provider.yml\n# 或\nOLLAMA_SERVER_CONFIG_PATH=\u002Fopt\u002Fpentagi\u002Fconf\u002Follama-qwq32b-fp16-tc.provider.yml\n```\n\n**性能注意事项：**\n\n- 
**模型发现**（`OLLAMA_SERVER_LOAD_MODELS_ENABLED=true`）：查询 Ollama API 会增加 1-2 秒的启动延迟。\n- **自动拉取**（`OLLAMA_SERVER_PULL_MODELS_ENABLED=true`）：首次启动可能需要几分钟下载模型。\n- **拉取超时**（`OLLAMA_SERVER_PULL_MODELS_TIMEOUT=900`）：即 15 分钟。\n- **静态配置**：为加快启动速度，可禁用上述两个选项，并在配置文件中指定模型。\n\n#### 创建具有扩展上下文的自定义 Ollama 模型\n\nPentAGI 需要使用比默认 Ollama 配置更大的上下文窗口的模型。您需要通过 Modelfile 创建具有更高 `num_ctx` 参数的自定义模型。虽然典型的代理工作流通常消耗约 64K 个标记，但 PentAGI 为了安全起见并应对复杂的渗透测试场景，采用了 110K 的上下文大小。\n\n**重要提示**：`num_ctx` 参数只能在通过 Modelfile 创建模型时设置，创建后无法更改或在运行时覆盖。\n\n##### 示例：Qwen3 32B FP16 扩展上下文版本\n\n创建名为 `Modelfile_qwen3_32b_fp16_tc` 的 Modelfile：\n\n```dockerfile\nFROM qwen3:32b-fp16\nPARAMETER num_ctx 110000\nPARAMETER temperature 0.3\nPARAMETER top_p 0.8\nPARAMETER min_p 0.0\nPARAMETER top_k 20\nPARAMETER repeat_penalty 1.1\n```\n\n构建自定义模型：\n\n```bash\nollama create qwen3:32b-fp16-tc -f Modelfile_qwen3_32b_fp16_tc\n```\n\n##### 示例：QwQ 32B FP16 扩展上下文版本\n\n创建名为 `Modelfile_qwq_32b_fp16_tc` 的 Modelfile：\n\n```dockerfile\nFROM qwq:32b-fp16\nPARAMETER num_ctx 110000\nPARAMETER temperature 0.2\nPARAMETER top_p 0.7\nPARAMETER min_p 0.0\nPARAMETER top_k 40\nPARAMETER repeat_penalty 1.2\n```\n\n构建自定义模型：\n\n```bash\nollama create qwq:32b-fp16-tc -f Modelfile_qwq_32b_fp16_tc\n```\n\n> **注意**：QwQ 32B FP16 模型进行推理时大约需要 **71.3 GB VRAM**。请确保您的系统具备足够的 GPU 内存后再尝试使用此模型。\n\n这些自定义模型已在预建的提供商配置文件中引用（`ollama-qwen332b-fp16-tc.provider.yml` 和 `ollama-qwq32b-fp16-tc.provider.yml`），这些文件位于 Docker 镜像的 `\u002Fopt\u002Fpentagi\u002Fconf\u002F` 目录下。\n\n### OpenAI 提供商配置\n\nPentAGI 集成 OpenAI 的全面模型系列，具备先进的推理能力、扩展的思维链、增强工具集成的代理模型，以及专用于安全工程的代码模型。\n\n#### 配置变量\n\n| 变量             | 默认值                     | 描述                 |\n| -------------------- | --------------------------- | --------------------------- |\n| `OPEN_AI_KEY`        |                             | OpenAI 服务的 API 密钥 |\n| `OPEN_AI_SERVER_URL` | `https:\u002F\u002Fapi.openai.com\u002Fv1` | OpenAI API 端点         |\n\n#### 配置示例\n\n```bash\n\n# 基本的 OpenAI 设置\nOPEN_AI_KEY=你的 OpenAI API 密钥\nOPEN_AI_SERVER_URL=https:\u002F\u002Fapi.openai.com\u002Fv1\n\n# 使用代理以增强安全性\nOPEN_AI_KEY=你的 OpenAI API 密钥\nPROXY_URL=http:\u002F\u002F你的代理:8080\n```\n\n#### 支持的模型\n\nPentAGI 支持 31 种具备工具调用、流式传输、推理模式和提示缓存功能的 OpenAI 模型。标有 `*` 的模型为默认配置中使用的模型。\n\n**GPT-5.2 系列 - 最新旗舰代理模型（2025 年 12 月）**\n\n| 模型 ID              | 思维能力 | 价格（输入\u002F输出\u002F缓存） | 使用场景                                        |\n| --------------------- | -------- | -------------------------- | ----------------------------------------------- |\n| `gpt-5.2`*            | ✅        | $1.75\u002F$14.00\u002F$0.18         | 最新旗舰模型，具备更强的推理能力和工具集成，适用于自主安全研究 |\n| `gpt-5.2-pro`         | ✅        | $21.00\u002F$168.00\u002F$0.00       | 高级版本，拥有卓越的代理编码能力，适用于关键任务安全研究及零日漏洞发现 |\n| `gpt-5.2-codex`       | ✅        | $1.75\u002F$14.00\u002F$0.18         | 最先进的代码专用模型，擅长上下文压缩，具有强大的网络安全能力 |\n\n**GPT-5\u002F5.1 系列 - 高级代理模型**\n\n| 模型 ID              | 思维能力 | 价格（输入\u002F输出\u002F缓存） | 使用场景                                        |\n| --------------------- | -------- | -------------------------- | ----------------------------------------------- |\n| `gpt-5`               | ✅        | $1.25\u002F$10.00\u002F$0.13         | 顶级代理模型，具备先进推理能力，适用于自主安全研究及漏洞利用链开发 |\n| `gpt-5.1`             | ✅        | $1.25\u002F$10.00\u002F$0.13         | 增强型代理模型，具有自适应推理能力，可在渗透测试中实现工具的高效协调 |\n| `gpt-5-pro`           | ✅        | $15.00\u002F$120.00\u002F$0.00       | 高级版本，推理能力大幅提升，幻觉现象显著减少，适用于关键安全运营 |\n| `gpt-5-mini`          | ✅        | $0.25\u002F$2.00\u002F$0.03          | 在速度与智能之间取得良好平衡，适用于自动化漏洞分析和漏洞利用生成 |\n| `gpt-5-nano`         
 | ✅        | $0.05\u002F$0.40\u002F$0.01          | 运行速度最快，适合高吞吐量扫描、侦察及批量漏洞检测 |\n\n**GPT-5\u002F5.1 Codex 系列 - 代码专用模型**\n\n| 模型 ID              | 思维能力 | 价格（输入\u002F输出\u002F缓存） | 使用场景                                        |\n| --------------------- | -------- | -------------------------- | ----------------------------------------------- |\n| `gpt-5.1-codex-max`   | ✅        | $1.25\u002F$10.00\u002F$0.13         | 推理能力进一步提升，适用于复杂代码编写、已验证的 CVE 漏洞挖掘及系统性漏洞利用开发 |\n| `gpt-5.1-codex`       | ✅        | $1.25\u002F$10.00\u002F$0.13         | 标准优化的代码专用模型，具备强大推理能力，可用于漏洞利用生成和漏洞分析 |\n| `gpt-5-codex`         | ✅        | $1.25\u002F$10.00\u002F$0.13         | 基础代码专用模型，适用于漏洞扫描及基础漏洞利用生成 |\n| `gpt-5.1-codex-mini`  | ✅        | $0.25\u002F$2.00\u002F$0.03          | 体积小巧但性能强劲，容量是普通模型的四倍，可快速检测漏洞 |\n| `codex-mini-latest`   | ✅        | $1.50\u002F$6.00\u002F$0.38          | 最新的紧凑型代码模型，可用于自动化代码审查及基础漏洞分析 |\n\n**GPT-4.1 系列 - 增强型智能模型**\n\n| 模型 ID              | 思维能力 | 价格（输入\u002F输出\u002F缓存） | 使用场景                                        |\n| --------------------- | -------- | -------------------------- | ----------------------------------------------- |\n| `gpt-4.1`             | ❌        | $2.00\u002F$8.00\u002F$0.50          | 增强版旗舰模型，具备更出色的函数调用能力，适用于复杂威胁分析及高级漏洞利用开发 |\n| `gpt-4.1-mini`*       | ❌        | $0.40\u002F$1.60\u002F$0.10          | 性能均衡且效率更高，适用于常规安全评估及自动化代码分析 |\n| `gpt-4.1-nano`        | ❌        | $0.10\u002F$0.40\u002F$0.03          | 超轻量级高速模型，适用于批量安全扫描、快速侦察及持续监控 |\n\n**GPT-4o 系列 - 多模态旗舰模型**\n\n| 模型 ID              | 思维能力 | 价格（输入\u002F输出\u002F缓存） | 使用场景                                        |\n| --------------------- | -------- | -------------------------- | ----------------------------------------------- |\n| `gpt-4o`              | ❌        | $2.50\u002F$10.00\u002F$1.25         | 多模态旗舰模型，支持视觉、图像分析、网页界面评估及多工具协同操作 |\n| `gpt-4o-mini`         | ❌        | $0.15\u002F$0.60\u002F$0.08          | 紧凑型多模态模型，具备强大的函数调用能力，适用于高频扫描及经济高效的批量操作 |\n\n**o 系列 - 高级推理模型**\n\n| 模型 ID              | 思维能力 | 价格（输入\u002F输出\u002F缓存） | 使用场景                                        |\n| --------------------- | -------- | -------------------------- | ----------------------------------------------- |\n| `o4-mini`*            | ✅        | $1.10\u002F$4.40\u002F$0.28          | 新一代推理模型，速度更快，适用于系统性的安全评估及漏洞利用开发 |\n| `o3`*                 | ✅        | $2.00\u002F$8.00\u002F$0.50          | 强大的高级推理模型，适用于多阶段攻击链及深度漏洞分析 |\n| `o3-mini`             | ✅        | $1.10\u002F$4.40\u002F$0.55          | 紧凑型推理模型，思维能力更强，可用于分步攻击计划及逻辑漏洞关联 |\n| `o1`                  | ✅        | $15.00\u002F$60.00\u002F$7.50        | 顶级推理模型，深度极强，适用于高级渗透测试及新型漏洞利用研究 |\n| `o3-pro`              | ✅        | $20.00\u002F$80.00\u002F$0.00       | 最先进的推理模型，成本仅为 o1-pro 的 80%，适用于零日漏洞研究及关键安全调查 |\n| `o1-pro`              | ✅        | $150.00\u002F$600.00\u002F$0.00      | 上一代高端推理模型，适用于详尽的安全分析及重大挑战 |\n\n**价格**：每 100 万 tokens 计算。推理模型的输出定价中包含思维 tokens 的费用。\n\n> [!WARNING]\n> **GPT-5* 模型 - 需要受信访问权限**\n>\n> 所有 GPT-5 系列模型（`gpt-5`、`gpt-5.1`、`gpt-5.2`、`gpt-5-pro`、`gpt-5.2-pro` 及所有 Codex 变体）在 PentAGI 中运行时**不稳定**，若未经过验证的访问权限，可能会触发 OpenAI 的网络安全防护机制。\n>\n> **要可靠地使用 GPT-5* 模型：**\n> 1. **个人用户**：请前往 [chatgpt.com\u002Fcyber](https:\u002F\u002Fchatgpt.com\u002Fcyber) 完成身份验证。\n> 2. **企业团队**：请通过您的 OpenAI 代表申请受信访问权限。\n> 3. 
**安全研究人员**：请申请 [网络安全资助计划](https:\u002F\u002Fopenai.com\u002Fform\u002Fcybersecurity-grant-program\u002F)（包含价值 1000 万美元的 API 信用额度）。\n>\n> **推荐的无需验证的替代方案：**\n> - 对于推理任务，建议使用 `o 系列` 模型（o3、o4-mini、o1）。\n> - 对于通用智能和函数调用任务，建议使用 `gpt-4.1` 系列。\n> - 所有 o 系列及 gpt-4.x 系列模型均可在无需特殊访问权限的情况下稳定运行。\n\n**推理努力程度**：\n- **高**：最大推理深度（精炼者 - o3，高努力）\n- **中**：平衡推理（主要代理、助手、反思者 - o4-mini\u002Fo3，中等努力）\n- **低**：高效定向推理（编码员、安装者、渗透测试员 - o3\u002Fo4-mini，低努力；顾问 - gpt-5.2，低努力）\n\n**关键特性**：\n- **扩展推理**：o系列模型具备思维链功能，适用于复杂的安全分析\n- **智能体式智能**：GPT-5\u002F5.1\u002F5.2系列具有增强的工具集成和自主能力\n- **提示缓存**：降低重复上下文的成本（输入价格的10%-50%）\n- **代码专业化**：专用Codex模型用于漏洞发现和漏洞利用开发\n- **多模态支持**：GPT-4o系列可用于基于视觉的安全评估\n- **工具调用**：所有模型均支持强大的函数调用功能，便于渗透测试工具编排\n- **流式传输**：实时响应流式传输，适用于交互式工作流程\n- **成熟的应用记录**：行业领先的模型已在CVE漏洞发现及实际安全应用中取得成果\n\n\n\n### Anthropic 提供商配置\n\nPentAGI 集成了 Anthropic 的 Claude 模型，这些模型具备先进的扩展思维能力、卓越的安全机制以及对复杂安全情境的深刻理解，并支持提示缓存功能。\n\n#### 配置变量\n\n| 变量               | 默认                        | 描述                    |\n| ---------------------- | ------------------------------ | ------------------------------ |\n| `ANTHROPIC_API_KEY`    |                                | Anthropic 服务的 API 密钥 |\n| `ANTHROPIC_SERVER_URL` | `https:\u002F\u002Fapi.anthropic.com\u002Fv1` | Anthropic API 端点         |\n\n#### 配置示例\n\n```bash\n# 基本 Anthropic 设置\nANTHROPIC_API_KEY=your_anthropic_api_key\nANTHROPIC_SERVER_URL=https:\u002F\u002Fapi.anthropic.com\u002Fv1\n\n# 在安全环境中使用代理\nANTHROPIC_API_KEY=your_anthropic_api_key\nPROXY_URL=http:\u002F\u002Fyour-proxy:8080\n```\n\n#### 支持的模型\n\nPentAGI 支持 10 款带有工具调用、流式传输、扩展推理、自适应推理和提示缓存功能的 Claude 模型。标有 `*` 的模型为默认配置中使用的模型。\n\n**Claude 4 系列 - 最新模型（2025-2026年）**\n\n| 模型编号                 | 推理能力 | 发布日期 | 价格（输入\u002F输出\u002F缓存读写） | 使用场景                                        |\n| ------------------------ | -------- | ------------ | ------------------------------ | ----------------------------------------------- |\n| `claude-opus-4-6`*       | ✅        | 2025年5月     | $5.00\u002F$25.00\u002F$0.50\u002F$6.25       | 最具智能的模型，适用于自主代理和编码任务。扩展+自适应推理，适合复杂漏洞利用开发、多阶段攻击模拟 |\n| `claude-sonnet-4-6`*     | ✅        | 2025年8月     | $3.00\u002F$15.00\u002F$0.30\u002F$3.75       | 具备最佳速度与智能平衡的自适应推理模型。适用于多阶段安全评估、智能漏洞分析、实时威胁狩猎 |\n| `claude-haiku-4-5`*      | ✅        | 2025年10月     | $1.00\u002F$5.00\u002F$0.10\u002F$1.25        | 速度最快的模型，接近前沿水平的智能。适用于高频扫描、实时监控、批量自动化测试 |\n\n**旧版模型 - 仍受支持**\n\n| 模型编号                 | 推理能力 | 发布日期 | 价格（输入\u002F输出\u002F缓存读写） | 使用场景                                        |\n| ------------------------ | -------- | ------------ | ------------------------------ | ----------------------------------------------- |\n| `claude-sonnet-4-5`      | ✅        | 2025年9月     | $3.00\u002F$15.00\u002F$0.30\u002F$3.75       | 当时最先进的推理能力（已被4-6取代）。适用于复杂的渗透测试和高级威胁分析 |\n| `claude-opus-4-5`        | ✅        | 2025年11月     | $5.00\u002F$25.00\u002F$0.50\u002F$6.25       | 极致的推理能力（已被opus-4-6取代）。适用于关键安全研究、零日漏洞发现、红队行动 |\n| `claude-opus-4-1`        | ✅        | 2025年8月     | $15.00\u002F$75.00\u002F$1.50\u002F$18.75     | 高级推理能力（已被取代）。适用于复杂渗透测试和高级威胁建模 |\n| `claude-sonnet-4-0`      | ✅        | 2025年5月     | $3.00\u002F$15.00\u002F$0.30\u002F$3.75       | 高性能推理能力（已被取代）。适用于复杂威胁建模和多工具协调 |\n| `claude-opus-4-0`        | ✅        | 2025年5月     | $15.00\u002F$75.00\u002F$1.50\u002F$18.75     | 第一代 Opus 模型（已被取代）。适用于多步骤漏洞利用开发和自主渗透测试工作流 |\n\n**已弃用模型 - 请迁移到当前模型**\n\n| 模型编号                     | 推理能力 | 发布日期 | 价格（输入\u002F输出\u002F缓存读写） | 备注                                        |\n| ---------------------------- | -------- | ------------ | 
------------------------------ | -------------------------------------------- |\n| `claude-3-haiku-20240307`    | ❌        | 2024年3月     | $0.25\u002F$1.25\u002F$0.03\u002F$0.30        | 将于2026年4月19日退役。请迁移到 claude-haiku-4-5 |\n\n**价格**：每100万标记符计价。缓存定价包括读取和写入成本。\n\n**扩展推理配置**：\n- **最大标记数4096**：生成器（claude-opus-4-6）用于在复杂漏洞利用开发中实现最大推理深度\n- **最大标记数2048**：编码员（claude-sonnet-4-6）用于平衡的代码分析和漏洞研究  \n- **最大标记数1024**：主要代理、助手、精炼者、顾问、反思者、搜索者、安装者、渗透测试员，用于特定任务的专注推理\n- **扩展推理**：所有 Claude 4.5+ 和 4.6 模型均支持可配置的扩展推理，以应对深度推理任务\n\n**关键特性**：\n- **扩展推理**：所有 Claude 4.5+ 和 4.6 模型均支持可配置的思维链推理深度，适用于复杂的安全分析\n- **自适应推理**：Claude 4.6 系列（Opus\u002FSonnet）可根据任务复杂度动态调整推理深度，以达到最佳性能\n- **提示缓存**：通过单独的读写定价显著降低成本（读取费用为输入的10%，写入费用为输入的125%）\n- **扩展上下文窗口**：标准为20万标记符，最高可达100万标记符（测试版），适用于 Claude Opus\u002FSonnet 4.6 的全面代码库分析\n- **工具调用**：功能调用强大且准确，非常适合安全工具编排\n- **流式传输**：实时响应流式传输，适用于交互式渗透测试工作流\n- **安全优先设计**：内置安全机制，确保负责任的安全测试实践\n- **多模态支持**：最新模型具备视觉功能，可用于截图分析和UI安全评估\n- **宪法式AI**：先进的安全训练提供可靠且符合伦理的安全指导\n\n### Google AI（Gemini）提供商配置\n\nPentAGI 通过 Google AI API 集成 Google 的 Gemini 模型，提供最先进的多模态推理能力，并支持扩展思考和上下文缓存功能。\n\n#### 配置变量\n\n| 变量            | 默认值                                     | 描述                    |\n| ------------------- | ------------------------------------------- | ------------------------------ |\n| `GEMINI_API_KEY`    |                                             | Google AI 服务的 API 密钥 |\n| `GEMINI_SERVER_URL` | `https:\u002F\u002Fgenerativelanguage.googleapis.com` | Google AI API 端点         |\n\n#### 配置示例\n\n```bash\n# 基本 Gemini 设置\nGEMINI_API_KEY=your_gemini_api_key\nGEMINI_SERVER_URL=https:\u002F\u002Fgenerativelanguage.googleapis.com\n\n# 使用代理\nGEMINI_API_KEY=your_gemini_api_key\nPROXY_URL=http:\u002F\u002Fyour-proxy:8080\n```\n\n#### 支持的模型\n\nPentAGI 支持 13 种具备工具调用、流式传输、思考模式和上下文缓存功能的 Gemini 模型。标有 `*` 的模型为默认配置中使用的模型。\n\n**Gemini 3.1 系列 - 最新旗舰版（2026 年 2 月）**\n\n| 模型 ID                              | 思考 | 上下文 | 价格（输入\u002F输出\u002F缓存） | 使用场景                                        |\n| ------------------------------------- | -------- | ------- | -------------------------- | ----------------------------------------------- |\n| `gemini-3.1-pro-preview`*             | ✅        | 1M      | $2.00\u002F$12.00\u002F$0.20         | 最新旗舰版，具有更精细的思考能力、更高的 token 效率，专为软件工程和代理工作流优化 |\n| `gemini-3.1-pro-preview-customtools`  | ✅        | 1M      | $2.00\u002F$12.00\u002F$0.20         | 自定义工具端点，针对 bash 和自定义工具（view_file、search_code）优先级进行了优化 |\n| `gemini-3.1-flash-lite-preview`*      | ✅        | 1M      | $0.25\u002F$1.50\u002F$0.03          | 成本效益最高，性能最快，适用于高吞吐量的代理任务和低延迟应用 |\n\n**Gemini 3 系列（⚠️ gemini-3-pro-preview 已弃用 - 将于 2026 年 3 月 9 日停用）**\n\n| 模型 ID                              | 思考 | 上下文 | 价格（输入\u002F输出\u002F缓存） | 使用场景                                        |\n| ------------------------------------- | -------- | ------- | -------------------------- | ----------------------------------------------- |\n| `gemini-3-pro-preview`                | ✅        | 1M      | $2.00\u002F$12.00\u002F$0.20         | ⚠️ 已弃用 - 请在 2026 年 3 月 9 日前迁移到 gemini-3.1-pro-preview |\n| `gemini-3-flash-preview`*             | ✅        | 1M      | $0.50\u002F$3.00\u002F$0.05          | 前沿智能，具有卓越的搜索基础，适合高通量安全扫描 |\n\n**Gemini 2.5 系列 - 高级思考模型**\n\n| 模型 ID                                 | 思考 | 上下文 | 价格（输入\u002F输出\u002F缓存） | 使用场景                                        |\n| ---------------------------------------- | -------- | ------- | -------------------------- | ----------------------------------------------- |\n| `gemini-2.5-pro`                         | ✅        | 
1M      | $1.25\u002F$10.00\u002F$0.13         | 处理复杂编码和推理任务的最先进模型，适用于高级威胁建模 |\n| `gemini-2.5-flash`                       | ✅        | 1M      | $0.30\u002F$2.50\u002F$0.03          | 首个具备思考预算的混合推理模型，性价比极高，适用于大规模评估 |\n| `gemini-2.5-flash-lite`                  | ✅        | 1M      | $0.10\u002F$0.40\u002F$0.01          | 规模最小、成本最低，适用于大规模使用和高通量扫描 |\n| `gemini-2.5-flash-lite-preview-09-2025`  | ✅        | 1M      | $0.10\u002F$0.40\u002F$0.01          | 最新预览版，专为成本效益、高吞吐量和高质量而优化 |\n\n**Gemini 2.0 系列 - 适用于代理的平衡型多模态模型**\n\n| 模型 ID                              | 思考 | 上下文 | 价格（输入\u002F输出\u002F缓存） | 使用场景                                        |\n| ------------------------------------- | -------- | ------- | -------------------------- | ----------------------------------------------- |\n| `gemini-2.0-flash`                    | ❌        | 1M      | $0.10\u002F$0.40\u002F$0.03          | 专为代理时代设计的平衡型多模态模型，适用于多样化的安全任务和实时监控 |\n| `gemini-2.0-flash-lite`               | ❌        | 1M      | $0.08\u002F$0.30\u002F$0.00          | 轻量级模型，适用于持续监控、基础扫描和自动化告警处理 |\n\n**专用开源模型（免费）**\n\n| 模型 ID                              | 思考 | 上下文 | 价格（输入\u002F输出\u002F缓存） | 使用场景                                        |\n| ------------------------------------- | -------- | ------- | -------------------------- | ----------------------------------------------- |\n| `gemma-3-27b-it`                      | ❌        | 128K    | 免费\u002F免费\u002F免费             | 基于 Gemini 技术的开源模型，适用于本地安全运营和隐私敏感测试 |\n| `gemma-3n-4b-it`                      | ❌        | 128K    | 免费\u002F免费\u002F免费             | 适用于边缘设备（手机、笔记本电脑、平板电脑），可进行离线漏洞扫描 |\n\n**价格**：每 100 万 tokens（标准付费层级）。上下文窗口即输入 token 限制。\n\n> [!WARNING]\n> **Gemini 3 Pro Preview 已弃用**\n>\n> `gemini-3-pro-preview` 将于 **2026 年 3 月 9 日停止服务**。请迁移到 `gemini-3.1-pro-preview` 以避免服务中断。新模型具有以下优势：\n>\n> - 更加完善的性能和可靠性\n> - 更好的思考能力和 token 效率\n> - 回答更加扎实、事实一致\n> - 更强的软件工程行为表现\n\n**关键特性**：\n- **扩展思考**：用于复杂安全分析的逐步推理（所有 Gemini 3.x 和 2.5 系列）\n- **上下文缓存**：重复使用上下文时可大幅降低成本（降低至输入价格的 10%-90%）\n- **超长上下文**：100 万 tokens，适用于全面的代码库分析和文档审查\n- **多模态支持**：支持文本、图像、视频、音频和 PDF 处理，可用于全面评估\n- **工具调用**：通过函数调用与 20 多种渗透测试工具无缝集成\n- **流式传输**：实时响应流式传输，适用于交互式安全工作流\n- **代码执行**：内置代码执行功能，可用于攻击性工具测试和漏洞利用验证\n- **搜索基础**：集成 Google 搜索，用于威胁情报和 CVE 研究\n- **文件搜索**：文档检索和 RAG 功能，适用于基于知识的评估\n- **批量 API**：非实时批量处理可节省 50% 的成本\n\n**推理力度等级**：\n- **高**：最大思考深度，适用于复杂的多步分析（generator）\n- **中**：平衡推理，适用于一般代理任务（primary_agent、assistant、refiner、adviser）\n- **低**：高效思考，适用于专注任务（coder、installer、pentester）\n\n### AWS Bedrock 提供商配置\n\nPentAGI 与 Amazon Bedrock 集成，提供来自领先 AI 公司的 20 多种基础模型的访问权限，包括 Anthropic、Amazon、Cohere、DeepSeek、OpenAI、Qwen、Mistral 和 Moonshot。\n\n#### 配置变量\n\n| 变量                    | 默认值     | 描述                                                                                         |\n| --------------------------- | ----------- | --------------------------------------------------------------------------------------------------- |\n| `BEDROCK_REGION`            | `us-east-1` | Bedrock 服务的 AWS 区域                                                                      |\n| `BEDROCK_DEFAULT_AUTH`      | `false`     | 使用 AWS SDK 默认凭证链（环境变量、EC2 角色、~\u002F.aws\u002Fcredentials）——优先级最高                 |\n| `BEDROCK_BEARER_TOKEN`      |             | Bearer 令牌认证——优先级高于静态凭证                                      |\n| `BEDROCK_ACCESS_KEY_ID`     |             | 用于静态凭证的 AWS 访问密钥 ID                                                            |\n| `BEDROCK_SECRET_ACCESS_KEY` |             | 用于静态凭证的 AWS 秘密访问密钥                                                          |\n| 
`BEDROCK_SESSION_TOKEN`     |             | 用于临时凭证的 AWS 会话令牌（可选，与静态凭证一起使用）                |\n| `BEDROCK_SERVER_URL`        |             | 自定义 Bedrock 端点（VPC 终端节点、本地测试）                                              |\n\n**认证优先级**：`BEDROCK_DEFAULT_AUTH` → `BEDROCK_BEARER_TOKEN` → `BEDROCK_ACCESS_KEY_ID`+`BEDROCK_SECRET_ACCESS_KEY`\n\n#### 配置示例\n\n```bash\n# 推荐：默认 AWS SDK 认证（EC2\u002FECS\u002FLambda 角色）\nBEDROCK_REGION=us-east-1\nBEDROCK_DEFAULT_AUTH=true\n\n# Bearer 令牌认证（AWS STS、自定义认证）\nBEDROCK_REGION=us-east-1\nBEDROCK_BEARER_TOKEN=your_bearer_token\n\n# 硬编码凭证（开发、测试）\nBEDROCK_REGION=us-east-1\nBEDROCK_ACCESS_KEY_ID=your_aws_access_key\nBEDROCK_SECRET_ACCESS_KEY=your_aws_secret_key\n\n# 使用代理和自定义端点\nBEDROCK_REGION=us-east-1\nBEDROCK_DEFAULT_AUTH=true\nBEDROCK_SERVER_URL=https:\u002F\u002Fbedrock-runtime.us-east-1.vpce-xxx.amazonaws.com\nPROXY_URL=http:\u002F\u002Fyour-proxy:8080\n```\n\n#### 支持的模型\n\nPentAGI 支持 21 种具有工具调用、流式传输和多模态能力的 AWS Bedrock 模型。标有 `*` 的模型在默认配置中使用。\n\n| 模型 ID                                         | 提供商        | 思考 | 多模态 | 价格（输入\u002F输出） | 使用场景                                |\n| ------------------------------------------------ | --------------- | -------- | ---------- | -------------------- | --------------------------------------- |\n| `us.amazon.nova-2-lite-v1:0`                     | Amazon Nova     | ❌        | ✅          | $0.33\u002F$2.75          | 自适应推理、高效思考  |\n| `us.amazon.nova-premier-v1:0`                    | Amazon Nova     | ❌        | ✅          | $2.50\u002F$12.50         | 复杂推理、高级分析    |\n| `us.amazon.nova-pro-v1:0`                        | Amazon Nova     | ❌        | ✅          | $0.80\u002F$3.20          | 平衡准确性、速度和成本          |\n| `us.amazon.nova-lite-v1:0`                       | Amazon Nova     | ❌        | ✅          | $0.06\u002F$0.24          | 快速处理、高吞吐量操作 |\n| `us.amazon.nova-micro-v1:0`                      | Amazon Nova     | ❌        | ❌          | $0.035\u002F$0.14         | 超低延迟、实时监控  |\n| `us.anthropic.claude-opus-4-6-v1`*               | Anthropic       | ✅        | ✅          | $5.00\u002F$25.00         | 世界级编程、企业级智能体   |\n| `us.anthropic.claude-sonnet-4-6`                 | Anthropic       | ✅        | ✅          | $3.00\u002F$15.00         | 前沿智能、企业级规模  |\n| `us.anthropic.claude-opus-4-5-20251101-v1:0`     | Anthropic       | ✅        | ✅          | $5.00\u002F$25.00         | 多日软件开发          |\n| `us.anthropic.claude-haiku-4-5-20251001-v1:0`*   | Anthropic       | ✅        | ✅          | $1.00\u002F$5.00          | 准前沿性能、高速运算  |\n| `us.anthropic.claude-sonnet-4-5-20250929-v1:0`*  | Anthropic       | ✅        | ✅          | $3.00\u002F$15.00         | 实际场景中的智能体、卓越编程  |\n| `us.anthropic.claude-sonnet-4-20250514-v1:0`     | Anthropic       | ✅        | ✅          | $3.00\u002F$15.00         | 平衡性能、生产就绪  |\n| `us.anthropic.claude-3-5-haiku-20241022-v1:0`    | Anthropic       | ❌        | ❌          | $0.80\u002F$4.00          | 最快模型、经济高效的扫描  |\n| `cohere.command-r-plus-v1:0`                     | Cohere          | ❌        | ❌          | $3.00\u002F$15.00         | 大规模运营、卓越的 RAG    |\n| `deepseek.v3.2`                                  | DeepSeek        | ❌        | ❌          | $0.58\u002F$1.68          | 长上下文推理、高效性      |\n| `openai.gpt-oss-120b-1:0`*                       | OpenAI (OSS)    | ✅        | ❌          | $0.15\u002F$0.60          | 强大推理、科学分析   |\n| `openai.gpt-oss-20b-1:0`                         | OpenAI (OSS)    | ✅        | ❌          | $0.07\u002F$0.30          | 高效编程、软件开发  |\n| `qwen.qwen3-next-80b-a3b`                 
       | Qwen            | ❌        | ❌          | $0.15\u002F$1.20          | 超长上下文、旗舰级推理  |\n| `qwen.qwen3-32b-v1:0`                            | Qwen            | ❌        | ❌          | $0.15\u002F$0.60          | 平衡推理、科研用途  |\n| `qwen.qwen3-coder-30b-a3b-v1:0`                  | Qwen            | ❌        | ❌          | $0.15\u002F$0.60          | 编码氛围、自然语言优先  |\n| `qwen.qwen3-coder-next`                          | Qwen            | ❌        | ❌          | $0.45\u002F$1.80          | 优化了工具使用和函数调用    |\n| `mistral.mistral-large-3-675b-instruct`          | Mistral         | ❌        | ✅          | $4.00\u002F$12.00         | 先进的多模态、长上下文      |\n| `moonshotai.kimi-k2.5`                           | Moonshot        | ❌        | ✅          | $0.60\u002F$3.00          | 视觉、语言和代码一体化模型     |\n\n**价格**：每 100 万 tokens。支持思考\u002F推理的模型在推理阶段会产生额外的计算成本。\n\n#### 已测试但不兼容的模型\n\n一些 AWS Bedrock 模型经过测试，但由于技术限制而 **不被支持**：\n\n| 模型家族              | 不兼容原因                                                                |\n| ------------------------- | ----------------------------------------------------------------------------------------- |\n| **GLM (Z.AI)**            | 工具调用格式与 Converse API 不兼容（Converse API 期望字符串而非 JSON）       |\n| **AI21 Jamba**            | 严重的速率限制（每分钟 1-2 次请求）导致无法可靠地进行测试和生产使用      |\n| **Meta Llama 3.3\u002F3.1**    | 工具调用结果处理不稳定，会在多轮对话流程中引发意外失败                  |\n| **Mistral Magistral**     | 该模型不支持工具调用                                                    |\n| **Moonshot K2-Thinking**  | 工具调用时流式传输行为不稳定，在生产环境中不可靠                     |\n| **Qwen3-VL**              | 使用工具调用时流式传输不稳定，多模态结合工具功能会间歇性失效          |\n\n> [!IMPORTANT]\n> **速率限制与配额管理**\n>\n> AWS Bedrock 对于 Claude 模型的默认配额非常严格（新账户每分钟仅允许 2-20 次请求）。若要用于生产环境中的渗透测试：\n>\n> 1. 请通过 AWS Service Quotas 控制台为计划使用的模型申请提高配额。\n> 2. 建议使用 Amazon Nova 系列模型——它们具有更高的默认配额且性能优异。\n> 3. 启用预置吞吐量以确保高容量测试的一致性。\n> 4. 
密切监控使用情况——AWS 在达到配额上限时会采取严格的限流措施。\n>\n> 若未提高配额，将频繁出现延迟和工作流中断。\n\n> [!WARNING]\n> **Converse API 要求**\n>\n> PentAGI 使用 Amazon Bedrock 的 Converse API 来实现统一的模型接入。所有受支持的模型必须满足以下条件：\n>\n> - ✅ 支持 Converse\u002FConverseStream API\n> - ✅ 支持工具调用（函数调用），以适应渗透测试工作流\n> - ✅ 支持流式工具调用，以便提供实时反馈\n>\n> 请在以下链接处确认各模型的功能支持情况：[AWS Bedrock 模型功能](https:\u002F\u002Fdocs.aws.amazon.com\u002Fbedrock\u002Flatest\u002Fuserguide\u002Fconversation-inference-supported-models-features.html)\n\n**核心特性**：\n- **自动提示缓存**：对于重复上下文，可降低 40%-70% 的成本（Claude 4.x 系列模型）。\n- **扩展思维**：针对复杂安全分析提供逐步推理能力（Claude、DeepSeek R1、OpenAI GPT）。\n- **多模态分析**：能够处理截图、图表、视频等，实现全面的测试（Nova、Claude、Mistral、Kimi）。\n- **工具调用**：通过函数调用与 20 多种渗透测试工具无缝集成。\n- **流式传输**：支持实时响应流式传输，适用于交互式的安全评估工作流。\n\n\n\n### DeepSeek 提供商配置\n\nPentAGI 集成 DeepSeek 平台，提供具备强大推理能力、编码能力和上下文缓存功能的先进 AI 模型，价格极具竞争力。\n\n#### 配置变量\n\n| 变量              | 默认值              | 描述                                         |\n| --------------------- | -------------------------- | --------------------------------------------------- |\n| `DEEPSEEK_API_KEY`    |                            | DeepSeek API 密钥，用于身份验证                 |\n| `DEEPSEEK_SERVER_URL` | `https:\u002F\u002Fapi.deepseek.com` | DeepSeek API 的服务端 URL                           |\n| `DEEPSEEK_PROVIDER`   |                            | LiteLLM 集成时的提供商前缀（可选）  |\n\n#### 配置示例\n\n```bash\n# 直接使用 API\nDEEPSEEK_API_KEY=your_deepseek_api_key\nDEEPSEEK_SERVER_URL=https:\u002F\u002Fapi.deepseek.com\n\n# 使用 LiteLLM 代理\nDEEPSEEK_API_KEY=your_litellm_key\nDEEPSEEK_SERVER_URL=http:\u002F\u002Flitellm-proxy:4000\nDEEPSEEK_PROVIDER=deepseek  # 为模型名称添加前缀（如 deepseek\u002Fdeepseek-chat），便于 LiteLLM 调用\n```\n\n#### 支持的模型\n\nPentAGI 支持两款 DeepSeek-V3.2 模型，它们均具备工具调用、流式传输、思维模式以及上下文缓存功能。这两款模型在默认配置中均可使用。\n\n| 模型 ID              | 思维模式 | 上下文长度 | 最大输出 | 价格（输入\u002F输出\u002F缓存） | 使用场景                                        |\n| --------------------- | -------- | ------- | ---------- | -------------------------- | ----------------------------------------------- |\n| `deepseek-chat`*      | ❌        | 128K    | 8K         | $0.28\u002F$0.42\u002F$0.03          | 通用对话、代码生成、工具调用                   |\n| `deepseek-reasoner`*  | ✅        | 128K    | 64K        | $0.28\u002F$0.42\u002F$0.03          | 高级推理、复杂逻辑分析、安全分析                |\n\n**价格**：按每 100 万 tokens 计算。缓存定价基于提示缓存（为输入费用的 10%）。支持“思维模式”的模型包含强化学习链式思维推理机制。\n\n**核心特性**：\n- **自动提示缓存**：对于重复内容，可节省 40%-60% 的成本（提示缓存费用为输入费用的 10%）。\n- **扩展思维**：采用强化学习链式思维推理技术，适用于复杂的安全分析任务（deepseek-reasoner）。\n- **强大的编码能力**：专为代码生成和漏洞利用开发而优化。\n- **工具调用**：可通过函数调用与 20 多种渗透测试工具无缝集成。\n- **流式传输**：支持实时响应流式传输，适用于交互式工作流。\n- **多语言支持**：中文和英文表现尤为出色。\n- **其他功能**：JSON 输出、聊天前缀补全、FIM（中间填空）补全。\n\n**LiteLLM 集成**：若使用 PentAGI 默认配置并通过 LiteLLM 代理调用模型，请设置 `DEEPSEEK_PROVIDER=deepseek`，以启用模型名称前缀。若直接使用 API，则保持为空即可。\n\n### GLM 提供商配置\n\nPentAGI 集成来自智谱 AI（Z.AI）的 GLM 系列模型，这些模型采用 MoE 架构，具备强大的推理能力和代理式功能，由清华大学研发。\n\n#### 配置变量\n\n| 变量          | 默认值                   | 描述                                                |\n| ----------------- | ------------------------------- | ---------------------------------------------------------- |\n| `GLM_API_KEY`     |                                 | GLM API 密钥，用于身份验证                             |\n| `GLM_SERVER_URL`  | `https:\u002F\u002Fapi.z.ai\u002Fapi\u002Fpaas\u002Fv4`  | GLM API 的国际服务端 URL                       |\n| `GLM_PROVIDER`    |                                 | LiteLLM 集成时的提供商前缀（可选）         |\n\n#### 配置示例\n\n```bash\n# 直接使用 
API（国际服务端）\nGLM_API_KEY=your_glm_api_key\nGLM_SERVER_URL=https:\u002F\u002Fapi.z.ai\u002Fapi\u002Fpaas\u002Fv4\n\n# 其他服务端\nGLM_SERVER_URL=https:\u002F\u002Fopen.bigmodel.cn\u002Fapi\u002Fpaas\u002Fv4  # 中国境内\nGLM_SERVER_URL=https:\u002F\u002Fapi.z.ai\u002Fapi\u002Fcoding\u002Fpaas\u002Fv4   # 专门用于编码相关API\n\n# 使用 LiteLLM 代理\nGLM_API_KEY=your_litellm_key\nGLM_SERVER_URL=http:\u002F\u002Flitellm-proxy:4000\nGLM_PROVIDER=zai  # 为 LiteLLM 添加模型名称前缀（zai\u002Fglm-4）\n```\n\n#### 支持的模型\n\nPentAGI 支持 12 款具备工具调用、流式输出、思考模式和提示缓存功能的 GLM 模型。标有 `*` 的模型为默认配置中使用的模型。\n\n**GLM-5 系列 - 旗舰 MoE（744B\u002F40B 活性参数）**\n\n| 模型 ID                | 思考模式      | 上下文长度 | 最大输出长度 | 价格（输入\u002F输出\u002F缓存） | 使用场景                                        |\n| ----------------------- | ------------- | ---------- | ------------ | ---------------------- | ----------------------------------------------- |\n| `glm-5`*                | ✅ 强制启用    | 20万       | 12.8万       | $1.00\u002F$3.20\u002F$0.20      | 旗舰级智能体工程，复杂多步骤任务               |\n| `glm-5-code`†           | ✅ 强制启用    | 20万       | 12.8万       | $1.20\u002F$5.00\u002F$0.30      | 代码专用，漏洞利用开发（需订阅 Coding Plan）   |\n\n**GLM-4.7 系列 - 高端，交错思考模式**\n\n| 模型 ID                | 思考模式      | 上下文长度 | 最大输出长度 | 价格（输入\u002F输出\u002F缓存） | 使用场景                                        |\n| ----------------------- | ------------- | ---------- | ------------ | ---------------------- | ----------------------------------------------- |\n| `glm-4.7`*              | ✅ 强制启用    | 20万       | 12.8万       | $0.60\u002F$2.20\u002F$0.11      | 每次响应\u002F工具调用前都会进行思考的高端模型     |\n| `glm-4.7-flashx`*       | ✅ 混合模式    | 20万       | 12.8万       | $0.07\u002F$0.40\u002F$0.01      | 高速优先 GPU，性价比最佳                       |\n| `glm-4.7-flash`         | ✅ 混合模式    | 20万       | 12.8万       | 免费\u002F免费\u002F免费         | 免费的约 300 亿参数 SOTA 模型，仅支持单并发请求|\n\n**GLM-4.6 系列 - 平衡型，自动思考模式**\n\n| 模型 ID                | 思考模式      | 上下文长度 | 最大输出长度 | 价格（输入\u002F输出\u002F缓存） | 使用场景                                        |\n| ----------------------- | ------------- | ---------- | ------------ | ---------------------- | ----------------------------------------------- |\n| `glm-4.6`               | ✅ 自动启用    | 20万       | 12.8万       | $0.60\u002F$2.20\u002F$0.11      | 平衡型，支持流式工具调用，令牌效率提升 30%    |\n\n**GLM-4.5 系列 - 统一推理\u002F编码\u002F智能体**\n\n| 模型 ID                | 思考模式      | 上下文长度 | 最大输出长度 | 价格（输入\u002F输出\u002F缓存） | 使用场景                                        |\n| ----------------------- | ------------- | ---------- | ------------ | ---------------------- | ----------------------------------------------- |\n| `glm-4.5`               | ✅ 自动启用    | 12.8万     | 9.6万        | $0.60\u002F$2.20\u002F$0.11      | 统一模型，MoE 架构，活性参数分别为 3550 亿和 320 亿|\n| `glm-4.5-x`             | ✅ 自动启用    | 12.8万     | 9.6万        | $2.20\u002F$8.90\u002F$0.45      | 超高速高端模型，延迟最低                       |\n| `glm-4.5-air`*          | ✅ 自动启用    | 12.8万     | 9.6万        | $0.20\u002F$1.10\u002F$0.03      | 高性价比，MoE 架构，活性参数分别为 1060 亿和 120 亿，性价比最优|\n| `glm-4.5-airx`          | ✅ 自动启用    | 12.8万     | 9.6万        | $1.10\u002F$4.50\u002F$0.22      | 加速版 Air，使用优先 GPU                       |\n| `glm-4.5-flash`         | ✅ 自动启用    | 12.8万     | 9.6万        | 免费\u002F免费\u002F免费         | 免费提供推理、编码和智能体支持                 |\n\n**GLM-4 旧版 - 密集架构**\n\n| 模型 ID                | 思考模式      | 上下文长度 | 最大输出长度 | 价格（输入\u002F输出\u002F缓存） | 使用场景                                        |\n| ----------------------- | ------------- | ---------- | ------------ | ---------------------- | 
----------------------------------------------- |\n| `glm-4-32b-0414-128k`   | ❌ 无思考模式  | 12.8万     | 1.6万        | $0.10\u002F$0.10\u002F$0.00      | 超低成本密集型 320 亿参数模型，适用于高吞吐量解析|\n\n**价格说明**：每 100 万个 token 计算。缓存价格适用于提示词缓存。† 该模型需要 **Coding Plan 订阅**。\n\n> [!WARNING]\n> **Coding Plan 要求**\n>\n> `glm-5-code` 模型需要有效的 **Coding Plan 订阅**。若未订阅而尝试使用该模型，将出现以下错误：\n>\n> ```\n> API 返回意外状态码：403：您没有访问 glm-5-code 的权限\n> ```\n>\n> 对于无需 Coding Plan 的代码专用任务，请改用 `glm-5`（通用旗舰模型）或 `glm-4.7`（具有交错思考模式的高端模型）。\n\n**思考模式**：\n- **强制启用**：模型在每次响应前都会强制进入思考模式（GLM-5、GLM-4.7）\n- **混合模式**：模型会智能判断何时使用思考模式（GLM-4.7-FlashX、GLM-4.7-Flash）\n- **自动启用**：模型会自动决定何时需要进行推理（GLM-4.6、GLM-4.5 系列）\n\n**核心特性**：\n- **提示缓存**：重复上下文可显著降低成本（缓存输入价格已标明）\n- **交错思考模式**：GLM-4.7 在每次响应和工具调用前都会进行思考，并在多轮对话中保持推理连贯性\n- **超长上下文**：GLM-5 和 GLM-4.7\u002F4.6 系列支持 20 万 token，适合大规模代码库分析\n- **MoE 架构**：高效 7440 亿参数，其中 400 亿参数处于活跃状态（GLM-5），3550 亿\u002F320 亿参数（GLM-4.5），1060 亿\u002F120 亿参数（GLM-4.5-Air）\n- **工具调用**：通过函数调用无缝集成 20 多种渗透测试工具\n- **流式输出**：实时响应流式传输，支持流式工具调用（GLM-4.6 及以上）\n- **多语言支持**：卓越的中文和英文 NLP 能力\n- **免费选项**：GLM-4.7-Flash 和 GLM-4.5-Flash 适用于原型设计和实验\n\n**LiteLLM 集成**：设置 `GLM_PROVIDER=zai` 可在使用 PentAGI 默认配置与 LiteLLM 代理时启用模型名称前缀。若直接使用 API，则留空即可。\n\n### Kimi 提供者配置\n\nPentAGI 与 Moonshot AI 的 Kimi 集成，提供具备多模态能力的超长上下文模型，非常适合分析大型代码库和文档。\n\n#### 配置变量\n\n| 变量           | 默认值                | 描述                                         |\n| ---------------- | --------------------- | -------------------------------------------- |\n| `KIMI_API_KEY` |                       | Kimi API 密钥，用于身份验证                  |\n| `KIMI_SERVER_URL`  | `https:\u002F\u002Fapi.moonshot.ai\u002Fv1` | Kimi API 端点 URL（国际版）                   |\n| `KIMI_PROVIDER`    |                       | LiteLLM 集成时的提供者前缀（可选）           |\n\n#### 配置示例\n\n```bash\n# 直接使用 API（国际版端点）\nKIMI_API_KEY=your_kimi_api_key\nKIMI_SERVER_URL=https:\u002F\u002Fapi.moonshot.ai\u002Fv1\n\n# 替代端点\nKIMI_SERVER_URL=https:\u002F\u002Fapi.moonshot.cn\u002Fv1  # 中国境内\n\n# 使用 LiteLLM 代理\nKIMI_API_KEY=您的 liteLLM 密钥\nKIMI_SERVER_URL=http:\u002F\u002Flitellm-proxy:4000\nKIMI_PROVIDER=moonshot  # 为 LiteLLM 添加模型名称前缀（如 moonshot\u002Fkimi-k2.5）\n```\n\n#### 支持的模型\n\nPentAGI 支持 11 种具备工具调用、流式传输、思考模式和多模态能力的 Kimi\u002FMoonshot 模型。标有 `*` 的模型为默认配置中使用的模型。\n\n**Kimi K2.5 系列 - 高级多模态**\n\n| 模型 ID                   | 思考 | 多模态 | 上下文长度 | 速度      | 价格（输入\u002F输出） | 使用场景                                        |\n| -------------------------- | -------- | ---------- | ------- | ---------- | -------------------- | ----------------------------------------------- |\n| `kimi-k2.5`*               | ✅        | ✅          | 256K    | 标准   | $0.60\u002F$3.00          | 最智能、多功能，支持视觉+文本+代码              |\n\n**Kimi K2 系列 - MoE 基础模型（1T 参数，32B 激活）**\n\n| 模型 ID                   | 思考 | 多模态 | 上下文长度 | 速度      | 价格（输入\u002F输出） | 使用场景                                        |\n| -------------------------- | -------- | ---------- | ------- | ---------- | -------------------- | ----------------------------------------------- |\n| `kimi-k2-0905-preview`*    | ❌        | ❌          | 256K    | 标准   | $0.60\u002F$2.50          | 增强的代理式编程，改进的前端开发                |\n| `kimi-k2-0711-preview`     | ❌        | ❌          | 128K    | 标准   | $0.60\u002F$2.50          | 强大的代码和代理能力                            |\n| `kimi-k2-turbo-preview`*   | ❌        | ❌          | 256K    | Turbo      | $1.15\u002F$8.00          | 高速版本，每秒可生成 60–100 个 token            |\n| `kimi-k2-thinking`         | ✅        | ❌          | 256K    | 标准   | $0.60\u002F$2.50          | 长期思考，多步工具使用                          
|\n| `kimi-k2-thinking-turbo`   | ✅        | ❌          | 256K    | Turbo      | $1.15\u002F$8.00          | 高速思考，深度推理                              |\n\n**Moonshot V1 系列 - 通用文本生成**\n\n| 模型 ID                   | 思考 | 多模态 | 上下文长度 | 速度      | 价格（输入\u002F输出） | 使用场景                                        |\n| -------------------------- | -------- | ---------- | ------- | ---------- | -------------------- | ----------------------------------------------- |\n| `moonshot-v1-8k`           | ❌        | ❌          | 8K      | 标准   | $0.20\u002F$2.00          | 短文本生成，经济高效                            |\n| `moonshot-v1-32k`          | ❌        | ❌          | 32K     | 标准   | $1.00\u002F$3.00          | 长文本生成，平衡性能                            |\n| `moonshot-v1-128k`         | ❌        | ❌          | 128K    | 标准   | $2.00\u002F$5.00          | 超长文本生成，超大上下文                        |\n\n**Moonshot V1 Vision 系列 - 多模态**\n\n| 模型 ID                      | 思考 | 多模态 | 上下文长度 | 速度      | 价格（输入\u002F输出） | 使用场景                                        |\n| ----------------------------- | -------- | ---------- | ------- | ---------- | -------------------- | ----------------------------------------------- |\n| `moonshot-v1-8k-vision-preview`   | ❌        | ✅          | 8K      | 标准   | $0.20\u002F$2.00          | 视觉理解，短上下文                              |\n| `moonshot-v1-32k-vision-preview`  | ❌        | ✅          | 32K     | 标准   | $1.00\u002F$3.00          | 视觉理解，中等上下文                            |\n| `moonshot-v1-128k-vision-preview` | ❌        | ✅          | 128K    | 标准   | $2.00\u002F$5.00          | 视觉理解，长上下文                              |\n\n**价格**：按每 100 万个 token 计算。Turbo 模型以更高的价格提供每秒 60–100 个 token 的输出速度。\n\n**关键特性**：\n- **超长上下文**：最高可达 256K 个 token，适合全面的代码库分析\n- **多模态能力**：视觉模型支持图像理解，可用于截图分析（Kimi K2.5、V1 Vision 系列）\n- **扩展思考**：通过多步工具使用实现深度推理（kimi-k2.5、kimi-k2-thinking 模型）\n- **高速 Turbo**：每秒 60–100 个 token 的输出，适用于实时工作流程（Turbo 版本）\n- **工具调用**：通过函数调用与 20 多种渗透测试工具无缝集成\n- **流式传输**：实时响应流式传输，用于交互式安全评估\n- **多语言支持**：强大的中文和英文语言支持\n- **MoE 架构**：K2 系列采用高效的 1T 总参数架构，其中 32B 参数处于激活状态\n\n**LiteLLM 集成**：设置 `KIMI_PROVIDER=moonshot` 可在使用默认 PentAGI 配置时启用模型名称前缀，配合 LiteLLM 代理使用。若直接使用 API，则留空即可。\n\n### Qwen 提供者配置\n\nPentAGI 集成来自阿里云 Model Studio（DashScope）的 Qwen，提供强大的多语言模型，具备推理能力和上下文缓存支持。\n\n#### 配置变量\n\n| 变量           | 默认值                                          | 描述                                         |\n| ------------------ | ------------------------------------------------------ | --------------------------------------------------- |\n| `QWEN_API_KEY`     |                                                        | Qwen API 密钥，用于身份验证                     |\n| `QWEN_SERVER_URL`  | `https:\u002F\u002Fdashscope-us.aliyuncs.com\u002Fcompatible-mode\u002Fv1` | Qwen API 端点 URL（国际版）                       |\n| `QWEN_PROVIDER`    |                                                        | LiteLLM 集成中的提供者前缀（可选）              |\n\n#### 配置示例\n\n```bash\n# 直接使用 API（全球\u002F美国端点）\nQWEN_API_KEY=您的 qwen_api_key\nQWEN_SERVER_URL=https:\u002F\u002Fdashscope-us.aliyuncs.com\u002Fcompatible-mode\u002Fv1\n\n# 其他端点\nQWEN_SERVER_URL=https:\u002F\u002Fdashscope-intl.aliyuncs.com\u002Fcompatible-mode\u002Fv1  # 国际版（新加坡）\nQWEN_SERVER_URL=https:\u002F\u002Fdashscope.aliyuncs.com\u002Fcompatible-mode\u002Fv1       # 中国大陆版（北京）\n\n# 使用 LiteLLM 代理\nQWEN_API_KEY=您的 liteLLM 密钥\nQWEN_SERVER_URL=http:\u002F\u002Flitellm-proxy:4000\nQWEN_PROVIDER=dashscope  # 为 LiteLLM 添加模型名称前缀（如 dashscope\u002Fqwen-plus）\n```\n\n#### 支持的模型\n\nPentAGI 支持 32 种具备工具调用、流式传输、思考模式和上下文缓存功能的 
Qwen 模型。标有 `*` 的模型为默认配置中使用的模型。\n\n**广泛可用的模型（所有地区）**\n\n| 模型ID                     | 思考能力 | 国际版 | 全球\u002F美国 | 中国 | 价格（输入\u002F输出\u002F缓存） | 使用场景                                        |\n| ---------------------------- | -------- | ---- | --------- | ----- | -------------------------- | ----------------------------------------------- |\n| `qwen3-max`*                 | ✅        | ✅    | ✅         | ✅     | $2.40\u002F$12.00\u002F$0.48         | 旗舰级推理、复杂安全分析                        |\n| `qwen3-max-preview`          | ✅        | ✅    | ✅         | ✅     | $2.40\u002F$12.00\u002F$0.48         | 预览版，具备扩展的思考能力                      |\n| `qwen-max`                   | ❌        | ✅    | ❌         | ✅     | $1.60\u002F$6.40\u002F$0.32          | 强大的指令遵循能力，旧版旗舰模型                |\n| `qwen3.5-plus`*              | ✅        | ✅    | ✅         | ✅     | $0.40\u002F$2.40\u002F$0.08          | 平衡的推理能力、通用对话及代码编写              |\n| `qwen-plus`                  | ✅        | ✅    | ✅         | ✅     | $0.40\u002F$4.00\u002F$0.08          | 高性价比的平衡性能                              |\n| `qwen3.5-flash`*             | ✅        | ✅    | ✅         | ✅     | $0.10\u002F$0.40\u002F$0.02          | 超快速轻量级，高吞吐量                          |\n| `qwen-flash`                 | ❌        | ✅    | ✅         | ✅     | $0.05\u002F$0.40\u002F$0.01          | 带上下文缓存的快速模型，成本优化                |\n| `qwen-turbo`                 | ✅        | ✅    | ❌         | ✅     | $0.05\u002F$0.50\u002F$0.01          | 已弃用，请使用 qwen-flash 代替                  |\n| `qwq-plus`                   | ✅        | ✅    | ❌         | ✅     | $0.80\u002F$2.40\u002F$0.16          | 深度推理、思维链分析                            |\n\n**区域特定模型**\n\n| 模型ID                     | 思考能力 | 国际版 | 全球\u002F美国 | 中国 | 价格（输入\u002F输出\u002F缓存） | 使用场景                                        |\n| ---------------------------- | -------- | ---- | --------- | ----- | -------------------------- | ----------------------------------------------- |\n| `qwen-plus-us`               | ✅        | ❌    | ✅         | ❌     | $0.40\u002F$4.00\u002F$0.08          | 针对美国地区的优化平衡模型                      |\n| `qwen-long-latest`           | ❌        | ❌    | ❌         | ✅     | $0.07\u002F$0.29\u002F$0.01          | 超长上下文（10M tokens）                        |\n\n**开源 - Qwen3.5系列**\n\n| 模型ID                     | 思考能力 | 国际版 | 全球\u002F美国 | 中国 | 价格（输入\u002F输出\u002F缓存） | 使用场景                                        |\n| ---------------------------- | -------- | ---- | --------- | ----- | -------------------------- | ----------------------------------------------- |\n| `qwen3.5-397b-a17b`          | ✅        | ✅    | ✅         | ✅     | $0.60\u002F$3.60\u002F$0.12          | 参数量达397B的超大规模模型，推理能力卓越      |\n| `qwen3.5-122b-a10b`          | ✅        | ✅    | ✅         | ✅     | $0.40\u002F$3.20\u002F$0.08          | 大规模122B参数模型，性能强劲                    |\n| `qwen3.5-27b`                | ✅        | ✅    | ✅         | ✅     | $0.30\u002F$2.40\u002F$0.06          | 中等规模27B参数模型，性能均衡                  |\n| `qwen3.5-35b-a3b`            | ✅        | ✅    | ✅         | ✅     | $0.25\u002F$2.00\u002F$0.05          | 高效的35B模型，其中3B为活跃MoE单元              |\n\n**开源 - Qwen3系列**\n\n| 模型ID                       | 思考能力 | 国际版 | 全球\u002F美国 | 中国 | 价格（输入\u002F输出\u002F缓存） | 使用场景                                        |\n| ------------------------------ | -------- | ---- | --------- | ----- | -------------------------- | ----------------------------------------------- |\n| `qwen3-next-80b-a3b-thinking`  | ✅        | ✅    | ✅         | ✅     | 
$0.15\u002F$1.43\u002F$0.03          | 新一代80B纯思考模式                             |\n| `qwen3-next-80b-a3b-instruct`  | ❌        | ✅    | ✅         | ✅     | $0.15\u002F$1.20\u002F$0.03          | 新一代80B指令跟随模型                           |\n| `qwen3-235b-a22b`              | ✅        | ✅    | ✅         | ✅     | $0.70\u002F$8.40\u002F$0.14          | 双模态235B模型，其中22B为活跃                  |\n| `qwen3-32b`                    | ✅        | ✅    | ✅         | ✅     | $0.29\u002F$2.87\u002F$0.06          | 多功能32B双模态模型                             |\n| `qwen3-30b-a3b`                | ✅        | ✅    | ✅         | ✅     | $0.20\u002F$2.40\u002F$0.04          | 高效的30B MoE架构                                |\n| `qwen3-14b`                    | ✅        | ✅    | ✅         | ✅     | $0.35\u002F$4.20\u002F$0.07          | 中等规模14B，性能与成本的平衡                    |\n| `qwen3-8b`                     | ✅        | ✅    | ✅         | ✅     | $0.18\u002F$2.10\u002F$0.04          | 紧凑型8B，效率优化                              |\n| `qwen3-4b`                     | ✅        | ✅    | ❌         | ✅     | $0.11\u002F$1.26\u002F$0.02          | 轻量级4B，适用于简单任务                       |\n| `qwen3-1.7b`                   | ✅        | ✅    | ❌         | ✅     | $0.11\u002F$1.26\u002F$0.02          | 超紧凑型1.7B，适合基础任务                      |\n| `qwen3-0.6b`                   | ✅        | ✅    | ❌         | ✅     | $0.11\u002F$1.26\u002F$0.02          | 最小的0.6B模型，资源占用极低                    |\n\n**开源 - QwQ & Qwen2.5系列**\n\n| 模型ID                     | 思考能力 | 国际版 | 全球\u002F美国 | 中国 | 价格（输入\u002F输出\u002F缓存） | 使用场景                                        |\n| ---------------------------- | -------- | ---- | --------- | ----- | -------------------------- | ----------------------------------------------- |\n| `qwq-32b`                    | ✅        | ✅    | ✅         | ✅     | $0.29\u002F$0.86\u002F$0.06          | 开源32B推理模型，适用于深度研究                  |\n| `qwen2.5-14b-instruct-1m`    | ❌        | ✅    | ❌         | ✅     | $0.81\u002F$3.22\u002F$0.16          | 上下文长度达1M，14B参数                         |\n| `qwen2.5-7b-instruct-1m`     | ❌        | ✅    | ❌         | ✅     | $0.37\u002F$1.47\u002F$0.07          | 上下文长度达1M，7B参数                          |\n| `qwen2.5-72b-instruct`       | ❌        | ✅    | ❌         | ✅     | $1.40\u002F$5.60\u002F$0.28          | 大规模72B指令跟随模型                           |\n| `qwen2.5-32b-instruct`       | ❌        | ✅    | ❌         | ✅     | $0.70\u002F$2.80\u002F$0.14          | 中等规模32B指令跟随模型                         |\n| `qwen2.5-14b-instruct`       | ❌        | ✅    | ❌         | ✅     | $0.35\u002F$1.40\u002F$0.07          | 紧凑型14B指令跟随模型                           |\n| `qwen2.5-7b-instruct`        | ❌        | ✅    | ❌         | ✅     | $0.18\u002F$0.70\u002F$0.04          | 小规模7B指令跟随模型                           |\n| `qwen2.5-3b-instruct`        | ❌        | ❌    | ❌         | ✅     | $0.04\u002F$0.13\u002F$0.01          | 轻量级3B模型，仅在中国大陆地区可用              |\n\n**价格**：以每100万tokens为单位。缓存定价基于隐式上下文缓存（占输入成本的20%）。支持思考能力的模型在思维链阶段会进行额外的推理计算。\n\n**区域可用性**：\n- **国际**：新加坡区域（`dashscope-intl.aliyuncs.com`）\n- **全球\u002F美国**：美国弗吉尼亚区域（`dashscope-us.aliyuncs.com`）\n- **中国**：中国内地北京区域（`dashscope.aliyuncs.com`）\n\n**核心功能**：\n- **自动上下文缓存**：通过隐式缓存将重复上下文的成本降低30%-50%（仅需输入价格的20%）\n- **扩展思维**：针对复杂安全分析的思维链推理能力（Qwen3-Max、QwQ、Qwen3.5-Plus）\n- **工具调用**：通过函数调用与20余种渗透测试工具无缝集成\n- **流式传输**：实时响应流，适用于交互式工作流\n- **多语言支持**：强大的中文、英文及其他多语言支持\n- 
**超长上下文**：使用qwen-long-latest可支持高达1000万 tokens，适用于大规模代码库分析\n\n**LiteLLM集成**：设置`QWEN_PROVIDER=dashscope`，即可在使用LiteLLM代理的默认PentAGI配置时启用模型名称前缀。若直接使用API，则保持为空。\n\n\n\n## 🔧 高级设置\n\n### Langfuse集成\n\nLangfuse提供先进的AI智能体运行监控与分析能力。\n\n1. 在现有的`.env`文件中配置Langfuse环境变量。\n\n\u003Cdetails>\n    \u003Csummary>Langfuse重要环境变量\u003C\u002Fsummary>\n\n### 数据库凭证\n- `LANGFUSE_POSTGRES_USER` 和 `LANGFUSE_POSTGRES_PASSWORD` - Langfuse PostgreSQL数据库凭证\n- `LANGFUSE_CLICKHOUSE_USER` 和 `LANGFUSE_CLICKHOUSE_PASSWORD` - ClickHouse数据库凭证\n- `LANGFUSE_REDIS_AUTH` - Redis密码\n\n### 加密与安全密钥\n- `LANGFUSE_SALT` - Langfuse Web UI中用于哈希的盐值\n- `LANGFUSE_ENCRYPTION_KEY` - 加密密钥（32字节，十六进制格式）\n- `LANGFUSE_NEXTAUTH_SECRET` - NextAuth使用的密钥\n\n### 管理员凭证\n- `LANGFUSE_INIT_USER_EMAIL` - 管理员邮箱\n- `LANGFUSE_INIT_USER_PASSWORD` - 管理员密码\n- `LANGFUSE_INIT_USER_NAME` - 管理员用户名\n\n### API密钥与令牌\n- `LANGFUSE_INIT_PROJECT_PUBLIC_KEY` - 项目公钥（PentAGI端也会使用）\n- `LANGFUSE_INIT_PROJECT_SECRET_KEY` - 项目私钥（PentAGI端也会使用）\n\n### S3存储\n- `LANGFUSE_S3_ACCESS_KEY_ID` - S3访问密钥ID\n- `LANGFUSE_S3_SECRET_ACCESS_KEY` - S3秘密访问密钥\n\n\u003C\u002Fdetails>\n\n2. 在`.env`文件中为PentAGI服务启用Langfuse集成。\n\n```bash\nLANGFUSE_BASE_URL=http:\u002F\u002Flangfuse-web:3000\nLANGFUSE_PROJECT_ID= # 默认值来自${LANGFUSE_INIT_PROJECT_ID}\nLANGFUSE_PUBLIC_KEY= # 默认值来自${LANGFUSE_INIT_PROJECT_PUBLIC_KEY}\nLANGFUSE_SECRET_KEY= # 默认值来自${LANGFUSE_INIT_PROJECT_SECRET_KEY}\n```\n\n3. 启动Langfuse堆栈：\n\n```bash\ncurl -O https:\u002F\u002Fraw.githubusercontent.com\u002Fvxcontrol\u002Fpentagi\u002Fmaster\u002Fdocker-compose-langfuse.yml\ndocker compose -f docker-compose.yml -f docker-compose-langfuse.yml up -d\n```\n\n访问[localhost:4000](http:\u002F\u002Flocalhost:4000)，使用`.env`文件中的凭据登录Langfuse Web界面：\n\n- `LANGFUSE_INIT_USER_EMAIL` - 管理员邮箱\n- `LANGFUSE_INIT_USER_PASSWORD` - 管理员密码
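\n\n登录之前，也可以先从命令行确认 Langfuse 堆栈已经就绪。下面是一个最小化的检查示意（假设沿用上文的 compose 文件、Web 服务映射在宿主机 4000 端口；`\u002Fapi\u002Fpublic\u002Fhealth` 是 Langfuse 常见的健康检查路径，不同版本可能略有差异）：\n\n```bash\n# 确认 Langfuse 相关容器均处于运行状态\ndocker compose -f docker-compose.yml -f docker-compose-langfuse.yml ps\n\n# 探测 Langfuse Web 的健康检查端点（路径为假设的常见默认值）\ncurl -fsS http:\u002F\u002Flocalhost:4000\u002Fapi\u002Fpublic\u002Fhealth && echo \"Langfuse OK\"\n```\n\n### 监控与可观测性\n\n为了更细致地追踪系统运行情况，可以集成监控工具。\n\n1. 在`.env`文件中为PentAGI启用OpenTelemetry及所有可观测性服务的集成。\n\n```bash\nOTEL_HOST=otelcol:8148\n```\n\n2. 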
启动可观测性堆栈：\n\n```bash\ncurl -O https:\u002F\u002Fraw.githubusercontent.com\u002Fvxcontrol\u002Fpentagi\u002Fmaster\u002Fdocker-compose-observability.yml\ndocker compose -f docker-compose.yml -f docker-compose-observability.yml up -d\n```\n\n访问[localhost:3000](http:\u002F\u002Flocalhost:3000)，即可进入Grafana Web界面。\n\n> [!NOTE]\n> 若希望同时使用Langfuse和可观测性堆栈，需在`.env`文件中设置`LANGFUSE_OTEL_EXPORTER_OTLP_ENDPOINT`为`http:\u002F\u002Fotelcol:4318`。\n>\n> 若要同时启动所有可用堆栈（Langfuse、Graphiti和可观测性）：\n>\n> ```bash\n> docker compose -f docker-compose.yml -f docker-compose-langfuse.yml -f docker-compose-graphiti.yml -f docker-compose-observability.yml up -d\n> ```\n>\n> 您还可以在Shell中为这些命令设置别名，以加快执行速度：\n>\n> ```bash\n> alias pentagi=\"docker compose -f docker-compose.yml -f docker-compose-langfuse.yml -f docker-compose-graphiti.yml -f docker-compose-observability.yml\"\n> alias pentagi-up=\"docker compose -f docker-compose.yml -f docker-compose-langfuse.yml -f docker-compose-graphiti.yml -f docker-compose-observability.yml up -d\"\n> alias pentagi-down=\"docker compose -f docker-compose.yml -f docker-compose-langfuse.yml -f docker-compose-graphiti.yml -f docker-compose-observability.yml down\"\n> ```\n\n### 知识图谱集成（Graphiti）\n\nPentAGI集成了由Neo4j驱动的时间知识图谱系统[Graphiti](https:\u002F\u002Fgithub.com\u002Fvxcontrol\u002Fpentagi-graphiti)，以提供高级语义理解和关系追踪功能，助力AI智能体的运行。vxcontrol分支提供了专为渗透测试设计的自定义实体和边类型。\n\n#### 什么是Graphiti？\n\nGraphiti能够自动从智能体交互中提取并存储结构化知识，构建实体、关系及时间上下文的图谱。这使得以下能力成为可能：\n\n- **语义记忆**：存储并回忆工具、目标、漏洞和技巧之间的关系\n- **情境理解**：跟踪不同渗透测试行为随时间的关联\n- **知识复用**：从过往渗透测试中学习，并将洞见应用于新的评估\n- **高级查询**：搜索复杂模式，例如“哪些工具对类似目标有效？”\n\n#### 启用Graphiti\n\nGraphiti知识图谱是**可选**的，默认情况下未启用。要启用它：\n\n1. 在`.env`文件中配置Graphiti相关环境变量：\n\n```bash\n## Graphiti知识图谱设置\nGRAPHITI_ENABLED=true\nGRAPHITI_TIMEOUT=30\nGRAPHITI_URL=http:\u002F\u002Fgraphiti:8000\nGRAPHITI_MODEL_NAME=gpt-5-mini\n\n# Neo4j设置（Graphiti堆栈使用）\nNEO4J_USER=neo4j\nNEO4J_DATABASE=neo4j\nNEO4J_PASSWORD=devpassword\nNEO4J_URI=bolt:\u002F\u002Fneo4j:7687\n\n# OpenAI API密钥（Graphiti用于实体提取所需）\nOPEN_AI_KEY=your_openai_api_key\n```\n\n2. 运行Graphiti堆栈以及主PentAGI服务：\n\n```bash\n# 如有需要，下载Graphiti的Compose文件\ncurl -O https:\u002F\u002Fraw.githubusercontent.com\u002Fvxcontrol\u002Fpentagi\u002Fmaster\u002Fdocker-compose-graphiti.yml\n\n# 启动包含Graphiti的PentAGI\ndocker compose -f docker-compose.yml -f docker-compose-graphiti.yml up -d\n```\n\n3. 
验证Graphiti是否正常运行：\n\n```bash\n# 检查服务状态\ndocker compose -f docker-compose.yml -f docker-compose-graphiti.yml ps graphiti neo4j\n\n# 查看Graphiti日志\ndocker compose -f docker-compose.yml -f docker-compose-graphiti.yml logs -f graphiti\n\n# 访问Neo4j浏览器（可选）\n# 前往http:\u002F\u002Flocalhost:7474，使用NEO4J_USER\u002FNEO4J_PASSWORD登录\n\n# 访问Graphiti API（可选，用于调试）\n# 访问 http:\u002F\u002Flocalhost:8000\u002Fdocs 获取 Swagger API 文档\n```\n\n> [!NOTE]\n> Graphiti 服务在 `docker-compose-graphiti.yml` 中被定义为一个独立的堆栈。必须同时运行这两个 compose 文件才能启用知识图谱功能。默认使用预构建的 Docker 镜像 `vxcontrol\u002Fgraphiti:latest`。\n\n#### 存储的内容\n\n启用后，PentAGI 会自动捕获：\n\n- **代理响应**：所有代理的推理、分析和决策\n- **工具执行**：执行的命令、使用的工具及其结果\n- **上下文信息**：流程、任务及子任务层级结构\n\n### GitHub 和 Google OAuth 集成\n\n通过与 GitHub 和 Google 的 OAuth 集成，用户可以使用这些平台上的现有账户进行身份验证。这带来了多项优势：\n\n- 简化登录流程，无需创建单独的凭据\n- 通过受信任的身份提供商提升安全性\n- 可访问 GitHub\u002FGoogle 账户中的用户个人资料信息\n- 与现有开发工作流无缝集成\n\n要使用 GitHub OAuth，您需要在 GitHub 账户中创建一个新的 OAuth 应用程序，并在 `.env` 文件中设置 `OAUTH_GITHUB_CLIENT_ID` 和 `OAUTH_GITHUB_CLIENT_SECRET`。\n\n要使用 Google OAuth，您需要在 Google 账户中创建一个新的 OAuth 应用程序，并在 `.env` 文件中设置 `OAUTH_GOOGLE_CLIENT_ID` 和 `OAUTH_GOOGLE_CLIENT_SECRET`。\n\n### Docker 镜像配置\n\nPentAGI 允许您配置用于执行各种任务的 Docker 镜像选择。系统会根据任务类型自动选择最合适的镜像，但您可以通过指定首选镜像来限制这一选择：\n\n| 变量                           | 默认值                | 描述                                                 |\n| ---------------------------------- | ---------------------- | ----------------------------------------------------------- |\n| `DOCKER_DEFAULT_IMAGE`             | `debian:latest`        | 通用任务及不确定情况下的默认 Docker 镜像  |\n| `DOCKER_DEFAULT_IMAGE_FOR_PENTEST` | `vxcontrol\u002Fkali-linux` | 安全\u002F渗透测试任务的默认 Docker 镜像 |\n\n当设置这些环境变量后，AI 代理将仅限于您指定的镜像选项。这对于以下场景特别有用：\n\n- **安全策略实施**：仅允许使用经过验证的可信镜像\n- **环境标准化**：在所有操作中统一使用企业或自定义镜像\n- **性能优化**：利用已安装必要工具的预构建镜像\n\n配置示例：\n\n```bash\n# 为通用任务使用自定义镜像\nDOCKER_DEFAULT_IMAGE=mycompany\u002Fcustom-debian:latest\n\n# 为渗透测试使用专用镜像\nDOCKER_DEFAULT_IMAGE_FOR_PENTEST=mycompany\u002Fpentest-tools:v2.0\n```\n\n> [!NOTE]\n> 如果用户在其任务中明确指定了特定的 Docker 镜像，系统将优先使用该镜像，而忽略这些设置。这些变量仅影响系统的自动镜像选择过程。\n\n## 💻 开发\n\n### 开发要求\n\n- golang\n- nodejs\n- docker\n- postgres\n- commitlint\n\n### 环境设置\n\n#### 后端设置\n\n运行一次 `cd backend && go mod download` 来安装所需的包。\n\n要生成 swagger 文件，请先通过以下命令安装 `swag` 包：\n\n```bash\ngo install github.com\u002Fswaggo\u002Fswag\u002Fcmd\u002Fswag@v1.8.7\n```\n\n然后运行：\n\n```bash\nswag init -g ..\u002F..\u002Fpkg\u002Fserver\u002Frouter.go -o pkg\u002Fserver\u002Fdocs\u002F --parseDependency --parseInternal --parseDepth 2 -d cmd\u002Fpentagi\n```\n\n要生成 graphql 解析器文件，需运行\n\n```bash\ngo run github.com\u002F99designs\u002Fgqlgen --config .\u002Fgqlgen\u002Fgqlgen.yml\n```\n\n之后，您可以在 `pkg\u002Fgraph` 文件夹中看到生成的文件。\n\n要从 sqlc 配置生成 ORM 方法（数据库包），可运行以下命令：\n\n```bash\ndocker run --rm -v $(pwd):\u002Fsrc -w \u002Fsrc --network pentagi-network -e DATABASE_URL=\"{URL}\" sqlc\u002Fsqlc:1.27.0 generate -f sqlc\u002Fsqlc.yml\n```\n\n要从 OpenAPI 规范生成 Langfuse SDK，请先安装 fern-cli：\n\n```bash\nnpm install -g fern-api\n```\n\n然后运行以下命令：\n\n```bash\nfern generate --local\n```\n\n#### 测试\n\n运行测试时，请进入 `backend` 目录并执行：`cd backend && go test -v .\u002F...`\n\n#### 前端设置\n\n运行一次 `cd frontend && npm install` 来安装所需的包。\n\n要生成 graphql 文件，需运行 `npm run graphql:generate`，该命令会使用 `graphql-codegen.ts` 文件。\n\n请确保您已全局安装 `graphql-codegen`：\n\n```bash\nnpm install -g graphql-codegen\n```\n\n之后，您可以运行：\n* `npm run prettier` 检查代码格式是否正确\n* `npm run prettier:fix` 自动修复格式问题\n* `npm run lint` 检查代码是否符合规范\n* `npm run lint:fix` 自动修正不符合规范的地方
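\n\n例如，在提交代码前可以将上述脚本串联执行，一次性完成格式化与规范检查（仅为基于上述脚本的组合示意）：\n\n```bash\n# 先自动修复格式与可修复的规范问题，再做一次最终检查\ncd frontend\nnpm run prettier:fix && npm run lint:fix && npm run lint\n```\n\n要生成 SSL 证书，需运行 `npm run ssl:generate`，该命令会使用 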
`generate-ssl.ts` 文件；或者在您运行 `npm run dev` 时，SSL 证书将自动生成。\n\n#### 后端配置\n\n编辑 `.vscode\u002Flaunch.json` 文件中的后端配置：\n- `DATABASE_URL`：PostgreSQL 数据库 URL（例如 `postgres:\u002F\u002Fpostgres:postgres@localhost:5432\u002Fpentagidb?sslmode=disable`）\n- `DOCKER_HOST`：Docker SDK API（例如，在 macOS 上为 `DOCKER_HOST=unix:\u002F\u002F\u002FUsers\u002F\u003Cmy-user>\u002FLibrary\u002FContainers\u002Fcom.docker.docker\u002FData\u002Fdocker.raw.sock`）[更多信息](https:\u002F\u002Fstackoverflow.com\u002Fa\u002F62757128\u002F5922857)\n\n可选：\n- `SERVER_PORT`：服务器运行端口（默认：`8443`）\n- `SERVER_USE_SSL`：启用服务器 SSL（默认：`false`）\n\n#### 前端配置\n\n编辑 `.vscode\u002Flaunch.json` 文件中的前端配置：\n- `VITE_API_URL`：后端 API URL。请省略协议部分（例如，`localhost:8080` *而不是* `http:\u002F\u002Flocalhost:8080`）\n- `VITE_USE_HTTPS`：启用服务器 SSL（默认：`false`）\n- `VITE_PORT`：服务器运行端口（默认：`8000`）\n- `VITE_HOST`：服务器运行主机（默认：`0.0.0.0`）\n\n### 运行应用\n\n#### 后端\n\n在 `backend` 文件夹中运行以下命令：\n- 使用 `.env` 文件设置环境变量，例如 `source .env`\n- 运行 `go run cmd\u002Fpentagi\u002Fmain.go` 启动服务器\n\n> [!NOTE]\n> 第一次运行可能需要一些时间，因为需要下载依赖项和 Docker 镜像来搭建后端环境。\n\n#### 前端\n\n在 `frontend` 文件夹中运行以下命令：\n- 运行 `npm install` 安装依赖\n- 运行 `npm run dev` 启动 Web 应用\n- 运行 `npm run build` 构建 Web 应用\n\n打开浏览器并访问 Web 应用的 URL。\n\n## 测试 LLM 代理\n\nPentAGI 内置了一个强大的工具 `ctester`，用于测试和验证 LLM 代理的能力。该工具可以帮助您确保 LLM 提供商的配置能够正确地与不同类型的代理配合使用，从而为每个特定角色的代理优化模型选择。\n该工具支持多代理并行测试、详细的报告输出以及灵活的配置选项。\n\n### 核心功能\n\n- **并行测试**：同时测试多个智能体，以加快测试速度\n- **全面的测试套件**：评估基础完成度、JSON响应、函数调用以及渗透测试知识\n- **详细报告**：生成包含成功率和性能指标的 Markdown 报告\n- **灵活配置**：可根据需要测试特定智能体或测试组\n- **专用测试组**：包含针对网络安全和渗透测试场景的领域特定测试\n\n### 使用场景\n\n#### 针对开发者（本地 Go 环境）\n\n如果您已克隆仓库并安装了 Go：\n\n```bash\n# 使用 .env 文件的默认配置\ncd backend\ngo run cmd\u002Fctester\u002F*.go -verbose\n\n# 自定义提供商配置\ngo run cmd\u002Fctester\u002F*.go -config ..\u002Fexamples\u002Fconfigs\u002Fopenrouter.provider.yml -verbose\n\n# 生成报告文件\ngo run cmd\u002Fctester\u002F*.go -config ..\u002Fexamples\u002Fconfigs\u002Fdeepinfra.provider.yml -report ..\u002Ftest-report.md\n\n# 仅测试特定类型的智能体\ngo run cmd\u002Fctester\u002F*.go -agents simple,simple_json,primary_agent -verbose\n\n# 仅测试特定的测试组\ngo run cmd\u002Fctester\u002F*.go -groups basic,advanced -verbose\n```\n\n#### 针对用户（使用 Docker 镜像）\n\n如果您更倾向于使用预构建的 Docker 镜像而无需搭建开发环境：\n\n```bash\n# 使用 Docker 运行默认环境下的测试\ndocker run --rm -v $(pwd)\u002F.env:\u002Fopt\u002Fpentagi\u002F.env vxcontrol\u002Fpentagi \u002Fopt\u002Fpentagi\u002Fbin\u002Fctester -verbose\n\n# 使用自定义提供商配置进行测试\ndocker run --rm \\\n  -v $(pwd)\u002F.env:\u002Fopt\u002Fpentagi\u002F.env \\\n  -v $(pwd)\u002Fmy-config.yml:\u002Fopt\u002Fpentagi\u002Fconfig.yml \\\n  vxcontrol\u002Fpentagi \u002Fopt\u002Fpentagi\u002Fbin\u002Fctester -config \u002Fopt\u002Fpentagi\u002Fconfig.yml -agents simple,primary_agent,coder -verbose\n\n# 生成详细报告\ndocker run --rm \\\n  -v $(pwd)\u002F.env:\u002Fopt\u002Fpentagi\u002F.env \\\n  -v $(pwd):\u002Fopt\u002Fpentagi\u002Foutput \\\n  vxcontrol\u002Fpentagi \u002Fopt\u002Fpentagi\u002Fbin\u002Fctester -report \u002Fopt\u002Fpentagi\u002Foutput\u002Freport.md\n```\n\n#### 使用预配置的提供商\n\nDocker 镜像内置支持主流提供商（OpenAI、Anthropic、Gemini、Ollama），并提供针对其他服务的预配置提供商文件（OpenRouter、DeepInfra、DeepSeek、Moonshot、Novita）：\n\n```bash\n# 使用 OpenRouter 配置进行测试\ndocker run --rm \\\n  -v $(pwd)\u002F.env:\u002Fopt\u002Fpentagi\u002F.env \\\n  vxcontrol\u002Fpentagi \u002Fopt\u002Fpentagi\u002Fbin\u002Fctester -config \u002Fopt\u002Fpentagi\u002Fconf\u002Fopenrouter.provider.yml\n\n# 使用 DeepInfra 配置进行测试\ndocker run --rm \\\n  -v $(pwd)\u002F.env:\u002Fopt\u002Fpentagi\u002F.env \\\n  vxcontrol\u002Fpentagi 
\u002Fopt\u002Fpentagi\u002Fbin\u002Fctester -config \u002Fopt\u002Fpentagi\u002Fconf\u002Fdeepinfra.provider.yml\n\n# 使用 DeepSeek 配置进行测试\ndocker run --rm \\\n  -v $(pwd)\u002F.env:\u002Fopt\u002Fpentagi\u002F.env \\\n  vxcontrol\u002Fpentagi \u002Fopt\u002Fpentagi\u002Fbin\u002Fctester -provider deepseek\n\n# 使用 GLM 配置进行测试\ndocker run --rm \\\n  -v $(pwd)\u002F.env:\u002Fopt\u002Fpentagi\u002F.env \\\n  vxcontrol\u002Fpentagi \u002Fopt\u002Fpentagi\u002Fbin\u002Fctester -provider glm\n\n# 使用 Kimi 配置进行测试\ndocker run --rm \\\n  -v $(pwd)\u002F.env:\u002Fopt\u002Fpentagi\u002F.env \\\n  vxcontrol\u002Fpentagi \u002Fopt\u002Fpentagi\u002Fbin\u002Fctester -provider kimi\n\n# 使用 Qwen 配置进行测试\ndocker run --rm \\\n  -v $(pwd)\u002F.env:\u002Fopt\u002Fpentagi\u002F.env \\\n  vxcontrol\u002Fpentagi \u002Fopt\u002Fpentagi\u002Fbin\u002Fctester -provider qwen\n\n# 使用 DeepSeek 的自定义提供商配置文件进行测试\ndocker run --rm \\\n  -v $(pwd)\u002F.env:\u002Fopt\u002Fpentagi\u002F.env \\\n  vxcontrol\u002Fpentagi \u002Fopt\u002Fpentagi\u002Fbin\u002Fctester -config \u002Fopt\u002Fpentagi\u002Fconf\u002Fdeepseek.provider.yml\n\n# 使用 Moonshot 的自定义提供商配置文件进行测试\ndocker run --rm \\\n  -v $(pwd)\u002F.env:\u002Fopt\u002Fpentagi\u002F.env \\\n  vxcontrol\u002Fpentagi \u002Fopt\u002Fpentagi\u002Fbin\u002Fctester -config \u002Fopt\u002Fpentagi\u002Fconf\u002Fmoonshot.provider.yml\n\n# 使用 Novita 配置进行测试\ndocker run --rm \\\n  -v $(pwd)\u002F.env:\u002Fopt\u002Fpentagi\u002F.env \\\n  vxcontrol\u002Fpentagi \u002Fopt\u002Fpentagi\u002Fbin\u002Fctester -config \u002Fopt\u002Fpentagi\u002Fconf\u002Fnovita.provider.yml\n\n# 使用 OpenAI 配置进行测试\ndocker run --rm \\\n  -v $(pwd)\u002F.env:\u002Fopt\u002Fpentagi\u002F.env \\\n  vxcontrol\u002Fpentagi \u002Fopt\u002Fpentagi\u002Fbin\u002Fctester -type openai\n\n# 使用 Anthropic 配置进行测试\ndocker run --rm \\\n  -v $(pwd)\u002F.env:\u002Fopt\u002Fpentagi\u002F.env \\\n  vxcontrol\u002Fpentagi \u002Fopt\u002Fpentagi\u002Fbin\u002Fctester -type anthropic\n\n# 使用 Gemini 配置进行测试\ndocker run --rm \\\n  -v $(pwd)\u002F.env:\u002Fopt\u002Fpentagi\u002F.env \\\n  vxcontrol\u002Fpentagi \u002Fopt\u002Fpentagi\u002Fbin\u002Fctester -type gemini\n\n# 使用 AWS Bedrock 配置进行测试\ndocker run --rm \\\n  -v $(pwd)\u002F.env:\u002Fopt\u002Fpentagi\u002F.env \\\n  vxcontrol\u002Fpentagi \u002Fopt\u002Fpentagi\u002Fbin\u002Fctester -type bedrock\n\n# 使用自定义 OpenAI 配置进行测试\ndocker run --rm \\\n  -v $(pwd)\u002F.env:\u002Fopt\u002Fpentagi\u002F.env \\\n  vxcontrol\u002Fpentagi \u002Fopt\u002Fpentagi\u002Fbin\u002Fctester -config \u002Fopt\u002Fpentagi\u002Fconf\u002Fcustom-openai.provider.yml\n\n# 使用 Ollama 配置进行本地推理测试\ndocker run --rm \\\n  -v $(pwd)\u002F.env:\u002Fopt\u002Fpentagi\u002F.env \\\n  vxcontrol\u002Fpentagi \u002Fopt\u002Fpentagi\u002Fbin\u002Fctester -config \u002Fopt\u002Fpentagi\u002Fconf\u002Follama-llama318b.provider.yml\n\n# 使用 Ollama Qwen3 32B 配置进行测试（需自定义模型）\ndocker run --rm \\\n  -v $(pwd)\u002F.env:\u002Fopt\u002Fpentagi\u002F.env \\\n  vxcontrol\u002Fpentagi \u002Fopt\u002Fpentagi\u002Fbin\u002Fctester -config \u002Fopt\u002Fpentagi\u002Fconf\u002Follama-qwen332b-fp16-tc.provider.yml\n\n# 使用 Ollama QwQ 32B 配置进行测试（需自定义模型且需要 71.3GB 显存）\ndocker run --rm \\\n  -v $(pwd)\u002F.env:\u002Fopt\u002Fpentagi\u002F.env \\\n  vxcontrol\u002Fpentagi \u002Fopt\u002Fpentagi\u002Fbin\u002Fctester -config \u002Fopt\u002Fpentagi\u002Fconf\u002Follama-qwq32b-fp16-tc.provider.yml\n```\n\n要使用这些配置，您的 `.env` 文件只需包含以下内容：\n\n```\nLLM_SERVER_URL=https:\u002F\u002Fopenrouter.ai\u002Fapi\u002Fv1      # 或 
https:\u002F\u002Fapi.deepinfra.com\u002Fv1\u002Fopenai 或 https:\u002F\u002Fapi.openai.com\u002Fv1 或 https:\u002F\u002Fapi.novita.ai\u002Fopenai\nLLM_SERVER_KEY=your_api_key\nLLM_SERVER_MODEL=                                # 留空，因为模型已在配置中指定\nLLM_SERVER_CONFIG_PATH=\u002Fopt\u002Fpentagi\u002Fconf\u002Fopenrouter.provider.yml  # 或 deepinfra.provider.yml 或 custom-openai.provider.yml 或 novita.provider.yml\nLLM_SERVER_PROVIDER=                             # LiteLLM 代理的提供商名称（如 openrouter、deepseek、moonshot、novita）\nLLM_SERVER_LEGACY_REASONING=false                # 控制推理格式，对于 OpenAI 必须为 true（默认：false）\nLLM_SERVER_PRESERVE_REASONING=false              # 在多轮对话中保留推理内容（Moonshot 要求，默认：false）\n\n# 对于 OpenAI（官方 API）\nOPEN_AI_KEY=your_openai_api_key                  # 您的 OpenAI API 密钥\nOPEN_AI_SERVER_URL=https:\u002F\u002Fapi.openai.com\u002Fv1     # OpenAI API 端点\n\n# 对于 Anthropic（Claude 模型）\nANTHROPIC_API_KEY=your_anthropic_api_key         # 您的 Anthropic API 密钥\nANTHROPIC_SERVER_URL=https:\u002F\u002Fapi.anthropic.com\u002Fv1  # Anthropic API 端点\n\n# 对于 Gemini（Google AI）\nGEMINI_API_KEY=your_gemini_api_key               # 您的 Google AI API 密钥\nGEMINI_SERVER_URL=https:\u002F\u002Fgenerativelanguage.googleapis.com  # Google AI API 端点\n\n# 对于 AWS Bedrock（企业级基础模型）\nBEDROCK_REGION=us-east-1                         # AWS 区域，用于 Bedrock 服务\n\n# 身份验证（选择一种方法，优先级：DefaultAuth > BearerToken > AccessKey）：\nBEDROCK_DEFAULT_AUTH=false                       # 使用 AWS SDK 凭证链（环境变量、EC2 角色、~\u002F.aws\u002Fcredentials）\nBEDROCK_BEARER_TOKEN=                            # Bearer 令牌身份验证（优先于静态凭证）\nBEDROCK_ACCESS_KEY_ID=your_aws_access_key        # AWS 访问密钥 ID（静态凭证）\nBEDROCK_SECRET_ACCESS_KEY=your_aws_secret_key    # AWS 秘密访问密钥（静态凭证）\nBEDROCK_SESSION_TOKEN=                           # AWS 会话令牌（可选，用于使用静态认证的临时凭证）\nBEDROCK_SERVER_URL=                              # 可选的自定义 Bedrock 终端节点（VPC 终端节点、本地测试）\n\n# 用于 Ollama（本地服务器或云）\nOLLAMA_SERVER_URL=                               # 本地：http:\u002F\u002Follama-server:11434，云：https:\u002F\u002Follama.com\nOLLAMA_SERVER_API_KEY=                           # Ollama Cloud 所需（https:\u002F\u002Follama.com\u002Fsettings\u002Fkeys），本地则留空\nOLLAMA_SERVER_MODEL=\nOLLAMA_SERVER_CONFIG_PATH=\nOLLAMA_SERVER_PULL_MODELS_TIMEOUT=\nOLLAMA_SERVER_PULL_MODELS_ENABLED=\nOLLAMA_SERVER_LOAD_MODELS_ENABLED=\n\n# 用于 DeepSeek（具有强大推理能力的中文 AI）\nDEEPSEEK_API_KEY=                                # DeepSeek API 密钥\nDEEPSEEK_SERVER_URL=https:\u002F\u002Fapi.deepseek.com     # DeepSeek API 终端节点\nDEEPSEEK_PROVIDER=                               # 可选：LiteLLM 前缀（例如 'deepseek'）\n\n# 用于 GLM（智谱 AI）\nGLM_API_KEY=                                     # GLM API 密钥\nGLM_SERVER_URL=https:\u002F\u002Fapi.z.ai\u002Fapi\u002Fpaas\u002Fv4      # GLM API 国际终端节点\nGLM_PROVIDER=                                    # 可选：LiteLLM 前缀（例如 'zai'）\n\n# 用于 Kimi（月之暗面 AI）\nKIMI_API_KEY=                                    # Kimi API 密钥\nKIMI_SERVER_URL=https:\u002F\u002Fapi.moonshot.ai\u002Fv1       # Kimi API 国际终端节点\nKIMI_PROVIDER=                                   # 可选：LiteLLM 前缀（例如 'moonshot'）\n\n# 用于 Qwen（阿里云 DashScope）\nQWEN_API_KEY=                                    # Qwen API 密钥\nQWEN_SERVER_URL=https:\u002F\u002Fdashscope-us.aliyuncs.com\u002Fcompatible-mode\u002Fv1  # Qwen API 美国终端节点\nQWEN_PROVIDER=                                   # 可选：LiteLLM 前缀（例如 'dashscope'）\n\n# 对于 
Ollama（本地推理），请使用上述变量\nOLLAMA_SERVER_URL=http:\u002F\u002Flocalhost:11434\nOLLAMA_SERVER_MODEL=llama3.1:8b-instruct-q8_0\nOLLAMA_SERVER_CONFIG_PATH=\u002Fopt\u002Fpentagi\u002Fconf\u002Follama-llama318b.provider.yml\nOLLAMA_SERVER_PULL_MODELS_ENABLED=false\nOLLAMA_SERVER_LOAD_MODELS_ENABLED=false\n```\n\n#### 使用未验证组织的 OpenAI\n\n对于未验证组织且无法访问最新推理模型（o1、o3、o4-mini）的 OpenAI 账户，需要使用自定义配置。\n\n要为未验证组织账户使用 OpenAI，请按如下方式配置 `.env` 文件：\n\n```bash\nLLM_SERVER_URL=https:\u002F\u002Fapi.openai.com\u002Fv1\nLLM_SERVER_KEY=your_openai_api_key\nLLM_SERVER_MODEL=                                # 留空，模型将在配置中指定\nLLM_SERVER_CONFIG_PATH=\u002Fopt\u002Fpentagi\u002Fconf\u002Fcustom-openai.provider.yml\nLLM_SERVER_LEGACY_REASONING=true                 # OpenAI 推理格式所需\n```\n\n此配置使用预建的 `custom-openai.provider.yml` 文件，将所有代理类型映射到未验证组织可用的模型，使用 `o3-mini` 替代 `o1`、`o3` 和 `o4-mini` 等模型。\n\n您可以通过以下命令测试此配置：\n\n```bash\n# 使用针对未验证账户的自定义 OpenAI 配置进行测试\ndocker run --rm \\\n  -v $(pwd)\u002F.env:\u002Fopt\u002Fpentagi\u002F.env \\\n  vxcontrol\u002Fpentagi \u002Fopt\u002Fpentagi\u002Fbin\u002Fctester -config \u002Fopt\u002Fpentagi\u002Fconf\u002Fcustom-openai.provider.yml\n```\n\n> [!NOTE]\n> `LLM_SERVER_LEGACY_REASONING=true` 设置对于 OpenAI 兼容性至关重要，因为它确保推理参数以 OpenAI API 所期望的格式发送。\n\n#### 使用 LiteLLM 代理\n\n当使用 LiteLLM 代理访问各种 LLM 提供商时，模型名称会加上提供商前缀（例如 `moonshot\u002Fkimi-k2.5` 而不是 `kimi-k2.5`）。为了使相同的提供商配置文件既能用于直接 API 访问，也能用于 LiteLLM 代理，需设置 `LLM_SERVER_PROVIDER` 变量：\n\n```bash\n# 直接访问 Moonshot API\nLLM_SERVER_URL=https:\u002F\u002Fapi.moonshot.ai\u002Fv1\nLLM_SERVER_KEY=your_moonshot_api_key\nLLM_SERVER_CONFIG_PATH=\u002Fopt\u002Fpentagi\u002Fconf\u002Fmoonshot.provider.yml\nLLM_SERVER_PROVIDER=                             # 直接访问时留空\n\n# 通过 LiteLLM 代理访问\nLLM_SERVER_URL=http:\u002F\u002Flitellm-proxy:4000\nLLM_SERVER_KEY=your_litellm_api_key\nLLM_SERVER_CONFIG_PATH=\u002Fopt\u002Fpentagi\u002Fconf\u002Fmoonshot.provider.yml\nLLM_SERVER_PROVIDER=moonshot                     # LiteLLM 的提供商前缀\n```\n\n设置 `LLM_SERVER_PROVIDER=moonshot` 后，系统会自动在配置文件中的所有模型名称前加上 `moonshot\u002F` 前缀，使其与 LiteLLM 的模型命名规范兼容。\n\n**LiteLLM 提供商名称映射：**\n\n使用 LiteLLM 代理时，设置相应的 `*_PROVIDER` 变量以启用模型前缀：\n\n- `deepseek` - 用于 DeepSeek 模型（`DEEPSEEK_PROVIDER=deepseek` → `deepseek\u002Fdeepseek-chat`）\n- `zai` - 用于 GLM 模型（`GLM_PROVIDER=zai` → `zai\u002Fglm-4`）\n- `moonshot` - 用于 Kimi 模型（`KIMI_PROVIDER=moonshot` → `moonshot\u002Fkimi-k2.5`）\n- `dashscope` - 用于 Qwen 模型（`QWEN_PROVIDER=dashscope` → `dashscope\u002Fqwen-plus`）\n- `openai`、`anthropic`、`gemini` - 用于主要的云提供商\n- `openrouter` - 用于 OpenRouter 聚合器\n- `deepinfra` - 用于 DeepInfra 托管\n- `novita` - 用于 Novita AI\n- 您的 LiteLLM 实例中配置的任何其他提供商名称\n\n**LiteLLM 示例：**\n```bash\n# 通过 LiteLLM 代理使用 DeepSeek 模型，并添加模型前缀\nDEEPSEEK_API_KEY=your_litellm_proxy_key\nDEEPSEEK_SERVER_URL=http:\u002F\u002Flitellm-proxy:4000\nDEEPSEEK_PROVIDER=deepseek  # 模型变为 deepseek\u002Fdeepseek-chat、deepseek\u002Fdeepseek-reasoner，适用于 LiteLLM\n\n# 直接使用 DeepSeek API（无需前缀）\nDEEPSEEK_API_KEY=your_deepseek_api_key\nDEEPSEEK_SERVER_URL=https:\u002F\u002Fapi.deepseek.com\n# DEEPSEEK_PROVIDER 留空\n```\n\n这种方法允许您：\n- 对于直接访问和代理访问使用相同的配置文件\n- 在不修改配置文件的情况下切换提供商\n- 轻松测试 LiteLLM 的不同路由策略
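\n\n作为参考，下面给出 LiteLLM 代理端的一个最小配置示意（文件名、端口与环境变量名均为假设；`model_list` 写法与 `os.environ\u002F` 取值语法来自 LiteLLM 的常见用法，具体请以 LiteLLM 文档为准）：\n\n```bash\n# 生成一份最小的 LiteLLM 代理配置（示意）\ncat > litellm-config.yaml \u003C\u003C'EOF'\nmodel_list:\n  - model_name: moonshot\u002Fkimi-k2.5            # PentAGI 侧看到的带前缀模型名\n    litellm_params:\n      model: moonshot\u002Fkimi-k2.5\n      api_key: os.environ\u002FMOONSHOT_API_KEY\n  - model_name: deepseek\u002Fdeepseek-chat\n    litellm_params:\n      model: deepseek\u002Fdeepseek-chat\n      api_key: os.environ\u002FDEEPSEEK_API_KEY\nEOF\n\n# 启动代理，监听 4000 端口（与上文示例中的 litellm-proxy:4000 对应）\nlitellm --config litellm-config.yaml --port 4000\n```\n\n#### 在生产环境中运行测试\n\n如果您已经有一个正在运行的 PentAGI 容器，并希望测试当前配置：\n\n```bash\n# 在现有容器中使用当前环境变量运行 ctester\ndocker exec -it pentagi \u002Fopt\u002Fpentagi\u002Fbin\u002Fctester -verbose\n\n# 使用确定性顺序测试特定代理类型\ndocker exec -it pentagi \u002Fopt\u002Fpentagi\u002Fbin\u002Fctester -agents simple,primary_agent,pentester -groups basic,knowledge -verbose\n\n# 在容器内生成报告文件\ndocker exec -it pentagi 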
\u002Fopt\u002Fpentagi\u002Fbin\u002Fctester -report \u002Fopt\u002Fpentagi\u002Fdata\u002Fagent-test-report.md\n\n# 从主机访问报告\ndocker cp pentagi:\u002Fopt\u002Fpentagi\u002Fdata\u002Fagent-test-report.md .\u002F\n```\n\n### 命令行选项\n\n该工具接受多个选项：\n\n- `-env \u003Cpath>` - 环境文件路径（默认：`.env`）\n- `-type \u003Cprovider>` - 提供商类型：`custom`、`openai`、`anthropic`、`ollama`、`bedrock`、`gemini`（默认：`custom`）\n- `-config \u003Cpath>` - 自定义提供商配置文件路径（默认：来自环境变量 `LLM_SERVER_CONFIG_PATH`）\n- `-tests \u003Cpath>` - 自定义测试 YAML 文件路径（可选）\n- `-report \u003Cpath>` - 报告文件输出路径（可选）\n- `-agents \u003Clist>` - 要测试的代理类型逗号分隔列表（默认：`all`）\n- `-groups \u003Clist>` - 要运行的测试组逗号分隔列表（默认：`all`）\n- `-verbose` - 启用详细输出，显示每个代理的详细测试结果\n\n### 可用代理类型\n\n代理将按以下确定性顺序进行测试：\n\n1. **simple** - 基本完成任务\n2. **simple_json** - JSON 结构化响应\n3. **primary_agent** - 主要推理代理\n4. **assistant** - 交互式助手模式\n5. **generator** - 内容生成\n6. **refiner** - 内容精炼与改进\n7. **adviser** - 专家建议与咨询\n8. **reflector** - 自我反思与分析\n9. **searcher** - 信息收集与搜索\n10. **enricher** - 数据丰富与扩展\n11. **coder** - 代码生成与分析\n12. **installer** - 安装与设置任务\n13. **pentester** - 渗透测试与安全评估\n\n### 可用测试组\n\n- **basic** - 基础完成与提示响应测试\n- **advanced** - 复杂推理与函数调用测试\n- **json** - JSON 格式验证与结构测试（专为 `simple_json` 代理设计）\n- **knowledge** - 领域特定的网络安全与渗透测试知识测试\n\n> **注意**：`json` 测试组专为 `simple_json` 代理类型设计，而其他所有代理则会使用 `basic`、`advanced` 和 `knowledge` 组进行测试。这种专业化确保了对每个代理预期用途的最佳测试覆盖。\n\n### 示例提供商配置\n\n提供商配置定义了不同代理类型应使用的模型：\n\n```yaml\nsimple:\n  model: \"provider\u002Fmodel-name\"\n  temperature: 0.7\n  top_p: 0.95\n  n: 1\n  max_tokens: 4000\n\nsimple_json:\n  model: \"provider\u002Fmodel-name\"\n  temperature: 0.7\n  top_p: 1.0\n  n: 1\n  max_tokens: 4000\n  json: true\n\n# ... 其他代理类型 ...\n```\n\n### 优化工作流程\n\n1. **创建基线**：使用默认配置运行测试，以建立基准性能\n2. **分析代理特定性能**：查看确定性的代理排序，找出表现不佳的代理\n3. **测试专用配置**：针对每个代理类型，使用提供商特定的配置尝试不同的模型\n4. **关注领域知识**：特别注意网络安全专业知识的知识组测试\n5. **验证函数调用**：确保关键代理类型的工具测试能够持续通过\n6. **比较结果**：寻找在所有测试组中最佳的成功率和性能\n7. 
此工具有助于确保您的 AI 代理为其特定任务使用最有效的模型，从而提高可靠性并优化成本。\n\n## 嵌入配置与测试\n\nPentAGI 使用向量嵌入进行语义搜索、知识存储和记忆管理。该系统支持多种嵌入提供商，可根据您的需求和偏好进行配置。\n\n### 支持的嵌入提供商\n\nPentAGI 支持以下嵌入提供商：\n\n- **OpenAI**（默认）：使用 OpenAI 的文本嵌入模型\n- **Ollama**：通过 Ollama 使用本地嵌入模型\n- **Mistral**：Mistral AI 的嵌入模型\n- **Jina**：Jina AI 的嵌入服务\n- **HuggingFace**：来自 HuggingFace 的模型\n- **GoogleAI**：Google 的嵌入模型\n- **VoyageAI**：VoyageAI 的嵌入模型\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>嵌入提供商配置\u003C\u002Fb>（点击展开）\u003C\u002Fsummary>\n\n### 环境变量\n\n要配置嵌入提供商，请在 `.env` 文件中设置以下环境变量：\n\n```bash\n# 主要嵌入配置\nEMBEDDING_PROVIDER=openai       # 提供商类型（openai、ollama、mistral、jina、huggingface、googleai、voyageai）\nEMBEDDING_MODEL=text-embedding-3-small  # 要使用的模型名称\nEMBEDDING_URL=                  # 可选的自定义 API 端点\nEMBEDDING_KEY=                  # 提供商的 API 密钥（如果需要）\nEMBEDDING_BATCH_SIZE=100        # 每批处理的文档数量\nEMBEDDING_STRIP_NEW_LINES=true  # 是否在嵌入前移除文本中的换行符\n\n# 高级设置\nPROXY_URL=                      # 所有 API 调用的可选代理\nHTTP_CLIENT_TIMEOUT=600         # 外部 API 调用的超时时间（秒）（默认：600，0 表示无超时）\n\n# SSL\u002FTLS 证书配置（用于与 LLM 后端及工具服务器的外部通信）\nEXTERNAL_SSL_CA_PATH=           # 容器内自定义 CA 证书文件路径（PEM 格式）\n                                # 必须指向 \u002Fopt\u002Fpentagi\u002Fssl\u002F 目录（例如：\u002Fopt\u002Fpentagi\u002Fssl\u002Fca-bundle.pem）\nEXTERNAL_SSL_INSECURE=false     # 跳过证书验证（仅用于测试）\n```\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>如何添加自定义 CA 证书\u003C\u002Fb>（点击展开）\u003C\u002Fsummary>\n\n如果您看到以下错误：`tls: failed to verify certificate: x509: certificate signed by unknown authority`\n\n**步骤 1**：获取您的 CA 证书包，格式为 PEM（可以包含多个证书）\n\n**步骤 2**：将文件放置在主机上的 SSL 目录中：\n```bash\n# 默认位置（如果未设置 PENTAGI_SSL_DIR）\ncp ca-bundle.pem .\u002Fpentagi-ssl\u002F\n\n# 或自定义位置（如果在 docker-compose.yml 中使用 PENTAGI_SSL_DIR）\ncp ca-bundle.pem \u002Fpath\u002Fto\u002Fyour\u002Fssl\u002Fdir\u002F\n```\n\n**步骤 3**：在 `.env` 文件中设置路径（路径必须位于容器内）：\n```bash\n# pentagi-ssl 卷被挂载到容器内的 \u002Fopt\u002Fpentagi\u002Fssl\nEXTERNAL_SSL_CA_PATH=\u002Fopt\u002Fpentagi\u002Fssl\u002Fca-bundle.pem\nEXTERNAL_SSL_INSECURE=false\n```\n\n**步骤 4**：重启 PentAGI：\n```bash\ndocker compose restart pentagi\n```\n\n**注意事项**：\n- `pentagi-ssl` 卷被挂载到容器内的 `\u002Fopt\u002Fpentagi\u002Fssl`\n- 您可以通过在 docker-compose.yml 中使用 `PENTAGI_SSL_DIR` 变量来更改主机目录\n- 文件支持在一个 PEM 文件中包含多个证书和中间 CA\n- 仅在测试时使用 `EXTERNAL_SSL_INSECURE=true`（不建议用于生产）\n\n\u003C\u002Fdetails>\n\n### 提供商特定限制\n\n每个提供商都有特定的限制和支持的功能：\n\n- **OpenAI**：支持所有配置选项\n- **Ollama**：不支持 `EMBEDDING_KEY`，因为它使用本地模型\n- **Mistral**：不支持 `EMBEDDING_MODEL` 或自定义 HTTP 客户端\n- **Jina**：不支持自定义 HTTP 客户端\n- **HuggingFace**：需要 `EMBEDDING_KEY`，并支持其他所有选项\n- **GoogleAI**：不支持 `EMBEDDING_URL`，需要 `EMBEDDING_KEY`\n- **VoyageAI**：支持所有配置选项\n\n如果未指定 `EMBEDDING_URL` 和 `EMBEDDING_KEY`，系统将尝试使用对应的 LLM 提供商设置（例如，当 `EMBEDDING_PROVIDER=openai` 时使用 `OPEN_AI_KEY`）。\n\n### 为什么保持嵌入提供商一致很重要\n\n务必始终使用同一个嵌入提供商，原因如下：\n\n1. **向量兼容性**：不同提供商生成的向量具有不同的维度和数学特性。\n2. **语义一致性**：更换提供商会导致先前嵌入文档之间的语义相似性失效。\n
3. **记忆库损坏**：混用不同提供商生成的嵌入会导致搜索结果不佳，并破坏知识库功能。\n\n如果您更改了嵌入提供商，则应清空并重新索引整个知识库（请参阅下方的 `etester` 工具）。\n\n\u003C\u002Fdetails>\n\n### 嵌入测试工具 (etester)\n\nPentAGI 包含一个专门的 `etester` 工具，用于测试、管理和调试嵌入功能。该工具对于诊断和解决与向量嵌入及知识存储相关的问题至关重要。\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>Etester 命令\u003C\u002Fb>（点击展开）\u003C\u002Fsummary>\n\n```bash\n# 测试嵌入提供商和数据库连接\ncd backend\ngo run cmd\u002Fetester\u002Fmain.go test -verbose\n\n# 显示嵌入数据库的相关统计信息\ngo run cmd\u002Fetester\u002Fmain.go info\n\n# 删除嵌入数据库中的所有文档（谨慎使用！）\ngo run cmd\u002Fetester\u002Fmain.go flush\n\n# 为所有文档重新计算嵌入（更换提供商后）\ngo run cmd\u002Fetester\u002Fmain.go reindex\n\n# 在嵌入数据库中搜索文档\ngo run cmd\u002Fetester\u002Fmain.go search -query \"如何安装 PostgreSQL\" -limit 5\n```\n\n### 使用 Docker\n\n如果您在 Docker 中运行 PentAGI，可以在容器内使用 etester：\n\n```bash\n# 测试嵌入提供商\ndocker exec -it pentagi \u002Fopt\u002Fpentagi\u002Fbin\u002Fetester test\n\n# 显示详细的数据库信息\ndocker exec -it pentagi \u002Fopt\u002Fpentagi\u002Fbin\u002Fetester info -verbose\n```\n\n### 高级搜索选项\n\n`search` 命令支持多种过滤器来缩小搜索范围：\n\n```bash\n# 按文档类型筛选\ndocker exec -it pentagi \u002Fopt\u002Fpentagi\u002Fbin\u002Fetester search -query \"安全漏洞\" -doc_type guide -threshold 0.8\n\n# 按流程 ID 筛选\ndocker exec -it pentagi \u002Fopt\u002Fpentagi\u002Fbin\u002Fetester search -query \"代码示例\" -doc_type code -flow_id 42\n\n# 所有可用的搜索选项\ndocker exec -it pentagi \u002Fopt\u002Fpentagi\u002Fbin\u002Fetester search -help\n```\n\n可用的搜索参数：\n- `-query STRING`：搜索查询文本（必填）\n- `-doc_type STRING`：按文档类型筛选（答案、记忆、指南、代码四类；实际取值为英文标识符，如示例中的 `guide`、`code`）\n- `-flow_id NUMBER`：按流程 ID 筛选（正整数）\n- `-answer_type STRING`：按答案类型筛选（指南、漏洞、代码、工具或其他）\n- `-guide_type STRING`：按指南类型筛选（安装、配置、使用、渗透测试、开发或其他）\n- `-limit NUMBER`：最大结果数量（默认：3）\n- `-threshold NUMBER`：相似度阈值（0.0–1.0，默认：0.7）\n\n### 常见问题排查场景\n\n1. **更换嵌入提供商后**：务必运行 `flush` 或 `reindex` 以确保一致性。\n2. **搜索结果不佳**：尝试调整相似度阈值，或检查嵌入是否正确生成。\n3. **数据库连接问题**：确认 PostgreSQL 正在运行，并已安装 pgvector 扩展。\n4. **缺少 API 密钥**：检查您所选嵌入提供商的环境变量。\n\n\u003C\u002Fdetails>\n\n
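作为补充，下面是一个更换嵌入提供商后的示意操作序列（命令均来自上文的 etester 说明；切换到的提供商与模型名称仅为假设示例）：\n\n```bash\n# 假设在 .env 中从 OpenAI 切换到 Ollama 本地嵌入（模型名称为假设值）\n# EMBEDDING_PROVIDER=ollama\n# EMBEDDING_MODEL=nomic-embed-text\n\n# 先验证新提供商与数据库连接是否正常\ndocker exec -it pentagi \u002Fopt\u002Fpentagi\u002Fbin\u002Fetester test -verbose\n\n# 为所有既有文档重新计算嵌入（也可先用 flush 清空后重新入库）\ndocker exec -it pentagi \u002Fopt\u002Fpentagi\u002Fbin\u002Fetester reindex\n\n# 抽查搜索质量，确认重建后的索引可用\ndocker exec -it pentagi \u002Fopt\u002Fpentagi\u002Fbin\u002Fetester search -query \"如何安装 PostgreSQL\" -limit 5\n```\n\n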
## 🔍 功能测试工具 ftester\n\nPentAGI 包含一个多功能工具 `ftester`，用于调试、测试和开发特定功能及 AI 代理行为。`ctester` 专注于测试 LLM 模型的能力，而 `ftester` 则允许您直接调用系统中的各个功能及 AI 代理组件，并精确控制执行上下文。\n\n### 主要特点\n\n- **直接访问功能**：无需运行整个系统即可测试单个功能。\n- **模拟模式**：无需实际部署 PentAGI 即可使用内置模拟进行功能测试。\n- **交互式输入**：以交互方式填写函数参数，便于探索性测试。\n- **详细输出**：终端输出带颜色编码，格式化显示响应和错误信息。\n- **上下文感知测试**：可在特定流程、任务和子任务的上下文中调试 AI 代理。\n- **可观测性集成**：所有函数调用都会记录到 Langfuse 和可观测性堆栈中。\n\n### 使用模式\n\n#### 命令行参数\n\n通过命令行直接指定函数和参数运行 ftester：\n\n```bash\n# 基本用法（模拟模式）\ncd backend\ngo run cmd\u002Fftester\u002Fmain.go [function_name] -[arg1] [value1] -[arg2] [value2]\n\n# 示例：在模拟模式下测试终端命令\ngo run cmd\u002Fftester\u002Fmain.go terminal -command \"ls -la\" -message \"列出文件\"\n\n# 使用真实流程上下文\ngo run cmd\u002Fftester\u002Fmain.go -flow 123 terminal -command \"whoami\" -message \"检查用户\"\n\n# 在特定任务\u002F子任务上下文中测试 AI 代理\ngo run cmd\u002Fftester\u002Fmain.go -flow 123 -task 456 -subtask 789 pentester -message \"查找漏洞\"\n```\n\n#### 交互模式\n\n只指定函数名而不提供其参数时，ftester 会进入引导式交互体验：\n\n```bash\n# 启动交互模式\ngo run cmd\u002Fftester\u002Fmain.go [function_name]\n\n# 例如，交互式填写浏览器工具参数\ngo run cmd\u002Fftester\u002Fmain.go browser\n```\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>可用功能\u003C\u002Fb>（点击展开）\u003C\u002Fsummary>\n\n### 环境功能\n- **terminal**：在容器中执行命令并返回输出\n- **file**：在容器中执行文件操作（读取、写入、列出）\n\n### 搜索功能\n- **browser**：访问网站并截取屏幕截图\n- **google**：使用 Google 自定义搜索进行网络搜索\n- **duckduckgo**：使用 DuckDuckGo 进行网络搜索\n- **tavily**：使用 Tavily AI 搜索引擎进行搜索\n- **traversaal**：使用 Traversaal AI 搜索引擎进行搜索\n- **perplexity**：使用 Perplexity AI 进行搜索\n- **sploitus**：搜索安全漏洞、CVE 编号及渗透测试工具\n- **searxng**：使用 SearXNG 元搜索引擎进行搜索（整合多个引擎的结果）\n\n### 向量数据库功能\n- **search_in_memory**：在向量数据库中搜索信息\n- **search_guide**：在向量数据库中查找指南文档\n- **search_answer**：在向量数据库中查找问题的答案\n- **search_code**：在向量数据库中查找代码示例\n\n### AI 代理功能\n- **advice**：从 AI 代理获取专家建议\n- **coder**：请求代码生成或修改\n- **maintenance**：运行系统维护任务\n- **memorist**：在向量数据库中存储和组织信息\n- **pentester**：执行安全测试和漏洞分析\n- **search**：跨多个来源的复杂搜索\n\n### 实用功能\n- **describe**：显示流程、任务和子任务的相关信息\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>调试流程上下文\u003C\u002Fb>（点击展开）\u003C\u002Fsummary>\n\n`describe` 函数提供流程中任务和子任务的详细信息，在 PentAGI 遇到问题或卡住时尤其有助于诊断。\n\n```bash\n# 列出系统中的所有流程\ngo run cmd\u002Fftester\u002Fmain.go describe\n\n# 显示特定流程的所有任务和子任务\ngo run cmd\u002Fftester\u002Fmain.go -flow 123 describe\n\n# 显示特定任务的详细信息\ngo run cmd\u002Fftester\u002Fmain.go -flow 123 -task 456 describe\n\n# 显示特定子任务的详细信息\ngo run cmd\u002Fftester\u002Fmain.go -flow 123 -task 456 -subtask 789 describe\n\n# 显示包含完整描述和结果的详细输出\ngo run cmd\u002Fftester\u002Fmain.go -flow 123 describe -verbose\n```\n\n此函数允许您识别流程可能卡住的具体位置，并通过直接调用相应的代理功能来恢复处理。\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>函数帮助与发现\u003C\u002Fb>（点击展开）\u003C\u002Fsummary>\n\n每个函数都有帮助模式，显示可用参数：\n\n```bash\n# 获取特定函数的帮助\ngo run cmd\u002Fftester\u002Fmain.go [function_name] -help\n\n# 示例：\ngo run cmd\u002Fftester\u002Fmain.go terminal -help\ngo run cmd\u002Fftester\u002Fmain.go browser -help\ngo run cmd\u002Fftester\u002Fmain.go describe -help\n```\n\n您也可以不带任何参数运行 ftester 来查看所有可用函数的列表：\n\n```bash\ngo run cmd\u002Fftester\u002Fmain.go\n```\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>输出格式\u003C\u002Fb>（点击展开）\u003C\u002Fsummary>\n\n`ftester` 工具使用彩色编码输出，以方便解读：\n\n- **蓝色标题**：章节标题和键名\n- **青色 [INFO]**：一般信息消息\n- **绿色 [SUCCESS]**：成功操作\n- **红色 [ERROR]**：错误消息\n- **黄色 [WARNING]**：警告消息\n- **黄色 [MOCK]**：表示模拟模式运行\n- **洋红色值**：函数参数和结果\n\nJSON 和 Markdown 
响应会自动格式化以提高可读性。\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>高级使用场景\u003C\u002Fb>（点击展开）\u003C\u002Fsummary>\n\n### 调试卡住的 AI 流程\n\n当 PentAGI 卡在一个流程中时：\n\n1. 通过 UI 暂停流程\n2. 使用 `describe` 确定当前任务和子任务\n3. 直接使用相同的任务\u002F子任务 ID 调用代理功能\n4. 检查详细输出以确定问题所在\n5. 根据需要恢复流程或手动干预\n\n### 测试环境变量\n\n验证 API 密钥和外部服务是否正确配置：\n\n```bash\n# 测试 Google 搜索 API 配置\ngo run cmd\u002Fftester\u002Fmain.go google -query \"pentesting tools\"\n\n# 测试浏览器访问外部网站\ngo run cmd\u002Fftester\u002Fmain.go browser -url \"https:\u002F\u002Fexample.com\"\n```\n\n### 开发新的 AI 代理行为\n\n在开发新的提示模板或代理行为时：\n\n1. 在 UI 中创建一个测试流程\n2. 使用 ftester 直接调用代理并使用不同的提示\n3. 观察响应并相应调整提示\n4. 检查 Langfuse 以获取所有函数调用的详细跟踪记录\n\n### 验证 Docker 容器设置\n\n确保容器已正确配置：\n\n```bash\ngo run cmd\u002Fftester\u002Fmain.go -flow 123 terminal -command \"env | grep -i proxy\" -message \"检查代理设置\"\n```\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>Docker 容器使用\u003C\u002Fb>（点击展开）\u003C\u002Fsummary>\n\n如果您在 Docker 中运行 PentAGI，可以在容器内使用 ftester：\n\n```bash\n# 在正在运行的 PentAGI 容器中运行 ftester\ndocker exec -it pentagi \u002Fopt\u002Fpentagi\u002Fbin\u002Fftester [arguments]\n\n# 示例：\ndocker exec -it pentagi \u002Fopt\u002Fpentagi\u002Fbin\u002Fftester -flow 123 describe\ndocker exec -it pentagi \u002Fopt\u002Fpentagi\u002Fbin\u002Fftester -flow 123 terminal -command \"ps aux\" -message \"列出进程\"\n```\n\n这对于没有本地开发环境的生产部署特别有用。\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>与可观测性工具集成\u003C\u002Fb>（点击展开）\u003C\u002Fsummary>\n\n通过 ftester 进行的所有函数调用都会被记录到：\n\n1. **Langfuse**：捕获完整的 AI 代理交互链，包括提示、响应和函数调用\n2. **OpenTelemetry**：记录系统性能分析所需的指标、跟踪和日志\n3. **终端输出**：提供函数执行的即时反馈\n\n要访问详细日志：\n\n- 在 Langfuse UI 中查看 AI 代理跟踪记录（通常位于 `http:\u002F\u002Flocalhost:4000`）\n- 使用 Grafana 仪表板查看系统指标（通常位于 `http:\u002F\u002Flocalhost:3000`）\n- 检查终端输出以获取即时的函数结果和错误信息\n\n\u003C\u002Fdetails>\n\n### 命令行选项\n\n主工具接受多个选项：\n\n- `-env \u003Cpath>` - 环境文件路径（可选，默认为 `.env`）\n- `-provider \u003Ctype>` - 使用的提供商类型（默认为 `custom`，可选值：`openai`、`anthropic`、`ollama`、`bedrock`、`gemini`、`custom`）\n- `-flow \u003Cid>` - 用于测试的流程 ID（0 表示使用模拟数据，默认为 `0`）\n- `-task \u003Cid>` - 用于代理上下文的任务 ID（可选）\n- `-subtask \u003Cid>` - 用于代理上下文的子任务 ID（可选）\n\n特定于函数的参数在函数名称后使用 `-name value` 格式传递。\n\n## 构建\n\n### 构建 Docker 镜像\n\nDocker 构建过程会自动嵌入来自 Git 标签的版本信息。为了正确地为构建打上版本标签，请使用提供的脚本：\n\n#### Linux\u002FmacOS\n\n```bash\n# 加载版本变量\nsource .\u002Fscripts\u002Fversion.sh\n\n# 标准构建\ndocker build \\\n  --build-arg PACKAGE_VER=$PACKAGE_VER \\\n  --build-arg PACKAGE_REV=$PACKAGE_REV \\\n  -t pentagi:$PACKAGE_VER .\n\n# 多平台构建\ndocker buildx build \\\n  --platform linux\u002Famd64,linux\u002Farm64 \\\n  --build-arg PACKAGE_VER=$PACKAGE_VER \\\n  --build-arg PACKAGE_REV=$PACKAGE_REV \\\n  -t pentagi:$PACKAGE_VER .\n\n# 构建并推送\ndocker buildx build \\\n  --platform linux\u002Famd64,linux\u002Farm64 \\\n  --build-arg PACKAGE_VER=$PACKAGE_VER \\\n  --build-arg PACKAGE_REV=$PACKAGE_REV \\\n  -t myregistry\u002Fpentagi:$PACKAGE_VER \\\n  --push .\n```\n\n#### Windows (PowerShell)\n\n```powershell\n# 加载版本变量\n. 
.\\scripts\\version.ps1\n\n# 标准构建\ndocker build `\n  --build-arg PACKAGE_VER=$env:PACKAGE_VER `\n  --build-arg PACKAGE_REV=$env:PACKAGE_REV `\n  -t pentagi:$env:PACKAGE_VER .\n\n# 多平台构建\ndocker buildx build `\n  --platform linux\u002Famd64,linux\u002Farm64 `\n  --build-arg PACKAGE_VER=$env:PACKAGE_VER `\n  --build-arg PACKAGE_REV=$env:PACKAGE_REV `\n  -t pentagi:$env:PACKAGE_VER .\n```\n\n#### 不带版本号的快速构建\n\n对于不进行版本跟踪的开发构建：\n\n```bash\ndocker build -t pentagi:dev .\n```\n\n> [!NOTE]\n> - 构建脚本会自动从 Git 标签中确定版本号\n> - 发布版（在标签提交时）没有修订后缀\n> - 开发版（标签之后）会将提交哈希作为修订号（例如 `1.1.0-bc6e800`）\n> - 若要在本地使用构建好的镜像，请更新 `docker-compose.yml` 中的镜像名称，或直接使用 `build` 选项\n\n## 致谢\n\n本项目得以实现，得益于以下研究与开发工作：\n- [LLM 应用的新兴架构](https:\u002F\u002Flilianweng.github.io\u002Fposts\u002F2023-06-23-agent)\n- [自主 LLM 代理综述](https:\u002F\u002Farxiv.org\u002Fabs\u002F2403.08299)\n- Andriy Semenets 的 [Codel](https:\u002F\u002Fgithub.com\u002Fsemanser\u002Fcodel) —— 基于代理的自动化设计的初始灵感来源\n\n## 许可证\n\n**PentAGI** 采用 [MIT 许可证](LICENSE)授权。\n\n版权所有 © 2025 PentAGI 开发团队\n\n### 第三方依赖\n\n所有第三方依赖均采用与 MIT 兼容的许可证。详细许可证信息请参阅 `licenses\u002F` 目录。\n\n### VXControl 云服务\n\n⚠️ **注意：** 虽然 VXControl 云 SDK 的代码采用 MIT 许可证，但访问 **VXControl 云服务**（威胁情报、AI 支持、高级功能）需要单独的许可证密钥，并遵守 [服务条款](https:\u002F\u002Fgithub.com\u002Fvxcontrol\u002Fcloud#license-and-terms)。\n\nSDK 代码本身可免费使用，但需注册才能访问相关服务。\n\n如有疑问，请联系：**info@pentagi.com** 或 **info@vxcontrol.com**","# PentAGI 快速上手指南\n\nPentAGI 是一款基于人工智能的自动化渗透测试工具，专为安全研究人员和伦理黑客设计。它利用大语言模型（LLM）自主规划并执行渗透测试步骤，内置 20+ 专业安全工具，并在隔离的 Docker 环境中运行以确保安全。\n\n## 环境准备\n\n在开始之前，请确保您的系统满足以下要求：\n\n*   **操作系统**: Linux (推荐 Ubuntu 20.04+\u002FDebian) 或 macOS。Windows 用户建议使用 WSL2。\n*   **Docker & Docker Compose**: 必须安装最新版本的 Docker Engine 和 Docker Compose 插件。\n    *   验证安装：`docker --version` 和 `docker compose version`\n*   **硬件资源**:\n    *   **CPU**: 至少 4 核（推荐 8 核+）\n    *   **内存**: 至少 8GB RAM（推荐 16GB+，若运行本地大模型需更多）\n    *   **磁盘**: 至少 20GB 可用空间\n*   **LLM API Key**: 您需要一个可用的大模型服务密钥。支持 OpenAI, Anthropic, Google Gemini, AWS Bedrock, Ollama, DeepSeek, Qwen 等。\n    *   *国内开发者提示*: 推荐使用 **DeepSeek**, **Qwen (通义千问)** 或通过 **OpenRouter** 聚合接入，以获得更稳定的连接和更低延迟。\n\n## 安装步骤\n\nPentAGI 采用微服务架构，通过 Docker Compose 一键部署是最便捷的方式。\n\n### 1. 克隆项目仓库\n\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002Fvxcontrol\u002Fpentagi.git\ncd pentagi\n```\n\n### 2. 配置环境变量\n\n复制示例配置文件并根据您的需求进行编辑。您需要在此处填入 LLM Provider 的 API Key 和其他必要配置。\n\n```bash\ncp .env.example .env\nnano .env\n```\n\n**关键配置项说明 (.env):**\n*   `LLM_PROVIDER`: 选择提供商 (例如: `openai`, `anthropic`, `deepseek`, `qwen`, `ollama`)\n*   `LLM_API_KEY`: 填入您的 API Key\n*   `LLM_MODEL`: 指定模型名称 (例如: `gpt-4o`, `deepseek-chat`, `qwen-plus`)\n*   `SEARCH_PROVIDER`: 选择搜索引擎 (例如: `tavily`, `duckduckgo`, `searxng`)\n    *   *注*: 若使用 Tavily 等国外服务网络不稳定，可考虑搭建本地 SearXNG 或使用支持国内访问的搜索 API。\n\n### 3. 启动服务\n\n使用 Docker Compose 启动所有核心服务（包括前端、后端、向量数据库、知识图谱及监控组件）：\n\n```bash\ndocker compose up -d\n```\n\n> **注意**: 首次启动时，Docker 会拉取多个镜像（包括 PostgreSQL+pgvector, Neo4j, Grafana 等），根据网络状况可能需要几分钟时间。\n>\n> *国内加速建议*: 如果拉取镜像缓慢，请配置 Docker 镜像加速器（如阿里云、腾讯云镜像加速地址）后再执行上述命令。\n\n### 4. 验证安装\n\n检查所有容器是否正常运行：\n\n```bash\ndocker compose ps\n```\n\n确保状态均为 `Up` 或 `healthy`。\n\n## 基本使用\n\n安装完成后，您可以通过 Web 界面或 API 与 PentAGI 交互。\n\n### 1. 访问 Web 控制台\n\n打开浏览器访问默认地址：\n\n```text\nhttp:\u002F\u002Flocalhost:3000\n```\n\n*   **功能**: 在界面中创建新的渗透测试任务（Flow），查看实时日志、生成的报告以及知识图谱可视化。\n*   **监控**: 系统集成了 Grafana (`http:\u002F\u002Flocalhost:3001`) 和 Langfuse (LLM 观测)，可用于监控系统性能和 AI 决策过程。\n\n### 2. 
发起第一个渗透测试任务 (API 方式)\n\n您也可以通过 curl 命令直接调用 API 发起任务。以下是一个简单的示例，假设目标为本地测试环境（**请勿对未授权目标进行测试**）：\n\n```bash\ncurl -X POST http:\u002F\u002Flocalhost:8080\u002Fgraphql \\\n  -H \"Content-Type: application\u002Fjson\" \\\n  -H \"Authorization: Bearer YOUR_API_TOKEN\" \\\n  -d '{\n    \"query\": \"mutation CreateFlow($input: CreateFlowInput!) { createFlow(input: $input) { id name status } }\",\n    \"variables\": {\n      \"input\": {\n        \"name\": \"Initial Scan\",\n        \"description\": \"Automated nmap scan and vulnerability assessment\",\n        \"target\": \"192.168.1.100\",\n        \"parameters\": {\n          \"tools\": [\"nmap\", \"nikto\"],\n          \"depth\": \"basic\"\n        }\n      }\n    }\n  }'\n```\n\n*   **获取 Token**: `YOUR_API_TOKEN` 可在 `.env` 文件中预设，或通过系统初始化后的输出获取。\n*   **查看进度**: 任务提交后，可在 Web UI 的 \"Flows\" 页面查看实时执行步骤、AI 思考过程及最终生成的漏洞报告。\n\n### 3. 查看结果\n\n任务完成后，PentAGI 会自动生成详细的报告，包含：\n*   发现的开放端口和服务\n*   潜在的安全漏洞\n*   利用建议\n*   执行过程中的所有命令输出和截图（如有）\n\n数据持久化存储在 PostgreSQL 中，您可以随时回溯历史任务记录。","某金融科技公司安全团队需在版本发布前，对内部新开发的微服务架构进行深度渗透测试，以排查潜在的高危漏洞。\n\n### 没有 PentAGI 时\n- **人力耗时巨大**：安全专家需手动串联 Nmap、Metasploit、SQLMap 等二十多种工具，单个系统的完整测试周期长达数天。\n- **知识孤岛严重**：过往成功的攻击路径和漏洞利用技巧散落在不同成员的笔记中，无法形成系统化的知识库供团队复用。\n- **信息滞后缺失**：难以实时追踪最新的 CVE 漏洞情报和 Web 端新型攻击手法，导致测试用例更新缓慢，容易漏测新兴威胁。\n- **报告整理繁琐**：测试结束后，人工汇总日志、截图并编写修复指南极易出错，且格式不统一，开发团队理解成本高。\n\n### 使用 PentAGI 后\n- **全自动闭环执行**：PentAGI 自主规划测试步骤，在隔离的 Docker 环境中自动调用专业工具链完成扫描与利用，将测试周期压缩至小时级。\n- **智能记忆与图谱**：内置的智能记忆系统和 Neo4j 知识图谱自动存储成功攻击路径，让每次测试都能站在“前人肩膀”上，越用越聪明。\n- **实时情报联动**：通过集成的 Tavily、Perplexity 等搜索系统，PentAGI 能实时抓取最新漏洞情报并动态调整攻击策略，确保覆盖零日风险。\n- **一键生成详报**：测试完成后自动生成包含复现步骤和修复建议的专业报告，并通过 Grafana 实时监控全过程，极大降低沟通成本。\n\nPentAGI 将原本依赖资深专家经验的复杂渗透测试，转化为可自主演进、持续积累智慧的自动化安全防御体系。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fvxcontrol_pentagi_833602ef.png","vxcontrol","VXControl","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002Fvxcontrol_954cf1e2.png","",null,"info@vxcontrol.com","https:\u002F\u002Fvxcontrol.com","https:\u002F\u002Fgithub.com\u002Fvxcontrol",[81,85,89,91,95,99,103,106,110,113],{"name":82,"color":83,"percentage":84},"Go","#00ADD8",76.3,{"name":86,"color":87,"percentage":88},"TypeScript","#3178c6",19.9,{"name":90,"color":83,"percentage":10},"Go Template",{"name":92,"color":93,"percentage":94},"CSS","#663399",0.3,{"name":96,"color":97,"percentage":98},"PLpgSQL","#336790",0.2,{"name":100,"color":101,"percentage":102},"Dockerfile","#384d54",0.1,{"name":104,"color":105,"percentage":102},"Shell","#89e051",{"name":107,"color":108,"percentage":109},"JavaScript","#f1e05a",0,{"name":111,"color":112,"percentage":109},"PowerShell","#012456",{"name":114,"color":115,"percentage":109},"HTML","#e34c26",14324,1828,"2026-04-07T01:52:19","MIT",4,"Linux, macOS, Windows","非必需（取决于所选 LLM 提供商）。若本地部署大模型（如使用 vLLM + Qwen），需高性能 NVIDIA GPU；若使用云端 API（OpenAI, Anthropic 等）则无特定显卡要求。","最低 8GB，推荐 16GB+（运行本地大模型或完整微服务架构时需 32GB+）",{"notes":125,"python":126,"dependencies":127},"该项目主要通过 Docker Compose 进行一键部署，所有核心组件（包括 AI 代理、数据库、监控工具）均运行在隔离的容器中。用户无需手动配置 Python 环境或安装具体依赖库，但需确保宿主机已安装 Docker 和 Docker Compose。支持超过 10 种 LLM 提供商，生产环境本地部署可参考 vLLM 指南。系统包含复杂的微服务架构（如 Grafana, Prometheus, Neo4j, MinIO 等），对宿主机资源有一定要求。","未说明（项目主要基于 Docker 部署，内部环境由镜像管理）",[128,129,130,131,82,132,86],"Docker","Docker Compose","PostgreSQL (with pgvector)","Neo4j","React",[35,13],[135,136,137,138,139,140,141,142,143,144,145,146,147,148,149,150,151,152],"ai-agents","ai-security-tool","autonomous-agents","golang","graphql","multi-agent-system","penetration-testing-tools","react","security-automation","security-testing","security-tools","anthropic","gpt","offensive-security","open-source","openai","penetration-testing","self-hosted","2026-03-27T02:49:30.150509","2026-04-07T14:36:43.289459",[156,161,165,170,175,180,184],{"id":157,"question_zh":158,"answer_zh":159,"source_url":160},22309,"遇到 'Unable to create langfuse client: public key is required' 错误导致容器重启怎么办？","可以通过修改 .env 配置文件将 Langfuse 设置为云端模式来解决。在 .env 文件中添加或修改以下配置：\nLANGFUSE_BASE_URL=https:\u002F\u002Fcloud.langfuse.com\n这样可以避免本地部署所需的公钥配置问题。","https:\u002F\u002Fgithub.com\u002Fvxcontrol\u002Fpentagi\u002Fissues\u002F37",
API接入，便于与自动化平台无缝集成。\r\n\r\n[![Discord](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FDiscord-7289DA?logo=discord&logoColor=white)](https:\u002F\u002Fdiscord.gg\u002F2xrMh7qX6m)⠀[![Telegram](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FTelegram-2CA5E0?logo=telegram&logoColor=white)](https:\u002F\u002Ft.me\u002F+Ka9i6CNwe71hMWQy)\r\n\r\n\u003C\u002Fdiv>\r\n\r\n---\r\n\r\n### 🎯 主要特性\r\n\r\n**🧠 最新推理模型支持** - 完全集成具备原生推理能力的前沿AI模型：\r\n- Gemini 2.5\u002F3.0系列，支持思考令牌\r\n- Anthropic Claude Sonnet 4+，扩展推理能力\r\n- DeepSeek R1和Kimi K2.5，推理模式\r\n- OpenAI o系列模型，标志性思维过程\r\n- OpenRouter及OpenAI兼容端点，保留推理内容\r\n\r\n**💰 令牌缓存与成本优化** - 智能提示缓存可在多轮智能体对话中将输入令牌成本降低40%-70%：\r\n- 原生支持Anthropic（临时缓存控制）和Gemini（预创建内容缓存）\r\n- 自动跟踪缓存命中情况，并提供详细分析\r\n- 特别适用于长上下文渗透测试会话\r\n- 所有提供商统一的缓存令牌报告\r\n\r\n**📊 使用分析与监控** - 全面的REST API端点，用于详细跟踪资源利用率：\r\n- 按智能体类型（研究员\u002F开发者\u002F执行者）划分的令牌使用情况\r\n- 成本分析，区分缓存读写\r\n- 每个流程及子任务的执行时间指标\r\n- 工具调用频率统计\r\n- 为可视化分析仪表盘奠定基础（将于v1.3推出）\r\n\r\n**🔑 API令牌管理** - 基于JWT的API认证，实现对PentAGI的程序化访问：\r\n- 可通过Web界面生成和管理API令牌\r\n- 提供完整的REST和GraphQL API访问，便于自动化\r\n- 支持任何语言的客户端代码生成的OpenAPI规范\r\n- 适配n8n、OpenClaw、Claude Desktop及自定义解决方案\r\n- 为官方MCP服务器奠定基础（计划在后续版本中推出）\r\n\r\n**🔍 Sploitus集成** - 实验性支持漏洞搜索引擎：\r\n- 该服务受Cloudflare保护，需进行IP信誉验证\r\n- 启用前请使用内置`ftester`工具检查您的IP信誉\r\n- 可通过`SPLOITUS_ENABLED`环境变量进行配置\r\n\r\n**📡 Langfuse v3可观测性** - 完全迁移到Langfuse v3标准，增强LLM操作跟踪：\r\n- 观测类型分离：Span、Generation、Agent、Tool、Chain、Retriever、Evaluator、Embedding、Guardrail\r\n- 增强消息链可视化，支持Playground模式导航\r\n- 详细的分数指标和执行时间日志记录\r\n- 改进所有观测类型中的变量和元数据跟踪\r\n\r\n### 🚀 新特性\r\n\r\n- **推理内容保留**：智能消息链摘要","2026-02-25T16:26:25",{"id":195,"version":196,"summary_zh":197,"released_at":198},136069,"v1.1.0","## 🔧 错误修复与改进\n\n### LiteLLM 直通支持\n- 修复了 Gemini 提供商的兼容性问题，这些问题曾导致 LiteLLM 无法正常集成。\n- 现在所有提供商都支持 LiteLLM 直通模式，并采用标准化的端点：\n  - OpenAI：`http:\u002F\u002Flitellm:4000\u002Fopenai\u002Fv1`\n  - Anthropic：`http:\u002F\u002Flitellm:4000\u002Fanthropic\u002Fv1`\n  - Gemini：`http:\u002F\u002Flitellm:4000\u002Fgemini`\n- 已使用 LiteLLM v1.80.11-stable.1 进行测试和验证。\n- 针对 Gemini 提供商进行了增强，添加了自定义 HTTP 传输层，用于注入 API 密钥和重写 URL。\n\n### Windows 文件路径兼容性\n- 更改了 PentAGI 容器中的文件挂载方案，以解决 Windows 路径格式的问题。\n- 从主机路径映射迁移到固定的容器路径，以提高跨平台兼容性。\n- 更新了卷挂载：\n  - `PENTAGI_LLM_SERVER_CONFIG_PATH` → `\u002Fopt\u002Fpentagi\u002Fconf\u002Fcustom.provider.yml`\n  - `PENTAGI_OLLAMA_SERVER_CONFIG_PATH` → `\u002Fopt\u002Fpentagi\u002Fconf\u002Follama.provider.yml`\n  - `PENTAGI_DOCKER_CERT_PATH` → `\u002Fopt\u002Fpentagi\u002Fdocker\u002Fssl`\n- **迁移**：安装程序 v1.0.0 会自动将旧设置迁移到新架构。\n- 用户现在可以通过安装程序表单指定主机文件系统中的绝对路径。\n\n### Ollama 单模型配置\n- 添加了 `OLLAMA_SERVER_MODEL` 环境变量，用于为所有代理选择单一模型。\n- 消除了在简单部署中创建自定义提供商配置文件的必要性。\n- 增加了以下微调选项：\n  - `OLLAMA_SERVER_PULL_MODELS_ENABLED`：控制是否自动下载模型（默认：`false`）。\n  - `OLLAMA_SERVER_LOAD_MODELS_ENABLED`：启动时查询可用模型（默认：`false`）。\n  - `OLLAMA_SERVER_PULL_MODELS_TIMEOUT`：模型拉取操作的超时时间，单位为秒（默认：`600`）。\n- 默认模型：`llama3.1:8b-instruct-q8_0`。\n\n### 安装程序 v1.0.0\n- 将安装程序版本升级至 1.0.0，并进行了全面的稳定性改进。\n- 全面支持 Windows，所有配置场景均与 Linux 和 macOS 保持一致。\n- 在 Windows 上禁用了 Docker Compose 命令中的 ANSI 格式化，以获得更清晰的控制台输出。\n- 自动将旧路径变量的设置迁移到新架构：\n  - `DOCKER_CERT_PATH` → `PENTAGI_DOCKER_CERT_PATH`\n  - `LLM_SERVER_CONFIG_PATH` → `PENTAGI_LLM_SERVER_CONFIG_PATH`\n  - `OLLAMA_SERVER_CONFIG_PATH` → `PENTAGI_OLLAMA_SERVER_CONFIG_PATH`\n- 在整个安装过程中增强了错误处理和验证功能。\n- **建议**：下载最新版安装程序，运行“应用更改”以迁移到新的文件挂载方案，然后转到“维护”选项卡并执行“更新 PentAGI”，以下载支持这些新选项的新版本。\n\n### 终端命令处理增强\n- 改进了代理对同步终端命令返回空结果的理解。\n- 增强了后台命令的处理能力，实现了异步结果捕获。\n- 引入了后台命令执行的快速检查超时机制。\n- 对命令失败和静默成功的情况提供了更清晰的反馈。\n- 
成功消息更加准确，能够反映实际的命令执行结果。","2026-01-17T23:30:43",{"id":200,"version":201,"summary_zh":202,"released_at":203},136070,"v1.0.1","## 🐛 错误修复与改进\n\n### 增强的错误诊断\n- 当 LLM 无法生成工具调用时，已在错误消息中添加停止原因。\n- 如果停止原因是 `length`，请在提供商设置中为受影响的代理增加 `max_tokens` 参数值。\n- 有助于更好地进行故障排除和配置优化。\n\n### DuckDuckGo 搜索稳定性\n- 已迁移到新的 DuckDuckGo API，并采用 HTML 响应解析。\n- 添加了覆盖真实搜索场景的全面测试。\n- 显著提升了可靠性和搜索结果质量。\n\n### 提供商护栏绕过\n- 在所有代理提示中添加了明确的授权框架。\n- 防止被 OpenAI、Anthropic 和 Google Gemini 的内容过滤器拦截。\n- 明确说明渗透测试属于预先授权的活动。\n\n### OpenAI 配置更新\n- 由于 OpenAI 提示评估不稳定，已暂时将主代理和助手的模型从 `gpt-5` 切换至 `o4-mini`。\n- 为多个代理提高了 `max_tokens` 上限，以增强输出能力。\n- **建议**：使用 OpenAI 提供商时，启用人工介入模式（在 `.env` 文件中设置 `ASK_USER=true`），以提升稳定性。\n\n### 其他改进\n- 在向量存储通信中增强了消息格式，显示文档匹配分数。\n- 改进了生成器和精炼器提示的清晰度，以便更好地理解用户任务。\n- 为 AskUser 工具添加了客户交互协议。\n\n---\n\n**完整变更日志**：https:\u002F\u002Fgithub.com\u002Fvxcontrol\u002Fpentagi\u002Fcompare\u002Fv1.0.0...v1.0.1","2026-01-06T17:53:27",{"id":205,"version":206,"summary_zh":207,"released_at":208},136071,"v1.0.0","---\r\n\r\n\u003Cdiv align=\"center\">\r\n\r\n> 🎉 **PentAGI 1.0 - 已达生产就绪状态！** 我们自主渗透测试平台的首个稳定版本，带来企业级功能、增强的AI能力以及全新设计的用户体验。\r\n\r\n[![Discord](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FDiscord-7289DA?logo=discord&logoColor=white)](https:\u002F\u002Fdiscord.gg\u002F2xrMh7qX6m)⠀[![Telegram](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FTelegram-2CA5E0?logo=telegram&logoColor=white)](https:\u002F\u002Ft.me\u002F+Ka9i6CNwe71hMWQy)\r\n\r\n\u003C\u002Fdiv>\r\n\r\n---\r\n\r\n### 🎯 主要特性\r\n\r\n**🧠 Graphiti 知识图谱集成** - 革命性的记忆系统，采用 [Graphiti](https:\u002F\u002Fgithub.com\u002Fgetzep\u002Fgraphiti)，这是一种时序知识图谱，可在整个渗透测试会话中保持上下文连贯性。支持配置部署模式（嵌入式、外部或禁用），并利用基于图的推理实现更智能的代理决策。\r\n\r\n**⚙️ 交互式安装程序** - 专业的设置向导，具备全面的系统检查、Docker 卷检测以及针对所有服务的分步配置功能，包括 LLM 提供商、搜索引擎和可观测性堆栈。适用于 Linux、macOS 和 Windows。\r\n\r\n**🎨 现代化前端重制** - 使用 React 19 和 Tailwind CSS v4 对 UI\u002FUX 进行彻底重构，并优化了架构设计：\r\n- 基于代理、任务、工具和向量存储的高级流程管理与筛选功能\r\n- 改进的设置界面，提供表格化数据视图和表单验证\r\n- 实时提示通知与响应式设计\r\n- 更完善的收藏夹系统和侧边栏导航\r\n\r\n**🔧 提供商管理系统** - 统一的 LLM 提供商配置，支持以下选项：\r\n- AWS Bedrock（支持临时凭证和会话令牌）\r\n- Google Gemini（2.5 Flash 和 Pro 模型）\r\n- Ollama（本地部署）\r\n- 自定义 OpenAI 兼容端点\r\n\r\n**🔍 增强的搜索生态** - 集成 [SearXNG](https:\u002F\u002Fdocs.searxng.org\u002F) 元搜索引擎，提供注重隐私的搜索功能，作为现有 Perplexity 和 DuckDuckGo 提供商的补充。\r\n\r\n**⚡ 补丁精炼器** - 智能结果精炼系统，可自动优化代理输出、验证发现结果，并在呈现最终结果前确保准确性。\r\n\r\n### 🚀 新增功能\r\n\r\n- **提示与代理管理**：通过 Web 界面创建、编辑和测试自定义 AI 代理配置\r\n- **提供商测试界面**：内置测试功能，用于验证 LLM 提供商配置，并生成详细报告\r\n- **SSL\u002FTLS 配置**：支持外部证书，可自定义 CA 路径，并提供开发专用的不安全模式\r\n- **增强的容器管理**：可配置的渗透测试专用 Docker 镜像，提升隔离安全性\r\n- **安装 ID 与许可管理**：集成 PentAGI Cloud API，实现许可证密钥管理\r\n- **卷持久化检测**：自动检查 Pentagi 和 Langfuse 服务的 Docker 卷是否存在\r\n\r\n### 🎨 UI\u002FUX 改进\r\n\r\n- **React 19 迁移**：升级至最新版 React，性能显著提升","2025-12-31T11:15:01",{"id":210,"version":211,"summary_zh":212,"released_at":213},136072,"v0.3.0","---\n\n\u003Cdiv align=\"center\">\n\n> 🚀 **加入社区！** 与安全研究人员、AI爱好者及同行的道德黑客们建立联系。获取支持、分享见解，并随时掌握PentAGI的最新动态。\n\n[![Discord](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FDiscord-7289DA?logo=discord&logoColor=white)](https:\u002F\u002Fdiscord.gg\u002F2xrMh7qX6m)⠀[![Telegram](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FTelegram-2CA5E0?logo=telegram&logoColor=white)](https:\u002F\u002Ft.me\u002F+Ka9i6CNwe71hMWQy)\n\n\u003C\u002Fdiv>\n\n---\n\n### 🎯 主要特性\n\n**🤖 助手模式** - 全功能的交互式AI助手，支持流式响应、持久化聊天会话以及智能代理委派。您可以创建多个聊天会话，并在人工协助与自动化渗透测试工作流之间无缝切换。\n\n**🧪 专业测试套件** - 包含三款专用测试工具：\n- **[ctester](https:\u002F\u002Fgithub.com\u002Fvxcontrol\u002Fpentagi\u002Ftree\u002Fmaster\u002Fbackend\u002Fcmd\u002Fctester)**：用于LLM代理配置的并行执行与详细报告生成工具。\n- 
**[etester](https:\u002F\u002Fgithub.com\u002Fvxcontrol\u002Fpentagi\u002Ftree\u002Fmaster\u002Fbackend\u002Fcmd\u002Fetester)**：用于向量嵌入管理，包括提供商测试与数据库优化。\n- **[ftester](https:\u002F\u002Fgithub.com\u002Fvxcontrol\u002Fpentagi\u002Ftree\u002Fmaster\u002Fbackend\u002Fcmd\u002Fftester)**：通过交互式模拟模式调试单个函数及AI行为。\n\n**🔍 增强的搜索能力** - 集成了[Perplexity AI](https:\u002F\u002Fwww.perplexity.ai\u002F)和[DuckDuckGo](https:\u002F\u002Fduckduckgo.com\u002F)搜索引擎，同时兼容现有服务；此外，还提供多提供商嵌入系统，支持OpenAI、Ollama、Mistral、Jina、HuggingFace、GoogleAI及VoyageAI等。\n\n**🛡️ 自定义Kali Linux环境** - 专为渗透测试优化的[Docker镜像](https:\u002F\u002Fhub.docker.com\u002Fr\u002Fvxcontrol\u002Fkali-linux)，内置增强型安全工具与网络管理功能。其[开源构建配置](https:\u002F\u002Fgithub.com\u002Fvxcontrol\u002Fkali-linux-image)采用MIT许可证，支持自动化跨平台构建与安全认证。\n\n**⚡ LLM集成升级** - PentAGI现基于langchaingo的自定义分支开发，大幅提升了LLM服务商兼容性、函数调用能力、流式响应效率，并优化了外部服务集成。\n\n### 🚀 新增功能\n\n- **社区启动**：正式开通[Discord](https:\u002F\u002Fdiscord.gg\u002F2xrMh7qX6m)与[Telegram](https:\u002F\u002Ft.me\u002F+Ka9i6CNwe71hMWQy)频道，旨在为安全研究者与AI爱好者提供社区支持、知识共享与协作平台。\n- **灵活的LLM配置**：支持YAML\u002FJSON格式的自定义配置系统，可为每个代理单独指定模型参数（[示例](https:\u002F\u002Fgithub.com\u002Fvxcontrol\u002Fpentagi\u002Ftree\u002Fmaster\u002Fexamples\u002Fconfigs)）。\n- **高级报告生成**：针对流程、任务及子任务生成全面的Markdown与PDF报告。\n- **智能上下文管理**：增强了[对话摘要](https:\u002F\u002Fgithub.com\u002Fvxcontrol\u002Fpentagi?tab=read","2025-06-25T22:07:13",{"id":215,"version":216,"summary_zh":217,"released_at":218},136073,"v0.2.0","## 🚀 新特性\n\n### 前端架构\n- ✨ 使用 TypeScript 实现了现代化的 React 18 架构，提升类型安全性\n- 🎨 引入 shadcn\u002Fui 组件库与 Radix UI 基础组件，确保设计一致性\n- 🌓 集成 Tailwind CSS，支持暗色\u002F亮色主题切换\n- 📱 实现响应式设计，适配移动端、平板端及桌面端布局\n- ⚡ 通过 Vite 和模块分块优化构建流程\n\n### 核心功能\n- 💬 基于 WebSocket 订阅的 AI 代理实时聊天界面\n- 🤖 多智能体系统，具备研究者、开发者、执行者等专业角色\n- 📊 终端集成，支持实时输出监控\n- 🎯 任务跟踪系统，可创建子任务并监控进度\n- 🔍 集成向量存储的搜索功能\n- 📸 截图捕获与管理系统\n\n### 安全与认证\n- 🔐 支持多提供商认证\n- 🔑 集成 GitHub 和 Google 的 OAuth 登录\n- 🛡️ 支持 SSL\u002FTLS 加密通信\n- 🔒 基于环境变量的配置管理\n\n## 🐛 Bug 修复\n- 修复了 GraphQL 订阅的 WebSocket 连接处理问题\n- 改进了终端输出中的错误处理逻辑\n- 解决了主题切换后状态未持久化的问题\n- 修正了移动端布局的响应式表现\n\n## 🔄 变更\n- 从 Create React App 迁移到 Vite，提升构建性能\n- 将所有依赖更新至最新稳定版本\n- 按功能模块重构代码结构，提升可维护性\n- 完善 TypeScript 类型定义，增强类型支持\n\n## 📚 文档\n- 添加了全面的前端文档\n- 包含开发环境搭建说明\n- 补充了组件架构相关文档\n- 更新了环境配置指南\n\n## 🛠️ 技术细节\n- React 18.3.1\n- TypeScript 5.6.2\n- Vite 5.4.7\n- GraphQL 16.9.0\n- Tailwind CSS 3.4.13\n\n## 🔜 即将推出\n- 更完善的性能监控\n- 更健全的错误上报机制\n- 扩展测试覆盖率\n- 增加更多 UI 组件\n\n## 🙏 致谢\n- 感谢 @sirozha 完成了新版本前端开发\n\n## 变更内容\n* 功能：frontend，由 @sirozha 在 https:\u002F\u002Fgithub.com\u002Fvxcontrol\u002Fpentagi\u002Fpull\u002F1 中实现\n\n## 新贡献者\n* @sirozha 在 https:\u002F\u002Fgithub.com\u002Fvxcontrol\u002Fpentagi\u002Fpull\u002F1 中完成了首次贡献\n\n**完整变更日志**: https:\u002F\u002Fgithub.com\u002Fvxcontrol\u002Fpentagi\u002Fcompare\u002Fv0.1.0...v0.2.0","2025-01-09T13:55:06",{"id":220,"version":221,"summary_zh":222,"released_at":223},136074,"v0.1.0","## 🎯 当前状态\n\nPentAGI 处于早期 Alpha 阶段，主要聚焦于核心功能和系统稳定性。本次发布展示了我们自主渗透测试系统的基本能力，同时系统仍在积极开发和改进中。\n\n## ✨ 可用功能\n\n### 核心功能\n- 🤖 多智能体系统（研究员、开发者、执行者）\n- 🛡️ 与关键安全测试工具的集成\n- 🧠 基于向量存储的基础记忆系统\n- 🔄 自主决策能力\n\n### 技术实现\n- 🐳 基于 Docker 的部署\n- 📊 基本监控（Grafana + OpenTelemetry）\n- 📝 LLM 运维追踪（Langfuse）\n- 🔌 支持 OpenAI\u002FAnthropic API\n\n## ⚠️ 重要提示\n\n- 本版本为 **Alpha 测试版**，旨在用于测试和收集反馈\n- 不建议用于生产环境\n- 请预期频繁的更新和变化\n- 部分功能可能不稳定或不完整\n- 文档较为有限\n\n## 🚀 快速入门\n\n```bash\nmkdir pentagi && cd pentagi\ncurl -O https:\u002F\u002Fraw.githubusercontent.com\u002Fvxcontrol\u002Fpentagi\u002Fmain\u002Fdocker-compose.yml\ncurl -o .env https:\u002F\u002Fraw.githubusercontent.com\u002Fvxcontrol\u002Fpentagi\u002Fmain\u002F.env.example\n# 
配置您的 .env 文件\ndocker compose up -d\n```\n\n---\n\n有关详细文档和最新动态，请访问 [README 文件](https:\u002F\u002Fgithub.com\u002Fvxcontrol\u002Fpentagi\u002Fblob\u002Fmaster\u002FREADME.md)。","2025-01-07T01:26:34"]