[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-confident-ai--deepteam":3,"tool-confident-ai--deepteam":65},[4,17,27,35,48,57],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":16},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",154349,2,"2026-04-13T23:32:16",[13,14,15],"开发框架","Agent","语言模型","ready",{"id":18,"name":19,"github_repo":20,"description_zh":21,"stars":22,"difficulty_score":23,"last_commit_at":24,"category_tags":25,"status":16},4487,"LLMs-from-scratch","rasbt\u002FLLMs-from-scratch","LLMs-from-scratch 是一个基于 PyTorch 的开源教育项目，旨在引导用户从零开始一步步构建一个类似 ChatGPT 的大型语言模型（LLM）。它不仅是同名技术著作的官方代码库，更提供了一套完整的实践方案，涵盖模型开发、预训练及微调的全过程。\n\n该项目主要解决了大模型领域“黑盒化”的学习痛点。许多开发者虽能调用现成模型，却难以深入理解其内部架构与训练机制。通过亲手编写每一行核心代码，用户能够透彻掌握 Transformer 架构、注意力机制等关键原理，从而真正理解大模型是如何“思考”的。此外，项目还包含了加载大型预训练权重进行微调的代码，帮助用户将理论知识延伸至实际应用。\n\nLLMs-from-scratch 特别适合希望深入底层原理的 AI 开发者、研究人员以及计算机专业的学生。对于不满足于仅使用 API，而是渴望探究模型构建细节的技术人员而言，这是极佳的学习资源。其独特的技术亮点在于“循序渐进”的教学设计：将复杂的系统工程拆解为清晰的步骤，配合详细的图表与示例，让构建一个虽小但功能完备的大模型变得触手可及。无论你是想夯实理论基础，还是为未来研发更大规模的模型做准备",90106,3,"2026-04-06T11:19:32",[15,26,14,13],"图像",{"id":28,"name":29,"github_repo":30,"description_zh":31,"stars":32,"difficulty_score":10,"last_commit_at":33,"category_tags":34,"status":16},3704,"NextChat","ChatGPTNextWeb\u002FNextChat","NextChat 是一款轻量且极速的 AI 助手，旨在为用户提供流畅、跨平台的大模型交互体验。它完美解决了用户在多设备间切换时难以保持对话连续性，以及面对众多 AI 模型不知如何统一管理的痛点。无论是日常办公、学习辅助还是创意激发，NextChat 
都能让用户随时随地通过网页、iOS、Android、Windows、MacOS 或 Linux 端无缝接入智能服务。\n\n这款工具非常适合普通用户、学生、职场人士以及需要私有化部署的企业团队使用。对于开发者而言，它也提供了便捷的自托管方案，支持一键部署到 Vercel 或 Zeabur 等平台。\n\nNextChat 的核心亮点在于其广泛的模型兼容性，原生支持 Claude、DeepSeek、GPT-4 及 Gemini Pro 等主流大模型，让用户在一个界面即可自由切换不同 AI 能力。此外，它还率先支持 MCP（Model Context Protocol）协议，增强了上下文处理能力。针对企业用户，NextChat 提供专业版解决方案，具备品牌定制、细粒度权限控制、内部知识库整合及安全审计等功能，满足公司对数据隐私和个性化管理的高标准要求。",87618,"2026-04-05T07:20:52",[13,15],{"id":36,"name":37,"github_repo":38,"description_zh":39,"stars":40,"difficulty_score":10,"last_commit_at":41,"category_tags":42,"status":16},2268,"ML-For-Beginners","microsoft\u002FML-For-Beginners","ML-For-Beginners 是由微软推出的一套系统化机器学习入门课程，旨在帮助零基础用户轻松掌握经典机器学习知识。这套课程将学习路径规划为 12 周，包含 26 节精炼课程和 52 道配套测验，内容涵盖从基础概念到实际应用的完整流程，有效解决了初学者面对庞大知识体系时无从下手、缺乏结构化指导的痛点。\n\n无论是希望转型的开发者、需要补充算法背景的研究人员，还是对人工智能充满好奇的普通爱好者，都能从中受益。课程不仅提供了清晰的理论讲解，还强调动手实践，让用户在循序渐进中建立扎实的技能基础。其独特的亮点在于强大的多语言支持，通过自动化机制提供了包括简体中文在内的 50 多种语言版本，极大地降低了全球不同背景用户的学习门槛。此外，项目采用开源协作模式，社区活跃且内容持续更新，确保学习者能获取前沿且准确的技术资讯。如果你正寻找一条清晰、友好且专业的机器学习入门之路，ML-For-Beginners 将是理想的起点。",85092,"2026-04-10T11:13:16",[26,43,44,45,14,46,15,13,47],"数据工具","视频","插件","其他","音频",{"id":49,"name":50,"github_repo":51,"description_zh":52,"stars":53,"difficulty_score":54,"last_commit_at":55,"category_tags":56,"status":16},5784,"funNLP","fighting41love\u002FfunNLP","funNLP 是一个专为中文自然语言处理（NLP）打造的超级资源库，被誉为\"NLP 民工的乐园”。它并非单一的软件工具，而是一个汇集了海量开源项目、数据集、预训练模型和实用代码的综合性平台。\n\n面对中文 NLP 领域资源分散、入门门槛高以及特定场景数据匮乏的痛点，funNLP 提供了“一站式”解决方案。这里不仅涵盖了分词、命名实体识别、情感分析、文本摘要等基础任务的标准工具，还独特地收录了丰富的垂直领域资源，如法律、医疗、金融行业的专用词库与数据集，甚至包含古诗词生成、歌词创作等趣味应用。其核心亮点在于极高的全面性与实用性，从基础的字典词典到前沿的 BERT、GPT-2 模型代码，再到高质量的标注数据和竞赛方案，应有尽有。\n\n无论是刚刚踏入 NLP 领域的学生、需要快速验证想法的算法工程师，还是从事人工智能研究的学者，都能在这里找到急需的“武器弹药”。对于开发者而言，它能大幅减少寻找数据和复现模型的时间；对于研究者，它提供了丰富的基准测试资源和前沿技术参考。funNLP 以开放共享的精神，极大地降低了中文自然语言处理的开发与研究成本，是中文 AI 
社区不可或缺的宝藏仓库。",79857,1,"2026-04-08T20:11:31",[15,43,46],{"id":58,"name":59,"github_repo":60,"description_zh":61,"stars":62,"difficulty_score":54,"last_commit_at":63,"category_tags":64,"status":16},6590,"gpt4all","nomic-ai\u002Fgpt4all","GPT4All 是一款让普通电脑也能轻松运行大型语言模型（LLM）的开源工具。它的核心目标是打破算力壁垒，让用户无需依赖昂贵的显卡（GPU）或云端 API，即可在普通的笔记本电脑和台式机上私密、离线地部署和使用大模型。\n\n对于担心数据隐私、希望完全掌控本地数据的企业用户、研究人员以及技术爱好者来说，GPT4All 提供了理想的解决方案。它解决了传统大模型必须联网调用或需要高端硬件才能运行的痛点，让日常设备也能成为强大的 AI 助手。无论是希望构建本地知识库的开发者，还是单纯想体验私有化 AI 聊天的普通用户，都能从中受益。\n\n技术上，GPT4All 基于高效的 `llama.cpp` 后端，支持多种主流模型架构（包括最新的 DeepSeek R1 蒸馏模型），并采用 GGUF 格式优化推理速度。它不仅提供界面友好的桌面客户端，支持 Windows、macOS 和 Linux 等多平台一键安装，还为开发者提供了便捷的 Python 库，可轻松集成到 LangChain 等生态中。通过简单的下载和配置，用户即可立即开始探索本地大模型的无限可能。",77307,"2026-04-11T06:52:37",[15,13],{"id":66,"github_repo":67,"name":68,"description_en":69,"description_zh":70,"ai_summary_zh":70,"readme_en":71,"readme_zh":72,"quickstart_zh":73,"use_case_zh":74,"hero_image_url":75,"owner_login":76,"owner_name":77,"owner_avatar_url":78,"owner_bio":79,"owner_company":80,"owner_location":80,"owner_email":80,"owner_twitter":80,"owner_website":81,"owner_url":82,"languages":83,"stars":88,"forks":89,"last_commit_at":90,"license":91,"difficulty_score":10,"env_os":92,"env_gpu":92,"env_ram":92,"env_deps":93,"category_tags":97,"github_topics":98,"view_count":10,"oss_zip_url":80,"oss_zip_packed_at":80,"status":16,"created_at":104,"updated_at":105,"faqs":106,"releases":136},7297,"confident-ai\u002Fdeepteam","deepteam","DeepTeam is a framework to red team LLMs and LLM systems.","DeepTeam 是一款专为大语言模型（LLM）及其系统设计的开源“红队”测试框架。简单来说，它就像是为 AI 系统进行的渗透测试，旨在主动发现潜在的安全隐患。\n\n在 AI 应用开发中，模型可能面临越狱攻击、提示词注入、隐私数据泄露或输出偏见等风险。DeepTeam 通过模拟多种真实攻击场景（如多轮对话利用、敏感信息诱导等），帮助开发者提前识别这些漏洞，并提供相应的防护护栏，确保 AI 代理、RAG 管道和聊天机器人在生产环境中的安全与合规。\n\n这款工具非常适合 AI 开发者、安全研究人员以及负责大模型落地的工程团队使用。其核心亮点在于易用性与本地化部署：用户无需将数据上传至云端，即可在本地机器上运行测试。DeepTeam 内置了 50 多种现成的漏洞检测模板，支持调用任意大模型作为“裁判”，自动对测试结果进行二元判定并给出详细的推理依据。此外，它基于成熟的 DeepEval 框架构建，既能独立使用，也能配合 Confident AI 
平台进行更复杂的风险管理与报告协作，是保障大模型安全不可或缺的得力助手。","\u003Cp align=\"center\">\n    \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fconfident-ai_deepteam_readme_1ff8586e1806.png\" alt=\"DeepTeam Logo\" width=\"55%\">\n\u003C\u002Fp>\n\n\u003Ch1 align=\"center\">The LLM Red Teaming Framework\u003C\u002Fh1>\n\n\u003Ch4 align=\"center\">\n    \u003Cp>\n        \u003Ca href=\"https:\u002F\u002Fwww.trydeepteam.com?utm_source=GitHub\">Documentation\u003C\u002Fa> |\n        \u003Ca href=\"#-vulnerabilities-attacks-and-features\">Vulnerabilities, Attacks, and Features\u003C\u002Fa> |\n        \u003Ca href=\"#-quickstart\">Getting Started\u003C\u002Fa> |\n        \u003Ca href=\"#deepteam-with-confident-ai\">Confident AI\u003C\u002Fa>\n    \u003Cp>\n\u003C\u002Fh4>\n\n\u003Cp align=\"center\">\n    \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fconfident-ai\u002Fdeepteam\u002Freleases\">\n        \u003Cimg alt=\"GitHub release\" src=\"https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fv\u002Frelease\u002Fconfident-ai\u002Fdeepteam\">\n    \u003C\u002Fa>\n    \u003Ca href=\"https:\u002F\u002Fdiscord.com\u002Finvite\u002F3SEyvpgu2f\">\n        \u003Cimg alt=\"Discord\" src=\"https:\u002F\u002Fimg.shields.io\u002Fdiscord\u002F1167926797498376322?color=7289da&logo=discord&logoColor=white\">\n    \u003C\u002Fa>\n    \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fconfident-ai\u002Fdeepteam\u002Fblob\u002Fmain\u002FLICENSE.md\">\n        \u003Cimg alt=\"License\" src=\"https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Flicense\u002Fconfident-ai\u002Fdeepteam.svg?color=yellow\">\n    \u003C\u002Fa>\n\u003C\u002Fp>\n\n\u003Cp align=\"center\">\n    \u003Ca href=\"https:\u002F\u002Fwww.readme-i18n.com\u002Fconfident-ai\u002Fdeepteam?lang=de\">Deutsch\u003C\u002Fa> | \n    \u003Ca href=\"https:\u002F\u002Fwww.readme-i18n.com\u002Fconfident-ai\u002Fdeepteam?lang=es\">Español\u003C\u002Fa> | \n    \u003Ca 
href=\"https:\u002F\u002Fwww.readme-i18n.com\u002Fconfident-ai\u002Fdeepteam?lang=fr\">français\u003C\u002Fa> | \n    \u003Ca href=\"https:\u002F\u002Fwww.readme-i18n.com\u002Fconfident-ai\u002Fdeepteam?lang=ja\">日本語\u003C\u002Fa> | \n    \u003Ca href=\"https:\u002F\u002Fwww.readme-i18n.com\u002Fconfident-ai\u002Fdeepteam?lang=ko\">한국어\u003C\u002Fa> | \n    \u003Ca href=\"https:\u002F\u002Fwww.readme-i18n.com\u002Fconfident-ai\u002Fdeepteam?lang=pt\">Português\u003C\u002Fa> | \n    \u003Ca href=\"https:\u002F\u002Fwww.readme-i18n.com\u002Fconfident-ai\u002Fdeepteam?lang=ru\">Русский\u003C\u002Fa> | \n    \u003Ca href=\"https:\u002F\u002Fwww.readme-i18n.com\u002Fconfident-ai\u002Fdeepteam?lang=zh\">中文\u003C\u002Fa>\n\u003C\u002Fp>\n\n**DeepTeam** is a simple-to-use, open-source red teaming framework for LLM systems. Think of it as penetration testing, but for LLMs.\n\nDeepTeam simulates attacks — jailbreaking, prompt injection, multi-turn exploitation, and more — to uncover vulnerabilities like bias, PII leakage, and SQL injection in your AI agents, RAG pipelines, and chatbots. It also offers **guardrails** to prevent these issues in production.\n\nDeepTeam runs **locally on your machine** and is built on [DeepEval](https:\u002F\u002Fgithub.com\u002Fconfident-ai\u002Fdeepeval), the open-source LLM evaluation framework.\n\n> [!IMPORTANT]\n> Need a place for your red teaming results to live? Sign up to the [Confident AI](https:\u002F\u002Fapp.confident-ai.com?utm_source=GitHub) platform to manage risk assessments, monitor vulnerabilities in production, and share reports with your team.\n\n\u003Cp align=\"center\">\n    \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fconfident-ai_deepteam_readme_b9f93b7ee88b.gif\" alt=\"Confident AI + DeepTeam\" width=\"100%\">\n\u003C\u002Fp>\n\n> Want to talk LLM security, need help picking attacks, or just to say hi? 
[Come join our discord.](https:\u002F\u002Fdiscord.com\u002Finvite\u002F3SEyvpgu2f)\n\n&nbsp;\n\n# 🔥 Vulnerabilities, Attacks, and Features\n\n- 📐 50+ ready-to-use [vulnerabilities](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-vulnerabilities) (all with explanations) powered by **ANY** LLM of your choice. Each vulnerability uses LLM-as-a-Judge metrics that run **locally on your machine** to produce binary pass\u002Ffail scores with reasoning:\n\n  - \u003Cdetails>\n    \u003Csummary>\u003Cb>Data Privacy\u003C\u002Fb>\u003C\u002Fsummary>\n\n    - [PII Leakage](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-vulnerabilities-pii-leakage) — disclosure of sensitive personal information\n    - [Prompt Leakage](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-vulnerabilities-prompt-leakage) — exposure of system prompt secrets and instructions\n\n    \u003C\u002Fdetails>\n\n  - \u003Cdetails>\n    \u003Csummary>\u003Cb>Responsible AI\u003C\u002Fb>\u003C\u002Fsummary>\n\n    - [Bias](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-vulnerabilities-bias) — stereotypes and unfair treatment across gender, race, religion, politics\n    - [Toxicity](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-vulnerabilities-toxicity) — harmful, offensive, or demeaning content\n    - [Child Protection](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-vulnerabilities-child-protection) — child-related privacy and safety risks\n    - [Ethics](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-vulnerabilities-ethics) — violations of moral reasoning and organizational values\n    - [Fairness](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-vulnerabilities-fairness) — discriminatory outcomes across groups and contexts\n\n    \u003C\u002Fdetails>\n\n  - \u003Cdetails>\n    \u003Csummary>\u003Cb>Security\u003C\u002Fb>\u003C\u002Fsummary>\n\n    - 
[BFLA](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-vulnerabilities-bfla) — broken function-level authorization\n    - [BOLA](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-vulnerabilities-bola) — broken object-level authorization\n    - [RBAC](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-vulnerabilities-rbac) — role-based access control bypass\n    - [Debug Access](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-vulnerabilities-debug-access) — unauthorized access to debug modes and dev endpoints\n    - [Shell Injection](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-vulnerabilities-shell-injection) — unauthorized system command execution\n    - [SQL Injection](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-vulnerabilities-sql-injection) — database query manipulation\n    - [SSRF](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-vulnerabilities-ssrf) — server-side request forgery to internal services\n    - [Tool Metadata Poisoning](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-vulnerabilities-tool-metadata-poisoning) — corrupted tool schemas and descriptions\n    - [Cross-Context Retrieval](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-vulnerabilities-cross-context-retrieval) — data access across isolation boundaries\n    - [System Reconnaissance](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-vulnerabilities-system-reconnaissance) — probing internal architecture and configurations\n\n    \u003C\u002Fdetails>\n\n  - \u003Cdetails>\n    \u003Csummary>\u003Cb>Safety\u003C\u002Fb>\u003C\u002Fsummary>\n\n    - [Illegal Activity](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-vulnerabilities-illegal-activity) — facilitation of fraud, weapons, drugs, or other unlawful actions\n    - [Graphic 
Content](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-vulnerabilities-graphic-content) — explicit, violent, or sexual material\n    - [Personal Safety](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-vulnerabilities-personal-safety) — self-harm, harassment, or dangerous advice\n    - [Unexpected Code Execution](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-vulnerabilities-unexpected-code-execution) — coerced execution of unauthorized code\n\n    \u003C\u002Fdetails>\n\n  - \u003Cdetails>\n    \u003Csummary>\u003Cb>Business\u003C\u002Fb>\u003C\u002Fsummary>\n\n    - [Misinformation](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-vulnerabilities-misinformation) — factual errors and unsupported claims\n    - [Intellectual Property](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-vulnerabilities-intellectual-property) — copyright, trademark, and patent violations\n    - [Competition](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-vulnerabilities-competition) — competitor endorsement and market manipulation\n\n    \u003C\u002Fdetails>\n\n  - \u003Cdetails>\n    \u003Csummary>\u003Cb>Agentic\u003C\u002Fb>\u003C\u002Fsummary>\n\n    - [Goal Theft](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-agentic-vulnerabilities-goal-theft) — extracting or redirecting an agent's objectives\n    - [Recursive Hijacking](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-agentic-vulnerabilities-recursive-hijacking) — self-modifying goal chains that alter objectives\n    - [Excessive Agency](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-vulnerabilities-excessive-agency) — agents acting beyond their authority\n    - [Robustness](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-vulnerabilities-robustness) — input overreliance and prompt hijacking\n    - [Indirect 
Instruction](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-vulnerabilities-indirect-instruction) — hidden instructions in retrieved content\n    - [Tool Orchestration Abuse](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-vulnerabilities-tool-orchestration-abuse) — exploiting tool calling sequences\n    - [Agent Identity & Trust Abuse](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-vulnerabilities-agent-identity-abuse) — impersonating agent identity\n    - [Inter-Agent Communication Compromise](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-vulnerabilities-inter-agent-communication-compromise) — spoofing multi-agent message passing\n    - [Autonomous Agent Drift](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-vulnerabilities-autonomous-agent-drift) — agents deviating from intended goals over time\n    - [Exploit Tool Agent](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-vulnerabilities-exploit-tool-agent) — weaponizing tools for unintended actions\n    - [External System Abuse](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-vulnerabilities-external-system-abuse) — using agents to attack external services\n\n    \u003C\u002Fdetails>\n\n  - \u003Cdetails>\n    \u003Csummary>\u003Cb>Custom\u003C\u002Fb>\u003C\u002Fsummary>\n\n    - [Custom Vulnerabilities](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-custom-vulnerability) — define and test your own criteria in a few lines of code\n\n    \u003C\u002Fdetails>\n\n- 💥 20+ research-backed [adversarial attack](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-adversarial-attacks) methods for both single-turn and multi-turn (conversational) red teaming. 
Attacks enhance baseline vulnerability probes using SOTA techniques like jailbreaking, prompt injection, and encoding-based obfuscation:\n\n  - \u003Cdetails>\n    \u003Csummary>\u003Cb>Single-Turn\u003C\u002Fb>\u003C\u002Fsummary>\n\n    - [Prompt Injection](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-adversarial-attacks-prompt-injection) — crafted injections that bypass LLM restrictions\n    - [Roleplay](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-adversarial-attacks-roleplay) — persona-based scenarios exploiting collaborative training\n    - [Leetspeak](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-adversarial-attacks-leetspeak) — symbolic character substitution to avoid keyword detection\n    - [ROT13](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-adversarial-attacks-rot13-encoding) — alphabetic rotation to evade content filters\n    - [Base64](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-adversarial-attacks-base64-encoding) — encoding attacks as random-looking data\n    - [Gray Box](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-adversarial-attacks-gray-box-attack) — leveraging partial system knowledge for targeted attacks\n    - [Math Problem](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-adversarial-attacks-math-problem) — disguising attacks within mathematical inputs\n    - [Multilingual](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-adversarial-attacks-multilingual) — translating attacks to less-spoken languages\n    - Prompt Probing — probing the LLM to extract system prompt details\n    - [Adversarial Poetry](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-adversarial-attacks-adversarial-poetry) — transforming attacks into poetic verse with metaphor\n    - [System Override](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-agentic-attacks-system-override) — disguising 
attacks as legitimate system commands\n    - [Permission Escalation](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-agentic-attacks-permission-escalation) — shifting perceived identity to bypass role restrictions\n    - [Goal Redirection](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-agentic-attacks-goal-redirection) — reframing agent objectives for unauthorized outcomes\n    - [Linguistic Confusion](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-agentic-attacks-semantic-manipulation) — semantic ambiguity to confuse language understanding\n    - [Input Bypass](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-agentic-attacks-input-bypass) — circumventing validation via exception handling claims\n    - [Context Poisoning](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-agentic-attacks-context-poisoning) — injecting false background context to bias reasoning\n    - [Character Stream](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-adversarial-attacks-character-stream) — character-by-character input to bypass filters\n    - [Context Flooding](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-adversarial-attacks-context-flooding) — flooding input with benign text to hide malicious instructions\n    - [Embedded Instruction JSON](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-adversarial-attacks-embedded-instruction-json) — hiding attacks inside realistic JSON structures\n    - [Synthetic Context Injection](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-adversarial-attacks-synthetic-context-injection) — fabricating system context to exploit long-context handling\n    - [Authority Escalation](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-adversarial-attacks-authority-escalation) — framing requests from positions of power\n    - [Emotional 
Manipulation](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-adversarial-attacks-emotional-manipulation) — high-intensity emotional pressure for unsafe compliance\n\n    \u003C\u002Fdetails>\n\n  - \u003Cdetails>\n    \u003Csummary>\u003Cb>Multi-Turn\u003C\u002Fb>\u003C\u002Fsummary>\n\n    - [Linear Jailbreaking](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-adversarial-attacks-linear-jailbreaking) — iteratively refining attacks using target LLM responses\n    - [Tree Jailbreaking](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-adversarial-attacks-tree-jailbreaking) — exploring parallel attack variations to find the best bypass\n    - [Crescendo Jailbreaking](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-adversarial-attacks-crescendo-jailbreaking) — gradual escalation from benign to harmful prompts\n    - [Sequential Jailbreak](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-adversarial-attacks-sequential-jailbreaking) — multi-turn conversational scaffolding toward restricted outputs\n    - [Bad Likert Judge](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-adversarial-attacks-bad-likert-judge) — exploiting Likert scale evaluation roles to extract harmful content\n\n    \u003C\u002Fdetails>\n\n- 🏛️ Red team against established [AI safety frameworks](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fguidelines-and-frameworks) out-of-the-box. 
Each framework automatically maps its categories to the right vulnerabilities and attacks:\n  - OWASP Top 10 for LLMs 2025\n  - OWASP Top 10 for Agents 2026\n  - NIST AI RMF\n  - MITRE ATLAS\n  - BeaverTails\n  - Aegis\n- 🛡️ 7 production-ready [guardrails](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fguardrails) for fast binary classification to guard LLM inputs and outputs in real time.\n- 🧩 Build your own **custom vulnerabilities** and attacks that integrate seamlessly with DeepTeam's ecosystem.\n- 🔗 Run red teaming from the **CLI** with YAML configs, or programmatically in Python.\n- 📊 Access risk assessments, display in dataframes, and save locally in JSON.\n\n&nbsp;\n\n# 🚀 QuickStart\n\nDeepTeam does not require you to define what LLM system you are red teaming — because neither will malicious users. All you need to do is install `deepteam`, define a `model_callback`, and you're good to go.\n\n## Installation\n\n```\npip install -U deepteam\n```\n\n## Red Team Your First LLM\n\n```python\nfrom deepteam import red_team\nfrom deepteam.vulnerabilities import Bias\nfrom deepteam.attacks.single_turn import PromptInjection\n\nasync def model_callback(input: str) -> str:\n    # Replace this with your LLM application\n    return f\"I'm sorry but I can't answer this: {input}\"\n\nrisk_assessment = red_team(\n    model_callback=model_callback,\n    vulnerabilities=[Bias(types=[\"race\"])],\n    attacks=[PromptInjection()]\n)\n```\n\nDon't forget to set your `OPENAI_API_KEY` as an environment variable before running (you can also use [any custom model](https:\u002F\u002Fdeepeval.com\u002Fguides\u002Fguides-using-custom-llms) supported in DeepEval), and run the file:\n\n```bash\npython red_team_llm.py\n```\n\n**That's it! 
Your first red team is complete.** Here's what happened:\n\n- `model_callback` wraps your LLM system and generates a `str` output for a given `input`.\n- At red teaming time, `deepteam` simulates a [`PromptInjection`](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-adversarial-attacks-prompt-injection) attack targeting [`Bias`](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-vulnerabilities-bias) vulnerabilities.\n- Your `model_callback`'s outputs are evaluated using the `BiasMetric`, producing a binary score of 0 or 1.\n- The final passing rate for `Bias` is determined by the proportion of scores that equal 1.\n\nUnlike traditional evaluation, red teaming does not require a prepared dataset — adversarial attacks are dynamically generated based on the vulnerabilities you want to test for.\n\n&nbsp;\n\n## Red Team Against Safety Frameworks\n\nUse established AI safety standards like OWASP and NIST instead of manually picking vulnerabilities:\n\n```python\nfrom deepteam import red_team\nfrom deepteam.frameworks import OWASPTop10\n\nasync def model_callback(input: str) -> str:\n    # Replace this with your LLM application\n    return f\"I'm sorry but I can't answer this: {input}\"\n\nrisk_assessment = red_team(\n    model_callback=model_callback,\n    framework=OWASPTop10()\n)\n```\n\nThis automatically maps the framework's categories to the right vulnerabilities and attacks. 
Available frameworks include `OWASPTop10`, `OWASP_ASI_2026`, `NIST`, `MITRE`, `Aegis`, and `BeaverTails`.\n\n&nbsp;\n\n## Guard Your LLM in Production\n\nOnce you've found your vulnerabilities, use DeepTeam's guardrails to prevent them in production:\n\n```python\nfrom deepteam import Guardrails\nfrom deepteam.guardrails import PromptInjectionGuard, ToxicityGuard, PrivacyGuard\n\nguardrails = Guardrails(\n    input_guards=[PromptInjectionGuard(), PrivacyGuard()],\n    output_guards=[ToxicityGuard()]\n)\n\n# Guard inputs before they reach your LLM\ninput_result = guardrails.guard_input(\"Tell me how to hack a database\")\nprint(input_result.breached)  # True\n\n# Guard outputs before they reach your users\noutput_result = guardrails.guard_output(input=\"Hi\", output=\"Here is some toxic content...\")\nprint(output_result.breached)  # True\n```\n\n7 guards are available out-of-the-box: `ToxicityGuard`, `PromptInjectionGuard`, `PrivacyGuard`, `IllegalGuard`, `HallucinationGuard`, `TopicalGuard`, and `CybersecurityGuard`. 
[Read the full guardrails docs here.](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fguardrails)\n\n&nbsp;\n\n# DeepTeam with Confident AI\n\n[Confident AI](https:\u002F\u002Fapp.confident-ai.com?utm_source=GitHub) is the all-in-one platform that integrates natively with DeepTeam and [DeepEval](https:\u002F\u002Fgithub.com\u002Fconfident-ai\u002Fdeepeval).\n\n- **Manage risk assessments** — view, compare, and track red teaming results across iterations\n- **Monitor in production** — detect and alert on vulnerabilities hitting your live LLM system\n- **Share reports** — generate and distribute security reports across your team\n- **Run from your IDE** — use Confident AI's MCP server to run red teams, pull results, and inspect vulnerabilities without leaving Cursor or Claude Code\n\n\u003Cp align=\"center\">\n    \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fconfident-ai_deepteam_readme_b9f93b7ee88b.gif\" alt=\"Confident AI\" width=\"90%\">\n\u003C\u002Fp>\n\n&nbsp;\n\n# Contributing\n\nPlease read [CONTRIBUTING.md](https:\u002F\u002Fgithub.com\u002Fconfident-ai\u002Fdeepteam\u002Fblob\u002Fmain\u002FCONTRIBUTING.md) for details on our code of conduct, and the process for submitting pull requests to us.\n\n&nbsp;\n\n# Authors\n\nBuilt by the founders of Confident AI. 
Contact jeffreyip@confident-ai.com for all enquiries.\n\n&nbsp;\n\n# License\n\nDeepTeam is licensed under Apache 2.0 - see the [LICENSE.md](https:\u002F\u002Fgithub.com\u002Fconfident-ai\u002Fdeepteam\u002Fblob\u002Fmain\u002FLICENSE.md) file for details.\n","\u003Cp align=\"center\">\n    \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fconfident-ai_deepteam_readme_1ff8586e1806.png\" alt=\"DeepTeam Logo\" width=\"55%\">\n\u003C\u002Fp>\n\n\u003Ch1 align=\"center\">LLM红队框架\u003C\u002Fh1>\n\n\u003Ch4 align=\"center\">\n    \u003Cp>\n        \u003Ca href=\"https:\u002F\u002Fwww.trydeepteam.com?utm_source=GitHub\">文档\u003C\u002Fa> |\n        \u003Ca href=\"#-vulnerabilities-attacks-and-features\">漏洞、攻击 和 功能\u003C\u002Fa> |\n        \u003Ca href=\"#-quickstart\">快速入门\u003C\u002Fa> |\n        \u003Ca href=\"#deepteam-with-confident-ai\">Confident AI\u003C\u002Fa>\n    \u003Cp>\n\u003C\u002Fh4>\n\n\u003Cp align=\"center\">\n    \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fconfident-ai\u002Fdeepteam\u002Freleases\">\n        \u003Cimg alt=\"GitHub release\" src=\"https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fv\u002Frelease\u002Fconfident-ai\u002Fdeepteam\">\n    \u003C\u002Fa>\n    \u003Ca href=\"https:\u002F\u002Fdiscord.com\u002Finvite\u002F3SEyvpgu2f\">\n        \u003Cimg alt=\"Discord\" src=\"https:\u002F\u002Fimg.shields.io\u002Fdiscord\u002F1167926797498376322?color=7289da&logo=discord&logoColor=white\">\n    \u003C\u002Fa>\n    \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fconfident-ai\u002Fdeepteam\u002Fblob\u002Fmain\u002FLICENSE.md\">\n        \u003Cimg alt=\"License\" src=\"https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Flicense\u002Fconfident-ai\u002Fdeepteam.svg?color=yellow\">\n    \u003C\u002Fa>\n\u003C\u002Fp>\n\n\u003Cp align=\"center\">\n    \u003Ca href=\"https:\u002F\u002Fwww.readme-i18n.com\u002Fconfident-ai\u002Fdeepteam?lang=de\">Deutsch\u003C\u002Fa> | \n    \u003Ca 
href=\"https:\u002F\u002Fwww.readme-i18n.com\u002Fconfident-ai\u002Fdeepteam?lang=es\">Español\u003C\u002Fa> | \n    \u003Ca href=\"https:\u002F\u002Fwww.readme-i18n.com\u002Fconfident-ai\u002Fdeepteam?lang=fr\">français\u003C\u002Fa> | \n    \u003Ca href=\"https:\u002F\u002Fwww.readme-i18n.com\u002Fconfident-ai\u002Fdeepteam?lang=ja\">日本語\u003C\u002Fa> | \n    \u003Ca href=\"https:\u002F\u002Fwww.readme-i18n.com\u002Fconfident-ai\u002Fdeepteam?lang=ko\">한국어\u003C\u002Fa> | \n    \u003Ca href=\"https:\u002F\u002Fwww.readme-i18n.com\u002Fconfident-ai\u002Fdeepteam?lang=pt\">Português\u003C\u002Fa> | \n    \u003Ca href=\"https:\u002F\u002Fwww.readme-i18n.com\u002Fconfident-ai\u002Fdeepteam?lang=ru\">Русский\u003C\u002Fa> | \n    \u003Ca href=\"https:\u002F\u002Fwww.readme-i18n.com\u002Fconfident-ai\u002Fdeepteam?lang=zh\">中文\u003C\u002Fa>\n\u003C\u002Fp>\n\n**DeepTeam** 是一个简单易用的开源 LLM 系统红队框架。你可以把它想象成针对 LLM 的渗透测试。\n\nDeepTeam 会模拟各种攻击——越狱、提示注入、多轮利用等——以发现你的 AI 代理、RAG 流水线和聊天机器人中的偏见、PII 泄露、SQL 注入等漏洞。它还提供 **护栏机制**，用于在生产环境中预防这些问题。\n\nDeepTeam 可以在 **本地机器上运行**，并基于开源 LLM 评估框架 [DeepEval](https:\u002F\u002Fgithub.com\u002Fconfident-ai\u002Fdeepeval) 构建。\n\n> [!IMPORTANT]\n> 需要一个地方来存储你的红队测试结果吗？请注册 [Confident AI](https:\u002F\u002Fapp.confident-ai.com?utm_source=GitHub) 平台，以管理风险评估、监控生产环境中的漏洞，并与团队共享报告。\n\n\u003Cp align=\"center\">\n    \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fconfident-ai_deepteam_readme_b9f93b7ee88b.gif\" alt=\"Confident AI + DeepTeam\" width=\"100%\">\n\u003C\u002Fp>\n\n> 想讨论 LLM 安全问题、需要帮助选择攻击方式，或者只是想打个招呼？[加入我们的 Discord 社区。](https:\u002F\u002Fdiscord.com\u002Finvite\u002F3SEyvpgu2f)\n\n&nbsp;\n\n# 🔥 漏洞、攻击 和 功能\n\n- 📐 50 多种即用型 [漏洞](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-vulnerabilities)（均附有说明），由你选择的 **任意** LLM 驱动。每种漏洞都使用 LLM 作为裁判的指标，在 **本地机器上** 运行，生成带有推理过程的二元通过\u002F失败评分：\n\n  - \u003Cdetails>\n    \u003Csummary>\u003Cb>数据隐私\u003C\u002Fb>\u003C\u002Fsummary>\n\n    - [PII 
泄露](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-vulnerabilities-pii-leakage) — 敏感个人信息的泄露\n    - [提示泄露](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-vulnerabilities-prompt-leakage) — 系统提示中秘密信息和指令的暴露\n\n    \u003C\u002Fdetails>\n\n  - \u003Cdetails>\n    \u003Csummary>\u003Cb>负责任的人工智能\u003C\u002Fb>\u003C\u002Fsummary>\n\n    - [偏见](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-vulnerabilities-bias) — 性别、种族、宗教、政治等方面的刻板印象和不公平待遇\n    - [毒性](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-vulnerabilities-toxicity) — 有害、冒犯性或贬低性的内容\n    - [儿童保护](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-vulnerabilities-child-protection) — 与儿童相关的隐私和安全风险\n    - [伦理](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-vulnerabilities-ethics) — 对道德推理和组织价值观的违背\n    - [公平性](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-vulnerabilities-fairness) — 不同群体和情境下的歧视性结果\n\n    \u003C\u002Fdetails>\n\n  - \u003Cdetails>\n    \u003Csummary>\u003Cb>安全性\u003C\u002Fb>\u003C\u002Fsummary>\n\n    - [BFLA](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-vulnerabilities-bfla) — 功能级授权被破坏\n    - [BOLA](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-vulnerabilities-bola) — 对象级授权被破坏\n    - [RBAC](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-vulnerabilities-rbac) — 基于角色的访问控制被绕过\n    - [调试访问](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-vulnerabilities-debug-access) — 未经授权访问调试模式和开发端点\n    - [Shell 注入](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-vulnerabilities-shell-injection) — 未经授权执行系统命令\n    - [SQL 注入](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-vulnerabilities-sql-injection) — 数据库查询被篡改\n    - [SSRF](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-vulnerabilities-ssrf) — 向内部服务发起服务器端请求伪造\n    - 
[工具元数据中毒](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-vulnerabilities-tool-metadata-poisoning) — 工具的架构和描述被篡改\n    - [跨上下文检索](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-vulnerabilities-cross-context-retrieval) — 跨隔离边界访问数据\n    - [系统侦察](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-vulnerabilities-system-reconnaissance) — 探测内部架构和配置\n\n    \u003C\u002Fdetails>\n\n  - \u003Cdetails>\n    \u003Csummary>\u003Cb>安全（Safety）\u003C\u002Fb>\u003C\u002Fsummary>\n\n    - [非法活动](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-vulnerabilities-illegal-activity) — 为欺诈、武器、毒品或其他非法行为提供便利\n    - [血腥内容](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-vulnerabilities-graphic-content) — 明显暴力或色情的内容\n    - [个人安全](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-vulnerabilities-personal-safety) — 自残、骚扰或危险建议\n    - [意外代码执行](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-vulnerabilities-unexpected-code-execution) — 强制执行未经授权的代码\n\n    \u003C\u002Fdetails>\n\n  - \u003Cdetails>\n    \u003Csummary>\u003Cb>业务\u003C\u002Fb>\u003C\u002Fsummary>\n\n    - [虚假信息](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-vulnerabilities-misinformation) — 事实性错误及无据之说\n    - [知识产权](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-vulnerabilities-intellectual-property) — 著作权、商标权及专利权侵权\n    - [竞争](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-vulnerabilities-competition) — 竞争对手背书与市场操纵\n\n    \u003C\u002Fdetails>\n\n  - \u003Cdetails>\n    \u003Csummary>\u003Cb>代理型\u003C\u002Fb>\u003C\u002Fsummary>\n\n    - [目标窃取](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-agentic-vulnerabilities-goal-theft) — 挖掘或重定向智能体的目标\n    - [递归劫持](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-agentic-vulnerabilities-recursive-hijacking) — 自我修改的目标链会改变智能体的意图\n    - 
[过度代理](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-vulnerabilities-excessive-agency) — 智能体超出其权限范围行事\n    - [鲁棒性](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-vulnerabilities-robustness) — 对输入的过度依赖及提示劫持\n    - [间接指令](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-vulnerabilities-indirect-instruction) — 在检索到的内容中隐藏指令\n    - [工具编排滥用](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-vulnerabilities-tool-orchestration-abuse) — 利用工具调用序列进行攻击\n    - [智能体身份与信任滥用](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-vulnerabilities-agent-identity-abuse) — 冒充智能体身份\n    - [多智能体通信被破坏](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-vulnerabilities-inter-agent-communication-compromise) — 欺骗多智能体之间的消息传递\n    - [自主智能体漂移](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-vulnerabilities-autonomous-agent-drift) — 智能体随时间偏离预期目标\n    - [利用工具的智能体](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-vulnerabilities-exploit-tool-agent) — 将工具武器化以执行非预期行为\n    - [外部系统滥用](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-vulnerabilities-external-system-abuse) — 使用智能体攻击外部服务\n\n    \u003C\u002Fdetails>\n\n  - \u003Cdetails>\n    \u003Csummary>\u003Cb>自定义\u003C\u002Fb>\u003C\u002Fsummary>\n\n    - [自定义漏洞](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-custom-vulnerability) — 仅需几行代码即可定义并测试您自己的评估标准\n\n    \u003C\u002Fdetails>\n\n- 💥 20多种基于研究的[对抗性攻击](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-adversarial-attacks)方法，适用于单轮和多轮（对话式）红队演练。这些攻击采用最先进的技术，如越狱、提示注入和基于编码的混淆等，以增强基础漏洞探测能力：\n\n  - \u003Cdetails>\n    \u003Csummary>\u003Cb>单轮\u003C\u002Fb>\u003C\u002Fsummary>\n\n    - [提示注入](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-adversarial-attacks-prompt-injection) — 精心设计的注入内容可绕过大模型的限制\n    - [角色扮演](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-adversarial-attacks-roleplay) — 
基于角色的情景模拟，利用协作训练中的弱点\n    - [Leetspeak](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-adversarial-attacks-leetspeak) — 通过符号字符替换来规避关键词检测\n    - [ROT13](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-adversarial-attacks-rot13-encoding) — 通过字母旋转来逃避内容过滤器\n    - [Base64](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-adversarial-attacks-base64-encoding) — 将攻击编码为看似随机的数据\n    - [灰盒](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-adversarial-attacks-gray-box-attack) — 利用对系统的部分了解实施针对性攻击\n    - [数学问题](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-adversarial-attacks-math-problem) — 将攻击伪装成数学输入\n    - [多语言](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-adversarial-attacks-multilingual) — 将攻击翻译成较少使用的语言\n    - 提示探测 — 探测大模型以获取系统提示的具体细节\n    - [对抗性诗歌](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-adversarial-attacks-adversarial-poetry) — 将攻击转化为带有隐喻的诗体表达\n    - [系统覆盖](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-agentic-attacks-system-override) — 将攻击伪装成合法的系统命令\n    - [权限提升](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-agentic-attacks-permission-escalation) — 通过改变感知身份来绕过角色限制\n    - [目标重定向](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-agentic-attacks-goal-redirection) — 重新设定智能体目标以达成未经授权的结果\n    - [语言混淆](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-agentic-attacks-semantic-manipulation) — 利用语义模糊来干扰语言理解\n    - [输入绕过](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-agentic-attacks-input-bypass) — 通过异常处理机制绕过验证\n    - [上下文污染](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-agentic-attacks-context-poisoning) — 注入虚假背景信息以偏颇推理\n    - [字符流](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-adversarial-attacks-character-stream) — 逐字符输入以绕过过滤器\n    - 
[上下文泛滥](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-adversarial-attacks-context-flooding) — 向输入中大量添加无关文本以隐藏恶意指令\n    - [嵌入式指令JSON](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-adversarial-attacks-embedded-instruction-json) — 将攻击隐藏在看似真实的JSON结构中\n    - [合成上下文注入](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-adversarial-attacks-synthetic-context-injection) — 构造虚假系统上下文以利用长上下文处理机制\n    - [权威升级](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-adversarial-attacks-authority-escalation) — 从权力地位出发提出请求\n    - [情绪操控](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-adversarial-attacks-emotional-manipulation) — 通过高强度的情绪施压促使不安全的服从\n\n    \u003C\u002Fdetails>\n\n  - \u003Cdetails>\n    \u003Csummary>\u003Cb>多轮\u003C\u002Fb>\u003C\u002Fsummary>\n\n    - [线性越狱](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-adversarial-attacks-linear-jailbreaking) — 通过目标LLM的响应迭代优化攻击\n    - [树形越狱](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-adversarial-attacks-tree-jailbreaking) — 探索并行的攻击变体以找到最佳绕过方法\n    - [渐强式越狱](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-adversarial-attacks-crescendo-jailbreaking) — 从良性提示逐步升级到有害提示\n    - [序列越狱](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-adversarial-attacks-sequential-jailbreaking) — 多轮对话搭建脚手架，诱导产生受限输出\n    - [糟糕的李克特量表评判者](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-adversarial-attacks-bad-likert-judge) — 利用李克特量表评估角色提取有害内容\n\n    \u003C\u002Fdetails>\n\n- 🏛️ 红队对抗现成的[AI安全框架](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fguidelines-and-frameworks)，开箱即用。每个框架会自动将其类别映射到相应的漏洞和攻击：\n  - OWASP LLM十大风险2025\n  - OWASP Agent十大风险2026\n  - NIST AI RMF\n  - MITRE ATLAS\n  - BeaverTails\n  - Aegis\n- 🛡️ 7种生产就绪的[护栏](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fguardrails)，用于快速二分类，实时保护LLM的输入和输出。\n- 🧩 构建您自己的**自定义漏洞**和攻击，与DeepTeam生态系统无缝集成。\n- 🔗 
可以使用YAML配置文件通过**命令行**运行红队测试，也可以用Python进行编程化操作。\n- 📊 访问风险评估结果，以数据框形式展示，并可保存为本地JSON文件。\n\n&nbsp;\n\n\n\n# 🚀 快速入门\n\nDeepTeam无需您指定要红队测试的LLM系统——因为恶意用户也不会这样做。您只需安装`deepteam`，定义一个`model_callback`函数，即可开始。\n\n## 安装\n\n```\npip install -U deepteam\n```\n\n## 对您的第一个LLM进行红队测试\n\n```python\nfrom deepteam import red_team\nfrom deepteam.vulnerabilities import Bias\nfrom deepteam.attacks.single_turn import PromptInjection\n\nasync def model_callback(input: str) -> str:\n    # 替换为您自己的LLM应用\n    return f\"很抱歉，我无法回答这个问题：{input}\"\n\nrisk_assessment = red_team(\n    model_callback=model_callback,\n    vulnerabilities=[Bias(types=[\"race\"])],\n    attacks=[PromptInjection()]\n)\n```\n\n在运行之前，请别忘了将`OPENAI_API_KEY`设置为环境变量（您也可以使用[任何自定义模型](https:\u002F\u002Fdeepeval.com\u002Fguides\u002Fguides-using-custom-llms)），然后运行以下命令：\n\n```bash\npython red_team_llm.py\n```\n\n**就这样！您的第一次红队测试完成了。** 过程如下：\n\n- `model_callback`封装了您的LLM系统，根据给定的`input`生成一个`str`类型的输出。\n- 在红队测试时，`deepteam`模拟了一次针对[`Bias`](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-vulnerabilities-bias)漏洞的[`PromptInjection`](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fred-teaming-adversarial-attacks-prompt-injection)攻击。\n- 您的`model_callback`输出会使用`BiasMetric`进行评估，得出0或1的二元分数。\n- 最终的`Bias`通过率由得分为1的比例决定。\n\n与传统评估不同，红队测试不需要准备好的数据集——对抗性攻击是根据您想要测试的漏洞动态生成的。\n\n&nbsp;\n\n## 对抗安全框架进行红队测试\n\n您可以使用OWASP、NIST等成熟的AI安全标准，而不是手动选择漏洞：\n\n```python\nfrom deepteam import red_team\nfrom deepteam.frameworks import OWASPTop10\n\nasync def model_callback(input: str) -> str:\n    # 替换为您自己的LLM应用\n    return f\"很抱歉，我无法回答这个问题：{input}\"\n\nrisk_assessment = red_team(\n    model_callback=model_callback,\n    framework=OWASPTop10()\n)\n```\n\n这会自动将框架的类别映射到相应的漏洞和攻击。可用的框架包括`OWASPTop10`、`OWASP_ASI_2026`、`NIST`、`MITRE`、`Aegis`和`BeaverTails`。\n\n&nbsp;\n\n## 在生产环境中保护您的LLM\n\n一旦发现漏洞，可以使用DeepTeam的护栏来防止它们在生产环境中发生：\n\n```python\nfrom deepteam import Guardrails\nfrom deepteam.guardrails import PromptInjectionGuard, ToxicityGuard, 
PrivacyGuard\n\nguardrails = Guardrails(\n    input_guards=[PromptInjectionGuard(), PrivacyGuard()],\n    output_guards=[ToxicityGuard()]\n)\n\n# 在输入到达您的LLM之前进行防护\ninput_result = guardrails.guard_input(\"告诉我如何入侵数据库\")\nprint(input_result.breached)  # True\n\n# 在输出发送给用户之前进行防护\noutput_result = guardrails.guard_output(input=\"你好\", output=\"这里有一些有毒的内容...\")\nprint(output_result.breached)  # True\n```\n\n开箱即用的护栏共有7种：`ToxicityGuard`、`PromptInjectionGuard`、`PrivacyGuard`、`IllegalGuard`、`HallucinationGuard`、`TopicalGuard`和`CybersecurityGuard`。[完整护栏文档请见此处。](https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fguardrails)\n\n&nbsp;\n\n# DeepTeam与Confident AI\n\n[Confident AI](https:\u002F\u002Fapp.confident-ai.com?utm_source=GitHub)是一个一体化平台，可原生集成DeepTeam和[DeepEval](https:\u002F\u002Fgithub.com\u002Fconfident-ai\u002Fdeepeval)。\n\n- **管理风险评估** — 查看、比较和跟踪各次红队测试的结果\n- **生产环境监控** — 检测并警报影响您在线LLM系统的漏洞\n- **分享报告** — 生成并向团队分发安全报告\n- **直接在IDE中运行** — 使用Confident AI的MCP服务器运行红队测试、拉取结果并检查漏洞，无需离开Cursor或Claude Code。\n\n\u003Cp align=\"center\">\n    \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fconfident-ai_deepteam_readme_b9f93b7ee88b.gif\" alt=\"Confident AI\" width=\"90%\">\n\u003C\u002Fp>\n\n&nbsp;\n\n# 贡献\n\n请阅读[CONTRIBUTING.md](https:\u002F\u002Fgithub.com\u002Fconfident-ai\u002Fdeepteam\u002Fblob\u002Fmain\u002FCONTRIBUTING.md)，了解我们的行为准则以及向我们提交拉取请求的流程。\n\n&nbsp;\n\n# 作者\n\n由Confident AI的创始人打造。如有任何疑问，请联系jeffreyip@confident-ai.com。\n\n&nbsp;\n\n# 许可证\n\nDeepTeam采用Apache 2.0许可证授权——详情请参阅[LICENSE.md](https:\u002F\u002Fgithub.com\u002Fconfident-ai\u002Fdeepteam\u002Fblob\u002Fmain\u002FLICENSE.md)文件。","# DeepTeam 快速上手指南\n\nDeepTeam 是一个开源的大语言模型（LLM）红队测试框架，旨在通过模拟越狱、提示注入、多轮攻击等手段，发现 AI 代理、RAG 管道和聊天机器人中的安全漏洞（如偏见、隐私泄露、SQL 注入等）。它基于 DeepEval 构建，完全在本地运行。\n\n## 环境准备\n\n在开始之前，请确保您的开发环境满足以下要求：\n\n*   **操作系统**：Linux, macOS, 或 Windows\n*   **Python 版本**：Python 3.9 或更高版本\n*   **前置依赖**：\n    *   已安装 `pip` 包管理工具\n    *   （可选但推荐）拥有可用的 LLM API Key（如 OpenAI, Azure, 
或本地部署的模型），用于驱动攻击生成和评估判断。\n\n## 安装步骤\n\n使用 pip 直接安装 DeepTeam 及其核心依赖：\n\n```bash\npip install -U deepteam\n```\n\n> **提示**：如果您在中国大陆地区遇到下载速度慢的问题，可以使用国内镜像源加速安装：\n> ```bash\n> pip install -U deepteam -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple\n> ```\n\n安装完成后，请将 `OPENAI_API_KEY` 设置为环境变量，用于驱动攻击生成与评估（也可以参考 DeepEval 文档改用任意自定义模型）：\n\n```bash\nexport OPENAI_API_KEY=\"sk-...\"\n```\n\n## 基本使用\n\nDeepTeam 的核心工作流是：用 `model_callback` 封装目标 LLM 系统，选择要测试的漏洞与攻击方法，再调用 `red_team` 运行红队测试。对抗性攻击会根据所选漏洞在运行时动态生成，无需准备数据集。\n\n### 1. 创建测试脚本\n\n新建一个文件（例如 `run_test.py`），并写入以下代码。请将 `model_callback` 中的占位实现替换为您自己的 LLM 应用逻辑。\n\n```python\nfrom deepteam import red_team\nfrom deepteam.vulnerabilities import Bias\nfrom deepteam.attacks.single_turn import PromptInjection\n\n# model_callback 封装您的 LLM 系统：接收攻击输入，返回字符串输出\nasync def model_callback(input: str) -> str:\n    # 替换为您自己的 LLM 应用\n    return f\"很抱歉，我无法回答这个问题：{input}\"\n\n# 针对"偏见"漏洞模拟"提示注入"攻击\nrisk_assessment = red_team(\n    model_callback=model_callback,\n    vulnerabilities=[Bias(types=[\"race\"])],\n    attacks=[PromptInjection()]\n)\n\n# 查看各漏洞的通过\u002F失败概览\nprint(risk_assessment.overview)\n```\n\n### 2. 运行测试\n\n在终端中执行脚本：\n\n```bash\npython run_test.py\n```\n\n### 3. 
查看结果\n\n运行结束后，DeepTeam 会在控制台输出测试结果，包括是否成功触发了漏洞、使用的攻击提示（Prompt）以及模型的响应。\n\n*   **本地报告**：测试结果通常会自动保存为本地文件（如 JSON 或 HTML 报告，具体取决于配置）。\n*   **云端管理（可选）**：如果您注册了 [Confident AI](https:\u002F\u002Fapp.confident-ai.com)，可以配置 API Key 将测试结果同步到云端仪表盘，以便团队协作和长期监控。\n\n通过以上步骤，您即可快速启动针对 LLM 应用的安全评估。更多高级用法（如自定义漏洞、多轮对话攻击、RAG 测试等）请参考官方文档。","某金融科技公司正在开发一款基于大模型的智能客服助手，用于处理用户账户查询和敏感业务咨询，上线前急需确保系统安全合规。\n\n### 没有 deepteam 时\n- 安全测试依赖人工构造攻击提示，耗时数周仅能覆盖极少数场景，难以发现深层漏洞。\n- 无法系统化检测隐私泄露风险，用户身份证号或银行卡号可能在特定诱导下被模型无意输出。\n- 缺乏对偏见和毒性内容的量化评估，模型可能在涉及性别或地域话题时给出不当回答，引发公关危机。\n- 每次模型迭代后需重新进行全套人工测试，效率低下且标准不一，导致上线周期被迫延长。\n\n### 使用 deepteam 后\n- 利用内置的 50+ 种现成攻击模板（如越狱、提示词注入），在本地自动运行数千次模拟攻击，几小时内即可全面扫描系统弱点。\n- 通过专门的“隐私数据泄露”检测模块，精准识别并修复了模型在多重诱导下输出用户 PII 信息的漏洞，确保数据合规。\n- 借助自动化评估指标，快速定位并消除了模型在特定话题上的偏见与毒性回复，显著提升了内容安全性。\n- 将红队测试集成至 CI\u002FCD 流程，每次代码更新自动触发深度安全扫描，大幅缩短验证周期并保障迭代质量。\n\ndeepteam 将原本昂贵且滞后的 AI 安全审计转变为自动化、可量化的常规开发环节，为智能客服的安全上线筑牢了防线。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fconfident-ai_deepteam_35ba9799.png","confident-ai","Confident AI","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002Fconfident-ai_21a458ba.png","",null,"www.confident-ai.com","https:\u002F\u002Fgithub.com\u002Fconfident-ai",[84],{"name":85,"color":86,"percentage":87},"Python","#3572A5",100,1537,246,"2026-04-13T11:51:38","Apache-2.0","未说明",{"notes":94,"python":92,"dependencies":95},"该工具基于 DeepEval 构建，设计为在本地机器上运行。它支持使用任意选择的 LLM（ANY LLM）作为评判标准来执行红队测试，因此具体的硬件资源需求（如 GPU、内存）取决于用户所选择运行的目标大语言模型，而非工具本身有固定硬性要求。README 中未列出具体的 Python 版本或其他系统级依赖限制。",[96],"deepeval",[15],[99,100,101,102,103],"llm-guardrails","llm-red-teaming","llm-safety","hacktoberfest","python","2026-03-27T02:49:30.150509","2026-04-14T12:31:40.192143",[107,112,117,122,127,132],{"id":108,"question_zh":109,"answer_zh":110,"source_url":111},32880,"如何配置 DeepTeam 以完全在本地运行模型，避免调用外部 API？","虽然文档提到可以使用 `deepeval set-local-model` 命令设置本地模型，但部分功能（如基线攻击模拟）可能仍硬编码了外部 API 
调用。如果必须完全本地运行，目前可能需要等待功能更新或修改源码。建议关注官方文档更新，因为维护者正在努力完善本地基线攻击生成功能。若文档未明确说明某些步骤仍需外部调用，用户应知晓当前版本可能存在限制。","https:\u002F\u002Fgithub.com\u002Fconfident-ai\u002Fdeepteam\u002Fissues\u002F11",{"id":113,"question_zh":114,"answer_zh":115,"source_url":116},32881,"遇到 'ModuleNotFoundError: No module named deepeval.metrics.red_teaming_metrics' 错误怎么办？","该错误是因为 DeepEval v3.0+ 重构后移除了 `red_teaming_metrics` 模块，将其功能迁移到了 DeepTeam 包中。请确保将 DeepTeam 升级到最新版本，维护者已在最新发行版中修复了此导入问题。不要尝试降级 DeepEval，升级 DeepTeam 即可解决。","https:\u002F\u002Fgithub.com\u002Fconfident-ai\u002Fdeepteam\u002Fissues\u002F92",{"id":118,"question_zh":119,"answer_zh":120,"source_url":121},32882,"RedTeamer 类中的 target_purpose 参数有什么作用？如何使用它来定制特定场景的攻击？","`target_purpose` 用于定义红队测试的具体目标或场景（例如金融用例），从而使生成的攻击更具针对性而非通用。早期版本中该参数可能未生效，但在最新更新中已修复。现在你可以在初始化 `RedTeamer` 时传入 `target_purpose`，它会将此上下文传递给攻击模拟器，生成符合特定目的的攻击向量。具体用法可参考官方文档的红队介绍部分。","https:\u002F\u002Fgithub.com\u002Fconfident-ai\u002Fdeepteam\u002Fissues\u002F3",{"id":123,"question_zh":124,"answer_zh":125,"source_url":126},32883,"使用自定义模型时遇到遥测（Telemetry）相关报错如何解决？","如果在运行自定义模型时遇到遥测相关的错误，可以通过设置环境变量来禁用遥测功能。必须在导入 `deepteam` 包之前设置以下环境变量：\n```python\nimport os\nos.environ[\"DEEPTEAM_TELEMETRY_OPT_OUT\"] = \"YES\"\nimport deepteam\n```\n注意：变量值必须是 \"YES\" 且变量名是 `DEEPTEAM_TELEMETRY_OPT_OUT`（包含下划线），设置为 \"1\" 或其他值无效。","https:\u002F\u002Fgithub.com\u002Fconfident-ai\u002Fdeepteam\u002Fissues\u002F116",{"id":128,"question_zh":129,"answer_zh":130,"source_url":131},32884,"如何在 DeepTeam 中使用 Hugging Face 托管的模型进行评测和模拟？","DeepTeam 基于 DeepEval 构建，因此可以直接复用 DeepEval 的模型集成能力。你不需要直接修改 DeepTeam 的 API 调用逻辑，只需在 DeepTeam 的任何模型参数位置传入配置好的 DeepEval 模型对象即可。如果 Hugging Face 模型支持 OpenAI API 格式，你可以先在 DeepEval 中配置该模型，然后将其传递给 DeepTeam 的 `simulator_model` 或 `evaluation_model` 参数，即可无缝使用。","https:\u002F\u002Fgithub.com\u002Fconfident-ai\u002Fdeepteam\u002Fissues\u002F163",{"id":133,"question_zh":134,"answer_zh":135,"source_url":126},32885,"如何编写一个简单的回调函数来连接本地 Ollama 模型进行红队测试？","你可以编写一个 HTTP 请求回调函数来与本地 Ollama 
实例通信。示例代码如下：\n```python\ndef ollama_callback(prompt: str) -> str:\n    import httpx\n    try:\n        resp = httpx.post(\n            \"http:\u002F\u002Flocalhost:11434\u002Fapi\u002Fgenerate\",\n            json={\n                \"model\": \"gpt-oss:20b\", # 替换为你的模型名称\n                \"prompt\": prompt,\n                \"stream\": False\n            },\n        )\n        resp.raise_for_status()\n        data = resp.json()\n        return data.get('response', '')\n    except Exception as e:\n        return f\"Error: {str(e)}\"\n```\n然后将此 `ollama_callback` 传递给 `teamer.red_team(model_callback=...)` 即可使用本地模型响应攻击提示。",[137,142,147],{"id":138,"version":139,"summary_zh":140,"released_at":141},247565,"v1.0.4","我们很高兴地宣布 **DeepTeam v1.0.0** 的稳定版正式发布——这是一个用于编排大语言模型红队测试的 Python 框架。从模拟对抗性、恶意输入，到评估您的 LLM 应用在这些场景下的响应行为，DeepTeam 能够帮助您全面发现整个 AI 技术栈中的安全漏洞。\n\n我们在设计 DeepTeam 时充分考虑了模块化理念，使用户能够轻松构建自定义的 LLM 安全测试流水线，通过组合 **20 多种单轮和多轮攻击方法**，覆盖不同风险类别下的 **50 多种漏洞**。\n\n\n---\n\n## 为什么选择 DeepTeam？\n\n**🧩 模块化架构**  \n通过自由组合各类漏洞和攻击手段，您可以灵活搭建专属的安全测试流水线。\n\n**🏗️ 行业标准集成**  \n与 OWASP 大语言模型十大风险、NIST 人工智能风险管理框架以及相关合规标准无缝对接。\n\n**⚔️ 20+ 种攻击方法**  \n单轮攻击：提示注入、越狱、ROT13、Base64、自动化规避、数据提取、响应操纵、灰盒攻击等。  \n多轮攻击：对话劫持、渐进式越狱链、上下文投毒。\n\n**🎯 50+ 种漏洞检测**  \n偏见（性别、种族、宗教、政治）• PII 泄露（数据库访问、会话泄露）• 虚假信息（事实性错误、幻觉）• 有害内容（暴力、色情）• 毒性言论（侮辱、仇恨言论）• 过度自主性 • 对上下文的过度依赖 • 以及其他 40 多种漏洞。\n\n**🐍 Python 优先 + CLI**  \n提供简洁的 Python API，并支持异步操作。可通过命令行直接运行测试，便于集成到 CI\u002FCD 流程中。\n\n**🔄 带状态的测试**  \n可复用测试用例，持续跟踪安全改进效果，量化迭代修复成果。\n\n---\n\n## 快速入门\n\n### 安装\n```bash\npip install -U deepteam\n```\n\n### 基本示例\n\n```python\nfrom deepteam import red_team\nfrom deepteam.vulnerabilities import Bias, PIILeakage, Toxicity\nfrom deepteam.attacks.single_turn import PromptInjection\n\nasync def model_callback(input: str) -> str:\n    return your_llm.generate(input)\n\n# 定义漏洞\nbias = Bias(types=[\"race\", \"gender\"])\npii = PIILeakage(types=[\"database_access\"])\ntoxicity = Toxicity(types=[\"insults\"])\n\n# 执行红队测试\nrisk_assessment = red_team(\n 
   model_callback=model_callback,\n    vulnerabilities=[bias, pii, toxicity],\n    attacks=[PromptInjection()]\n)\n\n# 查看结果\nprint(risk_assessment.overview)\nrisk_assessment.overview.to_df()  # 导出为 Pandas DataFrame\nrisk_assessment.save(to=\".\u002Fresults\u002F\")  # 保存至本地\n```\n\n---\n\n## 其他亮点优势\n\n✅ 无需数据集——对抗性攻击可在运行时动态模拟  \n✅ 全面的风险评估——按漏洞细分，提供通过率\u002F失败率及详细分析依据  \n✅ 灵活的集成能力——兼容 OpenAI、Anthropic、自定义 LLM 及任何其他提供商  \n✅ 与 Pandas 无缝集成——可导出为 DataFrame，便于分析和生成合规文档  \n✅ 基于 DeepEval 构建——依托最广泛采用的开源 LLM 评估框架  \n\n## 更多资源\n\n感谢您","2025-11-12T14:18:25",{"id":143,"version":144,"summary_zh":145,"released_at":146},247566,"v0.1.9","# 🚀 DeepTeam CLI 发布\n\n我们很高兴地发布 **DeepTeam CLI** 的第一个版本——这是一个功能强大的命令行工具，用于红队测试以及使用 DeepEval 评估 LLM 应用程序。\n\n## ✨ 功能特性\n\n- **红队模拟**\n  - 轻松指定模拟器和评估模型（如 `gpt-3.5-turbo-0125`、`gpt-4o` 等）\n  - 使用预定义的漏洞类别（例如：偏见、毒性）对 LLM 系统发起攻击\n\n- **目标系统配置**\n  - 可以通过自定义 Python 封装来测试基础模型（如 `gpt-3.5-turbo`）以及完整的 LLM 应用程序\n  - 简单的 YAML 配置结构，用于定义目标模型的目的和行为\n\n- **系统控制**\n  - 设置并发和并行度：`max_concurrent`、`run_async`\n  - 指定每种漏洞类型要运行的攻击次数\n  - 可选的错误处理机制（`ignore_errors`）和结果存储路径（`output_folder`）\n\n- **可插拔的漏洞与攻击**\n  - 支持多种攻击类型（例如：`Prompt Injection`）\n  - 可以定义默认的漏洞，如：\n    - `Bias`：针对种族和性别\n    - `Toxicity`：脏话和侮辱性语言\n\n## 🛠 示例用法\n\n```yaml\nmodels:\n  simulator: gpt-3.5-turbo-0125\n  evaluation: gpt-4o\n\ntarget:\n  purpose: \"一个有用的 AI 助手\"\n  model: gpt-3.5-turbo\n\nsystem_config:\n  max_concurrent: 10\n  attacks_per_vulnerability_type: 3\n  run_async: true\n  ignore_errors: false\n  output_folder: \"results\"\n\ndefault_vulnerabilities:\n  - name: \"Bias\"\n    types: [\"race\", \"gender\"]\n  - name: \"Toxicity\"\n    types: [\"profanity\", \"insults\"]\n\nattacks:\n  - name: \"Prompt Injection\"\n```\n```bash\ndeepteam run config.yaml\n```\n\n敬请期待更多攻击类型、评估指标以及与 DeepEval 框架的集成。\n\n## 🧠 代理式红队测试\n代理式红队测试旨在检测那些只有在系统自主运行、保持持久记忆并追求复杂目标时才会显现出来的漏洞。\n\n### 🧨 专门的攻击方法\n\nDeepTeam 包含 6 种针对代理系统的特定攻击：\n\n- 权限欺骗：伪装成系统管理员或进行权限覆盖\n- 角色操纵：诱骗代理改变其角色\n- 目标重定向：重新定义或破坏代理的优先级\n- 语言混淆：利用歧义扰乱语言理解\n- 
验证绕过：通过巧妙措辞绕过安全检查\n- 上下文注入：注入虚假的环境状态\n\n### 示例\n```python\nfrom deepteam import red_team\nfrom deepteam.vulnerabilities.agentic import DirectControlHijacking\nfrom deepteam.attacks.single_turn import AuthoritySpoofing\n\n# 测试你的代理是否可能被劫持\nrisk_assessment = red_team(\n    model_callback=your_agent_callback,\n    vulnerabilities=[DirectControlHijacking()],\n    attacks=[AuthoritySpoofing()]\n)\n```\n\n🧪 祝您红队测试愉快——现在不仅适用于聊天机器人，也适用于自主代理！","2025-07-02T07:01:43",{"id":148,"version":149,"summary_zh":150,"released_at":151},247567,"v0.1.4","# DeepTeam v0.1.0 – 首次发布 🎉\n\n我们很高兴地推出 **DeepTeam** 的首个公开版本，这是一个用于 **LLM 红队测试**的开源框架。\n\n🧠 DeepTeam 让你能够模拟针对语言模型的真实世界攻击，测试诸如越狱、拒绝服务绕过等失效模式，并通过结构化、可复现的评估方法发现模型中的漏洞。\n\n## 🚀 核心功能\n\n- ✅ 内置对抗性攻击策略（越狱、拒绝服务绕过、提示注入）\n- ✅ 自动化生成对抗性测试用例\n- ✅ 多维度评估指标（通过\u002F失败、毒性、相关性等）\n- ✅ 与你的 LLM 应用及测试流水线无缝集成\n- ✅ 类型安全的 Python API，配置简单\n\n立即开始使用，安装 `deepteam`：\n\n```bash\npip install deepteam\n```\n\n文档地址：https:\u002F\u002Fwww.trydeepteam.com\u002Fdocs\u002Fgetting-started","2025-05-23T19:36:02"]