[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-uditgoenka--autoresearch":3,"tool-uditgoenka--autoresearch":61},[4,18,26,36,44,52],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":17},4358,"openclaw","openclaw\u002Fopenclaw","OpenClaw 是一款专为个人打造的本地化 AI 助手，旨在让你在自己的设备上拥有完全可控的智能伙伴。它打破了传统 AI 助手局限于特定网页或应用的束缚，能够直接接入你日常使用的各类通讯渠道，包括微信、WhatsApp、Telegram、Discord、iMessage 等数十种平台。无论你在哪个聊天软件中发送消息，OpenClaw 都能即时响应，甚至支持在 macOS、iOS 和 Android 设备上进行语音交互，并提供实时的画布渲染功能供你操控。\n\n这款工具主要解决了用户对数据隐私、响应速度以及“始终在线”体验的需求。通过将 AI 部署在本地，用户无需依赖云端服务即可享受快速、私密的智能辅助，真正实现了“你的数据，你做主”。其独特的技术亮点在于强大的网关架构，将控制平面与核心助手分离，确保跨平台通信的流畅性与扩展性。\n\nOpenClaw 非常适合希望构建个性化工作流的技术爱好者、开发者，以及注重隐私保护且不愿被单一生态绑定的普通用户。只要具备基础的终端操作能力（支持 macOS、Linux 及 Windows WSL2），即可通过简单的命令行引导完成部署。如果你渴望拥有一个懂你",349277,3,"2026-04-06T06:32:30",[13,14,15,16],"Agent","开发框架","图像","数据工具","ready",{"id":19,"name":20,"github_repo":21,"description_zh":22,"stars":23,"difficulty_score":10,"last_commit_at":24,"category_tags":25,"status":17},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,"2026-04-05T11:01:52",[14,15,13],{"id":27,"name":28,"github_repo":29,"description_zh":30,"stars":31,"difficulty_score":32,"last_commit_at":33,"category_tags":34,"status":17},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 
代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",141543,2,"2026-04-06T11:32:54",[14,13,35],"语言模型",{"id":37,"name":38,"github_repo":39,"description_zh":40,"stars":41,"difficulty_score":32,"last_commit_at":42,"category_tags":43,"status":17},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",107888,"2026-04-06T11:32:50",[14,15,13],{"id":45,"name":46,"github_repo":47,"description_zh":48,"stars":49,"difficulty_score":10,"last_commit_at":50,"category_tags":51,"status":17},4487,"LLMs-from-scratch","rasbt\u002FLLMs-from-scratch","LLMs-from-scratch 是一个基于 PyTorch 的开源教育项目，旨在引导用户从零开始一步步构建一个类似 ChatGPT 的大型语言模型（LLM）。它不仅是同名技术著作的官方代码库，更提供了一套完整的实践方案，涵盖模型开发、预训练及微调的全过程。\n\n该项目主要解决了大模型领域“黑盒化”的学习痛点。许多开发者虽能调用现成模型，却难以深入理解其内部架构与训练机制。通过亲手编写每一行核心代码，用户能够透彻掌握 Transformer 架构、注意力机制等关键原理，从而真正理解大模型是如何“思考”的。此外，项目还包含了加载大型预训练权重进行微调的代码，帮助用户将理论知识延伸至实际应用。\n\nLLMs-from-scratch 特别适合希望深入底层原理的 AI 开发者、研究人员以及计算机专业的学生。对于不满足于仅使用 
API，而是渴望探究模型构建细节的技术人员而言，这是极佳的学习资源。其独特的技术亮点在于“循序渐进”的教学设计：将复杂的系统工程拆解为清晰的步骤，配合详细的图表与示例，让构建一个虽小但功能完备的大模型变得触手可及。无论你是想夯实理论基础，还是为未来研发更大规模的模型做准备",90106,"2026-04-06T11:19:32",[35,15,13,14],{"id":53,"name":54,"github_repo":55,"description_zh":56,"stars":57,"difficulty_score":10,"last_commit_at":58,"category_tags":59,"status":17},4292,"Deep-Live-Cam","hacksider\u002FDeep-Live-Cam","Deep-Live-Cam 是一款专注于实时换脸与视频生成的开源工具，用户仅需一张静态照片，即可通过“一键操作”实现摄像头画面的即时变脸或制作深度伪造视频。它有效解决了传统换脸技术流程繁琐、对硬件配置要求极高以及难以实时预览的痛点，让高质量的数字内容创作变得触手可及。\n\n这款工具不仅适合开发者和技术研究人员探索算法边界，更因其极简的操作逻辑（仅需三步：选脸、选摄像头、启动），广泛适用于普通用户、内容创作者、设计师及直播主播。无论是为了动画角色定制、服装展示模特替换，还是制作趣味短视频和直播互动，Deep-Live-Cam 都能提供流畅的支持。\n\n其核心技术亮点在于强大的实时处理能力，支持口型遮罩（Mouth Mask）以保留使用者原始的嘴部动作，确保表情自然精准；同时具备“人脸映射”功能，可同时对画面中的多个主体应用不同面孔。此外，项目内置了严格的内容安全过滤机制，自动拦截涉及裸露、暴力等不当素材，并倡导用户在获得授权及明确标注的前提下合规使用，体现了技术发展与伦理责任的平衡。",88924,"2026-04-06T03:28:53",[14,15,13,60],"视频",{"id":62,"github_repo":63,"name":64,"description_en":65,"description_zh":66,"ai_summary_zh":67,"readme_en":68,"readme_zh":69,"quickstart_zh":70,"use_case_zh":71,"hero_image_url":72,"owner_login":73,"owner_name":74,"owner_avatar_url":75,"owner_bio":76,"owner_company":77,"owner_location":78,"owner_email":79,"owner_twitter":80,"owner_website":81,"owner_url":82,"languages":83,"stars":88,"forks":89,"last_commit_at":90,"license":91,"difficulty_score":32,"env_os":92,"env_gpu":93,"env_ram":93,"env_deps":94,"category_tags":98,"github_topics":100,"view_count":32,"oss_zip_url":79,"oss_zip_packed_at":79,"status":17,"created_at":109,"updated_at":110,"faqs":111,"releases":112},4453,"uditgoenka\u002Fautoresearch","autoresearch","Claude Autoresearch Skill — Autonomous goal-directed iteration for Claude Code. Inspired by Karpathy's autoresearch. 
Modify → Verify → Keep\u002FDiscard → Repeat forever.","autoresearch 是一款能将 Claude Code 转化为“自动进化引擎”的开源技能。它灵感源自 Andrej Karpathy 的 autoresearch 项目，旨在通过自主循环迭代，帮助用户在任何可量化的领域实现持续优化。\n\n传统开发或优化过程往往依赖人工反复试错，效率低且容易中断。autoresearch 解决了这一痛点，它确立目标与核心指标后，能自动执行“修改→验证→保留或回滚”的闭环流程。系统会基于历史结果智能选择下一步操作，每次仅做一个聚焦的改动，若验证通过则提交，若效果变差则自动撤销，确保进步层层累积而不会倒退。\n\n这款工具特别适合开发者、数据科学家、运维工程师以及任何希望通过数据驱动方式优化代码、营销策略或业务流程的专业人士。无论是机器学习模型调优、代码重构，还是内容生成优化，只要你有明确的衡量指标，它都能发挥作用。\n\n其独特亮点在于将复杂的自动化探索简化为机械式的可靠循环：利用 Git 作为记忆库记录每一次尝试，支持自动回滚错误操作，并能全天候无人值守运行。你只需设定好目标和评分标准，剩下的交给 autoresearch 日夜不停地迭代，醒来即可收获优化成","autoresearch 是一款能将 Claude Code 转化为“自动进化引擎”的开源技能。它灵感源自 Andrej Karpathy 的 autoresearch 项目，旨在通过自主循环迭代，帮助用户在任何可量化的领域实现持续优化。\n\n传统开发或优化过程往往依赖人工反复试错，效率低且容易中断。autoresearch 解决了这一痛点，它确立目标与核心指标后，能自动执行“修改→验证→保留或回滚”的闭环流程。系统会基于历史结果智能选择下一步操作，每次仅做一个聚焦的改动，若验证通过则提交，若效果变差则自动撤销，确保进步层层累积而不会倒退。\n\n这款工具特别适合开发者、数据科学家、运维工程师以及任何希望通过数据驱动方式优化代码、营销策略或业务流程的专业人士。无论是机器学习模型调优、代码重构，还是内容生成优化，只要你有明确的衡量指标，它都能发挥作用。\n\n其独特亮点在于将复杂的自动化探索简化为机械式的可靠循环：利用 Git 作为记忆库记录每一次尝试，支持自动回滚错误操作，并能全天候无人值守运行。你只需设定好目标和评分标准，剩下的交给 autoresearch 日夜不停地迭代，醒来即可收获优化成果。它证明了无需通用人工智能（AGI），仅需明确的目标、指标和不知疲倦的循环，就能产生复利般的增益效果。","\u003Cdiv align=\"center\">\n\n# Claude Autoresearch\n\n**Turn [Claude Code](https:\u002F\u002Fdocs.anthropic.com\u002Fen\u002Fdocs\u002Fclaude-code) into a relentless improvement engine.**\n\nBased on [Karpathy's autoresearch](https:\u002F\u002Fgithub.com\u002Fkarpathy\u002Fautoresearch) — constraint + mechanical metric + autonomous iteration = compounding gains.\n\n[![Claude Code Skill](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FClaude_Code-Skill-blue?logo=anthropic&logoColor=white)](https:\u002F\u002Fdocs.anthropic.com\u002Fen\u002Fdocs\u002Fclaude-code)\n[![Version](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fversion-1.9.0-blue.svg)](https:\u002F\u002Fgithub.com\u002Fuditgoenka\u002Fautoresearch\u002Freleases)\n[![License: MIT](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FLicense-MIT-green.svg)](LICENSE)\n\n[![Based 
on](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FBased_on-Karpathy's_Autoresearch-orange)](https:\u002F\u002Fgithub.com\u002Fkarpathy\u002Fautoresearch)\n[![Follow @iuditg](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FFollow-@iuditg-000000?style=flat&logo=x&logoColor=white)](https:\u002F\u002Fx.com\u002Fintent\u002Ffollow?screen_name=iuditg)\n[![Support](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FSupport-PayPal-00457C?style=flat&logo=paypal&logoColor=white)](https:\u002F\u002Fpaypal.me\u002Fuditgoenka)\n\n\u003Cbr>\n\n*\"Set the GOAL → Claude runs the LOOP → You wake up to results\"*\n\n*You don't need AGI. You need a goal, a metric, and a loop that never quits.*\n\n\u003Cbr>\n\n[How It Works](#how-it-works) · [Commands](#commands) · [Quick Start](#quick-start) · [Guides](guide\u002F) · [FAQ](#faq)\n\n\u003C\u002Fdiv>\n\n---\n\n```\n      PLAN              LOOP             DEBUG              FIX            SECURE            SHIP\n ┌──────────┐     ┌──────────┐     ┌──────────┐     ┌──────────┐     ┌──────────┐     ┌──────────┐\n │   Goal   │     │  Modify  │     │   Find   │     │   Fix    │     │  STRIDE  │     │  Stage   │\n │  Metric  │────▶│  Verify  │────▶│   Bugs   │────▶│  Errors  │────▶│  OWASP   │────▶│  Deploy  │\n │  Scope   │     │  Keep\u002F   │     │  Trace   │     │  Repair  │     │  Red     │     │ Release  │\n └──────────┘     │  Discard │     └──────────┘     └──────────┘     │  Team    │     └──────────┘\n\u002Fautoresearch:    └──────────┘    \u002Fautoresearch:    \u002Fautoresearch:   └──────────┘    \u002Fautoresearch:\n  plan            \u002Fautoresearch     debug              fix          \u002Fautoresearch:      ship\n                                                                     security\n\n                  ┌──────────┐     ┌──────────┐     ┌──────────┐     ┌──────────┐\n                  │ Scenario │     │ Predict  │     │  Learn   │     │  Reason  │\n                  │   Edge   │     │ 5-Expert │     │   Docs   │ 
    │  Debate  │\n                  │   Cases  │     │  Swarm   │     │   Gen    │     │ Converge │\n                  └──────────┘     └──────────┘     └──────────┘     └──────────┘\n                 \u002Fautoresearch:   \u002Fautoresearch:   \u002Fautoresearch:   \u002Fautoresearch:\n                   scenario         predict           learn           reason\n```\n\n---\n\n## Why This Exists\n\n[Karpathy's autoresearch](https:\u002F\u002Fgithub.com\u002Fkarpathy\u002Fautoresearch) demonstrated that a 630-line Python script could autonomously improve ML models overnight — **100 experiments per night** — by following simple principles: one metric, constrained scope, fast verification, automatic rollback, git as memory.\n\n**Claude Autoresearch generalizes these principles to ANY domain.** Not just ML — code, content, marketing, sales, HR, DevOps, or anything with a number you can measure.\n\n---\n\n## How It Works\n\n```\nLOOP (FOREVER or N times):\n  1. Review current state + git history + results log\n  2. Pick the next change (based on what worked, what failed, what's untried)\n  3. Make ONE focused change\n  4. Git commit (before verification)\n  5. Run mechanical verification (tests, benchmarks, scores)\n  6. If improved → keep. If worse → git revert. If crashed → fix or skip.\n  7. Log the result\n  8. Repeat. Never stop until you interrupt (or N iterations complete).\n```\n\nEvery improvement stacks. Every failure auto-reverts. Progress is logged in TSV format.\n\n### The Setup Phase\n\nBefore looping, Claude performs a one-time setup:\n\n1. **Read context** — reads all in-scope files\n2. **Define goal** — extracts or asks for a mechanical metric\n3. **Define scope** — which files can be modified vs read-only\n4. **Establish baseline** — runs verification on current state (iteration #0)\n5. **Confirm and go** — shows setup, then begins the loop\n\n### 8 Critical Rules\n\n| # | Rule |\n|---|------|\n| 1 | **Loop until done** — unbounded: forever. 
Bounded: N times then summarize |\n| 2 | **Read before write** — understand full context before modifying |\n| 3 | **One change per iteration** — atomic changes. If it breaks, you know why |\n| 4 | **Mechanical verification only** — no subjective \"looks good.\" Use metrics |\n| 5 | **Automatic rollback** — failed changes revert instantly |\n| 6 | **Simplicity wins** — equal results + less code = KEEP |\n| 7 | **Git is memory** — experiments committed with `experiment:` prefix, `git revert` preserves failed experiments in history, agent MUST read `git log` + `git diff` before each iteration |\n| 8 | **When stuck, think harder** — re-read, combine near-misses, try radical changes |\n\n---\n\n## Commands\n\n| Command | What it does |\n|---------|--------------|\n| `\u002Fautoresearch` | Run the autonomous iteration loop (unlimited) |\n| `Iterations: N` | Add to inline config to run exactly N iterations then stop |\n| `\u002Fautoresearch:plan` | Interactive wizard: Goal → Scope, Metric, Verify config |\n| `\u002Fautoresearch:security` | Autonomous STRIDE + OWASP + red-team security audit |\n| `\u002Fautoresearch:ship` | Universal shipping workflow (code, content, marketing, sales, research, design) |\n| `\u002Fautoresearch:debug` | Autonomous bug-hunting loop — scientific method + iterative investigation |\n| `\u002Fautoresearch:fix` | Autonomous fix loop — iteratively repair errors until zero remain |\n| `\u002Fautoresearch:scenario` | Scenario-driven use case generator — explore situations, edge cases, derivative scenarios |\n| `\u002Fautoresearch:predict` | Multi-persona prediction — pre-analyze code from 5 expert perspectives before acting |\n| `\u002Fautoresearch:learn` | Autonomous documentation engine — scout codebase, generate\u002Fupdate docs, validate, fix loop |\n| `\u002Fautoresearch:reason` | Adversarial refinement — blind judge panel converges subjective content through isolated multi-agent debate |\n| `Guard: \u003Ccommand>` | Optional safety net — must 
pass for changes to be kept |\n\n**All commands use `AskUserQuestion` for interactive setup when invoked without arguments.** Just type the command — Claude will ask you what you need step by step with smart defaults based on your codebase. Power users can skip the wizard by providing flags inline.\n\n### Quick Decision Guide\n\n| I want to... | Use |\n|--------------|-----|\n| Improve test coverage \u002F reduce bundle size \u002F any metric | `\u002Fautoresearch` (add `Iterations: N` for bounded runs) |\n| Don't know what metric to use | `\u002Fautoresearch:plan` |\n| Run a security audit | `\u002Fautoresearch:security` |\n| Ship a PR \u002F deployment \u002F release | `\u002Fautoresearch:ship` |\n| Optimize without breaking existing tests | Add `Guard: npm test` |\n| Hunt all bugs in a codebase | `\u002Fautoresearch:debug` (add `Iterations: 20` for bounded runs) |\n| Fix all errors (tests, types, lint) | `\u002Fautoresearch:fix` |\n| Debug then auto-fix | `\u002Fautoresearch:debug --fix` |\n| Check if something is ready to ship | `\u002Fautoresearch:ship --checklist-only` |\n| Explore edge cases for a feature | `\u002Fautoresearch:scenario` |\n| Generate test scenarios | `\u002Fautoresearch:scenario --domain software --format test-scenarios` |\n| Stress test a user journey | `\u002Fautoresearch:scenario --depth deep` |\n| I want expert opinions before I start | `\u002Fautoresearch:predict` |\n| Analyze this from multiple angles | `\u002Fautoresearch:predict --chain debug` |\n| Generate docs for a new codebase | `\u002Fautoresearch:learn --mode init` |\n| Update existing docs after changes | `\u002Fautoresearch:learn --mode update` |\n| Check if docs are stale | `\u002Fautoresearch:learn --mode check` |\n| Debate an architecture decision | `\u002Fautoresearch:reason --domain software` |\n| Refine a pitch or proposal adversarially | `\u002Fautoresearch:reason --domain business` |\n| Converge on best design then validate | `\u002Fautoresearch:reason --chain 
predict` |\n\n---\n\n## Quick Start\n\n### 1. Install\n\n**Option A — Plugin install (recommended):**\n\nIn Claude Code, run:\n```\n\u002Fplugin marketplace add uditgoenka\u002Fautoresearch\n\u002Fplugin install autoresearch@autoresearch\n```\n\nThat's it. All 10 commands are available after restarting Claude Code.\n\n> **Note:** Start a new Claude Code session after installing. Reference files aren't resolvable in the same session where installation happened — this is a Claude Code platform limitation.\n\n**Updating (no reinstall needed):**\n```\n\u002Fplugin update autoresearch\n```\n\nThat pulls the latest version. Run `\u002Freload-plugins` to activate. No need to uninstall or re-clone.\n\n**Option B — Manual copy:**\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002Fuditgoenka\u002Fautoresearch.git\n\n# Copy skill + subcommands to your project\ncp -r autoresearch\u002Fclaude-plugin\u002Fskills\u002Fautoresearch .claude\u002Fskills\u002Fautoresearch\ncp -r autoresearch\u002Fclaude-plugin\u002Fcommands\u002Fautoresearch .claude\u002Fcommands\u002Fautoresearch\ncp autoresearch\u002Fclaude-plugin\u002Fcommands\u002Fautoresearch.md .claude\u002Fcommands\u002Fautoresearch.md\n```\n\nOr install globally:\n```bash\ncp -r autoresearch\u002Fclaude-plugin\u002Fskills\u002Fautoresearch ~\u002F.claude\u002Fskills\u002Fautoresearch\ncp -r autoresearch\u002Fclaude-plugin\u002Fcommands\u002Fautoresearch ~\u002F.claude\u002Fcommands\u002Fautoresearch\ncp autoresearch\u002Fclaude-plugin\u002Fcommands\u002Fautoresearch.md ~\u002F.claude\u002Fcommands\u002Fautoresearch.md\n```\n\n> **Note:** The `commands\u002F` directory is required for subcommands (`\u002Fautoresearch:ship`, `\u002Fautoresearch:plan`, `\u002Fautoresearch:security`) to work.\n\n### 2. 
Run It\n\n```\n\u002Fautoresearch\nGoal: Increase test coverage from 72% to 90%\nScope: src\u002F**\u002F*.test.ts, src\u002F**\u002F*.ts\nMetric: coverage % (higher is better)\nVerify: npm test -- --coverage | grep \"All files\"\n```\n\n### 3. Walk Away\n\nClaude reads all files, establishes a baseline, and starts iterating — one change at a time. Keep improvements, auto-revert failures, log everything. **Never stops until you interrupt** (or N iterations complete).\n\n---\n\n## \u002Fautoresearch:plan — Goal → Config Wizard\n\nThe hardest part isn't the loop — it's defining Scope, Metric, and Verify correctly. `\u002Fautoresearch:plan` converts your plain-language goal into a validated, ready-to-execute configuration.\n\n```\n\u002Fautoresearch:plan\nGoal: Make the API respond faster\n```\n\nThe wizard walks you through 5 steps: capture goal → define scope → define metric → define direction → validate verify command (dry-run). Every gate is mechanical — scope must resolve to files, metric must output a number, verify must pass a dry-run.\n\n---\n\n## \u002Fautoresearch:security — Autonomous Security Audit\n\nRead-only security audit using STRIDE threat modeling, OWASP Top 10 sweeps, and red-team adversarial analysis with 4 hostile personas.\n\n```\n\u002Fautoresearch:security\nIterations: 10\n```\n\n**What it does:** Codebase recon → asset inventory → trust boundaries → STRIDE threat model → attack surface map → autonomous testing loop → structured report.\n\nEvery finding requires **code evidence** (file:line + attack scenario). 
No theoretical fluff.\n\n| Flag | Purpose |\n|------|---------|\n| `--diff` | Only audit files changed since last audit |\n| `--fix` | Auto-fix confirmed Critical\u002FHigh findings |\n| `--fail-on \u003Cseverity>` | Exit non-zero for CI\u002FCD gating |\n\n**Output:** Creates `security\u002F{date}-{slug}\u002F` with 7 structured report files.\n\n---\n\n## \u002Fautoresearch:ship — Universal Shipping Workflow\n\nShip anything through 8 phases: **Identify → Inventory → Checklist → Prepare → Dry-run → Ship → Verify → Log.**\n\n```\n\u002Fautoresearch:ship --auto\n```\n\nAuto-detects what you're shipping (code PR, deployment, blog post, email campaign, sales deck, research paper, design assets) and generates domain-specific checklists — every item mechanically verifiable.\n\n| Flag | Purpose |\n|------|---------|\n| `--dry-run` | Validate everything but don't ship |\n| `--auto` | Auto-approve if checklist passes |\n| `--force` | Skip non-critical items (blockers still enforced) |\n| `--rollback` | Undo last ship action |\n| `--monitor N` | Post-ship monitoring for N minutes |\n| `--type \u003Ctype>` | Override auto-detection |\n| `--checklist-only` | Just check readiness |\n\n**9 supported types:** code-pr, code-release, deployment, content, marketing-email, marketing-campaign, sales, research, design.\n\n---\n\n## \u002Fautoresearch:debug — Autonomous Bug Hunter (v1.3.0)\n\nScientific method meets autoresearch loop. 
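That scientific-method cycle can be pictured as a small hypothesis ledger: every hypothesis is falsifiable and ends up confirmed, disproven, or inconclusive. A minimal Python sketch (illustrative structure only; the skill itself logs findings to TSV, not to a class like this):

```python
from dataclasses import dataclass, field

@dataclass
class HypothesisLedger:
    # Sketch only: the real skill records findings in a TSV log, not here.
    entries: list = field(default_factory=list)

    def record(self, hypothesis, evidence, status):
        # status is one of: confirmed, disproven, inconclusive
        assert status in ('confirmed', 'disproven', 'inconclusive')
        self.entries.append({'hypothesis': hypothesis,
                             'evidence': evidence,   # e.g. file:line + repro steps
                             'status': status})

    def open_questions(self):
        # Only inconclusive entries still need experiments; disproven ones
        # stay in the ledger because ruling things out is also progress.
        return [e for e in self.entries if e['status'] == 'inconclusive']
```

Disproven entries are kept deliberately, mirroring the rule that every disproven hypothesis is logged as equally valuable.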
Doesn't stop at one bug — iteratively hunts ALL bugs using falsifiable hypotheses, evidence-based investigation, and 7 investigation techniques.\n\n```\n\u002Fautoresearch:debug\nScope: src\u002Fapi\u002F**\u002F*.ts\nSymptom: API returns 500 on POST \u002Fusers\nIterations: 20\n```\n\n**How it works:** Gather symptoms → Recon (map error surface) → Hypothesize (specific, testable) → Test (one experiment per iteration) → Classify (confirmed\u002Fdisproven\u002Finconclusive) → Log → Repeat.\n\nEvery finding requires **code evidence** (file:line + reproduction steps). Every disproven hypothesis is logged — equally valuable. Uses 7 techniques: binary search, differential debugging, minimal reproduction, trace execution, pattern search, working backwards, rubber duck.\n\n| Flag | Purpose |\n|------|---------|\n| `--fix` | After hunting, auto-switch to `\u002Fautoresearch:fix` |\n| `--scope \u003Cglob>` | Limit investigation scope |\n| `--symptom \"\u003Ctext>\"` | Pre-fill symptom |\n| `--severity \u003Clevel>` | Minimum severity to report |\n\n---\n\n## \u002Fautoresearch:fix — Autonomous Error Crusher (v1.3.0)\n\nTakes a broken state and iteratively repairs it until everything passes. ONE fix per iteration. 
Atomic, committed, verified, auto-reverted on failure.\n\n```\n\u002Fautoresearch:fix\n```\n\n**How it works:** Auto-detects what's broken (tests, types, lint, build) → Prioritizes (blockers first) → Fixes ONE thing → Commits → Verifies error count decreased → Guard check (no regressions) → Keep\u002FRevert → Repeat until zero errors.\n\n**Stops automatically when error count hits zero** — even in unbounded mode.\n\n| Flag | Purpose |\n|------|---------|\n| `--target \u003Ccommand>` | Explicit verify command |\n| `--guard \u003Ccommand>` | Safety command that must always pass |\n| `--category \u003Ctype>` | Only fix specific type (test, type, lint, build) |\n| `--from-debug` | Read findings from latest debug session |\n\n**Chain them:** Run `\u002Fautoresearch:debug` with `Iterations: 15`, then `\u002Fautoresearch:fix --from-debug` with `Iterations: 30`\n\n---\n\n## \u002Fautoresearch:learn — Autonomous Documentation Engine\n\nScout codebase → generate docs → validate → fix → repeat. 4 modes: init (create from scratch), update (refresh existing), check (read-only health report), summarize (quick overview).\n\n```\n\u002Fautoresearch:learn --mode init --depth deep\n```\n\nDynamic doc discovery (scans `docs\u002F*.md`), project-type detection, validation-fix loop (max 3 retries), scale-aware scouting, git-diff scoping for updates, selective single-doc update with `--file`. Auto-generates Mermaid architecture diagrams, conditional docs (API reference, testing guide, config guide, changelog), cross-reference links between docs, and dependency documentation. 
Supports `--format` for alternative output formats.\n\n---\n\n## \u002Fautoresearch:predict — Multi-Persona Prediction (v1.7.0)\n\nBefore you debug, fix, or ship — get 5 expert perspectives in 2 minutes.\n\n`\u002Fautoresearch:predict` simulates a team of experts (Architect, Security Analyst, Performance Engineer, Reliability Engineer, Devil's Advocate) who independently analyze your code, debate findings, and reach consensus. Chain the output directly to any other command:\n\n- `\u002Fautoresearch:predict --chain debug` — pre-ranked hypotheses before debugging\n- `\u002Fautoresearch:predict --chain security` — multi-persona red team analysis\n- `\u002Fautoresearch:predict --chain scenario,debug,fix` — full quality pipeline\n\n---\n\n## \u002Fautoresearch:reason — Adversarial Refinement (v1.9.0)\n\nExtends autoresearch to **subjective domains** where no objective metric exists. The blind judge panel IS the fitness function — it's val_bpb for architecture decisions, product strategy, content quality, and design debates.\n\n```\n\u002Fautoresearch:reason\nTask: Should we use event sourcing for our order management system?\nDomain: software\nIterations: 8\n```\n\n**How it works:** Generate-A → Critic attacks (strawman) → Author-B responds → Synthesizer merges → Blind judge panel (randomized labels) picks winner → Winner becomes new A → Repeat until convergence.\n\n**Key invariant:** Every agent is a cold-start fresh invocation — no shared session, no history bleed. 
Judges never see A\u002FB\u002FAB labels, only X\u002FY\u002FZ.\n\n| Flag | Purpose |\n|------|---------|\n| `--iterations N` | Bounded mode — run exactly N rounds |\n| `--judges N` | Judge count (3-7, odd preferred) |\n| `--convergence N` | Consecutive wins to converge (default: 3) |\n| `--mode \u003Cmode>` | convergent (default), creative, debate |\n| `--domain \u003Ctype>` | software, product, business, security, research, content |\n| `--chain \u003Ctargets>` | Chain converged output to any autoresearch command |\n\n**Chain patterns:** `reason → predict` (converge then stress-test), `reason → plan,fix` (converge then implement), `reason → scenario` (converge then explore edge cases).\n\n**Output:** Creates `reason\u002F{date}-{slug}\u002F` with lineage.md, candidates.md, judge-transcripts.md, reason-results.tsv, handoff.json.\n\n---\n\n## \u002Fautoresearch:scenario — Scenario Explorer (v1.6.0)\n\nAutonomous scenario exploration engine. Takes a seed scenario and iteratively generates situations across 12 dimensions — happy paths, errors, edge cases, abuse, scale, concurrency, temporal, data variation, permissions, integrations, recovery, and state transitions.\n\n```\n\u002Fautoresearch:scenario\nScenario: User attempts to checkout with multiple payment methods\nIterations: 25\n```\n\n**How it works:** Seed analysis → Decompose into 12 dimensions → Generate ONE situation per iteration → Classify (new\u002Fvariant\u002Fduplicate) → Expand edge cases → Log → Repeat until all dimensions explored.\n\nAdaptive setup: provides 4-8 questions based on how much context you give. 
Just type `\u002Fautoresearch:scenario` with nothing else and it walks you through everything.\n\n| Flag | Purpose |\n|------|---------|\n| `--domain \u003Ctype>` | Domain: software, product, business, security, marketing |\n| `--depth \u003Clevel>` | Depth: shallow (10), standard (25), deep (50+) |\n| `--format \u003Ctype>` | Output: use-cases, user-stories, test-scenarios, threat-scenarios |\n| `--focus \u003Carea>` | Prioritize: edge-cases, failures, security, scale |\n| `--scope \u003Cglob>` | Limit to specific files\u002Ffeatures |\n\n**5 domains supported** with tailored dimension priorities and output formats. **Chain with** `\u002Fautoresearch:debug` to hunt bugs in discovered edge cases, or `\u002Fautoresearch:security` to audit discovered threat scenarios.\n\n---\n\n## Guard — Prevent Regressions (v1.0.4)\n\nWhen optimizing a metric, the loop might break existing behavior. **Guard** is an optional safety net.\n\n```\n\u002Fautoresearch\nGoal: Reduce API response time to under 100ms\nVerify: npm run bench:api | grep \"p95\"\nGuard: npm test\n```\n\n- **Verify** = \"Did the metric improve?\" (the goal)\n- **Guard** = \"Did anything else break?\" (the safety net)\n\nIf the metric improves but the guard fails, Claude reworks the optimization (up to 2 attempts). 
Guard\u002Ftest files are never modified.\n\n> **Credit:** Guard was contributed by [@pronskiy](https:\u002F\u002Fgithub.com\u002Fpronskiy) (JetBrains) in [PR #7](https:\u002F\u002Fgithub.com\u002Fuditgoenka\u002Fautoresearch\u002Fpull\u002F7).\n\n---\n\n## Results Tracking\n\nEvery iteration is logged in TSV format:\n\n```tsv\niteration  commit   metric  delta   status    description\n0          a1b2c3d  85.2    0.0     baseline  initial state\n1          b2c3d4e  87.1    +1.9    keep      add tests for auth edge cases\n2          -        86.5    -0.6    discard   refactor test helpers (broke 2 tests)\n3          c3d4e5f  88.3    +1.2    keep      add error handling tests\n```\n\nEvery 10 iterations, Claude prints a progress summary. Bounded loops print a final summary with baseline → current best.\n\n---\n\n## Crash Recovery\n\n| Failure | Response |\n|---------|----------|\n| Syntax error | Fix immediately, don't count as iteration |\n| Runtime error | Attempt fix (max 3 tries), then move on |\n| Resource exhaustion | Revert, try smaller variant |\n| Infinite loop \u002F hang | Kill after timeout, revert |\n| External dependency | Skip, log, try different approach |\n\n---\n\n## Repository Structure\n\n```\nautoresearch\u002F\n├── README.md\n├── COMPARISON.md                                  ← Karpathy's Autoresearch vs Claude Autoresearch\n├── guide\u002F                                         ← Comprehensive guides — one per command + advanced patterns\n│   ├── README.md                                  ← Guide index\n│   ├── getting-started.md                         ← Installation, core concepts, FAQ\n│   ├── autoresearch.md                            ← The autonomous loop\n│   ├── autoresearch-plan.md                       ← Setup wizard\n│   ├── autoresearch-debug.md                      ← Bug hunter\n│   ├── autoresearch-fix.md                        ← Error crusher\n│   ├── autoresearch-security.md                   ← Security auditor\n│   ├── 
autoresearch-ship.md                       ← Shipping workflow\n│   ├── autoresearch-scenario.md                   ← Scenario explorer\n│   ├── autoresearch-predict.md                    ← Multi-persona swarm prediction\n│   ├── autoresearch-learn.md                      ← Documentation engine\n│   ├── autoresearch-reason.md                     ← Adversarial refinement\n│   ├── chains-and-combinations.md                 ← Multi-command pipelines\n│   ├── examples-by-domain.md                      ← Real-world examples by domain\n│   ├── advanced-patterns.md                       ← Guards, MCP, CI\u002FCD, FAQ\n│   └── scenario\u002F                                  ← 10 real-world scenario walkthroughs\n│       ├── README.md                              ← Scenario guide index\n│       ├── real-time-chat-messaging.md\n│       ├── multi-tenant-saas-onboarding.md\n│       ├── cicd-pipeline-deployment.md\n│       ├── healthcare-appointment-scheduling.md\n│       ├── social-media-content-moderation.md\n│       ├── iot-firmware-updates.md\n│       ├── document-collaboration.md\n│       ├── cross-border-wire-transfers.md\n│       ├── search-autocomplete.md\n│       ├── mobile-push-notifications.md\n│       └── adversarial-architecture-decisions.md\n├── LICENSE\n├── .claude-plugin\u002F\n│   └── marketplace.json                           ← Plugin marketplace manifest (source: .\u002Fclaude-plugin)\n├── claude-plugin\u002F                                 ← Distribution package (what users install)\n│   ├── .claude-plugin\u002F\n│   │   └── plugin.json                            ← Plugin metadata + version\n│   ├── commands\u002F\n│   │   ├── autoresearch.md                        ← Main \u002Fautoresearch command\n│   │   └── autoresearch\u002F\n│   │       ├── ship.md                            ← \u002Fautoresearch:ship registration\n│   │       ├── plan.md                            ← \u002Fautoresearch:plan registration\n│   │       ├── security.md                        
← \u002Fautoresearch:security registration\n│   │       ├── debug.md                           ← \u002Fautoresearch:debug registration\n│   │       ├── fix.md                             ← \u002Fautoresearch:fix registration\n│   │       ├── scenario.md                        ← \u002Fautoresearch:scenario registration\n│   │       ├── predict.md                         ← \u002Fautoresearch:predict registration\n│   │       ├── learn.md                           ← \u002Fautoresearch:learn registration\n│   │       └── reason.md                          ← \u002Fautoresearch:reason registration\n│   └── skills\u002F\n│       └── autoresearch\u002F\n│           ├── SKILL.md                           ← Main skill (loaded by Claude Code)\n│           └── references\u002F\n│               ├── autonomous-loop-protocol.md    ← 8-phase loop protocol\n│               ├── core-principles.md             ← 7 universal principles\n│               ├── plan-workflow.md               ← Plan wizard protocol\n│               ├── security-workflow.md           ← Security audit protocol\n│               ├── ship-workflow.md               ← Ship workflow protocol\n│               ├── debug-workflow.md              ← Debug loop protocol\n│               ├── fix-workflow.md                ← Fix loop protocol\n│               ├── scenario-workflow.md           ← Scenario exploration protocol\n│               ├── predict-workflow.md            ← Multi-persona swarm prediction workflow\n│               ├── learn-workflow.md              ← Documentation engine protocol\n│               ├── reason-workflow.md             ← Adversarial refinement protocol\n│               └── results-logging.md             ← TSV tracking format\n```\n\n---\n\n## FAQ\n\n**Q: I don't know what metric to use.**\nA: Run `\u002Fautoresearch:plan` — it analyzes your codebase, suggests metrics, and dry-runs the verify command before you launch.\n\n**Q: Does this work with any project?**\nA: Yes. 
Any language, framework, or domain. Install via `\u002Fplugin marketplace add uditgoenka\u002Fautoresearch` or manually copy from the `claude-plugin\u002F` directory.\n\n**Q: How do I stop the loop?**\nA: `Ctrl+C` or add `Iterations: N` to your inline config to run exactly N iterations. Claude commits before verifying, so your last successful state is always in git.\n\n**Q: Can I use this for non-code tasks?**\nA: Absolutely. Sales emails, marketing copy, HR policies, runbooks — anything with a measurable metric. See [Examples by Domain](guide\u002Fexamples-by-domain.md).\n\n**Q: Does \u002Fautoresearch:security modify my code?**\nA: No. It's read-only — analyzes code and produces a structured report. Use `--fix` to opt into auto-remediation of confirmed Critical\u002FHigh findings.\n\n**Q: Can I use MCP servers?**\nA: Yes. Any MCP server configured in Claude Code is available during the loop for database queries, API calls, analytics, etc. See [Advanced Patterns](guide\u002Fadvanced-patterns.md#using-with-mcp-servers).\n\n**Q: What's the difference between \u002Fautoresearch:predict and \u002Fautoresearch:reason?**\nA: Predict is a one-shot analysis — 5 experts debate your existing code. Reason is an iterative refinement loop — competing candidates are generated, critiqued, synthesized, and blind-judged over multiple rounds until convergence. Use predict for analysis before acting; use reason for decisions where no objective metric exists.\n\n---\n\n## Contributing\n\nContributions welcome! See [CONTRIBUTING.md](CONTRIBUTING.md).\n\nAreas of interest: new domain examples, verification script templates, CI\u002FCD integrations, real-world benchmarks. 
All guides are in the [guide\u002F](guide\u002F) folder.\n\n---\n\n## Star History\n\n\u003Ca href=\"https:\u002F\u002Fwww.star-history.com\u002F?repos=uditgoenka%2Fautoresearch&type=timeline&legend=top-left\">\n \u003Cpicture>\n   \u003Csource media=\"(prefers-color-scheme: dark)\" srcset=\"https:\u002F\u002Fapi.star-history.com\u002Fimage?repos=uditgoenka\u002Fautoresearch&type=timeline&theme=dark&legend=bottom-right&v=20260319\" \u002F>\n   \u003Csource media=\"(prefers-color-scheme: light)\" srcset=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fuditgoenka_autoresearch_readme_84ee95e20a50.png\" \u002F>\n   \u003Cimg alt=\"Star History Chart\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fuditgoenka_autoresearch_readme_84ee95e20a50.png\" \u002F>\n \u003C\u002Fpicture>\n\u003C\u002Fa>\n\n---\n\n## License\n\nMIT — see [LICENSE](LICENSE).\n\n---\n\n## Credits\n\n- **[Andrej Karpathy](https:\u002F\u002Fgithub.com\u002Fkarpathy)** — for [autoresearch](https:\u002F\u002Fgithub.com\u002Fkarpathy\u002Fautoresearch)\n- **[Anthropic](https:\u002F\u002Fanthropic.com)** — for [Claude Code](https:\u002F\u002Fdocs.anthropic.com\u002Fen\u002Fdocs\u002Fclaude-code) and the skills system\n\n---\n\n\u003Cdiv align=\"center\">\n\n## About the Author\n\n\u003Ca href=\"https:\u002F\u002Fudit.co\">\n  \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fuditgoenka_autoresearch_readme_b2172c3a127b.png\" width=\"80\" style=\"border-radius: 50%;\" alt=\"Udit Goenka\" \u002F>\n\u003C\u002Fa>\n\n**[Udit Goenka](https:\u002F\u002Fudit.co)** — AI Product Expert, Founder & Angel Investor\n\nSelf-taught builder who went from a slow internet connection in India to founding multiple companies and helping 700+ startups generate over ~$25m in revenue.\n\n**Building:** [TinyCheque](https:\u002F\u002Ftinycheque.com) (India's first agentic AI venture studio) · [Firstsales.io](https:\u002F\u002Ffirstsales.io) (sales automation)\n\n**Investing:** 38 startups 
backed, 6 exits. Focused on early-stage AI and SaaS.\n\n**Connect:** [udit.co](https:\u002F\u002Fudit.co) · [@iuditg](https:\u002F\u002Fx.com\u002Fiuditg) · [@uditgoenka](https:\u002F\u002Fgithub.com\u002Fuditgoenka) · [Newsletter](https:\u002F\u002Fudit.co\u002Fblog)\n\n> *\"Autonomy scales when you constrain scope, clarify success, mechanize verification, and let agents optimize tactics while humans optimize strategy.\"*\n\n\u003C\u002Fdiv>\n","\u003Cdiv align=\"center\">\n\n# Claude 自动研究\n\n**将 [Claude Code](https:\u002F\u002Fdocs.anthropic.com\u002Fen\u002Fdocs\u002Fclaude-code) 转化为一个不懈改进的引擎。**\n\n基于 [Karpathy 的 autoresearch](https:\u002F\u002Fgithub.com\u002Fkarpathy\u002Fautoresearch) — 约束 + 机械指标 + 自主迭代 = 复利式增长。\n\n[![Claude Code 技能](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FClaude_Code-Skill-blue?logo=anthropic&logoColor=white)](https:\u002F\u002Fdocs.anthropic.com\u002Fen\u002Fdocs\u002Fclaude-code)\n[![版本](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fversion-1.9.0-blue.svg)](https:\u002F\u002Fgithub.com\u002Fuditgoenka\u002Fautoresearch\u002Freleases)\n[![许可证：MIT](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FLicense-MIT-green.svg)](LICENSE)\n\n[![基于](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FBased_on-Karpathy's_Autoresearch-orange)](https:\u002F\u002Fgithub.com\u002Fkarpathy\u002Fautoresearch)\n[![关注 @iuditg](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FFollow-@iuditg-000000?style=flat&logo=x&logoColor=white)](https:\u002F\u002Fx.com\u002Fintent\u002Ffollow?screen_name=iuditg)\n[![支持](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FSupport-PayPal-00457C?style=flat&logo=paypal&logoColor=white)](https:\u002F\u002Fpaypal.me\u002Fuditgoenka)\n\n\u003Cbr>\n\n*“设定目标 → Claude 执行循环 → 你醒来时已有成果”*\n\n*你不需要 AGI。你需要的是一个目标、一个指标，以及永不停歇的循环。*\n\n\u003Cbr>\n\n[工作原理](#how-it-works) · [命令](#commands) · [快速入门](#quick-start) · [指南](guide\u002F) · [常见问题](#faq)\n\n\u003C\u002Fdiv>\n\n---\n\n```\n      计划              循环             调试              修复    
        安全            发布\n ┌──────────┐     ┌──────────┐     ┌──────────┐     ┌──────────┐     ┌──────────┐     ┌──────────┐\n │   目标   │     │  修改  │     │   查找   │     │   修复    │     │  STRIDE  │     │  阶段   │\n │  指标  │────▶│  验证  │────▶│   缺陷   │────▶│  错误  │────▶│  OWASP   │────▶│  部署  │\n │  范围   │     │  保留\u002F   │     └──────────┘     └──────────┘     │  团队    │     └──────────┘\n\u002Fautoresearch:    └──────────┘    \u002Fautoresearch:    \u002Fautoresearch:   └──────────┘    \u002Fautoresearch:\n  计划            \u002Fautoresearch     调试              修复          \u002Fautoresearch:      发布\n                                                                     安全\n\n                  ┌──────────┐     ┌──────────┐     ┌──────────┐     ┌──────────┐\n                  │ 场景   │     │ 预测  │     │  学习   │     │  推理  │\n                  │   边缘   │     │ 5位专家 │     │   文档   │     │  辩论  │\n                  │   案例  │     │  群体   │     │   生成   │     │ 收敛  │\n                  └──────────┘     └──────────┘     └──────────┘     └──────────┘\n                 \u002Fautoresearch:   \u002Fautoresearch:   \u002Fautoresearch:   \u002Fautoresearch:\n                   场景         预测           学习           推理\n```\n\n---\n\n## 为何存在\n\n[Karpathy 的 autoresearch](https:\u002F\u002Fgithub.com\u002Fkarpathy\u002Fautoresearch) 证明，仅用一段 630 行的 Python 脚本，就能在遵循简单原则的情况下——单一指标、受限范围、快速验证、自动回滚、以 Git 作为记忆——实现机器学习模型的自主优化，每晚完成 **100 次实验**。\n\n**Claude Autoresearch 将这些原则推广到任何领域。** 不仅限于机器学习——代码、内容、营销、销售、人力资源、DevOps，或任何可量化的事物。\n\n---\n\n## 工作原理\n\n```\n循环（无限次或 N 次）：\n  1. 回顾当前状态 + Git 历史 + 结果日志\n  2. 根据成功、失败和未尝试过的方案，选择下一次改动\n  3. 进行一次专注的改动\n  4. 在验证前提交 Git 提交\n  5. 执行机械验证（测试、基准测试、评分）\n  6. 若有改进 → 保留；若变差 → Git 回滚；若崩溃 → 修复或跳过。\n  7. 记录结果\n  8. 重复。除非你中断，否则永不结束（或直到完成 N 次迭代）。\n```\n\n每一次改进都会叠加。每次失败都会自动回滚。进展会以 TSV 格式记录。\n\n### 设置阶段\n\n在进入循环之前，Claude 会进行一次性设置：\n\n1. **读取上下文** — 读取所有相关文件\n2. **定义目标** — 提取或询问一个可量化的指标\n3. **定义范围** — 哪些文件可以修改，哪些只读\n4. **建立基线** — 对当前状态进行验证（第 0 次迭代）\n5. 
**确认并开始** — 展示设置后，开始循环\n\n### 8 条关键规则\n\n| 序号 | 规则 |\n|---|------|\n| 1 | **循环直至完成** — 无限制：永远持续。有限制：N 次后总结 |\n| 2 | **先读后写** — 在修改之前充分理解上下文 |\n| 3 | **每次迭代一次改动** — 原子级变更。若出错，便知原因 |\n| 4 | **仅进行机械验证** — 不依赖主观“看起来不错”。使用指标 |\n| 5 | **自动回滚** — 失败的更改会立即回滚 |\n| 6 | **越简单越好** — 在效果相同的情况下，代码越少越保留 |\n| 7 | **Git 是记忆** — 实验以 `experiment:` 为前缀提交，`git revert` 会将失败的实验保留在历史中，代理在每次迭代前必须阅读 `git log` 和 `git diff` |\n| 8 | **卡住时，再深入思考** — 重新阅读、结合接近成功的尝试、尝试激进的改变 |\n\n---\n\n## 命令\n\n| 命令 | 功能 |\n|---------|--------------|\n| `\u002Fautoresearch` | 运行自主迭代循环（无限次） |\n| `Iterations: N` | 添加到内联配置，运行恰好 N 次迭代后停止 |\n| `\u002Fautoresearch:plan` | 交互式向导：目标 → 范围、指标、验证配置 |\n| `\u002Fautoresearch:security` | 自主 STRIDE + OWASP + 红队安全审计 |\n| `\u002Fautoresearch:ship` | 通用发布流程（代码、内容、营销、销售、研究、设计） |\n| `\u002Fautoresearch:debug` | 自主漏洞挖掘循环 — 科学方法 + 迭代式调查 |\n| `\u002Fautoresearch:fix` | 自主修复循环 — 迭代修复错误，直至全部解决 |\n| `\u002Fautoresearch:scenario` | 场景驱动的用例生成器 — 探索各种情况、边缘案例及衍生场景 |\n| `\u002Fautoresearch:predict` | 多角色预测 — 在行动前从 5 位专家视角预分析代码 |\n| `\u002Fautoresearch:learn` | 自主文档生成引擎 — 搜集代码库信息，生成\u002F更新文档，验证并修复 |\n| `\u002Fautoresearch:reason` | 对抗性优化 — 通过隔离的多智能体辩论，让盲评小组对主观内容达成共识 |\n| `Guard: \u003Ccommand>` | 可选的安全网 — 必须通过才能保留更改 |\n\n**所有命令在未提供参数时，都会使用 `AskUserQuestion` 进行交互式设置。** 只需输入命令，Claude 会根据你的代码库，逐步询问你需要的内容，并提供智能默认值。高级用户可以直接在命令中添加标志来跳过向导。\n\n### 快速决策指南\n\n| 我想... 
| 使用 |\n|--------------|-----|\n| 提高测试覆盖率 \u002F 减少打包体积 \u002F 任何指标 | `\u002Fautoresearch`（对于有限次运行，添加 `Iterations: N`） |\n| 不知道该使用什么指标 | `\u002Fautoresearch:plan` |\n| 运行安全审计 | `\u002Fautoresearch:security` |\n| 发布 PR \u002F 部署 \u002F 版本 | `\u002Fautoresearch:ship` |\n| 在不破坏现有测试的情况下优化 | 添加 `Guard: npm test` |\n| 搜索代码库中的所有 bug | `\u002Fautoresearch:debug`（对于有限次运行，添加 `Iterations: 20`） |\n| 修复所有错误（测试、类型检查、lint） | `\u002Fautoresearch:fix` |\n| 先调试再自动修复 | `\u002Fautoresearch:debug --fix` |\n| 检查某项内容是否已准备好发布 | `\u002Fautoresearch:ship --checklist-only` |\n| 探索某个功能的边界情况 | `\u002Fautoresearch:scenario` |\n| 生成测试场景 | `\u002Fautoresearch:scenario --domain software --format test-scenarios` |\n| 对用户流程进行压力测试 | `\u002Fautoresearch:scenario --depth deep` |\n| 在开始前想要专家意见 | `\u002Fautoresearch:predict` |\n| 从多个角度分析 | `\u002Fautoresearch:predict --chain debug` |\n| 为新代码库生成文档 | `\u002Fautoresearch:learn --mode init` |\n| 更新变更后的现有文档 | `\u002Fautoresearch:learn --mode update` |\n| 检查文档是否过时 | `\u002Fautoresearch:learn --mode check` |\n| 讨论架构决策 | `\u002Fautoresearch:reason --domain software` |\n| 以对抗性方式完善提案或方案 | `\u002Fautoresearch:reason --domain business` |\n| 先达成最佳设计共识再进行验证 | `\u002Fautoresearch:reason --chain predict` |\n\n---\n\n## 快速入门\n\n### 1. 
安装\n\n**选项 A — 插件安装（推荐）：**\n\n在 Claude Code 中运行：\n```\n\u002Fplugin marketplace add uditgoenka\u002Fautoresearch\n\u002Fplugin install autoresearch@autoresearch\n```\n\n完成！重启 Claude Code 后，所有 10 条命令即可使用。\n\n> **注意：** 安装后请启动一个新的 Claude Code 会话。在同一会话中无法解析引用文件——这是 Claude Code 平台的限制。\n\n**更新（无需重新安装）：**\n```\n\u002Fplugin update autoresearch\n```\n\n这将拉取最新版本。运行 `\u002Freload-plugins` 即可激活，无需卸载或重新克隆。\n\n**选项 B — 手动复制：**\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002Fuditgoenka\u002Fautoresearch.git\n\n# 将技能及子命令复制到您的项目中\ncp -r autoresearch\u002Fclaude-plugin\u002Fskills\u002Fautoresearch .claude\u002Fskills\u002Fautoresearch\ncp -r autoresearch\u002Fclaude-plugin\u002Fcommands\u002Fautoresearch .claude\u002Fcommands\u002Fautoresearch\ncp autoresearch\u002Fclaude-plugin\u002Fcommands\u002Fautoresearch.md .claude\u002Fcommands\u002Fautoresearch.md\n```\n\n或者全局安装：\n```bash\ncp -r autoresearch\u002Fclaude-plugin\u002Fskills\u002Fautoresearch ~\u002F.claude\u002Fskills\u002Fautoresearch\ncp -r autoresearch\u002Fclaude-plugin\u002Fcommands\u002Fautoresearch ~\u002F.claude\u002Fcommands\u002Fautoresearch\ncp autoresearch\u002Fclaude-plugin\u002Fcommands\u002Fautoresearch.md ~\u002F.claude\u002Fcommands\u002Fautoresearch.md\n```\n\n> **注意：** `commands\u002F` 目录是使子命令（如 `\u002Fautoresearch:ship`、`\u002Fautoresearch:plan`、`\u002Fautoresearch:security`）正常运行所必需的。\n\n### 2. 运行它\n\n```\n\u002Fautoresearch\n目标：将测试覆盖率从 72% 提升至 90%\n范围：src\u002F**\u002F*.test.ts, src\u002F**\u002F*.ts\n指标：覆盖率 %（越高越好）\n验证：npm test -- --coverage | grep \"All files\"\n```\n\n### 3. 
离开\n\nClaude 会读取所有文件，建立基线并开始迭代——每次只做一次更改。保留改进，自动回滚失败，并记录所有操作。**除非您中断，否则不会停止**（或直到达到设定的迭代次数）。\n\n---\n\n## \u002Fautoresearch:plan — 目标 → 配置向导\n\n最难的部分不是循环本身，而是正确地定义范围、指标和验证步骤。`\u002Fautoresearch:plan` 可以将您的自然语言目标转化为经过验证、可直接执行的配置。\n\n```\n\u002Fautoresearch:plan\n目标：让 API 响应更快\n```\n\n向导会引导您完成 5 个步骤：捕捉目标 → 定义范围 → 定义指标 → 定义方向 → 验证验证命令（干运行）。每个环节都有严格的要求——范围必须能解析为文件，指标必须输出一个数字，验证必须通过干运行。\n\n---\n\n## \u002Fautoresearch:security — 自主安全审计\n\n基于 STRIDE 威胁建模、OWASP Top 10 检查以及红队对抗性分析（使用 4 种敌对角色）的只读安全审计。\n\n```\n\u002Fautoresearch:security\n迭代次数：10\n```\n\n**它的工作方式：** 代码库侦察 → 资产清单 → 信任边界 → STRIDE 威胁模型 → 攻击面地图 → 自主测试循环 → 结构化报告。\n\n每个发现都需提供 **代码证据**（文件:行 + 攻击场景）。不含任何理论上的空谈。\n\n| 标志 | 用途 |\n|------|---------|\n| `--diff` | 仅审计自上次审计以来更改的文件 |\n| `--fix` | 自动修复确认的严重\u002F高危问题 |\n| `--fail-on \u003Cseverity>` | 如果出现指定级别的问题，则退出非零状态，用于 CI\u002FCD 门控 |\n\n**输出：** 创建 `security\u002F{日期}-{slug}\u002F` 文件夹，包含 7 个结构化的报告文件。\n\n---\n\n## \u002Fautoresearch:ship — 通用发布工作流\n\n通过 8 个阶段发布任何内容：**识别 → 清点 → 检查清单 → 准备 → 干运行 → 发布 → 验证 → 日志记录。**\n\n```\n\u002Fautoresearch:ship --auto\n```\n\n它可以自动检测您要发布的对象（代码 PR、部署、博客文章、邮件营销活动、销售演示文稿、研究论文、设计资产），并生成特定领域的检查清单——每项内容均可机械验证。\n\n| 标志 | 用途 |\n|------|---------|\n| `--dry-run` | 验证所有内容但不实际发布 |\n| `--auto` | 如果检查清单通过则自动批准 |\n| `--force` | 跳过非关键项（但仍强制执行关键项） |\n| `--rollback` | 撤销上一次发布操作 |\n| `--monitor N` | 发布后监控 N 分钟 |\n| `--type \u003Ctype>` | 覆盖自动检测结果 |\n| `--checklist-only` | 仅检查准备情况 |\n\n**支持的类型：** code-pr、code-release、deployment、content、marketing-email、marketing-campaign、sales、research、design。\n\n---\n\n## \u002Fautoresearch:debug — 自主 Bug 搜寻器（v1.3.0）\n\n科学方法与 autoresearch 循环相结合。它不会只找到一个 bug——而是通过可证伪的假设、基于证据的调查以及 7 种调查技术，迭代式地搜寻所有 bugs。\n\n```\n\u002Fautoresearch:debug\n范围：src\u002Fapi\u002F**\u002F*.ts\n症状：API 在 POST \u002Fusers 请求时返回 500 错误\n迭代次数：20\n```\n\n**工作原理：** 收集症状 → 侦察（绘制错误表面） → 提出假设（具体且可测试） → 测试（每次迭代进行一项实验） → 分类（确认\u002F证伪\u002F不确定） → 记录 → 重复。\n\n每个发现都需要 **代码证据**（文件:行 + 复现步骤）。所有被证伪的假设都会被记录下来——同样具有价值。它使用 7 
种技术：二分查找、差异调试、最小化复现、跟踪执行、模式搜索、逆向推理、橡皮鸭法。\n\n| 标志 | 用途 |\n|------|---------|\n| `--fix` | 搜寻完成后自动切换到 `\u002Fautoresearch:fix` |\n| `--scope \u003Cglob>` | 限制调查范围 |\n| `--symptom \"\u003Ctext>\"` | 预先填写症状 |\n| `--severity \u003Clevel>` | 最低报告级别 |\n\n---\n\n## \u002Fautoresearch:fix — 自主错误修复器（v1.3.0）\n\n接收一个存在错误的状态，并通过迭代逐步修复，直到所有问题都解决。每次迭代仅修复一处。操作具有原子性、提交后会验证，若失败则自动回滚。\n\n```\n\u002Fautoresearch:fix\n```\n\n**工作原理：** 自动检测哪些部分存在问题（测试、类型检查、代码风格检查、构建）→ 按优先级排序（阻塞问题优先）→ 修复其中一处 → 提交 → 验证错误数量是否减少 → 安全检查（确保无回归）→ 继续或回滚 → 重复此过程，直至错误数为零。\n\n**当错误数降为零时自动停止** — 即使在无界模式下也是如此。\n\n| 标志 | 用途 |\n|------|------|\n| `--target \u003Ccommand>` | 显式验证命令 |\n| `--guard \u003Ccommand>` | 必须始终通过的安全检查命令 |\n| `--category \u003Ctype>` | 仅修复特定类型的错误（测试、类型检查、代码风格检查、构建） |\n| `--from-debug` | 从最近的调试会话中读取发现的问题 |\n\n**串联使用：** 先运行 `\u002Fautoresearch:debug` 并设置 `Iterations: 15`，再运行 `\u002Fautoresearch:fix --from-debug` 并设置 `Iterations: 30`。\n\n---\n\n## \u002Fautoresearch:learn — 自主文档生成引擎\n\n扫描代码库 → 生成文档 → 验证 → 修复 → 重复。提供四种模式：init（从零开始创建）、update（更新现有文档）、check（只读健康报告）、summarize（快速概览）。\n\n```\n\u002Fautoresearch:learn --mode init --depth deep\n```\n\n动态发现文档（扫描 `docs\u002F*.md` 文件），自动检测项目类型，进行验证-修复循环（最多重试三次），根据规模调整扫描范围，利用 git-diff 确定更新范围，支持通过 `--file` 选择性更新单个文档。自动生成 Mermaid 架构图、条件化文档（API 参考、测试指南、配置指南、变更日志）、文档间的交叉引用链接以及依赖关系文档。支持 `--format` 参数以输出其他格式。\n\n---\n\n## \u002Fautoresearch:predict — 多角色预测（v1.7.0）\n\n在调试、修复或发布之前——两分钟内获取五位专家的观点。\n\n`\u002Fautoresearch:predict` 模拟一支由架构师、安全分析师、性能工程师、可靠性工程师和“魔鬼代言人”组成的专家团队，他们独立分析你的代码，讨论发现并达成共识。其输出可直接串联到其他任何命令：\n\n- `\u002Fautoresearch:predict --chain debug` — 在调试前对假设进行预排名\n- `\u002Fautoresearch:predict --chain security` — 多角色红队分析\n- `\u002Fautoresearch:predict --chain scenario,debug,fix` — 完整的质量流程\n\n---\n\n## \u002Fautoresearch:reason — 对抗式优化（v1.9.0）\n\n将 autoresearch 扩展至**主观领域**，即不存在客观度量标准的场景。盲评小组本身就是适应度函数——它用于评估架构决策、产品战略、内容质量和设计讨论等。\n\n```\n\u002Fautoresearch:reason\n任务：我们的订单管理系统是否应采用事件溯源？\n领域：软件\n迭代次数：8\n```\n\n**工作原理：** 生成 A → 批评者提出攻击性观点（稻草人论证）→ 作者 B 回应 → 合成器整合 → 
盲评小组（随机打标签）选出胜者 → 胜者成为新的 A → 重复此过程，直至收敛。\n\n**关键不变性：** 每个参与者都是全新启动的独立调用——无共享会话，无历史信息泄露。评委永远不会看到 A\u002FB\u002FAB 的标签，只会看到 X\u002FY\u002FZ。\n\n| 标志 | 用途 |\n|------|------|\n| `--iterations N` | 有界模式——精确执行 N 轮 |\n| `--judges N` | 评委人数（3–7 人，建议奇数） |\n| `--convergence N` | 连续获胜达到收敛的次数（默认：3） |\n| `--mode \u003Cmode>` | 收敛模式（默认）、创意模式、辩论模式 |\n| `--domain \u003Ctype>` | 软件、产品、业务、安全、研究、内容等领域 |\n| `--chain \u003Ctargets>` | 将收敛后的结果串联到任何 autoresearch 命令 |\n\n**串联模式：** `reason → predict`（先收敛再压力测试）、`reason → plan,fix`（先收敛再实施）、`reason → scenario`（先收敛再探索边界情况）。\n\n**输出：** 创建 `reason\u002F{date}-{slug}\u002F` 目录，包含 lineage.md、candidates.md、judge-transcripts.md、reason-results.tsv 和 handoff.json。\n\n---\n\n## \u002Fautoresearch:scenario — 场景探索器（v1.6.0）\n\n自主场景探索引擎。以一个初始场景为起点，按 12 个维度迭代生成各种情境——正常路径、错误场景、边界情况、滥用、规模、并发、时间相关、数据变化、权限、集成、恢复以及状态转换。\n\n```\n\u002Fautoresearch:scenario\n场景：用户尝试使用多种支付方式结账\n迭代次数：25\n```\n\n**工作原理：** 分析初始场景 → 分解为 12 个维度 → 每次迭代生成一种情境 → 分类（新情境\u002F变体\u002F重复）→ 拓展边界情况 → 记录 → 重复此过程，直至所有维度都被探索完毕。\n\n自适应设置：根据你提供的上下文信息，系统会提出 4–8 个问题。只需输入 `\u002Fautoresearch:scenario`，无需其他参数，系统便会引导你完成整个流程。\n\n| 标志 | 用途 |\n|------|------|\n| `--domain \u003Ctype>` | 领域：软件、产品、业务、安全、营销 |\n| `--depth \u003Clevel>` | 深度：浅层（10）、标准（25）、深层（50+） |\n| `--format \u003Ctype>` | 输出格式：用例、用户故事、测试场景、威胁场景 |\n| `--focus \u003Carea>` | 优先处理：边界情况、失败场景、安全问题、规模相关问题 |\n| `--scope \u003Cglob>` | 限制于特定文件或功能 |\n\n**支持 5 种领域**，并针对不同领域定制维度优先级和输出格式。可与 `\u002Fautoresearch:debug` 串联，以在发现的边界情况下查找 bug；或与 `\u002Fautoresearch:security` 串联，以审计发现的威胁场景。\n\n---\n\n## Guard — 防止回归（v1.0.4）\n\n在优化某个指标时，循环可能会破坏现有行为。**Guard** 是一个可选的安全保障机制。\n\n```\n\u002Fautoresearch\n目标：将 API 响应时间降至 100ms 以下\n验证：npm run bench:api | grep \"p95\"\nGuard：npm test\n```\n\n- **Verify** = “指标是否改善了？”（目标）\n- **Guard** = “是否有其他部分被破坏了？”（安全保障）\n\n如果指标改善但安全检查未通过，Claude 会重新调整优化方案（最多尝试两次）。安全检查或测试文件绝不会被修改。\n\n> **鸣谢：** Guard 由 [@pronskiy](https:\u002F\u002Fgithub.com\u002Fpronskiy)（JetBrains）在 [PR 
#7](https:\u002F\u002Fgithub.com\u002Fuditgoenka\u002Fautoresearch\u002Fpull\u002F7) 中贡献。\n\n---\n\n## 结果跟踪\n\n每一轮迭代都会以 TSV 格式记录：\n\n```tsv\niteration  commit   metric  delta   status    description\n0          a1b2c3d  85.2    0.0     baseline  初始状态\n1          b2c3d4e  87.1    +1.9    keep      添加认证边界情况的测试\n2          -        86.5    -0.6    discard   重构测试辅助工具（导致 2 个测试失败）\n3          c3d4e5f  88.3    +1.2    keep      添加错误处理测试\n```\n\n每 10 次迭代，Claude 会打印一次进度摘要。有界循环会在最后打印一份包含基线与当前最佳结果的总结。\n\n---\n\n## 故障恢复\n\n| 故障 | 应对措施 |\n|-------|----------|\n| 语法错误 | 立即修复，不计入迭代次数 |\n| 运行时错误 | 尝试修复（最多 3 次），然后继续 |\n| 资源耗尽 | 回滚，尝试较小的变体 |\n| 无限循环\u002F卡死 | 超时后终止进程，回滚 |\n| 外部依赖问题 | 跳过，记录日志，尝试其他方法 |\n\n---\n\n## 仓库结构\n\n```\nautoresearch\u002F\n├── README.md\n├── COMPARISON.md                                  ← Karpathy 的 Autoresearch 与 Claude Autoresearch 的对比\n├── guide\u002F                                         ← 综合指南 — 每个命令对应一篇，另加高级模式\n│   ├── README.md                                  ← 指南索引\n│   ├── getting-started.md                         ← 安装、核心概念、常见问题解答\n│   ├── autoresearch.md                            ← 自主循环\n│   ├── autoresearch-plan.md                       ← 设置向导\n│   ├── autoresearch-debug.md                      ← 捕捉错误\n│   ├── autoresearch-fix.md                        ← 修复错误\n│   ├── autoresearch-security.md                   ← 安全审计员\n│   ├── autoresearch-ship.md                       ← 发布工作流\n│   ├── autoresearch-scenario.md                   ← 场景探索者\n│   ├── autoresearch-predict.md                    ← 多角色群体预测\n│   ├── autoresearch-learn.md                      ← 文档生成引擎\n│   ├── autoresearch-reason.md                     ← 对抗性优化\n│   ├── chains-and-combinations.md                 ← 多命令流水线\n│   ├── examples-by-domain.md                      ← 按领域划分的真实场景示例\n│   ├── advanced-patterns.md                       ← 防护机制、MCP、CI\u002FCD、常见问题解答\n│   └── scenario\u002F                                  ← 10 个真实场景操作指南\n│       ├── README.md                              ← 
场景指南索引\n│       ├── real-time-chat-messaging.md\n│       ├── multi-tenant-saas-onboarding.md\n│       ├── cicd-pipeline-deployment.md\n│       ├── healthcare-appointment-scheduling.md\n│       ├── social-media-content-moderation.md\n│       ├── iot-firmware-updates.md\n│       ├── document-collaboration.md\n│       ├── cross-border-wire-transfers.md\n│       ├── search-autocomplete.md\n│       ├── mobile-push-notifications.md\n│       └── adversarial-architecture-decisions.md\n├── LICENSE\n├── .claude-plugin\u002F\n│   └── marketplace.json                           ← 插件市场清单（来源：.\u002Fclaude-plugin）\n├── claude-plugin\u002F                                 ← 分发包（用户安装的内容）\n│   ├── .claude-plugin\u002F\n│   │   └── plugin.json                            ← 插件元数据 + 版本\n│   ├── commands\u002F\n│   │   ├── autoresearch.md                        ← 主命令 \u002Fautoresearch\n│   │   └── autoresearch\u002F\n│   │       ├── ship.md                            ← \u002Fautoresearch:ship 注册\n│   │       ├── plan.md                            ← \u002Fautoresearch:plan 注册\n│   │       ├── security.md                        ← \u002Fautoresearch:security 注册\n│   │       ├── debug.md                           ← \u002Fautoresearch:debug 注册\n│   │       ├── fix.md                             ← \u002Fautoresearch:fix 注册\n│   │       ├── scenario.md                        ← \u002Fautoresearch:scenario 注册\n│   │       ├── predict.md                         ← \u002Fautoresearch:predict 注册\n│   │       ├── learn.md                           ← \u002Fautoresearch:learn 注册\n│   │       └── reason.md                          ← \u002Fautoresearch:reason 注册\n│   └── skills\u002F\n│       └── autoresearch\u002F\n│           ├── SKILL.md                           ← 主技能（由 Claude Code 加载）\n│           └── references\u002F\n│               ├── autonomous-loop-protocol.md    ← 8 阶段循环协议\n│               ├── core-principles.md             ← 7 条通用原则\n│               ├── plan-workflow.md               ← 
计划向导协议\n│               ├── security-workflow.md           ← 安全审计协议\n│               ├── ship-workflow.md               ← 发布工作流协议\n│               ├── debug-workflow.md              ← 调试循环协议\n│               ├── fix-workflow.md                ← 修复循环协议\n│               ├── scenario-workflow.md           ← 场景探索协议\n│               ├── predict-workflow.md            ← 多角色群体预测协议\n│               ├── learn-workflow.md              ← 文档生成引擎协议\n│               ├── reason-workflow.md             ← 对抗性优化协议\n│               └── results-logging.md             ← TSV 格式跟踪记录\n```\n\n---\n\n## 常见问题解答\n\n**问：我不知道该使用什么指标。**\n答：运行 `\u002Fautoresearch:plan` — 它会分析你的代码库，建议合适的指标，并在你正式开始之前先进行一次验证命令的试运行。\n\n**问：这个工具适用于任何项目吗？**\n答：是的。无论语言、框架或领域，都可以使用。你可以通过 `\u002Fplugin marketplace add uditgoenka\u002Fautoresearch` 进行安装，或者手动从 `claude-plugin\u002F` 目录中复制。\n\n**问：如何停止循环？**\n答：按 `Ctrl+C` 即可停止，或者在内联配置中添加 `Iterations: N` 来指定只运行 N 次迭代。Claude 会在验证之前提交更改，因此你的最后一次成功状态始终会保存在 Git 中。\n\n**问：我可以用它处理非代码任务吗？**\n答：当然可以。销售邮件、营销文案、人力资源政策、操作手册等，只要能设定可衡量的指标即可。更多信息请参阅 [按领域划分的示例](guide\u002Fexamples-by-domain.md)。\n\n**问：\u002Fautoresearch:security 会修改我的代码吗？**\n答：不会。它是只读模式——仅分析代码并生成结构化报告。如果需要自动修复已确认的严重或高危问题，可以使用 `--fix` 参数。\n\n**问：我可以使用 MCP 服务器吗？**\n答：可以。任何在 Claude Code 中配置的 MCP 服务器都可以在循环过程中用于数据库查询、API 调用、数据分析等。详情请参阅 [高级模式](guide\u002Fadvanced-patterns.md#using-with-mcp-servers)。\n\n**问：\u002Fautoresearch:predict 和 \u002Fautoresearch:reason 有什么区别？**\n答：Predict 是一次性分析——5 位专家会针对你现有的代码展开讨论。而 Reason 则是一个迭代优化的循环过程——系统会生成多个候选方案，经过评审、整合和盲评等多个轮次，直到结果趋于一致。建议在需要行动前使用 Predict 进行分析；而在缺乏客观指标的情况下做决策时，则应使用 Reason。\n\n---\n\n## 参与贡献\n\n欢迎贡献！详情请参阅 [CONTRIBUTING.md](CONTRIBUTING.md)。\n\n感兴趣的领域：新的领域示例、验证脚本模板、CI\u002FCD 集成、实际基准测试。所有指南都位于 [guide\u002F](guide\u002F) 文件夹中。\n\n---\n\n## 星标历史\n\n\u003Ca href=\"https:\u002F\u002Fwww.star-history.com\u002F?repos=uditgoenka%2Fautoresearch&type=timeline&legend=top-left\">\n \u003Cpicture>\n   \u003Csource media=\"(prefers-color-scheme: dark)\" 
srcset=\"https:\u002F\u002Fapi.star-history.com\u002Fimage?repos=uditgoenka\u002Fautoresearch&type=timeline&theme=dark&legend=bottom-right&v=20260319\" \u002F>\n   \u003Csource media=\"(prefers-color-scheme: light)\" srcset=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fuditgoenka_autoresearch_readme_84ee95e20a50.png\" \u002F>\n   \u003Cimg alt=\"星标历史图表\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fuditgoenka_autoresearch_readme_84ee95e20a50.png\" \u002F>\n \u003C\u002Fpicture>\n\u003C\u002Fa>\n\n---\n\n## 许可证\n\nMIT — 详见 [LICENSE](LICENSE)。\n\n---\n\n## 致谢\n\n- **[Andrej Karpathy](https:\u002F\u002Fgithub.com\u002Fkarpathy)** — 感谢 [autoresearch](https:\u002F\u002Fgithub.com\u002Fkarpathy\u002Fautoresearch)\n- **[Anthropic](https:\u002F\u002Fanthropic.com)** — 感谢 [Claude Code](https:\u002F\u002Fdocs.anthropic.com\u002Fen\u002Fdocs\u002Fclaude-code) 以及技能系统\n\n---\n\n\u003Cdiv align=\"center\">\n\n## 关于作者\n\n\u003Ca href=\"https:\u002F\u002Fudit.co\">\n  \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fuditgoenka_autoresearch_readme_b2172c3a127b.png\" width=\"80\" style=\"border-radius: 50%;\" alt=\"Udit Goenka\" \u002F>\n\u003C\u002Fa>\n\n**[Udit Goenka](https:\u002F\u002Fudit.co)** — AI 产品专家、创始人兼天使投资人\n\n一位自学成才的创业者，从印度那里的慢速网络起步，最终创立了多家公司，并帮助超过700家初创企业累计创造了逾2500万美元的收入。\n\n**创建的企业：** [TinyCheque](https:\u002F\u002Ftinycheque.com)（印度首家代理式 AI 创业工作室）· [Firstsales.io](https:\u002F\u002Ffirstsales.io)（销售自动化平台）\n\n**投资方面：** 已支持38家初创企业，其中6家成功退出。专注于早期阶段的AI和SaaS领域。\n\n**联系方式：** [udit.co](https:\u002F\u002Fudit.co) · [@iuditg](https:\u002F\u002Fx.com\u002Fiuditg) · [@uditgoenka](https:\u002F\u002Fgithub.com\u002Fuditgoenka) · [博客](https:\u002F\u002Fudit.co\u002Fblog)\n\n> *“当您限制任务范围、明确成功标准、实现验证流程的自动化，并让智能体负责优化执行策略、而人类专注于战略规划时，自主性才能真正规模化。”*\n\n\u003C\u002Fdiv>","# Claude Autoresearch 快速上手指南\n\nClaude Autoresearch 是一个基于 Karpathy 理念的自主迭代工具，它将 Claude Code 转化为一个不知疲倦的改进引擎。通过设定目标、机械指标和自动循环，它能在代码、内容、营销等多个领域实现持续的自动化优化。\n\n## 
环境准备\n\n在使用本工具前，请确保满足以下要求：\n\n*   **核心依赖**：已安装并配置好 **Claude Code** CLI 工具。\n*   **版本要求**：建议使用最新版本的 Claude Code 以获得最佳插件兼容性。\n*   **项目环境**：目标项目需初始化 Git 仓库（`git init`），因为该工具依赖 `git commit` 和 `git revert` 作为记忆和回滚机制。\n*   **验证命令**：确保你的项目中已有可执行的验证命令（如 `npm test`, `pytest`, `go test` 等），用于提供机械化的改进指标。\n\n> **注意**：目前官方未提供特定的中国镜像源。如果在国内访问 GitHub 或插件市场受阻，请自行配置全局代理或使用国内 Git 镜像加速克隆过程。\n\n## 安装步骤\n\n推荐通过 Claude Code 插件市场进行安装，操作简单且易于更新。\n\n### 方法一：插件市场安装（推荐）\n\n在终端启动 Claude Code 后，依次运行以下命令：\n\n```bash\n\u002Fplugin marketplace add uditgoenka\u002Fautoresearch\n\u002Fplugin install autoresearch@autoresearch\n```\n\n安装完成后，**必须重启 Claude Code 会话**才能生效（这是 Claude Code 平台的限制，当前会话无法解析新引用的文件）。\n\n**更新插件：**\n无需重新安装，运行以下命令即可拉取最新版本：\n```bash\n\u002Fplugin update autoresearch\n```\n随后运行 `\u002Freload-plugins` 激活更新。\n\n### 方法二：手动复制安装\n\n如果无法访问插件市场，可手动克隆仓库并复制文件：\n\n```bash\n# 1. 克隆仓库\ngit clone https:\u002F\u002Fgithub.com\u002Fuditgoenka\u002Fautoresearch.git\n\n# 2. 复制技能文件和命令文件到当前项目 (.claude 目录)\ncp -r autoresearch\u002Fclaude-plugin\u002Fskills\u002Fautoresearch .claude\u002Fskills\u002Fautoresearch\ncp -r autoresearch\u002Fclaude-plugin\u002Fcommands\u002Fautoresearch .claude\u002Fcommands\u002Fautoresearch\ncp autoresearch\u002Fclaude-plugin\u002Fcommands\u002Fautoresearch.md .claude\u002Fcommands\u002Fautoresearch.md\n\n# 或者安装到全局 (~\u002F.claude)\n# cp -r autoresearch\u002Fclaude-plugin\u002Fskills\u002Fautoresearch ~\u002F.claude\u002Fskills\u002Fautoresearch\n# cp -r autoresearch\u002Fclaude-plugin\u002Fcommands\u002Fautoresearch ~\u002F.claude\u002Fcommands\u002Fautoresearch\n# cp autoresearch\u002Fclaude-plugin\u002Fcommands\u002Fautoresearch.md ~\u002F.claude\u002Fcommands\u002Fautoresearch.md\n```\n\n## 基本使用\n\n安装并重启会话后，你可以直接使用 `\u002Fautoresearch` 命令启动自主迭代循环。\n\n### 1. 
启动自动优化循环\n\n在 Claude Code 中输入 `\u002Fautoresearch`，然后根据提示配置目标。以下是一个提升测试覆盖率的示例：\n\n```text\n\u002Fautoresearch\nGoal: Increase test coverage from 72% to 90%\nScope: src\u002F**\u002F*.test.ts, src\u002F**\u002F*.ts\nMetric: coverage % (higher is better)\nVerify: npm test -- --coverage | grep \"All files\"\n```\n\n**工作流程说明：**\n1.  **读取上下文**：Claude 会读取作用域内的所有文件。\n2.  **建立基线**：运行验证命令记录当前状态（迭代 #0）。\n3.  **开始循环**：\n    *   进行一次聚焦的代码修改。\n    *   提交 Git (`experiment:` 前缀)。\n    *   运行验证命令。\n    *   **结果判断**：若指标提升则保留；若变差则自动 `git revert` 回滚；若崩溃则尝试修复或跳过。\n    *   记录日志并重复，直到你手动中断或达到指定迭代次数。\n\n### 2. 使用规划向导（不确定指标时）\n\n如果你不清楚如何定义具体的指标或验证命令，可以使用规划向导：\n\n```text\n\u002Fautoresearch:plan\nGoal: Make the API respond faster\n```\n\n该向导会通过交互式问答，帮你逐步确定目标范围、量化指标和验证命令，并自动进行干跑（dry-run）验证配置是否有效。\n\n### 3. 常用场景命令速查\n\n*   **安全审计**：`\u002Fautoresearch:security` (基于 STRIDE 和 OWASP 的自动审计)\n*   **自动修复错误**：`\u002Fautoresearch:fix` (循环修复直到零错误)\n*   **调试漏洞**：`\u002Fautoresearch:debug` (科学方法驱动的漏洞猎寻)\n*   **发布流程**：`\u002Fautoresearch:ship` (通用的发布\u002F部署工作流)\n*   **限制迭代次数**：在命令后添加 `Iterations: N` (例如：`\u002Fautoresearch Iterations: 10`)\n\n> **核心原则**：每次迭代只做一个原子化修改，完全依赖机械化指标（而非主观判断），失败自动回滚，Git 即记忆。","某电商初创团队的后端工程师需要在周五下班前，将订单推荐算法的响应延迟从 200ms 优化至 100ms 以内，同时确保单元测试通过率保持 100%。\n\n### 没有 autoresearch 时\n- 工程师只能凭经验手动修改代码，每次调整后需人工运行测试，耗时且容易遗漏边界情况。\n- 遇到性能回退时，往往需要翻阅大量 Git 记录才能定位是哪次提交导致了问题，排查效率极低。\n- 由于精力有限，一天仅能尝试 3-5 种优化方案，难以覆盖更多潜在的最优解空间。\n- 深夜疲劳作战容易导致判断失误，可能误将带有隐蔽 Bug 的代码合并到主分支。\n- 缺乏系统性的实验记录，无法量化对比不同优化策略的实际收益，决策依赖直觉。\n\n### 使用 autoresearch 后\n- 设定“延迟低于 100ms”为目标、以自动化测试为指标后，autoresearch 自动执行“修改 - 验证 - 保留\u002F回滚”循环，无需人工干预。\n- 一旦某次修改导致测试失败或性能下降，工具立即自动 Git 回滚并记录原因，确保持续集成环境始终稳定。\n- 整夜可自主完成上百次迭代实验，快速遍历各种参数组合与重构方案，挖掘出人类难以想到的优化路径。\n- 所有实验结果自动生成 TSV 日志，清晰展示每次尝试的得失，让团队次日直接基于数据决策而非猜测。\n- 工程师只需定义初始范围和约束，即可安心休息，醒来时直接获得经过充分验证的最优代码版本。\n\n核心价值在于将繁琐的试错过程转化为无人值守的自动化增益引擎，让开发者从重复劳动中解放出来专注于架构设计。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fuditgoenka_autoresearch_8483b29e.png","uditgoenka","Udit 
Goenka","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002Fuditgoenka_b2172c3a.jpg","Obsession about building and selling product. Currently building TinyCheque.com - India's first AI Venture Studio. ","TinyCheque","Mumbai, India",null,"uditg","tinycheque.com","https:\u002F\u002Fgithub.com\u002Fuditgoenka",[84],{"name":85,"color":86,"percentage":87},"Shell","#89e051",100,3251,250,"2026-04-06T09:21:13","MIT","Linux, macOS, Windows","未说明",{"notes":95,"python":93,"dependencies":96},"该工具并非独立的 Python 脚本，而是专为 'Claude Code' CLI 工具设计的插件\u002F技能。运行前必须安装并配置好 Claude Code 环境及有效的 Anthropic API Key。支持通过插件市场安装或手动复制文件到 .claude 目录。无特定的 GPU、内存或 Python 版本硬性要求，其运行资源取决于宿主机器运行 Claude Code 及被操作项目的需求。",[97],"Claude Code (Anthropic)",[15,14,13,99],"其他",[101,102,64,103,104,105,106,107,108],"ai","autonomous-agent","claude","claude-code","iteration","karpathy","productivity","skill","2026-03-27T02:49:30.150509","2026-04-06T23:07:53.370583",[],[113,118,123,128,133,138,143,148,153,158,163,168,173,178,183,188,193,198,203,208],{"id":114,"version":115,"summary_zh":116,"released_at":117},118283,"v1.9.0","## v1.9.0 新功能\n\n### `\u002Fautoresearch:reason` — 第十个子命令\n\n将自动研究扩展到**主观领域**，即不存在客观度量标准的领域。通过孤立的多智能体对抗式精炼与盲评机制构建主观适应度函数——这与科学依靠同行评审、数学依赖证明的方式相同。\n\n**盲评专家组就相当于主观工作的 val_bpb。**\n\n### 工作原理\n\n```\n生成-A → 评论者攻击 A（稻草人）→ 作者-B 看到任务+A+评论 →\n合成器看到任务+A+B → 产出 AB → 盲评专家组（随机标签）\n选择 A\u002FB\u002FAB → 胜者成为新的 A → 重复直至收敛\n```\n\n**关键不变性：** 每个智能体都是冷启动的新调用——无共享会话，无历史信息泄露。这可以防止在主观任务中因溜须拍马而导致单模型精炼失败的现象。\n\n### 快速入门\n\n```bash\n# 辩论一个软件架构决策\n\u002Fautoresearch:reason\n任务：我们的订单管理系统是否应该使用事件溯源？\n领域：软件\n迭代次数：8\n\n# 用5位盲评专家优化一份商业提案\n\u002Fautoresearch:reason --judges 5 --iterations 10\n任务：撰写一份有说服力的A轮融资商业计划书\n领域：商业\n\n# 链式流程：收敛 → 验证 → 实施\n\u002Fautoresearch:reason --chain predict,fix\n任务：为高流量API设计缓存策略\n领域：软件\n```\n\n### 功能特性\n\n- **8阶段协议：** 准备 → 生成-A → 评论者 → 生成-B → 合成-AB → 评委组 → 收敛检查 → 移交\n- **上下文隔离：** 每个智能体都是全新启动——防止溜须拍马和锚定效应\n- **盲评机制：** 评委收到加密随机标签（X\u002FY\u002FZ），强制进行对比评估\n- **收敛检测：** 连续N次多数胜出（默认：3轮）\n- 
**振荡检测：** 若现任方案在未收敛的情况下更换5次以上，则强制停止\n- **3种模式：** `convergent`（默认）、`creative`（不自动停止）、`debate`（不合成）\n- **6个领域：** 软件、产品、商业、安全、研究、内容\n- **`--chain` 标志：** 将收敛后的结果传递给调试、计划、修复、安全、场景分析、预测、部署或学习等下游工具\n\n### 标志选项\n\n| 标志 | 用途 | 默认值 |\n|------|------|--------|\n| `--iterations N` | 限定迭代次数 | 无限制 |\n| `--judges N` | 评委人数（3–7，奇数） | 3 |\n| `--convergence N` | 连续获胜次数以停止 | 3 |\n| `--mode` | 收敛模式、创意模式、辩论模式 | 收敛模式 |\n| `--domain` | 软件、产品、商业、安全、研究、内容 | 由用户指定 |\n| `--chain \u003Ctargets>` | 链接到下游工具 | 无 |\n| `--judge-personas` | 覆盖默认评委 | 领域默认 |\n| `--no-synthesis` | 仅A vs B | 假 |\n\n### 链式模式\n\n```\nreason → predict        （先收敛，再用5位专家进行压力测试）\nreason → plan,fix       （先收敛，再实施）\nreason → scenario       （先收敛，再探索边缘情况）\nreason → debug          （先收敛，再通过实证验证）\npredict → reason        （先识别问题，再辩论解决方案）\nscenario → reason       （发现边缘情况","2026-03-31T17:50:38",{"id":119,"version":120,"summary_zh":121,"released_at":122},118284,"v1.8.2","## 稳定性与文档修复补丁\n\n通过一轮包含50次迭代的 `\u002Fautoresearch:debug` 审计，共修复了10个缺陷。本次更新不包含新功能，仅专注于提升系统稳定性及文档引用的一致性。\n\n### 缺陷修复\n\n- **learn.md** — 补齐缺失的参数解析章节（此前有9个标志位被静默忽略）\n- **SKILL.md** — 修正了 `--budget` 标志语义不符的问题（原为“LLM成本”，现更正为与 predict-workflow.md 一致的“最大发现数”）\n- **results-logging.md** — 完善了状态枚举，新增 `keep (reworked)`、`no-op` 和 `hook-blocked` 三个状态\n- **debug.md** — 在参数提示和解析中添加了 `--technique` 标志\n- **fix.md** — 在参数提示和解析中添加了 `--skip-lint` 标志\n\n### 文档一致性改进\n\n- **getting-started.md** — 修正命令数量，将“7”改为“9”\n- **README.md** — 将 learn.md、learn-workflow.md 和 autoresearch-learn.md 添加至仓库结构树\n- **COMPARISON.md** — 在架构树、命令表格、功能列表以及链式调用部分中加入了 learn 模块（共9条命令）\n- **CONTRIBUTING.md** — 将 COMPARISON.md 和 guide\u002Fscenario\u002F 目录添加至仓库结构、文件列表及 PR 提交指南中\n- **指南徽章** — 修复了 debug 和 security 指南中过时的版本号，由 1.7.0 更新至 1.8.2\n- **发布脚本** — 将 guide\u002Fscenario\u002F 目录和 COMPARISON.md 添加至文档审查清单\n\n**完整变更日志**: https:\u002F\u002Fgithub.com\u002Fuditgoenka\u002Fautoresearch\u002Fcompare\u002Fv1.8.1...v1.8.2","2026-03-21T17:03:12",{"id":124,"version":125,"summary_zh":126,"released_at":127},118285,"v1.8.1","## 
\u002Fautoresearch:learn — 10 Enhancements\n\nThis patch adds 10 improvements to the autonomous documentation engine, making it smarter about which docs it generates and how it keeps them in sync.\n\n### New Features\n\n- **Mermaid architecture diagrams** — `system-architecture.md` now auto-generates component, data-flow, and dependency diagrams.\n- **Conditional doc generation** — 4 new doc types are created automatically when the relevant signals are detected:\n  - `api-reference.md` (routes, controllers, OpenAPI specs)\n  - `testing-guide.md` (test directories, CI test steps)\n  - `configuration-guide.md` (`.env.example` files, `config\u002F` directories)\n  - `changelog.md` (git log grouped by conventional commits).\n- **Dependency documentation** — `codebase-summary.md` now includes a “Key Dependencies” section listing package names, versions, and purposes.\n- **Diff-based targeted updates** — update mode maps git changes to the affected docs and deep-refreshes only the stale sections.\n- **Cross-reference links** — adds “See also” links between related docs (e.g., architecture doc → API reference).\n- **Incremental scan framework** — uses cached scan context to enable delta-only rescans (future optimization).\n- **`--format` flag** — output as Markdown (default), HTML, JSON, or RST.\n\n### Release Script Improvements\n\n- `release.sh` and `release.md` now include `guide\u002Fscenario\u002F` and `COMPARISON.md` in the docs review checklist and auto-stage them.\n\n### Full Changelog\n\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fuditgoenka\u002Fautoresearch\u002Fcompare\u002Fv1.8.0...v1.8.1","2026-03-21T08:12:33",{"id":129,"version":130,"summary_zh":131,"released_at":132},118286,"v1.8.0","## What's Changed\n* Update COMPARISON.md by @uditgoenka in https:\u002F\u002Fgithub.com\u002Fuditgoenka\u002Fautoresearch\u002Fpull\u002F49\n* feat: \u002Fautoresearch:learn — automated codebase documentation engine, by @uditgoenka in https:\u002F\u002Fgithub.com\u002Fuditgoenka\u002Fautoresearch\u002Fpull\u002F50\n* Release: v1.8.0 — \u002Fautoresearch:learn, by @uditgoenka in https:\u002F\u002Fgithub.com\u002Fuditgoenka\u002Fautoresearch\u002Fpull\u002F52\n\n\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fuditgoenka\u002Fautoresearch\u002Fcompare\u002Fv1.7.6...v1.8.0","2026-03-20T13:04:41",{"id":134,"version":135,"summary_zh":136,"released_at":137},118287,"v1.7.6","## What's New\n\n### 📖 10 Scenario-Based Guide Examples (`guide\u002Fscenario\u002F`)\n\nDomain-specific, end-to-end walkthroughs of `\u002Fautoresearch:scenario`. Each guide includes command configuration, 5–6 example situations covering the 12 exploration dimensions, chain patterns, and domain-specific tips.\n\n| Guide | Domain | Key Dimensions |\n|-------|--------|----------------|\n| [Real-time chat messaging](guide\u002Fscenario\u002Freal-time-chat-messaging.md) | Software | concurrency, recovery, temporal |\n| [Multi-tenant SaaS 
onboarding](guide\u002Fscenario\u002Fmulti-tenant-saas-onboarding.md) | Software | permission, data variation, state transition |\n| [CI\u002FCD pipeline deployment](guide\u002Fscenario\u002Fcicd-pipeline-deployment.md) | Software | recovery, error, state transition |\n| [Healthcare appointment scheduling](guide\u002Fscenario\u002Fhealthcare-appointment-scheduling.md) | Business | temporal, permission, concurrency |\n| [Social media content moderation](guide\u002Fscenario\u002Fsocial-media-content-moderation.md) | Product | edge case, abuse, scale |\n| [IoT firmware updates](guide\u002Fscenario\u002Fiot-firmware-updates.md) | Software | recovery, error, scale |\n| [Document collaboration](guide\u002Fscenario\u002Fdocument-collaboration.md) | Software | concurrency, state transition, permission |\n| [Cross-border wire transfers](guide\u002Fscenario\u002Fcross-border-wire-transfers.md) | Security | abuse, permission, integration |\n| [Search autocomplete](guide\u002Fscenario\u002Fsearch-autocomplete.md) | Software | edge case, scale, data variation |\n| [Mobile push notifications](guide\u002Fscenario\u002Fmobile-push-notifications.md) | Product | scale, temporal, data variation |\n\n### 📊 Karpathy vs. Claude Autoresearch Comparison (`COMPARISON.md`)\n\nA detailed 525-line comparison document covering:\n\n- **Origin story** — how a 630-line Python script inspired a general-purpose skill system\n- **Core loop comparison** — same DNA, different expression\n- **7 shared principles** — mapped one-to-one between the two implementations\n- **Domain generalization** — ML-only approach vs. measurable metrics across 10+ domains\n- **Command interface** — 1 script vs. 8 dedicated subcommands with chaining\n- **30+ feature-by-feature comparison tables** — covering verification, security, git integration, setup, and workflows\n- **14 built-in features Claude Autoresearch adds** on top of Karpathy's original\n- **Features Karpathy's version has that Claude Autoresearch lacks** (GPU training, immutable eval)\n- **Philosophical divergence** — minimal\u002Fopinionated vs. general\u002Fmodular\n- **When to choose which** decision guide\n\n### Housekeeping\n\n- Version bump: 1.7.5 → 1.7.6 across all 5 version locations\n- Updated the README repo structure tree with the new files\n- Updated the guide index to point at the scenario subfolder\n- Synced distribution files (`.claude\u002F` → `claude-plugin\u002F`)\n\n---\n\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fuditgoenka\u002Fautoresearch\u002Fcompare\u002Fv1.7.5...v1.7.6","2026-03-20T12:21:12",{"id":139,"version":140,"summary_zh":141,"released_at":142},118288,"v1.7.5","## Documentation Improvements\n\nAdds 1,500+ lines of actionable implementation guidance to raise the benchmark score from **65.4\u002F100** to **95+\u002F100**. Every addition includes executable code snippets, configuration parameters, and worked examples.\n\n### What Changed\n\n#### TSV logging (Q7: 30→95)\n- Setup and initialization scripts, `log_iteration()` function, read\u002Fquery patterns, loop integration lifecycle\n\n#### Noisy metric handling (Q9: 31→81) — new Phase 5.1\n- Multi-run validation, minimum improvement thresholds, confirmation runs, environment pinning\n\n#### Git as memory (Q5: 45→90)\n- **New:** Bash 
functions for git memory automation: `git_memory_init()`, `read_git_memory()`, `query_git_memory()`, `write_git_memory()`\n- Error handling for git operations (detached HEAD recovery, empty repo handling)\n- Complete integration example showing how the agent makes decisions from git history\n\n#### ML mechanical metrics (Q2: 69→90)\n- **New:** ML accuracy metric configuration with Python extraction patterns\n- `verify_metric.py` — reusable Python script for programmatic metric extraction\n- Error handling: timeouts, invalid output, crash recovery\n- Complete ML model accuracy optimization example with iteration walkthrough\n\n#### One change per iteration (Q8: 52→90)\n- **New:** `Atomicity: strict` config directive with enforcement mechanism\n- Bash validation scripts: file count check, single-sentence test, “and” detection\n- Atomicity level table (strict vs. loose) with usage guidance\n\n#### DevOps pipeline optimization (Q10: 72→90)\n- **New:** CLI invocation commands for DevOps workflows (interactive + headless)\n- Error handling: deploy timeouts, health check retries, out-of-memory recovery\n- Metric definitions for complex pipelines (duration, image size, time-to-release, CPU utilization)\n- Kubernetes-based production rollback patterns\n\n#### Git rollback (Q6: 73→92)\n- Executable `safe_revert()` Bash function replacing pseudocode\n\n#### Mechanical verification (Q4: 80→100)\n- 9 language-specific verification templates\n\n#### Project initialization (Q1: 86→99)\n- Complete Phase 0 bootstrap sequence\n\n### Files Changed\n\n| File | Lines Added |\n|------|------------|\n| `autonomous-loop-protocol.md` | +421 |\n| `advanced-patterns.md` | +106 |\n| `core-principles.md` | +115 |\n| `results-logging.md` | +91 |\n| `examples-by-domain.md` | +208 |\n| `getting-started.md` | +30 |\n\n### Upgrade\n\n```bash\n\u002Fplugin update autoresearch\n```\n\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fuditgoenka\u002Fautoresearch\u002Fcompare\u002Fv1.7.4...v1.7.5","2026-03-19T12:38:20",{"id":144,"version":145,"summary_zh":146,"released_at":147},118289,"v1.7.4","## Bug Fix\n\nFixes #43 — plugin installation on macOS failed with an `ENAMETOOLONG` error caused by recursive self-nesting in the plugin cache.\n\n### The Problem\n\nWith `\"source\": \".\u002F\"` in `marketplace.json`, Claude Code cached the **entire repository** — including `marketplace.json` itself. That cached copy then triggered another round of caching, forming an infinite loop:\n\n```\n~\u002F.claude\u002Fplugins\u002Fcache\u002Fautoresearch\u002F1.7.3\u002Fautoresearch\u002F1.7.3\u002Fautoresearch\u002F1.7.3\u002F... (45+ levels deep)\n```\n\nThis pushed file paths past 1,021 characters, over macOS's 1,024-character path length limit.\n\n### The Fix\n\nPlugin distribution files are now isolated in a dedicated `claude-plugin\u002F` subdirectory that does not contain `marketplace.json`, breaking the recursion:\n\n```\nBefore: marketplace.json → source: \".\u002F\"              → caches entire repo       → infinite loop\nAfter:  marketplace.json → source: \".\u002Fclaude-plugin\"  → caches plugin files only → done\n```\n\n### What Changed\n\n- 
**Restructured plugin packaging** — moved distribution files into a dedicated `claude-plugin\u002F` directory.\n- **Updated `marketplace.json`** — changed `source` from `\".\u002F\"` to `\".\u002Fclaude-plugin\"`.\n- **Added a `.gitignore`** — the repo previously had none.\n- **Removed duplicate directories** — root-level `commands\u002F` and `skills\u002F` are replaced by `claude-plugin\u002F`.\n- **Updated the release script** — `release.sh` now syncs to `claude-plugin\u002F` instead of the root.\n- **Updated all docs** — README, CONTRIBUTING, guides, and release docs use the new paths.\n\n### Upgrade\n\n```bash\n\u002Fplugin update autoresearch\n```\n\nOr reinstall:\n```bash\n\u002Fplugin marketplace add uditgoenka\u002Fautoresearch\n\u002Fplugin install autoresearch@autoresearch\n```\n\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fuditgoenka\u002Fautoresearch\u002Fcompare\u002Fv1.7.3...v1.7.4","2026-03-19T06:46:55",{"id":149,"version":150,"summary_zh":151,"released_at":152},118290,"v1.7.3","## What's Changed\n* Release v1.7.3 — further stability fixes and improvements, by @uditgoenka in https:\u002F\u002Fgithub.com\u002Fuditgoenka\u002Fautoresearch\u002Fpull\u002F42\n\n\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fuditgoenka\u002Fautoresearch\u002Fcompare\u002Fv1.7.2...v1.7.3","2026-03-18T12:57:08",{"id":154,"version":155,"summary_zh":156,"released_at":157},118291,"v1.7.2","## v1.7.2 — Stability Fixes\n\nA comprehensive stability patch fixing **25+ bugs** found through two deep `\u002Fautoresearch:debug --iterations 25` audits. Bug fixes only, no new features.\n\n### Highlights\n\n- **Ship command**: added a `--target` flag to explicitly specify the ship target path\n- **Debug→Fix chain**: `--fix` now correctly passes `--from-debug` to the fix command\n- **Security CI**: templates now copy both `commands\u002F` and `skills\u002F` (commands was previously missed)\n- **Predict budget**: resolved a unit conflict — `--budget` now counts findings, not money\n- **Git safety**: removed all `git add -A` references in favor of explicit file staging everywhere\n- **Predict workflow**: corrected the `--rounds` range (was 0-3, now 1-3) and renamed the depth presets\n- **Release script**: fixed inconsistent step numbering\n- **SKILL.md**: removed the stale `\u002Fug:autoresearch` alias\n\n### Full Changelog\n\n| Module | Fix |\n|------|----------|\n| `ship.md` | Added `--target \u003Cpath>` flag and argument hint |\n| `debug-workflow.md` | `--fix` chain passes `--from-debug` |\n| `security-workflow.md` | CI template copies both `commands\u002F` and base commands |\n| `predict-workflow.md` | Budget: counts findings instead of money |\n| `predict-workflow.md` | `--rounds` 
range: 0-3 → 1-3 |\n| `predict-workflow.md` | Depth preset: quick → shallow |\n| `fix-workflow.md` | Changed `git add -A` to `git add \u003Cfiles>` |\n| `autonomous-loop-protocol.md` | Changed `git add -A` to explicit staging, with a WARNING note |\n| `security.md` | Added `--scope`, `--depth`, and `Focus:` |\n| `predict.md` | Changed `--dry-run` to `--incremental` |\n| `release.sh` | Step numbering [1\u002F6]→[1\u002F7], [2\u002F6]→[2\u002F7] |\n| `SKILL.md` | Removed the stale `\u002Fug:autoresearch` alias |\n| Distribution | Synced all root-level `commands\u002F` and `skills\u002F` |\n\n### Upgrade\n```\n\u002Fplugin update autoresearch\n```\n\n**23 files changed**, 57 insertions, 47 deletions.","2026-03-18T09:59:44",{"id":159,"version":160,"summary_zh":161,"released_at":162},118292,"v1.7.1","## What's Changed\n\n### Documentation Restructure\n- Replaced the monolithic `GUIDE.md` (1,791 lines) and `EXAMPLES.md` (2,228 lines) with **13 focused guide files** in a `guide\u002F` folder, totaling 6,183 lines.\n- Each subcommand gets its own detailed guide: `guide\u002Fautoresearch-predict.md` (778 lines), `guide\u002Fautoresearch-security.md` (512 lines), etc.\n- New files: `guide\u002Fchains-and-combinations.md`, `guide\u002Fexamples-by-domain.md`, `guide\u002Fadvanced-patterns.md`.\n\n### Command Reliability Fixes\n- **Commands now trigger on first invocation** — removed 518 lines of duplicated `SKILL.md` content from every command file (50%-65% less context).\n- **`--iterations N` now works** — implemented explicit argument parsing, iteration count tracking, and a forced stop after N iterations.\n- **Heavy context no longer breaks commands** — flags are now extracted separately from body context.\n- **Commands always stream output live** — explicitly specified “never run in the background”.\n\n### Distribution Sync\n- Root-level `commands\u002F` and `skills\u002F` are now auto-synced from `.claude\u002F` at release time.\n- Added a step-3 sync operation to `release.sh` (7 steps total).\n\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fuditgoenka\u002Fautoresearch\u002Fcompare\u002Fv1.7.0...v1.7.1","2026-03-18T09:14:36",{"id":164,"version":165,"summary_zh":166,"released_at":167},118293,"v1.7.0","## What's Changed\n* fix: remove self-referencing source URL causing recursive directory nesting by @daviseford in https:\u002F\u002Fgithub.com\u002Fuditgoenka\u002Fautoresearch\u002Fpull\u002F36\n* feat: \u002Fautoresearch:predict — Multi-Persona Swarm Prediction (v1.7.0) by @uditgoenka in https:\u002F\u002Fgithub.com\u002Fuditgoenka\u002Fautoresearch\u002Fpull\u002F39\n\n## New Contributors\n* 
@daviseford made their first contribution in https:\u002F\u002Fgithub.com\u002Fuditgoenka\u002Fautoresearch\u002Fpull\u002F36\n\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fuditgoenka\u002Fautoresearch\u002Fcompare\u002Fv1.6.2...v1.7.0","2026-03-18T06:29:51",{"id":169,"version":170,"summary_zh":171,"released_at":172},118294,"v1.6.2","## What's Changed\n* Simplify autoresearch plugin installation instructions by @tomashm in https:\u002F\u002Fgithub.com\u002Fuditgoenka\u002Fautoresearch\u002Fpull\u002F31\n* Release v1.6.2 — Comprehensive GUIDE.md by @uditgoenka in https:\u002F\u002Fgithub.com\u002Fuditgoenka\u002Fautoresearch\u002Fpull\u002F34\n\n## New Contributors\n* @tomashm made their first contribution in https:\u002F\u002Fgithub.com\u002Fuditgoenka\u002Fautoresearch\u002Fpull\u002F31\n\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fuditgoenka\u002Fautoresearch\u002Fcompare\u002Fv1.6.1...v1.6.2","2026-03-17T17:31:11",{"id":174,"version":175,"summary_zh":176,"released_at":177},118295,"v1.6.1","## What's Changed\n\nFixes #29 — The \"git is memory\" mechanism that powers inter-iteration learning in the autonomous loop wasn't working reliably. The agent would skip reading git history, use destructive `git reset --hard` (destroying experiment memory), and enter the loop without verifying git state.\n\nThis release comprehensively hardens the protocol with **10 targeted fixes** discovered through a 20-iteration edge case audit.\n\n### Highlights\n\n**New: Phase 0 — Precondition Checks**\nBefore entering the loop, the agent now verifies: git repo exists, working tree is clean, no stale lock files, HEAD is attached, and detects git hooks (including husky and pre-commit framework). Blocks loop entry on failures.\n\n**Mandatory Git History Reading**\nPhase 1 now uses `MUST run:` language for `git log --oneline -20` and `git diff HEAD~1`. Phase 2 requires consulting git history before ideation. 
The agent can no longer skip reading its own experiment history.\n\n**Safe Rollbacks with `safe_revert()`**\nAll 5 rollback sites in Phase 6 now use a `safe_revert()` helper that tries `git revert HEAD --no-edit` first (preserving experiment history), falling back to `git reset --hard HEAD~1` only when revert conflicts occur.\n\n**Robust Commit Phase**\n- Nothing-to-commit detection (`no-op` status instead of confusing errors)\n- `git add -A` scope warning with verification step\n- Hook failure recovery (2 attempts, never `--no-verify`)\n\n### All Changes\n\n| # | Severity | Fix |\n|---|----------|-----|\n| 1 | CRITICAL | Phase 0 precondition checks before loop entry |\n| 2 | CRITICAL | `safe_revert()` with conflict fallback across all rollback sites |\n| 3 | HIGH | Mandatory `git log` + `git diff` reading (MUST language) |\n| 4 | HIGH | Hook failure recovery in loop body |\n| 5 | MEDIUM | Nothing-to-commit handling (`no-op` status) |\n| 6 | MEDIUM | `git add -A` scope warning + verification |\n| 7 | MEDIUM | Expanded hook detection (husky, pre-commit-config, commit-msg) |\n| 8 | LOW | Guard failure recovery uses `safe_revert()` |\n| 9 | LOW | New statuses: `no-op`, `hook-blocked` + valid statuses list |\n| 10 | LOW | Final summary includes skipped iteration count |\n\n### Files Changed\n\n- `skills\u002Fautoresearch\u002Freferences\u002Fautonomous-loop-protocol.md` — +123\u002F-25 lines (core protocol)\n- `skills\u002Fautoresearch\u002FSKILL.md` — version bump to 1.6.1\n- `README.md` — version badge + expanded Rule #7\n\n### Verification\n\n- 20-iteration edge case audit via `\u002Fautoresearch:scenario`\n- 10-iteration stability verification via `\u002Fautoresearch:debug`\n- 5-iteration dry-run against Issue #29\n\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fuditgoenka\u002Fautoresearch\u002Fcompare\u002Fv1.6.0...v1.6.1","2026-03-17T09:54:49",{"id":179,"version":180,"summary_zh":181,"released_at":182},118296,"v1.6.0","## What's New\n\n### 
`\u002Fautoresearch:scenario` — Scenario Explorer\n\nNew subcommand that autonomously explores a seed scenario across **12 dimensions** to generate situations, edge cases, failure modes, and derivative scenarios. Think brainstorming meets the autoresearch loop.\n\n**Just type:**\n```\n\u002Fautoresearch:scenario\n```\nClaude asks 4-8 adaptive questions, then iterates — generating one concrete situation per iteration, classifying it (new\u002Fvariant\u002Fduplicate), expanding edge cases, and logging everything.\n\n### 7-Phase Autonomous Loop\n\n```\nSeed → Decompose → Generate → Classify → Expand → Log → Repeat\n```\n\nEach iteration picks an unexplored dimension\u002Fcombination and generates a concrete situation with actors, triggers, flow, expected outcome, and severity.\n\n### 12 Exploration Dimensions\n\nHappy path, error path, edge case, abuse\u002Fmisuse, scale, concurrent, temporal, data variation, permission, integration, recovery, state transition.\n\n### 5 Domain Templates\n\n| Domain | Priority Dimensions | Default Output |\n|--------|-------------------|----------------|\n| Software\u002FAPI | error, edge case, concurrent, integration | test-scenarios |\n| Product\u002FUX | happy path, error, permission, temporal | user-stories |\n| Business\u002FProcess | happy path, error, permission, recovery | use-cases |\n| Security\u002FCompliance | abuse, permission, data variation, concurrent | threat-scenarios |\n| Marketing\u002FSales | happy path, data variation, temporal, scale | user-stories |\n\n### Adaptive Interactive Setup\n\nQuestion count adapts (4-8) based on context provided:\n- No input → 8 questions\n- Vague scenario (≤5 words, no actor\u002Faction) → 7 questions\n- Clear scenario, no domain → 5 questions\n- Clear scenario + domain → 4 questions\n\n### Robustness (from dry-run testing)\n\n- Deferred tool fetch: ToolSearch fallback for AskUserQuestion\n- Vague\u002Fclear classification with word count + actor+action heuristic\n- Inline context 
parsing rules (flag ordering, conflict resolution)\n- Cancel & interruption handling\n\n### Flags\n\n| Flag | Purpose |\n|------|---------|\n| `--domain \u003Ctype>` | software, product, business, security, marketing |\n| `--depth \u003Clevel>` | shallow (10), standard (25), deep (50+) |\n| `--format \u003Ctype>` | use-cases, user-stories, test-scenarios, threat-scenarios, mixed |\n| `--focus \u003Carea>` | edge-cases, failures, security, scale |\n| `--scope \u003Cglob>` | Limit to specific files\u002Ffeatures |\n\n### Chaining\n\n```bash\n\u002Fautoresearch:scenario --iterations 15\n\u002Fautoresearch:debug --scope src\u002Fauth\u002F**\n\u002Fautoresearch:fix --from-debug --iterations 20\n```\n\n### Documentation\n\n- **README.md** — commands table, quick decision guide, dedicated section, repo structure\n- **EXAMPLES.md** — 10 new scenario examples with chaining patterns\n\n### Files\n\n| File | Status |\n|------|--------|\n| `commands\u002Fautoresearch\u002Fscenario.md` | Created (14 lines) |\n| `skills\u002Fautoresearch\u002Freferences\u002Fscenario-workflow.md` | Created (353 lines) |\n| `skills\u002Fautoresearch\u002FSKILL.md` | Modified (+70 lines, v1.6.0) |\n| `README.md` | Modified (+36 lines) |\n| `EXAMPLES.md` | Modified (+96 lines) |\n\n### Upgrade\n\nPlugin: `\u002Fplugin update autoresearch@autoresearch`\nManual: Pull latest, re-copy `skills\u002Fautoresearch\u002F` and `commands\u002Fautoresearch\u002F`\n\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fuditgoenka\u002Fautoresearch\u002Fcompare\u002Fv1.5.0...v1.6.0","2026-03-17T09:05:31",{"id":184,"version":185,"summary_zh":186,"released_at":187},118297,"v1.5.0","## What's New\n\n### Mandatory Interactive Setup Gate\n\nAll autoresearch commands now **enforce** `AskUserQuestion` when invoked without required context. 
Previously, invoking `\u002Fautoresearch` or any subcommand without inline configuration would silently skip the interactive setup wizard and proceed directly to execution — leaving Claude without the Goal, Scope, Metric, or other required fields.\n\n### Changes\n\n**New: Routing table in SKILL.md**\n\nA `MANDATORY: Interactive Setup Gate` section at the top of `SKILL.md` now maps every command to its required context and the specific `AskUserQuestion` flow to use when context is missing:\n\n| Command | Required Context | If Missing |\n|---------|-----------------|------------|\n| `\u002Fautoresearch` | Goal, Scope, Metric, Direction, Verify | Batch 1 (4 questions) + Batch 2 (3 questions) |\n| `\u002Fautoresearch:plan` | Goal | Ask via plan-workflow.md |\n| `\u002Fautoresearch:debug` | Issue\u002FSymptom, Scope | 4 batched questions |\n| `\u002Fautoresearch:fix` | Target, Scope | 4 batched questions |\n| `\u002Fautoresearch:security` | Scope, Depth | 3 batched questions |\n| `\u002Fautoresearch:ship` | What\u002FType, Mode | 3 batched questions |\n\n**Strengthened language across all 6 files**\n\n- All interactive setup sections renamed to `PREREQUISITE: Interactive Setup`\n- Added `CRITICAL`, `BLOCKING PREREQUISITE`, `MUST`, and `DO NOT skip` enforcement language\n- Added **STOP guards** at Phase 1 entry points in debug and fix workflows to catch execution without prior setup\n- Changed descriptive \"use AskUserQuestion\" to imperative \"you MUST call AskUserQuestion\"\n\n### Root Causes Fixed\n\n1. **No mandatory language** — Setup instructions used advisory \"use\" instead of imperative \"MUST use\"\n2. **No routing guard** — Setup Phase was buried deep in SKILL.md; subcommands jumped to reference files bypassing it\n3. 
**Interactive setup buried** — In reference files, setup was a peer section to execution phases rather than a blocking prerequisite\n\n### Files Changed\n\n- `SKILL.md` — routing table + Setup Phase strengthening (+25 lines)\n- `references\u002Fdebug-workflow.md` — PREREQUISITE + Phase 1 STOP guard\n- `references\u002Ffix-workflow.md` — PREREQUISITE + Phase 1 STOP guard\n- `references\u002Fsecurity-workflow.md` — CRITICAL BLOCKING language\n- `references\u002Fship-workflow.md` — CRITICAL BLOCKING language\n- `references\u002Fplan-workflow.md` — CRITICAL BLOCKING at Phase 1\n- `README.md` — version badge bump\n\n### Upgrade\n\nIf installed via plugin: `\u002Fplugin update autoresearch@autoresearch`\n\nIf installed manually: pull latest and re-copy `skills\u002Fautoresearch\u002F` to your `.claude\u002Fskills\u002Fautoresearch\u002F`\n\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fuditgoenka\u002Fautoresearch\u002Fcompare\u002Fv1.4.0...v1.5.0","2026-03-17T08:42:23",{"id":189,"version":190,"summary_zh":191,"released_at":192},118298,"v1.4.0","## Breaking Change\n\n**`\u002Floop N \u002Fautoresearch` no longer recommended.** Use `Iterations: N` inline config instead.\n\n### Why\n\nClaude Code's `\u002Floop` command is a **scheduler** (runs on time intervals like `\u002Floop 5m \u002Ffoo`), NOT an iteration counter. When users typed `\u002Floop 5 \u002Fautoresearch`, Claude interpreted \"5\" as \"5 minutes\" and scheduled a recurring task — not 5 iterations. 
(#24)\n\n### New Syntax\n\n**Before (broken):**\n```\n\u002Floop 25 \u002Fautoresearch\nGoal: Increase test coverage to 90%\n```\n\n**After (correct):**\n```\n\u002Fautoresearch\nGoal: Increase test coverage to 90%\nIterations: 25\n```\n\n**For CI\u002FCD:**\n```bash\nclaude -p \"\u002Fautoresearch:security --fail-on critical --iterations 10\"\n```\n\n### What Changed\n\n- All 16 documentation files updated — zero `\u002Floop` references remain\n- `Iterations: N` is now a first-class inline config parameter\n- `--iterations N` flag supported for CLI\u002FCI\u002FCD contexts\n- Works with all subcommands: `\u002Fautoresearch`, `:debug`, `:fix`, `:security`, `:ship`\n\n### Upgrade\n\n```bash\n# If installed via git\ncd ~\u002F.claude\u002Fskills && git pull\n\n# Or re-install\ngit clone https:\u002F\u002Fgithub.com\u002Fuditgoenka\u002Fautoresearch.git\ncp -r autoresearch\u002Fskills\u002Fautoresearch ~\u002F.claude\u002Fskills\u002Fautoresearch\ncp -r autoresearch\u002Fcommands\u002Fautoresearch ~\u002F.claude\u002Fcommands\u002Fautoresearch\n```\n\n### Full Changelog\n\nhttps:\u002F\u002Fgithub.com\u002Fuditgoenka\u002Fautoresearch\u002Fcompare\u002Fv1.3.3...v1.4.0","2026-03-17T05:26:51",{"id":194,"version":195,"summary_zh":196,"released_at":197},118299,"v1.3.3","## Bug Fix\n\nFixes #22 — `\u002Fautoresearch` (without suffix like `:debug` or `:fix`) was throwing \"Unknown skill: autoresearch\" error.\n\n### Root Cause\n\nClaude Code resolves slash commands via `.claude\u002Fcommands\u002F\u003Cname>.md` files. 
All subcommands (`:debug`, `:fix`, `:plan`, `:security`, `:ship`) had their registration files, but the **base `\u002Fautoresearch` command** was missing its registration file — only the directory existed.\n\n### What Changed\n\n- Added `commands\u002Fautoresearch.md` — base command registration with proper frontmatter, argument hints, and protocol loading instructions\n- Version bumped to 1.3.3\n\n### Upgrade\n\n```bash\n# If installed via plugin\n\u002Fplugin install autoresearch@autoresearch\n\n# If installed via git\ncd ~\u002F.claude\u002Fskills && git pull\n```\n\n### Full Changelog\n\nhttps:\u002F\u002Fgithub.com\u002Fuditgoenka\u002Fautoresearch\u002Fcompare\u002Fv1.3.2...v1.3.3","2026-03-16T17:43:41",{"id":199,"version":200,"summary_zh":201,"released_at":202},118300,"v1.3.2","## What's New\n\nAll 6 subcommands now batch their `AskUserQuestion` calls — asking 3-4 questions per call instead of one at a time. Users see all configuration choices together for full context upfront.\n\n### Batched Setup by Command\n\n| Subcommand | Questions per Call (Topics) |\n|------------|------------------------------|\n| `\u002Fautoresearch` | **Batch 1:** 4 (Goal, Scope, Metric, Direction) **Batch 2:** 3 (Verify, Guard, Launch) |\n| `\u002Fautoresearch:plan` | **Batch 1:** 4 (Goal, Scope, Metric, Direction) **Batch 2:** 3 (Verify, Guard, Launch) |\n| `\u002Fautoresearch:debug` | **1 call:** 4 (Issue, Scope, Depth, After) |\n| `\u002Fautoresearch:fix` | **1 call:** 4 (Fix What, Guard, Scope, Launch) |\n| `\u002Fautoresearch:security` | **1 call:** 3 (Scope, Depth, Action) |\n| `\u002Fautoresearch:ship` | **1 call:** 3 (What, Mode, Monitor) |\n\n### Why This Matters\n\n- **Before:** Each question asked one at a time — tedious back-and-forth\n- **After:** All questions in a single call — users see full context and make informed choices\n- Reduces round-trips from 5-7 to 1-2 per command\n- Users can adjust answers with full visibility of related options\n\n### Files Changed\n\n- 
`skills\u002Fautoresearch\u002FSKILL.md` — Updated interactive setup documentation + version bump\n- `.claude\u002Fskills\u002Fautoresearch\u002FSKILL.md` — Synced version\n- `README.md` — Version badge updated\n- All reference files (`debug-workflow.md`, `fix-workflow.md`, `security-workflow.md`, `ship-workflow.md`, `plan-workflow.md`) — Batched question definitions\n\n### Full Changelog\n\nhttps:\u002F\u002Fgithub.com\u002Fuditgoenka\u002Fautoresearch\u002Fcompare\u002Fv1.3.1...v1.3.2","2026-03-16T17:30:38",{"id":204,"version":205,"summary_zh":206,"released_at":207},118301,"v1.3.1","## What's New\n\n### Interactive Setup with AskUserQuestion\n\nAll 6 commands now use Claude's `AskUserQuestion` tool for guided setup when invoked without arguments. Just type the command — Claude walks you through it with smart defaults.\n\n**Before:** You had to know the exact syntax\n```\n\u002Fautoresearch\nGoal: Increase test coverage from 72% to 90%\nScope: src\u002F**\u002F*.test.ts, src\u002F**\u002F*.ts\nMetric: coverage % (higher is better)\nVerify: npm test -- --coverage | grep \"All files\"\n```\n\n**After:** Just type `\u002Fautoresearch` — Claude asks step by step\n```\n→ \"What do you want to improve?\"\n→ \"Which files can autoresearch modify?\"\n→ \"What number tells you if things got better?\"\n→ \"What command produces the metric?\" (dry-run validated)\n→ \"Any guard command?\"\n→ \"Ready to launch?\"\n```\n\n| Command | Interactive Questions | Skips When |\n|---------|----------------------|------------|\n| `\u002Fautoresearch` | Goal, Scope, Metric, Verify, Guard | Inline config |\n| `\u002Fautoresearch:plan` | Goal + wizard (already had it) | Goal inline |\n| `\u002Fautoresearch:security` | Scope, Depth, After-action | `--diff`, `--fix`, `--fail-on` |\n| `\u002Fautoresearch:ship` | Ship type, Ship mode | `--type`, `--dry-run`, `--auto` |\n| `\u002Fautoresearch:debug` | Symptom, Scope, After-action | `--scope`, `--symptom`, `--fix` |\n| `\u002Fautoresearch:fix` | 
Category, Guard, Scope, Confirm | `--target`, `--guard`, `--scope` |\n\nPower users can still provide everything inline — the wizard is bypassed when flags are present.\n\n### Bug Fixes\n\n- **Plugin install:** Fixed marketplace.json source from `\".\"` to proper git URL\n- **Plugin install:** Fixed README instructions — now includes required `installLocation` and `lastUpdated` fields (fixes #19)\n- **Plugin install:** Added marketplace registration step to README\n\n## Install\n\n```bash\n\u002Fplugin install autoresearch@autoresearch\n```\n\nOr manual:\n```bash\ncp -r autoresearch\u002Fskills\u002Fautoresearch .claude\u002Fskills\u002Fautoresearch\ncp -r autoresearch\u002Fcommands\u002Fautoresearch .claude\u002Fcommands\u002Fautoresearch\n```\n\n## Full Changelog\n\nhttps:\u002F\u002Fgithub.com\u002Fuditgoenka\u002Fautoresearch\u002Fcompare\u002Fv1.3.0...v1.3.1","2026-03-16T17:06:47",{"id":209,"version":210,"summary_zh":211,"released_at":212},118302,"v1.3.0","## What's New\n\n### `\u002Fautoresearch:debug` — Autonomous Bug Hunter\n\nScientific method meets autoresearch loop. 
Doesn't stop at one bug — iteratively hunts ALL bugs using falsifiable hypotheses and evidence-based investigation.\n\n```bash\n# Hunt all bugs\n\u002Floop 20 \u002Fautoresearch:debug\n\n# Debug specific error\n\u002Fautoresearch:debug\nSymptom: API returns 500 on POST \u002Fusers\n\n# Debug then auto-fix everything found\n\u002Fautoresearch:debug --fix\n```\n\n**What makes it different from regular debugging:**\n\n| Feature | Traditional Debug | `\u002Fautoresearch:debug` |\n|---------|------------------|----------------------|\n| Scope | Find one bug | Hunt ALL bugs iteratively |\n| Method | Ad-hoc | Scientific method (hypothesize → test → prove\u002Fdisprove) |\n| Memory | Start fresh each time | Results log tracks what was tried |\n| Rollback | Manual | Automatic (disproven hypotheses reverted) |\n| Coverage | Unknown | Composite metric tracks files investigated |\n\n**Built-in knowledge:**\n- 7 investigation techniques (binary search, differential, minimal reproduction, trace, pattern search, working backwards, rubber duck)\n- 4 cognitive bias guards (confirmation, anchoring, sunk cost, availability)\n- 15 common bug patterns across 7 languages (JS, TS, Python, Go, Rust, Java, SQL)\n- 5 domain-specific checklists (API, database, auth, async\u002Fconcurrency, network)\n- The 5 Whys root cause drill-down\n- \"What NOT to Do\" anti-patterns table\n\n### `\u002Fautoresearch:fix` — Autonomous Error Crusher\n\nTakes a broken state and iteratively repairs it until zero errors remain. ONE fix per iteration. 
Atomic, committed, verified, auto-reverted on failure.\n\n```bash\n# Fix everything\n\u002Fautoresearch:fix\n\n# Fix with guard (no regressions)\n\u002Fautoresearch:fix\nGuard: npm test\n\n# Fix from debug findings\n\u002Floop 30 \u002Fautoresearch:fix --from-debug\n```\n\n**What makes it different:**\n\n| Feature | Traditional Fix | `\u002Fautoresearch:fix` |\n|---------|----------------|---------------------|\n| Scope | Fix one thing | Fix ALL errors until zero |\n| Priority | Manual | Auto-prioritized (blockers → errors → warnings) |\n| Safety | Hope nothing breaks | Guard command prevents regressions |\n| Atomicity | Sometimes | Always (one fix per iteration) |\n| Rollback | Manual undo | Automatic revert on failure |\n| Completion | Guess | Stops automatically at zero errors |\n\n**Built-in knowledge:**\n- Fix strategies for 5 languages (TypeScript, Python, Go, Rust, Java)\n- 9-row anti-patterns table (never suppress, never delete tests, never use `any`)\n- Compound fix detection (fixing one reveals another)\n- Impact assessment with blast radius analysis\n- 5-step rollback protocol\n- Parallel fix detection for independent errors\n- Escalation path after 3 failed attempts\n- Domain-specific patterns: dependency fixes, DB migrations, CI\u002FCD pipelines\n\n### Meta: Built by Autoresearch\n\nBoth protocols were **refined through 35 autoresearch loop iterations** across 3 parallel agents:\n\n```\nScore: 90\u002F100 → 143\u002F150 (rubric expanded mid-loop)\nIterations: 35 total (13 debug + 19 fix + 2 manual + 1 scoring)\nProtocol size: 490 → 1,092 lines (+123%)\n```\n\nThe autoresearch loop was used to improve its own debugging and fixing protocols.\n\n## All Commands (v1.3.0)\n\n| Command | Purpose | Since |\n|---------|---------|-------|\n| `\u002Fautoresearch` | Autonomous iteration loop | v1.0.0 |\n| `\u002Fautoresearch:plan` | Goal → config wizard | v1.0.2 |\n| `\u002Fautoresearch:security` | STRIDE + OWASP audit | v1.0.3 |\n| `\u002Fautoresearch:ship` | 
Universal shipping workflow | v1.1.0 |\n| `\u002Fautoresearch:debug` | Autonomous bug hunting | **v1.3.0** |\n| `\u002Fautoresearch:fix` | Autonomous error fixing | **v1.3.0** |\n| `Guard: \u003Ccommand>` | Regression prevention | v1.0.4 |\n\n## Install\n\n```bash\n\u002Fplugin install autoresearch@autoresearch\n```\n\nOr manual:\n```bash\ncp -r autoresearch\u002Fskills\u002Fautoresearch .claude\u002Fskills\u002Fautoresearch\ncp -r autoresearch\u002Fcommands\u002Fautoresearch .claude\u002Fcommands\u002Fautoresearch\n```\n\n## Full Changelog\n\nhttps:\u002F\u002Fgithub.com\u002Fuditgoenka\u002Fautoresearch\u002Fcompare\u002Fv1.2.0...v1.3.0","2026-03-16T12:56:53"]