# The Ralph Playbook

December 2025 boiled [Ralph's](https://ghuntley.com/ralph/) powerful yet dumb little face to the top of most AI-related timelines.

I try to pay attention to the crazy-smart insights [@GeoffreyHuntley](https://x.com/GeoffreyHuntley) shares, but I can't say Ralph really clicked for me this summer.
Now, all of the recent hubbub has made it hard to ignore.

[@mattpocockuk](https://x.com/mattpocockuk/status/2008200878633931247) and [@ryancarson](https://x.com/ryancarson/status/2008548371712135632)'s overviews helped a lot - right until Geoff came in and [said 'nah'](https://x.com/GeoffreyHuntley/status/2008731415312236984).

<img src="https://oss.gittoolsai.com/images/ClaytonFarr_ralph-playbook_readme_b1d483cbb3de.png" alt="nah" width="500" />

## So what is the optimal way to Ralph?

Many folks seem to be getting good results with various shapes - but I wanted to read the tea leaves as closely as possible from the person who not only captured this approach but also has had the most ass-time in the seat putting it through its paces.

So I dug in to really _RTFM_ on [recent videos](https://www.youtube.com/watch?v=O2bBWDoxO4s) and Geoff's [original post](https://ghuntley.com/ralph/) to try and untangle for myself what works best.

Below is the result - a (likely OCD-fueled) Ralph Playbook that organizes the miscellaneous details for putting this all into practice, hopefully without neutering it in the process.

> Digging into all of this has also brought to mind some possibly valuable [additional enhancements](#enhancements) to the core approach that aim to stay aligned with the guidelines that make Ralph work so well.

> [!TIP]
> View as [📖 Formatted Guide →](https://ClaytonFarr.github.io/ralph-playbook/)

Hope this helps you out - [@ClaytonFarr](https://x.com/ClaytonFarr)

---

## Table of Contents

- [Workflow](#workflow)
- [Key Principles](#key-principles)
- [Loop Mechanics](#loop-mechanics)
- [License](#license)
- [Files](#files)
- [Enhancements?](#enhancements)

---

## Workflow

A picture is worth a thousand tweets and an hour-long video.
Geoff's [overview here](https://ghuntley.com/ralph/) (sign up to his newsletter to see the full article) really helped clarify the workflow details for moving from 1) idea → 2) individual JTBD-aligned specs → 3) comprehensive implementation plan → 4) Ralph work loops.

![ralph-diagram.png](https://oss.gittoolsai.com/images/ClaytonFarr_ralph-playbook_readme_9ce8e43650f7.png)

### 🗘 Three Phases, Two Prompts, One Loop

This diagram clarified for me that Ralph isn't just "a loop that codes." It's a funnel with 3 Phases, 2 Prompts, and 1 Loop.

#### Phase 1. Define Requirements (LLM conversation)

- Discuss project ideas → identify Jobs to Be Done (JTBD)
- Break individual JTBD into topic(s) of concern
- Use subagents to load info from URLs into context
- LLM understands JTBD topic of concern: subagent writes `specs/FILENAME.md` for each topic

#### Phase 2 / 3. Run Ralph Loop (two modes, swap `PROMPT.md` as needed)

Same loop mechanism, different prompts for different objectives:

| Mode       | When to use                            | Prompt focus                                            |
| ---------- | -------------------------------------- | ------------------------------------------------------- |
| _PLANNING_ | No plan exists, or plan is stale/wrong | Generate/update `IMPLEMENTATION_PLAN.md` only           |
| _BUILDING_ | Plan exists                            | Implement from plan, commit, update plan as side effect |

_Prompt differences per mode:_

- 'PLANNING' prompt does gap analysis (specs vs code) and outputs a prioritized TODO list—no implementation, no commits.
- 'BUILDING' prompt assumes a plan exists, picks tasks from it, implements, runs tests (backpressure), commits.

_Why use the loop for both modes?_

- BUILDING requires it: inherently iterative (many tasks × fresh context = isolation)
- PLANNING uses it for consistency: same execution model, though it often completes in 1-2 iterations
- Flexibility: if the plan needs refinement, the loop allows multiple passes reading its own output
- Simplicity: one mechanism for everything; clean file I/O; easy stop/restart

_Context loaded each iteration:_ `PROMPT.md` + `AGENTS.md`

_PLANNING mode loop lifecycle:_

1. Subagents study `specs/*` and existing `/src`
2. Compare specs against code (gap analysis)
3. Create/update `IMPLEMENTATION_PLAN.md` with prioritized tasks
4. No implementation

_BUILDING mode loop lifecycle:_

1. _Orient_ – subagents study `specs/*` (requirements)
2. _Read plan_ – study `IMPLEMENTATION_PLAN.md`
3. _Select_ – pick the most important task
4. _Investigate_ – subagents study relevant `/src` ("don't assume not implemented")
5. _Implement_ – N subagents for file operations
6. _Validate_ – 1 subagent for build/tests (backpressure)
7. _Update `IMPLEMENTATION_PLAN.md`_ – mark task done, note discoveries/bugs
8. _Update `AGENTS.md`_ – if operational learnings
9. _Commit_
10. _Loop ends_ → context cleared → next iteration starts fresh

#### Concepts

| Term                    | Definition                                                      |
| ----------------------- | --------------------------------------------------------------- |
| _Job to be Done (JTBD)_ | High-level user need or outcome                                 |
| _Topic of Concern_      | A distinct aspect/component within a JTBD                       |
| _Spec_                  | Requirements doc for one topic of concern (`specs/FILENAME.md`) |
| _Task_                  | Unit of work derived from comparing specs to code               |

_Relationships:_

- 1 JTBD → multiple topics of concern
- 1 topic of concern → 1 spec
- 1 spec → multiple tasks (specs are larger than tasks)

_Example:_

- JTBD: "Help designers create mood boards"
- Topics: image collection, color extraction, layout, sharing
- Each topic → one spec file
- Each spec → many tasks in the implementation plan

_Topic Scope Test: "One Sentence Without 'And'"_

- Can you describe the topic of concern in one sentence without conjoining unrelated capabilities?
  - ✓ "The color extraction system analyzes images to identify dominant colors"
  - ✗ "The user system handles authentication, profiles, and billing" → 3 topics
- If you need "and" to describe what it does, it's probably multiple topics

---

## Key Principles

### ⏳ Context Is _Everything_

- An advertised 200K+ token window ≈ ~176K truly usable
- And 40-60% context utilization is the "smart zone"
- Tight tasks + 1 task per loop = _100% smart zone context utilization_

This informs and drives everything else:

- _Use the main agent/context as a scheduler_
  - Don't allocate expensive work to the main context; spawn subagents whenever possible instead
- _Use subagents as memory extension_
  - Each subagent gets ~156kb that's garbage collected
  - Fan out to avoid polluting the main context
- _Simplicity and brevity win_
  - Applies to the number of parts in the system, loop config, and content
  - Verbose inputs degrade determinism
- _Prefer Markdown over JSON_
  - To define and track work, for better token efficiency

### 🧭 Steering Ralph: Patterns + Backpressure

Creating the right signals & gates to steer Ralph's successful output is **critical**. You can steer from two directions:

- _Steer upstream_
  - Ensure deterministic setup:
    - Allocate the first ~5,000 tokens for specs
    - Every loop's context is allocated with the same files so the model starts from a known state (`PROMPT.md` + `AGENTS.md`)
  - Your existing code shapes what gets used and generated
  - If Ralph is generating wrong patterns, add/update utilities and existing code patterns to steer it toward correct ones
- _Steer downstream_
  - Create backpressure via tests, typechecks, lints, builds, etc. that will reject invalid/unacceptable work
  - The prompt says "run tests" generically; `AGENTS.md` specifies the actual commands to make backpressure project-specific
  - Backpressure can extend beyond code validation: some acceptance criteria resist programmatic checks - creative quality, aesthetics, UX feel. LLM-as-judge tests can provide backpressure for subjective criteria with binary pass/fail.
([More detailed thoughts below](#non-deterministic-backpressure) on how to approach this with Ralph.)
- _Remind Ralph to create/use backpressure_
  - Remind Ralph to use backpressure when implementing: "Important: When authoring documentation, capture the why — tests and implementation importance."

### 🙏 Let Ralph Ralph

Ralph's effectiveness comes from how much you trust it to do the right thing (eventually) and engender its ability to do so.

- _Let Ralph Ralph_
  - Lean into the LLM's ability to self-identify, self-correct and self-improve
  - Applies to the implementation plan, task definition and prioritization
  - Eventual consistency achieved through iteration
- _Use protection_
  - To operate autonomously, Ralph requires `--dangerously-skip-permissions` - asking for approval on every tool call would break the loop. This bypasses Claude's permission system entirely - so a sandbox becomes your only security boundary.
  - Philosophy: "It's not if it gets popped, it's when. And what is the blast radius?"
  - Running without a sandbox exposes credentials, browser cookies, SSH keys, and access tokens on your machine
  - Run in isolated environments with minimum viable access:
    - Only the API keys and deploy keys needed for the task
    - No access to private data beyond requirements
    - Restrict network connectivity where possible
  - Options: Docker sandboxes (local), Fly Sprites/E2B/etc. (remote/production) - [additional notes](references/sandbox-environments.md)
  - Additional escape hatches: Ctrl+C stops the loop; `git reset --hard` reverts uncommitted changes; regenerate the plan if the trajectory goes wrong

### 🚦 Move Outside the Loop

To get the most out of Ralph, you need to get out of his way. Ralph should be doing _all_ of the work, including deciding which planned work to implement next and how to implement it.
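To make the _Use protection_ guidance above concrete, here is a dry-run sketch of a Docker-sandboxed launch. It only prints the command it would run; the image, mount, and key names are illustrative assumptions on my part, not from the playbook:

```shell
#!/bin/bash
# Hypothetical sandboxed launch for the Ralph loop (dry run: prints the
# docker command instead of executing it). Blast-radius goals: mount only
# the repo checkout, pass only a task-scoped API key, and leave host SSH
# keys, cookies, and other credentials invisible to the container.
sandbox_cmd() {
  printf '%s ' \
    docker run --rm -it \
    -v "$PWD:/work" -w /work \
    -e "ANTHROPIC_API_KEY=<task-scoped-key>" \
    node:22-bookworm \
    ./loop.sh build 20
  printf '\n'
}
sandbox_cmd
```

Running the printed command would give the loop only the repo checkout and one task-scoped key, keeping everything else on the host out of reach.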
Your job is now to sit on the loop, not in it - to engineer the setup and environment that will allow Ralph to succeed.

_Observe and course correct_ – especially early on, sit and watch. What patterns emerge? Where does Ralph go wrong? What signs does he need? The prompts you start with won't be the prompts you end with - they evolve through observed failure patterns.

_Tune it like a guitar_ – instead of prescribing everything upfront, observe and adjust reactively. When Ralph fails a specific way, add a sign to help him next time.

But signs aren't just prompt text. They're _anything_ Ralph can discover:

- Prompt guardrails - explicit instructions like "don't assume not implemented"
- `AGENTS.md` - operational learnings about how to build/test
- Utilities in your codebase - when you add a pattern, Ralph discovers it and follows it
- Other discoverable, relevant inputs…

> [!TIP]
>
> 1. try starting with _nothing_ in `AGENTS.md` (empty file; no _best practices_, etc.)
> 2. spot-test desired actions, find missteps ([walkthrough example from Geoff](https://x.com/ClaytonFarr/status/2010780371542241508))
> 3. watch initial loops, see where gaps occur
> 4. tune behavior _only as needed_, via AGENTS updates and/or code patterns (shared utilities, etc.)

And remember, _the plan is disposable:_

- If it's wrong, throw it out, and start over
- Regeneration cost is one Planning loop; cheap compared to Ralph going in circles
- Regenerate when:
  - Ralph is going off track (implementing wrong things, duplicating work)
  - Plan feels stale or doesn't match current state
  - Too much clutter from completed items
  - You've made significant spec changes
  - You're confused about what's actually done

---

## Loop Mechanics

### I. Task Selection

`loop.sh` acts in effect as an 'outer loop' where each loop = a single task (in separate sessions).
When the task is completed, `loop.sh` kicks off a fresh session to select the next task, if any remaining tasks are available.

Geoff's initial minimal form of the `loop.sh` script:

```bash
while :; do cat PROMPT.md | claude ; done
```

_Note:_ The same approach can be used with other CLIs; e.g. `amp`, `codex`, `opencode`, etc.

_What controls task continuation?_

The continuation mechanism is elegantly simple:

1. _Bash loop runs_ → feeds `PROMPT.md` to claude
2. _PROMPT.md instructs_ → "Study IMPLEMENTATION_PLAN.md and choose the most important thing..."
3. _Agent completes one task_ → updates IMPLEMENTATION_PLAN.md on disk, commits, exits
4. _Bash loop restarts immediately_ → fresh context window
5. _Agent reads updated plan_ → picks next most important thing...

_Key insight:_ The IMPLEMENTATION_PLAN.md file persists on disk between iterations and acts as shared state between otherwise isolated loop executions. Each iteration deterministically loads the same files (`PROMPT.md` + `AGENTS.md` + `specs/*`) and reads the current state from disk.

_No sophisticated orchestration needed_ - just a dumb bash loop that keeps restarting the agent, and the agent figures out what to do next by reading the plan file each time.

### II. Task Execution

Each task is prompted to keep doing its work against backpressure (tests, etc.) until it passes - creating a pseudo inner 'loop' (in a single session).

This inner loop is just internal self-correction / iterative reasoning within one long model response, powered by backpressure prompts, tool use, and subagents. It's not a loop in the programming sense.

A single task execution has no hard technical limit.
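In miniature, that dynamic can be sketched with a stub (purely illustrative: the attempt counter stands in for the model reworking its output, and `validate` stands in for the test suite):

```shell
#!/bin/bash
# Toy inner 'loop': a stub worker reworks its output against a validation
# gate until the gate passes - a stand-in for the model self-correcting
# against failing tests within one session. Work here only "becomes valid"
# on the third attempt.
attempts=0
do_work()  { attempts=$((attempts + 1)); }
validate() { [ "$attempts" -ge 3 ]; }   # backpressure: rejects attempts 1-2

do_work
until validate; do
  echo "attempt $attempts rejected by backpressure - reworking"
  do_work
done
echo "attempt $attempts passed - ready to commit"
```

The important property: nothing bounds the retries except the gate itself, which is why scope discipline and backpressure, rather than a hard limit, are what keep a task execution under control.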
Control relies on:

- _Scope discipline_ - PROMPT.md instructs "one task" and "commit when tests pass"
- _Backpressure_ - tests/build failures force the agent to fix issues before committing
- _Natural completion_ - agent exits after a successful commit

_Ralph can go in circles, ignore instructions, or take wrong directions_ - this is expected and part of the tuning process. When Ralph "tests you" by failing in specific ways, you add guardrails to the prompt or adjust backpressure mechanisms. The nondeterminism is manageable through observation and iteration.

### Enhanced `loop.sh` Example

Wraps the core loop with mode selection (plan/build), a max-iterations cap on the number of tasks to complete, and a git push after each iteration.

_This enhancement uses two saved prompt files:_

- `PROMPT_plan.md` - Planning mode (gap analysis, generates/updates plan)
- `PROMPT_build.md` - Building mode (implements from plan)

```bash
#!/bin/bash
# Usage: ./loop.sh [plan|build] [max_iterations]
# Examples:
#   ./loop.sh              # Build mode, unlimited tasks
#   ./loop.sh 20           # Build mode, max 20 tasks
#   ./loop.sh build 20     # Build mode, max 20 tasks
#   ./loop.sh plan         # Plan mode, unlimited tasks
#   ./loop.sh plan 5       # Plan mode, max 5 tasks

# Parse arguments
if [ "$1" = "plan" ]; then
    # Plan mode
    MODE="plan"
    PROMPT_FILE="PROMPT_plan.md"
    MAX_ITERATIONS=${2:-0}
elif [ "$1" = "build" ]; then
    # Explicit build mode (with optional max iterations)
    MODE="build"
    PROMPT_FILE="PROMPT_build.md"
    MAX_ITERATIONS=${2:-0}
elif [[ "$1" =~ ^[0-9]+$ ]]; then
    # Build mode with max tasks (bare number)
    MODE="build"
    PROMPT_FILE="PROMPT_build.md"
    MAX_ITERATIONS=$1
else
    # Build mode, unlimited (no arguments or invalid input)
    MODE="build"
    PROMPT_FILE="PROMPT_build.md"
    MAX_ITERATIONS=0
fi

ITERATION=0
CURRENT_BRANCH=$(git branch --show-current)

echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
echo "Mode:   $MODE"
echo "Prompt: $PROMPT_FILE"
echo "Branch: $CURRENT_BRANCH"
[ "$MAX_ITERATIONS" -gt 0 ] && echo "Max:    $MAX_ITERATIONS iterations (number of tasks)"
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"

# Verify prompt file exists
if [ ! -f "$PROMPT_FILE" ]; then
    echo "Error: $PROMPT_FILE not found"
    exit 1
fi

while true; do
    if [ "$MAX_ITERATIONS" -gt 0 ] && [ "$ITERATION" -ge "$MAX_ITERATIONS" ]; then
        echo "Reached max iterations (number of tasks): $MAX_ITERATIONS"
        break
    fi

    # Run Ralph iteration with selected prompt
    # -p: Headless mode (non-interactive, reads from stdin)
    # --dangerously-skip-permissions: Auto-approve all tool calls (YOLO mode)
    # --output-format=stream-json: Structured output for logging/monitoring
    # --model opus: Primary agent uses Opus for complex reasoning (task selection, prioritization)
    #               Can use 'sonnet' in build mode for speed if plan is clear and tasks well-defined
    # --verbose: Detailed execution logging
    cat "$PROMPT_FILE" | claude -p \
        --dangerously-skip-permissions \
        --output-format=stream-json \
        --model opus \
        --verbose

    # Push changes after each iteration
    git push origin "$CURRENT_BRANCH" || {
        echo "Failed to push. Creating remote branch..."
        git push -u origin "$CURRENT_BRANCH"
    }

    ITERATION=$((ITERATION + 1))
    echo -e "\n\n======================== LOOP $ITERATION ========================\n"
done
```

_Mode selection:_

- No keyword → Uses `PROMPT_build.md` for building (implementation)
- `plan` keyword → Uses `PROMPT_plan.md` for planning (gap analysis, plan generation)

_Max-iterations:_

- Limits the _task selection loop_ (number of tasks attempted; NOT tool calls within a single task)
- Each iteration = one fresh context window = one task from IMPLEMENTATION_PLAN.md = one commit
- `./loop.sh` runs unlimited (manual stop with Ctrl+C)
- `./loop.sh 20` runs max 20 iterations then stops

_Claude CLI flags:_

- `-p` (headless mode): Enables non-interactive operation, reads prompt from stdin
- `--dangerously-skip-permissions`: Bypasses all permission prompts for fully automated runs
- `--output-format=stream-json`: Outputs structured JSON for logging/monitoring/visualization
- `--model opus`: Primary agent uses Opus for task selection, prioritization, and coordination (can use `sonnet` for speed if tasks are clear)
- `--verbose`: Provides detailed execution logging

### Streamed Output Variant

An alternative `loop_streamed.sh` that pipes Claude's raw JSON output through `parse_stream.js` for a readable, color-coded terminal display showing tool calls, results, and execution stats.

_Differences from base `loop.sh`:_

- Passes prompt as argument (`-p "$FULL_PROMPT"`) instead of stdin pipe
- Adds `--include-partial-messages` for real-time streaming
- Pipes output through `parse_stream.js` (Node.js, no dependencies)
- Appends "Execute the instructions above." to prompt content

_Files:_ [`loop_streamed.sh`](files/loop_streamed.sh) · [`parse_stream.js`](files/parse_stream.js)

— contributed by [@terry-xyz](https://github.com/terry-xyz) ·
[@blackrosesxyz](https://x.com/blackrosesxyz)

## License

This repository is available under the [MIT License](LICENSE).

Third-party screenshots and externally sourced images are excluded unless
explicitly noted otherwise. See [NOTICE](NOTICE) for details.

---

## Files

```
project-root/
├── loop.sh                         # Ralph loop script
├── PROMPT_build.md                 # Build mode instructions
├── PROMPT_plan.md                  # Plan mode instructions
├── AGENTS.md                       # Operational guide loaded each iteration
├── IMPLEMENTATION_PLAN.md          # Prioritized task list (generated/updated by Ralph)
├── specs/                          # Requirement specs (one per JTBD topic)
│   ├── [jtbd-topic-a].md
│   └── [jtbd-topic-b].md
├── src/                            # Application source code
└── src/lib/                        # Shared utilities & components
```

### `loop.sh`

The primary loop script that orchestrates Ralph iterations.

See the [Loop Mechanics](#loop-mechanics) section for detailed implementation examples and configuration options.

_Setup:_ Make the script executable before first use:

```bash
chmod +x loop.sh
```

_Core function:_ Continuously feeds the prompt file to Claude, manages iteration limits, and pushes changes after each task completion.

### PROMPTS

The instruction set for each loop iteration. Swap between PLANNING and BUILDING versions as needed.

_Prompt Structure:_

| Section                | Purpose                                               |
| ---------------------- | ----------------------------------------------------- |
| _Phase 0_ (0a, 0b, 0c) | Orient: study specs, source location, current plan    |
| _Phase 1-4_            | Main instructions: task, validation, commit           |
| _999... numbering_     | Guardrails/invariants (higher number = more critical) |

_Key Language Patterns_ (Geoff's specific phrasing):

- "study" (not "read" or "look at")
- "don't assume not implemented" (critical - the Achilles' heel)
- "using parallel subagents" / "up to N subagents"
- "only 1 subagent for build/tests" (backpressure control)
- "Think extra hard" (now "Ultrathink")
- "capture the why"
- "keep it up to date"
- "if functionality is missing then it's your job to add it"
- "resolve them or document them"

#### `PROMPT_plan.md` Template

_Notes:_

- Update the [project-specific goal] placeholder below.
- Current subagent names presume using Claude.

```
0a. Study `specs/*` with up to 250 parallel Sonnet subagents to learn the application specifications.
0b. Study @IMPLEMENTATION_PLAN.md (if present) to understand the plan so far.
0c. Study `src/lib/*` with up to 250 parallel Sonnet subagents to understand shared utilities & components.
0d. For reference, the application source code is in `src/*`.

1. Study @IMPLEMENTATION_PLAN.md (if present; it may be incorrect) and use up to 500 Sonnet subagents to study existing source code in `src/*` and compare it against `specs/*`. Use an Opus subagent to analyze findings, prioritize tasks, and create/update @IMPLEMENTATION_PLAN.md as a bullet point list sorted in priority of items yet to be implemented. Ultrathink. Consider searching for TODO, minimal implementations, placeholders, skipped/flaky tests, and inconsistent patterns. Study @IMPLEMENTATION_PLAN.md to determine the starting point for research and keep it up to date with items considered complete/incomplete using subagents.

IMPORTANT: Plan only. Do NOT implement anything. Do NOT assume functionality is missing; confirm with code search first. Treat `src/lib` as the project's standard library for shared utilities and components. Prefer consolidated, idiomatic implementations there over ad-hoc copies.

ULTIMATE GOAL: We want to achieve [project-specific goal]. Consider missing elements and plan accordingly. If an element is missing, search first to confirm it doesn't exist, then if needed author the specification at specs/FILENAME.md. If you create a new element then document the plan to implement it in @IMPLEMENTATION_PLAN.md using a subagent.
```

#### `PROMPT_build.md` Template

_Note:_ Current subagent names presume using Claude.

```
0a. Study `specs/*` with up to 500 parallel Sonnet subagents to learn the application specifications.
0b. Study @IMPLEMENTATION_PLAN.md.
0c. For reference, the application source code is in `src/*`.

1. Your task is to implement functionality per the specifications using parallel subagents. Follow @IMPLEMENTATION_PLAN.md and choose the most important item to address. Before making changes, search the codebase (don't assume not implemented) using Sonnet subagents. You may use up to 500 parallel Sonnet subagents for searches/reads and only 1 Sonnet subagent for build/tests. Use Opus subagents when complex reasoning is needed (debugging, architectural decisions).
2. After implementing functionality or resolving problems, run the tests for that unit of code that was improved. If functionality is missing then it's your job to add it as per the application specifications. Ultrathink.
3. When you discover issues, immediately update @IMPLEMENTATION_PLAN.md with your findings using a subagent. When resolved, update and remove the item.
4. When the tests pass, update @IMPLEMENTATION_PLAN.md, then `git add -A` then `git commit` with a message describing the changes. After the commit, `git push`.

99999. Important: When authoring documentation, capture the why — tests and implementation importance.
999999. Important: Single sources of truth, no migrations/adapters. If tests unrelated to your work fail, resolve them as part of the increment.
9999999. As soon as there are no build or test errors, create a git tag. If there are no git tags, start at 0.0.0 and increment the patch by 1 (for example 0.0.1 if 0.0.0 does not exist).
99999999. You may add extra logging if required to debug issues.
999999999. Keep @IMPLEMENTATION_PLAN.md current with learnings using a subagent — future work depends on this to avoid duplicating efforts. Update especially after finishing your turn.
9999999999. When you learn something new about how to run the application, update @AGENTS.md using a subagent but keep it brief. For example, if you run commands multiple times before learning the correct command then that file should be updated.
99999999999. For any bugs you notice, resolve them or document them in @IMPLEMENTATION_PLAN.md using a subagent even if it is unrelated to the current piece of work.
999999999999. Implement functionality completely. Placeholders and stubs waste efforts and time redoing the same work.
9999999999999. When @IMPLEMENTATION_PLAN.md becomes large, periodically clean out the items that are completed from the file using a subagent.
99999999999999. If you find inconsistencies in the specs/* then use an Opus 4.6 subagent with 'ultrathink' requested to update the specs.
999999999999999. IMPORTANT: Keep @AGENTS.md operational only — status updates and progress notes belong in `IMPLEMENTATION_PLAN.md`. A bloated AGENTS.md pollutes every future loop's context.
```

### `AGENTS.md`

The single, canonical "heart of the loop" - a concise, operational "how to run/build" guide.

- NOT a changelog or progress diary
- Describes how to build/run the project
- Captures operational learnings that improve the loop
- Keep brief (~60 lines)

Status, progress, and planning belong in `IMPLEMENTATION_PLAN.md`, not here.

_Loopback / Immediate Self-Evaluation:_

AGENTS.md should contain the project-specific commands that enable loopback - the ability for Ralph to immediately evaluate his work within the same loop. This includes:

- Build commands
- Test commands (targeted and full suite)
- Typecheck/lint commands
- Any other validation tools

The BUILDING prompt says "run tests" generically; AGENTS.md specifies the actual commands. This is how backpressure gets wired in per-project.

#### Example

```
## Build & Run

Succinct rules for how to BUILD the project:

## Validation

Run these after implementing to get immediate feedback:

- Tests: `[test command]`
- Typecheck: `[typecheck command]`
- Lint: `[lint command]`

## Operational Notes

Succinct learnings about how to RUN the project:

...

### Codebase Patterns

...
```

### `IMPLEMENTATION_PLAN.md`

A prioritized bullet-point list of tasks derived from gap analysis (specs vs code) - generated by Ralph.

- _Created_ via PLANNING mode
- _Updated_ during BUILDING mode (mark complete, add discoveries, note bugs)
- _Can be regenerated_ – Geoff: "I have deleted the TODO list multiple times" → switch to PLANNING mode
- _Self-correcting_ – BUILDING mode can even create new specs if missing

The circularity is intentional: eventual consistency through iteration.

_No pre-specified template_ - let Ralph/the LLM dictate and manage the format that works best for it.

### `specs/*`

One markdown file per topic of concern.
These are the source of truth for what should be built.

- Created during the Requirements phase (human + LLM conversation)
- Consumed by both PLANNING and BUILDING modes
- Can be updated if inconsistencies are discovered (rare; use a subagent)

_No pre-specified template_ - let Ralph/the LLM dictate and manage the format that works best for it.

### `src/` and `src/lib/`

Application source code and shared utilities/components.

Referenced in the `PROMPT.md` templates for orientation steps.

---

## Enhancements?

I'm still determining the value/viability of these, but the opportunities sound promising:

- [Claude's AskUserQuestionTool for Planning](#use-claudes-askuserquestiontool-for-planning) - Use Claude's built-in interview tool to systematically clarify JTBD, edge cases, and acceptance criteria for specs.
- [Acceptance-Driven Backpressure](#acceptance-driven-backpressure) - Derive test requirements from acceptance criteria during planning. Prevents "cheating" - Ralph can't claim done without the appropriate tests passing.
- [Non-Deterministic Backpressure](#non-deterministic-backpressure) - Use LLM-as-judge tests for subjective criteria (tone, aesthetics, UX). Binary pass/fail reviews that iterate until they pass.
- [Ralph-Friendly Work Branches](#ralph-friendly-work-branches) - Asking Ralph to "filter to feature X" at runtime is unreliable.
Instead, create a scoped plan per branch upfront.
- [JTBD → Story Map → SLC Release](#jtbd--story-map--slc-release) - Push the power of "Letting Ralph Ralph" to connect JTBD's audience and activities to Simple/Lovable/Complete releases.
- [Specs Audit](#specs-audit) - Dedicated mode for generating/maintaining specs with quality rules: behavioral outcomes only, topic scoping, consistent naming.
- [Reverse Engineering Brownfield Projects to Specs](#reverse-engineering-brownfield-projects-to-specs) - Bring brownfield codebases into Ralph's workflow by reverse-engineering existing code into specs before planning new work.

---

### Use Claude's AskUserQuestionTool for Planning

During Phase 1 (Define Requirements), use Claude's built-in `AskUserQuestionTool` to systematically explore JTBD, topics of concern, edge cases, and acceptance criteria through a structured interview before writing specs.

_When to use:_ Minimal/vague initial requirements, a need to clarify constraints, or multiple valid approaches.

_Invoke:_ "Interview me using AskUserQuestion to understand [JTBD/topic/acceptance criteria/...]"

Claude will ask targeted questions to clarify requirements and ensure alignment before producing `specs/*.md` files.

_Flow:_

1. Start with known information →
2. _Claude interviews via AskUserQuestion_ →
3. Iterate until clear →
4. Claude writes specs with acceptance criteria →
5. Proceed to planning/building

No code or prompt changes needed - this simply enhances Phase 1 using existing Claude Code capabilities.

_Inspiration_ - [Thariq's X post](https://x.com/trq212/status/2005315275026260309).

---

### Acceptance-Driven Backpressure

Geoff's Ralph _implicitly_ connects specs → implementation → tests through emergent iteration.
This enhancement would make that connection _explicit_ by deriving test requirements during planning, creating a direct line from "what success looks like" to "what verifies it."

It connects acceptance criteria (in specs) directly to test requirements (in the implementation plan), improving backpressure quality by:

- _Enforcing "no cheating"_ - Ralph can't claim done without the required tests derived from acceptance criteria
- _Enabling a TDD workflow_ - Test requirements are known before implementation starts
- _Improving convergence_ - A clear completion signal (required tests pass) vs an ambiguous one ("seems done?")
- _Maintaining determinism_ - Test requirements live in the plan (known state), not emergent (probabilistic)

#### Compatibility with Core Philosophy

| Principle             | Maintained? | How                                                         |
| --------------------- | ----------- | ----------------------------------------------------------- |
| Monolithic operation  | ✅ Yes      | One agent, one task, one loop at a time                     |
| Backpressure critical | ✅ Yes      | Tests are still the mechanism, just derived explicitly now  |
| Context efficiency    | ✅ Yes      | Planning decides tests once vs building rediscovering them  |
| Deterministic setup   | ✅ Yes      | Test requirements in the plan (known state), not emergent   |
| Let Ralph Ralph       | ✅ Yes      | Ralph still prioritizes and chooses implementation approach |
| Plan is disposable    | ✅ Yes      | Wrong test requirements? Regenerate the plan                |
| "Capture the why"     | ✅ Yes      | Test intent documented in the plan before implementation    |
| No cheating           | ✅ Yes      | Required tests prevent placeholder implementations          |

#### The Prescriptiveness Balance

The critical distinction:

_Acceptance criteria_ (in specs) = Behavioral outcomes, observable results, what success looks like

- ✅ "Extracts 5-10 dominant colors from any uploaded image"
- ✅ "Processes images <5MB in <100ms"
- ✅ "Handles edge cases: grayscale, single-color, transparent backgrounds"

_Test requirements_ (in implementation plan) = Verification points derived from acceptance criteria

- ✅ "Required tests: Extract 5-10 colors, Performance <100ms, Handle grayscale edge case"

_Implementation approach_ (up to Ralph) = Technical decisions about how to achieve it

- ❌ "Use K-means clustering with 3 iterations and LAB color space conversion"

The key: _Specify WHAT to verify (outcomes), not HOW to implement (approach)._

This maintains the "Let Ralph Ralph" principle - Ralph decides implementation details while having clear success signals.

#### Architecture: Three-Phase Connection

```
Phase 1: Requirements Definition
    specs/*.md + Acceptance Criteria
    ↓
Phase 2: Planning (derives test requirements)
    IMPLEMENTATION_PLAN.md + Required Tests
    ↓
Phase 3: Building (implements with tests)
    Implementation + Tests → Backpressure
```

#### Phase 1: Requirements Definition

During the human + LLM conversation that produces specs:

- Discuss JTBD and break it into topics of concern
- Use subagents to load external context as needed
- _Discuss and define acceptance criteria_ - what observable, verifiable outcomes indicate success
- Keep criteria behavioral (outcomes), not implementation (how to build it)
- The LLM writes specs, including acceptance criteria in whatever form makes most sense for each spec
- Acceptance criteria become the foundation for deriving test requirements in the planning phase

#### Phase 2: Planning Mode Enhancement

Modify `PROMPT_plan.md` instruction 1 to include test derivation. Add after the first sentence:

```markdown
For each task in the plan, derive required tests from acceptance criteria in specs - what specific outcomes need verification (behavior, performance, edge cases). Tests verify WHAT works, not HOW it's implemented. Include as part of task definition.
```

#### Phase 3: Building Mode Enhancement

Modify `PROMPT_build.md` instructions:

_Instruction 1:_ Add after "choose the most important item to address":

```markdown
Tasks include required tests - implement tests as part of task scope.
```

_Instruction 2:_ Replace "run the tests for that unit of code" with:

```markdown
run all required tests specified in the task definition. All required tests must exist and pass before the task is considered complete.
```

_Prepend new guardrail_ (in the 9s sequence):

```markdown
999. Required tests derived from acceptance criteria must exist and pass before committing. Tests are part of implementation scope, not optional. Test-driven development approach: tests can be written first or alongside implementation.
```

---

### Non-Deterministic Backpressure

Some acceptance criteria resist programmatic validation:

- _Creative quality_ - Writing tone, narrative flow, engagement
- _Aesthetic judgments_ - Visual harmony, design balance, brand consistency
- _UX quality_ - Intuitive navigation, clear information hierarchy
- _Content appropriateness_ - Context-aware messaging, audience fit

These require human-like judgment but still need backpressure to meet acceptance criteria during the building loop.

_Solution:_ Add LLM-as-Judge tests as backpressure with binary pass/fail results.

LLM reviews are non-deterministic (the same artifact may receive different judgments across runs).
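To build intuition for why a flaky judge can still provide useful backpressure, here is a toy calculation (the numbers are hypothetical, not benchmarks of any model): if a single review run catches a real defect with probability `p`, repeated loop iterations make detection close to certain.

```typescript
// Toy model (hypothetical numbers): chance that a non-deterministic judge
// flags a real defect at least once across n independent loop iterations.
function detectionChance(pPerRun: number, iterations: number): number {
  return 1 - Math.pow(1 - pPerRun, iterations);
}

console.log(detectionChance(0.7, 1).toFixed(2)); // "0.70" - a single review run
console.log(detectionChance(0.7, 4).toFixed(2)); // "0.99" - after four loop iterations
```

The same arithmetic cuts the other way: a borderline artifact may slip through on a lucky run, which is why the review criteria themselves still need to be sharp.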
This variance aligns with the Ralph philosophy: "deterministically bad in an undeterministic world." The loop provides eventual consistency through iteration — reviews run until they pass, accepting natural variance.

#### What Needs to Be Created (First Step)

Create two files in `src/lib/`:

```
src/lib/
  llm-review.ts          # Core fixture - single function, clean API
  llm-review.test.ts     # Reference examples showing the pattern (Ralph learns from these)
```

##### `llm-review.ts` - Binary pass/fail API Ralph discovers:

```typescript
interface ReviewResult {
  pass: boolean;
  feedback?: string; // Only present when pass=false
}

function createReview(config: {
  criteria: string; // What to evaluate (behavioral, observable)
  artifact: string; // Text content OR screenshot path
  intelligence?: "fast" | "smart"; // Optional, defaults to 'fast'
}): Promise<ReviewResult>;
```

_Multimodal support:_ Both intelligence levels would use a multimodal model (text + vision). Artifact type detection is automatic:

- Text evaluation: `artifact: "Your content here"` → Routes as text input
- Vision evaluation: `artifact: "./tmp/screenshot.png"` → Routes as vision input (detects .png, .jpg, .jpeg extensions)

_Intelligence levels_ (quality of judgment, not capability type):

- `fast` (default): Quick, cost-effective models for straightforward evaluations
  - Example: Gemini 3.0 Flash (multimodal, fast, cheap)
- `smart`: Higher-quality models for nuanced aesthetic/creative judgment
  - Example: GPT 5.1 (multimodal, better judgment, higher cost)

The fixture implementation selects appropriate models.
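As one possible sketch of the routing half of this fixture (the helper name `detectArtifactKind` and the exact regex are illustrative, not part of the proposed API):

```typescript
type ArtifactKind = "text" | "vision";

// Mirrors the documented convention: paths ending in .png/.jpg/.jpeg route
// as vision input; any other string is treated as literal text content.
function detectArtifactKind(artifact: string): ArtifactKind {
  return /\.(png|jpe?g)$/i.test(artifact.trim()) ? "vision" : "text";
}

console.log(detectArtifactKind("./tmp/dashboard.png")); // "vision"
console.log(detectArtifactKind("Welcome aboard! Let's get you set up.")); // "text"
```

A real implementation would likely also verify the file exists before routing as vision, since prose could coincidentally end in `.png`.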
(The model examples above are current options, not requirements.)

##### `llm-review.test.ts` - Shows Ralph how to use it (text and vision examples):

```typescript
import { createReview } from "@/lib/llm-review";

// Example 1: Text evaluation
test("welcome message tone", async () => {
  const message = generateWelcomeMessage();
  const result = await createReview({
    criteria:
      "Message uses warm, conversational tone appropriate for design professionals while clearly conveying value proposition",
    artifact: message, // Text content
  });
  expect(result.pass).toBe(true);
});

// Example 2: Vision evaluation (screenshot path)
test("dashboard visual hierarchy", async () => {
  await page.screenshot({ path: "./tmp/dashboard.png" });
  const result = await createReview({
    criteria:
      "Layout demonstrates clear visual hierarchy with obvious primary action",
    artifact: "./tmp/dashboard.png", // Screenshot path
  });
  expect(result.pass).toBe(true);
});

// Example 3: Smart intelligence for complex judgment
test("brand visual consistency", async () => {
  await page.screenshot({ path: "./tmp/homepage.png" });
  const result = await createReview({
    criteria:
      "Visual design maintains professional brand identity suitable for financial services while avoiding corporate sterility",
    artifact: "./tmp/homepage.png",
    intelligence: "smart", // Complex aesthetic judgment
  });
  expect(result.pass).toBe(true);
});
```

_Ralph learns from these examples:_ Both text and screenshots work as artifacts. Choose based on what needs evaluation. The fixture handles the rest internally.

_Future extensibility:_ The current design uses a single `artifact: string` for simplicity.
It could expand to `artifact: string | string[]` if clear patterns emerge requiring multiple artifacts (before/after comparisons, consistency across items, multi-perspective evaluation). Composite screenshots or concatenated text could handle most multi-item needs.

#### Integration with Ralph Workflow

_Planning Phase_ - Update `PROMPT_plan.md`:

After:

```
...Study @IMPLEMENTATION_PLAN.md to determine starting point for research and keep it up to date with items considered complete/incomplete using subagents.
```

Insert this:

```
When deriving test requirements from acceptance criteria, identify whether verification requires programmatic validation (measurable, inspectable) or human-like judgment (perceptual quality, tone, aesthetics). Both types are equally valid backpressure mechanisms. For subjective criteria that resist programmatic validation, explore src/lib for non-deterministic evaluation patterns.
```

_Building Phase_ - Update `PROMPT_build.md`:

Prepend a new guardrail (in the 9s sequence):

```markdown
9999. Create tests to verify implementation meets acceptance criteria and include both conventional tests (behavior, performance, correctness) and perceptual quality tests (for subjective criteria, see src/lib patterns).
```

_Discovery, not documentation:_ Ralph learns the LLM review patterns from the `llm-review.test.ts` examples during `src/lib` exploration (Phase 0c). No AGENTS.md updates are needed - the code examples are the documentation.

#### Compatibility with Core Philosophy

| Principle             | Maintained? | How                                                                                                                                                |
| --------------------- | ----------- | -------------------------------------------------------------------------------------------------------------------------------------------------- |
| Backpressure critical | ✅ Yes      | Extends backpressure to non-programmatic acceptance criteria                                                                                       |
| Deterministic setup   | ⚠️ Partial  | Criteria in plan (deterministic); evaluation is non-deterministic but converges through iteration. An intentional tradeoff for subjective quality. |
| Context efficiency    | ✅ Yes      | Fixture reused via `src/lib`, small test definitions                                                                                               |
| Let Ralph Ralph       | ✅ Yes      | Ralph discovers the pattern, chooses when to use it, writes criteria                                                                               |
| Plan is disposable    | ✅ Yes      | Review requirements are part of the plan; regenerate if wrong                                                                                      |
| Simplicity wins       | ✅ Yes      | Single function, binary result, no scoring complexity                                                                                              |
| Add signs for Ralph   | ✅ Yes      | Light prompt additions, learning from code exploration                                                                                             |

---

### Ralph-Friendly Work Branches

_The Critical Principle:_ Geoff's Ralph works from a single, disposable plan where Ralph picks "most important." To use branches with Ralph while maintaining this pattern, you must scope at plan creation, not at task selection.

_Why this matters:_

- ❌ _Wrong approach_: Create the full plan, then ask Ralph to "filter" tasks at runtime → unreliable (70-80%), violates determinism
- ✅ _Right approach_: Create a scoped plan upfront for each work branch → deterministic, simple, maintains "plan is disposable"

_Solution:_ Add a `plan-work` mode that creates a work-scoped IMPLEMENTATION_PLAN.md on the current branch. The user creates a work branch, then runs `plan-work` with a natural language description of the work focus. The LLM uses this description to scope the plan. After planning, Ralph builds from this already-scoped plan with zero semantic filtering - it just picks "most important" as always.

_Terminology:_ "Work" is intentionally broad - it can describe features, topics of concern, refactoring efforts, infrastructure changes, bug fixes, or any coherent body of related changes. The work description you pass to `plan-work` is natural language for the LLM - it can be prose, not constrained by git branch naming rules.

#### Design Principles

- ✅ _Each Ralph session operates monolithically_ on ONE body of work per branch
- ✅ _User creates branches manually_ - full control over naming conventions and strategy (e.g. worktrees)
- ✅ _Natural language work descriptions_ - pass prose to the LLM, unconstrained by git naming rules
- ✅ _Scoping at plan creation_ (deterministic), not task selection (probabilistic)
- ✅ _Single plan per branch_ - one IMPLEMENTATION_PLAN.md per branch
- ✅ _Plan remains disposable_ - regenerate the scoped plan when it is wrong/stale for a branch
- ✅ No dynamic branch switching within a loop session
- ✅ Maintains simplicity and determinism
- ✅ Optional - the main branch workflow still works
- ✅ No semantic filtering at build time - Ralph just picks "most important"

#### Workflow

_1. Full Planning (on main branch)_

```bash
./loop.sh plan
# Generate full IMPLEMENTATION_PLAN.md for entire project
```

_2. Create Work Branch_

User performs:

```bash
git checkout -b ralph/user-auth-oauth
# Create branch with whatever naming convention you prefer
# Suggestion: ralph/* prefix for work branches
```

_3. Scoped Planning (on work branch)_

```bash
./loop.sh plan-work "user authentication system with OAuth and session management"
# Pass natural language description - LLM uses this to scope the plan
# Creates focused IMPLEMENTATION_PLAN.md with only tasks for this work
```

_4. Build from Plan (on work branch)_

```bash
./loop.sh
# Ralph builds from scoped plan (no filtering needed)
# Picks most important task from already-scoped plan
```

_5. PR Creation (when work complete)_

User performs:

```bash
gh pr create --base main --head ralph/user-auth-oauth --fill
```

#### Work-Scoped Loop Script

Extends the base enhanced loop script to add work-branch support with scoped planning:

```bash
#!/bin/bash
set -euo pipefail

# Usage:
#   ./loop.sh [plan|build] [max_iterations]  # Plan/build on current branch
#   ./loop.sh plan-work "work description"   # Create scoped plan on current branch
# Examples:
#   ./loop.sh                               # Build mode, unlimited
#   ./loop.sh 20                            # Build mode, max 20
#   ./loop.sh build 20                      # Build mode, max 20
#   ./loop.sh plan 5                        # Full planning, max 5
#   ./loop.sh plan-work "user auth"         # Scoped planning

# Parse arguments (use ${1:-} so `set -u` tolerates missing args)
MODE="build"
PROMPT_FILE="PROMPT_build.md"

if [ "${1:-}" = "plan" ]; then
    # Full planning mode
    MODE="plan"
    PROMPT_FILE="PROMPT_plan.md"
    MAX_ITERATIONS=${2:-0}
elif [ "${1:-}" = "build" ]; then
    # Explicit build mode (with optional max iterations)
    MAX_ITERATIONS=${2:-0}
elif [ "${1:-}" = "plan-work" ]; then
    # Scoped planning mode
    if [ -z "${2:-}" ]; then
        echo "Error: plan-work requires a work description"
        echo "Usage: ./loop.sh plan-work \"description of the work\""
        exit 1
    fi
    MODE="plan-work"
    WORK_DESCRIPTION="$2"
    PROMPT_FILE="PROMPT_plan_work.md"
    MAX_ITERATIONS=${3:-5}  # Default 5 for work planning
elif [[ "${1:-}" =~ ^[0-9]+$ ]]; then
    # Build mode with max iterations (bare number)
    MAX_ITERATIONS=$1
else
    # Build mode, unlimited
    MAX_ITERATIONS=0
fi

ITERATION=0
CURRENT_BRANCH=$(git branch --show-current)

# Validate branch for plan-work mode
if [ "$MODE" = "plan-work" ]; then
    if [ "$CURRENT_BRANCH" = "main" ] || [ "$CURRENT_BRANCH" = "master" ]; then
        echo "Error: plan-work should be run on a work branch, not main/master"
        echo "Create a work branch first: git checkout -b ralph/your-work"
        exit 1
    fi

    echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
    echo "Mode:    plan-work"
    echo "Branch:  $CURRENT_BRANCH"
    echo "Work:    $WORK_DESCRIPTION"
    echo "Prompt:  $PROMPT_FILE"
    echo "Plan:    Will create scoped IMPLEMENTATION_PLAN.md"
    [ "$MAX_ITERATIONS" -gt 0 ] && echo "Max:     $MAX_ITERATIONS iterations"
    echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"

    # Warn about uncommitted changes to IMPLEMENTATION_PLAN.md
    if [ -f "IMPLEMENTATION_PLAN.md" ] && ! git diff --quiet IMPLEMENTATION_PLAN.md 2>/dev/null; then
        echo "Warning: IMPLEMENTATION_PLAN.md has uncommitted changes that will be overwritten"
        read -p "Continue? [y/N] " -n 1 -r
        echo
        [[ ! $REPLY =~ ^[Yy]$ ]] && exit 1
    fi

    # Export work description for PROMPT_plan_work.md
    export WORK_SCOPE="$WORK_DESCRIPTION"
else
    # Normal plan/build mode
    echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
    echo "Mode:   $MODE"
    echo "Branch: $CURRENT_BRANCH"
    echo "Prompt: $PROMPT_FILE"
    echo "Plan:   IMPLEMENTATION_PLAN.md"
    [ "$MAX_ITERATIONS" -gt 0 ] && echo "Max:    $MAX_ITERATIONS iterations"
    echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
fi

# Verify prompt file exists
if [ ! -f "$PROMPT_FILE" ]; then
    echo "Error: $PROMPT_FILE not found"
    exit 1
fi

# Main loop
while true; do
    if [ "$MAX_ITERATIONS" -gt 0 ] && [ "$ITERATION" -ge "$MAX_ITERATIONS" ]; then
        echo "Reached max iterations: $MAX_ITERATIONS"

        if [ "$MODE" = "plan-work" ]; then
            echo ""
            echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
            echo "Scoped plan created: $WORK_DESCRIPTION"
            echo "To build, run:"
            echo "  ./loop.sh 20"
            echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
        fi
        break
    fi

    # Run Ralph iteration with selected prompt
    # -p: Headless mode (non-interactive, reads from stdin)
    # --dangerously-skip-permissions: Auto-approve all tool calls (YOLO mode)
    # --output-format=stream-json: Structured output for logging/monitoring
    # --model opus: Primary agent uses Opus for complex reasoning (task selection, prioritization)
    #               Can use 'sonnet' for speed if the plan is clear and tasks well-defined
    # --verbose: Detailed execution logging

    # For plan-work mode, substitute only ${WORK_SCOPE} in the prompt before piping
    if [ "$MODE" = "plan-work" ]; then
        envsubst '${WORK_SCOPE}' < "$PROMPT_FILE" | claude -p \
            --dangerously-skip-permissions \
            --output-format=stream-json \
            --model opus \
            --verbose
    else
        cat "$PROMPT_FILE" | claude -p \
            --dangerously-skip-permissions \
            --output-format=stream-json \
            --model opus \
            --verbose
    fi

    # Push to current branch
    CURRENT_BRANCH=$(git branch --show-current)
    git push origin "$CURRENT_BRANCH" || {
        echo "Failed to push. Creating remote branch..."
        git push -u origin "$CURRENT_BRANCH"
    }

    ITERATION=$((ITERATION + 1))
    echo -e "\n\n======================== LOOP $ITERATION ========================\n"
done
```

#### `PROMPT_plan_work.md` Template

_Note:_ Identical to `PROMPT_plan.md` but with scoping instructions and the `WORK_SCOPE` env var substituted (automatically by the loop script).

```
0a. Study `specs/*` with up to 250 parallel Sonnet subagents to learn the application specifications.
0b. Study @IMPLEMENTATION_PLAN.md (if present) to understand the plan so far.
0c. Study `src/lib/*` with up to 250 parallel Sonnet subagents to understand shared utilities & components.
0d. For reference, the application source code is in `src/*`.

1. You are creating a SCOPED implementation plan for work: "${WORK_SCOPE}". Study @IMPLEMENTATION_PLAN.md (if present; it may be incorrect) and use up to 500 Sonnet subagents to study existing source code in `src/*` and compare it against `specs/*`. Use an Opus subagent to analyze findings, prioritize tasks, and create/update @IMPLEMENTATION_PLAN.md as a bullet point list sorted in priority of items yet to be implemented. Ultrathink. Consider searching for TODOs, minimal implementations, placeholders, skipped/flaky tests, and inconsistent patterns. Study @IMPLEMENTATION_PLAN.md to determine the starting point for research and keep it up to date with items considered complete/incomplete using subagents.

IMPORTANT: This is SCOPED PLANNING for "${WORK_SCOPE}" only.
Create a plan containing ONLY tasks directly related to this work scope. Be conservative - if uncertain whether a task belongs to this work, exclude it. The plan can be regenerated if too narrow. Plan only. Do NOT implement anything. Do NOT assume functionality is missing; confirm with code search first. Treat `src/lib` as the project's standard library for shared utilities and components. Prefer consolidated, idiomatic implementations there over ad-hoc copies.

ULTIMATE GOAL: We want to achieve the scoped work "${WORK_SCOPE}". Consider missing elements related to this work and plan accordingly. If an element is missing, search first to confirm it doesn't exist, then if needed author the specification at specs/FILENAME.md. If you create a new element then document the plan to implement it in @IMPLEMENTATION_PLAN.md using a subagent.
```

#### Compatibility with Core Philosophy

| Principle              | Maintained? | How                                                                          |
| ---------------------- | ----------- | ---------------------------------------------------------------------------- |
| Monolithic operation   | ✅ Yes      | Ralph still operates as a single process within the branch                   |
| One task per loop      | ✅ Yes      | Unchanged                                                                    |
| Fresh context          | ✅ Yes      | Unchanged                                                                    |
| Deterministic          | ✅ Yes      | Scoping at plan creation (deterministic), not runtime (probabilistic)        |
| Simple                 | ✅ Yes      | Optional enhancement; the main workflow still works                          |
| Plan-driven            | ✅ Yes      | One IMPLEMENTATION_PLAN.md per branch                                        |
| Single source of truth | ✅ Yes      | One plan per branch - the scoped plan replaces the full plan on a branch     |
| Plan is disposable     | ✅ Yes      | Regenerate the scoped plan anytime: `./loop.sh plan-work "work description"` |
| Markdown over JSON     | ✅ Yes      | Still markdown plans                                                         |
| Let Ralph Ralph        | ✅ Yes      | Ralph picks "most important" from the already-scoped plan - no filtering     |

---

### JTBD → Story Map → SLC Release

#### Topics of Concern → Activities

Geoff's [suggested workflow](https://ghuntley.com/content/images/size/w2400/2025/07/The-ralph-Process.png) already aligns planning with Jobs-to-be-Done — breaking JTBDs into topics of concern, which in turn become specs. I love this, and I think there's an opportunity to lean further into the product benefits this approach affords by reframing _topics of concern_ as _activities_.

Activities are verbs in a journey ("upload photo", "extract colors") rather than capabilities ("color extraction system"). They're naturally scoped by user intent.

> Topics: "color extraction", "layout engine" → capability-oriented
> Activities: "upload photo", "see extracted colors", "arrange layout" → journey-oriented

#### Activities → User Journey

Activities — and their constituent steps — sequence naturally into a user flow, creating a _journey structure_ that makes gaps and dependencies visible.
A _[User Story Map](https://www.nngroup.com/articles/user-story-mapping/)_ organizes activities as columns (the journey backbone) with capability depths as rows — the full space of what _could_ be built:

```
UPLOAD    →   EXTRACT    →   ARRANGE     →   SHARE

basic         auto           manual          export
bulk          palette        templates       collab
batch         AI themes      auto-layout     embed
```

#### User Journey → Release Slices

Horizontal slices through the map become candidate releases. Not every activity needs new capability in every release — some cells stay empty, and that's fine if the slice is still coherent:

```
                  UPLOAD    →   EXTRACT    →   ARRANGE     →   SHARE

Release 1:        basic         auto                           export
                  ───────────────────────────────────────────────────
Release 2:                      palette        manual
                  ───────────────────────────────────────────────────
Release 3:        batch         AI themes      templates       embed
```

#### Release Slices → SLC Releases

The story map gives you _structure_ for slicing. Jason Cohen's _[Simple, Lovable, Complete (SLC)](https://longform.asmartbear.com/slc/)_ gives you _criteria_ for what makes a slice good:

- _Simple_ — Narrow scope you can ship fast. Not every activity, not every depth.
- _Complete_ — Fully accomplishes a job within that scope. Not a broken preview.
- _Lovable_ — People actually want to use it. Delightful within its boundaries.

_Why SLC over MVP?_ MVPs optimize for learning at the customer's expense — "minimum" often means broken or frustrating. SLC flips this: learn in-market _while_ delivering real value. If it succeeds, you have optionality.
If it fails, you still treated users well.\n\nEach slice can become a release with a clear value and identity:\n\n```\n                  UPLOAD    →   EXTRACT    →   ARRANGE     →   SHARE\n\nPalette Picker:   basic         auto                           export\n                  ───────────────────────────────────────────────────\nMood Board:                     palette        manual\n                  ───────────────────────────────────────────────────\nDesign Studio:    batch         AI themes      templates       embed\n```\n\n- _Palette Picker_ — Upload, extract, export. Instant value from day one.\n- _Mood Board_ — Adds arrangement. Creative expression enters the journey.\n- _Design Studio_ — Professional features: batch processing, AI themes, embeddable output.\n\n---\n\n#### Operationalizing with Ralph\n\nThe concepts above — activities, story maps, SLC releases — are the _thinking tools_. How do we translate them into Ralph's workflow?\n\n_Default Ralph approach:_\n\n1. _Define Requirements_: Human + LLM define JTBD topics of concern → `specs\u002F*.md`\n2. _Create Tasks Plan_: LLM analyzes all specs + current code → `IMPLEMENTATION_PLAN.md`\n3. _Build_: Ralph builds against full scope\n\nThis works well for capability-focused work (features, refactors, infrastructure). But it doesn't naturally produce valuable (SLC) product releases - it produces \"whatever the specs describe\".\n\n_Activities → SLC Release approach:_\n\nTo get SLC releases, we need to ground activities in audience context. Audience defines WHO has the JTBDs, which in turn informs WHAT activities matter and what \"lovable\" means.\n\n```\nAudience (who)\n    └── has JTBDs (desired outcomes)\n            └── fulfilled by Activities (means to achieve outcomes)\n```\n\n##### Workflow\n\n_I. Requirements Phase (2 steps):_\n\nStill performed in LLM conversations with the human, similar to the default Ralph approach.\n\n1. 
_Define audience and their JTBDs_ — WHO are we building for and what OUTCOMES do they want?\n\n   - Human + LLM discuss and determine the audience(s) and their JTBDs (outcomes they want)\n   - May contain multiple connected audiences (e.g. \"designer\" creates, \"client\" reviews)\n   - Generates `AUDIENCE_JTBD.md`\n\n2. _Define activities_ — WHAT do users do to accomplish their JTBDs?\n\n   - Informed by `AUDIENCE_JTBD.md`\n   - For each JTBD, identify activities necessary to accomplish it\n   - For each activity, determine:\n     - Capability depths (basic → enhanced) — levels of sophistication\n     - Desired outcome(s) at each depth — what does success look like?\n   - Generates `specs\u002F*.md` (one per activity)\n\n   The discrete steps within activities are implicit; the LLM can infer them during planning.\n\n_II. Planning Phase:_\n\nPerformed in the Ralph loop with an _updated_ planning prompt.\n\n- LLM analyzes:\n  - `AUDIENCE_JTBD.md` (who, desired outcomes)\n  - `specs\u002F*` (what could be built)\n  - Current code state (what exists)\n- LLM determines next SLC slice (which activities, at what capability depths) and plans tasks for that slice\n- LLM generates `IMPLEMENTATION_PLAN.md`\n- _Human verifies_ plan before building:\n  - Does the scope represent a coherent SLC release?\n  - Are the right activities included at the right depths?\n  - If wrong → re-run planning loop to regenerate plan, optionally updating inputs or planning prompt\n  - If right → proceed to building\n\n_III. Building Phase:_\n\nPerformed in the Ralph loop with the standard building prompt.\n\n##### Updated Planning Prompt\n\nVariant of `PROMPT_plan.md` that adds audience context and SLC-oriented slice recommendation.\n\n_Notes:_\n\n- Unlike the default template, this does not have a `[project-specific goal]` placeholder — the goal is implicit: recommend the most valuable next release for the audience.\n- Current subagent names presume using Claude.\n\n```\n0a. 
Study @AUDIENCE_JTBD.md to understand who we're building for and their Jobs to Be Done.\n0b. Study `specs\u002F*` with up to 250 parallel Sonnet subagents to learn JTBD activities.\n0c. Study @IMPLEMENTATION_PLAN.md (if present) to understand the plan so far.\n0d. Study `src\u002Flib\u002F*` with up to 250 parallel Sonnet subagents to understand shared utilities & components.\n0e. For reference, the application source code is in `src\u002F*`.\n\n1. Sequence the activities in `specs\u002F*` into a user journey map for the audience in @AUDIENCE_JTBD.md. Consider how activities flow into each other and what dependencies exist.\n\n2. Determine the next SLC release. Use up to 500 Sonnet subagents to compare `src\u002F*` against `specs\u002F*`. Use an Opus subagent to analyze findings. Ultrathink. Given what's already implemented recommend which activities (at what capability depths) form the most valuable next release. Prefer thin horizontal slices - the narrowest scope that still delivers real value. A good slice is Simple (narrow, achievable), Lovable (people want to use it), and Complete (fully accomplishes a meaningful job, not a broken preview).\n\n3. Use an Opus subagent (ultrathink) to analyze and synthesize the findings, prioritize tasks, and create\u002Fupdate @IMPLEMENTATION_PLAN.md as a bullet point list sorted in priority of items yet to be implemented for the recommended SLC release. Begin plan with a summary of the recommended SLC release (what's included and why), then list prioritized tasks for that scope. Consider TODOs, placeholders, minimal implementations, skipped tests - but scoped to the release. Note discoveries outside scope as future work.\n\nIMPORTANT: Plan only. Do NOT implement anything. Do NOT assume functionality is missing; confirm with code search first. Treat `src\u002Flib` as the project's standard library for shared utilities and components. 
Prefer consolidated, idiomatic implementations there over ad-hoc copies.\n\nULTIMATE GOAL: We want to achieve the most valuable next release for the audience in @AUDIENCE_JTBD.md. Consider missing elements and plan accordingly. If an element is missing, search first to confirm it doesn't exist, then if needed author the specification at specs\u002FFILENAME.md. If you create a new element then document the plan to implement it in @IMPLEMENTATION_PLAN.md using a subagent.\n```\n\n##### Notes\n\n_Why `AUDIENCE_JTBD.md` as a separate artifact:_\n\n- Single source of truth — prevents drift across specs\n- Enables holistic reasoning: \"What does this audience need MOST?\"\n- JTBDs captured alongside audience (the \"why\" lives with the \"who\")\n- Referenced twice: during spec creation AND SLC planning\n- Keeps activity specs focused on WHAT, not repeating WHO\n\n_Cardinalities:_\n\n- One audience → many JTBDs (\"Designer\" has \"capture space\", \"explore concepts\", \"present to client\")\n- One JTBD → many activities (\"capture space\" includes upload, measurements, room detection)\n- One activity → can serve multiple JTBDs (\"upload photo\" serves both \"capture\" and \"gather inspiration\")\n\n---\n\n### Specs Audit\n\nA dedicated loop mode for generating and maintaining spec files with enforced quality rules. 
Ensures specs stay focused on behavioral outcomes (not implementation details), properly scoped topics (\"one sentence without 'and'\"), and consistent file naming conventions.\n\n_When to use:_ After writing or updating specs, run specs mode to enforce consistency and hygiene across all spec files.\n\n_What it does:_\n\n- Iterates over existing `specs\u002F*` files\n- Enforces quality rules: behavioral outcomes only, no code blocks, no implementation details\n- Validates topic scoping using the \"One Sentence Without 'And'\" test\n- Creates new spec files when needed based on `specs\u002FREADME.md`\n- Applies consistent file naming: `\u003Cint>-filename.md` (e.g., `01-range-optimization.md`)\n\n_Usage:_ Add a `specs` argument to your loop script that selects `PROMPT_specs.md`:\n\n```bash\n.\u002Floop.sh specs        # Specs mode, unlimited iterations\n.\u002Floop.sh specs 3      # Specs mode, max 3 iterations\n```\n\n_To add specs mode to `loop.sh`:_ insert a new `elif` branch in the argument parsing:\n\n```bash\n# Parse arguments\nif [ \"$1\" = \"plan\" ]; then\n    # Plan mode\n    MODE=\"plan\"\n    PROMPT_FILE=\"PROMPT_plan.md\"\n    MAX_ITERATIONS=${2:-0}\nelif [ \"$1\" = \"specs\" ]; then        # ← add this block\n    # Specs mode\n    MODE=\"specs\"\n    PROMPT_FILE=\"PROMPT_specs.md\"\n    MAX_ITERATIONS=${2:-0}\nelif [[ \"$1\" =~ ^[0-9]+$ ]]; then\n    # Build mode with max iterations\n    ...\n```\n\n_To add specs mode to `loop_streamed.sh`:_ same change — add the `elif` block in the same position. The rest of the script (streaming, `parse_stream.js` piping) works unchanged.\n\n_Files:_ [`PROMPT_specs.md`](files\u002FPROMPT_specs.md)\n\n#### `PROMPT_specs.md` Template\n\n_Notes:_\n\n- Specs define WHAT to verify (outcomes), not HOW to implement (approach). Implementation decisions are left to Ralph during the build phase.\n\n```\n0a. Study `specs\u002F*` with up to 250 parallel Sonnet subagents to learn the application specifications.\n\n1. 
Identify Jobs to Be Done (JTBD) → Break individual JTBD into topic(s) of concern → Use subagents to load info from URLs into context → LLM understands JTBD topic of concern: subagent writes specs\u002FFILENAME.md for each topic.\n\n## RULES (don't apply to `specs\u002FREADME.md`)\n\n999. NEVER add code blocks or suggest how a variable should be named. This will be decided by Ralph.\n\n9999.\n- Acceptance criteria (in specs) = Behavioral outcomes, observable results\nfor example:\n✓ \"Extracts 5-10 dominant colors from any uploaded image\"\n✓ \"Processes images \u003C5MB in \u003C100ms\"\n✓ \"Handles edge cases: grayscale, single-color, transparent backgrounds\"\n- Test requirements (in plan) = Verification points derived from acceptance criteria\nfor example:\n✓ \"Required tests: Extract 5-10 colors, Performance \u003C100ms\"\n- Implementation approach (up to Ralph) = Technical decisions\nexample TO AVOID:\n✗ \"Use K-means clustering with 3 iterations\"\n\n99999. Topic Scope Test: \"One Sentence Without 'And'\"\nCan you describe the topic of concern in one sentence without conjoining unrelated capabilities?\nexample to follow:\n✓ \"The color extraction system analyzes images to identify dominant colors\"\nexample to avoid:\n✗ \"The user system handles authentication, profiles, and billing\" → 3 topics\nIf you need \"and\" to describe what it does, it's probably multiple topics\n\n99999999. The key: Specify WHAT to verify (outcomes), not HOW to implement (approach). This maintains the \"Let Ralph Ralph\" principle - Ralph decides implementation details while having clear success signals.\n\n99999999999. Apply all rules to all existing files with up to 100 parallel Sonnet subagents in @specs (except README.md) and create new files if it's determined they're needed based on `specs\u002FREADME.md`. 
The file names should follow this naming convention: \u003Cint>-filename.md, for example 01-range-optimization.md, 02-adaptive-behavior.md etc.\n```\n\n— contributed by [@terry-xyz](https:\u002F\u002Fgithub.com\u002Fterry-xyz) · [@blackrosesxyz](https:\u002F\u002Fx.com\u002Fblackrosesxyz)\n\n---\n\n### Reverse Engineering Brownfield Projects to Specs\n\nStarting with specs is easy on a greenfield project, but a brownfield codebase calls for a different approach: reverse-engineer the code's existing implementations back into specs so you can begin using the Ralph playbook.\n\n_When to use:_ You inherited or joined a codebase with no specs. You want to use Ralph on a project that wasn't built with Ralph. You need to add features to an existing brownfield project.\n\n_Invoke:_ \"Reverse-engineer specs for [topic\u002Farea] using `PROMPT_reverse_engineer_specs.md`\"\n\n_Flow:_\n\n1. Point agent at existing codebase with `PROMPT_reverse_engineer_specs.md` →\n2. Agent investigates code (implementation-aware) →\n3. Agent writes specs describing actual behavior (implementation-free) →\n4. Specs land in `specs\u002F` →\n5. Repeat as needed for all specs\n6. Proceed with normal Ralph phases (plan → build) against documented baseline\n\nYou can use an agent orchestration pattern where the sub-agent is the reverse engineer and the orchestrator knows about the Topic of Concern Philosophy:\n\n- **Full domain coverage:** Tell the orchestrator to identify the list of topics in the domain, then spawn sub-agents to create complete specifications for each topic.\n- **Task-scoped coverage:** Provide a specific task you're going to perform and have the agent analyze the codebase, find the relevant topics, then create\u002Fupdate each respective spec.\n\nNo modifications to existing prompt files needed — this is purely additive. 
The generated specs are the same format Ralph already consumes in planning and building phases.\n\n#### Considerations\n\n- **Mono-repo structures:** May require scoping the reverse-engineering to specific packages or services rather than the entire repo. Point the agent at the relevant subdirectory.\n- **Entire-domain specs generation:** Generating specs for an entire domain is a larger investment — worth doing if your team is adopting Ralph as a standard workflow.\n- **Quick development or small changes:** Small code changes may drift from generated specs. Decide upfront whether your team will re-run reverse-engineering to keep specs current, or accept temporary drift.\n- **Spec staleness after refactors:** Once Ralph builds new features on top of reverse-engineered specs, major refactors can invalidate specs silently. Re-run reverse-engineering periodically on heavily changed areas.\n- **Topic granularity:** The prompt enforces \"one topic per spec\" strictly. On a large codebase, deciding where to draw topic boundaries is a judgment call — too broad and specs become unwieldy, too narrow and you drown in files. Start coarse and split as needed.\n- **Bugs become specs:** The prompt intentionally documents buggy behavior as the defined behavior. Reverse-engineered specs describe what *is*, not what *should be*. Write new specs separately for desired behavior changes.\n- **Token cost on large codebases:** Exhaustive code tracing with sub-agents can burn significant tokens. Scope to the areas you're actually planning to modify first.\n\n#### Compatibility with Core Philosophy\n\n| Principle             | Maintained? | How                                                         |\n| --------------------- | ----------- | ----------------------------------------------------------- |\n| Deterministic setup   | ✅ Yes      | Specs are written artifacts (known state), not ad-hoc context; they capture the code as it is, flaws included. 
|\n| Context efficiency    | ⚠️ Partial  | Requires adoption across your entire team culture |\n| Capture the why       | ⚠️ Partial  | Code doesn't always contain the why; rationale is captured only where comments express it. |\n| Let Ralph Ralph       | ✅ Yes      | Topics of concern are still chosen by Ralph. |\n| Plan is disposable    | ✅ Yes      | Specs provide stable baseline; plans regenerate against documented reality |\n| Simplicity wins       | ✅ Yes      | Provides a bird's-eye view of your entire specification set. |\n\n#### `PROMPT_reverse_engineer_specs.md` Template\n\n_Notes:_\n\n- Documents actual code behavior (bugs included) — not intended behavior\n- Two-phase process: Phase 1 investigates with full code access, Phase 2 writes specs with zero implementation details\n- One topic per spec, enforced by the \"one sentence without 'and'\" test\n- Current subagent names presume using Claude\n\n_Files:_ [`PROMPT_reverse_engineer_specs.md`](files\u002FPROMPT_reverse_engineer_specs.md)\n\n```\n0a. Study `specs\u002F*` with up to 250 parallel Sonnet subagents to learn existing specifications.\n0b. Study `src\u002F*` to understand the codebase. Use up to 500 parallel Sonnet subagents for reads\u002Fsearches. Treat `src\u002Flib` as the project's standard library for shared utilities and components.\n\n1. For each topic assigned (or discovered), reverse-engineer the source code and produce a specification in `specs\u002F`. Use Opus subagents for complex tracing. Ultrathink. Before writing a spec, search to confirm one doesn't already exist for that topic.\n2. One topic per spec. Must pass the \"one sentence without 'and'\" test. Split if \"and\" joins unrelated capabilities.\n3. **Two-phase process:** Phase 1 (Investigation) — trace every entry point, branch, code path to terminal. Map data flow, side effects, state mutations, error handling, concurrency, config-driven paths, implicit behavior. 
Phase 2 (Output) — zero implementation details. No function\u002Fclass\u002Fvariable names, file paths, library\u002Fframework references. A different team on a different stack must be able to reimplement from the spec alone.\n4. **Document reality, not intent.** Bugs are features. Never add behaviors the code doesn't implement. Never suggest improvements. If a source comment contradicts the code, document the code's behavior and ignore the comment.\n5. **Scope boundaries:** When tracing leaves the topic, stop. Document what crosses the boundary (sent\u002Freceived) only. Test: \"Could this change without changing my topic's outcomes?\" If yes, it's across the boundary.\n6. **Shared behavior:** Inline fully in every spec (self-contained). Note shared topics for cross-spec tracking. Shared behavior also gets its own canonical spec.\n7. **Spec format:** Markdown in `specs\u002F`. Each spec includes: topic statement, scope (in-scope and boundaries), data contracts, behaviors (in execution order), and state transitions. Mark notable\u002Fsurprising behavior, unreachable paths, and shared cross-topic behavior inline. Capture rationale from source comments (strip implementation references). File naming: `specs\u002FNN-kebab-case.md` (e.g., `01-session-management.md`).\n8. When specs are complete and validated, `git add -A` then `git commit` with a message describing which specs were added\u002Fupdated. After the commit, `git push`.\n\n99999. **Exhaustive checklist before finalizing:** Every entry point documented. Every branch traced to terminal. Every data contract. Every side effect in execution order. Every error path (caught\u002Fpropagated\u002Fignored). Every config-driven path. Concurrency outcomes. Unreachable paths marked. Notable\u002Fsurprising behavior marked. Zero implementation details in output. If any item is missing, trace again.\n999999. The code is the source of truth. 
If specs are inconsistent with the code, update the spec using an Opus 4.6 subagent.\n9999999. Single sources of truth, no duplicated specs. Update existing specs rather than creating new ones.\n99999999. When you learn something new about the project, update @AGENTS.md using a subagent but keep it brief and operational only — no status updates or progress notes.\n999999999. Source comments explaining why behavior must be preserved (regulatory, compatibility, intentional) — capture rationale, strip implementation references. Stale comments are not spec.\n9999999999. Document all configuration-driven paths, not just the currently active one.\n99999999999. If you find inconsistencies in `specs\u002F*` then use an Opus 4.6 subagent with 'ultrathink' to update the specs.\n```\n\n— contributed by Jake Cukjati · [@Byte0fCode](https:\u002F\u002Fx.com\u002FByte0fCode) · [@jackstine](https:\u002F\u002Fgithub.com\u002Fjackstine)\n","# The Ralph Playbook\n\nIn December 2025, [Ralph](https:\u002F\u002Fghuntley.com\u002Fralph\u002F)'s powerful but slightly goofy little face shot to the top of most AI-related timelines.\n\nI've long followed the sharp insights [@GeoffreyHuntley](https:\u002F\u002Fx.com\u002FGeoffreyHuntley) shares, but Ralph didn't really click for me this summer. Now the buzz around it has become impossible to ignore.\n\nThe overviews from [@mattpocockuk](https:\u002F\u002Fx.com\u002Fmattpocockuk\u002Fstatus\u002F2008200878633931247) and [@ryancarson](https:\u002F\u002Fx.com\u002Fryancarson\u002Fstatus\u002F2008548371712135632) helped a lot, until Geoff himself stepped in with a [\"nah\"](https:\u002F\u002Fx.com\u002FGeoffreyHuntley\u002Fstatus\u002F2008731415312236984).\n\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FClaytonFarr_ralph-playbook_readme_b1d483cbb3de.png\" alt=\"nah\" width=\"500\" \u002F>\n\n## 
So what's actually the best way to use Ralph?\n\nPlenty of people seem to be getting good results with different approaches, but I wanted to stay as close as possible to the person who originated the method, the one who not only captured the idea but has kept refining it in practice, to decode how it works.\n\nSo I dug into the [recent video](https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=O2bBWDoxO4s) and Geoff's [original article](https:\u002F\u002Fghuntley.com\u002Fralph\u002F) to sort out what actually works best.\n\nHere's the result: a (possibly OCD-driven) Ralph playbook that pulls the scattered details into an actionable process, hopefully preserving the core value rather than watering it down along the way.\n\n> While digging in, I also came up with some potentially valuable [enhancements](#enhancements) to the core method, designed to follow the guiding principles that make Ralph work so well.\n\n> [!TIP]\n> View as a [📖 formatted guide →](https:\u002F\u002FClaytonFarr.github.io\u002Fralph-playbook\u002F)\n\nI hope you find this playbook useful — [@ClaytonFarr](https:\u002F\u002Fx.com\u002FClaytonFarr)\n\n---\n\n## Table of Contents\n\n- [Workflow](#workflow)\n- [Key Principles](#key-principles)\n- [Loop Mechanics](#loop-mechanics)\n- [License](#license)\n- [Files](#files)\n- [Enhancements?](#enhancements)\n\n---\n\n## Workflow\n\nA picture is worth a thousand tweets and an hour of video. Geoff's [overview](https:\u002F\u002Fghuntley.com\u002Fralph\u002F) (subscribe to his mailing list to see the full article) really helped me untangle the flow from 1) idea → 2) JTBD-aligned specs → 3) comprehensive implementation plan → 4) the Ralph work loop.\n\n![ralph-diagram.png](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FClaytonFarr_ralph-playbook_readme_9ce8e43650f7.png)\n\n### 🗘 Three Phases, Two Prompts, One Loop\n\nThis diagram made it click for me that Ralph is more than a \"loop that writes code.\" It's a funnel: 3 phases, 2 prompts, 1 loop.\n\n#### Phase 1: Define Requirements (LLM conversation)\n\n- Discuss the project idea and identify Jobs to Be Done (JTBD)\n- Break each JTBD into concrete topics of concern\n- Use subagents to load info from URLs into context\n- LLM understands each JTBD topic of concern: a subagent writes `specs\u002FFILENAME.md` per topic\n\n#### Phase 2 \u002F Phase 3: Run the Ralph loop (two modes, swapping `PROMPT.md` as needed)\n\nSame mechanics, but different prompts for different goals:\n\n| Mode       | When to use                            | Prompt focus                                            |\n| ---------- | -------------------------------------- | ------------------------------------------------------- |\n| _Plan_     | No plan yet, or plan is stale\u002Fwrong         | Only generate or update `IMPLEMENTATION_PLAN.md`                 |\n| _Build_     | A plan exists                            | Implement from the plan, commit, and update the plan along the way |\n\n_How the prompts differ by mode:_\n\n- The \"plan\" prompt does gap analysis (specs vs. code) and outputs a prioritized task list; no implementation, no commits.\n- The \"build\" prompt assumes a plan exists, picks a task from it, implements, runs tests (backpressure), then commits.\n\n_Why use the loop for both modes?_\n\n- \"Build\" needs it: development is inherently iterative (lots of tasks × fresh context = isolation effect).\n- 
\"Plan\" uses it for consistency: same execution model, though it typically finishes in 1-2 passes.\n- Flexibility: if the plan needs adjusting, the loop can re-read its own output across passes.\n- Simplicity: one mechanism for everything; clean file I\u002FO; stop or restart anytime.\n\n_Context loaded every iteration:_ `PROMPT.md` + `AGENTS.md`\n\n_The \"plan\" mode loop:_\n\n1. Subagents study `specs\u002F*` and existing `\u002Fsrc`\n2. Compare specs against code (gap analysis)\n3. Create or update `IMPLEMENTATION_PLAN.md` with prioritized tasks\n4. No implementation\n\n_The \"build\" mode loop:_\n\n1. _Orient_ – subagents study `specs\u002F*` (requirements)\n2. _Read the plan_ – study `IMPLEMENTATION_PLAN.md`\n3. _Pick_ – choose the most important task\n4. _Investigate_ – subagents study relevant `\u002Fsrc` (don't assume not implemented)\n5. _Implement_ – multiple subagents handle file operations\n6. _Validate_ – 1 subagent runs build and tests (backpressure)\n7. _Update `IMPLEMENTATION_PLAN.md`_ – mark tasks done, note discoveries or surprises\n8. _Update `AGENTS.md`_ – if there are new operational learnings\n9. _Commit_\n10. _Loop ends_ → context cleared → next pass starts fresh\n\n#### Concepts\n\n| Term                    | Definition                                                      |\n| ----------------------- | --------------------------------------------------------------- |\n| _Job to Be Done (JTBD)_   | A user's high-level need or desired outcome                                 |\n| _Topic of concern_              | A distinct aspect or component of a JTBD                          |\n| _Spec_                  | Requirements document for one topic of concern (`specs\u002FFILENAME.md`)           |\n| _Task_                  | Unit of work derived from comparing specs against code                             |\n\n_Relationships:_\n\n- 1 JTBD → many topics of concern\n- 1 topic of concern → 1 spec\n- 1 spec → many tasks (specs tend to be broader than tasks)\n\n_Example:_\n\n- JTBD: \"Help designers create mood boards\"\n- Topics of concern: image collection, color extraction, layout, sharing\n- Each topic gets one spec file\n- Each spec spawns many implementation tasks\n\n_Topic scope test: \"one sentence without 'and'\"_\n\n- Can you describe the topic of concern in one sentence without conjoining unrelated capabilities?\n  - ✓ \"The color extraction system analyzes images to identify dominant colors\"\n  - ✗ \"The user system handles authentication, profiles, and billing\" → 3 topics\n- If you need \"and\" to describe what it does, it's probably multiple topics.\n\n---\n\n## Key Principles\n\n### ⏳ Context Is Everything\n\n- When 200k+ tokens are advertised, only about 176k are actually usable.\n- And the \"smart zone\" typically sits at 40% to 60% context utilization.\n- For tight tasks, one task per loop achieves **100% smart-zone context utilization**.\n\nThis determines and drives everything else:\n\n- **Use the main agent\u002Fcontext as a scheduler**\n  - Don't spend the main context on expensive work; push it to subagents wherever possible.\n- **Use subagents as memory extension**\n  - Each subagent has roughly 156KB of memory and gets garbage-collected.\n  - Fan work out to avoid polluting the main context.\n- **Simple and concise wins**\n  - This applies to the number of components in the system, the loop configuration, and the content itself.\n  - Verbose inputs reduce determinism.\n- **Prefer Markdown over JSON**\n  - For defining and tracking work, for token efficiency.\n\n### 🧭 Steering Ralph: Patterns and Backpressure\n\nCreating the right signals and gates to steer Ralph toward successful outcomes is critical. You can steer from two directions:\n\n- **Upstream steering**\n  - Ensure a deterministic setup:\n  
  - Budget the first ~5,000 tokens for the specs.\n    - Every loop loads the same files into context, so the model starts from a known state (`PROMPT.md` + `AGENTS.md`).\n  - Your existing code determines what gets used and generated.\n  - If Ralph produces the wrong patterns, add or update tooling and existing code patterns to steer it in the right direction.\n- **Downstream steering**\n  - Create backpressure via tests, type checks, linters, builds, etc. to reject invalid or unacceptable work.\n  - The prompt only says \"run tests\" generically. `AGENTS.md` specifies the actual commands, making the backpressure project-specific.\n  - Backpressure isn't limited to code validation: some acceptance criteria are hard to check programmatically, such as creative quality, aesthetics, and UX. Use an LLM as judge to turn these subjective criteria into binary pass\u002Ffail checks that provide backpressure. (See the Non-Deterministic Backpressure section below for handling these with Ralph.)\n- **Remind Ralph to create\u002Fuse backpressure**\n  - Remind Ralph during implementation: \"IMPORTANT: When writing documentation, capture the why — including tests and their importance to the implementation.\"\n\n### 🙏 Let Ralph Ralph\n\nRalph's effectiveness depends on how much you trust it to eventually make the right decisions, and on the effort you put into making that possible.\n\n- **Let Ralph do what Ralph does well**\n  - Lean into the LLM's capacity for self-identification, self-correction, and self-improvement.\n  - This applies to the implementation plan, task definition, and prioritization.\n  - Eventual consistency through iteration.\n- **Take protective measures**\n  - To run autonomously, Ralph needs `--dangerously-skip-permissions` — the loop breaks if every tool call requires approval. This bypasses Claude's permission system entirely, so the sandbox becomes the only security boundary.\n  - The philosophy: \"Being compromised isn't a question of if, but when. So what's the blast radius?\"\n  - Run without a sandbox and the credentials, browser cookies, SSH keys, and access tokens on your machine are all at risk.\n  - Run in an isolated environment with only the minimum necessary permissions:\n    - Provide only the API keys and deploy keys the task needs.\n    - No access to private data beyond what's required.\n    - Restrict network access where possible.\n  - Options include a local Docker sandbox, Fly Sprites for remote\u002Fproduction, or E2B. (See the Sandboxing section in the references for more.)\n  - And there are emergency brakes: Ctrl+C stops the loop; `git reset --hard` undoes uncommitted changes; regenerate the plan if the trajectory drifts.\n\n### 🚦 Stay Out of the Loop\n\nTo get the most from Ralph, you have to let it work freely. Ralph should own the work, including deciding what to implement next and how. Your job is not to be in the loop but outside it: building the environment and conditions that help Ralph succeed.\n\n- **Observe and adjust** — especially early on, sit back and watch closely. What patterns emerge? Where does Ralph fail? What signposts does it need? The prompts you start with are likely not the prompts you'll end with; they evolve in response to observed failure modes.\n- **Tune it like a guitar** — don't try to specify everything up front; optimize through observation and reactive adjustment. When Ralph fails in a specific way, add a signpost to help it do better next time.\n\nSignposts aren't limited to prompt text; they can be anything Ralph can discover:\n\n- Constraints in the prompt, e.g. the explicit instruction \"don't assume not implemented.\"\n- The `AGENTS.md` file, with operational learnings about how to build and test.\n- Utilities in your codebase: add a new pattern and Ralph will discover and follow it.\n- Other discoverable, relevant inputs…\n\n> [!TIP]\n>\n> 1. Try starting with nothing in `AGENTS.md` (an empty file, no best practices, etc.).\n> 2. Run small tests to observe desired behavior and find the gaps. (See Geoff's example: https:\u002F\u002Fx.com\u002FClaytonFarr\u002Fstatus\u002F2010780371542241508)\n> 3. Watch the first few loops to see where the gaps are.\n> 4. 
Only as needed, adjust behavior by updating `AGENTS.md` or changing code patterns (shared utilities, etc.).\n\nRemember: **the plan is disposable**:\n\n- If the plan is wrong, throw it away and start over.\n- Regenerating the plan costs one planning loop, far cheaper than Ralph churning in circles.\n- Regenerate the plan when:\n  - Ralph goes off track (implements the wrong tasks or repeats work).\n  - The plan looks stale or out of sync with the current state.\n  - Completed items pile up into clutter.\n  - You've made significant changes to the specs.\n  - You're confused about what's already done.\n\n---\n\n## Loop Mechanics\n\n### I. Task Selection\n\n`loop.sh` effectively acts as an \"outer loop\" where each pass is one discrete task (run in a separate session). When a task completes, `loop.sh` starts a new session to pick the next task, as long as incomplete tasks remain.\n\nGeoff's original, minimal `loop.sh`:\n\n```bash\nwhile :; do cat PROMPT.md | claude ; done\n```\n\n**Note**: the same approach works with other CLI tools, e.g. `amp`, `codex`, `opencode`, etc.\n\n**What keeps the tasks flowing?**\n\nThe mechanism is simple and clever:\n\n1. Bash loop runs → pipes `PROMPT.md` into Claude.\n2. `PROMPT.md` says → \"Study `IMPLEMENTATION_PLAN.md` and pick the most important thing…\"\n3. Agent completes one task → updates `IMPLEMENTATION_PLAN.md` on disk, commits, exits.\n4. Bash loop restarts immediately → fresh context window.\n5. Agent reads the updated plan → picks the next most important task…\n\n**Key insight**: `IMPLEMENTATION_PLAN.md` persists on disk between iterations, serving as shared state across otherwise isolated loop runs. Every iteration deterministically loads the same files (`PROMPT.md`, `AGENTS.md`, and `specs\u002F*`) and reads the current state from disk.\n\nNo orchestration system needed — just a simple Bash loop restarting the agent, with the agent deciding what to do next each time by reading the plan file.\n\n### II. 
Task Execution\n\nEach task keeps working, driven by backpressure (tests, etc.), until it passes — forming a pseudo inner \"loop\" within a single session.\n\nThis inner loop is just the model's self-correction and iterative reasoning within a single response, driven by backpressure prompts, tool use, and subagents. It is not a loop in the programming sense.\n\nThere's no hard technical limit on a single task's execution. It's governed by:\n\n- _Scope constraints_: PROMPT.md says \"one task at a time\" and \"commit once tests pass.\"\n- _Backpressure_: failing tests or builds force the agent to fix things before committing.\n- _Natural end_: the agent exits after a successful commit.\n\n_Ralph can loop, ignore instructions, or head the wrong way_ — this is expected behavior and part of tuning. When Ralph \"tests you\" by failing in a specific way, add constraints by adjusting the prompt or tightening backpressure. The non-determinism is managed through observation and iteration.\n\n### Enhanced `loop.sh` Example\n\nThis script wraps the core loop and adds mode selection (plan\u002Fbuild), a max-iteration cap on the number of tasks, and a Git push after each iteration.\n\n_This enhanced version uses two saved prompt files:_\n\n- `PROMPT_plan.md` — plan mode (gap analysis, generate\u002Fupdate the plan)\n- `PROMPT_build.md` — build mode (implement from the plan)\n\n```bash\n#!\u002Fbin\u002Fbash\n# Usage: .\u002Floop.sh [plan|build] [max_iterations]\n# Examples:\n#   .\u002Floop.sh              # Build mode, unlimited tasks\n#   .\u002Floop.sh 20           # Build mode, max 20 tasks\n#   .\u002Floop.sh build 20     # Build mode, max 20 tasks\n#   .\u002Floop.sh plan         # Plan mode, unlimited tasks\n#   .\u002Floop.sh plan 5       # Plan mode, max 5 tasks\n\n# Parse arguments\nif [ \"$1\" = \"plan\" ]; then\n    # Plan mode\n    MODE=\"plan\"\n    PROMPT_FILE=\"PROMPT_plan.md\"\n    MAX_ITERATIONS=${2:-0}\nelif [ \"$1\" = \"build\" ]; then\n    # Explicit build mode (optional max iterations)\n    MODE=\"build\"\n    PROMPT_FILE=\"PROMPT_build.md\"\n    MAX_ITERATIONS=${2:-0}\nelif [[ \"$1\" =~ ^[0-9]+$ ]]; then\n    # Build mode with max tasks (number only)\n    MODE=\"build\"\n    PROMPT_FILE=\"PROMPT_build.md\"\n    MAX_ITERATIONS=$1\nelse\n    # Default: build mode, unlimited (no args or invalid input)\n    MODE=\"build\"\n    PROMPT_FILE=\"PROMPT_build.md\"\n    MAX_ITERATIONS=0\nfi\n\nITERATION=0\nCURRENT_BRANCH=$(git branch --show-current)\n\necho \"━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\"\necho \"Mode:   $MODE\"\necho \"Prompt: $PROMPT_FILE\"\necho \"Branch: $CURRENT_BRANCH\"\n[ $MAX_ITERATIONS -gt 0 ] && echo \"Max:    $MAX_ITERATIONS iterations (tasks)\"\necho \"━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\"\n\n# Verify the prompt file exists\nif [ ! 
-f \"$PROMPT_FILE\" ]; then\n    echo \"错误：$PROMPT_FILE 未找到\"\n    exit 1\nfi\n\nwhile true; do\n    if [ $MAX_ITERATIONS -gt 0 ] && [ $ITERATION -ge $MAX_ITERATIONS ]; then\n        echo \"已达到最大迭代次数（任务数量）：$MAX_ITERATIONS\"\n        break\n    fi\n\n    # 运行 Ralph 迭代，使用选定的提示\n    # -p：无头模式（非交互式，从标准输入读取）\n    # --dangerously-skip-permissions：自动批准所有工具调用（YOLO 模式）\n    # --output-format=stream-json：结构化输出，便于日志记录和监控\n    # --model opus：主代理使用 Opus 处理复杂推理（任务选择、优先级排序）\n    # 在构建模式下，若计划清晰且任务明确，也可使用 'sonnet' 以提高速度\n    # --verbose：详细执行日志\n    cat \"$PROMPT_FILE\" | claude -p \\\n        --dangerously-skip-permissions \\\n        --output-format=stream-json \\\n        --model opus \\\n        --verbose\n\n    # 每次迭代后推送更改\n    git push origin \"$CURRENT_BRANCH\" || {\n        echo \"推送失败。正在创建远程分支...\"\n        git push -u origin \"$CURRENT_BRANCH\"\n    }\n\n    ITERATION=$((ITERATION + 1))\n    echo -e \"\\n\\n======================== LOOP $ITERATION ========================\\n\"\ndone\n```\n\n_模式选择：_\n\n- 不提供关键字 → 使用 `PROMPT_build.md` 进行构建（实现）\n- 提供 `plan` 关键字 → 使用 `PROMPT_plan.md` 进行规划（差距分析、计划生成）\n\n_最大迭代次数：_\n\n- 限制的是_任务选择循环_（尝试执行的任务数量；而非单个任务内的工具调用次数）\n- 每次迭代代表一个新的上下文窗口，对应于 IMPLEMENTATION_PLAN.md 中的一项任务，以及一次提交\n- `.\u002Floop.sh` 会无限运行（可通过 Ctrl+C 手动停止）\n- `.\u002Floop.sh 20` 则会在执行 20 次迭代后停止\n\n_Claude CLI 标志位：_\n\n- `-p`（无头模式）：启用非交互式操作，从标准输入读取提示\n- `--dangerously-skip-permissions`：跳过所有权限提示，实现完全自动化运行\n- `--output-format=stream-json`：输出结构化 JSON，便于日志记录、监控和可视化\n- `--model opus`：主代理使用 Opus 处理任务选择、优先级排序和协调工作（若任务明确，也可使用 `sonnet` 以提升速度）\n- `--verbose`：提供详细的执行日志\n\n### 流式输出版本\n\n另一种 `loop_streamed.sh` 脚本会将 Claude 的原始 JSON 输出通过 `parse_stream.js` 进行处理，以在终端上显示可读、带颜色编码的内容，包括工具调用、结果和执行统计信息。\n\n_与基础 `loop.sh` 的区别：_\n\n- 将提示作为参数传递（`-p \"$FULL_PROMPT\"`），而非通过标准输入管道\n- 添加了 `--include-partial-messages` 以实现实时流式传输\n- 将输出通过 `parse_stream.js`（Node.js，无需依赖）进行处理\n- 在提示内容末尾添加了“执行上述指令。”字样\n\n_文件：_ [`loop_streamed.sh`](files\u002Floop_streamed.sh) · 
[`parse_stream.js`](files\u002Fparse_stream.js)\n\n— contributed by [@terry-xyz](https:\u002F\u002Fgithub.com\u002Fterry-xyz) and [@blackrosesxyz](https:\u002F\u002Fx.com\u002Fblackrosesxyz)\n\n## License\n\nThis repository is released under the [MIT License](LICENSE).\n\nThird-party screenshots and externally sourced images are excluded unless otherwise noted. See the [NOTICE](NOTICE) file for details.\n\n---\n\n## Files\n\n```\nproject-root\u002F\n├── loop.sh                         # Ralph loop script\n├── PROMPT_build.md                 # Build mode instructions\n├── PROMPT_plan.md                  # Plan mode instructions\n├── AGENTS.md                       # Operational guide loaded every iteration\n├── IMPLEMENTATION_PLAN.md          # Prioritized task list (generated\u002Fupdated by Ralph)\n├── specs\u002F                          # Requirement specs (one per JTBD topic)\n│   ├── [jtbd-topic-a].md\n│   └── [jtbd-topic-b].md\n├── src\u002F                            # Application source code\n└── src\u002Flib\u002F                        # Shared utilities & components\n```\n\n### `loop.sh`\n\nThe main loop script that orchestrates Ralph's iterations.\n\nSee the Loop Mechanics section for detailed implementation examples and configuration options.\n\n_Setup:_ make the script executable before first use:\n\n```bash\nchmod +x loop.sh\n```\n\n_Core function:_ repeatedly feed the prompt file to Claude, manage the iteration cap, and push changes after each task.\n\n### Prompts\n\nThe instruction set for each loop iteration. Swap between the plan and build versions as needed.\n\n_Prompt structure:_\n\n| Section                | Purpose                                               |\n| ---------------------- | ----------------------------------------------------- |\n| _Phase 0_ (0a, 0b, 0c) | Orientation: study specs, source locations, current plan    |\n| _Phases 1-4_            | Main instructions: task, validation, commit           |\n| _999… numbering_         | Guardrails\u002Finvariants (the bigger the number, the more critical) |\n\n_Key language patterns_ (Geoff's specific wording):\n\n- \"Study\" (not \"read\" or \"look at\")\n- \"Don't assume not implemented\" (critical — the Achilles' heel)\n- \"Use parallel subagents\" \u002F \"up to N subagents\"\n- \"Only 1 subagent for build\u002Ftests\" (backpressure control)\n- \"Think extra hard\" (now \"ultrathink\")\n- \"Capture the why\"\n- \"Keep it up to date\"\n- \"It is up to you to add missing functionality\"\n- \"Resolve them or document them\"\n\n#### `PROMPT_plan.md` Template\n\n_Notes:_\n\n- Update the [project-specific goal] placeholder below.\n- Current subagent names presume using Claude.\n\n```\n0a. Study `specs\u002F*` with up to 250 parallel Sonnet subagents to learn the application specifications.\n0b. Study @IMPLEMENTATION_PLAN.md (if present) to understand the plan so far.\n0c. Study `src\u002Flib\u002F*` with up to 250 parallel Sonnet subagents to understand shared utilities & components.\n0d. For reference, the application source code is in `src\u002F*`.\n\n1. 
研究 @IMPLEMENTATION_PLAN.md（如果存在；它可能不正确），并使用最多 500 个 Sonnet 子代理研究 `src\u002F*` 中的现有源代码，将其与 `specs\u002F*` 进行比较。使用 Opus 子代理分析发现，确定任务优先级，并创建\u002F更新 @IMPLEMENTATION_PLAN.md，将其整理成按待实现事项优先级排序的要点列表。超深度思考。考虑搜索 TODO、最小化实现、占位符、被跳过或不稳定测试，以及不一致的模式。研究 @IMPLEMENTATION_PLAN.md 以确定研究起点，并持续用子代理更新已完成\u002F未完成的事项。\n\n重要提示：仅制定计划，切勿实施任何内容。不要假设功能缺失；请先通过代码搜索确认。将 `src\u002Flib` 视为项目的标准库，用于共享工具和组件。优先选择其中整合的、符合语言习惯的实现，而非临时复制的内容。\n\n最终目标：我们希望实现 [项目特定目标]。考虑缺失的元素，并据此制定计划。如果某个元素缺失，请先搜索确认其确实不存在，然后如有必要，在 specs\u002FFILENAME.md 中编写该规范。如果你创建了新元素，则需使用子代理在 @IMPLEMENTATION_PLAN.md 中记录其实现计划。\n```\n\n#### `PROMPT_build.md` 模板\n\n_注意：_ 当前子代理名称假定使用 Claude。\n\n```\n0a. 使用最多 500 个并行 Sonnet 子代理研究 `specs\u002F*`，以了解应用程序的规格说明。\n0b. 研究 @IMPLEMENTATION_PLAN.md。\n0c. 作为参考，应用程序源代码位于 `src\u002F*`。\n\n1. 你的任务是按照规格说明，使用并行子代理实现功能。遵循 @IMPLEMENTATION_PLAN.md，选择最重要的事项进行处理。在做出更改之前，使用 Sonnet 子代理搜索代码库（不要假设未实现）。你可以使用最多 500 个并行 Sonnet 子代理进行搜索和读取，但构建\u002F测试时仅使用 1 个 Sonnet 子代理。当需要复杂推理时（调试、架构决策），使用 Opus 子代理。\n2. 实现功能或解决问题后，运行该代码单元的测试。如果功能缺失，则需根据应用程序规格说明自行添加。超深度思考。\n3. 发现问题时，立即使用子代理更新 @IMPLEMENTATION_PLAN.md，记录你的发现。问题解决后，更新并移除相关条目。\n4. 测试通过后，更新 @IMPLEMENTATION_PLAN.md，然后执行 `git add -A` 和 `git commit`，并在提交信息中描述所做的更改。提交完成后，执行 `git push`。\n\n99999. 重要提示：撰写文档时，务必说明原因——包括测试和实现的重要性。\n999999. 重要提示：单一事实来源，禁止迁移或适配器。若与你工作无关的测试失败，应将其作为本次增量的一部分一并解决。\n9999999. 一旦没有构建或测试错误，就创建一个 Git 标签。如果没有 Git 标签，则从 0.0.0 开始，每次补丁版本加 1，例如 0.0.1，前提是 0.0.0 不存在。\n99999999. 如有需要，可添加额外的日志以调试问题。\n999999999. 使用子代理保持 @IMPLEMENTATION_PLAN.md 的最新状态——未来的开发工作依赖于此，以避免重复劳动。尤其在你完成本轮任务后，务必及时更新。\n9999999999. 如果你了解到关于如何运行应用程序的新知识，使用子代理更新 @AGENTS.md，但内容应简明扼要。例如，如果你在掌握正确命令之前多次尝试运行某些指令，那么该文件就应该相应更新。\n99999999999. 对于你注意到的任何 bug，即使与当前工作无关，也应使用子代理将其解决或记录在 @IMPLEMENTATION_PLAN.md 中。\n999999999999. 功能必须完整实现。占位符和桩代码会浪费精力和时间，导致重复劳动。\n9999999999999. 当 @IMPLEMENTATION_PLAN.md 变得过于冗长时，应定期使用子代理清理已完成的条目。\n99999999999999. 如果发现 `specs\u002F*` 中存在不一致之处，则使用 Opus 4.6 子代理，并要求其进行“超深度思考”，以更新规格说明。\n999999999999999. 
重要提示：保持 @AGENTS.md 的实用性——状态更新和进度说明应放在 `IMPLEMENTATION_PLAN.md` 中。臃肿的 AGENTS.md 会污染未来每次循环的上下文。\n```\n\n### `AGENTS.md`\n\n单一、规范的“循环核心”——一份简洁实用的“如何运行\u002F构建”指南。\n\n- 不是变更日志或进度日记\n- 描述如何构建\u002F运行项目\n- 记录能改进循环的实用经验\n- 内容应简短（约 60 行）\n\n状态、进度和计划应放在 `IMPLEMENTATION_PLAN.md` 中，而非此处。\n\n_循环回溯 \u002F 即时自我评估：_\n\nAGENTS.md 应包含使 Ralph 能够在同一循环内即时评估其工作的项目特定命令——即循环回溯能力。这包括：\n\n- 构建命令\n- 测试命令（定向测试和完整套件）\n- 类型检查\u002F代码风格检查命令\n- 任何其他验证工具\n\nBUILDING 提示词笼统地提到“运行测试”；而 AGENTS.md 则具体指定了实际命令。这就是如何为每个项目配置背压机制的方式。\n\n#### 示例\n\n```\n## 构建与运行\n\n关于如何构建项目的简明规则：\n\n## 验证\n\n实现后运行以下命令以获得即时反馈：\n\n- 测试：`[测试命令]`\n- 类型检查：`[类型检查命令]`\n- 代码风格检查：`[代码风格检查命令]`\n\n## 操作注意事项\n\n关于如何运行项目的简明经验：\n\n...\n\n### 代码库模式\n\n...\n```\n\n### `IMPLEMENTATION_PLAN.md`\n\n由 Ralph 生成的、基于差距分析（规格与代码对比）得出的优先级任务列表——以项目符号形式呈现。\n\n- _创建于_ 规划模式下\n- _更新于_ 构建模式中（标记完成、添加新发现、记录缺陷）\n- _可重新生成_ ——Geoff：“我已经多次删除了待办清单” → 切换回规划模式\n- _自我修正_ ——如果缺少规格，构建模式甚至可以创建新的规格\n\n这种循环设计是刻意为之：通过迭代实现最终一致性。\n\n- _无预设模板_ ——让 Ralph\u002F大模型自行决定并管理最适合其工作的格式。\n\n### `specs\u002F*`\n\n每个关注主题对应一个 Markdown 文件。这些文件是关于应构建内容的事实来源。\n\n- 在需求阶段创建（人类与大模型的对话）\n- 被规划模式和构建模式共同使用\n- 如发现不一致，可进行更新（较为罕见，宜使用子代理处理）\n\n- _无预设模板_ ——让 Ralph\u002F大模型自行决定并管理最适合其工作的格式。\n\n### `src\u002F` 和 `src\u002Flib\u002F`\n\n应用源代码及共享工具\u002F组件。\n\n在 `PROMPT.md` 模板中被引用，用于引导步骤。\n\n---\n\n## 改进方案？\n\n我仍在评估这些方案的价值与可行性，但听起来机会颇为诱人：\n\n- [利用 Claude 的 AskUserQuestionTool 进行规划](#use-claudes-askuserquestiontool-for-planning) —— 使用 Claude 内置的访谈工具，系统性地澄清 JTBD、边缘情况及规格中的验收准则。\n- [基于验收的反压机制](#acceptance-driven-backpressure) —— 在规划阶段根据验收准则推导测试需求。这能防止“作弊”——如果没有通过适当的测试，就不能宣称已完成。\n- [非确定性反压机制](#non-deterministic-backpressure) —— 对于主观性任务（语气、美学、用户体验等），使用大模型作为评判者进行测试。采用二元通过\u002F失败的评审方式，并不断迭代直至通过。\n- [适合 Ralph 的工作分支](#ralph-friendly-work-branches) —— 在运行时要求 Ralph“筛选到功能 X”并不可靠。相反，应在一开始就为每个分支创建明确范围的计划。\n- [JTBD → 故事地图 → SLC 发布](#jtbd--story-map--slc-release) —— 将“让 Ralph 做 Ralph 该做的事”的理念延伸至将 JTBD 中的目标用户及其活动与简单\u002F可爱\u002F完整的发布版本相连接。\n- [规格审计](#specs-audit) —— 专门设立一种模式，用于按照质量规则生成和维护规格：仅关注行为结果、明确主题范围、统一命名规范。\n- 
[将现有项目逆向工程为规格](#reverse-engineering-brownfield-projects-to-specs) —— 通过将现有代码库逆向转化为规格，再开始规划新工作，从而将旧项目纳入 Ralph 的工作流。\n\n---\n\n### 利用 Claude 的 AskUserQuestionTool 进行规划\n\n在第一阶段（定义需求）中，使用 Claude 内置的 `AskUserQuestionTool` 工具，通过结构化访谈系统地探索 JTBD、关注主题、边缘情况以及验收准则，然后再编写规格文档。\n\n_适用场景：_ 初始需求模糊或极少，需要明确约束条件，或者存在多种可行方案时。\n\n_调用方式：_ “请使用 AskUserQuestion 工具采访我，以了解 [JTBD\u002F主题\u002F验收准则\u002F...]”\n\nClaude 将提出有针对性的问题，以澄清需求并确保各方达成一致，随后生成 `specs\u002F*.md` 文件。\n\n_流程：_\n\n1. 从已知信息出发 →\n2. _Claude 通过 AskUserQuestion 进行访谈_ →\n3. 反复迭代直至清晰 →\n4. Claude 编写包含验收准则的规格文档 →\n5. 继续进入规划与构建阶段\n\n无需修改代码或提示词——此方法仅利用现有的 Claude Code 功能来增强第一阶段的工作。\n\n_灵感来源_ —— [Thariq 的 X 平台帖子](https:\u002F\u002Fx.com\u002Ftrq212\u002Fstatus\u002F2005315275026260309)。\n\n---\n\n### 基于验收的反压机制\n\nGeoff 的 Ralph 通过渐进式的迭代过程，_隐式地_ 将需求规格、实现和测试连接起来。而这一增强功能将使这种连接变得_显式_，即在规划阶段就推导出测试需求，从而建立一条从“成功的样子”到“如何验证它”的直接路径。\n\n该增强功能将需求规格中的验收准则与实现计划中的测试要求直接关联起来，从而提升反压的质量，具体体现在：\n\n- _防止“作弊”_：如果没有根据验收准则推导出的必要测试，则无法宣称任务完成。\n- _支持 TDD 工作流_：在开始实现之前，测试要求就已经明确。\n- _提高收敛性_：以“必要测试全部通过”作为明确的完成信号，而非模糊的“看起来完成了？”。\n- _保持确定性_：测试要求存在于计划中（已知状态），而不是在运行时才逐渐显现（概率性）。\n\n#### 与核心理念的兼容性\n\n| 原则             | 是否保持？ | 如何体现                                                         |\n| ------------------ | ----------- | ---------------------------------------------------------------- |\n| 单体运作         | ✅ 是      | 每次只有一个代理、一项任务、一个循环                             |\n| 反压至关重要     | ✅ 是      | 测试是反压机制，只是现在以显式方式推导出来                     |\n| 上下文效率       | ✅ 是      | 规划阶段一次性决定测试，避免重复构建和重新发现                   |\n| 确定性设置       | ✅ 是      | 测试要求在计划中（已知状态），而非运行时动态生成                 |\n| 让 Ralph 自主决断 | ✅ 是      | Ralph 仍然负责优先级排序并选择实现方案                          |\n| 计划可丢弃       | ✅ 是      | 如果测试要求有误，可以重新生成计划                              |\n| “捕捉为什么”      | ✅ 是      | 测试意图在实现前就记录在计划中                                 |\n| 杜绝作弊         | ✅ 是      | 必要的测试能够防止仅占位的实现                                   |\n\n#### 规范性的平衡\n\n关键区别在于：\n\n_验收准则_（来自需求规格）= 行为结果、可观察的结果，即成功的定义。\n\n- ✅ “从任何上传的图片中提取 
5-10 种主导颜色”\n- ✅ “处理小于 5MB 的图片耗时不超过 100 毫秒”\n- ✅ “处理边缘情况：灰度图、单色图、透明背景”\n\n_测试要求_（来自实现计划）= 由验收准则推导出的验证点。\n\n- ✅ “必要测试：提取 5-10 种颜色、性能低于 100 毫秒、处理灰度边缘情况”\n\n_实现方案_（由 Ralph 决定）= 关于如何实现的具体技术决策。\n\n- ❌ “使用 K-means 聚类算法，迭代 3 次，并进行 LAB 颜色空间转换”\n\n关键在于：_明确要验证什么（结果），而不是如何实现（方案）_\n\n这保持了“让 Ralph 自主决断”的原则——Ralph 在拥有清晰成功信号的情况下，自主决定实现细节。\n\n#### 架构：三阶段连接\n\n```\n阶段 1：需求定义\n    specs\u002F*.md + 验收准则\n    ↓\n阶段 2：规划（推导测试要求）\n    IMPLEMENTATION_PLAN.md + 必要测试\n    ↓\n阶段 3：构建（带测试的实现）\n    实现 + 测试 → 反压\n```\n\n#### 第一阶段：需求定义\n\n在生成需求规格的人机对话过程中：\n\n- 讨论用户目标，并将其分解为关注的主题。\n- 根据需要使用子代理加载外部上下文。\n- _讨论并定义验收准则_——哪些可观察、可验证的结果表明任务成功。\n- 准则应聚焦于行为层面（结果），而非实现细节（如何构建）。\n- LLM 会根据需求规格的逻辑编写包含验收准则的内容。\n- 验收准则将成为规划阶段推导测试要求的基础。\n\n#### 第二阶段：规划模式增强\n\n修改 `PROMPT_plan.md` 的第一条指令，加入测试推导的内容。在第一句后添加：\n\n```markdown\n对于计划中的每一项任务，根据需求规格中的验收准则推导出必要的测试——需要验证的具体结果（行为、性能、边缘情况）。测试验证的是“做什么”，而非“怎么做”。请将这些测试纳入任务定义中。\n```\n\n#### 第三阶段：构建模式增强\n\n修改 `PROMPT_build.md` 的指令：\n\n_指令 1：_ 在“选择最重要的事项进行处理”之后添加：\n\n```markdown\n任务应包含必要的测试——将测试作为任务范围的一部分一起实现。\n```\n\n_指令 2：_ 将“运行该代码单元的测试”替换为：\n\n```markdown\n运行任务定义中指定的所有必要测试。所有必要测试必须存在且通过，任务才算完成。\n```\n\n_新增约束条件_（置于第 9 步序列中）：\n\n```markdown\n999. 
必须先确保根据验收准则推导出的必要测试存在并全部通过，才能提交代码。测试是实现范围的一部分，而非可选项。采用测试驱动开发方法：测试可以先写，也可以与实现同步进行。\n```\n\n---\n\n### 非确定性反压\n\n有些验收准则难以通过程序化的方式进行验证：\n\n- _创作质量_——写作风格、叙事流畅性、内容吸引力。\n- _审美判断_——视觉和谐、设计平衡、品牌一致性。\n- _用户体验质量_——导航是否直观、信息层级是否清晰。\n- _内容适宜性_——信息是否符合场景、是否适合目标受众。\n\n这些都需要类似人类的判断力，但在构建循环中仍需反压机制来确保满足验收准则。\n\n_解决方案：_ 添加 LLM 作为评判者的测试作为反压手段，采用二元通过\u002F失败判定。\n\nLLM 的评审是非确定性的（同一产物在不同运行中可能得到不同的评价）。这符合 Ralph 的哲学：“在不确定的世界里，以确定的方式应对问题”。循环通过迭代提供最终的一致性——评审将持续进行，直到通过为止，同时接受自然的差异。\n\n#### 需要创建的内容（第一步）\n\n在 `src\u002Flib\u002F` 目录下创建两个文件：\n\n```\nsrc\u002Flib\u002F\n  llm-review.ts          # 核心工具——单一函数，简洁 API\n  llm-review.test.ts     # 参考示例，展示模式（Ralph 从中学习）\n```\n\n##### `llm-review.ts` - Ralph 发现的二元通过\u002F失败 API：\n\n```typescript\ninterface ReviewResult {\n  pass: boolean;\n  feedback?: string; \u002F\u002F 仅当不通过时存在\n}\n\nfunction createReview(config: {\n  criteria: string; \u002F\u002F 评估的内容（行为、可观察）\n  artifact: string; \u002F\u002F 文本内容或截图路径\n  intelligence?: \"fast\" | \"smart\"; \u002F\u002F 可选，默认为 'fast'\n}): Promise\u003CReviewResult>;\n```\n\n_多模态支持：_ 两种智能级别都将使用多模态模型（文本+视觉）。自动检测产物类型：\n\n- 文本评估：`artifact: \"Your content here\"` → 作为文本输入处理。\n- 视觉评估：`artifact: \".\u002Ftmp\u002Fscreenshot.png\"` → 作为视觉输入处理（检测 .png、.jpg、.jpeg 等扩展名）。\n\n_智能级别_（指判断质量，而非能力类型）：\n\n- `fast`（默认）：快速、经济高效的模型，适用于简单的评估\n  - 示例：Gemini 3.0 Flash（多模态、快速、便宜）\n- `smart`：更高质量的模型，适合进行细致入微的美学或创意判断\n  - 示例：GPT 5.1（多模态、判断更佳、成本更高）\n\n测试夹具会自动选择合适的模型。（示例为当前可用选项，并非强制要求。）\n\n##### `llm-review.test.ts` - 展示 Ralph 如何使用它（文本和视觉示例）：\n\n```typescript\nimport { createReview } from \"@\u002Flib\u002Fllm-review\";\n\n\u002F\u002F 示例 1：文本评估\ntest(\"欢迎信息语气\", async () => {\n  const message = generateWelcomeMessage();\n  const result = await createReview({\n    criteria:\n      \"信息采用温暖、对话式的语气，适合设计专业人士，同时清晰传达价值主张\",\n    artifact: message, \u002F\u002F 文本内容\n  });\n  expect(result.pass).toBe(true);\n});\n\n\u002F\u002F 示例 2：视觉评估（截图路径）\ntest(\"仪表板视觉层次结构\", async () => {\n  await page.screenshot({ path: \".\u002Ftmp\u002Fdashboard.png\" 
});\n  const result = await createReview({\n    criteria:\n      \"布局展现出清晰的视觉层次，主要操作一目了然\",\n    artifact: \".\u002Ftmp\u002Fdashboard.png\", \u002F\u002F 截图路径\n  });\n  expect(result.pass).toBe(true);\n});\n\n\u002F\u002F 示例 3：智能模式用于复杂判断\ntest(\"品牌视觉一致性\", async () => {\n  await page.screenshot({ path: \".\u002Ftmp\u002Fhomepage.png\" });\n  const result = await createReview({\n    criteria:\n      \"视觉设计保持专业品牌形象，适合金融服务行业，同时避免过于刻板的企业化风格\",\n    artifact: \".\u002Ftmp\u002Fhomepage.png\",\n    intelligence: \"smart\", \u002F\u002F 复杂的美学判断\n  });\n  expect(result.pass).toBe(true);\n});\n```\n\n_Ralph 从这些示例中学习到：_ 文本和截图都可以作为评估对象。根据需要评估的内容来选择即可。其余工作由测试夹具内部自动处理。\n\n_未来的扩展性：_ 目前的设计为了简单起见，使用单一的 `artifact: string`。如果后续出现需要多个评估对象的明确模式（如前后对比、跨项目一致性检查、多角度评估等），可以扩展为 `artifact: string | string[]`。对于大多数多项目需求，可以通过组合截图或拼接文本的方式来实现。\n\n#### 与 Ralph 工作流的集成\n\n_规划阶段_ - 更新 `PROMPT_plan.md`：\n\n在以下内容之后：\n\n```\n...研究 @IMPLEMENTATION_PLAN.md 以确定研究的起点，并利用子代理持续更新已完成和未完成的事项。\n```\n\n插入如下内容：\n\n```\n从验收标准推导测试需求时，需明确验证是需要程序化的校验（可测量、可检查）还是类似人类的判断（感知质量、语气、美学）。这两种方式都是同样有效的反压机制。对于难以通过程序化方式验证的主观标准，请探索 src\u002Flib 中的非确定性评估模式。\n```\n\n_构建阶段_ - 更新 `PROMPT_build.md`：\n\n在第 9 步序列中新增一条约束：\n\n```markdown\n9999. 
创建测试以验证实现是否符合验收标准，包括常规测试（行为、性能、正确性）以及感知质量测试（针对主观标准，参见 src\u002Flib 中的相关模式）。\n```\n\n_发现而非文档化：_ 在 `src\u002Flib` 探索过程中（Phase 0c），Ralph 通过 `llm-review.test.ts` 中的示例学习 LLM 评审模式。无需更新 AGENTS.md——代码示例本身就是文档。\n\n#### 与核心理念的兼容性\n\n| 原则             | 是否保持？ | 如何实现                                                                                                                                          |\n| --------------------- | ----------- | -------------------------------------------------------------------------------------------------------------------------------------------- |\n| 反压至关重要        | ✅ 是      | 将反压机制扩展至非程序化的验收环节                                                                                          |\n| 确定性设置          | ⚠️ 部分  | 计划中的标准是确定性的，而评估过程是非确定性的，但可通过迭代逐步收敛。这是为了兼顾主观质量而有意做出的权衡。 |\n| 上下文效率          | ✅ 是      | 通过 `src\u002Flib` 复用测试夹具，测试定义简洁精炼                                                                                         |\n| 让 Ralph 自主决策   | ✅ 是      | Ralph 发现模式，自行决定何时使用，并编写评估标准                                                                                |\n| 计划可丢弃          | ✅ 是      | 评审需求作为计划的一部分，若不准确可重新生成                                                                                    |\n| 简单至上            | ✅ 是      | 单一函数、二元结果，无复杂的评分机制                                                                                            |\n| 为 Ralph 提供指引     | ✅ 是      | 仅需少量提示补充，Ralph 可通过代码探索自主学习                                                                                       |\n\n---\n\n### 适合 Ralph 的工作分支\n\n_关键原则：_ Geoff 的 Ralph 基于一个一次性计划工作，其中 Ralph 会挑选“最重要的”任务。要在保持这一模式的同时使用分支，必须在创建计划时进行范围限定，而不是在选择任务时。\n\n_为什么这很重要：_\n\n- ❌ _错误做法：_ 先创建完整计划，再让 Ralph 在运行时“过滤”任务 → 不可靠（70-80%），违反确定性\n- ✅ _正确做法：_ 为每个工作分支提前创建一个限定范围的计划 → 确定性、简单，且保持“计划是一次性的”\n\n_解决方案：_ 添加 `plan-work` 模式，在当前分支上生成一个针对特定工作的 `IMPLEMENTATION_PLAN.md`。用户创建工作分支后，使用自然语言描述工作重点来运行 `plan-work`。LLM 会根据这个描述来限定计划的范围。规划完成后，Ralph 将基于这个已经限定范围的计划进行构建，无需任何语义过滤——只需像往常一样挑选“最重要的”任务。\n\n_术语：_ 
“工作”一词有意宽泛——它可以指功能、关注点、重构工作、基础设施变更、缺陷修复，或任何连贯的相关更改集合。您传递给 `plan-work` 的工作描述是提供给 LLM 的自然语言文本，可以是散文形式，不受 Git 分支命名规则的限制。\n\n#### 设计原则\n\n- ✅ _每个 Ralph 会话都以单体方式_ 处理每个分支上的单一工作内容\n- ✅ _用户手动创建分支_ —— 完全掌控命名规范和策略（如工作树）\n- ✅ _自然语言工作描述_ —— 向 LLM 传递散文，不受 Git 命名规则限制\n- ✅ _在计划创建时进行范围限定_（确定性），而非在任务选择时（概率性）\n- ✅ _每个分支仅有一个计划_ —— 每个分支对应一个 `IMPLEMENTATION_PLAN.md`\n- ✅ _计划仍具有一次性_ —— 当某个分支的计划不正确或过时时，可重新生成限定范围的计划\n- ✅ 循环会话内不允许动态切换分支\n- ✅ 维持了简单性和确定性\n- ✅ 可选——主分支的工作流仍然有效\n- ✅ 构建时无需语义过滤——Ralph 只需挑选“最重要的”任务\n\n#### 工作流程\n\n_1. 完整规划（在主分支上）_\n\n```bash\n.\u002Floop.sh plan\n# 为整个项目生成完整的 IMPLEMENTATION_PLAN.md\n```\n\n_2. 创建工作分支_\n\n用户执行：\n\n```bash\ngit checkout -b ralph\u002Fuser-auth-oauth\n# 使用您喜欢的任何命名规范创建分支\n# 建议：工作分支以 ralph\u002F* 开头\n```\n\n_3. 限定范围规划（在工作分支上）_\n\n```bash\n.\u002Floop.sh plan-work \"带有 OAuth 和会话管理的用户认证系统\"\n# 传递自然语言描述——LLM 将据此限定计划范围\n# 生成只包含该工作相关任务的聚焦型 IMPLEMENTATION_PLAN.md\n```\n\n_4. 基于计划构建（在工作分支上）_\n\n```bash\n.\u002Floop.sh\n# Ralph 基于限定范围的计划进行构建（无需过滤）\n# 直接从已限定范围的计划中挑选最重要的任务\n```\n\n_5. 
创建 PR（工作完成后）_\n\n用户执行：\n\n```bash\ngh pr create --base main --head ralph\u002Fuser-auth-oauth --fill\n```\n\n#### 针对工作范围的 Loop 脚本\n\n扩展基础增强版 Loop 脚本，增加对工作分支的支持及限定范围规划功能：\n\n```bash\n#!\u002Fbin\u002Fbash\nset -euo pipefail\n\n# 使用方法：\n#   .\u002Floop.sh [plan|build] [max_iterations]  # 在当前分支上进行规划或构建\n#   .\u002Floop.sh plan-work \"工作描述\"  # 在当前分支上创建限定范围的计划\n# 示例：\n#   .\u002Floop.sh                               # 构建模式，无上限\n#   .\u002Floop.sh 20                            # 构建模式，最多 20 次\n#   .\u002Floop.sh build 20                      # 构建模式，最多 20 次\n#   .\u002Floop.sh plan 5                        # 完整规划，最多 5 次\n#   .\u002Floop.sh plan-work \"用户认证\"         # 限定范围规划\n\n# 解析参数（使用 ${1:-} 形式，避免在 set -u 下无参数调用时报“未绑定变量”错误）\nMODE=\"build\"\nPROMPT_FILE=\"PROMPT_build.md\"\n\nif [ \"${1:-}\" = \"plan\" ]; then\n    # 完整规划模式\n    MODE=\"plan\"\n    PROMPT_FILE=\"PROMPT_plan.md\"\n    MAX_ITERATIONS=${2:-0}\nelif [ \"${1:-}\" = \"build\" ]; then\n    # 显式构建模式（可选最大迭代次数）\n    MAX_ITERATIONS=${2:-0}\nelif [ \"${1:-}\" = \"plan-work\" ]; then\n    # 限定范围规划模式\n    if [ -z \"${2:-}\" ]; then\n        echo \"错误：plan-work 需要提供工作描述\"\n        echo \"用法：.\u002Floop.sh plan-work '工作描述'\"\n        exit 1\n    fi\n    MODE=\"plan-work\"\n    WORK_DESCRIPTION=\"$2\"\n    PROMPT_FILE=\"PROMPT_plan_work.md\"\n    MAX_ITERATIONS=${3:-5}  # 工作规划默认 5 次\nelif [[ \"${1:-}\" =~ ^[0-9]+$ ]]; then\n    # 构建模式，指定最大迭代次数（纯数字）\n    MAX_ITERATIONS=$1\nelse\n    # 构建模式，无上限\n    MAX_ITERATIONS=0\nfi\n\nITERATION=0\nCURRENT_BRANCH=$(git branch --show-current)\n\n# 校验 plan-work 模式下的分支\nif [ \"$MODE\" = \"plan-work\" ]; then\n    if [ \"$CURRENT_BRANCH\" = \"main\" ] || [ \"$CURRENT_BRANCH\" = \"master\" ]; then\n        echo \"错误：plan-work 应在工作分支上运行，而非 main\u002Fmaster\"\n        echo \"请先创建工作分支：git checkout -b ralph\u002Fyour-work\"\n        exit 1\n    fi\n\n    echo \"━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\"\n    echo \"模式：    plan-work\"\n    echo \"分支：  $CURRENT_BRANCH\"\n    echo \"工作：    $WORK_DESCRIPTION\"\n    echo \"提示词：  $PROMPT_FILE\"\n    echo \"计划：   
 将创建限定范围的 IMPLEMENTATION_PLAN.md\"\n    [ \"$MAX_ITERATIONS\" -gt 0 ] && echo \"上限：     $MAX_ITERATIONS 次\"\n    echo \"━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\"\n\n    # 提醒未提交的 IMPLEMENTATION_PLAN.md 更改\n    if [ -f \"IMPLEMENTATION_PLAN.md\" ] && ! git diff --quiet IMPLEMENTATION_PLAN.md 2>\u002Fdev\u002Fnull; then\n        echo \"警告：IMPLEMENTATION_PLAN.md 存在未提交的更改，这些更改将被覆盖\"\n        read -p \"继续吗？[y\u002FN] \" -n 1 -r\n        echo\n        [[ ! $REPLY =~ ^[Yy]$ ]] && exit 1\n    fi\n\n    # 导出工作描述供 PROMPT_plan_work.md 使用\n    export WORK_SCOPE=\"$WORK_DESCRIPTION\"\nelse\n    # 普通规划\u002F构建模式\n    echo \"━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\"\n    echo \"模式：   $MODE\"\n    echo \"分支：   $CURRENT_BRANCH\"\n    echo \"提示词： $PROMPT_FILE\"\n    echo \"计划：   IMPLEMENTATION_PLAN.md\"\n    [ \"$MAX_ITERATIONS\" -gt 0 ] && echo \"上限：    $MAX_ITERATIONS 次\"\n    echo \"━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\"\nfi\n\n# 验证提示词文件是否存在\nif [ ! -f \"$PROMPT_FILE\" ]; then\n    echo \"错误：$PROMPT_FILE 未找到\"\n    exit 1\nfi\n\n# 主循环\nwhile true; do\n    if [ \"$MAX_ITERATIONS\" -gt 0 ] && [ \"$ITERATION\" -ge \"$MAX_ITERATIONS\" ]; then\n        echo \"达到最大迭代次数：$MAX_ITERATIONS\"\n\n        if [ \"$MODE\" = \"plan-work\" ]; then\n            echo \"\"\n            echo \"━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\"\n            echo \"已创建限定范围的计划：$WORK_DESCRIPTION\"\n            echo \"要构建，请运行：\"\n            echo \"  .\u002Floop.sh 20\"\n            echo \"━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\"\n        fi\n        break\n    fi\n\n    # 使用选定的提示词运行Ralph迭代\n    # -p: 无头模式（非交互式，从标准输入读取）\n    # --dangerously-skip-permissions: 自动批准所有工具调用（YOLO模式）\n    # --output-format=stream-json: 结构化输出，便于日志记录和监控\n    # --model opus: 主代理使用Opus模型进行复杂推理（任务选择、优先级排序）\n    #               如果计划清晰且任务定义明确，也可使用‘sonnet’以提高速度\n    # --verbose: 详细执行日志记录\n\n    # 对于计划-工作模式，在管道前先替换提示词中的${WORK_SCOPE}\n    if [ \"$MODE\" = \"plan-work\" ]; then\n        envsubst \u003C \"$PROMPT_FILE\" | claude -p \\\n       
     --dangerously-skip-permissions \\\n            --output-format=stream-json \\\n            --model opus \\\n            --verbose\n    else\n        cat \"$PROMPT_FILE\" | claude -p \\\n            --dangerously-skip-permissions \\\n            --output-format=stream-json \\\n            --model opus \\\n            --verbose\n    fi\n\n    # 推送到当前分支\n    CURRENT_BRANCH=$(git branch --show-current)\n    git push origin \"$CURRENT_BRANCH\" || {\n        echo \"推送失败。正在创建远程分支...\"\n        git push -u origin \"$CURRENT_BRANCH\"\n    }\n\n    ITERATION=$((ITERATION + 1))\n    echo -e \"\\n\\n======================== 第 $ITERATION 次循环 ========================\\n\"\ndone\n```\n\n#### `PROMPT_plan_work.md` 模板\n\n_注意：_ 与 `PROMPT_plan.md` 完全相同，但加入了范围限定说明，并替换了环境变量 `WORK_SCOPE`（由循环脚本自动完成）。\n\n```\n0a. 使用最多250个并行的Sonnet子代理研究`specs\u002F*`，以了解应用程序的规格说明。\n0b. 研究@IMPLEMENTATION_PLAN.md（如果存在），以理解目前的计划。\n0c. 使用最多250个并行的Sonnet子代理研究`src\u002Flib\u002F*`，以理解共享的实用工具和组件。\n0d. 作为参考，应用程序源代码位于`src\u002F*`。\n\n1. 
您正在为工作“${WORK_SCOPE}”制定一个限定范围的实施计划。研究@IMPLEMENTATION_PLAN.md（如果存在；它可能是不正确的），并使用多达500个Sonnet子代理来研究`src\u002F*`中的现有源代码，将其与`specs\u002F*`进行比较。使用一个Opus子代理分析结果，确定任务的优先级，并创建或更新@IMPLEMENTATION_PLAN.md，将其整理成按未实现事项优先级排序的项目列表。务必深入思考。考虑搜索TODO标记、最小化实现、占位符、被跳过或不稳定的功能测试，以及不一致的模式。研究@IMPLEMENTATION_PLAN.md以确定研究的起点，并持续使用子代理更新已完成或未完成的条目。\n\n重要提示：这仅是针对“${WORK_SCOPE}”的限定范围规划。请仅包含与此工作范围直接相关的任务。保持保守态度——如果不确定某项任务是否属于此工作范围，则将其排除。如果计划过于狭窄，可以重新生成。仅做规划，切勿实施任何内容。不要假设功能缺失；应先通过代码搜索确认。将`src\u002Flib`视为项目的标准库，用于共享实用工具和组件。优先选择其中统一、符合语言习惯的实现方式，而非临时复制的版本。\n\n最终目标：我们希望实现限定范围的工作“${WORK_SCOPE}”。请考虑与此工作相关的缺失部分，并据此制定计划。如果发现某项缺失内容，首先搜索确认其确实不存在，然后在必要时编写规格说明文件至`specs\u002FFILENAME.md`。如果您创建了新元素，请使用子代理在@IMPLEMENTATION_PLAN.md中记录其实现计划。\n```\n\n#### 与核心理念的兼容性\n\n| 原则              | 是否保持？ | 如何保持                                                                      |\n| ---------------------- | ----------- | ------------------------------------------------------------------------ |\n| 单体操作   | ✅ 是      | Ralph 仍然在分支内以单进程运行                     |\n| 每次循环一项任务      | ✅ 是      | 未发生变化                                                                |\n| 新鲜上下文          | ✅ 是      | 未发生变化                                                                |\n| 确定性          | ✅ 是      | 规划阶段的范围限定是确定性的，而运行时则是概率性的            |\n| 简单                 | ✅ 是      | 可选增强功能，主流程仍可正常工作                          |\n| 计划驱动            | ✅ 是      | 每个分支只有一个IMPLEMENTATION_PLAN.md                                    |\n| 单一事实来源 | ✅ 是      | 每个分支只有一个计划——限定范围的计划会取代分支上的完整计划           |\n| 计划可丢弃     | ✅ 是      | 随时可重新生成限定范围的计划：`.\u002Floop.sh plan-work \"工作描述\"` |\n| Markdown优于JSON     | ✅ 是      | 仍然是Markdown格式的计划                                                     |\n| 让Ralph自己做决定        | ✅ 是      | Ralph 从已经限定范围的计划中挑选出“最重要”的内容——无需额外筛选        |\n\n---\n\n### JTBD → 故事地图 → SLC发布\n\n#### 关注点 → 
活动\n\nGeoff提出的[建议工作流](https:\u002F\u002Fghuntley.com\u002Fcontent\u002Fimages\u002Fsize\u002Fw2400\u002F2025\u002F07\u002FThe-ralph-Process.png)已经将规划与“待完成的工作”相结合——将JTBD分解为关注点，这些关注点又转化为规格说明。我非常喜欢这一点，我认为还有机会进一步利用这种方法所带来的产品优势，将“关注点”重新定义为“活动”。\n\n活动是旅程中的动词（如“上传照片”、“提取颜色”），而不是能力（如“颜色提取系统”）。它们天然地由用户意图来界定。\n> 关注点：“颜色提取”、“布局引擎”→ 能力导向\n> 活动：“上传照片”、“查看提取的颜色”、“排列布局”→ 旅程导向\n\n#### 活动 → 用户旅程\n\n活动及其构成步骤自然地串联成用户流程，形成一种“旅程结构”，使其中的空白和依赖关系变得清晰可见。借助_[用户故事地图](https:\u002F\u002Fwww.nngroup.com\u002Farticles\u002Fuser-story-mapping\u002F)_，可以将活动按列排列（即旅程的主干），并将能力深度按行排列——从而全面展示可能构建的内容：\n\n```\n上传    →   提取    →   排列     →   分享\n\n基础         自动           手动          导出\n批量         调色板        模板        合作\n批处理         AI主题      自动布局      嵌入\n```\n\n#### 用户旅程 → 发布切片\n\n地图上的水平切片会成为候选发布版本。并非每个活动都需要在每个版本中都具备新功能——有些单元格可以留空，只要该切片仍然保持连贯性即可：\n\n```\n                  上传    →   提取    →   排列     →   分享\n\n发布1：        基本         自动                           导出\n                  ───────────────────────────────────────────────────\n发布2：                      调色板        手动\n                  ───────────────────────────────────────────────────\n发布3：        批量         AI主题      模板       嵌入\n```\n\n#### 发布切片 → SLC 发布\n\n故事地图为你提供了切分的_结构_。Jason Cohen 的 _[简单、可爱、完整 (SLC)](https:\u002F\u002Flongform.asmartbear.com\u002Fslc\u002F)_ 则给出了判断一个切片是否优秀的_标准_：\n\n- _简单_ — 范围狭窄，能够快速交付。不需要涵盖所有活动或所有深度。\n- _完整_ — 在该范围内完整地完成一项任务。而不是一个不完整的预览版。\n- _可爱_ — 用户真正愿意使用它。在其限定范围内令人愉悦。\n\n_为什么选择 SLC 而不是 MVP？_ MVP 以牺牲用户体验为代价来优化学习效果——“最小”往往意味着功能残缺或令人沮丧。而 SLC 则颠覆了这一点：在市场中学习的同时，持续交付真正的价值。如果成功，你将拥有更多的选择；即使失败，你也已经善待了用户。\n\n每个切片都可以成为一个具有明确价值和身份的发布版本：\n\n```\n                  上传    →   提取    →   排列     →   分享\n\n调色板选择器：   基本         自动                           导出\n                  ───────────────────────────────────────────────────\n情绪板：                     调色板        手动\n                  ───────────────────────────────────────────────────\n设计工作室：    批量         AI主题      模板       嵌入\n```\n\n- _调色板选择器_ — 上传、提取、导出。从第一天起就能立即产生价值。\n- _情绪板_ — 
增加了排列功能。创意表达融入了整个流程。\n- _设计工作室_ — 提供专业级功能：批量处理、AI 主题、可嵌入的输出。\n\n---\n\n#### 使用 Ralph 实现\n\n上述概念——活动、故事地图、SLC 发布——是我们的_思维工具_。那么如何将其转化为 Ralph 的工作流程呢？\n\n_默认的 Ralph 方法：_\n\n1. _定义需求_：人类与 LLM 共同确定用户想要解决的 JTBD 主题 → `specs\u002F*.md`\n2. _创建任务计划_：LLM 分析所有规格文档及现有代码 → `IMPLEMENTATION_PLAN.md`\n3. _开发_：Ralph 按照完整范围进行开发。\n\n这种方法非常适合以功能为中心的工作（如新特性、重构或基础设施建设）。但它并不能自然地产生有价值的（SLC）产品发布——它只会按照规格文档的内容进行开发。\n\n_活动 → SLC 发布方法：_\n\n要实现 SLC 发布，我们需要将活动置于目标用户的上下文中。用户群体决定了谁有 JTBD，进而明确了哪些活动是关键，以及什么才是“可爱”的含义。\n\n```\n用户群体（谁）\n    └── 拥有 JTBD（期望的结果）\n            └── 由活动（实现结果的手段）来满足\n```\n\n##### 工作流程\n\n_I. 需求阶段（2 步）：_\n\n仍由 LLM 与人类对话完成，类似于默认的 Ralph 方法。\n\n1. _定义用户群体及其 JTBD_ — 我们为谁构建？他们希望达成什么_成果_？\n\n   - 人类与 LLM 讨论并确定目标用户群体及其 JTBD（期望达成的成果）。\n   - 可能包含多个相关联的用户群体（例如，“设计师”负责创作，“客户”负责评审）。\n   - 生成 `AUDIENCE_JTBD.md` 文件。\n\n2. _定义活动_ — 用户为了实现其 JTBD 需要做些什么？\n\n   - 根据 `AUDIENCE_JTBD.md` 文件制定。\n   - 对于每个 JTBD，识别完成它所需的活动。\n   - 对于每个活动，确定：\n     - 功能深度（基础 → 增强）——复杂程度的不同层次。\n     - 每个层次下期望达成的成果——成功的具体表现是什么？\n   - 生成 `specs\u002F*.md` 文件（每个活动对应一个文件）。\n\n   活动中的具体步骤是隐含的，LLM 可以在规划过程中推断出来。\n\n_II. 规划阶段：_\n\n在 Ralph 循环中使用_更新后的_规划提示执行。\n\n- LLM 分析：\n  - `AUDIENCE_JTBD.md`（用户是谁、期望达成的成果）\n  - `specs\u002F*`（可以构建的内容）\n  - 当前代码状态（已有的功能）\n- LLM 确定下一个 SLC 切片（哪些活动、处于何种功能深度），并为该切片规划任务。\n- LLM 生成 `IMPLEMENTATION_PLAN.md` 文件。\n- _人类验证_计划后再进行开发：\n  - 该范围是否代表一个连贯的 SLC 发布？\n  - 是否包含了正确的活动，并且处于恰当的功能深度？\n  - 如果有问题，则重新运行规划循环以生成新的计划，必要时可更新输入或规划提示。\n  - 如果正确，则进入开发阶段。\n\n_III. 开发阶段：_\n\n在 Ralph 循环中使用标准的开发提示执行。\n\n##### 更新后的规划提示\n\n这是 `PROMPT_plan.md` 的变体，增加了用户背景信息和面向 SLC 的切片建议。\n\n_注意事项：_\n\n- 与默认模板不同，此模板没有 `[项目特定目标]` 占位符——因为目标是隐含的：为用户群体推荐最有价值的下一个发布版本。\n- 目前的子代理名称假设使用的是 Claude 模型。\n\n```\n0a. 研究 @AUDIENCE_JTBD.md，了解我们要为谁构建以及他们的待完成工作。\n0b. 使用最多 250 个 Sonnet 子代理并行研究 `specs\u002F*`，以了解与 JTBD 相关的活动。\n0c. 研究 @IMPLEMENTATION_PLAN.md（如有），以了解迄今为止的计划。\n0d. 使用最多 250 个 Sonnet 子代理并行研究 `src\u002Flib\u002F*`，以了解共享的工具和组件。\n0e. 作为参考，应用程序源代码位于 `src\u002F*`。\n\n1. 
将 `specs\u002F*` 中的活动按顺序排列成针对 @AUDIENCE_JTBD.md 中用户群体的用户旅程图。考虑各活动之间的流转关系及依赖性。\n\n2. 确定下一个 SLC 发布版本。使用最多 500 个 Sonnet 子代理比较 `src\u002F*` 和 `specs\u002F*`。再用一个 Opus 子代理分析结果。深入思考。结合已实现的功能，推荐哪些活动（处于何种功能深度）构成最有价值的下一个发布版本。优先选择较薄的水平切片——即能在提供实际价值的前提下，尽可能缩小范围。一个好的切片应具备以下特点：简单（范围小、可实现）、可爱（用户愿意使用）和完整（完整地完成有意义的任务，而非不完整的预览）。\n\n3. 使用一个 Opus 子代理（深入思考）分析并综合上述发现，对任务进行优先级排序，并创建或更新 @IMPLEMENTATION_PLAN.md，以要点列表的形式列出针对推荐的 SLC 发布版本尚未实施的任务。计划开头应概述推荐的 SLC 发布版本（包含哪些内容以及原因），然后按优先级列出该范围内的任务。需考虑待办事项、占位符、最小化实现、未完成的测试等，但仅限于本次发布的范围。对于超出范围的发现，则记录为后续工作。\n\n重要提示：仅进行规划，不要执行任何操作。不要假设某些功能缺失；应先通过代码搜索确认。将 `src\u002Flib` 视为项目的标准库，用于共享工具和组件。优先采用其中的整合、规范化的实现方式，而非临时复制的做法。\n\n终极目标：我们希望在 @AUDIENCE_JTBD.md 中为受众实现最具价值的下一次发布。请考虑缺失的元素，并据此进行规划。如果某个元素缺失，请先搜索确认其不存在，然后再根据需要在 specs\u002FFILENAME.md 中编写该规范。如果你创建了一个新元素，则需使用子代理将其实施计划记录在 @IMPLEMENTATION_PLAN.md 中。\n```\n\n##### 备注\n\n_为什么将 `AUDIENCE_JTBD.md` 作为单独的文档：_\n\n- 单一事实来源——防止各规范之间出现偏差\n- 支持整体性思考：“这个受众最需要什么？”\n- 将 JTBD 与受众信息一同记录（“为什么”的部分与“谁”的部分紧密相连）\n- 在规范制定和 SLC 规划阶段都会被引用两次\n- 使活动规范聚焦于“做什么”，而非重复描述“谁”\n\n_基数关系：_\n\n- 一个受众对应多个 JTBD（例如，“设计师”有“捕捉空间”、“探索概念”、“向客户展示”等需求）\n- 一个 JTBD 可分解为多个活动（例如，“捕捉空间”包括上传、测量、房间检测等操作）\n- 一个活动可以服务于多个 JTBD（例如，“上传照片”既可用于“捕捉”需求，也可用于“收集灵感”）\n\n---\n\n\n\n### 规范审核\n\n一种专门的循环模式，用于生成和维护规范文件，并强制执行质量规则。确保规范专注于行为结果（而非实现细节），主题范围恰当（“一句话不含‘和’”），以及文件命名规范一致。\n\n_何时使用：_ 在编写或更新规范后，运行规范模式以确保所有规范文件的一致性和整洁性。\n\n_具体功能：_\n\n- 遍历现有的 `specs\u002F*` 文件\n- 强制执行质量规则：仅关注行为结果，不得包含代码块或实现细节\n- 使用“一句话不含‘和’”测试验证主题范围是否恰当\n- 根据 `specs\u002FREADME.md` 的要求，在必要时创建新的规范文件\n- 统一文件命名格式：\u003C整数>-文件名.md（如 `01-range-optimization.md`）\n\n_使用方法：_ 在你的循环脚本中添加 `specs` 参数，选择 `PROMPT_specs.md`：\n\n```bash\n.\u002Floop.sh specs        # 规范模式，无限次迭代\n.\u002Floop.sh specs 3      # 规范模式，最多迭代3次\n```\n\n_如何将规范模式加入 `loop.sh`：_ 在参数解析部分插入一个新的 `elif` 分支：\n\n```bash\n# 解析参数\nif [ \"$1\" = \"plan\" ]; then\n    # 计划模式\n    MODE=\"plan\"\n    PROMPT_FILE=\"PROMPT_plan.md\"\n    MAX_ITERATIONS=${2:-0}\nelif [ \"$1\" = \"specs\" ]; then        # ← 添加这一段\n    # 规范模式\n    MODE=\"specs\"\n    
PROMPT_FILE=\"PROMPT_specs.md\"\n    MAX_ITERATIONS=${2:-0}\nelif [[ \"$1\" =~ ^[0-9]+$ ]]; then\n    # 构建模式，设置最大迭代次数\n    ...\n```\n\n_如何将规范模式加入 `loop_streamed.sh`：_ 同样在相同位置添加 `elif` 分支。其余部分（流式处理、`parse_stream.js` 管道）保持不变。\n\n_相关文件：_ [`PROMPT_specs.md`](files\u002FPROMPT_specs.md)\n\n#### `PROMPT_specs.md` 模板\n\n_备注：_\n\n- 规范定义的是要验证的“内容”（即结果），而不是“如何实现”（即方法）。实现细节由 Ralph 在构建阶段决定。\n\n```\n0a. 使用最多250个并行的 Sonnet 子代理研究 `specs\u002F*` 文件，以了解应用程序的规范。\n\n1. 确定待完成的工作（JTBD）→ 将每个 JTBD 分解为相关的主题 → 利用子代理从 URL 加载信息到上下文中 → LLM 理解 JTBD 的相关主题：子代理为每个主题撰写 specs\u002FFILENAME.md。\n\n## 规则（不适用于 `specs\u002FREADME.md`）\n\n999. 切勿添加代码块或建议变量应如何命名。这些将由 Ralph 决定。\n9999.\n- 验收标准（在规范中）= 行为结果，可观察到的结果\n例如：\n✓ “从任何上传的图像中提取5-10种主色调”\n✓ “在100毫秒内处理小于5MB的图像”\n✓ “处理特殊情况：灰度图、单色图、透明背景”\n- 测试要求（在计划中）= 由验收标准推导出的验证点\n例如：\n✓ “所需测试：提取5-10种颜色，性能\u003C100ms”\n- 实现方案（由 Ralph 决定）= 技术决策\n避免的例子：\n✗ “使用K-means聚类算法，迭代3次”\n\n99999. 主题范围测试：“一句话不含‘和’”\n你能用一句话描述相关主题，而不将不相关的功能混在一起吗？\n推荐示例：\n✓ “色彩提取系统分析图像以识别主色调”\n避免示例：\n✗ “用户系统负责身份验证、个人资料管理和计费” → 这涉及3个主题\n如果需要用“和”来描述功能，那很可能就是多个主题了\n\n99999999. 关键：明确要验证的内容（结果），而不是实现方式（方法）。这遵循“让 Ralph 做 Ralph”的原则——Ralph 在拥有清晰成功信号的情况下决定实现细节。\n99999999999. 对所有现有文件应用上述规则，使用最多100个并行的 Sonnet 子代理在 @specs 中执行（除 README.md 外），并根据 `specs\u002FREADME.md` 的要求创建新文件。文件名应遵循以下命名规范：\u003C整数>-文件名.md，例如 01-range-optimization.md、02-adaptive-behavior.md 等。\n```\n\n— 由 [@terry-xyz](https:\u002F\u002Fgithub.com\u002Fterry-xyz) · [@blackrosesxyz](https:\u002F\u002Fx.com\u002Fblackrosesxyz) 贡献\n\n---\n\n### 逆向工程旧项目以生成规范\n\n在全新项目中直接使用规范很容易，但在已有项目的环境中工作时，就需要采取不同的方法。因此，你需要将现有代码的实现逆向转化为规范，才能开始运用 Ralph 的工作流程。\n\n_何时使用：_ 你接手或加入了一个没有规范的代码库。你想在原本未使用 Ralph 构建的项目上使用 Ralph。你需要为现有的旧项目添加新功能。\n\n_调用方法：_ “使用 `PROMPT_reverse_engineer_specs.md` 逆向工程 [主题\u002F领域] 的规范”\n\n_流程：_\n\n1. 使用 `PROMPT_reverse_engineer_specs.md` 指示代理指向现有代码库 →\n2. 代理调查代码（具备实现知识）→\n3. 代理编写描述实际行为的规范（不涉及实现细节）→\n4. 规范存入 `specs\u002F` 目录 →\n5. 根据需要重复此过程，直至所有规范完成 →\n6. 
按照常规的 Ralph 流程（计划→构建）基于已记录的基准继续推进\n\n你可以采用代理编排模式，其中子代理负责逆向工程，而编排者熟悉“关注主题”理念：\n\n- **覆盖整个领域：** 告诉编排者识别领域内的所有主题，然后启动子代理为每个主题创建完整的规范。\n- **按任务范围覆盖：** 提供你即将执行的具体任务，让代理分析代码库，找出相关主题，然后为每个主题创建或更新相应的规范。\n\n无需修改现有的提示文件——这只是额外的补充。生成的规范与 Ralph 在计划和构建阶段已经使用的格式相同。\n\n#### 考虑事项\n\n- **单体仓库结构：** 可能需要将逆向工程的范围限定在特定的包或服务上，而不是整个仓库。请将代理指向相关的子目录。\n- **整个领域的规格生成：** 为整个领域生成规格是一项较大的投入——如果你们团队正在将 Ralph 作为标准工作流程采用，则值得这样做。\n- **快速开发或小规模变更：** 小规模的代码变更可能会与生成的规格产生偏差。需提前决定团队是重新运行逆向工程以保持规格的最新状态，还是接受暂时的偏差。\n- **重构后规格过时：** 一旦 Ralph 基于逆向工程生成的规格构建新功能，重大重构可能会悄然使这些规格失效。对于改动频繁的区域，应定期重新运行逆向工程。\n- **主题粒度：** 该提示严格要求“每份规格只描述一个主题”。在大型代码库中，如何划定主题边界是一个需要权衡的问题——划分得太宽会导致规格过于臃肿，而划分得太细则会陷入文件过多的困境。可以先从粗略的划分开始，再根据需要进行拆分。\n- **Bug 成为规格：** 该提示有意将有缺陷的行为记录为定义行为。逆向工程生成的规格描述的是“现状”，而非“应有的状态”。对于期望的行为变更，应单独编写新的规格。\n- **大型代码库的 Token 成本：** 使用子代理进行全面的代码追踪可能会消耗大量 Token。建议首先将范围限定在你实际计划修改的区域。\n\n#### 与核心理念的兼容性\n\n| 原则             | 是否保持？ | 如何实现                                                         |\n| --------------------- | ----------- | ----------------------------------------------------------- |\n| 确定性设置   | ✅ 是      | 规格是书面化的产物（已知状态），而非临时性的上下文；它包含了代码中的所有缺陷。 |\n| 上下文效率    | ⚠️ 部分  | 必须在整个团队的文化中全面推行 |\n| 记录“为什么”       | ⚠️ 部分  | 并非所有已实现的代码都包含背后的“为什么”，只有明确表达“为什么”的注释才会被记录下来。 |\n| 让 Ralph 来做 Ralph       | ✅ 是      | 关注的主题仍然由 Ralph 选择。 |\n| 计划是可丢弃的    | ✅ 是      | 规格提供了稳定的基线；计划会基于已记录的事实重新生成 |\n| 简洁至上       | ✅ 是      | 提供了对整个规格的概览视图。 |\n\n#### `PROMPT_reverse_engineer_specs.md` 模板\n\n_说明：_\n\n- 记录的是代码的实际行为（包括 Bug），而非预期行为\n- 分两阶段进行：第一阶段在完全访问代码的情况下进行调研，第二阶段在不包含任何实现细节的情况下编写规格\n- 每份规格只描述一个主题，通过“一句话中不含‘和’”的测试来强制执行\n- 当前的子代理名称假定使用 Claude\n\n_文件：_ [`PROMPT_reverse_engineer_specs.md`](files\u002FPROMPT_reverse_engineer_specs.md)\n\n```\n0a. 使用最多 250 个并行的 Sonnet 子代理研究 `specs\u002F*`，以了解现有的规格文档。\n0b. 研究 `src\u002F*` 以理解代码库。使用最多 500 个并行的 Sonnet 子代理进行读取和搜索。将 `src\u002Flib` 视为项目的标准库，用于共享工具和组件。\n\n1. 对于每个分配（或发现）的主题，逆向工程源代码并在 `specs\u002F` 中生成一份规格文档。对于复杂的追踪任务，使用 Opus 子代理，并尽量做到精简。在编写规格之前，务必先搜索确认是否已经存在针对该主题的规格。\n2. 
Each spec describes exactly one topic. It must pass the "one sentence without 'and'" test: if an "and" joins unrelated capabilities, split it into separate specs.
3. **Two-phase process:** Phase 1 (research) — trace every entry point, branch, and code path until it terminates. Map data flow, side effects, state changes, error handling, concurrency control, configuration-driven paths, and implicit behavior. Phase 2 (output) — contains no implementation details whatsoever. No function, class, or variable names, file paths, libraries, or framework references may appear. A different team on a different stack should be able to reproduce the behavior from the spec alone.
4. **Document reality, not intent.** Bugs are part of the behavior. Never add behavior the code does not actually implement, and never propose improvements. If a source comment contradicts the code's behavior, document what the code actually does and ignore the comment.
5. **Scope boundaries:** When a trace crosses outside the current topic, stop immediately. Record only the inputs and outputs that cross the boundary. The test: "Could this happen without changing my topic's outcome?" If yes, it is out of scope.
6. **Shared behavior:** Describe it inline and in full in every spec (self-contained). Track shared topics so they can be followed across specs. Shared behavior also gets its own standalone spec document.
7. **Spec format:** Markdown, stored under `specs/`. Each spec includes: topic statement, scope (what is included and where the boundaries are), data contracts, behaviors (in execution order), and state transitions. Flag notable or surprising behavior, unreachable paths, and cross-topic shared behavior inline. Extract rationale from source comments (with implementation references removed). File naming: `specs/NN-kebab-case.md` (e.g. `01-session-management.md`).
8. Once the specs are complete and verified, run `git add -A`, then commit with a message describing which specs were added or updated. After committing, run `git push`.
99999. **Pre-completion checklist:** every entry point documented; every branch traced to termination; every data contract specified; all side effects listed in execution order; every error path (caught, propagated, or swallowed) recorded; all configuration-driven paths traced; concurrency outcomes documented; unreachable paths flagged; notable or surprising behavior flagged; zero implementation details in the output. If any item is missing, trace again.
999999. The code is the only source of truth. If a spec disagrees with the code, update the spec using an Opus 4.6 subagent.
9999999. Maintain a single source of truth; avoid duplicate specs. Prefer updating an existing spec over creating a new one.
99999999. Whenever you learn something new about the project, use a subagent to update @AGENTS.md — keep it terse and operational only; no status updates or progress reports.
999999999. For source comments that explain why a behavior must be preserved (regulatory, compatibility, deliberate design, and so on), extract the rationale and strip the implementation references. Stale comments do not go into specs.
9999999999. Document all configuration-driven paths, not only the one currently active.
99999999999. If inconsistencies are found in `specs/*`, use an Opus 4.6 subagent with "ultrathink" to update the specs.
```

— Contributed by Jake Cukjati · [@Byte0fCode](https://x.com/Byte0fCode) · [@jackstine](https://github.com/jackstine)

---

# Ralph Playbook Quick Start

Ralph Playbook is an AI coding workflow refined from Geoffrey Huntley's "Ralph" pattern. It cycles between a planning mode and a building mode, using subagents to manage context, closing the loop from requirements definition to shipped code.

## Prerequisites

Before starting, make sure your environment meets the following requirements:

*   **Operating system**: macOS or Linux (Windows users should use WSL2).
*   **Core dependencies**:
    *   [Claude Code](https://claude.ai/code) (or a compatible CLI tool): installed and configured with an API key.
    *   `git`: for version control and rollback.
    *   `bash`: for running the loop script.
*   **Sandbox (strongly recommended)**:
    *   Ralph runs with `--dangerously-skip-permissions` to bypass permission prompts, so it **must** execute in an isolated environment to contain security risk.
    *   **Local option**: Docker Desktop.
    *   **Cloud options**: E2B, Fly Sprites, or another ephemeral-sandbox service.
*   **Project layout**: work from the root of an empty or existing Git repository.

## Installation

Ralph Playbook is a set of conventions rather than an installable package. Initialize it as follows:

1.  **Create the project structure**
    In the repository root, create the required files and directories:

    ```bash
    mkdir -p specs src
    touch PROMPT.md AGENTS.md IMPLEMENTATION_PLAN.md
    git init
    ```

2.  **Configure the core files**
    Edit `PROMPT.md` for the current phase (planning or building).
    *   **PLANNING mode**: the prompt focuses on gap analysis (specs vs. code) and produces a task list.
    *   **BUILDING mode**: the prompt focuses on picking a task from the plan, implementing it, running tests, and committing.

    Keep `AGENTS.md` empty at first, or limited to basic build/test commands, and let it grow as iterations proceed.

3.  **Create the loop script (`loop.sh`)**
    Create `loop.sh` in the repository root; it is the engine that drives Ralph. The minimal version:

    ```bash
    #!/bin/bash
    while :; do
      cat PROMPT.md | claude --dangerously-skip-permissions
      # Add logic here to detect completion and break, or to reload context.
      # In production, add error handling and context cleanup.
    done
    ```

    Make it executable:
    ```bash
    chmod +x loop.sh
    ```

4.  **Set up the sandbox (optional but recommended)**
    With Docker, build a base image containing the toolchains you need (Node.js, Python, Go, etc.) and run the script inside it.

## Basic usage

Ralph's workflow has two main stages: **defining requirements** and **running the loop**.

### Stage 1: Define requirements (Phase 1)

In an interactive LLM session (not the automated loop), break high-level jobs-to-be-done (JTBD) into concrete topics of concern and generate a spec document for each.

1.  **Split the work**: decompose one large JTBD into independent topics (e.g. "image collection", "color extraction").
    *   *Rule of thumb*: each topic should be describable in one sentence without "and".
2.  **Generate specs**: have the AI write a Markdown spec per topic into `specs/`.
    ```markdown
    # specs/color-extraction.md
    ## Goal
    Analyze uploaded images and identify dominant colors...
    ```

### Stage 2: Run the Ralph loop (Phases 2 & 3)

Switch the contents of `PROMPT.md` depending on whether a valid implementation plan currently exists.

#### Mode A: PLANNING
*When to use: project start, or the plan has gone stale.*

1.  Point `PROMPT.md` at the planning instructions (gap analysis only; no code).
2.  Run the loop:
    ```bash
    ./loop.sh
    ```
3.  **Result**: the AI reads `specs/*` and the existing code, then writes or updates `IMPLEMENTATION_PLAN.md` with a prioritized task list. **No source code is modified at this stage.**

#### Mode B: BUILDING
*When to use: an up-to-date `IMPLEMENTATION_PLAN.md` exists.*

1.  Point `PROMPT.md` at the building instructions (execute tasks, run tests, commit code).
2.  Make sure `AGENTS.md` contains the project-specific build/test commands (e.g. `npm test`, `go build`) — these provide the backpressure mechanism.
3.  Run the loop:
    ```bash
    ./loop.sh
    ```
4.  **Inside each iteration**:
    *   **Orient**: subagents read `specs` and `IMPLEMENTATION_PLAN.md`.
    *   **Select**: pick the next highest-priority task.
    *   **Investigate**: inspect the relevant source (don't assume a feature is unimplemented).
    *   **Implement**: write the code.
    *   **Validate**: run tests/build (self-correct on failure).
    *   **Update**: mark the task done; update the plan and `AGENTS.md`.
    *   **Commit**: git commit.
    *   **Reset**: clear context and begin the next iteration.

### Operational tips

*   **Watch and steer**: observe output closely on early runs. If Ralph keeps repeating a mistake, don't patch the code by hand — update `AGENTS.md` or add a helper utility to guide it.
*   **Plan reset**: if Ralph drifts off course or the plan gets tangled, delete `IMPLEMENTATION_PLAN.md` and rerun the **PLANNING** loop. It costs little and corrects direction effectively.
*   **Emergency stop**: press `Ctrl+C` at any time to kill the loop; `git reset --hard` rolls back to the last successful commit.

---

## Scenario: before and after

A full-stack developer has two weeks to build a SaaS prototype with user authentication and a realtime data dashboard, starting from zero and without detailed technical specs.

### Without ralph-playbook
- **Vague requirements cause rework**: throwing fuzzy ideas straight at an AI yields code that drifts from the core business goal (JTBD), requiring repeated manual correction.
- **Planning and execution blur together**: constantly switching prompts between "write the plan" and "write the code" loses context, and the AI forgets earlier architecture decisions.
- **No iteration loop**: commits lack automated test feedback (backpressure), so errors accumulate and surface late, when they are most expensive to fix.
- **Docs drift from code**: the implementation plan and the actual code diverge; as features pile up, the developer can no longer track what is done or what needs adjusting.

### With ralph-playbook
- **Requirements defined precisely**: following the Stage 1 process, ideas are first decomposed into concrete JTBD and standalone `specs/` documents, so the AI fully understands the problem before any code is written.
- **Clean mode separation**: the _PLANNING_ mode does gap analysis and emits a prioritized `IMPLEMENTATION_PLAN.md`; the _BUILDING_ mode then executes it strictly; neither interferes with the other.
- **Automated closed-loop iteration**: tests run inside the build loop as feedback pressure; each small task ends in a commit and a plan update, so integration is continuous and errors surface immediately.
- **A living roadmap**: the plan updates itself as code lands, so the developer always sees an accurate task list and stays in control of overall progress.

ralph-playbook turns fuzzy ideas into a standardized plan/build two-phase loop, converting autonomous AI coding from random trial and error into controlled engineering delivery.

---

## Project facts

- **Author**: Clayton Farr ([github.com/ClaytonFarr](https://github.com/ClaytonFarr))
- **License**: MIT
- **Languages**: HTML 91.8%, JavaScript 5%, Shell 3.2%
- **Runtime notes**: this is not a local AI model but a playbook of workflow scripts that orchestrate Claude agents. The core requirement is an installed, configured Claude CLI with valid API access. Because automation requires `--dangerously-skip-permissions`, it is strongly recommended to run inside a Docker, Fly Sprites, or E2B sandbox to avoid leaking local credentials. No local GPU or specific Python libraries are needed; based on the shell scripts and Claude CLI, Linux/macOS support can be presumed. Hardware requirements are otherwise unspecified.
- **Dependencies**: Claude CLI (Anthropic), a bash/shell environment, git

## FAQ

**Can I run this with a personal Claude Max subscription instead of an API key?**
Yes. Just use the Claude Code CLI, since Anthropic recently closed off using subscription quota through other tools. Make sure you are logged into the CLI and that `loop.sh` invokes the `claude` command (the current code examples already do). ([issue #3](https://github.com/ClaytonFarr/ralph-playbook/issues/3))

**Running `./loop.sh plan` creates or updates a CLAUDE.md file — is that expected?**
Usually not from the script itself. The cause is typically a global config in your home directory (e.g. `~/.claude`) containing instructions that tell the claude CLI to create or update that file for every project. Check your user-level CLAUDE.md and related config files; `loop.sh` and `PROMPT_plan.md` contain no logic that does this. ([issue #6](https://github.com/ClaytonFarr/ralph-playbook/issues/6))

**Output is garbled or missing under the Claude Code Max plan — how do I fix it?**
Using `--output-format stream-json` directly emits raw JSON that is hard to read, while `--output-format text` shows nothing until the task finishes. The fix is a Node.js stream parser: keep the stream-json flag, add `--include-partial-messages`, and pipe through the parser to extract readable content in real time. Alternatively, if latency is acceptable, replace the json flag with `--output-format text` — but expect roughly 10 minutes of silence until the task completes. ([issue #4](https://github.com/ClaytonFarr/ralph-playbook/issues/4))

**What is "infinite iteration" mode actually for, and when should I use it?**
Some implementations recommend a "stop condition" triggered when Claude's answer matches expectations, but that is not fully reliable (the LLM may not always comply). The recommended approach follows Geoff's examples: hand-tune early runs to establish proper backpressure (tests, shared utilities, `AGENTS.md` updates), then rely on those mechanisms plus an instruction to commit the work when conditions are met, which acts as the implicit stopping point. `max-iterations` only caps how many tasks are attempted, not the loop inside a single task. There is currently no more reliable programmatic way to cap per-task work (e.g. a maximum token budget), though it is worth exploring. ([issue #2](https://github.com/ClaytonFarr/ralph-playbook/issues/2))

**How was a license added to the project?**
PR #15 added an MIT license along with a NOTICE file covering third-party screenshots/assets. See the LICENSE and NOTICE files in the repository root for the terms. ([issue #7](https://github.com/ClaytonFarr/ralph-playbook/issues/7))

**What is the Meta Prompting Pipeline and how does it work here?**
The core idea: 1) capture the context stack of LLM output; 2) find the parts related to prompts or skills; 3) judge from the actions performed whether those skills or prompts were actually used; 4) derive the prompt's outcome (was the goal met); 5) apply prompt-optimization techniques to decide how to adjust the approach and modify the prompt for better results. The feature aims to build a self-optimizing prompt loop. ([issue #13](https://github.com/ClaytonFarr/ralph-playbook/issues/13))
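The minimal `loop.sh` in the installation steps leaves completion detection and error handling as comments. The sketch below shows one way to harden it; the `ralph_loop` function, the `MAX_ITERS`/`AGENT_CMD` overrides, and the `RALPH_DONE` sentinel file are illustrative conventions of our own, not part of the playbook:

```shell
#!/bin/bash
# Hardened Ralph loop (sketch). Assumptions, not from the playbook:
# - AGENT_CMD can override the agent invocation (eases testing; defaults
#   to the claude CLI as in the minimal loop.sh).
# - The prompt instructs the agent to create a RALPH_DONE file when the
#   implementation plan is exhausted.
set -u

ralph_loop() {
  local max_iters="${MAX_ITERS:-25}"   # hard cap so a drifting agent cannot run forever
  local agent_cmd="${AGENT_CMD:-claude --dangerously-skip-permissions}"
  local i
  rm -f RALPH_DONE
  for ((i = 1; i <= max_iters; i++)); do
    echo "--- iteration $i/$max_iters ---"
    if ! $agent_cmd < PROMPT.md; then
      echo "agent exited non-zero; retrying" >&2
      continue
    fi
    if [[ -f RALPH_DONE ]]; then       # sentinel written by the agent
      echo "done after $i iteration(s)"
      return 0
    fi
  done
  echo "hit iteration cap without completing" >&2
  return 1
}

# Demo with a stub agent that "finishes the plan" on its third call,
# so the sketch runs without a real claude CLI.
stub_agent() {
  local n=0
  [[ -f count ]] && n=$(<count)
  n=$((n + 1))
  echo "$n" > count
  if (( n >= 3 )); then touch RALPH_DONE; fi
  return 0
}

workdir="$(mktemp -d)"
cd "$workdir"
echo "placeholder prompt" > PROMPT.md
MAX_ITERS=5 AGENT_CMD=stub_agent ralph_loop
```

The cap plus sentinel keeps the "infinite" loop bounded per run while preserving the playbook's implicit-stop philosophy: the agent signals completion through a side effect (here a file; in the playbook, a commit) rather than through unreliable self-reported text.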