[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-muratcankoylan--Agent-Skills-for-Context-Engineering":3,"tool-muratcankoylan--Agent-Skills-for-Context-Engineering":64},[4,17,27,35,43,56],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":16},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,3,"2026-04-05T11:01:52",[13,14,15],"开发框架","图像","Agent","ready",{"id":18,"name":19,"github_repo":20,"description_zh":21,"stars":22,"difficulty_score":23,"last_commit_at":24,"category_tags":25,"status":16},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",138956,2,"2026-04-05T11:33:21",[13,15,26],"语言模型",{"id":28,"name":29,"github_repo":30,"description_zh":31,"stars":32,"difficulty_score":23,"last_commit_at":33,"category_tags":34,"status":16},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 
图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",107662,"2026-04-03T11:11:01",[13,14,15],{"id":36,"name":37,"github_repo":38,"description_zh":39,"stars":40,"difficulty_score":23,"last_commit_at":41,"category_tags":42,"status":16},3704,"NextChat","ChatGPTNextWeb\u002FNextChat","NextChat 是一款轻量且极速的 AI 助手，旨在为用户提供流畅、跨平台的大模型交互体验。它完美解决了用户在多设备间切换时难以保持对话连续性，以及面对众多 AI 模型不知如何统一管理的痛点。无论是日常办公、学习辅助还是创意激发，NextChat 都能让用户随时随地通过网页、iOS、Android、Windows、MacOS 或 Linux 端无缝接入智能服务。\n\n这款工具非常适合普通用户、学生、职场人士以及需要私有化部署的企业团队使用。对于开发者而言，它也提供了便捷的自托管方案，支持一键部署到 Vercel 或 Zeabur 等平台。\n\nNextChat 的核心亮点在于其广泛的模型兼容性，原生支持 Claude、DeepSeek、GPT-4 及 Gemini Pro 等主流大模型，让用户在一个界面即可自由切换不同 AI 能力。此外，它还率先支持 MCP（Model Context Protocol）协议，增强了上下文处理能力。针对企业用户，NextChat 提供专业版解决方案，具备品牌定制、细粒度权限控制、内部知识库整合及安全审计等功能，满足公司对数据隐私和个性化管理的高标准要求。",87618,"2026-04-05T07:20:52",[13,26],{"id":44,"name":45,"github_repo":46,"description_zh":47,"stars":48,"difficulty_score":23,"last_commit_at":49,"category_tags":50,"status":16},2268,"ML-For-Beginners","microsoft\u002FML-For-Beginners","ML-For-Beginners 是由微软推出的一套系统化机器学习入门课程，旨在帮助零基础用户轻松掌握经典机器学习知识。这套课程将学习路径规划为 12 周，包含 26 节精炼课程和 52 道配套测验，内容涵盖从基础概念到实际应用的完整流程，有效解决了初学者面对庞大知识体系时无从下手、缺乏结构化指导的痛点。\n\n无论是希望转型的开发者、需要补充算法背景的研究人员，还是对人工智能充满好奇的普通爱好者，都能从中受益。课程不仅提供了清晰的理论讲解，还强调动手实践，让用户在循序渐进中建立扎实的技能基础。其独特的亮点在于强大的多语言支持，通过自动化机制提供了包括简体中文在内的 50 多种语言版本，极大地降低了全球不同背景用户的学习门槛。此外，项目采用开源协作模式，社区活跃且内容持续更新，确保学习者能获取前沿且准确的技术资讯。如果你正寻找一条清晰、友好且专业的机器学习入门之路，ML-For-Beginners 
将是理想的起点。",84991,"2026-04-05T10:45:23",[14,51,52,53,15,54,26,13,55],"数据工具","视频","插件","其他","音频",{"id":57,"name":58,"github_repo":59,"description_zh":60,"stars":61,"difficulty_score":10,"last_commit_at":62,"category_tags":63,"status":16},3128,"ragflow","infiniflow\u002Fragflow","RAGFlow 是一款领先的开源检索增强生成（RAG）引擎，旨在为大语言模型构建更精准、可靠的上下文层。它巧妙地将前沿的 RAG 技术与智能体（Agent）能力相结合，不仅支持从各类文档中高效提取知识，还能让模型基于这些知识进行逻辑推理和任务执行。\n\n在大模型应用中，幻觉问题和知识滞后是常见痛点。RAGFlow 通过深度解析复杂文档结构（如表格、图表及混合排版），显著提升了信息检索的准确度，从而有效减少模型“胡编乱造”的现象，确保回答既有据可依又具备时效性。其内置的智能体机制更进一步，使系统不仅能回答问题，还能自主规划步骤解决复杂问题。\n\n这款工具特别适合开发者、企业技术团队以及 AI 研究人员使用。无论是希望快速搭建私有知识库问答系统，还是致力于探索大模型在垂直领域落地的创新者，都能从中受益。RAGFlow 提供了可视化的工作流编排界面和灵活的 API 接口，既降低了非算法背景用户的上手门槛，也满足了专业开发者对系统深度定制的需求。作为基于 Apache 2.0 协议开源的项目，它正成为连接通用大模型与行业专有知识之间的重要桥梁。",77062,"2026-04-04T04:44:48",[15,14,13,26,54],{"id":65,"github_repo":66,"name":67,"description_en":68,"description_zh":69,"ai_summary_zh":69,"readme_en":70,"readme_zh":71,"quickstart_zh":72,"use_case_zh":73,"hero_image_url":74,"owner_login":75,"owner_name":76,"owner_avatar_url":77,"owner_bio":78,"owner_company":79,"owner_location":80,"owner_email":79,"owner_twitter":79,"owner_website":81,"owner_url":82,"languages":83,"stars":88,"forks":89,"last_commit_at":90,"license":91,"difficulty_score":92,"env_os":93,"env_gpu":93,"env_ram":93,"env_deps":94,"category_tags":96,"github_topics":79,"view_count":97,"oss_zip_url":79,"oss_zip_packed_at":79,"status":16,"created_at":98,"updated_at":99,"faqs":100,"releases":116},268,"muratcankoylan\u002FAgent-Skills-for-Context-Engineering","Agent-Skills-for-Context-Engineering","A comprehensive collection of Agent Skills for context engineering, multi-agent architectures, and production agent systems. 
Use when building, optimizing, or debugging agent systems that require effective context management.","Agent-Skills-for-Context-Engineering 是一个开源的 Agent 技能集合，专注于上下文工程（Context Engineering）领域，帮助开发者构建生产级的 AI Agent 系统。\n\n上下文工程是管理语言模型上下文窗口的学科。与单纯的提示词工程不同，它关注的是如何优化进入模型注意力预算的所有信息——包括系统提示、工具定义、检索文档、消息历史和工具输出等。随着上下文变长，模型会出现“中间丢失”、注意力衰减等问题，这个工具正是为了解决这些挑战而设计的。\n\n该仓库提供了系统化的技能模块，涵盖基础概念（如上下文降级模式、压缩策略）、架构设计（多 Agent 模式、记忆系统、工具设计）、运营优化（上下文压缩、评估框架）以及项目开发方法论。此外还支持构建托管 Agent，配备沙盒虚拟机和多人协作功能。\n\n适合有一定编程基础的开发者、研究人员以及正在构建 AI Agent 产品的技术团队使用。无论是想学习如何优化 Agent 上下文管理，还是需要具体的架构模式和评估方法，这个开源项目都能提供实用的参考和可复用的技能模板。","# Agent Skills for Context Engineering\n\nA comprehensive, open collection of Agent Skills focused on context engineering principles for building production-grade AI agent systems. These skills teach the art and science of curating context to maximize agent effectiveness across any agent platform.\n\n## What is Context Engineering?\n\nContext engineering is the discipline of managing the language model's context window. Unlike prompt engineering, which focuses on crafting effective instructions, context engineering addresses the holistic curation of all information that enters the model's limited attention budget: system prompts, tool definitions, retrieved documents, message history, and tool outputs.\n\nThe fundamental challenge is that context windows are constrained not by raw token capacity but by attention mechanics. As context length increases, models exhibit predictable degradation patterns: the \"lost-in-the-middle\" phenomenon, U-shaped attention curves, and attention scarcity. 
Effective context engineering means finding the smallest possible set of high-signal tokens that maximize the likelihood of desired outcomes.\n\n## Recognition\n\nThis repository is cited in academic research as foundational work on static skill architecture:\n\n> \"While static skills are well-recognized [Anthropic, 2025b; Muratcan Koylan, 2025], MCE is among the first to dynamically evolve them, bridging manual skill engineering and autonomous self-improvement.\"\n\n— [Meta Context Engineering via Agentic Skill Evolution](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2601.21557), Peking University State Key Laboratory of General Artificial Intelligence (2026)\n\n## Skills Overview\n\n### Foundational Skills\n\nThese skills establish the foundational understanding required for all subsequent context engineering work.\n\n| Skill | Description |\n|-------|-------------|\n| [context-fundamentals](skills\u002Fcontext-fundamentals\u002F) | Understand what context is, why it matters, and the anatomy of context in agent systems |\n| [context-degradation](skills\u002Fcontext-degradation\u002F) | Recognize patterns of context failure: lost-in-middle, poisoning, distraction, and clash |\n| [context-compression](skills\u002Fcontext-compression\u002F) | Design and evaluate compression strategies for long-running sessions |\n\n### Architectural Skills\n\nThese skills cover the patterns and structures for building effective agent systems.\n\n| Skill | Description |\n|-------|-------------|\n| [multi-agent-patterns](skills\u002Fmulti-agent-patterns\u002F) | Master orchestrator, peer-to-peer, and hierarchical multi-agent architectures |\n| [memory-systems](skills\u002Fmemory-systems\u002F) | Design short-term, long-term, and graph-based memory architectures |\n| [tool-design](skills\u002Ftool-design\u002F) | Build tools that agents can use effectively |\n| [filesystem-context](skills\u002Ffilesystem-context\u002F) | Use filesystems for dynamic context discovery, tool output 
offloading, and plan persistence |\n| [hosted-agents](skills\u002Fhosted-agents\u002F) | **NEW** Build background coding agents with sandboxed VMs, pre-built images, multiplayer support, and multi-client interfaces |\n\n### Operational Skills\n\nThese skills address the ongoing operation and optimization of agent systems.\n\n| Skill | Description |\n|-------|-------------|\n| [context-optimization](skills\u002Fcontext-optimization\u002F) | Apply compaction, masking, and caching strategies |\n| [evaluation](skills\u002Fevaluation\u002F) | Build evaluation frameworks for agent systems |\n| [advanced-evaluation](skills\u002Fadvanced-evaluation\u002F) | Master LLM-as-a-Judge techniques: direct scoring, pairwise comparison, rubric generation, and bias mitigation |\n\n### Development Methodology\n\nThese skills cover the meta-level practices for building LLM-powered projects.\n\n| Skill | Description |\n|-------|-------------|\n| [project-development](skills\u002Fproject-development\u002F) | Design and build LLM projects from ideation through deployment, including task-model fit analysis, pipeline architecture, and structured output design |\n\n### Cognitive Architecture Skills\n\nThese skills cover formal cognitive modeling for rational agent systems.\n\n| Skill | Description |\n|-------|-------------|\n| [bdi-mental-states](skills\u002Fbdi-mental-states\u002F) | **NEW** Transform external RDF context into agent mental states (beliefs, desires, intentions) using formal BDI ontology patterns for deliberative reasoning and explainability |\n\n## Design Philosophy\n\n### Progressive Disclosure\n\nEach skill is structured for efficient context use. At startup, agents load only skill names and descriptions. Full content loads only when a skill is activated for relevant tasks.\n\n### Platform Agnosticism\n\nThese skills focus on transferable principles rather than vendor-specific implementations. 
The patterns work across Claude Code, Cursor, and any agent platform that supports skills or allows custom instructions.\n\n### Conceptual Foundation with Practical Examples\n\nScripts and examples demonstrate concepts using Python pseudocode that works across environments without requiring specific dependency installations.\n\n## Usage\n\n### Usage with Claude Code\n\nThis repository is a **Claude Code Plugin Marketplace** containing context engineering skills that Claude automatically discovers and activates based on your task context.\n\n### Installation\n\n**Step 1: Add the Marketplace**\n\nRun this command in Claude Code to register this repository as a plugin source:\n\n```\n\u002Fplugin marketplace add muratcankoylan\u002FAgent-Skills-for-Context-Engineering\n```\n\n**Step 2: Install the Plugin**\n\nOption A - Browse and install:\n1. Select `Browse and install plugins`\n2. Select `context-engineering-marketplace`\n3. Select `context-engineering`\n4. Select `Install now`\n\nOption B - Direct install via command:\n\n```\n\u002Fplugin install context-engineering@context-engineering-marketplace\n```\n\nThis installs all 13 skills in a single plugin. 
Skills are activated automatically based on your task context.\n\n### Skill Triggers\n\n| Skill | Triggers On |\n|-------|-------------|\n| `context-fundamentals` | \"understand context\", \"explain context windows\", \"design agent architecture\" |\n| `context-degradation` | \"diagnose context problems\", \"fix lost-in-middle\", \"debug agent failures\" |\n| `context-compression` | \"compress context\", \"summarize conversation\", \"reduce token usage\" |\n| `context-optimization` | \"optimize context\", \"reduce token costs\", \"implement KV-cache\" |\n| `multi-agent-patterns` | \"design multi-agent system\", \"implement supervisor pattern\" |\n| `memory-systems` | \"implement agent memory\", \"build knowledge graph\", \"track entities\" |\n| `tool-design` | \"design agent tools\", \"reduce tool complexity\", \"implement MCP tools\" |\n| `filesystem-context` | \"offload context to files\", \"dynamic context discovery\", \"agent scratch pad\", \"file-based context\" |\n| `hosted-agents` | \"build background agent\", \"create hosted coding agent\", \"sandboxed execution\", \"multiplayer agent\", \"Modal sandboxes\" |\n| `evaluation` | \"evaluate agent performance\", \"build test framework\", \"measure quality\" |\n| `advanced-evaluation` | \"implement LLM-as-judge\", \"compare model outputs\", \"mitigate bias\" |\n| `project-development` | \"start LLM project\", \"design batch pipeline\", \"evaluate task-model fit\" |\n| `bdi-mental-states` | \"model agent mental states\", \"implement BDI architecture\", \"transform RDF to beliefs\", \"build cognitive agent\" |\n\n\u003Cimg width=\"1014\" height=\"894\" alt=\"Screenshot 2025-12-26 at 12 34 47 PM\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fmuratcankoylan_Agent-Skills-for-Context-Engineering_readme_46c97aaea74c.png\" \u002F>\n\n### For Cursor (Open Plugins)\n\nThis repository is listed on the [Cursor Plugin Directory](https:\u002F\u002Fcursor.directory\u002Fplugins\u002Fcontext-engineering).\n\nThe 
`.plugin\u002Fplugin.json` manifest follows the [Open Plugins](https:\u002F\u002Fopen-plugins.com) standard, so the repo also works with any conformant agent tool (Codex, GitHub Copilot, etc.).\n\n### Using Individual Skills\n\nTo use a single skill without installing the full plugin, copy its `SKILL.md` directly into your project's `.claude\u002Fskills\u002F` directory:\n\n```bash\n# Example: add just the context-fundamentals skill\nmkdir -p .claude\u002Fskills\ncurl -o .claude\u002Fskills\u002Fcontext-fundamentals.md \\\n  https:\u002F\u002Fraw.githubusercontent.com\u002Fmuratcankoylan\u002FAgent-Skills-for-Context-Engineering\u002Fmain\u002Fskills\u002Fcontext-fundamentals\u002FSKILL.md\n```\n\nAvailable skills: `context-fundamentals`, `context-degradation`, `context-compression`, `context-optimization`, `multi-agent-patterns`, `memory-systems`, `tool-design`, `filesystem-context`, `hosted-agents`, `evaluation`, `advanced-evaluation`, `project-development`, `bdi-mental-states`\n\n### For Custom Implementations\n\nExtract the principles and patterns from any skill and implement them in your agent framework. The skills are deliberately platform-agnostic.\n\n## Examples\n\nThe [examples](examples\u002F) folder contains complete system designs that demonstrate how multiple skills work together in practice.\n\n| Example | Description | Skills Applied |\n|---------|-------------|----------------|\n| [digital-brain-skill](examples\u002Fdigital-brain-skill\u002F) | **NEW** Personal operating system for founders and creators. 
Complete Claude Code skill with 6 modules, 4 automation scripts | context-fundamentals, context-optimization, memory-systems, tool-design, multi-agent-patterns, evaluation, project-development |\n| [x-to-book-system](examples\u002Fx-to-book-system\u002F) | Multi-agent system that monitors X accounts and generates daily synthesized books | multi-agent-patterns, memory-systems, context-optimization, tool-design, evaluation |\n| [llm-as-judge-skills](examples\u002Fllm-as-judge-skills\u002F) | Production-ready LLM evaluation tools with TypeScript implementation, 19 passing tests | advanced-evaluation, tool-design, context-fundamentals, evaluation |\n| [book-sft-pipeline](examples\u002Fbook-sft-pipeline\u002F) | Train models to write in any author's style. Includes Gertrude Stein case study with 70% human score on Pangram, $2 total cost | project-development, context-compression, multi-agent-patterns, evaluation |\n\nEach example includes:\n- Complete PRD with architecture decisions\n- Skills mapping showing which concepts informed each decision\n- Implementation guidance\n\n### Digital Brain Skill Example\n\nThe [digital-brain-skill](examples\u002Fdigital-brain-skill\u002F) example is a complete personal operating system demonstrating comprehensive skills application:\n\n- **Progressive Disclosure**: 3-level loading (SKILL.md → MODULE.md → data files)\n- **Module Isolation**: 6 independent modules (identity, content, knowledge, network, operations, agents)\n- **Append-Only Memory**: JSONL files with schema-first lines for agent-friendly parsing\n- **Automation Scripts**: 4 consolidated tools (weekly_review, content_ideas, stale_contacts, idea_to_draft)\n\nIncludes detailed traceability in [HOW-SKILLS-BUILT-THIS.md](examples\u002Fdigital-brain-skill\u002FHOW-SKILLS-BUILT-THIS.md) mapping every architectural decision to specific skill principles.\n\n### LLM-as-Judge Skills Example\n\nThe [llm-as-judge-skills](examples\u002Fllm-as-judge-skills\u002F) example is a complete 
TypeScript implementation demonstrating:\n\n- **Direct Scoring**: Evaluate responses against weighted criteria with rubric support\n- **Pairwise Comparison**: Compare responses with position bias mitigation\n- **Rubric Generation**: Create domain-specific evaluation standards\n- **EvaluatorAgent**: High-level agent combining all evaluation capabilities\n\n### Book SFT Pipeline Example\n\nThe [book-sft-pipeline](examples\u002Fbook-sft-pipeline\u002F) example demonstrates training small models (8B) to write in any author's style:\n\n- **Intelligent Segmentation**: Two-tier chunking with overlap for maximum training examples\n- **Prompt Diversity**: 15+ templates to prevent memorization and force style learning\n- **Tinker Integration**: Complete LoRA training workflow with $2 total cost\n- **Validation Methodology**: Modern scenario testing proves style transfer vs content memorization\n\nIntegrates with context engineering skills: project-development, context-compression, multi-agent-patterns, evaluation.\n\n## Star History\n\u003Cimg width=\"3664\" height=\"2648\" alt=\"star-history-2026317\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fmuratcankoylan_Agent-Skills-for-Context-Engineering_readme_0bec67b79db5.png\" \u002F>\n\n## Structure\n\nEach skill follows the Agent Skills specification:\n\n```\nskill-name\u002F\n├── SKILL.md              # Required: instructions + metadata\n├── scripts\u002F              # Optional: executable code demonstrating concepts\n└── references\u002F           # Optional: additional documentation and resources\n```\n\nSee the [template](template\u002F) folder for the canonical skill structure.\n\n## Contributing\n\nThis repository follows the Agent Skills open development model. Contributions are welcome from the broader ecosystem. When contributing:\n\n1. Follow the skill template structure\n2. Provide clear, actionable instructions\n3. Include working examples where appropriate\n4. 
Document trade-offs and potential issues\n5. Keep SKILL.md under 500 lines for optimal performance\n\nFeel free to contact [Muratcan Koylan](https:\u002F\u002Fx.com\u002Fkoylanai) for collaboration opportunities or any inquiries.\n\n## License\n\nMIT License - see LICENSE file for details.\n\n## References\n\nThe principles in these skills are derived from research and production experience at leading AI labs and framework developers. Each skill includes references to the underlying research and case studies that inform its recommendations.\n","# 上下文工程的智能体技能集\n\n一个全面、开放的智能体技能集，专注于构建生产级AI智能体系统的上下文工程原则。这些技能教授策划上下文的艺术和科学，以在任何智能体平台上最大化智能体效能。\n\n## 什么是上下文工程？\n\n上下文工程是管理语言模型上下文窗口的学科。与专注于制作有效指令的提示工程不同，上下文工程关注的是对进入模型有限注意力预算的所有信息进行整体整理：系统提示、工具定义、检索到的文档、消息历史和工具输出。\n\n根本挑战在于上下文窗口受限于注意力机制而非原始令牌容量。随着上下文长度增加，模型表现出可预测的退化模式：\"中间丢失\"现象、U形注意力曲线和注意力稀缺。有效的上下文工程意味着寻找最小化的高信号令牌集合，以最大化预期结果的概率。\n\n## 认可\n\n该仓库在学术研究中被引用，作为静态技能架构的基础性工作：\n\n> \"虽然静态技能已被广泛认可 [Anthropic, 2025b; Muratcan Koylan, 2025]，但MCE是首批动态演进它们的系统之一，弥合了手动技能工程和自主自我改进之间的鸿沟。\"\n\n— [通过智能体技能演进的元上下文工程](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2601.21557)，北京大学通用人工智能国家重点实验室 (2026)\n\n## 技能概览\n\n### 基础技能\n\n这些技能为所有后续的上下文工程工作奠定基础理解。\n\n| 技能 | 描述 |\n|-------|-------------|\n| [context-fundamentals](skills\u002Fcontext-fundamentals\u002F) | 理解什么是上下文、为什么它重要，以及智能体系统中上下文的结构 |\n| [context-degradation](skills\u002Fcontext-degradation\u002F) | 识别上下文失败模式：中间丢失、污染、干扰和冲突 |\n| [context-compression](skills\u002Fcontext-compression\u002F) | 设计和评估长时间运行的会话的压缩策略 |\n\n### 架构技能\n\n这些技能涵盖构建有效智能体系统的模式和结构。\n\n| 技能 | 描述 |\n|-------|-------------|\n| [multi-agent-patterns](skills\u002Fmulti-agent-patterns\u002F) | 掌握编排器、点对点和分层多智能体架构 |\n| [memory-systems](skills\u002Fmemory-systems\u002F) | 设计短期、长期和基于图的记忆架构 |\n| [tool-design](skills\u002Ftool-design\u002F) | 构建智能体可以有效使用的工具 |\n| [filesystem-context](skills\u002Ffilesystem-context\u002F) | 使用文件系统进行动态上下文发现、工具输出卸载和计划持久化 |\n| [hosted-agents](skills\u002Fhosted-agents\u002F) | **新增** 
使用沙盒虚拟机、预构建镜像、多人支持和多客户端接口构建后台编码智能体 |\n\n### 运维技能\n\n这些技能解决智能体系统的持续运行和优化问题。\n\n| 技能 | 描述 |\n|-------|-------------|\n| [context-optimization](skills\u002Fcontext-optimization\u002F) | 应用压缩、掩码和缓存策略 |\n| [evaluation](skills\u002Fevaluation\u002F) | 为智能体系统构建评估框架 |\n| [advanced-evaluation](skills\u002Fadvanced-evaluation\u002F) | 掌握LLM作为评判者的技术：直接评分、成对比较、评分标准生成和偏差缓解 |\n\n### 开发方法论\n\n这些技能涵盖构建LLM驱动项目的元级实践。\n\n| 技能 | 描述 |\n|-------|-------------|\n| [project-development](skills\u002Fproject-development\u002F) | 从构思到部署设计和构建LLM项目，包括任务-模型匹配分析、管道架构和结构化输出设计 |\n\n### 认知架构技能\n\n这些技能涵盖智能体系统的形式化认知建模。\n\n| 技能 | 描述 |\n|-------|-------------|\n| [bdi-mental-states](skills\u002Fbdi-mental-states\u002F) | **新增** 使用形式化BDI本体模式将外部RDF上下文转换为智能体心理状态（信念、愿望、意图），用于审慎推理和可解释性 |\n\n## 设计理念\n\n### 渐进式披露\n\n每个技能都为高效使用上下文而构建。在启动时，智能体只加载技能名称和描述。只有当技能被激活用于相关任务时，才会加载完整内容。\n\n### 平台无关性\n\n这些技能专注于可转移的原则，而不是供应商特定的实现。这些模式适用于Claude Code、Cursor以及任何支持技能或允许自定义指令的智能体平台。\n\n### 概念基础与实践示例\n\n脚本和示例使用Python伪代码演示概念，可在不同环境中工作，无需安装特定依赖项。\n\n## 使用方法\n\n### 与Claude Code一起使用\n\n该仓库是一个**Claude Code插件市场**，包含上下文工程技能，Claude会根据您的任务上下文自动发现和激活这些技能。\n\n### 安装\n\n**步骤1：添加市场**\n\n在Claude Code中运行此命令，将该仓库注册为插件源：\n\n```\n\u002Fplugin marketplace add muratcankoylan\u002FAgent-Skills-for-Context-Engineering\n```\n\n**步骤2：安装插件**\n\n选项A - 浏览并安装：\n1. 选择`浏览并安装插件`\n2. 选择`context-engineering-marketplace`\n3. 选择`context-engineering`\n4. 
选择`立即安装`\n\n选项B - 通过命令直接安装：\n\n```\n\u002Fplugin install context-engineering@context-engineering-marketplace\n```\n\n这会将所有13个技能安装为单个插件。技能会根据您的任务上下文自动激活。\n\n### 技能触发器\n\n| 技能 | 触发条件 |\n|-------|-------------|\n| `context-fundamentals` | \"understand context\", \"explain context windows\", \"design agent architecture\" |\n| `context-degradation` | \"diagnose context problems\", \"fix lost-in-middle\", \"debug agent failures\" |\n| `context-compression` | \"compress context\", \"summarize conversation\", \"reduce token usage\" |\n| `context-optimization` | \"optimize context\", \"reduce token costs\", \"implement KV-cache\" |\n| `multi-agent-patterns` | \"design multi-agent system\", \"implement supervisor pattern\" |\n| `memory-systems` | \"implement agent memory\", \"build knowledge graph\", \"track entities\" |\n| `tool-design` | \"design agent tools\", \"reduce tool complexity\", \"implement MCP tools\" |\n| `filesystem-context` | \"offload context to files\", \"dynamic context discovery\", \"agent scratch pad\", \"file-based context\" |\n| `hosted-agents` | \"build background agent\", \"create hosted coding agent\", \"sandboxed execution\", \"multiplayer agent\", \"Modal sandboxes\" |\n| `evaluation` | \"evaluate agent performance\", \"build test framework\", \"measure quality\" |\n| `advanced-evaluation` | \"implement LLM-as-judge\", \"compare model outputs\", \"mitigate bias\" |\n| `project-development` | \"start LLM project\", \"design batch pipeline\", \"evaluate task-model fit\" |\n| `bdi-mental-states` | \"model agent mental states\", \"implement BDI architecture\", \"transform RDF to beliefs\", \"build cognitive agent\" |\n\n\u003Cimg width=\"1014\" height=\"894\" alt=\"Screenshot 2025-12-26 at 12 34 47 PM\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fmuratcankoylan_Agent-Skills-for-Context-Engineering_readme_46c97aaea74c.png\" \u002F>\n\n### 适用于 Cursor（开放插件）\n\n本仓库已收录于 [Cursor
插件目录](https:\u002F\u002Fcursor.directory\u002Fplugins\u002Fcontext-engineering)。\n\n`.plugin\u002Fplugin.json` 清单遵循[开放插件](https:\u002F\u002Fopen-plugins.com)标准，因此该仓库也可与任何兼容的代理工具（Codex、GitHub Copilot 等）配合使用。\n\n### 使用单个技能\n\n如需使用单个技能而不安装完整插件，可将其 `SKILL.md` 直接复制到项目的 `.claude\u002Fskills\u002F` 目录：\n\n```bash\n# 示例：仅添加 context-fundamentals 技能\nmkdir -p .claude\u002Fskills\ncurl -o .claude\u002Fskills\u002Fcontext-fundamentals.md \\\n  https:\u002F\u002Fraw.githubusercontent.com\u002Fmuratcankoylan\u002FAgent-Skills-for-Context-Engineering\u002Fmain\u002Fskills\u002Fcontext-fundamentals\u002FSKILL.md\n```\n\n可用技能：`context-fundamentals`、`context-degradation`、`context-compression`、`context-optimization`、`multi-agent-patterns`、`memory-systems`、`tool-design`、`filesystem-context`、`hosted-agents`、`evaluation`、`advanced-evaluation`、`project-development`、`bdi-mental-states`\n\n### 自定义实现\n\n可从任何技能中提取原则和模式，并将其实现到您的代理框架中。这些技能刻意保持与平台无关的特性。\n\n## 示例\n\n[examples](examples\u002F) 文件夹包含完整的系统设计，展示了多个技能如何在实践中协同工作。\n\n| 示例 | 描述 | 应用的技能 |\n|---------|-------------|----------------|\n| [digital-brain-skill](examples\u002Fdigital-brain-skill\u002F) | **新增** 为创始人和创作者打造的个人操作系统。完整的 Claude Code 技能，包含 6 个模块、4 个自动化脚本 | context-fundamentals, context-optimization, memory-systems, tool-design, multi-agent-patterns, evaluation, project-development |\n| [x-to-book-system](examples\u002Fx-to-book-system\u002F) | 监控 X 账号并生成每日综合书籍的多代理系统 | multi-agent-patterns, memory-systems, context-optimization, tool-design, evaluation |\n| [llm-as-judge-skills](examples\u002Fllm-as-judge-skills\u002F) | 生产就绪的 LLM 评估工具，包含 TypeScript 实现，19 个通过测试 | advanced-evaluation, tool-design, context-fundamentals, evaluation |\n| [book-sft-pipeline](examples\u002Fbook-sft-pipeline\u002F) | 训练模型以任何作者的风格写作。包括 Gertrude Stein 案例研究，在 Pangram 上获得 70% 人类评分，总成本 $2 | project-development, context-compression, multi-agent-patterns, evaluation |\n\n每个示例包含：\n- 完整的 PRD 及架构决策\n- 技能映射，展示哪些概念影响了每个决策\n- 实现指南\n\n### Digital Brain Skill 
示例\n\n[digital-brain-skill](examples\u002Fdigital-brain-skill\u002F) 示例是一个完整的个人操作系统，展示了全面的技能应用：\n\n- **渐进式展示**（Progressive Disclosure）：3 级加载（SKILL.md → MODULE.md → 数据文件）\n- **模块隔离**（Module Isolation）：6 个独立模块（身份、内容、知识、网络、运营、代理）\n- **追加式内存**（Append-Only Memory）：使用模式优先行的 JSONL 文件，便于代理解析\n- **自动化脚本**（Automation Scripts）：4 个整合工具（weekly_review、content_ideas、stale_contacts、idea_to_draft）\n\n包含详细的可追溯性文档 [HOW-SKILLS-BUILT-THIS.md](examples\u002Fdigital-brain-skill\u002FHOW-SKILLS-BUILT-THIS.md)，将每个架构决策映射到特定的技能原则。\n\n### LLM-as-Judge Skills 示例\n\n[llm-as-judge-skills](examples\u002Fllm-as-judge-skills\u002F) 示例是一个完整的 TypeScript 实现，展示了：\n\n- **直接评分**（Direct Scoring）：根据加权标准和评分规则评估响应\n- **成对比较**（Pairwise Comparison）：比较响应并缓解位置偏差\n- **评分规则生成**（Rubric Generation）：创建领域特定的评估标准\n- **EvaluatorAgent**：整合所有评估能力的高级代理\n\n### Book SFT Pipeline 示例\n\n[book-sft-pipeline](examples\u002Fbook-sft-pipeline\u002F) 示例展示了训练小型模型（8B）以任何作者风格写作：\n\n- **智能分段**（Intelligent Segmentation）：双层分块与重叠，最大化训练样本\n- **提示多样性**（Prompt Diversity）：15+ 模板防止记忆并强制风格学习\n- **Tinker 集成**（Tinker Integration）：完整的 LoRA 训练工作流，总成本 $2\n- **验证方法论**（Validation Methodology）：现代场景测试证明风格迁移优于内容记忆\n\n与上下文工程技能集成：project-development、context-compression、multi-agent-patterns、evaluation。\n\n## Star 历史\n\u003Cimg width=\"3664\" height=\"2648\" alt=\"star-history-2026317\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fmuratcankoylan_Agent-Skills-for-Context-Engineering_readme_0bec67b79db5.png\" \u002F>\n\n## 结构\n\n每个技能遵循 Agent Skills 规范：\n\n```\nskill-name\u002F\n├── SKILL.md              # 必需：说明 + 元数据\n├── scripts\u002F              # 可选：演示概念的可执行代码\n└── references\u002F           # 可选：额外文档和资源\n```\n\n请参阅 [template](template\u002F) 文件夹了解规范的技能结构。\n\n## 贡献指南\n\n本仓库采用 Agent Skills 开放开发模式。欢迎来自更广泛生态系统的贡献。在贡献时，请遵循以下原则：\n\n1. 遵循技能模板结构\n2. 提供清晰、可操作的指令\n3. 在适当的地方包含可工作的示例\n4. 记录权衡取舍和潜在问题\n5. 
将 SKILL.md 保持在 500 行以内以获得最佳性能\n\n欢迎与 [Muratcan Koylan](https:\u002F\u002Fx.com\u002Fkoylanai) 联系，讨论合作机会或任何咨询。\n\n## 许可证\n\nMIT 许可证 - 详见 LICENSE 文件。\n\n## 参考资料\n\n这些技能的原则来源于领先人工智能实验室和框架开发者的研究和生产经验。每项技能都包含支撑其建议的基础研究和案例研究参考。","# Agent-Skills-for-Context-Engineering 快速上手指南\n\n## 环境准备\n\n### 系统要求\n\n- **操作系统**：macOS、Windows（WSL）、Linux\n- **运行工具**：Claude Code（必需）\n- **其他工具**：Git、curl（用于单独安装技能）\n\n### 前置依赖\n\n1. 安装 Claude Code（请访问 [anthropic.com\u002Fclaude-code](https:\u002F\u002Fwww.anthropic.com\u002Fclaude-code) 获取安装包）\n2. 确保命令行可访问 `git` 和 `curl`\n\n## 安装步骤\n\n### 方式一：Claude Code 插件安装（推荐）\n\n**Step 1：添加 Marketplace**\n\n在 Claude Code 中运行：\n\n```\n\u002Fplugin marketplace add muratcankoylan\u002FAgent-Skills-for-Context-Engineering\n```\n\n**Step 2：安装插件**\n\n选择以下任一方式：\n\n- **方式 A**：运行 `\u002Fplugin install context-engineering@context-engineering-marketplace`\n- **方式 B**：在 Claude Code 中依次选择 `Browse and install plugins` → `context-engineering-marketplace` → `context-engineering` → `Install now`\n\n安装完成后，13 个技能将自动集成到 Claude Code 中。\n\n### 方式二：单独安装单个技能\n\n如果只需使用某个技能，可以直接下载对应的 `SKILL.md` 文件：\n\n```bash\n# 创建技能目录\nmkdir -p .claude\u002Fskills\n\n# 下载单个技能（以 context-fundamentals 为例）\ncurl -o .claude\u002Fskills\u002Fcontext-fundamentals.md \\\n  https:\u002F\u002Fraw.githubusercontent.com\u002Fmuratcankoylan\u002FAgent-Skills-for-Context-Engineering\u002Fmain\u002Fskills\u002Fcontext-fundamentals\u002FSKILL.md\n```\n\n可用的技能列表：`context-fundamentals`、`context-degradation`、`context-compression`、`context-optimization`、`multi-agent-patterns`、`memory-systems`、`tool-design`、`filesystem-context`、`hosted-agents`、`evaluation`、`advanced-evaluation`、`project-development`、`bdi-mental-states`\n\n## 基本使用\n\n### 自动触发\n\n安装插件后，技能会根据任务上下文自动激活。例如：\n\n| 触发指令 | 激活技能 |\n|---------|---------|\n| \"understand context\" | context-fundamentals |\n| \"diagnose context problems\" | context-degradation |\n| \"compress context\" | context-compression |\n| \"optimize context\" | context-optimization |\n| 
\"design multi-agent system\" | multi-agent-patterns |\n| \"implement agent memory\" | memory-systems |\n| \"design agent tools\" | tool-design |\n| \"evaluate agent performance\" | evaluation |\n| \"start LLM project\" | project-development |\n\n### 简单示例\n\n在 Claude Code 中直接提出需求即可：\n\n```\n# 想要理解上下文工程的基础概念\n\"explain what is context engineering\"\n\n# 想要优化上下文使用\n\"help me optimize context to reduce token costs\"\n\n# 想要设计多智能体系统\n\"design a multi-agent architecture for my project\"\n```\n\nClaude Code 会自动加载相关技能并提供指导。\n\n### 查看示例项目\n\n仓库中的 [examples](examples\u002F) 目录包含完整系统设计供参考：\n\n- **digital-brain-skill**：个人知识管理系统\n- **x-to-book-system**：多智能体内容生成系统\n- **llm-as-judge-skills**：LLM 评估工具\n- **book-sft-pipeline**：模型风格训练流水线\n\n查看示例：\n\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002Fmuratcankoylan\u002FAgent-Skills-for-Context-Engineering.git\ncd Agent-Skills-for-Context-Engineering\u002Fexamples\n```","某中型科技公司的 AI 团队正在开发一个企业级智能代码审查助手，需要同时处理代码片段、审查历史、团队规范和工具输出，在多轮对话中保持一致性和准确性。\n\n### 没有 Agent-Skills-for-Context-Engineering 时\n\n- 团队在构建 Agent 时只能凭经验调整上下文，导致模型在长对话中频繁出现\"遗忘\"早期需求的问题\n- 多 Agent 协作时，每个 Agent 的上下文相互干扰，无法有效区分不同任务的信息优先级\n- 代码审查涉及大量工具输出（静态分析结果、测试报告），直接填入上下文导致 token 浪费严重，模型性能下降\n- 团队反复调试但找不到根本原因，只能通过缩短对话长度来规避问题，严重影响用户体验\n- 缺乏系统性的评估方法，无法量化不同上下文策略的效果，优化工作盲目且低效\n\n### 使用 Agent-Skills-for-Context-Engineering 后\n\n- 应用 context-degradation 技能识别出\"lost-in-the-middle\"现象，针对性地将关键审查规则放在上下文的开头和结尾\n- 采用 context-compression 技能对工具输出进行结构化压缩，在保留关键信息的同时减少 60% 的 token 消耗\n- 引入 memory-systems 技能设计分层记忆架构，将团队规范存入长期记忆，当前审查任务使用短期记忆\n- 运用 multi-agent-patterns 技能重构为\"审查员+规范检查员+建议生成器\"的分层架构，降低上下文复杂度\n- 通过 evaluation 和 advanced-evaluation 技能建立量化评估体系，用 LLM-as-Judge 方法持续监控并优化上下文策略\n\n通过系统性的上下文工程，这家企业成功将代码审查助手的有效对话轮次从 5 轮提升至 20 轮以上，审查准确率提高 35%。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fmuratcankoylan_Agent-Skills-for-Context-Engineering_fa5f3786.png","muratcankoylan","Muratcan 
Koylan","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002Fmuratcankoylan_f6010625.png","Context Engineer & AI Agent Systems Manager | Prompt design & context engineering, persona embodiment and multi-agent architectures.",null,"Toronto","www.muratcankoylan.com","https:\u002F\u002Fgithub.com\u002Fmuratcankoylan",[84],{"name":85,"color":86,"percentage":87},"Python","#3572A5",100,14735,1156,"2026-04-05T22:19:02","MIT",1,"Not specified",{"notes":95,"python":93,"dependencies":93},"This is a Claude Code plugin marketplace; no local dependencies need to be installed. After registering it with the \u002Fplugin marketplace add command, Claude Code automatically discovers and activates the relevant skills. Skills exist as Markdown files and are loaded progressively (name and description first, full content only on activation). The example code is Python pseudocode and requires no particular runtime. Use together with Claude Code, Cursor, or another AI coding tool that supports the Open Plugins standard.",[15,13],35,"2026-03-27T02:49:30.150509","2026-04-06T07:14:49.615867",[101,106,111],{"id":102,"question_zh":103,"answer_zh":104,"source_url":105},863,"After installing the plugins, every folder has identical contents and the \u002Fcontext command shows duplicate plugins. What should I do?","This happened because all 5 plugins in marketplace.json used the same \"source\": \".\u002F\", so Claude Code cached the entire repository once per plugin, producing 5 identical copies (about 2.42 MB × 5) and duplicate skills in \u002Fcontext. Fix: the plugins have been merged into a single plugin (context-engineering) that contains all 13 skills, leaving one cache entry and no duplicates. Install with \u002Fplugin install context-engineering@context-engineering-marketplace. If you installed one of the old plugins, uninstall it first, then install the new plugin.","https:\u002F\u002Fgithub.com\u002Fmuratcankoylan\u002FAgent-Skills-for-Context-Engineering\u002Fissues\u002F34",{"id":107,"question_zh":108,"answer_zh":109,"source_url":110},864,"After adding the marketplace there are no skills available to install and skill discovery is broken. How do I fix this?","The cause was that marketplace.json configured only a single plugin entry pointing at .\u002Fskills, and Claude Code cannot discover individual skills nested in subdirectories. Fix: define each skill as a separate plugin entry in marketplace.json, add a .claude-plugin\u002Fplugin.json manifest file to each skill directory, and update the installation docs. Each skill can then be discovered and installed independently. The fix has been merged into the main branch.","https:\u002F\u002Fgithub.com\u002Fmuratcankoylan\u002FAgent-Skills-for-Context-Engineering\u002Fissues\u002F18",{"id":112,"question_zh":113,"answer_zh":114,"source_url":115},865,"The interleaved_thinking skill directory name is invalid and the skill cannot be used. What should I do?","A skill directory name containing an underscore (interleaved_thinking) is not a valid OpenCode skill name. Fix: PR #53 changes the skill path\u002Fname to use a hyphen (interleaved-thinking) so that it conforms to the naming convention.","https:\u002F\u002Fgithub.com\u002Fmuratcankoylan\u002FAgent-Skills-for-Context-Engineering\u002Fissues\u002F42",[117,122],{"id":118,"version":119,"summary_zh":120,"released_at":121},110141,"v2.0.0","## What's Changed\n\nAll 13 skills have been comprehensively rewritten based on learnings from Anthropic's [\"Lessons from Building Claude Code: How We Use Skills\"](https:\u002F\u002Fwww.anthropic.com) article. This is the largest single update to the skills collection.\n\n### The Core Transformation\n\nSkills have been rewritten from **textbook voice** (explains concepts) to **hybrid instructional voice** (leads with actions, weaves in reasoning):\n\n**Before:**\n> \"System prompts establish the agent's core identity, constraints, and behavioral guidelines. They are loaded once at session start and typically persist throughout the conversation.\"\n\n**After:**\n> \"Organize system prompts into distinct sections using XML tags or Markdown headers. System prompts persist throughout the conversation, so place the most critical constraints at the beginning and end where attention is strongest — the middle receives 10-40% less recall accuracy.\"\n\n### Changes Across All 13 Skills\n\n- **Hybrid voice rewrite** — every prose section rewritten from \"X is Y\" to \"Do X because Y\" while preserving all substantive knowledge (metrics, research findings, thresholds)\n- **Gotchas sections** — standardized `## Gotchas` added to all 13 skills (was 4, inconsistently formatted). 5-9 experience-derived, specific, actionable gotchas per skill. 
Per Anthropic: *\"The highest-signal content in any skill is the Gotchas section.\"*\n- **Composable scripts** — all 12 Python scripts updated with `__all__` exports, type hints, `\"Use when:\"` docstrings, and `__main__` demo blocks\n- **Progressive disclosure triggers** — all 13 References sections now include `\"Read when: [condition]\"` to tell Claude *when* to load supporting files\n- **Template updated** — `template\u002FSKILL.md` now includes canonical `## Gotchas` section\n\n### Key Metrics\n\n| Metric | Before | After |\n|--------|--------|-------|\n| Total SKILL.md lines | 3,559 | 3,421 (compressed despite adding content) |\n| Skills with Gotchas | 4 (inconsistent) | **13 (standardized)** |\n| Scripts with `__all__` | 0 | **12** |\n| References with triggers | 0 | **13** |\n| All under 500-line limit | ✓ | ✓ (range: 195–402) |\n\n### New Files\n\n- `docs\u002Fskills-improvement-analysis.md` — full gap analysis of the repo against Anthropic's 9 skill categories and best practices\n- `skills\u002Fadvanced-evaluation\u002Freferences\u002Fevaluation-pipeline.md` — ASCII pipeline diagram offloaded from SKILL.md to free line budget\n\n### Skills Updated\n\n| Bundle | Skills |\n|--------|--------|\n| context-engineering-fundamentals | context-fundamentals, context-degradation, context-compression, context-optimization |\n| agent-architecture | multi-agent-patterns, memory-systems, tool-design, filesystem-context, hosted-agents |\n| agent-evaluation | evaluation, advanced-evaluation |\n| agent-development | project-development |\n| cognitive-architecture | bdi-mental-states |\n\n---\n\n**28 files changed, 4,938 insertions, 3,100 deletions**\n\n🤖 Generated with [Claude Code](https:\u002F\u002Fclaude.ai\u002Fcode)","2026-03-17T22:22:15",{"id":123,"version":124,"summary_zh":125,"released_at":126},110142,"v1.1.0","## 🎉 What's New\r\n\r\n### New Skill: Advanced Evaluation\r\nA comprehensive skill for mastering LLM-as-a-Judge evaluation techniques. 
Based on research from [Eugene Yan's LLM-Evaluators](https:\u002F\u002Feugeneyan.com\u002Fwriting\u002Fllm-evaluators\u002F).\r\n\r\n**Covers:**\r\n- Direct scoring vs. pairwise comparison selection\r\n- Position, length, and verbosity bias mitigation\r\n- Metric selection (Cohen's κ, Spearman's ρ, Kendall's τ)\r\n- Production evaluation pipeline design\r\n- 10 actionable guidelines for reliable evaluation\r\n\r\n📁 [`skills\u002Fadvanced-evaluation\u002F`](skills\u002Fadvanced-evaluation\u002F)\r\n\r\n### New Example: LLM-as-Judge Skills\r\nA complete TypeScript [AI SDK-6](https:\u002F\u002Fvercel.com\u002Fblog\u002Fai-sdk-6) implementation demonstrating the Advanced Evaluation skill in practice.\r\n\r\n**Includes:**\r\n- 3 evaluation tools: `directScore`, `pairwiseCompare`, `generateRubric`\r\n- `EvaluatorAgent` class with full evaluation workflows\r\n- 19 passing tests with real OpenAI API calls\r\n- Position bias mitigation with automatic position swapping\r\n- Zod schemas for type-safe inputs\u002Foutputs\r\n\r\n📁 [`examples\u002Fllm-as-judge-skills\u002F`](examples\u002Fllm-as-judge-skills\u002F)\r\n\r\n## Quick Start\r\n\r\n```bash\r\ncd examples\u002Fllm-as-judge-skills\r\nnpm install\r\ncp env.example .env  # Add your OPENAI_API_KEY\r\nnpm test\r\n```\r\n\r\n## Skills Applied\r\nThis example demonstrates how multiple skills work together:\r\n- `advanced-evaluation` - Core evaluation patterns\r\n- `tool-design` - Zod schemas and error handling\r\n- `context-fundamentals` - Structured evaluation prompts\r\n- `evaluation` - Foundational evaluation concepts\r\n\r\n## Contributors\r\n- @muratcankoylan\r\n\r\n---\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fmuratcankoylan\u002FAgent-Skills-for-Context-Engineering\u002Fcompare\u002Fv1.0.0...v1.1.0\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fmuratcankoylan\u002FAgent-Skills-for-Context-Engineering\u002Fcommits\u002Fv1.1.0","2025-12-24T06:17:26"]