[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-alibaba--page-agent":3,"tool-alibaba--page-agent":64},[4,17,27,35,43,56],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":16},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,3,"2026-04-05T11:01:52",[13,14,15],"开发框架","图像","Agent","ready",{"id":18,"name":19,"github_repo":20,"description_zh":21,"stars":22,"difficulty_score":23,"last_commit_at":24,"category_tags":25,"status":16},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",138956,2,"2026-04-05T11:33:21",[13,15,26],"语言模型",{"id":28,"name":29,"github_repo":30,"description_zh":31,"stars":32,"difficulty_score":23,"last_commit_at":33,"category_tags":34,"status":16},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 
绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",107662,"2026-04-03T11:11:01",[13,14,15],{"id":36,"name":37,"github_repo":38,"description_zh":39,"stars":40,"difficulty_score":23,"last_commit_at":41,"category_tags":42,"status":16},3704,"NextChat","ChatGPTNextWeb\u002FNextChat","NextChat 是一款轻量且极速的 AI 助手，旨在为用户提供流畅、跨平台的大模型交互体验。它完美解决了用户在多设备间切换时难以保持对话连续性，以及面对众多 AI 模型不知如何统一管理的痛点。无论是日常办公、学习辅助还是创意激发，NextChat 都能让用户随时随地通过网页、iOS、Android、Windows、MacOS 或 Linux 端无缝接入智能服务。\n\n这款工具非常适合普通用户、学生、职场人士以及需要私有化部署的企业团队使用。对于开发者而言，它也提供了便捷的自托管方案，支持一键部署到 Vercel 或 Zeabur 等平台。\n\nNextChat 的核心亮点在于其广泛的模型兼容性，原生支持 Claude、DeepSeek、GPT-4 及 Gemini Pro 等主流大模型，让用户在一个界面即可自由切换不同 AI 能力。此外，它还率先支持 MCP（Model Context Protocol）协议，增强了上下文处理能力。针对企业用户，NextChat 提供专业版解决方案，具备品牌定制、细粒度权限控制、内部知识库整合及安全审计等功能，满足公司对数据隐私和个性化管理的高标准要求。",87618,"2026-04-05T07:20:52",[13,26],{"id":44,"name":45,"github_repo":46,"description_zh":47,"stars":48,"difficulty_score":23,"last_commit_at":49,"category_tags":50,"status":16},2268,"ML-For-Beginners","microsoft\u002FML-For-Beginners","ML-For-Beginners 是由微软推出的一套系统化机器学习入门课程，旨在帮助零基础用户轻松掌握经典机器学习知识。这套课程将学习路径规划为 12 周，包含 26 节精炼课程和 52 道配套测验，内容涵盖从基础概念到实际应用的完整流程，有效解决了初学者面对庞大知识体系时无从下手、缺乏结构化指导的痛点。\n\n无论是希望转型的开发者、需要补充算法背景的研究人员，还是对人工智能充满好奇的普通爱好者，都能从中受益。课程不仅提供了清晰的理论讲解，还强调动手实践，让用户在循序渐进中建立扎实的技能基础。其独特的亮点在于强大的多语言支持，通过自动化机制提供了包括简体中文在内的 50 多种语言版本，极大地降低了全球不同背景用户的学习门槛。此外，项目采用开源协作模式，社区活跃且内容持续更新，确保学习者能获取前沿且准确的技术资讯。如果你正寻找一条清晰、友好且专业的机器学习入门之路，ML-For-Beginners 将是理想的起点。",84991,"2026-04-05T10:45:23",[14,51,52,53,15,54,26,13,55],"数据工具","视频","插件","其他","音频",{"id":57,"name":58,"github_repo":59,"description_zh":60,"stars":61,"difficulty_score":10,"last_commit_at":62,"category_tags":63,"status":16},3128,"ragflow","infiniflow\u002Fragflow","RAGFlow 
是一款领先的开源检索增强生成（RAG）引擎，旨在为大语言模型构建更精准、可靠的上下文层。它巧妙地将前沿的 RAG 技术与智能体（Agent）能力相结合，不仅支持从各类文档中高效提取知识，还能让模型基于这些知识进行逻辑推理和任务执行。\n\n在大模型应用中，幻觉问题和知识滞后是常见痛点。RAGFlow 通过深度解析复杂文档结构（如表格、图表及混合排版），显著提升了信息检索的准确度，从而有效减少模型“胡编乱造”的现象，确保回答既有据可依又具备时效性。其内置的智能体机制更进一步，使系统不仅能回答问题，还能自主规划步骤解决复杂问题。\n\n这款工具特别适合开发者、企业技术团队以及 AI 研究人员使用。无论是希望快速搭建私有知识库问答系统，还是致力于探索大模型在垂直领域落地的创新者，都能从中受益。RAGFlow 提供了可视化的工作流编排界面和灵活的 API 接口，既降低了非算法背景用户的上手门槛，也满足了专业开发者对系统深度定制的需求。作为基于 Apache 2.0 协议开源的项目，它正成为连接通用大模型与行业专有知识之间的重要桥梁。",77062,"2026-04-04T04:44:48",[15,14,13,26,54],{"id":65,"github_repo":66,"name":67,"description_en":68,"description_zh":69,"ai_summary_zh":69,"readme_en":70,"readme_zh":71,"quickstart_zh":72,"use_case_zh":73,"hero_image_url":74,"owner_login":75,"owner_name":76,"owner_avatar_url":77,"owner_bio":78,"owner_company":79,"owner_location":79,"owner_email":79,"owner_twitter":79,"owner_website":80,"owner_url":81,"languages":82,"stars":99,"forks":100,"last_commit_at":101,"license":102,"difficulty_score":23,"env_os":103,"env_gpu":103,"env_ram":103,"env_deps":104,"category_tags":108,"github_topics":109,"view_count":118,"oss_zip_url":79,"oss_zip_packed_at":79,"status":16,"created_at":119,"updated_at":120,"faqs":121,"releases":152},968,"alibaba\u002Fpage-agent","page-agent","JavaScript in-page GUI agent. 
Control web interfaces with natural language.","page-agent 是一款嵌入网页的 JavaScript GUI 智能体，让你用自然语言直接操控网页界面。无需安装浏览器扩展、Python 环境或 headless 浏览器，只需一行代码即可在现有网页中集成 AI 助手功能。\n\n这款工具主要解决了传统网页自动化方案门槛高、依赖重的问题。它采用纯文本的 DOM 操作方式，不需要截图或多模态大模型，也无需特殊权限，开发者可以灵活接入自己的 LLM。对于希望快速为产品添加 AI Copilot 的 SaaS 团队、需要简化复杂表单流程的 ERP\u002FCRM 系统开发者，以及关注无障碍访问的产品团队尤为实用。此外，配合可选的 Chrome 扩展和 MCP Server，还能实现跨标签页的多页面任务处理。\n\n核心亮点在于\"轻量无侵入\"——完全在页面内运行，不改造后端即可落地。无论是想一句话完成 20 步点击操作，还是为视障用户提供语音交互入口，page-agent 都能以极低的接入成本实现。","# Page Agent\n\n\u003Cpicture>\n  \u003Csource media=\"(prefers-color-scheme: dark)\" srcset=\"https:\u002F\u002Fimg.alicdn.com\u002Fimgextra\u002Fi4\u002FO1CN01qKig1P1FnhpFKNdi6_!!6000000000532-2-tps-1280-256.png\">\n  \u003Cimg alt=\"Page Agent Banner\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Falibaba_page-agent_readme_58659c6dc330.png\">\n\u003C\u002Fpicture>\n\n[![License: MIT](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FLicense-MIT-auto.svg)](https:\u002F\u002Fopensource.org\u002Flicenses\u002FMIT) [![TypeScript](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F%3C%2F%3E-TypeScript-%230074c1.svg)](http:\u002F\u002Fwww.typescriptlang.org\u002F) [![Bundle Size](https:\u002F\u002Fimg.shields.io\u002Fbundlephobia\u002Fminzip\u002Fpage-agent)](https:\u002F\u002Fbundlephobia.com\u002Fpackage\u002Fpage-agent) [![Downloads](https:\u002F\u002Fimg.shields.io\u002Fnpm\u002Fdt\u002Fpage-agent.svg)](https:\u002F\u002Fwww.npmjs.com\u002Fpackage\u002Fpage-agent) [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Falibaba\u002Fpage-agent.svg)](https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent)\n\nThe GUI Agent Living in Your Webpage. 
Control web interfaces with natural language.\n\n🌐 **English** | [中文](.\u002Fdocs\u002FREADME-zh.md)\n\n\u003Ca href=\"https:\u002F\u002Falibaba.github.io\u002Fpage-agent\u002F\" target=\"_blank\">\u003Cb>🚀 Demo\u003C\u002Fb>\u003C\u002Fa> | \u003Ca href=\"https:\u002F\u002Falibaba.github.io\u002Fpage-agent\u002Fdocs\u002Fintroduction\u002Foverview\" target=\"_blank\">\u003Cb>📖 Docs\u003C\u002Fb>\u003C\u002Fa> | \u003Ca href=\"https:\u002F\u002Fnews.ycombinator.com\u002Fitem?id=47264138\" target=\"_blank\">\u003Cb>📢 HN Discussion\u003C\u002Fb>\u003C\u002Fa> | \u003Ca href=\"https:\u002F\u002Fx.com\u002Fsimonluvramen\" target=\"_blank\">\u003Cb>𝕏 Follow on X\u003C\u002Fb>\u003C\u002Fa>\n\n\u003Cvideo id=\"demo-video\" src=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002Fa1f2eae2-13fb-4aae-98cf-a3fc1620a6c2\" controls crossorigin muted>\u003C\u002Fvideo>\n\n---\n\n## ✨ Features\n\n- **🎯 Easy integration**\n    - No need for `browser extension` \u002F `python` \u002F `headless browser`.\n    - Just in-page javascript. Everything happens in your web page.\n- **📖 Text-based DOM manipulation**\n    - No screenshots. No multi-modal LLMs or special permissions needed.\n- **🧠 Bring your own LLMs**\n- **🐙 Optional [chrome extension](https:\u002F\u002Falibaba.github.io\u002Fpage-agent\u002Fdocs\u002Ffeatures\u002Fchrome-extension) for multi-page tasks.**\n    - And an [MCP Server (Beta)](https:\u002F\u002Falibaba.github.io\u002Fpage-agent\u002Fdocs\u002Ffeatures\u002Fmcp-server) to control it from outside\n\n## 💡 Use Cases\n\n- **SaaS AI Copilot** — Ship an AI copilot in your product in lines of code. No backend rewrite.\n- **Smart Form Filling** — Turn 20-click workflows into one sentence. Perfect for ERP, CRM, and admin systems.\n- **Accessibility** — Make any web app accessible through natural language. 
Voice commands, screen readers, zero barrier.\n- **Multi-page Agent** — Extend your own web agent's reach across browser tabs [chrome extension](https:\u002F\u002Falibaba.github.io\u002Fpage-agent\u002Fdocs\u002Ffeatures\u002Fchrome-extension).\n- **MCP** - Allow your agent clients to control your browser.\n\n## 🚀 Quick Start\n\n### One-line integration\n\nFastest way to try PageAgent with our free Demo LLM:\n\n```html\n\u003Cscript src=\"{URL}\" crossorigin=\"true\">\u003C\u002Fscript>\n```\n\n> **⚠️ For technical evaluation only.** This demo CDN uses our free [testing LLM API](https:\u002F\u002Falibaba.github.io\u002Fpage-agent\u002Fdocs\u002Ffeatures\u002Fmodels#free-testing-api). By using it, you agree to its [terms](https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fblob\u002Fmain\u002Fdocs\u002Fterms-and-privacy.md).\n\n| Mirrors | URL                                                                                |\n| ------- | ---------------------------------------------------------------------------------- |\n| Global  | https:\u002F\u002Fcdn.jsdelivr.net\u002Fnpm\u002Fpage-agent@1.6.2\u002Fdist\u002Fiife\u002Fpage-agent.demo.js         |\n| China   | https:\u002F\u002Fregistry.npmmirror.com\u002Fpage-agent\u002F1.6.2\u002Ffiles\u002Fdist\u002Fiife\u002Fpage-agent.demo.js |\n\n### NPM Installation\n\n```bash\nnpm install page-agent\n```\n\n```javascript\nimport { PageAgent } from 'page-agent'\n\nconst agent = new PageAgent({\n    model: 'qwen3.5-plus',\n    baseURL: 'https:\u002F\u002Fdashscope.aliyuncs.com\u002Fcompatible-mode\u002Fv1',\n    apiKey: 'YOUR_API_KEY',\n    language: 'en-US',\n})\n\nawait agent.execute('Click the login button')\n```\n\nFor more programmatic usage, see [📖 Documentations](https:\u002F\u002Falibaba.github.io\u002Fpage-agent\u002Fdocs\u002Fintroduction\u002Foverview).\n\n## 🤝 Contributing\n\nWe welcome contributions from the community! 
Follow our instructions in [CONTRIBUTING.md](CONTRIBUTING.md) for setup and guidelines.\n\nPlease read the [maintainer note](https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fissues\u002F349) and [Code of Conduct](docs\u002FCODE_OF_CONDUCT.md) before opening issues or PRs.\n\nContributions generated entirely by **bots or agents** without substantial human involvement will **not be accepted**.\n\n## ⚖️ License\n\n[MIT License](LICENSE)\n\n## 👏 Acknowledgments\n\nThis project builds upon the excellent work of **[`browser-use`](https:\u002F\u002Fgithub.com\u002Fbrowser-use\u002Fbrowser-use)**.\n\n`PageAgent` is designed for **client-side web enhancement**, not server-side automation.\n\n```\nDOM processing components and prompt are derived from browser-use:\n\nBrowser Use \u003Chttps:\u002F\u002Fgithub.com\u002Fbrowser-use\u002Fbrowser-use>\nCopyright (c) 2024 Gregor Zunic\nLicensed under the MIT License\n\nWe gratefully acknowledge the browser-use project and its contributors for their\nexcellent work on web automation and DOM interaction patterns that helped make\nthis project possible.\n```\n\n## 🌟 Awesome Page Agent\n\nBuilt something cool with PageAgent? Add it here! Open a PR to share your project.\n\n> These are community projects — not maintained or endorsed by us. 
Use at your own discretion.\n\n| Project | Description |\n| ------- | ----------- |\n| _Yours?_ | [Open a PR](https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fpulls) 🙌 |\n\n---\n\n**⭐ Star this repo if you find PageAgent helpful!**\n","# Page Agent\n\n\u003Cpicture>\n  \u003Csource media=\"(prefers-color-scheme: dark)\" srcset=\"https:\u002F\u002Fimg.alicdn.com\u002Fimgextra\u002Fi4\u002FO1CN01qKig1P1FnhpFKNdi6_!!6000000000532-2-tps-1280-256.png\">\n  \u003Cimg alt=\"Page Agent Banner\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Falibaba_page-agent_readme_58659c6dc330.png\">\n\u003C\u002Fpicture>\n\n[![License: MIT](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FLicense-MIT-auto.svg)](https:\u002F\u002Fopensource.org\u002Flicenses\u002FMIT) [![TypeScript](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F%3C%2F%3E-TypeScript-%230074c1.svg)](http:\u002F\u002Fwww.typescriptlang.org\u002F) [![Bundle Size](https:\u002F\u002Fimg.shields.io\u002Fbundlephobia\u002Fminzip\u002Fpage-agent)](https:\u002F\u002Fbundlephobia.com\u002Fpackage\u002Fpage-agent) [![Downloads](https:\u002F\u002Fimg.shields.io\u002Fnpm\u002Fdt\u002Fpage-agent.svg)](https:\u002F\u002Fwww.npmjs.com\u002Fpackage\u002Fpage-agent) [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Falibaba\u002Fpage-agent.svg)](https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent)\n\n生活在网页中的图形用户界面代理（GUI Agent）。使用自然语言控制网页界面。\n\n🌐 **English** | [中文](.\u002Fdocs\u002FREADME-zh.md)\n\n\u003Ca href=\"https:\u002F\u002Falibaba.github.io\u002Fpage-agent\u002F\" target=\"_blank\">\u003Cb>🚀 示例\u003C\u002Fb>\u003C\u002Fa> | \u003Ca href=\"https:\u002F\u002Falibaba.github.io\u002Fpage-agent\u002Fdocs\u002Fintroduction\u002Foverview\" target=\"_blank\">\u003Cb>📖 文档\u003C\u002Fb>\u003C\u002Fa> | \u003Ca href=\"https:\u002F\u002Fnews.ycombinator.com\u002Fitem?id=47264138\" target=\"_blank\">\u003Cb>📢 HN 讨论\u003C\u002Fb>\u003C\u002Fa> | \u003Ca 
href=\"https:\u002F\u002Fx.com\u002Fsimonluvramen\" target=\"_blank\">\u003Cb>𝕏 在 X 上关注\u003C\u002Fb>\u003C\u002Fa>\n\n\u003Cvideo id=\"demo-video\" src=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002Fa1f2eae2-13fb-4aae-98cf-a3fc1620a6c2\" controls crossorigin muted>\u003C\u002Fvideo>\n\n---\n\n## ✨ 特性\n\n- **🎯 简单集成**\n    - 不需要 `浏览器扩展` \u002F `Python` \u002F `无头浏览器`。\n    - 只需页面内 JavaScript。一切都在您的网页中完成。\n- **📖 基于文本的 DOM 操作**\n    - 不需要截图，也不需要多模态大语言模型（LLMs）或特殊权限。\n- **🧠 使用您自己的 LLMs**\n- **🐙 可选的 [Chrome 扩展](https:\u002F\u002Falibaba.github.io\u002Fpage-agent\u002Fdocs\u002Ffeatures\u002Fchrome-extension) 用于多页面任务**\n    - 还有一个 [MCP 服务器 (Beta)](https:\u002F\u002Falibaba.github.io\u002Fpage-agent\u002Fdocs\u002Ffeatures\u002Fmcp-server)，可以从外部控制它。\n\n## 💡 使用场景\n\n- **SaaS AI 助手** — 用几行代码在您的产品中部署一个 AI 助手，无需后端重写。\n- **智能表单填写** — 将 20 次点击的工作流简化为一句话。非常适合 ERP、CRM 和管理系统。\n- **无障碍访问** — 通过自然语言让任何 Web 应用变得无障碍。支持语音命令、屏幕阅读器，零障碍。\n- **多页面代理** — 通过 [Chrome 扩展](https:\u002F\u002Falibaba.github.io\u002Fpage-agent\u002Fdocs\u002Ffeatures\u002Fchrome-extension) 扩展您自己的 Web 代理到多个浏览器标签页。\n- **MCP** - 允许您的代理客户端控制您的浏览器。\n\n## 🚀 快速开始\n\n### 一行代码集成\n\n最快的方式是通过我们的免费 Demo LLM 来尝试 PageAgent：\n\n```html\n\u003Cscript src=\"{URL}\" crossorigin=\"true\">\u003C\u002Fscript>\n```\n\n> **⚠️ 仅用于技术评估。** 此演示 CDN 使用我们的免费 [测试 LLM API](https:\u002F\u002Falibaba.github.io\u002Fpage-agent\u002Fdocs\u002Ffeatures\u002Fmodels#free-testing-api)。使用它即表示您同意其 [条款](https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fblob\u002Fmain\u002Fdocs\u002Fterms-and-privacy.md)。\n\n| 镜像 | URL                                                                                |\n| ------- | ---------------------------------------------------------------------------------- |\n| 全球  | https:\u002F\u002Fcdn.jsdelivr.net\u002Fnpm\u002Fpage-agent@1.6.2\u002Fdist\u002Fiife\u002Fpage-agent.demo.js         |\n| 中国   | 
https:\u002F\u002Fregistry.npmmirror.com\u002Fpage-agent\u002F1.6.2\u002Ffiles\u002Fdist\u002Fiife\u002Fpage-agent.demo.js |\n\n### NPM 安装\n\n```bash\nnpm install page-agent\n```\n\n```javascript\nimport { PageAgent } from 'page-agent'\n\nconst agent = new PageAgent({\n    model: 'qwen3.5-plus',\n    baseURL: 'https:\u002F\u002Fdashscope.aliyuncs.com\u002Fcompatible-mode\u002Fv1',\n    apiKey: 'YOUR_API_KEY',\n    language: 'en-US',\n})\n\nawait agent.execute('点击登录按钮')\n```\n\n更多编程用法，请参阅 [📖 文档](https:\u002F\u002Falibaba.github.io\u002Fpage-agent\u002Fdocs\u002Fintroduction\u002Foverview)。\n\n## 🤝 贡献\n\n我们欢迎社区的贡献！请按照 [CONTRIBUTING.md](CONTRIBUTING.md) 中的说明进行设置和遵循指南。\n\n在提交问题或 PR 之前，请阅读 [维护者说明](https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fissues\u002F349) 和 [行为准则](docs\u002FCODE_OF_CONDUCT.md)。\n\n完全由 **机器人或代理** 生成且没有实质性人类参与的贡献将 **不被接受**。\n\n## ⚖️ 许可证\n\n[MIT 许可证](LICENSE)\n\n## 👏 致谢\n\n本项目基于 **[`browser-use`](https:\u002F\u002Fgithub.com\u002Fbrowser-use\u002Fbrowser-use)** 的优秀工作构建。\n\n`PageAgent` 专为 **客户端网页增强** 设计，而非服务端自动化。\n\n```\nDOM 处理组件和提示来源于 browser-use：\n\nBrowser Use \u003Chttps:\u002F\u002Fgithub.com\u002Fbrowser-use\u002Fbrowser-use>\n版权所有 (c) 2024 Gregor Zunic\n根据 MIT 许可证授权\n\n我们衷心感谢 browser-use 项目及其贡献者在网页自动化和 DOM 交互模式方面的出色工作，\n这使得本项目成为可能。\n```\n\n## 🌟 精彩的 Page Agent 项目\n\n用 PageAgent 构建了很酷的东西？添加到这里！打开 PR 分享您的项目。\n\n> 这些是社区项目 —— 我们不维护或背书。请自行决定使用。\n\n| 项目 | 描述 |\n| ------- | ----------- |\n| _您的项目？_ | [打开 PR](https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fpulls) 🙌 |\n\n---\n\n**⭐ 如果您觉得 PageAgent 有帮助，请给这个仓库点个星！**","# Page Agent 快速上手指南\n\nPage Agent 是一个嵌入网页的 GUI 智能代理工具，可以通过自然语言控制网页界面。\n\n---\n\n## 环境准备\n\n- **系统要求**: 支持现代浏览器（如 Chrome、Firefox、Edge 等）。\n- **前置依赖**:\n  - Node.js (推荐版本 >= 16)\n  - npm 或 yarn\n  - 需要一个 LLM API Key（如 Qwen API Key）\n\n---\n\n## 安装步骤\n\n### 方法一：通过 NPM 安装\n\n```bash\nnpm install page-agent\n```\n\n### 方法二：通过 CDN 快速加载\n\n在 HTML 文件中直接引入以下脚本：\n\n```html\n\u003Cscript 
src=\"https:\u002F\u002Fregistry.npmmirror.com\u002Fpage-agent\u002F1.6.2\u002Ffiles\u002Fdist\u002Fiife\u002Fpage-agent.demo.js\" crossorigin=\"true\">\u003C\u002Fscript>\n```\n\n> **注意**: 上述 CDN 使用免费测试 LLM API，仅适用于技术评估。使用前请阅读 [服务条款](https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fblob\u002Fmain\u002Fdocs\u002Fterms-and-privacy.md)。\n\n---\n\n## 基本使用\n\n### 示例 1：通过 NPM 使用\n\n以下代码展示了如何初始化 PageAgent 并执行简单操作：\n\n```javascript\nimport { PageAgent } from 'page-agent'\n\nconst agent = new PageAgent({\n    model: 'qwen3.5-plus',\n    baseURL: 'https:\u002F\u002Fdashscope.aliyuncs.com\u002Fcompatible-mode\u002Fv1',\n    apiKey: 'YOUR_API_KEY',\n    language: 'en-US',\n})\n\nawait agent.execute('Click the login button')\n```\n\n### 示例 2：通过 CDN 使用\n\n在 HTML 文件中直接调用 PageAgent：\n\n```html\n\u003Cscript>\n  const agent = new PageAgent({\n      model: 'qwen3.5-plus',\n      baseURL: 'https:\u002F\u002Fdashscope.aliyuncs.com\u002Fcompatible-mode\u002Fv1',\n      apiKey: 'YOUR_API_KEY',\n      language: 'zh-CN',\n  })\n\n  agent.execute('点击登录按钮')\n\u003C\u002Fscript>\n```\n\n---\n\n## 更多文档\n\n如需了解更多高级用法，请参考 [官方文档](https:\u002F\u002Falibaba.github.io\u002Fpage-agent\u002Fdocs\u002Fintroduction\u002Foverview)。\n\n---\n\n**⭐ 如果你觉得 Page Agent 有帮助，请为项目点个 Star！**","一位前端开发者正在为公司内部的老旧ERP系统添加智能助手功能，希望通过自然语言交互提升操作效率。\n\n### 没有 page-agent 时\n- 需要手动编写大量DOM操作代码，逐个定位页面元素并实现交互逻辑\n- 处理复杂表单时，必须记住每个字段的具体位置和名称，容易出错\n- 系统界面经常更新，每次改动都需要重新调整代码适配新的DOM结构\n- 实现跨页面操作时，需要维护多个独立的脚本文件，管理困难\n- 开发周期长，需要同时考虑不同浏览器的兼容性问题\n\n### 使用 page-agent 后\n- 只需简单引入JavaScript库，通过自然语言指令即可完成页面操作\n- 用户可以直接说出\"填写表单，项目名称为新产品\"，无需关心具体字段位置\n- 页面结构调整后，page-agent能自动适配新的DOM结构，减少维护成本\n- 借助可选的chrome扩展，轻松实现跨页面任务的统一管理\n- 内置的兼容性处理让开发者无需担心浏览器差异，专注业务逻辑\n\npage-agent将复杂的网页操作简化为自然语言指令，显著提升了开发效率和用户体验。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Falibaba_page-agent_24f864a1.png","alibaba","Alibaba","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002Falibaba_f65f7221.png","Alibaba Open 
Source",null,"https:\u002F\u002Fopensource.alibaba.com\u002F","https:\u002F\u002Fgithub.com\u002Falibaba",[83,87,91,95],{"name":84,"color":85,"percentage":86},"TypeScript","#3178c6",81.4,{"name":88,"color":89,"percentage":90},"JavaScript","#f1e05a",11.4,{"name":92,"color":93,"percentage":94},"CSS","#663399",5.8,{"name":96,"color":97,"percentage":98},"HTML","#e34c26",1.3,15167,1180,"2026-04-05T11:00:36","MIT","未说明",{"notes":105,"python":103,"dependencies":106},"无需浏览器扩展或 Python 环境，仅需在网页中引入 JavaScript。可选 Chrome 扩展用于多页面任务。需要提供 LLM API 密钥以使用特定功能。",[67,107],"npm",[14,15,13],[110,111,112,113,114,115,116,117],"agent","ai","web","ai-agents","javascript","typescript","browser-automation","mcp",4,"2026-03-27T02:49:30.150509","2026-04-06T06:53:11.237127",[122,127,132,137,142,147],{"id":123,"question_zh":124,"answer_zh":125,"source_url":126},4285,"通过 CDN 引入 Page-Agent 时为什么会报错？","问题可能是由于 script 标签放在 body 标签之前导致的。UMD 版本在引入后会自动创建一个实例，而此时 body 标签尚未加载完成。解决方法是将 script 标签放在 body 标签末尾，并手动清除自动创建的实例后再重新初始化：\n\n```html\n\u003Cbody>\n  \u003Cdiv>其他内容\u003C\u002Fdiv>\n  \u003Cscript src=\"https:\u002F\u002Fcdn.jsdelivr.net\u002Fnpm\u002Fpage-agent@latest\u002Fdist\u002Fumd\u002Fpage-agent.js\" crossorigin=\"true\" type=\"text\u002Fjavascript\">\u003C\u002Fscript>\n  \u003Cscript>\n    window.pageAgent.dispose();\n    const pageAgent = new ......\n  \u003C\u002Fscript>\n\u003C\u002Fbody>\n```","https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fissues\u002F79",{"id":128,"question_zh":129,"answer_zh":130,"source_url":131},4286,"如何解决鼠标悬浮触发的下拉框无法正常关闭的问题？","该问题可能与页面防调试机制有关，某些网页（如 Boss 直聘）会在打开开发者工具时直接关闭页面，导致难以调试。目前该问题暂时关闭，如果需要复现，请提供可调试的页面或更多信息。","https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fissues\u002F307",{"id":133,"question_zh":134,"answer_zh":135,"source_url":136},4287,"如何导出历史执行记录？","该功能已在后续版本中实现。每个历史条目增加了一个导出按钮，可以将 HistoricalEvent 下载为 JSON 
文件。如果需要回放功能，可以关注后续更新。","https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fissues\u002F239",{"id":138,"question_zh":139,"answer_zh":140,"source_url":141},4288,"为什么 GLM-4.7 模型在使用 Page-Agent 插件时会报 InvokeError 错误？","GLM-4.7 模型可能无法正确处理复杂的 tool schema，建议将 temperature 参数设置为允许值的上限以提高重试恢复的几率。此外，Page-Agent 的 autofixer 可以修复部分格式错误，但完全无视系统提示词和 tools 参数的情况需要进一步优化。","https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fissues\u002F258",{"id":143,"question_zh":144,"answer_zh":145,"source_url":146},4289,"为什么 iframe 内容无法点击？","这是一个已知问题，同域 iframe 的内容可以看到但无法操作。该问题已在版本 1.6.0 中修复。","https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fissues\u002F237",{"id":148,"question_zh":149,"answer_zh":150,"source_url":151},4290,"使用 Qwen3.5-35B-A3B 本地模型时为什么会报 Invalid tool_choice 错误？","LM Studio 的参数结构与 OpenAI 不完全兼容，需要添加一个开关来移除 `tool_choice` 参数。该问题已在后续版本中修复。","https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fissues\u002F284",[153,158,163,168,173,178,183,188,193,198,203,208,213,218,223,228,233,238,243,248],{"id":154,"version":155,"summary_zh":156,"released_at":157},103719,"v1.7.1","## What's Changed\r\n* feat(controller): improve scroll container detection and tool guidance by @gaomeng1900 in https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fpull\u002F390\r\n* feat(controller): add experimental `keepSemanticTags` config by @gaomeng1900 in https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fpull\u002F395\r\n* feat(ext): add `systemInstruction` to ExecuteConfig by @gaomeng1900 in https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fpull\u002F386\r\n* fix: 将currentScript提取到setTimeout外部以避免空指针 by @Anyexyz in https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fpull\u002F384\r\n* fix(isInteractiveCandidate): use hasAttribute with known aria list to detect aria- attributes by @lgy2020 in https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fpull\u002F356\r\n* fix(page-controller): apply scroll direction to 
pixels parameter by @mvanhorn in https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fpull\u002F332\r\n* fix(ext): guard postMessage listeners against iframe sources by @gaomeng1900 in https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fpull\u002F389\r\n* fix: recognize role=\"listitem\" as interactive element by @Lubrsy706 in https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fpull\u002F203\r\n* fix(controller): treat interactive with aria as distinct by @gaomeng1900 in https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fpull\u002F396\r\n* docs: lm studio by @gaomeng1900 in https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fpull\u002F398\r\n* docs(website): add qwen3.6-plus to models page by @gaomeng1900 in https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fpull\u002F385\r\n\r\n## New Contributors\r\n* @Anyexyz made their first contribution in https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fpull\u002F384\r\n* @lgy2020 made their first contribution in https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fpull\u002F356\r\n* @mvanhorn made their first contribution in https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fpull\u002F332\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fcompare\u002Fv1.7.0...v1.7.1","2026-04-03T18:16:38",{"id":159,"version":160,"summary_zh":161,"released_at":162},103720,"v1.7.0","## What's Changed\r\n* Enhance `clickElement` action by @gaomeng1900 in https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fpull\u002F378\r\n* Fix #361 #199 #263\r\n* chore(deps-dev): bump the development-dependencies group with 3 updates by @dependabot[bot] in https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fpull\u002F371\r\n* chore(deps): bump @modelcontextprotocol\u002Fsdk from 1.27.1 to 1.29.0 in the production-dependencies group by @dependabot[bot] in 
https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fpull\u002F370\r\n* chore(deps): bump the github-actions group with 2 updates by @dependabot[bot] in https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fpull\u002F369\r\n\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fcompare\u002Fv1.6.3...v1.7.0","2026-03-31T14:00:45",{"id":164,"version":165,"summary_zh":166,"released_at":167},103721,"v1.6.3","## What's Changed\r\n* ❗fix(ext): new tabs are not detected by content scripts by @gaomeng1900 in https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fpull\u002F363\r\n* feat(ext): `experimentalIncludeAllTabs` - control all tabs by @gaomeng1900 in https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fpull\u002F363\r\n* feat(ext): fix EmptyState animation by @1245040330 in https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fpull\u002F362\r\n* docs: simplify docs by @gaomeng1900 in https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fpull\u002F365\r\n\r\n## New Contributors\r\n* @1245040330 made their first contribution in https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fpull\u002F362\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fcompare\u002Fv1.6.2...v1.6.3","2026-03-30T14:36:13",{"id":169,"version":170,"summary_zh":171,"released_at":172},103722,"v1.6.2","## What's Changed\r\n* chore(page-controller): export actions as internal methods by @zfangqijun in https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fpull\u002F310\r\n* fix(ui): set task input max length to 1000 by @Gujiassh in https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fpull\u002F292\r\n* chore(deps-dev): bump lucide-react from 0.577.0 to 1.0.1 by @dependabot[bot] in https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fpull\u002F342\r\n* chore(deps): bump ws from 8.19.0 to 8.20.0 in the production-dependencies group by 
@dependabot[bot] in https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fpull\u002F340\r\n* chore(deps-dev): bump the development-dependencies group with 7 updates by @dependabot[bot] in https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fpull\u002F341\r\n\r\n## New Contributors\r\n* @zfangqijun made their first contribution in https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fpull\u002F310\r\n* @Gujiassh made their first contribution in https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fpull\u002F292\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fcompare\u002Fv1.6.0...v1.6.2","2026-03-24T17:10:02",{"id":174,"version":175,"summary_zh":176,"released_at":177},103723,"v1.6.0","## Features\r\n- **Beta MCP support** - New `@page-agent\u002Fmcp` package lets MCP clients such as Claude Desktop and Copilot control the browser through the Page Agent extension\r\n- **Better iframe handling** - Same-origin iframe elements are handled more reliably during DOM extraction and actions\r\n- **Extension history workflows** - Users can rerun past tasks, export history sessions as JSON, and approve MCP-triggered tasks before execution\r\n\r\n## What's Changed\r\n* feat: optional AK by @gaomeng1900 in https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fpull\u002F311\r\n* fix: add execCommand fallback for contenteditable input by @voidborne-d in https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fpull\u002F210\r\n* fix(PageController): add `mouseleave` event by @gaomeng1900 in https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fpull\u002F319\r\n* feat(ext): rerun tasks from history by @Adonis0123 in https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fpull\u002F314\r\n* feat(extension): export history sessions as json by @Adonis0123 in https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fpull\u002F313\r\n* chore(ext): rm keydown event on history 
by @gaomeng1900 in https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fpull\u002F321\r\n* feat: option to disable named tool choice by @gaomeng1900 in https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fpull\u002F322\r\n* fix(PageController): same-origin iframe actions by @gaomeng1900 in https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fpull\u002F325\r\n\r\n## New Contributors\r\n* @voidborne-d made their first contribution in https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fpull\u002F210\r\n* @Adonis0123 made their first contribution in https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fpull\u002F314\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fcompare\u002Fv1.5.11...v1.6.0","2026-03-20T18:36:19",{"id":179,"version":180,"summary_zh":181,"released_at":182},103724,"v1.5.11","## What's Changed\r\n\r\n🧪 MCP(beta) is here! Go test it!\r\n\r\n> Chrome web store is still reviewing the v1.5.11 extension. 
\r\n> Should be listed later today.\r\n> Ext >= 1.5.11 is needed for the MCP to work.\r\n\r\nhttps:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Ftree\u002Fmain\u002Fpackages\u002Fmcp\r\n\r\n* feat: mcp (WIP) by @gaomeng1900 in https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fpull\u002F283 \r\n* chore(deps-dev): bump the development-dependencies group with 10 updates by @dependabot[bot] in https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fpull\u002F274\r\n* feat: add MiniMax model support with temperature clamping by @octo-patch in https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fpull\u002F221\r\n* feat: upgrade MiniMax default model to M2.7 by @octo-patch in https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fpull\u002F294\r\n\r\n## New Contributors\r\n* @octo-patch made their first contribution in https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fpull\u002F221\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fcompare\u002Fv1.5.8...v1.5.11","2026-03-18T13:10:45",{"id":184,"version":185,"summary_zh":186,"released_at":187},103725,"v1.5.8","## What's Changed\r\n* feat: add stepDelay config option by @linked-danis in https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fpull\u002F250\r\n* fix: type-safe scrollIntoViewIfNeeded by @linked-danis in https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fpull\u002F251\r\n* feat(ext): initial controlled group by @gaomeng1900 in https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fpull\u002F273\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fcompare\u002Fv1.5.7...v1.5.8","2026-03-16T14:56:45",{"id":189,"version":190,"summary_zh":191,"released_at":192},103726,"v1.5.7","## What's Changed\r\n* fix: extract attributes for heuristically-detected interactive elements by @Lubrsy706 in 
https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fpull\u002F202\r\n* feat(website): basic SEO by @gaomeng1900 in https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fpull\u002F229\r\n* fix: validate URL in fetchLlmsTxt by @linked-danis in https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fpull\u002F247\r\n* refactor: SimulatorMask use CSS classes by @linked-danis in https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fpull\u002F248\r\n* fix: typos and grammar in system prompts and source code by @Wizard-Guido in https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fpull\u002F236\r\n\r\n## New Contributors\r\n* @linked-danis made their first contribution in https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fpull\u002F247\r\n* @Lubrsy706 made their first contribution in https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fpull\u002F202\r\n* @Wizard-Guido made their first contribution in https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fpull\u002F236\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fcompare\u002Fv1.5.6...v1.5.7","2026-03-13T14:07:51",{"id":194,"version":195,"summary_zh":196,"released_at":197},103727,"v1.5.6","## What's Changed\r\n* fix(ui): escape array content in cards and fix double-escaping by @RinZ27 in https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fpull\u002F191\r\n* feat: extension use the same version as packages by @gaomeng1900 in https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fpull\u002F209\r\n\r\n## New Contributors\r\n* @RinZ27 made their first contribution in https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fpull\u002F191\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fcompare\u002Fv1.5.5...v1.5.6","2026-03-11T16:56:12",{"id":199,"version":200,"summary_zh":201,"released_at":202},103728,"v1.5.5","## What's Changed\r\n* chore(deps-dev): bump the development-dependencies group with 10 updates by 
@dependabot[bot] in https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fpull\u002F177\r\n* fix(page-controller): honor viewportExpansion in DOM extraction by @fancyboi999 in https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fpull\u002F181\r\n* fix(page-controller): improve contenteditable input with proper events by @JasonOA888 in https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fpull\u002F179\r\n\r\n## 🐙 Extension\r\n\r\n0.1.17\r\n\r\n## New Contributors\r\n* @fancyboi999 made their first contribution in https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fpull\u002F181\r\n* @JasonOA888 made their first contribution in https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fpull\u002F179\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fcompare\u002Fv1.5.4...v1.5.5","2026-03-10T14:52:27",{"id":204,"version":205,"summary_zh":206,"released_at":207},103729,"v1.5.4","## What's Changed\r\n\r\n* fix: contenteditable by @gaomeng1900 in https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fpull\u002F172\r\n* feat: support wildcard in `includeAttributes` by @gaomeng1900 in https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fpull\u002F173\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fcompare\u002Fv1.5.3...v1.5.4","2026-03-09T15:36:14",{"id":209,"version":210,"summary_zh":211,"released_at":212},103730,"v1.5.3","## What's Changed\r\n* fix: add button to clear saved configuration from the error boundary by @gaomeng1900 in https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fpull\u002F160\r\n* fix(llms): avoid reasoning_effort for GPT-5.4 chat tools by @tsubasakong in https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fpull\u002F169\r\n\r\n## New Contributors\r\n* @tsubasakong made their first contribution in https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fpull\u002F169\r\n* @hobostay 
made their first contribution in https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fpull\u002F156\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Fcompare\u002Fv1.5.2...v1.5.3","2026-03-09T09:23:36",{"id":214,"version":215,"summary_zh":216,"released_at":217},103731,"EXT_v0.1.16","see https:\u002F\u002Fgithub.com\u002Falibaba\u002Fpage-agent\u002Freleases\u002Ftag\u002Fv1.5.3","2026-03-09T09:32:18",{"id":219,"version":220,"summary_zh":221,"released_at":222},103732,"EXT_v0.1.15","fix: add button to clear saved configuration from the error boundary","2026-03-07T15:34:51",{"id":224,"version":225,"summary_zh":226,"released_at":227},103733,"v1.5.2","### Breaking Changes\r\n\r\n- **`data-browser-use-ignore` → `data-page-agent-ignore`** - DOM ignore attribute renamed to match the project identity\r\n- **Config types restructured** - `PageAgentConfig` split into `AgentConfig` + `PageAgentCoreConfig`; config definitions moved from `config\u002Findex.ts` to `types.ts`\r\n- **Zod v3\u002Fv4 dual support** - Libraries now accept both `zod@^3.25` and `zod@^4.0` as peer dependencies\r\n\r\n### Features\r\n\r\n- **Experimental `llms.txt` support** - Agent can fetch and include a site's `llms.txt` in context. 
Enable via `experimentalLlmsTxt: true`\r\n\r\n### Improvements\r\n\r\n- Default `maxSteps` changed from 20 to 40 to better handle complex tasks out of the box\r\n- Added 400ms wait between agent steps for page reactions\r\n- Increased click wait time (100ms → 200ms) for more reliable interactions\r\n- Removed debug `console.log` statements from scroll actions\r\n- Reset observations on new task start\r\n- Improved logging across packages","2026-03-05T12:45:46",{"id":229,"version":230,"summary_zh":231,"released_at":232},103734,"EXT_v0.1.12","fix: crash when webgl2 not available","2026-03-06T19:24:12",{"id":234,"version":235,"summary_zh":236,"released_at":237},103735,"EXT_v0.1.11","> PageAgent 1.5.1\r\n\r\n- **Advanced config panel** - New collapsible section exposing Max Steps, System Instruction, and experimental `llms.txt` toggle\r\n- Streamlined User Auth Token description\r\n- Moved testing API notice below auth token section","2026-03-05T12:49:56",{"id":239,"version":240,"summary_zh":241,"released_at":242},103736,"v1.4.0","### Features\r\n\r\n- Update Terms of Use and Privacy Policy\r\n- **Robust tool-call validation** - Action inputs are now validated against tool schemas individually, producing clear error messages (e.g. 
`Invalid input for action \"click_element_by_index\"`) instead of unreadable union parse errors\r\n- **Primitive action input coercion** - Small models that output `{\"click_element_by_index\": 2}` instead of `{\"click_element_by_index\": {\"index\": 2}}` are now auto-corrected using tool schemas\r\n- **Qwen model updates** - Added `qwen3.5-plus` as the default free testing model; disabled `enable_thinking` for Qwen models to avoid incompatible responses\r\n- **Updated default LLM endpoint** - Migrated demo and extension to a new testing endpoint with legacy endpoint auto-migration\r\n\r\n### Improvements\r\n\r\n- Unified zod imports (`* as z`) across all packages for consistency\r\n- Better Zod error formatting with `z.prettifyError()` in LLM client\r\n- Exported `InvokeError` and `InvokeErrorType` as values (not just types) from `@page-agent\u002Fllms`\r\n- Exported `SupportedLanguage` type from `@page-agent\u002Fcore`","2026-02-27T14:07:57",{"id":244,"version":245,"summary_zh":246,"released_at":247},103737,"EXT_v0.1.8","- Update PageAgent to 1.4.0\r\n- **Language setting** - Added language selector (System \u002F English \u002F 中文) in config panel\r\n- **UI makeover** - New empty state with breathing glow and typing animation; ai-motion glow overlay while running; refined focus styles\r\n- **Testing endpoint notice** - Shows terms of use notice when using the free testing API\r\n- **Legacy endpoint migration** - Auto-migrates old Supabase testing endpoint to new endpoint on startup\r\n","2026-02-27T14:09:56",{"id":249,"version":250,"summary_zh":251,"released_at":252},103738,"EXT_v0.1.7","- Update page agent version\r\n- Update UX\r\n- Add locales","2026-02-14T09:09:09"]