[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-SomeOddCodeGuy--WilmerAI":3,"tool-SomeOddCodeGuy--WilmerAI":61},[4,18,26,36,44,53],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":17},4358,"openclaw","openclaw\u002Fopenclaw","OpenClaw 是一款专为个人打造的本地化 AI 助手，旨在让你在自己的设备上拥有完全可控的智能伙伴。它打破了传统 AI 助手局限于特定网页或应用的束缚，能够直接接入你日常使用的各类通讯渠道，包括微信、WhatsApp、Telegram、Discord、iMessage 等数十种平台。无论你在哪个聊天软件中发送消息，OpenClaw 都能即时响应，甚至支持在 macOS、iOS 和 Android 设备上进行语音交互，并提供实时的画布渲染功能供你操控。\n\n这款工具主要解决了用户对数据隐私、响应速度以及“始终在线”体验的需求。通过将 AI 部署在本地，用户无需依赖云端服务即可享受快速、私密的智能辅助，真正实现了“你的数据，你做主”。其独特的技术亮点在于强大的网关架构，将控制平面与核心助手分离，确保跨平台通信的流畅性与扩展性。\n\nOpenClaw 非常适合希望构建个性化工作流的技术爱好者、开发者，以及注重隐私保护且不愿被单一生态绑定的普通用户。只要具备基础的终端操作能力（支持 macOS、Linux 及 Windows WSL2），即可通过简单的命令行引导完成部署。如果你渴望拥有一个懂你",349277,3,"2026-04-06T06:32:30",[13,14,15,16],"Agent","开发框架","图像","数据工具","ready",{"id":19,"name":20,"github_repo":21,"description_zh":22,"stars":23,"difficulty_score":10,"last_commit_at":24,"category_tags":25,"status":17},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,"2026-04-05T11:01:52",[14,15,13],{"id":27,"name":28,"github_repo":29,"description_zh":30,"stars":31,"difficulty_score":32,"last_commit_at":33,"category_tags":34,"status":17},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",155373,2,"2026-04-14T11:34:08",[14,13,35],"语言模型",{"id":37,"name":38,"github_repo":39,"description_zh":40,"stars":41,"difficulty_score":32,"last_commit_at":42,"category_tags":43,"status":17},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",108322,"2026-04-10T11:39:34",[14,15,13],{"id":45,"name":46,"github_repo":47,"description_zh":48,"stars":49,"difficulty_score":32,"last_commit_at":50,"category_tags":51,"status":17},6121,"gemini-cli","google-gemini\u002Fgemini-cli","gemini-cli 是一款由谷歌推出的开源 AI 命令行工具，它将强大的 Gemini 大模型能力直接集成到用户的终端环境中。对于习惯在命令行工作的开发者而言，它提供了一条从输入提示词到获取模型响应的最短路径，无需切换窗口即可享受智能辅助。\n\n这款工具主要解决了开发过程中频繁上下文切换的痛点，让用户能在熟悉的终端界面内直接完成代码理解、生成、调试以及自动化运维任务。无论是查询大型代码库、根据草图生成应用，还是执行复杂的 Git 操作，gemini-cli 都能通过自然语言指令高效处理。\n\n它特别适合广大软件工程师、DevOps 
人员及技术研究人员使用。其核心亮点包括支持高达 100 万 token 的超长上下文窗口，具备出色的逻辑推理能力；内置 Google 搜索、文件操作及 Shell 命令执行等实用工具；更独特的是，它支持 MCP（模型上下文协议），允许用户灵活扩展自定义集成，连接如图像生成等外部能力。此外，个人谷歌账号即可享受免费的额度支持，且项目基于 Apache 2.0 协议完全开源，是提升终端工作效率的理想助手。",100752,"2026-04-10T01:20:03",[52,13,15,14],"插件",{"id":54,"name":55,"github_repo":56,"description_zh":57,"stars":58,"difficulty_score":32,"last_commit_at":59,"category_tags":60,"status":17},4721,"markitdown","microsoft\u002Fmarkitdown","MarkItDown 是一款由微软 AutoGen 团队打造的轻量级 Python 工具，专为将各类文件高效转换为 Markdown 格式而设计。它支持 PDF、Word、Excel、PPT、图片（含 OCR）、音频（含语音转录）、HTML 乃至 YouTube 链接等多种格式的解析，能够精准提取文档中的标题、列表、表格和链接等关键结构信息。\n\n在人工智能应用日益普及的今天，大语言模型（LLM）虽擅长处理文本，却难以直接读取复杂的二进制办公文档。MarkItDown 恰好解决了这一痛点，它将非结构化或半结构化的文件转化为模型“原生理解”且 Token 效率极高的 Markdown 格式，成为连接本地文件与 AI 分析 pipeline 的理想桥梁。此外，它还提供了 MCP（模型上下文协议）服务器，可无缝集成到 Claude Desktop 等 LLM 应用中。\n\n这款工具特别适合开发者、数据科学家及 AI 研究人员使用，尤其是那些需要构建文档检索增强生成（RAG）系统、进行批量文本分析或希望让 AI 助手直接“阅读”本地文件的用户。虽然生成的内容也具备一定可读性，但其核心优势在于为机器",93400,"2026-04-06T19:52:38",[52,14],{"id":62,"github_repo":63,"name":64,"description_en":65,"description_zh":66,"ai_summary_zh":67,"readme_en":68,"readme_zh":69,"quickstart_zh":70,"use_case_zh":71,"hero_image_url":72,"owner_login":73,"owner_name":74,"owner_avatar_url":75,"owner_bio":76,"owner_company":77,"owner_location":78,"owner_email":77,"owner_twitter":73,"owner_website":79,"owner_url":80,"languages":81,"stars":93,"forks":94,"last_commit_at":95,"license":96,"difficulty_score":97,"env_os":98,"env_gpu":99,"env_ram":100,"env_deps":101,"category_tags":106,"github_topics":107,"view_count":32,"oss_zip_url":77,"oss_zip_packed_at":77,"status":17,"created_at":111,"updated_at":112,"faqs":113,"releases":144},7588,"SomeOddCodeGuy\u002FWilmerAI","WilmerAI","WilmerAI is one of the oldest LLM semantic routers. It uses multi-layer prompt routing and complex workflows to allow you to not only create practical chatbots, but to extend any kind of application that connects to an LLM via REST API. Wilmer sits between your app and your many LLM APIs, so that you can manipulate prompts as needed.","WilmerAI 是一款专注于大语言模型（LLM）语义路由与任务编排的开源中间件。它部署在你的应用程序与多个 LLM API 之间，充当智能流量管家，能够根据需求灵活操控提示词（Prompt）。\n\n传统路由工具往往仅依据单个关键词分类请求，而 WilmerAI 的核心优势在于其强大的上下文理解能力。它能分析完整的对话历史，精准识别用户意图。例如，当用户追问“这意味着什么”时，它能结合前文关于“罗塞塔石碑”的讨论，将其判定为历史查询而非普通闲聊，从而路由到最合适的处理流程。\n\n这一能力源于其独特的基于节点的工作流引擎。开发者可通过 JSON 文件定义复杂的多层路由逻辑，每个节点不仅能调度不同的 LLM，还能调用外部工具、运行自定义脚本或嵌套其他工作流。对前端应用而言，这些复杂的后端逻辑仅表现为一次标准的 API 调用，无需修改现有代码即可实现功能扩展。此外，WilmerAI 近期还更新了多用户隔离支持、并发控制以及图像直通功能。\n\nWilmerAI 主要面向需要构建高级聊天机器人或集成 LLM 功能的开发者与技术研究人员。如果你希望在不重构前端架构的前提下，为应用赋予更聪明的意图识别能力","WilmerAI 是一款专注于大语言模型（LLM）语义路由与任务编排的开源中间件。它部署在你的应用程序与多个 LLM API 之间，充当智能流量管家，能够根据需求灵活操控提示词（Prompt）。\n\n传统路由工具往往仅依据单个关键词分类请求，而 WilmerAI 的核心优势在于其强大的上下文理解能力。它能分析完整的对话历史，精准识别用户意图。例如，当用户追问“这意味着什么”时，它能结合前文关于“罗塞塔石碑”的讨论，将其判定为历史查询而非普通闲聊，从而路由到最合适的处理流程。\n\n这一能力源于其独特的基于节点的工作流引擎。开发者可通过 JSON 文件定义复杂的多层路由逻辑，每个节点不仅能调度不同的 LLM，还能调用外部工具、运行自定义脚本或嵌套其他工作流。对前端应用而言，这些复杂的后端逻辑仅表现为一次标准的 API 调用，无需修改现有代码即可实现功能扩展。此外，WilmerAI 近期还更新了多用户隔离支持、并发控制以及图像直通功能。\n\nWilmerAI 主要面向需要构建高级聊天机器人或集成 LLM 功能的开发者与技术研究人员。如果你希望在不重构前端架构的前提下，为应用赋予更聪明的意图识别能力和复杂的工作流 orchestration 能力，WilmerAI 是一个值得尝试的解决方案。需注意，该项目目前仍处于开发阶段，适合愿意探索前沿技术的用户。","# WilmerAI\n\n*\"What If Language Models Expertly Routed All Inference?\"*\n\n## DISCLAIMER:\n\n> This project is still under development. 
The software is provided as-is, without warranty of any kind.\n>\n> This project and any expressed views, methodologies, etc., found within are the result of contributions by the\n> maintainer and any contributors in their free time and on their personal hardware, and should not reflect upon\n> any of their employers.\n>\n> [The maintainer of this project, SomeOddCodeGuy, is not doing any Contract, Freelance, or Collaboration\n> work.](https:\u002F\u002Fgithub.com\u002FSomeOddCodeGuy#disclaimer)\n\n---\n\n## What is WilmerAI?\n\nWilmerAI is an application designed for advanced semantic prompt routing and complex task orchestration. It\noriginated from the need for a router that could understand the full context of a conversation, rather than just the\nmost recent message.\n\nUnlike simple routers that might categorize a prompt based on a single keyword, WilmerAI's routing system can analyze\nthe entire conversation history. This allows it to understand the true intent behind a query like \"What do you think it\nmeans?\", recognizing it as a historical query if that statement was preceded by a discussion about the Rosetta Stone,\nrather than merely conversational.\n\nThis contextual understanding is made possible by its core: a **node-based workflow engine**. Like the rest of Wilmer,\nthe routing is a workflow, categorizing through a sequence of steps, or \"nodes\", defined in a JSON file.\nThe route chosen kicks off another specialized workflow, which can call more workflows from there. Each node can\norchestrate different LLMs, call external tools, run custom scripts, call other workflows, and many other things.\n\nTo the client application, this entire multi-step process appears as a standard API call, enabling advanced backend\nlogic without requiring changes to your existing front-end tools.\n\n---\n## Maintainer's Note Addendum - UPDATED 2026-04-12\n\n> A year and a half after it was first requested, I finally have tool calling support in here. This\n> was something that I was regularly putting off because of how challenging it was to add in.\n> \n> What this means, and how I've been using it the past two weeks: you can jam Wilmer in between something\n> like OpenCode and Llama.cpp. I've been working on improving OpenCode quality using Qwen 27b and 122b\n> by creating workflows that the OpenCode calls pass through. It slows everything down a lot, but the\n> result is far less engagement from me because it gets things right in far fewer tries.\n> \n> I'm going to tinker with these workflows for a month or so and then start putting them out for folks.\n> Updating the workflows here is next on the list.\n> \n> Also, another big change: finally added image passthrough for the standard node. When I first put the\n> ImageProcessor in, vision models were still really new. Now they're everywhere, so I've reworked the\n> ImageProcessor to be more tailored towards the specific purpose of long-term efficiency, while\n> the Standard node can be used instead to just send images to a model like normal.\n\n\n## Maintainer's Note - UPDATED 2026-03-29\n\n> I've been on a tear with Wilmer lately, and this is probably the biggest batch of changes since\n> the workflow engine refactor. The short version: **you don't need to run multiple Wilmer instances\n> anymore.**\n>\n> That was always the thing that bugged me the most about how Wilmer worked. You'd end up with\n> this pile of running instances, each with their own config, and it was a pain to manage. 
So I\n> finally sat down and fixed it.\n>\n> Here's what's new:\n>\n> - **Multi-user support.** You can now launch Wilmer with `--User alice --User bob` (as many as\n>   you need), and each user gets their own config, conversation files, memories, and log\n>   directory. Wilmer figures out who's making the request and routes everything to the right\n>   place.\n>\n> - **Concurrency controls.** The `--concurrency` and `--concurrency-timeout` flags when starting\n>   the server let you gate how many simultaneous requests are run at once.\n>   By default only one request processes at a time (which is what you want for most local\n>   Mac setups), and anything else queues up instead of stepping on each other. You can crank it\n>   up if your backend can handle it, like NVIDIA setups.\n>\n> - **Per-user file isolation.** Discussion ID files and some other per-session stuff now live in\n>   user-specific directories. When you've got multiple users on one instance, this keeps\n>   everyone's files from piling into one big folder.\n>\n> - **API key support.** If a request comes in with an `Authorization: Bearer` key, Wilmer uses\n>   that to bundle files into isolated per-key directories. This is a second layer of bundling\n>   on top of the per-user isolation.\n>\n> - **EXPERIMENTAL: Optional encryption.** You can enable per-api-key Fernet encryption for stored files. If you\n>   turn it on, then Wilmer will use your API key to encrypt the loose files it generates.\n>   There's also a re-keying script if you ever need to rotate keys. (NOTE: Doesn't yet affect SQLite DBs)\n>\n> - **More memory and context options.** I've added a couple of new tools for managing long\n>   conversations: an automatic memory condensation layer for file-based memories, and a\n>   ContextCompactor workflow node for token-aware conversation compaction. More on those in the\n>   docs, but the memory condenser is a big help on long chats. Short version: it generates N\n>   memories, and when it hits that count it will take those N memories and rewrite them\n>   as 1 memory, then keep going. So if you do 3, then it writes 3, rewrites them down to be 1, then\n>   does 3 more, rewrites those 3 as 1, etc. So instead of 6 memories, you get 2. If you are writing\n>   memories every 10,000 tokens, that's 60,000 tokens summarized down to two small 500-1000 token memories.\n>\n> - **Image handling improvements.** Fixed a longstanding design issue in Wilmer's ImageProcessor so that\n>   images are now tracked per-message from the moment they come\n>   in all the way through to LLM dispatch, so they stay tied to the conversation turn that\n>   produced them. This also allowed me to add caching in the image processor (when a discussionId is active)\n>   so recurring image calls don't\n>   have to reprocess the same data every time.\n>\n> In a previous recent release, I also added **shared workflow collections and workflow selection\n> via the API model field.** The `\u002Fv1\u002Fmodels` and `\u002Fapi\u002Ftags` endpoints now return your available\n> workflows, which means front-ends like Open WebUI will show them right in the model dropdown.\n> You just pick the workflow you want the same way you'd pick a model. Shared workflow folders\n> (`_shared\u002F`) let multiple users point at the same workflow sets without duplicating config all\n> over the place, but also let one user have a bunch of workflows under it. 
So instead of having\n> a coding workflow as one user, a general workflow as another, etc, you get one user with multiple\n> available workflows under it.\n>\n> ![Shared Workflows in Open WebUI](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FSomeOddCodeGuy_WilmerAI_readme_7c1a7a8ca181.png)\n>\n> The example users and workflows that ship with Wilmer are overdue for an update to reflect all\n> of this. That's next on my list.\n>\n> -Socg\n\n## The Power of Workflows\n\n### Semi-Autonomous Workflows Let You Determine Which Tools Run, and When\n\nThe gif below shows Open WebUI connected to 2 instances of Wilmer (recorded before multi-user support was added; a single\ninstance can now serve multiple users). The first instance just hits Mistral Small 3 24b directly, and then the second\ninstance makes a call to the [Offline Wikipedia API](https:\u002F\u002Fgithub.com\u002FSomeOddCodeGuy\u002FOfflineWikipediaTextApi) before\nmaking the call to the same model.\n\n![No-RAG vs RAG](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FSomeOddCodeGuy_WilmerAI_readme_32d62cf1cfb9.gif)\n*Click the image to play the gif if it doesn't start automatically*\n\n### Iterative LLM Calls To Improve Performance\n\nA zero-shot to an LLM may not give great results, but follow-up questions will often improve them. If you\nregularly ask\n[the same follow-up questions when doing tasks like software development](https:\u002F\u002Fwww.someoddcodeguy.dev\u002Fmy-personal-guide-for-developing-software-with-ai-assistance\u002F),\ncreating a workflow to automate those steps can have great results.\n\n### Distributed LLMs\n\nWith workflows, you can have as many LLMs working together in a single call as you have computers to support them.\nFor example, do you have old machines lying around that can run 3-8b models? You can put them to use as worker LLMs in\nvarious nodes. The more LLM APIs that you have available to you, either on your own home hardware or via proprietary\nAPIs, the more powerful you can make your workflow network. A single prompt to Wilmer could reach out to 5+ computers,\nincluding proprietary APIs, depending on how you build your workflow.\n\n## Some (Not So Pretty) Pictures to Help People Visualize What It Can Do\n\n#### Example of A Simple Assistant Workflow Using the Prompt Router\n\n![Single Assistant Routing to Multiple LLMs](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FSomeOddCodeGuy_WilmerAI_readme_421da95014a7.jpg)\n\n#### Example of How Routing Might Be Used\n\n![Prompt Routing Example](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FSomeOddCodeGuy_WilmerAI_readme_c65c70228eef.png)\n\n#### Group Chat to Different LLMs\n\n![Groupchat to Different LLMs](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FSomeOddCodeGuy_WilmerAI_readme_eed49478181b.png)\n\n#### Example of a UX Workflow Where A User Asks for a Website\n\n![Oversimplified Example Coding Workflow](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FSomeOddCodeGuy_WilmerAI_readme_e7da803ee913.jpg)\n\n## Key Features\n\n* **Advanced Contextual Routing**\n  The primary function of WilmerAI. It directs user requests using sophisticated, context-aware logic. 
This is handled\n  by two mechanisms:\n    * **Prompt Routing**: At the start of a conversation, it analyzes the user's prompt to select the most appropriate\n      specialized workflow (e.g., \"Coding,\" \"Factual,\" \"Creative\").\n    * **In-Workflow Routing**: During a workflow, it provides conditional \"if\u002Fthen\" logic, allowing a process to\n      dynamically choose its next step based on the output of a previous node.\n\n  Crucially, these routing decisions can be based on the **entire conversation history**, not just the user's last\n  messages, allowing for a much deeper understanding of intent.\n\n---\n\n* **Core: Node-Based Workflow Engine**\n  The foundation that powers the routing and all other logic. WilmerAI processes requests using workflows, which are\n  JSON files that define a sequence of steps (nodes). Each node performs a specific task, and its output can be passed\n  as input to the next, enabling complex, chained-thought processes.\n\n---\n\n* **Multi-LLM & Multi-Tool Orchestration**\n  Each node in a workflow can connect to a completely different LLM endpoint or execute a tool. This allows you to\n  orchestrate the best model for each part of a task -- for example, using a small, fast local model for summarization and\n  a large, powerful cloud model for the final reasoning, all within a single workflow.\n\n---\n\n* **Modular & Reusable Workflows**\n  You can build self-contained workflows for common tasks (like searching a database or summarizing text) and then\n  execute them as a single, reusable node inside other, larger workflows. This simplifies the design of complex agents.\n\n---\n\n* **Stateful Conversation Memory**\n  To provide the necessary context for long conversations and accurate routing, WilmerAI uses a three-part memory\n  system: a chronological summary file, a continuously updated \"rolling summary\" of the entire chat, and a searchable\n  vector database for Retrieval-Augmented Generation (RAG).\n\n---\n\n* **Adaptable API Gateway**\n  WilmerAI's \"front door.\" It exposes OpenAI- and Ollama-compatible API endpoints, allowing you to connect your existing\n  front-end applications and tools without modification.\n\n---\n\n* **Flexible Backend Connectors**\n  WilmerAI's \"back door.\" It connects to various LLM backends (OpenAI, Ollama, KoboldCpp) using a simple but powerful\n  configuration system of **Endpoints** (the address), **API Types** (the schema\u002Fdriver), and **Presets** (the\n  generation parameters).\n\n---\n\n- **MCP Server Tool Integration using MCPO:** New and experimental support for MCP\n  server tool calling using MCPO, allowing tool use mid-workflow. Big thank you\n  to [iSevenDays](https:\u002F\u002Fgithub.com\u002FiSevenDays)\n  for the amazing work on this feature. More info can be found in the [ReadMe](Public\u002Fmodules\u002FREADME_MCP_TOOLS.md)\n\n---\n\n- **Privacy First Development:** At its core, Wilmer is continually designed with the\n  principle of being completely private. Socg uses this application constantly, and doesn't\n  want his information getting blasted out to the net any more than anyone else does. 
As such,\n  every decision that is made is focused on the idea that the only incoming and outgoing calls\n  from Wilmer should be things that the user expects, and actively configured themselves.\n\n----\n\n#### Privacy Check -- 2026-03-29\n\nFor my own edification, to ensure I didn't accidentally add something that would negatively impact\nWilmer's privacy posture, I'll sometimes ask Claude Code to do an end-to-end check to look for any\noutbound calls or other data leakage. It's not as good as a formal code audit, but it gives me\npeace of mind. I've included the results of the check here.\n\nOn 2026-03-29, Claude Code (Claude Opus) was asked to search the codebase and report any outbound\nnetwork calls, telemetry, or other privacy-relevant behavior it could find. The results are listed\nbelow for transparency, but they are not a guarantee -- if privacy matters to your deployment,\nplease run your own analysis.\n\n```text\nWhat Was Checked\n----------------\nThe Middleware\u002F and Public\u002F source trees, the top-level entry points (server.py, run_eventlet.py,\nrun_waitress.py), and all shell\u002Fbatch launcher scripts (run_macos.sh, run_windows.bat,\nScripts\u002Frekey_encrypted_files.sh, Scripts\u002Frekey_encrypted_files.bat) were searched for outbound\nHTTP calls (requests.get, requests.post, requests.Session, requests.request), raw socket usage,\nsubprocess invocations, dynamic imports, hardcoded external URLs, telemetry-related keywords\n(analytics, telemetry, phone-home, tracking, metrics), and environment variable reads. The entry\npoints and launcher scripts contained no outbound network calls; run_eventlet.py sets TCP_NODELAY\non the local listening socket but makes no external connections.\n\nOutbound Network Calls\n----------------------\nEvery outbound HTTP call site found in the codebase:\n\n1. Middleware\u002Fllmapis\u002Fhandlers\u002Fbase\u002Fbase_llm_api_handler.py (lines 223, 373)\n   - session.post() to user-configured LLM endpoint (self.base_url from endpoint config)\n\n2. Middleware\u002Fworkflows\u002Ftools\u002Foffline_wikipedia_api_tool.py (lines 45, 80, 117, 153, 192)\n   - requests.get() to user-configured host; defaults to 127.0.0.1:5728\n   - Disabled unless activateWikiApi is set\n\n3. Public\u002Fmodules\u002Fmcp_service_discoverer.py (line 55)\n   - requests.get() to user-configured or env-var MCPO server; defaults to localhost:8889\n\n4. Public\u002Fmodules\u002Fmcp_tool_executor.py (line 235)\n   - requests.request() to same MCPO server as above\n\nNo telemetry, analytics, phone-home, auto-update, or hardcoded external URLs were found. All\noutbound connections target endpoints that the user explicitly configures.\n\nData Storage\n------------\n- JSON conversation\u002Fmemory files: Optionally encrypted at rest using Fernet (AES-128-CBC with\n  HMAC, PBKDF2 with 100k iterations) when encryptUsingApiKey is enabled.\n- SQLite databases: Used for vector memory and workflow locks. These are NOT encrypted, even\n  when the encryption feature is enabled.\n- Log files: At DEBUG level, logs may contain full prompts and LLM responses unless\n  redactLogOutput or encryptUsingApiKey is enabled in the user configuration.\n- Configuration files: May contain API keys in plaintext. 
These files are not encrypted by Wilmer.\n\nThird-Party Dependencies\n------------------------\nAll runtime dependencies from requirements.txt:\n\n  requests 2.33.0        - HTTP client for LLM API and tool calls\n  urllib3 2.6.3           - Transport layer for requests\n  scikit-learn 1.8.0      - TF-IDF vectorization for memory search\n  Flask 3.1.3             - HTTP server framework\n  Jinja2 3.1.6            - Template rendering for workflow prompts\n  Pillow 12.1.1           - Image format detection and processing\n  eventlet 0.40.4         - Async WSGI server (optional)\n  waitress 3.0.2          - Production WSGI server (optional)\n  cryptography 46.0.5     - Fernet encryption for stored data\n\nNo telemetry or analytics code was found in any of these packages' initialization paths as used\nby Wilmer.\n\nDynamic Code Loading\n--------------------\n- PythonModule workflow nodes execute user-provided Python scripts from the configured scripts\n  directory with the full privileges of the Wilmer process. These scripts are not sandboxed or\n  validated by Wilmer.\n- API handler discovery (Middleware\u002Fllmapis\u002Fhandlers\u002F) is internal-only and loads only from the\n  handlers directory within the application.\n\nImage URL Handling\n------------------\nWhen a conversation message contains an image referenced by HTTP URL, that URL is forwarded as-is\nto the configured LLM provider. Wilmer does not fetch the image itself.\n\nLimitations\n-----------\n1. This is a static source-code search performed by an AI (Claude Opus), not a formal third-party\n   security audit.\n2. Third-party library source code was not checked at the bytecode level. The results confirm only\n   that Wilmer's own code does not appear to initiate unexpected connections.\n3. PythonModule scripts are user-provided and can execute arbitrary code. Their behavior is outside\n   the scope of this check.\n4. Runtime network monitoring (e.g., packet capture) was not performed.\n5. SQLite databases used for vector memory are not encrypted, even when encryption is otherwise\n   enabled.\n6. Log files may contain full conversation content unless redaction is explicitly enabled.\n```\n\n> While I do not have the tools to make a 100% guarantee claim there is not a third party\n> library doing something I'm not expecting, I wanted to make a point\n> that this is something that is important to me. I highly recommend, if you have\n> any concerns, that you run your own analysis of the codebase and app. 
Please open an issue\n> if you ever find anything that I've missed.\n\n## User Documentation\n\nUser Documentation can be found by going to [\u002FDocs\u002FUser_Documentation\u002F](Docs\u002FUser_Documentation\u002FREADME.md)\n\n## Developer Documentation\n\nHelpful developer docs can be found in [\u002FDocs\u002FDeveloper_Docs\u002F](Docs\u002FDeveloper_Docs\u002FREADME.md)\n\n## Quick-ish Setup\n\n### YouTube Videos\n\n[![WilmerAI and Open WebUI Install on Fresh Windows 11 Desktop](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FSomeOddCodeGuy_WilmerAI_readme_1dfc034985a7.jpg)](https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=KDpbxHMXmTs \"WilmerAI and Open WebUI Install on Fresh Windows 11 Desktop\")\n\n### Guides\n\n#### WilmerAI\n\nHop into the [User Documents Setup Starting Guide](Docs\u002FUser_Documentation\u002FSetup\u002F_Getting-Start_Wilmer-Api.md) to get a\nstep-by-step rundown of how to quickly set up the API.\n\n\n#### Wilmer with Open WebUI\n\n[You can click here to find a written guide for setting up Wilmer with Open WebUI](Docs\u002FUser_Documentation\u002FSetup\u002FOpen-WebUI.md)\n\n#### Wilmer With SillyTavern\n\n[You can click here to find a written guide for setting up Wilmer with SillyTavern](Docs\u002FUser_Documentation\u002FSetup\u002FSillyTavern.md).\n\n\n---\n\n## Why Make WilmerAI?\n\nWilmer was kicked off in late 2023, during the Llama 2 era, to make maximum use of fine-tunes through routing.\nThe routers that existed at the time didn't handle semantic routing well: categorization was often based on a single\nkeyword and the last message only; but sometimes a single keyword isn't enough to describe a category, and the last\nmessage may lean too heavily on inferred meaning, or lack the context needed, to categorize appropriately.\n\nAlmost immediately after Wilmer was started, it became apparent that just routing wasn't enough: the finetunes were ok,\nbut nowhere near as smart as proprietary LLMs. However, when the LLMs were forced to iterate on the same task over and\nover, the quality of their responses tended to improve (as long as the prompt was well written). This meant that the\noptimal result wasn't routing just to have a single LLM one-shot the response, but rather sending the prompt to something\nmore complex.\n\nInstead of relying on unreliable autonomous agents, Wilmer became focused on semi-autonomous Workflows, giving the\nuser granular control of the path the LLMs take, and allowing maximum use of the user's own domain knowledge and\nexperience. This also meant that multiple LLMs could work together, orchestrated by the workflow itself,\nto come up with a single solution.\n\nRather than routing to a single LLM, Wilmer routes to many via a whole workflow.\n\nThis has allowed Wilmer's categorization to be far more complex and customizable than most routers. Categorization is\nhandled by user-defined workflows, with as many nodes and LLMs involved as the user wants, to break down the\nconversation and determine exactly what the user is asking for. This means the user can experiment with different\nprompting styles to try to make the router get the best result. Additionally, the routes are more than just keywords,\nbut rather full descriptions of what the route entails. Little is left to the LLM's \"imagination\". The goal is that\nany weakness in Wilmer's categorization can be corrected by simply modifying the categorization workflow. And once\nthat category is chosen? It goes to another workflow.\n\n
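To make the idea of full route descriptions concrete, here is an illustrative sketch of the concept only (the route names and wording below are hypothetical; this is not Wilmer's actual configuration schema):\n\n```python\n# Conceptual sketch: description-based routing, as opposed to bare keyword matching.\n# A router LLM is shown the whole conversation plus full route descriptions,\n# then asked to name the single best match.\nroutes = {\n    'FACTUAL': 'The user wants verifiable information about history, science, or events.',\n    'CODING': 'The user wants code written, debugged, reviewed, or explained.',\n    'CONVERSATIONAL': 'The user is chatting socially, with no informational goal.',\n}\n\nmenu = '; '.join(f'{name}: {desc}' for name, desc in routes.items())\nrouting_prompt = f'Given the entire conversation above, pick the one best route. Options: {menu}'\n```\n\n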
Eventually Wilmer became more about Workflows than routing, and an optional bypass was made to skip routing entirely.\nBecause of the small footprint, this means that users can run multiple instances of Wilmer: some hitting a workflow\ndirectly, while others use categorization and routing.\n\nWhile Wilmer may have been the first of its kind, many other semantic routers have since appeared, some of which are\nlikely faster and better. But this project will continue to be maintained for a long time to come, as the maintainer\nof the project still uses it as his daily driver, and has many more plans for it.\n\n## Wilmer API Endpoints\n\n### How Do You Connect To Wilmer?\n\nWilmer exposes several different APIs on the front end, allowing you to connect most applications in the LLM space\nto it.\n\nWilmer exposes the following APIs that other apps can connect to it with:\n\n- OpenAI Compatible v1\u002Fcompletions (*requires [Wilmer Prompt Template](Public\u002FConfigs\u002FPromptTemplates\u002Fwilmerai.json)*)\n- OpenAI Compatible chat\u002Fcompletions\n- Ollama Compatible api\u002Fgenerate (*requires [Wilmer Prompt Template](Public\u002FConfigs\u002FPromptTemplates\u002Fwilmerai.json)*)\n- Ollama Compatible api\u002Fchat\n\n### What Wilmer Can Connect To\n\nOn the backend, Wilmer is capable of connecting to various APIs, where it will send its prompts to LLMs. Wilmer\ncan currently connect to the following API types:\n\n- Claude API (Anthropic Messages API)\n- OpenAI Compatible v1\u002Fcompletions\n- OpenAI Compatible chat\u002Fcompletions\n- Ollama Compatible api\u002Fgenerate\n- Ollama Compatible api\u002Fchat\n- KoboldCpp Compatible api\u002Fv1\u002Fgenerate (*non-streaming generate*)\n- KoboldCpp Compatible \u002Fapi\u002Fextra\u002Fgenerate\u002Fstream (*streaming generate*)\n\nWilmer supports both streaming and non-streaming connections, and has been tested using both SillyTavern\nand Open WebUI.\n\n
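### Example: Calling Wilmer From a Script\n\nBecause the front end is OpenAI-compatible, any standard OpenAI client can talk to Wilmer. The sketch below is illustrative only: the base URL is an assumption (use whatever host and port your own instance actually listens on), and `coding-workflow` is a hypothetical name standing in for one of the workflows your instance exposes via `\u002Fv1\u002Fmodels`.\n\n```python\n# Minimal sketch: point an OpenAI client at a running Wilmer instance.\n# Assumptions: base_url matches your setup, and 'coding-workflow' is a\n# placeholder for a workflow name returned by the \u002Fv1\u002Fmodels endpoint.\nfrom openai import OpenAI\n\nclient = OpenAI(\n    base_url='http:\u002F\u002Flocalhost:8000\u002Fv1',  # assumption -- match your instance\n    api_key='placeholder',  # arbitrary unless per-key file bundling is in use\n)\n\nresponse = client.chat.completions.create(\n    model='coding-workflow',  # with shared workflows, this field selects a workflow\n    messages=[{'role': 'user', 'content': 'What does semantic routing mean?'}],\n)\nprint(response.choices[0].message.content)\n```\n\n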
## Maintainer's Note:\n\n> This project is being supported in my free time on my personal hardware. I do not have the ability to contribute to\n> this during standard business hours on\n> weekdays due to work, so my only times to make code updates are weekends, and some weekday late nights.\n>\n> If you find a bug or other issue, a fix may take a week or two to go out. I apologize in\n> advance if that ends up being the case, but please don't take it as meaning I am not taking the\n> issue seriously. In reality, I likely\n> won't have the ability to even look at the issue until the following Friday or Saturday.\n>\n> -Socg\n\n## IMPORTANT:\n\n> Please keep in mind that workflows, by their very nature, could make many calls to an API endpoint based on how you\n> set them up. WilmerAI does not track token usage, does not report accurate token usage via its API, nor offers any\n> viable way to monitor token usage. So if token usage tracking is important to you for cost reasons, please be sure to keep\n> track of how many tokens you are using via any dashboard provided to you by your LLM APIs, especially early on as you\n> get used to this software.\n>\n> Your LLM directly affects the quality of WilmerAI. This is an LLM-driven project, where the flows and outputs are\n> almost entirely dependent on the connected LLMs and their responses. If you connect Wilmer to a model that produces lower\n> quality outputs, or if your presets or prompt template have flaws, then Wilmer's overall output will be much lower\n> quality as well. It's not much different from agentic workflows in that way.\n\n---\n\n## Contact\n\nFor feedback, requests, or just to say hi, you can reach me at:\n\nWilmerAI.Project@gmail.com\n\n---\n\n## Third Party Libraries\n\nWilmerAI lists several third-party libraries in its requirements.txt and uses them via plain import statements; it does\nnot extend or modify the source of those libraries.\n\nThe libraries are:\n\n* Flask: https:\u002F\u002Fgithub.com\u002Fpallets\u002Fflask\u002F\n* requests: https:\u002F\u002Fgithub.com\u002Fpsf\u002Frequests\u002F\n* scikit-learn: https:\u002F\u002Fgithub.com\u002Fscikit-learn\u002Fscikit-learn\u002F\n* urllib3: https:\u002F\u002Fgithub.com\u002Furllib3\u002Furllib3\u002F\n* jinja2: https:\u002F\u002Fgithub.com\u002Fpallets\u002Fjinja\n* pillow: https:\u002F\u002Fgithub.com\u002Fpython-pillow\u002FPillow\n* eventlet: https:\u002F\u002Fgithub.com\u002Feventlet\u002Feventlet\n* waitress: https:\u002F\u002Fgithub.com\u002FPylons\u002Fwaitress\n* cryptography: https:\u002F\u002Fgithub.com\u002Fpyca\u002Fcryptography\n\nFurther information on their licensing can be found in the README of the ThirdParty-Licenses folder, along with the\nfull text of each license and their NOTICE files where applicable, with last-updated dates for each.\n\n## Wilmer License and Copyright\n\n    WilmerAI\n    Copyright (C) 2024-2026 Christopher Smith\n\n    This program is free software: you can redistribute it and\u002For modify\n    it under the terms of the GNU General Public License as published by\n    the Free Software Foundation, either version 3 of the License, or\n    (at your option) any later version.\n\n    This program is distributed in the hope that it will be useful,\n    but WITHOUT ANY WARRANTY; without even the implied warranty of\n    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the\n    GNU General Public License for more details.\n\n    You should have received a copy of the GNU General Public License\n    along with this program. 
If not, see \u003Chttps:\u002F\u002Fwww.gnu.org\u002Flicenses\u002F>.\n","# WilmerAI\n\n*\"如果语言模型能够完美地路由所有推理呢？\"*\n\n## 免责声明：\n\n> 本项目仍在开发中。软件按“原样”提供，不提供任何形式的担保。\n>\n> 本项目以及其中所表达的观点、方法论等，均由维护者及贡献者在业余时间、使用个人硬件完成，不应被视为代表其任何雇主的意见。\n>\n> [本项目的维护者 SomeOddCodeGuy 目前未从事任何合同、自由职业或合作工作。](https:\u002F\u002Fgithub.com\u002FSomeOddCodeGuy#disclaimer)\n\n---\n\n## WilmerAI 是什么？\n\nWilmerAI 是一款用于高级语义提示路由和复杂任务编排的应用程序。它的诞生源于对一种能够理解完整对话上下文、而不仅仅是最新消息的路由器的需求。\n\n与仅根据单个关键词对提示进行分类的简单路由器不同，WilmerAI 的路由系统可以分析整个对话历史。这使得它能够真正理解诸如“你觉得它是什么意思？”这样的问题背后的意图：如果该陈述之前有关于罗塞塔石碑的讨论，那么它会被识别为历史查询，而非单纯的日常对话。\n\n这种上下文理解能力得益于其核心——**基于节点的工作流引擎**。如同 Wilmer 的其他部分一样，路由本身也是一个工作流，通过 JSON 文件中定义的一系列步骤（即“节点”）来进行分类。选定的路由会启动另一个专门的工作流，而这个工作流又可以进一步调用其他工作流。每个节点都可以编排不同的大语言模型、调用外部工具、运行自定义脚本、调用其他工作流，以及其他多种操作。\n\n对于客户端应用而言，这一整套多步骤流程表现为一次标准的 API 调用，从而在无需修改现有前端工具的情况下实现高级后端逻辑。\n\n---\n## 维护者注记补充 - 更新于 2026 年 4 月 12 日\n\n> 在最初提出这一需求的一年半之后，我终于在此实现了工具调用支持。由于添加这项功能极具挑战性，我一直将其搁置。\n>\n> 这意味着什么，以及我在过去两周中的使用方式——你可以将 Wilmer 插入到 OpenCode 和 Llama.cpp 之间。我一直在利用 Qwen 27B 和 122B 提升 OpenCode 的质量，通过创建 OpenCode 调用经过的工作流来实现。虽然这样做会显著降低整体速度，但结果是我不再需要频繁干预，因为系统能够在更少的尝试中给出正确的答案。\n>\n> 我计划再花大约一个月时间优化这些工作流，随后便会将其分享给社区。接下来的工作将是更新此处的工作流。\n>\n> 另外还有一个重大变化：我终于为标准节点添加了图像直通功能。当初引入 ImageProcessor 时，视觉模型还非常新颖。如今它们已广泛应用，因此我对 ImageProcessor 进行了重新设计，使其更加贴合长期效率提升的目标；与此同时，标准节点则可用于像往常一样直接将图像传递给模型。\n\n## 维护者注 - 更新于 2026-03-29\n\n> 近期我一直在对 Wilmer 进行大规模改进，这次的改动可能是自工作流引擎重构以来最大的一次。简而言之：**你不再需要运行多个 Wilmer 实例了。**\n>\n> 这一直是我对 Wilmer 工作方式最不满的地方。你会看到一堆正在运行的实例，每个实例都有自己的配置文件，管理起来非常麻烦。所以我终于坐下来彻底解决了这个问题。\n>\n> 现在的新功能包括：\n>\n> - **多用户支持。** 你现在可以使用 `--User alice --User bob` 参数启动 Wilmer（可以根据需要添加任意数量的用户），每个用户都会拥有独立的配置文件、对话文件、记忆存储和日志目录。Wilmer 会自动识别请求来源，并将所有内容路由到对应的用户目录中。\n>\n> - **并发控制。** 在启动服务器时，你可以使用 `--concurrency` 和 `--concurrency-timeout` 标志来限制同时处理的请求数量。默认情况下，每次只处理一个请求（这在大多数本地 Mac 环境下是理想设置），其他请求会排队等待，而不会相互干扰。如果你的后端硬件足够强大，比如配备了 NVIDIA 显卡，也可以适当提高并发数。\n>\n> - **用户文件隔离。** 讨论 ID 文件以及其他与会话相关的数据现在都存储在用户专属的目录中。这样一来，即使在一个实例上有多位用户，他们的文件也不会混杂在一起。\n>\n> - **API 密钥支持。** 如果请求中包含 `Authorization: Bearer` 密钥，Wilmer 会根据该密钥将文件分组存储到独立的目录中。这是第二层文件隔离机制。\n>\n> - **实验性：可选加密功能。** 你可以为每个 API 密钥启用 Fernet 加密功能，用于保护存储的文件。一旦开启，Wilmer 会使用你的 API 密钥对生成的文件进行加密。此外，还提供了一个密钥轮换脚本，方便你在需要时更换密钥。（注意：目前尚未应用于 SQLite 数据库）\n>\n> - **更多记忆和上下文管理选项。** 我新增了几项工具来更好地管理长对话——一个基于文件的记忆自动压缩层，以及一个基于 Token 的对话压缩工作流节点 ContextCompactor。关于这些功能的详细说明请参阅文档，其中记忆压缩器在处理长时间对话时非常有用。它的原理是：当积累到一定数量的记忆条目时，它会将这些条目合并成一条新的记忆，然后继续重复这个过程。例如，如果你设置了每 10,000 个 Token 生成一条记忆，那么原本可能产生 6 条记忆的内容，最终会被压缩成 2 条较小的记忆（每条约 500–1,000 个 Token）。这样，60,000 个 Token 的内容就被浓缩成了两份简洁的小型记忆。\n>\n> - **图像处理优化。** 我修复了 Wilmer ImageProcessor 中长期存在的设计问题，现在图像从进入系统开始，直到发送给 LLM 处理的整个流程都会被逐条消息追踪记录，确保它们始终与生成它们的对话回合保持关联。这一改进还让我能够在图像处理器中引入缓存机制（当讨论 ID 处于活动状态时），从而避免重复处理相同的数据。\n>\n> 在之前的版本中，我还增加了 **共享工作流集合以及通过 API 模型字段选择工作流的功能。** `\u002Fv1\u002Fmodels` 和 `\u002Fapi\u002Ftags` 端点现在会返回你可用的工作流列表，这意味着像 Open WebUI 这样的前端界面可以直接在模型下拉菜单中显示这些工作流。你只需像选择模型一样，直接挑选所需的工作流即可。共享工作流文件夹（`_shared\u002F`）允许多个用户指向同一套工作流，而无需在各处重复配置；同时，单个用户也可以在其下管理多套工作流。这样一来，你就不再需要为不同用户分别设置编码工作流、通用工作流等，而是可以让一个用户拥有多种可用的工作流。\n>\n> ![Open WebUI 中的共享工作流](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FSomeOddCodeGuy_WilmerAI_readme_7c1a7a8ca181.png)\n>\n> 随 Wilmer 一起提供的示例用户和工作流已经很久没有更新了，接下来我会对其进行相应调整。\n>\n> -Socg\n\n## 工作流的强大之处\n\n### 半自主工作流让你掌控工具及其使用时机\n\n以下展示了 Open WebUI 连接到两个 Wilmer 实例的情形（录制于多用户支持功能加入之前；实际上，单个实例现在就可以服务多个用户）。第一个实例直接调用了 Mistral Small 3 24b 模型，而第二个实例则先调用了 [Offline Wikipedia API](https:\u002F\u002Fgithub.com\u002FSomeOddCodeGuy\u002FOfflineWikipediaTextApi)，然后再向同一个模型发起请求。\n\n![无 RAG 对比 
RAG](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FSomeOddCodeGuy_WilmerAI_readme_32d62cf1cfb9.gif)\n*点击图片即可播放 GIF，若未自动播放*\n\n### 迭代式 LLM 调用以提升性能\n\nLLM 的零次提示可能无法给出理想结果，但后续的追问通常能显著改善效果。如果你经常在执行诸如软件开发之类的任务时重复同样的追问步骤[参考我的个人指南：如何借助 AI 辅助进行软件开发](https:\u002F\u002Fwww.someoddcodeguy.dev\u002Fmy-personal-guide-for-developing-software-with-ai-assistance\u002F)，那么创建一个自动化这些步骤的工作流将会带来极大的收益。\n  \n### 分布式 LLM 网络\n\n借助工作流，你可以在一次调用中整合任意数量的 LLM，只要你的硬件设备能够支持即可。例如，如果你手头有一些旧电脑，能够运行 3–8B 规模的模型，那么完全可以将它们作为工作节点中的 LLM 使用。无论是在自家硬件上还是通过专有 API，你能接入的 LLM 接口越多，你的工作流网络就越强大。根据你的工作流设计，向 Wilmer 发出的一个提示就有可能同时触达 5 台甚至更多的计算机，包括各种专有 API。\n  \n## 一些不太美观但有助于理解其功能的示意图\n\n#### 使用提示路由器的简单助手工作流示例\n\n![单一助手路由至多个 LLM](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FSomeOddCodeGuy_WilmerAI_readme_421da95014a7.jpg)\n\n#### 路由功能的应用示例\n\n![提示路由示例](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FSomeOddCodeGuy_WilmerAI_readme_c65c70228eef.png)\n\n#### 向不同 LLM 发送群聊消息\n\n![群聊路由至不同 LLM](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FSomeOddCodeGuy_WilmerAI_readme_eed49478181b.png)\n\n#### 用户请求网站设计的 UX 工作流示例\n\n![过于简化的设计工作流示例](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FSomeOddCodeGuy_WilmerAI_readme_e7da803ee913.jpg)\n\n## 核心特性\n\n* **高级上下文路由**\n  WilmerAI 的核心功能。它使用复杂的、基于上下文的逻辑来引导用户请求。这一过程由两种机制处理：\n    * **提示路由**：在对话开始时，分析用户的输入提示，以选择最合适的专用工作流（例如“编码”、“事实性”、“创意性”）。\n    * **工作流内路由**：在工作流执行过程中，提供条件性的“如果\u002F那么”逻辑，使流程能够根据前一个节点的输出动态选择下一步。\n\n  关键的是，这些路由决策可以基于**整个对话历史**，而不仅仅是用户最后几条消息，从而实现对用户意图更深层次的理解。\n\n---\n\n* **核心：基于节点的工作流引擎**\n  路由及其他所有逻辑的基础。WilmerAI 使用工作流来处理请求，这些工作流是定义步骤序列（节点）的 JSON 文件。每个节点执行特定任务，其输出可作为下一个节点的输入，从而实现复杂的链式思维过程。\n\n---\n\n* **多模型与多工具编排**\n  工作流中的每个节点都可以连接到完全不同的大模型端点，或调用不同的工具。这使得你可以为任务的不同部分编排最适合的模型——例如，在单个工作流中，使用小型快速的本地模型进行摘要生成，再用大型强大的云端模型完成最终推理。\n\n---\n\n* **模块化与可重用的工作流**\n  你可以为常见任务（如数据库查询或文本摘要）构建独立的工作流，然后将其作为单个可重用节点嵌入到更大的工作流中。这样可以简化复杂智能体的设计。\n\n---\n\n* **有状态的对话记忆**\n  为了在长时间对话中提供必要的上下文并实现精准路由，WilmerAI 使用三部分记忆系统：按时间顺序记录的摘要文件、持续更新的整段聊天“滚动摘要”，以及用于检索增强生成（RAG）的可搜索向量数据库。\n\n---\n\n* **适应性强的 API 网关**\n  WilmerAI 的“入口”。它暴露兼容 OpenAI 和 Ollama 的 API 端点，允许你无需修改即可连接现有的前端应用和工具。\n\n---\n\n* **灵活的后端连接器**\n  WilmerAI 的“出口”。它通过一套简单但强大的配置系统连接到各种大模型后端（OpenAI、Ollama、KoboldCpp），该系统包括**端点**（地址）、**API 类型**（架构\u002F驱动程序）和**预设**（生成参数）。\n\n---\n\n- **使用 MCPO 集成 MCP 服务器工具**：一项新的实验性功能，支持在工作流中调用 MCP 服务器工具。特别感谢 [iSevenDays](https:\u002F\u002Fgithub.com\u002FiSevenDays) 在此功能上的出色工作。更多信息请参阅 [ReadMe](Public\u002Fmodules\u002FREADME_MCP_TOOLS.md)。\n\n---\n\n- **隐私优先开发**：从根本上讲，Wilmer 始终秉持完全隐私的原则进行设计。Socg 持续使用这款应用，他和其他人一样，不希望自己的信息被随意泄露到网络上。因此，每一个决策都围绕着这样一个理念：Wilmer 中唯一进出的数据流，都应该是用户期望并主动配置的内容。\n\n----\n\n#### 隐私检查 — 2026年3月29日\n\n为了确保自己没有无意中添加任何可能损害 Wilmer 隐私保护措施的内容，我有时会请 Claude Code 对代码库进行全面检查，查找任何对外的网络请求或其他数据泄露行为。虽然这不如正式的代码审计，但能让我安心一些。现将检查结果附上。\n\n2026年3月29日，我请 Claude Code（Claude Opus）扫描了代码库，并报告其中发现的所有对外网络请求、遥测数据，以及其他可能涉及隐私的行为。以下列出检查结果，供参考，但这并不构成保证——如果你的部署对隐私非常敏感，请务必自行进行分析。\n\n```text\n检查内容\n----------------\n搜索了 Middleware\u002F 和 Public\u002F 源码树、顶层入口文件（server.py、run_eventlet.py、run_waitress.py），以及所有 Shell\u002F批处理启动脚本（run_macos.sh、run_windows.bat、Scripts\u002Frekey_encrypted_files.sh、Scripts\u002Frekey_encrypted_files.bat），寻找对外 HTTP 请求（requests.get、requests.post、requests.Session、requests.request）、原始套接字使用、子进程调用、动态导入、硬编码的外部 URL、与遥测相关的关键词（analytics、telemetry、phone-home、tracking、metrics），以及环境变量读取。入口文件和启动脚本中未发现任何对外网络请求；run_eventlet.py 在本地监听套接字上设置了 TCP_NODELAY，但并未建立任何外部连接。\n\n对外网络请求\n----------------------\n代码库中发现的所有对外 HTTP 请求位置：\n\n1. 
Middleware\u002Fllmapis\u002Fhandlers\u002Fbase\u002Fbase_llm_api_handler.py（第223、373行）\n   - session.post() 发送到用户配置的大模型端点（endpoint config 中的 self.base_url）\n\n2. Middleware\u002Fworkflows\u002Ftools\u002Foffline_wikipedia_api_tool.py（第45、80、117、153、192行）\n   - requests.get() 发送到用户配置的主机；默认为 127.0.0.1:5728\n   - 仅在 activateWikiApi 被设置时启用\n\n3. Public\u002Fmodules\u002Fmcp_service_discoverer.py（第55行）\n   - requests.get() 发送到用户配置或环境变量指定的 MCPO 服务器；默认为 localhost:8889\n\n4. Public\u002Fmodules\u002Fmcp_tool_executor.py（第235行）\n   - requests.request() 发送到与上述相同的 MCPO 服务器\n\n未发现任何遥测、分析、回传数据、自动更新或硬编码的外部 URL。所有对外连接的目标都是用户明确配置的端点。\n\n数据存储\n------------\n- JSON 对话\u002F记忆文件：当启用 encryptUsingApiKey 时，可选地使用 Fernet 加密（AES-128-CBC 加 HMAC，PBKDF2 迭代 10万次）进行静态加密。\n- SQLite 数据库：用于向量记忆和工作流锁。即使启用了加密功能，这些数据库也未加密。\n- 日志文件：在 DEBUG 级别下，日志可能包含完整的提示和大模型响应，除非用户配置中启用了 redactLogOutput 或 encryptUsingApiKey。\n- 配置文件：可能包含明文 API 密钥。这些文件不会被 Wilmer 加密。\n\n第三方依赖\n------------------------\nrequirements.txt 中的所有运行时依赖项：\n\n  requests 2.33.0        - 用于大模型 API 和工具调用的 HTTP 客户端\n  urllib3 2.6.3           - requests 的传输层\n  scikit-learn 1.8.0      - 用于记忆搜索的 TF-IDF 向量化\n  Flask 3.1.3             - HTTP 服务器框架\n  Jinja2 3.1.6            - 用于工作流提示的模板渲染\n  Pillow 12.1.1           - 图像格式检测和处理\n  eventlet 0.40.4         - 异步 WSGI 服务器（可选）\n  waitress 3.0.2          - 生产级 WSGI 服务器（可选）\n  cryptography 46.0.5     - 用于存储数据的 Fernet 加密\n\n在 Wilmer 使用的所有这些包的初始化路径中，均未发现遥测或分析代码。\n  \n动态代码加载\n--------------------\n- PythonModule 工作流节点会以 Wilmer 进程的全部权限执行用户提供的、来自配置脚本目录的 Python 脚本。这些脚本不会被沙箱隔离，也不会经过 Wilmer 的验证。\n- API 处理器发现（Middleware\u002Fllmapis\u002Fhandlers\u002F）仅限内部使用，并且只从应用程序内的 handlers 目录加载。\n\n图片 URL 处理\n------------------\n当对话消息包含通过 HTTP URL 引用的图片时，该 URL 会原样转发给配置的 LLM 提供商。Wilmer 不会自行下载该图片。\n\n局限性\n-----------\n1. 这是一项由 AI（Claude Opus）执行的静态源代码搜索，而非正式的第三方安全审计。\n2. 第三方库的源代码并未在字节码级别进行检查。结果仅确认 Wilmer 自身的代码似乎不会发起意外连接。\n3. PythonModule 脚本由用户提供，可以执行任意代码。其行为不在本次检查范围内。\n4. 未进行运行时网络监控（例如数据包捕获）。\n5. 用于向量记忆的 SQLite 数据库即使在其他情况下启用了加密功能，也未进行加密。\n6. 
日志文件可能包含完整的对话内容，除非明确启用了内容脱敏功能。\n```\n\n> 虽然我并没有工具能够百分之百保证不存在第三方库在执行我意料之外的操作，但我希望强调这一点对我来说非常重要。如果您有任何疑虑，强烈建议您自行对代码库和应用程序进行分析。如果您发现了任何我遗漏的内容，请随时提交问题。\n\n\n\n## 用户文档\n\n用户文档可在 [\u002FDocs\u002FUser_Documentation\u002F](Docs\u002FUser_Documentation\u002FREADME.md) 找到。\n\n## 开发者文档\n\n有用的开发者文档可在 [\u002FDocs\u002FDeveloper_Docs\u002F](Docs\u002FDeveloper_Docs\u002FREADME.md) 找到。\n\n## 快速设置\n\n### YouTube 视频\n\n[![WilmerAI 和 Open WebUI 在全新 Windows 11 桌面安装](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FSomeOddCodeGuy_WilmerAI_readme_1dfc034985a7.jpg)](https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=KDpbxHMXmTs \"WilmerAI 和 Open WebUI 在全新 Windows 11 桌面安装\")\n\n### 指南\n\n#### WilmerAI\n\n请参阅 [用户文档设置入门指南](Docs\u002FUser_Documentation\u002FSetup\u002F_Getting-Start_Wilmer-Api.md)，获取关于如何快速设置 API 的分步说明。\n\n\n#### Wilmer 与 Open WebUI\n\n[您可以点击此处查看关于将 Wilmer 与 Open WebUI 配合使用的书面指南](Docs\u002FUser_Documentation\u002FSetup\u002FOpen-WebUI.md)\n\n#### Wilmer 与 SillyTavern\n\n[您可以点击此处查看关于将 Wilmer 与 SillyTavern 配合使用的书面指南](Docs\u002FUser_Documentation\u002FSetup\u002FSillyTavern.md)\n\n\n---\n\n## 为什么创建 WilmerAI？\n\nWilmer 项目于 2023 年底启动，正值 Llama 2 时代，旨在通过路由机制最大化微调模型的应用效果。当时现有的路由器大多无法很好地处理语义路由——它们往往仅根据单个关键词以及最后一句话来分类；然而，有时单个词并不足以准确描述一个类别，而最后一句话可能包含过多的隐含信息，或者缺乏足够的上下文，从而难以进行恰当的分类。\n\n几乎在 Wilmer 启动后不久，人们就意识到仅仅依靠路由是不够的：尽管微调模型的效果尚可，但远不及专有 LLM 的智能水平。然而，当 LLM 被反复要求针对同一任务进行迭代时，其响应质量往往会提升（前提是提示词编写得当）。这表明，最佳方案并非简单地将请求路由至单一 LLM 一次性完成响应，而是将提示词发送给更为复杂的系统。\n\n因此，Wilmer 没有依赖不可靠的自主代理，而是专注于半自主的工作流设计，赋予用户对 LLM 执行路径的精细控制权，并最大限度地利用用户的领域知识和经验。这也意味着多个 LLM 可以协同工作，由工作流本身进行编排，共同得出最终解决方案。\n\n与仅将请求路由至单一 LLM 不同，Wilmer 会通过完整的工作流将请求分配给多个 LLM。\n\n这种设计使得 Wilmer 的分类能力比大多数路由器更加复杂且高度可定制。分类过程由用户自定义的工作流负责，用户可以根据需要添加任意数量的节点和 LLM，逐步拆解对话内容，精确判断用户的需求。这意味着用户可以尝试不同的提示词风格，以优化路由器的表现。此外，路由规则不仅仅是关键词，而是对整个路由流程的完整描述，几乎不留给 LLM 任何“想象”的空间。其目标是，一旦 Wilmer 的分类出现不足，只需调整分类工作流即可解决。而一旦确定了分类方向，请求就会进入下一个工作流。\n\n最终，Wilmer 的核心逐渐从单纯的路由转向工作流设计，并提供了可选的绕过机制，允许用户直接跳过路由步骤。由于其占用资源较少，用户可以同时运行多个 Wilmer 实例——有些直接执行特定的工作流，而另一些则先经过分类和路由流程。\n\n尽管 Wilmer 可能是同类项目中的首创，但此后已涌现出许多其他的语义路由器，其中一些甚至可能速度更快、性能更好。然而，该项目仍将持续维护很长一段时间，因为项目的维护者本人至今仍在将其作为日常工具使用，并计划为其开发更多功能。\n\n## Wilmer API 端点\n\n### 如何连接到 Wilmer？\n\nWilmer 在前端暴露了多种不同的 API，使大多数 LLM 相关应用都能与其对接。\n\nWilmer 支持以下 API，可供其他应用连接：\n\n- OpenAI 兼容 v1\u002Fcompletions（*需使用 [Wilmer 提示模板](Public\u002FConfigs\u002FPromptTemplates\u002Fwilmerai.json)*）\n- OpenAI 兼容 chat\u002Fcompletions\n- Ollama 兼容 api\u002Fgenerate（*需使用 [Wilmer 提示模板](Public\u002FConfigs\u002FPromptTemplates\u002Fwilmerai.json)*）\n- Ollama 兼容 api\u002Fchat\n\n### Wilmer 可以连接哪些服务？\n\n在后端，Wilmer 能够连接多种 API，并将提示词发送至相应的 LLM。目前，Wilmer 支持以下类型的 API：\n\n- Claude API（Anthropic Messages API）\n- OpenAI 兼容 v1\u002Fcompletions\n- OpenAI 兼容 chat\u002Fcompletions\n- Ollama 兼容 api\u002Fgenerate\n- Ollama 兼容 api\u002Fchat\n- KoboldCpp 兼容 api\u002Fv1\u002Fgenerate（非流式生成）\n- KoboldCpp 兼容 \u002Fapi\u002Fextra\u002Fgenerate\u002Fstream（流式生成）\n\nWilmer 同时支持流式和非流式连接，并已在 Sillytavern 和 Open WebUI 上进行了测试。\n\n## 维护者注：\n\n> 本项目是在我的个人硬件上，利用业余时间进行维护的。由于工作原因，我无法在工作日的正常工作时间内参与开发，因此我只能在周末以及部分工作日晚上进行代码更新。\n>\n> 如果您发现了 bug 或其他问题，修复可能需要一到两周的时间才能发布。如果确实如此，我在此提前致歉，但请不要因此认为我没有认真对待该问题。实际上，我可能要到下一周的周五或周六才有时间查看并处理该问题。\n>\n> -Socg\n\n## 重要提示：\n\n> 请注意，工作流的本质决定了它们可能会根据您的配置对 API 端点发起多次调用。WilmerAI 不会跟踪令牌使用情况，也不会通过其 API 报告准确的令牌使用量，更不提供任何可行的方式来监控令牌使用情况。因此，如果您出于成本考虑需要跟踪令牌使用量，请务必通过 LLM API 提供给您的仪表板来记录您使用的令牌数量，尤其是在刚开始使用本软件时，以便熟悉其操作。\n>\n> 您所使用的 LLM 直接影响 WilmerAI 的质量。这是一个由 LLM 驱动的项目，其流程和输出几乎完全依赖于所连接的 LLM 及其响应。如果您将 Wilmer 连接到一个生成质量较低模型，或者您的预设设置或提示模板存在缺陷，那么 Wilmer 的整体质量也会相应降低。在这方面，它与代理式工作流并没有太大区别。\n\n---\n\n## 
联系方式\n\n如需反馈、请求或只是打个招呼，您可以通过以下邮箱联系我：\n\nWilmerAI.Project@gmail.com\n\n---\n\n## 第三方库\n\nWilmerAI 在 requirements.txt 中列出了多个第三方库，并通过标准的 import 语句使用它们；它并未扩展或修改这些库的源代码。\n\n这些库包括：\n\n* Flask: https:\u002F\u002Fgithub.com\u002Fpallets\u002Fflask\u002F\n* requests: https:\u002F\u002Fgithub.com\u002Fpsf\u002Frequests\u002F\n* scikit-learn: https:\u002F\u002Fgithub.com\u002Fscikit-learn\u002Fscikit-learn\u002F\n* urllib3: https:\u002F\u002Fgithub.com\u002Furllib3\u002Furllib3\u002F\n* jinja2: https:\u002F\u002Fgithub.com\u002Fpallets\u002Fjinja\n* pillow: https:\u002F\u002Fgithub.com\u002Fpython-pillow\u002FPillow\n* eventlet: https:\u002F\u002Fgithub.com\u002Feventlet\u002Feventlet\n* waitress: https:\u002F\u002Fgithub.com\u002FPylons\u002Fwaitress\n* cryptography: https:\u002F\u002Fgithub.com\u002Fpyca\u002Fcryptography\n\n有关这些库的许可信息，请参阅 ThirdParty-Licenses 文件夹中的 README 文件，其中包含了每种许可证的完整文本以及相关的 NOTICE 文件（如有），并注明了各自的最后更新日期。\n\n## Wilmer 许可证与版权\n\n    WilmerAI\n    版权所有 (C) 2024-2026 Christopher Smith\n\n    本程序是自由软件：您可以重新分发它，并按照自由软件基金会发布的 GNU 通用公共许可证条款对其进行修改，\n    即许可证第 3 版，或（由您选择）任何后续版本。\n    \n    本程序以“按原样”提供，不附带任何担保，包括但不限于适销性或特定用途适用性的隐含担保。详细信息请参阅 GNU 通用公共许可证。\n    \n    您应当随本程序收到一份 GNU 通用公共许可证的副本。如果没有，请访问 \u003Chttps:\u002F\u002Fwww.gnu.org\u002Flicenses\u002F> 查阅。","# WilmerAI 快速上手指南\n\nWilmerAI 是一款专为高级语义提示路由和复杂任务编排设计的开源应用。其核心是一个基于节点的工作流引擎，能够理解完整的对话上下文（而不仅仅是最后一条消息），从而智能地将请求路由到最合适的 LLM 或工具链。它兼容 OpenAI 和 Ollama API，可无缝集成到现有的前端工具（如 Open WebUI）中。\n\n> **注意**：本项目目前仍处于开发阶段，软件按“原样”提供，无任何形式的担保。\n\n## 环境准备\n\n在开始之前，请确保您的系统满足以下要求：\n\n*   **操作系统**：支持 Linux、macOS 或 Windows。\n*   **Python 版本**：Python 3.10 或更高版本。\n*   **依赖管理**：推荐使用 `pip` 或 `conda` 进行环境管理。\n*   **后端模型服务**：需预先部署好至少一个 LLM 后端（如本地运行的 Ollama、vLLM，或拥有 OpenAI\u002FAzure 等 API Key）。\n*   **内存建议**：由于涉及多步工作流和上下文记忆，建议至少 8GB 可用内存；若运行本地大模型，需根据模型大小增加显存\u002F内存。\n\n## 安装步骤\n\n### 1. 克隆项目仓库\n\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002FSomeOddCodeGuy\u002FWilmerAI.git\ncd WilmerAI\n```\n\n*(注：如果国内访问 GitHub 较慢，可使用镜像源加速，例如：`git clone https:\u002F\u002Fghp.ci\u002Fhttps:\u002F\u002Fgithub.com\u002FSomeOddCodeGuy\u002FWilmerAI.git`)*\n\n### 2. 创建并激活虚拟环境\n\n建议使用 Python 虚拟环境以避免依赖冲突：\n\n```bash\npython -m venv venv\nsource venv\u002Fbin\u002Factivate  # Linux\u002FmacOS\n# 或在 Windows 上: venv\\Scripts\\activate\n```\n\n### 3. 安装依赖\n\n```bash\npip install -r requirements.txt\n```\n\n*(注：若下载速度慢，可指定国内镜像源：`pip install -r requirements.txt -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple`)*\n\n
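依赖安装完成后，可用下面的小脚本快速自检主要运行时依赖能否正常导入（仅为示意用法，并非仓库自带脚本；清单取自 requirements.txt 中列出的运行时依赖，其中 eventlet 与 waitress 为可选组件，未安装可从列表中删去）：\n\n```python\n# 依赖自检示意：逐一导入 WilmerAI 的主要运行时依赖\nfor mod in ('flask', 'requests', 'sklearn', 'jinja2', 'PIL', 'cryptography'):\n    __import__(mod)\nprint('核心依赖导入正常')\n```\n\n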
### 4. 配置初始文件\n\n参照 [用户文档设置入门指南](Docs\u002FUser_Documentation\u002FSetup\u002F_Getting-Start_Wilmer-Api.md) 配置您的用户与端点文件（位于 `Public\u002FConfigs` 目录下），填入您的 LLM 后端信息（如 Ollama 地址 `http:\u002F\u002Flocalhost:11434` 或 OpenAI Key）。\n\n## 基本使用\n\n### 启动服务器\n\nWilmerAI 支持多用户模式和并发控制。以下是几种常见的启动方式（顶层入口脚本为 `server.py`，也可使用仓库自带的 `run_macos.sh` 或 `run_windows.bat`）：\n\n**单用户默认模式（适合本地测试）：**\n```bash\npython server.py\n```\n\n**多用户模式（为不同用户隔离配置和记忆）：**\n```bash\npython server.py --User alice --User bob\n```\n\n**启用并发控制（适合高性能后端，如 NVIDIA GPU 集群）：**\n```bash\npython server.py --concurrency 4 --concurrency-timeout 300\n```\n\n> **说明**：默认情况下，服务器一次只处理一个请求以保证本地稳定性。`--concurrency` 参数允许同时处理多个请求。\n\n### 连接前端工具\n\nWilmerAI 启动后，会暴露兼容 OpenAI 格式的 API 端点（本指南假设为 `http:\u002F\u002Flocalhost:8000\u002Fv1`，请以您的实际监听地址为准）。您可以直接将其配置到任何支持 OpenAI API 的客户端中。\n\n**示例：使用 curl 测试路由功能**\n\n```bash\ncurl http:\u002F\u002Flocalhost:8000\u002Fv1\u002Fchat\u002Fcompletions \\\n  -H \"Content-Type: application\u002Fjson\" \\\n  -d '{\n    \"model\": \"coding-workflow\", \n    \"messages\": [\n      {\"role\": \"user\", \"content\": \"帮我写一个 Python 脚本，用于抓取网页标题。\"}\n    ]\n  }'\n```\n\n*   **model 字段**：在此处填写您在 Wilmer 配置中定义的工作流名称（如 `coding-workflow`、`general-assistant`）。Wilmer 会根据该名称加载对应的 JSON 工作流定义。\n*   **上下文感知**：如果您在同一个讨论 ID（DiscussionId）下连续发送消息，Wilmer 会自动读取历史记忆和滚动摘要，实现基于完整上下文的智能路由。\n\n### 简单工作流概念\n\nWilmer 的核心在于 `workflows` 目录下的 JSON 文件。一个简单的路由逻辑如下：\n\n1.  **输入**：用户发送消息。\n2.  **路由节点**：分析整段对话历史，判断意图（是写代码？还是查资料？）。\n3.  **执行节点**：根据判断结果，调用特定的 LLM（如用 Qwen-72B 写代码，用 Mistral-7B 做总结）或外部工具（如搜索维基百科）。\n4.  **输出**：将最终结果返回给客户端，整个过程对前端透明。\n\n您可以根据需求自定义这些 JSON 工作流，实现复杂的自动化任务编排。","某初创团队正在开发一款集成代码生成与历史文档查询的智能研发助手，需同时调用多个不同特性的大模型 API。\n\n### 没有 WilmerAI 时\n- **意图识别肤浅**：系统仅靠关键词匹配路由，当用户问“它是什么意思？”时，无法结合前文关于“罗塞塔石碑”的讨论，导致模型回答泛泛而谈。\n- **架构臃肿难管**：为处理不同任务需启动多个独立的路由实例，每个实例配置各异，运维人员难以统一管理和监控。\n- **前端耦合严重**：若要切换底层模型或增加预处理脚本，必须修改前端代码以适配新的 API 逻辑，迭代效率极低。\n- **多用户隔离困难**：缺乏原生多用户支持，不同开发者的对话记忆和配置文件容易混淆，存在数据串扰风险。\n\n### 使用 WilmerAI 后\n- **上下文深度理解**：WilmerAI 基于节点的工作流引擎能分析完整对话历史，精准识别“它”指代的是前文的特定技术概念，从而调用专业模型给出精确解答。\n- **单实例统一管理**：借助新增的多用户支持，团队只需运行一个 WilmerAI 实例，即可通过命令行参数为每位开发者隔离配置、记忆和日志，大幅简化部署。\n- **后端逻辑透明化**：WilmerAI 作为中间层拦截请求，团队可在后端 JSON 工作流中自由编排模型调用、工具执行及图像透传，前端无需任何改动即可享受复杂逻辑。\n- **动态工作流编排**：利用最新加入的工具调用支持，WilmerAI 能自动在代码生成任务中串联 Qwen 等大模型进行多次自我修正，显著减少人工干预次数。\n\nWilmerAI 通过语义感知路由与可视化工作流编排，将分散的 LLM 能力整合为灵活、可维护的企业级智能中枢。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FSomeOddCodeGuy_WilmerAI_32d62cf1.gif","SomeOddCodeGuy","Chris","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002FSomeOddCodeGuy_42409e1c.png","Just a boring dev\u002Fmanager who picked up LLMs in early 2023, and been engrossed ever since.",null,"Florida, USA","https:\u002F\u002Fwww.someoddcodeguy.dev","https:\u002F\u002Fgithub.com\u002FSomeOddCodeGuy",[82,86,90],{"name":83,"color":84,"percentage":85},"Python","#3572A5",99.8,{"name":87,"color":88,"percentage":89},"Shell","#89e051",0.1,{"name":91,"color":92,"percentage":89},"Batchfile","#C1F12E",809,48,"2026-04-13T08:17:37","GPL-3.0",4,"未说明 (文中提及本地 Mac 设置和 NVIDIA 设置，暗示支持 macOS 和 Linux\u002FWindows)","非必需 (作为 API 网关可连接任意后端); 若本地运行需根据所连接的模型决定 (文中提及支持 NVIDIA 设置及运行 3b-122b 参数量的模型)","未说明 (取决于工作流中调用的具体模型大小，文中提及可管理长上下文记忆)",{"notes":102,"python":103,"dependencies":104},"WilmerAI 本身是一个路由和工作流编排工具，而非直接托管模型的推理引擎。它通过 API 连接现有的 LLM 后端（如 Ollama, OpenAI, Llama.cpp）。硬件需求完全取决于用户在工作流中配置的具体模型（文中示例提到使用 Qwen 27b\u002F122b 或 Mistral Small）。支持多用户隔离、并发控制、基于上下文的语义路由以及图像透传功能。建议配合本地推理后端（如 Llama.cpp）或云服务使用。","未说明",[105],"未说明 (架构为基于 JSON 的工作流引擎，通过 API 连接外部 LLM 后端如 Ollama, OpenAI, Llama.cpp 
等)",[13,15,35,14],[108,109,110],"ai","generative-ai","llms","2026-03-27T02:49:30.150509","2026-04-15T07:10:22.072487",[114,119,124,129,134,139],{"id":115,"question_zh":116,"answer_zh":117,"source_url":118},34008,"如何让 WilmerAI 兼容 Ollama 或连接其他需要 Ollama 接口的工具？","WilmerAI 本身可以模拟 Ollama 或 OpenAI 的 API 行为，无需额外代理。您可以将前端工具（如 Open WebUI）连接到 WilmerAI，将其视为一个 Ollama 实例；同时配置 WilmerAI 后端连接到 KoboldCpp 或其他模型的 API。具体做法是：将 WilmerAI 作为 OpenAI v1\u002FCompletions 文本完成端点，然后让其转发请求到 Ollama 的 api\u002Fchat 聊天完成端点。这种配置使得 WilmerAI 能像 Ollama 代理一样工作，支持多种工具链的集成。","https:\u002F\u002Fgithub.com\u002FSomeOddCodeGuy\u002FWilmerAI\u002Fissues\u002F40",{"id":120,"question_zh":121,"answer_zh":122,"source_url":123},34009,"启动服务器时遇到 'ModuleNotFoundError: No module named ... ollama_chat_api_image_specific_handler' 错误怎么办？","该错误通常是因为开发者在发布版本时遗漏提交了某个文件所致。如果遇到此错误，请检查 GitHub 仓库的最新提交记录，维护者通常会迅速补推缺失的文件（例如 `Middleware\u002Fllmapis\u002Follama_chat_api_image_specific_handler.py`）。解决方法是拉取最新的代码更新（git pull），确保所有必要的模块文件都已存在于本地目录中。","https:\u002F\u002Fgithub.com\u002FSomeOddCodeGuy\u002FWilmerAI\u002Fissues\u002F22",{"id":125,"question_zh":126,"answer_zh":127,"source_url":128},34010,"如何在命令行指定配置文件目录和当前用户，以便多工作流运行或避免 Git 合并冲突？","项目已支持通过命令行参数直接指定配置目录和当前用户。这允许用户将 `Public\u002FConfigs` 目录从仓库中分离出来，在进行 `git pull` 更新时避免合并本地的工作流或端点更改。同时，指定 `current_user` 参数使得在同一安装实例上运行多个不同用户的工作流变得更加容易，而无需维护多套独立的配置目录。具体用法请参考相关 PR（如 PR #8）的实现细节。","https:\u002F\u002Fgithub.com\u002FSomeOddCodeGuy\u002FWilmerAI\u002Fissues\u002F4",{"id":130,"question_zh":131,"answer_zh":132,"source_url":133},34011,"使用小模型（如 7B\u002F8B）运行事实性工作流（Factual Workflow）或离线维基百科功能时经常出错或不稳定，如何解决？","WilmerAI 的功能表现高度依赖所使用的模型。较大的模型通常能很好地处理复杂的工作流和提示词，但较小的模型（如 7B 或 8B）可能需要更精确的提示词工程才能正常工作。如果遇到路由错误或生成不稳定，建议尝试优化提示词（prompts）以增强其鲁棒性，或者换用参数量更大的模型。此外，维护者正在改进日志输出，以便在控制台明确显示是否调用了事实性工作流或拉取了维基百科文章，帮助用户调试。","https:\u002F\u002Fgithub.com\u002FSomeOddCodeGuy\u002FWilmerAI\u002Fissues\u002F11",{"id":135,"question_zh":136,"answer_zh":137,"source_url":138},34012,"记忆（Memories）功能出现混乱，重复生成旧记忆或无法生成新记忆，且摘要质量下降，该如何处理？","这是一个已知的记忆创建逻辑问题，会导致系统混淆并重新生成旧记忆，进而影响基于记忆生成的摘要质量。该问题已在特定的代码提交（commit dd3bde2）中得到修复。如果您遇到此类问题，请务必更新到包含该修复提交的最新版本代码。","https:\u002F\u002Fgithub.com\u002FSomeOddCodeGuy\u002FWilmerAI\u002Fissues\u002F13",{"id":140,"question_zh":141,"answer_zh":142,"source_url":143},34013,"消息哈希计算中包含 [DiscussionId] 导致同一讨论中的消息哈希值变化，影响记忆去重，如何解决？","在创建记忆时对消息进行哈希计算时，如果包含动态变化的 `[DiscussionId]` 字段（例如 SillyTavern 在不同深度插入该 ID），会导致相同内容的消息产生不同的哈希值，从而破坏去重机制。解决方案是在哈希计算之前，从消息内容中剥离（strip out）`[DiscussionId]` 部分，确保哈希值仅基于实际的消息文本内容生成。","https:\u002F\u002Fgithub.com\u002FSomeOddCodeGuy\u002FWilmerAI\u002Fissues\u002F5",[145,150,155,160,165,170,175,180,185,190,195,200,205,210,215,219,224,229,234,239],{"id":146,"version":147,"summary_zh":148,"released_at":149},263887,"v0.62.1","## 重大新特性\n\n1. **端到端工具调用透传** — 全面支持所有大模型处理器（Claude、OpenAI、Ollama）的工具调用功能，涵盖流式与非流式两种路径。前端 API 处理器会从传入请求中提取 `tools` 和 `tool_choice` 参数，将其贯穿工作流管道，并转发至后端大模型处理器。后端处理器则从大模型响应中解析工具调用数据，再通过响应管道返回给前端客户端。工作流节点配置新增 `allowTools` 布尔选项（默认为 `false`），用于控制哪些节点会透传工具调用；因此，记忆节点、摘要生成器和分类器在内部处理过程中会静默地抑制工具调用。内部采用 OpenAI 格式作为标准；Claude 和 Ollama 处理器会在其原生格式与 OpenAI 格式之间进行转换。流式工具调用片段会绕过所有文本处理步骤（如前缀剥离、思考块移除、群聊重构），直接以 SSE 格式发出。\n\n2. **分隔符切片工作流节点** — 新增一种节点类型，可根据指定分隔符将内容拆分为多个片段，并返回前 N 个（头部）或后 N 个（尾部）片段，同时使用相同的分隔符重新拼接。该节点适用于截取日志、CSV 行或按章节分隔的文档等内容。可通过 `content`、`delimiter`、`mode`（“head”\u002F“tail”）以及 `count` 等属性进行配置。支持在内容和分隔符字段中使用变量替换。\n\n3. 
**对话变量格式化控制** — 为 `chat_user_prompt_*` 工作流变量新增两种格式化选项。节点级的 `addUserAssistantTags`（布尔值）会在对话变量字符串中的每条消息前添加 `User: ` \u002F `Assistant: ` \u002F `System: ` 角色前缀。用户级的 `separateConversationInVariables`（布尔值）配合 `conversationSeparationDelimiter`（字符串），可将默认的消息间换行符替换为自定义的分隔符。\n\n4. **节点级图片控制** — 标准节点现可通过 `acceptImages`（布尔值，保留发送至大模型的对话消息中的图片）和 `maxImagesToSend`（整数，限制发送的总图片数量，优先保留最新图片；0 表示无限制）来控制图片透传。图片将按时间顺序从旧到新依次裁剪。\n\n5. **`\u002Fv1\u002Fchat\u002Fcompletions` 版本化路由** — 新增 `\u002Fv1\u002Fchat\u002Fcompletions` 作为兼容 OpenAI 的 API 的主要版本化路由。现有的 `\u002Fchat\u002Fcompletions` 仍作为别名保留，以确保向后兼容性。\n\n## 错误修复\n\n6. **代理输出\u002F输入中的大括号转义问题** — 修复了当代理输出、代理输入或增强后的工具调用文本中包含字面大括号时（例如来自工具调用的 JSON 或由 GetCustomFile 加载的文件），`str.format()` 函数崩溃的问题。采用两遍哨兵转义机制：先将变量值中的字面大括号替换为哨兵标记后再进行格式化，最后再将其恢复原状。\n\n7. **带下划线的类别匹配问题** — 修复了 `_match_category` 无法匹配包含下划线的类别键（如 `NEW_INSTRUCTION`）的问题。原有代码会去除大模型输出中的标点符号（包括下划线），但在比较时却…","2026-04-13T01:03:27",{"id":151,"version":152,"summary_zh":153,"released_at":154},263888,"v0.62","## 重大新特性\n\n1. **端到端工具调用透传** — 全面支持所有 LLM 处理器（Claude、OpenAI、Ollama）的工具调用功能，涵盖流式与非流式两种路径。前端 API 处理器会从传入请求中提取 `tools` 和 `tool_choice` 参数，将其贯穿工作流管道，并转发至后端 LLM 处理器。后端处理器则从 LLM 响应中解析工具调用数据，再通过响应管道返回给前端客户端。工作流节点配置新增 `allowTools` 布尔值（默认为 `false`），用于控制哪些节点会透传工具调用；因此，记忆节点、摘要生成器和分类器在内部处理过程中会静默地抑制工具调用。内部采用 OpenAI 格式作为标准；Claude 和 Ollama 处理器会在其原生格式与 OpenAI 格式之间进行转换。流式工具调用片段会绕过所有文本处理步骤（如前缀剥离、思考块移除、群聊重构），直接以 SSE 格式发出。\n\n2. **分隔符切片工作流节点** — 新增一种节点类型，可根据指定分隔符将内容拆分为多个片段，并返回前 N 个（头部）或后 N 个（尾部）片段，同时使用相同的分隔符重新拼接。适用于截取日志、CSV 行或按章节分隔的文档等内容。可通过 `content`、`delimiter`、`mode`（“head”\u002F“tail”）以及 `count` 属性进行配置。支持在内容和分隔符字段中使用变量替换。\n\n3. **对话变量格式化控制** — 为 `chat_user_prompt_*` 工作流变量新增两种格式化选项。节点级的 `addUserAssistantTags`（布尔值）会在对话变量字符串中的每条消息前添加 `User: ` \u002F `Assistant: ` \u002F `System: ` 角色前缀。用户级的 `separateConversationInVariables`（布尔值）配合 `conversationSeparationDelimiter`（字符串），可将默认的消息间换行符替换为自定义的分隔符。\n\n4. **节点级图片控制** — 标准节点现可通过 `acceptImages`（布尔值，保留发送至 LLM 的对话消息中的图片）和 `maxImagesToSend`（整数，限制发送的总图片数量，仅保留最近的几张；0 表示无限制）来控制图片的透传。图片将按时间顺序从最早的一张开始裁剪。\n\n5. **`\u002Fv1\u002Fchat\u002Fcompletions` 版本化路由** — 新增 `\u002Fv1\u002Fchat\u002Fcompletions` 作为兼容 OpenAI 的 API 的主要版本化路由。现有的 `\u002Fchat\u002Fcompletions` 仍作为别名保留，以确保向后兼容性。\n\n## 错误修复\n\n6. **代理输出\u002F输入中的大括号转义问题** — 修复了当代理输出、代理输入或增强后的工具调用文本中包含字面意义上的大括号时（例如来自工具调用的 JSON 或由 GetCustomFile 加载的文件），`str.format()` 会崩溃的问题。采用两遍哨兵转义机制：先将变量值中的字面大括号替换为哨兵标记后再进行格式化，格式化完成后再将其恢复原状。\n\n7. **带下划线的类别匹配问题** — 修复了 `_match_category` 无法匹配包含下划线的类别键（如 `NEW_INSTRUCTION`）的问题。原有代码会去除 LLM 输出中的标点符号（包括下划线），但随后却对比一个","2026-04-12T21:25:48",{"id":156,"version":157,"summary_zh":158,"released_at":159},263889,"v0.61","## 变更内容\n* 由 @dependabot[bot] 在 https:\u002F\u002Fgithub.com\u002FSomeOddCodeGuy\u002FWilmerAI\u002Fpull\u002F86 中将 cryptography 从 46.0.5 升级至 46.0.6\n\n## 新贡献者\n* @dependabot[bot] 在 https:\u002F\u002Fgithub.com\u002FSomeOddCodeGuy\u002FWilmerAI\u002Fpull\u002F86 中完成了首次贡献\n\n**完整变更日志**: https:\u002F\u002Fgithub.com\u002FSomeOddCodeGuy\u002FWilmerAI\u002Fcompare\u002Fv0.6...v0.61","2026-04-05T15:32:48",{"id":161,"version":162,"summary_zh":163,"released_at":164},263890,"v0.6","# v0.6 - 2026年3月\n\n## 主要新特性\n\n1. **ContextCompactor 工作流节点** — 新增节点类型，采用基于 token 的滑动窗口机制，将对话消息总结为两份滚动摘要（旧摘要 + 最旧摘要）。该节点独立于记忆系统，专为实现对最近对话内容的感知式压缩而设计。使用 XML 样式的标签，并可通过配置文件进行自定义。\n\n2. **自动记忆凝缩** — 针对基于文件的记忆系统新增可选的凝缩层。当新记忆积累到一定数量（可配置阈值）后，最旧的一批记忆将由 LLM 总结为一条凝缩条目，从而在长时间对话中有效减少文件膨胀。\n\n3. **每条消息的图像关联** — 进行重大重构，用每条消息中的 `\"images\"` 键取代原有的合成 `{\"role\": \"images\"}` 消息。现在，图像从摄入阶段一直伴随其原始消息，直至发送给 LLM 处理。同时，在摄入时支持 OpenAI 的多模态内容解析。\n\n4. 
{"id":161,"version":162,"summary_zh":163,"released_at":164},263890,"v0.6","# v0.6 - March 2026\n\n## Major New Features\n\n1. **ContextCompactor workflow node** — New node type that summarizes conversation messages into two rolling summaries (an old and an oldest) using a token-based sliding window. The node is independent of the memory system and is designed for context-aware compaction of recent conversation. Uses XML-style tags and is customizable via config files.\n\n2. **Automatic memory condensing** — Optional condensing layer for the file-based memory system. Once a configurable number of new memories accumulates, the oldest batch is summarized by an LLM into a single condensed entry, curbing file bloat in long conversations.\n\n3. **Per-message image association** — Major refactor replacing the synthetic `{\"role\": \"images\"}` message with an `\"images\"` key on each message. Images now travel with their original message from ingestion all the way to the LLM. OpenAI multimodal content parsing is also supported at ingestion.\n\n4. **Claude API image support** — The Claude handler now fully supports image input: base64, data URIs, and HTTP URLs. PIL\u002FPillow is used for format detection (optional, falling back to JPEG on failure), and images are placed before text per Anthropic's recommendation.\n\n5. **Per-user encryption** — When an API key is supplied via `Authorization: Bearer`, files are stored in a key-specific isolated directory. Optional Fernet encryption (AES-128-CBC + HMAC-SHA256, PBKDF2 key derivation) can be enabled per user, with transparent migration from plaintext to encrypted. A key-reset script is included.\n\n6. **Multi-user support** — A single WilmerAI instance can now serve several users by passing `--User` multiple times. Isolation is complete: per-user config reads, request-scoped user identity, per-user log directories, and aggregated model and tag endpoints.\n\n7. **WSGI concurrency-limit middleware** — New `--concurrency` (default 1) and `--concurrency-timeout` (default 900 seconds) command-line arguments on all entry points. Requests beyond the limit queue for a free slot or time out with a 503. Implemented at the WSGI layer so the semaphore holds across streaming responses.\n\n## Bug Fixes\n\n8. **SillyTavern streaming stalls** — Fixed streaming stalls when SillyTavern is used as the front end.\n\n9. **Open WebUI streaming errors** — Restored the JSON heartbeat format (it had been changed to bare newlines, which made Open WebUI's NDJSON parser throw JSONDecodeError).\n\n10. **Stalled memory generation** — Fixed message-hash collisions caused by front ends injecting author's notes containing only the `[DiscussionId]` tag after the first run, which prevented any further memory generation from triggering.\n\n11. **GetCurrentMemoryFromFile returning the wrong data** — This function shared a code path with `GetCurrentSummaryFromFile` and returned the rolling chat summary instead of memory chunks. It now correctly returns memory chunks.\n\n12. **Image lookback default regression** — Restored the default lookback window from 5 turns back to 10 (it had been silently halved).\n\n13. **Multi-word prefix detection in streaming** — Fixed `StreamingResponseHandler` failing to strip multi-word response prefixes.","2026-03-29T22:57:37",{"id":166,"version":167,"summary_zh":168,"released_at":169},263891,"v0.5","## Summary\r\n\r\n> Note: this update introduces new variables to begin deprecating older ones such as \"chat_user_prompt_last_twenty\". For backward compatibility the old variables are not being removed yet, but they will be relied on less going forward.\r\n\r\n### New workflow nodes\r\n- JsonExtractor node: extract fields from JSON in an LLM response without an extra LLM call.\r\n- TagTextExtractor node: extract the content between XML\u002FHTML-style tags without an extra LLM call.\r\n\r\n### Configurable prompt variables\r\n- nMessagesToIncludeInVariable: node property controlling how many messages are included in chat or templated prompt variables.\r\n- estimatedTokensToIncludeInVariable: token-budget-based message selection that accumulates the most recent messages until the token cap is reached.\r\n- minMessagesInVariable + maxEstimatedTokensInVariable: combined mode that first guarantees a minimum message count, then fills against the token budget.\r\n\r\n### Token estimation\r\n- Recalibrated the rough word-to-token ratio (from 1.538 to 1.35 tokens\u002Fword).\r\n- Added a configurable safety-margin parameter, defaulting to 1.10.\r\n\r\n### Memory system fixes\r\n- Fixed a file_exists check that permanently disabled the message-threshold trigger for new conversations.\r\n- Corrected an off-by-one in the trigger comparison (> changed to >=).\r\n- Added HTTP session cleanup via close() to keep long-lived connections from occupying llama.cpp slots.\r\n- Split the timeout setting into a (connect, read) tuple.\r\n- Added diagnostic logging for memory-trigger decisions.\r\n\r\n### Code quality\r\n- Fixed bare except clauses; cancellation paths now catch except Exception.\r\n- Added prompt-based verbose logging for the configurable variable slicing.\r\n\r\n### Example workflow configs\r\n- Updated all example workflow JSON files to the new configurable-variable syntax.","2026-02-09T03:26:08",{"id":171,"version":172,"summary_zh":173,"released_at":174},263892,"v0.4.1","## What's Changed\n* Fixed a memory-system issue caused by the recent removal of the image-specific handlers, by @SomeOddCodeGuy in https:\u002F\u002Fgithub.com\u002FSomeOddCodeGuy\u002FWilmerAI\u002Fpull\u002F82.","2026-01-05T03:53:47",{"id":176,"version":177,"summary_zh":178,"released_at":179},263893,"v0.4","## What's Changed\n  - Fixed the oldest message chunk being silently dropped during memory generation\n  - Fixed a new-message miscount that caused already-memorized messages to be reprocessed\n  - Fixed case-sensitive test paths in pytest.ini\n\n  Features:\n  - Added shared workflow collections and workflow selection via the API model field (\u002Fv1\u002Fmodels and \u002Fapi\u002Ftags endpoints)\n  - Added workflow node execution summary logging with timing information\n  - Added the workflowConfigsSubDirectoryOverride config for specifying a shared workflow folder path\n  - Added the sharedWorkflowsSubDirectoryOverride config for customizing the shared folder name\n  - Added {Discussion_Id} and {YYYY_MM_DD} variables in file paths\n  - Added variable substitution support for the maxResponseSizeInTokens parameter\n  - Added a web-based setup wizard (setup_wizard_web.py) (still in development; may be interim or replaced)\n  - Made vector memory generation resumable, recording a hash per chunk\n\n  Refactoring:\n  - Consolidated the image handlers into the standard handlers, removing ~700 lines of code\n  - Unified preset and workflow naming on hyphenation\n  - Archived legacy workflows into an _archive subdirectory\n  - Added a preconfigured shared workflow folder\n\n  Simplification:\n  - Renamed presets to match endpoint names. This is much more intuitive now, and makes it easier to use presets to keep every endpoint on a suitable configuration.\n  - _example_general_workflow is the one-stop shop for the example productivity workflows; the custom workflow system makes it easy to derive more from it. Just drop a new folder into the _shared directory under workflows and it can be enabled as a model to spin up a new workflow quickly. I'll make a video on this later.\n  - Removed the image-specific handlers. Finally! That was some of my earliest code and I kept putting it off, but it always bothered me. The regular handlers now carry the image framework support where applicable.\n\n  Tests:\n  - Updated tests to validate the corrected memory-hashing behavior\n  - Added tests for the workflow override feature","2026-01-04T21:26:24",
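
The v0.6 concurrency middleware above is worth a sketch: the point of doing it at the WSGI layer is that a slot must stay held until a streaming response finishes iterating, not just until the handler returns. A simplified, hypothetical version (WilmerAI's real flags are `--concurrency` and `--concurrency-timeout`; the class names here are invented):

```python
import threading

class ConcurrencyLimitMiddleware:
    """WSGI middleware sketch: cap in-flight requests with a semaphore."""

    def __init__(self, app, concurrency=1, timeout=900):
        self.app = app
        self.semaphore = threading.BoundedSemaphore(concurrency)
        self.timeout = timeout

    def __call__(self, environ, start_response):
        # Queue for a free slot; give up with a 503 after the timeout.
        if not self.semaphore.acquire(timeout=self.timeout):
            start_response("503 Service Unavailable",
                           [("Content-Type", "text/plain")])
            return [b"Server busy: concurrency limit reached"]
        try:
            result = self.app(environ, start_response)
        except BaseException:
            self.semaphore.release()
            raise
        # Release only once the response iterable is exhausted or closed,
        # so streaming responses keep their slot for their full lifetime.
        return _ReleasingIterator(result, self.semaphore)

class _ReleasingIterator:
    def __init__(self, iterable, semaphore):
        self._iterator = iter(iterable)
        self._semaphore = semaphore
        self._released = False

    def __iter__(self):
        return self

    def __next__(self):
        try:
            return next(self._iterator)
        except StopIteration:
            self._release()
            raise

    def close(self):
        # WSGI servers call close() on the response iterable when done.
        self._release()
        if hasattr(self._iterator, "close"):
            self._iterator.close()

    def _release(self):
        if not self._released:
            self._released = True
            self._semaphore.release()
```
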
{"id":181,"version":182,"summary_zh":183,"released_at":184},263894,"v0.3.1","## What's Changed\n* Updated urllib3 to fix an issue reported by Dependabot, by SomeOddCodeGuy in https:\u002F\u002Fgithub.com\u002FSomeOddCodeGuy\u002FWilmerAI\u002Fpull\u002F79.\n\n\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002FSomeOddCodeGuy\u002FWilmerAI\u002Fcompare\u002Fv0.3.0...v0.3.1","2025-12-07T23:14:43",{"id":186,"version":187,"summary_zh":188,"released_at":189},263895,"v0.3.0","* Added support for the Claude LLM API\n* Replaced the Flask-exposed runnable API: Eventlet on MacOS\u002FLinux, Waitress on Windows\n* Fixed unit tests not running properly on Windows\n* Corrected two breakages caused by trailing slashes on the LLM_API URL and the ConfigDirectory folder path\n* Added a proper cancellation implementation: pressing \"stop\" in Open WebUI or another front end now correctly terminates the workflow and cascades the cancellation down to the LLM.\n  * Some LLM APIs support this, but not all. It should cleanly stop Wilmer and its workflows, though it may not be able to force-stop an LLM API that is mid-request.\n* Added variable substitution for Endpoints and Presets\n  * Currently limited to variables hardcoded at the top of the workflow, or agentXInputs from a parent workflow.","2025-10-13T02:50:28",{"id":191,"version":192,"summary_zh":193,"released_at":194},263896,"v0.2.1","## What's Changed\n\n- Added a documentation folder for LLM-assisted workflow generation. Still in development.\n> This is still a work in progress, but I've already used it to generate a few workflows. It's the start of a direction I want Wilmer to go in: making its setup and workflow generation easy for an LLM to automate.\n\n- Fixed streaming for the static response node\n- Updated some of the article wiki nodes to return a specified number of results\n- Fixed a thinking-tag cleanup bug. There was a case where an LLM (magistral 2509) would recognize and process the thinking tags but generate nothing, so the entire response was deleted and agentXOutput received a completely empty response.\n- Added the ArithmeticProcessor node\n- Added the Conditional node\n- Added the StringConcatenator node\n- Updated the conditional workflow so the default case can pass content straight through without entering another workflow\n- Implemented a POC of recursive workflows, using a simple coding workflow as the example. Wikipedia workflows are coming next, but I want to test more before releasing them.\n\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002FSomeOddCodeGuy\u002FWilmerAI\u002Fcompare\u002Fv0.2...v0.2.1","2025-09-29T03:47:35",{"id":196,"version":197,"summary_zh":198,"released_at":199},263897,"v0.2","## What's Changed\r\n* Unit Tests, bug fixes and documentation moving by @SomeOddCodeGuy in https:\u002F\u002Fgithub.com\u002FSomeOddCodeGuy\u002FWilmerAI\u002Fpull\u002F75\r\n\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002FSomeOddCodeGuy\u002FWilmerAI\u002Fcompare\u002Fv0.1.8.2...v0.2","2025-09-22T04:13:32",{"id":201,"version":202,"summary_zh":203,"released_at":204},263898,"v0.1.8.2","Just updating some docs and making a few tweaks to some configs","2025-09-16T02:51:15",{"id":206,"version":207,"summary_zh":208,"released_at":209},263899,"v0.1.8.1","Updated urllib3 to 2.5.0 to satisfy 2 dependabot issues and clear out security notifications.","2025-09-15T01:56:02",
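
The Endpoint/Preset variable substitution added back in v0.3.0 above combines with the workflow-top variables and `agentXInput` values described elsewhere in these notes. A hypothetical workflow fragment written as a Python literal; the field names are illustrative, not WilmerAI's exact schema:

```python
# Hypothetical workflow definition: the endpoint name comes from a variable
# hardcoded at the top of the workflow, and the preset from a parent
# workflow's scoped inputs. Field names are assumptions for illustration.
workflow = {
    "endpoint_to_use": "fast-small-model",    # hardcoded workflow-top variable
    "nodes": [
        {
            "title": "Summarize the conversation",
            "endpointName": "{endpoint_to_use}",  # filled in by substitution
            "preset": "{agent1Input}",            # supplied by a parent workflow
        }
    ],
}

def resolve(value: str, variables: dict) -> str:
    """Minimal stand-in for the substitution pass over node fields."""
    for name, replacement in variables.items():
        value = value.replace("{" + name + "}", replacement)
    return value

node = workflow["nodes"][0]
variables = {"endpoint_to_use": workflow["endpoint_to_use"],
             "agent1Input": "balanced-preset"}
print(resolve(node["endpointName"], variables))  # fast-small-model
print(resolve(node["preset"], variables))        # balanced-preset
```
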
{"id":211,"version":212,"summary_zh":213,"released_at":214},263900,"v0.1.8","> IMPORTANT: This may require deleting and rebuilding the vector memory file for a discussionId. The next message should start regenerating the memories, same as a file would. Just change the discussionId to something else, or back up the .db, if you want to preserve your current memories.\r\n\r\n- Added the new SaveCustomFile node, which simply lets you save a string to a text file.\r\n- Also added the StaticResponse node, which returns a constant string or a variable without making an LLM call.\r\n- Also fixed an issue with memories where the vector toggle was blocking the file memories from working.\r\n- Also corrected an issue with vector memories where the lookback turns could cause the memory tracker to get confused and rebuild the whole memory set (required a vector memory db rebuild).\r\n- Also corrected an issue with endpoints not properly removing the starting string as expected.\r\n- Set the default stream back to true for the OpenAI API.","2025-09-14T05:47:28",{"id":216,"version":217,"summary_zh":77,"released_at":218},263901,"v0.1.7.2","2025-08-28T02:18:36",{"id":220,"version":221,"summary_zh":222,"released_at":223},263902,"v0.1.7.1","Fixed a couple of example users","2025-08-27T00:20:40",{"id":225,"version":226,"summary_zh":227,"released_at":228},263903,"v0.1.7","## ✨ Custom-Workflow System Enhancements\r\n\r\nThis update introduces significant improvements to how custom-workflows are handled, making them more powerful, reliable, and easier to configure.\r\n\r\n### 1. Reliable Streaming for Nested Workflows\r\n\r\n**The Benefit:** You can now build complex, multi-step agents that use custom-workflows as their final step, and they will reliably stream their full response back to the user. This creates a much smoother and more responsive user experience for sophisticated interactions.\r\n\r\nPreviously, streaming from a custom-workflow was unreliable and could fail silently. This has been completely fixed by making the custom-workflow correctly inherit its \"responder\" status from the parent workflow that calls it.\r\n\r\n---\r\n\r\n### 2. Dynamic Loading of User-Specific Workflows\r\n\r\n**The Benefit:** Workflows can now be designed to dynamically call and execute other workflows from specific, user-defined folders. This unlocks powerful personalization and multi-tenancy capabilities.\r\n\r\nFor example, a main workflow can now decide to trigger a specialized \"summarization\" custom-workflow located in `user_A\u002Fworkflows\u002F` or `user_B\u002Fworkflows\u002F` based on the current context. This allows for greater modularity and customized experiences without duplicating core logic. You can enable this by adding the new `workflowUserFolderOverride` parameter to your custom-workflow node configuration.\r\n\r\n---\r\n\r\n### 3. Simplified and Centralized Configuration\r\n\r\n**The Benefit:** Configuring custom-workflows is now simpler and less error-prone. The logic for overriding prompts and other settings has been consolidated, which means your workflow JSON files will be cleaner and easier to maintain.\r\n\r\nThis refactoring reduces duplicated configuration between `CustomWorkflow` and `ConditionalCustomWorkflow` nodes, ensuring behavior is consistent and predictable.\r\n\r\n***\r\n\r\n## 🛠️ Tool & Node Enhancements\r\n\r\nWe've also made significant improvements to several built-in tool nodes, fixing bugs and adding new capabilities.\r\n\r\n### 1. Timestamps Now Available to All Nodes\r\n\r\n**The Benefit:** Build more context-aware agents that can reason about the timing of a conversation in their internal \"thought\" steps.\r\n\r\nPreviously, only the final \"responding\" node in a workflow could access message timestamps. 
We've removed this limitation. Now, **any node** can be configured to receive the conversation with timestamps by setting `\"addDiscussionIdTimestampsForLLM\": true`. This enables internal steps to perform time-based analysis (e.g., \"The user first mentioned this topic 10 minutes ago\").","2025-08-25T05:35:00",{"id":230,"version":231,"summary_zh":232,"released_at":233},263904,"v0.1.6","> WARNING: This release might be a bit buggy. I am testing as much as I can, but time is in short supply. If you find an issue, please let me know. Rollback if necessary.\r\n\r\n## Version 0.1.6: Workflow and Memory System Update\r\n\r\nThis release introduces significant improvements to the workflow engine, focusing on developer experience, dynamic\r\ncapabilities, and a new intelligent memory system. Core components have been refactored for clarity and consistency, and\r\nseveral new features provide greater flexibility and control.\r\n\r\n### Core Refactor: Unified ExecutionContext\r\n\r\nA new `ExecutionContext` object has been introduced as the central mechanism for managing runtime state in workflows.\r\nThis data class consolidates all relevant execution data—including `request_id`, `messages`, `agent_outputs`, and access\r\nto core services—into a single, consistent structure.\r\n\r\nThis change simplifies function signatures, reduces boilerplate parameter passing, and makes state management more\r\npredictable and maintainable. It also streamlines the development of new node types by providing a uniform interface.\r\n\r\n### Enhanced Workflow Configuration\r\n\r\nWorkflow configuration files (JSON) now support a more flexible structure. Workflows can be defined as a top-level\r\ndictionary containing a `\"nodes\"` array and any number of custom, workflow-scoped variables (e.g.,\r\n`\"persona_name\": \"Helpful Assistant\"`). These variables are automatically available for use in prompts within the\r\nworkflow.\r\n\r\nThe system maintains full backward compatibility with the previous format, which treated the workflow file as a simple\r\nlist of nodes. No migration is required.\r\n\r\nWhen multiple variable sources are present, precedence is applied in the following order:\r\n\r\n1. Date and time variables\r\n2. Custom workflow variables\r\n3. Agent or node outputs\r\n\r\nTo avoid conflicts, custom variable names should not overlap with system-generated ones (e.g.,\r\n`chat_user_prompt_last_one`).\r\n\r\n### New Features and Capabilities\r\n\r\n#### Sub-Workflow Parameters (scoped_inputs)\r\n\r\nWorkflows can now pass initial data into custom sub-workflows using the `scoped_inputs` parameter. 
This enables the creation of\r\nreusable, parameterized sub-workflows that behave like modular components.\r\n\r\nThe inputs are provided as a list of strings and are exposed within the sub-workflow as `{agent1Input}`,\r\n`{agent2Input}`, etc.\r\n\r\n#### Expanded Dynamic Prompt Variables\r\n\r\nA comprehensive set of new dynamic variables is now available for use in prompts:\r\n\r\n- **Workflow variables**: Access custom variables defined at the top level of the workflow JSON (e.g.,\r\n  `{persona_name}`).\r\n- **Date and time**: Variables such as `{todays_date_pretty}`, `{current_time_12h}`, and `{current_day_of_week}` are now\r\n  available.\r\n- **Time context**: The `{time_context_summary}` variable provides a natural language description of the timing between\r\n  messages (e.g., \"The user sent this a few minutes after your last message\").\r\n\r\n#### Intelligent Vector Memory System\r\n\r\nA new vector-based memory system has been implemented to enable more sophisticated and searchable long-term memory.\r\n\r\nKey features include:\r\n\r\n- **Automated memory generation**: Conversation history can be analyzed by an LLM to produce structured metadata (title,\r\n  summary, keywords) for semantic search.\r\n- **Customizable memory workflows**: The generation of both vector and file-based memories can now be controlled by\r\n  user-defined sub-workflows via `vectorMemoryWorkflowName` and `fileMemoryWorkflowName`, allowing full customization of\r\n  the summarization process. The old way of doing file memories still exists as well.\r\n- **Isolated storage**: Each conversation (identified by `discussion_id`) now uses a dedicated SQLite database for\r\n  vector memory, improving data isolation, performance, and scalability.\r\n- **Independent configuration**: Settings for vector and file-based memory (e.g., lookback turns, chunk size) are now\r\n  managed separately, allowing fine-tuned control.\r\n- **Backward compatibility**: The original file-based memory system remains fully supported and is the default. The\r\n  vector memory system is an optional, opt-in enhancement.\r\n\r\n#### Enhanced Timestamp Control\r\n\r\nNodes can now be configured to use human-readable, relative timestamps (e.g., `[Sent 5 minutes ago]`) instead of\r\nabsolute timestamps. This is enabled by setting `\"useRelativeTimestamps\": true` in the node configuration. If not\r\nspecified, absolute timestamps are used by default.\r\n\r\n### Quality of Life Improvements and Fixes\r\n\r\n- A new node type, `VectorMemorySearch`, has been added to support querying the vector memory system.\r\n- LLM stream handling and response formatting have been consolidated into a dedicated `StreamingResponseHandler` class\r\n  for improved maintainability.\r\n- The `responseStartTextToRemove` feature now accepts an array of strings, removing the first matching prefix from the\r\n  start of an LLM response. 
This allows for more flexible c","2025-08-18T03:35:46",{"id":235,"version":236,"summary_zh":237,"released_at":238},263905,"v0.1.5","## What's Changed\r\n* Release 2025 08\u002Frefactor workflows first step by @SomeOddCodeGuy in https:\u002F\u002Fgithub.com\u002FSomeOddCodeGuy\u002FWilmerAI\u002Fpull\u002F64\r\n\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002FSomeOddCodeGuy\u002FWilmerAI\u002Fcompare\u002Fv0.1.3...v0.1.5","2025-08-17T01:52:28",{"id":240,"version":241,"summary_zh":242,"released_at":243},263906,"v0.1.3","## What's Changed\r\n* Update README.md by @SomeOddCodeGuy in https:\u002F\u002Fgithub.com\u002FSomeOddCodeGuy\u002FWilmerAI\u002Fpull\u002F57\r\n* Major refactor of the LLM handlers by @SomeOddCodeGuy in https:\u002F\u002Fgithub.com\u002FSomeOddCodeGuy\u002FWilmerAI\u002Fpull\u002F59\r\n* Corrected merge issue where original handlers didn't get deleted by @SomeOddCodeGuy in https:\u002F\u002Fgithub.com\u002FSomeOddCodeGuy\u002FWilmerAI\u002Fpull\u002F60\r\n* Release 2025 07\u002Ffixes and updates by @SomeOddCodeGuy in https:\u002F\u002Fgithub.com\u002FSomeOddCodeGuy\u002FWilmerAI\u002Fpull\u002F62\r\n* Fixing trailing space issue that was causing problems with smaller models in some situations by @SomeOddCodeGuy in https:\u002F\u002Fgithub.com\u002FSomeOddCodeGuy\u002FWilmerAI\u002Fpull\u002F63\r\n\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002FSomeOddCodeGuy\u002FWilmerAI\u002Fcompare\u002Fv0.1.2...v0.1.3","2025-08-17T01:49:58"]
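
Closing out, the `responseStartTextToRemove` behavior from the v0.1.6 notes above is small enough to sketch. This assumes simple prefix matching; the project's real logic lives in its streaming handler and may differ in detail:

```python
def strip_response_prefix(response: str, prefixes: list[str]) -> str:
    """Remove the first matching prefix from the start of an LLM response.

    Mirrors v0.1.6's responseStartTextToRemove: the option accepts an array
    of strings, and only the first one that matches is stripped.
    """
    for prefix in prefixes:
        if response.startswith(prefix):
            return response[len(prefix):].lstrip()
    return response

# e.g. models that echo a role tag before answering:
print(strip_response_prefix("Assistant: Hello!", ["Assistant:", "AI:"]))  # Hello!
```
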