[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-withcatai--node-llama-cpp":3,"tool-withcatai--node-llama-cpp":61},[4,18,26,36,44,53],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":17},4358,"openclaw","openclaw\u002Fopenclaw","OpenClaw 是一款专为个人打造的本地化 AI 助手，旨在让你在自己的设备上拥有完全可控的智能伙伴。它打破了传统 AI 助手局限于特定网页或应用的束缚，能够直接接入你日常使用的各类通讯渠道，包括微信、WhatsApp、Telegram、Discord、iMessage 等数十种平台。无论你在哪个聊天软件中发送消息，OpenClaw 都能即时响应，甚至支持在 macOS、iOS 和 Android 设备上进行语音交互，并提供实时的画布渲染功能供你操控。\n\n这款工具主要解决了用户对数据隐私、响应速度以及“始终在线”体验的需求。通过将 AI 部署在本地，用户无需依赖云端服务即可享受快速、私密的智能辅助，真正实现了“你的数据，你做主”。其独特的技术亮点在于强大的网关架构，将控制平面与核心助手分离，确保跨平台通信的流畅性与扩展性。\n\nOpenClaw 非常适合希望构建个性化工作流的技术爱好者、开发者，以及注重隐私保护且不愿被单一生态绑定的普通用户。只要具备基础的终端操作能力（支持 macOS、Linux 及 Windows WSL2），即可通过简单的命令行引导完成部署。如果你渴望拥有一个懂你",349277,3,"2026-04-06T06:32:30",[13,14,15,16],"Agent","开发框架","图像","数据工具","ready",{"id":19,"name":20,"github_repo":21,"description_zh":22,"stars":23,"difficulty_score":10,"last_commit_at":24,"category_tags":25,"status":17},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,"2026-04-05T11:01:52",[14,15,13],{"id":27,"name":28,"github_repo":29,"description_zh":30,"stars":31,"difficulty_score":32,"last_commit_at":33,"category_tags":34,"status":17},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",148568,2,"2026-04-09T23:34:24",[14,13,35],"语言模型",{"id":37,"name":38,"github_repo":39,"description_zh":40,"stars":41,"difficulty_score":32,"last_commit_at":42,"category_tags":43,"status":17},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",108111,"2026-04-08T11:23:26",[14,15,13],{"id":45,"name":46,"github_repo":47,"description_zh":48,"stars":49,"difficulty_score":32,"last_commit_at":50,"category_tags":51,"status":17},6121,"gemini-cli","google-gemini\u002Fgemini-cli","gemini-cli 是一款由谷歌推出的开源 AI 命令行工具，它将强大的 Gemini 大模型能力直接集成到用户的终端环境中。对于习惯在命令行工作的开发者而言，它提供了一条从输入提示词到获取模型响应的最短路径，无需切换窗口即可享受智能辅助。\n\n这款工具主要解决了开发过程中频繁上下文切换的痛点，让用户能在熟悉的终端界面内直接完成代码理解、生成、调试以及自动化运维任务。无论是查询大型代码库、根据草图生成应用，还是执行复杂的 Git 操作，gemini-cli 都能通过自然语言指令高效处理。\n\n它特别适合广大软件工程师、DevOps 
人员及技术研究人员使用。其核心亮点包括支持高达 100 万 token 的超长上下文窗口，具备出色的逻辑推理能力；内置 Google 搜索、文件操作及 Shell 命令执行等实用工具；更独特的是，它支持 MCP（模型上下文协议），允许用户灵活扩展自定义集成，连接如图像生成等外部能力。此外，个人谷歌账号即可享受免费的额度支持，且项目基于 Apache 2.0 协议完全开源，是提升终端工作效率的理想助手。",100752,"2026-04-10T01:20:03",[52,13,15,14],"插件",{"id":54,"name":55,"github_repo":56,"description_zh":57,"stars":58,"difficulty_score":32,"last_commit_at":59,"category_tags":60,"status":17},4721,"markitdown","microsoft\u002Fmarkitdown","MarkItDown 是一款由微软 AutoGen 团队打造的轻量级 Python 工具，专为将各类文件高效转换为 Markdown 格式而设计。它支持 PDF、Word、Excel、PPT、图片（含 OCR）、音频（含语音转录）、HTML 乃至 YouTube 链接等多种格式的解析，能够精准提取文档中的标题、列表、表格和链接等关键结构信息。\n\n在人工智能应用日益普及的今天，大语言模型（LLM）虽擅长处理文本，却难以直接读取复杂的二进制办公文档。MarkItDown 恰好解决了这一痛点，它将非结构化或半结构化的文件转化为模型“原生理解”且 Token 效率极高的 Markdown 格式，成为连接本地文件与 AI 分析 pipeline 的理想桥梁。此外，它还提供了 MCP（模型上下文协议）服务器，可无缝集成到 Claude Desktop 等 LLM 应用中。\n\n这款工具特别适合开发者、数据科学家及 AI 研究人员使用，尤其是那些需要构建文档检索增强生成（RAG）系统、进行批量文本分析或希望让 AI 助手直接“阅读”本地文件的用户。虽然生成的内容也具备一定可读性，但其核心优势在于为机器消费而优化。",93400,"2026-04-06T19:52:38",[52,14],{"id":62,"github_repo":63,"name":64,"description_en":65,"description_zh":66,"ai_summary_zh":67,"readme_en":68,"readme_zh":69,"quickstart_zh":70,"use_case_zh":71,"hero_image_url":72,"owner_login":73,"owner_name":74,"owner_avatar_url":75,"owner_bio":76,"owner_company":77,"owner_location":77,"owner_email":77,"owner_twitter":77,"owner_website":77,"owner_url":78,"languages":79,"stars":114,"forks":115,"last_commit_at":116,"license":117,"difficulty_score":32,"env_os":118,"env_gpu":119,"env_ram":120,"env_deps":121,"category_tags":127,"github_topics":128,"view_count":32,"oss_zip_url":77,"oss_zip_packed_at":77,"status":17,"created_at":149,"updated_at":150,"faqs":151,"releases":180},6172,"withcatai\u002Fnode-llama-cpp","node-llama-cpp","Run AI models locally on your machine with node.js bindings for llama.cpp. 
Enforce a JSON schema on the model output on the generation level","node-llama-cpp 是一个让开发者能在本地电脑上轻松运行大型语言模型（LLM）的 Node.js 工具库。它基于高性能的 llama.cpp 项目构建，旨在解决在 JavaScript 环境中部署 AI 模型门槛高、配置复杂的问题，让用户无需深入底层编译细节即可调用强大的 AI 能力。\n\n这款工具特别适合 Node.js 开发者、全栈工程师以及希望在本地构建隐私安全 AI 应用的技术人员。其核心亮点在于“开箱即用”的体验：它自动适配用户的硬件环境，支持 Metal、CUDA 和 Vulkan 等多种加速技术，并预置了二进制文件，避免了繁琐的 node-gyp 或 Python 依赖配置。此外，node-llama-cpp 具备独特的结构化输出控制能力，能强制模型按指定的 JSON Schema 生成数据，极大提升了后端集成的可靠性；同时支持函数调用、文本嵌入及重排序等高级功能。无论是想通过一行命令在终端体验对话，还是在项目中深度集成本地 AI 服务，node-llama-cpp 都提供了完善的 TypeScript 支持和文档，是连接 JavaScript 生态与本地大模型的桥梁。","node-llama-cpp 是一个让开发者能在本地电脑上轻松运行大型语言模型（LLM）的 Node.js 工具库。它基于高性能的 llama.cpp 项目构建，旨在解决在 JavaScript 环境中部署 AI 模型门槛高、配置复杂的问题，让用户无需深入底层编译细节即可调用强大的 AI 能力。\n\n这款工具特别适合 Node.js 开发者、全栈工程师以及希望在本地构建隐私安全 AI 应用的技术人员。其核心亮点在于“开箱即用”的体验：它自动适配用户的硬件环境，支持 Metal、CUDA 和 Vulkan 等多种加速技术，并预置了二进制文件，避免了繁琐的 node-gyp 或 Python 依赖配置。此外，node-llama-cpp 具备独特的结构化输出控制能力，能强制模型按指定的 JSON Schema 生成数据，极大提升了后端集成的可靠性；同时支持函数调用、文本嵌入及重排序等高级功能。无论是想通过一行命令在终端体验对话，还是在项目中深度集成本地 AI 服务，node-llama-cpp 都提供了完善的 TypeScript 支持和文档，是连接 JavaScript 生态与本地大模型的桥梁。","\u003Cdiv align=\"center\">\n    \u003Ca href=\"https:\u002F\u002Fnode-llama-cpp.withcat.ai\" target=\"_blank\">\u003Cimg alt=\"node-llama-cpp Logo\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fwithcatai_node-llama-cpp_readme_e10791939417.png\" width=\"360px\" \u002F>\u003C\u002Fa>\n    \u003Ch1>node-llama-cpp\u003C\u002Fh1>\n    \u003Cp>Run AI models locally on your machine\u003C\u002Fp>\n    \u003Csub>Pre-built bindings are provided with a fallback to building from source with cmake\u003C\u002Fsub>\n    \u003Cp>\u003C\u002Fp>\n\u003C\u002Fdiv>\n\n\u003Cdiv align=\"center\" class=\"main-badges\">\n\n[![Build](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Factions\u002Fworkflows\u002Fbuild.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Factions\u002Fworkflows\u002Fbuild.yml)\n[![License](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fwithcatai_node-llama-cpp_readme_ac9c05ef13d1.png)](https:\u002F\u002Fwww.npmjs.com\u002Fpackage\u002Fnode-llama-cpp)\n[![Types](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fwithcatai_node-llama-cpp_readme_1b88f2806b42.png)](https:\u002F\u002Fwww.npmjs.com\u002Fpackage\u002Fnode-llama-cpp)\n[![Version](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fwithcatai_node-llama-cpp_readme_b488687c599c.png)](https:\u002F\u002Fwww.npmjs.com\u002Fpackage\u002Fnode-llama-cpp)\n\n\u003C\u002Fdiv>\n\n✨ [`gpt-oss` is here!](https:\u002F\u002Fnode-llama-cpp.withcat.ai\u002Fblog\u002Fv3.12-gpt-oss) ✨\n\n## Features\n* Run LLMs locally on your machine\n* [Metal, CUDA and Vulkan support](https:\u002F\u002Fnode-llama-cpp.withcat.ai\u002Fguide\u002F#gpu-support)\n* [Pre-built binaries are provided](https:\u002F\u002Fnode-llama-cpp.withcat.ai\u002Fguide\u002Fbuilding-from-source), with a fallback to building from source _**without**_ `node-gyp` or Python\n* [Adapts to your hardware automatically](https:\u002F\u002Fnode-llama-cpp.withcat.ai\u002Fguide\u002F#gpu-support), no need to configure anything\n* A complete suite of everything you need to use LLMs in your projects\n* [Use the CLI to chat with a model without writing any code](#try-it-without-installing)\n* Up-to-date with the latest `llama.cpp`. 
Download and compile the latest release with a [single CLI command](https:\u002F\u002Fnode-llama-cpp.withcat.ai\u002Fguide\u002Fbuilding-from-source#downloading-a-release)\n* Enforce a model to generate output in a parseable format, [like JSON](https:\u002F\u002Fnode-llama-cpp.withcat.ai\u002Fguide\u002Fchat-session#json-response), or even force it to [follow a specific JSON schema](https:\u002F\u002Fnode-llama-cpp.withcat.ai\u002Fguide\u002Fchat-session#response-json-schema)\n* [Provide a model with functions it can call on demand](https:\u002F\u002Fnode-llama-cpp.withcat.ai\u002Fguide\u002Fchat-session#function-calling) to retrieve information or perform actions\n* [Embedding and reranking support](https:\u002F\u002Fnode-llama-cpp.withcat.ai\u002Fguide\u002Fembedding)\n* [Safe against special token injection attacks](https:\u002F\u002Fnode-llama-cpp.withcat.ai\u002Fguide\u002Fllama-text#input-safety-in-node-llama-cpp)\n* Great developer experience with full TypeScript support, and [complete documentation](https:\u002F\u002Fnode-llama-cpp.withcat.ai\u002Fguide\u002F)\n* Much more\n\n## [Documentation](https:\u002F\u002Fnode-llama-cpp.withcat.ai)\n* [Getting started guide](https:\u002F\u002Fnode-llama-cpp.withcat.ai\u002Fguide\u002F)\n* [API reference](https:\u002F\u002Fnode-llama-cpp.withcat.ai\u002Fapi\u002Ffunctions\u002FgetLlama)\n* [CLI help](https:\u002F\u002Fnode-llama-cpp.withcat.ai\u002Fcli\u002F)\n* [Blog](https:\u002F\u002Fnode-llama-cpp.withcat.ai\u002Fblog\u002F)\n* [Changelog](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Freleases)\n* [Roadmap](https:\u002F\u002Fgithub.com\u002Forgs\u002Fwithcatai\u002Fprojects\u002F1)\n\n## Try It Without Installing\nChat with a model in your terminal using [a single command](https:\u002F\u002Fnode-llama-cpp.withcat.ai\u002Fcli\u002Fchat):\n```bash\nnpx -y node-llama-cpp chat\n```\n\n## Installation\n```bash\nnpm install node-llama-cpp\n```\n\n[This package comes with pre-built binaries](https:\u002F\u002Fnode-llama-cpp.withcat.ai\u002Fguide\u002Fbuilding-from-source) for macOS, Linux and Windows.\n\nIf binaries are not available for your platform, it will fall back to downloading a release of `llama.cpp` and building it from source with `cmake`.\nTo disable this behavior, set the environment variable `NODE_LLAMA_CPP_SKIP_DOWNLOAD` to `true`.\n\n## Usage\n```typescript\nimport {fileURLToPath} from \"url\";\nimport path from \"path\";\nimport {getLlama, LlamaChatSession} from \"node-llama-cpp\";\n\nconst __dirname = path.dirname(fileURLToPath(import.meta.url));\n\nconst llama = await getLlama();\nconst model = await llama.loadModel({\n    modelPath: path.join(__dirname, \"models\", \"Meta-Llama-3.1-8B-Instruct.Q4_K_M.gguf\")\n});\nconst context = await model.createContext();\nconst session = new LlamaChatSession({\n    contextSequence: context.getSequence()\n});\n\n\nconst q1 = \"Hi there, how are you?\";\nconsole.log(\"User: \" + q1);\n\nconst a1 = await session.prompt(q1);\nconsole.log(\"AI: \" + a1);\n\n\nconst q2 = \"Summarize what you said\";\nconsole.log(\"User: \" + q2);\n\nconst a2 = await session.prompt(q2);\nconsole.log(\"AI: \" + a2);\n```\n\n> For more examples, see the [getting started guide](https:\u002F\u002Fnode-llama-cpp.withcat.ai\u002Fguide\u002F)\n\n## Contributing\nTo contribute to `node-llama-cpp`, read the [contribution guide](https:\u002F\u002Fnode-llama-cpp.withcat.ai\u002Fguide\u002Fcontributing).\n\n## Acknowledgements\n* llama.cpp: 
[ggml-org\u002Fllama.cpp](https:\u002F\u002Fgithub.com\u002Fggml-org\u002Fllama.cpp)\n\n\n\u003Cbr \u002F>\n\n\u003Cdiv align=\"center\" width=\"360\">\n    \u003Cimg alt=\"Star please\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fwithcatai_node-llama-cpp_readme_406524a134e7.png\" width=\"360\" margin=\"auto\" \u002F>\n    \u003Cbr\u002F>\n    \u003Cp align=\"right\">\n        \u003Ci>If you like this repo, star it ✨\u003C\u002Fi>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;\n    \u003C\u002Fp>\n\u003C\u002Fdiv>\n","\u003Cdiv align=\"center\">\n    \u003Ca href=\"https:\u002F\u002Fnode-llama-cpp.withcat.ai\" target=\"_blank\">\u003Cimg alt=\"node-llama-cpp Logo\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fwithcatai_node-llama-cpp_readme_e10791939417.png\" width=\"360px\" \u002F>\u003C\u002Fa>\n    \u003Ch1>node-llama-cpp\u003C\u002Fh1>\n    \u003Cp>在您的本地机器上运行 AI 模型\u003C\u002Fp>\n    \u003Csub>提供预编译的绑定，并支持回退到使用 CMake 从源码构建\u003C\u002Fsub>\n    \u003Cp>\u003C\u002Fp>\n\u003C\u002Fdiv>\n\n\u003Cdiv align=\"center\" class=\"main-badges\">\n\n[![构建](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Factions\u002Fworkflows\u002Fbuild.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Factions\u002Fworkflows\u002Fbuild.yml)\n[![许可证](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fwithcatai_node-llama-cpp_readme_ac9c05ef13d1.png)](https:\u002F\u002Fwww.npmjs.com\u002Fpackage\u002Fnode-llama-cpp)\n[![类型定义](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fwithcatai_node-llama-cpp_readme_1b88f2806b42.png)](https:\u002F\u002Fwww.npmjs.com\u002Fpackage\u002Fnode-llama-cpp)\n[![版本](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fwithcatai_node-llama-cpp_readme_b488687c599c.png)](https:\u002F\u002Fwww.npmjs.com\u002Fpackage\u002Fnode-llama-cpp)\n\n\u003C\u002Fdiv>\n\n✨ [`gpt-oss` 已发布！](https:\u002F\u002Fnode-llama-cpp.withcat.ai\u002Fblog\u002Fv3.12-gpt-oss) ✨\n\n## 特性\n* 在本地机器上运行大语言模型\n* 支持 Metal、CUDA 和 Vulkan 显卡加速\n* 提供预编译二进制文件，同时支持不依赖 `node-gyp` 或 Python 的源码构建\n* 自动适配您的硬件配置，无需任何额外设置\n* 完整的工具集，满足您在项目中使用大语言模型的所有需求\n* 可通过命令行与模型对话，无需编写代码\n* 始终保持与最新版 `llama.cpp` 同步。只需一条命令即可下载并编译最新版本\n* 能够强制模型以可解析的格式生成输出，例如 JSON 格式，甚至可以指定特定的 JSON 模式\n* 允许为模型提供可按需调用的函数，用于获取信息或执行操作\n* 支持嵌入和重排序功能\n* 具有防止特殊标记注入攻击的安全机制\n* 提供出色的开发体验，全面支持 TypeScript，并配有完整的文档\n* 更多功能……\n\n## [文档](https:\u002F\u002Fnode-llama-cpp.withcat.ai)\n* [入门指南](https:\u002F\u002Fnode-llama-cpp.withcat.ai\u002Fguide\u002F)\n* [API 参考](https:\u002F\u002Fnode-llama-cpp.withcat.ai\u002Fapi\u002Ffunctions\u002FgetLlama)\n* [CLI 使用帮助](https:\u002F\u002Fnode-llama-cpp.withcat.ai\u002Fcli\u002F)\n* [博客](https:\u002F\u002Fnode-llama-cpp.withcat.ai\u002Fblog\u002F)\n* [变更日志](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Freleases)\n* [路线图](https:\u002F\u002Fgithub.com\u002Forgs\u002Fwithcatai\u002Fprojects\u002F1)\n\n## 无需安装即可试用\n您可以在终端中使用一条命令与模型对话：\n```bash\nnpx -y node-llama-cpp chat\n```\n\n## 安装\n```bash\nnpm install node-llama-cpp\n```\n\n此包为 macOS、Linux 和 Windows 提供了预编译的二进制文件。如果您的平台没有可用的二进制文件，则会自动下载 `llama.cpp` 的最新版本，并使用 CMake 进行编译。若要禁用此行为，请将环境变量 `NODE_LLAMA_CPP_SKIP_DOWNLOAD` 设置为 `true`。\n\n## 使用示例\n```typescript\nimport {fileURLToPath} from \"url\";\nimport path from 
\"path\";\nimport {getLlama, LlamaChatSession} from \"node-llama-cpp\";\n\nconst __dirname = path.dirname(fileURLToPath(import.meta.url));\n\nconst llama = await getLlama();\nconst model = await llama.loadModel({\n    modelPath: path.join(__dirname, \"models\", \"Meta-Llama-3.1-8B-Instruct.Q4_K_M.gguf\")\n});\nconst context = await model.createContext();\nconst session = new LlamaChatSession({\n    contextSequence: context.getSequence()\n});\n\n\nconst q1 = \"你好，最近怎么样？\";\nconsole.log(\"用户：\" + q1);\n\nconst a1 = await session.prompt(q1);\nconsole.log(\"AI：\" + a1);\n\n\nconst q2 = \"请总结一下你刚才说的话\";\nconsole.log(\"用户：\" + q2);\n\nconst a2 = await session.prompt(q2);\nconsole.log(\"AI：\" + a2);\n```\n\n> 更多示例请参阅 [入门指南](https:\u002F\u002Fnode-llama-cpp.withcat.ai\u002Fguide\u002F)\n\n## 贡献\n如需参与 `node-llama-cpp` 的开发，请阅读 [贡献指南](https:\u002F\u002Fnode-llama-cpp.withcat.ai\u002Fguide\u002Fcontributing)。\n\n## 致谢\n* llama.cpp：[ggml-org\u002Fllama.cpp](https:\u002F\u002Fgithub.com\u002Fggml-org\u002Fllama.cpp)\n\n\n\u003Cbr \u002F>\n\n\u003Cdiv align=\"center\" width=\"360\">\n    \u003Cimg alt=\"请点赞\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fwithcatai_node-llama-cpp_readme_406524a134e7.png\" width=\"360\" margin=\"auto\" \u002F>\n    \u003Cbr\u002F>\n    \u003Cp align=\"right\">\n        \u003Ci>如果您喜欢这个仓库，请给它点个赞 ✨\u003C\u002Fi>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;\n    \u003C\u002Fp>\n\u003C\u002Fdiv>","# node-llama-cpp 快速上手指南\n\n`node-llama-cpp` 是一个强大的 Node.js 库，允许你在本地机器上运行大型语言模型（LLM）。它基于 `llama.cpp`，支持 Metal (macOS)、CUDA (NVIDIA) 和 Vulkan，并能自动适配你的硬件。无需配置复杂的构建环境，开箱即用。\n\n## 环境准备\n\n*   **操作系统**：macOS、Linux 或 Windows。\n*   **Node.js**：建议安装最新的 LTS 版本。\n*   **前置依赖**：\n    *   该包默认提供预编译二进制文件，**无需**安装 Python、`node-gyp` 或 CMake 即可直接使用。\n    *   仅当你的平台没有预编译文件时，它才会尝试使用 `cmake` 从源码构建（此时才需要安装 CMake）。\n*   **模型文件**：你需要准备一个 `.gguf` 格式的模型文件（例如从 Hugging Face 下载）。\n\n## 安装步骤\n\n使用 npm 安装核心库：\n\n```bash\nnpm install node-llama-cpp\n```\n\n> **提示**：如果你在中国大陆遇到下载预编译二进制文件速度慢的问题，可以尝试配置网络代理，或者设置环境变量跳过自动下载（需手动处理），但在大多数网络环境下可直接安装。\n\n## 基本使用\n\n以下是一个最简单的聊天示例。请确保你已经在项目目录下准备了模型文件（例如 `Meta-Llama-3.1-8B-Instruct.Q4_K_M.gguf`）。\n\n1.  创建一个 TypeScript 或 JavaScript 文件（例如 `index.ts`）。\n2.  填入以下代码：\n\n```typescript\nimport {fileURLToPath} from \"url\";\nimport path from \"path\";\nimport {getLlama, LlamaChatSession} from \"node-llama-cpp\";\n\nconst __dirname = path.dirname(fileURLToPath(import.meta.url));\n\n\u002F\u002F 初始化 Llama 实例\nconst llama = await getLlama();\n\n\u002F\u002F 加载模型 (请替换为你本地的 .gguf 模型路径)\nconst model = await llama.loadModel({\n    modelPath: path.join(__dirname, \"models\", \"Meta-Llama-3.1-8B-Instruct.Q4_K_M.gguf\")\n});\n\n\u002F\u002F 创建上下文\nconst context = await model.createContext();\n\n\u002F\u002F 创建聊天会话\nconst session = new LlamaChatSession({\n    contextSequence: context.getSequence()\n});\n\n\u002F\u002F 开始对话\nconst q1 = \"Hi there, how are you?\";\nconsole.log(\"User: \" + q1);\n\nconst a1 = await session.prompt(q1);\nconsole.log(\"AI: \" + a1);\n\nconst q2 = \"Summarize what you said\";\nconsole.log(\"User: \" + q2);\n\nconst a2 = await session.prompt(q2);\nconsole.log(\"AI: \" + a2);\n```\n\n3.  
运行你的代码（确保你的 `package.json` 中设置了 `\"type\": \"module\"` 或使用 `.mjs` 后缀）：\n\n```bash\nnode index.ts\n# 或者如果使用 tsx\nnpx tsx index.ts\n```\n\n### 免安装体验 (CLI)\n\n如果你只想快速测试某个模型而不想写代码，可以直接使用 npx 运行命令行工具：\n\n```bash\nnpx -y node-llama-cpp chat\n```\n\n运行后按照提示选择或输入模型路径即可开始终端聊天。","某电商初创团队需要在 Node.js 后端中构建一个本地化的智能订单分析系统，用于自动提取用户评论中的关键信息并生成结构化报告。\n\n### 没有 node-llama-cpp 时\n- **数据隐私风险高**：必须将用户评论发送至第三方云端 API 处理，面临敏感数据泄露合规风险。\n- **输出格式不可控**：大模型返回的文本杂乱无章，需编写复杂的正则表达式进行二次清洗，极易解析失败。\n- **部署依赖复杂**：在服务器配置 Python 环境、编译 C++ 扩展及管理 `node-gyp` 依赖耗时耗力，常因环境差异导致部署失败。\n- **硬件加速难启用**：难以自动适配服务器的 GPU 资源（如 CUDA 或 Metal），导致推理速度缓慢，无法实时响应。\n- **功能扩展受限**：缺乏原生的函数调用支持，无法让模型直接触发内部数据库查询来验证订单状态。\n\n### 使用 node-llama-cpp 后\n- **数据完全本地化**：直接在本地机器运行开源模型，用户数据无需出域，彻底满足隐私合规要求。\n- **原生 JSON Schema 约束**：利用生成级别的 JSON Schema 强制约束，模型直接输出符合预定义结构的订单数据，零代码清洗。\n- **开箱即用的部署体验**：自动下载预编译二进制文件，无需安装 Python 或手动编译，一键即可在不同操作系统上运行。\n- **智能硬件适配**：自动检测并启用服务器的 GPU 加速（支持 CUDA\u002FMetal\u002FVulkan），推理延迟降低 80% 以上。\n- **内置函数调用能力**：通过原生函数调用接口，模型可主动请求查询订单详情，实现“分析 - 验证”闭环自动化。\n\nnode-llama-cpp 让开发者能在 Node.js 生态中以最低成本实现安全、高效且结构可控的本地大模型应用落地。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fwithcatai_node-llama-cpp_11156e76.png","withcatai","Catai","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002Fwithcatai_bc8dab2e.jpg","Run AI models locally on your machine with Node.js",null,"https:\u002F\u002Fgithub.com\u002Fwithcatai",[80,84,88,92,96,100,103,107,111],{"name":81,"color":82,"percentage":83},"TypeScript","#3178c6",91.3,{"name":85,"color":86,"percentage":87},"C++","#f34b7d",4.5,{"name":89,"color":90,"percentage":91},"CSS","#663399",1.7,{"name":93,"color":94,"percentage":95},"Vue","#41b883",0.8,{"name":97,"color":98,"percentage":99},"JavaScript","#f1e05a",0.7,{"name":101,"color":102,"percentage":99},"CMake","#DA3434",{"name":104,"color":105,"percentage":106},"Shell","#89e051",0.2,{"name":108,"color":109,"percentage":110},"HTML","#e34c26",0,{"name":112,"color":113,"percentage":110},"C","#555555",1991,180,"2026-04-09T20:22:50","MIT","Linux, macOS, Windows","非必需。支持 Metal (macOS), CUDA (NVIDIA), Vulkan。若无对应预编译二进制文件，需从源码构建（依赖 cmake），无需手动配置即可自动适配硬件。","未说明（取决于所加载的模型大小）",{"notes":122,"python":123,"dependencies":124},"该工具提供预编译二进制文件，若平台不支持则回退到使用 cmake 从源码构建，此过程不需要 node-gyp 或 Python。支持通过 CLI 直接聊天而无需编写代码。可根据需要强制模型输出 JSON 格式或遵循特定 JSON Schema，并支持函数调用、嵌入和重排序功能。","不需要",[125,126],"cmake (仅在无预编译二进制时需从源码构建)","llama.cpp (内置或自动下载)",[14,35,15,13],[129,130,131,132,133,134,135,136,137,138,139,140,141,142,143,144,145,146,147,148],"ai","bindings","catai","llama","llama-cpp","llm","nodejs","prebuilt-binaries","grammar","gguf","cuda","metal","json-schema","cmake","cmake-js","self-hosted","embedding","function-calling","gpu","vulkan","2026-03-27T02:49:30.150509","2026-04-10T18:53:15.055477",[152,157,162,167,171,176],{"id":153,"question_zh":154,"answer_zh":155,"source_url":156},27949,"在 Windows 上的 Electron 应用中升级 node-llama-cpp 后出现 NoBinaryFoundError 错误怎么办？","该问题通常发生在从旧版本（如 3.0.0-beta44）升级到新版本（如 3.2.0）时，Windows x64 预编译二进制文件未被正确识别。维护者已确认这是由 llama.cpp 的破坏性变更引起的，并已在版本 3.3.0 中修复。解决方案是将 `node-llama-cpp` 升级到最新版本（v3.3.0 或更高）。\n\n可以通过以下命令更新：\n```bash\nnpm install node-llama-cpp@latest\n```\n或者指定版本：\n```bash\nnpm install node-llama-cpp@3.3.0\n```","https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fissues\u002F381",{"id":158,"question_zh":159,"answer_zh":160,"source_url":161},27950,"在 Apple M 系列芯片（如 M4\u002FM5）上加载模型时遇到 \"Failed to create context\" 错误的原因是什么？","此错误可能与内存锁定（mlock）有关。如果用户设备上运行了其他占用大量内存且使用了 mlock 
的进程，会导致操作系统无法将未使用的内存交换到磁盘，从而使实际可用内存小于报告值，导致上下文创建失败。\n\n建议检查是否有其他高内存占用的进程在运行，或者尝试减少模型的上下文大小（contextSize）。如果问题持续，可能需要等待后续版本对内存估算算法的优化。","https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fissues\u002F549",{"id":163,"question_zh":164,"answer_zh":165,"source_url":166},27951,"如何在加载 LLaMa 模型后动态应用不同的 LoRA 适配器？","该功能已在 `node-llama-cpp` v3.0.0 版本中发布。你可以使用库中提供的 API 来动态加载和应用 LoRA 文件，其底层调用了 `llama.cpp` 中的 `llama_model_apply_lora_from_file()` 函数。\n\n请确保你的项目依赖版本至少为 3.0.0：\n```bash\nnpm install node-llama-cpp@3.0.0\n```\n具体使用方法请参考官方文档中关于 LoRA 动态加载的章节。","https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fissues\u002F103",{"id":168,"question_zh":169,"answer_zh":170,"source_url":166},27952,"如何在不实际加载模型文件的情况下估算模型所需的内存资源？","`node-llama-cpp` 提供了逆向工程 `llama.cpp` 实现的内存估算功能，仅需模型文件的元数据即可进行估算。虽然并非完美，但在大多数模型上估算结果非常接近实际使用情况。\n\n你可以使用以下命令来测量和估算特定模型的内存需求：\n```bash\nnpx node-llama-cpp@beta inspect measure \u003C模型文件路径>\n```\n如果发现估算偏差较大，可以对比 `llama.cpp` 源码中该模型的内存分配逻辑，并向 `node-llama-cpp` 提交 PR 以改进 `GgufInsights` 中的估算算法。",{"id":172,"question_zh":173,"answer_zh":174,"source_url":175},27953,"遇到 \"Conversation roles must alternate user\u002Fassistant...\" 错误该如何解决？","这个错误通常由对话历史格式不正确引起，Jinja 模板引擎要求对话角色必须严格交替（即 user, assistant, user, assistant...）。\n\n解决方案是在每次发送新提示之前，重置或正确初始化对话历史记录。有用户反馈，将历史记录设置为初始空状态（`initialChatHistory`）后再添加当前提示，可以解决该问题。请检查你的代码逻辑，确保在构建消息列表时没有连续出现两个相同角色的消息。","https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fissues\u002F263",{"id":177,"question_zh":178,"answer_zh":179,"source_url":161},27954,"为什么在较新的 macOS 版本或 M 系列芯片上会出现上下文大小超出 VRAM 限制的错误？","这通常是因为系统报告的可用显存与实际可用显存不一致，特别是在有其他进程占用内存或使用了内存锁定（mlock）的情况下。当请求的上下文大小（contextSize）过大时，会触发 \"A context size of X is too large for the available VRAM\" 错误。\n\n建议尝试减小 `createContext` 时的 `contextSize` 参数值，或者关闭其他占用大量内存的应用程序。如果使用的是开发版或测试版库，建议升级到最新稳定版以获取更好的内存管理策略。",[181,186,191,196,201,206,211,216,221,226,231,236,241,246,251,256,261,266,271,276],{"id":182,"version":183,"summary_zh":184,"released_at":185},188866,"v3.18.1","## [3.18.1](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcompare\u002Fv3.18.0...v3.18.1)（2026-03-17）\n\n\n### 功能特性\n\n* 自定义 `postinstall` 行为（[#582](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fissues\u002F582)）（[57bea3d](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcommit\u002F57bea3da9ffa78955e8b25f195ce6cc714980cb5)）（文档：[自定义 `postinstall` 行为](https:\u002F\u002Fnode-llama-cpp.withcat.ai\u002Fguide\u002Ftroubleshooting#postinstall-behavior)）\n* 实验性支持上下文 KV 缓存类型配置（[#582](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fissues\u002F582)）（[57bea3d](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcommit\u002F57bea3da9ffa78955e8b25f195ce6cc714980cb5)）（文档：[`LlamaContextOptions[\"experimentalKvCacheKeyType\"]`](https:\u002F\u002Fnode-llama-cpp.withcat.ai\u002Fapi\u002Ftype-aliases\u002FLlamaContextOptions#experimentalkvcachekeytype)）\n* 支持 `NVFP4` 量化格式（[#582](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fissues\u002F582)）（[57bea3d](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcommit\u002F57bea3da9ffa78955e8b25f195ce6cc714980cb5)）\n\n---\n\n随 `llama.cpp` 版本 [`b8390`](https:\u002F\u002Fgithub.com\u002Fggml-org\u002Fllama.cpp\u002Freleases\u002Ftag\u002Fb8390) 一同发布\n\n> 若要使用最新可用的 `llama.cpp` 版本，请运行 `npx -n node-llama-cpp source download --release 
latest`。（[了解更多](https:\u002F\u002Fnode-llama-cpp.withcat.ai\u002Fguide\u002Fbuilding-from-source#download-new-release)）","2026-03-17T08:38:19",{"id":187,"version":188,"summary_zh":189,"released_at":190},188867,"v3.18.0","# [3.18.0](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcompare\u002Fv3.17.1...v3.18.0) (2026-03-15)\n\n\n### 功能特性\n\n* 为需要自动检查点的模型添加自动检查点功能（[#573](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fissues\u002F573)）（[c641959](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcommit\u002Fc6419597611c69840571b62e08339a200da2882e)）\n* **`QwenChatWrapper`:** 支持 Qwen 3.5（[#573](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fissues\u002F573)）（[c641959](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcommit\u002Fc6419597611c69840571b62e08339a200da2882e)）\n* **`inspect gpu` 命令:** 检测并报告缺失的预编译二进制模块及自定义 npm 注册表（[#573](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fissues\u002F573)）（[c641959](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcommit\u002Fc6419597611c69840571b62e08339a200da2882e)）\n\n\n### 错误修复\n\n* **`resolveModelFile`:** 去重并发下载请求（[#570](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fissues\u002F570)）（[cc105b9](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcommit\u002Fcc105b944b704f5f987004c7e58259275f885f42)）\n* 修正文档链接中 Vulkan URL 的大小写问题（[#568](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fissues\u002F568)）（[5a44506](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcommit\u002F5a445067641b055a5217ac981e1dade7d767f884)）\n* Qwen 3.5 的内存估算问题（[#573](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fissues\u002F573)）（[c641959](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcommit\u002Fc6419597611c69840571b62e08339a200da2882e)）\n* HarmonyChatWrapper 中语法使用的问题（[#573](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fissues\u002F573)）（[c641959](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcommit\u002Fc6419597611c69840571b62e08339a200da2882e)）\n* 添加 Mistral 思考段落检测功能（[#573](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fissues\u002F573)）（[c641959](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcommit\u002Fc6419597611c69840571b62e08339a200da2882e)）\n* 在上下文切换时，对当前响应中过长的段落进行压缩，而不是抛出错误（[#573](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fissues\u002F573)）（[c641959](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcommit\u002Fc6419597611c69840571b62e08339a200da2882e)）\n* 将默认思考预算设置为上下文大小的 75%，以防止生成低质量的回答（[#573](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fissues\u002F573)）（[c641959](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcommit\u002Fc6419597611c69840571b62e08339a200da2882e)）\n\n---\n\n随 `llama.cpp` 发布版本 [`b8352`](https:\u002F\u002Fgithub.com\u002Fggml-org\u002Fllama.cpp\u002Freleases\u002Ftag\u002Fb8352) 一同发布。\n\n> 若要使用最新的 `llama.cpp` 版本，请运行 `npx -n node-llama-cpp source download --release latest`。([了解更多](https:\u002F\u002Fnode-llama-cpp.withcat.ai\u002Fguide\u002Fbuilding-from-source#download-new-release))","2026-03-15T21:18:04",{"id":192,"version":193,"summary_zh":194,"released_at":195},188868,"v3.17.1","## 
[3.17.1](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcompare\u002Fv3.17.0...v3.17.1)（2026-02-28）\n\n\n### 错误修复\n\n* Electron 模板（[#566](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fissues\u002F566)）（[8931402](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcommit\u002F89314022104def53c5fe5c13cb1ceca25b77a8e2)）\n\n---\n\n随 `llama.cpp` 发布版本 [`b8179`](https:\u002F\u002Fgithub.com\u002Fggml-org\u002Fllama.cpp\u002Freleases\u002Ftag\u002Fb8179) 一同发布\n\n> 若要使用最新的 `llama.cpp` 发布版本，请运行 `npx -n node-llama-cpp source download --release latest`。（[了解更多](https:\u002F\u002Fnode-llama-cpp.withcat.ai\u002Fguide\u002Fbuilding-from-source#download-new-release)）\n\n\n","2026-02-28T01:51:39",{"id":197,"version":198,"summary_zh":199,"released_at":200},188869,"v3.17.0","# [3.17.0](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcompare\u002Fv3.16.2...v3.17.0) (2026-02-27)\n\n\n### 功能特性\n\n* **`getLlama`:** `build: \"autoAttempt\"` ([#564](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fissues\u002F564)) ([dda5ade](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcommit\u002Fdda5ade714b5e2b2ea4f1de50fbefd41198ef397)) (文档：[`LlamaOptions [\"build\"]`](https:\u002F\u002Fnode-llama-cpp.withcat.ai\u002Fapi\u002Ftype-aliases\u002FLlamaOptions#build))\n* 移除 octokit 依赖 ([#564](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fissues\u002F564)) ([dda5ade](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcommit\u002Fdda5ade714b5e2b2ea4f1de50fbefd41198ef397))\n\n\n### 错误修复\n\n* **CLI:** 默认禁用直接 I\u002FO ([#564](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fissues\u002F564)) ([dda5ade](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcommit\u002Fdda5ade714b5e2b2ea4f1de50fbefd41198ef397))\n* 在未释放 `Llama` 实例的情况下，Bun 进程退出时发生段错误 ([#564](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fissues\u002F564)) ([dda5ade](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcommit\u002Fdda5ade714b5e2b2ea4f1de50fbefd41198ef397))\n* 检测 Nix 环境中的 glibc ([#564](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fissues\u002F564)) ([dda5ade](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcommit\u002Fdda5ade714b5e2b2ea4f1de50fbefd41198ef397))\n\n\n---\n\n随 `llama.cpp` 发布版本 [`b8169`](https:\u002F\u002Fgithub.com\u002Fggml-org\u002Fllama.cpp\u002Freleases\u002Ftag\u002Fb8169) 一起发布\n\n> 若要使用最新的 `llama.cpp` 发布版本，请运行 `npx -n node-llama-cpp source download --release latest`。([了解更多](https:\u002F\u002Fnode-llama-cpp.withcat.ai\u002Fguide\u002Fbuilding-from-source#download-new-release))","2026-02-27T22:38:15",{"id":202,"version":203,"summary_zh":204,"released_at":205},188870,"v3.16.2","## [3.16.2](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcompare\u002Fv3.16.1...v3.16.2)（2026-02-21）\n\n\n### 错误修复\n\n* macOS 14 预编译二进制文件（[#559](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fissues\u002F559)）（[6faa5ae](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcommit\u002F6faa5aeffbddf138010b38b90e4effb0485bf2ee)）\n\n---\n\n随 `llama.cpp` 发布版本 [`b8121`](https:\u002F\u002Fgithub.com\u002Fggml-org\u002Fllama.cpp\u002Freleases\u002Ftag\u002Fb8121) 一起发布\n\n> 若要使用最新的 `llama.cpp` 发布版本，请运行 `npx -n node-llama-cpp source download --release 
latest`。（[了解更多](https:\u002F\u002Fnode-llama-cpp.withcat.ai\u002Fguide\u002Fbuilding-from-source#download-new-release)）\n\n\n","2026-02-21T20:33:02",{"id":207,"version":208,"summary_zh":209,"released_at":210},188871,"v3.16.1","## [3.16.1](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcompare\u002Fv3.16.0...v3.16.1)（2026-02-20）\n\n\n### 错误修复\n\n* 导出缺失的类型 ([#557](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fissues\u002F557)) ([498711c](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcommit\u002F498711c507cf69f64273d40ef7f9108866ba1af5))\n\n---\n\n随 `llama.cpp` 发布版本 [`b8117`](https:\u002F\u002Fgithub.com\u002Fggml-org\u002Fllama.cpp\u002Freleases\u002Ftag\u002Fb8117) 一同发布\n\n> 若要使用最新的 `llama.cpp` 发布版本，请运行 `npx -n node-llama-cpp source download --release latest`。([了解更多](https:\u002F\u002Fnode-llama-cpp.withcat.ai\u002Fguide\u002Fbuilding-from-source#download-new-release))\n\n\n","2026-02-20T21:39:32",{"id":212,"version":213,"summary_zh":214,"released_at":215},188872,"v3.16.0","# [3.16.0](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcompare\u002Fv3.15.1...v3.16.0) (2026-02-19)\n\n\n### 功能特性\n\n* 排除顶部选项（XTC）（[#553](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fissues\u002F553)）（[57e8c22](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcommit\u002F57e8c2264738693acc7450c43d565bb4ceac1129)）（文档：[`LLamaChatPromptOptions[\"xtc\"]`](https:\u002F\u002Fnode-llama-cpp.withcat.ai\u002Fapi\u002Ftype-aliases\u002FLLamaChatPromptOptions#xtc)）\n* DRY（不要重复自己）重复惩罚（[#553](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fissues\u002F553)）（[57e8c22](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcommit\u002F57e8c2264738693acc7450c43d565bb4ceac1129)）（文档：[`LLamaChatPromptOptions[\"dryRepeatPenalty\"]`](https:\u002F\u002Fnode-llama-cpp.withcat.ai\u002Fapi\u002Ftype-aliases\u002FLLamaChatPromptOptions#dryrepeatpenalty)）\n* 支持 Tiny Aya 模型（[#553](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fissues\u002F553)）（[57e8c22](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcommit\u002F57e8c2264738693acc7450c43d565bb4ceac1129)）\n\n\n### 错误修复\n* 调整默认的 VRAM 填充配置，以预留足够的内存用于计算缓冲区（[#553](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fissues\u002F553)）（[57e8c22](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcommit\u002F57e8c2264738693acc7450c43d565bb4ceac1129)）\n* 支持带有可选空格前缀的函数调用语法（[#553](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fissues\u002F553)）（[57e8c22](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcommit\u002F57e8c2264738693acc7450c43d565bb4ceac1129)）\n* 将 `useDirectIo` 的默认值更改为 `false`（[#553](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fissues\u002F553)）（[57e8c22](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcommit\u002F57e8c2264738693acc7450c43d565bb4ceac1129)）\n* Vulkan 设备去重（[#553](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fissues\u002F553)）（[57e8c22](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcommit\u002F57e8c2264738693acc7450c43d565bb4ceac1129)）\n\n---\n\n随 `llama.cpp` 发布版本 [`b8095`](https:\u002F\u002Fgithub.com\u002Fggml-org\u002Fllama.cpp\u002Freleases\u002Ftag\u002Fb8095) 一同发布。\n\n> 若要使用最新的 `llama.cpp` 发布版本，请运行 `npx -n node-llama-cpp source download --release 
latest`。（[了解更多](https:\u002F\u002Fnode-llama-cpp.withcat.ai\u002Fguide\u002Fbuilding-from-source#download-new-release)）","2026-02-19T04:08:22",{"id":217,"version":218,"summary_zh":219,"released_at":220},188873,"v3.15.1","## [3.15.1](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcompare\u002Fv3.15.0...v3.15.1)（2026-01-26）\n\n\n### 错误修复\n\n* 适配`llama.cpp`的变更（[#547](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fissues\u002F547)）（[4baa480](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcommit\u002F4baa480f6d85f7ca425ffec8811963407f0bc9e1)）\n* 后端库文件重复（[#541](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fissues\u002F541)）（[f5123bf](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcommit\u002Ff5123bffa483202c8cff85e1f4b11933f4189597)）\n\n---\n\n随`llama.cpp`发布版本[`b7836`](https:\u002F\u002Fgithub.com\u002Fggml-org\u002Fllama.cpp\u002Freleases\u002Ftag\u002Fb7836)一同发布\n\n> 若要使用最新的`llama.cpp`发布版本，请运行`npx -n node-llama-cpp source download --release latest`。（[了解更多](https:\u002F\u002Fnode-llama-cpp.withcat.ai\u002Fguide\u002Fbuilding-from-source#download-new-release)）\n\n\n","2026-01-26T03:06:50",{"id":222,"version":223,"summary_zh":224,"released_at":225},188874,"v3.15.0","# [3.15.0](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcompare\u002Fv3.14.5...v3.15.0) (2026-01-10)\n\n\n### 功能特性\n\n* **`LlamaCompletion`:** `stopOnAbortSignal` ([#538](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fissues\u002F538)) ([734693d](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcommit\u002F734693d5022c6627823ca7cdd270ad6dda67812c))（文档：[`LlamaCompletionGenerationOptions[\"stopOnAbortSignal\"]`](https:\u002F\u002Fnode-llama-cpp.withcat.ai\u002Fapi\u002Ftype-aliases\u002FLlamaCompletionGenerationOptions#stoponabortsignal)）\n* **`LlamaModel`:** `useDirectIo` ([#538](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fissues\u002F538)) ([734693d](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcommit\u002F734693d5022c6627823ca7cdd270ad6dda67812c))（文档：[`LlamaModelOptions[\"useDirectIo\"]`](https:\u002F\u002Fnode-llama-cpp.withcat.ai\u002Fapi\u002Ftype-aliases\u002FLlamaModelOptions#usedirectio)）\n\n\n### 错误修复\n\n* 支持新的 CUDA 13.1 架构 ([#538](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fissues\u002F538)) ([734693d](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcommit\u002F734693d5022c6627823ca7cdd270ad6dda67812c))\n* 使用 CUDA 13.1 而不是 13.0 构建预编译二进制文件 ([#538](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fissues\u002F538)) ([734693d](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcommit\u002F734693d5022c6627823ca7cdd270ad6dda67812c))\n\n---\n\n随 `llama.cpp` 发布版本 [`b7698`](https:\u002F\u002Fgithub.com\u002Fggml-org\u002Fllama.cpp\u002Freleases\u002Ftag\u002Fb7698) 一起发布\n\n> 若要使用最新的 `llama.cpp` 发布版本，请运行 `npx -n node-llama-cpp source download --release latest`。([了解更多](https:\u002F\u002Fnode-llama-cpp.withcat.ai\u002Fguide\u002Fbuilding-from-source#download-new-release))","2026-01-10T22:40:22",{"id":227,"version":228,"summary_zh":229,"released_at":230},188875,"v3.14.5","## [3.14.5](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcompare\u002Fv3.14.4...v3.14.5)（2025-12-10）\n\n\n### 错误修复\n\n* OIDC 包发布 ([#531](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fissues\u002F531)) 
([3d3cb97](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcommit\u002F3d3cb977ae698d5404bf95822a9a2d590b363f14))\n\n---\n\n随 `llama.cpp` 发布版本 [`b7347`](https:\u002F\u002Fgithub.com\u002Fggml-org\u002Fllama.cpp\u002Freleases\u002Ftag\u002Fb7347) 一起发布\n\n> 若要使用最新的 `llama.cpp` 发布版本，请运行 `npx -n node-llama-cpp source download --release latest`。([了解更多](https:\u002F\u002Fnode-llama-cpp.withcat.ai\u002Fguide\u002Fbuilding-from-source#download-new-release))\n\n\n","2025-12-10T23:38:37",{"id":232,"version":233,"summary_zh":234,"released_at":235},188876,"v3.14.4","## [3.14.4](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcompare\u002Fv3.14.3...v3.14.4) (2025-12-08)\n\n\n### Bug Fixes\n\n* `create-node-llama-cpp` module package release ([#530](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fissues\u002F530)) ([9a428e5](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcommit\u002F9a428e5dc8b2344174e4a73643f29585ac88fd79))\n\n---\n\nShipped with `llama.cpp` release [`b7324`](https:\u002F\u002Fgithub.com\u002Fggml-org\u002Fllama.cpp\u002Freleases\u002Ftag\u002Fb7324)\n\n> To use the latest `llama.cpp` release available, run `npx -n node-llama-cpp source download --release latest`. ([learn more](https:\u002F\u002Fnode-llama-cpp.withcat.ai\u002Fguide\u002Fbuilding-from-source#download-new-release))\n\n\n","2025-12-08T19:53:59",{"id":237,"version":238,"summary_zh":239,"released_at":240},188877,"v3.14.3","## [3.14.3](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcompare\u002Fv3.14.2...v3.14.3) (2025-12-08)\r\n\r\n\r\n### Features\r\n\r\n* **`source download` CLI:** log the downloaded release when the release is set to `latest` ([#522](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fissues\u002F522)) ([e37835c](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcommit\u002Fe37835ce22270e2e92b371a49fdac1201ecc3443))\r\n\r\n\r\n### Bug Fixes\r\n\r\n* adapt to `llama.cpp` changes ([#522](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fissues\u002F522)) ([e37835c](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcommit\u002Fe37835ce22270e2e92b371a49fdac1201ecc3443))\r\n* pad the context size to align with the implementation in `llama.cpp` ([#522](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fissues\u002F522)) ([e37835c](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcommit\u002Fe37835ce22270e2e92b371a49fdac1201ecc3443)) (see [#522](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fissues\u002F522) for more details)\r\n\r\n---\r\n\r\nShipped with `llama.cpp` release [`b7315`](https:\u002F\u002Fgithub.com\u002Fggml-org\u002Fllama.cpp\u002Freleases\u002Ftag\u002Fb7315)\r\n\r\n> To use the latest `llama.cpp` release available, run `npx -n node-llama-cpp source download --release latest`. 
([learn more](https:\u002F\u002Fnode-llama-cpp.withcat.ai\u002Fguide\u002Fbuilding-from-source#download-new-release))\r\n\r\n\r\n","2025-12-08T16:27:30",{"id":242,"version":243,"summary_zh":244,"released_at":245},188878,"v3.14.2","## [3.14.2](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcompare\u002Fv3.14.1...v3.14.2) (2025-10-26)\r\n\r\n\r\n### Bug Fixes\r\n\r\n* a new release due to a `semantic-release` failure in the previous release ([#518](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fissues\u002F518)) ([e516e50](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcommit\u002Fe516e5015d6818d483487e4229ea91a36188b2c2))\r\n\r\n---\r\n\r\nShipped with `llama.cpp` release [`b6845`](https:\u002F\u002Fgithub.com\u002Fggml-org\u002Fllama.cpp\u002Freleases\u002Ftag\u002Fb6845)\r\n\r\n> To use the latest `llama.cpp` release available, run `npx -n node-llama-cpp source download --release latest`. ([learn more](https:\u002F\u002Fnode-llama-cpp.withcat.ai\u002Fguide\u002Fbuilding-from-source#download-new-release))\r\n\r\n\r\n","2025-10-26T19:45:53",{"id":247,"version":248,"summary_zh":249,"released_at":250},188879,"v3.14.1","## [3.14.1](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcompare\u002Fv3.14.0...v3.14.1) (2025-10-26)\r\n\r\n\r\n### Bug Fixes\r\n\r\n* **Vulkan:** include integrated GPU memory ([#516](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fissues\u002F516)) ([47475ac](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcommit\u002F47475aceef49429c4ba51e681249d82d78be0960))\r\n* **Vulkan:** deduplicate the same device coming from different drivers ([#516](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fissues\u002F516)) ([47475ac](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcommit\u002F47475aceef49429c4ba51e681249d82d78be0960))\r\n* adapt Llama chat wrappers to breaking `llama.cpp` changes ([#516](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fissues\u002F516)) ([47475ac](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcommit\u002F47475aceef49429c4ba51e681249d82d78be0960))\r\n\r\n---\r\n\r\nShipped with `llama.cpp` release [`b6843`](https:\u002F\u002Fgithub.com\u002Fggml-org\u002Fllama.cpp\u002Freleases\u002Ftag\u002Fb6843)\r\n\r\n> To use the latest `llama.cpp` release available, run `npx -n node-llama-cpp source download --release latest`. 
([learn more](https:\u002F\u002Fnode-llama-cpp.withcat.ai\u002Fguide\u002Fbuilding-from-source#download-new-release))\r\n\r\n\r\n","2025-10-26T17:32:29",{"id":252,"version":253,"summary_zh":254,"released_at":255},188880,"v3.14.0","# [3.14.0](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcompare\u002Fv3.13.0...v3.14.0) (2025-10-02)\r\n\r\n\r\n### Features\r\n\r\n* Qwen3 Reranker support ([#506](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fissues\u002F506)) ([00305f7](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcommit\u002F00305f7790d3f998ab4311b3ea0ccf54732d2c02)) (see [#506](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fissues\u002F506) for prequantized Qwen3 Reranker models you can use)\r\n\r\n\r\n### Bug Fixes\r\n* handle HuggingFace rate limit responses ([#506](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fissues\u002F506)) ([00305f7](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcommit\u002F00305f7790d3f998ab4311b3ea0ccf54732d2c02))\r\n* adapt to `llama.cpp` breaking changes ([#506](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fissues\u002F506)) ([00305f7](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcommit\u002F00305f7790d3f998ab4311b3ea0ccf54732d2c02))\r\n\r\n---\r\n\r\nShipped with `llama.cpp` release [`b6673`](https:\u002F\u002Fgithub.com\u002Fggml-org\u002Fllama.cpp\u002Freleases\u002Ftag\u002Fb6673)\r\n\r\n> To use the latest `llama.cpp` release available, run `npx -n node-llama-cpp source download --release latest`. ([learn more](https:\u002F\u002Fnode-llama-cpp.withcat.ai\u002Fguide\u002Fbuilding-from-source#download-new-release))\r\n\r\n\r\n","2025-10-02T21:53:15",{"id":257,"version":258,"summary_zh":259,"released_at":260},188881,"v3.13.0","# [3.13.0](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcompare\u002Fv3.12.4...v3.13.0) (2025-09-09)\n\n\n### Features\n\n* Seed OSS support ([#502](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fissues\u002F502)) ([eefe78c](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcommit\u002Feefe78c8ffa2dd277e1b8913d957f61eadc8788a))\n\n\n### Bug Fixes\n\n* adapt to breaking `llama.cpp` changes ([#501](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fissues\u002F501)) ([76b505e](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcommit\u002F76b505edf350ae8bf8837fddeda68f8fb9ed4550))\n* **Vulkan:** read external memory usage ([#500](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fissues\u002F500)) ([d33cc31](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcommit\u002Fd33cc315eb5ecfc209da4d843a6ac7184e832754))\n\n---\n\nShipped with `llama.cpp` release [`b6431`](https:\u002F\u002Fgithub.com\u002Fggml-org\u002Fllama.cpp\u002Freleases\u002Ftag\u002Fb6431)\n\n> To use the latest `llama.cpp` release available, run `npx -n node-llama-cpp source download --release latest`. 
([learn more](https:\u002F\u002Fnode-llama-cpp.withcat.ai\u002Fguide\u002Fbuilding-from-source#download-new-release))\n\n\n","2025-09-09T18:20:09",{"id":262,"version":263,"summary_zh":264,"released_at":265},188882,"v3.12.4","[![](https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002Fdf5f1f59-a2cd-4fdb-b60c-3214f4a1584b)](https:\u002F\u002Fnode-llama-cpp.withcat.ai\u002Fblog\u002Fv3.12-gpt-oss)\r\n# ✨ [`gpt-oss` is here!](https:\u002F\u002Fnode-llama-cpp.withcat.ai\u002Fblog\u002Fv3.12-gpt-oss) ✨\r\nRead about the release in the [blog post](https:\u002F\u002Fnode-llama-cpp.withcat.ai\u002Fblog\u002Fv3.12-gpt-oss)\r\n\r\n---\r\n\r\n## [3.12.4](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcompare\u002Fv3.12.3...v3.12.4) (2025-08-28)\r\n\r\n\r\n### Bug Fixes\r\n\r\n* gpt-oss prompt preloading ([#496](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fissues\u002F496)) ([db4a243](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcommit\u002Fdb4a2437d08a659a0972e9c435609d77b93e209c))\r\n\r\n---\r\n\r\nShipped with `llama.cpp` release [`b6301`](https:\u002F\u002Fgithub.com\u002Fggml-org\u002Fllama.cpp\u002Freleases\u002Ftag\u002Fb6301)\r\n\r\n> To use the latest `llama.cpp` release available, run `npx -n node-llama-cpp source download --release latest`. ([learn more](https:\u002F\u002Fnode-llama-cpp.withcat.ai\u002Fguide\u002Fbuilding-from-source#download-new-release))\r\n\r\n\r\n","2025-08-28T00:40:48",{"id":267,"version":268,"summary_zh":269,"released_at":270},188883,"v3.12.3","[![](https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002Fdf5f1f59-a2cd-4fdb-b60c-3214f4a1584b)](https:\u002F\u002Fnode-llama-cpp.withcat.ai\u002Fblog\u002Fv3.12-gpt-oss)\r\n# ✨ [`gpt-oss` is here!](https:\u002F\u002Fnode-llama-cpp.withcat.ai\u002Fblog\u002Fv3.12-gpt-oss) ✨\r\nRead about the release in the [blog post](https:\u002F\u002Fnode-llama-cpp.withcat.ai\u002Fblog\u002Fv3.12-gpt-oss)\r\n\r\n---\r\n\r\n## [3.12.3](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcompare\u002Fv3.12.2...v3.12.3) (2025-08-26)\r\n\r\n\r\n### Bug Fixes\r\n\r\n* **Vulkan:** context creation edge cases ([#492](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fissues\u002F492)) ([12749c0](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcommit\u002F12749c08130773eda6268b3dc811f758ca61bcbc))\r\n* prebuilt binaries CUDA 13 support ([#494](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fissues\u002F494)) ([b10999d](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcommit\u002Fb10999de02a606a1dc02e67d81188db51346c109))\r\n* don't share loaded shared libraries between backends ([#492](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fissues\u002F492)) ([12749c0](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcommit\u002F12749c08130773eda6268b3dc811f758ca61bcbc))\r\n* split prebuilt CUDA binaries into 2 npm modules ([#495](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fissues\u002F495)) ([6e59160](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcommit\u002F6e59160dd36a0b558675f61cd1bd06cca522193c))\r\n\r\n---\r\n\r\nShipped with `llama.cpp` release [`b6294`](https:\u002F\u002Fgithub.com\u002Fggml-org\u002Fllama.cpp\u002Freleases\u002Ftag\u002Fb6294)\r\n\r\n> To use the latest `llama.cpp` release available, run `npx -n node-llama-cpp source download --release latest`. 
([learn more](https:\u002F\u002Fnode-llama-cpp.withcat.ai\u002Fguide\u002Fbuilding-from-source#download-new-release))\r\n\r\n\r\n","2025-08-26T23:01:56",{"id":272,"version":273,"summary_zh":274,"released_at":275},188884,"v3.12.1","[![](https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002Fdf5f1f59-a2cd-4fdb-b60c-3214f4a1584b)](https:\u002F\u002Fnode-llama-cpp.withcat.ai\u002Fblog\u002Fv3.12-gpt-oss)\r\n# ✨ [`gpt-oss` is here!](https:\u002F\u002Fnode-llama-cpp.withcat.ai\u002Fblog\u002Fv3.12-gpt-oss) ✨\r\nRead about the release in the [blog post](https:\u002F\u002Fnode-llama-cpp.withcat.ai\u002Fblog\u002Fv3.12-gpt-oss)\r\n\r\n---\r\n\r\n## [3.12.1](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcompare\u002Fv3.12.0...v3.12.1) (2025-08-11)\r\n\r\n\r\n### Features\r\n\r\n* `comment` segment budget ([#489](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fissues\u002F489)) ([30eaa23](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcommit\u002F30eaa23c60d7aa8b0bb66fdd6a5a3c1d5c63bead)) (documentation: [API: `LLamaChatPromptOptions[\"budgets\"][\"commentTokens\"]`](https:\u002F\u002Fnode-llama-cpp.withcat.ai\u002Fapi\u002Ftype-aliases\u002FLLamaChatPromptOptions#budgets-commenttokens))\r\n* **Electron template**: comment segments\r\n* **Electron template**: improve completions speed when using functions\r\n\r\n\r\n### Bug Fixes\r\n\r\n* `gpt-oss` segment budgets ([#489](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fissues\u002F489)) ([30eaa23](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcommit\u002F30eaa23c60d7aa8b0bb66fdd6a5a3c1d5c63bead))\r\n* add support for more `gpt-oss` variations ([#489](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fissues\u002F489)) ([30eaa23](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcommit\u002F30eaa23c60d7aa8b0bb66fdd6a5a3c1d5c63bead))\r\n* default to using a model message for prompt completion on unsupported models ([#489](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fissues\u002F489)) ([30eaa23](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcommit\u002F30eaa23c60d7aa8b0bb66fdd6a5a3c1d5c63bead))\r\n* prompt completion config ([#490](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fissues\u002F490)) ([f849cd9](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcommit\u002Ff849cd9d83a3444c6e5b910e7f68388ba34687f4))\r\n\r\n---\r\n\r\nShipped with `llama.cpp` release [`b6133`](https:\u002F\u002Fgithub.com\u002Fggml-org\u002Fllama.cpp\u002Freleases\u002Ftag\u002Fb6133)\r\n\r\n> To use the latest `llama.cpp` release available, run `npx -n node-llama-cpp source download --release latest`. 
([learn more](https:\u002F\u002Fnode-llama-cpp.withcat.ai\u002Fguide\u002Fbuilding-from-source#download-new-release))\r\n\r\n\r\n","2025-08-11T18:38:43",{"id":277,"version":278,"summary_zh":279,"released_at":280},188885,"v3.12.0","[![](https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002Fdf5f1f59-a2cd-4fdb-b60c-3214f4a1584b)](https:\u002F\u002Fnode-llama-cpp.withcat.ai\u002Fblog\u002Fv3.12-gpt-oss)\r\n# ✨ [`gpt-oss` is here!](https:\u002F\u002Fnode-llama-cpp.withcat.ai\u002Fblog\u002Fv3.12-gpt-oss) ✨\r\nRead about the release in the [blog post](https:\u002F\u002Fnode-llama-cpp.withcat.ai\u002Fblog\u002Fv3.12-gpt-oss)\r\n\r\n---\r\n\r\n# [3.12.0](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcompare\u002Fv3.11.0...v3.12.0) (2025-08-09)\r\n\r\n\r\n### Features\r\n\r\n* `gpt-oss` support ([#487](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fissues\u002F487)) ([722e29d](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcommit\u002F722e29d64f164d12d82d0438f408f2aa5106bd81)) (documentation: [`gpt-oss`](https:\u002F\u002Fnode-llama-cpp.withcat.ai\u002Fblog\u002Fv3.12-gpt-oss))\r\n\r\n\r\n### Bug Fixes\r\n\r\n* **`Llama`:** expose the `numa` ([#485](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fissues\u002F485)) ([ea0d815](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcommit\u002Fea0d8159c1f7bd15b52050b67663f26b88df709b))\r\n* add `--numa` flag to cli commands ([#485](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fissues\u002F485)) ([ea0d815](https:\u002F\u002Fgithub.com\u002Fwithcatai\u002Fnode-llama-cpp\u002Fcommit\u002Fea0d8159c1f7bd15b52050b67663f26b88df709b))\r\n\r\n---\r\n\r\nShipped with `llama.cpp` release [`b6122`](https:\u002F\u002Fgithub.com\u002Fggml-org\u002Fllama.cpp\u002Freleases\u002Ftag\u002Fb6122)\r\n\r\n> To use the latest `llama.cpp` release available, run `npx -n node-llama-cpp source download --release latest`. ([learn more](https:\u002F\u002Fnode-llama-cpp.withcat.ai\u002Fguide\u002Fbuilding-from-source#download-new-release))\r\n\r\n\r\n","2025-08-09T19:14:01"]