[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-erelsgl--limdu":3,"tool-erelsgl--limdu":61},[4,18,26,36,44,53],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":17},4358,"openclaw","openclaw\u002Fopenclaw","OpenClaw 是一款专为个人打造的本地化 AI 助手，旨在让你在自己的设备上拥有完全可控的智能伙伴。它打破了传统 AI 助手局限于特定网页或应用的束缚，能够直接接入你日常使用的各类通讯渠道，包括微信、WhatsApp、Telegram、Discord、iMessage 等数十种平台。无论你在哪个聊天软件中发送消息，OpenClaw 都能即时响应，甚至支持在 macOS、iOS 和 Android 设备上进行语音交互，并提供实时的画布渲染功能供你操控。\n\n这款工具主要解决了用户对数据隐私、响应速度以及“始终在线”体验的需求。通过将 AI 部署在本地，用户无需依赖云端服务即可享受快速、私密的智能辅助，真正实现了“你的数据，你做主”。其独特的技术亮点在于强大的网关架构，将控制平面与核心助手分离，确保跨平台通信的流畅性与扩展性。\n\nOpenClaw 非常适合希望构建个性化工作流的技术爱好者、开发者，以及注重隐私保护且不愿被单一生态绑定的普通用户。只要具备基础的终端操作能力（支持 macOS、Linux 及 Windows WSL2），即可通过简单的命令行引导完成部署。如果你渴望拥有一个懂你",349277,3,"2026-04-06T06:32:30",[13,14,15,16],"Agent","开发框架","图像","数据工具","ready",{"id":19,"name":20,"github_repo":21,"description_zh":22,"stars":23,"difficulty_score":10,"last_commit_at":24,"category_tags":25,"status":17},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,"2026-04-05T11:01:52",[14,15,13],{"id":27,"name":28,"github_repo":29,"description_zh":30,"stars":31,"difficulty_score":32,"last_commit_at":33,"category_tags":34,"status":17},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 
代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",160784,2,"2026-04-19T11:32:54",[14,13,35],"语言模型",{"id":37,"name":38,"github_repo":39,"description_zh":40,"stars":41,"difficulty_score":32,"last_commit_at":42,"category_tags":43,"status":17},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",109154,"2026-04-18T11:18:24",[14,15,13],{"id":45,"name":46,"github_repo":47,"description_zh":48,"stars":49,"difficulty_score":32,"last_commit_at":50,"category_tags":51,"status":17},6121,"gemini-cli","google-gemini\u002Fgemini-cli","gemini-cli 是一款由谷歌推出的开源 AI 命令行工具，它将强大的 Gemini 大模型能力直接集成到用户的终端环境中。对于习惯在命令行工作的开发者而言，它提供了一条从输入提示词到获取模型响应的最短路径，无需切换窗口即可享受智能辅助。\n\n这款工具主要解决了开发过程中频繁上下文切换的痛点，让用户能在熟悉的终端界面内直接完成代码理解、生成、调试以及自动化运维任务。无论是查询大型代码库、根据草图生成应用，还是执行复杂的 Git 操作，gemini-cli 都能通过自然语言指令高效处理。\n\n它特别适合广大软件工程师、DevOps 人员及技术研究人员使用。其核心亮点包括支持高达 100 万 token 的超长上下文窗口，具备出色的逻辑推理能力；内置 Google 搜索、文件操作及 Shell 命令执行等实用工具；更独特的是，它支持 MCP（模型上下文协议），允许用户灵活扩展自定义集成，连接如图像生成等外部能力。此外，个人谷歌账号即可享受免费的额度支持，且项目基于 Apache 2.0 
协议完全开源，是提升终端工作效率的理想助手。",100752,"2026-04-10T01:20:03",[52,13,15,14],"插件",{"id":54,"name":55,"github_repo":56,"description_zh":57,"stars":58,"difficulty_score":32,"last_commit_at":59,"category_tags":60,"status":17},4721,"markitdown","microsoft\u002Fmarkitdown","MarkItDown 是一款由微软 AutoGen 团队打造的轻量级 Python 工具，专为将各类文件高效转换为 Markdown 格式而设计。它支持 PDF、Word、Excel、PPT、图片（含 OCR）、音频（含语音转录）、HTML 乃至 YouTube 链接等多种格式的解析，能够精准提取文档中的标题、列表、表格和链接等关键结构信息。\n\n在人工智能应用日益普及的今天，大语言模型（LLM）虽擅长处理文本，却难以直接读取复杂的二进制办公文档。MarkItDown 恰好解决了这一痛点，它将非结构化或半结构化的文件转化为模型“原生理解”且 Token 效率极高的 Markdown 格式，成为连接本地文件与 AI 分析 pipeline 的理想桥梁。此外，它还提供了 MCP（模型上下文协议）服务器，可无缝集成到 Claude Desktop 等 LLM 应用中。\n\n这款工具特别适合开发者、数据科学家及 AI 研究人员使用，尤其是那些需要构建文档检索增强生成（RAG）系统、进行批量文本分析或希望让 AI 助手直接“阅读”本地文件的用户。虽然生成的内容也具备一定可读性，但其核心优势在于为机器",93400,"2026-04-06T19:52:38",[52,14],{"id":62,"github_repo":63,"name":64,"description_en":65,"description_zh":66,"ai_summary_zh":66,"readme_en":67,"readme_zh":68,"quickstart_zh":69,"use_case_zh":70,"hero_image_url":71,"owner_login":72,"owner_name":73,"owner_avatar_url":74,"owner_bio":75,"owner_company":76,"owner_location":77,"owner_email":78,"owner_twitter":77,"owner_website":79,"owner_url":80,"languages":81,"stars":86,"forks":87,"last_commit_at":88,"license":89,"difficulty_score":32,"env_os":90,"env_gpu":91,"env_ram":92,"env_deps":93,"category_tags":103,"github_topics":77,"view_count":32,"oss_zip_url":77,"oss_zip_packed_at":77,"status":17,"created_at":104,"updated_at":105,"faqs":106,"releases":137},9822,"erelsgl\u002Flimdu","limdu","Machine-learning for Node.js","Limdu 是一个专为 Node.js 设计的机器学习框架，旨在帮助开发者轻松构建智能分类系统。它特别擅长处理自然语言理解任务，是对话系统和聊天机器人开发的理想选择。\n\n针对传统机器学习库在实时交互场景中的不足，Limdu 提供了灵活的解决方案。它不仅支持标准的批量学习，更核心地具备了“在线学习”能力，允许模型在运行时动态接收新数据并即时更新，无需重新训练整个模型。此外，它还原生支持多标签分类，能够应对一个输入对应多个输出结果的复杂场景。\n\n这款工具主要面向熟悉 JavaScript 的软件开发者和算法研究人员。对于需要快速原型验证或希望在 Node.js 环境中直接集成机器学习功能的团队来说，Limdu 能显著降低技术门槛。其独特的技术亮点在于提供了分类“解释”功能，不仅能给出判断结果，还能告知用户得出该结论的依据（例如识别出某动物是鸟类是因为它有翅膀和喙），这在调试模型和增强系统透明度方面极具价值。虽然目前项目处于 Alpha 
阶段，部分功能仍在完善中，但其开放的架构非常欢迎社区贡献，适合愿意探索前沿技术并参与共建的开发者使用。","# Limdu.js\n\nLimdu is a machine-learning framework for Node.js. It supports **multi-label classification**, **online learning**, and **real-time classification**. Therefore, it is especially suited for natural language understanding in dialog systems and chat-bots.\n\nLimdu is in an \"alpha\" state - some parts are working (see this readme), but some parts are missing or not tested. Contributions are welcome. \n\nLimdu currently runs on Node.js 0.12 and later versions.\n\n## Installation\n\n\tnpm install limdu\n\n## Demos\n\nYou can run the demos from this project: [limdu-demo](https:\u002F\u002Fgithub.com\u002Ferelsgl\u002Flimdu-demo).\n \n**Table of Contents**  *generated with [DocToc](http:\u002F\u002Fdoctoc.herokuapp.com\u002F)*\n\n- [Binary Classification](#binary-classification)\n\t- [Batch Learning - learn from an array of input-output pairs:](#batch-learning---learn-from-an-array-of-input-output-pairs)\n\t- [Online Learning](#online-learning)\n\t- [Binding](#binding)\n\t- [Explanations](#explanations)\n\t- [Other Binary Classifiers](#other-binary-classifiers)\n- [Multi-Label Classification](#multi-label-classification)\n\t- [Other Multi-label classifiers](#other-multi-label-classifiers)\n- [Feature engineering](#feature-engineering)\n\t- [Feature extraction - converting an input sample into feature-value pairs:](#feature-extraction---converting-an-input-sample-into-feature-value-pairs)\n\t- [Input Normalization](#input-normalization)\n\t- [Feature lookup table - convert custom features to integer features](#feature-lookup-table---convert-custom-features-to-integer-features)\n- [Serialization](#serialization)\n- [Cross-validation](#cross-validation)\n- [Back-classification (aka 
Generation)](#back-classification-aka-generation)\n- [SVM wrappers](#svm-wrappers)\n- [Undocumented features](#undocumented-features)\n- [Contributions](#contributions)\n- [License](#license)\n\n## Binary Classification\n\n### Batch Learning - learn from an array of input-output pairs:\n\n```js\nvar limdu = require('limdu');\n\nvar colorClassifier = new limdu.classifiers.NeuralNetwork();\n\ncolorClassifier.trainBatch([\n\t{input: { r: 0.03, g: 0.7, b: 0.5 }, output: 0},  \u002F\u002F black\n\t{input: { r: 0.16, g: 0.09, b: 0.2 }, output: 1}, \u002F\u002F white\n\t{input: { r: 0.5, g: 0.5, b: 1.0 }, output: 1}   \u002F\u002F white\n\t]);\n\nconsole.log(colorClassifier.classify({ r: 1, g: 0.4, b: 0 }));  \u002F\u002F 0.99 - almost white\n```\n\nCredit: this example uses [brain.js, by Heather Arthur](https:\u002F\u002Fgithub.com\u002Fharthur\u002Fbrain).\n\n\n### Online Learning\n```js\nvar birdClassifier = new limdu.classifiers.Winnow({\n\tdefault_positive_weight: 1,\n\tdefault_negative_weight: 1,\n\tthreshold: 0\n});\n\nbirdClassifier.trainOnline({'wings': 1, 'flight': 1, 'beak': 1, 'eagle': 1}, 1);  \u002F\u002F eagle is a bird (1)\nbirdClassifier.trainOnline({'wings': 0, 'flight': 0, 'beak': 0, 'dog': 1}, 0);    \u002F\u002F dog is not a bird (0)\nconsole.dir(birdClassifier.classify({'wings': 1, 'flight': 0, 'beak': 0.5, 'penguin':1})); \u002F\u002F initially, penguin is mistakenly classified as 0 - \"not a bird\"\nconsole.dir(birdClassifier.classify({'wings': 1, 'flight': 0, 'beak': 0.5, 'penguin':1}, \u002F*explanation level=*\u002F4)); \u002F\u002F why? 
because it does not fly.\n\nbirdClassifier.trainOnline({'wings': 1, 'flight': 0, 'beak': 1, 'penguin':1}, 1);  \u002F\u002F learn that penguin is a bird, although it doesn't fly \nbirdClassifier.trainOnline({'wings': 0, 'flight': 1, 'beak': 0, 'bat': 1}, 0);     \u002F\u002F learn that bat is not a bird, although it does fly\nconsole.dir(birdClassifier.classify({'wings': 1, 'flight': 0, 'beak': 1, 'chicken': 1})); \u002F\u002F now, chicken is correctly classified as a bird, although it does not fly.  \nconsole.dir(birdClassifier.classify({'wings': 1, 'flight': 0, 'beak': 1, 'chicken': 1}, \u002F*explanation level=*\u002F4)); \u002F\u002F why?  because it has wings and beak.\n```\n\nCredit: this example uses Modified Balanced Margin Winnow ([Carvalho and Cohen, 2006](http:\u002F\u002Fwww.citeulike.org\u002Fuser\u002Ferelsegal-halevi\u002Farticle\u002F2243777)). \n\nThe \"explanation\" feature is explained below.\n\n\n### Binding\n\nUsing Javascript's binding capabilities, it is possible to create custom classes, which are made of existing classes and pre-specified parameters:\n```js\nvar MyWinnow = limdu.classifiers.Winnow.bind(0, {\n\tdefault_positive_weight: 1,\n\tdefault_negative_weight: 1,\n\tthreshold: 0\n});\n\nvar birdClassifier = new MyWinnow();\n...\n\u002F\u002F continue as above\n```\n\n### Explanations\n\nSome classifiers can return \"explanations\" - additional information that explains how the classification result has been derived: \n\n```js\nvar colorClassifier = new limdu.classifiers.Bayesian();\n\ncolorClassifier.trainBatch([\n\t{input: { r: 0.03, g: 0.7, b: 0.5 }, output: 'black'}, \n\t{input: { r: 0.16, g: 0.09, b: 0.2 }, output: 'white'},\n\t{input: { r: 0.5, g: 0.5, b: 1.0 }, output: 'white'},\n\t]);\n\nconsole.log(colorClassifier.classify({ r: 1, g: 0.4, b: 0 }, \n\t\t\u002F* explanation level = *\u002F1));\n```\nCredit: this example uses code from [classifier.js, by Heather 
Arthur](https:\u002F\u002Fgithub.com\u002Fharthur\u002Fclassifier).\n\nThe explanation feature is experimental and is supported differently for different classifiers. For example, for the Bayesian classifier it returns the probabilities for each category:\n\n```js\n{ classes: 'white',\n\texplanation: [ 'white: 0.0621402182289608', 'black: 0.031460948468170505' ] }\n```\n\nWhile for the winnow classifier it returns the relevance (feature-value times feature-weight) for each feature: \n\n```js\n{ classification: 1,\n\texplanation: [ 'bias+1.12', 'r+1.08', 'g+0.25', 'b+0.00' ] }\n```\n\nWARNING: The internal format of the explanations might change without notice. The explanations should be used for presentation purposes only (and not, for example, for extracting the actual numbers). \n\n### Other Binary Classifiers\n\nIn addition to Winnow and NeuralNetwork, version 0.2 includes the following binary classifiers:\n\n* Bayesian - uses [classifier.js, by Heather Arthur](https:\u002F\u002Fgithub.com\u002Fharthur\u002Fclassifier). \n* Perceptron - Loosely based on [perceptron.js, by John Chesley](https:\u002F\u002Fgithub.com\u002Fchesles\u002Fperceptron).\n* SVM - uses [svm.js, by Andrej Karpathy](https:\u002F\u002Fgithub.com\u002Fkarpathy\u002Fsvmjs). \n* Linear SVM - wrappers around SVM-Perf and Lib-Linear (see below).\n* Decision Tree - based on [node-decision-tree-id3 by Ankit Kuwadekar](https:\u002F\u002Fgithub.com\u002Fbugless\u002Fnodejs-decision-tree-id3) or [ID3-Decision-Tree by Will Kurt](https:\u002F\u002Fgithub.com\u002Fwillkurt\u002FID3-Decision-Tree).\n\nThis library is still under construction, and not all features work for all classifiers. For a full list of the features that do work, see the \"test\" folder. 
\n\n\n## Multi-Label Classification\n\nIn binary classification, the output is 0 or 1;\n\nIn multi-label classification, the output is a set of zero or more labels.\n\n```js\nvar MyWinnow = limdu.classifiers.Winnow.bind(0, {retrain_count: 10});\n\nvar intentClassifier = new limdu.classifiers.multilabel.BinaryRelevance({\n\tbinaryClassifierType: MyWinnow\n});\n\nintentClassifier.trainBatch([\n\t{input: {I:1,want:1,an:1,apple:1}, output: \"APPLE\"},\n\t{input: {I:1,want:1,a:1,banana:1}, output: \"BANANA\"},\n\t{input: {I:1,want:1,chips:1}, output: \"CHIPS\"}\n\t]);\n\nconsole.dir(intentClassifier.classify({I:1,want:1,an:1,apple:1,and:1,a:1,banana:1}));  \u002F\u002F ['APPLE','BANANA']\n```\n\n### Other Multi-label classifiers\n\nIn addition to BinaryRelevance, version 0.2 includes the following multi-label classifier types (see the multilabel folder):\n\n* Cross-Lingual Language Model Classifier (based on [Anton Leusky and David Traum, 2008](http:\u002F\u002Fwww.citeulike.org\u002Fuser\u002Ferelsegal-halevi\u002Farticle\u002F12540655))\n* HOMER - Hierarchy Of Multi-label classifiERs (based on [Tsoumakas et al., 2007](http:\u002F\u002Fwww.citeulike.org\u002Fuser\u002Ferelsegal-halevi\u002Farticle\u002F3170786))\n* Meta-Labeler (based on [Lei Tang, Suju Rajan, Vijay K. 
Narayanan, 2009](http:\u002F\u002Fwww.citeulike.org\u002Fuser\u002Ferelsegal-halevi\u002Farticle\u002F4860265)) \n* Joint identification and segmentation (based on [Fabrizio Morbini, Kenji Sagae, 2011](http:\u002F\u002Fwww.citeulike.org\u002Fuser\u002Ferelsegal-halevi\u002Farticle\u002F10259046))\n* Passive-Aggressive (based on [Koby Crammer, Ofer Dekel, Joseph Keshet, Shai Shalev-Shwartz, Yoram Singer, 2006](http:\u002F\u002Fwww.citeulike.org\u002Fuser\u002Ferelsegal-halevi\u002Farticle\u002F5960770))\n* Threshold Classifier (converting multi-class classifier to multi-label classifier by finding the best appropriate threshold)\n\nThis library is still under construction, and not all features work for all classifiers. For a full list of the features that do work, see the \"test\" folder. \n\n## Feature engineering\n\n### Feature extraction - converting an input sample into feature-value pairs:\n\n```js\n\u002F\u002F First, define our base classifier type (a multi-label classifier based on winnow):\nvar TextClassifier = limdu.classifiers.multilabel.BinaryRelevance.bind(0, {\n\tbinaryClassifierType: limdu.classifiers.Winnow.bind(0, {retrain_count: 10})\n});\n\n\u002F\u002F Now define our feature extractor - a function that takes a sample and adds features to a given features set:\nvar WordExtractor = function(input, features) {\n\tinput.split(\" \").forEach(function(word) {\n\t\tfeatures[word]=1;\n\t});\n};\n\n\u002F\u002F Initialize a classifier with the base classifier type and the feature extractor:\nvar intentClassifier = new limdu.classifiers.EnhancedClassifier({\n\tclassifierType: TextClassifier,\n\tfeatureExtractor: WordExtractor\n});\n\n\u002F\u002F Train and test:\nintentClassifier.trainBatch([\n\t{input: \"I want an apple\", output: \"apl\"},\n\t{input: \"I want a banana\", output: \"bnn\"},\n\t{input: \"I want chips\", output:    \"cps\"},\n\t]);\n\nconsole.dir(intentClassifier.classify(\"I want an apple and a banana\"));  \u002F\u002F 
['apl','bnn']\nconsole.dir(intentClassifier.classify(\"I WANT AN APPLE AND A BANANA\"));  \u002F\u002F []\n```\n\nAs you can see from the last example, by default feature extraction is case-sensitive. \nWe will take care of this in the next example.\n\nInstead of defining your own feature extractor, you can use those already bundled with limdu:\n\n```js\nlimdu.features.NGramsOfWords\nlimdu.features.NGramsOfLetters\nlimdu.features.HypernymExtractor\n```\n\nYou can also make 'featureExtractor' an array of several feature extractors, that will be executed in the order you include them.\n\n### Input Normalization\n\n```js\n\u002F\u002FInitialize a classifier with a feature extractor and a case normalizer:\nintentClassifier = new limdu.classifiers.EnhancedClassifier({\n\tclassifierType: TextClassifier,  \u002F\u002F same as in previous example\n\tnormalizer: limdu.features.LowerCaseNormalizer,\n\tfeatureExtractor: WordExtractor  \u002F\u002F same as in previous example\n});\n\n\u002F\u002FTrain and test:\nintentClassifier.trainBatch([\n\t{input: \"I want an apple\", output: \"apl\"},\n\t{input: \"I want a banana\", output: \"bnn\"},\n\t{input: \"I want chips\", output: \"cps\"},\n\t]);\n\nconsole.dir(intentClassifier.classify(\"I want an apple and a banana\"));  \u002F\u002F ['apl','bnn']\nconsole.dir(intentClassifier.classify(\"I WANT AN APPLE AND A BANANA\"));  \u002F\u002F ['apl','bnn'] \n```\n\nOf course you can use any other function as an input normalizer. For example, if you know how to write a spell-checker, you can create a normalizer that corrects typos in the input.\n\nYou can also make 'normalizer' an array of several normalizers. These will be executed in the order you include them.\n\n### Feature lookup table - convert custom features to integer features\n\nThis example uses the quadratic SVM implementation [svm.js, by Andrej Karpathy](https:\u002F\u002Fgithub.com\u002Fkarpathy\u002Fsvmjs). 
\nThis SVM (like most SVM implementations) works with integer features, so we need a way to convert our string-based features to integers.\n\n```js\nvar limdu = require('limdu');\n\n\u002F\u002F First, define our base classifier type (a multi-label classifier based on svm.js):\nvar TextClassifier = limdu.classifiers.multilabel.BinaryRelevance.bind(0, {\n\tbinaryClassifierType: limdu.classifiers.SvmJs.bind(0, {C: 1.0})\n});\n\n\u002F\u002F Initialize a classifier with a feature extractor and a lookup table:\nvar intentClassifier = new limdu.classifiers.EnhancedClassifier({\n\tclassifierType: TextClassifier,\n\tfeatureExtractor: limdu.features.NGramsOfWords(1),  \u002F\u002F each word (\"1-gram\") is a feature  \n\tfeatureLookupTable: new limdu.features.FeatureLookupTable()\n});\n\n\u002F\u002F Train and test:\nintentClassifier.trainBatch([\n\t{input: \"I want an apple\", output: \"apl\"},\n\t{input: \"I want a banana\", output: \"bnn\"},\n\t{input: \"I want chips\", output: \"cps\"},\n\t]);\n\nconsole.dir(intentClassifier.classify(\"I want an apple and a banana\"));  \u002F\u002F ['apl','bnn']\n```\n\nThe FeatureLookupTable takes care of the numbers, while you may continue to work with texts! \n\n## Serialization\n\nSay you want to train a classifier on your home computer, and use it on a remote server. 
To do this, you should somehow convert the trained classifier to a string, send the string to the remote server, and deserialize it there.\n\nYou can do this with the \"serialization.js\" package:\n\n\tnpm install serialization\n\t\nOn your home machine, do the following:\n\n```js\nvar serialize = require('serialization');\n\n\u002F\u002F First, define a function that creates a fresh  (untrained) classifier.\n\u002F\u002F This code should be stand-alone - it should include all the 'require' statements\n\u002F\u002F   required for creating the classifier.\nfunction newClassifierFunction() {\n\tvar limdu = require('limdu');\n\tvar TextClassifier = limdu.classifiers.multilabel.BinaryRelevance.bind(0, {\n\t\tbinaryClassifierType: limdu.classifiers.Winnow.bind(0, {retrain_count: 10})\n\t});\n\n\tvar WordExtractor = function(input, features) {\n\t\tinput.split(\" \").forEach(function(word) {\n\t\t\tfeatures[word]=1;\n\t\t});\n\t};\n\t\n\t\u002F\u002F Initialize a classifier with a feature extractor:\n\treturn new limdu.classifiers.EnhancedClassifier({\n\t\tclassifierType: TextClassifier,\n\t\tfeatureExtractor: WordExtractor,\n\t\tpastTrainingSamples: [], \u002F\u002F to enable retraining\n\t});\n}\n\n\u002F\u002F Use the above function for creating a new classifier:\nvar intentClassifier = newClassifierFunction();\n\n\u002F\u002F Train and test:\nvar dataset = [\n\t{input: \"I want an apple\", output: \"apl\"},\n\t{input: \"I want a banana\", output: \"bnn\"},\n\t{input: \"I want chips\", output: \"cps\"},\n\t];\nintentClassifier.trainBatch(dataset);\n\nconsole.log(\"Original classifier:\");\nintentClassifier.classifyAndLog(\"I want an apple and a banana\");  \u002F\u002F ['apl','bnn']\nintentClassifier.trainOnline(\"I want a doughnut\", \"dnt\");\nintentClassifier.classifyAndLog(\"I want chips and a doughnut\");  \u002F\u002F ['cps','dnt']\nintentClassifier.retrain();\nintentClassifier.classifyAndLog(\"I want an apple and a banana\");  \u002F\u002F 
['apl','bnn']\nintentClassifier.classifyAndLog(\"I want chips and a doughnut\");  \u002F\u002F ['cps','dnt']\n\n\u002F\u002F Serialize the classifier (convert it to a string)\nvar intentClassifierString = serialize.toString(intentClassifier, newClassifierFunction);\n\n\u002F\u002F Save the string to a file, and send it to a remote server.\n```\n\nOn the remote server, do the following:\n\n```js\n\u002F\u002F retrieve the string from a file and then:\n\nvar intentClassifierCopy = serialize.fromString(intentClassifierString, __dirname);\n\nconsole.log(\"Deserialized classifier:\");\nintentClassifierCopy.classifyAndLog(\"I want an apple and a banana\");  \u002F\u002F ['apl','bnn']\nintentClassifierCopy.classifyAndLog(\"I want chips and a doughnut\");  \u002F\u002F ['cps','dnt']\nintentClassifierCopy.trainOnline(\"I want an elm tree\", \"elm\");\nintentClassifierCopy.classifyAndLog(\"I want doughnut and elm tree\");  \u002F\u002F ['dnt','elm']\n```\n\nCAUTION: Serialization was not tested for all possible combinations of classifiers and enhancements. Test well before use!\n\n## Cross-validation\n\n```js\n\u002F\u002F create a dataset with a lot of input-output pairs:\nvar dataset = [ ... 
];\n\n\u002F\u002F Decide how many folds you want in your   k-fold cross-validation:\nvar numOfFolds = 5;\n\n\u002F\u002F Define the type of classifier that you want to test:\nvar IntentClassifier = limdu.classifiers.EnhancedClassifier.bind(0, {\n\tclassifierType: limdu.classifiers.multilabel.BinaryRelevance.bind(0, {\n\t\tbinaryClassifierType: limdu.classifiers.Winnow.bind(0, {retrain_count: 10})\n\t}),\n\tfeatureExtractor: limdu.features.NGramsOfWords(1),\n});\n\nvar microAverage = new limdu.utils.PrecisionRecall();\nvar macroAverage = new limdu.utils.PrecisionRecall();\n\nlimdu.utils.partitions.partitions(dataset, numOfFolds, function(trainSet, testSet) {\n\tconsole.log(\"Training on \"+trainSet.length+\" samples, testing on \"+testSet.length+\" samples\");\n\tvar classifier = new IntentClassifier();\n\tclassifier.trainBatch(trainSet);\n\tlimdu.utils.test(classifier, testSet, \u002F* verbosity = *\u002F0,\n\t\tmicroAverage, macroAverage);\n});\n\nmacroAverage.calculateMacroAverageStats(numOfFolds);\nconsole.log(\"\\n\\nMACRO AVERAGE:\"); console.dir(macroAverage.fullStats());\n\nmicroAverage.calculateStats();\nconsole.log(\"\\n\\nMICRO AVERAGE:\"); console.dir(microAverage.fullStats());\n```\n\n## Back-classification (aka Generation)\n\nUse this option to get the list of all samples with a given class.\n\n```js\nvar intentClassifier = new limdu.classifiers.EnhancedClassifier({\n\tclassifierType: limdu.classifiers.multilabel.BinaryRelevance.bind(0, {\n\t\tbinaryClassifierType: limdu.classifiers.Winnow.bind(0, {retrain_count: 10})\n\t}),\n\tfeatureExtractor: limdu.features.NGramsOfWords(1),\n\tpastTrainingSamples: [],\n});\n\n\u002F\u002F Train and test:\nintentClassifier.trainBatch([\n\t{input: \"I want an apple\", output: \"apl\"},\n\t{input: \"I want a banana\", output: \"bnn\"},\n\t{input: \"I really want an apple\", output: \"apl\"},\n\t{input: \"I want a banana very much\", output: \"bnn\"},\n\t]);\n\nconsole.dir(intentClassifier.backClassify(\"apl\"));  
\u002F\u002F [ 'I want an apple', 'I really want an apple' ]\n```\n\n## SVM wrappers\n\nThe native svm.js implementation takes a lot of time to train - quadratic in the number of training samples. \nThere are two common packages that can be trained in time linear in the number of training samples. They are:\n\n* [SVM-Perf](http:\u002F\u002Fwww.cs.cornell.edu\u002Fpeople\u002Ftj\u002Fsvm_light\u002Fsvm_perf.html) - by Thorsten Joachims;\n* [LibLinear](http:\u002F\u002Fwww.csie.ntu.edu.tw\u002F~cjlin\u002Fliblinear) - by Fan, Chang, Hsieh, Wang and Lin.\n\nThe limdu.js package provides wrappers for these implementations. \nIn order to use the wrappers, you must have the binary file used for training in your path, that is:\n\n* **svm\\_perf\\_learn** - from [SVM-Perf](http:\u002F\u002Fwww.cs.cornell.edu\u002Fpeople\u002Ftj\u002Fsvm_light\u002Fsvm_perf.html).\n* **liblinear\\_train** - from [LibLinear](http:\u002F\u002Fwww.csie.ntu.edu.tw\u002F~cjlin\u002Fliblinear).\n\nOnce you have any one of these installed, you can use the corresponding classifier instead of any binary classifier\nused in the previous demos, as long as you have a feature-lookup-table. For example, with SvmPerf:\n\n```js\nvar intentClassifier = new limdu.classifiers.EnhancedClassifier({\n\tclassifierType: limdu.classifiers.multilabel.BinaryRelevance.bind(0, {\n\t\tbinaryClassifierType: limdu.classifiers.SvmPerf.bind(0, {\n\t\t\tlearn_args: \"-c 20.0\" \n\t\t})\n\t}),\n\tfeatureExtractor: limdu.features.NGramsOfWords(1),\n\tfeatureLookupTable: new limdu.features.FeatureLookupTable()\n});\n```\n\nand similarly with SvmLinear.\n\nSee the files classifiers\u002Fsvm\u002FSvmPerf.js and classifiers\u002Fsvm\u002FSvmLinear.js for documentation of the options.\n\n\n## Undocumented features\n\nSome advanced features are working but not documented yet. 
If you need any of them, open an issue and I will try to document them.\n\n* Custom input normalization, based on regular expressions.\n* Input segmentation for multi-label classification - both manual (with regular expressions) and automatic.\n* Feature extraction for model adaptation.\n* Spell-checker features. \n* Hypernym features.\n* Classification based on a cross-lingual language model.\n* Format conversion - ARFF, JSON, svm-light, TSV.\n\n## License\n\nLGPL\n\n## Contributions\n\nCode contributions are welcome. Reasonable pull requests, with appropriate documentation and unit-tests, will be accepted.\n\nDo you like limdu? Remember that you can star it :-)\n","# Limdu.js\n\nLimdu 是一个用于 Node.js 的机器学习框架。它支持**多标签分类**、**在线学习**和**实时分类**。因此，它特别适用于对话系统和聊天机器人中的自然语言理解任务。\n\nLimdu 目前处于“alpha”阶段——部分功能已经实现（请参阅此 README），但仍有部分功能缺失或尚未经过充分测试。欢迎贡献代码！\n\nLimdu 目前支持 Node.js 0.12 及更高版本。\n\n## 安装\n\n\tnpm install limdu\n\n## 演示\n\n您可以运行该项目中的演示：[limdu-demo](https:\u002F\u002Fgithub.com\u002Ferelsgl\u002Flimdu-demo)。\n\n**目录** *由 [DocToc](http:\u002F\u002Fdoctoc.herokuapp.com\u002F) 自动生成*\n\n- [二分类](#binary-classification)\n\t- [批量学习——从输入输出对数组中学习：](#batch-learning---learn-from-an-array-of-input-output-pairs)\n\t- [在线学习](#online-learning)\n\t- [绑定](#binding)\n\t- [解释](#explanations)\n\t- [其他二分类器](#other-binary-classifiers)\n- [多标签分类](#multi-label-classification)\n\t- [其他多标签分类器](#other-multi-label-classifiers)\n- [特征工程](#feature-engineering)\n\t- [特征提取——将输入样本转换为特征-值对：](#feature-extraction---converting-an-input-sample-into-feature-value-pairs)\n\t- [输入归一化](#input-normalization)\n\t- [特征查找表——将自定义特征转换为整数特征](#feature-lookup-table---convert-custom-features-to-integer-features)\n- [序列化](#serialization)\n- [交叉验证](#cross-validation)\n- 
[反向分类（又称生成）](#back-classification-aka-generation)\n- [SVM 包装器](#svm-wrappers)\n- [未文档化的功能](#undocumented-featuers)\n- [贡献](#contributions)\n- [许可证](#license)\n\n## 二分类\n\n### 批量学习——从输入输出对数组中学习：\n\n```js\nvar limdu = require('limdu');\n\nvar colorClassifier = new limdu.classifiers.NeuralNetwork();\n\ncolorClassifier.trainBatch([\n\t{input: { r: 0.03, g: 0.7, b: 0.5 }, output: 0},  \u002F\u002F 黑色\n\t{input: { r: 0.16, g: 0.09, b: 0.2 }, output: 1}, \u002F\u002F 白色\n\t{input: { r: 0.5, g: 0.5, b: 1.0 }, output: 1}   \u002F\u002F 白色\n\t]);\n\nconsole.log(colorClassifier.classify({ r: 1, g: 0.4, b: 0 }));  \u002F\u002F 0.99 - 几乎是白色\n```\n\n鸣谢：本示例使用了 [brain.js，作者为 Heather Arthur](https:\u002F\u002Fgithub.com\u002Fharthur\u002Fbrain)。\n\n### 在线学习\n```js\nvar birdClassifier = new limdu.classifiers.Winnow({\n\tdefault_positive_weight: 1,\n\tdefault_negative_weight: 1,\n\tthreshold: 0\n});\n\nbirdClassifier.trainOnline({'wings': 1, 'flight': 1, 'beak': 1, 'eagle': 1}, 1);  \u002F\u002F 鹰是鸟类（1）\nbirdClassifier.trainOnline({'wings': 0, 'flight': 0, 'beak': 0, 'dog': 1}, 0);    \u002F\u002F 狗不是鸟类（0）\nconsole.dir(birdClassifier.classify({'wings': 1, 'flight': 0, 'beak': 0.5, 'penguin':1})); \u002F\u002F 初始时，企鹅被错误地分类为 0——“不是鸟类”\nconsole.dir(birdClassifier.classify({'wings': 1, 'flight': 0, 'beak': 0.5, 'penguin':1}, \u002F*解释级别=*\u002F4)); \u002F\u002F 为什么？因为它不会飞。\n\nbirdClassifier.trainOnline({'wings': 1, 'flight': 0, 'beak': 1, 'penguin':1}, 1);  \u002F\u002F 学习到企鹅是鸟类，尽管它不会飞 \nbirdClassifier.trainOnline({'wings': 0, 'flight': 1, 'beak': 0, 'bat': 1}, 0);     \u002F\u002F 学习到蝙蝠不是鸟类，尽管它会飞\nconsole.dir(birdClassifier.classify({'wings': 1, 'flight': 0, 'beak': 1, 'chicken': 1})); \u002F\u002F 现在，鸡被正确地分类为鸟类，尽管它不会飞。  \nconsole.dir(birdClassifier.classify({'wings': 1, 'flight': 0, 'beak': 1, 'chicken': 1}, \u002F*解释级别=*\u002F4)); \u002F\u002F 为什么？因为它有翅膀和喙。\n```\n\n鸣谢：本示例使用了改进的平衡间隔 Winnow 算法（[Carvalho 和 Cohen，2006 
年](http:\u002F\u002Fwww.citeulike.org\u002Fuser\u002Ferelsegal-halevi\u002Farticle\u002F2243777)）。关于“解释”功能将在下文说明。\n\n### 基于绑定的自定义分类器\n\n利用 JavaScript 的绑定特性，可以创建由现有分类器和预设参数组成的自定义类：\n```js\nvar MyWinnow = limdu.classifiers.Winnow.bind(0, {\n\tdefault_positive_weight: 1,\n\tdefault_negative_weight: 1,\n\tthreshold: 0\n});\n\nvar birdClassifier = new MyWinnow();\n...\n\u002F\u002F 继续如上\n```\n\n### 解释\n\n某些分类器可以返回“解释”——即附加信息，用以说明分类结果是如何得出的：\n\n```js\nvar colorClassifier = new limdu.classifiers.Bayesian();\n\ncolorClassifier.trainBatch([\n\t{input: { r: 0.03, g: 0.7, b: 0.5 }, output: 'black'}, \n\t{input: { r: 0.16, g: 0.09, b: 0.2 }, output: 'white'},\n\t{input: { r: 0.5, g: 0.5, b: 1.0 }, output: 'white'},\n\t]);\n\nconsole.log(colorClassifier.classify({ r: 1, g: 0.4, b: 0 }, \n\t\t\u002F* 解释级别 = *\u002F1));\n```\n\n鸣谢：本示例使用了 [classifier.js，作者为 Heather Arthur](https:\u002F\u002Fgithub.com\u002Fharthur\u002Fclassifier) 中的代码。\n\n解释功能目前仍处于实验阶段，并且不同分类器的支持方式有所不同。例如，对于贝叶斯分类器，它会返回每个类别的概率：\n\n```js\n{ classes: 'white',\n\texplanation: [ 'white: 0.0621402182289608', 'black: 0.031460948468170505' ] }\n```\n\n而对于 Winnow 分类器，则会返回每个特征的相关性（特征值乘以特征权重）：\n\n```js\n{ classification: 1,\n\texplanation: [ 'bias+1.12', 'r+1.08', 'g+0.25', 'b+0.00' ] }\n```\n\n警告：解释的内部格式可能会在未经通知的情况下发生变化。解释应仅用于展示目的（而不应用于提取实际数值等用途）。\n\n### 其他二分类器\n\n除了 Winnow 和 NeuralNetwork 外，版本 0.2 还包含以下二分类器：\n\n* 贝叶斯分类器——基于 [classifier.js，作者为 Heather Arthur](https:\u002F\u002Fgithub.com\u002Fharthur\u002Fclassifier)。\n* 感知器——大致基于 [perceptron.js，作者为 John Chesley](https:\u002F\u002Fgithub.com\u002Fchesles\u002Fperceptron)。\n* SVM——基于 [svm.js，作者为 Andrej Karpathy](https:\u002F\u002Fgithub.com\u002Fkarpathy\u002Fsvmjs)。\n* 线性 SVM——封装了 SVM-Perf 和 Lib-Linear（见下文）。\n* 决策树——基于 [node-decision-tree-id3，作者为 Ankit Kuwadekar](https:\u002F\u002Fgithub.com\u002Fbugless\u002Fnodejs-decision-tree-id3) 或 [ID3-Decision-Tree，作者为 Will 
Kurt](https:\u002F\u002Fgithub.com\u002Fwillkurt\u002FID3-Decision-Tree)。\n\n该库仍在开发中，并非所有功能都适用于所有分类器。有关已实现功能的完整列表，请参阅“test”文件夹。\n\n## 多标签分类\n\n在二分类中，输出是 0 或 1；\n\n而在多标签分类中，输出是一组零个或多个标签。\n\n```js\nvar MyWinnow = limdu.classifiers.Winnow.bind(0, {retrain_count: 10});\n\nvar intentClassifier = new limdu.classifiers.multilabel.BinaryRelevance({\n\tbinaryClassifierType: MyWinnow\n});\n\nintentClassifier.trainBatch([\n\t{input: {I:1,want:1,an:1,apple:1}, output: \"APPLE\"},\n\t{input: {I:1,want:1,a:1,banana:1}, output: \"BANANA\"},\n\t{input: {I:1,want:1,chips:1}, output: \"CHIPS\"}\n\t]);\n\nconsole.dir(intentClassifier.classify({I:1,want:1,an:1,apple:1,and:1,a:1,banana:1}));  \u002F\u002F ['APPLE','BANANA']\n```\n\n### 其他多标签分类器\n\n除了 BinaryRelevance 之外，版本 0.2 还包含了以下多标签分类器类型（参见 multilabel 文件夹）：\n\n* 跨语言语言模型分类器（基于 [Anton Leusky 和 David Traum, 2008](http:\u002F\u002Fwww.citeulike.org\u002Fuser\u002Ferelsegal-halevi\u002Farticle\u002F12540655)）\n* HOMER - 多标签分类器层次结构（基于 [Tsoumakas 等人, 2007](http:\u002F\u002Fwww.citeulike.org\u002Fuser\u002Ferelsegal-halevi\u002Farticle\u002F3170786)）\n* 元标签器（基于 [Lei Tang、Suju Rajan、Vijay K. 
Narayanan, 2009](http:\u002F\u002Fwww.citeulike.org\u002Fuser\u002Ferelsegal-halevi\u002Farticle\u002F4860265)）\n* 联合识别与分段（基于 [Fabrizio Morbini、Kenji Sagae, 2011](http:\u002F\u002Fwww.citeulike.org\u002Fuser\u002Ferelsegal-halevi\u002Farticle\u002F10259046)）\n* 被动-攻击性算法（基于 [Koby Crammer、Ofer Dekel、Joseph Keshet、Shai Shalev-Shwartz、Yoram Singer, 2006](http:\u002F\u002Fwww.citeulike.org\u002Fuser\u002Ferelsegal-halevi\u002Farticle\u002F5960770)）\n* 阈值分类器（通过寻找最佳阈值将多类分类器转换为多标签分类器）\n\n该库仍在开发中，并非所有功能都适用于所有分类器。有关已实现功能的完整列表，请参阅“test”文件夹。\n\n## 特征工程\n\n### 特征提取——将输入样本转换为特征-值对：\n\n```js\n\u002F\u002F 首先，定义我们的基础分类器类型（基于 winnow 的多标签分类器）：\nvar TextClassifier = limdu.classifiers.multilabel.BinaryRelevance.bind(0, {\n\tbinaryClassifierType: limdu.classifiers.Winnow.bind(0, {retrain_count: 10})\n});\n\n\u002F\u002F 现在定义我们的特征提取器——一个接收样本并将特征添加到给定特征集合中的函数：\nvar WordExtractor = function(input, features) {\n\tinput.split(\" \").forEach(function(word) {\n\t\tfeatures[word]=1;\n\t});\n};\n\n\u002F\u002F 使用基础分类器类型和特征提取器初始化分类器：\nvar intentClassifier = new limdu.classifiers.EnhancedClassifier({\n\tclassifierType: TextClassifier,\n\tfeatureExtractor: WordExtractor\n});\n\n\u002F\u002F 训练和测试：\nintentClassifier.trainBatch([\n\t{input: \"I want an apple\", output: \"apl\"},\n\t{input: \"I want a banana\", output: \"bnn\"},\n\t{input: \"I want chips\", output:    \"cps\"},\n\t]);\n\nconsole.dir(intentClassifier.classify(\"I want an apple and a banana\"));  \u002F\u002F ['apl','bnn']\nconsole.dir(intentClassifier.classify(\"I WANT AN APPLE AND A BANANA\"));  \u002F\u002F []\n```\n\n正如最后一个示例所示，默认情况下，特征提取是区分大小写的。\n我们将在下一个示例中解决这个问题。\n\n除了定义自己的特征提取器外，您还可以使用 limdu 自带的特征提取器：\n\n```js\nlimdu.features.NGramsOfWords\nlimdu.features.NGramsOfLetters\nlimdu.features.HypernymExtractor\n```\n\n您还可以将“featureExtractor”设置为包含多个特征提取器的数组，这些提取器将按照您指定的顺序依次执行。\n\n### 输入归一化\n\n```js\n\u002F\u002F 使用特征提取器和大小写归一化器初始化分类器：\nintentClassifier = new limdu.classifiers.EnhancedClassifier({\n\tclassifierType: TextClassifier,  
\u002F\u002F 与前一个示例相同\n\tnormalizer: limdu.features.LowerCaseNormalizer,\n\tfeatureExtractor: WordExtractor  \u002F\u002F 与前一个示例相同\n});\n\n\u002F\u002F 训练和测试：\nintentClassifier.trainBatch([\n\t{input: \"I want an apple\", output: \"apl\"},\n\t{input: \"I want a banana\", output: \"bnn\"},\n\t{input: \"I want chips\", output: \"cps\"},\n\t]);\n\nconsole.dir(intentClassifier.classify(\"I want an apple and a banana\"));  \u002F\u002F ['apl','bnn']\nconsole.dir(intentClassifier.classify(\"I WANT AN APPLE AND A BANANA\"));  \u002F\u002F ['apl','bnn'] \n```\n\n当然，您也可以使用其他任何函数作为输入归一化器。例如，如果您会编写拼写检查器，就可以创建一个能够纠正输入中错别字的归一化器。\n\n此外，您还可以将“normalizer”设置为包含多个归一化器的数组。这些归一化器将按照您指定的顺序依次执行。\n\n### 特征查找表——将自定义特征转换为整数特征\n\n本示例使用了二次 SVM 实现 [svm.js，由 Andrej Karpathy 开发](https:\u002F\u002Fgithub.com\u002Fkarpathy\u002Fsvmjs)。\n由于该 SVM（与大多数 SVM 实现类似）仅支持整数特征，因此我们需要一种方法将基于字符串的特征转换为整数。\n\n```js\nvar limdu = require('limdu');\n\n\u002F\u002F 首先，定义我们的基础分类器类型（基于 svm.js 的多标签分类器）：\nvar TextClassifier = limdu.classifiers.multilabel.BinaryRelevance.bind(0, {\n\tbinaryClassifierType: limdu.classifiers.SvmJs.bind(0, {C: 1.0})\n});\n\n\u002F\u002F 使用特征提取器和查找表初始化分类器：\nvar intentClassifier = new limdu.classifiers.EnhancedClassifier({\n\tclassifierType: TextClassifier,\n\tfeatureExtractor: limdu.features.NGramsOfWords(1),  \u002F\u002F 每个词（1-gram）都作为一个特征  \n\tfeatureLookupTable: new limdu.features.FeatureLookupTable()\n});\n\n\u002F\u002F 训练和测试：\nintentClassifier.trainBatch([\n\t{input: \"I want an apple\", output: \"apl\"},\n\t{input: \"I want a banana\", output: \"bnn\"},\n\t{input: \"I want chips\", output: \"cps\"},\n\t]);\n\nconsole.dir(intentClassifier.classify(\"I want an apple and a banana\"));  \u002F\u002F ['apl','bnn']\n```\n\nFeatureLookupTable 会处理数字部分，而您可以继续使用文本进行操作！\n\n## 序列化\n\n假设你想在自己的家用电脑上训练一个分类器，并将其部署到远程服务器上使用。要做到这一点，你需要将训练好的分类器转换成字符串，然后将该字符串发送到远程服务器，并在那里进行反序列化。\n\n你可以使用 `serialization.js` 包来实现这一过程：\n\n```bash\nnpm install serialization\n```\n\n在你的家用电脑上，执行以下操作：\n\n```js\nvar serialize 
= require('serialization');\n\n\u002F\u002F 首先，定义一个函数来创建一个新的（未训练的）分类器。\n\u002F\u002F 这段代码应该是独立的——它应该包含创建分类器所需的所有 `require` 语句。\nfunction newClassifierFunction() {\n\tvar limdu = require('limdu');\n\tvar TextClassifier = limdu.classifiers.multilabel.BinaryRelevance.bind(0, {\n\t\tbinaryClassifierType: limdu.classifiers.Winnow.bind(0, {retrain_count: 10})\n\t});\n\n\tvar WordExtractor = function(input, features) {\n\t\tinput.split(\" \").forEach(function(word) {\n\t\t\tfeatures[word]=1;\n\t\t});\n\t};\n\t\n\t\u002F\u002F 使用特征提取器初始化分类器：\n\treturn new limdu.classifiers.EnhancedClassifier({\n\t\tclassifierType: TextClassifier,\n\t\tfeatureExtractor: WordExtractor,\n\t\tpastTrainingSamples: [], \u002F\u002F 以便支持重新训练\n\t});\n}\n\n\u002F\u002F 使用上述函数创建一个新的分类器：\nvar intentClassifier = newClassifierFunction();\n\n\u002F\u002F 训练并测试：\nvar dataset = [\n\t{input: \"I want an apple\", output: \"apl\"},\n\t{input: \"I want a banana\", output: \"bnn\"},\n\t{input: \"I want chips\", output: \"cps\"},\n\t];\nintentClassifier.trainBatch(dataset);\n\nconsole.log(\"原始分类器：\");\nintentClassifier.classifyAndLog(\"I want an apple and a banana\");  \u002F\u002F ['apl','bnn']\nintentClassifier.trainOnline(\"I want a doughnut\", \"dnt\");\nintentClassifier.classifyAndLog(\"I want chips and a doughnut\");  \u002F\u002F ['cps','dnt']\nintentClassifier.retrain();\nintentClassifier.classifyAndLog(\"I want an apple and a banana\");  \u002F\u002F ['apl','bnn']\nintentClassifier.classifyAndLog(\"I want chips and a doughnut\");  \u002F\u002F ['cps','dnt']\n\n\u002F\u002F 将分类器序列化（转换为字符串）\nvar intentClassifierString = serialize.toString(intentClassifier, newClassifierFunction);\n\n\u002F\u002F 将字符串保存到文件中，并发送到远程服务器。\n```\n\n在远程服务器上，执行以下操作：\n\n```js\n\u002F\u002F 从文件中读取字符串后：\n\nvar intentClassifierCopy = serialize.fromString(intentClassifierString, __dirname);\n\nconsole.log(\"反序列化后的分类器：\");\nintentClassifierCopy.classifyAndLog(\"I want an apple and a banana\");  \u002F\u002F 
['apl','bnn']\nintentClassifierCopy.classifyAndLog(\"I want chips and a doughnut\");  \u002F\u002F ['cps','dnt']\nintentClassifierCopy.trainOnline(\"I want an elm tree\", \"elm\");\nintentClassifierCopy.classifyAndLog(\"I want doughnut and elm tree\");  \u002F\u002F ['dnt','elm']\n```\n\n注意：序列化功能尚未针对所有可能的分类器和增强方法组合进行全面测试。请在使用前充分测试！\n\n## 交叉验证\n\n```js\n\u002F\u002F 创建一个包含大量输入输出对的数据集：\nvar dataset = [ ... ];\n\n\u002F\u002F 决定你希望使用的 k 折交叉验证的折数：\nvar numOfFolds = 5;\n\n\u002F\u002F 定义你要测试的分类器类型：\nvar IntentClassifier = limdu.classifiers.EnhancedClassifier.bind(0, {\n\tclassifierType: limdu.classifiers.multilabel.BinaryRelevance.bind(0, {\n\t\tbinaryClassifierType: limdu.classifiers.Winnow.bind(0, {retrain_count: 10})\n\t}),\n\tfeatureExtractor: limdu.features.NGramsOfWords(1),\n});\n\nvar microAverage = new limdu.utils.PrecisionRecall();\nvar macroAverage = new limdu.utils.PrecisionRecall();\n\nlimdu.utils.partitions.partitions(dataset, numOfFolds, function(trainSet, testSet) {\n\tconsole.log(\"在 \"+trainSet.length+\" 个样本上训练，在 \"+testSet.length+\" 个样本上测试\");\n\tvar classifier = new IntentClassifier();\n\tclassifier.trainBatch(trainSet);\n\tlimdu.utils.test(classifier, testSet, \u002F* verbosity = *\u002F0,\n\t\tmicroAverage, macroAverage);\n});\n\nmacroAverage.calculateMacroAverageStats(numOfFolds);\nconsole.log(\"\\n\\nMACRO AVERAGE：\"); console.dir(macroAverage.fullStats());\n\nmicroAverage.calculateStats();\nconsole.log(\"\\n\\nMICRO AVERAGE：\"); console.dir(microAverage.fullStats());\n```\n\n## 反向分类（也称生成）\n\n使用此选项可以获取具有给定类别的所有样本列表。\n\n```js\nvar intentClassifier = new limdu.classifiers.EnhancedClassifier({\n\tclassifierType: limdu.classifiers.multilabel.BinaryRelevance.bind(0, {\n\t\tbinaryClassifierType: limdu.classifiers.Winnow.bind(0, {retrain_count: 10})\n\t}),\n\tfeatureExtractor: limdu.features.NGramsOfWords(1),\n\tpastTrainingSamples: [],\n});\n\n\u002F\u002F 训练和测试：\nintentClassifier.trainBatch([\n\t{input: \"I want an apple\", output: \"apl\"},\n\t{input: 
\"I want a banana\", output: \"bnn\"},\n\t{input: \"I really want an apple\", output: \"apl\"},\n\t{input: \"I want a banana very much\", output: \"bnn\"},\n\t]);\n\nconsole.dir(intentClassifier.backClassify(\"apl\"));  \u002F\u002F [ 'I want an apple', 'I really want an apple' ]\n```\n\n## SVM 包装器\n\n原生的 svm.js 实现训练时间较长——与训练样本数量呈二次方关系。\n有两种常用的包可以在与训练样本数量线性相关的时间内完成训练，它们是：\n\n* [SVM-Perf](http:\u002F\u002Fwww.cs.cornell.edu\u002Fpeople\u002Ftj\u002Fsvm_light\u002Fsvm_perf.html) —— 由 Thorsten Joachims 开发；\n* [LibLinear](http:\u002F\u002Fwww.csie.ntu.edu.tw\u002F~cjlin\u002Fliblinear) —— 由 Fan、Chang、Hsieh、Wang 和 Lin 共同开发。\n\nlimdu.js 包提供了这些实现的包装器。要使用这些包装器，你必须在系统路径中拥有用于训练的二进制文件，即：\n\n* **svm\\_perf\\_learn** —— 来自 [SVM-Perf]；\n* **liblinear\\_train** —— 来自 [LibLinear]。\n\n一旦安装了其中任何一个，你就可以用相应的分类器替代之前演示中使用的任何二元分类器，只要有一个特征查找表即可。例如，使用 SvmPerf：\n\n```js\nvar intentClassifier = new limdu.classifiers.EnhancedClassifier({\n\tclassifierType: limdu.classifiers.multilabel.BinaryRelevance.bind(0, {\n\t\tbinaryClassifierType: limdu.classifiers.SvmPerf.bind(0, \t{\n\t\t\tlearn_args: \"-c 20.0\" \n\t\t})\n\t}),\n\tfeatureExtractor: limdu.features.NGramsOfWords(1),\n\tfeatureLookupTable: new limdu.features.FeatureLookupTable()\n});\n```\n\n同样地，也可以使用 SvmLinear。\n\n有关选项的详细信息，请参阅 `classifiers\u002Fsvm\u002FSvmPerf.js` 和 `classifiers\u002Fsvm\u002FSvmLinear.js` 文件。\n\n## 未文档化的功能\n\n一些高级功能目前仍在开发中，尚未记录在文档中。如果你需要这些功能，请提交一个问题，我会尽量补充文档。\n\n* 基于正则表达式的自定义输入归一化；\n* 多标签分类中的输入分段——既可以手动（通过正则表达式），也可以自动；\n* 用于模型适应的特征提取；\n* 拼写检查功能；\n* 上位词特征；\n* 基于跨语言语言模型的分类；\n* 格式转换——ARFF、JSON、svm-light、TSV。\n\n## 许可证\n\nLGPL\n\n## 贡献\n\n欢迎代码贡献。合理的拉取请求，附带适当的文档和单元测试，都将被接受。\n\n你喜欢 limdu 吗？别忘了给它点个星哦 :-)","# Limdu.js 快速上手指南\n\nLimdu 是一个专为 Node.js 设计的机器学习框架，支持**多标签分类**、**在线学习**和**实时分类**。它特别适用于对话系统和聊天机器人的自然语言理解（NLU）场景。\n\n> **注意**：目前该库处于 Alpha 阶段，部分功能可能尚未完善或未经过充分测试。\n\n## 环境准备\n\n*   **操作系统**：支持 Windows、macOS 及 Linux。\n*   **运行环境**：Node.js 0.12 或更高版本（建议使用最新的 LTS 版本）。\n*   **前置依赖**：无需额外系统级依赖，仅需 npm 包管理器。\n\n## 安装步骤\n\n使用 npm 
直接安装即可：\n\n```bash\nnpm install limdu\n```\n\n> **国内加速建议**：如果下载速度较慢，可配置淘宝镜像源进行安装：\n> ```bash\n> npm install limdu --registry=https:\u002F\u002Fregistry.npmmirror.com\n> ```\n\n## 基本使用\n\n以下示例演示了如何使用 Limdu 进行最简单的**二分类批量学习**（根据 RGB 颜色值判断是黑色还是白色）。\n\n### 1. 引入模块并创建分类器\n\n```js\nvar limdu = require('limdu');\n\n\u002F\u002F 创建一个神经网络分类器\nvar colorClassifier = new limdu.classifiers.NeuralNetwork();\n```\n\n### 2. 训练模型\n\n通过 `trainBatch` 方法传入输入输出对数组进行训练：\n\n```js\ncolorClassifier.trainBatch([\n\t{input: { r: 0.03, g: 0.7, b: 0.5 }, output: 0},  \u002F\u002F 黑色\n\t{input: { r: 0.16, g: 0.09, b: 0.2 }, output: 1}, \u002F\u002F 白色\n\t{input: { r: 0.5, g: 0.5, b: 1.0 }, output: 1}    \u002F\u002F 白色\n]);\n```\n\n### 3. 进行预测\n\n使用 `classify` 方法对新数据进行分类：\n\n```js\n\u002F\u002F 输出结果接近 1 (白色)\nconsole.log(colorClassifier.classify({ r: 1, g: 0.4, b: 0 })); \n```\n\n### 进阶：在线学习示例\n\nLimdu 也支持逐条数据的在线学习（Online Learning），适合动态更新模型：\n\n```js\nvar birdClassifier = new limdu.classifiers.Winnow({\n\tdefault_positive_weight: 1,\n\tdefault_negative_weight: 1,\n\tthreshold: 0\n});\n\n\u002F\u002F 在线训练：鹰是鸟 (1)，狗不是鸟 (0)\nbirdClassifier.trainOnline({'wings': 1, 'flight': 1, 'beak': 1, 'eagle': 1}, 1);\nbirdClassifier.trainOnline({'wings': 0, 'flight': 0, 'beak': 0, 'dog': 1}, 0);\n\n\u002F\u002F 预测\nconsole.dir(birdClassifier.classify({'wings': 1, 'flight': 0, 'beak': 0.5, 'penguin':1}));\n```","某初创团队正在开发一款面向电商客服的即时聊天机器人，需要快速理解用户关于商品颜色、类别及售后问题的复杂描述。\n\n### 没有 limdu 时\n- **无法处理多标签意图**：当用户说“这件红色衣服既太贵又没货”时，传统单分类模型只能识别一个意图，导致回复顾此失彼。\n- **冷启动周期长**：每次新增商品品类或促销规则，都必须收集大量历史数据重新进行批量训练，上线延迟严重。\n- **误判原因黑盒化**：当机器人错误地将“企鹅玩偶”归类为“活体宠物”时，开发人员难以追溯判断逻辑，调试效率极低。\n- **实时适应性差**：面对用户现场纠正（如“不，我是说退货而不是换货”），系统无法立即更新认知，需等待下一次模型迭代。\n\n### 使用 limdu 后\n- **精准的多标签分类**：利用 limdu 的多标签分类特性，系统能同时识别“价格投诉”与“库存查询”多个意图，生成综合回复。\n- **支持在线增量学习**：借助在线学习功能，客服在对话中标记的新样本可被 limdu 实时吸收，新规则秒级生效无需重训。\n- **提供决策解释机制**：调用 limdu 的解释功能，开发者可查看模型因“有翅膀”而将蝙蝠误判为鸟的具体权重，快速优化特征工程。\n- **动态修正认知偏差**：通过实时分类与在线训练结合，当用户纠正“企鹅也是鸟”时，limdu 
能立即调整权重，后续对“鸡”的分类也随之准确。\n\nlimdu 让 Node.js 聊天机器人具备了边对话边进化、且决策过程透明可控的核心竞争力。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Ferelsgl_limdu_90ed4fc7.png","erelsgl","Erel Segal-Halevi","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002Ferelsgl_c6407763.jpg","Lecturer at Ariel University","Ariel University",null,"erelsgl@gmail.com","http:\u002F\u002Ferelsgl.github.io","https:\u002F\u002Fgithub.com\u002Ferelsgl",[82],{"name":83,"color":84,"percentage":85},"JavaScript","#f1e05a",100,1051,99,"2026-04-10T19:10:00","LGPL-3.0","未说明 (基于 Node.js，理论上支持所有 Node.js 兼容系统)","不需要","未说明",{"notes":94,"python":95,"dependencies":96},"该工具是一个基于 Node.js 的机器学习框架，处于 Alpha 阶段。它不依赖 Python 或 GPU，主要适用于对话系统和聊天机器人的自然语言理解任务。安装仅需通过 npm 进行。部分功能（如 SVM 包装器）可能需要额外的底层库支持，但核心功能纯 JavaScript 实现。","不适用 (基于 Node.js)",[97,98,99,100,101,102],"Node.js >= 0.12","brain.js (可选，用于神经网络)","classifier.js (可选，用于贝叶斯分类)","perceptron.js (可选，用于感知机)","svm.js (可选，用于 SVM)","node-decision-tree-id3 或 ID3-Decision-Tree (可选，用于决策树)",[35,14],"2026-03-27T02:49:30.150509","2026-04-20T07:17:56.637866",[107,112,117,122,127,132],{"id":108,"question_zh":109,"answer_zh":110,"source_url":111},44103,"使用 SVM 包装器（如 svm_perf_learn）时提示找不到可执行文件怎么办？","该错误通常由两个原因引起：\n1. 环境变量未配置：请确保已将 `svm_perf_learn` 的路径添加到系统环境变量 PATH 中，或者将其复制到 `%SystemRoot%\\system32` 目录下。\n2. 代码检查逻辑缺陷：在 `SvmPerf.js` 中，检查命令 `execSync(\"svm_perf_learn -c 1 a\")` 会因为训练集文件 \"a\" 不存在而失败。建议将检查命令修改为 `execSync(\"svm_perf_learn\")` 仅测试可执行文件是否存在，而不依赖不存在的临时文件。","https:\u002F\u002Fgithub.com\u002Ferelsgl\u002Flimdu\u002Fissues\u002F48",{"id":113,"question_zh":114,"answer_zh":115,"source_url":116},44104,"运行交叉验证示例时报错 'limdu.utils.test does not exist' 如何解决？","`limdu.utils.test` 函数在最新版本中可能已被移除或未提交。如果遇到此问题：\n1. 可以参考历史版本恢复 `PrecisionRecall.js` 文件，地址为：https:\u002F\u002Fgithub.com\u002Ferelsgl\u002Flimdu\u002Fblob\u002Fd61166c91a81daee62e3d67d5fff2b06cee8191f\u002Futils\u002FPrecisionRecall.js\n2. 
或者查看官方提供的替代示例，例如 `classifiers\u002Fsvm\u002FSvmLinearDemo.js`，该示例展示了如何使用交叉验证功能。","https:\u002F\u002Fgithub.com\u002Ferelsgl\u002Flimdu\u002Fissues\u002F53",{"id":118,"question_zh":119,"answer_zh":120,"source_url":121},44105,"svmlinear 分类器无法工作，报错或抛出异常怎么办？","这通常是因为 `liblinear_train` 命令在没有参数时返回退出码 1，导致 Node.js 抛出错误。解决方案是修改调用命令，使其包含一个有效的目录参数（例如当前目录），这样命令会返回退出码 0。具体做法是将执行命令从 `liblinear_train` 改为 `liblinear_train .`（注意末尾的点代表当前目录）。维护者已在后续版本（0.9.2）中修复了此问题，建议升级库版本。","https:\u002F\u002Fgithub.com\u002Ferelsgl\u002Flimdu\u002Fissues\u002F50",{"id":123,"question_zh":124,"answer_zh":125,"source_url":126},44106,"安装 limdu 时因依赖包 'brain' 被废弃而失败，如何处理？","原有的 `brain` 库已被废弃并从 npm 移除，导致依赖它的项目无法安装。解决方案是将依赖切换到社区维护的分支 `brain.js`。维护者已接受此更改并发布了新版本（0.9.4），用户只需更新 `limdu` 到最新版本即可解决安装失败的问题。","https:\u002F\u002Fgithub.com\u002Ferelsgl\u002Flimdu\u002Fissues\u002F57",{"id":128,"question_zh":129,"answer_zh":130,"source_url":131},44107,"如何在 limdu 中使用 Synaptic 神经网络库？","Limdu 已经支持集成 Synaptic 库。用户可以通过提交 Pull Request 或查看相关合并记录（如 PR #41）来启用该功能。维护者确认新的分类器实现正常，并建议贡献者为新分类器添加测试文件以确保稳定性。使用时需确保已安装 `synaptic` 包并在配置中指定使用该分类器。","https:\u002F\u002Fgithub.com\u002Ferelsgl\u002Flimdu\u002Fissues\u002F35",{"id":133,"question_zh":134,"answer_zh":135,"source_url":136},44108,"在哪里可以找到关于折叠（folding）和交叉验证的具体代码示例？","虽然主 README 中的示例可能已过时或不完整，但你可以参考 `classifiers\u002Fsvm\u002FSvmLinearDemo.js` 文件。该文件包含了使用 SvmLinear 分类器进行交叉验证的完整示例代码，展示了如何正确设置和运行交叉验证流程。","https:\u002F\u002Fgithub.com\u002Ferelsgl\u002Flimdu\u002Fissues\u002F37",[138],{"id":139,"version":140,"summary_zh":141,"released_at":142},351672,"v1.0.0","# 1.0.0 (2023-06-19)\n\n\n### Bug修复\n\n* 添加 semantic-release 相关包 ([b669fe7](https:\u002F\u002Fgithub.com\u002Ferelsgl\u002Flimdu\u002Fcommit\u002Fb669fe7a80a3991a267c7309e915ca4f9ac0276e))\n* 更新 publish.yml 中的 Node.js 版本 ([e4063b2](https:\u002F\u002Fgithub.com\u002Ferelsgl\u002Flimdu\u002Fcommit\u002Fe4063b261e3637a5c0e0b5bbac08978358af3142))\n* 将 Node.js 版本升级至 20.x 
([d2d46c9](https:\u002F\u002Fgithub.com\u002Ferelsgl\u002Flimdu\u002Fcommit\u002Fd2d46c96f93ff828968582a7ccc35393e9d6970a))\n* 改用 codfish 的 semantic-release action ([c8c766b](https:\u002F\u002Fgithub.com\u002Ferelsgl\u002Flimdu\u002Fcommit\u002Fc8c766bbfbcac70d9851e24b3f8f3479cc80130f))\n\n\n### 功能\n\n* 自动发布到 npm ([c3cd44f](https:\u002F\u002Fgithub.com\u002Ferelsgl\u002Flimdu\u002Fcommit\u002Fc3cd44f0d8d8266951c3f71685195f692284cc46))","2023-06-19T17:59:23"]
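上文的在线学习示例反复用到 Winnow 的 trainOnline / classify。为便于理解其核心机制，下面给出一个不依赖 limdu、可直接在 Node.js 中运行的极简草图。注意这只是示意性的假设实现（`createWinnow` 及提升/降低因子的默认值均为本示例虚构）：每个特征维护正、负两组权重，仅在预测出错时按乘法规则更新；limdu 实际使用的改进平衡间隔 Winnow 算法细节更复杂。

```javascript
// 极简 Winnow 草图（示意用，并非 limdu 的真实实现）。
// 得分为 sum((w+ - w-) * x)，大于阈值判为 1；
// 预测出错时，对样本中出现的特征做乘法式提升（promotion）或降低（demotion）。
function createWinnow(options) {
  var opts = options || {};
  var posWeights = {};                 // 各特征的正权重
  var negWeights = {};                 // 各特征的负权重
  var threshold = opts.threshold || 0;
  var alpha = opts.promotion || 1.5;   // 提升因子（假设的默认值）
  var beta = opts.demotion || 0.5;     // 降低因子（假设的默认值）

  function ensure(f) {
    if (!(f in posWeights)) {
      posWeights[f] = opts.default_positive_weight || 1;
      negWeights[f] = opts.default_negative_weight || 1;
    }
  }

  function score(features) {
    var sum = 0;
    for (var f in features) {
      ensure(f);
      sum += (posWeights[f] - negWeights[f]) * features[f];
    }
    return sum;
  }

  return {
    classify: function (features) {
      return score(features) > threshold ? 1 : 0;
    },
    trainOnline: function (features, label) {
      if (this.classify(features) === label) return; // 预测正确则不更新
      for (var f in features) {
        if (!features[f]) continue;                  // 只更新样本中出现的特征
        if (label === 1) {                           // 漏报：提升正权重
          posWeights[f] *= alpha; negWeights[f] *= beta;
        } else {                                     // 误报：降低正权重
          posWeights[f] *= beta;  negWeights[f] *= alpha;
        }
      }
    }
  };
}

// 用法与正文中的鸟类示例对应：
var birdClassifier = createWinnow({
  default_positive_weight: 1,
  default_negative_weight: 1,
  threshold: 0
});
birdClassifier.trainOnline({ wings: 1, flight: 1, beak: 1, eagle: 1 }, 1); // 鹰是鸟（1）
birdClassifier.trainOnline({ wings: 0, flight: 0, beak: 0, dog: 1 }, 0);   // 狗不是鸟（0）
console.log(birdClassifier.classify({ wings: 1, flight: 0, beak: 1, chicken: 1 })); // 1：鸡被判为鸟
```

经过两条在线样本训练后，“鸡”因同时具有 wings 与 beak 两个已获提升的特征而被判为鸟类，这正是正文示例想说明的在线增量学习效果。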