[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-online-ml--awesome-online-machine-learning":3,"tool-online-ml--awesome-online-machine-learning":61},[4,18,26,36,44,53],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":17},4358,"openclaw","openclaw\u002Fopenclaw","OpenClaw 是一款专为个人打造的本地化 AI 助手，旨在让你在自己的设备上拥有完全可控的智能伙伴。它打破了传统 AI 助手局限于特定网页或应用的束缚，能够直接接入你日常使用的各类通讯渠道，包括微信、WhatsApp、Telegram、Discord、iMessage 等数十种平台。无论你在哪个聊天软件中发送消息，OpenClaw 都能即时响应，甚至支持在 macOS、iOS 和 Android 设备上进行语音交互，并提供实时的画布渲染功能供你操控。\n\n这款工具主要解决了用户对数据隐私、响应速度以及“始终在线”体验的需求。通过将 AI 部署在本地，用户无需依赖云端服务即可享受快速、私密的智能辅助，真正实现了“你的数据，你做主”。其独特的技术亮点在于强大的网关架构，将控制平面与核心助手分离，确保跨平台通信的流畅性与扩展性。\n\nOpenClaw 非常适合希望构建个性化工作流的技术爱好者、开发者，以及注重隐私保护且不愿被单一生态绑定的普通用户。只要具备基础的终端操作能力（支持 macOS、Linux 及 Windows WSL2），即可通过简单的命令行引导完成部署。如果你渴望拥有一个懂你",349277,3,"2026-04-06T06:32:30",[13,14,15,16],"Agent","开发框架","图像","数据工具","ready",{"id":19,"name":20,"github_repo":21,"description_zh":22,"stars":23,"difficulty_score":10,"last_commit_at":24,"category_tags":25,"status":17},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,"2026-04-05T11:01:52",[14,15,13],{"id":27,"name":28,"github_repo":29,"description_zh":30,"stars":31,"difficulty_score":32,"last_commit_at":33,"category_tags":34,"status":17},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 
代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",148568,2,"2026-04-09T23:34:24",[14,13,35],"语言模型",{"id":37,"name":38,"github_repo":39,"description_zh":40,"stars":41,"difficulty_score":32,"last_commit_at":42,"category_tags":43,"status":17},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",108111,"2026-04-08T11:23:26",[14,15,13],{"id":45,"name":46,"github_repo":47,"description_zh":48,"stars":49,"difficulty_score":32,"last_commit_at":50,"category_tags":51,"status":17},6121,"gemini-cli","google-gemini\u002Fgemini-cli","gemini-cli 是一款由谷歌推出的开源 AI 命令行工具，它将强大的 Gemini 大模型能力直接集成到用户的终端环境中。对于习惯在命令行工作的开发者而言，它提供了一条从输入提示词到获取模型响应的最短路径，无需切换窗口即可享受智能辅助。\n\n这款工具主要解决了开发过程中频繁上下文切换的痛点，让用户能在熟悉的终端界面内直接完成代码理解、生成、调试以及自动化运维任务。无论是查询大型代码库、根据草图生成应用，还是执行复杂的 Git 操作，gemini-cli 都能通过自然语言指令高效处理。\n\n它特别适合广大软件工程师、DevOps 人员及技术研究人员使用。其核心亮点包括支持高达 100 万 token 的超长上下文窗口，具备出色的逻辑推理能力；内置 Google 搜索、文件操作及 Shell 命令执行等实用工具；更独特的是，它支持 MCP（模型上下文协议），允许用户灵活扩展自定义集成，连接如图像生成等外部能力。此外，个人谷歌账号即可享受免费的额度支持，且项目基于 Apache 2.0 
协议完全开源，是提升终端工作效率的理想助手。",100752,"2026-04-10T01:20:03",[52,13,15,14],"插件",{"id":54,"name":55,"github_repo":56,"description_zh":57,"stars":58,"difficulty_score":32,"last_commit_at":59,"category_tags":60,"status":17},4721,"markitdown","microsoft\u002Fmarkitdown","MarkItDown 是一款由微软 AutoGen 团队打造的轻量级 Python 工具，专为将各类文件高效转换为 Markdown 格式而设计。它支持 PDF、Word、Excel、PPT、图片（含 OCR）、音频（含语音转录）、HTML 乃至 YouTube 链接等多种格式的解析，能够精准提取文档中的标题、列表、表格和链接等关键结构信息。\n\n在人工智能应用日益普及的今天，大语言模型（LLM）虽擅长处理文本，却难以直接读取复杂的二进制办公文档。MarkItDown 恰好解决了这一痛点，它将非结构化或半结构化的文件转化为模型“原生理解”且 Token 效率极高的 Markdown 格式，成为连接本地文件与 AI 分析 pipeline 的理想桥梁。此外，它还提供了 MCP（模型上下文协议）服务器，可无缝集成到 Claude Desktop 等 LLM 应用中。\n\n这款工具特别适合开发者、数据科学家及 AI 研究人员使用，尤其是那些需要构建文档检索增强生成（RAG）系统、进行批量文本分析或希望让 AI 助手直接“阅读”本地文件的用户。虽然生成的内容也具备一定可读性，但其核心优势在于为机器",93400,"2026-04-06T19:52:38",[52,14],{"id":62,"github_repo":63,"name":64,"description_en":65,"description_zh":66,"ai_summary_zh":66,"readme_en":67,"readme_zh":68,"quickstart_zh":69,"use_case_zh":70,"hero_image_url":71,"owner_login":72,"owner_name":73,"owner_avatar_url":74,"owner_bio":75,"owner_company":76,"owner_location":76,"owner_email":76,"owner_twitter":76,"owner_website":77,"owner_url":78,"languages":76,"stars":79,"forks":80,"last_commit_at":81,"license":82,"difficulty_score":83,"env_os":75,"env_gpu":84,"env_ram":84,"env_deps":85,"category_tags":88,"github_topics":89,"view_count":32,"oss_zip_url":76,"oss_zip_packed_at":76,"status":17,"created_at":94,"updated_at":95,"faqs":96,"releases":97},6103,"online-ml\u002Fawesome-online-machine-learning","awesome-online-machine-learning",":bookmark_tabs: Online machine learning resources","awesome-online-machine-learning 是一个专为在线机器学习领域打造的精选资源库。与传统批量学习不同，在线机器学习处理的是连续到达的数据流，模型需随新数据实时增量更新。该资源库旨在解决开发者与研究者在面对流式数据时，难以系统获取高质量学习资料、算法实现及前沿论文的痛点。\n\n它非常适合从事实时推荐系统、金融风控、物联网数据分析的工程师，以及专注于序列决策和流式算法的研究人员使用。其核心亮点在于构建了极其详尽的知识体系：不仅收录了从入门课程到专业书籍的学习路径，还按线性模型、神经网络、漂移检测、异常检测等细分技术领域整理了大量学术论文。此外，它还涵盖了建模工具与部署方案，并汇集了业界关于实时机器学习挑战与解决方案的深度博客。无论是想深入了解 Vowpal Wabbit 
等工具的内核原理，还是寻找具体的代码实现参考，这里都能提供一站式的高质量指引，帮助用户高效掌握数据流背后的智能决策技术。","\u003Cdiv align=\"center\">\n    \u003Ch1>Awesome Online Machine Learning\u003C\u002Fh1>\n    \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fsindresorhus\u002Fawesome\">\u003Cimg src=\"https:\u002F\u002Fcdn.rawgit.com\u002Fsindresorhus\u002Fawesome\u002Fd7305f38d29fed78fa85652e3a63e154dd8e8829\u002Fmedia\u002Fbadge.svg\"\u002F>\u003C\u002Fa>\n\u003C\u002Fdiv>\n\n[Online machine learning](https:\u002F\u002Fwww.wikiwand.com\u002Fen\u002FOnline_machine_learning) is a subset of machine learning where data arrives sequentially. In contrast to the more traditional batch learning, online learning methods update themselves incrementally with one data point at a time.\n\n- [Courses and books](#courses-and-books)\n- [Blog posts](#blog-posts)\n- [Software](#software)\n  - [Modelling](#modelling)\n  - [Deployment](#deployment)\n- [Papers](#papers)\n  - [Linear models](#linear-models)\n  - [Support vector machines](#support-vector-machines)\n  - [Neural networks](#neural-networks)\n  - [Decision trees](#decision-trees)\n  - [Unsupervised learning](#unsupervised-learning)\n  - [Time series](#time-series)\n  - [Drift detection](#drift-detection)\n  - [Anomaly detection](#anomaly-detection)\n  - [Metric learning](#metric-learning)\n  - [Graph theory](#graph-theory)\n  - [Ensemble models](#ensemble-models)\n  - [Expert learning](#expert-learning)\n  - [Active learning](#active-learning)\n  - [Miscellaneous](#miscellaneous)\n  - [Surveys](#surveys)\n  - [General-purpose algorithms](#general-purpose-algorithms)\n  - [Hyperparameter tuning](#hyperparameter-tuning)\n  - [Evaluation](#evaluation)\n\n## Courses and books\n\n- [Machine Learning for Streaming Data with Python](https:\u002F\u002Fgithub.com\u002FPacktPublishing\u002FMachine-Learning-for-Streaming-Data-with-Python)\n- [IE 498: Online Learning and Decision Making](https:\u002F\u002Fyuanz.web.illinois.edu\u002Fteaching\u002FIE498fa19\u002F)\n- [Introduction to 
Online Learning](https:\u002F\u002Fparameterfree.com\u002Flecture-notes-on-online-learning\u002F)\n- [Machine Learning the Future](http:\u002F\u002Fwww.hunch.net\u002F~mltf\u002F) — Gives some insights into the inner workings of Vowpal Wabbit, especially the [slides on online linear learning](http:\u002F\u002Fwww.hunch.net\u002F~mltf\u002Fonline_linear.pdf).\n- [Machine learning for data streams with practical examples in MOA](https:\u002F\u002Fwww.cms.waikato.ac.nz\u002F~abifet\u002Fbook\u002Fcontents.html)\n- [Online Methods in Machine Learning (MIT)](http:\u002F\u002Fwww.mit.edu\u002F~rakhlin\u002F6.883\u002F)\n- [Streaming 101: The world beyond batch](https:\u002F\u002Fwww.oreilly.com\u002Fideas\u002Fthe-world-beyond-batch-streaming-101)\n- [Prediction, Learning, and Games](http:\u002F\u002Fwww.ii.uni.wroc.pl\u002F~lukstafi\u002Fpmwiki\u002Fuploads\u002FAGT\u002FPrediction_Learning_and_Games.pdf)\n- [Introduction to Online Convex Optimization](https:\u002F\u002Focobook.cs.princeton.edu\u002FOCObook.pdf)\n- [Reinforcement Learning and Stochastic Optimization: A unified framework for sequential decisions](https:\u002F\u002Fcastlelab.princeton.edu\u002FRLSO\u002F) — The entire book builds upon the online learning paradigm in applied learning\u002Foptimization problems, with *Chapter 3: Online learning* as the main reference.\n- [Big Data course at the CILVR lab at NYU](https:\u002F\u002Fcilvr.cs.nyu.edu\u002Fdoku.php?id=courses:bigdata:slides:start) — Focus on linear models and bandits. 
Some courses are given by John Langford, the creator of Vowpal Wabbit.\n- [Machine Learning for Personalization](http:\u002F\u002Fwww.cs.columbia.edu\u002F~jebara\u002F6998\u002F) — Course from Columbia by Tony Jebara, covers bandits.\n- [An Introduction to Online Learning](http:\u002F\u002Fchercheurs.lille.inria.fr\u002F~lazaric\u002FWebpage\u002FHome\u002FEntries\u002F2012\u002F1\u002F31_Course_on_%22Advanced_topics_of_machine_learning_theory_and_online_learning%22_files\u002Fpoli-online.pdf)\n- [Streaming Data Analytics](https:\u002F\u002Fgithub.com\u002Femanueledellavalle\u002Fstreaming-data-analytics) - Course from Politecnico di Milano.\n\n## Blog posts\n\n- [Fennel AI blog posts about online recsys](https:\u002F\u002Ffennel.ai\u002Fblog)\n- [Anomaly Detection with Bytewax & Redpanda (Bytewax, 2022)](https:\u002F\u002Fwww.bytewax.io\u002Fblog\u002Fanomaly-detection-bw-rpk\u002F)\n- [The online machine learning predict\u002Ffit switcheroo (Max Halford, 2022)](https:\u002F\u002Fmaxhalford.github.io\u002Fblog\u002Fpredict-fit-switcheroo\u002F)\n- [Real-time machine learning: challenges and solutions (Chip Huyen, 2022)](https:\u002F\u002Fhuyenchip.com\u002F2022\u002F01\u002F02\u002Freal-time-machine-learning-challenges-and-solutions.html)\n- [Anomalies detection using River (Matias Aravena Gamboa, 2021)](https:\u002F\u002Fmedium.com\u002Fspikelab\u002Fanomalies-detection-using-river-398544d3536)\n- [Introdução (não-extensiva) a Online Machine Learning (Saulo Mastelini, 2021)](https:\u002F\u002Fmedium.com\u002F@saulomastelini\u002Fintrodu%C3%A7%C3%A3o-a-online-machine-learning-874bd6b7c3c8)\n- [Machine learning is going real-time (Chip Huyen, 2020)](https:\u002F\u002Fhuyenchip.com\u002F2020\u002F12\u002F27\u002Freal-time-machine-learning.html)\n- [The correct way to evaluate online machine learning models (Max Halford, 2020)](https:\u002F\u002Fmaxhalford.github.io\u002Fblog\u002Fonline-learning-evaluation\u002F)\n- [What is online machine learning? 
(Max Pagels, 2018)](https:\u002F\u002Fmedium.com\u002Fvalue-stream-design\u002Fonline-machine-learning-515556ff72c5)\n- [What Is It and Who Needs It (Data Science Central, 2015)](https:\u002F\u002Fwww.datasciencecentral.com\u002Fprofiles\u002Fblogs\u002Fstream-processing-what-is-it-and-who-needs-it)\n\n## Software\n\nSee more [here](https:\u002F\u002Fgithub.com\u002Fstars\u002FMaxHalford\u002Flists\u002Fonline-learning).\n\n### Modelling\n\n- [River](https:\u002F\u002Fgithub.com\u002Fcreme-ml\u002Fcreme\u002F) — A Python library for general purpose online machine learning.\n- [dask](https:\u002F\u002Fml.dask.org\u002Fincremental.html)\n- [Jubatus](http:\u002F\u002Fjubat.us\u002Fen\u002Findex.html)\n- [Flink ML](https:\u002F\u002Fnightlies.apache.org\u002Fflink\u002Fflink-ml-docs-stable\u002F) - Apache Flink machine learning library\n- [LIBFFM](https:\u002F\u002Fwww.csie.ntu.edu.tw\u002F~cjlin\u002Flibffm\u002F) — A Library for Field-aware Factorization Machines\n- [LIBLINEAR](https:\u002F\u002Fwww.csie.ntu.edu.tw\u002F~cjlin\u002Fliblinear\u002F) — A Library for Large Linear Classification\n- [LIBOL](https:\u002F\u002Fgithub.com\u002FLIBOL) — A collection of online linear models trained with first and second order gradient descent methods. Not maintained.\n- [MOA](https:\u002F\u002Fmoa.cms.waikato.ac.nz\u002Fdocumentation\u002F)\n- [scikit-learn](https:\u002F\u002Fscikit-learn.org\u002Fstable\u002F) — [Some](https:\u002F\u002Fscikit-learn.org\u002Fstable\u002Fcomputing\u002Fscaling_strategies.html#incremental-learning) of scikit-learn's estimators can handle incremental updates, although this is usually intended for mini-batch learning. 
See also the [\"Computing with scikit-learn\"](https:\u002F\u002Fscikit-learn.org\u002Fstable\u002Fcomputing.html) page.\n- [Spark Streaming](https:\u002F\u002Fspark.apache.org\u002Fdocs\u002Flatest\u002Fstreaming-programming-guide.html) — Doesn't do online learning per se, but instead splits the data into mini-batches over fixed time intervals.\n- [SofiaML](https:\u002F\u002Fcode.google.com\u002Farchive\u002Fp\u002Fsofia-ml\u002F)\n- [StreamDM](https:\u002F\u002Fgithub.com\u002Fhuawei-noah\u002FstreamDM) — A machine learning library on top of Spark Streaming.\n- [Tornado](https:\u002F\u002Fgithub.com\u002Falipsgh\u002Ftornado)\n- [VFML](http:\u002F\u002Fwww.cs.washington.edu\u002Fdm\u002Fvfml\u002F)\n- [Vowpal Wabbit](https:\u002F\u002Fgithub.com\u002FVowpalWabbit\u002Fvowpal_wabbit)\n\n### Deployment\n\n- [KappaML](https:\u002F\u002Fwww.kappaml.com\u002F)\n- [django-river-ml](https:\u002F\u002Fgithub.com\u002Fvsoch\u002Fdjango-river-ml) — a Django plugin for deploying River models\n- [chantilly](https:\u002F\u002Fgithub.com\u002Fonline-ml\u002Fchantilly) — a prototype meant to be compatible with River (previously Creme)\n\n## Papers\n\n### Linear models\n\n- [A globally optimal fast iterative linear maximum likelihood classifier (2023)](https:\u002F\u002Flibrary.imaging.org\u002Fei\u002Farticles\u002F35\u002F14\u002FCOIMG-172)\n- [Field-aware Factorization Machines for CTR Prediction (2016)](https:\u002F\u002Fwww.csie.ntu.edu.tw\u002F~cjlin\u002Fpapers\u002Fffm.pdf)\n- [Practical Lessons from Predicting Clicks on Ads at Facebook (2014)](https:\u002F\u002Fresearch.fb.com\u002Fwp-content\u002Fuploads\u002F2016\u002F11\u002Fpractical-lessons-from-predicting-clicks-on-ads-at-facebook.pdf)\n- [Ad Click Prediction: a View from the Trenches (2013)](https:\u002F\u002Fstatic.googleusercontent.com\u002Fmedia\u002Fresearch.google.com\u002Fen\u002F\u002Fpubs\u002Farchive\u002F41159.pdf)\n- [Normalized online learning (2013)](https:\u002F\u002Farxiv.org\u002Fabs\u002F1305.6646)\n- 
[Towards Optimal One Pass Large Scale Learning with Averaged Stochastic Gradient Descent (2011)](https:\u002F\u002Farxiv.org\u002Fabs\u002F1107.2490)\n- [Dual Averaging Methods for Regularized Stochastic Learning and Online Optimization (2010)](https:\u002F\u002Fwww.microsoft.com\u002Fen-us\u002Fresearch\u002Fwp-content\u002Fuploads\u002F2016\u002F02\u002Fxiao10JMLR.pdf)\n- [Adaptive Regularization of Weight Vectors (2009)](https:\u002F\u002Fpapers.nips.cc\u002Fpaper\u002F3848-adaptive-regularization-of-weight-vectors.pdf)\n- [Stochastic Gradient Descent Training for L1-regularized Log-linear Models with Cumulative Penalty (2009)](https:\u002F\u002Fwww.aclweb.org\u002Fanthology\u002FP09-1054)\n- [Confidence-Weighted Linear Classification (2008)](https:\u002F\u002Fwww.cs.jhu.edu\u002F~mdredze\u002Fpublications\u002Ficml_variance.pdf)\n- [Exact Convex Confidence-Weighted Learning (2008)](https:\u002F\u002Fwww.cs.jhu.edu\u002F~mdredze\u002Fpublications\u002Fcw_nips_08.pdf)\n- [Online Passive-Aggressive Algorithms (2006)](http:\u002F\u002Fjmlr.csail.mit.edu\u002Fpapers\u002Fvolume7\u002Fcrammer06a\u002Fcrammer06a.pdf)\n- [Logarithmic Regret Algorithms for Online Convex Optimization (2007)](https:\u002F\u002Fwww.cs.princeton.edu\u002F~ehazan\u002Fpapers\u002Flog-journal.pdf)\n- [A Second-Order Perceptron Algorithm (2005)](http:\u002F\u002Fwww.datascienceassn.org\u002Fsites\u002Fdefault\u002Ffiles\u002FSecond-order%20Perception%20Algorithm.pdf)\n- [Online Learning with Kernels (2004)](https:\u002F\u002Falex.smola.org\u002Fpapers\u002F2004\u002FKivSmoWil04.pdf)\n- [Solving Large Scale Linear Prediction Problems Using Stochastic Gradient Descent Algorithms (2004)](http:\u002F\u002Fciteseerx.ist.psu.edu\u002Fviewdoc\u002Fsummary?doi=10.1.1.58.7377)\n\n### Support vector machines\n\n- [Pegasos: Primal Estimated sub-GrAdient SOlver for SVM (2007)](http:\u002F\u002Fciteseerx.ist.psu.edu\u002Fviewdoc\u002Fsummary?doi=10.1.1.74.8513)\n- [A New Approximate Maximal Margin 
Classification Algorithm (2001)](http:\u002F\u002Fwww.jmlr.org\u002Fpapers\u002Fvolume2\u002Fgentile01a\u002Fgentile01a.pdf)\n- [The Relaxed Online Maximum Margin Algorithm (2000)](https:\u002F\u002Fpapers.nips.cc\u002Fpaper\u002F1727-the-relaxed-online-maximum-margin-algorithm.pdf)\n\n### Neural networks\n\n- [Three scenarios for continual learning (2019)](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1904.07734.pdf)\n\n### Decision trees\n\n- [AMF: Aggregated Mondrian Forests for Online Learning (2019)](https:\u002F\u002Farxiv.org\u002Fabs\u002F1906.10529)\n- [Mondrian Forests: Efficient Online Random Forests (2014)](https:\u002F\u002Farxiv.org\u002Fabs\u002F1406.2673)\n- [Mining High-Speed Data Streams (2000)](https:\u002F\u002Fhomes.cs.washington.edu\u002F~pedrod\u002Fpapers\u002Fkdd00.pdf)\n\n### Unsupervised learning\n\n- [Online Clustering: Algorithms, Evaluation, Metrics, Applications and Benchmarking (2022)](https:\u002F\u002Fdl.acm.org\u002Fdoi\u002Fpdf\u002F10.1145\u002F3534678.3542600)\n- [Online hierarchical clustering approximations (2019)](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1909.09667.pdf)\n- [DeepWalk: Online Learning of Social Representations (2014)](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1403.6652.pdf)\n- [Online Learning with Random Representations (2014)](http:\u002F\u002Fciteseerx.ist.psu.edu\u002Fviewdoc\u002Fdownload?doi=10.1.1.127.2742&rep=rep1&type=pdf)\n- [Online Latent Dirichlet Allocation with Infinite Vocabulary (2013)](http:\u002F\u002Fproceedings.mlr.press\u002Fv28\u002Fzhai13.pdf)\n- [Web-Scale K-Means Clustering (2010)](https:\u002F\u002Fwww.eecs.tufts.edu\u002F~dsculley\u002Fpapers\u002Ffastkmeans.pdf)\n- [Online Dictionary Learning For Sparse Coding (2009)](https:\u002F\u002Fwww.di.ens.fr\u002Fsierra\u002Fpdfs\u002Ficml09.pdf)\n- [Density-Based Clustering over an Evolving Data Stream with Noise (2006)](https:\u002F\u002Farchive.siam.org\u002Fmeetings\u002Fsdm06\u002Fproceedings\u002F030caof.pdf)\n- [Knowledge Acquisition Via 
Incremental Conceptual Clustering (2004)](http:\u002F\u002Fwww.inf.ufrgs.br\u002F~engel\u002Fdata\u002Fmedia\u002Ffile\u002FAprendizagem\u002FCobweb.pdf)\n- [Online and Batch Learning of Pseudo-Metrics (2004)](https:\u002F\u002Fai.stanford.edu\u002F~ang\u002Fpapers\u002Ficml04-onlinemetric.pdf)\n- [BIRCH: an efficient data clustering method for very large databases (1996)](https:\u002F\u002Fwww2.cs.sfu.ca\u002FCourseCentral\u002F459\u002Fhan\u002Fpapers\u002Fzhang96.pdf)\n\n### Time series\n\n- [Online Learning for Time Series Prediction (2013)](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1302.6927.pdf)\n\n### Drift detection\n\n- [A Survey on Concept Drift Adaptation (2014)](http:\u002F\u002Feprints.bournemouth.ac.uk\u002F22491\u002F1\u002FACM%20computing%20surveys.pdf)\n\n### Anomaly detection\n\n- [Leveraging the Christoffel-Darboux Kernel for Online Outlier Detection (2022)](https:\u002F\u002Fhal.laas.fr\u002Fhal-03562614\u002Fdocument)\n- [Interpretable Anomaly Detection with Mondrian Pólya Forests on Data Streams (2020)](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2008.01505.pdf)\n- [Fast Anomaly Detection for Streaming Data (2011)](https:\u002F\u002Fwww.ijcai.org\u002FProceedings\u002F11\u002FPapers\u002F254.pdf)\n\n### Metric learning\n\n- [Online Metric Learning and Fast Similarity Search (2009)](http:\u002F\u002Fpeople.bu.edu\u002Fbkulis\u002Fpubs\u002Fnips_online.pdf)\n- [Information-Theoretic Metric Learning (2007)](http:\u002F\u002Fwww.cs.utexas.edu\u002Fusers\u002Fpjain\u002Fpubs\u002Fmetriclearning_icml.pdf)\n- [Online and Batch Learning of Pseudo-Metrics (2004)](https:\u002F\u002Fai.stanford.edu\u002F~ang\u002Fpapers\u002Ficml04-onlinemetric.pdf)\n\n### Graph theory\n\n- [DeepWalk: Online Learning of Social Representations (2014)](http:\u002F\u002Fwww.cs.cornell.edu\u002Fcourses\u002Fcs6241\u002F2019sp\u002Freadings\u002FPerozzi-2014-DeepWalk.pdf)\n\n### Ensemble models\n\n- [Optimal and Adaptive Algorithms for Online Boosting 
(2015)](http:\u002F\u002Fproceedings.mlr.press\u002Fv37\u002Fbeygelzimer15.pdf) — An implementation is available [here](https:\u002F\u002Fgithub.com\u002FVowpalWabbit\u002Fvowpal_wabbit\u002Fblob\u002Fmaster\u002Fvowpalwabbit\u002Fboosting.cc)\n- [Online Bagging and Boosting (2001)](https:\u002F\u002Fti.arc.nasa.gov\u002Fm\u002Fprofile\u002Foza\u002Ffiles\u002Fozru01a.pdf)\n- [A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting (1997)](http:\u002F\u002Fwww.face-rec.org\u002Falgorithms\u002FBoosting-Ensemble\u002Fdecision-theoretic_generalization.pdf)\n\n### Expert learning\n\n- [On the optimality of the Hedge algorithm in the stochastic regime](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1809.01382.pdf)\n\n### Active learning\n\n- [A survey on online active learning (2023)](https:\u002F\u002Farxiv.org\u002Fftp\u002Farxiv\u002Fpapers\u002F2302\u002F2302.08893.pdf)\n\n### Miscellaneous\n\n- [Multi-Output Chain Models and their Application in Data Streams (2019)](https:\u002F\u002Fjmread.github.io\u002Ftalks\u002F2019_03_08-Imperial_Stats_Seminar.pdf)\n- [A Complete Recipe for Stochastic Gradient MCMC (2015)](https:\u002F\u002Farxiv.org\u002Fabs\u002F1506.04696)\n- [Online EM Algorithm for Latent Data Models (2007)](https:\u002F\u002Farxiv.org\u002Fabs\u002F0712.4273) — Source code is available [here](https:\u002F\u002Fwww.di.ens.fr\u002F~cappe\u002FCode\u002FOnlineEM\u002F)\n- [StreamAI: Dealing with Challenges of Continual Learning Systems for Serving AI in Production (2023)](https:\u002F\u002Fieeexplore.ieee.org\u002Fabstract\u002Fdocument\u002F10172871)\n\n### Surveys\n\n- [Machine learning for streaming data: state of the art, challenges, and opportunities (2019)](https:\u002F\u002Fwww.kdd.org\u002Fexploration_files\u002F3._CR_7._Machine_learning_for_streaming_data_state_of_the_art-Final.pdf)\n- [Online Learning: A Comprehensive Survey (2018)](https:\u002F\u002Farxiv.org\u002Fabs\u002F1802.02871)\n- [Online Machine Learning in Big 
Data Streams (2018)](https:\u002F\u002Farxiv.org\u002Fabs\u002F1802.05872v1)\n- [Incremental learning algorithms and applications (2016)](https:\u002F\u002Fwww.elen.ucl.ac.be\u002FProceedings\u002Fesann\u002Fesannpdf\u002Fes2016-19.pdf)\n- [Batch-Incremental versus Instance-Incremental Learning in Dynamic and Evolving Data](http:\u002F\u002Falbertbifet.com\u002Fwp-content\u002Fuploads\u002F2013\u002F10\u002FIDA2012.pdf)\n- [Incremental Gradient, Subgradient, and Proximal Methods for Convex Optimization: A Survey (2011)](https:\u002F\u002Farxiv.org\u002Fabs\u002F1507.01030)\n- [Online Learning and Stochastic Approximations (1998)](https:\u002F\u002Fleon.bottou.org\u002Fpublications\u002Fpdf\u002Fonline-1998.pdf)\n\n### General-purpose algorithms\n\n- [Maintaining Sliding Window Skylines on Data Streams (2006)](http:\u002F\u002Fwww.cs.ust.hk\u002F~dimitris\u002FPAPERS\u002FTKDE06-Sky.pdf)\n- [The Sliding DFT (2003)](https:\u002F\u002Fpdfs.semanticscholar.org\u002F525f\u002Fb581f9afe17b6ec21d6cb58ed42d1100943f.pdf) — An online variant of the Fourier Transform, a concise explanation is available [here](https:\u002F\u002Fwww.comm.utoronto.ca\u002F~dimitris\u002Fece431\u002Fslidingdft.pdf)\n- [Sketching Algorithms for Big Data](https:\u002F\u002Fwww.sketchingbigdata.org\u002F)\n\n### Hyperparameter tuning\n\n- [ChaCha for Online AutoML (2021)](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2106.04815.pdf)\n\n### Evaluation\n\n- [Delayed labelling evaluation for data streams (2019)](https:\u002F\u002Flink.springer.com\u002Farticle\u002F10.1007\u002Fs10618-019-00654-y)\n- [Efficient Online Evaluation of Big Data Stream Classifiers (2015)](https:\u002F\u002Fdl.acm.org\u002Fdoi\u002Fpdf\u002F10.1145\u002F2783258.2783372)\n- [Issues in Evaluation of Stream Learning Algorithms (2009)](https:\u002F\u002Fdl.acm.org\u002Fdoi\u002Fpdf\u002F10.1145\u002F1557019.1557060)\n","\u003Cdiv align=\"center\">\n    \u003Ch1>超棒的在线机器学习\u003C\u002Fh1>\n    \u003Ca 
href=\"https:\u002F\u002Fgithub.com\u002Fsindresorhus\u002Fawesome\">\u003Cimg src=\"https:\u002F\u002Fcdn.rawgit.com\u002Fsindresorhus\u002Fawesome\u002Fd7305f38d29fed78fa85652e3a63e154dd8e8829\u002Fmedia\u002Fbadge.svg\"\u002F>\u003C\u002Fa>\n\u003C\u002Fdiv>\n\n[在线机器学习](https:\u002F\u002Fwww.wikiwand.com\u002Fen\u002FOnline_machine_learning) 是机器学习的一个子集，其中数据按顺序到达。与更传统的批量学习不同，在线学习方法每次只用一个数据点逐步更新自身。\n\n- [课程和书籍](#courses-and-books)\n- [博客文章](#blog-posts)\n- [软件](#software)\n  - [建模](#modelling)\n  - [部署](#deployment)\n- [论文](#papers)\n  - [线性模型](#linear-models)\n  - [支持向量机](#support-vector-machines)\n  - [神经网络](#neural-networks)\n  - [决策树](#decision-trees)\n  - [无监督学习](#unsupervised-learning)\n  - [时间序列](#time-series)\n  - [漂移检测](#drift-detection)\n  - [异常检测](#anomaly-detection)\n  - [度量学习](#metric-learning)\n  - [图论](#graph-theory)\n  - [集成模型](#ensemble-models)\n  - [专家学习](#expert-learning)\n  - [主动学习](#active-learning)\n  - [其他](#miscellaneous)\n  - [综述](#surveys)\n  - [通用算法](#general-purpose-algorithms)\n  - [超参数调优](#hyperparameter-tuning)\n  - [评估](#evaluation)\n\n## 课程和书籍\n\n- [使用Python进行流式数据的机器学习](https:\u002F\u002Fgithub.com\u002FPacktPublishing\u002FMachine-Learning-for-Streaming-Data-with-Python)\n- [IE 498：在线学习与决策制定](https:\u002F\u002Fyuanz.web.illinois.edu\u002Fteaching\u002FIE498fa19\u002F)\n- [在线学习导论](https:\u002F\u002Fparameterfree.com\u002Flecture-notes-on-online-learning\u002F)\n- [机器学习的本质](http:\u002F\u002Fwww.hunch.net\u002F~mltf\u002F) — 提供了关于Vowpal Wabbit内部工作原理的一些见解，尤其是关于[在线线性学习的幻灯片](http:\u002F\u002Fwww.hunch.net\u002F~mltf\u002Fonline_linear.pdf)。\n- [使用MOA的实际示例进行数据流的机器学习](https:\u002F\u002Fwww.cms.waikato.ac.nz\u002F~abifet\u002Fbook\u002Fcontents.html)\n- [麻省理工学院的机器学习在线方法](http:\u002F\u002Fwww.mit.edu\u002F~rakhlin\u002F6.883\u002F)\n- [流式处理101：超越批处理的世界](https:\u002F\u002Fwww.oreilly.com\u002Fideas\u002Fthe-world-beyond-batch-streaming-101)\n- 
[预测、学习与博弈](http:\u002F\u002Fwww.ii.uni.wroc.pl\u002F~lukstafi\u002Fpmwiki\u002Fuploads\u002FAGT\u002FPrediction_Learning_and_Games.pdf)\n- [在线凸优化导论](https:\u002F\u002Focobook.cs.princeton.edu\u002FOCObook.pdf)\n- [强化学习与随机优化：序贯决策的统一框架](https:\u002F\u002Fcastlelab.princeton.edu\u002FRLSO\u002F) — 全书基于应用学习\u002F优化问题中的在线学习范式构建，其中*第3章 在线学习*是参考内容。\n- [纽约大学CILVR实验室的大数据课程](https:\u002F\u002Fcilvr.cs.nyu.edu\u002Fdoku.php?id=courses:bigdata:slides:start) — 重点介绍线性模型和多臂赌博机。部分课程由Vowpal Wabbit的创建者John Langford讲授。\n- [个性化机器学习](http:\u002F\u002Fwww.cs.columbia.edu\u002F~jebara\u002F6998\u002F) — 哥伦比亚大学Tony Jebara教授的课程，涵盖多臂赌博机。\n- [在线学习简介](http:\u002F\u002Fchercheurs.lille.inria.fr\u002F~lazaric\u002FWebpage\u002FHome\u002FEntries\u002F2012\u002F1\u002F31_Course_on_%22Advanced_topics_of_machine_learning_theory_and_online_learning%22_files\u002Fpoli-online.pdf)\n- [流式数据分析](https:\u002F\u002Fgithub.com\u002Femanueledellavalle\u002Fstreaming-data-analytics) - 米兰理工大学的课程。\n\n## 博客文章\n\n- [Fennel AI关于在线推荐系统的博客文章](https:\u002F\u002Ffennel.ai\u002Fblog)\n- [使用Bytewax和Redpanda进行异常检测（Bytewax，2022年）](https:\u002F\u002Fwww.bytewax.io\u002Fblog\u002Fanomaly-detection-bw-rpk\u002F)\n- [在线机器学习中的predict\u002Ffit转换（Max Halford，2022年）](https:\u002F\u002Fmaxhalford.github.io\u002Fblog\u002Fpredict-fit-switcheroo\u002F)\n- [实时机器学习：挑战与解决方案（Chip Huyen，2022年）](https:\u002F\u002Fhuyenchip.com\u002F2022\u002F01\u002F02\u002Freal-time-machine-learning-challenges-and-solutions.html)\n- [使用River进行异常检测（Matias Aravena Gamboa，2021年）](https:\u002F\u002Fmedium.com\u002Fspikelab\u002Fanomalies-detection-using-river-398544d3536)\n- [在线机器学习的简要介绍（Saulo Mastelini，2021年）](https:\u002F\u002Fmedium.com\u002F@saulomastelini\u002Fintrodu%C3%A7%C3%A3o-a-online-machine-learning-874bd6b7c3c8)\n- [机器学习正在走向实时化（Chip Huyen，2020年）](https:\u002F\u002Fhuyenchip.com\u002F2020\u002F12\u002F27\u002Freal-time-machine-learning.html)\n- [正确评估在线机器学习模型的方法（Max 
Halford，2020年）](https:\u002F\u002Fmaxhalford.github.io\u002Fblog\u002Fonline-learning-evaluation\u002F)\n- [什么是在线机器学习？（Max Pagels，2018年）](https:\u002F\u002Fmedium.com\u002Fvalue-stream-design\u002Fonline-machine-learning-515556ff72c5)\n- [它是什么以及谁需要它（数据科学中心，2015年）](https:\u002F\u002Fwww.datasciencecentral.com\u002Fprofiles\u002Fblogs\u002Fstream-processing-what-is-it-and-who-needs-it)\n\n## 软件\n\n更多内容请参见[这里](https:\u002F\u002Fgithub.com\u002Fstars\u002FMaxHalford\u002Flists\u002Fonline-learning)。\n\n### 建模\n\n- [River](https:\u002F\u002Fgithub.com\u002Fcreme-ml\u002Fcreme\u002F) — 一个用于通用在线机器学习的Python库。\n- [dask](https:\u002F\u002Fml.dask.org\u002Fincremental.html)\n- [Jubatus](http:\u002F\u002Fjubat.us\u002Fen\u002Findex.html)\n- [Flink ML](https:\u002F\u002Fnightlies.apache.org\u002Fflink\u002Fflink-ml-docs-stable\u002F) - Apache Flink的机器学习库\n- [LIBFFM](https:\u002F\u002Fwww.csie.ntu.edu.tw\u002F~cjlin\u002Flibffm\u002F) — 一个用于场感知因子分解机的库\n- [LIBLINEAR](https:\u002F\u002Fwww.csie.ntu.edu.tw\u002F~cjlin\u002Fliblinear\u002F) — 一个用于大规模线性分类的库\n- [LIBOL](https:\u002F\u002Fgithub.com\u002FLIBOL) — 一组使用一阶和二阶梯度下降法训练的在线线性模型。目前已不再维护。\n- [MOA](https:\u002F\u002Fmoa.cms.waikato.ac.nz\u002Fdocumentation\u002F)\n- [scikit-learn](https:\u002F\u002Fscikit-learn.org\u002Fstable\u002F) — [部分](https:\u002F\u002Fscikit-learn.org\u002Fstable\u002Fcomputing\u002Fscaling_strategies.html#incremental-learning) scikit-learn的估计器可以处理增量更新，尽管这通常是为了小批量学习设计的。另请参阅“使用scikit-learn计算”页面。\n- [Spark Streaming](https:\u002F\u002Fspark.apache.org\u002Fdocs\u002Flatest\u002Fstreaming-programming-guide.html) — 并非严格意义上的在线学习，而是将数据划分为固定的时间间隔进行小批量处理。\n- [SofiaML](https:\u002F\u002Fcode.google.com\u002Farchive\u002Fp\u002Fsofia-ml\u002F)\n- [StreamDM](https:\u002F\u002Fgithub.com\u002Fhuawei-noah\u002FstreamDM) — 一个基于Spark Streaming的机器学习库。\n- [Tornado](https:\u002F\u002Fgithub.com\u002Falipsgh\u002Ftornado)\n- [VFML](http:\u002F\u002Fwww.cs.washington.edu\u002Fdm\u002Fvfml\u002F)\n- [Vowpal 
Wabbit](https:\u002F\u002Fgithub.com\u002FVowpalWabbit\u002Fvowpal_wabbit)\n\n### 部署\n\n- [KappaML](https:\u002F\u002Fwww.kappaml.com\u002F)\n- [django-river-ml](https:\u002F\u002Fgithub.com\u002Fvsoch\u002Fdjango-river-ml) — 一个用于部署River模型的Django插件\n- [chantilly](https:\u002F\u002Fgithub.com\u002Fonline-ml\u002Fchantilly) — 一个旨在与River兼容的原型（之前为Creme）\n\n## 论文\n\n### 线性模型\n\n- [全局最优的快速迭代线性最大似然分类器（2023）](https:\u002F\u002Flibrary.imaging.org\u002Fei\u002Farticles\u002F35\u002F14\u002FCOIMG-172)\n- [用于CTR预测的领域感知因子分解机（2016）](https:\u002F\u002Fwww.csie.ntu.edu.tw\u002F~cjlin\u002Fpapers\u002Fffm.pdf)\n- [在Facebook上预测广告点击的实际经验（2014）](https:\u002F\u002Fresearch.fb.com\u002Fwp-content\u002Fuploads\u002F2016\u002F11\u002Fpractical-lessons-from-predicting-clicks-on-ads-at-facebook.pdf)\n- [广告点击预测：一线实践（2013）](https:\u002F\u002Fstatic.googleusercontent.com\u002Fmedia\u002Fresearch.google.com\u002Fen\u002F\u002Fpubs\u002Farchive\u002F41159.pdf)\n- [归一化的在线学习（2013）](https:\u002F\u002Farxiv.org\u002Fabs\u002F1305.6646)\n- [面向最优单遍大规模学习的平均随机梯度下降法（2011）](https:\u002F\u002Farxiv.org\u002Fabs\u002F1107.2490)\n- [用于正则化随机学习和在线优化的对偶平均方法（2010）](https:\u002F\u002Fwww.microsoft.com\u002Fen-us\u002Fresearch\u002Fwp-content\u002Fuploads\u002F2016\u002F02\u002Fxiao10JMLR.pdf)\n- [权重向量的自适应正则化（2009）](https:\u002F\u002Fpapers.nips.cc\u002Fpaper\u002F3848-adaptive-regularization-of-weight-vectors.pdf)\n- [带有累积惩罚项的L1正则化对数线性模型的随机梯度下降训练（2009）](https:\u002F\u002Fwww.aclweb.org\u002Fanthology\u002FP09-1054)\n- [置信加权线性分类（2008）](https:\u002F\u002Fwww.cs.jhu.edu\u002F~mdredze\u002Fpublications\u002Ficml_variance.pdf)\n- [精确凸的置信加权学习（2008）](https:\u002F\u002Fwww.cs.jhu.edu\u002F~mdredze\u002Fpublications\u002Fcw_nips_08.pdf)\n- [在线被动-攻击算法（2006）](http:\u002F\u002Fjmlr.csail.mit.edu\u002Fpapers\u002Fvolume7\u002Fcrammer06a\u002Fcrammer06a.pdf)\n- [在线凸优化的对数后悔算法（2007）](https:\u002F\u002Fwww.cs.princeton.edu\u002F~ehazan\u002Fpapers\u002Flog-journal.pdf)\n- 
[A Second-Order Perceptron Algorithm (2005)](http:\u002F\u002Fwww.datascienceassn.org\u002Fsites\u002Fdefault\u002Ffiles\u002FSecond-order%20Perception%20Algorithm.pdf)\n- [Online Learning with Kernels (2004)](https:\u002F\u002Falex.smola.org\u002Fpapers\u002F2004\u002FKivSmoWil04.pdf)\n- [Solving Large Scale Linear Prediction Problems Using Stochastic Gradient Descent Algorithms (2004)](http:\u002F\u002Fciteseerx.ist.psu.edu\u002Fviewdoc\u002Fsummary?doi=10.1.1.58.7377)\n\n### Support vector machines\n\n- [Pegasos: Primal Estimated sub-GrAdient SOlver for SVM (2007)](http:\u002F\u002Fciteseerx.ist.psu.edu\u002Fviewdoc\u002Fsummary?doi=10.1.1.74.8513)\n- [A New Approximate Maximal Margin Classification Algorithm (2001)](http:\u002F\u002Fwww.jmlr.org\u002Fpapers\u002Fvolume2\u002Fgentile01a\u002Fgentile01a.pdf)\n- [The Relaxed Online Maximum Margin Algorithm (2000)](https:\u002F\u002Fpapers.nips.cc\u002Fpaper\u002F1727-the-relaxed-online-maximum-margin-algorithm.pdf)\n\n### Neural networks\n\n- [Three Scenarios for Continual Learning (2019)](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1904.07734.pdf)\n\n### Decision trees\n\n- [AMF: Aggregated Mondrian Forests for Online Learning (2019)](https:\u002F\u002Farxiv.org\u002Fabs\u002F1906.10529)\n- [Mondrian Forests: Efficient Online Random Forests (2014)](https:\u002F\u002Farxiv.org\u002Fabs\u002F1406.2673)\n- [Mining High-Speed Data Streams (2000)](https:\u002F\u002Fhomes.cs.washington.edu\u002F~pedrod\u002Fpapers\u002Fkdd00.pdf)\n\n### Unsupervised learning\n\n- [Online Clustering: Algorithms, Evaluation, Metrics, Applications and Benchmarking (2022)](https:\u002F\u002Fdl.acm.org\u002Fdoi\u002Fpdf\u002F10.1145\u002F3534678.3542600)\n- [Online Hierarchical Clustering Approximations (2019)](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1909.09667.pdf)\n- [DeepWalk: Online Learning of Social Representations (2014)](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1403.6652.pdf)\n- [Online Learning with Random Representations (2014)](http:\u002F\u002Fciteseerx.ist.psu.edu\u002Fviewdoc\u002Fdownload?doi=10.1.1.127.2742&rep=rep1&type=pdf)\n- [Online Latent Dirichlet Allocation with Infinite Vocabulary (2013)](http:\u002F\u002Fproceedings.mlr.press\u002Fv28\u002Fzhai13.pdf)\n- [Web-Scale K-Means Clustering (2010)](https:\u002F\u002Fwww.eecs.tufts.edu\u002F~dsculley\u002Fpapers\u002Ffastkmeans.pdf)\n- [Online Dictionary Learning for Sparse Coding (2009)](https:\u002F\u002Fwww.di.ens.fr\u002Fsierra\u002Fpdfs\u002Ficml09.pdf)\n- [Density-Based Clustering over an Evolving Data Stream with Noise (2006)](https:\u002F\u002Farchive.siam.org\u002Fmeetings\u002Fsdm06\u002Fproceedings\u002F030caof.pdf)\n- 
[Knowledge Acquisition via Incremental Conceptual Clustering (2004)](http:\u002F\u002Fwww.inf.ufrgs.br\u002F~engel\u002Fdata\u002Fmedia\u002Ffile\u002FAprendizagem\u002FCobweb.pdf)\n- [Online and Batch Learning of Pseudo-Metrics (2004)](https:\u002F\u002Fai.stanford.edu\u002F~ang\u002Fpapers\u002Ficml04-onlinemetric.pdf)\n- [BIRCH: An Efficient Data Clustering Method for Very Large Databases (1996)](https:\u002F\u002Fwww2.cs.sfu.ca\u002FCourseCentral\u002F459\u002Fhan\u002Fpapers\u002Fzhang96.pdf)\n\n### Time series\n\n- [Online Learning for Time Series Prediction (2013)](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1302.6927.pdf)\n\n### Drift detection\n\n- [A Survey on Concept Drift Adaptation (2014)](http:\u002F\u002Feprints.bournemouth.ac.uk\u002F22491\u002F1\u002FACM%20computing%20surveys.pdf)\n\n### Anomaly detection\n\n- [Leveraging the Christoffel-Darboux Kernel for Online Outlier Detection (2022)](https:\u002F\u002Fhal.laas.fr\u002Fhal-03562614\u002Fdocument)\n- [Interpretable Anomaly Detection with Mondrian Pólya Forests on Data Streams (2020)](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2008.01505.pdf)\n- [Fast Anomaly Detection for Streaming Data (2011)](https:\u002F\u002Fwww.ijcai.org\u002FProceedings\u002F11\u002FPapers\u002F254.pdf)\n\n### Metric learning\n\n- [Online Metric Learning and Fast Similarity Search (2009)](http:\u002F\u002Fpeople.bu.edu\u002Fbkulis\u002Fpubs\u002Fnips_online.pdf)\n- [Information-Theoretic Metric Learning (2007)](http:\u002F\u002Fwww.cs.utexas.edu\u002Fusers\u002Fpjain\u002Fpubs\u002Fmetriclearning_icml.pdf)\n- [Online and Batch Learning of Pseudo-Metrics (2004)](https:\u002F\u002Fai.stanford.edu\u002F~ang\u002Fpapers\u002Ficml04-onlinemetric.pdf)\n\n### Graph theory\n\n- [DeepWalk: Online Learning of Social Representations (2014)](http:\u002F\u002Fwww.cs.cornell.edu\u002Fcourses\u002Fcs6241\u002F2019sp\u002Freadings\u002FPerozzi-2014-DeepWalk.pdf)\n\n### Ensemble models\n\n- [Optimal and Adaptive Algorithms for Online Boosting (2015)](http:\u002F\u002Fproceedings.mlr.press\u002Fv37\u002Fbeygelzimer15.pdf) - An implementation is available [here](https:\u002F\u002Fgithub.com\u002FVowpalWabbit\u002Fvowpal_wabbit\u002Fblob\u002Fmaster\u002Fvowpalwabbit\u002Fboosting.cc)\n- [Online Bagging and Boosting (2001)](https:\u002F\u002Fti.arc.nasa.gov\u002Fm\u002Fprofile\u002Foza\u002Ffiles\u002Fozru01a.pdf)\n- [A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting (1997)](http:\u002F\u002Fwww.face-rec.org\u002Falgorithms\u002FBoosting-Ensemble\u002Fdecision-theoretic_generalization.pdf)\n\n### Expert learning\n\n- 
[On the Optimality of the Hedge Algorithm in the Stochastic Regime](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1809.01382.pdf)\n\n### Active learning\n\n- [A Survey on Online Active Learning (2023)](https:\u002F\u002Farxiv.org\u002Fftp\u002Farxiv\u002Fpapers\u002F2302\u002F2302.08893.pdf)\n\n### Miscellaneous\n\n- [Multi-Output Chain Models and their Application in Data Streams (2019)](https:\u002F\u002Fjmread.github.io\u002Ftalks\u002F2019_03_08-Imperial_Stats_Seminar.pdf)\n- [A Complete Recipe for Stochastic Gradient MCMC (2015)](https:\u002F\u002Farxiv.org\u002Fabs\u002F1506.04696)\n- [Online EM Algorithm for Latent Data Models (2007)](https:\u002F\u002Farxiv.org\u002Fabs\u002F0712.4273) - Source code is available [here](https:\u002F\u002Fwww.di.ens.fr\u002F~cappe\u002FCode\u002FOnlineEM\u002F)\n- [StreamAI: Dealing with Challenges of Continual Learning Systems for Serving AI in Production (2023)](https:\u002F\u002Fieeexplore.ieee.org\u002Fabstract\u002Fdocument\u002F10172871)\n\n### Surveys\n\n- [Machine Learning for Streaming Data: State of the Art, Challenges, and Opportunities (2019)](https:\u002F\u002Fwww.kdd.org\u002Fexploration_files\u002F3._CR_7._Machine_learning_for_streaming_data_state_of_the_art-Final.pdf)\n- [Online Learning: A Comprehensive Survey (2018)](https:\u002F\u002Farxiv.org\u002Fabs\u002F1802.02871)\n- [Online Machine Learning in Big Data Streams (2018)](https:\u002F\u002Farxiv.org\u002Fabs\u002F1802.05872v1)\n- [Incremental Learning Algorithms and Applications (2016)](https:\u002F\u002Fwww.elen.ucl.ac.be\u002FProceedings\u002Fesann\u002Fesannpdf\u002Fes2016-19.pdf)\n- [Batch-Incremental versus Instance-Incremental Learning in Dynamic and Evolving Data](http:\u002F\u002Falbertbifet.com\u002Fwp-content\u002Fuploads\u002F2013\u002F10\u002FIDA2012.pdf)\n- [Incremental Gradient, Subgradient, and Proximal Methods for Convex Optimization: A Survey (2011)](https:\u002F\u002Farxiv.org\u002Fabs\u002F1507.01030)\n- [Online Learning and Stochastic Approximations (1998)](https:\u002F\u002Fleon.bottou.org\u002Fpublications\u002Fpdf\u002Fonline-1998.pdf)\n\n### General-purpose algorithms\n\n- [Maintaining Sliding Window Skylines on Data Streams (2006)](http:\u002F\u002Fwww.cs.ust.hk\u002F~dimitris\u002FPAPERS\u002FTKDE06-Sky.pdf)\n- [The Sliding DFT (2003)](https:\u002F\u002Fpdfs.semanticscholar.org\u002F525f\u002Fb581f9afe17b6ec21d6cb58ed42d1100943f.pdf) - An online variant of the Fourier transform; a concise explanation is available [here](https:\u002F\u002Fwww.comm.utoronto.ca\u002F~dimitris\u002Fece431\u002Fslidingdft.pdf)\n- [Sketching Algorithms for Big Data](https:\u002F\u002Fwww.sketchingbigdata.org\u002F)\n\n### Hyperparameter tuning\n\n- [ChaCha for Online AutoML 
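(2021)](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2106.04815.pdf)\n\n### Evaluation\n\n- [Delayed Labelling Evaluation for Data Streams (2019)](https:\u002F\u002Flink.springer.com\u002Farticle\u002F10.1007\u002Fs10618-019-00654-y)\n- [Efficient Online Evaluation of Big Data Stream Classifiers (2015)](https:\u002F\u002Fdl.acm.org\u002Fdoi\u002Fpdf\u002F10.1145\u002F2783258.2783372)\n- [Issues in Evaluation of Stream Learning Algorithms (2009)](https:\u002F\u002Fdl.acm.org\u002Fdoi\u002Fpdf\u002F10.1145\u002F1557019.1557060)","# Awesome Online Machine Learning Quick Start Guide\n\n**Awesome Online Machine Learning** is not a single software library but a curated list of online machine learning resources covering courses, blog posts, software tools, and academic papers. Online machine learning is a learning paradigm in which the model is updated incrementally as data arrives sequentially, in contrast to traditional batch learning.\n\nThis guide gives a quick tour of the field's core tooling (using **River**, the most popular library in the Python ecosystem, as the example) and points you to the resources in this list so you can start learning.\n\n## Prerequisites\n\nBefore you start, make sure your development environment meets the following requirements:\n\n*   **Operating system**: Linux, macOS, or Windows (WSL2 recommended).\n*   **Python version**: Python 3.8 or later is recommended.\n*   **Package manager**: `pip` or `conda` is recommended.\n*   **Background knowledge**: basic Python programming skills and familiarity with machine learning concepts such as model training and prediction.\n\n> **Faster downloads in mainland China**:\n> In mainland China, installing dependencies from a domestic mirror is recommended to improve download speed.\n> *   **pip mirror**: Tsinghua University open source software mirror (`-i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple`)\n> *   **Conda mirror**: Tsinghua TUNA mirror configuration\n\n## Installation\n\nBecause this repository is a resource list, in practice you install one of the core libraries it recommends directly. **River** (formerly Creme) is currently one of the most active and full-featured online machine learning libraries in the Python ecosystem.\n\n### 1. Install with pip (recommended)\n\nInstall River from the domestic mirror:\n\n```bash\npip install river -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple\n```\n\n### 2. Verify the installation\n\nRun the following command in a terminal; if it prints a version number without errors, the installation succeeded:\n\n```bash\npython -c \"import river; print(river.__version__)\"\n```\n\n### 3. Get the other resources\n\nYou can clone the repository to browse the recommended papers, course links, and the wider list of software tools offline:\n\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002Fonline-ml\u002Fawesome-online-machine-learning.git\n```\n\n## Basic usage\n\nThe following example shows how to use the **River** library for a minimal online linear regression. Unlike batch learning, online learning learns from each sample as it arrives (`learn_one`) and can make a prediction at any time (`predict_one`).\n\n### Example: online linear regression\n\n```python\nfrom river import linear_model, metrics, preprocessing\n\n# 1. Build the model pipeline\n# This chains a standardizing preprocessor (StandardScaler) with a linear regression model (LinearRegression)\nmodel = preprocessing.StandardScaler() | linear_model.LinearRegression()\n\n# 2. Define the evaluation metric\nmetric = metrics.MSE()\n\n# 3. 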
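Simulate streaming data and learn online\n# Assume the samples arrive one at a time (x: features, y: target)\ndataset = [\n    ({'x': 1}, 2),\n    ({'x': 2}, 4),\n    ({'x': 3}, 6),\n    ({'x': 4}, 8),\n]\n\nfor x, y in dataset:\n    # Predict first\n    y_pred = model.predict_one(x)\n\n    # Update the evaluation metric\n    metric.update(y, y_pred)\n\n    # Then learn (incrementally update the model parameters)\n    model.learn_one(x, y)\n\nprint(f\"Current mean squared error (MSE): {metric.get():.4f}\")\n\n# 4. Predict on new data\nnew_data = {'x': 5}\nprediction = model.predict_one(new_data)\nprint(f\"Predicted value for x=5: {prediction:.2f}\")\n```\n\n### Next steps\n\nDig deeper using the categories in the `awesome-online-machine-learning` repository:\n*   **Software\u002FModelling**: explore other high-performance tools such as Vowpal Wabbit, MOA, and Flink ML.\n*   **Courses and books**: study books such as *Machine Learning for Streaming Data with Python*.\n*   **Papers**: read recent papers on drift detection and anomaly detection.","The recommender system team at an e-commerce platform faces user behavior data streaming in in real time and urgently needs to move from batch-updated models to an online learning architecture that responds to change immediately.\n\n### Without awesome-online-machine-learning\n- **Searching for resources is like finding a needle in a haystack**: team members have to search paper repositories, blogs, and forums by hand for material on “online learning” or “stream data processing”, and after several weeks they still struggle to build a complete picture.\n- **Trial-and-error tool selection**: without a side-by-side comparison of, or best-practice guidance for, mainstream modelling tools such as Vowpal Wabbit and River, the team initially picks a framework that does not support incremental updates and has to rework the architecture.\n- **Theory is hard to put into practice**: developers understand the basic concepts but cannot find concrete code examples or tutorials for “concept drift detection” or “real-time anomaly detection”, so the algorithms keep missing their launch.\n- **Missing recent developments**: with no aggregated channel, the team misses the latest industry insights on real-time machine learning challenges, such as those from Fennel AI or Chip Huyen, and its solution looks dated.\n\n### With awesome-online-machine-learning\n- **One-stop knowledge navigation**: using its clearly categorized lists of courses, books, and papers, the team plans a complete learning path from fundamentals to advanced optimization within two days.\n- **Precise tool matching**: the “Software” section points directly to modelling libraries and deployment options suited to streaming data, which avoids reinventing the wheel and markedly shortens the development cycle.\n- **Scenario-based references**: with the concrete cases on drift detection and time series under “Blog posts” and “Papers”, engineers quickly reproduce the core algorithms and achieve incremental model updates within seconds.\n- **Keeping up with the field**: by following the latest articles collected in the list, the team keeps its recommendation strategy adjusted to sudden changes in user behavior and keeps the system competitive.\n\nawesome-online-machine-learning turns previously fragmented online learning resources into a structured action guide, helping the team move from batch processing to real-time intelligence at minimal cost.","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fonline-ml_awesome-online-machine-learning_abe30967.png","online-ml","The Fellowship of Online Machine 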
Learning","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002Fonline-ml_247cb21b.png","",null,"https:\u002F\u002Fmaxhalford.notion.site\u002FFriends-of-Online-Machine-Learning-8a264829ccf345a4b2627de38139ec8b","https:\u002F\u002Fgithub.com\u002Fonline-ml",611,68,"2026-04-06T06:53:51","CC0-1.0",1,"Not specified",{"notes":86,"python":84,"dependencies":87},"This repository is a list of online machine learning resources (an awesome list), not a single standalone software tool. It collects related courses, blog posts, papers, and several different open source libraries (such as River, Vowpal Wabbit, MOA, and Flink ML). The concrete runtime requirements (operating system, GPU, memory, Python version, and dependencies) therefore depend on the specific sub-project or library you choose to use; this README does not give a single set of installation or runtime requirements.",[],[14],[90,91,92,93],"awesome","awesome-list","machine-learning","online-machine-learning","2026-03-27T02:49:30.150509","2026-04-10T11:24:11.084081",[],[]]