[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-davidrosenberg--mlcourse":3,"tool-davidrosenberg--mlcourse":64},[4,17,27,35,43,56],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":16},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,3,"2026-04-05T11:01:52",[13,14,15],"开发框架","图像","Agent","ready",{"id":18,"name":19,"github_repo":20,"description_zh":21,"stars":22,"difficulty_score":23,"last_commit_at":24,"category_tags":25,"status":16},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",140436,2,"2026-04-05T23:32:43",[13,15,26],"语言模型",{"id":28,"name":29,"github_repo":30,"description_zh":31,"stars":32,"difficulty_score":23,"last_commit_at":33,"category_tags":34,"status":16},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",107662,"2026-04-03T11:11:01",[13,14,15],{"id":36,"name":37,"github_repo":38,"description_zh":39,"stars":40,"difficulty_score":23,"last_commit_at":41,"category_tags":42,"status":16},3704,"NextChat","ChatGPTNextWeb\u002FNextChat","NextChat 是一款轻量且极速的 AI 助手，旨在为用户提供流畅、跨平台的大模型交互体验。它完美解决了用户在多设备间切换时难以保持对话连续性，以及面对众多 AI 模型不知如何统一管理的痛点。无论是日常办公、学习辅助还是创意激发，NextChat 都能让用户随时随地通过网页、iOS、Android、Windows、MacOS 或 Linux 端无缝接入智能服务。\n\n这款工具非常适合普通用户、学生、职场人士以及需要私有化部署的企业团队使用。对于开发者而言，它也提供了便捷的自托管方案，支持一键部署到 Vercel 或 Zeabur 等平台。\n\nNextChat 的核心亮点在于其广泛的模型兼容性，原生支持 Claude、DeepSeek、GPT-4 及 Gemini Pro 等主流大模型，让用户在一个界面即可自由切换不同 AI 能力。此外，它还率先支持 MCP（Model Context Protocol）协议，增强了上下文处理能力。针对企业用户，NextChat 提供专业版解决方案，具备品牌定制、细粒度权限控制、内部知识库整合及安全审计等功能，满足公司对数据隐私和个性化管理的高标准要求。",87618,"2026-04-05T07:20:52",[13,26],{"id":44,"name":45,"github_repo":46,"description_zh":47,"stars":48,"difficulty_score":23,"last_commit_at":49,"category_tags":50,"status":16},2268,"ML-For-Beginners","microsoft\u002FML-For-Beginners","ML-For-Beginners 是由微软推出的一套系统化机器学习入门课程，旨在帮助零基础用户轻松掌握经典机器学习知识。这套课程将学习路径规划为 12 周，包含 26 节精炼课程和 52 
道配套测验，内容涵盖从基础概念到实际应用的完整流程，有效解决了初学者面对庞大知识体系时无从下手、缺乏结构化指导的痛点。\n\n无论是希望转型的开发者、需要补充算法背景的研究人员，还是对人工智能充满好奇的普通爱好者，都能从中受益。课程不仅提供了清晰的理论讲解，还强调动手实践，让用户在循序渐进中建立扎实的技能基础。其独特的亮点在于强大的多语言支持，通过自动化机制提供了包括简体中文在内的 50 多种语言版本，极大地降低了全球不同背景用户的学习门槛。此外，项目采用开源协作模式，社区活跃且内容持续更新，确保学习者能获取前沿且准确的技术资讯。如果你正寻找一条清晰、友好且专业的机器学习入门之路，ML-For-Beginners 将是理想的起点。",84991,"2026-04-05T10:45:23",[14,51,52,53,15,54,26,13,55],"数据工具","视频","插件","其他","音频",{"id":57,"name":58,"github_repo":59,"description_zh":60,"stars":61,"difficulty_score":10,"last_commit_at":62,"category_tags":63,"status":16},3128,"ragflow","infiniflow\u002Fragflow","RAGFlow 是一款领先的开源检索增强生成（RAG）引擎，旨在为大语言模型构建更精准、可靠的上下文层。它巧妙地将前沿的 RAG 技术与智能体（Agent）能力相结合，不仅支持从各类文档中高效提取知识，还能让模型基于这些知识进行逻辑推理和任务执行。\n\n在大模型应用中，幻觉问题和知识滞后是常见痛点。RAGFlow 通过深度解析复杂文档结构（如表格、图表及混合排版），显著提升了信息检索的准确度，从而有效减少模型“胡编乱造”的现象，确保回答既有据可依又具备时效性。其内置的智能体机制更进一步，使系统不仅能回答问题，还能自主规划步骤解决复杂问题。\n\n这款工具特别适合开发者、企业技术团队以及 AI 研究人员使用。无论是希望快速搭建私有知识库问答系统，还是致力于探索大模型在垂直领域落地的创新者，都能从中受益。RAGFlow 提供了可视化的工作流编排界面和灵活的 API 接口，既降低了非算法背景用户的上手门槛，也满足了专业开发者对系统深度定制的需求。作为基于 Apache 2.0 协议开源的项目，它正成为连接通用大模型与行业专有知识之间的重要桥梁。",77062,"2026-04-04T04:44:48",[15,14,13,26,54],{"id":65,"github_repo":66,"name":67,"description_en":68,"description_zh":69,"ai_summary_zh":69,"readme_en":70,"readme_zh":71,"quickstart_zh":72,"use_case_zh":73,"hero_image_url":74,"owner_login":75,"owner_name":76,"owner_avatar_url":77,"owner_bio":78,"owner_company":79,"owner_location":80,"owner_email":79,"owner_twitter":79,"owner_website":79,"owner_url":81,"languages":82,"stars":121,"forks":122,"last_commit_at":123,"license":79,"difficulty_score":124,"env_os":125,"env_gpu":126,"env_ram":126,"env_deps":127,"category_tags":130,"github_topics":131,"view_count":23,"oss_zip_url":79,"oss_zip_packed_at":79,"status":16,"created_at":134,"updated_at":135,"faqs":136,"releases":177},2012,"davidrosenberg\u002Fmlcourse","mlcourse","Machine learning course materials.","mlcourse 是一套清晰、深入的机器学习课程材料，由纽约大学 David Rosenberg 教授整理，涵盖从线性回归、正则化到核方法、EM 算法和贝叶斯统计等核心内容。它特别适合希望系统理解机器学习理论基础的学习者，而非仅学习代码实现。课程注重概念背后的数学直觉，例如用线性代数自然推导表示定理（representer theorem），或通过矩匹配视角解释逻辑回归，帮助学习者摆脱“黑箱”印象。针对学生常感困惑的点，如条件期望符号、弹性网相关性定理、拉格朗日对偶等，课程专门补充了简明笔记，大幅提升理解效率。内容多次迭代优化，删繁就简，比如将原本一小时的对偶理论压缩为十分钟精要，同时新增了对支持向量机重训练、Thompson 采样等实用主题的深入探讨。适合有基础的大学生、研究生、数据科学从业者及对算法原理感兴趣的开发者阅读，尤其推荐给希望超越工具调参、真正理解模型为何有效的学习者。所有材料免费开放，代码与习题设计严谨，是自学或教学的优质资源。","\u003C!-- # DS-GA 1003: Machine Learning and Computational Statistics -->\n\u003C!-- - New figures illustrating regularization paths in space of all functions.-->\n\n## Notable Changes Since [2018](https:\u002F\u002Fdavidrosenberg.github.io\u002Fml2018\u002F#home)\n- Added a [note](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FNotes\u002Fsvm-retraining-with-support-vectors.pdf) on retraining SVMs with just the support vectors\n- Added a [note](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FNotes\u002Flogistic-regression-moment-matching.pdf) on a moment-matching interpretation of fitting logistic regression and more general softmax-style linear conditional probability models.\n\n## Notable Changes from [2017FOML](https:\u002F\u002Fbloomberg.github.io\u002Ffoml\u002F#home) to [2018](https:\u002F\u002Fdavidrosenberg.github.io\u002Fml2018\u002F#home)\n- Elaborated on the [case against sparsity](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2018\u002FLectures\u002F03a.elastic-net.pdf#page=18) in the lecture on elastic net, to complement the reasons *for* sparsity on the slide [Lasso Gives Feature Sparsity: So 
What?](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2018\u002FLectures\u002F02c.L1L2-regularization.pdf).\n- Added a [note on conditional expectations](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FNotes\u002Fconditional-expectations.pdf), since many students find the notation confusing.\n- Added a [note on the correlated features theorem for elastic net](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FNotes\u002Felastic-net-theorem.pdf), which was basically a translation of Zou and Hastie's 2005 paper \"Regularization and variable selection via the elastic net\" into the notation of our class, dropping an unnecessary centering condition, and using a more standard definition of correlation.\n- Changes to EM Algorithm presentation: Added [several diagrams](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2018\u002FLectures\u002F13c.EM-algorithm.pdf#page=10) (slides 10-14) to give the general idea of a variational method, and made explicit that the marginal log-likelihood is exactly the pointwise supremum over the variational lower bounds [(slides 31 and 32)](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2018\u002FLectures\u002F13c.EM-algorithm.pdf#page=31).\n- Treatment of [the representer theorem](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2018\u002FLectures\u002F04c.representer-theorem.pdf) is now well before any mention of kernels, and is described as an interesting consequence of basic linear algebra:  \"Look how the solution always lies in the subspace spanned by the data.  That's interesting (and obvious with enough practice). We can now constrain our optimization problem to this subspace...\"\n- The [kernel methods](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2018\u002FLectures\u002F05a.kernel-methods.pdf) lecture was rewritten to significantly reduce references to the feature map.  When we're just talking about kernelization, it seems like unneeded extra notation. 
\n- Replaced the [1-hour crash course in Lagrangian duality](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2017\u002FLectures\u002F4a.convex-optimization.pdf) with a [10-minute summary of Lagrangian duality](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2018\u002FLectures\u002F04d.lagrangian-duality-in-ten-minutes.pdf), which I actually never presented and left as optional reading.\n- Added a [brief note on Thompson sampling for Bernoulli Bandits](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002Fin-prep\u002Fthompson-sampling-bernoulli.pdf) as a fun application for our [unit on Bayesian statistics](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2018\u002FLectures\u002F08a.bayesian-methods.pdf).\n- Significant improvement of the programming problem for lasso regression in [Homework #2](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FHomework\u002Fhw2.pdf).\n- New written and programming problems on logistic regression in [Homework #5](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FHomework\u002Fhw5.pdf) (showing the equivalence of the ERM and the conditional probability model formulations, as well as implementing regularized logistic regression).\n- New homework on backpropagation [Homework #7](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FHomework\u002Fhw7.pdf) (with Philipp Meerkamp and Pierre Garapon).\n\n## Notable Changes from [2017](https:\u002F\u002Fdavidrosenberg.github.io\u002Fml2017\u002F#home) to [2017FOML](https:\u002F\u002Fbloomberg.github.io\u002Ffoml\u002F#home)\n- This version of the course didn't have any ML prerequisites, so added a couple lectures on the basics:\n    - Added lecture on [Black Box ML](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2017Fall\u002FLectures\u002F01.black-box-ML.pdf).\n    - Added lecture on standard methods of [evaluating classifier performance](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2017Fall\u002FLectures\u002F06b.classifier-performance.pdf).\n- Added a note on the [main takeaways from duality for the SVM](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FNotes\u002FSVM-main-points.pdf). \n- Rather than go through the [full derivation of the SVM dual](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2017\u002FLectures\u002F4b.SVM.pdf), in [the new lecture](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FLectures\u002F04b.SVM-summary.pdf), I just state the dual formulation and highlight the insights we get from the complementary slackness conditions, with an emphasis on the \"sparsity in the data\". \n- Dropped the [geometric derivation of SVMs](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2017\u002FLabs\u002F3-SVM-Slides.pdf) and all mention of hard-margin SVM. It was always a crowd-pleaser, but I don't think it's worth the time. Seemed most useful as a review of affine spaces, projections, and other basic linear algebra.\n- Dropped most of [the AdaBoost lecture](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2017\u002FLectures\u002F9b.adaboost.pdf), except to mention it as a special case of forward stagewise additive modeling with an exponential loss [(slides 24-29)](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2017Fall\u002FLectures\u002F11a.gradient-boosting.pdf#page=23). 
\n- New [worked example](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FNotes\u002Fpoisson-gradient-boosting.pdf) for predicting Poisson distributions with linear and gradient boosting models.\n- New module on [back propagation](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2017Fall\u002FLectures\u002F14b.backpropagation.pdf).\n\n## Notable Changes from [2016](https:\u002F\u002Fdavidrosenberg.github.io\u002Fml2016\u002F#home) to [2017](https:\u002F\u002Fdavidrosenberg.github.io\u002Fml2017\u002F#home)\n- New lecture on geometric approach to SVMs (Brett)\n- New lecture on principal component analysis (Brett)\n- Added slide on k-means++ (Brett)\n- Added slides on explicit feature vector for 1-dim RBF kernel\n- Created notebook to regenerate the buggy lasso\u002Felastic net plots from Hastie's book (Vlad)\n- L2 constraint for linear models gives Lipschitz continuity of prediction function (Thanks to Brian Dalessandro for pointing this out to me). \n- Expanded discussion of L1\u002FL2\u002FElasticNet with correlated random variables (Thanks Brett for the figures)\n\n## Notable Changes from [2015](https:\u002F\u002Fdavidrosenberg.github.io\u002Fml2015\u002F#home) to [2016](https:\u002F\u002Fdavidrosenberg.github.io\u002Fml2016\u002F#home)\n- [New lecture](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2016\u002FLectures\u002F9a.multiclass.pdf) on **multiclass classification** and an intro to **structured prediction**\n- [New homework](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2016\u002FHomework\u002Fhw6-multiclass\u002Fhw6.pdf) on **multiclass hinge loss** and **multiclass SVM**\n- [New homework](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2016\u002FHomework\u002Fhw7-bayesian\u002Fhw7.pdf) on Bayesian methods, specifically the **beta-binomial model, hierarchical models, empirical Bayes ML-II, MAP-II**\n- [New short lecture](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2016\u002FLectures\u002F2.Lab.elastic-net.pdf) on correlated variables with L1, L2, and **Elastic Net** regularization\n- Added some details about subgradient methods, including a [one-slide proof](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2016\u002FLectures\u002F4b.subgradient-descent.pdf#page=14) that subgradient descent moves us towards a minimizer of a\n  convex function (based on [Boyd's notes](http:\u002F\u002Fstanford.edu\u002Fclass\u002Fee364b\u002Flectures.html))\n- Added some [review notes](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2016\u002FNotes\u002Fdirectional-derivative.pdf) on directional derivatives, gradients, and first-order approximations\n- Added light discussion of [convergence rates](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2016\u002FLectures\u002F4a.sgd-gd-revisited.pdf#page=12) for SGD vs GD (accidentally left out theorem for SGD)\n- For lack of time, dropped the [curse of dimensionality discussion](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2015\u002FLectures\u002F1b.intro-slt-riskdecomp.pdf#page=18), originally based on [Guillaume Obozinski's slides](http:\u002F\u002Fsites.uclouvain.be\u002Fsocn\u002Fpmwiki\u002Fuploads\u002FCourses\u002FObozinski1#page=21)\n- [New lecture (from slide 
12)](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2016\u002FLectures\u002F5b.kernel-methods.pdf#page=12) on the **Representer Theorem** (without RKHS), and its use for kernelization (based on [Shalev-Shwartz and Ben-David's book](http:\u002F\u002Fwww.cs.huji.ac.il\u002F~shais\u002FUnderstandingMachineLearning\u002Findex.html))\n- Dropped the [kernel machine approach (slide 16)](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2015\u002FLectures\u002F4c.kernels.pdf#page=16) to introducing kernels, which was based on the approach in Kevin Murphy's book \n- Added EM algorithm convergence [theorem (slide 20)](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2016\u002FLectures\u002F14a.EM-algorithm.pdf#page=20) based on [Vaida's result](http:\u002F\u002Fwww3.stat.sinica.edu.tw\u002Fstatistica\u002Foldpdf\u002Fa15n316.pdf)\t\n- [New lecture](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2016\u002FLectures\u002F8.Lab.more-boosting.pdf) giving more details on gradient boosting, including brief mentions of some variants (**stochastic gradient boosting**, **LogitBoost**, **XGBoost**)\n- New [worked example](https:\u002F\u002Fgithub.com\u002Fdavidrosenberg\u002Fmlcourse\u002Fblob\u002Fgh-pages\u002FArchive\u002F2016\u002FNotes\u002Ftest-two-review-problems.pdf) for predicting exponential distributions with generalized linear models and gradient boosting models.\n- Deconstructed 2015's [lecture on generalized linear models](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2015\u002FLectures\u002F8.Lab.glm.pdf), which started with [natural exponential families (slide 15)](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2015\u002FLectures\u002F8.Lab.glm.pdf#page=15) and built up to a definition of [GLMs (slide 20)](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2015\u002FLectures\u002F8.Lab.glm.pdf#page=20).  Instead, presented the more general notion of [conditional probability models](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2016\u002FLectures\u002F10b.conditional-probability-models.pdf), focused on using MLE and gave multiple examples; relegated formal introduction of exponential families and generalized linear models to the end; \n- Removed equality constraints from [convex optimization lecture](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2016\u002FLectures\u002F3b.convex-optimization.pdf) to\n  simplify, but check [here](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2015\u002FLectures\u002F3b.convex-optimization.pdf) if you want them back\n- Dropped content on [Bayesian Naive Bayes](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2015\u002FLectures\u002F12.Lab.bayesian-methods.pdf), for lack of time\n- Dropped formal discussion of [k-means objective function (slide 9)](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2015\u002FLectures\u002F13.mixture-models.pdf#page=9)\n- Dropped the [brief introduction to **information theory**](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2015\u002FLectures\u002F14a.information-theory.pdf). Initially included, since we needed to introduce KL divergence and Gibbs inequality anyway, for the EM algorithm. 
The mathematical prerequisites are now given [here (slide 15)](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2016\u002FLectures\u002F13.Lab.EM-algorithm.pdf#page=15).\n\n## Possible Future Topics\n### Basic Techniques\n- Gaussian processes\n- MCMC (or at least Gibbs sampling)\n- Importance sampling\n- Density ratio estimation (for covariate shift, anomaly detection, conditional probability modeling)\n- Local methods (knn, locally weighted regression, etc.)\n### Applications\n- Collaborative filtering \u002F matrix factorization (building on [this lecture on matrix factorization](https:\u002F\u002Fgithub.com\u002Fdavidrosenberg\u002Fmlcourse\u002Fblob\u002Fgh-pages\u002Fin-prep\u002Fmatrix-factorization.pdf) and [Brett's lecture on PCA](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2017\u002FLectures\u002F13-PCA-Slides.pdf))\n- Learning to rank and associated concepts\n- Bandits \u002F learning from logged data?\n- Generalized additive models for interpretable nonlinear fits (smoothing way, basis function way, and gradient boosting way)\n- Automated hyperparameter search (with GPs, random, hyperband,...)\n- Active learning\n- Domain shift \u002F covariate shift adaptation\n- Reinforcement learning (minimal path to REINFORCE)\n#### Latent Variable Models\n- PPCA \u002F Factor Analysis and non-Gaussian generalizations\n    - Personality types as example of factor analysis if we can get data?\n- Variational Autoencoders \n- Latent Dirichlet Allocation \u002F topic models\n- Generative models for images and text (where we care about the human-perceived quality of what's generated rather than the likelihood given to test examples) (GANs and friends)\n#### Bayesian Models\n- Relevance vector machines\n- BART\n- Gaussian process regression and conditional probability models\n### Technical Points\n- Overfitting the validation set?\n- Link to paper on [subgradient convergence for tame functions](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1804.07795.pdf)\n### Other\n- Class imbalance\n- Black box feature importance measures (building on [Ben's 2018 lecture](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FLabs\u002FFeatureImportance\u002Ffeature-importance-slides.ipynb))\n- Quantile regression and conditional prediction intervals (perhaps integrated into homework on loss functions); \n- More depth on basic neural networks: weight initialization, vanishing \u002F exploding gradient, possibly batch normalization\n- Finish up 'structured prediction' with beam search \u002F Viterbi\n    - give probabilistic analogue with MEMM's\u002FCRF's \n- Generative vs discriminative (Jordan & Ng's naive bayes vs logistic regression, plus new experiments including regularization)\n- Something about causality?\n- [DART](http:\u002F\u002Fproceedings.mlr.press\u002Fv38\u002Fkorlakaivinayak15.pdf)\n- LightGBM and [CatBoost](http:\u002F\u002Flearningsys.org\u002Fnips17\u002Fassets\u002Fpapers\u002Fpaper_11.pdf) efficient handling of categorical features (i.e. 
handling categorical features in regression trees)\n\n\u003C!-- #    - [Metric-Optimized Example Weights](https:\u002F\u002Farxiv.org\u002Fabs\u002F1805.10582) -->\n\n## Citation Information\n\u003Ca rel=\"license\" href=\"http:\u002F\u002Fcreativecommons.org\u002Flicenses\u002Fby\u002F4.0\u002F\">\u003Cimg alt=\"Creative Commons License\" style=\"border-width:0\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fdavidrosenberg_mlcourse_readme_4650c94e56fa.png\" \u002F>\u003C\u002Fa>\u003Cbr \u002F>\u003Cspan xmlns:dct=\"http:\u002F\u002Fpurl.org\u002Fdc\u002Fterms\u002F\" property=\"dct:title\">Machine Learning Course Materials\u003C\u002Fspan> by \u003Cspan xmlns:cc=\"http:\u002F\u002Fcreativecommons.org\u002Fns#\" property=\"cc:attributionName\">Various Authors\u003C\u002Fspan> is licensed under a \u003Ca rel=\"license\" href=\"http:\u002F\u002Fcreativecommons.org\u002Flicenses\u002Fby\u002F4.0\u002F\">Creative Commons Attribution 4.0 International License\u003C\u002Fa>.  The author of each document in this repository is considered the license holder for that document.\n","\u003C!-- # DS-GA 1003：机器学习与计算统计 -->\n\u003C!-- - 新增图表，用于展示所有函数空间中的正则化路径。-->\n\n## 自[2018年](https:\u002F\u002Fdavidrosenberg.github.io\u002Fml2018\u002F#home)以来的重要变化\n- 增加了关于仅用支持向量重新训练SVM的[笔记](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FNotes\u002Fsvm-retraining-with-support-vectors.pdf)\n- 增加了关于逻辑回归拟合及更一般的softmax风格线性条件概率模型的矩匹配解释的[笔记](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FNotes\u002Flogistic-regression-moment-matching.pdf)\n\n## 自[2017FOML](https:\u002F\u002Fbloomberg.github.io\u002Ffoml\u002F#home)到[2018年](https:\u002F\u002Fdavidrosenberg.github.io\u002Fml2018\u002F#home)的重要变化\n- 在弹性网课程中详细阐述了[反对稀疏性的理由](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2018\u002FLectures\u002F03a.elastic-net.pdf#page=18)，以补充幻灯片[“Lasso带来特征稀疏性：那又如何？”](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2018\u002FLectures\u002F02c.L1L2-regularization.pdf)中*支持*稀疏性的理由\n- 增加了关于[条件期望的笔记](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FNotes\u002Fconditional-expectations.pdf)，因为许多学生对相关符号感到困惑。\n- 增加了关于弹性网相关特征定理的[笔记](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FNotes\u002Felastic-net-theorem.pdf)，该笔记基本上是将Zou和Hastie 2005年的论文“通过弹性网进行正则化与变量选择”翻译成我们课堂的符号，并去掉了不必要的中心化条件，采用了更标准的相关性定义。\n- EM算法讲解的改动：增加了[若干图表](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2018\u002FLectures\u002F13c.EM-algorithm.pdf#page=10)（第10至14页），以说明变分方法的一般思想，并明确指出边际对数似然正是各变分下界的逐点上确界[(第31和32页)](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2018\u002FLectures\u002F13c.EM-algorithm.pdf#page=31)。\n- 对[表示定理](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2018\u002FLectures\u002F04c.representer-theorem.pdf)的处理现在远早于对核函数的任何提及，并将其描述为基本线性代数的一个有趣推论：“看看解总是位于由数据张成的子空间内。这很有趣（只要多练习就显而易见）。我们现在可以将优化问题限制在这个子空间……”\n- [核方法](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2018\u002FLectures\u002F05a.kernel-methods.pdf)课程被重写，大幅减少了对特征映射的引用。当我们只讨论核化时，这些额外的符号似乎并不必要。\n- 用[10分钟的拉格朗日对偶性总结](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2018\u002FLectures\u002F04d.lagrangian-duality-in-ten-minutes.pdf)取代了[1小时的拉格朗日对偶性速成课](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2017\u002FLectures\u002F4a.convex-optimization.pdf)，这份总结实际上我从未在课上讲过，而是留作选读材料。\n- 
增加了关于伯努利赌博机的汤普森采样[简短笔记](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002Fin-prep\u002Fthompson-sampling-bernoulli.pdf)，作为我们[贝叶斯统计单元](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2018\u002FLectures\u002F08a.bayesian-methods.pdf)的一个有趣应用。\n- 显著改进了[作业2](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FHomework\u002Fhw2.pdf)中lasso回归的编程题。\n- [作业5](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FHomework\u002Fhw5.pdf)中新增了逻辑回归的书面和编程题目（展示了ERM与条件概率模型公式的等价性，以及实现正则化逻辑回归）。\n- 新的反向传播作业[作业7](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FHomework\u002Fhw7.pdf)（与Philipp Meerkamp和Pierre Garapon合作完成）。\n\n## 自[2017年](https:\u002F\u002Fdavidrosenberg.github.io\u002Fml2017\u002F#home)到[2017FOML](https:\u002F\u002Fbloomberg.github.io\u002Ffoml\u002F#home)的重要变化\n- 这个版本的课程没有机器学习先修要求，因此增加了几节基础课程：\n    - 增加了关于[黑盒机器学习](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2017Fall\u002FLectures\u002F01.black-box-ML.pdf)的课程。\n    - 增加了关于[评估分类器性能](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2017Fall\u002FLectures\u002F06b.classifier-performance.pdf)的标准方法的课程。\n- 增加了关于[SVM对偶性的主要结论](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FNotes\u002FSVM-main-points.pdf)的笔记。\n- 与其详细推导[SVM对偶问题](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2017\u002FLectures\u002F4b.SVM.pdf)，在[新课程](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FLectures\u002F04b.SVM-summary.pdf)中，我只是陈述对偶形式并强调从互补松弛条件中获得的见解，尤其关注“数据中的稀疏性”。\n- 删除了[SVM的几何推导](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2017\u002FLabs\u002F3-SVM-Slides.pdf)以及所有关于硬间隔SVM的提及。这一直很受学生欢迎，但我认为不值得花那么多时间。它最适合作为仿射空间、投影及其他基础线性代数的复习。\n- 删除了大部分[AdaBoost课程](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2017\u002FLectures\u002F9b.adaboost.pdf)，仅保留了将其作为带指数损失的前向分步加法建模特例的提及[(第24至29页)](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2017Fall\u002FLectures\u002F11a.gradient-boosting.pdf#page=23)。\n- 新增[实例解析](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FNotes\u002Fpoisson-gradient-boosting.pdf)，演示用线性模型和梯度提升模型预测泊松分布。\n- 新增[反向传播模块](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2017Fall\u002FLectures\u002F14b.backpropagation.pdf)。\n\n## 自[2016年](https:\u002F\u002Fdavidrosenberg.github.io\u002Fml2016\u002F#home)到[2017年](https:\u002F\u002Fdavidrosenberg.github.io\u002Fml2017\u002F#home)的重要变化\n- 关于SVM几何方法的新课程（Brett）\n- 关于主成分分析的新课程（Brett）\n- 增加了k-means++的幻灯片（Brett）\n- 增加了关于一维RBF核的显式特征向量的幻灯片\n- 创建了笔记本，用于重新生成Hastie书中的有bug的lasso\u002F弹性网图（Vlad）\n- 线性模型的L2约束使预测函数具有Lipschitz连续性（感谢Brian Dalessandro指出这一点）。\n- 扩展了关于L1\u002FL2\u002F弹性网与相关随机变量的讨论（感谢Brett提供的图表）\n\n## 自[2015年](https:\u002F\u002Fdavidrosenberg.github.io\u002Fml2015\u002F#home)到[2016年](https:\u002F\u002Fdavidrosenberg.github.io\u002Fml2016\u002F#home)的重要变化\n- [新增讲座](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2016\u002FLectures\u002F9a.multiclass.pdf)，讲解**多类分类**并介绍**结构化预测**\n- [新增作业](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2016\u002FHomework\u002Fhw6-multiclass\u002Fhw6.pdf)，涵盖**多类合页损失**和**多类SVM**\n- [新增作业](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2016\u002FHomework\u002Fhw7-bayesian\u002Fhw7.pdf)，关于贝叶斯方法，特别是**贝塔二项模型、层次模型、经验贝叶斯ML-II、MAP-II**\n- 
[新增简短讲座](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2016\u002FLectures\u002F2.Lab.elastic-net.pdf)，讨论相关变量与L1、L2及**弹性网络**正则化\n- 补充了一些有关次梯度方法的细节，包括一个[单页证明](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2016\u002FLectures\u002F4b.subgradient-descent.pdf#page=14)，说明次梯度下降会将我们导向凸函数的极小值点（基于[Boyd的笔记](http:\u002F\u002Fstanford.edu\u002Fclass\u002Fee364b\u002Flectures.html)）\n- 补充了一些关于方向导数、梯度和一阶近似的[复习笔记](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2016\u002FNotes\u002Fdirectional-derivative.pdf)\n- 增加了对SGD与GD[收敛速度](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2016\u002FLectures\u002F4a.sgd-gd-revisited.pdf#page=12)的简要讨论（不小心遗漏了SGD的定理）\n- 因时间不足，删掉了原本基于[纪尧姆·奥博津斯基的幻灯片](http:\u002F\u002Fsites.uclouvain.be\u002Fsocn\u002Fpmwiki\u002Fuploads\u002FCourses\u002FObozinski1#page=21)的[维度灾难讨论](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2015\u002FLectures\u002F1b.intro-slt-riskdecomp.pdf#page=18)\n- [新增讲座（第12页起）](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2016\u002FLectures\u002F5b.kernel-methods.pdf#page=12)，讲解**表示定理**（不涉及再生核希尔伯特空间）及其在核化中的应用（基于[Shalev-Shwartz和Ben-David的书](http:\u002F\u002Fwww.cs.huji.ac.il\u002F~shais\u002FUnderstandingMachineLearning\u002Findex.html)）\n- 删除了原定用于引入核的[核机器方法（第16页）](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2015\u002FLectures\u002F4c.kernels.pdf#page=16)，该方法基于凯文·墨菲书中的思路\n- 增加了EM算法收敛[定理（第20页）](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2016\u002FLectures\u002F14a.EM-algorithm.pdf#page=20)，基于[Vaida的结果](http:\u002F\u002Fwww3.stat.sinica.edu.tw\u002Fstatistica\u002Foldpdf\u002Fa15n316.pdf)\n- [新增讲座](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2016\u002FLectures\u002F8.Lab.more-boosting.pdf)，给出关于梯度提升的更多细节，包括一些变体的简要介绍（**随机梯度提升**、**LogitBoost**、**XGBoost**）\n- 新增了[实操示例](https:\u002F\u002Fgithub.com\u002Fdavidrosenberg\u002Fmlcourse\u002Fblob\u002Fgh-pages\u002FArchive\u002F2016\u002FNotes\u002Ftest-two-review-problems.pdf)，演示用广义线性模型和梯度提升模型预测指数分布\n- 对2015年的[广义线性模型讲座](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2015\u002FLectures\u002F8.Lab.glm.pdf)进行了重构，该讲座最初从[自然指数族（第15页）](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2015\u002FLectures\u002F8.Lab.glm.pdf#page=15)开始，逐步推导出[GLM的定义（第20页）](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2015\u002FLectures\u002F8.Lab.glm.pdf#page=20)。这次转而介绍了更一般的[条件概率模型](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2016\u002FLectures\u002F10b.conditional-probability-models.pdf)概念，重点是使用极大似然估计，并给出了多个例子；正式引入指数族和广义线性模型的内容被移到最后；\n- 在[凸优化讲座](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2016\u002FLectures\u002F3b.convex-optimization.pdf)中去掉了等式约束以简化内容，但如果你想恢复这些约束，请查看[这里](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2015\u002FLectures\u002F3b.convex-optimization.pdf)\n- 因时间不足，删掉了关于[贝叶斯朴素贝叶斯](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2015\u002FLectures\u002F12.Lab.bayesian-methods.pdf)的内容\n- 删掉了对[k-means目标函数（第9页）](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2015\u002FLectures\u002F13.mixture-models.pdf#page=9)的正式讨论\n- 
删掉了[信息论的简要介绍](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2015\u002FLectures\u002F14a.information-theory.pdf)。最初加入是因为我们需要引入KL散度和吉布斯不等式，以便为EM算法做准备。数学预备知识现在已放在[这里（第15页）](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2016\u002FLectures\u002F13.Lab.EM-algorithm.pdf#page=15)。\n\n## 可能的未来主题\n### 基础技术\n- 高斯过程\n- MCMC（或至少吉布斯采样）\n- 重要性采样\n- 密度比估计（用于协变量偏移、异常检测、条件概率建模）\n- 局部方法（knn、局部加权回归等）\n### 应用\n- 协同过滤\u002F矩阵分解（基于[这篇关于矩阵分解的讲座](https:\u002F\u002Fgithub.com\u002Fdavidrosenberg\u002Fmlcourse\u002Fblob\u002Fgh-pages\u002Fin-prep\u002Fmatrix-factorization.pdf)和[Brett关于PCA的讲座](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2017\u002FLectures\u002F13-PCA-Slides.pdf)）\n- 排序学习及相关概念\n- 赌博机（Bandits）\u002F从日志数据中学习？\n- 广义可加模型用于可解释的非线性拟合（平滑法、基函数法和梯度提升法）\n- 自动超参数搜索（使用高斯过程、随机搜索、Hyperband等）\n- 主动学习\n- 域偏移\u002F协变量偏移适应\n- 强化学习（通往REINFORCE的最小路径）\n#### 潜变量模型\n- PPCA\u002F因子分析及其非高斯推广\n    - 如果能获取数据，人格类型可作为因子分析的例子？\n- 变分自编码器\n- 潜在狄利克雷分配（LDA）\u002F主题模型\n- 图像和文本的生成模型（我们关注的是生成内容的人类感知质量，而非测试样本的似然性）（GAN及其同类）\n#### 贝叶斯模型\n- 相关向量机\n- BART\n- 高斯过程回归与条件概率模型\n### 技术要点\n- 过拟合验证集？\n- 指向[关于 tame 函数次梯度收敛的论文](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1804.07795.pdf)的链接\n\n### 其他\n- 类别不平衡\n- 黑箱特征重要性度量（基于[Ben的2018年讲座](https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FLabs\u002FFeatureImportance\u002Ffeature-importance-slides.ipynb)）\n- 分位数回归与条件预测区间（或许融入损失函数相关的作业中）；\n- 更深入地介绍基础神经网络：权重初始化、梯度消失\u002F爆炸，可能还包括批归一化\n- 完成“结构化预测”部分，包括束搜索\u002FViterbi算法\n    - 给出与MEMM\u002FCRF对应的概率模型类比\n- 生成式与判别式（Jordan & Ng的朴素贝叶斯与逻辑回归，以及包含正则化的全新实验）\n- 关于因果关系的一些内容？\n- [DART](http:\u002F\u002Fproceedings.mlr.press\u002Fv38\u002Fkorlakaivinayak15.pdf)\n- LightGBM和[CatBoost](http:\u002F\u002Flearningsys.org\u002Fnips17\u002Fassets\u002Fpapers\u002Fpaper_11.pdf)对分类特征的高效处理（即在回归树中处理分类特征）\n\n\u003C!-- #    - [基于指标优化的示例权重](https:\u002F\u002Farxiv.org\u002Fabs\u002F1805.10582) -->\n\n## 引用信息\n\u003Ca rel=\"license\" href=\"http:\u002F\u002Fcreativecommons.org\u002Flicenses\u002Fby\u002F4.0\u002F\">\u003Cimg alt=\"知识共享许可协议\" style=\"border-width:0\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fdavidrosenberg_mlcourse_readme_4650c94e56fa.png\" \u002F>\u003C\u002Fa>\u003Cbr \u002F>\u003Cspan xmlns:dct=\"http:\u002F\u002Fpurl.org\u002Fdc\u002Fterms\u002F\" property=\"dct:title\">机器学习课程资料\u003C\u002Fspan>由\u003Cspan xmlns:cc=\"http:\u002F\u002Fcreativecommons.org\u002Fns#\" property=\"cc:attributionName\">多位作者\u003C\u002Fspan>创作，采用\u003Ca rel=\"license\" href=\"http:\u002F\u002Fcreativecommons.org\u002Flicenses\u002Fby\u002F4.0\u002F\">知识共享署名4.0国际许可协议\u003C\u002Fa>授权。本仓库中每份文档的作者均被视为该文档的许可持有者。","# mlcourse 快速上手指南\n\n## 环境准备\n\n- **系统要求**：Linux \u002F macOS \u002F Windows（推荐 WSL2）\n- **前置依赖**：\n  - Python 3.8+\n  - pip\n  - Git\n\n推荐使用 `conda` 或 `pyenv` 管理 Python 环境，国内用户可使用清华源加速：\n\n```bash\npip config set global.index-url https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple\n```\n\n## 安装步骤\n\n1. 克隆仓库：\n\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002Fdavidrosenberg\u002Fmlcourse.git\ncd mlcourse\n```\n\n2. 安装依赖（推荐使用 `requirements.txt`）：\n\n```bash\npip install -r requirements.txt\n```\n\n> 若无 `requirements.txt`，可手动安装常用包：  \n> `pip install numpy scipy scikit-learn matplotlib jupyter pandas`\n\n3. 
启动 Jupyter Notebook（用于交互式学习）：\n\n```bash\njupyter notebook\n```\n\n## 基本使用\n\n打开课程笔记或作业文件，例如：\n\n```bash\n# 查看 Homework #2：Lasso 回归编程任务（open 为 macOS 命令，Linux 可用 xdg-open）\nopen Homework\u002Fhw2.pdf\n# 或在 Jupyter 中打开示例 Notebook（示例路径，以仓库实际文件为准）\njupyter notebook Homework\u002Fhw2.ipynb\n```\n\n最简单示例：用 scikit-learn 运行一个 Lasso 回归练习（使用内联示例数据 `X`, `y`）\n\n```python\nfrom sklearn.linear_model import Lasso\nimport numpy as np\n\n# 示例数据\nX = np.array([[1], [2], [3], [4]])\ny = np.array([2, 4, 6, 8])\n\n# 训练 Lasso 回归\nlasso = Lasso(alpha=0.1)\nlasso.fit(X, y)\n\n# 预测\nprint(lasso.predict([[5]]))\n```\n\n> 所有课程材料（笔记、作业、代码）均位于项目目录下，建议按 `Archive\u002F2018\u002F` 或 `Notes\u002F` 目录顺序学习。","一位机器学习工程师在一家金融科技公司负责信用评分模型的迭代，团队正从 Lasso 回归转向弹性网络（Elastic Net），以处理高维特征中大量相关变量的问题。但团队成员对弹性网络的数学原理、正则化路径的直观理解，以及如何在实际代码中实现正则化逻辑回归存在困惑，导致模型调试周期长、解释性差。\n\n### 没有 mlcourse 时\n- 团队对弹性网络中“相关特征如何被同时选中”缺乏直观理解，误以为只选一个特征就代表模型更优。\n- 在实现正则化逻辑回归时，不清楚经验风险最小化（ERM）与条件概率建模之间的等价性，导致代码逻辑混乱。\n- 调试支持向量机（SVM）时，不知道只需用支持向量重训练即可大幅加速，每次都要重新拟合整个数据集。\n- 学生和初级工程师对条件期望符号（如 E[Y|X]）频繁出错，影响模型推导和论文阅读。\n- 缺乏清晰的拉格朗日对偶性速成材料，团队在理解优化约束时依赖晦涩的教材，耗时数天仍不得要领。\n\n### 使用 mlcourse 后\n- 通过《弹性网络相关特征定理》笔记，团队快速理解了为何相关变量会被成组选择，从而合理调整正则化参数 λ₁ 和 λ₂。\n- 借助《逻辑回归矩匹配》笔记，工程师确认了 ERM 与概率建模的等价性，重构了代码，使模型输出更稳定、可解释。\n- 采用支持向量重训练的技巧，SVM 模型更新时间从 45 分钟缩短至 3 分钟，显著提升迭代效率。\n- 《条件期望》笔记成为新员工的必读材料，团队会议中因符号误解导致的返工减少 80%。\n- 10 分钟拉格朗日对偶速成指南让团队在一周内掌握优化约束的几何意义，顺利实现自定义正则化项。\n\nmlcourse 用清晰、实用的数学直觉和工程落地指南，把原本需要数周摸索的理论障碍，变成了团队可快速掌握的实战能力。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fdavidrosenberg_mlcourse_d9ea7bf6.png","davidrosenberg","David S. Rosenberg","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002Fdavidrosenberg_1c163550.jpg","Head of ML Strategy in the Office of the CTO at Bloomberg, and former Adjunct Associate Professor in the Center for Data Science at NYU.  ",null,"Toronto, ON","https:\u002F\u002Fgithub.com\u002Fdavidrosenberg",[83,87,91,95,99,103,107,111,115,118],{"name":84,"color":85,"percentage":86},"Jupyter Notebook","#DA5B0B",60.9,{"name":88,"color":89,"percentage":90},"HTML","#e34c26",20.6,{"name":92,"color":93,"percentage":94},"TeX","#3D6117",13.8,{"name":96,"color":97,"percentage":98},"R","#198CE7",1.8,{"name":100,"color":101,"percentage":102},"Asymptote","#ff0000",1.6,{"name":104,"color":105,"percentage":106},"Mathematica","#dd1100",0.6,{"name":108,"color":109,"percentage":110},"Python","#3572A5",0.4,{"name":112,"color":113,"percentage":114},"Makefile","#427819",0.1,{"name":116,"color":117,"percentage":114},"MATLAB","#e16737",{"name":119,"color":120,"percentage":114},"Shell","#89e051",578,267,"2026-04-05T14:33:03",1,"Linux, macOS, Windows","未说明",{"notes":128,"python":126,"dependencies":129},"该课程为机器学习理论讲义，主要包含笔记、讲义和作业，未提供可运行的代码库或软件环境，因此无明确的运行环境依赖。建议直接阅读 PDF 讲义，并安装 Python 与 Jupyter 环境以运行 Notebook 文件和示例代码（如存在）。",[],[13],[132,133],"machine-learning","course-materials","2026-03-27T02:49:30.150509","2026-04-06T09:43:33.777848",[137,142,147,152,157,162,167,172],{"id":138,"question_zh":139,"answer_zh":140,"source_url":141},9105,"为什么梯度向量与等高线正交？在不可导点处如何理解次梯度？","梯度与等高线正交是因为：若 f(x) = c 定义了一个水平集，g: (-1,1) → R^n 是一条位于该水平集内（即 f(g(t)) ≡ c）且导数非零的参数曲线，则根据链式法则，Df(g(0)) · g'(0) = 0，表明梯度与曲线的切向量垂直，即与等高线正交。在不可导点，次梯度是所有满足 f(y) ≥ f(x) + v^T(y - x) 的向量 v 的集合，可视为梯度的广义形式。参考：https:\u002F\u002Fdavidrosenberg.github.io\u002Fmlcourse\u002FArchive\u002F2016\u002FLectures\u002F4b.subgradient-descent.pdf#page=12","https:\u002F\u002Fgithub.com\u002Fdavidrosenberg\u002Fmlcourse\u002Fissues\u002F18",{"id":143,"question_zh":144,"answer_zh":145,"source_url":146},9106,"在 SVM 中，权重 w 是输入向量的线性组合，这一结论是如何得出的？它与核方法有何关系？","w 是输入向量 x_i 的线性组合，这一结论直接来自原始问题的约束形式，无需强对偶性。一旦知道 w 是 x_i 
的线性组合，核方法的使用就自然成立：因为预测函数仅依赖于内积 x_i^T x_j，可直接用核函数 K(x_i, x_j) 替代。虽然传统推导通过对偶问题得到，但实际更本质的来源是原始问题的结构。","https:\u002F\u002Fgithub.com\u002Fdavidrosenberg\u002Fmlcourse\u002Fissues\u002F19",{"id":148,"question_zh":149,"answer_zh":150,"source_url":151},9107,"如何在 GitHub 仓库中清理自动生成的临时文件（如 .out、VarFill.ltx）？","应移除所有由构建系统自动生成的文件，如 .out、VarFill.ltx、VarFill.tex 等，因为它们可通过重新编译重建。维护者已确认：若文件是自动生成的，就不应提交到仓库中。建议在提交前使用 .gitignore 文件排除这些临时文件。","https:\u002F\u002Fgithub.com\u002Fdavidrosenberg\u002Fmlcourse\u002Fissues\u002F43",{"id":153,"question_zh":154,"answer_zh":155,"source_url":156},9108,"如何通过图形直观比较 L1 和 L2 正则化对重复特征的影响？","可通过绘制 L1 和 L2 范数的等高线与单位球（norm-ball）的交点来可视化：L2 正则化会使重复特征的系数相等（圆形球与等高线相切于对称点），而 L1 正则化倾向于将其中一个系数压缩为零（菱形球在轴上相切）。相关图示见：https:\u002F\u002Fgithub.com\u002Fdavidrosenberg\u002Fmlcourse-homework\u002Ftree\u002Fmaster\u002Fin-prep\u002Frecitations\u002FL1vL2","https:\u002F\u002Fgithub.com\u002Fdavidrosenberg\u002Fmlcourse\u002Fissues\u002F36",{"id":158,"question_zh":159,"answer_zh":160,"source_url":161},9109,"制作文本型或流程图式图表（如特征提取图）推荐使用什么工具？","推荐使用 PowerPoint + LaTeX 插件：将 LaTeX 公式编译为图片后粘贴到 PPT 中，再组合成流程图。例如，feature-extraction.png 就是用此方法制作的。对于更复杂的图表，也可使用 Asymptote 或 JavaScript（如 Percy 教授使用的工具）。","https:\u002F\u002Fgithub.com\u002Fdavidrosenberg\u002Fmlcourse\u002Fissues\u002F25",{"id":163,"question_zh":164,"answer_zh":165,"source_url":166},9110,"学习目标（Learning Objectives）应该如何撰写才能有效指导学生复习和考试？","学习目标应具体、可测试，避免使用“理解 X”这类模糊表述。应明确列出学生应能完成的操作，例如“能推导岭回归的解析解”或“能解释 L1 正则化为何产生稀疏解”。目标将作为学习指南发布在课程网站上，建议使用 markdown 格式（learning-objectives.md）以便维护。","https:\u002F\u002Fgithub.com\u002Fdavidrosenberg\u002Fmlcourse\u002Fissues\u002F32",{"id":168,"question_zh":169,"answer_zh":170,"source_url":171},9111,"向量量化（Vector Quantization）是否应作为特征生成方法在课程中讲解？","虽然向量量化是有效的特征生成方法，但当前课程材料已通过新特征幻灯片（https:\u002F\u002Fgithub.com\u002Fdavidrosenberg\u002Fmlcourse\u002Fblob\u002Fgh-pages\u002FLabs\u002F4-Features\u002Ftex\u002F04e.features.pdf）涵盖其核心思想，只是未使用该术语。建议在讲义中明确加入“向量量化”一词以增强术语一致性。","https:\u002F\u002Fgithub.com\u002Fdavidrosenberg\u002Fmlcourse\u002Fissues\u002F37",{"id":173,"question_zh":174,"answer_zh":175,"source_url":176},9112,"决策树对应的预测函数 f_F 是否存在？如何定义其空间的度量？","决策树对应的预测函数 f_F 是存在的，它是一个从 R^2 到 R 的分段常数函数。这类函数构成的空间可使用 L2(P_X) 范数作为度量，即对输入分布 P_X 下的函数差值平方积分，衡量两个决策树预测结果的平均差异。该度量不依赖于树的结构，只关注输出函数的差异。","https:\u002F\u002Fgithub.com\u002Fdavidrosenberg\u002Fmlcourse\u002Fissues\u002F16",[]]