[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-jayinai--ml-interview":3,"tool-jayinai--ml-interview":64},[4,17,27,35,43,56],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":16},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,3,"2026-04-05T11:01:52",[13,14,15],"开发框架","图像","Agent","ready",{"id":18,"name":19,"github_repo":20,"description_zh":21,"stars":22,"difficulty_score":23,"last_commit_at":24,"category_tags":25,"status":16},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",140436,2,"2026-04-05T23:32:43",[13,15,26],"语言模型",{"id":28,"name":29,"github_repo":30,"description_zh":31,"stars":32,"difficulty_score":23,"last_commit_at":33,"category_tags":34,"status":16},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 
绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",107662,"2026-04-03T11:11:01",[13,14,15],{"id":36,"name":37,"github_repo":38,"description_zh":39,"stars":40,"difficulty_score":23,"last_commit_at":41,"category_tags":42,"status":16},3704,"NextChat","ChatGPTNextWeb\u002FNextChat","NextChat 是一款轻量且极速的 AI 助手，旨在为用户提供流畅、跨平台的大模型交互体验。它完美解决了用户在多设备间切换时难以保持对话连续性，以及面对众多 AI 模型不知如何统一管理的痛点。无论是日常办公、学习辅助还是创意激发，NextChat 都能让用户随时随地通过网页、iOS、Android、Windows、MacOS 或 Linux 端无缝接入智能服务。\n\n这款工具非常适合普通用户、学生、职场人士以及需要私有化部署的企业团队使用。对于开发者而言，它也提供了便捷的自托管方案，支持一键部署到 Vercel 或 Zeabur 等平台。\n\nNextChat 的核心亮点在于其广泛的模型兼容性，原生支持 Claude、DeepSeek、GPT-4 及 Gemini Pro 等主流大模型，让用户在一个界面即可自由切换不同 AI 能力。此外，它还率先支持 MCP（Model Context Protocol）协议，增强了上下文处理能力。针对企业用户，NextChat 提供专业版解决方案，具备品牌定制、细粒度权限控制、内部知识库整合及安全审计等功能，满足公司对数据隐私和个性化管理的高标准要求。",87618,"2026-04-05T07:20:52",[13,26],{"id":44,"name":45,"github_repo":46,"description_zh":47,"stars":48,"difficulty_score":23,"last_commit_at":49,"category_tags":50,"status":16},2268,"ML-For-Beginners","microsoft\u002FML-For-Beginners","ML-For-Beginners 是由微软推出的一套系统化机器学习入门课程，旨在帮助零基础用户轻松掌握经典机器学习知识。这套课程将学习路径规划为 12 周，包含 26 节精炼课程和 52 道配套测验，内容涵盖从基础概念到实际应用的完整流程，有效解决了初学者面对庞大知识体系时无从下手、缺乏结构化指导的痛点。\n\n无论是希望转型的开发者、需要补充算法背景的研究人员，还是对人工智能充满好奇的普通爱好者，都能从中受益。课程不仅提供了清晰的理论讲解，还强调动手实践，让用户在循序渐进中建立扎实的技能基础。其独特的亮点在于强大的多语言支持，通过自动化机制提供了包括简体中文在内的 50 多种语言版本，极大地降低了全球不同背景用户的学习门槛。此外，项目采用开源协作模式，社区活跃且内容持续更新，确保学习者能获取前沿且准确的技术资讯。如果你正寻找一条清晰、友好且专业的机器学习入门之路，ML-For-Beginners 将是理想的起点。",84991,"2026-04-05T10:45:23",[14,51,52,53,15,54,26,13,55],"数据工具","视频","插件","其他","音频",{"id":57,"name":58,"github_repo":59,"description_zh":60,"stars":61,"difficulty_score":10,"last_commit_at":62,"category_tags":63,"status":16},3128,"ragflow","infiniflow\u002Fragflow","RAGFlow 
是一款领先的开源检索增强生成（RAG）引擎，旨在为大语言模型构建更精准、可靠的上下文层。它巧妙地将前沿的 RAG 技术与智能体（Agent）能力相结合，不仅支持从各类文档中高效提取知识，还能让模型基于这些知识进行逻辑推理和任务执行。\n\n在大模型应用中，幻觉问题和知识滞后是常见痛点。RAGFlow 通过深度解析复杂文档结构（如表格、图表及混合排版），显著提升了信息检索的准确度，从而有效减少模型“胡编乱造”的现象，确保回答既有据可依又具备时效性。其内置的智能体机制更进一步，使系统不仅能回答问题，还能自主规划步骤解决复杂问题。\n\n这款工具特别适合开发者、企业技术团队以及 AI 研究人员使用。无论是希望快速搭建私有知识库问答系统，还是致力于探索大模型在垂直领域落地的创新者，都能从中受益。RAGFlow 提供了可视化的工作流编排界面和灵活的 API 接口，既降低了非算法背景用户的上手门槛，也满足了专业开发者对系统深度定制的需求。作为基于 Apache 2.0 协议开源的项目，它正成为连接通用大模型与行业专有知识之间的重要桥梁。",77062,"2026-04-04T04:44:48",[15,14,13,26,54],{"id":65,"github_repo":66,"name":67,"description_en":68,"description_zh":69,"ai_summary_zh":69,"readme_en":70,"readme_zh":71,"quickstart_zh":72,"use_case_zh":73,"hero_image_url":74,"owner_login":75,"owner_name":76,"owner_avatar_url":77,"owner_bio":78,"owner_company":76,"owner_location":76,"owner_email":76,"owner_twitter":76,"owner_website":76,"owner_url":79,"languages":80,"stars":85,"forks":86,"last_commit_at":87,"license":76,"difficulty_score":88,"env_os":89,"env_gpu":90,"env_ram":90,"env_deps":91,"category_tags":94,"github_topics":95,"view_count":23,"oss_zip_url":76,"oss_zip_packed_at":76,"status":16,"created_at":98,"updated_at":99,"faqs":100,"releases":101},2418,"jayinai\u002Fml-interview","ml-interview","Preparing for machine learning interviews","ml-interview 是一个专为机器学习求职者打造的开源面试准备资源库，旨在通过问答形式帮助候选人系统梳理核心知识点。它主要解决了求职者在面对技术面试时，难以全面覆盖算法原理、简历陈述技巧以及基础数据处理能力（如 SQL）等关键痛点。\n\n该工具特别适合正在寻找机器学习工程师、数据科学家岗位的开发者及相关领域的研究人员使用。其独特亮点在于不仅罗列了从线性回归、逻辑回归到 CNN、RNN 等经典算法的深度解析，还特别强调了“如何量化展示简历项目”的实战技巧，指导用户用具体数据对比来突显个人贡献，从而在面试中脱颖而出。此外，它还贴心地提供了 SQL 复习资源链接，弥补了部分算法岗位对数据库能力的考察需求。尽管原作者已推荐更新的项目，但 ml-interview 中关于基础概念的精炼总结与面试策略依然具有很高的参考价值，是备战大厂面试的实用指南。","# This repo is deprecated, check out the latest [Nailing Machine Learning Concepts](https:\u002F\u002Fgithub.com\u002Fjayinai\u002Fnail-ml-concept)\n\nThis repository covers how to prepare for machine learning interviews, mainly\nin the format of questions & answers. 
Aside from machine learning knowledge,\nother crucial aspects include:\n\n* [Explain your resume](#explain-your-resume)\n* [SQL](#sql)\n\nGo directly to [machine learning](#machine-learning)\n\n\n## Explain your resume\n\nYour resume should specify interesting ML projects you worked on in the past,\nand **quantitatively** show your contribution. Consider the following comparison:\n\n> Trained a machine learning system\n\nvs.\n\n> Trained a deep vision system (SqueezeNet) that has 1\u002F30 model size, 1\u002F3 training\n> time, 1\u002F5 inference time, and 2x faster convergence compared with traditional\n> ConvNet (e.g., ResNet)\n\nWe can all tell which one will catch the interviewer's eye and better showcase\nyour ability.\n\nIn the interview, be sure to explain what you've done well. Spend some time going\nover your resume before the interview.\n\n\n## SQL\n\nAlthough you don't have to be a SQL expert for most machine learning positions,\nthe interviewer might ask you some SQL-related questions, so it helps to refresh\nyour memory beforehand. 
Some good SQL resources are:\n\n* [W3schools SQL](https:\u002F\u002Fwww.w3schools.com\u002Fsql\u002F)\n* [SQLZOO](http:\u002F\u002Fsqlzoo.net\u002F)\n\n\n## Machine learning\n\nFirst, it's always a good idea to review [Chapter 5](http:\u002F\u002Fwww.deeplearningbook.org\u002Fcontents\u002Fml.html) \nof the deep learning book, which covers machine learning basics.\n\n\n* [Linear regression](#linear-regression)\n* [Logistic regression](#logistic-regression)\n* [KNN](#knn)\n* [SVM](#svm)\n* [Naive Bayes]\n* [Decision tree](#decision-tree)\n* [Bagging](#bagging)\n* [Random forest](#random-forest)\n* [Boosting](#boosting)\n* [Stacking](#stacking)\n* [Clustering]\n* [MLP](#mlp)\n* [CNN](#cnn)\n* [RNN and LSTM](#rnn-and-lstm)\n* [word2vec](#word2vec)\n* [Generative vs discriminative](#generative-vs-discriminative)\n* [Parametric vs Nonparametric](#paramteric-vs-nonparametric)\n\n\n\n### Linear regression\n\n* how to learn the parameters: minimize the cost function\n* how to minimize the cost function: gradient descent\n* regularization: \n    - L1 (lasso): can shrink certain coef to zero, thus performing feature selection\n    - L2 (ridge): shrinks all coef by the same proportion; almost always outperforms L1\n    - combined (Elastic Net): a weighted mix of the L1 and L2 penalties\n* assumes a linear relationship between features and the label\n* can add polynomial and interaction features to add non-linearity\n\n![lr](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fjayinai_ml-interview_readme_06075b946b95.png)\n\n[back to top](#machine-learning)\n\n\n### Logistic regression\n\n* Generalized linear model (GLM) for classification problems\n* Apply the sigmoid function to the output of linear models, squeezing the output\nto the range [0, 1] \n* Threshold to make a prediction: if the output > .5, predict 1; otherwise predict 0\n* a special case of the softmax function, which deals with multi-class problems\n\n[back to top](#machine-learning)\n\n### KNN\n\nGiven a data point, we compute the K nearest data points 
(neighbors) using a certain\ndistance metric (e.g., the Euclidean metric). For classification, we take the majority label\nof the neighbors; for regression, we take the mean of the label values.\n\nNote that for KNN, technically we don't need to train a model; we simply compute during\ninference time. This can be computationally expensive, since each test example\nneeds to be compared with every training example to see how close they are.\n\nThere are approximation methods that achieve faster inference time by\npartitioning the training data into regions.\n\nNote that when K equals 1 or another small number, the model is prone to overfitting (high variance), while\nwhen K equals the number of data points or another large number, the model is prone to underfitting (high bias)\n\n![KNN](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fjayinai_ml-interview_readme_0c04efa0c727.png)\n\n[back to top](#machine-learning)\n\n\n### SVM\n\n* can perform linear or nonlinear classification, as well as outlier detection (unsupervised)\n* large margin classifier: not only has a decision boundary, but wants the boundary\nto be as far from the closest training point as possible\n* the closest training examples are called the support vectors, since they are the points\nbased on which the decision boundary is drawn\n* SVMs are sensitive to feature scaling\n\n![svm](https:\u002F\u002Fqph.ec.quoracdn.net\u002Fmain-qimg-675fedee717331e478ecfcc40e2e4d38)\n\n\n[back to top](#machine-learning)\n\n\n### Decision tree\n\n* Non-parametric, supervised learning algorithms\n* Given the training data, a decision tree algorithm divides the feature space into\nregions. For inference, we first see which\nregion the test data point falls in, and take the mean label value (regression)\nor the majority label value (classification).\n* **Construction**: top-down, chooses a variable to split the data such that the \ntarget variables within each region are as homogeneous as possible. 
Two common\nmetrics: Gini impurity and information gain; the choice won't matter much in practice.\n* Advantage: simple to understand & interpret, mirrors human decision making\n* Disadvantage: \n    - can overfit easily (and generalize poorly) if we don't limit the depth of the tree\n    - can be non-robust: a small change in the training data can lead to a totally different tree\n    - instability: sensitive to training set rotation due to its orthogonal decision boundaries\n\n![decision tree](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fjayinai_ml-interview_readme_c557af4b34c5.gif)\n\n[back to top](#machine-learning)\n\n\n### Bagging\n\nTo address overfitting, we can use an ensemble method called bagging (bootstrap aggregating),\nwhich reduces the variance of the meta learning algorithm. Bagging can be applied\nto decision trees or other algorithms.\n\nHere is a [great illustration](http:\u002F\u002Fscikit-learn.org\u002Fstable\u002Fauto_examples\u002Fensemble\u002Fplot_bias_variance.html#sphx-glr-auto-examples-ensemble-plot-bias-variance-py) of a single estimator vs. bagging\n\n![bagging](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fjayinai_ml-interview_readme_c27baac76629.png)\n\n* Bagging is when sampling is performed *with* replacement. When sampling is performed *without* replacement, it's called pasting.\n* Bagging is popular not only for its performance boost, but also because individual learners can be trained in parallel and scale well\n* Ensemble methods work best when the learners are as independent from one another as possible\n* Voting: soft voting (predict probabilities and average over all individual learners) often works better than hard voting\n* out-of-bag instances (~37%) can act as a validation set for bagging\n\n\n\n[back to top](#machine-learning)\n\n\n### Random forest\n\nRandom forest improves bagging further by adding some randomness. 
In random forest,\nonly a subset of features is selected at random to construct each tree (while instances are often not subsampled).\nThe benefit is that random forest **decorrelates** the trees. \n\nFor example, suppose we have a dataset. There is one very predictive feature, and a couple\nof moderately predictive features. In bagging trees, most of the trees\nwill use this very predictive feature in the top split, therefore making most of the trees\nlook similar, **and highly correlated**. Averaging many highly correlated results won't lead\nto a large reduction in variance compared with uncorrelated results. \nIn random forest, for each split we only consider a subset of the features and therefore\nreduce the variance even further by introducing more uncorrelated trees.\n\nI wrote a [notebook](notebooks\u002Fbag-rf-var.ipynb) to illustrate this point.\n\nIn practice, tuning random forest entails having a large number of trees (the more the better, but\nalways consider computation constraints). Also, use `min_samples_leaf` (the minimum number of\nsamples at a leaf node) to control the tree size and overfitting. Always cross-validate the parameters. \n\n**Feature importance**\n\nIn a decision tree, important features are likely to appear closer to the root of the tree. We can get\na feature's importance for random forest by computing the average depth at which it appears across all\ntrees in the forest.\n\n\n[back to top](#machine-learning)\n\n\n### Boosting\n\n**How it works**\n\nBoosting builds on weak learners, in an iterative fashion. In each iteration,\na new learner is added, while all existing learners are kept unchanged. All learners\nare weighted based on their performance (e.g., accuracy), and after a weak learner\nis added, the data are re-weighted: examples that are misclassified gain more weight,\nwhile examples that are correctly classified lose weight. 
Thus, future weak learners\nfocus more on examples that previous weak learners misclassified.\n\n\n**Difference from random forest (RF)**\n\n* RF grows trees **in parallel**, while Boosting is sequential\n* RF reduces variance, while Boosting reduces errors by reducing bias\n\n\n**XGBoost (Extreme Gradient Boosting)**\n\n\n> XGBoost uses a more regularized model formalization to control overfitting, which gives it better performance\n\n[back to top](#machine-learning)\n\n\n### Stacking\n\n* Instead of using trivial functions (such as hard voting) to aggregate the predictions from individual learners, train a model to perform this aggregation\n* First, split the training set into two subsets: the first subset is used to train the learners in the first layer\n* Next, the first-layer learners are used to make predictions (meta features) on the second subset, and those predictions are used to train another model (to obtain the weights of the different learners) in the second layer\n* We can train multiple models in the second layer, but this entails splitting the original dataset into 3 parts\n\n![stacking](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fjayinai_ml-interview_readme_5dad471261b9.jpg)\n\n[back to top](#machine-learning)\n\n\n### MLP\n\nA feedforward neural network where we have multiple layers. In each layer we\ncan have multiple neurons, and each neuron in the next layer is a linear\u002Fnonlinear\ncombination of all the neurons in the previous layer. In order to train the network,\nwe back-propagate the errors layer by layer. In theory, an MLP can approximate any function.\n\n![mlp](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fjayinai_ml-interview_readme_88a70583fb61.jpg)\n\n[back to top](#machine-learning)\n\n### CNN\n\nThe Conv layer is the building block of a Convolutional Network. The Conv layer consists\nof a set of learnable filters (such as 5 * 5 * 3, width * height * depth). 
During the forward\npass, we slide (or more precisely, convolve) the filter across the input and compute the dot \nproduct. Learning again happens when the network back-propagates the error layer by layer.\n\nInitial layers capture low-level features such as angles and edges, while later\nlayers learn combinations of the low-level features from the previous layers \nand can therefore represent higher-level features, such as shapes and object parts.\n\n![CNN](http:\u002F\u002Fwww.kdnuggets.com\u002Fwp-content\u002Fuploads\u002Fdnn-layers.jpg)\n\n[back to top](#machine-learning)\n\n### RNN and LSTM\n\nRNN is another paradigm of neural network where we have different layers of cells,\nand each cell takes as input not only the cell from the previous layer, but also the previous\ncell within the same layer. This gives RNN the power to model sequences. \n\n![RNN](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fjayinai_ml-interview_readme_2a37bd4e9b12.jpeg)\n\nThis seems great, but in practice RNN barely works due to exploding\u002Fvanishing gradients, which \nare caused by repeated multiplication of the same matrix. To solve this, we can use \na variation of RNN, called long short-term memory (LSTM), which is capable of learning\nlong-term dependencies. 
\n\nThe math behind LSTM can be pretty complicated, but intuitively, LSTM introduces \n    - input gate\n    - output gate\n    - forget gate\n    - memory cell (internal state)\n    \nLSTM resembles human memory: it forgets old stuff (old internal state * forget gate) \nand learns from new input (input node * input gate)\n\n![lstm](http:\u002F\u002Fdeeplearning.net\u002Ftutorial\u002F_images\u002Flstm_memorycell.png)\n\n[back to top](#machine-learning)\n\n\n### word2vec\n\n* Shallow, two-layer neural networks that are trained to reconstruct the linguistic contexts of words\n* Takes as input a large corpus, and produces a vector space, typically of several hundred\ndimensions, and each word in the corpus is assigned a vector in the space\n* The key idea is context: words that often occur in the same contexts should have similar\nmeanings.\n* Two flavors\n    - continuous bag of words (CBOW): the model predicts the current word given a window of surrounding context words\n    - skip gram: predicts the surrounding context words using the current word\n\n![word2vec](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fjayinai_ml-interview_readme_2a021c45dcf2.png)\n\n[back to top](#machine-learning)\n\n\n### Generative vs discriminative\n\n* Discriminative algorithms model *p(y|x; w)*, that is, given the dataset and the learned\nparameters, what is the probability of y belonging to a specific class. A discriminative algorithm\ndoesn't care about how the data was generated; it simply categorizes a given example\n* Generative algorithms try to model *p(x|y)*, that is, the distribution of the features given\nthat the example belongs to a certain class. A generative algorithm models how the data was\ngenerated.\n\n> Given a training set, an algorithm like logistic regression or\n> the perceptron algorithm (basically) tries to find a straight line—that is, a\n> decision boundary—that separates the elephants and dogs. 
Then, to classify\n> a new animal as either an elephant or a dog, it checks on which side of the\n> decision boundary it falls, and makes its prediction accordingly.\n\n> Here’s a different approach. First, looking at elephants, we can build a\n> model of what elephants look like. Then, looking at dogs, we can build a\n> separate model of what dogs look like. Finally, to classify a new animal, we\n> can match the new animal against the elephant model, and match it against\n> the dog model, to see whether the new animal looks more like the elephants\n> or more like the dogs we had seen in the training set.\n\n[back to top](#machine-learning)\n\n\n### Paramteric vs Nonparametric\n\n* A learning model that summarizes data with a set of parameters of fixed size (independent of the number of training examples) is called a parametric model.\n* A nonparametric model is one where the number of parameters is not determined prior to training. Nonparametric does not mean that they have NO parameters! On the contrary, nonparametric models (can) become more and more complex with an increasing amount of data.\n\n[back to top](#machine-learning)\n","# 此仓库已弃用，请查看最新的 [Nailing Machine Learning Concepts](https:\u002F\u002Fgithub.com\u002Fjayinai\u002Fnail-ml-concept)\n\n本仓库涵盖了如何准备机器学习面试的内容，主要以问答形式呈现。除了机器学习知识外，其他关键方面还包括：\n\n* [解释你的简历](#explain-your-resume)\n* [SQL](#sql)\n\n直接跳转到 [机器学习](#machine-learning)\n\n\n## 解释你的简历\n\n你的简历应详细列出你过去参与过的有趣机器学习项目，并且**量化**地展示你的贡献。可以参考以下对比：\n\n> 训练了一个机器学习系统\n\nvs.\n\n> 训练了一个深度视觉模型（SqueezeNet），其模型大小仅为传统卷积神经网络（如ResNet）的1\u002F30，训练时间缩短至1\u002F3，推理时间减少至1\u002F5，收敛速度提升2倍。\n\n显而易见，后者更能吸引面试官的注意，并更好地体现你的能力。\n\n在面试中，务必清晰地阐述自己所做的工作。建议在面试前花些时间仔细回顾自己的简历。\n\n\n## SQL\n\n尽管大多数机器学习岗位并不需要你成为SQL专家，但面试中仍可能会涉及一些SQL相关问题，因此提前复习一下会很有帮助。以下是一些不错的SQL学习资源：\n\n* [W3schools SQL](https:\u002F\u002Fwww.w3schools.com\u002Fsql\u002F)\n* [SQLZOO](http:\u002F\u002Fsqlzoo.net\u002F)\n\n\n## 
机器学习\n\n首先，建议复习《深度学习》一书中的第5章（[链接](http:\u002F\u002Fwww.deeplearningbook.org\u002Fcontents\u002Fml.html)），该章节涵盖了机器学习的基础知识。\n\n\n* [线性回归](#linear-regression)\n* [逻辑回归](#logistic-regression)\n* [K近邻](#knn)\n* [支持向量机](#svm)\n* [朴素贝叶斯]\n* [决策树](#decision-tree)\n* [自助法集成](#bagging)\n* [随机森林](#random-forest)\n* [提升方法](#boosting)\n* [堆叠方法](#stacking)\n* [聚类]\n* [多层感知器](#mlp)\n* [卷积神经网络](#cnn)\n* [循环神经网络与LSTM](#rnn-and-lstm)\n* [word2vec](#word2vec)\n* [生成式与判别式](#generative-vs-discriminative)\n* [参数化与非参数化](#paramteric-vs-nonparametric)\n\n\n\n### 线性回归\n\n* 参数学习方式：最小化损失函数\n* 最小化损失函数的方法：梯度下降法\n* 正则化：\n    - L1（套索）：可将某些系数缩减至零，从而实现特征选择\n    - L2（岭回归）：按相同比例缩减所有系数；通常性能优于L1\n    - 组合正则化（弹性网络）：\n* 假设特征与标签之间存在线性关系\n* 可通过添加多项式特征和交互特征来引入非线性\n\n![lr](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fjayinai_ml-interview_readme_06075b946b95.png)\n\n[返回顶部](#machine-learning)\n\n\n### 逻辑回归\n\n* 用于分类问题的广义线性模型（GLM）\n* 在线性模型的输出上应用sigmoid函数，将目标值压缩到[0, 1]区间\n* 通过阈值进行预测：若输出大于0.5，则预测为1；否则预测为0\n* 是softmax函数的一种特殊情况，适用于多分类问题\n\n[返回顶部](#machine-learning)\n\n### K近邻\n\n给定一个数据点，我们使用某种距离度量（如欧氏距离）计算其K个最近的数据点（邻居）。对于分类任务，取邻居中多数的标签；对于回归任务，取标签值的平均值。\n\n需要注意的是，K近邻算法本质上不需要训练模型，只需在推理时进行计算。然而，由于每个测试样本都需要与所有训练样本比较以确定距离，这可能导致较高的计算开销。为了提高推理效率，可以通过将训练数据划分为若干区域来采用近似方法。\n\n当K取1或其他较小值时，模型容易过拟合（高方差）；而当K等于数据点总数或其他较大值时，模型则容易欠拟合（高偏差）。\n\n![KNN](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fjayinai_ml-interview_readme_0c04efa0c727.png)\n\n[返回顶部](#machine-learning)\n\n\n### 支持向量机\n\n* 可用于线性、非线性分类或异常检测（无监督学习）\n* 大间隔分类器：不仅有决策边界，还希望该边界尽可能远离最近的训练样本\n* 距离决策边界最近的训练样本称为支持向量，因为它们是决定决策边界的关键点\n* SVM对特征缩放较为敏感\n\n![svm](https:\u002F\u002Fqph.ec.quoracdn.net\u002Fmain-qimg-675fedee717331e478ecfcc40e2e4d38)\n\n\n[返回顶部](#machine-learning)\n\n\n### 决策树\n\n* 非参数化、监督学习算法\n* 根据训练数据，决策树算法会将特征空间划分为多个区域。在推理时，首先判断测试样本落入哪个区域，然后根据该区域内的标签均值（回归）或多数标签（分类）做出预测。\n* **构建过程**：自顶向下，选择某个变量进行划分，使得每个区域内的目标变量尽可能同质化。常用的指标包括基尼不纯度和信息增益，但在实际应用中差异不大。\n* 优点：易于理解和解释，贴近人类的决策过程\n* 缺点：\n    - 如果不限制树的深度，容易过拟合（泛化能力差）\n    - 不够稳健：训练数据的微小变化可能导致完全不同的树结构\n    - 
不稳定性：由于其正交的决策边界，对训练集的旋转较为敏感\n\n![决策树](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fjayinai_ml-interview_readme_c557af4b34c5.gif)\n\n[返回顶部](#machine-learning)\n\n\n### 自助法集成\n\n为了解决过拟合问题，可以使用一种名为自助法集成（Bagging）的集成方法，它能够降低元学习算法的方差。自助法集成可以应用于决策树或其他算法。\n\n这里有一张很好的示意图展示了单个估计器与自助法集成的区别：[链接](http:\u002F\u002Fscikit-learn.org\u002Fstable\u002Fauto_examples\u002Fensemble\u002Fplot_bias_variance.html#sphx-glr-auto-examples-ensemble-plot-bias-variance-py)\n\n![bagging](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fjayinai_ml-interview_readme_c27baac76629.png)\n\n* 自助法集成是在**有放回**抽样的情况下进行的。如果**无放回**抽样，则称为“粘贴”方法。\n* 自助法集成之所以流行，不仅因为它能提升性能，还因为各个子模型可以并行训练，具有良好的可扩展性。\n* 集成方法的效果最佳时，要求各个子模型彼此尽可能独立。\n* 投票机制：软投票（预测概率并取所有子模型的平均值）通常比硬投票效果更好。\n* 未被选入自助法集成的样本（约占37%）可以作为验证集使用。\n\n\n\n[返回顶部](#machine-learning)\n\n### 随机森林\n\n随机森林通过引入随机性进一步改进了自助法集成。在随机森林中，每次构建树时只随机选择一部分特征（而通常不会对样本进行子采样）。这样做的好处是使随机森林中的树之间**去相关化**。\n\n例如，假设我们有一个数据集，其中有一个非常具有预测性的特征，以及几个中等预测性的特征。在自助法集成中，大多数树会在根节点处使用这个非常具有预测性的特征进行分裂，因此这些树看起来会非常相似，**并且高度相关**。而对大量高度相关的结果取平均，并不能像对不相关的结果那样显著降低方差。\n\n在随机森林中，每次分裂时只考虑部分特征，从而通过引入更多不相关的树来进一步降低方差。\n\n我编写了一个[笔记本](notebooks\u002Fbag-rf-var.ipynb)来说明这一点。\n\n在实践中，调参随机森林时需要设置较大的树数量（越多越好，但也要考虑计算资源的限制）。此外，还需要调整`min_samples_leaf`参数（即叶节点上最少的样本数），以控制树的大小和防止过拟合。始终使用交叉验证来优化这些超参数。\n\n**特征重要性**\n\n在决策树中，重要的特征往往出现在靠近树根的位置。对于随机森林，我们可以通过计算某个特征在森林中所有树上的平均出现深度来衡量其重要性。\n\n\n[返回顶部](#machine-learning)\n\n\n### 提升法\n\n**工作原理**\n\n提升法基于弱学习器，采用迭代的方式构建模型。在每一轮迭代中，都会添加一个新的弱学习器，而现有的所有弱学习器保持不变。每个弱学习器会根据其表现（如准确率）被赋予一定的权重。当新弱学习器加入后，数据会被重新加权：那些被错误分类的样本权重会增加，而正确分类的样本权重会减少。这样一来，后续的弱学习器会更加关注之前弱学习器分类错误的样本。\n\n**与随机森林的区别**\n\n* 随机森林是**并行**地构建树，而提升法是**顺序**进行的。\n* 随机森林主要通过降低方差来提升性能，而提升法则是通过减少偏差来降低误差。\n\n\n**XGBoost（极端梯度提升）**\n\n\n> XGBoost 使用了更正则化的模型形式来控制过拟合，因此性能更好。\n\n[返回顶部](#machine-learning)\n\n\n### 堆叠泛化\n\n* 不再使用简单的聚合方法（如硬投票）来组合各个基模型的预测结果，而是训练一个模型来进行这种聚合。\n* 首先将训练集分为两部分：第一部分用于训练第一层的基模型。\n* 然后用第一层的基模型对第二部分数据进行预测，得到元特征，并利用这些元特征训练第二层的模型，以确定不同基模型的权重。\n* 
第二层可以训练多个模型，但这需要将原始数据集进一步划分为三份。\n\n![堆叠泛化](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fjayinai_ml-interview_readme_5dad471261b9.jpg)\n\n[返回顶部](#machine-learning)\n\n\n### 多层感知机（MLP）\n\n多层感知机是一种前馈神经网络，包含多个隐藏层。每一层可以有多个神经元，下一层的每个神经元都是由上一层所有神经元的线性或非线性组合构成的。为了训练网络，我们需要逐层反向传播误差。理论上，多层感知机可以逼近任意函数。\n\n![MLP](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fjayinai_ml-interview_readme_88a70583fb61.jpg)\n\n[返回顶部](#machine-learning)\n\n### 卷积神经网络（CNN）\n\n卷积层是卷积神经网络的基本构建模块。卷积层由一组可学习的滤波器组成，例如 5×5×3 的形状（宽度×高度×深度）。在前向传播过程中，我们会将滤波器在整个输入上滑动（更准确地说，是进行卷积运算），并计算点积。网络通过逐层反向传播误差来完成学习过程。\n\n早期的卷积层主要捕捉低级特征，如角度和边缘；而随着网络层次的加深，它们会学习到更高级的特征，比如形状和物体的组成部分。\n\n![CNN](http:\u002F\u002Fwww.kdnuggets.com\u002Fwp-content\u002Fuploads\u002Fdnn-layers.jpg)\n\n[返回顶部](#machine-learning)\n\n### 循环神经网络（RNN）和长短期记忆网络（LSTM）\n\n循环神经网络是另一种神经网络范式，它由多层细胞组成，每一层的细胞不仅接收来自上一层的输入，还会接收本层中前一个时刻的输出。这种结构使得 RNN 能够对序列数据建模。\n\n![RNN](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fjayinai_ml-interview_readme_2a37bd4e9b12.jpeg)\n\n这听起来很棒，但在实际应用中，由于梯度爆炸或梯度消失问题，RNN 往往难以有效工作。这是因为在多次矩阵相乘的过程中，梯度可能会变得非常大或非常小。为了解决这个问题，我们可以使用 RNN 的一种变体——长短期记忆网络（LSTM），它能够学习长期依赖关系。\n\nLSTM 的数学原理可能比较复杂，但从直观上看，LSTM 引入了以下机制：\n- 输入门\n- 输出门\n- 忘记门\n- 记忆单元（内部状态）\n\nLSTM 类似于人类的记忆：它会忘记旧的信息（通过忘记门作用于旧的内部状态），同时从新的输入中学习（通过输入门获取信息）。\n\n![LSTM](http:\u002F\u002Fdeeplearning.net\u002Ftutorial\u002F_images\u002Flstm_memorycell.png)\n\n[返回顶部](#machine-learning)\n\n\n### word2vec\n\n* word2vec 是一种浅层的两层神经网络，经过训练后可以构建词语的语言学上下文。\n* 它以大规模语料库作为输入，生成一个通常是几百维的向量空间，语料库中的每个词都会在这个空间中被映射为一个向量。\n* 其核心思想在于“上下文”：经常出现在相同上下文中的词语应该具有相同或相反的意义。\n* 主要有两种模式：\n    - 连续词袋模型（CBOW）：根据周围的若干个上下文词来预测当前词。\n    - 跳字模型（Skip-Gram）：根据当前词来预测周围的上下文词。\n\n![word2vec](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fjayinai_ml-interview_readme_2a021c45dcf2.png)\n\n[返回顶部](#machine-learning)\n\n### 生成模型与判别模型\n\n* 判别算法建模的是 *p(y|x; w)*，即在给定数据集和学习到的参数的情况下，y 属于某一特定类别的概率。判别算法并不关心数据是如何生成的，它只是简单地对给定的样本进行分类。\n* 生成算法则试图建模 *p(x|y)*，即在已知样本属于某一类别时，特征的分布情况。生成算法是对数据生成过程的建模。\n\n> 
给定一个训练集，像逻辑回归或感知机这样的算法（基本上）会尝试找到一条直线——也就是决策边界——来将大象和狗分开。然后，为了将一个新的动物分类为大象或狗，它会检查该动物落在决策边界的哪一侧，并据此做出预测。\n\n> 下面是另一种方法。首先，通过观察大象，我们可以建立一个关于大象外观的模型；接着，通过观察狗，我们再建立一个关于狗外观的独立模型。最后，为了对一个新的动物进行分类，我们可以将这个新动物与大象模型进行匹配，同时也与狗模型进行匹配，从而判断这个新动物更像我们在训练集中见过的大象，还是更像狗。\n\n[返回顶部](#machine-learning)\n\n\n### 参数模型与非参数模型\n\n* 使用一组大小固定（与训练样本数量无关）的参数来总结数据的学习模型称为参数模型。\n* 在非参数模型中，参数的数量并不是在训练之前就确定的。需要注意的是，“非参数”并不意味着它们完全没有参数！相反，随着数据量的增加，非参数模型可能会变得越来越复杂。\n\n[返回顶部](#machine-learning)","# ml-interview 快速上手指南\n\n> **重要提示**：本仓库（`ml-interview`）已废弃。作者建议转向最新的项目 **[Nailing Machine Learning Concepts](https:\u002F\u002Fgithub.com\u002Fjayinai\u002Fnail-ml-concept)** 获取更新的机器学习面试准备内容。\n\n以下内容基于原仓库核心知识整理，旨在帮助开发者快速复习机器学习面试核心概念。本项目主要为**问答形式的知识库**，无需复杂的环境安装或编译过程。\n\n## 环境准备\n\n由于本项目本质是包含面试题解的文档集合（Markdown\u002FNotebook），对环境要求极低。\n\n*   **系统要求**：Windows \u002F macOS \u002F Linux 均可。\n*   **前置依赖**：\n    *   **Git**：用于克隆代码库。\n    *   **浏览器**：推荐 Chrome 或 Edge，用于直接阅读 GitHub 渲染后的 Markdown 文档。\n    *   **可选 - Jupyter Notebook**：如果需要运行仓库中提供的演示代码（如 `notebooks\u002Fbag-rf-var.ipynb`），建议安装 Python 环境及常用数据科学库。\n    \n    若需运行 Notebook 示例，推荐使用国内镜像源安装基础环境：\n    ```bash\n    # 使用清华源安装 Anaconda 或 Miniconda (如尚未安装)\n    # 随后创建环境并安装基础库\n    conda create -n ml-interview python=3.8\n    conda activate ml-interview\n    pip install -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple numpy pandas scikit-learn matplotlib jupyter\n    ```\n\n## 安装步骤\n\n本项目无需通过包管理器安装，直接克隆仓库即可使用。\n\n1.  **克隆仓库**\n    打开终端，执行以下命令：\n    ```bash\n    git clone https:\u002F\u002Fgithub.com\u002Fjayinai\u002Fml-interview.git\n    ```\n    *(注：鉴于仓库已废弃，若链接失效或需最新内容，请克隆新项目：`git clone https:\u002F\u002Fgithub.com\u002Fjayinai\u002Fnail-ml-concept.git`)*\n\n2.  **进入目录**\n    ```bash\n    cd ml-interview\n    ```\n\n3.  
**查看内容**\n    *   **在线阅读**：直接在 GitHub 网页上浏览 `README.md` 及各章节链接。\n    *   **本地阅读**：使用 VS Code、Typora 或任何 Markdown 编辑器打开 `README.md`。\n    *   **运行示例**：如有 `.ipynb` 文件，启动 Jupyter：\n        ```bash\n        jupyter notebook notebooks\u002Fbag-rf-var.ipynb\n        ```\n\n## 基本使用\n\n本工具的核心用法是**按需查阅**和**概念复习**。以下是针对面试准备的最简使用流程：\n\n### 1. 简历优化自查\n在面试前，对照 `Explain your resume` 章节检查你的项目描述。\n*   **原则**：必须量化贡献。\n*   **示例对比**：\n    *   ❌ 弱描述：`Trained a machine learning system`\n    *   ✅ 强描述：`Trained a deep vision system (SqueezeNet) that has 1\u002F30 model size, 1\u002F3 training time, 1\u002F5 inference time, and 2x faster convergence compared with traditional ConvNet (e.g., ResNet)`\n\n### 2. 核心算法复习\n根据面试岗位需求，跳转到对应章节复习核心概念与考点。\n\n*   **基础模型**：\n    *   **线性回归 (Linear Regression)**：复习成本函数最小化、梯度下降、L1\u002FL2 正则化区别。\n    *   **逻辑回归 (Logistic Regression)**：理解 Sigmoid 函数、阈值判定及与 Softmax 的关系。\n    *   **KNN**：掌握距离度量、K 值选择对偏差 - 方差的影响。\n    *   **SVM**：理解最大间隔分类器、支持向量及特征缩放敏感性。\n\n*   **集成学习**：\n    *   **Bagging & Random Forest**：理解并行训练、有放回采样、特征随机选择如何降低方差。\n    *   **Boosting (XGBoost)**：理解串行训练、样本权重调整如何降低偏差。\n    *   **Stacking**：掌握利用元模型（Meta-model）聚合预测值的策略。\n\n*   **深度学习**：\n    *   **MLP\u002FCNN\u002FRNN\u002FLSTM**：复习网络结构、反向传播、卷积操作及门控机制（Input\u002FForget\u002FOutput gates）。\n\n### 3. SQL 技能刷新\n虽然 ML 岗位不要求成为 SQL 专家，但需掌握基础查询。\n*   推荐资源：[W3schools SQL](https:\u002F\u002Fwww.w3schools.com\u002Fsql\u002F) 或 [SQLZOO](http:\u002F\u002Fsqlzoo.net\u002F)\n\n### 4. 
理论深度进阶\n对于高级岗位，重点复习以下理论辨析：\n*   Generative vs Discriminative models (生成式与判别式)\n*   Parametric vs Nonparametric models (参数与非参数方法)\n*   Bias-Variance Tradeoff (偏差 - 方差权衡)\n\n---\n*提示：面试中请结合具体项目经验，清晰阐述上述算法的原理、优缺点及应用场景。*","一名拥有两年经验的算法工程师正在备战大厂机器学习岗位面试，急需系统梳理知识盲区并优化项目表述。\n\n### 没有 ml-interview 时\n- 简历描述空洞，仅写“训练过机器学习系统”，无法用量化数据（如模型大小、推理时间）体现技术深度，难以吸引面试官注意。\n- 面对线性回归正则化（L1\u002FL2）、KNN 过拟合边界等基础概念，记忆模糊，缺乏结构化的问答素材来应对突发提问。\n- 忽视 SQL 技能复习，误以为算法岗不考数据库，结果在笔试环节因生疏而丢分。\n- 复习范围零散，盲目翻阅厚重的深度学习教材，难以快速定位面试高频考点如生成式与判别式模型的区别。\n\n### 使用 ml-interview 后\n- 参考简历优化指南，将项目经历重写为“训练 SqueezeNet 视觉系统，模型体积缩小 30 倍，推理速度提升 5 倍”，用具体指标瞬间抓住面试官眼球。\n- 利用涵盖逻辑回归、SVM 到 LSTM 的精选问答库，快速重温梯度下降、参数与非参数模型等核心机制，回答条理清晰且专业。\n- 通过内置的 SQL 资源链接针对性刷题，迅速找回状态，从容应对面试中穿插的数据查询考题。\n- 直接聚焦目录中的高频考点（如 Bagging 与 Boosting 差异），跳过冗余理论，实现高效精准的考前突击。\n\nml-interview 通过将分散的知识点转化为结构化的实战问答与量化表达策略，帮助求职者从“泛泛而谈”转型为“数据驱动”的专业候选人。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fjayinai_ml-interview_06075b94.png","jayinai",null,"https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002Fjayinai_7fb89f6d.png","Tech Lead, Machine Learning Engineer","https:\u002F\u002Fgithub.com\u002Fjayinai",[81],{"name":82,"color":83,"percentage":84},"Jupyter Notebook","#DA5B0B",100,903,214,"2026-03-06T22:56:19",1,"","未说明",{"notes":92,"python":90,"dependencies":93},"该仓库已弃用（deprecated），作者建议转向新的项目 'Nailing Machine Learning Concepts'。当前内容主要为机器学习面试的知识问答整理（涵盖简历、SQL 及各类算法概念），并非可执行的软件工具，因此无需特定的运行环境、GPU、内存或依赖库。",[],[13],[96,97],"machine-learning","interview-preparation","2026-03-27T02:49:30.150509","2026-04-06T09:44:29.284847",[],[]]