[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-tirthajyoti--Machine-Learning-with-Python":3,"tool-tirthajyoti--Machine-Learning-with-Python":61},[4,18,26,36,44,53],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":17},4358,"openclaw","openclaw\u002Fopenclaw","OpenClaw 是一款专为个人打造的本地化 AI 助手，旨在让你在自己的设备上拥有完全可控的智能伙伴。它打破了传统 AI 助手局限于特定网页或应用的束缚，能够直接接入你日常使用的各类通讯渠道，包括微信、WhatsApp、Telegram、Discord、iMessage 等数十种平台。无论你在哪个聊天软件中发送消息，OpenClaw 都能即时响应，甚至支持在 macOS、iOS 和 Android 设备上进行语音交互，并提供实时的画布渲染功能供你操控。\n\n这款工具主要解决了用户对数据隐私、响应速度以及“始终在线”体验的需求。通过将 AI 部署在本地，用户无需依赖云端服务即可享受快速、私密的智能辅助，真正实现了“你的数据，你做主”。其独特的技术亮点在于强大的网关架构，将控制平面与核心助手分离，确保跨平台通信的流畅性与扩展性。\n\nOpenClaw 非常适合希望构建个性化工作流的技术爱好者、开发者，以及注重隐私保护且不愿被单一生态绑定的普通用户。只要具备基础的终端操作能力（支持 macOS、Linux 及 Windows WSL2），即可通过简单的命令行引导完成部署。如果你渴望拥有一个懂你",349277,3,"2026-04-06T06:32:30",[13,14,15,16],"Agent","开发框架","图像","数据工具","ready",{"id":19,"name":20,"github_repo":21,"description_zh":22,"stars":23,"difficulty_score":10,"last_commit_at":24,"category_tags":25,"status":17},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,"2026-04-05T11:01:52",[14,15,13],{"id":27,"name":28,"github_repo":29,"description_zh":30,"stars":31,"difficulty_score":32,"last_commit_at":33,"category_tags":34,"status":17},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 
代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",151314,2,"2026-04-11T23:32:58",[14,13,35],"语言模型",{"id":37,"name":38,"github_repo":39,"description_zh":40,"stars":41,"difficulty_score":32,"last_commit_at":42,"category_tags":43,"status":17},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",108322,"2026-04-10T11:39:34",[14,15,13],{"id":45,"name":46,"github_repo":47,"description_zh":48,"stars":49,"difficulty_score":32,"last_commit_at":50,"category_tags":51,"status":17},6121,"gemini-cli","google-gemini\u002Fgemini-cli","gemini-cli 是一款由谷歌推出的开源 AI 命令行工具，它将强大的 Gemini 大模型能力直接集成到用户的终端环境中。对于习惯在命令行工作的开发者而言，它提供了一条从输入提示词到获取模型响应的最短路径，无需切换窗口即可享受智能辅助。\n\n这款工具主要解决了开发过程中频繁上下文切换的痛点，让用户能在熟悉的终端界面内直接完成代码理解、生成、调试以及自动化运维任务。无论是查询大型代码库、根据草图生成应用，还是执行复杂的 Git 操作，gemini-cli 都能通过自然语言指令高效处理。\n\n它特别适合广大软件工程师、DevOps 人员及技术研究人员使用。其核心亮点包括支持高达 100 万 token 的超长上下文窗口，具备出色的逻辑推理能力；内置 Google 搜索、文件操作及 Shell 命令执行等实用工具；更独特的是，它支持 MCP（模型上下文协议），允许用户灵活扩展自定义集成，连接如图像生成等外部能力。此外，个人谷歌账号即可享受免费的额度支持，且项目基于 Apache 2.0 
协议完全开源，是提升终端工作效率的理想助手。",100752,"2026-04-10T01:20:03",[52,13,15,14],"插件",{"id":54,"name":55,"github_repo":56,"description_zh":57,"stars":58,"difficulty_score":32,"last_commit_at":59,"category_tags":60,"status":17},4721,"markitdown","microsoft\u002Fmarkitdown","MarkItDown 是一款由微软 AutoGen 团队打造的轻量级 Python 工具，专为将各类文件高效转换为 Markdown 格式而设计。它支持 PDF、Word、Excel、PPT、图片（含 OCR）、音频（含语音转录）、HTML 乃至 YouTube 链接等多种格式的解析，能够精准提取文档中的标题、列表、表格和链接等关键结构信息。\n\n在人工智能应用日益普及的今天，大语言模型（LLM）虽擅长处理文本，却难以直接读取复杂的二进制办公文档。MarkItDown 恰好解决了这一痛点，它将非结构化或半结构化的文件转化为模型“原生理解”且 Token 效率极高的 Markdown 格式，成为连接本地文件与 AI 分析 pipeline 的理想桥梁。此外，它还提供了 MCP（模型上下文协议）服务器，可无缝集成到 Claude Desktop 等 LLM 应用中。\n\n这款工具特别适合开发者、数据科学家及 AI 研究人员使用，尤其是那些需要构建文档检索增强生成（RAG）系统、进行批量文本分析或希望让 AI 助手直接“阅读”本地文件的用户。虽然生成的内容也具备一定可读性，但其核心优势在于为机器",93400,"2026-04-06T19:52:38",[52,14],{"id":62,"github_repo":63,"name":64,"description_en":65,"description_zh":66,"ai_summary_zh":66,"readme_en":67,"readme_zh":68,"quickstart_zh":69,"use_case_zh":70,"hero_image_url":71,"owner_login":72,"owner_name":73,"owner_avatar_url":74,"owner_bio":75,"owner_company":76,"owner_location":77,"owner_email":78,"owner_twitter":79,"owner_website":80,"owner_url":81,"languages":82,"stars":98,"forks":99,"last_commit_at":100,"license":101,"difficulty_score":32,"env_os":102,"env_gpu":103,"env_ram":103,"env_deps":104,"category_tags":118,"github_topics":120,"view_count":32,"oss_zip_url":136,"oss_zip_packed_at":136,"status":17,"created_at":137,"updated_at":138,"faqs":139,"releases":160},6743,"tirthajyoti\u002FMachine-Learning-with-Python","Machine-Learning-with-Python","Practice and tutorial-style notebooks  covering wide variety of machine learning techniques","Machine-Learning-with-Python 是一个由 Dr. 
Tirthajyoti Sarkar 精心维护的开源学习资源库，旨在通过一系列实战导向的 Jupyter Notebook 教程，帮助用户系统掌握机器学习技术。它主要解决了初学者和进阶者在理论学习与代码实践之间难以衔接的痛点，提供了从数据清洗、可视化到模型构建的全流程可运行代码示例。\n\n这套资源非常适合 Python 开发者、数据科学学生以及希望转行进入 AI 领域的研究人员使用。无论是需要快速复习 NumPy 和 Pandas 基础操作，还是想深入理解 Scikit-learn、TensorFlow 和 Keras 等框架的高级应用，用户都能在这里找到对应的详细指南。其独特亮点在于不仅涵盖了常规的算法实现，还包含了如 PDF 表格读取、多源数据加载等实际工程中常见但教程稀缺的实用技巧，并对比了纯 Python 与 NumPy 的性能差异，帮助开发者写出更高效的代码。配合丰富的依赖库清单和相关论文索引，Machine-Learning-with-Python 堪称一座连接理论知识与工业级应用的坚实桥梁。","[![License](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FLicense-BSD%202--Clause-orange.svg)](https:\u002F\u002Fopensource.org\u002Flicenses\u002FBSD-2-Clause)\n[![GitHub forks](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fforks\u002Ftirthajyoti\u002FMachine-Learning-with-Python.svg)](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fnetwork)\n[![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Ftirthajyoti\u002FMachine-Learning-with-Python.svg)](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fstargazers)\n[![PRs Welcome](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FPRs-welcome-brightgreen.svg)](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fpulls)\n\n# Python Machine Learning Jupyter Notebooks ([ML website](https:\u002F\u002Fmachine-learning-with-python.readthedocs.io\u002Fen\u002Flatest\u002F))\n\n### Dr. 
Tirthajyoti Sarkar, Fremont, California ([Please feel free to connect on LinkedIn here](https:\u002F\u002Fwww.linkedin.com\u002Fin\u002Ftirthajyoti-sarkar-2127aa7))\n\n![ml-ds](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Ftirthajyoti_Machine-Learning-with-Python_readme_4dee2129855e.png)\n\n---\n\n## Also check out these super-useful Repos that I curated\n\n- ### [Highly cited and useful papers related to machine learning, deep learning, AI, game theory, reinforcement learning](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FPapers-Literature-ML-DL-RL-AI)\n\n- ### [Carefully curated resource links for data science in one place](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FData-science-best-resources)\n\n## Requirements\n* **Python 3.6+**\n* **NumPy (`pip install numpy`)**\n* **Pandas (`pip install pandas`)**\n* **Scikit-learn (`pip install scikit-learn`)**\n* **SciPy (`pip install scipy`)**\n* **Statsmodels (`pip install statsmodels`)**\n* **Matplotlib (`pip install matplotlib`)**\n* **Seaborn (`pip install seaborn`)**\n* **Sympy (`pip install sympy`)**\n* **Flask (`pip install flask`)**\n* **WTForms (`pip install wtforms`)**\n* **Tensorflow (`pip install \"tensorflow>=1.15\"`)**\n* **Keras (`pip install keras`)**\n* **pdpipe (`pip install pdpipe`)**\n\n---\n\nYou can start with this article that I wrote in Heartbeat magazine (on Medium platform): \n### [\"Some Essential Hacks and Tricks for Machine Learning with Python\"](https:\u002F\u002Fheartbeat.fritz.ai\u002Fsome-essential-hacks-and-tricks-for-machine-learning-with-python-5478bc6593f2)\n\u003Cimg src=\"https:\u002F\u002Fcookieegroup.com\u002Fwp-content\u002Fuploads\u002F2018\u002F10\u002F2-1.png\" width=\"450\" height=\"300\"\u002F>\n\n## Essential tutorial-type notebooks on Pandas and Numpy\nJupyter notebooks covering a wide range of functions and operations on the topics of NumPy, Pandas, Seaborn, Matplotlib, etc.\n\n* [Detailed Numpy 
operations](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FPandas%20and%20Numpy\u002FNumpy_operations.ipynb)\n* [Detailed Pandas operations](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FPandas%20and%20Numpy\u002FPandas_Operations.ipynb)\n* [Numpy and Pandas quick basics](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FPandas%20and%20Numpy\u002FNumpy_Pandas_Quick.ipynb)\n* [Matplotlib and Seaborn quick basics](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FPandas%20and%20Numpy\u002FMatplotlib_Seaborn_basics.ipynb)\n* [Advanced Pandas operations](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FPandas%20and%20Numpy\u002FAdvanced%20Pandas%20Operations.ipynb)\n* [How to read various data sources](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FPandas%20and%20Numpy\u002FRead_data_various_sources\u002FHow%20to%20read%20various%20sources%20in%20a%20DataFrame.ipynb)\n* [PDF reading and table processing demo](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FPandas%20and%20Numpy\u002FRead_data_various_sources\u002FPDF%20table%20reading%20and%20processing%20demo.ipynb)\n* [How fast are Numpy operations compared to pure Python code?](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FPandas%20and%20Numpy\u002FHow%20fast%20are%20NumPy%20ops.ipynb) (Read my [article](https:\u002F\u002Ftowardsdatascience.com\u002Fwhy-you-should-forget-for-loop-for-data-science-code-and-embrace-vectorization-696632622d5f) on Medium related to this topic)\n* [Fast reading from Numpy using .npy file 
format](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FPandas%20and%20Numpy\u002FNumpy_Reading.ipynb) (Read my [article](https:\u002F\u002Ftowardsdatascience.com\u002Fwhy-you-should-start-using-npy-file-more-often-df2a13cc0161) on Medium on this topic)\n\n## Tutorial-type notebooks covering regression, classification, clustering, dimensionality reduction, and some basic neural network algorithms\n\n### Regression\n* Simple linear regression with t-statistic generation\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Ftirthajyoti_Machine-Learning-with-Python_readme_9f7836e2632d.jpg\" width=\"400\" height=\"300\"\u002F>\n\n* [Multiple ways to perform linear regression in Python and their speed comparison](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FRegression\u002FLinear_Regression_Methods.ipynb) ([check the article I wrote on freeCodeCamp](https:\u002F\u002Fmedium.freecodecamp.org\u002Fdata-science-with-python-8-ways-to-do-linear-regression-and-measure-their-speed-b5577d75f8b))\n\n* [Multi-variate regression with regularization](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FRegression\u002FMulti-variate%20LASSO%20regression%20with%20CV.ipynb)\n\u003Cimg src=\"https:\u002F\u002Fupload.wikimedia.org\u002Fwikipedia\u002Fcommons\u002Fthumb\u002Ff\u002Ff8\u002FL1_and_L2_balls.svg\u002F300px-L1_and_L2_balls.svg.png\"\u002F>\n\n* Polynomial regression using ***scikit-learn pipeline feature*** ([check the article I wrote on *Towards Data Science*](https:\u002F\u002Ftowardsdatascience.com\u002Fmachine-learning-with-python-easy-and-robust-method-to-fit-nonlinear-data-19e8a1ddbd49))\n\n* [Decision trees and Random Forest 
regression](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FRegression\u002FRandom_Forest_Regression.ipynb) (showing how the Random Forest works as a robust\u002Fregularized meta-estimator rejecting overfitting)\n\n* [Detailed visual analytics and goodness-of-fit diagnostic tests for a linear regression problem](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FRegression\u002FRegression_Diagnostics.ipynb)\n\n* [Robust linear regression using `HuberRegressor` from Scikit-learn](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FRegression\u002FRobust%20Linear%20Regression.ipynb)\n\n-----\n\n### Classification\n* Logistic regression\u002Fclassification ([Here is the Notebook](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FClassification\u002FLogistic_Regression_Classification.ipynb))\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Ftirthajyoti_Machine-Learning-with-Python_readme_c17398f5a73e.png\"\u002F>\n\n* _k_-nearest neighbor classification ([Here is the Notebook](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FClassification\u002FKNN_Classification.ipynb))\n\n* Decision trees and Random Forest Classification ([Here is the Notebook](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FClassification\u002FDecisionTrees_RandomForest_Classification.ipynb))\n\n* Support vector machine classification ([Here is the Notebook](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FClassification\u002FSupport_Vector_Machine_Classification.ipynb)) (**[check the article I wrote in Towards Data Science on SVM and sorting 
algorithm](https:\u002F\u002Ftowardsdatascience.com\u002Fhow-the-good-old-sorting-algorithm-helps-a-great-machine-learning-technique-9e744020254b))**\n\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Ftirthajyoti_Machine-Learning-with-Python_readme_65a1b91ed41c.png\"\u002F>\n\n* Naive Bayes classification ([Here is the Notebook](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FClassification\u002FNaive_Bayes_Classification.ipynb))\n\n---\n\n### Clustering\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Ftirthajyoti_Machine-Learning-with-Python_readme_c9de74278e8d.jpg\" width=\"450\" height=\"300\"\u002F>\n\n* _K_-means clustering ([Here is the Notebook](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FClustering-Dimensionality-Reduction\u002FK_Means_Clustering_Practice.ipynb))\n\n* Affinity propagation (showing its time complexity and the effect of the damping factor) ([Here is the Notebook](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FClustering-Dimensionality-Reduction\u002FAffinity_Propagation.ipynb))\n\n* Mean-shift technique (showing its time complexity and the effect of noise on cluster discovery) ([Here is the Notebook](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FClustering-Dimensionality-Reduction\u002FMean_Shift_Clustering.ipynb))\n\n* DBSCAN (showing how it can generically detect areas of high density irrespective of cluster shapes, which k-means fails to do) ([Here is the Notebook](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FClustering-Dimensionality-Reduction\u002FDBScan_Clustering.ipynb))\n\n* Hierarchical clustering with dendrograms showing how to choose the optimal number of clusters ([Here is the 
Notebook](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FClustering-Dimensionality-Reduction\u002FHierarchical_Clustering.ipynb))\n\n\u003Cimg src=\"https:\u002F\u002Fwww.researchgate.net\u002Fprofile\u002FCarsten_Walther\u002Fpublication\u002F273456906\u002Ffigure\u002Ffig3\u002FAS:294866065084419@1447312956501\u002FExample-of-hierarchical-clustering-clusters-are-consecutively-merged-with-the-most.png\" width=\"700\" height=\"400\"\u002F>\n\n---\n\n### Dimensionality reduction\n* Principal component analysis\n\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Ftirthajyoti_Machine-Learning-with-Python_readme_cceb1c27ece7.jpg\" width=\"450\" height=\"300\"\u002F>\n\n---\n\n### Deep Learning\u002FNeural Network\n* [Demo notebook to illustrate the superiority of deep neural network for complex nonlinear function approximation task](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FFunction%20Approximation%20by%20Neural%20Network\u002FPolynomial%20regression%20-%20linear%20and%20neural%20network.ipynb)\n* Step-by-step building of 1-hidden-layer and 2-hidden-layer dense network using basic TensorFlow methods\n\n---\n\n### Random data generation using symbolic expressions\n* How to use [Sympy package](https:\u002F\u002Fwww.sympy.org\u002Fen\u002Findex.html) to generate random datasets using symbolic mathematical expressions.\n\n* Here is my article on Medium on this topic: [Random regression and classification problem generation with symbolic expression](https:\u002F\u002Ftowardsdatascience.com\u002Frandom-regression-and-classification-problem-generation-with-symbolic-expression-a4e190e37b8d)\n\n---\n\n### Synthetic data generation techniques\n* [Notebooks here](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Ftree\u002Fmaster\u002FSynthetic_data_generation)\n\n### Simple deployment examples (serving 
ML models on web API)\n* [Serving a linear regression model through a simple HTTP server interface](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Ftree\u002Fmaster\u002FDeployment\u002FLinear_regression). User needs to request predictions by executing a Python script. Uses `Flask` and `Gunicorn`.\n\n* [Serving a recurrent neural network (RNN) through a HTTP webpage](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Ftree\u002Fmaster\u002FDeployment\u002Frnn_app), complete with a web form, where users can input parameters and click a button to generate text based on the pre-trained RNN model. Uses `Flask`, `Jinja`, `Keras`\u002F`TensorFlow`, `WTForms`.\n\n---\n\n### Object-oriented programming with machine learning\nImplementing some of the core OOP principles in a machine learning context by [building your own Scikit-learn-like estimator, and making it better](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FOOP_in_ML\u002FClass_MyLinearRegression.ipynb).\n\nSee my articles on Medium on this topic.\n\n* [Object-oriented programming for data scientists: Build your ML estimator](https:\u002F\u002Ftowardsdatascience.com\u002Fobject-oriented-programming-for-data-scientists-build-your-ml-estimator-7da416751f64)\n* [How a simple mix of object-oriented programming can sharpen your deep learning prototype](https:\u002F\u002Ftowardsdatascience.com\u002Fhow-a-simple-mix-of-object-oriented-programming-can-sharpen-your-deep-learning-prototype-19893bd969bd)\n\n---\n### Unit testing ML code with Pytest\nCheck the files and detailed instructions in the [Pytest](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Ftree\u002Fmaster\u002FPytest) directory to understand how one should write unit testing code\u002Fmodule for machine learning models\n\n---\n\n### Memory and timing profiling\n\nProfiling data science code and 
ML models for memory footprint and computing time is a critical but often overlooked area. Here are a couple of Notebooks showing the ideas:\n\n* [Memory profiling using Scalene](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Ftree\u002Fmaster\u002FMemory-profiling\u002FScalene)\n* [Time-profiling data science code](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FTime-profiling\u002FcProfile.ipynb)\n","[![许可证](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FLicense-BSD%202--Clause-orange.svg)](https:\u002F\u002Fopensource.org\u002Flicenses\u002FBSD-2-Clause)\n[![GitHub 复刻数](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fforks\u002Ftirthajyoti\u002FMachine-Learning-with-Python.svg)](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fnetwork)\n[![GitHub 星标数](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Ftirthajyoti\u002FMachine-Learning-with-Python.svg)](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fstargazers)\n[![欢迎提交 PR](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FPRs-welcome-brightgreen.svg)](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fpulls)\n\n# Python 机器学习 Jupyter 笔记本（[ML 网站](https:\u002F\u002Fmachine-learning-with-python.readthedocs.io\u002Fen\u002Flatest\u002F)）\n\n### 蒂尔塔焦蒂·萨卡尔博士，弗里蒙特，加利福尼亚州（[欢迎在 LinkedIn 上联系我](https:\u002F\u002Fwww.linkedin.com\u002Fin\u002Ftirthajyoti-sarkar-2127aa7)）\n\n![ml-ds](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Ftirthajyoti_Machine-Learning-with-Python_readme_4dee2129855e.png)\n\n---\n\n## 同时也请查看我整理的这些超级实用的仓库\n\n- ### [与机器学习、深度学习、人工智能、博弈论、强化学习相关的高引用且有用的论文](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FPapers-Literature-ML-DL-RL-AI)\n\n- ### [精心整理的数据科学资源链接合集](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FData-science-best-resources)\n\n## 需求\n* **Python 3.6+**\n* **NumPy (`pip 
install numpy`)**\n* **Pandas (`pip install pandas`)**\n* **Scikit-learn (`pip install scikit-learn`)**\n* **SciPy (`pip install scipy`)**\n* **Statsmodels (`pip install statsmodels`)**\n* **MatplotLib (`pip install matplotlib`)**\n* **Seaborn (`pip install seaborn`)**\n* **Sympy (`pip install sympy`)**\n* **Flask (`pip install flask`)**\n* **WTForms (`pip install wtforms`)**\n* **Tensorflow (`pip install tensorflow>=1.15`)**\n* **Keras (`pip install keras`)**\n* **pdpipe (`pip install pdpipe`)**\n\n---\n\n你可以从我在 Heartbeat 杂志（Medium 平台）上撰写的一篇文章开始：\n### [\"使用 Python 进行机器学习的一些必备技巧和窍门\"](https:\u002F\u002Fheartbeat.fritz.ai\u002Fsome-essential-hacks-and-tricks-for-machine-learning-with-python-5478bc6593f2)\n\u003Cimg src=\"https:\u002F\u002Fcookieegroup.com\u002Fwp-content\u002Fuploads\u002F2018\u002F10\u002F2-1.png\" width=\"450\" height=\"300\"\u002F>\n\n## Pandas 和 Numpy 的基础教程型笔记本\n涵盖 NumPy、Pandas、Seaborn、Matplotlib 等主题的广泛功能和操作的 Jupyter 笔记本。\n\n* [详细的 Numpy 操作](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FPandas%20and%20Numpy\u002FNumpy_operations.ipynb)\n* [详细的 Pandas 操作](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FPandas%20and%20Numpy\u002FPandas_Operations.ipynb)\n* [Numpy 和 Pandas 快速入门](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FPandas%20and%20Numpy\u002FNumpy_Pandas_Quick.ipynb)\n* [Matplotlib 和 Seaborn 快速入门](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FPandas%20and%20Numpy\u002FMatplotlib_Seaborn_basics.ipynb)\n* [高级 Pandas 操作](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FPandas%20and%20Numpy\u002FAdvanced%20Pandas%20Operations.ipynb)\n* 
[如何读取各种数据源](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FPandas%20and%20Numpy\u002FRead_data_various_sources\u002FHow%20to%20read%20various%20sources%20in%20a%20DataFrame.ipynb)\n* [PDF 读取与表格处理演示](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FPandas%20and%20Numpy\u002FRead_data_various_sources\u002FPDF%20table%20reading%20and%20processing%20demo.ipynb)\n* [Numpy 操作相比纯 Python 代码有多快？](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FPandas%20and%20Numpy\u002FHow%20fast%20are%20NumPy%20ops.ipynb)（阅读我在 Medium 上发表的关于此主题的文章：[为什么你应该忘记 for 循环，拥抱向量化编程](https:\u002F\u002Ftowardsdatascience.com\u002Fwhy-you-should-forget-for-loop-for-data-science-code-and-embrace-vectorization-696632622d5f)）\n* [使用 .npy 文件格式快速读取 Numpy 数据](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FPandas%20and%20Numpy\u002FNumpy_Reading.ipynb)（阅读我在 Medium 上发表的关于此主题的文章：[为什么你应该更频繁地使用 .npy 文件](https:\u002F\u002Ftowardsdatascience.com\u002Fwhy-you-should-start-using-npy-file-more-often-df2a13cc0161)）\n\n## 涵盖回归、分类、聚类、降维以及一些基础神经网络算法的教程型笔记本\n\n### 回归\n* 带 t 统计量生成的简单线性回归\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Ftirthajyoti_Machine-Learning-with-Python_readme_9f7836e2632d.jpg\" width=\"400\" height=\"300\"\u002F>\n\n* [在 Python 中执行线性回归的多种方法及其速度比较](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FRegression\u002FLinear_Regression_Methods.ipynb)（参阅我在 freeCodeCamp 上撰写的文章：[用 Python 进行数据科学：8 种线性回归方法及其速度对比](https:\u002F\u002Fmedium.freecodecamp.org\u002Fdata-science-with-python-8-ways-to-do-linear-regression-and-measure-their-speed-b5577d75f8b)）\n\n* 
[带有正则化的多元回归](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FRegression\u002FMulti-variate%20LASSO%20regression%20with%20CV.ipynb)\n\u003Cimg src=\"https:\u002F\u002Fupload.wikimedia.org\u002Fwikipedia\u002Fcommons\u002Fthumb\u002Ff\u002Ff8\u002FL1_and_L2_balls.svg\u002F300px-L1_and_L2_balls.svg.png\"\u002F>\n\n* 使用 ***scikit-learn 管道功能*** 进行多项式回归（参阅我在 *Towards Data Science* 上撰写的文章：[用 Python 进行机器学习：一种简单而稳健的非线性数据拟合方法](https:\u002F\u002Ftowardsdatascience.com\u002Fmachine-learning-with-python-easy-and-robust-method-to-fit-nonlinear-data-19e8a1ddbd49)）\n\n* [决策树和随机森林回归](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FRegression\u002FRandom_Forest_Regression.ipynb)（展示了随机森林作为稳健\u002F正则化的元估计器，能够有效避免过拟合）\n\n* [针对线性回归问题的详细可视化分析和拟合优度诊断测试](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FRegression\u002FRegression_Diagnostics.ipynb)\n\n* [使用 Scikit-learn 的 `HuberRegressor` 进行鲁棒线性回归](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FRegression\u002FRobust%20Linear%20Regression.ipynb)\n\n-----\n\n### 分类\n* 逻辑回归\u002F分类 ([这里是 Notebook](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FClassification\u002FLogistic_Regression_Classification.ipynb))\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Ftirthajyoti_Machine-Learning-with-Python_readme_c17398f5a73e.png\"\u002F>\n\n* _k_ 最近邻分类 ([这里是 Notebook](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FClassification\u002FKNN_Classification.ipynb))\n\n* 决策树与随机森林分类 ([这里是 Notebook](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FClassification\u002FDecisionTrees_RandomForest_Classification.ipynb))\n\n* 支持向量机分类 ([这里是 
Notebook](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FClassification\u002FSupport_Vector_Machine_Classification.ipynb)) (**[查看我在 Towards Data Science 上撰写的关于 SVM 和排序算法的文章](https:\u002F\u002Ftowardsdatascience.com\u002Fhow-the-good-old-sorting-algorithm-helps-a-great-machine-learning-technique-9e744020254b))**\n\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Ftirthajyoti_Machine-Learning-with-Python_readme_65a1b91ed41c.png\"\u002F>\n\n* 朴素贝叶斯分类 ([这里是 Notebook](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FClassification\u002FNaive_Bayes_Classification.ipynb))\n\n---\n\n### 聚类\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Ftirthajyoti_Machine-Learning-with-Python_readme_c9de74278e8d.jpg\" width=\"450\" height=\"300\"\u002F>\n\n* _K_-均值聚类 ([这里是 Notebook](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FClustering-Dimensionality-Reduction\u002FK_Means_Clustering_Practice.ipynb))\n\n* 相似度传播（展示其时间复杂度及阻尼因子的影响）([这里是 Notebook](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FClustering-Dimensionality-Reduction\u002FAffinity_Propagation.ipynb))\n\n* 均值漂移技术（展示其时间复杂度以及噪声对聚类发现的影响）([这里是 Notebook](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FClustering-Dimensionality-Reduction\u002FMean_Shift_Clustering.ipynb))\n\n* DBSCAN（展示其如何在不考虑簇形状的情况下，通用性地检测高密度区域，而 k-means 则无法做到）([这里是 Notebook](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FClustering-Dimensionality-Reduction\u002FDBScan_Clustering.ipynb))\n\n* 层次聚类结合树状图，展示如何选择最佳的聚类数量 ([这里是 
Notebook](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FClustering-Dimensionality-Reduction\u002FHierarchical_Clustering.ipynb))\n\n\u003Cimg src=\"https:\u002F\u002Fwww.researchgate.net\u002Fprofile\u002FCarsten_Walther\u002Fpublication\u002F273456906\u002Ffigure\u002Ffig3\u002FAS:294866065084419@1447312956501\u002FExample-of-hierarchical-clustering-clusters-are-consecutively-merged-with-the-most.png\" width=\"700\" height=\"400\"\u002F>\n\n---\n\n### 降维\n* 主成分分析\n\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Ftirthajyoti_Machine-Learning-with-Python_readme_cceb1c27ece7.jpg\" width=\"450\" height=\"300\"\u002F>\n\n---\n\n### 深度学习\u002F神经网络\n* [演示 Notebook，说明深度神经网络在复杂非线性函数逼近任务中的优越性](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FFunction%20Approximation%20by%20Neural%20Network\u002FPolynomial%20regression%20-%20linear%20and%20neural%20network.ipynb)\n* 使用基础 TensorFlow 方法逐步构建单隐藏层和双隐藏层密集网络。\n\n---\n\n### 使用符号表达式生成随机数据\n* 如何利用 [Sympy 包](https:\u002F\u002Fwww.sympy.org\u002Fen\u002Findex.html) 通过符号数学表达式生成随机数据集。\n\n* 这是我的 Medium 文章：[使用符号表达式生成随机回归与分类问题](https:\u002F\u002Ftowardsdatascience.com\u002Frandom-regression-and-classification-problem-generation-with-symbolic-expression-a4e190e37b8d)\n\n---\n\n### 合成数据生成技术\n* [Notebook 请见这里](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Ftree\u002Fmaster\u002FSynthetic_data_generation)\n\n### 简单部署示例（在 Web API 上服务机器学习模型）\n* [通过简单的 HTTP 服务器接口服务线性回归模型](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Ftree\u002Fmaster\u002FDeployment\u002FLinear_regression)。用户需要通过执行 Python 脚本来请求预测结果。使用了 `Flask` 和 `Gunicorn`。\n\n* [通过 HTTP 网页服务循环神经网络 (RNN)](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Ftree\u002Fmaster\u002FDeployment\u002Frnn_app)，配备了一个网页表单，用户可以输入参数并点击按钮，基于预先训练好的 RNN 模型生成文本。使用了 
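`Flask`, `Jinja`, `Keras`\u002F`TensorFlow`, and `WTForms`.\n\n---\n\n### Object-oriented programming with machine learning\nImplementing some of the core OOP principles in a machine learning context, for example by [building your own Scikit-learn-like estimator, and making it better](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FOOP_in_ML\u002FClass_MyLinearRegression.ipynb).\n\nSee my related articles on Medium:\n\n* [Object-oriented programming for data scientists: Build your ML estimator](https:\u002F\u002Ftowardsdatascience.com\u002Fobject-oriented-programming-for-data-scientists-build-your-ml-estimator-7da416751f64)\n* [How a simple mix of object-oriented programming can sharpen your deep learning prototype](https:\u002F\u002Ftowardsdatascience.com\u002Fhow-a-simple-mix-of-object-oriented-programming-can-sharpen-your-deep-learning-prototype-19893bd969bd)\n\n---\n### Unit testing ML code with Pytest\nCheck the files and detailed instructions in the [Pytest](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Ftree\u002Fmaster\u002FPytest) directory to understand how to write unit testing code\u002Fmodules for machine learning models.\n\n---\n\n### Memory and timing profiling\nProfiling data science code and ML models for memory footprint and computing time is a critical but often overlooked task. Here are some notebooks showing the ideas:\n\n* [Memory profiling using Scalene](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Ftree\u002Fmaster\u002FMemory-profiling\u002FScalene)\n* [Time profiling of data science code](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FTime-profiling\u002FcProfile.ipynb)","# Machine-Learning-with-Python Quick Start Guide\n\nThis guide is based on the open-source machine learning tutorial repository by Dr. Tirthajyoti Sarkar. It provides complete Jupyter Notebook examples covering data processing, classical algorithms, and model deployment.\n\n## Environment setup\n\nBefore you begin, make sure your development environment meets the following requirements:\n\n*   **Operating system**: Windows \u002F macOS \u002F Linux\n*   **Python version**: 3.6 or later\n*   **Core dependencies**:\n    *   Data processing: `NumPy`, `Pandas`, `SciPy`, `pdpipe`\n    *   Visualization: `Matplotlib`, `Seaborn`\n    *   Machine learning: `Scikit-learn`, `Statsmodels`\n    *   Deep learning: `TensorFlow` (>=1.15), `Keras`\n    *   Web deployment: `Flask`, `WTForms`\n    *   Symbolic computation: `Sympy`\n\n> **Tip**: Users in mainland China may want to use the Tsinghua or Alibaba mirror to speed up installation.\n\n## Installation\n\n### 1. 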
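Clone the project\nFirst, clone the repository to your local machine:\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python.git\ncd Machine-Learning-with-Python\n```\n\n### 2. Install the dependencies\nIt is recommended to install all required components in one go with `pip`, here using the Tsinghua PyPI mirror:\n\n```bash\npip install numpy pandas scikit-learn scipy statsmodels matplotlib seaborn sympy flask wtforms pdpipe -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple\n```\n\nInstall the deep learning frameworks (mind version compatibility):\n```bash\npip install \"tensorflow>=1.15\" keras -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple\n```\n\n## Basic usage\n\nThe project consists mainly of a series of **Jupyter Notebooks (.ipynb)** with hands-on examples ranging from basic operations to advanced algorithms.\n\n### Launching\nMake sure Jupyter Lab or Notebook is installed:\n```bash\npip install jupyterlab -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple\njupyter lab\n```\nOnce it opens in your browser, navigate to the cloned project directory to see the notebooks organized by topic.\n\n### Simplest example: exploring Pandas and Numpy basics\nBeginners are advised to start with the `Pandas and Numpy` folder to get a quick grasp of core data operations.\n\n1.  In Jupyter, open `Pandas and Numpy\u002FNumpy_Pandas_Quick.ipynb`.\n2.  Run the cells to see basic demonstrations of array creation and DataFrame operations.\n\nAlternatively, try running a concrete regression analysis example:\n\n1.  Go to the `Regression` folder.\n2.  Open `Linear_Regression_Methods.ipynb`.\n3.  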
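This notebook shows several ways to run a linear regression and compares their speed. You can modify the data portion of the code directly and observe how the model fit changes.\n\n```python\n# Example: load and preview data in a notebook ('your_data.csv' is a placeholder; replace it with a real file path)\nimport pandas as pd\n\n# Read the sample data\ndf = pd.read_csv('your_data.csv')\n\n# Quick preview of the first rows\nprint(df.head())\n```\n\n### Suggested next steps\n*   **Data cleaning**: Read the notebooks in the `Read_data_various_sources` directory to learn how to read data in formats such as CSV and PDF tables.\n*   **Algorithms in practice**: Browse the `Classification` and `Clustering-Dimensionality-Reduction` directories in turn; each notebook includes illustrations of the algorithm's principles along with a code implementation.\n*   **Model deployment**: See the `Deployment` directory to learn how to use Flask to publish a trained linear regression or RNN model as a web API.","Xiao Li, a data analyst at a startup, has one week to build a customer churn prediction model for an e-commerce client, but he is not yet proficient in advanced Scikit-learn hyperparameter tuning or complex data cleaning with Pandas.\n\n### Without Machine-Learning-with-Python\n- **Time lost looking up basic syntax**: For complex operations such as NumPy broadcasting or Pandas multi-level indexing, he has to search scattered documentation repeatedly, which interrupts his train of thought.\n- **Pitfalls in data preprocessing**: When handling unstructured data (such as PDF reports) or filling missing values, he has no standard code to refer to and can easily write inefficient or even incorrect logic.\n- **High barrier to implementing algorithms**: When writing deep learning models or ensemble learning algorithms from scratch, unfamiliarity with Keras\u002FTensorFlow best practices leads to repeated debugging of errors.\n- **Limited visualization**: He can only draw basic charts and struggles to use Seaborn to quickly produce insightful statistical graphics for business stakeholders.\n- **Steep learning curve**: Without systematic tutorial notebooks, it is hard to connect data science concepts with their code implementations, and training costs are high.\n\n### With Machine-Learning-with-Python\n- **Look up and reuse**: He directly reuses the repository's detailed NumPy and Pandas notebooks, quickly masters advanced functions, and improves coding efficiency by 50%.\n- **Standardized data processing**: Drawing on practical examples such as \"PDF reading and table processing\", he quickly resolves heterogeneous data source problems and keeps the data cleaning pipeline robust.\n- **Ready-made algorithm templates**: Using tutorial notebooks that cover a wide range of machine learning techniques, he quickly builds and tunes the prediction model with far less trial and error.\n- **Professional visualization**: With the Matplotlib and Seaborn quick-reference examples, he produces high-quality analysis charts in one step, making his reports considerably more persuasive.\n- **Systematic skill building**: Working through the step-by-step Jupyter notebooks turns theory into hands-on ability quickly and shortens the ramp-up time for new team members.\n\nBy providing comprehensive, hands-on notebooks, Machine-Learning-with-Python turns a fragmented learning process into a standardized development workflow and greatly lowers the barrier to delivering machine learning projects.","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Ftirthajyoti_Machine-Learning-with-Python_c17398f5.png","tirthajyoti","Tirthajyoti Sarkar","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002Ftirthajyoti_6bfa2f0f.jpg","VP, AI\u002FML Platforms, Industry 4.0, edge-computing, semiconductor technology, Author, Python packages - pydbgen, MLR, and doepy,","Rhombus Power","Fremont, CA","tirthajyoti@gmail.com","tirthajyotiS","tirthajyoti.github.io","https:\u002F\u002Fgithub.com\u002Ftirthajyoti",[83,87,91,95],{"name":84,"color":85,"percentage":86},"Jupyter 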
Notebook","#DA5B0B",99.8,{"name":88,"color":89,"percentage":90},"Python","#3572A5",0.2,{"name":92,"color":93,"percentage":94},"CSS","#663399",0,{"name":96,"color":97,"percentage":94},"HTML","#e34c26",3308,1839,"2026-04-11T13:05:31","BSD-2-Clause","","Not specified",{"notes":105,"python":106,"dependencies":107},"This project mainly consists of Jupyter Notebook tutorials for machine learning. Dependencies can be installed with pip. Some examples involve Flask web deployment and Keras\u002FTensorFlow deep learning, but no GPU acceleration or specific video memory requirements are stated.","3.6+",[108,109,110,111,112,113,114,115,116,117],"numpy","pandas","scikit-learn","scipy","statsmodels","matplotlib","seaborn","sympy","flask","tensorflow>=1.15",[14,16,119],"其他",[108,121,109,113,122,110,123,124,125,126,127,128,129,130,131,132,133,134,135,116],"statistics","regression","classification","clustering","decision-trees","random-forest","dimensionality-reduction","neural-network","deep-learning","artificial-intelligence","data-science","machine-learning","k-nearest-neighbours","naive-bayes","pytest",null,"2026-03-27T02:49:30.150509","2026-04-12T07:50:40.985564",[140,145,150,155],{"id":141,"question_zh":142,"answer_zh":143,"source_url":144},30414,"How can I quickly run the Jupyter Notebooks on any platform using Docker?","You can use a prebuilt Docker image to get the notebooks running within minutes. The steps are:\n1. Initialize the Git repository:\n   mkdir -p ~\u002Fdev\u002Fml\n   cd ~\u002Fdev\u002Fml\n   git clone https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FPythonMachineLearning.git\n2. Pull the Docker image:\n   docker pull artificialintelligence\u002Fpython-jupyter\n3. Run the container (mapping the port and directories):\n   docker run -d -p 9000:8888 -v ${PWD}:\u002Fnotebook -v ${PWD}:\u002Fdata artificialintelligence\u002Fpython-jupyter\n4. Open http:\u002F\u002Flocalhost:9000 in a browser.\n5. 
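To stop the container, first run docker ps to find the container ID, then run docker kill \u003CCONTAINER_ID>.","https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fissues\u002F3",{"id":146,"question_zh":147,"answer_zh":148,"source_url":149},30415,"Does the project include reinforcement learning models (such as UCB or Thompson sampling)? If not, where should such code be contributed?","This project focuses on basic machine learning and does not include reinforcement learning models. The maintainer suggests adding such models (for example UCB and Thompson sampling) to the dedicated reinforcement learning basics repository: https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FRL_basics","https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fissues\u002F13",{"id":151,"question_zh":152,"answer_zh":153,"source_url":154},30416,"In nonlinear algorithms such as gradient descent, how can getting stuck in a local optimum be avoided? Is there a good method for estimating initial parameters?","For complex equations, the choice of initial parameters is critical. The recommendation is to use scipy.optimize.differential_evolution (the differential evolution genetic algorithm) to estimate the initial parameters. The linear regression code in this project does not treat initialization specially because convergence is guaranteed there, but the maintainer confirmed that for cases such as backpropagation in multi-layer neural networks, better initialization strategies of this kind will receive attention and be adopted.","https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fissues\u002F1",{"id":156,"question_zh":157,"answer_zh":158,"source_url":159},30417,"When benchmarking NumPy operations (such as logarithms), should the timed region include the time to create the array?","This is a question about benchmark fairness. A user pointed out that when testing the speed of np.log10, the code timed only the operation and left out the np.array creation time, which may be unfair compared with the other methods that do not create an array in advance. When comparing the speed of different methods, either start the timer at the same point for all of them, so that data preparation (array creation) is included, or state clearly that the timing covers only the specific operation.","https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fissues\u002F4",[]]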