[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-rapidsai--cuml":3,"tool-rapidsai--cuml":64},[4,17,27,35,43,56],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":16},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,3,"2026-04-05T11:01:52",[13,14,15],"开发框架","图像","Agent","ready",{"id":18,"name":19,"github_repo":20,"description_zh":21,"stars":22,"difficulty_score":23,"last_commit_at":24,"category_tags":25,"status":16},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",140436,2,"2026-04-05T23:32:43",[13,15,26],"语言模型",{"id":28,"name":29,"github_repo":30,"description_zh":31,"stars":32,"difficulty_score":23,"last_commit_at":33,"category_tags":34,"status":16},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 
绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",107662,"2026-04-03T11:11:01",[13,14,15],{"id":36,"name":37,"github_repo":38,"description_zh":39,"stars":40,"difficulty_score":23,"last_commit_at":41,"category_tags":42,"status":16},3704,"NextChat","ChatGPTNextWeb\u002FNextChat","NextChat 是一款轻量且极速的 AI 助手，旨在为用户提供流畅、跨平台的大模型交互体验。它完美解决了用户在多设备间切换时难以保持对话连续性，以及面对众多 AI 模型不知如何统一管理的痛点。无论是日常办公、学习辅助还是创意激发，NextChat 都能让用户随时随地通过网页、iOS、Android、Windows、MacOS 或 Linux 端无缝接入智能服务。\n\n这款工具非常适合普通用户、学生、职场人士以及需要私有化部署的企业团队使用。对于开发者而言，它也提供了便捷的自托管方案，支持一键部署到 Vercel 或 Zeabur 等平台。\n\nNextChat 的核心亮点在于其广泛的模型兼容性，原生支持 Claude、DeepSeek、GPT-4 及 Gemini Pro 等主流大模型，让用户在一个界面即可自由切换不同 AI 能力。此外，它还率先支持 MCP（Model Context Protocol）协议，增强了上下文处理能力。针对企业用户，NextChat 提供专业版解决方案，具备品牌定制、细粒度权限控制、内部知识库整合及安全审计等功能，满足公司对数据隐私和个性化管理的高标准要求。",87618,"2026-04-05T07:20:52",[13,26],{"id":44,"name":45,"github_repo":46,"description_zh":47,"stars":48,"difficulty_score":23,"last_commit_at":49,"category_tags":50,"status":16},2268,"ML-For-Beginners","microsoft\u002FML-For-Beginners","ML-For-Beginners 是由微软推出的一套系统化机器学习入门课程，旨在帮助零基础用户轻松掌握经典机器学习知识。这套课程将学习路径规划为 12 周，包含 26 节精炼课程和 52 道配套测验，内容涵盖从基础概念到实际应用的完整流程，有效解决了初学者面对庞大知识体系时无从下手、缺乏结构化指导的痛点。\n\n无论是希望转型的开发者、需要补充算法背景的研究人员，还是对人工智能充满好奇的普通爱好者，都能从中受益。课程不仅提供了清晰的理论讲解，还强调动手实践，让用户在循序渐进中建立扎实的技能基础。其独特的亮点在于强大的多语言支持，通过自动化机制提供了包括简体中文在内的 50 多种语言版本，极大地降低了全球不同背景用户的学习门槛。此外，项目采用开源协作模式，社区活跃且内容持续更新，确保学习者能获取前沿且准确的技术资讯。如果你正寻找一条清晰、友好且专业的机器学习入门之路，ML-For-Beginners 将是理想的起点。",84991,"2026-04-05T10:45:23",[14,51,52,53,15,54,26,13,55],"数据工具","视频","插件","其他","音频",{"id":57,"name":58,"github_repo":59,"description_zh":60,"stars":61,"difficulty_score":10,"last_commit_at":62,"category_tags":63,"status":16},3128,"ragflow","infiniflow\u002Fragflow","RAGFlow 
是一款领先的开源检索增强生成（RAG）引擎，旨在为大语言模型构建更精准、可靠的上下文层。它巧妙地将前沿的 RAG 技术与智能体（Agent）能力相结合，不仅支持从各类文档中高效提取知识，还能让模型基于这些知识进行逻辑推理和任务执行。\n\n在大模型应用中，幻觉问题和知识滞后是常见痛点。RAGFlow 通过深度解析复杂文档结构（如表格、图表及混合排版），显著提升了信息检索的准确度，从而有效减少模型“胡编乱造”的现象，确保回答既有据可依又具备时效性。其内置的智能体机制更进一步，使系统不仅能回答问题，还能自主规划步骤解决复杂问题。\n\n这款工具特别适合开发者、企业技术团队以及 AI 研究人员使用。无论是希望快速搭建私有知识库问答系统，还是致力于探索大模型在垂直领域落地的创新者，都能从中受益。RAGFlow 提供了可视化的工作流编排界面和灵活的 API 接口，既降低了非算法背景用户的上手门槛，也满足了专业开发者对系统深度定制的需求。作为基于 Apache 2.0 协议开源的项目，它正成为连接通用大模型与行业专有知识之间的重要桥梁。",77062,"2026-04-04T04:44:48",[15,14,13,26,54],{"id":65,"github_repo":66,"name":67,"description_en":68,"description_zh":69,"ai_summary_zh":70,"readme_en":71,"readme_zh":72,"quickstart_zh":73,"use_case_zh":74,"hero_image_url":75,"owner_login":76,"owner_name":77,"owner_avatar_url":78,"owner_bio":79,"owner_company":80,"owner_location":80,"owner_email":80,"owner_twitter":81,"owner_website":82,"owner_url":83,"languages":84,"stars":124,"forks":125,"last_commit_at":126,"license":127,"difficulty_score":10,"env_os":128,"env_gpu":129,"env_ram":130,"env_deps":131,"category_tags":143,"github_topics":144,"view_count":23,"oss_zip_url":80,"oss_zip_packed_at":80,"status":16,"created_at":150,"updated_at":151,"faqs":152,"releases":181},2720,"rapidsai\u002Fcuml","cuml","cuML - RAPIDS Machine Learning Library","cuML 是 RAPIDS 生态系统中专为 GPU 加速设计的机器学习库，旨在让数据科学家和工程师无需深入 CUDA 编程细节，即可在显卡上高效运行传统表格型机器学习任务。它完美兼容 scikit-learn 的 Python API，用户只需极少的代码修改，就能将原本在 CPU 上运行的算法迁移至 GPU，从而在处理大规模数据集时获得 10 到 50 倍的性能提升。\n\ncuML 主要解决了传统机器学习在海量数据面前计算缓慢、耗时过长的痛点，特别适用于需要快速迭代模型或处理亿级数据行的场景。无论是进行聚类分析、降维处理，还是构建回归与分类模型，cuML 都提供了丰富的算法支持，包括 DBSCAN、K-Means、PCA 以及线性回归等。\n\n该工具非常适合熟悉 Python 和数据科学工作流的开发者、研究人员及软件工程师使用。其独特的技术亮点在于不仅支持单卡加速，还通过集成 Dask 框架实现了多 GPU 乃至多节点集群的分布式训练与推理，能够轻松应对超大规模数据的挑战。此外，cuML 能与 cuDF 无缝协作，直接操作 GPU 内存中的数据帧，进一步消除了数据传输瓶颈，让高性能计算变得","cuML 是 RAPIDS 生态系统中专为 GPU 加速设计的机器学习库，旨在让数据科学家和工程师无需深入 CUDA 编程细节，即可在显卡上高效运行传统表格型机器学习任务。它完美兼容 scikit-learn 的 Python API，用户只需极少的代码修改，就能将原本在 CPU 上运行的算法迁移至 GPU，从而在处理大规模数据集时获得 10 到 50 倍的性能提升。\n\ncuML 
主要解决了传统机器学习在海量数据面前计算缓慢、耗时过长的痛点，特别适用于需要快速迭代模型或处理亿级数据行的场景。无论是进行聚类分析、降维处理，还是构建回归与分类模型，cuML 都提供了丰富的算法支持，包括 DBSCAN、K-Means、PCA 以及线性回归等。\n\n该工具非常适合熟悉 Python 和数据科学工作流的开发者、研究人员及软件工程师使用。其独特的技术亮点在于不仅支持单卡加速，还通过集成 Dask 框架实现了多 GPU 乃至多节点集群的分布式训练与推理，能够轻松应对超大规模数据的挑战。此外，cuML 能与 cuDF 无缝协作，直接操作 GPU 内存中的数据帧，进一步消除了数据传输瓶颈，让高性能计算变得触手可及。","# \u003Cdiv align=\"left\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Frapidsai_cuml_readme_2b4ae54fb74d.png\" width=\"90px\"\u002F>&nbsp;cuML - GPU Machine Learning Algorithms\u003C\u002Fdiv>\n\ncuML is a suite of libraries that implement machine learning algorithms and mathematical primitives functions that share compatible APIs with other [RAPIDS](https:\u002F\u002Frapids.ai\u002F) projects.\n\ncuML enables data scientists, researchers, and software engineers to run\ntraditional tabular ML tasks on GPUs without going into the details of CUDA\nprogramming. In most cases, cuML's Python API matches the API from\n[scikit-learn](https:\u002F\u002Fscikit-learn.org).\n\nFor large datasets, these GPU-based implementations can complete 10-50x faster\nthan their CPU equivalents. For details on performance, see the [cuML Benchmarks\nNotebook](https:\u002F\u002Fgithub.com\u002Frapidsai\u002Fcuml\u002Ftree\u002Fmain\u002Fnotebooks\u002Ftools).\n\nAs an example, the following Python snippet loads input and computes DBSCAN clusters, all on GPU, using cuDF:\n```python\nimport cudf\nfrom cuml.cluster import DBSCAN\n\n# Create and populate a GPU DataFrame\ngdf_float = cudf.DataFrame()\ngdf_float['0'] = [1.0, 2.0, 5.0]\ngdf_float['1'] = [4.0, 2.0, 1.0]\ngdf_float['2'] = [4.0, 2.0, 1.0]\n\n# Setup and fit clusters\ndbscan_float = DBSCAN(eps=1.0, min_samples=1)\ndbscan_float.fit(gdf_float)\n\nprint(dbscan_float.labels_)\n```\n\nOutput:\n```\n0    0\n1    1\n2    2\ndtype: int32\n```\n\ncuML also features multi-GPU and multi-node-multi-GPU operation, using [Dask](https:\u002F\u002Fwww.dask.org), for a\ngrowing list of algorithms. 
The following Python snippet reads input from a CSV file and performs\na NearestNeighbors query across a cluster of Dask workers, using multiple GPUs on a single node:\n\n\nInitialize a `LocalCUDACluster` configured with [UCXX](https:\u002F\u002Fgithub.com\u002Frapidsai\u002Fucxx) for fast transport of CUDA arrays\n```python\n# Initialize UCX for high-speed transport of CUDA arrays\nfrom dask_cuda import LocalCUDACluster\n\n# Create a Dask single-node CUDA cluster w\u002F one worker per device\ncluster = LocalCUDACluster(protocol=\"ucx\",\n                           enable_tcp_over_ucx=True,\n                           enable_nvlink=True,\n                           enable_infiniband=False)\n```\n\nLoad data and perform `k-Nearest Neighbors` search. `cuml.dask` estimators also support `Dask.Array` as input:\n```python\n\nfrom dask.distributed import Client\nclient = Client(cluster)\n\n# Read CSV file in parallel across workers\nimport dask_cudf\ndf = dask_cudf.read_csv(\"\u002Fpath\u002Fto\u002Fcsv\")\n\n# Fit a NearestNeighbors model and query it\nfrom cuml.dask.neighbors import NearestNeighbors\nnn = NearestNeighbors(n_neighbors = 10, client=client)\nnn.fit(df)\nneighbors = nn.kneighbors(df)\n```\n\nFor additional examples, browse our complete [API\ndocumentation](https:\u002F\u002Fdocs.rapids.ai\u002Fapi\u002Fcuml\u002Fstable\u002F), or check out our\nexample [walkthrough\nnotebooks](https:\u002F\u002Fgithub.com\u002Frapidsai\u002Fcuml\u002Ftree\u002Fmain\u002Fnotebooks). 
Finally, you\ncan find complete end-to-end examples in the [notebooks-contrib\nrepo](https:\u002F\u002Fgithub.com\u002Frapidsai\u002Fnotebooks-contrib).\n\n\n### Supported Algorithms\n| Category | Algorithm | Notes |\n| --- | --- | --- |\n| **Clustering** |  Density-Based Spatial Clustering of Applications with Noise (DBSCAN) | Multi-node multi-GPU via Dask |\n|  | Hierarchical Density-Based Spatial Clustering of Applications with Noise (HDBSCAN)  | |\n|  | K-Means | Multi-node multi-GPU via Dask |\n|  | Single-Linkage Agglomerative Clustering | |\n|  | Spectral Clustering | |\n| **Dimensionality Reduction** | Principal Components Analysis (PCA) | Multi-node multi-GPU via Dask|\n| | Incremental PCA | |\n| | Truncated Singular Value Decomposition (tSVD) | Multi-node multi-GPU via Dask |\n| | Uniform Manifold Approximation and Projection (UMAP) | Multi-node multi-GPU Inference via Dask |\n| | Random Projection | |\n| | t-Distributed Stochastic Neighbor Embedding (TSNE) | |\n| | Spectral Embedding | |\n| **Linear Models for Regression or Classification** | Linear Regression (OLS) | Multi-node multi-GPU via Dask |\n| | Linear Regression with Lasso or Ridge Regularization | Multi-node multi-GPU via Dask |\n| | ElasticNet Regression | |\n| | LARS Regression | (experimental) |\n| | Logistic Regression | Multi-node multi-GPU via Dask-GLM [demo](https:\u002F\u002Fgithub.com\u002Fdaxiongshu\u002Frapids-demos) |\n| | Naive Bayes | Multi-node multi-GPU via Dask |\n| | Stochastic Gradient Descent (SGD), Coordinate Descent (CD), and Quasi-Newton (QN) (including L-BFGS and OWL-QN) solvers for linear models  | |\n| **Nonlinear Models for Regression or Classification** | Random Forest (RF) Classification | Experimental multi-node multi-GPU via Dask |\n| | Random Forest (RF) Regression | Experimental multi-node multi-GPU via Dask |\n| | Inference for decision tree-based models | Forest Inference Library (FIL) |\n|  | K-Nearest Neighbors (KNN) Classification | Multi-node multi-GPU 
via Dask+[UCXX](https:\u002F\u002Fgithub.com\u002Frapidsai\u002Fucxx), uses [Faiss](https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002Ffaiss) for Nearest Neighbors Query. |\n|  | K-Nearest Neighbors (KNN) Regression | Multi-node multi-GPU via Dask+[UCXX](https:\u002F\u002Fgithub.com\u002Frapidsai\u002Fucxx), uses [Faiss](https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002Ffaiss) for Nearest Neighbors Query. |\n|  | Support Vector Machine Classifier (SVC) | |\n|  | Epsilon-Support Vector Regression (SVR) | |\n| **Preprocessing** | Standardization, or mean removal and variance scaling \u002F Normalization \u002F Encoding categorical features \u002F Discretization \u002F Imputation of missing values \u002F Polynomial features generation \u002F and coming soon custom transformers and non-linear transformation | Based on Scikit-Learn preprocessing\n| **Time Series** | Holt-Winters Exponential Smoothing | |\n|  | Auto-regressive Integrated Moving Average (ARIMA) | Supports seasonality (SARIMA) |\n| **Model Explanation** | SHAP Kernel Explainer | [Based on SHAP](https:\u002F\u002Fshap.readthedocs.io\u002Fen\u002Flatest\u002F) |\n|  | SHAP Permutation Explainer | [Based on SHAP](https:\u002F\u002Fshap.readthedocs.io\u002Fen\u002Flatest\u002F) |\n| **Execution device interoperability** | | Run estimators interchangeably from host\u002Fcpu or device\u002Fgpu with minimal code change [demo](https:\u002F\u002Fdocs.rapids.ai\u002Fapi\u002Fcuml\u002Fstable\u002Fexecution_device_interoperability.html) |\n| **Other**                                             | K-Nearest Neighbors (KNN) Search                                                                                                          | Multi-node multi-GPU via Dask+[UCXX](https:\u002F\u002Fgithub.com\u002Frapidsai\u002Fucxx), uses [Faiss](https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002Ffaiss) for Nearest Neighbors Query. 
|\n\n---\n\n## Installation\n\nSee [the RAPIDS Release Selector](https:\u002F\u002Fdocs.rapids.ai\u002Finstall#selector) for\nthe command line to install either nightly or official release cuML packages\nvia conda, pip, or Docker.\n\n## Build\u002FInstall from Source\nSee the build [guide](BUILD.md).\n\n## Scikit-learn Compatibility\n\ncuML is compatible with scikit-learn version 1.4 or higher.\n\n## Model serialization and security\n\ncuML models can be serialized with `pickle` or `joblib` and loaded later for inference. cuML uses cloudpickle so that models trained with cuml.accel can be loaded and used with scikit-learn.\n\n**Only unpickle or deserialize from trusted sources.** The `pickle` module (and by extension `joblib`) is not secure: malicious payloads can execute arbitrary code during deserialization and compromise your system. **Do not unpickle or load data from untrusted or tampered sources.** This applies to `pickle.load()` \u002F `pickle.loads()`, `joblib.load()`, and any file-based model loading. For details and patterns, see the [Model Serialization and Persistence](docs\u002Fsource\u002Fpickling_cuml_models.ipynb) notebook and the [Python pickle security documentation](https:\u002F\u002Fdocs.python.org\u002F3\u002Flibrary\u002Fpickle.html).\n\n## Contributing\n\nPlease see our [guide for contributing to cuML](CONTRIBUTING.md).\n\n## References\n\nThe RAPIDS team has a number of blogs with deeper technical dives and examples. 
[You can find them here on Medium.](https:\u002F\u002Fmedium.com\u002Frapids-ai\u002Ftagged\u002Fmachine-learning)\n\nFor additional details on the technologies behind cuML, as well as a broader overview of the Python Machine Learning landscape, see [_Machine Learning in Python: Main developments and technology trends in data science, machine learning, and artificial intelligence_ (2020)](https:\u002F\u002Farxiv.org\u002Fabs\u002F2002.04803) by Sebastian Raschka, Joshua Patterson, and Corey Nolet.\n\nPlease consider citing this when using cuML in a project. You can use the citation BibTeX:\n\n```bibtex\n@article{raschka2020machine,\n  title={Machine Learning in Python: Main developments and technology trends in data science, machine learning, and artificial intelligence},\n  author={Raschka, Sebastian and Patterson, Joshua and Nolet, Corey},\n  journal={arXiv preprint arXiv:2002.04803},\n  year={2020}\n}\n```\n\n## Contact\n\nFind out more details on the [RAPIDS site](https:\u002F\u002Frapids.ai\u002Fcommunity.html)\n\n## \u003Cdiv align=\"left\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Frapidsai_cuml_readme_2b4ae54fb74d.png\" width=\"265px\"\u002F>\u003C\u002Fdiv> Open GPU Data Science\n\nThe RAPIDS suite of open source software libraries aim to enable execution of end-to-end data science and analytics pipelines entirely on GPUs. 
It relies on NVIDIA® CUDA® primitives for low-level compute optimization, but exposing that GPU parallelism and high-bandwidth memory speed through user-friendly Python interfaces.\n\n\u003Cp align=\"center\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Frapidsai_cuml_readme_635a03b38996.png\" width=\"80%\"\u002F>\u003C\u002Fp>\n","# \u003Cdiv align=\"left\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Frapidsai_cuml_readme_2b4ae54fb74d.png\" width=\"90px\"\u002F>&nbsp;cuML - GPU 机器学习算法\u003C\u002Fdiv>\n\ncuML 是一套库，实现了与 RAPIDS 其他项目具有兼容 API 的机器学习算法和数学基础函数。\n\ncuML 使数据科学家、研究人员和软件工程师能够在不深入 CUDA 编程细节的情况下，在 GPU 上运行传统的表格型机器学习任务。在大多数情况下，cuML 的 Python API 与 [scikit-learn](https:\u002F\u002Fscikit-learn.org) 的 API 完全一致。\n\n对于大型数据集，这些基于 GPU 的实现速度可以比其 CPU 对应版本快 10 到 50 倍。有关性能的详细信息，请参阅 [cuML 基准测试笔记本](https:\u002F\u002Fgithub.com\u002Frapidsai\u002Fcuml\u002Ftree\u002Fmain\u002Fnotebooks\u002Ftools)。\n\n例如，以下 Python 代码片段使用 cuDF 在 GPU 上加载输入并计算 DBSCAN 聚类：\n```python\nimport cudf\nfrom cuml.cluster import DBSCAN\n\n# 创建并填充一个 GPU DataFrame\ngdf_float = cudf.DataFrame()\ngdf_float['0'] = [1.0, 2.0, 5.0]\ngdf_float['1'] = [4.0, 2.0, 1.0]\ngdf_float['2'] = [4.0, 2.0, 1.0]\n\n# 设置并拟合聚类模型\ndbscan_float = DBSCAN(eps=1.0, min_samples=1)\ndbscan_float.fit(gdf_float)\n\nprint(dbscan_float.labels_)\n```\n\n输出：\n```\n0    0\n1    1\n2    2\ndtype: int32\n```\n\ncuML 还支持多 GPU 和多节点多 GPU 操作，通过 [Dask](https:\u002F\u002Fwww.dask.org) 实现，并且这一功能正在逐步扩展到更多算法中。以下 Python 代码片段从 CSV 文件读取输入，并在 Dask 工作节点集群上执行最近邻查询，同时利用单个节点上的多个 GPU：\n\n首先初始化一个配置了 [UCXX](https:\u002F\u002Fgithub.com\u002Frapidsai\u002Fucxx) 的 `LocalCUDACluster`，以实现 CUDA 数组的高速传输：\n```python\n# 初始化 UCX 以实现 CUDA 数组的高速传输\nfrom dask_cuda import LocalCUDACluster\n\n# 创建一个单节点 Dask CUDA 集群，每个设备对应一个工作进程\ncluster = LocalCUDACluster(protocol=\"ucx\",\n                           enable_tcp_over_ucx=True,\n                           enable_nvlink=True,\n                           
enable_infiniband=False)\n```\n\n然后加载数据并执行 k 最近邻搜索。`cuml.dask` 中的估计器也支持 `Dask.Array` 作为输入：\n```python\n\nfrom dask.distributed import Client\nclient = Client(cluster)\n\n# 并行地在各个工作节点上读取 CSV 文件\nimport dask_cudf\ndf = dask_cudf.read_csv(\"\u002Fpath\u002Fto\u002Fcsv\")\n\n# 拟合最近邻模型并进行查询\nfrom cuml.dask.neighbors import NearestNeighbors\nnn = NearestNeighbors(n_neighbors = 10, client=client)\nnn.fit(df)\nneighbors = nn.kneighbors(df)\n```\n\n如需更多示例，请浏览我们的完整 [API 文档](https:\u002F\u002Fdocs.rapids.ai\u002Fapi\u002Fcuml\u002Fstable\u002F) 或查看我们的示例 [教程笔记本](https:\u002F\u002Fgithub.com\u002Frapidsai\u002Fcuml\u002Ftree\u002Fmain\u002Fnotebooks)。此外，您还可以在 [notebooks-contrib 仓库](https:\u002F\u002Fgithub.com\u002Frapidsai\u002Fnotebooks-contrib) 中找到完整的端到端示例。\n\n### 支持的算法\n| 类别 | 算法 | 备注 |\n| --- | --- | --- |\n| **聚类** | 基于密度的带有噪声的应用空间聚类 (DBSCAN) | 通过 Dask 实现多节点多 GPU |\n|  | 层次化的基于密度的带有噪声的应用空间聚类 (HDBSCAN) | |\n|  | K 均值聚类 | 通过 Dask 实现多节点多 GPU |\n|  | 单链接凝聚聚类 | |\n|  | 谱聚类 | |\n| **降维** | 主成分分析 (PCA) | 通过 Dask 实现多节点多 GPU|\n| | 增量 PCA | |\n| | 截断奇异值分解 (tSVD) | 通过 Dask 实现多节点多 GPU |\n| | 统一流形近似与投影 (UMAP) | 通过 Dask 实现多节点多 GPU 推理 |\n| | 随机投影 | |\n| | t 分布随机邻域嵌入 (TSNE) | |\n| | 谱嵌入 | |\n| **用于回归或分类的线性模型** | 最小二乘法线性回归 (OLS) | 通过 Dask 实现多节点多 GPU |\n| | 带有 Lasso 或 Ridge 正则化的线性回归 | 通过 Dask 实现多节点多 GPU |\n| | 弹性网络回归 | |\n| | LARS 回归 | （实验性）|\n| | 逻辑回归 | 通过 Dask-GLM 实现多节点多 GPU [演示](https:\u002F\u002Fgithub.com\u002Fdaxiongshu\u002Frapids-demos) |\n| | 朴素贝叶斯 | 通过 Dask 实现多节点多 GPU |\n| | 随机梯度下降 (SGD)、坐标下降 (CD) 和拟牛顿法 (QN)（包括 L-BFGS 和 OWL-QN）求解线性模型 | |\n| **用于回归或分类的非线性模型** | 随机森林 (RF) 分类 | 实验性：通过 Dask 实现多节点多 GPU |\n| | 随机森林 (RF) 回归 | 实验性：通过 Dask 实现多节点多 GPU |\n| | 基于决策树模型的推理 | 决策树推理库 (FIL) |\n|  | K 最近邻 (KNN) 分类 | 通过 Dask+[UCXX](https:\u002F\u002Fgithub.com\u002Frapidsai\u002Fucxx) 实现多节点多 GPU，使用 [Faiss](https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002Ffaiss) 进行最近邻查询。 |\n|  | K 最近邻 (KNN) 回归 | 通过 Dask+[UCXX](https:\u002F\u002Fgithub.com\u002Frapidsai\u002Fucxx) 实现多节点多 GPU，使用 
[Faiss](https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002Ffaiss) 进行最近邻查询。 |\n|  | 支持向量机分类器 (SVC) | |\n|  | Epsilon 支持向量回归 (SVR) | |\n| **预处理** | 标准化，即均值移除和方差缩放 \u002F 归一化 \u002F 编码分类特征 \u002F 离散化 \u002F 缺失值插补 \u002F 多项式特征生成 \u002F 以及即将推出的自定义转换器和非线性变换 | 基于 Scikit-Learn 的预处理 |\n| **时间序列** | Holt-Winters 指数平滑 | |\n|  | 自回归积分滑动平均 (ARIMA) | 支持季节性（SARIMA） |\n| **模型解释** | SHAP 核解释器 | [基于 SHAP](https:\u002F\u002Fshap.readthedocs.io\u002Fen\u002Flatest\u002F) |\n|  | SHAP 置换解释器 | [基于 SHAP](https:\u002F\u002Fshap.readthedocs.io\u002Fen\u002Flatest\u002F) |\n| **执行设备互操作性** | | 可以在主机\u002FCPU 或设备\u002FGPU 上交替运行估计器，只需少量代码更改 [演示](https:\u002F\u002Fdocs.rapids.ai\u002Fapi\u002Fcuml\u002Fstable\u002Fexecution_device_interoperability.html) |\n| **其他**                                             | K 最近邻 (KNN) 搜索                                                                                                          | 通过 Dask+[UCXX](https:\u002F\u002Fgithub.com\u002Frapidsai\u002Fucxx) 实现多节点多 GPU，使用 [Faiss](https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002Ffaiss) 进行最近邻查询。 |\n\n---\n\n## 安装\n\n请参阅 [RAPIDS 发行版选择器](https:\u002F\u002Fdocs.rapids.ai\u002Finstall#selector)，以获取通过 conda、pip 或 Docker 安装 nightly 版或官方 cuML 包的命令行。\n\n## 从源代码构建\u002F安装\n请参阅构建 [指南](BUILD.md)。\n\n## 与 scikit-learn 的兼容性\n\ncuML 与 scikit-learn 1.4 或更高版本兼容。\n\n## 模型序列化与安全性\n\ncuML 模型可以使用 `pickle` 或 `joblib` 进行序列化，并在后续加载以进行推理。cuML 使用 cloudpickle，因此使用 cuml.accel 训练的模型可以加载并在 scikit-learn 中使用。\n\n**请仅从受信任的来源反序列化或加载模型。** `pickle` 模块（以及由此扩展的 `joblib`）并不安全：恶意载荷在反序列化过程中可能执行任意代码，从而危及您的系统。**请勿从不受信任或已被篡改的来源反序列化或加载数据。** 这适用于 `pickle.load()` \u002F `pickle.loads()`、`joblib.load()` 以及任何基于文件的模型加载操作。有关详细信息和最佳实践模式，请参阅 [模型序列化与持久化](docs\u002Fsource\u002Fpickling_cuml_models.ipynb) 笔记本，以及 [Python pickle 安全性文档](https:\u002F\u002Fdocs.python.org\u002F3\u002Flibrary\u002Fpickle.html)。\n\n## 贡献\n\n请参阅我们的 [cuML 贡献指南](CONTRIBUTING.md)。\n\n## 参考文献\n\nRAPIDS 团队发布了许多深入的技术解析和示例博客。[您可以在 Medium 
上找到它们。](https:\u002F\u002Fmedium.com\u002Frapids-ai\u002Ftagged\u002Fmachine-learning)\n\n如需了解更多关于 cuML 背后技术的细节，以及对 Python 机器学习领域的更广泛概述，请参阅 Sebastian Raschka、Joshua Patterson 和 Corey Nolet 于 2020 年撰写的 _Python 中的机器学习：数据科学、机器学习和人工智能领域的主要发展与技术趋势_ (2020) [arXiv:2002.04803]。\n\n在项目中使用 cuML 时，请考虑引用该文献。您可以使用以下 BibTeX 格式的引用：\n\n```bibtex\n@article{raschka2020machine,\n  title={Machine Learning in Python: Main developments and technology trends in data science, machine learning, and artificial intelligence},\n  author={Raschka, Sebastian and Patterson, Joshua and Nolet, Corey},\n  journal={arXiv preprint arXiv:2002.04803},\n  year={2020}\n}\n```\n\n## 联系方式\n\n更多详情请访问 [RAPIDS 官网](https:\u002F\u002Frapids.ai\u002Fcommunity.html)。\n\n## \u003Cdiv align=\"left\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Frapidsai_cuml_readme_2b4ae54fb74d.png\" width=\"265px\"\u002F>\u003C\u002Fdiv> 开放式 GPU 数据科学\n\nRAPIDS 是一套开源软件库，旨在支持完全在 GPU 上运行端到端的数据科学和分析流水线。它依赖 NVIDIA® CUDA® 原语实现底层计算优化，同时通过用户友好的 Python 接口暴露 GPU 的并行计算能力和高带宽内存速度。\n\n\u003Cp align=\"center\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Frapidsai_cuml_readme_635a03b38996.png\" width=\"80%\"\u002F>\u003C\u002Fp>","# cuML 快速上手指南\n\ncuML 是 RAPIDS 套件的一部分，提供了一套与 scikit-learn API 兼容的 GPU 加速机器学习算法库。它允许数据科学家在无需深入了解 CUDA 编程细节的情况下，利用 GPU 将传统表格型机器学习任务的速度提升 10-50 倍。\n\n## 环境准备\n\n在使用 cuML 之前，请确保您的系统满足以下要求：\n\n*   **操作系统**：Linux (x86_64) 或 WSL2 (Windows Subsystem for Linux)。\n*   **GPU 硬件**：NVIDIA GPU (Compute Capability 7.0 或更高，如 Volta, Turing, Ampere, Hopper 架构)。\n*   **驱动程序**：已安装最新的 NVIDIA 驱动。\n*   **CUDA Toolkit**：需安装与 RAPIDS 版本匹配的 CUDA Toolkit（通常通过 conda 安装时会自动处理依赖）。\n*   **Python 版本**：推荐 Python 3.9 - 3.11。\n*   **前置依赖**：建议先安装 `conda` (Miniconda 或 Anaconda) 以管理环境。\n\n> **注意**：cuML 强依赖于 `cudf` (GPU DataFrame 库)，通常作为 RAPIDS 套件整体安装。\n\n## 安装步骤\n\n推荐使用 `conda` 进行安装，这是最稳定且依赖冲突最少的方式。\n\n### 1. 
创建并激活虚拟环境\n```bash\nconda create -n rapids-env -c conda-forge python=3.10\nconda activate rapids-env\n```\n\n### 2. 安装 cuML (及 RAPIDS 核心组件)\n访问 [RAPIDS Release Selector](https:\u002F\u002Fdocs.rapids.ai\u002Finstall#selector) 获取针对您系统配置的最新命令。以下是通用安装示例（基于 rapidsai、conda-forge 和 nvidia 频道）：\n\n```bash\nconda install -c rapidsai -c conda-forge -c nvidia rapids=24.04 python=3.10 cuda-version=12.4\n```\n*注：请将 `rapids=24.04` 和 `cuda-version=12.4` 替换为您需要的具体版本。*\n\n### 国内加速方案\n如果下载速度较慢，可以尝试使用清华源或中科大源镜像（需确认镜像站是否同步了 `nvidia` 和 `rapidsai` 频道，若未同步仍需回退到官方源）：\n\n```bash\n# 配置清华源 (示例)\nconda config --add channels https:\u002F\u002Fmirrors.tuna.tsinghua.edu.cn\u002Fanaconda\u002Fcloud\u002Fconda-forge\u002F\nconda config --add channels https:\u002F\u002Fmirrors.tuna.tsinghua.edu.cn\u002Fanaconda\u002Fpkgs\u002Fmain\u002F\n# 然后执行上述安装命令\n```\n*若国内镜像缺少特定的 RAPIDS 包，请直接使用官方渠道安装。*\n\n### 验证安装\n运行以下 Python 代码检查是否能成功导入并识别 GPU：\n```python\nimport cuml\nprint(cuml.__version__)\n```\n\n## 基本使用\n\ncuML 的 API 设计与 scikit-learn 高度一致。以下示例演示如何使用 cuDF 创建 GPU DataFrame，并利用 cuML 进行 DBSCAN 聚类。\n\n### 单 GPU 示例：DBSCAN 聚类\n\n```python\nimport cudf\nfrom cuml.cluster import DBSCAN\n\n# 1. 创建并填充 GPU DataFrame\ngdf_float = cudf.DataFrame()\ngdf_float['0'] = [1.0, 2.0, 5.0]\ngdf_float['1'] = [4.0, 2.0, 1.0]\ngdf_float['2'] = [4.0, 2.0, 1.0]\n\n# 2. 初始化模型并拟合\n# eps: 邻域半径，min_samples: 形成稠密区域所需的最小样本数\ndbscan_float = DBSCAN(eps=1.0, min_samples=1)\ndbscan_float.fit(gdf_float)\n\n# 3. 输出聚类标签\nprint(dbscan_float.labels_)\n```\n\n**输出结果：**\n```\n0    0\n1    1\n2    2\ndtype: int32\n```\n\n### 多 GPU 示例：分布式 K-近邻搜索 (可选)\n\n如果您拥有多张 GPU 并希望利用 Dask 进行分布式计算，可以使用 `cuml.dask` 模块：\n\n```python\nfrom dask_cuda import LocalCUDACluster\nfrom dask.distributed import Client\nimport dask_cudf\nfrom cuml.dask.neighbors import NearestNeighbors\n\n# 1. 
初始化本地 CUDA 集群 (启用 UCX 以获得高速传输)\ncluster = LocalCUDACluster(protocol=\"ucx\",\n                           enable_tcp_over_ucx=True,\n                           enable_nvlink=True,\n                           enable_infiniband=False)\nclient = Client(cluster)\n\n# 2. 并行读取数据 (假设有一个大型 CSV)\n# df = dask_cudf.read_csv(\"\u002Fpath\u002Fto\u002Flarge_dataset.csv\") \n# 此处为了演示，复用上面的逻辑构建一个简单的 Dask DataFrame\nimport cudf\nlocal_df = cudf.DataFrame({'0': [1.0, 2.0, 5.0], '1': [4.0, 2.0, 1.0]})\ndf = dask_cudf.from_cudf(local_df, npartitions=2)\n\n# 3. 拟合模型并查询\nnn = NearestNeighbors(n_neighbors=2, client=client)\nnn.fit(df)\nneighbors = nn.kneighbors(df)\n\nprint(neighbors)\n```\n\n### 兼容性提示\n*   **Scikit-learn 兼容**：cuML 兼容 scikit-learn 1.4+ 版本，大多数 estimator 可以直接替换 sklearn 中的对应类。\n*   **模型保存**：支持使用 `pickle` 或 `joblib` 序列化模型，但请务必仅加载来自可信来源的模型文件，以防恶意代码执行。","某电商数据团队需要在每晚对数亿条用户行为日志进行实时聚类分析，以识别异常刷单团伙并更新风控模型。\n\n### 没有 cuml 时\n- 处理亿级数据时，基于 CPU 的 scikit-learn 运行耗时极长，单次 DBSCAN 聚类往往需要数小时，难以满足夜间窗口期的时效要求。\n- 为了加速计算，工程师不得不投入大量精力编写复杂的 CUDA 内核代码或手动优化多进程并行逻辑，开发门槛极高。\n- 随着数据量增长，单机内存频繁溢出，被迫将数据拆分处理，导致算法无法捕捉全局分布特征，聚类准确率大幅下降。\n- 扩容成本高昂，若要通过增加 CPU 节点来缩短时间，不仅硬件投入巨大，还面临严峻的网络通信瓶颈。\n\n### 使用 cuml 后\n- 利用 GPU 并行加速能力，同样的亿级数据聚类任务从数小时缩短至几分钟，整体速度提升 10-50 倍，轻松赶上业务时效。\n- 直接复用与 scikit-learn 高度兼容的 Python API，数据科学家无需掌握底层 CUDA 编程即可调用 GPU 算力，开发效率显著提升。\n- 借助 Dask 集成轻松实现多卡甚至多机协同训练，海量数据可一次性载入显存集群进行全局计算，确保了模型精度。\n- 通过 UCX 高速传输协议优化多卡间通信，在单节点内即可线性扩展算力，大幅降低了硬件采购与维护成本。\n\ncuML 让传统机器学习算法在 GPU 上实现了“零代码改造”的百倍加速，彻底打破了大规模数据分析的性能与成本瓶颈。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Frapidsai_cuml_2b4ae54f.png","rapidsai","RAPIDS","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002Frapidsai_30174da9.png","Open GPU Data 
Science",null,"RAPIDSai","https:\u002F\u002Frapids.ai","https:\u002F\u002Fgithub.com\u002Frapidsai",[85,89,93,97,101,105,109,113,117,121],{"name":86,"color":87,"percentage":88},"C++","#f34b7d",31.5,{"name":90,"color":91,"percentage":92},"Python","#3572A5",29.8,{"name":94,"color":95,"percentage":96},"Cuda","#3A4E3A",25.8,{"name":98,"color":99,"percentage":100},"Cython","#fedf5b",8.8,{"name":102,"color":103,"percentage":104},"Jupyter Notebook","#DA5B0B",2.5,{"name":106,"color":107,"percentage":108},"CMake","#DA3434",0.7,{"name":110,"color":111,"percentage":112},"Shell","#89e051",0.6,{"name":114,"color":115,"percentage":116},"C","#555555",0.3,{"name":118,"color":119,"percentage":120},"HTML","#e34c26",0,{"name":122,"color":123,"percentage":120},"Dockerfile","#384d54",5168,621,"2026-04-03T05:25:15","Apache-2.0","Linux","必需 NVIDIA GPU，具体型号和显存大小未说明（依赖数据集大小），需支持 CUDA（具体版本需参考 RAPIDS 安装选择器，通常较新版本）","未说明",{"notes":132,"python":133,"dependencies":134},"cuML 是 RAPIDS 套件的一部分，旨在让数据科学家无需编写 CUDA 代码即可在 GPU 上运行机器学习任务。支持通过 Dask 进行多 GPU 和多节点分布式训练与推理。安装建议使用官方提供的 RAPIDS Release Selector 工具通过 conda、pip 或 Docker 进行。模型序列化支持 pickle 和 joblib，但需注意安全风险，仅加载可信来源的模型。部分算法（如随机森林的多节点支持）仍处于实验阶段。","未说明 (兼容 scikit-learn 1.4+)",[135,136,137,138,139,140,141,142],"cudf","dask-cuda","dask","distributed","ucxx","faiss","scikit-learn>=1.4","cloudpickle",[13],[145,146,147,148,149],"machine-learning-algorithms","machine-learning","cuda","gpu","nvidia","2026-03-27T02:49:30.150509","2026-04-06T08:42:03.086045",[153,158,163,168,172,177],{"id":154,"question_zh":155,"answer_zh":156,"source_url":157},12594,"如何在 RandomForestRegressor 中处理大数据集以避免 GPU 显存不足（OOM）错误？","1. 增加集群规模：例如使用两个 DGX-2 的集群可以将训练数据量大致翻倍。但需注意，推理时完整的树仍需能放入每个 worker 中，因此极大树尺寸仍有限制。\n2. 等待内存优化更新：开发团队计划在 0.15 版本早期改进推理阶段的内存消耗，并计划支持基于 CuPy 的 Dask 数组输入，这将显著减少内存开销。\n3. 
当前建议：尽量合理设置 max_depth 和 n_estimators 参数，避免生成过大的树模型。","https:\u002F\u002Fgithub.com\u002Frapidsai\u002Fcuml\u002Fissues\u002F1998",{"id":159,"question_zh":160,"answer_zh":161,"source_url":162},12595,"如何设置更大的 max_depth 和 n_trees 参数而不触发 OOM 错误？","在 cuML 0.12 版本中，默认会创建稠密表示导致 max_depth > 16 时显存溢出。该问题已在 0.13 版本修复：\n1. 现在用户可以通过设置 `fil_sparse_format` 变量选择稀疏或稠密表示。\n2. 默认情况下，当 `algo='auto'` 或 `algo='naive'` 时使用稀疏实现，否则使用稠密实现。\n3. 建议升级到 cuML 0.13 或更高版本（如 nightly 版），安装命令示例：`conda install -c rapidsai-nightly cuml`。升级后可成功训练 depth 超过 30 的模型。","https:\u002F\u002Fgithub.com\u002Frapidsai\u002Fcuml\u002Fissues\u002F1467",{"id":164,"question_zh":165,"answer_zh":166,"source_url":167},12596,"在循环中使用 RandomForestRegressor 配合 RMM 内存池时出现显存泄漏怎么办？","这是一个已知的 RMM 内存池 Bug，在反复拟合、预测和删除模型时会导致显存持续增长直至耗尽。该问题已通过 RMM 仓库的 PR #510 修复。解决方案是升级到包含此修复的 RMM 最新版本。根本原因是在尝试从已销毁的流回收内存块时未正确处理，可能导致段错误。","https:\u002F\u002Fgithub.com\u002Frapidsai\u002Fcuml\u002Fissues\u002F2632",{"id":169,"question_zh":170,"answer_zh":171,"source_url":162},12597,"能否对 Dask 随机森林模型进行 pickle 序列化保存？","目前不支持对 Dask 随机森林模型进行 pickle 操作。这是当前的功能限制，用户需寻找其他持久化方案（如手动保存模型参数或使用特定格式的导出功能，如果可用）。",{"id":173,"question_zh":174,"answer_zh":175,"source_url":176},12598,"使用 CSR 稀疏矩阵进行余弦相似度计算时报错 'radix_sort: failed on 2nd step: cudaErrorInvalidValue' 如何解决？","该错误通常与特定版本的 RAPIDS Docker 镜像（如 21.06）中的底层 CUDA 排序算法兼容性有关。虽然具体修复步骤在提供的片段中未完全显示，但此类问题通常需要通过以下方式解决：\n1. 确保升级到最新的稳定版或 nightly 版 RAPIDS 镜像，以获取最新的 bug 修复。\n2. 检查输入数据的格式是否符合 `csr_row_normalize_l2` 的要求（如数据类型是否为 float32，索引是否连续等）。\n3. 如果问题持续，尝试在较新的 Docker 标签（如 21.08 或更高）中运行代码，因为该错误可能在后续版本中已被修复。","https:\u002F\u002Fgithub.com\u002Frapidsai\u002Fcuml\u002Fissues\u002F4167",{"id":178,"question_zh":179,"answer_zh":180,"source_url":162},12599,"在 DGX-1\u002FDGX-2 上运行随机森林时，如何优化内存使用以支持更深的树？","关键优化点在于利用稀疏格式存储森林结构：\n1. 确认使用的是 cuML 0.13+ 版本，该版本默认在 `algo='auto'` 或 `algo='naive'` 时启用稀疏格式。\n2. 显式设置参数以确保使用稀疏表示，避免默认的稠密表示在大深度时消耗过多显存。\n3. 
对于超大规模数据，考虑结合 Dask 分布式训练，将数据分片到多个 GPU worker 上，同时注意每个 worker 仍需容纳完整的树结构用于推理。",[182,187,192,197,202,207,212,217,222,227,232,237,242,247,252,257,262,267,272,277],{"id":183,"version":184,"summary_zh":185,"released_at":186},62995,"v26.02.00","\u003C!-- 发布说明由 .github\u002Frelease.yml 中的配置在 v26.02.00 版本生成 -->\n\n## 变更内容\n### 🚨 重大变更\n* 由 @jcrist 在 https:\u002F\u002Fgithub.com\u002Frapidsai\u002Fcuml\u002Fpull\u002F7539 中简化类型反射实现\n* 由 @jcrist 在 https:\u002F\u002Fgithub.com\u002Frapidsai\u002Fcuml\u002Fpull\u002F7612 中弃用 SVC\u002FSVR 的 `TotalIters`\n* 由 @jcrist 在 https:\u002F\u002Fgithub.com\u002Frapidsai\u002Fcuml\u002Fpull\u002F7628 中从公共 API 中弃用句柄\n* 由 @jcrist 在 https:\u002F\u002Fgithub.com\u002Frapidsai\u002Fcuml\u002Fpull\u002F7649 中避免在 `__init__` 中将 `output_type=None` 强制转换为全局 `output_type`\n* 由 @JohnZed 在 https:\u002F\u002Fgithub.com\u002Frapidsai\u002Fcuml\u002Fpull\u002F7687 中移除未使用的 QR 分解 MG 代码\n* 由 @betatim 在 https:\u002F\u002Fgithub.com\u002Frapidsai\u002Fcuml\u002Fpull\u002F7667 中使用 scikit-learn 的 train_test_split\n### 🐛 错误修复\n* 由 @robertmaynard 在 https:\u002F\u002Fgithub.com\u002Frapidsai\u002Fcuml\u002Fpull\u002F7583 中避免在 MG 关闭时链接 cumlprims_mg\n* 由 @aamijar 在 https:\u002F\u002Fgithub.com\u002Frapidsai\u002Fcuml\u002Fpull\u002F7587 中修复 UMAP 中导致非法内存访问的 int32 溢出问题\n* 由 @dantegd 在 https:\u002F\u002Fgithub.com\u002Frapidsai\u002Fcuml\u002Fpull\u002F7604 中显式初始化 SVC 构造函数中未初始化的 SvmParameter 字段\n* 由 @csadorf 在 https:\u002F\u002Fgithub.com\u002Frapidsai\u002Fcuml\u002Fpull\u002F7634 中使用生成的数据集替代真实的 Covertype 数据集\n* 由 @csadorf 在 https:\u002F\u002Fgithub.com\u002Frapidsai\u002Fcuml\u002Fpull\u002F7637 中移除测试中对远程数据集的依赖\n* 由 @csadorf 在 https:\u002F\u002Fgithub.com\u002Frapidsai\u002Fcuml\u002Fpull\u002F7644 中在 BERTopic 轮子集成测试中使用合成数据集\n* 由 @csadorf 在 https:\u002F\u002Fgithub.com\u002Frapidsai\u002Fcuml\u002Fpull\u002F7636 中重构用于输入验证的 check_ptr 函数中的指针比较逻辑\n* 由 @dantegd 在 https:\u002F\u002Fgithub.com\u002Frapidsai\u002Fcuml\u002Fpull\u002F7653 中将 XGBoost 重新加入测试\n* 由 @aamijar 在 
https:\u002F\u002Fgithub.com\u002Frapidsai\u002Fcuml\u002Fpull\u002F7677: mark the spectral embedding tests as xfail\n* By @bdice in https:\u002F\u002Fgithub.com\u002Frapidsai\u002Fcuml\u002Fpull\u002F7668: update the RMM memory resource API usage to the reference-based equivalents\n* By @csadorf in https:\u002F\u002Fgithub.com\u002Frapidsai\u002Fcuml\u002Fpull\u002F7681: mark the 'test_tsne_distance_metrics_on_sparse_input' test as xfail\n* By @viclafargue in https:\u002F\u002Fgithub.com\u002Frapidsai\u002Fcuml\u002Fpull\u002F7574: implement Dask kNN and DBSCAN using NCCL\n* By @csadorf in https:\u002F\u002Fgithub.com\u002Frapidsai\u002Fcuml\u002Fpull\u002F7686: mark some of the common test_estimators tests for SpectralEmbedding() as flaky\n* By @csadorf in https:\u002F\u002Fgithub.com\u002Frapidsai\u002Fcuml\u002Fpull\u002F7694: handle the new sklearn LARS error message and account for it in test_typeerror_input\n* By @robertmaynard in https:\u002F\u002Fgithub.com\u002Frapidsai\u002Fcuml\u002Fpull\u002F7582: make build.sh support custom LIBCUML_BUILD_DIR values\n* By @jinsolp in https:\u002F\u002Fgithub.com\u002Frapidsai\u002Fcuml\u002Fpull\u002F7597: fix the outlier issue in UMAP when `random_state` is provided\n* By @csadorf in https:\u002F\u002Fgithub.com\u002Frapidsai\u002Fcuml\u002Fpull\u002F7715: work around the sentence-transformer regression in the BERTopic integration tests, and do not fail the integration tests because of it\n* By @viclafargue in https:\u002F\u002Fgithub.com\u002Frapidsai\u002Fcuml\u002Fpull\u002F7691: refactor the classlabels utilities\n### 📖 Documentation\n* By @virchan in https:\u002F\u002Fgithub.com\u002Frapidsai\u002Fcuml\u002Fpull\u002F7218: rewrite the LinearRegression documentation\n* `","2026-02-05T08:03:16",{"id":188,"version":189,"summary_zh":190,"released_at":191},62996,"v25.12.00","\u003C!-- Release notes generated using the configuration in .github\u002Frelease.yml at v25.12.00 -->\n\n## What's Changed\n### 🚨 Breaking Changes\n* Make `dask` an optional dependency, by @jcrist in https:\u002F\u002Fgithub.com\u002Frapidsai\u002Fcuml\u002Fpull\u002F7303\n* HDBSCAN: use the `int64_t` type and remove the `int` type, by @jinsolp in https:\u002F\u002Fgithub.com\u002Frapidsai\u002Fcuml\u002Fpull\u002F7104\n* UMAP - deferred embedding allocation, by @viclafargue in https:\u002F\u002Fgithub.com\u002Frapidsai\u002Fcuml\u002Fpull\u002F7313\n* Stop inferring the dtype from models in 
`cuml.explainer`, by @jcrist in https:\u002F\u002Fgithub.com\u002Frapidsai\u002Fcuml\u002Fpull\u002F7358\n* Clean up LinearRegression, by @jcrist in https:\u002F\u002Fgithub.com\u002Frapidsai\u002Fcuml\u002Fpull\u002F7355\n* Clean up `AgglomerativeClustering`, by @jcrist in https:\u002F\u002Fgithub.com\u002Frapidsai\u002Fcuml\u002Fpull\u002F7379\n* Clean up `LinearSVC`\u002F`LinearSVR`, by @jcrist in https:\u002F\u002Fgithub.com\u002Frapidsai\u002Fcuml\u002Fpull\u002F7376\n* Require CUDA 12.2 or later, by @jakirkham in https:\u002F\u002Fgithub.com\u002Frapidsai\u002Fcuml\u002Fpull\u002F7408\n* Multiple improvements to `Ridge`, by @jcrist in https:\u002F\u002Fgithub.com\u002Frapidsai\u002Fcuml\u002Fpull\u002F7410\n* Expose `n_iter_` in SVC\u002FSVR, by @jcrist in https:\u002F\u002Fgithub.com\u002Frapidsai\u002Fcuml\u002Fpull\u002F7461\n* Unify the dtype of regression `predict` outputs, by @jcrist in https:\u002F\u002Fgithub.com\u002Frapidsai\u002Fcuml\u002Fpull\u002F7464\n* Support non-numeric class labels everywhere, by @jcrist in https:\u002F\u002Fgithub.com\u002Frapidsai\u002Fcuml\u002Fpull\u002F7480\n* Rename TSNE `n_iter` to `max_iter`, by @jcrist in https:\u002F\u002Fgithub.com\u002Frapidsai\u002Fcuml\u002Fpull\u002F7500\n* Clean up `cuml.multiclass`, by @jcrist in https:\u002F\u002Fgithub.com\u002Frapidsai\u002Fcuml\u002Fpull\u002F7508\n* Clean up `SGD`\u002F`MBSGDClassifier`\u002F`MBSGDRegressor`, by @jcrist in https:\u002F\u002Fgithub.com\u002Frapidsai\u002Fcuml\u002Fpull\u002F7504\n### 🐛 Bug Fixes\n* Avoid a pyyaml runtime dependency, by @csadorf in https:\u002F\u002Fgithub.com\u002Frapidsai\u002Fcuml\u002Fpull\u002F7312\n* Make the scikit-learn SVM to cuML conversion more robust, by @csadorf in https:\u002F\u002Fgithub.com\u002Frapidsai\u002Fcuml\u002Fpull\u002F7324\n* Revert 'CI: automatically assign priority to PRs from linked issues (#7354)', by @csadorf in https:\u002F\u002Fgithub.com\u002Frapidsai\u002Fcuml\u002Fpull\u002F7357\n* Add a check on UMAP's precomputed KNN feature, by @viclafargue in https:\u002F\u002Fgithub.com\u002Frapidsai\u002Fcuml\u002Fpull\u002F7300\n* Unpin treelite and xfail some tests affected by the XGBoost\u002Ftreelite incompatibility, by 
@csadorf in https:\u002F\u002Fgithub.com\u002Frapidsai\u002Fcuml\u002Fpull\u002F7366\n* Fix issues in Dask DBSCAN, by @viclafargue in https:\u002F\u002Fgithub.com\u002Frapidsai\u002Fcuml\u002Fpull\u002F7359\n* CI: extend the nightly CI check window to 14 days, by @csadorf in https:\u002F\u002Fgithub.com\u002Frapidsai\u002Fcuml\u002Fpull\u002F7392\n* Fix an overflow in coordinate descent, by @jcrist in https:\u002F\u002Fgithub.com\u002Frapidsai\u002Fcuml\u002Fpull\u002F7399\n* Fix passing the output argument by value, which invoked the copy constructor of an owning CSR matrix, by @achirkin in https:\u002F\u002Fgithub.com\u002Frapidsai\u002Fcuml\u002Fpull\u002F7390\n* Fix the Dask DBSCAN tree reduction, by @viclafargue in https:\u002F\u002Fgithub.com\u002Frapidsai\u002Fcuml\u002Fpull\u002F7398\n* Fix the test_kernel_density test, by @csadorf in https:\u002F\u002Fgithub.com\u002Frapidsai\u002Fcuml\u002Fpull\u002F7404\n* Fetch the rapids-cmake configuration for cmake-format in the CI style check, by @bdice in https:\u002F\u002Fgithub.com\u002Frapidsai\u002Fcuml\u002Fpull\u002F7406\n* Fix a string-slicing dtype mismatch in PorterStemmer under the cudf-pandas module","2025-12-11T05:33:31",{"id":193,"version":194,"summary_zh":195,"released_at":196},62997,"v25.10.00","## 🚨 Breaking Changes\n\n- Deprecate the `convert_to_*` methods in `cuml.ensemble` in favor of the `as_*` methods (#7254) @jcrist\n- Fix the output type and dtype of `KernelDensity.score_samples` (#7240) @jcrist\n- Remove the `get_json`, `get_detailed_text`, and `get_summary_text` methods from ensemble estimators (#7177) @jcrist\n- Deprecate `accuracy_metric` (#7170) @jcrist\n- Deprecate `predict_model` in `cuml.ensemble` and `cuml.dask.ensemble` (#7155) @jcrist\n- Remove deprecated `cuml.accel` CLI options (#7110) @jcrist\n- Remove deprecation warnings in the 25.10 release (#7109) @jcrist\n- Remove UMAP's deprecation warning and the `data_on_host` option (#7099) @jinsolp\n- Fix UMAP graph thresholding (#6595) @viclafargue\n\n## 🐛 Bug Fixes\n\n- Set the NCCL rpath for `cu13` wheels (#7304) @divyegala\n- Add the NCCL library path to libcuml's CMakeLists.txt (#7281) @csadorf\n- Ensure values passed to SpectralEmbedding are finite (#7280) @jcrist\n- Ensure attribute consistency across sklearn round-trip conversion (#7278) @jcrist\n- Fix the SpectralEmbedding precomputed option at 0% sparsity (#7271) @aamijar\n- Fix the reference object in the test_onehot_inverse_transform_handle_unknown test (#7246) @mroeschke\n- Slightly increase the tolerance of the ridge regression tests (#7243) 
@csadorf\n- Increase the float32 rtol in test_complement_partial_fit to reduce intermittent failures (#7237) @csadorf\n- Conditionally xfail test_umap_fit_transform_score for nrows=500 (#7232) @csadorf\n- Fix `StandardScaler.n_samples_seen_` (#7209) @jcrist\n- Validate the `sample_weight` parameter in `KernelDensity.fit` (#7208) @jcrist\n- Support non-contiguous inputs in `input_to_host_array` (#7207) @jcrist\n- Set the correct min_samples value for HDBSCAN single_linkage (#7195) @tarang-jain\n- Raise an appropriate exception when HDBSCAN's min_samples exceeds the number of samples (#7193) @tarang-jain\n- Remove the description and handling of `solver=\"cd\"` from the `Ridge` documentation (#7190) @jcrist\n- Raise a friendly error in `KMeans` if `n_samples \u003C n_clusters` (#7189) @jcrist\n- Further increase the tolerance of the test_random_seed_consistency test (#7180) @csadorf\n- Download test data ahead of time using a custom plugin (#7169) @betatim\n- Address UMAP's outlier problem by checking for outliers and shuffling (#7131) @jinsolp\n- Note the non-determinism of random projection transforms (#7129) @jcrist\n- Rewrite the random projection estimators (#7119) @jcrist\n- Prevent CUDA issues when running the scikit-learn compatibility test suite for UMAP (#7107) @viclafargue\n- Fix double counting of some fallback methods in the accel profiler (#7101) @jcrist\n- Do not call the to_output method on cupy arrays (#7044) @Matt711\n- Add a fix for devices without a memory resource (#6823) @viclafargue\n- Fix UMAP graph thresholding (#6595) @viclafargue\n\n## 📖 Documentation\n\n- Add a warning about subprocesses to the cuml.accel profiler (#7290) @jcrist\n- Revise the documentation for cuML 25.10 (#7228) @csadorf\n- Provide a GitHub issue template for reporting CI failures (#7178) @csadorf\n- Fix redirects for the cuml-accel docs (#7139) @jcrist\n\n## 🚀 New Features\n\n- Update the cufft macro definitions to be compatible with CUDA 13 types (#7094) @robertmaynard\n\n## 🛠️ Improvements\n\n- Fix a missed dependency in","2025-10-08T22:37:51",{"id":198,"version":199,"summary_zh":200,"released_at":201},62998,"v25.12.00a","## 🔗 Links\n\n- [Development branch](https:\u002F\u002Fgithub.com\u002Frapidsai\u002Fcuml\u002Ftree\u002Fbranch-25.12)\n- [Compare with the `main` branch](https:\u002F\u002Fgithub.com\u002Frapidsai\u002Fcuml\u002Fcompare\u002Fmain...branch-25.12)\n\n## 🚨 Breaking Changes\n\n- Stop inferring the dtype from models in `cuml.explainer` (#7358) @jcrist\n- Clean up LinearRegression (#7355) @jcrist\n- UMAP - deferred embedding allocation (#7313) @viclafargue\n- Make `dask` an optional dependency (#7303) @jcrist\n- HDBSCAN: use `int64_t` and deprecate the `int` type (#7104) @jinsolp\n\n## 🐛 Bug Fixes\n\n- Revert 'CI: automatically assign priority to PRs from linked issues (#7354)' (#7357) @csadorf\n- Make the scikit-learn SVM 
to cuML conversion more robust (#7324) @csadorf\n- Avoid a pyyaml runtime dependency. (#7312) @csadorf\n- Add a check on UMAP's precomputed KNN feature (#7300) @viclafargue\n\n## 📖 Documentation\n\n- Add the spectral embedding algorithm to README.md (#7348) @aamijar\n\n## 🚀 New Features\n\n- Provide a static target for `libcuml` (#7351) @divyegala\n\n## 🛠️ Improvements\n\n- Stop inferring the dtype from models in `cuml.explainer` (#7358) @jcrist\n- Clean up LinearRegression (#7355) @jcrist\n- CI: automatically assign priority to PRs from linked issues (#7354) @csadorf\n- Add xgboost to the benchmark tools (#7350) @dantegd\n- Remove obsolete link flags from CMakeLists.txt (#7349) @hcho3\n- Clean up SVM (#7347) @jcrist\n- Enable `sccache-dist` connection pooling (#7344) @trxcllnt\n- Add a `libcuml` link smoke test (#7343) @divyegala\n- Use `pinned_host_memory_resource` instead of `pinned_memory_resource`. (#7340) @bdice\n- Remove deprecated `cuml.ensemble` and `cuml.dask.ensemble` functionality (#7332) @jcrist\n- Clean up `Ridge` (#7330) @jcrist\n- Speed up linear model prediction on C-ordered inputs (#7329) @jcrist\n- Clean up `*.pxd` files and `cimport` statements (#7327) @jcrist\n- Release the GIL in `SpectralEmbedding` (#7326) @jcrist\n- Clean up DBSCAN (#7325) @jcrist\n- Reduce warning verbosity in tests (#7322) @csadorf\n- Fix the flaky `test_logistic_regression_weighting` test (#7321) @csadorf\n- Some cleanups in `cuml.neighbors` (#7320) @jcrist\n- Some cleanups in HDBSCAN (#7319) @jcrist\n- Clean up `cuml.decomposition` (#7316) @jcrist\n- Run the scikit-learn test suite in parallel again (#7315) @csadorf\n- UMAP - deferred embedding allocation (#7313) @viclafargue\n- Clean up the TSNE Python implementation (#7311) @jcrist\n- Propagate `Dask`\u002F`UCX` exceptions (#7308) @viclafargue\n- Add SparseRandomProjection, AgglomerativeClustering, and GaussianRandomProjection to the common checks (#7307) @betatim\n- Make `dask` an optional dependency (#7303) @jcrist\n- Update to rapids-logger 0.2 (#7301) @bdice\n- Branch 25.12: merge branch 25.10 (#7291) @jcrist\n- Avoid the deprecated cudf `from_pandas` method (#7284) @TomAugspurger\n- Update `RAPIDS_BRANCH` and codify the change in the `update-version.sh` script (#7268) @KyleFromNVIDIA\n- HDBSCAN: use `int64_t` and deprecate the `int` type (#7104) @jinsolp","2025-10-03T17:57:49",{"id":203,"version":204,"summary_zh":205,"released_at":206},62999,"v25.08.00","# cuML 25.08 Release Notes\n\n## 🎉 What's New\n\n## ⭐ Highlights\n\n- **Spectral Embedding**: a new algorithm for dimensionality reduction and manifold learning (#6581) @aamijar\n- **cuML.accel profiler**: adds profiling capabilities for zero-code-change acceleration (#7021) @jcrist\n- **cuML.accel 
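LinearSVC\u002FLinearSVR**: adds support for linear support vector classification and regression (#6866) @viclafargue\n- **cuML.accel set_output\u002Fget_feature_names_out**: adds support for scikit-learn output configuration (#6942) @jcrist\n\n### 🔧 Major Improvements\n\n#### UMAP Enhancements\n- Multi-GPU KNN graph construction support (#7019) @jinsolp\n- Improved handling of identical vectors in distance computation (#6904) @jinsolp\n- Disabled non-determinism on small datasets to improve reproducibility (#7004) @viclafargue\n\n#### FIL (Forest Inference Library) Improvements\n- Support for wide-data inference (#7014) @hcho3\n- Better handling of empty categorical nodes (#6924) @hcho3\n- Improved GPU context management (#6987) @hcho3\n- Restored the legacy threshold behavior (#6922) @hcho3\n\n#### Zero-Code-Change Acceleration (`cuml.accel`)\n- Added a profiler for better performance analysis (#7021) @jcrist\n- Enhanced logging for proxy estimators (#6957) @csadorf\n- Implemented metadata routing support (#6950) @jcrist\n- Added support for `LinearSVC` and `LinearSVR` (#6866) @viclafargue\n- Added `KernelRidge` to the list of supported algorithms (#6917) @jcrist\n- Improved the command-line interface with support for the `-c` and `-` options (#6852) @jcrist\n\n#### Algorithm Enhancements\n- **DBSCAN**: the `components_` attribute is now computed (#6976) @jcrist\n- **Logistic Regression**: exposes the `n_iter_` attribute for tracking the iteration count (#6911) @betatim\n- **Random Forest**: fixed the default `max_features` parameter (#6862) @jcrist\n- **TSNE**: added fallback support for unsupported metrics (#6992) @jcrist\n- **Ridge Regression**: better handling of underdetermined systems (#7003) @betatim\n\n#### Developer Experience\n- **Testing**: strengthened CI coverage for HDBSCAN, UMAP, and other algorithms by integrating upstream test suites (#6995, #6989, #6986) @jcrist\n- **Documentation**: comprehensively updated the Python developer guide and API documentation (#6843) @csadorf\n- **Dependencies**: upgraded to CUDA 12.9 and added support for scikit-learn 1.4 (#6944, #6845) @jakirkham, @betatim\n\n## 🚨 Breaking Changes\n\n### Deprecated Parameters and Functions\n- **UMAP**: the `data_on_host` parameter is deprecated (#6953) @jinsolp\n- **HDBSCAN**:\n  - Prediction functions in the `cuml.cluster` namespace are deprecated (#6943) @jcrist\n  - The `connectivity` parameter is deprecated (#6936) @jcrist\n- **SGD algorithms**: `penalty='none'` is deprecated in `MBSGDClassifier`, `MBSGDRegressor`, and `SGD` (#6926) @jcrist\n- **KMeans**: the default value of `random_state` changed to `None` (#6884) @jcrist\n\n### Removed Components\n- **Experimental FIL**: removed the `experimental.fil` Python module (#6899) @hcho3\n- **Legacy FIL**: removed from libcuml (#6844) @hcho3\n- **CUDA 11 support**: removed from","2025-08-06T20:19:52",{"id":208,"version":209,"summary_zh":210,"released_at":211},63000,"v25.06.00","## 🚨 Breaking Changes\n\n- Deprecate device selection (#6784) @jcrist\n- Remove use of legacy FIL in Python cuML (#6728) @hcho3\n- Use the cuVS RBC (#6644) @divyegala\n- Map the Barnes-Hut algorithm to FFT for T-SNE in cuml.accel 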
(#6619) @csadorf\n- New Estimator Proxy architecture (#6613) @jcrist\n- Disable building and uploading cuml-cpu in CI (#6529) @dantegd\n- Fix: propagate the random state to the numpy random number generator in `make_classification` (#6518) @betatim\n\n## 🐛 Bug Fixes\n\n- Hotfix for the UMAP batched nnd test (#6826) @jinsolp\n- Update forest_inference_demo.ipynb for the new FIL API changes (#6824) @dantegd\n- Fix compilation of the FIL `infer_kernel` on CUDA arch 1210 (#6821) @viclafargue\n- Lower the threshold for the solver test in the ElasticNet tests. (#6766) @csadorf\n- Use the _assert_allclose function for approximate equality checks. (#6763) @csadorf\n- UMAP spectral initialization falls back to random initialization on error (#6750) @aamijar\n- [FIX] Set `include_self=False` in kneighbors_graph for duplicate points (zero distance) (#6735) @aamijar\n- Fix: set n_init to 10 for scikit-learn KMeans to match cuML (#6727) @csadorf\n- Fix the output dtype of `LinearSVC.predict` (#6715) @jcrist\n- Correctly bind the updated compression flags for CUDA 12.9 (#6713) @robertmaynard\n- Fix the type and support of `PCA.noise_variance_` (#6693) @jcrist\n- Update partial_inference for the dask RF regressor (#6691) @TomAugspurger\n- Correctly handle degenerate trees in FIL (#6673) @hcho3\n- Fix the strict marker in the scikit-learn test xfail list. (#6661) @csadorf\n- Docs: fix docstring formatting (#6659) @betatim\n- Fix the field ID in the pr\u002Fissue status automation (#6656) @csadorf\n- Adjust how scikit-learn test failures are handled (#6646) @csadorf\n- Remove the arbitrary score threshold from the softmax qn test (#6636) @csadorf\n- Do not set the `n_features_in_` attribute before `fit` (#6624) @betatim\n- Map the Barnes-Hut algorithm to FFT for T-SNE in cuml.accel (#6619) @csadorf\n- Mark the sklearn test test_cross_val_predict[coo_array] as flaky. (#6610) @csadorf\n- Stop running the scikit-learn test suite in parallel. (#6609) @csadorf\n- Mark one kmeans and one t_sne sklearn test as flaky (#6598) @csadorf\n- Relax the threshold for the failing ARIMA pytest (#6579) @divyegala\n- Fix a race condition in the `_block_gemv` kernel (#6578) @divyegala\n- Fix the launch parameters of the shap kernels (#6577) @divyegala\n- Update the SVC tests for the CCCL update (#6569) @viclafargue\n- Update the FIL model loading parameters in the FIL demo notebook (#6562) @csadorf\n- Localize the output dtype; remove the global set_api_output_dtype calls (#6561) @Ofek-Haim\n- Lower the threshold of the logistic regression digits classification test to 0.9. (#6552) @csadorf\n- Fix: propagate the random state to the numpy random number generator in `make_classification` (#6518) @betatim\n- Fix constructing cudf.DataFrame from dict-like inputs, and ensure CumlArray.to_output('cudf') does not convert NaN to NA (#6517) @mroeschke\n- Call gc in the SVC tests (#6514) @viclafargue\n- Fix the logging macros (#6511) @vyasr\n- 
Fixes and cleanup for the simplicial set functions (#6493) @viclafargue\n- Check the KNN graph during smooth KNN generation, and if n","2025-06-06T12:44:50",{"id":213,"version":214,"summary_zh":215,"released_at":216},63001,"v25.04.00","## 🚨 Breaking Changes\n\n- Promote experimental FIL to stable (#6464) @wphicks\n- Support non-trivial `classes_` in `LogisticRegression` (#6346) @jcrist\n- Use the new rapids-logger library (#6289) @vyasr\n\n## 🐛 Bug Fixes\n\n- Unblock CI for 25.04 (#6519) @csadorf\n- Skip the `test_rf_classification_seed` test in the cudf.pandas tests. (#6500) @csadorf\n- Remove the dask to sparse matrix workaround (#6489) @TomAugspurger\n- Fix an accidental required import of sklearn (#6483) @jcrist\n- Fix passing instances to meta-estimator constructors in `SVC` (#6471) @betatim\n- Fix compiler dependencies on ARM (#6456) @bdice\n- Revert &quot;Temporarily increase `max_days_without_success` (#6390)&quot; (#6455) @divyegala\n- Enhance UniversalBase argument handling to accept NoneType (#6453) @csadorf\n- Fix UMAP transform (#6449) @viclafargue\n- For KMeans, fall back to CPU when the input is a sparse matrix (#6448) @csadorf\n- Limited support for array-like inputs (#6442) @csadorf\n- Fix the `test_accuracy_score` test in `cudf.pandas` builds (#6439) @jcrist\n- Support positional arguments in `cuml.accel` estimators (#6423) @jcrist\n- Fix the `metric` entry in the HDBSCAN Python docs (#6422) @divyegala\n- Declare a runtime dependency on &#39;packaging&#39;, update scikit-learn and hdbscan, and align cuml-cpu with cuml (#6420) @jameslamb\n- Fix passing of `initial_alpha` and `learning_rate` in UMAP (#6417) @jcrist\n- Implement the Ridge .solver_ estimated attribute (#6415) @csadorf\n- Fix multi-target prediction in linear models (#6414) @csadorf\n- Ensure `output_type=&quot;pandas&quot;` returns user-friendly pandas data structures (#6407) @jcrist\n- Ensure `LinearSVC` supports all input types (#6404) @jcrist\n- Make UMAP callbacks serializable (#6402) @jcrist\n- Correctly align trees in experimental FIL (#6397) @wphicks\n- Some cleanups to log-level handling (#6393) @jcrist\n- Fix a cupy error in `KernelDensity` with the `epanechnikov` kernel (#6388) @jcrist\n- Check whether the ground truth and the result are both NaN in the RF MSEObjectiveTest (#6387) @wphicks\n- Support non-native-endian inputs in `LabelEncoder` (#6384) @jcrist\n- Fix the hypothesis test in `test_kernel_ridge.py` (#6382) @jcrist\n- Remove debug logging from the nearest neighbors tests (#6376) @csadorf\n- Ensure the CPU version of FIL can run with no GPU available (#6373) @wphicks\n- Correctly pass unhashable objects during hyperparameter lookup (#6369) @wphicks\n- Correctly translate the RandomForest criterion hyperparameter (#6363) @wphicks\n- Improve UMAP's fallback mechanism (#6358) @viclafargue\n- Some GPU&lt;-&gt;CPU interoperability fixes 
(#6355) @jcrist\n- Fix the output type of `KernelRidge.predict` (#6354) @jcrist\n- Skip the `test_extract_partitions_shape` test. (#6338) @csadorf\n- Require sphinx&lt;8.2.0 (#6336) @csadorf\n- Fix calling NearestNeighbors.kneighbors() without arguments. (#6333) @csadorf\n- Comment out debug print statements in UMAP (#6332) @viclafargue\n- Correctly set the UMAP dispatch trigger conditions (#6330) @viclafargue\n- Fix issues caused by None arguments supplied by sklearn pipelines (#6326) @viclafargue\n- Improve random forest interoperability (#6320) @dantegd\n-","2025-04-09T21:28:25",{"id":218,"version":219,"summary_zh":220,"released_at":221},63002,"v25.02.01","## 🚨 Breaking Changes\n\n- Update pip devcontainers to UCX 1.18 (#6249) @jameslamb\n\n## 🐛 Bug Fixes\n\n- Remove leftover click options (#6381) @dantegd\n- Fix a Dask logistic regression segfault\u002Fhang caused by an incorrect variable type (#6281) @dantegd\n- Log UMAP arrays at the trace verbosity level. (#6274) @csadorf\n- Ensure all method signatures are compatible with scikit-learn (#6260) @jcrist\n- Fix an illegal memory access error in UMAP transform when data_on_host=True (#6259) @csadorf\n- Ignore the cuDF __dataframe__ deprecation warning. (#6229) @bdice\n- Fix the impact of cuDF changes on the Porter stemmer and adjust the ARIMA pytests (#6227) @dantegd\n- Avoid duplicate log entries (#6222) @jcrist\n- Further fixes for the impact of the Scipy 1.15 update on PR and nightly CI (#6213) @dantegd\n- Update the call to Scipy's setulb for the new signature in 1.15 (#6207) @dantegd\n- Adjust the test_kmeans test to avoid false failures (#6193) @dantegd\n- Adjust the tolerance of the logistic regression `log_proba` pytest to avoid false failures (#6188) @dantegd\n- Skip the flaky kernel_density test in the CUDA 12.0.1 nightly job (#6184) @dantegd\n- Try to reduce network usage in the cuML tests. (#6174) @bdice\n- cuML dask fixes to unblock CI (#6170) @dantegd\n- Remap BATCH_TREE_REORG to TREE_REORG in FIL (#6161) @wphicks\n\n## 📖 Documentation\n\n- Fix GitHub links in pyx files (#6202) @thomasjpfan\n\n## 🚀 New Features\n\n- Allow CUDA ODR violations in 25.02 (#6264) @robertmaynard\n- Define block sizes for sm_120 (#6250) @robertmaynard\n- Support `alpha=0` in `Ridge` (#6236) @jcrist\n- Add `as_sklearn` and `from_sklearn` APIs to supported models for serialization to CPU scikit-learn estimators (#6102) @dantegd\n\n## 🛠️ Improvements\n\n- Backport 25.04 PRs to the 25.02.01 patch release (#6329) @dantegd\n- Use `rapids-pip-retry` in CI jobs that may need retries (#6293) @gforsyth\n- Avoid large device allocations when UMAP uses nndescent (#6292) @jcrist\n- Revert the CUDA 12.8 shared workflow branch changes (#6282) @vyasr\n- Build and test with CUDA 12.8.0 (#6266) @bdice\n- Update pip devcontainers to UCX 1.18 (#6249) @jameslamb\n- 
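Remove deprecated thrust functionality and replace it with libcu++ functionality (#6248) @miscco\n- Add an upper bound to prevent use of numba 0.61.0 (#6244) @galipremsagar\n- Normalize whitespace (#6238) @bdice\n- Rename cpp\u002Ftest to cpp\u002Ftests. (#6237) @bdice\n- Use the cuda.bindings layout. (#6233) @bdice\n- Skip dispatching to the GPU for distance metrics not implemented in UMAP (#6224) @betatim\n- Use GCC 13 in CUDA 12 conda builds. (#6221) @bdice\n- Declare the cuda-python dependency for wheels, plus other small packaging changes (#6217) @jameslamb\n- Bump Treelite to 4.4.1 (#6212) @hcho3\n- Support raft's logger targets (#6208) @vyasr\n- Build the logging module with rapids-cmake (#6205) @vyasr\n- Consolidate pytest configuration in pyproject.toml (#6201) @jameslamb\n- Introduce libcuml wheels (#6199) @jameslamb\n- Check whether nightly builds have succeeded recently (#6196) @vyasr\n- Remove the sphinx version pin (#6195) @vyasr\n- Simplify the wheel CI scripts, plus other small packaging changes (#6190) @jameslamb\n- Update to ","2025-03-03T21:13:57",{"id":223,"version":224,"summary_zh":225,"released_at":226},63003,"v24.12.00","## 🚨 Breaking Changes\n\n- Forward-merge branch 24.10 into branch 24.12 (#6106) @divyegala\n\n## 🐛 Bug Fixes\n\n- Fix an SSL error. (#6177) @bdice\n- Fix the `scikit-learn` version specifier (#6158) @trxcllnt\n- Correctly handle missing categorical data in experimental FIL (#6132) @wphicks\n- Set an upper bound on cuda-python (#6131) @bdice\n- Stop assuming that pointers are mutually exclusive between device and host. (#6128) @robertmaynard\n- cuml SINGLEGPU now tells cuvs not to build with nccl\u002Fmg support (#6127) @robertmaynard\n- Remove type information from the CumlArray pickle header (#6120) @wphicks\n- Forward-merge branch 24.10 into branch 24.12 (#6106) @divyegala\n- Fix serialization of Dask estimators before training (#6065) @viclafargue\n\n## 🚀 New Features\n\n- Enable HDBSCAN `gpu` training and `cpu` inference (#6108) @divyegala\n\n## 🛠️ Improvements\n\n- Update the FIL tests to use XGBoost UBJSON instead of the binary format (#6153) @hcho3\n- Use the sparse k-nearest-neighbors\u002Fdistance computations from cuvs (#6143) @benfred\n- Ensure MG performs the same number of allreduce calls in the sparse-matrix mean_stddev to avoid hangs (#6141) @lijinf2\n- Stop excluding cutlass from the symbol exclusion check (#6140) @vyasr\n- Optimize the MG variance computation used when standardizing logistic regression datasets (#6138) @lijinf2\n- Enforce wheel size limits and README formatting in CI (#6136) @jameslamb\n- Experimental command-line interface UX (#6135) @dantegd\n- Add telemetry (#6126) @msarahan\n- Make cuVS optional when CUML_ALGORITHMS is set (#6125) @hcho3\n- devcontainer: replace `VAULT_HOST` with `AWS_ROLE_ARN` (#6118) @jjacobelli\n- Print sccache statistics during builds (#6111) @jameslamb\n- Fix the version number in the Doxygen docs (#6104) @jameslamb\n- Make conda installs in CI stricter 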
(#6103) @jameslamb\n- Make `get_param_names` a class method on single-GPU estimators to match Scikit-learn more closely (#6101) @dantegd\n- Prune workflows based on changed files (#6094) @KyleFromNVIDIA\n- Update all rmm imports to use pylibrmm\u002Flibrmm (#6084) @Matt711\n- Merge branch 24.10 into branch 24.12 (#6083) @jameslamb","2024-12-11T21:41:19",{"id":228,"version":229,"summary_zh":230,"released_at":231},63004,"v24.10.00","## 🚨 Breaking Changes\n\n- Remove the legacy dask-glm-based logistic regression (#6028) @dantegd\n\n## 🐛 Bug Fixes\n\n- Fix train_test_split for string columns (#6088) @dantegd\n- Stop shadowing free functions (#6076) @vyasr\n- Set default values for conftest options. (#6067) @bdice\n- Add license files to the conda packages (#6061) @raydouglass\n- Correct np.NAN to np.nan. (#6056) @bdice\n- Re-enable `pytest cuml-dask` for CUDA 12.5 wheel CI tests (#6051) @divyegala\n- Fix `simplicial_set_embedding` (#6043) @viclafargue\n- MAINT: allow error messages to contain ``np.float32(1.0)`` (#6030) @seberg\n- Stop exporting the fill_k kernel, since it causes ODR violations (#6021) @robertmaynard\n- Avoid cudf column APIs now that cudf.Series no longer accepts column inputs (#6019) @mroeschke\n- Pin the HDBSCAN package version to `0.8.38` (#5906) @divyegala\n\n## 📖 Documentation\n\n- Update the UMAP documentation (#6064) @viclafargue\n- Update the README in experimental FIL (#6052) @hcho3\n- Add documentation for simplicial set embedding (#6042) @Intron7\n\n## 🚀 New Features\n\n- TSNE CPU\u002FGPU interoperability (#6063) @divyegala\n- Enable GPU `fit` and CPU `transform` in UMAP (#6032) @divyegala\n\n## 🛠️ Improvements\n\n- Migrate to cuVS for vector search (#6085) @benfred\n- Support all-zero feature vectors in MG sparse logistic regression (#6082) @lijinf2\n- Update update-version.sh to use the packaging library (#6081) @AyodeAwe\n- Use the CI workflow branch 'branch-24.10' again (#6072) @jameslamb\n- Update fmt (to 11.0.2) and spdlog (to 1.14.1), and add those libraries to the libcuml conda host dependencies (#6071) @jameslamb\n- Update flake8 to 7.1.1. (#6070) @bdice\n- Add support for Python 3.12 and update umap-learn to 0.5.6 (#6060) @jameslamb\n- Fix compiler warnings about signed vs. unsigned integers (#6053) @hcho3\n- Update rapidsai\u002Fpre-commit-hooks (#6048) @KyleFromNVIDIA\n- Drop support for Python 3.9 (#6040) @jameslamb\n- Add use_cuda_wheels matrix entries (#6038) @KyleFromNVIDIA\n- Switch debug builds to RelWithDebInfo (#6033) @rongou\n- Remove the NumPy \u003C2 pin (#6031) @seberg\n- Remove the legacy dask-glm-based logistic regression (#6028) @dantegd\n- [FEA] UMAP API for building with batched NN Descent (#6022) @jinsolp\n- Enable CPU\u002FGPU interoperability for SVM, DBSCAN, and KMeans 
(#6020) @viclafargue\n- Update the pre-commit hooks (#6016) @KyleFromNVIDIA\n- Improve update-version.sh (#6014) @bdice\n- Use tool.scikit-build.cmake.version and set a minimum scikit-build-core version (#6012) @jameslamb\n- Merge branch-24.08 into branch-24.10 (#5981) @jameslamb\n- Use CUDA math wheels (#5966) @KyleFromNVIDIA","2024-10-09T20:29:23",{"id":233,"version":234,"summary_zh":235,"released_at":236},63005,"v24.08.00","## 🐛 Bug Fixes\n\n- Fixes for encoders\u002Ftransformers for cudf.pandas (#5990) @dantegd\n- BUG: remove sample parameter from pca call to mean (#5980) @mfoerste4\n- Fix segfault and other errors in ForestInference.load_from_sklearn (#5973) @hcho3\n- Rename `.devcontainer`s for CUDA 12.5 (#5967) @jakirkham\n- [MNT] Small NumPy 2 related fixes (#5954) @seberg\n- CI Fix: use ld_preload to avoid libgomp issue on ARM jobs (#5949) @dantegd\n- Fix for benchmark runner to handle parameter sweeps of multiple data types (#5938) @dantegd\n- Avoid extra memory copy when using cp.concatenate in cuml.dask kmeans (#5937) @dantegd\n- Assign correct `labels_` in `cuml.dask.kmeans` (#5931) @dantegd\n- Fix nightly jobs by updating hypothesis strategies to account for sklearn change (#5925) @dantegd\n- Fix for SVC fit_proba not using class weights (#5912) @pablotanner\n- Fix `cudf.pandas` failure on `test_convert_input_dtype` (#5885) @dantegd\n- Fix `cudf.pandas` failure on  `test_convert_matrix_order_cuml_array` (#5882) @dantegd\n- Simplify cuml array (#5166) @wence-\n\n## 🚀 New Features\n\n- [FEA] Enable UMAP to build knn graph using NN Descent (#5910) @jinsolp\n- Allow estimators to accept any dtype (#5888) @dantegd\n\n## 🛠️ Improvements\n\n- Add support for XGBoost UBJSON in FIL (#6009) @hcho3\n- split up CUDA-suffixed dependencies in dependencies.yaml (#5974) @jameslamb\n- Use workflow branch 24.08 again (#5970) @KyleFromNVIDIA\n- Bump Treelite to 4.3.0 (#5968) @hcho3\n- reduce memory_footprint for sparse PCA transform (#5964) @Intron7\n- Build and test with CUDA 12.5.1 (#5963) @KyleFromNVIDIA\n- Support 
int64 index type in MG sparse LogisticRegression (#5962) @lijinf2\n- Add CUDA_STATIC_MATH_LIBRARIES (#5959) @KyleFromNVIDIA\n- skip CMake 3.30.0 (#5956) @jameslamb\n- Make `ci\u002Frun_cuml_dask_pytests.sh` environment-agnostic again (#5950) @trxcllnt\n- Use verify-alpha-spec hook (#5948) @KyleFromNVIDIA\n- nest cuml one level deeper in python (#5944) @msarahan\n- resolve dependency-file-generator warning, other rapids-build-backend followup (#5928) @jameslamb\n- Adopt CI\u002Fpackaging codeowners (#5923) @bdice\n- Remove text builds of documentation (#5921) @vyasr\n- Fix conflict of forward-merge #5905 of branch-24.06 into branch-24.08 (#5911) @dantegd\n- Bump Treelite to 4.2.1 (#5908) @hcho3\n- remove unnecessary &#39;setuptools&#39; dependency (#5901) @jameslamb\n- [FEA] PCA Initialization for TSNE (#5897) @aamijar\n- Use rapids-build-backend (#5804) @KyleFromNVIDIA","2024-08-08T02:09:42",{"id":238,"version":239,"summary_zh":240,"released_at":241},63006,"v24.06.01","## 🐛 Bug Fixes\n\n- [HOTFIX] Fix import of sklearn by using cpu_only_import (#5914) @dantegd\n- Fix label binarize for binary class (#5900) @jinsolp\n- Fix RandomForestClassifier return type (#5896) @jinsolp\n- Fix nightly CI: remove deprecated creation of columns by using explicit dtype (#5880) @dantegd\n- Fix DBSCAN allocates rbc index even if deactivated (#5859) @mfoerste4\n- Remove gtest from dependencies.yaml (#5854) @robertmaynard\n- Support expression-based Dask Dataframe API (#5835) @rjzamora\n- Mark all kernels with internal linkage (#5764) @robertmaynard\n- Fix build.sh clean command (#5730) @csadorf\n\n## 📖 Documentation\n\n- Update the developer&#39;s guide with new copyright hook (#5848) @KyleFromNVIDIA\n\n## 🚀 New Features\n\n- Always use a static gtest and gbench (#5847) @robertmaynard\n\n## 🛠️ Improvements\n\n- [HOTFIX] Add compatibility of imports with multiple Scikit-learn versions (#5922) @dantegd\n- Support double precision in MNMG Logistic Regression (#5898) @lijinf2\n- Reduce 
and rename cudf.pandas integrations jobs (#5890) @dantegd\n- Fix building cuml with CCCL main (#5886) @trxcllnt\n- Add optional CI job for integration tests with cudf.pandas (#5881) @dantegd\n- Enable pytest failures on FutureWarnings\u002FDeprecationWarnings (#5877) @mroeschke\n- Remove return in test_lbfgs (#5875) @mroeschke\n- Avoid dask_cudf.core imports (#5874) @bdice\n- Support CPU object for `train_test_split` (#5873) @isVoid\n- Only use functions in the limited API (#5871) @vyasr\n- Replace deprecated disutils.version with packaging.version (#5868) @mroeschke\n- Adjust deprecated cupy.sparse usage (#5867) @mroeschke\n- Fix numpy 2.0 deprecations (#5866) @mroeschke\n- Fix deprecated positional arg usage (#5865) @mroeschke\n- Use int instead of float in random.randint (#5864) @mroeschke\n- Migrate to `{{ stdlib(&quot;c&quot;) }}` (#5863) @hcho3\n- Avoid deprecated API in notebook (#5862) @rjzamora\n- Add dedicated handling for cudf.pandas wrapped Numpy arrays (#5861) @betatim\n- Prepend devcontainer name with the username (#5860) @trxcllnt\n- add --rm and --name to devcontainer run args (#5857) @trxcllnt\n- Update pip devcontainers to UCX v1.15.0 (#5856) @trxcllnt\n- Replace rmm::mr::device_memory_resource* with rmm::device_async_resource_ref (#5853) @harrism\n- Update scikit-learn to 1.4 (#5851) @betatim\n- Prevent undefined behavior when passing handle from Treelite to cuML FIL (#5849) @hcho3\n- Adds missing files to `update-version.sh` (#5830) @AyodeAwe\n- Enable all tests for `arm` arch (#5824) @galipremsagar\n- Address PytestReturnNotNoneWarning in cuml tests (#5819) @mroeschke\n- Handle binary classifier with all-0 labels (#5810) @hcho3\n- Use pytest_cases.fixture to fix warnings. 
(#5798) @bdice\n- Enable Dask tests with UCX-Py\u002FUCXX in CI (#5697) @pentschev","2024-06-12T17:18:26",{"id":243,"version":244,"summary_zh":245,"released_at":246},63007,"v24.06.00","## 🐛 Bug Fixes\n\n- [HOTFIX] Fix import of sklearn by using cpu_only_import (#5914) @dantegd\n- Fix label binarize for binary class (#5900) @jinsolp\n- Fix RandomForestClassifier return type (#5896) @jinsolp\n- Fix nightly CI: remove deprecated creation of columns by using explicit dtype (#5880) @dantegd\n- Fix DBSCAN allocates rbc index even if deactivated (#5859) @mfoerste4\n- Remove gtest from dependencies.yaml (#5854) @robertmaynard\n- Support expression-based Dask Dataframe API (#5835) @rjzamora\n- Mark all kernels with internal linkage (#5764) @robertmaynard\n- Fix build.sh clean command (#5730) @csadorf\n\n## 📖 Documentation\n\n- Update the developer&#39;s guide with new copyright hook (#5848) @KyleFromNVIDIA\n\n## 🚀 New Features\n\n- Always use a static gtest and gbench (#5847) @robertmaynard\n\n## 🛠️ Improvements\n\n- Support double precision in MNMG Logistic Regression (#5898) @lijinf2\n- Reduce and rename cudf.pandas integrations jobs (#5890) @dantegd\n- Fix building cuml with CCCL main (#5886) @trxcllnt\n- Add optional CI job for integration tests with cudf.pandas (#5881) @dantegd\n- Enable pytest failures on FutureWarnings\u002FDeprecationWarnings (#5877) @mroeschke\n- Remove return in test_lbfgs (#5875) @mroeschke\n- Avoid dask_cudf.core imports (#5874) @bdice\n- Support CPU object for `train_test_split` (#5873) @isVoid\n- Only use functions in the limited API (#5871) @vyasr\n- Replace deprecated disutils.version with packaging.version (#5868) @mroeschke\n- Adjust deprecated cupy.sparse usage (#5867) @mroeschke\n- Fix numpy 2.0 deprecations (#5866) @mroeschke\n- Fix deprecated positional arg usage (#5865) @mroeschke\n- Use int instead of float in random.randint (#5864) @mroeschke\n- Migrate to `{{ stdlib(&quot;c&quot;) }}` (#5863) @hcho3\n- Avoid deprecated API in 
notebook (#5862) @rjzamora\n- Add dedicated handling for cudf.pandas wrapped Numpy arrays (#5861) @betatim\n- Prepend devcontainer name with the username (#5860) @trxcllnt\n- add --rm and --name to devcontainer run args (#5857) @trxcllnt\n- Update pip devcontainers to UCX v1.15.0 (#5856) @trxcllnt\n- Replace rmm::mr::device_memory_resource* with rmm::device_async_resource_ref (#5853) @harrism\n- Update scikit-learn to 1.4 (#5851) @betatim\n- Prevent undefined behavior when passing handle from Treelite to cuML FIL (#5849) @hcho3\n- Adds missing files to `update-version.sh` (#5830) @AyodeAwe\n- Enable all tests for `arm` arch (#5824) @galipremsagar\n- Address PytestReturnNotNoneWarning in cuml tests (#5819) @mroeschke\n- Handle binary classifier with all-0 labels (#5810) @hcho3\n- Use pytest_cases.fixture to fix warnings. (#5798) @bdice\n- Enable Dask tests with UCX-Py\u002FUCXX in CI (#5697) @pentschev","2024-06-05T17:43:18",{"id":248,"version":249,"summary_zh":250,"released_at":251},63008,"v24.04.00","## 🐛 Bug Fixes\n\n- Update pre-commit-hooks to v0.0.3 (#5816) @KyleFromNVIDIA\n- Correct and adjust tolerances of mnmg logreg pytests (#5812) @dantegd\n- Remove use of cudf.core.column.full. (#5794) @bdice\n- Suppress all HealthChecks on test_split_datasets. (#5791) @bdice\n- Suppress a hypothesis HealthCheck on test_split_datasets that fails in nightly CI. (#5790) @bdice\n- [BUG] Fix `MAX_THREADS_PER_SM` on sm 89. (#5785) @trivialfis\n- fix device to host copy not sync stream in logistic regression mg (#5766) @lijinf2\n- Use cudf.Index instead of cudf.GenericIndex. 
(#5738) @bdice\n- update RAPIDS dependencies to 24.4, refactor dependencies.yaml (#5726) @jameslamb\n\n## 🚀 New Features\n\n- Support CUDA 12.2 (#5711) @jameslamb\n\n## 🛠️ Improvements\n\n- Use `conda env create --yes` instead of `--force` (#5822) @bdice\n- Bump Treelite to 4.1.2 (#5814) @hcho3\n- Support standardization for sparse vectors in logistic regression MG (#5806) @lijinf2\n- Update script input name (#5802) @AyodeAwe\n- Add upper bound to prevent usage of NumPy 2 (#5797) @bdice\n- Enable pytest failures on warnings from cudf (#5796) @mroeschke\n- Use public cudf APIs where possible (#5795) @mroeschke\n- Remove hard-coding of RAPIDS version where possible (#5793) @KyleFromNVIDIA\n- Switch `pytest-xdist` algorithm to `worksteal` (#5792) @bdice\n- Automate C++ include file grouping and ordering using clang-format (#5787) @harrism\n- Add support for Python 3.11, require NumPy 1.23+ (#5786) @jameslamb\n- [ENH] Let cuDF handle input types for label encoder. (#5783) @trivialfis\n- Install test dependencies at the same time as cuml packages. (#5781) @bdice\n- Update devcontainers to CUDA Toolkit 12.2 (#5778) @trxcllnt\n- target branch-24.04 for GitHub Actions workflows (#5776) @jameslamb\n- Add environment-agnostic scripts for running ctests and pytests (#5761) @trxcllnt\n- Pandas 2.x support (#5758) @dantegd\n- Update ops-bot.yaml (#5752) @AyodeAwe\n- Forward-merge branch-24.02 to branch-24.04 (#5735) @bdice\n- Replace local copyright check with pre-commit-hooks verify-copyright (#5732) @KyleFromNVIDIA\n- DBSCAN utilize rbc eps_neighbors (#5728) @mfoerste4","2024-04-10T20:54:06",{"id":253,"version":254,"summary_zh":255,"released_at":256},63009,"v24.02.00","## 🚨 Breaking Changes\n\n- Update to CCCL 2.2.0. 
(#5702) @bdice\n- Switch to scikit-build-core (#5693) @vyasr\n\n## 🐛 Bug Fixes\n\n- [Hotfix] Fix FIL gtest (#5755) @hcho3\n- Exclude tests from builds (#5754) @vyasr\n- Fix ctest directory to ensure tests are executed (#5753) @bdice\n- Synchronize stream in SVC memory test (#5729) @wphicks\n- Fix shared-workflows repo name (#5723) @raydouglass\n- Fix cupy dependency in pyproject.toml (#5705) @vyasr\n- Only cufft offers a static_nocallback version of the library (#5703) @robertmaynard\n\n## 🛠️ Improvements\n\n- [Hotfix] Update GPUTreeSHAP to fix ARM build (#5747) @hcho3\n- Disable HistGradientBoosting support for now (#5744) @hcho3\n- Disable hnswlib feature in RAFT; pin pytest (#5733) @hcho3\n- [LogisticRegressionMG] Support standardization with no data modification (#5724) @lijinf2\n- Remove usages of rapids-env-update (#5716) @KyleFromNVIDIA\n- Remove extraneous SKBUILD_BUILD_OPTIONS (#5714) @vyasr\n- refactor CUDA versions in dependencies.yaml (#5712) @jameslamb\n- Update to CCCL 2.2.0. (#5702) @bdice\n- Migrate to Treelite 4.0 (#5701) @hcho3\n- Use cuda::proclaim_return_type on device lambdas. 
(#5696) @bdice\n- move _process_generic to base_return_types, avoid circular import (#5695) @dcolinmorgan\n- Switch to scikit-build-core (#5693) @vyasr\n- Fix all deprecated function calls in TUs where warnings are errors (#5692) @vyasr\n- Remove CUML_BUILD_WHEELS and standardize Python builds (#5689) @vyasr\n- Forward-merge branch-23.12 to branch-24.02 (#5657) @bdice\n- Add cuML devcontainers (#5568) @trxcllnt","2024-02-13T16:00:26",{"id":258,"version":259,"summary_zh":260,"released_at":261},63010,"v23.12.00","## 🚨 Breaking Changes\n\n- [LogisticRegressionMG] Support sparse vectors (#5632) @lijinf2\n\n## 🐛 Bug Fixes\n\n- Update actions\u002Flabeler to v4 (#5686) @raydouglass\n- updated docs around `make_column_transformer` change from `.preprocessing` to `.compose` (#5680) @taureandyernv\n- Skip dask pytest NN hang in CUDA 11.4 CI (#5665) @dantegd\n- Avoid hard import of sklearn in base module. (#5663) @csadorf\n- CI: Pin clang-tidy to 15.0.7. (#5661) @csadorf\n- Adjust assumption regarding valid cudf.Series dimensional input. (#5654) @csadorf\n- Flatten cupy array before feeding to cudf.Series (#5651) @vyasr\n- CI: Fix expected ValueError and dask-glm incompatibility (#5644) @csadorf\n- Use drop_duplicates instead of unique for cudf&#39;s pandas compatibility mode (#5639) @vyasr\n- Temporarily avoid pydata-sphinx-theme version 0.14.2. (#5629) @csadorf\n- Fix type hint in split function. (#5625) @trivialfis\n- Fix trying to get pointer to None in svm\u002Flinear.pyx (#5615) @yosider\n- Reduce parallelism to avoid OOMs in wheel tests (#5611) @vyasr\n\n## 📖 Documentation\n\n- Update interoperability docs (#5633) @beckernick\n- Update instructions for creating a conda build environment (#5628) @csadorf\n\n## 🚀 New Features\n\n- Basic implementation of `OrdinalEncoder`. 
(#5646) @trivialfis\n\n## 🛠️ Improvements\n\n- Build concurrency for nightly and merge triggers (#5658) @bdice\n- [LogisticRegressionMG][FEA] Support training when dataset contains only one class (#5655) @lijinf2\n- Use new `rapids-dask-dependency` metapackage for managing `dask` versions (#5649) @galipremsagar\n- Simplify some logic in LabelEncoder (#5648) @vyasr\n- Increase `Nanny` close timeout in `LocalCUDACluster` tests (#5636) @pentschev\n- [LogisticRegressionMG] Support sparse vectors (#5632) @lijinf2\n- Add rich HTML representation to estimators (#5630) @betatim\n- Unpin `dask` and `distributed` for `23.12` development (#5627) @galipremsagar\n- Update `shared-action-workflows` references (#5621) @AyodeAwe\n- Use branch-23.12 workflows. (#5618) @bdice\n- Update rapids-cmake functions to non-deprecated signatures (#5616) @robertmaynard\n- Allow nightly dependencies and set up consistent nightly versions for conda and pip packages (#5607) @vyasr\n- Forward-merge branch-23.10 to branch-23.12 (#5596) @bdice\n- Build CUDA 12.0 ARM conda packages. 
(#5595) @bdice\n- Enable multiclass svm for sparse input (#5588) @mfoerste4","2023-12-06T19:20:46",{"id":263,"version":264,"summary_zh":265,"released_at":266},63011,"v23.10.00","## 🚨 Breaking Changes\n\n- add sample_weight parameter to dbscan.fit (#5574) @mfoerste4\n- Update to Cython 3.0.0 (#5506) @vyasr\n\n## 🐛 Bug Fixes\n\n- Fix accidental unsafe cupy import (#5613) @dantegd\n- Fixes for CPU package (#5599) @dantegd\n- Fixes for timeouts in tests (#5598) @dantegd\n\n## 🚀 New Features\n\n- Enable cuml-cpu nightly (#5585) @dantegd\n- add sample_weight parameter to dbscan.fit (#5574) @mfoerste4\n\n## 🛠️ Improvements\n\n- cuml-cpu notebook, docs and cluster models (#5597) @dantegd\n- Pin `dask` and `distributed` for `23.10` release (#5592) @galipremsagar\n- Add changes for early experimental support for dataframe interchange protocol API (#5591) @dantegd\n- [FEA] Support L1 regularization and ElasticNet in MNMG Dask LogisticRegression (#5587) @lijinf2\n- Update image names (#5586) @AyodeAwe\n- Update to clang 16.0.6. (#5583) @bdice\n- Upgrade to Treelite 3.9.1 (#5581) @hcho3\n- Update to doxygen 1.9.1. 
(#5580) @bdice\n- [REVIEW] Adding a few of datasets for benchmarking (#5573) @vinaydes\n- Allow cuML MNMG estimators to be serialized (#5571) @viclafargue\n- [FEA] Support multiple classes in multi-node-multi-gpu logistic regression, from C++, Cython, to Dask Python class (#5565) @lijinf2\n- Use `copy-pr-bot` (#5563) @ajschmidt8\n- Unblock CI for branch-23.10 (#5561) @csadorf\n- Fix CPU-only build for new FIL (#5559) @hcho3\n- [FEA] Support no regularization in MNMG LogisticRegression (#5558) @lijinf2\n- Unpin `dask` and `distributed` for `23.10` development (#5557) @galipremsagar\n- Branch 23.10 merge 23.08 (#5547) @vyasr\n- Use Python builtins to prep benchmark `tmp_dir` (#5537) @jakirkham\n- Branch 23.10 merge 23.08 (#5522) @vyasr\n- Update to Cython 3.0.0 (#5506) @vyasr","2023-10-11T18:02:50",{"id":268,"version":269,"summary_zh":270,"released_at":271},63012,"v23.08.00","## 🚨 Breaking Changes\n\n- Stop using setup.py in build.sh (#5500) @vyasr\n- Add `copy_X` parameter to `LinearRegression` (#5495) @viclafargue\n\n## 🐛 Bug Fixes\n\n- Update dependencies.yaml test_notebooks to include dask_ml (#5545) @taureandyernv\n- Fix cython-lint issues. (#5536) @bdice\n- Skip rf_memleak tests (#5529) @dantegd\n- Pin hdbscan to fix pytests in CI (#5515) @dantegd\n- Fix UMAP and simplicial set functions metric (#5490) @viclafargue\n- Fix test_masked_column_mode (#5480) @viclafargue\n- Use fit_predict rather than fit for KNeighborsClassifier and KNeighborsRegressor in benchmark utility (#5460) @beckernick\n- Modify HDBSCAN membership_vector batch_size check (#5455) @tarang-jain\n\n## 🚀 New Features\n\n- Use rapids-cmake testing to run tests in parallel (#5487) @robertmaynard\n- [FEA] Update MST Reduction Op (#5386) @tarang-jain\n- cuml: Build CUDA 12 packages (#5318) @vyasr\n- CI: Add custom GitHub Actions job to run clang-tidy (#5235) @csadorf\n\n## 🛠️ Improvements\n\n- Pin `dask` and `distributed` for `23.08` release (#5541) @galipremsagar\n- Remove Dockerfile. 
(#5534) @bdice\n- Improve temporary directory handling in cuML (#5527) @jakirkham\n- Support init arguments in MNMG LogisticRegression (#5519) @lijinf2\n- Support predict in MNMG Logistic Regression (#5516) @lijinf2\n- Remove unused matrix.cuh and math.cuh headers to eliminate deprecation warnings. (#5513) @bdice\n- Update gputreeshap to use rapids-cmake. (#5512) @bdice\n- Remove raft specializations includes. (#5509) @bdice\n- Revert CUDA 12.0 CI workflows to branch-23.08. (#5508) @bdice\n- Enable wheels CI scripts to run locally (#5507) @divyegala\n- Default to nproc for PARALLEL_LEVEL in build.sh. (#5505) @csadorf\n- Fixed potential overflows in SVM, minor adjustments to nvtx ranges (#5504) @mfoerste4\n- Stop using setup.py in build.sh (#5500) @vyasr\n- Fix PCA test (#5498) @viclafargue\n- Update build dependencies (#5496) @csadorf\n- Add `copy_X` parameter to `LinearRegression` (#5495) @viclafargue\n- Sparse pca patch (#5493) @Intron7\n- Restrict HDBSCAN metric options to L2 #5415 (#5492) @Rvch7\n- Fix typos. (#5481) @bdice\n- Add multi-node-multi-gpu Logistic Regression in C++ (#5477) @lijinf2\n- Add missing stream argument to cub calls in workingset (#5476) @mfoerste4\n- Update to CMake 3.26.4 (#5464) @vyasr\n- use rapids-upload-docs script (#5457) @AyodeAwe\n- Unpin `dask` and `distributed` for development (#5452) @galipremsagar\n- Remove documentation build scripts for Jenkins (#5450) @ajschmidt8\n- Fix update version and pinnings for 23.08. (#5440) @bdice\n- Add cython-lint configuration. (#5439) @bdice\n- Unpin scikit-build upper bound (#5438) @vyasr\n- Fix some deprecation warnings in tests. 
(#5436) @bdice\n- Update `raft::sparse::distance::pairwise_distance` to new API (#5428) @divyegala","2023-08-09T21:09:44",{"id":273,"version":274,"summary_zh":275,"released_at":276},63013,"v23.06.00","## 🚨 Breaking Changes\n\n- Dropping Python 3.8 (#5385) @divyegala\n- Support sparse input for SVC and SVR (#5273) @mfoerste4\n\n## 🐛 Bug Fixes\n\n- Fix for umap-pytest issue in Rocky8 and wheels in GHA nightly tests (#5458) @dantegd\n- Fixes for nightly GHA runs (#5446) @dantegd\n- Add missing RAFT cusolver_macros import and changes for recent cuDF updates (#5434) @dantegd\n- Fix kmeans pytest to correctly compute fp output error (#5426) @mdoijade\n- Add missing `raft\u002Fmatrix\u002Fmatrix.cuh` include (#5411) @benfred\n- Fix path to cumlprims_mg in build workflow (#5406) @divyegala\n- Fix path to cumlprims in build workflow (#5405) @vyasr\n- Pin to scikit-build&lt;17.2 (#5400) @vyasr\n- Fix forward merge #5383 (#5384) @dantegd\n- Correct buffer move assignment in experimental FIL (#5372) @wphicks\n- Avoid invalid memory access in experimental FIL for large output size (#5365) @wphicks\n- Fix forward merge #5336 (#5345) @dantegd\n\n## 📖 Documentation\n\n- Fix HDBSCAN docs and add membership_vector to cuml.cluster.hdbscan namespace (#5378) @beckernick\n- Small doc fix (#5375) @tarang-jain\n\n## 🚀 New Features\n\n- Provide method for auto-optimization of FIL parameters (#5368) @wphicks\n\n## 🛠️ Improvements\n\n- Fix documentation source code links (#5449) @ajschmidt8\n- Drop seaborn dependency. (#5437) @bdice\n- Make all nvtx usage go through safe imports (#5424) @dantegd\n- run docs nightly too (#5423) @AyodeAwe\n- Switch back to using primary shared-action-workflows branch (#5420) @vyasr\n- Add librmm to libcuml dependencies. 
(#5410) @bdice\n- Update recipes to GTest version >=1.13.0 (#5408) @bdice\n- Remove cudf from libcuml `meta.yaml` (#5407) @divyegala\n- Support CUDA 12.0 for pip wheels (#5404) @divyegala\n- Support for gtest 1.11+ changes (#5403) @dantegd\n- Update cupy dependency (#5401) @vyasr\n- Build wheels using new single image workflow (#5394) @vyasr\n- Revert shared-action-workflows pin (#5391) @divyegala\n- Fix logic for concatenating Treelite objects (#5387) @hcho3\n- Dropping Python 3.8 (#5385) @divyegala\n- Remove usage of rapids-get-rapids-version-from-git (#5379) @jjacobelli\n- [ENH] Add missing includes of rmm\u002Fmr\u002Fdevice\u002Fper_device_resource.hpp (#5369) @ahendriksen\n- Remove wheel pytest verbosity (#5367) @sevagh\n- support parameter 'class_weight' and method 'decision_function' in LinearSVC (#5364) @mfoerste4\n- Update clang-format to 16.0.1. (#5361) @bdice\n- Implement apply() in FIL (#5358) @hcho3\n- Use ARC V2 self-hosted runners for GPU jobs (#5356) @jjacobelli\n- Try running silhouette test (#5353) @vyasr\n- Remove uses-setup-env-vars (#5344) @vyasr\n- Resolve auto-merger conflicts between `branch-23.04` & `branch-23.06` (#5340) @galipremsagar\n- Solve merge conflict of PR #5327 (#5329) @dantegd\n- Branch 23.06 merge 23.04 (#5315) @vyasr\n- Support sparse input for SVC and SVR (#5273) @mfoerste4\n- Delete outdated versions.json. 
(#5229) @bdice","2023-06-07T21:03:43",{"id":278,"version":279,"summary_zh":280,"released_at":281},63014,"v23.04.00","## 🚨 Breaking Changes\n\n- Pin `dask` and `distributed` for release (#5333) @galipremsagar\n\n## 🐛 Bug Fixes\n\n- Skip pickle notebook during nbsphinx (#5342) @dantegd\n- Avoid race condition in FIL predict_per_tree (#5334) @wphicks\n- Ensure experimental FIL shmem usage is below device limits (#5326) @wphicks\n- Update cuda architectures for threads per sm restriction (#5323) @wphicks\n- Run experimental FIL tests in CI (#5316) @wphicks\n- Run memory leak pytests without parallelism to avoid sporadic test failures (#5313) @dantegd\n- Update cupy version for pip wheels (#5311) @dantegd\n- Fix for raising attributeerors erroneously for ipython methods (#5299) @dantegd\n- Fix cuml local cpp docs build (#5297) @galipremsagar\n- Don't run dask tests twice when testing wheels (#5279) @benfred\n- Remove MANIFEST.in use auto-generated one for sdists and package_data for wheels (#5278) @vyasr\n- Removing remaining include of `raft\u002Fdistance\u002Fdistance_type.hpp` (#5264) @cjnolet\n- Enable hypothesis testing for nightly test runs. 
(#5244) @csadorf\n- Support numeric, boolean, and string keyword arguments to class methods during CPU dispatching (#5236) @beckernick\n- Allowing large data in kmeans (#5228) @cjnolet\n\n## 📖 Documentation\n\n- Fix docs build to be `pydata-sphinx-theme=0.13.0` compatible (#5259) @galipremsagar\n- Add supported CPU\u002FGPU operators to API docs and update docstrings (#5239) @beckernick\n- Fix documentation author (#5126) @bdice\n\n## 🚀 New Features\n\n- Modify default batch size in HDBSCAN soft clustering (#5335) @tarang-jain\n- reduce memory pressure in membership vector computation (#5268) @tarang-jain\n- membership_vector for HDBSCAN (#5247) @tarang-jain\n- Provide FIL implementation for both CPU and GPU (#4890) @wphicks\n\n## 🛠️ Improvements\n\n- Remove deprecated Treelite CI API from FIL (#5348) @hcho3\n- Updated forest inference to new dask worker api for 23.04 (#5347) @taureandyernv\n- Pin `dask` and `distributed` for release (#5333) @galipremsagar\n- Pin cupy in wheel tests to supported versions (#5312) @vyasr\n- Drop `pickle5` (#5310) @jakirkham\n- Remove CUDA_CHECK macro (#5308) @hcho3\n- Revert faiss removal pinned tag (#5306) @cjnolet\n- Upgrade to Treelite 3.2.0 (#5304) @hcho3\n- Implement predict_per_tree() in FIL (#5303) @hcho3\n- remove faiss from cuml (#5293) @benfred\n- Stop setting package version attribute in wheels (#5285) @vyasr\n- Add libfaiss runtime dependency to libcuml. 
(#5284) @bdice\n- Move faiss_mr from raft (#5281) @benfred\n- Generate pyproject dependencies with dfg (#5275) @vyasr\n- Updating cuML to use consolidated RAFT libs (#5272) @cjnolet\n- Add codespell as a linter (#5265) @benfred\n- Pass `AWS_SESSION_TOKEN` and `SCCACHE_S3_USE_SSL` vars to conda build (#5263) @ajschmidt8\n- Update to GCC 11 (#5258) @bdice\n- Drop Python 3.7 handling for pickle protocol 4 (#5256) @jakirkham\n- Migrate as much as possible to pyproject.toml (#5251) @vyasr\n- Adapt to rapidsai\u002Frmm#1221 which moves allocator callbacks (#5249) @wence-\n- Add dfg as a pre-commit hook. (#5246) @vyasr\n- Stop using versioneer to manage versions (#5245) @vyasr\n- Enhance cuML benchmark utility and refactor hdbscan import utilities (#5242) @beckernick\n- Fix GHA build workflow (#5241) @AjayThorve\n- Support innerproduct distance in the pairwise_distance API (#5230) @benfred\n- Enable hypothesis for 23.04 (#5221) @csadorf\n- Reduce error handling verbosity in CI tests scripts (#5219) @AjayThorve\n- Bump pinned pip wheel deps to 23.4 (#5217) @sevagh\n- Update shared workflow branches (#5215) @ajschmidt8\n- Unpin `dask` and `distributed` for development (#5209) @galipremsagar\n- Remove gpuCI scripts. (#5208) @bdice\n- Move date to build string in `conda` recipe (#5190) @ajschmidt8\n- Kernel shap improvements (#5187) @vinaydes\n- test out the raft bfknn replacement (#5186) @benfred\n- Forward merge 23.02 into 23.04 (#5182) @vyasr\n- Add `detail` namespace for linear models (#5107) @lowener\n- Add pre-commit configuration (#4983) @csadorf","2023-04-12T21:34:27"]