[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-mlr-org--mlr":3,"tool-mlr-org--mlr":61},[4,18,26,36,44,53],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":17},4358,"openclaw","openclaw\u002Fopenclaw","OpenClaw 是一款专为个人打造的本地化 AI 助手，旨在让你在自己的设备上拥有完全可控的智能伙伴。它打破了传统 AI 助手局限于特定网页或应用的束缚，能够直接接入你日常使用的各类通讯渠道，包括微信、WhatsApp、Telegram、Discord、iMessage 等数十种平台。无论你在哪个聊天软件中发送消息，OpenClaw 都能即时响应，甚至支持在 macOS、iOS 和 Android 设备上进行语音交互，并提供实时的画布渲染功能供你操控。\n\n这款工具主要解决了用户对数据隐私、响应速度以及“始终在线”体验的需求。通过将 AI 部署在本地，用户无需依赖云端服务即可享受快速、私密的智能辅助，真正实现了“你的数据，你做主”。其独特的技术亮点在于强大的网关架构，将控制平面与核心助手分离，确保跨平台通信的流畅性与扩展性。\n\nOpenClaw 非常适合希望构建个性化工作流的技术爱好者、开发者，以及注重隐私保护且不愿被单一生态绑定的普通用户。只要具备基础的终端操作能力（支持 macOS、Linux 及 Windows WSL2），即可通过简单的命令行引导完成部署。如果你渴望拥有一个懂你",349277,3,"2026-04-06T06:32:30",[13,14,15,16],"Agent","开发框架","图像","数据工具","ready",{"id":19,"name":20,"github_repo":21,"description_zh":22,"stars":23,"difficulty_score":10,"last_commit_at":24,"category_tags":25,"status":17},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,"2026-04-05T11:01:52",[14,15,13],{"id":27,"name":28,"github_repo":29,"description_zh":30,"stars":31,"difficulty_score":32,"last_commit_at":33,"category_tags":34,"status":17},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 
代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",152630,2,"2026-04-12T23:33:54",[14,13,35],"语言模型",{"id":37,"name":38,"github_repo":39,"description_zh":40,"stars":41,"difficulty_score":32,"last_commit_at":42,"category_tags":43,"status":17},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",108322,"2026-04-10T11:39:34",[14,15,13],{"id":45,"name":46,"github_repo":47,"description_zh":48,"stars":49,"difficulty_score":32,"last_commit_at":50,"category_tags":51,"status":17},6121,"gemini-cli","google-gemini\u002Fgemini-cli","gemini-cli 是一款由谷歌推出的开源 AI 命令行工具，它将强大的 Gemini 大模型能力直接集成到用户的终端环境中。对于习惯在命令行工作的开发者而言，它提供了一条从输入提示词到获取模型响应的最短路径，无需切换窗口即可享受智能辅助。\n\n这款工具主要解决了开发过程中频繁上下文切换的痛点，让用户能在熟悉的终端界面内直接完成代码理解、生成、调试以及自动化运维任务。无论是查询大型代码库、根据草图生成应用，还是执行复杂的 Git 操作，gemini-cli 都能通过自然语言指令高效处理。\n\n它特别适合广大软件工程师、DevOps 人员及技术研究人员使用。其核心亮点包括支持高达 100 万 token 的超长上下文窗口，具备出色的逻辑推理能力；内置 Google 搜索、文件操作及 Shell 命令执行等实用工具；更独特的是，它支持 MCP（模型上下文协议），允许用户灵活扩展自定义集成，连接如图像生成等外部能力。此外，个人谷歌账号即可享受免费的额度支持，且项目基于 Apache 2.0 
协议完全开源，是提升终端工作效率的理想助手。",100752,"2026-04-10T01:20:03",[52,13,15,14],"插件",{"id":54,"name":55,"github_repo":56,"description_zh":57,"stars":58,"difficulty_score":32,"last_commit_at":59,"category_tags":60,"status":17},4721,"markitdown","microsoft\u002Fmarkitdown","MarkItDown 是一款由微软 AutoGen 团队打造的轻量级 Python 工具，专为将各类文件高效转换为 Markdown 格式而设计。它支持 PDF、Word、Excel、PPT、图片（含 OCR）、音频（含语音转录）、HTML 乃至 YouTube 链接等多种格式的解析，能够精准提取文档中的标题、列表、表格和链接等关键结构信息。\n\n在人工智能应用日益普及的今天，大语言模型（LLM）虽擅长处理文本，却难以直接读取复杂的二进制办公文档。MarkItDown 恰好解决了这一痛点，它将非结构化或半结构化的文件转化为模型“原生理解”且 Token 效率极高的 Markdown 格式，成为连接本地文件与 AI 分析 pipeline 的理想桥梁。此外，它还提供了 MCP（模型上下文协议）服务器，可无缝集成到 Claude Desktop 等 LLM 应用中。\n\n这款工具特别适合开发者、数据科学家及 AI 研究人员使用，尤其是那些需要构建文档检索增强生成（RAG）系统、进行批量文本分析或希望让 AI 助手直接“阅读”本地文件的用户。虽然生成的内容也具备一定可读性，但其核心优势在于为机器",93400,"2026-04-06T19:52:38",[52,14],{"id":62,"github_repo":63,"name":64,"description_en":65,"description_zh":66,"ai_summary_zh":66,"readme_en":67,"readme_zh":68,"quickstart_zh":69,"use_case_zh":70,"hero_image_url":71,"owner_login":72,"owner_name":72,"owner_avatar_url":73,"owner_bio":74,"owner_company":75,"owner_location":75,"owner_email":75,"owner_twitter":75,"owner_website":76,"owner_url":77,"languages":78,"stars":95,"forks":96,"last_commit_at":97,"license":98,"difficulty_score":99,"env_os":100,"env_gpu":101,"env_ram":101,"env_deps":102,"category_tags":108,"github_topics":112,"view_count":32,"oss_zip_url":75,"oss_zip_packed_at":75,"status":17,"created_at":132,"updated_at":133,"faqs":134,"releases":165},6994,"mlr-org\u002Fmlr","mlr","Machine Learning in R ","mlr 是一个专为 R 语言设计的机器学习框架，旨在为用户提供统一、高效的实验基础设施。在 R 的原生环境中，不同的机器学习算法往往缺乏标准化的接口，导致研究人员在进行复杂实验时，需要编写大量繁琐且易错的代码来封装算法、统一输出格式，并手动实现重采样、超参数优化、特征选择及数据预处理等功能。mlr 完美解决了这些痛点，它将分类、回归、生存分析及聚类等监督与非监督学习方法整合在同一接口下，让用户能专注于实验设计本身，而非底层代码实现。\n\n该工具特别适合数据科学家、统计研究人员以及需要在 R 中进行系统性模型评估的开发者使用。其核心技术亮点在于提供了一套完整的模块化流程，不仅支持灵活的实验扩展和自定义算法构建，还原生集成了并行计算能力以加速耗时任务。此外，mlr 与 OpenML 平台深度连接，便于用户共享数据集与实验结果，促进可复现研究。需要注意的是，目前 mlr 已进入维护退休阶段，开发团队建议新项目优先采用其继任者 mlr3，但 mlr 依然是理解 R 语言机器学习工作流的重要经典工具。"," mlr 
\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fmlr-org_mlr_readme_64b3143b8187.png\" align=\"right\" \u002F>\n\nPackage website: [release](https:\u002F\u002Fmlr.mlr-org.com\u002F) | [dev](https:\u002F\u002Fmlr.mlr-org.com\u002Fdev\u002F)\n\nMachine learning in R.\n\n\u003C!-- badges: start -->\n\n[![tic](https:\u002F\u002Fgithub.com\u002Fmlr-org\u002Fmlr\u002Fworkflows\u002Ftic\u002Fbadge.svg?branch=main)](https:\u002F\u002Fgithub.com\u002Fmlr-org\u002Fmlr\u002Factions)\n[![CRAN_Status_Badge](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fmlr-org_mlr_readme_e8ee5320d93f.png)](https:\u002F\u002Fcran.r-project.org\u002Fpackage=mlr)\n[![cran checks](https:\u002F\u002Fbadges.cranchecks.info\u002Fworst\u002Fmlr.svg)](https:\u002F\u002Fcran.r-project.org\u002Fweb\u002Fchecks\u002Fcheck_results_mlr.html)\n[![CRAN Downloads](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fmlr-org_mlr_readme_3da50e2fcae7.png)](https:\u002F\u002Fcran.r-project.org\u002Fpackage=mlr)\n[![StackOverflow](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fstackoverflow-mlr-blue.svg)](https:\u002F\u002Fstackoverflow.com\u002Fquestions\u002Ftagged\u002Fmlr)\n[![lifecycle](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Flifecycle-retired-orange.svg)](https:\u002F\u002Flifecycle.r-lib.org\u002Farticles\u002Fstages.html)\n[![codecov](https:\u002F\u002Fcodecov.io\u002Fgh\u002Fmlr-org\u002Fmlr\u002Fbranch\u002Fmain\u002Fgraph\u002Fbadge.svg)](https:\u002F\u002Fapp.codecov.io\u002Fgh\u002Fmlr-org\u002Fmlr)\n\n\u003C!-- badges: end -->\n\n- [CRAN release site](https:\u002F\u002FCRAN.R-project.org\u002Fpackage=mlr)\n- [Online tutorial](https:\u002F\u002Fmlr.mlr-org.com\u002Findex.html)\n- [Changelog](https:\u002F\u002Fmlr.mlr-org.com\u002Fnews\u002Findex.html)\n\n- [Stackoverflow](https:\u002F\u002Fstackoverflow.com\u002Fquestions\u002Ftagged\u002Fmlr): `#mlr`\n- [Mattermost](https:\u002F\u002Flmmisld-lmu-stats-slds.srv.mwn.de\u002Fmlr_invite\u002F)\n- 
[Blog](https:\u002F\u002Fmlr-org.com\u002F)\n\n## Deprecated\n\n{mlr} is considered retired from the mlr-org team.\nWe won't add new features anymore and will only fix _severe_ bugs.\nWe suggest using the new [mlr3](https:\u002F\u002Fmlr3.mlr-org.com\u002F) framework from now on and for future projects.\n\nNot all features of {mlr} are already implemented in {mlr3}.\nIf you are missing a crucial feature, please open an issue in the respective [mlr3 extension package](https:\u002F\u002Fgithub.com\u002Fmlr-org\u002Fmlr3\u002Fwiki\u002FExtension-Packages) and do not hesitate to follow up on it.\n\n## Installation\n\n**Release**\n\n```r\ninstall.packages(\"mlr\")\n```\n\n**Development**\n\n```R\nremotes::install_github(\"mlr-org\u002Fmlr\")\n```\n\n## Citing {mlr} in publications\n\nPlease cite our [JMLR paper](https:\u002F\u002Fjmlr.org\u002Fpapers\u002Fv17\u002F15-066.html) [[bibtex](https:\u002F\u002Fwww.jmlr.org\u002Fpapers\u002Fv17\u002F15-066.bib)].\n\nSome parts of the package were created as part of other publications.\nIf you use these parts, please cite the relevant work appropriately.\nAn overview of all {mlr} related publications can be found [here](https:\u002F\u002Fmlr.mlr-org.com\u002Farticles\u002Ftutorial\u002Fmlr_publications.html).\n\n## Introduction\n\nR does not define a standardized interface for its machine-learning algorithms.\nTherefore, for any non-trivial experiments, you need to write lengthy, tedious and error-prone wrappers to call the different algorithms and unify their respective output.\n\nAdditionally, you need to implement infrastructure to\n\n- resample your models\n- optimize hyperparameters\n- select features\n- cope with pre- and post-processing of data\n- compare models in a statistically meaningful way.\n\nAs this becomes computationally expensive, you might want to parallelize your experiments as well. 
This often forces users to make crummy trade-offs in their experiments due to time constraints or a lack of expert programming skills.\n\n{mlr} provides this infrastructure so that you can focus on your experiments!\nThe framework provides supervised methods like classification, regression and survival analysis along with their corresponding evaluation and optimization methods, as well as unsupervised methods like clustering.\nIt is written in a way that you can extend it yourself or deviate from the implemented convenience methods and construct your own complex experiments or algorithms.\n\nFurthermore, the package is nicely connected to the [**OpenML**](https:\u002F\u002Fgithub.com\u002Fopenml\u002Fopenml-r) R package and its [online platform](https:\u002F\u002Fwww.openml.org\u002F), which aims to support collaborative machine learning online and lets you easily share datasets as well as machine learning tasks, algorithms and experiments in order to support reproducible research.\n\n## Features\n\n- Clear **S3** interface to R **classification, regression, clustering and survival** analysis methods\n- Abstract description of learners and tasks by properties\n- Convenience methods and generic building blocks for your machine learning experiments\n- Resampling methods like **bootstrapping, cross-validation and subsampling**\n- Extensive visualizations (e.g. 
ROC curves, predictions and partial predictions)\n- Simplified benchmarking across data sets and learners\n- Easy hyperparameter tuning using different optimization strategies, including potent configurators like\n  - **iterated F-racing (irace)**\n  - **sequential model-based optimization**\n- **Variable selection with filters and wrappers**\n- Nested resampling of models with tuning and feature selection\n- **Cost-sensitive learning, threshold tuning and imbalance correction**\n- Wrapper mechanism to extend learner functionality in complex ways\n- Possibility to combine different processing steps into a complex data mining chain that can be jointly optimized\n- **OpenML** connector for the Open Machine Learning server\n- Built-in **parallelization**\n- **Detailed tutorial**\n\n## Miscellaneous\n\nSimple usage questions are better suited to Stack Overflow, using the [mlr](https:\u002F\u002Fstackoverflow.com\u002Fquestions\u002Ftagged\u002Fmlr) tag.\n\nPlease note that all of us work in academia and put a lot of work into this project - simply because we like it, not because we are paid for it.\n\nNew development efforts should go into [{mlr3}](https:\u002F\u002Fgithub.com\u002Fmlr-org\u002Fmlr3).\nWe have our own style guide, which can easily be applied by using the `mlr_style` from the [styler](https:\u002F\u002Fgithub.com\u002Fr-lib\u002Fstyler) package.\nSee [our wiki](https:\u002F\u002Fgithub.com\u002Fmlr-org\u002Fmlr3\u002Fwiki\u002FStyle-Guide#styler-mlr-style) for more information.\n\n## Talks, Workshops, etc.\n\n[mlr-outreach](https:\u002F\u002Fgithub.com\u002Fmlr-org\u002Fmlr-outreach) holds all outreach activities related to {mlr} and {mlr3}.\n","mlr \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fmlr-org_mlr_readme_64b3143b8187.png\" align=\"right\" \u002F>\n\n软件包官网：[发布版](https:\u002F\u002Fmlr.mlr-org.com\u002F) | [开发版](https:\u002F\u002Fmlr.mlr-org.com\u002Fdev\u002F)\n\nR语言中的机器学习。\n\n\u003C!-- badges: start 
-->\n\n[![tic](https:\u002F\u002Fgithub.com\u002Fmlr-org\u002Fmlr\u002Fworkflows\u002Ftic\u002Fbadge.svg?branch=main)](https:\u002F\u002Fgithub.com\u002Fmlr-org\u002Fmlr\u002Factions)\n[![CRAN_Status_Badge](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fmlr-org_mlr_readme_e8ee5320d93f.png)](https:\u002F\u002Fcran.r-project.org\u002Fpackage=mlr)\n[![cran checks](https:\u002F\u002Fbadges.cranchecks.info\u002Fworst\u002Fmlr.svg)](https:\u002F\u002Fcran.r-project.org\u002Fweb\u002Fchecks\u002Fcheck_results_mlr.html)\n[![CRAN Downloads](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fmlr-org_mlr_readme_3da50e2fcae7.png)](https:\u002F\u002Fcran.r-project.org\u002Fpackage=mlr)\n[![StackOverflow](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fstackoverflow-mlr-blue.svg)](https:\u002F\u002Fstackoverflow.com\u002Fquestions\u002Ftagged\u002Fmlr)\n[![lifecycle](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Flifecycle-retired-orange.svg)](https:\u002F\u002Flifecycle.r-lib.org\u002Farticles\u002Fstages.html)\n[![codecov](https:\u002F\u002Fcodecov.io\u002Fgh\u002Fmlr-org\u002Fmlr\u002Fbranch\u002Fmain\u002Fgraph\u002Fbadge.svg)](https:\u002F\u002Fapp.codecov.io\u002Fgh\u002Fmlr-org\u002Fmlr)\n\n\u003C!-- badges: end -->\n\n- [CRAN发布站点](https:\u002F\u002FCRAN.R-project.org\u002Fpackage=mlr)\n- [在线教程](https:\u002F\u002Fmlr.mlr-org.com\u002Findex.html)\n- [变更日志](https:\u002F\u002Fmlr.mlr-org.com\u002Fnews\u002Findex.html)\n\n- [Stackoverflow](https:\u002F\u002Fstackoverflow.com\u002Fquestions\u002Ftagged\u002Fmlr)：`#mlr`\n- [Mattermost](https:\u002F\u002Flmmisld-lmu-stats-slds.srv.mwn.de\u002Fmlr_invite\u002F)\n- [博客](https:\u002F\u002Fmlr-org.com\u002F)\n\n## 已弃用\n\n{mlr}已被mlr-org团队宣布退役。\n我们不再添加新功能，仅会修复_严重_的bug。\n建议从现在起及未来项目中使用新的[mlr3](https:\u002F\u002Fmlr3.mlr-org.com\u002F)框架。\n\n并非{mlr}的所有功能都已经在{mlr3}中实现。\n如果您缺少关键功能，请在相应的[mlr3扩展包](https:\u002F\u002Fgithub.com\u002Fmlr-org\u002Fmlr3\u002Fwiki\u002FExtension-Packages)中提交issue，并随时跟进。\n\n## 
安装\n\n**发布版**\n\n```r\ninstall.packages(\"mlr\")\n```\n\n**开发版**\n\n```R\nremotes::install_github(\"mlr-org\u002Fmlr\")\n```\n\n## 在出版物中引用{mlr}\n\n请引用我们的[JMLR论文](https:\u002F\u002Fjmlr.org\u002Fpapers\u002Fv17\u002F15-066.html) [[bibtex](https:\u002F\u002Fwww.jmlr.org\u002Fpapers\u002Fv17\u002F15-066.bib)]。\n\n该软件包的部分内容是作为其他出版物的一部分创建的。\n如果您使用了这些部分，请相应地引用相关工作。\n所有{mlr}相关出版物的概述可以在[这里](https:\u002F\u002Fmlr.mlr-org.com\u002Farticles\u002Ftutorial\u002Fmlr_publications.html)找到。\n\n## 简介\n\nR语言并未为其机器学习算法定义标准化接口。\n因此，对于任何非平凡的实验，您都需要编写冗长、繁琐且容易出错的封装代码来调用不同的算法，并统一它们的输出。\n\n此外，您还需要实现以下基础设施：\n\n- 对模型进行重采样\n- 优化超参数\n- 选择特征\n- 处理数据的预处理和后处理\n- 以统计学上有意义的方式比较模型。\n\n由于这些操作计算成本较高，您可能还希望对实验进行并行化。\n这往往迫使用户因时间限制或缺乏专业的编程技能而在实验中做出妥协。\n\n{mlr}提供了这些基础设施，使您可以专注于自己的实验！\n该框架提供了监督学习方法，如分类、回归和生存分析，以及相应的评估和优化方法，同时也支持无监督学习方法，如聚类。\n它被设计成允许您自行扩展，或者偏离已实现的便捷方法，构建自己的复杂实验或算法。\n\n此外，该包与[R包**OpenML**](https:\u002F\u002Fgithub.com\u002Fopenml\u002Fopenml-r)及其[在线平台](https:\u002F\u002Fwww.openml.org\u002F)良好集成，\n该平台旨在支持在线协作式机器学习，并允许轻松共享数据集、机器学习任务、算法和实验，从而支持可重复性研究。\n\n## 特性\n\n- 清晰的**S3**接口，用于R中的**分类、回归、聚类和生存**分析方法\n- 通过属性抽象描述学习器和任务\n- 便捷的方法和通用构建模块，用于您的机器学习实验\n- 重采样方法，如**自助法、交叉验证和子采样**\n- 丰富的可视化工具（例如ROC曲线、预测和部分预测）\n- 跨数据集和学习器的简化基准测试\n- 使用多种优化策略轻松进行超参数调优，包括强大的配置器，如\n  - **迭代F竞赛（irace）**\n  - **基于序列模型的优化**\n- **带有过滤器和包装器的变量选择**\n- 带有调优和特征选择的嵌套模型重采样\n- **代价敏感学习、阈值调优和不平衡数据纠正**\n- 封装机制，用于以复杂方式扩展学习器功能\n- 可将不同的处理步骤组合成一个可联合优化的复杂数据挖掘链\n- **OpenML**连接器，用于开放机器学习服务器\n- 内置**并行化**\n- **详细教程**\n\n## 其他\n\n简单的使用问题更适合在Stackoverflow上使用[mlr](https:\u002F\u002Fstackoverflow.com\u002Fquestions\u002Ftagged\u002Fmlr)标签提问。\n\n请注意，我们所有人都在学术界工作，并为这个项目投入了大量精力——仅仅因为我们喜欢它，而不是因为我们为此获得报酬。\n\n新的开发工作应转向[{mlr3}](https:\u002F\u002Fgithub.com\u002Fmlr-org\u002Fmlr3)。\n我们有自己的风格指南，可以通过使用[styler](https:\u002F\u002Fgithub.com\u002Fr-lib\u002Fstyler)包中的`mlr_style`轻松应用。\n更多信息请参阅[我们的wiki](https:\u002F\u002Fgithub.com\u002Fmlr-org\u002Fmlr3\u002Fwiki\u002FStyle-Guide#styler-mlr-style)。\n\n## 
讲座、研讨会等\n\n[mlr-outreach](https:\u002F\u002Fgithub.com\u002Fmlr-org\u002Fmlr-outreach)负责所有与{mlr}和{mlr3}相关的推广活动。","# mlr 快速上手指南\n\n> **重要提示**：`mlr` 包目前已被官方标记为**退休（Retired）**状态。开发团队不再添加新功能，仅修复严重漏洞。对于新项目，强烈建议使用其继任者 **[mlr3](https:\u002F\u002Fmlr3.mlr-org.com\u002F)**。本指南仅供维护旧项目或学习参考。\n\n## 环境准备\n\n- **操作系统**：Windows、macOS 或 Linux\n- **R 版本**：建议 R 3.5.0 或更高版本\n- **前置依赖**：\n  - 基础 R 环境\n  - 推荐安装 `devtools` 或 `remotes` 包以便从 GitHub 安装开发版\n  - 若需并行计算，需配置相应的后端（如 `parallel` 包）\n\n## 安装步骤\n\n### 1. 安装稳定版（推荐）\n从 CRAN 镜像源安装（国内用户可先设置清华或中科大镜像）：\n\n```r\n# 设置国内镜像（可选，加速下载）\noptions(repos = c(CRAN = \"https:\u002F\u002Fmirrors.tuna.tsinghua.edu.cn\u002FCRAN\"))\n\n# 安装 mlr\ninstall.packages(\"mlr\")\n```\n\n### 2. 安装开发版\n如需最新代码（包含未发布的修复），可从 GitHub 安装：\n\n```r\n# 确保已安装 remotes 包\ninstall.packages(\"remotes\")\n\n# 从 GitHub 安装\nremotes::install_github(\"mlr-org\u002Fmlr\")\n```\n\n## 基本使用\n\n`mlr` 的核心流程分为三步：**定义任务 (Task)** -> **选择学习器 (Learner)** -> **训练与评估 (Train & Resample)**。\n\n以下是一个最简单的分类示例（使用鸢尾花数据集）：\n\n```r\nlibrary(mlr)\n\n# 1. 定义任务：创建一个分类任务\n# data: 数据集\n# target: 目标变量列名\n# 注：其他任务类型请使用对应构造函数（makeRegrTask、makeClusterTask、makeSurvTask）\ntask \u003C- makeClassifTask(data = iris, target = \"Species\")\n\n# 2. 选择学习器：选择一个分类算法\n# 例如：随机森林 (\"classif.ranger\" 或 \"classif.randomForest\")\n# 这里使用简单的决策树（基于 rpart 包）\nlearner \u003C- makeLearner(\"classif.rpart\")\n\n# 3. 训练模型\nmodel \u003C- train(learner, task)\n\n# 查看模型摘要\nprint(model)\n\n# 4. 
模型评估：使用交叉验证 (Cross-Validation)\n# iters: 交叉验证折数（此处为 5 折）\nresample_result \u003C- resample(\n  learner = learner,\n  task = task,\n  resampling = makeResampleDesc(\"CV\", iters = 5),\n  measures = acc # 评估指标：准确率\n)\n\n# 输出评估结果\nprint(resample_result$aggr)\n```\n\n### 核心概念简述\n- **Task**: 封装了数据和目标变量，统一了不同算法的输入格式。\n- **Learner**: 封装了具体的机器学习算法（如 SVM, 随机森林, 神经网络等）。\n- **Resample**: 提供了交叉验证、自助法等重采样策略，用于客观评估模型性能。\n- **Measure**: 定义了评估指标（如准确率 `acc`, 均方误差 `mse`, AUC 等）。","某金融风控团队需要在 R 语言环境中，基于历史交易数据快速构建并对比多种机器学习模型，以预测客户违约概率。\n\n### 没有 mlr 时\n- **接口混乱**：调用随机森林、SVM 等不同算法包时，需手动编写大量重复代码来统一输入输出格式，极易出错。\n- **流程繁琐**：交叉验证、超参数调优和特征选择需分别实现，缺乏标准化流程，导致实验复现困难。\n- **效率低下**：面对海量参数组合，难以直接利用多核并行加速，往往因计算耗时过长而被迫简化实验方案。\n- **评估片面**：缺乏统一的统计评估框架，难以科学地对比不同模型在特定业务指标上的表现差异。\n\n### 使用 mlr 后\n- **统一接口**：mlr 提供了标准化的学习器接口，一行代码即可切换不同算法，自动处理数据格式对齐问题。\n- **流程自动化**：内置完整的重采样、超参数优化及特征选择管道，通过配置对象即可一键执行复杂实验流程。\n- **并行加速**：原生支持并行计算后端，轻松将网格搜索等耗时任务分发至多核运行，大幅缩短建模周期。\n- **科学评估**：提供丰富的性能度量指标和统计检验工具，确保模型对比结果具有统计学意义且可复现。\n\nmlr 通过构建标准化的机器学习基础设施，让数据科学家从繁琐的工程编码中解放出来，专注于核心实验设计与业务价值挖掘。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fmlr-org_mlr_27d7bc42.png","mlr-org","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002Fmlr-org_347544af.png","",null,"https:\u002F\u002Fmlr-org.com","https:\u002F\u002Fgithub.com\u002Fmlr-org",[79,83,87,91],{"name":80,"color":81,"percentage":82},"R","#198CE7",98,{"name":84,"color":85,"percentage":86},"HTML","#e34c26",1.6,{"name":88,"color":89,"percentage":90},"C","#555555",0.3,{"name":92,"color":93,"percentage":94},"Shell","#89e051",0.1,1679,404,"2026-04-09T17:07:13","NOASSERTION",1,"Linux, macOS, Windows","未说明",{"notes":103,"python":104,"dependencies":105},"该工具是基于 R 语言的机器学习框架，非 Python 工具。官方已宣布该项目进入‘退休’（retired）状态，不再添加新功能，仅修复严重漏洞，建议新项目使用其继任者 mlr3。安装可通过 CRAN 或 GitHub 进行。支持并行计算，但未明确具体硬件门槛。","不适用 (基于 R 语言)",[106,107],"R (基础环境)","S3 
接口支持",[109,14,16,35,52,110,15,111,13],"视频","其他","音频",[113,114,115,116,117,118,119,120,121,122,123,124,125,64,126,127,128,129,130,131],"machine-learning","data-science","tuning","cran","r-package","predictive-modeling","classification","regression","statistics","r","survival-analysis","imbalance-correction","tutorial","learners","hyperparameters-optimization","feature-selection","multilabel-classification","clustering","stacking","2026-03-27T02:49:30.150509","2026-04-13T13:37:42.870013",[135,140,145,150,155,160],{"id":136,"question_zh":137,"answer_zh":138,"source_url":139},31501,"partialPrediction 函数绘制的决策树等分段常数模型为何看起来是分段线性的？","这是因为绘图时使用了线条插值连接各个点，导致视觉上呈现分段线性，但实际上底层的偏预测函数（如决策树）是分段常数的。目前系统无法自动识别包裹模型是否产生此类函数，因此导数计算可能在间断点出现异常。建议用户自行理解模型特性，忽略插值带来的视觉误导。","https:\u002F\u002Fgithub.com\u002Fmlr-org\u002Fmlr\u002Fissues\u002F289",{"id":141,"question_zh":142,"answer_zh":143,"source_url":144},31497,"在使用 resample() 处理大型数据集并结合特征过滤时，遇到 'Assertion on xs failed: Must be of type list, not NULL' 错误怎么办？","该问题通常出现在使用 makeFilterWrapper 进行特征选择且数据量较大时。维护者指出这可能与底层特征过滤实现的扩展性有关。建议尝试使用扩展性更好的特征过滤实现，例如 CORElearn 包中的方法。如果问题依旧，可能需要检查并行计算配置或减少特征数量。","https:\u002F\u002Fgithub.com\u002Fmlr-org\u002Fmlr\u002Fissues\u002F514",{"id":146,"question_zh":147,"answer_zh":148,"source_url":149},31498,"如何为 LiblineaRLogReg 学习器正确设置类别权重（class weights）？","在代码中需要先将权重向量通过 .weights[unique(names(.weights))] 进行处理，以确保权重名称唯一且顺序正确，然后将其作为 wi 参数传递给 LiblineaR 函数。此外，后续版本计划将类别权重信息直接集成到学习器属性中，可能不再需要手动设置 makeWeightedClassesWrapper 中的 wcw.param 参数。","https:\u002F\u002Fgithub.com\u002Fmlr-org\u002Fmlr\u002Fissues\u002F123",{"id":151,"question_zh":152,"answer_zh":153,"source_url":154},31499,"mlr 包的 rdocumentation 网站上文档版本滞后或缺失新功能链接怎么办？","这是由于文档生成配置问题导致的。维护者已修复了版本 2.4 发布版中的图片链接问题。对于压缩包（release 和 devel）中损坏的图片，是因为构建过程中未包含这些文件。如果遇到类似问题，建议禁用 Jekyll 以支持符号链接，或在新的仓库中提交 Issue 
反馈。","https:\u002F\u002Fgithub.com\u002Fmlr-org\u002Fmlr\u002Fissues\u002F330",{"id":156,"question_zh":157,"answer_zh":158,"source_url":159},31500,"如何在预测时使用子模型（submodels），哪些学习器支持此功能？","从 mlr 2.5 版本开始，允许在预测时使用子模型，目前已在 gbm 和 randomForest 学习器中实现。支持子模型的学习器会拥有 'submodel' 属性以及额外的 'submodel.param' 参数，用于指定控制子模型的参数。用户可以通过查看学习器的 properties 来确认是否支持该功能。","https:\u002F\u002Fgithub.com\u002Fmlr-org\u002Fmlr\u002Fissues\u002F298",{"id":161,"question_zh":162,"answer_zh":163,"source_url":164},31502,"randomForestSRCSyn 学习器测试失败并报错 'dim(X) must have a positive length' 是什么原因？","这是一个已知的学习器实现缺陷，导致在应用预测结果时维度检查失败。该问题已被标记为 Bug，通常需要等待维护者修复底层代码或更新相关依赖包。临时解决方案是避免使用该特定的合成采样学习器，或检查输入数据维度是否符合预期。","https:\u002F\u002Fgithub.com\u002Fmlr-org\u002Fmlr\u002Fissues\u002F602",[166,171,176,181,186,191,196,201,206,211,216,221,226,231,236,241,246,251,256,261],{"id":167,"version":168,"summary_zh":169,"released_at":170},233773,"v2.19.1","## 错误修复\n\n- 调整 `classif.logreg` 函数中 `\"positive\"` 参数的行为 (#2846)\n\n- 对不同水平数的变量进行虚拟特征编码时，采用一致的命名方式 (#2847)\n\n- 移除 {nodeHarvest} 学习器 (#2841)\n\n- 移除 {rknn} 学习器 (#2842)\n\n- 移除所有 {DiscriMiner} 学习器 (#2840)\n\n- 移除 {extraTrees} 学习器 (#2839)\n\n- 移除已弃用的 {rrlda} 学习器\n\n- 解决部分 {ggplot2} 的弃用警告\n\n- 修复了 `information.gain` 筛选器的计算问题。\n  此前，由于筛选器命名中的一个错误，即使请求的是 `information.gain`，仍然会计算 `chi.squared`（#2816，@jokokojote）\n\n- 使 `helpLearnerParam()` 的 HTML 解析更加健壮 (#2843)\n\n- 为帮助页面添加 HTML5 支持","2022-09-30T09:24:08",{"id":172,"version":173,"summary_zh":174,"released_at":175},233774,"v2.19.0","- 添加过滤器 `FSelectoRcpp::relief()`。这个基于 C++ 的 RelieF 过滤算法实现，比 {FSelector} 包中基于 Java 的实现快得多 (#2804)\n- 修复 `FilterWrapper` 对象的 S3 打印方法\n- 使 ibrier 指标能够用于生存分析任务 (#2789)\n- 切换到 testthat v3 (#2796)\n- 启用并行测试 (#2796)\n- 将 PMCMR 包替换为 PMCMRplus (#2796)\n- 因 CRAN 已移除 CoxBoost 学习器而将其移除\n- 当 `fix.factors.prediction = TRUE` 导致预测中新因子水平产生 NA 值时发出警告 (@jakob-r, #2794)\n- 当包装的学习器的预测结果与 `newdata` 长度不一致时，清晰地显示错误信息 (@jakob-r, 
#2794)","2021-02-23T08:13:15",{"id":177,"version":178,"summary_zh":179,"released_at":180},233775,"v2.18.0","- 许多 praznik 筛选器现在也能处理回归任务 (#2790, @bommert)\n- `praznik_MRMR`: 移除对生存分析任务的处理 (#2790, @bommert)\n- xgboost: 将 `objective` 的默认值从已弃用的 `reg:linear` 更新为 `reg:squarederror`\n- 如果在 Task 中设置了 `blocking`，但在 `makeResampleDesc()` 中未设置 `blocking.cv`，则发出警告 (#2788)\n- 修复 `generateLearningCurveData()` 中学习器的顺序 (#2768)\n- `getFeatureImportance()`: 考虑线性 xgboost 模型的特征重要性权重\n- 修正 glmnet 学习器的说明（参数 `s` 的默认值与说明不符）(#2747)\n- 移除 `createSpatialResamplingPlots()` 中使用的 {hrbrthemes} 依赖。该包在 R-devel 上曾引发问题。此外，用户应自行设置自定义主题。\n- 在 `getNestedTuneResultsOptPathDf()` 中显式返回值 (#2754)\n","2020-10-06T12:57:58",{"id":182,"version":183,"summary_zh":184,"released_at":185},233776,"v2.17.1","## 学习器 - 错误修复\n\n- 移除 `regr_slim` 学习器，因为其依赖的包（flare）已在 CRAN 上被弃用。\n\n## 评估指标 - 错误修复\n\n- 移除指标 `clValid::dunn` 及其测试用例（该包已被弃用）(#2742)\n- 错误修复：`tuneThreshold()` 现在会考虑评估指标的方向。此前，性能指标始终被最小化(#2732)。\n- 移除调整后的 R² 指标 (arsq)，修复 #2711。\n\n## 过滤器 - 错误修复\n\n- 修复了一个问题：当使用阈值筛选时，随机森林最小深度过滤器只会返回 NA 值。实际上，只有低于给定阈值的特征才应返回 NA。（@annette987, #2710）\n- 修复了简单过滤器无法通过参数 `more.args` 传递过滤器选项的问题。（@annette987, #2709）\n\n## 特征选择 - 错误修复\n\n- 修复了在 `selectFeatures()` 中使用 `bits.to.features` 时 `print.FeatSelResult()` 的问题(#2721)\n- 使 `getFeatureImportance()` 返回一个长格式的数据框(#2708)\n\n## 其他\n\n- pkgdown：将变更日志移至附录\n- 考虑到 {checkmate} v2.0.0 的更新(#2734)\n\n- 重构 ParamSets 中来自各个包的函数调用（如 `\u003Cpkg::fun>`），以避免在 `listLearners()` 中因相关包未安装而引发错误(#2730)\n- 即使某个包未安装，`listLearners()` 也不应失败(#2717)","2020-03-24T11:00:50",{"id":187,"version":188,"summary_zh":189,"released_at":190},233777,"v2.17.0","## 绘图 \r\n \r\n* `n.show` 参数在 `plotFilterValues()` 中不起作用。感谢 @albersonmiranda。(#2689) \r\n \r\n## 函数型数据 \r\n \r\nPR: #2638 (@pfistl) \r\n- 为函数型数据的回归和分类任务新增了多个学习器： \r\n  - classif.classiFunc.(kernel|knn)（基于不同半度量的核方法和 KNN） \r\n  - (classif|regr).fgam（函数型广义加性模型） \r\n  - (classif|regr).FDboost（提升的函数型广义加性模型） \r\n \r\n- 新增了用于从函数型数据中提取特征的预处理步骤： \r\n  - extractFDAFourier（傅里叶变换） \r\n  - 
extractFDAWavelets（小波特征） \r\n  - extractFDAFPCA（主成分分析） \r\n  - extractFDATsfeatures（来自 tsfeatures 包的时间序列特征） \r\n  - extractFDADTWKernel（动态时间规整核） \r\n  - extractFDAMultiResFeatures（在多分辨率下计算特征） \r\n \r\n- 修复了一个错误：多分类问题转化为二分类问题的约简技术无法应用于函数型数据。 \r\n \r\n- 还进行了若干其他小的错误修复和代码改进。 \r\n- 扩展并澄清了多个 fda 组件的文档说明。 \r\n \r\n## 学习器——通用 \r\n \r\n- xgboost：向参数 `tree_method` 添加了选项 'auto'、'approx' 和 'gpu_hist'（@albersonmiranda，#2701） \r\n \r\n## 过滤器——通用 \r\n \r\n- 允许将自定义阈值函数传递给 filterFeatures 和 makeFilterWrapper（@annette987，#2686） \r\n- 允许集成过滤器包含多个相同类型的基过滤器（@annette987，#2688） \r\n \r\n## 过滤器——错误修复 \r\n \r\n- `filterFeatures()`：当应用于集成过滤器时，参数 `thresh` 无法正常工作。（@annette987，#2699） \r\n- 修复了集成过滤器排序不正确的问题。感谢 @annette987（#2698）","2020-01-10T22:22:10",{"id":192,"version":193,"summary_zh":194,"released_at":195},233778,"v2.16.0","## 软件包基础设施 \n \n- pkgdown 站点上现已为所有函数提供引用分组（https:\u002F\u002Fmlr.mlr-org.com\u002Freference\u002Findex.html） \n- CI 测试现仅在 Circle CI 上进行（此前为 Travis CI） \n \n## 学习器——通用改进 \n \n- 修复了 `classif.xgboost` 中的一个 bug，该 bug 导致无法为二分类任务传递监控列表。此问题是由内部标签反转机制不够优化所致。感谢 @001ben 的报告（#32）（@mllg） \n- 更新 `fda.usc` 学习器，使其兼容 >=2.0 版本的软件包 \n- 将 `glmnet` 学习器更新至上游软件包版本 3.0.0 \n- 将 `xgboost` 学习器更新至上游版本 0.90.2（@pat-s 和 @be-marc，#2681） \n- 更新了 `classif.gbm` 和 `regr.gbm` 学习器的参数集。具体而言，参数 `shrinkage` 的默认值已从 0.001 调整为 0.1，并增加了 `distribution` 参数的可选值。同时，已禁用软件包内部的并行化功能（通过参数 `n.cores`）。（@pat-s，#2651） \n- 更新了 `h2o.deeplearning` 学习器的参数设置（@albersonmiranda，#2668） \n \n## 其他 \n \n- 在 `.onLoad()` 中添加 `configureMlr()`，以可能解决一些边缘情况（#2585）（@pat-s，#2637） \n \n## 学习器——错误修复 \n \n- 由于内部 bug，`h2o.gbm` 学习器在未传入 `wcol` 参数的情况下无法运行。此外，该 bug 还导致预测时出现另一个问题：预测结果的 `data.frame` 被错误地格式化为字符型而非数值型。感谢 @nagdevAmruthnath 在 #2630 中指出此问题。 \n \n## 过滤器——通用改进 \n \n- 错误修复：允许 `method = \"vh\"` 用于过滤器 `randomForestSRC_var.select`，并对不支持的值返回更具信息量的错误提示。此外，现在也可以传递 `conservative` 参数。更多信息请参见 #2646 和 #2639。（@pat-s，#2649） \n* 错误修复：随着 _praznik_ v7.0.0 版本的发布，过滤器 `praznik_CMIM` 不再对逻辑型特征返回结果。更多信息请参阅 
https:\u002F\u002Fgitlab.com\u002Fmbq\u002Fpraznik\u002Fissues\u002F19。","2019-11-26T22:22:05",{"id":197,"version":198,"summary_zh":199,"released_at":200},233779,"v2.15.0","## 突破性变化\n \n- 以往返回的是宽格式的 `data.frame`，现在则以长格式（整洁）的 `tibble` 返回筛选结果。这样更便于应用后续处理方法（如 `group_by()` 等）(@pat-s, #2456) \n- `benchmark()` 默认不再存储调优结果（`$extract` 插槽）。  \n  如果您希望保留该插槽（例如用于调优后的分析），请将 `keep.extract = TRUE`。  \n  这一改动源于带有大量调优的 `BenchmarkResult` 对象体积会变得非常大（约 GB 级），在高性能计算集群上连续执行多个 `benchmark()` 调用时，可能会导致运行时内存问题。  \n- `benchmark()` 默认也不再存储生成的模型（`$models` 插槽）。  \n  原因与上述 `$extract` 插槽相同。  \n  如需存储，可设置 `models = TRUE`。 \n \n## 函数 — 通用\n \n- `generateFeatureImportanceData()` 新增参数 `show.info`，用于显示当前正在计算的特征名称、其在队列中的索引以及每个特征的耗时(@pat-s, #26222) \n \n## 学习器 — 通用\n \n- 移除了 `classif.liquidSVM` 和 `regr.liquidSVM`，原因是 `liquidSVM` 已从 CRAN 下架。  \n- 修复了一个导致某些情况下概率聚合不正确的错误。该错误已存在较长时间，由于 `data.table` 在 `rbindlist()` 中的默认行为变更而暴露出来。更多信息请参见 #2578。(@mllg, #2579)  \n- `regr.randomForest` 新增三种估计标准误差的方法：  \n  - `se.method = \"jackknife\"`  \n  - `se.method = \"bootstrap\"`  \n  - `se.method = \"sd\"`  \n  更多详情请参阅 `?regr.randomForest`。  \n  `regr.ranger` 则依赖于该包提供的函数（“jackknife”和“infjackknife”，后者为默认值）。(@jakob-r, #1784)  \n- `regr.gbm` 现在支持“分位数分布”(@bthieurmel, #2603)  \n- `classif.plsdaCaret` 现在支持多分类任务(@GegznaV, #2621) \n \n## 函数 — 通用 \n- `getClassWeightParam()` 现在也适用于 Wrapper* 模型和集成模型(@ja-thomas, #891)  \n- 新增 `getLearnerNote()`，用于查询学习器的“Note”插槽(@alona-sydorova, #2086)  \n- `e1071::svm()` 现在仅当数据中包含因子时才使用公式接口。这一改动旨在防止部分用户在使用大型数据集时遇到的“栈溢出”问题。更多信息请参见 #1738。(@mb706, #1740) \n \n## 学习器 — 新增 \n- 新增来自 _ClusterR_ 包的学习器 `cluster.MiniBatchKmeans`(@Prasiddhi, #2554) \n \n## 函数 — 通用 \n- `plotHyperParsEffect()` 现在支持对嵌套交叉验证中超参数效应进行分面可视化(@MasonGallo, #1653)  \n- 修复了 `options(on.learner.error)` 在 `benchmark()` 中未被正确遵循的 
bug。","2019-08-07T10:15:30",{"id":202,"version":203,"summary_zh":204,"released_at":205},233780,"v2.14.0","## 通用\n* 添加在重采样中使用完全预定义索引的选项（`makeResampleDesc(fixed = TRUE)`）(@pat-s, #2412)。\n* `Task` 帮助页面现在被拆分为单独的页面，例如 `RegrTask`、`ClassifTask` (@pat-s, #2564)\n\n## 函数 - 新增\n* `deleteCacheDir()`: 清空默认的 mlr 缓存目录 (@pat-s, #2463)\n* `getCacheDir()`: 返回默认的 mlr 缓存目录 (@pat-s, #2463)\n\n## 函数 - 通用\n* `getResamplingIndices(inner = TRUE)` 现在能够正确返回内部索引（此前内部索引指的是各自外部层训练集的子集）(@pat-s, #2413)。\n\n## 过滤器 - 通用\n* 在生成过滤器值时现在会使用缓存。\n  这意味着对于特定设置，过滤器值只会计算一次，并在后续迭代中使用存储的缓存。\n  这一变化在调优 `fw.perc`、`fw.abs` 或 `fw.threshold` 时带来了显著的速度提升。\n  可以通过 `makeFilterWrapper()` 或 `filterFeatures()` 中的新参数 `cache` 来启用此功能 (@pat-s, #2463)。\n\n## 过滤器 - 新增\n* praznik_JMI\n* praznik_DISR\n* praznik_JMIM\n* praznik_MIM\n* praznik_NJMIM\n* praznik_MRMR\n* praznik_CMIM\n* FSelectorRcpp_gain.ratio\n* FSelectorRcpp_information.gain\n* FSelectorRcpp_symuncert\n\n此外，过滤器名称已按照以下方案统一：`\u003Cpkgname>_\u003Cfiltername>`。例外情况是包含在 R 基础包中的过滤器，在这种情况下省略了包名。\n\n## 过滤器 - 通用\n* 添加了来自 `FSelectorRcpp` 包的过滤器 `FSelectorRcpp_gain.ratio`、`FSelectorRcpp_information.gain` 和 `FSelectorRcpp_symmetrical.uncertainty`。\n  这些过滤器的速度大约是 `FSelector` 包实现的 100 倍。\n  请注意，这两种实现的内部机制略有不同，因此不应将 `FSelectorRcpp` 的方法视为 `FSelector` 包的直接替代品。\n* 过滤器名称已按照以下方案统一：`\u003Cpkgname>_\u003Cfiltername>`。(@pat-s, #2533)\n  - `information.gain` -> `FSelector_information.gain`\n  - `gain.ratio` -> `FSelector_gain.ratio`\n  - `symmetrical.uncertainty` -> `FSelector_symmetrical.uncertainty`\n  - `chi.squared` -> `FSelector_chi.squared`\n  - `relief` -> `FSelector_relief`\n  - `oneR` -> `FSelector_oneR`\n  - `randomForestSRC.rfsrc` -> `randomForestSRC_importance`\n  - `randomForestSRC.var.select` -> `randomForestSRC_var.select`\n  - `randomForest.importance` -> `randomForest_importance`\n\n* 修复了与加载所需过滤器包命名空间相关的一个错误 (@pat-s, #2483)\n\n## 学习器 - 新增\n* classif.liquidSVM (@PhilippPro, #2428)\n* regr.liquidSVM (@PhilippPro, #2428)\n\n## 学习器 - 通用\n* regr.h2o.gbm: 
添加了多个参数，`\"h2o.use.data.table\" = TRUE` 现已成为默认值 (@j-hartshorn, #2508)\n* h2o 学习器现在支持获取特征重要性 (@markusdumke, #2434)\n\n## 学习器 - 修复\n* 在某些情况下，优化后的超参数并未在性能层面应用。","2019-04-26T22:14:04",{"id":207,"version":208,"summary_zh":209,"released_at":210},233781,"v2.13","## 一般性更改\n* 禁用 CRAN 的单元测试，目前仅在 Travis 上进行测试\n* 使用 show.learner.output = FALSE 抑制输出消息\n\n## 函数 - 一般性更改\n* plotHyperParsEffect：添加颜色\n\n## 函数 - 新增\n* getResamplingIndices\n* createSpatialResamplingPlots\n\n## 学习器 - 一般性更改\n* regr.nnet：移除不必要的参数 linout、entropy、softmax 和 censored\n* regr.ranger：增加权重处理功能\n\n## 学习器 - 已移除\n* {classif,regr}.blackboost：新版本发布后 API 发生了变化","2018-09-09T14:46:06",{"id":212,"version":213,"summary_zh":214,"released_at":215},233782,"v2.12","## 一般性更新\n* 增加了对使用矩阵列表示的功能型数据（fda）的支持。\n* 放宽了包装器的嵌套方式——唯一明确禁止的组合是将调参包装器嵌套在优化包装器中。\n* 重构了重采样进度消息，以提供更清晰的概览，并更好地区分训练和测试指标。\n* calculateROCMeasures 现在返回绝对值而非相对值。\n* 通过提供空间划分方法“SpCV”和“SpRepCV”，增加了对空间数据的支持。\n* 新增了 spatial.task 分类任务。\n* 新增了 spam.task 分类任务。\n* 分类任务现在将类别分布存储在 class.distribution 成员中。\n* 当数据包含 NA 或学习器不支持缺失值时，mlr 现在会预测 NA。\n* 在“train”函数中对任务进行子集化，并根据该子集调整因子水平（适用于分类任务）。这意味着因子水平的分布不一定与整个任务相同；此外，重采样中模型的任务描述反映的是各自的子集，而重采样预测的任务描述则反映的是整个任务，而不一定对应于任何单个模型的任务。\n* 通过新的重采样方法“GrowingWindowCV”和“FixedWindowCV”，增加了针对预测的滑动窗口交叉验证和固定窗口交叉验证的支持。\n\n## 函数 - 一般性更新\n* generatePartialDependenceData：现依赖于“mmpf”包；移除了参数：“center”、“resample”、“fmin”、“fmax”和“gridsize”；新增了“uniform”和“n”参数，用于配置部分依赖图的网格。\n* batchmark：允许处理重采样实例并减少部分结果。\n* resample、performance：新增标志“na.rm”，用于在聚合过程中移除 NA。\n* plotTuneMultiCritResultGGVIS：新增参数“point.info”和“point.trafo”，用于控制交互性。\n* calculateConfusionMatrix：新增参数“set”，用于指定混淆矩阵应针对“train”、“test”还是“both”计算（默认为“both”）。\n* PlotBMRSummary：新增参数“shape”。\n* plotROCCurves：新增分面参数。\n* PreprocWrapperCaret：新增参数“ppc.corr”、“ppc.zv”、“ppc.nzv”、“ppc.n.comp”、“ppc.cutoff”、“ppc.freqCut”、“ppc.uniqueCut”。\n\n## 函数 - 新增\n* makeClassificationViaRegressionWrapper\n* getPredictionTaskDesc\n* helpLearner、helpLearnerParam：打开某个学习器的帮助文档或获取其参数的描述。\n* setMeasurePars\n* makeFunctionalData\n* 
hasFunctionalFeatures\n* extractFDAFeatures, reextractFDAFeatures\n* extractFDAFourier, extractFDAFPCA, extractFDAMultiResFeatures, extractFDAWavelets\n* makeExtractFDAFeatMethod\n* makeExtractFDAFeatsWrapper\n* getTuneResultOptPath\n* makeTuneMultiCritControlMBO: allows model-based multi-criteria\u002Fmulti-objective optimization with mlrMBO.\n\n## functions - removed\n* Removed plotViperCharts.\n\n## measures - general\n* Measure \"arsq\" now has the ID \"arsq\".\n* Measure \"m","2018-06-23T16:00:43",{"id":217,"version":218,"summary_zh":219,"released_at":220},233783,"v2.11","## general\r\n* The internal class naming of the task descriptions has been changed, causing probable incompatibilities with tasks generated under old versions.\r\n* New option on.error.dump to include dumps that can be inspected with the\r\n  debugger when errors occur\r\n* mlr now supports tuning with Bayesian optimization with mlrMBO\r\n\r\n## functions - general\r\n* tuneParams: fixed a small and obscure bug in logging for extremely large ParamSets\r\n* getBMR-operators: now support \"drop\" argument that simplifies the resulting list\r\n* configureMlr: added option \"on.measure.not.applicable\" to handle situations where performance\r\n  cannot be calculated and one wants NA instead of an error - useful in, e.g., larger benchmarks\r\n* tuneParams, selectFeatures: removed memory stats from default output for\r\n  performance reasons (can be restored by using a control object with \"log.fun\"\r\n  = \"memory\")\r\n* listLearners: change check.packages default to FALSE\r\n* tuneParams and tuneParamsMultiCrit: new parameter `resample.fun` to specify a custom resampling function to use.\r\n* Deprecated: getTaskDescription, getBMRTaskDescriptions, getRRTaskDescription.\r\n  New names: getTaskDesc, getBMRTaskDescs, getRRTaskDesc.\r\n\r\n## functions - new\r\n* getOOBPreds: get out-of-bag predictions from trained models for learners that store them -- these learners have the new \"oobpreds\" property\r\n* listTaskTypes, listLearnerProperties\r\n* getMeasureProperties, hasMeasureProperties, 
listMeasureProperties\r\n* makeDummyFeaturesWrapper: fuse a learner with a dummy feature creator\r\n* simplifyMeasureNames: shorten measure names to the actual measure, e.g.\r\n  mmce.test.mean -> mmce\r\n* getFailureModelDump, getPredictionDump, getRRDump: get error dumps\r\n* batchmark: Function to run benchmarks with the batchtools package on high performance computing clusters\r\n* makeTuneControlMBO: allows Bayesian optimization\r\n\r\n## measures - new\r\n* kendalltau, spearmanrho\r\n\r\n## learners - general\r\n* classif.plsdaCaret: added parameter \"method\".\r\n* regr.randomForest: refactored se-estimation code, improved docs and default is now se.method = \"jackknife\".\r\n* regr.xgboost, classif.xgboost: removed \"factors\" property as these learners do not handle categorical features\r\n-- factors are silently converted to integers internally, which may misinterpret the structure of the data\r\n* glmnet: control parameters are reset to factory settings before applying\r\n  custom settings and training and set back to factory afterwards\r\n\r\n## learners - removed\r\n* {classif,regr}.avNNet: no longer necessary, mlr contains a bagging wrapper","2018-06-23T16:00:33",{"id":222,"version":223,"summary_zh":224,"released_at":225},233784,"v2.10","## functions - general\r\n* fixed bug in resample when using predict = \"train\" (issue #1284)\r\n* update to irace 2.0 -- there are algorithmic changes in irace that may affect\r\n  performance\r\n* generateFilterValuesData: fixed a bug wrt feature ordering\r\n* imputeLearner: fixed a bug when data actually contained no NAs\r\n* print.Learner: if a learner hyperpar was set to value \"NA\" this was not\r\n  displayed in printer\r\n* makeLearner, setHyperPars: if you mistype a learner or hyperpar name, mlr\r\n  uses fuzzy matching to suggest the 3 closest names in the message\r\n* tuneParams: tuning with irace is now also parallelized, i.e., different\r\n  learner configs are evaluated in parallel.\r\n* benchmark: mini 
fix, arg 'learners' now also accepts class strings\r\n* object printers: some mlr printers show head previews of data.frames.\r\n  these now also print info on the total nr of rows and cols and are less confusing\r\n* aggregations: have better properties now, they know whether they require training or\r\n  test set evals\r\n* the filter methods have better R docs\r\n* filter randomForestSRC.var.select: new arg \"method\"\r\n* filter mrmr: fixed some smaller bugs and updated properties\r\n* generateLearningCurveData: also accepts single learner, does not require a list\r\n* plotThreshVsPerf: added \"measures\" arg\r\n* plotPartialDependence: can create tile plots with joint partial dependence\r\n  on two features for multiclass classification by facetting across the classes\r\n* generatePartialDependenceData and generateFunctionalANOVAData: expanded\r\n  \"fun\" argument to allow for calculation of weights\r\n* new \"?mlrFamilies\" manual page which lists all families and the functions\r\n  belonging to it\r\n* we are converging on data.table as a standard internally, this should not\r\n  change any API behavior on the outside, though\r\n* generateHyperParsEffectData and plotHyperParsEffect now support more than 2\r\n  hyperparameters\r\n* linear.correlation, rank.correlation, anova.test: use Rfast instead of\r\n  FSelector\u002Fcustom implementation now, performance should be much better\r\n* use of our own colAUC function instead of the ROCR package for AUC calculation\r\n  to improve performance\r\n* we output resample performance messages for every iteration now\r\n* performance improvements for the auc measure\r\n* createDummyFeatures supports vectors now\r\n* removed the pretty.names argument from plotHyperParsEffect -- labels can be set\r\n  through normal ggplot2 functions on the returned object\r\n* Fixed a bad bug in resample, the slot \"runtime\" of a ResampleResult,\r\n  when the runtime was measured not in seconds but e.g. mins. 
R measures then potentially in mins,\r\n  but mlr claimed it would be seconds.\r\n* New \"dummy\" learners (that disregard features completely) can be fitted now for baseline comparisons,\r\n  see \"featureless\" learners below.\r\n\r\n## functions - new\r\n* filter: randomForest.importance\r\n* generateFeatureImportanceData: permutation-based feature importance and local\r\n  importance\r\n* getFeatureImportanceLearner: new Learner API function\r\n* getFeatureImportance: top level function to extract feature importance\r\n  information\r\n* calculateROCMeasures\r\n* calculateConfusionMatrix: new confusion-matrix like function that calculates\r\n  and tables many receiver operator measures\r\n* makeLearners: create multiple learners at once\r\n* getLearnerId, getLearnerType, getLearnerPredictType, getLearnerPackages\r\n* getLearnerParamSet, getLearnerParVals\r\n* getRRPredictionList\r\n* addRRMeasure\r\n* plotResiduals\r\n* getLearnerShortName\r\n* mergeBenchmarkResults\r\n\r\n## functions - renamed\r\n* Renamed rf.importance filter (now deprecated) to randomForestSRC.rfsrc\r\n* Renamed rf.min.depth filter (now deprecated) to randomForestSRC.var.select\r\n* Renamed getConfMatrix (now deprecated) to calculateConfusionMatrix\r\n* Renamed setId (now deprecated) to setLearnerId\r\n\r\n## functions - removed\r\n* mergeBenchmarkResultLearner, mergeBenchmarkResultTask\r\n\r\n## learners - general\r\n* classif.ada: fixed some param problem with rpart.control params\r\n* classif.cforest, regr.cforest, surv.cforest:\r\n  removed parameters \"minprob\", \"pvalue\", \"randomsplits\"\r\n  as these are set internally and cannot be changed by the user\r\n* regr.GPfit: some more params for correlation kernel\r\n* classif.xgboost, regr.xgboost: can now properly handle NAs (property was missing and other problems), added \"colsample_bylevel\" parameter\r\n* adapted {classif,regr,surv}.ranger parameters for new ranger version\r\n\r\n## learners - new\r\n* multilabel.cforest\r\n* 
surv.gbm\r\n* regr.cvglmnet\r\n* {classif,regr,surv}.gamboost\r\n* classif.earth\r\n* {classif,regr}.evtree\r\n\r\n## learners - removed\r\n* classif.randomForestSRCSyn, regr.randomForestSRCSyn: due to continued stability issues\r\n\r\n## measures - new\r\n* ssr, qsr, lsr\r\n* rrse, rae, mape\r\n* kappa, wkappa\r\n* msle, rmsle\r\n","2018-06-23T16:00:19",{"id":227,"version":228,"summary_zh":229,"released_at":230},233785,"v2.9","## functions - general\r\n* various cleanups that removed unused code\r\n* subsetTask, getTaskData: arg \"features\" now also accepts logical and integer\r\n* removeConstantFeatures now also operates on data.frames and\r\n  makeRemoveConstantFeaturesWrapper can be used to augment a learner with this\r\n  preprocessing step.\r\n* normalizeFeatures, createDummyFeatures: arg 'exclude' was replaced by 'cols'\r\n* normalizeFeatures is now S3 and can be called also on data.frames\r\n* SMOTEWrapper: fix a bug where \"sw.nn\" was not correctly passed down\r\n* fixed a bug that caused hyperparameters to be not passed on correctly in the\r\n  ModelMultiplexer in some cases\r\n* fix bug with NoFeaturesModel and ModelMultiplexer\r\n* fix small bug in DownsampleWrapper when trained with weights\r\n* getNestedTuneResultsOptPathDf: added new arg \"trafo\"\r\n* improve documentation for permutation.importance filter and perform slight\r\n  argument renaming to fix potential name clashes\r\n* plotPartialDependence can plot classification tasks with more than one\r\n  interacted feature now\r\n* generateFilterValuesData: added argument 'more.args'\r\n* add pretty.names arguments to plots that show learner short names instead of IDs\r\n* addition of 'data' argument to plotPartialDependence which adds the training\r\n  data to the graph\r\n* added new arguments \"facet.wrap.nrow\" and \"facet.wrap.ncol\" which enable\r\n  arrangement of facets in\r\n  rows and columns to plotting functions\r\n\r\n## functions - new\r\n* 
generateHyperParsEffectData, plotHyperParsEffect\r\n* makeMultilabelClassifierChainsWrapper, makeMultilabelDBRWrapper,\r\n  makeMultilabelNestedStackingWrapper, makeMultilabelStackingWrapper\r\n* makeConstantClassWrapper\r\n* generateFunctionalANOVAData\r\n\r\n## functions - removed\r\n* getParamSet generic (now in ParamHelpers package)\r\n\r\n## functions - renamed\r\n* generatePartialPrediction to generatePartialDependence\r\n* plotPartialPrediction to plotPartialDependence\r\n* plotPartialPredictionGGVIS to plotPartialDependenceGGVIS\r\n\r\n## learners - general\r\n* fixed weight handling and weight tag for some learners\r\n* remove unnecessary linear.output parameter for classif.neuralnet\r\n* remove unsupported KSVM parameter value stringdot\r\n* fix some bartMachine compatibility issues\r\n* classif.ranger, regr.ranger and surv.ranger: now respect unordered factors by\r\n  default\r\n* clean up randomForestSRC and randomForestSRCSyn learners\r\n* the \"penalized\" learners were restructured and improved (params were added), also\r\n  see below.\r\n* add stability.nugget parameter for \"regr.km\"\r\n* classif.blackboost, regr.blackboost: made sure that arg \"stump\" is passed on\r\n  correctly\r\n* fixed parameter values for WEKA learners IBk, J48, PART, EM, SimpleKMeans, XMeans\r\n* classif.glmboost, regr.glmboost: add parameters stopintern and trace\r\n\r\n## learners - new\r\n* classif.C50\r\n* classif.gausspr\r\n* classif.penalized.fusedlasso\r\n* classif.penalized.lasso\r\n* classif.penalized.ridge\r\n* classif.h2o.deeplearning\r\n* classif.h2o.gbm\r\n* classif.h2o.glm\r\n* classif.h2o.randomForest\r\n* classif.rrf\r\n* regr.penalized.fusedlasso\r\n* regr.gausspr\r\n* regr.glm\r\n* regr.GPfit\r\n* regr.h2o.deeplearning\r\n* regr.h2o.gbm\r\n* regr.h2o.glm\r\n* regr.h2o.randomForest\r\n* regr.rrf\r\n* surv.cv.CoxBoost\r\n* surv.penalized.fusedlasso\r\n* surv.penalized.lasso\r\n* surv.penalized.ridge\r\n* cluster.kkmeans\r\n* 
multilabel.randomforestSRC\r\n\r\n## learners - removed\r\n* surv.optimCoxBoostPenalty\r\n* surv.penalized (split up, see new learners above)\r\n\r\n## measures - general\r\n* updated gmean measure and unit test, added reference to formula of gmean\r\n* makeCostMeasure: removed arg \"task\", names of cost matrix are checked on measure\r\n  calculation\r\n\r\n## measures - new\r\n* multiclass.brier\r\n* brier.scaled\r\n* logloss\r\n* multilabel.subset01, multilabel.f1, multilabel.acc, multilabel.ppv,\r\n  multilabel.tpr\r\n* multiclass.au1p, multiclass.au1u, multiclass.aunp, multiclass.aunu\r\n\r\n## measures - renamed\r\n* multiclass.auc to multiclass.au1u\r\n* hamloss to multilabel.hamloss","2018-06-23T16:00:05",{"id":232,"version":233,"summary_zh":234,"released_at":235},233786,"v2.8","* Feature filter \"univariate\" had a bad name, was deprecated and is now called\r\n  \"univariate.model.score\". The new one also has better defaults.\r\n* (generate\u002Fplot)PartialPrediction: added new arg \"geom\" for tile plots\r\n* small fix for plotBMRSummary\r\n* the ModelMultiplexer inherits its predict.type from the base learners now\r\n* check that learners in an ensemble have the same predict.type\r\n* new function getBMRModels to extract stored models from a benchmark result\r\n* Fixed a bug where several learners from the LiblineaR package\r\n  (\"classif.LiblineaRL2LogReg\", \"classif.LiblineaRL2SVC\", \"regr.LiblineaRL2L2SVR\")\r\n  were calling the wrong value for \"type\" (0) and thus training the wrong model.\r\n* Fixed a bug where the resampling objects hout, cv2, cv3, cv5, cv10 were not\r\n  documented in the ResampleDesc help page\r\n* regr.xgboost, classif.xgboost: add feval param\r\n* fixed a bug in irace tuning interface with unnamed discrete values\r\n* Fixed bugs in \"jackknife\" and \"bootstrap\" se estimators for regr.randomForest.\r\n* Added \"sd\" estimator for regr.randomForest.\r\n* Fixed a mini bug in ModelMultiplexer where hyperpars that are only 
needed in\r\n  predict were not passed down correctly\r\n* Fixed a bug where the function capLargeValues wasn't working if you passed a\r\n  task.\r\n* capLargeValues now has a new argument \"target\", to prevent capping of response\r\n  values.\r\n* classif.gbm, regr.gbm: Updated possible 'distribution' settings a bit.\r\n* oversample, undersample, makeOversampleWrapper, makeUndersampleWrapper,\r\n  makeOverBaggingWrapper:\r\n  Added arguments to specifically select the sampled class.\r\n\r\n## API changes\r\n* listLearners now returns a data frame with properties of the learners if\r\n  create is false\r\n\r\n## new functions\r\n* getBMRModels\r\n\r\n## removed functions\r\n* generateROCRCurvesData, plotROCRCurves, plotROCRCurvesGGVIS\r\n\r\n## new learners\r\n* classif.randomForestSRCSyn\r\n* classif.cvglmnet\r\n* regr.randomForestSRCSyn\r\n* cluster.dbscan\r\n\r\n## new measures\r\n* rsq, arsq, expvar","2018-06-23T15:59:50",{"id":237,"version":238,"summary_zh":239,"released_at":240},233787,"v2.7","* New argument \"models\" for function benchmark\r\n* fixed a bug where 'keep.pred' was ignored in the benchmark function\r\n* some of the very new functions for benchmark plots had to be refactored and\u002For\r\n  renamed.\r\n  these names are gone from the API:\r\n  plotBenchmarkResult, generateRankMatrixAsBarData, plotRankMatrixAsBar, generateBenchmarkSummaryData, plotBenchmarkSummary,\r\n  this is the new API:\r\n  plotBMRSummary, plotBMRBoxplots, plotBMRRanksAsBarChart","2018-06-23T15:59:36",{"id":242,"version":243,"summary_zh":244,"released_at":245},233788,"v2.6","* cluster.kmeans: added support for fuzzy clustering (property \"prob\")\r\n* regr.lm: removed some erroneous param settings\r\n* regr.glmnet: added 'family' param and allowed 'gaussian', but also 'poisson'\r\n* disabled plotViperCharts unit tests as VC seems to be offline currently\r\n* multilabel: improved a few task getter functions; especially getTaskFormula is\r\n  now correct\r\n\r\n## new 
learners\r\n* regr.glmboost\r\n* cluster.Cobweb","2018-06-23T15:59:22",{"id":247,"version":248,"summary_zh":249,"released_at":250},233789,"v2.5","* fixed a bug that caused performance() to return incorrect values with\r\n  ResamplePredictions\r\n* we have (somewhat experimental) support for multilabel classification.\r\n  so we now have a task, a new baselearner (rFerns),\r\n  and a generic reduction-to-binary algorithm (MultilabelWrapper)\r\n* tuning: added 'budget' parameter in makeTuneControl* (single-objective)\r\n  and makeTuneMultiCritControl* (multi-objective scenarios), allowing to define\r\n  a maximum \"number of evaluations\" budget for tuning algorithms\r\n* makeTuneControlGenSA: optimized function will be considered non-smooth\r\n  per default (change via ... args)\r\n* classif.svm, regr.svm: added 'scale' param\r\n* ksvm: added 'cache' param\r\n* plotFilterValuesGGVIS: sort and n_show are interactive, interactive flag removed\r\n* renamed getProbabilities to getPredictionProbabilities and deprecated\r\n  getProbabilities\r\n* plots now use long names for measures where possible\r\n* there was a nasty bug in measure \"mcc\". fixed and unit tested. 
and apologies.\r\n* removed getTaskFormulaAsString and improved getTaskFormula so the former is\r\n  not needed anymore\r\n* aggregations now have a 'name' property, which is a long name\r\n* generateLearningCurveData and generateThreshVsPerfData now append the\r\n  aggregation id to the output column name if the measure ids are the same\r\n* plotLearningCurve, plotLearningCurveGGVIS, plotThreshVsPerf,\r\n  plotThreshVsPerfGGVIS now have an argument\r\n  'pretty.names' which plots the 'name' element of the measures instead of the 'id'.\r\n* makeCustomResampledMeasure now has arguments 'measure.id' and 'aggregation.id'\r\n  instead of only 'id' which corresponded to the measure. Also, 'name' and 'note' (corresponding to the measure)\r\n  as well as 'aggregation.name' have been added.\r\n* makeCostMeasure now has arguments 'name' and 'id'.\r\n* classification learners can now have a property 'class.weights', supported by\r\n  'class.weights.param'. The latter indicates which of the parameters provides\r\n  the class weights information to the learner.\r\n* class weights integrated in the learner will be used as default for 'wcw.param'\r\n  in 'makeWeightedClassesWrapper'\r\n* listLearners with create = FALSE does not load packages anymore and is\r\n  therefore faster and more reliable; it also supports the additional parameter\r\n  check.packages now that will check whether required packages are installed\r\n  without loading them\r\n* many new functions for statistical benchmark comparisons are added, see below\r\n* rename hasProperties, getProperties to hasLearnerProperties and\r\n  getLearnerProperties\r\n* Learner properties are now implemented object oriented as a state of a Learner.\r\n  Only RLearners have the properties stored in a slot.\r\n  For each class the getter can be overwritten.\r\n* The hill climbing algorithm for stacking (Caruana 04) is implemented as method\r\n  'hill.climb' in 'makeStackedLearner' to select models from base learners, which\r\n  is 
equivalent to a weighted average.\r\n* The model compression algorithm for stacking (Caruana 06) is implemented as\r\n  method 'compress' in 'makeStackedLearner' to first select models from base\r\n  learners and then mimic the behaviour with a super learner. The default super\r\n  learner is a neural network.\r\n* relativeOverfitting provides a way to estimate how much a model overfits to\r\n  the training data according to a measure.\r\n* restructured the LiblineaR learners to a more convenient format. These old ones\r\n  were removed:\r\n  classif.LiblineaRBinary, classif.LiblineaRLogReg, classif.LiblineaRMultiClass.\r\n  For the new ones, see below.\r\n* Added some commonly used ResampleDesc description objects, to save typing in\r\n  resample experiments:\r\n  hout, cv2, cv3, cv5, cv10.\r\n* regr.randomForest: changed default nodesize to 5 (according to randomForest\r\n  defaults)\r\n\r\n## new functions\r\n* getDefaultMeasure\r\n* getTaskClassLevels\r\n* getPredictionTruth, getPredictionResponse, getPredictionSE\r\n* convertMLBenchObjToTask\r\n* getBMRLearners, getBMRMeasures, getBMRMeasureIds\r\n* makeMultilabelTask, makeMultilabelWrapper, getMultilabelBinaryPerformances\r\n* generatePartialPredictionData, plotPartialPrediction, and\r\n  plotPartialPredictionGGVIS\r\n* getClassWeightParam\r\n* plotBenchmarkResult, convertBMRToRankMatrix, generateRankMatrixAsBarData,\r\n  plotRankMatrixAsBar, generateBenchmarkSummaryData, plotBenchmarkSummary,\r\n  friedmanTestBMR, friedmanPostHocTestBMR, generateCritDifferencesData,\r\n  plotCritDifferences\r\n* getCaretParamSet\r\n* generateCalibrationData and plotCalibration\r\n* relativeOverfitting\r\n* plotROCCurves\r\n\r\n## new measures\r\n* hamloss\r\n\r\n## new learners\r\n* multilabel.rFerns\r\n* classif.avNNet\r\n* classif.neuralnet\r\n* regr.avNNet\r\n* classif.clusterSVM\r\n* classif.dcSVM\r\n* classif.gaterSVM\r\n* classif.mlp\r\n* classif.saeDNN\r\n* classif.dbnDNN\r\n* classif.nnTrain\r\n* 
classi","2018-06-23T15:59:08",{"id":252,"version":253,"summary_zh":254,"released_at":255},233790,"v2.4","* WrappedModel printer was slightly improved\r\n* ResampleResult now stores the runtime it took to resample in a slot\r\n* getTaskFormula \u002F getTaskFormulaAsString have new argument 'explicit.features'\r\n* getTaskData now has recodeY = \"drop.levels\" which drops empty factor levels\r\n* option fix.factors in makeLearner was renamed to fix.factors.prediction for\r\n  clarity\r\n* showHyperPars was removed. getParamSet does exactly the same thing\r\n* 'resample' and 'benchmark' got the argument keep.pred,\r\n  setting it to FALSE allows to discard the prediction objects to save memory\r\n* we had to slightly change how the mem usage is reported in tuning and feature\r\n  selection\r\n  See TuneControl and FeatSelControl where it is documented what is done now.\r\n* tuneIrace: allows to set the precision \u002F digits within irace (using the argument\r\n  'digits' in makeTuneControlIrace); default is maximum precision\r\n* for plotting in general we try to introduce a \"data layer\", so the data can be\r\n  generated independently of the plotting first, into well-defined objects;\r\n  these can then be plotted with mlr or custom code;\r\n  the naming scheme is always generate\u003CFoo>Data and plot\u003CFoo>\r\n* getFilterValues is deprecated in favor of generateFilterValuesData\r\n* plotFilterValues can now plot multiple filter methods using facetting\r\n* plotROCRCurves has been rewritten to use ggplot2\r\n* classif.ada: added \"loss\" hyperpar\r\n* add \"missings\" property to all ctree and cforest methods:\r\n  regr\u002Fclassif for ctree, regr\u002Fclassif\u002Fsurv for cforest, and regr\u002Fclassif for blackboost\r\n* learner xgboost was removed, because the package is not on CRAN anymore,\r\n  unfortunately\r\n* regr.km: added param 'iso'\r\n* classif.mda: added param 'start.method' and changed its default to 'lvq', added\r\n  params 'sub.df', 'tot.df' 
and 'criterion'\r\n* classif.randomForest: 'sampsize' can now be an int vector (instead of a scalar)\r\n* plotThreshVsPerf and plotLearningCurve now have param 'facet'\r\n\r\n## new functions\r\n* getTaskSize\r\n* getNestedTuneResultsX, getNestedTuneResultsOptPathDf\r\n* tuneDesign\r\n* generateROCRCurvesData, generateFilterValuesData, generateLearningCurveData,\r\n  plotLearningCurve, generateThreshVsPerfData, plotThreshVsPerf\r\n* generateThreshVsPerfData accepts Prediction, ResampleResult, lists of\r\n  ResampleResult, and BenchmarkResult objects.\r\n* experimental ggvis functions: plotROCRCurvesGGVIS, plotLearningCurveGGVIS,\r\n  plotTuneMultiCritResultGGVIS, plotThreshVsPerfGGVIS, and plotFilterValuesGGVIS\r\n\r\n## new learners:\r\n* classif.bst\r\n* classif.hdrda\r\n* classif.nodeHarvest\r\n* classif.pamr\r\n* classif.rFerns\r\n* classif.sparseLDA\r\n* regr.bst\r\n* regr.frbs\r\n* regr.nodeHarvest\r\n* regr.slim\r\n\r\n## new measures:\r\n* brier","2018-06-23T15:58:54",{"id":257,"version":258,"summary_zh":259,"released_at":260},233791,"v2.3","* resample now returns an object of class ResampleResult (downward compatible)\r\n  to allow for a print method.\r\n* resampling on features now supported for an arbitrary number of factor features\r\n* mlr supports ViperCharts plots now\r\n* ROC plot via ROCR can now be created automatically, before you had to call\r\n  asROCRPrediction,\r\n  then construct the plots via ROCR yourself. 
See plotROCRCurves\r\n* all mlr measures now have slots \"name\" and \"note\"\r\n* exported a few very simple \"getters\" for tasks, see below\r\n* in makeLearner a probability predict.threshold can be set for classifiers, also\r\n  see setPredictThreshold\r\n* in the control objects for tuning and feature selection, the user can now enable\r\n  threshold tuning\r\n* in the control objects for tuning and feature selection, the user can now define\r\n  their own logging function\r\n* default console logging for tuneParams and selectFeatures is more informative,\r\n  it displays time and memory info\r\n* updated some properties of some learners\r\n* Default arguments of classif.bartMachine, classif.randomForestSRC,\r\n  regr.randomForestSRC and surv.randomForestSRC\r\n  have been changed to allow missing data support with default settings.\r\n* externalized measure functions to be used on vectors.\r\n* some minor bug fixes\r\n* required basic learner packages are not loaded into the global namespace\r\n  anymore, requireNamespace\r\n  is used internally instead. this ensures less name clashes and name shadowing\r\n* resample passes dot arguments to the learner hyperpars\r\n* new option \"on.par.out.of.bounds\" to disable out-of-bound checks for model\r\n  parameters\r\n* measures were slightly internally changed. 
they expose more properties (check\r\n  ?Measure) and some now unnecessary object slots were removed\r\n* classif.lda and classif.qda now have hyperpar \"predict.method\"\r\n* filterFeatures and makeFilterWrapper gain an argument for mandatory features\r\n* plotLearnerPrediction has new option \"err.size\"\r\n* classif.plsDA and cluster.DBscan for now removed because of problems with the\r\n  underlying learning algorithm\r\n* new aggregation test.join\r\n* the following models now can handle factors and ordereds by extra dummy or int\r\n  encoding:\r\n  classif.glmnet, regr.glmnet, surv.glmnet, surv.cvglmnet, surv.penalized,\r\n  surv.optimCoxBoostPenalty, surv.glmboost, surv.CoxBoost\r\n\r\n## new functions\r\n* getTaskType, getTaskId, getTaskTargetNames\r\n* plotROCRCurves\r\n* plotViperCharts\r\n* measureSSE, measureMSE, measureRMSE, measureMEDSE, ...\r\n* PreprocWrapperCaret\r\n* setPredictThreshold\r\n\r\n## new learners:\r\n* classif.bdk\r\n* classif.binomial\r\n* classif.extraTrees\r\n* classif.probit\r\n* classif.xgboost\r\n* classif.xyf\r\n* regr.bartMachine\r\n* regr.bcart\r\n* regr.bdk\r\n* regr.bgp\r\n* regr.bgpllm\r\n* regr.blm\r\n* regr.brnn\r\n* regr.btgp\r\n* regr.btgpllm\r\n* regr.btlm\r\n* regr.cubist\r\n* regr.elmNN\r\n* regr.extraTrees\r\n* regr.laGP\r\n* regr.xgboost\r\n* regr.xyf\r\n* surv.rpart","2018-06-23T15:58:37",{"id":262,"version":263,"summary_zh":264,"released_at":265},233792,"v2.2","* The web tutorial was MUCH improved!\r\n* more example tasks and data sets\r\n* Learners and tasks now support ordered factors as features.\r\n  The task description knows whether ordered factors are present and it is checked\r\n  whether the learner supports such a feature. 
We have set this property 'ordered'\r\n  very conservatively, so very few learners have it, where we are sure ordered\r\n  inputs are handled correctly during training.\r\n  If you know of more models that support this, please inform us.\r\n* basic R learners now have new slots: name (a descriptive name of the algorithm),\r\n  short.name (abbreviation that can be used in plots and tables) and note\r\n  (notes regarding slight changes for the mlr integration of the learner and such).\r\n* makeLearner now supports some options regarding learner error handling and\r\n  output which previously could only be set globally via configureMlr\r\n* Additional arguments for imputation functions to allow more fine-grained\r\n  control of dummy column creation\r\n* imputeMin and imputeMax now subtract or add a multiple of the range of\r\n  the data from the minimum or to the maximum, respectively.\r\n* cluster methods now have property 'prob' when they support fuzzy cluster\r\n  membership probabilities,\r\n  and also then support predict.type = 'prob'. Everything basically works the same\r\n  as for posterior probabilities in classif.* methods.\r\n* predict preserves the rownames of the input in its output\r\n* fixed a bug in createDummyFeatures that caused an error when the data contained\r\n  missing values.\r\n* plotLearnerPrediction works for clustering and allows greyscale plots (for\r\n  printing or articles)\r\n* the whole object-oriented structure behind feature filtering was much\r\n  improved. Smaller changes in the signature of makeFilterWrapper and\r\n  filterFeatures have become necessary.\r\n* fixed a bug in filter methods of the FSelector package that caused an error when\r\n  variable names contained accented letters\r\n* filterFeatures can now be also applied to the result of getFilterValues\r\n* We dropped the data.frame version of some preprocessing operations like\r\n  mergeFactorLevelsBySize,\r\n  joinClassLevels and removeConstantFeatures for consistency. 
These now always require tasks as input.\r\n* We support a pretty generic framework for stacking \u002F super-learning now, see\r\n  makeStackedLearner\r\n* imbalance correction + smote:\r\n  ** fix a bug in \"smote\" when only factor features are present\r\n  ** change to oversampling: sample new observations only (with replacement)\r\n  ** extension to smote algorithm (sampling): minority class observations in\r\n  binary classification\r\n  are either chosen via sampling or alternatively, each minority class observation\r\n  is used an equal number of times\r\n* made the getters for BenchmarkResult more consistent. These are now:\r\n  getBMRTaskIds, getBMRLearnerIds, getBMRPredictions, getBMRPerformances,\r\n  getBMRAggrPerformances,\r\n  getBMRTuneResults, getBMRFeatSelResults, getBMRFilteredFeatures\r\n  The following methods do not work for BenchmarkResult anymore: getTuneResult, getFeatSelResult\r\n* Removed getFilterResult, because it does the same as getFilteredFeatures\r\n\r\n## new learners:\r\n* classif.bartMachine\r\n* classif.lqa\r\n* classif.randomForestSRC\r\n* classif.sda\r\n* regr.ctree\r\n* regr.plsr\r\n* regr.randomForestSRC\r\n* cluster.cmeans\r\n* cluster.DBScan\r\n* cluster.kmeans\r\n* cluster.FarthestFirst\r\n* surv.cvglmnet\r\n* surv.optimCoxBoostPenalty\r\n\r\n## new filters:\r\n* variance\r\n* univariate\r\n* carscore\r\n* rf.importance, rf.min.depth\r\n* anova.test, kruskal.test\r\n* mrmr\r\n\r\n## new functions\r\n* makeMulticlassWrapper\r\n* makeStackedLearner, getStackedBaseLearnerPredictions\r\n* joinClassLevels\r\n* summarizeColumns, summarizeLevels\r\n* capLargeValues, mergeFactorLevelsBySize","2018-06-23T15:57:17"]