[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-EducationalTestingService--skll":3,"tool-EducationalTestingService--skll":61},[4,18,26,36,44,53],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":17},4358,"openclaw","openclaw\u002Fopenclaw","OpenClaw 是一款专为个人打造的本地化 AI 助手，旨在让你在自己的设备上拥有完全可控的智能伙伴。它打破了传统 AI 助手局限于特定网页或应用的束缚，能够直接接入你日常使用的各类通讯渠道，包括微信、WhatsApp、Telegram、Discord、iMessage 等数十种平台。无论你在哪个聊天软件中发送消息，OpenClaw 都能即时响应，甚至支持在 macOS、iOS 和 Android 设备上进行语音交互，并提供实时的画布渲染功能供你操控。\n\n这款工具主要解决了用户对数据隐私、响应速度以及“始终在线”体验的需求。通过将 AI 部署在本地，用户无需依赖云端服务即可享受快速、私密的智能辅助，真正实现了“你的数据，你做主”。其独特的技术亮点在于强大的网关架构，将控制平面与核心助手分离，确保跨平台通信的流畅性与扩展性。\n\nOpenClaw 非常适合希望构建个性化工作流的技术爱好者、开发者，以及注重隐私保护且不愿被单一生态绑定的普通用户。只要具备基础的终端操作能力（支持 macOS、Linux 及 Windows WSL2），即可通过简单的命令行引导完成部署。如果你渴望拥有一个懂你",349277,3,"2026-04-06T06:32:30",[13,14,15,16],"Agent","开发框架","图像","数据工具","ready",{"id":19,"name":20,"github_repo":21,"description_zh":22,"stars":23,"difficulty_score":10,"last_commit_at":24,"category_tags":25,"status":17},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,"2026-04-05T11:01:52",[14,15,13],{"id":27,"name":28,"github_repo":29,"description_zh":30,"stars":31,"difficulty_score":32,"last_commit_at":33,"category_tags":34,"status":17},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 
代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",159267,2,"2026-04-17T11:29:14",[14,13,35],"语言模型",{"id":37,"name":38,"github_repo":39,"description_zh":40,"stars":41,"difficulty_score":32,"last_commit_at":42,"category_tags":43,"status":17},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",108322,"2026-04-10T11:39:34",[14,15,13],{"id":45,"name":46,"github_repo":47,"description_zh":48,"stars":49,"difficulty_score":32,"last_commit_at":50,"category_tags":51,"status":17},6121,"gemini-cli","google-gemini\u002Fgemini-cli","gemini-cli 是一款由谷歌推出的开源 AI 命令行工具，它将强大的 Gemini 大模型能力直接集成到用户的终端环境中。对于习惯在命令行工作的开发者而言，它提供了一条从输入提示词到获取模型响应的最短路径，无需切换窗口即可享受智能辅助。\n\n这款工具主要解决了开发过程中频繁上下文切换的痛点，让用户能在熟悉的终端界面内直接完成代码理解、生成、调试以及自动化运维任务。无论是查询大型代码库、根据草图生成应用，还是执行复杂的 Git 操作，gemini-cli 都能通过自然语言指令高效处理。\n\n它特别适合广大软件工程师、DevOps 人员及技术研究人员使用。其核心亮点包括支持高达 100 万 token 的超长上下文窗口，具备出色的逻辑推理能力；内置 Google 搜索、文件操作及 Shell 命令执行等实用工具；更独特的是，它支持 MCP（模型上下文协议），允许用户灵活扩展自定义集成，连接如图像生成等外部能力。此外，个人谷歌账号即可享受免费的额度支持，且项目基于 Apache 2.0 
协议完全开源，是提升终端工作效率的理想助手。",100752,"2026-04-10T01:20:03",[52,13,15,14],"插件",{"id":54,"name":55,"github_repo":56,"description_zh":57,"stars":58,"difficulty_score":32,"last_commit_at":59,"category_tags":60,"status":17},4721,"markitdown","microsoft\u002Fmarkitdown","MarkItDown 是一款由微软 AutoGen 团队打造的轻量级 Python 工具，专为将各类文件高效转换为 Markdown 格式而设计。它支持 PDF、Word、Excel、PPT、图片（含 OCR）、音频（含语音转录）、HTML 乃至 YouTube 链接等多种格式的解析，能够精准提取文档中的标题、列表、表格和链接等关键结构信息。\n\n在人工智能应用日益普及的今天，大语言模型（LLM）虽擅长处理文本，却难以直接读取复杂的二进制办公文档。MarkItDown 恰好解决了这一痛点，它将非结构化或半结构化的文件转化为模型“原生理解”且 Token 效率极高的 Markdown 格式，成为连接本地文件与 AI 分析 pipeline 的理想桥梁。此外，它还提供了 MCP（模型上下文协议）服务器，可无缝集成到 Claude Desktop 等 LLM 应用中。\n\n这款工具特别适合开发者、数据科学家及 AI 研究人员使用，尤其是那些需要构建文档检索增强生成（RAG）系统、进行批量文本分析或希望让 AI 助手直接“阅读”本地文件的用户。虽然生成的内容也具备一定可读性，但其核心优势在于为机器",93400,"2026-04-06T19:52:38",[52,14],{"id":62,"github_repo":63,"name":64,"description_en":65,"description_zh":66,"ai_summary_zh":66,"readme_en":67,"readme_zh":68,"quickstart_zh":69,"use_case_zh":70,"hero_image_url":71,"owner_login":72,"owner_name":73,"owner_avatar_url":74,"owner_bio":75,"owner_company":76,"owner_location":76,"owner_email":77,"owner_twitter":76,"owner_website":78,"owner_url":79,"languages":80,"stars":89,"forks":90,"last_commit_at":91,"license":92,"difficulty_score":32,"env_os":93,"env_gpu":93,"env_ram":93,"env_deps":94,"category_tags":106,"github_topics":107,"view_count":32,"oss_zip_url":76,"oss_zip_packed_at":76,"status":17,"created_at":111,"updated_at":112,"faqs":113,"releases":144},8489,"EducationalTestingService\u002Fskll","skll","SciKit-Learn Laboratory (SKLL) makes it easy to run machine learning experiments.","skll（SciKit-Learn Laboratory）是一款专为简化机器学习实验流程而设计的 Python 工具包。它基于流行的 scikit-learn 库构建，旨在让用户无需编写复杂的编程代码，仅通过配置文件即可轻松运行完整的机器学习实验。\n\n在传统工作流中，研究人员往往需要花费大量时间编写重复的“胶水代码”来连接数据加载、模型训练、参数调优和结果评估等环节。skll 有效解决了这一痛点，用户只需准备好特征数据，随后通过简单的 INI 
格式配置文件定义实验任务（如交叉验证、模型评估或预测）、指定算法列表及超参数搜索策略，即可调用命令行工具自动执行全流程。\n\n这款工具特别适合数据科学家、教育研究者以及希望快速验证算法效果的开发者使用。其独特的技术亮点在于将实验逻辑从代码中解耦，支持一键并行处理多个数据集与多种模型的组合实验，并自动生成详尽的性能指标、预测结果及可视化报告。此外，skll 还原生支持在兼容 DRMAA 的集群上进行分布式计算，显著提升了大规模实验的效率。如果你希望专注于数据分析与模型选择，而非繁琐的工程实现，skll 将是一个得力的助手。","SciKit-Learn Laboratory\n-----------------------\n\n.. image:: https:\u002F\u002Fgitlab.com\u002FEducationalTestingService\u002Fskll\u002Fbadges\u002Fmain\u002Fpipeline.svg\n   :target: https:\u002F\u002Fgitlab.com\u002FEducationalTestingService\u002Fskll\u002F-\u002Fpipelines\n   :alt: Gitlab CI status\n\n.. image:: https:\u002F\u002Fdev.azure.com\u002FEducationalTestingService\u002FSKLL\u002F_apis\u002Fbuild\u002Fstatus\u002FEducationalTestingService.skll\n   :target: https:\u002F\u002Fdev.azure.com\u002FEducationalTestingService\u002FSKLL\u002F_build?view=runs\n   :alt: Azure Pipelines status\n\n.. image:: https:\u002F\u002Fcodecov.io\u002Fgh\u002FEducationalTestingService\u002Fskll\u002Fbranch\u002Fmain\u002Fgraph\u002Fbadge.svg\n  :target: https:\u002F\u002Fcodecov.io\u002Fgh\u002FEducationalTestingService\u002Fskll\n\n.. image:: https:\u002F\u002Fimg.shields.io\u002Fpypi\u002Fv\u002Fskll.svg\n   :target: https:\u002F\u002Fpypi.org\u002Fproject\u002Fskll\u002F\n   :alt: Latest version on PyPI\n\n.. image:: https:\u002F\u002Fimg.shields.io\u002Fpypi\u002Fl\u002Fskll.svg\n   :alt: License\n\n.. image:: https:\u002F\u002Fimg.shields.io\u002Fconda\u002Fv\u002Fets\u002Fskll.svg\n   :target: https:\u002F\u002Fanaconda.org\u002Fets\u002Fskll\n   :alt: Conda package for SKLL\n\n.. image:: https:\u002F\u002Fimg.shields.io\u002Fpypi\u002Fpyversions\u002Fskll.svg\n   :target: https:\u002F\u002Fpypi.org\u002Fproject\u002Fskll\u002F\n   :alt: Supported python versions for SKLL\n\n.. image:: https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FDOI-10.5281%2Fzenodo.12825-blue.svg\n   :target: http:\u002F\u002Fdx.doi.org\u002F10.5281\u002Fzenodo.12825\n   :alt: DOI for citing SKLL 1.0.0\n\n.. 
image:: https:\u002F\u002Fmybinder.org\u002Fbadge_logo.svg\n :target: https:\u002F\u002Fmybinder.org\u002Fv2\u002Fgh\u002FEducationalTestingService\u002Fskll\u002Fmain?filepath=examples%2FTutorial.ipynb\n\n\nThis Python package provides command-line utilities to make it easier to run\nmachine learning experiments with scikit-learn.  One of the primary goals of\nour project is to make it so that you can run scikit-learn experiments without\nactually needing to write any code other than what you used to generate\u002Fextract\nthe features.\n\nInstallation\n~~~~~~~~~~~~\n\nYou can install using either ``pip`` or ``conda``. See details `here \u003Chttps:\u002F\u002Fskll.readthedocs.io\u002Fen\u002Flatest\u002Fgetting_started.html>`__.\n\nRequirements\n~~~~~~~~~~~~\n\n-  Python 3.10, 3.11, or 3.12.\n-  `beautifulsoup4 \u003Chttp:\u002F\u002Fwww.crummy.com\u002Fsoftware\u002FBeautifulSoup\u002F>`__\n-  `gridmap \u003Chttps:\u002F\u002Fpypi.org\u002Fproject\u002Fgridmap\u002F>`__ (only required if you plan\n   to run things in parallel on a DRMAA-compatible cluster)\n-  `joblib \u003Chttps:\u002F\u002Fpypi.org\u002Fproject\u002Fjoblib\u002F>`__\n-  `pandas \u003Chttp:\u002F\u002Fpandas.pydata.org>`__\n-  `ruamel.yaml \u003Chttp:\u002F\u002Fyaml.readthedocs.io\u002Fen\u002Flatest\u002Foverview.html>`__\n-  `scikit-learn \u003Chttp:\u002F\u002Fscikit-learn.org\u002Fstable\u002F>`__\n-  `seaborn \u003Chttp:\u002F\u002Fseaborn.pydata.org>`__\n-  `tabulate \u003Chttps:\u002F\u002Fpypi.org\u002Fproject\u002Ftabulate\u002F>`__\n\nCommand-line Interface\n~~~~~~~~~~~~~~~~~~~~~~\n\nThe main utility we provide is called ``run_experiment`` and it can be used to\neasily run a series of learners on datasets specified in a configuration file\nlike:\n\n.. 
code:: ini\n\n  [General]\n  experiment_name = Titanic_Evaluate_Tuned\n  # valid tasks: cross_validate, evaluate, predict, train\n  task = evaluate\n\n  [Input]\n  # these directories could also be absolute paths\n  # (and must be if you're not running things in local mode)\n  train_directory = train\n  test_directory = dev\n  # Can specify multiple sets of feature files that are merged together automatically\n  featuresets = [[\"family.csv\", \"misc.csv\", \"socioeconomic.csv\", \"vitals.csv\"]]\n  # List of scikit-learn learners to use\n  learners = [\"RandomForestClassifier\", \"DecisionTreeClassifier\", \"SVC\", \"MultinomialNB\"]\n  # Column in CSV containing labels to predict\n  label_col = Survived\n  # Column in CSV containing instance IDs (if any)\n  id_col = PassengerId\n\n  [Tuning]\n  # Should we tune parameters of all learners by searching provided parameter grids?\n  grid_search = true\n  # Function to maximize when performing grid search\n  objectives = ['accuracy']\n\n  [Output]\n  # Also compute the area under the ROC curve as an additional metric\n  metrics = ['roc_auc']\n  # The following can also be absolute paths\n  logs = output\n  results = output\n  predictions = output\n  probability = true\n  models = output\n\nFor more information about getting started with ``run_experiment``, please check\nout `our tutorial \u003Chttps:\u002F\u002Fskll.readthedocs.org\u002Fen\u002Flatest\u002Ftutorial.html>`__, or\n`our config file specs \u003Chttps:\u002F\u002Fskll.readthedocs.org\u002Fen\u002Flatest\u002Frun_experiment.html>`__.\n\nYou can also follow this `interactive Jupyter tutorial \u003Chttps:\u002F\u002Fmybinder.org\u002Fv2\u002Fgh\u002FAVajpayeeJr\u002Fskll\u002Ffeature\u002F448-interactive-binder?filepath=examples>`__.\n\nWe also provide utilities for:\n\n-  `converting between machine learning toolkit formats \u003Chttps:\u002F\u002Fskll.readthedocs.org\u002Fen\u002Flatest\u002Futilities.html#skll-convert>`__\n   (e.g., ARFF, CSV)\n-  
`filtering feature files \u003Chttps:\u002F\u002Fskll.readthedocs.org\u002Fen\u002Flatest\u002Futilities.html#filter-features>`__\n-  `joining feature files \u003Chttps:\u002F\u002Fskll.readthedocs.org\u002Fen\u002Flatest\u002Futilities.html#join-features>`__\n-  `other common tasks \u003Chttps:\u002F\u002Fskll.readthedocs.org\u002Fen\u002Flatest\u002Futilities.html>`__\n\n\nPython API\n~~~~~~~~~~\n\nIf you just want to avoid writing a lot of boilerplate learning code, you can\nalso use our simple Python API which also supports pandas DataFrames.\nThe main way you'll want to use the API is through\nthe ``Learner`` and ``Reader`` classes. For more details on our API, see\n`the documentation \u003Chttps:\u002F\u002Fskll.readthedocs.org\u002Fen\u002Flatest\u002Fapi.html>`__.\n\nWhile our API can be broadly useful, it should be noted that the command-line\nutilities are intended as the primary way of using SKLL.  The API is just a nice\nside-effect of our developing the utilities.\n\n\nA Note on Pronunciation\n~~~~~~~~~~~~~~~~~~~~~~~\n\n.. image:: doc\u002Fskll.png\n   :alt: SKLL logo\n   :align: right\n\n.. container:: clear\n\n  .. 
image:: doc\u002Fspacer.png\n\nSciKit-Learn Laboratory (SKLL) is pronounced \"skull\": that's where the learning\nhappens.\n\nTalks\n~~~~~\n\n-  *Simpler Machine Learning with SKLL 1.0*, Dan Blanchard, PyData NYC 2014 (`video \u003Chttps:\u002F\u002Fwww.youtube.com\u002Fwatch?v=VEo2shBuOrc&feature=youtu.be&t=1s>`__ | `slides \u003Chttp:\u002F\u002Fwww.slideshare.net\u002FDanielBlanchard2\u002Fpy-data-nyc-2014>`__)\n-  *Simpler Machine Learning with SKLL*, Dan Blanchard, PyData NYC 2013 (`video \u003Chttp:\u002F\u002Fvimeo.com\u002F79511496>`__ | `slides \u003Chttp:\u002F\u002Fwww.slideshare.net\u002FDanielBlanchard2\u002Fsimple-machine-learning-with-skll>`__)\n\nCiting\n~~~~~~\nIf you are using SKLL in your work, you can cite it as follows: \"We used scikit-learn (Pedregosa et al., 2011) via the SKLL toolkit (https:\u002F\u002Fgithub.com\u002FEducationalTestingService\u002Fskll).\"\n\nBooks\n~~~~~\n\nSKLL is featured in `Data Science at the Command Line \u003Chttp:\u002F\u002Fdatascienceatthecommandline.com>`__\nby `Jeroen Janssens \u003Chttp:\u002F\u002Fjeroenjanssens.com>`__.\n\nChangelog\n~~~~~~~~~\n\nSee `GitHub releases \u003Chttps:\u002F\u002Fgithub.com\u002FEducationalTestingService\u002Fskll\u002Freleases>`__.\n\nContribute\n~~~~~~~~~~\n\nThank you for your interest in contributing to SKLL! See `CONTRIBUTING.md \u003Chttps:\u002F\u002Fgithub.com\u002FEducationalTestingService\u002Fskll\u002Fblob\u002Fmain\u002FCONTRIBUTING.md>`__ for instructions on how to get started.\n","SciKit-Learn 实验室\n-----------------------\n\n.. image:: https:\u002F\u002Fgitlab.com\u002FEducationalTestingService\u002Fskll\u002Fbadges\u002Fmain\u002Fpipeline.svg\n   :target: https:\u002F\u002Fgitlab.com\u002FEducationalTestingService\u002Fskll\u002F-\u002Fpipelines\n   :alt: GitLab CI 状态\n\n.. 
image:: https:\u002F\u002Fdev.azure.com\u002FEducationalTestingService\u002FSKLL\u002F_apis\u002Fbuild\u002Fstatus\u002FEducationalTestingService.skll\n   :target: https:\u002F\u002Fdev.azure.com\u002FEducationalTestingService\u002FSKLL\u002F_build?view=runs\n   :alt: Azure Pipelines 状态\n\n.. image:: https:\u002F\u002Fcodecov.io\u002Fgh\u002FEducationalTestingService\u002Fskll\u002Fbranch\u002Fmain\u002Fgraph\u002Fbadge.svg\n  :target: https:\u002F\u002Fcodecov.io\u002Fgh\u002FEducationalTestingService\u002Fskll\n\n.. image:: https:\u002F\u002Fimg.shields.io\u002Fpypi\u002Fv\u002Fskll.svg\n   :target: https:\u002F\u002Fpypi.org\u002Fproject\u002Fskll\u002F\n   :alt: PyPI 上的最新版本\n\n.. image:: https:\u002F\u002Fimg.shields.io\u002Fpypi\u002Fl\u002Fskll.svg\n   :alt: 许可证\n\n.. image:: https:\u002F\u002Fimg.shields.io\u002Fconda\u002Fv\u002Fets\u002Fskll.svg\n   :target: https:\u002F\u002Fanaconda.org\u002Fets\u002Fskll\n   :alt: SKLL 的 Conda 包\n\n.. image:: https:\u002F\u002Fimg.shields.io\u002Fpypi\u002Fpyversions\u002Fskll.svg\n   :target: https:\u002F\u002Fpypi.org\u002Fproject\u002Fskll\u002F\n   :alt: SKLL 支持的 Python 版本\n\n.. image:: https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FDOI-10.5281%2Fzenodo.12825-blue.svg\n   :target: http:\u002F\u002Fdx.doi.org\u002F10.5281\u002Fzenodo.12825\n   :alt: 用于引用 SKLL 1.0.0 的 DOI\n\n.. 
image:: https:\u002F\u002Fmybinder.org\u002Fbadge_logo.svg\n :target: https:\u002F\u002Fmybinder.org\u002Fv2\u002Fgh\u002FEducationalTestingService\u002Fskll\u002Fmain?filepath=examples%2FTutorial.ipynb\n\n\n这个 Python 包提供命令行工具，使使用 scikit-learn 运行机器学习实验变得更加容易。我们项目的主要目标之一是让您无需编写任何代码（除了用于生成或提取特征的代码），就能运行 scikit-learn 实验。\n\n安装\n~~~~\n\n您可以使用 ``pip`` 或 ``conda`` 进行安装。详细信息请参见 `这里 \u003Chttps:\u002F\u002Fskll.readthedocs.io\u002Fen\u002Flatest\u002Fgetting_started.html>`__。\n\n要求\n~~~~\n\n-  Python 3.10、3.11 或 3.12。\n-  `beautifulsoup4 \u003Chttp:\u002F\u002Fwww.crummy.com\u002Fsoftware\u002FBeautifulSoup\u002F>`__\n-  `gridmap \u003Chttps:\u002F\u002Fpypi.org\u002Fproject\u002Fgridmap\u002F>`__（仅在您计划在兼容 DRMAA 的集群上并行运行时需要）\n-  `joblib \u003Chttps:\u002F\u002Fpypi.org\u002Fproject\u002Fjoblib\u002F>`__\n-  `pandas \u003Chttp:\u002F\u002Fpandas.pydata.org>`__\n-  `ruamel.yaml \u003Chttp:\u002F\u002Fyaml.readthedocs.io\u002Fen\u002Flatest\u002Foverview.html>`__\n-  `scikit-learn \u003Chttp:\u002F\u002Fscikit-learn.org\u002Fstable\u002F>`__\n-  `seaborn \u003Chttp:\u002F\u002Fseaborn.pydata.org>`__\n-  `tabulate \u003Chttps:\u002F\u002Fpypi.org\u002Fproject\u002Ftabulate\u002F>`__\n\n命令行界面\n~~~~~~~~~~~\n\n我们提供的主要工具名为 ``run_experiment``，可用于轻松地在配置文件中指定的数据集上运行一系列学习器，例如：\n\n.. 
code:: ini\n\n  [General]\n  experiment_name = Titanic_Evaluate_Tuned\n  # valid tasks: cross_validate, evaluate, predict, train\n  task = evaluate\n\n  [Input]\n  # these directories could also be absolute paths\n  # (and must be if you're not running things in local mode)\n  train_directory = train\n  test_directory = dev\n  # Can specify multiple sets of feature files that are merged together automatically\n  featuresets = [[\"family.csv\", \"misc.csv\", \"socioeconomic.csv\", \"vitals.csv\"]]\n  # List of scikit-learn learners to use\n  learners = [\"RandomForestClassifier\", \"DecisionTreeClassifier\", \"SVC\", \"MultinomialNB\"]\n  # Column in CSV containing labels to predict\n  label_col = Survived\n  # Column in CSV containing instance IDs (if any)\n  id_col = PassengerId\n\n  [Tuning]\n  # Should we tune parameters of all learners by searching provided parameter grids?\n  grid_search = true\n  # Function to maximize when performing grid search\n  objectives = ['accuracy']\n\n  [Output]\n  # Also compute the area under the ROC curve as an additional metric\n  metrics = ['roc_auc']\n  # The following can also be absolute paths\n  logs = output\n  results = output\n  predictions = output\n  probability = true\n  models = output\n\n有关如何开始使用 ``run_experiment`` 的更多信息，请查看我们的 `教程 \u003Chttps:\u002F\u002Fskll.readthedocs.org\u002Fen\u002Flatest\u002Ftutorial.html>`__ 或 `配置文件规范 \u003Chttps:\u002F\u002Fskll.readthedocs.org\u002Fen\u002Flatest\u002Frun_experiment.html>`__。\n\n您还可以按照这个 `交互式 Jupyter 教程 \u003Chttps:\u002F\u002Fmybinder.org\u002Fv2\u002Fgh\u002FAVajpayeeJr\u002Fskll\u002Ffeature\u002F448-interactive-binder?filepath=examples>`__ 进行操作。\n\n我们还提供了以下实用工具：\n\n-  `在机器学习工具包格式之间进行转换 \u003Chttps:\u002F\u002Fskll.readthedocs.org\u002Fen\u002Flatest\u002Futilities.html#skll-convert>`__（例如 ARFF、CSV）\n-  `过滤特征文件 \u003Chttps:\u002F\u002Fskll.readthedocs.org\u002Fen\u002Flatest\u002Futilities.html#filter-features>`__\n-  `合并特征文件 
\u003Chttps:\u002F\u002Fskll.readthedocs.org\u002Fen\u002Flatest\u002Futilities.html#join-features>`__\n-  `其他常见任务 \u003Chttps:\u002F\u002Fskll.readthedocs.org\u002Fen\u002Flatest\u002Futilities.html>`__\n\n\nPython API\n~~~~~~~~~~\n\n如果您只想避免编写大量的样板学习代码，也可以使用我们简单的 Python API，它同样支持 pandas DataFrame。您将主要通过 ``Learner`` 和 ``Reader`` 类来使用该 API。有关我们 API 的更多详细信息，请参阅 `文档 \u003Chttps:\u002F\u002Fskll.readthedocs.org\u002Fen\u002Flatest\u002Fapi.html>`__。\n\n虽然我们的 API 具有广泛的实用性，但需要注意的是，命令行工具才是使用 SKLL 的主要方式。API 只是我们开发这些工具的一个顺带成果。\n\n\n关于发音的说明\n~~~~~~~~~~~~~~~\n\n.. image:: doc\u002Fskll.png\n   :alt: SKLL 标志\n   :align: right\n\n.. container:: clear\n\n  .. image:: doc\u002Fspacer.png\n\nSciKit-Learn Laboratory（SKLL）的发音是“skull”：学习就发生在这里。\n\n演讲\n~~~~\n\n-  *使用 SKLL 1.0 进行更简单的机器学习*，Dan Blanchard，PyData NYC 2014 (`视频 \u003Chttps:\u002F\u002Fwww.youtube.com\u002Fwatch?v=VEo2shBuOrc&feature=youtu.be&t=1s>`__ | `幻灯片 \u003Chttp:\u002F\u002Fwww.slideshare.net\u002FDanielBlanchard2\u002Fpy-data-nyc-2014>`__)\n-  *使用 SKLL 进行更简单的机器学习*，Dan Blanchard，PyData NYC 2013 (`视频 \u003Chttp:\u002F\u002Fvimeo.com\u002F79511496>`__ | `幻灯片 \u003Chttp:\u002F\u002Fwww.slideshare.net\u002FDanielBlanchard2\u002Fsimple-machine-learning-with-skll>`__)\n\n引用\n~~~~\n\n如果您在工作中使用了 SKLL，可以这样引用：“我们通过 SKLL 工具包（https:\u002F\u002Fgithub.com\u002FEducationalTestingService\u002Fskll）使用了 scikit-learn（Pedregosa 等，2011 年）。”\n\n书籍\n~~~~\n\nSKLL 被收录在 `Jeroen Janssens \u003Chttp:\u002F\u002Fjeroenjanssens.com>`__ 所著的《命令行上的数据科学》一书中。\n\n变更日志\n~~~~~~~~\n\n请参阅 `GitHub 发布页面 \u003Chttps:\u002F\u002Fgithub.com\u002FEducationalTestingService\u002Fskll\u002Freleases>`__。\n\n贡献\n~~~~\n\n感谢您对 SKLL 的贡献兴趣！有关如何开始贡献的说明，请参阅 `CONTRIBUTING.md \u003Chttps:\u002F\u002Fgithub.com\u002FEducationalTestingService\u002Fskll\u002Fblob\u002Fmain\u002FCONTRIBUTING.md>`__。","# SKLL (SciKit-Learn Laboratory) 快速上手指南\n\nSKLL 是一个基于 scikit-learn 的 Python 工具包，旨在通过命令行配置文件的方式简化机器学习实验流程。无需编写大量样板代码，即可轻松运行分类、回归、交叉验证等任务。\n\n## 1. 
环境准备\n\n在开始之前，请确保您的系统满足以下要求：\n\n*   **操作系统**：Linux, macOS 或 Windows\n*   **Python 版本**：3.10, 3.11 或 3.12\n*   **核心依赖**：\n    *   `scikit-learn`\n    *   `pandas`\n    *   `joblib`\n    *   `seaborn`\n    *   `beautifulsoup4`\n    *   `ruamel.yaml`\n    *   `tabulate`\n    *   `gridmap` (仅当需要在 DRMAA 兼容集群上并行运行时需要)\n\n> **提示**：国内用户建议使用清华或阿里镜像源加速依赖下载。\n\n## 2. 安装步骤\n\n您可以选择使用 `pip` 或 `conda` 进行安装。\n\n### 方式一：使用 pip 安装（推荐）\n\n```bash\npip install skll -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple\n```\n\n### 方式二：使用 conda 安装\n\n如果您使用 Anaconda 或 Miniconda：\n\n```bash\nconda install -c ets skll\n```\n\n## 3. 基本使用\n\nSKLL 的核心功能是通过 `run_experiment` 命令读取配置文件来执行实验。以下是一个最简单的分类实验示例。\n\n### 第一步：准备数据\n\n假设您有两个目录 `train` 和 `dev`，里面存放着 CSV 格式的特征文件。CSV 文件中需包含特征列、标签列（如 `Survived`）和 ID 列（如 `PassengerId`）。\n\n### 第二步：创建配置文件\n\n创建一个名为 `titanic_config.ini` 的文件，内容如下：\n\n```ini\n[General]\nexperiment_name = Titanic_Evaluate_Tuned\ntask = evaluate\n\n[Input]\ntrain_directory = train\ntest_directory = dev\nfeaturesets = [[\"family.csv\", \"misc.csv\", \"socioeconomic.csv\", \"vitals.csv\"]]\nlearners = [\"RandomForestClassifier\", \"DecisionTreeClassifier\", \"SVC\", \"MultinomialNB\"]\nlabel_col = Survived\nid_col = PassengerId\n\n[Tuning]\ngrid_search = true\nobjectives = ['accuracy']\n\n[Output]\nmetrics = ['roc_auc']\nlogs = output\nresults = output\npredictions = output\nprobability = true\nmodels = output\n```\n\n### 第三步：运行实验\n\n在终端中执行以下命令启动实验：\n\n```bash\nrun_experiment titanic_config.ini\n```\n\n执行完成后，结果、日志、预测值和训练好的模型将自动保存到配置的 `output` 目录中。\n\n### 进阶用法\n\n除了命令行工具，SKLL 也提供了简单的 Python API，支持直接操作 `pandas DataFrame`。主要类包括 `Learner` 和 `Reader`。如需深入了解 API 用法或更多配置选项，请参阅官方文档。","某教育科技公司的数据科学团队需要基于学生行为数据构建辍学预测模型，并需快速对比多种算法效果以交付最佳方案。\n\n### 没有 skll 时\n- 数据科学家必须编写大量重复的 Python 样板代码来加载 CSV、分割数据集及实例化不同的分类器。\n- 每次尝试新算法或调整参数网格时，都需要手动修改脚本并重新运行，实验迭代周期长达数小时。\n- 难以统一管理和复现多组实验配置，团队成员间协作时常因环境差异或脚本版本混乱导致结果不一致。\n- 生成包含准确率、ROC 曲线下面积等多维度指标的详细报告需要额外编写可视化与格式化代码。\n\n### 使用 skll 后\n- 只需编写一份简洁的 INI 
配置文件，定义数据路径、候选算法列表（如随机森林、SVM）及目标列，无需触碰核心逻辑代码。\n- 通过 `run_experiment` 命令一键自动执行网格搜索、交叉验证及多模型并行训练，将实验迭代时间缩短至分钟级。\n- 所有实验参数、日志、预测概率及训练好的模型均按标准目录结构自动保存，确保实验过程完全可追溯且易于复现。\n- 系统自动生成格式美观的表格报告，直接呈现各模型在准确率与 AUC 等关键指标上的对比排名，辅助快速决策。\n\nskll 将繁琐的机器学习工程流程转化为声明式配置，让团队能专注于特征工程与业务洞察而非底层代码实现。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FEducationalTestingService_skll_638ad23f.png","EducationalTestingService","ETS","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002FEducationalTestingService_365bf92c.png","Advancing the science of measurement to power human progress.",null,"opensource@ets.org","https:\u002F\u002Fwww.ets.org","https:\u002F\u002Fgithub.com\u002FEducationalTestingService",[81,85],{"name":82,"color":83,"percentage":84},"Python","#3572A5",99.7,{"name":86,"color":87,"percentage":88},"PowerShell","#012456",0.3,561,68,"2026-04-02T21:25:14","NOASSERTION","未说明",{"notes":95,"python":96,"dependencies":97},"该工具主要通过命令行实用程序（如 run_experiment）运行机器学习实验，无需编写额外代码。支持通过 pip 或 conda 安装。若需在兼容 DRMAA 的集群上并行运行任务，则必须安装 gridmap 库。","3.10, 3.11, 3.12",[98,99,100,101,102,103,104,105],"beautifulsoup4","gridmap (可选，用于 DRMAA 兼容集群并行运行)","joblib","pandas","ruamel.yaml","scikit-learn","seaborn","tabulate",[14],[108,109,103,110],"machine-learning","python","hacktoberfest","2026-03-27T02:49:30.150509","2026-04-18T00:45:41.651564",[114,119,124,129,134,139],{"id":115,"question_zh":116,"answer_zh":117,"source_url":118},37993,"为什么在配置文件中指定交叉验证折叠（folds）文件会导致实验运行速度显著变慢？","这是因为默认情况下，即使指定了外部折叠文件，内部网格搜索（inner grid search）仍会重新计算折叠，导致重复工作。解决方案是修改配置，明确指示在进行内部网格搜索时不要使用折叠文件。维护者通过实验证实，采用此方法后，SVC 模型的运行时间从约 42.5 秒降低到了 11.8 秒，与不指定折叠文件时的速度（11.7 秒）基本一致。","https:\u002F\u002Fgithub.com\u002FEducationalTestingService\u002Fskll\u002Fissues\u002F363",{"id":120,"question_zh":121,"answer_zh":122,"source_url":123},37994,"SKLL 中默认的调优目标（tuning objective）是什么？对于回归任务是否有问题？","早期版本中，所有学习器（包括回归器）的默认评估指标均为 `f1_score_micro`，这对回归任务是不合理的。该问题已通过更新解决：移除了单一的 `objective` 字段，改为强制要求用户显式指定 `objectives` 字段（列表形式）。这不仅修复了回归任务的默认值问题，还提高了与 
`metrics` 选项的一致性。现在用户必须根据任务类型（分类或回归）手动设置合适的评估指标。","https:\u002F\u002Fgithub.com\u002FEducationalTestingService\u002Fskll\u002Fissues\u002F381",{"id":125,"question_zh":126,"answer_zh":127,"source_url":128},37995,"当分类标签为类似 \"2\", \"2.1\", \"2.21\" 的字符串时，为什么预测结果会出现意外的浮点数？","这是由于读取器类中的 `safe_float` 函数会自动将看起来像数字的字符串标签转换为整数或浮点数（例如 \"2\" 变为 2.0），随后生成 numpy 数组导致类型统一为浮点型。虽然完全禁止转换会影响混合算法实验的配置，但官方已意识到此边缘情况并进行了文档补充。建议用户在处理此类标签时注意数据类型的隐式转换，并查阅相关文档了解潜在影响，或在预处理阶段确保标签保持为字符串格式。","https:\u002F\u002Fgithub.com\u002FEducationalTestingService\u002Fskll\u002Fissues\u002F436",{"id":130,"question_zh":131,"answer_zh":132,"source_url":133},37996,"SKLL 的文档字符串（docstrings）遵循什么风格？布尔参数的描述格式是怎样的？","SKLL 致力于使文档字符串风格与 scikit-learn 保持一致。关于布尔参数的描述，经过讨论决定采用 scikit-learn 的第一种风格，即使用 \"If ``True``, do X.\" 的格式（如果为真，则执行 X），而不是 \"Whether or not to...\" 或疑问句形式。这种格式在参数类型后直接标明默认值，并且对于 numpy 数组类型会包含形状信息，以确保文档的一致性和清晰度。","https:\u002F\u002Fgithub.com\u002FEducationalTestingService\u002Fskll\u002Fissues\u002F647",{"id":135,"question_zh":136,"answer_zh":137,"source_url":138},37997,"SKLL 是否应该直接使用 scikit-learn 自带的 Cohen's Kappa 实现？","虽然 scikit-learn 提供了 `cohen_kappa_score`，但 SKLL 没有直接替换原有实现。原因是 scikit-learn 的实现需要用户显式列出所有标签，而 SKLL 的原生实现能自动推断标签范围（从最小到最大）。为了保持向后兼容性并避免让用户手动指定标签，SKLL 选择保留自己的封装逻辑。此外，SKLL 的实现还支持特定的 'off-by-one' kappa 计算方式，这是直接切换所不具备的。","https:\u002F\u002Fgithub.com\u002FEducationalTestingService\u002Fskll\u002Fissues\u002F391",{"id":140,"question_zh":141,"answer_zh":142,"source_url":143},37998,"如何改进 SKLL 的单元测试结构和覆盖率？","针对测试代码冗余和覆盖率不足的问题，项目采取了多项措施：1. 将庞大的 `test_skll.py` 拆分为多个模块，每个模块对应一个被测试的主模块；2. 创建了 `RandomDataWriter` 类来统一生成随机测试数据，减少重复代码；3. 重新组织了临时数据文件的写入路径；4. 
重点提高了 `featureset.py` 和 `learner.py` 的测试覆盖率。这些改进使得测试更易于维护，并支持在安装后通过 `skll.test` 子包运行单元测试。","https:\u002F\u002Fgithub.com\u002FEducationalTestingService\u002Fskll\u002Fissues\u002F148",[145,150,155,160,165,170,175,180,185,190,195,200,205,210,215,220,225,230,235,240],{"id":146,"version":147,"summary_zh":148,"released_at":149},306174,"v5.1.0","## 变更内容\n* 由 @tamarl08 在 https:\u002F\u002Fgithub.com\u002FEducationalTestingService\u002Fskll\u002Fpull\u002F773 中将 setup.py 替换为 pyproject.toml\n* 由 @tamarl08 在 https:\u002F\u002Fgithub.com\u002FEducationalTestingService\u002Fskll\u002Fpull\u002F774 中移除已被 ruff 覆盖的 pre-commit 钩子\n* 由 @tamarl08 在 https:\u002F\u002Fgithub.com\u002FEducationalTestingService\u002Fskll\u002Fpull\u002F775 中允许 feature_scaling 中使用 None 值\n* 由 @damien2012eng 在 https:\u002F\u002Fgithub.com\u002FEducationalTestingService\u002Fskll\u002Fpull\u002F772 中将 Numpy 更新至最新版本 [2.0]\n* 由 @damien2012eng 在 https:\u002F\u002Fgithub.com\u002FEducationalTestingService\u002Fskll\u002Fpull\u002F777 中将 scikit-learn 更新至最新版本\n\n## 新贡献者\n* @damien2012eng 在 https:\u002F\u002Fgithub.com\u002FEducationalTestingService\u002Fskll\u002Fpull\u002F772 中完成了首次贡献\n\n**完整变更日志**: https:\u002F\u002Fgithub.com\u002FEducationalTestingService\u002Fskll\u002Fcompare\u002Fv5.0.1...v5.1.0","2024-12-27T15:07:05",{"id":151,"version":152,"summary_zh":153,"released_at":154},306175,"v5.0.1","## 🛠 小幅更新 🛠 \n\nSKLL 5.0.1 是一个次要版本，对用户没有影响。 \n\n- 更新了 pre-commit 检查。\n- 更新了依赖项：\n  - 从 `requirements.txt` 中移除了所有开发依赖。\n  - 更新了 `doc\u002Frequirements.txt` 中的版本。\n- 新增了 `requirements.dev` 文件。该文件同时包含运行时和开发依赖。\n- 更新了 `CONTRIBUTING.md`，使其使用此文件而非 `requirements.txt`。\n- 在 `MANIFEST.in` 中排除了此文件，以确保它不会被打包到 PyPI 发布包中。\n- 更新了 CI 流水线，使其使用 `requirements.dev` 而不是 `requirements.txt`。\n- 更新了发布流程检查清单。\n\n**完整变更日志**: https:\u002F\u002Fgithub.com\u002FEducationalTestingService\u002Fskll\u002Fcompare\u002Fv5.0.0...v5.0.1","2024-03-08T20:13:23",{"id":156,"version":157,"summary_zh":158,"released_at":159},306176,"v5.0.0","## 💥 重大变更 💥  
\n* `scikit-learn` has been updated to v1.4.0. This means that results of experiments run with SKLL will likely differ from those obtained with SKLL v4.0.1 (https:\u002F\u002Fgithub.com\u002FEducationalTestingService\u002Fskll\u002Fpull\u002F766).\n* Python 3.8 and 3.9 are no longer supported, since scikit-learn v1.4.0 no longer supports them.\n* The `results.json` output files generated when running experiments now contain more information than in previous versions (https:\u002F\u002Fgithub.com\u002FEducationalTestingService\u002Fskll\u002Fpull\u002F761).\n\n## 💡 New features 💡  \n* SKLL experiment results can now be automatically logged to [Weights & Biases](https:\u002F\u002Fwandb.ai) (https:\u002F\u002Fgithub.com\u002FEducationalTestingService\u002Fskll\u002Fpull\u002F758, https:\u002F\u002Fgithub.com\u002FEducationalTestingService\u002Fskll\u002Fpull\u002F761, https:\u002F\u002Fgithub.com\u002FEducationalTestingService\u002Fskll\u002Fpull\u002F765).\n* Python 3.12 is now supported.\n\n## 🛠 Fixes & improvements 🛠  \n* Fixed the ReadTheDocs configuration (https:\u002F\u002Fgithub.com\u002FEducationalTestingService\u002Fskll\u002Fpull\u002F757)\n\n\n**Full changelog**: https:\u002F\u002Fgithub.com\u002FEducationalTestingService\u002Fskll\u002Fcompare\u002Fv4.0.1...v5.0.0","2024-02-22T18:18:24",{"id":161,"version":162,"summary_zh":163,"released_at":164},306177,"v4.0.1","## What's changed\n* Fixed the calls to yaml.load, by @tamarl08 in https:\u002F\u002Fgithub.com\u002FEducationalTestingService\u002Fskll\u002Fpull\u002F754\n\n\n**Full changelog**: https:\u002F\u002Fgithub.com\u002FEducationalTestingService\u002Fskll\u002Fcompare\u002Fv4.0.0...v4.0.1","2023-11-14T15:08:13",{"id":166,"version":167,"summary_zh":168,"released_at":169},306178,"v4.0.0","## 💥 Breaking changes 💥 \n\n- `scikit-learn` has been updated to v1.3.0. This means that the same experiments run with SKLL 3.2.0 may produce different results.\n\n## 💡 New features 💡 \n* Added support for `BaggingClassifier` and `BaggingRegressor`, by @desilinguist in https:\u002F\u002Fgithub.com\u002FEducationalTestingService\u002Fskll\u002Fpull\u002F742.\n* Added support for `HistGradientBoostingClassifier` and `HistGradientBoostingRegressor`, by @desilinguist in https:\u002F\u002Fgithub.com\u002FEducationalTestingService\u002Fskll\u002Fpull\u002F743.\n* Included model fit times in learning curves, by @desilinguist in 
https:\u002F\u002Fgithub.com\u002FEducationalTestingService\u002Fskll\u002Fpull\u002F745.\n* Added a new `neg_root_mean_squared_error` metric and objective function for regressors, by @desilinguist in https:\u002F\u002Fgithub.com\u002FEducationalTestingService\u002Fskll\u002Fpull\u002F741.\n* Added support for Python 3.11, by @desilinguist in https:\u002F\u002Fgithub.com\u002FEducationalTestingService\u002Fskll\u002Fpull\u002F749.\n\n## 🛠 Bugfixes & improvements 🛠 \n\n* Applied code formatting and other minor changes, by @desilinguist in https:\u002F\u002Fgithub.com\u002FEducationalTestingService\u002Fskll\u002Fpull\u002F724.\n* Use `pathlib.Path` wherever possible, by @desilinguist in https:\u002F\u002Fgithub.com\u002FEducationalTestingService\u002Fskll\u002Fpull\u002F725.\n* Migrated to the new Codecov uploader, by @desilinguist in https:\u002F\u002Fgithub.com\u002FEducationalTestingService\u002Fskll\u002Fpull\u002F728.\n* Added type hints for the `skll.config` module, by @desilinguist in https:\u002F\u002Fgithub.com\u002FEducationalTestingService\u002Fskll\u002Fpull\u002F729.\n* Added type hints for the `skll.data` module and improved the type definitions in `skll.config`, by @desilinguist in https:\u002F\u002Fgithub.com\u002FEducationalTestingService\u002Fskll\u002Fpull\u002F730.\n* Fixed a bug in the featureset splitting method, by @tamarl08 in https:\u002F\u002Fgithub.com\u002FEducationalTestingService\u002Fskll\u002Fpull\u002F731.\n* Added type hints for the `skll.experiments` module, by @desilinguist in https:\u002F\u002Fgithub.com\u002FEducationalTestingService\u002Fskll\u002Fpull\u002F732.\n* Added type hints for the `skll.learner` module, along with other refactoring, by @desilinguist in https:\u002F\u002Fgithub.com\u002FEducationalTestingService\u002Fskll\u002Fpull\u002F734.\n* Added type hints for the `skll.utils` module and all remaining files, by @desilinguist in https:\u002F\u002Fgithub.com\u002FEducationalTestingService\u002Fskll\u002Fpull\u002F736.\n* Improved docstrings and created linkable type hints (part 1), by @desilinguist in https:\u002F\u002Fgithub.com\u002FEducationalTestingService\u002Fskll\u002Fpull\u002F737.\n* Improved docstrings and type hints (part 2), by @desilinguist in https:\u002F\u002Fgithub.com\u002FEducationalTestingService\u002Fskll\u002Fpull\u002F738.\n* Improved docstrings and type hints (part 3), by @desilinguist in 
https:\u002F\u002Fgithub.com\u002FEducationalTestingService\u002Fskll\u002Fpull\u002F739.\n* Improved docstrings and type hints (part 4), by @desilinguist in https:\u002F\u002Fgithub.com\u002FEducationalTestingService\u002Fskll\u002Fpull\u002F740.\n* Switched the test framework from `nose` to `nose2`, by @desilinguist in https:\u002F\u002Fgithub.com\u002FEducationalTestingService\u002Fskll\u002Fpull\u002F747.\n* Stopped using scikit-learn's private `_scorer` API for custom metrics in SKLL, by @desilinguist in https:\u002F\u002Fgithub.com\u002FEducationalTestingService\u002Fskll\u002Fpull\u002F751.\n* Fixed some typos etc. in the documentation, by @mulhod in https:\u002F\u002Fgithub.com\u002FEducationalTestingServic","2023-07-17T16:04:04",{"id":171,"version":172,"summary_zh":173,"released_at":174},306179,"v3.2.0","## What's changed\n* Updated the RTD requirements to fix failing builds, by @desilinguist in https:\u002F\u002Fgithub.com\u002FEducationalTestingService\u002Fskll\u002Fpull\u002F719  \n* Updated dependencies, consolidated the requirements files, and adjusted code coverage, by @desilinguist in https:\u002F\u002Fgithub.com\u002FEducationalTestingService\u002Fskll\u002Fpull\u002F721  \n* Release v3.2.0, by @desilinguist in https:\u002F\u002Fgithub.com\u002FEducationalTestingService\u002Fskll\u002Fpull\u002F722  \n\n\n**Full changelog**: https:\u002F\u002Fgithub.com\u002FEducationalTestingService\u002Fskll\u002Fcompare\u002Fv3.1.0...v3.2.0","2023-01-19T17:20:37",{"id":176,"version":177,"summary_zh":178,"released_at":179},306180,"v3.1.0","This is a new release with dependency updates, bugfixes, and improvements.\n\n## 💥 Dependency updates 💥 \n\n- `scikit-learn` has been updated to v1.1.2. This means that running the same SKLL experiments with SKLL 3.1.0 may produce different results. (Issue #713, PR #716)\n\n## 🛠 Bugfixes & improvements 🛠 \n\n- SKLL learners now support a new `get_feature_names_out()` method that returns the _correct_ set of features actually used by the learner. Since some features may have been removed by the feature selector, relying on the vectorizer vocabulary alone is not sufficient in that case. This method makes it easy to obtain the names of the features actually used, even when the feature selector has removed some of them. (Issue #714, PR #715)\n- Updated the learning-curve code to use the new API of `seaborn` v0.12.0 (PR #716)\n- Removed the Boston housing dataset from SKLL's examples and tests. The dataset has ethical problems and is being removed from `scikit-learn`. (Issues #700, #717)\n\n## ✔️ Tests ✔️ \n\n- Added new tests for `Learner.get_feature_names_out()`. (Issue #714, PR #715)\n\n## 👩‍🔬 Contributors 👨‍🔬\n\n(*Note*: this list is sorted alphabetically by last name, not by the quality or quantity of contributions to this release.)\n\nSanjna Kashyap (@Frost45), Nitin Madnani 
(@desilinguist), Matt Mulholland (@mulhod), and Remo Nitschke (@remo-help).","2022-09-14T19:33:18",{"id":181,"version":182,"summary_zh":183,"released_at":184},306181,"v3.0","This is a major new release with dependency updates and bugfixes!\n\n⚡️ **SKLL 3.0 is backwards incompatible with previous versions of SKLL and might yield different results compared to previous versions even with the same data and same settings.** ⚡️\n\n## 💥 Breaking changes 💥 \n\n- Python 3.7 is no longer officially supported, while Python 3.10 is now officially supported (Issue #701, PR #711).\n\n- `scikit-learn` has been updated to v1.0.1 (Issue #699, PR #702).\n\n- The configuration field `pos_label_str` in the "Tuning" section has been renamed to [`pos_label`](https:\u002F\u002Fskll.readthedocs.io\u002Fen\u002Flatest\u002Frun_experiment.html#pos-label-optional). Older configuration files containing `pos_label_str` will now raise an exception (Issue #569, PR #706).\n\n- The configuration field `log` in the "Output" section, which had already been renamed to [`logs`](https:\u002F\u002Fskll.readthedocs.io\u002Fen\u002Flatest\u002Frun_experiment.html#logs-optional) in SKLL v2.5, is now fully deprecated. Older configuration files containing `log` will now raise an exception (Issue #671, PR #705).\n\n## 💡 New features 💡 \n\n- SKLL now supports specifying [custom seed values](https:\u002F\u002Fskll.readthedocs.io\u002Fen\u002Flatest\u002Frun_experiment.html#cv-seed-optional) for cross-validation tasks. This option is useful for running the same cross-validation experiment multiple times (with the same number of folds but different fold splits) to evaluate the variance across repeated runs (Issue #593, PR #707).\n\n## 🛠 Bugfixes & improvements 🛠 \n\n- A more helpful error message is now raised when the `--drop-blanks` option is used with [`filter_features`](https:\u002F\u002Fskll.readthedocs.io\u002Fen\u002Flatest\u002Futilities.html#filter-features) and every row in the tabular feature file contains a blank column (Issue #693, PR #703).\n\n- The SKLL conda package is once again a generic Python package rather than a platform-specific one (Issue #710, PR #711).\n\n## 📖 Documentation updates 📖 \n\n- Added a new [section](https:\u002F\u002Fskll.readthedocs.io\u002Fen\u002Flatest\u002Ftutorial.html#create-virtual-environment-with-skll) to the hands-on tutorial that explains how to first install SKLL in a virtual environment (Issue #689, PR #709).\n\n- Added missing links to the SKLL repository in the data section of the tutorial (Issue #688, PR #691).\n\n- Updated [`CONTRIBUTING.md`](https:\u002F\u002Fgithub.com\u002FEducationalTestingService\u002Fskll\u002Fblob\u002Fmain\u002FCONTRIBUTING.md) with more detailed instructions for pushing code to the SKLL repository (Issue #680, PR #704).\n\n- Added a link to the `quadratic_weighted_kappa` implementation in RSMTool, which supports continuous values and can be used as a custom metric in SKLL, both for hyperparameter tuning and for validation. See the **quadratic_weighted_kappa** entry under the "Objectives" section (Issue #512, PR #704).\n\n- Continued improving the readability of function and method docstrings.\n\n## ✔️ Tests ✔️ \n\n- All tests now specify 
`local=True` when calling `run_configuration()`. This","2021-12-21T20:12:29",{"id":186,"version":187,"summary_zh":188,"released_at":189},306182,"v2.5","This is a major new release with dozens of new features, bugfixes, and documentation updates!\n\n⚡️ **SKLL 2.5 is backwards incompatible with previous versions of SKLL and might yield different results compared to previous versions even with the same data and same settings.** ⚡️\n\n## 💥 Breaking changes 💥 \n\n- Python 3.6 is no longer officially supported by SKLL, since the latest versions of `pandas` and `numpy` no longer support it.\n\n- The older top-level imports have been removed and should now be rewritten as follows (Issue #661, PR #662):\n  + `from skll import Learner` ➡️ `from skll.learner import Learner`\n  + `from skll import FeatureSet` ➡️ `from skll.data import FeatureSet`\n  + `from skll import run_configuration` ➡️ `from skll.experiments import run_configuration`\n\n- The default value of the `class_labels` keyword argument of the `Learner.predict()` method has changed from `False` to `True`. Therefore, for probabilistic classifiers, the method will now return class labels by default instead of class probabilities. To obtain class probabilities, set `class_labels` to `False` when calling the method (Issue #621, PR #622).\n\n- The [`filter_features`](https:\u002F\u002Fskll.readthedocs.io\u002Fen\u002Flatest\u002Futilities.html#filter-features) script now has more intuitive command-line options. The input file must be specified with `-i`\u002F`--input` and the output file with `-o`\u002F`--output`. In addition, since `-i` is now used for the input file, inverting the filtering command must now be done with `--inverse` (Issue #598, PR #660).\n\n- The `MegaMReader` and `MegaMWriter` classes have been removed from SKLL, since `.megam` files are no longer supported by SKLL (Issue #532, PR #557).\n\n- The [`param_grids`](https:\u002F\u002Fskll.readthedocs.io\u002Fen\u002Flatest\u002Frun_experiment.html#param-grids-optional) option in the configuration file is now a list of dictionaries, instead of the previous list of lists of dictionaries, one per learner specified in the `learners` option. Correspondingly, the `param_grid` option in `Learner.train()` and `Learner.cross_validate()` has changed from a list of dictionaries to a single dictionary, and the default parameter grids for the learners are now also given directly as dictionaries. (Issue #618, PR #619).\n\n- Running the [`learning_curve`](https:\u002F\u002Fskll.readthedocs.io\u002Fen\u002Flatest\u002Frun_experiment.html#learning-curve) task via a configuration file now requires at least 500 examples. A `ValueError` is raised if fewer are available. This behavior can only be overridden when using `Learner.learning_curve()` directly via the API (Issue #624, PR #631).\n\n## 💡 New features 💡 \n\n- [`VotingClassifier`](https:\u002F\u002Fscikit-learn.org\u002Fstable\u002Fmodules\u002Fgenerated\u002Fsklearn.ensemble.VotingClassifier.html) and [`VotingRegressor`](h","2021-02-26T03:01:01",{"id":191,"version":192,"summary_zh":193,"released_at":194},306183,"v2.1","This is a minor SKLL 
release whose only change is compatibility with scikit-learn v0.22.2.\n\n⚡️ **There are several [changes](https:\u002F\u002Fscikit-learn.org\u002Fstable\u002Fwhats_new\u002Fv0.22.html) in scikit-learn v0.22 that might cause several estimators and functions to produce different results even when fitted with the same data and parameters. Therefore, SKLL 2.1 might also yield different results compared to previous versions even with the same data and same settings.** ⚡️\n\n## 💡 New features 💡 \n\n- `scikit-learn` updated to 0.22.2 (Issue #594, PR #595).\n\n## 🔎 Other minor changes 🔎 \n\n- Updated imports to match the new `scikit-learn` API.\n- Fixed a minor bug in `logutils.py`.\n- Updated some test outputs due to changes in `scikit-learn` models and functions.\n- Updated some tests to allow testing of pre-release versions of the conda and PyPI packages.\n\n## 👩‍🔬 Contributors 👨‍🔬\n\n(*Note*: this list is sorted alphabetically by last name, not by the quality or quantity of contributions to this release.)\n\nAoife Cahill (@aoifecahill), Binod Gyawali (@bndgyawali), Matt Mulholland (@mulhod), Nitin Madnani (@desilinguist), and Mengxuan Zhao (@chaomenghsuan).","2020-03-13T17:22:26",{"id":196,"version":197,"summary_zh":198,"released_at":199},306184,"v2.0","This is a major new release. It's probably the largest SKLL release we have ever done since SKLL 1.0 came out! It includes dozens of new features, bugfixes, and documentation updates!\r\n\r\n⚡️ **SKLL 2.0 is backwards incompatible with previous versions of SKLL and might yield different results compared to previous versions even with the same data and same settings.** ⚡️ \r\n\r\n## 💥 Incompatible Changes 💥 \r\n\r\n- Python 2.7 is no longer supported since the underlying version of scikit-learn no longer supports it (Issue #497, PR #506).\r\n\r\n- Configuration field `objective` has been deprecated and replaced with `objectives` which [allows](https:\u002F\u002Fskll.readthedocs.io\u002Fen\u002Flatest\u002Frun_experiment.html#objectives-optional) specifying multiple tuning objectives for grid search (Issue #381, PR #458).\r\n\r\n- Grid search is now enabled by default in both the API as well as while using a configuration file (Issue #463, PR #465).\r\n\r\n- The `Predictor` class previously provided by the `generate_predictions` utility script is no longer available. 
If you were relying on this class, you should just load the model file and call `Learner.predict()` instead (Issue #562, PR #566). \r\n\r\n- There are no longer any default grid search objectives since the choice of objective is best left to the user. Note that since grid search is enabled by default, you must either choose an objective or explicitly disable grid search (Issue #381, PR #458).\r\n\r\n- `mean_squared_error` is no longer supported as a metric. Use `neg_mean_squared_error` instead (Issue #382, PR #470).\r\n\r\n- The `cv_folds_file` configuration file field is now just called `folds_file` (Issue #382, PR #470).\r\n\r\n- Running an experiment with the `learning_curve` task now requires specifying [`metrics`](https:\u002F\u002Fskll.readthedocs.io\u002Fen\u002Flatest\u002Frun_experiment.html#metrics-optional) in the `Output` section instead of `objectives` in the `Tuning` section (Issue #382, PR #470).\r\n\r\n- Previously when reading in CSV\u002FTSV files, missing data was automatically imputed as zeros. This is not appropriate in all cases. This is no longer the case and blanks are retained as is. Missing values will need to be explicitly dropped or replaced (see below) before using the file with SKLL (Issue #364, PRs #475 & #518). \r\n\r\n- `pandas` and `seaborn` are now direct dependencies of SKLL, and not optional (Issues #455 & #364, PRs #475 & #508).\r\n\r\n## 💡 New features 💡 \r\n\r\n- `CSVReader`\u002F`CSVWriter` & `TSVReader`\u002F`TSVWriter` now use `pandas` as the backend rather than custom code that relied on the `csv` module. This leads to significant speedups, especially for very large files (~5x for reading and ~10x for writing)! The speedup comes at the cost of a moderate increase in memory consumption. 
See detailed benchmarks [here](https:\u002F\u002Fgithub.com\u002FEducationalTestingService\u002Fskll\u002Ffiles\u002F3637196\u002Ftest_skll.pdf) (Issue #364, PRs #475 & #518).\r\n\r\n- SKLL models now have a new [`pipeline` attribute](https:\u002F\u002Fskll.readthedocs.io\u002Fen\u002Flatest\u002Frun_experiment.html#pipeline-optional) which makes it easy to manipulate and use them in `scikit-learn`, if needed (Issue #451, PR #474). \r\n\r\n- `scikit-learn` updated to 0.21.3 (Issue #457, PR #559).\r\n\r\n- The SKLL conda package is now a [generic Python package](https:\u002F\u002Fwww.anaconda.com\u002Fcondas-new-noarch-packages\u002F) which means the same package works on all platforms and on all Python versions >= 3.6. This package is hosted on the new, public [ETS anaconda channel](https:\u002F\u002Fanaconda.org\u002Fets).\r\n\r\n- SKLL learner hyperparameters have been updated to match the new `scikit-learn` defaults and those upcoming in 0.22.0 (Issue #438, PR #533).\r\n\r\n- Intermediate results for the grid search process are now available in the [`results.json`](https:\u002F\u002Fskll.readthedocs.io\u002Fen\u002Flatest\u002Frun_experiment.html#results-files) files (Issue #431, #471).  \r\n\r\n- The K models trained for each split of a K-fold cross-validation experiment can now be [saved](https:\u002F\u002Fskll.readthedocs.io\u002Fen\u002Flatest\u002Frun_experiment.html#save-cv-models-optional) to disk (Issue #501, PR #505). \r\n\r\n- Missing values in CSV\u002FTSV files can be dropped\u002Freplaced both via the [command line](https:\u002F\u002Fskll.readthedocs.io\u002Fen\u002Flatest\u002Futilities.html#cmdoption-filter-features-db) and the [API](https:\u002F\u002Fskll.readthedocs.io\u002Fen\u002Flatest\u002Fapi\u002Fdata.html#skll.data.readers.CSVReader) (Issue #540, PR #542). \r\n\r\n- Warnings from `scikit-learn` are now captured in SKLL log files (issue #441, PR #480). 
\r\n\r\n- `Learner.model_params()` and, consequently, the [`print_model_weights`](https:\u002F\u002Fskll.readthedocs.io\u002Fen\u002Flatest\u002Futilities.html#print-model-weights) utility script now work with models trained on hashed features (issue #444, PR #466). \r\n\r\n- The [`print_model_weights`](https:\u002F\u002Fskll.readthedocs.io\u002Fen\u002Flatest\u002Futilities.html#print-model-weights) utility script can now output feature weights sorted by class labels to improve readability (Issue #442, PR #468).\r\n\r\n- The [`skll_convert`](https:\u002F\u002Fskll.readthedocs.io\u002Fen\u002Flatest\u002Futilities.html#skll-convert) utility script can now convert feature files that do not conta","2019-10-24T16:58:36",{"id":201,"version":202,"summary_zh":203,"released_at":204},306185,"v1.5.3","This is a minor release of SKLL with the most notable change being compatibility with the latest version of scikit-learn (v0.20.1).\r\n\r\n### What's new\r\n\r\n- SKLL is now compatible with scikit-learn v0.20.1 (Issue #432, PR #439).\r\n- `GradientBoostingClassifier` and `GradientBoostingRegressor` now accept sparse matrices as input (Issue #428, PR #429).\r\n- The `model_params` property now works for SVC learners with a linear kernel (Issue #425, PR #443).\r\n- Improved documentation (Issue #423, PR #437). \r\n- Update `generate_predictions` to output the probabilities for _all_ classes instead of just the first class (Issue #430, PR #433). **Note**: this change breaks backward compatibility with previous SKLL versions since the output file now _always_ includes a column header. \r\n\r\n### Bugfixes\r\n\r\n- Fixed broken links in documentation (Issues #421 and #422, PR #437).\r\n- Fixed data type conversion in `NDJWriter` (Issue #416, PR #440).\r\n- Properly handle the possible combinations of trained model and prediction set vectorizers in `Learner.predict` (Issue #414, PR #445). 
\r\n\r\n### Other changes\r\n\r\n- Make the tests for `MLPClassifier` and `MLPRegressor` go faster (by turning off grid search) to prevent Travis CI from timing out (issue #434, PR #435).\r\n","2018-12-14T19:12:33",{"id":206,"version":207,"summary_zh":208,"released_at":209},306186,"v1.5.2","This is a hotfix release that addresses a single issue. \r\n\r\n`Learner` instances created via the `from_file()` method did not get loggers associated with them. This meant that any and all warnings generated for such learner instances would have led to `AttributeError` exceptions. ","2018-04-12T17:01:55",{"id":211,"version":212,"summary_zh":213,"released_at":214},306187,"v1.5.1","This is primarily a bug fix release.\r\n\r\n### Bugfixes\r\n\r\n- Generate the \"folds_file\" warnings only when \"folds_file\" is specified (issue #404, PR #405).\r\n- Modify `Learner.save()` to deal properly with reading in and re-saving older models (issue #406, PR #407).\r\n- Fix regression that caused the output directories to not be automatically created (issue #408, PR #409).\r\n","2018-01-31T18:04:04",{"id":216,"version":217,"summary_zh":218,"released_at":219},306188,"v1.5","This is a major new release of SKLL.\r\n\r\n### What's new\r\n- Several new scikit-learn learners included along with reasonable default parameter grids for tuning, where appropriate (issues #256 & #375, PR #377).\r\n    - `BayesianRidge`\r\n    - `DummyRegressor`\r\n    - `HuberRegressor`\r\n    - `Lars`\r\n    - `MLPRegressor`\r\n    - `RANSACRegressor`\r\n    - `TheilSenRegressor`\r\n    - `DummyClassifier`\r\n    - `MLPClassifier`\r\n    - `RidgeClassifier`\r\n- Allow computing any number of additional evaluation metrics in addition to the tuning objective (issue #350, PR #384). \r\n- Rename `cv_folds_file` configuration option to `folds_file`. 
The former is still supported with a deprecation warning but will be removed in the next release (PR #367).\r\n- Add a new configuration option [`use_folds_file_for_grid_search`](http:\u002F\u002Fskll.readthedocs.io\u002Fen\u002Flatest\u002Frun_experiment.html#use-folds-file-for-grid-search-optional) which controls whether the inner-loop grid-search in a cross-validation experiment with a custom folds file also uses the folds from the file. It's set to True by default. Setting it to False means that the inner loop uses regular 3-fold cross-validation and ignores the file (PR #367).\r\n- Also add a keyword argument called `use_custom_folds_for_grid_search` to the `Learner.cross_validate()` method (PR #367).\r\n- Learning curves can now be plotted from existing summary files using the new [`plot_learning_curves`](http:\u002F\u002Fskll.readthedocs.io\u002Fen\u002Flatest\u002Futilities.html#plot-learning-curves) command line utility (issue #346, PR #396).\r\n- Overhaul logging in SKLL. All messages are now logged both to the console (if running interactively) and to log files. Read more about the SKLL log files in the [Output Files section](http:\u002F\u002Fskll.readthedocs.io\u002Fen\u002Flatest\u002Frun_experiment.html#output-files) of the documentation (issue #369, PR #380).\r\n- `neg_log_loss` is now available as an objective function for classification (issue #327, PR #392).\r\n\r\n### Changes\r\n- SKLL now supports Python 3.6. Although Python 3.4 and 3.5 will still work, 3.6 is now the officially supported Python 3 version. Python 2.7 is still supported. (issue #355, PR #360).\r\n- The required version of scikit-learn has been bumped up to 0.19.1 (issue #328, PR #330). 
\r\n- The learning curve y-limits are now computed a bit more intelligently (issue #389, PR #390).\r\n- Raise a warning if ablation flag is used for an experiment that uses `train_file`\u002F`test_file` - this is not supported (issue #313, PR #392).\r\n- Raise a warning if both `fixed_parameters` and `param_grids` are specified (issue #185, PR #297).\r\n- Disable grid search if no default parameter grids are available in SKLL and the user doesn't provide parameter grids either (issue #376, PR #378).\r\n- SKLL has a copy of scikit-learn's `DictVectorizer` because it needs some custom functionality. _Most_ (but not all) of our modifications have now been merged into scikit-learn so our custom version is now significantly condensed down to just a single method (issue #263, PR #374).\r\n- Improved outputs for cross-validation tasks (issues #349 & #371, PRs #365 & #372)\r\n    - When a folds file is specified, the log erroneously showed the full dictionary.\r\n    - Show number of cross-validation folds in results to be \u003Cn> via folds file if a folds file is specified.\r\n    - Show grid search folds in results to be \u003Cn> via folds file if the grid search ends up using the folds file.\r\n    - Do not show the stratified folds information in results when a folds file is specified.\r\n    - Show the value of `use_folds_file_for_grid_search` in results when appropriate.\r\n    - Show grid search related information in results only when we are actually doing grid search.\r\n- The Travis CI plan was broken up into multiple jobs in order to get around the 50 minute limit (issue #385, PR #387).\r\n- For the conda package, some of the dependencies are now sourced from the `conda-forge` channel.\r\n\r\n### Bugfixes\r\n- Fix the bug that was causing the inner grid-search loop of a cross-validation experiment to use a single job instead of the number specified via `grid_search_jobs` (issue #363, PR #367).\r\n- Fix unbound variable in `readers.py` (issue #340, PR 
#392).\r\n- Fix bug when running a learning curve experiment via `gridmap` (issue #386, PR #390).\r\n- Fix a mismatch between the default number of grid search folds and the default number of slots requested via `gridmap` (issue #342, PR #367).\r\n\r\n### Documentation  \r\n- Update documentation and tests for all of the above changes and new features.\r\n- Update tutorial and installation instructions (issues #383 and #394, PR #399).\r\n- Standardize all of the function and method docstrings to be NumPy style. Add docstrings where missing (issue #373, PR #397).","2017-12-14T20:27:42",{"id":221,"version":222,"summary_zh":223,"released_at":224},306189,"v1.3","This is a major new release of SKLL.\n\n### New features\n- You can now generate learning curves for multiple learners, multiple feature sets, and multiple objectives in a single experiment by using `task=learning_curve` in the configuration file. See [documentation](http:\u002F\u002Fskll.readthedocs.io\u002Fen\u002Flatest\u002Frun_experiment.html#learning-curve) for more details (issue #221, PR #332). \n\n### Changes\n- The required version of scikit-learn has been bumped up to 0.18.1 (issue #328, PR #330). \n- SKLL now uses the MKL backend on macOS\u002FLinux instead of OpenBLAS when used as a `conda` package. \n\n### Bugfixes\n- Fix deprecation warning when using `Learner.model_params()` (issue #325, PR #329). \n- Update the definitions of SKLL F1 metrics as a result of scikit-learn upgrade (issue #325, PR #330).\n- Bring documentation for SVC parameter grids up to date with the code (issue #334, PR #337).\n- Update documentation to make it clear that the SKLL `conda` package is only available for Python 3.4. For other Python versions, users should use `pip`. 
\n","2017-02-13T19:48:47",{"id":226,"version":227,"summary_zh":228,"released_at":229},306190,"v1.2.1","This is primarily a bug fix release but also adds a major new API feature.\n\nNew API Feature:\n- If you use the SKLL API, you can now create `FeatureSet` instances _directly_ from `pandas` data frames (issue #261, PR #292).\n\nBugfixes:\n- Correctly parse floats in scientific notation, e.g., when specifying parameter grids and\u002For fixed parameters (issue #318, PR #320)\n- `print_model_weights` now correctly handles models trained with `fit_intercept=False` (issue #322, PR #323).\n","2016-05-20T19:10:19",{"id":231,"version":232,"summary_zh":233,"released_at":234},306191,"v1.2","This release includes major changes as well as a number of bugfixes.\n\nChanges:\n- The required version of scikit-learn has been bumped up to 0.17.1 (issue #273, PRs #288 and #308)\n- You can now optionally save cross-validation folds to a file for later analysis (issue #259, PR #262)\n- Update documentation to be clear about when two `FeatureSet` instances are deemed equal (issue #272, PR #294)\n- You can now specify multiple objective functions for parameter tuning (issue #115, PR #291)\n\nBugfixes:\n- Use a fixed random state when doing non-stratified k-fold cross-validation (issue #247, PR #286)\n- Fix errors when reusing relative paths in the output section (issue #252, PR #287)\n- `print_model_weights` now works correctly for multi-class logistic regression models (issue #274, PR #267)\n- Correctly raise an `IOError` if the config file is not correctly specified (issue #275, PR #281)\n- The `evaluate` task does not crash when the test data has labels that were not seen in training data (issue #279, PR #290)\n- The `fit()` method for rescaled versions of learners now works correctly when not doing grid search (issue #304, PR #306)\n- Fix minor typos in the documentation and 
tutorial.\n","2016-02-24T01:32:43",{"id":236,"version":237,"summary_zh":238,"released_at":239},306192,"v1.1.1","This is a minor bugfix release.  It fixes:\n-  Issue where a `FileExistsError` would be raised when processing many configs (PR #260)\n-  Instance of `cv_folds` instead of `num_cv_folds` in the documentation (PR #248).\n-  Crash with `print_model_weights` and Logistic Regression models without intercepts (issue #250, PR #251)\n-  Division by zero error when there was only one example (issue #253, PR #254)\n","2015-10-23T17:20:26",{"id":241,"version":242,"summary_zh":243,"released_at":244},306193,"v1.1.0","The biggest changes in this release are that the required version of scikit-learn has been bumped up to 0.16.1 and config file parsing is much more robust and gives much better error messages when users make mistakes.\n\n### Implemented enhancements\n- Base estimators other than the defaults are now supported for `AdaBoost` classifiers and regressors (#238)\n- User can now specify number of cross-validation folds to use in the config file (#222)\n- Decision Trees and Random Forests no longer need dense inputs (#207)\n- Stratification during cross-validation is now optional (#160)\n\n### Fixed bugs\n- Bug when checking if `hasher_features` is a valid option (#234)\n- Invalid\u002Fmissing\u002Fduplicate options in configuration are now detected (#223)\n- Stop modifying global numpy random seed (#220)\n- Relative paths specified in the config file are now relative to the config file location instead of to the current directory (#213)\n\n### Closed issues\n- Incompatibility with the latest version of scikit-learn (v0.16.1) (#235, #241, #233)\n- Learner.model_params will return weights with the wrong sign if sklearn is fixed (#111)\n\n### Merged pull requests\n- Overhaul configuration file parsing (@desilinguist, #246)\n- Several minor bugfixes (@desilinguist, #245)\n- Compatibility with scikit-learn v0.16.1 (@desilinguist, #243)\n- Expose cv_folds and 
stratified (@aoifecahill, #240)\n- Adding Report tests (@brianray, #237)\n\n[Full Changelog](https:\u002F\u002Fgithub.com\u002FEducationalTestingService\u002Fskll\u002Fcompare\u002Fv1.0.1...v1.1.0)\n","2015-07-20T13:35:52"]
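Several of the releases above rename or reshape configuration-file fields: `objective` became `objectives` (v2.0), `cv_folds_file` became `folds_file` (v1.5/v2.0), `log` became `logs` (removed entirely in v3.0), and `param_grids` became one dictionary per learner in `learners` (v2.5). As a minimal sketch only, a post-2.5 configuration using those renamed fields might look like the fragment below; the experiment name, paths, featureset name, and learner choice are hypothetical, and the authoritative list of options is the SKLL `run_experiment` documentation.

```ini
[General]
experiment_name = example_experiment
task = cross_validate

[Input]
; hypothetical paths and featureset name
train_directory = train
featuresets = [["example_features"]]
suffix = .jsonlines
learners = ["LogisticRegression"]

[Tuning]
grid_search = true
; `objectives` (plural) replaced `objective` in v2.0
objectives = ["f1_score_micro"]
; since v2.5: one dictionary per learner listed in `learners`
param_grids = [{"C": [0.01, 0.1, 1.0, 10.0]}]

[Output]
; `logs` replaced `log` (deprecated in v2.5, removed in v3.0)
logs = output
results = output
predictions = output
```

With a configuration like this, `run_experiment` would cross-validate the learner with grid search over `C` and write logs, results, and predictions under `output`.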