[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-alteryx--featuretools":3,"tool-alteryx--featuretools":61},[4,18,26,36,44,53],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":17},4358,"openclaw","openclaw\u002Fopenclaw","OpenClaw 是一款专为个人打造的本地化 AI 助手，旨在让你在自己的设备上拥有完全可控的智能伙伴。它打破了传统 AI 助手局限于特定网页或应用的束缚，能够直接接入你日常使用的各类通讯渠道，包括微信、WhatsApp、Telegram、Discord、iMessage 等数十种平台。无论你在哪个聊天软件中发送消息，OpenClaw 都能即时响应，甚至支持在 macOS、iOS 和 Android 设备上进行语音交互，并提供实时的画布渲染功能供你操控。\n\n这款工具主要解决了用户对数据隐私、响应速度以及“始终在线”体验的需求。通过将 AI 部署在本地，用户无需依赖云端服务即可享受快速、私密的智能辅助，真正实现了“你的数据，你做主”。其独特的技术亮点在于强大的网关架构，将控制平面与核心助手分离，确保跨平台通信的流畅性与扩展性。\n\nOpenClaw 非常适合希望构建个性化工作流的技术爱好者、开发者，以及注重隐私保护且不愿被单一生态绑定的普通用户。只要具备基础的终端操作能力（支持 macOS、Linux 及 Windows WSL2），即可通过简单的命令行引导完成部署。如果你渴望拥有一个懂你",349277,3,"2026-04-06T06:32:30",[13,14,15,16],"Agent","开发框架","图像","数据工具","ready",{"id":19,"name":20,"github_repo":21,"description_zh":22,"stars":23,"difficulty_score":10,"last_commit_at":24,"category_tags":25,"status":17},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,"2026-04-05T11:01:52",[14,15,13],{"id":27,"name":28,"github_repo":29,"description_zh":30,"stars":31,"difficulty_score":32,"last_commit_at":33,"category_tags":34,"status":17},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 
代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",158594,2,"2026-04-16T23:34:05",[14,13,35],"语言模型",{"id":37,"name":38,"github_repo":39,"description_zh":40,"stars":41,"difficulty_score":32,"last_commit_at":42,"category_tags":43,"status":17},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",108322,"2026-04-10T11:39:34",[14,15,13],{"id":45,"name":46,"github_repo":47,"description_zh":48,"stars":49,"difficulty_score":32,"last_commit_at":50,"category_tags":51,"status":17},6121,"gemini-cli","google-gemini\u002Fgemini-cli","gemini-cli 是一款由谷歌推出的开源 AI 命令行工具，它将强大的 Gemini 大模型能力直接集成到用户的终端环境中。对于习惯在命令行工作的开发者而言，它提供了一条从输入提示词到获取模型响应的最短路径，无需切换窗口即可享受智能辅助。\n\n这款工具主要解决了开发过程中频繁上下文切换的痛点，让用户能在熟悉的终端界面内直接完成代码理解、生成、调试以及自动化运维任务。无论是查询大型代码库、根据草图生成应用，还是执行复杂的 Git 操作，gemini-cli 都能通过自然语言指令高效处理。\n\n它特别适合广大软件工程师、DevOps 人员及技术研究人员使用。其核心亮点包括支持高达 100 万 token 的超长上下文窗口，具备出色的逻辑推理能力；内置 Google 搜索、文件操作及 Shell 命令执行等实用工具；更独特的是，它支持 MCP（模型上下文协议），允许用户灵活扩展自定义集成，连接如图像生成等外部能力。此外，个人谷歌账号即可享受免费的额度支持，且项目基于 Apache 2.0 
协议完全开源，是提升终端工作效率的理想助手。",100752,"2026-04-10T01:20:03",[52,13,15,14],"插件",{"id":54,"name":55,"github_repo":56,"description_zh":57,"stars":58,"difficulty_score":32,"last_commit_at":59,"category_tags":60,"status":17},4721,"markitdown","microsoft\u002Fmarkitdown","MarkItDown 是一款由微软 AutoGen 团队打造的轻量级 Python 工具，专为将各类文件高效转换为 Markdown 格式而设计。它支持 PDF、Word、Excel、PPT、图片（含 OCR）、音频（含语音转录）、HTML 乃至 YouTube 链接等多种格式的解析，能够精准提取文档中的标题、列表、表格和链接等关键结构信息。\n\n在人工智能应用日益普及的今天，大语言模型（LLM）虽擅长处理文本，却难以直接读取复杂的二进制办公文档。MarkItDown 恰好解决了这一痛点，它将非结构化或半结构化的文件转化为模型“原生理解”且 Token 效率极高的 Markdown 格式，成为连接本地文件与 AI 分析 pipeline 的理想桥梁。此外，它还提供了 MCP（模型上下文协议）服务器，可无缝集成到 Claude Desktop 等 LLM 应用中。\n\n这款工具特别适合开发者、数据科学家及 AI 研究人员使用，尤其是那些需要构建文档检索增强生成（RAG）系统、进行批量文本分析或希望让 AI 助手直接“阅读”本地文件的用户。虽然生成的内容也具备一定可读性，但其核心优势在于为机器",93400,"2026-04-06T19:52:38",[52,14],{"id":62,"github_repo":63,"name":64,"description_en":65,"description_zh":66,"ai_summary_zh":67,"readme_en":68,"readme_zh":69,"quickstart_zh":70,"use_case_zh":71,"hero_image_url":72,"owner_login":73,"owner_name":73,"owner_avatar_url":74,"owner_bio":75,"owner_company":76,"owner_location":76,"owner_email":76,"owner_twitter":77,"owner_website":78,"owner_url":79,"languages":80,"stars":89,"forks":90,"last_commit_at":91,"license":92,"difficulty_score":93,"env_os":94,"env_gpu":94,"env_ram":94,"env_deps":95,"category_tags":102,"github_topics":104,"view_count":32,"oss_zip_url":76,"oss_zip_packed_at":76,"status":17,"created_at":113,"updated_at":114,"faqs":115,"releases":145},8317,"alteryx\u002Ffeaturetools","featuretools","An open source python library for automated feature engineering","Featuretools 是一款专为自动化特征工程设计的开源 Python 库。在机器学习项目中，从原始数据中构建高质量特征往往是最耗时且最依赖人工经验的环节，Featuretools 正是为了解决这一痛点而生。它能够自动分析多表关联数据（如包含时间戳的客户交易记录），通过深度特征合成（DFS）算法，智能地组合和转换数据，瞬间生成成百上千个潜在特征，从而将数据科学家从繁琐的手工编码中解放出来。\n\n这款工具特别适合数据科学家、机器学习工程师以及研究人员使用。无论是需要快速构建基线模型的开发人员，还是希望探索更多特征可能性的算法研究员，都能从中受益。即使是不具备深厚领域知识的初学者，也能利用它快速从复杂的数据关系中提取有价值的信息。\n\nFeaturetools 
的核心亮点在于其独特的“实体集”（EntitySet）概念和对多表数据的原生支持。它不仅能处理简单的单表数据，更能理解表与表之间的关联，自动跨越表格边界进行聚合和变换操作。此外，它还支持自然语言处理（NLP）扩展和分布式计算（Dask），能够灵活应对各种复杂场景。通过标准化和自动化特征构建流程，Featuretools 让团队能更专注于模型优化与业务洞察，显著提升机器学习工作流的效率。","Featuretools 是一款专为自动化特征工程设计的开源 Python 库。在机器学习项目中，从原始数据中构建高质量特征往往是最耗时且最依赖人工经验的环节，Featuretools 正是为了解决这一痛点而生。它能够自动分析多表关联数据（如包含时间戳的客户交易记录），通过深度特征合成（DFS）算法，智能地组合和转换数据，瞬间生成成百上千个潜在特征，从而将数据科学家从繁琐的手工编码中解放出来。\n\n这款工具特别适合数据科学家、机器学习工程师以及研究人员使用。无论是需要快速构建基线模型的开发人员，还是希望探索更多特征可能性的算法研究员，都能从中受益。即使是不具备深厚领域知识的初学者，也能利用它快速从复杂的数据关系中提取有价值的信息。\n\nFeaturetools 的核心亮点在于其独特的“实体集”（EntitySet）概念和对多表数据的原生支持。它不仅能处理简单的单表数据，更能理解表与表之间的关联，自动跨越表格边界进行聚合和变换操作。此外，它还支持自然语言处理（NLP）扩展和分布式计算（Dask），能够灵活应对各种复杂场景。通过标准化和自动化特征构建流程，Featuretools 让团队能更专注于模型优化与业务洞察，显著提升机器学习工作流的效率。","\u003Cp align=\"center\">\n\u003Cimg width=50% src=\"https:\u002F\u002Fwww.featuretools.com\u002Fwp-content\u002Fuploads\u002F2017\u002F12\u002FFeatureLabs-Logo-Tangerine-800.png\" alt=\"Featuretools\" \u002F>\n\u003C\u002Fp>\n\u003Cp align=\"center\">\n\u003Ci>\"One of the holy grails of machine learning is to automate more and more of the feature engineering process.\"\u003C\u002Fi> ― Pedro Domingos, \u003Ca href=\"https:\u002F\u002Fbit.ly\u002Fthings_to_know_ml\">A Few Useful Things to Know about Machine Learning\u003C\u002Fa>\n\u003C\u002Fp>\n\n\u003Cp align=\"center\">\n    \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Falteryx\u002Ffeaturetools\u002Factions\u002Fworkflows\u002Ftests_with_latest_deps.yaml\" alt=\"Tests\" target=\"_blank\">\n        \u003Cimg src=\"https:\u002F\u002Fgithub.com\u002Falteryx\u002Ffeaturetools\u002Factions\u002Fworkflows\u002Ftests_with_latest_deps.yaml\u002Fbadge.svg?branch=main\" alt=\"Tests\" \u002F>\n    \u003C\u002Fa>\n    \u003Ca href=\"https:\u002F\u002Fcodecov.io\u002Fgh\u002Falteryx\u002Ffeaturetools\">\n        \u003Cimg src=\"https:\u002F\u002Fcodecov.io\u002Fgh\u002Falteryx\u002Ffeaturetools\u002Fbranch\u002Fmain\u002Fgraph\u002Fbadge.svg\"\u002F>\n    \u003C\u002Fa>\n    \u003Ca 
href='https:\u002F\u002Ffeaturetools.alteryx.com\u002Fen\u002Fstable\u002F?badge=stable'>\n        \u003Cimg src='https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Falteryx_featuretools_readme_13d664e1afd7.png' alt='Documentation Status' \u002F>\n    \u003C\u002Fa>\n    \u003Ca href=\"https:\u002F\u002Fbadge.fury.io\u002Fpy\u002Ffeaturetools\" target=\"_blank\">\n        \u003Cimg src=\"https:\u002F\u002Fbadge.fury.io\u002Fpy\u002Ffeaturetools.svg?maxAge=2592000\" alt=\"PyPI Version\" \u002F>\n    \u003C\u002Fa>\n    \u003Ca href=\"https:\u002F\u002Fanaconda.org\u002Fconda-forge\u002Ffeaturetools\" target=\"_blank\">\n        \u003Cimg src=\"https:\u002F\u002Fanaconda.org\u002Fconda-forge\u002Ffeaturetools\u002Fbadges\u002Fversion.svg\" alt=\"Anaconda Version\" \u002F>\n    \u003C\u002Fa>\n    \u003Ca href=\"https:\u002F\u002Fstackoverflow.com\u002Fquestions\u002Ftagged\u002Ffeaturetools\" target=\"_blank\">\n        \u003Cimg src=\"http:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fquestions-on_stackoverflow-blue.svg\" alt=\"StackOverflow\" \u002F>\n    \u003C\u002Fa>\n    \u003Ca href=\"https:\u002F\u002Fpepy.tech\u002Fproject\u002Ffeaturetools\" target=\"_blank\">\n        \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Falteryx_featuretools_readme_d5537c070735.png\" alt=\"PyPI Downloads\" \u002F>\n    \u003C\u002Fa>\n\u003C\u002Fp>\n\u003Chr>\n\n[Featuretools](https:\u002F\u002Fwww.featuretools.com) is a python library for automated feature engineering. 
See the [documentation](https:\u002F\u002Fdocs.featuretools.com) for more information.\n\n## Installation\nInstall with pip\n\n```\npython -m pip install featuretools\n```\n\nor from the Conda-forge channel on [conda](https:\u002F\u002Fanaconda.org\u002Fconda-forge\u002Ffeaturetools):\n\n```\nconda install -c conda-forge featuretools\n```\n\n### Add-ons\n\nYou can install add-ons individually or all at once by running:\n\n```\npython -m pip install \"featuretools[complete]\"\n```\n\n**Premium Primitives** - Use Premium Primitives from the premium-primitives repo\n\n```\npython -m pip install \"featuretools[premium]\"\n```\n\n**NLP Primitives** - Use Natural Language Primitives from the nlp-primitives repo\n\n```\npython -m pip install \"featuretools[nlp]\"\n```\n\n**Dask Support** - Use Dask to run DFS with njobs > 1\n\n```\npython -m pip install \"featuretools[dask]\"\n```\n\n## Example\nBelow is an example of using Deep Feature Synthesis (DFS) to perform automated feature engineering. In this example, we apply DFS to a multi-table dataset consisting of timestamped customer transactions.\n\n```python\n>> import featuretools as ft\n>> es = ft.demo.load_mock_customer(return_entityset=True)\n>> es.plot()\n```\n\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Falteryx_featuretools_readme_07b862ee0ce7.png\" width=\"350\">\n\nFeaturetools can automatically create a single table of features for any \"target dataframe\"\n```python\n>> feature_matrix, features_defs = ft.dfs(entityset=es, target_dataframe_name=\"customers\")\n>> feature_matrix.head(5)\n```\n\n```\n            zip_code  COUNT(transactions)  COUNT(sessions)  SUM(transactions.amount) MODE(sessions.device)  MIN(transactions.amount)  MAX(transactions.amount)  YEAR(join_date)  SKEW(transactions.amount)  DAY(join_date)                   ...                     
SUM(sessions.MIN(transactions.amount))  MAX(sessions.SKEW(transactions.amount))  MAX(sessions.MIN(transactions.amount))  SUM(sessions.MEAN(transactions.amount))  STD(sessions.SUM(transactions.amount))  STD(sessions.MEAN(transactions.amount))  SKEW(sessions.MEAN(transactions.amount))  STD(sessions.MAX(transactions.amount))  NUM_UNIQUE(sessions.DAY(session_start))  MIN(sessions.SKEW(transactions.amount))\ncustomer_id                                                                                                                                                                                                                                  ...\n1              60091                  131               10                  10236.77               desktop                      5.60                    149.95             2008                   0.070041               1                   ...                                                     169.77                                 0.610052                                   41.95                               791.976505                              175.939423                                 9.299023                                 -0.377150                                5.857976                                        1                                -0.395358\n2              02139                  122                8                   9118.81                mobile                      5.81                    149.15             2008                   0.028647              20                   ...                                                     
114.85                                 0.492531                                   42.96                               596.243506                              230.333502                                10.925037                                  0.962350                                7.420480                                        1                                -0.470007\n3              02139                   78                5                   5758.24               desktop                      6.78                    147.73             2008                   0.070814              10                   ...                                                      64.98                                 0.645728                                   21.77                               369.770121                              471.048551                                 9.819148                                 -0.244976                               12.537259                                        1                                -0.630425\n4              60091                  111                8                   8205.28               desktop                      5.73                    149.56             2008                   0.087986              30                   ...                                                      83.53                                 0.516262                                   17.27                               584.673126                              322.883448                                13.065436                                 -0.548969                               12.738488                                        1                                -0.497169\n5              02139                   58                4                   4571.37                tablet                      5.91                    148.17             2008                   0.085883              19                   ...                                                 
     73.09                                 0.830112                                   27.46                               313.448942                              198.522508                                 8.950528                                  0.098885                                5.599228                                        1                                -0.396571\n\n[5 rows x 69 columns]\n```\nWe now have a feature vector for each customer that can be used for machine learning. See the [documentation on Deep Feature Synthesis](https:\u002F\u002Ffeaturetools.alteryx.com\u002Fen\u002Fstable\u002Fgetting_started\u002Fafe.html) for more examples.\n\nFeaturetools contains many different types of built-in primitives for creating features. If the primitive you need is not included, Featuretools also allows you to [define your own custom primitives](https:\u002F\u002Ffeaturetools.alteryx.com\u002Fen\u002Fstable\u002Fgetting_started\u002Fprimitives.html#defining-custom-primitives).\n\n## Demos\n**Predict Next Purchase**\n\n[Repository](https:\u002F\u002Fgithub.com\u002Falteryx\u002Fopen_source_demos\u002Fblob\u002Fmain\u002Fpredict-next-purchase\u002F) | [Notebook](https:\u002F\u002Fgithub.com\u002Falteryx\u002Fopen_source_demos\u002Fblob\u002Fmain\u002Fpredict-next-purchase\u002FTutorial.ipynb)\n\nIn this demonstration, we use a multi-table dataset of 3 million online grocery orders from Instacart to predict what a customer will buy next. We show how to generate features with automated feature engineering and build an accurate machine learning pipeline using Featuretools, which can be reused for multiple prediction problems. For more advanced users, we show how to scale that pipeline to a large dataset using Dask.\n\nFor more examples of how to use Featuretools, check out our [demos](https:\u002F\u002Fwww.featuretools.com\u002Fdemos) page.\n\n## Testing & Development\n\nThe Featuretools community welcomes pull requests. 
Instructions for testing and development are available [here.](https:\u002F\u002Ffeaturetools.alteryx.com\u002Fen\u002Fstable\u002Finstall.html#development)\n\n## Support\nThe Featuretools community is happy to provide support to users of Featuretools. Project support can be found in four places depending on the type of question:\n\n1. For usage questions, use [Stack Overflow](https:\u002F\u002Fstackoverflow.com\u002Fquestions\u002Ftagged\u002Ffeaturetools) with the `featuretools` tag.\n2. For bugs, issues, or feature requests start a [Github issue](https:\u002F\u002Fgithub.com\u002Falteryx\u002Ffeaturetools\u002Fissues).\n3. For discussion regarding development on the core library, use [Slack](https:\u002F\u002Fjoin.slack.com\u002Ft\u002Falteryx-oss\u002Fshared_invite\u002Fzt-182tyvuxv-NzIn6eiCEf8TBziuKp0bNA).\n4. For everything else, the core developers can be reached by email at open_source_support@alteryx.com\n\n## Citing Featuretools\n\nIf you use Featuretools, please consider citing the following paper:\n\nJames Max Kanter, Kalyan Veeramachaneni. [Deep feature synthesis: Towards automating data science endeavors.](https:\u002F\u002Fdai.lids.mit.edu\u002Fwp-content\u002Fuploads\u002F2017\u002F10\u002FDSAA_DSM_2015.pdf) *IEEE DSAA 2015*.\n\nBibTeX entry:\n\n```bibtex\n@inproceedings{kanter2015deep,\n  author    = {James Max Kanter and Kalyan Veeramachaneni},\n  title     = {Deep feature synthesis: Towards automating data science endeavors},\n  booktitle = {2015 {IEEE} International Conference on Data Science and Advanced Analytics, DSAA 2015, Paris, France, October 19-21, 2015},\n  pages     = {1--10},\n  year      = {2015},\n  organization={IEEE}\n}\n```\n\n## Built at Alteryx\n\n**Featuretools** is an open source project maintained by [Alteryx](https:\u002F\u002Fwww.alteryx.com). To see the other open source projects we’re working on visit [Alteryx Open Source](https:\u002F\u002Fwww.alteryx.com\u002Fopen-source). 
If building impactful data science pipelines is important to you or your business, please get in touch.\n\n\u003Cp align=\"center\">\n  \u003Ca href=\"https:\u002F\u002Fwww.alteryx.com\u002Fopen-source\">\n    \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Falteryx_featuretools_readme_2f25fd22244f.png\" alt=\"Alteryx Open Source\" width=\"800\"\u002F>\n  \u003C\u002Fa>\n\u003C\u002Fp>\n","\u003Cp align=\"center\">\n\u003Cimg width=50% src=\"https:\u002F\u002Fwww.featuretools.com\u002Fwp-content\u002Fuploads\u002F2017\u002F12\u002FFeatureLabs-Logo-Tangerine-800.png\" alt=\"Featuretools\" \u002F>\n\u003C\u002Fp>\n\u003Cp align=\"center\">\n\u003Ci>“机器学习的圣杯之一，就是让特征工程的过程越来越自动化。”\u003C\u002Fi> ― 佩德罗·多明戈斯，《关于机器学习的一些有用知识》\n\u003C\u002Fp>\n\n\u003Cp align=\"center\">\n    \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Falteryx\u002Ffeaturetools\u002Factions\u002Fworkflows\u002Ftests_with_latest_deps.yaml\" alt=\"测试\" target=\"_blank\">\n        \u003Cimg src=\"https:\u002F\u002Fgithub.com\u002Falteryx\u002Ffeaturetools\u002Factions\u002Fworkflows\u002Ftests_with_latest_deps.yaml\u002Fbadge.svg?branch=main\" alt=\"测试\" \u002F>\n    \u003C\u002Fa>\n    \u003Ca href=\"https:\u002F\u002Fcodecov.io\u002Fgh\u002Falteryx\u002Ffeaturetools\">\n        \u003Cimg src=\"https:\u002F\u002Fcodecov.io\u002Fgh\u002Falteryx\u002Ffeaturetools\u002Fbranch\u002Fmain\u002Fgraph\u002Fbadge.svg\"\u002F>\n    \u003C\u002Fa>\n    \u003Ca href='https:\u002F\u002Ffeaturetools.alteryx.com\u002Fen\u002Fstable\u002F?badge=stable'>\n        \u003Cimg src='https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Falteryx_featuretools_readme_13d664e1afd7.png' alt='文档状态' \u002F>\n    \u003C\u002Fa>\n    \u003Ca href=\"https:\u002F\u002Fbadge.fury.io\u002Fpy\u002Ffeaturetools\" target=\"_blank\">\n        \u003Cimg src=\"https:\u002F\u002Fbadge.fury.io\u002Fpy\u002Ffeaturetools.svg?maxAge=2592000\" alt=\"PyPI版本\" \u002F>\n    \u003C\u002Fa>\n    \u003Ca 
href=\"https:\u002F\u002Fanaconda.org\u002Fconda-forge\u002Ffeaturetools\" target=\"_blank\">\n        \u003Cimg src=\"https:\u002F\u002Fanaconda.org\u002Fconda-forge\u002Ffeaturetools\u002Fbadges\u002Fversion.svg\" alt=\"Anaconda版本\" \u002F>\n    \u003C\u002Fa>\n    \u003Ca href=\"https:\u002F\u002Fstackoverflow.com\u002Fquestions\u002Ftagged\u002Ffeaturetools\" target=\"_blank\">\n        \u003Cimg src=\"http:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fquestions-on_stackoverflow-blue.svg\" alt=\"StackOverflow\" \u002F>\n    \u003C\u002Fa>\n    \u003Ca href=\"https:\u002F\u002Fpepy.tech\u002Fproject\u002Ffeaturetools\" target=\"_blank\">\n        \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Falteryx_featuretools_readme_d5537c070735.png\" alt=\"PyPI下载量\" \u002F>\n    \u003C\u002Fa>\n\u003C\u002Fp>\n\u003Chr>\n\n[Featuretools](https:\u002F\u002Fwww.featuretools.com) 是一个用于自动化特征工程的 Python 库。更多信息请参阅 [文档](https:\u002F\u002Fdocs.featuretools.com)。\n\n## 安装\n使用 pip 安装：\n\n```\npython -m pip install featuretools\n```\n\n或者从 Conda-forge 频道通过 [conda](https:\u002F\u002Fanaconda.org\u002Fconda-forge\u002Ffeaturetools) 安装：\n\n```\nconda install -c conda-forge featuretools\n```\n\n### 插件\n您可以单独安装插件，也可以一次性全部安装，方法是运行：\n\n```\npython -m pip install \"featuretools[complete]\"\n```\n\n**高级原语** - 使用 premium-primitives 仓库中的高级原语：\n\n```\npython -m pip install \"featuretools[premium]\"\n```\n\n**NLP 原语** - 使用 nlp-primitives 仓库中的自然语言处理原语：\n\n```\npython -m pip install \"featuretools[nlp]\"\n```\n\n**Dask 支持** - 使用 Dask 并行运行 DFS（njobs > 1）：\n\n```\npython -m pip install \"featuretools[dask]\"\n```\n\n## 示例\n以下是使用深度特征合成（DFS）进行自动化特征工程的示例。在本例中，我们将 DFS 应用于包含带时间戳的客户交易记录的多表数据集。\n\n```python\n>> import featuretools as ft\n>> es = ft.demo.load_mock_customer(return_entityset=True)\n>> es.plot()\n```\n\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Falteryx_featuretools_readme_07b862ee0ce7.png\" width=\"350\">\n\nFeaturetools 可以自动为任意“目标数据框”创建单个特征表。\n```python\n>> feature_matrix, features_defs = ft.dfs(entityset=es, 
target_dataframe_name=\"customers\")\n>> feature_matrix.head(5)\n```\n\n```\n            zip_code  COUNT(transactions)  COUNT(sessions)  SUM(transactions.amount) MODE(sessions.device)  MIN(transactions.amount)  MAX(transactions.amount)  YEAR(join_date)  SKEW(transactions.amount)  DAY(join_date)                   ...                     SUM(sessions.MIN(transactions.amount))  MAX(sessions.SKEW(transactions.amount))  MAX(sessions.MIN(transactions.amount))  SUM(sessions.MEAN(transactions.amount))  STD(sessions.SUM(transactions.amount))  STD(sessions.MEAN(transactions.amount))  SKEW(sessions.MEAN(transactions.amount))  STD(sessions.MAX(transactions.amount))  NUM_UNIQUE(sessions.DAY(session_start))  MIN(sessions.SKEW(transactions.amount))\ncustomer_id                                                                                                                                                                                                                                  ...\n1              60091                  131               10                  10236.77               desktop                      5.60                    149.95             2008                   0.070041               1                   ...                                                     169.77                                 0.610052                                   41.95                               791.976505                              175.939423                                 9.299023                                 -0.377150                                5.857976                                        1                                -0.395358\n2              02139                  122                8                   9118.81                mobile                      5.81                    149.15             2008                   0.028647              20                   ...                                                     
114.85                                 0.492531                                   42.96                               596.243506                              230.333502                                10.925037                                  0.962350                                7.420480                                        1                                -0.470007\n3              02139                   78                5                   5758.24               desktop                      6.78                    147.73             2008                   0.070814              10                   ...                                                      64.98                                 0.645728                                   21.77                               369.770121                              471.048551                                 9.819148                                 -0.244976                               12.537259                                        1                                -0.630425\n4              60091                  111                8                   8205.28               desktop                      5.73                    149.56             2008                   0.087986              30                   ...                                                      83.53                                 0.516262                                   17.27                               584.673126                              322.883448                                13.065436                                 -0.548969                               12.738488                                        1                                -0.497169\n5              02139                   58                4                   4571.37                tablet                      5.91                    148.17             2008                   0.085883              19                   ...                                                
      73.09                                 0.830112                                   27.46                               313.448942                              198.522508                                 8.950528                                  0.098885                                5.599228                                        1                                -0.396571\n\n[5 rows x 69 columns]\n```\n现在我们已经为每个客户生成了一个可用于机器学习的特征向量。更多示例请参阅 [深度特征合成文档](https:\u002F\u002Ffeaturetools.alteryx.com\u002Fen\u002Fstable\u002Fgetting_started\u002Fafe.html)。\n\nFeaturetools 包含许多不同类型的内置原语（primitives）来创建特征。如果所需的原语未包含在内，Featuretools 还允许您 [定义自己的自定义原语](https:\u002F\u002Ffeaturetools.alteryx.com\u002Fen\u002Fstable\u002Fgetting_started\u002Fprimitives.html#defining-custom-primitives)。\n\n## 演示\n**预测下一次购买**\n\n[仓库](https:\u002F\u002Fgithub.com\u002Falteryx\u002Fopen_source_demos\u002Fblob\u002Fmain\u002Fpredict-next-purchase\u002F) | [笔记本](https:\u002F\u002Fgithub.com\u002Falteryx\u002Fopen_source_demos\u002Fblob\u002Fmain\u002Fpredict-next-purchase\u002FTutorial.ipynb)\n\n在本次演示中，我们使用来自 Instacart 的 300 万条在线杂货订单多表数据集来预测客户下次将购买什么。我们展示了如何通过自动化特征工程生成特征，并利用 Featuretools 构建一个准确的机器学习流水线，该流水线可重复用于多个预测问题。对于高级用户，我们还演示了如何使用 Dask 将该流水线扩展到大型数据集。\n\n有关如何使用 Featuretools 的更多示例，请查看我们的 [演示页面](https:\u002F\u002Fwww.featuretools.com\u002Fdemos)。\n\n## 测试与开发\n\nFeaturetools 社区欢迎 Pull Request。测试和开发说明可在 [此处](https:\u002F\u002Ffeaturetools.alteryx.com\u002Fen\u002Fstable\u002Finstall.html#development)找到。\n\n## 支持\nFeaturetools 社区乐于为 Featuretools 的用户提供支持。根据问题类型，项目支持可通过以下四种方式获取：\n\n1. 如有使用方面的问题，请在 [Stack Overflow](https:\u002F\u002Fstackoverflow.com\u002Fquestions\u002Ftagged\u002Ffeaturetools) 上使用 `featuretools` 标签提问。\n2. 如遇到 bug、问题或功能请求，请在 [Github](https:\u002F\u002Fgithub.com\u002Falteryx\u002Ffeaturetools\u002Fissues) 上提交 issue。\n3. 如需讨论核心库的开发相关话题，请加入 [Slack](https:\u002F\u002Fjoin.slack.com\u002Ft\u002Falteryx-oss\u002Fshared_invite\u002Fzt-182tyvuxv-NzIn6eiCEf8TBziuKp0bNA) 频道。\n4. 
对于其他任何问题，可发送邮件至 open_source_support@alteryx.com 联系核心开发者。\n\n## 引用 Featuretools\n\n如果您使用了 Featuretools，请考虑引用以下论文：\n\nJames Max Kanter, Kalyan Veeramachaneni. [深度特征合成：迈向数据科学工作的自动化。](https:\u002F\u002Fdai.lids.mit.edu\u002Fwp-content\u002Fuploads\u002F2017\u002F10\u002FDSAA_DSM_2015.pdf) *IEEE DSAA 2015*。\n\nBibTeX 条目如下：\n\n```bibtex\n@inproceedings{kanter2015deep,\n  author    = {James Max Kanter and Kalyan Veeramachaneni},\n  title     = {Deep feature synthesis: Towards automating data science endeavors},\n  booktitle = {2015 {IEEE} International Conference on Data Science and Advanced Analytics, DSAA 2015, Paris, France, October 19-21, 2015},\n  pages     = {1--10},\n  year      = {2015},\n  organization={IEEE}\n}\n```\n\n## 由 Alteryx 构建\n\n**Featuretools** 是由 [Alteryx](https:\u002F\u002Fwww.alteryx.com) 维护的开源项目。如需了解我们正在开展的其他开源项目，请访问 [Alteryx Open Source](https:\u002F\u002Fwww.alteryx.com\u002Fopen-source)。如果构建具有影响力的数据科学流水线对您或您的企业至关重要，请随时与我们联系。\n\n\u003Cp align=\"center\">\n  \u003Ca href=\"https:\u002F\u002Fwww.alteryx.com\u002Fopen-source\">\n    \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Falteryx_featuretools_readme_2f25fd22244f.png\" alt=\"Alteryx Open Source\" width=\"800\"\u002F>\n  \u003C\u002Fa>\n\u003C\u002Fp>","# Featuretools 快速上手指南\n\nFeaturetools 是一个用于自动化特征工程的 Python 库。它通过深度特征合成（Deep Feature Synthesis, DFS）技术，能够自动从多表数据关系中生成丰富的特征向量，极大地简化了机器学习前的数据准备工作。\n\n## 环境准备\n\n*   **操作系统**：Windows、macOS 或 Linux\n*   **Python 版本**：建议 Python 3.8 及以上版本\n*   **前置依赖**：需预先安装 `pip` 或 `conda` 包管理工具\n*   **网络建议**：国内用户建议使用清华源或阿里源加速安装\n\n## 安装步骤\n\n您可以选择使用 `pip` 或 `conda` 进行安装。\n\n### 方式一：使用 pip 安装（推荐）\n\n**基础安装：**\n```bash\npython -m pip install featuretools -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple\n```\n\n**完整功能安装（包含 NLP、Dask 等扩展支持）：**\n```bash\npython -m pip install \"featuretools[complete]\" -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple\n```\n\n### 方式二：使用 conda 安装\n\n```bash\nconda install -c conda-forge 
featuretools\n```\n\n## 基本使用\n\n以下示例演示了如何加载模拟数据、构建实体集（EntitySet），并使用深度特征合成（DFS）自动生成特征矩阵。\n\n### 1. 加载数据并查看关系\n首先导入库并加载内置的模拟客户交易数据，该数据包含多个关联表。\n\n```python\nimport featuretools as ft\n\n# 加载模拟数据并返回实体集\nes = ft.demo.load_mock_customer(return_entityset=True)\n\n# 可视化数据表之间的关系（可选）\nes.plot()\n```\n\n### 2. 执行深度特征合成 (DFS)\n指定目标表（例如 `\"customers\"`），Featuretools 会自动遍历关联表，计算聚合特征（如计数、总和、均值等）和转换特征。\n\n```python\n# 执行 DFS，目标是为 \"customers\" 表生成特征\nfeature_matrix, features_defs = ft.dfs(entityset=es, target_dataframe_name=\"customers\")\n\n# 查看生成的特征矩阵前 5 行\nprint(feature_matrix.head(5))\n```\n\n**输出示例：**\n生成的表格将包含大量自动衍生的特征列，例如 `COUNT(transactions)`（交易次数）、`SUM(transactions.amount)`（交易总额）、`MODE(sessions.device)`（常用设备）等。\n\n```text\n            zip_code  COUNT(transactions)  COUNT(sessions)  SUM(transactions.amount) MODE(sessions.device)  ...\ncustomer_id                                                                                                 ...\n1              60091                  131               10                  10236.77               desktop  ...\n2              02139                  122                8                   9118.81                mobile  ...\n3              02139                   78                5                   5758.24               desktop  ...\n4              60091                  111                8                   8205.28               desktop  ...\n5              02139                   58                4                   4571.37                tablet  ...\n\n[5 rows x 69 columns]\n```\n\n现在，`feature_matrix` 即可直接作为输入数据用于训练机器学习模型。","某电商数据科学团队正致力于构建用户流失预测模型，需要基于分散在客户、会话和交易三张表中的历史行为数据提取高价值特征。\n\n### 没有 featuretools 时\n- **人工编码效率极低**：数据科学家需花费数周时间手动编写复杂的 SQL 聚合语句和 Python 循环，以计算“用户过去 30 天平均消费额”或“最大单笔交易金额”等基础指标。\n- **深层关系难以挖掘**：面对多表关联，人工很难跨越层级去构造组合特征（如“每次会话中交易金额的标准差的总和”），导致大量隐含模式被遗漏。\n- **特征一致性差**：不同成员编写的特征逻辑风格不一，容易出现统计口径错误，且代码复用性低，维护成本高昂。\n- **迭代周期漫长**：每当业务增加新数据字段，整个特征工程流程需推倒重来，严重拖慢模型上线和验证的速度。\n\n### 使用 featuretools 后\n- 
**自动化生成海量特征**：只需定义好实体关系，featuretools 即可通过深度特征合成（DFS）技术在几分钟内自动生成数百个跨表聚合特征，覆盖计数、求和、趋势等多种变换。\n- **自动探索复杂交互**：工具能自动递归地组合基础算子，轻松发现人工难以想到的深层嵌套特征（如会话层级的偏度再聚合），显著提升模型区分度。\n- **标准化与可复现**：所有特征生成逻辑由统一算法驱动，消除了人为编码误差，生成的特征定义清晰可追溯，便于团队协作和审计。\n- **敏捷响应业务变化**：当新增数据列时，仅需重新运行 DFS 流程，featuretools 会自动将其纳入计算体系，将特征迭代周期从数周缩短至数小时。\n\nfeaturetools 将数据科学家从繁琐的手工特征构造中解放出来，使其能专注于更高价值的模型策略与业务洞察。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Falteryx_featuretools_9e3ed3d5.png","alteryx","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002Falteryx_e8409361.png","Alteryx Open Source",null,"alteryxoss","http:\u002F\u002Fwww.alteryx.com","https:\u002F\u002Fgithub.com\u002Falteryx",[81,85],{"name":82,"color":83,"percentage":84},"Python","#3572A5",99.9,{"name":86,"color":87,"percentage":88},"Makefile","#427819",0.1,7629,908,"2026-04-15T03:28:21","BSD-3-Clause",1,"未说明",{"notes":96,"python":94,"dependencies":97},"该工具是一个用于自动化特征工程的 Python 库，支持通过 pip 或 conda 安装。若需使用分布式计算加速（多核并行），需额外安装 dask 扩展包（featuretools[dask]）。若需使用自然语言处理或高级预定义特征原语，需分别安装 nlp 或 premium 扩展包。文档未明确列出具体的操作系统、Python 版本及硬件资源需求，通常兼容主流 Python 数据科学生态环境。",[98,99,100,101],"pandas","numpy","scipy","dask (可选)",[103,16,14],"其他",[105,106,107,108,109,110,111,112],"feature-engineering","machine-learning","data-science","automated-machine-learning","automl","python","scikit-learn","automated-feature-engineering","2026-03-27T02:49:30.150509","2026-04-17T08:24:34.909044",[116,121,126,131,136,141],{"id":117,"question_zh":118,"answer_zh":119,"source_url":120},37216,"Featuretools 是否支持 Spark 或 Koalas DataFrames？","是的，Featuretools v0.19.0 版本起已包含对 Koalas DataFrames 的支持。虽然原生的 PySpark 支持已被弃用，但可以通过 Koalas 实现与 Apache Spark 的互操作性。有关索引策略等详细信息，可参考 Databricks 关于 Koalas 与 Spark 互操作性的官方博客。","https:\u002F\u002Fgithub.com\u002Falteryx\u002Ffeaturetools\u002Fissues\u002F887",{"id":122,"question_zh":123,"answer_zh":124,"source_url":125},37217,"升级 Featuretools 后运行 dfs() 出现 'IndexError: Too many levels' 错误怎么办？","该问题已在主分支修复，并包含在 Featuretools v0.5.1 
及更高版本中。如果遇到此错误，请将库升级到最新版本（pip install --upgrade featuretools）。如果升级后问题仍然存在，请提交新的 Issue。","https:\u002F\u002Fgithub.com\u002Falteryx\u002Ffeaturetools\u002Fissues\u002F252",{"id":127,"question_zh":128,"answer_zh":129,"source_url":130},37218,"使用 Dask 运行 Featuretools 时内存溢出（Memory Crashing）如何解决？","在使用 Dask 分布式客户端时，需显式限制每个 worker 的内存使用量以防止系统崩溃。示例配置如下：\nclient = Client(n_workers=2, threads_per_worker=2, memory_limit='2GB')\n确保在创建 Client 时设置 memory_limit 参数，并根据系统总内存合理分配。此外，确认数据分片（npartitions）数量适中，避免单个任务负载过大。","https:\u002F\u002Fgithub.com\u002Falteryx\u002Ffeaturetools\u002Fissues\u002F1357",{"id":132,"question_zh":133,"answer_zh":134,"source_url":135},37219,"DIFF 原始特征（Primitive）是如何计算值的，特别是聚合后的值？","DIFF 的计算逻辑在 Issue #824 中进行了更新。对于聚合值（如 MAX(sales.amount)），DIFF 是基于截止时间和训练窗口内的历史数据计算的。如果您需要自定义行为，可以创建一个 uses_full_entity=False 的自定义原始特征，但需注意这会导致结果依赖于具体的截止时间点，可能产生意想不到的结果。建议查阅最新文档以了解更新后的具体计算细节。","https:\u002F\u002Fgithub.com\u002Falteryx\u002Ffeaturetools\u002Fissues\u002F809",{"id":137,"question_zh":138,"answer_zh":139,"source_url":140},37220,"如何使用滑动时间窗口或多个时间快照进行特征工程（例如预测未来事件）？","推荐使用单个实体集（EntitySet），但调用两次 ft.dfs（设置 features_only=True）来生成不同的特征定义列表。然后，针对每个列表分别调用 ft.calculate_feature_matrix()，并在每次调用时传入不同的 training_window 参数。最后将生成的特征矩阵合并。这种方法允许为每个客户或实体生成多个时间切片的特征，适用于流失预测、预防性维护等场景。","https:\u002F\u002Fgithub.com\u002Falteryx\u002Ffeaturetools\u002Fissues\u002F100",{"id":142,"question_zh":143,"answer_zh":144,"source_url":135},37221,"运行 DFS 时出现重复列（Duplicate Columns）如何处理？","如果在生成的特征矩阵中发现重复列，可以尝试在调用 ft.dfs 时显式设置 trans_primitives=[]（如果不需转换原始特征），或者检查是否同时应用了产生相同结果的聚合和转换原始特征。通过限制使用的原始特征列表通常可以解决此问题。",[146,151,156,161,166,171,176,181,186,191,196,201,206,211,216,221,226,231,236,241],{"id":147,"version":148,"summary_zh":149,"released_at":150},297747,"v1.31.0","**v1.31.0 2024年5月14日**\n\n* 增强功能\n    * 添加对 Python 3.12 的支持 (#2713)\n* 修复\n    * 将 `flatten_list` 工具函数移至 `feature_discovery` 模块，以修复导入错误 (#2702)\n* 变更\n    * 暂时限制 Dask 的版本 (#2694)\n    * 移除从 Dask 或 PySpark 数据框创建 `EntitySets` 的支持 (#2705)\n    
* 在依赖文件中提高 `tqdm` 和 `pip` 的最低版本要求 (#2716)\n    * 在调用 `tarfile.extractall` 时使用 `filter` 参数，以安全地反序列化 `EntitySets` (#2722)\n* 测试相关变更\n    * 修复序列化测试，使其与 pytest 8.1.1 兼容 (#2694)\n    * 更新以使最小依赖检查器能够正常运行 (#2709)\n    * 更新拉取请求检查的 CI 任务 (#2720)\n    * 更新发布说明更新检查的 CI 任务 (#2726)\n\n感谢以下人员为本次发布做出贡献：\n@thehomebrewnerd","2024-05-14T18:59:20",{"id":152,"version":153,"summary_zh":154,"released_at":155},297748,"v1.30.0","v1.30.0 2024年2月26日\n====================\n* 变更\n    * 更新 numpy、pandas 和 Woodwork 的最低要求 (#2681)\n    * 更新发布说明的版本号以反映此次发布 (#2689)\n* 测试相关变更\n    * 更新 ``make_ecommerce_entityset``，使其无需 Dask 即可运行 (#2677)\n\n感谢以下人员对本次发布的贡献：\n@tamargrey, @thehomebrewnerd","2024-02-26T17:28:49",{"id":157,"version":158,"summary_zh":159,"released_at":160},297749,"v1.29.0","### v1.29.0 2024年2月16日\n#### 警告：\nFeaturetools 的此版本将不再支持 Python 3.8。\n\n* 修复\n    * 修复依赖问题 (#2644, #2656)\n    * 为 pandas 2.2.0 中 nunique 的 bug 提供临时解决方案，并取消对 pandas 依赖的固定版本限制 (#2657)\n* 变更\n    * 修复 is_categorical_dtype 的弃用警告 (#2641)\n    * 移除 Spark 安装时对 woodwork、pyarrow、numpy 和 pandas 的版本固定限制 (#2661)\n* 文档变更\n    * 更新 Featuretools 的 logo，使其在深色模式下正常显示 (#2632)\n    * 在无法发布的情况下，移除对高级原语的引用 (#2674)\n* 测试变更\n    * 更新测试以兼容 ``holidays`` 的新版本 (#2636)\n    * 将 ruff 升级至 0.1.6，并使用 ruff 作为代码格式化工具 (#2639)\n    * 更新 ``release.yaml``，在 PyPI 发布时使用可信发布者 (#2646, #2653, #2654)\n    * 更新依赖检查工具和测试，加入 Dask 支持 (#2658)\n    * 修复运行 Woodwork 主分支相关测试的问题，使其能够被触发 (#2657)\n    * 修复最小依赖检查动作 (#2664)\n    * 修复针对 Woodwork 主分支测试的 Slack 警报问题 (#2668)\n\n感谢以下贡献者为本次发布做出的贡献：\n@gsheni, @thehomebrewnerd, @tamargrey, @LakshmanKishore","2024-02-16T19:50:24",{"id":162,"version":163,"summary_zh":164,"released_at":165},297750,"v1.28.0","**v1.28.0 2023年10月26日**\n\n* 修复\n    * 修复 `PercentTrue` 原语中默认值的 bug (#2627)\n* 变更\n    * 重构 `featuretools\u002Ftests\u002Fprimitive_tests\u002Futils.py`，使用列表推导式以提升代码的 Python 风格 (#2607)\n    * 重构 `can_stack_primitive_on_inputs` (#2522)\n    * 更新文档图片的 S3 存储桶 (#2593)\n    * 暂时将 pandas 的最高版本限制为 `\u003C2.1.0`，并将 pyarrow 的最高版本限制为 
`\u003C13.0.0` (#2609)\n    * 为兼容 pandas 版本 `2.1.0` 进行更新，并移除 pandas 的版本上限限制 (#2616)\n* 文档变更\n    * 修复 README 中测试相关徽章的问题 (#2598)\n    * 更新 Read the Docs 配置，使用 build.os (#2601)\n* 测试变更\n    * 更新 Airflow Looking Glass 性能测试工作流 (#2615)\n    * 移除旧的性能测试工作流 (#2620)\n\n感谢以下人员对本次发布做出的贡献：\n@gsheni、@petejanuszewski1、@thehomebrewnerd、@tosemml","2023-10-26T18:21:15",{"id":167,"version":168,"summary_zh":169,"released_at":170},297751,"v1.27.0","**v1.27.0 2023年7月24日**\n\n* 增强功能\n    * 添加对 Python 3.11 的支持 (#2583)\n    * 添加对 ``pandas`` v2.0 的支持 (#2585)\n* 变更\n    * 移除自然语言原语插件 (#2570)\n    * 更新以解决各类警告 (#2589)\n* 测试变更\n    * 通过 Airflow 在合并时运行 Looking Glass 性能测试 (#2575)\n\n感谢以下人员为本次发布做出贡献：\n@gsheni、@petejanuszewski1、@sbadithe、@thehomebrewnerd","2023-07-24T15:39:18",{"id":172,"version":173,"summary_zh":174,"released_at":175},297752,"v1.26.0","v1.26.0 2023年4月27日\n====================\n* 增强功能\n    * 引入新的单表 DFS 算法 (#2516)。此功能为**实验性**，尚未正式支持。\n    * 添加高级原语安装命令 (#2545)\n* 修复\n    * 修复 ``DaysInMonth`` 的描述 (#2547)\n* 变更\n    * 将 Dask 设为可选依赖项 (#2560)\n\n感谢以下人员对本次发布的贡献：\n@dvreed77、@gsheni、@thehomebrewnerd","2023-04-27T21:49:53",{"id":177,"version":178,"summary_zh":179,"released_at":180},297753,"v1.25.0","v1.25.0 2023年4月13日\n====================\n* 增强功能\n    * 新增 ``MaxCount``、``MedianCount``、``MaxMinDelta``、``NUniqueDays``、``NMostCommonFrequency``、\n        ``NUniqueDaysOfCalendarYear``、``NUniqueDaysOfMonth``、``NUniqueMonths``、\n        ``NUniqueWeeks``、``IsFirstWeekOfMonth`` (#2533)\n    * 新增 ``HasNoDuplicates``、``NthWeekOfMonth``、``IsMonotonicallyDecreasing``、``IsMonotonicallyIncreasing``、\n        ``IsUnique`` (#2537)\n* 变更\n    * 将 pandas 版本限制为 \u003C 2.0.0 (#2533)\n    * 将 pandas 最低版本升级至 1.5.0 (#2537)\n    * 移除 ``Correlation`` 和 ``AutoCorrelation`` 原语，因为它们可能导致数据泄露 (#2537)\n    * 移除 ``Kurtosis`` 原语对 IntegerNullable 的支持 (#2537)\n\n感谢以下人员对本次发布做出的贡献：\n@gsheni","2023-04-13T18:54:46",{"id":182,"version":183,"summary_zh":184,"released_at":185},297754,"v1.24.0","v1.24.0 
2023年3月29日\n====================\n* 增强功能\n    * 添加 ``AverageCountPerUnique``、``CountryCodeToContinent``、``FileExtension``、``FirstLastTimeDelta``、``SavgolFilter``、``CumulativeTimeSinceLastFalse``、``CumulativeTimeSinceLastTrue``、``PercentChange``、``PercentUnique`` (#2485)\n    * 添加 ``FullNameToFirstName``、``FullNameToLastName``、``FullNameToTitle``、``AutoCorrelation``、``Correlation``、``DateFirstEvent`` (#2526)\n    * 添加 ``Kurtosis``、``MinCount``、``NumFalseSinceLastTrue``、``NumPeaks``、``NumTrueSinceLastFalse``、``NumZeroCrossings`` (#2514)\n* 修复\n    * 将 github-action-check-linked-issues 锁定到 1.4.5 版本 (#2497)\n    * 支持 Woodwork 的更新，改进数值推断（将整数识别为字符串）(#2505)\n    * 更新 ``SubtractNumeric`` 原语，添加交换律类属性 (#2527)\n* 变更\n    * 将 Makefile 中的核心依赖、测试依赖和开发依赖的命令分开 (#2518)\n\n感谢以下人员对本次发布的贡献：\n@dvreed77、@gsheni、@ozzieD","2023-03-29T18:16:44",{"id":187,"version":188,"summary_zh":189,"released_at":190},297755,"v1.23.0","v1.23.0 2023年2月15日\n====================\n* 变更\n    * 将``TotalWordLength``和``UpperCaseWordCount``的返回类型改为``IntegerNullable`` (#2474)\n* 测试相关变更\n    * 添加 GitHub Actions 缓存以加速工作流 (#2475)\n    * 修复最新的依赖检查器安装命令 (#2476)\n    * 在 CI 工作流中添加对关联问题的拉取请求检查 (#2477, #2481)\n    * 从 lint 工作流中移除 make package 步骤 (#2479)\n\n感谢以下人员为本次发布做出贡献：\n@dvreed77、@gsheni、@sbadithe","2023-02-15T15:02:56",{"id":192,"version":193,"summary_zh":194,"released_at":195},297756,"v1.22.0","* 增强功能\n  * 添加 ``AbsoluteDiff``、``SameAsPrevious``、``Variance``、``Season``、``UpperCaseWordCount`` 转换原语 (#2460)\n* 修复\n  * 修复 ``NumWords`` 中连续空格的 bug (#2459)\n  * 修复与 ``holidays`` v0.19.0 的兼容性问题 (#2471)\n* 变更\n  * 在 pre-commit-config 中指定 black 和 ruff 的配置参数 (#2456)\n  * 当输入为 null 时，``NumCharacters`` 返回 null (#2463)\n* 文档变更\n  * 更新 ``release.md``，添加启动 Looking Glass 性能测试运行的说明 (#2461)\n  * 锁定 ``jupyter-client==7.4.9``，以修复文档构建失败的问题 (#2463)\n  * 解除对 jupyter-client 文档依赖的锁定 (#2468)\n* 测试变更\n  * 为 ``NumWords`` 和 ``NumCharacters`` 原语添加测试套件 (#2459, 
#2463)\n\n感谢以下人员为本次发布做出贡献：\n@gsheni、@rwedge、@sbadithe、@thehomebrewnerd","2023-01-31T22:25:17",{"id":197,"version":198,"summary_zh":199,"released_at":200},297757,"v1.21.0","**Jan 18, 2023**\r\n\r\n* Enhancements\r\n  * Add `get_recommended_primitives` function to featuretools (#2398)\r\n* Changes\r\n  * Update build_docs workflow to only run for Python 3.8 and Python 3.10 (#2447)\r\n* Documentation Changes\r\n  * Minor fix to release notes (#2444)\r\n* Testing Changes\r\n  * Add test that checks for Natural Language primitives timing out against edge-case input (#2429)\r\n  * Fix test compatibility with composeml 0.10 (#2439)\r\n  * Minimum dependency unit test jobs do not abort if one job fails (#2437)\r\n  * Run Looking Glass performance tests on merge to main (#2440, #2441)\r\n  * Add ruff for linting and replace isort\u002Fflake8 (#2448)\r\n\r\nThanks to the following people for contributing to this release:\r\n    @gsheni, @ozzieD, @rwedge, @sbadithe, @thehomebrewnerd ","2023-01-18T22:35:47",{"id":202,"version":203,"summary_zh":204,"released_at":205},297758,"v1.20.0","**Jan 5, 2023**\r\n\r\n* Enhancements\r\n    * Add ``TimeSinceLastFalse``, ``TimeSinceLastMax``, ``TimeSinceLastMin``, and ``TimeSinceLastTrue`` primitives (#2418)\r\n    * Add ``MaxConsecutiveFalse``, ``MaxConsecutiveNegatives``, ``MaxConsecutivePositives``, ``MaxConsecutiveTrue``, ``MaxConsecutiveZeros``, ``NumConsecutiveGreaterMean``, ``NumConsecutiveLessMean`` (#2420)\r\n* Fixes\r\n    * Fix typo in ``_handle_binary_comparison`` function name and update ``set_feature_names`` docstring (#2388)\r\n    * Only allow Datetime time index as input to ``RateOfChange`` primitive (#2408)\r\n    * Prevent catastrophic backtracking in regex for ``NumberOfWordsInQuotes`` (#2413)\r\n    * Fix to eliminate fragmentation ``PerformanceWarning`` in ``feature_set_calculator.py`` (#2424)\r\n    * Fix serialization of ``NumberOfCommonWords`` feature with custom word_set (#2432)\r\n    * Improve edge case handling 
in NaturalLanguage primitives by standardizing delimiter regex (#2423)\r\n    * Remove support for ``Datetime`` and ``Ordinal`` inputs in several primitives to prevent creation of Features that cannot be calculated (#2434)\r\n* Changes\r\n    * Refactor ``_all_direct_and_same_path`` by deleting call to ``_features_have_same_path`` (#2400)\r\n    * Refactor ``_build_transform_features`` by iterating over ``input_features`` once (#2400)\r\n    * Iterate only once over ``ignore_columns`` in ``DeepFeatureSynthesis`` init (#2397)\r\n    * Resolve empty Pandas series warnings (#2403)\r\n    * Initialize Woodwork with ``init_with_partial_schama`` instead of ``init`` in ``EntitySet.add_last_time_indexes`` (#2409)\r\n    * Updates for compatibility with numpy 1.24.0 (#2414)\r\n    * The ``delimiter_regex`` parameter for ``TotalWordLength`` has been renamed to ``do_not_count`` (#2423)\r\n* Documentation Changes\r\n    *  Remove unused sections from 1.19.0 notes (#2396)\r\n\r\nThanks to the following people for contributing to this release:\r\n@gsheni, @rwedge, @sbadithe, @thehomebrewnerd\r\n\r\n**Breaking Changes**\r\n* The ``delimiter_regex`` parameter for ``TotalWordLength`` has been renamed to ``do_not_count``.\r\n  Old saved features that had a non-default value for the parameter will no longer load.\r\n* Support for ``Datetime`` and ``Ordinal`` inputs has been removed from the ``LessThanScalar``,\r\n  ``GreaterThanScalar``, ``LessThanEqualToScalar`` and ``GreaterThanEqualToScalar`` primitives.","2023-01-05T17:45:46",{"id":207,"version":208,"summary_zh":209,"released_at":210},297759,"v1.19.0","**v1.19.0 Dec 9, 2022**\r\n\r\n* Enhancements\r\n    * Add `OneDigitPostalCode` and `TwoDigitPostalCode` primitives (#2365)\r\n    * Add `ExpandingCount`, `ExpandingMin`, `ExpandingMean`, `ExpandingMax`, `ExpandingSTD` and `ExpandingTrend` primitives (#2343)\r\n* Fixes\r\n    * Fix DeepFeatureSynthesis to consider the `base_of_exclude` family of attributes when creating transform 
features(#2380)\r\n    * Fix bug with negative version numbers in `test_version` (#2389)\r\n    * Fix bug in `MultiplyNumericBoolean` primitive that can cause an error with certain input dtype combinations (#2393)\r\n* Testing Changes\r\n    * Fix version comparison in `test_holiday_out_of_range` (#2382)\r\n\r\nThanks to the following people for contributing to this release:\r\n@sbadithe, @thehomebrewnerd","2022-12-09T16:57:29",{"id":212,"version":213,"summary_zh":214,"released_at":215},297760,"v1.18.0","v1.18.0 Nov 15, 2022\r\n====================\r\n* Enhancements\r\n    * Add ``RollingOutlierCount`` primitive (#2129)\r\n    * Add ``RateOfChange`` primitive (#2359)\r\n* Fixes\r\n    * Sets ``uses_full_dataframe`` for ``Rolling*`` and ``Exponential*`` primitives (#2354)\r\n    * Updates for compatibility with upcoming Woodwork release 0.21.0 (#2363)\r\n    * Updates demo dataset location to use new links (#2366)\r\n    * Fix ``test_holiday_out_of_range`` after ``holidays`` release 0.17 (#2373)\r\n* Changes\r\n    * Remove click and CLI functions (``list-primitives``, ``info``) (#2353, #2358)\r\n* Documentation Changes\r\n    * Build docs in parallel with Sphinx (#2351)\r\n    * Use non-editable install to allow local docs build (#2367)\r\n    * Remove primitives.featurelabs.com website from documentation (#2369)\r\n* Testing Changes\r\n    * Replace use of pytest's tmpdir fixture with tmp_path (#2344)\r\n\r\nThanks to the following people for contributing to this release:\r\n@gsheni, @rwedge, @sbadithe, @tamargrey, @thehomebrewnerd","2022-11-15T21:22:06",{"id":217,"version":218,"summary_zh":219,"released_at":220},297761,"v1.17.0","v1.17.0 Oct 31, 2022\r\n===============\r\n* Enhancements\r\n    * Add featuretools-sklearn-transformer as an extra installation option (#2335)\r\n    * Add CountAboveMean, CountBelowMean, CountGreaterThan, CountInsideNthSTD, CountInsideRange, CountLessThan, CountOutsideNthSTD, CountOutsideRange (#2336)\r\n* Changes\r\n    * Restructure 
primitives directory to use individual primitives files (#2331)\r\n    * Restrict 2022.10.1 for dask and distributed (#2347)\r\n* Documentation Changes\r\n    * Add Featuretools-SQL to Install page on documentation (#2337)\r\n    * Fixes broken link in Featuretools documentation (#2339)\r\n\r\n    Thanks to the following people for contributing to this release:\r\n    @gsheni, @rwedge, @sbadithe, @thehomebrewnerd","2022-10-31T17:26:38",{"id":222,"version":223,"summary_zh":224,"released_at":225},297762,"v1.16.0"," * Enhancements\r\n     * Add ExponentialWeighted primitives and DateToTimeZone primitive (#2318)\r\n     * Add 14 natural language primitives from ``nlp_primitives`` library (#2328) \r\n * Documentation Changes\r\n     * Fix typos in ``aggregation_primitive_base.py`` and ``features_deserializer.py`` (#2317) (#2324)\r\n     * Update SQL integration documentation to reflect Snowflake compatibility (#2313)\r\n * Testing Changes\r\n     * Add Windows install test #2330 \r\n    \r\n Thanks to the following people for contributing to this release: \r\n @gsheni, @sbadithe, @thehomebrewnerd ","2022-10-24T18:06:06",{"id":227,"version":228,"summary_zh":229,"released_at":230},297763,"v1.15.0","v1.15.0 Oct 6, 2022\r\n===================\r\n* Enhancements\r\n    * Add ``series_library`` attribute to ``EntitySet`` dictionary (#2257)\r\n    * Leverage ``Library`` Enum inheriting from ``str`` (#2275)\r\n* Changes\r\n    * Change default gap for Rolling* primitives from 0 to 1 to prevent accidental leakage (#2282)\r\n    * Updates for pandas 1.5.0 compatibility (#2290, #2291, #2308)\r\n    * Exclude documentation files from release workflow (#2295)\r\n    * Bump requirements for optional pyspark dependency (#2299)\r\n    * Bump ``scipy`` and ``woodwork[spark]`` dependencies (#2306)\r\n* Documentation Changes\r\n    * Add documentation describing how to use ``featuretools_sql`` with ``featuretools`` (#2262)\r\n    * Remove ``featuretools_sql`` as a docs requirement 
(#2302)\r\n    * Fix typo in ``DiffDatetime`` doctest (#2314)\r\n    * Fix typo in ``EntitySet`` documentation (#2315)\r\n* Testing Changes\r\n    * Remove graphviz version restrictions in Windows CI tests (#2285)\r\n    * Run CI tests with ``pytest -n auto`` (#2298, #2310)\r\n\r\nThanks to the following people for contributing to this release:\r\n@gsheni, @rwedge, @sbadithe, @thehomebrewnerd\r\n\r\nBreaking Changes\r\n----------------\r\n* The ``EntitySet`` schema has been updated to include a ``series_library`` attribute\r\n* The default behavior of the ``Rolling*`` primitives has changed in this release. If this primitive was used without\r\n  defining the ``gap`` value, the feature values returned with this release will be different than feature values from\r\n  prior releases.","2022-10-06T18:50:49",{"id":232,"version":233,"summary_zh":234,"released_at":235},297764,"v1.15.0.dev0","Developmental release for testing purposes","2022-10-05T22:08:23",{"id":237,"version":238,"summary_zh":239,"released_at":240},297765,"v1.14.0","v1.14.0 Sep 1, 2022\r\n===================\r\n* Enhancements\r\n    * Replace ``NumericLag`` with ``Lag`` primitive (#2252)\r\n    * Refactor build_features to speed up long running DFS calls by 50% (#2224)\r\n* Fixes\r\n    * Fix compatibility issues with holidays 0.15 (#2254)\r\n* Changes\r\n    * Update release notes to make clear conda release portion (#2249)\r\n    * Use pyproject.toml only (move away from setup.cfg) (#2260, #2263, #2265)\r\n    * Add entry point instructions for pyproject.toml project (#2272)\r\n* Documentation Changes\r\n    * Fix to remove warning from Using Spark EntitySets Guide (#2258)\r\n* Testing Changes\r\n    * Add tests\u002Fprofiling\u002Fdfs_profile.py (#2224)\r\n    * Add workflow to test featuretools without test dependencies (#2274)\r\n\r\nThanks to the following people for contributing to this release:\r\n@cp2boston, @gsheni, @ozzieD, @stefaniesmith, 
@thehomebrewnerd","2022-09-01T18:35:02",{"id":242,"version":243,"summary_zh":244,"released_at":245},297766,"v1.13.0","v1.13.0 Aug 18, 2022\r\n===================\r\n* Fixes\r\n    * Allow boolean columns to be included in remove_highly_correlated_features (#2231)\r\n* Changes\r\n    * Refactor schema version checking to use `packaging` method (#2230)\r\n    * Extract duplicated logic for Rolling primitives into a general utility function (#2218)\r\n    * Set pandas version to >=1.4.0 (#2246)\r\n    * Remove workaround in `roll_series_with_gap` caused by pandas version \u003C 1.4.0 (#2246)\r\n* Documentation Changes\r\n    * Add line breaks between sections of IsFederalHoliday primitive docstring (#2235)\r\n* Testing Changes\r\n    * Update create feedstock PR forked repo to use (#2223, #2237)\r\n    * Update development requirements and use latest for documentation (#2225)\r\n\r\nThanks to the following people for contributing to this release:\r\n@gsheni, @ozzieD, @sbadithe, @tamargrey ","2022-08-18T18:44:10"]
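补充说明：上文《基本使用》的输出示例中包含 `COUNT(transactions)`、`SUM(transactions.amount)`、`MODE(sessions.device)` 等特征列。下面用一个纯 pandas 的简化示意（其中的玩具数据为本文虚构，仅作说明，并非 Featuretools 内部实现）手工复现这几类跨表聚合，帮助理解 DFS 自动完成的计算：

```python
import pandas as pd

# 虚构的玩具数据，模拟 mock customer 数据集中的 sessions / transactions 两张表
sessions = pd.DataFrame({
    "session_id": [1, 2, 3],
    "customer_id": [1, 1, 2],
    "device": ["desktop", "desktop", "mobile"],
})
transactions = pd.DataFrame({
    "transaction_id": [10, 11, 12, 13],
    "session_id": [1, 1, 2, 3],
    "amount": [25.0, 40.0, 10.0, 55.0],
})

# 通过 session 把 customer_id 关联到每笔交易
tx = transactions.merge(sessions[["session_id", "customer_id"]], on="session_id")

# 手工复现 COUNT(transactions)、SUM(transactions.amount)、MODE(sessions.device)
features = pd.DataFrame({
    "COUNT(transactions)": tx.groupby("customer_id")["transaction_id"].count(),
    "SUM(transactions.amount)": tx.groupby("customer_id")["amount"].sum(),
    "MODE(sessions.device)": sessions.groupby("customer_id")["device"]
        .agg(lambda s: s.mode().iloc[0]),
})
print(features)
```

实际使用时无需手写这些聚合：`ft.dfs` 会依据实体关系自动枚举、计算并命名这类特征，上述代码只是展示其语义。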
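针对上文 FAQ 中"滑动时间窗口 / 多时间快照"的思路（同一套特征定义在不同截止时间、不同训练窗口下分别计算，再合并成多切片特征矩阵），下面给出一个仅用 pandas 的概念性示意。其中的数据、时间戳与窗口长度均为虚构假设，并非 Featuretools 的实际实现：

```python
import pandas as pd

# 虚构的交易流水（时间与金额均为示例假设）
txns = pd.DataFrame({
    "customer_id": [1, 1, 1, 2, 2],
    "time": pd.to_datetime(
        ["2024-01-01", "2024-01-05", "2024-01-20", "2024-01-03", "2024-01-18"]
    ),
    "amount": [10.0, 20.0, 30.0, 5.0, 15.0],
})

def snapshot(cutoff, window):
    """计算 (cutoff - window, cutoff] 训练窗口内每个客户的消费总额。"""
    mask = (txns["time"] > cutoff - window) & (txns["time"] <= cutoff)
    out = txns[mask].groupby("customer_id")["amount"].sum().rename("SUM(amount)").to_frame()
    out["cutoff_time"] = cutoff  # 标记该行特征对应的截止时间
    return out

# 两个时间切片，纵向拼接成类似多快照特征矩阵的结果
cutoffs = [pd.Timestamp("2024-01-10"), pd.Timestamp("2024-01-25")]
matrix = pd.concat([snapshot(c, pd.Timedelta(days=14)) for c in cutoffs])
print(matrix)
```

在 Featuretools 中，与之对应的做法正如该 FAQ 所述：先以 `features_only=True` 调用 `ft.dfs` 得到特征定义列表，再用 `ft.calculate_feature_matrix()` 配合不同的 `cutoff_time` 与 `training_window` 参数完成各时间切片的计算，最后合并结果。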