[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-jrzaurin--pytorch-widedeep":3,"tool-jrzaurin--pytorch-widedeep":61},[4,18,26,36,44,53],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":17},4358,"openclaw","openclaw\u002Fopenclaw","OpenClaw 是一款专为个人打造的本地化 AI 助手，旨在让你在自己的设备上拥有完全可控的智能伙伴。它打破了传统 AI 助手局限于特定网页或应用的束缚，能够直接接入你日常使用的各类通讯渠道，包括微信、WhatsApp、Telegram、Discord、iMessage 等数十种平台。无论你在哪个聊天软件中发送消息，OpenClaw 都能即时响应，甚至支持在 macOS、iOS 和 Android 设备上进行语音交互，并提供实时的画布渲染功能供你操控。\n\n这款工具主要解决了用户对数据隐私、响应速度以及“始终在线”体验的需求。通过将 AI 部署在本地，用户无需依赖云端服务即可享受快速、私密的智能辅助，真正实现了“你的数据，你做主”。其独特的技术亮点在于强大的网关架构，将控制平面与核心助手分离，确保跨平台通信的流畅性与扩展性。\n\nOpenClaw 非常适合希望构建个性化工作流的技术爱好者、开发者，以及注重隐私保护且不愿被单一生态绑定的普通用户。只要具备基础的终端操作能力（支持 macOS、Linux 及 Windows WSL2），即可通过简单的命令行引导完成部署。如果你渴望拥有一个懂你",349277,3,"2026-04-06T06:32:30",[13,14,15,16],"Agent","开发框架","图像","数据工具","ready",{"id":19,"name":20,"github_repo":21,"description_zh":22,"stars":23,"difficulty_score":10,"last_commit_at":24,"category_tags":25,"status":17},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,"2026-04-05T11:01:52",[14,15,13],{"id":27,"name":28,"github_repo":29,"description_zh":30,"stars":31,"difficulty_score":32,"last_commit_at":33,"category_tags":34,"status":17},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 
代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",157379,2,"2026-04-15T23:32:42",[14,13,35],"语言模型",{"id":37,"name":38,"github_repo":39,"description_zh":40,"stars":41,"difficulty_score":32,"last_commit_at":42,"category_tags":43,"status":17},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",108322,"2026-04-10T11:39:34",[14,15,13],{"id":45,"name":46,"github_repo":47,"description_zh":48,"stars":49,"difficulty_score":32,"last_commit_at":50,"category_tags":51,"status":17},6121,"gemini-cli","google-gemini\u002Fgemini-cli","gemini-cli 是一款由谷歌推出的开源 AI 命令行工具，它将强大的 Gemini 大模型能力直接集成到用户的终端环境中。对于习惯在命令行工作的开发者而言，它提供了一条从输入提示词到获取模型响应的最短路径，无需切换窗口即可享受智能辅助。\n\n这款工具主要解决了开发过程中频繁上下文切换的痛点，让用户能在熟悉的终端界面内直接完成代码理解、生成、调试以及自动化运维任务。无论是查询大型代码库、根据草图生成应用，还是执行复杂的 Git 操作，gemini-cli 都能通过自然语言指令高效处理。\n\n它特别适合广大软件工程师、DevOps 人员及技术研究人员使用。其核心亮点包括支持高达 100 万 token 的超长上下文窗口，具备出色的逻辑推理能力；内置 Google 搜索、文件操作及 Shell 命令执行等实用工具；更独特的是，它支持 MCP（模型上下文协议），允许用户灵活扩展自定义集成，连接如图像生成等外部能力。此外，个人谷歌账号即可享受免费的额度支持，且项目基于 Apache 2.0 
协议完全开源，是提升终端工作效率的理想助手。",100752,"2026-04-10T01:20:03",[52,13,15,14],"插件",{"id":54,"name":55,"github_repo":56,"description_zh":57,"stars":58,"difficulty_score":32,"last_commit_at":59,"category_tags":60,"status":17},4721,"markitdown","microsoft\u002Fmarkitdown","MarkItDown 是一款由微软 AutoGen 团队打造的轻量级 Python 工具，专为将各类文件高效转换为 Markdown 格式而设计。它支持 PDF、Word、Excel、PPT、图片（含 OCR）、音频（含语音转录）、HTML 乃至 YouTube 链接等多种格式的解析，能够精准提取文档中的标题、列表、表格和链接等关键结构信息。\n\n在人工智能应用日益普及的今天，大语言模型（LLM）虽擅长处理文本，却难以直接读取复杂的二进制办公文档。MarkItDown 恰好解决了这一痛点，它将非结构化或半结构化的文件转化为模型“原生理解”且 Token 效率极高的 Markdown 格式，成为连接本地文件与 AI 分析 pipeline 的理想桥梁。此外，它还提供了 MCP（模型上下文协议）服务器，可无缝集成到 Claude Desktop 等 LLM 应用中。\n\n这款工具特别适合开发者、数据科学家及 AI 研究人员使用，尤其是那些需要构建文档检索增强生成（RAG）系统、进行批量文本分析或希望让 AI 助手直接“阅读”本地文件的用户。虽然生成的内容也具备一定可读性，但其核心优势在于为机器",93400,"2026-04-06T19:52:38",[52,14],{"id":62,"github_repo":63,"name":64,"description_en":65,"description_zh":66,"ai_summary_zh":66,"readme_en":67,"readme_zh":68,"quickstart_zh":69,"use_case_zh":70,"hero_image_url":71,"owner_login":72,"owner_name":73,"owner_avatar_url":74,"owner_bio":75,"owner_company":75,"owner_location":76,"owner_email":77,"owner_twitter":72,"owner_website":75,"owner_url":78,"languages":79,"stars":92,"forks":93,"last_commit_at":94,"license":95,"difficulty_score":32,"env_os":96,"env_gpu":96,"env_ram":96,"env_deps":97,"category_tags":105,"github_topics":109,"view_count":32,"oss_zip_url":75,"oss_zip_packed_at":75,"status":17,"created_at":122,"updated_at":123,"faqs":124,"releases":155},7934,"jrzaurin\u002Fpytorch-widedeep","pytorch-widedeep","A flexible package for multimodal-deep-learning to combine tabular data with text and images using Wide and Deep models in Pytorch","pytorch-widedeep 是一个基于 PyTorch 构建的灵活深度学习包，专为处理多模态数据而设计。它核心解决了传统机器学习模型难以同时高效融合表格数据、文本和图像的痛点，让用户能够轻松将结构化数据与非结构化数据结合进行联合建模。\n\n该工具基于谷歌经典的\"Wide & Deep\"算法并进行了多模态扩展：利用\"Wide\"部分记忆特征交互，通过\"Deep\"部分学习复杂非线性关系，从而显著提升预测精度。其独特亮点在于模块化架构，不仅支持自定义表格数据处理（deeptabular），还内置了推荐系统模块（rec），并能无缝集成预训练的文本和图像编码器，极大降低了多模态深度学习的实现门槛。\n\npytorch-widedeep 
非常适合数据科学家、AI 研究人员以及需要处理复杂混合数据的深度学习开发者使用。无论是构建高精度的推荐系统，还是解决涉及多维信息的分类与回归问题，它都能提供简洁高效的代码接口，帮助用户快速验证想法并部署模型，是探索表格数据与多模态融合领域的得力助手。","\n\u003Cp align=\"center\">\n  \u003Cimg width=\"300\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fjrzaurin_pytorch-widedeep_readme_5bd5a11d58c3.png\">\n\u003C\u002Fp>\n\n[![PyPI version](https:\u002F\u002Fbadge.fury.io\u002Fpy\u002Fpytorch-widedeep.svg)](https:\u002F\u002Fpypi.org\u002Fproject\u002Fpytorch-widedeep\u002F)\n[![Python 3.9 3.10 3.11 3.12](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fpython-3.9%20%7C%203.10%20%7C%203.11%20%7C%203.12-blue.svg)](https:\u002F\u002Fpypi.org\u002Fproject\u002Fpytorch-widedeep\u002F)\n[![Build Status](https:\u002F\u002Fgithub.com\u002Fjrzaurin\u002Fpytorch-widedeep\u002Factions\u002Fworkflows\u002Fbuild.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fjrzaurin\u002Fpytorch-widedeep\u002Factions)\n[![Documentation Status](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fjrzaurin_pytorch-widedeep_readme_13d664e1afd7.png)](https:\u002F\u002Fpytorch-widedeep.readthedocs.io\u002Fen\u002Flatest\u002F?badge=latest)\n[![codecov](https:\u002F\u002Fcodecov.io\u002Fgh\u002Fjrzaurin\u002Fpytorch-widedeep\u002Fbranch\u002Fmaster\u002Fgraph\u002Fbadge.svg)](https:\u002F\u002Fcodecov.io\u002Fgh\u002Fjrzaurin\u002Fpytorch-widedeep)\n[![Code style: black](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fcode%20style-black-000000.svg)](https:\u002F\u002Fgithub.com\u002Fpsf\u002Fblack)\n[![Maintenance](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FMaintained%3F-yes-green.svg)](https:\u002F\u002Fgithub.com\u002Fjrzaurin\u002Fpytorch-widedeep\u002Fgraphs\u002Fcommit-activity)\n[![contributions 
welcome](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fcontributions-welcome-brightgreen.svg?style=flat)](https:\u002F\u002Fgithub.com\u002Fjrzaurin\u002Fpytorch-widedeep\u002Fissues)\n[![Slack](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fslack-chat-green.svg?logo=slack)](https:\u002F\u002Fjoin.slack.com\u002Ft\u002Fpytorch-widedeep\u002Fshared_invite\u002Fzt-soss7stf-iXpVuLeKZz8lGTnxxtHtTw)\n[![DOI](https:\u002F\u002Fjoss.theoj.org\u002Fpapers\u002F10.21105\u002Fjoss.05027\u002Fstatus.svg)](https:\u002F\u002Fdoi.org\u002F10.21105\u002Fjoss.05027)\n\n# pytorch-widedeep\n\nA flexible package for multimodal-deep-learning to combine tabular data with\ntext and images using Wide and Deep models in Pytorch\n\n**Documentation:** [https:\u002F\u002Fpytorch-widedeep.readthedocs.io](https:\u002F\u002Fpytorch-widedeep.readthedocs.io\u002Fen\u002Flatest\u002Findex.html)\n\n**Companion posts and tutorials:** [infinitoml](https:\u002F\u002Fjrzaurin.github.io\u002Finfinitoml\u002F)\n\n**Experiments and comparison with `LightGBM`**: [TabularDL vs LightGBM](https:\u002F\u002Fgithub.com\u002Fjrzaurin\u002Ftabulardl-benchmark)\n\n**Slack**: if you want to contribute or just want to chat with us, join [slack](https:\u002F\u002Fjoin.slack.com\u002Ft\u002Fpytorch-widedeep\u002Fshared_invite\u002Fzt-soss7stf-iXpVuLeKZz8lGTnxxtHtTw)\n\nThe content of this document is organized as follows:\n\n- [pytorch-widedeep](#pytorch-widedeep)\n  - [Introduction](#introduction)\n  - [Architectures](#architectures)\n  - [The ``deeptabular`` component](#the-deeptabular-component)\n  - [The ``rec`` module](#the-rec-module)\n  - [Text and Images](#text-and-images)\n  - [Installation](#installation)\n    - [Developer Install](#developer-install)\n  - [Quick start](#quick-start)\n  - [Testing](#testing)\n  - [How to Contribute](#how-to-contribute)\n  - [Acknowledgments](#acknowledgments)\n  - [License](#license)\n  - [Cite](#cite)\n    - [BibTex](#bibtex)\n    - [APA](#apa)\n\n### 
Introduction\n\n``pytorch-widedeep`` is based on Google's [Wide and Deep Algorithm](https:\u002F\u002Farxiv.org\u002Fabs\u002F1606.07792),\nadjusted for multi-modal datasets.\n\nIn general terms, `pytorch-widedeep` is a package to use deep learning with\ntabular data. In particular, it is intended to facilitate the combination of\ntext and images with corresponding tabular data using wide and deep models.\nWith that in mind, there are a number of architectures that can be implemented\nwith the library. The main components of those architectures are shown in the\nFigure below:\n\n\u003Cp align=\"center\">\n  \u003Cimg width=\"750\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fjrzaurin_pytorch-widedeep_readme_541389db690f.png\">\n\u003C\u002Fp>\n\nIn math terms, and following the notation in the\n[paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F1606.07792), the expression for the architecture\nwithout a ``deephead`` component can be formulated as:\n\n\u003Cp align=\"center\">\n  \u003Cimg width=\"500\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fjrzaurin_pytorch-widedeep_readme_d1e39750a145.png\">\n\u003C\u002Fp>\n\nWhere &sigma; is the sigmoid function, *'W'* are the weight matrices applied to the wide model and to the final\nactivations of the deep models, *'a'* are these final activations,\n&phi;(x) are the cross product transformations of the original features *'x'*,\nand *'b'* is the bias term.\nIn case you are wondering what *\"cross product transformations\"* are, here is\na quote taken directly from the paper: *\"For binary features, a cross-product\ntransformation (e.g., “AND(gender=female, language=en)”) is 1 if and only if\nthe constituent features (“gender=female” and “language=en”) are all 1, and 0\notherwise\".*\n\nIt is perfectly possible to use custom models (and not necessarily those in\nthe library) as long as the custom models have a property called\n``output_dim`` with the size of the last layer of 
activations, so that\n``WideDeep`` can be constructed. Examples on how to use custom components can\nbe found in the Examples folder and the section below.\n\n### Architectures\n\nThe `pytorch-widedeep` library offers a number of different architectures. In\nthis section we will show some of them in their simplest form (i.e. with\ndefault param values in most cases) with their corresponding code snippets.\nNote that **all** the snippets below should run locally. For a more detailed\nexplanation of the different components and their parameters, please refer to\nthe documentation.\n\nFor the examples below we will be using a toy dataset generated as follows:\n\n```python\nimport os\nimport random\n\nimport numpy as np\nimport pandas as pd\nfrom PIL import Image\nfrom faker import Faker\n\n\ndef create_and_save_random_image(image_number, size=(32, 32)):\n\n    if not os.path.exists(\"images\"):\n        os.makedirs(\"images\")\n\n    array = np.random.randint(0, 256, (size[0], size[1], 3), dtype=np.uint8)\n\n    image = Image.fromarray(array)\n\n    image_name = f\"image_{image_number}.png\"\n    image.save(os.path.join(\"images\", image_name))\n\n    return image_name\n\n\nfake = Faker()\n\ncities = [\"New York\", \"Los Angeles\", \"Chicago\", \"Houston\"]\nnames = [\"Alice\", \"Bob\", \"Charlie\", \"David\", \"Eva\"]\n\ndata = {\n    \"city\": [random.choice(cities) for _ in range(100)],\n    \"name\": [random.choice(names) for _ in range(100)],\n    \"age\": [random.uniform(18, 70) for _ in range(100)],\n    \"height\": [random.uniform(150, 200) for _ in range(100)],\n    \"sentence\": [fake.sentence() for _ in range(100)],\n    \"other_sentence\": [fake.sentence() for _ in range(100)],\n    \"image_name\": [create_and_save_random_image(i) for i in range(100)],\n    \"target\": [random.choice([0, 1]) for _ in range(100)],\n}\n\ndf = pd.DataFrame(data)\n```\n\nThis will create a 100-row dataframe and a directory in your local folder, called\n`images`, with 100 random 
images (or images with just noise).\n\nPerhaps the simplest architecture would be just one component, `wide`,\n`deeptabular`, `deeptext` or `deepimage` on their own, which is also\npossible, but let's start the examples with a standard Wide and Deep\narchitecture. From there, how to build a model comprised only of one\ncomponent will be straightforward.\n\nNote that the examples shown below would be almost identical using any of the\nmodels available in the library. For example, `TabMlp` can be replaced by\n`TabResnet`, `TabNet`, `TabTransformer`, etc. Similarly, `BasicRNN` can be\nreplaced by `AttentiveRNN`, `StackedAttentiveRNN`, or `HFModel` with\ntheir corresponding parameters and preprocessor in the case of the Hugging\nFace models.\n\n**1. Wide and Tabular component (aka deeptabular)**\n\n\u003Cp align=\"center\">\n  \u003Cimg width=\"400\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fjrzaurin_pytorch-widedeep_readme_ac763715f38f.png\">\n\u003C\u002Fp>\n\n```python\nfrom pytorch_widedeep.preprocessing import TabPreprocessor, WidePreprocessor\nfrom pytorch_widedeep.models import Wide, TabMlp, WideDeep\nfrom pytorch_widedeep.training import Trainer\n\n# Wide\nwide_cols = [\"city\"]\ncrossed_cols = [(\"city\", \"name\")]\nwide_preprocessor = WidePreprocessor(wide_cols=wide_cols, crossed_cols=crossed_cols)\nX_wide = wide_preprocessor.fit_transform(df)\nwide = Wide(input_dim=np.unique(X_wide).shape[0])\n\n# Tabular\ntab_preprocessor = TabPreprocessor(\n    embed_cols=[\"city\", \"name\"], continuous_cols=[\"age\", \"height\"]\n)\nX_tab = tab_preprocessor.fit_transform(df)\ntab_mlp = TabMlp(\n    column_idx=tab_preprocessor.column_idx,\n    cat_embed_input=tab_preprocessor.cat_embed_input,\n    continuous_cols=tab_preprocessor.continuous_cols,\n    mlp_hidden_dims=[64, 32],\n)\n\n# WideDeep\nmodel = WideDeep(wide=wide, deeptabular=tab_mlp)\n\n# Train\ntrainer = Trainer(model, objective=\"binary\")\n\ntrainer.fit(\n    X_wide=X_wide,\n    
X_tab=X_tab,\n    target=df[\"target\"].values,\n    n_epochs=1,\n    batch_size=32,\n)\n```\n\n**2. Tabular and Text data**\n\n\u003Cp align=\"center\">\n  \u003Cimg width=\"400\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fjrzaurin_pytorch-widedeep_readme_14c286db8585.png\">\n\u003C\u002Fp>\n\n```python\nfrom pytorch_widedeep.preprocessing import TabPreprocessor, TextPreprocessor\nfrom pytorch_widedeep.models import TabMlp, BasicRNN, WideDeep\nfrom pytorch_widedeep.training import Trainer\n\n# Tabular\ntab_preprocessor = TabPreprocessor(\n    embed_cols=[\"city\", \"name\"], continuous_cols=[\"age\", \"height\"]\n)\nX_tab = tab_preprocessor.fit_transform(df)\ntab_mlp = TabMlp(\n    column_idx=tab_preprocessor.column_idx,\n    cat_embed_input=tab_preprocessor.cat_embed_input,\n    continuous_cols=tab_preprocessor.continuous_cols,\n    mlp_hidden_dims=[64, 32],\n)\n\n# Text\ntext_preprocessor = TextPreprocessor(\n    text_col=\"sentence\", maxlen=20, max_vocab=100, n_cpus=1\n)\nX_text = text_preprocessor.fit_transform(df)\nrnn = BasicRNN(\n    vocab_size=len(text_preprocessor.vocab.itos),\n    embed_dim=16,\n    hidden_dim=8,\n    n_layers=1,\n)\n\n# WideDeep\nmodel = WideDeep(deeptabular=tab_mlp, deeptext=rnn)\n\n# Train\ntrainer = Trainer(model, objective=\"binary\")\n\ntrainer.fit(\n    X_tab=X_tab,\n    X_text=X_text,\n    target=df[\"target\"].values,\n    n_epochs=1,\n    batch_size=32,\n)\n```\n\n**3. 
Tabular and text with a FC head on top via the `head_hidden_dims` param\n  in `WideDeep`**\n\n\u003Cp align=\"center\">\n  \u003Cimg width=\"400\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fjrzaurin_pytorch-widedeep_readme_88db8d511c8f.png\">\n\u003C\u002Fp>\n\n```python\nfrom pytorch_widedeep.preprocessing import TabPreprocessor, TextPreprocessor\nfrom pytorch_widedeep.models import TabMlp, BasicRNN, WideDeep\nfrom pytorch_widedeep.training import Trainer\n\n# Tabular\ntab_preprocessor = TabPreprocessor(\n    embed_cols=[\"city\", \"name\"], continuous_cols=[\"age\", \"height\"]\n)\nX_tab = tab_preprocessor.fit_transform(df)\ntab_mlp = TabMlp(\n    column_idx=tab_preprocessor.column_idx,\n    cat_embed_input=tab_preprocessor.cat_embed_input,\n    continuous_cols=tab_preprocessor.continuous_cols,\n    mlp_hidden_dims=[64, 32],\n)\n\n# Text\ntext_preprocessor = TextPreprocessor(\n    text_col=\"sentence\", maxlen=20, max_vocab=100, n_cpus=1\n)\nX_text = text_preprocessor.fit_transform(df)\nrnn = BasicRNN(\n    vocab_size=len(text_preprocessor.vocab.itos),\n    embed_dim=16,\n    hidden_dim=8,\n    n_layers=1,\n)\n\n# WideDeep\nmodel = WideDeep(deeptabular=tab_mlp, deeptext=rnn, head_hidden_dims=[32, 16])\n\n# Train\ntrainer = Trainer(model, objective=\"binary\")\n\ntrainer.fit(\n    X_tab=X_tab,\n    X_text=X_text,\n    target=df[\"target\"].values,\n    n_epochs=1,\n    batch_size=32,\n)\n```\n\n**4. 
Tabular and multiple text columns that are passed directly to\n  `WideDeep`**\n\n\u003Cp align=\"center\">\n  \u003Cimg width=\"500\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fjrzaurin_pytorch-widedeep_readme_e0b1d5d2baba.png\">\n\u003C\u002Fp>\n\n```python\nfrom pytorch_widedeep.preprocessing import TabPreprocessor, TextPreprocessor\nfrom pytorch_widedeep.models import TabMlp, BasicRNN, WideDeep\nfrom pytorch_widedeep.training import Trainer\n\n\n# Tabular\ntab_preprocessor = TabPreprocessor(\n    embed_cols=[\"city\", \"name\"], continuous_cols=[\"age\", \"height\"]\n)\nX_tab = tab_preprocessor.fit_transform(df)\ntab_mlp = TabMlp(\n    column_idx=tab_preprocessor.column_idx,\n    cat_embed_input=tab_preprocessor.cat_embed_input,\n    continuous_cols=tab_preprocessor.continuous_cols,\n    mlp_hidden_dims=[64, 32],\n)\n\n# Text\ntext_preprocessor_1 = TextPreprocessor(\n    text_col=\"sentence\", maxlen=20, max_vocab=100, n_cpus=1\n)\nX_text_1 = text_preprocessor_1.fit_transform(df)\ntext_preprocessor_2 = TextPreprocessor(\n    text_col=\"other_sentence\", maxlen=20, max_vocab=100, n_cpus=1\n)\nX_text_2 = text_preprocessor_2.fit_transform(df)\nrnn_1 = BasicRNN(\n    vocab_size=len(text_preprocessor_1.vocab.itos),\n    embed_dim=16,\n    hidden_dim=8,\n    n_layers=1,\n)\nrnn_2 = BasicRNN(\n    vocab_size=len(text_preprocessor_2.vocab.itos),\n    embed_dim=16,\n    hidden_dim=8,\n    n_layers=1,\n)\n\n# WideDeep\nmodel = WideDeep(deeptabular=tab_mlp, deeptext=[rnn_1, rnn_2])\n\n# Train\ntrainer = Trainer(model, objective=\"binary\")\n\ntrainer.fit(\n    X_tab=X_tab,\n    X_text=[X_text_1, X_text_2],\n    target=df[\"target\"].values,\n    n_epochs=1,\n    batch_size=32,\n)\n```\n\n**5. 
Tabular data and multiple text columns that are fused via the library's\n  `ModelFuser` class**\n\n\u003Cp align=\"center\">\n    \u003Cimg width=\"500\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fjrzaurin_pytorch-widedeep_readme_3eaa97b40d46.png\">\n\u003C\u002Fp>\n\n```python\nfrom pytorch_widedeep.preprocessing import TabPreprocessor, TextPreprocessor\nfrom pytorch_widedeep.models import TabMlp, BasicRNN, WideDeep, ModelFuser\nfrom pytorch_widedeep import Trainer\n\n# Tabular\ntab_preprocessor = TabPreprocessor(\n    embed_cols=[\"city\", \"name\"], continuous_cols=[\"age\", \"height\"]\n)\nX_tab = tab_preprocessor.fit_transform(df)\ntab_mlp = TabMlp(\n    column_idx=tab_preprocessor.column_idx,\n    cat_embed_input=tab_preprocessor.cat_embed_input,\n    continuous_cols=tab_preprocessor.continuous_cols,\n    mlp_hidden_dims=[64, 32],\n)\n\n# Text\ntext_preprocessor_1 = TextPreprocessor(\n    text_col=\"sentence\", maxlen=20, max_vocab=100, n_cpus=1\n)\nX_text_1 = text_preprocessor_1.fit_transform(df)\ntext_preprocessor_2 = TextPreprocessor(\n    text_col=\"other_sentence\", maxlen=20, max_vocab=100, n_cpus=1\n)\nX_text_2 = text_preprocessor_2.fit_transform(df)\n\nrnn_1 = BasicRNN(\n    vocab_size=len(text_preprocessor_1.vocab.itos),\n    embed_dim=16,\n    hidden_dim=8,\n    n_layers=1,\n)\nrnn_2 = BasicRNN(\n    vocab_size=len(text_preprocessor_2.vocab.itos),\n    embed_dim=16,\n    hidden_dim=8,\n    n_layers=1,\n)\n\nmodels_fuser = ModelFuser(models=[rnn_1, rnn_2], fusion_method=\"mult\")\n\n# WideDeep\nmodel = WideDeep(deeptabular=tab_mlp, deeptext=models_fuser)\n\n# Train\ntrainer = Trainer(model, objective=\"binary\")\n\ntrainer.fit(\n    X_tab=X_tab,\n    X_text=[X_text_1, X_text_2],\n    target=df[\"target\"].values,\n    n_epochs=1,\n    batch_size=32,\n)\n```\n\n**6. Tabular and multiple text columns, with an image column. 
The text columns\n  are fused via the library's `ModelFuser` and then all fused via the\n  deephead parameter in `WideDeep` which is a custom `ModelFuser` coded by\n  the user**\n\nThis is perhaps the least elegant solution as it involves a custom component by\nthe user and slicing the 'incoming' tensor. In the future, we will include a\n`TextAndImageModelFuser` to make this process more straightforward. Still, it is not\nreally complicated and it is a good example of how to use custom components in\n`pytorch-widedeep`.\n\nNote that the only requirement for the custom component is that it has a\nproperty called `output_dim` that returns the size of the last layer of\nactivations. In other words, it does not need to inherit from\n`BaseWDModelComponent`. This base class simply checks the existence of such\na property and avoids some typing errors internally.\n\n\u003Cp align=\"center\">\n    \u003Cimg width=\"600\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fjrzaurin_pytorch-widedeep_readme_d4df294b5164.png\">\n\u003C\u002Fp>\n\n```python\nimport torch\n\nfrom pytorch_widedeep.preprocessing import TabPreprocessor, TextPreprocessor, ImagePreprocessor\nfrom pytorch_widedeep.models import TabMlp, BasicRNN, WideDeep, ModelFuser, Vision\nfrom pytorch_widedeep.models._base_wd_model_component import BaseWDModelComponent\nfrom pytorch_widedeep import Trainer\n\n# Tabular\ntab_preprocessor = TabPreprocessor(\n    embed_cols=[\"city\", \"name\"], continuous_cols=[\"age\", \"height\"]\n)\nX_tab = tab_preprocessor.fit_transform(df)\ntab_mlp = TabMlp(\n    column_idx=tab_preprocessor.column_idx,\n    cat_embed_input=tab_preprocessor.cat_embed_input,\n    continuous_cols=tab_preprocessor.continuous_cols,\n    mlp_hidden_dims=[16, 8],\n)\n\n# Text\ntext_preprocessor_1 = TextPreprocessor(\n    text_col=\"sentence\", maxlen=20, max_vocab=100, n_cpus=1\n)\nX_text_1 = text_preprocessor_1.fit_transform(df)\ntext_preprocessor_2 = TextPreprocessor(\n    
text_col=\"other_sentence\", maxlen=20, max_vocab=100, n_cpus=1\n)\nX_text_2 = text_preprocessor_2.fit_transform(df)\nrnn_1 = BasicRNN(\n    vocab_size=len(text_preprocessor_1.vocab.itos),\n    embed_dim=16,\n    hidden_dim=8,\n    n_layers=1,\n)\nrnn_2 = BasicRNN(\n    vocab_size=len(text_preprocessor_2.vocab.itos),\n    embed_dim=16,\n    hidden_dim=8,\n    n_layers=1,\n)\nmodels_fuser = ModelFuser(\n    models=[rnn_1, rnn_2],\n    fusion_method=\"mult\",\n)\n\n# Image\nimage_preprocessor = ImagePreprocessor(img_col=\"image_name\", img_path=\"images\")\nX_img = image_preprocessor.fit_transform(df)\nvision = Vision(pretrained_model_setup=\"resnet18\", head_hidden_dims=[16, 8])\n\n# deephead (custom model fuser)\nclass MyModelFuser(BaseWDModelComponent):\n    \"\"\"\n    Simply a Linear + Relu sequence on top of the text + images followed by a\n    Linear -> Relu -> Linear for the concatenation of tabular slice of the\n    tensor and the output of the text and image sequential model\n    \"\"\"\n    def __init__(\n        self,\n        tab_incoming_dim: int,\n        text_incoming_dim: int,\n        image_incoming_dim: int,\n        output_units: int,\n    ):\n\n        super(MyModelFuser, self).__init__()\n\n        self.tab_incoming_dim = tab_incoming_dim\n        self.text_incoming_dim = text_incoming_dim\n        self.image_incoming_dim = image_incoming_dim\n        self.output_units = output_units\n        self.text_and_image_fuser = torch.nn.Sequential(\n            torch.nn.Linear(text_incoming_dim + image_incoming_dim, output_units),\n            torch.nn.ReLU(),\n        )\n        self.out = torch.nn.Sequential(\n            torch.nn.Linear(output_units + tab_incoming_dim, output_units * 4),\n            torch.nn.ReLU(),\n            torch.nn.Linear(output_units * 4, output_units),\n        )\n\n    def forward(self, X: torch.Tensor) -> torch.Tensor:\n        tab_slice = slice(0, self.tab_incoming_dim)\n        text_slice = slice(\n            
self.tab_incoming_dim, self.tab_incoming_dim + self.text_incoming_dim\n        )\n        image_slice = slice(\n            self.tab_incoming_dim + self.text_incoming_dim,\n            self.tab_incoming_dim + self.text_incoming_dim + self.image_incoming_dim,\n        )\n        X_tab = X[:, tab_slice]\n        X_text = X[:, text_slice]\n        X_img = X[:, image_slice]\n        X_text_and_image = self.text_and_image_fuser(torch.cat([X_text, X_img], dim=1))\n        return self.out(torch.cat([X_tab, X_text_and_image], dim=1))\n\n    @property\n    def output_dim(self):\n        return self.output_units\n\n\ndeephead = MyModelFuser(\n    tab_incoming_dim=tab_mlp.output_dim,\n    text_incoming_dim=models_fuser.output_dim,\n    image_incoming_dim=vision.output_dim,\n    output_units=8,\n)\n\n# WideDeep\nmodel = WideDeep(\n    deeptabular=tab_mlp,\n    deeptext=models_fuser,\n    deepimage=vision,\n    deephead=deephead,\n)\n\n# Train\ntrainer = Trainer(model, objective=\"binary\")\n\ntrainer.fit(\n    X_tab=X_tab,\n    X_text=[X_text_1, X_text_2],\n    X_img=X_img,\n    target=df[\"target\"].values,\n    n_epochs=1,\n    batch_size=32,\n)\n```\n\n**7. A two-tower model**\n\nThis is a popular model in the context of recommendation systems. Let's say we\nhave a tabular dataset formed by triples (user features, item features,\ntarget). 
We can create a two-tower model where the user and item features are\npassed through two separate models and then \"fused\" via a dot product.\n\n\u003Cp align=\"center\">\n  \u003Cimg width=\"350\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fjrzaurin_pytorch-widedeep_readme_1531c49bcc26.png\">\n\u003C\u002Fp>\n\n```python\nimport numpy as np\nimport pandas as pd\n\nfrom pytorch_widedeep import Trainer\nfrom pytorch_widedeep.preprocessing import TabPreprocessor\nfrom pytorch_widedeep.models import TabMlp, WideDeep, ModelFuser\n\n# Let's create the interaction dataset\n# user_features dataframe\nnp.random.seed(42)\nuser_ids = np.arange(1, 101)\nages = np.random.randint(18, 60, size=100)\ngenders = np.random.choice([\"male\", \"female\"], size=100)\nlocations = np.random.choice([\"city_a\", \"city_b\", \"city_c\", \"city_d\"], size=100)\nuser_features = pd.DataFrame(\n    {\"id\": user_ids, \"age\": ages, \"gender\": genders, \"location\": locations}\n)\n\n# item_features dataframe\nitem_ids = np.arange(1, 101)\nprices = np.random.uniform(10, 500, size=100).round(2)\ncolors = np.random.choice([\"red\", \"blue\", \"green\", \"black\"], size=100)\ncategories = np.random.choice([\"electronics\", \"clothing\", \"home\", \"toys\"], size=100)\n\nitem_features = pd.DataFrame(\n    {\"id\": item_ids, \"price\": prices, \"color\": colors, \"category\": categories}\n)\n\n# Interactions dataframe\ninteraction_user_ids = np.random.choice(user_ids, size=1000)\ninteraction_item_ids = np.random.choice(item_ids, size=1000)\npurchased = np.random.choice([0, 1], size=1000, p=[0.7, 0.3])\ninteractions = pd.DataFrame(\n    {\n        \"user_id\": interaction_user_ids,\n        \"item_id\": interaction_item_ids,\n        \"purchased\": purchased,\n    }\n)\nuser_item_purchased = interactions.merge(\n    user_features, left_on=\"user_id\", right_on=\"id\"\n).merge(item_features, left_on=\"item_id\", right_on=\"id\")\n\n# Users\ntab_preprocessor_user = TabPreprocessor(\n    
cat_embed_cols=[\"gender\", \"location\"],\n    continuous_cols=[\"age\"],\n)\nX_user = tab_preprocessor_user.fit_transform(user_item_purchased)\ntab_mlp_user = TabMlp(\n    column_idx=tab_preprocessor_user.column_idx,\n    cat_embed_input=tab_preprocessor_user.cat_embed_input,\n    continuous_cols=[\"age\"],\n    mlp_hidden_dims=[16, 8],\n    mlp_dropout=[0.2, 0.2],\n)\n\n# Items\ntab_preprocessor_item = TabPreprocessor(\n    cat_embed_cols=[\"color\", \"category\"],\n    continuous_cols=[\"price\"],\n)\nX_item = tab_preprocessor_item.fit_transform(user_item_purchased)\ntab_mlp_item = TabMlp(\n    column_idx=tab_preprocessor_item.column_idx,\n    cat_embed_input=tab_preprocessor_item.cat_embed_input,\n    continuous_cols=[\"price\"],\n    mlp_hidden_dims=[16, 8],\n    mlp_dropout=[0.2, 0.2],\n)\n\ntwo_tower_model = ModelFuser([tab_mlp_user, tab_mlp_item], fusion_method=\"dot\")\n\nmodel = WideDeep(deeptabular=two_tower_model)\n\ntrainer = Trainer(model, objective=\"binary\")\n\ntrainer.fit(\n    X_tab=[X_user, X_item],\n    target=interactions.purchased.values,\n    n_epochs=1,\n    batch_size=32,\n)\n```\n\n**8. 
Tabular with a multi-target loss**\n\nThis one is \"a bonus\" to illustrate the use of multi-target losses, rather than\nactually a different architecture.\n\n\u003Cp align=\"center\">\n  \u003Cimg width=\"200\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fjrzaurin_pytorch-widedeep_readme_2d6c122c5cbe.png\">\n\u003C\u002Fp>\n\n```python\nimport random\n\nfrom pytorch_widedeep.preprocessing import TabPreprocessor\nfrom pytorch_widedeep.models import TabMlp, WideDeep\nfrom pytorch_widedeep.losses_multitarget import MultiTargetClassificationLoss\nfrom pytorch_widedeep import Trainer\n\n# let's add a second target to the dataframe\ndf[\"target2\"] = [random.choice([0, 1]) for _ in range(100)]\n\n# Tabular\ntab_preprocessor = TabPreprocessor(\n    embed_cols=[\"city\", \"name\"], continuous_cols=[\"age\", \"height\"]\n)\nX_tab = tab_preprocessor.fit_transform(df)\ntab_mlp = TabMlp(\n    column_idx=tab_preprocessor.column_idx,\n    cat_embed_input=tab_preprocessor.cat_embed_input,\n    continuous_cols=tab_preprocessor.continuous_cols,\n    mlp_hidden_dims=[64, 32],\n)\n\n# 'pred_dim=2' because we have two binary targets. For other types of targets,\n#  please, see the documentation\nmodel = WideDeep(deeptabular=tab_mlp, pred_dim=2)\n\nloss = MultiTargetClassificationLoss(binary_config=[0, 1], reduction=\"mean\")\n\n# When a multi-target loss is used, 'custom_loss_function' must not be None.\n# See the docs\ntrainer = Trainer(model, objective=\"multitarget\", custom_loss_function=loss)\n\ntrainer.fit(\n    X_tab=X_tab,\n    target=df[[\"target\", \"target2\"]].values,\n    n_epochs=1,\n    batch_size=32,\n)\n```\n\n### The ``deeptabular`` component\n\nIt is important to emphasize again that **each individual component, `wide`,\n`deeptabular`, `deeptext` and `deepimage`, can be used independently** and in\nisolation. 
For example, one could use only `wide`, which is simply a\nlinear model. In fact, one of the most interesting functionalities\nin ``pytorch-widedeep`` would be the use of the ``deeptabular`` component on\nits own, i.e. what one might normally refer to as Deep Learning for Tabular\nData. Currently, ``pytorch-widedeep`` offers the following different models\nfor that component:\n\n0. **Wide**: a simple linear model where the nonlinearities are captured via\ncross-product transformations, as explained before.\n1. **TabMlp**: a simple MLP that receives embeddings representing the\ncategorical features, concatenated with the continuous features, which can\nalso be embedded.\n2. **TabResnet**: similar to the previous model but the embeddings are\npassed through a series of ResNet blocks built with dense layers.\n3. **TabNet**: details on TabNet can be found in\n[TabNet: Attentive Interpretable Tabular Learning](https:\u002F\u002Farxiv.org\u002Fabs\u002F1908.07442)\n\nTwo simpler attention-based models that we call:\n\n4. **ContextAttentionMLP**: MLP with an attention mechanism \"on top\" that is based on\n    [Hierarchical Attention Networks for Document Classification](https:\u002F\u002Fwww.cs.cmu.edu\u002F~.\u002Fhovy\u002Fpapers\u002F16HLT-hierarchical-attention-networks.pdf)\n5. **SelfAttentionMLP**: MLP with an attention mechanism that is a simplified\n    version of a transformer block that we refer to as \"query-key self-attention\".\n\nThe ``Tabformer`` family, i.e. Transformers for Tabular data:\n\n6. **TabTransformer**: details on the TabTransformer can be found in\n[TabTransformer: Tabular Data Modeling Using Contextual Embeddings](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2012.06678.pdf).\n7. **SAINT**: details on SAINT can be found in\n[SAINT: Improved Neural Networks for Tabular Data via Row Attention and Contrastive Pre-Training](https:\u002F\u002Farxiv.org\u002Fabs\u002F2106.01342).\n8. 
**FT-Transformer**: details on the FT-Transformer can be found in\n[Revisiting Deep Learning Models for Tabular Data](https:\u002F\u002Farxiv.org\u002Fabs\u002F2106.11959).\n9. **TabFastFormer**: adaptation of the FastFormer for tabular data. Details\non the FastFormer can be found in\n[FastFormers: Highly Efficient Transformer Models for Natural Language Understanding](https:\u002F\u002Farxiv.org\u002Fabs\u002F2010.13382)\n10. **TabPerceiver**: adaptation of the Perceiver for tabular data. Details on\nthe Perceiver can be found in\n[Perceiver: General Perception with Iterative Attention](https:\u002F\u002Farxiv.org\u002Fabs\u002F2103.03206)\n\nAnd probabilistic DL models for tabular data based on\n[Weight Uncertainty in Neural Networks](https:\u002F\u002Farxiv.org\u002Fabs\u002F1505.05424):\n\n11. **BayesianWide**: probabilistic adaptation of the `Wide` model.\n12. **BayesianTabMlp**: probabilistic adaptation of the `TabMlp` model.\n\nNote that while there are scientific publications for the TabTransformer,\nSAINT and FT-Transformer, the TabFastFormer and TabPerceiver are our own\nadaptations of those algorithms for tabular data.\n\nIn addition, Self-Supervised pre-training can be used for all `deeptabular`\nmodels, with the exception of the `TabPerceiver`. Self-Supervised\npre-training can be used via two methods or routines which we refer to as the\nencoder-decoder method and the contrastive-denoising method. Please see the\ndocumentation and the examples for details on this functionality, and all\nother options in the library.\n\n### The ``rec`` module\n\nThis module was introduced as an extension to the existing components in the\nlibrary, addressing questions and issues related to recommendation systems.\nWhile still under active development, it currently includes a select number\nof powerful recommendation models.\n\nIt's worth noting that this library already supported the implementation of\nvarious recommendation algorithms using existing components. 
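Indeed, the interaction step at the core of several such models (e.g. a Two-Tower retrieval model) reduces to a dot product between two learned vectors. As a library-free illustration (all vectors below are made-up stand-ins for learned tower outputs):

```python
# Library-free sketch: the "fusion" at the heart of a Two-Tower model is a
# dot product between the user-tower and item-tower output vectors.

def dot(u, v):
    # element-wise product, summed
    return sum(a * b for a, b in zip(u, v))

user_vec = [0.2, -0.5, 1.0]   # hypothetical user-tower output
item_vec = [0.1, 0.4, -0.3]   # hypothetical item-tower output

# higher score -> stronger predicted user/item affinity
score = dot(user_vec, item_vec)
```

In the library itself this is what `ModelFuser(..., fusion_method="dot")` performs over the two tower outputs, as shown in the Two-Tower example above.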
For example,\nmodels like Wide and Deep, Two-Tower, or Neural Collaborative Filtering could\nbe constructed using the library's core functionalities.\n\nThe recommendation algorithms in the `rec` module are:\n\n1. [AutoInt: Automatic Feature Interaction Learning via Self-Attentive Neural Networks](https:\u002F\u002Farxiv.org\u002Fabs\u002F1810.11921)\n2. [DeepFM: A Factorization-Machine based Neural Network for CTR Prediction](https:\u002F\u002Farxiv.org\u002Fabs\u002F1703.04247)\n3. (Deep) Field Aware Factorization Machine (FFM): a Deep Learning version of the algorithm presented in [Field-aware Factorization Machines in a Real-world Online Advertising System](https:\u002F\u002Farxiv.org\u002Fabs\u002F1701.04099)\n4. [xDeepFM: Combining Explicit and Implicit Feature Interactions for Recommender Systems](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1803.05170)\n5. [Deep Interest Network for Click-Through Rate Prediction](https:\u002F\u002Farxiv.org\u002Fabs\u002F1706.06978)\n6. [Deep and Cross Network for Ad Click Predictions](https:\u002F\u002Farxiv.org\u002Fabs\u002F1708.05123)\n7. [DCN V2: Improved Deep & Cross Network and Practical Lessons for Web-scale Learning to Rank Systems](https:\u002F\u002Farxiv.org\u002Fabs\u002F2008.13535)\n8. [Towards Deeper, Lighter and Interpretable Click-through Rate Prediction](https:\u002F\u002Farxiv.org\u002Fabs\u002F2311.04635)\n9. A basic Transformer-based model for recommendation where the problem is treated as a sequence.\n\nSee the examples for details on how to use these models.\n\n### Text and Images\n\nFor the text component, `deeptext`, the library offers the following models:\n\n1. **BasicRNN**: a simple RNN\n2. **AttentiveRNN**: an RNN with an attention\nmechanism based on the\n[Hierarchical Attention Networks for Document Classification](https:\u002F\u002Fwww.cs.cmu.edu\u002F~.\u002Fhovy\u002Fpapers\u002F16HLT-hierarchical-attention-networks.pdf)\n3. **StackedAttentiveRNN**: a stack of AttentiveRNNs\n4. 
**HFModel**: a wrapper around Hugging Face Transformer-based models. At the moment\nonly models from the families BERT, RoBERTa, DistilBERT, ALBERT and ELECTRA\nare supported. This is because this library is designed to address\nclassification and regression tasks and these are the most 'popular'\nencoder-only models, which have proven to work best for these\ntasks. If there is demand for other models, they will be included in the\nfuture.\n\nFor the image component, `deepimage`, the library supports models from the\nfollowing families:\n'resnet', 'shufflenet', 'resnext', 'wide_resnet', 'regnet', 'densenet', 'mobilenetv3',\n'mobilenetv2', 'mnasnet', 'efficientnet' and 'squeezenet'. These are\noffered via `torchvision` and wrapped up in the `Vision` class.\n\n### Installation\n\nInstall using pip:\n\n```bash\npip install pytorch-widedeep\n```\n\nOr install directly from GitHub:\n\n```bash\npip install git+https:\u002F\u002Fgithub.com\u002Fjrzaurin\u002Fpytorch-widedeep.git\n```\n\n#### Developer Install\n\n```bash\n# Clone the repository\ngit clone https:\u002F\u002Fgithub.com\u002Fjrzaurin\u002Fpytorch-widedeep\ncd pytorch-widedeep\n\n# Install in dev mode\npip install -e .\n```\n\n### Quick start\n\nHere is an end-to-end example of a binary classification with the [adult\ndataset](https:\u002F\u002Fwww.kaggle.com\u002Fwenruliu\u002Fadult-income-dataset)\nusing `Wide` and `TabMlp` with default settings.\n\nBuilding a wide (linear) and deep model with ``pytorch-widedeep``:\n\n```python\nimport numpy as np\nimport torch\nfrom sklearn.model_selection import train_test_split\n\nfrom pytorch_widedeep import Trainer\nfrom pytorch_widedeep.preprocessing import WidePreprocessor, TabPreprocessor\nfrom pytorch_widedeep.models import Wide, TabMlp, WideDeep\nfrom pytorch_widedeep.metrics import Accuracy\nfrom pytorch_widedeep.datasets import load_adult\n\n\ndf = load_adult(as_frame=True)\ndf[\"income_label\"] = (df[\"income\"].apply(lambda x: 
\">50K\" in x)).astype(int)\ndf.drop(\"income\", axis=1, inplace=True)\ndf_train, df_test = train_test_split(df, test_size=0.2, stratify=df.income_label)\n\n# Define the 'column set up'\nwide_cols = [\n    \"education\",\n    \"relationship\",\n    \"workclass\",\n    \"occupation\",\n    \"native-country\",\n    \"gender\",\n]\ncrossed_cols = [(\"education\", \"occupation\"), (\"native-country\", \"occupation\")]\n\ncat_embed_cols = [\n    \"workclass\",\n    \"education\",\n    \"marital-status\",\n    \"occupation\",\n    \"relationship\",\n    \"race\",\n    \"gender\",\n    \"capital-gain\",\n    \"capital-loss\",\n    \"native-country\",\n]\ncontinuous_cols = [\"age\", \"hours-per-week\"]\ntarget = \"income_label\"\ntarget = df_train[target].values\n\n# prepare the data\nwide_preprocessor = WidePreprocessor(wide_cols=wide_cols, crossed_cols=crossed_cols)\nX_wide = wide_preprocessor.fit_transform(df_train)\n\ntab_preprocessor = TabPreprocessor(\n    cat_embed_cols=cat_embed_cols, continuous_cols=continuous_cols  # type: ignore[arg-type]\n)\nX_tab = tab_preprocessor.fit_transform(df_train)\n\n# build the model\nwide = Wide(input_dim=np.unique(X_wide).shape[0], pred_dim=1)\ntab_mlp = TabMlp(\n    column_idx=tab_preprocessor.column_idx,\n    cat_embed_input=tab_preprocessor.cat_embed_input,\n    continuous_cols=continuous_cols,\n)\nmodel = WideDeep(wide=wide, deeptabular=tab_mlp)\n\n# train and validate\ntrainer = Trainer(model, objective=\"binary\", metrics=[Accuracy])\ntrainer.fit(\n    X_wide=X_wide,\n    X_tab=X_tab,\n    target=target,\n    n_epochs=5,\n    batch_size=256,\n)\n\n# predict on test\nX_wide_te = wide_preprocessor.transform(df_test)\nX_tab_te = tab_preprocessor.transform(df_test)\npreds = trainer.predict(X_wide=X_wide_te, X_tab=X_tab_te)\n\n# Save and load\n\n# Option 1: this will also save training history and lr history if the\n# LRHistory callback is used\ntrainer.save(path=\"model_weights\", save_state_dict=True)\n\n# Option 2: save as any 
other torch model\ntorch.save(model.state_dict(), \"model_weights\u002Fwd_model.pt\")\n\n# From here onwards, Options 1 and 2 are the same. I assume the user has\n# prepared the data and defined the new model components:\n# 1. Build the model\nmodel_new = WideDeep(wide=wide, deeptabular=tab_mlp)\nmodel_new.load_state_dict(torch.load(\"model_weights\u002Fwd_model.pt\"))\n\n# 2. Instantiate the trainer\ntrainer_new = Trainer(model_new, objective=\"binary\")\n\n# 3. Either start the fit or directly predict\npreds = trainer_new.predict(X_wide=X_wide, X_tab=X_tab, batch_size=32)\n```\n\nOf course, one can do **much more**. See the Examples folder, the\ndocumentation or the companion posts for a better understanding of the content\nof the package and its functionalities.\n\n### Testing\n\n```\npytest tests\n```\n\n### How to Contribute\n\nCheck the [CONTRIBUTING](https:\u002F\u002Fgithub.com\u002Fjrzaurin\u002Fpytorch-widedeep\u002Fblob\u002Fmaster\u002FCONTRIBUTING.MD) page.\n\n### Acknowledgments\n\nThis library borrows from a series of other libraries, so I think it is just\nfair to mention them here in the README (specific mentions are also included\nin the code).\n\nThe `Callbacks` and `Initializers` structure and code is inspired by the\n[`torchsample`](https:\u002F\u002Fgithub.com\u002Fncullen93\u002Ftorchsample) library, which is\nitself partially inspired by [`Keras`](https:\u002F\u002Fkeras.io\u002F).\n\nThe `TextProcessor` class in this library uses the\n[`fastai`](https:\u002F\u002Fdocs.fast.ai\u002Ftext.transform.html#BaseTokenizer.tokenizer)'s\n`Tokenizer` and `Vocab`. The code at `utils.fastai_transforms` is a minor\nadaptation of their code so it functions within this library. 
In my experience,\ntheir `Tokenizer` is the best in class.\n\nThe `ImageProcessor` class in this library uses code from the fantastic [Deep\nLearning for Computer\nVision](https:\u002F\u002Fwww.pyimagesearch.com\u002Fdeep-learning-computer-vision-python-book\u002F)\n(DL4CV) book by Adrian Rosebrock.\n\n### License\n\nThis work is dual-licensed under Apache 2.0 and MIT (or any later version).\nYou can choose either of them if you use this work.\n\n`SPDX-License-Identifier: Apache-2.0 AND MIT`\n\n### Cite\n\n#### BibTex\n\n```\n@article{Zaurin_pytorch-widedeep_A_flexible_2023,\nauthor = {Zaurin, Javier Rodriguez and Mulinka, Pavol},\ndoi = {10.21105\u002Fjoss.05027},\njournal = {Journal of Open Source Software},\nmonth = jun,\nnumber = {86},\npages = {5027},\ntitle = {{pytorch-widedeep: A flexible package for multimodal deep learning}},\nurl = {https:\u002F\u002Fjoss.theoj.org\u002Fpapers\u002F10.21105\u002Fjoss.05027},\nvolume = {8},\nyear = {2023}\n}\n```\n\n#### APA\n\n```\nZaurin, J. R., & Mulinka, P. (2023). pytorch-widedeep: A flexible package for\nmultimodal deep learning. 
Journal of Open Source Software, 8(86), 5027.\nhttps:\u002F\u002Fdoi.org\u002F10.21105\u002Fjoss.05027\n```\n","\u003Cp align=\"center\">\n  \u003Cimg width=\"300\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fjrzaurin_pytorch-widedeep_readme_5bd5a11d58c3.png\">\n\u003C\u002Fp>\n\n[![PyPI版本](https:\u002F\u002Fbadge.fury.io\u002Fpy\u002Fpytorch-widedeep.svg)](https:\u002F\u002Fpypi.org\u002Fproject\u002Fpytorch-widedeep\u002F)\n[![Python 3.9 3.10 3.11 3.12](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fpython-3.9%20%7C%203.10%20%7C%203.11%20%7C%203.12-blue.svg)](https:\u002F\u002Fpypi.org\u002Fproject\u002Fpytorch-widedeep\u002F)\n[![构建状态](https:\u002F\u002Fgithub.com\u002Fjrzaurin\u002Fpytorch-widedeep\u002Factions\u002Fworkflows\u002Fbuild.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fjrzaurin\u002Fpytorch-widedeep\u002Factions)\n[![文档状态](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fjrzaurin_pytorch-widedeep_readme_13d664e1afd7.png)](https:\u002F\u002Fpytorch-widedeep.readthedocs.io\u002Fen\u002Flatest\u002F?badge=latest)\n[![codecov](https:\u002F\u002Fcodecov.io\u002Fgh\u002Fjrzaurin\u002Fpytorch-widedeep\u002Fbranch\u002Fmaster\u002Fgraph\u002Fbadge.svg)](https:\u002F\u002Fcodecov.io\u002Fgh\u002Fjrzaurin\u002Fpytorch-widedeep)\n[![代码风格：black](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fcode%20style-black-000000.svg)](https:\u002F\u002Fgithub.com\u002Fpsf\u002Fblack)\n[![维护状态](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FMaintained%3F-yes-green.svg)](https:\u002F\u002Fgithub.com\u002Fjrzaurin\u002Fpytorch-widedeep\u002Fgraphs\u002Fcommit-activity)\n[![欢迎贡献](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fcontributions-welcome-brightgreen.svg?style=flat)](https:\u002F\u002Fgithub.com\u002Fjrzaurin\u002Fpytorch-widedeep\u002Fissues)\n[![Slack](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fslack-chat-green.svg?logo=slack)](https:\u002F\u002Fjoin.slack.com\u002Ft\u002Fpytorch-widedeep\u002Fshared_invite\u002Fzt-soss7st
f-iXpVuLeKZz8lGTnxxtHtTw)\n[![DOI](https:\u002F\u002Fjoss.theoj.org\u002Fpapers\u002F10.21105\u002Fjoss.05027\u002Fstatus.svg)](https:\u002F\u002Fdoi.org\u002F10.21105\u002Fjoss.05027)\n\n# pytorch-widedeep\n\n一个灵活的多模态深度学习包，用于在 PyTorch 中结合表格数据、文本和图像，采用 Wide 和 Deep 模型。\n\n**文档:** [https:\u002F\u002Fpytorch-widedeep.readthedocs.io](https:\u002F\u002Fpytorch-widedeep.readthedocs.io\u002Fen\u002Flatest\u002Findex.html)\n\n**配套文章与教程:** [infinitoml](https:\u002F\u002Fjrzaurin.github.io\u002Finfinitoml\u002F)\n\n**实验及与 `LightGBM` 的对比:** [TabularDL vs LightGBM](https:\u002F\u002Fgithub.com\u002Fjrzaurin\u002Ftabulardl-benchmark)\n\n**Slack:** 如果您想参与贡献或只是想与我们交流，请加入 [slack](https:\u002F\u002Fjoin.slack.com\u002Ft\u002Fpytorch-widedeep\u002Fshared_invite\u002Fzt-soss7stf-iXpVuLeKZz8lGTnxxtHtTw)。\n\n本文档的内容组织如下：\n\n- [pytorch-widedeep](#pytorch-widedeep)\n  - [简介](#introduction)\n  - [架构](#architectures)\n  - [``deeptabular`` 组件](#the-deeptabular-component)\n  - [``rec`` 模块](#the-rec-module)\n  - [文本与图像](#text-and-images)\n  - [安装](#installation)\n    - [开发者安装](#developer-install)\n  - [快速入门](#quick-start)\n  - [测试](#testing)\n  - [如何贡献](#how-to-contribute)\n  - [致谢](#acknowledgments)\n  - [许可证](#license)\n  - [引用](#cite)\n    - [BibTex](#bibtex)\n    - [APA](#apa)\n\n### 简介\n\n``pytorch-widedeep`` 基于 Google 的 [Wide and Deep 算法](https:\u002F\u002Farxiv.org\u002Fabs\u002F1606.07792)，并针对多模态数据集进行了调整。\n\n广义上讲，`pytorch-widedeep` 是一个用于将深度学习应用于表格数据的工具包。具体而言，它旨在通过 Wide 和 Deep 模型，方便地将文本和图像与相应的表格数据相结合。基于这一目标，该库支持多种架构的实现。这些架构的主要组成部分如图所示：\n\n\u003Cp align=\"center\">\n  \u003Cimg width=\"750\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fjrzaurin_pytorch-widedeep_readme_541389db690f.png\">\n\u003C\u002Fp>\n\n从数学角度来看，按照论文中的符号约定（[论文链接](https:\u002F\u002Farxiv.org\u002Fabs\u002F1606.07792)），不含 ``deephead`` 组件的架构表达式可写为：\n\n\u003Cp align=\"center\">\n  \u003Cimg width=\"500\" 
src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fjrzaurin_pytorch-widedeep_readme_d1e39750a145.png\">\n\u003C\u002Fp>\n\n其中 &sigma; 表示 sigmoid 函数，*'W'* 是应用于 Wide 模型以及 Deep 模型最终激活层的权重矩阵，*'a'* 是这些最终激活值，&phi;(x) 是对原始特征 *'x'* 进行的交叉乘积变换，而 *'b'* 则是偏置项。如果您好奇“交叉乘积变换”是什么，这里直接引用论文中的一段话：“对于二值特征，交叉乘积变换（例如‘AND(性别=女性, 语言=en)’）仅当其组成特征（‘性别=女性’和‘语言=en’）均为 1 时才为 1，否则为 0。” \n\n当然，也可以使用自定义模型（而不局限于库中提供的模型），只要这些自定义模型具备名为 ``output_dim`` 的属性，表示最后一层激活的维度，以便能够构建 ``WideDeep`` 模型即可。有关如何使用自定义组件的示例，可在 Examples 文件夹及下文找到。\n\n### 架构\n\n`pytorch-widedeep` 库提供了多种不同的架构。在这一节中，我们将以最简单的形式（即大多数情况下使用默认参数值）展示其中的一些架构，并附上相应的代码片段。请注意，**所有**以下代码片段都应在本地运行。如需对各个组件及其参数的更详细说明，请参阅文档。\n\n对于下面的示例，我们将使用一个如下生成的玩具数据集：\n\n```python\nimport os\nimport random\n\nimport numpy as np\nimport pandas as pd\nfrom PIL import Image\nfrom faker import Faker\n\n\ndef create_and_save_random_image(image_number, size=(32, 32)):\n\n    if not os.path.exists(\"images\"):\n        os.makedirs(\"images\")\n\n    array = np.random.randint(0, 256, (size[0], size[1], 3), dtype=np.uint8)\n\n    image = Image.fromarray(array)\n\n    image_name = f\"image_{image_number}.png\"\n    image.save(os.path.join(\"images\", image_name))\n\n    return image_name\n\n\nfake = Faker()\n\ncities = [\"New York\", \"Los Angeles\", \"Chicago\", \"Houston\"]\nnames = [\"Alice\", \"Bob\", \"Charlie\", \"David\", \"Eva\"]\n\ndata = {\n    \"city\": [random.choice(cities) for _ in range(100)],\n    \"name\": [random.choice(names) for _ in range(100)],\n    \"age\": [random.uniform(18, 70) for _ in range(100)],\n    \"height\": [random.uniform(150, 200) for _ in range(100)],\n    \"sentence\": [fake.sentence() for _ in range(100)],\n    \"other_sentence\": [fake.sentence() for _ in range(100)],\n    \"image_name\": [create_and_save_random_image(i) for i in range(100)],\n    \"target\": [random.choice([0, 1]) for _ in range(100)],\n}\n\ndf = 
pd.DataFrame(data)\n```\n\n这将创建一个包含100行的DataFrame，并在您的本地文件夹中生成一个名为`images`的目录，其中包含100张随机图像（或仅包含噪声的图像）。\n\n也许最简单的架构就是单独使用`wide`、`deeptabular`、`deeptext`或`deepimage`中的某一个组件，这也是可行的。不过，我们还是从标准的Wide and Deep架构开始示例。在此之后，构建仅由一个组件组成的模型就会变得非常直观。\n\n需要注意的是，以下示例几乎可以使用库中提供的任何模型来实现。例如，`TabMlp`可以被替换为`TabResnet`、`TabNet`、`TabTransformer`等。同样地，`BasicRNN`也可以被替换为`AttentiveRNN`、`StackedAttentiveRNN`，或者在使用Hugging Face模型时，替换为`HFModel`，并相应调整其参数和预处理器。\n\n**1. Wide和Tabular组件（即deeptabular）**\n\n\u003Cp align=\"center\">\n  \u003Cimg width=\"400\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fjrzaurin_pytorch-widedeep_readme_ac763715f38f.png\">\n\u003C\u002Fp>\n\n```python\nfrom pytorch_widedeep.preprocessing import TabPreprocessor, WidePreprocessor\nfrom pytorch_widedeep.models import Wide, TabMlp, WideDeep\nfrom pytorch_widedeep.training import Trainer\n\n# Wide\nwide_cols = [\"city\"]\ncrossed_cols = [(\"city\", \"name\")]\nwide_preprocessor = WidePreprocessor(wide_cols=wide_cols, crossed_cols=crossed_cols)\nX_wide = wide_preprocessor.fit_transform(df)\nwide = Wide(input_dim=np.unique(X_wide).shape[0])\n\n# Tabular\ntab_preprocessor = TabPreprocessor(\n    embed_cols=[\"city\", \"name\"], continuous_cols=[\"age\", \"height\"]\n)\nX_tab = tab_preprocessor.fit_transform(df)\ntab_mlp = TabMlp(\n    column_idx=tab_preprocessor.column_idx,\n    cat_embed_input=tab_preprocessor.cat_embed_input,\n    continuous_cols=tab_preprocessor.continuous_cols,\n    mlp_hidden_dims=[64, 32],\n)\n\n# WideDeep\nmodel = WideDeep(wide=wide, deeptabular=tab_mlp)\n\n# 训练\ntrainer = Trainer(model, objective=\"binary\")\n\ntrainer.fit(\n    X_wide=X_wide,\n    X_tab=X_tab,\n    target=df[\"target\"].values,\n    n_epochs=1,\n    batch_size=32,\n)\n```\n\n**2. 
Tabular和Text数据**\n\n\u003Cp align=\"center\">\n  \u003Cimg width=\"400\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fjrzaurin_pytorch-widedeep_readme_14c286db8585.png\">\n\u003C\u002Fp>\n\n```python\nfrom pytorch_widedeep.preprocessing import TabPreprocessor, TextPreprocessor\nfrom pytorch_widedeep.models import TabMlp, BasicRNN, WideDeep\nfrom pytorch_widedeep.training import Trainer\n\n# Tabular\ntab_preprocessor = TabPreprocessor(\n    embed_cols=[\"city\", \"name\"], continuous_cols=[\"age\", \"height\"]\n)\nX_tab = tab_preprocessor.fit_transform(df)\ntab_mlp = TabMlp(\n    column_idx=tab_preprocessor.column_idx,\n    cat_embed_input=tab_preprocessor.cat_embed_input,\n    continuous_cols=tab_preprocessor.continuous_cols,\n    mlp_hidden_dims=[64, 32],\n)\n\n# Text\ntext_preprocessor = TextPreprocessor(\n    text_col=\"sentence\", maxlen=20, max_vocab=100, n_cpus=1\n)\nX_text = text_preprocessor.fit_transform(df)\nrnn = BasicRNN(\n    vocab_size=len(text_preprocessor.vocab.itos),\n    embed_dim=16,\n    hidden_dim=8,\n    n_layers=1,\n)\n\n# WideDeep\nmodel = WideDeep(deeptabular=tab_mlp, deeptext=rnn)\n\n# 训练\ntrainer = Trainer(model, objective=\"binary\")\n\ntrainer.fit(\n    X_tab=X_tab,\n    X_text=X_text,\n    target=df[\"target\"].values,\n    n_epochs=1,\n    batch_size=32,\n)\n```\n\n**3. 
Tabular和text，并通过`WideDeep`中的`head_hidden_dims`参数在其顶部添加一个全连接层**\n\n\u003Cp align=\"center\">\n  \u003Cimg width=\"400\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fjrzaurin_pytorch-widedeep_readme_88db8d511c8f.png\">\n\u003C\u002Fp>\n\n```python\nfrom pytorch_widedeep.preprocessing import TabPreprocessor, TextPreprocessor\nfrom pytorch_widedeep.models import TabMlp, BasicRNN, WideDeep\nfrom pytorch_widedeep.training import Trainer\n\n# Tabular\ntab_preprocessor = TabPreprocessor(\n    embed_cols=[\"city\", \"name\"], continuous_cols=[\"age\", \"height\"]\n)\nX_tab = tab_preprocessor.fit_transform(df)\ntab_mlp = TabMlp(\n    column_idx=tab_preprocessor.column_idx,\n    cat_embed_input=tab_preprocessor.cat_embed_input,\n    continuous_cols=tab_preprocessor.continuous_cols,\n    mlp_hidden_dims=[64, 32],\n)\n\n# Text\ntext_preprocessor = TextPreprocessor(\n    text_col=\"sentence\", maxlen=20, max_vocab=100, n_cpus=1\n)\nX_text = text_preprocessor.fit_transform(df)\nrnn = BasicRNN(\n    vocab_size=len(text_preprocessor.vocab.itos),\n    embed_dim=16,\n    hidden_dim=8,\n    n_layers=1,\n)\n\n# WideDeep\nmodel = WideDeep(deeptabular=tab_mlp, deeptext=rnn, head_hidden_dims=[32, 16])\n\n# 训练\ntrainer = Trainer(model, objective=\"binary\")\n\ntrainer.fit(\n    X_tab=X_tab,\n    X_text=X_text,\n    target=df[\"target\"].values,\n    n_epochs=1,\n    batch_size=32,\n)\n```\n\n**4. 
Tabular和多个text列，直接传递给`WideDeep`**\n\n\u003Cp align=\"center\">\n  \u003Cimg width=\"500\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fjrzaurin_pytorch-widedeep_readme_e0b1d5d2baba.png\">\n\u003C\u002Fp>\n\n```python\nfrom pytorch_widedeep.preprocessing import TabPreprocessor, TextPreprocessor\nfrom pytorch_widedeep.models import TabMlp, BasicRNN, WideDeep\nfrom pytorch_widedeep.training import Trainer\n\n\n# Tabular\ntab_preprocessor = TabPreprocessor(\n    embed_cols=[\"city\", \"name\"], continuous_cols=[\"age\", \"height\"]\n)\nX_tab = tab_preprocessor.fit_transform(df)\ntab_mlp = TabMlp(\n    column_idx=tab_preprocessor.column_idx,\n    cat_embed_input=tab_preprocessor.cat_embed_input,\n    continuous_cols=tab_preprocessor.continuous_cols,\n    mlp_hidden_dims=[64, 32],\n)\n\n# 文本\ntext_preprocessor_1 = TextPreprocessor(\n    text_col=\"sentence\", maxlen=20, max_vocab=100, n_cpus=1\n)\nX_text_1 = text_preprocessor_1.fit_transform(df)\ntext_preprocessor_2 = TextPreprocessor(\n    text_col=\"other_sentence\", maxlen=20, max_vocab=100, n_cpus=1\n)\nX_text_2 = text_preprocessor_2.fit_transform(df)\nrnn_1 = BasicRNN(\n    vocab_size=len(text_preprocessor_1.vocab.itos),\n    embed_dim=16,\n    hidden_dim=8,\n    n_layers=1,\n)\nrnn_2 = BasicRNN(\n    vocab_size=len(text_preprocessor_2.vocab.itos),\n    embed_dim=16,\n    hidden_dim=8,\n    n_layers=1,\n)\n\n# WideDeep\nmodel = WideDeep(deeptabular=tab_mlp, deeptext=[rnn_1, rnn_2])\n\n# 训练\ntrainer = Trainer(model, objective=\"binary\")\n\ntrainer.fit(\n    X_tab=X_tab,\n    X_text=[X_text_1, X_text_2],\n    target=df[\"target\"].values,\n    n_epochs=1,\n    batch_size=32,\n)\n```\n\n**5. 
表格数据与多个文本列，通过库中的 `ModelFuser` 类进行融合**\n\n\u003Cp align=\"center\">\n    \u003Cimg width=\"500\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fjrzaurin_pytorch-widedeep_readme_3eaa97b40d46.png\">\n\u003C\u002Fp>\n\n```python\nfrom pytorch_widedeep.preprocessing import TabPreprocessor, TextPreprocessor\nfrom pytorch_widedeep.models import TabMlp, BasicRNN, WideDeep, ModelFuser\nfrom pytorch_widedeep import Trainer\n\n# 表格数据\ntab_preprocessor = TabPreprocessor(\n    embed_cols=[\"city\", \"name\"], continuous_cols=[\"age\", \"height\"]\n)\nX_tab = tab_preprocessor.fit_transform(df)\ntab_mlp = TabMlp(\n    column_idx=tab_preprocessor.column_idx,\n    cat_embed_input=tab_preprocessor.cat_embed_input,\n    continuous_cols=tab_preprocessor.continuous_cols,\n    mlp_hidden_dims=[64, 32],\n)\n\n# 文本数据\ntext_preprocessor_1 = TextPreprocessor(\n    text_col=\"sentence\", maxlen=20, max_vocab=100, n_cpus=1\n)\nX_text_1 = text_preprocessor_1.fit_transform(df)\ntext_preprocessor_2 = TextPreprocessor(\n    text_col=\"other_sentence\", maxlen=20, max_vocab=100, n_cpus=1\n)\nX_text_2 = text_preprocessor_2.fit_transform(df)\n\nrnn_1 = BasicRNN(\n    vocab_size=len(text_preprocessor_1.vocab.itos),\n    embed_dim=16,\n    hidden_dim=8,\n    n_layers=1,\n)\nrnn_2 = BasicRNN(\n    vocab_size=len(text_preprocessor_2.vocab.itos),\n    embed_dim=16,\n    hidden_dim=8,\n    n_layers=1,\n)\n\nmodels_fuser = ModelFuser(models=[rnn_1, rnn_2], fusion_method=\"mult\")\n\n# WideDeep\nmodel = WideDeep(deeptabular=tab_mlp, deeptext=models_fuser)\n\n# 训练\ntrainer = Trainer(model, objective=\"binary\")\n\ntrainer.fit(\n    X_tab=X_tab,\n    X_text=[X_text_1, X_text_2],\n    target=df[\"target\"].values,\n    n_epochs=1,\n    batch_size=32,\n)\n```\n\n**6. 
表格数据、多个文本列以及一张图像列。文本列先通过库中的 `ModelFuser` 进行融合，再通过 `WideDeep` 中的 `deephead` 参数（由用户自定义的 `ModelFuser`）将所有模态进一步融合**\n\n这种方法可能稍显不够优雅，因为它涉及用户自定义组件以及对“输入”张量的切片操作。未来我们将推出专门的 `TextAndImageModelFuser`，以使这一流程更加简便。不过，这其实并不复杂，也是一个很好的示例，展示了如何在 `pytorch-widedeep` 中使用自定义组件。\n\n需要注意的是，自定义组件的唯一要求是必须具备一个名为 `output_dim` 的属性，用于返回最后一层激活的维度。换句话说，它并不需要继承自 `BaseWDModelComponent`。这个基类的作用只是检查该属性是否存在，从而在内部避免一些类型错误。\n\n\u003Cp align=\"center\">\n    \u003Cimg width=\"600\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fjrzaurin_pytorch-widedeep_readme_d4df294b5164.png\">\n\u003C\u002Fp>\n\n```python\nimport torch\n\nfrom pytorch_widedeep.preprocessing import TabPreprocessor, TextPreprocessor, ImagePreprocessor\nfrom pytorch_widedeep.models import TabMlp, BasicRNN, WideDeep, ModelFuser, Vision\nfrom pytorch_widedeep.models._base_wd_model_component import BaseWDModelComponent\nfrom pytorch_widedeep import Trainer\n\n# 表格数据\ntab_preprocessor = TabPreprocessor(\n    embed_cols=[\"city\", \"name\"], continuous_cols=[\"age\", \"height\"]\n)\nX_tab = tab_preprocessor.fit_transform(df)\ntab_mlp = TabMlp(\n    column_idx=tab_preprocessor.column_idx,\n    cat_embed_input=tab_preprocessor.cat_embed_input,\n    continuous_cols=tab_preprocessor.continuous_cols,\n    mlp_hidden_dims=[16, 8],\n)\n\n# 文本数据\ntext_preprocessor_1 = TextPreprocessor(\n    text_col=\"sentence\", maxlen=20, max_vocab=100, n_cpus=1\n)\nX_text_1 = text_preprocessor_1.fit_transform(df)\ntext_preprocessor_2 = TextPreprocessor(\n    text_col=\"other_sentence\", maxlen=20, max_vocab=100, n_cpus=1\n)\nX_text_2 = text_preprocessor_2.fit_transform(df)\nrnn_1 = BasicRNN(\n    vocab_size=len(text_preprocessor_1.vocab.itos),\n    embed_dim=16,\n    hidden_dim=8,\n    n_layers=1,\n)\nrnn_2 = BasicRNN(\n    vocab_size=len(text_preprocessor_2.vocab.itos),\n    embed_dim=16,\n    hidden_dim=8,\n    n_layers=1,\n)\nmodels_fuser = ModelFuser(\n    models=[rnn_1, rnn_2],\n    fusion_method=\"mult\",\n)\n\n# 图像数据\nimage_preprocessor = 
ImagePreprocessor(img_col=\"image_name\", img_path=\"images\")\nX_img = image_preprocessor.fit_transform(df)\nvision = Vision(pretrained_model_setup=\"resnet18\", head_hidden_dims=[16, 8])\n\n# deephead（自定义模型融合器）\nclass MyModelFuser(BaseWDModelComponent):\n    \"\"\"\n    只是在文本和图像之上添加一个线性层加ReLU的序列，随后再接一个\n    线性层 -> ReLU -> 线性层的结构，用于将张量中的表格切片与文本和图像序列模型的输出进行拼接。\n    \"\"\"\n    def __init__(\n        self,\n        tab_incoming_dim: int,\n        text_incoming_dim: int,\n        image_incoming_dim: int,\n        output_units: int,\n    ):\n\n        super(MyModelFuser, self).__init__()\n\n        self.tab_incoming_dim = tab_incoming_dim\n        self.text_incoming_dim = text_incoming_dim\n        self.image_incoming_dim = image_incoming_dim\n        self.output_units = output_units\n        self.text_and_image_fuser = torch.nn.Sequential(\n            torch.nn.Linear(text_incoming_dim + image_incoming_dim, output_units),\n            torch.nn.ReLU(),\n        )\n        self.out = torch.nn.Sequential(\n            torch.nn.Linear(output_units + tab_incoming_dim, output_units * 4),\n            torch.nn.ReLU(),\n            torch.nn.Linear(output_units * 4, output_units),\n        )\n\n    def forward(self, X: torch.Tensor) -> torch.Tensor:\n        tab_slice = slice(0, self.tab_incoming_dim)\n        text_slice = slice(\n            self.tab_incoming_dim, self.tab_incoming_dim + self.text_incoming_dim\n        )\n        image_slice = slice(\n            self.tab_incoming_dim + self.text_incoming_dim,\n            self.tab_incoming_dim + self.text_incoming_dim + self.image_incoming_dim,\n        )\n        X_tab = X[:, tab_slice]\n        X_text = X[:, text_slice]\n        X_img = X[:, image_slice]\n        X_text_and_image = self.text_and_image_fuser(torch.cat([X_text, X_img], dim=1))\n        return self.out(torch.cat([X_tab, X_text_and_image], dim=1))\n\n    @property\n    def output_dim(self):\n        return self.output_units\n\n\ndeephead = MyModelFuser(\n    
tab_incoming_dim=tab_mlp.output_dim,\n    text_incoming_dim=models_fuser.output_dim,\n    image_incoming_dim=vision.output_dim,\n    output_units=8,\n)\n\n# WideDeep\nmodel = WideDeep(\n    deeptabular=tab_mlp,\n    deeptext=models_fuser,\n    deepimage=vision,\n    deephead=deephead,\n)\n\n# 训练\ntrainer = Trainer(model, objective=\"binary\")\n\ntrainer.fit(\n    X_tab=X_tab,\n    X_text=[X_text_1, X_text_2],\n    X_img=X_img,\n    target=df[\"target\"].values,\n    n_epochs=1,\n    batch_size=32,\n)\n```\n\n**7. 双塔模型**\n\n这是一种在推荐系统中非常流行的模型。假设我们有一个由三元组（用户特征、物品特征、目标）组成的表格数据集。我们可以构建一个双塔模型，其中用户特征和物品特征分别通过两个独立的模型处理，然后通过点积进行“融合”。\n\n\u003Cp align=\"center\">\n  \u003Cimg width=\"350\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fjrzaurin_pytorch-widedeep_readme_1531c49bcc26.png\">\n\u003C\u002Fp>\n\n```python\nimport numpy as np\nimport pandas as pd\n\nfrom pytorch_widedeep import Trainer\nfrom pytorch_widedeep.preprocessing import TabPreprocessor\nfrom pytorch_widedeep.models import TabMlp, WideDeep, ModelFuser\n\n# 让我们创建交互数据集\n# 用户特征数据框\nnp.random.seed(42)\nuser_ids = np.arange(1, 101)\nages = np.random.randint(18, 60, size=100)\ngenders = np.random.choice([\"male\", \"female\"], size=100)\nlocations = np.random.choice([\"city_a\", \"city_b\", \"city_c\", \"city_d\"], size=100)\nuser_features = pd.DataFrame(\n    {\"id\": user_ids, \"age\": ages, \"gender\": genders, \"location\": locations}\n)\n\n# 物品特征数据框\nitem_ids = np.arange(1, 101)\nprices = np.random.uniform(10, 500, size=100).round(2)\ncolors = np.random.choice([\"red\", \"blue\", \"green\", \"black\"], size=100)\ncategories = np.random.choice([\"electronics\", \"clothing\", \"home\", \"toys\"], size=100)\n\nitem_features = pd.DataFrame(\n    {\"id\": item_ids, \"price\": prices, \"color\": colors, \"category\": categories}\n)\n\n# 交互数据框\ninteraction_user_ids = np.random.choice(user_ids, size=1000)\ninteraction_item_ids = np.random.choice(item_ids, size=1000)\npurchased = np.random.choice([0, 1], 
size=1000, p=[0.7, 0.3])\ninteractions = pd.DataFrame(\n    {\n        \"user_id\": interaction_user_ids,\n        \"item_id\": interaction_item_ids,\n        \"purchased\": purchased,\n    }\n)\nuser_item_purchased = interactions.merge(\n    user_features, left_on=\"user_id\", right_on=\"id\"\n).merge(item_features, left_on=\"item_id\", right_on=\"id\")\n\n# 用户\ntab_preprocessor_user = TabPreprocessor(\n    cat_embed_cols=[\"gender\", \"location\"],\n    continuous_cols=[\"age\"],\n)\nX_user = tab_preprocessor_user.fit_transform(user_item_purchased)\ntab_mlp_user = TabMlp(\n    column_idx=tab_preprocessor_user.column_idx,\n    cat_embed_input=tab_preprocessor_user.cat_embed_input,\n    continuous_cols=[\"age\"],\n    mlp_hidden_dims=[16, 8],\n    mlp_dropout=[0.2, 0.2],\n)\n\n# 物品\ntab_preprocessor_item = TabPreprocessor(\n    cat_embed_cols=[\"color\", \"category\"],\n    continuous_cols=[\"price\"],\n)\nX_item = tab_preprocessor_item.fit_transform(user_item_purchased)\ntab_mlp_item = TabMlp(\n    column_idx=tab_preprocessor_item.column_idx,\n    cat_embed_input=tab_preprocessor_item.cat_embed_input,\n    continuous_cols=[\"price\"],\n    mlp_hidden_dims=[16, 8],\n    mlp_dropout=[0.2, 0.2],\n)\n\ntwo_tower_model = ModelFuser([tab_mlp_user, tab_mlp_item], fusion_method=\"dot\")\n\nmodel = WideDeep(deeptabular=two_tower_model)\n\ntrainer = Trainer(model, objective=\"binary\")\n\ntrainer.fit(\n    X_tab=[X_user, X_item],\n    target=interactions.purchased.values,\n    n_epochs=1,\n    batch_size=32,\n)\n```\n\n**8. 
具有多目标损失的表格数据**\n\n这个例子更像是一个“附加内容”，用来展示多目标损失的使用，而不是真正意义上的不同架构。\n\n\u003Cp align=\"center\">\n  \u003Cimg width=\"200\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fjrzaurin_pytorch-widedeep_readme_2d6c122c5cbe.png\">\n\u003C\u002Fp>\n\n```python\nimport random\n\nfrom pytorch_widedeep.preprocessing import TabPreprocessor, TextPreprocessor, ImagePreprocessor\nfrom pytorch_widedeep.models import TabMlp, BasicRNN, WideDeep, ModelFuser, Vision\nfrom pytorch_widedeep.losses_multitarget import MultiTargetClassificationLoss\nfrom pytorch_widedeep.models._base_wd_model_component import BaseWDModelComponent\nfrom pytorch_widedeep import Trainer\n\n# 让我们在数据框中添加第二个目标\ndf[\"target2\"] = [random.choice([0, 1]) for _ in range(100)]\n\n# 表格数据\ntab_preprocessor = TabPreprocessor(\n    cat_embed_cols=[\"city\", \"name\"], continuous_cols=[\"age\", \"height\"]\n)\nX_tab = tab_preprocessor.fit_transform(df)\ntab_mlp = TabMlp(\n    column_idx=tab_preprocessor.column_idx,\n    cat_embed_input=tab_preprocessor.cat_embed_input,\n    continuous_cols=tab_preprocessor.continuous_cols,\n    mlp_hidden_dims=[64, 32],\n)\n\n# 'pred_dim=2' 因为我们有两个二分类目标。对于其他类型的目标，\n# 请参阅文档\nmodel = WideDeep(deeptabular=tab_mlp, pred_dim=2)\n\nloss = MultiTargetClassificationLoss(binary_config=[0, 1], reduction=\"mean\")\n\n# 当使用多目标损失时，'custom_loss_function' 不得为 None。参阅文档\ntrainer = Trainer(model, objective=\"multitarget\", custom_loss_function=loss)\n\ntrainer.fit(\n    X_tab=X_tab,\n    target=df[[\"target\", \"target2\"]].values,\n    n_epochs=1,\n    batch_size=32,\n)\n```\n\n### ``deeptabular`` 组件\n\n需要再次强调的是，**`wide`、`deeptabular`、`deeptext` 和 `deepimage` 这四个组件可以独立使用**。例如，用户可以选择仅使用 `wide` 组件，它实际上就是一个线性模型。事实上，``pytorch-widedeep`` 最有趣的功能之一就是单独使用 `deeptabular` 组件，也就是通常所说的面向表格数据的深度学习。目前，该组件提供了以下几种模型：\n\n0. **Wide**：一个简单的线性模型，通过之前提到的交叉项变换来捕捉非线性关系。\n1. **TabMlp**：一个简单的多层感知机，接收表示分类特征的嵌入向量，并将其与连续特征拼接起来；连续特征也可以被嵌入。\n2. **TabResnet**：与前一种模型类似，但嵌入向量会先经过一系列由全连接层构建的 ResNet 块处理。\n3. 
**TabNet**：关于 TabNet 的详细信息可在 [TabNet: 用于表格数据的可解释注意力机制学习](https:\u002F\u002Farxiv.org\u002Fabs\u002F1908.07442) 中找到。\n\n此外，还有两种基于注意力机制的简化模型，我们分别称为：\n\n4. **ContextAttentionMLP**：在 MLP 的基础上增加了一种基于 [用于文档分类的层次化注意力网络](https:\u002F\u002Fwww.cs.cmu.edu\u002F~hovy\u002Fpapers\u002F16HLT-hierarchical-attention-networks.pdf) 的注意力机制。\n5. **SelfAttentionMLP**：在 MLP 中引入了一种简化的自注意力机制，类似于我们称之为“查询-键自注意力”的 Transformer 块。\n\n接下来是针对表格数据的 Transformer 系列模型：\n\n6. **TabTransformer**：关于 TabTransformer 的详细信息可在 [TabTransformer：利用上下文嵌入进行表格数据建模](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2012.06678.pdf) 中找到。\n7. **SAINT**：关于 SAINT 的详细信息可在 [SAINT：通过行级注意力和对比学习预训练改进表格数据神经网络](https:\u002F\u002Farxiv.org\u002Fabs\u002F2106.01342) 中找到。\n8. **FT-Transformer**：关于 FT-Transformer 的详细信息可在 [重新审视用于表格数据的深度学习模型](https:\u002F\u002Farxiv.org\u002Fabs\u002F2106.11959) 中找到。\n9. **TabFastFormer**：将 FastFormer 模型适配到表格数据场景中。关于 FastFormer 的详细信息可在 [FastFormers：用于自然语言理解的高度高效的 Transformer 模型](https:\u002F\u002Farxiv.org\u002Fabs\u002F2010.13382) 中找到。\n10. **TabPerceiver**：将 Perceiver 模型适配到表格数据场景中。关于 Perceiver 的详细信息可在 [Perceiver：基于迭代注意力的通用感知模型](https:\u002F\u002Farxiv.org\u002Fabs\u002F2103.03206) 中找到。\n\n此外，还有基于 [神经网络中的权重不确定性](https:\u002F\u002Farxiv.org\u002Fabs\u002F1505.05424) 的概率深度学习模型，适用于表格数据：\n\n11. **BayesianWide**：对 `Wide` 模型的概率化改造。\n12. **BayesianTabMlp**：对 `TabMlp` 模型的概率化改造。\n\n需要注意的是，虽然 TabTransformer、SAINT 和 FT-Transformer 都有相关的学术论文发表，但 TabFastFormer 和 TabPerceiver 是我们团队针对表格数据对这些算法所做的改编。\n\n另外，除 `TabPerceiver` 外，所有 `deeptabular` 模型都支持自监督预训练。自监督预训练可以通过两种方式实现，我们称之为编码器-解码器方法和对比去噪方法。有关此功能及其他库内选项的详细信息，请参阅文档和示例。\n\n### ``rec`` 模块\n\n该模块作为现有组件的扩展被引入，旨在解决推荐系统相关的问题。尽管目前仍在积极开发中，但它已经包含了一些强大的推荐模型。\n\n值得注意的是，此前该库已支持使用现有组件实现多种推荐算法。例如，Wide & Deep、Two-Tower 或神经协同过滤等模型都可以通过库的核心功能轻松构建。\n\n``rec`` 模块中包含的推荐算法如下：\n\n1. [AutoInt：基于自注意力神经网络的自动特征交互学习](https:\u002F\u002Farxiv.org\u002Fabs\u002F1810.11921)\n2. [DeepFM：基于因子分解机的神经网络CTR预测模型](https:\u002F\u002Farxiv.org\u002Fabs\u002F1703.04247)\n3. 
（Deep）Field Aware Factorization Machine (FFM)：这是 [真实在线广告系统中的领域感知因子分解机](https:\u002F\u002Farxiv.org\u002Fabs\u002F1701.04099) 中提出的算法的深度学习版本。\n4. [xDeepFM：结合显式与隐式特征交互的推荐系统](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1803.05170)\n5. [用于点击率预测的深度兴趣网络](https:\u002F\u002Farxiv.org\u002Fabs\u002F1706.06978)\n6. [用于广告点击预测的深度与交叉网络](https:\u002F\u002Farxiv.org\u002Fabs\u002F1708.05123)\n7. [DCN V2：改进的深度与交叉网络及大规模排序学习系统的实践经验](https:\u002F\u002Farxiv.org\u002Fabs\u002F2008.13535)\n8. [迈向更深层、更轻量且可解释的点击率预测](https:\u002F\u002Farxiv.org\u002Fabs\u002F2311.04635)\n9. 一种基于 Transformer 的基础推荐模型，将问题视为序列建模任务。\n\n有关如何使用这些模型的详细信息，请参阅示例。\n\n### 文本与图像处理\n\n对于文本处理组件 `deeptext`，该库提供了以下几种模型：\n\n1. **BasicRNN**：一个简单的循环神经网络。  \n2. **AttentiveRNN**：基于 [用于文档分类的层次化注意力网络](https:\u002F\u002Fwww.cs.cmu.edu\u002F~hovy\u002Fpapers\u002F16HLT-hierarchical-attention-networks.pdf) 的带有注意力机制的 RNN。  \n3. **StackedAttentiveRNN**：多层 AttentiveRNN 的堆叠结构。  \n4. **HFModel**：对 Hugging Face Transformer 模型的封装接口。目前仅支持 BERT、RoBERTa、DistilBERT、ALBERT 和 ELECTRA 等系列模型。这是因为该库主要针对分类和回归任务设计，而这些模型是最为流行的仅编码器架构，已被证明在这些任务中表现最佳。如果未来有需求，其他模型也将被纳入支持范围。\n\n对于图像处理组件 `deepimage`，该库支持以下模型家族：\n'resnet'、'shufflenet'、'resnext'、'wide_resnet'、'regnet'、'densenet'、'mobilenetv3'、'mobilenetv2'、'mnasnet'、'efficientnet' 和 'squeezenet'。这些模型通过 `torchvision` 提供，并被封装在 `Vision` 类中。\n\n### 安装\n\n使用 pip 安装：\n\n```bash\npip install pytorch-widedeep\n```\n\n或者直接从 GitHub 安装：\n\n```bash\npip install git+https:\u002F\u002Fgithub.com\u002Fjrzaurin\u002Fpytorch-widedeep.git\n```\n\n#### 开发者安装\n\n```bash\n# 克隆仓库\ngit clone https:\u002F\u002Fgithub.com\u002Fjrzaurin\u002Fpytorch-widedeep\ncd pytorch-widedeep\n\n# 以开发模式安装\npip install -e .\n```\n\n### 快速入门\n\n以下是一个使用 `Wide` 和 `TabMlp` 模型，并采用默认设置，对 [adult 数据集](https:\u002F\u002Fwww.kaggle.com\u002Fwenruliu\u002Fadult-income-dataset) 进行二分类的端到端示例。\n\n使用 `pytorch-widedeep` 构建一个宽（线性）模型和深度模型：\n\n```python\nimport numpy as np\nimport torch\nfrom sklearn.model_selection import train_test_split\n\nfrom 
pytorch_widedeep import Trainer\nfrom pytorch_widedeep.preprocessing import WidePreprocessor, TabPreprocessor\nfrom pytorch_widedeep.models import Wide, TabMlp, WideDeep\nfrom pytorch_widedeep.metrics import Accuracy\nfrom pytorch_widedeep.datasets import load_adult\n\n\ndf = load_adult(as_frame=True)\ndf[\"income_label\"] = (df[\"income\"].apply(lambda x: \">50K\" in x)).astype(int)\ndf.drop(\"income\", axis=1, inplace=True)\ndf_train, df_test = train_test_split(df, test_size=0.2, stratify=df.income_label)\n\n# 定义特征列配置\nwide_cols = [\n    \"education\",\n    \"relationship\",\n    \"workclass\",\n    \"occupation\",\n    \"native-country\",\n    \"gender\",\n]\ncrossed_cols = [(\"education\", \"occupation\"), (\"native-country\", \"occupation\")]\n\ncat_embed_cols = [\n    \"workclass\",\n    \"education\",\n    \"marital-status\",\n    \"occupation\",\n    \"relationship\",\n    \"race\",\n    \"gender\",\n    \"capital-gain\",\n    \"capital-loss\",\n    \"native-country\",\n]\ncontinuous_cols = [\"age\", \"hours-per-week\"]\ntarget = \"income_label\"\ntarget = df_train[target].values\n\n# 准备数据\nwide_preprocessor = WidePreprocessor(wide_cols=wide_cols, crossed_cols=crossed_cols)\nX_wide = wide_preprocessor.fit_transform(df_train)\n\ntab_preprocessor = TabPreprocessor(\n    cat_embed_cols=cat_embed_cols, continuous_cols=continuous_cols  # type: ignore[arg-type]\n)\nX_tab = tab_preprocessor.fit_transform(df_train)\n\n# 构建模型\nwide = Wide(input_dim=np.unique(X_wide).shape[0], pred_dim=1)\ntab_mlp = TabMlp(\n    column_idx=tab_preprocessor.column_idx,\n    cat_embed_input=tab_preprocessor.cat_embed_input,\n    continuous_cols=continuous_cols,\n)\nmodel = WideDeep(wide=wide, deeptabular=tab_mlp)\n\n# 训练和验证\ntrainer = Trainer(model, objective=\"binary\", metrics=[Accuracy])\ntrainer.fit(\n    X_wide=X_wide,\n    X_tab=X_tab,\n    target=target,\n    n_epochs=5,\n    batch_size=256,\n)\n\n# 在测试集上进行预测\nX_wide_te = wide_preprocessor.transform(df_test)\nX_tab_te = 
tab_preprocessor.transform(df_test)\npreds = trainer.predict(X_wide=X_wide_te, X_tab=X_tab_te)\n\n# 保存和加载\n\n# 选项1：如果使用了 LRHistory 回调函数，还会保存训练历史和学习率历史\ntrainer.save(path=\"model_weights\", save_state_dict=True)\n\n# 选项2：像保存其他 PyTorch 模型一样保存\ntorch.save(model.state_dict(), \"model_weights\u002Fwd_model.pt\")\n\n# 从这里开始，选项1和选项2是等价的。假设用户已经准备好了数据并定义了新的模型组件：\n# 1. 构建模型\nmodel_new = WideDeep(wide=wide, deeptabular=tab_mlp)\nmodel_new.load_state_dict(torch.load(\"model_weights\u002Fwd_model.pt\"))\n\n# 2. 实例化训练器\ntrainer_new = Trainer(model_new, objective=\"binary\")\n\n# 3. 可以开始训练，也可以直接进行预测\npreds = trainer_new.predict(X_wide=X_wide, X_tab=X_tab, batch_size=32)\n```\n\n当然，你还可以做 **更多** 的事情。请查看 Examples 文件夹、文档或配套文章，以便更好地理解该包的内容及其功能。\n\n### 测试\n\n```\npytest tests\n```\n\n### 如何贡献\n\n请查看 [CONTRIBUTING](https:\u002F\u002Fgithub.com\u002Fjrzaurin\u002Fpytorch-widedeep\u002Fblob\u002Fmaster\u002FCONTRIBUTING.MD) 页面。\n\n### 致谢\n\n本库借鉴了多个其他库的经验，因此我认为在 README 中提及这些库是公平的（代码中也有具体说明）。\n\n`Callbacks` 和 `Initializers` 的结构和代码灵感来源于 [`torchsample`](https:\u002F\u002Fgithub.com\u002Fncullen93\u002Ftorchsample) 库，而后者又部分受到 [`Keras`](https:\u002F\u002Fkeras.io\u002F) 的启发。\n\n本库中的 `TextProcessor` 类使用了 [`fastai`](https:\u002F\u002Fdocs.fast.ai\u002Ftext.transform.html#BaseTokenizer.tokenizer) 的 `Tokenizer` 和 `Vocab`。位于 `utils.fastai_transforms` 的代码是对它们的轻微改编，使其能在本库中正常工作。根据我的经验，他们的 `Tokenizer` 是同类产品中的佼佼者。\n\n本库中的 `ImageProcessor` 类则使用了 Adrian Rosebrock 所著的精彩书籍 [Deep Learning for Computer Vision](https:\u002F\u002Fwww.pyimagesearch.com\u002Fdeep-learning-computer-vision-python-book\u002F)（DL4CV）中的代码。\n\n### 许可证\n\n本作品采用 Apache 2.0 和 MIT 双重许可（或任何后续版本）。您在使用本作品时可以选择其中一种许可证。\n\n`SPDX-License-Identifier: Apache-2.0 AND MIT`\n\n### 引用\n\n#### BibTex\n\n```\n@article{Zaurin_pytorch-widedeep_A_flexible_2023,\nauthor = {Zaurin, Javier Rodriguez and Mulinka, Pavol},\ndoi = {10.21105\u002Fjoss.05027},\njournal = {Journal of Open Source Software},\nmonth = jun,\nnumber = {86},\npages = {5027},\ntitle = 
{{pytorch-widedeep: A flexible package for multimodal deep learning}},\nurl = {https:\u002F\u002Fjoss.theoj.org\u002Fpapers\u002F10.21105\u002Fjoss.05027},\nvolume = {8},\nyear = {2023}\n}\n```\n\n#### APA\n\n```\nZaurin, J. R., & Mulinka, P. (2023). pytorch-widedeep: A flexible package for multimodal deep learning. Journal of Open Source Software, 8(86), 5027.\nhttps:\u002F\u002Fdoi.org\u002F10.21105\u002Fjoss.05027\n```","# pytorch-widedeep 快速上手指南\n\n`pytorch-widedeep` 是一个基于 PyTorch 的灵活深度学习包，旨在处理多模态数据。它基于 Google 的 Wide & Deep 算法进行了调整，能够轻松地将表格数据与文本、图像数据进行组合建模。\n\n## 环境准备\n\n在开始之前，请确保您的开发环境满足以下要求：\n\n*   **操作系统**: Linux, macOS 或 Windows\n*   **Python 版本**: 3.9, 3.10, 3.11 或 3.12\n*   **核心依赖**:\n    *   PyTorch (建议安装最新稳定版)\n    *   pandas, numpy, scikit-learn\n    *   Pillow (用于图像处理)\n    *   tqdm, tabulate\n\n> **提示**：国内用户建议使用清华或阿里镜像源加速 PyTorch 及相关依赖的安装。\n\n## 安装步骤\n\n您可以通过 pip 直接安装稳定版：\n\n```bash\npip install pytorch-widedeep\n```\n\n如果需要安装包含所有可选依赖（如 Hugging Face transformers 支持等）的版本：\n\n```bash\npip install \"pytorch-widedeep[all]\"\n```\n\n**国内加速安装示例（使用清华源）：**\n\n```bash\npip install pytorch-widedeep -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple\n```\n\n## 基本使用\n\n以下示例展示如何构建一个最基础的 **Wide & Deep** 模型，仅包含表格数据（Tabular）和宽组件（Wide）。该流程分为数据预处理、模型定义和训练三个步骤。\n\n### 1. 
准备数据与预处理\n\n首先，我们需要区分“宽”组件所需的列（通常用于交叉特征）和“深”组件所需的列（分类嵌入和连续数值）。\n\n```python\nimport numpy as np\nimport pandas as pd\nfrom pytorch_widedeep.preprocessing import TabPreprocessor, WidePreprocessor\nfrom pytorch_widedeep.models import Wide, TabMlp, WideDeep\nfrom pytorch_widedeep.training import Trainer\n\n# 假设 df 是你的 pandas DataFrame，包含 'city', 'name', 'age', 'height', 'target' 列\n# df = pd.read_csv(\"your_data.csv\")\n\n# --- Wide 组件预处理 ---\nwide_cols = [\"city\"]                # 用于宽模型的列\ncrossed_cols = [(\"city\", \"name\")]   # 交叉特征列\nwide_preprocessor = WidePreprocessor(wide_cols=wide_cols, crossed_cols=crossed_cols)\nX_wide = wide_preprocessor.fit_transform(df)\n\n# 初始化 Wide 模型\nwide = Wide(input_dim=np.unique(X_wide).shape[0])\n\n# --- Deep Tabular 组件预处理 ---\ntab_preprocessor = TabPreprocessor(\n    cat_embed_cols=[\"city\", \"name\"],   # 需要嵌入的分类列\n    continuous_cols=[\"age\", \"height\"]  # 连续数值列\n)\nX_tab = tab_preprocessor.fit_transform(df)\n\n# 初始化 Deep Tabular 模型 (这里使用默认的 TabMlp)\ntab_mlp = TabMlp(\n    column_idx=tab_preprocessor.column_idx,\n    cat_embed_input=tab_preprocessor.cat_embed_input,\n    continuous_cols=tab_preprocessor.continuous_cols,\n    mlp_hidden_dims=[64, 32],          # MLP 隐藏层维度\n)\n```\n\n### 2. 构建模型\n\n将上述两个组件组合到 `WideDeep` 主模型中。\n\n```python\n# 组合模型\nmodel = WideDeep(wide=wide, deeptabular=tab_mlp)\n```\n\n> **注**：该库同样支持组合文本 (`deeptext`) 和图像 (`deepimage`) 组件，只需实例化相应的预处理器和模型（如 `BasicRNN`, `Vision` 等）并传入 `WideDeep` 即可。\n\n### 3. 
训练模型\n\n使用内置的 `Trainer` 进行训练。\n\n```python\n# 初始化训练器，设定任务类型为二分类 (\"binary\")，也可设为 \"regression\" 或多分类\ntrainer = Trainer(model, objective=\"binary\")\n\n# 开始训练\ntrainer.fit(\n    X_wide=X_wide,\n    X_tab=X_tab,\n    target=df[\"target\"].values,\n    n_epochs=5,           # 训练轮数\n    batch_size=32,        # 批次大小\n)\n```\n\n训练完成后，您可以使用 `trainer.predict` 进行预测，或使用 `trainer.load_best_model` 加载验证集表现最佳的模型权重。","某电商公司的数据科学团队正致力于构建一个精准的商品推荐系统，需要同时处理用户画像表格、商品描述文本以及商品展示图片。\n\n### 没有 pytorch-widedeep 时\n- **多模态融合困难**：工程师需手动编写大量底层代码来对齐表格、文本和图像特征，不同数据源的维度匹配极易出错。\n- **模型架构僵化**：难以灵活实现 Google 的 Wide & Deep 算法，无法有效兼顾记忆历史规则（Wide 部分）与泛化新特征（Deep 部分）的能力。\n- **开发周期漫长**：从数据预处理到搭建完整的 PyTorch 训练流水线耗时数周，且维护自定义的多输入模型极其繁琐。\n- **调优成本高昂**：缺乏统一的接口管理多种模态的嵌入层和激活函数，导致超参数搜索和实验复现效率低下。\n\n### 使用 pytorch-widedeep 后\n- **一键多模态集成**：利用其内置组件，仅需几行配置即可将结构化用户数据与非结构化图文数据无缝拼接，自动处理特征交叉。\n- **架构灵活可配**：直接调用预置的 Wide & Deep 架构，轻松调整“宽”侧的记忆能力与“深”侧的泛化深度，快速适配业务需求。\n- **研发效率倍增**：标准化的 API 屏蔽了复杂的底层实现，团队在几天内即可完成从原型验证到生产级模型的部署。\n- **实验迭代加速**：统一的训练循环和评估模块让多模态超参数调优变得简单可控，显著提升了模型最终的点击率预测精度。\n\npytorch-widedeep 通过标准化多模态深度学习流程，让团队能以最低成本释放表格、文本与图像数据的联合价值。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fjrzaurin_pytorch-widedeep_fd8caabb.png","jrzaurin","Javier","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002Fjrzaurin_1ed13fae.jpg",null,"London","jrzaurin@gmail.com","https:\u002F\u002Fgithub.com\u002Fjrzaurin",[80,84,88],{"name":81,"color":82,"percentage":83},"Python","#3572A5",98.1,{"name":85,"color":86,"percentage":87},"JavaScript","#f1e05a",1.5,{"name":89,"color":90,"percentage":91},"CSS","#663399",0.4,1410,197,"2026-04-12T13:48:21","Apache-2.0","未说明",{"notes":98,"python":99,"dependencies":100},"该工具是一个基于 PyTorch 的灵活包，用于结合表格数据、文本和图像进行多模态深度学习。支持多种架构组件（如 TabMlp, BasicRNN 等），并允许使用自定义模型（需具备 output_dim 属性）。文中示例代码展示了如何预处理数据和构建模型，但未明确列出具体的依赖库版本、GPU 硬件要求或内存需求。","3.9, 3.10, 3.11, 
3.12",[101,102,103,104],"torch","pandas","numpy","Pillow",[106,107,14,35,16,15,108],"其他","视频","音频",[110,111,112,113,114,115,116,117,118,119,120,121],"pytorch","tabular-data","text","images","multimodal-deep-learning","pytorch-tabular-data","pytorch-nlp","pytorch-cv","pytorch-transformers","deep-learning","model-hub","python","2026-03-27T02:49:30.150509","2026-04-16T08:13:22.979101",[125,130,135,140,145,150],{"id":126,"question_zh":127,"answer_zh":128,"source_url":129},35526,"在 Windows 上运行代码时遇到报错，如何解决？","在 Windows 系统上，需要在代码开头添加 `if __name__ == '__main__':` 保护块，并在预测时显式指定 `batch_size` 参数。示例代码如下：\n```python\nfrom pytorch_widedeep.datasets import load_adult\n\nif __name__ == '__main__':\n    df = load_adult(as_frame=True)\n    # ... 数据处理代码 ...\n    preds = trainer_new.predict(X_wide=X_wide, X_tab=X_tab, batch_size=256)\n```\n这是因为 Windows 上的多进程启动机制要求主程序入口必须受保护。","https:\u002F\u002Fgithub.com\u002Fjrzaurin\u002Fpytorch-widedeep\u002Fissues\u002F203",{"id":131,"question_zh":132,"answer_zh":133,"source_url":134},35527,"该库是否支持推荐系统（RecSys）场景？如何使用序列特征（如用户交互历史列表）作为输入？","是的，该库支持推荐系统场景。对于包含用户交互物品列表的序列特征，可以参考官方提供的示例代码。维护者已在 `wide_deep_recsys` 分支中添加了相关支持，具体用法可查看以下示例脚本：\nhttps:\u002F\u002Fgithub.com\u002Fjrzaurin\u002Fpytorch-widedeep\u002Ftree\u002Fwide_deep_recsys\u002Fexamples\u002Fscripts\u002Fwide_deep_for_recsys\n该示例展示了如何处理序列索引列表并将其作为模型输入。","https:\u002F\u002Fgithub.com\u002Fjrzaurin\u002Fpytorch-widedeep\u002Fissues\u002F133",{"id":136,"question_zh":137,"answer_zh":138,"source_url":139},35528,"如何将推荐问题转化为分类问题并进行排序？","可以将推荐问题建模为多分类问题（例如 20 个类别代表 20 个物品）。训练完成后，使用模型输出的预测概率作为得分进行排序。例如，选取概率最高的前 10 个物品作为推荐结果。这是一种有效且常用的方法。","https:\u002F\u002Fgithub.com\u002Fjrzaurin\u002Fpytorch-widedeep\u002Fissues\u002F107",{"id":141,"question_zh":142,"answer_zh":143,"source_url":144},35529,"如何在训练过程中保存表现最佳（如 F1 分数最高）的模型 epoch，而不仅仅是损失最低的？","虽然默认配置通常基于 Loss 保存模型，但你可以通过自定义回调或调整训练器参数来监控其他指标（如 F1 
Score）。如果在结合表格数据和文本数据时发现文本效果被削弱，这通常是因为表格数据本身已包含大部分信息且结构更清晰。若需显著提升效果，建议检查文本是否提供了表格中不存在的新增信息（如评论情感），否则考虑引入图像等多模态数据可能更有效。","https:\u002F\u002Fgithub.com\u002Fjrzaurin\u002Fpytorch-widedeep\u002Fissues\u002F188",{"id":146,"question_zh":147,"answer_zh":148,"source_url":149},35530,"在进行纯文本数据的预测时遇到 'ValueError: cannot select an axis to squeeze out' 错误，如何解决？","该问题已在 v1.6.0 版本中修复。请升级库到最新版本：\n```bash\npip install --upgrade pytorch-widedeep\n```\n该错误通常发生在输出维度处理不当的情况下，新版本已优化了预测阶段的维度压缩逻辑。","https:\u002F\u002Fgithub.com\u002Fjrzaurin\u002Fpytorch-widedeep\u002Fissues\u002F113",{"id":151,"question_zh":152,"answer_zh":153,"source_url":154},35531,"库中是否集成了 Flash Attention 以提升注意力机制的效率？","是的，库中已经集成了高效的注意力机制实现。目前使用的是 PyTorch 原生的 `torch.nn.functional.scaled_dot_product_attention`，它在底层优化了内存和延迟。如果尝试访问 `attention_weights` 可能会抛出错误，这是预期行为。未来版本计划进一步引入 Flash Attention v2 和线性注意力机制。","https:\u002F\u002Fgithub.com\u002Fjrzaurin\u002Fpytorch-widedeep\u002Fissues\u002F132",[156,161,166,171,176,181,186,191,196,201,206,211,216,220,225,230,235,240,245,250],{"id":157,"version":158,"summary_zh":159,"released_at":160},280675,"v.1.7.0","本次发布新增了一系列与 Hugging Face 集成相关的功能。此外，同样重要的是，该库现已支持多 GPU 训练，并兼容 Python 3.12。","2025-09-27T10:17:13",{"id":162,"version":163,"summary_zh":164,"released_at":165},280676,"v.1.6.5","新增对 MPS 后端的支持  \n在 rec 模块中新增了一系列模型：DCN、DCNv2、GDCN、AutoInt、AutoIntPlus  \n新增 DIN 预处理模块  \n修订了文档  \n修订了示例  \n其他大小不一的修复","2024-11-06T17:07:40",{"id":167,"version":168,"summary_zh":169,"released_at":170},280677,"v.1.6.4","* 在库中出现若干与推荐算法相关的问题并在 Slack 上展开讨论之后，我决定加入一个推荐模块，该模块目前包含少量推荐算法。这些算法包括：\n\n- 因子分解机（FM）和 DeepFM\n- 领域感知因子分解机（FFM）和 DeepFFM\n- 极深因子分解机（xDeepFM）\n- 深度兴趣网络（DIN）\n\n我们将在不久的将来添加更多算法。\n\n* 此外，还修复了一些 bug（https:\u002F\u002Fgithub.com\u002Fjrzaurin\u002Fpytorch-widedeep\u002Fissues\u002F232 和 https:\u002F\u002Fgithub.com\u002Fjrzaurin\u002Fpytorch-widedeep\u002Fissues\u002F233）。","2024-09-24T14:56:09",{"id":172,"version":173,"summary_zh":174,"released_at":175},280678,"v.1.6.3","1. 
增加了针对不同列使用多个表格模型的支持（在此前已支持多文本列与多图像列的基础上进一步扩展）。  \n2. 移除了对 FDS 和 LDS 的支持。  \n3. 保留了在 1.6.2 版本中引入的优化器保存功能（该版本生命周期短暂且未正式发布）。","2024-08-26T11:38:36",{"id":177,"version":178,"summary_zh":179,"released_at":180},280679,"v.1.6.1","这是一个将 numpy 依赖修正为 `numpy>=1.21.6, \u003C2.0.0` 的快速补丁。\n\n除此之外，它与 1.6.0 完全相同。","2024-06-17T08:42:41",{"id":182,"version":183,"summary_zh":184,"released_at":185},280680,"v1.6.0","## 变更内容\n* @jrzaurin 在 https:\u002F\u002Fgithub.com\u002Fjrzaurin\u002Fpytorch-widedeep\u002Fpull\u002F209 中实现了 Hugging Face 集成\n* @jrzaurin 在 https:\u002F\u002Fgithub.com\u002Fjrzaurin\u002Fpytorch-widedeep\u002Fpull\u002F215 中增加了对多文本和多图像列的支持\n* @jrzaurin 在 https:\u002F\u002Fgithub.com\u002Fjrzaurin\u002Fpytorch-widedeep\u002Fpull\u002F215 中添加了对多目标损失函数的支持\n* README 文件几乎被完全重写，新增了 7 种可能的模型架构图（其中的方框或组件可以是库中的任意模型），并提供了使用玩具数据集的可完全运行的示例，任何人都可以将其作为起点。\n\n**完整变更日志**: https:\u002F\u002Fgithub.com\u002Fjrzaurin\u002Fpytorch-widedeep\u002Fcompare\u002Fv1.5.1...v1.6.0","2024-06-15T15:08:45",{"id":187,"version":188,"summary_zh":189,"released_at":190},280681,"v1.5.1","主要修复了问题 #204。","2024-04-10T15:57:08",{"id":192,"version":193,"summary_zh":194,"released_at":195},280682,"v.1.5.0","新增了两种数值特征嵌入方法，这些方法在论文《表格型深度学习中数值特征的嵌入》（[On Embeddings for Numerical Features in Tabular Deep Learning](https:\u002F\u002Farxiv.org\u002Fabs\u002F2203.05556)）中有详细描述，并相应地调整了所有模型和功能。","2024-02-17T12:11:44",{"id":197,"version":198,"summary_zh":199,"released_at":200},280683,"v.1.4.0","本次发布主要新增了通过 `load_from_folder` 模块处理大型数据集的功能。\n\n该模块借鉴了 `torchvision` 库中的 `ImageFolder` 类，但已根据我们库的需求进行了适配。详情请参阅文档。","2023-11-17T08:54:42",{"id":202,"version":203,"summary_zh":204,"released_at":205},280684,"v.1.3.2","1. 新增了 [Flash Attention](https:\u002F\u002Fpytorch.org\u002Fdocs\u002Fstable\u002Fgenerated\u002Ftorch.nn.functional.scaled_dot_product_attention.html)  \n2. 新增了 [Linear Attention](https:\u002F\u002Farxiv.org\u002Fabs\u002F2006.16236)  \n3. 
重新审阅并优化了文档","2023-08-06T10:50:00",{"id":207,"version":208,"summary_zh":209,"released_at":210},280685,"v1.3.1","1. Added example scripts and notebooks on how to use the library in the context of recommendation systems using [this notebook](https:\u002F\u002Fwww.kaggle.com\u002Fcode\u002Fmatanivanov\u002Fwide-deep-learning-for-recsys-with-pytorch) as an example. This is a response to issue #133\r\n2. Used the opportunity to add the movielens 100k dataset to the library, so that now it can be imported from the datasets module\r\n3. Added a simple (not pre-trained) transformer model to the text component\r\n4. Added citation file\r\n5. Fixed a bug regarding the padding index not being 1 when using the fastai transforms","2023-07-31T17:25:32",{"id":212,"version":213,"summary_zh":214,"released_at":215},280686,"v1.3.0","* Added a new functionality to access feature importance via attention weights for all DL models for Tabular data except for the `TabPerceiver`. This functionality is accessed via the `feature_importance` attribute in the trainer (computed during training with a sample of observations) and at predict time via the `explain` method.\r\n* Fixed all restore weights capabilities in all forms of training. Such capabilities are present in two callbacks, the `EarlyStopping` and the `ModelCheckpoint` Callbacks. Prior to this release there was a bug and the weights were not restored.","2023-07-21T07:37:36",{"id":217,"version":218,"summary_zh":75,"released_at":219},280687,"joss_paper_package_version_v1.2.0","2023-05-31T15:03:23",{"id":221,"version":222,"summary_zh":223,"released_at":224},280688,"v1.2.2","1. Fixed a bug related to the option of adding a FC head on top of the \"backbone\" models\r\n2. 
Added a notebook to illustrate how one could use a Hugging Face model along with any other model in the library","2023-01-20T09:05:28",{"id":226,"version":227,"summary_zh":228,"released_at":229},280689,"v.1.2.1","Simple minor release fixing the implementation of the additive attention (see #110)","2022-10-07T09:44:03",{"id":231,"version":232,"summary_zh":233,"released_at":234},280690,"v1.2.0","There are a number of changes and new features in this release, here is a summary:\r\n\r\n1. Refactored the code related to the 3 forms of training in the library: \r\n    - Supervised Training (via the `Trainer` class)\r\n    - Self-Supervised pre-training: we have implemented two methods or routines for self-supervised pre-training. These are: \r\n        - Encoder-Decoder Pre-Training (via the `EncoderDecoderTrainer` class): this is inspired by the [TabNet paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F1908.07442)\r\n        - Contrastive-Denoising Pre-Training (via the `ConstrastiveDenoising` class): this is inspired by the [SAINT paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2106.01342)\r\n    - Bayesian or Probabilistic Training (via the `BayesianTrainer` class): this is inspired by the paper [Weight Uncertainty in Neural Networks](https:\u002F\u002Farxiv.org\u002Fabs\u002F1505.05424)\r\n\r\n    Just as a reminder, the current deep learning models for tabular data available in the library are: \r\n    - Wide\r\n    - TabMlp\r\n    - TabResNet\r\n    - [TabNet](https:\u002F\u002Farxiv.org\u002Fabs\u002F1908.07442)\r\n    - [TabTransformer](https:\u002F\u002Farxiv.org\u002Fabs\u002F2012.06678)\r\n    - [FTTransformer](https:\u002F\u002Farxiv.org\u002Fabs\u002F2106.11959v2)\r\n    - [SAINT](https:\u002F\u002Farxiv.org\u002Fabs\u002F2106.01342)\r\n    - [TabFastformer](https:\u002F\u002Farxiv.org\u002Fabs\u002F2108.09084)\r\n    - [TabPerceiver](https:\u002F\u002Farxiv.org\u002Fabs\u002F2103.03206)\r\n    - BayesianWide\r\n    - BayesianTabMlp\r\n\r\n2. 
The text-related component now has 3 available models, all based on RNNs. There are reasons for that, although the integration with the Hugging Face Transformers library is the next step in the development of the library. The 3 models available are: \r\n    - BasicRNN\r\n    - AttentiveRNN\r\n    - StackedAttentiveRNN\r\n    \r\n    The last two are based on [Hierarchical Attention Networks for Document Classification](https:\u002F\u002Fwww.cs.cmu.edu\u002F~hovy\u002Fpapers\u002F16HLT-hierarchical-attention-networks.pdf). See the docs for details.\r\n\r\n3. The image-related component is now fully integrated with the latest [torchvision](https:\u002F\u002Fpytorch.org\u002Fvision\u002Fstable\u002Fmodels.html) release, with a new [Multi-Weight Support API](https:\u002F\u002Fpytorch.org\u002Fblog\u002Fintroducing-torchvision-new-multi-weight-support-api\u002F). Currently, the model variants supported by our library are: \r\n    - resnet\r\n    - shufflenet\r\n    - resnext\r\n    - wide_resnet\r\n    - regnet\r\n    - densenet\r\n    - mobilenet\r\n    - mnasnet\r\n    - efficientnet\r\n    - squeezenet\r\n    ","2022-09-01T13:10:25",{"id":236,"version":237,"summary_zh":238,"released_at":239},280691,"v1.1.2","Simply updated all documentation","2022-08-29T14:53:09",{"id":241,"version":242,"summary_zh":243,"released_at":244},280692,"v1.1.0","This release fixes some minor bugs but mainly brings a couple of new functionalities: \r\n\r\n1. New experimental Attentive models, namely: `ContextAttentionMLP` and `SelfAttentionMLP`. \r\n2. 2 Probabilistic models based on Bayes by Backprop (BBP) as described in [Weight Uncertainty in Neural Networks](https:\u002F\u002Farxiv.org\u002Fabs\u002F1505.05424), namely: `BayesianTabMlp` and `BayesianWide`.\r\n3. Label and Feature Distribution Smoothing (LDS and FDS) for Deep Imbalanced Regression (DIR) as described in [Delving into Deep Imbalanced Regression](https:\u002F\u002Farxiv.org\u002Fabs\u002F2102.09554)\r\n4. 
Better integration with `torchvision` for the `deepimage` component of a `WideDeep` model\r\n5. 3 available models for the `deeptext` component of a `WideDeep` model. Namely: `BasicRNN`, `AttentiveRNN` and `StackedAttentiveRNN`","2022-03-10T16:26:56",{"id":246,"version":247,"summary_zh":248,"released_at":249},280693,"v1.0.10","This minor release simply fixes issue #53, related to the fact that `SAINT`, the `FT-Transformer` and the `TabFastformer` failed when the input data had no categorical columns","2021-10-07T16:04:56",{"id":251,"version":252,"summary_zh":253,"released_at":254},280694,"v1.0.9","**Functionalities**: \r\n\r\n- Added a new functionality called `Tab2Vec` that, given a trained model and a fitted Tabular Preprocessor, will return an input dataframe transformed into embeddings\r\n\r\n\r\n**TabFormers: Increased the `Tabformer` (Transformers for Tabular Data) family**\r\n\r\n- Added a proper implementation of the [FT-Transformer](https:\u002F\u002Farxiv.org\u002Fabs\u002F2106.11959) with Linear Attention (as introduced in the [Linformer](https:\u002F\u002Farxiv.org\u002Fabs\u002F2006.04768) paper)\r\n- Added a TabFastFormer model, an adaptation of the [FastFormer](https:\u002F\u002Farxiv.org\u002Fabs\u002F2108.09084) for Tabular Data\r\n- Added a TabPerceiver model, an adaptation of the [Perceiver](https:\u002F\u002Farxiv.org\u002Fabs\u002F2103.03206) for Tabular Data\r\n\r\n\r\n**Docs**\r\n\r\n- Refined the docs to make them clearer and fixed a few typos\r\n","2021-09-07T10:11:28"]