[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-TylerYep--torchinfo":3,"tool-TylerYep--torchinfo":61},[4,18,26,36,44,53],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":17},4358,"openclaw","openclaw\u002Fopenclaw","OpenClaw 是一款专为个人打造的本地化 AI 助手，旨在让你在自己的设备上拥有完全可控的智能伙伴。它打破了传统 AI 助手局限于特定网页或应用的束缚，能够直接接入你日常使用的各类通讯渠道，包括微信、WhatsApp、Telegram、Discord、iMessage 等数十种平台。无论你在哪个聊天软件中发送消息，OpenClaw 都能即时响应，甚至支持在 macOS、iOS 和 Android 设备上进行语音交互，并提供实时的画布渲染功能供你操控。\n\n这款工具主要解决了用户对数据隐私、响应速度以及“始终在线”体验的需求。通过将 AI 部署在本地，用户无需依赖云端服务即可享受快速、私密的智能辅助，真正实现了“你的数据，你做主”。其独特的技术亮点在于强大的网关架构，将控制平面与核心助手分离，确保跨平台通信的流畅性与扩展性。\n\nOpenClaw 非常适合希望构建个性化工作流的技术爱好者、开发者，以及注重隐私保护且不愿被单一生态绑定的普通用户。只要具备基础的终端操作能力（支持 macOS、Linux 及 Windows WSL2），即可通过简单的命令行引导完成部署。如果你渴望拥有一个懂你",349277,3,"2026-04-06T06:32:30",[13,14,15,16],"Agent","开发框架","图像","数据工具","ready",{"id":19,"name":20,"github_repo":21,"description_zh":22,"stars":23,"difficulty_score":10,"last_commit_at":24,"category_tags":25,"status":17},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,"2026-04-05T11:01:52",[14,15,13],{"id":27,"name":28,"github_repo":29,"description_zh":30,"stars":31,"difficulty_score":32,"last_commit_at":33,"category_tags":34,"status":17},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 
代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",147882,2,"2026-04-09T11:32:47",[14,13,35],"语言模型",{"id":37,"name":38,"github_repo":39,"description_zh":40,"stars":41,"difficulty_score":32,"last_commit_at":42,"category_tags":43,"status":17},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",108111,"2026-04-08T11:23:26",[14,15,13],{"id":45,"name":46,"github_repo":47,"description_zh":48,"stars":49,"difficulty_score":32,"last_commit_at":50,"category_tags":51,"status":17},4721,"markitdown","microsoft\u002Fmarkitdown","MarkItDown 是一款由微软 AutoGen 团队打造的轻量级 Python 工具，专为将各类文件高效转换为 Markdown 格式而设计。它支持 PDF、Word、Excel、PPT、图片（含 OCR）、音频（含语音转录）、HTML 乃至 YouTube 链接等多种格式的解析，能够精准提取文档中的标题、列表、表格和链接等关键结构信息。\n\n在人工智能应用日益普及的今天，大语言模型（LLM）虽擅长处理文本，却难以直接读取复杂的二进制办公文档。MarkItDown 恰好解决了这一痛点，它将非结构化或半结构化的文件转化为模型“原生理解”且 Token 效率极高的 Markdown 格式，成为连接本地文件与 AI 分析 pipeline 的理想桥梁。此外，它还提供了 MCP（模型上下文协议）服务器，可无缝集成到 Claude Desktop 等 LLM 应用中。\n\n这款工具特别适合开发者、数据科学家及 AI 研究人员使用，尤其是那些需要构建文档检索增强生成（RAG）系统、进行批量文本分析或希望让 AI 
助手直接“阅读”本地文件的用户。虽然生成的内容也具备一定可读性，但其核心优势在于为机器",93400,"2026-04-06T19:52:38",[52,14],"插件",{"id":54,"name":55,"github_repo":56,"description_zh":57,"stars":58,"difficulty_score":10,"last_commit_at":59,"category_tags":60,"status":17},4487,"LLMs-from-scratch","rasbt\u002FLLMs-from-scratch","LLMs-from-scratch 是一个基于 PyTorch 的开源教育项目，旨在引导用户从零开始一步步构建一个类似 ChatGPT 的大型语言模型（LLM）。它不仅是同名技术著作的官方代码库，更提供了一套完整的实践方案，涵盖模型开发、预训练及微调的全过程。\n\n该项目主要解决了大模型领域“黑盒化”的学习痛点。许多开发者虽能调用现成模型，却难以深入理解其内部架构与训练机制。通过亲手编写每一行核心代码，用户能够透彻掌握 Transformer 架构、注意力机制等关键原理，从而真正理解大模型是如何“思考”的。此外，项目还包含了加载大型预训练权重进行微调的代码，帮助用户将理论知识延伸至实际应用。\n\nLLMs-from-scratch 特别适合希望深入底层原理的 AI 开发者、研究人员以及计算机专业的学生。对于不满足于仅使用 API，而是渴望探究模型构建细节的技术人员而言，这是极佳的学习资源。其独特的技术亮点在于“循序渐进”的教学设计：将复杂的系统工程拆解为清晰的步骤，配合详细的图表与示例，让构建一个虽小但功能完备的大模型变得触手可及。无论你是想夯实理论基础，还是为未来研发更大规模的模型做准备",90106,"2026-04-06T11:19:32",[35,15,13,14],{"id":62,"github_repo":63,"name":64,"description_en":65,"description_zh":66,"ai_summary_zh":66,"readme_en":67,"readme_zh":68,"quickstart_zh":69,"use_case_zh":70,"hero_image_url":71,"owner_login":72,"owner_name":73,"owner_avatar_url":74,"owner_bio":75,"owner_company":76,"owner_location":77,"owner_email":78,"owner_twitter":79,"owner_website":80,"owner_url":81,"languages":82,"stars":91,"forks":92,"last_commit_at":93,"license":94,"difficulty_score":95,"env_os":96,"env_gpu":97,"env_ram":97,"env_deps":98,"category_tags":103,"github_topics":104,"view_count":32,"oss_zip_url":113,"oss_zip_packed_at":113,"status":17,"created_at":114,"updated_at":115,"faqs":116,"releases":146},5954,"TylerYep\u002Ftorchinfo","torchinfo","View model summaries in PyTorch!","torchinfo 是一款专为 PyTorch 开发者设计的模型分析工具，旨在提供比原生 `print(model)` 更详尽、直观的神经网络结构摘要。它解决了开发者在调试复杂网络时难以快速掌握层级细节、参数量及计算开销的痛点，功能上类似于 TensorFlow 中的 `model.summary()`。\n\n无论是深度学习研究人员还是工程开发者，都能通过 torchinfo 轻松查看每一层的输入输出形状、参数数量以及乘加运算量（Mult-Adds）。此外，它还能估算模型在前向传播和反向传播过程中的内存占用，帮助优化资源分配。\n\n该工具的独特亮点在于其强大的兼容性：不仅支持卷积网络，还完美适配 RNN、LSTM 等递归层及具有分支结构的复杂模型。它提供了可配置的表格行列选项，并能返回包含完整统计数据的对象，便于程序化调用。在 Jupyter Notebook 或 Google Colab 
环境中，torchinfo 也能无缝集成，直接渲染清晰的可视化表格。只需几行代码，用户即可获得专业的模型洞察，是构建和优化深度学习架构时的得力助手。","# torchinfo\n\n[![Python 3.8+](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fpython-3.8+-blue.svg)](https:\u002F\u002Fwww.python.org\u002Fdownloads\u002Frelease\u002Fpython-380\u002F)\n[![PyPI version](https:\u002F\u002Fbadge.fury.io\u002Fpy\u002Ftorchinfo.svg)](https:\u002F\u002Fbadge.fury.io\u002Fpy\u002Ftorchinfo)\n[![Conda version](https:\u002F\u002Fimg.shields.io\u002Fconda\u002Fvn\u002Fconda-forge\u002Ftorchinfo)](https:\u002F\u002Fanaconda.org\u002Fconda-forge\u002Ftorchinfo)\n[![Build Status](https:\u002F\u002Fgithub.com\u002FTylerYep\u002Ftorchinfo\u002Factions\u002Fworkflows\u002Ftest.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002FTylerYep\u002Ftorchinfo\u002Factions\u002Fworkflows\u002Ftest.yml)\n[![pre-commit.ci status](https:\u002F\u002Fresults.pre-commit.ci\u002Fbadge\u002Fgithub\u002FTylerYep\u002Ftorchinfo\u002Fmain.svg)](https:\u002F\u002Fresults.pre-commit.ci\u002Flatest\u002Fgithub\u002FTylerYep\u002Ftorchinfo\u002Fmain)\n[![GitHub license](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Flicense\u002FTylerYep\u002Ftorchinfo)](https:\u002F\u002Fgithub.com\u002FTylerYep\u002Ftorchinfo\u002Fblob\u002Fmain\u002FLICENSE)\n[![codecov](https:\u002F\u002Fcodecov.io\u002Fgh\u002FTylerYep\u002Ftorchinfo\u002Fbranch\u002Fmain\u002Fgraph\u002Fbadge.svg)](https:\u002F\u002Fcodecov.io\u002Fgh\u002FTylerYep\u002Ftorchinfo)\n[![Downloads](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FTylerYep_torchinfo_readme_1a294af4e63f.png)](https:\u002F\u002Fpepy.tech\u002Fproject\u002Ftorchinfo)\n\n(formerly torch-summary)\n\nTorchinfo provides information complementary to what is provided by `print(your_model)` in PyTorch, similar to Tensorflow's `model.summary()` API to view the visualization of the model, which is helpful while debugging your network. 
In this project, we implement a similar functionality in PyTorch and create a clean, simple interface to use in your projects.\n\nThis is a completely rewritten version of the original torchsummary and torchsummaryX projects by @sksq96 and @nmhkahn. This project addresses all of the issues and pull requests left on the original projects by introducing a completely new API.\n\nSupports PyTorch versions 1.4.0+.\n\n# Usage\n\n```\npip install torchinfo\n```\n\nAlternatively, via conda:\n\n```\nconda install -c conda-forge torchinfo\n```\n\n# How To Use\n\n```python\nfrom torchinfo import summary\n\nmodel = ConvNet()\nbatch_size = 16\nsummary(model, input_size=(batch_size, 1, 28, 28))\n```\n\n```\n================================================================================================================\nLayer (type:depth-idx)          Input Shape          Output Shape         Param #            Mult-Adds\n================================================================================================================\nSingleInputNet                  [7, 1, 28, 28]       [7, 10]              --                 --\n├─Conv2d: 1-1                   [7, 1, 28, 28]       [7, 10, 24, 24]      260                1,048,320\n├─Conv2d: 1-2                   [7, 10, 12, 12]      [7, 20, 8, 8]        5,020              2,248,960\n├─Dropout2d: 1-3                [7, 20, 8, 8]        [7, 20, 8, 8]        --                 --\n├─Linear: 1-4                   [7, 320]             [7, 50]              16,050             112,350\n├─Linear: 1-5                   [7, 50]              [7, 10]              510                3,570\n================================================================================================================\nTotal params: 21,840\nTrainable params: 21,840\nNon-trainable params: 0\nTotal mult-adds (M): 3.41\n================================================================================================================\nInput size (MB): 
0.02\nForward\u002Fbackward pass size (MB): 0.40\nParams size (MB): 0.09\nEstimated Total Size (MB): 0.51\n================================================================================================================\n```\n\n\u003C!-- single_input_all_cols.out -->\n\nNote: if you are using a Jupyter Notebook or Google Colab, `summary(model, ...)` must be the returned value of the cell.\nIf it is not, you should wrap the summary in a print(), e.g. `print(summary(model, ...))`.\nSee `tests\u002Fjupyter_test.ipynb` for examples.\n\n**This version now supports:**\n\n- RNNs, LSTMs, and other recursive layers\n- Branching output used to explore model layers using specified depths\n- Returns ModelStatistics object containing all summary data fields\n- Configurable rows\u002Fcolumns\n- Jupyter Notebook \u002F Google Colab\n\n**Other new features:**\n\n- Verbose mode to show weights and bias layers\n- Accepts either input data or simply the input shape!\n- Customizable line widths and batch dimension\n- Comprehensive unit\u002Foutput testing, linting, and code coverage testing\n\n**Community Contributions:**\n\n- Sequentials & ModuleLists (thanks to @roym899)\n- Improved Mult-Add calculations (thanks to @TE-StefanUhlich, @zmzhang2000)\n- Dict\u002FMisc input data (thanks to @e-dorigatti)\n- Pruned layer support (thanks to @MajorCarrot)\n\n# Documentation\n\n```python\ndef summary(\n    model: nn.Module,\n    input_size: Optional[INPUT_SIZE_TYPE] = None,\n    input_data: Optional[INPUT_DATA_TYPE] = None,\n    batch_dim: Optional[int] = None,\n    cache_forward_pass: Optional[bool] = None,\n    col_names: Optional[Iterable[str]] = None,\n    col_width: int = 25,\n    depth: int = 3,\n    device: Optional[torch.device] = None,\n    dtypes: Optional[List[torch.dtype]] = None,\n    mode: str = \"same\",\n    row_settings: Optional[Iterable[str]] = None,\n    verbose: int = 1,\n    **kwargs: Any,\n) -> ModelStatistics:\n\"\"\"\nSummarize the given PyTorch model. 
Summarized information includes:\n    1) Layer names,\n    2) input\u002Foutput shapes,\n    3) kernel shape,\n    4) # of parameters,\n    5) # of operations (Mult-Adds),\n    6) whether layer is trainable\n\nNOTE: If neither input_data nor input_size is provided, no forward pass through the\nnetwork is performed, and the provided model information is limited to layer names.\n\nArgs:\n    model (nn.Module):\n            PyTorch model to summarize. The model should be fully in either train()\n            or eval() mode. If layers are not all in the same mode, running summary\n            may have side effects on batchnorm or dropout statistics. If you\n            encounter an issue with this, please open a GitHub issue.\n\n    input_size (Sequence of Sizes):\n            Shape of input data as a List\u002FTuple\u002Ftorch.Size\n            (dtypes must match model input, default is FloatTensors).\n            You should include batch size in the tuple.\n            Default: None\n\n    input_data (Sequence of Tensors):\n            Arguments for the model's forward pass (dtypes inferred).\n            If the forward() function takes several parameters, pass in a list of\n            args or a dict of kwargs (if your forward() function takes in a dict\n            as its only argument, wrap it in a list).\n            Default: None\n\n    batch_dim (int):\n            Batch dimension of input data. If batch_dim is None, assume\n            input_data \u002F input_size contains the batch dimension, which is used\n            in all calculations. Else, expand all tensors to contain the batch_dim.\n            Specifying batch_dim can be a runtime optimization, since if batch_dim\n            is specified, torchinfo uses a batch size of 1 for the forward pass.\n            Default: None\n\n    cache_forward_pass (bool):\n            If True, cache the run of the forward() function using the model\n            class name as the key. 
If the forward pass is an expensive operation,\n            this can make it easier to modify the formatting of your model\n            summary, e.g. changing the depth or enabled column types, especially\n            in Jupyter Notebooks.\n            WARNING: Modifying the model architecture or input data\u002Finput size when\n            this feature is enabled does not invalidate the cache or re-run the\n            forward pass, and can cause incorrect summaries as a result.\n            Default: False\n\n    col_names (Iterable[str]):\n            Specify which columns to show in the output. Currently supported: (\n                \"input_size\",\n                \"output_size\",\n                \"num_params\",\n                \"params_percent\",\n                \"kernel_size\",\n                \"groups\",\n                \"mult_adds\",\n                \"trainable\",\n            )\n            Default: (\"output_size\", \"num_params\")\n            If input_data \u002F input_size are not provided, only \"num_params\" is used.\n\n    col_width (int):\n            Width of each column.\n            Default: 25\n\n    depth (int):\n            Depth of nested layers to display (e.g. Sequentials).\n            Nested layers below this depth will not be displayed in the summary.\n            Default: 3\n\n    device (torch.Device):\n            Uses this torch device for model and input_data.\n            If not specified, uses the dtype of input_data if given, or the\n            parameters of the model. 
Otherwise, uses the result of\n            torch.cuda.is_available().\n            Default: None\n\n    dtypes (List[torch.dtype]):\n            If you use input_size, torchinfo assumes your input uses FloatTensors.\n            If your model uses a different data type, specify that dtype.\n            For multiple inputs, specify the size of both inputs, and\n            also specify the types of each parameter here.\n            Default: None\n\n    mode (str):\n            Either \"train\", \"eval\" or \"same\", which determines whether we call\n            model.train() or model.eval() before calling summary(). In any case,\n            original model mode is restored at the end.\n            Default: \"same\".\n\n    row_settings (Iterable[str]):\n            Specify which features to show in a row. Currently supported: (\n                \"ascii_only\",\n                \"depth\",\n                \"var_names\",\n            )\n            Default: (\"depth\",)\n\n    verbose (int):\n            0 (quiet): No output\n            1 (default): Print model summary\n            2 (verbose): Show weight and bias layers in full detail\n            Default: 1\n            If using a Jupyter Notebook or Google Colab, the default is 0.\n\n    **kwargs:\n            Other arguments used in `model.forward` function. 
Passing *args is no\n            longer supported.\n\nReturn:\n    ModelStatistics object\n            See torchinfo\u002Fmodel_statistics.py for more information.\n\"\"\"\n```\n\n# Examples\n\n## Get Model Summary as String\n\n```python\nfrom torchinfo import summary\n\nmodel_stats = summary(your_model, (1, 3, 28, 28), verbose=0)\nsummary_str = str(model_stats)\n# summary_str contains the string representation of the summary!\n```\n\n## Explore Different Configurations\n\n```python\nclass LSTMNet(nn.Module):\n    def __init__(self, vocab_size=20, embed_dim=300, hidden_dim=512, num_layers=2):\n        super().__init__()\n        self.hidden_dim = hidden_dim\n        self.embedding = nn.Embedding(vocab_size, embed_dim)\n        self.encoder = nn.LSTM(embed_dim, hidden_dim, num_layers=num_layers, batch_first=True)\n        self.decoder = nn.Linear(hidden_dim, vocab_size)\n\n    def forward(self, x):\n        embed = self.embedding(x)\n        out, hidden = self.encoder(embed)\n        out = self.decoder(out)\n        out = out.view(-1, out.size(2))\n        return out, hidden\n\nsummary(\n    LSTMNet(),\n    (1, 100),\n    dtypes=[torch.long],\n    verbose=2,\n    col_width=16,\n    col_names=[\"kernel_size\", \"output_size\", \"num_params\", \"mult_adds\"],\n    row_settings=[\"var_names\"],\n)\n```\n\n```\n========================================================================================================================\nLayer (type (var_name))                  Kernel Shape         Output Shape         Param #              Mult-Adds\n========================================================================================================================\nLSTMNet (LSTMNet)                        --                   [100, 20]            --                   --\n├─Embedding (embedding)                  --                   [1, 100, 300]        6,000                6,000\n│    └─weight                            [300, 20]                                 
└─6,000\n├─LSTM (encoder)                         --                   [1, 100, 512]        3,768,320            376,832,000\n│    └─weight_ih_l0                      [2048, 300]                               ├─614,400\n│    └─weight_hh_l0                      [2048, 512]                               ├─1,048,576\n│    └─bias_ih_l0                        [2048]                                    ├─2,048\n│    └─bias_hh_l0                        [2048]                                    ├─2,048\n│    └─weight_ih_l1                      [2048, 512]                               ├─1,048,576\n│    └─weight_hh_l1                      [2048, 512]                               ├─1,048,576\n│    └─bias_ih_l1                        [2048]                                    ├─2,048\n│    └─bias_hh_l1                        [2048]                                    └─2,048\n├─Linear (decoder)                       --                   [1, 100, 20]         10,260               10,260\n│    └─weight                            [512, 20]                                 ├─10,240\n│    └─bias                              [20]                                      └─20\n========================================================================================================================\nTotal params: 3,784,580\nTrainable params: 3,784,580\nNon-trainable params: 0\nTotal mult-adds (M): 376.85\n========================================================================================================================\nInput size (MB): 0.00\nForward\u002Fbackward pass size (MB): 0.67\nParams size (MB): 15.14\nEstimated Total Size (MB): 15.80\n========================================================================================================================\n\n```\n\n\u003C!-- lstm.out -->\n\n## ResNet\n\n```python\nimport torchvision\n\nmodel = torchvision.models.resnet152()\nsummary(model, (1, 3, 224, 224), 
depth=3)\n```\n\n```\n==========================================================================================\nLayer (type:depth-idx)                   Output Shape              Param #\n==========================================================================================\nResNet                                   [1, 1000]                 --\n├─Conv2d: 1-1                            [1, 64, 112, 112]         9,408\n├─BatchNorm2d: 1-2                       [1, 64, 112, 112]         128\n├─ReLU: 1-3                              [1, 64, 112, 112]         --\n├─MaxPool2d: 1-4                         [1, 64, 56, 56]           --\n├─Sequential: 1-5                        [1, 256, 56, 56]          --\n│    └─Bottleneck: 2-1                   [1, 256, 56, 56]          --\n│    │    └─Conv2d: 3-1                  [1, 64, 56, 56]           4,096\n│    │    └─BatchNorm2d: 3-2             [1, 64, 56, 56]           128\n│    │    └─ReLU: 3-3                    [1, 64, 56, 56]           --\n│    │    └─Conv2d: 3-4                  [1, 64, 56, 56]           36,864\n│    │    └─BatchNorm2d: 3-5             [1, 64, 56, 56]           128\n│    │    └─ReLU: 3-6                    [1, 64, 56, 56]           --\n│    │    └─Conv2d: 3-7                  [1, 256, 56, 56]          16,384\n│    │    └─BatchNorm2d: 3-8             [1, 256, 56, 56]          512\n│    │    └─Sequential: 3-9              [1, 256, 56, 56]          16,896\n│    │    └─ReLU: 3-10                   [1, 256, 56, 56]          --\n│    └─Bottleneck: 2-2                   [1, 256, 56, 56]          --\n\n  ...\n  ...\n  ...\n\n├─AdaptiveAvgPool2d: 1-9                 [1, 2048, 1, 1]           --\n├─Linear: 1-10                           [1, 1000]                 2,049,000\n==========================================================================================\nTotal params: 60,192,808\nTrainable params: 60,192,808\nNon-trainable params: 0\nTotal mult-adds (G): 
11.51\n==========================================================================================\nInput size (MB): 0.60\nForward\u002Fbackward pass size (MB): 360.87\nParams size (MB): 240.77\nEstimated Total Size (MB): 602.25\n==========================================================================================\n```\n\n\u003C!-- resnet152.out -->\n\n## Multiple Inputs w\u002F Different Data Types\n\n```python\nclass MultipleInputNetDifferentDtypes(nn.Module):\n    def __init__(self):\n        super().__init__()\n        self.fc1a = nn.Linear(300, 50)\n        self.fc1b = nn.Linear(50, 10)\n\n        self.fc2a = nn.Linear(300, 50)\n        self.fc2b = nn.Linear(50, 10)\n\n    def forward(self, x1, x2):\n        x1 = F.relu(self.fc1a(x1))\n        x1 = self.fc1b(x1)\n        x2 = x2.type(torch.float)\n        x2 = F.relu(self.fc2a(x2))\n        x2 = self.fc2b(x2)\n        x = torch.cat((x1, x2), 0)\n        return F.log_softmax(x, dim=1)\n\nsummary(model, [(1, 300), (1, 300)], dtypes=[torch.float, torch.long])\n```\n\nAlternatively, you can also pass in the input_data itself, and\ntorchinfo will automatically infer the data types.\n\n```python\ninput_data = torch.randn(1, 300)\nother_input_data = torch.randn(1, 300).long()\nmodel = MultipleInputNetDifferentDtypes()\n\nsummary(model, input_data=[input_data, other_input_data, ...])\n```\n\n## Sequentials & ModuleLists\n\n```python\nclass ContainerModule(nn.Module):\n\n    def __init__(self):\n        super().__init__()\n        self._layers = nn.ModuleList()\n        self._layers.append(nn.Linear(5, 5))\n        self._layers.append(ContainerChildModule())\n        self._layers.append(nn.Linear(5, 5))\n\n    def forward(self, x):\n        for layer in self._layers:\n            x = layer(x)\n        return x\n\n\nclass ContainerChildModule(nn.Module):\n\n    def __init__(self):\n        super().__init__()\n        self._sequential = nn.Sequential(nn.Linear(5, 5), nn.Linear(5, 5))\n        self._between = 
nn.Linear(5, 5)\n\n    def forward(self, x):\n        out = self._sequential(x)\n        out = self._between(out)\n        for l in self._sequential:\n            out = l(out)\n\n        out = self._sequential(x)\n        for l in self._sequential:\n            out = l(out)\n        return out\n\nsummary(ContainerModule(), (1, 5))\n```\n\n```\n==========================================================================================\nLayer (type:depth-idx)                   Output Shape              Param #\n==========================================================================================\nContainerModule                          [1, 5]                    --\n├─ModuleList: 1-1                        --                        --\n│    └─Linear: 2-1                       [1, 5]                    30\n│    └─ContainerChildModule: 2-2         [1, 5]                    --\n│    │    └─Sequential: 3-1              [1, 5]                    --\n│    │    │    └─Linear: 4-1             [1, 5]                    30\n│    │    │    └─Linear: 4-2             [1, 5]                    30\n│    │    └─Linear: 3-2                  [1, 5]                    30\n│    │    └─Sequential: 3-3              --                        (recursive)\n│    │    │    └─Linear: 4-3             [1, 5]                    (recursive)\n│    │    │    └─Linear: 4-4             [1, 5]                    (recursive)\n│    │    └─Sequential: 3-4              [1, 5]                    (recursive)\n│    │    │    └─Linear: 4-5             [1, 5]                    (recursive)\n│    │    │    └─Linear: 4-6             [1, 5]                    (recursive)\n│    │    │    └─Linear: 4-7             [1, 5]                    (recursive)\n│    │    │    └─Linear: 4-8             [1, 5]                    (recursive)\n│    └─Linear: 2-3                       [1, 5]                    30\n==========================================================================================\nTotal params: 
150\nTrainable params: 150\nNon-trainable params: 0\nTotal mult-adds (M): 0.00\n==========================================================================================\nInput size (MB): 0.00\nForward\u002Fbackward pass size (MB): 0.00\nParams size (MB): 0.00\nEstimated Total Size (MB): 0.00\n==========================================================================================\n```\n\n\u003C!-- container.out -->\n\n# Contributing\n\nAll issues and pull requests are much appreciated! If you are wondering how to build the project:\n\n- torchinfo is actively developed using the latest version of Python.\n  - Changes should be backward compatible to Python 3.8, and will follow Python's End-of-Life guidance for old versions.\n  - Run `pip install -r requirements-dev.txt`. We use the latest versions of all dev packages.\n  - Run `pre-commit install`.\n  - To use auto-formatting tools, use `pre-commit run -a`.\n  - To run unit tests, run `pytest`.\n  - To update the expected output files, run `pytest --overwrite`.\n  - To skip output file tests, use `pytest --no-output`.\n\n# References\n\n- Thanks to @sksq96, @nmhkahn, and @sangyx for providing the inspiration for this project.\n- For Model Size Estimation @jacobkimmel ([details here](https:\u002F\u002Fgithub.com\u002Fsksq96\u002Fpytorch-summary\u002Fpull\u002F21))\n","# torchinfo\n\n[![Python 3.8+](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fpython-3.8+-blue.svg)](https:\u002F\u002Fwww.python.org\u002Fdownloads\u002Frelease\u002Fpython-380\u002F)\n[![PyPI version](https:\u002F\u002Fbadge.fury.io\u002Fpy\u002Ftorchinfo.svg)](https:\u002F\u002Fbadge.fury.io\u002Fpy\u002Ftorchinfo)\n[![Conda version](https:\u002F\u002Fimg.shields.io\u002Fconda\u002Fvn\u002Fconda-forge\u002Ftorchinfo)](https:\u002F\u002Fanaconda.org\u002Fconda-forge\u002Ftorchinfo)\n[![Build 
Status](https:\u002F\u002Fgithub.com\u002FTylerYep\u002Ftorchinfo\u002Factions\u002Fworkflows\u002Ftest.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002FTylerYep\u002Ftorchinfo\u002Factions\u002Fworkflows\u002Ftest.yml)\n[![pre-commit.ci status](https:\u002F\u002Fresults.pre-commit.ci\u002Fbadge\u002Fgithub\u002FTylerYep\u002Ftorchinfo\u002Fmain.svg)](https:\u002F\u002Fresults.pre-commit.ci\u002Flatest\u002Fgithub\u002FTylerYep\u002Ftorchinfo\u002Fmain)\n[![GitHub license](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Flicense\u002FTylerYep\u002Ftorchinfo)](https:\u002F\u002Fgithub.com\u002FTylerYep\u002Ftorchinfo\u002Fblob\u002Fmain\u002FLICENSE)\n[![codecov](https:\u002F\u002Fcodecov.io\u002Fgh\u002FTylerYep\u002Ftorchinfo\u002Fbranch\u002Fmain\u002Fgraph\u002Fbadge.svg)](https:\u002F\u002Fcodecov.io\u002Fgh\u002FTylerYep\u002Ftorchinfo)\n[![Downloads](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FTylerYep_torchinfo_readme_1a294af4e63f.png)](https:\u002F\u002Fpepy.tech\u002Fproject\u002Ftorchinfo)\n\n（原名 torch-summary）\n\nTorchinfo 提供了与 PyTorch 中 `print(your_model)` 所提供信息互补的信息，类似于 TensorFlow 的 `model.summary()` API，用于查看模型的可视化结构，在调试网络时非常有帮助。在这个项目中，我们实现了类似的功能，并为 PyTorch 创建了一个简洁、易用的接口，方便在项目中使用。\n\n这是由 @sksq96 和 @nmhkahn 原创的 torchsummary 和 torchsummaryX 项目的完全重写版本。通过引入全新的 API，本项目解决了原项目遗留的所有问题和拉取请求。\n\n支持 PyTorch 1.4.0 及以上版本。\n\n# 使用方法\n\n```\npip install torchinfo\n```\n\n或者通过 conda 安装：\n\n```\nconda install -c conda-forge torchinfo\n```\n\n# 使用方式\n\n```python\nfrom torchinfo import summary\n\nmodel = ConvNet()\nbatch_size = 16\nsummary(model, input_size=(batch_size, 1, 28, 28))\n```\n\n```\n================================================================================================================\nLayer (type:depth-idx)          Input Shape          Output Shape         Param #            Mult-Adds\n================================================================================================================\nSingleInputNet                  
[7, 1, 28, 28]       [7, 10]              --                 --\n├─Conv2d: 1-1                   [7, 1, 28, 28]       [7, 10, 24, 24]      260                1,048,320\n├─Conv2d: 1-2                   [7, 10, 12, 12]      [7, 20, 8, 8]        5,020              2,248,960\n├─Dropout2d: 1-3                [7, 20, 8, 8]        [7, 20, 8, 8]        --                 --\n├─Linear: 1-4                   [7, 320]             [7, 50]              16,050             112,350\n├─Linear: 1-5                   [7, 50]              [7, 10]              510                3,570\n================================================================================================================\nTotal params: 21,840\nTrainable params: 21,840\nNon-trainable params: 0\nTotal mult-adds (M): 3.41\n================================================================================================================\nInput size (MB): 0.02\nForward\u002Fbackward pass size (MB): 0.40\nParams size (MB): 0.09\nEstimated Total Size (MB): 0.51\n================================================================================================================\n```\n\n\u003C!-- single_input_all_cols.out -->\n\n注意：如果您使用的是 Jupyter Notebook 或 Google Colab，`summary(model, ...)` 必须是单元格的返回值。如果不是，您应该将 summary 包裹在 print() 中，例如 `print(summary(model, ...))`。示例请参见 `tests\u002Fjupyter_test.ipynb`。\n\n**此版本现在支持：**\n\n- RNN、LSTM 及其他递归层\n- 分支输出，可用于按指定深度探索模型层\n- 返回包含所有摘要数据字段的 ModelStatistics 对象\n- 可配置的行数和列数\n- Jupyter Notebook \u002F Google Colab\n\n**其他新特性：**\n\n- 详细模式，可显示权重和偏置层\n- 支持输入数据或仅输入形状！\n- 可自定义线条宽度和批量维度\n- 全面的单元测试、输出测试、代码风格检查和代码覆盖率测试\n\n**社区贡献：**\n\n- Sequentials 和 ModuleLists（感谢 @roym899）\n- 改进的 Mult-Add 计算（感谢 @TE-StefanUhlich、@zmzhang2000）\n- 字典\u002F其他格式的输入数据（感谢 @e-dorigatti）\n- 剪枝层支持（感谢 @MajorCarrot）\n\n# 文档\n\n```python\ndef summary(\n    model: nn.Module,\n    input_size: Optional[INPUT_SIZE_TYPE] = None,\n    input_data: Optional[INPUT_DATA_TYPE] = None,\n    batch_dim: Optional[int] = None,\n    
cache_forward_pass: Optional[bool] = None,\n    col_names: Optional[Iterable[str]] = None,\n    col_width: int = 25,\n    depth: int = 3,\n    device: Optional[torch.device] = None,\n    dtypes: Optional[List[torch.dtype]] = None,\n    mode: str = \"same\",\n    row_settings: Optional[Iterable[str]] = None,\n    verbose: int = 1,\n    **kwargs: Any,\n) -> ModelStatistics:\n\"\"\"\n总结给定的 PyTorch 模型。总结的信息包括：\n    1) 层名称，\n    2) 输入\u002F输出形状，\n    3) 卷积核形状，\n    4) 参数数量，\n    5) 运算次数（乘加操作），\n    6) 该层是否可训练。\n\n注意：如果既没有提供 input_data 也没有提供 input_size，则不会执行前向传播，此时提供的模型信息仅限于层名称。\n\n参数：\n    model (nn.Module):\n            要总结的 PyTorch 模型。模型应处于 train() 或 eval() 模式之一。如果各层模式不一致，运行 summary 可能会对 BatchNorm 或 Dropout 的统计产生副作用。若遇到此类问题，请提交 GitHub 问题。\n\n    input_size (序列尺寸)：\n            输入数据的形状，以 List\u002FTuple\u002Ftorch.Size 形式提供。\n            （数据类型需与模型输入匹配，默认为 FloatTensor）。\n            元组中应包含批次大小。\n            默认：None\n\n    input_data (张量序列)：\n            用于模型前向传播的参数（数据类型由输入推断）。\n            如果 forward() 函数接受多个参数，请传入参数列表或关键字参数字典。\n            （若 forward() 函数仅接受一个字典作为参数，需将其包裹在列表中）。\n            默认：None\n\n    batch_dim (int)：\n            输入数据的批次维度。若 batch_dim 为 None，则假定 input_data \u002F input_size 包含批次维度，并在所有计算中使用该维度。否则，会扩展所有张量以包含批次维度。\n            指定 batch_dim 可以优化运行时性能，因为当指定了批次维度时，torchinfo 在前向传播时会使用批次大小为 1。\n            默认：None\n\n    cache_forward_pass (bool)：\n            若为 True，则会以模型类名作为键缓存 forward() 函数的运行结果。如果前向传播操作开销较大，这将使修改模型摘要的格式更加方便，例如更改深度或启用的列类型，特别是在 Jupyter Notebook 中。\n            警告：启用此功能后，若修改模型架构或输入数据\u002F输入尺寸，缓存不会失效，也不会重新运行前向传播，从而可能导致摘要不准确。\n            默认：False\n\n    col_names (可迭代字符串)：\n            指定在输出中显示哪些列。当前支持的列有：(\n                \"input_size\",\n                \"output_size\",\n                \"num_params\",\n                \"params_percent\",\n                \"kernel_size\",\n                \"groups\",\n                \"mult_adds\",\n                \"trainable\",\n            )\n            默认：(\"output_size\", \"num_params\")\n      
      如果未提供 input_data \u002F input_size，则仅显示 \"num_params\"。\n\n    col_width (int)：\n            每列的宽度。\n            默认：25\n\n    depth (int)：\n            显示嵌套层数的深度（例如 Sequential）。\n            低于此深度的嵌套层将不会显示在摘要中。\n            默认：3\n\n    device (torch.Device)：\n            使用指定的 torch 设备来处理模型和输入数据。\n            如果未指定，则优先使用输入数据的数据类型（若有），否则使用模型参数的数据类型。若两者均未指定，则使用 torch.cuda.is_available() 的结果。\n            默认：None\n\n    dtypes (List[torch.dtype])：\n            如果使用 input_size，torchinfo 假设输入数据为 FloatTensor 类型。若您的模型使用其他数据类型，请在此处指定相应的数据类型。\n            对于多输入情况，需同时指定每个输入的尺寸及对应的数据类型。\n            默认：None\n\n    mode (str)：\n            可取值为 \"train\"、\"eval\" 或 \"same\"，用于决定在调用 summary 之前是调用 model.train() 还是 model.eval()。无论哪种情况，最终都会恢复模型的原始模式。\n            默认：\"same\"。\n\n    row_settings (可迭代字符串)：\n            指定在每一行中显示哪些特性。当前支持的特性有：(\n                \"ascii_only\",\n                \"depth\",\n                \"var_names\",\n            )\n            默认：(\"depth\",)\n\n    verbose (int)：\n            0（静默）：无输出\n            1（默认）：打印模型摘要\n            2（详细）：完整显示权重和偏置层的信息\n            默认：1\n            如果使用 Jupyter Notebook 或 Google Colab，则默认为 0。\n\n    **kwargs：\n            `model.forward` 函数中使用的其他参数。不再支持 *args 的传递方式。\n\n返回：\n    ModelStatistics 对象\n            更多信息请参阅 torchinfo\u002Fmodel_statistics.py 文件。\n\"\"\"\n```\n\n# 示例\n\n## 获取模型摘要的字符串表示\n\n```python\nfrom torchinfo import summary\n\nmodel_stats = summary(your_model, (1, 3, 28, 28), verbose=0)\nsummary_str = str(model_stats)\n# summary_str 包含摘要的字符串表示！\n```\n\n## 探索不同的配置\n\n```python\nclass LSTMNet(nn.Module):\n    def __init__(self, vocab_size=20, embed_dim=300, hidden_dim=512, num_layers=2):\n        super().__init__()\n        self.hidden_dim = hidden_dim\n        self.embedding = nn.Embedding(vocab_size, embed_dim)\n        self.encoder = nn.LSTM(embed_dim, hidden_dim, num_layers=num_layers, batch_first=True)\n        self.decoder = nn.Linear(hidden_dim, vocab_size)\n\n    def forward(self, x):\n        
embed = self.embedding(x)\n        out, hidden = self.encoder(embed)\n        out = self.decoder(out)\n        out = out.view(-1, out.size(2))\n        return out, hidden\n\nsummary(\n    LSTMNet(),\n    (1, 100),\n    dtypes=[torch.long],\n    verbose=2,\n    col_width=16,\n    col_names=[\"kernel_size\", \"output_size\", \"num_params\", \"mult_adds\"],\n    row_settings=[\"var_names\"],\n)\n```\n\n```\n========================================================================================================================\nLayer (type (var_name))                  Kernel Shape         Output Shape         Param #              Mult-Adds\n========================================================================================================================\nLSTMNet (LSTMNet)                        --                   [100, 20]            --                   --\n├─Embedding (embedding)                  --                   [1, 100, 300]        6,000                6,000\n│    └─weight                            [300, 20]                                 └─6,000\n├─LSTM (encoder)                         --                   [1, 100, 512]        3,768,320            376,832,000\n│    └─weight_ih_l0                      [2048, 300]                               ├─614,400\n│    └─weight_hh_l0                      [2048, 512]                               ├─1,048,576\n│    └─bias_ih_l0                        [2048]                                    ├─2,048\n│    └─bias_hh_l0                        [2048]                                    ├─2,048\n│    └─weight_ih_l1                      [2048, 512]                               ├─1,048,576\n│    └─weight_hh_l1                      [2048, 512]                               ├─1,048,576\n│    └─bias_ih_l1                        [2048]                                    ├─2,048\n│    └─bias_hh_l1                        [2048]                                    └─2,048\n├─Linear (decoder)                       --            
       [1, 100, 20]         10,260               10,260\n│    └─weight                            [512, 20]                                 ├─10,240\n│    └─bias                              [20]                                      └─20\n========================================================================================================================\nTotal params: 3,784,580\nTrainable params: 3,784,580\nNon-trainable params: 0\nTotal mult-adds (M): 376.85\n========================================================================================================================\nInput size (MB): 0.00\nForward\u002Fbackward pass size (MB): 0.67\nParams size (MB): 15.14\nEstimated Total Size (MB): 15.80\n========================================================================================================================\n\n```\n\n\u003C!-- lstm.out -->\n\n## ResNet\n\n```python\nimport torchvision\n\nmodel = torchvision.models.resnet152()\nsummary(model, (1, 3, 224, 224), depth=3)\n```\n\n```\n==========================================================================================\nLayer (type:depth-idx)                   Output Shape              Param #\n==========================================================================================\nResNet                                   [1, 1000]                 --\n├─Conv2d: 1-1                            [1, 64, 112, 112]         9,408\n├─BatchNorm2d: 1-2                       [1, 64, 112, 112]         128\n├─ReLU: 1-3                              [1, 64, 112, 112]         --\n├─MaxPool2d: 1-4                         [1, 64, 56, 56]           --\n├─Sequential: 1-5                        [1, 256, 56, 56]          --\n│    └─Bottleneck: 2-1                   [1, 256, 56, 56]          --\n│    │    └─Conv2d: 3-1                  [1, 64, 56, 56]           4,096\n│    │    └─BatchNorm2d: 3-2             [1, 64, 56, 56]           128\n│    │    └─ReLU: 3-3                    [1, 64, 56, 56]           --\n│    │ 
   └─Conv2d: 3-4                  [1, 64, 56, 56]           36,864\n│    │    └─BatchNorm2d: 3-5             [1, 64, 56, 56]           128\n│    │    └─ReLU: 3-6                    [1, 64, 56, 56]           --\n│    │    └─Conv2d: 3-7                  [1, 256, 56, 56]          16,384\n│    │    └─BatchNorm2d: 3-8             [1, 256, 56, 56]          512\n│    │    └─Sequential: 3-9              [1, 256, 56, 56]          16,896\n│    │    └─ReLU: 3-10                   [1, 256, 56, 56]           --\n│    └─Bottleneck: 2-2                   [1, 256, 56, 56]          --\n\n  ...\n  ...\n  ...\n\n├─AdaptiveAvgPool2d: 1-9                 [1, 2048, 1, 1]           --\n├─Linear: 1-10                           [1, 1000]                 2,049,000\n==========================================================================================\nTotal params: 60,192,808\nTrainable params: 60,192,808\nNon-trainable params: 0\nTotal mult-adds (G): 11.51\n==========================================================================================\nInput size (MB): 0.60\nForward\u002Fbackward pass size (MB): 360.87\nParams size (MB): 240.77\nEstimated Total Size (MB): 602.25\n==========================================================================================\n```\n\n\u003C!-- resnet152.out -->\n\n## 多输入与不同数据类型\n\n```python\nclass MultipleInputNetDifferentDtypes(nn.Module):\n    def __init__(self):\n        super().__init__()\n        self.fc1a = nn.Linear(300, 50)\n        self.fc1b = nn.Linear(50, 10)\n\n        self.fc2a = nn.Linear(300, 50)\n        self.fc2b = nn.Linear(50, 10)\n\n    def forward(self, x1, x2):\n        x1 = F.relu(self.fc1a(x1))\n        x1 = self.fc1b(x1)\n        x2 = x2.type(torch.float)\n        x2 = F.relu(self.fc2a(x2))\n        x2 = self.fc2b(x2)\n        x = torch.cat((x1, x2), 0)\n        return F.log_softmax(x, dim=1)\n\nsummary(model, [(1, 300), (1, 300)], dtypes=[torch.float, torch.long])\n```\n\n或者，你也可以直接传入输入数据，`torchinfo` 
会自动推断数据类型。\n\n```python\ninput_data = torch.randn(1, 300)\nother_input_data = torch.randn(1, 300).long()\nmodel = MultipleInputNetDifferentDtypes()\n\nsummary(model, input_data=[input_data, other_input_data])\n```\n\n## 顺序容器与 ModuleList\n\n```python\nclass ContainerModule(nn.Module):\n\n    def __init__(self):\n        super().__init__()\n        self._layers = nn.ModuleList()\n        self._layers.append(nn.Linear(5, 5))\n        self._layers.append(ContainerChildModule())\n        self._layers.append(nn.Linear(5, 5))\n\n    def forward(self, x):\n        for layer in self._layers:\n            x = layer(x)\n        return x\n\n\nclass ContainerChildModule(nn.Module):\n\n    def __init__(self):\n        super().__init__()\n        self._sequential = nn.Sequential(nn.Linear(5, 5), nn.Linear(5, 5))\n        self._between = nn.Linear(5, 5)\n\n    def forward(self, x):\n        out = self._sequential(x)\n        out = self._between(out)\n        for l in self._sequential:\n            out = l(out)\n\n        out = self._sequential(x)\n        for l in self._sequential:\n            out = l(out)\n        return out\n\nsummary(ContainerModule(), (1, 5))\n```\n\n```\n==========================================================================================\nLayer (type:depth-idx)                   Output Shape              Param #\n==========================================================================================\nContainerModule                          [1, 5]                    --\n├─ModuleList: 1-1                        --                        --\n│    └─Linear: 2-1                       [1, 5]                    30\n│    └─ContainerChildModule: 2-2         [1, 5]                    --\n│    │    └─Sequential: 3-1              [1, 5]                    --\n│    │    │    └─Linear: 4-1             [1, 5]                    30\n│    │    │    └─Linear: 4-2             [1, 5]                    30\n│    │    └─Linear: 3-2                  [1, 5]           
         30\n│    │    └─Sequential: 3-3              --                        (recursive)\n│    │    │    └─Linear: 4-3             [1, 5]                    (recursive)\n│    │    │    └─Linear: 4-4             [1, 5]                    (recursive)\n│    │    └─Sequential: 3-4              [1, 5]                    (recursive)\n│    │    │    └─Linear: 4-5             [1, 5]                    (recursive)\n│    │    │    └─Linear: 4-6             [1, 5]                    (recursive)\n│    │    │    └─Linear: 4-7             [1, 5]                    (recursive)\n│    │    │    └─Linear: 4-8             [1, 5]                    (recursive)\n│    └─Linear: 2-3                       [1, 5]                    30\n==========================================================================================\nTotal params: 150\nTrainable params: 150\nNon-trainable params: 0\nTotal mult-adds (M): 0.00\n==========================================================================================\nInput size (MB): 0.00\nForward\u002Fbackward pass size (MB): 0.00\nParams size (MB): 0.00\nEstimated Total Size (MB): 0.00\n==========================================================================================\n```\n\n\u003C!-- container.out -->\n\n# 贡献\n\n我们非常欢迎所有的问题和拉取请求！如果您想知道如何构建该项目：\n\n- torchinfo 使用最新版本的 Python 进行开发。\n  - 更改应向后兼容 Python 3.8，并遵循 Python 对旧版本的支持终止政策。\n  - 运行 `pip install -r requirements-dev.txt`。我们使用所有开发依赖包的最新版本。\n  - 运行 `pre-commit install`。\n  - 若要使用自动格式化工具，运行 `pre-commit run -a`。\n  - 若要运行单元测试，运行 `pytest`。\n  - 若要更新预期输出文件，运行 `pytest --overwrite`。\n  - 若要跳过输出文件测试，使用 `pytest --no-output`。\n\n# 参考文献\n\n- 感谢 @sksq96、@nmhkahn 和 @sangyx 为本项目提供了灵感。\n- 关于模型大小估算，感谢 @jacobkimmel（[详情请见此处](https:\u002F\u002Fgithub.com\u002Fsksq96\u002Fpytorch-summary\u002Fpull\u002F21)）。","# torchinfo 快速上手指南\n\n`torchinfo` 是一个用于 PyTorch 模型可视化的工具，功能类似于 TensorFlow 的 `model.summary()`。它能清晰地展示模型的层级结构、输入\u002F输出形状、参数量及计算量（Mult-Adds），是调试神经网络结构的得力助手。\n\n## 环境准备\n\n*   **Python 版本**：3.8 
及以上\n*   **PyTorch 版本**：1.4.0 及以上\n*   **系统要求**：支持 Linux、macOS 和 Windows\n\n## 安装步骤\n\n推荐使用 pip 进行安装。国内用户可使用清华源或阿里源加速下载。\n\n**使用 pip 安装（推荐）：**\n```bash\npip install torchinfo\n```\n\n**使用国内镜像源加速安装：**\n```bash\npip install torchinfo -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple\n```\n\n**使用 Conda 安装：**\n```bash\nconda install -c conda-forge torchinfo\n```\n\n## 基本使用\n\n### 1. 最简单的用法\n\n只需导入 `summary` 函数，传入模型实例和输入数据的形状（需包含 Batch Size），即可打印模型摘要。\n\n```python\nfrom torchinfo import summary\nimport torch\nimport torch.nn as nn\n\n# 假设你有一个定义好的模型\nclass ConvNet(nn.Module):\n    def __init__(self):\n        super().__init__()\n        self.conv1 = nn.Conv2d(1, 10, kernel_size=5)\n        self.conv2 = nn.Conv2d(10, 20, kernel_size=5)\n        self.fc1 = nn.Linear(320, 50)\n        self.fc2 = nn.Linear(50, 10)\n\n    def forward(self, x):\n        x = torch.relu(self.conv1(x))\n        x = torch.max_pool2d(x, 2)\n        x = torch.relu(self.conv2(x))\n        x = torch.max_pool2d(x, 2)\n        x = x.view(-1, 320)\n        x = torch.relu(self.fc1(x))\n        x = self.fc2(x)\n        return x\n\nmodel = ConvNet()\n\n# 生成模型摘要\n# input_size 格式为: (Batch_Size, Channels, Height, Width)\nsummary(model, input_size=(16, 1, 28, 28))\n```\n\n**输出示例：**\n```text\n================================================================================================================\nLayer (type:depth-idx)          Input Shape          Output Shape         Param #            Mult-Adds\n================================================================================================================\nConvNet                         [16, 1, 28, 28]      [16, 10]             --                 --\n├─Conv2d: 1-1                   [16, 1, 28, 28]      [16, 10, 24, 24]     260                2,396,160\n├─Conv2d: 1-2                   [16, 10, 12, 12]     [16, 20, 8, 8]       5,020              5,140,480\n├─Linear: 1-3                   [16, 320]            [16, 50]             16,050             256,800\n├─Linear: 1-4                   [16, 50]             [16, 10]             510                8,160\n================================================================================================================\nTotal params: 21,840\nTrainable params: 21,840\nNon-trainable params: 0\nTotal mult-adds (M): 7.80\n================================================================================================================\n```\n\n### 2. Jupyter Notebook \u002F Colab 特别说明\n\n如果在 Jupyter Notebook 或 Google Colab 中使用，确保 `summary()` 是单元格的最后一个表达式，或者将其包裹在 `print()` 中：\n\n```python\n# 方式一：作为单元格最后一行\nsummary(model, input_size=(16, 1, 28, 28))\n\n# 方式二：显式打印\nprint(summary(model, input_size=(16, 1, 28, 28)))\n```\n\n### 3. 获取摘要字符串\n\n如果你需要将摘要信息保存为字符串而不是直接打印，可以设置 `verbose=0`：\n\n```python\nmodel_stats = summary(model, input_size=(16, 1, 28, 28), verbose=0)\nsummary_str = str(model_stats)\n# 现在 summary_str 包含了完整的摘要文本\n```","某计算机视觉工程师在优化一个复杂的残差网络（ResNet）变体时，需要精确评估模型在边缘设备上的部署可行性。\n\n### 没有 torchinfo 时\n- **层级结构黑盒化**：仅靠 `print(model)` 只能看到类名和参数定义，无法直观获知数据流经每个卷积层后的具体尺寸变化，难以定位维度不匹配的报错源头。\n- **资源估算靠猜**：缺乏自动计算机制，开发者需手动推导参数量（Params）和乘加运算量（Mult-Adds），极易算错导致模型超出显存限制或推理延迟过高。\n- **调试效率低下**：面对包含分支结构或递归层（如 RNN）的复杂网络，肉眼追踪数据流向极其耗时，往往需要插入大量临时打印语句才能验证中间层状态。\n\n### 使用 torchinfo 后\n- **可视化数据流向**：运行一行 `summary` 代码即可生成清晰表格，详细展示每一层的输入\u002F输出形状，瞬间发现某一下采样层导致了特征图尺寸异常。\n- **精准量化指标**：自动统计总参数量、可训练参数及估算的 MACs（百万次乘加运算），直接确认模型大小是否符合嵌入式设备的 100MB 存储上限。\n- **一键深度诊断**：无需修改模型代码，即可通过配置深度参数查看嵌套子模块的内部细节，快速理清复杂分支结构的逻辑，将调试时间从数小时缩短至几分钟。\n\ntorchinfo 将模糊的模型定义转化为透明的量化视图，让开发者在编码阶段就能精准掌控模型的性能边界与资源消耗。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FTylerYep_torchinfo_a22fac7f.png","TylerYep","Tyler Yep","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002FTylerYep_9016b876.png","Hi, I'm Tyler!","Robinhood","Stanford 
University","tyep@cs.stanford.edu","tyleryep1","https:\u002F\u002Fwww.tyleryep.com","https:\u002F\u002Fgithub.com\u002FTylerYep",[83,87],{"name":84,"color":85,"percentage":86},"Python","#3572A5",74.2,{"name":88,"color":89,"percentage":90},"Jupyter Notebook","#DA5B0B",25.8,2912,134,"2026-04-08T06:48:37","MIT",1,"","未说明",{"notes":99,"python":100,"dependencies":101},"该工具用于分析 PyTorch 模型结构，支持 CPU 和 GPU 运行（具体设备可配置）。若在 Jupyter Notebook 或 Google Colab 中使用，需将 summary() 作为单元格返回值或使用 print() 包裹。支持 RNN、LSTM 等递归层及分支结构。","3.8+",[102],"torch>=1.4.0",[14],[105,106,107,108,109,110,111,64,112],"pytorch","torchsummary","torch","keras","visualization","torchvision","torch-summary","python",null,"2026-03-27T02:49:30.150509","2026-04-10T02:42:02.591043",[117,122,127,132,137,142],{"id":118,"question_zh":119,"answer_zh":120,"source_url":121},27016,"如何在 summary 输出中指定参数量（Params）和乘加运算量（MACs）的单位？","该功能已在 v1.7.2 版本中实现。您可以在调用 `summary()` 函数时添加 `params_units` 和 `macs_units` 参数来指定单位（如 'MB', 'GB' 等）。默认值为 'auto'，以保持与当前行为一致。这有助于快速比较不同的模型架构。","https:\u002F\u002Fgithub.com\u002FTylerYep\u002Ftorchinfo\u002Fissues\u002F183",{"id":123,"question_zh":124,"answer_zh":125,"source_url":126},27017,"为什么模型中的 `nn.Parameter` 在 summary 报告中被遗漏了？","这是一个已知问题，当 `nn.Parameter` 与其他 PyTorch 预定义层（如 `nn.Linear`）混合使用时，早期版本可能无法正确统计。该问题已在 v1.7.0 版本中通过重写相关逻辑修复。请确保升级到 torchinfo v1.7.0 或更高版本以获得准确的参数统计。","https:\u002F\u002Fgithub.com\u002FTylerYep\u002Ftorchinfo\u002Fissues\u002F84",{"id":128,"question_zh":129,"answer_zh":130,"source_url":131},27018,"使用 `nn.UninitializedParameter` 或延迟初始化模块时报错怎么办？","早期版本在遇到 `nn.UninitializedParameter` 时会抛出 `ValueError`。维护者已提交修复（commit 8cf0ab2），利用 PyTorch 的 `is_lazy` 函数来处理延迟模块的参数填充。此修复已包含在 torchinfo v1.7.0 中，升级即可解决该报错。","https:\u002F\u002Fgithub.com\u002FTylerYep\u002Ftorchinfo\u002Fissues\u002F117",{"id":133,"question_zh":134,"answer_zh":135,"source_url":136},27019,"嵌套模型结构中的 MACs（乘加运算量）计算不准确或显示错误如何解决？","嵌套模型导致的 MACs 计算错误已在 v1.5.1 版本中修复。维护者 generalized 
了显示测试以适用于所有测试模型。如果您在升级后仍然发现总和与分项之和不匹配，建议检查是否使用了最新版本，并留意后续关于参数和 mult-adds 计算的进一步修复更新。","https:\u002F\u002Fgithub.com\u002FTylerYep\u002Ftorchinfo\u002Fissues\u002F60",{"id":138,"question_zh":139,"answer_zh":140,"source_url":141},27020,"如何让 torchinfo 支持输入为字典（dict）的模型？","您可以直接将字典传递给 `input_data` 参数。使用方法如下：`summary(model, input_data={\"key1\": tensor1, \"key2\": tensor2}, ...)`。确保您的模型 `forward` 方法能够接收该字典作为输入。如果模型需要列表中包含字典等复杂结构，可能需要进一步调整输入格式以匹配模型预期。","https:\u002F\u002Fgithub.com\u002FTylerYep\u002Ftorchinfo\u002Fissues\u002F46",{"id":143,"question_zh":144,"answer_zh":145,"source_url":121},27021,"开发 torchinfo 时需要什么版本的 Python 和 PyTorch？","torchinfo 包本身支持 Python 3.7，但开发环境中的类型检查（typechecking）运行在 Python 3.9 上。维护者建议在开发时参考项目 `.github\u002Fworkflows\u002Ftest.yml` 文件中定义的 Python 和 PyTorch 版本配对。目前也在计划增加对 Python 3.10 和 PyTorch 1.13 的支持。",[147,152,157,162,167,172,177,181,186,191,196,201,206,211,216,221,226,231,236,241],{"id":148,"version":149,"summary_zh":150,"released_at":151},180131,"v1.8.0","## 变更内容\n* 支持使用 np.ndarray 作为输入和输出。由 @snimu 在 https:\u002F\u002Fgithub.com\u002FTylerYep\u002Ftorchinfo\u002Fpull\u002F222 中实现。\n* 如果提供了 input_data，则不更改设备。由 @snimu 在 https:\u002F\u002Fgithub.com\u002FTylerYep\u002Ftorchinfo\u002Fpull\u002F236 中实现。\n* 迁移到使用 tensor.untyped_storage() 以兼容 PyTorch 2.0。由 @TylerYep 在 https:\u002F\u002Fgithub.com\u002FTylerYep\u002Ftorchinfo\u002Fcommit\u002F7f2bed3 中实现。\n\n**完整变更日志**: https:\u002F\u002Fgithub.com\u002FTylerYep\u002Ftorchinfo\u002Fcompare\u002Fv1.7.2...v1.8.0","2023-05-14T20:54:46",{"id":153,"version":154,"summary_zh":155,"released_at":156},180132,"v1.7.2","## 变更内容\n* @mert-kurttutan 在 https:\u002F\u002Fgithub.com\u002FTylerYep\u002Ftorchinfo\u002Fpull\u002F173 中简化了 layers_to_str 函数。\n* @mert-kurttutan 在 https:\u002F\u002Fgithub.com\u002FTylerYep\u002Ftorchinfo\u002Fpull\u002F174 中为 ColumnSettings 添加了 hide_recursive_layers 选项。\n* @mert-kurttutan 在 https:\u002F\u002Fgithub.com\u002FTylerYep\u002Ftorchinfo\u002Fpull\u002F181 中增加了递归调用对 
total_output_bytes 的贡献计算。\n* @richardtml 在 https:\u002F\u002Fgithub.com\u002FTylerYep\u002Ftorchinfo\u002Fpull\u002F188 中添加了 params 和 MACs 的单位说明符。\n* @luke396 在 https:\u002F\u002Fgithub.com\u002FTylerYep\u002Ftorchinfo\u002Fpull\u002F199 中新增了 'Params %' 列。\n* @snimu 在 https:\u002F\u002Fgithub.com\u002FTylerYep\u002Ftorchinfo\u002Fpull\u002F211 中实现了当 `summary` 的 `device` 参数为 `None` 时，自动从第一个模型参数中获取设备的功能。\n* @snimu 在 https:\u002F\u002Fgithub.com\u002FTylerYep\u002Ftorchinfo\u002Fpull\u002F212 中启用了对嵌套 input 和 output 字典的分析。\n\n## 清理工作\n* @TylerYep 在 https:\u002F\u002Fgithub.com\u002FTylerYep\u002Ftorchinfo\u002Fpull\u002F189 中为 GitHub Actions 测试添加了更多版本。\n* @TylerYep 在 https:\u002F\u002Fgithub.com\u002FTylerYep\u002Ftorchinfo\u002Fpull\u002F191 中修复了测试用例中的 torchvision 已弃用警告。\n* @TylerYep 在 https:\u002F\u002Fgithub.com\u002FTylerYep\u002Ftorchinfo\u002Fpull\u002F220 中将 nested_list_size 函数独立出来，添加了一些文档，并改进了 setuptools 的 mypy 类型检查。\n\n## 新贡献者\n* @richardtml 在 https:\u002F\u002Fgithub.com\u002FTylerYep\u002Ftorchinfo\u002Fpull\u002F188 中完成了首次贡献。\n* @luke396 在 https:\u002F\u002Fgithub.com\u002FTylerYep\u002Ftorchinfo\u002Fpull\u002F199 中完成了首次贡献。\n* @snimu 在 https:\u002F\u002Fgithub.com\u002FTylerYep\u002Ftorchinfo\u002Fpull\u002F211 中完成了首次贡献。\n\n**完整变更日志**: https:\u002F\u002Fgithub.com\u002FTylerYep\u002Ftorchinfo\u002Fcompare\u002Fv1.7.1...v1.7.2","2023-02-05T20:50:10",{"id":158,"version":159,"summary_zh":160,"released_at":161},180133,"v1.7.1","## 变更内容\n* @mert-kurttutan 在 https:\u002F\u002Fgithub.com\u002FTylerYep\u002Ftorchinfo\u002Fpull\u002F165 中更新了半精度测试用例，以支持 PyTorch 1.12 版本。\n* @mert-kurttutan 在 https:\u002F\u002Fgithub.com\u002FTylerYep\u002Ftorchinfo\u002Fpull\u002F163 中，将 `add_missing_layers` 函数中的 `class_name` 替换为 `layer_id`。\n* @mert-kurttutan 在 https:\u002F\u002Fgithub.com\u002FTylerYep\u002Ftorchinfo\u002Fpull\u002F169 中，用 `add_missing_container_layers` 替代了 `add_missing_layers`。\n\n## 新贡献者\n* @mert-kurttutan 在 
https:\u002F\u002Fgithub.com\u002FTylerYep\u002Ftorchinfo\u002Fpull\u002F165 中完成了首次贡献。\n\n**完整变更日志**: https:\u002F\u002Fgithub.com\u002FTylerYep\u002Ftorchinfo\u002Fcompare\u002Fv1.7.0...v1.7.1","2022-09-26T00:25:35",{"id":163,"version":164,"summary_zh":165,"released_at":166},180134,"v1.7.0","## 变更内容\n* 计算 nn.Parameter 的参数量\n* 为 nn.UninitializedParameters 添加参数量统计\n* 显示整个模型的输出形状\n* 内部优化（钩子改为迭代应用而非递归应用，并添加性能分析代码）\n\n**完整变更日志**: https:\u002F\u002Fgithub.com\u002FTylerYep\u002Ftorchinfo\u002Fcompare\u002Fv1.6.6...v1.7.0","2022-05-28T23:30:26",{"id":168,"version":169,"summary_zh":170,"released_at":171},180135,"v1.6.6","## 变更内容\n* 由 @bsridatta 在 https:\u002F\u002Fgithub.com\u002FTylerYep\u002Ftorchinfo\u002Fpull\u002F128 中添加了“Trainable”列\n* 当 ModuleList 中存在 None 值时不再报错\n* 使用实际的属性名获取 kernel_size。","2022-05-16T05:47:46",{"id":173,"version":174,"summary_zh":175,"released_at":176},180136,"v1.6.5","修复一个回归问题：在 PyTorch 版本低于 1.9 时，torchinfo 会崩溃。","2022-03-25T20:01:21",{"id":178,"version":179,"summary_zh":113,"released_at":180},180137,"v1.6.3","2022-01-15T22:10:43",{"id":182,"version":183,"summary_zh":184,"released_at":185},180138,"v1.6.2","修复了在同一变量中复用图层时的 bug，无论是否存在现有钩子均适用。","2022-01-11T07:27:55",{"id":187,"version":188,"summary_zh":189,"released_at":190},180139,"v1.6.1","- 支持剪枝后的模型\r\n- 支持将“ascii_only”作为行设置，以禁用花式分支日志记录。","2021-12-24T09:38:01",{"id":192,"version":193,"summary_zh":194,"released_at":195},180140,"v1.6.0","弃用对 Python 3.6 的支持。如果您想使用 Python 3.6，请安装 v1.5.4。","2021-12-21T22:01:28",{"id":197,"version":198,"summary_zh":199,"released_at":200},180141,"v1.5.4","LayerInfo 的 `trainable` 字段现在已改为 `trainable_params`。","2021-11-24T07:02:56",{"id":202,"version":203,"summary_zh":204,"released_at":205},180142,"v1.5.3","- Display layers that share the same variable\r\n  - e.g. 
activation layers that are defined once and then reused throughout the model\r\n- README updates.","2021-08-07T21:21:18",{"id":207,"version":208,"summary_zh":209,"released_at":210},180143,"v1.5.2","- Use sys.getsizeof for calculating input size. In the future, we will use this instead of the tensor shape to calculate the size.\r\n- Rework the input_data correction to allow nested dicts and other data structure combinations.\r\n- Add missing basic summary test.\r\n- Refactor the main summary function to use a common traversal helper function.","2021-07-06T05:15:19",{"id":212,"version":213,"summary_zh":214,"released_at":215},180144,"v1.5.1","- Fix bug causing inconsistent Mult-Add totals that do not sum correctly.\r\n- Overhaul output testing to work automatically for all tests\r\n- Add `cache_forward_pass` to make it easier to iterate on depths in Jupyter Notebooks\r\n","2021-07-05T05:37:21",{"id":217,"version":218,"summary_zh":219,"released_at":220},180145,"v.1.5.0","Upgrade the version number past v1.4.5 in order for pip to resolve the version correctly across older versions of pip.","2021-07-03T01:09:34",{"id":222,"version":223,"summary_zh":224,"released_at":225},180146,"v0.1.5","- Fix issues with torch.jit scripted modules\r\n- Add support for ParameterLists\r\n- Display bias layers in verbose=2 mode","2021-06-13T02:08:04",{"id":227,"version":228,"summary_zh":229,"released_at":230},180147,"v0.1.4","Add py.typed to surface type annotations.","2021-06-07T05:34:48",{"id":232,"version":233,"summary_zh":234,"released_at":235},180148,"v0.1.3","Fix bug with inconsistent calculations for `total_params` using different `depth`s.","2021-06-04T21:59:26",{"id":237,"version":238,"summary_zh":239,"released_at":240},180149,"v0.1.2","Fix MACs calculation and differences in output when using different depths.\r\n\r\n`depth` parameter now only affects formatting, calculated values do not 
change.","2021-05-22T05:07:37",{"id":242,"version":243,"summary_zh":244,"released_at":245},180150,"v0.1.1","Add model name to the topmost row of the summary table, fixed bug in nested_list_size.","2021-05-09T01:41:30"]