[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-NVlabs--pacnet":3,"tool-NVlabs--pacnet":61},[4,18,28,37,45,53],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":17},4358,"openclaw","openclaw\u002Fopenclaw","OpenClaw 是一款专为个人打造的本地化 AI 助手，旨在让你在自己的设备上拥有完全可控的智能伙伴。它打破了传统 AI 助手局限于特定网页或应用的束缚，能够直接接入你日常使用的各类通讯渠道，包括微信、WhatsApp、Telegram、Discord、iMessage 等数十种平台。无论你在哪个聊天软件中发送消息，OpenClaw 都能即时响应，甚至支持在 macOS、iOS 和 Android 设备上进行语音交互，并提供实时的画布渲染功能供你操控。\n\n这款工具主要解决了用户对数据隐私、响应速度以及“始终在线”体验的需求。通过将 AI 部署在本地，用户无需依赖云端服务即可享受快速、私密的智能辅助，真正实现了“你的数据，你做主”。其独特的技术亮点在于强大的网关架构，将控制平面与核心助手分离，确保跨平台通信的流畅性与扩展性。\n\nOpenClaw 非常适合希望构建个性化工作流的技术爱好者、开发者，以及注重隐私保护且不愿被单一生态绑定的普通用户。只要具备基础的终端操作能力（支持 macOS、Linux 及 Windows WSL2），即可通过简单的命令行引导完成部署。如果你渴望拥有一个懂你",349277,3,"2026-04-06T06:32:30",[13,14,15,16],"Agent","开发框架","图像","数据工具","ready",{"id":19,"name":20,"github_repo":21,"description_zh":22,"stars":23,"difficulty_score":24,"last_commit_at":25,"category_tags":26,"status":17},9989,"n8n","n8n-io\u002Fn8n","n8n 是一款面向技术团队的公平代码（fair-code）工作流自动化平台，旨在让用户在享受低代码快速构建便利的同时，保留编写自定义代码的灵活性。它主要解决了传统自动化工具要么过于封闭难以扩展、要么完全依赖手写代码效率低下的痛点，帮助用户轻松连接 400 多种应用与服务，实现复杂业务流程的自动化。\n\nn8n 特别适合开发者、工程师以及具备一定技术背景的业务人员使用。其核心亮点在于“按需编码”：既可以通过直观的可视化界面拖拽节点搭建流程，也能随时插入 JavaScript 或 Python 代码、调用 npm 包来处理复杂逻辑。此外，n8n 原生集成了基于 LangChain 的 AI 能力，支持用户利用自有数据和模型构建智能体工作流。在部署方面，n8n 提供极高的自由度，支持完全自托管以保障数据隐私和控制权，也提供云端服务选项。凭借活跃的社区生态和数百个现成模板，n8n 让构建强大且可控的自动化系统变得简单高效。",184740,2,"2026-04-19T23:22:26",[16,14,13,15,27],"插件",{"id":29,"name":30,"github_repo":31,"description_zh":32,"stars":33,"difficulty_score":10,"last_commit_at":34,"category_tags":35,"status":17},10095,"AutoGPT","Significant-Gravitas\u002FAutoGPT","AutoGPT 是一个旨在让每个人都能轻松使用和构建 AI 的强大平台，核心功能是帮助用户创建、部署和管理能够自动执行复杂任务的连续型 AI 智能体。它解决了传统 AI 应用中需要频繁人工干预、难以自动化长流程工作的痛点，让用户只需设定目标，AI 即可自主规划步骤、调用工具并持续运行直至完成任务。\n\n无论是开发者、研究人员，还是希望提升工作效率的普通用户，都能从 AutoGPT 
中受益。开发者可利用其低代码界面快速定制专属智能体；研究人员能基于开源架构探索多智能体协作机制；而非技术背景用户也可直接选用预置的智能体模板，立即投入实际工作场景。\n\nAutoGPT 的技术亮点在于其模块化“积木式”工作流设计——用户通过连接功能块即可构建复杂逻辑，每个块负责单一动作，灵活且易于调试。同时，平台支持本地自托管与云端部署两种模式，兼顾数据隐私与使用便捷性。配合完善的文档和一键安装脚本，即使是初次接触的用户也能在几分钟内启动自己的第一个 AI 智能体。AutoGPT 正致力于降低 AI 应用门槛，让人人都能成为 AI 的创造者与受益者。",183572,"2026-04-20T04:47:55",[13,36,27,14,15],"语言模型",{"id":38,"name":39,"github_repo":40,"description_zh":41,"stars":42,"difficulty_score":10,"last_commit_at":43,"category_tags":44,"status":17},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,"2026-04-05T11:01:52",[14,15,13],{"id":46,"name":47,"github_repo":48,"description_zh":49,"stars":50,"difficulty_score":24,"last_commit_at":51,"category_tags":52,"status":17},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",161147,"2026-04-19T23:31:47",[14,13,36],{"id":54,"name":55,"github_repo":56,"description_zh":57,"stars":58,"difficulty_score":24,"last_commit_at":59,"category_tags":60,"status":17},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 
是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",109154,"2026-04-18T11:18:24",[14,15,13],{"id":62,"github_repo":63,"name":64,"description_en":65,"description_zh":66,"ai_summary_zh":66,"readme_en":67,"readme_zh":68,"quickstart_zh":69,"use_case_zh":70,"hero_image_url":71,"owner_login":72,"owner_name":73,"owner_avatar_url":74,"owner_bio":75,"owner_company":76,"owner_location":76,"owner_email":76,"owner_twitter":76,"owner_website":77,"owner_url":78,"languages":79,"stars":84,"forks":85,"last_commit_at":86,"license":87,"difficulty_score":10,"env_os":88,"env_gpu":89,"env_ram":88,"env_deps":90,"category_tags":97,"github_topics":98,"view_count":24,"oss_zip_url":76,"oss_zip_packed_at":76,"status":17,"created_at":102,"updated_at":103,"faqs":104,"releases":139},10136,"NVlabs\u002Fpacnet","pacnet","Pixel-Adaptive Convolutional Neural Networks (CVPR '19)","pacnet 是一个基于深度学习的开源项目，核心实现了“像素自适应卷积神经网络”（PACNN）。传统卷积神经网络在处理图像时，通常使用固定的卷积核在空间上滑动，难以灵活适应图像中复杂的局部结构变化。pacnet 通过引入引导特征（guidance features），让卷积核的形状和权重能够根据输入图像的每个像素位置动态调整。这种机制有效解决了传统方法在图像超分辨率、去噪、光流估计等任务中细节恢复不足的问题，显著提升了处理结果的清晰度和边缘保持能力。\n\n该项目主要面向计算机视觉领域的研究人员和开发者，特别是那些希望复现 CVPR 2019 获奖论文成果，或在 PyTorch 框架下探索自适应卷积应用的团队。pacnet 提供了完整的 PyTorch 模块实现，包括标准卷积、转置卷积、池化以及条件随机场（CRF）推理等多种变体层（如 `PacConv2d`、`PacCRF` 等），并支持灵活的核类型配置与预计算接口。其代码结构清晰，文档详尽，便于用户快速集成到现有的深度学习工作流中进行实验或二次开发，是推动自适应卷积技术落地的重要工具。","## Pixel-Adaptive Convolutional Neural Networks\n\n#### [Project page](https:\u002F\u002Fsuhangpro.github.io\u002Fpac\u002Findex.html) |  [Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F1904.05373) | 
[Video](https:\u002F\u002Fyoutu.be\u002FgsQZbHuR64o)\n\nPixel-Adaptive Convolutional Neural Networks\u003Cbr>\n[Hang Su](https:\u002F\u002Fsuhangpro.github.io\u002F), [Varun Jampani](https:\u002F\u002Fvarunjampani.github.io\u002F), [Deqing Sun](http:\u002F\u002Fresearch.nvidia.com\u002Fperson\u002Fdeqing-sun), [Orazio Gallo](https:\u002F\u002Fresearch.nvidia.com\u002Fperson\u002Forazio-gallo), [Erik Learned-Miller](http:\u002F\u002Fpeople.cs.umass.edu\u002F~elm\u002F), and [Jan Kautz](http:\u002F\u002Fjankautz.com\u002F).\u003Cbr>\nCVPR 2019.\n\n### License\n\nCopyright (C) 2019 NVIDIA Corporation.  All rights reserved.\nLicensed under the CC BY-NC-SA 4.0 license (https:\u002F\u002Fcreativecommons.org\u002Flicenses\u002Fby-nc-sa\u002F4.0\u002Flegalcode). \n\n\n### Installation\n* Make sure you have Python>=3.5 (we recommend using a Conda environment). \n* Add the project directory to your Python paths.\n* Install dependencies:\n    * PyTorch v0.4-1.1 (incl. torchvision) with CUDA: see [PyTorch instructions](https:\u002F\u002Fpytorch.org\u002Fget-started\u002Flocally\u002F).\n    * Additional libraries: \n        ```bash\n        pip install -r requirements.txt\n        ```\n* (Optional) Verify installation: \n    ```bash\n    python -m unittest \n    ```\n\n\n### Layer Catalog \n\nWe implemented 5 types of PAC layers (as PyTorch `Module`):  \n* `PacConv2d`: the standard variant\n* `PacConvTranspose2d`: the transposed (fractionally-strided) variant for upsampling\n* `PacPool2d`: the pooling variant\n* `PacCRF`: Mean-Field (MF) inference of a CRF\n* `PacCRFLoose`: MF inference of a CRF where the MF steps do not share weights\n\nMore details regarding each layer are provided below.\n\n#### `PacConv2d`\n\n`PacConv2d` is the PAC counterpart of `nn.Conv2d`. 
It accepts most standard `nn.Conv2d` arguments (including in_channels, out_channels, kernel_size, bias, stride, padding, dilation, but not groups and padding_mode), \nand we make sure that when the same arguments are used, `PacConv2d` and `nn.Conv2d` have the exact same output sizes. \nA few additional optional arguments are available: \n\n```\n    Args (in addition to those of Conv2d):\n        kernel_type (str): 'gaussian' | 'inv_{alpha}_{lambda}[_asym][_fixed]'. Default: 'gaussian'\n        smooth_kernel_type (str): 'none' | 'gaussian' | 'average_{sz}' | 'full_{sz}'. Default: 'none'\n        normalize_kernel (bool): Default: False\n        shared_filters (bool): Default: False\n        filler (str): 'uniform'. Default: 'uniform'\n\n    Note:\n        - kernel_size only accepts odd numbers\n        - padding should not be larger than :math:`dilation * (kernel_size - 1) \u002F 2`\n```\n\nWhen used to build computation graphs, this layer takes two input tensors and generates one output tensor:  \n\n```python\nin_ch, out_ch, g_ch = 16, 32, 8         # channel sizes of input, output and guidance\nf, b, h, w = 5, 2, 64, 64               # filter size, batch size, input height and width\ninput = torch.rand(b, in_ch, h, w)\nguide = torch.rand(b, g_ch, h, w)       # guidance feature ('f' in Eq.3 of paper)\n\nconv = nn.Conv2d(in_ch, out_ch, f)\nout_conv = conv(input)                  # standard spatial convolution\n\npacconv = PacConv2d(in_ch, out_ch, f)   \nout_pac = pacconv(input, guide)         # PAC \nout_pac = pacconv(input, None, guide_k) # alternative interface\n                                        # guide_k is pre-computed 'K' (see Eq.3 of paper) \n                                        # of shape [b, g_ch, f, f, h, w]. packernel2d can be \n                                        # used for its creation.  \n```\n\nUse `pacconv2d` (in conjunction with `packernel2d`) for its functional interface. 
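For intuition, the pixel-adaptive operation of Eq. 3 is easy to write out by hand. Below is a minimal NumPy sketch (an illustration only, not the repository's implementation — the function name `pac_conv2d` and its signature are ours), assuming the default fixed Gaussian kernel on guidance differences, stride 1 and no padding. With a constant guidance map the adapting kernel K equals 1 everywhere, so the operation reduces to a standard convolution:

```python
import numpy as np

def pac_conv2d(x, guide, W, kernel_fn=None):
    """Naive pixel-adaptive convolution (cf. Eq. 3), 'valid' padding, stride 1.

    x:     input,    shape (in_ch, H, W)
    guide: guidance, shape (g_ch, H, W)
    W:     weights,  shape (out_ch, in_ch, f, f), f odd
    """
    if kernel_fn is None:  # fixed Gaussian kernel on guidance differences
        kernel_fn = lambda fi, fj: np.exp(-0.5 * np.sum((fi - fj) ** 2))
    out_ch, in_ch, f, _ = W.shape
    r = f // 2
    H, Wd = x.shape[1:]
    oh, ow = H - 2 * r, Wd - 2 * r
    out = np.zeros((out_ch, oh, ow))
    for i in range(oh):
        for j in range(ow):
            fi = guide[:, i + r, j + r]          # guidance at the center pixel
            # K re-weights the filter at every output location
            K = np.array([[kernel_fn(fi, guide[:, i + di, j + dj])
                           for dj in range(f)] for di in range(f)])
            patch = x[:, i:i + f, j:j + f]       # (in_ch, f, f) neighborhood
            out[:, i, j] = np.tensordot(W * K, patch,
                                        axes=([1, 2, 3], [0, 1, 2]))
    return out
```

The library's `PacConv2d` exposes the same idea with stride, padding, dilation and configurable kernel types, implemented efficiently for batched tensors.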
\n\n#### `PacConvTranspose2d`\n`PacConvTranspose2d` is the PAC counterpart of `nn.ConvTranspose2d`. It accepts most standard `nn.ConvTranspose2d` \narguments (including in_channels, out_channels, kernel_size, bias, stride, padding, output_padding, dilation, but not groups and padding_mode), and we make sure that when the same arguments are used, \n`PacConvTranspose2d` and `nn.ConvTranspose2d` have the exact same output sizes. \nA few additional optional arguments are available: \n\n```\n    Args (in addition to those of ConvTranspose2d):\n        kernel_type (str): 'gaussian' | 'inv_{alpha}_{lambda}[_asym][_fixed]'. Default: 'gaussian'\n        smooth_kernel_type (str): 'none' | 'gaussian' | 'average_{sz}' | 'full_{sz}'. Default: 'none'\n        normalize_kernel (bool): Default: False\n        shared_filters (bool): Default: False\n        filler (str): 'uniform' | 'linear'. Default: 'uniform'\n\n    Note:\n        - kernel_size only accepts odd numbers\n        - padding should not be larger than :math:`dilation * (kernel_size - 1) \u002F 2`\n```\n\nSimilar to `PacConv2d`, `PacConvTranspose2d` also offers two ways of usage: \n\n```python\nin_ch, out_ch, g_ch = 16, 32, 8             # channel sizes of input, output and guidance\nf, b, h, w, oh, ow = 5, 2, 8, 8, 16, 16     # filter size, batch size, input and output sizes\ninput = torch.rand(b, in_ch, h, w)\nguide = torch.rand(b, g_ch, oh, ow)         # guidance feature, note that it needs to match \n                                            # the spatial sizes of the output\n\nconvt = nn.ConvTranspose2d(in_ch, out_ch, f, stride=2, padding=2, output_padding=1)\nout_convt = convt(input)                    # standard transposed convolution\n\npacconvt = PacConvTranspose2d(in_ch, out_ch, f, stride=2, padding=2, output_padding=1)   \nout_pact = pacconvt(input, guide)           # PAC \nout_pact = pacconvt(input, None, guide_k)   # alternative interface\n                                 
           # guide_k is pre-computed 'K' \n                                            # of shape [b, g_ch, f, f, oh, ow].\n                                            # packernel2d can be used for its creation.  \n```\n\nUse `pacconv_transpose2d` (in conjunction with `packernel2d`) for its functional interface. \n\n#### `PacPool2d`\n`PacPool2d` is the PAC counterpart of `nn.AvgPool2d`. It accepts most standard `nn.AvgPool2d` \narguments (including kernel_size, stride, padding, dilation, but not ceil_mode and count_include_pad), and we make sure that when the same arguments are used, \n`PacPool2d` and `nn.AvgPool2d` have the exact same output sizes. \nA few additional optional arguments are available: \n\n```\n    Args:\n        kernel_size, stride, padding, dilation\n        kernel_type (str): 'gaussian' | 'inv_{alpha}_{lambda}[_asym][_fixed]'. Default: 'gaussian'\n        smooth_kernel_type (str): 'none' | 'gaussian' | 'average_{sz}' | 'full_{sz}'. Default: 'none'\n        channel_wise (bool): Default: False\n        normalize_kernel (bool): Default: False\n        out_channels (int): needs to be specified for channel_wise 'inv_*' (non-fixed) kernels. 
Default: -1\n\n    Note:\n        - kernel_size only accepts odd numbers\n        - padding should not be larger than :math:`dilation * (kernel_size - 1) \u002F 2`\n```\n\nSimilar to `PacConv2d`, `PacPool2d` also offers two ways of usage: \n\n```python\nin_ch, g_ch = 16, 8                     # channel sizes of input and guidance\nstride, f, b, h, w = 2, 5, 2, 64, 64    # stride, filter size, batch size, input height and width\ninput = torch.rand(b, in_ch, h, w)\nguide = torch.rand(b, g_ch, h, w)       # guidance feature \n\npool = nn.AvgPool2d(f, stride)\nout_pool = pool(input)                  # standard average pooling\n\npacpool = PacPool2d(f, stride)   \nout_pac = pacpool(input, guide)         # PAC \nout_pac = pacpool(input, None, guide_k) # alternative interface\n                                        # guide_k is pre-computed 'K'\n                                        # of shape [b, g_ch, f, f, h, w]. packernel2d can be \n                                        # used for its creation.  \n```\n\nUse `pacpool2d` (in conjunction with `packernel2d`) for its functional interface. \n\n#### `PacCRF` and `PacCRFLoose`\nThese layers offer a convenient way to add a CRF component at the end of a dense prediction network. \nThey perform approximate mean-field inference under the hood. Available arguments \ninclude: \n\n```python\n    Args:\n        channels (int): number of categories.\n        num_steps (int): number of mean-field update steps.\n        final_output (str): 'log_softmax' | 'softmax' | 'log_Q'. Default: 'log_Q'\n        perturbed_init (bool): whether to perturb initialization. Default: True\n        native_impl (bool): Default: False\n        fixed_weighting (bool): whether to use fixed weighting for unary\u002Fpairwise terms. Default: False\n        unary_weight (float): Default: 1.0\n        pairwise_kernels (dict or list): pairwise kernels, see add_pairwise_kernel() for details. 
Default: None\n```\n\nUsage example: \n\n```python\n# create a CRF layer for 21 classes using 5 mean-field steps\ncrf = PacCRF(21, num_steps=5, unary_weight=1.0)\n\n# add a pairwise term with weight equal to the unary term\ncrf.add_pairwise_kernel(kernel_size=5, dilation=1, blur=1, compat_type='4d', pairwise_weight=1.0)\n\n# a convenient function is provided for creating pairwise features based on pixel color and positions\nedge_features = [paccrf.create_YXRGB(im, yx_scale=100.0, rgb_scale=30.0)] \noutput = crf(unary, edge_features)\n\n# Note that we use constant values for unary_weight, pairwise_weight, yx_scale, rgb_scale, but they can \n# also take tensors and be learned through backprop.\n```\n\n### Experiments\n\n#### Joint upsampling\n##### Joint depth upsampling on NYU Depth V2\n* Train\u002Ftest split is provided by [Li et al.](https:\u002F\u002Fsites.google.com\u002Fsite\u002Fyijunlimaverick\u002Fdeepjointfilter) \n* Test with one of our pre-trained models: \n\n    ```bash\n    python -m task_jointUpsampling.main --load-weights weights_depth\u002Fx8_pac_weights_epoch_5000.pth \\\n                                        --download \\\n                                        --factor 8 \\\n                                        --model PacJointUpsample \\\n                                        --dataset NYUDepthV2 \\\n                                        --data-root data\u002Fnyu\n    ```\n    \n    |   | 4x | 8x  | 16x  |\n    |---|---|---|---|\n    | `Bilinear` | RMSE: 5.43 | RMSE: 8.36  | RMSE: 12.90 |\n    | `PacJointUpsample`  | RMSE: 2.39 &#124; [download](http:\u002F\u002Fmaxwell.cs.umass.edu\u002Fpacnet-data\u002Fpretrained_weights\u002Fdepth_upsampling\u002Fx4_pac_weights_epoch_5000.pth) | RMSE: 4.59 &#124; [download](http:\u002F\u002Fmaxwell.cs.umass.edu\u002Fpacnet-data\u002Fpretrained_weights\u002Fdepth_upsampling\u002Fx8_pac_weights_epoch_5000.pth)  | RMSE: 8.09 &#124; 
[download](http:\u002F\u002Fmaxwell.cs.umass.edu\u002Fpacnet-data\u002Fpretrained_weights\u002Fdepth_upsampling\u002Fx16_pac_weights_epoch_5000.pth)  |\n    | `PacJointUpsampleLite`  | RMSE: 2.55  &#124; [download](http:\u002F\u002Fmaxwell.cs.umass.edu\u002Fpacnet-data\u002Fpretrained_weights\u002Fdepth_upsampling\u002Fx4_paclite_weights_epoch_5000.pth)  | RMSE: 4.82  &#124; [download](http:\u002F\u002Fmaxwell.cs.umass.edu\u002Fpacnet-data\u002Fpretrained_weights\u002Fdepth_upsampling\u002Fx8_paclite_weights_epoch_5000.pth)  | RMSE: 8.52 &#124; [download](http:\u002F\u002Fmaxwell.cs.umass.edu\u002Fpacnet-data\u002Fpretrained_weights\u002Fdepth_upsampling\u002Fx16_paclite_weights_epoch_5000.pth)  |\n    | `DJIF`  | RMSE: 2.64 &#124; [download](http:\u002F\u002Fmaxwell.cs.umass.edu\u002Fpacnet-data\u002Fpretrained_weights\u002Fdepth_upsampling\u002Fx4_djif_weights_epoch_5000.pth)  | RMSE: 5.15 &#124; [download](http:\u002F\u002Fmaxwell.cs.umass.edu\u002Fpacnet-data\u002Fpretrained_weights\u002Fdepth_upsampling\u002Fx8_djif_weights_epoch_5000.pth)  | RMSE: 9.39 &#124; [download](http:\u002F\u002Fmaxwell.cs.umass.edu\u002Fpacnet-data\u002Fpretrained_weights\u002Fdepth_upsampling\u002Fx16_djif_weights_epoch_5000.pth)  |\n\n* Train from scratch: \n\n    ```bash\n    python -m task_jointUpsampling.main --factor 8 \\\n                                        --data-root data\u002Fnyu \\\n                                        --exp-root exp\u002Fnyu \\\n                                        --download \\\n                                        --dataset NYUDepthV2 \\\n                                        --epochs 5000 \\\n                                        --lr-steps 3500 4500\n    ```\n    \n    See `python -m task_jointUpsampling.main -h` for the complete list of command line options. 
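In joint upsampling, a low-resolution signal (here depth) is upsampled under the guidance of a high-resolution image. As a purely illustrative baseline (not part of this repository — the function `joint_bilateral_upsample` and its parameters below are ours), the classical joint bilateral upsampling idea behind this task can be sketched as:

```python
import numpy as np

def joint_bilateral_upsample(depth_lr, rgb_hr, factor,
                             sigma_xy=1.0, sigma_rgb=0.1, radius=2):
    """Classical joint bilateral upsampling (illustration only): each high-res
    pixel averages nearby low-res depth samples, weighted by spatial distance
    and by RGB similarity in the high-res guidance image."""
    gh, gw = rgb_hr.shape[:2]
    lh, lw = depth_lr.shape
    out = np.zeros((gh, gw))
    for y in range(gh):
        for x in range(gw):
            cy, cx = y / factor, x / factor      # position in low-res coordinates
            acc = wsum = 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    ly, lx = int(round(cy)) + dy, int(round(cx)) + dx
                    if not (0 <= ly < lh and 0 <= lx < lw):
                        continue
                    # spatial weight, measured in low-res pixels
                    ws = np.exp(-((ly - cy) ** 2 + (lx - cx) ** 2)
                                / (2 * sigma_xy ** 2))
                    # range weight from the guidance image: compare RGB at the
                    # output pixel with RGB above the low-res sample
                    hy, hx = min(ly * factor, gh - 1), min(lx * factor, gw - 1)
                    wr = np.exp(-np.sum((rgb_hr[y, x] - rgb_hr[hy, hx]) ** 2)
                                / (2 * sigma_rgb ** 2))
                    acc += ws * wr * depth_lr[ly, lx]
                    wsum += ws * wr
            out[y, x] = acc / wsum
    return out
```

The learned `PacJointUpsample` models go beyond this baseline by learning the guidance features and the pixel-adaptive filters end-to-end.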
\n\n##### Joint optical flow upsampling on Sintel\n* Train\u002Fval split (`1` - train, `2` - val) is provided in [meta\u002FSintel_train_val.txt](task_jointUpsampling\u002Fmeta\u002FSintel_train_val.txt) ([original source](https:\u002F\u002Flmb.informatik.uni-freiburg.de\u002Fresources\u002Fdatasets\u002FFlyingChairs\u002FSintel_train_val.txt)): \n    * Validation: 133 pairs\n        * `ambush_6` (all 19)\n        * `bamboo_2` (last 25)\n        * `cave_4` (last 25)\n        * `market_6` (all 39)\n        * `temple_2` (last 25)\n    * Training: remaining 908 pairs\n\n* Test with one of our pre-trained models: \n\n    ```bash\n    python -m task_jointUpsampling.main --load-weights weights_flow\u002Fx8_pac_weights_epoch_5000.pth \\\n                                        --download \\\n                                        --factor 8 \\\n                                        --model PacJointUpsample \\\n                                        --dataset Sintel \\\n                                        --data-root data\u002Fsintel\n    ```\n    \n    |   | 4x | 8x  | 16x  |\n    |---|---|---|---|\n    | `Bilinear` | EPE: 0.4650 | EPE: 0.9011 | EPE: 1.6281 |\n    | `PacJointUpsample` | EPE: 0.1042 &#124; [download](http:\u002F\u002Fmaxwell.cs.umass.edu\u002Fpacnet-data\u002Fpretrained_weights\u002Fflow_upsampling\u002Fx4_pac_weights_epoch_5000.pth) | EPE: 0.2558 &#124; [download](http:\u002F\u002Fmaxwell.cs.umass.edu\u002Fpacnet-data\u002Fpretrained_weights\u002Fflow_upsampling\u002Fx8_pac_weights_epoch_5000.pth) | EPE: 0.5921 &#124; [download](http:\u002F\u002Fmaxwell.cs.umass.edu\u002Fpacnet-data\u002Fpretrained_weights\u002Fflow_upsampling\u002Fx16_pac_weights_epoch_5000.pth) |\n    | `DJIF` | EPE: 0.1760 &#124; [download](http:\u002F\u002Fmaxwell.cs.umass.edu\u002Fpacnet-data\u002Fpretrained_weights\u002Fflow_upsampling\u002Fx4_djif_weights_epoch_5000.pth) | EPE: 0.4382 &#124; 
[download](http:\u002F\u002Fmaxwell.cs.umass.edu\u002Fpacnet-data\u002Fpretrained_weights\u002Fflow_upsampling\u002Fx8_djif_weights_epoch_5000.pth) | EPE: 1.0422 &#124; [download](http:\u002F\u002Fmaxwell.cs.umass.edu\u002Fpacnet-data\u002Fpretrained_weights\u002Fflow_upsampling\u002Fx16_djif_weights_epoch_5000.pth) |\n\n* Train from scratch: \n\n    ```bash\n    python -m task_jointUpsampling.main --factor 8 \\\n                                        --data-root data\u002Fsintel \\\n                                        --exp-root exp\u002Fsintel \\\n                                        --download \\\n                                        --dataset Sintel \\\n                                        --epochs 5000 \\\n                                        --lr-steps 3500 4500\n    ```\n    \n    See `python -m task_jointUpsampling.main -h` for the complete list of command line options. \n    \n#### Semantic segmentation\n* Test with one of the pre-trained models:\n\n    ```bash\n    python -m task_semanticSegmentation.main --data-root data\u002Fvoc \\\n                                             --exp-root exp\u002Fvoc \\\n                                             --download \\\n                                             --load-weights fcn8s_from_caffe.pth \\\n                                             --model fcn8s \\\n                                             --test-split val11_sbd \\\n                                             --test-crop -1\n    ```\n    \n    |   | mIoU (val \u002F test) | model name  | weights | \n    |---|---|---|---|\n    | Backbone (FCN8s)  | 65.51% \u002F 67.20% | `fcn8s`  | [download](http:\u002F\u002Fmaxwell.cs.umass.edu\u002Fpacnet-data\u002Fpretrained_weights\u002Fsemantic_segmentation\u002Ffcn8s_from_caffe.pth)   |\n    | PacCRF  | 68.90% \u002F 69.82% | `fcn8s_crfi5p4d5641p4d5161`  | 
[download](http:\u002F\u002Fmaxwell.cs.umass.edu\u002Fpacnet-data\u002Fpretrained_weights\u002Fsemantic_segmentation\u002Ffcn8s_paccrf_epoch_30.pth)  |\n    | PacCRF-32  |  68.52% \u002F 69.41% | `fcn8s_crfi5p4d5321`  | [download](http:\u002F\u002Fmaxwell.cs.umass.edu\u002Fpacnet-data\u002Fpretrained_weights\u002Fsemantic_segmentation\u002Ffcn8s_paccrf32_epoch_30.pth)  |\n    | PacFCN (hot-swapping)  | 67.44% \u002F 69.18% | `fcn8spac`  | [download](http:\u002F\u002Fmaxwell.cs.umass.edu\u002Fpacnet-data\u002Fpretrained_weights\u002Fsemantic_segmentation\u002Ffcn8s_pacfcn_epoch_20.pth)  |\n    | PacFCN+PacCRF  |  69.87% \u002F 71.34% | `fcn8spac_crfi5p4d5641p4d5161`  |  [download](http:\u002F\u002Fmaxwell.cs.umass.edu\u002Fpacnet-data\u002Fpretrained_weights\u002Fsemantic_segmentation\u002Ffcn8s_pacfcncrf_epoch_20.pth) |\n    \n    Note that the last two models require argument `--test-crop 512`.\n\n* Generate predictions\n    \n    Use the `--eval pred` mode to save predictions instead of reporting scores. Predictions will be saved \n    under `exp-root`\u002Foutputs_*_pred, and can be used for the VOC evaluation server: \n    \n    ```bash\n    python -m task_semanticSegmentation.main \\\n    --data-root data\u002Fvoc \\\n    --exp-root exp\u002Fvoc \\\n    --load-weights fcn8s_paccrf_epoch_30.pth \\\n    --test-crop -1 \\\n    --test-split test \\\n    --eval pred \\\n    --model fcn8s_crfi5p4d5641p4d5161 \n    \n    cd exp\u002Fvoc\n    mkdir -p results\u002FVOC2012\u002FSegmentation\n    mv outputs_test_pred results\u002FVOC2012\u002FSegmentation\u002Fcomp6_test_cls\n    tar zcf results_fcn8s_crf.tgz results\n    ```\n    \n    Note that since there is no publicly available URL for the test split of VOC, when using the test split, \n    the data files need to be downloaded from the [official website](http:\u002F\u002Fhost.robots.ox.ac.uk:8080\u002Feval\u002Fdownloads\u002FVOC2012test.tar) manually. 
\n    Simply place the downloaded VOC2012test.tar under the data root and untar. \n\n* Train models\n\n    As an example, here are the commands for the two-stage training of PacCRF: \n    \n    ```bash\n    # stage 1: train CRF only with frozen backbone\n    python -m task_semanticSegmentation.main \\\n    --data-root data\u002Fvoc \\\n    --exp-root exp\u002Fvoc\u002Fcrf_only \\\n    --load-weights-backbone fcn8s_from_caffe.pth \\\n    --train-split train11 \\\n    --test-split val11_sbd \\\n    --train-crop 449 \\\n    --test-crop -1 \\\n    --model fcn8sfrozen_crfi5p4d5641p4d5161 \\\n    --epochs 40 \\\n    --lr 0.001 \\\n    --lr-steps 20\n    \n    # stage 2: train CRF and backbone jointly\n    python -m task_semanticSegmentation.main \\\n    --data-root data\u002Fvoc \\\n    --exp-root exp\u002Fvoc\u002Fjoint \\\n    --load-weights-backbone fcn8s_from_caffe.pth \\\n    --load-weights exp\u002Fvoc\u002Fcrf_only\u002Fweights_epoch_40.pth \\\n    --train-split train11 \\\n    --test-split val11_sbd \\\n    --train-crop 449 \\\n    --test-crop -1 \\\n    --model fcn8s_crfi5lp4d5641p4d5161 \\\n    --epochs 30 \\\n    --lr 0.0000001 \\\n    --lr-steps 20\n    ```\n\nSee `python -m task_semanticSegmentation.main -h` for the complete list of command line options. 
\n\n### Citation\nIf you use this code for your research, please consider citing our paper: \n```\n@inproceedings{su2019pixel,\n  author    = {Hang Su and \n\t       Varun Jampani and \n\t       Deqing Sun and \n\t       Orazio Gallo and \n\t       Erik Learned-Miller and \n\t       Jan Kautz},\n  title     = {Pixel-Adaptive Convolutional Neural Networks},\n  booktitle = {Proceedings of the IEEE Conference on Computer \n               Vision and Pattern Recognition (CVPR)},\n  year      = {2019}\n}\n```\n","## 像素自适应卷积神经网络\n\n#### [项目页面](https:\u002F\u002Fsuhangpro.github.io\u002Fpac\u002Findex.html) |  [论文](https:\u002F\u002Farxiv.org\u002Fabs\u002F1904.05373) | [视频](https:\u002F\u002Fyoutu.be\u002FgsQZbHuR64o)\n\n像素自适应卷积神经网络\u003Cbr>\n[苏航](https:\u002F\u002Fsuhangpro.github.io\u002F)、[瓦伦·詹帕尼](https:\u002F\u002Fvarunjampani.github.io\u002F)、[孙德清](http:\u002F\u002Fresearch.nvidia.com\u002Fperson\u002Fdeqing-sun)、[奥拉齐奥·加洛](https:\u002F\u002Fresearch.nvidia.com\u002Fperson\u002Forazio-gallo)、[埃里克·勒恩德-米勒](http:\u002F\u002Fpeople.cs.umass.edu\u002F~elm\u002F)以及[扬·考茨](http:\u002F\u002Fjankautz.com\u002F)。\u003Cbr>\nCVPR 2019。\n\n### 许可证\n\n版权所有 © 2019 NVIDIA Corporation。保留所有权利。\n根据 CC BY-NC-SA 4.0 许可证授权（https:\u002F\u002Fcreativecommons.org\u002Flicenses\u002Fby-nc-sa\u002F4.0\u002Flegalcode）。\n\n\n### 安装\n* 确保您已安装 Python>=3.5（我们建议使用 Conda 环境）。\n* 将项目目录添加到您的 Python 路径中。\n* 安装依赖项：\n    * PyTorch v0.4-1.1（包括 torchvision）并支持 CUDA：请参阅 [PyTorch 安装说明](https:\u002F\u002Fpytorch.org\u002Fget-started\u002Flocally\u002F)。\n    * 其他库：\n        ```bash\n        pip install -r requirements.txt\n        ```\n* （可选）验证安装：\n    ```bash\n    python -m unittest \n    ```\n\n\n### 层目录\n\n我们实现了 5 种类型的 PAC 层（作为 PyTorch `Module`）：\n* `PacConv2d`：标准变体\n* `PacConvTranspose2d`：用于上采样的转置（分数步长）变体\n* `PacPool2d`：池化变体\n* `PacCRF`：条件随机场的均场推断\n* `PacCRFLoose`：条件随机场的均场推断，其中均场步骤不共享权重\n\n关于每种层的更多详细信息如下。\n\n#### `PacConv2d`\n\n`PacConv2d` 是 `nn.Conv2d` 的 PAC 对应版本。它接受大多数标准的 `nn.Conv2d` 参数（包括 
in_channels、out_channels、kernel_size、bias、stride、padding、dilation，但不包括 groups 和 padding_mode），并且我们确保在使用相同参数时，`PacConv2d` 和 `nn.Conv2d` 具有完全相同的输出尺寸。\n此外，还有一些可选参数：\n\n```\n    参数（除 Conv2d 的参数外）：\n        kernel_type（str）：'gaussian' | 'inv_{alpha}_{lambda}[_asym][_fixed]'。默认值：'gaussian'\n        smooth_kernel_type（str）：'none' | 'gaussian' | 'average_{sz}' | 'full_{sz}'。默认值：'none'\n        normalize_kernel（bool）：默认值：False\n        shared_filters（bool）：默认值：False\n        filler（str）：'uniform'。默认值：'uniform'\n\n    注意：\n        - kernel_size 只能为奇数\n        - padding 不得大于 :math:`dilation * (kernel_size - 1) \u002F 2`\n```\n\n当用于构建计算图时，该层接受两个输入张量并生成一个输出张量：\n\n```python\nin_ch, out_ch, g_ch = 16, 32, 8         # 输入、输出和引导通道的数量\nf, b, h, w = 5, 2, 64, 64               # 滤波器大小、批次大小、输入高度和宽度\ninput = torch.rand(b, in_ch, h, w)\nguide = torch.rand(b, g_ch, h, w)       # 引导特征（见论文公式3中的'f'）\n\nconv = nn.Conv2d(in_ch, out_ch, f)\nout_conv = conv(input)                  # 标准空间卷积\n\npacconv = PacConv2d(in_ch, out_ch, f)   \nout_pac = pacconv(input, guide)         # PAC \nout_pac = pacconv(input, None, guide_k) # 替代接口\n                                        # guide_k 是预先计算的'K'（见论文公式3）\n                                        # 形状为[b, g_ch, f, f, h, w]。可以使用 packernel2d 来创建它。\n```\n\n请使用 `pacconv2d`（结合 `packernel2d`）以获得其功能接口。\n\n#### `PacConvTranspose2d`\n`PacConvTranspose2d` 是 `nn.ConvTranspose2d` 的 PAC 对应版本。它接受大多数标准的 `nn.ConvTranspose2d` 参数（包括 in_channels、out_channels、kernel_size、bias、stride、padding、output_padding、dilation，但不包括 groups 和 padding_mode），并且我们确保在使用相同参数时，`PacConvTranspose2d` 和 `nn.ConvTranspose2d` 具有完全相同的输出尺寸。\n此外，还有一些可选参数：\n\n```\n    参数（除 ConvTranspose2d 的参数外）：\n        kernel_type（str）：'gaussian' | 'inv_{alpha}_{lambda}[_asym][_fixed]'。默认值：'gaussian'\n        smooth_kernel_type（str）：'none' | 'gaussian' | 'average_{sz}' | 'full_{sz}'。默认值：'none'\n        normalize_kernel（bool）：默认值：False\n        shared_filters（bool）：默认值：False\n        filler（str）：'uniform' | 
'linear'。默认值：'uniform'\n\n    注意：\n        - kernel_size 只能为奇数\n        - padding 不得大于 :math:`dilation * (kernel_size - 1) \u002F 2`\n```\n\n与 `PacConv2d` 类似，`PacConvTranspose2d` 也提供两种使用方式：\n\n```python\nin_ch, out_ch, g_ch = 16, 32, 8             # 输入、输出和引导通道的数量\nf, b, h, w, oh, ow = 5, 2, 8, 8, 16, 16     # 滤波器大小、批次大小、输入和输出尺寸\ninput = torch.rand(b, in_ch, h, w)\nguide = torch.rand(b, g_ch, oh, ow)         # 引导特征，注意它需要与输出的空间尺寸匹配\n\nconvt = nn.ConvTranspose2d(in_ch, out_ch, f, stride=2, padding=2, output_padding=1)\nout_convt = convt(input)                    # 标准转置卷积\n\npacconvt = PacConvTranspose2d(in_ch, out_ch, f, stride=2, padding=2, output_padding=1)   \nout_pact = pacconvt(input, guide)           # PAC \nout_pact = pacconvt(input, None, guide_k)   # 替代接口\n                                            # guide_k 是预先计算的'K' \n                                            # 形状为 [b, g_ch, f, f, oh, ow]。\n                                            # 可以使用 packernel2d 来创建它。\n```\n\n请使用 `pacconv_transpose2d`（结合 `packernel2d`）以获得其功能接口。\n\n#### `PacPool2d`\n`PacPool2d` 是 `nn.AvgPool2d` 的 PAC 对应版本。它接受大多数标准的 `nn.AvgPool2d` 参数（包括 kernel_size、stride、padding、dilation，但不包括 ceil_mode 和 count_include_pad），并且我们确保在使用相同参数时，`PacPool2d` 和 `nn.AvgPool2d` 具有完全相同的输出尺寸。\n此外，还有一些可选参数：\n\n```\n    参数：\n        kernel_size、stride、padding、dilation\n        kernel_type（str）：'gaussian' | 'inv_{alpha}_{lambda}[_asym][_fixed]'。默认值：'gaussian'\n        smooth_kernel_type（str）：'none' | 'gaussian' | 'average_{sz}' | 'full_{sz}'。默认值：'none'\n        channel_wise（bool）：默认值：False\n        normalize_kernel（bool）：默认值：False\n        out_channels（int）：对于 channel_wise 的 'inv_*'（非固定）核，必须指定。默认值：-1\n\n    注意：\n        - kernel_size 只能为奇数\n        - padding 不得大于 :math:`dilation * (kernel_size - 1) \u002F 2`\n```\n\n与 `PacConv2d` 类似，`PacPool2d` 也提供两种使用方式：\n\n```python\nin_ch, g_ch = 16, 8                     # 输入和引导特征的通道数\nstride, f, b, h, w = 2, 5, 2, 64, 64    # 步幅、卷积核大小、批次大小、输入高度和宽度\ninput = torch.rand(b, in_ch, h, w)\nguide = 
torch.rand(b, g_ch, h, w)       # 引导特征\n\npool = nn.AvgPool2d(f, stride)\nout_pool = pool(input)                  # 标准平均池化\n\npacpool = PacPool2d(f, stride)   \nout_pac = pacpool(input, guide)         # PAC \nout_pac = pacpool(input, None, guide_k) # 另一种接口\n                                        # guide_k 是预先计算好的 'K'\n                                        # 形状为 [b, g_ch, f, f, h, w]。可以使用 packernel2d 来创建它。\n```\n\n请使用 `pacpool2d`（结合 `packernel2d`）的功能性接口。\n\n#### `PacCRF` 和 `PacCRFLoose`\n这些层提供了一种便捷的方式，可以在密集预测网络的末尾添加 CRF 组件。它们在底层执行近似的均值场推理。可用的参数包括：\n\n```python\n    Args:\n        channels (int): 类别数量。\n        num_steps (int): 均值场更新的步数。\n        final_output (str): 'log_softmax' | 'softmax' | 'log_Q'。默认值：'log_Q'\n        perturbed_init (bool): 是否对初始化进行扰动。默认值：True\n        native_impl (bool): 默认值：False\n        fixed_weighting (bool): 是否对一元项\u002F二元项使用固定权重。默认值：False\n        unary_weight (float): 默认值：1.0\n        pairwise_kernels (dict 或 list): 二元核，详情请参见 add_pairwise_kernel()。默认值：None\n```\n\n使用示例：\n\n```python\n# 创建一个针对 21 类、使用 5 步均值场推理的 CRF 层\ncrf = PacCRF(21, num_steps=5, unary_weight=1.0)\n\n# 添加一个与一元项权重相等的二元项\ncrf.add_pairwise_kernel(kernel_size=5, dilation=1, blur=1, compat_type='4d', pairwise_weight=1.0)\n\n# 提供了一个便捷函数，用于根据像素颜色和位置创建二元特征\nedge_features = [paccrf.create_YXRGB(im, yx_scale=100.0, rgb_scale=30.0)] \noutput = crf(unary, edge_features)\n\n# 注意，我们这里使用了常量值作为 unary_weight、pairwise_weight、yx_scale 和 rgb_scale，但它们也可以是张量，并通过反向传播进行学习。\n```\n\n### 实验\n\n#### 联合上采样\n##### NYU Depth V2 上的联合深度上采样\n* 训练\u002F测试划分由 [Li 等](https:\u002F\u002Fsites.google.com\u002Fsite\u002Fyijunlimaverick\u002Fdeepjointfilter) 提供\n* 使用我们预训练的模型之一进行测试：\n\n    ```bash\n    python -m task_jointUpsampling.main --load-weights weights_depth\u002Fx8_pac_weights_epoch_5000.pth \\\n                                        --download \\\n                                        --factor 8 \\\n                                        --model PacJointUpsample \\\n                          
              --dataset NYUDepthV2 \\\n                                        --data-root data\u002Fnyu\n    ```\n    \n    |   | 4x | 8x  | 16x  |\n    |---|---|---|---|\n    | `双线性插值` | RMSE: 5.43 | RMSE: 8.36  | RMSE: 12.90 |\n    | `PacJointUpsample`  | RMSE: 2.39 &#124; [下载](http:\u002F\u002Fmaxwell.cs.umass.edu\u002Fpacnet-data\u002Fpretrained_weights\u002Fdepth_upsampling\u002Fx4_pac_weights_epoch_5000.pth) | RMSE: 4.59 &#124; [下载](http:\u002F\u002Fmaxwell.cs.umass.edu\u002Fpacnet-data\u002Fpretrained_weights\u002Fdepth_upsampling\u002Fx8_pac_weights_epoch_5000.pth)  | RMSE: 8.09 &#124; [下载](http:\u002F\u002Fmaxwell.cs.umass.edu\u002Fpacnet-data\u002Fpretrained_weights\u002Fdepth_upsampling\u002Fx16_pac_weights_epoch_5000.pth)  |\n    | `PacJointUpsampleLite`  | RMSE: 2.55  &#124; [下载](http:\u002F\u002Fmaxwell.cs.umass.edu\u002Fpacnet-data\u002Fpretrained_weights\u002Fdepth_upsampling\u002Fx4_paclite_weights_epoch_5000.pth)  | RMSE: 4.82  &#124; [下载](http:\u002F\u002Fmaxwell.cs.umass.edu\u002Fpacnet-data\u002Fpretrained_weights\u002Fdepth_upsampling\u002Fx8_paclite_weights_epoch_5000.pth)  | RMSE: 8.52 &#124; [下载](http:\u002F\u002Fmaxwell.cs.umass.edu\u002Fpacnet-data\u002Fpretrained_weights\u002Fdepth_upsampling\u002Fx16_paclite_weights_epoch_5000.pth)  |\n    | `DJIF`  | RMSE: 2.64 &#124; [下载](http:\u002F\u002Fmaxwell.cs.umass.edu\u002Fpacnet-data\u002Fpretrained_weights\u002Fdepth_upsampling\u002Fx4_djif_weights_epoch_5000.pth)  | RMSE: 5.15 &#124; [下载](http:\u002F\u002Fmaxwell.cs.umass.edu\u002Fpacnet-data\u002Fpretrained_weights\u002Fdepth_upsampling\u002Fx8_djif_weights_epoch_5000.pth)  | RMSE: 9.39 &#124; [下载](http:\u002F\u002Fmaxwell.cs.umass.edu\u002Fpacnet-data\u002Fpretrained_weights\u002Fdepth_upsampling\u002Fx16_djif_weights_epoch_5000.pth)  |\n\n* 从头开始训练：\n\n    ```bash\n    python -m task_jointUpsampling.main --factor 8 \\\n                                        --data-root data\u002Fnyu \\\n                                        --exp-root 
exp\u002Fnyu \\\n                                        --download \\\n                                        --dataset NYUDepthV2 \\\n                                        --epochs 5000 \\\n                                        --lr-steps 3500 4500\n    ```\n    \n    有关完整的命令行选项，请参阅 `python -m task_jointUpsampling.main -h`。\n\n##### Sintel 上的联合光流上采样\n* 训练\u002F验证划分（`1` - 训练，`2` - 验证）在 [meta\u002FSintel_train_val.txt](task_jointUpsampling\u002Fmeta\u002FSintel_train_val.txt) 中给出（[原始来源](https:\u002F\u002Flmb.informatik.uni-freiburg.de\u002Fresources\u002Fdatasets\u002FFlyingChairs\u002FSintel_train_val.txt))：\n    * 验证集：133 对\n        * `ambush_6`（全部 19 对）\n        * `bamboo_2`（最后 25 对）\n        * `cave_4`（最后 25 对）\n        * `market_6`（全部 39 对）\n        * `temple_2`（最后 25 对）\n    * 训练集：剩余的 908 对\n\n* 使用我们预训练的模型之一进行测试：\n\n    ```bash\n    python -m task_jointUpsampling.main --load-weights weights_flow\u002Fx8_pac_weights_epoch_5000.pth \\\n                                        --download \\\n                                        --factor 8 \\\n                                        --model PacJointUpsample \\\n                                        --dataset Sintel \\\n                                        --data-root data\u002Fsintel\n    ```\n    \n    |   | 4x | 8x  | 16x  |\n    |---|---|---|---|\n    | `双线性插值` | EPE: 0.4650 | EPE: 0.9011 | EPE: 1.6281 |\n    | `PacJointUpsample` | EPE: 0.1042 &#124; [下载](http:\u002F\u002Fmaxwell.cs.umass.edu\u002Fpacnet-data\u002Fpretrained_weights\u002Fflow_upsampling\u002Fx4_pac_weights_epoch_5000.pth) | EPE: 0.2558 &#124; [下载](http:\u002F\u002Fmaxwell.cs.umass.edu\u002Fpacnet-data\u002Fpretrained_weights\u002Fflow_upsampling\u002Fx8_pac_weights_epoch_5000.pth) | EPE: 0.5921 &#124; [下载](http:\u002F\u002Fmaxwell.cs.umass.edu\u002Fpacnet-data\u002Fpretrained_weights\u002Fflow_upsampling\u002Fx16_pac_weights_epoch_5000.pth) |\n    | `DJIF` | EPE: 0.1760 &#124; 
[download](http:\u002F\u002Fmaxwell.cs.umass.edu\u002Fpacnet-data\u002Fpretrained_weights\u002Fflow_upsampling\u002Fx4_djif_weights_epoch_5000.pth) | EPE: 0.4382 &#124; [download](http:\u002F\u002Fmaxwell.cs.umass.edu\u002Fpacnet-data\u002Fpretrained_weights\u002Fflow_upsampling\u002Fx8_djif_weights_epoch_5000.pth) | EPE: 1.0422 &#124; [download](http:\u002F\u002Fmaxwell.cs.umass.edu\u002Fpacnet-data\u002Fpretrained_weights\u002Fflow_upsampling\u002Fx16_djif_weights_epoch_5000.pth) |\n\n* Train from scratch:\n\n    ```bash\n    python -m task_jointUpsampling.main --factor 8 \\\n                                        --data-root data\u002Fsintel \\\n                                        --exp-root exp\u002Fsintel \\\n                                        --download \\\n                                        --dataset Sintel \\\n                                        --epochs 5000 \\\n                                        --lr-steps 3500 4500\n    ```\n    \n    See `python -m task_jointUpsampling.main -h` for the complete list of command-line options.\n\n#### Semantic segmentation\n* Test with one of the pretrained models:\n\n    ```bash\n    python -m task_semanticSegmentation.main --data-root data\u002Fvoc \\\n                                             --exp-root exp\u002Fvoc \\\n                                             --download \\\n                                             --load-weights fcn8s_from_caffe.pth \\\n                                             --model fcn8s \\\n                                             --test-split val11_sbd \\\n                                             --test-crop -1\n    ```\n    \n    |   | mIoU (val \u002F test) | model name  | weights | \n    |---|---|---|---|\n    | Backbone (FCN8s)  | 65.51% \u002F 67.20% | `fcn8s`  | [download](http:\u002F\u002Fmaxwell.cs.umass.edu\u002Fpacnet-data\u002Fpretrained_weights\u002Fsemantic_segmentation\u002Ffcn8s_from_caffe.pth)   |\n    | PacCRF  | 68.90% \u002F 69.82% | `fcn8s_crfi5p4d5641p4d5161`  | [download](http:\u002F\u002Fmaxwell.cs.umass.edu\u002Fpacnet-data\u002Fpretrained_weights\u002Fsemantic_segmentation\u002Ffcn8s_paccrf_epoch_30.pth)  |\n    | PacCRF-32  |  68.52% \u002F 69.41% | `fcn8s_crfi5p4d5321`  | [download](http:\u002F\u002Fmaxwell.cs.umass.edu\u002Fpacnet-data\u002Fpretrained_weights\u002Fsemantic_segmentation\u002Ffcn8s_paccrf32_epoch_30.pth)  |\n    | PacFCN (hot-swapped)  | 67.44% \u002F 69.18% | `fcn8spac`  | [download](http:\u002F\u002Fmaxwell.cs.umass.edu\u002Fpacnet-data\u002Fpretrained_weights\u002Fsemantic_segmentation\u002Ffcn8s_pacfcn_epoch_20.pth)  |\n    | PacFCN+PacCRF  |  69.87% \u002F 71.34% | `fcn8spac_crfi5p4d5641p4d5161`  |  [download](http:\u002F\u002Fmaxwell.cs.umass.edu\u002Fpacnet-data\u002Fpretrained_weights\u002Fsemantic_segmentation\u002Ffcn8s_pacfcncrf_epoch_20.pth) |\n    \n    Note that the last two models require the option `--test-crop 512`.\n\n* Generating predictions\n    \n    Use the `--eval pred` mode to save predictions instead of reporting scores. Predictions are saved under `exp-root`\u002Foutputs_*_pred and can be submitted to the VOC evaluation server:\n    \n    ```bash\n    python -m task_semanticSegmentation.main \\\n    --data-root data\u002Fvoc \\\n    --exp-root exp\u002Fvoc \\\n    --load-weights fcn8s_paccrf_epoch_30.pth \\\n    --test-crop -1 \\\n    --test-split test \\\n    --eval pred \\\n    --model fcn8s_crfi5p4d5641p4d5161\n    \n    cd exp\u002Fvoc\n    mkdir -p results\u002FVOC2012\u002FSegmentation\n    mv outputs_test_pred results\u002FVOC2012\u002FSegmentation\u002Fcomp6_test_cls\n    tar zcf results_fcn8s_crf.tgz results\n    ```\n    \n    Note that since there is no public download link for the VOC test set, you need to download the data file manually from the [official site](http:\u002F\u002Fhost.robots.ox.ac.uk:8080\u002Feval\u002Fdownloads\u002FVOC2012test.tar) when using the test split. Simply place the downloaded VOC2012test.tar under the data root directory and extract it there.\n\n* Training models\n\n    The two-stage training of PacCRF is shown below as an example:\n\n    ```bash\n    # stage 1: train only the CRF, keeping the backbone frozen\n    python -m task_semanticSegmentation.main \\\n    --data-root data\u002Fvoc \\\n    --exp-root exp\u002Fvoc\u002Fcrf_only \\\n    --load-weights-backbone fcn8s_from_caffe.pth \\\n    --train-split train11 \\\n    --test-split val11_sbd \\\n    --train-crop 449 \\\n    --test-crop -1 \\\n    --model fcn8sfrozen_crfi5p4d5641p4d5161 \\\n    --epochs 40 \\\n    --lr 0.001 \\\n    --lr-steps 20\n\n    # stage 2: train the CRF and the backbone jointly\n    python -m task_semanticSegmentation.main \\\n    --data-root data\u002Fvoc \\\n    --exp-root exp\u002Fvoc\u002Fjoint \\\n    --load-weights-backbone fcn8s_from_caffe.pth \\\n    --load-weights exp\u002Fvoc\u002Fcrf_only\u002Fweights_epoch_40.pth \\\n    --train-split train11 \\\n    --test-split val11_sbd \\\n    --train-crop 449 \\\n    --test-crop -1 \\\n    --model fcn8s_crfi5lp4d5641p4d5161 \\\n    --epochs 30 \\\n    --lr 0.0000001 \\\n    --lr-steps 20\n    ```\n\n    See `python -m task_semanticSegmentation.main -h` for the complete list of command-line options.\n\n### Citation\nIf you use this code in your research, please consider citing our paper:\n```\n@inproceedings{su2019pixel,\n  author    = {Su, Hang and Jampani, Varun and Sun, Deqing and Gallo, Orazio and Learned-Miller, Erik and Kautz, Jan},\n  title     = {Pixel-Adaptive Convolutional Neural Networks},\n  booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},\n  year      = {2019}\n}\n```","# PacNet Quick-Start Guide\n\nPacNet (Pixel-Adaptive Convolutional Neural Networks) is an open-source PyTorch project implementing pixel-adaptive convolutional networks. By introducing guidance features, it lets convolution kernels adapt dynamically to image content, and it is widely used in dense-prediction tasks such as joint upsampling and depth estimation.\n\n## Environment Setup\n\nBefore starting, make sure your development environment meets the following requirements:\n\n*   **Operating system**: Linux \u002F macOS \u002F Windows\n*   **Python version**: >= 3.5 (a Conda virtual environment is recommended)\n*   **Deep-learning framework**: PyTorch v0.4 - 1.1 (CUDA support required)\n    *   *Note: since this is an older project, creating a dedicated environment with a legacy PyTorch version is recommended to avoid compatibility issues.*\n*   **Other dependencies**: `torchvision` and the libraries listed in `requirements.txt`.\n\n> **Mirror tip for users in China**:\n> When installing PyTorch, the Tsinghua or USTC mirror can speed up downloads. For example, to install PyTorch 1.1 (CUDA 10.0):\n> ```bash\n> pip install torch==1.1.0 torchvision==0.3.0 -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple\n> ```\n\n## Installation\n\n1.  **Clone the project and configure the path**\n    Add the project directory to your Python path so that its modules can be imported.\n    ```bash\n    git clone https:\u002F\u002Fgithub.com\u002FNVlabs\u002Fpacnet.git\n    cd pacnet\n    export PYTHONPATH=\"${PYTHONPATH}:$(pwd)\"\n    ```\n\n2.  
**Install the dependencies**\n    Use pip to install the additional packages the project needs.\n    ```bash\n    pip install -r requirements.txt\n    ```\n    *(Users in China can append `-i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple` to speed up the download)*\n\n3.  **Verify the installation (optional)**\n    Run the unit tests to make sure the environment is configured correctly.\n    ```bash\n    python -m unittest\n    ```\n\n## Basic Usage\n\nPacNet provides several core layers (such as `PacConv2d` and `PacPool2d`). They are used much like native PyTorch layers, except that they take an additional **guidance feature (guide)** tensor.\n\n### Minimal example (`PacConv2d`)\n\nThe following example builds a standard pixel-adaptive convolution layer and runs a forward pass.\n\n```python\nimport torch\nimport torch.nn as nn\nfrom pac import PacConv2d  # the layers are defined in pac.py at the repository root\n\n# 1. define the dimensions\nin_ch, out_ch, g_ch = 16, 32, 8         # input channels, output channels, guidance channels\nf, b, h, w = 5, 2, 64, 64               # kernel size, batch size, height, width\n\n# 2. create the input and guidance tensors\n# input: ordinary image features\ninput = torch.rand(b, in_ch, h, w)\n# guide: guidance features (the 'f' in Eq. 3 of the paper); spatial size must match the input\nguide = torch.rand(b, g_ch, h, w)\n\n# 3. instantiate the PAC convolution layer\n# the arguments largely mirror nn.Conv2d, with extras such as kernel_type and normalize_kernel\npacconv = PacConv2d(in_ch, out_ch, f)\n\n# 4. forward pass\n# standard call: pass input and guide\nout_pac = pacconv(input, guide)\n\n# alternative call: pass a pre-computed kernel 'K' (guide_k)\n# guide_k should have shape [b, 1, f, f, h, w] and can be generated with packernel2d\n# out_pac = pacconv(input, None, guide_k)\n\nprint(f\"Output shape: {out_pac.shape}\")\n```\n\n### Core interface notes\n\n*   **Inputs**: `PacConv2d` takes two main tensors: `input` (the features to process) and `guide` (the guidance information).\n*   **Shape consistency**: with the same standard arguments (stride, padding, etc.), `PacConv2d` produces exactly the same output size as the native `nn.Conv2d`.\n*   **Functional interface**: besides calling a layer instance, you can use the `pacconv2d` function together with `packernel2d` in a functional style.\n\n### Other available layers\n\nThe project also implements the following variants, used similarly to the above:\n*   `PacConvTranspose2d`: transposed convolution for upsampling.\n*   `PacPool2d`: pixel-adaptive pooling.\n*   `PacCRF` \u002F `PacCRFLoose`: conditional random field (CRF) inference layers for the end of dense-prediction networks.","A medical-imaging team is building a super-resolution system for lung CT scans, aiming to restore blurry low-dose scans into sharp, high-resolution images that help radiologists diagnose small nodules precisely.\n\n### Without pacnet\n- Conventional CNNs (built on standard `Conv2d`) apply fixed kernel weights around edges and fine textures, so reconstructed nodule boundaries come out blurred or ringed with artifacts.\n- To preserve detail, developers must add elaborate post-processing steps or stack more layers, which inflates the parameter count and noticeably slows inference.\n- In low-contrast regions the model struggles to tell real tissue texture from noise, and tends to over-smooth, losing key pathological features.\n- Adapting the architecture to fuse features across scales is difficult, usually requiring extensive manual experiments to find a good upsampling strategy.\n\n### With pacnet\n- With `PacConv2d`, the model derives pixel-wise adaptive kernels from a guidance feature (such as a raw high-resolution structural image), producing nodule edges that are sharp and natural.\n- Upsampling with `PacConvTranspose2d` recovers high-quality features without stacking deep networks, substantially reducing parameters and improving real-time inference.\n- The pixel-adaptive mechanism lets the model denoise while still retaining faint texture details, greatly reducing the risk of losing critical lesion information.\n- Developers can directly swap out standard convolution layers and use the flexible interface to build multi-scale feature-fusion architectures, noticeably shortening iteration cycles.\n\nBy introducing the pixel-adaptive convolution mechanism, pacnet gives models a gain in both sharpness and efficiency on detail-sensitive tasks such as medical imaging.","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FNVlabs_pacnet_57c774b5.png","NVlabs","NVIDIA Research Projects","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002FNVlabs_fc20d641.jpg","",null,"http:\u002F\u002Fresearch.nvidia.com","https:\u002F\u002Fgithub.com\u002FNVlabs",[80],{"name":81,"color":82,"percentage":83},"Python","#3572A5",100,517,78,"2026-03-30T13:39:30","NOASSERTION","Not specified","Requires a CUDA-capable NVIDIA GPU (exact model and memory unspecified); PyTorch must be installed with CUDA support",{"notes":91,"python":92,"dependencies":93},"Conda is recommended for managing the environment. The project depends on older PyTorch versions (0.4-1.1), so watch for version compatibility when installing in a modern environment. The code includes unit tests; the installation can be verified with 'python -m unittest'.",">=3.5",[94,95,96],"torch>=0.4, \u003C=1.1","torchvision","other libraries listed in requirements.txt",[14,15],[99,100,101],"deep-learning","machine-learning","computer-vision","2026-03-27T02:49:30.150509","2026-04-20T19:22:22.313118",[105,110,115,120,125,130,134],{"id":106,"question_zh":107,"answer_zh":108,"source_url":109},45511,"After upgrading to PyTorch 1.4.0 I get an error that 'type2backend' cannot be found. How do I fix it?","In PyTorch 1.4.0 the backend is no longer exposed directly. The fix is to replace the original implementation with the higher-level `fold` function. You can switch to the project's `th14` branch, which was created specifically to address the missing type2backend. Branch list: https:\u002F\u002Fgithub.com\u002FNVlabs\u002Fpacnet\u002Fbranches","https:\u002F\u002Fgithub.com\u002FNVlabs\u002Fpacnet\u002Fissues\u002F14",{"id":111,"question_zh":112,"answer_zh":113,"source_url":114},45512,"What should I do about running out of GPU memory, or no result files being produced, when running the optical-flow tasks?","On a GPU with 11 GB of memory you can only run all the depth experiments and the 4x flow experiment; the 8x and 16x flow tests will not run on such hardware. Also, by default the code performs quantitative evaluation only and does not save the generated .flo files (except via a specific option available in the semantic-segmentation code).","https:\u002F\u002Fgithub.com\u002FNVlabs\u002Fpacnet\u002Fissues\u002F17",{"id":116,"question_zh":117,"answer_zh":118,"source_url":119},45513,"How can I run unit tests individually on a GPU with limited memory to avoid CUDA out-of-memory errors?","Instead of running the whole test suite, run a single test case by specifying the class and method name. For example: `python test_pac.py PacConvTest.test_conv_forward_const_kernel`. This significantly reduces the memory requirement and makes debugging feasible on low-memory devices.","https:\u002F\u002Fgithub.com\u002FNVlabs\u002Fpacnet\u002Fissues\u002F2",{"id":121,"question_zh":122,"answer_zh":123,"source_url":124},45514,"How is the edge connectivity defined in the PacCRF model? Are fully-connected graphs or edges between specified points supported?","The connectivity is defined by the `add_pairwise_kernel` call. For example, `kernel_size=5` means each pixel is connected to all of the other `5 * 5 - 1` pixels in its neighborhood. The current implementation uses neighborhood-window connectivity, similar to the locally dense logic of DenseCRF; building a single edge directly between two arbitrary specified points is not supported.","https:\u002F\u002Fgithub.com\u002FNVlabs\u002Fpacnet\u002Fissues\u002F23",{"id":126,"question_zh":127,"answer_zh":128,"source_url":129},45515,"Why is the number of guidance channels in PacConv2d usually set to half the number of input channels?","For a layer whose input is B x C x H x W, the guidance should be B x C_g x H x W (the batch, height and width must match; the depth may differ). Setting the guidance channel count C_g to half of the input channel count C is merely a design choice used in the examples, not a mandatory rule; you can adjust it to your needs.","https:\u002F\u002Fgithub.com\u002FNVlabs\u002Fpacnet\u002Fissues\u002F16",{"id":131,"question_zh":132,"answer_zh":133,"source_url":129},45516,"If I want to pre-compute the kernel K and pass it in, what shape should the tensor have?","If you wish to supply a pre-computed K to a layer (more efficient when it is reused many times), its shape should be `B x 1 x F x F x H x W`, where B is the batch size, F is the convolution filter (kernel) size, and H and W are the height and width of the feature map.",{"id":135,"question_zh":136,"answer_zh":137,"source_url":138},45517,"How are the kernel weights updated during backpropagation in a PAC layer? Are they affected by the guidance?","In a PAC layer the kernel is computed dynamically from the guide input. During backpropagation, the gradients update both the parameters that generate the kernel and the parameters related to the guidance. This means the kernel updates depend not only on gradients from the target input but also on gradients flowing through the computed relationship between the guidance and the kernel.","https:\u002F\u002Fgithub.com\u002FNVlabs\u002Fpacnet\u002Fissues\u002F29",[]]
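To make the mechanism behind `PacConv2d` concrete, here is a minimal, self-contained NumPy sketch of pixel-adaptive convolution for a single channel with stride 1: an ordinary filter `weight` is modulated at every position by an adaptive kernel `K` computed from the guidance (a fixed Gaussian here). The function `pac_conv2d` and its shapes are illustrative assumptions only, not the project's optimized API.

```python
import numpy as np

def pac_conv2d(x, guide, weight):
    """Toy pixel-adaptive convolution (single channel, stride 1, zero padding).

    x:      (H, W) input feature map
    guide:  (H, W) guidance feature
    weight: (F, F) ordinary convolution weights, F odd

    At each output pixel, the spatial weights are multiplied by an
    adaptive kernel K(g_i, g_j) = exp(-0.5 * (g_i - g_j)**2) computed
    from the guidance values in the window versus the window center.
    """
    F = weight.shape[0]
    r = F // 2
    H, W = x.shape
    xp = np.pad(x, r)      # zero-pad input by the kernel radius
    gp = np.pad(guide, r)  # pad guidance the same way
    out = np.zeros((H, W))
    for i in range(H):
        for j in range(W):
            patch = xp[i:i + F, j:j + F]
            gpatch = gp[i:i + F, j:j + F]
            k = np.exp(-0.5 * (gpatch - guide[i, j]) ** 2)  # adaptive kernel K
            out[i, j] = np.sum(k * weight * patch)
    return out

# with a constant guide, K is identically 1 and this reduces to plain convolution
x = np.ones((4, 4))
print(pac_conv2d(x, np.zeros((4, 4)), np.ones((3, 3)))[1, 1])  # -> 9.0
```

With a constant guide the modulation is the identity, so the layer behaves exactly like a standard convolution; a spatially varying guide down-weights neighbors whose guidance values differ from the window center, which is what lets PAC layers preserve edges that fixed kernels blur.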