[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-NVIDIA--DeepLearningExamples":3,"tool-NVIDIA--DeepLearningExamples":61},[4,18,26,36,44,53],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":17},4358,"openclaw","openclaw\u002Fopenclaw","OpenClaw 是一款专为个人打造的本地化 AI 助手，旨在让你在自己的设备上拥有完全可控的智能伙伴。它打破了传统 AI 助手局限于特定网页或应用的束缚，能够直接接入你日常使用的各类通讯渠道，包括微信、WhatsApp、Telegram、Discord、iMessage 等数十种平台。无论你在哪个聊天软件中发送消息，OpenClaw 都能即时响应，甚至支持在 macOS、iOS 和 Android 设备上进行语音交互，并提供实时的画布渲染功能供你操控。\n\n这款工具主要解决了用户对数据隐私、响应速度以及“始终在线”体验的需求。通过将 AI 部署在本地，用户无需依赖云端服务即可享受快速、私密的智能辅助，真正实现了“你的数据，你做主”。其独特的技术亮点在于强大的网关架构，将控制平面与核心助手分离，确保跨平台通信的流畅性与扩展性。\n\nOpenClaw 非常适合希望构建个性化工作流的技术爱好者、开发者，以及注重隐私保护且不愿被单一生态绑定的普通用户。只要具备基础的终端操作能力（支持 macOS、Linux 及 Windows WSL2），即可通过简单的命令行引导完成部署。如果你渴望拥有一个懂你",349277,3,"2026-04-06T06:32:30",[13,14,15,16],"Agent","开发框架","图像","数据工具","ready",{"id":19,"name":20,"github_repo":21,"description_zh":22,"stars":23,"difficulty_score":10,"last_commit_at":24,"category_tags":25,"status":17},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,"2026-04-05T11:01:52",[14,15,13],{"id":27,"name":28,"github_repo":29,"description_zh":30,"stars":31,"difficulty_score":32,"last_commit_at":33,"category_tags":34,"status":17},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 
代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",142651,2,"2026-04-06T23:34:12",[14,13,35],"语言模型",{"id":37,"name":38,"github_repo":39,"description_zh":40,"stars":41,"difficulty_score":32,"last_commit_at":42,"category_tags":43,"status":17},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",107888,"2026-04-06T11:32:50",[14,15,13],{"id":45,"name":46,"github_repo":47,"description_zh":48,"stars":49,"difficulty_score":32,"last_commit_at":50,"category_tags":51,"status":17},4721,"markitdown","microsoft\u002Fmarkitdown","MarkItDown 是一款由微软 AutoGen 团队打造的轻量级 Python 工具，专为将各类文件高效转换为 Markdown 格式而设计。它支持 PDF、Word、Excel、PPT、图片（含 OCR）、音频（含语音转录）、HTML 乃至 YouTube 链接等多种格式的解析，能够精准提取文档中的标题、列表、表格和链接等关键结构信息。\n\n在人工智能应用日益普及的今天，大语言模型（LLM）虽擅长处理文本，却难以直接读取复杂的二进制办公文档。MarkItDown 恰好解决了这一痛点，它将非结构化或半结构化的文件转化为模型“原生理解”且 Token 效率极高的 Markdown 格式，成为连接本地文件与 AI 分析 pipeline 的理想桥梁。此外，它还提供了 MCP（模型上下文协议）服务器，可无缝集成到 Claude Desktop 等 LLM 应用中。\n\n这款工具特别适合开发者、数据科学家及 AI 研究人员使用，尤其是那些需要构建文档检索增强生成（RAG）系统、进行批量文本分析或希望让 AI 
助手直接“阅读”本地文件的用户。虽然生成的内容也具备一定可读性，但其核心优势在于为机器",93400,"2026-04-06T19:52:38",[52,14],"插件",{"id":54,"name":55,"github_repo":56,"description_zh":57,"stars":58,"difficulty_score":10,"last_commit_at":59,"category_tags":60,"status":17},4487,"LLMs-from-scratch","rasbt\u002FLLMs-from-scratch","LLMs-from-scratch 是一个基于 PyTorch 的开源教育项目，旨在引导用户从零开始一步步构建一个类似 ChatGPT 的大型语言模型（LLM）。它不仅是同名技术著作的官方代码库，更提供了一套完整的实践方案，涵盖模型开发、预训练及微调的全过程。\n\n该项目主要解决了大模型领域“黑盒化”的学习痛点。许多开发者虽能调用现成模型，却难以深入理解其内部架构与训练机制。通过亲手编写每一行核心代码，用户能够透彻掌握 Transformer 架构、注意力机制等关键原理，从而真正理解大模型是如何“思考”的。此外，项目还包含了加载大型预训练权重进行微调的代码，帮助用户将理论知识延伸至实际应用。\n\nLLMs-from-scratch 特别适合希望深入底层原理的 AI 开发者、研究人员以及计算机专业的学生。对于不满足于仅使用 API，而是渴望探究模型构建细节的技术人员而言，这是极佳的学习资源。其独特的技术亮点在于“循序渐进”的教学设计：将复杂的系统工程拆解为清晰的步骤，配合详细的图表与示例，让构建一个虽小但功能完备的大模型变得触手可及。无论你是想夯实理论基础，还是为未来研发更大规模的模型做准备",90106,"2026-04-06T11:19:32",[35,15,13,14],{"id":62,"github_repo":63,"name":64,"description_en":65,"description_zh":66,"ai_summary_zh":67,"readme_en":68,"readme_zh":69,"quickstart_zh":70,"use_case_zh":71,"hero_image_url":72,"owner_login":73,"owner_name":74,"owner_avatar_url":75,"owner_bio":76,"owner_company":77,"owner_location":77,"owner_email":77,"owner_twitter":77,"owner_website":78,"owner_url":79,"languages":80,"stars":119,"forks":120,"last_commit_at":121,"license":77,"difficulty_score":10,"env_os":122,"env_gpu":123,"env_ram":124,"env_deps":125,"category_tags":138,"github_topics":140,"view_count":32,"oss_zip_url":77,"oss_zip_packed_at":77,"status":17,"created_at":156,"updated_at":157,"faqs":158,"releases":186},4889,"NVIDIA\u002FDeepLearningExamples","DeepLearningExamples","State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and performance on enterprise-grade infrastructure.","DeepLearningExamples 是 NVIDIA 官方提供的一套前沿深度学习示例代码库，旨在帮助开发者在企业级基础设施上轻松训练和部署高性能模型。它主要解决了深度学习实践中常见的痛点：复现困难、性能调优复杂以及部署流程繁琐。通过提供经过严格验证的脚本，该资源确保了模型在精度和速度上的可复现性，让用户无需从零开始摸索优化技巧。\n\n这套工具特别适合人工智能工程师、算法研究人员以及需要构建生产级应用的技术团队使用。无论是计算机视觉领域的 EfficientNet 
系列，还是其他主流架构，用户都能在这里找到对应的最佳实践。\n\n其核心技术亮点在于深度集成了 NVIDIA CUDA-X 软件栈，充分释放了 Volta、Turing 和 Ampere 架构 GPU 中 Tensor Cores 的计算潜力。项目不仅支持混合精度训练（AMP）、多卡及多节点分布式训练，还无缝衔接了 TensorRT 加速、ONNX 格式转换以及 Triton 推理服务器等部署环节。此外，所有示例均封装在 NGC 容器中以月度更新，内置了经过质量认证的 cuDNN、NCCL 等底层库，确保用户始终使用最新且稳定","DeepLearningExamples 是 NVIDIA 官方提供的一套前沿深度学习示例代码库，旨在帮助开发者在企业级基础设施上轻松训练和部署高性能模型。它主要解决了深度学习实践中常见的痛点：复现困难、性能调优复杂以及部署流程繁琐。通过提供经过严格验证的脚本，该资源确保了模型在精度和速度上的可复现性，让用户无需从零开始摸索优化技巧。\n\n这套工具特别适合人工智能工程师、算法研究人员以及需要构建生产级应用的技术团队使用。无论是计算机视觉领域的 EfficientNet 系列，还是其他主流架构，用户都能在这里找到对应的最佳实践。\n\n其核心技术亮点在于深度集成了 NVIDIA CUDA-X 软件栈，充分释放了 Volta、Turing 和 Ampere 架构 GPU 中 Tensor Cores 的计算潜力。项目不仅支持混合精度训练（AMP）、多卡及多节点分布式训练，还无缝衔接了 TensorRT 加速、ONNX 格式转换以及 Triton 推理服务器等部署环节。此外，所有示例均封装在 NGC 容器中以月度更新，内置了经过质量认证的 cuDNN、NCCL 等底层库，确保用户始终使用最新且稳定的技术组合，从而大幅缩短从实验到落地的周期。","# NVIDIA Deep Learning Examples for Tensor Cores\n\n## Introduction\nThis repository provides State-of-the-Art Deep Learning examples that are easy to train and deploy, achieving the best reproducible accuracy and performance with NVIDIA CUDA-X software stack running on NVIDIA Volta, Turing and Ampere GPUs.\n\n## NVIDIA GPU Cloud (NGC) Container Registry\nThese examples, along with our NVIDIA deep learning software stack, are provided in a monthly updated Docker container on the NGC container registry (https:\u002F\u002Fngc.nvidia.com). These containers include:\n\n- The latest NVIDIA examples from this repository\n- The latest NVIDIA contributions shared upstream to the respective framework\n- The latest NVIDIA Deep Learning software libraries, such as cuDNN, NCCL, cuBLAS, etc. 
which have all been through a rigorous monthly quality assurance process to ensure that they provide the best possible performance\n- [Monthly release notes](https:\u002F\u002Fdocs.nvidia.com\u002Fdeeplearning\u002Fdgx\u002Findex.html#nvidia-optimized-frameworks-release-notes) for each of the NVIDIA optimized containers\n\n\n## Computer Vision\n| Models                                                                                                                                 | Framework    | AMP            | Multi-GPU | Multi-Node | TensorRT | ONNX | Triton                                                                                                                       | DLC  | NB                                                                                                                                                               |\n|----------------------------------------------------------------------------------------------------------------------------------------|--------------|----------------|-----------|------------|----------|------|------------------------------------------------------------------------------------------------------------------------------|------|------------------------------------------------------------------------------------------------------------------------------------------------------------------|\n| [EfficientNet-B0](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FPyTorch\u002FClassification\u002FConvNets\u002Fefficientnet)             | PyTorch      | Yes            | Yes       | -          | Supported | -    | Supported                                                                                                                    | Yes  | -                                                                                                                                                                |\n| 
[EfficientNet-B4](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FPyTorch\u002FClassification\u002FConvNets\u002Fefficientnet)             | PyTorch      | Yes            | Yes       | -          | Supported | -    | Supported                                                                                                                             | Yes  | -                                                                                                                                                                |\n| [EfficientNet-WideSE-B0](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FPyTorch\u002FClassification\u002FConvNets\u002Fefficientnet)      | PyTorch      | Yes            | Yes       | -          | Supported | -    | Supported                                                                                                                             | Yes  | -                                                                                                                                                                |\n| [EfficientNet-WideSE-B4](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FPyTorch\u002FClassification\u002FConvNets\u002Fefficientnet)      | PyTorch      | Yes            | Yes       | -          | Supported | -    | Supported                                                                                                                             | Yes  | -                                                                                                                                                                |\n| [EfficientNet v1-B0](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FTensorFlow2\u002FClassification\u002FConvNets\u002Fefficientnet_v1)   | TensorFlow2  | Yes            | Yes       | Yes        | 
[Example](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FTensorRT\u002Ftree\u002Fmain\u002Fsamples\u002Fpython\u002Fefficientnet) | -    | Supported                                                                                                                             | Yes  | -                                                                                                                                                                |\n| [EfficientNet v1-B4](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FTensorFlow2\u002FClassification\u002FConvNets\u002Fefficientnet_v1)   | TensorFlow2  | Yes            | Yes       | Yes        | [Example](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FTensorRT\u002Ftree\u002Fmain\u002Fsamples\u002Fpython\u002Fefficientnet) | -    | Supported                                                                                                                             | Yes  | -                                                                                                                                                                |\n| [EfficientNet v2-S](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FTensorFlow2\u002FClassification\u002FConvNets\u002Fefficientnet_v2)    | TensorFlow2  | Yes            | Yes       | Yes        | [Example](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FTensorRT\u002Ftree\u002Fmain\u002Fsamples\u002Fpython\u002Fefficientnet) | -    | Supported                                                                                                                             | Yes  | -                                                                                                                                                                |\n| [GPUNet](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FPyTorch\u002FClassification\u002FGPUNet)                            
         | PyTorch      | Yes            | Yes       | -          | Example | Yes  | [Example](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FPyTorch\u002FClassification\u002FGPUNet\u002Ftriton\u002F)                      | Yes  | -                                                                                                                                                                |\n| [Mask R-CNN](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FPyTorch\u002FSegmentation\u002FMaskRCNN)                                 | PyTorch      | Yes            | Yes       | -          | [Example](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FTensorRT\u002Ftree\u002Fmain\u002Fsamples\u002Fpython\u002Fdetectron2) | -    | Supported                                                                                                                             | -    | [Yes](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Fblob\u002Fmaster\u002FPyTorch\u002FSegmentation\u002FMaskRCNN\u002Fpytorch\u002Fnotebooks\u002Fpytorch_MaskRCNN_pyt_train_and_inference.ipynb) |\n| [Mask R-CNN](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FTensorFlow2\u002FSegmentation\u002FMaskRCNN)                             | TensorFlow2  | Yes            | Yes       | -          | [Example](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FTensorRT\u002Ftree\u002Fmain\u002Fsamples\u002Fpython\u002Fdetectron2) | -    | Supported                                                                                                                             | Yes  | -                                                                                                                                                                |\n| 
[nnUNet](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FPyTorch\u002FSegmentation\u002FnnUNet)                                       | PyTorch      | Yes            | Yes       | -          | Supported | -    | Supported                                                                                                                             | Yes  | -                                                                                                                                                                |\n| [ResNet-50](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FMxNet\u002FClassification\u002FRN50v1.5)                                  | MXNet        | Yes            | Yes       | -          | Supported | -    | Supported                                                                                                                             | -    | -                                                                                                                                                                |\n| [ResNet-50](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FPaddlePaddle\u002FClassification\u002FRN50v1.5)                           | PaddlePaddle | Yes            | Yes       | -          | Example | -    | Supported                                                                                                                             | -    | -                                                                                                                                                                |\n| [ResNet-50](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FPyTorch\u002FClassification\u002FConvNets\u002Fresnet50v1.5)                   | PyTorch      | Yes            | Yes       | -          | Example | -    | 
[Example](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FPyTorch\u002FClassification\u002FConvNets\u002Ftriton\u002Fresnet50)            | Yes  | -                                                                                                                                                                |\n| [ResNet-50](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FTensorFlow\u002FClassification\u002FConvNets\u002Fresnet50v1.5)                | TensorFlow   | Yes            | Yes       | -          | Supported | -    | Supported                                                                                                                             | Yes  | -                                                                                                                                                                |\n| [ResNeXt-101](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FPyTorch\u002FClassification\u002FConvNets\u002Fresnext101-32x4d)             | PyTorch      | Yes            | Yes       | -          | Example | -    | [Example](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FPyTorch\u002FClassification\u002FConvNets\u002Ftriton\u002Fresnext101-32x4d)    | Yes  | -                                                                                                                                                                |\n| [ResNeXt-101](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FTensorFlow\u002FClassification\u002FConvNets\u002Fresnext101-32x4d)          | TensorFlow   | Yes            | Yes       | -          | Supported | -    | Supported                                                                                                                             | Yes  | -                                          
                                                                                                                      |\n| [SE-ResNeXt-101](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FPyTorch\u002FClassification\u002FConvNets\u002Fse-resnext101-32x4d)       | PyTorch      | Yes            | Yes       | -          | Example | -    | [Example](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FPyTorch\u002FClassification\u002FConvNets\u002Ftriton\u002Fse-resnext101-32x4d) | Yes  | -                                                                                                                                                                |\n| [SE-ResNeXt-101](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FTensorFlow\u002FClassification\u002FConvNets\u002Fse-resnext101-32x4d)    | TensorFlow   | Yes            | Yes       | -          | Supported | -    | Supported                                                                                                                             | Yes  | -                                                                                                                                                                |\n| [SSD](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FPyTorch\u002FDetection\u002FSSD)                                                | PyTorch      | Yes            | Yes       | -          | Supported | -    | Supported                                                                                                                             | -    | [Yes](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Fblob\u002Fmaster\u002FPyTorch\u002FDetection\u002FSSD\u002Fexamples\u002Finference.ipynb)                                                 |\n| 
[SSD](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FTensorFlow\u002FDetection\u002FSSD)                                             | TensorFlow   | Yes            | Yes       | -          | Supported | -    | Supported                                                                                                                             | Yes  | [Yes](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Fblob\u002Fmaster\u002FTensorFlow\u002FDetection\u002FSSD\u002Fmodels\u002Fresearch\u002Fobject_detection\u002Fobject_detection_tutorial.ipynb)      |\n| [U-Net Med](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FTensorFlow2\u002FSegmentation\u002FUNet_Medical)                          | TensorFlow2  | Yes            | Yes       | -          | Example | -    | Supported                                                                                                                             | Yes  | -                                                                                                                                                                |\n\n## Natural Language Processing\n| Models                                                                                                                 | Framework   | AMP  | Multi-GPU | Multi-Node | TensorRT | ONNX | Triton                                                                                                    | DLC  | NB                                                                                                                                          
|\n|------------------------------------------------------------------------------------------------------------------------|-------------|------|-----------|------------|----------|------|-----------------------------------------------------------------------------------------------------------|------|---------------------------------------------------------------------------------------------------------------------------------------------|\n| [BERT](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FPyTorch\u002FLanguageModeling\u002FBERT)                       | PyTorch     | Yes  | Yes       | Yes        | [Example](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FTensorRT\u002Ftree\u002Fmain\u002Fdemo\u002FBERT) | -    | [Example](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FPyTorch\u002FLanguageModeling\u002FBERT\u002Ftriton)    | Yes  | -                                                                                                                                           |\n| [GNMT](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FPyTorch\u002FTranslation\u002FGNMT)                            | PyTorch     | Yes  | Yes       | -          | Supported | -    | Supported                                                                                                          | -    | -                                                                                                                                           |\n| [ELECTRA](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FTensorFlow2\u002FLanguageModeling\u002FELECTRA)             | TensorFlow2 | Yes  | Yes       | Yes        | Supported | -    | Supported                                                                                                          | Yes  | -                                                     
                                                                                       |\n| [BERT](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FTensorFlow\u002FLanguageModeling\u002FBERT)                    | TensorFlow  | Yes  | Yes       | Yes        | Example | -    | [Example](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FTensorFlow\u002FLanguageModeling\u002FBERT\u002Ftriton) | Yes  | [Yes](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FTensorFlow\u002FLanguageModeling\u002FBERT\u002Fnotebooks)                                |\n| [BERT](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FTensorFlow2\u002FLanguageModeling\u002FBERT)                   | TensorFlow2 | Yes  | Yes       | Yes        | Supported | -    | Supported                                                                                                          | Yes  | -                                                                                                                                           |\n| [GNMT](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FTensorFlow\u002FTranslation\u002FGNMT)                         | TensorFlow  | Yes  | Yes       | -          | Supported | -    | Supported                                                                                                          | -    | -                                                                                                                                           |\n| [Faster Transformer](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FFasterTransformer)                     | TensorFlow  | -    | -         | -          | Example | -    | Supported                                                                                        
                  | -    | -                                                                                                                                           |\n\n\n## Recommender Systems\n| Models                                                                                                         | Framework   | AMP   | Multi-GPU | Multi-Node   | ONNX   | Triton                                                                                               | DLC  | NB                                                                                                     |\n|----------------------------------------------------------------------------------------------------------------|-------------|-------|-----------|--------------|--------|------------------------------------------------------------------------------------------------------|------|--------------------------------------------------------------------------------------------------------|\n| [DLRM](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FPyTorch\u002FRecommendation\u002FDLRM)                 | PyTorch     | Yes   | Yes       | -            | Yes    | [Example](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FPyTorch\u002FRecommendation\u002FDLRM\u002Ftriton) | Yes  | [Yes](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FPyTorch\u002FRecommendation\u002FDLRM\u002Fnotebooks) |\n| [DLRM](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FTensorFlow2\u002FRecommendation\u002FDLRM)             | TensorFlow2 | Yes   | Yes       | Yes          | -      | Supported                                                                                                     | Yes  | -                                                                                                      |\n| 
[NCF](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FPyTorch\u002FRecommendation\u002FNCF)                   | PyTorch     | Yes   | Yes       | -            | -      | Supported                                                                                                     | -    | -                                                                                                      |\n| [Wide&Deep](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FTensorFlow\u002FRecommendation\u002FWideAndDeep)  | TensorFlow  | Yes   | Yes       | -            | -      | Supported                                                                                                     | Yes  | -                                                                                                      |\n| [Wide&Deep](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FTensorFlow2\u002FRecommendation\u002FWideAndDeep) | TensorFlow2 | Yes   | Yes       | -            | -      | Supported                                                                                                     | Yes  | -                                                                                                      |\n| [NCF](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FTensorFlow\u002FRecommendation\u002FNCF)                | TensorFlow  | Yes   | Yes       | -            | -      | Supported                                                                                                     | Yes  | -                                                                                                      |\n| [VAE-CF](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FTensorFlow\u002FRecommendation\u002FVAE-CF)          | TensorFlow  | Yes   | Yes       | -            | -      | Supported   
                                                                                                  | -    | -                                                                                                      |\n| [SIM](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FTensorFlow2\u002FRecommendation\u002FSIM)               | TensorFlow2 | Yes   | Yes       | -            | -      | Supported                                                                                                     | Yes  | -                                                                                                      |\n\n\n## Speech to Text\n| Models                                                                                                       | Framework   | AMP  | Multi-GPU  | Multi-Node   | TensorRT | ONNX   | Triton                                                                                                   | DLC   | NB                                                                                                           |\n|--------------------------------------------------------------------------------------------------------------|-------------|------|------------|--------------|----------|--------|----------------------------------------------------------------------------------------------------------|-------|--------------------------------------------------------------------------------------------------------------|\n| [Jasper](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FPyTorch\u002FSpeechRecognition\u002FJasper)        | PyTorch     | Yes  | Yes        | -            | Example | Yes    | [Example](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FPyTorch\u002FSpeechRecognition\u002FJasper\u002Ftrtis) | Yes   | 
[Yes](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FPyTorch\u002FSpeechRecognition\u002FJasper\u002Fnotebooks) |\n| [QuartzNet](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FPyTorch\u002FSpeechRecognition\u002FQuartzNet)  | PyTorch     | Yes  | Yes        | -            | Supported | -      | Supported                                                                                                         | Yes   | -                                                                                                            |\n\n## Text to Speech\n| Models                                                                                                                  | Framework   | AMP  | Multi-GPU  | Multi-Node  | TensorRT | ONNX   | Triton                                                                                                        | DLC   | NB  |\n|-------------------------------------------------------------------------------------------------------------------------|-------------|------|------------|-------------|----------|--------|---------------------------------------------------------------------------------------------------------------|-------|-----|\n| [FastPitch](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FPyTorch\u002FSpeechSynthesis\u002FFastPitch)               | PyTorch     | Yes  | Yes        | -           | Example | -      | [Example](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FPyTorch\u002FSpeechSynthesis\u002FFastPitch\u002Ftriton)    | Yes   | Yes |\n| [FastSpeech](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FCUDA-Optimized\u002FFastSpeech)                      | PyTorch     | Yes  | Yes        | -           | Example | -      | Supported                                                          
                                                    | -     | -   |\n| [Tacotron 2 and WaveGlow](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FPyTorch\u002FSpeechSynthesis\u002FTacotron2) | PyTorch     | Yes  | Yes        | -           | Example | Yes    | [Example](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FPyTorch\u002FSpeechSynthesis\u002FTacotron2\u002Ftrtis_cpp) | Yes   | -   |\n| [HiFi-GAN](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FPyTorch\u002FSpeechSynthesis\u002FHiFiGAN)                  | PyTorch     | Yes  | Yes        | -           | Supported | -      | Supported                                                                                                              | Yes   | -   |\n\n## Graph Neural Networks\n| Models                                                                                                                  | Framework  | AMP  | Multi-GPU  | Multi-Node   | ONNX   | Triton   | DLC  | NB   |\n|-------------------------------------------------------------------------------------------------------------------------|------------|------|------------|--------------|--------|----------|------|------|\n| [SE(3)-Transformer](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FDGLPyTorch\u002FDrugDiscovery\u002FSE3Transformer) | PyTorch    | Yes  | Yes        | -            | -      | Supported         | -    | -    |\n| [MoFlow](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FPyTorch\u002FDrugDiscovery\u002FMoFlow)                       | PyTorch    | Yes  | Yes        | -            | -      | Supported         | -    | -    |\n\n## Time-Series Forecasting\n| Models                                                                                                            | Framework  | AMP  | 
Multi-GPU   | Multi-Node   | TensorRT | ONNX   | Triton                                                                                           | DLC   | NB  |\n|-------------------------------------------------------------------------------------------------------------------|------------|------|-------------|--------------|----------|--------|--------------------------------------------------------------------------------------------------|-------|-----|\n| [Temporal Fusion Transformer](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FPyTorch\u002FForecasting\u002FTFT) | PyTorch    | Yes  | Yes         | -            | Example | Yes    | [Example](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FPyTorch\u002FForecasting\u002FTFT\u002Ftriton) | Yes   | -   |\n\n## NVIDIA support\nIn each of the network READMEs, we indicate the level of support that will be provided. The range is from ongoing updates and improvements to a point-in-time release for thought leadership.\n\n## Glossary\n\n**Multinode Training**\nSupported on a pyxis\u002Fenroot Slurm cluster.\n\n**Deep Learning Compiler (DLC)**\nTensorFlow XLA and PyTorch JIT and\u002For TorchScript\n\n**Accelerated Linear Algebra (XLA)**\nXLA is a domain-specific compiler for linear algebra that can accelerate TensorFlow models with potentially no source code changes. The results are improvements in speed and memory usage.\n\n**PyTorch JIT and\u002For TorchScript**\nTorchScript is a way to create serializable and optimizable models from PyTorch code. 
It produces an intermediate representation of a PyTorch model (a subclass of nn.Module) that can then be run in a high-performance environment such as C++.\n\n**Automatic Mixed Precision (AMP)**\nAutomatic Mixed Precision (AMP) automatically enables mixed-precision training on the NVIDIA Volta, Turing, and Ampere GPU architectures.\n\n**TensorFloat-32 (TF32)**\nTensorFloat-32 (TF32) is the math mode introduced in [NVIDIA A100](https:\u002F\u002Fwww.nvidia.com\u002Fen-us\u002Fdata-center\u002Fa100\u002F) GPUs for handling matrix math, also called tensor operations. TF32 running on Tensor Cores in A100 GPUs can provide up to 10x speedups compared to single-precision floating-point math (FP32) on Volta GPUs. TF32 is supported in the NVIDIA Ampere GPU architecture and is enabled by default.\n\n**Jupyter Notebooks (NB)**\nThe Jupyter Notebook is an open-source web application that allows you to create and share documents that contain live code, equations, visualizations, and narrative text.\n\n\n## Feedback \u002F Contributions\nWe're posting these examples on GitHub to better support the community, facilitate feedback, and collect and implement contributions using GitHub Issues and pull requests. 
We welcome all contributions!\n\n## Known issues\nIn each of the network READMEs, we indicate any known issues and encourage the community to provide feedback.\n","# 用于 Tensor Core 的 NVIDIA 深度学习示例\n\n## 简介\n本仓库提供了易于训练和部署的最先进深度学习示例，在 NVIDIA Volta、Turing 和 Ampere GPU 上运行 NVIDIA CUDA-X 软件栈时，能够实现最佳的可复现精度和性能。\n\n## NVIDIA GPU Cloud (NGC) 容器注册表\n这些示例以及我们的 NVIDIA 深度学习软件栈，以每月更新的 Docker 容器形式托管在 NGC 容器注册表中（https:\u002F\u002Fngc.nvidia.com）。这些容器包括：\n\n- 本仓库中的最新 NVIDIA 示例\n- NVIDIA 向各框架上游社区贡献的最新代码\n- 最新的 NVIDIA 深度学习软件库，例如 cuDNN、NCCL、cuBLAS 等，所有这些库都经过严格的月度质量保证流程，以确保提供最佳性能\n- 每个 NVIDIA 优化容器的[每月发布说明](https:\u002F\u002Fdocs.nvidia.com\u002Fdeeplearning\u002Fdgx\u002Findex.html#nvidia-optimized-frameworks-release-notes)\n\n## 计算机视觉\n| 模型                                                                                                                                 | 框架    | AMP            | 多GPU | 多节点 | TensorRT | ONNX | Triton                                                                                                                       | DLC  | NB                                                                                                                                                               |\n|----------------------------------------------------------------------------------------------------------------------------------------|----------|----------------|---------|----------|----------|------|------------------------------------------------------------------------------------------------------------------------------|------|------------------------------------------------------------------------------------------------------------------------------------------------------------------|\n| [EfficientNet-B0](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FPyTorch\u002FClassification\u002FConvNets\u002Fefficientnet)             | PyTorch  | 是            | 是      | -        | 支持     | -    | 支持                            
                                                                                             | 是   | -                                                                                                                                                                |\n| [EfficientNet-B4](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FPyTorch\u002FClassification\u002FConvNets\u002Fefficientnet)             | PyTorch  | 是            | 是      | -        | 支持     | -    | 支持                                                                                                                             | 是   | -                                                                                                                                                                |\n| [EfficientNet-WideSE-B0](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FPyTorch\u002FClassification\u002FConvNets\u002Fefficientnet)      | PyTorch  | 是            | 是      | -        | 支持     | -    | 支持                                                                                                                             | 是   | -                                                                                                                                                                |\n| [EfficientNet-WideSE-B4](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FPyTorch\u002FClassification\u002FConvNets\u002Fefficientnet)      | PyTorch  | 是            | 是      | -        | 支持     | -    | 支持                                                                                                                             | 是   | -                                                                                                                                                                |\n| [EfficientNet 
v1-B0](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FTensorFlow2\u002FClassification\u002FConvNets\u002Fefficientnet_v1)   | TensorFlow2  | 是            | 是      | 是       | [示例](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FTensorRT\u002Ftree\u002Fmain\u002Fsamples\u002Fpython\u002Fefficientnet) | -    | 支持                                                                                                                             | 是   | -                                                                                                                                                                |\n| [EfficientNet v1-B4](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FTensorFlow2\u002FClassification\u002FConvNets\u002Fefficientnet_v1)   | TensorFlow2  | 是            | 是      | 是       | [示例](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FTensorRT\u002Ftree\u002Fmain\u002Fsamples\u002Fpython\u002Fefficientnet) | -    | 支持                                                                                                                             | 是   | -                                                                                                                                                                |\n| [EfficientNet v2-S](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FTensorFlow2\u002FClassification\u002FConvNets\u002Fefficientnet_v2)    | TensorFlow2  | 是            | 是      | 是       | [示例](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FTensorRT\u002Ftree\u002Fmain\u002Fsamples\u002Fpython\u002Fefficientnet) | -    | 支持                                                                                                                             | 是   | -                                                                                                                                                                |\n| 
[GPUNet](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FPyTorch\u002FClassification\u002FGPUNet)                                     | PyTorch  | 是            | 是      | -        | 示例     | 是   | [示例](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FPyTorch\u002FClassification\u002FGPUNet\u002Ftriton\u002F)                      | 是   | -                                                                                                                                                                |\n| [Mask R-CNN](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FPyTorch\u002FSegmentation\u002FMaskRCNN)                                 | PyTorch  | 是            | 是      | -        | [示例](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FTensorRT\u002Ftree\u002Fmain\u002Fsamples\u002Fpython\u002Fdetectron2) | -    | 支持                                                                                                                             | 否   | [是](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Fblob\u002Fmaster\u002FPyTorch\u002FSegmentation\u002FMaskRCNN\u002Fpytorch\u002Fnotebooks\u002Fpytorch_MaskRCNN_pyt_train_and_inference.ipynb) |\n| [Mask R-CNN](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FTensorFlow2\u002FSegmentation\u002FMaskRCNN)                             | TensorFlow2  | 是            | 是      | -        | [示例](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FTensorRT\u002Ftree\u002Fmain\u002Fsamples\u002Fpython\u002Fdetectron2) | -    | 支持                                                                                                                             | 是   | -                                                                                                                                                                |\n| 
[nnUNet](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FPyTorch\u002FSegmentation\u002FnnUNet)                                       | PyTorch  | 是            | 是      | -        | 支持     | -    | 支持                                                                                                                             | 是   | -                                                                                                                                                                |\n| [ResNet-50](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FMxNet\u002FClassification\u002FRN50v1.5)                                  | MXNet    | 是            | 是      | -        | 支持     | -    | 支持                                                                                                                             | 否   | -                                                                                                                                                                |\n| [ResNet-50](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FPaddlePaddle\u002FClassification\u002FRN50v1.5)                           | PaddlePaddle | 是            | 是      | -        | 示例     | -    | 支持                                                                                                                             | 否   | -                                                                                                                                                                |\n| [ResNet-50](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FPyTorch\u002FClassification\u002FConvNets\u002Fresnet50v1.5)                   | PyTorch  | 是            | 是      | -        | 示例     | -    | 
[示例](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FPyTorch\u002FClassification\u002FConvNets\u002Ftriton\u002Fresnet50)            | 是   | -                                                                                                                                                                |\n| [ResNet-50](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FTensorFlow\u002FClassification\u002FConvNets\u002Fresnet50v1.5)                | TensorFlow | 是            | 是      | -        | 支持     | -    | 支持                                                                                                                             | 是   | -                                                                                                                                                                |\n| [ResNeXt-101](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FPyTorch\u002FClassification\u002FConvNets\u002Fresnext101-32x4d)             | PyTorch  | 是            | 是      | -        | 示例     | -    | [示例](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FPyTorch\u002FClassification\u002FConvNets\u002Ftriton\u002Fresnext101-32x4d)    | 是   | -                                                                                                                                                                |\n| [ResNeXt-101](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FTensorFlow\u002FClassification\u002FConvNets\u002Fresnext101-32x4d)          | TensorFlow | 是            | 是      | -        | 支持     | -    | 支持                                                                                                                             | 是   | -                                                                                                          
                                                      |\n| [SE-ResNeXt-101](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FPyTorch\u002FClassification\u002FConvNets\u002Fse-resnext101-32x4d)       | PyTorch  | 是            | 是      | -        | 示例     | -    | [示例](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FPyTorch\u002FClassification\u002FConvNets\u002Ftriton\u002Fse-resnext101-32x4d) | 是   | -                                                                                                                                                                |\n| [SE-ResNeXt-101](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FTensorFlow\u002FClassification\u002FConvNets\u002Fse-resnext101-32x4d)    | TensorFlow | 是            | 是      | -        | 支持     | -    | 支持                                                                                                                             | 是   | -                                                                                                                                                                |\n| [SSD](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FPyTorch\u002FDetection\u002FSSD)                                                | PyTorch  | 是            | 是      | -        | 支持     | -    | 支持                                                                                                                             | 否   | [是](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Fblob\u002Fmaster\u002FPyTorch\u002FDetection\u002FSSD\u002Fexamples\u002Finference.ipynb)                                                 |\n| [SSD](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FTensorFlow\u002FDetection\u002FSSD)                                             | 
TensorFlow | 是            | 是      | -        | 支持     | -    | 支持                                                                                                                             | 是   | [是](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Fblob\u002Fmaster\u002FTensorFlow\u002FDetection\u002FSSD\u002Fmodels\u002Fresearch\u002Fobject_detection\u002Fobject_detection_tutorial.ipynb)      |\n| [U-Net Med](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FTensorFlow2\u002FSegmentation\u002FUNet_Medical)                          | TensorFlow2  | 是            | 是      | -        | 示例     | -    | 支持                                                                                                                             | 是   | -                                                                                                                                                                |\n\n## 自然语言处理\n| 模型                                                                                                                 | 框架   | AMP  | 多 GPU | 多节点 | TensorRT | ONNX | Triton                                                                                                    | DLC  | NB                                                                                                                                          |\n|------------------------------------------------------------------------------------------------------------------------|-------------|------|-----------|------------|----------|------|-----------------------------------------------------------------------------------------------------------|------|---------------------------------------------------------------------------------------------------------------------------------------------|\n| [BERT](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FPyTorch\u002FLanguageModeling\u002FBERT) 
                      | PyTorch     | 是  | 是       | 是        | [示例](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FTensorRT\u002Ftree\u002Fmain\u002Fdemo\u002FBERT) | -    | [示例](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FPyTorch\u002FLanguageModeling\u002FBERT\u002Ftriton)    | 是  | -                                                                                                                                           |\n| [GNMT](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FPyTorch\u002FTranslation\u002FGNMT)                            | PyTorch     | 是  | 是       | -          | 支持 | -    | 支持                                                                                                          | -    | -                                                                                                                                           |\n| [ELECTRA](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FTensorFlow2\u002FLanguageModeling\u002FELECTRA)             | TensorFlow2 | 是  | 是       | 是        | 支持 | -    | 支持                                                                                                          | 是  | -                                                                                                                                           |\n| [BERT](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FTensorFlow\u002FLanguageModeling\u002FBERT)                    | TensorFlow  | 是  | 是       | 是        | 示例 | -    | [示例](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FTensorFlow\u002FLanguageModeling\u002FBERT\u002Ftriton) | 是  | [是](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FTensorFlow\u002FLanguageModeling\u002FBERT\u002Fnotebooks)                       
         |\n| [BERT](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FTensorFlow2\u002FLanguageModeling\u002FBERT)                   | TensorFlow2 | 是  | 是       | 是        | 支持 | -    | 支持                                                                                                          | 是  | -                                                                                                                                           |\n| [GNMT](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FTensorFlow\u002FTranslation\u002FGNMT)                         | TensorFlow  | 是  | 是       | -          | 支持 | -    | 支持                                                                                                          | -    | -                                                                                                                                           |\n| [Faster Transformer](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FFasterTransformer)                     | Tensorflow  | -    | -         | -          | 示例 | -    | 支持                                                                                                          | -    | -                                                                                                                                           |\n\n## 推荐系统\n| 模型                                                                                                         | 框架   | AMP   | 多GPU | 多节点   | ONNX   | Triton                                                                                               | DLC  | NB                                                                                                     
|\n|----------------------------------------------------------------------------------------------------------------|-------------|-------|-----------|--------------|--------|------------------------------------------------------------------------------------------------------|------|--------------------------------------------------------------------------------------------------------|\n| [DLRM](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FPyTorch\u002FRecommendation\u002FDLRM)                 | PyTorch     | 是   | 是       | -            | 是    | [示例](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FPyTorch\u002FRecommendation\u002FDLRM\u002Ftriton) | 是  | [是](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FPyTorch\u002FRecommendation\u002FDLRM\u002Fnotebooks) |\n| [DLRM](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FTensorFlow2\u002FRecommendation\u002FDLRM)             | TensorFlow2 | 是   | 是       | 是          | -      | 支持                                                                                                     | 是  | -                                                                                                      |\n| [NCF](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FPyTorch\u002FRecommendation\u002FNCF)                   | PyTorch     | 是   | 是       | -            | -      | 支持                                                                                                     | -    | -                                                                                                      |\n| [Wide&Deep](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FTensorFlow\u002FRecommendation\u002FWideAndDeep)  | TensorFlow  | 是   | 是       | -            | -      | 支持        
                                                                                             | 是  | -                                                                                                      |\n| [Wide&Deep](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FTensorFlow2\u002FRecommendation\u002FWideAndDeep) | TensorFlow2 | 是   | 是       | -            | -      | 支持                                                                                                     | 是  | -                                                                                                      |\n| [NCF](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FTensorFlow\u002FRecommendation\u002FNCF)                | TensorFlow  | 是   | 是       | -            | -      | 支持                                                                                                     | 是  | -                                                                                                      |\n| [VAE-CF](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FTensorFlow\u002FRecommendation\u002FVAE-CF)          | TensorFlow  | 是   | 是       | -            | -      | 支持                                                                                                     | -    | -                                                                                                      |\n| [SIM](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FTensorFlow2\u002FRecommendation\u002FSIM)               | TensorFlow2 | 是   | 是       | -            | -      | 支持                                                                                                     | 是  | -                                                                                                      |\n\n\n## 语音转文本\n| 模型                                                      
                                                 | 框架   | AMP  | 多GPU  | 多节点   | TensorRT | ONNX   | Triton                                                                                                   | DLC   | NB                                                                                                           |\n|--------------------------------------------------------------------------------------------------------------|-------------|------|------------|--------------|----------|--------|----------------------------------------------------------------------------------------------------------|-------|--------------------------------------------------------------------------------------------------------------|\n| [Jasper](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FPyTorch\u002FSpeechRecognition\u002FJasper)        | PyTorch     | 是  | 是        | -            | 示例 | 是    | [示例](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FPyTorch\u002FSpeechRecognition\u002FJasper\u002Ftrtis) | 是   | [是](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FPyTorch\u002FSpeechRecognition\u002FJasper\u002Fnotebooks) |\n| [QuartzNet](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FPyTorch\u002FSpeechRecognition\u002FQuartzNet)  | PyTorch     | 是  | 是        | -            | 支持 | -      | 支持                                                                                                         | 是   | -                                                                                                            |\n\n## 文本转语音\n| 模型                                                                                                                  | 框架   | AMP  | 多GPU  | 多节点  | TensorRT | ONNX   | Triton                                                                                               
         | DLC   | NB  |\n|-------------------------------------------------------------------------------------------------------------------------|-------------|------|------------|-------------|----------|--------|---------------------------------------------------------------------------------------------------------------|-------|-----|\n| [FastPitch](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FPyTorch\u002FSpeechSynthesis\u002FFastPitch)               | PyTorch     | 是  | 是        | -           | 示例 | -      | [示例](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FPyTorch\u002FSpeechSynthesis\u002FFastPitch\u002Ftriton)    | 是   | 是 |\n| [FastSpeech](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FCUDA-Optimized\u002FFastSpeech)                      | PyTorch     | 是  | 是        | -           | 示例 | -      | 支持                                                                                                              | -     | -   |\n| [Tacotron 2 和 WaveGlow](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FPyTorch\u002FSpeechSynthesis\u002FTacotron2) | PyTorch     | 是  | 是        | -           | 示例 | 是    | [示例](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FPyTorch\u002FSpeechSynthesis\u002FTacotron2\u002Ftrtis_cpp) | 是   | -   |\n| [HiFi-GAN](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FPyTorch\u002FSpeechSynthesis\u002FHiFiGAN)                  | PyTorch     | 是  | 是        | -           | 支持 | -      | 支持                                                                                                              | 是   | -   |\n\n## 图神经网络\n| 模型                                                                                                                  | 框架  | AMP  | 多GPU  | 
多节点   | ONNX   | Triton   | DLC  | NB   |\n|-------------------------------------------------------------------------------------------------------------------------|------------|------|------------|--------------|--------|----------|------|------|\n| [SE(3)-Transformer](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FDGLPyTorch\u002FDrugDiscovery\u002FSE3Transformer) | PyTorch    | 是  | 是        | -            | -      | 支持         | -    | -    |\n| [MoFlow](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FPyTorch\u002FDrugDiscovery\u002FMoFlow)                       | PyTorch    | 是  | 是        | -            | -      | 支持         | -    | -    |\n\n## 时间序列预测\n| 模型                                                                                                            | 框架  | AMP  | 多GPU   | 多节点   | TensorRT | ONNX   | Triton                                                                                           | DLC   | NB  |\n|-------------------------------------------------------------------------------------------------------------------|------------|------|-------------|--------------|----------|--------|--------------------------------------------------------------------------------------------------|-------|-----|\n| [Temporal Fusion Transformer](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FPyTorch\u002FForecasting\u002FTFT) | PyTorch    | 是  | 是         | -            | 示例 | 是    | [示例](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Ftree\u002Fmaster\u002FPyTorch\u002FForecasting\u002FTFT\u002Ftriton) | 是   | -   |\n\n## NVIDIA 支持\n在每个网络的 README 文件中，我们都会说明将提供的支持级别。支持范围从持续更新和改进，到用于引领行业思考的特定时间点发布。\n\n## 术语表\n\n**多节点训练**\n在 pyxis\u002Fenroot Slurm 集群上支持。\n\n**深度学习编译器 (DLC)**\nTensorFlow XLA 以及 PyTorch JIT 和\u002F或 TorchScript\n\n**加速线性代数 (XLA)**\nXLA 是一种针对线性代数的领域专用编译器，能够在几乎不修改源代码的情况下加速 
TensorFlow 模型。其结果是提升速度并减少内存使用。\n\n**PyTorch JIT 和\u002F或 TorchScript**\nTorchScript 是一种将 PyTorch 代码转换为可序列化且可优化模型的方法。TorchScript 是 PyTorch 模型（nn.Module 的子类）的一种中间表示形式，随后可以在高性能环境中运行，例如 C++。\n\n**自动混合精度 (AMP)**\n自动混合精度 (AMP) 可以在 Volta、Turing 和 NVIDIA Ampere 架构的 GPU 上自动启用混合精度训练。\n\n**TensorFloat-32 (TF32)**\nTensorFloat-32 (TF32) 是 [NVIDIA A100](https:\u002F\u002Fwww.nvidia.com\u002Fen-us\u002Fdata-center\u002Fa100\u002F) GPU 中用于处理矩阵运算（也称为张量运算）的新数学模式。在 A100 GPU 的 Tensor Core 上运行 TF32，与 Volta GPU 上的单精度浮点运算（FP32）相比，速度最高可提升 10 倍。TF32 在 NVIDIA Ampere GPU 架构中得到支持，并默认启用。\n\n**Jupyter Notebooks (NB)**\nJupyter Notebook 是一个开源的 Web 应用程序，允许用户创建和共享包含实时代码、公式、可视化内容及叙述性文本的文档。\n\n\n## 反馈 \u002F 贡献\n我们把这些示例发布在 GitHub 上，旨在更好地支持社区、促进反馈，并通过 GitHub Issues 和 pull requests 收集和实施贡献。我们欢迎所有贡献！\n\n## 已知问题\n在每个网络的 README 文件中，我们都会列出已知问题，并鼓励社区提供反馈。","# NVIDIA DeepLearningExamples 快速上手指南\n\n本指南帮助中国开发者快速在 NVIDIA GPU 上部署和运行业界领先的深度学习示例模型（如 EfficientNet, ResNet, Mask R-CNN 等），以获取最佳的性能和可复现的精度。\n\n## 1. 环境准备\n\n推荐使用 **NVIDIA GPU Cloud (NGC)** 提供的预构建 Docker 容器，这是最简便且能保证依赖库（cuDNN, NCCL, cuBLAS 等）版本兼容性的方式。\n\n### 系统要求\n*   **GPU**: NVIDIA Volta, Turing 或 Ampere 架构显卡（支持 Tensor Cores）。\n*   **操作系统**: Linux (Ubuntu 18.04\u002F20.04 推荐)。\n*   **驱动**: 已安装最新的 NVIDIA 驱动程序。\n*   **软件**:\n    *   Docker CE 19.03+\n    *   NVIDIA Container Toolkit (nvidia-docker2)\n\n### 前置依赖安装\n安装 NVIDIA Container Toolkit 后，在 Docker 中注册 `nvidia` 运行时，使容器能够访问 GPU：\n\n```bash\n# 注册 nvidia 运行时（注意：tee 会覆盖已有的 daemon.json，请先备份）\nsudo mkdir -p \u002Fetc\u002Fdocker\ncat \u003C\u003CEOF | sudo tee \u002Fetc\u002Fdocker\u002Fdaemon.json\n{\n    \"runtimes\": {\n        \"nvidia\": {\n            \"path\": \"nvidia-container-runtime\",\n            \"runtimeArgs\": []\n        }\n    }\n}\nEOF\n\n# 重启 Docker 服务\nsudo systemctl restart docker\n```\n\n> **国内加速提示**：如果无法直接访问 `nvcr.io`，建议配置 Docker 镜像加速器（如阿里云、腾讯云等），或在 `\u002Fetc\u002Fdocker\u002Fdaemon.json` 中添加 `registry-mirrors`。部分国内云厂商也提供了 NGC 内容的同步镜像。\n\n## 2. 
安装步骤\n\n无需手动编译源码或安装复杂的 Python 依赖，直接拉取官方优化的 Docker 容器即可。\n\n### 拉取容器\n登录 NGC 注册表并拉取最新版本的深度学习框架容器（以 PyTorch 为例）：\n\n```bash\n# 登录 NGC (首次使用需要执行，输入 API Key)\ndocker login nvcr.io\n\n# 拉取 PyTorch 容器 (版本号以 24.05 为例)\ndocker pull nvcr.io\u002Fnvidia\u002Fpytorch:24.05-py3\n```\n*注：版本号 `24.05` 仅为示例，请替换为 [NGC](https:\u002F\u002Fcatalog.ngc.nvidia.com\u002Fcontainers) 上的最新版本号。TensorFlow 用户可拉取 `nvcr.io\u002Fnvidia\u002Ftensorflow:24.05-tf2-py3`。*\n\n### 启动容器\n启动容器并挂载当前目录以便访问代码：\n\n```bash\ndocker run --gpus all -it --rm -v $(pwd):\u002Fworkspace nvcr.io\u002Fnvidia\u002Fpytorch:24.05-py3\n```\n\n进入容器后，`\u002Fworkspace` 对应挂载的宿主机当前目录；镜像中该路径下原有的内容（如 `\u002Fworkspace\u002Fexamples`）会被此挂载遮蔽。若需获取最新 GitHub 代码，可在容器内执行：\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples.git\ncd DeepLearningExamples\n```\n\n## 3. 基本使用\n\n以下以 **PyTorch 版 ResNet-50** 为例，展示如何快速启动训练。其他模型（如 EfficientNet, Mask R-CNN）的使用逻辑类似，具体参数请参考各模型目录下的 `README.md`。\n\n### 单卡训练示例\n在容器内进入对应模型目录并运行训练脚本：\n\n```bash\ncd PyTorch\u002FClassification\u002FConvNets\u002Fresnet50v1.5\n\n# 运行训练脚本 (自动检测单卡)\npython main.py \\\n    --arch resnet50 \\\n    --batch-size 256 \\\n    --epochs 90 \\\n    --data-path \u002Fdata\u002Fimagenet \\\n    --amp\n```\n*   `--amp`: 启用自动混合精度（Automatic Mixed Precision），利用 Tensor Cores 加速训练。\n*   `--data-path`: 替换为你本地的 ImageNet 数据集路径。\n\n### 多卡训练示例\n使用 `torch.distributed.run`（即 `torchrun`）进行多 GPU 训练：\n\n```bash\npython -m torch.distributed.run --nproc_per_node=8 main.py \\\n    --arch resnet50 \\\n    --batch-size 256 \\\n    --epochs 90 \\\n    --data-path \u002Fdata\u002Fimagenet \\\n    --amp\n```\n\n### 推理与部署\n大多数示例支持导出为 **ONNX** 或使用 **TensorRT** 进行高性能推理。以生成 ONNX 模型为例：\n\n```bash\npython main.py \\\n    --arch resnet50 \\\n    --evaluate \\\n    --resume checkpoint_best.pth.tar \\\n    --export-onnx model.onnx\n```\n\n生成的 `model.onnx` 可直接用于 NVIDIA Triton Inference Server 进行生产环境部署。","某医疗影像初创公司正急需在两周内训练出高精度的肺部结节检测模型，并部署到医院的本地 GPU 集群上。\n\n### 没有 DeepLearningExamples 时\n- **环境配置耗时极长**：团队需手动匹配 
PyTorch、cuDNN 和 CUDA 版本，常因依赖冲突导致数天的环境调试，严重拖慢研发进度。\n- **多卡加速难以实现**：自行编写的分布式训练代码效率低下，无法充分利用医院集群的多张 A100 显卡，训练周期从预计的 3 天延长至 2 周。\n- **推理性能不达标**：模型训练完成后，缺乏针对 Tensor Cores 的优化，导致在医院边缘设备上的推理延迟过高，无法满足医生实时诊断需求。\n- **结果复现困难**：由于缺乏标准化的脚本和经过验证的超参数，不同工程师跑出的模型精度波动大，难以通过医疗合规性验证。\n\n### 使用 DeepLearningExamples 后\n- **开箱即用的容器环境**：直接拉取 NGC 上预集成的 Docker 容器，内置了经过严格测试的最新 NVIDIA 深度学习库，环境搭建从几天缩短至几分钟。\n- **原生支持高效多卡训练**：调用内置的 EfficientNet 等模型脚本，天然支持多 GPU 和多节点并行，将训练时间压缩回 3 天内，且线性加速比优异。\n- **端到端推理优化**：利用工具链中集成的 TensorRT 和 Triton 支持，一键将模型量化并部署，推理延迟降低 4 倍，完美适配临床实时场景。\n- **精度与性能可复现**：基于官方提供的基准脚本和调优参数，确保每次训练的模型精度一致且达到业界最先进水平，轻松通过项目验收。\n\nDeepLearningExamples 通过提供企业级优化的全栈方案，让团队从繁琐的基础设施调试中解放出来，专注于核心算法创新与业务落地。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FNVIDIA_DeepLearningExamples_50fe2f5b.png","NVIDIA","NVIDIA Corporation","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002FNVIDIA_7dcf6000.png","",null,"https:\u002F\u002Fnvidia.com","https:\u002F\u002Fgithub.com\u002FNVIDIA",[81,85,89,93,97,101,105,108,112,116],{"name":82,"color":83,"percentage":84},"Jupyter Notebook","#DA5B0B",50,{"name":86,"color":87,"percentage":88},"Python","#3572A5",43.3,{"name":90,"color":91,"percentage":92},"Shell","#89e051",2.8,{"name":94,"color":95,"percentage":96},"C++","#f34b7d",2.3,{"name":98,"color":99,"percentage":100},"Cuda","#3A4E3A",1.1,{"name":102,"color":103,"percentage":104},"Makefile","#427819",0.2,{"name":106,"color":107,"percentage":104},"Dockerfile","#384d54",{"name":109,"color":110,"percentage":111},"CMake","#DA3434",0.1,{"name":113,"color":114,"percentage":115},"Starlark","#76d275",0,{"name":117,"color":118,"percentage":115},"C","#555555",14769,3410,"2026-04-06T11:14:32","Linux","必需 NVIDIA GPU (Volta, Turing, Ampere 架构)，需安装 NVIDIA CUDA-X 软件栈，具体显存大小取决于模型","未说明",{"notes":126,"python":124,"dependencies":127},"该项目主要通过 NGC (NVIDIA GPU Cloud) 提供的 Docker 容器进行部署，而非直接在宿主机安装依赖。容器每月更新，包含经过严格测试的深度学习库。支持多 GPU 和多节点训练。部分模型支持 TensorRT、ONNX 和 Triton 
推理服务器。",[128,129,130,131,132,133,134,135,136,137],"Docker","NVIDIA Container Toolkit","PyTorch","TensorFlow","TensorFlow2","MXNet","PaddlePaddle","cuDNN","NCCL","cuBLAS",[15,14,35,139],"音频",[141,142,143,144,145,146,147,148,149,150,151,152,153,154,155],"computer-vision","deep-learning","drug-discovery","forecasting","large-language-models","mxnet","paddlepaddle","pytorch","recommender-systems","speech-recognition","speech-synthesis","tensorflow","tensorflow2","translation","nlp","2026-03-27T02:49:30.150509","2026-04-07T13:27:58.411855",[159,164,169,174,178,182],{"id":160,"question_zh":161,"answer_zh":162,"source_url":163},22214,"加载预训练模型检查点时出现 'Unexpected key module.xxx in state_dict' 错误怎么办？","这是由于模型保存时使用了 DataParallel 包装，导致键名多了 'module.' 前缀。解决方法是修改加载代码，在加载 state_dict 前去除键名中的 'module.' 前缀。具体代码如下：\nmodel.load_state_dict({k.replace('module.',''): v for k, v in checkpoint['state_dict'].items()})\nif 'optimizer' in checkpoint: optimizer.load_state_dict(checkpoint['optimizer'])\n如果使用了 AMP，还需添加：if amp_run and 'amp' in checkpoint: amp.load_state_dict(checkpoint['amp'])","https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Fissues\u002F319",{"id":165,"question_zh":166,"answer_zh":167,"source_url":168},22215,"如何在 DLRM 模型中解决推理基准测试的步数限制或显存不足（OOM）问题？","1. 关于步数限制：可以通过命令行参数控制推理基准测试的步数，增加该值可解决问题。\n2. 
关于显存不足：通常是因为尝试在极长的测试数据集（如 32k batches）上计算 AUC 分数，目前不支持此操作。对于合成数据集，计算 AUC 意义不大。如果是为了基准测试 I\u002FO，可以修改生成合成数据集脚本中的标志位（flag）来增加样本数量，但请注意生成大数据集需要较长时间。","https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Fissues\u002F1305",{"id":170,"question_zh":171,"answer_zh":172,"source_url":173},22216,"运行 TensorFlow 自定义算子时出现段错误（segmentation fault）如何解决？","这是一个运行时兼容性问题而非编译问题。用户反馈将 TensorFlow 版本降级或切换到 tf1.13 版本后，该段错误问题得到解决。请确保使用的 TensorFlow 版本与项目代码兼容。","https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FDeepLearningExamples\u002Fissues\u002F166",{"id":175,"question_zh":176,"answer_zh":177,"source_url":168},22217,"如何在多 GPU 环境下并行运行 DLRM 模型的推理任务？","对于适合单张 GPU 运行的模型，推理任务是容易并行化的。您可以直接并行运行 4 个脚本，每个脚本对应一张 GPU，从而实现多卡并行推理。",{"id":179,"question_zh":180,"answer_zh":181,"source_url":163},22218,"继续训练 Tacotron2 或 WaveGlow 模型时，如何正确传递检查点路径和优化器状态？","在继续训练时，除了使用 --checkpoint-path 参数指定检查点文件外，还需要确保训练脚本正确加载了优化器状态。如果修改了加载逻辑（例如去除 'module.' 前缀），请确保同时加载 optimizer 和 amp 状态。参考命令示例：\npython train.py -m WaveGlow -o .\u002F -lr 1e-4 --epochs 14500 -bs 10 --segment-length 16000 --weight-decay 0 --grad-clip-thresh 65504.0 --cudnn-enabled --cudnn-benchmark --log-file nvlog.json --training-files filelists\u002Fhindi_audio_text_train_filelist.txt --amp --validation-files filelists\u002Fhindi_audio_text_val_filelist.txt --wn-channels 256 --checkpoint-path backup\u002Fwaveglow_1076430_14000_amp",{"id":183,"question_zh":184,"answer_zh":185,"source_url":168},22219,"DLRM 数据预处理脚本报错或找不到数据文件（如 day_0.gz 到 day_23.gz）怎么办？","首先运行验证脚本 verify_criteo_downloaded.sh 确认所有分片文件（day_0.gz 到 day_23.gz）已正确下载到指定目录。如果文件存在但仍报错，请检查预处理脚本中的路径配置是否正确，并确保清理了旧的中间文件（如 spark, intermediate_binary, output 等目录），然后重新运行 prepare_dataset.sh。",[]]