[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-microsoft--DirectML":3,"tool-microsoft--DirectML":64},[4,17,27,35,43,56],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":16},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,3,"2026-04-05T11:01:52",[13,14,15],"开发框架","图像","Agent","ready",{"id":18,"name":19,"github_repo":20,"description_zh":21,"stars":22,"difficulty_score":23,"last_commit_at":24,"category_tags":25,"status":16},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",138956,2,"2026-04-05T11:33:21",[13,15,26],"语言模型",{"id":28,"name":29,"github_repo":30,"description_zh":31,"stars":32,"difficulty_score":23,"last_commit_at":33,"category_tags":34,"status":16},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 
绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",107662,"2026-04-03T11:11:01",[13,14,15],{"id":36,"name":37,"github_repo":38,"description_zh":39,"stars":40,"difficulty_score":23,"last_commit_at":41,"category_tags":42,"status":16},3704,"NextChat","ChatGPTNextWeb\u002FNextChat","NextChat 是一款轻量且极速的 AI 助手，旨在为用户提供流畅、跨平台的大模型交互体验。它完美解决了用户在多设备间切换时难以保持对话连续性，以及面对众多 AI 模型不知如何统一管理的痛点。无论是日常办公、学习辅助还是创意激发，NextChat 都能让用户随时随地通过网页、iOS、Android、Windows、MacOS 或 Linux 端无缝接入智能服务。\n\n这款工具非常适合普通用户、学生、职场人士以及需要私有化部署的企业团队使用。对于开发者而言，它也提供了便捷的自托管方案，支持一键部署到 Vercel 或 Zeabur 等平台。\n\nNextChat 的核心亮点在于其广泛的模型兼容性，原生支持 Claude、DeepSeek、GPT-4 及 Gemini Pro 等主流大模型，让用户在一个界面即可自由切换不同 AI 能力。此外，它还率先支持 MCP（Model Context Protocol）协议，增强了上下文处理能力。针对企业用户，NextChat 提供专业版解决方案，具备品牌定制、细粒度权限控制、内部知识库整合及安全审计等功能，满足公司对数据隐私和个性化管理的高标准要求。",87618,"2026-04-05T07:20:52",[13,26],{"id":44,"name":45,"github_repo":46,"description_zh":47,"stars":48,"difficulty_score":23,"last_commit_at":49,"category_tags":50,"status":16},2268,"ML-For-Beginners","microsoft\u002FML-For-Beginners","ML-For-Beginners 是由微软推出的一套系统化机器学习入门课程，旨在帮助零基础用户轻松掌握经典机器学习知识。这套课程将学习路径规划为 12 周，包含 26 节精炼课程和 52 道配套测验，内容涵盖从基础概念到实际应用的完整流程，有效解决了初学者面对庞大知识体系时无从下手、缺乏结构化指导的痛点。\n\n无论是希望转型的开发者、需要补充算法背景的研究人员，还是对人工智能充满好奇的普通爱好者，都能从中受益。课程不仅提供了清晰的理论讲解，还强调动手实践，让用户在循序渐进中建立扎实的技能基础。其独特的亮点在于强大的多语言支持，通过自动化机制提供了包括简体中文在内的 50 多种语言版本，极大地降低了全球不同背景用户的学习门槛。此外，项目采用开源协作模式，社区活跃且内容持续更新，确保学习者能获取前沿且准确的技术资讯。如果你正寻找一条清晰、友好且专业的机器学习入门之路，ML-For-Beginners 将是理想的起点。",84991,"2026-04-05T10:45:23",[14,51,52,53,15,54,26,13,55],"数据工具","视频","插件","其他","音频",{"id":57,"name":58,"github_repo":59,"description_zh":60,"stars":61,"difficulty_score":10,"last_commit_at":62,"category_tags":63,"status":16},3128,"ragflow","infiniflow\u002Fragflow","RAGFlow 
是一款领先的开源检索增强生成（RAG）引擎，旨在为大语言模型构建更精准、可靠的上下文层。它巧妙地将前沿的 RAG 技术与智能体（Agent）能力相结合，不仅支持从各类文档中高效提取知识，还能让模型基于这些知识进行逻辑推理和任务执行。\n\n在大模型应用中，幻觉问题和知识滞后是常见痛点。RAGFlow 通过深度解析复杂文档结构（如表格、图表及混合排版），显著提升了信息检索的准确度，从而有效减少模型“胡编乱造”的现象，确保回答既有据可依又具备时效性。其内置的智能体机制更进一步，使系统不仅能回答问题，还能自主规划步骤解决复杂问题。\n\n这款工具特别适合开发者、企业技术团队以及 AI 研究人员使用。无论是希望快速搭建私有知识库问答系统，还是致力于探索大模型在垂直领域落地的创新者，都能从中受益。RAGFlow 提供了可视化的工作流编排界面和灵活的 API 接口，既降低了非算法背景用户的上手门槛，也满足了专业开发者对系统深度定制的需求。作为基于 Apache 2.0 协议开源的项目，它正成为连接通用大模型与行业专有知识之间的重要桥梁。",77062,"2026-04-04T04:44:48",[15,14,13,26,54],{"id":65,"github_repo":66,"name":67,"description_en":68,"description_zh":69,"ai_summary_zh":70,"readme_en":71,"readme_zh":72,"quickstart_zh":73,"use_case_zh":74,"hero_image_url":75,"owner_login":76,"owner_name":77,"owner_avatar_url":78,"owner_bio":79,"owner_company":80,"owner_location":80,"owner_email":81,"owner_twitter":82,"owner_website":83,"owner_url":84,"languages":85,"stars":114,"forks":115,"last_commit_at":116,"license":117,"difficulty_score":10,"env_os":118,"env_gpu":119,"env_ram":120,"env_deps":121,"category_tags":132,"github_topics":80,"view_count":23,"oss_zip_url":80,"oss_zip_packed_at":80,"status":16,"created_at":133,"updated_at":134,"faqs":135,"releases":165},3062,"microsoft\u002FDirectML","DirectML","⚠️DirectML is in maintenance mode ⚠️ DirectML is a high-performance, hardware-accelerated DirectX 12 library for machine learning. 
DirectML provides GPU acceleration for common machine learning tasks across a broad range of supported hardware and drivers, including all DirectX 12-capable GPUs from vendors such as AMD, Intel, NVIDIA, and Qualcomm.","DirectML 是一个专为机器学习打造的高性能、硬件加速库，基于 DirectX 12 构建。它的核心使命是让常见的机器学习任务能够在各种显卡上高效运行，无需依赖特定的厂商驱动或专用框架。无论是 AMD、Intel、NVIDIA 还是高通的显卡，只要支持 DirectX 12，DirectML 都能提供统一的 GPU 加速能力。\n\n它主要解决了机器学习在不同硬件平台上部署难、兼容性差的问题。通过提供低开销且高度一致的接口，DirectML 确保了模型在不同设备上的运行结果可靠且可预测，特别适合对性能和延迟敏感的场景。\n\n这款工具非常适合应用开发者、游戏工程师以及需要将 AI 功能集成到实时应用中的研究人员。对于希望在 Windows 环境下灵活调用本地算力，而不想被特定深度学习框架绑定的用户来说，DirectML 是理想选择。\n\n其独特亮点在于与 Direct3D 12 的无缝互操作性，允许开发者在同一应用中混合使用图形渲染与机器学习推理。此外，它还支持作为独立组件分发，方便在旧版 Windows 系统上部署固定版本。需要注意的是，DirectML 目前已进入维护模式，不再增加新功能","DirectML 是一个专为机器学习打造的高性能、硬件加速库，基于 DirectX 12 构建。它的核心使命是让常见的机器学习任务能够在各种显卡上高效运行，无需依赖特定的厂商驱动或专用框架。无论是 AMD、Intel、NVIDIA 还是高通的显卡，只要支持 DirectX 12，DirectML 都能提供统一的 GPU 加速能力。\n\n它主要解决了机器学习在不同硬件平台上部署难、兼容性差的问题。通过提供低开销且高度一致的接口，DirectML 确保了模型在不同设备上的运行结果可靠且可预测，特别适合对性能和延迟敏感的场景。\n\n这款工具非常适合应用开发者、游戏工程师以及需要将 AI 功能集成到实时应用中的研究人员。对于希望在 Windows 环境下灵活调用本地算力，而不想被特定深度学习框架绑定的用户来说，DirectML 是理想选择。\n\n其独特亮点在于与 Direct3D 12 的无缝互操作性，允许开发者在同一应用中混合使用图形渲染与机器学习推理。此外，它还支持作为独立组件分发，方便在旧版 Windows 系统上部署固定版本。需要注意的是，DirectML 目前已进入维护模式，不再增加新功能，但会继续提供安全更新，并建议新版 Windows 11 用户转向更先进的 Windows ML。","# DirectML \u003C!-- omit in toc -->\n\n---\n\n⚠️ **DirectML is in maintenance mode** ⚠️\n\n- If your PC runs Windows 11, version 24H2 (build 26100) or later, consider using [Windows ML](https:\u002F\u002Flearn.microsoft.com\u002Fen-us\u002Fwindows\u002Fai\u002Fnew-windows-ml\u002Foverview) for accelerated machine learning model execution.\n- DirectML will remain supported on previous Windows releases (see [releases](https:\u002F\u002Flearn.microsoft.com\u002Fen-us\u002Fwindows\u002Fai\u002Fdirectml\u002Fdml-version-history)) and will continue to ship with future versions of Windows. 
However, no new functionality or feature updates are planned.\n- DirectML will continue to receive security and compliance-related fixes. Refer to [SECURITY.md](.\u002FSECURITY.md) for reporting security issues. \n- The issues and samples in this repository will not be updated.\n\n---\n\nDirectML is a high-performance, hardware-accelerated DirectX 12 library for machine learning. DirectML provides GPU acceleration for common machine learning tasks across a broad range of supported hardware and drivers, including all DirectX 12-capable GPUs from vendors such as AMD, Intel, NVIDIA, and Qualcomm.\n\nWhen used standalone, the DirectML API is a low-level DirectX 12 library and is suitable for high-performance, low-latency applications such as frameworks, games, and other real-time applications. The seamless interoperability of DirectML with Direct3D 12 as well as its low overhead and conformance across hardware makes DirectML ideal for accelerating machine learning when both high performance is desired, and the reliability and predictability of results across hardware is critical.\n\nMore information about DirectML can be found in [Introduction to DirectML](https:\u002F\u002Fdocs.microsoft.com\u002Fwindows\u002Fwin32\u002Fdirect3d12\u002Fdml-intro).\n\n- [Getting Started with DirectML](#getting-started-with-directml)\n  - [Hardware requirements](#hardware-requirements)\n  - [For application developers](#for-application-developers)\n  - [For users, data scientists, and researchers](#for-users-data-scientists-and-researchers)\n- [DirectML Samples](#directml-samples)\n- [DxDispatch Tool](#dxdispatch-tool)\n- [Windows ML on DirectML](#windows-ml-on-directml)\n- [ONNX Runtime on DirectML](#onnx-runtime-on-directml)\n- [PyTorch with DirectML](#pytorch-with-directml)\n- [TensorFlow with DirectML](#tensorflow-with-directml)\n- [Feedback](#feedback)\n- [External Links](#external-links)\n  - [Documentation](#documentation)\n  - [More information](#more-information)\n- 
[Contributing](#contributing)\n\nVisit the [DirectX Landing Page](https:\u002F\u002Fdevblogs.microsoft.com\u002Fdirectx\u002Flanding-page\u002F) for more resources for DirectX developers.\n\n## Getting Started with DirectML\n\nDirectML is distributed as a system component of Windows 10, and is available as part of the Windows 10 operating system (OS) in Windows 10, version 1903 (10.0; Build 18362), and newer.\n\nStarting with DirectML [version 1.4.0](https:\u002F\u002Fdocs.microsoft.com\u002Fwindows\u002Fwin32\u002Fdirect3d12\u002Fdml-version-history), DirectML is also available as a standalone redistributable package (see [Microsoft.AI.DirectML](https:\u002F\u002Fwww.nuget.org\u002Fpackages\u002FMicrosoft.AI.DirectML\u002F)), which is useful for applications that wish to use a fixed version of DirectML, or when running on older versions of Windows 10.\n\n### Hardware requirements\n\nDirectML requires a DirectX 12 capable device. Almost all commercially-available graphics cards released in the last several years support DirectX 12. Examples of compatible hardware include:\n\n* AMD GCN 1st Gen (Radeon HD 7000 series) and above\n* Intel Haswell (4th-gen core) HD Integrated Graphics and above\n* NVIDIA Kepler (GTX 600 series) and above\n* Qualcomm Adreno 600 and above\n\n### For application developers\n\nDirectML exposes a native C++ DirectX 12 API. 
The header and library (DirectML.h\u002FDirectML.lib) are available as part of the [redistributable NuGet package](https:\u002F\u002Fwww.nuget.org\u002Fpackages\u002FMicrosoft.AI.DirectML\u002F), and are also included in the Windows 10 SDK version 10.0.18362 or newer.\n\n* The Windows 10 SDK can be downloaded from the [Windows Dev Center](https:\u002F\u002Fdeveloper.microsoft.com\u002Fwindows\u002Fdownloads\u002Fwindows-10-sdk\u002F)\n* [Microsoft.AI.DirectML](https:\u002F\u002Fwww.nuget.org\u002Fpackages\u002FMicrosoft.AI.DirectML\u002F) on the NuGet Gallery\n* [DirectML programming guide](https:\u002F\u002Fdocs.microsoft.com\u002Fwindows\u002Fwin32\u002Fdirect3d12\u002Fdml)\n* [DirectML API reference](https:\u002F\u002Fdocs.microsoft.com\u002Fwindows\u002Fwin32\u002Fdirect3d12\u002Fdirect3d-directml-reference)\n\n### For users, data scientists, and researchers\n\nDirectML is built-in as a backend to several frameworks such as Windows ML, ONNX Runtime, and TensorFlow.\n\nSee the following sections for more information:\n\n* [Windows ML on DirectML](#Windows-ML-on-DirectML)\n* [ONNX Runtime on DirectML](#ONNX-Runtime-on-DirectML)\n* [TensorFlow with DirectML](#TensorFlow-with-DirectML)\n* [PyTorch with DirectML](#pytorch-with-DirectML)\n\n## DirectML Samples\n\nDirectML C++ sample code is available under [Samples](.\u002FSamples).\n* [HelloDirectML](.\u002FSamples\u002FHelloDirectML): A minimal \"hello world\" application that executes a single DirectML operator.\n* [DirectMLNpuInference](.\u002FSamples\\DirectMLNpuInference): A sample that showcases how to utilize NPU hardware with DirectML.\n* [DirectMLSuperResolution](.\u002FSamples\u002FDirectMLSuperResolution): A sample that uses DirectML to execute a basic super-resolution model to upscale video from 540p to 1080p in real time.\n* [yolov4](.\u002FSamples\u002Fyolov4): YOLOv4 is an object detection model capable of recognizing up to 80 different classes of objects in an image. 
This sample contains a complete end-to-end implementation of the model using DirectML, and is able to run in real time on a user-provided video stream.\n\nDirectML Python sample code is available under [Python\u002Fsamples](.\u002FPython\u002Fsamples). The samples require PyDirectML, an open source Python projection library for DirectML, which can be built and installed to a Python executing environment from [Python\u002Fsrc](.\u002FPython\u002Fsrc). Refer to the [Python\u002FREADME.md](Python\u002FREADME.md) file for more details.\n\n* [MobileNet](.\u002FPython\u002Fsamples\u002Fmobilenet.py): Adapted from the [ONNX MobileNet model](https:\u002F\u002Fgithub.com\u002Fonnx\u002Fmodels\u002Ftree\u002Fmaster\u002Fvision\u002Fclassification\u002Fmobilenet). MobileNet classifies an image into 1000 different classes. It is highly efficient in speed and size, ideal for mobile applications.\n* [MNIST](.\u002FPython\u002Fsamples\u002Fmnist.py): Adapted from the [ONNX MNIST model](https:\u002F\u002Fgithub.com\u002Fonnx\u002Fmodels\u002Ftree\u002Fmaster\u002Fvision\u002Fclassification\u002Fmnist). MNIST predicts handwritten digits using a convolutional neural network.\n* [SqueezeNet](.\u002FPython\u002Fsamples\u002Fsqueezenet.py): Based on the [ONNX SqueezeNet model](https:\u002F\u002Fgithub.com\u002Fonnx\u002Fmodels\u002Ftree\u002Fmaster\u002Fvision\u002Fclassification\u002Fsqueezenet). SqueezeNet performs image classification trained on the ImageNet dataset. 
It is highly efficient and provides results with good accuracy.\n* [FNS-Candy](.\u002FPython\u002Fsamples\u002Fcandy.py): Adapted from the [Windows ML Style Transfer model](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FWindows-Machine-Learning\u002Ftree\u002Fmaster\u002FSamples\u002FFNSCandyStyleTransfer) sample, FNS-Candy re-applies specific artistic styles on regular images.\n* [Super Resolution](.\u002FPython\u002Fsamples\u002Fsuperres.py): Adapted from the [ONNX Super Resolution model](https:\u002F\u002Fgithub.com\u002Fonnx\u002Fmodels\u002Ftree\u002Fmaster\u002Fvision\u002Fsuper_resolution\u002Fsub_pixel_cnn_2016), Super-Res upscales and sharpens the input images to refine the details and improve image quality.\n\n## DxDispatch Tool\n\n[DxDispatch](.\u002FDxDispatch\u002FREADME.md) is a simple command-line executable for launching DirectX 12 compute programs (including DirectML operators) without writing all the C++ boilerplate.\n\n## Windows ML on DirectML\n\nWindows ML (WinML) is a high-performance, reliable API for deploying hardware-accelerated ML inferences on Windows devices. 
DirectML provides the GPU backend for Windows ML.\n\nDirectML acceleration can be enabled in Windows ML using the [LearningModelDevice](https:\u002F\u002Fdocs.microsoft.com\u002Fuwp\u002Fapi\u002Fwindows.ai.machinelearning.learningmodeldevice) with any one of the [DirectX DeviceKinds](https:\u002F\u002Fdocs.microsoft.com\u002Fuwp\u002Fapi\u002Fwindows.ai.machinelearning.learningmodeldevicekind).\n\nFor more information, see [Get Started with Windows ML](https:\u002F\u002Fdocs.microsoft.com\u002Fwindows\u002Fai\u002Fwindows-ml\u002F#get-started).\n\n* [Windows Machine Learning Overview (docs.microsoft.com)](https:\u002F\u002Fdocs.microsoft.com\u002Fwindows\u002Fai\u002Fwindows-ml\u002F)\n* [Windows Machine Learning GitHub](https:\u002F\u002Fgithub.com\u002FMicrosoft\u002FWindows-Machine-Learning)\n* [WinMLRunner](https:\u002F\u002Fgithub.com\u002FMicrosoft\u002FWindows-Machine-Learning\u002Ftree\u002Fmaster\u002FTools\u002FWinMLRunner), a tool for executing ONNX models using WinML with DirectML\n\n## ONNX Runtime on DirectML\n\nONNX Runtime is a cross-platform inferencing and training accelerator compatible with many popular ML\u002FDNN frameworks, including PyTorch, TensorFlow\u002FKeras, scikit-learn, and more.\n\nDirectML is available as an optional *execution provider* for ONNX Runtime that provides hardware acceleration when running on Windows 10.\n\nFor more information about getting started, see [Using the DirectML execution provider](https:\u002F\u002Fwww.onnxruntime.ai\u002Fdocs\u002Freference\u002Fexecution-providers\u002FDirectML-ExecutionProvider.html#using-the-directml-execution-provider).\n\n* [ONNX Runtime homepage](https:\u002F\u002Faka.ms\u002Fonnxruntime)\n* [ONNX Runtime GitHub](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fonnxruntime)\n* [DirectML Execution Provider readme](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fonnxruntime\u002Fblob\u002Fmaster\u002Fdocs\u002Fexecution_providers\u002FDirectML-ExecutionProvider.md)\n\n## PyTorch 
with DirectML\n\nPyTorch with DirectML enables training and inference of complex machine learning models on a wide range of DirectX 12-compatible hardware. This is done through [`torch-directml`](https:\u002F\u002Fpypi.org\u002Fproject\u002Ftorch-directml\u002F), a plugin for PyTorch.\n\nPyTorch with DirectML is supported on both the latest versions of Windows and the [Windows Subsystem for Linux](https:\u002F\u002Fdocs.microsoft.com\u002Fwindows\u002Fwsl\u002Fabout), and is available for download as a PyPI package. For more information about getting started with `torch-directml`, see our [Windows](https:\u002F\u002Flearn.microsoft.com\u002Fwindows\u002Fai\u002Fdirectml\u002Fpytorch-windows) or [WSL 2](https:\u002F\u002Flearn.microsoft.com\u002Fwindows\u002Fai\u002Fdirectml\u002Fpytorch-wsl) guidance on Microsoft Learn.\n\n* [torch-directml PyPI project](https:\u002F\u002Fpypi.org\u002Fproject\u002Ftorch-directml\u002F)\n* [PyTorch with DirectML samples](.\u002FPyTorch\u002F)\n* [PyTorch homepage](https:\u002F\u002Fpytorch.org\u002F)\n\n## TensorFlow with DirectML\n\nTensorFlow is a popular open source platform for machine learning and is a leading framework for training of machine learning models.\n\nDirectML acceleration for TensorFlow 1.15 is currently available for Public Preview. TensorFlow on DirectML enables training and inference of complex machine learning models on a wide range of DirectX 12-compatible hardware.\n\nTensorFlow on DirectML is supported on both the latest versions of Windows 10 and the [Windows Subsystem for Linux](https:\u002F\u002Fdocs.microsoft.com\u002Fwindows\u002Fwsl\u002Fabout), and is available for download as a PyPI package. 
For more information about getting started, see [GPU accelerated ML training (docs.microsoft.com)](http:\u002F\u002Faka.ms\u002Fgpuinwsldocs)\n\n* [TensorFlow on DirectML GitHub repo](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Ftensorflow-directml)\n* [TensorFlow on DirectML samples](.\u002FTensorFlow)\n* [tensorflow-directml PyPI project](https:\u002F\u002Fpypi.org\u002Fproject\u002Ftensorflow-directml\u002F)\n* [TensorFlow GitHub | RFC: TensorFlow on DirectML](https:\u002F\u002Fgithub.com\u002Ftensorflow\u002Fcommunity\u002Fpull\u002F243)\n* [TensorFlow homepage](https:\u002F\u002Fwww.tensorflow.org\u002F)\n\n## Feedback\n\nWe look forward to hearing from you!\n\n* For TensorFlow with DirectML issues, bugs, and feedback; or for general DirectML issues and feedback, please [file an issue](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Ftensorflow-directml-plugin\u002Fissues) or contact us directly at askdirectml@microsoft.com.\n\n* For PyTorch with DirectML issues, bugs, and feedback; or for general DirectML issues and feedback, please [file an issue](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FDirectML-Samples\u002Fissues) or contact us directly at askdirectml@microsoft.com.\n\n* For Windows ML issues, please file a GitHub issue at [microsoft\u002FWindows-Machine-Learning](https:\u002F\u002Fgithub.com\u002FMicrosoft\u002FWindows-Machine-Learning\u002Fissues) or contact us directly at askwindowsml@microsoft.com.\n\n* For ONNX Runtime issues, please file an issue at [microsoft\u002Fonnxruntime](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fonnxruntime\u002Fissues).\n\n## External Links\n\n### Documentation\n[DirectML programming guide](https:\u002F\u002Fdocs.microsoft.com\u002Fwindows\u002Fwin32\u002Fdirect3d12\u002Fdml)  \n[DirectML API reference](https:\u002F\u002Fdocs.microsoft.com\u002Fwindows\u002Fwin32\u002Fdirect3d12\u002Fdirect3d-directml-reference)\n\n### More information\n[Introducing DirectML (Game Developers Conference 
'19)](https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=QjQm_wNrvVw)   \n[Accelerating GPU Inferencing with DirectML and DirectX 12 (SIGGRAPH '18)](https:\u002F\u002Fon-demand.gputechconf.com\u002Fsiggraph\u002F2018\u002Fvideo\u002Fsig1814-2-adrian-tsai-gpu-inferencing-directml-and-directx-12.html)  \n[Windows AI: hardware-accelerated ML on Windows devices (Microsoft Build '20)](https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=-qf2PMuOXWI&feature=youtu.be)  \n[Gaming with Windows ML (DirectX Developer Blog)](https:\u002F\u002Fdevblogs.microsoft.com\u002Fdirectx\u002Fgaming-with-windows-ml\u002F)  \n[DirectML at GDC 2019 (DirectX Developer Blog)](https:\u002F\u002Fdevblogs.microsoft.com\u002Fdirectx\u002Fdirectml-at-gdc-2019\u002F)  \n[DirectX ❤ Linux (DirectX Developer Blog)](https:\u002F\u002Fdevblogs.microsoft.com\u002Fdirectx\u002Fdirectx-heart-linux\u002F)\n\n## Contributing\n\nThis project welcomes contributions and suggestions.  Most contributions require you to agree to a\nContributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us\nthe rights to use your contribution. For details, visit https:\u002F\u002Fcla.microsoft.com.\n\nWhen you submit a pull request, a CLA-bot will automatically determine whether you need to provide\na CLA and decorate the PR appropriately (e.g., label, comment). Simply follow the instructions\nprovided by the bot. 
You will only need to do this once across all repos using our CLA.\n\nThis project has adopted the [Microsoft Open Source Code of Conduct](https:\u002F\u002Fopensource.microsoft.com\u002Fcodeofconduct\u002F).\nFor more information see the [Code of Conduct FAQ](https:\u002F\u002Fopensource.microsoft.com\u002Fcodeofconduct\u002Ffaq\u002F) or\ncontact [opencode@microsoft.com](mailto:opencode@microsoft.com) with any additional questions or comments.\n","# DirectML \u003C!-- omit in toc -->\n\n---\n\n⚠️ **DirectML 处于维护模式** ⚠️\n\n- 如果您的电脑运行的是 Windows 11 版本 24H2（内部版本 26100）或更高版本，请考虑使用 [Windows ML](https:\u002F\u002Flearn.microsoft.com\u002Fen-us\u002Fwindows\u002Fai\u002Fnew-windows-ml\u002Foverview) 来加速机器学习模型的执行。\n- DirectML 将继续在之前的 Windows 版本上得到支持（参见 [发行历史](https:\u002F\u002Flearn.microsoft.com\u002Fen-us\u002Fwindows\u002Fai\u002Fdirectml\u002Fdml-version-history)），并会随未来版本的 Windows 一起发布。然而，不再计划推出新的功能或特性更新。\n- DirectML 将继续接收与安全和合规相关的问题修复。有关安全问题的报告，请参阅 [SECURITY.md](.\u002FSECURITY.md)。\n- 此仓库中的问题和示例将不再更新。\n\n---\n\nDirectML 是一个高性能、硬件加速的 DirectX 12 库，专为机器学习设计。它可在广泛的受支持硬件和驱动程序上为常见的机器学习任务提供 GPU 加速，包括来自 AMD、Intel、NVIDIA 和 Qualcomm 等厂商的所有支持 DirectX 12 的 GPU。\n\n单独使用时，DirectML API 是一个低级别的 DirectX 12 库，适用于高性能、低延迟的应用场景，例如框架、游戏以及其他实时应用。DirectML 与 Direct3D 12 的无缝互操作性，以及其低开销和跨硬件的一致性，使其成为在需要高性能的同时，又必须确保跨硬件结果可靠性和可预测性的机器学习加速的理想选择。\n\n有关 DirectML 的更多信息，请参阅 [DirectML 简介](https:\u002F\u002Fdocs.microsoft.com\u002Fwindows\u002Fwin32\u002Fdirect3d12\u002Fdml-intro)。\n\n- [开始使用 DirectML](#getting-started-with-directml)\n  - [硬件要求](#hardware-requirements)\n  - [面向应用开发者](#for-application-developers)\n  - [面向用户、数据科学家和研究人员](#for-users-data-scientists-and-researchers)\n- [DirectML 示例](#directml-samples)\n- [DxDispatch 工具](#dxdispatch-tool)\n- [基于 DirectML 的 Windows ML](#windows-ml-on-directml)\n- [基于 DirectML 的 ONNX Runtime](#onnx-runtime-on-directml)\n- [PyTorch 结合 DirectML](#pytorch-with-directml)\n- [TensorFlow 结合 DirectML](#tensorflow-with-directml)\n- [反馈](#feedback)\n- 
[外部链接](#external-links)\n  - [文档](#documentation)\n  - [更多信息](#more-information)\n- [贡献](#contributing)\n\n访问 [DirectX 登陆页](https:\u002F\u002Fdevblogs.microsoft.com\u002Fdirectx\u002Flanding-page\u002F) 获取更多针对 DirectX 开发者的资源。\n\n## 开始使用 DirectML\n\nDirectML 作为 Windows 10 的系统组件分发，并自 Windows 10 版本 1903（10.0；内部版本 18362）及更高版本起，已包含在 Windows 10 操作系统中。\n\n从 DirectML [版本 1.4.0](https:\u002F\u002Fdocs.microsoft.com\u002Fwindows\u002Fwin32\u002Fdirect3d12\u002Fdml-version-history) 开始，DirectML 也以独立的可再分发包形式提供（参见 [Microsoft.AI.DirectML](https:\u002F\u002Fwww.nuget.org\u002Fpackages\u002FMicrosoft.AI.DirectML\u002F)），这对于希望使用固定版本 DirectML 的应用程序，或在较旧版本的 Windows 10 上运行的应用程序非常有用。\n\n### 硬件要求\n\nDirectML 需要一台支持 DirectX 12 的设备。过去几年内发布的几乎所有市售显卡都支持 DirectX 12。兼容硬件的例子包括：\n\n* AMD GCN 第一代（Radeon HD 7000 系列）及以上\n* Intel Haswell（第四代 Core）集成显卡及以上\n* NVIDIA Kepler（GTX 600 系列）及以上\n* Qualcomm Adreno 600 及以上\n\n### 面向应用开发者\n\nDirectML 提供原生 C++ DirectX 12 API。头文件和库（DirectML.h\u002FDirectML.lib）既包含在 [可再分发的 NuGet 包](https:\u002F\u002Fwww.nuget.org\u002Fpackages\u002FMicrosoft.AI.DirectML\u002F) 中，也包含在 Windows 10 SDK 版本 10.0.18362 或更高版本中。\n\n* 可从 [Windows 开发者中心](https:\u002F\u002Fdeveloper.microsoft.com\u002Fwindows\u002Fdownloads\u002Fwindows-10-sdk\u002F) 下载 Windows 10 SDK\n* NuGet 目录上的 [Microsoft.AI.DirectML](https:\u002F\u002Fwww.nuget.org\u002Fpackages\u002FMicrosoft.AI.DirectML\u002F)\n* [DirectML 编程指南](https:\u002F\u002Fdocs.microsoft.com\u002Fwindows\u002Fwin32\u002Fdirect3d12\u002Fdml)\n* [DirectML API 参考](https:\u002F\u002Fdocs.microsoft.com\u002Fwindows\u002Fwin32\u002Fdirect3d12\u002Fdirect3d-directml-reference)\n\n### 面向用户、数据科学家和研究人员\n\nDirectML 内置为多个框架的后端，例如 Windows ML、ONNX Runtime 和 TensorFlow。\n\n更多信息请参阅以下章节：\n\n* [基于 DirectML 的 Windows ML](#Windows-ML-on-DirectML)\n* [基于 DirectML 的 ONNX Runtime](#ONNX-Runtime-on-DirectML)\n* [TensorFlow 结合 DirectML](#TensorFlow-with-DirectML)\n* [PyTorch 结合 DirectML](#pytorch-with-DirectML)\n\n## DirectML 示例\n\nDirectML 的 C++ 示例代码可在 
[Samples](.\u002FSamples) 目录下找到。\n* [HelloDirectML](.\u002FSamples\u002FHelloDirectML)：一个极简的“Hello World”应用程序，仅执行一个 DirectML 算子。\n* [DirectMLNpuInference](.\u002FSamples\\DirectMLNpuInference)：展示如何使用 NPU 硬件与 DirectML 配合的示例。\n* [DirectMLSuperResolution](.\u002FSamples\u002FDirectMLSuperResolution)：利用 DirectML 执行一个基础的超分辨率模型，将 540p 视频实时上采样至 1080p 的示例。\n* [yolov4](.\u002FSamples\u002Fyolov4)：YOLOv4 是一种目标检测模型，能够识别图像中的多达 80 类物体。该示例包含使用 DirectML 完整实现的端到端模型，并且能够在用户提供的视频流上实时运行。\n\nDirectML 的 Python 示例代码可在 [Python\u002Fsamples](.\u002FPython\u002Fsamples) 目录下找到。这些示例需要 PyDirectML，这是一个用于 DirectML 的开源 Python 投影库，可以从 [Python\u002Fsrc](.\u002FPython\u002Fsrc) 构建并安装到 Python 执行环境中。更多详细信息请参阅 [Python\u002FREADME.md](Python\u002FREADME.md) 文件。\n\n* [MobileNet](.\u002FPython\u002Fsamples\u002Fmobilenet.py)：改编自 [ONNX MobileNet 模型](https:\u002F\u002Fgithub.com\u002Fonnx\u002Fmodels\u002Ftree\u002Fmaster\u002Fvision\u002Fclassification\u002Fmobilenet)。MobileNet 可将图像分类为 1000 个不同类别，其速度和体积都非常高效，非常适合移动应用。\n* [MNIST](.\u002FPython\u002Fsamples\u002Fmnist.py)：改编自 [ONNX MNIST 模型](https:\u002F\u002Fgithub.com\u002Fonnx\u002Fmodels\u002Ftree\u002Fmaster\u002Fvision\u002Fclassification\u002Fmnist)。MNIST 使用卷积神经网络预测手写数字。\n* [SqueezeNet](.\u002FPython\u002Fsamples\u002Fsqueezenet.py)：基于 [ONNX SqueezeNet 模型](https:\u002F\u002Fgithub.com\u002Fonnx\u002Fmodels\u002Ftree\u002Fmaster\u002Fvision\u002Fclassification\u002Fsqueezenet)。SqueezeNet 在 ImageNet 数据集上训练的图像分类任务中表现出色，效率极高且准确率良好。\n* [FNS-Candy](.\u002FPython\u002Fsamples\u002Fcandy.py)：改编自 [Windows ML 风格迁移模型](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FWindows-Machine-Learning\u002Ftree\u002Fmaster\u002FSamples\u002FFNSCandyStyleTransfer) 示例，FNS-Candy 可以将特定的艺术风格重新应用于普通图像。\n* [Super Resolution](.\u002FPython\u002Fsamples\u002Fsuperres.py)：改编自 [ONNX 超分辨率模型](https:\u002F\u002Fgithub.com\u002Fonnx\u002Fmodels\u002Ftree\u002Fmaster\u002Fvision\u002Fsuper_resolution\u002Fsub_pixel_cnn_2016)，该模型可对输入图像进行上采样和锐化，从而细化细节并提升图像质量。\n\n## DxDispatch 
工具\n\n[DxDispatch](.\u002FDxDispatch\u002FREADME.md) 是一个简单的命令行可执行文件，用于在无需编写所有 C++ 模板代码的情况下启动 DirectX 12 计算程序（包括 DirectML 算子）。\n\n## Windows ML on DirectML\n\nWindows ML (WinML) 是一个高性能、可靠的 API，用于在 Windows 设备上部署硬件加速的机器学习推理。DirectML 为 Windows ML 提供 GPU 后端支持。\n\n通过使用 [LearningModelDevice](https:\u002F\u002Fdocs.microsoft.com\u002Fuwp\u002Fapi\u002Fwindows.ai.machinelearning.learningmodeldevice)，结合任意一种 [DirectX DeviceKinds](https:\u002F\u002Fdocs.microsoft.com\u002Fuwp\u002Fapi\u002Fwindows.ai.machinelearning.learningmodeldevicekind)，可以在 Windows ML 中启用 DirectML 加速。\n\n有关更多信息，请参阅 [开始使用 Windows ML](https:\u002F\u002Fdocs.microsoft.com\u002Fwindows\u002Fai\u002Fwindows-ml\u002F#get-started)。\n\n* [Windows 机器学习概述 (docs.microsoft.com)](https:\u002F\u002Fdocs.microsoft.com\u002Fwindows\u002Fai\u002Fwindows-ml\u002F)\n* [Windows 机器学习 GitHub 仓库](https:\u002F\u002Fgithub.com\u002FMicrosoft\u002FWindows-Machine-Learning)\n* [WinMLRunner](https:\u002F\u002Fgithub.com\u002FMicrosoft\u002FWindows-Machine-Learning\u002Ftree\u002Fmaster\u002FTools\u002FWinMLRunner)，一个使用 WinML 和 DirectML 执行 ONNX 模型的工具\n\n## ONNX Runtime on DirectML\n\nONNX Runtime 是一个跨平台的推理和训练加速器，兼容多种流行的机器学习\u002FDNN 框架，包括 PyTorch、TensorFlow\u002FKeras、scikit-learn 等。\n\nDirectML 可作为 ONNX Runtime 的可选 *执行提供者*，在 Windows 10 上运行时提供硬件加速功能。\n\n有关入门的更多信息，请参阅 [使用 DirectML 执行提供者](https:\u002F\u002Fwww.onnxruntime.ai\u002Fdocs\u002Freference\u002Fexecution-providers\u002FDirectML-ExecutionProvider.html#using-the-directml-execution-provider)。\n\n* [ONNX Runtime 主页](https:\u002F\u002Faka.ms\u002Fonnxruntime)\n* [ONNX Runtime GitHub 仓库](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fonnxruntime)\n* [DirectML 执行提供者说明文档](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fonnxruntime\u002Fblob\u002Fmaster\u002Fdocs\u002Fexecution_providers\u002FDirectML-ExecutionProvider.md)\n\n## PyTorch with DirectML\n\nPyTorch 与 DirectML 结合，可在广泛的 DirectX 12 兼容硬件上进行复杂机器学习模型的训练和推理。这是通过 
[`torch-directml`](https:\u002F\u002Fpypi.org\u002Fproject\u002Ftorch-directml\u002F) 这一 PyTorch 插件实现的。\n\nPyTorch 与 DirectML 同时支持最新的 Windows 版本以及 [Windows Subsystem for Linux](https:\u002F\u002Fdocs.microsoft.com\u002Fwindows\u002Fwsl\u002Fabout)，并且可以作为 PyPI 包下载。有关如何开始使用 `torch-directml` 的更多信息，请参阅我们在 Microsoft Learn 上提供的 [Windows](https:\u002F\u002Flearn.microsoft.com\u002Fwindows\u002Fai\u002Fdirectml\u002Fpytorch-windows) 或 [WSL 2](https:\u002F\u002Flearn.microsoft.com\u002Fwindows\u002Fai\u002Fdirectml\u002Fpytorch-wsl) 指南。\n\n* [torch-directml PyPI 项目](https:\u002F\u002Fpypi.org\u002Fproject\u002Ftorch-directml\u002F)\n* [PyTorch 与 DirectML 示例](.\u002FPyTorch\u002F)\n* [PyTorch 主页](https:\u002F\u002Fpytorch.org\u002F)\n\n## TensorFlow with DirectML\n\nTensorFlow 是一个广受欢迎的开源机器学习平台，也是训练机器学习模型的领先框架。\n\n适用于 TensorFlow 1.15 的 DirectML 加速目前处于公开预览阶段。TensorFlow on DirectML 使复杂的机器学习模型能够在广泛的 DirectX 12 兼容硬件上进行训练和推理。\n\nTensorFlow on DirectML 同时支持最新的 Windows 10 版本以及 [Windows 子系统 for Linux](https:\u002F\u002Fdocs.microsoft.com\u002Fwindows\u002Fwsl\u002Fabout)，并且可以作为 PyPI 包下载。有关入门的更多信息，请参阅 [GPU 加速的机器学习训练 (docs.microsoft.com)](http:\u002F\u002Faka.ms\u002Fgpuinwsldocs)。\n\n* [TensorFlow on DirectML GitHub 仓库](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Ftensorflow-directml)\n* [TensorFlow on DirectML 示例](.\u002FTensorFlow)\n* [tensorflow-directml PyPI 项目](https:\u002F\u002Fpypi.org\u002Fproject\u002Ftensorflow-directml\u002F)\n* [TensorFlow GitHub | 关于 TensorFlow on DirectML 的 RFC](https:\u002F\u002Fgithub.com\u002Ftensorflow\u002Fcommunity\u002Fpull\u002F243)\n* [TensorFlow 主页](https:\u002F\u002Fwww.tensorflow.org\u002F)\n\n## 反馈\n\n我们期待您的反馈！\n\n* 如遇 TensorFlow 与 DirectML 相关的问题、漏洞或反馈，或关于 DirectML 的一般性问题与反馈，请在 [GitHub 上提交 issue](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Ftensorflow-directml-plugin\u002Fissues) 或直接发送邮件至 askdirectml@microsoft.com。\n\n* 如遇 PyTorch 与 DirectML 相关的问题、漏洞或反馈，或关于 DirectML 的一般性问题与反馈，请在 [GitHub 上提交 
issue](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FDirectML-Samples\u002Fissues) 或直接发送邮件至 askdirectml@microsoft.com。\n\n* 如遇 Windows ML 相关的问题，请在 [microsoft\u002FWindows-Machine-Learning](https:\u002F\u002Fgithub.com\u002FMicrosoft\u002FWindows-Machine-Learning\u002Fissues) 提交 GitHub issue，或直接发送邮件至 askwindowsml@microsoft.com。\n\n* 如遇 ONNX Runtime 相关的问题，请在 [microsoft\u002Fonnxruntime](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fonnxruntime\u002Fissues) 提交 issue。\n\n## 外部链接\n\n### 文档\n[DirectML 编程指南](https:\u002F\u002Fdocs.microsoft.com\u002Fwindows\u002Fwin32\u002Fdirect3d12\u002Fdml)  \n[DirectML API 参考](https:\u002F\u002Fdocs.microsoft.com\u002Fwindows\u002Fwin32\u002Fdirect3d12\u002Fdirect3d-directml-reference)\n\n### 更多信息\n[介绍 DirectML（2019 年游戏开发者大会）](https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=QjQm_wNrvVw)   \n[利用 DirectML 和 DirectX 12 加速 GPU 推理（2018 年 SIGGRAPH）](https:\u002F\u002Fon-demand.gputechconf.com\u002Fsiggraph\u002F2018\u002Fvideo\u002Fsig1814-2-adrian-tsai-gpu-inferencing-directml-and-directx-12.html)  \n[Windows AI：Windows 设备上的硬件加速机器学习（2020 年 Microsoft Build）](https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=-qf2PMuOXWI&feature=youtu.be)  \n[使用 Windows ML 进行游戏开发（DirectX 开发者博客）](https:\u002F\u002Fdevblogs.microsoft.com\u002Fdirectx\u002Fgaming-with-windows-ml\u002F)  \n[DirectML 在 GDC 2019 上的亮相（DirectX 开发者博客）](https:\u002F\u002Fdevblogs.microsoft.com\u002Fdirectx\u002Fdirectml-at-gdc-2019\u002F)  \n[DirectX ❤ Linux（DirectX 开发者博客）](https:\u002F\u002Fdevblogs.microsoft.com\u002Fdirectx\u002Fdirectx-heart-linux\u002F)\n\n## 贡献\n\n本项目欢迎各类贡献和建议。大多数贡献都需要您签署贡献者许可协议（CLA），以声明您有权并将您的贡献权利授予我们使用。有关详情，请访问 https:\u002F\u002Fcla.microsoft.com。\n\n当您提交拉取请求时，CLA 机器人会自动判断您是否需要提供 CLA，并相应地为 PR 添加标记或评论。请按照机器人提供的指示操作即可。对于所有使用我们 CLA 的仓库，您只需完成一次此流程。\n\n本项目已采纳 [Microsoft 开源行为准则](https:\u002F\u002Fopensource.microsoft.com\u002Fcodeofconduct\u002F)。如需更多信息，请参阅 [行为准则常见问题解答](https:\u002F\u002Fopensource.microsoft.com\u002Fcodeofconduct\u002Ffaq\u002F)，或如有任何其他问题或意见，请联系 
[opencode@microsoft.com](mailto:opencode@microsoft.com)。","# DirectML 快速上手指南\n\n> **⚠️ 重要提示：维护模式**\n> DirectML 目前处于**维护模式**。\n> - 如果您使用的是 **Windows 11 24H2 (Build 26100)** 或更高版本，建议迁移至 [Windows ML](https:\u002F\u002Flearn.microsoft.com\u002Fen-us\u002Fwindows\u002Fai\u002Fnew-windows-ml\u002Foverview) 以获得更好的支持。\n> - DirectML 将继续在旧版 Windows 上受支持，并提供安全修复，但不再计划新增功能。\n> - 本仓库中的示例代码和 Issue 将不再更新。\n\nDirectML 是一个高性能、硬件加速的 DirectX 12 机器学习库。它支持所有兼容 DirectX 12 的 GPU（包括 AMD、Intel、NVIDIA 和 Qualcomm），无需依赖特定厂商的专有 SDK（如 CUDA）即可实现跨硬件的机器学习加速。\n\n## 1. 环境准备\n\n### 系统要求\n- **操作系统**：\n  - Windows 10 版本 1903 (Build 18362) 或更高版本（系统内置）。\n  - 或通过独立包支持更早的 Windows 10 版本。\n- **硬件**：必须拥有支持 **DirectX 12** 的显卡。\n  - **AMD**: GCN 1st Gen (Radeon HD 7000 系列) 及以上\n  - **Intel**: Haswell (第 4 代酷睿) 集成显卡及以上\n  - **NVIDIA**: Kepler (GTX 600 系列) 及以上\n  - **Qualcomm**: Adreno 600 及以上\n\n### 前置依赖\n- **对于应用开发者 (C++)**：\n  - 安装 [Windows 10 SDK](https:\u002F\u002Fdeveloper.microsoft.com\u002Fwindows\u002Fdownloads\u002Fwindows-10-sdk\u002F) (版本 10.0.18362 或更新)。\n- **对于数据科学家\u002F研究者 (Python)**：\n  - 安装 Python 环境。\n  - 根据使用的框架安装对应的 DirectML 插件（见下文安装步骤）。\n\n---\n\n## 2. 安装步骤\n\n根据您的使用场景选择以下一种安装方式：\n\n### 方案 A：Python 数据科学与模型推理 (推荐)\n\nDirectML 可作为后端加速 **ONNX Runtime**、**PyTorch** 或 **TensorFlow**。\n\n#### 1. 使用 ONNX Runtime (通用推理)\n适用于运行导出的 ONNX 模型，支持 PyTorch\u002FTensorFlow 等训练出的模型。\n```bash\npip install onnxruntime-directml\n```\n\n#### 2. 使用 PyTorch\n通过 `torch-directml` 插件在 Windows 或 WSL2 上启用 DirectML 加速。\n```bash\npip install torch-directml\n```\n\n#### 3. 
使用 TensorFlow\n适用于 TensorFlow 1.15 版本的加速（公共预览版）。\n```bash\npip install tensorflow-directml\n```\n\n> **注**：国内用户若下载缓慢，可添加国内镜像源参数，例如：\n> `pip install -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple \u003C包名>`\n\n### 方案 B：C++ 应用开发\n\n#### 方式 1：使用 NuGet 包（推荐，可固定版本）\n在 Visual Studio 的程序包管理器控制台 (Package Manager Console) 中运行：\n```powershell\nInstall-Package Microsoft.AI.DirectML\n```\n\n#### 方式 2：使用 Windows SDK\n确保已安装 Windows 10 SDK (10.0.18362+)，在 C++ 项目中直接包含头文件并链接库：\n- 头文件：`#include \u003CDirectML.h>`\n- 库文件：`DirectML.lib` (SDK 自动包含)\n\n---\n\n## 3. 基本使用\n\n### 场景一：Python 中使用 PyTorch + DirectML\n\n安装 `torch-directml` 后，只需通过 `torch_directml.device()` 获取设备，并将模型和数据移动到该设备，即可利用 GPU 加速。\n\n```python\nimport torch\nimport torch_directml\n\n# 检查 DirectML 是否可用\nif torch_directml.is_available():\n    print(\"DirectML is available!\")\n    \n    # 创建 DirectML 设备\n    device = torch_directml.device()\n    \n    # 将模型和数据移动到 DirectML 设备\n    # 这里的 Conv2d 仅为示例用的占位模型，请替换为您自己的模型\n    model = torch.nn.Conv2d(3, 8, kernel_size=3).to(device)\n    data = torch.randn(32, 3, 224, 224).to(device)\n    \n    # 执行推理或训练\n    output = model(data)\n    print(output.shape)\nelse:\n    print(\"DirectML not available on this system.\")\n```\n\n### 场景二：Python 中使用 ONNX Runtime + DirectML\n\n加载 ONNX 模型，并通过 `providers` 参数指定 `DmlExecutionProvider`。\n\n```python\nimport numpy as np\nimport onnxruntime as ort\n\n# 定义会话选项\nsess_options = ort.SessionOptions()\nsess_options.graph_optimization_level = ort.GraphOptimizationLevel.ORT_ENABLE_ALL\n# DirectML 执行提供者不支持内存模式优化和并行执行\nsess_options.enable_mem_pattern = False\nsess_options.execution_mode = ort.ExecutionMode.ORT_SEQUENTIAL\n\n# 创建推理会话，优先使用 DirectML，不可用时回退到 CPU\nproviders = ['DmlExecutionProvider', 'CPUExecutionProvider']\nsession = ort.InferenceSession(\"model.onnx\", sess_options, providers=providers)\n\n# 假设模型输入为 1x3x224x224 的 float32 张量，请按实际模型调整\ninput_data = np.random.rand(1, 3, 224, 224).astype(np.float32)\n\n# 运行推理\ninput_name = session.get_inputs()[0].name\noutput_name = session.get_outputs()[0].name\nresult = session.run([output_name], {input_name: input_data})\n```\n\n### 场景三：C++ 最小化示例 (HelloDirectML)\n\n以下是使用原生 C++ API 初始化 DirectML 设备并创建命令记录器 (command recorder) 的核心逻辑片段：\n\n```cpp\n#include \u003CDirectML.h>\n#include \u003Cwrl\u002Fclient.h>\n#include \u003Cdxgi1_4.h>\n#include \u003Cd3d12.h>\n\n\u002F\u002F 假设已初始化 D3D12 
设备和命令队列 (d3d12Device, commandQueue)\nMicrosoft::WRL::ComPtr\u003CIDMLDevice> dmlDevice;\nDMLCreateDevice(d3d12Device.Get(), DML_CREATE_DEVICE_FLAG_NONE, IID_PPV_ARGS(&dmlDevice));\n\n\u002F\u002F 创建命令记录器\nMicrosoft::WRL::ComPtr\u003CIDMLCommandRecorder> commandRecorder;\ndmlDevice->CreateCommandRecorder(IID_PPV_ARGS(&commandRecorder));\n\n\u002F\u002F 后续可在此绑定算子 (Operators) 并执行...\n```\n\n> 更多完整的 C++ 和 Python 示例代码请参考项目源码中的 `Samples` 和 `Python\u002Fsamples` 目录。","一家位于上海的独立游戏工作室正在开发一款支持实时风格迁移的创意工具，允许玩家通过摄像头将现实画面瞬间转化为动漫或油画风格。\n\n### 没有 DirectML 时\n- **硬件兼容性差**：团队被迫仅针对 NVIDIA 显卡优化代码，导致使用 AMD 或 Intel 集成显卡的用户无法运行或体验极差，损失了大量潜在用户。\n- **推理延迟高**：在缺乏专用 AI 加速库的情况下，模型只能依赖 CPU 运行，画面转换延迟高达数秒，完全破坏了游戏的实时互动性。\n- **开发维护成本高**：为了适配不同厂商的 GPU，开发者需要编写多套后端代码（如分别对接 CUDA 和 OpenCL），极大增加了调试难度和包体体积。\n- **部署门槛高**：普通用户必须手动安装庞大的深度学习框架（如完整版 PyTorch 或 TensorFlow）及特定驱动才能启动程序。\n\n### 使用 DirectML 后\n- **全平台无缝覆盖**：DirectML 基于 DirectX 12 构建，自动利用所有兼容显卡（包括 AMD、Intel、NVIDIA 及高通芯片），确保任何 Windows 10\u002F11 设备均可流畅运行。\n- **实时低延迟响应**：借助 GPU 硬件加速，风格迁移推理速度提升数十倍，实现了毫秒级的画面反馈，完美满足游戏对实时性的苛刻要求。\n- **统一开发接口**：团队只需调用一套原生 C++ API 即可跨硬件运行，无需关心底层厂商差异，显著简化了代码架构并缩短了开发周期。\n- **零依赖轻量部署**：DirectML 作为 Windows 系统组件自带，用户无需额外安装任何重型 AI 框架，双击即可启动应用，极大降低了使用门槛。\n\nDirectML 通过统一的硬件加速层，让开发者能够以最低成本将高性能 AI 功能无缝交付给所有 Windows 用户。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fmicrosoft_DirectML_a60c4bb4.png","microsoft","Microsoft","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002Fmicrosoft_4900709c.png","Open source projects and samples from 
Microsoft",null,"opensource@microsoft.com","OpenAtMicrosoft","https:\u002F\u002Fopensource.microsoft.com","https:\u002F\u002Fgithub.com\u002Fmicrosoft",[86,90,94,98,102,106,110],{"name":87,"color":88,"percentage":89},"C++","#f34b7d",53.7,{"name":91,"color":92,"percentage":93},"Python","#3572A5",38.7,{"name":95,"color":96,"percentage":97},"C","#555555",4.4,{"name":99,"color":100,"percentage":101},"CMake","#DA3434",2.4,{"name":103,"color":104,"percentage":105},"PowerShell","#012456",0.6,{"name":107,"color":108,"percentage":109},"Shell","#89e051",0.2,{"name":111,"color":112,"percentage":113},"HLSL","#aace60",0,2553,332,"2026-04-03T22:48:16","MIT","Windows","必需。需要支持 DirectX 12 的 GPU。兼容型号包括：AMD GCN 第一代 (Radeon HD 7000 系列) 及以上、Intel Haswell (第 4 代酷睿) 集成显卡及以上、NVIDIA Kepler (GTX 600 系列) 及以上、Qualcomm Adreno 600 及以上。无需 CUDA。","未说明",{"notes":122,"python":123,"dependencies":124},"1. DirectML 目前处于维护模式，不再增加新功能，但会继续提供安全修复。Windows 11 24H2 及以后版本建议使用 Windows ML。2. 该工具原生支持 Windows 10 (1903\u002FBuild 18362) 及更新版本，也可通过 WSL 2 在 Linux 子系统上运行 PyTorch 和 TensorFlow 插件。3. 不支持 macOS 或原生 Linux 环境。4. 
开发者可通过 NuGet 获取 C++ API，或通过 PyPI 获取 Python 插件。","未说明 (通过 torch-directml 或 tensorflow-directml 插件使用时，需遵循对应框架的 Python 版本要求)",[125,126,127,128,129,130,131],"DirectX 12","Windows 10 SDK (版本 10.0.18362 或更高)","Microsoft.AI.DirectML (NuGet 包，可选独立分发)","torch-directml (PyTorch 插件，可选)","tensorflow-directml (TensorFlow 插件，可选)","ONNX Runtime (可选执行提供者)","PyDirectML (Python 投影库，用于运行 Python 示例)",[13],"2026-03-27T02:49:30.150509","2026-04-06T07:00:45.545630",[136,141,146,151,156,161],{"id":137,"question_zh":138,"answer_zh":139,"source_url":140},14100,"为什么 DirectML 的训练速度比 CUDA 慢很多（例如慢 2.8 倍）？","DirectML 在版本 1.3 到 1.8 之间存在一个性能缺陷，会影响特定硬件和驱动程序上的操作创建。解决方法是设置环境变量 TF_DIRECTML_KERNEL_CACHE_SIZE，将其值设为高于默认的 1024（例如 1300）。这不仅能绕过该缺陷，通常还能略微提升整体性能。\n\n设置方法（以 Linux\u002FWSL 为例）：\nexport TF_DIRECTML_KERNEL_CACHE_SIZE=1300","https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FDirectML\u002Fissues\u002F108",{"id":142,"question_zh":143,"answer_zh":144,"source_url":145},14101,"运行 PyTorch DirectML 时遇到 RuntimeError: expected key in DispatchKeySet... but got: PrivateUse1 错误怎么办？","该错误通常出现在较旧版本的 PyTorch 中，因为对 `PrivateUse1` 调度键的支持不完善。此问题应在 PyTorch v2.1.0 或更高版本中得到解决。请尝试升级您的 PyTorch 环境至 v2.1.0+。\n\n注意：如果您使用的是 `torch-directml` 包，请确保其版本也与最新的 PyTorch 兼容，早期版本（如 0.2 或 0.1.13）可能仍未修复此问题。","https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FDirectML\u002Fissues\u002F400",{"id":147,"question_zh":148,"answer_zh":149,"source_url":150},14102,"在 WSL 中使用 TensorFlow 时出现 \"Could not load dynamic library 'libcuda.so.1'\" 错误如何解决？","这个错误通常是因为 WSL 会话未正确加载 GPU 驱动或状态异常。即使已安装 NVIDIA 预览版驱动，也可能需要重置 WSL 状态。\n\n请尝试以下步骤：\n1. 关闭所有正在运行的 WSL 终端窗口。\n2. 在 Windows PowerShell 或命令提示符中运行命令：`wsl --shutdown`\n3. 
重新启动 WSL 终端并再次运行您的 Python 脚本。\n\n如果问题依旧，请确保已安装支持 DirectML 的 NVIDIA 预览版驱动程序。","https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FDirectML\u002Fissues\u002F18",{"id":152,"question_zh":153,"answer_zh":154,"source_url":155},14103,"使用 pip 安装 tensorflow-directml 时提示 \"Could not find a version that satisfies the requirement\" 怎么办？","这通常是因为当前激活的 Python 环境中缺少必要的依赖或上下文，或者未在正确的 Conda 环境中操作。\n\n解决方案：\n如果您使用 Conda 管理环境，请先激活名为 `directml` 的环境（或您创建的相关环境），然后再执行安装命令：\n```bash\nconda activate directml\npip install tensorflow-directml\n```\n确保您的 Python 版本与 tensorflow-directml 支持的版本匹配（通常推荐 Python 3.6 - 3.8，具体视包版本而定）。","https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FDirectML\u002Fissues\u002F38",{"id":157,"question_zh":158,"answer_zh":159,"source_url":160},14104,"DirectML 是否支持动态改变 GEMM 算子的输入维度（如 m, n, k）而无需重新编译？","DirectML 的算子通常在编译时需要确定形状，但某些特定硬件驱动更新已改善了内核创建速度慢的问题。对于动态尺寸输入，目前官方示例多要求预先设定常数维度。\n\n不过，针对 Intel 显卡，内核创建缓慢的问题已在驱动程序版本 31.0.101.5186 (Intel XE\u002FArch) 及更高版本中修复。如果您的应用场景涉及频繁的形状变化，建议更新显卡驱动至最新版本以获得更好的动态处理能力，尽管完全无需重编译的动态 GEMM 支持仍受限于 API 设计，通常需要通过重新绑定描述符或重建算子实例来适应新尺寸。","https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FDirectML\u002Fissues\u002F212",{"id":162,"question_zh":163,"answer_zh":164,"source_url":140},14105,"截至 2024 年，DirectML 的性能相比 CUDA 是否有显著提升？","根据社区反馈，截至 2024 年，部分用户在使用 ONNX Runtime 和 NVIDIA 1080 GTX 显卡时，发现 DirectML 的速度仍然比 CUDA 慢约 2 倍（使用 DirectML 1.10 版本）。\n\n虽然微软修复了早期版本（1.3-1.8）中的性能缺陷并通过调整缓存大小（TF_DIRECTML_KERNEL_CACHE_SIZE）优化了表现，但在纯训练任务和高负载场景下，DirectML 与原生 CUDA 之间仍存在明显的性能差距。对于对训练速度极其敏感的项目，目前 CUDA 仍是首选；DirectML 更适合用于无法使用 CUDA 的硬件（如 AMD 显卡）或在 Windows\u002FWSL 环境下进行推理任务。",[166,171,176,181,186],{"id":167,"version":168,"summary_zh":169,"released_at":170},80818,"torch-directml-0.2.4.dev240815","2024年8月15日构建的 torch-directml 预览版。\n\nPython 包已作为 [PyPI 发布](https:\u002F\u002Fpypi.org\u002Fproject\u002Ftorch-directml\u002F0.2.4.dev240815\u002F) 提供。要自动下载合适的 Python 包，只需运行 `pip install torch-directml` 即可。\n\n## 新增功能\n- 新增对 `avg_pool3d` 和 `upsample_bicubic2d` 算子的支持\n- 在 PyTorch 的 
`PrivateUse1` 后端中支持 `device_count`\n\n## 问题修复\n- 修复了使用 DirectML 缩放点积注意力时，Whisper 示例（[GitHub 地址](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FDirectML\u002Ftree\u002Fmaster\u002FPyTorch\u002Faudio\u002Fwhisper)）无法以 fp16 精度运行的问题 #598\n- 解决了与算子支持及 `PrivateUse1` 中 `device_count` 相关的问题 #596、#609","2024-08-19T20:59:02",{"id":172,"version":173,"summary_zh":174,"released_at":175},80819,"torch-directml-0.2.3.dev240715","2024年7月15日构建的 torch-directml 预览版。\n\nPython 包已作为 [PyPI 发布](https:\u002F\u002Fpypi.org\u002Fproject\u002Ftorch-directml\u002F0.2.3.dev240715\u002F) 提供。要自动下载合适的 Python 包，只需运行 `pip install torch-directml` 即可。\n\n## 新增内容\n- 新增对 `isin`、`std_mean`、`group_norm`、`multinomial` 算子的支持\n- 添加了[示例](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FDirectML\u002Ftree\u002Fmaster\u002FPyTorch\u002Fdiffusion\u002Fsd)，并支持 Stable Diffusion Turbo 和 XL Turbo\n","2024-07-17T15:03:14",{"id":177,"version":178,"summary_zh":179,"released_at":180},80820,"torch-directml-0.2.2.dev240614","2024年6月14日构建的 torch-directml 预览版。\n\nPython 包已作为 [PyPI 发布](https:\u002F\u002Fpypi.org\u002Fproject\u002Ftorch-directml\u002F0.2.2.dev240614\u002F) 提供。要自动下载合适的 Python 包，只需运行 `pip install torch-directml` 即可。\n\n## 新特性\n\n- 支持 PyTorch 2.3.1 和 torchvision 0.18.1\n- 新增对 `softplus`、`amax`、`linspace`、`vector_norm`、`native_dropout` 等算子的支持\n- 对 `linear` 算子进行了性能优化\n- 初步支持通过稠密张量实现 SparseTensor\n- 添加了 [示例](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FDirectML\u002Ftree\u002Fmaster\u002FPyTorch\u002Faudio\u002Fwhisper)，并支持 [OpenAI Whisper 模型](https:\u002F\u002Fgithub.com\u002Fopenai\u002Fwhisper)\n- 对 [LLM 示例](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FDirectML\u002Ftree\u002Fmaster\u002FPyTorch\u002Fllm) 代码进行了重构和清理，以提高可维护性\n- 增加了 Generator 注册，并为 torch `PrivateUse1` 后端添加了随机种子支持\n\n## Bug 修复\n\n- 修复了层归一化在非连续输入下产生错误结果的问题 #588\n- 解决了新算子及 `PrivateUse1` 随机种子支持相关的问题 
#592、#590、#587、#586","2024-06-15T01:16:25",{"id":182,"version":183,"summary_zh":184,"released_at":185},80821,"tensorflow-directml-1.15.3.dev200626","2020年6月26日构建的 TensorFlow-DirectML 预览版。\n\nPython 包已作为 [PyPI 发布](https:\u002F\u002Fpypi.org\u002Fproject\u002Ftensorflow-directml\u002F1.15.3.dev200626\u002F) 提供。要自动下载合适的 Python 包，只需运行 `pip install tensorflow-directml` 即可。\n\n## dev200626 中的更改：\n• 某些内存不足（OOM）错误现在会优雅地失败，而不会导致 Python 解释器崩溃。\n• 在启用 Grappler 优化时，允许使用全部显存。\n• 实现了 GRUBlockCell 算子。\n","2020-06-30T23:36:12",{"id":187,"version":188,"summary_zh":189,"released_at":190},80822,"tensorflow-directml-1.15.3.dev200615","TensorFlow 1.15.3 的首个预览版，同时支持 Windows 和 WSL 上的 DirectML。\n\n请访问 PyPI.org 上的 [tensorflow-directml](https:\u002F\u002Fpypi.org\u002Fproject\u002Ftensorflow-directml) 获取最新版本！","2020-06-17T04:31:06"]