[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-philferriere--dlwin":3,"tool-philferriere--dlwin":61},[4,18,26,36,44,53],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":17},4358,"openclaw","openclaw\u002Fopenclaw","OpenClaw 是一款专为个人打造的本地化 AI 助手，旨在让你在自己的设备上拥有完全可控的智能伙伴。它打破了传统 AI 助手局限于特定网页或应用的束缚，能够直接接入你日常使用的各类通讯渠道，包括微信、WhatsApp、Telegram、Discord、iMessage 等数十种平台。无论你在哪个聊天软件中发送消息，OpenClaw 都能即时响应，甚至支持在 macOS、iOS 和 Android 设备上进行语音交互，并提供实时的画布渲染功能供你操控。\n\n这款工具主要解决了用户对数据隐私、响应速度以及“始终在线”体验的需求。通过将 AI 部署在本地，用户无需依赖云端服务即可享受快速、私密的智能辅助，真正实现了“你的数据，你做主”。其独特的技术亮点在于强大的网关架构，将控制平面与核心助手分离，确保跨平台通信的流畅性与扩展性。\n\nOpenClaw 非常适合希望构建个性化工作流的技术爱好者、开发者，以及注重隐私保护且不愿被单一生态绑定的普通用户。只要具备基础的终端操作能力（支持 macOS、Linux 及 Windows WSL2），即可通过简单的命令行引导完成部署。如果你渴望拥有一个懂你",349277,3,"2026-04-06T06:32:30",[13,14,15,16],"Agent","开发框架","图像","数据工具","ready",{"id":19,"name":20,"github_repo":21,"description_zh":22,"stars":23,"difficulty_score":10,"last_commit_at":24,"category_tags":25,"status":17},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,"2026-04-05T11:01:52",[14,15,13],{"id":27,"name":28,"github_repo":29,"description_zh":30,"stars":31,"difficulty_score":32,"last_commit_at":33,"category_tags":34,"status":17},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 
代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",150037,2,"2026-04-10T23:33:47",[14,13,35],"语言模型",{"id":37,"name":38,"github_repo":39,"description_zh":40,"stars":41,"difficulty_score":32,"last_commit_at":42,"category_tags":43,"status":17},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",108322,"2026-04-10T11:39:34",[14,15,13],{"id":45,"name":46,"github_repo":47,"description_zh":48,"stars":49,"difficulty_score":32,"last_commit_at":50,"category_tags":51,"status":17},6121,"gemini-cli","google-gemini\u002Fgemini-cli","gemini-cli 是一款由谷歌推出的开源 AI 命令行工具，它将强大的 Gemini 大模型能力直接集成到用户的终端环境中。对于习惯在命令行工作的开发者而言，它提供了一条从输入提示词到获取模型响应的最短路径，无需切换窗口即可享受智能辅助。\n\n这款工具主要解决了开发过程中频繁上下文切换的痛点，让用户能在熟悉的终端界面内直接完成代码理解、生成、调试以及自动化运维任务。无论是查询大型代码库、根据草图生成应用，还是执行复杂的 Git 操作，gemini-cli 都能通过自然语言指令高效处理。\n\n它特别适合广大软件工程师、DevOps 人员及技术研究人员使用。其核心亮点包括支持高达 100 万 token 的超长上下文窗口，具备出色的逻辑推理能力；内置 Google 搜索、文件操作及 Shell 命令执行等实用工具；更独特的是，它支持 MCP（模型上下文协议），允许用户灵活扩展自定义集成，连接如图像生成等外部能力。此外，个人谷歌账号即可享受免费的额度支持，且项目基于 Apache 2.0 
协议完全开源，是提升终端工作效率的理想助手。",100752,"2026-04-10T01:20:03",[52,13,15,14],"插件",{"id":54,"name":55,"github_repo":56,"description_zh":57,"stars":58,"difficulty_score":32,"last_commit_at":59,"category_tags":60,"status":17},4721,"markitdown","microsoft\u002Fmarkitdown","MarkItDown 是一款由微软 AutoGen 团队打造的轻量级 Python 工具，专为将各类文件高效转换为 Markdown 格式而设计。它支持 PDF、Word、Excel、PPT、图片（含 OCR）、音频（含语音转录）、HTML 乃至 YouTube 链接等多种格式的解析，能够精准提取文档中的标题、列表、表格和链接等关键结构信息。\n\n在人工智能应用日益普及的今天，大语言模型（LLM）虽擅长处理文本，却难以直接读取复杂的二进制办公文档。MarkItDown 恰好解决了这一痛点，它将非结构化或半结构化的文件转化为模型“原生理解”且 Token 效率极高的 Markdown 格式，成为连接本地文件与 AI 分析 pipeline 的理想桥梁。此外，它还提供了 MCP（模型上下文协议）服务器，可无缝集成到 Claude Desktop 等 LLM 应用中。\n\n这款工具特别适合开发者、数据科学家及 AI 研究人员使用，尤其是那些需要构建文档检索增强生成（RAG）系统、进行批量文本分析或希望让 AI 助手直接“阅读”本地文件的用户。虽然生成的内容也具备一定可读性，但其核心优势在于为机器",93400,"2026-04-06T19:52:38",[52,14],{"id":62,"github_repo":63,"name":64,"description_en":65,"description_zh":66,"ai_summary_zh":66,"readme_en":67,"readme_zh":68,"quickstart_zh":69,"use_case_zh":70,"hero_image_url":71,"owner_login":72,"owner_name":73,"owner_avatar_url":74,"owner_bio":75,"owner_company":76,"owner_location":77,"owner_email":78,"owner_twitter":79,"owner_website":80,"owner_url":81,"languages":82,"stars":91,"forks":92,"last_commit_at":93,"license":79,"difficulty_score":94,"env_os":95,"env_gpu":96,"env_ram":97,"env_deps":98,"category_tags":111,"github_topics":112,"view_count":121,"oss_zip_url":79,"oss_zip_packed_at":79,"status":17,"created_at":122,"updated_at":123,"faqs":124,"releases":154},722,"philferriere\u002Fdlwin","dlwin","GPU-accelerated Deep Learning on Windows 10 native","dlwin 是一套专为 Windows 10 打造的深度学习环境搭建指南，帮助用户在原生系统中实现高效的 GPU 加速深度学习体验。以往在 Windows 上配置深度学习往往面临依赖复杂、环境冲突等问题，许多教程建议通过虚拟机或 Docker 解决，但这会增加资源消耗并降低性能。dlwin 摒弃了这些间接方案，提供了一套无需额外安装 MinGW、直接基于 Windows 原生的集成环境。\n\n它兼容 Keras、TensorFlow、CNTK、MXNet 和 PyTorch 五大主流框架，并支持多种 Keras 后端组合。对于必须在 Windows 环境下工作的开发者及研究人员，dlwin 简化了从 Visual Studio、CUDA 到各类 Python 库的配置流程，确保模型训练和实验能直接在本地 GPU 上流畅运行，无需切换操作系统即可享受强大的计算能力。","GPU-accelerated Deep Learning on Windows 
10 native (Keras\u002FTensorflow\u002FCNTK\u002FMXNet and PyTorch)\n===============================================================================\n\n**>> LAST UPDATED JUNE, 2018 \u003C\u003C**\n\n**This latest update:**\n- **supports 5 frameworks (Keras\u002FTensorflow\u002FCNTK\u002FMXNet and PyTorch),**\n- **supports 3 GPU-accelerated Keras backends (CNTK, Tensorflow, or MXNet),**\n- **doesn't require installing MinGW separately,**\n- **uses more recent versions of many python libraries.**\n\nThere are certainly a lot of guides to assist you in building great deep learning (DL) setups on Linux or Mac OS (including with Tensorflow which, unfortunately, as of this posting, cannot be easily installed on Windows), but few care about building an efficient Windows 10-**native** setup. Most focus on running an Ubuntu VM hosted on Windows or using Docker - both unnecessary and ultimately sub-optimal steps.\n\nWe also found enough misguiding\u002Fdeprecated information out there to make it worthwhile putting together a step-by-step guide for the latest stable versions of Keras, Tensorflow, CNTK, MXNet, and PyTorch. Used either together (e.g., Keras with Tensorflow backend), or independently -- PyTorch cannot be used as a Keras backend, TensorFlow can be used on its own -- they make for some of the most powerful deep learning python libraries to work natively on Windows.\n\nIf you **must** run your DL setup on Windows 10, then the information contained here will hopefully be useful to you.\n\nOlder installation instructions from [July 2017](README_July2017.md), [May 2017](README_May2017.md) and [January 2017](README_Jan2017.md) are still available. They allow you to use Theano as a Keras backend.\n\n# TOC\n\n- [Dependencies](#dependencies)\n- [Hardware](#hardware)\n- [Installation steps](#installation-steps)\n  * [Windows toolkits](#toolkits)\n    + [Visual Studio 2015 Community Edition Update 3 w. 
Windows Kit 10.0.10240.0](#visual-studio-2015-community-edition-update-3-w-windows-kit-100102400)\n    + [Anaconda 5.2.0 (64-bit) (Python 3.6 TF support \u002F Python 2.7 no TF support)](#anaconda-520-64-bit-python-36-tf-support-python-27-no-tf-support)\n      - [Create a `dlwin36` conda environment](#create-a-dlwin36-conda-environment)\n      - [Optional but highly-recommended image processing libraries](#optional-but-highly-recommended-image-processing-libraries)\n    + [CUDA 9.0.176 (64-bit)](#cuda-90176-64-bit)\n    + [cuDNN v7.0.4 (Nov 13, 2017) for CUDA 9.0](#cudnn-v704-nov-13-2017-for-cuda-90)\n  * [Deep learning python libraries](#deep-learning-python-libraries)\n    + [Installing `keras` 2.1.6](#installing-keras-216)\n    + [Installing `tensorflow-gpu` 1.8.0 (solo, or as a Keras backend)](#installing-tensorflow-gpu-180-solo-or-as-a-keras-backend)\n    + [Installing `cntk-gpu` 2.5.1 (solo, or as a Keras backend)](#installing-cntk-gpu-251-solo-or-as-a-keras-backend)\n    + [Installing `mxnet-cu90` 1.2.0 (solo, or as a Keras backend)](#installing-mxnet-cu90-120-solo-or-as-a-keras-backend)\n    + [Installing `pytorch` 0.4.0](#installing-pytorch-040)\n  * [Quick checks](#quick-checks)\n    + [Checking the list of Python libraries installed](#checking-the-list-of-python-libraries-installed)\n    + [Checking our PATH sysenv var](#checking-our-path-sysenv-var)\n    + [Quick-checking each main Python library install](#quick-checking-each-main-python-library-install)\n  * [GPU tests](#gpu-tests)\n    + [Validating our GPU install with Keras](#validating-our-gpu-install-with-keras)\n    + [Keras with Tensorflow backend (GPU disabled)](#keras-with-tensorflow-backend-gpu-disabled)\n    + [Keras with Tensorflow backend (using GPU)](#keras-with-tensorflow-backend-using-gpu)\n    + [Keras with CNTK backend (using GPU)](#keras-with-cntk-backend-using-gpu)\n    + [Keras with MXNet backend (using GPU)](#keras-with-mxnet-backend-using-gpu)\n    + [Validating our GPU install 
with PyTorch](#validating-our-gpu-install-with-pytorch)\n- [Suggested viewing and reading](#suggested-viewing-and-reading)\n- [About the Author](#about-the-author)\n\n\u003Csmall>\u003Ci>\u003Ca href='http:\u002F\u002Fecotrust-canada.github.io\u002Fmarkdown-toc\u002F'>Table of contents generated with markdown-toc\u003C\u002Fa>\u003C\u002Fi>\u003C\u002Fsmall>\n\n# Dependencies\n\nHere's a summary list of the tools and libraries we use for deep learning on Windows 10 (Version 1709 OS Build 16299.371):\n\n1. Visual Studio 2015 Community Edition Update 3 w. Windows Kit 10.0.10240.0\n   - Used for its C\u002FC++ compiler (not its IDE) and SDK. This specific version has been selected due to [Windows Compiler Support in CUDA](http:\u002F\u002Fdocs.nvidia.com\u002Fcuda\u002Fcuda-installation-guide-microsoft-windows\u002Findex.html#system-requirements).\n2. Anaconda (64-bit) w. Python 3.6 (Anaconda3-5.2.0) [for Tensorflow support] or Python 2.7 (Anaconda2-5.2.0) [no Tensorflow support] with MKL 2018.0.3\n   - A Python distro that gives us NumPy, SciPy, and other scientific libraries\n   - MKL is used for its CPU-optimized implementation of many linear algebra operations\n3. CUDA 9.0.176 (64-bit)\n   - Used for its GPU math libraries, card driver, and CUDA compiler\n4. cuDNN v7.0.4 (Nov 13, 2017) for CUDA 9.0.176\n   - Used to run vastly faster convolutional neural networks\n5. Keras 2.1.6 with three different backends: Tensorflow-gpu 1.8.0, CNTK-gpu 2.5.1, and MXNet-cuda90 1.2.0\n   - Keras is used for deep learning on top of Tensorflow, CNTK, or MXNet\n   - Tensorflow, CNTK, and MXNet are backends used to evaluate mathematical expressions on multi-dimensional arrays\n   - Theano is a legacy backend no longer in active development\n6. PyTorch v0.4.0\n\n# Hardware\n\n1. Dell Precision T7900, 64GB RAM\n   - Intel Xeon E5-2630 v4 @ 2.20 GHz (1 processor, 10 cores total, 20 logical processors)\n2. NVIDIA GeForce Titan X, 12GB RAM\n   - Driver version: 390.77 \u002F Win 10 64\n3. 
NVIDIA GeForce GTX 1080 Ti, 11GB RAM\n   - Driver version: 390.77 \u002F Win 10 64   \n\n# Installation steps\n\nWe like to keep our toolkits and libraries in a single root folder boringly called `e:\\toolkits.win`, so whenever you see a Windows path that starts with `e:\\toolkits.win` below, make sure to replace it with whatever you decide your own toolkit drive and folder ought to be.\n\n## Toolkits\n\n### Visual Studio 2015 Community Edition Update 3 w. Windows Kit 10.0.10240.0\n\nDownload [Visual Studio Community 2015 with Update 3 (x86)](https:\u002F\u002Fwww.visualstudio.com\u002Fvs\u002Folder-downloads). It is used by the CUDA toolkit.\n> Note that for downloading, a free [Visual Studio Dev Essentials](https:\u002F\u002Fwww.visualstudio.com\u002Fdev-essentials\u002F) license or a full Visual Studio Subscription is required.\n\nRun the downloaded executable to install Visual Studio, using whatever additional config settings work best for you:\n\n![](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fphilferriere_dlwin_readme_e92c161f156c.png)\n\n![](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fphilferriere_dlwin_readme_b61b0949c4bc.png)\n\n![](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fphilferriere_dlwin_readme_67e97acc9459.png)\n\n![](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fphilferriere_dlwin_readme_a833c6ffd09a.png)\n\n1. Add `C:\\Program Files (x86)\\Microsoft Visual Studio 14.0\\VC\\bin` to your `PATH`, based on where you installed VS 2015.\n2. Define sysenv variable `INCLUDE` with the value `C:\\Program Files (x86)\\Windows Kits\\10\\Include\\10.0.10240.0\\ucrt`\n3. Define sysenv variable `LIB` with the value `C:\\Program Files (x86)\\Windows Kits\\10\\Lib\\10.0.10240.0\\um\\x64;C:\\Program Files (x86)\\Windows Kits\\10\\Lib\\10.0.10240.0\\ucrt\\x64`\n\n> Reference Note: We couldn't run any Theano python files until we added the last two env variables above. 
We would get a `c:\\program files (x86)\\microsoft visual studio 14.0\\vc\\include\\crtdefs.h(10): fatal error C1083: Cannot open include file: 'corecrt.h': No such file or directory` error at compile time and missing `kernel32.lib uuid.lib ucrt.lib` errors at link time. True, you could probably run `C:\\Program Files (x86)\\Microsoft Visual Studio 14.0\\VC\\bin\\amd64\\vcvars64.bat` (with proper params) every single time you open a MINGW cmd prompt, but, obviously, none of the sysenv vars would stick from one session to the next.\n\n### Anaconda 5.2.0 (64-bit) (Python 3.6 TF support \u002F Python 2.7 no TF support)\n\nThis tutorial was initially created using Python 2.7. As Tensorflow has become the backend of choice for Keras, we've decided to document installation steps using Python 3.6 by default. Depending on your own preferred configuration, use `e:\\toolkits.win\\anaconda3-5.2.0` or `e:\\toolkits.win\\anaconda2-5.2.0` as the folder in which to install Anaconda.\n\nDownload the Python 3.6 Anaconda version from [here](https:\u002F\u002Frepo.continuum.io\u002Farchive\u002FAnaconda3-5.2.0-Windows-x86_64.exe) and the Python 2.7 version from [there](https:\u002F\u002Frepo.continuum.io\u002Farchive\u002FAnaconda2-5.2.0-Windows-x86_64.exe):\n\n[![](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fphilferriere_dlwin_readme_c9e2ed22ebb7.png)](https:\u002F\u002Frepo.continuum.io\u002Farchive\u002F)\n\nRun the downloaded executable to install Anaconda:\n\n![](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fphilferriere_dlwin_readme_82409e2bc139.png)\n![](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fphilferriere_dlwin_readme_0520e7cf270a.png)\n\n> Warning: Below, we enabled the second of the `Advanced Options` because it works for us, but that may not be the best option for you!\n\n![](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fphilferriere_dlwin_readme_e528a5db5614.png)\n\nDefine the following variable and update PATH as shown here:\n\n1. 
Define sysenv variable `PYTHON_HOME` with the value `e:\\toolkits.win\\anaconda3-5.2.0`\n2. Add `%PYTHON_HOME%`, `%PYTHON_HOME%\\Scripts`, and `%PYTHON_HOME%\\Library\\bin` to `PATH`\n\n#### Create a `dlwin36` conda environment\n\nAfter Anaconda installation, open a Windows command prompt and execute:\n\n```\n$ conda create --yes -n dlwin36 numpy scipy mkl-service m2w64-toolchain libpython matplotlib pandas scikit-learn tqdm jupyter h5py cython\n```\n\nHere's the [output log](installed_files\u002Fdlwin36_log.txt) for the command above.\n\nNext, use `activate dlwin36` to activate this new environment. By the way, if you already have an older `dlwin36` environment, you can delete it using `conda env remove -n dlwin36`.\n\n#### Optional but highly-recommended image processing libraries\n\nIf we're going to use the GPU, why did we install a CPU-optimized linear algebra library like MKL? It is true that, with our setup, most of the deep learning grunt work is performed by the GPU, but *the CPU isn't idle*. An important part of image-based Kaggle competitions is **data augmentation**. In that context, data augmentation is the process of manufacturing additional input samples (more training images) by transformation of the original training samples, via the use of image processing operators. Basic transformations such as downsampling and (mean-centered) normalization are also needed. If you feel adventurous, you'll want to try additional pre-processing enhancements (noise removal, histogram equalization, etc.). You certainly could use the GPU for that purpose and save the results to file. 
In practice, however, those operations are often executed **in parallel on the CPU** while the GPU is busy learning the weights of the deep neural network, with the augmented data discarded after use.\n\nIf your deep learning projects are image-based, we recommend also installing the following libraries:\n\n- `scikit-image`: open source image processing library for the Python programming language that includes algorithms for segmentation, geometric transformations, color space manipulation, analysis, filtering, morphology, feature detection, and more. See [this page](http:\u002F\u002Fscikit-image.org\u002F) for more info.\n- `opencv`: a library of programming functions mainly aimed at real-time computer vision. It has C++, Python and Java interfaces and supports many OS platforms, including Windows. See [this page](https:\u002F\u002Fopencv.org\u002F) for additional info.\n- `imgaug`: a staple of image-based Kaggle competitions, this python library helps you with augmenting images for your machine learning projects by converting a set of input images into a new, much larger set of slightly altered images. See [this page](https:\u002F\u002Fgithub.com\u002Faleju\u002Fimgaug) for details.\n\nTo install these libraries, use the following commands:\n\n```\n$ activate dlwin36\n(dlwin36) $ conda install --yes pillow scikit-image\n(dlwin36) $ conda install --yes -c conda-forge opencv\n(dlwin36) $ pip install git+https:\u002F\u002Fgithub.com\u002Faleju\u002Fimgaug\n```\n\nHere's an [output log](installed_files\u002Fdlwin36_imgproc_log.txt) for the commands above.\n\n### CUDA 9.0.176 (64-bit)\n\nDownload CUDA 9.0.176 (64-bit) from the [NVidia website](https:\u002F\u002Fdeveloper.nvidia.com\u002Fcuda-90-download-archive).\n\nWhy not install CUDA 9.1? 
Simply because, as of this writing, Tensorflow 1.8 still uses CUDA 9.0 (see issue [#15140](https:\u002F\u002Fgithub.com\u002Ftensorflow\u002Ftensorflow\u002Fissues\u002F15140)).\n\nSelect the proper target platform:\n\n![](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fphilferriere_dlwin_readme_4c86d1650adc.png)\n\nDownload all the installers:\n\n![](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fphilferriere_dlwin_readme_5c4166537e45.png)\n\nRun the downloaded installers one after the other. Install the files in `e:\\toolkits.win\\cuda-9.0.176`:\n\n![](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fphilferriere_dlwin_readme_00f0977f207f.png)\n\n![](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fphilferriere_dlwin_readme_f736aedbe835.png)\n\n![](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fphilferriere_dlwin_readme_3dcb54d98f7f.png)\n\n![](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fphilferriere_dlwin_readme_cdae4759cf47.png)\n\nAfter completion, the installer should have created a system environment (sysenv) variable named `CUDA_PATH` and added `%CUDA_PATH%\\bin` as well as `%CUDA_PATH%\\libnvvp` to `PATH`. Check that it is indeed the case. If, for some reason, the CUDA env vars are missing, then:\n\n1. Define a system environment (sysenv) variable named `CUDA_PATH` with the value `e:\\toolkits.win\\cuda-9.0.176`\n2. Add `%CUDA_PATH%\\bin` and `%CUDA_PATH%\\libnvvp` to `PATH`\n\n### cuDNN v7.0.4 (Nov 13, 2017) for CUDA 9.0\n\nPer NVidia's [website](https:\u002F\u002Fdeveloper.nvidia.com\u002Fcudnn), \"cuDNN provides highly tuned implementations for standard routines such as forward and backward convolution, pooling, normalization, and activation layers,\" hallmarks of convolutional network architectures. Download cuDNN from [here](https:\u002F\u002Fdeveloper.nvidia.com\u002Frdp\u002Fcudnn-download). 
Choose the cuDNN Library for Windows 10 that matches the CUDA version:\n\nNvidia has recently removed the option for the 7.0.4 Windows download. You can download it [here](https:\u002F\u002Fdeveloper.nvidia.com\u002Fcompute\u002Fmachine-learning\u002Fcudnn\u002Fsecure\u002Fv7.0.4\u002Fprod\u002F9.0_20171031\u002Fcudnn-9.0-windows10-x64-v7).\n\n![](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fphilferriere_dlwin_readme_bccbb9a18d25.png)\n\nThe downloaded ZIP file contains three directories (`bin`, `include`, `lib`). Extract and copy their content to the identically-named `bin`, `include` and `lib` directories in `%CUDA_PATH%`.\n\n## Deep learning python libraries\n\n### Installing `keras` 2.1.6\n\nWhy not just install the latest bleeding-edge\u002Fdev version of Keras and various backends (Tensorflow, CNTK or Theano)? Simply put, because it makes [reproducible research](https:\u002F\u002Fwww.coursera.org\u002Flearn\u002Freproducible-research) harder. If your work colleagues or Kaggle teammates install the latest code from the dev branch at a different time than you did, you will most likely be running different code bases on your machines, increasing the odds that even though you're using the same input data (the same random seeds, etc.), you still end up with different results when you shouldn't. 
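One way to guard against this is to pin the exact point releases used throughout this guide in a single requirements file (a sketch, not part of the original instructions; the CNTK wheel URL is the one used in its install section below, and PyTorch itself is installed via conda, as shown later):

```
keras==2.1.6
tensorflow-gpu==1.8.0
mxnet-cu90==1.2.0
keras-mxnet==2.1.6.1
torchvision==0.2.1
# CNTK ships as a direct wheel URL rather than a PyPI release:
https://cntk.ai/PythonWheel/GPU/cntk_gpu-2.5.1-cp36-cp36m-win_amd64.whl
```

Running `pip install -r requirements.txt` inside the activated `dlwin36` environment then reproduces the same library versions on every machine.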
For this reason alone, we highly recommend only using point releases, the same one across machines, and always documenting which one you use if you can't just use a setup script.\n\nInstall Keras as follows:\n\n```\n(dlwin36) $ pip install keras==2.1.6\nCollecting keras==2.1.6\n  Using cached https:\u002F\u002Ffiles.pythonhosted.org\u002Fpackages\u002F54\u002Fe8\u002Feaff7a09349ae9bd40d3ebaf028b49f5e2392c771f294910f75bb608b241\u002FKeras-2.1.6-py2.py3-none-any.whl\nRequirement already satisfied: numpy>=1.9.1 in e:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\lib\\site-packages (from keras==2.1.6) (1.14.5)\nRequirement already satisfied: scipy>=0.14 in e:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\lib\\site-packages (from keras==2.1.6) (1.1.0)\nRequirement already satisfied: h5py in e:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\lib\\site-packages (from keras==2.1.6) (2.8.0)\nRequirement already satisfied: pyyaml in e:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\lib\\site-packages (from keras==2.1.6) (3.12)\nRequirement already satisfied: six>=1.9.0 in e:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\lib\\site-packages (from keras==2.1.6) (1.11.0)\ndistributed 1.22.0 requires msgpack, which is not installed.\nInstalling collected packages: keras\nSuccessfully installed keras-2.1.6\n```\n\n### Installing `tensorflow-gpu` 1.8.0 (solo, or as a Keras backend)\n\nRun the following command to install Tensorflow:\n\n```\n(dlwin36) $ pip install tensorflow-gpu==1.8.0\nCollecting tensorflow-gpu==1.8.0\n  Using cached https:\u002F\u002Ffiles.pythonhosted.org\u002Fpackages\u002F42\u002Fa8\u002F4c96a2b4f88f5d6dfd70313ebf38de1fe4d49ba9bf2ef34dc12dd198ab9a\u002Ftensorflow_gpu-1.8.0-cp36-cp36m-win_amd64.whl\nRequirement already satisfied: six>=1.10.0 in e:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\lib\\site-packages (from tensorflow-gpu==1.8.0) (1.11.0)\nCollecting grpcio>=1.8.6 (from tensorflow-gpu==1.8.0)\n  Downloading 
https:\u002F\u002Ffiles.pythonhosted.org\u002Fpackages\u002F5d\u002F8b\u002F104918993129d6c919a16826e6adcfa4a106c791da79fb9655c5b22ad9ff\u002Fgrpcio-1.12.1-cp36-cp36m-win_amd64.whl (1.4MB)\n    100% |████████████████████████████████| 1.4MB 6.6MB\u002Fs\nCollecting gast>=0.2.0 (from tensorflow-gpu==1.8.0)\nCollecting tensorboard\u003C1.9.0,>=1.8.0 (from tensorflow-gpu==1.8.0)\n  Using cached https:\u002F\u002Ffiles.pythonhosted.org\u002Fpackages\u002F59\u002Fa6\u002F0ae6092b7542cfedba6b2a1c9b8dceaf278238c39484f3ba03b03f07803c\u002Ftensorboard-1.8.0-py3-none-any.whl\nRequirement already satisfied: wheel>=0.26 in e:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\lib\\site-packages (from tensorflow-gpu==1.8.0) (0.31.1)\nCollecting termcolor>=1.1.0 (from tensorflow-gpu==1.8.0)\nRequirement already satisfied: numpy>=1.13.3 in e:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\lib\\site-packages (from tensorflow-gpu==1.8.0) (1.14.5)\nCollecting protobuf>=3.4.0 (from tensorflow-gpu==1.8.0)\n  Downloading https:\u002F\u002Ffiles.pythonhosted.org\u002Fpackages\u002F75\u002F7a\u002F0dba607e50b97f6a89fa3f96e23bf56922fa59d748238b30507bfe361bbc\u002Fprotobuf-3.6.0-cp36-cp36m-win_amd64.whl (1.1MB)\n    100% |████████████████████████████████| 1.1MB 6.6MB\u002Fs\nCollecting absl-py>=0.1.6 (from tensorflow-gpu==1.8.0)\n  Downloading https:\u002F\u002Ffiles.pythonhosted.org\u002Fpackages\u002F57\u002F8d\u002F6664518f9b6ced0aa41cf50b989740909261d4c212557400c48e5cda0804\u002Fabsl-py-0.2.2.tar.gz (82kB)\n    100% |████████████████████████████████| 92kB 5.9MB\u002Fs\nCollecting astor>=0.6.0 (from tensorflow-gpu==1.8.0)\n  Using cached https:\u002F\u002Ffiles.pythonhosted.org\u002Fpackages\u002Fb2\u002F91\u002Fcc9805f1ff7b49f620136b3a7ca26f6a1be2ed424606804b0fbcf499f712\u002Fastor-0.6.2-py2.py3-none-any.whl\nCollecting html5lib==0.9999999 (from tensorboard\u003C1.9.0,>=1.8.0->tensorflow-gpu==1.8.0)\nCollecting werkzeug>=0.11.10 (from tensorboard\u003C1.9.0,>=1.8.0->tensorflow-gpu==1.8.0)\n 
 Using cached https:\u002F\u002Ffiles.pythonhosted.org\u002Fpackages\u002F20\u002Fc4\u002F12e3e56473e52375aa29c4764e70d1b8f3efa6682bef8d0aae04fe335243\u002FWerkzeug-0.14.1-py2.py3-none-any.whl\nCollecting bleach==1.5.0 (from tensorboard\u003C1.9.0,>=1.8.0->tensorflow-gpu==1.8.0)\n  Using cached https:\u002F\u002Ffiles.pythonhosted.org\u002Fpackages\u002F33\u002F70\u002F86c5fec937ea4964184d4d6c4f0b9551564f821e1c3575907639036d9b90\u002Fbleach-1.5.0-py2.py3-none-any.whl\nCollecting markdown>=2.6.8 (from tensorboard\u003C1.9.0,>=1.8.0->tensorflow-gpu==1.8.0)\n  Using cached https:\u002F\u002Ffiles.pythonhosted.org\u002Fpackages\u002F6d\u002F7d\u002F488b90f470b96531a3f5788cf12a93332f543dbab13c423a5e7ce96a0493\u002FMarkdown-2.6.11-py2.py3-none-any.whl\nRequirement already satisfied: setuptools in e:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\lib\\site-packages (from protobuf>=3.4.0->tensorflow-gpu==1.8.0) (39.2.0)\nBuilding wheels for collected packages: absl-py\n  Running setup.py bdist_wheel for absl-py ... 
done\n  Stored in directory: C:\\Users\\Phil\\AppData\\Local\\pip\\Cache\\wheels\\a0\\f8\\e9\\1933dbb3447ea6ef557062fd5461cb118deb8c2ed074e8344bf\nSuccessfully built absl-py\ndistributed 1.22.0 requires msgpack, which is not installed.\nInstalling collected packages: grpcio, gast, html5lib, werkzeug, bleach, markdown, protobuf, tensorboard, termcolor, absl-py, astor, tensorflow-gpu\n  Found existing installation: html5lib 1.0.1\n    Uninstalling html5lib-1.0.1:\n      Successfully uninstalled html5lib-1.0.1\n  Found existing installation: bleach 2.1.3\n    Uninstalling bleach-2.1.3:\n      Successfully uninstalled bleach-2.1.3\nSuccessfully installed absl-py-0.2.2 astor-0.6.2 bleach-1.5.0 gast-0.2.0 grpcio-1.12.1 html5lib-0.9999999 markdown-2.6.11 protobuf-3.6.0 tensorboard-1.8.0 tensorflow-gpu-1.8.0 termcolor-1.1.0 werkzeug-0.14.1\n```\n\nIf you want TensorFlow to be the default Keras backend, define a system environment variable named `KERAS_BACKEND` with the value `tensorflow`.\n\n### Installing `cntk-gpu` 2.5.1 (solo, or as a Keras backend)\n\nAs documented at [this link](https:\u002F\u002Fdocs.microsoft.com\u002Fen-us\u002Fcognitive-toolkit\u002Fsetup-windows-python), install CNTK GPU as follows:\n\n```\n(dlwin36) $ pip install https:\u002F\u002Fcntk.ai\u002FPythonWheel\u002FGPU\u002Fcntk_gpu-2.5.1-cp36-cp36m-win_amd64.whl\nCollecting cntk-gpu==2.5.1 from https:\u002F\u002Fcntk.ai\u002FPythonWheel\u002FGPU\u002Fcntk_gpu-2.5.1-cp36-cp36m-win_amd64.whl\n  Downloading https:\u002F\u002Fcntk.ai\u002FPythonWheel\u002FGPU\u002Fcntk_gpu-2.5.1-cp36-cp36m-win_amd64.whl (428.6MB)\n    100% |████████████████████████████████| 428.6MB 53kB\u002Fs\nRequirement already satisfied: scipy>=0.17 in e:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\lib\\site-packages (from cntk-gpu==2.5.1) (1.1.0)\nRequirement already satisfied: numpy>=1.11 in e:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\lib\\site-packages (from cntk-gpu==2.5.1) (1.14.5)\ndistributed 1.22.0 requires msgpack, 
which is not installed.\nInstalling collected packages: cntk-gpu\nSuccessfully installed cntk-gpu-2.5.1\n```\n\nIf you want CNTK to be the default Keras backend, define a system environment variable named `KERAS_BACKEND` with the value `cntk`.\n\n### Installing `mxnet-cu90` 1.2.0 (solo, or as a Keras backend)\n\nMXNet is a deep learning framework with strong backing from Amazon (through AWS). It is also supported by Microsoft on Azure. To install it, run the following command:\n\n```\n(dlwin36) $ pip install mxnet-cu90==1.2.0 keras-mxnet==2.1.6.1\nCollecting mxnet-cu90==1.2.0\n  Downloading https:\u002F\u002Ffiles.pythonhosted.org\u002Fpackages\u002F72\u002Fa8\u002F9226bd6913b7ba4657a218b9a252b60de98938dd41e8517a0b4ab4291203\u002Fmxnet_cu90-1.2.0-py2.py3-none-win_amd64.whl (457.0MB)\n    100% |████████████████████████████████| 457.0MB 47kB\u002Fs\nRequirement already satisfied: numpy in e:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\lib\\site-packages (from mxnet-cu90==1.2.0) (1.14.5)\nRequirement already satisfied: graphviz in e:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\lib\\site-packages (from mxnet-cu90==1.2.0) (0.8.3)\nCollecting keras-mxnet==2.1.6.1\n  Downloading https:\u002F\u002Ffiles.pythonhosted.org\u002Fpackages\u002F99\u002F93\u002F13ec18147fcef7c393e3fbf2d2c20171975be14e68d4c915b194be174ab6\u002Fkeras_mxnet-2.1.6.1-py2.py3-none-any.whl (388kB)\n    100% |████████████████████████████████| 389kB 3.3MB\u002Fs\nCollecting requests (from mxnet-cu90==1.2.0)\n  Downloading https:\u002F\u002Ffiles.pythonhosted.org\u002Fpackages\u002F65\u002F47\u002F7e02164a2a3db50ed6d8a6ab1d6d60b69c4c3fdf57a284257925dfc12bda\u002Frequests-2.19.1-py2.py3-none-any.whl (91kB)\n    100% |████████████████████████████████| 92kB 1.2MB\u002Fs\nCollecting urllib3\u003C1.24,>=1.21.1 (from requests->mxnet-cu90==1.2.0)\n  Downloading 
https:\u002F\u002Ffiles.pythonhosted.org\u002Fpackages\u002Fbd\u002Fc9\u002F6fdd990019071a4a32a5e7cb78a1d92c53851ef4f56f62a3486e6a7d8ffb\u002Furllib3-1.23-py2.py3-none-any.whl (133kB)\n    100% |████████████████████████████████| 143kB 2.2MB\u002Fs\nCollecting chardet\u003C3.1.0,>=3.0.2 (from requests->mxnet-cu90==1.2.0)\n  Downloading https:\u002F\u002Ffiles.pythonhosted.org\u002Fpackages\u002Fbc\u002Fa9\u002F01ffebfb562e4274b6487b4bb1ddec7ca55ec7510b22e4c51f14098443b8\u002Fchardet-3.0.4-py2.py3-none-any.whl (133kB)\n    100% |████████████████████████████████| 143kB 2.2MB\u002Fs\nRequirement already satisfied: certifi>=2017.4.17 in e:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\lib\\site-packages (from requests->mxnet-cu90==1.2.0) (2018.4.16)\nCollecting idna\u003C2.8,>=2.5 (from requests->mxnet-cu90==1.2.0)\n  Downloading https:\u002F\u002Ffiles.pythonhosted.org\u002Fpackages\u002F4b\u002F2a\u002F0276479a4b3caeb8a8c1af2f8e4355746a97fab05a372e4a2c6a6b876165\u002Fidna-2.7-py2.py3-none-any.whl (58kB)\n    100% |████████████████████████████████| 61kB 3.9MB\u002Fs\ndistributed 1.22.0 requires msgpack, which is not installed.\nInstalling collected packages: urllib3, chardet, idna, requests, mxnet-cu90\nSuccessfully installed chardet-3.0.4 idna-2.7 mxnet-cu90-1.2.0 requests-2.19.1 urllib3-1.23\n```\n\nIf you want MXNet to be the default Keras backend, define a system environment variable named `KERAS_BACKEND` with the value `mxnet`.\n\n### Installing `pytorch` 0.4.0\n\nPyTorch is Facebook AI Research (FAIR)'s answer to Google's Tensorflow.  Only with version v0.4.0 does it **officially** support Windows (x64). 
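Once the install steps below have completed, a short script can confirm that PyTorch actually sees the GPU (a minimal sketch, not from the original guide; device index 0 is an assumption):

```python
# Minimal PyTorch GPU smoke test: run inside the activated dlwin36 env
# after the conda/pip steps below; device index 0 is assumed.
import torch

print("PyTorch:", torch.__version__)
if torch.cuda.is_available():
    print("CUDA device:", torch.cuda.get_device_name(0))
    x = torch.randn(64, 64).cuda()    # allocate a tensor on the GPU
    y = torch.mm(x, x.t()).cpu()      # matmul on-device, then copy back
    print("GPU matmul OK, result shape:", tuple(y.shape))
else:
    print("CUDA not available - check the driver and CUDA_PATH setup")
```

If the else branch fires despite a working driver, re-check that `%CUDA_PATH%\bin` is on `PATH` and that the installed wheel is the `cuda90` build.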
Setup requires installing `pytorch`, `cuda90`, and `torchvision`, so first run the following command:\n\n```\n(dlwin36) $ conda install --yes pytorch==0.4.0 cuda90 -c pytorch\nSolving environment: done\n\n## Package Plan ##\n\n  environment location: e:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\n\n  added \u002F updated specs:\n    - cuda90\n    - pytorch==0.4.0\n\n\nThe following packages will be downloaded:\n\n    package                    |            build\n    ---------------------------|-----------------\n    cuda90-1.0                 |                0           2 KB  pytorch\n    certifi-2018.4.16          |           py36_0         143 KB\n    pytorch-0.4.0              |py36_cuda90_cudnn7he774522_1       577.6 MB  pytorch\n    ------------------------------------------------------------\n                                           Total:       577.7 MB\n\nThe following NEW packages will be INSTALLED:\n\n    cffi:      1.11.5-py36h945400d_0\n    cuda90:    1.0-0                              pytorch\n    pycparser: 2.18-py36hd053e01_1\n    pytorch:   0.4.0-py36_cuda90_cudnn7he774522_1 pytorch     [cuda90]\n\nThe following packages will be UPDATED:\n\n    certifi:   2018.4.16-py36_0                   conda-forge --> 2018.4.16-py36_0\n\n\nDownloading and Extracting Packages\ncuda90-1.0           |    2 KB | ############################################################################## | 100%\ncertifi-2018.4.16    |  143 KB | ############################################################################## | 100%\npytorch-0.4.0        | 577.6 MB | ############################################################################# | 100%\nPreparing transaction: done\nVerifying transaction: done\nExecuting transaction: done\n```\n\nSecond, install `torchvision` with this command:\n\n```\n(dlwin36) $ pip install torchvision==0.2.1\nCollecting torchvision==0.2.1\n  Using cached 
https:\u002F\u002Ffiles.pythonhosted.org\u002Fpackages\u002Fca\u002F0d\u002Ff00b2885711e08bd71242ebe7b96561e6f6d01fdb4b9dcf4d37e2e13c5e1\u002Ftorchvision-0.2.1-py2.py3-none-any.whl\nRequirement already satisfied: numpy in e:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\lib\\site-packages (from torchvision==0.2.1) (1.14.5)\nRequirement already satisfied: pillow>=4.1.1 in e:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\lib\\site-packages (from torchvision==0.2.1) (5.1.0)\nRequirement already satisfied: six in e:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\lib\\site-packages (from torchvision==0.2.1) (1.11.0)\nRequirement already satisfied: torch in e:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\lib\\site-packages (from torchvision==0.2.1) (0.4.0)\ndistributed 1.22.0 requires msgpack, which is not installed.\nInstalling collected packages: torchvision\nSuccessfully installed torchvision-0.2.1\n```\n\nIf you have issues with PyTorch on Windows, I highly recommend reading their [Windows FAQ](http:\u002F\u002Fpytorch.org\u002Fdocs\u002Fstable\u002Fnotes\u002Fwindows.html).\n\n## Quick checks\n\n### Checking the list of Python libraries installed\n\nYou should end up with the following list of libraries in your `dlwin36` conda environment:\n\n```\n(dlwin36) $ conda list\n# packages in environment at e:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36:\n#\n# Name                    Version                   Build  Channel\nabsl-py                   0.2.2                     \u003Cpip>\nastor                     0.6.2                     \u003Cpip>\nbackcall                  0.1.0                    py36_0  \nblas                      1.0                         mkl  \nbleach                    1.5.0                     \u003Cpip>\nbleach                    2.1.3                    py36_0  \nbokeh                     0.12.16                  py36_0  \nca-certificates           2018.4.16                     0    conda-forge\ncertifi                   2018.4.16         
       py36_0  \ncffi                      1.11.5           py36h945400d_0  \nchardet                   3.0.4                     \u003Cpip>\nclick                     6.7              py36hec8c647_0  \ncloudpickle               0.5.3                    py36_0  \ncntk-gpu                  2.5.1                     \u003Cpip>\ncolorama                  0.3.9            py36h029ae33_0  \ncuda90                    1.0                           0    pytorch\ncycler                    0.10.0           py36h009560c_0  \ncython                    0.28.3           py36hfa6e2cd_0  \ncytoolz                   0.9.0.1          py36hfa6e2cd_0  \ndask                      0.18.0                   py36_0  \ndask-core                 0.18.0                   py36_0  \ndecorator                 4.3.0                    py36_0  \ndistributed               1.22.0                   py36_0  \nentrypoints               0.2.3            py36hfd66bb0_2  \nfreetype                  2.8.1                    vc14_0  [vc14]  conda-forge\ngast                      0.2.0                     \u003Cpip>\ngraphviz                  0.8.3                     \u003Cpip>\ngrpcio                    1.12.1                    \u003Cpip>\nh5py                      2.8.0            py36h3bdd7fb_0  \nhdf5                      1.10.2                   vc14_0  [vc14]  conda-forge\nheapdict                  1.0.0                    py36_2  \nhtml5lib                  1.0.1            py36h047fa9f_0  \nhtml5lib                  0.9999999                 \u003Cpip>\nicc_rt                    2017.0.4             h97af966_0  \nicu                       58.2                     vc14_0  [vc14]  conda-forge\nidna                      2.7                       \u003Cpip>\nimageio                   2.3.0                    py36_0  \nimgaug                    0.2.5                     \u003Cpip>\nintel-openmp              2018.0.3                      0  \nipykernel                 4.8.2                    py36_0  
\nipython                   6.4.0                    py36_0  \nipython_genutils          0.2.0            py36h3c5d0ee_0  \nipywidgets                7.2.1                    py36_0  \njedi                      0.12.0                   py36_1  \njinja2                    2.10             py36h292fed1_0  \njpeg                      9b                       vc14_2  [vc14]  conda-forge\njsonschema                2.6.0            py36h7636477_0  \njupyter                   1.0.0                    py36_4  \njupyter_client            5.2.3                    py36_0  \njupyter_console           5.2.0            py36h6d89b47_1  \njupyter_core              4.4.0            py36h56e9d50_0  \nKeras                     2.1.6                     \u003Cpip>\nkiwisolver                1.0.1            py36h12c3424_0  \nlibpng                    1.6.34                   vc14_0  [vc14]  conda-forge\nlibpython                 2.1                      py36_0  \nlibsodium                 1.0.16                   vc14_0  [vc14]  conda-forge\nlibtiff                   4.0.9                    vc14_0  [vc14]  conda-forge\nlibwebp                   0.5.2                    vc14_7  [vc14]  conda-forge\nlocket                    0.2.0            py36hfed976d_1  \nm2w64-binutils            2.25.1                        5  \nm2w64-bzip2               1.0.6                         6  \nm2w64-crt-git             5.0.0.4636.2595836               2  \nm2w64-gcc                 5.3.0                         6  \nm2w64-gcc-ada             5.3.0                         6  \nm2w64-gcc-fortran         5.3.0                         6  \nm2w64-gcc-libgfortran     5.3.0                         6  \nm2w64-gcc-libs            5.3.0                         7  \nm2w64-gcc-libs-core       5.3.0                         7  \nm2w64-gcc-objc            5.3.0                         6  \nm2w64-gmp                 6.1.0                         2  \nm2w64-headers-git         5.0.0.4636.c0ad18a               2  
\nm2w64-isl                 0.16.1                        2  \nm2w64-libiconv            1.14                          6  \nm2w64-libmangle-git       5.0.0.4509.2e5a9a2               2  \nm2w64-libwinpthread-git   5.0.0.4634.697f757               2  \nm2w64-make                4.1.2351.a80a8b8               2  \nm2w64-mpc                 1.0.3                         3  \nm2w64-mpfr                3.1.4                         4  \nm2w64-pkg-config          0.29.1                        2  \nm2w64-toolchain           5.3.0                         7  \nm2w64-tools-git           5.0.0.4592.90b8472               2  \nm2w64-windows-default-manifest 6.4                           3  \nm2w64-winpthreads-git     5.0.0.4634.697f757               2  \nm2w64-zlib                1.2.8                        10  \nMarkdown                  2.6.11                    \u003Cpip>\nmarkupsafe                1.0              py36h0e26971_1  \nmatplotlib                2.2.2                    py36_1    conda-forge\nmistune                   0.8.3            py36hfa6e2cd_1  \nmkl                       2018.0.3                      1  \nmkl-service               1.1.2            py36h57e144c_4  \nmkl_fft                   1.0.1            py36h452e1ab_0  \nmkl_random                1.0.1            py36h9258bd6_0  \nmsgpack-python            0.5.6            py36he980bc4_0  \nmsys2-conda-epoch         20160418                      1  \nmxnet-cu90                1.2.0                     \u003Cpip>\nnbconvert                 5.3.1            py36h8dc0fde_0  \nnbformat                  4.4.0            py36h3a5bc1b_0  \nnetworkx                  2.1                      py36_0  \nnotebook                  5.5.0                    py36_0  \nnumpy                     1.14.5           py36h9fa60d3_0  \nnumpy-base                1.14.5           py36h5c71026_0  \nolefile                   0.45.1                   py36_0  \nopencv                    3.4.1                  py36_200    
conda-forge\nopenssl                   1.0.2o                   vc14_0  [vc14]  conda-forge\npackaging                 17.1                     py36_0  \npandas                    0.23.1           py36h830ac7b_0  \npandoc                    1.19.2.1             hb2460c7_1  \npandocfilters             1.4.2            py36h3ef6317_1  \nparso                     0.2.1                    py36_0  \npartd                     0.3.8            py36hc8e763b_0  \npickleshare               0.7.4            py36h9de030f_0  \npillow                    5.1.0            py36h0738816_0  \npip                       10.0.1                   py36_0  \nprompt_toolkit            1.0.15           py36h60b8f86_0  \nprotobuf                  3.6.0                     \u003Cpip>\npsutil                    5.4.6            py36hfa6e2cd_0  \npycparser                 2.18             py36hd053e01_1  \npygments                  2.2.0            py36hb010967_0  \npyparsing                 2.2.0            py36h785a196_1  \npyqt                      5.6.0                    py36_2  \npython                    3.6.5                h0c2934d_0  \npython-dateutil           2.7.3                    py36_0  \npytorch                   0.4.0           py36_cuda90_cudnn7he774522_1  [cuda90]  pytorch\npytz                      2018.4                   py36_0  \npywavelets                0.5.2            py36hc649158_0  \npywinpty                  0.5.4                    py36_0  \npyyaml                    3.12             py36h1d1928f_1  \npyzmq                     17.0.0           py36hfa6e2cd_1  \nqt                        5.6.2                    vc14_1  [vc14]  conda-forge\nqtconsole                 4.3.1            py36h99a29a9_0  \nrequests                  2.19.1                    \u003Cpip>\nscikit-image              0.13.1           py36hfa6e2cd_1  \nscikit-learn              0.19.1           py36h53aea1b_0  \nscipy                     1.1.0            py36h672f292_0  \nsend2trash            
    1.5.0                    py36_0  \nsetuptools                39.2.0                   py36_0  \nsimplegeneric             0.8.1                    py36_2  \nsip                       4.19.8           py36h6538335_0  \nsix                       1.11.0           py36h4db2310_1  \nsortedcontainers          2.0.4                    py36_0  \nsqlite                    3.22.0                   vc14_0  [vc14]  conda-forge\ntblib                     1.3.2            py36h30f5020_0  \ntensorboard               1.8.0                     \u003Cpip>\ntensorflow-gpu            1.8.0                     \u003Cpip>\ntermcolor                 1.1.0                     \u003Cpip>\nterminado                 0.8.1                    py36_1  \ntestpath                  0.3.1            py36h2698cfe_0  \ntk                        8.6.7                    vc14_0  [vc14]  conda-forge\ntoolz                     0.9.0                    py36_0  \ntorchvision               0.2.1                     \u003Cpip>\ntornado                   5.0.2                    py36_0  \ntqdm                      4.23.4                   py36_0  \ntraitlets                 4.3.2            py36h096827d_0  \nurllib3                   1.23                      \u003Cpip>\nvc                        14                   h0510ff6_3  \nvs2015_runtime            14.0.25123                    3  \nwcwidth                   0.1.7            py36h3d5aa90_0  \nwebencodings              0.5.1            py36h67c50ae_1  \nWerkzeug                  0.14.1                    \u003Cpip>\nwheel                     0.31.1                   py36_0  \nwidgetsnbextension        3.2.1                    py36_0  \nwincertstore              0.2              py36h7fe50ca_0  \nwinpty                    0.4.3                         4  \nyaml                      0.1.7                    vc14_0  [vc14]  conda-forge\nzeromq                    4.2.5                    vc14_1  [vc14]  conda-forge\nzict                      0.1.3      
      py36h2d8e73e_0  \nzlib                      1.2.11                   vc14_0  [vc14]  conda-forge\n```\n\n### Checking our PATH sysenv var\n\nAt this point, whenever the `dlwin36` conda environment is active, the `PATH` environment variable should look something like:\n\n```\ne:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\ne:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\Library\\mingw-w64\\bin\ne:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\Library\\usr\\bin\ne:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\Library\\bin\ne:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\Scripts\ne:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\bin\nE:\\toolkits.win\\cuda-9.0.176\\bin\nE:\\toolkits.win\\cuda-9.0.176\\libnvvp\ne:\\toolkits.win\\anaconda3-5.2.0\ne:\\toolkits.win\\anaconda3-5.2.0\\Scripts\ne:\\toolkits.win\\anaconda3-5.2.0\\Library\\bin\nC:\\ProgramData\\Oracle\\Java\\javapath\nC:\\WINDOWS\\system32\nC:\\WINDOWS\nC:\\WINDOWS\\System32\\Wbem\nC:\\WINDOWS\\System32\\WindowsPowerShell\\v1.0\\\nC:\\Program Files (x86)\\NVIDIA Corporation\\PhysX\\Common\nC:\\Program Files (x86)\\Microsoft Visual Studio 14.0\\VC\\bin\nC:\\Program Files (x86)\\Windows Kits\\10\\Windows Performance Toolkit\\\nC:\\Program Files\\Git\\cmd\nC:\\Program Files\\Git\\mingw64\\bin\nC:\\Program Files\\Git\\usr\\bin\nC:\\WINDOWS\\System32\\OpenSSH\\\n...\n```\n\n> Note: To get a line-by-line display of the directories on your path (as shown above), enter this incantation at a command prompt: `ECHO.%PATH:;= & ECHO.%`.\n\n### Quick-checking each main Python library install\n\nTo do a quick check of the installed backends, run the following:\n\n```\n(dlwin36) $ python -c \"import tensorflow; print('tensorflow: %s, %s' % (tensorflow.__version__, tensorflow.__file__))\"\ntensorflow: 1.8.0, e:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\lib\\site-packages\\tensorflow\\__init__.py\n(dlwin36) $ python -c \"import cntk; print('cntk: %s, %s' % (cntk.__version__, cntk.__file__))\"\ncntk: 2.5.1, 
e:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\lib\\site-packages\\cntk\\__init__.py\n(dlwin36) $ python -c \"import mxnet; print('mxnet: %s, %s' % (mxnet.__version__, mxnet.__file__))\"\nmxnet: 1.2.0, e:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\lib\\site-packages\\mxnet\\__init__.py\n(dlwin36) $ python -c \"import keras; print('keras: %s, %s' % (keras.__version__, keras.__file__))\"\nUsing TensorFlow backend.\nkeras: 2.1.6, e:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\lib\\site-packages\\keras\\__init__.py\n(dlwin36) $ python -c \"import torch; print('torch: %s, %s' % (torch.__version__, torch.__file__))\"\ntorch: 0.4.0, e:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\lib\\site-packages\\torch\\__init__.py\n```\n\n## GPU tests\n\n### Validating our GPU install with Keras\n\nWe can train a simple convnet ([convolutional neural network](https:\u002F\u002Fen.wikipedia.org\u002Fwiki\u002FConvolutional_neural_network)) on the [MNIST dataset](https:\u002F\u002Fen.wikipedia.org\u002Fwiki\u002FMNIST_database) by using one of the example scripts provided with Keras. The file is called `mnist_cnn.py` and can be found in Keras' `examples` folder, [here](https:\u002F\u002Fgithub.com\u002Fkeras-team\u002Fkeras\u002Fblob\u002Fmaster\u002Fexamples\u002Fmnist_cnn.py). 
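Two preprocessing steps in `mnist_cnn.py` are worth understanding before running it: the backend-dependent reshape driven by `K.image_data_format()`, and the conversion of integer labels to "binary class matrices" with `keras.utils.to_categorical`. Here is a minimal NumPy sketch of both (the `to_categorical` helper below is illustrative only, not the actual Keras implementation, and the label values are just the first few MNIST labels):

```python
import numpy as np

# Illustrative stand-ins for the arrays loaded in mnist_cnn.py:
# 60000 grayscale 28x28 images, integer labels in [0, 9].
x_train = np.zeros((60000, 28, 28), dtype=np.uint8)
y_train = np.array([5, 0, 4])  # first few MNIST labels, for illustration

# Backend-dependent reshape: TensorFlow defaults to 'channels_last',
# i.e. (batch, rows, cols, channels); Theano-style backends expect
# 'channels_first', i.e. (batch, channels, rows, cols).
x_channels_last = x_train.reshape(x_train.shape[0], 28, 28, 1)
x_channels_first = x_train.reshape(x_train.shape[0], 1, 28, 28)
print(x_channels_last.shape)   # (60000, 28, 28, 1)
print(x_channels_first.shape)  # (60000, 1, 28, 28)

def to_categorical(y, num_classes):
    """Minimal sketch of what keras.utils.to_categorical does:
    turn each integer label into a one-hot row."""
    out = np.zeros((len(y), num_classes), dtype=np.float32)
    out[np.arange(len(y)), y] = 1.0
    return out

print(to_categorical(y_train, 10)[0])  # label 5 -> 1.0 in column 5
```

This is why the script branches on `K.image_data_format()` rather than hard-coding one layout: the same code then runs under any of the backends configured via `KERAS_BACKEND`.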
The code is as follows:\n\n```python\n'''Trains a simple convnet on the MNIST dataset.\n\nGets to 99.25% test accuracy after 12 epochs\n(there is still a lot of margin for parameter tuning).\n16 seconds per epoch on a GRID K520 GPU.\n'''\n\nfrom __future__ import print_function\nimport keras\nfrom keras.datasets import mnist\nfrom keras.models import Sequential\nfrom keras.layers import Dense, Dropout, Flatten\nfrom keras.layers import Conv2D, MaxPooling2D\nfrom keras import backend as K\n\nbatch_size = 128\nnum_classes = 10\nepochs = 12\n\n# input image dimensions\nimg_rows, img_cols = 28, 28\n\n# the data, split between train and test sets\n(x_train, y_train), (x_test, y_test) = mnist.load_data()\n\nif K.image_data_format() == 'channels_first':\n    x_train = x_train.reshape(x_train.shape[0], 1, img_rows, img_cols)\n    x_test = x_test.reshape(x_test.shape[0], 1, img_rows, img_cols)\n    input_shape = (1, img_rows, img_cols)\nelse:\n    x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)\n    x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)\n    input_shape = (img_rows, img_cols, 1)\n\nx_train = x_train.astype('float32')\nx_test = x_test.astype('float32')\nx_train \u002F= 255\nx_test \u002F= 255\nprint('x_train shape:', x_train.shape)\nprint(x_train.shape[0], 'train samples')\nprint(x_test.shape[0], 'test samples')\n\n# convert class vectors to binary class matrices\ny_train = keras.utils.to_categorical(y_train, num_classes)\ny_test = keras.utils.to_categorical(y_test, num_classes)\n\nmodel = Sequential()\nmodel.add(Conv2D(32, kernel_size=(3, 3),\n                 activation='relu',\n                 input_shape=input_shape))\nmodel.add(Conv2D(64, (3, 3), activation='relu'))\nmodel.add(MaxPooling2D(pool_size=(2, 2)))\nmodel.add(Dropout(0.25))\nmodel.add(Flatten())\nmodel.add(Dense(128, activation='relu'))\nmodel.add(Dropout(0.5))\nmodel.add(Dense(num_classes, 
activation='softmax'))\n\nmodel.compile(loss=keras.losses.categorical_crossentropy,\n              optimizer=keras.optimizers.Adadelta(),\n              metrics=['accuracy'])\n\nmodel.fit(x_train, y_train,\n          batch_size=batch_size,\n          epochs=epochs,\n          verbose=1,\n          validation_data=(x_test, y_test))\nscore = model.evaluate(x_test, y_test, verbose=0)\nprint('Test loss:', score[0])\nprint('Test accuracy:', score[1])\n```\n\n### Keras with Tensorflow backend (GPU disabled)\n\nTo activate and test the Tensorflow backend in **CPU-only mode**, and get a good baseline to compare against, use the following commands:\n\n```\n(dlwin36) $ set KERAS_BACKEND=tensorflow\n(dlwin36) $ set CUDA_VISIBLE_DEVICES=-1\n(dlwin36) $ python mnist_cnn.py\nUsing TensorFlow backend.\nx_train shape: (60000, 28, 28, 1)\n60000 train samples\n10000 test samples\nTrain on 60000 samples, validate on 10000 samples\nEpoch 1\u002F12\n2018-06-15 11:59:57.047920: I T:\\src\\github\\tensorflow\\tensorflow\\core\\platform\\cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2\n2018-06-15 11:59:58.152643: E T:\\src\\github\\tensorflow\\tensorflow\\stream_executor\\cuda\\cuda_driver.cc:406] failed call to cuInit: CUDA_ERROR_NO_DEVICE\n2018-06-15 11:59:58.164753: I T:\\src\\github\\tensorflow\\tensorflow\\stream_executor\\cuda\\cuda_diagnostics.cc:158] retrieving CUDA diagnostic information for host: SERVERP\n2018-06-15 11:59:58.173767: I T:\\src\\github\\tensorflow\\tensorflow\\stream_executor\\cuda\\cuda_diagnostics.cc:165] hostname: SERVERP\n60000\u002F60000 [==============================] - 60s 997us\u002Fstep - loss: 0.2603 - acc: 0.9195 - val_loss: 0.0502 - val_acc: 0.9836\nEpoch 2\u002F12\n60000\u002F60000 [==============================] - 57s 952us\u002Fstep - loss: 0.0873 - acc: 0.9734 - val_loss: 0.0390 - val_acc: 0.9868\nEpoch 3\u002F12\n60000\u002F60000 [==============================] - 57s 
947us\u002Fstep - loss: 0.0657 - acc: 0.9803 - val_loss: 0.0346 - val_acc: 0.9888\nEpoch 4\u002F12\n60000\u002F60000 [==============================] - 57s 945us\u002Fstep - loss: 0.0543 - acc: 0.9842 - val_loss: 0.0348 - val_acc: 0.9886\nEpoch 5\u002F12\n60000\u002F60000 [==============================] - 56s 941us\u002Fstep - loss: 0.0470 - acc: 0.9862 - val_loss: 0.0354 - val_acc: 0.9878\nEpoch 6\u002F12\n60000\u002F60000 [==============================] - 56s 939us\u002Fstep - loss: 0.0410 - acc: 0.9871 - val_loss: 0.0290 - val_acc: 0.9905\nEpoch 7\u002F12\n60000\u002F60000 [==============================] - 56s 941us\u002Fstep - loss: 0.0369 - acc: 0.9888 - val_loss: 0.0290 - val_acc: 0.9901\nEpoch 8\u002F12\n60000\u002F60000 [==============================] - 58s 960us\u002Fstep - loss: 0.0337 - acc: 0.9892 - val_loss: 0.0261 - val_acc: 0.9916\nEpoch 9\u002F12\n60000\u002F60000 [==============================] - 57s 953us\u002Fstep - loss: 0.0313 - acc: 0.9904 - val_loss: 0.0291 - val_acc: 0.9906\nEpoch 10\u002F12\n60000\u002F60000 [==============================] - 57s 958us\u002Fstep - loss: 0.0286 - acc: 0.9913 - val_loss: 0.0317 - val_acc: 0.9889\nEpoch 11\u002F12\n60000\u002F60000 [==============================] - 58s 961us\u002Fstep - loss: 0.0269 - acc: 0.9915 - val_loss: 0.0290 - val_acc: 0.9914\nEpoch 12\u002F12\n60000\u002F60000 [==============================] - 59s 976us\u002Fstep - loss: 0.0270 - acc: 0.9915 - val_loss: 0.0304 - val_acc: 0.9916\nTest loss: 0.030398282517803726\nTest accuracy: 0.9916\n```\n\n> Note: If you've run the sequence of commands above, to restore CUDA's ability to detect the presence of your GPU(s), just set the environment variable `CUDA_VISIBLE_DEVICES` to the list of IDs of the installed GPU devices on your machine. In other words, if you have only one GPU, use `set CUDA_VISIBLE_DEVICES=0`. If you have two GPUs, use `set CUDA_VISIBLE_DEVICES=0,1`. 
And, so on.\n\n### Keras with Tensorflow backend (using GPU)\n\nTo activate and test the Tensorflow backend, use the following commands:\n\n```\n(dlwin36) $ set KERAS_BACKEND=tensorflow\n(dlwin36) $ python mnist_cnn.py\nUsing TensorFlow backend.\nx_train shape: (60000, 28, 28, 1)\n60000 train samples\n10000 test samples\nTrain on 60000 samples, validate on 10000 samples\nEpoch 1\u002F12\n2018-06-15 12:14:21.774082: I T:\\src\\github\\tensorflow\\tensorflow\\core\\platform\\cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2\n2018-06-15 12:14:22.219436: I T:\\src\\github\\tensorflow\\tensorflow\\core\\common_runtime\\gpu\\gpu_device.cc:1356] Found device 0 with properties:\nname: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.645\npciBusID: 0000:04:00.0\ntotalMemory: 11.00GiB freeMemory: 9.09GiB\n2018-06-15 12:14:22.345166: I T:\\src\\github\\tensorflow\\tensorflow\\core\\common_runtime\\gpu\\gpu_device.cc:1356] Found device 1 with properties:\nname: GeForce GTX TITAN X major: 5 minor: 2 memoryClockRate(GHz): 1.076\npciBusID: 0000:03:00.0\ntotalMemory: 12.00GiB freeMemory: 10.06GiB\n2018-06-15 12:14:22.360064: I T:\\src\\github\\tensorflow\\tensorflow\\core\\common_runtime\\gpu\\gpu_device.cc:1435] Adding visible gpu devices: 0, 1\n2018-06-15 12:14:23.731981: I T:\\src\\github\\tensorflow\\tensorflow\\core\\common_runtime\\gpu\\gpu_device.cc:923] Device interconnect StreamExecutor with strength 1 edge matrix:\n2018-06-15 12:14:23.741080: I T:\\src\\github\\tensorflow\\tensorflow\\core\\common_runtime\\gpu\\gpu_device.cc:929]      0 1\n2018-06-15 12:14:23.747608: I T:\\src\\github\\tensorflow\\tensorflow\\core\\common_runtime\\gpu\\gpu_device.cc:942] 0:   N N\n2018-06-15 12:14:23.753642: I T:\\src\\github\\tensorflow\\tensorflow\\core\\common_runtime\\gpu\\gpu_device.cc:942] 1:   N N\n2018-06-15 12:14:23.759825: I 
T:\\src\\github\\tensorflow\\tensorflow\\core\\common_runtime\\gpu\\gpu_device.cc:1053] Created TensorFlow device (\u002Fjob:localhost\u002Freplica:0\u002Ftask:0\u002Fdevice:GPU:0 with 8804 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:04:00.0, compute capability: 6.1)\n2018-06-15 12:14:24.168800: I T:\\src\\github\\tensorflow\\tensorflow\\core\\common_runtime\\gpu\\gpu_device.cc:1053] Created TensorFlow device (\u002Fjob:localhost\u002Freplica:0\u002Ftask:0\u002Fdevice:GPU:1 with 9737 MB memory) -> physical GPU (device: 1, name: GeForce GTX TITAN X, pci bus id: 0000:03:00.0, compute capability: 5.2)\n60000\u002F60000 [==============================] - 10s 161us\u002Fstep - loss: 0.2613 - acc: 0.9198 - val_loss: 0.0563 - val_acc: 0.9811\nEpoch 2\u002F12\n60000\u002F60000 [==============================] - 4s 71us\u002Fstep - loss: 0.0875 - acc: 0.9743 - val_loss: 0.0435 - val_acc: 0.9853\nEpoch 3\u002F12\n60000\u002F60000 [==============================] - 4s 71us\u002Fstep - loss: 0.0652 - acc: 0.9808 - val_loss: 0.0338 - val_acc: 0.9886\nEpoch 4\u002F12\n60000\u002F60000 [==============================] - 4s 71us\u002Fstep - loss: 0.0531 - acc: 0.9844 - val_loss: 0.0324 - val_acc: 0.9896\nEpoch 5\u002F12\n60000\u002F60000 [==============================] - 4s 71us\u002Fstep - loss: 0.0466 - acc: 0.9861 - val_loss: 0.0307 - val_acc: 0.9895\nEpoch 6\u002F12\n60000\u002F60000 [==============================] - 4s 71us\u002Fstep - loss: 0.0421 - acc: 0.9869 - val_loss: 0.0323 - val_acc: 0.9906\nEpoch 7\u002F12\n60000\u002F60000 [==============================] - 4s 71us\u002Fstep - loss: 0.0402 - acc: 0.9879 - val_loss: 0.0286 - val_acc: 0.9907\nEpoch 8\u002F12\n60000\u002F60000 [==============================] - 4s 71us\u002Fstep - loss: 0.0326 - acc: 0.9896 - val_loss: 0.0299 - val_acc: 0.9909\nEpoch 9\u002F12\n60000\u002F60000 [==============================] - 4s 71us\u002Fstep - loss: 0.0311 - acc: 0.9907 - val_loss: 0.0262 - 
val_acc: 0.9922\nEpoch 10\u002F12\n60000\u002F60000 [==============================] - 4s 71us\u002Fstep - loss: 0.0310 - acc: 0.9902 - val_loss: 0.0256 - val_acc: 0.9918\nEpoch 11\u002F12\n60000\u002F60000 [==============================] - 4s 71us\u002Fstep - loss: 0.0267 - acc: 0.9914 - val_loss: 0.0310 - val_acc: 0.9905\nEpoch 12\u002F12\n60000\u002F60000 [==============================] - 4s 71us\u002Fstep - loss: 0.0262 - acc: 0.9917 - val_loss: 0.0281 - val_acc: 0.9919\nTest loss: 0.028108230106867086\nTest accuracy: 0.9919\n```\n\nKeras with the tensorflow backend operating in GPU-accelerated mode is about **14.5 times faster** than in CPU mode (58\u002F4=14.5).\n\n### Keras with CNTK backend (using GPU)\n\nTo activate and test the CNTK backend, use the following commands:\n\n```\n(dlwin36) $ set KERAS_BACKEND=cntk\n(dlwin36) $ python mnist_cnn.py\nUsing CNTK backend\nSelected GPU[0] GeForce GTX 1080 Ti as the process wide default device.\nx_train shape: (60000, 28, 28, 1)\n60000 train samples\n10000 test samples\nTrain on 60000 samples, validate on 10000 samples\nEpoch 1\u002F12\n60000\u002F60000 [==============================] - 7s 110us\u002Fstep - loss: 0.2594 - acc: 0.9211 - val_loss: 0.0561 - val_acc: 0.9806\nEpoch 2\u002F12\n60000\u002F60000 [==============================] - 6s 93us\u002Fstep - loss: 0.0855 - acc: 0.9752 - val_loss: 0.0425 - val_acc: 0.9864\nEpoch 3\u002F12\n60000\u002F60000 [==============================] - 6s 93us\u002Fstep - loss: 0.0646 - acc: 0.9805 - val_loss: 0.0327 - val_acc: 0.9887\nEpoch 4\u002F12\n60000\u002F60000 [==============================] - 6s 93us\u002Fstep - loss: 0.0537 - acc: 0.9839 - val_loss: 0.0303 - val_acc: 0.9892\nEpoch 5\u002F12\n60000\u002F60000 [==============================] - 6s 94us\u002Fstep - loss: 0.0466 - acc: 0.9863 - val_loss: 0.0280 - val_acc: 0.9906\nEpoch 6\u002F12\n60000\u002F60000 [==============================] - 6s 93us\u002Fstep - loss: 0.0410 - acc: 0.9872 - val_loss: 0.0289 - 
val_acc: 0.9916\nEpoch 7\u002F12\n60000\u002F60000 [==============================] - 6s 93us\u002Fstep - loss: 0.0356 - acc: 0.9896 - val_loss: 0.0278 - val_acc: 0.9917\nEpoch 8\u002F12\n60000\u002F60000 [==============================] - 6s 94us\u002Fstep - loss: 0.0341 - acc: 0.9899 - val_loss: 0.0293 - val_acc: 0.9905\nEpoch 9\u002F12\n60000\u002F60000 [==============================] - 6s 94us\u002Fstep - loss: 0.0325 - acc: 0.9903 - val_loss: 0.0249 - val_acc: 0.9920\nEpoch 10\u002F12\n60000\u002F60000 [==============================] - 6s 94us\u002Fstep - loss: 0.0302 - acc: 0.9903 - val_loss: 0.0275 - val_acc: 0.9910\nEpoch 11\u002F12\n60000\u002F60000 [==============================] - 6s 94us\u002Fstep - loss: 0.0277 - acc: 0.9913 - val_loss: 0.0258 - val_acc: 0.9915\nEpoch 12\u002F12\n60000\u002F60000 [==============================] - 6s 94us\u002Fstep - loss: 0.0253 - acc: 0.9923 - val_loss: 0.0277 - val_acc: 0.9906\nTest loss: 0.027684621373889287\nTest accuracy: 0.9906\n```\n\nIn this specific experiment, CNTK in GPU mode is fast but not as fast as Tensorflow.\n\n### Keras with MXNet backend (using GPU)\n\nTo activate and test the MXNet backend, use the following command:\n\n```\n(dlwin36) $ set KERAS_BACKEND=mxnet\n```\n\nPlease note that, at the time of this writing, per [issue #106](https:\u002F\u002Fgithub.com\u002Fawslabs\u002Fkeras-apache-mxnet\u002Fissues\u002F106), it is not possible to use the same Keras code and expect it will run with MXNet on GPU yet. 
You will need to modify **ONE LINE** in the sample file `mnist_cnn.py` as shown here:\n\n```python\nmodel.compile(loss=keras.losses.categorical_crossentropy,\n              optimizer=keras.optimizers.Adadelta(),\n              metrics=['accuracy'])\n```\n\nshould be:\n\n```python\nmodel.compile(loss=keras.losses.categorical_crossentropy,\n              optimizer=keras.optimizers.Adadelta(),\n              metrics=['accuracy'],\n              context= [\"gpu(0)\"])\n```\n\nAlternatively, use the file [`mnist_cnn_mxnet.py`](mnist_cnn_mxnet.py) (it includes the change above) included in this repo, as follows:\n\n```\n(dlwin36) $ set KERAS_BACKEND=mxnet\n(dlwin36) $ python mnist_cnn_mxnet.py\nUsing MXNet backend\nx_train shape: (60000, 28, 28, 1)\n60000 train samples\n10000 test samples\ne:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\lib\\site-packages\\keras\\backend\\mxnet_backend.py:89: UserWarning: MXNet Backend performs best with `channels_first` format. Using `channels_last` will significantly reduce performance due to the Transpose operations. For performance improvement, please use this API`keras.utils.to_channels_first(x_input)`to transform `channels_last` data to `channels_first` format and also please change the `image_data_format` in `keras.json` to `channels_first`.Note: `x_input` is a Numpy tensor or a list of Numpy tensorRefer to: https:\u002F\u002Fgithub.com\u002Fawslabs\u002Fkeras-apache-mxnet\u002Ftree\u002Fmaster\u002Fdocs\u002Fmxnet_backend\u002Fperformance_guide.md\n  train_symbol = func(*args, **kwargs)\ne:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\lib\\site-packages\\keras\\backend\\mxnet_backend.py:92: UserWarning: MXNet Backend performs best with `channels_first` format. Using `channels_last` will significantly reduce performance due to the Transpose operations. 
For performance improvement, please use this API`keras.utils.to_channels_first(x_input)`to transform `channels_last` data to `channels_first` format and also please change the `image_data_format` in `keras.json` to `channels_first`.Note: `x_input` is a Numpy tensor or a list of Numpy tensorRefer to: https:\u002F\u002Fgithub.com\u002Fawslabs\u002Fkeras-apache-mxnet\u002Ftree\u002Fmaster\u002Fdocs\u002Fmxnet_backend\u002Fperformance_guide.md\n  test_symbol = func(*args, **kwargs)\nTrain on 60000 samples, validate on 10000 samples\nEpoch 1\u002F12\ne:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\lib\\site-packages\\mxnet\\module\\bucketing_module.py:408: UserWarning: Optimizer created manually outside Module but rescale_grad is not normalized to 1.0\u002Fbatch_size\u002Fnum_workers (1.0 vs. 0.0078125). Is this intended?\n  force_init=force_init)\n[04:55:20] c:\\jenkins\\workspace\\mxnet-tag\\mxnet\\src\\operator\\nn\\cudnn\\.\u002Fcudnn_algoreg-inl.h:107: Running performance tests to find the best convolution algorithm, this can take a while... 
(setting env variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)\n60000\u002F60000 [==============================] - 12s 192us\u002Fstep - loss: 0.3480 - acc: 0.8934 - val_loss: 0.0817 - val_acc: 0.9743\nEpoch 2\u002F12\n60000\u002F60000 [==============================] - 7s 119us\u002Fstep - loss: 0.1177 - acc: 0.9660 - val_loss: 0.0524 - val_acc: 0.9828\nEpoch 3\u002F12\n60000\u002F60000 [==============================] - 7s 119us\u002Fstep - loss: 0.0859 - acc: 0.9750 - val_loss: 0.0432 - val_acc: 0.9857\nEpoch 4\u002F12\n60000\u002F60000 [==============================] - 7s 119us\u002Fstep - loss: 0.0704 - acc: 0.9792 - val_loss: 0.0363 - val_acc: 0.9882\nEpoch 5\u002F12\n60000\u002F60000 [==============================] - 7s 119us\u002Fstep - loss: 0.0608 - acc: 0.9817 - val_loss: 0.0344 - val_acc: 0.9884\nEpoch 6\u002F12\n60000\u002F60000 [==============================] - 7s 119us\u002Fstep - loss: 0.0561 - acc: 0.9839 - val_loss: 0.0328 - val_acc: 0.9889\nEpoch 7\u002F12\n60000\u002F60000 [==============================] - 7s 119us\u002Fstep - loss: 0.0503 - acc: 0.9853 - val_loss: 0.0322 - val_acc: 0.9890\nEpoch 8\u002F12\n60000\u002F60000 [==============================] - 7s 119us\u002Fstep - loss: 0.0473 - acc: 0.9860 - val_loss: 0.0290 - val_acc: 0.9905\nEpoch 9\u002F12\n60000\u002F60000 [==============================] - 7s 119us\u002Fstep - loss: 0.0440 - acc: 0.9870 - val_loss: 0.0304 - val_acc: 0.9899\nEpoch 10\u002F12\n60000\u002F60000 [==============================] - 7s 119us\u002Fstep - loss: 0.0413 - acc: 0.9877 - val_loss: 0.0280 - val_acc: 0.9906\nEpoch 11\u002F12\n60000\u002F60000 [==============================] - 7s 119us\u002Fstep - loss: 0.0388 - acc: 0.9888 - val_loss: 0.0281 - val_acc: 0.9913\nEpoch 12\u002F12\n60000\u002F60000 [==============================] - 7s 119us\u002Fstep - loss: 0.0382 - acc: 0.9883 - val_loss: 0.0285 - val_acc: 0.9904\nTest loss: 0.028510591367455346\nTest accuracy: 0.9904\n```\n\nFrom this single 
experiment, MXNet appears to be the slowest of the three Keras backends. If you are set on using MXNet, however, you may want to implement the changes in the warning above:\n\n```\ne:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\lib\\site-packages\\keras\\backend\\mxnet_backend.py:89: UserWarning: MXNet Backend performs best with `channels_first` format. Using `channels_last` will significantly reduce performance due to the Transpose operations. For performance improvement, please use this API`keras.utils.to_channels_first(x_input)`to transform `channels_last` data to `channels_first` format and also please change the `image_data_format` in `keras.json` to `channels_first`.Note: `x_input` is a Numpy tensor or a list of Numpy tensorRefer to: https:\u002F\u002Fgithub.com\u002Fawslabs\u002Fkeras-apache-mxnet\u002Ftree\u002Fmaster\u002Fdocs\u002Fmxnet_backend\u002Fperformance_guide.md\n  train_symbol = func(*args, **kwargs)\n```\n\nYou can use the following lines to effect those changes:\n\n```\n(dlwin36) $ %SystemDrive%\n(dlwin36) $ cd %USERPROFILE%\\.keras\n(dlwin36) $ cp keras.json keras.json.bak\n(dlwin36) $ (echo { & echo     \"image_data_format\": \"channels_first\", & echo     \"epsilon\": 1e-07, & echo     \"floatx\": \"float32\", & echo     \"backend\": \"mxnet\" & echo }) > keras_mxnet.json\n(dlwin36) $ (echo { & echo     \"image_data_format\": \"channels_last\", & echo     \"epsilon\": 1e-07, & echo     \"floatx\": \"float32\", & echo     \"backend\": \"tensorflow\" & echo }) > keras_tensorflow.json\n(dlwin36) $ (echo { & echo     \"image_data_format\": \"channels_last\", & echo     \"epsilon\": 1e-07, & echo     \"floatx\": \"float32\", & echo     \"backend\": \"cntk\" & echo }) > keras_cntk.json\n(dlwin36) $ cp -f keras_mxnet.json keras.json\n```\n\nNote 1: If you want to go back to TensorFlow or CNTK after this, all you have to do is copy the proper `json` file to `keras.json` (e.g., `cp -f keras_tensorflow.json keras.json` and set `KERAS_BACKEND` to 
the matching framework (e.g., `set KERAS_BACKEND=tensorflow`).\n\nNote 2: After switching to the `channels_first` channel ordering, I got the following results:\n\n```\n(dlwin36) $ python mnist_cnn_mxnet.py\nUsing MXNet backend\nx_train shape: (60000, 1, 28, 28)\n60000 train samples\n10000 test samples\nTrain on 60000 samples, validate on 10000 samples\nEpoch 1\u002F12\ne:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\lib\\site-packages\\mxnet\\module\\bucketing_module.py:408: UserWarning: Optimizer created manually outside Module but rescale_grad is not normalized to 1.0\u002Fbatch_size\u002Fnum_workers (1.0 vs. 0.0078125). Is this intended?\n  force_init=force_init)\n[05:39:39] c:\\jenkins\\workspace\\mxnet-tag\\mxnet\\src\\operator\\nn\\cudnn\\.\u002Fcudnn_algoreg-inl.h:107: Running performance tests to find the best convolution algorithm, this can take a while... (setting env variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)\n60000\u002F60000 [==============================] - 9s 152us\u002Fstep - loss: 0.3485 - acc: 0.8923 - val_loss: 0.0851 - val_acc: 0.9732\nEpoch 2\u002F12\n60000\u002F60000 [==============================] - 7s 109us\u002Fstep - loss: 0.1191 - acc: 0.9652 - val_loss: 0.0529 - val_acc: 0.9824\nEpoch 3\u002F12\n60000\u002F60000 [==============================] - 7s 109us\u002Fstep - loss: 0.0874 - acc: 0.9741 - val_loss: 0.0435 - val_acc: 0.9865\nEpoch 4\u002F12\n60000\u002F60000 [==============================] - 7s 109us\u002Fstep - loss: 0.0740 - acc: 0.9784 - val_loss: 0.0402 - val_acc: 0.9867\nEpoch 5\u002F12\n60000\u002F60000 [==============================] - 7s 109us\u002Fstep - loss: 0.0642 - acc: 0.9809 - val_loss: 0.0328 - val_acc: 0.9884\nEpoch 6\u002F12\n60000\u002F60000 [==============================] - 7s 109us\u002Fstep - loss: 0.0585 - acc: 0.9826 - val_loss: 0.0346 - val_acc: 0.9897\nEpoch 7\u002F12\n60000\u002F60000 [==============================] - 7s 109us\u002Fstep - loss: 0.0534 - acc: 0.9843 - val_loss: 0.0315 
- val_acc: 0.9889\nEpoch 8\u002F12\n60000\u002F60000 [==============================] - 7s 109us\u002Fstep - loss: 0.0491 - acc: 0.9852 - val_loss: 0.0336 - val_acc: 0.9888\nEpoch 9\u002F12\n60000\u002F60000 [==============================] - 7s 109us\u002Fstep - loss: 0.0441 - acc: 0.9865 - val_loss: 0.0302 - val_acc: 0.9899\nEpoch 10\u002F12\n60000\u002F60000 [==============================] - 7s 109us\u002Fstep - loss: 0.0421 - acc: 0.9877 - val_loss: 0.0303 - val_acc: 0.9903\nEpoch 11\u002F12\n60000\u002F60000 [==============================] - 7s 109us\u002Fstep - loss: 0.0404 - acc: 0.9878 - val_loss: 0.0294 - val_acc: 0.9903\nEpoch 12\u002F12\n60000\u002F60000 [==============================] - 7s 109us\u002Fstep - loss: 0.0381 - acc: 0.9889 - val_loss: 0.0272 - val_acc: 0.9904\nTest loss: 0.027214839413274603\nTest accuracy: 0.9904\n```\n\nThis is a bit faster, but not as fast as Keras with a CNTK or Tensorflow backend.\n\n### Validating our GPU install with PyTorch\n\nHere too, we can train a convnet on the MNIST dataset with a similar network as the one used in the Keras case by modifying a sample from PyTorch's `examples` [folder](https:\u002F\u002Fgithub.com\u002Fpytorch\u002Fexamples\u002Fblob\u002Fmaster\u002Fmnist\u002Fmain.py). 
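Before walking through the modified sample, a quick sanity check helps confirm that the CUDA build of PyTorch actually sees the GPU. This is a minimal sketch using standard `torch.cuda` calls; it is not part of the original sample:

```python
import torch

# Confirm this is a CUDA-enabled PyTorch build and that a GPU is visible.
print("PyTorch version:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    # Report each visible device by name (e.g., a Titan X or GTX 1080 Ti).
    for i in range(torch.cuda.device_count()):
        print("Device {}: {}".format(i, torch.cuda.get_device_name(i)))
```

If `torch.cuda.is_available()` prints `False`, revisit the CUDA, cuDNN, and driver installation steps above before troubleshooting the training script itself.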
The new code is as follows:\n\n```python\nfrom __future__ import print_function\nimport sys, argparse\nfrom time import time\nimport torch\nimport torch.nn as nn\nimport torch.nn.functional as F\nimport torch.optim as optim\nfrom torchvision import datasets, transforms\n\ntracker_length = 30\n\nclass Net(nn.Module):\n    def __init__(self):\n        super(Net, self).__init__()\n        self.conv1 = nn.Conv2d(1, 32, kernel_size=3)\n        self.conv2 = nn.Conv2d(32, 64, kernel_size=3)\n        self.fc1 = nn.Linear(12*12*64, 128)\n        self.fc2 = nn.Linear(128, 10)\n\n    def forward(self, x):\n        x = F.relu(self.conv1(x))      # 28x28x32 -> 26x26x32\n        x = F.relu(self.conv2(x))      # 26x26x32 -> 24x24x64\n        x = F.max_pool2d(x, 2) # 24x24x64 -> 12x12x64\n        x = F.dropout(x, p=0.25, training=self.training)\n        x = x.view(-1, 12*12*64)       # flatten 12x12x64 = 9216\n        x = F.relu(self.fc1(x))        # fc 9216 -> 128\n        x = F.dropout(x, p=0.5, training=self.training)\n        x = self.fc2(x)                # fc 128 -> 10\n        return F.log_softmax(x, dim=1) # to 10 logits\n\ndef train(args, model, device, train_loader, optimizer):\n    model.train()\n    start_time = time()\n\n    for batch_idx, (data, target) in enumerate(train_loader):\n        data, target = data.to(device), target.to(device)\n        optimizer.zero_grad()\n        output = model(data)\n        loss = F.nll_loss(output, target)\n        loss.backward()\n        optimizer.step()\n        if batch_idx % args.log_interval == 0:\n            percentage = 100. 
* batch_idx / len(train_loader)
            cur_length = int((tracker_length * int(percentage)) / 100)
            bar = '=' * cur_length + '>' + '-' * (tracker_length - cur_length)
            sys.stdout.write('\r{}/{} [{}] - loss: {:.4f}'.format(
                batch_idx * len(data), len(train_loader.dataset),
                bar, loss.item()))
            sys.stdout.flush()

    train_time = time() - start_time
    sys.stdout.write('\r{}/{} [{}] - {:.1f}s {:.1f}us/step - loss: {:.4f}'.format(
        len(train_loader.dataset), len(train_loader.dataset), '=' * tracker_length,
        train_time, (train_time / len(train_loader.dataset)) * 1000000.0, loss.item()))
    sys.stdout.flush()

    return len(train_loader.dataset), train_time, loss.item()

def test(args, model, device, test_loader):
    model.eval()
    test_loss = 0
    correct = 0

    with torch.no_grad():
        for data, target in test_loader:
            data, target = data.to(device), target.to(device)
            output = model(data)
            test_loss += F.nll_loss(output, target, size_average=False).item() # sum up batch loss
            pred = output.max(1, keepdim=True)[1] # get the index of the max log-probability
            correct += pred.eq(target.view_as(pred)).sum().item()

    test_loss /= len(test_loader.dataset)
    test_accuracy = correct / len(test_loader.dataset)

    return test_loss, test_accuracy

def main():
    # Training settings
    parser = argparse.ArgumentParser(description='PyTorch MNIST Example')
    parser.add_argument('--batch-size', type=int, default=64, metavar='N',
                        help='input batch size for training (default: 64)')
    parser.add_argument('--test-batch-size', type=int, default=1000, metavar='N',
                        help='input batch size for testing (default: 1000)')
    parser.add_argument('--epochs', type=int, default=12, metavar='N',
                        help='number of epochs to train (default: 12)')
    parser.add_argument('--lr', type=float, default=0.01, metavar='LR',
                        help='learning rate (default: 0.01)')
    parser.add_argument('--momentum', type=float, default=0.5, metavar='M',
                        help='SGD momentum (default: 0.5)')
    parser.add_argument('--no-cuda', action='store_true', default=False,
                        help='disables CUDA training')
    parser.add_argument('--seed', type=int, default=1, metavar='S',
                        help='random seed (default: 1)')
    parser.add_argument('--log-interval', type=int, default=10, metavar='N',
                        help='how many batches to wait before logging training status')
    args = parser.parse_args()
    use_cuda = not args.no_cuda and torch.cuda.is_available()

    torch.manual_seed(args.seed)

    device = torch.device("cuda" if use_cuda else "cpu")

    kwargs = {'num_workers': 1, 'pin_memory': True} if use_cuda else {}
    train_loader = torch.utils.data.DataLoader(
        datasets.MNIST('../data', train=True, download=True,
                       transform=transforms.Compose([
                           transforms.ToTensor(),
                           transforms.Normalize((0.1307,), (0.3081,))
                       ])),
        batch_size=args.batch_size, shuffle=True, **kwargs)
    test_loader = torch.utils.data.DataLoader(
        datasets.MNIST('../data', train=False, transform=transforms.Compose([
                           transforms.ToTensor(),
                           transforms.Normalize((0.1307,), (0.3081,))
                       ])),
        batch_size=args.test_batch_size, shuffle=True, **kwargs)

    model = Net().to(device)
    optimizer = optim.SGD(model.parameters(), lr=args.lr, momentum=args.momentum)

    for epoch in range(1, args.epochs + 1):
        print("\nEpoch {}/{}".format(epoch, args.epochs))
       
 train_len, train_time, train_loss = train(args, model, device, train_loader, optimizer)\n        test_loss, test_accuracy = test(args, model, device, test_loader)\n        sys.stdout.write('\\r{}\u002F{} [{}] - {:.1f}s {:.1f}us\u002Fstep - loss: {:.4f} - val_loss: {:.4f} - val_acc: {:.4f}'.format(\n            train_len, train_len, '=' * tracker_length, \n            train_time, (train_time \u002F train_len) * 1000000.0, train_loss,\n            test_loss, test_accuracy))\n        sys.stdout.flush()\n\n\nif __name__ == '__main__':\n    main()\n```\n\nWe include the modified version of this sample in our repo under the name [`mnist_cnn_pytorch.py`](mnist_cnn_pytorch.py). You can run it as follows:\n\n```\n(dlwin36) $ python mnist_cnn_pytorch.py\nEpoch 1\u002F12\n60000\u002F60000 [==============================] - 7.1s 118.6us\u002Fstep - loss: 0.2592 - val_loss: 0.1883 - val_acc: 0.9438\nEpoch 2\u002F12\n60000\u002F60000 [==============================] - 6.1s 102.0us\u002Fstep - loss: 0.1917 - val_loss: 0.1412 - val_acc: 0.9575\nEpoch 3\u002F12\n60000\u002F60000 [==============================] - 6.1s 101.5us\u002Fstep - loss: 0.2335 - val_loss: 0.1074 - val_acc: 0.9679\nEpoch 4\u002F12\n60000\u002F60000 [==============================] - 6.1s 101.2us\u002Fstep - loss: 0.2038 - val_loss: 0.0828 - val_acc: 0.9741\nEpoch 5\u002F12\n60000\u002F60000 [==============================] - 6.1s 101.8us\u002Fstep - loss: 0.1733 - val_loss: 0.0676 - val_acc: 0.9783\nEpoch 6\u002F12\n60000\u002F60000 [==============================] - 6.1s 101.2us\u002Fstep - loss: 0.0952 - val_loss: 0.0587 - val_acc: 0.9810\nEpoch 7\u002F12\n60000\u002F60000 [==============================] - 6.1s 101.8us\u002Fstep - loss: 0.0521 - val_loss: 0.0527 - val_acc: 0.9832\nEpoch 8\u002F12\n60000\u002F60000 [==============================] - 6.1s 101.5us\u002Fstep - loss: 0.0993 - val_loss: 0.0484 - val_acc: 0.9834\nEpoch 9\u002F12\n60000\u002F60000 [==============================] - 6.0s 
100.3us/step - loss: 0.2031 - val_loss: 0.0449 - val_acc: 0.9853
Epoch 10/12
60000/60000 [==============================] - 6.0s 100.0us/step - loss: 0.2267 - val_loss: 0.0429 - val_acc: 0.9868
Epoch 11/12
60000/60000 [==============================] - 6.1s 100.9us/step - loss: 0.0819 - val_loss: 0.0426 - val_acc: 0.9857
Epoch 12/12
60000/60000 [==============================] - 6.0s 100.7us/step - loss: 0.0312 - val_loss: 0.0370 - val_acc: 0.9872
```

As expected, the network's training performance using PyTorch is on par with that of the other frameworks.

# Suggested viewing and reading

Deep Learning with Keras - Python, by The SemiColon:

@ https://www.youtube.com/playlist?list=PLVBorYCcu-xX3Ppjb_sqBd_Xf6GqagQyl

Deep Learning with Python, by François Chollet:

@ https://www.manning.com/books/deep-learning-with-python

# About the Author

For information about the author, please visit:

[![https://www.linkedin.com/in/philferriere](https://oss.gittoolsai.com/images/philferriere_dlwin_readme_a5d35e87943b.png)](https://www.linkedin.com/in/philferriere)

GPU-accelerated Deep Learning on Windows 10 native (Keras/Tensorflow/CNTK/MXNet and PyTorch)
===============================================================================

**>> Last updated in June 2018 <<**

**What's new in this latest update:**
- **support for 5 frameworks (Keras/Tensorflow/CNTK/MXNet and PyTorch),**
- **support for 3 GPU-accelerated Keras backends (CNTK, Tensorflow, or MXNet),**
- **no separate MinGW installation needed,**
- **newer versions of many of the Python libraries.**

There are certainly plenty of guides out there for building a powerful deep learning (DL) setup on Linux or Mac OS (including Tensorflow which, regrettably, as of this writing, still cannot be installed easily on Windows), but few care about building an efficient Windows 10-**native** setup. Most focus on running an Ubuntu virtual machine (VM) hosted on Windows or using Docker, unnecessary and ultimately sub-optimal steps.

We also found enough misleading/outdated information out there to make it worthwhile to put together a step-by-step guide for the latest stable versions of Keras, Tensorflow, CNTK, MXNet, and PyTorch. Whether used in combination (e.g., Keras with a Tensorflow backend) or on their own (PyTorch cannot be used as a Keras backend; TensorFlow can be used standalone), they are among the most powerful deep learning Python libraries that run natively on Windows.

If you **must** run your DL setup on Windows 10, then the information contained here will hopefully be useful to you.

The older installation instructions from [July 2017](README_July2017.md), [May 2017](README_May2017.md), and [January 2017](README_Jan2017.md) are still available. They allow you to use Theano as a Keras backend.

# TOC

- [Dependencies](#dependencies)
- [Hardware](#hardware)
- [Installation steps](#installation-steps)
  * [Toolkits](#toolkits)
    + [Visual Studio 2015 Community Edition Update 3 w. Windows Kit 10.0.10240.0](#visual-studio-2015-community-edition-update-3-w-windows-kit-100102400)
    + [Anaconda 5.2.0 (64-bit) (Python 3.6 TF support / Python 2.7 no TF support)](#anaconda-520-64-bit-python-36-tf-support-python-27-no-tf-support)
      - [Creating the `dlwin36` conda environment](#create-a-dlwin36-conda-environment)
      - [Optional but highly recommended image processing libraries](#optional-but-highly-recommended-image-processing-libraries)
    + [CUDA 9.0.176 (64-bit)](#cuda-90176-64-bit)
    + [cuDNN v7.0.4 (Nov 13, 2017) for CUDA 9.0](#cudnn-v704-nov-13-2017-for-cuda-90)
  * [Deep learning python libraries](#deep-learning-python-libraries)
    + [Installing `keras` 2.1.6](#installing-keras-216)
    + [Installing `tensorflow-gpu` 1.8.0 (solo, or as a Keras backend)](#installing-tensorflow-gpu-180-solo-or-as-a-keras-backend)
    + [Installing `cntk-gpu` 2.5.1 (solo, or as a Keras backend)](#installing-cntk-gpu-251-solo-or-as-a-keras-backend)
    + [Installing `mxnet-cu90` 1.2.0 (solo, or as a Keras backend)](#installing-mxnet-cu90-120-solo-or-as-a-keras-backend)
    + [Installing `pytorch` 0.4.0](#installing-pytorch-040)
  * [Quick checks](#quick-checks)
    + [Checking the list of Python libraries installed](#checking-the-list-of-python-libraries-installed)
    + [Checking our PATH sysenv var](#checking-our-path-sysenv-var)
    + [Quick-checking each main Python library install](#quick-checking-each-main-python-library-install)
  * [GPU tests](#gpu-tests)
    + [Validating our GPU install with Keras](#validating-our-gpu-install-with-keras)
    + [Keras with Tensorflow backend (GPU disabled)](#keras-with-tensorflow-backend-gpu-disabled)
    + [Keras with Tensorflow backend (using GPU)](#keras-with-tensorflow-backend-using-gpu)
    + [Keras with CNTK backend (using 
GPU)](#keras-with-cntk-backend-using-gpu)
    + [Keras with MXNet backend (using GPU)](#keras-with-mxnet-backend-using-gpu)
    + [Validating our GPU install with PyTorch](#validating-our-gpu-install-with-pytorch)
- [Suggested viewing and reading](#suggested-viewing-and-reading)
- [About the author](#about-the-author)

<small><i><a href='http://ecotrust-canada.github.io/markdown-toc/'>Table of contents generated with markdown-toc</a></i></small>

# Dependencies

Here's a summary list of the tools and libraries we use for deep learning on Windows 10 (Version 1709 OS Build 16299.371):

1. Visual Studio 2015 Community Edition Update 3 with Windows Kit 10.0.10240.0
   - Used for its C/C++ compiler (not its IDE) and software development kit (SDK). This specific version was chosen because of [Windows compiler support in CUDA](http://docs.nvidia.com/cuda/cuda-installation-guide-microsoft-windows/index.html#system-requirements).
2. Anaconda (64-bit) with Python 3.6 (Anaconda3-5.2.0) [for Tensorflow support] or Python 2.7 (Anaconda2-5.2.0) [no Tensorflow support] with MKL 2018.0.3
   - A Python distribution that provides NumPy, SciPy, and other scientific libraries
   - MKL is used for its CPU-optimized implementations of many linear algebra operations
3. CUDA 9.0.176 (64-bit)
   - Used for its GPU math libraries, card driver, and CUDA compiler
4. cuDNN v7.0.4 (Nov 13, 2017) for CUDA 9.0.176
   - Used to run vastly faster convolutional neural networks
5. Keras 2.1.6 with three different backends: Tensorflow-gpu 1.8.0, CNTK-gpu 2.5.1, and MXNet-cuda90 1.2.0
   - Keras is used for deep learning on top of Tensorflow or CNTK
   - Tensorflow and CNTK are backends used to evaluate mathematical expressions on multi-dimensional arrays
   - Theano is a legacy backend no longer in active development
6. PyTorch v0.4.0

# Hardware

1. Dell Precision T7900, 64GB RAM
   - Intel Xeon E5-2630 v4 @ 2.20 GHz (1 processor, 10 cores total, 20 logical processors)
2. NVIDIA GeForce Titan X, 12GB RAM
   - Driver version: 390.77 / Win 10 64
3. NVIDIA GeForce GTX 1080 Ti, 11GB RAM
   - Driver version: 390.77 / Win 10 64

# Installation steps

We like to keep our toolkits and libraries in a single root folder boldly named `e:\toolkits.win`, so whenever you see a Windows path starting with `e:\toolkits.win` below, make sure to replace it with whatever you decide your own toolkit drive and folder should be.

## Toolkits

### Visual Studio 2015 Community Edition Update 3 w. Windows Kit 10.0.10240.0

Download [Visual Studio Community 2015 with Update 3 (x86)](https://www.visualstudio.com/vs/older-downloads). It is used by the CUDA Toolkit.
> Note that downloading requires a free [Visual Studio Dev Essentials](https://www.visualstudio.com/dev-essentials/) license or a full Visual Studio subscription.

Run the downloaded exe to install Visual Studio, using whatever additional configuration settings work best for you:

![](https://oss.gittoolsai.com/images/philferriere_dlwin_readme_e92c161f156c.png)

![](https://oss.gittoolsai.com/images/philferriere_dlwin_readme_b61b0949c4bc.png)

![](https://oss.gittoolsai.com/images/philferriere_dlwin_readme_67e97acc9459.png)

![](https://oss.gittoolsai.com/images/philferriere_dlwin_readme_a833c6ffd09a.png)

1. Add `C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\bin` to your `PATH` environment variable, based on where you installed VS 2015.
2. Define a system environment variable (sysenv) `INCLUDE` with the value `C:\Program Files (x86)\Windows Kits\10\Include\10.0.10240.0\ucrt`
3. Define a system environment variable (sysenv) `LIB` with the value `C:\Program Files (x86)\Windows Kits\10\Lib\10.0.10240.0\um\x64;C:\Program Files (x86)\Windows Kits\10\Lib\10.0.10240.0\ucrt\x64`

> Reference note: We couldn't run any Theano Python files until we added the last two sysenv variables above. We would get a `c:\program files (x86)\microsoft visual studio 14.0\vc\include\crtdefs.h(10): fatal error C1083: Cannot open include file: 'corecrt.h': No such file or directory` error at compile time and missing `kernel32.lib uuid.lib ucrt.lib` errors at link time. True, you could probably run `C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\bin\amd64\vcvars64.bat` (with the proper arguments) every time you open a MINGW command prompt, but obviously those system environment variables won't persist from one session to the next.

### Anaconda 5.2.0 (64-bit) (Python 3.6 with TensorFlow (TF) support / Python 2.7 without TensorFlow (TF) support)

This tutorial was initially created with Python 2.7. As Tensorflow has become the backend of choice for Keras, we've decided to document the installation steps using Python 3.6 by default. Depending on your preferred configuration, use `e:\toolkits.win\anaconda3-5.2.0` or `e:\toolkits.win\anaconda2-5.2.0` as the folder where Anaconda is installed.

Download the Python 3.6 Anaconda version from [here](https://repo.continuum.io/archive/Anaconda3-5.2.0-Windows-x86_64.exe) and the Python 2.7 version from [there](https://repo.continuum.io/archive/Anaconda2-5.2.0-Windows-x86_64.exe):

[![](https://oss.gittoolsai.com/images/philferriere_dlwin_readme_c9e2ed22ebb7.png)](https://repo.continuum.io/archive/)

Run the downloaded exe to install Anaconda:

![](https://oss.gittoolsai.com/images/philferriere_dlwin_readme_82409e2bc139.png)
![](https://oss.gittoolsai.com/images/philferriere_dlwin_readme_0520e7cf270a.png)

> Warning: Below, we enabled the second of the "Advanced Options" because it works for us, but it may not be the best option for you!

![](https://oss.gittoolsai.com/images/philferriere_dlwin_readme_e528a5db5614.png)

Define the following variable and update PATH as shown:

1. Define a system environment variable (sysenv) `PYTHON_HOME` with the value `e:\toolkits.win\anaconda3-5.2.0`
2. 
Add `%PYTHON_HOME%`, `%PYTHON_HOME%\Scripts`, and `%PYTHON_HOME%\Library\bin` to `PATH`

#### Creating the `dlwin36` conda environment

After the Anaconda installation, open a Windows command prompt and execute:

```
$ conda create --yes -n dlwin36 numpy scipy mkl-service m2w64-toolchain libpython matplotlib pandas scikit-learn tqdm jupyter h5py cython
```

Here's the [output log](installed_files/dlwin36_log.txt) for the command above.

Next, use `activate dlwin36` to activate this new environment. By the way, if you already have an older `dlwin36` environment, you can delete it using `conda env remove -n dlwin36`.

#### Optional but highly recommended image processing libraries

Why install a CPU-optimized linear algebra library such as MKL if we intend to use the GPU? In our setup, it's true that most of the deep learning grunt work is performed by the GPU, but the CPU isn't idle. An important part of image-based Kaggle competitions is data augmentation. In this context, data augmentation is the process of manufacturing additional input samples (more training images) by transforming the original training samples, via the use of image processing operators. Basic transformations such as downsampling and (mean-centered) normalization are also needed. If you feel adventurous, you'll want to try additional pre-processing enhancements (noise removal, histogram equalization, etc.). You could certainly use the GPU for that purpose and save the results to file. In practice, however, those operations are often executed in parallel on the CPU while the GPU is busy learning the weights of the deep neural network, and the augmented data is discarded after use.

If your deep learning project is image-based, we also recommend installing the following libraries:

- `scikit-image`: an open-source image processing library for the Python programming language, including algorithms for segmentation, geometric transformations, color space manipulation, analysis, filtering, morphology, feature detection, and more. See [this page](http://scikit-image.org/) for more information.
- `opencv`: a library of programming functions mainly aimed at real-time computer vision. It has C++, Python, and Java interfaces and supports many operating system platforms, including Windows. See [this page](https://opencv.org/) for additional information.
- `imgaug`: a must for image-based Kaggle competitions, this Python library helps you augment images for machine learning projects by converting a set of input images into a new, much larger set of slightly altered images. See [this page](https://github.com/aleju/imgaug) for details.

To install those libraries, use the following commands:

```
$ activate dlwin36
(dlwin36) $ conda install --yes pillow scikit-image
(dlwin36) $ conda install --yes -c conda-forge opencv
(dlwin36) $ pip install git+https://github.com/aleju/imgaug
```

Here's the [output log](installed_files/dlwin36_imgproc_log.txt) for the commands above.

### CUDA 9.0.176 (64-bit)

Download CUDA 9.0.176 (64-bit) from the [NVIDIA website](https://developer.nvidia.com/cuda-90-download-archive).

Why not install CUDA 9.1? Simply put, as of this writing, Tensorflow 1.8 still uses CUDA 9.0 (see issue [#15140](https://github.com/tensorflow/tensorflow/issues/15140)).

Select the correct target platform:

![](https://oss.gittoolsai.com/images/philferriere_dlwin_readme_4c86d1650adc.png)

Download all the installers:

![](https://oss.gittoolsai.com/images/philferriere_dlwin_readme_5c4166537e45.png)

Run the downloaded installers one after the other. Install the files in `e:\toolkits.win\cuda-9.0.176`:

![](https://oss.gittoolsai.com/images/philferriere_dlwin_readme_00f0977f207f.png)

![](https://oss.gittoolsai.com/images/philferriere_dlwin_readme_f736aedbe835.png)

![](https://oss.gittoolsai.com/images/philferriere_dlwin_readme_3dcb54d98f7f.png)

![](https://oss.gittoolsai.com/images/philferriere_dlwin_readme_cdae4759cf47.png)

After completion, the installer should have created a system environment variable (sysenv) named `CUDA_PATH` and added `%CUDA_PATH%\bin` as well as `%CUDA_PATH%\libnvvp` to `PATH`. Check that this is indeed the case. If, for some reason, the CUDA environment variables are missing, then:

1. Define a system environment variable (sysenv) named `CUDA_PATH` with the value `e:\toolkits.win\cuda-9.0.176`
2. 
Add `%CUDA_PATH%\bin` and `%CUDA_PATH%\libnvvp` to `PATH`

### cuDNN v7.0.4 (Nov 13, 2017) for CUDA 9.0

Per NVIDIA's [website](https://developer.nvidia.com/cudnn), "cuDNN provides highly tuned implementations for standard routines such as forward and backward convolution, pooling, normalization, and activation layers," hallmarks of convolutional network architectures. Download cuDNN from [here](https://developer.nvidia.com/rdp/cudnn-download). Choose the cuDNN Library for Windows 10 that matches your CUDA version:

NVIDIA recently removed the option to download version 7.0.4 for Windows. You can download it at this [link](https://developer.nvidia.com/compute/machine-learning/cudnn/secure/v7.0.4/prod/9.0_20171031/cudnn-9.0-windows10-x64-v7).

![](https://oss.gittoolsai.com/images/philferriere_dlwin_readme_bccbb9a18d25.png)

The downloaded ZIP file contains three directories (`bin`, `include`, `lib`). Extract and copy their contents to the identically-named `bin`, `include`, and `lib` directories under `%CUDA_PATH%`.

## Deep learning python libraries

### Installing `keras` 2.1.6

Why not install the latest bleeding-edge/dev version of Keras and of the various backends (Tensorflow, CNTK, or Theano)? Simply put, because it makes [reproducible research](https://www.coursera.org/learn/reproducible-research) harder. If your work colleagues or Kaggle teammates install the latest code from the dev branch at a different time than you did, you will most likely be running different code bases on your machines, increasing the odds that even though you're using the same input data (the same random seeds, etc.), you still end up with different results when you shouldn't. For that reason alone, we strongly recommend using point releases only, the same point release across machines, and always documenting the versions used if you can't just use a setup script.

Install Keras as follows:

```
(dlwin36) $ pip install keras==2.1.6
Collecting keras==2.1.6
  Using cached https://files.pythonhosted.org/packages/54/e8/eaff7a09349ae9bd40d3ebaf028b49f5e2392c771f294910f75bb608b241/Keras-2.1.6-py2.py3-none-any.whl
Requirement already satisfied: numpy>=1.9.1 in e:\toolkits.win\anaconda3-5.2.0\envs\dlwin36\lib\site-packages (from keras==2.1.6) (1.14.5)
Requirement already satisfied: scipy>=0.14 in e:\toolkits.win\anaconda3-5.2.0\envs\dlwin36\lib\site-packages (from keras==2.1.6) (1.1.0)
Requirement already satisfied: h5py in e:\toolkits.win\anaconda3-5.2.0\envs\dlwin36\lib\site-packages (from keras==2.1.6) (2.8.0)
Requirement already satisfied: pyyaml in 
e:\toolkits.win\anaconda3-5.2.0\envs\dlwin36\lib\site-packages (from keras==2.1.6) (3.12)
Requirement already satisfied: six>=1.9.0 in e:\toolkits.win\anaconda3-5.2.0\envs\dlwin36\lib\site-packages (from keras==2.1.6) (1.11.0)
distributed 1.22.0 requires msgpack, which is not installed.
Installing collected packages: keras
Successfully installed keras-2.1.6
```

### Installing `tensorflow-gpu` 1.8.0 (solo, or as a Keras backend)

Run the following command to install Tensorflow:

```
$ pip install tensorflow-gpu==1.8.0
Collecting tensorflow-gpu==1.8.0
  Using cached https://files.pythonhosted.org/packages/42/a8/4c96a2b4f88f5d6dfd70313ebf38de1fe4d49ba9bf2ef34dc12dd198ab9a/tensorflow_gpu-1.8.0-cp36-cp36m-win_amd64.whl
Requirement already satisfied: six>=1.10.0 in e:\toolkits.win\anaconda3-5.2.0\envs\dlwin36\lib\site-packages (from tensorflow-gpu==1.8.0) (1.11.0)
Collecting grpcio>=1.8.6 (from tensorflow-gpu==1.8.0)
  Downloading https://files.pythonhosted.org/packages/5d/8b/104918993129d6c919a16826e6adcfa4a106c791da79fb9655c5b22ad9ff/grpcio-1.12.1-cp36-cp36m-win_amd64.whl (1.4MB)
    100% |████████████████████████████████| 1.4MB 6.6MB/s
Collecting gast>=0.2.0 (from tensorflow-gpu==1.8.0)
Collecting tensorboard<1.9.0,>=1.8.0 (from tensorflow-gpu==1.8.0)
  Using cached https://files.pythonhosted.org/packages/59/a6/0ae6092b7542cfedba6b2a1c9b8dceaf278238c39484f3ba03b03f07803c/tensorboard-1.8.0-py3-none-any.whl
Requirement already satisfied: wheel>=0.26 in e:\toolkits.win\anaconda3-5.2.0\envs\dlwin36\lib\site-packages (from tensorflow-gpu==1.8.0) (0.31.1)
Collecting termcolor>=1.1.0 (from tensorflow-gpu==1.8.0)
Requirement already satisfied: numpy>=1.13.3 in e:\toolkits.win\anaconda3-5.2.0\envs\dlwin36\lib\site-packages (from tensorflow-gpu==1.8.0) (1.14.5)
Collecting protobuf>=3.4.0 (from tensorflow-gpu==1.8.0)
  Downloading 
https:\u002F\u002Ffiles.pythonhosted.org\u002Fpackages\u002F75\u002F7a\u002F0dba607e50b97f6a89fa3f96e23bf56922fa59d748238b30507bfe361bbc\u002Fprotobuf-3.6.0-cp36-cp36m-win_amd64.whl (1.1MB)\n    100% |████████████████████████████████| 1.1MB 6.6MB\u002Fs\nCollecting absl-py>=0.1.6 (from tensorflow-gpu==1.8.0)\n  Downloading https:\u002F\u002Ffiles.pythonhosted.org\u002Fpackages\u002F57\u002F8d\u002F6664518f9b6ced0aa41cf50b989740909261d4c212557400c48e5cda0804\u002Fabsl-py-0.2.2.tar.gz (82kB)\n    100% |████████████████████████████████| 92kB 5.9MB\u002Fs\nCollecting astor>=0.6.0 (from tensorflow-gpu==1.8.0)\n  Using cached https:\u002F\u002Ffiles.pythonhosted.org\u002Fpackages\u002Fb2\u002F91\u002Fcc9805f1ff7b49f620136b3a7ca26f6a1be2ed424606804b0fbcf499f712\u002Fastor-0.6.2-py2.py3-none-any.whl\nCollecting html5lib==0.9999999 (from tensorboard\u003C1.9.0,>=1.8.0->tensorflow-gpu==1.8.0)\nCollecting werkzeug>=0.11.10 (from tensorboard\u003C1.9.0,>=1.8.0->tensorflow-gpu==1.8.0)\n  Using cached https:\u002F\u002Ffiles.pythonhosted.org\u002Fpackages\u002F20\u002Fc4\u002F12e3e56473e52375aa29c4764e70d1b8f3efa6682bef8d0aae04fe335243\u002FWerkzeug-0.14.1-py2.py3-none-any.whl\nCollecting bleach==1.5.0 (from tensorboard\u003C1.9.0,>=1.8.0->tensorflow-gpu==1.8.0)\n  Using cached https:\u002F\u002Ffiles.pythonhosted.org\u002Fpackages\u002F33\u002F70\u002F86c5fec937ea4964184d4d6c4f0b9551564f821e1c3575907639036d9b90\u002Fbleach-1.5.0-py2.py3-none-any.whl\nCollecting markdown>=2.6.8 (from tensorboard\u003C1.9.0,>=1.8.0->tensorflow-gpu==1.8.0)\n  Using cached https:\u002F\u002Ffiles.pythonhosted.org\u002Fpackages\u002F6d\u002F7d\u002F488b90f470b96531a3f5788cf12a93332f543dbab13c423a5e7ce96a0493\u002FMarkdown-2.6.11-py2.py3-none-any.whl\nRequirement already satisfied: setuptools in e:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\lib\\site-packages (from protobuf>=3.4.0->tensorflow-gpu==1.8.0) (39.2.0)\nBuilding wheels for collected packages: absl-py\n  Running setup.py bdist_wheel for 
absl-py ... done\n  Stored in directory: C:\\Users\\Phil\\AppData\\Local\\pip\\Cache\\wheels\\a0\\f8\\e9\\1933dbb3447ea6ef557062fd5461cb118deb8c2ed074e8344bf\nSuccessfully built absl-py\ndistributed 1.22.0 requires msgpack, which is not installed.\nInstalling collected packages: grpcio, gast, html5lib, werkzeug, bleach, markdown, protobuf, tensorboard, termcolor, absl-py, astor, tensorflow-gpu\n  Found existing installation: html5lib 1.0.1\n    Uninstalling html5lib-1.0.1:\n      Successfully uninstalled html5lib-1.0.1\n  Found existing installation: bleach 2.1.3\n    Uninstalling bleach-2.1.3:\n      Successfully uninstalled bleach-2.1.3\nSuccessfully installed absl-py-0.2.2 astor-0.6.2 bleach-1.5.0 gast-0.2.0 grpcio-1.12.1 html5lib-0.9999999 markdown-2.6.11 protobuf-3.6.0 tensorboard-1.8.0 tensorflow-gpu-1.8.0 termcolor-1.1.0 werkzeug-0.14.1\n```\n\n如果您希望 TensorFlow 成为默认的 Keras 后端，请定义一个名为 `KERAS_BACKEND` 的系统环境变量，并将其值设置为 `tensorflow`。\n\n### 安装 `cntk-gpu` 2.5.1（独立安装，或作为 Keras 后端）\n\n根据 [此链接](https:\u002F\u002Fdocs.microsoft.com\u002Fen-us\u002Fcognitive-toolkit\u002Fsetup-windows-python) 中的文档，按如下方式安装 CNTK（认知工具包）GPU：\n\n```\n(dlwin36) $ pip install https:\u002F\u002Fcntk.ai\u002FPythonWheel\u002FGPU\u002Fcntk_gpu-2.5.1-cp36-cp36m-win_amd64.whl\nCollecting cntk-gpu==2.5.1 from https:\u002F\u002Fcntk.ai\u002FPythonWheel\u002FGPU\u002Fcntk_gpu-2.5.1-cp36-cp36m-win_amd64.whl\n  Downloading https:\u002F\u002Fcntk.ai\u002FPythonWheel\u002FGPU\u002Fcntk_gpu-2.5.1-cp36-cp36m-win_amd64.whl (428.6MB)\n    100% |████████████████████████████████| 428.6MB 53kB\u002Fs\nRequirement already satisfied: scipy>=0.17 in e:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\lib\\site-packages (from cntk-gpu==2.5.1) (1.1.0)\nRequirement already satisfied: numpy>=1.11 in e:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\lib\\site-packages (from cntk-gpu==2.5.1) (1.14.5)\ndistributed 1.22.0 requires msgpack, which is not installed.\nInstalling collected packages: cntk-gpu\nSuccessfully 
installed cntk-gpu-2.5.1\n```\n\n如果您希望 CNTK 成为默认的 Keras 后端，请定义一个名为 `KERAS_BACKEND` 的系统环境变量，并将其值设置为 `cntk`。\n\n### 安装 `mxnet-cu90` 1.2.0（独立使用，或作为 Keras 后端）\n\nMXNet 是一个深度学习框架，得到了亚马逊（通过 AWS）的强力支持。它也在 Azure 上得到微软的支持。要安装它，请运行以下命令：\n\n```\n(dlwin36) $ pip install mxnet-cu90==1.2.0 keras-mxnet==2.1.6.1\nCollecting mxnet-cu90==1.2.0\n  Downloading https:\u002F\u002Ffiles.pythonhosted.org\u002Fpackages\u002F72\u002Fa8\u002F9226bd6913b7ba4657a218b9a252b60de98938dd41e8517a0b4ab4291203\u002Fmxnet_cu90-1.2.0-py2.py3-none-win_amd64.whl (457.0MB)\n    100% |████████████████████████████████| 457.0MB 47kB\u002Fs\nRequirement already satisfied: numpy in e:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\lib\\site-packages (from mxnet-cu90==1.2.0) (1.14.5)\nRequirement already satisfied: graphviz in e:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\lib\\site-packages (from mxnet-cu90==1.2.0) (0.8.3)\nCollecting keras-mxnet==2.1.6.1\n  Downloading https:\u002F\u002Ffiles.pythonhosted.org\u002Fpackages\u002F99\u002F93\u002F13ec18147fcef7c393e3fbf2d2c20171975be14e68d4c915b194be174ab6\u002Fkeras_mxnet-2.1.6.1-py2.py3-none-any.whl (388kB)\n    100% |████████████████████████████████| 389kB 3.3MB\u002Fs\nCollecting requests (from mxnet-cu90==1.2.0)\n  Downloading https:\u002F\u002Ffiles.pythonhosted.org\u002Fpackages\u002F65\u002F47\u002F7e02164a2a3db50ed6d8a6ab1d6d60b69c4c3fdf57a284257925dfc12bda\u002Frequests-2.19.1-py2.py3-none-any.whl (91kB)\n    100% |████████████████████████████████| 92kB 1.2MB\u002Fs\nCollecting urllib3\u003C1.24,>=1.21.1 (from requests->mxnet-cu90==1.2.0)\n  Downloading https:\u002F\u002Ffiles.pythonhosted.org\u002Fpackages\u002Fbd\u002Fc9\u002F6fdd990019071a4a32a5e7cb78a1d92c53851ef4f56f62a3486e6a7d8ffb\u002Furllib3-1.23-py2.py3-none-any.whl (133kB)\n    100% |████████████████████████████████| 143kB 2.2MB\u002Fs\nCollecting chardet\u003C3.1.0,>=3.0.2 (from requests->mxnet-cu90==1.2.0)\n  Downloading 
https:\u002F\u002Ffiles.pythonhosted.org\u002Fpackages\u002Fbc\u002Fa9\u002F01ffebfb562e4274b6487b4bb1ddec7ca55ec7510b22e4c51f14098443b8\u002Fchardet-3.0.4-py2.py3-none-any.whl (133kB)\n    100% |████████████████████████████████| 143kB 2.2MB\u002Fs\nRequirement already satisfied: certifi>=2017.4.17 in e:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\lib\\site-packages (from requests->mxnet-cu90==1.2.0) (2018.4.16)\nCollecting idna\u003C2.8,>=2.5 (from requests->mxnet-cu90==1.2.0)\n  Downloading https:\u002F\u002Ffiles.pythonhosted.org\u002Fpackages\u002F4b\u002F2a\u002F0276479a4b3caeb8a8c1af2f8e4355746a97fab05a372e4a2c6a6b876165\u002Fidna-2.7-py2.py3-none-any.whl (58kB)\n    100% |████████████████████████████████| 61kB 3.9MB\u002Fs\ndistributed 1.22.0 requires msgpack, which is not installed.\nInstalling collected packages: urllib3, chardet, idna, requests, mxnet-cu90\nSuccessfully installed chardet-3.0.4 idna-2.7 mxnet-cu90-1.2.0 requests-2.19.1 urllib3-1.23\n```\n\n如果您希望 MXNet 成为默认的 Keras 后端，请定义一个名为 `KERAS_BACKEND` 的系统环境变量，并将其值设置为 `mxnet`。\n\n### 安装 `pytorch` 0.4.0\n\nPyTorch 是 Facebook AI Research (FAIR) 针对 Google 的 TensorFlow 提出的解决方案。直到 v0.4.0 版本才正式支持 Windows (x64)。完整配置需要安装 `pytorch`、`cuda90` 和 `torchvision`，因此请首先运行以下命令：\n\n```\n(dlwin36) $ conda install --yes pytorch==0.4.0 cuda90 -c pytorch\nSolving environment: done\n\n## Package Plan ##\n\n  environment location: e:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\n\n  added \u002F updated specs:\n    - cuda90\n    - pytorch==0.4.0\n\n\nThe following packages will be downloaded:\n\n    package                    |            build\n    ---------------------------|-----------------\n    cuda90-1.0                 |                0           2 KB  pytorch\n    certifi-2018.4.16          |           py36_0         143 KB\n    pytorch-0.4.0              |py36_cuda90_cudnn7he774522_1       577.6 MB  pytorch\n    ------------------------------------------------------------\n                                           Total:       577.7 MB\n\nThe following NEW packages will be INSTALLED:\n\n    cffi:      1.11.5-py36h945400d_0\n    cuda90:    1.0-0                              pytorch\n    pycparser: 2.18-py36hd053e01_1\n    pytorch:   0.4.0-py36_cuda90_cudnn7he774522_1 pytorch     [cuda90]\n\nThe following packages will be UPDATED:\n\n    certifi:   2018.4.16-py36_0                   conda-forge --> 2018.4.16-py36_0\n\n\nDownloading and Extracting Packages\ncuda90-1.0           |    2 KB | ############################################################################## | 100%\ncertifi-2018.4.16    |  143 KB | ############################################################################## | 100%\npytorch-0.4.0        | 577.6 MB | ############################################################################# | 100%\nPreparing transaction: done\nVerifying transaction: done\nExecuting transaction: done\n```\n\n其次，使用此命令安装 `torchvision`：\n\n```\n(dlwin36) $ pip install torchvision==0.2.1\nCollecting torchvision==0.2.1\n  Using cached https:\u002F\u002Ffiles.pythonhosted.org\u002Fpackages\u002Fca\u002F0d\u002Ff00b2885711e08bd71242ebe7b96561e6f6d01fdb4b9dcf4d37e2e13c5e1\u002Ftorchvision-0.2.1-py2.py3-none-any.whl\nRequirement already satisfied: numpy in e:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\lib\\site-packages (from torchvision==0.2.1) (1.14.5)\nRequirement already satisfied: pillow>=4.1.1 in e:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\lib\\site-packages (from torchvision==0.2.1) (5.1.0)\nRequirement already satisfied: six in e:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\lib\\site-packages (from torchvision==0.2.1) (1.11.0)\nRequirement already satisfied: torch in e:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\lib\\site-packages (from torchvision==0.2.1) (0.4.0)\ndistributed 1.22.0 requires msgpack, which is not installed.\nInstalling collected packages: torchvision\nSuccessfully installed torchvision-0.2.1\n```\n\n如果在 Windows 上使用 PyTorch 遇到问题，我强烈建议阅读他们的 
[Windows FAQ](http:\u002F\u002Fpytorch.org\u002Fdocs\u002Fstable\u002Fnotes\u002Fwindows.html)。\n\n## 快速检查\n\n### 检查已安装的 Python 库列表\n\n在您的 `dlwin36` conda 环境中，您最终应该得到以下库列表：\n\n```\n(dlwin36) $ conda list\n# packages in environment at e:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36:\n#\n# Name                    Version                   Build  Channel\nabsl-py                   0.2.2                     \u003Cpip>\nastor                     0.6.2                     \u003Cpip>\nbackcall                  0.1.0                    py36_0  \nblas                      1.0                         mkl  \nbleach                    1.5.0                     \u003Cpip>\nbleach                    2.1.3                    py36_0  \nbokeh                     0.12.16                  py36_0  \nca-certificates           2018.4.16                     0    conda-forge\ncertifi                   2018.4.16                py36_0  \ncffi                      1.11.5           py36h945400d_0  \nchardet                   3.0.4                     \u003Cpip>\nclick                     6.7              py36hec8c647_0  \ncloudpickle               0.5.3                    py36_0  \ncntk-gpu                  2.5.1                     \u003Cpip>\ncolorama                  0.3.9            py36h029ae33_0  \ncuda90                    1.0                           0    pytorch\ncycler                    0.10.0           py36h009560c_0  \ncython                    0.28.3           py36hfa6e2cd_0  \ncytoolz                   0.9.0.1          py36hfa6e2cd_0  \ndask                      0.18.0                   py36_0  \ndask-core                 0.18.0                   py36_0  \ndecorator                 4.3.0                    py36_0  \ndistributed               1.22.0                   py36_0  \nentrypoints               0.2.3            py36hfd66bb0_2  \nfreetype                  2.8.1                    vc14_0  [vc14]  conda-forge\ngast                      0.2.0                     \u003Cpip>\ngraphviz                  0.8.3                     \u003Cpip>\ngrpcio                    1.12.1                    \u003Cpip>\nh5py                      2.8.0            py36h3bdd7fb_0  \nhdf5                      1.10.2                   vc14_0  [vc14]  conda-forge\nheapdict                  1.0.0                    py36_2  \nhtml5lib                  1.0.1            py36h047fa9f_0  \nhtml5lib                  0.9999999                 \u003Cpip>\nicc_rt                    2017.0.4             h97af966_0  \nicu                       58.2                     vc14_0  [vc14]  conda-forge\nidna                      2.7                       \u003Cpip>\nimageio                   2.3.0                    py36_0  \nimgaug                    0.2.5                     \u003Cpip>\nintel-openmp              2018.0.3                      0  \nipykernel                 4.8.2                    py36_0  \nipython                   6.4.0                    py36_0  \nipython_genutils          0.2.0            py36h3c5d0ee_0  \nipywidgets                7.2.1                    py36_0  \njedi                      0.12.0                   py36_1  \njinja2                    2.10             py36h292fed1_0  \njpeg                      9b                       vc14_2  [vc14]  conda-forge\njsonschema                2.6.0            py36h7636477_0  \njupyter                   1.0.0                    py36_4  \njupyter_client            5.2.3                    py36_0  \njupyter_console           5.2.0            py36h6d89b47_1  \njupyter_core              4.4.0            py36h56e9d50_0  \nKeras                     2.1.6                     \u003Cpip>\nkiwisolver                1.0.1            py36h12c3424_0  \nlibpng                    1.6.34                   vc14_0  [vc14]  conda-forge\nlibpython                 2.1                      py36_0  \nlibsodium                 1.0.16                   vc14_0  [vc14]  conda-forge\nlibtiff                   4.0.9                    vc14_0  [vc14]  conda-forge\nlibwebp                   0.5.2                    vc14_7  [vc14]  conda-forge\nlocket                    0.2.0            py36hfed976d_1  \nm2w64-binutils            2.25.1                        5  \nm2w64-bzip2               1.0.6                         6  \nm2w64-crt-git             5.0.0.4636.2595836               2  \nm2w64-gcc                 5.3.0                         6  \nm2w64-gcc-ada             5.3.0                         6  \nm2w64-gcc-fortran         5.3.0                         6  \nm2w64-gcc-libgfortran     5.3.0                         6  \nm2w64-gcc-libs            5.3.0                         7  \nm2w64-gcc-libs-core       5.3.0                         7  \nm2w64-gcc-objc            5.3.0                         6  \nm2w64-gmp                 6.1.0                         2  \nm2w64-headers-git         5.0.0.4636.c0ad18a               2  \nm2w64-isl                 0.16.1                        2  \nm2w64-libiconv            1.14                          6  \nm2w64-libmangle-git       5.0.0.4509.2e5a9a2               2  \nm2w64-libwinpthread-git   5.0.0.4634.697f757               2  \nm2w64-make                4.1.2351.a80a8b8               2  \nm2w64-mpc                 1.0.3                         3  \nm2w64-mpfr                3.1.4                         4  \nm2w64-pkg-config          0.29.1                        2  \nm2w64-toolchain           5.3.0                         7  \nm2w64-tools-git           5.0.0.4592.90b8472               2  \nm2w64-windows-default-manifest 6.4                           3  \nm2w64-winpthreads-git     5.0.0.4634.697f757               2  \nm2w64-zlib                1.2.8                        10  \nMarkdown                  2.6.11                    \u003Cpip>\nmarkupsafe                1.0              py36h0e26971_1  \nmatplotlib                2.2.2                    py36_1    conda-forge\nmistune                   0.8.3            py36hfa6e2cd_1  \nmkl                       2018.0.3                      1  \nmkl-service               1.1.2            py36h57e144c_4  \nmkl_fft                   1.0.1            py36h452e1ab_0  \nmkl_random                1.0.1            py36h9258bd6_0  \nmsgpack-python            0.5.6            py36he980bc4_0  \nmsys2-conda-epoch         20160418                      1  \nmxnet-cu90                1.2.0                     \u003Cpip>\nnbconvert                 5.3.1            py36h8dc0fde_0  \nnbformat                  4.4.0            py36h3a5bc1b_0  \nnetworkx                  2.1                      py36_0  \nnotebook                  5.5.0                    py36_0  \nnumpy                     1.14.5           py36h9fa60d3_0  \nnumpy-base                1.14.5           py36h5c71026_0  \nolefile                   0.45.1                   py36_0  \nopencv                    3.4.1                  py36_200    conda-forge\nopenssl                   1.0.2o                   vc14_0  [vc14]  conda-forge\npackaging                 17.1                     py36_0  \npandas                    0.23.1           py36h830ac7b_0  \npandoc                    1.19.2.1             hb2460c7_1  \npandocfilters             1.4.2            py36h3ef6317_1  \nparso                     0.2.1                    py36_0  \npartd                     0.3.8            py36hc8e763b_0  \npickleshare               0.7.4            py36h9de030f_0  \npillow                    5.1.0            py36h0738816_0  \npip                       10.0.1                   py36_0  \nprompt_toolkit            1.0.15           py36h60b8f86_0  \nprotobuf                  3.6.0                     \u003Cpip>\npsutil                    5.4.6            py36hfa6e2cd_0  \npycparser                 2.18             py36hd053e01_1  \npygments                  2.2.0            py36hb010967_0  \npyparsing                 2.2.0            py36h785a196_1  \npyqt                      5.6.0                    py36_2  \npython                    3.6.5                h0c2934d_0  \npython-dateutil           2.7.3                    py36_0  \npytz                      2018.4                   py36_0  \npytorch                   0.4.0           py36_cuda90_cudnn7he774522_1  [cuda90]  pytorch\npywavelets                0.5.2            py36hc649158_0  \npywinpty                  0.5.4                    py36_0  \npyyaml                    3.12             py36h1d1928f_1  \npyzmq                     17.0.0           py36hfa6e2cd_1  \nqt                        5.6.2                    vc14_1  [vc14]  conda-forge\nqtconsole                 4.3.1            py36h99a29a9_0  \nrequests                  2.19.1                    \u003Cpip>\nscikit-image              0.13.1           py36hfa6e2cd_1  \nscikit-learn              0.19.1           py36h53aea1b_0  \nscipy                     1.1.0            py36h672f292_0  \nsend2trash                1.5.0                    py36_0  \nsetuptools                39.2.0                   py36_0  \nsimplegeneric             0.8.1                    py36_2  \nsip                       4.19.8           py36h6538335_0  \nsix                       1.11.0           py36h4db2310_1  \nsortedcontainers          2.0.4                    py36_0  \nsqlite                    3.22.0                   vc14_0  [vc14]  conda-forge\ntblib                     1.3.2            py36h30f5020_0  \ntensorboard               1.8.0                     \u003Cpip>\ntensorflow-gpu            1.8.0                     \u003Cpip>\ntermcolor                 1.1.0                     \u003Cpip>\nterminado                 0.8.1                    py36_1  \ntestpath                  0.3.1            py36h2698cfe_0  \ntk                        8.6.7                    vc14_0  [vc14]  conda-forge\ntoolz                     0.9.0                    py36_0  \ntorchvision               0.2.1                     \u003Cpip>\ntornado                   5.0.2                    py36_0  \ntqdm                      4.23.4                   py36_0  \ntraitlets                 4.3.2            py36h096827d_0  \nurllib3                   1.23                      \u003Cpip>\nvc                        14                   h0510ff6_3  \nvs2015_runtime            14.0.25123                    3  \nwcwidth                   0.1.7            py36h3d5aa90_0  \nwebencodings              0.5.1            py36h67c50ae_1  \nWerkzeug                  0.14.1                    \u003Cpip>\nwheel                     0.31.1                   py36_0  \nwidgetsnbextension        3.2.1                    py36_0  \nwincertstore              0.2              py36h7fe50ca_0  \nwinpty                    0.4.3                         4  \nyaml                      0.1.7                    vc14_0  [vc14]  conda-forge\nzeromq                    4.2.5                    vc14_1  [vc14]  conda-forge\nzict                      0.1.3            py36h2d8e73e_0  \nzlib                      1.2.11                   vc14_0  [vc14]  conda-forge\n```\n\n### 检查我们的 PATH 系统环境变量\n\n此时，只要激活了 `dlwin36` conda 环境，PATH 环境变量应该类似于以下内容：\n\n```\ne:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\ne:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\Library\\mingw-w64\\bin\ne:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\Library\\usr\\bin\ne:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\Library\\bin\ne:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\Scripts\ne:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\bin\nE:\\toolkits.win\\cuda-9.0.176\\bin\nE:\\toolkits.win\\cuda-9.0.176\\libnvvp\ne:\\toolkits.win\\anaconda3-5.2.0\ne:\\toolkits.win\\anaconda3-5.2.0\\Scripts\ne:\\toolkits.win\\anaconda3-5.2.0\\Library\\bin\nC:\\ProgramData\\Oracle\\Java\\javapath\nC:\\WINDOWS\\system32\nC:\\WINDOWS\nC:\\WINDOWS\\System32\\Wbem\nC:\\WINDOWS\\System32\\WindowsPowerShell\\v1.0\\\nC:\\Program Files (x86)\\NVIDIA Corporation\\PhysX\\Common\nC:\\Program Files (x86)\\Microsoft Visual Studio 14.0\\VC\\bin\nC:\\Program Files (x86)\\Windows 
Kits\\10\\Windows Performance Toolkit\\\nC:\\Program Files\\Git\\cmd\nC:\\Program Files\\Git\\mingw64\\bin\nC:\\Program Files\\Git\\usr\\bin\nC:\\WINDOWS\\System32\\OpenSSH\\\n...\n```\n\n> 注意：若要逐行显示路径上的目录（如上所示），请在命令提示符中输入以下指令：`ECHO.%PATH:;= & ECHO.%`。\n\n### 快速检查每个主要 Python 库的安装情况\n\n要快速检查已安装的后端（backends），请运行以下命令：\n\n```\n(dlwin36) $ python -c \"import tensorflow; print('tensorflow: %s, %s' % (tensorflow.__version__, tensorflow.__file__))\"\ntensorflow: 1.8.0, e:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\lib\\site-packages\\tensorflow\\__init__.py\n(dlwin36) $ python -c \"import cntk; print('cntk: %s, %s' % (cntk.__version__, cntk.__file__))\"\ncntk: 2.5.1, e:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\lib\\site-packages\\cntk\\__init__.py\n(dlwin36) $ python -c \"import mxnet; print('mxnet: %s, %s' % (mxnet.__version__, mxnet.__file__))\"\nmxnet: 1.2.0, e:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\lib\\site-packages\\mxnet\\__init__.py\n(dlwin36) $ python -c \"import keras; print('keras: %s, %s' % (keras.__version__, keras.__file__))\"\nUsing TensorFlow backend.\nkeras: 2.1.6, e:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\lib\\site-packages\\keras\\__init__.py\n(dlwin36) $ python -c \"import torch; print('torch: %s, %s' % (torch.__version__, torch.__file__))\"\ntorch: 0.4.0, e:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\lib\\site-packages\\torch\\__init__.py\n```\n\n## GPU 测试\n\n### 使用 Keras 验证我们的 GPU 安装\n\n我们可以使用 Keras 提供的一个示例脚本，在 [MNIST 数据集](https:\u002F\u002Fen.wikipedia.org\u002Fwiki\u002FMNIST_database) 上训练一个简单的卷积网络（convnet，即 [convolutional neural network](https:\u002F\u002Fen.wikipedia.org\u002Fwiki\u002FConvolutional_neural_network)）。该文件名为 `mnist_cnn.py`，可以在 Keras 的 `examples` 文件夹中找到，[此处](https:\u002F\u002Fgithub.com\u002Fkeras-team\u002Fkeras\u002Fblob\u002Fmaster\u002Fexamples\u002Fmnist_cnn.py)。代码如下：\n\n```python\n'''Trains a simple convnet on the MNIST dataset.\n\nGets to 99.25% test accuracy after 12 epochs\n(there is still a 
lot of margin for parameter tuning).\n16 seconds per epoch on a GRID K520 GPU.\n'''\n\nfrom __future__ import print_function\nimport keras\nfrom keras.datasets import mnist\nfrom keras.models import Sequential\nfrom keras.layers import Dense, Dropout, Flatten\nfrom keras.layers import Conv2D, MaxPooling2D\nfrom keras import backend as K\n\nbatch_size = 128\nnum_classes = 10\nepochs = 12\n\n# input image dimensions\nimg_rows, img_cols = 28, 28\n\n# the data, split between train and test sets\n(x_train, y_train), (x_test, y_test) = mnist.load_data()\n\nif K.image_data_format() == 'channels_first':\n    x_train = x_train.reshape(x_train.shape[0], 1, img_rows, img_cols)\n    x_test = x_test.reshape(x_test.shape[0], 1, img_rows, img_cols)\n    input_shape = (1, img_rows, img_cols)\nelse:\n    x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)\n    x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)\n    input_shape = (img_rows, img_cols, 1)\n\nx_train = x_train.astype('float32')\nx_test = x_test.astype('float32')\nx_train \u002F= 255\nx_test \u002F= 255\nprint('x_train shape:', x_train.shape)\nprint(x_train.shape[0], 'train samples')\nprint(x_test.shape[0], 'test samples')\n\n# convert class vectors to binary class matrices\ny_train = keras.utils.to_categorical(y_train, num_classes)\ny_test = keras.utils.to_categorical(y_test, num_classes)\n\nmodel = Sequential()\nmodel.add(Conv2D(32, kernel_size=(3, 3),\n                 activation='relu',\n                 input_shape=input_shape))\nmodel.add(Conv2D(64, (3, 3), activation='relu'))\nmodel.add(MaxPooling2D(pool_size=(2, 2)))\nmodel.add(Dropout(0.25))\nmodel.add(Flatten())\nmodel.add(Dense(128, activation='relu'))\nmodel.add(Dropout(0.5))\nmodel.add(Dense(num_classes, activation='softmax'))\n\nmodel.compile(loss=keras.losses.categorical_crossentropy,\n              optimizer=keras.optimizers.Adadelta(),\n              metrics=['accuracy'])\n\nmodel.fit(x_train, y_train,\n          
batch_size=batch_size,\n          epochs=epochs,\n          verbose=1,\n          validation_data=(x_test, y_test))\nscore = model.evaluate(x_test, y_test, verbose=0)\nprint('Test loss:', score[0])\nprint('Test accuracy:', score[1])\n```\n\n### 使用 TensorFlow 后端（禁用 GPU）的 Keras\n\n若要激活并测试 **仅 CPU 模式** 下的 TensorFlow 后端，以获得一个良好的对比基准，请使用以下命令：\n\n```\n(dlwin36) $ set KERAS_BACKEND=tensorflow\n(dlwin36) $ set CUDA_VISIBLE_DEVICES=-1\n(dlwin36) $ python mnist_cnn.py\nUsing TensorFlow backend.\nx_train shape: (60000, 28, 28, 1)\n60000 train samples\n10000 test samples\nTrain on 60000 samples, validate on 10000 samples\nEpoch 1\u002F12\n2018-06-15 11:59:57.047920: I T:\\src\\github\\tensorflow\\tensorflow\\core\\platform\\cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2\n2018-06-15 11:59:58.152643: E T:\\src\\github\\tensorflow\\tensorflow\\stream_executor\\cuda\\cuda_driver.cc:406] failed call to cuInit: CUDA_ERROR_NO_DEVICE\n2018-06-15 11:59:58.164753: I T:\\src\\github\\tensorflow\\tensorflow\\stream_executor\\cuda\\cuda_diagnostics.cc:158] retrieving CUDA diagnostic information for host: SERVERP\n2018-06-15 11:59:58.173767: I T:\\src\\github\\tensorflow\\tensorflow\\stream_executor\\cuda\\cuda_diagnostics.cc:165] hostname: SERVERP\n60000\u002F60000 [==============================] - 60s 997us\u002Fstep - loss: 0.2603 - acc: 0.9195 - val_loss: 0.0502 - val_acc: 0.9836\nEpoch 2\u002F12\n60000\u002F60000 [==============================] - 57s 952us\u002Fstep - loss: 0.0873 - acc: 0.9734 - val_loss: 0.0390 - val_acc: 0.9868\nEpoch 3\u002F12\n60000\u002F60000 [==============================] - 57s 947us\u002Fstep - loss: 0.0657 - acc: 0.9803 - val_loss: 0.0346 - val_acc: 0.9888\nEpoch 4\u002F12\n60000\u002F60000 [==============================] - 57s 945us\u002Fstep - loss: 0.0543 - acc: 0.9842 - val_loss: 0.0348 - val_acc: 0.9886\nEpoch 5\u002F12\n60000\u002F60000 
[==============================] - 56s 941us\u002Fstep - loss: 0.0470 - acc: 0.9862 - val_loss: 0.0354 - val_acc: 0.9878\nEpoch 6\u002F12\n60000\u002F60000 [==============================] - 56s 939us\u002Fstep - loss: 0.0410 - acc: 0.9871 - val_loss: 0.0290 - val_acc: 0.9905\nEpoch 7\u002F12\n60000\u002F60000 [==============================] - 56s 941us\u002Fstep - loss: 0.0369 - acc: 0.9888 - val_loss: 0.0290 - val_acc: 0.9901\nEpoch 8\u002F12\n60000\u002F60000 [==============================] - 58s 960us\u002Fstep - loss: 0.0337 - acc: 0.9892 - val_loss: 0.0261 - val_acc: 0.9916\nEpoch 9\u002F12\n60000\u002F60000 [==============================] - 57s 953us\u002Fstep - loss: 0.0313 - acc: 0.9904 - val_loss: 0.0291 - val_acc: 0.9906\nEpoch 10\u002F12\n60000\u002F60000 [==============================] - 57s 958us\u002Fstep - loss: 0.0286 - acc: 0.9913 - val_loss: 0.0317 - val_acc: 0.9889\nEpoch 11\u002F12\n60000\u002F60000 [==============================] - 58s 961us\u002Fstep - loss: 0.0269 - acc: 0.9915 - val_loss: 0.0290 - val_acc: 0.9914\nEpoch 12\u002F12\n60000\u002F60000 [==============================] - 59s 976us\u002Fstep - loss: 0.0270 - acc: 0.9915 - val_loss: 0.0304 - val_acc: 0.9916\nTest loss: 0.030398282517803726\nTest accuracy: 0.9916\n```\n\n> 注意：如果您已运行上述命令序列，要恢复 CUDA 检测您的 GPU 存在的能力，只需将环境变量 `CUDA_VISIBLE_DEVICES` 设置为机器上已安装 GPU 设备的 ID 列表。换句话说，如果您只有一个 GPU，请使用 `set CUDA_VISIBLE_DEVICES=0`。如果您有两个 GPU，请使用 `set CUDA_VISIBLE_DEVICES=0,1`。以此类推。\n\n### 使用 TensorFlow 后端（GPU）的 Keras\n\n要激活并测试 TensorFlow 后端，请使用以下命令：\n\n```\n(dlwin36) $ set KERAS_BACKEND=tensorflow\n(dlwin36) $ python mnist_cnn.py\nUsing TensorFlow backend.\nx_train shape: (60000, 28, 28, 1)\n60000 train samples\n10000 test samples\nTrain on 60000 samples, validate on 10000 samples\nEpoch 1\u002F12\n2018-06-15 12:14:21.774082: I T:\\src\\github\\tensorflow\\tensorflow\\core\\platform\\cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: 
AVX2\n2018-06-15 12:14:22.219436: I T:\\src\\github\\tensorflow\\tensorflow\\core\\common_runtime\\gpu\\gpu_device.cc:1356] Found device 0 with properties:\nname: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.645\npciBusID: 0000:04:00.0\ntotalMemory: 11.00GiB freeMemory: 9.09GiB\n2018-06-15 12:14:22.345166: I T:\\src\\github\\tensorflow\\tensorflow\\core\\common_runtime\\gpu\\gpu_device.cc:1356] Found device 1 with properties:\nname: GeForce GTX TITAN X major: 5 minor: 2 memoryClockRate(GHz): 1.076\npciBusID: 0000:03:00.0\ntotalMemory: 12.00GiB freeMemory: 10.06GiB\n2018-06-15 12:14:22.360064: I T:\\src\\github\\tensorflow\\tensorflow\\core\\common_runtime\\gpu\\gpu_device.cc:1435] Adding visible gpu devices: 0, 1\n2018-06-15 12:14:23.731981: I T:\\src\\github\\tensorflow\\tensorflow\\core\\common_runtime\\gpu\\gpu_device.cc:923] Device interconnect StreamExecutor with strength 1 edge matrix:\n2018-06-15 12:14:23.741080: I T:\\src\\github\\tensorflow\\tensorflow\\core\\common_runtime\\gpu\\gpu_device.cc:929]      0 1\n2018-06-15 12:14:23.747608: I T:\\src\\github\\tensorflow\\tensorflow\\core\\common_runtime\\gpu\\gpu_device.cc:942] 0:   N N\n2018-06-15 12:14:23.753642: I T:\\src\\github\\tensorflow\\tensorflow\\core\\common_runtime\\gpu\\gpu_device.cc:942] 1:   N N\n2018-06-15 12:14:23.759825: I T:\\src\\github\\tensorflow\\tensorflow\\core\\common_runtime\\gpu\\gpu_device.cc:1053] Created TensorFlow device (\u002Fjob:localhost\u002Freplica:0\u002Ftask:0\u002Fdevice:GPU:0 with 8804 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:04:00.0, compute capability: 6.1)\n2018-06-15 12:14:24.168800: I T:\\src\\github\\tensorflow\\tensorflow\\core\\common_runtime\\gpu\\gpu_device.cc:1053] Created TensorFlow device (\u002Fjob:localhost\u002Freplica:0\u002Ftask:0\u002Fdevice:GPU:1 with 9737 MB memory) -> physical GPU (device: 1, name: GeForce GTX TITAN X, pci bus id: 0000:03:00.0, compute capability: 5.2)\n60000\u002F60000 
[==============================] - 10s 161us\u002Fstep - loss: 0.2613 - acc: 0.9198 - val_loss: 0.0563 - val_acc: 0.9811\nEpoch 2\u002F12\n60000\u002F60000 [==============================] - 4s 71us\u002Fstep - loss: 0.0875 - acc: 0.9743 - val_loss: 0.0435 - val_acc: 0.9853\nEpoch 3\u002F12\n60000\u002F60000 [==============================] - 4s 71us\u002Fstep - loss: 0.0652 - acc: 0.9808 - val_loss: 0.0338 - val_acc: 0.9886\nEpoch 4\u002F12\n60000\u002F60000 [==============================] - 4s 71us\u002Fstep - loss: 0.0531 - acc: 0.9844 - val_loss: 0.0324 - val_acc: 0.9896\nEpoch 5\u002F12\n60000\u002F60000 [==============================] - 4s 71us\u002Fstep - loss: 0.0466 - acc: 0.9861 - val_loss: 0.0307 - val_acc: 0.9895\nEpoch 6\u002F12\n60000\u002F60000 [==============================] - 4s 71us\u002Fstep - loss: 0.0421 - acc: 0.9869 - val_loss: 0.0323 - val_acc: 0.9906\nEpoch 7\u002F12\n60000\u002F60000 [==============================] - 4s 71us\u002Fstep - loss: 0.0402 - acc: 0.9879 - val_loss: 0.0286 - val_acc: 0.9907\nEpoch 8\u002F12\n60000\u002F60000 [==============================] - 4s 71us\u002Fstep - loss: 0.0326 - acc: 0.9896 - val_loss: 0.0299 - val_acc: 0.9909\nEpoch 9\u002F12\n60000\u002F60000 [==============================] - 4s 71us\u002Fstep - loss: 0.0311 - acc: 0.9907 - val_loss: 0.0262 - val_acc: 0.9922\nEpoch 10\u002F12\n60000\u002F60000 [==============================] - 4s 71us\u002Fstep - loss: 0.0310 - acc: 0.9902 - val_loss: 0.0256 - val_acc: 0.9918\nEpoch 11\u002F12\n60000\u002F60000 [==============================] - 4s 71us\u002Fstep - loss: 0.0267 - acc: 0.9914 - val_loss: 0.0310 - val_acc: 0.9905\nEpoch 12\u002F12\n60000\u002F60000 [==============================] - 4s 71us\u002Fstep - loss: 0.0262 - acc: 0.9917 - val_loss: 0.0281 - val_acc: 0.9919\nTest loss: 0.028108230106867086\nTest accuracy: 0.9919\n```\n\n在 GPU 加速模式下运行的、使用 TensorFlow 后端的 Keras，其速度比 CPU 模式快约 **14.5 倍**（58\u002F4=14.5）。\n\n### 使用 CNTK 后端（GPU）的 
Keras\n\n要激活并测试 CNTK 后端，请使用以下命令：\n\n```\n(dlwin36) $ set KERAS_BACKEND=cntk\n(dlwin36) $ python mnist_cnn.py\nUsing CNTK backend\nSelected GPU[0] GeForce GTX 1080 Ti as the process wide default device.\nx_train shape: (60000, 28, 28, 1)\n60000 train samples\n10000 test samples\nTrain on 60000 samples, validate on 10000 samples\nEpoch 1\u002F12\n60000\u002F60000 [==============================] - 7s 110us\u002Fstep - loss: 0.2594 - acc: 0.9211 - val_loss: 0.0561 - val_acc: 0.9806\nEpoch 2\u002F12\n60000\u002F60000 [==============================] - 6s 93us\u002Fstep - loss: 0.0855 - acc: 0.9752 - val_loss: 0.0425 - val_acc: 0.9864\nEpoch 3\u002F12\n60000\u002F60000 [==============================] - 6s 93us\u002Fstep - loss: 0.0646 - acc: 0.9805 - val_loss: 0.0327 - val_acc: 0.9887\nEpoch 4\u002F12\n60000\u002F60000 [==============================] - 6s 93us\u002Fstep - loss: 0.0537 - acc: 0.9839 - val_loss: 0.0303 - val_acc: 0.9892\nEpoch 5\u002F12\n60000\u002F60000 [==============================] - 6s 94us\u002Fstep - loss: 0.0466 - acc: 0.9863 - val_loss: 0.0280 - val_acc: 0.9906\nEpoch 6\u002F12\n60000\u002F60000 [==============================] - 6s 93us\u002Fstep - loss: 0.0410 - acc: 0.9872 - val_loss: 0.0289 - val_acc: 0.9916\nEpoch 7\u002F12\n60000\u002F60000 [==============================] - 6s 93us\u002Fstep - loss: 0.0356 - acc: 0.9896 - val_loss: 0.0278 - val_acc: 0.9917\nEpoch 8\u002F12\n60000\u002F60000 [==============================] - 6s 94us\u002Fstep - loss: 0.0341 - acc: 0.9899 - val_loss: 0.0293 - val_acc: 0.9905\nEpoch 9\u002F12\n60000\u002F60000 [==============================] - 6s 94us\u002Fstep - loss: 0.0325 - acc: 0.9903 - val_loss: 0.0249 - val_acc: 0.9920\nEpoch 10\u002F12\n60000\u002F60000 [==============================] - 6s 94us\u002Fstep - loss: 0.0302 - acc: 0.9903 - val_loss: 0.0275 - val_acc: 0.9910\nEpoch 11\u002F12\n60000\u002F60000 [==============================] - 6s 94us\u002Fstep - loss: 0.0277 - acc: 0.9913 - 
val_loss: 0.0258 - val_acc: 0.9915\nEpoch 12\u002F12\n60000\u002F60000 [==============================] - 6s 94us\u002Fstep - loss: 0.0253 - acc: 0.9923 - val_loss: 0.0277 - val_acc: 0.9906\nTest loss: 0.027684621373889287\nTest accuracy: 0.9906\n```\n\n在此特定实验中，GPU 模式下的 CNTK 很快，但不如 TensorFlow 快。\n\n### 使用 MXNet 后端（GPU）的 Keras\n\n要激活并测试 MXNet 后端，请使用以下命令：\n\n```\n(dlwin36) $ set KERAS_BACKEND=mxnet\n```\n\n请注意，截至本文撰写之时，根据 [问题 #106](https:\u002F\u002Fgithub.com\u002Fawslabs\u002Fkeras-apache-mxnet\u002Fissues\u002F106)，目前尚无法直接使用相同的 Keras 代码并期望其在 GPU 上通过 MXNet 运行。您需要按照如下所示修改示例文件 `mnist_cnn.py` 中的 **一行** 代码：\n\n```python\nmodel.compile(loss=keras.losses.categorical_crossentropy,\n              optimizer=keras.optimizers.Adadelta(),\n              metrics=['accuracy'])\n```\n\n应为：\n\n```python\nmodel.compile(loss=keras.losses.categorical_crossentropy,\n              optimizer=keras.optimizers.Adadelta(),\n              metrics=['accuracy'],\n              context= [\"gpu(0)\"])\n```\n\n或者，使用此仓库中包含的文件 [`mnist_cnn_mxnet.py`](mnist_cnn_mxnet.py)（它包含了上述更改），如下所示：\n\n```\n(dlwin36) $ set KERAS_BACKEND=mxnet\n(dlwin36) $ python mnist_cnn_mxnet.py\nUsing MXNet backend\nx_train shape: (60000, 28, 28, 1)\n60000 train samples\n10000 test samples\ne:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\lib\\site-packages\\keras\\backend\\mxnet_backend.py:89: UserWarning: MXNet Backend performs best with `channels_first` format. Using `channels_last` will significantly reduce performance due to the Transpose operations. 
For performance improvement, please use this API`keras.utils.to_channels_first(x_input)`to transform `channels_last` data to `channels_first` format and also please change the `image_data_format` in `keras.json` to `channels_first`.Note: `x_input` is a Numpy tensor or a list of Numpy tensorRefer to: https:\u002F\u002Fgithub.com\u002Fawslabs\u002Fkeras-apache-mxnet\u002Ftree\u002Fmaster\u002Fdocs\u002Fmxnet_backend\u002Fperformance_guide.md\n  train_symbol = func(*args, **kwargs)\ne:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\lib\\site-packages\\keras\\backend\\mxnet_backend.py:92: UserWarning: MXNet Backend performs best with `channels_first` format. Using `channels_last` will significantly reduce performance due to the Transpose operations. For performance improvement, please use this API`keras.utils.to_channels_first(x_input)`to transform `channels_last` data to `channels_first` format and also please change the `image_data_format` in `keras.json` to `channels_first`.Note: `x_input` is a Numpy tensor or a list of Numpy tensorRefer to: https:\u002F\u002Fgithub.com\u002Fawslabs\u002Fkeras-apache-mxnet\u002Ftree\u002Fmaster\u002Fdocs\u002Fmxnet_backend\u002Fperformance_guide.md\n  test_symbol = func(*args, **kwargs)\nTrain on 60000 samples, validate on 10000 samples\nEpoch 1\u002F12\ne:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\lib\\site-packages\\mxnet\\module\\bucketing_module.py:408: UserWarning: Optimizer created manually outside Module but rescale_grad is not normalized to 1.0\u002Fbatch_size\u002Fnum_workers (1.0 vs. 0.0078125). Is this intended?\n  force_init=force_init)\n[04:55:20] c:\\jenkins\\workspace\\mxnet-tag\\mxnet\\src\\operator\\nn\\cudnn\\.\u002Fcudnn_algoreg-inl.h:107: Running performance tests to find the best convolution algorithm, this can take a while... 
(setting env variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)\n60000\u002F60000 [==============================] - 12s 192us\u002Fstep - loss: 0.3480 - acc: 0.8934 - val_loss: 0.0817 - val_acc: 0.9743\nEpoch 2\u002F12\n60000\u002F60000 [==============================] - 7s 119us\u002Fstep - loss: 0.1177 - acc: 0.9660 - val_loss: 0.0524 - val_acc: 0.9828\nEpoch 3\u002F12\n60000\u002F60000 [==============================] - 7s 119us\u002Fstep - loss: 0.0859 - acc: 0.9750 - val_loss: 0.0432 - val_acc: 0.9857\nEpoch 4\u002F12\n60000\u002F60000 [==============================] - 7s 119us\u002Fstep - loss: 0.0704 - acc: 0.9792 - val_loss: 0.0363 - val_acc: 0.9882\nEpoch 5\u002F12\n60000\u002F60000 [==============================] - 7s 119us\u002Fstep - loss: 0.0608 - acc: 0.9817 - val_loss: 0.0344 - val_acc: 0.9884\nEpoch 6\u002F12\n60000\u002F60000 [==============================] - 7s 119us\u002Fstep - loss: 0.0561 - acc: 0.9839 - val_loss: 0.0328 - val_acc: 0.9889\nEpoch 7\u002F12\n60000\u002F60000 [==============================] - 7s 119us\u002Fstep - loss: 0.0503 - acc: 0.9853 - val_loss: 0.0322 - val_acc: 0.9890\nEpoch 8\u002F12\n60000\u002F60000 [==============================] - 7s 119us\u002Fstep - loss: 0.0473 - acc: 0.9860 - val_loss: 0.0290 - val_acc: 0.9905\nEpoch 9\u002F12\n60000\u002F60000 [==============================] - 7s 119us\u002Fstep - loss: 0.0440 - acc: 0.9870 - val_loss: 0.0304 - val_acc: 0.9899\nEpoch 10\u002F12\n60000\u002F60000 [==============================] - 7s 119us\u002Fstep - loss: 0.0413 - acc: 0.9877 - val_loss: 0.0280 - val_acc: 0.9906\nEpoch 11\u002F12\n60000\u002F60000 [==============================] - 7s 119us\u002Fstep - loss: 0.0388 - acc: 0.9888 - val_loss: 0.0281 - val_acc: 0.9913\nEpoch 12\u002F12\n60000\u002F60000 [==============================] - 7s 119us\u002Fstep - loss: 0.0382 - acc: 0.9883 - val_loss: 0.0285 - val_acc: 0.9904\nTest loss: 0.028510591367455346\nTest accuracy: 0.9904\n```\n\n仅从这次单一实验来看，MXNet 
似乎是三个 Keras 后端 (backend) 中速度最慢的。但是，如果您决定使用 MXNet，则可能需要实施上述警告中的更改：\n\n```\ne:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\lib\\site-packages\\keras\\backend\\mxnet_backend.py:89: UserWarning: MXNet Backend performs best with `channels_first` format. Using `channels_last` will significantly reduce performance due to the Transpose operations. For performance improvement, please use this API`keras.utils.to_channels_first(x_input)`to transform `channels_last` data to `channels_first` format and also please change the `image_data_format` in `keras.json` to `channels_first`.Note: `x_input` is a Numpy tensor or a list of Numpy tensorRefer to: https:\u002F\u002Fgithub.com\u002Fawslabs\u002Fkeras-apache-mxnet\u002Ftree\u002Fmaster\u002Fdocs\u002Fmxnet_backend\u002Fperformance_guide.md\n  train_symbol = func(*args, **kwargs)\n```\n\n您可以使用以下命令来实现这些更改：\n\n```\n(dlwin36) $ %SystemDrive%\n(dlwin36) $ cd %USERPROFILE%\\.keras\n(dlwin36) $ cp keras.json keras.json.bak\n(dlwin36) $ (echo { & echo     \"image_data_format\": \"channels_first\", & echo     \"epsilon\": 1e-07, & echo     \"floatx\": \"float32\", & echo     \"backend\": \"mxnet\" & echo }) > keras_mxnet.json\n(dlwin36) $ (echo { & echo     \"image_data_format\": \"channels_last\", & echo     \"epsilon\": 1e-07, & echo     \"floatx\": \"float32\", & echo     \"backend\": \"tensorflow\" & echo }) > keras_tensorflow.json\n(dlwin36) $ (echo { & echo     \"image_data_format\": \"channels_last\", & echo     \"epsilon\": 1e-07, & echo     \"floatx\": \"float32\", & echo     \"backend\": \"cntk\" & echo }) > keras_cntk.json\n(dlwin36) $ cp -f keras_mxnet.json keras.json\n```\n\n注意 1：如果您在此之后想切换回 TensorFlow 或 CNTK，只需将正确的 `json` 文件复制为 `keras.json`（例如：`cp -f keras_tensorflow.json keras.json`），并将 `KERAS_BACKEND` 设置为匹配的框架（例如：`set KERAS_BACKEND=tensorflow`）。\n\n注意 2：切换到 `channels_first` 通道排序后，我得到了以下结果：\n\n```\n(dlwin36) $ python mnist_cnn_mxnet.py\nUsing MXNet backend\nx_train shape: (60000, 1, 28, 28)\n60000 train samples\n10000 test samples\nTrain on 60000 samples, validate on 10000 samples\nEpoch 1\u002F12\ne:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\lib\\site-packages\\mxnet\\module\\bucketing_module.py:408: UserWarning: Optimizer created manually outside Module but rescale_grad is not normalized to 1.0\u002Fbatch_size\u002Fnum_workers (1.0 vs. 0.0078125). Is this intended?\n  force_init=force_init)\n[05:39:39] c:\\jenkins\\workspace\\mxnet-tag\\mxnet\\src\\operator\\nn\\cudnn\\.\u002Fcudnn_algoreg-inl.h:107: Running performance tests to find the best convolution algorithm, this can take a while... (setting env variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)\n60000\u002F60000 [==============================] - 9s 152us\u002Fstep - loss: 0.3485 - acc: 0.8923 - val_loss: 0.0851 - val_acc: 0.9732\nEpoch 2\u002F12\n60000\u002F60000 [==============================] - 7s 109us\u002Fstep - loss: 0.1191 - acc: 0.9652 - val_loss: 0.0529 - val_acc: 0.9824\nEpoch 3\u002F12\n60000\u002F60000 [==============================] - 7s 109us\u002Fstep - loss: 0.0874 - acc: 0.9741 - val_loss: 0.0435 - val_acc: 0.9865\nEpoch 4\u002F12\n60000\u002F60000 [==============================] - 7s 109us\u002Fstep - loss: 0.0740 - acc: 0.9784 - val_loss: 0.0402 - val_acc: 0.9867\nEpoch 5\u002F12\n60000\u002F60000 [==============================] - 7s 109us\u002Fstep - loss: 0.0642 - acc: 0.9809 - val_loss: 0.0328 - val_acc: 0.9884\nEpoch 6\u002F12\n60000\u002F60000 [==============================] - 7s 109us\u002Fstep - loss: 0.0585 - acc: 0.9826 - val_loss: 0.0346 - val_acc: 0.9897\nEpoch 7\u002F12\n60000\u002F60000 [==============================] - 7s 109us\u002Fstep - loss: 0.0534 - acc: 0.9843 - val_loss: 0.0315 - val_acc: 0.9889\nEpoch 8\u002F12\n60000\u002F60000 [==============================] - 7s 109us\u002Fstep - loss: 0.0491 - acc: 0.9852 - val_loss: 0.0336 - val_acc: 0.9888\nEpoch 9\u002F12\n60000\u002F60000 [==============================] - 7s 109us\u002Fstep - loss: 0.0441 - acc: 0.9865 - val_loss: 0.0302 - val_acc: 0.9899\nEpoch 10\u002F12\n60000\u002F60000 [==============================] - 7s 109us\u002Fstep - loss: 0.0421 - acc: 0.9877 - val_loss: 0.0303 - val_acc: 0.9903\nEpoch 11\u002F12\n60000\u002F60000 [==============================] - 7s 109us\u002Fstep - loss: 0.0404 - acc: 0.9878 - val_loss: 0.0294 - val_acc: 0.9903\nEpoch 12\u002F12\n60000\u002F60000 [==============================] - 7s 109us\u002Fstep - loss: 0.0381 - acc: 0.9889 - val_loss: 0.0272 - val_acc: 0.9904\nTest loss: 0.027214839413274603\nTest accuracy: 0.9904\n```\n\n速度稍快一些，但不如使用 CNTK 或 TensorFlow 后端 (backend) 的 Keras 快。\n\n### 使用 PyTorch 验证我们的 GPU (图形处理单元) 安装\n\n同样地，我们可以修改 PyTorch [examples 文件夹](https:\u002F\u002Fgithub.com\u002Fpytorch\u002Fexamples\u002Fblob\u002Fmaster\u002Fmnist\u002Fmain.py) 中的样本，在 MNIST 数据集上训练一个与 Keras 案例中使用的网络类似的卷积神经网络 (convnet)。新代码如下所示：\n\n```python\nfrom __future__ import print_function\nimport sys, argparse\nfrom time import time\nimport torch\nimport torch.nn as nn\nimport torch.nn.functional as F\nimport torch.optim as optim\nfrom torchvision import datasets, transforms\n\ntracker_length = 30\n\nclass Net(nn.Module):\n    def __init__(self):\n        super(Net, self).__init__()\n        self.conv1 = nn.Conv2d(1, 32, kernel_size=3)\n        self.conv2 = nn.Conv2d(32, 64, kernel_size=3)\n        self.fc1 = nn.Linear(12*12*64, 128)\n        self.fc2 = nn.Linear(128, 10)\n\n    def forward(self, x):\n        x = F.relu(self.conv1(x))      # 28x28x1 -> 26x26x32\n        x = F.relu(self.conv2(x))      # 26x26x32 -> 24x24x64\n        x = F.max_pool2d(x, 2) # 24x24x64 -> 12x12x64\n        x = F.dropout(x, p=0.25, training=self.training)\n        x = x.view(-1, 12*12*64)       # flatten 12x12x64 = 9216\n        x = F.relu(self.fc1(x))        # fc 9216 -> 128\n        x = F.dropout(x, p=0.5, training=self.training)\n        x = self.fc2(x)                # fc 128 -> 10\n        return F.log_softmax(x, dim=1) # to 10 log-probabilities\n\ndef train(args, 
model, device, train_loader, optimizer):\n    model.train()\n    start_time = time()\n\n    for batch_idx, (data, target) in enumerate(train_loader):\n        data, target = data.to(device), target.to(device)\n        optimizer.zero_grad()\n        output = model(data)\n        loss = F.nll_loss(output, target)\n        loss.backward()\n        optimizer.step()\n        if batch_idx % args.log_interval == 0:\n            percentage = 100. * batch_idx \u002F len(train_loader)\n            cur_length = int((tracker_length * int(percentage)) \u002F 100)\n            bar = '=' * cur_length + '>' + '-' * (tracker_length - cur_length)\n            sys.stdout.write('\\r{}\u002F{} [{}] - loss: {:.4f}'.format(\n                batch_idx * len(data), len(train_loader.dataset),\n                bar, loss.item()))\n            sys.stdout.flush()\n\n    train_time = time() - start_time\n    sys.stdout.write('\\r{}\u002F{} [{}] - {:.1f}s {:.1f}us\u002Fstep - loss: {:.4f}'.format(\n        len(train_loader.dataset), len(train_loader.dataset), '=' * tracker_length, \n        train_time, (train_time \u002F len(train_loader.dataset)) * 1000000.0, loss.item()))\n    sys.stdout.flush()\n\n    return len(train_loader.dataset), train_time, loss.item()\n\ndef test(args, model, device, test_loader):\n    model.eval()\n    test_loss = 0\n    correct = 0\n\n    with torch.no_grad():\n        for data, target in test_loader:\n            data, target = data.to(device), target.to(device)\n            output = model(data)\n            test_loss += F.nll_loss(output, target, size_average=False).item() # sum up batch loss\n            pred = output.max(1, keepdim=True)[1] # get the index of the max log-probability\n            correct += pred.eq(target.view_as(pred)).sum().item()\n\n    test_loss \u002F= len(test_loader.dataset)\n    test_accuracy = correct \u002F len(test_loader.dataset)\n\n    return test_loss, test_accuracy\n\ndef main():\n    # Training settings\n    parser = 
argparse.ArgumentParser(description='PyTorch MNIST Example')\n    parser.add_argument('--batch-size', type=int, default=64, metavar='N',\n                        help='input batch size for training (default: 64)')\n    parser.add_argument('--test-batch-size', type=int, default=1000, metavar='N',\n                        help='input batch size for testing (default: 1000)')\n    parser.add_argument('--epochs', type=int, default=10, metavar='N',\n                        help='number of epochs to train (default: 10)')\n    parser.add_argument('--lr', type=float, default=0.01, metavar='LR',\n                        help='learning rate (default: 0.01)')\n    parser.add_argument('--momentum', type=float, default=0.5, metavar='M',\n                        help='SGD momentum (default: 0.5)')\n    parser.add_argument('--no-cuda', action='store_true', default=False,\n                        help='disables CUDA training')\n    parser.add_argument('--seed', type=int, default=1, metavar='S',\n                        help='random seed (default: 1)')\n    parser.add_argument('--log-interval', type=int, default=10, metavar='N',\n                        help='how many batches to wait before logging training status')\n    args = parser.parse_args()\n    use_cuda = not args.no_cuda and torch.cuda.is_available()\n\n    torch.manual_seed(args.seed)\n\n    device = torch.device(\"cuda\" if use_cuda else \"cpu\")\n\n    kwargs = {'num_workers': 1, 'pin_memory': True} if use_cuda else {}\n    train_loader = torch.utils.data.DataLoader(\n        datasets.MNIST('..\u002Fdata', train=True, download=True,\n                       transform=transforms.Compose([\n                           transforms.ToTensor(),\n                           transforms.Normalize((0.1307,), (0.3081,))\n                       ])),\n        batch_size=args.batch_size, shuffle=True, **kwargs)\n    test_loader = torch.utils.data.DataLoader(\n        datasets.MNIST('..\u002Fdata', train=False, 
transform=transforms.Compose([\n                           transforms.ToTensor(),\n                           transforms.Normalize((0.1307,), (0.3081,))\n                       ])),\n        batch_size=args.test_batch_size, shuffle=True, **kwargs)\n\n\n    model = Net().to(device)\n    optimizer = optim.SGD(model.parameters(), lr=args.lr, momentum=args.momentum)\n\n    for epoch in range(1, args.epochs + 1):\n        print(\"\\nEpoch {}\u002F{}\".format(epoch, args.epochs))\n        train_len, train_time, train_loss = train(args, model, device, train_loader, optimizer)\n        test_loss, test_accuracy = test(args, model, device, test_loader)\n        sys.stdout.write('\\r{}\u002F{} [{}] - {:.1f}s {:.1f}us\u002Fstep - loss: {:.4f} - val_loss: {:.4f} - val_acc: {:.4f}'.format(\n            train_len, train_len, '=' * tracker_length, \n            train_time, (train_time \u002F train_len) * 1000000.0, train_loss,\n            test_loss, test_accuracy))\n        sys.stdout.flush()\n\n\nif __name__ == '__main__':\n    main()\n```\n\n我们在仓库中包含了此样本的修改版本，文件名为 [`mnist_cnn_pytorch.py`](mnist_cnn_pytorch.py)。您可以按以下方式运行：\n\n```\n(dlwin36) $ python mnist_cnn_pytorch.py\nEpoch 1\u002F12\n60000\u002F60000 [==============================] - 7.1s 118.6us\u002Fstep - loss: 0.2592 - val_loss: 0.1883 - val_acc: 0.9438\nEpoch 2\u002F12\n60000\u002F60000 [==============================] - 6.1s 102.0us\u002Fstep - loss: 0.1917 - val_loss: 0.1412 - val_acc: 0.9575\nEpoch 3\u002F12\n60000\u002F60000 [==============================] - 6.1s 101.5us\u002Fstep - loss: 0.2335 - val_loss: 0.1074 - val_acc: 0.9679\nEpoch 4\u002F12\n60000\u002F60000 [==============================] - 6.1s 101.2us\u002Fstep - loss: 0.2038 - val_loss: 0.0828 - val_acc: 0.9741\nEpoch 5\u002F12\n60000\u002F60000 [==============================] - 6.1s 101.8us\u002Fstep - loss: 0.1733 - val_loss: 0.0676 - val_acc: 0.9783\nEpoch 6\u002F12\n60000\u002F60000 [==============================] - 6.1s 101.2us\u002Fstep - 
loss: 0.0952 - val_loss: 0.0587 - val_acc: 0.9810\nEpoch 7\u002F12\n60000\u002F60000 [==============================] - 6.1s 101.8us\u002Fstep - loss: 0.0521 - val_loss: 0.0527 - val_acc: 0.9832\nEpoch 8\u002F12\n60000\u002F60000 [==============================] - 6.1s 101.5us\u002Fstep - loss: 0.0993 - val_loss: 0.0484 - val_acc: 0.9834\nEpoch 9\u002F12\n60000\u002F60000 [==============================] - 6.0s 100.3us\u002Fstep - loss: 0.2031 - val_loss: 0.0449 - val_acc: 0.9853\nEpoch 10\u002F12\n60000\u002F60000 [==============================] - 6.0s 100.0us\u002Fstep - loss: 0.2267 - val_loss: 0.0429 - val_acc: 0.9868\nEpoch 11\u002F12\n60000\u002F60000 [==============================] - 6.1s 100.9us\u002Fstep - loss: 0.0819 - val_loss: 0.0426 - val_acc: 0.9857\nEpoch 12\u002F12\n60000\u002F60000 [==============================] - 6.0s 100.7us\u002Fstep - loss: 0.0312 - val_loss: 0.0370 - val_acc: 0.9872\n```\n\n正如预期，使用 PyTorch 进行网络训练的性能与其他框架相当。\n\n# 建议观看和阅读\n\nDeep Learning with Keras - Python, by The SemiColon:\n\n@ https:\u002F\u002Fwww.youtube.com\u002Fplaylist?list=PLVBorYCcu-xX3Ppjb_sqBd_Xf6GqagQyl\n\nDeep Learning with Python, François Chollet\n\n@ https:\u002F\u002Fwww.manning.com\u002Fbooks\u002Fdeep-learning-with-python\n\n# 关于作者\n\n有关作者的更多信息，请访问：\n\n[![https:\u002F\u002Fwww.linkedin.com\u002Fin\u002Fphilferriere](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fphilferriere_dlwin_readme_a5d35e87943b.png)](https:\u002F\u002Fwww.linkedin.com\u002Fin\u002Fphilferriere)","# dlwin Windows 原生深度学习环境快速上手指南\n\n**注意**：本指南基于 `dlwin` 项目 2018 年 6 月版本的文档整理。由于深度学习生态更新迅速，以下依赖版本（如 CUDA 9.0、TensorFlow 1.8）属于历史稳定版，适用于特定旧环境复现需求。\n\n## 1. 环境准备\n\n### 系统要求\n- **操作系统**: Windows 10 (建议版本 1709 及以上)\n- **硬件配置**:\n  - CPU: Intel Xeon E5 系列或同等性能 (多核)\n  - 内存：建议 64GB RAM\n  - GPU: NVIDIA 显卡 (驱动版本 390.77 或更高，支持 CUDA)\n- **磁盘空间**: 预留足够空间用于安装工具链及模型数据\n\n### 前置依赖\n请确保按顺序下载并安装以下组件（路径可根据个人习惯调整，本文示例以 `e:\\toolkits.win` 为例）：\n\n1. 
**Visual Studio 2015 Community Edition Update 3** + **Windows Kit 10.0.10240.0**\n   - 用途：提供 C\u002FC++ 编译器及 SDK，CUDA 编译必需。\n2. **Anaconda 5.2.0 (64-bit)**\n   - 用途：Python 发行版，包含科学计算库。\n   - 推荐版本：Python 3.6（Windows 版 TensorFlow 仅支持 Python 3.5\u002F3.6）；如不使用 TensorFlow，也可选 Python 2.7。\n3. **CUDA 9.0.176 (64-bit)**\n   - 用途：GPU 数学库及编译器。\n4. **cuDNN v7.0.4**\n   - 用途：针对 CUDA 9.0 的神经网络加速库。\n\n> **提示**：对于国内开发者，建议在下载 Anaconda 或使用 pip\u002Fconda 时配置国内镜像源（如清华大学源），以提升下载速度。\n\n---\n\n## 2. 安装步骤\n\n### 第一步：配置 Visual Studio\n1. 下载安装 Visual Studio 2015 Community Update 3。\n2. 安装完成后，配置环境变量：\n   - 添加 `PATH`: `C:\\Program Files (x86)\\Microsoft Visual Studio 14.0\\VC\\bin`\n   - 设置 `INCLUDE`: `C:\\Program Files (x86)\\Windows Kits\\10\\Include\\10.0.10240.0\\ucrt`\n   - 设置 `LIB`: `C:\\Program Files (x86)\\Windows Kits\\10\\Lib\\10.0.10240.0\\um\\x64;C:\\Program Files (x86)\\Windows Kits\\10\\Lib\\10.0.10240.0\\ucrt\\x64`\n\n### 第二步：安装 Anaconda 并配置环境\n1. 下载并运行 Anaconda 安装程序。\n2. 安装后配置环境变量：\n   - 设置 `PYTHON_HOME`: `e:\\toolkits.win\\anaconda3-5.2.0` (根据实际安装路径修改)\n   - 添加 `PATH`: `%PYTHON_HOME%`, `%PYTHON_HOME%\\Scripts`, `%PYTHON_HOME%\\Library\\bin`\n3. 创建 Conda 环境 `dlwin36`：\n   ```bash\n   $ conda create --yes -n dlwin36 numpy scipy mkl-service m2w64-toolchain libpython matplotlib pandas scikit-learn tqdm jupyter h5py cython\n   ```\n4. 激活环境：\n   ```bash\n   $ activate dlwin36\n   ```\n\n### 第三步：安装图像处理库 (可选但推荐)\n若涉及图像数据处理，建议安装以下库：\n```bash\n(dlwin36) $ conda install --yes pillow scikit-image\n(dlwin36) $ conda install --yes -c conda-forge opencv\n(dlwin36) $ pip install git+https:\u002F\u002Fgithub.com\u002Faleju\u002Fimgaug\n```\n\n### 第四步：安装深度学习框架\n在激活的 `dlwin36` 环境中依次安装以下核心库（版本严格对应文档）：\n\n```bash\n(dlwin36) $ pip install keras==2.1.6\n(dlwin36) $ pip install tensorflow-gpu==1.8.0\n(dlwin36) $ pip install cntk-gpu==2.5.1\n(dlwin36) $ pip install mxnet-cu90==1.2.0\n(dlwin36) $ pip install http:\u002F\u002Fdownload.pytorch.org\u002Fwhl\u002Fcu90\u002Ftorch-0.4.0-cp36-cp36m-win_amd64.whl\n(dlwin36) $ pip install torchvision\n```\n\n> **注**：PyPI 上名为 `pytorch` 的包并不是 PyTorch 框架本体；Windows 平台的 PyTorch 0.4.0 需通过上述官方 wheel 地址（对应 Python 3.6 + CUDA 9.0）安装。\n\n---\n\n## 3. 
基本使用与验证\n\n### 环境检查\n激活环境后，可执行以下命令检查已安装的 Python 库：\n```bash\n(dlwin36) $ conda list\n```\n检查系统 PATH 变量是否包含必要的 CUDA 和 Anaconda 路径。\n\n### GPU 验证示例\n创建一个简单的 Python 脚本 `test_gpu.py` 来验证 GPU 是否正常工作。\n\n**Keras + TensorFlow 后端测试：**\n```python\nimport tensorflow as tf\nfrom keras import backend as K\n\n# 按需分配显存，避免 TensorFlow 一次性占满 GPU\nconfig = tf.ConfigProto()\nconfig.gpu_options.allow_growth = True\nsess = tf.Session(config=config)\nK.set_session(sess)\n\nprint(\"Testing GPU availability...\")\nwith tf.device('\u002Fgpu:0'):\n    a = tf.constant([[1.0, 2.0], [3.0, 4.0]])\n    b = tf.constant([[1.0, 1.0], [0.0, 1.0]])\n    c = tf.matmul(a, b)\nprint(sess.run(c))\n```\n\n**PyTorch 测试：**\n```python\nimport torch\nprint(torch.cuda.is_available())\nprint(torch.cuda.get_device_name(0))\n```\n\n### 常用操作\n- **切换后端**：Keras 支持 TensorFlow、CNTK、MXNet 作为后端，可通过环境变量 `KERAS_BACKEND`（如 `set KERAS_BACKEND=cntk`）或 `%USERPROFILE%\\.keras\\keras.json` 中的 `backend` 字段进行切换。\n- **退出环境**：\n  ```bash\n  (dlwin36) $ deactivate\n  ```\n\n---\n\n*本指南旨在帮助开发者快速搭建 Windows 10 下的原生深度学习环境。如遇具体报错，请参考原始 README 中的详细日志分析章节。*","一名在 Windows 笔记本上开发图像识别算法的独立开发者，急需快速验证模型效果，但苦于硬件环境配置繁琐。\n\n### 没有 dlwin 时\n- 必须安装 Ubuntu 虚拟机或 Docker 容器才能运行 TensorFlow GPU 版本，占用大量系统资源且启动缓慢。\n- 手动配置 CUDA、cuDNN 与 Python 版本的兼容性极差，经常因依赖冲突报错，导致数小时调试无果。\n- 训练过程仅能使用 CPU，模型收敛速度慢，迭代一次实验往往需要耗费数天时间。\n- 缺乏针对 Windows 原生的优化指南，网上教程多为过时信息，难以适配最新框架版本。\n\n### 使用 dlwin 后\n- 直接在 Windows 原生系统下通过脚本一键配置 Keras、TensorFlow、PyTorch 等主流框架的 GPU 加速环境。\n- 内置 Anaconda 环境管理方案，无需单独安装 MinGW，解决了复杂的依赖冲突问题。\n- 充分利用本地显卡算力，模型训练效率显著提升，当天即可完成多轮实验验证与调优。\n- 支持 CNTK、MXNet 等多种后端选择，灵活适配不同项目需求，不再受限于单一框架。\n\ndlwin 彻底消除了 Windows 平台进行深度学习的门槛，让用户无需切换操作系统即可享受高效的 GPU 计算能力。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fphilferriere_dlwin_0a019e60.png","philferriere","Phil Ferriere","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002Fphilferriere_bc8449f7.jpg","Former Cruise Senior Software\u002FResearch Engineer and Microsoft Tech\u002FDevelopment Lead passionate about Deep Learning with a focus on Computer Vision.","Freelance","Palm 
Springs, CA","pferriere@hotmail.com",null,"https:\u002F\u002Fwww.linkedin.com\u002Fin\u002Fphilferriere","https:\u002F\u002Fgithub.com\u002Fphilferriere",[83,87],{"name":84,"color":85,"percentage":86},"Python","#3572A5",97.9,{"name":88,"color":89,"percentage":90},"Batchfile","#C1F12E",2.1,514,100,"2026-04-10T07:36:48",4,"Windows","需要 NVIDIA GPU，测试型号为 Titan X (12GB) 或 GTX 1080 Ti (11GB)，要求 CUDA 9.0.176 及 cuDNN v7.0.4","测试环境 64GB，最低需求未说明",{"notes":99,"python":100,"dependencies":101},"工具最后更新于 2018 年，依赖库版本较旧；仅支持 Windows 10 原生环境，不推荐 Docker\u002FVM；需安装特定版本 Visual Studio 2015 以编译 C++ 代码；必须手动配置系统环境变量 (PATH, INCLUDE, LIB)；建议使用 conda 管理环境。","3.6 (推荐) 或 2.7",[102,103,104,105,106,107,108,109,110],"Keras 2.1.6","TensorFlow-gpu 1.8.0","CNTK-gpu 2.5.1","MXNet-cu90 1.2.0","PyTorch 0.4.0","CUDA 9.0.176","cuDNN v7.0.4","Visual Studio 2015","Anaconda 5.2.0",[14],[113,114,115,116,117,118,119,120],"theano","gpu-acceleration","deep-learning","tensorflow","cudnn","cntk","gpu-mode","keras",5,"2026-03-27T02:49:30.150509","2026-04-11T16:55:09.176685",[125,130,135,140,145,149],{"id":126,"question_zh":127,"answer_zh":128,"source_url":129},3047,"如何在 Windows 上通过 conda 安装 Theano？","推荐使用 conda 安装，因为它自带所有依赖（如 gcc 和 libpython），比手动安装更容易。具体步骤如下：\n1. conda create -p pyenv python=3.5\n2. conda install -p pyenv theano\n3. activate pyenv\n4. 
pip install keras","https:\u002F\u002Fgithub.com\u002Fphilferriere\u002Fdlwin\u002Fissues\u002F20",{"id":131,"question_zh":132,"answer_zh":133,"source_url":134},3048,"文档指定的 cuDNN 版本无法下载怎么办？","旧版构建（如 August 10, 2016）可能不再可用。建议尝试较新的版本（如 cuDNN v5.1 (Jan 20, 2017), for CUDA 8.0）。维护者建议说明应仅使用经过测试且实际可用的版本，而不是理论上应该能用的版本。","https:\u002F\u002Fgithub.com\u002Fphilferriere\u002Fdlwin\u002Fissues\u002F30",{"id":136,"question_zh":137,"answer_zh":138,"source_url":139},3049,"运行 Theano 时出现 'TheanoConfigWarning: Config key has no value' 错误如何解决？","这通常是 README.md 中的拼写错误导致环境变量未正确识别。请确保设置环境变量时包含正确的键值对格式。例如：\n使用：THEANO_FLAGS_CPU=floatX=float32,device=cpu\n而不是：THEANO_FLAGS_CPU=float32,device=cpu（缺少 floatX=）","https:\u002F\u002Fgithub.com\u002Fphilferriere\u002Fdlwin\u002Fissues\u002F17",{"id":141,"question_zh":142,"answer_zh":143,"source_url":144},3050,"Visual Studio 2015 Community 无法下载，可以使用 VS2017 吗？","VS2015 Community 现已仅限 MSDN 订阅者下载。建议将 VS2015 和 VS2017 的安装说明拆分为独立文档（如 vs2015.md, vs2017.md），并在主 README 中标注 Stable（稳定）和 Experimental（实验），以便用户根据情况选择。","https:\u002F\u002Fgithub.com\u002Fphilferriere\u002Fdlwin\u002Fissues\u002F25",{"id":146,"question_zh":147,"answer_zh":148,"source_url":144},3051,"如何自动化设置 Theano 的环境变量（如路径或编译标志）？","建议使用 PowerShell 脚本代替手动设置以减少错误。例如设置 MINGW_HOME 和 PATH：\n$mingwHome = \"c:\\toolkits\\mingw-w64-5.4.0\"\n[Environment]::SetEnvironmentVariable(\"MINGW_HOME\", $mingwHome, \"Machine\")\n同时可根据是否使用 OpenBLAS 自动设置 THEANO_FLAGS 标志。",{"id":150,"question_zh":151,"answer_zh":152,"source_url":153},3052,"安装 Keras + TensorFlow-GPU 是否需要配置 OpenBLAS、MinGW 或 Visual Studio？","可以简化安装流程，无需 OpenBLAS 和 MinGW，Visual Studio 若非必需也可跳过。但 cuDNN 是强制性的，否则会出现 DLL 加载失败错误。推荐安装组合：Anaconda 3, Cuda 8.0, cuDNN 5.1, pip keras==2.0.4, pip tensorflow-gpu==1.1.0。","https:\u002F\u002Fgithub.com\u002Fphilferriere\u002Fdlwin\u002Fissues\u002F34",[]]