[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-NVlabs--stylegan2-ada-pytorch":3,"tool-NVlabs--stylegan2-ada-pytorch":61},[4,18,26,36,44,52],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":17},4358,"openclaw","openclaw\u002Fopenclaw","OpenClaw 是一款专为个人打造的本地化 AI 助手，旨在让你在自己的设备上拥有完全可控的智能伙伴。它打破了传统 AI 助手局限于特定网页或应用的束缚，能够直接接入你日常使用的各类通讯渠道，包括微信、WhatsApp、Telegram、Discord、iMessage 等数十种平台。无论你在哪个聊天软件中发送消息，OpenClaw 都能即时响应，甚至支持在 macOS、iOS 和 Android 设备上进行语音交互，并提供实时的画布渲染功能供你操控。\n\n这款工具主要解决了用户对数据隐私、响应速度以及“始终在线”体验的需求。通过将 AI 部署在本地，用户无需依赖云端服务即可享受快速、私密的智能辅助，真正实现了“你的数据，你做主”。其独特的技术亮点在于强大的网关架构，将控制平面与核心助手分离，确保跨平台通信的流畅性与扩展性。\n\nOpenClaw 非常适合希望构建个性化工作流的技术爱好者、开发者，以及注重隐私保护且不愿被单一生态绑定的普通用户。只要具备基础的终端操作能力（支持 macOS、Linux 及 Windows WSL2），即可通过简单的命令行引导完成部署。如果你渴望拥有一个懂你",349277,3,"2026-04-06T06:32:30",[13,14,15,16],"Agent","开发框架","图像","数据工具","ready",{"id":19,"name":20,"github_repo":21,"description_zh":22,"stars":23,"difficulty_score":10,"last_commit_at":24,"category_tags":25,"status":17},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,"2026-04-05T11:01:52",[14,15,13],{"id":27,"name":28,"github_repo":29,"description_zh":30,"stars":31,"difficulty_score":32,"last_commit_at":33,"category_tags":34,"status":17},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 
代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",141543,2,"2026-04-06T11:32:54",[14,13,35],"语言模型",{"id":37,"name":38,"github_repo":39,"description_zh":40,"stars":41,"difficulty_score":32,"last_commit_at":42,"category_tags":43,"status":17},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",107888,"2026-04-06T11:32:50",[14,15,13],{"id":45,"name":46,"github_repo":47,"description_zh":48,"stars":49,"difficulty_score":10,"last_commit_at":50,"category_tags":51,"status":17},4487,"LLMs-from-scratch","rasbt\u002FLLMs-from-scratch","LLMs-from-scratch 是一个基于 PyTorch 的开源教育项目，旨在引导用户从零开始一步步构建一个类似 ChatGPT 的大型语言模型（LLM）。它不仅是同名技术著作的官方代码库，更提供了一套完整的实践方案，涵盖模型开发、预训练及微调的全过程。\n\n该项目主要解决了大模型领域“黑盒化”的学习痛点。许多开发者虽能调用现成模型，却难以深入理解其内部架构与训练机制。通过亲手编写每一行核心代码，用户能够透彻掌握 Transformer 架构、注意力机制等关键原理，从而真正理解大模型是如何“思考”的。此外，项目还包含了加载大型预训练权重进行微调的代码，帮助用户将理论知识延伸至实际应用。\n\nLLMs-from-scratch 特别适合希望深入底层原理的 AI 开发者、研究人员以及计算机专业的学生。对于不满足于仅使用 
API，而是渴望探究模型构建细节的技术人员而言，这是极佳的学习资源。其独特的技术亮点在于“循序渐进”的教学设计：将复杂的系统工程拆解为清晰的步骤，配合详细的图表与示例，让构建一个虽小但功能完备的大模型变得触手可及。无论你是想夯实理论基础，还是为未来研发更大规模的模型做准备",90106,"2026-04-06T11:19:32",[35,15,13,14],{"id":53,"name":54,"github_repo":55,"description_zh":56,"stars":57,"difficulty_score":10,"last_commit_at":58,"category_tags":59,"status":17},4292,"Deep-Live-Cam","hacksider\u002FDeep-Live-Cam","Deep-Live-Cam 是一款专注于实时换脸与视频生成的开源工具，用户仅需一张静态照片，即可通过“一键操作”实现摄像头画面的即时变脸或制作深度伪造视频。它有效解决了传统换脸技术流程繁琐、对硬件配置要求极高以及难以实时预览的痛点，让高质量的数字内容创作变得触手可及。\n\n这款工具不仅适合开发者和技术研究人员探索算法边界，更因其极简的操作逻辑（仅需三步：选脸、选摄像头、启动），广泛适用于普通用户、内容创作者、设计师及直播主播。无论是为了动画角色定制、服装展示模特替换，还是制作趣味短视频和直播互动，Deep-Live-Cam 都能提供流畅的支持。\n\n其核心技术亮点在于强大的实时处理能力，支持口型遮罩（Mouth Mask）以保留使用者原始的嘴部动作，确保表情自然精准；同时具备“人脸映射”功能，可同时对画面中的多个主体应用不同面孔。此外，项目内置了严格的内容安全过滤机制，自动拦截涉及裸露、暴力等不当素材，并倡导用户在获得授权及明确标注的前提下合规使用，体现了技术发展与伦理责任的平衡。",88924,"2026-04-06T03:28:53",[14,15,13,60],"视频",{"id":62,"github_repo":63,"name":64,"description_en":65,"description_zh":66,"ai_summary_zh":67,"readme_en":68,"readme_zh":69,"quickstart_zh":70,"use_case_zh":71,"hero_image_url":72,"owner_login":73,"owner_name":74,"owner_avatar_url":75,"owner_bio":76,"owner_company":77,"owner_location":77,"owner_email":77,"owner_twitter":77,"owner_website":78,"owner_url":79,"languages":80,"stars":101,"forks":102,"last_commit_at":103,"license":104,"difficulty_score":105,"env_os":106,"env_gpu":107,"env_ram":108,"env_deps":109,"category_tags":120,"github_topics":77,"view_count":32,"oss_zip_url":77,"oss_zip_packed_at":77,"status":17,"created_at":121,"updated_at":122,"faqs":123,"releases":156},4479,"NVlabs\u002Fstylegan2-ada-pytorch","stylegan2-ada-pytorch","StyleGAN2-ADA - Official PyTorch implementation","stylegan2-ada-pytorch 是英伟达官方推出的 StyleGAN2-ADA 模型的 PyTorch 版本，专注于在数据有限的情况下训练高质量的生成对抗网络（GAN）。传统 GAN 训练往往需要海量图片，一旦数据不足，判别器容易过拟合导致训练失败。该工具通过引入“自适应判别器增强”机制，有效稳定了小数据集上的训练过程，无需修改损失函数或网络架构，仅需几千张图像即可生成媲美原版 StyleGAN2 的高清结果，甚至在某些基准测试中刷新了纪录。\n\n相比原有的 TensorFlow 版本，stylegan2-ada-pytorch 在保持结果一致性的同时，显著提升了训练与推理速度，并优化了显存占用和启动时间。它支持加载旧版模型，采用更通用的 
ZIP\u002FPNG 数据集格式，并兼容 TensorBoard 等主流工具，极大地方便了工程部署与实验复现。\n\n这款工具特别适合 AI 研究人员、深度学习开发者以及需要利用少量定制数据生成高质量图像的设计师使用。无论是从零训练还是对现有模型进行微调，stylegan2-ada-pytorch 都能帮助用户轻松突破数据瓶颈，探索人脸、动物、艺术风","stylegan2-ada-pytorch 是英伟达官方推出的 StyleGAN2-ADA 模型的 PyTorch 版本，专注于在数据有限的情况下训练高质量的生成对抗网络（GAN）。传统 GAN 训练往往需要海量图片，一旦数据不足，判别器容易过拟合导致训练失败。该工具通过引入“自适应判别器增强”机制，有效稳定了小数据集上的训练过程，无需修改损失函数或网络架构，仅需几千张图像即可生成媲美原版 StyleGAN2 的高清结果，甚至在某些基准测试中刷新了纪录。\n\n相比原有的 TensorFlow 版本，stylegan2-ada-pytorch 在保持结果一致性的同时，显著提升了训练与推理速度，并优化了显存占用和启动时间。它支持加载旧版模型，采用更通用的 ZIP\u002FPNG 数据集格式，并兼容 TensorBoard 等主流工具，极大地方便了工程部署与实验复现。\n\n这款工具特别适合 AI 研究人员、深度学习开发者以及需要利用少量定制数据生成高质量图像的设计师使用。无论是从零训练还是对现有模型进行微调，stylegan2-ada-pytorch 都能帮助用户轻松突破数据瓶颈，探索人脸、动物、艺术风格等多种领域的生成式应用。","## StyleGAN2-ADA &mdash; Official PyTorch implementation\n\n![Teaser image](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FNVlabs_stylegan2-ada-pytorch_readme_77532d2bf89e.png)\n\n**Training Generative Adversarial Networks with Limited Data**\u003Cbr>\nTero Karras, Miika Aittala, Janne Hellsten, Samuli Laine, Jaakko Lehtinen, Timo Aila\u003Cbr>\nhttps:\u002F\u002Farxiv.org\u002Fabs\u002F2006.06676\u003Cbr>\n\nAbstract: *Training generative adversarial networks (GAN) using too little data typically leads to discriminator overfitting, causing training to diverge. We propose an adaptive discriminator augmentation mechanism that significantly stabilizes training in limited data regimes. The approach does not require changes to loss functions or network architectures, and is applicable both when training from scratch and when fine-tuning an existing GAN on another dataset. We demonstrate, on several datasets, that good results are now possible using only a few thousand training images, often matching StyleGAN2 results with an order of magnitude fewer images. We expect this to open up new application domains for GANs. 
We also find that the widely used CIFAR-10 is, in fact, a limited data benchmark, and improve the record FID from 5.59 to 2.42.*\n\nFor business inquiries, please visit our website and submit the form: [NVIDIA Research Licensing](https:\u002F\u002Fwww.nvidia.com\u002Fen-us\u002Fresearch\u002Finquiries\u002F)\n\n## Release notes\n\nThis repository is a faithful reimplementation of [StyleGAN2-ADA](https:\u002F\u002Fgithub.com\u002FNVlabs\u002Fstylegan2-ada\u002F) in PyTorch, focusing on correctness, performance, and compatibility.\n\n**Correctness**\n* Full support for all primary training configurations.\n* Extensive verification of image quality, training curves, and quality metrics against the TensorFlow version.\n* Results are expected to match in all cases, excluding the effects of pseudo-random numbers and floating-point arithmetic.\n\n**Performance**\n* Training is typically 5%&ndash;30% faster compared to the TensorFlow version on NVIDIA Tesla V100 GPUs.\n* Inference is up to 35% faster in high resolutions, but it may be slightly slower in low resolutions.\n* GPU memory usage is comparable to the TensorFlow version.\n* Faster startup time when training new networks (\u003C50s), and also when using pre-trained networks (\u003C4s).\n* New command line options for tweaking the training performance.\n\n**Compatibility**\n* Compatible with old network pickles created using the TensorFlow version.\n* New ZIP\u002FPNG based dataset format for maximal interoperability with existing 3rd party tools.\n* TFRecords datasets are no longer supported &mdash; they need to be converted to the new format.\n* New JSON-based format for logs, metrics, and training curves.\n* Training curves are also exported in the old TFEvents format if TensorBoard is installed.\n* Command line syntax is mostly unchanged, with a few exceptions (e.g., `dataset_tool.py`).\n* Comparison methods are not supported (`--cmethod`, `--dcap`, `--cfg=cifarbaseline`, `--aug=adarv`)\n* **Truncation is now 
disabled by default.**\n\n## Data repository\n\n| Path | Description\n| :--- | :----------\n| [stylegan2-ada-pytorch](https:\u002F\u002Fnvlabs-fi-cdn.nvidia.com\u002Fstylegan2-ada-pytorch\u002F) | Main directory hosted on Amazon S3\n| &ensp;&ensp;&boxvr;&nbsp; [ada-paper.pdf](https:\u002F\u002Fnvlabs-fi-cdn.nvidia.com\u002Fstylegan2-ada-pytorch\u002Fada-paper.pdf) | Paper PDF\n| &ensp;&ensp;&boxvr;&nbsp; [images](https:\u002F\u002Fnvlabs-fi-cdn.nvidia.com\u002Fstylegan2-ada-pytorch\u002Fimages\u002F) | Curated example images produced using the pre-trained models\n| &ensp;&ensp;&boxvr;&nbsp; [videos](https:\u002F\u002Fnvlabs-fi-cdn.nvidia.com\u002Fstylegan2-ada-pytorch\u002Fvideos\u002F) | Curated example interpolation videos\n| &ensp;&ensp;&boxur;&nbsp; [pretrained](https:\u002F\u002Fnvlabs-fi-cdn.nvidia.com\u002Fstylegan2-ada-pytorch\u002Fpretrained\u002F) | Pre-trained models\n| &ensp;&ensp;&ensp;&ensp;&boxvr;&nbsp; ffhq.pkl | FFHQ at 1024x1024, trained using original StyleGAN2\n| &ensp;&ensp;&ensp;&ensp;&boxvr;&nbsp; metfaces.pkl | MetFaces at 1024x1024, transfer learning from FFHQ using ADA\n| &ensp;&ensp;&ensp;&ensp;&boxvr;&nbsp; afhqcat.pkl | AFHQ Cat at 512x512, trained from scratch using ADA\n| &ensp;&ensp;&ensp;&ensp;&boxvr;&nbsp; afhqdog.pkl | AFHQ Dog at 512x512, trained from scratch using ADA\n| &ensp;&ensp;&ensp;&ensp;&boxvr;&nbsp; afhqwild.pkl | AFHQ Wild at 512x512, trained from scratch using ADA\n| &ensp;&ensp;&ensp;&ensp;&boxvr;&nbsp; cifar10.pkl | Class-conditional CIFAR-10 at 32x32\n| &ensp;&ensp;&ensp;&ensp;&boxvr;&nbsp; brecahad.pkl | BreCaHAD at 512x512, trained from scratch using ADA\n| &ensp;&ensp;&ensp;&ensp;&boxvr;&nbsp; [paper-fig7c-training-set-sweeps](https:\u002F\u002Fnvlabs-fi-cdn.nvidia.com\u002Fstylegan2-ada-pytorch\u002Fpretrained\u002Fpaper-fig7c-training-set-sweeps\u002F) | Models used in Fig.7c (sweep over training set size)\n| &ensp;&ensp;&ensp;&ensp;&boxvr;&nbsp; 
[paper-fig11a-small-datasets](https:\u002F\u002Fnvlabs-fi-cdn.nvidia.com\u002Fstylegan2-ada-pytorch\u002Fpretrained\u002Fpaper-fig11a-small-datasets\u002F) | Models used in Fig.11a (small datasets & transfer learning)\n| &ensp;&ensp;&ensp;&ensp;&boxvr;&nbsp; [paper-fig11b-cifar10](https:\u002F\u002Fnvlabs-fi-cdn.nvidia.com\u002Fstylegan2-ada-pytorch\u002Fpretrained\u002Fpaper-fig11b-cifar10\u002F) | Models used in Fig.11b (CIFAR-10)\n| &ensp;&ensp;&ensp;&ensp;&boxvr;&nbsp; [transfer-learning-source-nets](https:\u002F\u002Fnvlabs-fi-cdn.nvidia.com\u002Fstylegan2-ada-pytorch\u002Fpretrained\u002Ftransfer-learning-source-nets\u002F) | Models used as starting point for transfer learning\n| &ensp;&ensp;&ensp;&ensp;&boxur;&nbsp; [metrics](https:\u002F\u002Fnvlabs-fi-cdn.nvidia.com\u002Fstylegan2-ada-pytorch\u002Fpretrained\u002Fmetrics\u002F) | Feature detectors used by the quality metrics\n\n## Requirements\n\n* Linux and Windows are supported, but we recommend Linux for performance and compatibility reasons.\n* 1&ndash;8 high-end NVIDIA GPUs with at least 12 GB of memory. We have done all testing and development using NVIDIA DGX-1 with 8 Tesla V100 GPUs.\n* 64-bit Python 3.7 and PyTorch 1.7.1. See [https:\u002F\u002Fpytorch.org\u002F](https:\u002F\u002Fpytorch.org\u002F) for PyTorch install instructions.\n* CUDA toolkit 11.0 or later.  Use at least version 11.1 if running on RTX 3090.  (Why is a separate CUDA toolkit installation required?  See comments in [#2](https:\u002F\u002Fgithub.com\u002FNVlabs\u002Fstylegan2-ada-pytorch\u002Fissues\u002F2#issuecomment-779457121).)\n* Python libraries: `pip install click requests tqdm pyspng ninja imageio-ffmpeg==0.4.3`.  We use the Anaconda3 2020.11 distribution which installs most of these by default.\n* Docker users: use the [provided Dockerfile](.\u002FDockerfile) to build an image with the required library dependencies.\n\nThe code relies heavily on custom PyTorch extensions that are compiled on the fly using NVCC. 
On Windows, the compilation requires Microsoft Visual Studio. We recommend installing [Visual Studio Community Edition](https:\u002F\u002Fvisualstudio.microsoft.com\u002Fvs\u002F) and adding it into `PATH` using `\"C:\\Program Files (x86)\\Microsoft Visual Studio\\\u003CVERSION>\\Community\\VC\\Auxiliary\\Build\\vcvars64.bat\"`.\n\n## Getting started\n\nPre-trained networks are stored as `*.pkl` files that can be referenced using local filenames or URLs:\n\n```.bash\n# Generate curated MetFaces images without truncation (Fig.10 left)\npython generate.py --outdir=out --trunc=1 --seeds=85,265,297,849 \\\n    --network=https:\u002F\u002Fnvlabs-fi-cdn.nvidia.com\u002Fstylegan2-ada-pytorch\u002Fpretrained\u002Fmetfaces.pkl\n\n# Generate uncurated MetFaces images with truncation (Fig.12 upper left)\npython generate.py --outdir=out --trunc=0.7 --seeds=600-605 \\\n    --network=https:\u002F\u002Fnvlabs-fi-cdn.nvidia.com\u002Fstylegan2-ada-pytorch\u002Fpretrained\u002Fmetfaces.pkl\n\n# Generate class conditional CIFAR-10 images (Fig.17 left, Car)\npython generate.py --outdir=out --seeds=0-35 --class=1 \\\n    --network=https:\u002F\u002Fnvlabs-fi-cdn.nvidia.com\u002Fstylegan2-ada-pytorch\u002Fpretrained\u002Fcifar10.pkl\n\n# Style mixing example\npython style_mixing.py --outdir=out --rows=85,100,75,458,1500 --cols=55,821,1789,293 \\\n    --network=https:\u002F\u002Fnvlabs-fi-cdn.nvidia.com\u002Fstylegan2-ada-pytorch\u002Fpretrained\u002Fmetfaces.pkl\n```\n\nOutputs from the above commands are placed under `out\u002F*.png`, controlled by `--outdir`. Downloaded network pickles are cached under `$HOME\u002F.cache\u002Fdnnlib`, which can be overridden by setting the `DNNLIB_CACHE_DIR` environment variable. 
The default PyTorch extension build directory is `$HOME\u002F.cache\u002Ftorch_extensions`, which can be overridden by setting `TORCH_EXTENSIONS_DIR`.\n\n**Docker**: You can run the above curated image example using Docker as follows:\n\n```.bash\ndocker build --tag sg2ada:latest .\n.\u002Fdocker_run.sh python3 generate.py --outdir=out --trunc=1 --seeds=85,265,297,849 \\\n    --network=https:\u002F\u002Fnvlabs-fi-cdn.nvidia.com\u002Fstylegan2-ada-pytorch\u002Fpretrained\u002Fmetfaces.pkl\n```\n\nNote: The Docker image requires NVIDIA driver release `r455.23` or later.\n\n**Legacy networks**: The above commands can load most of the network pickles created using the previous TensorFlow versions of StyleGAN2 and StyleGAN2-ADA. However, for future compatibility, we recommend converting such legacy pickles into the new format used by the PyTorch version:\n\n```.bash\npython legacy.py \\\n    --source=https:\u002F\u002Fnvlabs-fi-cdn.nvidia.com\u002Fstylegan2\u002Fnetworks\u002Fstylegan2-cat-config-f.pkl \\\n    --dest=stylegan2-cat-config-f.pkl\n```\n\n## Projecting images to latent space\n\nTo find the matching latent vector for a given image file, run:\n\n```.bash\npython projector.py --outdir=out --target=~\u002Fmytargetimg.png \\\n    --network=https:\u002F\u002Fnvlabs-fi-cdn.nvidia.com\u002Fstylegan2-ada-pytorch\u002Fpretrained\u002Fffhq.pkl\n```\n\nFor optimal results, the target image should be cropped and aligned similar to the [FFHQ dataset](https:\u002F\u002Fgithub.com\u002FNVlabs\u002Fffhq-dataset). The above command saves the projection target `out\u002Ftarget.png`, result `out\u002Fproj.png`, latent vector `out\u002Fprojected_w.npz`, and progression video `out\u002Fproj.mp4`. 
You can render the resulting latent vector by specifying `--projected_w` for `generate.py`:\n\n```.bash\npython generate.py --outdir=out --projected_w=out\u002Fprojected_w.npz \\\n    --network=https:\u002F\u002Fnvlabs-fi-cdn.nvidia.com\u002Fstylegan2-ada-pytorch\u002Fpretrained\u002Fffhq.pkl\n```\n\n## Using networks from Python\n\nYou can use pre-trained networks in your own Python code as follows:\n\n```.python\nwith open('ffhq.pkl', 'rb') as f:\n    G = pickle.load(f)['G_ema'].cuda()  # torch.nn.Module\nz = torch.randn([1, G.z_dim]).cuda()    # latent codes\nc = None                                # class labels (not used in this example)\nimg = G(z, c)                           # NCHW, float32, dynamic range [-1, +1]\n```\n\nThe above code requires `torch_utils` and `dnnlib` to be accessible via `PYTHONPATH`. It does not need source code for the networks themselves &mdash; their class definitions are loaded from the pickle via `torch_utils.persistence`.\n\nThe pickle contains three networks. `'G'` and `'D'` are instantaneous snapshots taken during training, and `'G_ema'` represents a moving average of the generator weights over several training steps. The networks are regular instances of `torch.nn.Module`, with all of their parameters and buffers placed on the CPU at import and gradient computation disabled by default.\n\nThe generator consists of two submodules, `G.mapping` and `G.synthesis`, that can be executed separately. 
They also support various additional options:\n\n```.python\nw = G.mapping(z, c, truncation_psi=0.5, truncation_cutoff=8)\nimg = G.synthesis(w, noise_mode='const', force_fp32=True)\n```\n\nPlease refer to [`generate.py`](.\u002Fgenerate.py), [`style_mixing.py`](.\u002Fstyle_mixing.py), and [`projector.py`](.\u002Fprojector.py) for further examples.\n\n## Preparing datasets\n\nDatasets are stored as uncompressed ZIP archives containing uncompressed PNG files and a metadata file `dataset.json` for labels.\n\nCustom datasets can be created from a folder containing images; see [`python dataset_tool.py --help`](.\u002Fdocs\u002Fdataset-tool-help.txt) for more information. Alternatively, the folder can also be used directly as a dataset, without running it through `dataset_tool.py` first, but doing so may lead to suboptimal performance.\n\nLegacy TFRecords datasets are not supported &mdash; see below for instructions on how to convert them.\n\n**FFHQ**:\n\nStep 1: Download the [Flickr-Faces-HQ dataset](https:\u002F\u002Fgithub.com\u002FNVlabs\u002Fffhq-dataset) as TFRecords.\n\nStep 2: Extract images from TFRecords using `dataset_tool.py` from the [TensorFlow version of StyleGAN2-ADA](https:\u002F\u002Fgithub.com\u002FNVlabs\u002Fstylegan2-ada\u002F):\n\n```.bash\n# Using dataset_tool.py from TensorFlow version at\n# https:\u002F\u002Fgithub.com\u002FNVlabs\u002Fstylegan2-ada\u002F\npython ..\u002Fstylegan2-ada\u002Fdataset_tool.py unpack \\\n    --tfrecord_dir=~\u002Fffhq-dataset\u002Ftfrecords\u002Fffhq --output_dir=\u002Ftmp\u002Fffhq-unpacked\n```\n\nStep 3: Create ZIP archive using `dataset_tool.py` from this repository:\n\n```.bash\n# Original 1024x1024 resolution.\npython dataset_tool.py --source=\u002Ftmp\u002Fffhq-unpacked --dest=~\u002Fdatasets\u002Fffhq.zip\n\n# Scaled down 256x256 resolution.\n#\n# Note: --resize-filter=box is required to reproduce FID scores shown in the\n# paper.  
If you don't need to match exactly, it's better to leave this out\n# and default to Lanczos.  See https:\u002F\u002Fgithub.com\u002FNVlabs\u002Fstylegan2-ada-pytorch\u002Fissues\u002F283#issuecomment-1731217782\npython dataset_tool.py --source=\u002Ftmp\u002Fffhq-unpacked --dest=~\u002Fdatasets\u002Fffhq256x256.zip \\\n    --width=256 --height=256 --resize-filter=box\n```\n\n**MetFaces**: Download the [MetFaces dataset](https:\u002F\u002Fgithub.com\u002FNVlabs\u002Fmetfaces-dataset) and create ZIP archive:\n\n```.bash\npython dataset_tool.py --source=~\u002Fdownloads\u002Fmetfaces\u002Fimages --dest=~\u002Fdatasets\u002Fmetfaces.zip\n```\n\n**AFHQ**: Download the [AFHQ dataset](https:\u002F\u002Fgithub.com\u002Fclovaai\u002Fstargan-v2\u002Fblob\u002Fmaster\u002FREADME.md#animal-faces-hq-dataset-afhq) and create ZIP archive:\n\n```.bash\npython dataset_tool.py --source=~\u002Fdownloads\u002Fafhq\u002Ftrain\u002Fcat --dest=~\u002Fdatasets\u002Fafhqcat.zip\npython dataset_tool.py --source=~\u002Fdownloads\u002Fafhq\u002Ftrain\u002Fdog --dest=~\u002Fdatasets\u002Fafhqdog.zip\npython dataset_tool.py --source=~\u002Fdownloads\u002Fafhq\u002Ftrain\u002Fwild --dest=~\u002Fdatasets\u002Fafhqwild.zip\n```\n\n**CIFAR-10**: Download the [CIFAR-10 python version](https:\u002F\u002Fwww.cs.toronto.edu\u002F~kriz\u002Fcifar.html) and convert to ZIP archive:\n\n```.bash\npython dataset_tool.py --source=~\u002Fdownloads\u002Fcifar-10-python.tar.gz --dest=~\u002Fdatasets\u002Fcifar10.zip\n```\n\n**LSUN**: Download the desired categories from the [LSUN project page](https:\u002F\u002Fwww.yf.io\u002Fp\u002Flsun\u002F) and convert to ZIP archive:\n\n```.bash\npython dataset_tool.py --source=~\u002Fdownloads\u002Flsun\u002Fraw\u002Fcat_lmdb --dest=~\u002Fdatasets\u002Flsuncat200k.zip \\\n    --transform=center-crop --width=256 --height=256 --max_images=200000\n\npython dataset_tool.py --source=~\u002Fdownloads\u002Flsun\u002Fraw\u002Fcar_lmdb --dest=~\u002Fdatasets\u002Flsuncar200k.zip 
\\\n    --transform=center-crop-wide --width=512 --height=384 --max_images=200000\n```\n\n**BreCaHAD**:\n\nStep 1: Download the [BreCaHAD dataset](https:\u002F\u002Ffigshare.com\u002Farticles\u002FBreCaHAD_A_Dataset_for_Breast_Cancer_Histopathological_Annotation_and_Diagnosis\u002F7379186).\n\nStep 2: Extract 512x512 resolution crops using `dataset_tool.py` from the [TensorFlow version of StyleGAN2-ADA](https:\u002F\u002Fgithub.com\u002FNVlabs\u002Fstylegan2-ada\u002F):\n\n```.bash\n# Using dataset_tool.py from TensorFlow version at\n# https:\u002F\u002Fgithub.com\u002FNVlabs\u002Fstylegan2-ada\u002F\npython dataset_tool.py extract_brecahad_crops --cropsize=512 \\\n    --output_dir=\u002Ftmp\u002Fbrecahad-crops --brecahad_dir=~\u002Fdownloads\u002Fbrecahad\u002Fimages\n```\n\nStep 3: Create ZIP archive using `dataset_tool.py` from this repository:\n\n```.bash\npython dataset_tool.py --source=\u002Ftmp\u002Fbrecahad-crops --dest=~\u002Fdatasets\u002Fbrecahad.zip\n```\n\n## Training new networks\n\nIn its most basic form, training new networks boils down to:\n\n```.bash\npython train.py --outdir=~\u002Ftraining-runs --data=~\u002Fmydataset.zip --gpus=1 --dry-run\npython train.py --outdir=~\u002Ftraining-runs --data=~\u002Fmydataset.zip --gpus=1\n```\n\nThe first command is optional; it validates the arguments, prints out the training configuration, and exits. The second command kicks off the actual training.\n\nIn this example, the results are saved to a newly created directory `~\u002Ftraining-runs\u002F\u003CID>-mydataset-auto1`, controlled by `--outdir`. The training exports network pickles (`network-snapshot-\u003CINT>.pkl`) and example images (`fakes\u003CINT>.png`) at regular intervals (controlled by `--snap`). For each pickle, it also evaluates FID (controlled by `--metrics`) and logs the resulting scores in `metric-fid50k_full.jsonl` (as well as TFEvents if TensorBoard is installed).\n\nThe name of the output directory reflects the training configuration. 
For example, `00000-mydataset-auto1` indicates that the *base configuration* was `auto1`, meaning that the hyperparameters were selected automatically for training on one GPU. The base configuration is controlled by `--cfg`:\n\n| Base config           | Description\n| :-------------------- | :----------\n| `auto`&nbsp;(default) | Automatically select reasonable defaults based on resolution and GPU count. Serves as a good starting point for new datasets but does not necessarily lead to optimal results.\n| `stylegan2`           | Reproduce results for StyleGAN2 config F at 1024x1024 using 1, 2, 4, or 8 GPUs.\n| `paper256`            | Reproduce results for FFHQ and LSUN Cat at 256x256 using 1, 2, 4, or 8 GPUs.\n| `paper512`            | Reproduce results for BreCaHAD and AFHQ at 512x512 using 1, 2, 4, or 8 GPUs.\n| `paper1024`           | Reproduce results for MetFaces at 1024x1024 using 1, 2, 4, or 8 GPUs.\n| `cifar`               | Reproduce results for CIFAR-10 (tuned configuration) using 1 or 2 GPUs.\n\nThe training configuration can be further customized with additional command line options:\n\n* `--aug=noaug` disables ADA.\n* `--cond=1` enables class-conditional training (requires a dataset with labels).\n* `--mirror=1` amplifies the dataset with x-flips. Often beneficial, even with ADA.\n* `--resume=ffhq1024 --snap=10` performs transfer learning from FFHQ trained at 1024x1024.\n* `--resume=~\u002Ftraining-runs\u002F\u003CNAME>\u002Fnetwork-snapshot-\u003CINT>.pkl` resumes a previous training run.\n* `--gamma=10` overrides R1 gamma. 
We recommend trying a couple of different values for each new dataset.\n* `--aug=ada --target=0.7` adjusts ADA target value (default: 0.6).\n* `--augpipe=blit` enables pixel blitting but disables all other augmentations.\n* `--augpipe=bgcfnc` enables all available augmentations (blit, geom, color, filter, noise, cutout).\n\nPlease refer to [`python train.py --help`](.\u002Fdocs\u002Ftrain-help.txt) for the full list.\n\n## Expected training time\n\nThe total training time depends heavily on resolution, number of GPUs, dataset, desired quality, and hyperparameters. The following table lists expected wallclock times to reach different points in the training, measured in thousands of real images shown to the discriminator (\"kimg\"):\n\n| Resolution | GPUs | 1000 kimg | 25000 kimg | sec\u002Fkimg          | GPU mem | CPU mem\n| :--------: | :--: | :-------: | :--------: | :---------------: | :-----: | :-----:\n| 128x128    | 1    | 4h 05m    | 4d 06h     | 12.8&ndash;13.7   | 7.2 GB  | 3.9 GB\n| 128x128    | 2    | 2h 06m    | 2d 04h     | 6.5&ndash;6.8     | 7.4 GB  | 7.9 GB\n| 128x128    | 4    | 1h 20m    | 1d 09h     | 4.1&ndash;4.6     | 4.2 GB  | 16.3 GB\n| 128x128    | 8    | 1h 13m    | 1d 06h     | 3.9&ndash;4.9     | 2.6 GB  | 31.9 GB\n| 256x256    | 1    | 6h 36m    | 6d 21h     | 21.6&ndash;24.2   | 5.0 GB  | 4.5 GB\n| 256x256    | 2    | 3h 27m    | 3d 14h     | 11.2&ndash;11.8   | 5.2 GB  | 9.0 GB\n| 256x256    | 4    | 1h 45m    | 1d 20h     | 5.6&ndash;5.9     | 5.2 GB  | 17.8 GB\n| 256x256    | 8    | 1h 24m    | 1d 11h     | 4.4&ndash;5.5     | 3.2 GB  | 34.7 GB\n| 512x512    | 1    | 21h 03m   | 21d 22h    | 72.5&ndash;74.9   | 7.6 GB  | 5.0 GB\n| 512x512    | 2    | 10h 59m   | 11d 10h    | 37.7&ndash;40.0   | 7.8 GB  | 9.8 GB\n| 512x512    | 4    | 5h 29m    | 5d 17h     | 18.7&ndash;19.1   | 7.9 GB  | 17.7 GB\n| 512x512    | 8    | 2h 48m    | 2d 22h     | 9.5&ndash;9.7     | 7.8 GB  | 38.2 GB\n| 1024x1024  | 1    | 1d 20h    | 46d 03h    | 
154.3&ndash;161.6 | 8.1 GB  | 5.3 GB\n| 1024x1024  | 2    | 23h 09m   | 24d 02h    | 80.6&ndash;86.2   | 8.6 GB  | 11.9 GB\n| 1024x1024  | 4    | 11h 36m   | 12d 02h    | 40.1&ndash;40.8   | 8.4 GB  | 21.9 GB\n| 1024x1024  | 8    | 5h 54m    | 6d 03h     | 20.2&ndash;20.6   | 8.3 GB  | 44.7 GB\n\nThe above measurements were done using NVIDIA Tesla V100 GPUs with default settings (`--cfg=auto --aug=ada --metrics=fid50k_full`). \"sec\u002Fkimg\" shows the expected range of variation in raw training performance, as reported in `log.txt`. \"GPU mem\" and \"CPU mem\" show the highest observed memory consumption, excluding the peak at the beginning caused by `torch.backends.cudnn.benchmark`.\n\nIn typical cases, 25000 kimg or more is needed to reach convergence, but the results are already quite reasonable around 5000 kimg. 1000 kimg is often enough for transfer learning, which tends to converge significantly faster. The following figure shows example convergence curves for different datasets as a function of wallclock time, using the same settings as above:\n\n![Training curves](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FNVlabs_stylegan2-ada-pytorch_readme_41c02ca6c630.png)\n\nNote: `--cfg=auto` serves as a reasonable first guess for the hyperparameters but it does not necessarily lead to optimal results for a given dataset. For example, `--cfg=stylegan2` yields considerably better FID  for FFHQ-140k at 1024x1024 than illustrated above. We recommend trying out at least a few different values of `--gamma` for each new dataset.\n\n## Quality metrics\n\nBy default, `train.py` automatically computes FID for each network pickle exported during training. We recommend inspecting `metric-fid50k_full.jsonl` (or TensorBoard) at regular intervals to monitor the training progress. 
When desired, the automatic computation can be disabled with `--metrics=none` to speed up the training slightly (3%&ndash;9%).\n\nAdditional quality metrics can also be computed after the training:\n\n```.bash\n# Previous training run: look up options automatically, save result to JSONL file.\npython calc_metrics.py --metrics=pr50k3_full \\\n    --network=~\u002Ftraining-runs\u002F00000-ffhq10k-res64-auto1\u002Fnetwork-snapshot-000000.pkl\n\n# Pre-trained network pickle: specify dataset explicitly, print result to stdout.\npython calc_metrics.py --metrics=fid50k_full --data=~\u002Fdatasets\u002Fffhq.zip --mirror=1 \\\n    --network=https:\u002F\u002Fnvlabs-fi-cdn.nvidia.com\u002Fstylegan2-ada-pytorch\u002Fpretrained\u002Fffhq.pkl\n```\n\nThe first example looks up the training configuration and performs the same operation as if `--metrics=pr50k3_full` had been specified during training. The second example downloads a pre-trained network pickle, in which case the values of `--mirror` and `--data` must be specified explicitly.\n\nNote that many of the metrics have a significant one-off cost when calculating them for the first time for a new dataset (up to 30min). Also note that the evaluation is done using a different random seed each time, so the results will vary if the same metric is computed multiple times.\n\nWe employ the following metrics in the ADA paper. 
Execution time and GPU memory usage are reported for one NVIDIA Tesla V100 GPU at 1024x1024 resolution:\n\n| Metric        | Time   | GPU mem | Description |\n| :-----        | :----: | :-----: | :---------- |\n| `fid50k_full` | 13 min | 1.8 GB  | Fr&eacute;chet inception distance\u003Csup>[1]\u003C\u002Fsup> against the full dataset\n| `kid50k_full` | 13 min | 1.8 GB  | Kernel inception distance\u003Csup>[2]\u003C\u002Fsup> against the full dataset\n| `pr50k3_full` | 13 min | 4.1 GB  | Precision and recall\u003Csup>[3]\u003C\u002Fsup> against the full dataset\n| `is50k`       | 13 min | 1.8 GB  | Inception score\u003Csup>[4]\u003C\u002Fsup> for CIFAR-10\n\nIn addition, the following metrics from the [StyleGAN](https:\u002F\u002Fgithub.com\u002FNVlabs\u002Fstylegan) and [StyleGAN2](https:\u002F\u002Fgithub.com\u002FNVlabs\u002Fstylegan2) papers are also supported:\n\n| Metric        | Time   | GPU mem | Description |\n| :------------ | :----: | :-----: | :---------- |\n| `fid50k`      | 13 min | 1.8 GB  | Fr&eacute;chet inception distance against 50k real images\n| `kid50k`      | 13 min | 1.8 GB  | Kernel inception distance against 50k real images\n| `pr50k3`      | 13 min | 4.1 GB  | Precision and recall against 50k real images\n| `ppl2_wend`   | 36 min | 2.4 GB  | Perceptual path length\u003Csup>[5]\u003C\u002Fsup> in W, endpoints, full image\n| `ppl_zfull`   | 36 min | 2.4 GB  | Perceptual path length in Z, full paths, cropped image\n| `ppl_wfull`   | 36 min | 2.4 GB  | Perceptual path length in W, full paths, cropped image\n| `ppl_zend`    | 36 min | 2.4 GB  | Perceptual path length in Z, endpoints, cropped image\n| `ppl_wend`    | 36 min | 2.4 GB  | Perceptual path length in W, endpoints, cropped image\n\nReferences:\n1. [GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium](https:\u002F\u002Farxiv.org\u002Fabs\u002F1706.08500), Heusel et al. 2017\n2. 
[Demystifying MMD GANs](https:\u002F\u002Farxiv.org\u002Fabs\u002F1801.01401), Bi&nacute;kowski et al. 2018\n3. [Improved Precision and Recall Metric for Assessing Generative Models](https:\u002F\u002Farxiv.org\u002Fabs\u002F1904.06991), Kynk&auml;&auml;nniemi et al. 2019\n4. [Improved Techniques for Training GANs](https:\u002F\u002Farxiv.org\u002Fabs\u002F1606.03498), Salimans et al. 2016\n5. [A Style-Based Generator Architecture for Generative Adversarial Networks](https:\u002F\u002Farxiv.org\u002Fabs\u002F1812.04948), Karras et al. 2018\n\n## License\n\nCopyright &copy; 2021, NVIDIA Corporation. All rights reserved.\n\nThis work is made available under the [Nvidia Source Code License](https:\u002F\u002Fnvlabs.github.io\u002Fstylegan2-ada-pytorch\u002Flicense.html).\n\n## Citation\n\n```\n@inproceedings{Karras2020ada,\n  title     = {Training Generative Adversarial Networks with Limited Data},\n  author    = {Tero Karras and Miika Aittala and Janne Hellsten and Samuli Laine and Jaakko Lehtinen and Timo Aila},\n  booktitle = {Proc. NeurIPS},\n  year      = {2020}\n}\n```\n\n## Development\n\nThis is a research reference implementation and is treated as a one-time code drop. 
As such, we do not accept outside code contributions in the form of pull requests.\n\n## Acknowledgements\n\nWe thank David Luebke for helpful comments; Tero Kuosmanen and Sabu Nadarajan for their support with compute infrastructure; and Edgar Sch&ouml;nfeld for guidance on setting up unconditional BigGAN.\n","## StyleGAN2-ADA — 官方 PyTorch 实现\n\n![预告图](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FNVlabs_stylegan2-ada-pytorch_readme_77532d2bf89e.png)\n\n**在数据有限的情况下训练生成对抗网络**\u003Cbr>\n特罗·卡拉斯、米卡·艾塔拉、扬内·赫尔斯特恩、萨穆利·莱内、雅科·莱蒂宁、蒂莫·艾拉\u003Cbr>\nhttps:\u002F\u002Farxiv.org\u002Fabs\u002F2006.06676\u003Cbr>\n\n摘要：*使用过少的数据训练生成对抗网络（GAN）通常会导致判别器过拟合，从而使得训练过程发散。我们提出了一种自适应的判别器增强机制，能够在数据量有限的情况下显著稳定训练过程。该方法无需修改损失函数或网络架构，既适用于从头开始训练，也适用于在另一个数据集上对现有 GAN 进行微调。我们在多个数据集上证明，现在仅需几千张训练图像就能获得良好的效果，甚至常常以比 StyleGAN2 少一个数量级的图像数量达到相似的效果。我们预计这将为 GAN 开辟新的应用领域。此外，我们还发现广泛使用的 CIFAR-10 实际上是一个数据量有限的基准测试，并将 FID 记录从 5.59 提升至 2.42。*\n\n如需商务合作，请访问我们的官网并提交表格：[NVIDIA Research Licensing](https:\u002F\u002Fwww.nvidia.com\u002Fen-us\u002Fresearch\u002Finquiries\u002F)\n\n## 发布说明\n\n本仓库是对 [StyleGAN2-ADA](https:\u002F\u002Fgithub.com\u002FNVlabs\u002Fstylegan2-ada\u002F) 在 PyTorch 中的忠实重实现，重点在于正确性、性能和兼容性。\n\n**正确性**\n* 完全支持所有主要的训练配置。\n* 对图像质量、训练曲线和质量指标进行了与 TensorFlow 版本的广泛验证。\n* 预计结果在所有情况下都一致，但伪随机数和浮点运算的影响除外。\n\n**性能**\n* 在 NVIDIA Tesla V100 GPU 上，训练速度通常比 TensorFlow 版本快 5%–30%。\n* 高分辨率下的推理速度可提升至 35%，但在低分辨率下可能会稍慢。\n* GPU 内存占用与 TensorFlow 版本相当。\n* 新建网络的启动时间更快（不到 50 秒），使用预训练网络时也只需不到 4 秒。\n* 新增了用于调整训练性能的命令行选项。\n\n**兼容性**\n* 兼容使用 TensorFlow 版本创建的旧网络文件。\n* 新的基于 ZIP\u002FPNG 的数据集格式，旨在实现与现有第三方工具的最大互操作性。\n* 不再支持 TFRecords 数据集——需要将其转换为新格式。\n* 日志、指标和训练曲线采用新的 JSON 格式。\n* 如果安装了 TensorBoard，则会同时以旧的 TFEvents 格式导出训练曲线。\n* 命令行语法基本保持不变，少数例外（例如 `dataset_tool.py`）。\n* 不再支持比较方法（`--cmethod`、`--dcap`、`--cfg=cifarbaseline`、`--aug=adarv`）。\n* **截断功能现已默认关闭。**\n\n## 数据仓库\n\n| 路径 | 描述\n| :--- | :----------\n| [stylegan2-ada-pytorch](https:\u002F\u002Fnvlabs-fi-cdn.nvidia.com\u002Fstylegan2-ada-pytorch\u002F) | 主目录托管在 Amazon S3 上\n| 
&ensp;&ensp;&boxvr;&nbsp; [ada-paper.pdf](https:\u002F\u002Fnvlabs-fi-cdn.nvidia.com\u002Fstylegan2-ada-pytorch\u002Fada-paper.pdf) | 论文 PDF\n| &ensp;&ensp;&boxvr;&nbsp; [images](https:\u002F\u002Fnvlabs-fi-cdn.nvidia.com\u002Fstylegan2-ada-pytorch\u002Fimages\u002F) | 使用预训练模型生成的精选示例图像\n| &ensp;&ensp;&boxvr;&nbsp; [videos](https:\u002F\u002Fnvlabs-fi-cdn.nvidia.com\u002Fstylegan2-ada-pytorch\u002Fvideos\u002F) | 精选的插值视频示例\n| &ensp;&ensp;&boxur;&nbsp; [pretrained](https:\u002F\u002Fnvlabs-fi-cdn.nvidia.com\u002Fstylegan2-ada-pytorch\u002Fpretrained\u002F) | 预训练模型\n| &ensp;&ensp;&ensp;&ensp;&boxvr;&nbsp; ffhq.pkl | FFHQ，分辨率为 1024×1024，使用原始 StyleGAN2 训练\n| &ensp;&ensp;&ensp;&ensp;&boxvr;&nbsp; metfaces.pkl | MetFaces，分辨率为 1024×1024，基于 FFHQ 使用 ADA 进行迁移学习\n| &ensp;&ensp;&ensp;&ensp;&boxvr;&nbsp; afhqcat.pkl | AFHQ Cat，分辨率为 512×512，使用 ADA 从零开始训练\n| &ensp;&ensp;&ensp;&ensp;&boxvr;&nbsp; afhqdog.pkl | AFHQ Dog，分辨率为 512×512，使用 ADA 从零开始训练\n| &ensp;&ensp;&ensp;&ensp;&boxvr;&nbsp; afhqwild.pkl | AFHQ Wild，分辨率为 512×512，使用 ADA 从零开始训练\n| &ensp;&ensp;&ensp;&ensp;&boxvr;&nbsp; cifar10.pkl | 类别条件 CIFAR-10，分辨率为 32×32\n| &ensp;&ensp;&ensp;&ensp;&boxvr;&nbsp; brecahad.pkl | BreCaHAD，分辨率为 512×512，使用 ADA 从零开始训练\n| &ensp;&ensp;&ensp;&ensp;&boxvr;&nbsp; [paper-fig7c-training-set-sweeps](https:\u002F\u002Fnvlabs-fi-cdn.nvidia.com\u002Fstylegan2-ada-pytorch\u002Fpretrained\u002Fpaper-fig7c-training-set-sweeps\u002F) | 图 7c 中使用的模型（训练集规模扫描）\n| &ensp;&ensp;&ensp;&ensp;&boxvr;&nbsp; [paper-fig11a-small-datasets](https:\u002F\u002Fnvlabs-fi-cdn.nvidia.com\u002Fstylegan2-ada-pytorch\u002Fpretrained\u002Fpaper-fig11a-small-datasets\u002F) | 图 11a 中使用的模型（小数据集与迁移学习）\n| &ensp;&ensp;&ensp;&ensp;&boxvr;&nbsp; [paper-fig11b-cifar10](https:\u002F\u002Fnvlabs-fi-cdn.nvidia.com\u002Fstylegan2-ada-pytorch\u002Fpretrained\u002Fpaper-fig11b-cifar10\u002F) | 图 11b 中使用的模型（CIFAR-10）\n| &ensp;&ensp;&ensp;&ensp;&boxvr;&nbsp; 
[transfer-learning-source-nets](https:\u002F\u002Fnvlabs-fi-cdn.nvidia.com\u002Fstylegan2-ada-pytorch\u002Fpretrained\u002Ftransfer-learning-source-nets\u002F) | 作为迁移学习起点的模型\n| &ensp;&ensp;&ensp;&ensp;&boxur;&nbsp; [metrics](https:\u002F\u002Fnvlabs-fi-cdn.nvidia.com\u002Fstylegan2-ada-pytorch\u002Fpretrained\u002Fmetrics\u002F) | 质量指标所使用的特征检测器\n\n## 系统要求\n\n* 支持 Linux 和 Windows，但出于性能和兼容性考虑，我们推荐使用 Linux。\n* 1–8 张高端 NVIDIA 显卡，每张显卡至少配备 12 GB 显存。我们所有的测试和开发都是在配备 8 张 Tesla V100 显卡的 NVIDIA DGX-1 上完成的。\n* 64 位 Python 3.7 和 PyTorch 1.7.1。PyTorch 的安装说明请参见 [https:\u002F\u002Fpytorch.org\u002F](https:\u002F\u002Fpytorch.org\u002F)。\n* CUDA 工具包 11.0 或更高版本。如果使用 RTX 3090，建议至少使用 11.1 版本。（为什么需要单独安装 CUDA 工具包？请参阅 [#2](https:\u002F\u002Fgithub.com\u002FNVlabs\u002Fstylegan2-ada-pytorch\u002Fissues\u002F2#issuecomment-779457121) 中的评论。）\n* Python 库：`pip install click requests tqdm pyspng ninja imageio-ffmpeg==0.4.3`。我们使用 Anaconda3 2020.11 发行版，其中大部分库已默认安装。\n* Docker 用户：请使用提供的 [Dockerfile](.\u002FDockerfile) 构建包含所需库依赖的镜像。\n\n代码大量依赖于通过 NVCC 动态编译的自定义 PyTorch 扩展。在 Windows 上，编译需要 Microsoft Visual Studio。我们建议安装 [Visual Studio Community Edition](https:\u002F\u002Fvisualstudio.microsoft.com\u002Fvs\u002F)，并通过 `\"C:\\Program Files (x86)\\Microsoft Visual Studio\\\u003CVERSION>\\Community\\VC\\Auxiliary\\Build\\vcvars64.bat\"` 将其添加到 `PATH` 中。\n\n## 快速入门\n\n预训练网络以 `*.pkl` 文件的形式存储，可以通过本地文件名或 URL 引用：\n\n```.bash\n\n# 生成无截断的精选 MetFaces 图像（图10 左侧）\npython generate.py --outdir=out --trunc=1 --seeds=85,265,297,849 \\\n    --network=https:\u002F\u002Fnvlabs-fi-cdn.nvidia.com\u002Fstylegan2-ada-pytorch\u002Fpretrained\u002Fmetfaces.pkl\n\n# 生成有截断的非精选 MetFaces 图像（图12 左上角）\npython generate.py --outdir=out --trunc=0.7 --seeds=600-605 \\\n    --network=https:\u002F\u002Fnvlabs-fi-cdn.nvidia.com\u002Fstylegan2-ada-pytorch\u002Fpretrained\u002Fmetfaces.pkl\n\n# 生成条件类别的 CIFAR-10 图像（图17 左侧，汽车类别）\npython generate.py --outdir=out --seeds=0-35 --class=1 \\\n    
--network=https:\u002F\u002Fnvlabs-fi-cdn.nvidia.com\u002Fstylegan2-ada-pytorch\u002Fpretrained\u002Fcifar10.pkl\n\n# 风格混合示例\npython style_mixing.py --outdir=out --rows=85,100,75,458,1500 --cols=55,821,1789,293 \\\n    --network=https:\u002F\u002Fnvlabs-fi-cdn.nvidia.com\u002Fstylegan2-ada-pytorch\u002Fpretrained\u002Fmetfaces.pkl\n```\n\n上述命令的输出文件将放置在 `out\u002F*.png` 目录下，由 `--outdir` 参数控制。下载的网络模型文件会被缓存到 `$HOME\u002F.cache\u002Fdnnlib` 目录中，可以通过设置环境变量 `DNNLIB_CACHE_DIR` 来覆盖该路径。默认的 PyTorch 扩展构建目录是 `$HOME\u002F.cache\u002Ftorch_extensions`，也可以通过设置 `TORCH_EXTENSIONS_DIR` 环境变量来更改。\n\n**Docker**：您可以通过以下 Docker 命令运行上述精选图像示例：\n\n```.bash\ndocker build --tag sg2ada:latest .\n.\u002Fdocker_run.sh python3 generate.py --outdir=out --trunc=1 --seeds=85,265,297,849 \\\n    --network=https:\u002F\u002Fnvlabs-fi-cdn.nvidia.com\u002Fstylegan2-ada-pytorch\u002Fpretrained\u002Fmetfaces.pkl\n```\n\n注意：Docker 镜像需要 NVIDIA 驱动版本 `r455.23` 或更高版本。\n\n**旧版网络**：上述命令可以加载大多数使用 TensorFlow 版本 StyleGAN2 和 StyleGAN2-ADA 创建的网络模型文件。然而，为了未来的兼容性，我们建议将这些旧版模型文件转换为 PyTorch 版本所使用的全新格式：\n\n```.bash\npython legacy.py \\\n    --source=https:\u002F\u002Fnvlabs-fi-cdn.nvidia.com\u002Fstylegan2\u002Fnetworks\u002Fstylegan2-cat-config-f.pkl \\\n    --dest=stylegan2-cat-config-f.pkl\n```\n\n## 将图像投影到潜在空间\n\n要找到给定图像文件对应的潜在向量，可运行以下命令：\n\n```.bash\npython projector.py --outdir=out --target=~\u002Fmytargetimg.png \\\n    --network=https:\u002F\u002Fnvlabs-fi-cdn.nvidia.com\u002Fstylegan2-ada-pytorch\u002Fpretrained\u002Fffhq.pkl\n```\n\n为获得最佳效果，目标图像应进行裁剪和对齐，类似于 [FFHQ 数据集](https:\u002F\u002Fgithub.com\u002FNVlabs\u002Fffhq-dataset)中的图像。上述命令会保存投影目标图像 `out\u002Ftarget.png`、结果图像 `out\u002Fproj.png`、潜在向量 `out\u002Fprojected_w.npz` 以及渐进视频 `out\u002Fproj.mp4`。您可以使用 `generate.py` 的 `--projected_w` 参数来渲染得到的潜在向量：\n\n```.bash\npython generate.py --outdir=out --projected_w=out\u002Fprojected_w.npz \\\n    --network=https:\u002F\u002Fnvlabs-fi-cdn.nvidia.com\u002Fstylegan2-ada-pytorch\u002Fpretrained\u002Fffhq.pkl\n```\n\n## 在 
Python 中使用预训练网络\n\n您可以在自己的 Python 代码中使用预训练网络，如下所示：\n\n```.python\nwith open('ffhq.pkl', 'rb') as f:\n    G = pickle.load(f)['G_ema'].cuda()  # torch.nn.Module\nz = torch.randn([1, G.z_dim]).cuda()    # 潜在向量\nc = None                                # 类别标签（本示例中未使用）\nimg = G(z, c)                           # NCHW, float32, 动态范围 [-1, +1]\n```\n\n上述代码需要 `torch_utils` 和 `dnnlib` 能够通过 `PYTHONPATH` 访问。它不需要网络本身的源代码——其类定义会通过 `torch_utils.persistence` 从 pickle 文件中加载。\n\npickle 文件包含三个网络：“G”和“D”是在训练过程中拍摄的即时快照，“G_ema”则是生成器权重在多个训练步骤上的移动平均值。这些网络都是标准的 `torch.nn.Module` 实例，在导入时所有参数和缓冲区都位于 CPU 上，并且默认情况下禁用了梯度计算。\n\n生成器由两个子模块组成：“G.mapping”和“G.synthesis”，它们可以分别执行。此外，它们还支持多种附加选项：\n\n```.python\nw = G.mapping(z, c, truncation_psi=0.5, truncation_cutoff=8)\nimg = G.synthesis(w, noise_mode='const', force_fp32=True)\n```\n\n更多示例请参阅 [`generate.py`](.\u002Fgenerate.py)、[`style_mixing.py`](.\u002Fstyle_mixing.py) 和 [`projector.py`](.\u002Fprojector.py)。\n\n## 准备数据集\n\n数据集以未压缩的 ZIP 归档形式存储，其中包含未压缩的 PNG 文件和一个用于标签的元数据文件 `dataset.json`。\n\n自定义数据集可以从包含图像的文件夹创建；更多信息请参阅 [`python dataset_tool.py --help`](.\u002Fdocs\u002Fdataset-tool-help.txt)。或者，也可以直接将文件夹用作数据集，无需先通过 `dataset_tool.py` 处理，但这样做可能会导致性能不佳。\n\n不支持旧版 TFRecords 数据集——有关如何将其转换的说明见下文。\n\n**FFHQ**：\n\n步骤 1：下载 [Flickr-Faces-HQ 数据集](https:\u002F\u002Fgithub.com\u002FNVlabs\u002Fffhq-dataset)，格式为 TFRecords。\n\n步骤 2：使用来自 [TensorFlow 版本 StyleGAN2-ADA](https:\u002F\u002Fgithub.com\u002FNVlabs\u002Fstylegan2-ada\u002F) 的 `dataset_tool.py` 从 TFRecords 中提取图像：\n\n```.bash\n# 使用来自 https:\u002F\u002Fgithub.com\u002FNVlabs\u002Fstylegan2-ada\u002F 的 TensorFlow 版本中的 dataset_tool.py\npython ..\u002Fstylegan2-ada\u002Fdataset_tool.py unpack \\\n    --tfrecord_dir=~\u002Fffhq-dataset\u002Ftfrecords\u002Fffhq --output_dir=\u002Ftmp\u002Fffhq-unpacked\n```\n\n步骤 3：使用本仓库中的 `dataset_tool.py` 创建 ZIP 归档：\n\n```.bash\n# 原始 1024x1024 分辨率。\npython dataset_tool.py --source=\u002Ftmp\u002Fffhq-unpacked --dest=~\u002Fdatasets\u002Fffhq.zip\n\n# 缩小至 256x256 分辨率。\n#\n# 注意：必须使用 
--resize-filter=box 选项才能复现论文中展示的 FID 分数。如果您不需要完全匹配，\n# 最好省略此选项，默认使用 Lanczos 插值。参见 https:\u002F\u002Fgithub.com\u002FNVlabs\u002Fstylegan2-ada-pytorch\u002Fissues\u002F283#issuecomment-1731217782\npython dataset_tool.py --source=\u002Ftmp\u002Fffhq-unpacked --dest=~\u002Fdatasets\u002Fffhq256x256.zip \\\n    --width=256 --height=256 --resize-filter=box\n```\n\n**MetFaces**：下载 [MetFaces 数据集](https:\u002F\u002Fgithub.com\u002FNVlabs\u002Fmetfaces-dataset)并创建 ZIP 压缩包：\n\n```.bash\npython dataset_tool.py --source=~\u002Fdownloads\u002Fmetfaces\u002Fimages --dest=~\u002Fdatasets\u002Fmetfaces.zip\n```\n\n**AFHQ**：下载 [AFHQ 数据集](https:\u002F\u002Fgithub.com\u002Fclovaai\u002Fstargan-v2\u002Fblob\u002Fmaster\u002FREADME.md#animal-faces-hq-dataset-afhq)并创建 ZIP 压缩包：\n\n```.bash\npython dataset_tool.py --source=~\u002Fdownloads\u002Fafhq\u002Ftrain\u002Fcat --dest=~\u002Fdatasets\u002Fafhqcat.zip\npython dataset_tool.py --source=~\u002Fdownloads\u002Fafhq\u002Ftrain\u002Fdog --dest=~\u002Fdatasets\u002Fafhqdog.zip\npython dataset_tool.py --source=~\u002Fdownloads\u002Fafhq\u002Ftrain\u002Fwild --dest=~\u002Fdatasets\u002Fafhqwild.zip\n```\n\n**CIFAR-10**：下载 [CIFAR-10 的 Python 版本](https:\u002F\u002Fwww.cs.toronto.edu\u002F~kriz\u002Fcifar.html)并转换为 ZIP 压缩包：\n\n```.bash\npython dataset_tool.py --source=~\u002Fdownloads\u002Fcifar-10-python.tar.gz --dest=~\u002Fdatasets\u002Fcifar10.zip\n```\n\n**LSUN**：从 [LSUN 项目页面](https:\u002F\u002Fwww.yf.io\u002Fp\u002Flsun\u002F)下载所需类别，并将其转换为 ZIP 压缩包：\n\n```.bash\npython dataset_tool.py --source=~\u002Fdownloads\u002Flsun\u002Fraw\u002Fcat_lmdb --dest=~\u002Fdatasets\u002Flsuncat200k.zip \\\n    --transform=center-crop --width=256 --height=256 --max_images=200000\n\npython dataset_tool.py --source=~\u002Fdownloads\u002Flsun\u002Fraw\u002Fcar_lmdb --dest=~\u002Fdatasets\u002Flsuncar200k.zip \\\n    --transform=center-crop-wide --width=512 --height=384 --max_images=200000\n```\n\n**BreCaHAD**：\n\n步骤 1：下载 [BreCaHAD 
数据集](https:\u002F\u002Ffigshare.com\u002Farticles\u002FBreCaHAD_A_Dataset_for_Breast_Cancer_Histopathological_Annotation_and_Diagnosis\u002F7379186)。\n\n步骤 2：使用来自 [StyleGAN2-ADA 的 TensorFlow 版本](https:\u002F\u002Fgithub.com\u002FNVlabs\u002Fstylegan2-ada\u002F)的 `dataset_tool.py` 提取 512×512 分辨率的裁剪图像：\n\n```.bash\n# 使用来自 TensorFlow 版本的 dataset_tool.py\n# https:\u002F\u002Fgithub.com\u002FNVlabs\u002Fstylegan2-ada\u002F\npython dataset_tool.py extract_brecahad_crops --cropsize=512 \\\n    --output_dir=\u002Ftmp\u002Fbrecahad-crops --brecahad_dir=~\u002Fdownloads\u002Fbrecahad\u002Fimages\n```\n\n步骤 3：使用本仓库中的 `dataset_tool.py` 创建 ZIP 压缩包：\n\n```.bash\npython dataset_tool.py --source=\u002Ftmp\u002Fbrecahad-crops --dest=~\u002Fdatasets\u002Fbrecahad.zip\n```\n\n## 训练新网络\n\n以最基础的形式来说，训练新网络只需执行以下命令：\n\n```.bash\npython train.py --outdir=~\u002Ftraining-runs --data=~\u002Fmydataset.zip --gpus=1 --dry-run\npython train.py --outdir=~\u002Ftraining-runs --data=~\u002Fmydataset.zip --gpus=1\n```\n\n第一条命令是可选的；它会验证参数、打印出训练配置并退出。第二条命令则真正开始训练。\n\n在本示例中，结果将保存到新创建的目录 `~\u002Ftraining-runs\u002F\u003CID>-mydataset-auto1` 中，该目录由 `--outdir` 控制。训练会定期导出网络快照文件（`network-snapshot-\u003CINT>.pkl`）和示例图像（`fakes\u003CINT>.png`），间隔时间由 `--snap` 控制。对于每个快照，还会评估 FID 指标（由 `--metrics` 控制），并将结果记录在 `metric-fid50k_full.jsonl` 文件中（如果安装了 TensorBoard，则同时生成 TFEvents 日志）。\n\n输出目录的名称反映了训练配置。例如，`00000-mydataset-auto1` 表示*基础配置*为 `auto1`，即针对单 GPU 训练自动选择了超参数。基础配置由 `--cfg` 控制：\n\n| 基础配置           | 说明\n| :-------------------- | :----------\n| `auto`&nbsp;(默认) | 根据分辨率和 GPU 数量自动选择合理的默认设置。这是新数据集的良好起点，但未必能带来最佳效果。\n| `stylegan2`           | 使用 1、2、4 或 8 个 GPU 复现 StyleGAN2 在 1024×1024 分辨率下的 F 配置结果。\n| `paper256`            | 使用 1、2、4 或 8 个 GPU 复现 FFHQ 和 LSUN Cat 数据集在 256×256 分辨率下的结果。\n| `paper512`            | 使用 1、2、4 或 8 个 GPU 复现 BreCaHAD 和 AFHQ 数据集在 512×512 分辨率下的结果。\n| `paper1024`           | 使用 1、2、4 或 8 个 GPU 复现 MetFaces 数据集在 1024×1024 分辨率下的结果。\n| `cifar`               | 使用 1 或 2 个 GPU 复现 CIFAR-10 
数据集的优化配置结果。\n\n训练配置还可以通过其他命令行选项进一步自定义：\n\n* `--aug=noaug` 禁用 ADA 增强。\n* `--cond=1` 启用类别条件训练（需要带有标签的数据集）。\n* `--mirror=1` 通过水平翻转扩充数据集。即使启用了 ADA，这样做通常也有益处。\n* `--resume=ffhq1024 --snap=10` 以在 1024×1024 分辨率的 FFHQ 上训练的网络为起点进行迁移学习。\n* `--resume=~\u002Ftraining-runs\u002F\u003CNAME>\u002Fnetwork-snapshot-\u003CINT>.pkl` 继续之前的训练过程。\n* `--gamma=10` 覆盖 R1 正则化系数。我们建议为每个新数据集尝试几个不同的值。\n* `--aug=ada --target=0.7` 调整 ADA 目标值（默认为 0.6）。\n* `--augpipe=blit` 仅启用像素块传输（blit）增强，禁用其他所有增强方法。\n* `--augpipe=bgcfnc` 启用所有可用的增强方法（blit、几何变换、颜色调整、滤波、噪声和 cutout）。\n\n完整选项列表请参阅 [`python train.py --help`](.\u002Fdocs\u002Ftrain-help.txt)。\n\n## 预期训练时间\n\n总的训练时间高度依赖于分辨率、GPU数量、数据集、期望的质量以及超参数。下表列出了达到不同训练阶段所需的预期实际耗时（wall-clock time）；训练进度以展示给判别器的真实图像千数（“kimg”）衡量：\n\n| 分辨率   | GPU数量 | 1000 kimg | 25000 kimg | 秒\u002Fkimg          | GPU显存 | CPU内存\n| :--------: | :--: | :-------: | :--------: | :---------------: | :-----: | :-----:\n| 128x128    | 1    | 4小时05分 | 4天06小时   | 12.8–13.7       | 7.2 GB  | 3.9 GB\n| 128x128    | 2    | 2小时06分 | 2天04小时   | 6.5–6.8         | 7.4 GB  | 7.9 GB\n| 128x128    | 4    | 1小时20分 | 1天09小时   | 4.1–4.6         | 4.2 GB  | 16.3 GB\n| 128x128    | 8    | 1小时13分 | 1天06小时   | 3.9–4.9         | 2.6 GB  | 31.9 GB\n| 256x256    | 1    | 6小时36分 | 6天21小时   | 21.6–24.2       | 5.0 GB  | 4.5 GB\n| 256x256    | 2    | 3小时27分 | 3天14小时   | 11.2–11.8       | 5.2 GB  | 9.0 GB\n| 256x256    | 4    | 1小时45分 | 1天20小时   | 5.6–5.9         | 5.2 GB  | 17.8 GB\n| 256x256    | 8    | 1小时24分 | 1天11小时   | 4.4–5.5         | 3.2 GB  | 34.7 GB\n| 512x512    | 1    | 21小时03分| 21天22小时  | 72.5–74.9       | 7.6 GB  | 5.0 GB\n| 512x512    | 2    | 10小时59分| 11天10小时  | 37.7–40.0       | 7.8 GB  | 9.8 GB\n| 512x512    | 4    | 5小时29分 | 5天17小时   | 18.7–19.1       | 7.9 GB  | 17.7 GB\n| 512x512    | 8    | 2小时48分 | 2天22小时   | 9.5–9.7         | 7.8 GB  | 38.2 GB\n| 1024x1024  | 1    | 1天20小时 | 46天03小时  | 154.3–161.6     | 8.1 GB  | 5.3 GB\n| 1024x1024  | 2    | 23小时09分| 24天02小时  | 80.6–86.2       | 8.6 GB  | 11.9 GB\n| 1024x1024  | 4    | 11小时36分| 
12天02小时  | 40.1–40.8       | 8.4 GB  | 21.9 GB\n| 1024x1024  | 8    | 5小时54分 | 6天03小时   | 20.2–20.6       | 8.3 GB  | 44.7 GB\n\n以上测量是在使用 NVIDIA Tesla V100 GPU 并采用默认设置（`--cfg=auto --aug=ada --metrics=fid50k_full`）的情况下进行的。“秒\u002Fkimg”显示了在 `log.txt` 中报告的原始训练性能的预期变化范围。“GPU显存”和“CPU内存”显示了观察到的最大内存消耗，不包括由 `torch.backends.cudnn.benchmark` 引起的初始峰值。\n\n在典型情况下，需要 25000 kimg 或更多才能达到收敛，但大约 5000 kimg 时结果已经相当合理。1000 kimg 通常足以进行迁移学习，而迁移学习往往能显著加快收敛速度。下图展示了使用上述相同设置时，不同数据集随实际耗时（wall-clock time）变化的示例收敛曲线：\n\n![训练曲线](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FNVlabs_stylegan2-ada-pytorch_readme_41c02ca6c630.png)\n\n注意：`--cfg=auto` 可作为超参数的一个合理初始猜测，但并不一定能够为特定数据集带来最佳效果。例如，在 1024×1024 分辨率下，对于 FFHQ-140k 数据集，`--cfg=stylegan2` 的 FID 值明显优于上文所示的结果。我们建议针对每个新数据集至少尝试几个不同的 `--gamma` 值。\n\n## 质量指标\n\n默认情况下，`train.py` 会在训练过程中自动计算每次导出的网络模型文件的 FID 值。我们建议定期检查 `metric-fid50k_full.jsonl` 文件（或 TensorBoard），以监控训练进度。如果需要，可以通过 `--metrics=none` 禁用自动计算，从而略微加快训练速度（约 3%–9%）。\n\n此外，训练完成后还可以计算其他质量指标：\n\n```.bash\n# 上一次训练运行：自动查找选项，将结果保存到 JSONL 文件。\npython calc_metrics.py --metrics=pr50k3_full \\\n    --network=~\u002Ftraining-runs\u002F00000-ffhq10k-res64-auto1\u002Fnetwork-snapshot-000000.pkl\n\n# 预训练网络模型文件：明确指定数据集，将结果输出到标准输出。\npython calc_metrics.py --metrics=fid50k_full --data=~\u002Fdatasets\u002Fffhq.zip --mirror=1 \\\n    --network=https:\u002F\u002Fnvlabs-fi-cdn.nvidia.com\u002Fstylegan2-ada-pytorch\u002Fpretrained\u002Fffhq.pkl\n```\n\n第一个示例会自动查找训练配置，并执行与训练时指定了 `--metrics=pr50k3_full` 相同的操作。第二个示例则下载了一个预训练的网络模型文件，在这种情况下必须显式指定 `--mirror` 和 `--data` 参数。\n\n需要注意的是，许多指标在首次针对新数据集计算时会有较大的一次性开销（最长可达 30 分钟）。另外，每次评估都会使用不同的随机种子，因此多次计算同一指标可能会得到略有不同的结果。\n\n我们在 ADA 论文中采用了以下指标。执行时间和 GPU 显存占用均基于单块 NVIDIA Tesla V100 GPU 在 1024×1024 分辨率下的情况：\n\n| 指标        | 时间   | GPU显存 | 描述 |\n| :-----        | :----: | :-----: | :---------- |\n| `fid50k_full` | 13 分钟 | 1.8 GB  | 对完整数据集的 Fr&eacute;chet inception 距离\u003Csup>[1]\u003C\u002Fsup>\n| `kid50k_full` | 13 分钟 | 1.8 GB  | 对完整数据集的 Kernel inception 距离\u003Csup>[2]\u003C\u002Fsup>\n| 
`pr50k3_full` | 13 分钟 | 4.1 GB  | 对完整数据集的精确率和召回率\u003Csup>[3]\u003C\u002Fsup>\n| `is50k`       | 13 分钟 | 1.8 GB  | CIFAR-10 的 Inception 分数\u003Csup>[4]\u003C\u002Fsup>\n\n此外，还支持来自 [StyleGAN](https:\u002F\u002Fgithub.com\u002FNVlabs\u002Fstylegan) 和 [StyleGAN2](https:\u002F\u002Fgithub.com\u002FNVlabs\u002Fstylegan2) 论文中的以下指标：\n\n| 指标        | 时间   | GPU显存 | 描述 |\n| :------------ | :----: | :-----: | :---------- |\n| `fid50k`      | 13 分钟 | 1.8 GB  | 对 5 万张真实图像的 Fr&eacute;chet inception 距离\n| `kid50k`      | 13 分钟 | 1.8 GB  | 对 5 万张真实图像的 Kernel inception 距离\n| `pr50k3`      | 13 分钟 | 4.1 GB  | 对 5 万张真实图像的精确率和召回率\n| `ppl2_wend`   | 36 分钟 | 2.4 GB  | W 空间中端点至完整图像的感知路径长度\u003Csup>[5]\u003C\u002Fsup>\n| `ppl_zfull`   | 36 分钟 | 2.4 GB  | Z 空间中完整路径裁剪后的感知路径长度\n| `ppl_wfull`   | 36 分钟 | 2.4 GB  | W 空间中完整路径裁剪后的感知路径长度\n| `ppl_zend`    | 36 分钟 | 2.4 GB  | Z 空间中端点裁剪后的感知路径长度\n| `ppl_wend`    | 36 分钟 | 2.4 GB  | W 空间中端点裁剪后的感知路径长度\n\n参考文献：\n1. [通过双时间尺度更新规则训练的 GAN 收敛到局部纳什均衡](https:\u002F\u002Farxiv.org\u002Fabs\u002F1706.08500)，Heusel 等，2017 年\n2. [揭秘 MMD GAN](https:\u002F\u002Farxiv.org\u002Fabs\u002F1801.01401)，Bi&nacute;kowski 等，2018 年\n3. [用于评估生成模型的改进精确率和召回率指标](https:\u002F\u002Farxiv.org\u002Fabs\u002F1904.06991)，Kynk&auml;&auml;nniemi 等，2019 年\n4. [训练 GAN 的改进技术](https:\u002F\u002Farxiv.org\u002Fabs\u002F1606.03498)，Salimans 等，2016 年\n5. 
[面向生成对抗网络的基于风格的生成器架构](https:\u002F\u002Farxiv.org\u002Fabs\u002F1812.04948)，Karras 等，2018 年\n\n## 许可证\n\n版权所有 &copy; 2021，NVIDIA Corporation。保留所有权利。\n\n本作品根据 [Nvidia 源代码许可证](https:\u002F\u002Fnvlabs.github.io\u002Fstylegan2-ada-pytorch\u002Flicense.html) 提供。\n\n## 引用\n\n```\n@inproceedings{Karras2020ada,\n  title     = {Training Generative Adversarial Networks with Limited Data},\n  author    = {Tero Karras and Miika Aittala and Janne Hellsten and Samuli Laine and Jaakko Lehtinen and Timo Aila},\n  booktitle = {Proc. NeurIPS},\n  year      = {2020}\n}\n```\n\n## 开发\n\n这是一个研究参考实现，被视为一次性代码发布。因此，我们不接受以拉取请求形式的外部代码贡献。\n\n## 致谢\n\n我们感谢 David Luebke 提出的有益意见；感谢 Tero Kuosmanen 和 Sabu Nadarajan 在计算基础设施方面的支持；以及感谢 Edgar Schönfeld 在搭建无条件 BigGAN 方面提供的指导。","# StyleGAN2-ADA PyTorch 快速上手指南\n\nStyleGAN2-ADA 是 NVIDIA 官方推出的 PyTorch 版本实现，专为**小样本数据训练**设计。通过自适应判别器增强机制，它能在仅有几千张训练图像的情况下稳定训练出高质量的生成模型，有效防止过拟合。\n\n## 环境准备\n\n### 系统要求\n*   **操作系统**：推荐 **Linux**（性能与兼容性最佳），支持 Windows。\n*   **GPU**：1–8 块高端 NVIDIA GPU，显存至少 **12 GB**（开发测试基于 8x Tesla V100）。\n*   **Python**：64-bit Python 3.7。\n*   **PyTorch**：版本 1.7.1。\n*   **CUDA**：Toolkit 11.0 或更高版本（若使用 RTX 3090，请至少使用 11.1）。\n\n### 依赖安装\n建议使用 Anaconda3 (2020.11 或更新) 管理环境。安装必要的 Python 库：\n\n```bash\npip install click requests tqdm pyspng ninja imageio-ffmpeg==0.4.3\n```\n\n> **Windows 用户注意**：编译自定义 PyTorch 扩展需要 Microsoft Visual Studio。请安装 [Visual Studio Community Edition](https:\u002F\u002Fvisualstudio.microsoft.com\u002Fvs\u002F) 并将构建工具添加到 PATH：\n> `\"C:\\Program Files (x86)\\Microsoft Visual Studio\\\u003CVERSION>\\Community\\VC\\Auxiliary\\Build\\vcvars64.bat\"`\n\n> **Docker 用户**：可直接使用项目提供的 `Dockerfile` 构建包含所有依赖的环境镜像。\n\n## 安装步骤\n\n本项目无需传统的 `setup.py install`，克隆仓库后即可直接运行。\n\n1.  **克隆仓库**\n    ```bash\n    git clone https:\u002F\u002Fgithub.com\u002FNVlabs\u002Fstylegan2-ada-pytorch.git\n    cd stylegan2-ada-pytorch\n    ```\n\n2.  **验证环境**\n    确保 `torch_utils` 和 `dnnlib` 目录在 `PYTHONPATH` 中（通常在当前目录下直接运行脚本即可自动识别）。\n\n3.  
**数据集格式说明**\n    *   新版支持：未压缩的 ZIP 归档（内含 PNG 图片和 `dataset.json` 元数据）。\n    *   **不再支持**：旧的 TFRecords 格式。如有旧数据集，需使用 `dataset_tool.py` 进行转换。\n\n## 基本使用\n\n预训练模型以 `*.pkl` 文件形式提供，可直接通过本地路径或 URL 引用。以下是最常用的图像生成示例。\n\n### 1. 生成图像 (Generate Images)\n\n使用预训练的 MetFaces 模型生成图像：\n\n```bash\n# 生成未经截断的精选 MetFaces 图像\npython generate.py --outdir=out --trunc=1 --seeds=85,265,297,849 \\\n    --network=https:\u002F\u002Fnvlabs-fi-cdn.nvidia.com\u002Fstylegan2-ada-pytorch\u002Fpretrained\u002Fmetfaces.pkl\n\n# 生成带截断（truncation）的随机 MetFaces 图像（效果更稳定）\npython generate.py --outdir=out --trunc=0.7 --seeds=600-605 \\\n    --network=https:\u002F\u002Fnvlabs-fi-cdn.nvidia.com\u002Fstylegan2-ada-pytorch\u002Fpretrained\u002Fmetfaces.pkl\n```\n\n*   `--outdir`：输出目录。\n*   `--trunc`：截断系数（0-1），越低图像质量越稳定但多样性降低。\n*   `--network`：预训练模型路径（支持 http\u002Fhttps 自动下载缓存）。\n\n### 2. 风格混合 (Style Mixing)\n\n探索潜在空间中的风格插值：\n\n```bash\npython style_mixing.py --outdir=out --rows=85,100,75,458,1500 --cols=55,821,1789,293 \\\n    --network=https:\u002F\u002Fnvlabs-fi-cdn.nvidia.com\u002Fstylegan2-ada-pytorch\u002Fpretrained\u002Fmetfaces.pkl\n```\n\n### 3. 图像投影 (Project Images to Latent Space)\n\n将真实图片反推回潜在向量（Latent Vector）：\n\n```bash\npython projector.py --outdir=out --target=~\u002Fmytargetimg.png \\\n    --network=https:\u002F\u002Fnvlabs-fi-cdn.nvidia.com\u002Fstylegan2-ada-pytorch\u002Fpretrained\u002Fffhq.pkl\n```\n*注意：目标图片最好经过与 FFHQ 数据集相同的裁剪和对齐预处理以获得最佳效果。*\n\n### 4. 
在 Python 代码中调用\n\n你可以直接在 Python 脚本中加载模型进行推理：\n\n```python\nimport pickle\nimport torch\nfrom torch_utils import persistence\n\n# 加载预训练模型\nwith open('ffhq.pkl', 'rb') as f:\n    G = pickle.load(f)['G_ema'].cuda()  # 使用指数移动平均生成器\n\n# 准备输入\nz = torch.randn([1, G.z_dim]).cuda()    # 随机潜码\nc = None                                # 类别标签（无条件生成设为 None）\n\n# 生成图像\nimg = G(z, c)                           # 输出范围 [-1, +1], 格式 NCHW\n```\n\n### 常用预训练模型列表\n| 模型文件 | 描述 | 分辨率 |\n| :--- | :--- | :--- |\n| `ffhq.pkl` | FFHQ 人脸数据集 (原始 StyleGAN2 训练) | 1024x1024 |\n| `metfaces.pkl` | MetFaces 艺术人脸 (基于 FFHQ 迁移学习) | 1024x1024 |\n| `afhqcat.pkl` | AFHQ 猫 (从头训练) | 512x512 |\n| `cifar10.pkl` | CIFAR-10 (类条件生成) | 32x32 |\n\n所有模型均可从 [NVIDIA 官方数据源](https:\u002F\u002Fnvlabs-fi-cdn.nvidia.com\u002Fstylegan2-ada-pytorch\u002Fpretrained\u002F) 自动下载。","一家小型数字时尚初创公司试图利用仅有的 2000 张复古面料扫描图，训练一个能生成无限新花纹的 AI 模型以丰富其设计库。\n\n### 没有 stylegan2-ada-pytorch 时\n- **模型训练崩溃**：由于样本量远低于传统 GAN 所需的数万张，判别器迅速过拟合，导致生成器无法学习，训练过程早早发散失败。\n- **数据成本高昂**：团队被迫花费数周时间和大量预算去采集或购买更多高清面料图片，严重拖慢了产品上线节奏。\n- **生成质量低劣**：即使勉强训练，生成的图案也充满伪影、重复纹理严重，完全无法达到商业设计所需的清晰度和多样性标准。\n- **架构调整复杂**：为了适应小数据，开发人员需要手动修改损失函数或网络结构，试错成本极高且缺乏理论保证。\n\n### 使用 stylegan2-ada-pytorch 后\n- **小数据稳定训练**：借助其自适应判别器增强机制，仅需现有的 2000 张图片即可稳定训练，有效防止了判别器过拟合。\n- **大幅降低门槛**：无需额外收集数据，直接利用现有资产启动项目，将原本数周的数据准备期缩短为零。\n- **产出商用级素材**：生成的面料纹理细节丰富、风格多样且无重复感，FID 指标显著优化，直接满足设计师的打样需求。\n- **开箱即用高效**：无需修改任何网络架构或损失函数，直接复用官方配置即可在 PyTorch 环境中快速复现论文效果，训练速度还提升了约 30%。\n\nstylegan2-ada-pytorch 通过自适应增强技术，彻底打破了生成式 AI 对海量数据的依赖，让中小企业也能用少量样本创造出高质量的定制内容。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FNVlabs_stylegan2-ada-pytorch_3829c2b3.png","NVlabs","NVIDIA Research 
Projects","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002FNVlabs_fc20d641.jpg","",null,"http:\u002F\u002Fresearch.nvidia.com","https:\u002F\u002Fgithub.com\u002FNVlabs",[81,85,89,93,97],{"name":82,"color":83,"percentage":84},"Python","#3572A5",88.7,{"name":86,"color":87,"percentage":88},"Cuda","#3A4E3A",7.5,{"name":90,"color":91,"percentage":92},"C++","#f34b7d",3.3,{"name":94,"color":95,"percentage":96},"Shell","#89e051",0.3,{"name":98,"color":99,"percentage":100},"Dockerfile","#384d54",0.2,4456,1239,"2026-04-06T10:26:59","NOASSERTION",4,"Linux, Windows","必需，1-8 块高端 NVIDIA GPU（测试基于 Tesla V100），显存至少 12GB，CUDA Toolkit 11.0+（RTX 3090 需 11.1+）","未说明",{"notes":110,"python":111,"dependencies":112},"Windows 用户编译自定义 PyTorch 扩展时需要安装 Microsoft Visual Studio。不再支持 TFRecords 数据集，需转换为新的 ZIP\u002FPNG 格式。默认禁用截断（Truncation）。建议使用 Anaconda3 2020.11 发行版。Docker 用户需要 NVIDIA 驱动版本 r455.23 或更高。","3.7 (64-bit)",[113,114,115,116,117,118,119],"torch==1.7.1","click","requests","tqdm","pyspng","ninja","imageio-ffmpeg==0.4.3",[15,14],"2026-03-27T02:49:30.150509","2026-04-06T23:57:49.538019",[124,129,134,139,143,148,152],{"id":125,"question_zh":126,"answer_zh":127,"source_url":128},20367,"安装 PyTorch 插件 \"upfirdn2d_plugin\" 失败怎么办？","最常见且有效的解决方案是安装 `ninja` 构建工具。请运行命令：`pip install ninja`（在 Google Colab 中请使用 `%pip install ninja`）。安装完成后重新运行代码，通常可以解决 CUDA 内核编译失败的问题。如果仍然失败，可能是编译器版本不匹配，建议检查 GCC 版本或尝试强制使用参考实现（修改源码中的 `_init` 函数）。","https:\u002F\u002Fgithub.com\u002FNVlabs\u002Fstylegan2-ada-pytorch\u002Fissues\u002F39",{"id":130,"question_zh":131,"answer_zh":132,"source_url":133},20368,"遇到 \"RuntimeError: CUDA error: no kernel image is available for execution on the device\" 错误如何解决？","这通常是由于 PyTorch 版本与 CUDA 版本不兼容导致的。建议尝试更换 PyTorch 版本。例如，对于 CUDA 11.1，可以尝试安装 PyTorch 1.8.0：\n`conda install pytorch==1.8.0 torchvision==0.9.0 torchaudio==0.8.0 cudatoolkit=11.1 -c pytorch -c conda-forge`\n如果该版本无效，请尝试其他与您的 CUDA 版本匹配的 PyTorch 
历史版本。","https:\u002F\u002Fgithub.com\u002FNVlabs\u002Fstylegan2-ada-pytorch\u002Fissues\u002F6",{"id":135,"question_zh":136,"answer_zh":137,"source_url":138},20369,"在 Ubuntu 系统上编译 CUDA 操作符失败，推荐的完整环境配置是什么？","成功运行的推荐配置如下：\n- 操作系统：Ubuntu 18.04.4\n- 编译器：gcc 7.5.0\n- CUDA 版本：11.1\n- CUDNN 版本：8.0.5\n- Python 版本：3.7\n- PyTorch 版本：1.7.1（可从源码编译或使用预编译包）\n确保手动安装 CUDA 和 CUDNN，并确认 `print(torch.__version__)` 输出正常。","https:\u002F\u002Fgithub.com\u002FNVlabs\u002Fstylegan2-ada-pytorch\u002Fissues\u002F2",{"id":140,"question_zh":141,"answer_zh":142,"source_url":138},20370,"在 Windows 上遇到 upfirdn2d 编译错误或运行时错误怎么办？","Windows 用户需要确保安装了完整的 Visual Studio（包含 C++ 构建工具）以处理 C++ 编译问题。此外，建议从 NVIDIA 官网下载并安装完整的 CUDA Toolkit（如 CUDA 11.3），而不仅仅是通过 conda 安装的 cudatoolkit。组合方案：Visual Studio + 完整 CUDA 11.3 + PyTorch 1.7.1 + Python 3.7 已被验证有效。",{"id":144,"question_zh":145,"answer_zh":146,"source_url":147},20371,"如何在 Vast.ai 或其他云实例上正确配置 StyleGAN2-ADA 环境？","推荐步骤如下：\n1. 选择带有适当 CUDA 版本的 NVIDIA Docker 镜像（如 `nvidia\u002Fcuda:11.x-cudnn8-runtime-ubuntu18.04`）。\n2. SSH 登录实例后，安装 Miniconda。\n3. 创建环境并安装依赖：\n   `conda install pytorch==1.7.1 torchvision==0.8.2 torchaudio==0.7.2 cudatoolkit=11.0 tensorboard -c pytorch --yes`\n   `pip install click psutil scipy requests tqdm pyspng ninja imageio imageio-ffmpeg==0.4.3 ipywidgets jupyterlab`\n注意：Docker 容器内通常无需单独安装 gcc 或 toolkit，重点在于安装 `ninja`。","https:\u002F\u002Fgithub.com\u002FNVlabs\u002Fstylegan2-ada-pytorch\u002Fissues\u002F72",{"id":149,"question_zh":150,"answer_zh":151,"source_url":128},20372,"如果无法修复 CUDA 插件编译问题，是否有临时的替代方案？","是的，可以强制代码使用较慢的参考实现（Reference Implementation）而不是自定义 CUDA 内核。方法是编辑 `torch_utils\u002Fops\u002F` 目录下的相关文件（如 `bias_act.py`, `upfirdn2d.py` 等），找到 `_init()` 函数，修改逻辑使其跳过插件加载或直接设置为使用参考路径。虽然运行速度会变慢，但可以确保程序正常运行而不报错。",{"id":153,"question_zh":154,"answer_zh":155,"source_url":138},20373,"为什么安装了 ninja 但在 Ubuntu 20.04 上仍然报错？","`pip install ninja` 在 Colab 和 Windows 上通常有效，但在某些 Ubuntu 版本（如 20.04）上可能因系统自带的编译器版本过高（如 gcc 9+）与 CUDA 不兼容而导致失败。解决方案包括：\n1. 
降级 GCC 到版本 7（例如 `conda install -c conda-forge\u002Flabel\u002Fgcc7 gcc_linux-64`）。\n2. 或者使用 Ubuntu 18.04 环境，该环境默认 gcc 7.5.0 兼容性更好。\n3. 检查是否缺少其他构建依赖，确保环境变量配置正确。",[]]