[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-Mukosame--Zooming-Slow-Mo-CVPR-2020":3,"tool-Mukosame--Zooming-Slow-Mo-CVPR-2020":64},[4,17,27,35,43,56],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":16},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,3,"2026-04-05T11:01:52",[13,14,15],"开发框架","图像","Agent","ready",{"id":18,"name":19,"github_repo":20,"description_zh":21,"stars":22,"difficulty_score":23,"last_commit_at":24,"category_tags":25,"status":16},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",138956,2,"2026-04-05T11:33:21",[13,15,26],"语言模型",{"id":28,"name":29,"github_repo":30,"description_zh":31,"stars":32,"difficulty_score":23,"last_commit_at":33,"category_tags":34,"status":16},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",107662,"2026-04-03T11:11:01",[13,14,15],{"id":36,"name":37,"github_repo":38,"description_zh":39,"stars":40,"difficulty_score":23,"last_commit_at":41,"category_tags":42,"status":16},3704,"NextChat","ChatGPTNextWeb\u002FNextChat","NextChat 是一款轻量且极速的 AI 助手，旨在为用户提供流畅、跨平台的大模型交互体验。它完美解决了用户在多设备间切换时难以保持对话连续性，以及面对众多 AI 模型不知如何统一管理的痛点。无论是日常办公、学习辅助还是创意激发，NextChat 都能让用户随时随地通过网页、iOS、Android、Windows、MacOS 或 Linux 端无缝接入智能服务。\n\n这款工具非常适合普通用户、学生、职场人士以及需要私有化部署的企业团队使用。对于开发者而言，它也提供了便捷的自托管方案，支持一键部署到 Vercel 或 Zeabur 等平台。\n\nNextChat 的核心亮点在于其广泛的模型兼容性，原生支持 Claude、DeepSeek、GPT-4 及 Gemini Pro 等主流大模型，让用户在一个界面即可自由切换不同 AI 能力。此外，它还率先支持 MCP（Model Context Protocol）协议，增强了上下文处理能力。针对企业用户，NextChat 提供专业版解决方案，具备品牌定制、细粒度权限控制、内部知识库整合及安全审计等功能，满足公司对数据隐私和个性化管理的高标准要求。",87618,"2026-04-05T07:20:52",[13,26],{"id":44,"name":45,"github_repo":46,"description_zh":47,"stars":48,"difficulty_score":23,"last_commit_at":49,"category_tags":50,"status":16},2268,"ML-For-Beginners","microsoft\u002FML-For-Beginners","ML-For-Beginners 是由微软推出的一套系统化机器学习入门课程，旨在帮助零基础用户轻松掌握经典机器学习知识。这套课程将学习路径规划为 12 周，包含 26 节精炼课程和 52 
道配套测验，内容涵盖从基础概念到实际应用的完整流程，有效解决了初学者面对庞大知识体系时无从下手、缺乏结构化指导的痛点。\n\n无论是希望转型的开发者、需要补充算法背景的研究人员，还是对人工智能充满好奇的普通爱好者，都能从中受益。课程不仅提供了清晰的理论讲解，还强调动手实践，让用户在循序渐进中建立扎实的技能基础。其独特的亮点在于强大的多语言支持，通过自动化机制提供了包括简体中文在内的 50 多种语言版本，极大地降低了全球不同背景用户的学习门槛。此外，项目采用开源协作模式，社区活跃且内容持续更新，确保学习者能获取前沿且准确的技术资讯。如果你正寻找一条清晰、友好且专业的机器学习入门之路，ML-For-Beginners 将是理想的起点。",84991,"2026-04-05T10:45:23",[14,51,52,53,15,54,26,13,55],"数据工具","视频","插件","其他","音频",{"id":57,"name":58,"github_repo":59,"description_zh":60,"stars":61,"difficulty_score":10,"last_commit_at":62,"category_tags":63,"status":16},3128,"ragflow","infiniflow\u002Fragflow","RAGFlow 是一款领先的开源检索增强生成（RAG）引擎，旨在为大语言模型构建更精准、可靠的上下文层。它巧妙地将前沿的 RAG 技术与智能体（Agent）能力相结合，不仅支持从各类文档中高效提取知识，还能让模型基于这些知识进行逻辑推理和任务执行。\n\n在大模型应用中，幻觉问题和知识滞后是常见痛点。RAGFlow 通过深度解析复杂文档结构（如表格、图表及混合排版），显著提升了信息检索的准确度，从而有效减少模型“胡编乱造”的现象，确保回答既有据可依又具备时效性。其内置的智能体机制更进一步，使系统不仅能回答问题，还能自主规划步骤解决复杂问题。\n\n这款工具特别适合开发者、企业技术团队以及 AI 研究人员使用。无论是希望快速搭建私有知识库问答系统，还是致力于探索大模型在垂直领域落地的创新者，都能从中受益。RAGFlow 提供了可视化的工作流编排界面和灵活的 API 接口，既降低了非算法背景用户的上手门槛，也满足了专业开发者对系统深度定制的需求。作为基于 Apache 2.0 协议开源的项目，它正成为连接通用大模型与行业专有知识之间的重要桥梁。",77062,"2026-04-04T04:44:48",[15,14,13,26,54],{"id":65,"github_repo":66,"name":67,"description_en":68,"description_zh":69,"ai_summary_zh":69,"readme_en":70,"readme_zh":71,"quickstart_zh":72,"use_case_zh":73,"hero_image_url":74,"owner_login":75,"owner_name":76,"owner_avatar_url":77,"owner_bio":78,"owner_company":79,"owner_location":80,"owner_email":81,"owner_twitter":75,"owner_website":82,"owner_url":83,"languages":84,"stars":109,"forks":110,"last_commit_at":111,"license":112,"difficulty_score":113,"env_os":114,"env_gpu":115,"env_ram":114,"env_deps":116,"category_tags":128,"github_topics":129,"view_count":10,"oss_zip_url":138,"oss_zip_packed_at":138,"status":16,"created_at":139,"updated_at":140,"faqs":141,"releases":171},869,"Mukosame\u002FZooming-Slow-Mo-CVPR-2020","Zooming-Slow-Mo-CVPR-2020","Fast and Accurate One-Stage Space-Time Video Super-Resolution (accepted in CVPR 2020)","Zooming-Slow-Mo-CVPR-2020 是一款基于 PyTorch 构建的视频增强开源项目，核心功能是将低分辨率、低帧率的输入视频直接合成为高清且流畅的慢动作视频。它有效解决了传统方案中视频超分辨率与帧插值需分步处理的效率瓶颈，通过单阶段时空网络实现端到端的画质提升，大幅减少了计算开销。\n\n该项目主要面向计算机视觉领域的研究人员、算法工程师及具备一定开发能力的视频技术爱好者。其技术亮点在于创新性地结合了特征时序插值网络与可变形 ConvLSTM 模块，能够精准对齐并聚合时序信息，在 Vid4 和 Vimeo 等权威测试集上取得了领先的画质评估指标。\n\n需要注意的是，运行 Zooming-Slow-Mo-CVPR-2020 需要配备 NVIDIA GPU 及相关 CUDA 环境，并涉及部分代码编译工作。对于希望深入研究视频生成技术、复现顶会论文或寻求高质量视频预处理方案的开发者而言，Zooming-Slow-Mo-CVPR-2020 是一个兼具学术价值与实践意义的优秀资源，官方也提供了完整的训练与测试代码供参考。","# Zooming-Slow-Mo (CVPR-2020)\n\nBy [Xiaoyu Xiang\u003Csup>\\*\u003C\u002Fsup>](https:\u002F\u002Fengineering.purdue.edu\u002Fpeople\u002Fxiaoyu.xiang.1), [Yapeng Tian\u003Csup>\\*\u003C\u002Fsup>](http:\u002F\u002Fyapengtian.org\u002F), [Yulun Zhang](http:\u002F\u002Fyulunzhang.com\u002F), [Yun Fu](http:\u002F\u002Fwww1.ece.neu.edu\u002F~yunfu\u002F), [Jan P. 
Allebach\u003Csup>+\u003C\u002Fsup>](https:\u002F\u002Fengineering.purdue.edu\u002F~allebach\u002F), [Chenliang Xu\u003Csup>+\u003C\u002Fsup>](https:\u002F\u002Fwww.cs.rochester.edu\u002F~cxu22\u002F) (\u003Csup>\\*\u003C\u002Fsup> equal contributions, \u003Csup>+\u003C\u002Fsup> equal advising)\n\nThis is the official PyTorch implementation of _Zooming Slow-Mo: Fast and Accurate One-Stage Space-Time Video Super-Resolution_.\n\n#### [Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2002.11616) | [Journal Version](https:\u002F\u002Farxiv.org\u002Fabs\u002F2104.07473) | [Demo (YouTube)](https:\u002F\u002Fyoutu.be\u002F8mgD8JxBOus) | [1-min teaser (YouTube)](https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=C1o85AXUNl8) | [1-min teaser (Bilibili)](https:\u002F\u002Fwww.bilibili.com\u002Fvideo\u002FBV1GK4y1t7nb\u002F)\n\n\u003Ctable>\n  \u003Cthead>\n    \u003Ctr>\n      \u003Ctd>Input&nbsp;&nbsp;&nbsp;&nbsp;\u003C\u002Ftd>\n      \u003Ctd>Output\u003C\u002Ftd>\n    \u003C\u002Ftr>\n  \u003C\u002Fthead>\n  \u003Ctr>\n    \u003Ctd colspan=\"2\">\n      \u003Ca href=\"https:\u002F\u002Fyoutu.be\u002F8mgD8JxBOus\">\n        \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FMukosame_Zooming-Slow-Mo-CVPR-2020_readme_48f33d8bf135.gif\" alt=\"Demo GIF\">\n        \u003C\u002Fimg>\n      \u003C\u002Fa>\n    \u003C\u002Ftd>\n  \u003C\u002Ftr>\n\u003C\u002Ftable>\n\n## Updates\n\n- 2020.3.13 Add meta-info of datasets used in this paper\n- 2020.3.11 Add new function: video converter\n- 2020.3.10 Upload the complete code and pretrained models\n\n## Contents\n\n0. [Introduction](#introduction)\n1. [Prerequisites](#Prerequisites)\n2. [Get Started](#Get-Started)\n   - [Installation](#Installation)\n   - [Training](#Training)\n   - [Testing](#Testing)\n   - [Colab Notebook](#Colab-Notebook)\n3. [Citations](#citations)\n4. [Contact](#Contact)\n5. [License](#License)\n6. [Acknowledgments](#Acknowledgments)\n\n## Introduction\n\nThe repository contains the entire project (including all the preprocessing) for one-stage space-time video super-resolution with Zooming Slow-Mo.\n\nZooming Slow-Mo is a recently proposed joint video frame interpolation (VFI) and video super-resolution (VSR) method, which directly synthesizes an HR slow-motion video from an LFR, LR video. It was published in [CVPR 2020](http:\u002F\u002Fcvpr2020.thecvf.com\u002F). The most up-to-date paper with supplementary materials can be found at [arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F2002.11616).\n\nIn Zooming Slow-Mo, we first temporally interpolate features of the missing LR frame using the proposed feature temporal interpolation network. Then, we propose a deformable ConvLSTM to align and aggregate temporal information simultaneously. Finally, a deep reconstruction network is adopted to predict HR slow-motion video frames. 
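As a rough illustration of this three-step idea, here is a minimal, hypothetical PyTorch sketch (plain convolutions stand in for the feature temporal interpolation network and the deformable ConvLSTM; module names are illustrative and this is not the repository's actual code):\n\n```Python\nimport torch\nimport torch.nn as nn\n\nclass TinyZSM(nn.Module):\n    # Hypothetical stand-ins: plain convolutions replace the feature temporal\n    # interpolation network and the deformable ConvLSTM of the real model.\n    def __init__(self, ch=64, scale=4):\n        super().__init__()\n        self.extract = nn.Conv2d(3, ch, 3, padding=1)\n        self.interp = nn.Conv2d(2 * ch, ch, 3, padding=1)   # feature temporal interpolation\n        self.aggregate = nn.Conv2d(ch, ch, 3, padding=1)    # temporal alignment and aggregation\n        self.reconstruct = nn.Sequential(\n            nn.Conv2d(ch, 3 * scale * scale, 3, padding=1),\n            nn.PixelShuffle(scale),                         # spatial upscaling\n        )\n\n    def forward(self, lr):\n        # lr: (batch, time, 3, H, W) low-frame-rate, low-resolution frames\n        feats = [self.extract(f) for f in lr.unbind(dim=1)]\n        full = []\n        for a, b in zip(feats[:-1], feats[1:]):\n            # synthesize the feature of the missing intermediate frame\n            full += [a, self.interp(torch.cat([a, b], dim=1))]\n        full.append(feats[-1])\n        hr = [self.reconstruct(self.aggregate(f)) for f in full]\n        return torch.stack(hr, dim=1)   # (batch, 2 * time - 1, 3, scale * H, scale * W)\n\nprint(TinyZSM()(torch.rand(1, 4, 3, 32, 32)).shape)   # torch.Size([1, 7, 3, 128, 128])\n```\n\n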
If our proposed architectures also help your research, please consider citing our paper.\n\nZooming Slow-Mo achieves state-of-the-art PSNR and SSIM on the Vid4 and Vimeo test sets.\n\n![framework](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FMukosame_Zooming-Slow-Mo-CVPR-2020_readme_a40204826932.png)\n\n## Prerequisites\n\n- Python 3 (we recommend [Anaconda](https:\u002F\u002Fwww.anaconda.com\u002Fdownload\u002F#linux))\n- [PyTorch >= 1.1](https:\u002F\u002Fpytorch.org\u002F)\n- NVIDIA GPU + [CUDA](https:\u002F\u002Fdeveloper.nvidia.com\u002Fcuda-downloads)\n- [Deformable Convolution v2](https:\u002F\u002Farxiv.org\u002Fabs\u002F1811.11168); we adopt [CharlesShang's implementation](https:\u002F\u002Fgithub.com\u002FCharlesShang\u002FDCNv2) as a submodule.\n- Python packages: `pip install numpy opencv-python lmdb pyyaml pickle5 matplotlib seaborn`\n\n## Get Started\n\n### Installation\n\nFirst, make sure your machine has a GPU, which is required for the DCNv2 module. Then install the required packages: `pip install -r requirements.txt`\n\n1. Clone the Zooming Slow-Mo repository. We'll refer to the directory you cloned Zooming Slow-Mo into as ZOOMING_ROOT.\n\n```Shell\ngit clone --recursive https:\u002F\u002Fgithub.com\u002FMukosame\u002FZooming-Slow-Mo-CVPR-2020.git\n```\n\n2. Compile the DCNv2 module:\n\n```Shell\ncd $ZOOMING_ROOT\u002Fcodes\u002Fmodels\u002Fmodules\u002FDCNv2\nbash make.sh         # build\npython test.py    # run examples and gradient check\n```\n\nPlease make sure the test script finishes successfully without any errors before running the following experiments.\n\n### Training\n\n#### Part 1: Data Preparation\n\n1. Download the original training + test set of `Vimeo-septuplet` (82 GB).\n\n```Shell\nwget http:\u002F\u002Fdata.csail.mit.edu\u002Ftofu\u002Fdataset\u002Fvimeo_septuplet.zip\napt-get install unzip\nunzip vimeo_septuplet.zip\n```\n\n2. Split the `Vimeo-septuplet` into a training set and a test set. Make sure you change the dataset path in the script to your download path; you need to run it separately for the training set and the test set:\n\n```Shell\npython $ZOOMING_ROOT\u002Fcodes\u002Fdata_scripts\u002Fsep_vimeo_list.py\n```\n\nThis will create `train` and `test` folders in **`vimeo_septuplet\u002Fsequences`**. The folder structure is as follows:\n\n```\nvimeo_septuplet\n├── sequences\n    ├── 00001\n        ├── 0266\n            ├── im1.png\n            ├── ...\n            ├── im7.png\n        ├── 0268...\n    ├── 00002...\n├── readme.txt\n├── sep_trainlist.txt\n├── sep_testlist.txt\n```\n\n3. Generate low-resolution (LR) images. You can do this via either MATLAB or Python (remember to configure the input and output paths):\n\n```Matlab\n% In the MATLAB command window\nrun $ZOOMING_ROOT\u002Fcodes\u002Fdata_scripts\u002Fgenerate_LR_Vimeo90K.m\n```\n\n```Shell\npython $ZOOMING_ROOT\u002Fcodes\u002Fdata_scripts\u002Fgenerate_mod_LR_bic.py\n```\n\n4. Create the LMDB files for faster I\u002FO. Note that you need to configure the input and output paths in the following script:\n\n```Shell\npython $ZOOMING_ROOT\u002Fcodes\u002Fdata_scripts\u002Fcreate_lmdb_mp.py\n```\n\nThe structure of the generated LMDB folder is as follows:\n\n```\nVimeo7_train.lmdb\n├── data.mdb\n├── lock.mdb\n├── meta_info.txt\n```\n\n#### Part 2: Train\n\n**Note:** In this part, we assume you are in the directory **`$ZOOMING_ROOT\u002Fcodes\u002F`**.\n\n1. Configure your training settings, which can be found under [options\u002Ftrain](.\u002Fcodes\u002Foptions\u002Ftrain). 
Our training settings in the paper can be found at [train_zsm.yml](https:\u002F\u002Fgithub.com\u002FMukosame\u002FZooming-Slow-Mo-CVPR-2020\u002Fblob\u002Fmaster\u002Fcodes\u002Foptions\u002Ftrain\u002Ftrain_zsm.yml). We'll take this setting as an example to illustrate the following steps.\n\n2. Train the Zooming Slow-Mo model.\n\n```Shell\npython train.py -opt options\u002Ftrain\u002Ftrain_zsm.yml\n```\n\nAfter training, your model `xxxx_G.pth`, its training states, and a corresponding log file `train_LunaTokis_scratch_b16p32f5b40n7l1_600k_Vimeo_xxxx.log` will be placed in `$ZOOMING_ROOT\u002Fexperiments\u002FLunaTokis_scratch_b16p32f5b40n7l1_600k_Vimeo\u002F`.\n\n### Testing\n\nWe provide the test code for both standard test sets (Vid4, SPMC, etc.) and custom video frames.\n\n#### Pretrained Models\n\nOur pretrained model can be downloaded via [GitHub](https:\u002F\u002Fgithub.com\u002FMukosame\u002FZooming-Slow-Mo-CVPR-2020\u002Fblob\u002Fmaster\u002Fexperiments\u002Fpretrained_models\u002Fxiang2020zooming.pth) or [Google Drive](https:\u002F\u002Fdrive.google.com\u002Fopen?id=1xeOoZclGeSI1urY6mVCcApfCqOPgxMBK).\n\n#### From Video\n\nIf you have installed ffmpeg, you can convert any video to a high-resolution and high-frame-rate video using [video_to_zsm.py](.\u002Fcodes\u002Fvideo_to_zsm.py). The corresponding commands are:\n\n```Shell\ncd $ZOOMING_ROOT\u002Fcodes\npython video_to_zsm.py --video PATH\u002FTO\u002FVIDEO.mp4 --model PATH\u002FTO\u002FPRETRAINED\u002FMODEL.pth --output PATH\u002FTO\u002FOUTPUT.mp4\n```\n\nWe also provide the above commands as a shell script, so you can directly run:\n\n```Shell\nbash zsm_my_video.sh\n```\n\n#### From Extracted Frames\n\nAs a quick start, we also provide some example images in the [test_example](.\u002Ftest_example) folder. You can test the model with the following commands:\n\n```Shell\ncd $ZOOMING_ROOT\u002Fcodes\npython test.py\n```\n\n- You can also put your own test folders in [test_example](.\u002Ftest_example), or just change the input path, the number of frames, etc. in [test.py](codes\u002Ftest.py).\n\n- Your custom test results will be saved to `$ZOOMING_ROOT\u002Fresults\u002Fyour_data_name\u002F`.\n\n#### Evaluate on Standard Test Sets\n\nThe [test.py](codes\u002Ftest.py) script also provides modes for evaluation on the following test sets: `Vid4`, `SPMC`, etc. We evaluate PSNR and SSIM on the Y channel in YCrCb color space. The commands are the same as the ones above. All you need to do is change the `data_mode` and the corresponding path of the standard test set.\n\n### Colab Notebook\n\nPyTorch Colab notebook (provided by [@HanClinto](https:\u002F\u002Fgithub.com\u002FHanClinto)): [HighResSlowMo.ipynb](https:\u002F\u002Fgist.github.com\u002FHanClinto\u002F49219942f76d5f20990b6d048dbacbaf)\n\n## Citations\n\nIf you find the code helpful in your research or work, please cite the following papers.\n\n```BibTex\n@misc{xiang2021zooming,\n  title={Zooming SlowMo: An Efficient One-Stage Framework for Space-Time Video Super-Resolution},\n  author={Xiang, Xiaoyu and Tian, Yapeng and Zhang, Yulun and Fu, Yun and Allebach, Jan P. and Xu, Chenliang},\n  archivePrefix={arXiv},\n  eprint={2104.07473},\n  year={2021},\n  primaryClass={cs.CV}\n}\n\n@InProceedings{xiang2020zooming,\n  author = {Xiang, Xiaoyu and Tian, Yapeng and Zhang, Yulun and Fu, Yun and Allebach, Jan P. 
and Xu, Chenliang},\n  title = {Zooming Slow-Mo: Fast and Accurate One-Stage Space-Time Video Super-Resolution},\n  booktitle = {IEEE\u002FCVF Conference on Computer Vision and Pattern Recognition (CVPR)},\n  pages={3370--3379},\n  month = {June},\n  year = {2020}\n}\n\n@InProceedings{tian2018tdan,\n  author={Tian, Yapeng and Zhang, Yulun and Fu, Yun and Xu, Chenliang},\n  title={TDAN: Temporally Deformable Alignment Network for Video Super-Resolution},\n  booktitle = {IEEE\u002FCVF Conference on Computer Vision and Pattern Recognition (CVPR)},\n  pages={3360--3369},\n  month = {June},\n  year = {2020}\n}\n\n@InProceedings{wang2019edvr,\n  author    = {Wang, Xintao and Chan, Kelvin C.K. and Yu, Ke and Dong, Chao and Loy, Chen Change},\n  title     = {EDVR: Video restoration with enhanced deformable convolutional networks},\n  booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)},\n  month     = {June},\n  year      = {2019},\n}\n```\n\n## Contact\n\n[Xiaoyu Xiang](https:\u002F\u002Fengineering.purdue.edu\u002Fpeople\u002Fxiaoyu.xiang.1) and [Yapeng Tian](http:\u002F\u002Fyapengtian.org\u002F).\n\nYou can also leave your questions as issues in the repository. We will be glad to answer them.\n\n## License\n\nThis project is released under the [GNU General Public License v3.0](https:\u002F\u002Fgithub.com\u002FMukosame\u002FZooming-Slow-Mo-CVPR-2020\u002Fblob\u002Fmaster\u002FLICENSE).\n\n## Acknowledgments\n\nOur code is inspired by [TDAN-VSR](https:\u002F\u002Fgithub.com\u002FYapengTian\u002FTDAN-VSR) and [EDVR](https:\u002F\u002Fgithub.com\u002Fxinntao\u002FEDVR).\n","# Zooming-Slow-Mo (CVPR-2020)\n\nBy [Xiaoyu Xiang\u003Csup>\\*\u003C\u002Fsup>](https:\u002F\u002Fengineering.purdue.edu\u002Fpeople\u002Fxiaoyu.xiang.1), [Yapeng Tian\u003Csup>\\*\u003C\u002Fsup>](http:\u002F\u002Fyapengtian.org\u002F), [Yulun Zhang](http:\u002F\u002Fyulunzhang.com\u002F), [Yun Fu](http:\u002F\u002Fwww1.ece.neu.edu\u002F~yunfu\u002F), [Jan P. Allebach\u003Csup>+\u003C\u002Fsup>](https:\u002F\u002Fengineering.purdue.edu\u002F~allebach\u002F), [Chenliang Xu\u003Csup>+\u003C\u002Fsup>](https:\u002F\u002Fwww.cs.rochester.edu\u002F~cxu22\u002F) (\u003Csup>\\*\u003C\u002Fsup> 同等贡献，\u003Csup>+\u003C\u002Fsup> 同等指导)\n\n这是 _Zooming Slow-Mo: Fast and Accurate One-Stage Space-Time Video Super-Resolution_（Zooming Slow-Mo：快速且准确的单阶段时空视频超分辨率）的官方 PyTorch (深度学习框架) 实现。\n\n#### [论文](https:\u002F\u002Farxiv.org\u002Fabs\u002F2002.11616) | [期刊版本](https:\u002F\u002Farxiv.org\u002Fabs\u002F2104.07473) | [演示 (YouTube)](https:\u002F\u002Fyoutu.be\u002F8mgD8JxBOus) | [1 分钟预告 (YouTube)](https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=C1o85AXUNl8) | [1 分钟预告 (Bilibili)](https:\u002F\u002Fwww.bilibili.com\u002Fvideo\u002FBV1GK4y1t7nb\u002F)\n\n\u003Ctable>\n  \u003Cthead>\n    \u003Ctr>\n      \u003Ctd>输入&nbsp;&nbsp;&nbsp;&nbsp;\u003C\u002Ftd>\n      \u003Ctd>输出\u003C\u002Ftd>\n    \u003C\u002Ftr>\n  \u003C\u002Fthead>\n  \u003Ctr>\n    \u003Ctd colspan=\"2\">\n      \u003Ca href=\"https:\u002F\u002Fyoutu.be\u002F8mgD8JxBOus\">\n        \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FMukosame_Zooming-Slow-Mo-CVPR-2020_readme_48f33d8bf135.gif\" alt=\"Demo GIF\">\n        \u003C\u002Fimg>\n      \u003C\u002Fa>\n    \u003C\u002Ftd>\n  \u003C\u002Ftr>\n\u003C\u002Ftable>\n\n## Updates\n\n- 2020.3.13 添加本论文所用数据集的元信息 (meta-info)\n- 2020.3.11 添加新功能：视频转换器 (video converter)\n- 2020.3.10 上传完整代码和预训练模型 (pretrained models)\n\n## Contents\n\n0. [介绍](#introduction)\n1. [前置要求](#Prerequisites)\n2. 
[开始使用](#Get-Started)\n   - [安装](#Installation)\n   - [训练](#Training)\n   - [测试](#Testing)\n   - [Colab Notebook](#Colab-Notebook)\n3. [引用](#citations)\n4. [联系](#Contact)\n5. [许可](#License)\n6. [致谢](#Acknowledgments)\n\n## Introduction\n\n该仓库包含了 Zooming Slow-Mo 单阶段时空视频超分辨率项目的全部内容（包括所有预处理步骤）。\n\nZooming Slow-Mo 是一种最近提出的联合视频帧插值 (Video Frame Interpolation, VFI) 和视频超分辨率 (Video Super-Resolution, VSR) 方法，它直接从低帧率 (Low Frame Rate, LFR)、低分辨率 (Low Resolution, LR) 视频中合成高分辨率 (High Resolution, HR) 慢动作视频。该论文发表于 [CVPR 2020](http:\u002F\u002Fcvpr2020.thecvf.com\u002F)。带有补充材料的最新论文可在 [arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F2002.11616) 找到。\n\n在 Zooming Slow-Mo 中，我们首先通过提出的特征时间插值网络对缺失的 LR 帧进行时间上的特征插值。然后，我们提出了一种可变形 ConvLSTM 来同时对齐和聚合时间信息。最后，采用深度重建网络来预测 HR 慢动作视频帧。如果我们提出的架构也对您的研究有所帮助，请考虑引用我们的论文。\n\nZooming Slow-Mo 在 Vid4 和 Vimeo 测试集上通过 PSNR (峰值信噪比) 和 SSIM (结构相似性指数) 达到了最先进的性能。\n\n![framework](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FMukosame_Zooming-Slow-Mo-CVPR-2020_readme_a40204826932.png)\n\n## Prerequisites\n\n- Python 3 (建议使用 [Anaconda](https:\u002F\u002Fwww.anaconda.com\u002Fdownload\u002F#linux))\n- [PyTorch >= 1.1](https:\u002F\u002Fpytorch.org\u002F)\n- NVIDIA GPU + [CUDA](https:\u002F\u002Fdeveloper.nvidia.com\u002Fcuda-downloads) (并行计算架构)\n- [Deformable Convolution v2](https:\u002F\u002Farxiv.org\u002Fabs\u002F1811.11168) (可变形卷积 v2)，我们在子模块中采用了 [CharlesShang 的实现](https:\u002F\u002Fgithub.com\u002FCharlesShang\u002FDCNv2)。\n- Python 包：`pip install numpy opencv-python lmdb pyyaml pickle5 matplotlib seaborn`\n\n## Get Started\n\n### Installation\n\n安装所需的包：`pip install -r requirements.txt`\n\n首先，确保您的机器拥有 GPU，因为 DCNv2 模块需要 GPU 支持。\n\n1. 克隆 Zooming Slow-Mo 仓库。我们将您克隆的 Zooming Slow-Mo 目录称为 ZOOMING_ROOT。\n\n```Shell\ngit clone --recursive https:\u002F\u002Fgithub.com\u002FMukosame\u002FZooming-Slow-Mo-CVPR-2020.git\n```\n\n2. 编译 DCNv2：\n\n```Shell\ncd $ZOOMING_ROOT\u002Fcodes\u002Fmodels\u002Fmodules\u002FDCNv2\nbash make.sh         # build\npython test.py    # run examples and gradient check\n```\n\n请在运行以下实验之前，确保测试脚本成功完成且没有任何错误。\n\n### Training\n\n#### Part 1: Data Preparation\n\n1. 下载原始的 `Vimeo-septuplet` 训练 + 测试集 (82 GB)。\n\n```Shell\nwget http:\u002F\u002Fdata.csail.mit.edu\u002Ftofu\u002Fdataset\u002Fvimeo_septuplet.zip\napt-get install unzip\nunzip vimeo_septuplet.zip\n```\n\n2. 将 `Vimeo-septuplet` 拆分为训练集和测试集，请确保您在脚本中将数据集的路径更改为您的下载路径，并且您需要分别为训练集和测试集运行：\n\n```Shell\npython $ZOOMING_ROOT\u002Fcodes\u002Fdata_scripts\u002Fsep_vimeo_list.py\n```\n\n这将在 **`vimeo_septuplet\u002Fsequences`** 目录下创建 `train` 和 `test` 文件夹。文件夹结构如下：\n\n```\nvimeo_septuplet\n├── sequences\n    ├── 00001\n        ├── 0266\n            ├── im1.png\n            ├── ...\n            ├── im7.png\n        ├── 0268...\n    ├── 00002...\n├── readme.txt\n├── sep_trainlist.txt\n├── sep_testlist.txt\n```\n\n3. 生成低分辨率 (LR) 图像。您可以通过 MATLAB 或 Python 完成此操作（记得配置输入和输出路径）：\n\n```Matlab\n% In the MATLAB command window\nrun $ZOOMING_ROOT\u002Fcodes\u002Fdata_scripts\u002Fgenerate_LR_Vimeo90K.m\n```\n\n```Shell\npython $ZOOMING_ROOT\u002Fcodes\u002Fdata_scripts\u002Fgenerate_mod_LR_bic.py\n```\n\n4. 创建 LMDB (轻量级键值存储数据库) 文件以获得更快的 I\u002FO 速度。注意，您需要在以下脚本中配置您的输入和输出路径：\n\n```Shell\npython $ZOOMING_ROOT\u002Fcodes\u002Fdata_scripts\u002Fcreate_lmdb_mp.py\n```\n\n生成的 LMDB 文件夹结构如下：\n\n```\nVimeo7_train.lmdb\n├── data.mdb\n├── lock.mdb\n├── meta_info.txt\n```\n\n#### Part 2: Train\n\n**注意：** 在本部分中，假设您位于目录 **`$ZOOMING_ROOT\u002Fcodes\u002F`** 下。\n\n1. 
配置您的训练设置，可以在 [options\u002Ftrain](.\u002Fcodes\u002Foptions\u002Ftrain) 中找到。我们论文中的训练设置可以在 [train_zsm.yml](https:\u002F\u002Fgithub.com\u002FMukosame\u002FZooming-Slow-Mo-CVPR-2020\u002Fblob\u002Fmaster\u002Fcodes\u002Foptions\u002Ftrain\u002Ftrain_zsm.yml) 中找到。我们将以此设置为例来说明以下步骤。\n\n2. 训练 Zooming Slow-Mo 模型。\n\n```Shell\npython train.py -opt options\u002Ftrain\u002Ftrain_zsm.yml\n```\n\n训练完成后，您的模型 `xxxx_G.pth` 及其训练状态，以及相应的日志文件 `train_LunaTokis_scratch_b16p32f5b40n7l1_600k_Vimeo_xxxx.log` 将放置在 `$ZOOMING_ROOT\u002Fexperiments\u002FLunaTokis_scratch_b16p32f5b40n7l1_600k_Vimeo\u002F` 目录中。\n\n### 测试\n\n我们提供了针对标准测试集（Vid4, SPMC 等）和自定义视频帧的测试代码。\n\n#### 预训练模型\n\n我们的预训练模型可以通过 [GitHub](https:\u002F\u002Fgithub.com\u002FMukosame\u002FZooming-Slow-Mo-CVPR-2020\u002Fblob\u002Fmaster\u002Fexperiments\u002Fpretrained_models\u002Fxiang2020zooming.pth) 或 [Google Drive](https:\u002F\u002Fdrive.google.com\u002Fopen?id=1xeOoZclGeSI1urY6mVCcApfCqOPgxMBK) 下载。\n\n#### 从视频\n\n如果您已安装 ffmpeg (视频处理工具)，可以使用 [video_to_zsm.py](.\u002Fcodes\u002Fvideo_to_zsm.py) 将任何视频转换为高分辨率和高帧率的视频。相应的命令如下：\n\n```Shell\ncd $ZOOMING_ROOT\u002Fcodes\npython video_to_zsm.py --video PATH\u002FTO\u002FVIDEO.mp4 --model PATH\u002FTO\u002FPRETRAINED\u002FMODEL.pth --output PATH\u002FTO\u002FOUTPUT.mp4\n```\n\n我们也已将上述命令写入 Shell (命令行外壳) 脚本，因此您可以直接运行：\n\n```Shell\nbash zsm_my_video.sh\n```\n\n#### 从提取的帧\n\n为了快速开始，我们在 [test_example](.\u002Ftest_example) 文件夹中也提供了一些示例图像。您可以使用以下命令测试模型：\n\n```Shell\ncd $ZOOMING_ROOT\u002Fcodes\npython test.py\n```\n\n- 您也可以将自己的测试文件夹放入 [test_example](.\u002Ftest_example)，或者直接在 [test.py](codes\u002Ftest.py) 中更改输入路径、帧数等。\n\n- 您的自定义测试结果将保存到此处的文件夹：`$ZOOMING_ROOT\u002Fresults\u002Fyour_data_name\u002F`。\n\n#### 在标准测试集上评估\n\n[test.py](codes\u002Ftest.py) 脚本还提供了针对以下测试集的评估模式：`Vid4`, `SPMC` 等。我们在 YCrCb 色彩空间的 Y 通道上评估 PSNR (峰值信噪比) 和 SSIM (结构相似性)。命令与上述相同。您需要做的就是更改 `data_mode` 和标准测试集对应的路径。\n\n### Colab 笔记本\n\nPyTorch Colab 笔记本（由 [@HanClinto](https:\u002F\u002Fgithub.com\u002FHanClinto) 提供）：[HighResSlowMo.ipynb](https:\u002F\u002Fgist.github.com\u002FHanClinto\u002F49219942f76d5f20990b6d048dbacbaf)\n\n## 引用\n\n如果您发现该代码对您的研究或工作有帮助，请引用以下论文。\n\n```BibTex\n@misc{xiang2021zooming,\n  title={Zooming SlowMo: An Efficient One-Stage Framework for Space-Time Video Super-Resolution},\n  author={Xiang, Xiaoyu and Tian, Yapeng and Zhang, Yulun and Fu, Yun and Allebach, Jan P. and Xu, Chenliang},\n  archivePrefix={arXiv},\n  eprint={2104.07473},\n  year={2021},\n  primaryClass={cs.CV}\n}\n\n@InProceedings{xiang2020zooming,\n  author = {Xiang, Xiaoyu and Tian, Yapeng and Zhang, Yulun and Fu, Yun and Allebach, Jan P. and Xu, Chenliang},\n  title = {Zooming Slow-Mo: Fast and Accurate One-Stage Space-Time Video Super-Resolution},\n  booktitle = {IEEE\u002FCVF Conference on Computer Vision and Pattern Recognition (CVPR)},\n  pages={3370--3379},\n  month = {June},\n  year = {2020}\n}\n\n@InProceedings{tian2018tdan,\n  author={Tian, Yapeng and Zhang, Yulun and Fu, Yun and Xu, Chenliang},\n  title={TDAN: Temporally Deformable Alignment Network for Video Super-Resolution},\n  booktitle = {IEEE\u002FCVF Conference on Computer Vision and Pattern Recognition (CVPR)},\n  pages={3360--3369},\n  month = {June},\n  year = {2020}\n}\n\n@InProceedings{wang2019edvr,\n  author    = {Wang, Xintao and Chan, Kelvin C.K. 
and Yu, Ke and Dong, Chao and Loy, Chen Change},\n  title     = {EDVR: Video restoration with enhanced deformable convolutional networks},\n  booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)},\n  month     = {June},\n  year      = {2019},\n}\n```\n\n## 联系\n\n[Xiaoyu Xiang](https:\u002F\u002Fengineering.purdue.edu\u002Fpeople\u002Fxiaoyu.xiang.1) 和 [Yapeng Tian](http:\u002F\u002Fyapengtian.org\u002F)。\n\n您也可以在仓库中以 Issue (问题) 的形式留下您的问题。我们将很乐意为您解答。\n\n## 许可证\n\n本项目基于 [GNU General Public License v3.0](https:\u002F\u002Fgithub.com\u002FMukosame\u002FZooming-Slow-Mo-CVPR-2020\u002Fblob\u002Fmaster\u002FLICENSE) 发布。\n\n## 致谢\n\n我们的代码灵感来源于 [TDAN-VSR](https:\u002F\u002Fgithub.com\u002FYapengTian\u002FTDAN-VSR) 和 [EDVR](https:\u002F\u002Fgithub.com\u002Fxinntao\u002FEDVR)。","# Zooming-Slow-Mo-CVPR-2020 快速上手指南\n\n本项目是 CVPR 2020 论文《Zooming Slow-Mo: Fast and Accurate One-Stage Space-Time Video Super-Resolution》的官方 PyTorch 实现。它能够将低帧率（LFR）、低分辨率（LR）的视频直接合成为高分辨率（HR）的慢动作视频。\n\n## 1. 环境准备\n\n在开始之前，请确保您的开发环境满足以下要求：\n\n- **操作系统**: Linux \u002F Windows (推荐 Linux)\n- **硬件**: NVIDIA GPU + [CUDA](https:\u002F\u002Fdeveloper.nvidia.com\u002Fcuda-downloads)\n- **软件**:\n  - Python 3 (推荐使用 [Anaconda](https:\u002F\u002Fwww.anaconda.com\u002Fdownload\u002F#linux))\n  - PyTorch >= 1.1\n  - FFmpeg (用于视频处理)\n- **依赖包**:\n  ```bash\n  numpy opencv-python lmdb pyyaml pickle5 matplotlib seaborn\n  ```\n\n## 2. 安装步骤\n\n### 2.1 克隆代码库\n将项目克隆到本地，建议设置环境变量 `ZOOMING_ROOT` 指向克隆目录。\n\n```Shell\ngit clone --recursive https:\u002F\u002Fgithub.com\u002FMukosame\u002FZooming-Slow-Mo-CVPR-2020.git\nexport ZOOMING_ROOT=\u002Fpath\u002Fto\u002FZooming-Slow-Mo-CVPR-2020\n```\n\n### 2.2 编译 DCNv2 模块\n由于使用了可变形卷积 v2 (DCNv2)，需要手动编译该子模块。\n\n```Shell\ncd $ZOOMING_ROOT\u002Fcodes\u002Fmodels\u002Fmodules\u002FDCNv2\nbash make.sh         # 构建\npython test.py       # 运行示例和梯度检查，确保无错误\n```\n\n### 2.3 安装 Python 依赖\n在项目根目录下安装所需的 Python 包。\n\n```Shell\ncd $ZOOMING_ROOT\npip install -r requirements.txt\n```\n\n## 3. 
基本使用\n\n本部分介绍如何使用预训练模型对视频进行超分辨率和插帧处理。\n\n### 3.1 下载预训练模型\n您可以从以下链接下载官方提供的预训练权重文件 `xiang2020zooming.pth`：\n- [GitHub](https:\u002F\u002Fgithub.com\u002FMukosame\u002FZooming-Slow-Mo-CVPR-2020\u002Fblob\u002Fmaster\u002Fexperiments\u002Fpretrained_models\u002Fxiang2020zooming.pth)\n- [Google Drive](https:\u002F\u002Fdrive.google.com\u002Fopen?id=1xeOoZclGeSI1urY6mVCcApfCqOPgxMBK)\n\n将下载的模型文件放置在您方便访问的路径下。\n\n### 3.2 视频处理 (Video to Video)\n如果您已安装 ffmpeg，可以直接将任意视频转换为高分辨率、高帧率的视频。\n\n```Shell\ncd $ZOOMING_ROOT\u002Fcodes\npython video_to_zsm.py --video PATH\u002FTO\u002FVIDEO.mp4 --model PATH\u002FTO\u002FPRETRAINED\u002FMODEL.pth --output PATH\u002FTO\u002FOUTPUT.mp4\n```\n\n或者直接使用提供的 Shell 脚本：\n\n```Shell\nbash zsm_my_video.sh\n```\n\n### 3.3 图像序列测试 (From Frames)\n如果您已有提取好的图像序列，可以使用 `test.py` 进行测试。\n\n```Shell\ncd $ZOOMING_ROOT\u002Fcodes\npython test.py\n```\n\n- 默认测试数据位于 `test_example` 文件夹。\n- 自定义测试时，请将图片放入 `test_example` 或修改 `test.py` 中的输入路径及帧数参数。\n- 结果将保存至 `$ZOOMING_ROOT\u002Fresults\u002Fyour_data_name\u002F`。","安防监控中心的技术人员正在复盘一起夜间盗窃案，手头只有一段画质模糊且播放卡顿的低分辨率监控录像，急需看清嫌疑人面部细节。\n\n### 没有 Zooming-Slow-Mo-CVPR-2020 时\n- 原始视频分辨率过低，人脸和车牌等关键信息模糊不清，难以作为有效证据。\n- 帧率不足导致画面跳跃，无法通过逐帧分析捕捉嫌疑人的细微动作轨迹。\n- 传统方案需分别进行超分辨率和帧插值处理，流程繁琐且多次转换会累积噪声误差。\n- 多步骤处理耗时较长，无法满足紧急案件快速研判的时间要求。\n\n### 使用 Zooming-Slow-Mo-CVPR-2020 后\n- 利用其时空联合建模能力，直接生成高清慢动作视频，显著还原了人脸纹理与衣物特征。\n- 单阶段网络同时完成插帧与超分，动作过渡平滑自然，消除了传统方法的闪烁伪影。\n- 推理速度更快，在保持高精度的前提下大幅缩短了视频预处理时间。\n- 端到端优化减少了中间环节的信息损失，提升了最终输出画面的可信度与可用性。\n\nZooming-Slow-Mo-CVPR-2020 实现了低质视频的一站式高清慢放增强，极大提升了安防取证的效率与准确性。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FMukosame_Zooming-Slow-Mo-CVPR-2020_c0211053.png","Mukosame","Xiaoyu Xiang","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002FMukosame_04180a38.jpg","Research Scientist, Meta Reality Labs","Meta","Bay Area","xiaoyu.xiang.ai@gmail.com","https:\u002F\u002Fxiaoyux1ang.github.io\u002F","https:\u002F\u002Fgithub.com\u002FMukosame",[85,89,93,97,101,105],{"name":86,"color":87,"percentage":88},"Python","#3572A5",66.6,{"name":90,"color":91,"percentage":92},"Cuda","#3A4E3A",23.7,{"name":94,"color":95,"percentage":96},"C++","#f34b7d",6.5,{"name":98,"color":99,"percentage":100},"C","#555555",2.4,{"name":102,"color":103,"percentage":104},"MATLAB","#e16737",0.6,{"name":106,"color":107,"percentage":108},"Shell","#89e051",0.2,934,165,"2026-03-09T23:12:34","GPL-3.0",4,"未说明","需要 NVIDIA GPU，显存和CUDA版本未说明",{"notes":117,"python":118,"dependencies":119},"需手动编译DCNv2模块；训练前需预处理数据（下载数据集、生成低分辨率图像、创建LMDB索引）；支持自定义视频输入及标准测试集评估","3.x",[120,121,122,123,124,125,126,127],"torch>=1.1","numpy","opencv-python","lmdb","pyyaml","pickle5","matplotlib","seaborn",[14,52,13],[130,131,132,133,134,135,136,137],"pytorch","video","super-resolution","video-frame-interpolation","video-super-resolution","spatio-temporal","cvpr2020","cvpr",null,"2026-03-27T02:49:30.150509","2026-04-06T05:44:08.968890",[142,147,152,157,162,167],{"id":143,"question_zh":144,"answer_zh":145,"source_url":146},3735,"为什么使用预训练模型在 Vid4 数据集上得到的 PSNR 值远低于论文结果？","这是因为测试前需要使用仓库提供的脚本生成对应的低分辨率（LR）图像。直接使用原始高清图像会导致评估指标偏差。生成 LR 图像后，PSNR 结果应与论文接近（例如 Vid4 平均 PSNR 约 24.49 dB）。","https:\u002F\u002Fgithub.com\u002FMukosame\u002FZooming-Slow-Mo-CVPR-2020\u002Fissues\u002F29",{"id":148,"question_zh":149,"answer_zh":150,"source_url":151},3736,"模型训练开始后立刻结束，没有进行任何迭代，是什么原因？","检查日志中的 `Number of train images`，如果数量极少（如 3 张），说明数据集加载失败。此外，可能存在代码 Bug（如 `paths_GT` 为 set 
类型导致无法下标访问），建议更新代码至最新版本或检查数据集路径配置是否正确。","https:\u002F\u002Fgithub.com\u002FMukosame\u002FZooming-Slow-Mo-CVPR-2020\u002Fissues\u002F51",{"id":153,"question_zh":154,"answer_zh":155,"source_url":156},3737,"训练过程中 Loss 迅速变为 INF（梯度爆炸）该如何解决？","这通常表明 DCN（可变形卷积）模块没有正确学习偏移量。请检查网络结构配置，确保 DCN 部分初始化正确，并确认使用的配置文件符合官方要求。","https:\u002F\u002Fgithub.com\u002FMukosame\u002FZooming-Slow-Mo-CVPR-2020\u002Fissues\u002F6",{"id":158,"question_zh":159,"answer_zh":160,"source_url":161},3738,"测试 1920*1080 等高分辨率视频时出现 OOM（显存不足）怎么办？","可以尝试减小输入序列的长度、patch size（切片大小）或 batch size。对于长视频，由于分块数量巨大可能导致效率低下，建议分段处理或降低分辨率预处理。","https:\u002F\u002Fgithub.com\u002FMukosame\u002FZooming-Slow-Mo-CVPR-2020\u002Fissues\u002F64",{"id":163,"question_zh":164,"answer_zh":165,"source_url":166},3739,"如何使用自定义数据集进行训练和测试？需要配置 .yml 文件吗？","训练时需要编写 .yml 配置文件，保持基础参数一致，仅根据需求修改特定设置。测试通常需要先有训练好的模型。注意检查数据读取状态，避免出现 `Size = N\u002FA` 的异常。","https:\u002F\u002Fgithub.com\u002FMukosame\u002FZooming-Slow-Mo-CVPR-2020\u002Fissues\u002F63",{"id":168,"question_zh":169,"answer_zh":170,"source_url":161},3740,"运行时报错 `TypeError: list indices must be integers` 如何处理？","这是数据集加载代码的问题，`paths_GT` 可能是 set 类型而非 list，导致无法通过字符串键值访问。需修改代码确保索引操作合法，参考社区提交的 PR 修复相关代码行。",[]]