[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"tool-cszn--BSRGAN":3,"similar-cszn--BSRGAN":104},{"id":4,"github_repo":5,"name":6,"description_en":7,"description_zh":8,"ai_summary_zh":8,"readme_en":9,"readme_zh":10,"quickstart_zh":11,"use_case_zh":12,"hero_image_url":13,"owner_login":14,"owner_name":15,"owner_avatar_url":16,"owner_bio":17,"owner_company":18,"owner_location":19,"owner_email":20,"owner_twitter":21,"owner_website":22,"owner_url":23,"languages":24,"stars":29,"forks":30,"last_commit_at":31,"license":32,"difficulty_score":33,"env_os":34,"env_gpu":35,"env_ram":36,"env_deps":37,"category_tags":47,"github_topics":49,"view_count":33,"oss_zip_url":21,"oss_zip_packed_at":21,"status":54,"created_at":55,"updated_at":56,"faqs":57,"releases":103},140,"cszn\u002FBSRGAN","BSRGAN","Designing a Practical Degradation Model for Deep Blind Image Super-Resolution (ICCV, 2021) (PyTorch) - We released the training code!","BSRGAN 是一种用于盲图像超分辨率（Blind Super-Resolution）的深度学习方法，旨在将低质量、模糊或压缩过的低分辨率图像恢复为清晰的高分辨率图像。与传统超分方法假设图像退化过程已知不同，BSRGAN 提出了一种更贴近真实场景的退化模型，能同时处理多种未知的模糊、噪声和下采样因素，从而显著提升在真实图像上的重建效果。它特别适合研究人员和开发者使用，尤其适用于需要处理现实拍摄图像（如老照片、手机拍摄图等）的超分辨率任务。BSRGAN 的核心亮点在于其精心设计的退化合成策略，使训练数据更接近真实低质图像分布，并提供了完整的训练与测试代码（基于 PyTorch），便于复现和二次开发。项目还包含预训练模型，支持 2 倍和 4 倍放大，对学术研究和实际应用均有较高参考价值。","#  [Designing a Practical Degradation Model for Deep Blind Image Super-Resolution](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2103.14006.pdf)\n\n![visitors](https:\u002F\u002Fvisitor-badge.glitch.me\u002Fbadge?page_id=cszn\u002FBSRGAN) \n\n[Kai Zhang](https:\u002F\u002Fcszn.github.io\u002F), Jingyun Liang, [Luc Van Gool](https:\u002F\u002Fvision.ee.ethz.ch\u002Fpeople-details.OTAyMzM=.TGlzdC8zMjQ4LC0xOTcxNDY1MTc4.html), [Radu Timofte](http:\u002F\u002Fpeople.ee.ethz.ch\u002F~timofter\u002F)  \n_[Computer Vision Lab](https:\u002F\u002Fvision.ee.ethz.ch\u002Fthe-institute.html), ETH Zurich, Switzerland_\n\n[[Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2103.14006)] 
[[Code](https:\u002F\u002Fgithub.com\u002Fcszn\u002FBSRGAN\u002Fblob\u002Fmain\u002Fmain_test_bsrgan.py)] [[Training Code](https:\u002F\u002Fgithub.com\u002Fcszn\u002FKAIR)]\n\n_**Our new work for real image denoising ---> [https:\u002F\u002Fgithub.com\u002Fcszn\u002FSCUNet](https:\u002F\u002Fgithub.com\u002Fcszn\u002FSCUNet)**_\n\n_**Our work is the beginning rather than the end of real image super-resolution.**_\n\n_______\n- **_News (2021-08-31)_**: We have uploaded the training code. \n- **_News (2021-08-24)_**: We have uploaded the BSRGAN degradation model. \n```python\nfrom utils import utils_blindsr as blindsr\nimg_lq, img_hq = blindsr.degradation_bsrgan(img, sf=4, lq_patchsize=72)\n```\n- **_News (2021-07-23)_**: After rejection by CVPR 2021, our paper was accepted by ICCV 2021. For the sake of fairness, we will not update the trained models in our camera-ready version. However, we may update the trained models on GitHub.\n- **_News (2021-05-18)_**: Added a trained BSRGAN model for scale factor 2.\n- **_News (2021-04)_**: Our degradation model for face image enhancement: [https:\u002F\u002Fgithub.com\u002Fvvictoryuki\u002FBSRGAN_implementation](https:\u002F\u002Fgithub.com\u002Fvvictoryuki\u002FBSRGAN_implementation)\n\n\nTraining\n----------\n1. Download [KAIR](https:\u002F\u002Fgithub.com\u002Fcszn\u002FKAIR): `git clone https:\u002F\u002Fgithub.com\u002FKAIR.git`\n2. Put your high-quality training images into `trainsets\u002FtrainH` or set `\"dataroot_H\": \"trainsets\u002FtrainH\"`\n3. Train BSRNet\n    1. Modify [train_bsrgan_x4_psnr.json](https:\u002F\u002Fgithub.com\u002Fcszn\u002FKAIR\u002Fblob\u002Fmaster\u002Foptions\u002Ftrain_bsrgan_x4_psnr.json), e.g., `\"gpu_ids\": [0]`, `\"dataloader_batch_size\": 4`\n    2. Training with `DataParallel`\n    ```bash\n    python main_train_psnr.py --opt options\u002Ftrain_bsrgan_x4_psnr.json\n    ```\n    3. 
Training with `DistributedDataParallel` - 4 GPUs\n    ```bash\n    python -m torch.distributed.launch --nproc_per_node=4 --master_port=1234 main_train_psnr.py --opt options\u002Ftrain_bsrgan_x4_psnr.json  --dist True\n    ```\n4. Train BSRGAN\n    1. Put the BSRNet model (e.g., '400000_G.pth') into `superresolution\u002Fbsrgan_x4_gan\u002Fmodels`\n    2. Modify [train_bsrgan_x4_gan.json](https:\u002F\u002Fgithub.com\u002Fcszn\u002FKAIR\u002Fblob\u002Fmaster\u002Foptions\u002Ftrain_bsrgan_x4_gan.json), e.g., `\"gpu_ids\": [0]`, `\"dataloader_batch_size\": 4`\n    3. Training with `DataParallel`\n    ```bash\n    python main_train_gan.py --opt options\u002Ftrain_bsrgan_x4_gan.json\n    ```\n    4. Training with `DistributedDataParallel` - 4 GPUs\n    ```bash\n    python -m torch.distributed.launch --nproc_per_node=4 --master_port=1234 main_train_gan.py --opt options\u002Ftrain_bsrgan_x4_gan.json  --dist True\n    ```\n5. Test the BSRGAN model `'xxxxxx_E.pth'` with a modified `main_test_bsrgan.py`\n    1. 
`'xxxxxx_E.pth'` is more stable than `'xxxxxx_G.pth'`\n\n\n_______\n✨ _**Some visual examples**_: [oldphoto2](https:\u002F\u002Fimgsli.com\u002FNDgzMjU); [butterfly](https:\u002F\u002Fimgsli.com\u002FNDgyNjY); [comic](https:\u002F\u002Fimgsli.com\u002FNDgyNzg); [oldphoto3](https:\u002F\u002Fimgsli.com\u002FNDgyNzk); [oldphoto6](https:\u002F\u002Fimgsli.com\u002FNDgyODA); [comic_01](https:\u002F\u002Fimgsli.com\u002FNDgzNTg); [comic_03](https:\u002F\u002Fimgsli.com\u002FNDgzNTk); [comic_04](https:\u002F\u002Fimgsli.com\u002FNDgzNTY)\n\n[\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcszn_BSRGAN_readme_eba6d425fe5f.png\" width=\"390px\"\u002F>](https:\u002F\u002Fimgsli.com\u002FNDgzMjU) [\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcszn_BSRGAN_readme_575408020dc7.png\" width=\"390px\"\u002F>](https:\u002F\u002Fimgsli.com\u002FNDgyNzk) \n[\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcszn_BSRGAN_readme_c3b7d88c72c3.png\" width=\"784px\"\u002F>](https:\u002F\u002Fimgsli.com\u002FNDgzNDk)\n___________\n\n* [Testing code](#testing-code)\n* [Main idea](#main-idea)\n* [Comparison](#comparison)\n* [More visual results on RealSRSet dataset](#more-visual-results-on-realsrset-dataset)\n* [Visual results on DPED dataset](#visual-results-on-dped-dataset)\n* [Citation](#citation)\n* [Acknowledgments](#acknowledgments)\n\nTesting code\n----------\n\n* [main_test_bsrgan.py](main_test_bsrgan.py)\n* [model_zoo](model_zoo) (_Download the following models from [Google drive](https:\u002F\u002Fdrive.google.com\u002Fdrive\u002Ffolders\u002F13kfr3qny7S2xwG9h7v95F5mkWs0OmU0D?usp=sharing) or [腾讯微云](https:\u002F\u002Fshare.weiyun.com\u002F5qO32s3)_).\n   * Proposed:\n     * BSRGAN.pth     [[Google drive]](https:\u002F\u002Fdrive.google.com\u002Fdrive\u002Ffolders\u002F13kfr3qny7S2xwG9h7v95F5mkWs0OmU0D?usp=sharing) [[腾讯微云]](https:\u002F\u002Fshare.weiyun.com\u002F7GPI8p7x)🌱\n     * BSRNet.pth      [[Google 
drive]](https:\u002F\u002Fdrive.google.com\u002Fdrive\u002Ffolders\u002F13kfr3qny7S2xwG9h7v95F5mkWs0OmU0D?usp=sharing)  [[腾讯微云]](https:\u002F\u002Fshare.weiyun.com\u002FVOFW5Ela)🌱\n   * Compared methods:\n     * RRDB.pth  --->  [original link](https:\u002F\u002Fgithub.com\u002Fxinntao\u002FESRGAN)\n     * ESRGAN.pth --->   [original link](https:\u002F\u002Fgithub.com\u002Fxinntao\u002FESRGAN)\n     * FSSR_DPED.pth --->   [original link](https:\u002F\u002Fgithub.com\u002FManuelFritsche\u002Freal-world-sr)\n     * FSSR_JPEG.pth --->   [original link](https:\u002F\u002Fgithub.com\u002FManuelFritsche\u002Freal-world-sr)\n     * RealSR_DPED.pth --->   [original link](https:\u002F\u002Fgithub.com\u002Fjixiaozhong\u002FRealSR)\n     * RealSR_JPEG.pth --->   [original link](https:\u002F\u002Fgithub.com\u002Fjixiaozhong\u002FRealSR)\n\n\nMain idea\n----------\n\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcszn_BSRGAN_readme_5861db45cd8a.png\" width=\"790px\"\u002F> \n\n__Design a new degradation model to synthesize LR images for training:__\n\n* **_1) Make the blur, downsampling and noise more practical_**\n  * **_Blur:_** _two convolutions with isotropic and anisotropic Gaussian kernels from both the HR space and LR space_\n  * **_Downsampling:_** _nearest, bilinear, bicubic, down-up-sampling_\n  * **_Noise:_** _Gaussian noise, JPEG compression noise, processed camera sensor noise_\n* **_2) Degradation shuffle:_** _instead of using the commonly-used blur\u002Fdownsampling\u002Fnoise-addition pipeline, we perform randomly shuffled degradations to synthesize LR images_\n\n__Some notes on the proposed degradation model:__\n\n* *The degradation model is mainly designed to synthesize degraded LR images. Its most direct application is to train a deep blind super-resolver with paired LR\u002FHR images. 
In particular, the degradation model can be performed on a large dataset of HR images to produce unlimited perfectly aligned training images, which typically do not suffer from the limited data issue of laboriously collected paired data and the misalignment issue of unpaired training data.*\n \n* *The degradation model tends to be unsuited to model a degraded LR image as it involves too many degradation parameters and also adopts a random shuffle strategy.*\n\n* *The degradation model can produce some degradation cases that rarely happen in real-world scenarios, while this can still be expected to improve the generalization ability of the trained deep blind super-resolver.*\n\n* *A DNN with large capacity has the ability to handle different degradations via a single model. This has been validated multiple times. For example, DnCNN is able\nto handle SISR with different scale factors, JPEG compression deblocking with different quality factors and denoising for a wide range of noise levels, while still having a performance comparable to VDSR for SISR. It is worth noting that even when the super-resolver reduces the performance for unrealistic bicubic downsampling, it is still a preferred choice for real SISR.*\n\n* *One can conveniently modify the degradation model by changing the degradation parameter settings and adding more reasonable degradation\ntypes to improve the practicability for a certain application.*\n\n\n\n\nComparison\n----------\n\n\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcszn_BSRGAN_readme_58bc20738230.png\" width=\"790px\"\u002F> \n\n*These no-reference IQA metrics, i.e., NIQE, NRQM and PI, do not always match perceptual visual quality [1] and the IQA metric should be updated with new SISR methods [2]. 
We further argue that the IQA metric for SISR should also be updated with new image degradation types, which we leave for future work.*\n\n```\n[1] \"NTIRE 2020 challenge on real-world image super-resolution: Methods and results.\" CVPRW, 2020.\n[2] \"PIPAL: a large-scale image quality assessment dataset for perceptual image restoration.\" ECCV, 2020.\n```\n\n\n\nMore visual results on [RealSRSet](testsets\u002FRealSRSet) dataset\n----------\n\n\n**Left**: [real images](https:\u002F\u002Fgithub.com\u002Fcszn\u002FBSRNet\u002Ftree\u002Fmain\u002Ftestsets\u002FRealSRSet) **|** **Right**: [super-resolved images with scale factor 4](https:\u002F\u002Fgithub.com\u002Fcszn\u002FBSRNet\u002Ftree\u002Fmain\u002Ftestsets\u002FBSRGAN)\n\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcszn_BSRGAN_readme_9df0092dfb30.png\" width=\"156px\"\u002F> \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcszn_BSRGAN_readme_49773af73bf0.png\" width=\"624px\"\u002F>\n\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcszn_BSRGAN_readme_f0ba8c0573d0.png\" width=\"156px\"\u002F> \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcszn_BSRGAN_readme_43f7a71966fe.png\" width=\"624px\"\u002F>\n\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcszn_BSRGAN_readme_f0ba8c0573d0.png\" width=\"390px\"\u002F> \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcszn_BSRGAN_readme_43f7a71966fe.png\" width=\"390px\"\u002F>\n\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcszn_BSRGAN_readme_980d16ceed14.png\" width=\"156px\"\u002F> \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcszn_BSRGAN_readme_0337b9edbdd5.png\" width=\"624px\"\u002F>\n\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcszn_BSRGAN_readme_980d16ceed14.png\" width=\"390px\"\u002F> \u003Cimg 
src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcszn_BSRGAN_readme_0337b9edbdd5.png\" width=\"390px\"\u002F>\n\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcszn_BSRGAN_readme_2212aaa1dafc.png\" width=\"390px\"\u002F> \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcszn_BSRGAN_readme_2dd64835aced.png\" width=\"390px\"\u002F>\n\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcszn_BSRGAN_readme_31376ff72302.png\" width=\"390px\"\u002F> \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcszn_BSRGAN_readme_78d2cce68fba.png\" width=\"390px\"\u002F>\n\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcszn_BSRGAN_readme_68ebd9b4c1f0.png\" width=\"390px\"\u002F> \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcszn_BSRGAN_readme_0777c4f24832.png\" width=\"390px\"\u002F>\n\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcszn_BSRGAN_readme_77c7dad96c22.png\" width=\"390px\"\u002F> \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcszn_BSRGAN_readme_05b90427e4f7.png\" width=\"390px\"\u002F>\n\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcszn_BSRGAN_readme_87eedc84d0a4.png\" width=\"390px\"\u002F> \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcszn_BSRGAN_readme_735d0e1be97b.png\" width=\"390px\"\u002F>\n\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcszn_BSRGAN_readme_8a48167ad75b.png\" width=\"390px\"\u002F> \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcszn_BSRGAN_readme_00fd656dab52.png\" width=\"390px\"\u002F>\n\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcszn_BSRGAN_readme_080c3f477ce8.png\" width=\"784px\"\u002F> \n\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcszn_BSRGAN_readme_0d140aee1349.png\" width=\"784px\"\u002F>\n\n\nVisual results 
on DPED dataset\n----------\n\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcszn_BSRGAN_readme_c14efd339cdd.png\" width=\"200px\"\u002F> \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcszn_BSRGAN_readme_b4c5a3795e4a.png\" width=\"790px\"\u002F>\n\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcszn_BSRGAN_readme_ace98884ea66.png\" width=\"200px\"\u002F> \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcszn_BSRGAN_readme_332d42da3c25.png\" width=\"790px\"\u002F>\n\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcszn_BSRGAN_readme_afa112b4b533.png\" width=\"200px\"\u002F> \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcszn_BSRGAN_readme_13d07b71b2a3.png\" width=\"790px\"\u002F>\n\n*Without using any prior information of DPED dataset for training, our BSRGAN still performs well.*\n\n\n\n\nCitation\n----------\n```BibTex\n@inproceedings{zhang2021designing,\n    title={Designing a Practical Degradation Model for Deep Blind Image Super-Resolution},\n    author={Zhang, Kai and Liang, Jingyun and Van Gool, Luc and Timofte, Radu},\n    booktitle={IEEE International Conference on Computer Vision},\n    pages={4791--4800},\n    year={2021}\n}\n```\n\n\nAcknowledgments\n----------\nThis work was partly supported by the ETH Zurich Fund (OK), a Huawei Technologies Oy (Finland) project, and an Amazon AWS grant.\n\n\n\n","# [Designing a Practical Degradation Model for Deep Blind Image Super-Resolution（设计用于深度盲图像超分辨率的实用退化模型）](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2103.14006.pdf)\n\n![visitors](https:\u002F\u002Fvisitor-badge.glitch.me\u002Fbadge?page_id=cszn\u002FBSRGAN) \n\n[Kai Zhang](https:\u002F\u002Fcszn.github.io\u002F)、Jingyun Liang、[Luc Van Gool](https:\u002F\u002Fvision.ee.ethz.ch\u002Fpeople-details.OTAyMzM=.TGlzdC8zMjQ4LC0xOTcxNDY1MTc4.html)、[Radu Timofte](http:\u002F\u002Fpeople.ee.ethz.ch\u002F~timofter\u002F)  \n_[Computer Vision 
Lab（计算机视觉实验室）](https:\u002F\u002Fvision.ee.ethz.ch\u002Fthe-institute.html), ETH Zurich, Switzerland_\n\n[[论文](https:\u002F\u002Farxiv.org\u002Fabs\u002F2103.14006)] [[测试代码](https:\u002F\u002Fgithub.com\u002Fcszn\u002FBSRGAN\u002Fblob\u002Fmain\u002Fmain_test_bsrgan.py)] [[训练代码](https:\u002F\u002Fgithub.com\u002Fcszn\u002FKAIR)]\n\n_**我们关于真实图像去噪的新工作 ---> [https:\u002F\u002Fgithub.com\u002Fcszn\u002FSCUNet](https:\u002F\u002Fgithub.com\u002Fcszn\u002FSCUNet)**_\n\n_**我们的工作是真实图像超分辨率的起点，而非终点。**_\n\n_______\n- **_新闻 (2021-08-31)_**: 我们上传了训练代码。\n- **_新闻 (2021-08-24)_**: 我们上传了 BSRGAN 退化模型。\n```python\nfrom utils import utils_blindsr as blindsr\nimg_lq, img_hq = blindsr.degradation_bsrgan(img, sf=4, lq_patchsize=72)\n```\n- **_新闻 (2021-07-23)_**: 在被 CVPR 2021 拒稿后，我们的论文已被 ICCV 2021 接收。为保证公平性，我们不会在最终提交版本中更新训练好的模型，但可能会在 GitHub 上更新。\n- **_新闻 (2021-05-18)_**: 新增了缩放因子为 2 的 BSRGAN 训练模型。\n- **_新闻 (2021-04)_**: 我们针对人脸图像增强的退化模型：[https:\u002F\u002Fgithub.com\u002Fvvictoryuki\u002FBSRGAN_implementation](https:\u002F\u002Fgithub.com\u002Fvvictoryuki\u002FBSRGAN_implementation)\n\n\n训练\n----------\n1. 下载 [KAIR](https:\u002F\u002Fgithub.com\u002Fcszn\u002FKAIR)：`git clone https:\u002F\u002Fgithub.com\u002Fcszn\u002FKAIR.git`\n2. 将你的高质量训练图像放入 `trainsets\u002FtrainH` 目录，或设置 `\"dataroot_H\": \"trainsets\u002FtrainH\"`\n3. 训练 BSRNet\n    1. 修改 [train_bsrgan_x4_psnr.json](https:\u002F\u002Fgithub.com\u002Fcszn\u002FKAIR\u002Fblob\u002Fmaster\u002Foptions\u002Ftrain_bsrgan_x4_psnr.json)，例如设置 `\"gpu_ids\": [0]`、`\"dataloader_batch_size\": 4`\n    2. 使用 `DataParallel` 进行训练\n    ```bash\n    python main_train_psnr.py --opt options\u002Ftrain_bsrgan_x4_psnr.json\n    ```\n    2. 使用 `DistributedDataParallel`（4 块 GPU）\n    ```bash\n    python -m torch.distributed.launch --nproc_per_node=4 --master_port=1234 main_train_psnr.py --opt options\u002Ftrain_bsrgan_x4_psnr.json  --dist True\n    ```\n4. 训练 BSRGAN\n    1. 
将 BSRNet 模型（例如 `'400000_G.pth'`）放入 `superresolution\u002Fbsrgan_x4_gan\u002Fmodels`\n    2. 修改 [train_bsrgan_x4_gan.json](https:\u002F\u002Fgithub.com\u002Fcszn\u002FKAIR\u002Fblob\u002Fmaster\u002Foptions\u002Ftrain_bsrgan_x4_gan.json)，例如设置 `\"gpu_ids\": [0]`、`\"dataloader_batch_size\": 4`\n    3. 使用 `DataParallel` 进行训练\n    ```bash\n    python main_train_gan.py --opt options\u002Ftrain_bsrgan_x4_gan.json\n    ```\n    3. 使用 `DistributedDataParallel`（4 块 GPU）\n    ```bash\n    python -m torch.distributed.launch --nproc_per_node=4 --master_port=1234 main_train_gan.py --opt options\u002Ftrain_bsrgan_x4_gan.json  --dist True\n    ```\n5. 通过修改 `main_test_bsrgan.py` 测试 BSRGAN 模型 `'xxxxxx_E.pth'`\n    1. `'xxxxxx_E.pth'` 比 `'xxxxxx_G.pth'` 更稳定\n\n\n_______\n✨ _**部分可视化示例**_: [oldphoto2](https:\u002F\u002Fimgsli.com\u002FNDgzMjU); [butterfly](https:\u002F\u002Fimgsli.com\u002FNDgyNjY); [comic](https:\u002F\u002Fimgsli.com\u002FNDgyNzg); [oldphoto3](https:\u002F\u002Fimgsli.com\u002FNDgyNzk); [oldphoto6](https:\u002F\u002Fimgsli.com\u002FNDgyODA); [comic_01](https:\u002F\u002Fimgsli.com\u002FNDgzNTg); [comic_03](https:\u002F\u002Fimgsli.com\u002FNDgzNTk); [comic_04](https:\u002F\u002Fimgsli.com\u002FNDgzNTY)\n\n[\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcszn_BSRGAN_readme_eba6d425fe5f.png\" width=\"390px\"\u002F>](https:\u002F\u002Fimgsli.com\u002FNDgzMjU) [\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcszn_BSRGAN_readme_575408020dc7.png\" width=\"390px\"\u002F>](https:\u002F\u002Fimgsli.com\u002FNDgyNzk) \n[\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcszn_BSRGAN_readme_c3b7d88c72c3.png\" width=\"784px\"\u002F>](https:\u002F\u002Fimgsli.com\u002FNDgzNDk)\n___________\n\n* [测试代码](#测试代码)\n* [核心思想](#核心思想)\n* [对比](#对比)\n* [RealSRSet 数据集上的更多可视化结果](#realsrset-数据集上的更多可视化结果)\n* [DPED 数据集上的可视化结果](#dped-数据集上的可视化结果)\n* [引用](#引用)\n* [致谢](#致谢)\n\n测试代码\n----------\n\n* 
[main_test_bsrgan.py](main_test_bsrgan.py)\n* [model_zoo](model_zoo)（_从 [Google Drive](https:\u002F\u002Fdrive.google.com\u002Fdrive\u002Ffolders\u002F13kfr3qny7S2xwG9h7v95F5mkWs0OmU0D?usp=sharing) 或 [腾讯微云](https:\u002F\u002Fshare.weiyun.com\u002F5qO32s3) 下载以下模型_）。\n   * 提出的方法：\n     * BSRGAN.pth     [[Google Drive]](https:\u002F\u002Fdrive.google.com\u002Fdrive\u002Ffolders\u002F13kfr3qny7S2xwG9h7v95F5mkWs0OmU0D?usp=sharing) [[腾讯微云]](https:\u002F\u002Fshare.weiyun.com\u002F7GPI8p7x)🌱\n     * BSRNet.pth      [[Google Drive]](https:\u002F\u002Fdrive.google.com\u002Fdrive\u002Ffolders\u002F13kfr3qny7S2xwG9h7v95F5mkWs0OmU0D?usp=sharing)  [[腾讯微云]](https:\u002F\u002Fshare.weiyun.com\u002FVOFW5Ela)🌱\n   * 对比方法：\n     * RRDB.pth  --->  [原始链接](https:\u002F\u002Fgithub.com\u002Fxinntao\u002FESRGAN)\n     * ESRGAN.pth --->   [原始链接](https:\u002F\u002Fgithub.com\u002Fxinntao\u002FESRGAN)\n     * FSSR_DPED.pth --->   [原始链接](https:\u002F\u002Fgithub.com\u002FManuelFritsche\u002Freal-world-sr)\n     * FSSR_JPEG.pth --->   [原始链接](https:\u002F\u002Fgithub.com\u002FManuelFritsche\u002Freal-world-sr)\n     * RealSR_DPED.pth --->   [原始链接](https:\u002F\u002Fgithub.com\u002Fjixiaozhong\u002FRealSR)\n     * RealSR_JPEG.pth --->   [原始链接](https:\u002F\u002Fgithub.com\u002Fjixiaozhong\u002FRealSR)\n\n\n核心思想\n----------\n\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcszn_BSRGAN_readme_5861db45cd8a.png\" width=\"790px\"\u002F> \n\n__设计一种新的退化模型用于合成训练用的低分辨率（LR）图像：__\n\n* **_1) 使模糊、下采样和噪声更贴近实际_**\n  * **_模糊（Blur）:_** _在高分辨率（HR）空间和低分辨率（LR）空间分别使用各向同性和各向异性高斯核进行两次卷积_\n  * **_下采样（Downsampling）:_** _最近邻、双线性、双三次、下采样后再上采样_\n  * **_噪声（Noise）:_** _高斯噪声、JPEG 压缩噪声、经处理的相机传感器噪声_\n* **_2) 退化顺序随机打乱（Degradation shuffle）:_** _不同于常用的“模糊→下采样→加噪声”流程，我们对退化操作进行随机打乱以合成 LR 图像_\n\n__关于所提出退化模型的一些说明：__\n\n* *该退化模型主要用于合成退化的 LR 图像。其最直接的应用是利用成对的 LR\u002FHR 图像训练深度盲超分辨率模型。具体而言，该退化模型可在大规模 HR 图像数据集上生成无限量且完美对齐的训练图像，通常可避免人工收集的成对数据数量有限的问题，以及非成对训练数据存在的对齐不准问题。*\n \n* *该退化模型不太适合用于对单张已退化的 LR 
图像进行建模，因为它涉及过多的退化参数，并采用了随机打乱策略。*\n\n* *该退化模型可能生成一些在现实场景中极少出现的退化情况，但这仍有望提升所训练深度盲超分辨率模型的泛化能力。*\n\n* *具有大容量的深度神经网络（DNN, Deep Neural Network）能够通过单一模型处理多种退化类型。这一点已被多次验证。例如，DnCNN 能够处理不同缩放因子的单图像超分辨率（SISR, Single Image Super-Resolution）、不同质量因子的 JPEG 压缩去块效应以及大范围噪声水平的去噪任务，同时在 SISR 任务上的性能仍可与 VDSR 相媲美。值得注意的是，即使该超分辨率模型在处理不切实际的双三次下采样（bicubic downsampling）时性能有所下降，它仍然是真实场景 SISR 的首选方案。*\n\n* *用户可以通过调整退化参数设置并加入更多合理的退化类型，方便地修改退化模型，从而提升其在特定应用中的实用性。*\n\n\n\n\n对比结果\n----------\n\n\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcszn_BSRGAN_readme_58bc20738230.png\" width=\"790px\"\u002F> \n\n*这些无参考图像质量评估（IQA, Image Quality Assessment）指标，即 NIQE、NRQM 和 PI，并不总能与人类感知的视觉质量一致 [1]，并且随着新的 SISR 方法出现，IQA 指标也应随之更新 [2]。我们进一步认为，针对 SISR 的 IQA 指标还应随新的图像退化类型进行更新，这将留作未来的工作。*\n\n```\n[1] \"NTIRE 2020 challenge on real-world image super-resolution: Methods and results.\" CVPRW, 2020.\n[2] \"PIPAL: a large-scale image quality assessment dataset for perceptual image restoration.\" ECCV, 2020.\n```\n\n\n\n在 [RealSRSet](testsets\u002FRealSRSet) 数据集上的更多视觉结果\n----------\n\n\n**左图**：[真实图像](https:\u002F\u002Fgithub.com\u002Fcszn\u002FBSRNet\u002Ftree\u002Fmain\u002Ftestsets\u002FRealSRSet) **|** **右图**：[使用缩放因子 4 超分辨率重建的图像](https:\u002F\u002Fgithub.com\u002Fcszn\u002FBSRNet\u002Ftree\u002Fmain\u002Ftestsets\u002FBSRGAN)\n\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcszn_BSRGAN_readme_9df0092dfb30.png\" width=\"156px\"\u002F> \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcszn_BSRGAN_readme_49773af73bf0.png\" width=\"624px\"\u002F>\n\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcszn_BSRGAN_readme_f0ba8c0573d0.png\" width=\"156px\"\u002F> \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcszn_BSRGAN_readme_43f7a71966fe.png\" width=\"624px\"\u002F>\n\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcszn_BSRGAN_readme_f0ba8c0573d0.png\" width=\"390px\"\u002F> \u003Cimg 
src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcszn_BSRGAN_readme_43f7a71966fe.png\" width=\"390px\"\u002F>\n\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcszn_BSRGAN_readme_980d16ceed14.png\" width=\"156px\"\u002F> \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcszn_BSRGAN_readme_0337b9edbdd5.png\" width=\"624px\"\u002F>\n\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcszn_BSRGAN_readme_980d16ceed14.png\" width=\"390px\"\u002F> \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcszn_BSRGAN_readme_0337b9edbdd5.png\" width=\"390px\"\u002F>\n\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcszn_BSRGAN_readme_2212aaa1dafc.png\" width=\"390px\"\u002F> \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcszn_BSRGAN_readme_2dd64835aced.png\" width=\"390px\"\u002F>\n\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcszn_BSRGAN_readme_31376ff72302.png\" width=\"390px\"\u002F> \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcszn_BSRGAN_readme_78d2cce68fba.png\" width=\"390px\"\u002F>\n\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcszn_BSRGAN_readme_68ebd9b4c1f0.png\" width=\"390px\"\u002F> \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcszn_BSRGAN_readme_0777c4f24832.png\" width=\"390px\"\u002F>\n\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcszn_BSRGAN_readme_77c7dad96c22.png\" width=\"390px\"\u002F> \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcszn_BSRGAN_readme_05b90427e4f7.png\" width=\"390px\"\u002F>\n\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcszn_BSRGAN_readme_87eedc84d0a4.png\" width=\"390px\"\u002F> \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcszn_BSRGAN_readme_735d0e1be97b.png\" width=\"390px\"\u002F>\n\n\u003Cimg 
src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcszn_BSRGAN_readme_8a48167ad75b.png\" width=\"390px\"\u002F> \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcszn_BSRGAN_readme_00fd656dab52.png\" width=\"390px\"\u002F>\n\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcszn_BSRGAN_readme_080c3f477ce8.png\" width=\"784px\"\u002F> \n\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcszn_BSRGAN_readme_0d140aee1349.png\" width=\"784px\"\u002F>\n\n\n\n在 DPED 数据集上的视觉结果\n----------\n\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcszn_BSRGAN_readme_c14efd339cdd.png\" width=\"200px\"\u002F> \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcszn_BSRGAN_readme_b4c5a3795e4a.png\" width=\"790px\"\u002F>\n\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcszn_BSRGAN_readme_ace98884ea66.png\" width=\"200px\"\u002F> \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcszn_BSRGAN_readme_332d42da3c25.png\" width=\"790px\"\u002F>\n\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcszn_BSRGAN_readme_afa112b4b533.png\" width=\"200px\"\u002F> \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcszn_BSRGAN_readme_13d07b71b2a3.png\" width=\"790px\"\u002F>\n\n*尽管训练过程中未使用 DPED 数据集的任何先验信息，我们的 BSRGAN 仍然表现良好。*\n\n\n\n\n引用\n----------\n```BibTex\n@inproceedings{zhang2021designing,\n    title={Designing a Practical Degradation Model for Deep Blind Image Super-Resolution},\n    author={Zhang, Kai and Liang, Jingyun and Van Gool, Luc and Timofte, Radu},\n    booktitle={IEEE International Conference on Computer Vision},\n    pages={4791--4800},\n    year={2021}\n}\n```\n\n\n致谢\n----------\n本工作部分得到了苏黎世联邦理工学院基金（OK）、华为技术（芬兰）有限公司项目以及亚马逊 AWS 资助的支持。","# BSRGAN 快速上手指南\n\nBSRGAN 是一种用于真实图像盲超分辨率（Blind Super-Resolution）的深度学习模型，通过更贴近真实场景的退化模型训练，可有效提升低质量图像的重建效果。\n\n---\n\n## 环境准备\n\n**系统要求：**\n- Linux \u002F 
Windows \u002F macOS（推荐 Linux）\n- Python ≥ 3.7\n- PyTorch ≥ 1.7\n- 支持 CUDA 的 GPU（可选，但强烈推荐）\n\n**前置依赖：**\n```bash\npip install torch torchvision numpy opencv-python\n```\n\n> 💡 建议使用 [清华源](https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple) 或 [阿里云镜像](https:\u002F\u002Fmirrors.aliyun.com\u002Fpypi\u002Fsimple\u002F) 加速安装：\n> ```bash\n> pip install -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple torch torchvision numpy opencv-python\n> ```\n\n---\n\n## 安装步骤\n\n1. **克隆主仓库（含测试代码）：**\n   ```bash\n   git clone https:\u002F\u002Fgithub.com\u002Fcszn\u002FBSRGAN.git\n   cd BSRGAN\n   ```\n\n2. **（可选）克隆训练框架 KAIR：**\n   ```bash\n   git clone https:\u002F\u002Fgithub.com\u002Fcszn\u002FKAIR.git\n   ```\n\n3. **下载预训练模型：**\n\n   将以下模型文件放入 `model_zoo\u002F` 目录（若目录不存在请手动创建）：\n\n   - **BSRGAN.pth**（主模型）\n   - **BSRNet.pth**（PSNR 预训练模型）\n\n   **国内用户推荐使用腾讯微云链接下载：**\n   - [BSRGAN.pth（腾讯微云）](https:\u002F\u002Fshare.weiyun.com\u002F7GPI8p7x)\n   - [BSRNet.pth（腾讯微云）](https:\u002F\u002Fshare.weiyun.com\u002FVOFW5Ela)\n\n   > 也可通过 Google Drive 下载（需科学上网）：[模型链接](https:\u002F\u002Fdrive.google.com\u002Fdrive\u002Ffolders\u002F13kfr3qny7S2xwG9h7v95F5mkWs0OmU0D?usp=sharing)\n\n---\n\n## 基本使用\n\n### 1. 超分辨率重建（最简示例）\n\n将待处理图像 `input.png` 放入项目根目录，运行：\n\n```bash\npython main_test_bsrgan.py --model_path model_zoo\u002FBSRGAN.pth --input input.png --output output.png\n```\n\n默认放大倍数为 **4 倍**。如需其他倍数，请确保使用对应训练好的模型（如 `sf=2` 需使用 scale=2 的模型）。\n\n### 2. 
使用退化模型生成训练数据（可选）\n\n在 Python 中调用退化函数生成配对的低清\u002F高清图像：\n\n```python\nfrom utils import utils_blindsr as blindsr\nimport cv2\n\nimg = cv2.imread('hq_image.png')  # 读取高清图像 (HWC, BGR)\nimg_lq, img_hq = blindsr.degradation_bsrgan(img, sf=4, lq_patchsize=72)\n# img_lq: 生成的低清图像，img_hq: 对应裁剪后的高清图像\n```\n\n> 此功能适用于自定义训练数据合成。\n\n---\n\n> ✅ 提示：实际使用中，`xxxxxx_E.pth` 模型通常比 `xxxxxx_G.pth` 更稳定，测试时建议优先选用 `_E.pth` 结尾的权重。","一家地方档案馆正在将上世纪80年代的老照片数字化，用于线上历史展览，但扫描得到的图像分辨率低、模糊且带有噪点，难以满足高清展示需求。\n\n### 没有 BSRGAN 时\n- 使用传统插值方法（如双三次插值）放大图像后，细节严重丢失，边缘出现明显锯齿。\n- 现有超分模型（如SRCNN、ESRGAN）在合成退化数据上训练，面对真实老照片中的复杂模糊和噪声表现不佳，常产生伪影或过度锐化。\n- 需要手动调整多个参数尝试不同模型，耗时且效果不稳定，难以批量处理上千张照片。\n- 修复后的图像缺乏真实感，观众反馈“看起来假”，影响展览的专业性和沉浸感。\n\n### 使用 BSRGAN 后\n- BSRGAN 基于更贴近真实场景的退化模型训练，能有效还原老照片中的人物面部纹理、文字笔画等关键细节，放大4倍后仍清晰自然。\n- 直接加载预训练模型即可一键处理，无需针对每张图调参，大幅提高批量处理效率。\n- 在模糊、压缩、噪声混合退化的老照片上表现稳健，避免了伪影和过度平滑问题。\n- 输出图像既保留历史质感又提升可读性，线上展览用户停留时长显著增加。\n\nBSRGAN 让真实世界中的低质图像修复从“勉强可用”变为“专业可用”，真正打通了老照片数字化的最后一公里。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcszn_BSRGAN_eba6d425.png","cszn","Kai Zhang","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002Fcszn_02895d4d.png","Image Restoration; Inverse Problems","Nanjing University","Nanjing","cskaizhang@gmail.com",null,"https:\u002F\u002Fcszn.github.io\u002F","https:\u002F\u002Fgithub.com\u002Fcszn",[25],{"name":26,"color":27,"percentage":28},"Python","#3572A5",100,1363,192,"2026-04-05T03:44:04","Apache-2.0",4,"Linux, macOS, Windows","需要 NVIDIA GPU（训练必需，推理可选），显存建议 8GB+，CUDA 版本需与 PyTorch 兼容（未明确指定版本）","未说明",{"notes":38,"python":36,"dependencies":39},"训练代码位于 KAIR 仓库，需额外克隆；测试时需下载预训练模型文件；支持 DataParallel 和 DistributedDataParallel 多 GPU 训练方式；依赖库版本未明确指定，但需与 PyTorch 
兼容",[40,41,42,43,44,45,46],"torch","numpy","opencv-python","Pillow","tqdm","scipy","matplotlib",[48],"图像",[50,51,52,53],"blind-image-super-resolution","super-resolution","real-image-super-resolution","realsr","ready","2026-03-27T02:49:30.150509","2026-04-06T09:46:57.312978",[58,63,68,73,78,83,88,93,98],{"id":59,"question_zh":60,"answer_zh":61,"source_url":62},198,"BSRGAN 的论文在哪里可以找到？","论文地址为：https:\u002F\u002Farxiv.org\u002Fabs\u002F2103.14006。","https:\u002F\u002Fgithub.com\u002Fcszn\u002FBSRGAN\u002Fissues\u002F1",{"id":64,"question_zh":65,"answer_zh":66,"source_url":67},199,"训练时如何判断何时停止？是否需要跑完所有 epoch？","不需要跑完设置的全部 epoch。应根据验证集上的指标（如 PSNR）是否趋于收敛来决定停止训练。例如在 DIV2K_val 数据集上，PSNR 达到约 30.86 即可停止。","https:\u002F\u002Fgithub.com\u002Fcszn\u002FBSRGAN\u002Fissues\u002F30",{"id":69,"question_zh":70,"answer_zh":71,"source_url":72},200,"训练代码和数据处理代码什么时候发布？","作者已将扩展的 BSRGAN 退化模型代码发布在 utils\u002Futils_blindsr.py 中，并计划在 KAIR 仓库（https:\u002F\u002Fgithub.com\u002Fcszn\u002FKAIR）中更新训练代码。数据处理使用 cv2、scipy、numpy 和 torch 实现。","https:\u002F\u002Fgithub.com\u002Fcszn\u002FBSRGAN\u002Fissues\u002F19",{"id":74,"question_zh":75,"answer_zh":76,"source_url":77},201,"使用 degradation_bsrgan 生成低质量图像时出现异常条纹或伪影怎么办？","可在 JPEG 压缩前添加裁剪操作：img = np.clip(img, 0.0, 255)。此外，确保输入图像数值范围正确（通常应在 [0,1] 而非 [0,255]）。","https:\u002F\u002Fgithub.com\u002Fcszn\u002FBSRGAN\u002Fissues\u002F7",{"id":79,"question_zh":80,"answer_zh":81,"source_url":82},202,"调用 degradation_bsrgan 时遇到 'str' object has no attribute 'shape' 错误怎么办？","该错误通常是因为传入的 img 是文件路径字符串而非图像数组。应先用 cv2.imread 或 PIL 加载图像，并确保图像数值范围在 [0,1]（而非 [0,255]）。例如：img = cv2.imread(path).astype(np.float32) \u002F 255.0。","https:\u002F\u002Fgithub.com\u002Fcszn\u002FBSRGAN\u002Fissues\u002F22",{"id":84,"question_zh":85,"answer_zh":86,"source_url":87},203,"为什么在退化过程中有时使用 util.imresize_np 而不是 cv2.resize？","util.imresize_np 模拟的是 MATLAB 的双三次插值行为，与 OpenCV 的 cv2.resize（即使使用 INTER_CUBIC）在细节上有差异。BSRGAN 
为了更贴近原始退化设计，混合使用了这两种方式。","https:\u002F\u002Fgithub.com\u002Fcszn\u002FBSRGAN\u002Fissues\u002F31",{"id":89,"question_zh":90,"answer_zh":91,"source_url":92},204,"高斯噪声退化中为何只使用 noise_level2 而非在 noise_level1 和 noise_level2 之间随机采样？","对于多变量高斯噪声，其强度由协方差矩阵决定，不能简单用单一标量表示。作者指出 noise_level1 影响较小（甚至设为 0 也无妨），实际退化主要依赖 noise_level2 控制的 L 参数。重点在于“随机打乱+双重退化+多样退化类型”的整体策略，而非严格遵循具体参数。","https:\u002F\u002Fgithub.com\u002Fcszn\u002FBSRGAN\u002Fissues\u002F32",{"id":94,"question_zh":95,"answer_zh":96,"source_url":97},205,"论文中提到的“axis length”是什么意思？","“axis length”等同于高斯核特征值的缩放因子，用于控制各向异性模糊核的形状。相关实现可参考 SRMD、USRNet 和 KAIR 仓库中的 anisotropic_Gaussian 相关代码。","https:\u002F\u002Fgithub.com\u002Fcszn\u002FBSRGAN\u002Fissues\u002F4",{"id":99,"question_zh":100,"answer_zh":101,"source_url":102},206,"degradation_bsrgan 函数中是否存在裁剪维度错误？","是的，原代码在模数裁剪（mod crop）时错误地交换了宽高维度。正确写法应为 img[:h1 - h1 % sf, :w1 - w1 % sf, ...]。该问题已在后续版本中修复。","https:\u002F\u002Fgithub.com\u002Fcszn\u002FBSRGAN\u002Fissues\u002F33",[],[105,116,125,139,147,155],{"id":106,"name":107,"github_repo":108,"description_zh":109,"stars":110,"difficulty_score":111,"last_commit_at":112,"category_tags":113,"status":54},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,3,"2026-04-05T11:01:52",[114,48,115],"开发框架","Agent",{"id":117,"name":118,"github_repo":119,"description_zh":120,"stars":121,"difficulty_score":122,"last_commit_at":123,"category_tags":124,"status":54},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 
图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",107662,2,"2026-04-03T11:11:01",[114,48,115],{"id":126,"name":127,"github_repo":128,"description_zh":129,"stars":130,"difficulty_score":122,"last_commit_at":131,"category_tags":132,"status":54},2268,"ML-For-Beginners","microsoft\u002FML-For-Beginners","ML-For-Beginners 是由微软推出的一套系统化机器学习入门课程，旨在帮助零基础用户轻松掌握经典机器学习知识。这套课程将学习路径规划为 12 周，包含 26 节精炼课程和 52 道配套测验，内容涵盖从基础概念到实际应用的完整流程，有效解决了初学者面对庞大知识体系时无从下手、缺乏结构化指导的痛点。\n\n无论是希望转型的开发者、需要补充算法背景的研究人员，还是对人工智能充满好奇的普通爱好者，都能从中受益。课程不仅提供了清晰的理论讲解，还强调动手实践，让用户在循序渐进中建立扎实的技能基础。其独特的亮点在于强大的多语言支持，通过自动化机制提供了包括简体中文在内的 50 多种语言版本，极大地降低了全球不同背景用户的学习门槛。此外，项目采用开源协作模式，社区活跃且内容持续更新，确保学习者能获取前沿且准确的技术资讯。如果你正寻找一条清晰、友好且专业的机器学习入门之路，ML-For-Beginners 将是理想的起点。",84991,"2026-04-05T10:45:23",[48,133,134,135,115,136,137,114,138],"数据工具","视频","插件","其他","语言模型","音频",{"id":140,"name":141,"github_repo":142,"description_zh":143,"stars":144,"difficulty_score":111,"last_commit_at":145,"category_tags":146,"status":54},3128,"ragflow","infiniflow\u002Fragflow","RAGFlow 是一款领先的开源检索增强生成（RAG）引擎，旨在为大语言模型构建更精准、可靠的上下文层。它巧妙地将前沿的 RAG 技术与智能体（Agent）能力相结合，不仅支持从各类文档中高效提取知识，还能让模型基于这些知识进行逻辑推理和任务执行。\n\n在大模型应用中，幻觉问题和知识滞后是常见痛点。RAGFlow 通过深度解析复杂文档结构（如表格、图表及混合排版），显著提升了信息检索的准确度，从而有效减少模型“胡编乱造”的现象，确保回答既有据可依又具备时效性。其内置的智能体机制更进一步，使系统不仅能回答问题，还能自主规划步骤解决复杂问题。\n\n这款工具特别适合开发者、企业技术团队以及 AI 研究人员使用。无论是希望快速搭建私有知识库问答系统，还是致力于探索大模型在垂直领域落地的创新者，都能从中受益。RAGFlow 提供了可视化的工作流编排界面和灵活的 API 接口，既降低了非算法背景用户的上手门槛，也满足了专业开发者对系统深度定制的需求。作为基于 Apache 2.0 
协议开源的项目，它正成为连接通用大模型与行业专有知识之间的重要桥梁。",77062,"2026-04-04T04:44:48",[115,48,114,137,136],{"id":148,"name":149,"github_repo":150,"description_zh":151,"stars":152,"difficulty_score":111,"last_commit_at":153,"category_tags":154,"status":54},519,"PaddleOCR","PaddlePaddle\u002FPaddleOCR","PaddleOCR 是一款基于百度飞桨框架开发的高性能开源光学字符识别工具包。它的核心能力是将图片、PDF 等文档中的文字提取出来，转换成计算机可读取的结构化数据，让机器真正“看懂”图文内容。\n\n面对海量纸质或电子文档，PaddleOCR 解决了人工录入效率低、数字化成本高的问题。尤其在人工智能领域，它扮演着连接图像与大型语言模型（LLM）的桥梁角色，能将视觉信息直接转化为文本输入，助力智能问答、文档分析等应用场景落地。\n\nPaddleOCR 适合开发者、算法研究人员以及有文档自动化需求的普通用户。其技术优势十分明显：不仅支持全球 100 多种语言的识别，还能在 Windows、Linux、macOS 等多个系统上运行，并灵活适配 CPU、GPU、NPU 等各类硬件。作为一个轻量级且社区活跃的开源项目，PaddleOCR 既能满足快速集成的需求，也能支撑前沿的视觉语言研究，是处理文字识别任务的理想选择。",74939,"2026-04-05T23:16:38",[137,48,114,136],{"id":156,"name":157,"github_repo":158,"description_zh":159,"stars":160,"difficulty_score":122,"last_commit_at":161,"category_tags":162,"status":54},2471,"tesseract","tesseract-ocr\u002Ftesseract","Tesseract 是一款历史悠久且备受推崇的开源光学字符识别（OCR）引擎，最初由惠普实验室开发，后由 Google 维护，目前由全球社区共同贡献。它的核心功能是将图片中的文字转化为可编辑、可搜索的文本数据，有效解决了从扫描件、照片或 PDF 文档中提取文字信息的难题，是数字化归档和信息自动化的重要基础工具。\n\n在技术层面，Tesseract 展现了强大的适应能力。从版本 4 开始，它引入了基于长短期记忆网络（LSTM）的神经网络 OCR 引擎，显著提升了行识别的准确率；同时，为了兼顾旧有需求，它依然支持传统的字符模式识别引擎。Tesseract 原生支持 UTF-8 编码，开箱即用即可识别超过 100 种语言，并兼容 PNG、JPEG、TIFF 等多种常见图像格式。输出方面，它灵活支持纯文本、hOCR、PDF、TSV 等多种格式，方便后续数据处理。\n\nTesseract 主要面向开发者、研究人员以及需要构建文档处理流程的企业用户。由于它本身是一个命令行工具和库（libtesseract），不包含图形用户界面（GUI），因此最适合具备一定编程能力的技术人员集成到自动化脚本或应用程序中。",73286,"2026-04-03T01:56:45",[114,48]]