[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-BR-IDL--PaddleViT":3,"tool-BR-IDL--PaddleViT":61},[4,18,26,36,44,53],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":17},4358,"openclaw","openclaw\u002Fopenclaw","OpenClaw 是一款专为个人打造的本地化 AI 助手，旨在让你在自己的设备上拥有完全可控的智能伙伴。它打破了传统 AI 助手局限于特定网页或应用的束缚，能够直接接入你日常使用的各类通讯渠道，包括微信、WhatsApp、Telegram、Discord、iMessage 等数十种平台。无论你在哪个聊天软件中发送消息，OpenClaw 都能即时响应，甚至支持在 macOS、iOS 和 Android 设备上进行语音交互，并提供实时的画布渲染功能供你操控。\n\n这款工具主要解决了用户对数据隐私、响应速度以及“始终在线”体验的需求。通过将 AI 部署在本地，用户无需依赖云端服务即可享受快速、私密的智能辅助，真正实现了“你的数据，你做主”。其独特的技术亮点在于强大的网关架构，将控制平面与核心助手分离，确保跨平台通信的流畅性与扩展性。\n\nOpenClaw 非常适合希望构建个性化工作流的技术爱好者、开发者，以及注重隐私保护且不愿被单一生态绑定的普通用户。只要具备基础的终端操作能力（支持 macOS、Linux 及 Windows WSL2），即可通过简单的命令行引导完成部署。如果你渴望拥有一个懂你",349277,3,"2026-04-06T06:32:30",[13,14,15,16],"Agent","开发框架","图像","数据工具","ready",{"id":19,"name":20,"github_repo":21,"description_zh":22,"stars":23,"difficulty_score":10,"last_commit_at":24,"category_tags":25,"status":17},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,"2026-04-05T11:01:52",[14,15,13],{"id":27,"name":28,"github_repo":29,"description_zh":30,"stars":31,"difficulty_score":32,"last_commit_at":33,"category_tags":34,"status":17},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 
代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",159267,2,"2026-04-17T11:29:14",[14,13,35],"语言模型",{"id":37,"name":38,"github_repo":39,"description_zh":40,"stars":41,"difficulty_score":32,"last_commit_at":42,"category_tags":43,"status":17},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",108322,"2026-04-10T11:39:34",[14,15,13],{"id":45,"name":46,"github_repo":47,"description_zh":48,"stars":49,"difficulty_score":32,"last_commit_at":50,"category_tags":51,"status":17},6121,"gemini-cli","google-gemini\u002Fgemini-cli","gemini-cli 是一款由谷歌推出的开源 AI 命令行工具，它将强大的 Gemini 大模型能力直接集成到用户的终端环境中。对于习惯在命令行工作的开发者而言，它提供了一条从输入提示词到获取模型响应的最短路径，无需切换窗口即可享受智能辅助。\n\n这款工具主要解决了开发过程中频繁上下文切换的痛点，让用户能在熟悉的终端界面内直接完成代码理解、生成、调试以及自动化运维任务。无论是查询大型代码库、根据草图生成应用，还是执行复杂的 Git 操作，gemini-cli 都能通过自然语言指令高效处理。\n\n它特别适合广大软件工程师、DevOps 人员及技术研究人员使用。其核心亮点包括支持高达 100 万 token 的超长上下文窗口，具备出色的逻辑推理能力；内置 Google 搜索、文件操作及 Shell 命令执行等实用工具；更独特的是，它支持 MCP（模型上下文协议），允许用户灵活扩展自定义集成，连接如图像生成等外部能力。此外，个人谷歌账号即可享受免费的额度支持，且项目基于 Apache 2.0 
协议完全开源，是提升终端工作效率的理想助手。",100752,"2026-04-10T01:20:03",[52,13,15,14],"插件",{"id":54,"name":55,"github_repo":56,"description_zh":57,"stars":58,"difficulty_score":32,"last_commit_at":59,"category_tags":60,"status":17},4721,"markitdown","microsoft\u002Fmarkitdown","MarkItDown 是一款由微软 AutoGen 团队打造的轻量级 Python 工具，专为将各类文件高效转换为 Markdown 格式而设计。它支持 PDF、Word、Excel、PPT、图片（含 OCR）、音频（含语音转录）、HTML 乃至 YouTube 链接等多种格式的解析，能够精准提取文档中的标题、列表、表格和链接等关键结构信息。\n\n在人工智能应用日益普及的今天，大语言模型（LLM）虽擅长处理文本，却难以直接读取复杂的二进制办公文档。MarkItDown 恰好解决了这一痛点，它将非结构化或半结构化的文件转化为模型“原生理解”且 Token 效率极高的 Markdown 格式，成为连接本地文件与 AI 分析 pipeline 的理想桥梁。此外，它还提供了 MCP（模型上下文协议）服务器，可无缝集成到 Claude Desktop 等 LLM 应用中。\n\n这款工具特别适合开发者、数据科学家及 AI 研究人员使用，尤其是那些需要构建文档检索增强生成（RAG）系统、进行批量文本分析或希望让 AI 助手直接“阅读”本地文件的用户。虽然生成的内容也具备一定可读性，但其核心优势在于为机器",93400,"2026-04-06T19:52:38",[52,14],{"id":62,"github_repo":63,"name":64,"description_en":65,"description_zh":66,"ai_summary_zh":66,"readme_en":67,"readme_zh":68,"quickstart_zh":69,"use_case_zh":70,"hero_image_url":71,"owner_login":72,"owner_name":73,"owner_avatar_url":74,"owner_bio":73,"owner_company":73,"owner_location":73,"owner_email":73,"owner_twitter":73,"owner_website":73,"owner_url":75,"languages":76,"stars":85,"forks":86,"last_commit_at":87,"license":88,"difficulty_score":10,"env_os":89,"env_gpu":90,"env_ram":89,"env_deps":91,"category_tags":95,"github_topics":96,"view_count":32,"oss_zip_url":73,"oss_zip_packed_at":73,"status":17,"created_at":111,"updated_at":112,"faqs":113,"releases":143},8545,"BR-IDL\u002FPaddleViT","PaddleViT",":robot: PaddleViT: State-of-the-art Visual Transformer and MLP Models for PaddlePaddle 2.0+","PaddleViT 是百度飞桨（PaddlePaddle）生态中专注于视觉 Transformer 和 MLP 模型的开源工具箱。它突破了传统卷积神经网络的局限，汇集了包括 ViT、Visual Attention 及纯 MLP 架构在内的多种前沿视觉模型，旨在复现并优化计算机视觉领域的最先进成果。\n\n该工具主要解决了开发者在尝试新型非卷积架构时面临的代码复现难、训练流程复杂以及缺乏统一框架等痛点。PaddleViT 不仅提供了图像分类、目标检测、语义分割及生成对抗网络（GAN）等多种任务的全套解决方案，还内置了数据增强、混合精度训练（AMP）和分布式训练（DDP）等高效工具，让用户能轻松加载预训练权重进行微调或直接开展科研实验。\n\nPaddleViT 
特别适合计算机视觉研究人员、算法工程师以及希望探索前沿技术的高校师生使用。其模块化设计使得模型定义清晰独立，便于快速修改以验证新想法；同时，统一的配置接口降低了使用门槛，让教育者和从业者都能直观上手。依托飞桨框架的强大支持，配合丰富的中文教程与在线课程，PaddleViT 致力于让尖端的视觉技术变得更易用、更普及。","English | [简体中文](.\u002FREADME_cn.md)\n\n# PaddlePaddle Vision Transformers #\n\n[![GitHub](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Flicense\u002FBR-IDL\u002FPaddleViT?color=blue)](.\u002FLICENSE)\n[![CodeFactor](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FBR-IDL_PaddleViT_readme_9ee0cb95ac54.png)](https:\u002F\u002Fwww.codefactor.io\u002Frepository\u002Fgithub\u002Fbr-idl\u002Fpaddlevit)\n[![CLA assistant](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FBR-IDL_PaddleViT_readme_c50f92066d4a.png)](https:\u002F\u002Fcla-assistant.io\u002FBR-IDL\u002FPaddleViT)\n[![GitHub Repo stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FBR-IDL\u002FPaddleViT?style=social)](https:\u002F\u002Fgithub.com\u002FBR-IDL\u002FPaddleViT\u002Fstargazers)\n\n\n\u003Cp align=\"center\">    \n    \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FBR-IDL_PaddleViT_readme_6b1c2a28ef70.png\" width=\"100%\"\u002F>\n\u003C\u002Fp>\n \n## State-of-the-art Visual Transformer and MLP Models for PaddlePaddle ##\n\n:robot: PaddlePaddle Visual Transformers (`PaddleViT` or `PPViT`) is a collection of vision models beyond convolution. Most of the models are based on Visual Transformers, Visual Attentions, and MLPs. PaddleViT also integrates popular layers, utilities, optimizers, schedulers, data augmentations, and training\u002Fvalidation scripts for PaddlePaddle 2.1+. The aim is to reproduce a wide variety of state-of-the-art ViT and MLP models with full training\u002Fvalidation procedures. We are passionate about making cutting-edge CV techniques easier to use for everyone.\n\n:robot: PaddleViT provides models and tools for multiple vision tasks, such as image classification, object detection, semantic segmentation, GAN, and more. 
Each model architecture is defined in a standalone Python module and can be modified to enable quick research experiments. At the same time, pretrained weights can be downloaded and used to finetune on your own datasets. PaddleViT also integrates popular tools and modules for customized datasets, data preprocessing, performance metrics, DDP, and more.\n\n:robot: PaddleViT is backed by the popular deep learning framework [PaddlePaddle](https:\u002F\u002Fwww.paddlepaddle.org\u002F). We also provide tutorials and projects on [Paddle AI Studio](https:\u002F\u002Faistudio.baidu.com\u002Faistudio\u002Fcourse\u002Fintroduce\u002F25102). Getting started is intuitive and straightforward for new users.\n\n\n## Quick Links ##\nPaddleViT implements model architectures and tools for multiple vision tasks. See the following links for detailed information.\n- [PaddleViT-Cls](.\u002Fimage_classification) for image classification\n- [PaddleViT-Det](.\u002Fobject_detection\u002FDETR) for object detection\n- [PaddleViT-Seg](.\u002Fsemantic_segmentation) for semantic segmentation\n- [PaddleViT-GAN](.\u002Fgan) for GANs\n- [Docs](.\u002Fdocs\u002F) for tutorials and documentation\n- [docs-export](.\u002Fdocs\u002Fpaddlevit-export-en.md) for exporting PaddleViT models to inference models for production deployment\n  \nWe also provide tutorials:\n- [Online Course](https:\u002F\u002Faistudio.baidu.com\u002Faistudio\u002Fcourse\u002Fintroduce\u002F25102): on Paddle AI Studio (in Chinese)\n\n## Features ##\n1. **State-of-the-art**\n   - State-of-the-art transformer models for multiple CV tasks\n   - State-of-the-art data processing and training methods\n   - We keep pushing it forward.\n\n2. **Easy-to-use tools**\n   - Easy configs for model variants\n   - Modular design for utility functions and tools\n   - Low barrier for educators and practitioners\n   - Unified framework for all the models\n\n3. 
**Easily customizable to your needs**\n   - Examples for each model to reproduce the results\n   - Model implementations are exposed for you to customize\n   - Model files can be used independently for quick experiments\n\n4. **High Performance**\n   - DDP (multiprocess training\u002Fvalidation where each process runs on a single GPU).\n   - Mixed-precision support (AMP)\n  \n\n  \n## Model architectures ##\n\n### Image Classification (Transformers) ###\n1. **[ViT](.\u002Fimage_classification\u002FViT)** (from Google), released with paper [An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale](https:\u002F\u002Farxiv.org\u002Fabs\u002F2010.11929), by Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, Neil Houlsby.\n2. **[DeiT](.\u002Fimage_classification\u002FDeiT)** (from Facebook and Sorbonne), released with paper [Training data-efficient image transformers & distillation through attention](https:\u002F\u002Farxiv.org\u002Fabs\u002F2012.12877), by Hugo Touvron, Matthieu Cord, Matthijs Douze, Francisco Massa, Alexandre Sablayrolles, Hervé Jégou.\n3. **[Swin Transformer](.\u002Fimage_classification\u002FSwinTransformer)** (from Microsoft), released with paper [Swin Transformer: Hierarchical Vision Transformer using Shifted Windows](https:\u002F\u002Farxiv.org\u002Fabs\u002F2103.14030), by Ze Liu, Yutong Lin, Yue Cao, Han Hu, Yixuan Wei, Zheng Zhang, Stephen Lin, Baining Guo.\n4. **[VOLO](.\u002Fimage_classification\u002FVOLO)** (from Sea AI Lab and NUS), released with paper [VOLO: Vision Outlooker for Visual Recognition](https:\u002F\u002Farxiv.org\u002Fabs\u002F2106.13112), by Li Yuan, Qibin Hou, Zihang Jiang, Jiashi Feng, Shuicheng Yan.\n5. 
**[CSwin Transformer](.\u002Fimage_classification\u002FCSwin)** (from USTC and Microsoft), released with paper [CSWin Transformer: A General Vision Transformer Backbone with Cross-Shaped Windows\n](https:\u002F\u002Farxiv.org\u002Fabs\u002F2107.00652), by Xiaoyi Dong, Jianmin Bao, Dongdong Chen, Weiming Zhang, Nenghai Yu, Lu Yuan, Dong Chen, Baining Guo.\n6. **[CaiT](.\u002Fimage_classification\u002FCaiT)** (from Facebook and Sorbonne), released with paper [Going deeper with Image Transformers](https:\u002F\u002Farxiv.org\u002Fabs\u002F2103.17239), by Hugo Touvron, Matthieu Cord, Alexandre Sablayrolles, Gabriel Synnaeve, Hervé Jégou.\n7. **[PVTv2](.\u002Fimage_classification\u002FPVTv2)** (from NJU\u002FHKU\u002FNJUST\u002FIIAI\u002FSenseTime), released with paper [PVTv2: Improved Baselines with Pyramid Vision Transformer](https:\u002F\u002Farxiv.org\u002Fabs\u002F2106.13797), by Wenhai Wang, Enze Xie, Xiang Li, Deng-Ping Fan, Kaitao Song, Ding Liang, Tong Lu, Ping Luo, Ling Shao.\n8. **[Shuffle Transformer](.\u002Fimage_classification\u002FShuffle_Transformer)** (from Tencent), released with paper [Shuffle Transformer: Rethinking Spatial Shuffle for Vision Transformer](https:\u002F\u002Farxiv.org\u002Fabs\u002F2106.03650), by Zilong Huang, Youcheng Ben, Guozhong Luo, Pei Cheng, Gang Yu, Bin Fu.\n9. **[T2T-ViT](.\u002Fimage_classification\u002FT2T_ViT)** (from NUS and YITU), released with paper [Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet\n](https:\u002F\u002Farxiv.org\u002Fabs\u002F2101.11986), by Li Yuan, Yunpeng Chen, Tao Wang, Weihao Yu, Yujun Shi, Zihang Jiang, Francis EH Tay, Jiashi Feng, Shuicheng Yan.\n10. **[CrossViT](.\u002Fimage_classification\u002FCrossViT)** (from IBM), released with paper [CrossViT: Cross-Attention Multi-Scale Vision Transformer for Image Classification](https:\u002F\u002Farxiv.org\u002Fabs\u002F2103.14899), by Chun-Fu Chen, Quanfu Fan, Rameswar Panda.\n11. 
**[BEiT](.\u002Fimage_classification\u002FBEiT)** (from Microsoft Research), released with paper [BEiT: BERT Pre-Training of Image Transformers](https:\u002F\u002Farxiv.org\u002Fabs\u002F2106.08254), by Hangbo Bao, Li Dong, Furu Wei.\n12. **[Focal Transformer](.\u002Fimage_classification\u002FFocal_Transformer)** (from Microsoft), released with paper [Focal Self-attention for Local-Global Interactions in Vision Transformers](https:\u002F\u002Farxiv.org\u002Fabs\u002F2107.00641), by Jianwei Yang, Chunyuan Li, Pengchuan Zhang, Xiyang Dai, Bin Xiao, Lu Yuan and Jianfeng Gao.\n13. **[Mobile-ViT](.\u002Fimage_classification\u002FMobileViT)** (from Apple), released with paper [MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer](https:\u002F\u002Farxiv.org\u002Fabs\u002F2110.02178), by Sachin Mehta, Mohammad Rastegari.\n14. **[ViP](.\u002Fimage_classification\u002FViP)** (from National University of Singapore), released with [Vision Permutator: A Permutable MLP-Like Architecture for Visual Recognition](https:\u002F\u002Farxiv.org\u002Fabs\u002F2106.12368), by Qibin Hou and Zihang Jiang and Li Yuan and Ming-Ming Cheng and Shuicheng Yan and Jiashi Feng.\n15. **[XCiT](.\u002Fimage_classification\u002FXCiT)** (from Facebook\u002FInria\u002FSorbonne), released with paper [XCiT: Cross-Covariance Image Transformers](https:\u002F\u002Farxiv.org\u002Fabs\u002F2106.09681), by Alaaeldin El-Nouby, Hugo Touvron, Mathilde Caron, Piotr Bojanowski, Matthijs Douze, Armand Joulin, Ivan Laptev, Natalia Neverova, Gabriel Synnaeve, Jakob Verbeek, Hervé Jegou.\n16. **[PiT](.\u002Fimage_classification\u002FPiT)** (from NAVER\u002FSogan University), released with paper [Rethinking Spatial Dimensions of Vision Transformers](https:\u002F\u002Farxiv.org\u002Fabs\u002F2103.16302), by Byeongho Heo, Sangdoo Yun, Dongyoon Han, Sanghyuk Chun, Junsuk Choe, Seong Joon Oh.\n17. 
**[HaloNet](.\u002Fimage_classification\u002FHaloNet)** (from Google), released with paper [Scaling Local Self-Attention for Parameter Efficient Visual Backbones](https:\u002F\u002Farxiv.org\u002Fabs\u002F2103.12731), by Ashish Vaswani, Prajit Ramachandran, Aravind Srinivas, Niki Parmar, Blake Hechtman, Jonathon Shlens.\n18. **[PoolFormer](.\u002Fimage_classification\u002FPoolFormer)** (from Sea AI Lab\u002FNUS), released with paper [MetaFormer is Actually What You Need for Vision](https:\u002F\u002Farxiv.org\u002Fabs\u002F2111.11418), by Weihao Yu, Mi Luo, Pan Zhou, Chenyang Si, Yichen Zhou, Xinchao Wang, Jiashi Feng, Shuicheng Yan.\n19. **[BoTNet](.\u002Fimage_classification\u002FBoTNet)** (from UC Berkeley\u002FGoogle), released with paper [Bottleneck Transformers for Visual Recognition](https:\u002F\u002Farxiv.org\u002Fabs\u002F2101.11605), by Aravind Srinivas, Tsung-Yi Lin, Niki Parmar, Jonathon Shlens, Pieter Abbeel, Ashish Vaswani.\n20. **[CvT](.\u002Fimage_classification\u002FCvT)** (from McGill\u002FMicrosoft), released with paper [CvT: Introducing Convolutions to Vision Transformers](https:\u002F\u002Farxiv.org\u002Fabs\u002F2103.15808), by Haiping Wu, Bin Xiao, Noel Codella, Mengchen Liu, Xiyang Dai, Lu Yuan, Lei Zhang.\n21. **[HvT](.\u002Fimage_classification\u002FHVT)** (from Monash University), released with paper [Scalable Vision Transformers with Hierarchical Pooling](https:\u002F\u002Farxiv.org\u002Fabs\u002F2103.10619), by Zizheng Pan, Bohan Zhuang, Jing Liu, Haoyu He, Jianfei Cai.\n22. **[TopFormer](.\u002Fimage_classification\u002FTopFormer)** (from HUST\u002FTencent\u002FFudan\u002FZJU), released with paper [TopFormer: Token Pyramid Transformer for Mobile Semantic Segmentation](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2204.05525.pdf), by Wenqiang Zhang, Zilong Huang, Guozhong Luo, Tao Chen, Xinggang Wang, Wenyu Liu, Gang Yu, Chunhua Shen.\n23. 
**[ConvNeXt](.\u002Fimage_classification\u002FConvNeXt)** (from FAIR\u002FUC Berkeley), released with paper [A ConvNet for the 2020s](https:\u002F\u002Farxiv.org\u002Fabs\u002F2201.03545), by Zhuang Liu, Hanzi Mao, Chao-Yuan Wu, Christoph Feichtenhofer, Trevor Darrell, Saining Xie.\n24. **[CoaT](.\u002Fimage_classification\u002FCoaT)** (from UCSD), released with paper [Co-Scale Conv-Attentional Image Transformers](https:\u002F\u002Farxiv.org\u002Fabs\u002F2104.06399), by Weijian Xu, Yifan Xu, Tyler Chang, Zhuowen Tu.\n25. **[ResT](.\u002Fimage_classification\u002FResT)** (from NJU), released with paper [ResT: An Efficient Transformer for Visual Recognition](https:\u002F\u002Farxiv.org\u002Fabs\u002F2105.13677), by Qinglong Zhang, Yubin Yang.\n26. **[ResTV2](.\u002Fimage_classification\u002FResT)** (from NJU), released with paper [ResT V2: Simpler, Faster and Stronger](https:\u002F\u002Farxiv.org\u002Fabs\u002F2204.07366), by Qinglong Zhang, Yubin Yang.\n\n\n### Image Classification (MLP & others) ###\n1. **[MLP-Mixer](.\u002Fimage_classification\u002FMLP-Mixer)** (from Google), released with paper [MLP-Mixer: An all-MLP Architecture for Vision](https:\u002F\u002Farxiv.org\u002Fabs\u002F2105.01601), by Ilya Tolstikhin, Neil Houlsby, Alexander Kolesnikov, Lucas Beyer, Xiaohua Zhai, Thomas Unterthiner, Jessica Yung, Andreas Steiner, Daniel Keysers, Jakob Uszkoreit, Mario Lucic, Alexey Dosovitskiy.\n2. **[ResMLP](.\u002Fimage_classification\u002FResMLP)** (from Facebook\u002FSorbonne\u002FInria\u002FValeo), released with paper [ResMLP: Feedforward networks for image classification with data-efficient training](https:\u002F\u002Farxiv.org\u002Fabs\u002F2105.03404), by Hugo Touvron, Piotr Bojanowski, Mathilde Caron, Matthieu Cord, Alaaeldin El-Nouby, Edouard Grave, Gautier Izacard, Armand Joulin, Gabriel Synnaeve, Jakob Verbeek, Hervé Jégou.\n3. 
**[gMLP](.\u002Fimage_classification\u002FgMLP)** (from Google), released with paper [Pay Attention to MLPs](https:\u002F\u002Farxiv.org\u002Fabs\u002F2105.08050), by Hanxiao Liu, Zihang Dai, David R. So, Quoc V. Le.\n4. **[FF Only](.\u002Fimage_classification\u002FFF_Only)** (from Oxford), released with paper [Do You Even Need Attention? A Stack of Feed-Forward Layers Does Surprisingly Well on ImageNet](https:\u002F\u002Farxiv.org\u002Fabs\u002F2105.02723), by Luke Melas-Kyriazi.\n5. **[RepMLP](.\u002Fimage_classification\u002FRepMLP)** (from BNRist\u002FTsinghua\u002FMEGVII\u002FAberystwyth), released with paper [RepMLP: Re-parameterizing Convolutions into Fully-connected Layers for Image Recognition](https:\u002F\u002Farxiv.org\u002Fabs\u002F2105.01883), by Xiaohan Ding, Chunlong Xia, Xiangyu Zhang, Xiaojie Chu, Jungong Han, Guiguang Ding.\n6. **[CycleMLP](.\u002Fimage_classification\u002FCycleMLP)** (from HKU\u002FSenseTime), released with paper [CycleMLP: A MLP-like Architecture for Dense Prediction](https:\u002F\u002Farxiv.org\u002Fabs\u002F2107.10224), by Shoufa Chen, Enze Xie, Chongjian Ge, Ding Liang, Ping Luo.\n7. **[ConvMixer](.\u002Fimage_classification\u002FConvMixer)** (from Anonymous), released with paper [Patches Are All You Need?](https:\u002F\u002Fopenreview.net\u002Fforum?id=TVHS5Y4dNvM), by Anonymous.\n8. **[ConvMLP](.\u002Fimage_classification\u002FConvMLP)** (from UO\u002FUIUC\u002FPAIR), released with paper [ConvMLP: Hierarchical Convolutional MLPs for Vision](https:\u002F\u002Farxiv.org\u002Fabs\u002F2109.04454), by Jiachen Li, Ali Hassani, Steven Walton, Humphrey Shi.\n9. **[RepLKNet](.\u002FRepLKNet)** (from Tsinghua\u002FMEGVII\u002FAberystwyth), released with paper [Scaling Up Your Kernels to 31x31: Revisiting Large Kernel Design in CNNs](https:\u002F\u002Farxiv.org\u002Fabs\u002F2203.06717), by Xiaohan Ding, Xiangyu Zhang, Yizhuang Zhou, Jungong Han, Guiguang Ding, Jian Sun.\n10. 
**[MobileOne](.\u002FMobileOne)** (from Apple), released with [An Improved One millisecond Mobile Backbone](https:\u002F\u002Farxiv.org\u002Fabs\u002F2206.04040), by Pavan Kumar Anasosalu Vasu, James Gabriel, Jeff Zhu, Oncel Tuzel, Anurag Ranjan.\n\n\n\n### Detection ###\n1. **[DETR](.\u002Fobject_detection\u002FDETR)** (from Facebook), released with paper [End-to-End Object Detection with Transformers](https:\u002F\u002Farxiv.org\u002Fabs\u002F2005.12872), by Nicolas Carion, Francisco Massa, Gabriel Synnaeve, Nicolas Usunier, Alexander Kirillov, Sergey Zagoruyko.\n2. **[Swin Transformer](.\u002Fobject_detection\u002FSwin)** (from Microsoft), released with paper [Swin Transformer: Hierarchical Vision Transformer using Shifted Windows](https:\u002F\u002Farxiv.org\u002Fabs\u002F2103.14030), by Ze Liu, Yutong Lin, Yue Cao, Han Hu, Yixuan Wei, Zheng Zhang, Stephen Lin, Baining Guo.\n3. **[PVTv2](.\u002Fobject_detection\u002FPVTv2)** (from NJU\u002FHKU\u002FNJUST\u002FIIAI\u002FSenseTime), released with paper [PVTv2: Improved Baselines with Pyramid Vision Transformer](https:\u002F\u002Farxiv.org\u002Fabs\u002F2106.13797), by Wenhai Wang, Enze Xie, Xiang Li, Deng-Ping Fan, Kaitao Song, Ding Liang, Tong Lu, Ping Luo, Ling Shao.\n\n#### Coming Soon: ####\n1. **[Focal Transformer]()** (from Microsoft), released with paper [Focal Self-attention for Local-Global Interactions in Vision Transformers](https:\u002F\u002Farxiv.org\u002Fabs\u002F2107.00641), by Jianwei Yang, Chunyuan Li, Pengchuan Zhang, Xiyang Dai, Bin Xiao, Lu Yuan and Jianfeng Gao.\n2. **[UP-DETR]()** (from Tencent), released with paper [UP-DETR: Unsupervised Pre-training for Object Detection with Transformers](https:\u002F\u002Farxiv.org\u002Fabs\u002F2011.09094), by Zhigang Dai, Bolun Cai, Yugeng Lin, Junying Chen.\n\n\n\n\n### Semantic Segmentation ###\n#### Now: ####\n1. 
**[SETR](.\u002Fsemantic_segmentation)** (from Fudan\u002FOxford\u002FSurrey\u002FTencent\u002FFacebook), released with paper [Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers](https:\u002F\u002Farxiv.org\u002Fabs\u002F2012.15840), by Sixiao Zheng, Jiachen Lu, Hengshuang Zhao, Xiatian Zhu, Zekun Luo, Yabiao Wang, Yanwei Fu, Jianfeng Feng, Tao Xiang, Philip H.S. Torr, Li Zhang.\n2. **[DPT](.\u002Fsemantic_segmentation)** (from Intel), released with paper [Vision Transformers for Dense Prediction](https:\u002F\u002Farxiv.org\u002Fabs\u002F2103.13413), by René Ranftl, Alexey Bochkovskiy, Vladlen Koltun.\n3. **[Swin Transformer](.\u002Fsemantic_segmentation)** (from Microsoft), released with paper [Swin Transformer: Hierarchical Vision Transformer using Shifted Windows](https:\u002F\u002Farxiv.org\u002Fabs\u002F2103.14030), by Ze Liu, Yutong Lin, Yue Cao, Han Hu, Yixuan Wei, Zheng Zhang, Stephen Lin, Baining Guo.\n4. **[Segmenter](.\u002Fsemantic_segmentation)** (from Inria), released with paper [Segmenter: Transformer for Semantic Segmentation](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2105.05633.pdf), by Robin Strudel, Ricardo Garcia, Ivan Laptev, Cordelia Schmid.\n5. **[Trans2Seg](.\u002Fsemantic_segmentation)** (from HKU\u002FSenseTime\u002FNJU), released with paper [Segmenting Transparent Object in the Wild with Transformer](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2101.08461.pdf), by Enze Xie, Wenjia Wang, Wenhai Wang, Peize Sun, Hang Xu, Ding Liang, Ping Luo.\n6. **[SegFormer](.\u002Fsemantic_segmentation)** (from HKU\u002FNJU\u002FNVIDIA\u002FCaltech), released with paper [SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers](https:\u002F\u002Farxiv.org\u002Fabs\u002F2105.15203), by Enze Xie, Wenhai Wang, Zhiding Yu, Anima Anandkumar, Jose M. Alvarez, Ping Luo.\n7. 
**[CSwin Transformer]()** (from USTC and Microsoft), released with paper [CSWin Transformer: A General Vision Transformer Backbone with Cross-Shaped Windows](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2107.00652.pdf), by Xiaoyi Dong, Jianmin Bao, Dongdong Chen, Weiming Zhang, Nenghai Yu, Lu Yuan, Dong Chen, Baining Guo.\n8. **[TopFormer](.\u002Fsemantic_segmentation)** (from HUST\u002FTencent\u002FFudan\u002FZJU), released with paper [TopFormer: Token Pyramid Transformer for Mobile Semantic Segmentation](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2204.05525.pdf).\n\n#### Coming Soon: ####\n1. **[FTN]()** (from Baidu), released with paper [Fully Transformer Networks for Semantic Image Segmentation](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2106.04108.pdf), by Sitong Wu, Tianyi Wu, Fangjian Lin, Shengwei Tian, Guodong Guo.\n2. **[Shuffle Transformer]()** (from Tencent), released with paper [Shuffle Transformer: Rethinking Spatial Shuffle for Vision Transformer](https:\u002F\u002Farxiv.org\u002Fabs\u002F2106.03650), by Zilong Huang, Youcheng Ben, Guozhong Luo, Pei Cheng, Gang Yu, Bin Fu.\n3. **[Focal Transformer]()** (from Microsoft), released with paper [Focal Self-attention for Local-Global Interactions in Vision Transformers](https:\u002F\u002Farxiv.org\u002Fabs\u002F2107.00641), by Jianwei Yang, Chunyuan Li, Pengchuan Zhang, Xiyang Dai, Bin Xiao, Lu Yuan and Jianfeng Gao.\n\n\n### GAN ###\n1. **[TransGAN](.\u002Fgan\u002FtransGAN)** (from UT Austin and MIT-IBM Watson AI Lab), released with paper [TransGAN: Two Pure Transformers Can Make One Strong GAN, and That Can Scale Up](https:\u002F\u002Farxiv.org\u002Fabs\u002F2102.07074), by Yifan Jiang, Shiyu Chang, Zhangyang Wang.\n2. 
**[Styleformer](.\u002Fgan\u002FStyleformer)** (from Facebook and Sorbonne), released with paper [Styleformer: Transformer based Generative Adversarial Networks with Style Vector](https:\u002F\u002Farxiv.org\u002Fabs\u002F2106.07023), by Jeeseung Park, Younggeun Kim.\n#### Coming Soon: ####\n1. **[ViTGAN]()** (from UCSD\u002FGoogle), released with paper [ViTGAN: Training GANs with Vision Transformers](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2107.04589), by Kwonjoon Lee, Huiwen Chang, Lu Jiang, Han Zhang, Zhuowen Tu, Ce Liu.\n\n\n\n## Installation\n### Prerequisites\n* Linux\u002FmacOS\u002FWindows\n* Python 3.6\u002F3.7\n* PaddlePaddle 2.1.0+\n* CUDA 10.2+\n> Note: It is recommended to install the latest version of PaddlePaddle to avoid some CUDA errors during PaddleViT training. To install PaddlePaddle, refer to this [link](https:\u002F\u002Fwww.paddlepaddle.org.cn\u002Finstall\u002Fquick?docurl=\u002Fdocumentation\u002Fdocs\u002Fzh\u002Finstall\u002Fpip\u002Flinux-pip.html) for the stable version and this [link](https:\u002F\u002Fwww.paddlepaddle.org.cn\u002Finstall\u002Fquick?docurl=\u002Fdocumentation\u002Fdocs\u002Fzh\u002Fdevelop\u002Finstall\u002Fpip\u002Flinux-pip.html#gpu) for the develop version. \n### Installation\n1. Create a conda virtual environment and activate it.\n   ```shell\n   conda create -n paddlevit python=3.7 -y\n   conda activate paddlevit\n   ```\n2. Install PaddlePaddle following the official instructions, e.g.,\n   ```shell\n   conda install paddlepaddle-gpu==2.1.2 cudatoolkit=10.2 --channel https:\u002F\u002Fmirrors.tuna.tsinghua.edu.cn\u002Fanaconda\u002Fcloud\u002FPaddle\u002F\n   ```\n   > Note: change the PaddlePaddle and CUDA versions to match your environment.\n\n3. 
Install dependency packages\n    * General dependencies:\n        ```\n        pip install yacs pyyaml\n        ```\n    * Packages for Segmentation:\n        ```\n        pip install cityscapesScripts\n        ```\n        Install `detail` package:\n        ```shell\n        git clone https:\u002F\u002Fgithub.com\u002Fccvl\u002Fdetail-api\n        cd detail-api\u002FPythonAPI\n        make\n        make install\n        ```\n    * Packages for GAN:\n        ```\n        pip install lmdb\n        ```\n4. Clone project from GitHub\n    ```\n    git clone https:\u002F\u002Fgithub.com\u002FBR-IDL\u002FPaddleViT.git \n    ```\n\n\n## Results (Model Zoo) ## \n### Image Classification ###\n| Model                         | Acc@1 | Acc@5 | #Params | FLOPs  | Image Size | Crop pct | Interp | Link         |\n|-------------------------------|-------|-------|---------|--------|------------|----------|---------------|--------------|\n| vit_base_patch32_224          | 80.68 | 95.61 | 88.2M   | 4.4G   | 224        | 0.875    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1DPEhEuu9sDdcmOPukQbR7ZcHq2bxx9cr\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1ppOLj5SWlJmA-NjoLCoYIw)(ubyr) |\n| vit_base_patch32_384          | 83.35 | 96.84 | 88.2M   | 12.7G  | 384        | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1nCOSwrDiFBFmTkLEThYwjL9SfyzkKoaf\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1jxnL00ocpmdiPM4fOu4lpg)(3c2f) |\n| vit_base_patch16_224          | 84.58 | 97.30 | 86.4M   | 17.0G  | 224        | 0.875    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F13D9FqU4ISsGxWXURgKW9eLOBV-pYPr-L\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1ms3o2fHMQpIoVqnEHitRtA)(qv4n) |\n| vit_base_patch16_384          | 85.99 | 98.00 | 86.4M   | 49.8G  | 384        | 1.0      | 
bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1kWKaAgneDx0QsECxtf7EnUdUZej6vSFT\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F15ggLdiL98RPcz__SXorrXA)(wsum) |\n| vit_large_patch16_224         | 85.81 | 97.82 | 304.1M  | 59.9G  | 224        | 0.875    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1jgwtmtp_cDWEhZE-FuWhs7lCdpqhAMft\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1HRxUJAwEiKgrWnJSjHyU0A)(1bgk) |\n| vit_large_patch16_384         | 87.08 | 98.30 | 304.1M  | 175.9G | 384        | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1zfw5mdiIm-mPxxQddBFxt0xX-IR-PF2U\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1KvxfIpMeitgXAUZGr5HV8A)(5t91) |\n| vit_large_patch32_384         | 81.51 | 96.09 | 306.5M  | 44.4G  | 384        | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1Py1EX3E35jL7DComW-29Usg9788BB26j\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1W8sUs0pObOGpohP4vsT05w)(ieg3) |\n| | | | | | | | | |\n| swin_t_224   \t\t\t\t\t| 81.37 | 95.54 | 28.3M   | 4.4G   | 224        | 0.9      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1v_wzWv3TaQ0RKkKwRQwuDPzwpOb_jGEs\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1tbc751RVh3fIRsrLzrmeOw)(h2ac) |\n| swin_s_224   \t\t\t\t\t| 83.21 | 96.32 | 49.6M   | 8.6G   | 224        | 0.9      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1lrODzr8zIOU9sBrH2x3zolMOS4mv4o7x\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1rlXL0tjLWbWnkIt_2Ne8Jw)(ydyx) |\n| swin_b_224   \t\t\t\t\t| 83.60 | 96.46 | 87.7M   | 15.3G  | 224        | 0.9      | bicubic       | 
[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1hjEVODThNEDAlIqkg8C1KzUh3KsVNu6R\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1ucSHBiuiG2sHAmR1N1JENQ)(h4y6) |\n| swin_b_384   \t\t\t\t\t| 84.48 | 96.89 | 87.7M   | 45.5G  | 384        | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1szLgwhB6WJu02Me6Uyz94egk8SqKlNsd\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1t0oXbqKNwpUAMJV7VTzcNw)(7nym) |\n| swin_b_224_22kto1k    \t\t| 85.27 | 97.56 | 87.7M   | 15.3G  | 224        | 0.9      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1FhdlheMUlJzrZ7EQobpGRxd3jt3aQniU\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1KBocL_M6YNW1ZsK-GYFiNw)(6ur8) |\n| swin_b_384_22kto1k    \t\t| 86.43 | 98.07 | 87.7M   | 45.5G  | 384        | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1zVwIrJmtuBSiSVQhUeblRQzCKx-yWNCA\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1NziwdsEJtmjfGCeUFgtZXA)(9squ) |\n| swin_l_224_22kto1k    \t\t| 86.32 | 97.90 | 196.4M  | 34.3G  | 224        | 0.9      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1yo7rkxKbQ4izy2pY5oQ5QAnkyv7zKcch\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1GsUJbSkGxlGsBYsayyKjVg)(nd2f) |\n| swin_l_384_22kto1k    \t\t| 87.14 | 98.23 | 196.4M  | 100.9G | 384        | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1-6DEvkb-FMz72MyKtq9vSPKYBqINxoKK\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1JLdS0aTl3I37oDzGKLFSqA)(5g5e) |\n| | | | | | | | | |\n| deit_tiny_distilled_224   \t| 74.52 | 91.90 | 5.9M    | 1.1G   | 224        | 0.875    | bicubic       | 
[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1fku9-11O_gQI7UpZTjagVeND-pcHbV0C\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1hAQ_85wWkqQ7sIGO1CmO9g)(rhda) |\n| deit_small_distilled_224  \t| 81.17 | 95.41 | 22.4M   | 4.3G   | 224        | 0.875    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1RIeWTdf5o6pwkjqN4NbW91GZSOCalI5t\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1wCVrukvwxISAGGjorPw3iw)(pv28) |\n| deit_base_distilled_224  \t\t| 83.32 | 96.49 | 87.2M   | 17.0G  | 224        | 0.875    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F12_x6-NN3Jde2BFUih4OM9NlTwe9-Xlkw\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1ZnmAWgT6ewe7Vl3Xw_csuA)(5f2g) |\n| deit_base_distilled_384  \t\t| 85.43 | 97.33 | 87.2M   | 49.9G  | 384        | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1i5H_zjSdHfM-Znv89DHTv9ChykWrIt8I\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1PQsQIci4VCHY7l2tCzMklg)(qgj2) |\n| | | | | | | | | |\n| volo_d1_224  \t\t\t\t\t| 84.12 | 96.78 | 26.6M   | 6.6G   | 224        | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1kNNtTh7MUWJpFSDe_7IoYTOpsZk5QSR9\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1EKlKl2oHi_24eaiES67Bgw)(xaim) |\n| volo_d1_384  \t\t\t\t\t| 85.24 | 97.21 | 26.6M   | 19.5G  | 384        | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1fku9-11O_gQI7UpZTjagVeND-pcHbV0C\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1qZWoFA7J89i2aujPItEdDQ)(rr7p) |\n| volo_d2_224  \t\t\t\t\t| 85.11 | 97.19 | 58.6M   | 13.7G  | 224        | 1.0      | bicubic       | 
[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1KjKzGpyPKq6ekmeEwttHlvOnQXqHK1we\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1JCK0iaYtiOZA6kn7e0wzUQ)(d82f) |\n| volo_d2_384  \t\t\t\t\t| 86.04 | 97.57 | 58.6M   | 40.7G  | 384        | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1uLLbvwNK8N0y6Wrq_Bo8vyBGSVhehVmq\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1e7H5aa6miGpCTCgpK0rm0w)(9cf3) |\n| volo_d3_224  \t\t\t\t\t| 85.41 | 97.26 | 86.2M   | 19.8G  | 224        | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1OtOX7C29fJ20ESKQnYGevp4euxhmXKAT\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1vhARtV2wfI6EFf0Ap71xwg)(a5a4) |\n| volo_d3_448  \t\t\t\t\t| 86.50 | 97.71 | 86.2M   | 80.3G  | 448        | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1lHlYhra1NNp0dp4NWaQ9SMNNmw-AxBNZ\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1Q6KiQw4Vu1GPm5RF9_eycg)(uudu) |\n| volo_d4_224  \t\t\t\t\t| 85.89 | 97.54 | 192.8M  | 42.9G  | 224        | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F16oXN7xuy-mkpfeD-loIVOK95PfptHhpX\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1PE83ZLd5evkKmHJ1V2KDsg)(vcf2) |\n| volo_d4_448  \t\t\t\t\t| 86.70 | 97.85 | 192.8M  | 172.5G | 448        | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1N9-1OhPewA5TBR9CX5oA10obDS8e4Cfa\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1QoJ2Sqe1SK9hxbmV4uZiyg)(nd4n) |\n| volo_d5_224  \t\t\t\t\t| 86.08 | 97.58 | 295.3M  | 70.6G  | 224        | 1.0      | bicubic       | 
[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1fcrvOGbAmKUhqJT-pU3MVJZQJIe4Qina\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1nqDcXMW00v9PKr3RQI-g1w)(ymdg) |\n| volo_d5_448  \t\t\t\t\t| 86.92 | 97.88 | 295.3M  | 283.8G | 448        | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1aFXEkpfLhmQlDQHUYCuFL8SobhxUzrZX\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1K4FBv6fnyMGcAXhyyybhgw)(qfcc) |\n| volo_d5_512  \t\t\t\t\t| 87.05 | 97.97 | 295.3M  | 371.3G | 512        | 1.15     | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1CS4-nv2c9FqOjMz7gdW5i9pguI79S6zk\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F16Wseyiqvv0MQJV8wwFDfSA)(353h) |\n| | | | | | | | | |\n| cswin_tiny_224  \t\t\t\t| 82.81 | 96.30 | 22.3M   | 4.2G   | 224        | 0.9      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1l-JY0u7NGyD6SjkyiyNnDx3wFFT1nAYO\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1L5FqU7ImWAhQHAlSilqVAw)(4q3h) |\n| cswin_small_224 \t\t\t\t| 83.60 | 96.58 | 34.6M   | 6.5G   | 224        | 0.9      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F10eEBk3wvJdQ8Dy58LvQ11Wk1K2UfPy-E\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1FiaNiWyAuWu1IBsUFLUaAw)(gt1a) |\n| cswin_base_224  \t\t\t\t| 84.23 | 96.91 | 77.4M   | 14.6G  | 224        | 0.9      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1YufKh3DKol4-HrF-I22uiorXSZDIXJmZ\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1koy8hXyGwvgAfUxdlkWofg)(wj8p) |\n| cswin_base_384  \t\t\t\t| 85.51 | 97.48 | 77.4M   | 43.1G  | 384        | 1.0      | bicubic       | 
[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1qCaFItzFoTYBo-4UbGzL6M5qVDGmJt4y\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1WNkY7o_vP9KJ8cd5c7n2sQ)(rkf5) |\n| cswin_large_224 \t\t\t\t| 86.52 | 97.99 | 173.3M  | 32.5G  | 224        | 0.9      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1V1hteGK27t1nI84Ac7jdWfydBLLo7Fxt\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1KgIX6btML6kPiPGkIzvyVA)(b5fs) |\n| cswin_large_384 \t\t\t\t| 87.49 | 98.35 | 173.3M  | 96.1G  | 384        | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1LRN_6qUz71yP-OAOpN4Lscb8fkUytMic\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1eCIpegPj1HIbJccPMaAsew)(6235) |\n| | | | | | | | | |\n| cait_xxs24_224                | 78.38 | 94.32 | 11.9M   | 2.2G   | 224        | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1LKsQUr824oY4E42QeUEaFt41I8xHNseR\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1YIaBLopKIK5_p7NlgWHpGA)(j9m8) |\n| cait_xxs36_224                | 79.75 | 94.88 | 17.2M   | 33.1G  | 224        | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1zZx4aQJPJElEjN5yejUNsocPsgnd_3tS\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1pdyFreRRXUn0yPel00-62Q)(nebg) |\n| cait_xxs24_384                | 80.97 | 95.64 | 11.9M   | 6.8G   | 384        | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1J27ipknh_kwqYwR0qOqE9Pj3_bTcTx95\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1uYSDzROqCVT7UdShRiiDYg)(2j95) |\n| cait_xxs36_384                | 82.20 | 96.15 | 17.2M   | 10.1G  | 384        | 1.0      | bicubic       | 
[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F13IvgI3QrJDixZouvvLWVkPY0J6j0VYwL\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1GafA8B6T3h_vtmNNq2HYKg)(wx5d) |\n| cait_s24_224                  | 83.45 | 96.57 | 46.8M   | 8.7G   | 224        | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1sdCxEw328yfPJArf6Zwrvok-91gh7PhS\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1BPsAMEcrjtnbOnVDQwZJYw)(m4pn) |\n| cait_xs24_384                 | 84.06 | 96.89 | 26.5M   | 15.1G  | 384        | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1zKL6cZwqmvuRMci-17FlKk-lA-W4RVte\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1w10DPJvK8EwhOCm-tZUpww)(scsv) |\n| cait_s24_384                  | 85.05 | 97.34 | 46.8M   | 26.5G  | 384        | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1klqBDhJDgw28omaOpgzInMmfeuDa7NAi\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1-aNO6c7Ipm9x1hJY6N6G2g)(dnp7) |\n| cait_s36_384                  | 85.45 | 97.48 | 68.1M   | 39.5G  | 384        | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1m-55HryznHbiUxG38J2rAa01BYcjxsRZ\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1-uWg-JHLEKeMukFFctoufg)(e3ui) |\n| cait_m36_384                  | 86.06 | 97.73 | 270.7M  | 156.2G | 384        | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1WJjaGiONX80KBHB3YN8mNeusPs3uDhR2\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1aZ9bEU5AycmmfmHAqZIaLA)(r4hu) |\n| cait_m48_448                  | 86.49 | 97.75 | 355.8M  | 287.3G | 448        | 1.0      | bicubic       | 
[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1lJSP__dVERBNFnp7im-1xM3s_lqEe82-\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F179MA3MkG2qxFle0K944Gkg)(imk5) |\n| | | | | | | | | |\n| pvtv2_b0 \t\t\t\t\t\t| 70.47\t| 90.16\t| 3.7M    | 0.6G   | 224 \t    | 0.875    | bicubic \t   | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1wkx4un6y7V87Rp_ZlD4_pV63QRst-1AE\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1mab4dOtBB-HsdzFJYrvgjA)(dxgb) |\n| pvtv2_b1 \t\t\t\t\t\t| 78.70\t| 94.49\t| 14.0M   | 2.1G   | 224 \t    | 0.875    | bicubic \t   | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F11hqLxL2MTSnKPb-gp2eMZLAzT6q2UsmG\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1Ur0s4SEOxVqggmgq6AM-sQ)(2e5m) |\n| pvtv2_b2 \t\t\t\t\t\t| 82.02\t| 95.99\t| 25.4M   | 4.0G   | 224 \t    | 0.875    | bicubic \t   | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1-KY6NbS3Y3gCaPaUam0v_Xlk1fT-N1Mz\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1FWx0QB7_8_ikrPIOlL7ung)(are2) |\n| pvtv2_b2_linear \t\t\t\t| 82.06\t| 96.04\t| 22.6M   | 3.9G   | 224 \t    | 0.875    | bicubic \t   | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1hC8wE_XanMPi0_y9apEBKzNc4acZW5Uy\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1IAhiiaJPe-Lg1Qjxp2p30w)(a4c8) |\n| pvtv2_b3 \t\t\t\t\t\t| 83.14\t| 96.47\t| 45.2M   | 6.8G   | 224 \t    | 0.875    | bicubic \t   | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F16yYV8x7aKssGYmdE-YP99GMg4NKGR5j1\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1ge0rBsCqIcpIjrVxsrFhnw)(nc21) |\n| pvtv2_b4 \t\t\t\t\t\t| 83.61\t| 96.69\t| 62.6M   | 10.0G  | 224 \t    | 0.875    | bicubic \t   | 
[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1gvPdvDeq0VchOUuriTnnGUKh0N2lj-fA\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1VMSD_Kr_hduCZ5dxmDbLoA)(tthf) |\n| pvtv2_b5 \t\t\t\t\t\t| 83.77\t| 96.61\t| 82.0M   | 11.5G  | 224 \t    | 0.875    | bicubic \t   | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1OHaHiHN_AjsGYBN2gxFcQCDhBbTvZ02g\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1ey4agxI2Nb0F6iaaX3zAbA)(9v6n) |\n| | | | | | | | | | \n| shuffle_vit_tiny  \t\t\t| 82.39 | 96.05 | 28.5M   | 4.6G   | 224        | 0.875    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1ffJ-tG_CGVXztPEPQMaT_lUoc4hxFy__\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F19DhlLIFyPGOWtyq_c83ZGQ)(8a1i) |\n| shuffle_vit_small \t\t\t| 83.53 | 96.57 | 50.1M   | 8.8G   | 224        | 0.875    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1du9H0SKr0QH9GQjhWDOXOnhpSVpfbb8X\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1rM2J8BVwxQ3kRZoHngwNZA)(xwh3) |\n| shuffle_vit_base  \t\t\t| 83.95 | 96.91 | 88.4M   | 15.5G  | 224        | 0.875    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1sYh808AyTG3-_qv6nfN6gCmyagsNAE6q\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1fks_IYDdnXdAkCFuYHW_Nw)(1gsr) |\n| | | | | | | | | |\n| t2t_vit_7      \t\t\t\t| 71.68 | 90.89 | 4.3M    | 1.0G   | 224   \t    | 0.9      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1YkuPs1ku7B_udydOf_ls1LQvpJDg_c_j\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1jVNsz37gatLCDaOoU3NaMA)(1hpa) |\n| t2t_vit_10     \t\t\t\t| 75.15 | 92.80 | 5.8M    | 1.3G   | 224   \t    | 0.9      | bicubic       | 
[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1H--55RxliMDlOCekn7FpKrHDGsUkyrJZ\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1nbdb4PFMq4nsIp8HrNxLQg)(ixug) |\n| t2t_vit_12     \t\t\t\t| 76.48 | 93.49 | 6.9M    | 1.5G   | 224   \t    | 0.9      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1stnIwOwaescaEcztaF1QjI4NK4jaqN7P\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1DcMzq9WeSwrS3epv6jKJXw)(qpbb) |\n| t2t_vit_14     \t\t\t\t| 81.50 | 95.67 | 21.5M   | 4.4G   | 224   \t    | 0.9      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1HSvN3Csgsy7SJbxJYbkzjUx9guftkfZ1\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1wcfh22uopBv7pS7rKcH_iw)(c2u8) |\n| t2t_vit_19     \t\t\t\t| 81.93 | 95.74 | 39.1M   | 7.8G   | 224   \t    | 0.9      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1eFnhaL6I33pHCQw2BaEE0Oet9CnjmUf_\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1Hpyc5hBYo1zqoXWpryegnw)(4in3) |\n| t2t_vit_24     \t\t\t\t| 82.28 | 95.89 | 64.0M   | 12.8G  | 224   \t    | 0.9      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1Z7nZCHeFp0AhIkGYcMAFkKdkGN0yXtpv\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1Hpyc5hBYo1zqoXWpryegnw)(4in3) |\n| t2t_vit_t_14   \t\t\t\t| 81.69 | 95.85 | 21.5M   | 4.4G   | 224   \t    | 0.9      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F16li4voStt_B8eWDXqJt7s20OT_Z8L263\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1Hpyc5hBYo1zqoXWpryegnw)(4in3) |\n| t2t_vit_t_19   \t\t\t\t| 82.44 | 96.08 | 39.1M   | 7.9G   | 224   \t    | 0.9      | bicubic       | 
[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1Ty-42SYOu15Nk8Uo6VRTJ7J0JV_6t7zJ\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1YdQd6l8tj5xMCWvcHWm7sg)(mier) |\n| t2t_vit_t_24   \t\t\t\t| 82.55 | 96.07 | 64.0M   | 12.9G  | 224   \t    | 0.9      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1cvvXrGr2buB8Np2WlVL7n_F1_CnI1qow\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1BMU3KX_TRmPxQ1jN5cmWhg)(6vxc) |\n| t2t_vit_14_384 \t\t\t\t| 83.34 | 96.50 | 21.5M   | 13.0G  | 384   \t    | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1Yuso8WD7Q8Lu_9I8dTvAvkcXXtPSkmnm\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1AOMhyVRF9zPqJe-lTrd7pw)(r685) |\n| | | | | | | | | |\n| cross_vit_tiny_224 \t\t\t| 73.20 | 91.90 | 6.9M    | 1.3G   | 224   \t    | 0.875    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1ILTVwQtetcb_hdRjki2ZbR26p-8j5LUp\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1byeUsM34_gFL0jVr5P5GAw)(scvb) |\n| cross_vit_small_224 \t\t\t| 81.01 | 95.33 | 26.7M   | 5.2G   | 224   \t    | 0.875    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1ViOJiwbOxTbk1V2Go7PlCbDbWPbjWPJH\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1I9CrpdPU_D5LniqIVBoIPQ)(32us) |\n| cross_vit_base_224 \t\t\t| 82.12 | 95.87 | 104.7M  | 20.2G  | 224   \t    | 0.875    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1vTorkc63O4JE9cYUMHBRxFMDOFoC-iK7\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1TR_aBHQ2n1J0RgHFoVh_bw)(jj2q) |\n| cross_vit_9_224 \t\t\t\t| 73.78 | 91.93 | 8.5M    | 1.6G   | 224   \t    | 0.875    | bicubic       | 
[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1UCX9_mJSx2kDAmEd_xDXyd4e6-Mg3RPf\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1M8r5vqMHJ-rFwBoW1uL2qQ)(mjcb) |\n| cross_vit_15_224 \t\t\t\t| 81.51 | 95.72 | 27.4M   | 5.2G   | 224   \t    | 0.875    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1HwkLWdz6A3Nz-dVbw4ZUcCkxUbPXgHwM\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1wiO_Gjk4fvSq08Ud8xKwVw)(n55b) |\n| cross_vit_18_224 \t\t\t\t| 82.29 | 96.00 | 43.1M   | 8.3G   | 224   \t    | 0.875    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1C4b_a_6ia8NCEXSUEMDdCEFzedr0RB_m\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1w7VJ7DNqq6APuY7PdlKEjA)(xese) |\n| cross_vit_9_dagger_224 \t\t| 76.92 | 93.61 | 8.7M    | 1.7G   | 224   \t    | 0.875    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1_cXQ0M8Hr9UyugZk07DrsBl8dwwCA6br\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1F1tRSaG4EfCV_WiTEwXxBw)(58ah) |\n| cross_vit_15_dagger_224 \t\t| 82.23 | 95.93 | 28.1M   | 5.6G   | 224   \t    | 0.875    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1cCgBoozh2WFtSz42LwEUUPPyC5KmkAFg\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1xJ4P2zy3r9RcNFSMtzvZgg)(qwup) |\n| cross_vit_18_dagger_224 \t\t| 82.51 | 96.03 | 44.1M   | 8.7G   | 224   \t    | 0.875    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1sdAbWxKL5k3QIo1zdgHzasIOtpy_Ogpw\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F15qYHgt0iRxdhtXoC_ct2Jg)(qtw4) |\n| cross_vit_15_dagger_384 \t\t| 83.75 | 96.75 | 28.1M   | 16.4G  | 384   \t    | 1.0      | bicubic       | 
[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F12LQjYbs9-LyrY1YeRt46x9BTB3NJuhpJ\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1d-BAm03azLP_CyEHF3c7ZQ)(w71e) |\n| cross_vit_18_dagger_384 \t\t| 84.17 | 96.82 | 44.1M   | 25.8G  | 384   \t    | 1.0 \t   | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1CeGwB6Tv0oL8QtL0d7Ar-d02Lg_PqACr\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1l_6PTldZ3IDB7XWgjM6LhA)(99b6) |\n| | | | | | | | | | \n| beit_base_patch16_224_pt22k   | 85.21 | 97.66 | 87M    | 12.7G   | 224        | 0.9      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1lq5NeQRDHkIQi7U61OidaLhNsXTWfh_Z\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1pjblqaESqfXVrpgo58oR6Q)(fshn) |\n| beit_base_patch16_384_pt22k   | 86.81 | 98.14 | 87M    | 37.3G   | 384        | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1wn2NS7kUdlERkzWEDeyZKmcRbmWL7TR2\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1WVbNjxuIUh514pKAgZZEzg)(arvc) |\n| beit_large_patch16_224_pt22k  | 87.48 | 98.30 | 304M   | 45.0G   | 224        | 0.9      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F11OR1FKxzfafqT7GzTW225nIQjxmGSbCm\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1bvhERVXN2TyRcRJFzg7sIA)(2ya2) |\n| beit_large_patch16_384_pt22k  | 88.40 | 98.60 | 304M   | 131.7G  | 384        | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F10EraafYS8CRpEshxClOmE2S1eFCULF1Y\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1H76G2CGLY3YmmYt4-suoRA)(qtrn) |\n| beit_large_patch16_512_pt22k  | 88.60 | 98.66 | 304M   | 234.0G  | 512        | 1.0      | bicubic       | 
[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1xIIocftsB1PcDHZttPqLdrJ-G4Tyfrs-\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1WtTVK_Wvg-izaF0M6Gzw-Q)(567v) |\n| | | | | | | | | | \n| Focal-T    \t\t\t\t\t| 82.03 | 95.86 | 28.9M   | 4.9G    | 224        | 0.875    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1HzZJbYH_eIo94h0wLUhqTyJ6AYthNKRh\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1JCr2qIA-SZvTqbTO-m2OwA)(i8c2) |\n| Focal-T (use conv)   \t\t\t| 82.70 | 96.14 | 30.8M   | 4.9G    | 224        | 0.875    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1PS0-gdXHGl95LqH5k5DG62AH6D3i7v0D\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1tVztox4bVJuJEjkD1fLaHQ)(smrk) |\n| Focal-S    \t\t\t\t\t| 83.55 | 96.29 | 51.1M   | 9.4G    | 224        | 0.875    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1HnVAYsI_hmiomyS4Ax3ccPE7gk4mlTU8\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1b7uugAY9RhrgTkUwYcvvow)(dwd8) |\n| Focal-S (use conv)   \t\t\t| 83.85 | 96.47 | 53.1M   | 9.4G    | 224        | 0.875    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1vcHjYiGNMayoSTPoM8z39XRH6h89TB9V\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F174a2aZzCEt3teLuAnIzMtA)(nr7n) |\n| Focal-B    \t\t\t\t\t| 83.98 | 96.48 | 89.8M   | 16.4G   | 224        | 0.875    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1bNMegxetWpwZNcmDEC3MHCal6SNXSgWR\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1piBslNhxWR78aQJIdoZjEw)(8akn) |\n| Focal-B (use conv)   \t\t\t| 84.18 | 96.61 | 93.3M   | 16.4G   | 224        | 0.875    | bicubic       | 
[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1-J2gDnKrvZGtasvsAYozrbMXR2LtIJ43\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1GTLfnTlt6I6drPdfSWB1Iw)(5nfi) |\n| | | | | | | | | | \n| mobilevit_xxs   \t\t\t\t| 70.31| 89.68 | 1.32M   | 0.44G   | 256        | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1l3L-_TxS3QisRUIb8ohcv318vrnrHnWA\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1KFZ5G834_-XXN33W67k8eg)(axpc) |\n| mobilevit_xs   \t\t\t\t| 74.47| 92.02 | 2.33M   | 0.95G   | 256        | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1oRMA4pNs2Ba0LYDbPufC842tO4OFcgwq\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1IP8S-S6ZAkiL0OEsiBWNkw)(hfhm) |\n| mobilevit_s   \t\t\t\t| 76.74| 93.08 | 5.59M   | 1.88G   | 256        | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1ibkhsswGYWvZwIRjwfgNA4-Oo2stKi0m\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1-rI6hiCHZaI7os2siFASNg)(34bg) |\n| mobilevit_s $\\dag$  \t\t\t| 77.83| 93.83 | 5.59M   | 1.88G   | 256        | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1BztBJ5jzmqgDWfQk-FB_ywDWqyZYu2yG\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F19YepMAO-sveBOLA4aSjIEQ?pwd=92ic)(92ic) |\n| | | | | | | | | | \n| vip_s7  \t\t\t\t\t\t| 81.50 | 95.76 | 25.1M   | 7.0G   |    224     | 0.875    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F16bZkqzbnN08_o15k3MzbegK8SBwfQAHF\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1uY0FsNPYaM8cr3ZCdAoVkQ)(mh9b) |\n| vip_m7  \t\t\t\t\t\t| 82.75 | 96.05 | 55.3M   | 16.4G  |    224     | 0.875    | bicubic       | 
[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F11lvT2OXW0CVGPZdF9dNjY_uaEIMYrmNu\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1j3V0Q40iSqOY15bTKlFFRw)(hvm8) |\n| vip_l7  \t\t\t\t\t\t| 83.18 | 96.37 | 87.8M   | 24.5G  |    224     | 0.875    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1bK08JorLPMjYUep_TnFPKGs0e1j0UBKJ\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1I5hnv3wHWEaG3vpDqaNL-w)(tjvh) |\n| | | | | | | | | | \n| xcit_nano_12_p16_224_dist   | 72.32  | 90.86  | 3.1M    | 0.6G      | 224        | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F14FsYtm48JB-rQFF9CanJsJaPESniWD7q\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F15kdY4vzwU2QiBSU5127AYA)(7qvz)     |\n| xcit_nano_12_p16_384_dist   | 75.46  | 92.70  | 3.1M    | 1.6G      | 384        | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1zR-hFQryocF9muG-erzcxFuJme5y_e9f\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1449qtQzEMg6lqdtClyiCRQ)(1y2j)     |\n| xcit_large_24_p16_224_dist  | 84.92  | 97.13  | 189.1M  | 35.9G     | 224        | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1lAtko_KwOagjwaFvUkeXirVClXCV8gt-\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1Gs401mXqG1bifi1hBdXtig)(kfv8)     |\n| xcit_large_24_p16_384_dist  | 85.76  | 97.54  | 189.1M  | 105.5G    | 384        | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F15djnKz_-eooncvyZp_UTwOiHIm1Hxo_G\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F14583hbtIVbZ_2ifZepQItQ)(ffq3)     |\n| xcit_nano_12_p8_224_dist    | 76.33  | 93.10  | 3.0M    | 2.2G      | 224        | 1.0      | bicubic       | 
[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1XxRNjskLvSVp6lvhlsnylq6g7vd_5MsI\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1DZJxuahFJyz-rEEsCqhhrA)(jjs7)     |\n| xcit_nano_12_p8_384_dist    | 77.82  | 94.04  | 3.0M    | 6.3G      | 384        | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1P3ln8JqLzMKbJAhCanRbu7i5NMPVFNec\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1ECY9-PVDMNSup8NMQiqBrw)(dmc1)     |\n| xcit_large_24_p8_224_dist   | 85.40  | 97.40  | 188.9M  | 141.4G    | 224        | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F14ZoDxEez5NKVNAsbgjTPisjOQEAA30Wy\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1D_zyvjzIVFp6iqx1s7IEbA)(y7gw)     |\n| xcit_large_24_p8_384_dist   | 85.99  | 97.69  | 188.9M  | 415.5G    | 384        | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1stcUwwFNJ38mdaFsNXq24CBMmDenJ_e4\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1lwbBk7GFuqnnP_iU2OuDRw)(9xww)     |\n| | | | | | | | | |\n| pit_ti \t     | 72.91\t| 91.40\t| 4.8M    | 0.5G   | 224        | 0.9      | bicubic       |[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1bbeqzlR_CFB8CAyTUN52p2q6ii8rt0AW\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1Yrq5Q16MolPYHQsT_9P1mw)(ydmi)  |\n| pit_ti_distill | 74.54\t| 92.10 | 5.1M    | 0.5G   | 224        | 0.9      | bicubic       |[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1m4L0OVI0sYh8vCv37WhqCumRSHJaizqX\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1RIM9NGq6pwfNN7GJ5WZg2w)(7k4s)  |\n| pit_xs \t     | 78.18    | 94.16 | 10.5M   | 1.1G   | 224        | 0.9      | bicubic       
|[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1qoMQ-pmqLRQmvAwZurIbpvgMK8MOEgqJ\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F15d7ep05vI2UoKvL09Zf_wg)(gytu)  |\n| pit_xs_distill | 79.31 \t| 94.36 | 10.9M   | 1.1G   | 224        | 0.9      | bicubic       |[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1EfHOIiTJOR-nRWE5AsnJMsPCncPHEgl8\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1DqlgVF7U5qHfGD3QJAad4A)(ie7s)  |\n| pit_s  \t\t | 81.08 \t| 95.33 | 23.4M   | 2.4G   | 224        | 0.9      | bicubic       |[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1TDSybTrwQpcFf9PgCIhGX1t-f_oak66W\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1Vk-W1INskQq7J5Qs4yphCg)(kt1n)  |\n| pit_s_distill  | 81.99 \t| 95.79 | 24.0M   | 2.5G   | 224        | 0.9      | bicubic       |[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1U3VPP6We1vIaX-M3sZuHmFhCQBI9g_dL\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1L7rdWmMW8tiGkduqmak9Fw)(hhyc)  |\n| pit_b   \t\t | 82.44 \t| 95.71 | 73.5M\t  | 10.6G  | 224        | 0.9      | bicubic       |[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1-NBZ9-83nZ52jQ4DNZAIj8Xv6oh54nx-\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1XRDPY4OxFlDfl8RMQ56rEg)(uh2v)  |\n| pit_b_distill  | 84.14 \t| 96.86 | 74.5M   | 10.7G  | 224        | 0.9      | bicubic       |[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F12Yi4eWDQxArhgQb96RXkNWjRoCsDyNo9\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1vJOUGXPtvC0abg-jnS4Krw)(3e6g)  |\n| | | | | | | | | |\n| halonet26t \t | 79.10\t| 94.31\t| 12.5M    | 3.2G   | 256        | 0.95     | bicubic       
|[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1F_a1brftXXnPM39c30NYe32La9YZQ0mW\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1FSlSTuYMpwPJpi4Yz2nCTA)(ednv)  |\n| halonet50ts \t | 81.65\t| 95.61\t| 22.8M    | 5.1G   | 256        | 0.94     | bicubic       |[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F12t85kJcPA377XePw6smch--ELMBo6p0Y\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1X4LM-sqoTKG7CrM5BNjcdA)(3j9e)  |\n| | | | | | | | | |\n| poolformer_s12 | 77.24 | 93.51 | 11.9M   | 1.8G   | 224        | 0.9      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F15EBfTTU6coLCsDNiLgAWYiWeMpp3uYH4\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1n6TUxQGlssTu4lyLrBOXEw)(zcv4)             |\n| poolformer_s24 | 80.33 | 95.05 | 21.3M   | 3.4G   | 224        | 0.9      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1JxqJluDpp1wwe7XtpTi1aWaVvlq0Q3xF\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1d2uyHB5R6ZWPzXWhdtm6fw)(nedr)             |\n| poolformer_s36 | 81.43 | 95.45 | 30.8M   | 5.0G   | 224        | 0.9      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1ka3VeupDRFBSzzrcw4wHXKGqoKv6sB_Y\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1de6ZJkmYEmVI7zKUCMB_xw)(fvpm)             |\n| poolformer_m36 | 82.11 | 95.69 | 56.1M   | 8.9G   | 224        | 0.95     | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1LTZ8wNRb_GSrJ9H3qt5-iGiGlwa4dGAK\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1qNTYLw4vyuoH1EKDXEcSvw)(whfp)             |\n| poolformer_m48 | 82.46 | 95.96 | 73.4M   | 11.8G  | 224        | 0.95     | bicubic       | 
[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1YhXEVjWtI4bZB_Qwama8G4RBanq2K15L\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1VJXANTseTUEA0E6HYf-XyA)(374f)             |\n| | | | | | | | | |\n| botnet50 \t | 77.38\t| 93.56\t| 20.9M    | 5.3G   | 224        | 0.875     | bicubic       |[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1S4nxgRkElT3K4lMx2JclPevmP3YUHNLw\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1CW40ShBJQYeFgdBIZZLSjg)(wh13)\n| | | | | | | | | |\n| CvT-13-224      | 81.59 | 95.67 | 20M    | 4.5G    | 224        | 0.875      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1r0fnHn1bRPmN0mi8RwAPXmD4utDyOxEf\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F13xNwCGpdJ5MVUi369OGl5Q)(vev9) |\n| CvT-21-224      | 82.46 | 96.00 | 32M    | 7.1G    | 224        | 0.875      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F18s7nRfvcmNdbRuEpTQe02AQE3Y9UWVQC\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1mOjbMNoQb7X3VJD3LV0Hhg)(t2rv) |\n| CvT-13-384   \t  | 83.00 | 96.36 | 20M    | 16.3G   | 384        | 1.0        | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1J0YYPUsiXSqyExBPtOPrOLL9c16syllg\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1upITRr5lNHLjbBJtIr-jdg)(wswt) |\n| CvT-21-384   \t  | 83.27 | 96.16 | 32M    | 24.9G   | 384        | 1.0        | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1tpXv_yYXtvyArlYi7AFcHUOqemhyMWHW\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1hXKi3Kb7mNxPFVmR6cdkMg)(hcem) |\n| CvT-13-384-22k  | 83.26 | 97.09 | 20M    | 16.3G   | 384        | 1.0        | bicubic       | 
[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F18djrvq422u1pGLPxNfWAp6d17F7C5lbP\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1YYv5rKPmroxKCnzkesUr0g)(c7m9) |\n| CvT-21-384-22k  | 84.91 | 97.62 | 32M    | 24.9G   | 384        | 1.0        | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1NVXd7vxVoRpL-21GN7nGn0-Ut0L0Owp8\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1N3xNU6XFHb1CdEOrnjKuoA)(9jxe) |\n| CvT-w24-384-22k | 87.58 | 98.47 | 277M   | 193.2G  | 384        | 1.0        | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1M3bg46N4SGtupK8FcvAOE0jltOwP5yja\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1MNJurm8juHRGG9SAw3IOkg)(bbj2) |\n| | | | | | | | | |\n| HVT-Ti-1       | 69.45 | 89.28 | 5.7M    | 0.6G   | 224        |  0.875   |  bicubic      |  [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F11BW-qLBMu_1TDAavlrAbfVlXB53dgm42\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F16rZvJqL-UVuWFsCDuxFDqg?pwd=egds)(egds) |\n| HVT-S-0        | 80.30 | 95.15 | 22.0M   | 4.6G   | 224        |  0.875   |  bicubic      |  [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1GlJ2j2QVFye1tAQoUJlgKTR_KELq3mSa\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1L-tjDxkQx00jg7BsDClabA?pwd=hj7a)(hj7a) |\n| HVT-S-1        | 78.06 | 93.84 | 22.1M   | 2.4G   | 224        |  0.875   |  bicubic      |  [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F16H33zNIpNrHBP1YhCq4zmLjRYQJ0XEmX\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1quOsgVuxTcauISQ3SehysQ?pwd=tva8)(tva8) |\n| HVT-S-2        | 77.41 | 93.48 | 22.1M   | 1.9G   | 224        |  0.875   |  bicubic      |  
[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1U14LA7SXJtFep_SdUCjAV-cDOQ9A_OFk\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1nooWTBzaXyBtEgadn9VDmw?pwd=bajp)(bajp) |\n| HVT-S-3        | 76.30 | 92.88 | 22.1M   | 1.6G   | 224        |  0.875   |  bicubic      |  [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1m1CjOcZfPMLDRyX4QBgMhHV1m6rtu44v\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F15sAOmQN6Hx0GLelYDuMQXw?pwd=rjch)(rjch) |\n| HVT-S-4        | 75.21 | 92.34 | 22.1M   | 1.6G   | 224        |  0.875   |  bicubic      |  [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F14comGo9lO12dUeGGL52MuIJWZPSit7I0\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1o31hMRWR7FTCjUk7_fAOgA?pwd=ki4j)(ki4j) |\n| | | | | | | | | |\n| | | | | | | | | |\n| mlp_mixer_b16_224            \t| 76.60 | 92.23 | 60.0M   | 12.7G  | 224        | 0.875    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1ZcQEH92sEPvYuDc6eYZgssK5UjYomzUD\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F12nZaWGMOXwrCMOIBfUuUMA)(xh8x) |\n| mlp_mixer_l16_224           \t| 72.06 | 87.67 | 208.2M  | 44.9G  | 224        | 0.875    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1mkmvqo5K7JuvqGm92a-AdycXIcsv1rdg\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1AmSVpwCaGR9Vjsj_boL7GA)(8q7r) |\n| | | | | | | | | |\n| resmlp_24_224                \t| 79.38 | 94.55 | 30.0M   | 6.0G   | 224        | 0.875    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F15A5q1XSXBz-y1AcXhy_XaDymLLj2s2Tn\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1nLAvyG53REdwYNCLmp4yBA)(jdcx) |\n| resmlp_36_224             \t| 79.77 | 94.89 | 44.7M   | 9.0G   | 224        | 0.875    | bicubic       | 
[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1WrhVm-7EKnLmPU18Xm0C7uIqrg-RwqZL\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1QD4EWmM9b2u1r8LsnV6rUA)(33w3) |\n| resmlp_big_24_224         \t| 81.04 | 95.02 | 129.1M  | 100.7G | 224        | 0.875    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1KLlFuzYb17tC5Mmue3dfyr2L_q4xHTZi\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1oXU6CR0z7O0XNwu_UdZv_w)(r9kb) |\n| resmlp_12_distilled_224 \t\t| 77.95 | 93.56 | 15.3M   |\t3.0G   | 224        | 0.875    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1cDMpAtCB0pPv6F-VUwvgwAaYtmP8IfRw\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F15kJeZ_V1MMjTX9f1DBCgnw)(ghyp) |\n| resmlp_24_distilled_224 \t\t| 80.76 | 95.22 | 30.0M   |\t6.0G   | 224        | 0.875    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F15d892ExqR1sIAjEn-cWGlljX54C3vihA\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1NgQtSwuAwsVVOB8U6N4Aqw)(sxnx) |\n| resmlp_36_distilled_224 \t\t| 81.15 | 95.48 | 44.7M\t  | 9.0G   | 224        | 0.875    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1Laqz1oDg-kPh6eb6bekQqnE0m-JXeiep\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1p1xGOJbMzH_RWEj36ruQiw)(vt85) |\n| resmlp_big_24_distilled_224 \t| 83.59 | 96.65 | 129.1M  |\t100.7G | 224        | 0.875    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F199q0MN_BlQh9-HbB28RdxHj1ApMTHow-\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1yUrfbqW8vLODDiRV5WWkhQ)(4jk5) |\n| resmlp_big_24_22k_224   \t\t| 84.40 | 97.11 | 129.1M  | 100.7G | 224        | 0.875    | bicubic       | 
[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1zATKq1ruAI_kX49iqJOl-qomjm9il1LC\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1VrnRMbzzZBmLiR45YwICmA)(ve7i) |\n| | | | | | | | | |\n| gmlp_s16_224                 \t| 79.64 | 94.63 | 19.4M   | 4.5G   | 224        | 0.875    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1TLypFly7aW0oXzEHfeDSz2Va4RHPRqe5\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F13UUz1eGIKyqyhtwedKLUMA)(bcth) |\n| | | | | | | | | |\n| ff_only_tiny (linear_tiny) \t| 61.28 | 84.06 |         |        | 224   \t    | 0.875    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F14bPRCwuY_nT852fBZxb9wzXzbPWNfbCG\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1nNE4Hh1Nrzl7FEiyaZutDA)(mjgd) |\n| ff_only_base (linear_base) \t| 74.82 | 91.71 |         |        | 224   \t    | 0.875    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1DHUg4oCi41ELazPCvYxCFeShPXE4wU3p\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1l-h6Cq4B8kZRvHKDTzhhUg)(m1jc) |\n| | | | | | | | | |\n| repmlp_res50_light_224 \t\t| 77.01 | 93.46 | 87.1M   | 3.3G   | 224   \t    | 0.875    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F16bCFa-nc_-tPVol-UCczrrDO_bCFf2uM\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1bzmpS6qJJTsOq3SQE7IOyg)(b4fg) |\n| | | | | | | | | |\n| cyclemlp_b1 \t\t\t\t\t | 78.85 | 94.60 | 15.1M   |    | 224   \t    | 0.9    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F10WQenRy9lfOJF4xEHc9Mekp4zHRh0mJ_\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F11UQp1RkWBsZFOqit_uU80w)(mnbr) |\n| cyclemlp_b2 \t\t\t\t\t | 81.58 | 95.81 | 26.8M   |    | 224   \t    | 0.9    | bicubic       | 
[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1dtQHCwtxNh9jgiHivN5iYpHe7uKRUjhk\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1Js-Oq5vyiB7oPagn43cn3Q)(jwj9) |\n| cyclemlp_b3 \t\t\t\t\t | 82.42 | 96.07 | 38.3M   |    | 224   \t    | 0.9    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F11kMq112tAwVE5llJIepIIixz74AjaJhU\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1b7cau1yPxqATA8X7t2DXkw)(v2fy) |\n| cyclemlp_b4 \t\t\t\t\t | 82.96 | 96.33 | 51.8M   |    | 224   \t    | 0.875  | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1vwJ0eD9Ic-NvLvCz1zEAmn7RxBMtd_v2\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1P3TlnXRFGWj9nVP5xBGGWQ)(fnqd) |\n| cyclemlp_b5 \t\t\t\t\t | 83.25 | 96.44 | 75.7M   |    | 224   \t    | 0.875  | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F12_I4cfOBfp7kC0RvmnMXFqrSxww6plRW\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1-Cka1tNqGUQutkAP3VZXzQ)(s55c) |\n| | | | | | | | | |\n| convmixer_1024_20  \t\t\t| 76.94 | 93.35 | 24.5M   | 9.5G   |    224     | 0.96     | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1R7zUSl6_6NFFdNOe8tTfoR9VYQtGfD7F\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1DgGA3qYu4deH4woAkvjaBw)(qpn9) |\n| convmixer_768_32  \t\t\t| 80.16 | 95.08 | 21.2M   | 20.8G  |    224     | 0.96     | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F196Lg_Eet-hRj733BYASj22g51wdyaW2a\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F17CbRNzY2Sy_Cu7cxNAkWmQ)(m5s5) |\n| convmixer_1536_20  \t\t\t| 81.37 | 95.62 | 51.8M   | 72.4G  |    224     | 0.96     | bicubic       | 
[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1-LlAlADiu0SXDQmE34GN2GBhqI-RYRqO\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1R-gSzhzQNfkuZVxsaE4vEw)(xqty) |\n| | | | | | | | | |\n| convmlp_s\t\t\t  \t\t\t| 76.76 | 93.40 | 9.0M    | 2.4G   |    224     | 0.875    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1D8kWVfQxOyyktqDixaZoGXB3wVspzjlc\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1WseHYALFB4Of3Dajmlt45g)(3jz3) |\n| convmlp_m\t\t\t  \t\t\t| 79.03 | 94.53 | 17.4M   | 4.0G   |    224     | 0.875    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1TqVlKHq-WRdT9KDoUpW3vNJTIRZvix_m\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1koipCAffG6REUyLYk0rGAQ)(vyp1) |\n| convmlp_l\t\t\t  \t\t\t| 80.15 | 95.00 | 42.7M   | 10.0G  |    224     | 0.875    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1KXxYogDh6lD3QGRtFBoX5agfz81RDN3l\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1f1aEeVoySzImI89gkjcaOA)(ne5x) |\n| | | | | | | | | |\n| topformer_tiny | 65.98 | 87.32 | 1.5M   | 0.13G   | 224        | 0.875      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1CXhp26GYA-yIUvf1PEEVuhKUt2Qq_EoS\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F11kLXIEHchXm2PGcOOuxXng?pwd=gvdb)             |\n| topformer_small| 72.44 | 91.17 | 3.1M   | 0.24G   | 224        | 0.875      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1-eJLnHhwpy_6kLKOG-pAvSfKdHePurUz\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1nlw_r55SwfK8ERnHs9kZwg?pwd=b69w)             |\n| topformer_base | 75.25 | 92.67 | 5.1M   | 0.37G   | 224        | 0.875      | bicubic       | 
[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1jC_NVpaTRqFJ4ACnv_TTs9kH1yPvHE4H\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1ep2YEQ1ZwgXFb0V6RrQq5Q?pwd=v9xm)             |\n| | | | | | | | | |\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n### Object Detection ###\n| Model | backbone  | box_mAP | Model                                                                                                                                                       |\n|-------|-----------|---------|-------------------------------------------------------------------------------------------------------------------------------------------------------------|\n| DETR  | ResNet50  | 42.0    | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1ruIKCqfh_MMqzq_F4L2Bv-femDMjS_ix\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1J6lB1mezd6_eVW3jnmohZA)(n5gk) |\n| DETR  | ResNet101 | 43.5    | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F11HCyDJKZLX33_fRGp4bCg1I14vrIKYW5\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1_msuuAwFMNbAlMpgUq89Og)(bxz2) |\n| Mask R-CNN | Swin-T 1x |  43.7   | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1OpbCH5HuIlxwakNz4PzrAlJF3CxkLSYp\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F18HALSo2RHMBsX-Gbsi-YOw)(qev7) |\n| Mask R-CNN | Swin-T 3x |  46.0   | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1oREwIk1ORhSsJcs4Y-Cfd0XrSEfPFP3-\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1tw607oogDWQ7Iz91ItfuGQ)(m8fg) |\n| Mask R-CNN | Swin-S 3x |  48.4   | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1ZPWkz0zMzHJycHd6_s2hWDHIsW8SdZcK\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1ubC5_CKSq0ExQSINohukVg)(hdw5) |\n| Mask R-CNN | pvtv2_b0 \t\t|  38.3   | 
[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1wA324LkFtGezHJovSZ4luVqSxVt9woFc\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1q67ZIDSHn9Y-HU_WoQr8OQ)(3kqb) |\n| Mask R-CNN | pvtv2_b1 \t\t|  41.8   | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1alNaSmR4TSXsPpGoUZr2QQf5phYQjIzN\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1aSkuDiNpxdnFWE1Wn1SWNw)(k5aq) |\n| Mask R-CNN | pvtv2_b2 \t\t|  45.2   | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1tg6B5OEV4OWLsDxTCjsWgxgaSgIh4cID\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1DLwxCZVZizb5HKih7RFw2w)(jh8b) |\n| Mask R-CNN | pvtv2_b2_linear \t|  44.1   | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1b26vxK3QVGx5ovqKir77NyY6YPgAWAEj\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F16T-Nyo_Jm2yDq4aoXpdnbg)(8ipt) |\n| Mask R-CNN | pvtv2_b3 \t\t|  46.9   | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1H6ZUCixCaYe1AvlBkuqYoxzz4b-icJ3u\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F16QVsjUOXijo5d9cO3FZ39A)(je4y) |\n| Mask R-CNN | pvtv2_b4 \t\t|  47.5   | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1pXQNpn0BoKqiuVaGtJL18eWG6XmdlBOL\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1yhX7mpmb2wbRvWZFnUloBQ)(n3ay) |\n| Mask R-CNN | pvtv2_b5 \t\t|  47.4   | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F12vOyw6pUfK1NdOWBF758aAZuaf-rZLvx\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1-gasQk9PqLMkrWXw4aX41g)(jzq1) |\n\n### Semantic Segmentation ###\n#### Pascal Context ####\n|Model      | Backbone  | Batch_size | mIoU (ss) | mIoU (ms+flip) | Backbone_checkpoint | Model_checkpoint      |     ConfigFile  
|\n|-----------|-----------|------------|-----------|----------------|-----------------------------------------------|-----------------------------------------------------------------------|------------|\n|SETR_Naive | ViT_large |     16     |   52.06   |      52.57        | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1TPgh7Po6ayYb1DksJeZp60LGnNyznr-r\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F18WSi8Jp3tCZgv_Vr3V1i7A)(owoj)     | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1AUyBLeoAcMH0P_QGer8tdeU44muTUOCA\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F11XgmgYG071n_9fSGUcPpDQ)(xdb8)   | [config](semantic_segmentation\u002Fconfigs\u002Fsetr\u002FSETR_Naive_Large_480x480_80k_pascal_context_bs_16.yaml) | \n|SETR_PUP   | ViT_large |     16     |   53.90   |       54.53    | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1TPgh7Po6ayYb1DksJeZp60LGnNyznr-r\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F18WSi8Jp3tCZgv_Vr3V1i7A)(owoj)     | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1IY-yBIrDPg5CigQ18-X2AX6Oq3rvWeXL\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1v6ll68fDNCuXUIJT2Cxo-A)(6sji) | [config](semantic_segmentation\u002Fconfigs\u002Fsetr\u002FSETR_PUP_Large_480x480_80k_pascal_context_bs_16.yaml) |\n|SETR_MLA   | ViT_Large |     8      |   54.39   |       55.16       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1TPgh7Po6ayYb1DksJeZp60LGnNyznr-r\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F18WSi8Jp3tCZgv_Vr3V1i7A)(owoj)     | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1utU2h0TrtuGzRX5RMGroudiDcz0z6UmV\u002Fview)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1Eg0eyUQXc-Mg5fg0T3RADA)(wora)| 
[config](semantic_segmentation\u002Fconfigs\u002Fsetr\u002FSETR_MLA_Large_480x480_80k_pascal_context_bs_8.yaml) |\n|SETR_MLA   | ViT_large |     16     |   55.01   |       55.87        | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1TPgh7Po6ayYb1DksJeZp60LGnNyznr-r\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F18WSi8Jp3tCZgv_Vr3V1i7A)(owoj)     | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1SOXB7sAyysNhI8szaBqtF8ZoxSaPNvtl\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1jskpqYbazKY1CKK3iVxAYA)(76h2) | [config](semantic_segmentation\u002Fconfigs\u002Fsetr\u002FSETR_MLA_Large_480x480_80k_pascal_context_bs_16.yaml) |\n\n#### Cityscapes ####\n|Model      | Backbone  | Batch_size | Iteration | mIoU (ss) | mIoU (ms+flip) | Backbone_checkpoint | Model_checkpoint     |     ConfigFile  |\n|-----------|-----------|------------|-----------|-----------|----------------|-----------------------------------------------|-----------------------------------------------------------------------|------------|\n|SETR_Naive | ViT_Large |     8      |     40k   |   76.71   |       79.03        | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1TPgh7Po6ayYb1DksJeZp60LGnNyznr-r\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F18WSi8Jp3tCZgv_Vr3V1i7A)(owoj)      | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1QialLNMmvWW8oi7uAHhJZI3HSOavV4qj\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1F3IB31QVlsohqW8cRNphqw)(g7ro)  |  [config](semantic_segmentation\u002Fconfigs\u002Fsetr\u002FSETR_Naive_Large_768x768_40k_cityscapes_bs_8.yaml)| \n|SETR_Naive | ViT_Large |     8      |     80k   |   77.31   |       79.43      | 
[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1TPgh7Po6ayYb1DksJeZp60LGnNyznr-r\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F18WSi8Jp3tCZgv_Vr3V1i7A)(owoj)      | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1RJeSGoDaOP-fM4p1_5CJxS5ku_yDXXLV\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1XbHPBfaHS56HlaMJmdJf1A)(wn6q)   |  [config](semantic_segmentation\u002Fconfigs\u002Fsetr\u002FSETR_Naive_Large_768x768_80k_cityscapes_bs_8.yaml)| \n|SETR_PUP   | ViT_Large |     8      |     40k   |   77.92   |       79.63        |  [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1TPgh7Po6ayYb1DksJeZp60LGnNyznr-r\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F18WSi8Jp3tCZgv_Vr3V1i7A)(owoj)     | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F12rMFMOaOYSsWd3f1hkrqRc1ThNT8K8NG\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1H8b3valvQ2oLU9ZohZl_6Q)(zmoi)    | [config](semantic_segmentation\u002Fconfigs\u002Fsetr\u002FSETR_PUP_Large_768x768_40k_cityscapes_bs_8.yaml)| \n|SETR_PUP   | ViT_Large |     8      |     80k   |   78.81   |       80.43     |   [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1TPgh7Po6ayYb1DksJeZp60LGnNyznr-r\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F18WSi8Jp3tCZgv_Vr3V1i7A)(owoj)    | [baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1tkMhRzO0XHqKYM0lojE3_g)(f793)    | [config](semantic_segmentation\u002Fconfigs\u002Fsetr\u002FSETR_PUP_Large_768x768_80k_cityscapes_bs_8.yaml)| \n|SETR_MLA   | ViT_Large |     8      |     40k   |   76.70    |       78.96      |   [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1TPgh7Po6ayYb1DksJeZp60LGnNyznr-r\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F18WSi8Jp3tCZgv_Vr3V1i7A)(owoj)    | 
[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1sUug5cMKSo6mO7BEI4EV_w)(qaiw)    | [config](semantic_segmentation\u002Fconfigs\u002Fsetr\u002FSETR_MLA_Large_768x768_40k_cityscapes_bs_8.yaml)| \n|SETR_MLA   | ViT_Large |     8      |     80k   |  77.26     |       79.27      |   [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1TPgh7Po6ayYb1DksJeZp60LGnNyznr-r\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F18WSi8Jp3tCZgv_Vr3V1i7A)(owoj)    | [baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1IqPZ6urdQb_0pbdJW2i3ow)(6bgj)    | [config](semantic_segmentation\u002Fconfigs\u002Fsetr\u002FSETR_MLA_Large_768x768_80k_cityscapes_bs_8.yaml)| \n\n\n#### ADE20K ####\n|Model      | Backbone  | Batch_size | Iteration | mIoU (ss) | mIoU (ms+flip) | Backbone_checkpoint | Model_checkpoint     |     ConfigFile  |\n|-----------|-----------|------------|-----------|-----------|----------------|-----------------------------------------------|-----------------------------------------------------------------------|------------|\n|SETR_Naive | ViT_Large |     16      |     160k   | 47.57   |      48.12        |   [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1TPgh7Po6ayYb1DksJeZp60LGnNyznr-r\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F18WSi8Jp3tCZgv_Vr3V1i7A)(owoj)    | [baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1_AY6BMluNn71UiMNZbnKqQ)(lugq)   | [config](semantic_segmentation\u002Fconfigs\u002Fsetr\u002FSETR_Naive_Large_512x512_160k_ade20k_bs_16.yaml)| \n|SETR_PUP   | ViT_Large |     16      |     160k   |  49.12   |      49.51        |   [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1TPgh7Po6ayYb1DksJeZp60LGnNyznr-r\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F18WSi8Jp3tCZgv_Vr3V1i7A)(owoj)    | [baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1N83rG0EZSksMGZT3njaspg)(udgs)    | 
[config](semantic_segmentation\u002Fconfigs\u002Fsetr\u002FSETR_PUP_Large_512x512_160k_ade20k_bs_16.yaml)| \n|SETR_MLA   | ViT_Large |     8      |     160k   |  47.80   |       49.34        |   [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1TPgh7Po6ayYb1DksJeZp60LGnNyznr-r\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F18WSi8Jp3tCZgv_Vr3V1i7A)(owoj)    | [baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1L83sdXWL4XT02dvH2WFzCA)(mrrv)    | [config](semantic_segmentation\u002Fconfigs\u002Fsetr\u002FSETR_MLA_Large_512x512_160k_ade20k_bs_8.yaml)| \n|DPT        | ViT_Large |     16     |     160k   |  47.21   |       -        |   [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1TPgh7Po6ayYb1DksJeZp60LGnNyznr-r\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F18WSi8Jp3tCZgv_Vr3V1i7A)(owoj)      |[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1PCSC1Kvcg291gqp6h5pDCg)(ts7h)   |  [config](semantic_segmentation\u002Fconfigs\u002Fdpt\u002FDPT_Large_480x480_160k_ade20k_bs_16.yaml)\n|Segmenter  | ViT_Tiny  |     16     |     160k   |  38.45   |       -        |   TODO      |[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1nZptBc-IY_3PFramXSlovQ)(1k97)   |  [config](semantic_segmentation\u002Fconfigs\u002Fsegmenter\u002Fsegmenter_Tiny_512x512_160k_ade20k_bs_16.yaml)\n|Segmenter  | ViT_Small |     16     |     160k   |  46.07   |       -        |   TODO      |[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1gKE-GEu7gX6dJsgtlvrmWg)(i8nv)   |  [config](semantic_segmentation\u002Fconfigs\u002Fsegmenter\u002Fsegmenter_small_512x512_160k_ade20k_bs_16.yaml)\n|Segmenter  | ViT_Base  |     16     |     160k   |  49.08   |       -        |   TODO      |[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1qb7HEtKW0kBSP6iv-r_Hjg)(hxrl)   |  [config](semantic_segmentation\u002Fconfigs\u002Fsegmenter\u002Fsegmenter_Base_512x512_160k_ade20k_bs_16.yaml) |\n|Segmenter  | 
ViT_Large  |     16     |     160k   |  51.82   |       -        |   TODO      |[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F121FOwpsYue7Z2Rg3ZlxnKg)(wdz6)   |  [config](semantic_segmentation\u002Fconfigs\u002Fsegmenter\u002Fsegmenter_Tiny_512x512_160k_ade20k_bs_16.yaml)\n|Segmenter_Linear  | DeiT_Base |     16     |     160k   |  47.34   |       -        |   TODO      |[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1Hk_zcXUIt_h5sKiAjG2Pog)(5dpv)   |  [config](semantic_segmentation\u002Fconfigs\u002Fsegmenter\u002Fsegmenter_Base_distilled_512x512_160k_ade20k_bs_16.yaml)\n|Segmenter  | DeiT_Base |     16     |     160k   |  49.27   |       -        |   TODO      |[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1-TBUuvcBKNgetSJr0CsAHA)(3kim)   |  [config](semantic_segmentation\u002Fconfigs\u002Fsegmenter\u002Fsegmenter_Base_distilled_512x512_160k_ade20k_bs_16.yaml) |\n|Segformer  | MIT-B0 |     16     |     160k   |  38.37   |       -        |   TODO      |[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1WOD9jGjQRLnwKrRYzgBong)(ges9)   |  [config](semantic_segmentation\u002Fconfigs\u002Fsegformer\u002Fsegformer_mit-b0_512x512_160k_ade20k.yaml) |\n|Segformer  | MIT-B1 |     16     |     160k   |  42.20   |       -        |   TODO      |[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1aiSBXMd8nP82XK7sSZ05gg)(t4n4)   |  [config](semantic_segmentation\u002Fconfigs\u002Fsegmenter\u002Fsegformer_mit-b1_512x512_160k_ade20k.yaml) |\n|Segformer  | MIT-B2 |     16     |     160k   |  46.38   |       -        |   TODO      |[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1wFFh-K5t46YktkfoWUOTAg)(h5ar)   |  [config](semantic_segmentation\u002Fconfigs\u002Fsegmenter\u002Fsegformer_mit-b2_512x512_160k_ade20k.yaml) |\n|Segformer  | MIT-B3 |     16     |     160k   |  48.35   |       -        |   TODO      |[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1IwBnDeLNyKgs-xjhlaB9ug)(g9n4)   |  
[config](semantic_segmentation\u002Fconfigs\u002Fsegmenter\u002Fsegformer_mit-b3_512x512_160k_ade20k.yaml) |\n|Segformer  | MIT-B4 |     16     |     160k   |  49.01   |       -        |   TODO      |[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1a25fCVlwJ-1TUh9HQfx7YA)(e4xw)   |  [config](semantic_segmentation\u002Fconfigs\u002Fsegmenter\u002Fsegformer_mit-b4_512x512_160k_ade20k.yaml) |\n|Segformer  | MIT-B5 |     16     |     160k   |  49.73   |       -        |   TODO      |[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F15kXXxKEjjtJv-BmrPnSTOw)(uczo)   |  [config](semantic_segmentation\u002Fconfigs\u002Fsegmenter\u002Fsegformer_mit-b5_512x512_160k_ade20k.yaml) |\n| UperNet  | Swin_Tiny |     16     |     160k   |  44.90   |       45.37     |   -      |[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1S8JR4ILw0u4I-DzU4MaeVQ)(lkhg)   |  [config](semantic_segmentation\u002Fconfigs\u002Fupernet_swin\u002Fupernet_swin_tiny_patch4_windown7_512x512_160k_ade20k.yaml) |\n| UperNet  | Swin_Small |     16     |     160k   |  47.88   |       48.90      |   -      |[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F17RKeSpuWqONVptQZ3B4kEA)(vvy1)   |  [config](semantic_segmentation\u002Fconfigs\u002Fupernet_swin\u002Fupernet_swin_small_patch4_windown7_512x512_160k_ade20k.yaml) |\n| UperNet  | Swin_Base |     16     |     160k   |   48.59   |       49.04      |   -      |[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1bM15KHNsb0oSPblQwhxbgw)(y040)   |  [config](semantic_segmentation\u002Fconfigs\u002Fupernet_swin\u002Fupernet_swin_base_patch4_windown7_512x512_160k_ade20k.yaml) |\n| UperNet  | CSwin_Tiny |     16     |     160k   |  49.46   |           |[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1ol_gykZjgAFbJ3PkqQ2j0Q)(l1cp) | [baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1gLePNLybtrax9yCQ2fcIPg)(y1eq)  |  [config](semantic_segmentation\u002Fconfigs\u002Fupernet_cswin\u002Fupernet_cswin_tiny_patch4_512x512_160k_ade20k.yaml) |\n| UperNet  | 
CSwin_Small |     16     |     160k   |  50.88   |      | [baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1mSd_JdNS4DtyVNYxqVobBw)(6vwk)   | [baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1a_vhHoib0-BcRwTnnSVGWA)(fz2e)   | [config](semantic_segmentation\u002Fconfigs\u002Fupernet_cswin\u002Fupernet_cswin_small_patch4_512x512_160k_ade20k.yaml) |\n| UperNet  | CSwin_Base |     16     |     160k   |  50.64   |      | [baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1suO0jX_Tw56CVm3UhByOWg)(0ys7)   | [baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1Ym-RUooqizgUDEm5jWyrhA)(83w3)   | [config](semantic_segmentation\u002Fconfigs\u002Fupernet_cswin\u002Fupernet_cswin_base_patch4_512x512_160k_ade20k.yaml) |\n| TopFormer  | TopFormer_Base |     16     |     160k   |  38.3   |   -   | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1jC_NVpaTRqFJ4ACnv_TTs9kH1yPvHE4H\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1ep2YEQ1ZwgXFb0V6RrQq5Q?pwd=v9xm)   | [google](https:\u002F\u002Fdrive.google.com\u002Fdrive\u002Ffolders\u002F11yCohFEuxwvDjObt05Rp9Qs85MLrUoHo?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1k6Z9NKLEaDaKm5OgbIi0oA)(ufxt)   | [config](semantic_segmentation\u002Fconfigs\u002Ftop_former\u002Ftopformer_base_512x512_160k_2x8_ade20k.yaml) |\n| TopFormer  | TopFormer_Base |     32     |     160k   |  39.2   |   -   | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1jC_NVpaTRqFJ4ACnv_TTs9kH1yPvHE4H\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1ep2YEQ1ZwgXFb0V6RrQq5Q?pwd=v9xm)   | [google](https:\u002F\u002Fdrive.google.com\u002Fdrive\u002Ffolders\u002F11yCohFEuxwvDjObt05Rp9Qs85MLrUoHo?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1k6Z9NKLEaDaKm5OgbIi0oA)(ufxt)   | [config](semantic_segmentation\u002Fconfigs\u002Ftop_former\u002Ftopformer_base_512x512_160k_4x8_ade20k.yaml) |\n| TopFormer  | TopFormer_Small |     
16     |     160k   |  36.5   |   -   | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1-eJLnHhwpy_6kLKOG-pAvSfKdHePurUz\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1nlw_r55SwfK8ERnHs9kZwg?pwd=b69w)   | [google](https:\u002F\u002Fdrive.google.com\u002Fdrive\u002Ffolders\u002F11yCohFEuxwvDjObt05Rp9Qs85MLrUoHo?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1k6Z9NKLEaDaKm5OgbIi0oA)(ufxt)   | [config](semantic_segmentation\u002Fconfigs\u002Ftop_former\u002Ftopformer_small_512x512_160k_2x8_ade20k.yaml) |\n| TopFormer  | TopFormer_Small |     32     |     160k   |  37.0   |  -    | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1-eJLnHhwpy_6kLKOG-pAvSfKdHePurUz\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1nlw_r55SwfK8ERnHs9kZwg?pwd=b69w)   | [google](https:\u002F\u002Fdrive.google.com\u002Fdrive\u002Ffolders\u002F11yCohFEuxwvDjObt05Rp9Qs85MLrUoHo?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1k6Z9NKLEaDaKm5OgbIi0oA)(ufxt)   | [config](semantic_segmentation\u002Fconfigs\u002Ftop_former\u002Ftopformer_small_512x512_160k_4x8_ade20k.yaml) |\n| TopFormer  | TopFormer_Tiny |     16     |     160k   |  33.6   |   -   | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1CXhp26GYA-yIUvf1PEEVuhKUt2Qq_EoS\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F11kLXIEHchXm2PGcOOuxXng?pwd=gvdb)   | [google](https:\u002F\u002Fdrive.google.com\u002Fdrive\u002Ffolders\u002F11yCohFEuxwvDjObt05Rp9Qs85MLrUoHo?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1k6Z9NKLEaDaKm5OgbIi0oA)(ufxt)   | [config](semantic_segmentation\u002Fconfigs\u002Ftop_former\u002Ftopformer_tiny_512x512_160k_2x8_ade20k.yaml) |\n| TopFormer  | TopFormer_Tiny |     32     |     160k   |  34.6   |   -   | 
[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1CXhp26GYA-yIUvf1PEEVuhKUt2Qq_EoS\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F11kLXIEHchXm2PGcOOuxXng?pwd=gvdb)   | [google](https:\u002F\u002Fdrive.google.com\u002Fdrive\u002Ffolders\u002F11yCohFEuxwvDjObt05Rp9Qs85MLrUoHo?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1k6Z9NKLEaDaKm5OgbIi0oA)(ufxt)   | [config](semantic_segmentation\u002Fconfigs\u002Ftop_former\u002Ftopformer_tiny_512x512_160k_4x8_ade20k.yaml) |\n| TopFormer  | TopFormer_Tiny |     16     |     160k   |  32.5   |   -   | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1CXhp26GYA-yIUvf1PEEVuhKUt2Qq_EoS\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F11kLXIEHchXm2PGcOOuxXng?pwd=gvdb)   | [google](https:\u002F\u002Fdrive.google.com\u002Fdrive\u002Ffolders\u002F11yCohFEuxwvDjObt05Rp9Qs85MLrUoHo?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1k6Z9NKLEaDaKm5OgbIi0oA)(ufxt)   | [config](semantic_segmentation\u002Fconfigs\u002Ftop_former\u002Ftopformer_tiny_448x448_160k_2x8_ade20k.yaml) |\n| TopFormer  | TopFormer_Tiny |     32     |     160k   |  33.4   |  -    | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1CXhp26GYA-yIUvf1PEEVuhKUt2Qq_EoS\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F11kLXIEHchXm2PGcOOuxXng?pwd=gvdb)   | [google](https:\u002F\u002Fdrive.google.com\u002Fdrive\u002Ffolders\u002F11yCohFEuxwvDjObt05Rp9Qs85MLrUoHo?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1k6Z9NKLEaDaKm5OgbIi0oA)(ufxt)   | [config](semantic_segmentation\u002Fconfigs\u002Ftop_former\u002Ftopformer_tiny_448x448_160k_4x8_ade20k.yaml) |\n|Trans2seg_Medium | Resnet50c |     32      |    160k    |  36.81  |      -        |   
[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1C6nMg6DgQ73wzF21UwDVxmkcRTeKngnK\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1hs0tbSGIeMLLGMq05NN--w)(4dd5)    | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1IqFCEC8PeKgtoljmUxCqI3kmfsMRcTLN\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1hF7-TrjGeHTw0zxzTvhXUA)(i2nt)   | [config](semantic_segmentation\u002Fconfigs\u002Ftrans2seg\u002FTrans2Seg_medium_512x512_16k_ade20k_bs_32.yaml)| \n\n#### Trans10kV2 ####\n|Model      | Backbone  | Batch_size | Iteration | mIoU (ss) | mIoU (ms+flip) | Backbone_checkpoint | Model_checkpoint     |     ConfigFile  |\n|-----------|-----------|------------|-----------|-----------|----------------|-----------------------------------------------|-----------------------------------------------------------------------|------------|\n|Trans2seg_Medium | Resnet50c |     16      |    16k    |  75.97  |      -        |   [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1C6nMg6DgQ73wzF21UwDVxmkcRTeKngnK\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1hs0tbSGIeMLLGMq05NN--w)(4dd5)    | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1Si03aM3m9aqGocvN9XQGbvHIsWQxxXZu\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1wdOUD6S8QGqD6S-98Yb37w)(w25r)   | [config](semantic_segmentation\u002Fconfigs\u002Ftrans2seg\u002FTrans2Seg_medium_512x512_16k_trans10kv2_bs_16.yaml)| \n\n### GAN ###\n| Model                          | FID | Image Size | Crop_pct | Interpolation | Model        |\n|--------------------------------|-----|------------|----------|---------------|--------------|\n| styleformer_cifar10            |2.73 | 32         | 1.0      | lanczos       
|[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1iW76QmwbYz6GeAPQn8vKvsG0GvFdhV4T\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1Ax7BNEr1T19vgVjXG3rW7g)(ztky)  |\n| styleformer_stl10              |15.65| 48         | 1.0      | lanczos       |[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F15p785y9eP1TeoqUcHPbwFPh98WNof7nw\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1rSORxMYAiGkLQZ4zTA2jcg)(i973)|\n| styleformer_celeba             |3.32 | 64         | 1.0      | lanczos       |[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1_YauwZN1osvINCboVk2VJMscrf-8KlQc\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F16NetcPxLQF9C_Zlp1SpkLw)(fh5s) |\n| styleformer_lsun               | 9.68 | 128        | 1.0      | lanczos       |[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1i5kNzWK04ippFSmrmcAPMItkO0OFukTd\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1jTS9ExAMz5H2lhue4NMV2A)(158t)|\n> *The results are evaluated on the CIFAR-10, STL-10, CelebA, and LSUN-church datasets using the **fid50k_full** metric.\n\n\n## Quick Demo for Image Classification\nTo use a model with pretrained weights, go to its subfolder (e.g., `\u002Fimage_classification\u002FViT\u002F`), download the `.pdparams` weight file, and update the related file paths in the following Python scripts. The model config files are located in `.\u002Fconfigs`.  
\n\nAssuming the downloaded weight file is stored in `.\u002Fvit_base_patch16_224.pdparams`, use the `vit_base_patch16_224` model in Python as follows:\n```python\nimport paddle\nfrom config import get_config\nfrom visual_transformer import build_vit as build_model\n# config files in .\u002Fconfigs\u002F\nconfig = get_config('.\u002Fconfigs\u002Fvit_base_patch16_224.yaml')\n# build model\nmodel = build_model(config)\n# load pretrained weights\nmodel_state_dict = paddle.load('.\u002Fvit_base_patch16_224.pdparams')\nmodel.set_dict(model_state_dict)\n```\n> :robot: See the README file in each model folder for detailed usage.\n\n\n### Evaluation ###\nTo evaluate ViT model performance on ImageNet2012 with a single GPU, run the following script from the command line:\n```shell\nsh run_eval.sh\n```\nor\n```shell\nCUDA_VISIBLE_DEVICES=0 \\\npython main_single_gpu.py \\\n    -cfg=.\u002Fconfigs\u002Fvit_base_patch16_224.yaml \\\n    -dataset=imagenet2012 \\\n    -batch_size=16 \\\n    -data_path=\u002Fpath\u002Fto\u002Fdataset\u002Fimagenet\u002Fval \\\n    -eval \\\n    -pretrained=\u002Fpath\u002Fto\u002Fpretrained\u002Fmodel\u002Fvit_base_patch16_224  # .pdparams is NOT needed\n```\n\n\u003Cdetails>\n\n\u003Csummary>\nRun evaluation using multiple GPUs:\n\u003C\u002Fsummary>\n\n\n```shell\nsh run_eval_multi.sh\n```\nor\n```shell\nCUDA_VISIBLE_DEVICES=0,1,2,3 \\\npython main_multi_gpu.py \\\n    -cfg=.\u002Fconfigs\u002Fvit_base_patch16_224.yaml \\\n    -dataset=imagenet2012 \\\n    -batch_size=16 \\\n    -data_path=\u002Fpath\u002Fto\u002Fdataset\u002Fimagenet\u002Fval \\\n    -eval \\\n    -pretrained=\u002Fpath\u002Fto\u002Fpretrained\u002Fmodel\u002Fvit_base_patch16_224   # .pdparams is NOT needed\n```\n\n\u003C\u002Fdetails>\n\n\n### Training ###\nTo train the ViT model on ImageNet2012 with a single GPU, run the following script from the command line:\n```shell\nsh run_train.sh\n```\nor\n```shell\nCUDA_VISIBLE_DEVICES=0 \\\npython main_single_gpu.py \\\n  
-cfg=.\u002Fconfigs\u002Fvit_base_patch16_224.yaml \\\n  -dataset=imagenet2012 \\\n  -batch_size=32 \\\n  -data_path=\u002Fpath\u002Fto\u002Fdataset\u002Fimagenet\u002Ftrain\n```\n\n\n\u003Cdetails>\n\n\u003Csummary>\nRun training using multiple GPUs:\n\u003C\u002Fsummary>\n\n\n```shell\nsh run_train_multi.sh\n```\nor\n```shell\nCUDA_VISIBLE_DEVICES=0,1,2,3 \\\npython main_multi_gpu.py \\\n    -cfg=.\u002Fconfigs\u002Fvit_base_patch16_224.yaml \\\n    -dataset=imagenet2012 \\\n    -batch_size=16 \\\n    -data_path=\u002Fpath\u002Fto\u002Fdataset\u002Fimagenet\u002Ftrain\n```\n\n\u003C\u002Fdetails>\n\n\n\n## Contributing ##\n* We encourage and appreciate your contributions to the **PaddleViT** project. Please refer to [CONTRIBUTING.md](.\u002FCONTRIBUTING.md) for our workflow and work style.\n\n\n## Licenses ##\n* This repo is under the Apache-2.0 license. \n\n## Contact ##\n* Please raise an issue on GitHub.\n","中文 | [English](.\u002FREADME.md)\n\n# PaddlePaddle Vision Transformers #\n\n[![GitHub](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Flicense\u002FBR-IDL\u002FPaddleViT?color=blue)](.\u002FLICENSE)\n[![CodeFactor](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FBR-IDL_PaddleViT_readme_9ee0cb95ac54.png)](https:\u002F\u002Fwww.codefactor.io\u002Frepository\u002Fgithub\u002Fbr-idl\u002Fpaddlevit)\n[![CLA assistant](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FBR-IDL_PaddleViT_readme_c50f92066d4a.png)](https:\u002F\u002Fcla-assistant.io\u002FBR-IDL\u002FPaddleViT)\n[![GitHub Repo stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FBR-IDL\u002FPaddleViT?style=social)](https:\u002F\u002Fgithub.com\u002FBR-IDL\u002FPaddleViT\u002Fstargazers)\n\n\n\u003Cp align=\"center\">    \n    \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FBR-IDL_PaddleViT_readme_6b1c2a28ef70.png\" width=\"100%\"\u002F>\n\u003C\u002Fp>\n \n## State-of-the-art Vision Transformer and MLP Models for PaddlePaddle ##\n\n:robot: PaddlePaddle Vision Transformers (`PaddleViT` or 
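`PPViT`) is a collection of vision models beyond convolution. Most of the models are based on Vision Transformers, visual attention mechanisms, and MLPs. PaddleViT also integrates popular layers, utilities, optimizers, learning-rate schedulers, data augmentations, and training and validation scripts for PaddlePaddle 2.1+. Our aim is to reproduce a wide variety of state-of-the-art ViT and MLP models with full training and validation procedures. We are committed to making cutting-edge computer vision techniques easier to use for every developer.\n\n:robot: PaddleViT provides models and tools for multiple vision tasks, such as image classification, object detection, semantic segmentation, and GANs. Each model architecture is defined in a standalone Python module and can be easily modified to support quick research experiments. Pretrained weights can also be downloaded and fine-tuned on your own datasets. In addition, PaddleViT integrates popular tools and modules for custom datasets, data preprocessing, performance metrics, distributed data parallel training, and more.\n\n:robot: PaddleViT is built on the popular deep learning framework [PaddlePaddle](https:\u002F\u002Fwww.paddlepaddle.org\u002F), and we also provide tutorials and projects on [Paddle AI Studio](https:\u002F\u002Faistudio.baidu.com\u002Faistudio\u002Fcourse\u002Fintroduce\u002F25102). Getting started is intuitive and straightforward for new users.\n\n## Quick Links ##\nPaddleViT implements model architectures and tools for multiple vision tasks. See the following links for details:\n- [PaddleViT-Cls](.\u002Fimage_classification): for image classification\n- [PaddleViT-Det](.\u002Fobject_detection\u002FDETR): for object detection\n- [PaddleViT-Seg](.\u002Fsemantic_segmentation): for semantic segmentation\n- [PaddleViT-GAN](.\u002Fgan): for generative adversarial networks\n- [Docs](.\u002Fdocs\u002F): tutorials and documentation\n- [docs-export](.\u002Fdocs\u002Fpaddlevit-export-en.md): export PaddleViT models to inference models for production deployment\n\nWe also provide the following tutorial:\n- [Online course](https:\u002F\u002Faistudio.baidu.com\u002Faistudio\u002Fcourse\u002Fintroduce\u002F25102): on Paddle AIStudio (in Chinese)\n\n## Features ##\n1. **State-of-the-art**\n   - State-of-the-art Transformer models for multiple computer vision tasks\n   - State-of-the-art data processing and training methods\n   - We keep pushing the technology forward.\n\n2. **Easy-to-use tools**\n   - Simple configuration of model variants\n   - Utility functions and modular design\n   - Low barrier to entry for educators and practitioners\n   - Unified framework for all models\n\n3. **Easily customizable to your own needs**\n   - Examples are provided for each model to reproduce the results\n   - Model implementations are exposed so you can customize them\n   - Model files can be used independently for quick experiments\n\n4. **High performance**\n   - Distributed data parallel training and validation (each process runs on a single GPU)\n   - Mixed-precision (AMP) training\n\n  \n\n## Model Architectures ##\n\n### Image Classification (Transformers) ###\n1. **[ViT](.\u002Fimage_classification\u002FViT)** (from Google), released with the paper *An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale* (arXiv:2010.11929), by Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, Neil Houlsby.\n2. 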
**[DeiT](.\u002Fimage_classification\u002FDeiT)** (from Facebook and Sorbonne), released with the paper *Training data-efficient image transformers & distillation through attention* (arXiv:2012.12877), by Hugo Touvron, Matthieu Cord, Matthijs Douze, Francisco Massa, Alexandre Sablayrolles, Hervé Jégou.\n3. **[Swin Transformer](.\u002Fimage_classification\u002FSwinTransformer)** (from Microsoft), released with the paper *Swin Transformer: Hierarchical Vision Transformer using Shifted Windows* (arXiv:2103.14030), by Ze Liu, Yutong Lin, Yue Cao, Han Hu, Yixuan Wei, Zheng Zhang, Stephen Lin, Baining Guo.\n4. **[VOLO](.\u002Fimage_classification\u002FVOLO)** (from Sea AI Lab and NUS), released with the paper *VOLO: Vision Outlooker for Visual Recognition* (arXiv:2106.13112), by Li Yuan, Qibin Hou, Zihang Jiang, Jiashi Feng, Shuicheng Yan.\n5. **[CSwin Transformer](.\u002Fimage_classification\u002FCSwin)** (from USTC and Microsoft), released with the paper *CSWin Transformer: A General Vision Transformer Backbone with Cross-Shaped Windows* (arXiv:2107.00652), by Xiaoyi Dong, Jianmin Bao, Dongdong Chen, Weiming Zhang, Nenghai Yu, Lu Yuan, Dong Chen, Baining Guo.\n6. **[CaiT](.\u002Fimage_classification\u002FCaiT)** (from Facebook and Sorbonne), released with the paper *Going deeper with Image Transformers* (arXiv:2103.17239), by Hugo Touvron, Matthieu Cord, Alexandre Sablayrolles, Gabriel Synnaeve, Hervé Jégou.\n7. **[PVTv2](.\u002Fimage_classification\u002FPVTv2)** (from NJU, HKU, NJUST, IIAI, and SenseTime), released with the paper *PVTv2: Improved Baselines with Pyramid Vision Transformer* (arXiv:2106.13797), by Wenhai Wang, Enze Xie, Xiang Li, Deng-Ping Fan, Kaitao Song, Ding Liang, Tong Lu, Ping Luo, Ling Shao.\n8. **[Shuffle Transformer](.\u002Fimage_classification\u002FShuffle_Transformer)** (from Tencent), released with the paper *Shuffle Transformer: Rethinking Spatial Shuffle for Vision Transformer* (arXiv:2106.03650), by Zilong Huang, Youcheng Ben, Guozhong Luo, Pei Cheng, Gang Yu, Bin Fu.\n9. **[T2T-ViT](.\u002Fimage_classification\u002FT2T_ViT)** (from NUS and YITU), released with the paper *Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet* (arXiv:2101.11986), by Li Yuan, Yunpeng Chen, Tao Wang, Weihao Yu, Yujun Shi, Zihang Jiang, Francis EH Tay, Jiashi Feng, Shuicheng Yan.\n10. **[CrossViT](.\u002Fimage_classification\u002FCrossViT)** (from IBM), released with the paper *CrossViT: Cross-Attention Multi-Scale Vision Transformer for Image Classification* (arXiv:2103.14899), by Chun-Fu Chen, Quanfu Fan, Rameswar Panda.\n11. 
**[BEiT](.\u002Fimage_classification\u002FBEiT)** (from Microsoft Research), released with the paper *BEiT: BERT Pre-Training of Image Transformers* (arXiv:2106.08254), by Hangbo Bao, Li Dong, Furu Wei.\n12. **[Focal Transformer](.\u002Fimage_classification\u002FFocal_Transformer)** (from Microsoft), released with the paper *Focal Self-attention for Local-Global Interactions in Vision Transformers* (arXiv:2107.00641), by Jianwei Yang, Chunyuan Li, Pengchuan Zhang, Xiyang Dai, Bin Xiao, Lu Yuan, and Jianfeng Gao.\n13. **[Mobile-ViT](.\u002Fimage_classification\u002FMobileViT)** (from Apple), released with the paper *MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer* (arXiv:2110.02178), by Sachin Mehta, Mohammad Rastegari.\n14. **[ViP](.\u002Fimage_classification\u002FViP)** (from NUS), released with the paper *Vision Permutator: A Permutable MLP-Like Architecture for Visual Recognition* (arXiv:2106.12368), by Qibin Hou, Zihang Jiang, Li Yuan, Ming-Ming Cheng, Shuicheng Yan, Jiashi Feng.\n15. **[XCiT](.\u002Fimage_classification\u002FXCiT)** (from Facebook, Inria, and Sorbonne), released with the paper *XCiT: Cross-Covariance Image Transformers* (arXiv:2106.09681), by Alaaeldin El-Nouby, Hugo Touvron, Mathilde Caron, Piotr Bojanowski, Matthijs Douze, Armand Joulin, Ivan Laptev, Natalia Neverova, Gabriel Synnaeve, Jakob Verbeek, Hervé Jegou.\n16. **[PiT](.\u002Fimage_classification\u002FPiT)** (from NAVER and Sogang University), released with the paper *Rethinking Spatial Dimensions of Vision Transformers* (arXiv:2103.16302), by Byeongho Heo, Sangdoo Yun, Dongyoon Han, Sanghyuk Chun, Junsuk Choe, Seong Joon Oh.\n17. **[HaloNet](.\u002Fimage_classification\u002FHaloNet)** (from Google), released with the paper *Scaling Local Self-Attention for Parameter Efficient Visual Backbones* (arXiv:2103.12731), by Ashish Vaswani, Prajit Ramachandran, Aravind Srinivas, Niki Parmar, Blake Hechtman, Jonathon Shlens.\n18. **[PoolFormer](.\u002Fimage_classification\u002FPoolFormer)** (from Sea AI Lab and NUS), released with the paper *MetaFormer is Actually What You Need for Vision* (arXiv:2111.11418), by Weihao Yu, Mi Luo, Pan Zhou, Chenyang Si, Yichen Zhou, Xinchao Wang, Jiashi Feng, Shuicheng Yan.\n19. **[BoTNet](.\u002Fimage_classification\u002FBoTNet)** (from UC Berkeley and Google), released with the paper *Bottleneck Transformers for Visual Recognition* (arXiv:2101.11605), by Aravind Srinivas, Tsung-Yi Lin, Niki Parmar, Jonathon Shlens, Pieter Abbeel, Ashish Vaswani.\n20. 
**[CvT](.\u002Fimage_classification\u002FCvT)** (from McGill and Microsoft), released with the paper *CvT: Introducing Convolutions to Vision Transformers* (arXiv:2103.15808), by Haiping Wu, Bin Xiao, Noel Codella, Mengchen Liu, Xiyang Dai, Lu Yuan, Lei Zhang.\n21. **[HvT](.\u002Fimage_classification\u002FHVT)** (from Monash University), released with the paper *Scalable Vision Transformers with Hierarchical Pooling* (arXiv:2103.10619), by Zizheng Pan, Bohan Zhuang, Jing Liu, Haoyu He, Jianfei Cai.\n22. **[TopFormer](.\u002Fimage_classification\u002FTopFormer)** (from HUST, Tencent, Fudan, and ZJU), released with the paper *TopFormer: Token Pyramid Transformer for Mobile Semantic Segmentation* (arXiv:2204.05525), by Wenqiang Zhang, Zilong Huang, Guozhong Luo, Tao Chen, Xinggang Wang, Wenyu Liu, Gang Yu, Chunhua Shen.\n23. **[ConvNeXt](.\u002Fimage_classification\u002FConvNeXt)** (from FAIR\u002FUCBerkeley), released with the paper *A ConvNet for the 2020s* (arXiv:2201.03545), by Zhuang Liu, Hanzi Mao, Chao-Yuan Wu, Christoph Feichtenhofer, Trevor Darrell, and Saining Xie.\n24. **[CoaT](.\u002Fimage_classification\u002FCoaT)** (from UCSD), released with the paper *Co-Scale Conv-Attentional Image Transformers* (arXiv:2104.06399), by Weijian Xu, Yifan Xu, Tyler Chang, and Zhuowen Tu.\n25. **[ResT](.\u002Fimage_classification\u002FResT)** (from NJU), released with the paper *ResT: An Efficient Transformer for Visual Recognition* (arXiv:2105.13677), by Qinglong Zhang and Yubin Yang.\n26. **[ResTV2](.\u002Fimage_classification\u002FResT)** (from NJU), released with the paper *ResT V2: Simpler, Faster and Stronger* (arXiv:2204.07366), by Qinglong Zhang and Yubin Yang.\n\n\n\n\n### Image Classification (MLP & others) ###\n1. **[MLP-Mixer](.\u002Fimage_classification\u002FMLP-Mixer)** (from Google), released with the paper *MLP-Mixer: An all-MLP Architecture for Vision* (arXiv:2105.01601), by Ilya Tolstikhin, Neil Houlsby, Alexander Kolesnikov, Lucas Beyer, Xiaohua Zhai, Thomas Unterthiner, Jessica Yung, Andreas Steiner, Daniel Keysers, Jakob Uszkoreit, Mario Lucic, and Alexey Dosovitskiy.\n2. 
**[ResMLP](.\u002Fimage_classification\u002FResMLP)** (from Facebook\u002FSorbonne\u002FInria\u002FValeo), released with the paper *ResMLP: Feedforward networks for image classification with data-efficient training* (arXiv:2105.03404), by Hugo Touvron, Piotr Bojanowski, Mathilde Caron, Matthieu Cord, Alaaeldin El-Nouby, Edouard Grave, Gautier Izacard, Armand Joulin, Gabriel Synnaeve, Jakob Verbeek, and Hervé Jégou.\n3. **[gMLP](.\u002Fimage_classification\u002FgMLP)** (from Google), released with the paper *Pay Attention to MLPs* (arXiv:2105.08050), by Hanxiao Liu, Zihang Dai, David R. So, and Quoc V. Le.\n4. **[FF Only](.\u002Fimage_classification\u002FFF_Only)** (from Oxford), released with the paper *Do You Even Need Attention? A Stack of Feed-Forward Layers Does Surprisingly Well on ImageNet* (arXiv:2105.02723), by Luke Melas-Kyriazi.\n5. **[RepMLP](.\u002Fimage_classification\u002FRepMLP)** (from BNRist\u002FTsinghua\u002FMEGVII\u002FAberystwyth), released with the paper *RepMLP: Re-parameterizing Convolutions into Fully-connected Layers for Image Recognition* (arXiv:2105.01883), by Xiaohan Ding, Chunlong Xia, Xiangyu Zhang, Xiaojie Chu, Jungong Han, and Guiguang Ding.\n6. **[CycleMLP](.\u002Fimage_classification\u002FCycleMLP)** (from HKU\u002FSenseTime), released with the paper *CycleMLP: A MLP-like Architecture for Dense Prediction* (arXiv:2107.10224), by Shoufa Chen, Enze Xie, Chongjian Ge, Ding Liang, and Ping Luo.\n7. **[ConvMixer](.\u002Fimage_classification\u002FConvMixer)** (from Anonymous), released with the paper *Patches Are All You Need?* (openreview.net\u002Fforum?id=TVHS5Y4dNvM), by Anonymous.\n8. **[ConvMLP](.\u002Fimage_classification\u002FConvMLP)** (from UO\u002FUIUC\u002FPAIR), released with the paper *ConvMLP: Hierarchical Convolutional MLPs for Vision* (arXiv:2109.04454), by Jiachen Li, Ali Hassani, Steven Walton, and Humphrey Shi.\n1. 
**[RepLKNet](.\u002FRepLKNet)** (from Tsinghua\u002FMEGVII\u002FAberystwyth), released with the paper *Scaling Up Your Kernels to 31x31: Revisiting Large Kernel Design in CNNs* (arXiv:2203.06717), by Xiaohan Ding, Xiangyu Zhang, Yizhuang Zhou, Jungong Han, Guiguang Ding, and Jian Sun.\n2. **[MobileOne](.\u002FMobileOne)** (from Apple), released with the paper *An Improved One millisecond Mobile Backbone* (arXiv:2206.04040), by Pavan Kumar Anasosalu Vasu, James Gabriel, Jeff Zhu, Oncel Tuzel, and Anurag Ranjan.\n\n\n\n### Object Detection ###\n1. **[DETR](.\u002Fobject_detection\u002FDETR)** (from Facebook), released with the paper *End-to-End Object Detection with Transformers* (arXiv:2005.12872), by Nicolas Carion, Francisco Massa, Gabriel Synnaeve, Nicolas Usunier, Alexander Kirillov, and Sergey Zagoruyko.\n2. **[Swin Transformer](.\u002Fobject_detection\u002FSwin)** (from Microsoft), released with the paper *Swin Transformer: Hierarchical Vision Transformer using Shifted Windows* (arXiv:2103.14030), by Ze Liu, Yutong Lin, Yue Cao, Han Hu, Yixuan Wei, Zheng Zhang, Stephen Lin, and Baining Guo.\n3. **[PVTv2](.\u002Fobject_detection\u002FPVTv2)** (from NJU\u002FHKU\u002FNJUST\u002FIIAI\u002FSenseTime), released with the paper *PVTv2: Improved Baselines with Pyramid Vision Transformer* (arXiv:2106.13797), by Wenhai Wang, Enze Xie, Xiang Li, Deng-Ping Fan, Kaitao Song, Ding Liang, Tong Lu, Ping Luo, and Ling Shao.\n\n#### Coming Soon: ####\n1. **[Focal Transformer]()** (from Microsoft), released with the paper *Focal Self-attention for Local-Global Interactions in Vision Transformers* (arXiv:2107.00641), by Jianwei Yang, Chunyuan Li, Pengchuan Zhang, Xiyang Dai, Bin Xiao, Lu Yuan, and Jianfeng Gao.\n2. **[UP-DETR]()** (from Tencent), released with the paper *UP-DETR: Unsupervised Pre-training for Object Detection with Transformers* (arXiv:2011.09094), by Zhigang Dai, Bolun Cai, Yugeng Lin, and Junying Chen.\n\n### Semantic Segmentation ###\n#### Now: ####\n1. 
**[SETR](.\u002Fsemantic_segmentation)** (from Fudan\u002FOxford\u002FSurrey\u002FTencent\u002FFacebook), released with the paper *Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers*, by Sixiao Zheng, Jiachen Lu, Hengshuang Zhao, Xiatian Zhu, Zekun Luo, Yabiao Wang, Yanwei Fu, Jianfeng Feng, Tao Xiang, Philip H.S. Torr, and Li Zhang.\n2. **[DPT](.\u002Fsemantic_segmentation)** (from Intel), released with the paper *Vision Transformers for Dense Prediction*, by René Ranftl, Alexey Bochkovskiy, and Vladlen Koltun.\n3. **[Swin Transformer](.\u002Fsemantic_segmentation)** (from Microsoft), released with the paper *Swin Transformer: Hierarchical Vision Transformer using Shifted Windows*, by Ze Liu, Yutong Lin, Yue Cao, Han Hu, Yixuan Wei, Zheng Zhang, Stephen Lin, and Baining Guo.\n4. **[Segmenter](.\u002Fsemantic_segmentation)** (from Inria), released with the paper *Segmenter: Transformer for Semantic Segmentation*, by Robin Strudel, Ricardo Garcia, Ivan Laptev, and Cordelia Schmid.\n5. **[Trans2seg](.\u002Fsemantic_segmentation)** (from HKU\u002FSenseTime\u002FNJU), released with the paper *Segmenting Transparent Object in the Wild with Transformer*, by Enze Xie, Wenjia Wang, Wenhai Wang, Peize Sun, Hang Xu, Ding Liang, and Ping Luo.\n6. **[SegFormer](.\u002Fsemantic_segmentation)** (from HKU\u002FNJU\u002FNVIDIA\u002FCaltech), released with the paper *SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers*, by Enze Xie, Wenhai Wang, Zhiding Yu, Anima Anandkumar, Jose M. Alvarez, and Ping Luo.\n7. **[CSwin Transformer]()** (from USTC and Microsoft), released with the paper *CSWin Transformer: A General Vision Transformer Backbone with Cross-Shaped Windows*.\n8. **[TopFormer](.\u002Fsemantic_segmentation)** (from HUST\u002FTencent\u002FFudan\u002FZJU), released with the paper *TopFormer: Token Pyramid Transformer for Mobile Semantic Segmentation*.\n\n#### Coming Soon: ####\n1. 
**[FTN]()** (from Baidu), released with the paper *Fully Transformer Networks for Semantic Image Segmentation*, by Sitong Wu, Tianyi Wu, Fangjian Lin, Shengwei Tian, and Guodong Guo.\n2. **[Shuffle Transformer]()** (from Tencent), released with the paper *Shuffle Transformer: Rethinking Spatial Shuffle for Vision Transformer*, by Zilong Huang, Youcheng Ben, Guozhong Luo, Pei Cheng, Gang Yu, and Bin Fu.\n3. **[Focal Transformer]()** (from Microsoft), released with the paper *Focal Self-attention for Local-Global Interactions in Vision Transformers*, by Jianwei Yang, Chunyuan Li, Pengchuan Zhang, Xiyang Dai, Bin Xiao, Lu Yuan, and Jianfeng Gao.\n\n\n### GAN ###\n1. **[TransGAN](.\u002Fgan\u002FtransGAN)** (from Seoul National University and NUAA), released with the paper *TransGAN: Two Pure Transformers Can Make One Strong GAN, and That Can Scale Up*, by Yifan Jiang, Shiyu Chang, and Zhangyang Wang.\n2. **[Styleformer](.\u002Fgan\u002FStyleformer)** (from Facebook and Sorbonne), released with the paper *Styleformer: Transformer based Generative Adversarial Networks with Style Vector*, by Jeeseung Park and Younggeun Kim.\n#### Coming Soon: ####\n1. **[ViTGAN]()** (from UCSD\u002FGoogle), released with the paper *ViTGAN: Training GANs with Vision Transformers*, by Kwonjoon Lee, Huiwen Chang, Lu Jiang, Han Zhang, Zhuowen Tu, and Ce Liu.\n\n\n\n## Installation\n### Prerequisites\n* Linux\u002FMacOS\u002FWindows\n* Python 3.6\u002F3.7\n* PaddlePaddle 2.1.0+\n* CUDA10.2+\n> Note: We recommend installing the latest version of PaddlePaddle to avoid some CUDA errors during PaddleViT training. To install PaddlePaddle, see this link (https:\u002F\u002Fwww.paddlepaddle.org.cn\u002Finstall\u002Fquick?docurl=\u002Fdocumentation\u002Fdocs\u002Fzh\u002Finstall\u002Fpip\u002Flinux-pip.html) for the stable version and this link (https:\u002F\u002Fwww.paddlepaddle.org.cn\u002Finstall\u002Fquick?docurl=\u002Fdocumentation\u002Fdocs\u002Fzh\u002Fdevelop\u002Finstall\u002Fpip\u002Flinux-pip.html#gpu) for the develop version.\n### Installation\n1. Create and activate a conda virtual environment.\n   ```shell\n   conda create -n paddlevit python=3.7 -y\n   conda activate paddlevit\n   ```\n2. 
Install PaddlePaddle following the official instructions, e.g.:\n   ```shell\n   conda install paddlepaddle-gpu==2.1.2 cudatoolkit=10.2 --channel https:\u002F\u002Fmirrors.tuna.tsinghua.edu.cn\u002Fanaconda\u002Fcloud\u002FPaddle\u002F\n   ```\n   > Note: Adjust the PaddlePaddle and CUDA versions to match your environment.\n\n3. Install the dependency packages\n    * General dependencies:\n        ```\n        pip install yacs pyyaml\n        ```\n    * Packages for segmentation:\n        ```\n        pip install cityscapesScripts\n        ```\n        Install the `detail` package:\n        ```shell\n        git clone https:\u002F\u002Fgithub.com\u002Fccvl\u002Fdetail-api\n        cd detail-api\u002FPythonAPI\n        make\n        make install\n        ```\n    * Packages for GAN:\n        ```\n        pip install lmdb\n        ```\n4. Clone the project from GitHub\n    ```\n    git clone https:\u002F\u002Fgithub.com\u002FBR-IDL\u002FPaddleViT.git \n    ```\n\n\n## Results (Model Zoo) ##\n\n### Image Classification ###\n| Model                         | Acc@1 | Acc@5 | #Params | FLOPs  | Image Size | Crop pct | Interp | Link         |\n|-------------------------------|-------|-------|---------|--------|------------|----------|---------------|--------------|\n| vit_base_patch32_224          | 80.68 | 95.61 | 88.2M   | 4.4G   | 224        | 0.875    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1DPEhEuu9sDdcmOPukQbR7ZcHq2bxx9cr\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1ppOLj5SWlJmA-NjoLCoYIw)(ubyr) |\n| vit_base_patch32_384          | 83.35 | 96.84 | 88.2M   | 12.7G  | 384        | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1nCOSwrDiFBFmTkLEThYwjL9SfyzkKoaf\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1jxnL00ocpmdiPM4fOu4lpg)(3c2f) |\n| vit_base_patch16_224          | 84.58 | 97.30 | 86.4M   | 17.0G  | 224        | 0.875    | bicubic       | 
[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F13D9FqU4ISsGxWXURgKW9eLOBV-pYPr-L\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1ms3o2fHMQpIoVqnEHitRtA)(qv4n) |\n| vit_base_patch16_384          | 85.99 | 98.00 | 86.4M   | 49.8G  | 384        | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1kWKaAgneDx0QsECxtf7EnUdUZej6vSFT\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F15ggLdiL98RPcz__SXorrXA)(wsum) |\n| vit_large_patch16_224         | 85.81 | 97.82 | 304.1M  | 59.9G  | 224        | 0.875    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1jgwtmtp_cDWEhZE-FuWhs7lCdpqhAMft\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1HRxUJAwEiKgrWnJSjHyU0A)(1bgk) |\n| vit_large_patch16_384         | 87.08 | 98.30 | 304.1M  | 175.9G | 384        | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1zfw5mdiIm-mPxxQddBFxt0xX-IR-PF2U\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1KvxfIpMeitgXAUZGr5HV8A)(5t91) |\n| vit_large_patch32_384         | 81.51 | 96.09 | 306.5M  | 44.4G  | 384        | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1Py1EX3E35jL7DComW-29Usg9788BB26j\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1W8sUs0pObOGpohP4vsT05w)(ieg3) |\n| | | | | | | | | |\n| swin_t_224   \t\t\t\t\t| 81.37 | 95.54 | 28.3M   | 4.4G   | 224        | 0.9      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1v_wzWv3TaQ0RKkKwRQwuDPzwpOb_jGEs\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1tbc751RVh3fIRsrLzrmeOw)(h2ac) |\n| swin_s_224   \t\t\t\t\t| 83.21 | 96.32 | 49.6M   | 8.6G   | 224        | 0.9      | bicubic       | 
[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1lrODzr8zIOU9sBrH2x3zolMOS4mv4o7x\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1rlXL0tjLWbWnkIt_2Ne8Jw)(ydyx) |\n| swin_b_224   \t\t\t\t\t| 83.60 | 96.46 | 87.7M   | 15.3G  | 224        | 0.9      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1hjEVODThNEDAlIqkg8C1KzUh3KsVNu6R\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1ucSHBiuiG2sHAmR1N1JENQ)(h4y6) |\n| swin_b_384   \t\t\t\t\t| 84.48 | 96.89 | 87.7M   | 45.5G  | 384        | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1szLgwhB6WJu02Me6Uyz94egk8SqKlNsd\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1t0oXbqKNwpUAMJV7VTzcNw)(7nym) |\n| swin_b_224_22kto1k    \t\t| 85.27 | 97.56 | 87.7M   | 15.3G  | 224        | 0.9      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1FhdlheMUlJzrZ7EQobpGRxd3jt3aQniU\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1KBocL_M6YNW1ZsK-GYFiNw)(6ur8) |\n| swin_b_384_22kto1k    \t\t| 86.43 | 98.07 | 87.7M   | 45.5G  | 384        | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1zVwIrJmtuBSiSVQhUeblRQzCKx-yWNCA\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1NziwdsEJtmjfGCeUFgtZXA)(9squ) |\n| swin_l_224_22kto1k    \t\t| 86.32 | 97.90 | 196.4M  | 34.3G  | 224        | 0.9      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1yo7rkxKbQ4izy2pY5oQ5QAnkyv7zKcch\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1GsUJbSkGxlGsBYsayyKjVg)(nd2f) |\n| swin_l_384_22kto1k    \t\t| 87.14 | 98.23 | 196.4M  | 100.9G | 384        | 1.0      | bicubic       | 
[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1-6DEvkb-FMz72MyKtq9vSPKYBqINxoKK\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1JLdS0aTl3I37oDzGKLFSqA)(5g5e) |\n| | | | | | | | | |\n| deit_tiny_distilled_224   \t| 74.52 | 91.90 | 5.9M    | 1.1G   | 224        | 0.875    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1fku9-11O_gQI7UpZTjagVeND-pcHbV0C\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1hAQ_85wWkqQ7sIGO1CmO9g)(rhda) |\n| deit_small_distilled_224  \t| 81.17 | 95.41 | 22.4M   | 4.3G   | 224        | 0.875    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1RIeWTdf5o6pwkjqN4NbW91GZSOCalI5t\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1wCVrukvwxISAGGjorPw3iw)(pv28) |\n| deit_base_distilled_224  \t\t| 83.32 | 96.49 | 87.2M   | 17.0G  | 224        | 0.875    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F12_x6-NN3Jde2BFUih4OM9NlTwe9-Xlkw\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1ZnmAWgT6ewe7Vl3Xw_csuA)(5f2g) |\n| deit_base_distilled_384  \t\t| 85.43 | 97.33 | 87.2M   | 49.9G  | 384        | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1i5H_zjSdHfM-Znv89DHTv9ChykWrIt8I\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1PQsQIci4VCHY7l2tCzMklg)(qgj2) |\n| | | | | | | | | |\n| volo_d1_224  \t\t\t\t\t| 84.12 | 96.78 | 26.6M   | 6.6G   | 224        | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1kNNtTh7MUWJpFSDe_7IoYTOpsZk5QSR9\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1EKlKl2oHi_24eaiES67Bgw)(xaim) |\n| volo_d1_384  \t\t\t\t\t| 85.24 | 97.21 | 26.6M   | 19.5G  | 384        | 1.0      | bicubic       | 
[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1fku9-11O_gQI7UpZTjagVeND-pcHbV0C\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1qZWoFA7J89i2aujPItEdDQ)(rr7p) |\n| volo_d2_224  \t\t\t\t\t| 85.11 | 97.19 | 58.6M   | 13.7G  | 224        | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1KjKzGpyPKq6ekmeEwttHlvOnQXqHK1we\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1JCK0iaYtiOZA6kn7e0wzUQ)(d82f) |\n| volo_d2_384  \t\t\t\t\t| 86.04 | 97.57 | 58.6M   | 40.7G  | 384        | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1uLLbvwNK8N0y6Wrq_Bo8vyBGSVhehVmq\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1e7H5aa6miGpCTCgpK0rm0w)(9cf3) |\n| volo_d3_224  \t\t\t\t\t| 85.41 | 97.26 | 86.2M   | 19.8G  | 224        | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1OtOX7C29fJ20ESKQnYGevp4euxhmXKAT\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1vhARtV2wfI6EFf0Ap71xwg)(a5a4) |\n| volo_d3_448  \t\t\t\t\t| 86.50 | 97.71 | 86.2M   | 80.3G  | 448        | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1lHlYhra1NNp0dp4NWaQ9SMNNmw-AxBNZ\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1Q6KiQw4Vu1GPm5RF9_eycg)(uudu) |\n| volo_d4_224  \t\t\t\t\t| 85.89 | 97.54 | 192.8M  | 42.9G  | 224        | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F16oXN7xuy-mkpfeD-loIVOK95PfptHhpX\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1PE83ZLd5evkKmHJ1V2KDsg)(vcf2) |\n| volo_d4_448  \t\t\t\t\t| 86.70 | 97.85 | 192.8M  | 172.5G | 448        | 1.0      | bicubic       | 
[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1N9-1OhPewA5TBR9CX5oA10obDS8e4Cfa\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1QoJ2Sqe1SK9hxbmV4uZiyg)(nd4n) |\n| volo_d5_224  \t\t\t\t\t| 86.08 | 97.58 | 295.3M  | 70.6G  | 224        | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1fcrvOGbAmKUhqJT-pU3MVJZQJIe4Qina\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1nqDcXMW00v9PKr3RQI-g1w)(ymdg) |\n| volo_d5_448  \t\t\t\t\t| 86.92 | 97.88 | 295.3M  | 283.8G | 448        | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1aFXEkpfLhmQlDQHUYCuFL8SobhxUzrZX\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1K4FBv6fnyMGcAXhyyybhgw)(qfcc) |\n| volo_d5_512  \t\t\t\t\t| 87.05 | 97.97 | 295.3M  | 371.3G | 512        | 1.15     | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1CS4-nv2c9FqOjMz7gdW5i9pguI79S6zk\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F16Wseyiqvv0MQJV8wwFDfSA)(353h) |\n| | | | | | | | | |\n| cswin_tiny_224  \t\t\t\t| 82.81 | 96.30 | 22.3M   | 4.2G   | 224        | 0.9      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1l-JY0u7NGyD6SjkyiyNnDx3wFFT1nAYO\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1L5FqU7ImWAhQHAlSilqVAw)(4q3h) |\n| cswin_small_224 \t\t\t\t| 83.60 | 96.58 | 34.6M   | 6.5G   | 224        | 0.9      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F10eEBk3wvJdQ8Dy58LvQ11Wk1K2UfPy-E\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1FiaNiWyAuWu1IBsUFLUaAw)(gt1a) |\n| cswin_base_224  \t\t\t\t| 84.23 | 96.91 | 77.4M   | 14.6G  | 224        | 0.9      | bicubic       | 
[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1YufKh3DKol4-HrF-I22uiorXSZDIXJmZ\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1koy8hXyGwvgAfUxdlkWofg)(wj8p) |\n| cswin_base_384  \t\t\t\t| 85.51 | 97.48 | 77.4M   | 43.1G  | 384        | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1qCaFItzFoTYBo-4UbGzL6M5qVDGmJt4y\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1WNkY7o_vP9KJ8cd5c7n2sQ)(rkf5) |\n| cswin_large_224 \t\t\t\t| 86.52 | 97.99 | 173.3M  | 32.5G  | 224        | 0.9      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1V1hteGK27t1nI84Ac7jdWfydBLLo7Fxt\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1KgIX6btML6kPiPGkIzvyVA)(b5fs) |\n| cswin_large_384 \t\t\t\t| 87.49 | 98.35 | 173.3M  | 96.1G  | 384        | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1LRN_6qUz71yP-OAOpN4Lscb8fkUytMic\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1eCIpegPj1HIbJccPMaAsew)(6235) |\n| | | | | | | | | |\n| cait_xxs24_224                | 78.38 | 94.32 | 11.9M   | 2.2G   | 224        | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1LKsQUr824oY4E42QeUEaFt41I8xHNseR\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1YIaBLopKIK5_p7NlgWHpGA)(j9m8) |\n| cait_xxs36_224                | 79.75 | 94.88 | 17.2M   | 33.1G  | 224        | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1zZx4aQJPJElEjN5yejUNsocPsgnd_3tS\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1pdyFreRRXUn0yPel00-62Q)(nebg) |\n| cait_xxs24_384                | 80.97 | 95.64 | 11.9M   | 6.8G   | 384        | 1.0      | bicubic       | 
[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1J27ipknh_kwqYwR0qOqE9Pj3_bTcTx95\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1uYSDzROqCVT7UdShRiiDYg)(2j95) |\n| cait_xxs36_384                | 82.20 | 96.15 | 17.2M   | 10.1G  | 384        | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F13IvgI3QrJDixZouvvLWVkPY0J6j0VYwL\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1GafA8B6T3h_vtmNNq2HYKg)(wx5d) |\n| cait_s24_224                  | 83.45 | 96.57 | 46.8M   | 8.7G   | 224        | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1sdCxEw328yfPJArf6Zwrvok-91gh7PhS\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1BPsAMEcrjtnbOnVDQwZJYw)(m4pn) |\n| cait_xs24_384                 | 84.06 | 96.89 | 26.5M   | 15.1G  | 384        | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1zKL6cZwqmvuRMci-17FlKk-lA-W4RVte\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1w10DPJvK8EwhOCm-tZUpww)(scsv) |\n| cait_s24_384                  | 85.05 | 97.34 | 46.8M   | 26.5G  | 384        | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1klqBDhJDgw28omaOpgzInMmfeuDa7NAi\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1-aNO6c7Ipm9x1hJY6N6G2g)(dnp7) |\n| cait_s36_384                  | 85.45 | 97.48 | 68.1M   | 39.5G  | 384        | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1m-55HryznHbiUxG38J2rAa01BYcjxsRZ\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1-uWg-JHLEKeMukFFctoufg)(e3ui) |\n| cait_m36_384                  | 86.06 | 97.73 | 270.7M  | 156.2G | 384        | 1.0      | bicubic       | 
[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1WJjaGiONX80KBHB3YN8mNeusPs3uDhR2\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1aZ9bEU5AycmmfmHAqZIaLA)(r4hu) |\n| cait_m48_448                  | 86.49 | 97.75 | 355.8M  | 287.3G | 448        | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1lJSP__dVERBNFnp7im-1xM3s_lqEe82-\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F179MA3MkG2qxFle0K944Gkg)(imk5) |\n| | | | | | | | | |\n| pvtv2_b0 \t\t\t\t\t\t| 70.47\t| 90.16\t| 3.7M    | 0.6G   | 224 \t    | 0.875    | bicubic \t   | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1wkx4un6y7V87Rp_ZlD4_pV63QRst-1AE\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1mab4dOtBB-HsdzFJYrvgjA)(dxgb) |\n| pvtv2_b1 \t\t\t\t\t\t| 78.70\t| 94.49\t| 14.0M   | 2.1G   | 224 \t    | 0.875    | bicubic \t   | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F11hqLxL2MTSnKPb-gp2eMZLAzT6q2UsmG\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1Ur0s4SEOxVqggmgq6AM-sQ)(2e5m) |\n| pvtv2_b2 \t\t\t\t\t\t| 82.02\t| 95.99\t| 25.4M   | 4.0G   | 224 \t    | 0.875    | bicubic \t   | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1-KY6NbS3Y3gCaPaUam0v_Xlk1fT-N1Mz\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1FWx0QB7_8_ikrPIOlL7ung)(are2) |\n| pvtv2_b2_linear \t\t\t\t| 82.06\t| 96.04\t| 22.6M   | 3.9G   | 224 \t    | 0.875    | bicubic \t   | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1hC8wE_XanMPi0_y9apEBKzNc4acZW5Uy\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1IAhiiaJPe-Lg1Qjxp2p30w)(a4c8) |\n| pvtv2_b3 \t\t\t\t\t\t| 83.14\t| 96.47\t| 45.2M   | 6.8G   | 224 \t    | 0.875    | bicubic \t   | 
[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F16yYV8x7aKssGYmdE-YP99GMg4NKGR5j1\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1ge0rBsCqIcpIjrVxsrFhnw)(nc21) |\n| pvtv2_b4 \t\t\t\t\t\t| 83.61\t| 96.69\t| 62.6M   | 10.0G  | 224 \t    | 0.875    | bicubic \t   | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1gvPdvDeq0VchOUuriTnnGUKh0N2lj-fA\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1VMSD_Kr_hduCZ5dxmDbLoA)(tthf) |\n| pvtv2_b5 \t\t\t\t\t\t| 83.77\t| 96.61\t| 82.0M   | 11.5G  | 224 \t    | 0.875    | bicubic \t   | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1OHaHiHN_AjsGYBN2gxFcQCDhBbTvZ02g\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1ey4agxI2Nb0F6iaaX3zAbA)(9v6n) |\n| | | | | | | | | | \n| shuffle_vit_tiny  \t\t\t| 82.39 | 96.05 | 28.5M   | 4.6G   | 224        | 0.875    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1ffJ-tG_CGVXztPEPQMaT_lUoc4hxFy__\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F19DhlLIFyPGOWtyq_c83ZGQ)(8a1i) |\n| shuffle_vit_small \t\t\t| 83.53 | 96.57 | 50.1M   | 8.8G   | 224        | 0.875    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1du9H0SKr0QH9GQjhWDOXOnhpSVpfbb8X\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1rM2J8BVwxQ3kRZoHngwNZA)(xwh3) |\n| shuffle_vit_base  \t\t\t| 83.95 | 96.91 | 88.4M   | 15.5G  | 224        | 0.875    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1sYh808AyTG3-_qv6nfN6gCmyagsNAE6q\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1fks_IYDdnXdAkCFuYHW_Nw)(1gsr) |\n| | | | | | | | | |\n| t2t_vit_7      \t\t\t\t| 71.68 | 90.89 | 4.3M    | 1.0G   | 224   \t    | 0.9      | bicubic       | 
[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1YkuPs1ku7B_udydOf_ls1LQvpJDg_c_j\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1jVNsz37gatLCDaOoU3NaMA)(1hpa) |\n| t2t_vit_10     \t\t\t\t| 75.15 | 92.80 | 5.8M    | 1.3G   | 224   \t    | 0.9      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1H--55RxliMDlOCekn7FpKrHDGsUkyrJZ\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1nbdb4PFMq4nsIp8HrNxLQg)(ixug) |\n| t2t_vit_12     \t\t\t\t| 76.48 | 93.49 | 6.9M    | 1.5G   | 224   \t    | 0.9      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1stnIwOwaescaEcztaF1QjI4NK4jaqN7P\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1DcMzq9WeSwrS3epv6jKJXw)(qpbb) |\n| t2t_vit_14     \t\t\t\t| 81.50 | 95.67 | 21.5M   | 4.4G   | 224   \t    | 0.9      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1HSvN3Csgsy7SJbxJYbkzjUx9guftkfZ1\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1wcfh22uopBv7pS7rKcH_iw)(c2u8) |\n| t2t_vit_19     \t\t\t\t| 81.93 | 95.74 | 39.1M   | 7.8G   | 224   \t    | 0.9      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1eFnhaL6I33pHCQw2BaEE0Oet9CnjmUf_\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1Hpyc5hBYo1zqoXWpryegnw)(4in3) |\n| t2t_vit_24     \t\t\t\t| 82.28 | 95.89 | 64.0M   | 12.8G  | 224   \t    | 0.9      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1Z7nZCHeFp0AhIkGYcMAFkKdkGN0yXtpv\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1Hpyc5hBYo1zqoXWpryegnw)(4in3) |\n| t2t_vit_t_14   \t\t\t\t| 81.69 | 95.85 | 21.5M   | 4.4G   | 224   \t    | 0.9      | bicubic       | 
[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F16li4voStt_B8eWDXqJt7s20OT_Z8L263\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1Hpyc5hBYo1zqoXWpryegnw)(4in3) |\n| t2t_vit_t_19   \t\t\t\t| 82.44 | 96.08 | 39.1M   | 7.9G   | 224   \t    | 0.9      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1Ty-42SYOu15Nk8Uo6VRTJ7J0JV_6t7zJ\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1YdQd6l8tj5xMCWvcHWm7sg)(mier) |\n| t2t_vit_t_24   \t\t\t\t| 82.55 | 96.07 | 64.0M   | 12.9G  | 224   \t    | 0.9      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1cvvXrGr2buB8Np2WlVL7n_F1_CnI1qow\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1BMU3KX_TRmPxQ1jN5cmWhg)(6vxc) |\n| t2t_vit_14_384 \t\t\t\t| 83.34 | 96.50 | 21.5M   | 13.0G  | 384   \t    | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1Yuso8WD7Q8Lu_9I8dTvAvkcXXtPSkmnm\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1AOMhyVRF9zPqJe-lTrd7pw)(r685) |\n| | | | | | | | | |\n| cross_vit_tiny_224 \t\t\t| 73.20 | 91.90 | 6.9M    | 1.3G   | 224   \t    | 0.875    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1ILTVwQtetcb_hdRjki2ZbR26p-8j5LUp\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1byeUsM34_gFL0jVr5P5GAw)(scvb) |\n| cross_vit_small_224 \t\t\t| 81.01 | 95.33 | 26.7M   | 5.2G   | 224   \t    | 0.875    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1ViOJiwbOxTbk1V2Go7PlCbDbWPbjWPJH\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1I9CrpdPU_D5LniqIVBoIPQ)(32us) |\n| cross_vit_base_224 \t\t\t| 82.12 | 95.87 | 104.7M  | 20.2G  | 224   \t    | 0.875    | bicubic       | 
[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1vTorkc63O4JE9cYUMHBRxFMDOFoC-iK7\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1TR_aBHQ2n1J0RgHFoVh_bw)(jj2q) |\n| cross_vit_9_224 \t\t\t\t| 73.78 | 91.93 | 8.5M    | 1.6G   | 224   \t    | 0.875    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1UCX9_mJSx2kDAmEd_xDXyd4e6-Mg3RPf\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1M8r5vqMHJ-rFwBoW1uL2qQ)(mjcb) |\n| cross_vit_15_224 \t\t\t\t| 81.51 | 95.72 | 27.4M   | 5.2G   | 224   \t    | 0.875    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1HwkLWdz6A3Nz-dVbw4ZUcCkxUbPXgHwM\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1wiO_Gjk4fvSq08Ud8xKwVw)(n55b) |\n| cross_vit_18_224 \t\t\t\t| 82.29 | 96.00 | 43.1M   | 8.3G   | 224   \t    | 0.875    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1C4b_a_6ia8NCEXSUEMDdCEFzedr0RB_m\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1w7VJ7DNqq6APuY7PdlKEjA)(xese) |\n| cross_vit_9_dagger_224 \t\t| 76.92 | 93.61 | 8.7M    | 1.7G   | 224   \t    | 0.875    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1_cXQ0M8Hr9UyugZk07DrsBl8dwwCA6br\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1F1tRSaG4EfCV_WiTEwXxBw)(58ah) |\n| cross_vit_15_dagger_224 \t\t| 82.23 | 95.93 | 28.1M   | 5.6G   | 224   \t    | 0.875    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1cCgBoozh2WFtSz42LwEUUPPyC5KmkAFg\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1xJ4P2zy3r9RcNFSMtzvZgg)(qwup) |\n| cross_vit_18_dagger_224 \t\t| 82.51 | 96.03 | 44.1M   | 8.7G   | 224   \t    | 0.875    | bicubic       | 
[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1sdAbWxKL5k3QIo1zdgHzasIOtpy_Ogpw\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F15qYHgt0iRxdhtXoC_ct2Jg)(qtw4) |\n| cross_vit_15_dagger_384 \t\t| 83.75 | 96.75 | 28.1M   | 16.4G  | 384   \t    | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F12LQjYbs9-LyrY1YeRt46x9BTB3NJuhpJ\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1d-BAm03azLP_CyEHF3c7ZQ)(w71e) |\n| cross_vit_18_dagger_384 \t\t| 84.17 | 96.82 | 44.1M   | 25.8G  | 384   \t    | 1.0 \t   | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1CeGwB6Tv0oL8QtL0d7Ar-d02Lg_PqACr\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1l_6PTldZ3IDB7XWgjM6LhA)(99b6) |\n| | | | | | | | | | \n| beit_base_patch16_224_pt22k   | 85.21 | 97.66 | 87M    | 12.7G   | 224        | 0.9      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1lq5NeQRDHkIQi7U61OidaLhNsXTWfh_Z\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1pjblqaESqfXVrpgo58oR6Q)(fshn) |\n| beit_base_patch16_384_pt22k   | 86.81 | 98.14 | 87M    | 37.3G   | 384        | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1wn2NS7kUdlERkzWEDeyZKmcRbmWL7TR2\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1WVbNjxuIUh514pKAgZZEzg)(arvc) |\n| beit_large_patch16_224_pt22k  | 87.48 | 98.30 | 304M   | 45.0G   | 224        | 0.9      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F11OR1FKxzfafqT7GzTW225nIQjxmGSbCm\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1bvhERVXN2TyRcRJFzg7sIA)(2ya2) |\n| beit_large_patch16_384_pt22k  | 88.40 | 98.60 | 304M   | 131.7G  | 384        | 1.0      | bicubic       | 
[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F10EraafYS8CRpEshxClOmE2S1eFCULF1Y\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1H76G2CGLY3YmmYt4-suoRA)(qtrn) |\n| beit_large_patch16_512_pt22k  | 88.60 | 98.66 | 304M   | 234.0G  | 512        | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1xIIocftsB1PcDHZttPqLdrJ-G4Tyfrs-\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1WtTVK_Wvg-izaF0M6Gzw-Q)(567v) |\n| | | | | | | | | | \n| Focal-T    \t\t\t\t\t| 82.03 | 95.86 | 28.9M   | 4.9G    | 224        | 0.875    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1HzZJbYH_eIo94h0wLUhqTyJ6AYthNKRh\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1JCr2qIA-SZvTqbTO-m2OwA)(i8c2) |\n| Focal-T (use conv)   \t\t\t| 82.70 | 96.14 | 30.8M   | 4.9G    | 224        | 0.875    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1PS0-gdXHGl95LqH5k5DG62AH6D3i7v0D\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1tVztox4bVJuJEjkD1fLaHQ)(smrk) |\n| Focal-S    \t\t\t\t\t| 83.55 | 96.29 | 51.1M   | 9.4G    | 224        | 0.875    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1HnVAYsI_hmiomyS4Ax3ccPE7gk4mlTU8\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1b7uugAY9RhrgTkUwYcvvow)(dwd8) |\n| Focal-S (use conv)   \t\t\t| 83.85 | 96.47 | 53.1M   | 9.4G    | 224        | 0.875    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1vcHjYiGNMayoSTPoM8z39XRH6h89TB9V\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F174a2aZzCEt3teLuAnIzMtA)(nr7n) |\n| Focal-B    \t\t\t\t\t| 83.98 | 96.48 | 89.8M   | 16.4G   | 224        | 0.875    | bicubic       | 
[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1bNMegxetWpwZNcmDEC3MHCal6SNXSgWR\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1piBslNhxWR78aQJIdoZjEw)(8akn) |\n| Focal-B (use conv)   \t\t\t| 84.18 | 96.61 | 93.3M   | 16.4G   | 224        | 0.875    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1-J2gDnKrvZGtasvsAYozrbMXR2LtIJ43\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1GTLfnTlt6I6drPdfSWB1Iw)(5nfi) |\n| | | | | | | | | | \n| mobilevit_xxs   \t\t\t\t| 70.31| 89.68 | 1.32M   | 0.44G   | 256        | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1l3L-_TxS3QisRUIb8ohcv318vrnrHnWA\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1KFZ5G834_-XXN33W67k8eg)(axpc) |\n| mobilevit_xs   \t\t\t\t| 74.47| 92.02 | 2.33M   | 0.95G   | 256        | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1oRMA4pNs2Ba0LYDbPufC842tO4OFcgwq\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1IP8S-S6ZAkiL0OEsiBWNkw)(hfhm) |\n| mobilevit_s   \t\t\t\t| 76.74| 93.08 | 5.59M   | 1.88G   | 256        | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1ibkhsswGYWvZwIRjwfgNA4-Oo2stKi0m\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1-rI6hiCHZaI7os2siFASNg)(34bg) |\n| mobilevit_s $\\dag$  \t\t\t| 77.83| 93.83 | 5.59M   | 1.88G   | 256        | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1BztBJ5jzmqgDWfQk-FB_ywDWqyZYu2yG\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F19YepMAO-sveBOLA4aSjIEQ?pwd=92ic)(92ic) |\n| | | | | | | | | | \n| vip_s7  \t\t\t\t\t\t| 81.50 | 95.76 | 25.1M   | 7.0G   |    224     | 0.875    | bicubic       | 
[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F16bZkqzbnN08_o15k3MzbegK8SBwfQAHF\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1uY0FsNPYaM8cr3ZCdAoVkQ)(mh9b) |\n| vip_m7  \t\t\t\t\t\t| 82.75 | 96.05 | 55.3M   | 16.4G  |    224     | 0.875    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F11lvT2OXW0CVGPZdF9dNjY_uaEIMYrmNu\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1j3V0Q40iSqOY15bTKlFFRw)(hvm8) |\n| vip_l7  \t\t\t\t\t\t| 83.18 | 96.37 | 87.8M   | 24.5G  |    224     | 0.875    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1bK08JorLPMjYUep_TnFPKGs0e1j0UBKJ\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1I5hnv3wHWEaG3vpDqaNL-w)(tjvh) |\n| | | | | | | | | | \n| xcit_nano_12_p16_224_dist   | 72.32  | 90.86  | 3.1M    | 0.6G      | 224        | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F14FsYtm48JB-rQFF9CanJsJaPESniWD7q\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F15kdY4vzwU2QiBSU5127AYA)(7qvz)     |\n| xcit_nano_12_p16_384_dist   | 75.46  | 92.70  | 3.1M    | 1.6G      | 384        | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1zR-hFQryocF9muG-erzcxFuJme5y_e9f\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1449qtQzEMg6lqdtClyiCRQ)(1y2j)     |\n| xcit_large_24_p16_224_dist  | 84.92  | 97.13  | 189.1M  | 35.9G     | 224        | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1lAtko_KwOagjwaFvUkeXirVClXCV8gt-\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1Gs401mXqG1bifi1hBdXtig)(kfv8)     |\n| xcit_large_24_p16_384_dist  | 85.76  | 97.54  | 189.1M  | 105.5G    | 384        | 1.0      | bicubic       | 
[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F15djnKz_-eooncvyZp_UTwOiHIm1Hxo_G\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F14583hbtIVbZ_2ifZepQItQ)(ffq3)     |\n| xcit_nano_12_p8_224_dist    | 76.33  | 93.10  | 3.0M    | 2.2G      | 224        | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1XxRNjskLvSVp6lvhlsnylq6g7vd_5MsI\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1DZJxuahFJyz-rEEsCqhhrA)(jjs7)     |\n| xcit_nano_12_p8_384_dist    | 77.82  | 94.04  | 3.0M    | 6.3G      | 384        | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1P3ln8JqLzMKbJAhCanRbu7i5NMPVFNec\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1ECY9-PVDMNSup8NMQiqBrw)(dmc1)     |\n| xcit_large_24_p8_224_dist   | 85.40  | 97.40  | 188.9M  | 141.4G    | 224        | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F14ZoDxEez5NKVNAsbgjTPisjOQEAA30Wy\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1D_zyvjzIVFp6iqx1s7IEbA)(y7gw)     |\n| xcit_large_24_p8_384_dist   | 85.99  | 97.69  | 188.9M  | 415.5G    | 384        | 1.0      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1stcUwwFNJ38mdaFsNXq24CBMmDenJ_e4\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1lwbBk7GFuqnnP_iU2OuDRw)(9xww)     |\n| | | | | | | | | |\n| pit_ti \t     | 72.91\t| 91.40\t| 4.8M    | 0.5G   | 224        | 0.9      | bicubic       |[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1bbeqzlR_CFB8CAyTUN52p2q6ii8rt0AW\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1Yrq5Q16MolPYHQsT_9P1mw)(ydmi)  |\n| pit_ti_distill | 74.54\t| 92.10 | 5.1M    | 0.5G   | 224        | 0.9      | bicubic       
|[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1m4L0OVI0sYh8vCv37WhqCumRSHJaizqX\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1RIM9NGq6pwfNN7GJ5WZg2w)(7k4s)  |\n| pit_xs \t     | 78.18    | 94.16 | 10.5M   | 1.1G   | 224        | 0.9      | bicubic       |[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1qoMQ-pmqLRQmvAwZurIbpvgMK8MOEgqJ\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F15d7ep05vI2UoKvL09Zf_wg)(gytu)  |\n| pit_xs_distill | 79.31 \t| 94.36 | 10.9M   | 1.1G   | 224        | 0.9      | bicubic       |[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1EfHOIiTJOR-nRWE5AsnJMsPCncPHEgl8\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1DqlgVF7U5qHfGD3QJAad4A)(ie7s)  |\n| pit_s  \t\t | 81.08 \t| 95.33 | 23.4M   | 2.4G   | 224        | 0.9      | bicubic       |[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1TDSybTrwQpcFf9PgCIhGX1t-f_oak66W\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1Vk-W1INskQq7J5Qs4yphCg)(kt1n)  |\n| pit_s_distill  | 81.99 \t| 95.79 | 24.0M   | 2.5G   | 224        | 0.9      | bicubic       |[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1U3VPP6We1vIaX-M3sZuHmFhCQBI9g_dL\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1L7rdWmMW8tiGkduqmak9Fw)(hhyc)  |\n| pit_b   \t\t | 82.44 \t| 95.71 | 73.5M\t  | 10.6G  | 224        | 0.9      | bicubic       |[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1-NBZ9-83nZ52jQ4DNZAIj8Xv6oh54nx-\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1XRDPY4OxFlDfl8RMQ56rEg)(uh2v)  |\n| pit_b_distill  | 84.14 \t| 96.86 | 74.5M   | 10.7G  | 224        | 0.9      | bicubic       
|[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F12Yi4eWDQxArhgQb96RXkNWjRoCsDyNo9\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1vJOUGXPtvC0abg-jnS4Krw)(3e6g)  |\n| | | | | | | | | |\n| halonet26t \t | 79.10\t| 94.31\t| 12.5M    | 3.2G   | 256        | 0.95     | bicubic       |[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1F_a1brftXXnPM39c30NYe32La9YZQ0mW\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1FSlSTuYMpwPJpi4Yz2nCTA)(ednv)  |\n| halonet50ts \t | 81.65\t| 95.61\t| 22.8M    | 5.1G   | 256        | 0.94     | bicubic       |[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F12t85kJcPA377XePw6smch--ELMBo6p0Y\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1X4LM-sqoTKG7CrM5BNjcdA)(3j9e)  |\n| | | | | | | | | |\n| poolformer_s12 | 77.24 | 93.51 | 11.9M   | 1.8G   | 224        | 0.9      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F15EBfTTU6coLCsDNiLgAWYiWeMpp3uYH4\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1n6TUxQGlssTu4lyLrBOXEw)(zcv4)             |\n| poolformer_s24 | 80.33 | 95.05 | 21.3M   | 3.4G   | 224        | 0.9      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1JxqJluDpp1wwe7XtpTi1aWaVvlq0Q3xF\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1d2uyHB5R6ZWPzXWhdtm6fw)(nedr)             |\n| poolformer_s36 | 81.43 | 95.45 | 30.8M   | 5.0G   | 224        | 0.9      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1ka3VeupDRFBSzzrcw4wHXKGqoKv6sB_Y\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1de6ZJkmYEmVI7zKUCMB_xw)(fvpm)             |\n| poolformer_m36 | 82.11 | 95.69 | 56.1M   | 8.9G   | 224        | 0.95     | bicubic       | 
[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1LTZ8wNRb_GSrJ9H3qt5-iGiGlwa4dGAK\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1qNTYLw4vyuoH1EKDXEcSvw)(whfp)             |\n| poolformer_m48 | 82.46 | 95.96 | 73.4M   | 11.8G  | 224        | 0.95     | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1YhXEVjWtI4bZB_Qwama8G4RBanq2K15L\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1VJXANTseTUEA0E6HYf-XyA)(374f)             |\n| | | | | | | | | |\n| botnet50 \t | 77.38\t| 93.56\t| 20.9M    | 5.3G   | 224        | 0.875     | bicubic       |[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1S4nxgRkElT3K4lMx2JclPevmP3YUHNLw\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1CW40ShBJQYeFgdBIZZLSjg)(wh13)\n| | | | | | | | | |\n| CvT-13-224      | 81.59 | 95.67 | 20M    | 4.5G    | 224        | 0.875      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1r0fnHn1bRPmN0mi8RwAPXmD4utDyOxEf\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F13xNwCGpdJ5MVUi369OGl5Q)(vev9) |\n| CvT-21-224      | 82.46 | 96.00 | 32M    | 7.1G    | 224        | 0.875      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F18s7nRfvcmNdbRuEpTQe02AQE3Y9UWVQC\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1mOjbMNoQb7X3VJD3LV0Hhg)(t2rv) |\n| CvT-13-384   \t  | 83.00 | 96.36 | 20M    | 16.3G   | 384        | 1.0        | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1J0YYPUsiXSqyExBPtOPrOLL9c16syllg\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1upITRr5lNHLjbBJtIr-jdg)(wswt) |\n| CvT-21-384   \t  | 83.27 | 96.16 | 32M    | 24.9G   | 384        | 1.0        | bicubic       | 
[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1tpXv_yYXtvyArlYi7AFcHUOqemhyMWHW\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1hXKi3Kb7mNxPFVmR6cdkMg)(hcem) |\n| CvT-13-384-22k  | 83.26 | 97.09 | 20M    | 16.3G   | 384        | 1.0        | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F18djrvq422u1pGLPxNfWAp6d17F7C5lbP\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1YYv5rKPmroxKCnzkesUr0g)(c7m9) |\n| CvT-21-384-22k  | 84.91 | 97.62 | 32M    | 24.9G   | 384        | 1.0        | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1NVXd7vxVoRpL-21GN7nGn0-Ut0L0Owp8\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1N3xNU6XFHb1CdEOrnjKuoA)(9jxe) |\n| CvT-w24-384-22k | 87.58 | 98.47 | 277M   | 193.2G  | 384        | 1.0        | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1M3bg46N4SGtupK8FcvAOE0jltOwP5yja\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1MNJurm8juHRGG9SAw3IOkg)(bbj2) |\n| | | | | | | | | |\n| HVT-Ti-1       | 69.45 | 89.28 | 5.7M    | 0.6G   | 224        |  0.875   |  bicubic      |  [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F11BW-qLBMu_1TDAavlrAbfVlXB53dgm42\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F16rZvJqL-UVuWFsCDuxFDqg?pwd=egds)(egds) |\n| HVT-S-0        | 80.30 | 95.15 | 22.0M   | 4.6G   | 224        |  0.875   |  bicubic      |  [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1GlJ2j2QVFye1tAQoUJlgKTR_KELq3mSa\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1L-tjDxkQx00jg7BsDClabA?pwd=hj7a)(hj7a) |\n| HVT-S-1        | 78.06 | 93.84 | 22.1M   | 2.4G   | 224        |  0.875   |  bicubic      |  
[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F16H33zNIpNrHBP1YhCq4zmLjRYQJ0XEmX\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1quOsgVuxTcauISQ3SehysQ?pwd=tva8)(tva8) |\n| HVT-S-2        | 77.41 | 93.48 | 22.1M   | 1.9G   | 224        |  0.875   |  bicubic      |  [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1U14LA7SXJtFep_SdUCjAV-cDOQ9A_OFk\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1nooWTBzaXyBtEgadn9VDmw?pwd=bajp)(bajp) |\n| HVT-S-3        | 76.30 | 92.88 | 22.1M   | 1.6G   | 224        |  0.875   |  bicubic      |  [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1m1CjOcZfPMLDRyX4QBgMhHV1m6rtu44v\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F15sAOmQN6Hx0GLelYDuMQXw?pwd=rjch)(rjch) |\n| HVT-S-4        | 75.21 | 92.34 | 22.1M   | 1.6G   | 224        |  0.875   |  bicubic      |  [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F14comGo9lO12dUeGGL52MuIJWZPSit7I0\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1o31hMRWR7FTCjUk7_fAOgA?pwd=ki4j)(ki4j) |\n| | | | | | | | | |\n| | | | | | | | | |\n| mlp_mixer_b16_224            \t| 76.60 | 92.23 | 60.0M   | 12.7G  | 224        | 0.875    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1ZcQEH92sEPvYuDc6eYZgssK5UjYomzUD\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F12nZaWGMOXwrCMOIBfUuUMA)(xh8x) |\n| mlp_mixer_l16_224           \t| 72.06 | 87.67 | 208.2M  | 44.9G  | 224        | 0.875    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1mkmvqo5K7JuvqGm92a-AdycXIcsv1rdg\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1AmSVpwCaGR9Vjsj_boL7GA)(8q7r) |\n| | | | | | | | | |\n| resmlp_24_224                \t| 79.38 | 94.55 | 30.0M   | 6.0G   | 224        | 0.875    | bicubic       | 
[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F15A5q1XSXBz-y1AcXhy_XaDymLLj2s2Tn\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1nLAvyG53REdwYNCLmp4yBA)(jdcx) |\n| resmlp_36_224             \t| 79.77 | 94.89 | 44.7M   | 9.0G   | 224        | 0.875    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1WrhVm-7EKnLmPU18Xm0C7uIqrg-RwqZL\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1QD4EWmM9b2u1r8LsnV6rUA)(33w3) |\n| resmlp_big_24_224         \t| 81.04 | 95.02 | 129.1M  | 100.7G | 224        | 0.875    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1KLlFuzYb17tC5Mmue3dfyr2L_q4xHTZi\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1oXU6CR0z7O0XNwu_UdZv_w)(r9kb) |\n| resmlp_12_distilled_224 \t\t| 77.95 | 93.56 | 15.3M   |\t3.0G   | 224        | 0.875    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1cDMpAtCB0pPv6F-VUwvgwAaYtmP8IfRw\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F15kJeZ_V1MMjTX9f1DBCgnw)(ghyp) |\n| resmlp_24_distilled_224 \t\t| 80.76 | 95.22 | 30.0M   |\t6.0G   | 224        | 0.875    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F15d892ExqR1sIAjEn-cWGlljX54C3vihA\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1NgQtSwuAwsVVOB8U6N4Aqw)(sxnx) |\n| resmlp_36_distilled_224 \t\t| 81.15 | 95.48 | 44.7M\t  | 9.0G   | 224        | 0.875    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1Laqz1oDg-kPh6eb6bekQqnE0m-JXeiep\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1p1xGOJbMzH_RWEj36ruQiw)(vt85) |\n| resmlp_big_24_distilled_224 \t| 83.59 | 96.65 | 129.1M  |\t100.7G | 224        | 0.875    | bicubic       | 
[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F199q0MN_BlQh9-HbB28RdxHj1ApMTHow-\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1yUrfbqW8vLODDiRV5WWkhQ)(4jk5) |\n| resmlp_big_24_22k_224   \t\t| 84.40 | 97.11 | 129.1M  | 100.7G | 224        | 0.875    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1zATKq1ruAI_kX49iqJOl-qomjm9il1LC\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1VrnRMbzzZBmLiR45YwICmA)(ve7i) |\n| | | | | | | | | |\n| gmlp_s16_224                 \t| 79.64 | 94.63 | 19.4M   | 4.5G   | 224        | 0.875    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1TLypFly7aW0oXzEHfeDSz2Va4RHPRqe5\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F13UUz1eGIKyqyhtwedKLUMA)(bcth) |\n| | | | | | | | | |\n| ff_only_tiny (linear_tiny) \t| 61.28 | 84.06 |         |        | 224   \t    | 0.875    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F14bPRCwuY_nT852fBZxb9wzXzbPWNfbCG\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1nNE4Hh1Nrzl7FEiyaZutDA)(mjgd) |\n| ff_only_base (linear_base) \t| 74.82 | 91.71 |         |        | 224   \t    | 0.875    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1DHUg4oCi41ELazPCvYxCFeShPXE4wU3p\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1l-h6Cq4B8kZRvHKDTzhhUg)(m1jc) |\n| | | | | | | | | |\n| repmlp_res50_light_224 \t\t| 77.01 | 93.46 | 87.1M   | 3.3G   | 224   \t    | 0.875    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F16bCFa-nc_-tPVol-UCczrrDO_bCFf2uM\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1bzmpS6qJJTsOq3SQE7IOyg)(b4fg) |\n| | | | | | | | | |\n| cyclemlp_b1 \t\t\t\t\t | 78.85 | 94.60 | 15.1M   |    | 224   \t    | 0.9    | bicubic  
     | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F10WQenRy9lfOJF4xEHc9Mekp4zHRh0mJ_\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F11UQp1RkWBsZFOqit_uU80w)(mnbr) |\n| cyclemlp_b2 \t\t\t\t\t | 81.58 | 95.81 | 26.8M   |    | 224   \t    | 0.9    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1dtQHCwtxNh9jgiHivN5iYpHe7uKRUjhk\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1Js-Oq5vyiB7oPagn43cn3Q)(jwj9) |\n| cyclemlp_b3 \t\t\t\t\t | 82.42 | 96.07 | 38.3M   |    | 224   \t    | 0.9    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F11kMq112tAwVE5llJIepIIixz74AjaJhU\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1b7cau1yPxqATA8X7t2DXkw)(v2fy) |\n| cyclemlp_b4 \t\t\t\t\t | 82.96 | 96.33 | 51.8M   |    | 224   \t    | 0.875  | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1vwJ0eD9Ic-NvLvCz1zEAmn7RxBMtd_v2\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1P3TlnXRFGWj9nVP5xBGGWQ)(fnqd) |\n| cyclemlp_b5 \t\t\t\t\t | 83.25 | 96.44 | 75.7M   |    | 224   \t    | 0.875  | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F12_I4cfOBfp7kC0RvmnMXFqrSxww6plRW\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1-Cka1tNqGUQutkAP3VZXzQ)(s55c) |\n| | | | | | | | | |\n| convmixer_1024_20  \t\t\t| 76.94 | 93.35 | 24.5M   | 9.5G   |    224     | 0.96     | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1R7zUSl6_6NFFdNOe8tTfoR9VYQtGfD7F\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1DgGA3qYu4deH4woAkvjaBw)(qpn9) |\n| convmixer_768_32  \t\t\t| 80.16 | 95.08 | 21.2M   | 20.8G  |    224     | 0.96     | bicubic       | 
[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F196Lg_Eet-hRj733BYASj22g51wdyaW2a\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F17CbRNzY2Sy_Cu7cxNAkWmQ)(m5s5) |\n| convmixer_1536_20  \t\t\t| 81.37 | 95.62 | 51.8M   | 72.4G  |    224     | 0.96     | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1-LlAlADiu0SXDQmE34GN2GBhqI-RYRqO\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1R-gSzhzQNfkuZVxsaE4vEw)(xqty) |\n| | | | | | | | | |\n| convmlp_s\t\t\t  \t\t\t| 76.76 | 93.40 | 9.0M    | 2.4G   |    224     | 0.875    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1D8kWVfQxOyyktqDixaZoGXB3wVspzjlc\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1WseHYALFB4Of3Dajmlt45g)(3jz3) |\n| convmlp_m\t\t\t  \t\t\t| 79.03 | 94.53 | 17.4M   | 4.0G   |    224     | 0.875    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1TqVlKHq-WRdT9KDoUpW3vNJTIRZvix_m\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1koipCAffG6REUyLYk0rGAQ)(vyp1) |\n| convmlp_l\t\t\t  \t\t\t| 80.15 | 95.00 | 42.7M   | 10.0G  |    224     | 0.875    | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1KXxYogDh6lD3QGRtFBoX5agfz81RDN3l\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1f1aEeVoySzImI89gkjcaOA)(ne5x) |\n| | | | | | | | | |\n| topformer_tiny | 65.98 | 87.32 | 1.5M   | 0.13G   | 224        | 0.875      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1CXhp26GYA-yIUvf1PEEVuhKUt2Qq_EoS\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F11kLXIEHchXm2PGcOOuxXng?pwd=gvdb)             |\n| topformer_small| 72.44 | 91.17 | 3.1M   | 0.24G   | 224        | 0.875      | bicubic       | 
[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1-eJLnHhwpy_6kLKOG-pAvSfKdHePurUz\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1nlw_r55SwfK8ERnHs9kZwg?pwd=b69w)             |\n| topformer_base | 75.25 | 92.67 | 5.1M   | 0.37G   | 224        | 0.875      | bicubic       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1jC_NVpaTRqFJ4ACnv_TTs9kH1yPvHE4H\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1ep2YEQ1ZwgXFb0V6RrQq5Q?pwd=v9xm)             |\n| | | | | | | | | |\n\n### Object Detection ###\n| Model | Backbone  | box_mAP | Checkpoint                                                                                                                                                  |\n|-------|-----------|---------|-------------------------------------------------------------------------------------------------------------------------------------------------------------|\n| DETR  | ResNet50  | 42.0    | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1ruIKCqfh_MMqzq_F4L2Bv-femDMjS_ix\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1J6lB1mezd6_eVW3jnmohZA)(n5gk) |\n| DETR  | ResNet101 | 43.5    | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F11HCyDJKZLX33_fRGp4bCg1I14vrIKYW5\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1_msuuAwFMNbAlMpgUq89Og)(bxz2) |\n| Mask R-CNN | Swin-T 1x |  43.7   | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1OpbCH5HuIlxwakNz4PzrAlJF3CxkLSYp\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F18HALSo2RHMBsX-Gbsi-YOw)(qev7) |\n| Mask R-CNN | Swin-T 3x |  46.0   | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1oREwIk1ORhSsJcs4Y-Cfd0XrSEfPFP3-\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1tw607oogDWQ7Iz91ItfuGQ)(m8fg) |\n| Mask R-CNN | Swin-S 3x |  48.4   | 
[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1ZPWkz0zMzHJycHd6_s2hWDHIsW8SdZcK\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1ubC5_CKSq0ExQSINohukVg)(hdw5) |\n| Mask R-CNN | pvtv2_b0 \t\t|  38.3   | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1wA324LkFtGezHJovSZ4luVqSxVt9woFc\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1q67ZIDSHn9Y-HU_WoQr8OQ)(3kqb) |\n| Mask R-CNN | pvtv2_b1 \t\t|  41.8   | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1alNaSmR4TSXsPpGoUZr2QQf5phYQjIzN\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1aSkuDiNpxdnFWE1Wn1SWNw)(k5aq) |\n| Mask R-CNN | pvtv2_b2 \t\t|  45.2   | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1tg6B5OEV4OWLsDxTCjsWgxgaSgIh4cID\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1DLwxCZVZizb5HKih7RFw2w)(jh8b) |\n| Mask R-CNN | pvtv2_b2_linear \t|  44.1   | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1b26vxK3QVGx5ovqKir77NyY6YPgAWAEj\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F16T-Nyo_Jm2yDq4aoXpdnbg)(8ipt) |\n| Mask R-CNN | pvtv2_b3 \t\t|  46.9   | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1H6ZUCixCaYe1AvlBkuqYoxzz4b-icJ3u\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F16QVsjUOXijo5d9cO3FZ39A)(je4y) |\n| Mask R-CNN | pvtv2_b4 \t\t|  47.5   | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1pXQNpn0BoKqiuVaGtJL18eWG6XmdlBOL\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1yhX7mpmb2wbRvWZFnUloBQ)(n3ay) |\n| Mask R-CNN | pvtv2_b5 \t\t|  47.4   | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F12vOyw6pUfK1NdOWBF758aAZuaf-rZLvx\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1-gasQk9PqLMkrWXw4aX41g)(jzq1) 
|\n\n### Semantic Segmentation ###\n#### Pascal Context ####\n|Model      | Backbone  | Batch size | mIoU (ss) | mIoU (ms+flip) | Backbone checkpoint | Model checkpoint      |     Config  |\n|-----------|-----------|------------|-----------|----------------|-----------------------------------------------|-----------------------------------------------------------------------|------------|\n|SETR_Naive | ViT_large |     16     |   52.06   |      52.57        | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1TPgh7Po6ayYb1DksJeZp60LGnNyznr-r\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F18WSi8Jp3tCZgv_Vr3V1i7A)(owoj)     | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1AUyBLeoAcMH0P_QGer8tdeU44muTUOCA\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F11XgmgYG071n_9fSGUcPpDQ)(xdb8)   | [config](semantic_segmentation\u002Fconfigs\u002Fsetr\u002FSETR_Naive_Large_480x480_80k_pascal_context_bs_16.yaml) | \n|SETR_PUP   | ViT_large |     16     |   53.90   |       54.53    | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1TPgh7Po6ayYb1DksJeZp60LGnNyznr-r\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F18WSi8Jp3tCZgv_Vr3V1i7A)(owoj)     | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1IY-yBIrDPg5CigQ18-X2AX6Oq3rvWeXL\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1v6ll68fDNCuXUIJT2Cxo-A)(6sji) | [config](semantic_segmentation\u002Fconfigs\u002Fsetr\u002FSETR_PUP_Large_480x480_80k_pascal_context_bs_16.yaml) |\n|SETR_MLA   | ViT_Large |     8      |   54.39   |       55.16       | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1TPgh7Po6ayYb1DksJeZp60LGnNyznr-r\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F18WSi8Jp3tCZgv_Vr3V1i7A)(owoj)     | 
[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1utU2h0TrtuGzRX5RMGroudiDcz0z6UmV\u002Fview)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1Eg0eyUQXc-Mg5fg0T3RADA)(wora)| [config](semantic_segmentation\u002Fconfigs\u002Fsetr\u002FSETR_MLA_Large_480x480_80k_pascal_context_bs_8.yaml) |\n|SETR_MLA   | ViT_large |     16     |   55.01   |       55.87        | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1TPgh7Po6ayYb1DksJeZp60LGnNyznr-r\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F18WSi8Jp3tCZgv_Vr3V1i7A)(owoj)     | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1SOXB7sAyysNhI8szaBqtF8ZoxSaPNvtl\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1jskpqYbazKY1CKK3iVxAYA)(76h2) | [config](semantic_segmentation\u002Fconfigs\u002Fsetr\u002FSETR_MLA_Large_480x480_80k_pascal_context_bs_16.yaml) |\n\n#### Cityscapes ####\n|Model      | Backbone  | Batch size | Iterations | mIoU (ss) | mIoU (ms+flip) | Backbone checkpoint | Model checkpoint     |     Config  |\n|-----------|-----------|------------|-----------|-----------|----------------|-----------------------------------------------|-----------------------------------------------------------------------|------------|\n|SETR_Naive | ViT_Large |     8      |     40k   |   76.71   |       79.03        | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1TPgh7Po6ayYb1DksJeZp60LGnNyznr-r\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F18WSi8Jp3tCZgv_Vr3V1i7A)(owoj)      | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1QialLNMmvWW8oi7uAHhJZI3HSOavV4qj\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1F3IB31QVlsohqW8cRNphqw)(g7ro)  |  [config](semantic_segmentation\u002Fconfigs\u002Fsetr\u002FSETR_Naive_Large_768x768_40k_cityscapes_bs_8.yaml)| \n|SETR_Naive | ViT_Large |     8      |     80k   |   77.31   |       79.43      | 
[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1TPgh7Po6ayYb1DksJeZp60LGnNyznr-r\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F18WSi8Jp3tCZgv_Vr3V1i7A)(owoj)      | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1RJeSGoDaOP-fM4p1_5CJxS5ku_yDXXLV\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1XbHPBfaHS56HlaMJmdJf1A)(wn6q)   |  [config](semantic_segmentation\u002Fconfigs\u002Fsetr\u002FSETR_Naive_Large_768x768_80k_cityscapes_bs_8.yaml)| \n|SETR_PUP   | ViT_Large |     8      |     40k   |   77.92   |       79.63        |  [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1TPgh7Po6ayYb1DksJeZp60LGnNyznr-r\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F18WSi8Jp3tCZgv_Vr3V1i7A)(owoj)     | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F12rMFMOaOYSsWd3f1hkrqRc1ThNT8K8NG\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1H8b3valvQ2oLU9ZohZl_6Q)(zmoi)    | [config](semantic_segmentation\u002Fconfigs\u002Fsetr\u002FSETR_PUP_Large_768x768_40k_cityscapes_bs_8.yaml)| \n|SETR_PUP   | ViT_Large |     8      |     80k   |   78.81   |       80.43     |   [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1TPgh7Po6ayYb1DksJeZp60LGnNyznr-r\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F18WSi8Jp3tCZgv_Vr3V1i7A)(owoj)    | [baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1tkMhRzO0XHqKYM0lojE3_g)(f793)    | [config](semantic_segmentation\u002Fconfigs\u002Fsetr\u002FSETR_PUP_Large_768x768_80k_cityscapes_bs_8.yaml)| \n|SETR_MLA   | ViT_Large |     8      |     40k   |   76.70    |       78.96      |   [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1TPgh7Po6ayYb1DksJeZp60LGnNyznr-r\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F18WSi8Jp3tCZgv_Vr3V1i7A)(owoj)    | 
[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1sUug5cMKSo6mO7BEI4EV_w)(qaiw)    | [config](semantic_segmentation\u002Fconfigs\u002Fsetr\u002FSETR_MLA_Large_768x768_40k_cityscapes_bs_8.yaml)| \n|SETR_MLA   | ViT_Large |     8      |     80k   |  77.26     |       79.27      |   [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1TPgh7Po6ayYb1DksJeZp60LGnNyznr-r\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F18WSi8Jp3tCZgv_Vr3V1i7A)(owoj)    | [baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1IqPZ6urdQb_0pbdJW2i3ow)(6bgj)    | [config](semantic_segmentation\u002Fconfigs\u002Fsetr\u002FSETR_MLA_Large_768x768_80k_cityscapes_bs_8.yaml)|\n\n#### ADE20K ####\n|Model      | Backbone  | Batch size | Iterations | mIoU (ss) | mIoU (ms+flip) | Backbone checkpoint | Model checkpoint     |     Config  |\n|-----------|-----------|------------|-----------|-----------|----------------|-----------------------------------------------|-----------------------------------------------------------------------|------------|\n|SETR_Naive | ViT_Large |     16      |     160k   | 47.57   |      48.12        |   [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1TPgh7Po6ayYb1DksJeZp60LGnNyznr-r\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F18WSi8Jp3tCZgv_Vr3V1i7A)(owoj)    | [baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1_AY6BMluNn71UiMNZbnKqQ)(lugq)   | [config](semantic_segmentation\u002Fconfigs\u002Fsetr\u002FSETR_Naive_Large_512x512_160k_ade20k_bs_16.yaml)| \n|SETR_PUP   | ViT_Large |     16      |     160k   |  49.12   |      49.51        |   [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1TPgh7Po6ayYb1DksJeZp60LGnNyznr-r\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F18WSi8Jp3tCZgv_Vr3V1i7A)(owoj)    | [baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1N83rG0EZSksMGZT3njaspg)(udgs)    | 
[config](semantic_segmentation\u002Fconfigs\u002Fsetr\u002FSETR_PUP_Large_512x512_160k_ade20k_bs_16.yaml)| \n|SETR_MLA   | ViT_Large |     8      |     160k   |  47.80   |       49.34        |   [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1TPgh7Po6ayYb1DksJeZp60LGnNyznr-r\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F18WSi8Jp3tCZgv_Vr3V1i7A)(owoj)    | [baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1L83sdXWL4XT02dvH2WFzCA)(mrrv)    | [config](semantic_segmentation\u002Fconfigs\u002Fsetr\u002FSETR_MLA_Large_512x512_160k_ade20k_bs_8.yaml)| \n|DPT        | ViT_Large |     16     |     160k   |  47.21   |       -        |   [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1TPgh7Po6ayYb1DksJeZp60LGnNyznr-r\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F18WSi8Jp3tCZgv_Vr3V1i7A)(owoj)      |[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1PCSC1Kvcg291gqp6h5pDCg)(ts7h)   |  [config](semantic_segmentation\u002Fconfigs\u002Fdpt\u002FDPT_Large_480x480_160k_ade20k_bs_16.yaml)\n|Segmenter  | ViT_Tiny  |     16     |     160k   |  38.45   |       -        |   TODO      |[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1nZptBc-IY_3PFramXSlovQ)(1k97)   |  [config](semantic_segmentation\u002Fconfigs\u002Fsegmenter\u002Fsegmenter_Tiny_512x512_160k_ade20k_bs_16.yaml)\n|Segmenter  | ViT_Small |     16     |     160k   |  46.07   |       -        |   TODO      |[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1gKE-GEu7gX6dJsgtlvrmWg)(i8nv)   |  [config](semantic_segmentation\u002Fconfigs\u002Fsegmenter\u002Fsegmenter_small_512x512_160k_ade20k_bs_16.yaml)\n|Segmenter  | ViT_Base  |     16     |     160k   |  49.08   |       -        |   TODO      |[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1qb7HEtKW0kBSP6iv-r_Hjg)(hxrl)   |  [config](semantic_segmentation\u002Fconfigs\u002Fsegmenter\u002Fsegmenter_Base_512x512_160k_ade20k_bs_16.yaml) |\n|Segmenter  | 
ViT_Large  |     16     |     160k   |  51.82   |       -        |   TODO      |[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F121FOwpsYue7Z2Rg3ZlxnKg)(wdz6)   |  [config](semantic_segmentation\u002Fconfigs\u002Fsegmenter\u002Fsegmenter_Tiny_512x512_160k_ade20k_bs_16.yaml)\n|Segmenter_Linear  | DeiT_Base |     16     |     160k   |  47.34   |       -        |   TODO      |[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1Hk_zcXUIt_h5sKiAjG2Pog)(5dpv)   |  [config](semantic_segmentation\u002Fconfigs\u002Fsegmenter\u002Fsegmenter_Base_distilled_512x512_160k_ade20k_bs_16.yaml)\n|Segmenter  | DeiT_Base |     16     |     160k   |  49.27   |       -        |   TODO      |[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1-TBUuvcBKNgetSJr0CsAHA)(3kim)   |  [config](semantic_segmentation\u002Fconfigs\u002Fsegmenter\u002Fsegmenter_Base_distilled_512x512_160k_ade20k_bs_16.yaml) |\n|Segformer  | MIT-B0 |     16     |     160k   |  38.37   |       -        |   TODO      |[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1WOD9jGjQRLnwKrRYzgBong)(ges9)   |  [config](semantic_segmentation\u002Fconfigs\u002Fsegformer\u002Fsegformer_mit-b0_512x512_160k_ade20k.yaml) |\n|Segformer  | MIT-B1 |     16     |     160k   |  42.20   |       -        |   TODO      |[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1aiSBXMd8nP82XK7sSZ05gg)(t4n4)   |  [config](semantic_segmentation\u002Fconfigs\u002Fsegmenter\u002Fsegformer_mit-b1_512x512_160k_ade20k.yaml) |\n|Segformer  | MIT-B2 |     16     |     160k   |  46.38   |       -        |   TODO      |[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1wFFh-K5t46YktkfoWUOTAg)(h5ar)   |  [config](semantic_segmentation\u002Fconfigs\u002Fsegmenter\u002Fsegformer_mit-b2_512x512_160k_ade20k.yaml) |\n|Segformer  | MIT-B3 |     16     |     160k   |  48.35   |       -        |   TODO      |[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1IwBnDeLNyKgs-xjhlaB9ug)(g9n4)   |  
[config](semantic_segmentation\u002Fconfigs\u002Fsegmenter\u002Fsegformer_mit-b3_512x512_160k_ade20k.yaml) |\n|Segformer  | MIT-B4 |     16     |     160k   |  49.01   |       -        |   TODO      |[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1a25fCVlwJ-1TUh9HQfx7YA)(e4xw)   |  [config](semantic_segmentation\u002Fconfigs\u002Fsegmenter\u002Fsegformer_mit-b4_512x512_160k_ade20k.yaml) |\n|Segformer  | MIT-B5 |     16     |     160k   |  49.73   |       -        |   TODO      |[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F15kXXxKEjjtJv-BmrPnSTOw)(uczo)   |  [config](semantic_segmentation\u002Fconfigs\u002Fsegmenter\u002Fsegformer_mit-b5_512x512_160k_ade20k.yaml) |\n| UperNet  | Swin_Tiny |     16     |     160k   |  44.90   |       45.37     |   -      |[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1S8JR4ILw0u4I-DzU4MaeVQ)(lkhg)   |  [config](semantic_segmentation\u002Fconfigs\u002Fupernet_swin\u002Fupernet_swin_tiny_patch4_windown7_512x512_160k_ade20k.yaml) |\n| UperNet  | Swin_Small |     16     |     160k   |  47.88   |       48.90      |   -      |[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F17RKeSpuWqONVptQZ3B4kEA)(vvy1)   |  [config](semantic_segmentation\u002Fconfigs\u002Fupernet_swin\u002Fupernet_swin_small_patch4_windown7_512x512_160k_ade20k.yaml) |\n| UperNet  | Swin_Base |     16     |     160k   |   48.59   |       49.04      |   -      |[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1bM15KHNsb0oSPblQwhxbgw)(y040)   |  [config](semantic_segmentation\u002Fconfigs\u002Fupernet_swin\u002Fupernet_swin_base_patch4_windown7_512x512_160k_ade20k.yaml) |\n| UperNet  | CSwin_Tiny |     16     |     160k   |  49.46   |           |[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1ol_gykZjgAFbJ3PkqQ2j0Q)(l1cp) | [baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1gLePNLybtrax9yCQ2fcIPg)(y1eq)  |  [config](semantic_segmentation\u002Fconfigs\u002Fupernet_cswin\u002Fupernet_cswin_tiny_patch4_512x512_160k_ade20k.yaml) |\n| UperNet  | 
CSwin_Small |     16     |     160k   |  50.88   |      | [baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1mSd_JdNS4DtyVNYxqVobBw)(6vwk)   | [baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1a_vhHoib0-BcRwTnnSVGWA)(fz2e)   | [config](semantic_segmentation\u002Fconfigs\u002Fupernet_cswin\u002Fupernet_cswin_small_patch4_512x512_160k_ade20k.yaml) |\n| UperNet  | CSwin_Base |     16     |     160k   |  50.64   |      | [baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1suO0jX_Tw56CVm3UhByOWg)(0ys7)   | [baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1Ym-RUooqizgUDEm5jWyrhA)(83w3)   | [config](semantic_segmentation\u002Fconfigs\u002Fupernet_cswin\u002Fupernet_cswin_base_patch4_512x512_160k_ade20k.yaml) |\n| TopFormer  | TopFormer_Base |     16     |     160k   |  38.3   |   -   | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1jC_NVpaTRqFJ4ACnv_TTs9kH1yPvHE4H\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1ep2YEQ1ZwgXFb0V6RrQq5Q?pwd=v9xm)   | [google](https:\u002F\u002Fdrive.google.com\u002Fdrive\u002Ffolders\u002F11yCohFEuxwvDjObt05Rp9Qs85MLrUoHo?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1k6Z9NKLEaDaKm5OgbIi0oA)(ufxt)   | [config](semantic_segmentation\u002Fconfigs\u002Ftop_former\u002Ftopformer_base_512x512_160k_2x8_ade20k.yaml) |\n| TopFormer  | TopFormer_Base |     32     |     160k   |  39.2   |   -   | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1jC_NVpaTRqFJ4ACnv_TTs9kH1yPvHE4H\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1ep2YEQ1ZwgXFb0V6RrQq5Q?pwd=v9xm)   | [google](https:\u002F\u002Fdrive.google.com\u002Fdrive\u002Ffolders\u002F11yCohFEuxwvDjObt05Rp9Qs85MLrUoHo?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1k6Z9NKLEaDaKm5OgbIi0oA)(ufxt)   | [config](semantic_segmentation\u002Fconfigs\u002Ftop_former\u002Ftopformer_base_512x512_160k_4x8_ade20k.yaml) |\n| TopFormer  | TopFormer_Small |     
16     |     160k   |  36.5   |   -   | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1-eJLnHhwpy_6kLKOG-pAvSfKdHePurUz\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1nlw_r55SwfK8ERnHs9kZwg?pwd=b69w)   | [google](https:\u002F\u002Fdrive.google.com\u002Fdrive\u002Ffolders\u002F11yCohFEuxwvDjObt05Rp9Qs85MLrUoHo?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1k6Z9NKLEaDaKm5OgbIi0oA)(ufxt)   | [config](semantic_segmentation\u002Fconfigs\u002Ftop_former\u002Ftopformer_small_512x512_160k_2x8_ade20k.yaml) |\n| TopFormer  | TopFormer_Small |     32     |     160k   |  37.0   |  -    | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1-eJLnHhwpy_6kLKOG-pAvSfKdHePurUz\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1nlw_r55SwfK8ERnHs9kZwg?pwd=b69w)   | [google](https:\u002F\u002Fdrive.google.com\u002Fdrive\u002Ffolders\u002F11yCohFEuxwvDjObt05Rp9Qs85MLrUoHo?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1k6Z9NKLEaDaKm5OgbIi0oA)(ufxt)   | [config](semantic_segmentation\u002Fconfigs\u002Ftop_former\u002Ftopformer_small_512x512_160k_4x8_ade20k.yaml) |\n| TopFormer  | TopFormer_Tiny |     16     |     160k   |  33.6   |   -   | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1CXhp26GYA-yIUvf1PEEVuhKUt2Qq_EoS\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F11kLXIEHchXm2PGcOOuxXng?pwd=gvdb)   | [google](https:\u002F\u002Fdrive.google.com\u002Fdrive\u002Ffolders\u002F11yCohFEuxwvDjObt05Rp9Qs85MLrUoHo?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1k6Z9NKLEaDaKm5OgbIi0oA)(ufxt)   | [config](semantic_segmentation\u002Fconfigs\u002Ftop_former\u002Ftopformer_tiny_512x512_160k_2x8_ade20k.yaml) |\n| TopFormer  | TopFormer_Tiny |     32     |     160k   |  34.6   |   -   | 
[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1CXhp26GYA-yIUvf1PEEVuhKUt2Qq_EoS\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F11kLXIEHchXm2PGcOOuxXng?pwd=gvdb)   | [google](https:\u002F\u002Fdrive.google.com\u002Fdrive\u002Ffolders\u002F11yCohFEuxwvDjObt05Rp9Qs85MLrUoHo?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1k6Z9NKLEaDaKm5OgbIi0oA)(ufxt)   | [config](semantic_segmentation\u002Fconfigs\u002Ftop_former\u002Ftopformer_tiny_512x512_160k_4x8_ade20k.yaml) |\n| TopFormer  | TopFormer_Tiny |     16     |     160k   |  32.5   |   -   | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1CXhp26GYA-yIUvf1PEEVuhKUt2Qq_EoS\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F11kLXIEHchXm2PGcOOuxXng?pwd=gvdb)   | [google](https:\u002F\u002Fdrive.google.com\u002Fdrive\u002Ffolders\u002F11yCohFEuxwvDjObt05Rp9Qs85MLrUoHo?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1k6Z9NKLEaDaKm5OgbIi0oA)(ufxt)   | [config](semantic_segmentation\u002Fconfigs\u002Ftop_former\u002Ftopformer_tiny_448x448_160k_2x8_ade20k.yaml) |\n| TopFormer  | TopFormer_Tiny |     32     |     160k   |  33.4   |  -    | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1CXhp26GYA-yIUvf1PEEVuhKUt2Qq_EoS\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F11kLXIEHchXm2PGcOOuxXng?pwd=gvdb)   | [google](https:\u002F\u002Fdrive.google.com\u002Fdrive\u002Ffolders\u002F11yCohFEuxwvDjObt05Rp9Qs85MLrUoHo?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1k6Z9NKLEaDaKm5OgbIi0oA)(ufxt)   | [config](semantic_segmentation\u002Fconfigs\u002Ftop_former\u002Ftopformer_tiny_448x448_160k_4x8_ade20k.yaml) |\n|Trans2seg_Medium | Resnet50c |     32      |    160k    |  36.81  |      -        |   
[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1C6nMg6DgQ73wzF21UwDVxmkcRTeKngnK\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1hs0tbSGIeMLLGMq05NN--w)(4dd5)    | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1IqFCEC8PeKgtoljmUxCqI3kmfsMRcTLN\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1hF7-TrjGeHTw0zxzTvhXUA)(i2nt)   | [config](semantic_segmentation\u002Fconfigs\u002Ftrans2seg\u002FTrans2Seg_medium_512x512_16k_ade20k_bs_32.yaml)|\n\n#### Trans10kV2 ####\n|Model     | Backbone  | Batch size | Iterations | mIoU (ss) | mIoU (ms+flip) | Backbone checkpoint | Model checkpoint     |     Config file  |\n|-----------|-----------|------------|-----------|-----------|----------------|-----------------------------------------------|-----------------------------------------------------------------------|------------|\n|Trans2seg_Medium | Resnet50c |     16      |    16k    |  75.97  |      -        |   [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1C6nMg6DgQ73wzF21UwDVxmkcRTeKngnK\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1hs0tbSGIeMLLGMq05NN--w)(4dd5)    | [google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1Si03aM3m9aqGocvN9XQGbvHIsWQxxXZu\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1wdOUD6S8QGqD6S-98Yb37w)(w25r)   | [config](semantic_segmentation\u002Fconfigs\u002Ftrans2seg\u002FTrans2Seg_medium_512x512_16k_trans10kv2_bs_16.yaml)| \n\n\n\n### GAN ###\n| Model                          | FID | Image size | Crop ratio | Interpolation | Model checkpoint |\n|--------------------------------|-----|------------|----------|---------------|--------------|\n| styleformer_cifar10            |2.73 | 32         | 1.0      | lanczos       |[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1iW76QmwbYz6GeAPQn8vKvsG0GvFdhV4T\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1Ax7BNEr1T19vgVjXG3rW7g)(ztky)  
|\n| styleformer_stl10              |15.65| 48         | 1.0      | lanczos       |[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F15p785y9eP1TeoqUcHPbwFPh98WNof7nw\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1rSORxMYAiGkLQZ4zTA2jcg)(i973)|\n| styleformer_celeba             |3.32 | 64         | 1.0      | lanczos       |[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1_YauwZN1osvINCboVk2VJMscrf-8KlQc\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F16NetcPxLQF9C_Zlp1SpkLw)(fh5s) |\n| styleformer_lsun               | 9.68 | 128        | 1.0      | lanczos       |[google](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1i5kNzWK04ippFSmrmcAPMItkO0OFukTd\u002Fview?usp=sharing)\u002F[baidu](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F1jTS9ExAMz5H2lhue4NMV2A)(158t)|\n\n> *Results are evaluated on the Cifar10, STL10, Celeba, and LSUNchurch datasets using the **fid50k_full** metric.\n\n\n## Quick Demo for Image Classification\nTo use a model with pretrained weights, go to the specific subfolder, e.g. `\u002Fimage_classification\u002FViT\u002F`, download the `.pdparams` weight file, and change the related file paths in the following Python script. The model config files are located in `.\u002Fconfigs`.\n\nAssuming the downloaded weight file is stored at `.\u002Fvit_base_patch16_224.pdparams`, use the `vit_base_patch16_224` model in Python as follows:\n```python\nimport paddle\nfrom config import get_config\nfrom visual_transformer import build_vit as build_model\n# config files are in .\u002Fconfigs\u002F\nconfig = get_config('.\u002Fconfigs\u002Fvit_base_patch16_224.yaml')\n# build the model\nmodel = build_model(config)\n# load the pretrained weights\nmodel_state_dict = paddle.load('.\u002Fvit_base_patch16_224.pdparams')\nmodel.set_dict(model_state_dict)\n```\n> :robot: See the README file in each model folder for detailed usage.\n\n\n### Evaluation ###\nTo evaluate the ViT model on ImageNet2012 with a single GPU, run the following script from the command line:\n```shell\nsh run_eval.sh\n```\nor\n```shell\nCUDA_VISIBLE_DEVICES=0 \\\npython main_single_gpu.py \\\n    -cfg=.\u002Fconfigs\u002Fvit_base_patch16_224.yaml \\\n    -dataset=imagenet2012 \\\n    -batch_size=16 \\\n    -data_path=\u002Fpath\u002Fto\u002Fdataset\u002Fimagenet\u002Fval \\\n    -eval \\\n    
-pretrained=\u002Fpath\u002Fto\u002Fpretrained\u002Fmodel\u002Fvit_base_patch16_224  # omit the .pdparams extension\n```\n\n\u003Cdetails>\n\n\u003Csummary>\nRun evaluation using multiple GPUs:\n\u003C\u002Fsummary>\n\n\n```shell\nsh run_eval_multi.sh\n```\nor\n```shell\nCUDA_VISIBLE_DEVICES=0,1,2,3 \\\npython main_multi_gpu.py \\\n    -cfg=.\u002Fconfigs\u002Fvit_base_patch16_224.yaml \\\n    -dataset=imagenet2012 \\\n    -batch_size=16 \\\n    -data_path=\u002Fpath\u002Fto\u002Fdataset\u002Fimagenet\u002Fval \\\n    -eval \\\n    -pretrained=\u002Fpath\u002Fto\u002Fpretrained\u002Fmodel\u002Fvit_base_patch16_224   # omit the .pdparams extension\n```\n\n\u003C\u002Fdetails>\n\n\n### Training ###\nTo train the ViT model on ImageNet2012 with a single GPU, run the following script from the command line:\n```shell\nsh run_train.sh\n```\nor\n```shell\nCUDA_VISIBLE_DEVICES=0 \\\npython main_single_gpu.py \\\n  -cfg=.\u002Fconfigs\u002Fvit_base_patch16_224.yaml \\\n  -dataset=imagenet2012 \\\n  -batch_size=32 \\\n  -data_path=\u002Fpath\u002Fto\u002Fdataset\u002Fimagenet\u002Ftrain\n```\n\n\n\u003Cdetails>\n\n\u003Csummary>\nRun training using multiple GPUs:\n\u003C\u002Fsummary>\n\n\n```shell\nsh run_train_multi.sh\n```\nor\n```shell\nCUDA_VISIBLE_DEVICES=0,1,2,3 \\\npython main_multi_gpu.py \\\n    -cfg=.\u002Fconfigs\u002Fvit_base_patch16_224.yaml \\\n    -dataset=imagenet2012 \\\n    -batch_size=16 \\\n    -data_path=\u002Fpath\u002Fto\u002Fdataset\u002Fimagenet\u002Ftrain\n```\n\n\u003C\u002Fdetails>\n\n\n\n## Contributing ##\n* We encourage and appreciate your contributions to the **PaddleViT** project; please refer to our workflow and working style in [CONTRIBUTING.md](.\u002FCONTRIBUTING.md).\n\n\n## License ##\n* This repository is licensed under the Apache-2.0 license.\n\n## Contact ##\n* Please open an issue on GitHub.","# PaddleViT Quick Start Guide\n\nPaddleViT is a collection of vision Transformer and MLP models built on Baidu's PaddlePaddle framework, covering tasks such as image classification, object detection, and semantic segmentation. This guide helps you set up the environment and run your first model.\n\n## 1. 
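Environment Setup\n\nBefore you begin, make sure your development environment meets the following requirements:\n\n*   **Operating system**: Linux \u002F macOS \u002F Windows\n*   **Python version**: >= 3.7\n*   **Deep learning framework**: PaddlePaddle >= 2.1\n*   **Hardware**: an NVIDIA GPU with CUDA support (recommended) or a CPU-only environment\n\n### Install PaddlePaddle\nUsing a mirror in mainland China is recommended to speed up installation. Choose the command that matches your CUDA version (the example below is for CUDA 11.2; for other versions, see the [PaddlePaddle website](https:\u002F\u002Fwww.paddlepaddle.org.cn)):\n\n```bash\n# Install the GPU build of PaddlePaddle from a mainland China mirror\npython -m pip install paddlepaddle-gpu==2.4.0 -i https:\u002F\u002Fmirror.baidu.com\u002Fpypi\u002Fsimple\n\n# For the CPU-only build\npython -m pip install paddlepaddle==2.4.0 -i https:\u002F\u002Fmirror.baidu.com\u002Fpypi\u002Fsimple\n```\n\n## 2. Installation\n\nClone the PaddleViT repository and install the required dependencies:\n\n```bash\n# Clone the repository\ngit clone https:\u002F\u002Fgithub.com\u002FBR-IDL\u002FPaddleViT.git\ncd PaddleViT\n\n# Install project dependencies\npip install -r requirements.txt\n```\n\n> **Tip**: If installing from `requirements.txt` is slow, you can manually edit the file to point to a mainland China mirror (such as `https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple`) before installing.\n\n## 3. Basic Usage\n\nPaddleViT has a modular design. Taking image classification as an example, you can call a predefined model directly for inference or fine-tuning.\n\n### Example: load a pretrained model for inference\n\nThe following code shows how to load a pretrained ViT model and run a prediction on a single image:\n\n```python\nimport paddle\nfrom paddlevit.models import vit_base_patch16_224\nfrom paddlevit.utils import load_pretrained_weights\nfrom PIL import Image\nimport numpy as np\n\n# 1. Build the model\nmodel = vit_base_patch16_224(pretrained=False)\n\n# 2. Load pretrained weights (downloaded automatically, or from a local path)\n# ImageNet-1k pretrained weights are used here as an example\nload_pretrained_weights(model, 'vit_base_patch16_224')\n\n# 3. Switch to evaluation mode\nmodel.eval()\n\n# 4. Prepare the input (assumed to be already preprocessed into a 224x224 tensor)\n# In practice, use transforms for normalization and related preprocessing\ndummy_input = paddle.rand([1, 3, 224, 224])\n\n# 5. 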
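Run inference\nwith paddle.no_grad():\n    output = model(dummy_input)\n    probabilities = paddle.nn.functional.softmax(output, axis=1)\n\nprint(\"Output shape:\", output.shape)\nprint(\"Most probable class:\", paddle.argmax(probabilities, axis=1).numpy())\n```\n\n### Training and Fine-tuning\nFor training on a custom dataset, PaddleViT provides a unified config file interface. Edit the data paths and model parameters in the config file, then start training:\n\n```bash\n# Start image classification training with the default config (example)\npython tools\u002Ftrain.py -c configs\u002Fvit\u002Fvit_base_patch16_224.yaml\n```\n\nFor detailed usage of other tasks such as object detection (PaddleViT-Det) and semantic segmentation (PaddleViT-Seg), see the `docs` in the project root or the instructions in the corresponding subfolder.","The algorithm team at an e-commerce company is upgrading its automatic product image classification system to cope with a growing number of SKUs and cluttered backgrounds.\n\n### Without PaddleViT\n- **Model performance plateaus**: The team relies on traditional CNN architectures (such as ResNet), whose accuracy stalls on fine-grained classification (for example, telling apart shirts with different patterns) and which are sensitive to occlusion and lighting changes.\n- **Cutting-edge algorithms are hard to reproduce**: Trying the latest Vision Transformer (ViT) techniques means writing complex attention code from scratch, with long and error-prone debugging cycles.\n- **Training is inefficient**: Without built-in configurations for distributed training (DDP) and mixed-precision training (AMP), training on large datasets takes days and iteration is slow.\n- **Deployment is complicated**: Converting experimental code into a production inference model is tedious, and the lack of a unified export tool adds engineering risk.\n\n### With PaddleViT\n- **Recognition accuracy improves markedly**: By calling the pretrained SOTA ViT models in PaddleViT directly and using their strong global modeling ability, fine-grained classification accuracy rises by 15%, effectively addressing interference from cluttered backgrounds.\n- **The R&D barrier drops sharply**: Thanks to the modular design, the team switches between Transformer variants by editing only a config file, cutting the validation cycle for a new algorithm from weeks to hours.\n- **Faster training and better resource use**: With built-in DDP multi-GPU parallelism and AMP mixed-precision training, training time falls by 60%, significantly reducing compute cost.\n- **Seamless handoff to production**: Using the model export tool provided by PaddleViT, the trained model is converted to an inference format in one step and integrated smoothly into existing PaddlePaddle services for a quick launch.\n\nBy providing out-of-the-box, state-of-the-art vision Transformer models and efficient training tools, PaddleViT helps the team break through the performance ceiling of traditional convolutional networks and move quickly from algorithm innovation to production deployment.","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FBR-IDL_PaddleViT_6b1c2a28.png","BR-IDL",null,"https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002FBR-IDL_46b455ef.png","https:\u002F\u002Fgithub.com\u002FBR-IDL",[77,81],{"name":78,"color":79,"percentage":80},"Python","#3572A5",99.5,{"name":82,"color":83,"percentage":84},"Shell","#89e051",0.5,1240,328,"2026-04-08T16:50:42","Apache-2.0","Not specified","A GPU is required for DDP (distributed data parallel) and mixed-precision training (AMP); the specific model, memory size, and CUDA version are not stated explicitly, but the GPU environment must be compatible with PaddlePaddle 2.1+",{"notes":92,"python":89,"dependencies":93},"This tool is built on Baidu's PaddlePaddle deep learning framework, not PyTorch. It supports multi-GPU distributed training (DDP) and automatic mixed precision (AMP), and provides models for image classification, object detection, semantic segmentation, GANs, and other tasks. To get started, see the Paddle AI Studio 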
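tutorials; pretrained weights can be downloaded for fine-tuning.",[94],"PaddlePaddle>=2.1",[35,15,14],[97,98,99,100,101,102,103,104,105,106,107,108,109,110],"cv","computer-vision","paddlepaddle","vit","mlp","transformer","encoder-decoder","classification","detection","segmentation","gan","deep-learning","semantic-segmentation","object-detection","2026-03-27T02:49:30.150509","2026-04-18T00:45:53.091994",[114,119,123,128,133,138],{"id":115,"question_zh":116,"answer_zh":117,"source_url":118},38282,"When loading pretrained weights in the README Usage example, does the file name need the `.pdparams` suffix?","Yes, it does. The comment in the README says that the `-pretrained` command-line argument does not need the suffix, but when loading a model with `paddle.load()` in Python code, you must add the `.pdparams` suffix explicitly. A correct example:\n```python\nmodel_state_dict = paddle.load('.\u002Fbeit_base_patch16_224_ft22kto1k.pdparams')\nmodel.set_dict(model_state_dict)\n```","https:\u002F\u002Fgithub.com\u002FBR-IDL\u002FPaddleViT\u002Fissues\u002F92",{"id":120,"question_zh":121,"answer_zh":122,"source_url":118},38283,"Running a training or evaluation command raises `FileNotFoundError`, saying that a config file path containing single quotes (such as `'.\u002Fconfigs\u002Fxxx.yaml'`) cannot be found. How do I fix this?","`argparse` uses the argument string exactly as it receives it, so if you wrap the argument value in single quotes and the shell does not strip them, the quote characters become part of the path.\nThere are two solutions:\n1. **Recommended**: Remove the single quotes around the argument value. For example, change `-cfg='.\u002Fconfigs\u002Fxxx.yaml'` to `-cfg=.\u002Fconfigs\u002Fxxx.yaml`.\n2. Or replace the single quotes with double quotes (depending on the shell).\nNote: PowerShell and Git Bash are usually unaffected; the problem mainly occurs when cmd is the shell (as in some PyCharm configurations).",{"id":124,"question_zh":125,"answer_zh":126,"source_url":127},38284,"When training a ViT model on my own dataset, the loss does not decrease and accuracy is far below ResNet. What is the cause?","This is usually not a code bug but a matter of hyperparameter tuning or dataset size. ViT models have far more parameters than ResNet50 and are hard to converge on a small dataset.\nTry the following adjustments:\n1. **Load a pretrained model**: Be sure to fine-tune from weights pretrained on a large dataset (such as ImageNet-22k).\n2. 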
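**Tune hyperparameters**: Adjust the learning rate (lr) and batch size to suit your dataset. One user reported that, with a pretrained model, `lr=0.0001`, and `batch_size=32`, accuracy exceeded 0.8 after 10 epochs.","https:\u002F\u002Fgithub.com\u002FBR-IDL\u002FPaddleViT\u002Fissues\u002F113",{"id":129,"question_zh":130,"answer_zh":131,"source_url":132},38285,"The BEiT model produces NaN (numerical overflow) in the forward pass during PaddlePaddle mixed-precision training. How do I fix this?","This is a known issue, and the maintainers have updated the code to fix NaN under mixed-precision training. Pull the latest code; see:\nhttps:\u002F\u002Fgithub.com\u002FBR-IDL\u002FPaddleViT\u002Ftree\u002Fdevelop\u002Fimage_classification\u002FBEiT\nRe-run after updating to resolve the problem.","https:\u002F\u002Fgithub.com\u002FBR-IDL\u002FPaddleViT\u002Fissues\u002F178",{"id":134,"question_zh":135,"answer_zh":136,"source_url":137},38286,"Resuming training fails with `AttributeError: 'Momentum' object has no attribute 'set_dict'`. What should I do?","This error is caused by an incompatible way of loading the optimizer state. The maintainers have fixed the relevant code.\nMake sure you are using the latest version of the PaddleViT codebase. If you are on an older version, either modify the optimizer state loading logic so that it does not call `set_dict` directly on the optimizer object, or upgrade to the fixed version.","https:\u002F\u002Fgithub.com\u002FBR-IDL\u002FPaddleViT\u002Fissues\u002F116",{"id":139,"question_zh":140,"answer_zh":141,"source_url":142},38287,"What does the check `if not paddle.distributed.parallel.parallel_helper._is_parallel_ctx_initialized()` in the code do? Does multi-node, multi-GPU training require code changes?","The check tests whether the parallel training environment has already been initialized, to avoid initializing it twice. For multi-node, multi-GPU training, a branch under development currently supports it.\nSee the code on the `multi_node` branch:\nhttps:\u002F\u002Fgithub.com\u002FBR-IDL\u002FPaddleViT\u002Fblob\u002Fmulti_node\u002Fimage_classification\u002FCycleMLP\u002Fmain_multi_node.py\nFor a simple multi-node setup, you can try adding the `ips` argument to the `spawn()` call to list the IP addresses of all nodes, for example `ips='172.18.0.2, 172.18.0.3'` (replace with your actual node IPs).","https:\u002F\u002Fgithub.com\u002FBR-IDL\u002FPaddleViT\u002Fissues\u002F107",[144,149],{"id":145,"version":146,"summary_zh":147,"released_at":148},306442,"v0.8","New in this release:\n1. Added more classification, detection, and segmentation models.\n2. Added tools and scripts for model training and validation.\n3. Refactored the single-GPU and multi-GPU training\u002Fvalidation schemes.\n4. Fixed common bugs and issues.\n5. Added more documentation and tutorials.\n6. Improved the README file.","2022-01-11T10:36:25",{"id":150,"version":151,"summary_zh":152,"released_at":153},306443,"v0.1","This is the first release of `PaddleViT`.","2021-08-30T08:18:45"]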