[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-whitelok--image-text-localization-recognition":3,"tool-whitelok--image-text-localization-recognition":61},[4,18,28,37,45,53],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":17},4358,"openclaw","openclaw\u002Fopenclaw","OpenClaw 是一款专为个人打造的本地化 AI 助手，旨在让你在自己的设备上拥有完全可控的智能伙伴。它打破了传统 AI 助手局限于特定网页或应用的束缚，能够直接接入你日常使用的各类通讯渠道，包括微信、WhatsApp、Telegram、Discord、iMessage 等数十种平台。无论你在哪个聊天软件中发送消息，OpenClaw 都能即时响应，甚至支持在 macOS、iOS 和 Android 设备上进行语音交互，并提供实时的画布渲染功能供你操控。\n\n这款工具主要解决了用户对数据隐私、响应速度以及“始终在线”体验的需求。通过将 AI 部署在本地，用户无需依赖云端服务即可享受快速、私密的智能辅助，真正实现了“你的数据，你做主”。其独特的技术亮点在于强大的网关架构，将控制平面与核心助手分离，确保跨平台通信的流畅性与扩展性。\n\nOpenClaw 非常适合希望构建个性化工作流的技术爱好者、开发者，以及注重隐私保护且不愿被单一生态绑定的普通用户。只要具备基础的终端操作能力（支持 macOS、Linux 及 Windows WSL2），即可通过简单的命令行引导完成部署。如果你渴望拥有一个懂你",349277,3,"2026-04-06T06:32:30",[13,14,15,16],"Agent","开发框架","图像","数据工具","ready",{"id":19,"name":20,"github_repo":21,"description_zh":22,"stars":23,"difficulty_score":24,"last_commit_at":25,"category_tags":26,"status":17},9989,"n8n","n8n-io\u002Fn8n","n8n 是一款面向技术团队的公平代码（fair-code）工作流自动化平台，旨在让用户在享受低代码快速构建便利的同时，保留编写自定义代码的灵活性。它主要解决了传统自动化工具要么过于封闭难以扩展、要么完全依赖手写代码效率低下的痛点，帮助用户轻松连接 400 多种应用与服务，实现复杂业务流程的自动化。\n\nn8n 特别适合开发者、工程师以及具备一定技术背景的业务人员使用。其核心亮点在于“按需编码”：既可以通过直观的可视化界面拖拽节点搭建流程，也能随时插入 JavaScript 或 Python 代码、调用 npm 包来处理复杂逻辑。此外，n8n 原生集成了基于 LangChain 的 AI 能力，支持用户利用自有数据和模型构建智能体工作流。在部署方面，n8n 提供极高的自由度，支持完全自托管以保障数据隐私和控制权，也提供云端服务选项。凭借活跃的社区生态和数百个现成模板，n8n 让构建强大且可控的自动化系统变得简单高效。",184740,2,"2026-04-19T23:22:26",[16,14,13,15,27],"插件",{"id":29,"name":30,"github_repo":31,"description_zh":32,"stars":33,"difficulty_score":10,"last_commit_at":34,"category_tags":35,"status":17},10095,"AutoGPT","Significant-Gravitas\u002FAutoGPT","AutoGPT 是一个旨在让每个人都能轻松使用和构建 AI 的强大平台，核心功能是帮助用户创建、部署和管理能够自动执行复杂任务的连续型 AI 智能体。它解决了传统 AI 应用中需要频繁人工干预、难以自动化长流程工作的痛点，让用户只需设定目标，AI 即可自主规划步骤、调用工具并持续运行直至完成任务。\n\n无论是开发者、研究人员，还是希望提升工作效率的普通用户，都能从 AutoGPT 
中受益。开发者可利用其低代码界面快速定制专属智能体；研究人员能基于开源架构探索多智能体协作机制；而非技术背景用户也可直接选用预置的智能体模板，立即投入实际工作场景。\n\nAutoGPT 的技术亮点在于其模块化“积木式”工作流设计——用户通过连接功能块即可构建复杂逻辑，每个块负责单一动作，灵活且易于调试。同时，平台支持本地自托管与云端部署两种模式，兼顾数据隐私与使用便捷性。配合完善的文档和一键安装脚本，即使是初次接触的用户也能在几分钟内启动自己的第一个 AI 智能体。AutoGPT 正致力于降低 AI 应用门槛，让人人都能成为 AI 的创造者与受益者。",183572,"2026-04-20T04:47:55",[13,36,27,14,15],"语言模型",{"id":38,"name":39,"github_repo":40,"description_zh":41,"stars":42,"difficulty_score":10,"last_commit_at":43,"category_tags":44,"status":17},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,"2026-04-05T11:01:52",[14,15,13],{"id":46,"name":47,"github_repo":48,"description_zh":49,"stars":50,"difficulty_score":24,"last_commit_at":51,"category_tags":52,"status":17},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",161147,"2026-04-19T23:31:47",[14,13,36],{"id":54,"name":55,"github_repo":56,"description_zh":57,"stars":58,"difficulty_score":24,"last_commit_at":59,"category_tags":60,"status":17},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 
是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",109154,"2026-04-18T11:18:24",[14,15,13],{"id":62,"github_repo":63,"name":64,"description_en":65,"description_zh":66,"ai_summary_zh":66,"readme_en":67,"readme_zh":68,"quickstart_zh":69,"use_case_zh":70,"hero_image_url":71,"owner_login":72,"owner_name":73,"owner_avatar_url":74,"owner_bio":75,"owner_company":76,"owner_location":77,"owner_email":75,"owner_twitter":75,"owner_website":78,"owner_url":79,"languages":75,"stars":80,"forks":81,"last_commit_at":82,"license":75,"difficulty_score":83,"env_os":84,"env_gpu":85,"env_ram":85,"env_deps":86,"category_tags":89,"github_topics":90,"view_count":24,"oss_zip_url":75,"oss_zip_packed_at":75,"status":17,"created_at":101,"updated_at":102,"faqs":103,"releases":104},9966,"whitelok\u002Fimage-text-localization-recognition","image-text-localization-recognition","A general list of resources to image text localization and recognition  场景文本位置感知与识别的论文资源与实现合集 シーンテキストの位置認識と識別のための論文リソースの要約","image-text-localization-recognition 是一个专注于场景文本定位与识别的开源资源合集，旨在为相关领域的研究者和开发者提供一站式的论文索引与代码实现参考。在复杂的自然场景中，如何精准地从图片里框出文字位置（定位）并准确读出内容（识别），一直是计算机视觉领域的难点。该资源库系统性地整理了来自牛津大学、深圳先进院等顶尖机构的经典成果，涵盖了从早期的深度特征提取到最新的统一网络架构（如 FOTS、CTPN 等）等多种技术方案。\n\n它不仅解决了研究人员在海量文献中难以快速筛选高质量模型的问题，还通过提供对应的代码链接和数据集地址，极大地降低了复现前沿算法的门槛。无论是希望深入了解行业发展的学术研究人员，还是需要将文字识别功能落地到实际产品中的 AI 工程师，都能从中找到极具价值的参考素材。此外，对于想要探索合成数据训练或无约束文本识别等独特技术路径的开发者，这里也收录了诸多具有启发性的创新工作。作为一份持续更新的指南，它帮助用户紧跟“深度学习时代”的场景文本处理趋势，是进入该领域不可或缺的入门向导与进阶宝典。","# Scene Text Localization & Recognition Resources\n\n*Read this institute-wise: [English](README.md), 
[简体中文](README.zh-cn.md).*\n\n*Read this year-wise: [English](README.yearwise.md), [简体中文](README.zh-cn.yearwise.md).*\n\n*Tags: [STL] (Scene Text Localization), [TR] (Text Recognition)*\n\n*[STL] (Scene Text Localization) Detect text area from scene input image*\n\n*[TR] (Text Recognition) Recognize text content*\n\n**Last update: Sep.17 2023**\n\n## 1. Papers & Code\n\n#### Overview\n\n- [2020-arxiv] Text Detection and Recognition in the Wild: A Review [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2006.04305.pdf)\n- [2020-arxiv] Text Recognition in the Wild: A Survey [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2005.03492.pdf)\n- [2020-IJCV] Scene Text Detection and Recognition: The Deep Learning Era [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1811.04256.pdf)\n- [2019-ICCV] What Is Wrong With Scene Text Recognition Model Comparisons? Dataset and Model Analysis [`paper`](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_ICCV_2019\u002Fhtml\u002FBaek_What_Is_Wrong_With_Scene_Text_Recognition_Model_Comparisons_Dataset_ICCV_2019_paper.html) [`code`](https:\u002F\u002Fgithub.com\u002Fclovaai\u002Fdeep-text-recognition-benchmark)\n- [2016-TIP] Text Detection Tracking and Recognition in Video: A Comprehensive Survey [`paper`](http:\u002F\u002Fieeexplore.ieee.org\u002Fapplication\u002Fenterprise\u002Fentconfirmation.jsp?arnumber=7452620&icp=false)\n- [2015-PAMI] Text Detection and Recognition in Imagery: A Survey [`paper`](http:\u002F\u002Flampsrv02.umiacs.umd.edu\u002Fpubs\u002FPapers\u002Fqixiangye-14\u002Fqixiangye-14.pdf)\n- [2014-Front.Comput.Sci] Scene Text Detection and Recognition: Recent Advances and Future Trends [`paper`](http:\u002F\u002Fmc.eistar.net\u002Fuploadfiles\u002FPapers\u002FFCS_TextSurvey_2015.pdf)\n\n#### University of Oxford\n\n- [2020-ECCV][STL][TR] Adaptive Text Recognition through Visual Matching [`paper`](http:\u002F\u002Fwww.ecva.net\u002Fpapers\u002Feccv_2020\u002Fpapers_ECCV\u002Fhtml\u002F2492_ECCV_2020_paper.php) 
[`code`](https:\u002F\u002Fgithub.com\u002FChuhanxx\u002FFontAdaptor)\n- [2018-BMVC][TR] Inductive Visual Localisation: Factorised Training for Superior Generalisation [`paper`](https:\u002F\u002Farxiv.org\u002Fabs\u002F1807.08179)\n- [2016-IJCV][STL][TR] Reading Text in the Wild with Convolutional Neural Networks  [`paper`](http:\u002F\u002Farxiv.org\u002Fabs\u002F1412.1842) [`demo`](http:\u002F\u002Fzeus.robots.ox.ac.uk\u002Ftextsearch\u002F#\u002Fsearch\u002F)  [`homepage`](http:\u002F\u002Fwww.robots.ox.ac.uk\u002F~vgg\u002Fresearch\u002Ftext\u002F)\n- [2016-CVPR][STL] Synthetic Data for Text Localisation in Natural Images [`paper`](http:\u002F\u002Fwww.robots.ox.ac.uk\u002F~vgg\u002Fdata\u002Fscenetext\u002Fgupta16.pdf) [`code`](https:\u002F\u002Fgithub.com\u002Fankush-me\u002FSynthText) [`data`](http:\u002F\u002Fwww.robots.ox.ac.uk\u002F~vgg\u002Fdata\u002Fscenetext\u002F)\n- [2015-ICLR][TR] Deep structured output learning for unconstrained text recognition [`paper`](http:\u002F\u002Farxiv.org\u002Fabs\u002F1412.5903)\n- [2015-PhD Thesis][STL] Deep Learning for Text Spotting\n [`paper`](http:\u002F\u002Fwww.robots.ox.ac.uk\u002F~vgg\u002Fpublications\u002F2015\u002FJaderberg15b\u002Fjaderberg15b.pdf) [`code`](https:\u002F\u002Fbitbucket.org\u002Fjaderberg\u002Feccv2014_textspotting)\n- [2014-ECCV][STL] Deep Features for Text Spotting [`paper`](http:\u002F\u002Fwww.robots.ox.ac.uk\u002F~vgg\u002Fpublications\u002F2014\u002FJaderberg14\u002Fjaderberg14.pdf) [`code`](https:\u002F\u002Fbitbucket.org\u002Fjaderberg\u002Feccv2014_textspotting) [`model`](https:\u002F\u002Fbitbucket.org\u002Fjaderberg\u002Feccv2014_textspotting)\n- [2014-NIPS][TR] Synthetic Data and Artificial Neural Networks for Natural Scene Text Recognition [`paper`](http:\u002F\u002Fwww.robots.ox.ac.uk\u002F~vgg\u002Fpublications\u002F2014\u002FJaderberg14c\u002Fjaderberg14c.pdf)  [`homepage`](http:\u002F\u002Fwww.robots.ox.ac.uk\u002F~vgg\u002Fpublications\u002F2014\u002FJaderberg14c\u002F) 
[`model`](http:\u002F\u002Fwww.robots.ox.ac.uk\u002F~vgg\u002Fresearch\u002Ftext\u002Fmodel_release.tar.gz)\n\n#### Shenzhen Institutes of Advanced Technology\n\n- [2018-arxiv][STL][TR] FOTS: Fast Oriented Text Spotting with a Unified Network [`paper`](https:\u002F\u002Farxiv.org\u002Fabs\u002F1801.01671)\n- [2016-ECCV][STL] CTPN: Detecting Text in Natural Image with Connectionist Text Proposal Network [`paper`](https:\u002F\u002Farxiv.org\u002Fabs\u002F1609.03605) [`code`](https:\u002F\u002Fgithub.com\u002Ftianzhi0549\u002FCTPN)\n- [2016-CVPR][STL] Accurate Text Localization in Natural Image with Cascaded Convolutional Text Network [`paper`](http:\u002F\u002Farxiv.org\u002Fabs\u002F1603.09423)\n- [2016-AAAI][STL] Reading Scene Text in Deep Convolutional Sequences [`paper`](http:\u002F\u002Fwhuang.org\u002Fpapers\u002Fphe2016_aaai.pdf)\n- [2016-TIP][STL] Text-Attentional Convolutional Neural Networks for Scene Text Detection [`paper`](http:\u002F\u002Fwhuang.org\u002Fpapers\u002Fthe2016_tip.pdf)\n- [2016-TIP][STL] Text-Attentional Convolutional Neural Network for Scene Text Detection [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1510.03283.pdf)\n- [2014-ECCV][STL] Robust Scene Text Detection with Convolution Neural Network Induced MSER Trees [`paper`](http:\u002F\u002Fwww.whuang.org\u002Fpapers\u002Fwhuang2014_eccv.pdf)\n\n#### South China University of Technology\n\n- [2021-IJCV][STL] Exploring the Capacity of an Orderless Box Discretization Network for Multi-orientation Scene Text Detection [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1912.09629.pdf) [`code`](https:\u002F\u002Fgithub.com\u002FYuliang-Liu\u002FBox_Discretization_Network)\n- [2021-CVPR][STL] Fourier Contour Embedding for Arbitrary-Shaped Text Detection [`paper`](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2021\u002Fpapers\u002FZhu_Fourier_Contour_Embedding_for_Arbitrary-Shaped_Text_Detection_CVPR_2021_paper.pdf)\n- [2021-CVPR][TR][STL] Implicit Feature Alignment: 
Learn To Convert Text Recognizer to Text Spotter [`paper`](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2021\u002Fpapers\u002FWang_Implicit_Feature_Alignment_Learn_To_Convert_Text_Recognizer_to_Text_CVPR_2021_paper.pdf) [`code`](https:\u002F\u002Fgithub.com\u002FWang-Tianwei\u002FImplicit-feature-alignment)\n- [2020-CVPR][TR] Learn to Augment: Joint Data Augmentation and Network Optimization for Text Recognition [`paper`](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_CVPR_2020\u002Fhtml\u002FLuo_Learn_to_Augment_Joint_Data_Augmentation_and_Network_Optimization_for_CVPR_2020_paper.html) [`code`](https:\u002F\u002Fgithub.com\u002FCanjie-Luo\u002FText-Image-Augmentation)\n- [2020-AAAI][STL][TR] Decoupled Attention Network for Text Recognition [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1912.10205.pdf)\n- [2020-CVPR][STL][TR] ABCNet: Real-time Scene Text Spotting with Adaptive Bezier-Curve Network [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2002.10200.pdf) [`code`](https:\u002F\u002Fgithub.com\u002FYuliang-Liu\u002Fbezier_curve_text_spotting)\n- [2020-IJCV][TR] Separating Content from Style Using Adversarial Learning for Recognizing Text in the Wild [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2001.04189.pdf)\n- [2019-Pattern Recognition][TR] A Multi-Object Rectified Attention Network for Scene Text Recognition [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1901.03003.pdf) [`code`](https:\u002F\u002Fgithub.com\u002FCanjie-Luo\u002FMORAN_v2)\n- [2019-CVPR][TR] Aggregation Cross-Entropy for Sequence Recognition [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1904.08364.pdf) [`code`](https:\u002F\u002Fgithub.com\u002Fsummerlvsong\u002FAggregation-Cross-Entropy)\n- [2019-arxiv][STL] Exploring the Capacity of an Orderless Box Discretization Network for Multi-orientation Scene Text Detection [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1912.09629.pdf) 
[`code`](https:\u002F\u002Fgithub.com\u002FYuliang-Liu\u002FBox_Discretization_Network) [`code`](https:\u002F\u002Fgit.io\u002FTextDet)\n- [2019-CVPR][STL] Tightness-Aware Evaluation Protocol for Scene Text Detection [`paper`](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_CVPR_2019\u002Fhtml\u002FLiu_Tightness-Aware_Evaluation_Protocol_for_Scene_Text_Detection_CVPR_2019_paper.html)\n- [2018-AAAI][STL] Feature Enhancement Network: A Refined Scene Text Detector [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1711.04249.pdf)\n- [2017-arXiv][STL] Detecting Curve Text in the Wild: New Dataset and New Solution [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1712.02170)\n- [2020-arxiv][TR] Adaptive Embedding Gate for Attention-Based Scene Text Recognition [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1908.09475.pdf)\n- [2017-PAMI][TR] Learning Spatial-Semantic Context with Fully Convolutional Recurrent Network for Online Handwritten Chinese Text Recognition [`paper`](http:\u002F\u002Fdiscovery.ucl.ac.uk\u002F1569458\u002F1\u002FTPAMI-2016-08-0656-R2.pdf)\n- [2017-CVPR][STL] Deep Matching Prior Network: Toward Tighter Multi-oriented Text Detection [`paper`](https:\u002F\u002Farxiv.org\u002Fabs\u002F1703.01425)\n- [2016-arXiv][STL] DeepText:A Unified Framework for Text Proposal Generation and Text Detection in Natural Images [`paper`](http:\u002F\u002Farxiv.org\u002Fabs\u002F1605.07314)\n- [2016-IEEE Transactions on Multimedia][STL] A Convolutional Neural Network Based Chinese Text Detection Algorithm Via Text Structure Modeling [`paper`](http:\u002F\u002Fwww2.egr.uh.edu\u002F~zhan2\u002FECE6111_spring2017\u002FA%20Convolutional%20Neural%20Network%20%20Based%20Chinese%20Text%20Detection%20Algorithm%20Via%20Text%20Structure%20Modeling.pdf)\n\n#### Fudan University\n\n- [2022-AAAI][TR] Text Gestalt: Stroke-Aware Scene Text Image Super-resolution [`paper`](https:\u002F\u002Fojs.aaai.org\u002Findex.php\u002FAAAI\u002Farticle\u002Fview\u002F19904) 
[`code`](https:\u002F\u002Fgithub.com\u002FFudanVI\u002FFudanOCR)\n- [2023-MM][TR] Chinese Character Recognition with Augmented Character Profile Matching [`paper`](https:\u002F\u002Fdl.acm.org\u002Fdoi\u002Fabs\u002F10.1145\u002F3503161.3547827) [`code`](https:\u002F\u002Fgithub.com\u002FFudanVI\u002FFudanOCR)\n- [2023-ICCV][TR] Chinese Text Recognition with A Pre-Trained CLIP-Like Model Through Image-IDS Aligning [`paper`](https:\u002F\u002Farxiv.org\u002Fabs\u002F2309.01083) [`code`](https:\u002F\u002Fgithub.com\u002FFudanVI\u002FFudanOCR)\n- [2023-arxiv][STL][TR] Weakly-Supervised Text Instance Segmentation [`paper`](https:\u002F\u002Farxiv.org\u002Fabs\u002F2303.10848) [`code`](https:\u002F\u002Fgithub.com\u002FFudanVI\u002FFudanOCR)\n- [2023-IJCAI][TR] Orientation-Independent Chinese Text Recognition in Scene Images [`paper`](https:\u002F\u002Fwww.ijcai.org\u002Fproceedings\u002F2023\u002F0185.pdf)\n- [2023-IJCAI][TR] TPS++: Attention-Enhanced Thin-Plate Spline for Scene Text Recognition [`paper`](https:\u002F\u002Fwww.ijcai.org\u002Fproceedings\u002F2023\u002F0197.pdf) [`code`](https:\u002F\u002Fgithub.com\u002Fsimplify23\u002FTPS_PP)\n- [2023-IJCAI][STL][TR] Towards Accurate Video Text Spotting with Text-wise Semantic Reasoning [`paper`](https:\u002F\u002Fwww.ijcai.org\u002Fproceedings\u002F2023\u002F0206.pdf) [`code`](https:\u002F\u002Fgithub.com\u002FFudanVI\u002FFudanOCR)\n- [2022-MM][TR] Chinese Character Recognition with Augmented Character Profile Matching [`paper`](https:\u002F\u002Fdl.acm.org\u002Fdoi\u002Fabs\u002F10.1145\u002F3503161.3547827) [`code`](https:\u002F\u002Fgithub.com\u002FFudanVI\u002FFudanOCR)\n- [2022-WACV][TR] Robustly Recognizing Irregular Scene Text by Rectifying Principle Irregularities [`paper`](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FWACV2022\u002Fpapers\u002FXu_Robustly_Recognizing_Irregular_Scene_Text_by_Rectifying_Principle_Irregularities_WACV_2022_paper.pdf)\n- [2021-IJCAI][TR] Zero-Shot Chinese Character 
Recognition with Stroke-Level Decomposition [`paper`](https:\u002F\u002Fwww.ijcai.org\u002Fproceedings\u002F2021\u002F0085.pdf) [`code`](https:\u002F\u002Fgithub.com\u002FFudanVI\u002FFudanOCR)\n- [2022-IJCAI][TR] C3-STISR: Scene Text Image Super-resolution with Triple Clues [`paper`](https:\u002F\u002Fwww.ijcai.org\u002Fproceedings\u002F2022\u002F0238.pdf) [`code`][https:\u002F\u002Fgithub.com\u002Fzhaominyiz\u002FC3-STISR]\n- [2021-CVPR][TR] Scene Text Telescope: Text-Focused Scene Image Super-Resolution [`paper`](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2021\u002Fpapers\u002FChen_Scene_Text_Telescope_Text-Focused_Scene_Image_Super-Resolution_CVPR_2021_paper.pdf)\n- [2020-arxiv][TR] Text Recognition in Real Scenarios with a Few Labeled Samples [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2006.12209.pdf)\n- [2018-CVPR][TR] Edit Probability for Scene Text Recognition [`paper`](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_cvpr_2018\u002Fpapers\u002FBai_Edit_Probability_for_CVPR_2018_paper.pdf)\n- [2017-arXiv][STL] Arbitrary-Oriented Scene Text Detection via Rotation Proposals [`paper`](https:\u002F\u002Farxiv.org\u002Fabs\u002F1703.01086) [`code`](https:\u002F\u002Fgithub.com\u002Fmjq11302010044\u002FRRPN)\n\n#### Huazhong University of Science and Technology\n\n- [2021-CVPR][STL][TR] Scene Text Retrieval via Joint Text Detection and Similarity Learning [`paper`](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2021\u002Fpapers\u002FWang_Scene_Text_Retrieval_via_Joint_Text_Detection_and_Similarity_Learning_CVPR_2021_paper.pdf) [`code`](https:\u002F\u002Fgithub.com\u002Flanfeng4659\u002FSTR-TDSL)\n- [2021-CVPR][STL] MOST: A Multi-Oriented Scene Text Detector With Localization Refinement [`paper`](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2021\u002Fpapers\u002FHe_MOST_A_Multi-Oriented_Scene_Text_Detector_With_Localization_Refinement_CVPR_2021_paper.pdf)\n- [2020-ECCV][TR] AutoSTR: Efficient Backbone 
Search for Scene Text Recognition [`paper`](http:\u002F\u002Fwww.ecva.net\u002Fpapers\u002Feccv_2020\u002Fpapers_ECCV\u002Fhtml\u002F4796_ECCV_2020_paper.php)\n- [2020-AAAI][STL][TR] All You Need Is Boundary: Toward Arbitrary-Shaped Text Spotting [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1911.09550.pdf)\n- [2020-AAAI][STL] Real-time Scene Text Detection with Differentiable Binarization [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1911.08947.pdf) [`code`](https:\u002F\u002Fgithub.com\u002FMhLiao\u002FDB)\n- [2020-ECCV][STL][TR] Mask TextSpotter V3: Segmentation Proposal Network for Robust Scene Text Spotting [`paper`](http:\u002F\u002Fwww.ecva.net\u002Fpapers\u002Feccv_2020\u002Fpapers_ECCV\u002Fhtml\u002F1436_ECCV_2020_paper.php) [`code`](https:\u002F\u002Fgithub.com\u002FMhLiao\u002FMaskTextSpotterV3)\n- [2019-PAMI][TR] ASTER: An Attentional Scene Text Recognizer with Flexible Rectification [`paper`](https:\u002F\u002Fieeexplore.ieee.org\u002Fdocument\u002F8395027) [`code`](https:\u002F\u002Fgithub.com\u002Fayumiymk\u002Faster.pytorch)\n- [2019-AAAI][TR] Scene Text Recognition from Two-Dimensional Perspective [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1809.06508.pdf)\n- [2019-PAMI][STL] Gliding vertex on the horizontal bounding box for multi-oriented object detection [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1911.09358.pdf) [`code`](https:\u002F\u002Fgithub.com\u002FMingtaoFu\u002Fgliding_vertex)\n- [2019-ICCV][TR] Symmetry-Constrained Rectification Network for Scene Text Recognition [`paper`](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_ICCV_2019\u002Fhtml\u002FYang_Symmetry-Constrained_Rectification_Network_for_Scene_Text_Recognition_ICCV_2019_paper.html)\n- [2018-arxiv][STL] TextField: Learning A Deep Direction Field for Irregular Scene Text Detection [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1812.01393.pdf) [`code`](https:\u002F\u002Fgithub.com\u002FYukangWang\u002FTextField)\n- [2018-ECCV][TR][STL] Mask 
TextSpotter: An End-to-End Trainable Neural Network for Spotting Text with Arbitrary Shapes [`paper`](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_ECCV_2018\u002Fpapers\u002FPengyuan_Lyu_Mask_TextSpotter_An_ECCV_2018_paper.pdf)\n- [2018-ICIP][STL] Feature Fusion Network for Scene Text Detection [`paper`](https:\u002F\u002Fieeexplore.ieee.org\u002Fdocument\u002F8395194\u002F)\n- [2018-CVPR][STL] Multi-Oriented Scene Text Detection via Corner Localization and Region Segmentation [`paper`](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_cvpr_2018\u002Fpapers\u002FLyu_Multi-Oriented_Scene_Text_CVPR_2018_paper.pdf)\n- [2018-CVPR][STL] Rotation-sensitive Regression for Oriented Scene Text Detection [`paper`](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_cvpr_2018\u002Fpapers\u002FLiao_Rotation-Sensitive_Regression_for_CVPR_2018_paper.pdf)\n- [2018-TIP][STL] TextBoxes++: A Single-Shot Oriented Scene Text Detector [`paper`](https:\u002F\u002Farxiv.org\u002Fabs\u002F1801.02765) [`code`](https:\u002F\u002Fgithub.com\u002FMhLiao\u002FTextBoxes_plusplus)\n- [2017-AAAI][STL] TextBoxes: A Fast TextDetector with a Single Deep Neural Network [`paper`](https:\u002F\u002Farxiv.org\u002Fabs\u002F1611.06779) [`code`](https:\u002F\u002Fgithub.com\u002FMhLiao\u002FTextBoxes)\n- [2017-CVPR][STL] Detecting Oriented Text in Natural Images by Linking Segments [`paper`](http:\u002F\u002Fmclab.eic.hust.edu.cn\u002FUpLoadFiles\u002FPapers\u002FSegLink_CVPR17.pdf) [`code`](https:\u002F\u002Fgithub.com\u002Fbgshih\u002Fseglink)\n- [2016-CVPR][TR] Robust scene text recognition with automatic rectification [`paper`](http:\u002F\u002Farxiv.org\u002Fpdf\u002F1603.03915v2.pdf)\n- [2016-arXiv][STL] Scene Text Detection via Holistic, Multi-Channel Prediction [`paper`](https:\u002F\u002Farxiv.org\u002Fabs\u002F1606.09002)\n- [2016-CVPR][STL] Multi-oriented text detection with fully convolutional networks    
[`paper`](http:\u002F\u002Fmclab.eic.hust.edu.cn\u002FUpLoadFiles\u002FPapers\u002FTextDectionFCN_CVPR16.pdf)\n- [2015-PAMI][TR] An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition [`paper`](http:\u002F\u002Farxiv.org\u002Fpdf\u002F1507.05717v1.pdf) [`code`](http:\u002F\u002Fmclab.eic.hust.edu.cn\u002F~xbai\u002FCRNN\u002Fcrnn_code.zip) [`code`](https:\u002F\u002Fgithub.com\u002Fbgshih\u002Fcrnn)\n- [2014-CVPR][TR] Strokelets: A Learned Multi-Scale Representation for Scene Text Recognition [`paper`](https:\u002F\u002Fwww.cv-foundation.org\u002Fopenaccess\u002Fcontent_cvpr_2014\u002Fpapers\u002FYao_Strokelets_A_Learned_2014_CVPR_paper.pdf)\n\n#### Universitat Autònoma de Barcelona\n\n- [2019-ICCV][STL][TR] Scene Text Visual Question Answering [`paper`](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_ICCV_2019\u002Fhtml\u002FBiten_Scene_Text_Visual_Question_Answering_ICCV_2019_paper.html)\n- [2018-ECCV][STL] Single Shot Scene Text Retrieval [`paper`](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_ECCV_2018\u002Fpapers\u002FLluis_Gomez_Single_Shot_Scene_ECCV_2018_paper.pdf)\n- [2017-arXiv][STL] Improving Text Proposal for Scene Images with Fully Convolutional Networks [`paper`](https:\u002F\u002Farxiv.org\u002Fabs\u002F1702.05089)\n- [2016-arXiv][STL] TextProposals: a Text-specific Selective Search Algorithm for Word Spotting in the Wild [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1604.02619.pdf) [`code`](https:\u002F\u002Fgithub.com\u002Flluisgomez\u002FTextProposals)\n- [2015-ICDAR][STL] Object Proposals for Text Extraction in the Wild [`paper`](http:\u002F\u002Farxiv.org\u002Fabs\u002F1509.02317) [`code`](https:\u002F\u002Fgithub.com\u002Flluisgomez\u002FTextProposals)\n- [2014-PAMI][TR] Word Spotting and Recognition with Embedded Attributes [`paper`](http:\u002F\u002Fwww.cvc.uab.es\u002F~afornes\u002Fpubli\u002Fjournals\u002F2014_PAMI_Almazan.pdf) 
[`homepage`](http:\u002F\u002Fwww.cvc.uab.es\u002F~almazan\u002Findex\u002Fprojects\u002Fwords-att\u002Findex.html) [`code`](https:\u002F\u002Fgithub.com\u002Falmazan\u002Fwatts)\n\n#### Stanford University\n\n- [2012-ICPR][TR] End-to-End Text Recognition with Convolutional Neural Networks [`paper`](http:\u002F\u002Fwww.cs.stanford.edu\u002F~acoates\u002Fpapers\u002Fwangwucoatesng_icpr2012.pdf) [`code`](http:\u002F\u002Fcs.stanford.edu\u002Fpeople\u002Ftwangcat\u002FICPR2012_code\u002FSceneTextCNN_demo.tar) [`SVHN Dataset`](http:\u002F\u002Fufldl.stanford.edu\u002Fhousenumbers\u002F)\n- [2012-PhD Thesis][TR] End-to-End Text Recognition with Convolutional Neural Networks [`paper`](http:\u002F\u002Fcs.stanford.edu\u002Fpeople\u002Fdwu4\u002FHonorThesis.pdf)\n\n#### Seoul National University\n\n- [2017-AAAI][STL][TR] Detection and Recognition of Text Embedding in Online Images via Neural Context Models [`paper`](https:\u002F\u002Fgithub.com\u002Fcmkang\u002FCTSN\u002Fblob\u002Fmaster\u002Faaai2017_cameraready.pdf)\n\n#### Megvii Technology Inc: Face++\n\n- [2020-CVPR][TR] On Vocabulary Reliance in Scene Text Recognition [`paper`](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_CVPR_2020\u002Fhtml\u002FWan_On_Vocabulary_Reliance_in_Scene_Text_Recognition_CVPR_2020_paper.html)\n- [2020-AAAI][STL][TR] TextScanner: Reading Characters in Order for Robust Scene Text Recognition [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1912.12422.pdf)\n- [2017-CVPR][STL] EAST: An Efficient and Accurate Scene Text Detector [`paper`](https:\u002F\u002Farxiv.org\u002Fabs\u002F1704.03155) [`code`](https:\u002F\u002Fgithub.com\u002Fargman\u002FEAST) [`code with improvement`](https:\u002F\u002Fgithub.com\u002Fhuoyijie\u002FAdvancedEAST)\n\n#### Institute of Automation, Chinese Academy of Sciences\n\n- [2020-IJCV][STL][TR] Residual Dual Scale Scene Text Spotting by Fusing Bottom-Up and Top-Down Processing 
[`paper`](https:\u002F\u002Flink.springer.com\u002Farticle\u002F10.1007\u002Fs11263-020-01388-x)\n- [2019-CVPR][TR] Sequence-to-Sequence Domain Adaptation Networkfor Robust Text Image Recognition [`paper`](https:\u002F\u002Fieeexplore.ieee.org\u002Fabstract\u002Fdocument\u002F8953495)\n- [2019-ICCV][STL][TR] TextDragon: An End-to-End Framework for Arbitrary Shaped Text Spotting [`paper`](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_ICCV_2019\u002Fhtml\u002FFeng_TextDragon_An_End-to-End_Framework_for_Arbitrary_Shaped_Text_Spotting_ICCV_2019_paper.html)\n- [2018-arxiv][TR] NRTR: A No-Recurrence Sequence-to-Sequence Model For Scene Text Recognition [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1806.00926.pdf) [`code`](https:\u002F\u002Fgithub.com\u002FBelval\u002FNRTR)\n- [2018-arxiv][TR] SCAN: Sliding Convolutional Attention Network for Scene Text Recognition [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1806.00578.pdf) [`code`](https:\u002F\u002Fgithub.com\u002Fnameful\u002FSCAN)\n- [2018-arxiv][TR] Recurrent Calibration Network for Irregular Text Recognition [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1812.07145.pdf)  \n- [2017-arxiv][TR] Scene Text Recognition with Sliding Convolutional Character Models [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1709.01727.pdf) [`code`](https:\u002F\u002Fgithub.com\u002Flsvih\u002FSliding-Convolution)\n- [2017-arXiv][STL] Deep Direct Regression for Multi-Oriented Scene Text Detection [`paper`](https:\u002F\u002Farxiv.org\u002Fabs\u002F1703.08289)\n- [2017-IAPR][STL] Scene Text Detection with Novel Superpixel Based Character Candidate Extraction [`paper`](https:\u002F\u002Fieeexplore.ieee.org\u002Fabstract\u002Fdocument\u002F8270087)\n\n#### University of California, San Diego\n\n- [2016-CVPR][TR] Recursive Recurrent Nets with Attention Modeling for OCR in the Wild [`paper`](http:\u002F\u002Farxiv.org\u002Fpdf\u002F1603.03101v1.pdf)\n\n#### University of California, Santa Cruz\n\n- 
[2017-arXiv][STL] Cascaded Segmentation-Detection Networks for Word-Level Text Spotting [`paper`](https:\u002F\u002Farxiv.org\u002Fabs\u002F1704.00834)\n\n#### Cornell University\n\n- [2016-arXiv][STL][TR] COCO-Text: Dataset and Benchmark for Text Detection and Recognition in Natural Images [`paper`](http:\u002F\u002Fvision.cornell.edu\u002Fse3\u002Fwp-content\u002Fuploads\u002F2016\u002F01\u002F1601.07140v1.pdf)\n\n#### Pennsylvania State University\n\n- [2017-WACV][STL] TextContourNet: A Flexible and Effective Framework for Improving Scene Text Detection Architecture With a Multi-Task Cascade [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1809.03050.pdf)\n- [2016-PhD Thesis][STL] Context Modeling for Semantic Text Matching and Scene Text Detection [`paper`](https:\u002F\u002Fetda.libraries.psu.edu\u002Fcatalog\u002Fzw12z528p)\n\n#### University of Science and Technology Beijing\n\n- [2021-ICCV][STL] Adaptive Boundary Proposal Network for Arbitrary Shape Text Detection [`paper`](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FICCV2021\u002Fpapers\u002FZhang_Adaptive_Boundary_Proposal_Network_for_Arbitrary_Shape_Text_Detection_ICCV_2021_paper.pdf) [`code`](https:\u002F\u002Fgithub.com\u002FGXYM\u002FTextBPN)\n- [2020-CVPR][STL] Deep Relational Reasoning Graph Network for Arbitrary Shape Text Detection [`paper`](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_CVPR_2020\u002Fhtml\u002FZhang_Deep_Relational_Reasoning_Graph_Network_for_Arbitrary_Shape_Text_Detection_CVPR_2020_paper.html)\n- [2017-arxiv][TR] AdaDNNs: Adaptive Ensemble of Deep Neural Networks for Scene Text Recognition [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1710.03425.pdf)\n- [2016-IJCAI][STL] Scene Text Detection in Video by Learning Locally and Globally [`paper`](https:\u002F\u002Fwww.ijcai.org\u002FProceedings\u002F16\u002FPapers\u002F376.pdf)\n- [2014-PAMI][TR] Robust Text Detection in Natural Scene Images 
[`paper`](http:\u002F\u002Fieeexplore.ieee.org\u002Fstamp\u002Fstamp.jsp?arnumber=6613482)\n\n#### Pohang University of Science and Technology\n\n- [2016-CVPR][STL] CannyText Detector: Fast and Robust Scene Text Localization Algorithm [`paper`](http:\u002F\u002Fieeexplore.ieee.org\u002Fdocument\u002F7780757\u002F)\n\n#### École d'Ingénieurs en Informatique\n\n- [2016-IJDAR][STL] TextCatcher: a method to detect curved and challenging text in natural scenes [`paper`](https:\u002F\u002Flink.springer.com\u002Farticle\u002F10.1007\u002Fs10032-016-0264-4)\n\n#### České vysoké učení technické v Praze. Czech Technical University\n\n- [2018-ACCV][STL][TR] E2E-MLT - an Unconstrained End-to-End Method for Multi-Language Scene Text [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1801.09919.pdf) [`code`](https:\u002F\u002Fgithub.com\u002FMichalBusta\u002FE2E-MLT)\n- [2017-ICCV][STL][TR] Deep TextSpotter: An End-to-End Trainable Scene Text Localization and\nRecognition Framework [`peper`](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_ICCV_2017\u002Fpapers\u002FBusta_Deep_TextSpotter_An_ICCV_2017_paper.pdf) [`code`](https:\u002F\u002Fgithub.com\u002FMichalBusta\u002FDeepTextSpotter)\n- [2015-PAMI][STL][TR] Real-time Lexicon-free Scene Text Localization and Recognition [`paper`](http:\u002F\u002Fieeexplore.ieee.org\u002Fstamp\u002Fstamp.jsp?arnumber=7313008)\n- [2015-ICCV][STL] FASText: Efficient unconstrained scene text detector [`paper`](https:\u002F\u002Fpdfs.semanticscholar.org\u002F2131\u002F106318d4674bc9260e671c9f427bfc3f1029.pdf) [`code`](https:\u002F\u002Fgithub.com\u002FMichalBusta\u002FFASText)\n- [2012-CVPR][STL][TR] Real-time scene text localization and recognition [`paper`](http:\u002F\u002Fcmp.felk.cvut.cz\u002F~matas\u002Fpapers\u002Fneumann-2012-rt_text-cvpr.pdf) [`code`](http:\u002F\u002Fdocs.opencv.org\u002F3.0-beta\u002Fmodules\u002Ftext\u002Fdoc\u002Ferfilter.html)\n\n#### Google Inc\n\n- [2019-ICCV][STL] Towards Unconstrained End-to-End Text 
Spotting [`paper`](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_ICCV_2019\u002Fhtml\u002FQin_Towards_Unconstrained_End-to-End_Text_Spotting_ICCV_2019_paper.html)\n- [2013-ICCV][STL][TR] Photo OCR: Reading Text in Uncontrolled Conditions [`paper`](https:\u002F\u002Fai2-s2-pdfs.s3.amazonaws.com\u002F31a8\u002F803d7e2618bfa44c472d003055bb5961b9de.pdf)\n\n#### Microsoft Inc\n\n- [2010-CVPR][STL] SWT: Detecting Text in Natural Scenes with Stroke Width Transform [`paper`](http:\u002F\u002Fwww.math.tau.ac.il\u002F~turkel\u002Fimagepapers\u002Ftext_detection.pdf) [`code`](https:\u002F\u002Fgithub.com\u002Faperrau\u002FDetectText)\n\n#### Samsung R&D Institute China\n\n- [2019-CVPR][STL] Arbitrary Shape Scene Text Detection With Adaptive Text Region Representation [`paper`](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_CVPR_2019\u002Fhtml\u002FWang_Arbitrary_Shape_Scene_Text_Detection_With_Adaptive_Text_Region_Representation_CVPR_2019_paper.html)\n- [2017-arXiv][STL] R2CNN: Rotational Region CNN for Orientation Robust Scene Text Detection [`paper`](https:\u002F\u002Farxiv.org\u002Fftp\u002Farxiv\u002Fpapers\u002F1706\u002F1706.09579.pdf)\n- [2017-IAPR][STL] Deep Residual Text Detection Network for Scene Text [`paper`](https:\u002F\u002Fieeexplore.ieee.org\u002Fdocument\u002F8270068)\n\n#### Vicarious FPC Inc\n\n- [2016-NIPS][TR] Generative Shape Models: Joint Text Recognition and Segmentation with Very Little Training Data [`paper`](https:\u002F\u002Farxiv.org\u002Fabs\u002F1611.02788)\n\n#### Chinese State Key Laboratory of Management and Control for Complex Systems\n\n- [2013-CVPR][TR] Scene Text Recognition using Part-based Tree-structured Character Detection [`paper`](http:\u002F\u002Fwww.cv-foundation.org\u002Fopenaccess\u002Fcontent_cvpr_2013\u002Fpapers\u002FShi_Scene_Text_Recognition_2013_CVPR_paper.pdf)\n\n#### Stanford University\n\n- [2012-ICPR][TR] End-to-End Text Recognition with CNN 
[`paper`](http:\u002F\u002Fwww.cs.stanford.edu\u002F~acoates\u002Fpapers\u002Fwangwucoatesng_icpr2012.pdf) [`code`](http:\u002F\u002Fcs.stanford.edu\u002Fpeople\u002Ftwangcat\u002FICPR2012_code\u002FSceneTextCNN_demo.tar)\n\n#### Visual Computing Department, Institute for Infocomm Research\n\n- [2017-ICCV][STL] WeText: Scene Text Detection under Weak Supervision [`paper`](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_ICCV_2017\u002Fpapers\u002FTian_WeText_Scene_Text_ICCV_2017_paper.pdf)\n\n#### University of Florida\n\n- [2017-ICCV][STL] Single Shot Text Detector with Regional Attention [`paper`](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_ICCV_2017\u002Fpapers\u002FHe_Single_Shot_Text_ICCV_2017_paper.pdf) [`code`](https:\u002F\u002Fgithub.com\u002FBestSonny\u002FSSTD)\n\n#### University of Southern California\n\n- [2017-ICCV][STL] Self-organized Text Detection with Minimal Post-processing via Border Learning [`paper`](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_ICCV_2017\u002Fpapers\u002FWu_Self-Organized_Text_Detection_ICCV_2017_paper.pdf)\n\n#### Hikvision Research Institute\n\n- [2021-AAAI][STL][TR] MANGO: A Mask Attention Guided One-Stage Scene Text Spotter [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2012.04350.pdf)\n- [2020-AAAI][STL][TR] Text Perceptron: Towards End-to-End Arbitrary-Shaped Text Spotting [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2002.06820.pdf)\n- [2018-CVPR][TR] AON: Towards Arbitrarily-Oriented Text Recognition [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1711.04226.pdf) [`code`](https:\u002F\u002Fgithub.com\u002Fhuizhang0110\u002FAON)\n- [2017-ICCV][TR] Focusing Attention: Towards Accurate Text Recognition in Natural Images [`paper`](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_ICCV_2017\u002Fpapers\u002FCheng_Focusing_Attention_Towards_ICCV_2017_paper.pdf)\n\n#### University of Adelaide\n\n- [2019-AAAI][TR] Show, Attend and Read: A Simple and Strong Baseline for Irregular Text 
Recognition [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1811.00751.pdf) [`code`](https:\u002F\u002Fgithub.com\u002FPay20Y\u002FSAR_TF)\n- [2017-ICCV][STL][TR] Towards End-to-end Text Spotting with Convolutional Recurrent Neural Networks [`paper`](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_ICCV_2017\u002Fpapers\u002FLi_Towards_End-To-End_Text_ICCV_2017_paper.pdf)\n\n#### City University of New York\n\n- [2017-CVPR][STL] Unambiguous Text Localization and Retrieval for Cluttered Scenes [`paper`](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_cvpr_2017\u002Fpapers\u002FRong_Unambiguous_Text_Localization_CVPR_2017_paper.pdf)\n\n#### The University of Hong Kong\n\n- [2020-ECCV][STL][TR] AE TextSpotter: Learning Visual and Linguistic Representation for Ambiguous Text Spotting [`paper`](http:\u002F\u002Fwww.ecva.net\u002Fpapers\u002Feccv_2020\u002Fpapers_ECCV\u002Fhtml\u002F2183_ECCV_2020_paper.php)\n- [2018-AAAI][TR] Char-Net: A Character-Aware Neural Network for Distorted Scene Text [`paper`](http:\u002F\u002Fwww.visionlab.cs.hku.hk\u002Fpublications\u002Fwliu_aaai18.pdf)\n\n#### Zhejiang University\n\n- [2021-TIP][STL][TR] FREE: A Fast and Robust End-to-End Video Text Spotter [`paper`](https:\u002F\u002Fieeexplore.ieee.org\u002Fstamp\u002Fstamp.jsp?tp=&arnumber=9266586)\n- [2020-arxiv][TR] Refined Gate: A Simple and Effective Gating Mechanism for Recurrent Units [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2002.11338.pdf)\n- [2018-AAAI][STL] PixelLink: Detecting Scene Text via Instance Segmentation [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1801.01315.pdf)\n\n#### University of Potsdam\n\n- [2018-AAAI][STL][TR] SEE: Towards Semi-Supervised End-to-End Scene Text Recognition [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1712.05404.pdf) [`code`](https:\u002F\u002Fgithub.com\u002FBartzi\u002Fsee)\n\n#### Arizona State University\n\n- [2018-AAAI][TR] SqueezedText: A Real-time Scene Text Recognition by Binary 
Convolutional Encoder-decoder Network [`paper`](https:\u002F\u002Fpdfs.semanticscholar.org\u002F9061\u002F47e6eb8e963d9751dda18fb540ed7faeb9fb.pdf)\n\n#### Stevens Institute of Technology\n\n- [2018-CVPR][STL] Geometry-Aware Scene Text Detection with Instance Transformation Network [`paper`](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_cvpr_2018\u002Fpapers\u002FWang_Geometry-Aware_Scene_Text_CVPR_2018_paper.pdf)\n\n#### Nanyang Technological University\n\n- [2020-IJCV][STL] Bottom-Up Scene Text Detection with Markov Clustering Networks [`paper`](https:\u002F\u002Flink.springer.com\u002Farticle\u002F10.1007\u002Fs11263-020-01298-y)\n- [2020-AAAI][STL][TR] GTC: Guided Training of CTC Towards Efficient and Accurate Scene Text Recognition [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2002.01276.pdf)\n- [2019-ICCV][STL][TR] GA-DAN: Geometry-Aware Domain Adaptation Network for Scene Text Detection and Recognition [`paper`](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_ICCV_2019\u002Fhtml\u002FZhan_GA-DAN_Geometry-Aware_Domain_Adaptation_Network_for_Scene_Text_Detection_and_ICCV_2019_paper.html)\n- [2019-CVPR][TR] ESIR: End-To-End Scene Text Recognition via Iterative Image Rectification [`paper`](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_CVPR_2019\u002Fhtml\u002FZhan_ESIR_End-To-End_Scene_Text_Recognition_via_Iterative_Image_Rectification_CVPR_2019_paper.html)\n- [2019-CVPR][STL] Towards Robust Curve Text Detection With Conditional Spatial Expansion [`paper`](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_CVPR_2019\u002Fhtml\u002FLiu_Towards_Robust_Curve_Text_Detection_With_Conditional_Spatial_Expansion_CVPR_2019_paper.html)\n- [2018-ECCV][STL] Verisimilar Image Synthesis for Accurate Detection and Recognition of Texts in Scenes [`paper`](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_ECCV_2018\u002Fpapers\u002FFangneng_Zhan_Verisimilar_Image_Synthesis_ECCV_2018_paper.pdf)\n- [2018-ECCV][STL] Accurate Scene Text Detection through 
Border Semantics Awareness and Bootstrapping [`paper`](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_ECCV_2018\u002Fpapers\u002FChuhui_Xue_Accurate_Scene_Text_ECCV_2018_paper.pdf)\n- [2018-ECCV][STL] Using Object Information for Spotting Text [`paper`](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_ECCV_2018\u002Fpapers\u002FShitala_Prasad_Using_Object_Information_ECCV_2018_paper.pdf)\n- [2018-CVPR][STL] Learning Markov Clustering Networks for Scene Text Detection [`paper`](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_cvpr_2018\u002Fpapers\u002FLiu_Learning_Markov_Clustering_CVPR_2018_paper.pdf)\n\n#### Alibaba Group\n\n- [2018-ICPR][STL][TR] A Novel Integrated Framework for Learning both Text Detection and Recognition [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1811.08611.pdf)\n- [2018-IJCAI][STL] IncepText: A New Inception-Text Module with Deformable PSROI Pooling for Multi-Oriented Scene Text Detection [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1805.01167.pdf)\n\n#### Chinese Academy of Sciences\n\n- [2020-CVPR][STL][TR] Multi-Modal Graph Neural Network for Joint Reasoning on Vision and Scene Text [`paper`](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_CVPR_2020\u002Fhtml\u002FGao_Multi-Modal_Graph_Neural_Network_for_Joint_Reasoning_on_Vision_and_CVPR_2020_paper.html)\n- [2018-ICIP][STL] Focal Text: An Accurate Text Detection With Focal Loss [`paper`](https:\u002F\u002Fieeexplore.ieee.org\u002Fstamp\u002Fstamp.jsp?tp=&arnumber=8451241)\n- [2018-ICIP][TR] Dense Chained Attention Network for Scene Text Recognition [`paper`](https:\u002F\u002Fieeexplore.ieee.org\u002Fstamp\u002Fstamp.jsp?tp=&arnumber=8451273)\n\n#### University of Cambridge\n\n- [2018-ECCV][TR] Synthetically Supervised Feature Learning for Scene Text Recognition [`paper`](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_ECCV_2018\u002Fpapers\u002FYang_Liu_Synthetically_Supervised_Feature_ECCV_2018_paper.pdf)\n\n#### Peking University\n\n- 
[2021-NIPS][STL] CentripetalText: An Efficient Text Instance Representation for Scene Text Detection [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2107.05945.pdf) [`code`](https:\u002F\u002Fgithub.com\u002Fshengtao96\u002FCentripetalText)\n- [2020-ICASSP][TR] A New Perspective for Flexible Feature Gathering in Scene Text Recognition Via Character Anchor Pooling [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2002.03509.pdf)\n- [2020-ICASSP][STL] All you need is a second look: Towards Tighter Arbitrary shape text detection [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2004.12436.pdf)\n- [2019-WACV][STL] Mask R-CNN with Pyramid Attention Network for Scene Text Detection [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1811.09058.pdf)\n- [2018-ECCV][STL] TextSnake: A Flexible Representation for Detecting Text of Arbitrary Shapes [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1807.01544.pdf) [`code`](https:\u002F\u002Fgithub.com\u002Fprincewang1994\u002FTextSnake.pytorch)\n\n#### SenseTime Research\n\n- [2021-WACV][STL] Disentangled Contour Learning for Quadrilateral Text Detection [`paper`](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FWACV2021\u002Fpapers\u002FBi_Disentangled_Contour_Learning_for_Quadrilateral_Text_Detection_WACV_2021_paper.pdf) [`code`](https:\u002F\u002Fgithub.com\u002FSakuraRiven\u002FDCLNet)\n- [2020-ECCV][TR] RobustScanner: Dynamically Enhancing Positional Clues for Robust Text Recognition [`paper`](http:\u002F\u002Fwww.ecva.net\u002Fpapers\u002Feccv_2020\u002Fpapers_ECCV\u002Fhtml\u002F3160_ECCV_2020_paper.php)\n- [2020-ECCV][TR] Scene Text Image Super-resolution in the wild [`paper`](http:\u002F\u002Fwww.ecva.net\u002Fpapers\u002Feccv_2020\u002Fpapers_ECCV\u002Fhtml\u002F1186_ECCV_2020_paper.php)\n- [2019-arxiv][STL] Pyramid Mask Text Detector [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1903.11800.pdf)\n- [2019-ICCV][STL] Geometry Normalization Networks for Accurate Scene Text Detection 
[`paper`](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_ICCV_2019\u002Fhtml\u002FXu_Geometry_Normalization_Networks_for_Accurate_Scene_Text_Detection_ICCV_2019_paper.html)\n- [2018-BMVC][STL] Boosting up Scene Text Detectors with Guided CNN [`paper`](http:\u002F\u002Fbmvc2018.org\u002Fcontents\u002Fpapers\u002F0633.pdf)\n\n#### Naver Clova AI Research\n\n- [2020-ECCV][STL] Character Region Attention For Text Spotting [`paper`](http:\u002F\u002Fwww.ecva.net\u002Fpapers\u002Feccv_2020\u002Fpapers_ECCV\u002Fhtml\u002F6775_ECCV_2020_paper.php)\n- [2019-CVPR][STL][TR] Character Region Awareness for Text Detection [`paper`](https:\u002F\u002Farxiv.org\u002Fabs\u002F1904.01941) [`code`](https:\u002F\u002Fgithub.com\u002Fclovaai\u002FCRAFT-pytorch)\n\n#### Baidu\n\n- [2020-arxiv][STL][TR] PP-OCR: A Practical Ultra Lightweight OCR System [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2009.09941.pdf)\n- [2019-ICCV][STL][TR] Chinese Street View Text: Large-Scale Chinese Text Reading With Partially Supervised Learning [`paper`](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_ICCV_2019\u002Fhtml\u002FSun_Chinese_Street_View_Text_Large-Scale_Chinese_Text_Reading_With_Partially_ICCV_2019_paper.html)\n- [2019-CVPR][STL] Look More Than Once: An Accurate Detector for Text of Arbitrary Shapes [`paper`](https:\u002F\u002Farxiv.org\u002Fabs\u002F1904.06535)\n- [2018-arxiv][STL] Detecting Text in the Wild with Deep Character Embedding Network [`paper`](https:\u002F\u002Farxiv.org\u002Fabs\u002F1801.01671)\n- [2018-ACCV][STL][TR] TextNet: Irregular Text Reading from Images with an End-to-End Trainable Network [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1812.09900.pdf)\n\n#### University of Adelaide\n\n- [2018-CVPR][STL][TR] An End-to-End TextSpotter with Explicit Alignment and Attention [`paper`](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_cvpr_2018\u002Fpapers\u002FHe_An_End-to-End_TextSpotter_CVPR_2018_paper.pdf) 
[`code`](https:\u002F\u002Fgithub.com\u002Ftonghe90\u002Ftextspotter)\n\n#### Nanjing University\n\n- [2020-BMVC][TR] Robust Scene Text Recognition Through Adaptive Image Enhancement [`paper`](https:\u002F\u002Fwww.bmvc2020-conference.com\u002Fassets\u002Fpapers\u002F0257.pdf)\n- [2019-ICCV][STL] Efficient and Accurate Arbitrary-Shaped Text Detection With Pixel Aggregation Network [`paper`](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_ICCV_2019\u002Fhtml\u002FWang_Efficient_and_Accurate_Arbitrary-Shaped_Text_Detection_With_Pixel_Aggregation_Network_ICCV_2019_paper.html) [`code`](https:\u002F\u002Fgithub.com\u002FWenmuZhou\u002FPAN.pytorch)\n- [2019-CVPR][STL] Shape Robust Text Detection With Progressive Scale Expansion Network [`paper`](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_CVPR_2019\u002Fhtml\u002FWang_Shape_Robust_Text_Detection_With_Progressive_Scale_Expansion_Network_CVPR_2019_paper.html) [`code`](https:\u002F\u002Fgithub.com\u002Fwhai362\u002FPSENet)\n\n#### The Chinese University of Hong Kong\n\n- [2022-AAAI][TR] Context-based Contrastive Learning for Scene Text Recognition [`paper`](https:\u002F\u002Fwww.cse.cuhk.edu.hk\u002F~byu\u002Fpapers\u002FC139-AAAI2022-ConCLR.pdf)\n- [2019-CVPR][STL] Learning Shape-Aware Embedding for Scene Text Detection [`paper`](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_CVPR_2019\u002Fhtml\u002FTian_Learning_Shape-Aware_Embedding_for_Scene_Text_Detection_CVPR_2019_paper.html)\n\n#### Malong Technologies\n\n- [2019-ICCV][STL][TR] Convolutional Character Networks [`paper`](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_ICCV_2019\u002Fhtml\u002FXing_Convolutional_Character_Networks_ICCV_2019_paper.html) [`code`](https:\u002F\u002Fgithub.com\u002FMalongTech\u002Fresearch-charnet)\n\n#### University of Rochester\n\n- [2019-ICCV][TR] Large-Scale Tag-Based Font Retrieval With Generative Feature Learning 
[`paper`](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_ICCV_2019\u002Fhtml\u002FChen_Large-Scale_Tag-Based_Font_Retrieval_With_Generative_Feature_Learning_ICCV_2019_paper.html)\n\n#### Facebook AI Research\n\n- [2021-CVPR][STL][TR] TextOCR: Towards Large-Scale End-to-End Reasoning for Arbitrary-Shaped Scene Text [`paper`](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2021\u002Fpapers\u002FSingh_TextOCR_Towards_Large-Scale_End-to-End_Reasoning_for_Arbitrary-Shaped_Scene_Text_CVPR_2021_paper.pdf) [`code`](https:\u002F\u002Ftextvqa.org\u002Ftextocr\u002Fcode)\n- [2020-CVPR][STL][TR] Iterative Answer Prediction with Pointer-Augmented Multimodal Transformers for TextVQA [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1911.06258.pdf)\n- [2018-arxiv][STL] Improving Rotated Text Detection with Rotation Region Proposal Networks [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1811.07031.pdf)\n\n#### University of Maryland\n\n- [2020-WACV][TR] Adapting Style and Content for Attended Text Sequence Recognition [`paper`](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_WACV_2020\u002Fpapers\u002FSchwarcz_Adapting_Style_and_Content_for_Attended_Text_Sequence_Recognition_WACV_2020_paper.pdf)\n\n#### Penta-AI\n\n- [2020-WACV][STL] It’s All About The Scale - Efficient Text Detection Using Adaptive Scaling [`paper`](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_WACV_2020\u002Fpapers\u002FRichardson_Its_All_About_The_Scale_-_Efficient_Text_Detection_Using_WACV_2020_paper.pdf)\n\n#### Central China Normal University\n\n- [2020-ECCV][STL][TR] PlugNet: Degradation Aware Scene Text Recognition Supervised by a Pluggable Super-Resolution Unit [`paper`](http:\u002F\u002Fwww.ecva.net\u002Fpapers\u002Feccv_2020\u002Fpapers_ECCV\u002Fhtml\u002F2318_ECCV_2020_paper.php)\n\n#### Tencent\n\n- [2022-AAAI][TR] Perceiving Stroke-Semantic Context: Hierarchical Contrastive Learning for Robust Scene Text Recognition 
[`paper`](https:\u002F\u002Fwww.aaai.org\u002FAAAI22Papers\u002FAAAI-785.LiuH.pdf)\n- [2020-arxiv][STL] PuzzleNet: Scene Text Detection by Segment Context Graph Learning [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2002.11371.pdf)\n- [2020-AAAI][STL][TR] Accurate Structured-Text Spotting for Arithmetical Exercise Correction [`paper`](https:\u002F\u002Fwww.researchgate.net\u002Fpublication\u002F341891992_Accurate_Structured-Text_Spotting_for_Arithmetical_Exercise_Correction)\n- [2019-arxiv][TR] 2D Attentional Irregular Scene Text Recognizer [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1906.05708.pdf) [`code`](https:\u002F\u002Fgithub.com\u002Fchenjun2hao\u002FBert_OCR.pytorch)\n\n#### Tsinghua University\n\n- [2023-IJCAI][TR] Towards Robust Scene Text Image Super-resolution via Explicit Location Enhancement [`paper`](https:\u002F\u002Fwww.ijcai.org\u002Fproceedings\u002F2023\u002F0087.pdf) [`code`](https:\u002F\u002Fgithub.com\u002Fcsguoh\u002FLEMMA)\n- [2021-CVPR][TR] Primitive Representation Learning for Scene Text Recognition [`paper`](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2021\u002Fpapers\u002FYan_Primitive_Representation_Learning_for_Scene_Text_Recognition_CVPR_2021_paper.pdf)\n- [2020-ECCV][STL] Sequential Deformation for Accurate Scene Text Detection [`paper`](http:\u002F\u002Fwww.ecva.net\u002Fpapers\u002Feccv_2020\u002Fpapers_ECCV\u002Fhtml\u002F6576_ECCV_2020_paper.php)\n\n#### University of Science and Technology of China\n\n- [2023-IJCAI][TR] Linguistic More: Taking a Further Step toward Efficient and Accurate Scene Text Recognition [`paper`](https:\u002F\u002Fwww.ijcai.org\u002Fproceedings\u002F2023\u002F0189.pdf) [`code`](https:\u002F\u002Fgithub.com\u002FCyrilSterling\u002FLPV)\n- [2021-ICCV][TR] From Two to One: A New Scene Text Recognizer With Visual Language Modeling Network 
[`paper`](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FICCV2021\u002Fpapers\u002FWang_From_Two_to_One_A_New_Scene_Text_Recognizer_With_ICCV_2021_paper.pdf)\n- [2021-CVPR][TR] Read Like Humans: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Recognition [`paper`](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2021\u002Fpapers\u002FFang_Read_Like_Humans_Autonomous_Bidirectional_and_Iterative_Language_Modeling_for_CVPR_2021_paper.pdf) [`code`](https:\u002F\u002Fgithub.com\u002FFangShancheng\u002FABINet)\n- [2020-CVPR][STL] ContourNet: Taking a Further Step Toward Accurate Arbitrary-Shaped Scene Text Detection [`paper`](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_CVPR_2020\u002Fhtml\u002FWang_ContourNet_Taking_a_Further_Step_Toward_Accurate_Arbitrary-Shaped_Scene_Text_CVPR_2020_paper.html) [`code`](https:\u002F\u002Fgithub.com\u002Fwangyuxin87\u002FContourNet)\n- [2020-arxiv][TR] Focus-Enhanced Scene Text Recognition with Deformable Convolutions [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1908.10998.pdf) [`code`](https:\u002F\u002Fgithub.com\u002FAlpaca07\u002Fdtr)\n- [2018-Pattern Recognition][STL] TextMountain: Accurate Scene Text Detection via Instance Segmentation [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1811.12786.pdf)\n\n#### University of Electronic Science and Technology of China\n\n- [2020-CVPR][TR] What Machines See Is Not What They Get: Fooling Scene Text Recognition Models With Adversarial Text Images [`paper`](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_CVPR_2020\u002Fhtml\u002FXu_What_Machines_See_Is_Not_What_They_Get_Fooling_Scene_CVPR_2020_paper.html)\n\n#### Indian Statistical Institute\n\n- [2020-CVPR][STL][TR] STEFANN: Scene Text Editor Using Font Adaptive Neural Network [`paper`](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_CVPR_2020\u002Fhtml\u002FRoy_STEFANN_Scene_Text_Editor_Using_Font_Adaptive_Neural_Network_CVPR_2020_paper.html)\n\n#### Institute 
of Information Engineering, Chinese Academy of Sciences\n\n- [2021-CVPR][STL] Progressive Contour Regression for Arbitrary-Shape Scene Text Detection [`paper`](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2021\u002Fpapers\u002FDai_Progressive_Contour_Regression_for_Arbitrary-Shape_Scene_Text_Detection_CVPR_2021_paper.pdf) [`code`](https:\u002F\u002Fgithub.com\u002Fdpengwen\u002FPCR)\n- [2020-CVPR][TR] SEED: Semantics Enhanced Encoder-Decoder Framework for Scene Text Recognition [`paper`](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_CVPR_2020\u002Fhtml\u002FQiao_SEED_Semantics_Enhanced_Encoder-Decoder_Framework_for_Scene_Text_Recognition_CVPR_2020_paper.html)\n- [2020-ICPR][TR] Gaussian Constrained Attention Network for Scene Text Recognition [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2010.09169.pdf)\n- [2020-arxiv][STL] Self-Training for Domain Adaptive Scene Text Detection [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2005.11487.pdf)\n- [2019-ICDAR][STL] Curved Text Detection in Natural Scene Images with Semi- and Weakly-Supervised Learning [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1908.09990.pdf)\n- [2019-BMVC][TR] Text Recognition using local correlation [`paper`](https:\u002F\u002Fbmvc2019.org\u002Fwp-content\u002Fuploads\u002Fpapers\u002F0469-paper.pdf)\n\n#### University of Chinese Academy of Sciences\n\n- [2020-CVPR][STL][TR] Towards Accurate Scene Text Recognition With Semantic Reasoning Networks [`paper`](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_CVPR_2020\u002Fhtml\u002FYu_Towards_Accurate_Scene_Text_Recognition_With_Semantic_Reasoning_Networks_CVPR_2020_paper.html)\n\n#### Amazon\n\n- [2020-CVPR][TR] SCATTER: Selective Context Attentional Scene Text Recognizer [`paper`](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_CVPR_2020\u002Fhtml\u002FLitman_SCATTER_Selective_Context_Attentional_Scene_Text_Recognizer_CVPR_2020_paper.html)\n\n#### Heritage Institute of Technology\n\n- 
[2020-ICIP][STL] Scale-invariant Multi-oriented Text Detection in Wild Scene Images [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2002.06423.pdf)\n\n#### Indian Institute of Technology\n\n- [2020-arxiv][STL] NENET: An Edge Learnable Network for Link Prediction in Scene Text [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2005.12147.pdf)\n\n#### Xidian University\n\n- [2021-AAAI][STL][TR] PGNet: Real-time Arbitrarily-Shaped Text Spotting with Point Gathering Network [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2104.05458.pdf) [`code`](https:\u002F\u002Fgithub.com\u002FPaddlePaddle\u002FPaddleOCR\u002Fblob\u002Frelease\u002F2.1\u002Fdoc\u002Fdoc_en\u002Fpgnet_en.md)\n- [2020-ICASSP][STL] Efficient Scene Text Detection with Textual Attention Tower [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2002.03741.pdf)\n- [2019-ACM-MM][STL] A Single-Shot Arbitrarily-Shaped Text Detector based on Context Attended Multi-Task Learning [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1908.05498.pdf)\n\n#### Tongji University\n\n- [2019-AAAI][STL] Scene Text Detection with Supervised Pyramid Context Network [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1811.08605.pdf) [`code`](https:\u002F\u002Fgithub.com\u002FAirBernard\u002FScene-Text-Detection-with-SPCNET)\n\n#### Harbin Institute of Technology\n\n- [2017-TIP][STL] Scene text detection and segmentation based on cascaded convolution neural networks [`paper`](https:\u002F\u002Fieeexplore.ieee.org\u002Fdocument\u002F7828014)\n\n#### Shanghai Jiao Tong University\n\n- [2018-ICPR][STL] Fused Text Segmentation Networks for Multi-oriented Scene Text Detection [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1709.03272.pdf)\n\n#### Ping An Property & Casualty Insurance\n\n- [2020-arxiv][TR] Hamming OCR: A Locality Sensitive Hashing Neural Network for Scene Text Recognition [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2009.10874.pdf)\n\n#### Hefei University of Technology\n\n- [2020-arxiv][TR] 
Fast Dense Residual Network: Enhancing Global Dense Feature Flow for Text Recognition [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2001.09021v1.pdf)\n\n#### Beihang University\n\n- [2020-arxiv][TR] A Feasible Framework for Arbitrary-Shaped Scene Text Recognition [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1912.04561.pdf) [`code`](https:\u002F\u002Fgithub.com\u002Fzhang0jhon\u002FAttentionOCR)\n\n#### Boston University\n\n- [2020-arxiv][TR] Deep Neural Network for Semantic-based Text Recognition in Images [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1908.01403.pdf)\n\n#### Carnegie Mellon University\n\n- [2019-ICDAR][TR] Rethinking Irregular Scene Text Recognition [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1908.11834.pdf) [`code`](https:\u002F\u002Fgithub.com\u002FJyouhou\u002FICDAR2019-ArT-Recognition-Alchemy)\n\n#### Northwestern Polytechnical University\n\n- [2019-CVPR][STL][TR] Towards End-to-End Text Spotting in Natural Scenes [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1906.06013.pdf)\n\n#### VinAI Research\n\n- [2021-CVPR][STL] Dictionary-Guided Scene Text Recognition [`paper`](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2021\u002Fpapers\u002FNguyen_Dictionary-Guided_Scene_Text_Recognition_CVPR_2021_paper.pdf) [`code`](https:\u002F\u002Fgithub.com\u002FVinAIResearch\u002Fdict-guided)\n\n#### University of Tokyo\n\n- [2021-CVPR][TR] What if We Only Use Real Datasets for Scene Text Recognition? 
Toward Scene Text Recognition With Fewer Labels [`paper`](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2021\u002Fpapers\u002FBaek_What_if_We_Only_Use_Real_Datasets_for_Scene_Text_CVPR_2021_paper.pdf) [`code`](https:\u002F\u002Fgithub.com\u002Fku21fan\u002FSTR-Fewer-Labels)\n\n#### University of Surrey\n\n- [2021-ICCV][TR] Towards the Unseen: Iterative Text Recognition by Distilling from Errors [`paper`](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FICCV2021\u002Fpapers\u002FBhunia_Towards_the_Unseen_Iterative_Text_Recognition_by_Distilling_From_Errors_ICCV_2021_paper.pdf)\n- [2021-ICCV][TR] Joint Visual Semantic Reasoning: Multi-Stage Decoder for Text Recognition [`paper`](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FICCV2021\u002Fpapers\u002FBhunia_Joint_Visual_Semantic_Reasoning_Multi-Stage_Decoder_for_Text_Recognition_ICCV_2021_paper.pdf)\n- [2021-CVPR][TR] MetaHTR: Towards Writer-Adaptive Handwritten Text Recognition [`paper`](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2021\u002Fpapers\u002FBhunia_MetaHTR_Towards_Writer-Adaptive_Handwritten_Text_Recognition_CVPR_2021_paper.pdf)\n\n#### The Technion – Israel Institute of Technology\n\n- [2021-CVPR][TR] Sequence-to-Sequence Contrastive Learning for Text Recognition [`paper`](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2021\u002Fpapers\u002FAberdam_Sequence-to-Sequence_Contrastive_Learning_for_Text_Recognition_CVPR_2021_paper.pdf)\n\n#### University of Illinois at Urbana-Champaign\n\n- [2021-CVPR][TR] Rethinking Text Segmentation: A Novel Dataset and a Text-Specific Refinement Approach [`paper`](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2021\u002Fpapers\u002FXu_Rethinking_Text_Segmentation_A_Novel_Dataset_and_a_Text-Specific_Refinement_CVPR_2021_paper.pdf) [`code`](https:\u002F\u002Fgithub.com\u002FSHI-Labs\u002FRethinking-Text-Segmentation)\n\n#### National Laboratory of Pattern Recognition\n\n- [2021-CVPR][STL] 
Semantic-Aware Video Text Detection [`paper`](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2021\u002Fpapers\u002FFeng_Semantic-Aware_Video_Text_Detection_CVPR_2021_paper.pdf)\n\n#### Shenzhen University\n\n- [2021-CVPR][STL][TR] Self-Attention Based Text Knowledge Mining for Text Detection [`paper`](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2021\u002Fpapers\u002FWan_Self-Attention_Based_Text_Knowledge_Mining_for_Text_Detection_CVPR_2021_paper.pdf) [`code`](https:\u002F\u002Fgithub.com\u002FCVI-SZU\u002FSTKM)\n\n#### University of the Philippines\n\n- [2021-ICDAR][TR] Vision Transformer for Fast and Efficient Scene Text Recognition [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2105.08582.pdf) [`code`](https:\u002F\u002Fgithub.com\u002Froatienza\u002Fdeep-text-recognition-benchmark)\n\n#### Beijing Jiaotong University\n\n- [2022-IJCAI][TR] SVTR: Scene Text Recognition with a Single Visual Model [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2205.00159.pdf) [`code`](https:\u002F\u002Fgithub.com\u002FPaddlePaddle\u002FPaddleOCR)\n\n#### Wuhan University\n\n- [2022-AAAI][TR] Visual Semantics Allow for Textual Reasoning Better in Scene Text Recognition [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2112.12916.pdf) [`code`](https:\u002F\u002Fgithub.com\u002Fadeline-cs\u002FGTR)\n\n#### Helsing AI\n\n- [2022-WACV][TR] One-shot Compositional Data Generation for Low Resource Handwritten Text Recognition [`paper`](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FWACV2022\u002Fpapers\u002FSouibgui_One-Shot_Compositional_Data_Generation_for_Low_Resource_Handwritten_Text_Recognition_WACV_2022_paper.pdf)\n\n#### Purdue University\n\n- [2023-WACV][TR] Seq-UPS: Sequential Uncertainty-aware Pseudo-label Selection for Semi-Supervised Text Recognition 
[`paper`](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FWACV2023\u002Fpapers\u002FPatel_Seq-UPS_Sequential_Uncertainty-Aware_Pseudo-Label_Selection_for_Semi-Supervised_Text_Recognition_WACV_2023_paper.pdf)\n\n## 2. Datasets\n\n#### [`SCUT-CTW1500`](https:\u002F\u002Fgithub.com\u002FYuliang-Liu\u002FCurve-Text-Detector) `2018`\n\nTask: text location (with different styles) and recognition\n\n[`download`](https:\u002F\u002Fgithub.com\u002FYuliang-Liu\u002FCurve-Text-Detector)\n\n#### [`Total Text Dataset`](https:\u002F\u002Fgithub.com\u002Fcs-chan\u002FTotal-Text-Dataset) `2017`\n\n1,555 images with three different text orientations: Horizontal, Multi-Oriented, and Curved\n\nTask: text location (with different styles) and recognition\n\n[`download`](https:\u002F\u002Fgithub.com\u002Fcs-chan\u002FTotal-Text-Dataset)\n\n#### [`PowerPoint Text Detection and Recognition Dataset`](https:\u002F\u002Fgitlab.com\u002Frex-yue-wu\u002FISI-PPT-Dataset) `2017`\n\n21,384 images, 21,384+ text instances\n\nTask: text location and recognition\n\n[`download`](https:\u002F\u002Fgitlab.com\u002Frex-yue-wu\u002FISI-PPT-Dataset)\n\n#### [`COCO-Text (Computer Vision Group, Cornell)`](http:\u002F\u002Fvision.cornell.edu\u002Fse3\u002Fcoco-text\u002F)   `2016`\n\n63,686 images, 173,589 text instances, 3 fine-grained text attributes.\n\nTask: text location and recognition\n\n[`download`](https:\u002F\u002Fgithub.com\u002Fandreasveit\u002Fcoco-text)\n\n#### [`Synthetic Word Dataset (Oxford, VGG)`](http:\u002F\u002Fwww.robots.ox.ac.uk\u002F~vgg\u002Fdata\u002Ftext\u002F)   `2014`\n\n9 million images covering 90k English words\n\nTask: text recognition, segmentation\n\n[`download`](http:\u002F\u002Fwww.robots.ox.ac.uk\u002F~vgg\u002Fdata\u002Ftext\u002Fmjsynth.tar.gz)\n\n#### [`The Street View House Number Dataset (SVHN)`](http:\u002F\u002Fufldl.stanford.edu\u002Fhousenumbers)   `2012`\n\nReal-world street-view house-number images with position and classification 
tags.\n\nTask: number location detection, text recognition\n\n[`download`](http:\u002F\u002Fufldl.stanford.edu\u002Fhousenumbers)\n\n#### [`IIIT 5K-Words`](http:\u002F\u002Fcvit.iiit.ac.in\u002Fprojects\u002FSceneTextUnderstanding\u002FIIIT5K.html)   `2012`\n\n5,000 images from scene text and born-digital sources (2k training and 3k testing images)\n\nEach image is a cropped word image of scene text with case-insensitive labels\n\nTask: text recognition\n\n[`download`](http:\u002F\u002Fcvit.iiit.ac.in\u002Fprojects\u002FSceneTextUnderstanding\u002FIIIT5K-Word_V3.0.tar.gz)\n\n#### [`StanfordSynth (Stanford, AI Group)`](http:\u002F\u002Fcs.stanford.edu\u002Fpeople\u002Ftwangcat\u002F#research)   `2012`\n\nSmall single-character images of 62 characters (0-9, a-z, A-Z)\n\nTask: text recognition\n\n[`download`](http:\u002F\u002Fcs.stanford.edu\u002Fpeople\u002Ftwangcat\u002FICPR2012_code\u002FsyntheticData.tar)\n\n#### [`MSRA Text Detection 500 Database (MSRA-TD500)`](http:\u002F\u002Fwww.iapr-tc11.org\u002Fmediawiki\u002Findex.php\u002FMSRA_Text_Detection_500_Database_(MSRA-TD500))   `2012`\n\n500 natural images (resolutions vary from 1296x864 to 1920x1280)\n\nChinese, English, or a mixture of both\n\nTask: text detection\n\n#### [`Street View Text (SVT)`](http:\u002F\u002Ftc11.cvc.uab.es\u002Fdatasets\u002FSVT_1)   `2010`\n\n350 high resolution images (average size 1260 × 860) (100 images for training and 250 images for testing)\n\nOnly word-level bounding boxes are provided with case-insensitive labels\n\nTask: text location\n\n#### [`KAIST Scene_Text Database`](http:\u002F\u002Fwww.iapr-tc11.org\u002Fmediawiki\u002Findex.php\u002FKAIST_Scene_Text_Database)   `2010`\n\n3000 images of indoor and outdoor scenes containing text\n\nKorean, English (Number), and Mixed (Korean + English + Number)\n\nTask: text location, segmentation and recognition\n\n#### [`Chars74k`](http:\u002F\u002Fwww.ee.surrey.ac.uk\u002FCVSSP\u002Fdemos\u002Fchars74k\u002F)   `2009`\n\nOver 74K 
images of characters cropped from natural images, as well as a set of synthetically generated characters\n\nSmall single-character images of 62 characters (0-9, a-z, A-Z)\n\nTask: text recognition\n\n#### `ICDAR Benchmark Datasets`\n\n|Dataset| Description | Competition Paper |\n|---|---|---|\n|[ICDAR 2017](http:\u002F\u002Frrc.cvc.uab.es\u002F)| over 173,589 labeled text regions in over 63,686 images |`paper` [`link`](https:\u002F\u002Farxiv.org\u002Fabs\u002F1601.07140)|\n|[ICDAR 2015](http:\u002F\u002Frrc.cvc.uab.es\u002F)| 1000 training images and 500 testing images |`paper` [`link`](http:\u002F\u002Frrc.cvc.uab.es\u002Ffiles\u002FRobust-Reading-Competition-Karatzas.pdf)|\n|[ICDAR 2013](http:\u002F\u002Fdagdata.cvc.uab.es\u002Ficdar2013competition\u002F)| 229 training images and 233 testing images |`paper` [`link`](http:\u002F\u002Fdagdata.cvc.uab.es\u002Ficdar2013competition\u002Ffiles\u002Ficdar2013_competition_report.pdf)|\n|[ICDAR 2011](http:\u002F\u002Frobustreading.opendfki.de\u002Ftrac\u002F)| 229 training images and 255 testing images |`paper` [`link`](http:\u002F\u002Fwww.iapr-tc11.org\u002Farchive\u002Ficdar2011\u002Ffileup\u002FPDF\u002F4520b491.pdf)|\n|[ICDAR 2005](http:\u002F\u002Fwww.iapr-tc11.org\u002Fmediawiki\u002Findex.php\u002FICDAR_2005_Robust_Reading_Competitions)| 1001 training images and 489 testing images |`paper`  
[`link`](http:\u002F\u002Fwww.academia.edu\u002Fdownload\u002F30700479\u002F10.1.1.96.4332.pdf)|\n|[ICDAR 2003](http:\u002F\u002Fwww.iapr-tc11.org\u002Fmediawiki\u002Findex.php\u002FICDAR_2003_Robust_Reading_Competitions)| 181 training images and 251 testing images (word level and character level) |`paper` [`link`](http:\u002F\u002Fciteseerx.ist.psu.edu\u002Fviewdoc\u002Fdownload?doi=10.1.1.332.3461&rep=rep1&type=pdf)|\n\n## 3. Competitions\n\n- [ICDAR - Robust Reading Competitions](http:\u002F\u002Frrc.cvc.uab.es\u002F?com=introduction)\n\n## 4. Online OCR Service\n\n| Name | Description |\n|---|---|\n|[Tesseract OCR](https:\u002F\u002Fgithub.com\u002Ftesseract-ocr\u002Ftesseract)| API, free |\n|[Online OCR](https:\u002F\u002Fwww.onlineocr.net\u002F)| API, free |\n|[Free OCR](http:\u002F\u002Fwww.free-ocr.com\u002F)| API, free |\n|[New OCR](http:\u002F\u002Fwww.newocr.com\u002F)| API, free |\n|[ABBYY FineReader Online](https:\u002F\u002Ffinereaderonline.com)| No API, not free |\n|[Super Online Transfer Tools (Chinese)](http:\u002F\u002Fwww.wdku.net\u002F)| API, free |\n|[Online Chinese Recognition](http:\u002F\u002Fchongdata.com\u002Focr\u002F)| API, free |\n\n## 5. 
Blogs\n\n- [Scene Text Detection with OpenCV 3](http:\u002F\u002Fdocs.opencv.org\u002F3.0-beta\u002Fmodules\u002Ftext\u002Fdoc\u002Ferfilter.html)\n- [Handwritten numbers detection and recognition](https:\u002F\u002Fmedium.com\u002F@o.kroeger\u002Frecognize-your-handwritten-numbers-3f007cbe46ff#.8hg7vl6mo)\n- [Applying OCR Technology for Receipt Recognition](http:\u002F\u002Frnd.azoft.com\u002Fapplying-ocr-technology-receipt-recognition\u002F)\n- [Convolutional Neural Networks for Object (Car License) Detection](http:\u002F\u002Frnd.azoft.com\u002Fconvolutional-neural-networks-object-detection\u002F)\n- [Extracting text from an image using Ocropus](http:\u002F\u002Fwww.danvk.org\u002F2015\u002F01\u002F09\u002Fextracting-text-from-an-image-using-ocropus.html)\n- [Number plate recognition with Tensorflow](http:\u002F\u002Fmatthewearl.github.io\u002F2016\u002F05\u002F06\u002Fcnn-anpr\u002F) [`github`](https:\u002F\u002Fgithub.com\u002Fmatthewearl\u002Fdeep-anpr)\n- [Using deep learning to break a Captcha system](https:\u002F\u002Fdeepmlblog.wordpress.com\u002F2016\u002F01\u002F03\u002Fhow-to-break-a-captcha-system\u002F) [`report`](http:\u002F\u002Fweb.stanford.edu\u002F~jurafsky\u002Fburszstein_2010_captcha.pdf) [`github`](https:\u002F\u002Fgithub.com\u002Farunpatala\u002Fcaptcha)\n- [Breaking reddit captcha with 96% accuracy](https:\u002F\u002Fdeepmlblog.wordpress.com\u002F2016\u002F01\u002F05\u002Fbreaking-reddit-captcha-with-96-accuracy\u002F) [`github`](https:\u002F\u002Fgithub.com\u002Farunpatala\u002Freddit.captcha)\n- [Text Detection and Recognition Resources (Chinese) - 1](http:\u002F\u002Fblog.csdn.net\u002Fpeaceinmind\u002Farticle\u002Fdetails\u002F51387367)\n- [Text Detection and Recognition Resources (Chinese) - 2](http:\u002F\u002Fblog.csdn.net\u002Fu010183397\u002Farticle\u002Fdetails\u002F56497303?locationNum=12&fps=1)\n- Scene Text Recognition in iOS [`blog`](https:\u002F\u002Fmedium.com\u002F@khurram.pak522\u002Fscene-text-recognition-in-ios-11-2d0df8412151) 
[`github`](https:\u002F\u002Fgithub.com\u002Fkhurram18\u002FSceneTextRecognitioniOS)\n","# 场景文本定位与识别资源\n\n*按研究所阅读：[英文](README.md)，[简体中文](README.zh-cn.md)。*\n\n*按年份阅读：[英文](README.yearwise.md)，[简体中文](README.zh-cn.yearwise.md)。*\n\n*标签：[STL]（场景文本定位），[TR]（文本识别）*\n\n*[STL]（场景文本定位）从场景输入图像中检测文本区域*\n\n*[TR]（文本识别）识别文本内容*\n\n**最后更新：2023年9月17日**\n\n## 1. 论文与代码\n\n#### 概述\n\n- [2020-arxiv] 自然场景下的文本检测与识别：综述 [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2006.04305.pdf)\n- [2020-arxiv] 自然场景下的文本识别：综述 [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2005.03492.pdf)\n- [2020-IJCV] 场景文本检测与识别：深度学习时代 [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1811.04256.pdf)\n- [2019-ICCV] 场景文本识别模型比较存在什么问题？数据集与模型分析 [`paper`](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_ICCV_2019\u002Fhtml\u002FBaek_What_Is_Wrong_With_Scene_Text_Recognition_Model_Comparisons_Dataset_ICCV_2019_paper.html) [`code`](https:\u002F\u002Fgithub.com\u002Fclovaai\u002Fdeep-text-recognition-benchmark)\n- [2016-TIP] 视频中的文本检测、跟踪与识别：全面综述 [`paper`](http:\u002F\u002Fieeexplore.ieee.org\u002Fapplication\u002Fenterprise\u002Fentconfirmation.jsp?arnumber=7452620&icp=false)\n- [2015-PAMI] 图像中的文本检测与识别：综述 [`paper`](http:\u002F\u002Flampsrv02.umiacs.umd.edu\u002Fpubs\u002FPapers\u002Fqixiangye-14\u002Fqixiangye-14.pdf)\n- [2014-Front.Comput.Sci] 场景文本检测与识别：最新进展与未来趋势 [`paper`](http:\u002F\u002Fmc.eistar.net\u002Fuploadfiles\u002FPapers\u002FFCS_TextSurvey_2015.pdf)\n\n#### 牛津大学\n\n- [2020-ECCV][STL][TR] 基于视觉匹配的自适应文本识别 [`paper`](http:\u002F\u002Fwww.ecva.net\u002Fpapers\u002Feccv_2020\u002Fpapers_ECCV\u002Fhtml\u002F2492_ECCV_2020_paper.php) [`code`](https:\u002F\u002Fgithub.com\u002FChuhanxx\u002FFontAdaptor)\n- [2018-BMVC][TR] 归纳式视觉定位：分解训练以实现更优的泛化能力 [`paper`](https:\u002F\u002Farxiv.org\u002Fabs\u002F1807.08179)\n- [2016-IJCV][STL][TR] 使用卷积神经网络在自然场景中阅读文本 [`paper`](http:\u002F\u002Farxiv.org\u002Fabs\u002F1412.1842) [`demo`](http:\u002F\u002Fzeus.robots.ox.ac.uk\u002Ftextsearch\u002F#\u002Fsearch\u002F)  
[`homepage`](http:\u002F\u002Fwww.robots.ox.ac.uk\u002F~vgg\u002Fresearch\u002Ftext\u002F)\n- [2016-CVPR][STL] 用于自然图像中文本定位的合成数据 [`paper`](http:\u002F\u002Fwww.robots.ox.ac.uk\u002F~vgg\u002Fdata\u002Fscenetext\u002Fgupta16.pdf) [`code`](https:\u002F\u002Fgithub.com\u002Fankush-me\u002FSynthText) [`data`](http:\u002F\u002Fwww.robots.ox.ac.uk\u002F~vgg\u002Fdata\u002Fscenetext\u002F)\n- [2015-ICLR][TR] 非约束条件下文本识别的深度结构化输出学习 [`paper`](http:\u002F\u002Farxiv.org\u002Fabs\u002F1412.5903)\n- [2015-博士论文][STL] 文本定位的深度学习\n [`paper`](http:\u002F\u002Fwww.robots.ox.ac.uk\u002F~vgg\u002Fpublications\u002F2015\u002FJaderberg15b\u002Fjaderberg15b.pdf) [`code`](https:\u002F\u002Fbitbucket.org\u002Fjaderberg\u002Feccv2014_textspotting)\n- [2014-ECCV][STL] 文本定位的深度特征 [`paper`](http:\u002F\u002Fwww.robots.ox.ac.uk\u002F~vgg\u002Fpublications\u002F2014\u002FJaderberg14\u002Fjaderberg14.pdf) [`code`](https:\u002F\u002Fbitbucket.org\u002Fjaderberg\u002Feccv2014_textspotting) [`model`](https:\u002F\u002Fbitbucket.org\u002Fjaderberg\u002Feccv2014_textspotting)\n- [2014-NIPS][TR] 用于自然场景文本识别的合成数据和人工神经网络 [`paper`](http:\u002F\u002Fwww.robots.ox.ac.uk\u002F~vgg\u002Fpublications\u002F2014\u002FJaderberg14c\u002Fjaderberg14c.pdf)  [`homepage`](http:\u002F\u002Fwww.robots.ox.ac.uk\u002F~vgg\u002Fpublications\u002F2014\u002FJaderberg14c\u002F) [`model`](http:\u002F\u002Fwww.robots.ox.ac.uk\u002F~vgg\u002Fresearch\u002Ftext\u002Fmodel_release.tar.gz)\n\n#### 深圳先进技术研究院\n\n- [2018-arxiv][STL][TR] FOTS：统一网络的快速方向性文本定位 [`paper`](https:\u002F\u002Farxiv.org\u002Fabs\u002F1801.01671)\n- [2016-ECCV][STL] CTPN：基于连接主义文本提案网络的自然图像中文本检测 [`paper`](https:\u002F\u002Farxiv.org\u002Fabs\u002F1609.03605) [`code`](https:\u002F\u002Fgithub.com\u002Ftianzhi0549\u002FCTPN)\n- [2016-CVPR][STL] 级联卷积文本网络在自然图像中实现精确的文本定位 [`paper`](http:\u002F\u002Farxiv.org\u002Fabs\u002F1603.09423)\n- [2016-AAAI][STL] 在深度卷积序列中阅读场景文本 [`paper`](http:\u002F\u002Fwhuang.org\u002Fpapers\u002Fphe2016_aaai.pdf)\n- [2016-TIP][STL] 
基于文本注意力机制的卷积神经网络用于场景文本检测 [`paper`](http:\u002F\u002Fwhuang.org\u002Fpapers\u002Fthe2016_tip.pdf) [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1510.03283.pdf)\n- [2014-ECCV][STL] 使用卷积神经网络诱导的MSER树进行鲁棒的场景文本检测 [`paper`](http:\u002F\u002Fwww.whuang.org\u002Fpapers\u002Fwhuang2014_eccv.pdf)\n\n#### 华南理工大学\n\n- [2021-IJCV][STL] 探索无序边界框离散化网络在多方向场景文本检测中的能力 [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1912.09629.pdf) [`code`](https:\u002F\u002Fgithub.com\u002FYuliang-Liu\u002FBox_Discretization_Network)\n- [2021-CVPR][STL] 用于任意形状文本检测的傅里叶轮廓嵌入 [`paper`](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2021\u002Fpapers\u002FZhu_Fourier_Contour_Embedding_for_Arbitrary-Shaped_Text_Detection_CVPR_2021_paper.pdf)\n- [2021-CVPR][TR][STL] 隐式特征对齐：学习将文本识别器转换为文本定位器 [`paper`](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2021\u002Fpapers\u002FWang_Implicit_Feature_Alignment_Learn_To_Convert_Text_Recognizer_to_Text_CVPR_2021_paper.pdf) [`code`](https:\u002F\u002Fgithub.com\u002FWang-Tianwei\u002FImplicit-feature-alignment)\n- [2020-CVPR][TR] 学习增强：面向文本识别的联合数据增强与网络优化 [`paper`](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_CVPR_2020\u002Fhtml\u002FLuo_Learn_to_Augment_Joint_Data_Augmentation_and_Network_Optimization_for_CVPR_2020_paper.html) [`code`](https:\u002F\u002Fgithub.com\u002FCanjie-Luo\u002FText-Image-Augmentation)\n- [2020-AAAI][STL][TR] 用于文本识别的解耦注意力网络 [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1912.10205.pdf)\n- [2020-CVPR][STL][TR] ABCNet：基于自适应贝塞尔曲线网络的实时场景文本定位 [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2002.10200.pdf) [`code`](https:\u002F\u002Fgithub.com\u002FYuliang-Liu\u002Fbezier_curve_text_spotting)\n- [2020-IJCV][TR] 使用对抗学习分离内容与风格以识别野外文本 [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2001.04189.pdf)\n- [2019-Pattern Recognition][TR] 用于场景文本识别的多目标校正注意力网络 [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1901.03003.pdf) 
[`code`](https:\u002F\u002Fgithub.com\u002FCanjie-Luo\u002FMORAN_v2)\n- [2019-CVPR][TR] 用于序列识别的聚合交叉熵 [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1904.08364.pdf) [`code`](https:\u002F\u002Fgithub.com\u002Fsummerlvsong\u002FAggregation-Cross-Entropy)\n- [2019-CVPR][STL] 场景文本检测的紧致度感知评估协议 [`paper`](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_CVPR_2019\u002Fhtml\u002FLiu_Tightness-Aware_Evaluation_Protocol_for_Scene_Text_Detection_CVPR_2019_paper.html)\n- [2018-AAAI][STL] 特征增强网络：一种改进的场景文本检测器 [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1711.04249.pdf)\n- [2017-arXiv][STL] 野外曲线文本检测：新数据集与新方案 [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1712.02170)\n- [2020-arxiv][TR] 基于注意力的场景文本识别中的自适应嵌入门控 [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1908.09475.pdf)\n- [2017-PAMI][TR] 利用全卷积循环网络学习空间语义上下文，用于在线手写中文文本识别 [`paper`](http:\u002F\u002Fdiscovery.ucl.ac.uk\u002F1569458\u002F1\u002FTPAMI-2016-08-0656-R2.pdf)\n- [2017-CVPR][STL] 深度匹配先验网络：迈向更紧密的多方向文本检测 [`paper`](https:\u002F\u002Farxiv.org\u002Fabs\u002F1703.01425)\n- [2016-arXiv][STL] DeepText：自然图像中统一的文本提案生成与文本检测框架 [`paper`](http:\u002F\u002Farxiv.org\u002Fabs\u002F1605.07314)\n- [2016-IEEE Transactions on Multimedia][STL] 基于卷积神经网络、通过文本结构建模的中文文本检测算法 [`paper`](http:\u002F\u002Fwww2.egr.uh.edu\u002F~zhan2\u002FECE6111_spring2017\u002FA%20Convolutional%20Neural%20Network%20%20Based%20Chinese%20Text%20Detection%20Algorithm%20Via%20Text%20Structure%20Modeling.pdf)\n\n#### 复旦大学\n\n- [2022-AAAI][TR] Text Gestalt：笔画感知的场景文本图像超分辨率 [`paper`](https:\u002F\u002Fojs.aaai.org\u002Findex.php\u002FAAAI\u002Farticle\u002Fview\u002F19904) [`code`](https:\u002F\u002Fgithub.com\u002FFudanVI\u002FFudanOCR)\n- [2022-MM][TR] 基于增强字符轮廓匹配的汉字识别 
[`paper`](https:\u002F\u002Fdl.acm.org\u002Fdoi\u002Fabs\u002F10.1145\u002F3503161.3547827) [`code`](https:\u002F\u002Fgithub.com\u002FFudanVI\u002FFudanOCR)\n- [2023-ICCV][TR] 通过图像-IDS对齐使用预训练类CLIP模型进行中文文本识别 [`paper`](https:\u002F\u002Farxiv.org\u002Fabs\u002F2309.01083) [`code`](https:\u002F\u002Fgithub.com\u002FFudanVI\u002FFudanOCR)\n- [2023-arxiv][STL][TR] 弱监督文本实例分割 [`paper`](https:\u002F\u002Farxiv.org\u002Fabs\u002F2303.10848) [`code`](https:\u002F\u002Fgithub.com\u002FFudanVI\u002FFudanOCR)\n- [2023-IJCAI][TR] 场景图像中与方向无关的中文文本识别 [`paper`](https:\u002F\u002Fwww.ijcai.org\u002Fproceedings\u002F2023\u002F0185.pdf)\n- [2023-IJCAI][TR] TPS++：用于场景文本识别的注意力增强薄板样条 [`paper`](https:\u002F\u002Fwww.ijcai.org\u002Fproceedings\u002F2023\u002F0197.pdf) [`code`](https:\u002F\u002Fgithub.com\u002Fsimplify23\u002FTPS_PP)\n- [2023-IJCAI][STL][TR] 基于文本语义推理实现精准视频文本定位 [`paper`](https:\u002F\u002Fwww.ijcai.org\u002Fproceedings\u002F2023\u002F0206.pdf) [`code`](https:\u002F\u002Fgithub.com\u002FFudanVI\u002FFudanOCR)\n- [2022-WACV][TR] 通过校正主要不规则性稳健地识别不规则场景文本 [`paper`](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FWACV2022\u002Fpapers\u002FXu_Robustly_Recognizing_Irregular_Scene_Text_by_Rectifying_Principle_Irregularities_WACV_2022_paper.pdf)\n- [2021-IJCAI][TR] 基于笔画级分解的零样本汉字识别 [`paper`](https:\u002F\u002Fwww.ijcai.org\u002Fproceedings\u002F2021\u002F0085.pdf) [`code`](https:\u002F\u002Fgithub.com\u002FFudanVI\u002FFudanOCR)\n- [2022-IJCAI][TR] C3-STISR：基于三重线索的场景文本图像超分辨率 [`paper`](https:\u002F\u002Fwww.ijcai.org\u002Fproceedings\u002F2022\u002F0238.pdf) [`code`](https:\u002F\u002Fgithub.com\u002Fzhaominyiz\u002FC3-STISR)\n- [2021-CVPR][TR] 场景文本望远镜：以文本为中心的场景图像超分辨率 
[`paper`](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2021\u002Fpapers\u002FChen_Scene_Text_Telescope_Text-Focused_Scene_Image_Super-Resolution_CVPR_2021_paper.pdf)\n- [2020-arxiv][TR] 在真实场景中使用少量标注样本进行文本识别 [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2006.12209.pdf)\n- [2018-CVPR][TR] 场景文本识别中的编辑概率 [`paper`](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_cvpr_2018\u002Fpapers\u002FBai_Edit_Probability_for_CVPR_2018_paper.pdf)\n- [2017-arXiv][STL] 通过旋转提案进行任意方向场景文本检测 [`paper`](https:\u002F\u002Farxiv.org\u002Fabs\u002F1703.01086) [`code`](https:\u002F\u002Fgithub.com\u002Fmjq11302010044\u002FRRPN)\n\n#### 华中科技大学\n\n- [2021-CVPR][STL][TR] 基于联合文本检测与相似度学习的场景文本检索 [`论文`](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2021\u002Fpapers\u002FWang_Scene_Text_Retrieval_via_Joint_Text_Detection_and_Similarity_Learning_CVPR_2021_paper.pdf) [`代码`](https:\u002F\u002Fgithub.com\u002Flanfeng4659\u002FSTR-TDSL)\n- [2021-CVPR][STL] MOST：一种具有定位精修功能的多方向场景文本检测器 [`论文`](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2021\u002Fpapers\u002FHe_MOST_A_Multi-Oriented_Scene_Text_Detector_With_Localization_Refinement_CVPR_2021_paper.pdf)\n- [2020-ECCV][TR] AutoSTR：面向场景文本识别的高效骨干网络搜索 [`论文`](http:\u002F\u002Fwww.ecva.net\u002Fpapers\u002Feccv_2020\u002Fpapers_ECCV\u002Fhtml\u002F4796_ECCV_2020_paper.php)\n- [2020-AAAI][STL][TR] 一切尽在边界：迈向任意形状文本定位 [`论文`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1911.09550.pdf)\n- [2020-AAAI][STL] 基于可微二值化的实时场景文本检测 [`论文`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1911.08947.pdf) [`代码`](https:\u002F\u002Fgithub.com\u002FMhLiao\u002FDB)\n- [2020-ECCV][STL][TR] Mask TextSpotter V3：用于鲁棒场景文本定位的分割提议网络 [`论文`](http:\u002F\u002Fwww.ecva.net\u002Fpapers\u002Feccv_2020\u002Fpapers_ECCV\u002Fhtml\u002F1436_ECCV_2020_paper.php) [`代码`](https:\u002F\u002Fgithub.com\u002FMhLiao\u002FMaskTextSpotterV3)\n- [2019-PAMI][TR] ASTER：一种带有灵活校正机制的注意力场景文本识别器 [`论文`](https:\u002F\u002Fieeexplore.ieee.org\u002Fdocument\u002F8395027) 
[`代码`](https:\u002F\u002Fgithub.com\u002Fayumiymk\u002Faster.pytorch)\n- [2019-AAAI][TR] 从二维视角进行场景文本识别 [`论文`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1809.06508.pdf)\n- [2019-PAMI][STL] 水平包围盒上的滑动顶点用于多方向目标检测 [`论文`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1911.09358.pdf) [`代码`](https:\u002F\u002Fgithub.com\u002FMingtaoFu\u002Fgliding_vertex)\n- [2019-ICCV][TR] 用于场景文本识别的对称约束校正网络 [`论文`](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_ICCV_2019\u002Fhtml\u002FYang_Symmetry-Constrained_Rectification_Network_for_Scene_Text_Recognition_ICCV_2019_paper.html)\n- [2018-arxiv][STL] TextField：为不规则场景文本检测学习深度方向场 [`论文`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1812.01393.pdf) [`代码`](https:\u002F\u002Fgithub.com\u002FYukangWang\u002FTextField)\n- [2018-ECCV][TR][STL] Mask TextSpotter：一种端到端可训练的神经网络，用于检测任意形状的文本 [`论文`](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_ECCV_2018\u002Fpapers\u002FPengyuan_Lyu_Mask_TextSpotter_An_ECCV_2018_paper.pdf)\n- [2018-ICIP][STL] 用于场景文本检测的特征融合网络 [`论文`](https:\u002F\u002Fieeexplore.ieee.org\u002Fdocument\u002F8395194\u002F)\n- [2018-CVPR][STL] 基于角点定位和区域分割的多方向场景文本检测 [`论文`](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_cvpr_2018\u002Fpapers\u002FLyu_Multi-Oriented_Scene_Text_CVPR_2018_paper.pdf)\n- [2018-CVPR][STL] 用于定向场景文本检测的旋转敏感回归 [`论文`](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_cvpr_2018\u002Fpapers\u002FLiao_Rotation-Sensitive_Regression_for_CVPR_2018_paper.pdf)\n- [2018-TIP][STL] TextBoxes++：一种单次通过的定向场景文本检测器 [`论文`](https:\u002F\u002Farxiv.org\u002Fabs\u002F1801.02765) [`代码`](https:\u002F\u002Fgithub.com\u002FMhLiao\u002FTextBoxes_plusplus)\n- [2017-AAAI][STL] TextBoxes：一种基于单个深度神经网络的快速文本检测器 [`论文`](https:\u002F\u002Farxiv.org\u002Fabs\u002F1611.06779) [`代码`](https:\u002F\u002Fgithub.com\u002FMhLiao\u002FTextBoxes)\n- [2017-CVPR][STL] 通过连接片段检测自然图像中的定向文本 [`论文`](http:\u002F\u002Fmclab.eic.hust.edu.cn\u002FUpLoadFiles\u002FPapers\u002FSegLink_CVPR17.pdf) [`代码`](https:\u002F\u002Fgithub.com\u002Fbgshih\u002Fseglink)\n- 
[2016-CVPR][TR] 带自动校正的鲁棒场景文本识别 [`论文`](http:\u002F\u002Farxiv.org\u002Fpdf\u002F1603.03915v2.pdf)\n- [2016-arXiv][STL] 基于整体多通道预测的场景文本检测 [`论文`](https:\u002F\u002Farxiv.org\u002Fabs\u002F1606.09002)\n- [2016-CVPR][STL] 使用全卷积网络进行多方向文本检测 [`论文`](http:\u002F\u002Fmclab.eic.hust.edu.cn\u002FUpLoadFiles\u002FPapers\u002FTextDectionFCN_CVPR16.pdf)\n- [2015-PAMI][TR] 一种端到端可训练的神经网络，用于基于图像的序列识别及其在场景文本识别中的应用 [`论文`](http:\u002F\u002Farxiv.org\u002Fpdf\u002F1507.05717v1.pdf) [`代码`](http:\u002F\u002Fmclab.eic.hust.edu.cn\u002F~xbai\u002FCRNN\u002Fcrnn_code.zip) [`代码`](https:\u002F\u002Fgithub.com\u002Fbgshih\u002Fcrnn)\n- [2014-CVPR][TR] Strokelets：一种用于场景文本识别的多尺度学习表示 [`论文`](https:\u002F\u002Fwww.cv-foundation.org\u002Fopenaccess\u002Fcontent_cvpr_2014\u002Fpapers\u002FYao_Strokelets_A_Learned_2014_CVPR_paper.pdf)\n\n#### 巴塞罗那自治大学\n\n- [2019-ICCV][STL][TR] 场景文本视觉问答 [`论文`](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_ICCV_2019\u002Fhtml\u002FBiten_Scene_Text_Visual_Question_Answering_ICCV_2019_paper.html)\n- [2018-ECCV][STL] 单次通过场景文本检索 [`论文`](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_ECCV_2018\u002Fpapers\u002FLluis_Gomez_Single_Shot_Scene_ECCV_2018_paper.pdf)\n- [2017-arXiv][STL] 使用全卷积网络改进场景图像中的文本提议 [`论文`](https:\u002F\u002Farxiv.org\u002Fabs\u002F1702.05089)\n- [2016-arXiv][STL] TextProposals：一种针对野外单词定位的文本专用选择性搜索算法 [`论文`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1604.02619.pdf) [`代码`](https:\u002F\u002Fgithub.com\u002Flluisgomez\u002FTextProposals)\n- [2015-ICDAR][STL] 用于野外文本提取的对象提议 [`论文`](http:\u002F\u002Farxiv.org\u002Fabs\u002F1509.02317) [`代码`](https:\u002F\u002Fgithub.com\u002Flluisgomez\u002FTextProposals)\n- [2014-PAMI][TR] 带嵌入式属性的单词定位与识别 [`论文`](http:\u002F\u002Fwww.cvc.uab.es\u002F~afornes\u002Fpubli\u002Fjournals\u002F2014_PAMI_Almazan.pdf) [`主页`](http:\u002F\u002Fwww.cvc.uab.es\u002F~almazan\u002Findex\u002Fprojects\u002Fwords-att\u002Findex.html) [`代码`](https:\u002F\u002Fgithub.com\u002Falmazan\u002Fwatts)\n\n#### 斯坦福大学\n\n- [2012-ICPR][TR] 
基于卷积神经网络的端到端文本识别 [`论文`](http:\u002F\u002Fwww.cs.stanford.edu\u002F~acoates\u002Fpapers\u002Fwangwucoatesng_icpr2012.pdf) [`代码`](http:\u002F\u002Fcs.stanford.edu\u002Fpeople\u002Ftwangcat\u002FICPR2012_code\u002FSceneTextCNN_demo.tar) [`SVHN数据集`](http:\u002F\u002Fufldl.stanford.edu\u002Fhousenumbers\u002F)\n- [2012-博士论文][TR] 基于卷积神经网络的端到端文本识别 [`论文`](http:\u002F\u002Fcs.stanford.edu\u002Fpeople\u002Fdwu4\u002FHonorThesis.pdf)\n\n#### 首尔国立大学\n\n- [2017-AAAI][STL][TR] 基于神经上下文模型的在线图像内嵌文本检测与识别 [`论文`](https:\u002F\u002Fgithub.com\u002Fcmkang\u002FCTSN\u002Fblob\u002Fmaster\u002Faaai2017_cameraready.pdf)\n\n#### 旷视科技（Face++）\n\n- [2020-CVPR][TR] 场景文本识别中的词汇依赖性 [`paper`](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_CVPR_2020\u002Fhtml\u002FWan_On_Vocabulary_Reliance_in_Scene_Text_Recognition_CVPR_2020_paper.html)\n- [2020-AAAI][STL][TR] TextScanner：按顺序读取字符以实现鲁棒的场景文本识别 [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1912.12422.pdf)\n- [2017-CVPR][STL] EAST：高效准确的场景文本检测器 [`paper`](https:\u002F\u002Farxiv.org\u002Fabs\u002F1704.03155) [`code`](https:\u002F\u002Fgithub.com\u002Fargman\u002FEAST) [`改进版代码`](https:\u002F\u002Fgithub.com\u002Fhuoyijie\u002FAdvancedEAST)\n\n#### 中国科学院自动化研究所\n\n- [2020-IJCV][STL][TR] 通过融合自底向上和自顶向下处理的残差双尺度场景文本定位 [`paper`](https:\u002F\u002Flink.springer.com\u002Farticle\u002F10.1007\u002Fs11263-020-01388-x)\n- [2019-CVPR][TR] 面向鲁棒文本图像识别的序列到序列领域适应网络 [`paper`](https:\u002F\u002Fieeexplore.ieee.org\u002Fabstract\u002Fdocument\u002F8953495)\n- [2019-ICCV][STL][TR] TextDragon：任意形状文本定位的端到端框架 [`paper`](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_ICCV_2019\u002Fhtml\u002FFeng_TextDragon_An_End-to-End_Framework_for_Arbitrary_Shaped_Text_Spotting_ICCV_2019_paper.html)\n- [2018-arxiv][TR] NRTR：用于场景文本识别的无循环序列到序列模型 [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1806.00926.pdf) [`code`](https:\u002F\u002Fgithub.com\u002FBelval\u002FNRTR)\n- [2018-arxiv][TR] SCAN：用于场景文本识别的滑动卷积注意力网络 
[`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1806.00578.pdf) [`code`](https:\u002F\u002Fgithub.com\u002Fnameful\u002FSCAN)\n- [2018-arxiv][TR] 用于不规则文本识别的循环校准网络 [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1812.07145.pdf)\n- [2017-arxiv][TR] 基于滑动卷积字符模型的场景文本识别 [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1709.01727.pdf) [`code`](https:\u002F\u002Fgithub.com\u002Flsvih\u002FSliding-Convolution)\n- [2017-arXiv][STL] 多方向场景文本检测的深度直接回归方法 [`paper`](https:\u002F\u002Farxiv.org\u002Fabs\u002F1703.08289)\n- [2017-IAPR][STL] 基于新型超像素的字符候选提取的场景文本检测 [`paper`](https:\u002F\u002Fieeexplore.ieee.org\u002Fabstract\u002Fdocument\u002F8270087)\n\n#### 加州大学圣地亚哥分校\n\n- [2016-CVPR][TR] 具有注意力机制的递归循环网络用于野外OCR [`paper`](http:\u002F\u002Farxiv.org\u002Fpdf\u002F1603.03101v1.pdf)\n\n#### 加州大学圣克鲁斯分校\n\n- [2017-arXiv][STL] 用于单词级文本定位的级联分割-检测网络 [`paper`](https:\u002F\u002Farxiv.org\u002Fabs\u002F1704.00834)\n\n#### 康奈尔大学\n\n- [2016-arXiv][STL][TR] COCO-Text：自然图像中文本检测与识别的数据集和基准 [`paper`](http:\u002F\u002Fvision.cornell.edu\u002Fse3\u002Fwp-content\u002Fuploads\u002F2016\u002F01\u002F1601.07140v1.pdf)\n\n#### 宾夕法尼亚州立大学\n\n- [2017-WACV][STL] TextContourNet：一种灵活有效的框架，通过多任务级联改进场景文本检测架构 [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1809.03050.pdf)\n- [2016-博士论文][STL] 用于语义文本匹配和场景文本检测的上下文建模 [`paper`](https:\u002F\u002Fetda.libraries.psu.edu\u002Fcatalog\u002Fzw12z528p)\n\n#### 北京科技大学\n\n- [2021-ICCV][STL] 用于任意形状文本检测的自适应边界提案网络 [`paper`](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FICCV2021\u002Fpapers\u002FZhang_Adaptive_Boundary_Proposal_Network_for_Arbitrary_Shape_Text_Detection_ICCV_2021_paper.pdf) [`code`](https:\u002F\u002Fgithub.com\u002FGXYM\u002FTextBPN)\n- [2020-CVPR][STL] 用于任意形状文本检测的深度关系推理图网络 [`paper`](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_CVPR_2020\u002Fhtml\u002FZhang_Deep_Relational_Reasoning_Graph_Network_for_Arbitrary_Shape_Text_Detection_CVPR_2020_paper.html)\n- [2017-arxiv][TR] AdaDNNs：用于场景文本识别的深度神经网络自适应集成 
[`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1710.03425.pdf)\n- [2016-IJCAI][STL] 通过局部和全局学习进行视频中的场景文本检测 [`paper`](https:\u002F\u002Fwww.ijcai.org\u002FProceedings\u002F16\u002FPapers\u002F376.pdf)\n- [2014-PAMI][TR] 自然场景图像中的鲁棒文本检测 [`paper`](http:\u002F\u002Fieeexplore.ieee.org\u002Fstamp\u002Fstamp.jsp?arnumber=6613482)\n\n#### 浦项工科大学\n\n- [2016-CVPR][STL] CannyText检测器：快速且鲁棒的场景文本定位算法 [`paper`](http:\u002F\u002Fieeexplore.ieee.org\u002Fdocument\u002F7780757\u002F)\n\n#### 计算机工程学院\n\n- [2016-IJDAR][STL] TextCatcher：一种在自然场景中检测弯曲及复杂文本的方法 [`paper`](https:\u002F\u002Flink.springer.com\u002Farticle\u002F10.1007\u002Fs10032-016-0264-4)\n\n#### 布拉格捷克理工大学\n\n- [2018-ACCV][STL][TR] E2E-MLT——一种不受约束的多语言场景文本端到端方法 [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1801.09919.pdf) [`code`](https:\u002F\u002Fgithub.com\u002FMichalBusta\u002FE2E-MLT)\n- [2017-ICCV][STL][TR] Deep TextSpotter：一个可训练的场景文本定位与识别端到端框架 [`paper`](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_ICCV_2017\u002Fpapers\u002FBusta_Deep_TextSpotter_An_ICCV_2017_paper.pdf) [`code`](https:\u002F\u002Fgithub.com\u002FMichalBusta\u002FDeepTextSpotter)\n- [2015-PAMI][STL][TR] 实时无词典场景文本定位与识别 [`paper`](http:\u002F\u002Fieeexplore.ieee.org\u002Fstamp\u002Fstamp.jsp?arnumber=7313008)\n- [2015-ICCV][STL] FASText：高效的无约束场景文本检测器 [`paper`](https:\u002F\u002Fpdfs.semanticscholar.org\u002F2131\u002F106318d4674bc9260e671c9f427bfc3f1029.pdf) [`code`](https:\u002F\u002Fgithub.com\u002FMichalBusta\u002FFASText)\n- [2012-CVPR][STL][TR] 实时场景文本定位与识别 [`paper`](http:\u002F\u002Fcmp.felk.cvut.cz\u002F~matas\u002Fpapers\u002Fneumann-2012-rt_text-cvpr.pdf) [`code`](http:\u002F\u002Fdocs.opencv.org\u002F3.0-beta\u002Fmodules\u002Ftext\u002Fdoc\u002Ferfilter.html)\n\n#### Google公司\n\n- [2019-ICCV][STL] 向无约束的端到端文本定位迈进 [`paper`](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_ICCV_2019\u002Fhtml\u002FQin_Towards_Unconstrained_End-to-End_Text_Spotting_ICCV_2019_paper.html)\n- [2013-ICCV][STL][TR] Photo OCR：在不可控条件下读取文本 
[`paper`](https:\u002F\u002Fai2-s2-pdfs.s3.amazonaws.com\u002F31a8\u002F803d7e2618bfa44c472d003055bb5961b9de.pdf)\n\n#### Microsoft公司\n\n- [2010-CVPR][STL] SWT：利用笔画宽度变换检测自然场景中的文本 [`paper`](http:\u002F\u002Fwww.math.tau.ac.il\u002F~turkel\u002Fimagepapers\u002Ftext_detection.pdf) [`code`](https:\u002F\u002Fgithub.com\u002Faperrau\u002FDetectText)\n\n#### 三星中国研究院\n\n- [2019-CVPR][STL] 基于自适应文本区域表示的任意形状场景文本检测 [`paper`](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_CVPR_2019\u002Fhtml\u002FWang_Arbitrary_Shape_Scene_Text_Detection_With_Adaptive_Text_Region_Representation_CVPR_2019_paper.html)\n- [2017-arXiv][STL] R2CNN：用于方向鲁棒场景文本检测的旋转区域CNN [`paper`](https:\u002F\u002Farxiv.org\u002Fftp\u002Farxiv\u002Fpapers\u002F1706\u002F1706.09579.pdf)\n- [2017-IAPR][STL] 用于场景文本的深度残差文本检测网络 [`paper`](https:\u002F\u002Fieeexplore.ieee.org\u002Fdocument\u002F8270068)\n\n#### Vicarious FPC公司\n\n- [2016-NIPS][TR] 生成式形状模型：在极少训练数据下实现文本识别与分割的联合方法 [`paper`](https:\u002F\u002Farxiv.org\u002Fabs\u002F1611.02788)\n\n#### 中国科学院管理与控制复杂系统国家重点实验室\n\n- [2013-CVPR][TR] 基于部件树结构字符检测的场景文本识别 [`paper`](http:\u002F\u002Fwww.cv-foundation.org\u002Fopenaccess\u002Fcontent_cvpr_2013\u002Fpapers\u002FShi_Scene_Text_Recognition_2013_CVPR_paper.pdf)\n\n#### 信息通信研究院视觉计算部\n\n- [2017-ICCV][STL] WeText：弱监督下的场景文本检测 [`paper`](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_ICCV_2017\u002Fpapers\u002FTian_WeText_Scene_Text_ICCV_2017_paper.pdf)\n\n#### 佛罗里达大学\n\n- [2017-ICCV][STL] 具有区域注意力的单阶段文本检测器 [`paper`](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_ICCV_2017\u002Fpapers\u002FHe_Single_Shot_Text_ICCV_2017_paper.pdf) [`code`](https:\u002F\u002Fgithub.com\u002FBestSonny\u002FSSTD)\n\n#### 南加州大学\n\n- [2017-ICCV][STL] 通过边界学习实现最小后处理的自组织文本检测 
[`paper`](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_ICCV_2017\u002Fpapers\u002FWu_Self-Organized_Text_Detection_ICCV_2017_paper.pdf)\n\n#### 海康威视研究院\n\n- [2021-AAAI][STL][TR] MANGO：基于掩码注意力引导的一阶段场景文本定位器 [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2012.04350.pdf)\n- [2020-AAAI][STL][TR] Text Perceptron：迈向端到端任意形状文本定位 [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2002.06820.pdf)\n- [2018-CVPR][TR] AON：迈向任意方向文本识别 [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1711.04226.pdf) [`code`](https:\u002F\u002Fgithub.com\u002Fhuizhang0110\u002FAON)\n- [2017-ICCV][TR] 聚焦注意力：迈向自然图像中的准确文本识别 [`paper`](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_ICCV_2017\u002Fpapers\u002FCheng_Focusing_Attention_Towards_ICCV_2017_paper.pdf)\n\n#### 阿德莱德大学\n\n- [2019-AAAI][TR] 展示、注意并阅读：一种简单而强大的不规则文本识别基线 [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1811.00751.pdf) [`code`](https:\u002F\u002Fgithub.com\u002FPay20Y\u002FSAR_TF)\n- [2017-ICCV][STL][TR] 基于卷积循环神经网络的端到端文本定位 [`paper`](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_ICCV_2017\u002Fpapers\u002FLi_Towards_End-To-End_Text_ICCV_2017_paper.pdf)\n\n#### 纽约市立大学\n\n- [2017-CVPR][STL] 杂乱场景中无歧义的文本定位与检索 [`paper`](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_cvpr_2017\u002Fpapers\u002FRong_Unambiguous_Text_Localization_CVPR_2017_paper.pdf)\n\n#### 香港大学\n\n- [2020-ECCV][STL][TR] AE TextSpotter：为歧义文本检测学习视觉和语言表征 [`paper`](http:\u002F\u002Fwww.ecva.net\u002Fpapers\u002Feccv_2020\u002Fpapers_ECCV\u002Fhtml\u002F2183_ECCV_2020_paper.php)\n- [2018-AAAI][TR] Char-Net：一种针对扭曲场景文本的字符感知神经网络 [`paper`](http:\u002F\u002Fwww.visionlab.cs.hku.hk\u002Fpublications\u002Fwliu_aaai18.pdf)\n\n#### 浙江大学\n\n- [2021-TIP][STL][TR] FREE：一种快速且鲁棒的端到端视频文本检测器 [`paper`](https:\u002F\u002Fieeexplore.ieee.org\u002Fstamp\u002Fstamp.jsp?tp=&arnumber=9266586)\n- [2020-arxiv][TR] 精细化门控：一种简单有效的循环单元门控机制 [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2002.11338.pdf)\n- [2018-AAAI][STL] PixelLink：通过实例分割检测场景文本 
[`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1801.01315.pdf)\n\n#### 波茨坦大学\n\n- [2018-AAAI][STL][TR] SEE：迈向半监督端到端场景文本识别 [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1712.05404.pdf) [`code`](https:\u002F\u002Fgithub.com\u002FBartzi\u002Fsee)\n\n#### 亚利桑那州立大学\n\n- [2018-AAAI][TR] SqueezedText：一种基于二值卷积编码器-解码器网络的实时场景文本识别 [`paper`](https:\u002F\u002Fpdfs.semanticscholar.org\u002F9061\u002F47e6eb8e963d9751dda18fb540ed7faeb9fb.pdf)\n\n#### 史蒂文斯理工学院\n\n- [2018-CVPR][STL] 基于实例变换网络的几何感知场景文本检测 [`paper`](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_cvpr_2018\u002Fpapers\u002FWang_Geometry-Aware_Scene_Text_CVPR_2018_paper.pdf)\n\n#### 南洋理工大学\n\n- [2020-IJCV][STL] 基于马尔可夫聚类网络的自底向上场景文本检测 [`paper`](https:\u002F\u002Flink.springer.com\u002Farticle\u002F10.1007\u002Fs11263-020-01298-y)\n- [2020-AAAI][STL][TR] GTC：指导CTC训练以实现高效准确的场景文本识别 [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2002.01276.pdf)\n- [2019-ICCV][STL][TR] GA-DAN：面向场景文本检测与识别的几何感知领域适应网络 [`paper`](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_ICCV_2019\u002Fhtml\u002FZhan_GA-DAN_Geometry-Aware_Domain_Adaptation_Network_for_Scene_Text_Detection_and_ICCV_2019_paper.html)\n- [2019-CVPR][STL] ESIR：通过迭代图像校正实现端到端场景文本识别 [`paper`](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_CVPR_2019\u002Fhtml\u002FZhan_ESIR_End-To-End_Scene_Text_Recognition_via_Iterative_Image_Rectification_CVPR_2019_paper.html)\n- [2019-CVPR][STL] 向着鲁棒的曲线文本检测迈进：利用条件空间扩张 [`paper`](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_CVPR_2019\u002Fhtml\u002FLiu_Towards_Robust_Curve_Text_Detection_With_Conditional_Spatial_Expansion_CVPR_2019_paper.html)\n- [2018-ECCV][STL] 真实感图像合成用于场景中文本的准确检测与识别 [`paper`](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_ECCV_2018\u002Fpapers\u002FFangneng_Zhan_Verisimilar_Image_Synthesis_ECCV_2018_paper.pdf)\n- [2018-ECCV][STL] 通过边界语义意识和自举法实现准确的场景文本检测 
[`paper`](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_ECCV_2018\u002Fpapers\u002FChuhui_Xue_Accurate_Scene_Text_ECCV_2018_paper.pdf)\n- [2018-ECCV][STL] 利用目标信息进行文本检测 [`paper`](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_ECCV_2018\u002Fpapers\u002FShitala_Prasad_Using_Object_Information_ECCV_2018_paper.pdf)\n- [2018-CVPR][STL] 学习马尔可夫聚类网络用于场景文本检测 [`paper`](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_cvpr_2018\u002Fpapers\u002FLiu_Learning_Markov_Clustering_CVPR_2018_paper.pdf)\n\n#### 阿里巴巴集团\n\n- [2018-ICPR][STL][TR] 一种新颖的集成框架，用于同时学习文本检测与识别 [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1811.08611.pdf)\n- [2018-IJCAI][STL] IncepText：一种带有可变形PSROI池化的新Inception-Text模块，用于多方向场景文本检测 [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1805.01167.pdf)\n\n#### 中国科学院\n\n- [2020-CVPR][STL][TR] 用于视觉与场景文本联合推理的多模态图神经网络 [`paper`](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_CVPR_2020\u002Fhtml\u002FGao_Multi-Modal_Graph_Neural_Network_for_Joint_Reasoning_on_Vision_and_CVPR_2020_paper.html)\n- [2018-ICIP][STL] 焦点文本：一种基于焦点损失的精确文本检测方法 [`paper`](https:\u002F\u002Fieeexplore.ieee.org\u002Fstamp\u002Fstamp.jsp?tp=&arnumber=8451241)\n- [2018-ICIP][STL] 用于场景文本识别的密集链式注意力网络 [`paper`](https:\u002F\u002Fieeexplore.ieee.org\u002Fstamp\u002Fstamp.jsp?tp=&arnumber=8451273)\n\n#### 剑桥大学\n\n- [2018-ECCV][STL] 面向场景文本识别的合成监督特征学习 [`paper`](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_ECCV_2018\u002Fpapers\u002FYang_Liu_Synthetically_Supervised_Feature_ECCV_2018_paper.pdf)\n\n#### 北京大学\n\n- [2021-NIPS][TR] 向心文本：一种用于场景文本检测的高效文本实例表示方法 [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2107.05945.pdf) [`code`](https:\u002F\u002Fgithub.com\u002Fshengtao96\u002FCentripetalText)\n- [2020-ICASSP][TR] 通过字符锚点池化实现场景文本识别中灵活特征提取的新视角 [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2002.03509.pdf)\n- [2020-ICASSP][STL] 只需再看一眼：迈向更紧密的任意形状文本检测 [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2004.12436.pdf)\n- [2019-WACV][STL] 基于金字塔注意力网络的Mask R-CNN用于场景文本检测 
[`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1811.09058.pdf)\n- [2018-ECCV][STL] TextSnake：一种用于检测任意形状文本的灵活表示方法 [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1807.01544.pdf) [`code`](https:\u002F\u002Fgithub.com\u002Fprincewang1994\u002FTextSnake.pytorch)\n\n#### 商汤科技研究院\n\n- [2021-WACV][STL] 用于四边形文本检测的解耦轮廓学习 [`paper`](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FWACV2021\u002Fpapers\u002FBi_Disentangled_Contour_Learning_for_Quadrilateral_Text_Detection_WACV_2021_paper.pdf) [`code`](https:\u002F\u002Fgithub.com\u002FSakuraRiven\u002FDCLNet)\n- [2020-ECCV][TR] RobustScanner：动态增强位置线索以实现鲁棒文本识别 [`paper`](http:\u002F\u002Fwww.ecva.net\u002Fpapers\u002Feccv_2020\u002Fpapers_ECCV\u002Fhtml\u002F3160_ECCV_2020_paper.php)\n- [2020-ECCV][TR] 野外场景文本图像超分辨率 [`paper`](http:\u002F\u002Fwww.ecva.net\u002Fpapers\u002Feccv_2020\u002Fpapers_ECCV\u002Fhtml\u002F1186_ECCV_2020_paper.php)\n- [2019-arxiv][STL] 金字塔掩码文本检测器 [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1903.11800.pdf)\n- [2019-ICCV][STL] 用于精确场景文本检测的几何归一化网络 [`paper`](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_ICCV_2019\u002Fhtml\u002FXu_Geometry_Normalization_Networks_for_Accurate_Scene_Text_Detection_ICCV_2019_paper.html)\n- [2018-BMVC][STL] 通过引导卷积神经网络提升场景文本检测性能 [`paper`](http:\u002F\u002Fbmvc2018.org\u002Fcontents\u002Fpapers\u002F0633.pdf)\n\n#### Naver Clova AI Research\n\n- [2020-ECCV][STL] 用于文本定位的字符区域注意力 [`paper`](http:\u002F\u002Fwww.ecva.net\u002Fpapers\u002Feccv_2020\u002Fpapers_ECCV\u002Fhtml\u002F6775_ECCV_2020_paper.php)\n- [2019-CVPR][STL][TR] 用于文本检测的字符区域感知能力 [`paper`](https:\u002F\u002Farxiv.org\u002Fabs\u002F1904.01941) [`code`](https:\u002F\u002Fgithub.com\u002Fclovaai\u002FCRAFT-pytorch)\n\n#### 百度\n\n- [2020-arxiv][STL][TR] PP-OCR：一个实用的超轻量级OCR系统 [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2009.09941.pdf)\n- [2019-ICCV][STL][TR] 中文街景文本：基于部分监督学习的大规模中文文本阅读 
[`paper`](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_ICCV_2019\u002Fhtml\u002FSun_Chinese_Street_View_Text_Large-Scale_Chinese_Text_Reading_With_Partially_ICCV_2019_paper.html)\n- [2019-CVPR][STL] 多看几眼：一种精确的任意形状文本检测器 [`paper`](https:\u002F\u002Farxiv.org\u002Fabs\u002F1904.06535)\n- [2018-arxiv][STL] 使用深度字符嵌入网络在野外检测文本 [`paper`](https:\u002F\u002Farxiv.org\u002Fabs\u002F1801.01671)\n- [2018-ACCV][STL][TR] TextNet：一种端到端可训练网络，用于从图像中读取不规则文本 [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1812.09900.pdf)\n\n#### 阿德莱德大学\n\n- [2018-CVPR][STL][TR] 具有显式对齐和注意力机制的端到端文本定位器 [`paper`](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_cvpr_2018\u002Fpapers\u002FHe_An_End-to-End_TextSpotter_CVPR_2018_paper.pdf) [`code`](https:\u002F\u002Fgithub.com\u002Ftonghe90\u002Ftextspotter)\n\n#### 南京大学\n\n- [2020-BMVC][TR] 通过自适应图像增强实现鲁棒场景文本识别 [`paper`](https:\u002F\u002Fwww.bmvc2020-conference.com\u002Fassets\u002Fpapers\u002F0257.pdf)\n- [2019-ICCV][STL] 基于像素聚合网络的高效且精确的任意形状文本检测 [`paper`](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_ICCV_2019\u002Fhtml\u002FWang_Efficient_and_Accurate_Arbitrary-Shaped_Text_Detection_With_Pixel_Aggregation_Network_ICCV_2019_paper.html) [`code`](https:\u002F\u002Fgithub.com\u002FWenmuZhou\u002FPAN.pytorch)\n- [2019-CVPR][STL] 基于渐进尺度扩展网络的形状鲁棒文本检测 [`paper`](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_CVPR_2019\u002Fhtml\u002FWang_Shape_Robust_Text_Detection_With_Progressive_Scale_Expansion_Network_CVPR_2019_paper.html) [`code`](https:\u002F\u002Fgithub.com\u002Fwhai362\u002FPSENet)\n\n#### 香港中文大学\n\n- [2022-AAAI][TR] 基于上下文的对比学习用于场景文本识别 [`paper`](https:\u002F\u002Fwww.cse.cuhk.edu.hk\u002F~byu\u002Fpapers\u002FC139-AAAI2022-ConCLR.pdf)\n- [2019-CVPR][STL] 学习面向场景文本检测的形状感知嵌入 [`paper`](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_CVPR_2019\u002Fhtml\u002FTian_Learning_Shape-Aware_Embedding_for_Scene_Text_Detection_CVPR_2019_paper.html)\n\n#### 马龙科技\n\n- [2019-ICCV][STL][TR] 卷积字符网络 
[`paper`](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_ICCV_2019\u002Fhtml\u002FXing_Convolutional_Character_Networks_ICCV_2019_paper.html) [`code`](https:\u002F\u002Fgithub.com\u002FMalongTech\u002Fresearch-charnet)\n\n#### 罗切斯特大学\n\n- [2019-ICCV][TR] 基于生成特征学习的大规模标签驱动字体检索 [`paper`](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_ICCV_2019\u002Fhtml\u002FChen_Large-Scale_Tag-Based_Font_Retrieval_With_Generative_Feature_Learning_ICCV_2019_paper.html)\n\n#### Facebook AI Research\n\n- [2021-CVPR][STL][TR] TextOCR：迈向大规模的任意形状场景文本端到端推理 [`paper`](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2021\u002Fpapers\u002FSingh_TextOCR_Towards_Large-Scale_End-to-End_Reasoning_for_Arbitrary-Shaped_Scene_Text_CVPR_2021_paper.pdf) [`code`](https:\u002F\u002Ftextvqa.org\u002Ftextocr\u002Fcode)\n- [2020-CVPR][STL][TR] 基于指针增强型多模态Transformer的迭代答案预测用于TextVQA [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1911.06258.pdf)\n- [2018-arxiv][STL] 利用旋转区域建议网络改进旋转文本检测 [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1811.07031.pdf)\n\n#### 马里兰大学\n\n- [2020-WACV][TR] 为带注意力机制的文本序列识别自适应调整风格与内容 [`paper`](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_WACV_2020\u002Fpapers\u002FSchwarcz_Adapting_Style_and_Content_for_Attended_Text_Sequence_Recognition_WACV_2020_paper.pdf)\n\n#### Penta-AI\n\n- [2020-WACV][STL] 一切在于尺度——基于自适应缩放的高效文本检测 [`paper`](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_WACV_2020\u002Fpapers\u002FRichardson_Its_All_About_The_Scale_-_Efficient_Text_Detection_Using_WACV_2020_paper.pdf)\n\n#### 华中师范大学\n\n- [2020-ECCV][STL][TR] PlugNet：由可插拔超分辨率单元监督的退化感知场景文本识别 [`paper`](http:\u002F\u002Fwww.ecva.net\u002Fpapers\u002Feccv_2020\u002Fpapers_ECCV\u002Fhtml\u002F2318_ECCV_2020_paper.php)\n\n#### 腾讯\n\n- [2022-AAAI][TR] 理解笔画语义上下文：用于鲁棒场景文本识别的层次化对比学习 [`paper`](https:\u002F\u002Fwww.aaai.org\u002FAAAI22Papers\u002FAAAI-785.LiuH.pdf)\n- [2020-arxiv][STL] PuzzleNet：通过分割上下文图学习进行场景文本检测 [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2002.11371.pdf)\n- 
[2020-AAAI][STL][TR] 针对算术练习批改的精确结构化文本定位 [`paper`](https:\u002F\u002Fwww.researchgate.net\u002Fpublication\u002F341891992_Accurate_Structured-Text_Spotting_for_Arithmetical_Exercise_Correction)\n- [2019-arxiv][TR] 二维注意力机制不规则场景文本识别器 [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1906.05708.pdf) [`code`](https:\u002F\u002Fgithub.com\u002Fchenjun2hao\u002FBert_OCR.pytorch)\n\n#### 清华大学\n\n- [2023-IJCAI][TR] 通过显式位置增强实现鲁棒场景文本图像超分辨率 [`paper`](https:\u002F\u002Fwww.ijcai.org\u002Fproceedings\u002F2023\u002F0087.pdf) [`code`](https:\u002F\u002Fgithub.com\u002Fcsguoh\u002FLEMMA)\n- [2021-CVPR][STL] 场景文本识别中的基元表示学习 [`paper`](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2021\u002Fpapers\u002FYan_Primitive_Representation_Learning_for_Scene_Text_Recognition_CVPR_2021_paper.pdf)\n- [2020-ECCV][STL] 用于精确场景文本检测的序列变形方法 [`paper`](http:\u002F\u002Fwww.ecva.net\u002Fpapers\u002Feccv_2020\u002Fpapers_ECCV\u002Fhtml\u002F6576_ECCV_2020_paper.php)\n\n#### 中国科学技术大学\n\n- [2023-IJCAI][TR] Linguistic More：迈向高效准确场景文本识别的新一步 [`paper`](https:\u002F\u002Fwww.ijcai.org\u002Fproceedings\u002F2023\u002F0189.pdf) [`code`](https:\u002F\u002Fgithub.com\u002FCyrilSterling\u002FLPV)\n- [2021-ICCV][TR] 从二到一：一种基于视觉语言建模网络的新场景文本识别器 [`paper`](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FICCV2021\u002Fpapers\u002FWang_From_Two_to_One_A_New_Scene_Text_Recognizer_With_ICCV_2021_paper.pdf)\n- [2021-CVPR][STL] 像人类一样阅读：场景文本识别中的自主、双向和迭代语言建模 [`paper`](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2021\u002Fpapers\u002FFang_Read_Like_Humans_Autonomous_Bidirectional_and_Iterative_Language_Modeling_for_CVPR_2021_paper.pdf) [`code`](https:\u002F\u002Fgithub.com\u002FFangShancheng\u002FABINet)\n- [2020-CVPR][STL] ContourNet：迈向精确任意形状场景文本检测的新一步 [`paper`](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_CVPR_2020\u002Fhtml\u002FWang_ContourNet_Taking_a_Further_Step_Toward_Accurate_Arbitrary-Shaped_Scene_Text_CVPR_2020_paper.html) 
[`code`](https:\u002F\u002Fgithub.com\u002Fwangyuxin87\u002FContourNet)\n- [2020-arxiv][TR] 基于可变形卷积的焦点增强场景文本识别 [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1908.10998.pdf) [`code`](https:\u002F\u002Fgithub.com\u002FAlpaca07\u002Fdtr)\n- [2018-Pattern Recognition][STL] TextMountain：通过实例分割实现精确场景文本检测 [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1811.12786.pdf)\n\n#### 电子科技大学\n\n- [2020-CVPR][TR] 机器所见并非所得：用对抗性文本图像愚弄场景文本识别模型 [`paper`](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_CVPR_2020\u002Fhtml\u002FXu_What_Machines_See_Is_Not_What_They_Get_Fooling_Scene_CVPR_2020_paper.html)\n\n#### 印度统计研究所\n\n- [2020-CVPR][STL][TR] STEFANN：使用字体自适应神经网络的场景文本编辑器 [`paper`](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_CVPR_2020\u002Fhtml\u002FRoy_STEFANN_Scene_Text_Editor_Using_Font_Adaptive_Neural_Network_CVPR_2020_paper.html)\n\n#### 中国科学院信息工程研究所\n\n- [2021-CVPR][STL] 用于任意形状场景文本检测的渐进轮廓回归 [`paper`](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2021\u002Fpapers\u002FDai_Progressive_Contour_Regression_for_Arbitrary-Shape_Scene_Text_Detection_CVPR_2021_paper.pdf) [`code`](https:\u002F\u002Fgithub.com\u002Fdpengwen\u002FPCR)\n- [2020-CVPR][TR] SEED：用于场景文本识别的语义增强编码器-解码器框架 [`paper`](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_CVPR_2020\u002Fhtml\u002FQiao_SEED_Semantics_Enhanced_Encoder-Decoder_Framework_for_Scene_Text_Recognition_CVPR_2020_paper.html)\n- [2020-ICPR][TR] 用于场景文本识别的高斯约束注意力网络 [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2010.09169.pdf)\n- [2020-arxiv][STL] 用于领域自适应场景文本检测的自训练 [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2005.11487.pdf)\n- [2019-ICDAR][STL] 基于半监督和弱监督学习的自然场景图像中曲线文本检测 [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1908.09990.pdf)\n- [2019-BMVC][TR] 使用局部相关性的文本识别 [`paper`](https:\u002F\u002Fbmvc2019.org\u002Fwp-content\u002Fuploads\u002Fpapers\u002F0469-paper.pdf)\n\n#### 中国科学院大学\n\n- [2020-CVPR][STL][TR] 借助语义推理网络迈向精确场景文本识别 
[`paper`](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_CVPR_2020\u002Fhtml\u002FYu_Towards_Accurate_Scene_Text_Recognition_With_Semantic_Reasoning_Networks_CVPR_2020_paper.html)\n\n#### 亚马逊\n\n- [2020-CVPR][STL] SCATTER：选择性上下文注意力场景文本识别器 [`paper`](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_CVPR_2020\u002Fhtml\u002FLitman_SCATTER_Selective_Context_Attentional_Scene_Text_Recognizer_CVPR_2020_paper.html)\n\n#### Heritage Institute of Technology\n\n- [2020-ICIP][STL] 野外场景图像中尺度不变的多方向文本检测 [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2002.06423.pdf)\n\n#### 印度理工学院\n\n- [2020-arxiv][STL] NENET：用于场景文本中链接预测的边缘可学习网络 [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2005.12147.pdf)\n\n#### 西安电子科技大学\n\n- [2021-AAAI][STL][TR] PGNet：基于点聚集网络的实时任意形状文本定位 [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2104.05458.pdf) [`code`](https:\u002F\u002Fgithub.com\u002FPaddlePaddle\u002FPaddleOCR\u002Fblob\u002Frelease\u002F2.1\u002Fdoc\u002Fdoc_en\u002Fpgnet_en.md)\n- [2020-ICASSP][STL] 基于文本注意力塔的高效场景文本检测 [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2002.03741.pdf)\n- [2019-ACM-MM][STL] 基于上下文注意力多任务学习的单次通过任意形状文本检测器 [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1908.05498.pdf)\n\n#### 同济大学\n\n- [2019-AAAI][STL] 基于监督金字塔上下文网络的场景文本检测 [`paper`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1811.08605.pdf) [`code`](https:\u002F\u002Fgithub.com\u002FAirBernard\u002FScene-Text-Detection-with-SPCNET)\n\n#### 哈尔滨工业大学\n\n- [2017-TIP][STL] 基于级联卷积神经网络的场景文本检测与分割 [`论文`](https:\u002F\u002Fieeexplore.ieee.org\u002Fdocument\u002F7828014)\n\n#### 上海交通大学\n\n- [2018-ICPR][STL] 用于多方向场景文本检测的融合文本分割网络 [`论文`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1709.03272.pdf)\n\n#### 平安财产保险\n\n- [2020-arxiv][TR] 汉明OCR：一种用于场景文本识别的局部敏感哈希神经网络 [`论文`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2009.10874.pdf)\n\n#### 合肥工业大学\n\n- [2020-arxiv][TR] 快速密集残差网络：增强全局密集特征流以提升文本识别性能 [`论文`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2001.09021v1.pdf)\n\n#### 北京航空航天大学\n\n- [2020-arxiv][TR] 
一种适用于任意形状场景文本识别的可行框架 [`论文`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1912.04561.pdf) [`代码`](https:\u002F\u002Fgithub.com\u002Fzhang0jhon\u002FAttentionOCR)\n\n#### 波士顿大学\n\n- [2020-arxiv][TR] 基于语义的图像文本识别深度神经网络 [`论文`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1908.01403.pdf)\n\n#### 卡内基梅隆大学\n\n- [2019-ICDAR][TR] 重新思考不规则场景文本识别 [`论文`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1908.11834.pdf) [`代码`](https:\u002F\u002Fgithub.com\u002FJyouhou\u002FICDAR2019-ArT-Recognition-Alchemy)\n\n#### 西北工业大学\n\n- [2019-CVPR][STL][TR] 向自然场景中的端到端文本定位迈进 [`论文`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1906.06013.pdf)\n\n#### VinAI Research\n\n- [2021-CVPR][STL] 字典引导的场景文本识别 [`论文`](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2021\u002Fpapers\u002FNguyen_Dictionary-Guided_Scene_Text_Recognition_CVPR_2021_paper.pdf) [`代码`](https:\u002F\u002Fgithub.com\u002FVinAIResearch\u002Fdict-guided)\n\n#### 东京大学\n\n- [2021-CVPR][TR] 如果我们只使用真实数据集进行场景文本识别会怎样？迈向少标签场景文本识别 [`论文`](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2021\u002Fpapers\u002FBaek_What_if_We_Only_Use_Real_Datasets_for_Scene_Text_CVPR_2021_paper.pdf) [`代码`](https:\u002F\u002Fgithub.com\u002Fku21fan\u002FSTR-Fewer-Labels)\n\n#### 萨里大学\n\n- [2021-ICCV][TR] 迈向未知领域：通过从错误中提炼知识实现迭代式文本识别 [`论文`](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FICCV2021\u002Fpapers\u002FBhunia_Towards_the_Unseen_Iterative_Text_Recognition_by_Distilling_From_Errors_ICCV_2021_paper.pdf)\n- [2021-ICCV][TR] 联合视觉语义推理：用于文本识别的多阶段解码器 [`论文`](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FICCV2021\u002Fpapers\u002FBhunia_Joint_Visual_Semantic_Reasoning_Multi-Stage_Decoder_for_Text_Recognition_ICCV_2021_paper.pdf)\n- [2021-CVPR][TR] MetaHTR：迈向书写者自适应的手写文本识别 [`论文`](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2021\u002Fpapers\u002FBhunia_MetaHTR_Towards_Writer-Adaptive_Handwritten_Text_Recognition_CVPR_2021_paper.pdf)\n\n#### 特克尼昂理工学院\n\n- [2021-CVPR][TR] 面向文本识别的序列到序列对比学习 
[`论文`](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2021\u002Fpapers\u002FAberdam_Sequence-to-Sequence_Contrastive_Learning_for_Text_Recognition_CVPR_2021_paper.pdf)\n\n#### 伊利诺伊大学厄巴纳-香槟分校\n\n- [2021-CVPR][TR] 重新思考文本分割：一个新颖的数据集和一种针对文本的精细化方法 [`论文`](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2021\u002Fpapers\u002FXu_Rethinking_Text_Segmentation_A_Novel_Dataset_and_a_Text-Specific_Refinement_CVPR_2021_paper.pdf) [`代码`](https:\u002F\u002Fgithub.com\u002FSHI-Labs\u002FRethinking-Text-Segmentation)\n\n#### 国家模式识别实验室\n\n- [2021-CVPR][STL] 语义感知的视频文本检测 [`论文`](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2021\u002Fpapers\u002FFeng_Semantic-Aware_Video_Text_Detection_CVPR_2021_paper.pdf)\n\n#### 深圳大学\n\n- [2021-CVPR][STL][TR] 基于自注意力机制的文本知识挖掘用于文本检测 [`论文`](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2021\u002Fpapers\u002FWan_Self-Attention_Based_Text_Knowledge_Mining_for_Text_Detection_CVPR_2021_paper.pdf) [`代码`](https:\u002F\u002Fgithub.com\u002FCVI-SZU\u002FSTKM)\n\n#### 菲律宾大学\n\n- [2021-ICDAR][TR] 用于快速高效场景文本识别的视觉Transformer [`论文`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2105.08582.pdf) [`代码`](https:\u002F\u002Fgithub.com\u002Froatienza\u002Fdeep-text-recognition-benchmark)\n\n#### 北京交通大学\n\n- [2022-IJCAI][TR] SVTR：基于单一视觉模型的场景文本识别 [`论文`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2205.00159.pdf) [`代码`](https:\u002F\u002Fgithub.com\u002FPaddlePaddle\u002FPaddleOCR)\n\n#### 武汉大学\n\n- [2022-AAAI][TR] 视觉语义有助于在场景文本识别中更好地进行文本推理 [`论文`](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2112.12916.pdf) [`代码`](https:\u002F\u002Fgithub.com\u002Fadeline-cs\u002FGTR)\n\n#### Helsing AI\n\n- [2022-WACV][TR] 针对低资源手写文本识别的一次性组合数据生成 [`论文`](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FWACV2022\u002Fpapers\u002FSouibgui_One-Shot_Compositional_Data_Generation_for_Low_Resource_Handwritten_Text_Recognition_WACV_2022_paper.pdf)\n\n#### 普渡大学\n\n- [2023-WACV][TR] Seq-UPS：面向半监督文本识别的序列化不确定性感知伪标签选择 
[`论文`](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FWACV2023\u002Fpapers\u002FPatel_Seq-UPS_Sequential_Uncertainty-Aware_Pseudo-Label_Selection_for_Semi-Supervised_Text_Recognition_WACV_2023_paper.pdf)\n\n## 2. 数据集\n\n#### [`SCUT-CTW1500`](https:\u002F\u002Fgithub.com\u002FYuliang-Liu\u002FCurve-Text-Detector) `2018`\n\n任务：文本定位（不同风格）及识别\n\n[`下载`](https:\u002F\u002Fgithub.com\u002FYuliang-Liu\u002FCurve-Text-Detector)\n\n#### [`Total Text 数据集`](https:\u002F\u002Fgithub.com\u002Fcs-chan\u002FTotal-Text-Dataset) `2017`\n\n包含1,555张图像，涵盖水平、多方向和弯曲三种不同的文本方向，独一无二。\n\n任务：文本定位（不同风格）及识别\n\n[`下载`](https:\u002F\u002Fgithub.com\u002Fcs-chan\u002FTotal-Text-Dataset)\n\n#### [`PowerPoint 文本检测与识别数据集`](https:\u002F\u002Fgitlab.com\u002Frex-yue-wu\u002FISI-PPT-Dataset) `2017`\n\n共21,384张图像，包含21,384个以上的文本实例。\n\n任务：文本定位与识别\n\n[`下载`](https:\u002F\u002Fgitlab.com\u002Frex-yue-wu\u002FISI-PPT-Dataset)\n\n#### [`COCO-Text（康奈尔大学计算机视觉组）`](http:\u002F\u002Fvision.cornell.edu\u002Fse3\u002Fcoco-text\u002F)   `2016`\n\n包含63,686张图像，173,589个文本实例，以及3种细粒度的文本属性。\n\n任务：文本定位与识别\n\n[`下载`](https:\u002F\u002Fgithub.com\u002Fandreasveit\u002Fcoco-text)\n\n#### [`合成单词数据集（牛津大学VGG）`](http:\u002F\u002Fwww.robots.ox.ac.uk\u002F~vgg\u002Fdata\u002Ftext\u002F)   `2014`\n\n包含900万张图像，覆盖9万个英文单词。\n\n任务：文本识别、分割\n\n[`下载`](http:\u002F\u002Fwww.robots.ox.ac.uk\u002F~vgg\u002Fdata\u002Ftext\u002Fmjsynth.tar.gz)\n\n#### [`街景房屋号码数据集（SVHN）`](http:\u002F\u002Fufldl.stanford.edu\u002Fhousenumbers)   `2012`\n\n真实世界中的街景数字图像，附有位置和分类标签。\n\n任务：数字定位检测、文本识别\n\n[`下载`](http:\u002F\u002Fufldl.stanford.edu\u002Fhousenumbers)\n\n#### [`IIIT 5K-Words`](http:\u002F\u002Fcvit.iiit.ac.in\u002Fprojects\u002FSceneTextUnderstanding\u002FIIIT5K.html)   `2012`\n\n来自场景文本和原生数字图像的5,000张图片（2,000张用于训练，3,000张用于测试）。\n\n每张图像是场景文本中裁剪出的一个单词图像，并配有不区分大小写的标签。\n\n任务：文本识别\n\n[`下载`](http:\u002F\u002Fcvit.iiit.ac.in\u002Fprojects\u002FSceneTextUnderstanding\u002FIIIT5K-Word_V3.0.tar.gz)\n\n#### 
[`StanfordSynth（斯坦福大学人工智能小组）`](http:\u002F\u002Fcs.stanford.edu\u002Fpeople\u002Ftwangcat\u002F#research)   `2012`\n\n包含62个字符的小型单字符图像（0-9、a-z、A-Z）。\n\n任务：文本识别\n\n[`下载`](http:\u002F\u002Fcs.stanford.edu\u002Fpeople\u002Ftwangcat\u002FICPR2012_code\u002FsyntheticData.tar)\n\n#### [`MSRA 文本检测500数据库（MSRA-TD500）`](http:\u002F\u002Fwww.iapr-tc11.org\u002Fmediawiki\u002Findex.php\u002FMSRA_Text_Detection_500_Database_(MSRA-TD500))   `2012`\n\n包含500张自然图像（图像分辨率从1296×864到1920×1280不等）。\n\n文本为中文、英文或中英文混合。\n\n任务：文本检测\n\n#### [`街景文本（SVT）`](http:\u002F\u002Ftc11.cvc.uab.es\u002Fdatasets\u002FSVT_1)   `2010`\n\n包含350张高分辨率图像（平均尺寸为1260×860），其中100张用于训练，250张用于测试。\n\n仅提供单词级别的边界框，并附有不区分大小写的标签。\n\n任务：文本定位\n\n#### [`KAIST 场景文本数据库`](http:\u002F\u002Fwww.iapr-tc11.org\u002Fmediawiki\u002Findex.php\u002FKAIST_Scene_Text_Database)   `2010`\n\n包含3,000张室内外场景图像，图像中均含有文本。\n\n文本类型包括韩语、英语（数字）以及混合文本（韩语+英语+数字）。\n\n任务：文本定位、分割及识别。\n\n#### [`Chars74k`](http:\u002F\u002Fwww.ee.surrey.ac.uk\u002FCVSSP\u002Fdemos\u002Fchars74k\u002F)   `2009`\n\n包含超过74,000张自然图像，以及一组合成生成的字符图像。\n\n小型单字符图像，涵盖62个字符（0-9、a-z、A-Z）。\n\n任务：文本识别。\n\n#### `ICDAR 基准数据集`\n\n|数据集| 描述 | 竞赛论文 |\n|---|---|----|\n|[ICDAR 2017](http:\u002F\u002Frrc.cvc.uab.es\u002F)| 超过63,686张图像中包含173,589个标注文本区域 |`论文`  [![链接](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fwhitelok_image-text-localization-recognition_readme_529348c4130c.jpg)](https:\u002F\u002Farxiv.org\u002Fabs\u002F1601.07140)|\n|[ICDAR 2015](http:\u002F\u002Frrc.cvc.uab.es\u002F)| 1,000张训练图像和500张测试图像|`论文`  [![链接](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fwhitelok_image-text-localization-recognition_readme_529348c4130c.jpg)](http:\u002F\u002Frrc.cvc.uab.es\u002Ffiles\u002FRobust-Reading-Competition-Karatzas.pdf)|\n|[ICDAR 2013](http:\u002F\u002Fdagdata.cvc.uab.es\u002Ficdar2013competition\u002F)| 229张训练图像和233张测试图像 |`论文`  
[![链接](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fwhitelok_image-text-localization-recognition_readme_529348c4130c.jpg)](http:\u002F\u002Fdagdata.cvc.uab.es\u002Ficdar2013competition\u002Ffiles\u002Ficdar2013_competition_report.pdf)|\n|[ICDAR 2011](http:\u002F\u002Frobustreading.opendfki.de\u002Ftrac\u002F)| 229张训练图像和255张测试图像 |`论文`  [![链接](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fwhitelok_image-text-localization-recognition_readme_529348c4130c.jpg)](http:\u002F\u002Fwww.iapr-tc11.org\u002Farchive\u002Ficdar2011\u002Ffileup\u002FPDF\u002F4520b491.pdf)|\n|[ICDAR 2005](http:\u002F\u002Fwww.iapr-tc11.org\u002Fmediawiki\u002Findex.php\u002FICDAR_2005_Robust_Reading_Competitions)| 1,001张训练图像和489张测试图像 |`论文`  [![链接](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fwhitelok_image-text-localization-recognition_readme_529348c4130c.jpg)](http:\u002F\u002Fwww.academia.edu\u002Fdownload\u002F30700479\u002F10.1.1.96.4332.pdf)|\n|[ICDAR 2003](http:\u002F\u002Fwww.iapr-tc11.org\u002Fmediawiki\u002Findex.php\u002FICDAR_2003_Robust_Reading_Competitions)| 181张训练图像和251张测试图像（单词级别和字符级别） |`论文`  [![链接](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fwhitelok_image-text-localization-recognition_readme_529348c4130c.jpg)](http:\u002F\u002Fciteseerx.ist.psu.edu\u002Fviewdoc\u002Fdownload?doi=10.1.1.332.3461&rep=rep1&type=pdf)|\n\n## 3. 竞赛\n\n- [ICDAR - Robust Reading 竞赛](http:\u002F\u002Frrc.cvc.uab.es\u002F?com=introduction)\n\n## 4. 
在线OCR服务\n\n| 名称 | 描述 |\n|---|---|\n|[Tesseract OCR](https:\u002F\u002Fgithub.com\u002Ftesseract-ocr\u002Ftesseract)| API，免费 |\n|[Online OCR](https:\u002F\u002Fwww.onlineocr.net\u002F)| API，免费 |\n|[Free OCR](http:\u002F\u002Fwww.free-ocr.com\u002F)| API，免费 |\n|[New OCR](http:\u002F\u002Fwww.newocr.com\u002F)| API，免费 |\n|[ABBYY FineReader Online](https:\u002F\u002Ffinereaderonline.com)| 无API，收费 |\n|[超级在线转换工具（中文）](http:\u002F\u002Fwww.wdku.net\u002F)| API，免费 |\n|[在线中文识别](http:\u002F\u002Fchongdata.com\u002Focr\u002F)| API，免费 |\n\n## 5. 博客\n\n- [使用 OpenCV 3 进行场景文本检测](http:\u002F\u002Fdocs.opencv.org\u002F3.0-beta\u002Fmodules\u002Ftext\u002Fdoc\u002Ferfilter.html)\n- [手写数字检测与识别](https:\u002F\u002Fmedium.com\u002F@o.kroeger\u002Frecognize-your-handwritten-numbers-3f007cbe46ff#.8hg7vl6mo)\n- [将 OCR 技术应用于收据识别](http:\u002F\u002Frnd.azoft.com\u002Fapplying-ocr-technology-receipt-recognition\u002F)\n- [用于目标（车牌）检测的卷积神经网络](http:\u002F\u002Frnd.azoft.com\u002Fconvolutional-neural-networks-object-detection\u002F)\n- [使用 Ocropus 从图像中提取文本](http:\u002F\u002Fwww.danvk.org\u002F2015\u002F01\u002F09\u002Fextracting-text-from-an-image-using-ocropus.html)\n- [使用 Tensorflow 进行车牌识别](http:\u002F\u002Fmatthewearl.github.io\u002F2016\u002F05\u002F06\u002Fcnn-anpr\u002F) [`github`](https:\u002F\u002Fgithub.com\u002Fmatthewearl\u002Fdeep-anpr)\n- [利用深度学习破解验证码系统](https:\u002F\u002Fdeepmlblog.wordpress.com\u002F2016\u002F01\u002F03\u002Fhow-to-break-a-captcha-system\u002F) [`报告`](http:\u002F\u002Fweb.stanford.edu\u002F~jurafsky\u002Fburszstein_2010_captcha.pdf) [`github`](https:\u002F\u002Fgithub.com\u002Farunpatala\u002Fcaptcha)\n- [以 96% 的准确率破解 Reddit 验证码](https:\u002F\u002Fdeepmlblog.wordpress.com\u002F2016\u002F01\u002F05\u002Fbreaking-reddit-captcha-with-96-accuracy\u002F) [`github`](https:\u002F\u002Fgithub.com\u002Farunpatala\u002Freddit.captcha)\n- [文字检测与识别资源-1](http:\u002F\u002Fblog.csdn.net\u002Fpeaceinmind\u002Farticle\u002Fdetails\u002F51387367)\n- 
[文字的检测与识别资源-2](http:\u002F\u002Fblog.csdn.net\u002Fu010183397\u002Farticle\u002Fdetails\u002F56497303?locationNum=12&fps=1)\n- iOS 中的场景文本识别 [`博客`](https:\u002F\u002Fmedium.com\u002F@khurram.pak522\u002Fscene-text-recognition-in-ios-11-2d0df8412151) [`github`](https:\u002F\u002Fgithub.com\u002Fkhurram18\u002FSceneTextRecognitioniOS)","# image-text-localization-recognition 快速上手指南\n\n本工具集合了场景文本定位（STL）与识别（TR）的前沿开源资源，涵盖从经典算法到最新深度学习模型。以下指南基于该领域通用的深度学习环境配置及代表性项目（如 CTPN、FOTS、ABCNet 等）的部署流程整理而成。\n\n## 1. 环境准备\n\n在开始之前，请确保您的开发环境满足以下要求。推荐使用 **Linux (Ubuntu 18.04\u002F20.04)** 系统以获得最佳兼容性。\n\n### 系统要求\n- **操作系统**: Linux (推荐) 或 macOS (部分项目支持)，Windows 需使用 WSL2。\n- **GPU**: NVIDIA 显卡 (显存建议 ≥ 4GB)，用于加速推理和训练。\n- **CUDA**: 版本需与 PyTorch\u002FTensorFlow 版本匹配 (通常推荐 CUDA 11.x)。\n\n### 前置依赖\n- **Python**: 3.7 - 3.9 (大多数 OCR 项目在此范围兼容最好)\n- **包管理器**: `pip` 或 `conda`\n- **基础库**: `git`, `cmake`, `build-essential`\n\n**安装系统级依赖 (Ubuntu):**\n```bash\nsudo apt-get update\nsudo apt-get install -y git cmake build-essential libgl1-mesa-glx libglib2.0-0\n```\n\n## 2. 
安装步骤\n\n由于该 README 汇总了多个不同机构（如牛津大学、华南理工大学、复旦大学等）的独立项目，安装时需选择您具体需要的模型仓库。以下以通用的 **Conda 环境搭建** 及 **典型项目安装** 为例。\n\n### 第一步：创建虚拟环境\n建议使用 Conda 隔离环境，避免依赖冲突。\n\n```bash\nconda create -n ocr_env python=3.8\nconda activate ocr_env\n```\n\n### 第二步：安装深度学习框架\n根据您的需求选择 PyTorch 或 TensorFlow。目前大多数 SOTA 模型基于 PyTorch。\n\n**安装 PyTorch (以官方 CUDA 11.8 wheel 源为例):**\n```bash\npip install torch torchvision torchaudio --index-url https:\u002F\u002Fdownload.pytorch.org\u002Fwhl\u002Fcu118\n# 若需纯 CPU 版本：\n# pip install torch torchvision torchaudio --index-url https:\u002F\u002Fdownload.pytorch.org\u002Fwhl\u002Fcpu\n```\n\n### 第三步：克隆并安装具体项目\n从 README 列表中选择一个项目，例如华南理工大学的 **ABCNet** (实时端到端文本检测与识别) 或深圳先进院的 **CTPN**。\n\n**示例：安装 ABCNet**\n```bash\n# 克隆代码\ngit clone https:\u002F\u002Fgithub.com\u002FYuliang-Liu\u002Fbezier_curve_text_spotting.git\ncd bezier_curve_text_spotting\n\n# 安装项目依赖\npip install -r requirements.txt\n# 如果包含 setup.py，通常还需要运行：\npip install -e .\n```\n\n> **提示**: 对于复旦大学的项目 (如 `FudanOCR`)，部分支持中文识别的模型可能提供了专门的中文文档或镜像，请优先查看其仓库内的 `README.zh-cn.md`。\n\n## 3. 基本使用\n\n安装完成后，通常可以通过命令行脚本或 Python API 进行文本定位与识别。以下提供一个通用的 Python 调用示例逻辑（具体参数请参考所选项目的官方文档）。\n\n### 简单推理示例\n\n假设您已下载好预训练模型权重文件 `model.pth`，并准备了一张测试图片 `test.jpg`。\n\n```python\nimport cv2\nimport torch\nfrom model import build_model  # 具体导入路径依项目而定\nfrom utils import preprocess, postprocess\n\n# 1. 加载模型\ndevice = \"cuda\" if torch.cuda.is_available() else \"cpu\"\nmodel = build_model(pretrained=\"model.pth\")\nmodel.to(device).eval()\n\n# 2. 读取并预处理图像\nimage = cv2.imread(\"test.jpg\")\ninput_tensor = preprocess(image).to(device)\n\n# 3. 推理 (定位 + 识别)\nwith torch.no_grad():\n    predictions = model(input_tensor)\n\n# 4. 后处理与结果输出\nboxes, texts = postprocess(predictions, image.shape)\n\nfor box, text in zip(boxes, texts):\n    print(f\"Detected Text: {text}, Location: {box}\")\n    # 可选：在原图上绘制框\n    # cv2.rectangle(image, ...) 
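\n    # 一个最小的绘制示意（非任何项目的官方实现；假设 box 为 [x1, y1, x2, y2] 轴对齐矩形，\n    # 多边形框的具体坐标格式依所选项目而定）：\n    # x1, y1, x2, y2 = map(int, box)\n    # cv2.rectangle(image, (x1, y1), (x2, y2), (0, 255, 0), 2)\n\n# 可选：保存绘制结果\n# cv2.imwrite(\"result.jpg\", image)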
\n```\n\n### 命令行快速测试 (以部分支持 CLI 的项目为例)\n\n许多项目提供了直接的推理脚本：\n\n```bash\npython demo.py --image_path .\u002Ftest.jpg --weights .\u002Fmodel.pth --device cuda\n```\n\n**输出说明：**\n- **STL (定位)**: 返回文本区域的坐标框（矩形或多边形）。\n- **TR (识别)**: 返回坐标框内对应的文本内容。\n\n---\n*注：由于本资源列表包含数十个独立项目，具体模型的输入输出格式、数据增强策略及评估指标请务必查阅对应项目仓库的详细说明。*","某跨境电商运营团队需要每日从海外社交媒体抓取数千张包含商品促销信息的图片，并提取其中的价格与折扣文字以更新数据库。\n\n### 没有 image-text-localization-recognition 时\n- **人工成本极高**：团队成员需手动查看每张截图，肉眼定位文字区域并逐字录入，处理一张图片平均耗时 2 分钟。\n- **复杂场景识别率低**：面对倾斜拍摄、背景杂乱或艺术字体的促销海报，人工转录极易出错，导致价格数据频繁偏差。\n- **流程无法自动化**：由于缺乏统一的文本检测与识别接口，无法将图片解析环节接入现有的自动爬虫 pipeline，数据更新滞后严重。\n- **多语言支持困难**：遇到非拉丁语系（如日文、韩文）的场景文本时，团队缺乏现成模型，只能外包翻译，进一步拉长周期。\n\n### 使用 image-text-localization-recognition 后\n- **全流程自动化**：调用该资源合集中成熟的算法（如 CTPN 或 FOTS），系统可自动定位图片中的文本框并识别内容，单张处理缩短至秒级。\n- **鲁棒性显著提升**：利用深度学习模型对扭曲、模糊及复杂背景的强大适应性，准确提取各类创意字体中的关键促销信息，错误率降低 90%。\n- **无缝集成开发**：直接复用 GitHub 上经过验证的代码实现，快速构建端到端的图像文字提取服务，实时同步最新竞品数据。\n- **全球化覆盖**：基于合集内涵盖的多语言预训练模型，无需额外开发即可支持全球主流语种的文字识别，打破语言壁垒。\n\nimage-text-localization-recognition 通过将分散的顶尖学术成果转化为可调用的工程能力，帮助团队实现了从“人工抄录”到“智能感知”的效率飞跃。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fwhitelok_image-text-localization-recognition_4b3584b1.png","whitelok","Karl Lok (Zhaokai Luo)","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002Fwhitelok_a05f5eac.png",null,"AI Platform","Earth","https:\u002F\u002Fwhitelok.github.io\u002F","https:\u002F\u002Fgithub.com\u002Fwhitelok",959,231,"2026-02-26T09:34:42",5,"","未说明",{"notes":87,"python":85,"dependencies":88},"提供的 README 内容仅为场景文本定位与识别领域的论文和代码资源列表（综述），并非具体某个 AI 工具的安装文档。文中列出了来自牛津大学、复旦大学等多个机构的研究成果链接，但未包含任何关于运行环境、依赖库、硬件需求或安装步骤的具体说明。用户需点击具体的代码仓库链接（如 GitHub）以获取相应项目的详细环境需求。",[],[15,14],[91,92,93,94,95,96,97,98,99,100],"text-recognition","text-detection","convolutional-neural-networks","scene-texts","deep-learning","ocr","text-extraction","deep-learning-algorithms","machine-learning","awesome","2026-03-27T02:49:30.150509","2026-04-20T12:55:29.240625",[],[]]