[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-PruneTruong--DenseMatching":3,"tool-PruneTruong--DenseMatching":61},[4,18,26,36,44,53],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":17},4358,"openclaw","openclaw\u002Fopenclaw","OpenClaw 是一款专为个人打造的本地化 AI 助手，旨在让你在自己的设备上拥有完全可控的智能伙伴。它打破了传统 AI 助手局限于特定网页或应用的束缚，能够直接接入你日常使用的各类通讯渠道，包括微信、WhatsApp、Telegram、Discord、iMessage 等数十种平台。无论你在哪个聊天软件中发送消息，OpenClaw 都能即时响应，甚至支持在 macOS、iOS 和 Android 设备上进行语音交互，并提供实时的画布渲染功能供你操控。\n\n这款工具主要解决了用户对数据隐私、响应速度以及“始终在线”体验的需求。通过将 AI 部署在本地，用户无需依赖云端服务即可享受快速、私密的智能辅助，真正实现了“你的数据，你做主”。其独特的技术亮点在于强大的网关架构，将控制平面与核心助手分离，确保跨平台通信的流畅性与扩展性。\n\nOpenClaw 非常适合希望构建个性化工作流的技术爱好者、开发者，以及注重隐私保护且不愿被单一生态绑定的普通用户。只要具备基础的终端操作能力（支持 macOS、Linux 及 Windows WSL2），即可通过简单的命令行引导完成部署。如果你渴望拥有一个懂你",349277,3,"2026-04-06T06:32:30",[13,14,15,16],"Agent","开发框架","图像","数据工具","ready",{"id":19,"name":20,"github_repo":21,"description_zh":22,"stars":23,"difficulty_score":10,"last_commit_at":24,"category_tags":25,"status":17},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,"2026-04-05T11:01:52",[14,15,13],{"id":27,"name":28,"github_repo":29,"description_zh":30,"stars":31,"difficulty_score":32,"last_commit_at":33,"category_tags":34,"status":17},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",145895,2,"2026-04-08T11:32:59",[14,13,35],"语言模型",{"id":37,"name":38,"github_repo":39,"description_zh":40,"stars":41,"difficulty_score":32,"last_commit_at":42,"category_tags":43,"status":17},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",108111,"2026-04-08T11:23:26",[14,15,13],{"id":45,"name":46,"github_repo":47,"description_zh":48,"stars":49,"difficulty_score":32,"last_commit_at":50,"category_tags":51,"status":17},4721,"markitdown","microsoft\u002Fmarkitdown","MarkItDown 是一款由微软 AutoGen 团队打造的轻量级 Python 工具，专为将各类文件高效转换为 Markdown 格式而设计。它支持 PDF、Word、Excel、PPT、图片（含 OCR）、音频（含语音转录）、HTML 乃至 YouTube 链接等多种格式的解析，能够精准提取文档中的标题、列表、表格和链接等关键结构信息。\n\n在人工智能应用日益普及的今天，大语言模型（LLM）虽擅长处理文本，却难以直接读取复杂的二进制办公文档。MarkItDown 恰好解决了这一痛点，它将非结构化或半结构化的文件转化为模型“原生理解”且 Token 
效率极高的 Markdown 格式，成为连接本地文件与 AI 分析 pipeline 的理想桥梁。此外，它还提供了 MCP（模型上下文协议）服务器，可无缝集成到 Claude Desktop 等 LLM 应用中。\n\n这款工具特别适合开发者、数据科学家及 AI 研究人员使用，尤其是那些需要构建文档检索增强生成（RAG）系统、进行批量文本分析或希望让 AI 助手直接“阅读”本地文件的用户。虽然生成的内容也具备一定可读性，但其核心优势在于为机器",93400,"2026-04-06T19:52:38",[52,14],"插件",{"id":54,"name":55,"github_repo":56,"description_zh":57,"stars":58,"difficulty_score":10,"last_commit_at":59,"category_tags":60,"status":17},4487,"LLMs-from-scratch","rasbt\u002FLLMs-from-scratch","LLMs-from-scratch 是一个基于 PyTorch 的开源教育项目，旨在引导用户从零开始一步步构建一个类似 ChatGPT 的大型语言模型（LLM）。它不仅是同名技术著作的官方代码库，更提供了一套完整的实践方案，涵盖模型开发、预训练及微调的全过程。\n\n该项目主要解决了大模型领域“黑盒化”的学习痛点。许多开发者虽能调用现成模型，却难以深入理解其内部架构与训练机制。通过亲手编写每一行核心代码，用户能够透彻掌握 Transformer 架构、注意力机制等关键原理，从而真正理解大模型是如何“思考”的。此外，项目还包含了加载大型预训练权重进行微调的代码，帮助用户将理论知识延伸至实际应用。\n\nLLMs-from-scratch 特别适合希望深入底层原理的 AI 开发者、研究人员以及计算机专业的学生。对于不满足于仅使用 API，而是渴望探究模型构建细节的技术人员而言，这是极佳的学习资源。其独特的技术亮点在于“循序渐进”的教学设计：将复杂的系统工程拆解为清晰的步骤，配合详细的图表与示例，让构建一个虽小但功能完备的大模型变得触手可及。无论你是想夯实理论基础，还是为未来研发更大规模的模型做准备",90106,"2026-04-06T11:19:32",[35,15,13,14],{"id":62,"github_repo":63,"name":64,"description_en":65,"description_zh":66,"ai_summary_zh":66,"readme_en":67,"readme_zh":68,"quickstart_zh":69,"use_case_zh":70,"hero_image_url":71,"owner_login":72,"owner_name":73,"owner_avatar_url":74,"owner_bio":75,"owner_company":76,"owner_location":77,"owner_email":78,"owner_twitter":79,"owner_website":80,"owner_url":81,"languages":82,"stars":91,"forks":92,"last_commit_at":93,"license":94,"difficulty_score":10,"env_os":95,"env_gpu":96,"env_ram":95,"env_deps":97,"category_tags":102,"github_topics":78,"view_count":32,"oss_zip_url":78,"oss_zip_packed_at":78,"status":17,"created_at":103,"updated_at":104,"faqs":105,"releases":136},5572,"PruneTruong\u002FDenseMatching","DenseMatching","Dense matching library based on PyTorch","DenseMatching 是一个基于 PyTorch 构建的通用密集匹配开源库，旨在为计算机视觉中的图像对齐任务提供从模型训练、评估到部署的一站式解决方案。它主要解决了在不同场景下（如几何匹配、光流估计及语义匹配）难以高效获取精确像素级对应关系的难题，广泛应用于三维重建、视频分析和姿态估计等领域。\n\n该工具特别适合计算机视觉研究人员和算法开发者使用。它不仅内置了 MegaDepth、KITTI、Sintel 等主流验证数据集的处理脚本，还提供了完整的数据采样、真值生成及性能分析框架，极大降低了复现前沿算法的门槛。DenseMatching 的核心亮点在于集成了多种官方实现的顶尖模型，包括 GLU-Net、PDC-Net 以及最新的 PWarpC 等，并直接提供预训练权重供调用。此外，库中创新性地引入了转置卷积的双线性插值初始化策略，显著提升了模型收敛速度与最终精度；其独特的概率一致性学习机制，还能有效处理真实场景中的遮挡与背景杂乱问题，让弱监督下的语义匹配更加鲁可靠。无论是进行学术研究还是工程落地，DenseMatching 都是值得信赖的基础设施。","# Dense Matching\n\nA general dense matching library based on PyTorch.\n\nFor any questions, issues or recommendations, please contact Prune at prune.truong@vision.ee.ethz.ch\n\n\u003Cbr \u002F>\n\n\n**If you're interested in training a probabilistic correspondence network unsupervised (mix of \n[PDCNet](https:\u002F\u002Farxiv.org\u002Fabs\u002F2101.01710) and [WarpC](https:\u002F\u002Farxiv.org\u002Fabs\u002F2104.03308)), \ncheck out our recent work [Refign](https:\u002F\u002Fieeexplore.ieee.org\u002Fdocument\u002F10030987), with [code](https:\u002F\u002Fgithub.com\u002Fbrdav\u002Frefign).** \nIt is also integrated in this code base as UAWarpC. \n\n\n## Updates\n\n06\u002F03\u002F2022: We found that significantly better performance and reduced training time are obtained when \n**initializing with bilinear interpolation weights the weights of the transposed convolutions** used to upsample \nthe predicted flow fields between the different pyramid levels. We have integrated this initialization as the default. We might provide updated pre-trained weights as well.\nAlternatively, one can directly simply use bilinear interpolation for upsampling with similar \n(maybe a bit better) performance, which is also now an option proposed. 
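For reference, the classic way to make a transposed convolution reproduce exact bilinear upsampling is the FCN-style kernel construction below. This is a generic sketch of the technique, not necessarily the exact code used in this repo; the `bilinear_kernel` helper name is illustrative.

```python
import torch
import torch.nn as nn

def bilinear_kernel(channels: int, kernel_size: int) -> torch.Tensor:
    """Build a (channels, channels, k, k) weight that upsamples each
    channel independently with bilinear interpolation."""
    factor = (kernel_size + 1) // 2
    center = factor - 1 if kernel_size % 2 == 1 else factor - 0.5
    pos = torch.arange(kernel_size, dtype=torch.float32)
    filt_1d = 1 - torch.abs(pos - center) / factor   # 1D triangle filter
    filt_2d = filt_1d[:, None] * filt_1d[None, :]    # separable 2D kernel
    weight = torch.zeros(channels, channels, kernel_size, kernel_size)
    for c in range(channels):                        # one kernel per channel
        weight[c, c] = filt_2d
    return weight

# x2 upsampling of a 2-channel flow field (kernel 4, stride 2, padding 1).
up = nn.ConvTranspose2d(2, 2, kernel_size=4, stride=2, padding=1, bias=False)
with torch.no_grad():
    up.weight.copy_(bilinear_kernel(2, 4))

flow = torch.randn(1, 2, 32, 32)
print(up(flow).shape)  # torch.Size([1, 2, 64, 64])
```

Initialized this way, the upsampler behaves sensibly from the very first iteration instead of starting from random weights, which is consistent with the faster convergence reported above.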
\n\n\n## Highlights\n\nLibraries for implementing, training and evaluating dense matching networks. It includes\n* Common dense matching **validation datasets** for **geometric matching** (MegaDepth, RobotCar, ETH3D, HPatches), \n**optical flow** (KITTI, Sintel) and **semantic matching** (TSS, PF-Pascal, PF-Willow, Spair). \n* Scripts to **analyse** network performance and obtain standard performance scores for matching and pose estimation.\n* General building blocks, including deep networks, optimization, feature extraction and utilities.\n* **General training framework** for training dense matching networks with\n    * Common training datasets for matching networks.\n    * Functions to generate random image pairs and their corresponding ground-truth flow, as well as to add \n    moving objects and modify the flow accordingly. \n    * Functions for data sampling, processing etc.\n    * And much more...\n\n* **Official implementation** of GLU-Net (CVPR 2020), GLU-Net-GOCor (NeurIPS 2020), PWC-Net-GOCor (NeurIPS 2020), \nPDC-Net (CVPR 2021), WarpC models (ICCV 2021), PWarpC models (CVPR 2022) including trained models and respective results.\n\n\u003Cbr \u002F>\n\n## Dense Matching Networks \n\nThe repo contains the implementation of the following matching models. \nWe provide pre-trained model weights, data preparation, evaluation commands, and results for each dataset and method. \nIf you find this library useful, please consider citing the relevant following research papers.  \n\n### [6] PWarpC: Probabilistic Warp Consistency for Weakly-Supervised Semantic Correspondences. (CVPR 2022)\nAuthors: [Prune Truong](https:\u002F\u002Fprunetruong.com\u002F), [Martin Danelljan](https:\u002F\u002Fmartin-danelljan.github.io\u002F), \n[Fisher Yu](https:\u002F\u002Fwww.yf.io\u002F), Luc Van Gool\u003Cbr \u002F>\n\n\\[[Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2203.04279)\\]\n\\[[Website](https:\u002F\u002Fprunetruong.com\u002Fresearch\u002Fpwarpc)\\]\n\\[[Poster](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1lP5E3BNqdKJL1q-YsQ-C7rOwkcb5S63W\u002Fview?usp=sharing)\\]\n\\[[Video](https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=I2KtnvI8xZU)\\]\n\\[[Citation](https:\u002F\u002Fdblp.org\u002Frec\u002Fconf\u002Fcvpr\u002FTruongDYG22.html?view=bibtex)\\]\n\n\u003Cdetails>\n\n![alt text](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FPruneTruong_DenseMatching_readme_c75787f56290.png)\n\nWe propose Probabilistic Warp Consistency, a weakly-supervised learning objective for semantic matching. \nOur approach directly supervises the dense matching scores predicted by the network, encoded as a conditional \nprobability distribution. We first construct an image triplet by applying a known warp to one of the images in \na pair depicting different instances of the same object class. Our probabilistic learning objectives are then \nderived using the constraints arising from the resulting image triplet. We further account for occlusion and \nbackground clutter present in real image pairs by extending our probabilistic output space with a learnable \nunmatched state. To supervise it, we design an objective between image pairs depicting different object classes. \nWe validate our method by applying it to four recent semantic matching architectures. Our weakly-supervised approach \nsets a new state-of-the-art on four challenging semantic matching benchmarks. 
Lastly, we demonstrate that our \nobjective also brings substantial improvements in the strongly-supervised regime, when combined with keypoint annotations. \n\n\u003C\u002Fdetails>\n\n### [5] PDC-Net+: Enhanced Probabilistic Dense Correspondence Network. (TPAMI 2023)\nAuthors: [Prune Truong](https:\u002F\u002Fprunetruong.com\u002F), [Martin Danelljan](https:\u002F\u002Fmartin-danelljan.github.io\u002F), Radu Timofte, Luc Van Gool \u003Cbr \u002F>\n\n\\[[Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2109.13912)\\]\n\\[[Website](https:\u002F\u002Fprunetruong.com\u002Fresearch\u002Fpdcnet+)\\]\n\\[[Citation](https:\u002F\u002Fdblp.org\u002Frec\u002Fjournals\u002Fcorr\u002Fabs-2109-13912.html?view=bibtex)\\]\n\n\n\n### [4] WarpC: Warp Consistency for Unsupervised Learning of Dense Correspondences. (ICCV 2021 - ORAL)\nAuthors: [Prune Truong](https:\u002F\u002Fprunetruong.com\u002F), [Martin Danelljan](https:\u002F\u002Fmartin-danelljan.github.io\u002F), \n[Fisher Yu](https:\u002F\u002Fwww.yf.io\u002F), Luc Van Gool\u003Cbr \u002F>\n\n\\[[Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2104.03308)\\]\n\\[[Website](https:\u002F\u002Fprunetruong.com\u002Fresearch\u002Fwarpc)\\]\n\\[[Poster](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1PCXkjxvVsjHAbYzsBtgKWLO1uE6oGP6p\u002Fview?usp=sharing)\\]\n\\[[Slides](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1mVpLBW55nlNJZBsvxkBCti9_KhH1r9V_\u002Fview?usp=sharing)\\]\n\\[[Video](https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=IsMotj7-peA)\\]\n\\[[Citation](https:\u002F\u002Fdblp.org\u002Frec\u002Fconf\u002Ficcv\u002FTruongDYG21.html?view=bibtex)\\]\n\n\n\u003Cdetails>\n\n\nWarp Consistency Graph            |  Results \n:-------------------------:|:-------------------------:\n![alt text](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FPruneTruong_DenseMatching_readme_b8290b3039c5.png) |  ![](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FPruneTruong_DenseMatching_readme_abdcee91c337.png) \n\n\n\n\nThe key challenge in learning dense correspondences lies in the lack of ground-truth matches for real image pairs. \nWhile photometric consistency losses provide unsupervised alternatives, they struggle with large appearance changes, \nwhich are ubiquitous in geometric and semantic matching tasks. Moreover, methods relying on synthetic training pairs \noften suffer from poor generalisation to real data.\nWe propose Warp Consistency, an unsupervised learning objective for dense correspondence regression. \nOur objective is effective even in settings with large appearance and view-point changes. Given a pair of \nreal images, we first construct an image triplet by applying a randomly sampled warp to one of the original images. \nWe derive and analyze all flow-consistency constraints arising between the triplet. From our observations and \nempirical results, we design a general unsupervised objective employing two of the derived constraints.\nWe validate our warp consistency loss by training three recent dense correspondence networks for the geometric and \nsemantic matching tasks. Our approach sets a new state-of-the-art on several challenging benchmarks, including MegaDepth, \nRobotCar and TSS. \n\n\u003C\u002Fdetails>\n\n\n### [3] PDC-Net: Learning Accurate Correspondences and When to Trust Them. 
(CVPR 2021 - ORAL)\nAuthors: [Prune Truong](https:\u002F\u002Fprunetruong.com\u002F), [Martin Danelljan](https:\u002F\u002Fmartin-danelljan.github.io\u002F), Luc Van Gool, Radu Timofte\u003Cbr \u002F>\n\n\\[[Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2101.01710)\\]\n\\[[Website](https:\u002F\u002Fprunetruong.com\u002Fresearch\u002Fpdcnet)\\]\n\\[[Poster](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F18ya__AdEIgZyix8dXuRpJ15tdrpbMUsB\u002Fview?usp=sharing)\\]\n\\[[Slides](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1zUQmpmVp6WSa_psuI3KFvKVrNyJE-beG\u002Fview?usp=sharing)\\]\n\\[[Video](https:\u002F\u002Fyoutu.be\u002FbX0rEaSf88o)\\]\n\\[[Citation](https:\u002F\u002Fdblp.org\u002Frec\u002Fconf\u002Fcvpr\u002FTruongDGT21.html?view=bibtex)\\]\n\n\n\u003Cdetails>\n\n\n![alt text](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FPruneTruong_DenseMatching_readme_c39d6c677e8b.png)\n\n\nDense flow estimation is often inaccurate in the case of large displacements or homogeneous regions. For most \napplications and down-stream tasks, such as pose estimation, image manipulation, or 3D reconstruction, it is \ncrucial to know **when and where** to trust the estimated matches. \nIn this work, we aim to estimate a dense flow field relating two images, coupled with a robust pixel-wise \nconfidence map indicating the reliability and accuracy of the prediction. We develop a flexible probabilistic \napproach that jointly learns the flow prediction and its uncertainty. In particular, we parametrize the predictive \ndistribution as a constrained mixture model, ensuring better modelling of both accurate flow predictions and outliers. \nMoreover, we develop an architecture and training strategy tailored for robust and generalizable uncertainty \nprediction in the context of self-supervised training. \n\n\u003C\u002Fdetails>\n\n### [2] GOCor: Bringing Globally Optimized Correspondence Volumes into Your Neural Network. (NeurIPS 2020)\nAuthors: [Prune Truong](https:\u002F\u002Fprunetruong.com\u002F) *, [Martin Danelljan](https:\u002F\u002Fmartin-danelljan.github.io\u002F) *, Luc Van Gool, Radu Timofte\u003Cbr \u002F>\n\n\\[[Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2009.07823)\\]\n\\[[Website](https:\u002F\u002Fprunetruong.com\u002Fresearch\u002Fgocor)\\]\n\\[[Video](https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=V22MyFChBCs)\\]\n\\[[Citation](https:\u002F\u002Fdblp.org\u002Frec\u002Fconf\u002Fnips\u002FTruongDGT20.html?view=bibtex)\\]\n\n\n\u003Cdetails>\n\nThe feature correlation layer serves as a key neural network module in numerous computer vision problems that\ninvolve dense correspondences between image pairs. It predicts a correspondence volume by evaluating dense scalar products \nbetween feature vectors extracted from pairs of locations in two images. However, this point-to-point feature comparison \nis insufficient when disambiguating multiple similar regions in an image, severely affecting the performance of \nthe end task. \n**This work proposes GOCor, a fully differentiable dense matching module, acting as a direct replacement to \nthe feature correlation layer.** The correspondence volume generated by our module is the result of an internal \noptimization procedure that explicitly accounts for similar regions in the scene. Moreover, our approach is \ncapable of effectively learning spatial matching priors to resolve further matching ambiguities. 
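For context, the plain global feature correlation layer that GOCor replaces can be written in a few lines: dense scalar products between all pairs of feature-map locations. This is a generic sketch, with my own function name and an optional scaling choice, not the repo's API.

```python
import torch

def global_correlation(f_ref: torch.Tensor, f_query: torch.Tensor) -> torch.Tensor:
    """Plain correlation layer: dot products between every pair of locations.
    f_ref, f_query: (B, C, H, W) feature maps -> (B, H*W, H, W) volume,
    one similarity map over the query for each reference location."""
    b, c, h, w = f_ref.shape
    ref = f_ref.flatten(2)                               # (B, C, H*W)
    query = f_query.flatten(2)                           # (B, C, H*W)
    corr = torch.einsum('bcn,bcm->bnm', ref, query)      # all-pairs scalar products
    return corr.view(b, h * w, h, w) / c ** 0.5          # optional scaling

vol = global_correlation(torch.randn(1, 64, 16, 16), torch.randn(1, 64, 16, 16))
print(vol.shape)  # torch.Size([1, 256, 16, 16])
```

GOCor keeps the same input/output interface but produces the volume through an internal optimization instead of a single point-to-point comparison, which is what lets it suppress responses from similar-looking regions.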
\n\n\n![alt text](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FPruneTruong_DenseMatching_readme_2c8fcbc55192.jpg)\n\n\u003C\u002Fdetails>\n\n\n### [1] GLU-Net: Global-Local Universal Network for dense flow and correspondences (CVPR 2020 - ORAL).\nAuthors: [Prune Truong](https:\u002F\u002Fprunetruong.com\u002F), [Martin Danelljan](https:\u002F\u002Fmartin-danelljan.github.io\u002F) and Radu Timofte \u003Cbr \u002F>\n\\[[Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F1912.05524)\\]\n\\[[Website](https:\u002F\u002Fprunetruong.com\u002Fresearch\u002Fglu-net)\\]\n\\[[Poster](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1pS_OMZ83EG-oalD-30vDa3Ru49GWi-Ky\u002Fview?usp=sharing)\\]\n\\[[Oral Video](https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=xB2gNx8f8Xc&feature=emb_title)\\]\n\\[[Teaser Video](https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=s5OUdkM9QLo)\\]\n\\[[Citation](https:\u002F\u002Fdblp.org\u002Frec\u002Fconf\u002Fcvpr\u002FTruongDT20.html?view=bibtex)\\]\n\n\u003Cdetails>\n\n\n![alt text](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FPruneTruong_DenseMatching_readme_ba67441a2374.png)\n\n\u003C\u002Fdetails>\n\n\u003Cbr \u002F>\n\u003Cbr \u002F>\n\n## Pre-trained weights\n\nThe pre-trained models can be found in the [model zoo](https:\u002F\u002Fgithub.com\u002FPruneTruong\u002FDenseMatching\u002Fblob\u002Fmain\u002FMODEL_ZOO.md)\n\n\n\u003Cbr \u002F>\n\n## Table of Content\n\n1. [Installation](#installation)\n2. [Test on your own image pairs!](#test)\n    1. [Test on image pairs](#test1)\n    2. [Various demos](#test2)\n3. [Overview](#overview)\n4. [Benchmarks and results](#results)\n    1. [Correspondence evaluation](#correspondence_eval)\n        1. [MegaDepth](#megadepth)\n        2. [RobotCar](#robotcar)\n        3. [ETH3D](#eth3d)\n        4. [HPatches](#hpatches)\n        5. [KITTI](#kitti)\n        6. [Sintel](#sintel)\n        7. [TSS](#tss)\n        8. [PF-Pascal](#pfpascal)\n        9. [PF-Willow](#pfwillow)\n        10. [Spair-71k](#spair)\n        11. [Caltech-101](#caltech)\n    2. [Pose estimation](#pose_estimation)\n        1. [YFCC100M](#yfcc)\n        2. [ScanNet](#scannet)\n    3. [Sparse evaluation on HPatches](#sparse_hp)\n5. [Training](#training)\n6. [Acknowledgement](#acknowledgement)\n7. [Changelog](#changelog)\n\n\n\u003Cbr \u002F>\n\n## 1. Installation \u003Ca name=\"installation\">\u003C\u002Fa>\n\nInference runs for torch version >= 1.0\n\n* Create and activate conda environment with Python 3.x\n\n```bash\nconda create -n dense_matching_env python=3.7\nconda activate dense_matching_env\n```\n\n* Install all dependencies (except for cupy, see below) by running the following command:\n```bash\npip install numpy opencv-python torch torchvision matplotlib imageio jpeg4py scipy pandas tqdm gdown pycocotools timm\n```\n\n**Note**: CUDA is required to run the code. Indeed, the correlation layer is implemented in CUDA using CuPy, \nwhich is why CuPy is a required dependency. It can be installed using pip install cupy or alternatively using one of the \nprovided binary packages as outlined in the CuPy repository. The code was developed using Python 3.7 & PyTorch 1.0 & CUDA 9.0, \nwhich is why I installed cupy for cuda90. For another CUDA version, change accordingly. \n\n```bash\npip install cupy-cuda90 --no-cache-dir \n```\n\n\n* This repo includes [GOCor](https:\u002F\u002Farxiv.org\u002Fabs\u002F2009.07823) as git submodule. 
* This repo includes [GOCor](https://arxiv.org/abs/2009.07823) as a git submodule. Pull the submodules with:

```bash
git submodule update --init --recursive
git submodule update --recursive --remote
```

* Create admin/local.py by running the following command, and update the paths to the datasets. We provide an example admin/local_example.py where all datasets are stored in data/.

```bash
python -c "from admin.environment import create_default_local_file; create_default_local_file()"
```

* **Download the pre-trained model weights** with the command `bash assets/download_pre_trained_models.sh`. See more in the [model zoo](https://github.com/PruneTruong/DenseMatching/blob/main/MODEL_ZOO.md).

<br />

## 2. Test on your own image pairs! <a name="test"></a>

Possible model choices are:
* SFNet, PWarpCSFNet_WS, PWarpCSFNet_SS, NCNet, PWarpCNCNet_WS, PWarpCNCNet_SS, CATs, PWarpCCATs_SS, CATs_ft_features, PWarpCCATs_ft_features_SS,
* UAWarpC (no pre-trained models available),
* WarpCGLUNet, GLUNet_star, WarpCSemanticGLUNet,
* PDCNet_plus, PDCNet, GLUNet_GOCor_star,
* SemanticGLUNet, GLUNet, GLUNet_GOCor, PWCNet, PWCNet_GOCor

Possible pre-trained model choices are: static, dynamic, chairs_things, chairs_things_ft_sintel, megadepth, megadepth_stage1, pfpascal, spair

<br />

<details>
  <summary><b>Note on PDC-Net and PDC-Net+ inference options</b></summary>

PDC-Net and PDC-Net+ have multiple alternative inference options. If the model is PDC-Net, add the options:
* --confidence_map_R, for computation of the confidence map p_r; default is 1.0
* --multi_stage_type, one of
    * 'D' (or 'direct')
    * 'H' (or 'homography_from_quarter_resolution_uncertainty')
    * 'MS' (or 'multiscale_homo_from_quarter_resolution_uncertainty')
* --ransac_thresh, used for the homography and multiscale multi-stage types; default is 1.0
* --mask_type, for thresholding the estimated confidence map and using the confident matches for internal homography estimation, for the homography and multiscale multi-stage types; default is proba_interval_1_above_5
* --homography_visibility_mask; default is True
* --scaling_factors, used for multi-scale; defaults are \[0.5, 0.6, 0.88, 1, 1.33, 1.66, 2\]

Use direct ('D') when image pairs only show limited view-point changes (for example consecutive frames of a video, as in the optical flow task). For larger view-point changes, use homography ('H') or multi-scale ('MS').

For example, to run PDC-Net or PDC-Net+ with homography, add at the end of the command:

```bash
PDCNet --multi_stage_type H --mask_type proba_interval_1_above_10
```

</details>

### Test on a specific image pair <a name="test1"></a>

You can test the networks on a pair of images using test_models.py and the provided trained model weights. First choose the model and the pre-trained weights to use. The inputs are the paths to the query and reference images. The images are passed to the network, which outputs the flow field relating the reference to the query image. The query is then warped according to the estimated flow, and a figure is saved.
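The warping step at the end is the standard backward-sampling recipe built on `torch.nn.functional.grid_sample`, assuming the convention that the flow, defined on the reference grid, points to the corresponding query pixels. The sketch below is a generic version of that step; the `warp` helper is illustrative, not necessarily identical to the repo's utils_flow implementation.

```python
import torch
import torch.nn.functional as F

def warp(image: torch.Tensor, flow: torch.Tensor) -> torch.Tensor:
    """Warp image (B, C, H, W) with flow (B, 2, H, W): output pixel (x, y)
    is sampled from the input at (x + u, y + v), with u = flow[:, 0]."""
    b, _, h, w = image.shape
    yy, xx = torch.meshgrid(torch.arange(h), torch.arange(w), indexing='ij')
    base = torch.stack((xx, yy), dim=0).float().to(image.device)  # (2, H, W), x first
    src = base.unsqueeze(0) + flow                                # sampling locations
    # Normalize to [-1, 1] as required by grid_sample.
    src_x = 2.0 * src[:, 0] / (w - 1) - 1.0
    src_y = 2.0 * src[:, 1] / (h - 1) - 1.0
    grid = torch.stack((src_x, src_y), dim=-1)                    # (B, H, W, 2)
    return F.grid_sample(image, grid, align_corners=True)

query = torch.rand(1, 3, 64, 64)
flow = torch.zeros(1, 2, 64, 64)   # zero flow -> identity warp
assert torch.allclose(warp(query, flow), query, atol=1e-5)
```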
<br />

For this pair of MegaDepth images (provided to check that the code is working properly) and using **PDCNet** (MS) trained on the megadepth dataset, the output is:

```bash
python test_models.py --model PDCNet --pre_trained_model megadepth --path_query_image images/piazza_san_marco_0.jpg --path_reference_image images/piazza_san_marco_1.jpg --save_dir evaluation/ PDCNet --multi_stage_type MS --mask_type proba_interval_1_above_10
```

Additional optional argument: --path_to_pre_trained_models (default is pre_trained_models/).

![alt text](https://oss.gittoolsai.com/images/PruneTruong_DenseMatching_readme_4e8c25db8947.png)

<br />

Using **GLU-Net-GOCor** trained on the dynamic dataset, the output for this image pair of ETH3D is:

```bash
python test_models.py --model GLUNet_GOCor --pre_trained_model dynamic --path_query_image images/eth3d_query.png --path_reference_image images/eth3d_reference.png --save_dir evaluation/
```

![alt text](https://oss.gittoolsai.com/images/PruneTruong_DenseMatching_readme_84d452de4deb.png)

<br />

For baseline **GLU-Net**, the output is instead:

```bash
python test_models.py --model GLUNet --pre_trained_model dynamic --path_query_image images/eth3d_query.png --path_reference_image images/eth3d_reference.png --save_dir evaluation/
```

![alt text](https://oss.gittoolsai.com/images/PruneTruong_DenseMatching_readme_c4965f2cba89.png)

<br />

And for **PWC-Net-GOCor** and baseline **PWC-Net**:

```bash
python test_models.py --model PWCNet_GOCor --pre_trained_model chairs_things --path_query_image images/kitti2015_query.png --path_reference_image images/kitti2015_reference.png --save_dir evaluation/
```

![alt text](https://oss.gittoolsai.com/images/PruneTruong_DenseMatching_readme_f2325b585b16.png)

<br />

```bash
python test_models.py --model PWCNet --pre_trained_model chairs_things --path_query_image images/kitti2015_query.png --path_reference_image images/kitti2015_reference.png --save_dir evaluation/
```

![alt text](https://oss.gittoolsai.com/images/PruneTruong_DenseMatching_readme_dd04c7d6e33f.png)

<br />

## Demos with videos <a name="test2"></a>

* **demos/demo_single_pair.ipynb**: Play around with our models on different image pairs, compute the flow field relating an image pair, and visualize the warped images and confident matches.

* **demos/demo_pose_estimation_and_reconstruction.ipynb**: Play around with our models on different image pairs (with intrinsic camera parameters known), compute the flow field and confidence map, then the relative pose (a minimal pose-recovery sketch follows this list).

* Run the **online demo with a webcam or video** to reproduce the result shown in the GIF below. We compute the flow field between the target (middle) and the source (left). We also plot the 1000 most confident matches. The warped source is shown on the right (and should resemble the middle). Only the regions for which the matches were predicted as confident are visible.
![alt text](https://oss.gittoolsai.com/images/PruneTruong_DenseMatching_readme_1589010d9353.gif)

We modify the utils code from [SuperGlue](https://github.com/magicleap/SuperGluePretrainedNetwork), so you need to adhere to their license in order to use it. Then run:

```bash
bash demo/run_demo_confident_matches.sh
```

* From a video, **warp each frame of the video (left) to the middle frame**, as in the GIF below. The warped frame is shown on the right (and should resemble the middle). Only the regions for which the matches were predicted as confident are visible.

![alt text](https://oss.gittoolsai.com/images/PruneTruong_DenseMatching_readme_885f8f268a72.gif)

We modify the utils code from [SuperGlue](https://github.com/magicleap/SuperGluePretrainedNetwork), so you need to adhere to their license in order to use it. Then run:

```bash
bash demo/run_demo_warping_videos.sh
```

* More to come!
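As referenced in the pose-estimation demo above, going from confident matches plus known intrinsics to a relative pose is typically done via the essential matrix. A minimal OpenCV sketch, assuming (N, 2) arrays of matched pixel coordinates; the function name and RANSAC defaults here are mine, not the notebook's exact code:

```python
import cv2
import numpy as np

def relative_pose(pts_ref: np.ndarray, pts_query: np.ndarray,
                  K: np.ndarray, thresh: float = 1.0):
    """Estimate relative pose (R, t up to scale) from matched pixel
    coordinates, assuming both views share the intrinsic matrix K."""
    E, inliers = cv2.findEssentialMat(pts_ref, pts_query, K,
                                      method=cv2.RANSAC, prob=0.999,
                                      threshold=thresh)
    # Cheirality check disambiguates the four (R, t) decompositions of E.
    _, R, t, _ = cv2.recoverPose(E, pts_ref, pts_query, K, mask=inliers)
    return R, t
```

In practice the matches would be the pixels where the predicted confidence map is above the chosen threshold (e.g. the proba_interval_1_above_10 mask used throughout this README).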
## 3. Overview <a name="overview"></a>

The framework consists of the following sub-modules.

* training:
    * actors: Contains the actor classes for the different trainings. The actor class is responsible for passing the input data through the network and calculating the losses. It also contains pre-processing classes that turn batch tensor inputs into the inputs needed for training the network.
    * trainers: The main class which runs the training.
    * losses: Contains the loss classes.
* train_settings: Contains settings files, specifying the training of a network.
* admin: Includes functions for loading networks, tensorboard etc., and also contains environment settings.
* datasets: Contains the integration of a number of datasets. Additionally, it includes modules to generate synthetic image pairs and their corresponding ground-truth flow, as well as to add independently moving objects and modify the flow accordingly.
* utils_data: Contains functions for processing data, e.g. loading images, data augmentation, sampling frames.
* utils_flow: Contains functions for working with flow fields, e.g. converting to mappings, warping an array according to a flow, as well as visualization tools.
* third_party: External libraries needed for training, added as submodules.
* models: Contains different layers and network definitions.
* validation: Contains functions to evaluate and analyze the performance of the networks in terms of predicted flow and uncertainty.

## 4. Benchmark and results <a name="results"></a>

All paths to the datasets must be provided in the file admin/local.py. We provide an example admin/local_example.py where all datasets are stored in data/. You need to update the paths in admin/local.py before running the evaluation.

<details>
  <summary><b>Note on PDC-Net and PDC-Net+ inference options</b></summary>

PDC-Net and PDC-Net+ have multiple alternative inference options. If the model is PDC-Net, add the options:
* --confidence_map_R, for computation of the confidence map p_r; default is 1.0
* --multi_stage_type, one of
    * 'D' (or 'direct')
    * 'H' (or 'homography_from_quarter_resolution_uncertainty')
    * 'MS' (or 'multiscale_homo_from_quarter_resolution_uncertainty')
* --ransac_thresh, used for the homography and multiscale multi-stage types; default is 1.0
* --mask_type, for thresholding the estimated confidence map and using the confident matches for internal homography estimation, for the homography and multiscale multi-stage types; default is proba_interval_1_above_5
* --homography_visibility_mask; default is True
* --scaling_factors, used for multi-scale; defaults are \[0.5, 0.6, 0.88, 1, 1.33, 1.66, 2\]

For example, to run PDC-Net or PDC-Net+ with homography, add at the end of the command:

```bash
PDCNet --multi_stage_type H --mask_type proba_interval_1_above_10
```

</details>

<details>
  <summary><b>Note on reproducibility</b></summary>

Results using PDC-Net with the multi-stage (homography_from_quarter_resolution_uncertainty, H) or multi-scale (multiscale_homo_from_quarter_resolution_uncertainty, MS) types employ RANSAC internally. Results may therefore vary a bit, but should remain within 1-2%. For pose estimation, we also compute the pose with RANSAC, which likewise leads to some variability in the results.

</details>

### 4.1. Correspondence evaluation <a name="correspondence_eval"></a>

Metrics are computed with:

```bash
python -u eval_matching.py --dataset dataset_name --model model_name --pre_trained_models pre_trained_model_name --optim_iter optim_step --local_optim_iter local_optim_iter --save_dir path_to_save_dir
```

Optional argument --path_to_pre_trained_models:
   * default is pre_trained_models/
   * if it is a path to a directory, it is the directory containing the model weights; the path to the weights is then path_to_pre_trained_models + model_name + '_' + pre_trained_model_name
   * if it is a path directly to a checkpoint, it is used as the path to the model weights, and pre_trained_model_name is only used as the name under which the metrics are saved.
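The correspondence benchmarks below report AEPE (average end-point error) and PCK-T (percentage of correct keypoints within T pixels). Their standard definitions are compact; the sketch below shows them for dense flows with a validity mask, and is illustrative rather than the repo's exact evaluation code:

```python
import torch

def aepe_and_pck(flow_pred: torch.Tensor, flow_gt: torch.Tensor,
                 valid: torch.Tensor, thresholds=(1.0, 3.0, 5.0)):
    """flow_pred, flow_gt: (B, 2, H, W); valid: (B, H, W) bool mask of
    pixels with ground truth. Returns AEPE and PCK-T (%) for each T."""
    epe = torch.norm(flow_pred - flow_gt, dim=1)   # per-pixel end-point error
    epe = epe[valid]                               # keep annotated pixels only
    aepe = epe.mean().item()
    pck = {t: 100.0 * (epe <= t).float().mean().item() for t in thresholds}
    return aepe, pck
```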
<details>
  <summary><b>MegaDepth</b><a name="megadepth"></a></summary>

**Data preparation**: We use the test set provided in [RANSAC-Flow](https://github.com/XiSHEN0220/RANSAC-Flow/tree/master/evaluation/evalCorr). It is composed of 1600 pairs and also includes a csv file ('test1600Pairs.csv') containing the names of the image pairs to evaluate and the corresponding ground-truth correspondences. Download everything with:

```bash
bash assets/download_megadepth_test.sh
```

The resulting file structure is the following:

```bash
megadepth_test_set/
└── MegaDepth/
    └── Test/
        └── test1600Pairs/
        └── test1600Pairs.csv
```

<br /><br />

**Evaluation**: After updating the paths of 'megadepth' and 'megadepth_csv' in admin/local.py, evaluation is run with:

```bash
python eval_matching.py --dataset megadepth --model PDCNet --pre_trained_models megadepth --optim_iter 3 --local_optim_iter 7 --save_dir path_to_save_dir PDCNet --multi_stage_type MS
```

Similar results should be obtained:

| Model | Pre-trained model type | PCK-1 (%) | PCK-3 (%) | PCK-5 (%) |
|----------------|-----------------------------|-------|-------|-------|
| GLU-Net (this repo) | static (CityScape-DPED-ADE) | 29.51 | 50.67 | 56.12 |
| GLU-Net (this repo) | dynamic | 21.59 | 52.27 | 61.91 |
| GLU-Net (paper) | dynamic | 21.58 | 52.18 | 61.78 |
| GLU-Net-GOCor (this repo) | static (CityScape-DPED-ADE) | 32.24 | 52.51 | 58.90 |
| GLU-Net-GOCor (this repo) | dynamic | 37.23 | **61.25** | **68.17** |
| GLU-Net-GOCor (paper) | dynamic | **37.28** | 61.18 | 68.08 |
|----------------|-----------------------------|-------|-------|-------|
| GLU-Net-GOCor* (paper) | megadepth | 57.77 | 78.61 | 82.24 |
| PDC-Net (D) (this repo) | megadepth | 68.97 | 84.03 | 85.68 |
| PDC-Net (H) (paper) | megadepth | 70.75 | 86.51 | 88.00 |
| PDC-Net (MS) (paper) | megadepth | 71.81 | 89.36 | 91.18 |
| PDC-Net+ (D) (paper) | megadepth | 72.41 | 86.70 | 88.13 |
| PDC-Net+ (H) (paper) | megadepth | 73.92 | 89.21 | 90.48 |
| PDC-Net+ (MS) (paper) | megadepth | **74.51** | **90.69** | **92.10** |
|----------------|-----------------------------|-------|-------|-------|
| GLU-Net* (paper) | megadepth | 38.50 | 54.66 | 59.60 |
| GLU-Net* (this repo) | megadepth | 38.62 | 54.70 | 59.76 |
| WarpC-GLU-Net (paper) | megadepth | 50.61 | 73.80 | 78.61 |
| WarpC-GLU-Net (this repo) | megadepth | **50.77** | **73.91** | **78.73** |

</details>

<details>
  <summary><b>RobotCar</b><a name="robotcar"></a></summary>

**Data preparation**: Images can be downloaded from the [Visual Localization Challenge](https://www.visuallocalization.net/datasets/) (at the bottom of the site), or more precisely [here](https://www.dropbox.com/sh/ql8t2us433v8jej/AAB0wfFXs0CLPqSiyq0ukaKva/ROBOTCAR?dl=0&subfolder_nav_tracking=1). The csv file with the ground-truth correspondences can be downloaded from [here](https://drive.google.com/file/d/16mZLUKsjceAt1RTW1KLckX0uCR3O4x5Q/view).
The file structure should be the following:

```bash
RobotCar
├── img/
└── test6511.csv
```

<br /><br />

**Evaluation**: After updating the paths of 'robotcar' and 'robotcar_csv' in admin/local.py, evaluation is run with:

```bash
python eval_matching.py --dataset robotcar --model PDCNet --pre_trained_models megadepth --optim_iter 3 --local_optim_iter 7 --save_dir path_to_save_dir PDCNet --multi_stage_type MS
```

Similar results should be obtained:

| Model | Pre-trained model type | PCK-1 (%) | PCK-3 (%) | PCK-5 (%) |
|----------------|-----------------------------|-------|-------|-------|
| GLU-Net (paper) | static (CityScape-DPED-ADE) | 2.30 | 17.15 | 33.87 |
| GLU-Net-GOCor (paper) | static | **2.31** | **17.62** | **35.18** |
| GLU-Net-GOCor (paper) | dynamic | 2.10 | 16.07 | 31.66 |
|----------------|-----------------------------|-------|-------|-------|
| GLU-Net-GOCor* (paper) | megadepth | 2.33 | 17.21 | 33.67 |
| PDC-Net (H) (paper) | megadepth | 2.54 | 18.97 | 36.37 |
| PDC-Net (MS) (paper) | megadepth | 2.58 | 18.87 | 36.19 |
| PDC-Net+ (D) (paper) | megadepth | 2.57 | **19.12** | **36.71** |
| PDC-Net+ (H) (paper) | megadepth | 2.56 | 19.00 | 36.56 |
| PDC-Net+ (MS) (paper) | megadepth | **2.63** | 19.01 | 36.57 |
|----------------|-----------------------------|-------|-------|-------|
| GLU-Net* (paper) | megadepth | 2.36 | 17.18 | 33.28 |
| WarpC-GLU-Net (paper) | megadepth | **2.51** | **18.59** | **35.92** |

</details>

<details>
  <summary><b>ETH3D</b><a name="eth3d"></a></summary>

**Data preparation**: Execute `bash assets/download_ETH3D.sh` from our [GLU-Net repo](https://github.com/PruneTruong/GLU-Net). It does the following:
- Creates your root directory ETH3D/, with two sub-directories multiview_testing/ and multiview_training/.
- Downloads the "Low res multi-view, training data, all distorted images" [here](https://www.eth3d.net/data/multi_view_training_rig.7z) and unzips them in multiview_training/.
- Downloads the "Low res multi-view, testing data, all undistorted images" [here](https://www.eth3d.net/data/multi_view_test_rig_undistorted.7z) and unzips them in multiview_testing/.
- We directly provide correspondences for pairs of images taken at different intervals. There is one bundle file for each dataset and each interval rate, for example "lakeside_every_5_rate_of_3": we sampled the source images every 5 images, and the target image is taken at a particular rate from each source image. Download all these files [here](https://drive.google.com/file/d/1Okqs5QYetgVu_HERS88DuvsABGak08iN/view?usp=sharing) and unzip them.
As illustration, your root ETH3D directory should be organised as follows:

<pre>
/ETH3D/
       multiview_testing/
                        lakeside/
                        sand_box/
                        storage_room/
                        storage_room_2/
                        tunnel/
       multiview_training/
                        delivery_area/
                        electro/
                        forest/
                        playground/
                        terrains/
       info_ETH3D_files/
</pre>

The organisation of your directories is important, since the bundle files contain the relative paths to the images from the ETH3D root folder.

<br /><br />

**Evaluation**: For each interval rate (3, 5, 7, 9, 11, 13, 15), we compute the metrics for each of the sub-datasets (lakeside, delivery_area and so on). The final metrics are the average over all sub-datasets for each rate. After updating the path of 'eth3d' in admin/local.py, evaluation is run with:

```bash
python eval_matching.py --dataset eth3d --model PDCNet --pre_trained_models megadepth --optim_iter 3 --local_optim_iter 7 --save_dir path_to_save_dir PDCNet --multi_stage_type D
```

<br />
AEPE for different interval rates between image pairs:

| Method | Pre-trained model type | rate=3 | rate=5 | rate=7 | rate=9 | rate=11 | rate=13 | rate=15 |
|---------------|------------------------|--------|--------|--------|--------|---------|---------|---------|
| LiteFlowNet | chairs-things | **1.66** | 2.58 | 6.05 | 12.95 | 29.67 | 52.41 | 74.96 |
| PWC-Net | chairs-things | 1.75 | 2.10 | 3.21 | 5.59 | 14.35 | 27.49 | 43.41 |
| PWC-Net-GOCor | chairs-things | 1.70 | **1.98** | **2.58** | **4.22** | **10.32** | **21.07** | **38.12** |
|---------------|------------------------|--------|--------|--------|--------|---------|---------|---------|
| DGC-Net | | 2.49 | 3.28 | 4.18 | 5.35 | 6.78 | 9.02 | 12.23 |
| GLU-Net | static | 1.98 | 2.54 | 3.49 | 4.24 | 5.61 | 7.55 | 10.78 |
| GLU-Net | dynamic | 2.01 | 2.46 | 2.98 | 3.51 | 4.30 | 6.11 | 9.08 |
| GLU-Net-GOCor | dynamic | **1.93** | **2.28** | **2.64** | **3.01** | **3.62** | **4.79** | **7.80** |
|---------------|------------------------|--------|--------|--------|--------|---------|---------|---------|
| GLU-Net-GOCor* | megadepth | 1.68 | 1.92 | 2.18 | 2.43 | 2.89 | 3.31 | 4.27 |
| PDC-Net (D) (paper) | megadepth | 1.60 | 1.79 | 2.03 | 2.26 | 2.58 | 2.92 | 3.69 |
| PDC-Net (H) | megadepth | 1.58 | 1.77 | 1.98 | 2.24 | 2.56 | 2.91 | 3.73 |
| PDC-Net (MS) | megadepth | 1.60 | 1.79 | 2.00 | 2.26 | 2.57 | 2.90 | 3.56 |
| PDC-Net+ (H) (paper) | megadepth | **1.56** | **1.74** | **1.96** | **2.18** | **2.48** | **2.73** | **3.24** |
| PDC-Net+ (MS) (paper) | megadepth | 1.58 | 1.76 | 1.96 | 2.16 | 2.49 | **2.73** | **3.24** |

PCK-1 for different interval rates between image pairs:

Note that the PCKs are computed **per image** and then averaged per sequence; the final metric is the average over all sequences. This corresponds to the '_per_image' results in the output metric file. Note that this is not the metric used in the [PDC-Net paper](https://arxiv.org/abs/2101.01710), where the PCKs are computed **per sequence** instead, using the PDC-Net direct approach (corresponding to the '-per-rate' results in the output metric file).
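The difference between the two averaging conventions matters because images contribute different numbers of keypoints; a small illustrative sketch (function names are mine):

```python
import numpy as np

def pck_per_image(errors_per_image, t):
    """Average each image's own PCK first, then average over images."""
    return 100 * np.mean([np.mean(e <= t) for e in errors_per_image])

def pck_per_sequence(errors_per_image, t):
    """Pool every keypoint of the sequence before thresholding."""
    return 100 * np.mean(np.concatenate(errors_per_image) <= t)

# A sparsely annotated image weighs as much as a dense one per image,
# but less when all keypoints are pooled.
errs = [np.array([0.5, 4.0]), np.array([0.5] * 8)]
print(pck_per_image(errs, 1.0))     # 75.0
print(pck_per_sequence(errs, 1.0))  # 90.0
```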
| Method | Pre-trained model type | rate=3 | rate=5 | rate=7 | rate=9 | rate=11 | rate=13 | rate=15 |
|---------------|------------------------|--------|--------|--------|--------|---------|---------|---------|
| LiteFlowNet | chairs-things | **61.63** | **56.55** | **49.83** | **42.00** | 33.14 | 26.46 | 21.22 |
| PWC-Net | chairs-things | 58.50 | 52.02 | 44.86 | 37.41 | 30.36 | 24.75 | 19.89 |
| PWC-Net-GOCor | chairs-things | 58.93 | 53.10 | 46.91 | 40.93 | **34.58** | **29.25** | **24.59** |
|---------------|------------------------|--------|--------|--------|--------|---------|---------|---------|
| DGC-Net | | | | | | | | |
| GLU-Net | static | **50.55** | **43.08** | **36.98** | 32.45 | 28.45 | 25.06 | 21.89 |
| GLU-Net | dynamic | 46.27 | 39.28 | 34.05 | 30.11 | 26.69 | 23.73 | 20.85 |
| GLU-Net-GOCor | dynamic | 47.97 | 41.79 | 36.81 | **33.03** | **29.80** | **26.93** | **23.99** |
|---------------|------------------------|--------|--------|--------|--------|---------|---------|---------|
| GLU-Net-GOCor* | megadepth | 59.40 | 55.15 | 51.18 | 47.86 | 44.46 | 41.78 | 38.91 |
| PDC-Net (D) | megadepth | 61.82 | 58.41 | 55.02 | 52.40 | 49.61 | 47.43 | 45.01 |
| PDC-Net (H) | megadepth | 62.63 | 59.29 | 56.09 | 53.31 | 50.69 | 48.46 | 46.17 |
| PDC-Net (MS) | megadepth | 62.29 | 59.14 | 55.87 | 53.23 | 50.59 | 48.45 | 46.17 |
| PDC-Net+ (H) | megadepth | **63.12** | **59.93** | **56.81** | **54.12** | **51.59** | **49.55** | **47.32** |
| PDC-Net+ (MS) | megadepth | 62.95 | 59.76 | 56.64 | 54.02 | 51.50 | 49.38 | 47.24 |

PCK-5 for different interval rates between image pairs:

| Method | Pre-trained model type | rate=3 | rate=5 | rate=7 | rate=9 | rate=11 | rate=13 | rate=15 |
|---------------|------------------------|--------|--------|--------|--------|---------|---------|---------|
| LiteFlowNet | chairs-things | **92.79** | 90.70 | 86.29 | 78.50 | 66.07 | 55.05 | 46.29 |
| PWC-Net | chairs-things | 92.64 | 90.82 | 87.32 | 81.80 | 72.95 | 64.07 | 55.47 |
| PWC-Net-GOCor | chairs-things | 92.81 | **91.45** | **88.96** | **85.53** | **79.44** | **72.06** | **64.92** |
|---------------|------------------------|--------|--------|--------|--------|---------|---------|---------|
| DGC-Net | | 88.50 | 83.25 | 78.32 | 73.74 | 69.23 | 64.28 | 58.66 |
| GLU-Net | static | 91.22 | 87.91 | 84.23 | 80.74 | 76.84 | 72.35 | 67.77 |
| GLU-Net | dynamic | 91.45 | 88.57 | 85.64 | 83.10 | 80.12 | 76.66 | 73.02 |
| GLU-Net-GOCor | dynamic | **92.08** | **89.87** | **87.77** | **85.88** | **83.69** | **81.12** | **77.90** |
|---------------|------------------------|--------|--------|--------|--------|---------|---------|---------|
| GLU-Net-GOCor* | megadepth | 93.03 | 92.13 | 91.04 | 90.19 | 88.98 | 87.81 | 85.93 |
| PDC-Net (D) (paper) | megadepth | 93.47 | 92.72 | 91.84 | 91.15 | 90.23 | 89.45 | 88.10 |
| PDC-Net (H) | megadepth | 93.50 | 92.71 | 91.93 | 91.16 | 90.35 | 89.52 | 88.32 |
| PDC-Net (MS) | megadepth | 93.47 | 92.69 | 91.85 | 91.15 | 90.33 | 89.55 | 88.43 |
| PDC-Net+ (H) | megadepth | **93.54** | 92.78 | **92.04** | 91.30 | **90.60** | 89.9 | **89.03** |
| PDC-Net+ (MS) | megadepth | 93.50 | **92.79** | **92.04** | **91.35** | **90.60** | **89.97** | 88.97 |

</details>

<details>
  <summary><b>HPatches</b><a name="hpatches"></a></summary>

**Data preparation**: Download the data with:

```bash
bash assets/download_hpatches.sh
```

The corresponding csv files for each viewpoint ID, with the paths to the images and the homography parameters relating the pairs, are listed in assets/.

<br /><br />

**Evaluation**: After updating the path of 'hp' in admin/local.py, evaluation is run with:

```bash
python eval_matching.py --dataset hp --model GLUNet_GOCor --pre_trained_models static --optim_iter 3 --local_optim_iter 7 --save_dir path_to_save_dir
```

Similar results should be obtained:

| Model | Pre-trained model type | AEPE | PCK-1 (%) | PCK-3 (%) | PCK-5 (%) |
|-----------------------------|------------------------|-------|-------------|-----------|-------------|
| DGC-Net \[Melekhov2019\] | | 33.26 | 12.00 | | 58.06 |
| GLU-Net (this repo) | static | 25.05 | 39.57 | 71.45 | 78.60 |
| GLU-Net (paper) | static | 25.05 | 39.55 | - | 78.54 |
| GLU-Net-GOCor (this repo) | static | **20.16** | 41.49 | 74.12 | **81.46** |
| GLU-Net-GOCor (paper) | static | **20.16** | **41.55** | - | 81.43 |
|-----------------------------|------------------------|-------|-------------|-----------|-------------|
| PDCNet (D) (this repo) | megadepth | 19.40 | 43.94 | 78.51 | 85.81 |
| PDCNet (H) (this repo) | megadepth | **17.51** | **48.69** | **82.71** | **89.44** |

</details>

<details>
  <summary><b>KITTI</b><a name="kitti"></a></summary>

**Data preparation**: Both the KITTI-2012 and KITTI-2015 datasets are available [here](http://www.cvlibs.net/datasets/kitti/eval_flow.php).

<br />

**Evaluation**: After updating the paths of 'kitti2012' and 'kitti2015' in admin/local.py, evaluation is run with:

```bash
python eval_matching.py --dataset kitti2015 --model PDCNet --pre_trained_models megadepth --optim_iter 3 --local_optim_iter 7 PDCNet --multi_stage_type direct
```

Similar results should be obtained:

| Model | Pre-trained model type | KITTI-2012 AEPE | KITTI-2012 F1 (%) | KITTI-2015 AEPE | KITTI-2015 F1 (%) |
|----------------|-------------------------|------------|-------------|------------|-----------|
| PWC-Net-GOCor (this repo) | chairs-things | 4.12 | 19.58 | 10.33 | 31.23 |
| PWC-Net-GOCor (paper) | chairs-things | 4.12 | 19.31 | 10.33 | 30.53 |
| PWC-Net-GOCor (this repo) | chairs-things ft sintel | 2.60 | 9.69 | 7.64 | 21.36 |
| PWC-Net-GOCor (paper) | chairs-things ft sintel | **2.60** | **9.67** | **7.64** | **20.93** |
|----------------|-------------------------|------------|-------------|------------|-----------|
| GLU-Net (this repo) | static | 3.33 | 18.91 | 9.79 | 37.77 |
| GLU-Net (this repo) | dynamic | 3.12 | 19.73 | 7.59 | 33.92 |
| GLU-Net (paper) | dynamic | 3.14 | 19.76 | 7.49 | 33.83 |
| GLU-Net-GOCor (this repo) | dynamic | **2.62** | **15.17** | **6.63** | 27.58 |
| GLU-Net-GOCor (paper) | dynamic | 2.68 | 15.43 | 6.68 | **27.57** |
|----------------|-------------------------|------------|-------------|------------|-----------|
| GLU-Net-GOCor* (paper) | megadepth | 2.26 | 9.89 | 5.53 | 18.27 |
| PDC-Net **(D)** (paper and this repo) | megadepth | 2.08 | 7.98 | 5.22 | 15.13 |
| PDC-Net (H) (this repo) | megadepth | 2.16 | 8.19 | 5.31 | 15.23 |
| PDC-Net (MS) (this repo) | megadepth | 2.16 | 8.13 | 5.40 | 15.33 |
| PDC-Net+ **(D)** (paper) | megadepth | **1.76** | **6.60** | **4.53** | **12.62** |

</details>

<details>
  <summary><b>Sintel</b><a name="sintel"></a></summary>

**Data preparation**: Download the data with:

```bash
bash assets/download_sintel.sh
```

**Evaluation**: After updating the path of 'sintel' in admin/local.py, evaluation is run with:

```bash
python eval_matching.py --dataset sintel --model PDCNet --pre_trained_models megadepth --optim_iter 3 --local_optim_iter 7 --save_dir path_to_save_dir PDCNet --multi_stage_type direct
```

Similar results should be obtained (left block: clean pass; right block: final pass):

| Model | Pre-trained model type | AEPE (clean) | PCK-1 / dataset (clean, %) | PCK-5 / dataset (clean, %) | AEPE (final) | PCK-1 / dataset (final, %) | PCK-5 / dataset (final, %) |
|---------------|---------------------|--------|-------------|--------------|--------|-------------|--------------|
| PWC-Net-GOCor (this repo) | chairs-things | 2.38 | 82.18 | 94.14 | 3.70 | 77.36 | 91.20 |
| PWC-Net-GOCor (paper) | chairs-things | 2.38 | 82.17 | 94.13 | 3.70 | 77.34 | 91.20 |
| PWC-Net-GOCor (paper) | chairs-things ft sintel | (1.74) | (87.93) | (95.54) | (2.28) | (84.15) | (93.71) |
|---------------|---------------------|--------|-------------|--------------|--------|-------------|--------------|
| GLU-Net (this repo) | dynamic | 4.24 | 62.21 | 88.47 | 5.49 | 58.10 | 85.16 |
| GLU-Net (paper) | dynamic | 4.25 | 62.08 | 88.40 | 5.50 | 57.85 | 85.10 |
| GLU-Net-GOCor (this repo) | dynamic | **3.77** | 67.11 | **90.47** | **4.85** | 63.36 | **87.76** |
| GLU-Net-GOCor (paper) | dynamic | 3.80 | **67.12** | 90.41 | 4.90 | **63.38** | 87.69 |
|---------------|---------------------|--------|-------------|--------------|--------|-------------|--------------|
| GLU-Net-GOCor* (paper) | megadepth | **3.12** | 80.00 | 92.68 | **4.46** | 73.10 | 88.94 |
| PDC-Net (D) (this repo) | megadepth | 3.30 | **85.06** | **93.38** | 4.48 | **78.07** | **90.07** |
| PDC-Net (H) (this repo) | megadepth | 3.38 | 84.95 | 93.35 | 4.50 | 77.62 | 90.07 |
| PDC-Net (MS) (this repo) | megadepth | 3.40 | 84.85 | 93.33 | 4.54 | 77.41 | 90.06 |

</details>

<details>
  <summary><b>TSS</b><a name="tss"></a></summary>

**Data preparation**: To download the images, run:

```bash
bash assets/download_tss.sh
```

<br />

**Evaluation**: After updating the path of 'tss' in admin/local.py, evaluation is run with:

```bash
python eval_matching.py --dataset TSS --model GLUNet_GOCor --pre_trained_models static --optim_iter 3 --local_optim_iter 7 --flipping_condition True --save_dir path_to_save_dir
```

Similar results should be obtained:

| Model | Pre-trained model type | FGD3Car | JODS | PASCAL | All |
|--------------------------------|--------|---------|------|--------|------|
| Semantic-GLU-Net \[1\] | static | 94.4 | 75.5 | 78.3 | 82.8 |
| GLU-Net (this repo) | static | 93.2 | 73.69 | 71.1 | 79.33 |
| GLU-Net (paper) | static | 93.2 | 73.3 | 71.1 | 79.2 |
| GLU-Net-GOCor (this repo, GOCor iter=3, 3) | static | 94.6 | 77.9 | 77.7 | 83.4 |
| GLU-Net-GOCor (this repo, GOCor iter=3, 7) | static | 94.6 | 77.6 | 77.1 | 83.1 |
| GLU-Net-GOCor (paper) | static | 94.6 | 77.9 | 77.7 | 83.4 |
| Semantic-GLU-Net \[4\] | pfpascal | 95.3 | 82.2 | 78.2 | |
| WarpC-SemanticGLU-Net | pfpascal | **97.1** | **84.7** | **79.7** | **87.2** |

</details>

<details>
  <summary><b>PF-Pascal</b><a name="pfpascal"></a></summary>

**Data preparation**: To download the images, run:

```bash
bash assets/download_pf_pascal.sh
```

<br />

**Evaluation**: After updating the path of 'PFPascal' in admin/local.py, evaluation is run with:

```bash
python eval_matching.py --dataset PFPascal --model WarpCSemanticGLUNet --pre_trained_models pfpascal --flipping_condition False --save_dir path_to_save_dir
```

Similar results should be obtained:

| Model | Pre-trained model type | alpha=0.05 | alpha=0.1 |
|--------------------------------|--------|---------|---------|
| Semantic-GLU-Net \[1\] (paper) | static | 46.0 | 70.6 |
| Semantic-GLU-Net \[1\] (this repo) | static | 45.3 | 70.3 |
| Semantic-GLU-Net \[4\] (this repo) | pfpascal | 48.4 | 72.4 |
| WarpC-SemanticGLU-Net \[4\] (paper) | pfpascal | 62.1 | **81.7** |
| WarpC-SemanticGLU-Net \[4\] (this repo) | pfpascal | **62.7** | **81.7** |

</details>
<details>
  <summary><b>PF-Willow</b><a name="pfwillow"></a></summary>

**Data preparation**: To download the images, run:

```bash
bash assets/download_pf_willow.sh
```

<br />

**Evaluation**: After updating the path of 'PFWillow' in admin/local.py, evaluation is run with:

```bash
python eval_matching.py --dataset PFWillow --model WarpCSemanticGLUNet --pre_trained_models pfpascal --flipping_condition False --save_dir path_to_save_dir
```

Similar results should be obtained:

| Model | Pre-trained model type | alpha=0.05 | alpha=0.1 |
|--------------------------------|--------|---------|---------|
| Semantic-GLU-Net \[1\] (paper) | static | 36.4 | 63.8 |
| Semantic-GLU-Net \[1\] (this repo) | static | 36.2 | 63.7 |
| Semantic-GLU-Net \[4\] | pfpascal | 39.7 | 67.6 |
| WarpC-SemanticGLU-Net \[4\] (paper) | pfpascal | **49.0** | 75.1 |
| WarpC-SemanticGLU-Net \[4\] (this repo) | pfpascal | 48.9 | **75.2** |

</details>

<details>
  <summary><b>Spair-71k</b><a name="spair"></a></summary>

**Data preparation**: To download the images, run:

```bash
bash assets/download_spair.sh
```

<br />

**Evaluation**: After updating the path of 'spair' in admin/local.py, evaluation is run with:

```bash
python eval_matching.py --dataset spair --model WarpCSemanticGLUNet --pre_trained_models pfpascal --flipping_condition False --save_dir path_to_save_dir
```

Similar results should be obtained:

| Model | Pre-trained model type | alpha=0.1 |
|--------------------------------|--------|---------|
| Semantic-GLU-Net \[1\] | static | 15.1 |
| Semantic-GLU-Net \[4\] | pfpascal | 16.5 |
| WarpC-SemanticGLU-Net | spair | 23.5 |
| WarpC-SemanticGLU-Net \[4\] | pfpascal | **23.8** |

</details>

<details>
  <summary><b>Caltech-101</b><a name="caltech"></a></summary>

**Data preparation**: To download the images, run:

```bash
bash assets/download_caltech.sh
```

<br />

**Evaluation**: After updating the path of 'spair' in admin/local.py, evaluation is run with:

```bash
python eval_matching.py --dataset caltech --model WarpCSemanticGLUNet --pre_trained_models pfpascal --flipping_condition False --save_dir path_to_save_dir
```

</details>

### 4.2 Pose estimation <a name="pose_estimation"></a>

Metrics are computed with:

```bash
python -u eval_pose_estimation.py --dataset dataset_name --model model_name --pre_trained_models pre_trained_model_name --optim_iter optim_step --local_optim_iter local_optim_iter --estimate_at_quarter_reso True --mask_type_for_pose_estimation proba_interval_1_above_10 --save_dir path_to_save_dir
```
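The mAP @5/@10/@20 numbers below are accuracies under angular pose-error thresholds in degrees, as in the SuperGlue evaluation protocol. A sketch of the standard error measure between an estimated and a ground-truth pose, with my own helper name (translation error uses the angle between directions, since the scale is unobservable):

```python
import numpy as np

def pose_error_deg(R_est, t_est, R_gt, t_gt):
    """Angular errors (degrees) for the rotation and the translation direction."""
    cos_r = (np.trace(R_est.T @ R_gt) - 1.0) / 2.0
    err_R = np.degrees(np.arccos(np.clip(cos_r, -1.0, 1.0)))
    cos_t = np.dot(t_est.ravel(), t_gt.ravel()) / (
        np.linalg.norm(t_est) * np.linalg.norm(t_gt))
    err_t = np.degrees(np.arccos(np.clip(abs(cos_t), 0.0, 1.0)))  # sign-invariant
    return err_R, err_t
```

A pair counts as correct at threshold T if max(err_R, err_t) is below T degrees; accumulating this over the test pairs yields the mAP figures.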

<details>
  <summary><b>YFCC100M <a name="yfcc"></a></b></summary>

**Data preparation**: The groundtruth for YFCC is provided in the file assets/yfcc_test_pairs_with_gt_original.txt (from the [SuperGlue repo](https://github.com/magicleap/SuperGluePretrainedNetwork)).
Images can be downloaded from the [OANet repo](https://github.com/zjhthu/OANet) and moved to the desired location with
```bash
bash assets/download_yfcc.sh
```
The file structure should be
```bash
YFCC
└── images/
       ├── buckingham_palace/
       ├── notre_dame_front_facade/
       ├── reichstag/
       └── sacre_coeur/
```

<br /><br />
**Evaluation**: After updating the path 'yfcc' in admin/local.py, compute metrics on YFCC100M with PDC-Net homography (H) using the command:

```bash
python -u eval_pose_estimation.py --dataset YFCC --model PDCNet --pre_trained_models megadepth --optim_iter 3 --local_optim_iter 7 --estimate_at_quarter_reso True --mask_type_for_pose_estimation proba_interval_1_above_10 --save_dir path_to_save_dir PDCNet --multi_stage_type H --mask_type proba_interval_1_above_10
```

You should get similar metrics (not exactly the same because of RANSAC):

|              | mAP @5 | mAP @10 | mAP @20 | Run-time (s) |
|--------------|--------|---------|---------|--------------|
| PDC-Net (D)  | 60.52  | 70.91   | 80.30   | 0.           |
| PDC-Net (H)  | 63.90  | 73.00   | 81.22   | 0.74         |
| PDC-Net (MS) | 65.18  | 74.21   | 82.42   | 2.55         |
| PDC-Net+ (D) | 63.93  | 73.81   | 82.74   |              |
| PDC-Net+ (H) | **67.35** | **76.56** | **84.56** | 0.74 |

</details>


<details>
  <summary><b>ScanNet <a name="scanNet"></a></b></summary>

**Data preparation**: The images of the ScanNet test set (100 scenes, scene0707_00 to scene0806_00) are provided
[here](https://drive.google.com/file/d/19o07SOWpv_DQcIbjb87BAKBHNCcsr4Ax/view?usp=sharing).
They were extracted from the [ScanNet github repo](https://github.com/ScanNet/ScanNet) and processed.
We use the groundtruth provided in the [SuperGlue repo](https://github.com/magicleap/SuperGluePretrainedNetwork), included here in the file assets/scannet_test_pairs_with_gt.txt.


<br /><br />
**Evaluation**: After updating the path 'scannet_test' in admin/local.py, compute metrics on ScanNet with PDC-Net homography (H) using the command:
```bash
python -u eval_pose_estimation.py --dataset scannet --model PDCNet --pre_trained_models megadepth --optim_iter 3 --local_optim_iter 7 --estimate_at_quarter_reso True --mask_type_for_pose_estimation proba_interval_1_above_10 --save_dir path_to_save_dir PDCNet --multi_stage_type H --mask_type proba_interval_1_above_10
```


You should get similar metrics (not exactly the same because of RANSAC):

|              | mAP @5 | mAP @10 | mAP @20 |
|--------------|--------|---------|---------|
| PDC-Net (D)  | 39.93  | 50.17   | 60.87   |
| PDC-Net (H)  | 42.87  | 53.07   | 63.25   |
| PDC-Net (MS) | 42.40  | 52.83   | 63.13   |
| PDC-Net+ (D) | 42.93  | 53.13   | 63.95   |
| PDC-Net+ (H) | **45.66** | **56.67** | **67.07** |


</details>



### 4.3 Sparse evaluation on HPatches <a name="sparse_hp"></a>


We provide the link to the cached results [here](https://drive.google.com/drive/folders/1gphUcvBXO12EsqskdMlH3CsLxHPLtIqL?usp=sharing)
for the sparse evaluation on HPatches. Check [PDC-Net+](https://arxiv.org/abs/2109.13912) for more details.


## 5. Training <a name="training"></a>

### Quick Start

The installation should have generated a local configuration file "admin/local.py".
In case the file was not generated, run
```bash
python -c "from admin.environment import create_default_local_file; create_default_local_file()"
```
to generate it.
Next, set the path to the training workspace, i.e. the directory where the model weights and checkpoints will be saved.
Also set the paths to the datasets you want to use (they should be downloaded beforehand, see below).
If all the dependencies have been correctly installed, you can train a network using the run_training.py script
in the correct conda environment.

```bash
conda activate dense_matching_env
python run_training.py train_module train_name
```

Here, train_module is the sub-module inside train_settings and train_name is the name of the train setting file to be used.

For example, you can train using the included default train_PDCNet_stage1 settings by running:
```bash
python run_training.py PDCNet train_PDCNet_stage1
```
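
A train setting file is a Python module under train_settings/ that wires together the components described in "Training your own networks" below (datasets, dataloader, network, objective, actor, optimizer, scheduler, trainer). The toy example below only mirrors those ingredients in plain PyTorch to show the shape of such a file; it is not the repo's actual settings API, and all names in it are hypothetical:

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Toy stand-ins for the components a settings file wires together.
# Plain-PyTorch analogy only, not the repo's train_settings API.
pairs = TensorDataset(torch.randn(64, 6, 64, 64),   # stacked image pairs
                      torch.randn(64, 2, 64, 64))   # ground-truth flow
loader = DataLoader(pairs, batch_size=8, shuffle=True)         # dataloader
net = nn.Conv2d(6, 2, kernel_size=3, padding=1)                # "network"
objective = nn.L1Loss()                                        # objective
optimizer = torch.optim.Adam(net.parameters(), lr=1e-4)        # optimizer
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.5)

for epoch in range(2):                                         # "trainer" loop
    for images, gt_flow in loader:
        optimizer.zero_grad()
        loss = objective(net(images), gt_flow)                 # "actor" step
        loss.backward()
        optimizer.step()
    scheduler.step()
```
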

### Training datasets downloading

<details>
  <summary><b>DPED-CityScape-ADE </b></summary>

These are the same image pairs as used in the [GLU-Net repo](https://github.com/PruneTruong/GLU-Net).
For training, we use a combination of the DPED, CityScapes and ADE-20K datasets.
The DPED training dataset is composed of only approximately 5,000 sets of images taken by four different cameras.
We use the images from two cameras, resulting in around 10,000 images.
CityScapes additionally adds about 23,000 images.
We complement these with a random sample of ADE-20K images with a minimum resolution of 750 x 750.
This results in 40,000 original images, used to create pairs of training images by applying geometric transformations to them.
The paths to the original images as well as the geometric transformation parameters are given in the csv files
'assets/csv_files/homo_aff_tps_train_DPED_CityScape_ADE.csv' and 'assets/csv_files/homo_aff_tps_test_DPED_CityScape_ADE.csv'.

1. Download the original images

* Download the [DPED dataset](http://people.ee.ethz.ch/~ihnatova/) (54 GB) ==> images are created in original_images/
* Download the [CityScapes dataset](https://www.cityscapes-dataset.com/)
    - download 'leftImg8bit_trainvaltest.zip' (11 GB, left 8-bit images - train, val, and test sets, 5,000 images) ==> images are created in CityScape/
    - download 'leftImg8bit_trainextra.zip' (44 GB, left 8-bit images - trainextra set, 19,998 images) ==> images are created in CityScape_extra/
* Download the [ADE-20K dataset](https://drive.google.com/file/d/19r7dsYraHsNGI1ViZi4VwCfQywdODCDU/view?usp=sharing) (3.8 GB, 20,210 images) ==> images are created in ADE20K_2016_07_26/


Put all the datasets in the same directory.
As an illustration, your root training directory should be organised as follows:
```bash
training_datasets/
    ├── original_images/
    ├── CityScape/
    ├── CityScape_extra/
    └── ADE20K_2016_07_26/
```

2. Save the synthetic image pairs and flows to disk

During training, the pairs of synthetic images could be created on the fly from this set of original images at each epoch.
However, this dataset generation takes time and, since no augmentation is applied at each epoch, one can also create the dataset in advance
and save it to disk. During training, the image pairs composing the training dataset are then simply loaded from disk
before being passed through the network, which is a lot faster.
To generate the training dataset and save it to disk:

```bash
python assets/save_training_dataset_to_disk.py --image_data_path /directory/to/original/training_datasets/ \
--csv_path assets/csv_files/homo_aff_tps_train_DPED_CityScape_ADE.csv --save_dir /path/to/save_dir --plot True
```
This will create the image pairs and corresponding flow fields in save_dir/images and save_dir/flow respectively.

3. Add the paths in admin/local.py as 'training_cad_520' and 'validation_cad_520'

</details>

<details>
  <summary><b>COCO </b></summary>

This is useful for adding moving objects.
Download the images along with their annotations from [here](http://cocodataset.org/#download). The root folder should be
organized as follows. Then add the path in admin/local.py as 'coco'.
```bash
coco_root
    └── annotations
        └── instances_train2014.json
    └── images
        └── train2014
```
</details>


<details>
  <summary><b>MegaDepth </b></summary>

We use the reconstructions provided in the [D2-Net repo](https://github.com/mihaidusmanu/d2-net).
You can download the undistorted reconstructions and aggregated scene information folder directly
[here - Google Drive](https://drive.google.com/drive/folders/1hxpOsqOZefdrba_BqnW490XpNX_LgXPB).

The file structure should be the following:
```bash
MegaDepth
├── Undistorted_Sfm
└── scene_info
```

Then add the path in admin/local.py as 'megadepth_training'.

</details>




### Training scripts

The framework currently contains the training code for the following matching networks.
The setting files can be used to train the networks, or to look up the exact training details.


<details>
  <summary><b>UAWarpC <a name="uawarpc"></a></b></summary>

This is adapted from our recent work [Refign](https://ieeexplore.ieee.org/document/10030987), with [code](https://github.com/brdav/refign).
It allows training a probabilistic correspondence network unsupervised (a mix of
[PDCNet](https://arxiv.org/abs/2101.01710) and [WarpC](https://arxiv.org/abs/2104.03308)).
It uses a simplified version of PDCNet, without GOCor, which predicts a uni-modal
Gaussian probability distribution at every pixel instead of a mixture of Laplacians (a small confidence-map sketch follows the settings below).

* **UAWarpC.train_UAWarpC_PDCNet_stage1**: The default settings used for the first training stage, without the visibility mask.
We train on real image pairs of the MegaDepth dataset.

* **UAWarpC.train_UAWarpC_PDCNet_stage2**: We further finetune the network trained in stage 1, by including our visibility mask.
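
Because the predicted distribution is a single Gaussian rather than a mixture, turning its uncertainty into a confidence map is particularly simple: for an isotropic 2D Gaussian error, the error norm is Rayleigh distributed, so the probability that the flow error stays within a radius r has a closed form. A minimal sketch, where the log-variance parametrization is an assumption made for illustration:

```python
import torch

def gaussian_confidence(log_var, r=1.0):
    """P(||flow error|| < r) per pixel for an isotropic 2D Gaussian.

    log_var: (B, 1, H, W) predicted log-variance (assumed parametrization).
    With iid error components of variance sigma^2, the error norm is
    Rayleigh distributed, so P(||e|| < r) = 1 - exp(-r^2 / (2 sigma^2)).
    """
    sigma2 = log_var.exp()
    return 1.0 - torch.exp(-(r ** 2) / (2.0 * sigma2))
```
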

</details>



<details>
  <summary><b>Probabilistic Warp Consistency (PWarpC) <a name="pwarpc"></a></b></summary>

* **PWarpC.train_weakly_supervised_PWarpC_SFNet_pfpascal**: The default settings used to train the
weakly-supervised PWarpC-SF-Net on PF-Pascal.

* **PWarpC.train_weakly_supervised_PWarpC_SFNet_spair_from_pfpascal**: The default settings used to train the
weakly-supervised PWarpC-SF-Net on SPair-71K. More precisely, the network is first trained on PF-Pascal (above) and
further finetuned on SPair-71K.

* **PWarpC.train_strongly_supervised_PWarpC_SFNet_pfpascal**: The default settings used to train the strongly-supervised
PWarpC-SF-Net on PF-Pascal.

* **PWarpC.train_strongly_supervised_PWarpC_SFNet_spair_from_pfpascal**: The default settings used to train the strongly-supervised
PWarpC-SF-Net on SPair-71K.

* The rest are to come.

</details>




<details>
  <summary><b>Warp Consistency (WarpC) <a name="warpc"></a></b></summary>

* **WarpC.train_WarpC_GLUNet_stage1**: The default settings used for the first training stage, without the visibility mask.
We train on real image pairs of the MegaDepth dataset.

* **WarpC.train_WarpC_GLUNet_stage2**: We further finetune the network trained in stage 1, by including our visibility mask.
The network corresponds to our final WarpC-GLU-Net (see the [WarpC paper](https://arxiv.org/abs/2104.03308)).

* **WarpC.train_ft_WarpCSemanticGLUNet**: The default settings used for training the final WarpC-SemanticGLU-Net
(see the [WarpC paper](https://arxiv.org/abs/2104.03308)).
We finetune the original SemanticGLUNet (trained on the static/CAD synthetic data) on PF-Pascal using Warp Consistency.


</details>



<details>
  <summary><b>PDC-Net and PDC-Net+ <a name="pdcnet"></a></b></summary>

* **PDCNet.train_PDCNet_plus_stage1**: The default settings used for the first training stage, with fixed backbone weights.
We first train on synthetically generated image pairs from the DPED, CityScape and ADE datasets (pre-computed and saved),
to which we add multiple independently moving objects and perturbations. We also train with our object reprojection mask.

* **PDCNet.train_PDCNet_plus_stage2**: The default settings used for training the final PDC-Net+ model (see the [PDC-Net+ paper](https://arxiv.org/abs/2109.13912)).
This setting fine-tunes all layers of the model trained with PDCNet_plus_stage1 (including the feature backbone). As training
data, we use a combination of the same dataset as in stage 1 and image pairs from the MegaDepth dataset
with their sparse ground-truth correspondence data. We also apply the reprojection mask.



* **PDCNet.train_PDCNet_stage1**: The default settings used for the first training stage, with fixed backbone weights.
We initialize the backbone VGG-16 with pre-trained ImageNet weights. We first train on synthetically generated image
pairs from the DPED, CityScape and ADE datasets (pre-computed and saved), to which we add independently moving objects and perturbations.

* **PDCNet.train_PDCNet_stage2**: The default settings used for training the final PDC-Net model (see the [PDC-Net paper](https://arxiv.org/abs/2101.01710)).
This setting fine-tunes all layers of the model trained with PDCNet_stage1 (including the feature backbone). As training
data, we use a combination of the same dataset as in stage 1 and image pairs from the MegaDepth dataset
with their sparse ground-truth correspondence data.

* **PDCNet.train_GLUNet_GOCor_star_stage1**: The same settings as for PDCNet_stage1, with a different model (the non-probabilistic baseline).
The loss is accordingly changed to the L1 loss instead of the negative log-likelihood loss (see the sketch after this block).

* **PDCNet.train_GLUNet_GOCor_star_stage2**: The default settings used for training the final GLU-Net-GOCor*
(see the [PDC-Net paper](https://arxiv.org/abs/2101.01710)).

</details>
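
The L1-versus-negative-log-likelihood distinction above is the crux of the probabilistic formulation: rather than penalising the flow error directly, PDC-Net maximises the likelihood of the ground-truth flow under the predicted distribution (a constrained mixture of Laplacians in the paper). A single-component Laplacian sketch conveys the idea; the log-scale parametrization is an assumption for illustration:

```python
import torch

def laplace_nll(flow_pred, log_b, flow_gt):
    """Negative log-likelihood of the GT flow under a per-pixel Laplacian.

    Single-component illustration only; PDC-Net actually uses a constrained
    mixture of Laplacians. flow_pred, flow_gt: (B, 2, H, W);
    log_b: (B, 1, H, W) predicted log scale (assumed parametrization).
    The Laplace density is (1 / 2b) * exp(-|x - mu| / b), hence the NLL
    per component is |x - mu| / b + log(2b).
    """
    b = log_b.exp()
    nll = (flow_pred - flow_gt).abs() / b + torch.log(2.0 * b)
    return nll.sum(dim=1).mean()  # sum over flow components, mean over pixels
```
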
\n\n\u003C\u002Fdetails>\n\n\n\n\u003Cdetails>\n  \u003Csummary>\u003Cb>Probabilistic Warp Consistency (PWarpC) \u003Ca name=\"pwarpc\">\u003C\u002Fa>\u003C\u002Fb>\u003C\u002Fsummary>\n \n* **PWarpC.train_weakly_supervised_PWarpC_SFNet_pfpascal**: The default settings used to train the \nweakly-supervised PWarpC-SF-Net on PF-Pascal. \n\n* **PWarpC.train_weakly_supervised_PWarpC_SFNet_spair_from_pfpascal**: The default settings used to train the \nweakly-supervised PWarpC-SF-Net on SPair. More precisely, the network is first trained on PF-Pascal (above) and \nfurther finetuned on SPair-71K. \n\n* **PWarpC.train_strongly_supervised_PWarpC_SFNet_pfpascal**: The default settings used to train the strongly-supervised\nPWarpC-SF-Net on PF-Pascal. \n\n* **PWarpC.train_strongly_supervised_PWarpC_SFNet_spair_from_pfpascal**: The default settings used to train the strongly-supervised\nPWarpC-SF-Net on Spair-71K. \n\n* The rest to come\n\n\u003C\u002Fdetails>\n\n\n\n\n\u003Cdetails>\n  \u003Csummary>\u003Cb>Warp Consistency (WarpC) \u003Ca name=\"warpc\">\u003C\u002Fa>\u003C\u002Fb>\u003C\u002Fsummary>\n  \n* **WarpC.train_WarpC_GLUNet_stage1**: The default settings used for first stage network training without visibility mask. \nWe train on real image pairs of the MegaDepth dataset. \n\n* **WarpC.train_WarpC_GLUNet_stage2**: We further finetune the network trained with stage1, by including our visibility mask. \nThe network corresponds to our final WarpC-GLU-Net (see [WarpC paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2104.03308)). \n\n* **WarpC.train_ft_WarpCSemanticGLUNet**: The default settings used for training the final WarpC-SemanticGLU-Net \n(see [WarpC paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2104.03308)). \nWe finetune the original SemanticGLUNet (trained on the static\u002FCAD synthetic data) on PF-Pascal using Warp Consistency. \n\n\n\u003C\u002Fdetails>\n\n\n\n\u003Cdetails>\n  \u003Csummary>\u003Cb>PDC-Net and PDC-Net+\u003Ca name=\"pdcnet\">\u003C\u002Fa>\u003C\u002Fb>\u003C\u002Fsummary>\n\n* **PDCNet.train_PDCNet_plus_stage1**: The default settings used for first stage network training with fixed backbone weights. \nWe train first on synthetically generated image pairs from the DPED, CityScape and ADE dataset (pre-computed and saved), \non which we add MULTIPLE independently moving objects and perturbations. We also train by applying our object reprojection mask. \n\n* **PDCNet.train_PDCNet_plus_stage2**: The default settings used for training the final PDC-Net+ model (see [PDC-Net+ paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2109.13912)). \nThis setting fine-tunes all layers in the model trained using PDCNet_stage1 (including the feature backbone). As training\ndataset, we use a combination of the same dataset than in stage 1 as well as image pairs from the MegaDepth dataset \nand their sparse ground-truth correspondence data. We also apply the reprojection mask. \n\n\n\n* **PDCNet.train_PDCNet_stage1**: The default settings used for first stage network training with fixed backbone weights. \nWe initialize the backbone VGG-16 with pre-trained ImageNet weights. We train first on synthetically generated image \npairs from the DPED, CityScape and ADE dataset (pre-computed and saved), on which we add independently moving objects and perturbations. \n\n* **PDCNet.train_PDCNet_stage2**: The default settings used for training the final PDC-Net model (see [PDC-Net paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2101.01710)). 




<details>
  <summary><b>GLU-Net <a name="glunet"></a></b></summary>

* **GLUNet.train_GLUNet_static**: The default settings used for training the final GLU-Net (of the
[GLU-Net paper](https://arxiv.org/abs/1912.05524)).
We fix the backbone weights and initialize the backbone VGG-16 with pre-trained ImageNet weights.
We train on synthetically generated image pairs from the DPED, CityScape and ADE datasets (pre-computed and saved),
which is later (in the [GOCor paper](https://arxiv.org/abs/2009.07823)) referred to as the 'static' dataset.

* **GLUNet.train_GLUNet_dynamic**: The default settings used for training the final GLU-Net trained on the dynamic
dataset (of the [GOCor paper](https://arxiv.org/abs/2009.07823)).
We fix the backbone weights and initialize the backbone VGG-16 with pre-trained ImageNet weights.
We train on synthetically generated image pairs from the DPED, CityScape and ADE datasets (pre-computed and saved),
to which we add one independently moving object.
This dataset is referred to as the 'dynamic' dataset in the [GOCor paper](https://arxiv.org/abs/2009.07823).

* **GLUNet.train_GLUNet_GOCor_static**: The default settings used for training the final GLU-Net-GOCor
(of the [GOCor paper](https://arxiv.org/abs/2009.07823)).
We fix the backbone weights and initialize the backbone VGG-16 with pre-trained ImageNet weights.
We train on synthetically generated image pairs from the DPED, CityScape and ADE datasets (pre-computed and saved),
which is later (in the [GOCor paper](https://arxiv.org/abs/2009.07823)) referred to as the 'static' dataset.

* **GLUNet.train_GLUNet_GOCor_dynamic**: The default settings used for training the final GLU-Net-GOCor trained on the dynamic
dataset (of the [GOCor paper](https://arxiv.org/abs/2009.07823)).
We fix the backbone weights and initialize the backbone VGG-16 with pre-trained ImageNet weights.
We train on synthetically generated image pairs from the DPED, CityScape and ADE datasets (pre-computed and saved),
to which we add one independently moving object.
This dataset is referred to as the 'dynamic' dataset in the [GOCor paper](https://arxiv.org/abs/2009.07823).

</details>
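
These networks upsample the predicted flow between pyramid levels with transposed convolutions which, as noted in the changelog below, are now initialised to bilinear-interpolation weights by default. The classic initialisation recipe looks as follows; this is a standalone sketch, not the repo's exact code:

```python
import torch
from torch import nn

def bilinear_init_(deconv: nn.ConvTranspose2d):
    """Initialise a transposed convolution to perform bilinear upsampling.

    Classic recipe (known e.g. from FCN-style segmentation networks);
    assumes square kernels and fills one filter per matching
    input/output channel pair, leaving cross-channel weights at zero.
    """
    k = deconv.kernel_size[0]
    f = (k + 1) // 2
    center = f - 1 if k % 2 == 1 else f - 0.5
    og = torch.arange(k, dtype=torch.float32)
    filt1d = 1.0 - (og - center).abs() / f          # 1D bilinear filter
    filt = filt1d[:, None] * filt1d[None, :]        # separable 2D filter
    with torch.no_grad():
        deconv.weight.zero_()
        for i in range(min(deconv.in_channels, deconv.out_channels)):
            deconv.weight[i, i] = filt
    return deconv
```

Initialised this way, the layer behaves exactly like bilinear upsampling at the start of training and can then learn to deviate from it.
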


### Training your own networks

To train a custom network using the toolkit, the following components need to be specified in the train settings.
For reference, see [train_GLUNet_static.py](https://github.com/PruneTruong/DenseMatching/blob/main/train_settings/GLUNet/train_GLUNet_static.py).

* Datasets: The datasets to be used for training. A number of standard matching datasets are already available in
the datasets module. The dataset class can be passed a processing function, which should perform the necessary
processing of the data before batching it, e.g. data augmentations and conversion to tensors.
* Dataloader: Determines how to sample the batches. Can use specific samplers.
* Network: The network module to be trained.
* BatchPreprocessingModule: The pre-processing module that takes the batch and transforms it into the inputs
required for training the network. It depends on the particular network and training strategy.
* Objective: The training objective.
* Actor: The trainer passes the training batch to the actor, which is responsible for passing the data through the
network correctly and for calculating the training loss. The batch preprocessing is also done within the actor class.
* Optimizer: The optimizer to be used, e.g. Adam.
* Scheduler: The scheduler to be used.
* Trainer: The main class, which runs the epochs and saves checkpoints.



## 6. Acknowledgement <a name="acknowledgement"></a>

We borrow code from public projects, such as [pytracking](https://github.com/visionml/pytracking), [GLU-Net](https://github.com/PruneTruong/GLU-Net),
[DGC-Net](https://github.com/AaltoVision/DGC-Net), [PWC-Net](https://github.com/NVlabs/PWC-Net),
[NC-Net](https://github.com/ignacio-rocco/ncnet), [Flow-Net-Pytorch](https://github.com/ClementPinard/FlowNetPytorch),
[RAFT](https://github.com/princeton-vl/RAFT), [CATs](https://github.com/SunghwanHong/Cost-Aggregation-transformers)...

## 7. ChangeLog <a name="changelog"></a>

* 06/21: Added evaluation code
* 07/21: Added training code and more options for evaluation
* 08/21: Fixed a memory leak in the mixture dataset + added another sampling option for the MegaDepth dataset
* 10/21: Added pre-trained models of WarpC
* 12/21: Added training code for WarpC and PDC-Net+, randomly generated data, Caltech evaluation, pre-trained models of PDC-Net+, and a notebook demo
* 02/22: Small modifications
* 03/22: Major refactoring; added video demos, code for PWarpC, and default initialization of deconv layers to bilinear weights.
\n","# 密集匹配\n\n基于 PyTorch 的通用密集匹配库。\n\n如有任何问题、意见或建议，请联系 Prune：prune.truong@vision.ee.ethz.ch\n\n\u003Cbr \u002F>\n\n\n**如果您有兴趣无监督地训练一个概率对应网络（结合了 [PDCNet](https:\u002F\u002Farxiv.org\u002Fabs\u002F2101.01710) 和 [WarpC](https:\u002F\u002Farxiv.org\u002Fabs\u002F2104.03308) 的方法），请查看我们最近的工作 [Refign](https:\u002F\u002Fieeexplore.ieee.org\u002Fdocument\u002F10030987)，代码可在 [GitHub](https:\u002F\u002Fgithub.com\u002Fbrdav\u002Frefign) 上找到。** 它也已集成到本代码库中，名为 UAWarpC。\n\n\n## 更新\n\n2022年3月6日：我们发现，当使用双线性插值权重来初始化用于在不同金字塔层级之间上采样预测光流场的转置卷积层时，可以获得显著更好的性能并缩短训练时间。我们已将此初始化方式设为默认。未来可能会提供更新的预训练权重。此外，也可以直接使用双线性插值进行上采样，效果相似（甚至可能更好），现在这也作为一种可选方案提供。\n\n## 亮点\n\n用于实现、训练和评估密集匹配网络的库。其中包括：\n* 常用的密集匹配 **验证数据集**，适用于 **几何匹配**（MegaDepth、RobotCar、ETH3D、HPatches）、**光流**（KITTI、Sintel）和 **语义匹配**（TSS、PF-Pascal、PF-Willow、Spair）。\n* 用于 **分析** 网络性能并获得匹配和位姿估计标准性能指标的脚本。\n* 通用构建模块，包括深度网络、优化、特征提取和实用工具。\n* 用于训练密集匹配网络的 **通用训练框架**，包含：\n    * 常用的匹配网络训练数据集。\n    * 用于生成随机图像对及其对应的真值光流，并添加运动物体及相应修改光流的函数。\n    * 用于数据采样、处理等功能。\n    * 以及更多内容……\n\n* GLU-Net（CVPR 2020）、GLU-Net-GOCor（NeurIPS 2020）、PWC-Net-GOCor（NeurIPS 2020）、PDC-Net（CVPR 2021）、WarpC 模型（ICCV 2021）、PWarpC 模型（CVPR 2022）的 **官方实现**，包括训练好的模型和相应的结果。\n\n\u003Cbr \u002F>\n\n## 密集匹配网络\n\n该仓库包含了以下匹配模型的实现。我们为每个数据集和方法提供了预训练模型权重、数据准备、评估命令以及结果。如果您觉得这个库有用，请考虑引用相关的研究论文。\n\n### [6] PWarpC：用于弱监督语义对应关系的概率形变一致性。（CVPR 2022）\n作者：[Prune Truong](https:\u002F\u002Fprunetruong.com\u002F)、[Martin Danelljan](https:\u002F\u002Fmartin-danelljan.github.io\u002F)、[Fisher Yu](https:\u002F\u002Fwww.yf.io\u002F)、Luc Van Gool\u003Cbr \u002F>\n\n\\[[论文](https:\u002F\u002Farxiv.org\u002Fabs\u002F2203.04279)\\]\n\\[[网站](https:\u002F\u002Fprunetruong.com\u002Fresearch\u002Fpwarpc)\\]\n\\[[海报](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1lP5E3BNqdKJL1q-YsQ-C7rOwkcb5S63W\u002Fview?usp=sharing)\\]\n\\[[视频](https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=I2KtnvI8xZU)\\]\n\\[[引用](https:\u002F\u002Fdblp.org\u002Frec\u002Fconf\u002Fcvpr\u002FTruongDYG22.html?view=bibtex)\\]\n\n\u003Cdetails>\n\n![alt text](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FPruneTruong_DenseMatching_readme_c75787f56290.png)\n\n我们提出了概率形变一致性这一弱监督学习目标，用于语义匹配任务。我们的方法直接监督网络预测的密集匹配分数，这些分数被编码为条件概率分布。我们首先通过将一对展示同一类别不同实例的图像中的其中一张应用已知的形变，构建一个图像三元组。然后利用由此产生的图像三元组所形成的约束条件，推导出我们的概率学习目标。为了进一步考虑真实图像对中存在的遮挡和背景杂乱情况，我们将可学习的未匹配状态扩展到了我们的概率输出空间中。为此，我们设计了一个针对展示不同类别物体的图像对的目标函数来监督这一状态。我们通过将其应用于四种近期的语义匹配架构来验证了我们的方法。我们的弱监督方法在四个具有挑战性的语义匹配基准测试中达到了新的最先进水平。最后，我们还证明，当与关键点标注结合使用时，我们的目标函数在强监督模式下也能带来显著的提升。\n\n\u003C\u002Fdetails>\n\n### [5] PDC-Net+：增强型概率密集对应网络。（TPAMI 2023）\n作者：[Prune Truong](https:\u002F\u002Fprunetruong.com\u002F)、[Martin Danelljan](https:\u002F\u002Fmartin-danelljan.github.io\u002F)、Radu Timofte、Luc Van Gool\u003Cbr \u002F>\n\n\\[[论文](https:\u002F\u002Farxiv.org\u002Fabs\u002F2109.13912)\\]\n\\[[网站](https:\u002F\u002Fprunetruong.com\u002Fresearch\u002Fpdcnet+)\\]\n\\[[引用](https:\u002F\u002Fdblp.org\u002Frec\u002Fjournals\u002Fcorr\u002Fabs-2109-13912.html?view=bibtex)\\]\n\n\n\n### [4] WarpC：用于无监督学习密集对应关系的形变一致性。（ICCV 2021 - 口头报告）\n作者：[Prune Truong](https:\u002F\u002Fprunetruong.com\u002F)、[Martin Danelljan](https:\u002F\u002Fmartin-danelljan.github.io\u002F)、[Fisher Yu](https:\u002F\u002Fwww.yf.io\u002F)、Luc Van Gool\u003Cbr 
\u002F>\n\n\\[[论文](https:\u002F\u002Farxiv.org\u002Fabs\u002F2104.03308)\\]\n\\[[网站](https:\u002F\u002Fprunetruong.com\u002Fresearch\u002Fwarpc)\\]\n\\[[海报](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1PCXkjxvVsjHAbYzsBtgKWLO1uE6oGP6p\u002Fview?usp=sharing)\\]\n\\[[幻灯片](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1mVpLBW55nlNJZBsvxkBCti9_KhH1r9V_\u002Fview?usp=sharing)\\]\n\\[[视频](https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=IsMotj7-peA)\\]\n\\[[引用](https:\u002F\u002Fdblp.org\u002Frec\u002Fconf\u002Ficcv\u002FTruongDYG21.html?view=bibtex)\\]\n\n\n\u003Cdetails>\n\n\n形变一致性图            |  结果 \n:-------------------------:|:-------------------------:\n![alt text](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FPruneTruong_DenseMatching_readme_b8290b3039c5.png) |  ![](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FPruneTruong_DenseMatching_readme_abdcee91c337.png) \n\n\n\n\n学习密集对应关系的关键挑战在于缺乏真实图像对的真值匹配。虽然光度一致性损失提供了一种无监督的替代方案，但它们难以应对外观变化较大的情况，而这种变化在几何和语义匹配任务中非常常见。此外，依赖于合成训练对的方法往往难以泛化到真实数据上。\n我们提出了形变一致性这一无监督学习目标，用于密集对应关系回归任务。即使在外观和视角变化较大的情况下，我们的目标仍然有效。给定一对真实图像，我们首先通过将其中一个原始图像应用随机采样的形变，构建一个图像三元组。随后，我们推导并分析三元组之间产生的一切光流一致性约束条件。基于我们的观察和实验结果，我们设计了一个通用的无监督目标函数，该函数采用了其中两个推导出的约束条件。\n我们通过训练三种近期的密集对应网络来验证形变一致性损失在几何和语义匹配任务中的有效性。我们的方法在多个具有挑战性的基准测试中达到了新的最先进水平，包括 MegaDepth、RobotCar 和 TSS。\n\n\u003C\u002Fdetails>\n\n### [3] PDC-Net：学习准确的对应关系及何时信任它们。（CVPR 2021 - 口头报告）\n作者：[Prune Truong](https:\u002F\u002Fprunetruong.com\u002F)、[Martin Danelljan](https:\u002F\u002Fmartin-danelljan.github.io\u002F)、Luc Van Gool、Radu Timofte\u003Cbr \u002F>\n\n\\[[论文](https:\u002F\u002Farxiv.org\u002Fabs\u002F2101.01710)\\]\n\\[[网站](https:\u002F\u002Fprunetruong.com\u002Fresearch\u002Fpdcnet)\\]\n\\[[海报](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F18ya__AdEIgZyix8dXuRpJ15tdrpbMUsB\u002Fview?usp=sharing)\\]\n\\[[幻灯片](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1zUQmpmVp6WSa_psuI3KFvKVrNyJE-beG\u002Fview?usp=sharing)\\]\n\\[[视频](https:\u002F\u002Fyoutu.be\u002FbX0rEaSf88o)\\]\n\\[[引用](https:\u002F\u002Fdblp.org\u002Frec\u002Fconf\u002Fcvpr\u002FTruongDGT21.html?view=bibtex)\\]\n\n\n\u003Cdetails>\n\n\n![alt text](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FPruneTruong_DenseMatching_readme_c39d6c677e8b.png)\n\n\n在大位移或同质区域的情况下，稠密光流估计通常不够准确。对于大多数应用和下游任务，例如姿态估计、图像操作或三维重建，关键在于知道**何时以及在哪里**可以信任估计出的匹配结果。 \n在本工作中，我们旨在估计两幅图像之间的稠密光流场，并同时生成一个鲁棒的像素级置信度图，以指示预测的可靠性和准确性。我们开发了一种灵活的概率方法，联合学习光流预测及其不确定性。特别地，我们将预测分布参数化为一种受约束的混合模型，从而更好地建模准确的光流预测和异常值。此外，我们还设计了一种针对自监督训练场景下鲁棒且可泛化的不确定性预测的架构和训练策略。\n\n\u003C\u002Fdetails>\n\n### [2] GOCor：将全局优化的对应体积引入你的神经网络。（NeurIPS 2020）\n作者：[Prune Truong](https:\u002F\u002Fprunetruong.com\u002F) *、[Martin Danelljan](https:\u002F\u002Fmartin-danelljan.github.io\u002F) *、Luc Van Gool、Radu Timofte\u003Cbr \u002F>\n\n\\[[论文](https:\u002F\u002Farxiv.org\u002Fabs\u002F2009.07823)\\]\n\\[[网站](https:\u002F\u002Fprunetruong.com\u002Fresearch\u002Fgocor)\\]\n\\[[视频](https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=V22MyFChBCs)\\]\n\\[[引用](https:\u002F\u002Fdblp.org\u002Frec\u002Fconf\u002Fnips\u002FTruongDGT20.html?view=bibtex)\\]\n\n\n\u003Cdetails>\n\n特征相关层是许多涉及图像对之间稠密对应关系的计算机视觉问题中的关键神经网络模块。它通过计算两幅图像中成对位置提取的特征向量之间的密集标量积来预测对应体积。然而，在需要区分图像中多个相似区域时，这种点对点的特征比较是不够的，会严重降低最终任务的性能。 \n**本工作提出了GOCor，一个完全可微分的稠密匹配模块，可直接替代特征相关层。** 我们的模块生成的对应体积是内部优化过程的结果，该过程明确考虑了场景中的相似区域。此外，我们的方法能够有效地学习空间匹配先验，以进一步解决匹配歧义。\n\n\n![alt text](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FPruneTruong_DenseMatching_readme_2c8fcbc55192.jpg)\n\n\u003C\u002Fdetails>\n\n\n### [1] 
GLU-Net：用于稠密光流和对应关系的全局-局部通用网络（CVPR 2020 - 口头报告）。\n作者：[Prune Truong](https:\u002F\u002Fprunetruong.com\u002F)、[Martin Danelljan](https:\u002F\u002Fmartin-danelljan.github.io\u002F) 和 Radu Timofte \u003Cbr \u002F>\n\\[[论文](https:\u002F\u002Farxiv.org\u002Fabs\u002F1912.05524)\\]\n\\[[网站](https:\u002F\u002Fprunetruong.com\u002Fresearch\u002Fglu-net)\\]\n\\[[海报](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1pS_OMZ83EG-oalD-30vDa3Ru49GWi-Ky\u002Fview?usp=sharing)\\]\n\\[[口头报告视频](https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=xB2gNx8f8Xc&feature=emb_title)\\]\n\\[[预告片视频](https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=s5OUdkM9QLo)\\]\n\\[[引用](https:\u002F\u002Fdblp.org\u002Frec\u002Fconf\u002Fcvpr\u002FTruongDT20.html?view=bibtex)\\]\n\n\u003Cdetails>\n\n\n![alt text](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FPruneTruong_DenseMatching_readme_ba67441a2374.png)\n\n\u003C\u002Fdetails>\n\n\u003Cbr \u002F>\n\u003Cbr \u002F>\n\n## 预训练权重\n\n预训练模型可以在[模型库](https:\u002F\u002Fgithub.com\u002FPruneTruong\u002FDenseMatching\u002Fblob\u002Fmain\u002FMODEL_ZOO.md)中找到。\n\n\n\u003Cbr \u002F>\n\n## 目录\n\n1. [安装](#installation)\n2. [在你自己的图像对上测试！](#test)\n    1. [在图像对上测试](#test1)\n    2. [各种演示](#test2)\n3. [概述](#overview)\n4. [基准测试与结果](#results)\n    1. [对应关系评估](#correspondence_eval)\n        1. [MegaDepth](#megadepth)\n        2. [RobotCar](#robotcar)\n        3. [ETH3D](#eth3d)\n        4. [HPatches](#hpatches)\n        5. [KITTI](#kitti)\n        6. [Sintel](#sintel)\n        7. [TSS](#tss)\n        8. [PF-Pascal](#pfpascal)\n        9. [PF-Willow](#pfwillow)\n        10. [Spair-71k](#spair)\n        11. [Caltech-101](#caltech)\n    2. [姿态估计](#pose_estimation)\n        1. [YFCC100M](#yfcc)\n        2. [ScanNet](#scannet)\n    3. [HPatches上的稀疏评估](#sparse_hp)\n5. [训练](#training)\n6. [致谢](#acknowledgement)\n7. [更新日志](#changelog)\n\n\n\u003Cbr \u002F>\n\n## 1. 安装 \u003Ca name=\"installation\">\u003C\u002Fa>\n\n推理运行要求PyTorch版本 >= 1.0\n\n* 创建并激活一个Python 3.x的conda环境\n\n```bash\nconda create -n dense_matching_env python=3.7\nconda activate dense_matching_env\n```\n\n* 运行以下命令安装所有依赖项（除cupy外，见下文）：\n```bash\npip install numpy opencv-python torch torchvision matplotlib imageio jpeg4py scipy pandas tqdm gdown pycocotools timm\n```\n\n**注意**：运行代码需要CUDA。事实上，相关层是使用CuPy在CUDA中实现的，因此CuPy是必需的依赖项。可以通过`pip install cupy`安装，也可以使用CuPy仓库中提供的二进制包之一。代码是在Python 3.7、PyTorch 1.0和CUDA 9.0环境下开发的，所以我安装了适用于cuda90的cupy。如果使用其他版本的CUDA，请相应调整。\n\n```bash\npip install cupy-cuda90 --no-cache-dir \n```\n\n\n* 本仓库包含[GOCor](https:\u002F\u002Farxiv.org\u002Fabs\u002F2009.07823)作为git子模块。你需要拉取子模块：\n```bash\ngit submodule update --init --recursive\ngit submodule update --recursive --remote\n```\n\n* 运行以下命令创建`admin\u002Flocal.py`文件，并更新数据集路径。我们提供了一个示例`admin\u002Flocal_example.py`，其中所有数据集都存储在`data\u002F`目录下。\n```bash\npython -c \"from admin.environment import create_default_local_file; create_default_local_file()\"\n```\n\n* 使用命令```bash assets\u002Fdownload_pre_trained_models.sh``` **下载预训练模型权重**。更多信息请参阅[模型库](https:\u002F\u002Fgithub.com\u002FPruneTruong\u002FDenseMatching\u002Fblob\u002Fmain\u002FMODEL_ZOO.md)。\n\n\u003Cbr \u002F>\n\n## 2. 
在您自己的图像对上进行测试！  \u003Ca name=\"test\">\u003C\u002Fa>\n\n可能的模型选择包括：\n* SFNet、PWarpCSFNet_WS、PWarpCSFNet_SS、NCNet、PWarpCNCNet_WS、PWarpCNCNet_SS、CATs、PWarpCCATs_SS、CATs_ft_features、\nPWarpCCATs_ft_features_SS、\n* UAWarpC（无预训练模型可用）、\n* WarpCGLUNet、GLUNet_star、WarpCSemanticGLUNet、\n* PDCNet_plus、PDCNet、GLUNet_GOCor_star、\n* SemanticGLUNet、GLUNet、GLUNet_GOCor、PWCNet、PWCNet_GOCor\n\n可能的预训练模型选择包括：static、dynamic、chairs_things、chairs_things_ft_sintel、megadepth、\nmegadepth_stage1、pfpascal、spair\n\n\u003Cbr \u002F>\n\n\u003Cdetails>\n  \u003Csummary>\u003Cb>PDCNet 和 PDC-Net+ 推理选项说明\u003C\u002Fb>\u003C\u002Fsummary>\n\nPDC-Net 和 PDC-Net+ 具有多种推理替代选项。 \n如果模型是 PDC-Net，则需添加以下选项：\n* --confidence_map_R，用于计算置信度图 p_r，默认值为 1.0\n* --multi_stage_type 可选：\n    * 'D'（或 'direct'）\n    * 'H'（或 'homography_from_quarter_resolution_uncertainty'）\n    * 'MS'（或 'multiscale_homo_from_quarter_resolution_uncertainty'）\n* --ransac_thresh，用于单应性和多尺度多阶段类型，默认值为 1.0\n* --mask_type，用于对估计的置信度图进行阈值处理，并使用置信匹配点进行内部单应性估计，适用于单应性和多尺度多阶段类型，默认值为 proba_interval_1_above_5\n* --homography_visibility_mask，默认值为 True\n* --scaling_factors，用于多尺度方法，默认值为 \\[0.5, 0.6, 0.88, 1, 1.33, 1.66, 2\\]\n\n当图像对仅表现出有限的视角变化时（例如视频中的连续帧，类似于光流任务），请使用直接方式 ('D')。对于较大的视角变化，请使用单应性 ('H') 或多尺度 ('MS')。\n\n例如，要以单应性方式运行 PDC-Net 或 PDC-Net+，可在命令末尾添加：\n```bash\nPDCNet --multi_stage_type H --mask_type proba_interval_1_above_10\n```\n\n\u003C\u002Fdetails>\n\n\n### 在特定图像对上进行测试  \u003Ca name=\"test1\">\u003C\u002Fa>\n\n\n您可以使用 test_models.py 和提供的已训练模型权重，在一对图像上测试这些网络。首先需要选择要使用的模型和预训练权重。输入是查询图像和参考图像的路径。随后图像会被传递到网络中，网络会输出参考图像与查询图像之间的对应光流场。然后根据估计的光流对查询图像进行变形，并保存生成的图像。\n\n\u003Cbr \u002F>\n\n对于这对 MegaDepth 图像（用于检查代码是否正常工作）并使用在 megadepth 数据集上训练的 **PDCNet** (MS)，输出如下：\n\n```bash\npython test_models.py --model PDCNet --pre_trained_model megadepth --path_query_image images\u002Fpiazza_san_marco_0.jpg --path_reference_image images\u002Fpiazza_san_marco_1.jpg --save_dir evaluation\u002F PDCNet --multi_stage_type MS --mask_type proba_interval_1_above_10\n```\n其他可选参数：--path_to_pre_trained_models（默认值为 pre_trained_models\u002F）\n![alt text](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FPruneTruong_DenseMatching_readme_4e8c25db8947.png)\n\n\n\u003Cbr \u002F>\n\n使用在 dynamic 数据集上训练的 **GLU-Net-GOCor**，针对这组 eth3d 图像的输出如下：\n\n```bash\npython test_models.py --model GLUNet_GOCor --pre_trained_model dynamic --path_query_image images\u002Feth3d_query.png --path_reference_image images\u002Feth3d_reference.png --save_dir evaluation\u002F\n```\n![alt text](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FPruneTruong_DenseMatching_readme_84d452de4deb.png)\n\n\u003Cbr \u002F>\n\n而对于基线 **GLU-Net**，输出则为：\n\n```bash\npython test_models.py --model GLUNet --pre_trained_model dynamic --path_query_image images\u002Feth3d_query.png --path_reference_image images\u002Feth3d_reference.png --save_dir evaluation\u002F\n\n```\n![alt text](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FPruneTruong_DenseMatching_readme_c4965f2cba89.png)\n\n\n\u003Cbr \u002F>\n\n对于 **PWC-Net-GOCor** 和基线 **PWC-Net**：\n\n\n```bash\npython test_models.py --model PWCNet_GOCor --pre_trained_model chairs_things --path_query_image images\u002Fkitti2015_query.png --path_reference_image images\u002Fkitti2015_reference.png --save_dir evaluation\u002F\n```\n\n![alt text](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FPruneTruong_DenseMatching_readme_f2325b585b16.png)\n\n\u003Cbr \u002F>\n\n```bash\npython test_models.py --model PWCNet --pre_trained_model chairs_things --path_query_image images\u002Fkitti2015_query.png 
--path_reference_image images\u002Fkitti2015_reference.png --save_dir evaluation\u002F\n```\n![alt text](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FPruneTruong_DenseMatching_readme_dd04c7d6e33f.png)\n\n\u003Cbr \u002F>\n\n\n## 视频演示 \u003Ca name=\"test2\">\u003C\u002Fa>\n\n* **demos\u002Fdemo_single_pair.ipynb**：尝试使用我们的模型处理不同的图像对，计算图像对之间的光流场，并可视化变形后的图像以及置信匹配点。\n\n* **demos\u002Fdemo_pose_estimation_and_reconstruction.ipynb**：尝试使用我们的模型处理不同图像对（已知相机内参），计算光流场和置信度图，进而推算相对位姿。\n\n* 运行 **在线摄像头或视频演示**，重现上方 GIF 中的效果。我们计算目标图像（中间）与源图像（左）之间的光流场，并绘制出前 1000 个最置信的匹配点。右侧显示的是变形后的源图像（应与中间图像相似）。只有那些被预测为高置信度的区域才会显示出来。\n\n![alt text](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FPruneTruong_DenseMatching_readme_1589010d9353.gif)\n\n我们修改了 [SuperGlue](https:\u002F\u002Fgithub.com\u002Fmagicleap\u002FSuperGluePretrainedNetwork) 的工具代码，因此在使用时需遵守其许可协议。然后运行：\n```bash\nbash demo\u002Frun_demo_confident_matches.sh\n``` \n\n\n* 从一段视频中，**将视频的每一帧（左侧）变形到中间帧**，如下方 GIF 所示。右侧显示的是变形后的帧（应与中间帧相似）。只有那些被预测为高置信度的区域才会显示出来。\n\n![alt text](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FPruneTruong_DenseMatching_readme_885f8f268a72.gif)\n\n我们同样修改了 [SuperGlue](https:\u002F\u002Fgithub.com\u002Fmagicleap\u002FSuperGluePretrainedNetwork) 的工具代码，因此在使用时需遵守其许可协议。然后运行：\n```bash\nbash demo\u002Frun_demo_warping_videos.sh\n``` \n\n\n\n* 更多内容即将推出！\n\n## 3. 概述  \u003Ca name=\"overview\">\u003C\u002Fa>\n\n该框架由以下子模块组成。\n\n* training: \n    * actors: 包含用于不同训练的演员类。演员类负责将输入数据通过网络传递并计算损失。\n    这里还包括预处理类，用于将批量张量输入处理为训练网络所需的理想输入。\n    * trainers: 执行训练的主要类。\n    * losses: 包含损失函数类。\n* train_settings: 包含设置文件，用于指定网络的训练参数。\n* admin: 包含加载网络、TensorBoard等功能，同时也包含环境设置。\n* datasets: 集成了多个数据集。此外，还包含生成合成图像对及其对应的真实流场标签，以及添加独立移动物体并相应修改流场的模块。\n* utils_data: 包含用于数据处理的函数，例如加载图像、数据增强、采样帧等。\n* utils_flow: 包含用于处理流场的函数，例如转换为映射、根据流场扭曲数组，以及可视化工具。\n* third_party: 训练所需的外部库，以子模块形式添加。\n* models: 包含不同的层和网络定义。\n* validation: 包含用于评估和分析网络在预测流场及不确定性方面性能的函数。\n\n\n\n## 4. 基准测试与结果  \u003Ca name=\"results\">\u003C\u002Fa>\n\n所有数据集的路径都必须在 admin\u002Flocal.py 文件中提供。\n我们提供了一个示例 admin\u002Flocal_example.py，其中所有数据集都存储在 data\u002F 目录下。\n在运行评估之前，您需要更新 admin\u002Flocal.py 中的路径。\n\n\n\n\u003Cdetails>\n  \u003Csummary>\u003Cb>PDCNet 和 PDCNet+ 推理选项说明\u003C\u002Fb>\u003C\u002Fsummary>\n\nPDC-Net 和 PDC-Net+ 具有多种推理替代选项。\n如果模型是 PDC-Net，则需添加以下选项：\n* --confidence_map_R，用于计算置信度图 p_r，默认值为 1.0\n* --multi_stage_type，可选：\n    * 'D'（或 'direct'）\n    * 'H'（或 'homography_from_quarter_resolution_uncertainty'）\n    * 'MS'（或 'multiscale_homo_from_quarter_resolution_uncertainty'）\n* --ransac_thresh，用于单应性和多尺度多阶段类型，默认值为 1.0\n* --mask_type，用于对估计的置信度图进行阈值化，并使用置信匹配进行内部单应性估计，适用于单应性和多尺度多阶段类型，默认值为 proba_interval_1_above_5\n* --homography_visibility_mask，默认值为 True\n* --scaling_factors，用于多尺度情况，默认值为 [0.5, 0.6, 0.88, 1, 1.33, 1.66, 2]\n\n例如，要使用单应性运行 PDC-Net 或 PDC-Net+，可在命令末尾添加：\n```bash\nPDCNet --multi_stage_type H --mask_type proba_interval_1_above_10\n```\n\n\u003C\u002Fdetails>\n\n\n\u003Cdetails>\n  \u003Csummary>\u003Cb>关于结果可重复性的说明\u003C\u002Fb>\u003C\u002Fsummary>\n\n使用 PDC-Net 的多阶段（homography_from_quarter_resolution_uncertainty，H）或多尺度（multiscale_homo_from_quarter_resolution_uncertainty，MS）时，内部会采用 RANSAC 方法。因此，结果可能会略有差异，但应在 1-2% 以内。\n对于姿态估计，我们也使用 RANSAC 计算姿态，这会导致结果存在一定变异性。\n  \n\u003C\u002Fdetails>\n\n### 4.1. 
对应关系评估 \u003Ca name=\"correspondence_eval\">\u003C\u002Fa>\n\n指标的计算命令如下：\n```bash\npython -u eval_matching.py --dataset 数据集名称 --model 模型名称 --pre_trained_models 预训练模型名称 --optim_iter 优化步数 --local_optim_iter 局部优化步数 --save_dir 保存目录路径\n```\n\n可选参数：\n--path_to_pre_trained_models:\n   * 默认值为 pre_trained_models\u002F\n   * 如果是一个目录路径：则为包含模型权重的目录路径，模型权重的路径将是 path_to_pre_trained_models + 模型名称 + '_' + 预训练模型名称\n   * 如果直接指向一个检查点路径，则该路径将被用作模型权重的直接路径，而 pre_trained_model_name 仅作为保存指标的名称使用。\n\n\n\u003Cdetails>\n  \u003Csummary>\u003Cb>MegaDepth\u003C\u002Fb>\u003Ca name=\"megadepth\">\u003C\u002Fa>\u003C\u002Fsummary>\n\n\n**数据准备**: 我们使用 [RANSAC-Flow](https:\u002F\u002Fgithub.com\u002FXiSHEN0220\u002FRANSAC-Flow\u002Ftree\u002Fmaster\u002Fevaluation\u002FevalCorr) 提供的测试集。\n该测试集由 1600 对图像组成，并附带一个 CSV 文件（'test1600Pairs.csv'），其中包含待评估的图像对名称及相应的地面真实对应关系。\n可通过以下命令下载所有内容：\n```bash\nbash assets\u002Fdownload_megadepth_test.sh\n```\n\n最终的文件结构如下：\n```bash\nmegadepth_test_set\u002F\n└── MegaDepth\u002F\n    └── Test\u002F\n        └── test1600Pairs\u002F\n        └── test1600Pairs.csv\n```\n\n\u003Cbr \u002F>\u003Cbr \u002F>\n**评估**: 在 admin\u002Flocal.py 中更新 'megadepth' 和 'megadepth_csv' 的路径后，即可通过以下命令进行评估：\n```bash\npython eval_matching.py --dataset megadepth --model PDCNet --pre_trained_models megadepth --optim_iter 3 --local_optim_iter 7 --save_dir 保存目录路径 PDCNet --multi_stage_type MS\n```\n\n\n预计可获得类似的结果：\n| 模型          | 预训练模型类型      | PCK-1 (%) | PCK-3 (%) | PCK-5 (%) | \n|----------------|-----------------------------|-------|-------|-------|\n| GLU-Net  (本仓库)      | 静态  (CityScape-DPED-ADE)        | 29.51 | 50.67 | 56.12 | \n| GLU-Net  (本仓库)      | 动态           | 21.59 | 52.27 | 61.91 | \n| GLU-Net  (论文)      | 动态       | 21.58 | 52.18 | 61.78 | \n| GLU-Net-GOCor  (本仓库) | 静态 (CitySCape-DPED-ADE) | 32.24 | 52.51 | 58.90 | \n| GLU-Net-GOCor (本仓库) | 动态  | 37.23 | **61.25** | **68.17** | \n| GLU-Net-GOCor (论文) | 动态      | **37.28**| 61.18 | 68.08 | \n|----------------|-----------------------------|-------|-------|-------|\n| GLU-Net-GOCor* (论文) | megadepth                   | 57.77 | 78.61 | 82.24 | \n| PDC-Net  (D)   (本仓库)  | megadepth                   | 68.97 | 84.03 | 85.68 | \n| PDC-Net  (H)   (论文)   | megadepth                   | 70.75 | 86.51 | 88.00 | \n| PDC-Net (MS)  (论文) | megadepth                   | 71.81 | 89.36 | 91.18 | \n| PDC-Net+ (D) (论文)  | megadepth |  72.41 | 86.70 | 88.13  |\n| PDC-Net+ (H) (论文)  | megadepth |  73.92  |  89.21 |  90.48  |\n| PDC-Net+ (MS) (论文)  | megadepth |  **74.51**  |  **90.69**  | **92.10** |\n|----------------|-----------------------------|-------|-------|-------|\n| GLU-Net*  (论文)   | megadepth  |  38.50 | 54.66 | 59.60 | \n| GLU-Net*  (本仓库)   | megadepth  |  38.62 | 54.70 |  59.76 | \n| WarpC-GLU-Net (论文) | megadepth | 50.61 | 73.80 | 78.61 | \n| WarpC-GLU-Net (本仓库) | megadepth | **50.77** | **73.91** | **78.73** | \n\u003C\u002Fdetails>\n\n\u003Cdetails>\n  \u003Csummary>\u003Cb>RobotCar \u003Ca name=\"robotcar\">\u003C\u002Fb>\u003C\u002Fa>\u003C\u002Fsummary>\n  \n**数据准备**: 图像可以从 \n[视觉定位挑战赛](https:\u002F\u002Fwww.visuallocalization.net\u002Fdatasets\u002F) 下载（位于网站底部），更准确地说是从 [这里](https:\u002F\u002Fwww.dropbox.com\u002Fsh\u002Fql8t2us433v8jej\u002FAAB0wfFXs0CLPqSiyq0ukaKva\u002FROBOTCAR?dl=0&subfolder_nav_tracking=1) 获取。  \n包含真实标注对应关系的 CSV 文件可从 [这里](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F16mZLUKsjceAt1RTW1KLckX0uCR3O4x5Q\u002Fview) 下载。  \n文件结构应如下所示： \n\n```bash\nRobotCar\n├── img\u002F\n└── test6511.csv\n```\n\n\u003Cbr \u002F>\u003Cbr 
\u002F>\n**评估**: 在 admin\u002Flocal.py 中更新 'robotcar' 和 'robotcar_csv' 的路径后，使用以下命令运行评估：\n```bash\npython eval_matching.py --dataset robotcar --model PDCNet --pre_trained_models megadepth --optim_iter 3 --local_optim_iter 7 --save_dir path_to_save_dir PDCNet --multi_stage_type MS\n```\n  \n应得到类似的结果： \n | 模型          | 预训练模型类型      | PCK-1 (%) | PCK-3 (%) | PCK-5 (%) |\n|----------------|-----------------------------|-------|-------|-------|\n| GLU-Net     (论文)   | 静态（CityScape-DPED-ADE）          | 2.30  | 17.15 | 33.87 |\n| GLU-Net-GOCor  (论文) | 静态 | **2.31**  | **17.62** | **35.18** |\n| GLU-Net-GOCor  (论文) | 动态                     | 2.10  | 16.07 | 31.66 |\n|----------------|-----------------------------|-------|-------|-------|\n| GLU-Net-GOCor* (论文) | megadepth                   | 2.33  | 17.21 | 33.67 |\n| PDC-Net  (H)    (论文)   | megadepth                   | 2.54  | 18.97 | 36.37 |\n| PDC-Net (MS)   (论文) | megadepth                   | 2.58  | 18.87 | 36.19 |\n| PDC-Net+ (D)  (论文)  | megadepth |  2.57  | **19.12** | **36.71**  |\n| PDC-Net+ (H)  (论文)  | megadepth |  2.56 | 19.00 | 36.56 |\n| PDC-Net+ (MS)  (论文)  | megadepth |  **2.63** | 19.01 | 36.57 |\n|----------------|-----------------------------|-------|-------|-------|\n| GLU-Net* (论文) | megadepth | 2.36 | 17.18 | 33.28 |\n| WarpC-GLU-Net (论文) | megadepth | **2.51** | **18.59** | **35.92** | \n\u003C\u002Fdetails>\n\n\n\n\u003Cdetails>\n  \u003Csummary>\u003Cb>ETH3D \u003Ca name=\"eth3d\">\u003C\u002Fa>\u003C\u002Fb>\u003C\u002Fsummary>\n  \n**数据准备**: 在我们的 [GLU-Net 仓库](https:\u002F\u002Fgithub.com\u002FPruneTruong\u002FGLU-Net) 中执行 'bash assets\u002Fdownload_ETH3D.sh'。  \n该脚本会完成以下操作：  \n- 创建根目录 ETH3D\u002F，并在其中创建两个子目录 multiview_testing\u002F 和 multiview_training\u002F。  \n- 从 [这里](https:\u002F\u002Fwww.eth3d.net\u002Fdata\u002Fmulti_view_training_rig.7z) 下载“低分辨率多视角、训练数据、所有畸变图像”，并将其解压到 multiview_training\u002F 目录中。  \n- 从 [这里](https:\u002F\u002Fwww.eth3d.net\u002Fdata\u002Fmulti_view_test_rig_undistorted.7z) 下载“低分辨率多视角、测试数据、所有未畸变图像”，并将其解压到 multiview_testing\u002F 目录中。  \n- 我们直接提供了以不同间隔拍摄的图像对之间的对应关系。每个数据集和每种间隔速率都有一份打包文件，例如 “lakeside_every_5_rate_of_3”。  \n这意味着我们每隔 5 张源图像采样一次，而目标图像则是在每张源图像的基础上按特定速率选取的。所有这些文件均可从 [这里](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1Okqs5QYetgVu_HERS88DuvsABGak08iN\u002Fview?usp=sharing) 下载并解压。\n\n作为示例，您的 ETH3D 根目录应按如下方式组织：\n\u003Cpre>\n\u002FETH3D\u002F\n       multiview_testing\u002F\n                        lakeside\u002F\n                        sand_box\u002F\n                        storage_room\u002F\n                        storage_room_2\u002F\n                        tunnel\u002F\n       multiview_training\u002F\n                        delivery_area\u002F\n                        electro\u002F\n                        forest\u002F\n                        playground\u002F\n                        terrains\u002F\n        info_ETH3D_files\u002F\n\u003C\u002Fpre>\n目录的组织非常重要，因为打包文件中包含了相对于 ETH3D 根目录的图像相对路径。  \n\n\u003Cbr \u002F>\u003Cbr \u002F>\n**评估**: 对于每种间隔速率（3、5、7、9、11、13、15），我们都会计算各个子数据集（如 lakeside、delivery area 等）的指标。最终的指标是各速率下所有数据集的平均值。  \n在 admin\u002Flocal.py 中更新 'eth3d' 的路径后，使用以下命令运行评估：\n```bash\npython eval_matching.py --dataset robotcar --model PDCNet --pre_trained_models megadepth --optim_iter 3 --local_optim_iter 7 --save_dir path_to_save_dir PDCNet --multi_stage_type D\n```\n\n\u003Cbr \u002F>\n图像对之间不同间隔速率下的 AEPE：\n\n| 方法        | 预训练模型类型 | rate=3 | rate=5 | rate=7 | rate=9 | rate=11 | rate=13 | rate=15 
|\n|---------------|------------------------|--------|--------|--------|--------|---------|---------|---------|\n| LiteFlowNet   | chairs-things          | **1.66**   | 2.58   | 6.05   | 12.95  | 29.67   | 52.41   | 74.96   |\n| PWC-Net       | chairs-things          | 1.75   | 2.10   | 3.21   | 5.59   | 14.35   | 27.49   | 43.41   |\n| PWC-Net-GOCor | chairs-things          |  1.70      |  **1.98**      |  **2.58**      | **4.22**      |  **10.32**       |  **21.07**       |  **38.12**       |\n|---------------|------------------------|--------|--------|--------|--------|---------|---------|---------|\n| DGC-Net       |                        | 2.49   | 3.28   | 4.18   | 5.35   | 6.78    | 9.02    | 12.23   |\n| GLU-Net       | static                 | 1.98   | 2.54   | 3.49   | 4.24   | 5.61    | 7.55    | 10.78   |\n| GLU-Net       | dynamic                | 2.01   | 2.46   | 2.98   | 3.51   | 4.30    | 6.11    | 9.08    |\n| GLU-Net-GOCor | dynamic                | **1.93**   | **2.28**   | **2.64**   | **3.01**   | **3.62**    | **4.79**    | **7.80**    |\n|---------------|------------------------|--------|--------|--------|--------|---------|---------|---------|\n| GLU-Net-GOCor*| megadepth              |  1.68 | 1.92 | 2.18 | 2.43 |  2.89 | 3.31 |  4.27 |\n| PDC-Net  (D) (论文) | megadepth       | 1.60   | 1.79   | 2.03   |  2.26  |  2.58   | 2.92    | 3.69    |\n| PDC-Net  (H)  | megadepth              | 1.58   | 1.77   | 1.98   |  2.24  |  2.56   | 2.91    | 3.73    |\n| PDC-Net  (MS)  | megadepth             |  1.60 |  1.79  |  2.00  | 2.26   | 2.57   | 2.90    |  3.56   |\n| PDC-Net+  (H) (论文) | megadepth  | **1.56** | **1.74** | **1.96** | **2.18** | **2.48** |  **2.73** | **3.24** |\n| PDC-Net+  (MS) (论文) | megadepth  | 1.58 | 1.76 | 1.96 | 2.16 | 2.49 | **2.73** | **3.24** |\n\n\n图像对之间不同间隔速率下的 PCK-1：  \n\n请注意，PCK 是 **逐图像** 计算的，然后按序列取平均值。最终的指标是所有序列的平均值。这对应于输出的指标文件中的 '_per_image' 结果。  \n需要注意的是，这与 [PDC-Net 论文](https:\u002F\u002Farxiv.org\u002Fabs\u002F2101.01710) 中使用的指标不同，在那篇论文中，PCK 是 **按序列** 计算的，采用的是 PDC-Net 的直接方法（对应于输出指标文件中的 '-per-rate' 结果）。\n\n| 方法        | 预训练模型类型 | rate=3 | rate=5 | rate=7 | rate=9 | rate=11 | rate=13 | rate=15 |\n|---------------|------------------------|--------|--------|--------|--------|---------|---------|---------|\n| LiteFlowNet   | chairs-things          | **61.63**   | **56.55**   | **49.83**   | **42.00**  | 33.14   | 26.46   | 21.22   |\n| PWC-Net       | chairs-things          | 58.50   | 52.02   | 44.86   | 37.41   | 30.36   | 24.75   | 19.89   |\n| PWC-Net-GOCor | chairs-things          | 58.93   | 53.10   |  46.91  |  40.93  |  **34.58**   |  **29.25**  |  **24.59**       |\n|---------------|------------------------|--------|--------|--------|--------|---------|---------|---------|\n| DGC-Net       |                        | \n| GLU-Net       | static                 | **50.55**   | **43.08**   | **36.98**   | 32.45   | 28.45    | 25.06    | 21.89   |\n| GLU-Net       | dynamic                | 46.27  | 39.28   | 34.05   | 30.11   | 26.69    | 23.73    | 20.85    |\n| GLU-Net-GOCor | dynamic                | 47.97   |41.79   | 36.81   | **33.03**   | **29.80**    | **26.93**    | **23.99**    |\n|---------------|------------------------|--------|--------|--------|--------|---------|---------|---------|\n| GLU-Net-GOCor*| megadepth              | 59.40 |  55.15 | 51.18 |  47.86 | 44.46 | 41.78 | 38.91 |\n| PDC-Net  (D)  | megadepth              |  61.82 | 58.41  | 55.02 |  52.40 | 49.61 | 47.43 | 45.01 | \n| PDC-Net  (H)  | megadepth            
  |  62.63  | 59.29 | 56.09 | 53.31 | 50.69 | 48.46 | 46.17 |\n| PDC-Net  (MS) | megadepth              |  62.29 | 59.14   | 55.87 | 53.23 | 50.59 | 48.45 | 46.17 |\n| PDC-Net+  (H) | megadepth              | **63.12** | **59.93** | **56.81** | **54.12** | **51.59** | **49.55** | **47.32** |\n| PDC-Net+  (MS) | megadepth              | 62.95 | 59.76 | 56.64 | 54.02 | 51.50 | 49.38 | 47.24 | \n\n\n\nPCK-5 对于不同图像对间隔率的情况：\n\n| 方法        | 预训练模型类型 | rate=3 | rate=5 | rate=7 | rate=9 | rate=11 | rate=13 | rate=15 |\n|---------------|------------------------|--------|--------|--------|--------|---------|---------|---------|\n| LiteFlowNet   | chairs-things          | **92.79**   | 90.70   | 86.29   | 78.50  | 66.07   | 55.05   | 46.29   |\n| PWC-Net       | chairs-things          | 92.64  | 90.82   | 87.32   | 81.80   | 72.95   | 64.07   | 55.47   |\n| PWC-Net-GOCor | chairs-things          | 92.81      |  **91.45**      |  **88.96**      |  **85.53**      |  **79.44**       |  **72.06**       |  **64.92**       |\n|---------------|------------------------|--------|--------|--------|--------|---------|---------|---------|\n| DGC-Net       |                        | 88.50   | 83.25   | 78.32  | 73.74   | 69.23   | 64.28    | 58.66  |\n| GLU-Net       | static                 | 91.22  | 87.91   | 84.23   |  80.74  | 76.84    | 72.35    | 67.77   |\n| GLU-Net       | dynamic                | 91.45   | 88.57   | 85.64   | 83.10   | 80.12    | 76.66    | 73.02    |\n| GLU-Net-GOCor | dynamic                | **92.08**   | **89.87**   | **87.77**   | **85.88**   | **83.69**    | **81.12**    | **77.90**    |\n|---------------|------------------------|--------|--------|--------|--------|---------|---------|---------|\n| GLU-Net-GOCor*| megadepth              |  93.03 |92.13 | 91.04 | 90.19 | 88.98 |  87.81 |  85.93 | \n| PDC-Net  (D) (paper) | megadepth       | 93.47 | 92.72 | 91.84 | 91.15 | 90.23 | 89.45 | 88.10 | \n| PDC-Net  (H)  | megadepth              | 93.50 | 92.71  | 91.93 | 91.16 | 90.35 | 89.52 | 88.32 | \n| PDC-Net  (MS)  | megadepth             |  93.47 | 92.69 | 91.85 | 91.15 | 90.33 | 89.55 | 88.43 | \n| PDC-Net+  (H)  | megadepth             |  **93.54** | 92.78 |  **92.04** | 91.30 | **90.60** | 89.9 | **89.03** |\n| PDC-Net+  (MS)  | megadepth             | 93.50 |  **92.79** | **92.04** | **91.35** | **90.60** | **89.97** | 88.97 | \n\n\n\n\u003C\u002Fdetails>\n\n\n\u003Cdetails>\n  \u003Csummary>\u003Cb>HPatches \u003Ca name=\"hpatches\">\u003C\u002Fa>\u003C\u002Fb>\u003C\u002Fsummary>\n\n**数据准备**: 使用以下命令下载数据：\n```bash\nbash assets\u002Fdownload_hpatches.sh\n```\n每个视点ID对应的CSV文件，其中包含图像路径和对应图像对之间的单应性参数，列在assets\u002F目录中。\n\n\n\u003Cbr \u002F>\u003Cbr \u002F>\n**评估**: 在admin\u002Flocal.py中更新'hp'的路径后，使用以下命令进行评估：\n```bash\npython eval_matching.py --dataset hp --model GLUNet_GOCor --pre_trained_models static --optim_iter 3 --local_optim_iter 7 --save_dir path_to_save_dir\n```\n应得到类似的结果：\n|                             | 预训练模型类型 | AEPE  | PCK-1 (\\%)  | PCK-3 (%) | PCK-5 (\\%)  |\n|-----------------------------|------------------------|-------|-------------|-----------|-------------|\n| DGC-Net \\[Melekhov2019\\] |                        | 33.26 | 12.00       |           | 58.06       |\n| GLU-Net           (本仓库)          | static                 | 25.05 | 39.57       |   71.45        | 78.60       |\n| GLU-Net           (论文)          | static                 | 25.05 | 39.55       |     -      | 78.54       |\n| GLU-Net-GOCor          (本仓库)     | static                 | **20.16** | 41.49  
     |   74.12        | **81.46**       |\n| GLU-Net-GOCor          (论文)     | static                 | **20.16** | **41.55**       |    -       | 81.43       |\n|---------------|------------------------|--------|--------|--------|--------|---------|---------|\n| PDCNet (D)         (本仓库)     | megadepth                 | 19.40 | 43.94       |    78.51       | 85.81       |\n| PDCNet (H)         (本仓库)     | megadepth                 | **17.51** |  **48.69**   |   **82.71**       | **89.44**       |\n\n\n\n\u003C\u002Fdetails>\n\n\n\n\u003Cdetails>\n  \u003Csummary>\u003Cb>KITTI\u003Ca name=\"kitti\">\u003C\u002Fa>\u003C\u002Fb>\u003C\u002Fsummary>\n\n**数据准备**: KITTI-2012和2015数据集均可在此处获取 [http:\u002F\u002Fwww.cvlibs.net\u002Fdatasets\u002Fkitti\u002Feval_flow.php](http:\u002F\u002Fwww.cvlibs.net\u002Fdatasets\u002Fkitti\u002Feval_flow.php)\n\n\u003Cbr \u002F> \n\n**评估**: 在admin\u002Flocal.py中更新'kitti2012'和'kitti2015'的路径后，使用以下命令进行评估：\n```bash\npython eval_matching.py --dataset kitti2015 --model PDCNet --pre_trained_models megadepth --optim_iter 3 --local_optim_iter 7 PDCNet --multi_stage_type direct\n```\n\n应得到类似的结果：\n|                |                         | KITTI-2012 |             | KITTI-2015 |           |\n|----------------|-------------------------|------------|-------------|------------|-----------|\n| Models         | Pre-trained model type  | AEPE       | F1   (%)    | AEPE       | F1  (%)   |\n| PWC-Net-GOCor    (this repo)    | chairs-things                | 4.12      | 19.58       |  10.33     | 31.23     |\n| PWC-Net-GOCor    (paper)    | chairs-things                | 4.12      | 19.31       |  10.33     | 30.53     |\n| PWC-Net-GOCor    (this repo)    | chairs-things ft sintel          |    2.60  |  9.69     | 7.64       | 21.36     |\n| PWC-Net-GOCor    (paper)    | chairs-things ft sintel          |    **2.60**  |  **9.67**     | **7.64**       | **20.93**     |\n|----------------|-------------------------|------------|-------------|------------|-----------|\n| GLU-Net    (this repo)    | static                | 3.33      | 18.91       | 9.79       | 37.77     |\n| GLU-Net    (this repo)    | dynamic                 | 3.12     | 19.73       | 7.59       | 33.92     |\n| GLU-Net    (paper)    | dynamic                 | 3.14       | 19.76       | 7.49       | 33.83     |\n| GLU-Net-GOCor  (this repo) | dynamic                 | **2.62**       | **15.17**       | **6.63**       | 27.58     |\n| GLU-Net-GOCor  (paper) | dynamic                 | 2.68       | 15.43       | 6.68       | **27.57**     |\n|----------------|-------------------------|------------|-------------|------------|-----------|\n| GLU-Net-GOCor* (paper) | megadepth               | 2.26       | 9.89        | 5.53       | 18.27     |\n| PDC-Net **(D)**    (paper and this repo) | megadepth               | 2.08       | 7.98        | 5.22       | 15.13     |\n| PDC-Net (H)    (this repo) | megadepth               | 2.16       | 8.19        | 5.31      | 15.23     |\n| PDC-Net (MS)    (this repo) | megadepth               | 2.16       | 8.13        | 5.40      | 15.33     |\n| PDC-Net+ **(D)**    (paper) | megadepth    | **1.76** | **6.60** | **4.53** | **12.62**  |\n\n\u003C\u002Fdetails>\n\n\n\u003Cdetails>\n  \u003Csummary>\u003Cb>Sintel \u003Ca name=\"sintel\">\u003C\u002Fa>\u003C\u002Fb>\u003C\u002Fsummary>\n\n**数据准备**: 使用以下命令下载数据\n```bash\nbash assets\u002Fdownload_sintel.sh\n```\n\n\n**评估**: 在 admin\u002Flocal.py 中更新 'sintel' 的路径后，使用以下命令进行评估：\n\n```bash\npython eval_matching.py --dataset sintel --model 
PDCNet --pre_trained_models megadepth --optim_iter 3 --local_optim_iter 7 --save_dir path_to_save_dir PDCNet --multi_stage_type direct\n```\n\n应得到类似的结果：\n|               | Pre-trained model type         | AEPE   | PCK-1 \u002F dataset (\\%) | PCK-5 \u002F dataset  (\\%)  | AEPE   | PCK-1  \u002F dataset (\\%) | PCK-5 \u002F dataset  (\\%)  |\n|---------------|---------------------|--------|-------------|--------------|--------|-------------|--------------|\n| PWC-Net-GOCor (this repo) | chairs-things          | 2.38   | 82.18       | 94.14        | 3.70   | 77.36       | 91.20        |\n| PWC-Net-GOCor (paper) | chairs-things          | 2.38   | 82.17       | 94.13        | 3.70   | 77.34       | 91.20        |\n| PWC-Net-GOCor (paper) | chairs-things ft sintel  | (1.74) | (87.93)     | (95.54)      | (2.28) | (84.15)     | (93.71)      |\n|---------------|--------------------|--------|-------------|--------------|--------|-------------|--------------|\n| GLU-Net   (this repo)     | dynamic                | 4.24   | 62.21       | 88.47        | 5.49   | 58.10       | 85.16        |\n| GLU-Net    (paper)     | dynamic               | 4.25   | 62.08       | 88.40        | 5.50   | 57.85       | 85.10        |\n| GLU-Net-GOCor (this repo) | dynamic                | **3.77**   | 67.11       | **90.47**        | **4.85**   | 63.36       | **87.76**        |\n| GLU-Net-GOCor (paper) | dynamic                | 3.80   | **67.12**       | 90.41        | 4.90   | **63.38**       | 87.69        |\n|---------------|-----------------|--------|-------------|--------------|--------|-------------|--------------|\n| GLU-Net-GOCor* (paper) | megadepth         | **3.12** | 80.00 | 92.68 | **4.46** | 73.10 | 88.94 | \n| PDC-Net (D)   (this repo)     |  megadepth     | 3.30  | **85.06**      | **93.38**       | 4.48  | **78.07**       | **90.07**        |\n| PDC-Net (H)   (this repo)     |  megadepth     | 3.38   | 84.95       | 93.35        | 4.50   | 77.62       | 90.07      |\n| PDC-Net (MS)   (this repo)     |  megadepth     | 3.40   | 84.85       | 93.33        | 4.54   | 77.41       | 90.06      |\n\u003C\u002Fdetails>\n\n\n\u003Cdetails>\n  \u003Csummary>\u003Cb>TSS  \u003Ca name=\"tss\">\u003C\u002Fa>\u003C\u002Fb>\u003C\u002Fsummary>\n\n**数据准备**: 要下载图像，请运行：\n```bash\nbash assets\u002Fdownload_tss.sh\n```\n \n\u003Cbr \u002F>\n\n**评估**: 在 admin\u002Flocal.py 中更新 'tss' 的路径后，使用以下命令进行评估：\n ```bash\npython eval_matching.py --dataset TSS --model GLUNet_GOCor --pre_trained_models static --optim_iter 3 --local_optim_iter 7 --flipping_condition True --save_dir path_to_save_dir\n```\n应得到类似的结果：\n| Model          | Pre-trained model type      | FGD3Car | JODS | PASCAL | All  |\n|--------------------------------|--------|---------|------|--------|------|\n| Semantic-GLU-Net \\[1\\] |   Static     | 94.4    | 75.5 | 78.3   | 82.8 |\n| GLU-Net (our repo)                | Static | 93.2    | 73.69 | 71.1   | 79.33 |\n| GLU-Net (paper)                | Static | 93.2    | 73.3 | 71.1   | 79.2 |\n| GLU-Net-GOCor (our repo, GOCor iter=3, 3)          | Static | 94.6   | 77.9 | 77.7   | 83.4 |\n| GLU-Net-GOCor (our repo, GOCor iter=3, 7)          | Static | 94.6    | 77.6 | 77.1   | 83.1 |\n| GLU-Net-GOCor (paper)          | Static | 94.6    | 77.9 | 77.7   | 83.4 |\n| Semantic-GLU-Net  \\[4\\]  |  pfpascal | 95.3 | 82.2 | 78.2 | \n| WarpC-SemanticGLU-Net  | pfpascal |  **97.1** | **84.7** | **79.7** | **87.2** |\n\n\u003C\u002Fdetails\n\n\u003Cdetails>\n  \u003Csummary>\u003Cb>PF-Pascal \u003Ca 
name=\"pfpascal\">\u003C\u002Fa>\u003C\u002Fb>\u003C\u002Fsummary>\n\n**数据准备**: 要下载图像，请运行：\n```bash\nbash assets\u002Fdownload_pf_pascal.sh\n```\n \n\u003Cbr \u002F>\n\n**评估**: 在 admin\u002Flocal.py 中更新 'PFPascal' 的路径后，使用以下命令进行评估：\n ```bash\npython eval_matching.py --dataset PFPascal --model WarpCSemanticGLUNet --pre_trained_models pfpascal --flipping_condition False --save_dir path_to_save_dir\n```\n应得到类似的结果：\n| Model          | Pre-trained model type     |  alpha=0.05 | alpha=0.1  \n|--------------------------------|--------|---------|---------|\n| Semantic-GLU-Net  \\[1\\]  |  static (paper) | 46.0 | 70.6 | \n| Semantic-GLU-Net  \\[1\\]  |  static (this repo) | 45.3 |  70.3 | \n| Semantic-GLU-Net  \\[4\\]  (this repo)  |  pfpascal | 48.4  | 72.4 |\n| WarpC-SemanticGLU-Net  \\[4\\] (paper)  | pfpascal |   62.1 | **81.7** |\n| WarpC-SemanticGLU-Net  \\[4\\]  (this repo) | pfpascal  |   **62.7** |  **81.7** | \n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n  \u003Csummary>\u003Cb>PF-Willow \u003Ca name=\"pfwillow\">\u003C\u002Fa>\u003C\u002Fb>\u003C\u002Fsummary>\n\n**数据准备**: 要下载图像，请运行：\n```bash\nbash assets\u002Fdownload_pf_willow.sh\n```\n \n\u003Cbr \u002F>\n\n**评估**：在 admin\u002Flocal.py 中更新 'PFWillow' 的路径后，使用以下命令运行评估：\n```bash\npython eval_matching.py --dataset PFWillow --model WarpCSemanticGLUNet --pre_trained_models pfpascal --flipping_condition False --save_dir path_to_save_dir\n```\n应得到类似的结果：\n| 模型          | 预训练模型类型     |  alpha=0.05  |  alpha=0.1  \n|--------------------------------|--------|---------|---------|\n| Semantic-GLU-Net  \\[1\\]  (论文) |  static | 36.4 | 63.8 |\n| Semantic-GLU-Net  \\[1\\]  (本仓库) |  static | 36.2 | 63.7 |\n| Semantic-GLU-Net  \\[4\\]  |  pfpascal | 39.7 | 67.6 |\n| WarpC-SemanticGLU-Net  \\[4\\] (论文)  | pfpascal |  **49.0** | 75.1 | \n| WarpC-SemanticGLU-Net  \\[4\\] (本仓库)  | pfpascal |  48.9 | **75.2** | \n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n  \u003Csummary>\u003Cb>Spair-71k \u003Ca name=\"spair\">\u003C\u002Fa>\u003C\u002Fb>\u003C\u002Fsummary>\n\n**数据准备**：要下载图像，运行：\n```bash\nbash assets\u002Fdownload_spair.sh\n```\n \n\u003Cbr \u002F>\n\n**评估**：在 admin\u002Flocal.py 中更新 'spair' 的路径后，使用以下命令运行评估：\n```bash\npython eval_matching.py --dataset spair --model WarpCSemanticGLUNet --pre_trained_models pfpascal  --flipping_condition False --save_dir path_to_save_dir\n```\n应得到类似的结果：\n| 模型          | 预训练模型类型     |  alpha=0.1  \n|--------------------------------|--------|---------|\n| Semantic-GLU-Net  \\[1\\]  |  static |  15.1\n| Semantic-GLU-Net  \\[4\\]  |  pfpascal | 16.5 |\n| WarpC-SemanticGLU-Net   | spair |   23.5 |\n| WarpC-SemanticGLU-Net  \\[4\\] | pfpascal |   **23.8** |\n\u003C\u002Fdetails>\n\n\n\u003Cdetails>\n  \u003Csummary>\u003Cb>Caltech-101 \u003Ca name=\"caltech\">\u003C\u002Fa>\u003C\u002Fb>\u003C\u002Fsummary>\n\n**数据准备**：要下载图像，运行：\n```bash\nbash assets\u002Fdownload_caltech.sh\n```\n \n\u003Cbr \u002F>\n\n**评估**：在 admin\u002Flocal.py 中更新 'spair' 的路径后，使用以下命令运行评估：\n```bash\npython eval_matching.py --dataset caltech --model WarpCSemanticGLUNet --pre_trained_models pfpascal  --flipping_condition False --save_dir path_to_save_dir\n```\n\u003C\u002Fdetails>\n\n\n\n\n\n\n\n### 4.2 姿态估计 \u003Ca name=\"pose_estimation\">\u003C\u002Fa>\n\n\n指标通过以下命令计算：\n```bash\npython -u eval_pose_estimation.py --dataset dataset_name --model model_name --pre_trained_models pre_trained_model_name --optim_iter optim_step  --local_optim_iter local_optim_iter --estimate_at_quarter_reso True --mask_type_for_pose_estimation proba_interval_1_above_10 --save_dir 
\n\u003Cdetails>\n  \u003Csummary>\u003Cb>YFCC100M  \u003Ca name=\"yfcc\">\u003C\u002Fa>\u003C\u002Fb>\u003C\u002Fsummary>\n  \n**数据准备**：YFCC 的真值标注文件为 assets\u002Fyfcc_test_pairs_with_gt_original.txt（来自 [SuperGlue 仓库](https:\u002F\u002Fgithub.com\u002Fmagicleap\u002FSuperGluePretrainedNetwork)）。 \n图像可以从 [OANet 仓库](https:\u002F\u002Fgithub.com\u002Fzjhthu\u002FOANet) 下载，并通过以下命令移动到所需位置：\n```bash\nbash assets\u002Fdownload_yfcc.sh\n```\n文件结构应为：\n```bash\nYFCC\n└── images\u002F\n       ├── buckingham_palace\u002F\n       ├── notre_dame_front_facade\u002F\n       ├── reichstag\u002F\n       └── sacre_coeur\u002F\n```\n\n\u003Cbr \u002F>\u003Cbr \u002F>\n\n**评估**：在 admin\u002Flocal.py 中更新 'yfcc' 的路径后，使用 PDC-Net 单应矩阵 (H) 模式在 YFCC100M 上计算指标，命令如下：\n\n```bash\npython -u eval_pose_estimation.py --dataset YFCC --model PDCNet --pre_trained_models megadepth --optim_iter 3  --local_optim_iter 7 --estimate_at_quarter_reso True --mask_type_for_pose_estimation proba_interval_1_above_10 --save_dir path_to_save_dir PDCNet --multi_stage_type H --mask_type proba_interval_1_above_10\n```\n\n您应该会得到类似的指标（由于 RANSAC 的随机性，结果可能不会完全相同）：\n\n|              | mAP @5 | mAP @10 | mAP @20 | 运行时间 (s) |\n|--------------|--------|---------|---------|--------------|\n| PDC-Net (D)  | 60.52  | 70.91   | 80.30   | 0.         |\n| PDC-Net (H)  | 63.90  | 73.00   | 81.22   | 0.74         |\n| PDC-Net (MS) | 65.18  | 74.21   | 82.42   | 2.55         |\n| PDC-Net+ (D) |  63.93 | 73.81 | 82.74 | - |\n| PDC-Net+ (H) | **67.35** | **76.56** | **84.56** | 0.74 |\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n  \u003Csummary>\u003Cb>ScanNet \u003Ca name=\"scanNet\">\u003C\u002Fa>\u003C\u002Fb>\u003C\u002Fsummary>\n  \n**数据准备**：ScanNet 测试集的图像（scene0707_00 至 scene0806_00，共 100 个场景）可在 [这里](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F19o07SOWpv_DQcIbjb87BAKBHNCcsr4Ax\u002Fview?usp=sharing) 下载。 \n这些图像提取自 [ScanNet GitHub 仓库](https:\u002F\u002Fgithub.com\u002FScanNet\u002FScanNet)，并经过处理。 \n我们使用 [SuperGlue 仓库](https:\u002F\u002Fgithub.com\u002Fmagicleap\u002FSuperGluePretrainedNetwork) 提供的真值标注文件 assets\u002Fscannet_test_pairs_with_gt.txt。\n\n\u003Cbr \u002F>\u003Cbr \u002F>\n\n**评估**：在 admin\u002Flocal.py 中更新 'scannet_test' 的路径后，使用 PDC-Net 单应矩阵 (H) 模式在 ScanNet 上计算指标，命令如下：\n```bash\npython -u eval_pose_estimation.py --dataset scannet --model PDCNet --pre_trained_models megadepth --optim_iter 3  --local_optim_iter 7 --estimate_at_quarter_reso True --mask_type_for_pose_estimation proba_interval_1_above_10 --save_dir path_to_save_dir PDCNet --multi_stage_type H --mask_type proba_interval_1_above_10\n```\n\n您应该会得到类似的指标（由于 RANSAC 的随机性，结果可能不会完全相同）：\n\n|              | mAP @5 | mAP @10 | mAP @20 |\n|--------------|--------|---------|---------|\n| PDC-Net (D)  | 39.93  | 50.17   | 60.87   |\n| PDC-Net (H)  | 42.87  | 53.07   | 63.25   |\n| PDC-Net (MS) | 42.40  | 52.83   | 63.13   |\n| PDC-Net+ (D) |  42.93 | 53.13 | 63.95 |\n| PDC-Net+ (H) |  **45.66** | **56.67** | **67.07** |\n\n\u003C\u002Fdetails>\n\n\n### 4.3 HPatches 上的稀疏评估 \u003Ca name=\"sparse_hp\">\u003C\u002Fa>\n\nHPatches 稀疏评估结果的缓存可从 [这里](https:\u002F\u002Fdrive.google.com\u002Fdrive\u002Ffolders\u002F1gphUcvBXO12EsqskdMlH3CsLxHPLtIqL?usp=sharing) 下载。 \n更多详情请参阅 [PDC-Net+](https:\u002F\u002Farxiv.org\u002Fabs\u002F2109.13912)。\n\n\n## 5. 
训练 \u003Ca name=\"training\">\u003C\u002Fa>\n\n### 快速入门\n\n安装过程中应已生成本地配置文件 \"admin\u002Flocal.py\"。 \n如果未生成该文件，请运行以下命令创建：\n```bash\npython -c \"from admin.environment import create_default_local_file; create_default_local_file()\"\n```\n接下来，设置训练工作区的路径，即保存模型权重和检查点的目录。 \n同时设置您想要使用的数据集的路径（这些数据集需要提前下载，详见下文）。 \n如果所有依赖项都已正确安装，您可以在正确的 conda 环境中使用 run_training.py 脚本训练网络：\n\n```bash\nconda activate dense_matching_env\npython run_training.py train_module train_name\n```\n\n其中，train_module 是 train_settings 内的子模块，train_name 是要使用的训练设置文件名。\n\n例如，您可以使用附带的默认 train_PDCNet_stage1 设置进行训练，运行以下命令：\n```bash\npython run_training.py PDCNet train_PDCNet_stage1\n```\n\n### 训练数据集下载 \u003Ca name=\"training_datasets\">\u003C\u002Fa>\n\n\u003Cdetails>\n  \u003Csummary>\u003Cb>DPED-CityScape-ADE \u003C\u002Fb>\u003C\u002Fsummary>\n\n这是与 [GLU-Net 仓库](https:\u002F\u002Fgithub.com\u002FPruneTruong\u002FGLU-Net) 中使用的相同图像对。\n在训练过程中，我们结合使用了 DPED、CityScapes 和 ADE-20K 数据集。\nDPED 训练数据集仅由四台不同相机拍摄的约 5000 组图像组成。\n我们使用其中两台相机的图像，最终得到大约 10,000 张图像。\nCityScapes 额外增加了约 23,000 张图像。\n我们再从 ADE-20K 中随机抽取一些最低分辨率为 750x750 的图像作为补充。\n这样总共得到约 40,000 张原始图像，再通过对其施加几何变换来生成训练用的图像对。\n原始图像的路径以及几何变换参数均记录在 CSV 文件中：\n'assets\u002Fcsv_files\u002Fhomo_aff_tps_train_DPED_CityScape_ADE.csv' 和 'assets\u002Fcsv_files\u002Fhomo_aff_tps_test_DPED_CityScape_ADE.csv'。\n\n1. 下载原始图像\n\n* 下载 [DPED 数据集](http:\u002F\u002Fpeople.ee.ethz.ch\u002F~ihnatova\u002F) (54 GB) ==> 图像将被存放在 original_images\u002F 目录下\n* 下载 [CityScapes 数据集](https:\u002F\u002Fwww.cityscapes-dataset.com\u002F)\n    - 下载 'leftImg8bit_trainvaltest.zip' (11GB，包含训练、验证和测试集的左视图 8 位图像，共 5000 张) ==> 图像将被存放在 CityScape\u002F 目录下\n    - 下载 'leftImg8bit_trainextra.zip' (44GB，包含额外训练集的左视图 8 位图像，共 19998 张) ==> 图像将被存放在 CityScape_extra\u002F 目录下\n* 下载 [ADE-20K 数据集](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F19r7dsYraHsNGI1ViZi4VwCfQywdODCDU\u002Fview?usp=sharing) (3.8 GB，20,210 张图像) ==> 图像将被存放在 ADE20K_2016_07_26\u002F 目录下\n\n请将所有数据集放置在同一目录下。示例性的根训练目录结构如下：\n```bash\ntraining_datasets\u002F\n    ├── original_images\u002F\n    ├── CityScape\u002F\n    ├── CityScape_extra\u002F\n    └── ADE20K_2016_07_26\u002F\n```\n\n2. 将合成图像对及光流保存至磁盘\n\n在训练过程中，可以从这些原始图像中每轮迭代动态生成合成图像对。\n不过这种生成方式需要一定时间，而且我们本来也不会逐轮做额外的数据增强，因此也可以预先生成数据集并将其保存到磁盘。\n训练时只需从磁盘加载构成训练数据集的图像对，再送入网络处理，这样会快得多。\n要生成训练数据集并将其保存到磁盘：\n\n```bash\npython assets\u002Fsave_training_dataset_to_disk.py --image_data_path \u002Fdirectory\u002Fto\u002Foriginal\u002Ftraining_datasets\u002F \\\n--csv_path assets\u002Fcsv_files\u002Fhomo_aff_tps_train_DPED_CityScape_ADE.csv --save_dir \u002Fpath\u002Fto\u002Fsave_dir --plot True\n```\n这将在 save_dir\u002Fimages 和 save_dir\u002Fflow 目录下分别生成图像对及其对应的光流场。\n
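\n上述合成过程的原理可以用下面的极简示意说明（假设性代码，并非 save_training_dataset_to_disk.py 的实际实现）：对原始图像施加一个随机单应变换得到目标图像，稠密光流真值则可由该变换解析地算出：\n\n```python\n# 示意性示例（假设性代码，非本仓库 API）：由随机单应变换合成图像对及稠密光流真值\nimport cv2\nimport numpy as np\n\ndef synthesize_pair(image, max_shift=32):\n    # 对四个角点施加随机扰动，构造一个随机单应矩阵 H\n    h, w = image.shape[:2]\n    corners = np.float32([[0, 0], [w, 0], [w, h], [0, h]])\n    perturbed = corners + np.random.uniform(-max_shift, max_shift, corners.shape).astype(np.float32)\n    H = cv2.getPerspectiveTransform(corners, perturbed)\n\n    # 目标图像：源图像中位于 x 处的像素会出现在目标图像的 H·x 处\n    target = cv2.warpPerspective(image, H, (w, h))\n\n    # 稠密光流真值：flow(x) = H·x - x，逐像素解析计算\n    xx, yy = np.meshgrid(np.arange(w), np.arange(h))\n    grid = np.stack([xx, yy, np.ones_like(xx)], axis=-1).astype(np.float64)  # (h, w, 3) 齐次坐标\n    mapped = grid @ H.T\n    mapped = mapped[..., :2] \u002F mapped[..., 2:3]  # 齐次坐标归一化\n    flow = (mapped - grid[..., :2]).astype(np.float32)  # (h, w, 2)\n    return target, flow\n```\n\n实际流水线在此基础上还会使用仿射和 TPS 变换，并叠加独立运动的物体（见下文各训练设置的说明）。\n\n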
3. 在 admin\u002Flocal.py 中添加路径，分别命名为 'training_cad_520' 和 'validation_cad_520'。\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n  \u003Csummary>\u003Cb>COCO \u003C\u002Fb>\u003C\u002Fsummary>\n  \nCOCO 图像用于在训练图像对上添加运动物体。\n请从 [这里](http:\u002F\u002Fcocodataset.org\u002F#download) 下载图像及其标注文件。根文件夹应按以下方式组织：\n```bash\ncoco_root\n    ├── annotations\n    │   └── instances_train2014.json\n    └── images\n        └── train2014\n```\n然后在 admin\u002Flocal.py 中将路径添加为 'coco'。\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n  \u003Csummary>\u003Cb>MegaDepth \u003C\u002Fb>\u003C\u002Fsummary>\n  \n我们使用 [D2-Net 仓库](https:\u002F\u002Fgithub.com\u002Fmihaidusmanu\u002Fd2-net) 提供的重建结果。\n您可以直接从 [Google Drive](https:\u002F\u002Fdrive.google.com\u002Fdrive\u002Ffolders\u002F1hxpOsqOZefdrba_BqnW490XpNX_LgXPB) 下载去畸变后的重建结果及聚合的场景信息文件夹。\n\n文件结构应如下所示：\n```bash\nMegaDepth\n├── Undistorted_Sfm\n└── scene_info\n```\n\n然后在 admin\u002Flocal.py 中将路径添加为 'megadepth_training'。\n\n\u003C\u002Fdetails>\n\n\n### 训练脚本\n\n目前该框架包含了以下匹配网络的训练代码。\n这些设置文件既可用于训练网络，也可用于了解确切的训练细节。\n\n\u003Cdetails>\n  \u003Csummary>\u003Cb>UAWarpC \u003Ca name=\"uawarpc\">\u003C\u002Fa>\u003C\u002Fb>\u003C\u002Fsummary>\n \n这是基于我们最近的工作 [Refign](https:\u002F\u002Fieeexplore.ieee.org\u002Fdocument\u002F10030987)，其代码可在 [GitHub](https:\u002F\u002Fgithub.com\u002Fbrdav\u002Frefign) 上找到。\n它允许无监督地训练概率对应网络（结合了 [PDCNet](https:\u002F\u002Farxiv.org\u002Fabs\u002F2101.01710) 和 [WarpC](https:\u002F\u002Farxiv.org\u002Fabs\u002F2104.03308)）。\n该网络使用简化的 PDCNet 版本，去除了 GOCor 模块，并且在每个像素点预测单模态高斯概率分布，而非拉普拉斯混合模型。\n\n* **UAWarpC.train_UAWarpC_PDCNet_stage1**: 第一阶段网络训练的默认设置，不使用可见性掩码。\n我们在 MegaDepth 数据集的真实图像对上进行训练。\n\n* **UAWarpC.train_UAWarpC_PDCNet_stage2**: 我们进一步微调第一阶段训练得到的网络，加入了我们的可见性掩码。\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n  \u003Csummary>\u003Cb>概率性形变一致性 (PWarpC) \u003Ca name=\"pwarpc\">\u003C\u002Fa>\u003C\u002Fb>\u003C\u002Fsummary>\n \n* **PWarpC.train_weakly_supervised_PWarpC_SFNet_pfpascal**: 在 PF-Pascal 数据集上训练弱监督 PWarpC-SF-Net 的默认设置。\n\n* **PWarpC.train_weakly_supervised_PWarpC_SFNet_spair_from_pfpascal**: 在 SPair-71K 数据集上训练弱监督 PWarpC-SF-Net 的默认设置。\n具体来说，先在 PF-Pascal 数据集上训练（如上所述），然后再在 SPair-71K 数据集上进行微调。\n\n* **PWarpC.train_strongly_supervised_PWarpC_SFNet_pfpascal**: 在 PF-Pascal 数据集上训练强监督 PWarpC-SF-Net 的默认设置。\n\n* **PWarpC.train_strongly_supervised_PWarpC_SFNet_spair_from_pfpascal**: 在 SPair-71K 数据集上训练强监督 PWarpC-SF-Net 的默认设置。\n\n* 其余设置待补充\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n  \u003Csummary>\u003Cb>形变一致性 (WarpC) \u003Ca name=\"warpc\">\u003C\u002Fa>\u003C\u002Fb>\u003C\u002Fsummary>\n  \n* **WarpC.train_WarpC_GLUNet_stage1**: 第一阶段网络训练的默认设置，不使用可见性掩码。\n我们在 MegaDepth 数据集的真实图像对上进行训练。\n\n* **WarpC.train_WarpC_GLUNet_stage2**: 我们进一步微调第一阶段训练得到的网络，加入了我们的可见性掩码。\n该网络即为最终的 WarpC-GLU-Net（参见 [WarpC 论文](https:\u002F\u002Farxiv.org\u002Fabs\u002F2104.03308)）。\n\n* **WarpC.train_ft_WarpCSemanticGLUNet**: 训练最终 WarpC-SemanticGLU-Net 的默认设置（参见 [WarpC 论文](https:\u002F\u002Farxiv.org\u002Fabs\u002F2104.03308)）。\n我们将原本在静态\u002FCAD 合成数据上训练的 Semantic-GLU-Net，利用形变一致性技术在 PF-Pascal 数据集上进行了微调。\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n  \u003Csummary>\u003Cb>PDC-Net 和 PDC-Net+ \u003Ca name=\"pdcnet\">\u003C\u002Fa>\u003C\u002Fb>\u003C\u002Fsummary>\n\n* **PDCNet.train_PDCNet_plus_stage1**: 第一阶段网络训练的默认设置，其中骨干网络权重固定。\n我们首先在 DPED、CityScape 和 ADE 数据集上使用预先计算并保存的合成图像对进行训练，\n并在这些图像对上添加多个独立移动的对象及扰动。此外，我们还应用了对象重投影掩码。\n\n* **PDCNet.train_PDCNet_plus_stage2**: 
训练最终 PDC-Net+ 模型的默认设置（参见 [PDC-Net+ 论文](https:\u002F\u002Farxiv.org\u002Fabs\u002F2109.13912)）。\n该设置会对使用 PDCNet_stage1 训练得到的模型中的所有层进行微调，包括特征骨干网络。训练数据集由第一阶段使用的相同数据集，以及来自 MegaDepth 数据集的图像对及其稀疏真值对应关系组成。同时，我们也应用了重投影掩码。\n\n* **PDCNet.train_PDCNet_stage1**: 第一阶段网络训练的默认设置，其中骨干网络权重固定。\n我们使用预训练的 ImageNet 权重初始化骨干网络 VGG-16，首先在 DPED、CityScape 和 ADE 数据集上使用预先计算并保存的合成图像对进行训练，\n并在这些图像对上添加独立移动的对象及扰动。\n\n* **PDCNet.train_PDCNet_stage2**: 训练最终 PDC-Net 模型的默认设置（参见 [PDC-Net 论文](https:\u002F\u002Farxiv.org\u002Fabs\u002F2101.01710)）。\n该设置会对使用 PDCNet_stage1 训练得到的模型中的所有层进行微调，包括特征骨干网络。训练数据集由第一阶段使用的相同数据集，以及来自 MegaDepth 数据集的图像对及其稀疏真值对应关系组成。\n\n* **PDCNet.train_GLUNet_GOCor_star_stage1**: 设置与 PDCNet_stage1 相同，但使用不同的模型（非概率基线）。损失函数相应地改为 L1 损失，而非负对数似然损失。\n\n* **PDCNet.train_GLUNet_GOCor_star_stage2**: 训练最终 GLU-Net-GOCor* 的默认设置（参见 [PDC-Net 论文](https:\u002F\u002Farxiv.org\u002Fabs\u002F2101.01710)）。\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n  \u003Csummary>\u003Cb>使用随机生成数据的训练示例 \u003Ca name=\"synthetic_data\">\u003C\u002Fa>\u003C\u002Fb>\u003C\u002Fsummary>\n\n* **GLUNet.train_GLUNet_with_synthetically_generated_data**: 这是一个简单的示例，说明如何实时生成随机变换并将其应用于原始图像，以创建训练图像对及其对应的真值光流。此处，随机变换被应用于 MegaDepth 图像。在生成的图像对和真值光流基础上，我们进一步添加了一个随机移动的对象。\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n  \u003Csummary>\u003Cb>GLU-Net \u003Ca name=\"glunet\">\u003C\u002Fa>\u003C\u002Fb>\u003C\u002Fsummary>\n\n* **GLUNet.train_GLUNet_static**: 训练最终 GLU-Net 的默认设置（出自论文 [GLU-Net](https:\u002F\u002Farxiv.org\u002Fabs\u002F1912.05524)）。\n我们固定骨干网络权重，并使用预训练的 ImageNet 权重初始化骨干网络 VGG-16。\n训练数据为 DPED、CityScape 和 ADE 数据集上预先计算并保存的合成图像对，这些数据后来在 [GOCor 论文](https:\u002F\u002Farxiv.org\u002Fabs\u002F2009.07823) 中被称为“静态”数据集。\n\n* **GLUNet.train_GLUNet_dynamic**: 训练基于动态数据集的最终 GLU-Net 的默认设置（出自论文 [GOCor](https:\u002F\u002Farxiv.org\u002Fabs\u002F2009.07823)）。\n我们同样固定骨干网络权重，并使用预训练的 ImageNet 权重初始化骨干网络 VGG-16。\n训练数据为 DPED、CityScape 和 ADE 数据集上预先计算并保存的合成图像对，并在这些图像对上添加一个独立移动的对象。\n该数据集在 [GOCor 论文](https:\u002F\u002Farxiv.org\u002Fabs\u002F2009.07823) 中被称为“动态”数据集。\n\n* **GLUNet.train_GLUNet_GOCor_static**: 训练最终 GLU-Net-GOCor 的默认设置（出自论文 [GOCor](https:\u002F\u002Farxiv.org\u002Fabs\u002F2009.07823)）。\n我们固定骨干网络权重，并使用预训练的 ImageNet 权重初始化骨干网络 VGG-16。\n训练数据同样为预先计算并保存的“静态”数据集。\n\n* **GLUNet.train_GLUNet_GOCor_dynamic**: 训练基于动态数据集的最终 GLU-Net-GOCor 的默认设置（出自论文 [GOCor](https:\u002F\u002Farxiv.org\u002Fabs\u002F2009.07823)）。\n我们固定骨干网络权重，并使用预训练的 ImageNet 权重初始化骨干网络 VGG-16。\n训练数据同样为在合成图像对上添加了一个独立移动对象的“动态”数据集。\n\n\u003C\u002Fdetails>\n\n\n### 训练您自己的网络\n\n要使用该工具包训练自定义网络，需要在训练设置中指定以下组件（组合方式见本列表后的示意代码）。\n参考示例：[train_GLUNet_static.py](https:\u002F\u002Fgithub.com\u002FPruneTruong\u002FDenseMatching\u002Fblob\u002Fmain\u002Ftrain_settings\u002FGLUNet\u002Ftrain_GLUNet_static.py)。\n\n* 数据集：用于训练的数据集。datasets 模块中已提供若干标准匹配数据集。数据集类可以传入一个处理函数，该函数在数据分批之前执行必要的处理操作，例如数据增强和转换为张量。\n* 数据加载器：决定如何采样批次，可使用特定的采样器。\n* 网络：待训练的网络模块。\n* 批次预处理模块：负责接收批次并将其转换为网络训练所需的输入，具体实现取决于不同的网络和训练策略。\n* 目标函数：训练目标。\n* 演员（actor）：训练师将训练批次传递给演员，由演员负责将数据正确地送入网络并计算训练损失。批次预处理也在演员类中完成。\n* 优化器：所使用的优化器，例如 Adam。\n* 调度器：所使用的学习率调度器。\n* 训练师（trainer）：运行训练轮次并保存检查点的主要类。\n
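\n下面用一段可运行的通用示意代码（假设性示例，并非本工具包的实际 API）串起这些组件，帮助理解它们各自的职责；真实的组件接口请以上面的 train_GLUNet_static.py 为准：\n\n```python\n# 通用示意（假设性代码，非本仓库 API）：各训练组件如何在一次训练中协作\nimport torch\nfrom torch.utils.data import DataLoader, TensorDataset\n\n# 数据集 + 数据加载器：这里用随机张量代替真实的匹配数据集\ndataset = TensorDataset(torch.randn(64, 3, 64, 64), torch.randn(64, 2, 64, 64))\nloader = DataLoader(dataset, batch_size=8, shuffle=True)\n\n# 网络：用一个小卷积层代替真实的匹配网络（输出 2 通道光流）\nnet = torch.nn.Conv2d(3, 2, kernel_size=3, padding=1)\n\n# 目标函数：例如 L1（EPE 风格）损失\nobjective = torch.nn.L1Loss()\n\n# “演员”：把批次送入网络并计算损失（批次预处理也在这里完成）\ndef actor(batch):\n    images, gt_flow = batch\n    return objective(net(images), gt_flow)\n\n# 优化器、调度器与“训练师”（训练循环 + 保存检查点）\noptimizer = torch.optim.Adam(net.parameters(), lr=1e-4)\nscheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.5)\nfor epoch in range(2):\n    for batch in loader:\n        loss = actor(batch)\n        optimizer.zero_grad()\n        loss.backward()\n        optimizer.step()\n    scheduler.step()\n    torch.save(net.state_dict(), f'checkpoint_epoch{epoch}.pth')\n```\n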
\n\n## 6. 致谢 \u003Ca name=\"acknowledgement\">\u003C\u002Fa>\n\n我们借鉴了多个公开项目的代码，例如 [pytracking](https:\u002F\u002Fgithub.com\u002Fvisionml\u002Fpytracking)、[GLU-Net](https:\u002F\u002Fgithub.com\u002FPruneTruong\u002FGLU-Net)、\n[DGC-Net](https:\u002F\u002Fgithub.com\u002FAaltoVision\u002FDGC-Net)、[PWC-Net](https:\u002F\u002Fgithub.com\u002FNVlabs\u002FPWC-Net)、\n[NC-Net](https:\u002F\u002Fgithub.com\u002Fignacio-rocco\u002Fncnet)、[Flow-Net-Pytorch](https:\u002F\u002Fgithub.com\u002FClementPinard\u002FFlowNetPytorch)、\n[RAFT](https:\u002F\u002Fgithub.com\u002Fprinceton-vl\u002FRAFT)、[CATs](https:\u002F\u002Fgithub.com\u002FSunghwanHong\u002FCost-Aggregation-transformers) 等。\n\n## 7. 变更日志 \u003Ca name=\"changelog\">\u003C\u002Fa>\n\n* 2021年6月21日：添加了评估代码\n* 2021年7月21日：添加了训练代码及更多评估选项\n* 2021年8月21日：修复了混合数据集中的内存泄漏问题，并为 MegaDepth 数据集增加了其他采样方式\n* 2021年10月21日：添加了 WarpC 的预训练模型\n* 2021年12月21日：添加了 WarpC 和 PDC-Net+ 的训练代码、随机生成数据的代码、Caltech 数据集的评估代码、PDC-Net+ 的预训练模型以及笔记本演示\n* 2022年2月22日：进行了小幅修改\n* 2022年3月22日：进行了大规模重构，新增了视频演示、PWarpC 的代码，并将反卷积层的默认初始化设置为双线性插值权重。","# DenseMatching 快速上手指南\n\nDenseMatching 是一个基于 PyTorch 的通用稠密匹配库，支持几何匹配、光流估计和语义匹配任务。它集成了 GLU-Net、PDC-Net、WarpC 等多个经典模型的官方实现，并提供预训练权重和完整的训练\u002F评估框架。\n\n## 1. 环境准备\n\n在开始之前，请确保您的系统满足以下要求：\n\n*   **操作系统**: Linux (推荐) 或 macOS\n*   **Python**: 3.7 或更高版本\n*   **PyTorch**: 1.0 或更高版本（推理的最低要求）\n*   **硬件**: 推荐使用 NVIDIA GPU 以加速计算\n\n**前置依赖安装：**\n\n建议先安装基础依赖。国内用户可使用清华源或阿里源加速 `pip` 下载：\n\n```bash\n# 可选：配置国内 pip 源\npip config set global.index-url https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple\n\n# 安装 PyTorch (请根据官网选择适合您 CUDA 版本的命令，以下为示例)\npip install torch torchvision torchaudio\n```\n\n## 2. 安装步骤\n\n克隆仓库并配置 Python 虚拟环境：\n\n```bash\n# 1. 克隆项目代码\ngit clone https:\u002F\u002Fgithub.com\u002FPruneTruong\u002FDenseMatching.git\ncd DenseMatching\n\n# 2. 创建并激活 Conda 环境 (推荐 Python 3.7)\nconda create -n dense_matching_env python=3.7\nconda activate dense_matching_env\n\n# 3. 安装项目依赖\n# 注意：若仓库根目录提供 requirements.txt，可执行以下命令安装\npip install -r requirements.txt\n```\n\n> **提示**：如果 `requirements.txt` 中某些包下载缓慢，可手动指定国内镜像源安装，例如：`pip install -r requirements.txt -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple`。\n
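\n安装完成后，可以用几行 Python 快速自检环境（示意性检查，并非仓库自带脚本）：\n\n```python\n# 可选：验证 PyTorch 安装与 GPU 是否可用\nimport torch\n\nprint('PyTorch 版本:', torch.__version__)\nprint('CUDA 可用:', torch.cuda.is_available())\n```\n\n## 3. 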
基本使用\n\n### 获取预训练模型\n\n项目提供了多种任务的预训练权重。请访问 [Model Zoo](https:\u002F\u002Fgithub.com\u002FPruneTruong\u002FDenseMatching\u002Fblob\u002Fmain\u002FMODEL_ZOO.md) 下载您需要的模型文件（例如 `PDCNet_plus.pth`），并将其放置在项目指定的目录下（通常为 `pre_trained_models\u002F`）。\n\n### 测试自己的图像对\n\n最简单的使用方式是通过提供的脚本对两张图像进行稠密匹配预测。假设您有两张图像 `image1.jpg` 和 `image2.jpg`。\n\n**运行推理命令：**\n\n```bash\npython demo.py \\\n    --first_img_path path\u002Fto\u002Fimage1.jpg \\\n    --second_img_path path\u002Fto\u002Fimage2.jpg \\\n    --model PDCNet_plus \\\n    --pre_trained_path pre_trained_models\u002FPDCNet_plus.pth \\\n    --output_dir results\u002F\n```\n\n**参数说明：**\n*   `--first_img_path` \u002F `--second_img_path`: 输入的两张图像路径。\n*   `--model`: 模型名称 (如 `GLU_Net`, `PDCNet`, `PDCNet_plus`, `WarpC` 等)。\n*   `--pre_trained_path`: 下载的预训练权重文件路径。\n*   `--output_dir`: 输出结果目录，将包含可视化的光流图和匹配结果。\n\n运行结束后，请在 `results\u002F` 目录中查看生成的匹配可视化结果。\n\n### 更多演示\n\n项目还包含其他演示脚本，用于分析网络性能或在标准数据集上进行评估，具体可参考仓库中的 `scripts\u002F` 目录及主 README 的 \"Benchmarks and results\" 章节。","某自动驾驶感知团队正在开发一套视觉定位系统，需要精准计算连续帧图像中车辆与行人的像素级位移，以辅助高精地图构建。\n\n### 没有 DenseMatching 时\n- **数据准备繁琐**：团队需手动编写大量脚本生成随机图像对及对应的真值光流，难以模拟真实场景中的遮挡和动态物体干扰。\n- **模型复现困难**：想要尝试业界领先的 PDC-Net 或 WarpC 等算法，必须从零搭建网络架构并调试训练超参数，耗时数周且容易出错。\n- **评估标准不一**：缺乏统一的验证数据集（如 KITTI、Sintel）和标准化评测脚本，导致不同模型间的性能对比缺乏说服力。\n- **训练效率低下**：由于缺少针对上采样卷积权重的优化初始化策略，模型收敛速度慢，难以在有限算力下快速迭代。\n\n### 使用 DenseMatching 后\n- **数据生成自动化**：直接调用库内函数即可生成带有随机遮挡和动态物体修改的高质量训练数据对，大幅缩短数据预处理周期。\n- **开箱即用主流模型**：直接加载预训练的 GLU-Net、PDC-Net 等官方模型权重，无需重复造轮子，当天即可开展下游任务微调。\n- **评测流程标准化**：利用内置的 MegaDepth、HPatches 等几何匹配数据集及分析脚本，快速输出权威的性能指标，加速技术选型决策。\n- **训练性能显著提升**：得益于默认集成的双线性插值权重初始化策略，模型训练时间明显缩短，且在光流预测精度上获得更优表现。\n\nDenseMatching 通过提供从数据构建、模型复现到标准化评估的一站式解决方案，将研发重心从底层基建回归到核心算法创新。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FPruneTruong_DenseMatching_abdcee91.png","PruneTruong","Prune Truong","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002FPruneTruong_fff32fb6.jpg","Research Scientist at Google. 
\r\nFormer PhD Student in Computer Vision Lab of ETH Zurich","ETH Zurich","Zurich",null,"prunetruong","prunetruong.com","https:\u002F\u002Fgithub.com\u002FPruneTruong",[83,87],{"name":84,"color":85,"percentage":86},"Python","#3572A5",99.8,{"name":88,"color":89,"percentage":90},"Shell","#89e051",0.2,750,88,"2026-03-07T12:43:34","LGPL-2.1","未说明","需要 NVIDIA GPU (基于 PyTorch)，具体型号和显存大小未说明",{"notes":98,"python":99,"dependencies":100},"建议使用 conda 创建和管理虚拟环境。该库包含多种密集匹配网络（如 GLU-Net, PDC-Net, WarpC 等）的实现、训练和评估框架。推理运行要求 Torch 版本大于等于 1.0。预训练模型权重需从单独的 Model Zoo 链接获取。","3.7+",[101],"torch>=1.0",[15,14],"2026-03-27T02:49:30.150509","2026-04-08T23:43:39.940034",[106,111,116,121,126,131],{"id":107,"question_zh":108,"answer_zh":109,"source_url":110},25265,"测试自定义图片时出现通道数错误（如 4 通道或灰度图）怎么办？","该问题通常是因为输入图片不是标准的 3 通道 RGB 格式（例如包含 Alpha 通道的 4 通道图片或灰度图）。维护者已更新代码以支持 4 通道和灰度图片的读取。如果遇到此问题，请确保拉取最新代码。另外，注意不同读取库的行为差异：使用 `imageio.imread` 可能会读出 4 通道，而 `cv2.imread` 通常读出 3 通道。","https:\u002F\u002Fgithub.com\u002FPruneTruong\u002FDenseMatching\u002Fissues\u002F7",{"id":112,"question_zh":113,"answer_zh":114,"source_url":115},25266,"为什么置信度图（confidence map）的最大值始终约为 0.573，即使查询图和参考图完全相同？","这是由拉普拉斯混合模型（mixture of Laplacians）的参数化方式决定的。为了平衡精度和鲁棒性，代码中将最小方差约束为 sigma^2_1 = 1。在这种参数设置下（alpha_1=1, sigma^2_1=1, alpha_2=0），理论计算得出的最大置信度即为 1-[1-exp(-sqrt(2))]^2 ≈ 0.573。这并非错误，而是设计如此。如果需要 [0, 1] 范围的置信度，可以手动将结果除以 0.573 进行归一化。","https:\u002F\u002Fgithub.com\u002FPruneTruong\u002FDenseMatching\u002Fissues\u002F6",{"id":117,"question_zh":118,"answer_zh":119,"source_url":120},25267,"训练过程中是否只对有掩码（mask）的区域计算损失？越界（out-of-view）像素参与损失计算吗？","损失计算策略取决于数据集：对于合成数据，损失会在所有区域（包括越界区域）进行计算；对于 MegaDepth 数据集，损失仅在稀疏真值（sparse ground-truth）存在的位置计算。预测的置信度图并不会显式区分不可靠匹配区和越界像素，但在越界区域由于光流容易出错，网络通常会表现出不确定性。","https:\u002F\u002Fgithub.com\u002FPruneTruong\u002FDenseMatching\u002Fissues\u002F10",{"id":122,"question_zh":123,"answer_zh":124,"source_url":125},25268,"如何单独保存 warped image（变形后的图像）或置信度掩码，而不是保存拼接在一起的图像？","可以通过设置命令行参数 `--save_ind_images = True` 来单独保存变形后的图像。如果需要保存置信度图（confidence_map）或置信掩码（confident_mask），代码逻辑与保存 warped image 类似，维护者已修复相关提交以确保这些功能可用。","https:\u002F\u002Fgithub.com\u002FPruneTruong\u002FDenseMatching\u002Fissues\u002F9",{"id":127,"question_zh":128,"answer_zh":129,"source_url":130},25269,"使用 RANSAC-flow 权重对齐图片时出现图像变形或偏移，可能是什么原因？","如果出现图像变形或方向错误的偏移，首先应检查第一阶段单应性矩阵（homography）的计算是否失败。因为网络假设位移较小，如果初始单应性估计失败，网络无法恢复。建议对比原始 RANSAC-Flow 代码的结果，或者尝试直接使用 PDC-Net，它在大多数此类示例中表现良好。此外，需确认是否使用了正确的核大小（kernel size），原 RANSAC-flow 中通常为 7。","https:\u002F\u002Fgithub.com\u002FPruneTruong\u002FDenseMatching\u002Fissues\u002F23",{"id":132,"question_zh":133,"answer_zh":134,"source_url":135},25270,"PDCNet.py 第 442 行的源图像和目标图像变量赋值是否正确？","用户曾质疑该行代码应为 `c_t=c23, c_s=c13`（即源是 image1，目标是 image2）。维护者确认这是一个在调试过程中已发现的泄漏问题，并已在内部仓库修正。该问题也与日志记录中忘记添加 `.item()` 导致梯度在整个 epoch 中累积有关，这会引起内存泄漏。确保使用最新推送的代码可解决此问题。","https:\u002F\u002Fgithub.com\u002FPruneTruong\u002FDenseMatching\u002Fissues\u002F5",[]]