[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-magicleap--SuperGluePretrainedNetwork":3,"tool-magicleap--SuperGluePretrainedNetwork":61},[4,18,26,36,44,53],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":17},4358,"openclaw","openclaw\u002Fopenclaw","OpenClaw 是一款专为个人打造的本地化 AI 助手，旨在让你在自己的设备上拥有完全可控的智能伙伴。它打破了传统 AI 助手局限于特定网页或应用的束缚，能够直接接入你日常使用的各类通讯渠道，包括微信、WhatsApp、Telegram、Discord、iMessage 等数十种平台。无论你在哪个聊天软件中发送消息，OpenClaw 都能即时响应，甚至支持在 macOS、iOS 和 Android 设备上进行语音交互，并提供实时的画布渲染功能供你操控。\n\n这款工具主要解决了用户对数据隐私、响应速度以及“始终在线”体验的需求。通过将 AI 部署在本地，用户无需依赖云端服务即可享受快速、私密的智能辅助，真正实现了“你的数据，你做主”。其独特的技术亮点在于强大的网关架构，将控制平面与核心助手分离，确保跨平台通信的流畅性与扩展性。\n\nOpenClaw 非常适合希望构建个性化工作流的技术爱好者、开发者，以及注重隐私保护且不愿被单一生态绑定的普通用户。只要具备基础的终端操作能力（支持 macOS、Linux 及 Windows WSL2），即可通过简单的命令行引导完成部署。如果你渴望拥有一个懂你",349277,3,"2026-04-06T06:32:30",[13,14,15,16],"Agent","开发框架","图像","数据工具","ready",{"id":19,"name":20,"github_repo":21,"description_zh":22,"stars":23,"difficulty_score":10,"last_commit_at":24,"category_tags":25,"status":17},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,"2026-04-05T11:01:52",[14,15,13],{"id":27,"name":28,"github_repo":29,"description_zh":30,"stars":31,"difficulty_score":32,"last_commit_at":33,"category_tags":34,"status":17},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 
代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",160411,2,"2026-04-18T23:33:24",[14,13,35],"语言模型",{"id":37,"name":38,"github_repo":39,"description_zh":40,"stars":41,"difficulty_score":32,"last_commit_at":42,"category_tags":43,"status":17},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",109154,"2026-04-18T11:18:24",[14,15,13],{"id":45,"name":46,"github_repo":47,"description_zh":48,"stars":49,"difficulty_score":32,"last_commit_at":50,"category_tags":51,"status":17},6121,"gemini-cli","google-gemini\u002Fgemini-cli","gemini-cli 是一款由谷歌推出的开源 AI 命令行工具，它将强大的 Gemini 大模型能力直接集成到用户的终端环境中。对于习惯在命令行工作的开发者而言，它提供了一条从输入提示词到获取模型响应的最短路径，无需切换窗口即可享受智能辅助。\n\n这款工具主要解决了开发过程中频繁上下文切换的痛点，让用户能在熟悉的终端界面内直接完成代码理解、生成、调试以及自动化运维任务。无论是查询大型代码库、根据草图生成应用，还是执行复杂的 Git 操作，gemini-cli 都能通过自然语言指令高效处理。\n\n它特别适合广大软件工程师、DevOps 人员及技术研究人员使用。其核心亮点包括支持高达 100 万 token 的超长上下文窗口，具备出色的逻辑推理能力；内置 Google 搜索、文件操作及 Shell 命令执行等实用工具；更独特的是，它支持 MCP（模型上下文协议），允许用户灵活扩展自定义集成，连接如图像生成等外部能力。此外，个人谷歌账号即可享受免费的额度支持，且项目基于 Apache 2.0 
协议完全开源，是提升终端工作效率的理想助手。",100752,"2026-04-10T01:20:03",[52,13,15,14],"插件",{"id":54,"name":55,"github_repo":56,"description_zh":57,"stars":58,"difficulty_score":32,"last_commit_at":59,"category_tags":60,"status":17},4721,"markitdown","microsoft\u002Fmarkitdown","MarkItDown 是一款由微软 AutoGen 团队打造的轻量级 Python 工具，专为将各类文件高效转换为 Markdown 格式而设计。它支持 PDF、Word、Excel、PPT、图片（含 OCR）、音频（含语音转录）、HTML 乃至 YouTube 链接等多种格式的解析，能够精准提取文档中的标题、列表、表格和链接等关键结构信息。\n\n在人工智能应用日益普及的今天，大语言模型（LLM）虽擅长处理文本，却难以直接读取复杂的二进制办公文档。MarkItDown 恰好解决了这一痛点，它将非结构化或半结构化的文件转化为模型“原生理解”且 Token 效率极高的 Markdown 格式，成为连接本地文件与 AI 分析 pipeline 的理想桥梁。此外，它还提供了 MCP（模型上下文协议）服务器，可无缝集成到 Claude Desktop 等 LLM 应用中。\n\n这款工具特别适合开发者、数据科学家及 AI 研究人员使用，尤其是那些需要构建文档检索增强生成（RAG）系统、进行批量文本分析或希望让 AI 助手直接“阅读”本地文件的用户。虽然生成的内容也具备一定可读性，但其核心优势在于为机器",93400,"2026-04-06T19:52:38",[52,14],{"id":62,"github_repo":63,"name":64,"description_en":65,"description_zh":66,"ai_summary_zh":67,"readme_en":68,"readme_zh":69,"quickstart_zh":70,"use_case_zh":71,"hero_image_url":72,"owner_login":73,"owner_name":74,"owner_avatar_url":75,"owner_bio":76,"owner_company":77,"owner_location":77,"owner_email":77,"owner_twitter":77,"owner_website":78,"owner_url":79,"languages":80,"stars":85,"forks":86,"last_commit_at":87,"license":88,"difficulty_score":32,"env_os":89,"env_gpu":90,"env_ram":89,"env_deps":91,"category_tags":99,"github_topics":100,"view_count":32,"oss_zip_url":77,"oss_zip_packed_at":77,"status":17,"created_at":105,"updated_at":106,"faqs":107,"releases":146},9498,"magicleap\u002FSuperGluePretrainedNetwork","SuperGluePretrainedNetwork","SuperGlue: Learning Feature Matching with Graph Neural Networks (CVPR 2020, Oral)","SuperGluePretrainedNetwork 是一个基于深度学习的图像特征匹配工具，由 Magic Leap 研发并在 CVPR 2020 上发表。它主要解决计算机视觉中“如何在两张不同视角的图片里精准找到对应点”这一核心难题，广泛应用于三维重建、视觉定位和机器人导航等领域。\n\n与传统方法不同，SuperGlue 创新性地结合了图神经网络（GNN）与最优传输层，充当特征提取与最终匹配之间的“中间件”。它能够聚合上下文信息，在端到端的架构中同时完成特征关联、匹配优化和误检过滤，显著提升了在弱纹理、大视角变化或光照差异等复杂场景下的匹配准确率与鲁棒性。项目提供了在室内（ScanNet）和室外（MegaDepth）场景预训练的模型权重，并支持与 SuperPoint 
检测器无缝协作。\n\n该工具非常适合计算机视觉研究人员、算法工程师以及需要高精度视觉定位功能的开发者使用。通过提供的 Python 脚本，用户不仅可以轻松对图像对进行批量匹配评估，还能利用摄像头或视频文件运行实时演示，直观观察匹配效果。凭借其开源的代码结构和成熟的预训练模型，SuperGluePretrainedNetwo","SuperGluePretrainedNetwork 是一个基于深度学习的图像特征匹配工具，由 Magic Leap 研发并在 CVPR 2020 上发表。它主要解决计算机视觉中“如何在两张不同视角的图片里精准找到对应点”这一核心难题，广泛应用于三维重建、视觉定位和机器人导航等领域。\n\n与传统方法不同，SuperGlue 创新性地结合了图神经网络（GNN）与最优传输层，充当特征提取与最终匹配之间的“中间件”。它能够聚合上下文信息，在端到端的架构中同时完成特征关联、匹配优化和误检过滤，显著提升了在弱纹理、大视角变化或光照差异等复杂场景下的匹配准确率与鲁棒性。项目提供了在室内（ScanNet）和室外（MegaDepth）场景预训练的模型权重，并支持与 SuperPoint 检测器无缝协作。\n\n该工具非常适合计算机视觉研究人员、算法工程师以及需要高精度视觉定位功能的开发者使用。通过提供的 Python 脚本，用户不仅可以轻松对图像对进行批量匹配评估，还能利用摄像头或视频文件运行实时演示，直观观察匹配效果。凭借其开源的代码结构和成熟的预训练模型，SuperGluePretrainedNetwork 降低了高性能特征匹配技术的应用门槛，是构建先进视觉系统的有力助手。","\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fmagicleap_SuperGluePretrainedNetwork_readme_c1e277f632a0.png\" width=\"240\">\n\n### Research @ Magic Leap (CVPR 2020, Oral)\n\n# SuperGlue Inference and Evaluation Demo Script\n\n## Introduction\nSuperGlue is a CVPR 2020 research project done at Magic Leap. The SuperGlue network is a Graph Neural Network combined with an Optimal Matching layer that is trained to perform matching on two sets of sparse image features. This repo includes PyTorch code and pretrained weights for running the SuperGlue matching network on top of [SuperPoint](https:\u002F\u002Farxiv.org\u002Fabs\u002F1712.07629) keypoints and descriptors. Given a pair of images, you can use this repo to extract matching features across the image pair.\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fmagicleap_SuperGluePretrainedNetwork_readme_2d0c3c6ce8d7.png\" width=\"500\">\n\u003C\u002Fp>\n\nSuperGlue operates as a \"middle-end,\" performing context aggregation, matching, and filtering in a single end-to-end architecture. 
For more details, please see:\n\n* Full paper PDF: [SuperGlue: Learning Feature Matching with Graph Neural Networks](https:\u002F\u002Farxiv.org\u002Fabs\u002F1911.11763).\n\n* Authors: *Paul-Edouard Sarlin, Daniel DeTone, Tomasz Malisiewicz, Andrew Rabinovich*\n\n* Website: [psarlin.com\u002Fsuperglue](https:\u002F\u002Fpsarlin.com\u002Fsuperglue) for videos, slides, recent updates, and more visualizations.\n\n* `hloc`: a new toolbox for visual localization and SfM with SuperGlue, available at [cvg\u002FHierarchical-Localization](https:\u002F\u002Fgithub.com\u002Fcvg\u002FHierarchical-Localization\u002F). Winner of 3 CVPR 2020 competitions on localization and image matching!\n\nWe provide two pre-trained weights files: an indoor model trained on ScanNet data, and an outdoor model trained on MegaDepth data. Both models are inside the [weights directory](.\u002Fmodels\u002Fweights). By default, the demo will run the **indoor** model.\n\n## Dependencies\n* Python 3 >= 3.5\n* PyTorch >= 1.1\n* OpenCV >= 3.4 (4.1.2.30 recommended for best GUI keyboard interaction, see this [note](#additional-notes))\n* Matplotlib >= 3.1\n* NumPy >= 1.18\n\nSimply run the following command: `pip3 install numpy opencv-python torch matplotlib`\n\n## Contents\nThere are two main top-level scripts in this repo:\n\n1. `demo_superglue.py` : runs a live demo on a webcam, IP camera, image directory or movie file\n2. `match_pairs.py`: reads image pairs from files and dumps matches to disk (also runs evaluation if ground truth relative poses are provided)\n\n## Live Matching Demo Script (`demo_superglue.py`)\nThis demo runs SuperPoint + SuperGlue feature matching on an anchor image and live image. You can update the anchor image by pressing the `n` key. The demo can read image streams from a USB or IP camera, a directory containing images, or a video file. 
You can pass all of these inputs using the `--input` flag.\n\n### Run the demo on a live webcam\n\nRun the demo on the default USB webcam (ID #0), running on a CUDA GPU if one is found:\n\n```sh\n.\u002Fdemo_superglue.py\n```\n\nKeyboard control:\n\n* `n`: select the current frame as the anchor\n* `e`\u002F`r`: increase\u002Fdecrease the keypoint confidence threshold\n* `d`\u002F`f`: increase\u002Fdecrease the match filtering threshold\n* `k`: toggle the visualization of keypoints\n* `q`: quit\n\nRun the demo on 320x240 images running on the CPU:\n\n```sh\n.\u002Fdemo_superglue.py --resize 320 240 --force_cpu\n```\n\nThe `--resize` flag can be used to resize the input image in three ways:\n\n1. `--resize` `width` `height` : will resize to exact `width` x `height` dimensions\n2. `--resize` `max_dimension` : will resize largest input image dimension to `max_dimension`\n3. `--resize` `-1` : will not resize (i.e. use original image dimensions)\n\nThe default will resize images to `640x480`.\n\n### Run the demo on a directory of images\n\nThe `--input` flag also accepts a path to a directory. We provide a directory of sample images from a sequence. 
To run the demo on the directory of images in `freiburg_sequence\u002F` on a headless server (will not display to the screen) and write the output visualization images to `dump_demo_sequence\u002F`:\n\n```sh\n.\u002Fdemo_superglue.py --input assets\u002Ffreiburg_sequence\u002F --output_dir dump_demo_sequence --resize 320 240 --no_display\n```\n\nYou should see this output on the sample Freiburg-TUM RGBD sequence:\n\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fmagicleap_SuperGluePretrainedNetwork_readme_14c20e870b67.gif\" width=\"560\">\n\nThe matches are colored by their predicted confidence in a jet colormap (Red: more confident, Blue: less confident).\n\n### Additional useful command line parameters\n* Use `--image_glob` to change the image file extension (default: `*.png`, `*.jpg`, `*.jpeg`).\n* Use `--skip` to skip intermediate frames (default: `1`).\n* Use `--max_length` to cap the total number of frames processed (default: `1000000`).\n* Use `--show_keypoints` to visualize the detected keypoints (default: `False`).\n\n## Run Matching+Evaluation (`match_pairs.py`)\n\nThis repo also contains a script `match_pairs.py` that runs the matching from a list of image pairs. With this script, you can:\n\n* Run the matcher on a set of image pairs (no ground truth needed)\n* Visualize the keypoints and matches, based on their confidence\n* Evaluate and visualize the match correctness, if the ground truth relative poses and intrinsics are provided\n* Save the keypoints, matches, and evaluation results for further processing\n* Collate evaluation results over many pairs and generate result tables\n\n### Matches only mode\n\nThe simplest usage of this script will process the image pairs listed in a given text file and dump the keypoints and matches to compressed numpy `npz` files. We provide the challenging ScanNet pairs from the main paper in `assets\u002Fexample_indoor_pairs\u002F`. 
Running the following will run SuperPoint + SuperGlue on each image pair, and dump the results to `dump_match_pairs\u002F`:\n\n```sh\n.\u002Fmatch_pairs.py\n```\n\nThe resulting `.npz` files can be read from Python as follows:\n\n```python\n>>> import numpy as np\n>>> path = 'dump_match_pairs\u002Fscene0711_00_frame-001680_scene0711_00_frame-001995_matches.npz'\n>>> npz = np.load(path)\n>>> npz.files\n['keypoints0', 'keypoints1', 'matches', 'match_confidence']\n>>> npz['keypoints0'].shape\n(382, 2)\n>>> npz['keypoints1'].shape\n(391, 2)\n>>> npz['matches'].shape\n(382,)\n>>> np.sum(npz['matches']>-1)\n115\n>>> npz['match_confidence'].shape\n(382,)\n```\n\nFor each keypoint in `keypoints0`, the `matches` array indicates the index of the matching keypoint in `keypoints1`, or `-1` if the keypoint is unmatched.\n\n### Visualization mode\n\nYou can add the flag `--viz` to dump image outputs which visualize the matches:\n\n```sh\n.\u002Fmatch_pairs.py --viz\n```\n\nYou should see images like this inside of `dump_match_pairs\u002F` (or something very close to it, see this [note](#a-note-on-reproducibility)):\n\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fmagicleap_SuperGluePretrainedNetwork_readme_b6777c78f4a2.png\" width=\"560\">\n\nThe matches are colored by their predicted confidence in a jet colormap (Red: more confident, Blue: less confident).\n\n### Evaluation mode\n\nYou can also estimate the pose using RANSAC + Essential Matrix decomposition and evaluate it if the ground truth relative poses and intrinsics are provided in the input `.txt` files. 
Each `.txt` file contains three key ground truth matrices: a 3x3 intrinsics matrix of image0: `K0`, a 3x3 intrinsics matrix of image1: `K1` , and a 4x4 matrix of the relative pose extrinsics `T_0to1`.\n\nTo run the evaluation on the sample set of images (by default reading `assets\u002Fscannet_sample_pairs_with_gt.txt`), you can run:\n\n```sh\n.\u002Fmatch_pairs.py --eval\n```\n\n\nSince you enabled `--eval`, you should see collated results printed to the terminal. For the example images provided, you should get the following numbers (or something very close to it, see this [note](#a-note-on-reproducibility)):\n\n```txt\nEvaluation Results (mean over 15 pairs):\nAUC@5    AUC@10  AUC@20  Prec    MScore\n26.99    48.40   64.47   73.52   19.60\n```\n\nThe resulting `.npz` files in `dump_match_pairs\u002F` will now contain scalar values related to the evaluation, computed on the sample images provided. Here is what you should find in one of the generated evaluation files:\n\n```python\n>>> import numpy as np\n>>> path = 'dump_match_pairs\u002Fscene0711_00_frame-001680_scene0711_00_frame-001995_evaluation.npz'\n>>> npz = np.load(path)\n>>> print(npz.files)\n['error_t', 'error_R', 'precision', 'matching_score', 'num_correct', 'epipolar_errors']\n```\n\nYou can also visualize the evaluation metrics by running the following command:\n\n```sh\n.\u002Fmatch_pairs.py --eval --viz\n```\n\nYou should also now see additional images in `dump_match_pairs\u002F` which visualize the evaluation numbers (or something very close to it, see this [note](#a-note-on-reproducibility)):\n\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fmagicleap_SuperGluePretrainedNetwork_readme_d017bc033714.png\" width=\"560\">\n\nThe top left corner of the image shows the pose error and number of inliers, while the lines are colored by their epipolar error computed with the ground truth relative pose (red: higher error, green: lower error).\n\n### Running on sample outdoor 
pairs\n\n\u003Cdetails>\n  \u003Csummary>[Click to expand]\u003C\u002Fsummary>\n\nIn this repo, we also provide a few challenging Phototourism pairs, so that you can re-create some of the figures from the paper. Run this script to run matching and visualization (no ground truth is provided, see this [note](#reproducing-outdoor-evaluation-on-phototourism)) on the provided pairs:\n\n```sh\n.\u002Fmatch_pairs.py --resize 1600 --superglue outdoor --max_keypoints 2048 --nms_radius 3  --resize_float --input_dir assets\u002Fphototourism_sample_images\u002F --input_pairs assets\u002Fphototourism_sample_pairs.txt --output_dir dump_match_pairs_outdoor --viz\n```\n\nYou should now see image pairs such as these in `dump_match_pairs_outdoor\u002F` (or something very close to it, see this [note](#a-note-on-reproducibility)):\n\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fmagicleap_SuperGluePretrainedNetwork_readme_e90ac634c1e1.png\" width=\"560\">\n\n\u003C\u002Fdetails>\n\n### Recommended settings for indoor \u002F outdoor\n\n\u003Cdetails>\n  \u003Csummary>[Click to expand]\u003C\u002Fsummary>\n\nFor **indoor** images, we recommend the following settings (these are the defaults):\n\n```sh\n.\u002Fmatch_pairs.py --resize 640 --superglue indoor --max_keypoints 1024 --nms_radius 4\n```\n\nFor **outdoor** images, we recommend the following settings:\n\n```sh\n.\u002Fmatch_pairs.py --resize 1600 --superglue outdoor --max_keypoints 2048 --nms_radius 3 --resize_float\n```\n\nYou can provide your own list of pairs `--input_pairs` for images contained in `--input_dir`. Images can be resized before network inference with `--resize`. 
If you are re-running the same evaluation many times, you can use the `--cache` flag to reuse old computation.\n\u003C\u002Fdetails>\n\n### Test set pair file format explained\n\n\u003Cdetails>\n  \u003Csummary>[Click to expand]\u003C\u002Fsummary>\n\nWe provide the list of ScanNet test pairs in `assets\u002Fscannet_test_pairs_with_gt.txt` (with ground truth) and Phototourism test pairs `assets\u002Fphototourism_test_pairs.txt` (without ground truth) used to evaluate the matching from the paper. Each line corresponds to one pair and is structured as follows:\n\n```\npath_image_A path_image_B exif_rotationA exif_rotationB [KA_0 ... KA_8] [KB_0 ... KB_8] [T_AB_0 ... T_AB_15]\n```\n\nThe `path_image_A` and `path_image_B` entries are paths to image A and B, respectively. The `exif_rotation` is an integer in the range [0, 3] that comes from the original EXIF metadata associated with the image, where, 0: no rotation, 1: 90 degree clockwise, 2: 180 degree clockwise, 3: 270 degree clockwise. If the EXIF data is not known, you can just provide a zero here and no rotation will be performed. `KA` and `KB` are the flattened `3x3` matrices of image A and image B intrinsics. `T_AB` is a flattened `4x4` matrix of the extrinsics between the pair.\n\u003C\u002Fdetails>\n\n### Reproducing the indoor evaluation on ScanNet\n\n\u003Cdetails>\n  \u003Csummary>[Click to expand]\u003C\u002Fsummary>\n\nWe provide the groundtruth for ScanNet in our format in the file `assets\u002Fscannet_test_pairs_with_gt.txt` for convenience. In order to reproduce similar tables to what was in the paper, you will need to download the dataset (we do not provide the raw test images). To download the ScanNet dataset, do the following:\n\n1. Head to the [ScanNet](https:\u002F\u002Fgithub.com\u002FScanNet\u002FScanNet) github repo to download the ScanNet test set (100 scenes).\n2. 
You will need to extract the raw sensor data from the 100 `.sens` files in each scene in the test set using the [SensReader](https:\u002F\u002Fgithub.com\u002FScanNet\u002FScanNet\u002Ftree\u002Fmaster\u002FSensReader) tool.\n\nOnce the ScanNet dataset is downloaded in `~\u002Fdata\u002Fscannet`, you can run the following:\n\n```sh\n.\u002Fmatch_pairs.py --input_dir ~\u002Fdata\u002Fscannet --input_pairs assets\u002Fscannet_test_pairs_with_gt.txt --output_dir dump_scannet_test_results --eval\n```\n\nYou should get the following table for ScanNet (or something very close to it, see this [note](#a-note-on-reproducibility)):\n\n```txt\nEvaluation Results (mean over 1500 pairs):\nAUC@5    AUC@10  AUC@20  Prec    MScore\n16.12    33.76   51.79   84.37   31.14\n```\n\n\u003C\u002Fdetails>\n\n### Reproducing the outdoor evaluation on YFCC\n\n\u003Cdetails>\n  \u003Csummary>[Click to expand]\u003C\u002Fsummary>\n\nWe provide the groundtruth for YFCC in our format in the file `assets\u002Fyfcc_test_pairs_with_gt.txt` for convenience. In order to reproduce similar tables to what was in the paper, you will need to download the dataset (we do not provide the raw test images). 
To download the YFCC dataset, you can use the [OANet](https:\u002F\u002Fgithub.com\u002Fzjhthu\u002FOANet) repo:\n\n```sh\ngit clone https:\u002F\u002Fgithub.com\u002Fzjhthu\u002FOANet\ncd OANet\nbash download_data.sh raw_data raw_data_yfcc.tar.gz 0 8\ntar -xvf raw_data_yfcc.tar.gz\nmv raw_data\u002Fyfcc100m ~\u002Fdata\n```\n\nOnce the YFCC dataset is downloaded in `~\u002Fdata\u002Fyfcc100m`, you can run the following:\n\n```sh\n.\u002Fmatch_pairs.py --input_dir ~\u002Fdata\u002Fyfcc100m --input_pairs assets\u002Fyfcc_test_pairs_with_gt.txt --output_dir dump_yfcc_test_results --eval --resize 1600 --superglue outdoor --max_keypoints 2048 --nms_radius 3 --resize_float\n```\n\nYou should get the following table for YFCC (or something very close to it, see this [note](#a-note-on-reproducibility)):\n\n```txt\nEvaluation Results (mean over 4000 pairs):\nAUC@5    AUC@10  AUC@20  Prec    MScore\n39.02    59.51   75.72   98.72   23.61  \n```\n\n\u003C\u002Fdetails>\n\n### Reproducing outdoor evaluation on Phototourism\n\n\u003Cdetails>\n  \u003Csummary>[Click to expand]\u003C\u002Fsummary>\n\nThe Phototourism results shown in the paper were produced using data similar to the test set from the [Image Matching Challenge 2020](https:\u002F\u002Fvision.uvic.ca\u002Fimage-matching-challenge\u002F), which holds the ground truth data private for the test set. We list the pairs we used in `assets\u002Fphototourism_test_pairs.txt`. To reproduce similar numbers on this test set, please submit to the challenge benchmark. While the challenge is still live, we cannot share the test set publicly since we want to help maintain the integrity of the challenge. \n\n\u003C\u002Fdetails>\n\n### Correcting EXIF rotation data in YFCC and Phototourism\n\n\u003Cdetails>\n  \u003Csummary>[Click to expand]\u003C\u002Fsummary>\n\nIn this repo, we provide manually corrected EXIF rotation data for the outdoor evaluations on YFCC and Phototourism. 
For the YFCC dataset we found 7 images with incorrect EXIF rotation flags, resulting in 148 pairs out of 4000 being corrected. For Phototourism, we found 36 images with incorrect EXIF rotation flags, resulting in 212 out of 2200 pairs being corrected.\n\nThe SuperGlue paper reports the results of SuperGlue **without** the corrected rotations, while the numbers in this README are reported **with** the corrected rotations. We found that our final conclusions from the evaluation still hold with or without the corrected rotations. For backwards compatibility, we included the original, uncorrected EXIF rotation data in `assets\u002Fphototourism_test_pairs_original.txt` and `assets\u002Fyfcc_test_pairs_with_gt_original.txt` respectively.\n\n\u003C\u002Fdetails>\n\n### Outdoor training \u002F validation scene splits of MegaDepth\n\n\u003Cdetails>\n  \u003Csummary>[Click to expand]\u003C\u002Fsummary>\n\nFor training and validation of the outdoor model, we used scenes from the [MegaDepth dataset](http:\u002F\u002Fwww.cs.cornell.edu\u002Fprojects\u002Fmegadepth\u002F). We provide the list of scenes used to train the outdoor model in the `assets\u002F` directory:\n\n* Training set: `assets\u002Fmegadepth_train_scenes.txt`\n* Validation set: `assets\u002Fmegadepth_validation_scenes.txt`\n\n\u003C\u002Fdetails>\n\n### A note on reproducibility\n\n\u003Cdetails>\n  \u003Csummary>[Click to expand]\u003C\u002Fsummary>\n\nAfter simplifying the model code and evaluation code and preparing it for release, we made some improvements and tweaks that result in slightly different numbers than what was reported in the paper. The numbers and figures reported in the README were done using Ubuntu 16.04, OpenCV 3.4.5, and PyTorch 1.1.0. 
Even with matching library versions, we observed some slight differences across Mac and Ubuntu, which we believe are due to differences in OpenCV's image resize function implementation and randomization of RANSAC.\n\u003C\u002Fdetails>\n\n### Creating high-quality PDF visualizations and faster visualization with --fast_viz\n\n\u003Cdetails>\n  \u003Csummary>[Click to expand]\u003C\u002Fsummary>\n\nWhen generating output images with `match_pairs.py`, the default `--viz` flag uses a Matplotlib renderer which allows for the generation of camera-ready PDF visualizations if you additionally use `--viz_extension pdf` instead of the default png extension.\n\n```\n.\u002Fmatch_pairs.py --viz --viz_extension pdf\n```\n\nAlternatively, you might want to save visualization images but have the generation be much faster.  You can use the `--fast_viz` flag to use an OpenCV-based image renderer as follows:\n\n```\n.\u002Fmatch_pairs.py --viz --fast_viz\n```\n\nIf you would also like an OpenCV display window to preview the results (you must use non-pdf output and use `--fast_viz`), simply run:\n\n```\n.\u002Fmatch_pairs.py --viz --fast_viz --opencv_display\n```\n\n\u003C\u002Fdetails>\n\n\n## BibTeX Citation\nIf you use any ideas from the paper or code from this repo, please consider citing:\n\n```txt\n@inproceedings{sarlin20superglue,\n  author    = {Paul-Edouard Sarlin and\n               Daniel DeTone and\n               Tomasz Malisiewicz and\n               Andrew Rabinovich},\n  title     = {{SuperGlue}: Learning Feature Matching with Graph Neural Networks},\n  booktitle = {CVPR},\n  year      = {2020},\n  url       = {https:\u002F\u002Farxiv.org\u002Fabs\u002F1911.11763}\n}\n```\n\n## Additional Notes\n* For the demo, we found that the keyboard interaction works well with OpenCV 4.1.2.30, older versions were less responsive and the newest version had a [OpenCV bug on 
Mac](https:\u002F\u002Fstackoverflow.com\u002Fquestions\u002F60032540\u002Fopencv-cv2-imshow-is-not-working-because-of-the-qt)\n* We generally do not recommend to run SuperPoint+SuperGlue below 160x120 resolution (QQVGA) and above 2000x1500\n* We do not intend to release the SuperGlue training code.\n* We do not intend to release the SIFT-based or homography SuperGlue models.\n\n## Legal Disclaimer\nMagic Leap is proud to provide its latest samples, toolkits, and research projects on Github to foster development and gather feedback from the spatial computing community. Use of the resources within this repo is subject to (a) the license(s) included herein, or (b) if no license is included, Magic Leap's [Developer Agreement](https:\u002F\u002Fid.magicleap.com\u002Fterms\u002Fdeveloper), which is available on our [Developer Portal](https:\u002F\u002Fdeveloper.magicleap.com\u002F).\nIf you need more, just ask on the [forums](https:\u002F\u002Fforum.magicleap.com\u002Fhc\u002Fen-us\u002Fcommunity\u002Ftopics)!\nWe're thrilled to be part of a well-meaning, friendly and welcoming community of millions.\n","\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fmagicleap_SuperGluePretrainedNetwork_readme_c1e277f632a0.png\" width=\"240\">\n\n### Magic Leap 的研究（CVPR 2020，口头报告）\n\n# SuperGlue 推理与评估演示脚本\n\n## 简介\nSuperGlue 是 Magic Leap 在 CVPR 2020 上发表的一项研究项目。SuperGlue 网络是一种结合了最优匹配层的图神经网络，经过训练后能够在两组稀疏图像特征之间执行匹配任务。本仓库包含 PyTorch 代码及预训练权重，用于在 [SuperPoint](https:\u002F\u002Farxiv.org\u002Fabs\u002F1712.07629) 关键点和描述子的基础上运行 SuperGlue 匹配网络。给定一对图像，您可以通过本仓库提取这对图像之间的匹配特征。\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fmagicleap_SuperGluePretrainedNetwork_readme_2d0c3c6ce8d7.png\" width=\"500\">\n\u003C\u002Fp>\n\nSuperGlue 作为一个“中间层”，在一个端到端的架构中同时完成上下文聚合、匹配和过滤。更多详情请参阅：\n\n* 论文全文 PDF：[SuperGlue: 使用图神经网络学习特征匹配](https:\u002F\u002Farxiv.org\u002Fabs\u002F1911.11763)。\n\n* 作者：*Paul-Edouard Sarlin、Daniel DeTone、Tomasz 
Malisiewicz、Andrew Rabinovich*\n\n* 官网：[psarlin.com\u002Fsuperglue](https:\u002F\u002Fpsarlin.com\u002Fsuperglue)，提供视频、幻灯片、最新更新以及更多可视化内容。\n\n* `hloc`：一个基于 SuperGlue 的视觉定位与 SfM 新工具箱，可在 [cvg\u002FHierarchical-Localization](https:\u002F\u002Fgithub.com\u002Fcvg\u002FHierarchical-Localization\u002F) 获取。该工具箱曾荣获 CVPR 2020 中关于定位和图像匹配的三项竞赛冠军！\n\n我们提供了两个预训练权重文件：一个是在 ScanNet 数据上训练的室内模型，另一个是在 MegaDepth 数据上训练的室外模型。这两个模型都位于 [weights 目录](.\u002Fmodels\u002Fweights) 中。默认情况下，演示将运行 **室内** 模型。\n\n## 依赖项\n* Python 3 >= 3.5\n* PyTorch >= 1.1\n* OpenCV >= 3.4（推荐使用 4.1.2.30 以获得最佳 GUI 键盘交互体验，详见此 [注释](#additional-notes)）\n* Matplotlib >= 3.1\n* NumPy >= 1.18\n\n只需运行以下命令：`pip3 install numpy opencv-python torch matplotlib`\n\n## 内容\n本仓库包含两个主要的顶层脚本：\n\n1. `demo_superglue.py`：可在网络摄像头、IP 摄像头、图像目录或影片文件上运行实时演示。\n2. `match_pairs.py`：从文件中读取图像对，并将匹配结果转储到磁盘上（如果提供了真值相对位姿，则还会进行评估）。\n\n## 实时匹配演示脚本 (`demo_superglue.py`)\n该演示会在锚定图像和实时图像上运行 SuperPoint + SuperGlue 特征匹配。您可以通过按下 `n` 键来更新锚定图像。演示可以从 USB 或 IP 摄像头、包含图像的目录或视频文件中读取图像流。所有这些输入都可以通过 `--input` 标志传递。\n\n### 在实时网络摄像头上运行演示\n\n在默认的 USB 网络摄像头上运行演示（ID #0），如果有 CUDA GPU 则会使用 GPU 运行：\n\n```sh\n.\u002Fdemo_superglue.py\n```\n\n键盘控制：\n\n* `n`：将当前帧设为锚定帧。\n* `e`\u002F`r`：分别提高\u002F降低关键点置信度阈值。\n* `d`\u002F`f`：分别提高\u002F降低匹配过滤阈值。\n* `k`：切换关键点的可视化显示。\n* `q`：退出。\n\n在 CPU 上以 320x240 分辨率运行演示：\n\n```sh\n.\u002Fdemo_superglue.py --resize 320 240 --force_cpu\n```\n\n`--resize` 标志可用于三种方式调整输入图像大小：\n\n1. `--resize` `width` `height`：将图像调整为精确的 `width` x `height` 尺寸。\n2. `--resize` `max_dimension`：将输入图像的最大边调整为 `max_dimension`。\n3. 
`--resize` `-1`：不进行缩放（即使用原始图像尺寸）。\n\n默认情况下，图像会被调整为 `640x480`。\n\n### 在图像目录上运行演示\n\n`--input` 标志也可以接受目录路径。我们提供了一个包含序列样本图像的目录。要在无显示器的服务器上对 `freiburg_sequence\u002F` 目录中的图像运行演示，并将输出可视化图像保存到 `dump_demo_sequence\u002F`：\n\n```sh\n.\u002Fdemo_superglue.py --input assets\u002Ffreiburg_sequence\u002F --output_dir dump_demo_sequence --resize 320 240 --no_display\n```\n\n您应该会在 Freiburg-TUM RGBD 示例序列中看到如下输出：\n\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fmagicleap_SuperGluePretrainedNetwork_readme_14c20e870b67.gif\" width=\"560\">\n\n匹配结果按其预测置信度用喷射色图着色（红色表示置信度高，蓝色表示置信度低）。\n\n### 其他有用的命令行参数\n* 使用 `--image_glob` 可更改图像文件扩展名（默认：`*.png`、`*.jpg`、`*.jpeg`）。\n* 使用 `--skip` 可跳过中间帧（默认：1）。\n* 使用 `--max_length` 可限制处理的总帧数（默认：1000000）。\n* 使用 `--show_keypoints` 可可视化检测到的关键点（默认：False）。\n\n## 运行匹配与评估 (`match_pairs.py`)\n本仓库还包含一个脚本 `match_pairs.py`，用于从一组图像对中运行匹配。借助此脚本，您可以：\n\n* 对一组图像对运行匹配器（无需真值）。\n* 根据置信度可视化关键点和匹配结果。\n* 如果提供了真值相对位姿和内参，则可以评估并可视化匹配的正确性。\n* 将关键点、匹配结果和评估结果保存下来以便进一步处理。\n* 整合多对图像的评估结果并生成结果表格。\n\n### 仅匹配模式\n该脚本最简单的用法是处理给定文本文件中列出的图像对，并将关键点和匹配结果转储为压缩的 numpy `npz` 文件。我们在 `assets\u002Fexample_indoor_pairs\u002F` 中提供了主论文中具有挑战性的 ScanNet 图像对。运行以下命令将在每对图像上运行 SuperPoint + SuperGlue，并将结果转储到 `dump_match_pairs\u002F`：\n\n```sh\n.\u002Fmatch_pairs.py\n```\n\n生成的 `.npz` 文件可以用 Python 读取如下：\n\n```python\n>>> import numpy as np\n>>> path = 'dump_match_pairs\u002Fscene0711_00_frame-001680_scene0711_00_frame-001995_matches.npz'\n>>> npz = np.load(path)\n>>> npz.files\n['keypoints0', 'keypoints1', 'matches', 'match_confidence']\n>>> npz['keypoints0'].shape\n(382, 2)\n>>> npz['keypoints1'].shape\n(391, 2)\n>>> npz['matches'].shape\n(382,)\n>>> np.sum(npz['matches']>-1)\n115\n>>> npz['match_confidence'].shape\n(382,)\n```\n\n对于 `keypoints0` 中的每个关键点，`matches` 数组会指示其在 `keypoints1` 中的匹配关键点索引，若未匹配则为 `-1`。\n\n### 可视化模式\n您可以添加 `--viz` 标志来转储包含匹配结果的图像输出：\n\n```sh\n.\u002Fmatch_pairs.py --viz\n```\n\n您应该会在 `dump_match_pairs\u002F` 目录中看到类似这样的图像（或非常接近的版本，详见此 
[注释](#a-note-on-reproducibility))：\n\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fmagicleap_SuperGluePretrainedNetwork_readme_b6777c78f4a2.png\" width=\"560\">\n\n匹配结果按其预测置信度用喷射色图着色（红色表示置信度高，蓝色表示置信度低）。\n\n### 评估模式\n\n你还可以使用 RANSAC + 基本矩阵分解来估计位姿，并在输入的 `.txt` 文件中提供了真值相对位姿和内参的情况下对其进行评估。每个 `.txt` 文件包含三个关键的真值矩阵：图像0的 3×3 内参矩阵 `K0`、图像1的 3×3 内参矩阵 `K1`，以及相对位姿外参的 4×4 矩阵 `T_0to1`。\n\n要在示例图像集上运行评估（默认读取 `assets\u002Fscannet_sample_pairs_with_gt.txt`），可以运行以下命令：\n\n```sh\n.\u002Fmatch_pairs.py --eval\n```\n\n由于启用了 `--eval` 参数，你应该会在终端看到汇总的结果打印出来。对于提供的示例图像，你应该会得到以下数值（或非常接近这些数值，详见 [注释](#a-note-on-reproducibility)）：\n\n```txt\n评估结果（15对的平均值）：\nAUC@5    AUC@10  AUC@20  精度    MScore\n26.99    48.40   64.47   73.52   19.60\n```\n\n此时，`dump_match_pairs\u002F` 中生成的 `.npz` 文件将包含与评估相关的标量值，这些值是基于提供的示例图像计算得出的。以下是其中一个生成的评估文件中应包含的内容：\n\n```python\n>>> import numpy as np\n>>> path = 'dump_match_pairs\u002Fscene0711_00_frame-001680_scene0711_00_frame-001995_evaluation.npz'\n>>> npz = np.load(path)\n>>> print(npz.files)\n['error_t', 'error_R', 'precision', 'matching_score', 'num_correct', 'epipolar_errors']\n```\n\n你还可以通过运行以下命令来可视化评估指标：\n\n```sh\n.\u002Fmatch_pairs.py --eval --viz\n```\n\n现在，你也应该会在 `dump_match_pairs\u002F` 中看到额外的图像，这些图像展示了评估数值（或非常接近这些数值，详见 [注释](#a-note-on-reproducibility)）：\n\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fmagicleap_SuperGluePretrainedNetwork_readme_d017bc033714.png\" width=\"560\">\n\n图像的左上角显示了位姿误差和内点数量，而线条则根据其与真值相对位姿计算出的极线误差进行着色：红色表示误差较大，绿色表示误差较小。\n\n### 在示例室外图像对上运行\n\n\u003Cdetails>\n  \u003Csummary>[点击展开]\u003C\u002Fsummary>\n\n在这个仓库中，我们还提供了一些具有挑战性的 Phototourism 图像对，以便你可以重现论文中的部分图表。运行以下脚本以对提供的图像对执行匹配和可视化操作（未提供真值，详见 [注释](#reproducing-outdoor-evaluation-final-table)）：\n\n```sh\n.\u002Fmatch_pairs.py --resize 1600 --superglue outdoor --max_keypoints 2048 --nms_radius 3  --resize_float --input_dir assets\u002Fphototourism_sample_images\u002F --input_pairs assets\u002Fphototourism_sample_pairs.txt --output_dir 
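dump_match_pairs_outdoor --viz\n```\n\nYou should now see image pairs such as these in `dump_match_pairs_outdoor\u002F` (or something very close to them, see this [note](#a-note-on-reproducibility)):\n\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fmagicleap_SuperGluePretrainedNetwork_readme_e90ac634c1e1.png\" width=\"560\">\n\n\u003C\u002Fdetails>\n\n### Recommended settings for indoor \u002F outdoor\n\n\u003Cdetails>\n  \u003Csummary>[Click to expand]\u003C\u002Fsummary>\n\nFor **indoor** images, we recommend the following settings (these are the defaults):\n\n```sh\n.\u002Fmatch_pairs.py --resize 640 --superglue indoor --max_keypoints 1024 --nms_radius 4\n```\n\nFor **outdoor** images, we recommend the following settings:\n\n```sh\n.\u002Fmatch_pairs.py --resize 1600 --superglue outdoor --max_keypoints 2048 --nms_radius 3 --resize_float\n```\n\nYou can provide your own list of pairs with `--input_pairs` for images contained in `--input_dir`. Images can be resized before network inference with `--resize`. If you re-run the same evaluation many times, you can use the `--cache` flag to reuse previous computation.\n\u003C\u002Fdetails>\n\n### Test set pair file format explained\n\n\u003Cdetails>\n  \u003Csummary>[Click to expand]\u003C\u002Fsummary>\n\nWe provide the list of ScanNet test pairs `assets\u002Fscannet_test_pairs_with_gt.txt` (with ground truth) and the list of Phototourism test pairs `assets\u002Fphototourism_test_pairs.txt` (without ground truth) used to evaluate the matching in the paper. Each line corresponds to one pair and is structured as follows:\n\n```\npath_image_A path_image_B exif_rotationA exif_rotationB [KA_0 ... KA_8] [KB_0 ... KB_8] [T_AB_0 ... T_AB_15]\n```\n\nThe `path_image_A` and `path_image_B` entries are the paths to image A and image B, respectively. `exif_rotation` is an integer in the range [0, 3] that comes from the EXIF metadata of the original image: 0 means no rotation, 1 means 90 degrees clockwise, 2 means 180 degrees clockwise, and 3 means 270 degrees clockwise. If the EXIF data is not known, you can provide 0 here and no rotation will be performed. `KA` and `KB` are the flattened 3x3 intrinsics matrices of image A and image B. `T_AB` is the flattened 4x4 matrix of the extrinsics between the pair.\n\u003C\u002Fdetails>\n\n### Reproducing the indoor evaluation on ScanNet\n\n\u003Cdetails>\n  \u003Csummary>[Click to expand]\u003C\u002Fsummary>\n\nFor convenience, we provide the ScanNet ground truth in our format in the file `assets\u002Fscannet_test_pairs_with_gt.txt`. To reproduce tables similar to those in the paper, you need to download the dataset (we do not provide the raw test images). To download the ScanNet dataset, do the following:\n\n1. Head to the [ScanNet](https:\u002F\u002Fgithub.com\u002FScanNet\u002FScanNet) GitHub repo and download the ScanNet test set (100 scenes).\n2. 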
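Use the [SensReader](https:\u002F\u002Fgithub.com\u002FScanNet\u002FScanNet\u002Ftree\u002Fmaster\u002FSensReader) tool to extract the raw sensor data from the 100 `.sens` files of the test set scenes.\n\nOnce the ScanNet dataset is downloaded to `~\u002Fdata\u002Fscannet`, you can run the following:\n\n```sh\n.\u002Fmatch_pairs.py --input_dir ~\u002Fdata\u002Fscannet --input_pairs assets\u002Fscannet_test_pairs_with_gt.txt --output_dir dump_scannet_test_results --eval\n```\n\nYou should get the following table for ScanNet (or something very close to it, see this [note](#a-note-on-reproducibility)):\n\n```txt\nEvaluation Results (mean over 1500 pairs):\nAUC@5    AUC@10  AUC@20  Prec    MScore\n16.12    33.76   51.79   84.37   31.14\n```\n\n\u003C\u002Fdetails>\n\n### Reproducing the outdoor evaluation on YFCC\n\n\u003Cdetails>\n  \u003Csummary>[Click to expand]\u003C\u002Fsummary>\n\nFor convenience, we provide the YFCC ground truth in our format in the file `assets\u002Fyfcc_test_pairs_with_gt.txt`. To reproduce tables similar to those in the paper, you need to download the dataset (we do not provide the raw test images). To download the YFCC dataset, you can use the [OANet](https:\u002F\u002Fgithub.com\u002Fzjhthu\u002FOANet) repo:\n\n```sh\ngit clone https:\u002F\u002Fgithub.com\u002Fzjhthu\u002FOANet\ncd OANet\nbash download_data.sh raw_data raw_data_yfcc.tar.gz 0 8\ntar -xvf raw_data_yfcc.tar.gz\nmv raw_data\u002Fyfcc100m ~\u002Fdata\n```\n\nOnce the YFCC dataset is downloaded to `~\u002Fdata\u002Fyfcc100m`, you can run the following:\n\n```sh\n.\u002Fmatch_pairs.py --input_dir ~\u002Fdata\u002Fyfcc100m --input_pairs assets\u002Fyfcc_test_pairs_with_gt.txt --output_dir dump_yfcc_test_results --eval --resize 1600 --superglue outdoor --max_keypoints 2048 --nms_radius 3 --resize_float\n```\n\nYou should get the following table for YFCC (or something very close to it, see this [note](#a-note-on-reproducibility)):\n\n```txt\nEvaluation Results (mean over 4000 pairs):\nAUC@5    AUC@10  AUC@20  Prec    MScore\n39.02    59.51   75.72   98.72   23.61  \n```\n\n\u003C\u002Fdetails>\n\n### Reproducing the outdoor evaluation on Phototourism\n\n\u003Cdetails>\n  \u003Csummary>[Click to expand]\u003C\u002Fsummary>\n\nThe Phototourism results shown in the paper were produced using data similar to the test set of the [Image Matching Challenge 2020](https:\u002F\u002Fvision.uvic.ca\u002Fimage-matching-challenge\u002F), which keeps the ground truth of its test set private. The list of pairs we used is given in `assets\u002Fphototourism_test_pairs.txt`. To reproduce similar numbers on this test set, please submit to the challenge benchmark. While the challenge is still live, we cannot share the test set publicly, since we want to help maintain the integrity of the challenge.\n\n\u003C\u002Fdetails>\n\n### Correcting EXIF rotation data in YFCC and Phototourism\n\n\u003Cdetails>\n  \u003Csummary>[Click to expand]\u003C\u002Fsummary>\n\nIn this repo, we manually corrected the EXIF rotation data of the images used for the YFCC and Phototourism outdoor evaluations. For the YFCC dataset, we found 7 images with incorrect EXIF rotation flags, resulting in 148 out of 4000 pairs being corrected. For the Phototourism dataset, we found 36 images with incorrect EXIF rotation flags, resulting in 212 out of 2200 pairs being corrected.\n\nThe SuperGlue paper reports results without the corrected rotations, while the numbers in this README are reported with the corrected rotations. We found that the final conclusions of the evaluation hold with or without the corrected rotations. For backwards compatibility, we keep the original, uncorrected EXIF rotation data in `assets\u002Fphototourism_test_pairs_original.txt` and `assets\u002Fyfcc_test_pairs_with_gt_original.txt`, respectively.\n\n\u003C\u002Fdetails>\n\n### Outdoor training \u002F validation scene split of MegaDepth\n\n\u003Cdetails>\n  \u003Csummary>[Click to expand]\u003C\u002Fsummary>\n\nFor training and validation of the outdoor model, we used scenes from the [MegaDepth dataset](http:\u002F\u002Fwww.cs.cornell.edu\u002Fprojects\u002Fmegadepth\u002F). We provide the lists of scenes used to train the outdoor model in the `assets\u002F` directory:\n\n* Training set: `assets\u002Fmegadepth_train_scenes.txt`\n* Validation set: `assets\u002Fmegadepth_validation_scenes.txt`\n\n\u003C\u002Fdetails>\n\n### A note on reproducibility\n\n\u003Cdetails>\n  \u003Csummary>[Click to expand]\u003C\u002Fsummary>\n\nAfter simplifying the model code and evaluation code and preparing them for release, we made some improvements and tweaks that result in slightly different numbers than those reported in the paper. The numbers and figures reported in the README were produced with Ubuntu 16.04, OpenCV 3.4.5, and PyTorch 1.1.0. Even with matching library versions, we observed slight differences between Mac and Ubuntu, which we believe are due to differences in the implementation of OpenCV's image resize function and to the randomization of RANSAC.\n\u003C\u002Fdetails>\n\n### Creating high-quality PDF visualizations and faster visualization with --fast_viz\n\n\u003Cdetails>\n  \u003Csummary>[Click to expand]\u003C\u002Fsummary>\n\nWhen generating output images with `match_pairs.py`, the default `--viz` flag uses a Matplotlib renderer, which can produce camera-ready PDF visualizations if you additionally pass `--viz_extension pdf` instead of the default png extension.\n\n```\n.\u002Fmatch_pairs.py --viz --viz_extension pdf\n```\n\nAlternatively, you might want to save visualization images but generate them faster. In that case, use the `--fast_viz` flag to switch to an OpenCV-based image renderer:\n\n```\n.\u002Fmatch_pairs.py --viz --fast_viz\n```\n\nIf you would also like an OpenCV display window to preview the results (this requires non-PDF output and `--fast_viz`), simply run:\n\n```\n.\u002Fmatch_pairs.py --viz --fast_viz --opencv_display\n```\n\n\u003C\u002Fdetails>\n\n\n## BibTeX 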
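Citation\nIf you use any ideas from the paper or code from this repo, please consider citing:\n\n```txt\n@inproceedings{sarlin20superglue,\n  author    = {Paul-Edouard Sarlin and\n               Daniel DeTone and\n               Tomasz Malisiewicz and\n               Andrew Rabinovich},\n  title     = {{SuperGlue}: Learning Feature Matching with Graph Neural Networks},\n  booktitle = {CVPR},\n  year      = {2020},\n  url       = {https:\u002F\u002Farxiv.org\u002Fabs\u002F1911.11763}\n}\n```\n\n## Additional Notes\n* For the demo, we found that keyboard interaction works well with OpenCV 4.1.2.30. Older versions were less responsive, and the newest version had an [OpenCV bug on Mac](https:\u002F\u002Fstackoverflow.com\u002Fquestions\u002F60032540\u002Fopencv-cv2-imshow-is-not-working-because-of-the-qt).\n* We generally do not recommend running SuperPoint+SuperGlue at input resolutions below 160x120 (QQVGA) or above 2000x1500.\n* We do not intend to release the SuperGlue training code.\n* We do not intend to release the SIFT-based or homography SuperGlue models.\n\n## Legal Disclaimer\nMagic Leap is proud to provide its latest samples, toolkits, and research projects on GitHub to foster development and gather feedback from the spatial computing community. Use of the resources within this repo is subject to (a) the license(s) included herein, or (b) if no license is included, Magic Leap's [Developer Agreement](https:\u002F\u002Fid.magicleap.com\u002Fterms\u002Fdeveloper), which is available on our [Developer Portal](https:\u002F\u002Fdeveloper.magicleap.com\u002F).\nIf you need more help, just ask on the [forums](https:\u002F\u002Fforum.magicleap.com\u002Fhc\u002Fen-us\u002Fcommunity\u002Ftopics)! We're thrilled to be part of a well-meaning, friendly, and welcoming community of millions.","# SuperGluePretrainedNetwork Quick Start Guide\n\nSuperGlue is a feature matching tool based on graph neural networks (GNNs). It is commonly combined with SuperPoint to perform high-accuracy sparse feature matching between two images. This project provides pretrained models (indoor \u002F outdoor) and inference scripts.\n\n## Prerequisites\n\nBefore you start, make sure your system meets the following requirements:\n\n*   **Operating system**: Linux, macOS, or Windows\n*   **Python**: version >= 3.5 (3.8+ recommended)\n*   **Hardware**: an NVIDIA GPU (CUDA) is recommended for best performance; the CPU also works but is slower.\n\n### Dependencies\nThe main dependencies are the following Python libraries:\n*   PyTorch >= 1.1\n*   OpenCV >= 3.4 (4.1.2.30 recommended for a better GUI interaction experience)\n*   Matplotlib >= 3.1\n*   NumPy >= 1.18\n\n## Installation\n\n### 1. Clone the project\nFirst clone the code repository from GitHub:\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002Fmagicleap\u002FSuperGluePretrainedNetwork.git\ncd SuperGluePretrainedNetwork\n```\n\n### 2. 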
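Install the Python dependencies\nYou can install the required dependencies directly with pip. To speed up downloads, users in mainland China may prefer the Tsinghua or Aliyun mirror:\n\n**Using the default index:**\n```bash\npip3 install numpy opencv-python torch matplotlib\n```\n\n**Using a mirror in mainland China (recommended there):**\n```bash\npip3 install -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple numpy opencv-python torch matplotlib\n```\n\n> **Note**: If your environment needs a specific PyTorch build (for example, one with CUDA support), see the [PyTorch website](https:\u002F\u002Fpytorch.org\u002F) for the matching install command.\n\n## Basic usage\n\nThis project provides two core scripts: `demo_superglue.py` (live demo) and `match_pairs.py` (batch matching and evaluation). The two most common ways to get started are shown below.\n\n### Scenario 1: Live webcam demo\nThis script opens the default USB webcam, extracts features, and matches them in real time. Press `n` to set the current frame as the anchor image.\n\n**Command:**\n```bash\n.\u002Fdemo_superglue.py\n```\n\n**Common parameters:**\n*   `--force_cpu`: force the script to run on the CPU (if no GPU is available).\n*   `--resize 320 240`: change the input image resolution to speed up processing.\n*   `--input \u003Cpath>`: specify the input source (a camera ID, a path to an image directory, or a video file).\n\n**Keyboard controls:**\n*   `n`: select the current frame as the anchor\n*   `e` \u002F `r`: increase \u002F decrease the keypoint confidence threshold\n*   `d` \u002F `f`: increase \u002F decrease the match filtering threshold\n*   `k`: toggle keypoint visualization\n*   `q`: quit\n\n### Scenario 2: Batch matching of image pairs\nThis script processes static image pairs, extracts the matches, and saves the results. By default it uses the provided indoor sample data.\n\n**Command (matching only):**\n```bash\n.\u002Fmatch_pairs.py\n```\nAfter it runs, the results (keypoints and match indices) are saved as `.npz` files in the `dump_match_pairs\u002F` directory.\n\n**Command (matching + visualization):**\nTo also generate images that visualize the match lines:\n```bash\n.\u002Fmatch_pairs.py --viz\n```\n\n**Command (matching + evaluation):**\nIf the input file contains ground truth poses, add `--eval` to evaluate accuracy:\n```bash\n.\u002Fmatch_pairs.py --eval\n```\n\n### Recommended settings per scene type\nChoosing the pretrained model and parameters that suit the image content gives better results:\n\n**Indoor (default):**\n```bash\n.\u002Fmatch_pairs.py --resize 640 --superglue indoor --max_keypoints 1024 --nms_radius 4\n```\n\n**Outdoor:**\n```bash\n.\u002Fmatch_pairs.py --resize 1600 --superglue outdoor --max_keypoints 2048 --nms_radius 3 --resize_float\n```\n\n> **Tip**: The color of each match line encodes the predicted confidence (red: high confidence, blue: low confidence).","A robotics team is developing an autonomous navigation system for a large warehouse. The system must compute the robot's position from camera images in real time and build a map of the environment.\n\n### Without SuperGluePretrainedNetwork\n- **Unstable feature matching**: With the repetitive textures of warehouse shelving and strong lighting changes, traditional pipelines (such as SIFT+RANSAC) readily produce many wrong matches, which causes localization drift.\n- **Failure in weakly textured regions**: On smooth floors or plain walls that lack distinct corners, the system often cannot extract enough features, so the robot gets “lost” or stops.\n- 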
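**Cumbersome post-processing**: Developers have to hand-write complex geometric verification code and heuristic rules to filter out wrong matches, which is costly to debug and hard to adapt to different scenes.\n- **Poor real-time performance**: Preserving accuracy requires more computation, which lowers the frame rate and cannot meet the low-latency control needs of a fast-moving robot.\n\n### With SuperGluePretrainedNetwork\n- **Accurate context-aware matching**: SuperGluePretrainedNetwork uses a graph neural network to aggregate global context, so it can tell similar features apart even under repetitive textures or changing lighting, which greatly reduces the rate of wrong matches.\n- **Robust sparse features**: Combined with the SuperPoint detector, it keeps feature extraction and matching stable in weakly textured regions, so the robot can localize continuously even in empty corridors.\n- **End-to-end automatic filtering**: The built-in optimal matching layer directly outputs high-confidence matches, so no complex post-processing logic is needed and the development pipeline is much simpler.\n- **Efficient real-time inference**: The pretrained models support GPU acceleration and reach millisecond-level inference while keeping accuracy high, which supports real-time closed-loop control of the robot.\n\nBy bringing deep learning into the feature matching stage, SuperGluePretrainedNetwork addresses the robustness and real-time challenges of visual localization in complex industrial scenes.","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fmagicleap_SuperGluePretrainedNetwork_14c20e87.gif","magicleap","Magic Leap","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002Fmagicleap_7bde963b.png","",null,"http:\u002F\u002Fwww.magicleap.com","https:\u002F\u002Fgithub.com\u002Fmagicleap",[81],{"name":82,"color":83,"percentage":84},"Python","#3572A5",100,3991,763,"2026-04-18T15:30:21","NOASSERTION","Not specified","Not required. CUDA GPUs are supported (and used automatically if detected); the code can also run on the CPU with the --force_cpu flag. No specific GPU model, memory size, or CUDA version is specified.",{"notes":92,"python":93,"dependencies":94},"The tool ships two pretrained weight files (an indoor model and an outdoor model) and runs the indoor model by default. The demo script accepts a webcam, an IP camera, an image directory, or a video file as input. When running on a headless server, add the --no_display flag. OpenCV 4.1.2.30 is recommended for the best GUI keyboard interaction.",">=3.5",[95,96,97,98],"PyTorch>=1.1","OpenCV>=3.4 (4.1.2.30 recommended)","Matplotlib>=3.1","NumPy>=1.18",[14],[101,102,103,104],"deep-learning","feature-matching","pose-estimation","graph-neural-networks","2026-03-27T02:49:30.150509","2026-04-19T15:38:03.559628",[108,113,117,122,127,131,136,141],{"id":109,"question_zh":110,"answer_zh":111,"source_url":112},42605,"In the evaluation tables (e.g. Tables 2, 3, and 6), was standard SIFT or RootSIFT used?","All evaluations used RootSIFT rather than standard SIFT. If you are trying to reproduce the paper's results, make sure you use RootSIFT descriptors.","https:\u002F\u002Fgithub.com\u002Fmagicleap\u002FSuperGluePretrainedNetwork\u002Fissues\u002F27",{"id":114,"question_zh":115,"answer_zh":116,"source_url":112},42606,"What is the data format of the PhotoTourism test pairs file (phototourism_test_pairs.txt)? What do the numbers on each line mean?","The file format used to contain an error, which was fixed in Issue #30. Download and use the corrected test pairs file. Before the fix, the first two strings on each line were the image names and the last two numbers usually carried other metadata (such as overlap or an index); for their exact meaning, refer to the updated file or to the parsing logic in the code.",{"id":118,"question_zh":119,"answer_zh":120,"source_url":121},42607,"What is the coordinate normalization formula for descriptor interpolation in the SuperPoint model, and how should align_corners be set?","The correct coordinate normalization formula is `i' = (i - 3.5) \u002F (8 * (w-1)) * 2 - 1` (assuming a stride of 8). When interpolating with `grid_sample`, `align_corners` must still be set to `True` to get correct results, even after this scaling. Also pay attention to the conversion of coordinates from (h, w) to (x, y).","https:\u002F\u002Fgithub.com\u002Fmagicleap\u002FSuperGluePretrainedNetwork\u002Fissues\u002F18",{"id":123,"question_zh":124,"answer_zh":125,"source_url":126},42608,"How are the ground truth matches generated for training on MegaDepth, and what is the formula for the relative error?","The steps are: 1. interpolate the depth at each keypoint and lift it to 3D; 2. project it into the other image using the relative pose; 3. interpolate the depth at the projected location and check for consistency. The relative error is computed as `relative_error = abs(d_3D - d_proj) \u002F d_3D`. A keypoint is marked as occluded if the relative error exceeds 10%. Two keypoints are considered matchable if neither is occluded and their projection distance is below a threshold. Data augmentation consists only of random cropping and, for images with fewer than 1024 keypoints, randomly added keypoints.","https:\u002F\u002Fgithub.com\u002Fmagicleap\u002FSuperGluePretrainedNetwork\u002Fissues\u002F52",{"id":128,"question_zh":129,"answer_zh":130,"source_url":126},42609,"Were the visual localization results on the Aachen dataset obtained with the model trained on MegaDepth or with a separately trained model?","The Aachen results in the paper are generally evaluated with the model pretrained on MegaDepth. The localization procedure is similar to that of the Local Feature Challenge 2019. If your reproduced results are slightly lower than the values reported at the workshop, the gap may come from differences in the implementation details of the localization pipeline.",{"id":132,"question_zh":133,"answer_zh":134,"source_url":135},42610,"How do I handle mismatches when using the SuperGlue model from C++ (LibTorch)?","In the Python code, the matches are read directly from `matches0`. When porting to C++ (LibTorch), make sure the `.pth` model file has been correctly converted to the `.pt` (TorchScript) format. If mismatches appear, check that the input preprocessing (such as keypoint coordinate normalization) is exactly the same as in the Python version, and confirm that the index extraction from the output tensor `matches0` is correct (i.e. `indices0` corresponds to the matched points).","https:\u002F\u002Fgithub.com\u002Fmagicleap\u002FSuperGluePretrainedNetwork\u002Fissues\u002F23",{"id":137,"question_zh":138,"answer_zh":139,"source_url":140},42611,"What should I do when the loss becomes 'NAN' during training? Do I need to add BN or ReLU to the last layer?","A 'NAN' loss may be caused by noisy ground truth supervision. Monitor the magnitude of the gradients in the network and the magnitude of the score matrix. Although the user reported that training without BN or ReLU 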
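is error-prone, the maintainer mainly recommends checking for exploding gradients or numerical instability. If the problem persists, try adjusting the learning rate or checking that the data preprocessing is correct.","https:\u002F\u002Fgithub.com\u002Fmagicleap\u002FSuperGluePretrainedNetwork\u002Fissues\u002F56",{"id":142,"question_zh":143,"answer_zh":144,"source_url":145},42612,"When reproducing the ScanNet indoor pose estimation experiment, how can I efficiently select image pairs with a given overlap score (e.g. 0.4-0.8)?","An exhaustive search that computes the overlap score for every image pair is indeed very time-consuming. The maintainer did not give concrete pruning heuristics in the comments, but a common approach is to pre-filter coarsely using the camera trajectory or the scene structure, or to estimate the overlap quickly on downsampled images, and then compute it precisely only for the candidate pairs. Different implementations lead to very different proportions of hard samples (small overlap scores), which affects training.","https:\u002F\u002Fgithub.com\u002Fmagicleap\u002FSuperGluePretrainedNetwork\u002Fissues\u002F31",[]]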