[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-nianticlabs--acezero":3,"tool-nianticlabs--acezero":61},[4,18,26,36,44,53],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":17},4358,"openclaw","openclaw\u002Fopenclaw","OpenClaw 是一款专为个人打造的本地化 AI 助手，旨在让你在自己的设备上拥有完全可控的智能伙伴。它打破了传统 AI 助手局限于特定网页或应用的束缚，能够直接接入你日常使用的各类通讯渠道，包括微信、WhatsApp、Telegram、Discord、iMessage 等数十种平台。无论你在哪个聊天软件中发送消息，OpenClaw 都能即时响应，甚至支持在 macOS、iOS 和 Android 设备上进行语音交互，并提供实时的画布渲染功能供你操控。\n\n这款工具主要解决了用户对数据隐私、响应速度以及“始终在线”体验的需求。通过将 AI 部署在本地，用户无需依赖云端服务即可享受快速、私密的智能辅助，真正实现了“你的数据，你做主”。其独特的技术亮点在于强大的网关架构，将控制平面与核心助手分离，确保跨平台通信的流畅性与扩展性。\n\nOpenClaw 非常适合希望构建个性化工作流的技术爱好者、开发者，以及注重隐私保护且不愿被单一生态绑定的普通用户。只要具备基础的终端操作能力（支持 macOS、Linux 及 Windows WSL2），即可通过简单的命令行引导完成部署。如果你渴望拥有一个懂你",349277,3,"2026-04-06T06:32:30",[13,14,15,16],"Agent","开发框架","图像","数据工具","ready",{"id":19,"name":20,"github_repo":21,"description_zh":22,"stars":23,"difficulty_score":10,"last_commit_at":24,"category_tags":25,"status":17},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,"2026-04-05T11:01:52",[14,15,13],{"id":27,"name":28,"github_repo":29,"description_zh":30,"stars":31,"difficulty_score":32,"last_commit_at":33,"category_tags":34,"status":17},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",148568,2,"2026-04-09T23:34:24",[14,13,35],"语言模型",{"id":37,"name":38,"github_repo":39,"description_zh":40,"stars":41,"difficulty_score":32,"last_commit_at":42,"category_tags":43,"status":17},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",108111,"2026-04-08T11:23:26",[14,15,13],{"id":45,"name":46,"github_repo":47,"description_zh":48,"stars":49,"difficulty_score":32,"last_commit_at":50,"category_tags":51,"status":17},6121,"gemini-cli","google-gemini\u002Fgemini-cli","gemini-cli 是一款由谷歌推出的开源 AI 命令行工具，它将强大的 Gemini 大模型能力直接集成到用户的终端环境中。对于习惯在命令行工作的开发者而言，它提供了一条从输入提示词到获取模型响应的最短路径，无需切换窗口即可享受智能辅助。\n\n这款工具主要解决了开发过程中频繁上下文切换的痛点，让用户能在熟悉的终端界面内直接完成代码理解、生成、调试以及自动化运维任务。无论是查询大型代码库、根据草图生成应用，还是执行复杂的 Git 操作，gemini-cli 都能通过自然语言指令高效处理。\n\n它特别适合广大软件工程师、DevOps 
# ACE0 (ACE Zero)

This repository contains the code associated to the ACE0 paper:
> **Scene Coordinate Reconstruction: Posing of Image Collections via Incremental Learning of a Relocalizer**
> 
> [Eric Brachmann](https://ebrach.github.io/), 
> [Jamie Wynn](https://scholar.google.com/citations?user=ASP-uu4AAAAJ&hl=en), 
> [Shuai Chen](https://chenusc11.github.io/), 
> [Tommaso Cavallari](https://scholar.google.it/citations?user=r7osSm0AAAAJ&hl=en), 
> [Áron Monszpart](https://amonszpart.github.io/), 
> [Daniyar Turmukhambetov](https://dantkz.github.io/about/), and 
> [Victor Adrian Prisacariu](https://www.robots.ox.ac.uk/~victor/)
> 
> ECCV 2024, Oral

For further information please visit:

- [Project page (with a method overview and videos)](https://nianticlabs.github.io/acezero)
- [arXiv](https://arxiv.org/abs/2404.14351)

#### Change Log

**Note**: We try to make sure all code changes are backwards compatible, i.e. the results of the ECCV 2024 paper remain reproducible.
But just in case, we added a tag `eccv_2024_checkpoint` that you can check out to get the exact code version used for the ECCV 2024 paper.

- **2025-Nov-05**: [Added instructions](#standard-relocalization) for running [standard ACE (CVPR 2023)](https://nianticlabs.github.io/ace/) with this codebase.
- **2025-Nov-06**: Added capabilities of [Scene Coordinate Reconstruction Priors (ICCV 2025)](https://nianticspatial.github.io/scr-priors/), disabled by default.
    - [RGB-D version](#rgb-d-reconstruction) of ACE/ACE0
    - [Reconstruction priors](#using-reconstruction-priors): depth distribution prior and 3D point cloud diffusion prior

#### Table of Contents

- [Installation](#installation)
- [Usage](#usage)
    - [Basic Usage](#basic-usage)
    - [Visualization Capabilities](#visualization-capabilities)
    - [Advanced Use Cases](#advanced-use-cases)
      - [Refine Existing Poses](#refine-existing-poses)
      - [Start From a Partial Reconstruction](#start-from-a-partial-reconstruction)
      - [RGB-D Reconstruction ("SCR Priors" paper, ICCV 2025)](#rgb-d-reconstruction)
      - [Using Reconstruction Priors ("SCR Priors" paper, ICCV 2025)](#using-reconstruction-priors)
      - [Standard Relocalization ("ACE" paper, CVPR 2023)](#standard-relocalization)
      - [Self-Supervised Relocalization](#self-supervised-relocalization)
      - [Train NeRF models or Gaussian splats](#train-nerf-models-or-gaussian-splats)
    - [Utility Scripts](#utility-scripts)
- [Benchmark](#benchmark)
  - [Nerfacto](#nerfacto)
  - [Splatfacto](#splatfacto)
- [Paper Experiments](#paper-experiments)
  - [7-Scenes](#7-scenes)
  - [Mip-NeRF 360](#mip-nerf-360)
  - [Tanks and Temples](#tanks-and-temples)
- [Frequently Asked Questions](#frequently-asked-questions)
- [References](#publications)

## Installation

This code uses PyTorch and has been tested on Ubuntu 20.04 with a V100 Nvidia GPU, although it should reasonably run with other Linux distributions and GPUs as well. See our [FAQ](#frequently-asked-questions) if you want to run ACE0 on GPUs with less memory.
We provide a pre-configured [`conda`](https://docs.conda.io/en/latest/) environment containing all required dependencies necessary to run our code.
You can re-create and activate the environment with:

```shell
conda env create -f environment.yml
conda activate ace0
```

**All the following commands in this file need to run from the repository root and in the `ace0` environment.**

ACE0 represents a scene using an [ACE](https://nianticlabs.github.io/ace/) scene coordinate regression model.
In order to register cameras to the scene, it relies on the RANSAC implementation of the DSAC* paper (Brachmann and Rother, TPAMI 2021), which is written in C++.
As such, you need to build and install the C++/Python bindings of those functions.
You can do this with:

```shell
cd dsacstar
python setup.py install
cd ..
```

Having done the steps above, you are ready to experiment with ACE0!

**Important note:** The first time you run ACE0, the script may ask you to confirm that you are happy to download the ZoeDepth depth estimation code and its pretrained weights from GitHub.
See [this link](https://github.com/isl-org/ZoeDepth) for its license and details.
ACE0 uses that model to estimate the depth of the seed images.
It can be replaced; please see the [FAQ](#frequently-asked-questions) section below for details.

## Docker

If you prefer to run ACE0 in a Docker container, you can start it with:

```shell
docker-compose up -d
```

You can then shell into the container with the following command:

```shell
docker exec -it acezero /bin/bash
```

From there you can follow the Gaussian splatting tutorial described at the bottom of the README, [here](#frequently-asked-questions). Make sure to add your images to the volume defined in `docker-compose.yml`.

## Usage

We explain how to run ACE0 to reconstruct images from scratch, with and without knowledge about the image intrinsics.
We also explain how to use ACE0 to refine existing poses, or to initialise the reconstruction with a subset of poses.
Furthermore, we cover the visualization capabilities of ACE0, including export of the reconstruction as a video and as 3D models.

### Basic Usage

In the minimal case, you can run ACE0 on a set of images as defined by a glob pattern.

```shell
# running on a set of images with default parameters
python ace_zero.py "/path/to/some/images/*.jpg" result_folder
```

Note the quotes around the glob pattern to ensure it is passed to the ACE0 script rather than being expanded by the shell.

If you want to run ACE0 on a video, you can extract frames from the video and run ACE0 on the extracted frames, see our [Utility Scripts](#utility-scripts).

The ACE0 script will call ACE training (`train_ace.py`) and camera registration (`register_mapping.py`) in a loop until all images have been registered to the scene representation, or there is no change between iterations.
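Schematically, that outer loop behaves as sketched below. This is a simplified illustration of the behaviour just described, not the actual `ace_zero.py` internals; the `train_on` and `register` callables stand in for the work done by `train_ace.py` and `register_mapping.py`:

```python
# Simplified sketch of the ACE0 outer loop, NOT the actual implementation.
# train_on() and register() are illustrative stand-ins; seed selection and
# confidence handling are omitted.
def ace_zero_loop(all_images, seed_images, train_on, register):
    registered = set(seed_images)              # start from a small seed set
    while True:
        model = train_on(registered)           # ACE mapping on registered images
        newly = {img for img in all_images
                 if img not in registered and register(model, img)}
        if not newly:                          # no change between iterations
            break
        registered |= newly
        if registered == set(all_images):      # everything registered
            break
    return model, registered
```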
The result of an ACE0 reconstruction is the `poses_final.txt` file in the result folder.
Pose files contain the estimated image poses in the following format:
```
filename qw qx qy qz x y z focal_length confidence
```
`filename` is the image file relative to the repository root.
`qw qx qy qz` is the camera rotation as a quaternion, and `x y z` is the camera translation.
Camera poses are world-to-camera transformations, using the OpenCV camera convention.
`focal_length` is the focal length estimated by ACE0 or set externally (see below).
`confidence` is the reliability of an estimate.
If the confidence is less than 1000, it should be treated as unreliable and possibly ignored.

The pose files can be used e.g. to train a Nerfacto or Splatfacto model, using our benchmarking scripts, see [Benchmarking](#benchmark).
Our benchmarking scripts also allow you to only convert our pose files to the format required by Nerfstudio, without running the benchmark itself.
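For downstream processing, a minimal parsing sketch might look like this (assuming the whitespace-separated format above; note that SciPy expects quaternions in `(x, y, z, w)` order, and the confidence filter follows the threshold mentioned above):

```python
# Sketch: parse an ACE0 pose file into camera-to-world matrices with NumPy/SciPy.
import numpy as np
from scipy.spatial.transform import Rotation

def load_ace0_poses(path, min_confidence=1000.0):
    cams = {}
    with open(path) as f:
        for line in f:
            if not line.strip():
                continue
            name, qw, qx, qy, qz, x, y, z, focal, conf = line.split()
            if float(conf) < min_confidence:
                continue  # unreliable estimate, see note above
            world_to_cam = np.eye(4)
            world_to_cam[:3, :3] = Rotation.from_quat(
                [float(qx), float(qy), float(qz), float(qw)]).as_matrix()
            world_to_cam[:3, 3] = [float(x), float(y), float(z)]
            # ACE0 poses are world-to-camera (OpenCV convention); invert for cam-to-world
            cams[name] = (np.linalg.inv(world_to_cam), float(focal))
    return cams
```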
<details>
<summary>Other content of the result folder explained.</summary>

The result folder will contain files such as the following:

- `iterationX.pt`: The ACE scene model (the MLP network) at iteration X. Output of `train_ace.py` in iteration X.
- `iterationX.txt`: Training statistics of the ACE model at iteration X, e.g. loss values, pose statistics, etc. See `ace_trainer.py`. Output of `train_ace.py` in iteration X.
- `poses_iterationX_preliminary.txt`: Poses of cameras after the mapping iteration but before relocalization. Contains poses refined by the MLP, rather than poses re-estimated by RANSAC. Output of `train_ace.py` in iteration X.
- `poses_iterationX.txt`: Final poses of iteration X, after relocalization, i.e. re-estimated by RANSAC. Output of `register_mapping.py` in iteration X.
- `poses_final.txt`: The final poses of the images in the scene. Corresponds to the poses of the last relocalisation iteration, i.e. the output of the last `register_mapping.py` call.
- `pc_final.ply`: An ACE0 point cloud of the scene, for visualisation or initialisation of Gaussian splats. This output is optional and triggered using the `--export_point_cloud True` option of `ace_zero.py`.
</details>

#### Setting Calibration Parameters

Using default parameters, ACE0 will estimate the focal length of the images, starting from a heuristic value (70% of the image diagonal).
If you have a better estimate of the focal length, you can provide it as an initialisation parameter.

```shell
# running ACE0 with an initial guess for the focal length
python ace_zero.py "/path/to/some/images/*.jpg" result_folder --use_external_focal_length <focal_length>
```

Using the call above, ACE0 will refine the focal length throughout the reconstruction process.
If you are confident that your focal length value is correct, you can disable focal length refinement.

```shell
# running ACE0 with a fixed focal length
python ace_zero.py "/path/to/some/images/*.jpg" result_folder --use_external_focal_length <focal_length> --refine_calibration False
```

**Note:** The current implementation of ACE0 supports only a single focal length value shared by all images.
ACE0 also assumes that the principal point is at the image center, and that pixels are square and unskewed.
Changing these assumptions should be possible, but requires some implementation effort.
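For reference, the heuristic starting value mentioned above corresponds to this small helper (an illustrative sketch, not code from the repository):

```python
# The default initial focal length guess: 70% of the image diagonal, in pixels.
import math

def initial_focal_length(width_px: int, height_px: int) -> float:
    return 0.7 * math.hypot(width_px, height_px)

# e.g. a 1920x1080 image yields an initial guess of ~1542 px
```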
### Visualization Capabilities

ACE0 can visualize the reconstruction process as a video.

```shell
# running ACE0 with visualisation enabled
python ace_zero.py "/path/to/some/images/*.jpg" result_folder --render_visualization True
```

With visualisation enabled, ACE0 will render individual frames in a subfolder `renderings` and call `ffmpeg` at the end.
The visualisation will be saved as a video in the results folder, named `reconstruction.mp4`.

<details>
<summary>Other content of the renderings folder explained.</summary>

* `frame_N.png`: The Nth frame of the video.
* `iterationX_mapping.pkl`: The visualisation buffer of the mapping call in iteration X. It stores the 3D point cloud of the scene, the last rendering camera for a smooth transition, and the last frame index.
* `iterationX_register.pkl`: The visualisation buffer of the relocalization call in iteration X.
</details>

**Note that this will slow down the reconstruction considerably.**
Alternatively, you can run without visualisation enabled and export the final reconstruction as a 3D model, see [Utility Scripts](#utility-scripts).

### Advanced Use Cases

You can combine the ACE0 meta script with custom calls to `train_ace.py` and `register_mapping.py` to cater to more advanced use cases.

* `train_ace.py`: Trains an ACE model on a set of images with corresponding poses.
* `register_mapping.py`: Estimates poses of images in a scene given an ACE model.
* `ace_zero.py`: Can start from an existing ACE model.

You are free to switch image sets between the calls to these functions.
We provide some examples of advanced use cases that also cover some of the experiments in our paper.

#### Refine Existing Poses

If you have an initial guess of all image poses, you can use ACE to refine them quickly.
We combine a single ACE mapping call with pose refinement enabled, and a single relocalization call.

```shell
# running ACE mapping with pose refinement enabled
python train_ace.py "/path/to/some/images/*.jpg" result_folder/ace_network.pt --pose_files "/path/to/some/images/*.txt" --pose_refinement mlp --pose_refinement_wait 5000 --use_external_focal_length <focal_length> --refine_calibration False

# re-estimate poses of all images
python register_mapping.py "/path/to/some/images/*.jpg" result_folder/ace_network.pt --use_external_focal_length <focal_length> --session ace_network
```

In this example, ACE takes the existing poses in the [7-Scenes](#7-scenes) format as input: one text file per image with the camera-to-world pose stored as a 4x4 matrix.
The option `--pose_refinement mlp` enables pose refinement using a refinement network.
The option `--pose_refinement_wait 5000` freezes poses for the first 5000 iterations, which increases stability if you are mapping from scratch with pose refinement.
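If your initial poses live in memory rather than on disk, writing them into this one-file-per-image format is straightforward. A minimal sketch; the convention of naming the pose file after its image is an assumption here:

```python
# Sketch: write a camera-to-world pose as a 4x4 text file, one per image,
# matching the 7-Scenes-style pose format used above.
import numpy as np

def write_pose_file(txt_path, cam_to_world):
    pose = np.asarray(cam_to_world, dtype=np.float64)
    assert pose.shape == (4, 4)
    np.savetxt(txt_path, pose)

# e.g. write_pose_file("/path/to/some/images/frame_000.txt", np.eye(4))
```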
After calling `register_mapping.py`, the result folder will contain the refined poses in `poses_ace_network.txt`.

Note that the example above assumes a known, fixed focal length.
If you let ACE refine the calibration, you need to pass the refined focal length of `train_ace.py` on to `register_mapping.py`.
Please see `scripts/reconstruct_7scenes_warmstart.sh` for a complete example where we refine KinectFusion poses with ACE.

#### Start From a Partial Reconstruction

If you have pose estimates for a subset of images, you can use ACE0 to complete the reconstruction.
First, you call ACE mapping on the subset of images with poses, which results in an ACE scene model.
You pass this model to ACE0, which will then register the remaining images to the scene.

```shell
# running ACE mapping on a subset of images with poses
python train_ace.py "/images/with/poses/*.jpg" result_folder/iteration0_seed0.pt --pose_files "/poses/of/images/*.txt" --use_external_focal_length <focal_length> --refine_calibration False

# running ACE0 with the ACE model as a seed, and the complete set of images
python ace_zero.py "/all/images/*.jpg" result_folder --seed_network result_folder/iteration0_seed0.pt --use_external_focal_length <focal_length> --refine_calibration False
```

ACE0 will store the final poses in `poses_final.txt` in the result folder, containing poses of all images.
Note that the example above assumes a known, fixed focal length.
You can also let ACE or ACE0 estimate or refine the focal length, but you need to take care of passing the correct focal length between the calls.

Please see `scripts/reconstruct_t2_training_videos_warmstart.sh` for a complete example where we reconstruct the Tanks and Temples training scenes starting from a partial reconstruction by COLMAP. More information about this example in [Tanks and Temples](#tanks-and-temples).

#### RGB-D Reconstruction

ACE0 supports RGB-D reconstruction as presented in the [Scene Coordinate Reconstruction Priors (SCR Priors) paper](https://nianticspatial.github.io/scr-priors/), ICCV 2025.
You can enable RGB-D reconstruction by providing depth maps for all images via the `--depth_files` option of `ace_zero.py` and setting `--depth_use_always` to True.
For best results, we recommend using the RGB-D loss function of the SCR Priors paper.

```shell
# running ACE0 with RGB-D images and the recommended RGB-D loss function
python ace_zero.py "/path/to/some/images/*.jpg" result_folder --depth_use_always True --depth_files "/path/to/some/depths/*.png" --loss_structure probabilistic --prior_loss_type rgbd_laplace_nll --prior_loss_weight 1.0 --prior_loss_bandwidth 0.1
```

For more information, check the SCR Priors paper and its [code base](https://github.com/nianticspatial/scr-priors).
#### Using Reconstruction Priors

ACE0 supports the use of reconstruction priors as presented in the [Scene Coordinate Reconstruction Priors (SCR Priors) paper](https://nianticspatial.github.io/scr-priors/), ICCV 2025.
Two kinds of priors are available: hand-crafted priors based on expected depth distributions, and learned priors based on a 3D diffusion model.
Note that the parameters and pre-trained models are tailored to indoor scenes.
Using these priors on outdoor scenes may require tuning the parameters of the hand-crafted priors, and re-training the diffusion model for the learned prior.

You can enable the use of reconstruction priors via the appropriate command line options of `ace_zero.py`.

```shell
# running ACE0 with a depth distribution prior using the negative log-likelihood loss
python ace_zero.py "/path/to/some/images/*.jpg" result_folder --loss_structure probabilistic --prior_loss_type laplace_nll --prior_loss_weight 0.1 --prior_loss_bandwidth 0.6 --prior_loss_location 1.73

# OR running ACE0 with a depth distribution prior using the Wasserstein distance
python ace_zero.py "/path/to/some/images/*.jpg" result_folder --loss_structure probabilistic --prior_loss_type laplace_wd --prior_loss_weight 0.1 --prior_loss_bandwidth 0.6 --prior_loss_location 1.73
```

Use of the diffusion prior requires some additional setup.
First, install and activate the `ace0_priors` conda environment as described in the [SCR Priors code base](https://github.com/nianticspatial/scr-priors), since the diffusion prior has additional dependencies.
Second, download the pre-trained diffusion prior from [here](https://storage.googleapis.com/niantic-lon-static/research/scr-priors/diffusion_prior.pt).

```shell
# running ACE0 with a diffusion prior
python ace_zero.py "/path/to/some/images/*.jpg" result_folder --loss_structure "dsac*" --prior_loss_type diffusion --prior_loss_weight 200 --prior_diffusion_model_path /path/to/diffusion_prior.pt
```

The diffusion prior (folder `diffusion`) contains code from the following projects:
  * [denoising-diffusion-pytorch](https://github.com/lucidrains/denoising-diffusion-pytorch/tree/main) (MIT license)
  * [projection-conditioned-point-cloud-diffusion](https://github.com/lukemelas/projection-conditioned-point-cloud-diffusion/tree/main) (MIT license)
  * [pvcnn](https://github.com/mit-han-lab/pvcnn) (MIT license)

For more information on all reconstruction priors, see the README of the [SCR Priors code base](https://github.com/nianticspatial/scr-priors).

#### Standard Relocalization

ACE0 was built on top of the [ACE codebase](https://github.com/nianticlabs/ace) and fully supports standard relocalization as presented in the ACE paper (CVPR 2023).
`train_ace.py` can be used for mapping posed images, and `register_mapping.py` can be used to relocalize query images.

Different from ACE0, standard ACE (i.e. `train_ace.py` + `register_mapping.py`) does support varying focal lengths per image, see the `--calibration_files` and `--calibration_file_f_idx` options of both scripts.

```shell
# mapping posed images with ACE
python train_ace.py "/path/to/mapping/images/*.jpg" result_folder/ace_network.pt --pose_files "/path/to/mapping/poses/*.txt" --calibration_files "/path/to/mapping/calibrations/*.txt"

# relocalizing query images with ACE
python register_mapping.py "/path/to/query/images/*.jpg" result_folder/ace_network.pt --calibration_files "/path/to/query/calibrations/*.txt" --session query
```

The relocalization results will be stored in `poses_query.txt`.
We provide an evaluation script to compare to query ground truth poses, see [Utility Scripts](#utility-scripts).

```shell
# compare estimated and ground truth poses, assuming they are already aligned in the same coordinate space
python eval_poses.py result_folder/poses_query.txt "/path/to/ground/truth/poses/*.txt" --estimate_alignment none
```

Of course, all extensions that come with ACE0 can be enabled in the relocalization setting via the appropriate parameters, e.g. estimating a shared focal length, mapping with RGB-D images, early stopping, or refining mapping poses.
#### Self-Supervised Relocalization

You can use ACE0 to map a set of images, and call `register_mapping.py` on a different set of images to relocalize them.
Here, ACE0 would run on the set of mapping images, while `register_mapping.py` would run on the set of query images.

```shell
# running ACE0 on the mapping images
python ace_zero.py "/path/to/mapping/images/*.jpg" result_folder --use_external_focal_length <focal_length> --refine_calibration False

# running relocalization on the query images
python register_mapping.py "/path/to/query/images/*.jpg" result_folder/iterationX.pt --use_external_focal_length <focal_length> --session query
```

You need to point `register_mapping.py` to the ACE model from the last mapping iteration (e.g. `iterationX.pt`).
The relocalization results will be stored in `poses_query.txt`.
Note that ACE0 reconstructions are only approximately metric.
If you compare the query poses to ground truth, you need to fit a similarity transform first.
We provide a script for doing that.

```shell
# compare estimated and ground truth poses, after fitting a similarity transform to align them
python eval_poses.py result_folder/poses_query.txt "/path/to/ground/truth/poses/*.txt"
```

More information about the evaluation script can be found under [Utility Scripts](#utility-scripts).

#### Train NeRF models or Gaussian splats

See [Benchmarking](#benchmark) for instructions on how to use Nerfstudio on top of ACE0.

### Utility Scripts

#### Video to Dataset

We provide a script for extracting frames from MP4 videos via ffmpeg.

```shell
python datasets/video_to_dataset.py datasets
```

The script looks for all MP4 files in the target folder (here `datasets`) and extracts frames into a subfolder `datasets/video_<mp4_file_name>` for each video.

#### Export 3D Scene as Point Cloud

We provide a script for exporting ACE point clouds from a network and a pose file.

```shell
python export_point_cloud.py point_cloud_out.txt --network /path/to/ace_network.pt --pose_file /path/to/poses_final.txt
```

The script can write out either TXT or PLY files, decided by the file extension of the output file you specify.
If the output file has a .txt extension, the script will write the point cloud into a text file with one `x y z r g b` line per point.
If the output file has a .ply extension, the script will write the point cloud into a binary PLY file.
Both formats can be imported into most 3D software, e.g. Meshlab, CloudCompare, etc.
The PLY format is understood by Nerfstudio for initialisation of Gaussian splats.
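For scripting on top of the TXT variant, loading it back is a one-liner with NumPy (a sketch, assuming the `x y z r g b` layout described above):

```python
# Sketch: load an exported "x y z r g b" text point cloud into NumPy arrays.
import numpy as np

def load_txt_point_cloud(path):
    data = np.loadtxt(path)              # shape (N, 6): one point per line
    return data[:, :3], data[:, 3:6]     # positions, per-point colours
```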
Note, you can also point the script to an existing visualization buffer, `result_folder/renderings/iterationX_mapping.pkl`, which already contains the point cloud so it does not have to be re-generated.

Point clouds can be exported using either OpenGL or OpenCV coordinate conventions. Nerfstudio expects OpenCV coordinates.
The script can extract sparse or dense point clouds. The sparse point clouds have more filters applied and look cleaner.
The dense point clouds tend to work better for Gaussian splatting if you have a lot of images (2000+), as they cover more of the background.

#### Export Cameras as Mesh

We provide a script for exporting an ACE pose file to a PLY file showing the cameras.

```shell
python export_cameras.py /path/to/ace/pose_file.txt /path/to/output.ply
```

The script will color-code the cameras by their confidence value.
The PLY format can be imported into most 3D software, e.g. Meshlab, CloudCompare, etc.

#### Evaluate Poses Against (Pseudo) Ground Truth

We provide a script that measures the pose error of a set of estimated poses against a set of ground truth poses.

```shell
python eval_poses.py /path/to/ace/pose_file.txt "/path/to/ground/truth/poses/*.txt"
```

The ground truth poses are given as a glob pattern, where each file contains the pose of a single image as a 4x4 camera-to-world transformation (e.g. as provided by the 7-Scenes dataset).
Correspondence between ACE estimates and ground truth files is established via the alphabetical order of the image filenames.

The script calculates:
* the registration rate, i.e. the percentage of estimates above a confidence threshold (default: 1000 inliers),
* the accuracy, i.e. the percentage of poses below a pose error threshold (default: 5cm and 5°),
* median rotation and translation errors,
* the absolute trajectory error (ATE) and the relative pose error (RPE).

Since ACE0 poses are only approximately metric and in an arbitrary reference frame, the script will fit a similarity transform between estimates and ground truth before calculating errors.
By default, the script uses RANSAC-based alignment of the camera trajectories.
Note that ATE and RPE errors are usually calculated based on a least-squares alignment.
You can change the type of alignment with the appropriate command line flag, or disable alignment altogether (e.g. in relocalization experiments).

Optionally, the evaluation script can store the evaluation results in a text file.
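To illustrate what the alignment step does, here is a minimal least-squares similarity alignment of camera centres in the style of the Umeyama method. This is a conceptual sketch only; the RANSAC-based default of `eval_poses.py` is more robust to outliers:

```python
# Sketch: least-squares similarity transform (scale s, rotation R, translation t)
# aligning estimated camera centres onto ground-truth centres, Umeyama-style.
import numpy as np

def fit_similarity(src, dst):
    """src, dst: (N, 3) arrays of camera centres. Returns (s, R, t) such that
    dst_i ≈ s * R @ src_i + t for each centre."""
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    src_c, dst_c = src - mu_s, dst - mu_d
    cov = dst_c.T @ src_c / len(src)            # 3x3 cross-covariance
    U, D, Vt = np.linalg.svd(cov)
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1.0                          # guard against reflections
    R = U @ S @ Vt
    s = np.trace(np.diag(D) @ S) / src_c.var(axis=0).sum()
    t = mu_d - s * R @ mu_s
    return s, R, t
```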
## Benchmark

We evaluate the ACE0 pose quality using novel view synthesis via Nerfstudio.

**Note:** All paper results were produced with Nerfstudio v0.3.4. Since then, we updated this repository to support newer versions of Nerfstudio.
We verified that benchmark results did not change significantly when updating to Nerfstudio v1.1.4.
However, if you observe benchmarking inconsistencies w.r.t. the paper, we advise first downgrading to Nerfstudio v0.3.4 and checking out our code using the `eccv_2024_checkpoint` git tag.

### Nerfacto

In our paper, we benchmark the ACE0 reconstruction by training a Nerfacto model and measuring PSNR on a dataset-specific training/test split of images.
To set up the benchmark, follow the instructions in the [Benchmark README](benchmarks/README.md).

Note that the benchmark lives in its own conda environment, so you have to change environments between reconstruction and benchmarking.

The benchmark takes an ACE0 pose file and fits a Nerfacto model. Optionally, you can also use our benchmarking scripts to generate the input files for Nerfstudio without running the benchmark, see the `--no_run_nerfstudio` flag.

If you do run the benchmark, it will apply a 1/8 split of the images by default to calculate PSNR.
The scripts we provide for our [Paper Experiments](#paper-experiments) optionally run the benchmark on each dataset using the correct split.

Since the benchmarking results are stored in a nested structure, we provide a script to extract the PSNR values:

```shell
# show the benchmark results of all scenes as sub-folders in the provided top-level folder
python scripts/show_benchmark_results.py /path/to/top/level/results/folder
```

The script assumes a folder structure where each scene is a sub-folder in a dataset-specific top-level folder.
E.g. `benchmark/7scenes` contains sub-folders `chess`, `fire`, `heads`, etc.

After running benchmarking on a reconstruction, you can load the NeRF model using Nerfstudio's viewer, render videos, etc.

```shell
ns-viewer --load-config /path/to/nerf/config.yaml
```
\nThe ACE0 reconstruction files will be stored in `reconstructions\u002F7scenes` while the benchmarking results will be stored in `benchmark\u002F7scenes`.\nTo show the benchmarking results, call:\n\n```shell\npython scripts\u002Fshow_benchmark_results.py benchmark\u002F7scenes\n```\n\nTo refine KinectFusion poses using ACE (corresponding to \"KF+ACE0\" in Table 1, left), run:\n\n```shell\nbash scripts\u002Freconstruct_7scenes_warmstart.sh\n# show the benchmark results\npython scripts\u002Fshow_benchmark_results.py benchmark\u002F7scenes_warmstart\n```\n\nFind pre-computed poses and reconstruction videos for 7-Scenes [here](https:\u002F\u002Fstorage.googleapis.com\u002Fniantic-lon-static\u002Fresearch\u002Facezero\u002Fresults_ace0_7scenes.tar.gz). \nThese results are from a different run of ACE0 than the one we used for the paper results, but PSNR values are very close (&plusmn; 0.1dB PSNR on average).\n\nFor some experiments in the paper (see right side of Table 1), we run ACE0 and baselines on a subset of images for each scene.\nWe provide the lists of images, together with how they have been split for the view synthesis benchmark here: [200 images per scene](https:\u002F\u002Fstorage.googleapis.com\u002Fniantic-lon-static\u002Fresearch\u002Facezero\u002Fsplits_7s_200frames.tar.gz) and [50 images per scene](https:\u002F\u002Fstorage.googleapis.com\u002Fniantic-lon-static\u002Fresearch\u002Facezero\u002Fsplits_7s_50frames.tar.gz).\n\n### Mip-NeRF 360\n\nSetup the dataset.\n\n```shell\n# setup the Mip-NeRF 360 dataset in the datasets folder\ncd datasets\n# download and unpack the dataset\npython setup_mip360.py\n# back to root directory\ncd ..\n```\nThe script can optionally convert the COLMAP ground truth to the ACE format, but it is not required for the ACE0 experiments.\n\n(Optional for the benchmark) Create a benchmarking train\u002Ftest split for the Mip-NeRF 360 dataset, see the [Benchmark README](benchmarks\u002FREADME.md) for details. \nThis uses a slightly different 1\u002F8 split than the default benchmark split.\n\n```shell\npython scripts\u002Fcreate_splits_mip360.py datasets\u002Fmip360 split_files\n```\n\nReconstruct each scene (corresponding to \"ACE0\" in Table 2 (b)).\n\n```shell\nbash scripts\u002Freconstruct_mip360.sh\n```\n\nBy default, the script will run with benchmarking enabled (make sure you set it up, see [Nerfacto Benchmark](#benchmark)), using Nerfacto and with visualisation disabled. \nFlip the appropriate flags in the script to change this behaviour, e.g. to train Gaussian splats instead of NeRF models.  \nThe ACE0 reconstruction files will be stored in `reconstructions\u002Fmip360` while the benchmarking results will be stored in `benchmark\u002Fmip360`.\nTo show the benchmarking results, call:\n\n```shell\npython scripts\u002Fshow_benchmark_results.py benchmark\u002Fmip360\n```\n\nFind pre-computed poses and reconstruction videos for the Mip-NerF 360 dataset [here](https:\u002F\u002Fstorage.googleapis.com\u002Fniantic-lon-static\u002Fresearch\u002Facezero\u002Fresults_ace0_mip360.tar.gz). 
### Mip-NeRF 360

Set up the dataset.

```shell
# setup the Mip-NeRF 360 dataset in the datasets folder
cd datasets
# download and unpack the dataset
python setup_mip360.py
# back to root directory
cd ..
```

The script can optionally convert the COLMAP ground truth to the ACE format, but this is not required for the ACE0 experiments.

(Optional, for the benchmark) Create a benchmarking train/test split for the Mip-NeRF 360 dataset, see the [Benchmark README](benchmarks/README.md) for details.
This uses a slightly different 1/8 split than the default benchmark split.

```shell
python scripts/create_splits_mip360.py datasets/mip360 split_files
```

Reconstruct each scene (corresponding to "ACE0" in Table 2 (b)).

```shell
bash scripts/reconstruct_mip360.sh
```

By default, the script will run with benchmarking enabled (make sure you set it up, see [Nerfacto Benchmark](#benchmark)), using Nerfacto and with visualisation disabled.
Flip the appropriate flags in the script to change this behaviour, e.g. to train Gaussian splats instead of NeRF models.
The ACE0 reconstruction files will be stored in `reconstructions/mip360` while the benchmarking results will be stored in `benchmark/mip360`.
To show the benchmarking results, call:

```shell
python scripts/show_benchmark_results.py benchmark/mip360
```

Find pre-computed poses and reconstruction videos for the Mip-NeRF 360 dataset [here](https://storage.googleapis.com/niantic-lon-static/research/acezero/results_ace0_mip360.tar.gz).
These results are from a different run of ACE0 than the one we used for the paper results, but PSNR values are very close (±0.1 dB PSNR on average).
### Tanks and Temples

You have to manually [download the dataset](https://www.tanksandtemples.org/download/).
Our dataset script assumes you downloaded the group archives into `datasets/t2` without unpacking them:
```
datasets/t2/training.zip
datasets/t2/training_videos.zip
datasets/t2/intermediate.zip
datasets/t2/intermediate_videos.zip
datasets/t2/advanced.zip
datasets/t2/advanced_videos.zip
```

Set up the dataset.

```shell
# setup the T&T dataset in the datasets folder
cd datasets
# unpack the dataset
python setup_t2.py
# back to root directory
cd ..
```

Optionally, the script can download and set up COLMAP ground truth poses, and convert them to the ACE format.
This is required for the ACE0 experiments which reconstruct the dataset videos starting from a sparse COLMAP reconstruction.
Call the script with `--with-colmap`.
This will create an additional folder `t2_colmap` in the datasets folder where each scene folder not only has the image files, but also corresponding `*_pose.txt` files with COLMAP poses as 4x4 camera-to-world transformations.
Also, per scene, a single `focal_length.txt` file is created with the COLMAP focal length estimate.

We provide scripts for Tanks and Temples separated by scene group, i.e. training, intermediate, and advanced.
The following explanations are for the training group, but the scripts for the intermediate and advanced groups are similar.

Reconstruct each scene from a few hundred images (corresponding to "ACE0" in Table 3, left).

```shell
bash scripts/reconstruct_t2_training.sh
```

By default, the script will run with benchmarking enabled (make sure you set it up, see [Nerfacto Benchmark](#benchmark)), using Nerfacto and with visualisation disabled.
Flip the appropriate flags in the script to change this behaviour, e.g. to train Gaussian splats instead of NeRF models.
The ACE0 reconstruction files will be stored in `reconstructions/t2_training` while the benchmarking results will be stored in `benchmark/t2_training`.
To show the benchmarking results, call:

```shell
python scripts/show_benchmark_results.py benchmark/t2_training
```

Note that no benchmarking split files need to be generated for Tanks and Temples.
The benchmark will apply a default 1/8 split.

To reconstruct the full videos of each scene (corresponding to "ACE0" in Table 3, right), call:

```shell
bash scripts/reconstruct_t2_training_videos.sh
# show benchmarking results
python scripts/show_benchmark_results.py benchmark/t2_training_videos
```

To reconstruct the full videos of each scene starting from a COLMAP reconstruction (corresponding to "Sparse COLMAP + ACE0" in Table 3, left), call:

```shell
bash scripts/reconstruct_t2_training_videos_warmstart.sh
# show benchmarking results
python scripts/show_benchmark_results.py benchmark/t2_training_videos_warmstart
```

Note that the last experiment assumes that you set up the dataset with `--with-colmap`.
The code will first call ACE mapping on the images with COLMAP poses to create an initial scene model.
This model is then passed to ACE0, which will use it as a seed for the full video reconstruction.
In this example, we trust the focal length estimate of COLMAP and keep it fixed throughout the reconstruction.

Find pre-computed poses and reconstruction videos for Tanks and Temples here: [Training scenes](https://storage.googleapis.com/niantic-lon-static/research/acezero/results_ace0_t2_training.tar.gz), [Intermediate scenes](https://storage.googleapis.com/niantic-lon-static/research/acezero/results_ace0_t2_intermediate.tar.gz), [Advanced scenes](https://storage.googleapis.com/niantic-lon-static/research/acezero/results_ace0_t2_advanced.tar.gz).
These results are from a different run of ACE0 than the one we used for the paper results, but PSNR values are very close (±0.3 dB PSNR on average).

## Frequently Asked Questions

**Q: I want Gaussian splats from my images. What do I need to do?**

**A:** Prepare ACE0 as explained in the beginning of this document: create the ACE0 environment, compile the DSAC* bindings, and [set up Nerfstudio](benchmarks/README.md).
Then, run the following commands on your image set:

```shell
# activate our conda environment
conda activate ace0

# run ACE0 reconstruction with point cloud export
python ace_zero.py "/path/to/some/images/*.jpg" result_folder --export_point_cloud True

# switch to the Nerfstudio conda environment
conda activate nerfstudio

# convert the ACE0 output to a Nerfstudio compatible format and run Splatfacto training (also runs evaluation but it's fast)
python -m benchmarks.benchmark_poses --pose_file result_folder/poses_final.txt --output_dir benchmark_folder --images_glob_pattern "/path/to/some/images/*.jpg" --method splatfacto

# view the Gaussian splats
ns-viewer --load-config benchmark_folder/nerf_data/nerf_for_eval/splatfacto/run/config.yaml
```

**Q: I run out of GPU memory during the ACE0 reconstruction. What can I do?**

**A:** All experiments in the paper were performed with 16GB of GPU memory (e.g. NVIDIA V100/T4) and the default settings should work with such a GPU.
The bulk of the memory is used by the ACE training buffer (up to ~8GB).
You can run ACE0 with the flag `--training_buffer_cpu True` to keep the training buffer on the CPU at the expense of reconstruction speed.
With that option, ACE0 should require ~1GB of GPU memory.

**Q: I have an image collection with various image sizes, aspect ratios and intrinsics. Can I use ACE0?**

**A:** No. ACE0 assumes that all images share their intrinsics, particularly the focal length.
This is a limitation of the current implementation, rather than the method.
Supporting images with varying intrinsics should work, but would require some implementation effort, particularly in `refine_calibration.py`.

**Q: Does ACE0 estimate intrinsics other than the focal length?**

**A:** No. ACE0 assumes that the principal point is at the image center, and pixels are square and unskewed.
The focal length, shared by all images, is the only intrinsic parameter estimated and/or refined by ACE0.

**Q: I have images from a complex camera model, e.g. with severe image distortion. Can I use ACE0?**

**A:** No. The scene coordinate regression network might be able to absorb some distortion, but presumably not much.
The reprojection loss of ACE and the RANSAC pose estimator assume a pinhole camera model.
These parts would need to be extended with a camera distortion model.
If the distortion parameters are known, we recommend undistorting the images before passing them to ACE0.
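For that known-calibration case, a minimal undistortion sketch with OpenCV might look like this (the camera matrix `K` and distortion coefficients `dist` below are placeholder values that you would replace with your own calibration):

```python
# Sketch: undistort images with known calibration before handing them to ACE0.
# K is the 3x3 pinhole camera matrix; dist holds the OpenCV distortion
# coefficients (k1, k2, p1, p2, k3). Both are placeholders here.
import glob
import cv2
import numpy as np

K = np.array([[1000.0, 0.0, 960.0],
              [0.0, 1000.0, 540.0],
              [0.0, 0.0, 1.0]])
dist = np.array([-0.1, 0.01, 0.0, 0.0, 0.0])

for path in glob.glob("/path/to/some/images/*.jpg"):
    img = cv2.imread(path)
    undistorted = cv2.undistort(img, K, dist)
    cv2.imwrite(path.replace(".jpg", "_undistorted.jpg"), undistorted)
```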
**Q: How can I run ACE0 with depth other than ZoeDepth estimates?**

**A:** If you have pre-calculated depth maps, you can call `ace_zero.py` with `--depth_files "/path/to/depths/*.png"`.
In this case, ACE0 will use the provided depth maps for the seed images instead of estimating depth.
Otherwise, the functions `get_depth_model()` and `estimate_depth()` in `dataset_io.py` can be adapted to use a depth estimator other than ZoeDepth.
Note that we found the impact of the depth estimation model to be rather small in our experiments.

**Q: Is ACE0 able to reconstruct from a small set of sparse views?**

**A:** It can work, but this scenario is challenging for ACE0.
We expect other methods, and even COLMAP, to work much better in this case.
ACE0 relies on images having sufficient visual overlap, particularly when registering new images to the reconstruction.
You can lower the registration threshold when running `ace_zero.py` by setting `--registration_confidence` to 300 or 100, but at some point ACE0 will become unstable.
ACE0 shines if you have dense coverage of a scene and want to reconstruct it from many images in reasonable time.

## Publications

If you use ACE0 or parts of its code in your own work, please cite:

```
@inproceedings{brachmann2024acezero,
    title={Scene Coordinate Reconstruction: Posing of Image Collections via Incremental Learning of a Relocalizer},
    author={Brachmann, Eric and Wynn, Jamie and Chen, Shuai and Cavallari, Tommaso and Monszpart, {\'{A}}ron and Turmukhambetov, Daniyar and Prisacariu, Victor Adrian},
    booktitle={ECCV},
    year={2024},
}
```

This code builds on the ACE relocalizer and uses the DSAC* pose estimator. Please consider citing:

```
@inproceedings{brachmann2023ace,
    title={Accelerated Coordinate Encoding: Learning to Relocalize in Minutes using RGB and Poses},
    author={Brachmann, Eric and Cavallari, Tommaso and Prisacariu, Victor Adrian},
    booktitle={CVPR},
    year={2023},
}

@article{brachmann2021dsacstar,
  title={Visual Camera Re-Localization from {RGB} and {RGB-D} Images Using {DSAC}},
  author={Brachmann, Eric and Rother, Carsten},
  journal={TPAMI},
  year={2021}
}
```

If you use capabilities associated with the paper "Scene Coordinate Reconstruction Priors" (ICCV 2025), please cite it.
This includes RGB-D reconstruction and the use of probabilistic loss functions or the diffusion prior.

```
@inproceedings{bian2024scrpriors,
    title={Scene Coordinate Reconstruction Priors},
    author={Bian, Wenjing and Barroso-Laguna, Axel and Cavallari, Tommaso and Prisacariu, Victor Adrian and Brachmann, Eric},
    booktitle={ICCV},
    year={2025},
}
```

ACE0 estimates the depth of seed images using ZoeDepth. Please consider citing:

```
@article{bhat2023zoedepth,
  title={Zoe{D}epth: Zero-shot transfer by combining relative and metric depth},
  author={Bhat, Shariq Farooq and Birkl, Reiner and Wofk, Diana and Wonka, Peter and M{\"u}ller, Matthias},
  journal={arXiv},
  year={2023}
}
```

This repository relies on Nerfstudio for benchmarking.
Please consider citing according to [their docs](https://docs.nerf.studio/#citation).

## License

Copyright © Niantic, Inc. 2024. Patent Pending.
All rights reserved.
Please see the [license file](LICENSE) for terms.
环境，其中包含了运行代码所需的所有依赖项。\n您可以通过以下命令重新创建并激活该环境：\n\n```shell\nconda env create -f environment.yml\nconda activate ace0\n```\n\n**本文件中的所有后续命令都需要在仓库根目录下，并且在 `ace0` 环境中执行。**\n\nACE0 使用 [ACE](https:\u002F\u002Fnianticlabs.github.io\u002Face\u002F) 场景坐标回归模型来表示场景。\n为了将相机注册到场景中，它依赖于 DSAC* 论文（Brachmann 和 Rother，TPAMI 2021）中的 RANSAC 实现，该实现是用 C++ 编写的。\n因此，您需要构建并安装这些函数的 C++\u002FPython 绑定。\n您可以通过以下命令完成这一操作：\n\n```shell\ncd dsacstar\npython setup.py install\ncd ..\n```\n\n完成上述步骤后，您就可以开始体验 ACE0 了！\n\n**重要提示：** 第一次运行 ACE0 时，脚本可能会要求您确认是否同意从 GitHub 下载 ZoeDepth 深度估计代码及其预训练权重。\n其许可证和详细信息请参阅 [此链接](https:\u002F\u002Fgithub.com\u002Fisl-org\u002FZoeDepth)。\nACE0 使用该模型来估计种子图像的深度。\n该模型可以被替换，详情请参阅下方的 [FAQ](#frequently-asked-questions) 部分。\n\n## Docker\n\n如果您更倾向于在 Docker 容器中运行 ACE0，可以使用以下命令启动容器：\n\n```shell  \ndocker-compose up -d \n```\n\n然后，您可以通过以下命令进入容器：\n\n```shell  \ndocker exec -it acezero \u002Fbin\u002Fbash\n```\n\n之后，您可以按照 README 底部所述的高斯泼溅教程进行操作 [此处。](#frequently-asked-questions) 请务必将您的图像添加到 `docker-compose.yml` 中定义的卷中。\n\n## 使用\n\n我们将介绍如何从头开始使用 ACE0 重建图像，无论是否了解图像的内参。\n我们还将说明如何利用 ACE0 优化现有位姿，或者使用部分位姿作为初始条件来进行重建。\n此外，我们还会讲解 ACE0 的可视化功能，包括将重建结果导出为视频和 3D 模型。\n\n### 基本用法\n\n在最简单的情况下，您可以使用 glob 模式指定的一组图像来运行 ACE0。\n\n```shell\n\n# 在一组图像上使用默认参数运行\npython ace_zero.py \"\u002Fpath\u002Fto\u002Fsome\u002Fimages\u002F*.jpg\" result_folder\n```\n\n请注意，glob 模式周围加了引号，以确保它被传递给 ACE0 脚本，而不是由 shell 展开。\n\n如果你想对视频运行 ACE0，可以先从视频中提取帧，然后在提取的帧上运行 ACE0，详情请参阅我们的 [实用脚本](#utility-scripts)。\n\nACE0 脚本会循环调用 ACE 训练（`train_ace.py`）和相机注册（`register_mapping.py`），直到所有图像都已注册到场景表示中，或者迭代之间没有变化为止。\n\nACE0 重建的结果是结果文件夹中的 `poses_final.txt`。这些文件包含估计的图像位姿，格式如下：\n```\nfilename qw qx qy qz x y z focal_length confidence\n``` \n`filename` 是相对于仓库根目录的图像文件名。\n`qw qx qy qz` 是相机旋转的四元数表示，`x y z` 是相机平移。\n相机位姿是世界坐标系到相机坐标系的变换，采用 OpenCV 的相机约定。\n`focal_length` 是 ACE0 估计的焦距，或由外部设置的焦距（见下文）。\n`confidence` 是估计的可靠性。如果置信度小于 1000，则应视为不可靠，可能需要忽略。\n\n位姿文件可用于训练 Nerfacto 或 Splatfacto 模型，方法是使用我们的基准测试脚本，详见 [Benchmarking](#benchmark)。我们的基准测试脚本还允许你仅将我们的位姿文件转换为 Nerfstudio 所需的格式，而无需运行基准测试本身。\n\n\u003Cdetails>\n\u003Csummary>结果文件夹中的其他内容说明。\u003C\u002Fsummary>\n\n结果文件夹中会包含如下文件：\n\n- `iterationX.pt`: 第 X 次迭代时的 ACE 场景模型（MLP 网络）。第 X 次迭代时 `train_ace.py` 的输出。\n- `iterationX.txt`: 第 X 次迭代时 ACE 模型的训练统计信息，例如损失值、位姿统计等。详见 `ace_trainer.py`。第 X 次迭代时 `train_ace.py` 的输出。\n- `poses_iterationX_preliminary.txt`: 映射迭代后、重定位前的相机位姿。包含由 MLP 优化后的位姿，而非由 RANSAC 重新估计的位姿。第 X 次迭代时 `train_ace.py` 的输出。\n- `poses_iterationX.txt`: 第 X 次迭代的最终位姿，即经过重定位后由 RANSAC 重新估计的位姿。第 X 次迭代时 `register_mapping.py` 的输出。\n- `poses_final.txt`: 场景中图像的最终位姿。对应于最后一次重定位迭代的位姿，即最后一次调用 `register_mapping.py` 的输出。\n- `pc_final.ply`: ACE0 生成的场景点云，用于可视化或初始化高斯样条。此输出为可选，通过在 `ace_zero.py` 中使用 `--export_point_cloud True` 选项来触发。\n\u003C\u002Fdetails>\n\n#### 设置标定参数\n\n使用默认参数时，ACE0 会根据启发式值（图像对角线的 70%）估计图像的焦距。如果你有更准确的焦距估计，可以将其作为初始参数提供。\n\n```shell\n# 使用焦距初始猜测运行 ACE0\npython ace_zero.py \"\u002Fpath\u002Fto\u002Fsome\u002Fimages\u002F*.jpg\" result_folder --use_external_focal_length \u003Cfocal_length>\n```\n\n使用上述命令，ACE0 会在整个重建过程中不断优化焦距。如果你确信自己的焦距值是正确的，可以禁用焦距优化。\n\n```shell\n# 使用固定焦距运行 ACE0\npython ace_zero.py \"\u002Fpath\u002Fto\u002Fsome\u002Fimages\u002F*.jpg\" result_folder --use_external_focal_length \u003Cfocal_length> --refine_calibration False\n```\n\n**注意：** 当前 ACE0 的实现只支持所有图像共享一个焦距值。此外，ACE0 还假定主点位于图像中心，且像素为正方形、无畸变。若要改变这些假设，理论上可行，但需要一定的实现工作量。\n\n### 可视化功能\n\nACE0 可以将重建过程可视化为视频。\n\n```shell\n# 启用可视化运行 ACE0\npython ace_zero.py \"\u002Fpath\u002Fto\u002Fsome\u002Fimages\u002F*.jpg\" result_folder --render_visualization True\n```\n\n启用可视化后，ACE0 
会将每一帧渲染到子文件夹 `renderings` 中，并在最后调用 `ffmpeg`。可视化结果将以视频形式保存在结果文件夹中，名为 `reconstruction.mp4`。\n\n\u003Cdetails>\n\u003Csummary>renderings 文件夹中的其他内容说明。\u003C\u002Fsummary>\n\n* `frame_N.png`: 视频中的第 N 帧。\n* `iterationX_mapping.pkl`: 第 X 次迭代中映射调用的可视化缓冲区。它存储了场景的 3D 点云、用于平滑过渡的上一帧渲染相机以及上一帧的索引。\n* `iterationX_register.pkl`: 第 X 次迭代中重定位调用的可视化缓冲区。\n\u003C\u002Fdetails>\n\n**请注意，这会显著减慢重建速度。** 或者，你可以不启用可视化，而是将最终重建结果导出为 3D 模型，详情请参阅 [实用脚本](#utility-scripts)。\n\n### 高级用例\n\n你可以将 ACE0 元脚本与自定义调用 `train_ace.py` 和 `register_mapping.py` 结合使用，以满足更高级的用例需求。\n\n* `train_ace.py`: 根据一组带有相应位姿的图像训练 ACE 模型。\n* `register_mapping.py`: 根据 ACE 模型估计场景中图像的位姿。\n* `ace_zero.py`: 可以从现有的 ACE 模型开始。\n\n你可以在这些函数的调用之间自由切换图像集。我们提供了一些高级用例示例，其中也涵盖了我们论文中的一些实验。\n\n#### 优化现有位姿\n\n如果你已经对所有图像的位姿有了初步估计，可以使用 ACE 快速对其进行优化。我们将一次 ACE 映射调用与位姿优化功能结合，并再进行一次重定位调用。\n\n```shell\n# 运行启用位姿优化的 ACE 映射\npython train_ace.py \"\u002Fpath\u002Fto\u002Fsome\u002Fimages\u002F*.jpg\" result_folder\u002Face_network.pt --pose_files \"\u002Fpath\u002Fto\u002Fsome\u002Fimages\u002F*.txt\" --pose_refinement mlp --pose_refinement_wait 5000 --use_external_focal_length \u003Cfocal_length> --refine_calibration False\n\n# 重新估计所有图像的位姿\npython register_mapping.py \"\u002Fpath\u002Fto\u002Fsome\u002Fimages\u002F*.jpg\" result_folder\u002Face_network.pt --use_external_focal_length \u003Cfocal_length> --session ace_network\n```\n\n在这个示例中，ACE 将 [7-Scenes](#7-scenes) 格式的现有位姿作为输入：每张图像对应一个文本文件，其中相机到世界坐标系的位姿以 4×4 矩阵的形式存储。选项 `--pose_refinement mlp` 启用基于精炼网络的位姿精炼。选项 `--pose_refinement_wait 5000` 在前 5000 次迭代中冻结位姿，这在从头开始进行位姿精炼建图时可以提高稳定性。\n\n调用 `register_mapping.py` 后，结果文件夹中将包含精炼后的位姿，保存在 `poses_ace_network.txt` 文件中。\n\n请注意，上述示例假设焦距已知且固定。如果让 ACE 自动精炼标定参数，则需要将 `train_ace.py` 精炼得到的焦距传递给 `register_mapping.py`。完整的示例请参阅 `scripts\u002Freconstruct_7scenes_warmstart.sh`，其中我们使用 ACE 对 KinectFusion 的位姿进行精炼。\n\n#### 从部分重建开始\n\n如果你已经拥有部分图像的位姿估计，可以使用 ACE0 完成整个重建。首先，对带有位姿的部分图像运行 ACE 建图，生成一个 ACE 场景模型。然后将该模型传递给 ACE0，ACE0 会将剩余的图像注册到该场景中。\n\n```shell\n# 对带有位姿的部分图像运行 ACE 建图\npython train_ace.py \"\u002Fimages\u002Fwith\u002Fposes\u002F*.jpg\" result_folder\u002Fiteration0_seed0.pt --pose_files \"\u002Fposes\u002Fof\u002Fimages\u002F*.txt\" --use_external_focal_length \u003Cfocal_length> --refine_calibration False\n\n# 使用 ACE 模型作为种子，并对完整图像集运行 ACE0\npython ace_zero.py \"\u002Fall\u002Fimages\u002F*.jpg\" result_folder --seed_network result_folder\u002Fiteration0_seed0.pt --use_external_focal_length ${focal_length} --refine_calibration False\n```\n\nACE0 会将最终的位姿存储在结果文件夹中的 `poses_final.txt` 文件中，包含所有图像的位姿。请注意，上述示例假设焦距已知且固定。你也可以让 ACE 或 ACE0 估计或精炼焦距，但需要确保在不同步骤之间正确传递焦距值。\n\n完整的示例请参阅 `scripts\u002Freconstruct_t2_training_videos_warmstart.sh`，其中我们基于 COLMAP 的部分重建结果来重建 Tanks and Temples 训练场景。更多关于此示例的信息，请参阅 [Tanks and Temples](#tanks-and-temples) 部分。\n\n#### RGB-D 重建\n\nACE0 支持基于 [场景坐标重建先验（SCR 先验）论文](https:\u002F\u002Fnianticspatial.github.io\u002Fscr-priors\u002F) 的 RGB-D 重建方法，该论文发表于 ICCV 2025。你可以通过 `ace_zero.py` 的 `--depth_files` 选项为所有图像提供深度图，并将 `--depth_use_always` 设置为 True 来启用 RGB-D 重建功能。为了获得最佳效果，我们建议使用 SCR 先验论文中提出的 RGB-D 损失函数。\n\n```shell\n# 使用 RGB-D 图像和推荐的 RGB-D 损失函数运行 ACE0\npython ace_zero.py \"\u002Fpath\u002Fto\u002Fsome\u002Fimages\u002F*.jpg\" result_folder --depth_use_always True --depth_files \"\u002Fpath\u002Fto\u002Fsome\u002Fdepths\u002F*.png\" --loss_structure probabilistic --prior_loss_type rgbd_laplace_nll --prior_loss_weight 1.0 --prior_loss_bandwidth 0.1\n```\n\n更多信息请参考 SCR 先验论文及其 [代码库](https:\u002F\u002Fgithub.com\u002Fnianticspatial\u002Fscr-priors)。\n\n#### 使用重建先验\n\nACE0 支持使用 [场景坐标重建先验（SCR 
### Visualization

ACE0 can visualize the reconstruction process as a video.

```shell
# run ACE0 with visualization enabled
python ace_zero.py "/path/to/some/images/*.jpg" result_folder --render_visualization True
```

With visualization enabled, ACE0 renders each frame into the `renderings` subfolder and calls `ffmpeg` at the end. The visualization is stored as a video named `reconstruction.mp4` in the result folder.

<details>
<summary>Explanation of other files in the renderings folder.</summary>

* `frame_N.png`: Frame N of the video.
* `iterationX_mapping.pkl`: Visualization buffer of the mapping call at iteration X. It stores the 3D point cloud of the scene, the previously rendered camera for smooth transitions, and the index of the last frame.
* `iterationX_register.pkl`: Visualization buffer of the relocalization call at iteration X.
</details>

**Note that this slows down the reconstruction considerably.** Alternatively, you can run without visualization and export the final reconstruction as a 3D model instead; see [utility scripts](#utility-scripts).

### Advanced use cases

You can combine the ACE0 meta script with custom calls to `train_ace.py` and `register_mapping.py` to support more advanced use cases.

* `train_ace.py`: Trains an ACE model given a set of images with associated poses.
* `register_mapping.py`: Estimates the poses of images in a scene given an ACE model.
* `ace_zero.py`: Can start from an existing ACE model.

You are free to switch image sets between calls to these functions. We provide a few examples of advanced use cases, which also cover some of the experiments in our paper.

#### Refining existing poses

If you already have initial pose estimates for all images, you can use ACE to quickly refine them. We combine one ACE mapping call with pose refinement enabled, followed by one more relocalization call.

```shell
# run ACE mapping with pose refinement enabled
python train_ace.py "/path/to/some/images/*.jpg" result_folder/ace_network.pt --pose_files "/path/to/some/images/*.txt" --pose_refinement mlp --pose_refinement_wait 5000 --use_external_focal_length <focal_length> --refine_calibration False

# re-estimate the poses of all images
python register_mapping.py "/path/to/some/images/*.jpg" result_folder/ace_network.pt --use_external_focal_length <focal_length> --session ace_network
```

In this example, ACE takes existing poses in [7-Scenes](#7-scenes) format as input: one text file per image, storing the camera-to-world pose as a 4x4 matrix. The option `--pose_refinement mlp` enables pose refinement based on a refinement network. The option `--pose_refinement_wait 5000` freezes the poses for the first 5000 iterations, which can improve stability when mapping from scratch with pose refinement.

After the call to `register_mapping.py`, the result folder contains the refined poses in the file `poses_ace_network.txt`.

Note that the example above assumes a known, fixed focal length. If you let ACE refine the calibration instead, you have to pass the focal length refined by `train_ace.py` on to `register_mapping.py`. For a complete example, see `scripts/reconstruct_7scenes_warmstart.sh`, where we refine the KinectFusion poses with ACE.
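If your initial poses live in some other data structure, producing input files in this per-image format is simple. A minimal sketch, assuming plain whitespace-separated 4x4 matrices as in the 7-Scenes ground truth files (the file name below is hypothetical; pair each text file with its image):

```python
# write_pose_files.py -- illustrative sketch for 7-Scenes style pose files
import numpy as np

def write_pose_file(path, cam_to_world):
    """Write one 4x4 camera-to-world matrix as 4 lines of 4 numbers."""
    assert cam_to_world.shape == (4, 4)
    np.savetxt(path, cam_to_world, fmt="%.9f")

# identity pose as a placeholder; one .txt per image, named to pair with
# its image file (e.g. frame_000.jpg <-> frame_000.txt)
write_pose_file("frame_000.txt", np.eye(4))
```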
register_mapping.py \"\u002Fpath\u002Fto\u002Fquery\u002Fimages\u002F*.jpg\" result_folder\u002FiterationX.pt --use_external_focal_length \u003Cfocal_length> --session query\n```\n\n您需要将 `register_mapping.py` 指向上次建图迭代生成的 ACE 模型（例如 `iterationX.pt`）。重新定位的结果将存储在 `poses_query.txt` 中。请注意，ACE0 的重建结果仅为近似度量尺度。如果您要将查询位姿与真实位姿进行比较，首先需要拟合一个相似变换。我们为此提供了一个脚本。\n\n```shell\n# 拟合相似变换以对齐后，比较估计位姿和真实位姿\npython eval_poses.py result_folder\u002Fposes_query.txt \"\u002Fpath\u002Fto\u002Fground\u002Ftruth\u002Fposes\u002F*.txt\"\n```\n\n有关评估脚本的更多信息，请参阅 [实用脚本](#utility-scripts)。\n\n#### 训练 NeRF 模型或高斯泼溅模型\n\n有关如何在 ACE0 基础上使用 Nerfstudio 的说明，请参阅 [基准测试](#benchmark)。\n\n### 实用脚本\n\n#### 视频转数据集\n\n我们提供了一个脚本，可通过 ffmpeg 从 MP4 视频中提取帧。\n\n```shell\npython datasets\u002Fvideo_to_dataset.py datasets\n```\n\n该脚本会在目标文件夹（此处为 `datasets`）中查找所有 MP4 文件，并为每段视频提取帧，存入子文件夹 `datasets\u002Fvideo_\u003Cmp4_file_name>` 中。\n\n#### 将 3D 场景导出为点云\n\n我们提供了一个脚本，用于从网络和位姿文件中导出 ACE 点云。\n\n```shell\npython export_point_cloud.py point_cloud_out.txt --network \u002Fpath\u002Fto\u002Face_network.pt --pose_file \u002Fpath\u002Fto\u002Fposes_final.txt\n```\n\n该脚本可以输出 TXT 或 PLY 文件，具体取决于您指定的输出文件扩展名。如果输出文件扩展名为 .txt，则脚本会将点云以文本格式写入文件，每行包含一个点的坐标 `(x y z r g b)`。如果输出文件扩展名为 .ply，则脚本会将点云以二进制 PLY 格式写入文件。这两种格式都可以导入大多数 3D 软件中，例如 Meshlab、CloudCompare 等。PLY 格式还可被 Nerfstudio 识别，用于初始化高斯泼溅模型。\n\n注意，您也可以将脚本指向现有的可视化缓冲区 `result_folder\u002Frenderings\u002FiterationX_mapping.pkl`，其中已包含点云，无需再次生成。\n\n点云可以使用 OpenGL 或 OpenCV 坐标约定导出。Nerfstudio 预期使用 OpenCV 坐标。该脚本可以提取稀疏或密集点云。稀疏点云应用了更多滤波器，看起来更干净；而密集点云则更适合高斯泼溅，尤其是在图像数量较多（2000 张以上）时，因为它们能更好地覆盖背景。\n\n#### 将相机导出为网格\n\n我们提供了一个脚本，用于将 ACE 位姿文件导出为显示相机位置的 PLY 文件。\n\n```shell\npython export_cameras.py \u002Fpath\u002Fto\u002Face\u002Fpose_file.txt \u002Fpath\u002Fto\u002Foutput.ply\n```\n\n该脚本会根据相机的置信度值对相机进行颜色编码。PLY 格式可以导入大多数 3D 软件中，例如 Meshlab、CloudCompare 等。\n\n#### 评估位姿与（伪）真实位姿\n\n我们提供了一个脚本，用于衡量一组估计位姿与一组真实位姿之间的误差。\n\n```shell\npython eval_poses.py \u002Fpath\u002Fto\u002Face\u002Fpose_file.txt \"\u002Fpath\u002Fto\u002Fground\u002Ftruth\u002Fposes\u002F*.txt\"\n```\n\n真实位姿以 glob 模式给出，每个文件包含单张图像的 4x4 相机到世界变换矩阵（例如 7-Scenes 数据集提供的格式）。ACE 估计与真实位姿文件之间的对应关系将通过图像文件名的字母顺序建立。\n\n该脚本计算：\n* 注册率，即高于置信度阈值的估计比例（默认：1000 个内点）；\n* 准确率，即低于位姿误差阈值的位姿比例（默认：5 cm 和 5°）；\n* 旋转和平移误差的中位数；\n* 绝对轨迹误差 (ATE) 和相对位姿误差 (RPE)。\n\n由于 ACE0 的位姿仅为近似度量尺度，且处于任意参考框架中，因此在计算误差之前，脚本会先在估计位姿和真实位姿之间拟合一个相似变换。默认情况下，脚本会使用基于 RANSAC 的相机轨迹对齐方法。需要注意的是，ATE 和 RPE 误差通常是在最小二乘法对齐的基础上计算的。您可以通过相应的命令行标志更改对齐方式，或完全禁用对齐（例如在重新定位实验中）。\n\n此外，评估脚本还可以将评估结果保存到文本文件中。\n\n## 基准测试\n\n我们使用 Nerfstudio 的新视图合成来评估 ACE0 位姿的质量。\n\n**注意：** 所有论文中的结果均是使用 Nerfstudio v0.3.4 生成的。此后，我们更新了此仓库以支持更高版本的 Nerfstudio。我们验证发现，升级到 Nerfstudio v1.1.4 后，基准测试结果并未发生显著变化。然而，如果您在基准测试中观察到与论文不一致的情况，我们建议您先降级至 Nerfstudio v0.3.4，并使用 `eccv_2024_checkpoint` Git 标签检出我们的代码。\n\n### Nerfacto\n\n在我们的论文中，我们通过训练一个Nerfacto模型，并在特定数据集的训练\u002F测试图像划分上测量PSNR，来对ACE0重建进行基准测试。\n\n要设置基准测试，请按照[基准测试README](benchmarks\u002FREADME.md)中的说明操作。\n\n请注意，基准测试运行在它自己的conda环境中，因此您需要在重建和基准测试之间切换环境。\n\n基准测试会使用ACE0姿态文件并拟合一个Nerfacto模型。可选地，您也可以使用我们的基准测试脚本生成Nerfstudio的输入文件，而无需运行基准测试，参见`--no_run_nerfstudio`标志。\n\n如果您确实运行基准测试，默认会应用1\u002F8的图像划分来计算PSNR。我们为[论文实验]提供的脚本可以选择性地使用正确的划分在每个数据集上运行基准测试。\n\n由于基准测试结果存储在一个嵌套结构中，我们提供了一个脚本来提取PSNR值：\n\n```shell\n# 在提供的顶级文件夹中以子文件夹的形式显示所有场景的基准测试结果\npython scripts\u002Fshow_benchmark_results.py \u002Fpath\u002Fto\u002Ftop\u002Flevel\u002Fresults\u002Ffolder\n```\n\n该脚本假设文件夹结构为：每个场景都是某个数据集专用顶级文件夹下的一个子文件夹。例如，`benchmark\u002F7scenes`包含子文件夹`chess`、`fire`、`heads`等。\n\n在对重建结果进行基准测试后，您可以使用Nerfstudio的查看器加载NeRF模型，渲染视频等。\n\n```shell\nns-viewer --load-config 
Using the diffusion prior requires additional setup. First, install and activate the `ace0_priors` conda environment following the instructions of the [SCR priors repository](https://github.com/nianticspatial/scr-priors), since the diffusion prior has additional dependencies. Second, download the pre-trained diffusion prior model from [here](https://storage.googleapis.com/niantic-lon-static/research/scr-priors/diffusion_prior.pt).

```shell
# run ACE0 with the diffusion prior
python ace_zero.py "/path/to/some/images/*.jpg" result_folder --loss_structure "dsac*" --prior_loss_type diffusion --prior_loss_weight 200 --prior_diffusion_model_path /path/to/diffusion_prior.pt
```

The diffusion prior (the `diffusion` folder) contains code from the following projects:
  * [denoising-diffusion-pytorch](https://github.com/lucidrains/denoising-diffusion-pytorch/tree/main) (MIT license)
  * [projection-conditioned-point-cloud-diffusion](https://github.com/lukemelas/projection-conditioned-point-cloud-diffusion/tree/main) (MIT license)
  * [pvcnn](https://github.com/mit-han-lab/pvcnn) (MIT license)

For details on all reconstruction priors, see the README of the [SCR priors repository](https://github.com/nianticspatial/scr-priors).

#### Standard relocalization

ACE0 is built on top of the [ACE code base](https://github.com/nianticlabs/ace) and fully supports standard relocalization as proposed in the ACE paper (CVPR 2023). `train_ace.py` can be used to map a scene from posed images, and `register_mapping.py` can be used to relocalize query images.

Unlike ACE0, standard ACE (i.e. `train_ace.py` + `register_mapping.py`) supports individual focal lengths per image, via the `--calibration_files` and `--calibration_file_f_idx` options of both scripts.

```shell
# map a scene with ACE from posed images
python train_ace.py "/path/to/mapping/images/*.jpg" result_folder/ace_network.pt --pose_files "/path/to/mapping/poses/*.txt" --calibration_files "/path/to/mapping/calibrations/*.txt"

# relocalize query images with ACE
python register_mapping.py "/path/to/query/images/*.jpg" result_folder/ace_network.pt --calibration_files "/path/to/query/calibrations/*.txt" --session query
```
The relocalization results are stored in `poses_query.txt`.
We provide an evaluation script that compares estimated poses to the ground truth poses of the queries; see [utility scripts](#utility-scripts) for details.

```shell
# compare estimated and ground truth poses, assuming they are already aligned in the same coordinate system
python eval_poses.py result_folder/poses_query.txt "/path/to/ground/truth/poses/*.txt" --estimate_alignment none
```

Of course, all the extensions that ACE0 offers can be enabled in the relocalization setting via the corresponding parameters, e.g. estimating a shared focal length, mapping with RGB-D images, early stopping, or refining the mapping poses.

#### Self-supervised relocalization

You can run ACE0 on a set of mapping images, and then call `register_mapping.py` on a separate set of query images for relocalization. Here, ACE0 runs on the mapping set, while `register_mapping.py` runs on the query set.

```shell
# run ACE0 on the mapping images
python ace_zero.py "/path/to/mapping/images/*.jpg" result_folder --use_external_focal_length <focal_length> --refine_calibration False

# relocalize the query images
python register_mapping.py "/path/to/query/images/*.jpg" result_folder/iterationX.pt --use_external_focal_length <focal_length> --session query
```

You need to point `register_mapping.py` to the ACE model of the last mapping iteration (e.g. `iterationX.pt`). The relocalization results are stored in `poses_query.txt`. Note that ACE0 reconstructions are in approximately metric scale only. If you want to compare the query poses to ground truth poses, you first have to fit a similarity transform. We provide a script for this.

```shell
# compare estimated and ground truth poses, after fitting a similarity transform for alignment
python eval_poses.py result_folder/poses_query.txt "/path/to/ground/truth/poses/*.txt"
```

For more information about the evaluation script, see [utility scripts](#utility-scripts).
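For intuition about what fitting such a similarity transform involves, here is a minimal least-squares (Umeyama-style) alignment sketch over corresponding camera positions. It is an illustration only; `eval_poses.py` defaults to a more robust RANSAC-based trajectory alignment, as described under [utility scripts](#utility-scripts):

```python
# similarity_alignment_sketch.py -- illustration of least-squares alignment
import numpy as np

def umeyama(src, dst):
    """Fit scale s, rotation R, translation t with s * R @ src_i + t ~= dst_i.
    src, dst: (N, 3) arrays of corresponding camera positions."""
    mu_s, mu_d = src.mean(0), dst.mean(0)
    src_c, dst_c = src - mu_s, dst - mu_d
    cov = dst_c.T @ src_c / len(src)
    U, D, Vt = np.linalg.svd(cov)
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1  # avoid reflections
    R = U @ S @ Vt
    s = np.trace(np.diag(D) @ S) / src_c.var(0).sum()
    t = mu_d - s * R @ mu_s
    return s, R, t

# toy check: recover a known scale, rotation and translation
rng = np.random.default_rng(0)
src = rng.normal(size=(100, 3))
angle = np.pi / 6
R_true = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                   [np.sin(angle),  np.cos(angle), 0.0],
                   [0.0, 0.0, 1.0]])
dst = 2.0 * src @ R_true.T + np.array([1.0, -2.0, 0.5])
s, R, t = umeyama(src, dst)
print(np.isclose(s, 2.0), np.allclose(R, R_true), np.allclose(t, [1.0, -2.0, 0.5]))
```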
#### Training NeRF or Gaussian splatting models

For instructions on how to use Nerfstudio on top of ACE0, see [Benchmark](#benchmark).

### Utility scripts

#### Video to dataset

We provide a script that extracts frames from MP4 videos using ffmpeg.

```shell
python datasets/video_to_dataset.py datasets
```

The script looks for all MP4 files in the target folder (here, `datasets`) and extracts the frames of each video into a subfolder `datasets/video_<mp4_file_name>`.

#### Export 3D scene as point cloud

We provide a script that exports the ACE point cloud from a network and a pose file.

```shell
python export_point_cloud.py point_cloud_out.txt --network /path/to/ace_network.pt --pose_file /path/to/poses_final.txt
```

The script writes either a TXT or a PLY file, depending on the extension of the output file you specify. With a .txt extension, the script writes the point cloud as text, one point per line in the format `(x y z r g b)`. With a .ply extension, the script writes the point cloud as a binary PLY file. Both formats can be imported into most 3D software, e.g. Meshlab or CloudCompare. The PLY format is also recognized by Nerfstudio for initializing Gaussian splatting models.

Note that you can also point the script to an existing visualization buffer, `result_folder/renderings/iterationX_mapping.pkl`, which already contains the point cloud, so it does not have to be generated again.

Point clouds can be exported using the OpenGL or the OpenCV coordinate convention. Nerfstudio expects OpenCV coordinates. The script can extract sparse or dense point clouds. Sparse point clouds have more filters applied and look cleaner; dense point clouds work better for Gaussian splatting, especially for larger image collections (2000+ images), since they cover the background better.
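If you exported the TXT variant and later need a PLY after all, the conversion is easy to do yourself. A minimal illustrative sketch that writes an ASCII PLY (the export script itself writes binary PLY; the sketch assumes whitespace-separated values with colors in the 0-255 range):

```python
# txt_to_ply.py -- illustrative converter for the TXT point cloud output
import numpy as np

def txt_to_ascii_ply(txt_path, ply_path):
    """Convert 'x y z r g b' lines into an ASCII PLY point cloud."""
    points = np.loadtxt(txt_path)  # expected shape: (N, 6)
    with open(ply_path, "w") as f:
        f.write("ply\nformat ascii 1.0\n")
        f.write(f"element vertex {len(points)}\n")
        f.write("property float x\nproperty float y\nproperty float z\n")
        f.write("property uchar red\nproperty uchar green\nproperty uchar blue\n")
        f.write("end_header\n")
        for x, y, z, r, g, b in points:
            f.write(f"{x} {y} {z} {int(r)} {int(g)} {int(b)}\n")

txt_to_ascii_ply("point_cloud_out.txt", "point_cloud_out.ply")
```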
#### Export cameras as a mesh

We provide a script that exports an ACE pose file as a PLY file showing the camera positions.

```shell
python export_cameras.py /path/to/ace/pose_file.txt /path/to/output.ply
```

The script color-codes the cameras according to their confidence values. The PLY format can be imported into most 3D software, e.g. Meshlab or CloudCompare.

#### Evaluate poses against (pseudo) ground truth

We provide a script that measures the error between a set of estimated poses and a set of ground truth poses.

```shell
python eval_poses.py /path/to/ace/pose_file.txt "/path/to/ground/truth/poses/*.txt"
```

Ground truth poses are given as a glob pattern, where each file contains the 4x4 camera-to-world transformation matrix of a single image (e.g. the format provided by the 7-Scenes dataset). The correspondence between ACE estimates and ground truth pose files is established via the alphabetical ordering of the image file names.

The script computes:
* the registration rate, i.e. the percentage of estimates above a confidence threshold (default: 1000 inliers);
* the accuracy, i.e. the percentage of poses below a pose error threshold (default: 5 cm and 5°);
* the median rotation and translation errors;
* the absolute trajectory error (ATE) and the relative pose error (RPE).

Since ACE0 poses are in approximately metric scale only, and live in an arbitrary reference frame, the script fits a similarity transform between the estimated and ground truth poses before computing the errors. By default, the script uses a RANSAC-based alignment of the camera trajectories. Note that ATE and RPE errors are usually computed on top of a least-squares alignment instead. You can change the alignment strategy via the corresponding command line flags, or disable the alignment entirely (e.g. for relocalization experiments).

The evaluation script can also store the evaluation results in a text file.
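For reference, the per-pose errors behind these metrics follow the standard definitions: translation error as the Euclidean distance between camera positions, rotation error as the angle of the relative rotation. An illustrative sketch, not the repository's evaluation code:

```python
# pose_error_sketch.py -- standard per-pose error definitions, illustration only
import numpy as np

def pose_errors(T_est, T_gt):
    """Translation error (in pose units) and rotation error (in degrees)
    between two aligned 4x4 camera-to-world matrices."""
    t_err = np.linalg.norm(T_est[:3, 3] - T_gt[:3, 3])
    R_rel = T_est[:3, :3].T @ T_gt[:3, :3]
    # angle of the relative rotation, clipped for numerical safety
    cos_angle = np.clip((np.trace(R_rel) - 1.0) / 2.0, -1.0, 1.0)
    r_err = np.degrees(np.arccos(cos_angle))
    return t_err, r_err

# a pose passes the default accuracy threshold if t_err < 0.05 (5 cm,
# assuming metric poses) and r_err < 5 degrees
print(pose_errors(np.eye(4), np.eye(4)))
```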
## Benchmark

We benchmark the quality of ACE0 poses using novel view synthesis with Nerfstudio.

**Note:** All results in the paper were generated with Nerfstudio v0.3.4. We have since updated this repository to support more recent Nerfstudio versions. We verified that the benchmark results did not change noticeably after upgrading to Nerfstudio v1.1.4. However, if you observe inconsistencies with the paper in your benchmarking, we recommend downgrading to Nerfstudio v0.3.4 and checking out our code at the `eccv_2024_checkpoint` Git tag.

### Nerfacto

In our paper, we benchmark ACE0 reconstructions by training a Nerfacto model and measuring PSNR on the training/test image split of the respective dataset.

To set up the benchmark, follow the instructions in the [benchmark README](benchmarks/README.md).

Note that the benchmark runs in its own conda environment, so you need to switch environments between reconstruction and benchmarking.

The benchmark takes an ACE0 pose file and fits a Nerfacto model. Optionally, you can use our benchmarking script to just generate the Nerfstudio input files without running the benchmark itself; see the `--no_run_nerfstudio` flag.

If you do run the benchmark, a 1/8 split of the images is applied by default to compute PSNR. Our scripts for the [paper experiments](#paper-experiments) can optionally run the benchmark with the correct split for each dataset.

Since the benchmarking results are stored in a nested structure, we provide a script to extract the PSNR values:

```shell
# show benchmarking results of all scenes that are subfolders of the given top-level folder
python scripts/show_benchmark_results.py /path/to/top/level/results/folder
```

The script assumes a folder structure where each scene is a subfolder beneath a dataset-specific top-level folder. For example, `benchmark/7scenes` contains the subfolders `chess`, `fire`, `heads`, and so on.

After benchmarking a reconstruction, you can use the Nerfstudio viewer to load the NeRF model, render videos, etc.

```shell
ns-viewer --load-config /path/to/nerf/config.yaml
```

### Splatfacto

Training Gaussian splats with Splatfacto is very similar to training Nerfacto models. Splatfacto also needs a point cloud to initialize the splats. You can export one using one of our [utility scripts](#export-3d-scene-as-point-cloud), or by setting `--export_point_cloud True` when running ACE0. Our benchmark scripts look for a file named `pc_final.ply` to pass to Nerfstudio. Note that Nerfstudio proceeds even if no `pc_final.ply` is found, but the splats are then initialized uniformly, which likely results in very poor quality. You can look for the following warning in the Nerfstudio log to confirm whether a point cloud was expected but missing:

```
Warning: load_3D_points set to true but no point cloud found.
```

Other than that, just run our benchmark script with `--method splatfacto`; see the [benchmark README](benchmarks/README.md) for details. The `show_benchmark_results.py` script of the previous section also comes with a `--method splatfacto` option to show the benchmark metrics of Splatfacto models.

Note that all our scripts for the [paper experiments](#paper-experiments) support benchmarking with both Nerfacto and Splatfacto. Just switch the method at the top of each script. We encourage you to look at those scripts and use them as templates for your own experiments.

## Paper experiments

We provide scripts to run the main experiments of the paper. In the corresponding sections below, we also provide pre-computed results and the associated visualizations for all of these experiments.

### 7-Scenes

Set up the dataset.

```shell
# set up the 7-Scenes dataset in the datasets folder
cd datasets
# download and unpack the dataset
python setup_7scenes.py
# back to the repository root
cd ..
```

Optionally, the script can convert the dataset to ACE format, download alternative pseudo ground truth poses, calibrate the depth maps, etc. None of this is required for the ACE0 experiments.

(Optional, for benchmarking) Create the training/test splits of the 7-Scenes dataset for the benchmark; see the [benchmark README](benchmarks/README.md) for details.

```shell
python scripts/create_splits_7scenes.py datasets/7scenes split_files
```

Reconstruct each scene (corresponds to "ACE0" in Table 1, left).

```shell
bash scripts/reconstruct_7scenes.sh
```

By default, the script runs with benchmarking enabled (make sure the benchmark has been set up; see [Nerfacto benchmark](#benchmark)), using Nerfacto and with visualization disabled. You can change this behavior by adjusting the corresponding flags in the script, e.g. to train Gaussian splats instead of NeRF models. The ACE0 reconstruction files are stored in `reconstructions/7scenes`, and the benchmarking results in `benchmark/7scenes`. To show the benchmark results, call:

```shell
python scripts/show_benchmark_results.py benchmark/7scenes
```

To improve the KinectFusion poses with ACE (corresponds to "KF + ACE0" in Table 1, left), run:

```shell
bash scripts/reconstruct_7scenes_warmstart.sh
# show the benchmark results
python scripts/show_benchmark_results.py benchmark/7scenes_warmstart
```

Pre-computed poses and reconstruction videos for 7-Scenes are available [here](https://storage.googleapis.com/niantic-lon-static/research/acezero/results_ace0_7scenes.tar.gz). They stem from a different ACE0 run than the one used for the paper results, but the PSNR numbers are very close (±0.1 dB PSNR on average).

For some experiments in the paper (Table 1, right), we ran ACE0 and the baselines on subsets of images per scene. We provide the lists of these images, and how they were split into datasets for the view synthesis benchmark: [200 images per scene](https://storage.googleapis.com/niantic-lon-static/research/acezero/splits_7s_200frames.tar.gz) and [50 images per scene](https://storage.googleapis.com/niantic-lon-static/research/acezero/splits_7s_50frames.tar.gz).

### Mip-NeRF 360

Set up the dataset.

```shell
# set up the Mip-NeRF 360 dataset in the datasets folder
cd datasets
# download and unpack the dataset
python setup_mip360.py
# back to the repository root
cd ..
```
Optionally, the script can convert the COLMAP ground truth to ACE format, but this is not required for the ACE0 experiments.

(Optional, for benchmarking) Create the training/test splits of the Mip-NeRF 360 dataset for the benchmark; see the [benchmark README](benchmarks/README.md) for details. This uses a 1/8 split that differs slightly from the default benchmark split.

```shell
python scripts/create_splits_mip360.py datasets/mip360 split_files
```

Reconstruct each scene (corresponds to "ACE0" in Table 2(b)).

```shell
bash scripts/reconstruct_mip360.sh
```

By default, the script runs with benchmarking enabled (make sure the benchmark has been set up; see [Nerfacto benchmark](#benchmark)), using Nerfacto and with visualization disabled. You can change this behavior by adjusting the corresponding flags in the script, e.g. to train Gaussian splats instead of NeRF models. The ACE0 reconstruction files are stored in `reconstructions/mip360`, and the benchmarking results in `benchmark/mip360`. To show the benchmark results, call:

```shell
python scripts/show_benchmark_results.py benchmark/mip360
```

Pre-computed poses and reconstruction videos for Mip-NeRF 360 are available [here](https://storage.googleapis.com/niantic-lon-static/research/acezero/results_ace0_mip360.tar.gz). They stem from a different ACE0 run than the one used for the paper results, but the PSNR numbers are very close (±0.1 dB on average).

### Tanks and Temples

You have to [download the dataset](https://www.tanksandtemples.org/download/) manually. Our dataset script expects the individual archives in `datasets/t2`, without unpacking them:
```
datasets/t2/training.zip
datasets/t2/training_videos.zip
datasets/t2/intermediate.zip
datasets/t2/intermediate_videos.zip
datasets/t2/advanced.zip
datasets/t2/advanced_videos.zip
```

Set up the dataset.

```shell
# set up the T&T dataset in the datasets folder
cd datasets
# unpack the dataset
python setup_t2.py
# back to the repository root
cd ..
```

Optionally, the script can also download and set up the COLMAP ground truth poses and convert them to ACE format. This is required for the ACE0 experiment that reconstructs the dataset videos starting from sparse COLMAP reconstructions. Call the script with the `--with-colmap` option. This creates an additional `t2_colmap` directory in the datasets folder, where each scene folder contains not only the image files but also corresponding `*_pose.txt` files with 4x4 camera-to-world poses provided by COLMAP. Each scene also gets a separate `focal_length.txt` file with the focal length estimated by COLMAP.

We provide scripts for the Tanks and Temples scene groups, i.e. training, intermediate, and advanced. The instructions below use the training group as an example, but the scripts for the intermediate and advanced groups work analogously.

Reconstruct each scene from a few hundred images (corresponds to "ACE0" in Table 3, left).

```shell
bash scripts/reconstruct_t2_training.sh
```

By default, the script runs with benchmarking enabled (make sure the benchmark has been set up; see [Nerfacto benchmark](#benchmark)), using Nerfacto and with visualization disabled. You can change this behavior by adjusting the corresponding flags in the script, e.g. to train Gaussian splats instead of NeRF models. The ACE0 reconstruction files are stored in `reconstructions/t2_training`, and the benchmarking results in `benchmark/t2_training`. To show the benchmark results, call:

```shell
python scripts/show_benchmark_results.py benchmark/t2_training
```

Note that no benchmark split files need to be generated for Tanks and Temples. The benchmark will use the default 1/8 split.

To reconstruct the full video of each scene (corresponds to "ACE0" in Table 3, right), run:
```shell
bash scripts/reconstruct_t2_training_videos.sh
# show the benchmark results
python scripts/show_benchmark_results.py benchmark/t2_training_videos
```

To reconstruct the full video of each scene starting from a COLMAP reconstruction (corresponds to "Sparse COLMAP + ACE0" in Table 3, left), run:
```shell
bash scripts/reconstruct_t2_training_videos_warmstart.sh
# show the benchmark results
python scripts/show_benchmark_results.py benchmark/t2_training_videos_warmstart
```

Note that this last experiment requires that you set up the dataset with the `--with-colmap` option. The code first applies ACE mapping to the images with COLMAP poses to create an initial scene model. That model is then passed to ACE0 as a seed for reconstructing the full video. In this example, we trust the COLMAP focal length estimate and keep it fixed throughout the reconstruction.

Pre-computed poses and reconstruction videos for Tanks and Temples are available here: [training scenes](https://storage.googleapis.com/niantic-lon-static/research/acezero/results_ace0_t2_training.tar.gz), [intermediate scenes](https://storage.googleapis.com/niantic-lon-static/research/acezero/results_ace0_t2_intermediate.tar.gz), [advanced scenes](https://storage.googleapis.com/niantic-lon-static/research/acezero/results_ace0_t2_advanced.tar.gz). They stem from a different ACE0 run than the one used for the paper results, but the PSNR numbers are very close (±0.3 dB on average).
## Frequently asked questions

**Q: I want to generate Gaussian splats from my own images. What do I need to do?**

Prepare ACE0 following the instructions at the top of this document: create the ACE0 environment, compile the DSAC* bindings, and [set up Nerfstudio](benchmarks/README.md). Then run the following commands on your set of images:

```
# activate our conda environment
conda activate ace0

# run the ACE0 reconstruction and export the point cloud
python ace_zero.py "/path/to/some/images/*.jpg" result_folder --export_point_cloud True

# switch to the Nerfstudio conda environment
conda activate nerfstudio

# convert the ACE0 output to a Nerfstudio-compatible format and run Splatfacto training (includes an evaluation, but that is fast)
python -m benchmarks.benchmark_poses --pose_file result_folder/poses_final.txt --output_dir benchmark_folder --images_glob_pattern "/path/to/some/images/*.jpg" --method splatfacto

# view the Gaussian splats
ns-viewer --load-config benchmark_folder/nerf_data/nerf_for_eval/splatfacto/run/config.yaml
```

**Q: I am running out of GPU memory during an ACE0 reconstruction. What can I do?**

**A:** All experiments in the paper were run on GPUs with 16 GB of memory (e.g. NVIDIA V100/T4), and the default settings should work on such GPUs. Most of the memory is consumed by the ACE training buffer (up to roughly 8 GB). You can keep the training buffer on the CPU by adding the `--training_buffer_cpu True` flag. This sacrifices some reconstruction speed, but lowers the GPU memory footprint to roughly 1 GB.

**Q: I have a set of images with varying sizes, aspect ratios and intrinsics. Can I use ACE0?**

**A:** No. ACE0 assumes that all images share the same intrinsics, particularly the focal length. This is a limitation of the current implementation rather than of the approach itself. Supporting images with varying intrinsics should be possible in principle, but requires some implementation effort, particularly in `refine_calibration.py`.

**Q: Does ACE0 estimate intrinsics other than the focal length?**

**A:** No. ACE0 assumes that the principal point is in the image center, and that pixels are square and undistorted. A focal length shared by all images is the only intrinsic parameter that ACE0 estimates and/or refines.

**Q: My images come from a complicated camera model, e.g. with strong distortion. Can I use ACE0?**

**A:** No. The scene coordinate regression network might be able to absorb some of the distortion, but only to a limited extent. Both the ACE reprojection loss and the RANSAC pose estimator assume a pinhole camera model. Handling distortion would require adding a camera distortion model to these components. If the distortion parameters are known, we recommend undistorting the images before feeding them to ACE0.

**Q: How can I run ACE0 with depth estimates other than ZoeDepth?**

**A:** If you have pre-computed depth maps, you can call `ace_zero.py` with `--depth_files "/path/to/depths/*.png"`. In that case, ACE0 uses the provided depth maps for the seed images instead of estimating depth itself. Otherwise, you can adjust the functions `get_depth_model()` and `estimate_depth()` in `dataset_io.py` to use a depth estimator other than ZoeDepth. Note that in our experiments, the choice of depth estimation model had little impact on the final results.

**Q: Can ACE0 reconstruct from a few sparse views?**

**A:** Yes, but such cases are difficult for ACE0. We expect other approaches, even COLMAP, to work better in this setting. ACE0 relies on sufficient visual overlap between images, particularly when registering new images to the reconstruction. You can lower the registration threshold by passing `--registration_confidence` to `ace_zero.py`, setting it to 300 or even 100, but ACE0 can become unstable when the threshold is too low. ACE0 works best when reconstructing images that densely cover a scene, and it handles large image collections in reasonable time.

## Publications

If you use ACE0 or parts of its code in your own work, please cite:

```
@inproceedings{brachmann2024acezero,
    title={Scene Coordinate Reconstruction: Posing of Image Collections via Incremental Learning of a Relocalizer},
    author={Brachmann, Eric and Wynn, Jamie and Chen, Shuai and Cavallari, Tommaso and Monszpart, {\'{A}}ron and Turmukhambetov, Daniyar and Prisacariu, Victor Adrian},
    booktitle={ECCV},
    year={2024},
}
```

This code builds on the ACE relocalizer and uses the DSAC* pose estimator. Please consider citing:

```
@inproceedings{brachmann2023ace,
    title={Accelerated Coordinate Encoding: Learning to Relocalize in Minutes using RGB and Poses},
    author={Brachmann, Eric and Cavallari, Tommaso and Prisacariu, Victor Adrian},
    booktitle={CVPR},
    year={2023},
}

@article{brachmann2021dsacstar,
  title={Visual Camera Re-Localization from {RGB} and {RGB-D} Images Using {DSAC}},
  author={Brachmann, Eric and Rother, Carsten},
  journal={TPAMI},
  year={2021}
}
```

If you use capabilities associated with the Scene Coordinate Reconstruction Priors paper (ICCV 2025), please cite it as well. This includes RGB-D reconstruction, the use of probabilistic loss functions, and the diffusion prior.

```
@inproceedings{bian2024scrpriors,
    title={Scene Coordinate Reconstruction Priors},
    author={Bian, Wenjing and Barroso-Laguna, Axel and Cavallari, Tommaso and Prisacariu, Victor Adrian and Brachmann, Eric},
    booktitle={ICCV},
    year={2025},
}
```

ACE0 uses ZoeDepth to estimate the depth of seed images. Please consider citing:

```
@article{bhat2023zoedepth,
  title={Zoe{D}epth: Zero-shot transfer by combining relative and metric depth},
  author={Bhat, Shariq Farooq and Birkl, Reiner and Wofk, Diana and Wonka, Peter and M{\"u}ller, Matthias},
  journal={arXiv},
  year={2023}
}
```

This repository relies on Nerfstudio for benchmarking. Please cite it as explained in its documentation.

## License

Copyright © Niantic, Inc. 2024. Patent pending.
All rights reserved.
Please see the [license file](LICENSE) for terms.","# ACE0 (ACE Zero) Quick Start Guide

ACE0 is a scene coordinate reconstruction tool based on incremental learning. It automatically estimates camera poses from image collections and builds a scene representation. It is suitable for recovering 3D structure from unordered images or video, and can serve as a front end for NeRF or 3D Gaussian Splatting.

## Requirements

*   **Operating system**: Linux (Ubuntu 20.04 recommended); other distributions may be compatible.
*   **Hardware**: NVIDIA GPU (tested on V100). For GPUs with less memory, see the relevant optimization options.
*   **Core dependencies**:
    *   Python (managed via Conda)
    *   PyTorch
    *   A C++ compiler (to build the DSAC* RANSAC bindings)
    *   FFmpeg (optional, for rendering reconstruction videos)

## Installation

### 1. Create and activate the Conda environment
The project ships a pre-configured `environment.yml` file with all required Python dependencies.

```shell
conda env create -f environment.yml
conda activate ace0
```

> **Tip**: If downloads are slow, consider configuring a package mirror (e.g. the Tsinghua mirror) before creating the environment:
> ```shell
> conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main/
> conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free/
> ```

### 2. Build the C++/Python bindings
ACE0 depends on the RANSAC implementation of the DSAC* paper (written in C++); the bindings must be built and installed manually.

```shell
cd dsacstar
python setup.py install
cd ..
```

### 3. First-run confirmation
On first run, the script asks for confirmation before downloading the ZoeDepth depth estimation model and its pre-trained weights (used to estimate depth for the seed images). Confirm the download when prompted, or consult the official documentation to substitute a different depth model.

> **Note**: All commands below must be run from the repository root, with the `ace0` environment activated.

## Basic usage

### The simplest reconstruction
Run a from-scratch scene reconstruction and pose estimation on a set of images. Image paths can be specified with wildcards.

```shell
# run ACE0 on all jpg images under the given path; results go to result_folder
python ace_zero.py "/path/to/some/images/*.jpg" result_folder
```

**Important**: Always put **double quotes** around the image glob pattern, so the shell does not expand it prematurely and the pattern string reaches the script intact.

### Understanding the output
After the run, the main results are in `result_folder`:
*   `poses_final.txt`: The final estimated camera poses. Format: `filename quaternion (qw qx qy qz) translation (x y z) focal_length confidence`.
    *   Estimates with a confidence below 1000 should be treated as unreliable.
    *   Poses follow the OpenCV camera convention (world-to-camera transformations).
*   `pc_final.ply`: (Optional) An exported scene point cloud, useful for visualization or for initializing 3D Gaussian Splatting. Enable it with `--export_point_cloud True`.

### Custom focal length (optional)
By default, ACE0 initializes the focal length with a heuristic of 70% of the image diagonal. If you know the focal length more precisely:

```shell
# use an external focal length as the initial value and keep refining it during reconstruction
python ace_zero.py "/path/to/some/images/*.jpg" result_folder --use_external_focal_length <focal_length>

# use a fixed focal length without refinement
python ace_zero.py "/path/to/some/images/*.jpg" result_folder --use_external_focal_length <focal_length> --refine_calibration False
```

### Visualizing the reconstruction (optional)
Render a video of the reconstruction process (this slows the run down considerably):

```shell
python ace_zero.py "/path/to/some/images/*.jpg" result_folder --render_visualization True
```
The resulting video is stored as `result_folder/reconstruction.mp4`.","A drone surveying team is processing inspection images of an underground mine without GPS signal, and urgently needs a high-accuracy 3D reconstruction with solved camera poses to produce orthophotos.

### Without acezero
- Traditional structure-from-motion (SfM) pipelines fail easily in mine environments with repetitive or weak textures: feature matching goes wrong and initialization cannot complete.
- On large image collections, incremental reconstruction takes very long, and accumulated drift can break or warp the final model.
- When initial camera parameter estimates are inaccurate, tedious manual keyframe selection and correction is needed, slowing down delivery considerably.
- The advantages of neural implicit representations are out of reach: the resulting sparse point clouds are too thin to support training NeRF or Gaussian Splatting models downstream.

### With acezero
- By learning a multi-view consistent implicit scene representation, acezero regresses scene coordinates robustly even in weakly textured regions, successfully solving camera parameters for all images.
- Its incremental-learning relocalizer optimizes global consistency automatically, significantly reducing trajectory drift and producing a coherent, accurate camera trajectory without manual intervention.
- The end-to-end deep learning pipeline cuts processing time substantially; the team finishes an underground reconstruction in hours instead of days.
- It directly outputs dense, high-quality scene coordinate reconstructions that plug into downstream tasks, quickly yielding photorealistic NeRF or Gaussian Splatting scenes for visual reporting.

acezero's implicit scene coordinate regression solves the problem of automated, high-accuracy reconstruction of GPS-denied image collections in challenging environments.
","https://oss.gittoolsai.com/images/nianticlabs_acezero_9b4e156e.png","nianticlabs","Niantic Labs","https://oss.gittoolsai.com/avatars/nianticlabs_edeead43.png","Building technologies and ideas that move us",null,"https://www.nianticlabs.com","https://github.com/nianticlabs",[80,84,88,92,96],{"name":81,"color":82,"percentage":83},"Python","#3572A5",67.9,{"name":85,"color":86,"percentage":87},"C++","#f34b7d",21.8,{"name":89,"color":90,"percentage":91},"Shell","#89e051",5.1,{"name":93,"color":94,"percentage":95},"Cuda","#3A4E3A",4.9,{"name":97,"color":98,"percentage":99},"Dockerfile","#384d54",0.3,805,54,"2026-04-08T17:04:50","NOASSERTION",4,"Linux","Requires an NVIDIA GPU; tested on V100. GPU memory needs depend on the scene (see the FAQ for low-memory options). The required CUDA version is not stated explicitly.","Not specified",{"notes":109,"python":110,"dependencies":111},"1. Officially tested on Ubuntu 20.04 only; other Linux distributions may be compatible but are not guaranteed. 2. The C++/Python bindings (dsacstar) must be compiled and installed manually. 3. The first run downloads the ZoeDepth depth estimation model and its weights automatically. 4. A pre-configured conda environment file (environment.yml) and Docker support are provided. 5. The visualization feature requires ffmpeg.","Not specified (managed via conda environment.yml)",[112,113,114,115,116],"PyTorch","conda","C++/Python bindings (DSAC*)","ffmpeg (for video rendering)","ZoeDepth (downloaded automatically)",[118,14,15,119],"Video","Other",[121,122,123,124,125,126,127,128,129,130],"3d-reconstruction","camera-relocalization","computer-vision","eccv","eccv2024","machine-learning","pose-estimation","sfm","structure-from-motion","visual-relocalization","2026-03-27T02:49:30.150509","2026-04-10T18:55:18.237938",[134,139,144,149,154,159],{"id":135,"question_zh":136,"answer_zh":137,"source_url":138},27961,"How do I fix errors when running ace_zero.py under WSL2 (Ubuntu)?","Errors under WSL2 are usually caused by an incomplete CUDA environment. Besides `cuda-toolkit`, install the full set of CUDA packages, e.g.:\nsudo apt-get -y install cuda-cudart-11-8 cuda-compiler-11-8 libcublas-11-8 libcufft-11-8 libcurand-11-8 libcusolver-11-8 libcusparse-11-8\nIn addition, downgrading CUDA from 12.1 to 11.8 and recreating the conda environment (conda env create -f environment.yml) usually resolves the problem.","https://github.com/nianticlabs/acezero/issues/9",{"id":140,"question_zh":141,"answer_zh":142,"source_url":143},27962,"When will the ACE0 source code be released?","The ACE0 source code has already been released. The maintainers confirmed that the repository is up to date; you can clone it and use it directly.","https://github.com/nianticlabs/acezero/issues/1",{"id":145,"question_zh":146,"answer_zh":147,"source_url":148},27963,"How do I freeze all intrinsics (fx, fy, cx, cy) when using the model?","ACE0 currently supports only a single focal length f (i.e. fx = fy), and by default assumes the principal point (cx, cy) is at the image center without optimizing it.\n1. To fix the focal length, use `--use_external_focal_length <focal_length>` together with `--refine_calibration False`.\n2. The principal point (cx, cy) cannot currently be frozen or changed via command line options; it is hard-coded to the image center.\n3. Supporting full intrinsics (including per-image fx, fy, cx, cy) requires source modifications (see dataset.py, lines 406-412), which is an advanced customization. For small deviations from the assumed values (e.g. +/- 10%), the defaults usually still produce reasonable results.","https://github.com/nianticlabs/acezero/issues/40",{"id":150,"question_zh":151,"answer_zh":152,"source_url":153},27964,"How do I configure the parameters for the highest reconstruction quality?","For high-quality results, adjust the following parameters according to your hardware:\n1. Increase the number of seeds: use `--try_seeds` (e.g. 6) and `--seed_parallel_workers`.\n2. Increase the iteration count: set `--seed_iterations 25000`.\n3. Increase RANSAC accuracy: raise `--ransac_iterations` (e.g. 128) and tune `--ransac_threshold` (e.g. 5).\n4. If the exact focal length is known, always pass `--use_external_focal_length <value>` with `--refine_calibration False`.\n5. Adjust the image resolution via `--image_resolution` (e.g. 480 or higher, depending on GPU memory).\nNote: if the method fails completely (AUC close to 0), check whether your data meets the assumptions stated in the paper, and use ACE0's visualization to analyze the failure.","https://github.com/nianticlabs/acezero/issues/39",{"id":155,"question_zh":156,"answer_zh":157,"source_url":158},27965,"How do I use the poses produced by ACE0 for 3D Gaussian Splatting (3DGS)?","You can feed ACE0 poses into 3DGS training via Nerfstudio:\n1. Make sure `nerfacto` or `splatfacto` is installed.\n2. In `run_nerfstudio.py`, replace `nerfacto` with `splatfacto` (if you want to train Gaussian splats directly).\n3. Remove `'pipeline.datamanager.camera-optimizer.mode': 'off'` from the argument list.\n4. Generate the data with:\npython -m benchmarks.benchmark_poses --pose_file <ace_result_path>/poses_final.txt --output_dir <output_path> --images_glob_pattern \"<images_path>*.jpg\"\nSee the FAQ section of the project README for a detailed guide.","https://github.com/nianticlabs/acezero/issues/8",{"id":160,"question_zh":161,"answer_zh":162,"source_url":163},27966,"What does 'TypeError: forward_rgb(): incompatible function arguments' or a focal length confidence of 'inf' mean?","This typically occurs on specific datasets (e.g. 360_V2) and indicates an abnormal focal length estimate (e.g. a confidence of inf). Possible causes are missing image metadata, an extreme initial focal length value, or the depth estimation model failing on the scene. Check that the EXIF data of the input images is complete, or pass a reasonable focal length manually via `--use_external_focal_length` to bypass the automatic estimation. If the problem persists, the scene may be outside the method's applicability (see the limitations section of the paper).","https://github.com/nianticlabs/acezero/issues/33",[]]