[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-facebookresearch--fast3r":3,"tool-facebookresearch--fast3r":64},[4,17,26,40,48,56],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":16},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,3,"2026-04-05T11:01:52",[13,14,15],"开发框架","图像","Agent","ready",{"id":18,"name":19,"github_repo":20,"description_zh":21,"stars":22,"difficulty_score":23,"last_commit_at":24,"category_tags":25,"status":16},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",107662,2,"2026-04-03T11:11:01",[13,14,15],{"id":27,"name":28,"github_repo":29,"description_zh":30,"stars":31,"difficulty_score":23,"last_commit_at":32,"category_tags":33,"status":16},2268,"ML-For-Beginners","microsoft\u002FML-For-Beginners","ML-For-Beginners 是由微软推出的一套系统化机器学习入门课程，旨在帮助零基础用户轻松掌握经典机器学习知识。这套课程将学习路径规划为 12 周，包含 26 节精炼课程和 52 道配套测验，内容涵盖从基础概念到实际应用的完整流程，有效解决了初学者面对庞大知识体系时无从下手、缺乏结构化指导的痛点。\n\n无论是希望转型的开发者、需要补充算法背景的研究人员，还是对人工智能充满好奇的普通爱好者，都能从中受益。课程不仅提供了清晰的理论讲解，还强调动手实践，让用户在循序渐进中建立扎实的技能基础。其独特的亮点在于强大的多语言支持，通过自动化机制提供了包括简体中文在内的 50 多种语言版本，极大地降低了全球不同背景用户的学习门槛。此外，项目采用开源协作模式，社区活跃且内容持续更新，确保学习者能获取前沿且准确的技术资讯。如果你正寻找一条清晰、友好且专业的机器学习入门之路，ML-For-Beginners 将是理想的起点。",84991,"2026-04-05T10:45:23",[14,34,35,36,15,37,38,13,39],"数据工具","视频","插件","其他","语言模型","音频",{"id":41,"name":42,"github_repo":43,"description_zh":44,"stars":45,"difficulty_score":10,"last_commit_at":46,"category_tags":47,"status":16},3128,"ragflow","infiniflow\u002Fragflow","RAGFlow 是一款领先的开源检索增强生成（RAG）引擎，旨在为大语言模型构建更精准、可靠的上下文层。它巧妙地将前沿的 RAG 技术与智能体（Agent）能力相结合，不仅支持从各类文档中高效提取知识，还能让模型基于这些知识进行逻辑推理和任务执行。\n\n在大模型应用中，幻觉问题和知识滞后是常见痛点。RAGFlow 通过深度解析复杂文档结构（如表格、图表及混合排版），显著提升了信息检索的准确度，从而有效减少模型“胡编乱造”的现象，确保回答既有据可依又具备时效性。其内置的智能体机制更进一步，使系统不仅能回答问题，还能自主规划步骤解决复杂问题。\n\n这款工具特别适合开发者、企业技术团队以及 AI 研究人员使用。无论是希望快速搭建私有知识库问答系统，还是致力于探索大模型在垂直领域落地的创新者，都能从中受益。RAGFlow 提供了可视化的工作流编排界面和灵活的 API 接口，既降低了非算法背景用户的上手门槛，也满足了专业开发者对系统深度定制的需求。作为基于 Apache 2.0 协议开源的项目，它正成为连接通用大模型与行业专有知识之间的重要桥梁。",77062,"2026-04-04T04:44:48",[15,14,13,38,37],{"id":49,"name":50,"github_repo":51,"description_zh":52,"stars":53,"difficulty_score":10,"last_commit_at":54,"category_tags":55,"status":16},519,"PaddleOCR","PaddlePaddle\u002FPaddleOCR","PaddleOCR 是一款基于百度飞桨框架开发的高性能开源光学字符识别工具包。它的核心能力是将图片、PDF 等文档中的文字提取出来，转换成计算机可读取的结构化数据，让机器真正“看懂”图文内容。\n\n面对海量纸质或电子文档，PaddleOCR 解决了人工录入效率低、数字化成本高的问题。尤其在人工智能领域，它扮演着连接图像与大型语言模型（LLM）的桥梁角色，能将视觉信息直接转化为文本输入，助力智能问答、文档分析等应用场景落地。\n\nPaddleOCR 适合开发者、算法研究人员以及有文档自动化需求的普通用户。其技术优势十分明显：不仅支持全球 100 多种语言的识别，还能在 Windows、Linux、macOS 等多个系统上运行，并灵活适配 CPU、GPU、NPU 等各类硬件。作为一个轻量级且社区活跃的开源项目，PaddleOCR 
既能满足快速集成的需求，也能支撑前沿的视觉语言研究，是处理文字识别任务的理想选择。",74913,"2026-04-05T10:44:17",[38,14,13,37],{"id":57,"name":58,"github_repo":59,"description_zh":60,"stars":61,"difficulty_score":23,"last_commit_at":62,"category_tags":63,"status":16},2471,"tesseract","tesseract-ocr\u002Ftesseract","Tesseract 是一款历史悠久且备受推崇的开源光学字符识别（OCR）引擎，最初由惠普实验室开发，后由 Google 维护，目前由全球社区共同贡献。它的核心功能是将图片中的文字转化为可编辑、可搜索的文本数据，有效解决了从扫描件、照片或 PDF 文档中提取文字信息的难题，是数字化归档和信息自动化的重要基础工具。\n\n在技术层面，Tesseract 展现了强大的适应能力。从版本 4 开始，它引入了基于长短期记忆网络（LSTM）的神经网络 OCR 引擎，显著提升了行识别的准确率；同时，为了兼顾旧有需求，它依然支持传统的字符模式识别引擎。Tesseract 原生支持 UTF-8 编码，开箱即用即可识别超过 100 种语言，并兼容 PNG、JPEG、TIFF 等多种常见图像格式。输出方面，它灵活支持纯文本、hOCR、PDF、TSV 等多种格式，方便后续数据处理。\n\nTesseract 主要面向开发者、研究人员以及需要构建文档处理流程的企业用户。由于它本身是一个命令行工具和库（libtesseract），不包含图形用户界面（GUI），因此最适合具备一定编程能力的技术人员集成到自动化脚本或应用程序中",73286,"2026-04-03T01:56:45",[13,14],{"id":65,"github_repo":66,"name":67,"description_en":68,"description_zh":69,"ai_summary_zh":69,"readme_en":70,"readme_zh":71,"quickstart_zh":72,"use_case_zh":73,"hero_image_url":74,"owner_login":75,"owner_name":76,"owner_avatar_url":77,"owner_bio":78,"owner_company":79,"owner_location":79,"owner_email":79,"owner_twitter":79,"owner_website":80,"owner_url":81,"languages":82,"stars":106,"forks":107,"last_commit_at":108,"license":109,"difficulty_score":10,"env_os":110,"env_gpu":111,"env_ram":112,"env_deps":113,"category_tags":124,"github_topics":79,"view_count":10,"oss_zip_url":79,"oss_zip_packed_at":79,"status":16,"created_at":125,"updated_at":126,"faqs":127,"releases":158},642,"facebookresearch\u002Ffast3r","fast3r","[CVPR 2025] Fast3R: Towards 3D Reconstruction of 1000+ Images in One Forward Pass","fast3r 是一款来自 CVPR 2025 的高效 3D 重建开源模型，旨在从多张图像中快速构建三维场景。它主要解决了传统方法在处理大规模图像序列时计算耗时过长的问题。凭借创新的技术架构，fast3r 能够在单次计算中完成超过 1000 张图片的重建，大幅提升了处理速度与扩展能力。\n\nfast3r 非常适合计算机视觉开发者、科研人员以及对 3D 技术感兴趣的创作者。开发者可以直接调用其 PyTorch 接口将模型嵌入自己的应用中，研究人员可基于此进行性能对比与改进。普通用户也能通过官方提供的 Gradio 演示界面，上传视频或图片即可实时查看 3D 重建效果及相机姿态。fast3r 提供了预训练模型和详细的使用文档，虽然安装时需要配置 Conda 环境并注意特定模块兼容性，但其带来的效率提升足以弥补这些步骤。如果你正在寻找一种既能保证精度又能应对海量数据输入的快速 3D 解决方案，fast3r 绝对值得尝试。","\u003Cdiv align=\"center\">\n\n# ⚡️Fast3R: Towards 3D Reconstruction of 1000+ Images in One Forward Pass\n\n\n${{\\color{Red}\\Huge{\\textsf{  CVPR\\ 2025\\ \\}}}}\\$\n\n\n[![Paper](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-Paper-b31b1b?logo=arxiv&logoColor=b31b1b)](https:\u002F\u002Farxiv.org\u002Fabs\u002F2501.13928)\n[![Project Website](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FFast3R-Website-4CAF50?logo=googlechrome&logoColor=white)](https:\u002F\u002Ffast3r-3d.github.io\u002F)\n[![Gradio Demo](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FGradio-Demo-orange?style=flat&logo=Gradio&logoColor=red)](https:\u002F\u002Ffast3r.ngrok.app\u002F)\n[![Hugging Face Model](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F%F0%9F%A4%97%20Hugging%20Face-Model-blue)](https:\u002F\u002Fhuggingface.co\u002Fjedyang97\u002FFast3R_ViT_Large_512\u002F)\n\u003C\u002Fdiv>\n\n![Teaser Image](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Ffacebookresearch_fast3r_readme_cd8e49d9a83a.png)\n\nOfficial implementation of **Fast3R: Towards 3D Reconstruction of 1000+ Images in One Forward Pass**, CVPR 2025\n\n*[Jianing Yang](https:\u002F\u002Fjedyang.com\u002F), [Alexander Sax](https:\u002F\u002Falexsax.github.io\u002F), [Kevin J. 
## Installation

```bash
# clone project
git clone https://github.com/facebookresearch/fast3r
cd fast3r

# create conda environment
conda create -n fast3r python=3.11 cmake=3.14.0 -y
conda activate fast3r

# install PyTorch (adjust the CUDA version to match your system)
conda install pytorch torchvision torchaudio pytorch-cuda=12.4 nvidia/label/cuda-12.4.0::cuda-toolkit -c pytorch -c nvidia

# install requirements
pip install -r requirements.txt

# install fast3r as a package (so you can import fast3r and use it in your own project)
pip install -e .
```

Note: please make sure NOT to install the cuROPE module (as is done in DUSt3R); it would corrupt Fast3R's predictions.

## Demo

Run the demo with the following command:

```bash
python fast3r/viz/demo.py
```

This automatically downloads the pre-trained model weights and config from the [Hugging Face Model](https://huggingface.co/jedyang97/Fast3R_ViT_Large_512).

The demo is a Gradio interface where you can upload images or a video and visualize the 3D reconstruction and camera pose estimation.

`fast3r/viz/demo.py` also serves as an example of how to use the model for inference.

<div>
  <img src="https://oss.gittoolsai.com/images/facebookresearch_fast3r_readme_649c273bd9cf.gif" width="45%" alt="Demo GIF 1" />
  <img src="https://oss.gittoolsai.com/images/facebookresearch_fast3r_readme_124e63946402.gif" width="45%" alt="Demo GIF 2" style="margin-left: 5%;" />
  <br>
  <em>Left: upload a video. Right: visualize the 3D reconstruction</em>
</div>

<details>
<summary>Click here for an example of visualizing the confidence heatmap, playing frame by frame, and rendering a GIF</summary>
<div style="display: flex; justify-content: center;">
  <img src="https://oss.gittoolsai.com/images/facebookresearch_fast3r_readme_1ecde1c57ae0.gif" width="100%" alt="Demo GIF 3" />
</div>
</details>
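If the automatic download is blocked, the checkpoint can be fetched ahead of time and loaded from disk, as the comment in the usage example below suggests. Here is a minimal sketch, assuming the `huggingface_hub` package (which the Hugging Face loader relies on); the local directory path is an arbitrary choice for this sketch, and the mirror endpoint is the one suggested in the community FAQ at the end of this page:

```python
import os

# Optional: route Hugging Face downloads through a mirror when huggingface.co
# is unreachable. The variable must be set before huggingface_hub is imported.
# os.environ["HF_ENDPOINT"] = "https://hf-mirror.com"

from huggingface_hub import snapshot_download

# Download the whole checkpoint directory once; the target path is arbitrary.
local_dir = snapshot_download(
    repo_id="jedyang97/Fast3R_ViT_Large_512",
    local_dir="checkpoints/Fast3R_ViT_Large_512",
)
print(f"Checkpoint cached at: {local_dir}")

# Afterwards, load from the local copy instead of the hub id:
#   model = Fast3R.from_pretrained("checkpoints/Fast3R_ViT_Large_512")
```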
## Using Fast3R in Your Own Project

To use Fast3R in your own project, import the `Fast3R` class from `fast3r.models.fast3r` and use it as a regular PyTorch model.

```python
import torch
from fast3r.dust3r.utils.image import load_images
from fast3r.dust3r.inference_multiview import inference
from fast3r.models.fast3r import Fast3R
from fast3r.models.multiview_dust3r_module import MultiViewDUSt3RLitModule

# --- Setup ---
# Load the model from Hugging Face. If you have networking issues, pre-download
# the HF checkpoint directory and point this path at the local copy instead.
model = Fast3R.from_pretrained("jedyang97/Fast3R_ViT_Large_512")
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)

# Create a lightweight Lightning module wrapper for the model.
# This provides functions to estimate camera poses, evaluate 3D reconstruction, etc.
lit_module = MultiViewDUSt3RLitModule.load_for_inference(model)

# Set the model to evaluation mode
model.eval()
lit_module.eval()

# --- Load Images ---
# Provide a list of image file paths. Images can come from different cameras
# and aspect ratios.
filelist = ["path/to/image1.jpg", "path/to/image2.jpg", "path/to/image3.jpg"]
images = load_images(filelist, size=512, verbose=True)

# --- Run Inference ---
# The inference function returns a dictionary with predictions and view information.
output_dict, profiling_info = inference(
    images,
    model,
    device,
    dtype=torch.float32,  # or torch.bfloat16 if supported
    verbose=True,
    profiling=True,
)

# --- Estimate Camera Poses ---
# This step estimates the camera-to-world (c2w) pose for each view using PnP.
poses_c2w_batch, estimated_focals = MultiViewDUSt3RLitModule.estimate_camera_poses(
    output_dict['preds'],
    niter_PnP=100,
    focal_length_estimation_method='first_view_from_global_head'
)
# poses_c2w_batch is a list; the first element contains the estimated poses for each view.
camera_poses = poses_c2w_batch[0]

# Print camera poses for all views.
for view_idx, pose in enumerate(camera_poses):
    print(f"Camera Pose for view {view_idx}:")
    print(pose.shape)  # np.array of shape (4, 4), the camera-to-world transformation matrix

# --- Extract 3D Point Clouds for Each View ---
# Each element in output_dict['preds'] corresponds to a view's point map.
for view_idx, pred in enumerate(output_dict['preds']):
    point_cloud = pred['pts3d_in_other_view'].cpu().numpy()
    print(f"Point Cloud Shape for view {view_idx}: {point_cloud.shape}")  # shape: (1, 368, 512, 3), i.e., (1, Height, Width, XYZ)
```
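The predictions above are plain NumPy arrays once moved off the GPU, so they can be exported without extra dependencies. Below is a minimal illustrative sketch, not part of the Fast3R API, that flattens the per-view point maps from `output_dict['preds']` into a single ASCII PLY file; the filename is arbitrary. (The demo itself gained a PLY download feature, per the community FAQ at the end of this page.)

```python
import numpy as np

def save_point_cloud_ply(preds, path="fast3r_points.ply"):
    """Write all per-view point maps into one ASCII PLY file.

    `preds` is output_dict['preds'] from the inference call above; each
    entry holds a (1, H, W, 3) tensor of XYZ coordinates.
    """
    # Collect every view's points into one (N, 3) array.
    clouds = [
        pred["pts3d_in_other_view"].cpu().numpy().reshape(-1, 3)
        for pred in preds
    ]
    points = np.concatenate(clouds, axis=0)

    with open(path, "w") as f:
        f.write("ply\nformat ascii 1.0\n")
        f.write(f"element vertex {len(points)}\n")
        f.write("property float x\nproperty float y\nproperty float z\n")
        f.write("end_header\n")
        np.savetxt(f, points, fmt="%.6f")  # one "x y z" row per point
    print(f"Wrote {len(points)} points to {path}")

# Usage, after running inference as shown above:
# save_point_cloud_ply(output_dict["preds"])
```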
## Training

Train the model with an experiment configuration chosen from [configs/experiment/](configs/experiment/):

```bash
python fast3r/train.py experiment=super_long_training/super_long_training
```

You can override any parameter from the command line following the [Hydra override syntax](https://hydra.cc/docs/advanced/override_grammar/basic/):

```bash
python fast3r/train.py experiment=super_long_training/super_long_training trainer.max_epochs=20 trainer.num_nodes=2
```

To submit a multi-node training job with Slurm, use the following command:

```bash
python scripts/slurm/submit_train.py --nodes=<NODES> --experiment=<EXPERIMENT>
```

After training, you can run the demo from a Lightning checkpoint with the following command:

```bash
python fast3r/viz/demo.py --is_lightning_checkpoint --checkpoint_dir=/path/to/super_long_training_999999
```

## Evaluation

To evaluate on 3D reconstruction or camera pose estimation tasks, run:

```bash
python fast3r/eval.py eval=<eval_config>
```

`<eval_config>` can be any of the evaluation configurations in [configs/eval/](configs/eval/). For example:

- `ablation_recon_better_inference_hp/ablation_recon_better_inference_hp` evaluates 3D reconstruction on the DTU, 7-Scenes, and Neural-RGBD datasets.
- `eval_cam_pose/eval_cam_pose_10views` evaluates camera pose estimation on 10 views of the CO3D dataset.

To evaluate camera poses on the RealEstate10K dataset, run:

```bash
python scripts/fast3r_re10k_pose_eval.py --subset_file scripts/re10k_test_1800.txt
```

To evaluate multi-view depth estimation on the Tanks and Temples, ETH-3D, DTU, and ScanNet datasets, follow the data download and preparation guide of [robustmvd](https://github.com/lmb-freiburg/robustmvd), install that repo's `requirements.txt` into the current conda environment, and run:

```bash
python scripts/robustmvd_eval.py
```

## Dataset Preprocessing

Please follow [DUSt3R's data preprocessing instructions](https://github.com/naver/dust3r/tree/main?tab=readme-ov-file#datasets) to prepare the data for training and evaluation. The pre-processed data is compatible with the [multi-view dataloaders](fast3r/dust3r/datasets) in this repo.

For preprocessing the DTU, 7-Scenes, and NRGBD datasets for evaluation, we follow [Spann3r's data processing instructions](https://github.com/HengyiWang/spann3r/blob/main/docs/data_preprocess.md).
## FAQ

- Q: `httpcore.ConnectError: All connection attempts failed` when launching the demo?
  - See [#34](https://github.com/facebookresearch/fast3r/issues/34). Download the example videos into a local directory.
- Q: Data pre-processing for BlendedMVS fails because `train_list.txt` is missing?
  - See [#33](https://github.com/facebookresearch/fast3r/issues/33).
- Q: How do I load a checkpoint to fine-tune Fast3R?
  - See [#25](https://github.com/facebookresearch/fast3r/issues/25).
- Q: Running the demo on Windows fails with `TypeError: cannot pickle '_thread.RLock' object`?
  - See [#28](https://github.com/facebookresearch/fast3r/issues/28). Some more work is needed to make the demo compatible with Windows; we hope the community can contribute a PR!
- Q: Completely messed-up point cloud output?
  - See [#21](https://github.com/facebookresearch/fast3r/issues/21). Please make sure the cuROPE module is NOT installed.
- Q: My GPU doesn't support FlashAttention / `No available kernel. Aborting execution`?
  - See [#17](https://github.com/facebookresearch/fast3r/issues/17). Use the `attn_implementation=pytorch_auto` option instead.
- Q: `TypeError: Fast3R.__init__() missing 3 required positional arguments: 'encoder_args', 'decoder_args', and 'head_args'`?
  - See [#7](https://github.com/facebookresearch/fast3r/issues/7). This is caused by a networking issue when downloading the model from Hugging Face in some countries (e.g., China). Please pre-download the model checkpoint with a working network configuration and load the model from a local path instead.

## License

The code and models are licensed under the [FAIR NC Research License](LICENSE).

## Contributing

See [contributing](CONTRIBUTING.md) and the [code of conduct](CODE_OF_CONDUCT.md).

## Citation

```
@InProceedings{Yang_2025_Fast3R,
    title={Fast3R: Towards 3D Reconstruction of 1000+ Images in One Forward Pass},
    author={Jianing Yang and Alexander Sax and Kevin J. Liang and Mikael Henaff and Hao Tang and Ang Cao and Joyce Chai and Franziska Meier and Matt Feiszli},
    booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month={June},
    year={2025},
}
```

## Acknowledgement

Fast3R is built upon a foundation of remarkable open-source projects. We deeply appreciate the contributions of these projects and their communities, whose efforts have significantly advanced the field and made this work possible.

- [DUSt3R](https://dust3r.europe.naverlabs.com/)
- [Spann3R](https://hengyiwang.github.io/projects/spanner)
- [Viser](https://viser.studio/main/)
- [Lightning-Hydra-Template](https://github.com/ashleve/lightning-hydra-template)

## Star History

[![Star History Chart](https://oss.gittoolsai.com/images/facebookresearch_fast3r_readme_a8d9aa385f16.png)](https://star-history.com/#facebookresearch/fast3r&Date)
## Fast3R Quickstart Guide

**Fast3R** is an open-source CVPR 2025 project for 3D reconstruction of 1000+ images in a single forward pass. This guide helps developers set up the environment and cover basic usage quickly.

### 1. Environment

- **Operating system**: Linux / macOS / Windows (the demo has known compatibility issues on Windows; prefer Linux)
- **Python**: 3.11
- **Deep-learning framework**: PyTorch (CUDA 12.4 recommended)
- **Other dependencies**: CMake (>= 3.14.0), Conda
### 2. Installation

Run the commands in the following order.

#### Clone the project
```bash
git clone https://github.com/facebookresearch/fast3r
cd fast3r
```

#### Create a Conda environment
```bash
conda create -n fast3r python=3.11 cmake=3.14.0 -y
conda activate fast3r
```

#### Install PyTorch
Adjust the command for your system's CUDA version (the example uses CUDA 12.4):
```bash
conda install pytorch torchvision torchaudio pytorch-cuda=12.4 nvidia/label/cuda-12.4.0::cuda-toolkit -c pytorch -c nvidia
```

#### Install dependencies
```bash
pip install -r requirements.txt
pip install -e .
```

> **⚠️ Important**: Do NOT install the `cuROPE` module (as is done in DUSt3R); it will corrupt Fast3R's predictions.

> **💡 Restricted networks**: The model weights are hosted on Hugging Face, so users in regions with restricted connectivity (e.g., mainland China) may see slow or failed downloads. Before running the demo or the code, manually download the model files to a local directory and load them from that local path.

### 3. Basic Usage

#### Option 1: Run the Gradio demo (visualization)
This command automatically downloads the pre-trained model and config from Hugging Face and starts a web UI.

```bash
python fast3r/viz/demo.py
```

Open the link in a browser, then upload images or a video to view the 3D reconstruction and camera pose estimates in real time.

#### Option 2: Integrate in code (inference)
To call Fast3R from your own project, follow the usage example in the README section above: load the model with `Fast3R.from_pretrained("jedyang97/Fast3R_ViT_Large_512")` (or a local path if networking is an issue), wrap it with `MultiViewDUSt3RLitModule.load_for_inference`, load inputs with `load_images(filelist, size=512)`, run `inference(...)` (with `dtype=torch.bfloat16` where the hardware supports it), then optionally call `MultiViewDUSt3RLitModule.estimate_camera_poses(...)` and read each view's point map from `output_dict['preds'][i]['pts3d_in_other_view']`.

## Use Case

A drone-surveying team is building a high-precision 3D model of a large logistics park. With thousands of overlapping photos captured at hundreds of waypoints, they need an efficient pipeline.

### Without fast3r
- A traditional SfM pipeline takes hours, sometimes overnight, to process thousands of images; the long wait slows every project iteration.
- VRAM limits force the team to split the data into chunks, which raises the risk of registration errors and breaks scene continuity.
- Every parameter change means re-running the whole pipeline, so debugging cycles are long, problems are hard to localize, and delivery slips.

### With fast3r
- With single-forward-pass inference, fast3r reconstructs thousands of images in minutes, tens of times faster than traditional methods.
- Large-scale input is supported natively: no chunking preprocessing, and the output is a coherent dense point cloud with an accurate camera trajectory.
- Engineers can preview reconstructions on site in near real time, spot occluded or blurry areas, and schedule re-shoots, greatly improving first-pass capture success.

By turning large-scale 3D reconstruction from an offline batch job into an interactive process, fast3r's inference speed unlocks the development potential of spatial computing.
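To tie the scenario back to the API: below is a minimal sketch that sweeps a directory of survey photos using only the calls from the README's usage example above. The directory name is a placeholder, and how many images fit in one pass still depends on your GPU memory:

```python
import glob
import torch
from fast3r.dust3r.utils.image import load_images
from fast3r.dust3r.inference_multiview import inference
from fast3r.models.fast3r import Fast3R
from fast3r.models.multiview_dust3r_module import MultiViewDUSt3RLitModule

# Gather every survey photo; "survey_photos/" is a hypothetical directory.
filelist = sorted(glob.glob("survey_photos/*.jpg"))
print(f"Reconstructing from {len(filelist)} images in one forward pass")

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = Fast3R.from_pretrained("jedyang97/Fast3R_ViT_Large_512").to(device)
lit_module = MultiViewDUSt3RLitModule.load_for_inference(model)
model.eval()
lit_module.eval()

# One forward pass over the whole flight's photos.
images = load_images(filelist, size=512, verbose=False)
output_dict, _ = inference(images, model, device, dtype=torch.float32,
                           verbose=False, profiling=True)

# Camera trajectory for the flight, as 4x4 camera-to-world matrices.
poses_c2w_batch, _ = MultiViewDUSt3RLitModule.estimate_camera_poses(
    output_dict["preds"], niter_PnP=100,
    focal_length_estimation_method="first_view_from_global_head",
)
print(f"Estimated {len(poses_c2w_batch[0])} camera poses")
```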
Notebook","#DA5B0B",30.9,{"name":92,"color":93,"percentage":94},"Cuda","#3A4E3A",0.3,{"name":96,"color":97,"percentage":98},"C++","#f34b7d",0.2,{"name":100,"color":101,"percentage":102},"Dockerfile","#384d54",0,{"name":104,"color":105,"percentage":102},"Shell","#89e051",1529,88,"2026-04-03T23:10:12","NOASSERTION","Linux, macOS","需要 NVIDIA GPU，CUDA 12.4，显存大小未明确说明","未说明",{"notes":114,"python":115,"dependencies":116},"1. 严禁安装 cuROPE 模块（如 DUSt3R 中那样），否则会导致预测错误；2. Windows 系统运行 Demo 存在已知兼容性问题（pickle 错误）；3. 首次运行会自动从 Hugging Face 下载预训练模型权重；4. 若 GPU 不支持 FlashAttention，需使用 attn_implementation=pytorch_auto 选项；5. 数据预处理需参考 DUSt3R 或 Spann3R 的指南进行准备。","3.11",[117,118,119,120,121,122,123],"pytorch","torchvision","torchaudio","cmake","gradio","pytorch-lightning","hydra",[14,37],"2026-03-27T02:49:30.150509","2026-04-06T05:32:14.642312",[128,133,138,143,148,153],{"id":129,"question_zh":130,"answer_zh":131,"source_url":132},2644,"Demo 运行到 \"ViserServerManager started\" 卡住或点云可视化混乱怎么办？","尝试卸载 cuROPE 库。根据 Issue #13 的维护者建议，这通常与 cuROPE 库冲突导致的问题有关，请参照 Issue #21 的相关讨论进行排查。","https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002Ffast3r\u002Fissues\u002F13",{"id":134,"question_zh":135,"answer_zh":136,"source_url":137},2645,"推理阶段如何获取深度图和相机内参？`pts3d_local` 中出现负值是否正常？","计算最终深度图时应直接使用 `pts3d_local`。即使其中包含负值，这也是局部坐标系的正常表现，无需额外转换。维护者确认应使用 `pts3d_local` 而非全局对齐后的坐标。","https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002Ffast3r\u002Fissues\u002F19",{"id":139,"question_zh":140,"answer_zh":141,"source_url":142},2646,"GS Bundle Adjustment 中近场失真或单视图重建效果差的原因是什么？","问题主要在于数据覆盖度而非单视图本身。Fast3R\u002FDUSt3R 对训练数据分布敏感，如果图像包含大量训练集中未见的区域，会导致 GauBA 崩溃或近场扭曲。建议检查输入图像是否在训练集覆盖范围内。","https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002Ffast3r\u002Fissues\u002F53",{"id":144,"question_zh":145,"answer_zh":146,"source_url":147},2647,"初始化 Fast3R 时报错缺少参数或模型下载失败如何解决？","检查网络环境，特别是国内用户可能需要配置镜像源。尝试设置环境变量 `export HF_ENDPOINT=\"https:\u002F\u002Fhf-mirror.com\"` 以确保 HuggingFace 模型能正常下载。确保模型权重完整后再初始化实例。","https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002Ffast3r\u002Fissues\u002F7",{"id":149,"question_zh":150,"answer_zh":151,"source_url":152},2648,"如何将预测的点云渲染到真实相机位姿上并对比？","由于相机位姿是独立预测的，轨迹形状可能与真值不同，仅靠缩放、旋转和平移难以完全对齐。社区建议尝试使用 ICP 算法获取变换矩阵，但需注意因轨迹形状差异导致的误差，目前尚无完美对齐方案。","https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002Ffast3r\u002Fissues\u002F37",{"id":154,"question_zh":155,"answer_zh":156,"source_url":157},2649,"如何在 demo.py 中保存渲染后的点云文件？","该功能已在后续版本中合并（参考 PR #57）。现在支持下载字节编码的 PLY 文件，不再依赖服务器端的 open3d 库。具体实现可参考相关 Pull Request 中的代码逻辑。","https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002Ffast3r\u002Fissues\u002F54",[]]