[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-JonasSchult--Mask3D":3,"tool-JonasSchult--Mask3D":61},[4,18,26,36,44,53],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":17},4358,"openclaw","openclaw\u002Fopenclaw","OpenClaw 是一款专为个人打造的本地化 AI 助手，旨在让你在自己的设备上拥有完全可控的智能伙伴。它打破了传统 AI 助手局限于特定网页或应用的束缚，能够直接接入你日常使用的各类通讯渠道，包括微信、WhatsApp、Telegram、Discord、iMessage 等数十种平台。无论你在哪个聊天软件中发送消息，OpenClaw 都能即时响应，甚至支持在 macOS、iOS 和 Android 设备上进行语音交互，并提供实时的画布渲染功能供你操控。\n\n这款工具主要解决了用户对数据隐私、响应速度以及“始终在线”体验的需求。通过将 AI 部署在本地，用户无需依赖云端服务即可享受快速、私密的智能辅助，真正实现了“你的数据，你做主”。其独特的技术亮点在于强大的网关架构，将控制平面与核心助手分离，确保跨平台通信的流畅性与扩展性。\n\nOpenClaw 非常适合希望构建个性化工作流的技术爱好者、开发者，以及注重隐私保护且不愿被单一生态绑定的普通用户。只要具备基础的终端操作能力（支持 macOS、Linux 及 Windows WSL2），即可通过简单的命令行引导完成部署。如果你渴望拥有一个懂你",349277,3,"2026-04-06T06:32:30",[13,14,15,16],"Agent","开发框架","图像","数据工具","ready",{"id":19,"name":20,"github_repo":21,"description_zh":22,"stars":23,"difficulty_score":10,"last_commit_at":24,"category_tags":25,"status":17},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,"2026-04-05T11:01:52",[14,15,13],{"id":27,"name":28,"github_repo":29,"description_zh":30,"stars":31,"difficulty_score":32,"last_commit_at":33,"category_tags":34,"status":17},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",144730,2,"2026-04-07T23:26:32",[14,13,35],"语言模型",{"id":37,"name":38,"github_repo":39,"description_zh":40,"stars":41,"difficulty_score":32,"last_commit_at":42,"category_tags":43,"status":17},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",107888,"2026-04-06T11:32:50",[14,15,13],{"id":45,"name":46,"github_repo":47,"description_zh":48,"stars":49,"difficulty_score":32,"last_commit_at":50,"category_tags":51,"status":17},4721,"markitdown","microsoft\u002Fmarkitdown","MarkItDown 是一款由微软 AutoGen 团队打造的轻量级 Python 工具，专为将各类文件高效转换为 Markdown 格式而设计。它支持 PDF、Word、Excel、PPT、图片（含 OCR）、音频（含语音转录）、HTML 乃至 YouTube 链接等多种格式的解析，能够精准提取文档中的标题、列表、表格和链接等关键结构信息。\n\n在人工智能应用日益普及的今天，大语言模型（LLM）虽擅长处理文本，却难以直接读取复杂的二进制办公文档。MarkItDown 恰好解决了这一痛点，它将非结构化或半结构化的文件转化为模型“原生理解”且 Token 效率极高的 Markdown 
- **LLMs-from-scratch** (rasbt/LLMs-from-scratch, ★90,106): a PyTorch-based open-source educational project that guides you through building a ChatGPT-style large language model (LLM) from scratch, step by step. It is the official code repository for the book of the same name and provides a complete hands-on path through model development, pretraining, and fine-tuning. It attacks the "black box" problem in LLM learning: many developers can call ready-made models but cannot explain the architecture or training mechanics underneath. By writing every line of core code yourself, you come to genuinely understand the Transformer architecture, attention, and the other key principles, that is, how a large model "thinks"; the project also includes code for loading large pretrained weights for fine-tuning, carrying the theory into practice. It is an excellent resource for AI developers, researchers, and computer-science students who want to go deeper than the API surface. Its hallmark is step-by-step pedagogy: a complex systems project broken into clear stages with detailed diagrams and examples, making a small but fully functional LLM attainable, whether you are consolidating fundamentals or preparing to build larger models…

## Overview (JonasSchult/Mask3D)

Mask3D predicts accurate 3D semantic instances, achieving state-of-the-art results on ScanNet, ScanNet200, S3DIS and STPLS3D.

Mask3D is an open-source deep-learning model for 3D instance segmentation: recognizing the objects in a three-dimensional scene together with their semantic classes. In autonomous driving, robot navigation, and digital twins, getting a machine to tell apart every independent object in a scene (separate chairs, individual pedestrians) has long been difficult; Mask3D addresses this with a Mask Transformer architecture that predicts high-accuracy 3D semantic instances directly from point-cloud data.

The model posts leading results on the ScanNet, ScanNet200, S3DIS, and STPLS3D benchmarks, evidence of strong generalization and accuracy. Technically, it is built on the highly modular MinkowskiEngine framework and integrates PyTorch Lightning with Hydra configuration management: sparse convolutions process large point clouds efficiently, and the setup supports flexible experimentation and extension.

Mask3D targets computer-vision researchers, algorithm engineers, and developers working on 3D perception. For teams exploring state-of-the-art 3D segmentation, reproducing top papers, or building high-accuracy 3D understanding systems, it is a valuable reference implementation: the codebase is clean and well documented, and convenient deployment options help users apply it to real scenarios quickly.

## Mask3D: Mask Transformer for 3D Instance Segmentation
<div align="center">
<a href="https://jonasschult.github.io/">Jonas Schult</a><sup>1</sup>, <a href="https://francisengelmann.github.io/">Francis Engelmann</a><sup>2,3</sup>, <a href="https://www.vision.rwth-aachen.de/person/10/">Alexander Hermans</a><sup>1</sup>, <a href="https://orlitany.github.io/">Or Litany</a><sup>4</sup>, <a href="https://inf.ethz.ch/people/person-detail.MjYyNzgw.TGlzdC8zMDQsLTg3NDc3NjI0MQ==.html">Siyu Tang</a><sup>3</sup>, <a href="https://www.vision.rwth-aachen.de/person/1/">Bastian Leibe</a><sup>1</sup>
href=\"https:\u002F\u002Finf.ethz.ch\u002Fpeople\u002Fperson-detail.MjYyNzgw.TGlzdC8zMDQsLTg3NDc3NjI0MQ==.html\">Siyu Tang\u003C\u002Fa>\u003Csup>3\u003C\u002Fsup>,  \u003Ca href=\"https:\u002F\u002Fwww.vision.rwth-aachen.de\u002Fperson\u002F1\u002F\">Bastian Leibe\u003C\u002Fa>\u003Csup>1\u003C\u002Fsup>\n\n\u003Csup>1\u003C\u002Fsup>RWTH Aachen University \u003Csup>2\u003C\u002Fsup>ETH AI Center \u003Csup>3\u003C\u002Fsup>ETH Zurich \u003Csup>4\u003C\u002Fsup>NVIDIA\n\nMask3D predicts accurate 3D semantic instances achieving state-of-the-art on ScanNet, ScanNet200, S3DIS and STPLS3D.\n\n[![PWC](https:\u002F\u002Fimg.shields.io\u002Fendpoint.svg?url=https:\u002F\u002Fpaperswithcode.com\u002Fbadge\u002Fmask3d-for-3d-semantic-instance-segmentation\u002F3d-instance-segmentation-on-scannetv2)](https:\u002F\u002Fpaperswithcode.com\u002Fsota\u002F3d-instance-segmentation-on-scannetv2?p=mask3d-for-3d-semantic-instance-segmentation)\n[![PWC](https:\u002F\u002Fimg.shields.io\u002Fendpoint.svg?url=https:\u002F\u002Fpaperswithcode.com\u002Fbadge\u002Fmask3d-for-3d-semantic-instance-segmentation\u002F3d-instance-segmentation-on-scannet200)](https:\u002F\u002Fpaperswithcode.com\u002Fsota\u002F3d-instance-segmentation-on-scannet200?p=mask3d-for-3d-semantic-instance-segmentation)\n[![PWC](https:\u002F\u002Fimg.shields.io\u002Fendpoint.svg?url=https:\u002F\u002Fpaperswithcode.com\u002Fbadge\u002Fmask3d-for-3d-semantic-instance-segmentation\u002F3d-instance-segmentation-on-s3dis)](https:\u002F\u002Fpaperswithcode.com\u002Fsota\u002F3d-instance-segmentation-on-s3dis?p=mask3d-for-3d-semantic-instance-segmentation)\n[![PWC](https:\u002F\u002Fimg.shields.io\u002Fendpoint.svg?url=https:\u002F\u002Fpaperswithcode.com\u002Fbadge\u002Fmask3d-for-3d-semantic-instance-segmentation\u002F3d-instance-segmentation-on-stpls3d)](https:\u002F\u002Fpaperswithcode.com\u002Fsota\u002F3d-instance-segmentation-on-stpls3d?p=mask3d-for-3d-semantic-instance-segmentation)\n\n\u003Ca href=\"https:\u002F\u002Fpytorch.org\u002Fget-started\u002Flocally\u002F\">\u003Cimg alt=\"PyTorch\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FPyTorch-ee4c2c?logo=pytorch&logoColor=white\">\u003C\u002Fa>\n\u003Ca href=\"https:\u002F\u002Fpytorchlightning.ai\u002F\">\u003Cimg alt=\"Lightning\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F-Lightning-792ee5?logo=pytorchlightning&logoColor=white\">\u003C\u002Fa>\n\u003Ca href=\"https:\u002F\u002Fhydra.cc\u002F\">\u003Cimg alt=\"Config: Hydra\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FConfig-Hydra-89b8cd\">\u003C\u002Fa>\n\n![teaser](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FJonasSchult_Mask3D_readme_85049271a8b0.jpg)\n\n\u003C\u002Fdiv>\n\u003Cbr>\u003Cbr>\n\n[[Project Webpage](https:\u002F\u002Fjonasschult.github.io\u002FMask3D\u002F)]\n[[Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2210.03105)]\n[[Demo](https:\u002F\u002Ffrancisengelmann.github.io\u002Fmask3d\u002F)]\n\n\n## News\n\n* **29. October 2023**: Check out this [easy setup](https:\u002F\u002Fgithub.com\u002Fcvg\u002FMask3D) for Mask3D.\n* **17. January 2023**: Mask3D is accepted at ICRA 2023. :fire:\n* **14. October 2022**: STPLS3D support added.\n* **10. October 2022**: Mask3D ranks 2nd on the [STPLS3D Challenge](https:\u002F\u002Fcodalab.lisn.upsaclay.fr\u002Fcompetitions\u002F4646#results) hosted by the [Urban3D Workshop](https:\u002F\u002Furban3dchallenge.github.io\u002F) at ECCV 2022.\n* **6. 
## Code structure
We adapt the codebase of [Mix3D](https://github.com/kumuji/mix3d), which provides a highly modularized framework for 3D semantic segmentation based on the MinkowskiEngine.

```
├── mix3d
│   ├── main_instance_segmentation.py <- the main file
│   ├── conf                          <- hydra configuration files
│   ├── datasets
│   │   ├── preprocessing             <- folder with preprocessing scripts
│   │   ├── semseg.py                 <- indoor dataset
│   │   └── utils.py
│   ├── models                        <- Mask3D modules
│   ├── trainer
│   │   ├── __init__.py
│   │   └── trainer.py                <- train loop
│   └── utils
├── data
│   ├── processed                     <- folder for preprocessed datasets
│   └── raw                           <- folder for raw datasets
├── scripts                           <- train scripts
├── docs
├── README.md
└── saved                             <- folder that stores models and logs
```

### Dependencies :memo:
The main dependencies of the project are the following:
```yaml
python: 3.10.9
cuda: 11.3
```
You can set up a conda environment as follows:
```bash
# Some users experienced issues on Ubuntu with an AMD CPU.
# Install libopenblas-dev (issue #115, thanks WindWing):
# sudo apt-get install libopenblas-dev

export TORCH_CUDA_ARCH_LIST="6.0 6.1 6.2 7.0 7.2 7.5 8.0 8.6"

conda env create -f environment.yml

conda activate mask3d_cuda113

pip3 install torch==1.12.1+cu113 torchvision==0.13.1+cu113 --extra-index-url https://download.pytorch.org/whl/cu113
pip3 install torch-scatter -f https://data.pyg.org/whl/torch-1.12.1+cu113.html
pip3 install 'git+https://github.com/facebookresearch/detectron2.git@710e7795d0eeadf9def0e7ef957eea13532e34cf' --no-deps

mkdir third_party
cd third_party

git clone --recursive "https://github.com/NVIDIA/MinkowskiEngine"
cd MinkowskiEngine
git checkout 02fc608bea4c0549b0a7b00ca1bf15dee4a0b228
python setup.py install --force_cuda --blas=openblas

cd ..
git clone https://github.com/ScanNet/ScanNet.git
cd ScanNet/Segmentator
git checkout 3e5726500896748521a6ceb81271b0f5b2c0e7d2
make

cd ../../pointnet2
python setup.py install

cd ../../
pip3 install pytorch-lightning==1.7.2
```
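Not part of the original README: a minimal post-install sanity check, assuming the environment above built cleanly, that the CUDA build of PyTorch and the hand-compiled MinkowskiEngine actually import and run before you move on to preprocessing. The file name and the tiny tensor are illustrative only.

```python
# sanity_check.py -- not part of the repo; a minimal post-install check.
# Assumes the "mask3d_cuda113" conda env from the steps above is active.
import torch
import MinkowskiEngine as ME

print("torch:", torch.__version__)             # expect 1.12.1+cu113
print("cuda available:", torch.cuda.is_available())
print("MinkowskiEngine:", ME.__version__)

# A tiny sparse tensor exercises the compiled extension end to end.
coords = torch.IntTensor([[0, 0, 0, 0], [0, 1, 0, 0]])  # (batch, x, y, z)
feats = torch.ones(2, 3)
x = ME.SparseTensor(features=feats, coordinates=coords)
print("sparse tensor features:", x.F.shape)    # -> torch.Size([2, 3])
```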
### Data preprocessing :hammer:
After installing the dependencies, we preprocess the datasets.

#### ScanNet / ScanNet200
First, we apply Felzenszwalb and Huttenlocher's graph-based image segmentation algorithm to the test scenes using the default parameters.
Please refer to the [original repository](https://github.com/ScanNet/ScanNet/tree/master/Segmentator) for details.
Put the resulting segmentations in `./data/raw/scannet_test_segments`.
```
python -m datasets.preprocessing.scannet_preprocessing preprocess \
--data_dir="PATH_TO_RAW_SCANNET_DATASET" \
--save_dir="data/processed/scannet" \
--git_repo="PATH_TO_SCANNET_GIT_REPO" \
--scannet200=false/true
```

#### S3DIS
The S3DIS dataset contains some small bugs, which we initially fixed manually.
We will soon release a preprocessing script which directly preprocesses the original dataset. For the time being, please follow the instructions [here](https://github.com/JonasSchult/Mask3D/issues/8#issuecomment-1279535948) to fix the dataset manually. Afterwards, call the preprocessing script as follows:

```
python -m datasets.preprocessing.s3dis_preprocessing preprocess \
--data_dir="PATH_TO_Stanford3dDataset_v1.2" \
--save_dir="data/processed/s3dis"
```

#### STPLS3D
```
python -m datasets.preprocessing.stpls3d_preprocessing preprocess \
--data_dir="PATH_TO_STPLS3D" \
--save_dir="data/processed/stpls3d"
```

### Training and testing :train2:
Train Mask3D on the ScanNet dataset:
```bash
python main_instance_segmentation.py
```
Please refer to the [config scripts](https://github.com/JonasSchult/Mask3D/tree/main/scripts) (for example [here](https://github.com/JonasSchult/Mask3D/blob/main/scripts/scannet/scannet_val.sh#L15)) for detailed instructions on how to reproduce our results.
In the simplest case the inference command looks as follows:
```bash
python main_instance_segmentation.py \
general.checkpoint='PATH_TO_CHECKPOINT.ckpt' \
general.train_mode=false
```
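The `general.checkpoint=...` and `general.train_mode=false` arguments are Hydra-style command-line overrides. As a sketch of the mechanism only (not Mask3D code), here is how such dot-list overrides merge into a nested config using omegaconf, the library Hydra builds on; the default values shown are assumptions for illustration.

```python
# hydra_overrides_demo.py -- illustration only; not Mask3D code.
# Shows how dot-list overrides merge into a nested config.
from omegaconf import OmegaConf

defaults = OmegaConf.create({
    "general": {"train_mode": True, "checkpoint": None}  # illustrative defaults
})
overrides = OmegaConf.from_dotlist([
    "general.checkpoint=PATH_TO_CHECKPOINT.ckpt",
    "general.train_mode=false",
])
cfg = OmegaConf.merge(defaults, overrides)

# Dot-list values are parsed, so train_mode should come out as boolean False.
print(cfg.general.train_mode)
print(OmegaConf.to_yaml(cfg))
```

This is, in essence, what happens to the overrides passed to `main_instance_segmentation.py`.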
## Trained checkpoints :floppy_disk:
We provide detailed scores and network configurations with trained checkpoints.

### [S3DIS](http://buildingparser.stanford.edu/dataset.html) (pretrained on ScanNet train+val)
Following PointGroup, HAIS and SoftGroup, we finetune a model pretrained on ScanNet ([config](./scripts/scannet/scannet_pretrain_for_s3dis.sh) and [checkpoint](https://omnomnom.vision.rwth-aachen.de/data/mask3d/checkpoints/s3dis/scannet_pretrained/scannet_pretrained.ckpt)).

| Dataset | AP | AP_50 | AP_25 | Config | Checkpoint :floppy_disk: | Scores :chart_with_upwards_trend: | Visualizations :telescope:
|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|
| Area 1 | 69.3 | 81.9 | 87.7 | [config](scripts/s3dis/s3dis_pretrained.sh) | [checkpoint](https://omnomnom.vision.rwth-aachen.de/data/mask3d/checkpoints/s3dis/scannet_pretrained/area1_scannet_pretrained.ckpt) | [scores](./docs/detailed_scores/s3dis/scannet_pretrained/s3dis_area1_scannet_pretrained.txt) | [visualizations](https://omnomnom.vision.rwth-aachen.de/data/mask3d/visualizations/s3dis/scannet_pretrained/area_1/)
| Area 2 | 44.0 | 59.5 | 66.5 | [config](scripts/s3dis/s3dis_pretrained.sh) | [checkpoint](https://omnomnom.vision.rwth-aachen.de/data/mask3d/checkpoints/s3dis/scannet_pretrained/area2_scannet_pretrained.ckpt) | [scores](./docs/detailed_scores/s3dis/scannet_pretrained/s3dis_area2_scannet_pretrained.txt) | [visualizations](https://omnomnom.vision.rwth-aachen.de/data/mask3d/visualizations/s3dis/scannet_pretrained/area_2/)
| Area 3 | 73.4 | 83.2 | 88.2 | [config](scripts/s3dis/s3dis_pretrained.sh) | [checkpoint](https://omnomnom.vision.rwth-aachen.de/data/mask3d/checkpoints/s3dis/scannet_pretrained/area3_scannet_pretrained.ckpt) | [scores](./docs/detailed_scores/s3dis/scannet_pretrained/s3dis_area3_scannet_pretrained.txt) | [visualizations](https://omnomnom.vision.rwth-aachen.de/data/mask3d/visualizations/s3dis/scannet_pretrained/area_3/)
| Area 4 | 58.0 | 69.5 | 74.9 | [config](scripts/s3dis/s3dis_pretrained.sh) | [checkpoint](https://omnomnom.vision.rwth-aachen.de/data/mask3d/checkpoints/s3dis/scannet_pretrained/area4_scannet_pretrained.ckpt) | [scores](./docs/detailed_scores/s3dis/scannet_pretrained/s3dis_area4_scannet_pretrained.txt) | [visualizations](https://omnomnom.vision.rwth-aachen.de/data/mask3d/visualizations/s3dis/scannet_pretrained/area_4/)
| Area 5 | 57.8 | 71.9 | 77.2 | [config](scripts/s3dis/s3dis_pretrained.sh) | [checkpoint](https://omnomnom.vision.rwth-aachen.de/data/mask3d/checkpoints/s3dis/scannet_pretrained/area5_scannet_pretrained.ckpt) | [scores](./docs/detailed_scores/s3dis/scannet_pretrained/s3dis_area5_scannet_pretrained.txt) | [visualizations](https://omnomnom.vision.rwth-aachen.de/data/mask3d/visualizations/s3dis/scannet_pretrained/area_5/)
| Area 6 | 68.4 | 79.9 | 85.2 | [config](scripts/s3dis/s3dis_pretrained.sh) | [checkpoint](https://omnomnom.vision.rwth-aachen.de/data/mask3d/checkpoints/s3dis/scannet_pretrained/area6_scannet_pretrained.ckpt) | [scores](./docs/detailed_scores/s3dis/scannet_pretrained/s3dis_area6_scannet_pretrained.txt) | [visualizations](https://omnomnom.vision.rwth-aachen.de/data/mask3d/visualizations/s3dis/scannet_pretrained/area_6/)
### [S3DIS](http://buildingparser.stanford.edu/dataset.html) (from scratch)

| Dataset | AP | AP_50 | AP_25 | Config | Checkpoint :floppy_disk: | Scores :chart_with_upwards_trend: | Visualizations :telescope:
|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|
| Area 1 | 74.1 | 85.1 | 89.6 | [config](scripts/s3dis/s3dis_from_scratch.sh) | [checkpoint](https://omnomnom.vision.rwth-aachen.de/data/mask3d/checkpoints/s3dis/from_scratch/area1_from_scratch.ckpt) | [scores](./docs/detailed_scores/s3dis/from_scratch/s3dis_area1_from_scratch.txt) | [visualizations](https://omnomnom.vision.rwth-aachen.de/data/mask3d/visualizations/s3dis/from_scratch/area_1/)
| Area 2 | 44.9 | 57.1 | 67.9 | [config](scripts/s3dis/s3dis_from_scratch.sh) | [checkpoint](https://omnomnom.vision.rwth-aachen.de/data/mask3d/checkpoints/s3dis/from_scratch/area2_from_scratch.ckpt) | [scores](./docs/detailed_scores/s3dis/from_scratch/s3dis_area2_from_scratch.txt) | [visualizations](https://omnomnom.vision.rwth-aachen.de/data/mask3d/visualizations/s3dis/from_scratch/area_2/)
| Area 3 | 74.4 | 84.4 | 88.1 | [config](scripts/s3dis/s3dis_from_scratch.sh) | [checkpoint](https://omnomnom.vision.rwth-aachen.de/data/mask3d/checkpoints/s3dis/from_scratch/area3_from_scratch.ckpt) | [scores](./docs/detailed_scores/s3dis/from_scratch/s3dis_area3_from_scratch.txt) | [visualizations](https://omnomnom.vision.rwth-aachen.de/data/mask3d/visualizations/s3dis/from_scratch/area_3/)
| Area 4 | 63.8 | 74.7 | 81.1 | [config](scripts/s3dis/s3dis_from_scratch.sh) | [checkpoint](https://omnomnom.vision.rwth-aachen.de/data/mask3d/checkpoints/s3dis/from_scratch/area4_from_scratch.ckpt) | [scores](./docs/detailed_scores/s3dis/from_scratch/s3dis_area4_from_scratch.txt) | [visualizations](https://omnomnom.vision.rwth-aachen.de/data/mask3d/visualizations/s3dis/from_scratch/area_4/)
| Area 5 | 56.6 | 68.4 | 75.2 | [config](scripts/s3dis/s3dis_from_scratch.sh) | [checkpoint](https://omnomnom.vision.rwth-aachen.de/data/mask3d/checkpoints/s3dis/from_scratch/area5_from_scratch.ckpt) | [scores](./docs/detailed_scores/s3dis/from_scratch/s3dis_area5_from_scratch.txt) | [visualizations](https://omnomnom.vision.rwth-aachen.de/data/mask3d/visualizations/s3dis/from_scratch/area_5/)
| Area 6 | 73.3 | 83.4 | 87.8 | [config](scripts/s3dis/s3dis_from_scratch.sh) | [checkpoint](https://omnomnom.vision.rwth-aachen.de/data/mask3d/checkpoints/s3dis/from_scratch/area6_from_scratch.ckpt) | [scores](./docs/detailed_scores/s3dis/from_scratch/s3dis_area6_from_scratch.txt) | [visualizations](https://omnomnom.vision.rwth-aachen.de/data/mask3d/visualizations/s3dis/from_scratch/area_6/)

### [ScanNet v2](https://kaldir.vc.in.tum.de/scannet_benchmark/semantic_instance_3d?metric=ap)

| Dataset | AP | AP_50 | AP_25 | Config | Checkpoint :floppy_disk: | Scores :chart_with_upwards_trend: | Visualizations :telescope:
|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|
| ScanNet val  | 55.2 | 73.7 | 83.5 | [config](scripts/scannet/scannet_val.sh) | [checkpoint](https://omnomnom.vision.rwth-aachen.de/data/mask3d/checkpoints/scannet/scannet_val.ckpt) | [scores](./docs/detailed_scores/scannet_val.txt) | [visualizations](https://omnomnom.vision.rwth-aachen.de/data/mask3d/visualizations/scannet/val/)
| ScanNet test | 56.6 | 78.0 | 87.0 | [config](scripts/scannet/scannet_benchmark.sh) | [checkpoint](https://omnomnom.vision.rwth-aachen.de/data/mask3d/checkpoints/scannet/scannet_benchmark.ckpt) | [scores](http://kaldir.vc.in.tum.de/scannet_benchmark/result_details?id=1081) | [visualizations](https://omnomnom.vision.rwth-aachen.de/data/mask3d/visualizations/scannet/test/)
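For reference, the AP_50 and AP_25 columns in these tables count a predicted instance as a match when its mask IoU with a ground-truth instance reaches 0.5 or 0.25 respectively. A minimal sketch of that per-instance mask IoU, assuming instances are represented as boolean masks over the same set of scene points (illustrative, not the benchmark's evaluation code):

```python
# mask_iou_demo.py -- illustration of the IoU criterion behind AP_50 / AP_25.
# Assumes instance masks are boolean arrays over the same N scene points.
import numpy as np

def mask_iou(pred: np.ndarray, gt: np.ndarray) -> float:
    """Intersection-over-union of two boolean point masks."""
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return float(inter) / float(union) if union else 0.0

pred = np.zeros(100, dtype=bool); pred[:60] = True   # predicted instance
gt = np.zeros(100, dtype=bool); gt[25:80] = True     # ground-truth instance
print(f"IoU = {mask_iou(pred, gt):.2f}")
# -> 0.44: counts as a match at the 0.25 threshold but not at 0.5
```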
### [ScanNet 200](https://kaldir.vc.in.tum.de/scannet_benchmark/scannet200_semantic_instance_3d)

| Dataset | AP | AP_50 | AP_25 | Config | Checkpoint :floppy_disk: | Scores :chart_with_upwards_trend: | Visualizations :telescope:
|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|
| ScanNet200 val | 27.4 | 37.0 | 42.3 | [config](scripts/scannet200/scannet200_val.sh) | [checkpoint](https://omnomnom.vision.rwth-aachen.de/data/mask3d/checkpoints/scannet200/scannet200_val.ckpt) | [scores](./docs/detailed_scores/scannet200_val.txt) | [visualizations](https://omnomnom.vision.rwth-aachen.de/data/mask3d/visualizations/scannet200/val/)
| ScanNet200 test | 27.8 | 38.8 | 44.5 | [config](scripts/scannet200/scannet200_benchmark.sh) | [checkpoint](https://omnomnom.vision.rwth-aachen.de/data/mask3d/checkpoints/scannet200/scannet200_benchmark.ckpt) | [scores](https://kaldir.vc.in.tum.de/scannet_benchmark/result_details?id=1242) | [visualizations](https://omnomnom.vision.rwth-aachen.de/data/mask3d/visualizations/scannet200/test/)

### [STPLS3D](https://www.stpls3d.com/)

| Dataset | AP | AP_50 | AP_25 | Config | Checkpoint :floppy_disk: | Scores :chart_with_upwards_trend: | Visualizations :telescope:
|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|
| STPLS3D val | 57.3 | 74.3 | 81.6 | [config](scripts/stpls3d/stpls3d_val.sh) | [checkpoint](https://omnomnom.vision.rwth-aachen.de/data/mask3d/checkpoints/stpls3d/stpls3d_val.ckpt) | [scores](./docs/detailed_scores/stpls3d.txt) | [visualizations](https://omnomnom.vision.rwth-aachen.de/data/mask3d/visualizations/stpls3d/)
| STPLS3D test | 63.4 | 79.2 | 85.6 | [config](scripts/stpls3d/stpls3d_benchmark.sh) | [checkpoint](https://omnomnom.vision.rwth-aachen.de/data/mask3d/checkpoints/stpls3d/stpls3d_benchmark.zip) | [scores](https://codalab.lisn.upsaclay.fr/competitions/4646#results) | visualizations
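Before running inference, it can help to confirm a downloaded checkpoint is intact. Assuming it is a standard PyTorch Lightning `.ckpt` (a pickled dictionary), `torch.load` exposes a `state_dict` plus trainer metadata; this is a generic sketch, not project code.

```python
# inspect_ckpt.py -- generic sketch; assumes a standard Lightning .ckpt file.
import torch

ckpt = torch.load("PATH_TO_CHECKPOINT.ckpt", map_location="cpu")
print("top-level keys:", sorted(ckpt.keys()))   # typically includes 'state_dict', 'epoch', ...

state = ckpt["state_dict"]
print("parameter tensors:", len(state))
# Peek at a few parameter names/shapes to confirm the weights are present.
for name, tensor in list(state.items())[:5]:
    print(f"  {name}: {tuple(tensor.shape)}")
```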
## BibTeX :pray:
```
@inproceedings{Schult23ICRA,
  title     = {{Mask3D: Mask Transformer for 3D Semantic Instance Segmentation}},
  author    = {Schult, Jonas and Engelmann, Francis and Hermans, Alexander and Litany, Or and Tang, Siyu and Leibe, Bastian},
  booktitle = {{International Conference on Robotics and Automation (ICRA)}},
  year      = {2023}
}
```

# Mask3D Quickstart Guide

Mask3D is a Mask-Transformer-based 3D instance segmentation model that achieves state-of-the-art (SOTA) performance on datasets such as ScanNet and S3DIS. This guide helps you set up the environment and run the model quickly.

## Prerequisites

Before you start, make sure your system meets the following requirements:

*   **OS**: Linux (Ubuntu recommended)
    *   *Note*: some users have hit problems on Ubuntu with AMD CPUs; installing `libopenblas-dev` first is recommended (`sudo apt-get install libopenblas-dev`).
*   **Python**: 3.10.9
*   **CUDA**: 11.3
*   **GPU**: a CUDA-capable NVIDIA card

## Installation

### 1. Create the Conda environment and install base dependencies

Set the CUDA architecture list first, then create and activate the virtual environment:

```bash
export TORCH_CUDA_ARCH_LIST="6.0 6.1 6.2 7.0 7.2 7.5 8.0 8.6"

conda env create -f environment.yml
conda activate mask3d_cuda113
```

### 2. Install PyTorch and related libraries

Install the pinned versions of PyTorch, torch-scatter, and detectron2 from the official indices:

```bash
pip3 install torch==1.12.1+cu113 torchvision==0.13.1+cu113 --extra-index-url https://download.pytorch.org/whl/cu113
pip3 install torch-scatter -f https://data.pyg.org/whl/torch-1.12.1+cu113.html
pip3 install 'git+https://github.com/facebookresearch/detectron2.git@710e7795d0eeadf9def0e7ef957eea13532e34cf' --no-deps
```

### 3. Build the third-party dependencies (MinkowskiEngine, ScanNet Segmentator, PointNet2)

Create the directory and compile the required low-level libraries in order:

```bash
mkdir third_party
cd third_party

# Install MinkowskiEngine
git clone --recursive "https://github.com/NVIDIA/MinkowskiEngine"
cd MinkowskiEngine
git checkout 02fc608bea4c0549b0a7b00ca1bf15dee4a0b228
python setup.py install --force_cuda --blas=openblas

cd ..

# Install the ScanNet Segmentator
git clone https://github.com/ScanNet/ScanNet.git
cd ScanNet/Segmentator
git checkout 3e5726500896748521a6ceb81271b0f5b2c0e7d2
make

cd ../../pointnet2
python setup.py install

cd ../../
```

### 4. Install PyTorch Lightning

Finally, install the training framework:

```bash
pip3 install pytorch-lightning==1.7.2
```

## Data preprocessing

Before training or testing, preprocess the raw datasets. Assuming you have downloaded the raw data, run the following (ScanNet shown as the example):

```bash
python -m datasets.preprocessing.scannet_preprocessing preprocess \
--data_dir="PATH_TO_RAW_SCANNET_DATASET" \
--save_dir="data/processed/scannet" \
--git_repo="PATH_TO_SCANNET_GIT_REPO" \
--scannet200=false
```

*Note: for the S3DIS and STPLS3D datasets, refer to the dataset-specific preprocessing commands in the README above.*

## Basic usage

### Training

Start training on the ScanNet dataset with the default configuration:

```bash
python main_instance_segmentation.py
```

*Tip: to reproduce specific paper results or train on other datasets, see the shell scripts under `scripts/`.*

### Inference (with pretrained weights)

After downloading a pretrained checkpoint, run:

```bash
python main_instance_segmentation.py \
general.checkpoint='PATH_TO_CHECKPOINT.ckpt' \
general.train_mode=false
```

Replace `PATH_TO_CHECKPOINT.ckpt` with the actual path to your model file to obtain segmentation results.
## Use case

A smart-city team is using LiDAR scan data to build high-fidelity indoor digital-twin models in support of automated spatial analysis and facility management.

### Without Mask3D
- **Hard-to-separate instances**: traditional algorithms struggle to tell apart closely adjacent objects of the same type (such as office chairs in a row) and often merge several independent objects into one.
- **Blurry boundaries**: in complex scenes (meeting rooms, warehouses), object edges are segmented imprecisely, so the resulting 3D models have rough outlines that cannot be used for measurement directly.
- **Time-consuming manual correction**: with low automatic accuracy, engineers spend large amounts of time fixing semantic labels point by point, badly delaying delivery.
- **Weak generalization**: a model performs acceptably on its training set but degrades sharply on new scenes (malls or stations with different layouts), forcing repeated re-tuning.

### With Mask3D
- **Accurate instance separation**: the mask-transformer architecture cleanly identifies and separates tightly packed independent objects, drawing correct per-object boundaries even within the same class.
- **High-fidelity detail**: state-of-the-art segmentation accuracy on benchmarks such as ScanNet yields smooth object outlines that follow true edges, directly meeting engineering measurement needs.
- **Highly automated pipeline**: dependence on manual post-processing drops sharply; correction work that used to take days shrinks to hours, significantly raising data-processing throughput.
- **Strong scene adaptability**: thanks to good generalization, Mask3D handles indoor environments from offices to large transit hubs with little fine-tuning while keeping output quality stable.

By delivering state-of-the-art 3D semantic instance segmentation, Mask3D turns tedious point-cloud processing into an efficient, accurate, automated pipeline, greatly lowering the cost and barrier of digital-twin construction.
Aachen","Aachen",null,"https:\u002F\u002Fjonasschult.github.io\u002F","https:\u002F\u002Fgithub.com\u002FJonasSchult",[83,87,91,95],{"name":84,"color":85,"percentage":86},"Python","#3572A5",82.2,{"name":88,"color":89,"percentage":90},"Cuda","#3A4E3A",9.5,{"name":92,"color":93,"percentage":94},"C++","#f34b7d",6.7,{"name":96,"color":97,"percentage":98},"Shell","#89e051",1.6,717,127,"2026-04-06T17:21:02","MIT",5,"Linux","必需 NVIDIA GPU，支持 CUDA 架构列表：6.0, 6.1, 6.2, 7.0, 7.2, 7.5, 8.0, 8.6；需安装 CUDA 11.3","未说明",{"notes":108,"python":109,"dependencies":110},"1. Ubuntu 系统若使用 AMD CPU，需预先安装 libopenblas-dev 以避免问题。2. 需手动编译第三方库 MinkowskiEngine（需指定 --force_cuda --blas=openblas）和 ScanNet Segmentator。3. 数据集（ScanNet, S3DIS, STPLS3D）需按照特定脚本进行预处理，其中 S3DIS 数据集存在已知小 bug 需手动修复或使用特定指令处理。4. 环境变量 TORCH_CUDA_ARCH_LIST 需根据显卡架构进行设置。","3.10.9",[111,112,113,114,115,116,117,118],"torch==1.12.1+cu113","torchvision==0.13.1+cu113","pytorch-lightning==1.7.2","MinkowskiEngine (特定 commit)","torch-scatter","detectron2 (特定 commit)","hydra","libopenblas-dev (Ubuntu AMD CPU 用户可能需要)",[120,14,15],"其他",[122,123,124,125,126],"3d-computer-vision","computer-vision","deep-learning","deep-neural-networks","pytorch","2026-03-27T02:49:30.150509","2026-04-08T19:00:33.824049",[130,135,140,144,149,154],{"id":131,"question_zh":132,"answer_zh":133,"source_url":134},24946,"安装 detectron2 时遇到依赖冲突或版本问题怎么办？","如果直接使用包管理器安装最新版 detectron2 导致与项目所需的 hydra-core 或 omegaconf 版本冲突，建议不要自动安装依赖。可以尝试先手动安装项目要求的特定版本的 hydra-core (>=1.1) 和 omegaconf (>=2.1)，然后再安装 detectron2，或者参考项目文档中指定的 commit 版本来锁定依赖环境。","https:\u002F\u002Fgithub.com\u002FJonasSchult\u002FMask3D\u002Fissues\u002F28",{"id":136,"question_zh":137,"answer_zh":138,"source_url":139},24941,"在 ScanNet 或 S3DIS 数据集上进行预处理时脚本报错或无法完成，如何解决？","这通常是由数据文件中的编码错误或命名不规范引起的。具体解决步骤如下：\n1. 检查并修复编码错误：例如在 `Area_5\u002Fhallway_6\u002FAnnotations\u002Fceiling_1.txt` 的第 180389 行可能存在坏字符，删除该坏字符即可。\n2. 修正文件命名：确保文件名符合规范，例如将 `Area_6\u002FcopyRoom_1\u002Fcopy_Room_1.txt` 重命名为 `Area_6\u002FcopyRoom_1\u002FcopyRoom_1.txt`（去掉下划线）。\n3. 注意：某些错误仅出现在 \"Aligned_Version\" 版本的数据集中。修复后若仍报 \"FILE SIZE DOES NOT MATCH\" 错误，请再次确认文件大小是否一致。","https:\u002F\u002Fgithub.com\u002FJonasSchult\u002FMask3D\u002Fissues\u002F8",{"id":141,"question_zh":142,"answer_zh":143,"source_url":134},24942,"升级了 hydra-core 和 omegaconf 版本后，运行时报错 'Cannot instantiate config of type Res16UNet34C'，该如何修复？","这是因为新版本的 Hydra 实例化机制发生了变化。请修改 `models\u002Fmask3d.py` 文件第 56 行左右的代码：\n将 `self.backbone = hydra.utils.instantiate(config.backbone)` \n改为 `self.backbone = config.backbone`。\n直接赋值配置对象即可解决此实例化异常。",{"id":145,"question_zh":146,"answer_zh":147,"source_url":148},24943,"训练或测试时如何启用或禁用 DBSCAN 后处理步骤？","DBSCAN 是一个可选步骤，耗时较长。您可以通过配置文件参数来控制它：\n设置 `general.use_dbscan=true` 以启用 DBSCAN；\n设置 `general.use_dbscan=false` 以禁用 DBSCAN。","https:\u002F\u002Fgithub.com\u002FJonasSchult\u002FMask3D\u002Fissues\u002F81",{"id":150,"question_zh":151,"answer_zh":152,"source_url":153},24944,"运行时出现 WandB 错误 'entity schult not found' (404 Error)，导致无法记录日志，如何解决？","该错误是因为配置文件中指定的 WandB 实体（entity）'schult' 不存在或您没有访问权限。解决方法是修改配置文件（通常在 yaml 中），将 `entity: 'schult'` 更改为您自己的 WandB 用户名或团队名称，或者如果您不需要使用 WandB，可以暂时禁用 WandB 日志记录功能。","https:\u002F\u002Fgithub.com\u002FJonasSchult\u002FMask3D\u002Fissues\u002F66",{"id":155,"question_zh":156,"answer_zh":157,"source_url":148},24945,"在 ScanNet200 数据集上训练时，应该使用什么命令？","基本的训练命令是 `python main_instance_segmentation.py`。如果您遇到关于 GPU 设置的弃用警告（如 `Trainer(gpus=1)`），建议更新代码或配置以使用新的 PyTorch Lightning 语法：`Trainer(accelerator='gpu', devices=1)`。此外，确保已正确预处理数据集并配置了相应的数据集路径。",[]]