[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-drprojects--superpoint_transformer":3,"tool-drprojects--superpoint_transformer":61},[4,18,26,36,44,53],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":17},4358,"openclaw","openclaw\u002Fopenclaw","OpenClaw 是一款专为个人打造的本地化 AI 助手，旨在让你在自己的设备上拥有完全可控的智能伙伴。它打破了传统 AI 助手局限于特定网页或应用的束缚，能够直接接入你日常使用的各类通讯渠道，包括微信、WhatsApp、Telegram、Discord、iMessage 等数十种平台。无论你在哪个聊天软件中发送消息，OpenClaw 都能即时响应，甚至支持在 macOS、iOS 和 Android 设备上进行语音交互，并提供实时的画布渲染功能供你操控。\n\n这款工具主要解决了用户对数据隐私、响应速度以及“始终在线”体验的需求。通过将 AI 部署在本地，用户无需依赖云端服务即可享受快速、私密的智能辅助，真正实现了“你的数据，你做主”。其独特的技术亮点在于强大的网关架构，将控制平面与核心助手分离，确保跨平台通信的流畅性与扩展性。\n\nOpenClaw 非常适合希望构建个性化工作流的技术爱好者、开发者，以及注重隐私保护且不愿被单一生态绑定的普通用户。只要具备基础的终端操作能力（支持 macOS、Linux 及 Windows WSL2），即可通过简单的命令行引导完成部署。如果你渴望拥有一个懂你",349277,3,"2026-04-06T06:32:30",[13,14,15,16],"Agent","开发框架","图像","数据工具","ready",{"id":19,"name":20,"github_repo":21,"description_zh":22,"stars":23,"difficulty_score":10,"last_commit_at":24,"category_tags":25,"status":17},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,"2026-04-05T11:01:52",[14,15,13],{"id":27,"name":28,"github_repo":29,"description_zh":30,"stars":31,"difficulty_score":32,"last_commit_at":33,"category_tags":34,"status":17},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",143909,2,"2026-04-07T11:33:18",[14,13,35],"语言模型",{"id":37,"name":38,"github_repo":39,"description_zh":40,"stars":41,"difficulty_score":32,"last_commit_at":42,"category_tags":43,"status":17},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",107888,"2026-04-06T11:32:50",[14,15,13],{"id":45,"name":46,"github_repo":47,"description_zh":48,"stars":49,"difficulty_score":32,"last_commit_at":50,"category_tags":51,"status":17},4721,"markitdown","microsoft\u002Fmarkitdown","MarkItDown 是一款由微软 AutoGen 团队打造的轻量级 Python 工具，专为将各类文件高效转换为 Markdown 格式而设计。它支持 PDF、Word、Excel、PPT、图片（含 OCR）、音频（含语音转录）、HTML 乃至 YouTube 链接等多种格式的解析，能够精准提取文档中的标题、列表、表格和链接等关键结构信息。\n\n在人工智能应用日益普及的今天，大语言模型（LLM）虽擅长处理文本，却难以直接读取复杂的二进制办公文档。MarkItDown 
恰好解决了这一痛点，它将非结构化或半结构化的文件转化为模型“原生理解”且 Token 效率极高的 Markdown 格式，成为连接本地文件与 AI 分析 pipeline 的理想桥梁。此外，它还提供了 MCP（模型上下文协议）服务器，可无缝集成到 Claude Desktop 等 LLM 应用中。\n\n这款工具特别适合开发者、数据科学家及 AI 研究人员使用，尤其是那些需要构建文档检索增强生成（RAG）系统、进行批量文本分析或希望让 AI 助手直接“阅读”本地文件的用户。虽然生成的内容也具备一定可读性，但其核心优势在于为机器",93400,"2026-04-06T19:52:38",[52,14],"插件",{"id":54,"name":55,"github_repo":56,"description_zh":57,"stars":58,"difficulty_score":10,"last_commit_at":59,"category_tags":60,"status":17},4487,"LLMs-from-scratch","rasbt\u002FLLMs-from-scratch","LLMs-from-scratch 是一个基于 PyTorch 的开源教育项目，旨在引导用户从零开始一步步构建一个类似 ChatGPT 的大型语言模型（LLM）。它不仅是同名技术著作的官方代码库，更提供了一套完整的实践方案，涵盖模型开发、预训练及微调的全过程。\n\n该项目主要解决了大模型领域“黑盒化”的学习痛点。许多开发者虽能调用现成模型，却难以深入理解其内部架构与训练机制。通过亲手编写每一行核心代码，用户能够透彻掌握 Transformer 架构、注意力机制等关键原理，从而真正理解大模型是如何“思考”的。此外，项目还包含了加载大型预训练权重进行微调的代码，帮助用户将理论知识延伸至实际应用。\n\nLLMs-from-scratch 特别适合希望深入底层原理的 AI 开发者、研究人员以及计算机专业的学生。对于不满足于仅使用 API，而是渴望探究模型构建细节的技术人员而言，这是极佳的学习资源。其独特的技术亮点在于“循序渐进”的教学设计：将复杂的系统工程拆解为清晰的步骤，配合详细的图表与示例，让构建一个虽小但功能完备的大模型变得触手可及。无论你是想夯实理论基础，还是为未来研发更大规模的模型做准备",90106,"2026-04-06T11:19:32",[35,15,13,14],{"id":62,"github_repo":63,"name":64,"description_en":65,"description_zh":66,"ai_summary_zh":67,"readme_en":68,"readme_zh":69,"quickstart_zh":70,"use_case_zh":71,"hero_image_url":72,"owner_login":73,"owner_name":74,"owner_avatar_url":75,"owner_bio":76,"owner_company":77,"owner_location":78,"owner_email":79,"owner_twitter":80,"owner_website":81,"owner_url":82,"languages":83,"stars":96,"forks":97,"last_commit_at":98,"license":99,"difficulty_score":100,"env_os":101,"env_gpu":102,"env_ram":101,"env_deps":103,"category_tags":110,"github_topics":112,"view_count":32,"oss_zip_url":80,"oss_zip_packed_at":80,"status":17,"created_at":130,"updated_at":131,"faqs":132,"releases":161},5066,"drprojects\u002Fsuperpoint_transformer","superpoint_transformer","Official PyTorch implementation of Superpoint Transformer introduced in [ICCV'23] \"Efficient 3D Semantic Segmentation with Superpoint Transformer\" and SuperCluster introduced in [3DV'24 Oral] \"Scalable 3D Panoptic Segmentation As Superpoint Graph Clustering\"","Superpoint Transformer 是一款基于 PyTorch 开发的开源深度学习架构，专为大规模 3D 场景的语义分割任务设计。它源自 ICCV'23 等顶级会议的研究成果，旨在解决传统点云处理方法在面对海量数据时计算效率低、显存占用高以及难以捕捉长距离依赖关系的痛点。\n\n该工具的核心亮点在于其独特的“超点”（Superpoint）机制。它首先通过快速算法将原始点云分层聚类为具有语义一致性的超点结构，大幅减少了需要处理的数据单元数量；随后利用自注意力机制，在多尺度上高效挖掘超点间的空间与语义关联。这种设计不仅显著提升了推理速度，还在 S3DIS、KITTI-360 和 DALES 等多个权威数据集上取得了领先的精度表现。此外，项目还衍生支持了全景分割（SuperCluster）及更轻量级的 EZ-SP 方案，展现了极强的扩展性。\n\nSuperpoint Transformer 非常适合从事计算机视觉、自动驾驶感知、机器人导航及数字孪生领域的研究人员与开发者使用。如果你需要在资源受限的环境下处理复杂的 3D 点云数据，或希望复现前沿的 3D 分割算法，这将是一个高效且可靠的选","Superpoint Transformer 是一款基于 PyTorch 开发的开源深度学习架构，专为大规模 3D 场景的语义分割任务设计。它源自 ICCV'23 等顶级会议的研究成果，旨在解决传统点云处理方法在面对海量数据时计算效率低、显存占用高以及难以捕捉长距离依赖关系的痛点。\n\n该工具的核心亮点在于其独特的“超点”（Superpoint）机制。它首先通过快速算法将原始点云分层聚类为具有语义一致性的超点结构，大幅减少了需要处理的数据单元数量；随后利用自注意力机制，在多尺度上高效挖掘超点间的空间与语义关联。这种设计不仅显著提升了推理速度，还在 S3DIS、KITTI-360 和 DALES 等多个权威数据集上取得了领先的精度表现。此外，项目还衍生支持了全景分割（SuperCluster）及更轻量级的 EZ-SP 方案，展现了极强的扩展性。\n\nSuperpoint Transformer 非常适合从事计算机视觉、自动驾驶感知、机器人导航及数字孪生领域的研究人员与开发者使用。如果你需要在资源受限的环境下处理复杂的 3D 点云数据，或希望复现前沿的 3D 分割算法，这将是一个高效且可靠的选择。","\u003Cdiv align=\"center\">\n\n# Superpoint 
[![python](https://img.shields.io/badge/-Python_3.8+-blue?logo=python&logoColor=white)](https://github.com/pre-commit/pre-commit)
[![pytorch](https://img.shields.io/badge/PyTorch_2.2+-ee4c2c?logo=pytorch&logoColor=white)](https://pytorch.org/get-started/locally/)
[![lightning](https://img.shields.io/badge/-Lightning_2.2+-792ee5?logo=pytorchlightning&logoColor=white)](https://pytorchlightning.ai/)
[![hydra](https://img.shields.io/badge/Config-Hydra_1.3-89b8cd)](https://hydra.cc/)
[![license](https://img.shields.io/badge/License-MIT-green.svg?labelColor=gray)](https://github.com/ashleve/lightning-hydra-template#license)

Official implementation for
<br>
<br>
[_Efficient 3D Semantic Segmentation with Superpoint Transformer_](https://arxiv.org/abs/2306.08045) (ICCV'23)
<br>
[![arXiv](https://img.shields.io/badge/arxiv-2306.08045-b31b1b.svg)](https://arxiv.org/abs/2306.08045)
[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.8042712.svg)](https://doi.org/10.5281/zenodo.8042712)
[![Project page](https://img.shields.io/badge/Project_page-8A2BE2)](https://drprojects.github.io/superpoint-transformer-site)
[![Tutorial](https://img.shields.io/badge/Tutorial-FFC300)](https://www.youtube.com/watch?v=2qKhpQs9gJw)
<br>
<br>
[_Scalable 3D Panoptic Segmentation As Superpoint Graph Clustering_](https://arxiv.org/abs/2401.06704) (3DV'24 Oral)
<br>
[![arXiv](https://img.shields.io/badge/arxiv-2401.06704-b31b1b.svg)](https://arxiv.org/abs/2401.06704)
[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.10689037.svg)](https://doi.org/10.5281/zenodo.10689037)
[![Project page](https://img.shields.io/badge/Project_page-8A2BE2)](https://drprojects.github.io/supercluster-site)
<br>
<br>
[_EZ-SP: Fast and Lightweight Superpoint-Based 3D Segmentation_](https://arxiv.org/abs/2512.00385) (ICRA'26)
<br>
[![arXiv](https://img.shields.io/badge/arxiv-2512.00385-b31b1b.svg)](https://arxiv.org/abs/2512.00385)
[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.18329602.svg)](https://doi.org/10.5281/zenodo.18329602)
[![Project page](https://img.shields.io/badge/Project_page-8A2BE2)](https://louisgeist.github.io/ez-sp/)
<br>
<br>
**If you ❤️ or simply use this project, don't forget to give the repository a ⭐,
it means a lot to us !**
<br>
</div>

<br>

## 📌  Description

### Superpoint Transformer

<p align="center">
  <img width="80%" src="https://oss.gittoolsai.com/images/drprojects_superpoint_transformer_readme_a3abb423a8c1.png">
</p>
src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fdrprojects_superpoint_transformer_readme_a3abb423a8c1.png\">\n\u003C\u002Fp>\n\n**Superpoint Transformer (SPT)** is a superpoint-based transformer 🤖 architecture that efficiently ⚡ \nperforms **semantic segmentation** on large-scale 3D scenes. This method includes a \nfast algorithm that partitions 🧩 point clouds into a hierarchical superpoint \nstructure, as well as a self-attention mechanism to exploit the relationships \nbetween superpoints at multiple scales. \n\n\u003Cdiv align=\"center\">\n\n|                                                                                   ✨ SPT in numbers ✨                                                                                      |\n|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------:|\n|                                                                          📊 **S3DIS 6-Fold** (76.0 mIoU)                                                                          |\n|                                                                         📊 **KITTI-360 Val** (63.5 mIoU)                                                                          |\n|                                                                           📊 **DALES** (79.6 mIoU)                                                                           | \n|      🦋 **212k parameters** ([PointNeXt](https:\u002F\u002Fgithub.com\u002Fguochengqian\u002FPointNeXt) ÷ 200, [Stratified Transformer](https:\u002F\u002Fgithub.com\u002Fdvlab-research\u002FStratified-Transformer) ÷ 40)       | \n| ⚡ S3DIS training in **3h on 1 GPU** ([PointNeXt](https:\u002F\u002Fgithub.com\u002Fguochengqian\u002FPointNeXt) ÷ 7, [Stratified Transformer](https:\u002F\u002Fgithub.com\u002Fdvlab-research\u002FStratified-Transformer) ÷ 70) | \n|                                                  ⚡ **Preprocessing x7 faster than [SPG](https:\u002F\u002Fgithub.com\u002Floicland\u002Fsuperpoint_graph)**                                                   |\n\n[![PWC](https:\u002F\u002Fimg.shields.io\u002Fendpoint.svg?url=https:\u002F\u002Fpaperswithcode.com\u002Fbadge\u002Fefficient-3d-semantic-segmentation-with-1\u002F3d-semantic-segmentation-on-s3dis)](https:\u002F\u002Fpaperswithcode.com\u002Fsota\u002F3d-semantic-segmentation-on-s3dis?p=efficient-3d-semantic-segmentation-with-1)\n[![PWC](https:\u002F\u002Fimg.shields.io\u002Fendpoint.svg?url=https:\u002F\u002Fpaperswithcode.com\u002Fbadge\u002Fefficient-3d-semantic-segmentation-with-1\u002F3d-semantic-segmentation-on-dales)](https:\u002F\u002Fpaperswithcode.com\u002Fsota\u002F3d-semantic-segmentation-on-dales?p=efficient-3d-semantic-segmentation-with-1)\n[![PWC](https:\u002F\u002Fimg.shields.io\u002Fendpoint.svg?url=https:\u002F\u002Fpaperswithcode.com\u002Fbadge\u002Fefficient-3d-semantic-segmentation-with-1\u002Fsemantic-segmentation-on-s3dis)](https:\u002F\u002Fpaperswithcode.com\u002Fsota\u002Fsemantic-segmentation-on-s3dis?p=efficient-3d-semantic-segmentation-with-1)\n[![PWC](https:\u002F\u002Fimg.shields.io\u002Fendpoint.svg?url=https:\u002F\u002Fpaperswithcode.com\u002Fbadge\u002Fefficient-3d-semantic-segmentation-with-1\u002F3d-semantic-segmentation-on-kitti-360)](https:\u002F\u002Fpaperswithcode.com\u002Fsota\u002F3d-semantic-segmentation-on-kitti-360?p=efficient-3d-semantic-segmentation-with-1)\n\u003C\u002Fdiv>\n\n### SuperCluster\n\n\u003Cp 
align=\"center\">\n  \u003Cimg width=\"80%\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fdrprojects_superpoint_transformer_readme_f80c4b26478f.png\">\n\u003C\u002Fp>\n\n**SuperCluster** is a superpoint-based architecture for **panoptic segmentation** of (very) large 3D scenes 🐘 based on SPT. \nWe formulate the panoptic segmentation task as a **scalable superpoint graph clustering** task. \nTo this end, our model is trained to predict the input parameters of a graph optimization problem whose solution is a panoptic segmentation 💡.\nThis formulation allows supervising our model with per-node and per-edge objectives only, circumventing the need for computing an actual panoptic segmentation and associated matching issues at train time.\nAt inference time, our fast parallelized algorithm solves the small graph optimization problem, yielding object instances 👥.\nDue to its lightweight backbone and scalable formulation, SuperCluster can process scenes of unprecedented scale at once, on a single GPU 🚀, with fewer than 1M parameters 🦋.\n\n\u003Cdiv align=\"center\">\n\n|                               ✨ SuperCluster in numbers ✨                                |\n|:----------------------------------------------------------------------------------------:|\n|                        📊 **S3DIS 6-Fold** (55.9 PQ)                                     |\n|                              📊 **S3DIS Area 5** (50.1 PQ)                               |\n|                               📊 **ScanNet Val** (58.7 PQ)                               |\n|                              📊 **KITTI-360 Val** (48.3 PQ)                              |\n|                                  📊 **DALES** (61.2 PQ)                                  |\n| 🦋 **212k parameters** ([PointGroup](https:\u002F\u002Fgithub.com\u002Fdvlab-research\u002FPointGroup) ÷ 37) |\n|                           ⚡ S3DIS training in **4h on 1 GPU**                            | \n|              ⚡ **7.8km²** tile of **18M** points in **10.1s** on **1 GPU**               
[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/scalable-3d-panoptic-segmentation-with/panoptic-segmentation-on-s3dis)](https://paperswithcode.com/sota/panoptic-segmentation-on-s3dis?p=scalable-3d-panoptic-segmentation-with)
[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/scalable-3d-panoptic-segmentation-with/panoptic-segmentation-on-s3dis-area5)](https://paperswithcode.com/sota/panoptic-segmentation-on-s3dis-area5?p=scalable-3d-panoptic-segmentation-with)
[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/scalable-3d-panoptic-segmentation-with/panoptic-segmentation-on-scannetv2)](https://paperswithcode.com/sota/panoptic-segmentation-on-scannetv2?p=scalable-3d-panoptic-segmentation-with)
[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/scalable-3d-panoptic-segmentation-with/panoptic-segmentation-on-kitti-360)](https://paperswithcode.com/sota/panoptic-segmentation-on-kitti-360?p=scalable-3d-panoptic-segmentation-with)
[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/scalable-3d-panoptic-segmentation-with/panoptic-segmentation-on-dales)](https://paperswithcode.com/sota/panoptic-segmentation-on-dales?p=scalable-3d-panoptic-segmentation-with)

</div>

### EZ-SP (Easy Superpoints)

<p align="center">
  <img width="100%" src="https://oss.gittoolsai.com/images/drprojects_superpoint_transformer_readme_b3f035de916c.png">
</p>

**EZ-SP** brings two main improvements over SPT:
- Much faster preprocessing and inference
- Easier and learnable parametrization of the partition

EZ-SP replaces the costly, CPU-based, cut-pursuit partitioning step of SPT
with a **fast and learnable GPU-based partitioning**.
First, we train a small convolutional backbone to embed each point of the input
scene into a low-dimensional space, where adjacent points from different
semantic classes are pushed apart. Next, our new
[GPU-accelerated graph clustering algorithm](https://github.com/drprojects/torch-graph-components)
groups neighboring points with similar embeddings, while encouraging simple
cluster contours, thus producing semantically homogeneous superpoints. These
superpoints can then be used in the SPT semantic segmentation framework.

<div align="center">

| ✨ EZ-SP in numbers ✨ |
|:---:|
| 📊 **S3DIS 6-Fold** (76.1 mIoU) |
| 📊 **S3DIS Area 5** (69.6 mIoU) |
| 📊 **KITTI-360 Val** (62.0 mIoU) |
| 📊 **DALES** (79.4 mIoU) |
| 🦋 **392k parameters** |
| ⚡️ **72×** faster than [PTv3](https://arxiv.org/abs/2312.10035) for end-to-end semantic segmentation |
| ⚡️ **5.3x** faster than SPT for end-to-end semantic segmentation |

</div>

> **Note**: If you mostly care for the fast graph connected components and
> partitioning algorithms introduced in the
> [EZ-SP paper](https://arxiv.org/abs/2512.00385),
> please see our
> [torch-graph-components](https://github.com/drprojects/torch-graph-components)
> library.
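To give a concrete feel for the first stage described above, here is a toy sketch of a boundary-contrastive objective: adjacent points with the same semantic label are pulled together in embedding space, while adjacent points with different labels are pushed at least a margin apart. This only illustrates the idea with generic tensors; it is not the exact loss from the EZ-SP paper or this repository, and all names and the margin value are placeholders.

```python
import torch
import torch.nn.functional as F

def boundary_contrastive_loss(emb, labels, edge_index, margin=1.0):
    """Toy boundary-contrastive objective (illustrative only).
    emb: (N, D) pointwise embeddings from a small backbone,
    labels: (N,) semantic labels,
    edge_index: (2, E) pairs of adjacent points."""
    src, dst = edge_index
    dist = (emb[src] - emb[dst]).norm(dim=1)
    same = labels[src] == labels[dst]
    # Pull adjacent same-class points together in embedding space...
    pull = dist[same].pow(2).sum() / max(int(same.sum()), 1)
    # ...and push adjacent different-class points at least `margin` apart
    push = F.relu(margin - dist[~same]).pow(2).sum() / max(int((~same).sum()), 1)
    return pull + push
```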
## 📰  Updates
- **31.01.2026** Our paper [_**EZ-SP: Fast and Lightweight Superpoint-Based 3D Segmentation**_](https://arxiv.org/abs/2512.00385) was accepted at **[ICRA'26](https://2026.ieee-icra.org/)** 🥳
- **22.01.2026** Released the graph connected components and graph
partitioning algorithms introduced in EZ-SP as a standalone library:
[torch-graph-components](https://github.com/drprojects/torch-graph-components).
We hope this will facilitate the application of these core building blocks to
other graph-based projects.
- **27.11.2025** Major code release for our **learnable, GPU-accelerated
partition**, implementing
[_**EZ-SP: Fast and Lightweight Superpoint-Based 3D Segmentation**_](https://arxiv.org/abs/2512.00385).
This new version introduces some changes to the codebase which are
**non-backward compatible**.
We strived to document the breaking changes and provide **instructions and
scripts** to help users of previous versions move to the new codebase.
Please refer to the [**CHANGELOG**](CHANGELOG.md) for more details❗
- **27.06.2024** Released our Superpoint Transformer 🧑‍🏫 tutorial
[slides](media/superpoint_transformer_tutorial.pdf),
[notebook](notebooks/superpoint_transformer_tutorial.ipynb), and [video](https://www.youtube.com/watch?v=2qKhpQs9gJw).
Check these out if you are getting started with the project !
- **21.06.2024** [Damien](https://github.com/drprojects) will be giving a
**🧑‍🏫 tutorial on Superpoint Transformer on 📅 27.06.2024 at 1pm CEST**.
Make sure to come if you want to gain some hands-on experience with the project !
**[Registration here](https://www.linkedin.com/events/superpointtransformersfor3dpoin7209130538110963712)**.
- **28.02.2024** Major code release for **panoptic segmentation**, implementing
**[_Scalable 3D Panoptic Segmentation As Superpoint Graph Clustering_](https://arxiv.org/abs/2401.06704)**.
This new version also implements long-awaited features such as lightning's
`predict()` behavior, **voxel-resolution and full-resolution prediction**.
Some changes in the dependencies and repository structure are **not
backward-compatible**. If you were already using anterior code versions, this
means we recommend re-installing your conda environment and re-running the
preprocessing of your datasets❗
- **15.10.2023** Our paper **[_Scalable 3D Panoptic Segmentation As Superpoint Graph Clustering_](https://arxiv.org/abs/2401.06704)** was accepted for an **oral** presentation at **[3DV'24](https://3dvconf.github.io/2024/)** 🥳
- **06.10.2023** Come see our poster for **[_Efficient 3D Semantic Segmentation with Superpoint Transformer_](https://arxiv.org/abs/2306.08045)** at **[ICCV'23](https://iccv2023.thecvf.com/)**
- **14.07.2023** Our paper **[_Efficient 3D Semantic Segmentation with Superpoint Transformer_](https://arxiv.org/abs/2306.08045)** was accepted at **[ICCV'23](https://iccv2023.thecvf.com/)** 🥳
- **15.06.2023** Official release 🌱

<br>

## 💻  Environment requirements
This project was tested with:
- Linux OS
- **64G** RAM
- NVIDIA GTX 1080 Ti **11G**, NVIDIA V100 **32G**, NVIDIA A40 **48G**
- CUDA 11.8 and 12.1
- conda 23.3.1

<br>

## 🏗  Installation
Simply run [`install.sh`](install.sh) to install all dependencies in a new
conda environment named `spt`.
```bash
# Creates a conda env named 'spt' and installs dependencies
./install.sh
```

**Optional dependency**: [TorchSparse](https://github.com/mit-han-lab/torchsparse)
is an optional dependency that enables sparse 3D convolutions, used in the
EZ-SP models.
To install an environment with this, use:
```bash
# Creates a conda env named 'spt' and installs all dependencies + TorchSparse
./install.sh with_torchsparse
```

> **Note**: See the [Datasets page](docs/datasets.md) for setting up your dataset
> path and file structure.

<br>

### 🔩  Project structure
```
└── superpoint_transformer
    │
    ├── configs                   # Hydra configs
    │   ├── callbacks                 # Callbacks configs
    │   ├── data                      # Data configs
    │   ├── debug                     # Debugging configs
    │   ├── experiment                # Experiment configs
    │   ├── extras                    # Extra utilities configs
    │   ├── hparams_search            # Hyperparameter search configs
    │   ├── hydra                     # Hydra configs
    │   ├── local                     # Local configs
    │   ├── logger                    # Logger configs
    │   ├── model                     # Model configs
    │   ├── paths                     # Project paths configs
    │   ├── trainer                   # Trainer configs
    │   │
    │   ├── eval.yaml                 # Main config for evaluation
    │   └── train.yaml                # Main config for training
    │
    ├── data                      # Project data (see docs/datasets.md)
    │
    ├── docs                      # Documentation
    │
    ├── logs                      # Logs generated by hydra and lightning loggers
    │
    ├── media                     # Media illustrating the project
    │
    ├── notebooks                 # Jupyter notebooks
    │
    ├── scripts                   # Shell scripts
    │
    ├── src                       # Source code
    │   ├── data                      # Data structure for hierarchical partitions
    │   ├── datamodules               # Lightning DataModules
    │   ├── datasets                  # Datasets
    │   ├── dependencies              # Compiled dependencies
    │   ├── loader                    # DataLoader
    │   ├── loss                      # Loss
    │   ├── metrics                   # Metrics
    │   ├── models                    # Model architecture
    │   ├── nn                        # Model building blocks
    │   ├── optim                     # Optimization
    │   ├── transforms                # Functions for transforms, pre-transforms, etc
    │   ├── utils                     # Utilities
    │   ├── visualization             # Interactive visualization tool
    │   │
    │   ├── eval.py                   # Run evaluation
    │   └── train.py                  # Run training
    │
    ├── tests                     # Tests of any kind
    │
    ├── .env.example              # Example of file for storing private environment variables
    ├── .gitignore                # List of files ignored by git
    ├── .pre-commit-config.yaml   # Configuration of pre-commit hooks for code formatting
    ├── install.sh                # Installation script
    ├── LICENSE                   # Project license
    └── README.md
```

> **Note**: See the [Datasets page](docs/datasets.md) for further details on `data/`.

> **Note**: See the [Logs page](docs/logging.md) for further details on `logs/`.

<br>

## 🚀  Usage
### Datasets
See the [Datasets page](docs/datasets.md) to set up your datasets.
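As a quick orientation before diving into the docs: like other Lightning-Hydra-template projects, the repository ships a `.env.example` for private environment variables such as local paths (see the project structure above). A hedged sketch of the usual workflow; the variable names below are only illustrative, the authoritative ones are in `.env.example` and `docs/datasets.md`:

```bash
# Copy the template and point the project at your local paths
cp .env.example .env
# then edit .env with your own locations, e.g. (names are illustrative):
# DATA_DIR=/path/to/your/data
# LOGS_DIR=/path/to/your/logs
```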
### Evaluation
Use the following command structure for evaluating our models from a checkpoint
file `checkpoint.ckpt`, where `<task>` should be `semantic` for using SPT and `panoptic` for using
SuperCluster:

```bash
# Evaluate for <task> segmentation on <dataset>
python src/eval.py experiment=<task>/<dataset> ckpt_path=/path/to/your/checkpoint.ckpt
```

Some examples:

```bash
# Evaluate SPT on S3DIS Fold 5
python src/eval.py experiment=semantic/s3dis datamodule.fold=5 ckpt_path=/path/to/your/checkpoint.ckpt

# Evaluate SPT on KITTI-360 Val
python src/eval.py experiment=semantic/kitti360 ckpt_path=/path/to/your/checkpoint.ckpt

# Evaluate SPT on DALES
python src/eval.py experiment=semantic/dales ckpt_path=/path/to/your/checkpoint.ckpt

# Evaluate SuperCluster on S3DIS Fold 5
python src/eval.py experiment=panoptic/s3dis datamodule.fold=5 ckpt_path=/path/to/your/checkpoint.ckpt

# Evaluate SuperCluster on S3DIS Fold 5 with {wall, floor, ceiling} as 'stuff'
python src/eval.py experiment=panoptic/s3dis_with_stuff datamodule.fold=5 ckpt_path=/path/to/your/checkpoint.ckpt

# Evaluate SuperCluster on ScanNet Val
python src/eval.py experiment=panoptic/scannet ckpt_path=/path/to/your/checkpoint.ckpt

# Evaluate SuperCluster on KITTI-360 Val
python src/eval.py experiment=panoptic/kitti360 ckpt_path=/path/to/your/checkpoint.ckpt

# Evaluate SuperCluster on DALES
python src/eval.py experiment=panoptic/dales ckpt_path=/path/to/your/checkpoint.ckpt

# Evaluate EZ-SP on DALES
python src/eval.py experiment=semantic/dales_ezsp ckpt_path=/path/to/your/checkpoint.ckpt datamodule.pretrained_cnn_ckpt_path=/path/to/your/partition_checkpoint.ckpt
```

> **Note**:
>
> The pretrained weights of the **SPT** and **SPT-nano** models for
> **S3DIS 6-Fold**, **KITTI-360 Val**, and **DALES** are available at:
>
> [![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.8042712.svg)](https://doi.org/10.5281/zenodo.8042712)
>
> The pretrained weights of the **SuperCluster** models for
> **S3DIS 6-Fold**, **S3DIS 6-Fold with stuff**, **ScanNet Val**, **KITTI-360 Val**, and **DALES** are available at:
>
> [![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.10689037.svg)](https://doi.org/10.5281/zenodo.10689037)
>
> The pretrained weights of the **EZ-SP** models for **S3DIS 6-Fold**, **KITTI-360 Val**, and **DALES** are available at:
>
> [![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.18329602.svg)](https://doi.org/10.5281/zenodo.18329602)

### Training
#### SPT & SuperCluster
Use the following command structure for **training our models on a 32G-GPU**,
where `<task>` should be `semantic` for using SPT and `panoptic` for using
SuperCluster:

```bash
# Train for <task> segmentation on <dataset>
python src/train.py experiment=<task>/<dataset>
```

Some examples:

```bash
# Train SPT on S3DIS Fold 5
python src/train.py experiment=semantic/s3dis datamodule.fold=5

# Train SPT on KITTI-360 Val
python src/train.py experiment=semantic/kitti360

# Train SPT on DALES
python src/train.py experiment=semantic/dales

# Train SuperCluster on S3DIS Fold 5
python src/train.py experiment=panoptic/s3dis datamodule.fold=5

# Train SuperCluster on S3DIS Fold 5 with {wall, floor, ceiling} as 'stuff'
python src/train.py experiment=panoptic/s3dis_with_stuff datamodule.fold=5

# Train SuperCluster on ScanNet Val
python src/train.py experiment=panoptic/scannet

# Train SuperCluster on KITTI-360 Val
python src/train.py experiment=panoptic/kitti360

# Train SuperCluster on DALES
python src/train.py experiment=panoptic/dales
```

Use the following to **train on an 11G-GPU 💾** (training time and performance
may vary):

```bash
# Train SPT on S3DIS Fold 5
python src/train.py experiment=semantic/s3dis_11g datamodule.fold=5

# Train SPT on KITTI-360 Val
python src/train.py experiment=semantic/kitti360_11g

# Train SPT on DALES
python src/train.py experiment=semantic/dales_11g

# Train SuperCluster on S3DIS Fold 5
python src/train.py experiment=panoptic/s3dis_11g datamodule.fold=5

# Train SuperCluster on S3DIS Fold 5 with {wall, floor, ceiling} as 'stuff'
python src/train.py experiment=panoptic/s3dis_with_stuff_11g datamodule.fold=5

# Train SuperCluster on ScanNet Val
python src/train.py experiment=panoptic/scannet_11g

# Train SuperCluster on KITTI-360 Val
python src/train.py experiment=panoptic/kitti360_11g

# Train SuperCluster on DALES
python src/train.py experiment=panoptic/dales_11g
```

> **Note**: Encountering CUDA Out-Of-Memory errors 💀💾 ? See our dedicated
> [troubleshooting section](#cuda-out-of-memory-errors).

> **Note**: Other ready-to-use configs are provided in
> [`configs/experiment/`](configs/experiment). You can easily design your own
> experiments by composing [configs](configs):
> ```bash
> # Train Nano-3 for 50 epochs on DALES
> python src/train.py datamodule=dales model=nano-3 trainer.max_epochs=50
> ```
> See
> [Lightning-Hydra](https://github.com/ashleve/lightning-hydra-template) for more
> information on how the config system works and all the awesome perks of the
> Lightning+Hydra combo.

> **Note**: By default, your logs will automatically be uploaded to
> [Weights and Biases](https://wandb.ai), from where you can track and compare
> your experiments. Other loggers are available in
> [`configs/logger/`](configs/logger). See
> [Lightning-Hydra](https://github.com/ashleve/lightning-hydra-template) for more
> information on the logging options.

NB: The current EZ-SP implementation supports `<dataset>` among `s3dis`,
`kitti360`, and `dales`.

#### EZ-SP

Our EZ-SP method involves two training stages.

1. We train a small model to learn pointwise features for the superpoint
**partition**, by solving a **contrastive task** at the semantic boundaries.

2. We train a Superpoint Transformer model which takes these point features
as input and reasons on the associated hierarchical partition, by solving a
**semantic classification** task.
> **Note**: If you mostly care for the fast graph connected components and
> partitioning algorithms introduced in the
> [EZ-SP paper](https://arxiv.org/abs/2512.00385),
> please see our
> [torch-graph-components](https://github.com/drprojects/torch-graph-components)
> library.

##### 1. Train the partition model

```bash
python src/train.py experiment=partition/<dataset>_ezsp
```

The checkpoint should be logged in
`logs/train/runs/<run_dir>/checkpoints/last.ckpt`
(check your bash and wandb logs 😉).

> **Note**:
> The experiments from `configs/experiment/partition` train a small sparse CNN for
> EZ-SP that embeds every point into a low-dimensional space where adjacent points
> from different semantic classes are pushed apart.
> This first training stage is controlled by the `model.training_partition_stage`
> parameter.

##### 2. Train the semantic model

```bash
python src/train.py experiment=semantic/<dataset>_ezsp datamodule.pretrained_cnn_ckpt_path=<partition_ckpt_path>
```

Make sure you set the partition model checkpoint `pretrained_cnn_ckpt_path`.
For this, you can train your own partition model as explained in step 1, or
you can use our [pretrained checkpoints](https://zenodo.org/records/18329602)
(named `ezsp_partition_<dataset>.ckpt`).

> **Note**:
> The experiments from `configs/experiment/semantic` that end with `_ezsp` train
> the full EZ-SP model for semantic segmentation. Note that these configurations
> require a checkpoint path to a partition model, specified via the
> `datamodule.pretrained_cnn_ckpt_path` parameter. The pretrained partition
> model is used to compute the hierarchical superpoint partition during
> preprocessing, on which the full model reasons during training.

### PyTorch Lightning `predict()`
Both SPT and SuperCluster inherit from `LightningModule` and implement `predict_step()`, which permits using
[PyTorch Lightning's `Trainer.predict()` mechanism](https://lightning.ai/docs/pytorch/stable/deploy/production_basic.html).

```python
from torch.utils.data import DataLoader
from pytorch_lightning import Trainer

from src.models.semantic import SemanticSegmentationModule

# Predict behavior for semantic segmentation from a torch DataLoader
dataloader = DataLoader(...)
model = SemanticSegmentationModule(...)
trainer = Trainer(...)
batch, output = trainer.predict(model=model, dataloaders=dataloader)
```

This, however, still requires you to instantiate a `Trainer`, a `DataLoader`,
and a model with relevant parameters.

For a little more simplicity, all our datasets inherit from
`LightningDataModule` and implement `predict_dataloader()` by pointing to their
corresponding test set by default. This permits directly passing a datamodule to
[PyTorch Lightning's `Trainer.predict()`](https://lightning.ai/docs/pytorch/stable/common/trainer.html#predict)
without explicitly instantiating a `DataLoader`.

```python
from pytorch_lightning import Trainer

from src.models.semantic import SemanticSegmentationModule
from src.datamodules.s3dis import S3DISDataModule

# Predict behavior for semantic segmentation on S3DIS
datamodule = S3DISDataModule(...)
model = SemanticSegmentationModule(...)
trainer = Trainer(...)
batch, output = trainer.predict(model=model, datamodule=datamodule)
```

For more details on how to instantiate these, as well as the output format
of our model, we strongly encourage you to play with our
[demo notebook](notebooks/demo.ipynb) and have a look at the [`src/eval.py`](src/eval.py) script.
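If you would rather stay config-driven than fill in these `(...)` by hand, the same objects can be built from the project's Hydra configs, roughly mirroring what `src/eval.py` does. A hedged sketch under that assumption; the config name and override follow the usage examples above, but the exact composition logic lives in `src/eval.py`:

```python
import hydra
from hydra import compose, initialize

# Compose the evaluation config, as the Hydra CLI would
with initialize(version_base="1.3", config_path="configs"):
    cfg = compose(config_name="eval", overrides=["experiment=semantic/s3dis"])

# Instantiate the datamodule, model, and trainer from their configs
datamodule = hydra.utils.instantiate(cfg.datamodule)
model = hydra.utils.instantiate(cfg.model)
trainer = hydra.utils.instantiate(cfg.trainer)

# Run prediction from a checkpoint
trainer.predict(model=model, datamodule=datamodule, ckpt_path="/path/to/your/checkpoint.ckpt")
```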
### Full-resolution predictions
By design, our models only need to produce predictions for the superpoints of
the $P_1$ partition level during training.
All our losses and metrics are formulated as superpoint-wise objectives.
This conveniently saves compute and memory at training and evaluation time.

At inference time, however, we often need the **predictions on the voxels** of the
$P_0$ partition level or on the **full-resolution input point cloud**.
To this end, we provide helper functions to recover voxel-wise and full-resolution
predictions.

See our [demo notebook](notebooks/demo.ipynb) for more details on these.
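For intuition about what these helpers do, the core operation is index broadcasting: every voxel inherits the prediction of its $P_1$ superpoint, and every raw point inherits the prediction of its voxel. A minimal sketch with hypothetical tensors; the actual mappings live in the project's `NAG`/`Data` structures, so see the demo notebook for the real API:

```python
import torch

num_classes, num_superpoints, num_voxels, num_points = 13, 1000, 50_000, 1_000_000

# Hypothetical inputs: per-superpoint logits and parent indices at each level
superpoint_logits = torch.randn(num_superpoints, num_classes)        # model output on P1
voxel_to_superpoint = torch.randint(num_superpoints, (num_voxels,))  # P0 -> P1 index
point_to_voxel = torch.randint(num_voxels, (num_points,))            # raw -> P0 index

# Voxel-wise predictions: each voxel takes its superpoint's class
voxel_pred = superpoint_logits.argmax(dim=1)[voxel_to_superpoint]

# Full-resolution predictions: each raw point takes its voxel's class
full_res_pred = voxel_pred[point_to_voxel]
```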
### Using a pretrained model on custom data
For running a pretrained model on your own point cloud, please refer to our
tutorial [slides](media/superpoint_transformer_tutorial.pdf),
[notebook](notebooks/superpoint_transformer_tutorial.ipynb),
and [video](https://www.youtube.com/watch?v=2qKhpQs9gJw).

### Parametrizing the superpoint partition on custom data
Our hierarchical superpoint partition is computed at preprocessing time. Its
construction involves several steps whose parametrization must be adapted to
your specific dataset and task. Please refer to our
tutorial [slides](media/superpoint_transformer_tutorial.pdf),
[notebook](notebooks/superpoint_transformer_tutorial.ipynb),
and [video](https://www.youtube.com/watch?v=2qKhpQs9gJw) for better
understanding this process and tuning it to your needs.

### Parametrizing SuperCluster graph clustering
One specificity of SuperCluster is that the model is not trained to explicitly
do panoptic segmentation, but to predict the input parameters of a superpoint
graph clustering problem whose solution is a panoptic segmentation.

For this reason, the hyperparameters for this graph optimization problem are
selected after training, with a grid search on the training or validation set.
We find that fairly similar hyperparameters yield the best performance on all
our datasets (see our [paper](https://arxiv.org/abs/2401.06704)'s appendix). Yet, you may want to explore
these hyperparameters for your own dataset. To this end, see our
[demo notebook](notebooks/demo_panoptic_parametrization.ipynb) for
parametrizing the panoptic segmentation.

### Notebooks & visualization
We provide [notebooks](notebooks) to help you get started with manipulating our
core data structures, configs loading, dataset and model instantiation,
inference on each dataset, and visualization.

In particular, we created an interactive visualization tool ✨ which can be used
to produce shareable HTMLs. Demos of how to use this tool are provided in
the [notebooks](notebooks). Additionally, examples of such HTML files are
provided in [media/visualizations.7z](media/visualizations.7z).

<br>

## 📚  Documentation

| Location | Content |
|:---|:---|
| [README](README.md) | General introduction to the project |
| [`docs/data_structures`](docs/data_structures.md) | Introduction to the core data structures of this project: `Data`, `NAG`, `Cluster`, and `InstanceData` |
| [`docs/datasets`](docs/datasets.md) | Introduction to our implemented datasets, to our `BaseDataset` class, and how to create your own dataset inheriting from it |
| [`docs/logging`](docs/logging.md) | Introduction to logging and the project's `logs/` structure |
| [`docs/visualization`](docs/visualization.md) | Introduction to our interactive 3D visualization tool |

> **Note**: We endeavoured to **comment our code** as much as possible to make
> this project usable. If you don't find the answer you are looking for in the
> `docs/`, make sure to **have a look at the source code and past issues**.
> Still, if you find some parts are unclear or some more documentation would be
> needed, feel free to let us know by creating an issue !

<br>

## 👩‍🔧  Troubleshooting
Here are some common issues and tips for tackling them.

### SPT or SuperCluster on an 11G-GPU
Our default configurations are designed for a 32G-GPU.
Yet, SPT and SuperCluster can run
on an **11G-GPU 💾**, with minor time and performance variations.

We provide configs in [`configs/experiment/semantic`](configs/experiment/semantic) for
training SPT on an **11G-GPU 💾**:

```bash
# Train SPT on S3DIS Fold 5
python src/train.py experiment=semantic/s3dis_11g datamodule.fold=5

# Train SPT on KITTI-360 Val
python src/train.py experiment=semantic/kitti360_11g

# Train SPT on DALES
python src/train.py experiment=semantic/dales_11g
```

Similarly, we provide configs in [`configs/experiment/panoptic`](configs/experiment/panoptic) for
training SuperCluster on an **11G-GPU 💾**:

```bash
# Train SuperCluster on S3DIS Fold 5
python src/train.py experiment=panoptic/s3dis_11g datamodule.fold=5

# Train SuperCluster on S3DIS Fold 5 with {wall, floor, ceiling} as 'stuff'
python src/train.py experiment=panoptic/s3dis_with_stuff_11g datamodule.fold=5

# Train SuperCluster on ScanNet Val
python src/train.py experiment=panoptic/scannet_11g

# Train SuperCluster on KITTI-360 Val
python src/train.py experiment=panoptic/kitti360_11g

# Train SuperCluster on DALES
python src/train.py experiment=panoptic/dales_11g
```

### CUDA Out-Of-Memory Errors
Having some CUDA OOM errors 💀💾 ? Here are some parameters you can play
with to mitigate GPU memory use, based on when the error occurs.

<details>
<summary><b>Parameters affecting CUDA memory.</b></summary>

**Legend**: 🟡 Preprocessing | 🔴 Training | 🟣 Inference (including validation and testing during training)

| Parameter | Description | When |
|:---|:---|:---:|
| `datamodule.xy_tiling` | Splits dataset tiles into xy_tiling^2 smaller tiles, based on a regular XY grid. Ideal for square-shaped tiles à la DALES. Note this will affect the number of training steps. | 🟡🟣 |
| `datamodule.pc_tiling` | Splits dataset tiles into 2^pc_tiling smaller tiles, based on their principal component. Ideal for varying tile shapes à la S3DIS and KITTI-360. Note this will affect the number of training steps. | 🟡🟣 |
| `datamodule.max_num_nodes` | Limits the number of $P_1$ partition nodes/superpoints in the **training batches**. | 🔴 |
| `datamodule.max_num_edges` | Limits the number of $P_1$ partition edges in the **training batches**. | 🔴 |
| `datamodule.voxel` | Increasing voxel size will reduce preprocessing, training, and inference times but will reduce performance. | 🟡🔴🟣 |
| `datamodule.pcp_regularization` | Regularization for partition levels. The larger, the fewer the superpoints. | 🟡🔴🟣 |
| `datamodule.pcp_spatial_weight` | Importance of the 3D position in the partition. The smaller, the fewer the superpoints. | 🟡🔴🟣 |
| `datamodule.pcp_cutoff` | Minimum superpoint size. The larger, the fewer the superpoints. | 🟡🔴🟣 |
| `datamodule.graph_k_max` | Maximum number of adjacent nodes in the superpoint graphs. The smaller, the fewer the superedges. | 🟡🔴🟣 |
| `datamodule.graph_gap` | Maximum distance between adjacent superpoints in the superpoint graphs. The smaller, the fewer the superedges. | 🟡🔴🟣 |
| `datamodule.graph_chunk` | Reduce to avoid OOM when `RadiusHorizontalGraph` preprocesses the superpoint graph. | 🟡 |
| `datamodule.dataloader.batch_size` | Controls the number of loaded tiles. Each **train batch** is composed of `batch_size`*`datamodule.sample_graph_k` spherical samplings. Inference is performed on **entire validation and test tiles**, without spherical sampling. | 🔴🟣 |
| `datamodule.sample_segment_ratio` | Randomly drops a fraction of the superpoints at each partition level. | 🔴 |
| `datamodule.sample_graph_k` | Controls the number of spherical samples in the **train batches**. | 🔴 |
| `datamodule.sample_graph_r` | Controls the radius of spherical samples in the **train batches**. Set to `sample_graph_r<=0` to use the entire tile without spherical sampling. | 🔴 |
| `datamodule.sample_point_min` | Controls the minimum number of $P_0$ points sampled per superpoint in the **train batches**. | 🔴 |
| `datamodule.sample_point_max` | Controls the maximum number of $P_0$ points sampled per superpoint in the **train batches**. | 🔴 |
| `callbacks.gradient_accumulator.scheduling` | Gradient accumulation. Can be used to train with smaller batches, with more training steps. | 🔴 |

<br>
</details>
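As a concrete starting point, memory-heavy training runs can often be tamed by combining a few of the above overrides on the command line. The parameter names below are the documented ones; the values are only illustrative and should be tuned to your dataset and GPU:

```bash
# Illustrative: shrink training batches and sampled subgraphs to fit a smaller GPU
python src/train.py experiment=semantic/s3dis datamodule.fold=5 \
    datamodule.dataloader.batch_size=2 \
    datamodule.sample_graph_k=2 \
    datamodule.max_num_nodes=20000 \
    datamodule.max_num_edges=1000000
```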
<br>

## 💬 Citing our work
If your work uses all or part of the present code, please include the following citation:

```
@article{robert2023spt,
  title={Efficient 3D Semantic Segmentation with Superpoint Transformer},
  author={Robert, Damien and Raguet, Hugo and Landrieu, Loic},
  journal={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  year={2023}
}

@article{robert2024scalable,
  title={Scalable 3D Panoptic Segmentation as Superpoint Graph Clustering},
  author={Robert, Damien and Raguet, Hugo and Landrieu, Loic},
  journal={Proceedings of the IEEE International Conference on 3D Vision},
  year={2024}
}

@article{geist2025ezsp,
  title={EZ-SP: Fast and Lightweight Superpoint-Based 3D Segmentation},
  author={Geist, Louis and Landrieu, Loic and Robert, Damien},
  journal={arXiv},
  year={2025}
}
```

📄 You can find our papers on arXiv:
- [SPT](https://arxiv.org/abs/2306.08045)
- [SuperCluster](https://arxiv.org/abs/2401.06704)
- [EZ-SP](https://arxiv.org/abs/2512.00385)

Also, **if you ❤️ or simply use this project, don't forget to give the
repository a ⭐, it means a lot to us !**

<br>

## 💳  Credits
- This project was built using the [Lightning-Hydra template](https://github.com/ashleve/lightning-hydra-template).
- The main data structures of this work rely on [PyTorch Geometric](https://github.com/pyg-team/pytorch_geometric).
- Some point cloud operations were inspired by the
[Torch-Points3D framework](https://github.com/nicolas-chaulet/torch-points3d), although not merged with the official project
at this point.
- For the KITTI-360 dataset, some code from the official [KITTI-360](https://github.com/autonomousvision/kitti360Scripts) repository was
used.
- Some superpoint-graph-related operations were inspired by
[Superpoint Graph](https://github.com/loicland/superpoint_graph).
- [Parallel Cut-Pursuit](https://gitlab.com/1a7r0ch3/parallel-cut-pursuit) was used in SPT and SuperCluster to compute the
hierarchical superpoint partition and graph clustering. Note that this step is
replaced by our GPU-based algorithm in EZ-SP.

This project has greatly benefited from the support of
[**Romain Janvier**](https://github.com/rjanvier).
This collaboration was made possible thanks to
the [3DFin](https://github.com/3DFin) project. **3DFin** has been developed at the Centre of Wildfire
Research of Swansea University (UK) in collaboration with the Research Institute
of Biodiversity (CSIC, Spain) and the Department of Mining Exploitation of the
University of Oviedo (Spain).
Funding provided by the UK NERC project (NE/T001194/1):
_Advancing 3D Fuel Mapping for Wildfire Behaviour and Risk Mitigation Modelling_
and by the Spanish Knowledge Generation project (PID2021-126790NB-I00):
_Advancing carbon emission estimations from wildfires applying artificial
intelligence to 3D terrestrial point clouds_.

This project has also benefited from contributions by **[Louis Geist](https://louisgeist.github.io)**,
whose work was funded by the [PEPR IA SHARP](https://www.pepr-ia.fr/en/projet/sharp-english/).
我们的论文[_**EZ-SP：基于超点的快速轻量级3D分割**_](https:\u002F\u002Farxiv.org\u002Fabs\u002F2512.00385)已被**[ICRA'26](https:\u002F\u002F2026.ieee-icra.org\u002F)**接收 🥳\n- **22.01.2026** 将EZ-SP中引入的图连通分量和图划分算法以独立库的形式发布：\n[torch-graph-components](https:\u002F\u002Fgithub.com\u002Fdrprojects\u002Ftorch-graph-components)。\n我们希望这能促进这些核心组件在其他基于图的项目中的应用。\n- **27.11.2025** 针对我们的**可学习、GPU加速的划分**的重大代码发布，实现了\n[_**EZ-SP：基于超点的快速轻量级3D分割**_](https:\u002F\u002Farxiv.org\u002Fabs\u002F2512.00385)。\n新版本对代码库进行了一些**不向后兼容**的改动。\n我们尽力记录了这些破坏性变更，并提供了**说明和脚本**，帮助旧版本用户迁移到新的代码库。\n更多详情请参阅[**CHANGELOG**](CHANGELOG.md)❗\n- **27.06.2024** 发布了我们的Superpoint Transformer🧑‍🏫教程：\n[幻灯片](media\u002Fsuperpoint_transformer_tutorial.pdf)、\n[笔记本](notebooks\u002Fsuperpoint_transformer_tutorial.ipynb)以及[视频](https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=2qKhpQs9gJw)。\n如果你刚开始接触这个项目，不妨看看这些资料！\n- **21.06.2024** [Damien](https:\u002F\u002Fgithub.com\u002Fdrprojects)将于**📅 27.06.2024 下午1点（CEST）**举办一场关于Superpoint Transformer的**🧑‍🏫教程**。\n如果你想获得该项目的实际操作经验，一定要来参加！\n**[注册链接](https:\u002F\u002Fwww.linkedin.com\u002Fevents\u002Fsuperpointtransformersfor3dpoin7209130538110963712)**。\n- **28.02.2024** 针对**全景分割**的重大代码发布，实现了\n**[_基于超点图聚类的可扩展3D全景分割_](https:\u002F\u002Farxiv.org\u002Fabs\u002F2401.06704)**。\n新版本还实现了备受期待的功能，如Lightning的`predict()`行为、\n**体素分辨率和全分辨率预测**。\n部分依赖项和仓库结构的更改是**不向后兼容**的。如果你之前使用过旧版本的代码，建议重新安装你的conda环境，并重新运行数据预处理或数据集处理❗\n- **15.10.2023** 我们的论文**[_基于超点图聚类的可扩展3D全景分割_](https:\u002F\u002Farxiv.org\u002Fabs\u002F2401.06704)**被接受在**[3DV'24](https:\u002F\u002F3dvconf.github.io\u002F2024\u002F)**上作**口头报告** 🥳\n- **06.10.2023** 欢迎参观我们在**[ICCV'23](https:\u002F\u002Ficcv2023.thecvf.com\u002F)**上展示的海报，\n主题为**[_利用Superpoint Transformer实现高效的3D语义分割_](https:\u002F\u002Farxiv.org\u002Fabs\u002F2306.08045)**。\n- **14.07.2023** 我们的论文**[_利用Superpoint Transformer实现高效的3D语义分割_](https:\u002F\u002Farxiv.org\u002Fabs\u002F2306.08045)**被**[ICCV'23](https:\u002F\u002Ficcv2023.thecvf.com\u002F)**接收 🥳\n- **15.06.2023** 正式发布 🌱\n\n\u003Cbr>\n\n## 💻  环境要求\n该项目已在以下环境中测试通过：\n- Linux操作系统\n- **64G** 内存\n- NVIDIA GTX 1080 Ti **11G**、NVIDIA V100 **32G**、NVIDIA A40 **48G**\n- CUDA 11.8 和 12.1\n- conda 23.3.1\n\n\u003Cbr>\n\n## 🏗  安装\n只需运行[`install.sh`](install.sh)，即可在名为`spt`的新conda环境中安装所有依赖项。\n```bash\n# 创建名为'spt'的conda环境并安装依赖\n.\u002Finstall.sh\n```\n\n**可选依赖**：[TorchSparse](https:\u002F\u002Fgithub.com\u002Fmit-han-lab\u002Ftorchsparse)是一个可选依赖，支持稀疏3D卷积，用于EZ-SP模型中。若需安装包含该依赖的环境，请使用：\n```bash\n# 创建名为'spt'的conda环境并安装所有依赖项 + TorchSparse\n.\u002Finstall.sh with_torchsparse\n```\n\n\n> **注**：请参阅[数据集页面](docs\u002Fdatasets.md)，了解如何设置你的数据集路径和文件结构。\n\n\u003Cbr>\n\n### 🔩  项目结构\n```\n└── superpoint_transformer\n    │\n    ├── configs                   # Hydra配置文件\n    │   ├── callbacks                 # 回调配置\n    │   ├── data                      # 数据配置\n    │   ├── debug                     # 调试配置\n    │   ├── experiment                # 实验配置\n    │   ├── extras                    # 其他工具配置\n    │   ├── hparams_search            # 超参数搜索配置\n    │   ├── hydra                     # Hydra配置\n    │   ├── local                     # 本地配置\n    │   ├── logger                    # 日志配置\n    │   ├── model                     # 模型配置\n    │   ├── paths                     # 项目路径配置\n    │   ├── trainer                   # 训练配置\n    │   │\n    │   ├── eval.yaml                 # 评估主配置\n    │   └── train.yaml                # 训练主配置\n    │\n    ├── data                      # 项目数据（详见docs\u002Fdatasets.md）\n    │\n    ├── docs                      # 文档\n    │\n    ├── logs                      # 
由hydra和lightning日志器生成的日志\n    │\n    ├── media                     # 展示项目的媒体资源\n    │\n    ├── notebooks                 # Jupyter笔记本\n    │\n    ├── scripts                   # Shell脚本\n    │\n    ├── src                       # 源代码\n    │   ├── data                      # 分层划分的数据结构\n    │   ├── datamodules               # Lightning DataModules\n    │   ├── datasets                  # 数据集\n    │   ├── dependencies              # 编译后的依赖\n    │   ├── loader                    # DataLoader\n    │   ├── loss                      # 损失函数\n    │   ├── metrics                   # 指标\n    │   ├── models                    # 模型架构\n    │   ├── nn                        # 模型构建模块\n    │   ├── optim                     # 优化\n    │   ├── transforms                # 变换、预变换等功能\n    │   ├── utils                     # 工具函数\n    │   ├── visualization             # 交互式可视化工具\n    │   │\n    │   ├── eval.py                   # 运行评估\n    │   └── train.py                  # 运行训练\n    │\n    ├── tests                     # 各类测试\n    │\n    ├── .env.example              # 存储私有环境变量的示例文件\n    ├── .gitignore                # Git忽略的文件列表\n    ├── .pre-commit-config.yaml   # 代码格式化预提交钩子配置\n    ├── install.sh                # 安装脚本\n    ├── LICENSE                   # 项目许可证\n    └── README.md\n\n```\n\n> **注**：有关`data\u002F`的更多信息，请参阅[数据集页面](docs\u002Fdatasets.md)。\n\n> **注**：有关`logs\u002F`的更多信息，请参阅[日志页面](docs\u002Flogging.md)。\n\n\u003Cbr>\n\n## 🚀  使用\n### 数据集\n请参阅[数据集页面](docs\u002Fdatasets.md)，以设置你的数据集。\n\n### 评估\n使用以下命令结构从检查点文件 `checkpoint.ckpt` 评估我们的模型，其中 `\u003Ctask>` 应为 `semantic` 表示使用 SPT，`panoptic` 表示使用 SuperCluster：\n\n```bash\n# 在 \u003Cdataset> 数据集上评估 \u003Ctask> 分割任务\npython src\u002Feval.py experiment=\u003Ctask>\u002F\u003Cdataset> ckpt_path=\u002Fpath\u002Fto\u002Fyour\u002Fcheckpoint.ckpt\n```\n\n一些示例：\n\n```bash\n# 在 S3DIS 第 5 折上评估 SPT\npython src\u002Feval.py experiment=semantic\u002Fs3dis datamodule.fold=5 ckpt_path=\u002Fpath\u002Fto\u002Fyour\u002Fcheckpoint.ckpt\n\n# 在 KITTI-360 验证集上评估 SPT\npython src\u002Feval.py experiment=semantic\u002Fkitti360  ckpt_path=\u002Fpath\u002Fto\u002Fyour\u002Fcheckpoint.ckpt \n\n# 在 DALES 上评估 SPT\npython src\u002Feval.py experiment=semantic\u002Fdales ckpt_path=\u002Fpath\u002Fto\u002Fyour\u002Fcheckpoint.ckpt\n\n# 在 S3DIS 第 5 折上评估 SuperCluster\npython src\u002Feval.py experiment=panoptic\u002Fs3dis datamodule.fold=5 ckpt_path=\u002Fpath\u002Fto\u002Fyour\u002Fcheckpoint.ckpt\n\n# 在 S3DIS 第 5 折上评估 SuperCluster，将 {wall, floor, ceiling} 定义为 'stuff'\npython src\u002Feval.py experiment=panoptic\u002Fs3dis_with_stuff datamodule.fold=5 ckpt_path=\u002Fpath\u002Fto\u002Fyour\u002Fcheckpoint.ckpt\n\n# 在 ScanNet 验证集上评估 SuperCluster\npython src\u002Feval.py experiment=panoptic\u002Fscannet ckpt_path=\u002Fpath\u002Fto\u002Fyour\u002Fcheckpoint.ckpt\n\n# 在 KITTI-360 验证集上评估 SuperCluster\npython src\u002Feval.py experiment=panoptic\u002Fkitti360  ckpt_path=\u002Fpath\u002Fto\u002Fyour\u002Fcheckpoint.ckpt \n\n# 在 DALES 上评估 SuperCluster\npython src\u002Feval.py experiment=panoptic\u002Fdales ckpt_path=\u002Fpath\u002Fto\u002Fyour\u002Fcheckpoint.ckpt\n\n# 在 DALES 上评估 EZ-SP\npython src\u002Feval.py experiment=semantic\u002Fdales_ezsp ckpt_path=\u002Fpath\u002Fto\u002Fyour\u002Fcheckpoint.ckpt datamodule.pretrained_cnn_ckpt_path=\u002Fpath\u002Fto\u002Fyour\u002Fpartition_checkpoint.ckpt\n```\n\n> **注意**：\n> \n> **SPT** 和 **SPT-nano** 模型在 **S3DIS 6-Fold**、**KITTI-360 Val** 和 **DALES** 上的预训练权重可在以下链接获取：\n>\n> 
[![DOI](https:\u002F\u002Fzenodo.org\u002Fbadge\u002FDOI\u002F10.5281\u002Fzenodo.8042712.svg)](https:\u002F\u002Fdoi.org\u002F10.5281\u002Fzenodo.8042712)\n> \n> **SuperCluster** 模型在 **S3DIS 6-Fold**、**S3DIS 6-Fold with stuff**、**ScanNet Val**、**KITTI-360 Val** 和 **DALES** 上的预训练权重可在以下链接获取：\n>\n> [![DOI](https:\u002F\u002Fzenodo.org\u002Fbadge\u002FDOI\u002F10.5281\u002Fzenodo.10689037.svg)](https:\u002F\u002Fdoi.org\u002F10.5281\u002Fzenodo.10689037)\n>\n> **EZ-SP** 模型在 **S3DIS 6-Fold**、**KITTI-360 Val** 和 **DALES** 上的预训练权重可在以下链接获取：\n>\n> [![DOI](https:\u002F\u002Fzenodo.org\u002Fbadge\u002FDOI\u002F10.5281\u002Fzenodo.18329602.svg)](https:\u002F\u002Fdoi.org\u002F10.5281\u002Fzenodo.18329602)\n\n### 训练\n#### SPT & SuperCluster\n使用以下命令结构在 **32G 显存的 GPU 上训练我们的模型**，其中 `\u003Ctask>` 应为 `semantic` 表示使用 SPT，`panoptic` 表示使用 SuperCluster：\n\n```bash\n# 在 \u003Cdataset> 数据集上训练 \u003Ctask> 分割任务\npython src\u002Ftrain.py experiment=\u003Ctask>\u002F\u003Cdataset>\n```\n\n一些示例：\n\n```bash\n# 在 S3DIS 第 5 折上训练 SPT\npython src\u002Ftrain.py experiment=semantic\u002Fs3dis datamodule.fold=5\n\n# 在 KITTI-360 验证集上训练 SPT\npython src\u002Ftrain.py experiment=semantic\u002Fkitti360 \n\n# 在 DALES 上训练 SPT\npython src\u002Ftrain.py experiment=semantic\u002Fdales\n\n# 在 S3DIS 第 5 折上训练 SuperCluster\npython src\u002Ftrain.py experiment=panoptic\u002Fs3dis datamodule.fold=5\n\n# 在 S3DIS 第 5 折上训练 SuperCluster，并将 {wall, floor, ceiling} 定义为 'stuff'\npython src\u002Ftrain.py experiment=panoptic\u002Fs3dis_with_stuff datamodule.fold=5\n\n# 在 ScanNet 验证集上训练 SuperCluster\npython src\u002Ftrain.py experiment=panoptic\u002Fscannet\n\n# 在 KITTI-360 验证集上训练 SuperCluster\npython src\u002Ftrain.py experiment=panoptic\u002Fkitti360 \n\n# 在 DALES 上训练 SuperCluster\npython src\u002Ftrain.py experiment=panoptic\u002Fdales\n```\n\n若需在 **11G 显存的 GPU 上训练 💾**（训练时间和性能可能会有所不同），可使用以下命令：\n\n```bash\n# 在 S3DIS 第 5 折上训练 SPT\npython src\u002Ftrain.py experiment=semantic\u002Fs3dis_11g datamodule.fold=5\n\n# 在 KITTI-360 验证集上训练 SPT\npython src\u002Ftrain.py experiment=semantic\u002Fkitti360_11g \n\n# 在 DALES 上训练 SPT\npython src\u002Ftrain.py experiment=semantic\u002Fdales_11g\n\n# 在 S3DIS 第 5 折上训练 SuperCluster\npython src\u002Ftrain.py experiment=panoptic\u002Fs3dis_11g datamodule.fold=5\n\n# 在 S3DIS 第 5 折上训练 SuperCluster，并将 {wall, floor, ceiling} 定义为 'stuff'\npython src\u002Ftrain.py experiment=panoptic\u002Fs3dis_with_stuff_11g datamodule.fold=5\n\n# 在 ScanNet 验证集上训练 SuperCluster\npython src\u002Ftrain.py experiment=panoptic\u002Fscannet_11g\n\n# 在 KITTI-360 验证集上训练 SuperCluster\npython src\u002Ftrain.py experiment=panoptic\u002Fkitti360_11g\n\n# 在 DALES 数据集上训练 SuperCluster\npython src\u002Ftrain.py experiment=panoptic\u002Fdales_11g\n```\n\n> **注意**: 遇到 CUDA 内存不足错误 💀💾 吗？请参阅我们专门的 \n> [故障排除部分](#cuda-out-of-memory-errors)。\n\n> **注意**: 其他开箱即用的配置文件位于\n>[`configs\u002Fexperiment\u002F`](configs\u002Fexperiment)。您可以通过组合[配置文件](configs)轻松设计自己的实验：\n>```bash\n># 在 DALES 数据集上训练 Nano-3 模型 50 个 epoch\n>python src\u002Ftrain.py datamodule=dales model=nano-3 trainer.max_epochs=50\n>```\n>有关配置系统的工作原理以及 Lightning+Hydra 组合的所有强大优势，请参阅\n>[Lightning-Hydra](https:\u002F\u002Fgithub.com\u002Fashleve\u002Flightning-hydra-template)。\n\n> **注意**: 默认情况下，您的日志会自动上传到\n>[Weights and Biases](https:\u002F\u002Fwandb.ai)，您可以在那里跟踪和比较您的实验。其他日志记录器可在\n>[`configs\u002Flogger\u002F`](configs\u002Flogger) 中找到。有关日志记录选项的更多信息，请参阅\n>[Lightning-Hydra](https:\u002F\u002Fgithub.com\u002Fashleve\u002Flightning-hydra-template)。\n\n注：当前 EZ-SP 实现支持 `\u003Cdataset>`，取值为 `s3dis`、`kitti360` 和 
`dales`。\n\n#### EZ-SP\n\n我们的 EZ-SP 方法包含两个训练阶段。\n\n1. 我们训练一个小型模型，通过在语义边界上解决**对比任务**，学习用于超点**划分**的逐点特征。\n\n2. 然后我们训练一个超点 Transformer 模型，该模型以这些点特征作为输入，并基于相应的层次化划分进行推理，通过解决**语义分类**任务来完成训练。\n\n> **注意**: 如果您主要关注的是 [EZ-SP 论文](https:\u002F\u002Farxiv.org\u002Fabs\u002F2512.00385) 中介绍的快速图连通分量及划分算法，\n> 请查看我们的\n> [torch-graph-components](https:\u002F\u002Fgithub.com\u002Fdrprojects\u002Ftorch-graph-components)\n> 库。\n\n##### 1. 训练划分模型\n\n```bash\npython src\u002Ftrain.py experiment=partition\u002F\u003Cdataset>_ezsp\n```\n\n检查点应记录在\n`logs\u002Ftrain\u002Fruns\u002F\u003Crun_dir>\u002Fcheckpoints\u002Flast.ckpt`\n（请检查您的 bash 和 wandb 日志 😉）。\n\n> **注意**: \n> `configs\u002Fexperiment\u002Fpartition` 中的实验会训练一个用于 EZ-SP 的小型稀疏 CNN，该网络将每个点嵌入到低维空间中，使得来自不同语义类别的相邻点被相互推开。 \n> 这一训练阶段由 `model.training_partition_stage` 参数控制。\n\n##### 2. 训练语义模型\n\n```bash\npython src\u002Ftrain.py experiment=semantic\u002F\u003Cdataset>_ezsp datamodule.pretrained_cnn_ckpt_path=\u003Cpartition_ckpt_path>\n```\n\n请确保设置划分模型的检查点路径 `pretrained_cnn_ckpt_path`。为此，您可以按照步骤 1 的说明训练您自己的划分模型，或者使用我们提供的[预训练检查点](https:\u002F\u002Fzenodo.org\u002Frecords\u002F18329602)（名为 `ezsp_partition_\u003Cdataset>.ckpt`）。\n\n> **注意**: \n> `configs\u002Fexperiment\u002Fsemantic` 中以 `_ezsp` 结尾的实验会训练完整的 EZ-SP 模型用于语义分割。请注意，这些配置需要通过 `datamodule.pretrained_cnn_ckpt_path` 参数指定划分模型的检查点路径。预训练的划分模型用于在预处理阶段计算层次化的超点划分，而完整模型则在训练时基于该划分进行推理。\n\n\n### PyTorch Lightning `predict()`\nSPT 和 SuperCluster 均继承自 `LightningModule`，并实现了 `predict_step()` 方法，这使得可以使用\n[PyTorch Lightning 的 `Trainer.predict()` 机制](https:\u002F\u002Flightning.ai\u002Fdocs\u002Fpytorch\u002Fstable\u002Fdeploy\u002Fproduction_basic.html)。\n\n```python\nfrom src.loader import DataLoader  # 项目自带的 DataLoader（见项目结构中的 src\u002Floader）\nfrom src.models.semantic import SemanticSegmentationModule\nfrom pytorch_lightning import Trainer\n\n# 对语义分割任务，直接从 DataLoader 进行预测\ndataloader = DataLoader(...)\nmodel = SemanticSegmentationModule(...)\ntrainer = Trainer(...)\nbatch, output = trainer.predict(model=model, dataloaders=dataloader)\n```\n\n不过，这仍然需要您实例化一个 `Trainer`、一个 `DataLoader`，以及一个具有相关参数的模型。\n\n为简便起见，我们所有的数据集都继承自 `LightningDataModule`，并实现了 `predict_dataloader()` 方法，默认指向各自数据集的测试集。这使得可以直接将数据模块传递给\n[PyTorch Lightning 的 `Trainer.predict()`](https:\u002F\u002Flightning.ai\u002Fdocs\u002Fpytorch\u002Fstable\u002Fcommon\u002Ftrainer.html#predict)，\n而无需显式地实例化 `DataLoader`。\n\n```python\nfrom src.models.semantic import SemanticSegmentationModule\nfrom src.datamodules.s3dis import S3DISDataModule\nfrom pytorch_lightning import Trainer\n\n# 对 S3DIS 数据集上的语义分割任务进行预测\ndatamodule = S3DISDataModule(...)\nmodel = SemanticSegmentationModule(...)\ntrainer = Trainer(...)\nbatch, output = trainer.predict(model=model, datamodule=datamodule)\n```\n\n有关如何实例化这些组件以及我们模型输出格式的更多详细信息，我们强烈建议您尝试我们的\n[demo notebook](notebooks\u002Fdemo.ipynb)，并查看 [`src\u002Feval.py`](src\u002Feval.py) 脚本。\n\n### 全分辨率预测\n根据设计，我们的模型在训练过程中只需对 $P_1$ 分区级别的超点生成预测，所有损失和指标都是以超点为单位定义的。这在训练和评估时能够有效节省计算资源和内存。\n\n然而，在推理阶段，我们通常需要获得 $P_0$ 分区级别的**体素预测**，或对**全分辨率输入点云**的预测。为此，我们提供了辅助函数来恢复体素级和全分辨率的预测结果。\n\n有关这些内容的详细信息，请参阅我们的 [demo notebook](notebooks\u002Fdemo.ipynb)。\n\n### 在自定义数据上使用预训练模型\n如需在您自己的点云数据上运行预训练模型，请参考我们的教程[幻灯片](media\u002Fsuperpoint_transformer_tutorial.pdf)、[笔记本](notebooks\u002Fsuperpoint_transformer_tutorial.ipynb)以及[视频](https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=2qKhpQs9gJw)，下面还给出了一个最小代码示意。
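\n\n下面是一个把“预训练检查点 + `Trainer.predict()`”串起来的最小示意（仅为示意代码：`load_from_checkpoint` 是 PyTorch Lightning 的标准接口，`S3DISDataModule` 的具体参数请以您的数据集配置为准，省略号均为占位符）：\n\n```python\nfrom pytorch_lightning import Trainer\nfrom src.models.semantic import SemanticSegmentationModule\nfrom src.datamodules.s3dis import S3DISDataModule\n\n# 从检查点恢复预训练模型（load_from_checkpoint 为 Lightning 标准接口）\nmodel = SemanticSegmentationModule.load_from_checkpoint(\"\u002Fpath\u002Fto\u002Fyour\u002Fcheckpoint.ckpt\")\n\n# 数据模块的 predict_dataloader() 默认指向其测试集\ndatamodule = S3DISDataModule(...)  # 占位符：请填入与训练时一致的参数\n\ntrainer = Trainer(accelerator=\"gpu\", devices=1)\npredictions = trainer.predict(model=model, datamodule=datamodule)\n```\n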
### 在自定义数据上参数化超点划分\n我们的层次化超点划分是在预处理阶段计算的。其构建过程涉及多个步骤，这些步骤的参数化需要根据您的特定数据集和任务进行调整。请参考我们的教程[幻灯片](media\u002Fsuperpoint_transformer_tutorial.pdf)、[笔记本](notebooks\u002Fsuperpoint_transformer_tutorial.ipynb)以及[视频](https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=2qKhpQs9gJw)，以便更好地理解这一过程，并根据您的需求进行优化。\n\n### SuperCluster 图聚类的参数化\nSuperCluster 的一个特殊之处在于，该模型并非直接训练用于进行全景分割，而是用于预测超点图聚类问题的输入参数，而该问题的解即为全景分割结果。\n\n因此，针对这一图优化问题的超参数是在训练完成后，通过在训练集或验证集上进行网格搜索来选择的。我们发现，在所有数据集上，使用大致相同的超参数都能获得最佳性能（详见我们的[论文](https:\u002F\u002Farxiv.org\u002Fabs\u002F2401.06704)附录）。不过，您可能仍希望针对自己的数据集探索这些超参数。为此，请参阅我们的[演示笔记本](notebooks\u002Fdemo_panoptic_parametrization.ipynb)，以了解如何对全景分割进行参数化。\n\n### 笔记本与可视化\n我们提供了[笔记本](notebooks)，帮助您快速上手操作我们的核心数据结构、配置加载、数据集和模型实例化、各数据集上的推理以及可视化。\n\n特别地，我们开发了一个交互式可视化工具✨，可用于生成可分享的HTML文件。关于如何使用该工具的演示已在[笔记本](notebooks)中提供。此外，此类HTML文件的示例也已收录在[media\u002Fvisualizations.7z](media\u002Fvisualizations.7z)中。\n\n\u003Cbr>\n\n## 📚 文档\n\n| 位置                                               | 内容                                                                           |\n|:--------------------------------------------------|:-------------------------------------------------------------------------------|\n| [README](README.md)                               | 项目概述                                                                        |\n| [`docs\u002Fdata_structures`](docs\u002Fdata_structures.md) | 介绍本项目的核心数据结构：`Data`、`NAG`、`Cluster` 和 `InstanceData`            |\n| [`docs\u002Fdatasets`](docs\u002Fdatasets.md)               | 介绍我们实现的数据集、`BaseDataset` 类，以及如何创建您自己的继承自该类的数据集  |\n| [`docs\u002Flogging`](docs\u002Flogging.md)                 | 介绍日志记录及项目的 `logs\u002F` 目录结构                                           |\n| [`docs\u002Fvisualization`](docs\u002Fvisualization.md)     | 介绍我们的交互式3D可视化工具                                                    |\n\n> **注**：我们尽可能地为代码添加了**注释**，以使该项目易于使用。如果您在 `docs\u002F` 中未能找到所需答案，请务必**查看源代码和过往的问题**。即便如此，若您仍觉得某些部分不够清晰，或认为需要更多文档说明，请随时通过提交问题告知我们！\n\n\u003Cbr>\n\n## 👩‍🔧 故障排除\n以下是一些常见问题及其解决方法。\n\n### 在 11G 显存的 GPU 上运行 SPT 或 SuperCluster\n我们的默认配置是为 32G 显存的 GPU 设计的。然而，SPT 和 SuperCluster 也可以在**11G 显存的 GPU 💾**上运行，只是在时间和性能上会有一些细微差异。\n\n我们在 [`configs\u002Fexperiment\u002Fsemantic`](configs\u002Fexperiment\u002Fsemantic) 中提供了在**11G 显存的 GPU 💾**上训练 SPT 的配置：\n\n```bash\n# 在 S3DIS 第5折上训练 SPT\npython src\u002Ftrain.py experiment=semantic\u002Fs3dis_11g datamodule.fold=5\n\n# 在 KITTI-360 验证集上训练 SPT\npython src\u002Ftrain.py experiment=semantic\u002Fkitti360_11g\n\n# 在 DALES 数据集上训练 SPT\npython src\u002Ftrain.py experiment=semantic\u002Fdales_11g\n```\n\n同样地，我们在 [`configs\u002Fexperiment\u002Fpanoptic`](configs\u002Fexperiment\u002Fpanoptic) 中提供了在**11G 显存的 GPU 💾**上训练 SuperCluster 的配置：\n\n```bash\n# 在 S3DIS 第5折上训练 SuperCluster\npython src\u002Ftrain.py experiment=panoptic\u002Fs3dis_11g datamodule.fold=5\n\n# 将 {墙、地板、天花板} 视作“stuff”类别，在 S3DIS 第5折上训练 SuperCluster\npython src\u002Ftrain.py experiment=panoptic\u002Fs3dis_with_stuff_11g datamodule.fold=5\n\n# 在 ScanNet 验证集上训练 SuperCluster\npython src\u002Ftrain.py experiment=panoptic\u002Fscannet_11g\n\n# 在 KITTI-360 验证集上训练 SuperCluster\npython src\u002Ftrain.py experiment=panoptic\u002Fkitti360_11g\n\n# 在 DALES 数据集上训练 SuperCluster\npython src\u002Ftrain.py experiment=panoptic\u002Fdales_11g\n```\n\n### CUDA 内存不足错误\n遇到一些 CUDA OOM 错误 💀💾 吗？以下是一些可以根据错误发生时机调整的参数，以减少 GPU 内存占用。
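\n\n在深入下方的参数表之前，先看一个通过 Hydra 覆盖组合使用这些参数的示意命令（各取值仅为占位示例，并非推荐配置，请结合您的 GPU 显存与数据集自行调整）：\n\n```bash\n# 示意：训练期发生 OOM 时，可组合下表中的若干参数以降低显存占用（取值仅为示例）\npython src\u002Ftrain.py experiment=semantic\u002Fs3dis datamodule.fold=5 datamodule.max_num_nodes=20000 datamodule.max_num_edges=100000 datamodule.sample_graph_k=2\n```\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>影响 CUDA 内存的参数。\u003C\u002Fb>\u003C\u002Fsummary>\n\n**图例**: 🟡 预处理 | 🔴 训练 | 🟣 推理（包括训练过程中的验证和测试）\n\n| 参数                  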
                 | 描述                                                                                                                                                                                                                        |  发生阶段  |\n|:--------------------------------------------|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:----------:|\n| `datamodule.xy_tiling`                      | 根据规则的 XY 网格，将数据集瓦片分割成 xy_tiling^2 个更小的瓦片。适合 DALES 那样的理想正方形瓦片。请注意，这会影响训练步数。                                                         |  🟡🟣  |\n| `datamodule.pc_tiling`                      | 根据数据点的主要成分，将数据集瓦片分割成 2^pc_tiling 个更小的瓦片。适合 S3DIS 和 KITTI-360 那样的形状不一的瓦片。请注意，这会影响训练步数。                             |  🟡🟣  |\n| `datamodule.max_num_nodes`                  | 限制 **训练批次** 中 $P_1$ 分割节点\u002F超点的数量。                                                                                                                                                |   🔴   |\n| `datamodule.max_num_edges`                  | 限制 **训练批次** 中 $P_1$ 分割边的数量。                                                                                                                                                            |   🔴   |\n| `datamodule.voxel`                          | 增大体素大小会缩短预处理、训练和推理时间，但会降低性能。                                                                                                                                         | 🟡🔴🟣 |\n| `datamodule.pcp_regularization`             | 分割层级的正则化项。值越大，超点数量越少。                                                                                                                                                        | 🟡🔴🟣 |\n| `datamodule.pcp_spatial_weight`             | 超点分割中 3D 位置的重要性。值越小，超点数量越少。                                                                                                                                            | 🟡🔴🟣 |\n| `datamodule.pcp_cutoff`                     | 超点的最小尺寸。值越大，超点数量越少。                                                                                                                                                                    | 🟡🔴🟣 |\n| `datamodule.graph_k_max`                    | 超点图中每个节点的最大邻接节点数。值越小，超边数量越少。                                                                                                                                  | 🟡🔴🟣 |\n| `datamodule.graph_gap`                      | 超点图中相邻超点之间的最大距离。值越小，超边数量越少。                                                                                                                    | 🟡🔴🟣 |\n| `datamodule.graph_chunk`                    | 在 `RadiusHorizontalGraph` 预处理超点图时，减小该值以避免内存不足。                                                                                                                                                |   🟡   |\n| `datamodule.dataloader.batch_size`          | 控制加载的瓦片数量。每个 **训练批次** 由 `batch_size`*`datamodule.sample_graph_k` 次球面采样组成。推理则在 **整个验证和测试瓦片** 上进行，不进行球面采样。 |  🔴🟣  |\n| `datamodule.sample_segment_ratio`           | 在每个分割层级随机丢弃一部分超点。                                                                                                                                                              |   🔴   |\n| `datamodule.sample_graph_k`                 | 控制 **训练批次** 中的球面采样次数。                                                                                                                                   
                              |   🔴   |\n| `datamodule.sample_graph_r`                 | 控制 **训练批次** 中球面采样的半径。设置为 `sample_graph_r\u003C=0` 可以不进行球面采样，直接使用整个瓦片。                                                                                   |   🔴   |\n| `datamodule.sample_point_min`               | 控制 **训练批次** 中每个超点采样的 $P_0$ 点的最小数量。                                                                                                                                       |   🔴   |\n| `datamodule.sample_point_max`               | 控制 **训练批次** 中每个超点采样的 $P_0$ 点的最大数量。                                                                                                                                       |   🔴   |\n| `callbacks.gradient_accumulator.scheduling` | 梯度累积。可用于以较小的批次进行训练，从而增加训练步数。                                                                                                                                        |   🔴   |\n\n\u003Cbr>\n\u003C\u002Fdetails>\n\n\u003Cbr>\n\n## 💬 引用我们的工作\n如果您在工作中使用了本代码的全部或部分内容，请务必包含以下引用（请保留论文的英文原题，以便他人检索）：\n\n```\n@article{robert2023spt,\n  title={Efficient 3D Semantic Segmentation with Superpoint Transformer},\n  author={Robert, Damien and Raguet, Hugo and Landrieu, Loic},\n  journal={Proceedings of the IEEE\u002FCVF International Conference on Computer Vision},\n  year={2023}\n}\n\n@article{robert2024scalable,\n  title={Scalable 3D Panoptic Segmentation As Superpoint Graph Clustering},\n  author={Robert, Damien and Raguet, Hugo and Landrieu, Loic},\n  journal={Proceedings of the IEEE International Conference on 3D Vision},\n  year={2024}\n}\n\n@article{geist2025ezsp,\n  title={EZ-SP: Fast and Lightweight Superpoint-based 3D Segmentation},\n  author={Geist, Louis and Landrieu, Loic and Robert, Damien},\n  journal={arXiv},\n  year={2025},\n}\n```\n\n📄 您可以在arXiv上找到我们的论文：\n- [SPT](https:\u002F\u002Farxiv.org\u002Fabs\u002F2306.08045)\n- [SuperCluster](https:\u002F\u002Farxiv.org\u002Fabs\u002F2401.06704)\n- [EZ-SP](https:\u002F\u002Farxiv.org\u002Fabs\u002F2512.00385)\n\n另外，**如果您喜欢或只是使用了这个项目，请别忘了给仓库点个⭐，这对我们来说意义重大！**\n\n\u003Cbr>\n\n## 💳 致谢\n- 本项目基于[Lightning-Hydra模板](https:\u002F\u002Fgithub.com\u002Fashleve\u002Flightning-hydra-template)构建。\n- 本工作的主要数据结构依赖于[PyTorch Geometric](https:\u002F\u002Fgithub.com\u002Fpyg-team\u002Fpytorch_geometric)。\n- 部分点云操作受到[Torch-Points3D框架](https:\u002F\u002Fgithub.com\u002Fnicolas-chaulet\u002Ftorch-points3d)的启发，尽管目前尚未与官方项目合并。\n- 对于KITTI-360数据集，部分代码源自官方[KITTI-360](https:\u002F\u002Fgithub.com\u002Fautonomousvision\u002Fkitti360Scripts)项目。\n- 一些与超点图相关的操作灵感来自[Superpoint Graph](https:\u002F\u002Fgithub.com\u002Floicland\u002Fsuperpoint_graph)。\n- 在SPT和SuperCluster中，我们使用了[Parallel Cut-Pursuit](https:\u002F\u002Fgitlab.com\u002F1a7r0ch3\u002Fparallel-cut-pursuit)来计算层次化的超点划分和图聚类。需要注意的是，在EZ-SP中，这一步已被我们基于GPU的算法所取代。\n\n本项目得到了[**Romain Janvier**](https:\u002F\u002Fgithub.com\u002Frjanvier)的大力支持。此次合作得益于[3DFin](https:\u002F\u002Fgithub.com\u002F3DFin)项目。**3DFin**项目由英国斯旺西大学野火研究中心与西班牙CSIC生物多样性研究所、西班牙奥维耶多大学采矿工程系共同开发。该项目获得了英国NERC项目（NE\u002FT001194\u002F1）——“推进用于野火行为及风险缓解建模的三维燃料测绘”——以及西班牙知识创造项目（PID2021-126790NB-I00）——“应用人工智能处理三维地面点云以提升野火碳排放估算水平”的资助。\n\n此外，本项目还受益于**[Louis Geist](https:\u002F\u002Flouisgeist.github.io)**的贡献，其工作得到了[PEPR IA SHARP](https:\u002F\u002Fwww.pepr-ia.fr\u002Fen\u002Fprojet\u002Fsharp-english\u002F)项目的资助。","# Superpoint Transformer 快速上手指南\n\nSuperpoint Transformer (SPT) 及其衍生模型（SuperCluster、EZ-SP）是用于大规模 3D 场景语义分割和全景分割的高效架构。本指南将帮助您快速搭建环境并运行基础示例。\n\n## 🛠️ 环境准备\n\n在开始之前，请确保您的系统满足以下要求：\n\n*   **操作系统**: Linux\n*   **Python**: 3.8 或更高版本\n*   **GPU**: 支持 CUDA 的 NVIDIA 显卡（官方配置在 11G 及以上显存的显卡上测试通过）\n*   **编译器**: GCC\u002FG++（用于编译部分依赖）\n\n### 前置依赖\n本项目基于 PyTorch Lightning 和 Hydra 构建，官方通过 conda 管理环境（在 conda 23.3.1 下测试通过）。请先安装 [Miniconda 或 Anaconda](https:\u002F\u002Fdocs.conda.io)。\n\n## 📦 安装步骤\n\n### 1. 克隆仓库\n\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002Fdrprojects\u002Fsuperpoint_transformer.git\ncd superpoint_transformer\n```\n\n### 2. 运行安装脚本\n项目自带的 [`install.sh`](install.sh) 会创建一个名为 `spt` 的 conda 环境，并安装包括 PyTorch 2.2+、PyTorch Lightning 2.2+ 和 Hydra 1.3 在内的全部依赖：\n\n```bash\n# 创建名为 'spt' 的 conda 环境并安装依赖\n.\u002Finstall.sh\n\n# 可选：同时安装 TorchSparse（EZ-SP 的稀疏卷积骨干需要）\n.\u002Finstall.sh with_torchsparse\n```\n\n> **国内加速提示**：如果依赖下载缓慢，可以为 pip\u002Fconda 配置国内镜像源（如清华 TUNA）；CUDA 与 PyTorch 的版本对应关系请以 [PyTorch 官网](https:\u002F\u002Fpytorch.org\u002Fget-started\u002Flocally\u002F) 为准。
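\n\n安装完成后，可以先做一个简单的环境自检（示意命令，非项目自带脚本）：\n\n```bash\n# 激活环境，确认 PyTorch 可用且能访问 GPU\nconda activate spt\npython -c \"import torch; print(torch.__version__, torch.cuda.is_available())\"\n```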
\n\n## 🚀 基本使用\n\n本项目使用 **Hydra** 进行配置管理，所有实验均通过命令行参数启动。以下以 S3DIS 数据集上的 SPT 语义分割为例。\n\n### 1. 数据准备\n请按照[数据集页面](docs\u002Fdatasets.md)设置数据路径与目录结构。通常无需单独运行预处理脚本：首次训练或评估时，数据集会自动完成预处理（包括体素化与层次化超点划分的构建）。\n\n> **注意**：EZ-SP 的超点划分依赖一个预训练的划分模型，请参考 README 中 EZ-SP 部分的两阶段训练说明。\n\n### 2. 开始训练\n使用默认配置在单张 GPU 上启动训练：\n\n```bash\n# 在 S3DIS 第 5 折上训练 SPT\npython src\u002Ftrain.py experiment=semantic\u002Fs3dis datamodule.fold=5\n\n# 显存只有 11G？请改用对应的 _11g 配置\npython src\u002Ftrain.py experiment=semantic\u002Fs3dis_11g datamodule.fold=5\n```\n\n### 3. 推理与评估\n训练完成后，使用保存的检查点进行评估（检查点默认记录在 `logs\u002Ftrain\u002Fruns\u002F\u003Crun_dir>\u002Fcheckpoints\u002F` 下）：\n\n```bash\npython src\u002Feval.py experiment=semantic\u002Fs3dis datamodule.fold=5 ckpt_path=\u002Fpath\u002Fto\u002Fyour\u002Fcheckpoint.ckpt\n```\n\n### 💡 核心配置说明\n*   `experiment`: 选择实验配置（`semantic\u002F*` 对应 SPT，`panoptic\u002F*` 对应 SuperCluster，以 `_ezsp` 结尾的配置对应 EZ-SP）。\n*   `datamodule`: 选择数据集（`s3dis`、`scannet`、`kitti360`、`dales` 等）。\n*   `trainer`: 配置训练参数（如 GPU 数量、精度 `precision=16-mixed` 等）。\n\n您可以查看 `configs\u002F` 目录下的 YAML 文件来修改网络结构、超参数和数据增强策略。","某自动驾驶感知团队正在处理城市街道的海量激光雷达点云数据，需要实时识别道路、车辆、行人及交通设施以构建高精地图。\n\n### 没有 superpoint_transformer 时\n- **计算资源爆炸**：直接对数亿个原始点进行逐点特征提取，显存占用极高，普通工作站无法加载完整街区数据，必须强行切割导致上下文丢失。\n- **推理速度缓慢**：传统稀疏卷积网络在处理复杂场景时延迟过高，无法满足自动驾驶系统对实时性的严苛要求，帧率远低于安全阈值。\n- **小目标识别差**：由于缺乏多尺度关联机制，路灯、交通标志等细小物体常被误判为背景噪声，语义分割的边界模糊不清。\n- **工程部署困难**：模型结构复杂且难以优化，从训练环境迁移到车载边缘设备时，量化精度损失严重，调试周期长达数周。\n\n### 使用 superpoint_transformer 后\n- **内存效率飞跃**：利用分层超点（Superpoint）结构将原始点云聚合，显存占用降低一个数量级，轻松在单卡上处理公里级连续场景。\n- **实时性能达标**：基于超点的自注意力机制大幅减少计算量，推理速度提升数倍，在保持高精度的同时满足车载系统的实时帧率需求。\n- **细节捕捉精准**：通过多尺度超点图聚类，有效捕捉局部几何特征与全局语义关系，显著提升了细小交通设施的识别准确率与边界清晰度。\n- **落地流程顺畅**：依托 PyTorch Lightning 和 Hydra 的模块化设计，模型训练与导出流程标准化，快速完成从算法验证到嵌入式端的高效部署。\n\nsuperpoint_transformer 通过将海量点云转化为高效的超点图结构，彻底解决了大规模 3D 场景语义分割中精度与速度的矛盾，让高精地图构建真正具备工程落地价值。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fdrprojects_superpoint_transformer_a3abb423.png","drprojects","Damien ROBERT","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002Fdrprojects_8dd2b411.png","Postdoctoral researcher at @ecovision-uzh. 
Deep learning for environmental applications.","University of Zurich","Switzerland","damien.robert@uzh.ch",null,"drprojects.github.io","https:\u002F\u002Fgithub.com\u002Fdrprojects",[84,88,92],{"name":85,"color":86,"percentage":87},"Python","#3572A5",92.4,{"name":89,"color":90,"percentage":91},"Jupyter Notebook","#DA5B0B",7.3,{"name":93,"color":94,"percentage":95},"Shell","#89e051",0.3,977,131,"2026-04-04T17:14:44","MIT",4,"未说明","必需 NVIDIA GPU（文中多次提及单卡训练和推理），具体型号和显存未说明，需支持 PyTorch 2.2+",{"notes":104,"python":105,"dependencies":106},"该工具包含三个主要模型：SPT（语义分割）、SuperCluster（全景分割）和 EZ-SP（快速轻量分割）。其中 EZ-SP 引入了基于 GPU 加速的可学习图聚类算法。代码库近期有重大更新（2025 年 11 月），新版本与旧版本不向后兼容。训练效率极高，例如 S3DIS 数据集可在单张 GPU 上 3-4 小时内完成训练。若仅需使用 EZ-SP 中的图连通分量和分区算法，可参考独立的 torch-graph-components 库。","3.8+",[107,108,109],"PyTorch>=2.2","PyTorch Lightning>=2.2","Hydra>=1.3",[35,15,111,14],"其他",[113,114,115,116,117,118,119,120,121,122,123,124,125,126,127,128,129],"3d","deep-learning","efficient","fast","hierarchical","lightweight","partition","point-cloud","pytorch","semantic-segmentation","superpoint","transformer","graph-clustering","panoptic-segmentation","partitioning","3dv2024","iccv2023","2026-03-27T02:49:30.150509","2026-04-07T22:49:53.677129",[133,138,143,148,152,156],{"id":134,"question_zh":135,"answer_zh":136,"source_url":137},23041,"如何将模型预测结果保存为可可视化的点云文件（如 PLY）并恢复全分辨率位置？","默认预处理包含体素化步骤（GridSampling3D），会导致点数减少。若要获取全分辨率输入点云的预测结果，请参考 README 中的\"Full-resolution predictions\"部分。代码库提供了辅助函数来恢复体素级和全分辨率的预测。具体实现可参考 `SemanticSegmentationOutput.full_res_semantic_pred()` 的逻辑：利用 `super_index` 层级映射，将超点预测值映射回原始点云。例如，通过 `nag[1].super_index[super_index_level0_to_level1][super_index_raw_to_level0]` 进行索引映射以匹配原始点数。","https:\u002F\u002Fgithub.com\u002Fdrprojects\u002Fsuperpoint_transformer\u002Fissues\u002F78",{"id":139,"question_zh":140,"answer_zh":141,"source_url":142},23042,"如何调整参数以生成更大的超点（Superpoints）？","主要调整以下参数来控制超点的大小和分割效果：\n1. `pcp_regularization`、`pcp_spatial_weight`、`pcp_cutoff`：这是控制 Cut-Pursuit 分区算法的核心参数，请查阅 `CutPursuitPartition` 的文档字符串了解具体影响。\n2. `partition_hf`：调整用于构建超点的点特征。\n3. `pcp_w_adjacency`：控制基于边长切割边的难度。\n4. `knn`：调整计算局部几何特征时使用的邻居数量。\n注意：SPT 使用了比 SPG 更快的并行 Cut-Pursuit 实现，API 可能有所变化，需重新调整参数。","https:\u002F\u002Fgithub.com\u002Fdrprojects\u002Fsuperpoint_transformer\u002Fissues\u002F134",{"id":144,"question_zh":145,"answer_zh":146,"source_url":147},23043,"如何保存中间层的超点数据（例如 Partition L2）以便后续使用？","虽然项目中没有直接保存中间超点的简洁脚本，但可以通过修改代码逻辑实现。核心思路是参考 `SemanticSegmentationOutput.full_res_semantic_pred()` 的实现，提取 `nag` 对象中的 `super_index`。你需要构建从原始点到目标层级超点的索引映射，例如使用 `nag[1].super_index` 结合层级间的索引关系，将数据整理后保存为 `.pth` 文件。","https:\u002F\u002Fgithub.com\u002Fdrprojects\u002Fsuperpoint_transformer\u002Fissues\u002F109",{"id":149,"question_zh":150,"answer_zh":151,"source_url":147},23044,"如何在配置中关闭分块（Tiling）和体素化（Voxelization）？","在 datamodule 配置文件中进行如下修改：\n1. **关闭分块**：将 `xy_tiling` 和 `pc_tiling` 设置为 `null`。\n   ```yaml\n   xy_tiling: null\n   pc_tiling: null\n   ```\n2. **关闭体素化**：由于完全移除 `GridSampling3D` 可能引发问题，建议将 `voxel` 大小设置为一个极小值（如 0.001），这样实际上不会移除任何点，从而达到保留全分辨率的效果。\n   ```yaml\n   voxel: 0.001\n   ```",{"id":153,"question_zh":154,"answer_zh":155,"source_url":142},23045,"如何使用在 DALES 数据集上训练的模型推理自己的点云数据？","模型输入必须与训练数据一致。DALES 数据集包含 xyz 坐标和强度（intensity）信息。如果你的点云只有 xyz 而没有强度：\n1. **方案一（推荐）**：修改配置文件，在训练时不使用强度特征重新训练模型。\n2. 
**方案二**：如果强行使用现有模型，必须确保输入数据格式匹配。将你的点云文件放入 `data\u002Fdales\u002Fraw\u002Ftest\u002F` 目录，并修改 `dales_config.py` 中 `TILES` 字典的 'test' 项指向你的文件名。注意：测试时不需要语义标签和实例标签（这些是 GT），但必须有强度通道，否则输入不匹配会导致结果错误。",{"id":157,"question_zh":158,"answer_zh":159,"source_url":160},23046,"为什么预处理后的点云数量（nag[0].num_points）少于原始点云数量？","这是因为预处理流程中默认包含了体素化（Voxelization）步骤（通过 `GridSampling3D` 实现），该步骤会对点进行下采样。如果你希望预处理后的点数尽可能接近原始点数，可以在配置文件中将 `voxel` 参数设置得非常小（例如 `0.001`）。这样可以保留几乎所有原始点，使得 `nag[0]` 的点数与原始点云几乎一致，从而方便后续将预测结果映射回原始点云。","https:\u002F\u002Fgithub.com\u002Fdrprojects\u002Fsuperpoint_transformer\u002Fissues\u002F136",[]]