INF-SLiM: Large-scale Implicit Neural Fields for Semantic LiDAR Mapping
of Embodied AI Agents
Abstract
Large-scale semantic mapping is crucial for outdoor autonomous agents to
perform high-level tasks such as planning and navigation. In this paper,
we propose a novel method for large-scale 3D semantic reconstruction
through implicit representations from posed LiDAR measurements alone. We
first leverage an octree-based hierarchical structure to store implicit
features, which are then decoded into signed distance values and semantic
information through shallow Multilayer Perceptrons (MLPs). A radial
window self-attention network is employed to predict the semantic labels
of the point clouds. We then jointly optimize the feature
embedding and MLP parameters with a self-supervision paradigm for
point-cloud geometry and a pseudo-supervision paradigm for semantic and
panoptic labels. At inference time, geometric structures and object
categories are regressed for novel points in unseen areas, and the
marching cubes method is applied to subdivide and visualize the
reconstructed scenes. Experiments on two real-world datasets,
SemanticKITTI and SemanticPOSS, demonstrate the superior segmentation
efficiency and mapping effectiveness of our framework compared to
current state-of-the-art 3D semantic LiDAR mapping methods.
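
To make the decoding step concrete, the following minimal PyTorch sketch shows shallow MLP heads that map octree-interpolated feature embeddings to a signed distance value and per-class semantic logits. Module names, dimensions, and the feature-interpolation interface are illustrative assumptions, not the paper's actual implementation.

# Minimal sketch (assumed structure): shallow MLP decoders mapping
# interpolated octree feature embeddings to an SDF value and semantic logits.
import torch
import torch.nn as nn

class ImplicitDecoder(nn.Module):
    def __init__(self, feat_dim: int = 32, hidden_dim: int = 64, num_classes: int = 20):
        super().__init__()
        # Shallow geometry head: feature embedding -> signed distance value.
        self.sdf_head = nn.Sequential(
            nn.Linear(feat_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, 1),
        )
        # Shallow semantic head: feature embedding -> class logits.
        self.sem_head = nn.Sequential(
            nn.Linear(feat_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, num_classes),
        )

    def forward(self, feats: torch.Tensor):
        # feats: (N, feat_dim) features interpolated from octree nodes at query points
        sdf = self.sdf_head(feats).squeeze(-1)   # (N,) signed distance values
        sem_logits = self.sem_head(feats)        # (N, num_classes) semantic logits
        return sdf, sem_logits

# Usage: decode features sampled for a batch of query points.
decoder = ImplicitDecoder()
query_feats = torch.randn(1024, 32)              # placeholder interpolated features
sdf, sem_logits = decoder(query_feats)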