We provide code for reproducing the results of two papers:

- **SoftGroup for 3D Instance Segmentation on Point Clouds**
  Thang Vu, Kookhoi Kim, Tung M. Luu, Thanh Nguyen, and Chang D. Yoo.
  CVPR 2022 (Oral).
- **Scalable SoftGroup for 3D Instance Segmentation on Point Clouds**
  Thang Vu, Kookhoi Kim, Tung M. Luu, Thanh Nguyen, Junyeong Kim, and Chang D. Yoo.
  arXiv preprint, 2022.
Existing state-of-the-art 3D instance segmentation methods perform semantic segmentation followed by grouping. Hard predictions are made during semantic segmentation such that each point is associated with a single class. However, errors from these hard decisions propagate into grouping, resulting in (1) low overlap between predicted instances and the ground truth and (2) substantial false positives. To address these problems, this paper proposes a 3D instance segmentation method referred to as SoftGroup, which performs bottom-up soft grouping followed by top-down refinement. SoftGroup allows each point to be associated with multiple classes to mitigate the problems stemming from semantic prediction errors, and suppresses false-positive instances by learning to categorize them as background. Experimental results on different datasets and multiple evaluation metrics demonstrate the efficacy of SoftGroup. In terms of AP_50, it surpasses the strongest prior method by a significant margin of +6.2% on the ScanNet v2 hidden test set and +6.8% on S3DIS Area 5.
Please refer to the installation guide.
Please refer to the data preparation guide.
Dataset | Model | AP | AP_50 | AP_25 | Download |
---|---|---|---|---|---|
S3DIS | SoftGroup | 51.4 | 66.5 | 75.4 | model |
S3DIS | SoftGroup++ | 50.9 | 67.8 | 76.0 | model |
ScanNet v2 | SoftGroup | 45.8 | 67.4 | 79.1 | model |
ScanNet v2 | SoftGroup++ | 45.9 | 67.9 | 79.4 | above |
STPLS3D | SoftGroup | 47.3 | 63.1 | 71.4 | model |
STPLS3D | SoftGroup++ | 46.5 | 62.9 | 71.8 | above |
NOTE: SoftGroup and SoftGroup++ can use the same trained model for inference on ScanNet v2 and STPLS3D.
Dataset | PQ | Config | Model |
---|---|---|---|
SemanticKITTI | 60.2 | config | model |
We use the HAIS checkpoint as the pretrained backbone and have already converted it to work with spconv 2.x. Download the pretrained HAIS-spconv2 model and put it in the SoftGroup/ directory.

Converted HAIS checkpoint: model
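A minimal sketch of fetching and placing the checkpoint; the filename hais_ckpt_spconv2.pth and the URL placeholder are assumptions, so use the "model" link above.

# Place the converted HAIS checkpoint at the repository root.
# Filename and URL here are assumptions; download via the link above.
cd SoftGroup
wget -O hais_ckpt_spconv2.pth <url-of-converted-hais-checkpoint>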
Note that, for a fair comparison with the implementation in the STPLS3D paper, we train SoftGroup on this dataset from scratch, without the pretrained backbone.
The default configs assume training on 4 GPUs. If you train with fewer GPUs, reduce the learning rate linearly with the number of GPUs, as sketched below.
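A minimal sketch for a 2-GPU run with the learning rate halved; the `lr` key name and the 0.004 default used here are assumptions, so edit the pattern to match your config.

# Linear learning-rate scaling: 2 GPUs instead of the default 4 (hypothetical values).
sed -i 's/lr: 0.004/lr: 0.002/' configs/softgroup_scannet.yaml
./tools/dist_train.sh configs/softgroup_scannet.yaml 2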
First, finetune the pretrained HAIS point-wise prediction network (the backbone) on S3DIS:
./tools/dist_train.sh configs/softgroup_s3dis_backbone_fold5.yaml 4
Then, train the model with the backbone frozen:
./tools/dist_train.sh configs/softgroup_s3dis_fold5.yaml 4
Training on ScanNet does not require finetuning the backbone. Just freeze the pretrained backbone and train the model:
./tools/dist_train.sh configs/softgroup_scannet.yaml 4
For STPLS3D, first train the backbone from scratch, then train the model:
./tools/dist_train.sh configs/softgroup_stpls3d_backbone.yaml 4
./tools/dist_train.sh configs/softgroup_stpls3d.yaml 4
./tools/dist_test.sh $CONFIG_FILE $CHECKPOINT $NUM_GPU
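For instance, a ScanNet v2 evaluation on 4 GPUs might look like the following; the checkpoint path is a placeholder, so point it at your trained model.

# Example invocation (hypothetical checkpoint path).
./tools/dist_test.sh configs/softgroup_scannet.yaml work_dirs/softgroup_scannet/latest.pth 4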
For inference on the ScanNet test split, change `prefix` to `test` and `with_label` to `False` in the config before running inference; a sketch of this edit follows.
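This sketch assumes the current values are `val` and `True`; adjust the patterns to match your config file.

# Switch the data split and disable labels for the hidden test set (assumed current values).
sed -i 's/prefix: val/prefix: test/' configs/softgroup_scannet.yaml
sed -i 's/with_label: True/with_label: False/' configs/softgroup_scannet.yaml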
We provide a script to evaluate detection performance on axis-aligned boxes derived from predicted and ground-truth instances. First, set `save_instance` to `True` in the config file, then run inference followed by the detection evaluation:
CUDA_VISIBLE_DEVICES=0 python test.py --config config/softgroup_default_scannet.yaml --pretrain $PATH_TO_PRETRAIN_MODEL$
python eval_det.py
Please refer to the visualization guide for visualizing ScanNet and S3DIS results.
Please refer to the custom dataset guide.
If you find our work helpful for your research, please consider citing our paper:
@inproceedings{vu2022softgroup,
title={SoftGroup for 3D Instance Segmentation on Point Clouds},
author={Vu, Thang and Kim, Kookhoi and Luu, Tung M. and Nguyen, Xuan Thanh and Yoo, Chang D.},
booktitle={CVPR},
year={2022}
}
Code is built upon HAIS, PointGroup, and spconv.
This work was partly supported by an Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 2021-0-01381, Development of Causal AI through Video Understanding) and partly supported by an IITP grant funded by the Korea government (MSIT) (No. 2019-0-01371, Development of Brain-Inspired AI with Human-like Intelligence).