Code for our project "Multimodal Range View Based Semantic Segmentation", developed for the course "Deep Learning for 3D Perception" at the Technical University of Munich under the supervision of Prof. Angela Dai.
Download SemanticKITTI from the official website.
-
Lidar backbone with Range Augmentations (RA):
- 512 x 64 range-view (RV) resolution:
python train.py -d /path/to/SemanticKITTI/dataset -ac config/arch/cenet_512.yml -n cenet_512_RA
- 1024 x 64 RV resolution (retrained from the 512 x 64 checkpoint, as the CENet authors recommend):
python train.py -d /path/to/SemanticKITTI/dataset -ac config/arch/cenet_1024.yml -p /path/to/cenet_512_RA -n cenet_1024_RA
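Range-view models like CENet operate on a spherical projection of the LiDAR point cloud rather than on raw 3D points. As an illustration only (not this repo's actual preprocessing code), a minimal sketch of such a projection, assuming the +3° to -25° vertical field of view of the Velodyne HDL-64E sensor used to record SemanticKITTI:

```python
import numpy as np

def range_projection(points, H=64, W=512, fov_up=3.0, fov_down=-25.0):
    """Project a LiDAR point cloud (N, 3) onto an H x W range image.

    Standard spherical projection used by range-view methods; the
    default field of view matches the Velodyne HDL-64E (SemanticKITTI).
    """
    fov_up_rad = np.radians(fov_up)
    fov_down_rad = np.radians(fov_down)
    fov = fov_up_rad - fov_down_rad

    depth = np.linalg.norm(points, axis=1)
    x, y, z = points[:, 0], points[:, 1], points[:, 2]

    yaw = -np.arctan2(y, x)                       # horizontal angle
    pitch = np.arcsin(np.clip(z / depth, -1, 1))  # vertical angle

    # normalize angles to [0, 1] and scale to image coordinates
    u = 0.5 * (yaw / np.pi + 1.0) * W
    v = (1.0 - (pitch - fov_down_rad) / fov) * H

    u = np.clip(np.floor(u), 0, W - 1).astype(np.int32)
    v = np.clip(np.floor(v), 0, H - 1).astype(np.int32)

    # -1 marks pixels with no returned point
    range_image = np.full((H, W), -1.0, dtype=np.float32)
    range_image[v, u] = depth
    return range_image
```

The 1024 x 64 configuration uses the same projection with W=1024, halving the horizontal angle covered per pixel.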
-
RGB backbone fine-tuning on SemanticKITTI dataset with range-view labels:
- for usage with 512 x 64 model:
python train.py -d /path/to/SemanticKITTI/dataset -ac config/arch/mask2former_512.yml -n mask2former_512
- for usage with 1024 x 64 model:
python train.py -d /path/to/SemanticKITTI/dataset -ac config/arch/mask2former_1024.yml -n mask2former_1024
-
Fusion Model:
- 512 x 64 range-view (RV) resolution:
python train.py -d /path/to/SemanticKITTI/dataset -ac config/arch/fusion_512.yml -n fusion_512
- 1024 x 64 RV resolution:
python train.py -d /path/to/SemanticKITTI/dataset -ac config/arch/fusion_1024.yml -n fusion_1024
-
Infer:
python infer.py -d /path/to/SemanticKITTI/dataset -l /path/to/save/predictions/in -m /path/to/trained_model
-
Evaluation:
- Lidar and fusion models:
python evaluate_iou.py -d /path/to/SemanticKITTI/dataset -p /path/to/predictions
- RGB models:
python evaluate_iou_rgb.py -d /path/to/SemanticKITTI/dataset -p /path/to/predictions
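The evaluation scripts report mean IoU over the SemanticKITTI classes. A simplified sketch of the metric itself (not the scripts' actual implementation), assuming integer class ids with 0 as the ignored "unlabeled" class:

```python
import numpy as np

def miou(pred, gt, num_classes, ignore=0):
    """Mean intersection-over-union from flat prediction/label arrays."""
    # confusion matrix: rows = ground truth, columns = prediction
    conf = np.bincount(num_classes * gt + pred, minlength=num_classes ** 2)
    conf = conf.reshape(num_classes, num_classes)

    tp = np.diag(conf).astype(np.float64)
    fp = conf.sum(axis=0) - tp   # predicted as class c but wrong
    fn = conf.sum(axis=1) - tp   # class c missed by the prediction
    denom = tp + fp + fn

    # skip the ignored class and classes absent from both pred and gt
    valid = np.arange(num_classes) != ignore
    valid &= denom > 0
    iou = tp[valid] / denom[valid]
    return iou.mean() if valid.any() else 0.0
```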
-
Visualize GT:
python visualize.py -w kitti -d /path/to/SemanticKITTI/dataset -s which_sequences
-
Visualize Predictions:
python visualize.py -w kitti -d /path/to/SemanticKITTI/dataset -p /path/to/predictions -s which_sequences
Our pre-trained models can be found here.
Our codebase originates from CENet. For the fusion model we use code from SwinFusion, and we follow the Hugging Face implementation of Mask2Former as the RGB backbone. For initialization, we use Mask2Former weights pre-trained for semantic segmentation on the Cityscapes dataset.