Unreached potentials of RGB-D segmentation

Abstract

It is commonly believed that image recognition based on RGB improves when using RGB-D, i.e., when depth information (distance from the camera) is added. Adding depth should make models more robust to variations in color and lighting, help them recognize shape and spatial relationships, and allow them to ignore irrelevant backgrounds. In this paper we investigate how robust current RGB-D models truly are to changes in appearance, depth, and background: we vary one modality (RGB or depth) at a time and compare RGB-D models to RGB-only and depth-only models in a semantic segmentation setting. Experiments show that all investigated RGB-D models are somewhat robust to variations in color, but can fail severely for unseen variations in lighting, spatial position, and background. Our results show that we need new RGB-D models that can exploit the best of both modalities while remaining robust to changes in a single modality.
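To make the evaluation protocol concrete, below is a minimal sketch (not from the paper) of the single-modality ablation the abstract describes: keep one modality fixed, perturb or drop the other, and compare segmentation quality via mean IoU. The `model`, the `darken` perturbation, the class count, and the usage names are hypothetical placeholders.

```python
import numpy as np

NUM_CLASSES = 13  # assumed number of semantic classes


def mean_iou(pred: np.ndarray, target: np.ndarray, num_classes: int = NUM_CLASSES) -> float:
    """Mean intersection-over-union over the classes present in the target."""
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, target == c).sum()
        union = np.logical_or(pred == c, target == c).sum()
        if union > 0:
            ious.append(inter / union)
    return float(np.mean(ious)) if ious else float("nan")


def darken(rgb: np.ndarray, factor: float = 0.3) -> np.ndarray:
    """Example lighting perturbation: uniformly darken an RGB image in [0, 1]."""
    return np.clip(rgb * factor, 0.0, 1.0)


def evaluate(model, pairs, perturb_rgb=None, perturb_depth=None) -> float:
    """Run the model while perturbing at most one modality; report mean IoU."""
    scores = []
    for rgb, depth, label in pairs:
        if perturb_rgb is not None:
            rgb = perturb_rgb(rgb)
        if perturb_depth is not None:
            depth = perturb_depth(depth)
        pred = model.predict(rgb, depth)  # hypothetical RGB-D segmentation model
        scores.append(mean_iou(pred, label))
    return float(np.mean(scores))


# Usage (with a hypothetical `model` and list of (rgb, depth, label) `pairs`):
#   baseline = evaluate(model, pairs)
#   lighting = evaluate(model, pairs, perturb_rgb=darken)
#   no_depth = evaluate(model, pairs, perturb_depth=lambda d: np.zeros_like(d))
# Comparing these scores indicates how strongly the model relies on each modality.
```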
