Shichao Yang, Daniel Maturana, Sebastian Scherer
We consider the problem of understanding the 3D layout of indoor corridor
scenes from a single image in real time. Identifying obstacles such as walls is
essential for robot navigation, but also challenging due to the diversity in
structure, appearance and illumination of real-world corridor scenes. Many
current single-image methods make Manhattan-world assumptions and break down
in environments that do not fit this mold. They may also require complicated
hand-designed features for image segmentation, or clear boundaries to fit
certain building models. In addition, most cannot run in real time.
In this paper, we propose to combine machine learning with geometric modelling
to build a simplified 3D model from a single image. We first employ a
supervised Convolutional Neural Network (CNN) to provide a dense, but coarse,
geometric class labelling of the scene. We then refine this labelling with a
fully connected Conditional Random Field (CRF). Finally, we fit line segments
along the wall-ground boundaries and
pop up a 3D model using geometric constraints.
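The "pop-up" step can be illustrated with a small geometric sketch. Assuming a pinhole camera with intrinsic matrix K at the origin, y-axis pointing down, and a known camera height above a flat ground plane, each wall-ground boundary segment in the image can be back-projected onto the ground and a vertical wall plane erected along it. The function names, camera convention, and fixed wall height below are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def backproject_to_ground(pixels, K, cam_height):
    """Back-project image pixels assumed to lie on the ground plane.

    Camera at the origin with the y-axis pointing down; the ground plane
    is y = cam_height. `pixels` is an (N, 2) array of (u, v) coordinates.
    """
    n = len(pixels)
    # Rays through each pixel: d = K^{-1} [u, v, 1]^T  (3 x N)
    rays = np.linalg.solve(K, np.vstack([np.asarray(pixels, float).T,
                                         np.ones(n)]))
    # Intersect each ray with the plane y = cam_height.
    scale = cam_height / rays[1]
    return (rays * scale).T  # (N, 3) ground points

def pop_up_wall(seg_2d, K, cam_height, wall_height):
    """Erect a vertical wall plane along one wall-ground boundary segment.

    `seg_2d` holds the segment's two image endpoints; returns the wall's
    four 3D corners in order: base0, base1, top1, top0.
    """
    base = backproject_to_ground(np.asarray(seg_2d, float), K, cam_height)
    top = base.copy()
    top[:, 1] -= wall_height          # move up (negative y) by wall height
    return np.vstack([base, top[::-1]])
```

For example, with K = [[500, 0, 320], [0, 500, 240], [0, 0, 1]] and a camera 1 m above the ground, the pixel (320, 740) back-projects to the ground point (0, 1, 1). In the full method the wall height would come from the scene model rather than a fixed constant.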
We assemble a dataset of 967 labelled corridor images. Our
experiments on this dataset and another publicly available dataset
show that our method outperforms other single-image scene understanding methods
in pixelwise accuracy while labelling images at over 15 Hz.
This is an image dataset (967 images) for corridor environments. Every image is annotated as ground or wall using polygons. It is a mixture of processed SUN RGBD images (349, corridor category), SUN Database images (327, corridor category), and self-collected images (291, at CMU). If you use this dataset, please also cite the two other papers listed in the BibTeX.