Segmentation is an important part of robotic perception and object manipulation. This task becomes especially challenging when several deformable linear objects (DLOs) are entangled, a scenario in which both vision algorithms and humans often struggle.
This work describes a robust two-stage perception pipeline capable of segmenting entangled DLOs in arbitrary configurations. In the first stage, a ResNet-based CNN segments the input image into crossings and the DLO segments that connect them. In the second stage, another CNN estimates the likelihood that a given pair of segments belongs to the same continuous DLO; these likelihoods are then used to reconstruct each DLO’s full topology.
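As a rough illustration of how pairwise likelihoods from the second stage could be turned into per-DLO groupings, the following minimal sketch merges segments whose score exceeds a threshold using a union-find structure. The function name `group_segments`, the `pair_scores` dictionary, the integer segment ids, and the `threshold` value are all hypothetical, and the thresholded merge is an assumed stand-in rather than the reconstruction procedure used in this work.

```python
from typing import Dict, List, Tuple


def group_segments(
    n_segments: int,
    pair_scores: Dict[Tuple[int, int], float],
    threshold: float = 0.5,  # assumed cut-off, not taken from the paper
) -> List[List[int]]:
    """Group segment ids into DLOs by merging pairs scored at or above threshold."""
    parent = list(range(n_segments))  # each segment starts as its own group

    def find(i: int) -> int:
        # Follow parent links to the group representative (with path halving).
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for (a, b), score in pair_scores.items():
        if score >= threshold:
            ra, rb = find(a), find(b)
            if ra != rb:
                # Keep the smaller id as representative for a stable result.
                parent[max(ra, rb)] = min(ra, rb)

    groups: Dict[int, List[int]] = {}
    for seg in range(n_segments):
        groups.setdefault(find(seg), []).append(seg)
    return sorted(groups.values())


# Hypothetical scores for four segments meeting at one crossing.
scores = {(0, 1): 0.9, (1, 2): 0.1, (2, 3): 0.8, (0, 3): 0.2}
dlos = group_segments(4, scores)
```

With these illustrative scores, segments 0 and 1 are assigned to one DLO and segments 2 and 3 to another. A real pipeline would likely also need to order the segments along each DLO and resolve conflicting pairings at a crossing, which this sketch does not attempt.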
The networks were trained on different types of data, including RGB images and depth data captured with LiDAR and stereo sensors. The effects of the different inputs on the networks’ performance were tested, and the relevance of depth information to the task was evaluated.
The results demonstrate that the proposed pipeline effectively segments entangled DLOs, even in complex scenes.