We use mAP at various thresholds on the front camera image for evaluation. To our knowledge, there is only one significant work in weakly-supervised 3D object detection VS3D which uses an in-network proposal generation and cross-modal transfer learning between camera images and point clouds. There are a couple of methods that use selective search to generate proposals like PCL, and OICR[5]. Although our method is bottlenecked due to offline proposal generation, our model is simple and is backed by solid geometric computer vision concepts. We are not evaluating 3D mAP because most bounding boxes are inaccurate due to occlusion length. This is not a big issue in terms of self-driving cars because the min_y of the bounding box i.e., the nearest distance of the obstacle in the direction of length, is accurate. max_y is highly affected by occlusion.

Comparison with Baselines

Model name 2D AP @IoU = 0.3 2D AP @IoU = 0.5
PCL 4.789 1.29
OICR 7.6361 4.228
V3SD 72.42 69.38
Ours with Focal Loss 49.76 29.97
                                                           ***Mean average precision (AP) of Car on the KITTI validation set of variation Weakly supervised Methods taken from VS3D***

NB: The results other than ours are evaluated on KITTI validation data, which takes time to get evaluated as we have to submit the code to KITTI's official website. Our mAP is on a validation set split from the training set. We are confident these results would hold good for the KITTI validation set as our model is weakly supervised i.e. proposal generation is irrespective of the network. There is not much chance of overfitting because the KITTI train data itself is very diverse.

Comparison of losses

Model 2D AP @IoU = 0.2 2D AP @IoU = 0.3 2D AP @IoU = 0.4 2D AP @IoU = 0.5
Ours with BCE Loss 35.03 32.5 29.35 26.56
Ours with Focal Loss 58.31 49.76 40.52 29.97

The difference in AP decreases as IoU increases, but overall Focal loss performs much better

Precision-Recall Curves

Precision-Recall curve of Car class with different IoU (legend)

Precision-Recall curve of Car class with different IoU (legend)

Train loss curve

For training with focal loss

For training with focal loss

For training with BCE loss

For training with BCE loss


Camera Results

media_images_Image proposals 002582_1111_7deb3838f4d55bbf0b06.png

media_images_Image proposals 000401_1099_928d1692cf5fcd87ab51.png