Aggregate View Object Detection
This repository contains the public release of the Python implementation of our Aggregate View Object Detection (AVOD) network for 3D object detection.
If you use this code, please cite our paper:
@article{ku2018joint,
  title={Joint 3D Proposal Generation and Object Detection from View Aggregation},
  author={Ku, Jason and Mozifian, Melissa and Lee, Jungwook and Harakeh, Ali and Waslander, Steven},
  journal={IROS},
  year={2018}
}
Videos
These videos show detections on several KITTI sequences and our own data in snowy and night driving conditions (with no additional training data).
AVOD Detections
AVOD-FPN Detections
KITTI Object Detection Results (3D and BEV)
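| Class | Method | Runtime (s) | AP-3D Easy | AP-3D Moderate | AP-3D Hard | AP-BEV Easy | AP-BEV Moderate | AP-BEV Hard |
|:------|:-------|:-----------:|:----------:|:--------------:|:----------:|:-----------:|:---------------:|:-----------:|
| Car | MV3D | 0.36 | 71.09 | 62.35 | 55.12 | 86.02 | 76.90 | 68.49 |
| Car | VoxelNet | 0.23 | 77.47 | 65.11 | 57.73 | 89.35 | 79.26 | 77.39 |
| Car | F-PointNet | 0.17 | 81.20 | 70.39 | 62.19 | 88.70 | 84.00 | 75.33 |
| Car | AVOD | 0.08 | 73.59 | 65.78 | 58.38 | 86.80 | 85.44 | 77.73 |
| Car | AVOD-FPN | 0.10 | 81.94 | 71.88 | 66.38 | 88.53 | 83.79 | 77.90 |
| Pedestrian | VoxelNet | 0.23 | 39.48 | 33.69 | 31.51 | 46.13 | 40.74 | 38.11 |
| Pedestrian | F-PointNet | 0.17 | 51.21 | 44.89 | 40.23 | 58.09 | 50.22 | 47.20 |
| Pedestrian | AVOD | 0.08 | 38.28 | 31.51 | 26.98 | 42.52 | 35.24 | 33.97 |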
AVOD-FPN
0.10
50.80
42.81