Autonomous driving - Car detection
Welcome to your week 3 programming assignment. You will learn about object detection using the very powerful YOLO model. Many of the ideas in this notebook are described in the two YOLO papers: Redmon et al., 2016 (https://arxiv.org/abs/1506.02640) and Redmon and Farhadi, 2016 (https://arxiv.org/abs/1612.08242).
You will learn to:
- Use object detection on a car detection dataset
- Deal with bounding boxes
Run the following cell to load the packages and dependencies that are going to be useful for your journey!
import argparse
import os
import matplotlib.pyplot as plt
from matplotlib.pyplot import imshow
import scipy.io
import scipy.misc
import numpy as np
import pandas as pd
import PIL
import tensorflow as tf
from keras import backend as K
from keras.layers import Input, Lambda, Conv2D
from keras.models import load_model, Model
from yolo_utils import read_classes, read_anchors, generate_colors, preprocess_image, draw_boxes, scale_boxes
from yad2k.models.keras_yolo import yolo_head, yolo_boxes_to_corners, preprocess_true_boxes, yolo_loss, yolo_body
%matplotlib inline
1 - Problem Statement
You are working on a self-driving car. As a critical component of this project, you'd like to first build a car detection system. To collect data, you've mounted a camera to the hood (meaning the front) of the car, which takes pictures of the road ahead every few seconds while you drive around.
YOLO ("you only look once") is a popular algorithm because it achieves high accuracy while also being able to run in real time. This algorithm "only looks once" at the image in the sense that it requires only one forward propagation pass through the network to make predictions. After non-max suppression, it then outputs recognized objects together with the bounding boxes.
2.1 - Model details
First things to know:
- The input is a batch of images of shape (m, 608, 608, 3)
- The output is a list of bounding boxes along with the recognized classes. Each bounding box is represented by 6 numbers (p_c, b_x, b_y, b_h, b_w, c) as explained above. If you expand c into an 80-dimensional vector, each bounding box is then represented by 85 numbers.
We will use 5 anchor boxes. So you can think of the YOLO architecture as the following: IMAGE (m, 608, 608, 3) -> DEEP CNN -> ENCODING (m, 19, 19, 5, 85).
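To make these shapes concrete, here is a minimal NumPy sketch; it is not part of the assignment code. It assumes the network's last layer emits a flattened tensor of 5 x 85 = 425 numbers per grid cell, and the batch size and variable names are made up for illustration.

```python
import numpy as np

m = 2                        # hypothetical batch size
grid, anchors, classes = 19, 5, 80
per_box = 5 + classes        # p_c, b_x, b_y, b_h, b_w plus 80 class probabilities = 85

# Stand-in for the CNN output: 5 * 85 = 425 numbers per grid cell.
flat_output = np.random.rand(m, grid, grid, anchors * per_box)

# Split the last axis so that each anchor box gets its own 85-number slice.
encoding = flat_output.reshape(m, grid, grid, anchors, per_box)

print(encoding.shape)            # (2, 19, 19, 5, 85)
print(grid * grid * anchors)     # 1805 candidate boxes per image
```

Each image therefore yields 19 x 19 x 5 = 1805 candidate boxes, which is why the filtering steps that follow are needed.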
Let's look in greater detail at what this encoding represents.
2.2 - Filtering with a threshold on class scores
You are going to apply a first filter by thresholding. You would like to get rid of any box for which the class "score" is less than a chosen threshold.
The model gives you a total of 19x19x5x85 numbers, with each box described by 85 numbers. It'll be convenient to rearrange the (19,19,5,85) (or (19,19,425)) dimensional tensor into the following variables:
- box_confidence: tensor of shape (19×19, 5, 1) containing p_c (confidence probability that there's some object) for each of the 5 boxes predicted in each of the 19x19 cells.
- boxes: tensor of shape (19×19, 5, 4) containing (b_x, b_y, b_h, b_w) for each of the 5 boxes per cell.
- box_class_probs: tensor of shape (19×19, 5, 80) containing the detection probabilities (c_1, c_2, ..., c_80) for each of the 80 classes for each of the 5 boxes per cell.
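The split into these three tensors can be sketched with NumPy slicing. This is an illustration only: it assumes the 85 numbers of each box are ordered as (p_c, b_x, b_y, b_h, b_w, c_1, ..., c_80), following the description in this notebook. The actual model code may order them differently, and in the assignment these tensors are provided to you rather than sliced by hand.

```python
import numpy as np

# Stand-in for one image's encoding, with the 19x19 grid flattened to 361 cells.
encoding = np.random.rand(19 * 19, 5, 85)

# Assumed layout of the last axis: [p_c, b_x, b_y, b_h, b_w, c_1, ..., c_80].
box_confidence = encoding[..., 0:1]    # slice 0:1 keeps the trailing axis -> (361, 5, 1)
boxes = encoding[..., 1:5]             # (361, 5, 4)
box_class_probs = encoding[..., 5:]    # (361, 5, 80)

print(box_confidence.shape, boxes.shape, box_class_probs.shape)
```

Keeping the trailing axis of size 1 on box_confidence is what allows it to broadcast against the 80 class probabilities in an elementwise product.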
Exercise: Implement yolo_filter_boxes().
- Compute box scores by doing the elementwise product as described in Figure 4. The following code may help you choose the right operator:
a = np.random.randn(19*19, 5, 1)
b = np.random.randn(19*19, 5, 80)
c = a * b  # shape of c will be (19*19, 5, 80)
- For each box, find the class with the highest box score (box_classes) and the value of that score (box_class_scores).
- Create a mask by using a threshold. As a reminder: (np.array([0.9, 0.3, 0.4, 0.5, 0.1]) < 0.4) returns: [False, True, False, False, True]. The mask should be True for the boxes you want to keep.
- Use TensorFlow to apply the mask to box_class_scores, boxes and box_classes to filter out the boxes we don't want. You should be left with just the subset of boxes you want to keep.
Reminder: to call a Keras function, you should use K.function(...).
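The steps listed above can be sketched in plain NumPy before you write the TensorFlow version. This is an illustration under stated assumptions, not the graded solution: filter_boxes_np is a hypothetical helper name, the inputs are random stand-ins, and keeping boxes whose score is greater than or equal to the threshold is one reasonable reading of "less than a chosen threshold" for the boxes to discard.

```python
import numpy as np

def filter_boxes_np(box_confidence, boxes, box_class_probs, threshold=0.6):
    """NumPy illustration of score thresholding (hypothetical helper)."""
    # Score of every class for every box: p_c * c_i, broadcast over the last axis.
    box_scores = box_confidence * box_class_probs        # (cells, 5, 80)

    # Best class per box and the score of that class.
    box_classes = np.argmax(box_scores, axis=-1)         # (cells, 5)
    box_class_scores = np.max(box_scores, axis=-1)       # (cells, 5)

    # Mask is True for the boxes to keep.
    keep = box_class_scores >= threshold                 # (cells, 5)

    # Boolean indexing flattens the masked axes and keeps only the True entries.
    return box_class_scores[keep], boxes[keep], box_classes[keep]

rng = np.random.default_rng(0)
scores, kept_boxes, classes = filter_boxes_np(
    rng.random((19 * 19, 5, 1)),
    rng.random((19 * 19, 5, 4)),
    rng.random((19 * 19, 5, 80)),
    threshold=0.6,
)
print(scores.shape, kept_boxes.shape, classes.shape)
```

The number of surviving boxes depends on the inputs and the threshold, so the first dimension of the outputs is not known in advance. In the TensorFlow version, K.max and K.argmax play the role of np.max and np.argmax, and tf.boolean_mask plays the role of the boolean indexing used here.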
# GRADED FUNCTION: yolo_filter_boxes
def yolo_filter_boxes(box_confidence, boxes, box_class_probs, threshold = .6):
"""Filters YOLO boxes by thresholding on object and cl