Computer Vision Fundamentals
In this chapter, I mainly learned about how to detect and idntify the lane lines out of a image captured by a camera fixed on cars. The original image is like this:
And then I will follow the routine below:
- Color Selection (the lane lines are usually in a special color in the image)
- Region Masking (eliminate the useless area of the image)
- Edge Detect (detect the edges of the lane lines)
- Find Lane Lines
1.Color Selection
Firstly, we need to understand RGB and the dimensions of an image.
Let’s see the dimensions of the image shown before.
image = mpimg.imread('test.jpg')
print('This image is: ',type(image),
'with dimensions:', image.shape)
-This image is: <class 'numpy.ndarray'> with dimensions: (720, 1280, 3)
As we can see from the result, there are three dimensions in the image, we can divide it into three array that each one has two dimensions :
(:, :, 0),(:, :, 1)and(:, :, 2)
Actually, these are just their red, green and blue parts. We can use code to divide the original image into three color space.
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import numpy as np
# Read in the image and print out some stats
image = mpimg.imread('exit-ramp.jpg')
print('This image is: ',type(image),
'with dimensions:', image.shape)
# Grab the x and y size and make a copy of the image
ysize = image.shape[0]
xsize = image.shape[1]
# Note: always make a copy rather than simply using "="
color_select = np.copy(image)
fig,(ax1, ax2, ax3) = plt.subplots(1,3,sharey= True)
ax1.imshow(image[:,:,0])
ax2.imshow(image[:,:,1])
ax3.imshow(image[:,:,2])
plt.show()
Then we can use a Conditional judgment to select the lane lines as they are always kind of white (255, 255, 255).
# Define our color selection criteria
# Note: if you run this code, you'll find these are not sensible values!!
# But you'll get a chance to play with them soon in a quiz
red_threshold = 180
green_threshold = 180
blue_threshold = 180
rgb_threshold = [red_threshold, green_threshold, blue_threshold]
# Identify pixels below the threshold
thresholds = (image[:,:,0] < rgb_threshold[0]) \
| (image[:,:,1] < rgb_threshold[1]) \
| (image[:,:,2] < rgb_threshold[2])
color_select[thresholds] = [0,0,0]
# Display the image
plt.imshow(color_select)
plt.show()
As you can see, there are lane line being identified, but the result isn’t satisfying.
2. Region Masking
This step is to eliminate the useless area of the image by taking only the area of interest into consideration.
The example is like this:
# Pull out the x and y sizes and make a copy of the image
ysize = image.shape[0]
xsize = image.shape[1]
region_select = np.copy(image)
# Define a triangle region of interest
# Keep in mind the origin (x=0, y=0) is in the upper left in image processing
# Note: if you run this code, you'll find these are not sensible values!!
# But you'll get a chance to play with them soon in a quiz
left_bottom = [0, 540]
right_bottom = [960, 540]
apex = [800, 0]
# Fit lines (y=Ax+B) to identify the 3 sided region of interest
# np.polyfit() returns the coefficients [A, B] of the fit
fit_left = np.polyfit((left_bottom[0], apex[0]), (left_bottom[1], apex[1]), 1)
fit_right = np.polyfit((right_bottom[0], apex[0]), (right_bottom[1], apex[1]), 1)
fit_bottom = np.polyfit((left_bottom[0], right_bottom[0]), (left_bottom[1], right_bottom[1]), 1)
# Find the region inside the lines
XX, YY = np.meshgrid(np.arange(0, xsize), np.arange(0, ysize))
region_thresholds = (YY > (XX*fit_left[0] + fit_left[1])) & \
(YY > (XX*fit_right[0] + fit_right[1])) & \
(YY < (XX*fit_bottom[0] + fit_bottom[1]))
# Color pixels red which are inside the region of interest
region_select[region_thresholds] = [255, 0, 0]
3. Edge detect
Now we try to perform an enforced version of Step 1: Color Selection.
As it happens, lane lines are not always the same color, and even lines of the same color under different lighting conditions (day, night, etc) may fail to be detected by our simple color selection.
What we need is to take our algorithm to the next level to detect lines of any color using sophisticated computer vision methods.
We use Canny method to calculate the gradient of an image and then achieve the edge detect function. Why gradient is being used to detect the edge? Here’s an illustration:
#doing all the relevant imports
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import numpy as np
import cv2
# Read in the image and convert to grayscale
image = mpimg.imread('exit-ramp.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)
# Define a kernel size for Gaussian smoothing / blurring
# Note: this step is optional as cv2.Canny() applies a 5x5 Gaussian internally
kernel_size = 5
blur_gray = cv2.GaussianBlur(gray,(kernel_size, kernel_size), 0)
# Define parameters for Canny and run it
# NOTE: if you try running this code you might want to change these!
low_threshold = 50
high_threshold = 150
edges = cv2.Canny(blur_gray, low_threshold, high_threshold)
# Display the image
plt.imshow(edges, cmap='Greys_r')
The algorithm will first detect strong edge (strong gradient) pixels above the high_threshold, and reject pixels below the low_threshold. Next, pixels with values between the low_threshold and high_threshold will be included as long as they are connected to strong edges. The output edges is a binary image with white pixels tracing out the detected edges and black everywhere else.
4. Find Lane Lines
Finally, as we have done all the preparation steps, we can now use Hough Transform to find the lane lines.
Hough Transform is a method aiming at finding lines in an imag. More details about Hough Transform.
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import numpy as np
import cv2
# Read in and grayscale the image
image = mpimg.imread('exit-ramp.jpg')
gray = cv2.cvtColor(image,cv2.COLOR_RGB2GRAY)
# Define a kernel size and apply Gaussian smoothing
kernel_size = 5
blur_gray = cv2.GaussianBlur(gray,(kernel_size, kernel_size),0)
# Define our parameters for Canny and apply
low_threshold = 50
high_threshold = 150
edges = cv2.Canny(blur_gray, low_threshold, high_threshold)
# Next we'll create a masked edges image using cv2.fillPoly()
mask = np.zeros_like(edges)
ignore_mask_color = 255
# This time we are defining a four sided polygon to mask
imshape = image.shape
vertices = np.array([[(0,imshape[0]),(450, 290), (490, 290), (imshape[1],imshape[0])]], dtype=np.int32)
cv2.fillPoly(mask, vertices, ignore_mask_color)
masked_edges = cv2.bitwise_and(edges, mask)
# Define the Hough transform parameters
# Make a blank the same size as our image to draw on
rho = 1 # distance resolution in pixels of the Hough grid
theta = np.pi/180 # angular resolution in radians of the Hough grid
threshold = 15 # minimum number of votes (intersections in Hough grid cell)
min_line_length = 40#minimum number of pixels making up a line
max_line_gap = 20 # maximum gap in pixels between connectable line segments
line_image = np.copy(image)*0 # creating a blank to draw lines on
# Run Hough on edge detected image
# Output "lines" is an array containing endpoints of detected line segments
lines = cv2.HoughLinesP(masked_edges, rho, theta, threshold, np.array([]),
min_line_length, max_line_gap)
# Iterate over the output "lines" and draw lines on a blank image
for line in lines:
for x1,y1,x2,y2 in line:
cv2.line(line_image,(x1,y1),(x2,y2),(255,0,0),10)
# Create a "color" binary image to combine with line image
color_edges = np.dstack((edges, edges, edges))
# Draw the lines on the edge image
lines_edges = cv2.addWeighted(color_edges, 0.8, line_image, 1, 0)
plt.imshow(lines_edges)
plt.show()
Done!