The goal of this project is to create a robust pipeline to detect lane lines in raw images from a car's dash cam. The pipeline visually displays the lane boundaries and gives a numerical estimate of the lane curvature.
To execute the pipeline, the following dependencies are necessary:
- numpy
- cv2
- glob
- matplotlib
- moviepy.editor
The goals / steps of this project are the following:
- Compute the camera calibration matrix and distortion coefficients given a set of chessboard images.
- Apply a distortion correction to raw images.
- Use color transforms, gradients, etc., to create a thresholded binary image.
- Apply a perspective transform to rectify binary image ("birds-eye view").
- Detect lane pixels and fit to find the lane boundary.
- Determine the curvature of the lane and vehicle position with respect to center.
- Warp the detected lane boundaries back onto the original image.
- Output visual display of the lane boundaries and numerical estimation of lane curvature and vehicle position.
Overview of Files:
My project includes the following files :
- README.md (writeup report) documentation of the results
- Camera_Calibration.ipynb code for calibration of the camera
- Color_transform_&_Gradient_Threshold.ipynb code for distortion correction, gradient thresholding & color transforms
- Perspective_Transformation.ipynb code for perspective transformation
- Lane_Finding_Project.ipynb code for the whole project pipeline
- project video result
Each camera has a certain lens that distorts the captured image in a certain way compared to reality. Because we want to capture the view of the surroundings as accurately as possible, we have to correct this distortion. OpenCV provides some very useful functions to accomplish this task.
The code for this step is in the file called Camera_Calibration.ipynb.
I start by preparing "object points", which will be the (x, y, z) coordinates of the chessboard corners in the world. Here I am assuming the chessboard is fixed on the (x, y) plane at z=0, such that the object points are the same for each calibration image. Thus, objp is just a replicated array of coordinates, and objpoints will be appended with a copy of it every time I successfully detect all chessboard corners in a test image. To do so I use the cv2.findChessboardCorners() function provided by OpenCV. imgpoints will be appended with the (x, y) pixel position of each of the corners in the image plane with each successful chessboard detection.
Here is an example of a successful detection:
I then used the output objpoints and imgpoints to compute the camera calibration and distortion coefficients using the cv2.calibrateCamera() function. I applied this distortion correction to the test image using the cv2.undistort() function and obtained this result:
The camera calibration matrix and distortion coefficients are stored using pickle so they can be reused in the main notebook.
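As a minimal sketch of this calibration and pickling step (the notebook code may differ), assuming a 9x6 inner-corner chessboard, images in a camera_cal/ folder, and a calibration.p output file, all of which are illustrative assumptions:

```python
import glob
import pickle
import cv2
import numpy as np

nx, ny = 9, 6  # number of inner corners per row / column (assumption)
objp = np.zeros((nx * ny, 3), np.float32)
objp[:, :2] = np.mgrid[0:nx, 0:ny].T.reshape(-1, 2)

objpoints, imgpoints = [], []  # 3-D points in world space / 2-D points in image plane
for fname in glob.glob('camera_cal/calibration*.jpg'):
    img = cv2.imread(fname)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    ret, corners = cv2.findChessboardCorners(gray, (nx, ny), None)
    if ret:
        objpoints.append(objp)
        imgpoints.append(corners)

# Compute the camera matrix and distortion coefficients, then store them with pickle
ret, mtx, dist, rvecs, tvecs = cv2.calibrateCamera(objpoints, imgpoints,
                                                   gray.shape[::-1], None, None)
with open('calibration.p', 'wb') as f:
    pickle.dump({'mtx': mtx, 'dist': dist}, f)
```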
I am now going to describe the pipeline each image goes through in order to detect the lane lines. I will display images to illustrate each step of the way.
The code for this step is in the file called Color_transform_&_Gradient_Threshold.ipynb.
Here are examples of the images that I am going to use:
1. Distortion correction
The pictures of the chessboard were taken with the same camera as the one mounted on the car that captured all of the pictures and videos provided for this project. Therefore, after calibrating the camera on the chessboard, we can use the same coefficients to undistort every image and video of the road.
Using the camera matrix and distortion coefficients produced in the previous step, here is the result after undistorting an image using the OpenCV undistort() function.
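As a minimal sketch, assuming the pickle file name and keys from the calibration sketch above and an illustrative test image path, undistorting a road image then boils down to:

```python
import pickle
import cv2

# Load the stored camera matrix and distortion coefficients (file name is an assumption)
with open('calibration.p', 'rb') as f:
    calib = pickle.load(f)

img = cv2.imread('test_images/test1.jpg')  # illustrative path
undistorted = cv2.undistort(img, calib['mtx'], calib['dist'], None, calib['mtx'])
```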
2. Lane detection
I used a combination of color and gradient thresholds to generate a binary image in which every non-zero pixel has a high probability of being part of a lane line; in other words, the thresholds filter out potential noise (such as shadows, different color lanes, other cars, etc.).
2.1 Color thresholding
Since the lines can be yellow and white, I chose to detect the lane lines by color as well. The S (saturation) channel of the HLS color space can be well suited for extracting a particular color.
The following image shows how I extract the yellow lines:
I found that keeping only the pixels with S values between 90 and 255 suits this problem quite well.
To extract the white color, I chose the L channel with a threshold between 200 and 255.
We can observe that the white-detection image picks up the right line more accurately.
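A minimal sketch of these two color thresholds (S channel between 90 and 255, L channel between 200 and 255); the function name and structure are illustrative, not the exact notebook code:

```python
import cv2
import numpy as np

def color_threshold(img, s_thresh=(90, 255), l_thresh=(200, 255)):
    # Convert to HLS and split out the L (lightness) and S (saturation) channels
    hls = cv2.cvtColor(img, cv2.COLOR_RGB2HLS)
    l_channel, s_channel = hls[:, :, 1], hls[:, :, 2]
    # S channel threshold (yellow lines)
    s_binary = np.zeros_like(s_channel)
    s_binary[(s_channel >= s_thresh[0]) & (s_channel <= s_thresh[1])] = 1
    # L channel threshold (white lines)
    l_binary = np.zeros_like(l_channel)
    l_binary[(l_channel >= l_thresh[0]) & (l_channel <= l_thresh[1])] = 1
    return s_binary, l_binary
```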
2.2 Sobel operator
A Sobel operator is an edge-detection operator that detects high gradients in a given direction by performing a 2-D convolution over the image (applying the Sobel operator to an image is a way of taking the derivative of the image in the x or y direction).
In this case, I chose to detect pixels whose luminance (the L channel of the HLS color space) has a high gradient in the x direction, since the lines I am trying to detect are generally vertical in the image.
Thresholded x-derivative
Thresholded y-derivative
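A minimal sketch of such a thresholded derivative on the L channel; the default threshold of 20-100 is an assumption, not necessarily the value used in the notebook:

```python
import cv2
import numpy as np

def abs_sobel_threshold(img, orient='x', thresh=(20, 100)):
    # Take the Sobel derivative of the L channel in the requested direction
    l_channel = cv2.cvtColor(img, cv2.COLOR_RGB2HLS)[:, :, 1]
    sobel = cv2.Sobel(l_channel, cv2.CV_64F,
                      1 if orient == 'x' else 0,
                      0 if orient == 'x' else 1)
    # Scale to 8-bit and apply the threshold
    scaled = np.uint8(255 * np.absolute(sobel) / np.max(np.absolute(sobel)))
    binary = np.zeros_like(scaled)
    binary[(scaled >= thresh[0]) & (scaled <= thresh[1])] = 1
    return binary
```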
2.3 Gradient magnitude
(apply a threshold to the overall magnitude of the gradient, in both x and y.)
The magnitude, or absolute value, of a gradient in a single direction is just the square root of its square. For a gradient in both the x and y directions, the magnitude is the square root of the sum of the squares:
abs_sobelx = sqrt((sobel_x)^2)
abs_sobely = sqrt((sobel_y)^2)
abs_sobelxy = sqrt((sobel_x)^2 + (sobel_y)^2)
I apply a threshold on the magnitude of the gradient to filter out weak signals.
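A minimal sketch of the magnitude threshold, assuming a single-channel input; the kernel size and the 30-100 limits are illustrative assumptions:

```python
import cv2
import numpy as np

def mag_threshold(gray, sobel_kernel=3, thresh=(30, 100)):
    # Gradients in x and y, then the overall magnitude
    sobelx = cv2.Sobel(gray, cv2.CV_64F, 1, 0, ksize=sobel_kernel)
    sobely = cv2.Sobel(gray, cv2.CV_64F, 0, 1, ksize=sobel_kernel)
    magnitude = np.sqrt(sobelx**2 + sobely**2)
    # Scale to 8-bit and apply the threshold
    scaled = np.uint8(255 * magnitude / np.max(magnitude))
    binary = np.zeros_like(scaled)
    binary[(scaled >= thresh[0]) & (scaled <= thresh[1])] = 1
    return binary
```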
2.4 Direction of the Gradient
In the case of lane lines, I'm interested only in edges of a particular orientation, so I will now explore the direction, or orientation, of the gradient.
The direction of the gradient is simply the inverse tangent (arctangent) of the y gradient divided by the x gradient:
arctan(sobel_y/sobel_x)
Each pixel of the resulting image contains a value for the angle of the gradient away from horizontal in units of radians, covering a range of −π/2 to π/2.
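A minimal sketch of a direction threshold; here the absolute values of the gradients are taken (a common variant that folds the range into 0 to π/2), and the kernel size and angular limits are assumptions:

```python
import cv2
import numpy as np

def dir_threshold(gray, sobel_kernel=15, thresh=(0.7, 1.3)):
    # Gradients in x and y
    sobelx = cv2.Sobel(gray, cv2.CV_64F, 1, 0, ksize=sobel_kernel)
    sobely = cv2.Sobel(gray, cv2.CV_64F, 0, 1, ksize=sobel_kernel)
    # Direction of the gradient; absolute values restrict it to [0, pi/2]
    direction = np.arctan2(np.absolute(sobely), np.absolute(sobelx))
    binary = np.zeros_like(direction, dtype=np.uint8)
    binary[(direction >= thresh[0]) & (direction <= thresh[1])] = 1
    return binary
```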
2.5 Combining Thresholds
(Use the various gradient measurements (x, y, magnitude, direction) to isolate lane-line pixels and generate a binary image.)
I used thresholds on the x and y gradients, the overall gradient magnitude, and the gradient direction to focus on pixels that are likely to be part of the lane lines.
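A sketch of one possible way to combine the individual binary masks; the exact combination logic used in the notebook may differ:

```python
import numpy as np

def combine_thresholds(gradx, grady, mag_binary, dir_binary, s_binary, l_binary):
    # Keep pixels that pass both directional gradients, or both magnitude and
    # direction, or either of the color thresholds
    combined = np.zeros_like(gradx)
    combined[((gradx == 1) & (grady == 1)) |
             ((mag_binary == 1) & (dir_binary == 1)) |
             (s_binary == 1) | (l_binary == 1)] = 1
    return combined
```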
The perspective transformation code can be found in the Perspective_Transformation.ipynb notebook.
The images used were the ones with straight lane lines:
Since we want to detect the curvature of the lines, we need to change the perspective of the image. OpenCV comes in very handy for doing so.
A perspective transform maps the points in a given image to a different, desired, image points with a new perspective. I use the OpenCV functions getPerspectiveTransform() and warpPerspective() to generate a bird's-eye view of a lane from above, which is useful for calculating the lane curvature.
The unwarp() function takes as input an image (img) and runs cv2.warpPerspective() using the following source (src) and destination (dst) points. I chose the source and destination points used to perform the transform by analyzing different test images:
```python
# img_size = (1280, 720)  # (width, height)
src = np.float32(
    [[(img_size[0] / 2) - 62, img_size[1] / 2 + 100],
     [((img_size[0] / 6) - 10), img_size[1]],
     [(img_size[0] * 5 / 6) + 60, img_size[1]],
     [(img_size[0] / 2 + 62), img_size[1] / 2 + 100]])
dst = np.float32(
    [[(img_size[0] / 4), 0],
     [(img_size[0] / 4), img_size[1]],
     [(img_size[0] * 3 / 4), img_size[1]],
     [(img_size[0] * 3 / 4), 0]])
```
This resulted in the following source and destination points:
| Source | Destination |
|:---:|:---:|
| 578, 460 | 320, 0 |
| 203, 720 | 320, 720 |
| 1127, 720 | 960, 720 |
| 702, 460 | 960, 0 |
I verified that my perspective transform was working as expected by drawing the src and dst points onto a test image and its warped counterpart and checking that the lines appear parallel in the warped image.
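As a minimal sketch of the warp itself (the unwarp() function in the notebook may differ), assuming binary_image is the undistorted, thresholded binary image and src/dst are the arrays defined above:

```python
import cv2

M = cv2.getPerspectiveTransform(src, dst)     # forward transform (road -> bird's-eye)
Minv = cv2.getPerspectiveTransform(dst, src)  # inverse transform, kept for unwarping later
binary_warped = cv2.warpPerspective(binary_image, M, (1280, 720), flags=cv2.INTER_LINEAR)
```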
1. First frame
After obtaining the binarized image and transforming its perspective to look at the road from the top, I need to decide which pixels belong to the lane. For the first image I compute a histogram along the x axis of all the pixels in the bottom half of the image and detect the two highest peaks, one on each side. They give me the starting position of each lane line. (function find_firstlane())
Then, I iteratively create bounding boxes (in green) and add the position of every pixel inside them to the corresponding line. The center of the next box is the average x position of all the pixels in the current box. That position is shown by the blue line.
At the end, I chose to color every pixel found for the left line in red and every pixel found for the right line in blue.
Here is the result:
The yellow lines in the middle of the lane lines are the polynomials fitted to each of the colored lines.
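A minimal sketch of the histogram step used by find_firstlane() to locate the starting x positions (the sliding-window loop itself is omitted here):

```python
import numpy as np

def find_lane_bases(binary_warped):
    # Sum the pixel values of the bottom half of the image, column by column
    histogram = np.sum(binary_warped[binary_warped.shape[0] // 2:, :], axis=0)
    midpoint = histogram.shape[0] // 2
    leftx_base = np.argmax(histogram[:midpoint])              # peak on the left side
    rightx_base = np.argmax(histogram[midpoint:]) + midpoint  # peak on the right side
    return leftx_base, rightx_base
```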
2. Subsequent frames
After obtaining both polynomials of the form:
x = a * y^2 + b * y + c
for the first frame, we can continue searching for the new location of each lane line in the next frame, starting in the area where the current line was detected. (function search_next_Lane())
(I do not need to repeat the previous method, which could be described as blind detection; I use the polynomials and only look at the area nearby.)
I extend each polynomial by a margin (100 pixels) on both sides, collect every non-zero pixel within that region, and fit a new polynomial to those pixels.
The code for that function is the following:
```python
import numpy as np

def search_next_Lane(binary_warped, left_fit, right_fit):
    # HYPERPARAMETER
    # Width of the margin around the previous polynomial to search
    margin = 100
    # Conversions from pixel space to meters (same values as in measure_curvature())
    ym_per_pix = 30/720   # meters per pixel in y dimension
    xm_per_pix = 3.7/700  # meters per pixel in x dimension
    # Grab activated (non-zero) pixels
    nonzero = binary_warped.nonzero()
    nonzeroy = np.array(nonzero[0])
    nonzerox = np.array(nonzero[1])
    # The area of search: activated x-values within +/- margin of the previous polynomial
    left_lane_inds = ((nonzerox > (left_fit[0]*(nonzeroy**2) + left_fit[1]*nonzeroy +
                                   left_fit[2] - margin)) &
                      (nonzerox < (left_fit[0]*(nonzeroy**2) + left_fit[1]*nonzeroy +
                                   left_fit[2] + margin)))
    right_lane_inds = ((nonzerox > (right_fit[0]*(nonzeroy**2) + right_fit[1]*nonzeroy +
                                    right_fit[2] - margin)) &
                       (nonzerox < (right_fit[0]*(nonzeroy**2) + right_fit[1]*nonzeroy +
                                    right_fit[2] + margin)))
    # Again, extract left and right line pixel positions
    leftx = nonzerox[left_lane_inds]
    lefty = nonzeroy[left_lane_inds]
    rightx = nonzerox[right_lane_inds]
    righty = nonzeroy[right_lane_inds]
    # Fit a second order polynomial to each line, in pixel space and in meters
    left_fit = np.polyfit(lefty, leftx, 2)
    right_fit = np.polyfit(righty, rightx, 2)
    left_fit_cr = np.polyfit(lefty*ym_per_pix, leftx*xm_per_pix, 2)
    right_fit_cr = np.polyfit(righty*ym_per_pix, rightx*xm_per_pix, 2)
    # Create an output image to draw on
    out_img = np.dstack((binary_warped, binary_warped, binary_warped))*255
    # Color the left line pixels in red and the right line pixels in blue
    out_img[lefty, leftx] = [255, 0, 0]
    out_img[righty, rightx] = [0, 0, 255]
    return out_img, left_fit, right_fit, left_fit_cr, right_fit_cr
```
The whole purpose of detecting a lane is to compute the lane curvature and, from it, a command to steer the wheels and control the car.
The radius of curvature of the lane at a particular point is defined as the radius of the approximating circle, which changes as we move along the curve. A good tutorial on the radius of curvature can be found here; it describes the mathematical formula used to calculate it.
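For a second-order polynomial fit x = a * y^2 + b * y + c, the standard radius-of-curvature formula evaluated at a point y is:
R_curve = (1 + (2*a*y + b)^2)^(3/2) / |2*a|
This is what the measure_curvature() function below evaluates, after converting the fit coefficients from pixels to meters.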
The radius of curvature is computed at the bottom of the image, i.e. at the point closest to the car.
This is performed by the following function:
```python
def measure_curvature(left_fit_cr, right_fit_cr):
    # Define conversions in x and y from pixel space to meters
    ym_per_pix = 30/720   # meters per pixel in y dimension
    xm_per_pix = 3.7/700  # meters per pixel in x dimension
    # Cover the same y-range as the image
    ploty = np.linspace(0, 719, num=720)
    y_eval = np.max(ploty)
    # Calculate the radius of curvature (R_curve) for each line
    left_curverad = ((1 + (2*left_fit_cr[0]*y_eval*ym_per_pix + left_fit_cr[1])**2)**1.5) / np.absolute(2*left_fit_cr[0])
    right_curverad = ((1 + (2*right_fit_cr[0]*y_eval*ym_per_pix + right_fit_cr[1])**2)**1.5) / np.absolute(2*right_fit_cr[0])
    return left_curverad, right_curverad
```
The pixel values of the lane are scaled into meters (real-world space) using scaling factors estimated from U.S. regulations, which require a minimum lane width of 3.7 meters and dashed lane lines 3 meters long each. They are defined as follows:
```python
ym_per_pix = 30/720   # meters per pixel in y dimension
xm_per_pix = 3.7/700  # meters per pixel in x dimension
```
These values are then used to compute the polynomial coefficients in meters and then the formula given in the lectures is used to compute the radius of curvature.
```python
left_curverad = ((1 + (2*left_fit_cr[0]*y_eval*ym_per_pix + left_fit_cr[1])**2)**1.5) / np.absolute(2*left_fit_cr[0])
right_curverad = ((1 + (2*right_fit_cr[0]*y_eval*ym_per_pix + right_fit_cr[1])**2)**1.5) / np.absolute(2*right_fit_cr[0])
```
Warp Detected Lane Boundaries onto Original Image
After detecting the lane lines, calculating the radius of curvature, and finding the vehicle's position within the lane, I warp the image back to the original perspective using the OpenCV warpPerspective() function as before, but this time using the inverse perspective matrix.
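A minimal sketch of this step, assuming color_warp holds the lane area drawn in the bird's-eye view, Minv is the inverse perspective matrix computed earlier, and undistorted is the undistorted camera image (all names are illustrative):

```python
import cv2

# Warp the drawn lane area back to the original perspective using the inverse matrix
newwarp = cv2.warpPerspective(color_warp, Minv,
                              (undistorted.shape[1], undistorted.shape[0]))
# Overlay the lane area onto the undistorted road image
result = cv2.addWeighted(undistorted, 1.0, newwarp, 0.3, 0)
```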
Visual Display of the Lane Boundaries, Curvature, and Vehicle Position
Here is a final result on a test image :
Here is a PlantUML activity diagram of my pipeline:
A link to my video result can be found here
During the implementation of this project I found it very useful to make use of Python unit tests in order to test the code and to calculate and evaluate different parameters, e.g. in the context of thresholding. The chosen approach worked pretty well for all three test videos. However, it had some problems when the surface of the street changes, when there is a lot of shadow on the road, when lane lines are missing, etc.
Improvements could be:
- consider previous polynomials that were fitted onto the lane lines in order to smooth the lane line detection
- add sanity checks that discard values that do not make sense; in that case the previous polynomials could be reused
- the values of the curvature and the position of the car are hardly readable since they change so often; updating them less frequently and averaging over several frames of the video could help
- in order to reduce the effort for the calculation of the thresholds, etc., the images could be reduced to the area of interest (the street only)
- there might be better values for the min and max thresholds; more fine-tuning might produce even better results
- a few performance improvements could be made by removing repetitive calculations
- more information could be used from frame to frame to improve the robustness of the process
- other gradients could be used (e.g. the Laplacian) to improve the line detection
The links below can be of great help for more information:
- Camera Calibration and 3D Reconstruction
- Camera calibration With OpenCV
- Camera Calibration
- Performance evaluation of edge detection techniques for images in spatial domain
- Gradient-enhancing conversion for illumination-robust lane detection
- Video-based lane estimation and tracking for driver assistance: survey, system, and evaluation.