Computer Vision
Exercises during the 5th semester and a lane detection project based on a given dataset and the KITTI dataset
This repository contains exercises and the final project for the Computer Vision course at the DHBW Stuttgart. The lane detection project is based on a given dataset and the KITTI dataset.
Requirements & Usage
Python 3.10 is required
- Install Poetry
- Poetry is a dependency manager for Python used in this project
- (Optional) Set up poetry to use the local
.venvfolder by runningpoetry config virtualenvs.in-project true
- Run
poetry installto install all dependencies - Afterwards, runpoetry shellto activate the virtual environment - Install the pre-commit hooks with
poetry run pre-commit install
Note: If you are using Poetry it is recommended to use
poetry run <command>to run commands. This ensures that the.envfile is loaded and the virtual environment is activated.
After the setup is complete, use the following commands to run the lane detection project. Note that you need to be in the root directory of the project.
# If you want to run the lane detection with the default settings
poetry run Python src/main.py
# With this, you can disable the overlay (`pretty` visualization)
poetry run Python src/main.py False
# Select a specific step in the pipeline to be shown
# Since `pretty` will not be shown we can disable it
poetry run Python src/main.py False 5
Note: To suppress the debug messages, you can set the
LOGURU_LEVELenvironment variable toINFOwithin the.envfile in therootdirectory.
Scope of the Project
Minimum Requirements:
- Camera Calibration (
src/pipeline/calibration.py) - Segmentation of the image/frame (
src/pipeline/perspective.py) - Color thresholding and masking using a histogram (
src/pipeline/threshold.py)- Inner Line:
- Mask white colors using the
rchannel (rgb) - Mask yellow colors using the
hchannel (hsv) - Mask saturated colors using the
schannel (hsvandhls)
- Mask white colors using the
- Outer Line
- Sobel filter in
xdirection with a threshold of40and100 - Filter Sobel using the
rchannel (rgb)
- Sobel filter in
- Inner Line:
- Providing
~30fps- We first downscale the image to
1280x720and then use thecv2.resizefunction to scale the image to640x360 - As a second step, we narrow down the area of interest to the lower half of the image
- We first downscale the image to
- Increasing the performance by using the previously detected lines and a histogram for fitting the sliding windows (
src/pipeline/lane.pyandsrc/pipeline/line.py) - Curve and polynomial fitting (
src/pipeline/lane.py) - Contiguous lane detection for
project_video.mp4
Additional Features:
- Contiguous lane detection for
challenge_video.mp4 - Detecting lines for the KITTI dataset
- We don't apply the camera calibration because KITTI uses a different camera
- The angel and view of the camera are different therefore, we use a different conversion matrix
- To detect some specific lines, we use additional color thresholding and Sobel filter
- Thresholding the maximum change of the lines between two frames (
src/pipeline/lane.py)- If the thresholding is exceeded, the last detected lines are used
- If the detected lane is too often detected as a error, it will reposition using the sliding windows
Questions
Approach
As a first step, we had to decide which approach for lane detection we would use. For this, we mainly used the lane detection methods presented to us during lectures. This was done because we knew more about these approaches than about other possible techniques like using neural networks.
After deciding on our approach, we conceptualize the pipeline of functions we wanted to use to detect the lanes. This first pipeline was then used as a starting point to develop the different functions used to detect lanes, although, of course, changes to the pipeline had to be made during development. We also implemented some functions that were not part of the lecture, e.g., sliding windows for line detection, as they were helpful in achieving better results on the given images and videos.
Alternatives
Another possible approach would have been using neural networks. Using them, it might have been possible to develop lane detection, which would be more generally applicable. But as they were not a big part of the lecture and can be pretty tricky to debug if they don't detect what they are supposed to detect, we ultimately decided against using them.
During the development of lane detection, it was also often necessary to decide which specific functions to use and which not to use. For example, the canny edge was not used as the implemented Sobel filter was more effective. Furthermore, some functions could not be implemented as they would have worsened the performance significantly.
Problems and Possible Solutions
One problem we encountered was that lane detection needed to be fixed on the challenge video. This was because the challenge video had a lot of shadows, and the thresholding did not work properly on the shadows. To solve this problem, we developed an algorithm that measures the divergence of a detected line compared to the previous only accepts the new line if the divergence is below a certain threshold. More information can be found in the documentation about lane.py
The core of our lane detection solution is thresholding (done in threshold.py) and detecting the actual lines with a polyfit operation and sliding windows technique (done in lane.py). More information, including a detailed description, about these steps in our pipeline can be found on the respective pages in the documentation.
Learnings
Both image thresholding and preprocessing are important parts of lane detection, as a good binary image is essential for proper lane detection.
It is quite easy to run into performance problems, making it important to think about which functions to implement. Even if they are beneficial in detecting the lanes, implementing them might not be worth it as they worsen the FPS by too much.
Problems that could not be solved
We did not manage to come even close to a solution for the harder challenge. This was due to the stark differences in brightness, which completely disrupted our lane detection. It might have been possible to change some parameters so that lane detection would perform better for the harder challenge video. Still, the lane detection would likely have performed worse for the other videos in return.
In general, Thresholds were chosen specifically so they work for the first two videos and the KITTI images. Due to this, many of the parameters chosen for the lane detection are somewhat "overfitted", and the lane detection would probably not work as well for new videos not tested during development. For lane detection that works more universally, it would have been necessary to try it on a wide array of different videos instead of just ensuring that it works well for two specific ones.
Outlook
Thresholding could be improved further and lead to better lane detection. For example, one way to improve it is to separate the window in two parts so that the thresholds can be applied locally.
To make the lane detection more useful it could be beneficial to change some parameters so that the lane detection works less well on the two videos but better on other videos which were not considered during development.
As the performance was constantly above 25 FPS and only 20 FPS is necessary, there is still some room to use more computationally expensive functions which decrease the FPS but improve the lane detection.
License
This project is licensed under MIT.