After struggling to make a neural net that would predict steering commands reliably for an autonomous toy RC car, only based on the current camera view (no history), I approached the problem systematically in a robot simulator, which allowed for faster experimentation, finally leading to success.
With only two simple training tracks, one with a 90° left curve and the other one with a 90° right curve, I was able to teach reliable driving behavior. The neural net generalizes better than expected, such that the self-driving car stays on the “road”, even for tracks differing significantly from the training data.
Given more varied examples of successful steering, the driving behavior could become a lot smoother than the video shows. But interestingly, the convolutional neural network (CNN) seems to interpolate nicely between the provided training examples, and is able to handle unknown degrees of road bends.
It even manages to drive through road crossings (see after the break), if a little awkwardly, since crossings “look confusing” and were never trained. When positioned outside of the track facing it at a slight angle, the car also manages to steer in the “hinted” direction and aligns properly with the track!
This project is about an autonomous vehicle, based on a modified toy RC car, that can drive along a “road” without any manual interaction required.
To this end, the car’s remote control is modified so it can be attached to a microcontroller, that receives commands from a Python program running on a laptop. The camera, mounted on the top of the car, streams its view wirelessly to a neural net on the laptop, that decides what steering commands are the most appropriate at every time step/frame.
In this post, I will present how to modify the remote control (soldering and mechanical changes), how to extend the car, and how to stream live video, with low latency, from the Raspberry Pi to a laptop using GStreamer and OpenCV. An upcoming post will show a reliable neural net model for automated steering.
Parsing the structured text files, provided by the Unicode consortium, at each startup is too inefficient, and merely storing the parsed text into a simple integer array wastes too much memory.
A more efficient storage uses a dictionary-like approach, to compress the needed data using a few layers of indirections, while still giving array-like performance with constant (and negligible) overhead.
In the following, I’ll briefly present the solution I found.
Analyzing these heatmaps can point out undesired correlations in the training data, between samples and labels. For example, an image classifiers for train track might rely on objects that are present in each picture (such as , while not being present in pictures of counter examples for horses. This artifact in the collected data set may be subtle, and not noticeable to a human, but would be visible on the heatmap that highlights the critical features in each image that drove the classification.