Building an autonomous robot car

This project is about an autonomous vehicle, based on a modified toy RC car, that can drive along a “road” without any manual interaction required.

To this end, the car’s remote control is modified so that it can be attached to a microcontroller, which receives commands from a Python program running on a laptop. The camera, mounted on top of the car, streams its view wirelessly to a neural net on the laptop, which decides which steering command is most appropriate at every time step/frame.

In this post, I will present how to modify the remote control (soldering and mechanical changes), how to extend the car, and how to stream live video, with low latency, from the Raspberry Pi to a laptop using GStreamer and OpenCV. An upcoming post will show a reliable neural net model for automated steering.

I originally published this article on LMR in December 2018, but I decided to post an updated version, with better formatting and some editing, on my own website.

Introduction

Initially, I wanted to use the Boe-Bot robot base I already had, which I had modified to carry a kind of arm (pan-and-tilt mechanism) for a camera and a ToF sensor to measure distances. I extended it further with custom-made encoders that keep track of the wheels’ position, a custom power supply, several additional levels made of wood for mounting various PCBs, etc.

Since it’s all a rather small form factor, it is a bit fiddly to get right. There are also too many things that need to be really reliable (e.g., measurement of wheel speed) before I could use this base for the autonomous car project. Finally, it uses differential steering, and not Ackermann steering like a real car.

So I decided to simply use a toy RC car as a basis, together with a Raspberry Pi, a power bank, and a remote control that could be modified. Searching the web, I even found several projects that did just that, which I used as inspiration.

To get a good overview of how (self-)driving and training works, it is useful to describe the two modes the car can be in.

Operating modes of the car

The car can be either in run mode (autonomous driving) or in training mode (learning to drive).

Run mode (autonomous driving)

The car should drive along a lane whose borders are delimited with paper sheets on the left and right side. Each camera image from the Raspberry Pi is sent over WiFi and fed into a neural net on the laptop, which outputs a steering command (one of: forward, forward left, forward right). The Arduino translates that command into a set of button presses on the remote control.
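
The per-frame decision in run mode can be sketched as follows. This is only a minimal illustration, not the project's actual code: the order of the classes in COMMANDS, the button names, and the function name decide are assumptions made for the example. Only the three commands themselves come from the description above.

```python
# Hypothetical order of the classes in the network's output vector (assumption).
COMMANDS = ("forward", "forward_left", "forward_right")

# Hypothetical names of the remote control buttons each command needs (assumption).
BUTTONS = {
    "forward": {"forward"},
    "forward_left": {"forward", "left"},
    "forward_right": {"forward", "right"},
}


def decide(scores):
    """Return (command, buttons) for the output scores of one camera frame."""
    if len(scores) != len(COMMANDS):
        raise ValueError("expected one score per command")
    # Pick the class with the highest score (argmax).
    best = max(range(len(scores)), key=lambda i: scores[i])
    command = COMMANDS[best]
    return command, BUTTONS[command]


if __name__ == "__main__":
    print(decide([0.1, 0.7, 0.2]))
```

The returned button set is what would then be handed to the microcontroller for each frame.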

This way the RC car drives autonomously along a lane, provided the neural net can extract the relevant features to derive the appropriate drive commands at each frame.

Training mode (learning to drive)

These features are learned in training mode. To that end, the user drives the car along a lane using the cursor keys on the laptop’s keyboard. At fixed time intervals, the currently pressed keys are stored in an array together with the current camera frame from the Raspberry Pi cam. The intervals (and the camera image quality) may be affected by network bandwidth or QoS issues, but ideally this procedure yields a recording of all the necessary information as a sequence of (camera frame, pressed keys) pairs.
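
A recording loop of this kind could look roughly like the sketch below. The class name Recorder, its method offer, and the injectable clock are assumptions for illustration. The frame can be any object, for example an image array from the camera.

```python
import time


class Recorder:
    """Collects (timestamp, frame, pressed_keys) samples at a fixed interval."""

    def __init__(self, interval_s, clock=time.monotonic):
        self.interval_s = interval_s
        self.clock = clock          # injectable so the timing can be tested
        self.samples = []           # plain list used as the recording buffer
        self._next_due = None

    def offer(self, frame, pressed_keys):
        """Store the pair if the interval has elapsed; return True if stored."""
        now = self.clock()
        if self._next_due is not None and now < self._next_due:
            return False
        # Freeze the key set so later key changes cannot alter the sample.
        self.samples.append((now, frame, frozenset(pressed_keys)))
        self._next_due = now + self.interval_s
        return True
```

Calling offer(frame, keys) from the main loop as often as possible keeps the stored samples roughly interval_s apart, and the timestamp makes it possible to check afterwards how regular the sampling really was.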

The neural net is then trained on these pairs until the steering command it predicts for each camera image matches the recorded examples well enough.
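
Before training, the recorded key presses have to be turned into class labels. The following sketch shows one possible encoding. The key names ("up", "left", "right"), the class order, and the rule of skipping samples without the up key are assumptions, not necessarily what the project does.

```python
# Hypothetical class order, matching the three steering commands (assumption).
CLASSES = ("forward", "forward_left", "forward_right")


def label_for(pressed_keys):
    """Map a set of pressed cursor keys to a class index, or None to skip."""
    keys = set(pressed_keys)
    if "up" not in keys:
        return None                     # not driving forward: no usable label
    if "left" in keys and "right" not in keys:
        return CLASSES.index("forward_left")
    if "right" in keys and "left" not in keys:
        return CLASSES.index("forward_right")
    return CLASSES.index("forward")     # up alone, or left and right together


def one_hot(index, size=len(CLASSES)):
    """Return the target vector for one class index."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def accuracy(predicted, expected):
    """Fraction of samples where the predicted class equals the recorded one."""
    matches = sum(p == e for p, e in zip(predicted, expected))
    return matches / len(expected)
```

The accuracy function gives a simple number for how well the predictions match the recorded examples, ideally computed on samples that were held back from training.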

Many different neural net architectures, or really any kind of machine learning classifier, can be used. The results can be improved in many ways, for example by collecting more training data or by tuning the architecture, layers, etc.

Once enough training data (= driving examples) has been collected and the prediction accuracy of the neural net is satisfactory, the net is saved to a file that can be loaded by the autonomous driver program for the car (= run mode).

Overview

In the rest of this post, I’ll describe the hardware implementation and modifications that were necessary, but also the software implementation. I will add some relevant code inline. If there is interest, I can post my complete project on GitHub later or link to similar projects, though I am still working on improving the neural network’s prediction accuracy. (I overcame the issues by training a car in a robot simulator.)

Modifying the remote control

To drive the car from a laptop keyboard (training mode) or let the program drive on its own (autonomous mode), the remote control has to be hacked and connected to a microcontroller (any Arduino will do) that is plugged into the laptop.

For the model of the car I have, it is rather easy to modify the remote control in a quick and dirty way (no need for soldering, either). You can directly attach probes to pins of the RC’s controller chip, from which traces directly lead to the user-accessible buttons. These buttons are purely on/off-switches (no analog signals), which also means you can only drive forward and backward at a fixed speed (ignoring the time necessary for acceleration due to inertia).

The button traces on the PCB are usually at around 3V-3.3V when high. Pulling one of the button traces down to ground will be detected as a button press.
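
On the laptop side, the button state has to be sent to the microcontroller in some form. The sketch below shows one possible encoding, with one bit per button packed into a single byte. The bit assignment, the one-byte protocol, and all function names are assumptions for illustration. The actual firmware and protocol are not shown in this post.

```python
# Hypothetical bit assignment for the four button traces (assumption).
BUTTON_BITS = {"forward": 0, "backward": 1, "left": 2, "right": 3}


def encode_buttons(pressed):
    """Pack the set of pressed buttons into one byte (bit set = pressed)."""
    value = 0
    for name in pressed:
        value |= 1 << BUTTON_BITS[name]
    return bytes([value])


def trace_levels(pressed):
    """Logic level per trace: 0 (pulled to ground) if pressed, else 1 (high)."""
    return {name: 0 if name in pressed else 1 for name in BUTTON_BITS}


def send_buttons(port, pressed):
    """Write the encoded state to any object that has a write(bytes) method."""
    port.write(encode_buttons(pressed))
```

With the pyserial package, an opened serial.Serial object provides such a write method. The port name and baud rate depend on the setup. The function trace_levels only illustrates the active-low behavior described above: a pressed button corresponds to a trace pulled to ground.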

This approach, however, has a couple of disadvantages. Unplugging the Arduino or powering off the laptop automatically pulls some traces down far enough to make the car drive forward unintentionally. On top of that, this is a rather fragile setup, where connections come loose and you get random short circuits just by touching the PCB.

Therefore, I made the effort to modify the remote control properly, so that it works without batteries, is powered directly by the microcontroller, and can be closed up again. It took me many days, since I have no 3D printer and had to invent several tricks to get by with the tools I had. Not least among them was mounting the black female header properly in the white casing, so that it stays put when you push leads in or pull them out again. (I essentially sculpted a socket out of hot glue, and removed other parts that were in the way.)

Soldering cables onto the remote control’s PCB and main IC

The unmodified board (apart from the VCC cable, which broke off).

I used Blu Tack to hold the wires in place when soldering them onto the controller IC.

I unsoldered the cables for VCC and GND, and drilled bigger holes to fit supplementary cables and make them available from the outside. Using a glass fiber pencil, I scraped off part of the solder mask, so I had a surface to solder on.

Solder mask removal with a glass fiber pencil, before soldering the additional (black and red) wires in place.

Then I soldered wires onto the relevant IC pins to bring out all the steering signals over the female header, together with the power rails.

Many new wires soldered onto the control IC and brought out over a female header (black rectangle).
From left to right: VCC (black), steering signals (green, white, yellow, blue), Gnd (red).

Mechanical mounting and fixation

I tentatively placed the female header inside the white case, after the PCB was mounted again, to see how much space was left and how to lock it in place. I used hot glue to make the female header higher, so that the top part of the case would hold it down by pressure.

Fitting the female header so it locks tightly into the white case.

It turned out this was insufficient to hold the header in place once the white case was closed again and cables were inserted into the female header. The pressure from the top part of the case prevented vertical movement of the female header, but nothing yet prevented horizontal movement when pushing cables in or pulling them out.

So I made a few more modifications, which involved a lot of fiddling. Maybe a 3D printer would have made it easier.

After trying various ideas, I cut, sawed, and carved with an X-Acto knife until I had enough room to put in hot glue as a support bed for the female header. I then carved a hole into the glue, just the right size to receive a rectangular shape that prevents horizontal movement.

Carved out a hole in the precisely dosed and repeatedly trimmed amount of hot glue.

The “male header” end of a jumper wire proved to be the perfect size to act as an anchor for the female header in the hot glue support bed.

Carved out hole filled with a spare male header case (black).

After successfully fitting the “male header” into the carved out hole, I glued it onto the bottom side of the larger female header, so that pushing the female header down into the hole locks it in place. That way horizontal movement is prevented effectively, ensuring stable placement, together with the pressure provided by the top part of the white RC case, which prevents vertical movement.

“Male header” super glued onto larger female header, to act as an anchor when placed in the white case.

Now everything can be assembled and disassembled completely, and only pressure/friction and screws hold things in place. So I can open it up and modify it or fix things easily if I need to in future.

Painting the solder joints

Short circuits were possible when the pins got squished close together as everything was mounted again. Usually, a female header should prevent this from happening, but due to the size of the solder joints and the female header being a little loose, painting was necessary.

It would have been better to use some kind of solder mask or even nail polish, but at that moment I just had some spray paint available. I am not sure why I didn’t use shrink tubing then; it would probably have made the cables too thick to fit next to each other.

Before painting, I used insulating tape to cover regions close to the solder joints.

Using electrical tape to prepare for painting.

Even though I put tape and aluminum foil around everything, it still made a mess, and I had to painstakingly clean each cable individually.

More protection of parts that should remain clean, using aluminum foil and masking tape.

The cleaned up result looks like this:

The coating is not perfect, but luckily it covered the solder joints everywhere necessary. After placing the “fork” pins back into the female header the painted parts become more obvious.

Not a perfect paint job, but enough paint where it matters to prevent short circuits.

Final result of the RC modification

When everything is mounted back together, the remote control looks almost the same, with the exception of a small opening where the female header peeks through.

Almost unmodified looking remote control, besides the new slit where cables can be inserted to connect it to a microcontroller.

You can remove the cables now, and use the remote control normally, as if it had never been hacked. Though I had quite a few issues initially with cables that broke while working, short circuits, and things that wouldn’t fit in the little space (as described above), now everything is nice, robust, and satisfying.

Obviously, you can also attach it to a microcontroller with a few cables, to control the car with a PC (which was the initial goal ;).

Extending the toy RC car

With the remote control finished, the next part is about modifying the toy RC car as well, since it has to observe its environment and stream that data back to the laptop, to receive steering commands (over the remote control).

One of the most information-rich and flexible ways would be to use a camera and process that data to steer accordingly. Or, you could build a sort of line follower, and use photodiodes that point to the ground to detect the lines that delimit the road by reacting to high contrast differences. But I was interested in using neural nets, which are well suited for processing complex input data, such as that obtained from a real-world scene through a camera.

Picking the power bank and the RC car model took a while, because I tried to find the car with the flattest roof, and a power bank which was light enough, yet had enough capacity. Otherwise the spring-loaded wheels are pushed down too much and steering is affected.

Additionally, I had plans to react to traffic signs, and mounted the camera properly with screws, but it turned out that you can either point the camera towards the ground, so you see enough of the ground close by, or install it higher up, to see the traffic signs. So for now I decided to focus on the “road”. Maybe in the future a kind of fisheye lens, or the newer Raspberry Pi Cam v2, which has a larger field of view, could make it possible to see the road and the signs simultaneously.

Car with power bank, cased Raspberry Pi, and Pi mounted on a cardboard “stand” all stacked upon its roof.

The picture above shows the finished, extended car. It has a power bank mounted directly on the roof, and on top of it a Raspberry Pi in a black case. The tall white cardboard box is there to mount the camera, which is just held with rubber bands. Initially, I mounted the camera with tiny screws at the top of the box, to capture road signs. But as mentioned earlier, the field of view of the camera is too small to also show the road just in front of the car from this height.

The end of the white cardboard box was cut and scored to obtain two lids, which could be fixed with double-sided tape to the inside of the upper part of the Raspberry Pi case. I used the slots for the camera and display cable to slide in the lids. Luckily, the base of the cardboard box I had available fit the gap between the two case slots almost perfectly.

The tall cardboard camera “stand” stably mounted on the Pi case, with scored and glued lids.

When the camera is mounted with screws at the top, a longer cable is necessary, which is also neatly hidden inside the cardboard box.

The power bank is attached to the roof of the car with Velcro, and since the Pi’s case has some rubber feet, it is enough to use rubber bands to hold it in place. This allows for easier unmounting (which I had to do surprisingly often).

Second patch aligned closely with first one.

Both Velcro counterparts are mounted here, so that I just needed to align the power bank properly and press it down to glue it onto the strips.

Once the WiFi dongle and the power bank are plugged into the Raspberry Pi, and the cables are held in place with rubber bands, all is set to program the Raspberry Pi for streaming the video signal.

Streaming Pi cam video to laptop

I was looking for a low latency method to stream the Pi cam video to my laptop, since the timestamps of the steering commands and corresponding camera frames should match up as precisely as possible, to take a proper “sensor snapshot” at each point in time.

If that basis is not correct, it may affect training, since the correlation between camera image and required keypresses is the only information the car gets. There are no distance sensors, or wheel encoders, though all of that information could be useful, especially to correct obviously wrong decisions by the neural net. For now I want to keep it reasonably simple, though.

On the Raspberry Pi side the following command should be executed to initiate streaming:

raspivid -n -w 320 -h 240 -fps 60 -b 2000000 -t 0 -o - | gst-launch-1.0 -v fdsrc ! h264parse ! rtph264pay config-interval=10 pt=96 ! udpsink host=<laptop-ip> port=<laptop-port>

It starts the camera and captures video at a resolution of 320×240 and 60 frames per second with no preview (-n, which reduces CPU usage on the Pi), writing the stream to stdout. GStreamer picks it up from there (fdsrc) and sends it over UDP (for minimum latency) to the IP address and port given as <laptop-ip> and <laptop-port>, packaged as Real-time Transport Protocol (RTP) packets by rtph264pay. Other protocols can be used, such as the GStreamer Data Protocol (GDP), but it is less reliable. It requires the streamer source (Pi) and streamer sink (laptop) to be started in the right order, which also forces the laptop to wait for a while and hope to connect only once the source is ready. RTP over UDP avoids this ordering problem, since the sender simply transmits packets without first establishing a connection, and it has only a marginally higher latency.
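
To get a feeling for the numbers in that command, here is a small back-of-the-envelope calculation. It assumes that the encoder actually reaches the requested bitrate on average and that a decoded frame uses 3 bytes per pixel. Both are assumptions for illustration only.

```python
# Values taken from the raspivid options -w, -h, -fps and -b.
WIDTH, HEIGHT, FPS = 320, 240, 60
BITRATE_BPS = 2_000_000                    # -b is given in bits per second

# Time between two consecutive frames.
frame_interval_ms = 1000 / FPS

# Average size of one H.264-encoded frame, if the target bitrate is met.
encoded_bytes_per_frame = BITRATE_BPS / FPS / 8

# Size of one decoded frame, assuming 3 bytes per pixel (assumption).
raw_bytes_per_frame = WIDTH * HEIGHT * 3

if __name__ == "__main__":
    print(f"frame interval: {frame_interval_ms:.1f} ms")
    print(f"encoded frame:  {encoded_bytes_per_frame:.0f} bytes on average")
    print(f"decoded frame:  {raw_bytes_per_frame} bytes")
```

So a new frame arrives roughly every 17 ms, and each encoded frame is only a few kilobytes on average, far smaller than the decoded image.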

To reduce boring repetition, my code uses plink.exe to run the command remotely once the main Python program runs on my laptop, so there is no need to manually log in to the Pi and run the command.
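
Starting the stream remotely could be sketched like this. The function names, the user and host values, and the use of -batch (which disables interactive prompts and therefore assumes key-based authentication is already set up, e.g., through Pageant) are assumptions. My actual code may differ.

```python
import subprocess

# The streaming command from above, with placeholders for the laptop address.
STREAM_COMMAND = (
    "raspivid -n -w 320 -h 240 -fps 60 -b 2000000 -t 0 -o - | "
    "gst-launch-1.0 -v fdsrc ! h264parse ! "
    "rtph264pay config-interval=10 pt=96 ! "
    "udpsink host={host} port={port}"
)


def build_plink_args(pi_host, pi_user, laptop_ip, laptop_port, plink="plink.exe"):
    """Build the argument list that runs the streaming command on the Pi."""
    remote = STREAM_COMMAND.format(host=laptop_ip, port=laptop_port)
    # plink syntax: plink [options] [user@]host [command]
    return [plink, "-ssh", "-batch", f"{pi_user}@{pi_host}", remote]


def start_stream(pi_host, pi_user, laptop_ip, laptop_port):
    """Launch plink in the background and return the process handle."""
    return subprocess.Popen(
        build_plink_args(pi_host, pi_user, laptop_ip, laptop_port)
    )
```

Keeping the returned process handle around allows the main program to terminate the remote stream again when it exits.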

Capturing the video stream on the laptop is done using OpenCV. Unfortunately, GStreamer support has to be enabled manually, and the whole package needs to be recompiled in VS C++, which takes a long time. Once that is done, replacing the original cv2.pyd (.pyd files are really DLL files) of your Python installation with the newly created one enables GStreamer support.

After this, the following capture string will work (Python code on the laptop):

import cv2

# Replace <laptop-port> with the port used in the udpsink on the Pi.
capture_string = 'udpsrc port=<laptop-port> caps="application/x-rtp, media=(string)video, clock-rate=(int)90000, encoding-name=(string)H264" ! rtph264depay ! video/x-h264,width=320,height=240,framerate=60/1 ! h264parse ! avdec_h264 ! videoconvert ! appsink sync=false'
cap = cv2.VideoCapture(capture_string)

appsink is really the essential parameter to make it work with OpenCV’s VideoCapture, instead of the standalone gst-launch-1.0.exe.

All of this code was encapsulated into easily useable Python classes, so that streaming is as simple as pulling a frame out of a queue. Since capturing frames runs in a separate thread, it does not delay processing of key strokes that drive the car.
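
The idea of a capture thread that feeds a queue can be sketched as follows. This is a simplified illustration, not my actual classes: the name FrameGrabber and the policy of dropping the oldest frame are assumptions. The capture argument can be any object whose read() returns an (ok, frame) pair, which is what cv2.VideoCapture does.

```python
import queue
import threading


class FrameGrabber:
    """Reads frames in a background thread and keeps only the newest ones."""

    def __init__(self, capture, max_frames=1):
        self.capture = capture                  # object with read() -> (ok, frame)
        self.frames = queue.Queue(maxsize=max_frames)
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()
        return self

    def _run(self):
        while not self._stop.is_set():
            ok, frame = self.capture.read()
            if not ok:
                break                           # stream ended or failed
            if self.frames.full():
                try:
                    self.frames.get_nowait()    # drop the oldest frame
                except queue.Empty:
                    pass                        # consumer took it in the meantime
            self.frames.put(frame)

    def get(self, timeout=None):
        """Return the next available frame; raises queue.Empty on timeout."""
        return self.frames.get(timeout=timeout)

    def stop(self):
        self._stop.set()
        self._thread.join(timeout=1.0)
```

With max_frames=1 the consumer always gets the most recent frame instead of working through a backlog, which matters when the frames have to match the current keystrokes.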

All this recorded data is then stored as NumPy arrays, which are used for training, or for inspection to verify that the recording really captured the right frame for each keystroke. (Initially I had problems where the keystrokes and frames would not match up, or the frames were from several seconds earlier.)

Getting everything right and reducing latency took a while. One of the lessons was that Python’s lists are faster at storing the recorded data pairs (keystroke, frame) than NumPy arrays. So using lists as a temporary buffer improved latency quite a lot.

I also wrote a little Delphi program to visualize the recorded data, since Python is not really practical for developing interactive GUIs. I could have used C#, but Delphi programs are just snappier.

More to follow

So far, I have made it drive along a curve. I trained the neural net in one room, but as training is highly specific to the environment (viewed pictures/frames), I have to retrain it for the room that is available to me now.

Later I might add a video and some notes about the programming, if there is interest.

I am at the stage of experimenting with various neural network architectures. Transfer learning would probably be the most robust.

TensorFlow is too slow for real-time driving, which is why I am currently using a simple feed-forward network, as provided by OpenCV. This, however, is too inaccurate.

Smaller models, such as those generated by TensorFlow Lite or TensorFlow Lite Micro, or running the neural net directly on the Raspberry Pi instead of the laptop, would reduce the latency. Donkey cars seem to manage fine.

More on dealing with these issues in an upcoming post, which uses simulation to experiment with various approaches, sensor properties (e.g., the FOV of the camera), steering angles, and tracks, and also makes it possible to relax timing constraints (pauses can be introduced).