Exclusive: The Tesla AutoPilot – An In-Depth Look At The Technology Behind the Engineering Marvel


Tesla's Autopilot System - Mobileye

Diligent readers will know that Mobileye powers the self-driving capabilities of Tesla vehicles. I will go over the features of this system in this part, whereas the second part of the Mobileye section will contain a hardware overview as well as a comparison to Nvidia's offerings.

Introduction to Autonomous Driving and ADAS

Mobileye refers to driving automation in three broad milestones.

The first milestone is ADAS, or Advanced Driver Assistance Systems. ADAS assumes that the driver is in control of the car most of the time but provides assistance or emergency capabilities. This includes the likes of AEB (Automatic Emergency Braking), Adaptive Cruise Control, Collision Avoidance Systems and similar features. This is now part and parcel of most high-end (and mainstream) vehicles, with a select few (including Tesla) even having advanced ADAS features like Emergency Auto Steering.

The second milestone is Semi-Autonomous driving, something only Tesla can claim at the moment. It consists of the car driving itself (hands off the steering wheel), with the driver still required for regular monitoring. In this case, the car will hand over control to the human in various scenarios. The human driver is assumed to be an active participant in the process - albeit one who doesn't intervene for some (if not most) of the time. Basically, if you crash the car while on Autopilot - you are responsible.

The final milestone is fully Autonomous Driving, in which a car can go from Point A to Point B without any human monitoring and can tackle all sorts of scenarios on its own. The role of the human driver here is completely passive and should remain so for the duration of the trip. It is this future that automobile companies are striving for and that companies like Mobileye and Google are racing towards.

Mobileye plans to complete the Semi-Autonomous phase of self-driving cars by 2018, during which the capability of a semi-autonomous car will be extended from highways to country roads and finally to city roads. Note that the Tesla Autopilot currently does not work on roads where the lane markings aren't clear - even though Mobileye's system is perfectly capable of holistic path planning without any markings.

Tesla's autopilot system is unique in many ways - but one of the first things worth mentioning is that:

Tesla's Autopilot, powered by Mobileye, is the world's first DNN deployed on the road

I think the best way to tackle the process that goes on behind the scenes is to simply break it down into parts. Please note that while Tesla uses a plethora of software, the bulk of ADAS and Semi-Autonomous driving is handled by Mobileye's chip. The process shown above is a high-level diagram of how the Tesla Autopilot system functions. Some of it consists of algorithmic functions such as motion segmentation, ego-motion, camera solving etc., but the really interesting part is the functions based on a DNN (deep neural network). As mentioned above, Mobileye has deployed the first DNN on the road with Tesla EVs, and that network is responsible for the following (major) jobs:

  • Free Space Pixel Labeling
  • Holistic Path Planning
  • General Object Detection
  • Sign Detection

A very pertinent point to make here is that there is a difference between the core Mobileye DNN and the system Tesla is using to 'learn' - they are not the same. To reiterate:

The system Tesla EVs use to make the autopilot 'learn' over time is an implementation of their own design and not related to Mobileye

One of the primary things that the Tesla Model X and Model S are capable of is 3D modelling of vehicles on the road. Owners might have guessed as much, since the quaint little cars that appear on the instrument cluster are generated after this is done. The "follow-the-car" approach Tesla so happily utilizes also depends on it. All this is done, of course, by machine learning. The DNN in question was trained on side, front and rear views of various cars until it could detect them with reasonable accuracy and consequently construct a 3D model of each (just a plain box showing the area occupied in real 3D space).
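To give a feel for how a flat camera detection can be turned into a position in 3D space, here is a minimal sketch based on the textbook pinhole-camera relation (distance = focal length × real height / pixel height). It is not Mobileye's or Tesla's method, which the source does not describe. The names Box2D, estimate_distance_m and lateral_offset_m, the 1.5 m default vehicle height and the focal length in the usage note are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Box2D:
    """Hypothetical pixel-space detection: left, top, right, bottom edges."""
    left: float
    top: float
    right: float
    bottom: float


def estimate_distance_m(box: Box2D, focal_px: float, real_height_m: float = 1.5) -> float:
    """Pinhole-camera range estimate: distance = focal_px * real_height / pixel_height.

    real_height_m is an assumed typical vehicle height, not a measured value.
    """
    pixel_height = box.bottom - box.top
    if pixel_height <= 0:
        raise ValueError("box must have positive height")
    return focal_px * real_height_m / pixel_height


def lateral_offset_m(box: Box2D, focal_px: float, image_center_x: float, distance_m: float) -> float:
    """Sideways position of the box center relative to the camera axis, in metres.

    Positive values mean the object sits to the right of the image center.
    """
    center_x = (box.left + box.right) / 2.0
    return (center_x - image_center_x) * distance_m / focal_px
```

With an assumed focal length of 1000 pixels, a detection 75 pixels tall works out to roughly 20 m ahead; the distance and lateral offset together give the spot where a plain 3D box would be placed.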

Free Space Pixel Labeling is, simply put, recognizing the area in the camera image that is free of obstructions. It is also the area onto which the car is allowed to drive. This allowable area is shaded in green in the images below (real output generated by the Mobileye DNN). As you can see, road edges and vehicle edges are being correctly recognized with very few visual cues. This part of the system is critical, because if executed incorrectly it could result in the car veering off-road, into an object or worse.
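Once every pixel carries a free/not-free label, a simple way to use that output is to measure how far the free area extends in front of the car. The sketch below assumes the network's output has already been reduced to a grid of 0/1 labels (1 = free, row 0 at the top of the image). The grid format and the function name free_space_per_column are hypothetical, and the labeling itself, the hard part done by the DNN, is not shown.

```python
def free_space_per_column(labels):
    """Count contiguous free pixels in each column, starting at the bottom row.

    labels: list of rows, each a list of 0/1 values; 1 marks a free pixel.
    The bottom row is the part of the image closest to the car, so the count
    is a rough measure of how far the car could move forward in that column.
    """
    if not labels:
        return []
    height = len(labels)
    width = len(labels[0])
    depths = []
    for col in range(width):
        depth = 0
        # Walk upward from the bottom row until the first non-free pixel.
        for row in range(height - 1, -1, -1):
            if labels[row][col] != 1:
                break
            depth += 1
        depths.append(depth)
    return depths


# Tiny made-up label grid: an obstruction occupies the right-hand side.
example = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 1, 1, 0],
]
print(free_space_per_column(example))  # [2, 3, 1, 0]
```

Note that the free pixel above the obstruction in the third column is not counted: it is only free space the car cannot reach without driving through something.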

Given below are representations of the Holistic Path Planning capability of the Mobileye chip, which allows it to decide the way forward with very few visual cues. This is the process that tells the car where to drive and controls the steering. Anyone with knowledge of how these things work would agree that it is remarkable that the processor is able to distinguish the road without any high-contrast lane markings. Note that this feature is only partially available on the Tesla (probably because the manufacturer has decided to play it safe) and only works with clear lane markings.
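To illustrate the last step of that process, turning a planned path into a steering input, here is a deliberately simple geometric sketch. It assumes the road's left and right edges are already known for a few image rows, takes the midpoint as the path, and steers in proportion to the average offset. This is a stand-in for illustration only: the actual system infers the path with a DNN, and the function names, the gain value and the sign convention are assumptions.

```python
def center_path(left_edges, right_edges):
    """Midpoint between the left and right road edge for each image row.

    Both inputs are x-coordinates in pixels, one value per row.
    """
    if len(left_edges) != len(right_edges):
        raise ValueError("edge lists must have the same length")
    return [(left + right) / 2.0 for left, right in zip(left_edges, right_edges)]


def steering_hint(path, vehicle_x, gain=0.01):
    """Proportional steering hint from the mean lateral offset of the path.

    Positive means the path lies to the right of the vehicle's position in
    the image. gain is an arbitrary illustrative constant.
    """
    if not path:
        return 0.0
    mean_offset = sum(x - vehicle_x for x in path) / len(path)
    return gain * mean_offset
```

For edges at [100, 120, 140] and [500, 520, 540] the path is [300.0, 320.0, 340.0]; seen from a vehicle at x = 300 it drifts right, so the hint comes out positive.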

Of course, driving isn't just about flooring the accelerator and steering (though some might argue otherwise); situational and contextual awareness is crucial. This is where Mobileye's Object Detection capabilities come in: a much more traditional application of DNNs. In this case, the chip on board the Tesla is capable of identifying over 250 signs in more than 50 countries. These include everything from turn signs to speed limits. The system is also capable of identifying and interpreting traffic lights, road markings and general items such as traffic cones. It even has the capability to detect large animals that appear suddenly on the road - and, of course, human pedestrians.
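A classification network of this kind typically ends with one raw score per known class, and a common final step is to convert those scores into probabilities with a softmax and act only on confident results. The sketch below shows that step alone, with made-up scores and labels; the 0.8 threshold and the function names are assumptions, and nothing here reflects how Mobileye's chip is actually implemented.

```python
import math


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify_sign(scores, labels, min_confidence=0.8):
    """Return (label, probability) for the best class.

    If the best probability is below min_confidence, the label is None,
    meaning the detection is too uncertain to act on.
    """
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < min_confidence:
        return None, probs[best]
    return labels[best], probs[best]


# Made-up labels and scores for illustration.
labels = ["speed_limit_50", "stop", "yield"]
print(classify_sign([0.1, 5.0, 0.2], labels)[0])  # stop
print(classify_sign([1.0, 1.0, 1.0], labels)[0])  # None
```

Rejecting low-confidence results matters here: acting on a misread speed limit is worse than ignoring a sign the system could not read.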

Last but not least is the capability to detect the road surface as well as any debris present. This allows the Tesla to be aware not only of what kind of road the car is traveling on (highway vs countryside etc.) but also of debris and other undesirables such as potholes on the road (and consequently to avoid them). The DNN-based system is even able to identify the type of tarmac and road composition and adjust steering and electronic stabilization accordingly - something that will be part and parcel of tomorrow's smart car.

While we are on the topic of general features, it is worth mentioning that most of these capabilities were originally designed to run on a monocular setup - which means that they were designed to function with only one primary camera. This has since been expanded to larger surround configurations which provide much more visual coverage and give the car owner more flexibility and reliability. Thanks to the increase in processing power offered by the current-generation Mobileye chip, complete 360-degree surround awareness is now part of the package, although most of the ADAS and driving simulation still uses the camera and radar setup on the front.

Many of these capabilities remain dormant for the time being, until Tesla deems them ready for activation

Before we delve into the nitty-gritty of the hardware involved, I would like to point out that many of these features have been adapted by Tesla and will only be activated, or restored to their original state, once the company feels the time is right. The system is approaching zero tolerance for mistakes - but it is very rightly said to be a beta program. As the system "learns" and becomes more adept at navigating without human assistance, this should change, but until then, Tesla owners should be proud of the fact that they are driving nothing less than an absolute technological marvel. Here is a gif that surfaced a while back showcasing what a Tesla 'sees'. Readers should be able to spot the various types of DNN-based techniques at work here: