Tesla Produces Its Own Chips

Tesla hinted at this before, but apparently its long-term plan is to build its own autonomous vehicle chips. The company is taking “vertical integration” to a whole new level.

(Interestingly, when I looked up vertical integration on Wikipedia just now, the opening paragraph of the article lists Ford as an example. The more things change, the more they stay the same.)

Elon Musk apparently announced this at an event for AI researchers in Long Beach last week, concurrent with NIPS 2017.

The event was live-tweeted by Stephen Merity, who is worth a read in his own right.

It Starts with Data

Like many commuters, I listen to podcasts, and, again like many commuters, one of my favorite podcasts is 99% Invisible. The show is about design and all of the ways that design shapes the world, mostly in ways we don’t see unless somebody points them out to us.

The most recent episode is “The Nut Behind the Wheel”.

When I saw the episode title, I naturally thought it referred to a generic nut behind the wheel, that is, to the idea that humans are just bad drivers.

But the episode actually refers to a specific nut, Hugh DeHaven, whom Wikipedia calls “the father of crash survivability.”

DeHaven created the first comprehensive database of crash information, which has since become NHTSA’s Fatality Analysis Reporting System. This database collects extensive data on every fatal automotive crash in the United States.

This data is used by automotive manufacturers, safety certification organizations, and governmental regulators, all of whom study the crashes to find ways to make cars safer. Sixty years ago, this started with the collapsible steering column, and of course it has since grown to include airbags, crumple zones, and collision alerts.

It’s a great podcast episode and kind of a wild story, including DeHaven climbing ladders and dropping egg-filled containers to see which materials cushioned the fall best.

Everything starts with collecting data. As the proverb goes, what gets measured gets managed.

Delphi Automotive Becomes Aptiv

Reuters reporter Paul Lienert scored one of the first post-spinoff interviews with Kevin Clark, the CEO of Aptiv. Aptiv is a spinoff from Delphi, one of the world’s foremost Tier 1 automotive suppliers. The existing Delphi Technologies will retain the core business of automotive supply, whereas Aptiv will focus on autonomous technology.

In this vein, Delphi’s recent acquisition of nuTonomy will live within the Aptiv spinoff.

The split will hopefully resolve some potential tension for Delphi, as its new autonomous business seemed to be increasingly moving toward competition with the customers of its core automotive supply business. By splitting the companies, the legacy Delphi Technologies business may retain its credibility as a supplier, without carrying a side division engaged in competition with key customers.

One of the key insights to come out of the Reuters interview is Kevin Clark’s statement that the cost of autonomous technology will drop by more than an order of magnitude over the next seven or so years.

While current estimates for the cost of a self-driving hardware and software package range from $70,000 to $150,000, “the cost of that autonomous driving stack by 2025 will come down to about $5,000 because of technology developments and (higher) volume,” Clark said in an interview.

Delphi is one of the leaders in the development of automotive technology, all the more so with its acquisition of nuTonomy. And its history as a Tier 1 supplier gives it greater insight than most other companies into how costs and production will scale.

So this seems like a prediction to take seriously. And if it comes to pass, that will be a game-changer. At $5000 marginal cost, consumers really could own their own self-driving vehicles, without relying on ride-sharing companies.

Of course, there are a host of reasons why consumers still might not want to own cars in the future — the costs of mapping, geofences, cratering costs of shared transportation. But $5000 autonomy would make plausible a lot of scenarios that thus far have seemed unlikely.

The “Deep Neural Networks” Lesson

Lesson 7 of the Udacity Self-Driving Car Engineer Nanodegree Program is “Deep Neural Networks.”

I am continuing on my quest to write a post detailing every one of the 67 lessons that currently comprise our Self-Driving Car Engineer Nanodegree program curriculum, and today, we look at the “Deep Neural Networks” lesson!

Students actually start learning about deep neural networks prior to this lesson, but this is the lesson where students begin to implement deep neural networks in TensorFlow, Google’s deep learning framework.

In the previous lesson, “Introduction to TensorFlow,” students learned to use TensorFlow to build linear models, like linear or logistic regression. In the “Deep Neural Networks” lesson, students learn new techniques in TensorFlow, to build up these models into neural networks.

Some of the most important building blocks of neural networks are demonstrated in TensorFlow:

  • Activation functions help neural networks represent non-linear models
  • Backpropagation trains neural networks from real data quickly and efficiently
  • Dropout removes neurons randomly during training to prevent overfitting the training data, which makes the model more accurate on new data
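
To make those ideas concrete, here is a rough numpy sketch of an activation function and dropout (this is my own illustration, not the lesson’s actual TensorFlow code, and the layer sizes are made up):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, purely for illustration.
x = rng.standard_normal((32, 100))        # a batch of 32 inputs
W = rng.standard_normal((100, 50)) * 0.1  # weights of one hidden layer
b = np.zeros(50)

# Activation function: ReLU lets the network represent non-linear models.
hidden = np.maximum(0, x @ W + b)

# Dropout: randomly zero out neurons during training, scaling the
# survivors so the expected activation stays the same.
keep_prob = 0.5
mask = rng.random(hidden.shape) < keep_prob
hidden_dropped = np.where(mask, hidden / keep_prob, 0.0)
```

In TensorFlow, the lesson expresses the same ideas through the library’s built-in operations rather than raw matrix math.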

Students also learn some practical skills, like how to save and restore models in TensorFlow.

Future lessons take these basic skills and help students apply them to important problems for autonomous vehicles, like how to recognize traffic signs.

Self-Driving Cars in Boston

nuTonomy announced a while ago that they would be testing self-driving cars in Boston, but then I kind of lost track of that, especially in the wake of the Delphi acquisition.

Recently WBUR reported that nuTonomy actually already completed its first pilot program in Boston. Seems like it happened under stealth:

Over a two-week trial in November, a select group of volunteers tested out nuTonomy’s self-driving cars in Boston. The participants hailed a ride using the company’s booking app. The trips they took looped around the Seaport District, starting at the company’s Drydock Ave. office and moving onto Summer Street into downtown Boston and back along Congress Street.

Sounds like everything went well, and in fact WBUR reports that another local company, Optimus Ride, is also testing in Boston.

I used to joke that there’s a reason every self-driving car company is testing in California, Nevada, or Arizona — lots of sun and warmth.

But with Uber in Pittsburgh and these companies in Boston, we’re making small steps to all-weather support for self-driving cars.

The “Introduction to TensorFlow” Lesson

The sixth lesson of the Udacity Self-Driving Car Engineer Nanodegree Program is “Introduction to TensorFlow.”

TensorFlow is Google’s library for deep learning, and one of the most popular tools for building and training deep neural networks. In the previous lesson, “MiniFlow,” students built their own miniature version of a deep learning library. But for real deep learning work, an industry-standard library like TensorFlow is essential.

This lesson combines videos from Vincent Vanhoucke’s free Udacity Deep Learning course with new material we have added to support installing and working with TensorFlow.

Students learn the differences between regression and classification problems. Then they build a logistic classifier in TensorFlow. Finally, students use fundamental techniques like activation functions, one-hot encoding, and cross-entropy loss to train feedforward networks.
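
To illustrate two of those techniques (again, my own numpy sketch rather than the lesson’s TensorFlow quizzes, with made-up labels and logits):

```python
import numpy as np

labels = np.array([0, 2, 1])   # three made-up training labels
num_classes = 3

# One-hot encoding: each label becomes a vector with a single 1 in it.
one_hot = np.eye(num_classes)[labels]

# Softmax turns raw network outputs (logits) into probabilities.
logits = np.array([[2.0, 0.1, 0.1],
                   [0.2, 0.3, 3.0],
                   [0.1, 1.5, 0.4]])
exp = np.exp(logits - logits.max(axis=1, keepdims=True))
probs = exp / exp.sum(axis=1, keepdims=True)

# Cross-entropy loss: low when the true class gets high probability.
loss = -np.mean(np.sum(one_hot * np.log(probs), axis=1))
```
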

Most of these topics are already familiar to students from the previous “Introduction to Neural Networks” and “MiniFlow” lessons, but implementing them in TensorFlow is a whole new animal. This lesson provides lots of quizzes and solutions demonstrating how to do that.

Towards the end of the lesson, students walk through a quick tutorial on using GPU-enabled AWS EC2 instances to train deep neural networks. Thank you to our friends at AWS Educate for providing free credits to Udacity students to use for training neural networks!

Deep learning has been around for a long time, but it has only really taken off in the last five years because of the ability to use GPUs to dramatically accelerate the training of neural networks. Students who have their own high-performance GPUs can experience this acceleration locally. But many students do not have GPUs, and AWS EC2 instances let them achieve the same results from anywhere.

The lesson closes with a lab in which students use TensorFlow to perform the classic deep learning exercise of classifying characters: ‘A’, ‘B’, ‘C’ and so on.

How to Solve the Trolley Problem

The Trolley Problem is a favorite conundrum of armchair self-driving car ethicists.

In the original version of the problem, a trolley is running down the tracks, about to run over five people tied to the rails. What if you could throw a switch that would send the trolley down a different track? But what if that track had one person tied down? Would you actually throw the switch to kill one person, even if it meant saving the other five? Or would you let five people die through inaction?

The self-driving car version of this problem is simpler: what if a self-driving car has to choose between running over a pedestrian, or driving off a cliff and killing the passenger in the vehicle? Whose life is more valuable?

USA Today’s article, “Self-driving cars will decide who dies in a crash” does a reasonable job tackling this issue in-depth, from multiple angles. But the editors didn’t do the article any favors with the headline. It’s not actually self-driving cars that will decide who dies, it’s the humans that design them.

Sebastian Thrun, my boss and the former head of the Google Self-Driving Car Project, has explained why this isn’t a useful question.

I’ve heard another automotive executive call it “An impossible problem. You can’t make that decision, so how can you expect a car to solve it?”

To be honest, I think of it as an unhelpful problem because we don’t have enough data to know, at any given point, how likely the car is to kill anybody. Fatal accidents involving self-driving cars haven’t yet happened in any meaningful numbers, so the data necessary to even work on the problem doesn’t exist.

But, I think I’ve come to a conclusion, at least about the hypothetical ethical dilemma:

The car should minimize the number of people who die, by following utilitarian ethics.

This raises some questions about how to value the lives of children versus adults, but I assume some government statistician in the bowels of the Department of Labor has worked that out.
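
Stated as code, the utilitarian rule is just a minimization over expected fatalities. A toy sketch, with entirely invented maneuvers and numbers:

```python
# Toy sketch of a utilitarian decision rule: among the maneuvers a
# planner considers feasible, pick the one with the fewest expected
# fatalities. The options and values here are invented for illustration.
options = {
    "brake_in_lane": 2.0,
    "swerve_left": 1.0,
    "swerve_right": 3.0,
}

choice = min(options, key=options.get)
print(choice)  # swerve_left
```

The hard part, of course, is not the argmin; it is producing fatality estimates with any confidence at all, which is exactly the data problem described above.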

So why should self-driving cars be utilitarian? Because people want them to be.

From USA Today:

Azim Shariff, an assistant professor of psychology and social behavior at the University of California, Irvine, co-authored a study last year that found that while respondents generally agreed that a car should, in the case of an inevitable crash, kill the fewest number of people possible regardless of whether they were passengers or people outside of the car, they were less likely to buy any car “in which they and their family member would be sacrificed for the greater good.”

I’ve seen this in a few places now. The general public thinks cars should be designed to minimize fatalities, even if that means sacrificing the passengers. But they don’t want to ride in a car that would sacrifice passengers.

If you believe, as I do, and as Sebastian does, that these scenarios are vanishingly rare, then who cares? Give the public what they want. In the exceedingly unlikely event that a car has to make this choice, choose the option with the fewest fatalities.

And if people don’t want to ride in those cars themselves, they can choose not to. They can drive themselves, but of course that is pretty dangerous, too.

I’ll choose to ride in the self-driving cars.

Literature Review: Apple and Baidu and Deep Neural Networks for Point Clouds

Recently, Apple made what they must have known would be a big splash by silently publishing a research paper with results from a deep neural network that two of their researchers built.

The network and the paper in question were clearly designed for autonomous driving, which Apple has been working on, more or less in secret, for years.

The network in question — VoxelNet — has been trained to perform object detection on lidar point clouds. This isn’t a huge leap from object detection on images, which has been a topic of deep learning research for several years, but it is a new frontier in deep learning for autonomous vehicles. Kudos to Apple for publishing their results.

VoxelNet draws heavily on two previous efforts at applying deep learning to lidar point clouds, both by Baidu-affiliated researchers. Since the three papers kind of work as a trio, I did a quick scan of them together.

3D Fully Convolutional Network for Vehicle Detection in Point Cloud

Bo Li (Baidu)

Bo Li basically applies the DenseBox fully convolutional network (FCN) architecture to a three-dimensional point cloud.

To do this, Li:

  • Divides the point cloud into voxels. So instead of running 2D pixels through a network, we’re running 3D voxels.
  • Trains an FCN to identify features in the voxel-ized point cloud.
  • Upsamples the FCN output to produce two output tensors: an objectness tensor and a bounding box tensor.
  • The bounding box tensor is probably more interesting for perception purposes. It draws a bounding box around cars on the road.
  • Q.E.D.
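
As a rough illustration of the voxelization step, here is my own numpy sketch (the ranges and resolution are arbitrary picks, not Li’s):

```python
import numpy as np

rng = np.random.default_rng(0)

# A fake lidar point cloud: 1000 points with (x, y, z) in meters.
points = rng.uniform(low=[0, -40, -3], high=[70, 40, 1], size=(1000, 3))

# Divide space into voxels of a chosen size (0.2 m, an arbitrary pick).
voxel_size = 0.2
origin = np.array([0.0, -40.0, -3.0])
voxel_idx = np.floor((points - origin) / voxel_size).astype(int)

# Occupancy grid: 1 wherever a voxel contains at least one point.
grid_shape = np.ceil((np.array([70, 40, 1]) - origin) / voxel_size).astype(int)
grid = np.zeros(grid_shape, dtype=np.uint8)
grid[voxel_idx[:, 0], voxel_idx[:, 1], voxel_idx[:, 2]] = 1
```

The FCN then consumes this 3D grid the same way a 2D convolutional network consumes an image.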

Multi-View 3D Object Detection Network for Autonomous Driving

Xiaozhi Chen, Huimin Ma, Ji Wan, Bo Li, Tian Xia (Tsinghua and Baidu)

A team of Tsinghua and Baidu researchers developed Multi-View 3D (MV3D) networks, which combine lidar and camera images in a complex neural network pipeline.

In contrast to Li’s solo work, which constructs voxels out of the lidar point cloud, MV3D simply takes two separate 2D views of the point cloud: one from the front and one from the top (bird’s eye). MV3D also uses the 2D camera image associated with each lidar scan.

That provides three separate 2D images (lidar front view, lidar top view, camera front view).

MV3D uses each view to create a bounding box in two dimensions. The bird’s-eye lidar view creates a bounding box parallel to the ground, whereas the front lidar view and the camera view each create a 2D bounding box perpendicular to the ground. Combining these 2D bounding boxes produces a 3D bounding box to draw around the vehicle.
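
For intuition, here is a simplified numpy sketch of how a bird’s-eye lidar view can be built (my own illustration with arbitrary ranges and resolution; MV3D’s actual input maps are richer, encoding height, intensity, and density):

```python
import numpy as np

rng = np.random.default_rng(1)

# Fake lidar points: (x forward, y left, z up), in meters.
points = rng.uniform(low=[0, -40, -2], high=[70, 40, 2], size=(2000, 3))

# Bird's-eye view: discretize (x, y) into a 2D grid, keeping the
# maximum height seen in each cell. 0.5 m per cell is an arbitrary pick.
res = 0.5
rows = int(70 / res)   # x spans [0, 70)
cols = int(80 / res)   # y spans [-40, 40)

bev = np.full((rows, cols), -np.inf)
r = (points[:, 0] / res).astype(int)
c = ((points[:, 1] + 40) / res).astype(int)
np.maximum.at(bev, (r, c), points[:, 2])

bev[np.isinf(bev)] = 0.0   # give empty cells a neutral height
```

Flattening 3D data into 2D images like this is exactly what lets MV3D reuse standard image-style convolutional networks on lidar.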

At the end of the network, MV3D employs something called “deep fusion” to combine output from each of the three neural network pipelines (one associated with each view). I’ll be honest — I don’t really understand how “deep fusion” works, so leave me a note in the comments if you can follow what they’re doing.

The results are a classification of the object and a bounding box around it.

VoxelNet: End-to-End Learning for Point Cloud Based 3D Object Detection

Yin Zhou, Oncel Tuzel (Apple)

That brings us to VoxelNet, from Apple, which got so much press recently.

VoxelNet has three components, in order:

  • Feature Learning Network
  • Convolutional Middle Layers
  • Region Proposal Network

The Feature Learning Network seems to be the main “contribution to knowledge”, as the scholars say.

It seems that what this network does is start with a semi-random sample of points from within “interesting” (my word, not theirs) voxels. This sample of points gets run through a fully-connected (not fully-convolutional) network. This network learns point-wise features which are relevant to the voxel from which the points came.

The network, in fact, uses these point-wise features to develop voxel-wise features that describe each of the “interesting” voxels. I’m oversimplifying wildly, but think of this as learning features that describe each voxel and are relevant to classifying the part of the vehicle that is in that voxel. So a voxel might have features like “black”, “rubber”, and “treads”, and so you could guess that the voxel captures part of a tire. Of course, the real features won’t necessarily be intelligible by humans, but that’s the idea.

These voxel-wise features can then get pumped through the Convolutional Middle Layers and finally through the Region Proposal Network and, voila, out come bounding boxes and classifications.
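
Here is my loose numpy sketch of the core idea in the Feature Learning Network: run every point in a voxel through the same small fully-connected layer, then aggregate with an order-independent max (the real network stacks several such layers and concatenates point-wise and voxel-wise features; the weights here are random placeholders):

```python
import numpy as np

rng = np.random.default_rng(2)

# A handful of points sampled from one "interesting" voxel (fake data).
voxel_points = rng.standard_normal((35, 3))   # 35 points, (x, y, z) each

# Point-wise features: the same fully-connected layer applied to each
# point. VoxelNet learns these weights; here they are random.
W = rng.standard_normal((3, 16)) * 0.1
pointwise = np.maximum(0, voxel_points @ W)   # (35, 16), with ReLU

# Voxel-wise feature: an element-wise max over the points, so the
# result does not depend on the order of the points in the voxel.
voxelwise = pointwise.max(axis=0)             # (16,)
```
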


One of the most impressive parts of this line of research is just how new it is. The two Baidu papers were both first published online a year ago, and only made it into conferences in the last six months. The Apple paper only just appeared online in the last couple of weeks.

It’s an exciting time to be building deep neural networks for autonomous vehicles.

Photorealism of Microsoft AirSim

Over the last year, a number of companies (including Udacity) have released self-driving car simulators powered by gaming engines.

The latest entrant is Microsoft, which has updated their open-source AirSim flight program to also support self-driving cars.

AirSim looks awesome. The big advantages of building on a gaming engine (AirSim uses Unreal Engine, whereas the Udacity simulator uses Unity) include fully baked APIs, powerful physics engines, and incredibly realistic design and graphics.

That last item is what will ultimately make or break AirSim, or any other simulation engine.

The holy grail of autonomous vehicle simulation is the ability to train machine learning models in the simulator, and then port them to the real world. Once a simulator breaks that barrier, we should see incredibly fast improvements in our ability to build autonomous driving systems, as it’s vastly faster to drive “simulated” miles than “real” miles.

As photorealistic as AirSim is, it doesn’t yet look to me like it’s realistic enough to reliably move models between AirSim’s photorealistic environment and the actual, real environment.

That said, I doubt it’s possible to determine model portability with much confidence simply by eyeballing YouTube videos of the simulator, which is all I’ve done so far.

I look forward to people trying out AirSim models in the real world and seeing how they do.

The “MiniFlow” Lesson

Exploring how to build a Self-Driving Car, step-by-step with Udacity!

Editor’s note: David Silver, Program Lead for Udacity’s Self-Driving Car Engineer Nanodegree program, continues his mission to write a new post for each of the 67 lessons currently in the program. We check in with him today as he introduces us to Lesson 5!

The fifth lesson of the Udacity Self-Driving Car Engineer Nanodegree Program is “MiniFlow.” Over the course of this lesson, students build their own neural network library, which we call MiniFlow.

The lesson starts with a fairly basic, feedforward neural network, with just a few layers. Students learn to build the connections between the artificial neurons and implement forward propagation to move calculations through the network.

A feedforward network.

The real mind-bend comes in the “Linear Transform” concept, where we go from working with individual neurons to working with layers of neurons. Working with layers allows us to dramatically accelerate the calculations of the networks, because we can use matrix operations and their associated optimizations to represent the layers. Sometimes this is called vectorization, and it’s a key to why deep learning has become so successful.
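
The payoff of vectorization is easy to see in a sketch (mine, not MiniFlow’s actual code): the per-neuron loop and the single matrix operation compute the same layer output, but the matrix version hands all the work to optimized linear algebra routines at once.

```python
import numpy as np

rng = np.random.default_rng(3)

x = rng.standard_normal(100)          # inputs to a layer
W = rng.standard_normal((100, 50))    # one weight column per neuron
b = rng.standard_normal(50)

# Neuron by neuron: one dot product per output neuron.
slow = np.array([x @ W[:, j] + b[j] for j in range(50)])

# Whole layer at once: a single matrix operation.
fast = x @ W + b

print(np.allclose(slow, fast))  # True
```
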

Once students implement layers in MiniFlow, they learn about a particular activation function: the sigmoid function. Activation functions define the extent to which each neuron is “on” or “off”. Sophisticated activation functions, like the sigmoid function, don’t have to be all the way “on” or “off”. They can hold a value somewhere along the activation function, between 0 and 1.

The sigmoid function.
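
The sigmoid itself is a one-liner; here is a quick sketch of how it squashes its input into (0, 1):

```python
import numpy as np

def sigmoid(z):
    """Squash any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

print(sigmoid(0.0))    # 0.5: exactly between "on" and "off"
print(sigmoid(10.0))   # close to 1: strongly "on"
print(sigmoid(-10.0))  # close to 0: strongly "off"
```
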

The next step is to train the network to better classify our data. For example, if we want the network to recognize handwriting, we need to adjust the weight associated with each neuron in order to achieve the correct classification. Students implement an optimization technique called gradient descent to determine how to adjust the weights of the network.

Gradient descent, or finding the lowest point on the curve.

Finally, students implement backpropagation to relay those weight adjustments backwards through the network, from finish to start. If we do this thousands of times, hopefully we’ll wind up with a trained, accurate network.
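
The whole training loop fits in a few lines for a model small enough. Here is a sketch of gradient descent with backpropagation for a single sigmoid neuron (my own numpy illustration on toy data, not MiniFlow’s API; the learning rate and iteration count are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(4)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy data: the label is 1 when the two inputs sum to a positive number.
X = rng.standard_normal((200, 2))
y = (X.sum(axis=1) > 0).astype(float)

w = np.zeros(2)  # weights to learn
b = 0.0          # bias to learn
lr = 0.5         # learning rate (an arbitrary choice)

for _ in range(500):
    # Forward propagation: predictions with the current weights.
    p = sigmoid(X @ w + b)
    # Backpropagation: gradient of cross-entropy loss w.r.t. w and b.
    grad = p - y
    # Gradient descent: nudge the weights downhill along the gradient.
    w -= lr * (X.T @ grad) / len(y)
    b -= lr * grad.mean()

accuracy = ((sigmoid(X @ w + b) > 0.5) == (y > 0.5)).mean()
```

After a few hundred passes, the neuron classifies the toy data almost perfectly, which is the same loop students scale up inside MiniFlow.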

And once students have finished this lesson, they have their own Python library they can use to build as many neural networks as they want!

If all of that sounds interesting to you, maybe you should apply to join the Udacity Self-Driving Car Engineer Nanodegree Program and learn to become a Self-Driving Car Engineer!