Probabilistic Risk Assessment

The hardest challenge in autonomous vehicle development isn’t technical – it’s raising enough money to keep going 💸

The hardest technical challenge, however, is determining when you are ready to remove the safety driver and let the system operate on its own. Once you’re able to make that assessment, the task list to get to “ready” becomes clear, and progress is steady.

Kodiak just published a blog post on two of the major tools we use to determine when we are ready to “go driverless.” We use these tools every day, and we’ve used them to go driverless in an increasing range of domains. We’re using them to prepare to go driverless on the highway.

The tools are heavily dependent on probability and statistics. The first tool, in fact, we call “probabilistic risk assessment.”

In simple terms, the Kodiak PRA decomposes scenarios into three primary factors:

Scenario Exposure: How often does our vehicle encounter this type of operating scenario?

Collision Likelihood: Given that our vehicle encounters this operating scenario, how likely is it that a collision occurs?

Severity of Collisions: How severe would the collision in this scenario be?
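
To make the decomposition concrete, here is a minimal sketch of how the three factors might combine into one number per scenario. The multiplicative form, the class and field names, and every value below are my own assumptions for illustration, not Kodiak's actual model or data.

```python
from dataclasses import dataclass


@dataclass
class Scenario:
    name: str
    exposure_per_mile: float     # how often the scenario is encountered (hypothetical)
    collision_likelihood: float  # P(collision | scenario encountered) (hypothetical)
    severity: float              # severity score on an arbitrary scale (hypothetical)

    def risk_per_mile(self) -> float:
        # Assumed combination: exposure x likelihood x severity.
        return self.exposure_per_mile * self.collision_likelihood * self.severity


def total_risk(scenarios):
    """Sum the per-scenario risk contributions."""
    return sum(s.risk_per_mile() for s in scenarios)


# Made-up scenarios, purely to exercise the arithmetic.
scenarios = [
    Scenario("cut-in on highway", 1e-2, 1e-4, 5.0),
    Scenario("debris in lane", 1e-3, 1e-3, 2.0),
]
# Rank scenarios so the largest contributors to risk come first.
ranked = sorted(scenarios, key=Scenario.risk_per_mile, reverse=True)
```

Ranking scenarios this way is one plausible route from a risk assessment to a prioritized task list.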

The second tool is called “BreakPoint.” This tool feeds values into the three factors of the PRA:

BreakPoint deliberately injects realistic, time-varying errors onto the signals that flow through the autonomy system in order to ensure that the autonomy system still drives safely in both normal and extreme conditions. Better yet, BreakPoint actively guides its search adversarially, actively trying to drive our system into a “collision” situation, thus helping us discover autonomy failure modes. Then, BreakPoint tooling helps us estimate the risk associated with this failure mode, and this information flows directly into our PRA.
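
The general idea of injecting errors and searching for the point where the system fails can be shown with a toy example. This is emphatically not BreakPoint: the braking scenario, the constant distance bias, the bisection search, and all the numbers are assumptions I made up to keep the sketch short.

```python
def collides(distance_bias, dt=0.01):
    """Simulate a toy braking scenario; return True if the vehicle hits the obstacle."""
    distance, speed = 100.0, 20.0   # meters to obstacle, meters per second
    decel, margin = 5.0, 10.0       # braking rate (m/s^2), safety margin (m)
    brake_distance = speed ** 2 / (2 * decel) + margin
    braking = False
    while True:
        # Injected error: the system perceives the obstacle as farther than it is.
        perceived = distance + distance_bias
        if perceived <= brake_distance:
            braking = True
        if braking:
            speed = max(0.0, speed - decel * dt)
        distance -= speed * dt
        if distance <= 0.0:
            return True
        if speed == 0.0:
            return False


def smallest_failing_bias(lo=0.0, hi=50.0, steps=30):
    """Bisect for the smallest bias that causes a collision.

    Assumes that a larger bias never makes the outcome safer.
    """
    assert not collides(lo) and collides(hi)
    for _ in range(steps):
        mid = (lo + hi) / 2
        if collides(mid):
            hi = mid
        else:
            lo = mid
    return hi
```

In this toy setup the search converges on a bias roughly equal to the safety margin, which is the kind of failure boundary a risk estimate could then be built around.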

If you’re interested in solving the hardest technical challenge in autonomous vehicle development, you should read the blog post.

Kodiak’s Driverless Truck Debuts At CES

Kodiak’s driverless-ready sixth-generation truck debuted last week at the Consumer Electronics Show in Las Vegas. There was lots of great coverage!

Kirsten Korosec from TechCrunch wrote:

This isn’t just any big rig. Packed inside this sixth-generation semi truck are two — and sometimes three — of every mechanical component that is critical for safe operations, including braking, steering, sensors and computers. Those redundant systems are there as a backup in case anything were to fail while its self-driving truck barrels down a highway without a driver behind the wheel.

And Ed Garsten from Forbes summarized:

Specifically, the sixth-generation Kodiak self-driving technology includes:

Redundancy across all safety-critical functions, including redundant braking, steering and power systems.

Kodiak’s custom-designed high-integrity Actuation Control Engine system. The ACE is responsible for ensuring that the Kodiak Driver can guide the truck to a safe “fallback” out of the flow of traffic in the unlikely event of a critical system failure.

The Kodiak Driver, the vehicle-agnostic self-driving system which includes Kodiak’s redundant, driverless-ready hardware platform, is “designed to be safer than a human driver,” the company says.

Twice the GPU processor cores, 1.6 times greater processing speed, 3 times more memory, and 2.75 times greater bandwidth to run software processes compared to Kodiak’s first-generation truck.

Kodiak’s proprietary SensorPods, which replace a truck’s side-view mirrors and house two upgraded higher-resolution, automotive-grade LiDAR sensors and two additional side radar sensors to improve long-range object detection.

12 cameras, four LiDAR sensors and six radar sensors.

One thing I would add is the quality of the process Kodiak uses to determine that our systems are ready for different domains (test tracks, highways, surface streets, driverless operations). Kodiak’s Systems Engineering team is incredibly thorough and thoughtful in terms of preparing test data and analysis for each new domain. And our internal Safety Review Board is rigorous, thorough, and demanding.

Putting a driverless vehicle on the roads in 2024 will demand world-class hardware, software, and data, but most of all a top-notch safety culture and safety process.

South by Southwest

In March I had the pleasure of speaking at South by Southwest. I was on a panel with colleagues from Udacity, HubSpot, and Knowable.

The discussion centered around how to generate training data for machine learning, and I spoke specifically about simulated data for self-driving cars.

I also threw KitKats at the audience.

You can watch here.

Literature Review: Capsule Networks

My Udacity colleague, Cezanne Camacho, is preparing a presentation on capsule networks and gave a draft version in the office today. Cezanne is a terrific engineer and teacher. She has already written a great blog post on capsule networks, and she graciously allowed me to share some of it here.

Capsule networks come from a 2017 paper by Sara Sabour, Nicholas Frosst, and Geoffrey Hinton at Google: “Dynamic Routing Between Capsules”. Hinton, in particular, is one of the world’s foremost authorities on neural networks.

As my colleague, Cezanne, writes on her blog:

Capsule Networks provide a way to detect parts of objects in an image and represent spatial relationships between those parts. This means that capsule networks are able to recognize the same object in a variety of different poses even if they have not seen that pose in training data.

Love the Pacman GIF. Did I mention Cezanne is also an artist?

Cezanne explains that a “capsule” encompasses features that make up a piece of an image. Think of an image of a face, for example, and imagine capsules that capture each eye, and the nose, and the mouth.

These capsules organize into a tree structure. Larger structures, like a face, would be parent nodes in the tree, and smaller structures would be child nodes.

“In the example below, you can see how the parts of a face (eyes, nose, mouth, etc.) might be recognized in leaf nodes and then combined to form a more complete face part in parent nodes.”

“Dynamic routing” plays a role in capsule networks:

“Dynamic routing is a process for finding the best connections between the output of one capsule and the inputs of the next layer of capsules. It allows capsules to communicate with each other and determine how data moves through them, according to real-time changes in the network inputs and outputs!”

Dynamic routing is ultimately implemented via an iterative routing process that Cezanne does a really nice job describing, along with the accompanying math, in her blog post.
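
For a feel of what the iterative routing looks like, here is a small pure-Python sketch of routing-by-agreement as I understand it from the Sabour, Frosst, and Hinton paper. It is not Cezanne's PyTorch implementation; the function names and the list-of-lists layout are my own choices for illustration.

```python
import math


def squash(s):
    """Shrink vector s to a length in [0, 1) while keeping its direction."""
    norm_sq = sum(x * x for x in s)
    if norm_sq == 0.0:
        return [0.0 for _ in s]
    scale = norm_sq / (1.0 + norm_sq) / math.sqrt(norm_sq)
    return [scale * x for x in s]


def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dynamic_routing(u_hat, iterations=3):
    """u_hat[i][j] is child capsule i's predicted output vector for parent j.

    Returns the parent outputs and the coupling coefficients used in the
    final iteration.
    """
    n_child, n_parent = len(u_hat), len(u_hat[0])
    dim = len(u_hat[0][0])
    b = [[0.0] * n_parent for _ in range(n_child)]  # routing logits
    for _ in range(iterations):
        # Each child splits its output across the parents.
        c = [softmax(row) for row in b]
        v = []
        for j in range(n_parent):
            s_j = [sum(c[i][j] * u_hat[i][j][d] for i in range(n_child))
                   for d in range(dim)]
            v.append(squash(s_j))
        # Strengthen routes whose prediction agrees with the parent's output.
        for i in range(n_child):
            for j in range(n_parent):
                b[i][j] += sum(u_hat[i][j][d] * v[j][d] for d in range(dim))
    return v, c


# Two children agree about parent 0 and disagree about parent 1.
u_hat = [
    [[1.0, 0.0], [1.0, 0.0]],
    [[1.0, 0.0], [-1.0, 0.0]],
]
v, c = dynamic_routing(u_hat)
```

In this toy input, the children route more of their output to the parent they agree on, which is the "agreement" the routing procedure is meant to reward.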

Capsule networks seem to do well with image classification on a few datasets, but they haven’t been widely deployed yet because they are slow to train.

In case you’d like to play with capsule networks yourself, Cezanne also published a Jupyter notebook with her PyTorch implementation of the Sabour, Frosst, and Hinton paper!

Join Me in Philadelphia!

The Saturday before Thanksgiving I will be speaking at an event in Philadelphia, discussing artificial intelligence and the future. The event is hosted by the Free Library of Philadelphia and sponsored by the Italian Consulate.

RSVP here and come say hello!

6 Awesome Projects from Udacity Students (and 1 Awesome Thinkpiece)

Udacity students are constantly impressing us with their skill, ingenuity, and their knowledge of the most obscure features in Slack.

Here are 6 blog posts that will astound you, and 1 think-piece that will blow your mind.

How to identify a Traffic Sign using Machine Learning !!

Sujay Babruwad

Sujay managed his data in a few clever ways for the traffic sign classifier project. First, he converted all of his images to grayscale. Then he skewed and augmented them. Finally, he balanced the data set. The result:

“The validation accuracy attained 98.2% on the validation set and the test accuracy was about 94.7%”
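
Two of those data steps, grayscale conversion and class balancing, are easy to sketch in plain Python. This is not Sujay's code: the BT.601 luminance weights and the oversampling-by-repetition strategy are assumptions on my part, and in practice the extra copies would be augmented rather than exact duplicates.

```python
import random
from collections import defaultdict


def to_grayscale(pixel):
    """Convert an (R, G, B) pixel to one luminance value (BT.601 weights)."""
    r, g, b = pixel
    return 0.299 * r + 0.587 * g + 0.114 * b


def balance_by_oversampling(samples, labels, seed=0):
    """Repeat randomly chosen samples until every class matches the largest one."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)
    target = max(len(items) for items in by_class.values())
    out_samples, out_labels = [], []
    for label, items in by_class.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for sample in items + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels
```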

Udacity Advance Lane Finding Notes

A Nguyen

An’s post is a great step-through of how to use OpenCV to find lane lines on the road. It includes lots of code samples!

“Project summary:
– Applying calibration on all chessboard images that are taken from the same camera recording the driving to obtain distort coefficients and matrix.
– Applying perspective transform and warp image to obtain bird-eyes view on road.
– Applying binary threshold by combining derivative x & y, magnitude, direction and S channel.
– Reduce noise and locate left & right lanes by histogram data.
– Draw line lanes over the image”
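
The histogram step is the one I find most satisfying, so here is a minimal sketch of it in plain Python. It assumes the thresholded, bird's-eye image is a list of rows of 0s and 1s; the function name is hypothetical and this is not An's code.

```python
def find_lane_bases(binary_image):
    """Return (left_x, right_x): the busiest column in each half of the image."""
    height = len(binary_image)
    width = len(binary_image[0])
    # Count lit pixels per column, using only the lower half of the image,
    # where the lane lines are closest to the car.
    histogram = [0] * width
    for row in binary_image[height // 2:]:
        for x, value in enumerate(row):
            histogram[x] += value
    midpoint = width // 2
    left_x = max(range(midpoint), key=lambda x: histogram[x])
    right_x = max(range(midpoint, width), key=lambda x: histogram[x])
    return left_x, right_x


# Tiny 4x8 example with lane pixels in columns 1 and 6.
image = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 1, 1, 0],
]
bases = find_lane_bases(image)
```

The two peaks give starting points for following each lane line up the image.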

P5: Vehicle Detection with Linear SVC classification

Rana Khalil

Rana’s video shows the amazing results that are achievable with Support Vector Classifiers. Look at how well the bounding boxes track the other vehicles on the highway!

Updated! My 99.40% solution to Udacity Nanodegree project P2 (Traffic Sign Classification)

Cherkeng Heng

Cherkeng’s approach to the Traffic Sign Classification Project was based on an academic paper that uses “dense blocks” of convolutional layers to fit the training data tightly. He also uses several clever data augmentation techniques to prevent overfitting. Here’s how that works out:

“The new network is smaller with test accuracy of 99.40% and MAC (multiply–accumulate operation counts) of 27.0 million.”

Advanced Lane Line Project

Arnaldo Gunzi

Arnaldo has a thorough walk-through of the Udacity Advanced Lane Finding Project. If you want to know how to use computer vision to find lane lines on the road, this is a perfect guide!

“1 Camera calibration
2 Color and gradient threshold
3 Birds eye view
4 Lane detection and fit
5 Curvature of lanes and vehicle position with respect to center
6 Warp back and display information
7 Sanity check
8 Video”
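
Step 5 is worth a quick sketch. Assuming each lane line has been fit as a second-order polynomial x = a*y^2 + b*y + c (my assumption about the fit, not a quote from Arnaldo's post), the radius of curvature follows from the standard calculus formula. The offset helper and its sign convention are also hypothetical.

```python
def radius_of_curvature(a, b, y):
    """Radius of curvature of x = a*y**2 + b*y + c, evaluated at y."""
    return (1 + (2 * a * y + b) ** 2) ** 1.5 / abs(2 * a)


def lane_center_offset(left_x, right_x, image_width, meters_per_pixel):
    """Offset of the lane center from the image center, in meters.

    Assumed convention: positive means the lane center is to the right
    of the camera, so the vehicle sits left of center.
    """
    lane_center = (left_x + right_x) / 2
    return (lane_center - image_width / 2) * meters_per_pixel
```

The units of the radius match the units of the fit, so the polynomial should be fit in meters rather than pixels if a real-world radius is wanted.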

Build a Deep Learning Rig for $800

Nick Condo

I love this how-to post that lists all the components for a mid-line deep learning rig. Not too cheap, not too expensive. Just right.

Here’s how it does:

“As you can see above, my new machine (labeled “DL Rig”) is the clear winner. It performed this task more than 24 times faster than my MacBook Pro, and almost twice as fast as the AWS p2.large instance. Needless to say, I’m very happy with what I was able to get for the price.”

How Gig Economy Startups Will Replace Jobs with Robots

Caleb Kirksey

Companies like Uber and Lyft and Seamless and Fiverr and Upwork facilitate armies of independent contractors who work “gigs” on their own time, for as much money as they want, but without the structure of traditional employment.

Caleb makes the point that, for all the press the gig economy gets, the end might be in sight. Many of these gigs might soon be replaced by computers and robots. He illustrates this point with his colleague, Eric, who works as a safety driver for the autonomous vehicle startup Auro Robotics. Auro’s whole mission is to eliminate Eric’s job!

“Don’t feel too bad for Eric though. He’s become skilled with hardware and robotics. His experience working in cooperation with a robot can enable him to build better systems that don’t need explicit instructions.”

Udacity Students Experiment with Neural Networks and Computer Vision

The Udacity Self-Driving Car Engineer Nanodegree Program requires students to complete a number of projects, and each project requires some experimentation from students to figure out a solution that works.

Here are five posts by Udacity students, outlining how they used experimentation to complete their projects.

Self-Driving Car Engineer Diary — 4

Andrew Wilkie

Andrew has lots of images in this blog post, including a spreadsheet of all the different functions he used in building his Traffic Sign Classifier with TensorFlow!

I got to explore TensorFlow and various libraries (see table below), different convolutional neural network models, pre-processing images, manipulating n-dimensional arrays and learning how to display results.

Intricacies of Traffic Sign Classification with TensorFlow

Param Aggarwal

In this post, Param goes step-by-step through his iterative process of finding the right combination of pre-processing, augmentation, and network architecture for classifying traffic signs. 54 neural network architectures in all!

I went crazy by this point, nothing I would do would push me into the 90% range. I wanted to cry. A basic linearly connected model was giving me 85% and here I am using the latest hotness of convolution layers and not able to match.

I took a nap.

Backpropagation Explained

Jonathan Mitchell

Backpropagation is the most difficult and mind-bending concept to understand about deep neural networks. After backpropagation, everything else is a piece of cake. In this concise post, Jonathan takes a crack at summarizing backpropagation in a few paragraphs.

When we are training a neural network we need to figure out how to alter a parameter to minimize the cost/loss. The first step is to find out what effect that parameter has on the loss. Then find the total loss up to that parameters point and perform the gradient descent update equation to that parameter.
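
Here is the smallest example I can think of that shows those steps: one neuron, one training example, and a squared-error loss. The model, the learning rate, and the names are my own choices for illustration, not Jonathan's.

```python
def train_step(w, b, x, target, learning_rate=0.1):
    """One forward pass, backward pass, and gradient descent update."""
    # Forward pass: compute the prediction and the loss.
    prediction = w * x + b
    loss = (prediction - target) ** 2
    # Backward pass: the chain rule gives each parameter's effect on the loss.
    dloss_dpred = 2 * (prediction - target)
    dloss_dw = dloss_dpred * x    # d(prediction)/dw = x
    dloss_db = dloss_dpred * 1.0  # d(prediction)/db = 1
    # Gradient descent update: step against the gradient.
    w -= learning_rate * dloss_dw
    b -= learning_rate * dloss_db
    return w, b, loss


w, b = 0.0, 0.0
losses = []
for _ in range(50):
    w, b, loss = train_step(w, b, x=1.0, target=2.0)
    losses.append(loss)
```

A deep network repeats the same chain-rule step layer by layer, passing the gradient backward from the loss to every parameter.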

Teaching a car to drive itself

Arnaldo Gunzi

Arnaldo presents a number of lessons he learned while designing an end-to-end network for driving in the Behavioral Cloning Project. In particular, he came to appreciate the power of GPUs.

Using GPU is magic. Is like to give a Coke to someone in the desert. Or to buy a new car — the feeling of ‘how I was using that crap old one’. Or to find a shortcut in the route to the office: you’ll never use the long route again. Or to find a secret code in a game that give superpowers…

Robust Extrapolation of Lines in Video Using Probabilistic Hough Transform

Esmat Nabil

Esmat presents a well-organized outline of his Finding Lane Lines Project and the computer vision pipeline that he used. In particular, he has a nice explanation of the Hough transform, which is a tricky concept!

The probabilistic Hough line transform more efficient implementation of Hough transform. It gives as output the extremes of the detected lines (x0, y0, x1, y1). It is difficult to detect straight lines which are part of a curve because they are very very small. For detecting such lines it is important to properly set all the parameters of Hough transform. Two of most important parameters are: Hough votes and maximum distance between points which are to be joined to make a line. Both parameters are set at their minimum value.
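
The voting idea behind the Hough transform fits in a few lines. The sketch below is the plain (non-probabilistic) version in pure Python, written for illustration only. As I understand it, the probabilistic variant votes with a random subset of points and returns segment endpoints, and in OpenCV the two parameters Esmat mentions correspond roughly to the threshold and maxLineGap arguments of cv2.HoughLinesP.

```python
import math


def hough_peak(points, theta_steps=180, rho_step=1.0):
    """Vote in (rho, theta) space; return (rho, theta, votes) for the strongest line."""
    votes = {}
    for x, y in points:
        # Every point votes for each line that could pass through it.
        for t in range(theta_steps):
            theta = math.pi * t / theta_steps
            rho = x * math.cos(theta) + y * math.sin(theta)
            key = (round(rho / rho_step), t)
            votes[key] = votes.get(key, 0) + 1
    (rho_bin, t), count = max(votes.items(), key=lambda kv: kv[1])
    return rho_bin * rho_step, math.pi * t / theta_steps, count


# Ten points on the vertical line x = 5, plus one stray point.
points = [(5, y) for y in range(10)] + [(20, 3)]
rho, theta, count = hough_peak(points)
```

Raising the vote threshold rejects short or noisy segments, which is why tuning it matters so much for faint lane markings.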

CarND Students on Preparation, Generalization, and Hacking Cars

Here are five great posts from students in Udacity’s Self-Driving Car Engineer Nanodegree Program, dealing with generalizing machine learning models and hacking cars!

SDC

Daniel Stang

Daniel has devoted a section of his blog to the Self-Driving Car projects, including applying his lane-line finder to video he took himself!

The first project for the Udacity Self-Driving Car Nanodegree was to create a software pipeline capable of detecting the lane lines in video feed. The project was done using python with the bulk of work being performed using the OpenCV library. The video to the side shows the software pipeline I developed in action using video footage I took myself.

Traffic Sign Classifier: Normalising Data

Jessica Yung

Jessica’s post discusses the need to normalize image data before feeding it into a neural network, including a bonus explainer on the differences between normalization and standardization.

The same range of values for each of the inputs to the neural network can guarantee stable convergence of weights and biases. (Source: Mahmoud Omid on ResearchGate)

Suppose we have one image that’s really dark (almost all black) and one that’s really bright (almost all white). Our model has to address both cases using the same parameters (weights and biases). It’s hard for our model to be accurate and generalise well if it has to tackle both extreme cases.
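
A quick sketch of the two operations, using the common definitions (min-max rescaling for normalization, zero mean and unit variance for standardization). The function names are mine, and Jessica's post may define the terms slightly differently.

```python
import statistics


def normalize(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant image: nothing to rescale
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Shift and scale values to zero mean and unit standard deviation."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


dark_image = [0, 5, 10, 20]          # hypothetical pixel values
bright_image = [235, 240, 245, 255]  # hypothetical pixel values
```

After either operation, the dark and bright images land in comparable ranges, so the same weights and biases have a fair shot at both.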

Hardware, tools, and cardboard mockups.

Dylan Brown

Dylan is a student in both the Georgia Tech Online Master’s in Computer Science Program (run by Udacity) and CarND. He’s also turning his own Subaru into a self-driving car! (Note: We do not recommend this.)

Below I’ve put together a list of purchases needed for this project. There will definitely be more items coming soon, at least a decent power supply or UPS. Thankfully, this list covers all the big-ticket items.

Jetson TX1 Developement Kit (with .edu discount) NVIDIA $299
ZED Stereo Camera with 6-axis pose Stereolabs $449
CAN(-FD) to USB interface PEAK-System $299
Touch display, 10.1” Toguard $139
Wireless keyboard K400 Logitech $30
Total $1216

Self-Driving Car Engineer Diary — 1

Andrew Wilkie

Andrew has a running blog of his experiences in CarND, including his preparation.

I REALLY want a deep understanding of the material so followed Gilad Gressel’s recommendation (course mentor): Essence Of Linear Algebra (for linear classifiers which is step 1 towards CNNs), Gradients & Derivatives (for back propagation understanding) and CS231n: Convolutional Neural Networks for Visual Recognition lectures (for full Neural Networks and Convolutional Deep Neural Networks understanding).

Comparing model performance: Including Max Pooling and Dropout Layers

Jessica Yung

Another post by Jessica Yung! This time, she runs experiments on her model by training with and without different layers, to see which version of the model generalizes best.

Means of training accuracy - validation accuracy in epochs 80-100 (lower gap first):

Pooling and dropout (0.0009)
Dropout but no pooling (0.0061)
Pooling but no dropout (0.0069)
No pooling or dropout (0.0094)
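
The metric itself is simple to compute. A minimal sketch, assuming per-epoch accuracy lists and a half-open epoch window (whether Jessica's 80-100 range is inclusive is not something I know):

```python
def mean_gap(train_acc, val_acc, start, end):
    """Mean of (training accuracy - validation accuracy) over epochs [start, end)."""
    gaps = [t - v for t, v in zip(train_acc[start:end], val_acc[start:end])]
    return sum(gaps) / len(gaps)
```

A smaller gap suggests the model is memorizing less of the training set, which is the sense in which pooling plus dropout generalizes best here.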

Should You Understand Backpropagation?

Backpropagation is a leaky abstraction; it is a credit assignment scheme with non-trivial consequences. If you try to ignore how it works under the hood because “TensorFlow automagically makes my networks learn”, you will not be ready to wrestle with the dangers it presents, and you will be much less effective at building and debugging neural networks.

That is from the excellent Andrej Karpathy, “Yes you should understand backprop”.

I say it’s possible to use deep neural networks quite effectively without truly understanding backprop. But if your goal is to specialize in the field and apply this tool to a range of problems, then “yes you should understand backprop”.

By the way, @karpathy is a prolific Twitter feed with 37,100 followers.

Fun fact 27/120: There are nuclear submarines out there carrying 40 nuclear warheads controlled by a computer running Windows XP.

— Andrej Karpathy (@karpathy) December 25, 2016

Artificial Intelligence, Machine Learning, Deep Learning

Udacity has separate courses on Artificial Intelligence, Machine Learning (actually we have two), and Deep Learning.

What is the difference between all of these? It can be a little hard to explain.

Fortunately, NVIDIA has a nice blog post up explaining these concepts as concentric circles:

The easiest way to think of their relationship is to visualize them as concentric circles with AI — the idea that came first — the largest, then machine learning — which blossomed later, and finally deep learning — which is driving today’s AI explosion — fitting inside both.

I guess if I had to explain, I would say that:

“artificial intelligence” refers to techniques that help computers accomplish goals
“machine learning” refers to techniques that help computers accomplish goals by learning from data
“deep learning” refers to techniques that help computers accomplish goals by using deep neural networks to learn from data

But if you’re interested in these topics, then read the NVIDIA post. It’s good.