Udacity Students on Computer Vision, Sensor Fusion, Deep Learning, and More

All sorts of interesting topics in this set of student posts, including some inside stories from the creator of ALVINN!

Emphatic Camera Calibration With OpenCV

Chris X Edwards

While trying to undistort his camera images, Chris walked into a store and asked to take a photo of their floor. Then things got really weird.

“I wrote a program that iterated through all possible grid sizes and looked at all images. Now I was finding grids. Ah ha! Turning to the documentation to figure out what exactly was going on, I noticed the function had a parameter, flags, which could be set to enable certain grid finding techniques. I set one of the flags and the grids I could detect changed quite a bit. Now I added to my program another inner loop to iterate through all the detection modes.”

Output Appearance Reliability Estimation

Dean Pomerleau

Dean Pomerleau, the creator of ALVINN, responded to Param Aggarwal with some cool stories about how ALVINN took advantage of confusion in the network to estimate how confident it was about its own steering ability:

“Using the OARE technique and a related one called Input Reconstruction Reliability Estimation (IRRE), ALVINN was able to localize itself (e.g. ‘I’ve reached the fork in the road!’), tell the human safety driver (me) when it needed help, arbitrate between networks trained on different road types, and even tell when there was crap on the windshield in front of the camera obstructing its view of the road.”
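The IRRE idea — reconstruct the input and treat reconstruction error as an inverse confidence signal — can be sketched in a few lines. This is a toy linear (PCA) "autoencoder" on synthetic data, not ALVINN's actual network:

```python
import numpy as np

rng = np.random.default_rng(0)

# "Training" inputs drawn from a low-dimensional, road-like distribution.
basis = rng.normal(size=(5, 100))            # 5 latent factors, 100-pixel images
train = rng.normal(size=(500, 5)) @ basis

# Fit a linear autoencoder (PCA): encode into k components, decode back.
mean = train.mean(axis=0)
_, _, vt = np.linalg.svd(train - mean, full_matrices=False)
components = vt[:5]                          # keep the top principal components

def reconstruction_error(x):
    """High error => the input looks unlike anything seen in training."""
    code = (x - mean) @ components.T
    recon = mean + code @ components
    return np.linalg.norm(x - recon)

familiar = rng.normal(size=5) @ basis        # in-distribution frame
novel = rng.normal(size=100)                 # e.g. crap on the windshield
```

A familiar frame reconstructs almost perfectly; the novel one does not — which is exactly the signal ALVINN used to ask its safety driver for help.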

Cutting-edge (high-tech) career path.

Uki Dominique Lucas

Uki riffs here on all of the various projects he could be working on, how he chooses to spend his limited time, and where that intersects with career development.

“The next part of the career development is keeping up with the computer science basics. Honestly, it does not matter how much programming you do on a daily basis, you will not pass the “whiteboard hazing” without any preparation. I lost countless interviews with fine companies like Amazon to what I thought was a “power trip” of some engineer without any social skills in a cookie factory — for years I was saying, “Why do I need that? I can make good money on my own”. Only later, having read books and articles on interviewing, did I realize that the “whiteboard” is simply a thing they do and that people prepare for it for months.”

Vehicle detection using LIDAR: EDA, augmentation and feature extraction (Udacity/Didi challenge)

Vivek Yadav

Vivek goes into detail on his voxel-based approach for identifying cars based on the KITTI dataset for the Udacity-Didi Challenge. If you don’t know what a voxel is, read on:

“A voxel is a volume unit in space, similar to a pixel in 2D images. I first constrained our space so the x-dimension (front) and y-dimension (L-R) varied between -30 and 30 m, and the vertical dimension varied between -1.5 and 1 m. I next constructed voxels of width and length .1 m and height 0.3125 m. I then computed the maximum height in each voxel and used this value as the height of the point cloud in that voxel. This gave us a height map of 600X600X5 features. We specifically chose 5 height maps because Udacity’s data uses vlp-16 lidar and having more fine discretization can result in height slices without any points.”
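Vivek's voxelization step is easy to sketch in NumPy (my code, not his). The quoted ranges give a 600×600 top-view grid; the number of height slices depends on the exact z range and resolution, and taking his numbers literally yields 8 slices rather than the 5 he ultimately used:

```python
import numpy as np

def height_map(points, xy_range=(-30.0, 30.0), z_range=(-1.5, 1.0),
               xy_res=0.1, z_res=0.3125):
    """Bin lidar points (N, 3) into voxels and keep the max height per voxel,
    producing a multi-channel top-view "image" of the point cloud."""
    x, y, z = points.T
    keep = ((x >= xy_range[0]) & (x < xy_range[1]) &
            (y >= xy_range[0]) & (y < xy_range[1]) &
            (z >= z_range[0]) & (z < z_range[1]))
    x, y, z = x[keep], y[keep], z[keep]

    n_xy = int(round((xy_range[1] - xy_range[0]) / xy_res))  # 600
    n_z = int(round((z_range[1] - z_range[0]) / z_res))      # 8 slices here
    xi = ((x - xy_range[0]) / xy_res).astype(int)
    yi = ((y - xy_range[0]) / xy_res).astype(int)
    zi = ((z - z_range[0]) / z_res).astype(int)

    grid = np.full((n_xy, n_xy, n_z), z_range[0], dtype=np.float32)
    # Maximum height of any point that falls into each voxel.
    np.maximum.at(grid, (xi, yi, zi), z)
    return grid
```

Points outside the constrained box are simply dropped, which is the "constrained our space" step in the quote.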

Make sense of Kalman Filter

An Nguyen

What is a Kalman filter? Why do we use it? An gives a more intuitive explanation here than you will find on Wikipedia:

“Assume the car makes the lane change successfully to get in front of me, I still continuously observe the car and adjust my speed so my car can always stay in the safe zone. If the car goes slow, I predict the car will still be slow in the next seconds and I’ll stay at a slow speed behind it. However, if it suddenly goes fast, I can speed up a little bit (as long as under speed limit) and update my belief. What I did there is a continuous process of prediction and update.”
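An's intuition maps directly onto the two-step loop of a Kalman filter: predict, then update. A minimal 1D sketch (the speeds and noise values are illustrative, not from his post):

```python
def predict(x, p, u=0.0, q=0.1):
    """Predict: project the state (speed) forward; uncertainty grows."""
    return x + u, p + q

def update(x, p, z, r=1.0):
    """Update: blend the prediction with a new measurement; the Kalman
    gain k weights whichever of the two is more certain."""
    k = p / (p + r)
    return x + k * (z - x), (1 - k) * p

# Following a car whose speed we can only measure noisily:
x, p = 0.0, 1000.0                         # initial belief, huge uncertainty
for z in [25.1, 24.8, 25.3, 29.9, 30.2]:   # the car suddenly speeds up
    x, p = predict(x, p)
    x, p = update(x, p, z)
# x has shifted toward the new, faster speed; p has shrunk as evidence accumulated
```

The loop is exactly the "continuous process of prediction and update" An describes: a slow car keeps the belief slow, and a sudden speed-up pulls the belief upward through the gain.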

Udacity Students at the Track, in the Didi Challenge, and Building Deep Learning Servers

Udacity Self-Driving Car students have been writing about the Self Racing Cars track day, the Didi Challenge, and building their own deep learning machines!

Self Racing Cars 2017 Photo Gallery — The Day Before

Kunfeng Chen

Udacity students were sponsored by PolySync to compete in the Self-Racing Cars track day at Thunderhill last weekend, and these photos show what it was like!

Self Racing Cars 2017 Photo Gallery — Day 1

Kunfeng Chen

Self Racing Cars 2017 Video Gallery — Shot on iPhone 6

Kunfeng Chen

Deep Learning PC Build

Tim Camber

Here’s how Tim built his own GPU-enabled deep learning machine. He provides helpful instructions, a bill of materials, and links to graphs comparing the value of different NVIDIA GPUs.

“The GPU is the main component of our system, and hopefully comprises a significant fraction of the cost of the system. ServeTheHome has a nice article in which they show the following graph of GPU compute per unit price.”

part.1: Didi Udacity Challenge 2017 — Car and pedestrian Detection using Lidar and RGB

This is one student’s journal of tackling the Udacity-Didi Challenge. Pay attention to the different neural network architectures he uses!

“Just from these 2 simple steps, I observed the following possible issues:

Small object detection. This is a well-known weakness in the original plain faster rcnn net.

Creation of the 2D top view image could be slow. There are quite a number of 3D points that need to be processed.

Now that I am sure that the implementation is correct, the next step will be to start training with the actual dataset, which contains many images.”

Voyage

Yesterday Udacity announced that my colleague, Oliver Cameron, is spinning out his own autonomous vehicle company, Voyage.

Friends have texted to ask if that means I’m now part of Voyage, and the answer is no.

I’m staying at Udacity to build the Self-Driving Car Engineer Nanodegree Program, which has thousands of students and is a lot of fun. We’ve launched modules on Deep Learning, Computer Vision, Sensor Fusion, and Localization, with development underway on Control, Path Planning, System Integration, plus several elective modules.

If you’re reading this, you really should sign up for the program 😉

Oliver recruited me to Udacity, gave me lots of room to run, and has been a driving force in building the company for the last three years. While I wish him the best, it’s sad to see him go.

But Voyage is its own independent company, so this won’t affect Udacity’s mission to place our students in jobs with our many amazing hiring partners, like Didi, Mercedes-Benz, NVIDIA, Uber ATG, and many more.

The Udacity Open-Source Self-Driving Car

Last week my colleague Yousuf and I spoke at the Open Source Software for Decision Making Conference at Stanford.

It was a lot of fun! Thanks to Mykel Kochenderfer and Tim Wheeler for inviting us.

Yousuf and I spoke about building the Udacity open source self-driving car. If you’re interested in what Udacity and our students have done, check it out:

You can find all the presentations, including some pretty impressive academic work, at the conference website.

Human and Autonomous Machine Interaction

In a few weeks, I’ll be speaking at Car HMI USA, so please say hi if you’re there.

HMI stands for Human-Machine Interaction, and while I’m at the conference, I’m really excited to hear from UX and HMI engineers about what the future holds for riders of autonomous vehicles.

The Motley Fool predicts that self-driving cars will be great for Netflix and terrible for radio companies, which seems likely, but not particularly creative.

If we spend close to an hour per day in a self-driving car, how will we use that?

Maybe we’ll use it like we use our leisure time: 55% watching TV, 14% socializing, and 8% gaming.

I like to think we can do better. We could use self-driving cars to spend more time with our families — maybe we’ll drag our kids to work with us and have the self-driving car take them home. Maybe we’ll use that time to do housework like paying the bills or online grocery shopping.

Anything but more TV.

The Carnage of Self-Driving Snowplows

From OnMilwaukee.com comes this important April 1st story about the dangers of autonomous snowplowing:

With the 2016–17 winter nearly behind us, the tally is in: The DPW’s fleet of 200 self-driving snowplows destroyed 3,019 parked cars, killed or injured 29 stray cats, created 17,898 potholes/sinkholes and sent one elderly South Side man to the hospital after burying him in a snow drift.

“Yeah, I guess Milwaukee isn’t quite ready for this technology,” admits Kowolski. “We are considering ‘hiring’ monkeys to drive the plows next season.”

But consider the vendor:

In its pilot program, the City had considered using well-tested self-driving plows built by Google and Tesla, but instead opted to install hardware from RadioShack in its existing trucks.

They tested the equipment in Mountain View, of course.

Jobs with Udacity Hiring Partners

Our guiding star in developing the Udacity Self-Driving Car Engineer Nanodegree Program is to help students get jobs working on autonomous vehicles.

To that end, we’ve built hiring partnerships with some of the most exciting employers in the world of autonomous vehicles.

Our newest hiring partners include giants like Fiat-Chrysler and Lockheed Martin, critical suppliers like Delphi and Velodyne and Dataspeed, as well as exciting startups like Renovo.

We work directly with recruiters at these companies to identify open positions that Udacity students might be interested in. Then we announce those positions in the Udacity Career Resource Center, and encourage students to apply.

Once students apply, we connect them with employers and guide them through the interview process.

Udacity has been focused on careers for several years, but this level of support is new to the Self-Driving Car Program, and we’re really excited about it. As Sebastian said recently, you can’t talk about education today without talking about jobs.

Autonomous Vehicles and Big Data

Ford just announced a $200 million investment to transform its Flat Rock, Michigan, assembly plant into a data center.

There is an angle here that ties into the Michigan vs. Silicon Valley competition for autonomous vehicle development, but that’s not what interests me.

What interests me is what this move says about Big Data in automotive applications. Thus far, most autonomous vehicle development work has proceeded with relatively small amounts of data, certainly compared to the amount of data that companies like Google deal with.

Ford’s investment in this new Flat Rock data center portends a future in which autonomous vehicle teams need to know about Hadoop and Spark, in addition to deep learning and robotics.

Conference Talks This Week

I’ll be speaking at two conferences this week! So much talking.

Today I’ll be talking at 4pm at the Global Data Science Conference about “How to Become a Self-Driving Car Engineer”.

This is kind of last-minute (sorry!), but I have some free passes to give away for that conference, so send me an email (david.silver@udacity.com) if you want one.

On Thursday, at 11:30am, I’ll be speaking at the Open-Source Software for Decision-Making Conference at Stanford University, on the Udacity Open-Source Self-Driving Car.

That conference is free to attend.

If you see me, say hello!

Udacity Students Build Tools for Computer Vision, Deep Learning, and the Didi Challenge

One of the big challenges with working on cutting-edge technology is the lack of established tools to rely on. Sometimes you have to build your own.

Here are tools that different Udacity Self-Driving Car students built to help them solve problems related to deep learning, computer vision, and the Didi Challenge!

Detecting road features

Alex Staravoitau

Alex provides great step-by-step analysis of his lane detection and vehicle tracking software. I really like his detailed explanation of the feature-tracking pipeline:

“After experimenting with various features I settled on a combination of HOG (Histogram of Oriented Gradients), spatial information and color channel histograms, all using YCbCr color space. Feature extraction is implemented as a context-preserving class (FeatureExtractor) to allow some pre-calculations for each frame. As some features take a lot of time to compute (looking at you, HOG), we only do that once for the entire image and then return regions of it.”
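The core of a HOG feature is simple: histogram the gradient directions in a patch, weighted by gradient strength. A single-cell NumPy sketch of that idea (not scikit-image's full HOG, which adds cells, blocks, and normalization):

```python
import numpy as np

def orientation_histogram(gray, bins=9):
    """Histogram of gradient orientations, weighted by gradient magnitude.
    This is the one-cell core of HOG, without block normalization."""
    gy, gx = np.gradient(gray.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.degrees(np.arctan2(gy, gx)) % 180   # unsigned orientation
    hist, _ = np.histogram(ang, bins=bins, range=(0, 180), weights=mag)
    return hist / (hist.sum() + 1e-9)

# A vertical edge puts all its energy in the 0-degree (horizontal gradient) bin:
img = np.zeros((16, 16))
img[:, 8:] = 1.0
h = orientation_histogram(img)
```

Computing this over every sliding window is expensive, which is why Alex computes HOG once for the whole frame and then slices out regions.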

Autonomous Vehicle Speed Estimation from dashboard cam

Jonathan Mitchell

Jonathan built a really cool independent project to estimate vehicle speed from camera images. I really enjoyed his explanation of using optical flow for velocity:

“The Farneback method computes dense optical flow. That means it computes the optical flow from each pixel in the current image to each pixel in the next image.”

Transfer Learning in Keras

Galen Ballew

Galen is particularly interested in how to deploy neural networks in industry. To that end, he ran an experiment to see how well and how quickly various neural networks converged on classifying a training set:

“These networks (especially ResNet50 in this case) required extremely little training time and were relatively easy to implement. Once there is a proof of concept, it is a lot easier to write an optimized network that suits your needs (and maybe mimics the network you transfer learned from) than it is to both write and train from scratch.”
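The transfer-learning recipe Galen describes — freeze a pretrained base, train only a small head — can be sketched without any deep learning framework. Here the "frozen base" is a stand-in random projection (in his experiment it would be ResNet50 minus its top layers), with a logistic-regression head trained in NumPy:

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-in for a frozen pretrained base (think ResNet50 without its top):
# its weights are fixed; we never backprop through it.
W_base = rng.normal(size=(64, 16))
def base(x):
    return np.maximum(x @ W_base, 0)          # "bottleneck" features

# Toy dataset whose labels are predictable from the frozen features.
X = rng.normal(size=(200, 64))
feats = base(X)                               # computed once, then reused
feats = (feats - feats.mean(0)) / (feats.std(0) + 1e-9)
y = (feats @ rng.normal(size=16) > 0).astype(float)

# Transfer learning = training only this small logistic "head".
w, b = np.zeros(16), 0.0
for _ in range(300):
    p = 1 / (1 + np.exp(-(feats @ w + b)))
    w -= 0.5 * feats.T @ (p - y) / len(y)
    b -= 0.5 * (p - y).mean()

p = 1 / (1 + np.exp(-(feats @ w + b)))
accuracy = ((p > 0.5) == y).mean()
```

Computing the bottleneck features once and reusing them is what makes the training time "extremely little", as Galen found — only the tiny head is ever optimized.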

Attempting to Visualize a Convolutional Neural Network in Realtime

Param Aggarwal

One of the knocks on neural networks is that they’re black boxes. Figuring out what drives their decisions is hard. Param built a tool to help visualize the internals of his network:

“On the right we have our Udacity Simulator running. On the left is my little React app that is visualizing all the outputs of the convolutional layers in my neural network.”

part.1: Didi Udacity Challenge 2017 — Car and pedestrian Detection using Lidar and RGB

Cherkeng Heng

Cherkeng is keeping a diary of his work on the Didi Challenge!

“During development, visualization is very important. It helps to ensure that the code implementation and mathematical formulation are correct. I first convert a rectangular region of the lidar 3D point cloud into a multi-channel top view image. I use the KITTI dataset for my initial development.”