Artificial Intelligence, Machine Learning, Deep Learning

Udacity has separate courses on Artificial Intelligence, Machine Learning (actually we have two), and Deep Learning.

What is the difference between all of these? It can be a little hard to explain.

Fortunately, NVIDIA has a nice blog post up explaining these concepts as concentric circles:

“The easiest way to think of their relationship is to visualize them as concentric circles with AI — the idea that came first — the largest, then machine learning — which blossomed later, and finally deep learning — which is driving today’s AI explosion — fitting inside both.”

I guess if I had to explain, I would say that:

  1. “artificial intelligence” refers to techniques that help computers accomplish goals
  2. “machine learning” refers to techniques that help computers accomplish goals by learning from data
  3. “deep learning” refers to techniques that help computers accomplish goals by using deep neural networks to learn from data

But if you’re interested in these topics, then read the NVIDIA post. It’s good.

Behavioral Cloning

One of the first modules in our Self-Driving Car Nanodegree program will be Deep Learning. This is such a fun topic!

We’ll be covering behavioral cloning, which is a technique whereby you drive the car (or the simulated car, in this case) yourself and then pass the data to a neural network. The neural network trains on your driving data and auto-magically learns how to drive the car, without any other information. You don’t have to tell it about the color of the road or which way to turn or where the horizon is. You just pass in data of your own driving and it learns.
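To make the idea concrete, here is a minimal sketch of what such a network might look like in Keras (one of the front-ends discussed below). The image dimensions, layer sizes, and the load_driving_log() helper are purely illustrative, not actual project code.

```python
# A minimal behavioral-cloning sketch: learn to map camera images to steering
# angles using only recorded human driving. Everything here is illustrative;
# load_driving_log() is a hypothetical helper, not a real function.
from keras.models import Sequential
from keras.layers import Conv2D, Flatten, Dense

# images: (N, 66, 200, 3) camera frames; angles: (N,) recorded steering angles
images, angles = load_driving_log()

model = Sequential([
    Conv2D(24, (5, 5), strides=(2, 2), activation='relu', input_shape=(66, 200, 3)),
    Conv2D(36, (5, 5), strides=(2, 2), activation='relu'),
    Flatten(),
    Dense(100, activation='relu'),
    Dense(1)                      # the predicted steering angle
])
model.compile(optimizer='adam', loss='mse')
model.fit(images, angles, validation_split=0.2, epochs=5)
```

The network never sees labels like “road color” or “horizon”; it only sees images paired with the steering angles you produced while driving.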

By the end, students will be building their own neural networks to drive cars, just like in this video.

TensorFlow vs. TF Learn vs. Keras vs. TF-Slim

One module in Udacity’s Self-Driving Car Nanodegree program will cover deep learning, with a focus on automotive applications.

We’ve decided to use the TensorFlow library that Google has built as the main tool for this module.

Caffe, an alternative framework, has lots of great research behind it, but TensorFlow uses Python, and our hope is that this will make learning it a lot easier for students.

Even with TensorFlow, however, we face a choice of which “front-end” framework to use. Should we use straight TensorFlow, or TF Learn, or Keras, or the new TF-Slim library that Google released within TensorFlow?

Right now we’re leaning toward TF Learn, almost by default. Straight TensorFlow is really verbose, and TF-Slim seems new and under-documented. Keras and TF Learn both seem solid, but the TF Learn syntax seems a little cleaner.
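For a sense of what “cleaner” means, here is a rough sketch of the layer-by-layer style TF Learn encourages. The specific layers and sizes are arbitrary and only for illustration, not a recommended architecture.

```python
# A rough illustration of TF Learn's layer-by-layer style; the layer choices
# and sizes are arbitrary, purely for illustration.
import tflearn
from tflearn.layers.core import input_data, fully_connected
from tflearn.layers.conv import conv_2d, max_pool_2d
from tflearn.layers.estimator import regression

net = input_data(shape=[None, 32, 32, 3])             # e.g. small RGB images
net = conv_2d(net, 32, 3, activation='relu')          # 32 filters, 3x3 each
net = max_pool_2d(net, 2)
net = fully_connected(net, 128, activation='relu')
net = fully_connected(net, 1, activation='linear')    # e.g. a single steering angle
net = regression(net, optimizer='adam', loss='mean_square')

model = tflearn.DNN(net)
# model.fit(X, y, n_epoch=10, validation_set=0.1)     # X, y: your training data
```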

One big drawback to TF Learn, though, is the lack of easily integrated pre-trained models. I spent a while today trying to figure out how to migrate pre-trained AlexNet weights from Caffe to TF Learn.

So far, no one solution is jumping out at me as perfect. Let me know in the comments if you’ve got a suggestion.

Geoffrey Hinton

I’ve been doing a little reading both about and from Geoffrey Hinton, who is widely regarded as the godfather of neural networks.

Separately, I’ve also been listening to Malcolm Gladwell’s podcast, Revisionist History.

One of Gladwell’s recent episodes focused on creativity, and the popular notion that creativity is a product of youth and genius.

It turns out that notion is true!

But it is also true that creativity can be a product of a lifetime of tinkering, or so Gladwell contends.

And that’s where I see the connection to Geoffrey Hinton.

Hinton was born in 1947, which made him 64 years old in 2012, the year he and his graduate students developed and published AlexNet, the deep neural network that blew the machine learning field wide open.

This was the product of a lifetime of working on neural networks. Hinton was one of the original leaders in the field in the 1980s, and helped develop and popularize back-propagation, which is still a critical element of training deep neural networks.

But neural networks faded into relative obscurity until AlexNet revolutionized the field with a GPU implementation in 2012.

It’s nice for the rest of us to remember that our engineering life doesn’t end at 30.

How to Land An Autonomous Vehicle Job: Coursework

Recently I outlined a short series of posts I’ll be writing about how I landed a job in autonomous vehicles.

The first part of that equation was coursework.

There are so many free online courses to take!

My background is that I have a pretty solid foundation in software engineering, including an undergraduate degree in computer science. But most recently my programming has been on the web, not so much in the machine learning and embedded systems areas that dominate vehicle software.

Here are the courses I took:

Artificial Intelligence for Robotics (Udacity): This is a terrific and super-fun introduction to self-driving cars by Sebastian Thrun. Thrun is the founder of Udacity, the founder of Google’s self-driving car project, and a former Stanford professor. Taking the class is like being in the presence of greatness.

Machine Learning (Coursera): This class is really broad, covering supervised and unsupervised learning algorithms, as well as optimization and tuning. The teacher is Andrew Ng, who is like Sebastian Thrun’s mirror image — Stanford professor, then founder of Coursera, now Chief Scientist at Baidu, which has its own self-driving car program.

Control of Mobile Robots (Coursera): This course is taught through Coursera’s partnership with Georgia Tech, and covers the basics of control theory. It was especially helpful for me, as a computer science undergrad with minimal background in mechanical engineering.

Deep Learning (Udacity): This is a relatively short overview of the theory behind deep neural networks, with some practical programming exercises.

Deep Learning (NVIDIA): In practice, it’s possible to get a lot of value out of deep neural networks with only a thin understanding of how DNNs actually work. That’s because practitioners can get a lot of mileage out of deep learning frameworks like Caffe, Theano, and Torch. This course provides an overview of each framework, along with programming exercises.

Intro to Parallel Programming with CUDA (Udacity): Deep learning plays a prominent role in autonomous software, and deep learning is itself enabled by the massive parallelization that GPUs offer. CUDA is the parallel programming framework created by NVIDIA, and this course provides great background into how parallel programming works.

Underactuated Robotics (edX): This was by far the most math-heavy of the courses I took, owing to its target audience — MIT upperclassmen. I confess that due to some family obligations I only finished about 2/3 of the course. But the course provides terrific exercises in how to model robots in the physical world. It also forced me to brush up on my advanced math.

All of these are fairly advanced courses. Some of the programming exercises are in C++, some in Python, and many in MATLAB.

For somebody with minimal software engineering background, I might recommend starting with some more introductory computer science and linear algebra courses.

But for somebody with my background — that is to say, a strong software engineer with no real robotics experience — these classes were terrific.

Hi Peter!

Occasionally Medium sends me an email letting me know that people have followed my posts or maybe even liked them, which always feels great.

But I got a special treat a few days ago when one of the headshots that popped up in the “New Followers” email was Peter Norvig.

Peter Norvig is Director of Research at Google, but to me and many other engineers he is first and foremost the co-author of Artificial Intelligence: A Modern Approach. This was the seminal textbook of artificial intelligence in my undergraduate days. To judge by Amazon rankings, the book maintains that position today.

So Peter, if you’re reading this, thanks for the lessons and for your outsized contributions to computer science.

Also, I followed you back. 🙂

Image Annotation

Image annotation is an interesting and surprisingly difficult problem that many autonomous vehicle researchers are struggling with.

The issue is that it’s easy to send a car and a camera out into the world to collect data, but it’s time-consuming and expensive to label that data.

And the labeling is necessary in order to train the machine to read the data.

Think about lane lines, for example. Lots of companies can now capture millions of images of roads, in all sorts of conditions. But in order to find the lane lines, the computer models have to be trained. And training the models involves telling them where the lane lines are in the sample images.
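To make “telling them where the lane lines are” concrete, here is a hypothetical sketch of what a single labeled training example might look like. The field names and coordinates are invented for illustration; real datasets each define their own formats.

```python
# A hypothetical labeled training example: the annotation pairs a camera frame
# with the lane-line geometry a human marked in it. Field names and values are
# invented for illustration only.
labeled_example = {
    "image": "frames/highway_000123.png",        # raw camera frame
    "lane_lines": [
        # each lane line is a list of (x, y) pixel coordinates along the line
        [(102, 540), (310, 420), (480, 330)],    # left lane line
        [(818, 540), (640, 420), (505, 330)],    # right lane line
    ],
}
```

Multiply that by millions of frames, and the scale of the labeling problem becomes clear.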

Lots of academic researchers use Amazon’s Mechanical Turk, a crowdsourcing marketplace where low-cost workers around the world pick up manual tasks from companies in the US and elsewhere. But that is both expensive – even the world’s cheapest workers become expensive when asked to perform billions of tasks – and slow.

There doesn’t seem to be a solution yet to this problem.

Deep Learning Frameworks

I’ve finished NVIDIA’s introductory Deep Learning course, and I’m now starting Udacity’s.

These courses outline the construction and use of Deep Neural Networks (DNNs) for image processing and text recognition. They’re great!

Here are some of the highlights:

DIGITS: This is NVIDIA’s Deep Learning GPU Training System. It presents a web-based GUI for building, training, and reviewing DNNs.

Caffe: There are several frameworks for building DNNs, but Caffe seems the most straightforward. Although it is written in C++ and provides a Python interface, no coding is required to get started. This is because Caffe can be configured with Google Protobuf, a JSON-like text format.

Theano: NVIDIA’s course advises that the various Deep Learning frameworks are “more similar than they are different”, but Theano is at least syntactically different than Caffe. Theano is a Python symbolic math library, on top of which various packages offer DNN capability.

Torch: Torch is Facebook’s tool of choice for DNN research, and it gets support from Google, NVIDIA, and other major Deep Learning companies, as well. It uses the Lua programming language (yay, Brazil!).

TensorFlow: In the same way that Torch is Facebook’s go-to DNN tool, TensorFlow fills that role for Google. Like many Google projects, it is Python-based. I am just diving into TensorFlow now, via Udacity’s course, so I may have more to say later.

cuDNN: This is NVIDIA’s library for parallelizing DNN training on the GPU. It is the key to building large neural networks, and all of the DNN frameworks integrate with it. As Google’s Vincent Vanhoucke relates, neural networks went through a period of popularity in the eighties and nineties, and then slumped in the 2000s, as CPUs weren’t able to provide enough power to train large networks. The publication of AlexNet in 2012 showed that GPU parallelization could provide massive training acceleration. This revolutionized the field.

Convolutional Neural Networks (CNNs): Convolutional Neural Networks are a building block of DNNs that involve learning on small parts of an image and then tiling the neighboring small parts to learn over the entire image. This blocking and tiling reduces the learning complexity, which is especially important for large images.
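To make the “small parts, then tiling” idea concrete, here is a tiny NumPy sketch (not from either course) of a single filter sliding across an image. In a real CNN the filter weights are learned, and frameworks run this on the GPU, but the mechanics are the same.

```python
# Slide one small filter over every position of a grayscale image. The same
# few weights are reused at every location, which is what keeps the learning
# complexity manageable even for large images.
import numpy as np

def convolve2d(image, kernel):
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i:i + kh, j:j + kw]    # a small part of the image
            out[i, j] = np.sum(patch * kernel)   # same weights reused everywhere
    return out

image = np.random.rand(28, 28)            # e.g. a small grayscale image
kernel = np.random.rand(3, 3)             # stand-in for a learned 3x3 filter
feature_map = convolve2d(image, kernel)   # 26 x 26 response map
```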

Deep Learning

I have been studying a little bit about deep learning recently, and hope to learn more over the next week.

In particular, I have been progressing through NVIDIA’s introductory Deep Learning course, which offers an overview of Deep Neural Networks (DNNs). The course covers three DNN frameworks (Caffe, Theano, and Torch) and one visualization tool (DIGITS).

This type of course is super-helpful, in that it’s geared toward practitioners and problem-solving, and less toward the theory of DNNs. The Caffe framework, combined with the DIGITS visualization tool, seems particularly well-suited to quickly constructing a DNN and seeing where it leads.

So I’m a big fan of the NVIDIA course.

Next I’d like to take either Coursera’s Neural Networks for Machine Learning, or Udacity’s Deep Learning.

Coursera’s course is taught by the famed neural network researcher Geoffrey Hinton, whereas Udacity’s courses have a great UI and often a more practical (versus theoretical) approach.

I’ll let you know what I choose, and let me know if you have any recommendations!