GPUs Are Eating the World

Our partners at NVIDIA just announced an amazing third quarter, which cycled (see what I did there?) their stock price up 30%.

The bulk of NVIDIA’s present growth is in their bread and butter gaming business, where they sold $1.24 billion worth of GPUs in just the third quarter.

Headlines then mention NVIDIA’s datacenter business, where they sell GPUs to companies like Google and Facebook, which use the GPUs not for gaming, but rather for high-powered deep learning.

GPUs employ massive parallelism to render games on computer monitors. One way to think of it is that every pixel on a monitor is doing pretty much the same thing, just with different inputs, which is how the colors change.

That massive parallelism turns out to be equally helpful for deep neural networks, in which every unit in the network is doing pretty much the same thing, just with different inputs.

The third and fastest-growing unit of NVIDIA’s business is automotive, which grew 61% year-over-year. Every automotive company in the world is pulling NVIDIA chips, particularly the DRIVE PX2, into their autonomous vehicles. These chips enable deep learning and other parallelized computations that help the car process data in real time.

It’s a good time to be making GPUs.

Behavioral Cloning

One of the first modules in our Self-Driving Car Nanodegree program will be Deep Learning. This is such a fun topic!

We’ll be covering behavioral cloning, which is a technique whereby you drive the car (or the simulated car, in this case) yourself and then pass the data to a neural network. The neural network trains on your driving data and auto-magically learns how to drive the car, without any other information. You don’t have to tell it about the color of the road or which way to turn or where the horizon is. You just pass in data of your own driving and it learns.
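
To make that concrete, here is a minimal sketch of what the training side might look like, assuming the simulator has logged camera frames and steering angles to a couple of NumPy files (the file names, layer sizes, and image dimensions here are placeholders, not the actual project code):

```python
import numpy as np
from keras.models import Sequential
from keras.layers import Conv2D, Flatten, Dense

# Hypothetical recordings from driving the simulator yourself:
# camera frames and the steering angle applied at each frame.
images = np.load("images.npy")      # shape: (num_frames, 66, 200, 3)
steering = np.load("steering.npy")  # shape: (num_frames,)

# A small convolutional network that maps a camera image directly
# to a steering angle -- no hand-coded rules about roads or horizons.
model = Sequential([
    Conv2D(24, (5, 5), strides=(2, 2), activation="relu",
           input_shape=(66, 200, 3)),
    Conv2D(36, (5, 5), strides=(2, 2), activation="relu"),
    Flatten(),
    Dense(100, activation="relu"),
    Dense(1),  # predicted steering angle
])

model.compile(optimizer="adam", loss="mse")
model.fit(images, steering, validation_split=0.2, epochs=10)
```

At inference time the same network just takes the current camera frame and returns a steering angle, which is what lets it drive the car on its own.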

By the end, students will be building their own neural networks to drive cars, just like in this video.

TensorFlow vs. TF Learn vs. Keras vs. TF-Slim

One module in Udacity’s Self-Driving Car Nanodegree program will cover deep learning, with a focus on automotive applications.

We’ve decided to use the TensorFlow library that Google has built as the main tool for this module.

Caffe, an alternative framework, has lots of great research behind it, but TensorFlow uses Python, and our hope is that this will make learning it a lot easier for students.

Even with TensorFlow, however, we face a choice of which “front-end” framework to use. Should we use straight TensorFlow, or TF Learn, or Keras, or the new TF-Slim library that Google released within TensorFlow?

Right now we’re leaning toward TF Learn, almost by default. Straight TensorFlow is really verbose, and TF-Slim seems new and under-documented. Keras and TF Learn both seem solid, but the TF Learn syntax seems a little cleaner.
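
For a flavor of why, here’s a minimal sketch of a small network in TF Learn (the layer sizes are arbitrary, just for illustration):

```python
import tflearn

# Each layer is one line; TF Learn wires up the TensorFlow graph underneath.
net = tflearn.input_data(shape=[None, 784])   # e.g. flattened 28x28 images
net = tflearn.fully_connected(net, 128, activation="relu")
net = tflearn.fully_connected(net, 10, activation="softmax")
net = tflearn.regression(net, optimizer="adam",
                         loss="categorical_crossentropy")

model = tflearn.DNN(net)
# model.fit(X, Y, n_epoch=10, validation_set=0.1)  # with your own data
```

The equivalent in straight TensorFlow means declaring the weight and bias variables, the matrix multiplications, the loss, and the training loop yourself, which is where the verbosity comes from.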

One big drawback to TF Learn, though, is the lack of easily integrated pre-trained models. I spent a while today trying to figure out how to migrate pre-trained AlexNet weights from Caffe to TF Learn.

So far, no one solution is jumping out at me as perfect. Let me know in the comments if you’ve got a suggestion.

Geoffrey Hinton

I’ve been doing a little reading both about and from Geoffrey Hinton, who is widely regarded as the godfather of neural networks.

Separately, I’ve also been listening to Malcolm Gladwell’s podcast, Revisionist History.

One of Gladwell’s recent episodes focused on creativity, and the popular notion that creativity is a product of youth and genius.

It turns out that notion is true!

But it is also true that creativity can be a product of a lifetime of tinkering, or so Gladwell contends.

And that’s where I see the connection to Geoffrey Hinton.

Hinton was born in 1947, which made him 64 years old in 2012, the year he and his graduate students developed and published AlexNet, the deep neural network that blew the machine learning field wide open.

This was the product of a lifetime of working on neural networks. Hinton was one of the original leaders in the field in the 1980s, and helped develop and popularize back-propagation, which is still a critical element of deep neural networks.

But neural networks faded into relative obscurity until AlexNet revolutionized the field with a GPU implementation in 2012.

It’s nice for the rest of us to remember that our engineering life doesn’t end at 30.

Deep Learning Frameworks

I’ve finished NVIDIA’s introductory Deep Learning course, and I’m now starting Udacity’s.

These courses outline the construction and use of Deep Neural Networks (DNNs) for image processing and text recognition. They’re great!

Here are some of the highlights:

DIGITS: This is NVIDIA’s Deep Learning visualization program. It presents a GUI for both building and reviewing DNNs.

Caffe: There are several frameworks for building DNNs, but Caffe seems the most straightforward. Although it is written in C++ and provides a Python interface, no coding is required to get started. This is because Caffe networks can be configured with Google Protocol Buffer (protobuf) text files, a JSON-like plain-text format.
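
For example, a single layer in a Caffe network definition is just a block of protobuf text like this (the layer name and size here are made up for illustration):

```
layer {
  name: "fc1"
  type: "InnerProduct"   # Caffe's name for a fully connected layer
  bottom: "data"
  top: "fc1"
  inner_product_param {
    num_output: 128
  }
}
```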

Theano: NVIDIA’s course advises that the various Deep Learning frameworks are “more similar than they are different”, but Theano is at least syntactically different than Caffe. Theano is a Python symbolic math library, on top of which various packages offer DNN capability.
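
As a tiny illustration of the symbolic style (a toy expression, not a neural network): you define an expression, then compile it, and its gradient, into callable functions.

```python
import theano
import theano.tensor as T

# Define a symbolic expression: y = sum(x^2)
x = T.dvector("x")
y = (x ** 2).sum()

# Compile the expression and its gradient into callable functions.
f = theano.function([x], y)
grad = theano.function([x], T.grad(y, x))

print(f([1.0, 2.0, 3.0]))     # 14.0
print(grad([1.0, 2.0, 3.0]))  # [2. 4. 6.]
```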

Torch: Torch is Facebook’s tool of choice for DNN research, and it gets support from Google, NVIDIA, and other major Deep Learning companies, as well. It uses the Lua programming language (yay, Brazil!).

TensorFlow: In the same way that Torch is Facebook’s go-to DNN tool, TensorFlow fills that role for Google. Like many Google projects, it is Python-based. I am just diving into TensorFlow now, via Udacity’s course, so I may have more to say later.
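
So far, the core idea seems to be the same define-the-graph-then-run-it style. Here’s about the smallest possible example (TensorFlow 1.x-era API, kept trivial on purpose):

```python
import tensorflow as tf

# Build a tiny computation graph...
a = tf.constant(2.0)
b = tf.constant(3.0)
c = a * b

# ...then execute it inside a session.
with tf.Session() as sess:
    print(sess.run(c))  # 6.0
```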

cuDNN: This is NVIDIA’s library for parallelizing DNN training on the GPU. It is the key to building large neural networks, and all of the DNN frameworks integrate with it. As Google’s Vincent Vanhoucke relates, neural networks went through a period of popularity in the eighties and nineties, and then slumped in the 2000s, as CPUs weren’t able to provide enough power to train large networks. The publication of AlexNet (2012) showed that the use of GPU parallelization could provide massive training acceleration. This revolutionized the field.

Convolutional Neural Networks (CNNs): Convolutional Neural Networks are a building block of DNNs that involve learning on small parts of an image and then tiling the neighboring small parts to learn over the entire image. This blocking and tiling reduces the learning complexity, which is especially important for large images.
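
To see why that matters for parameter counts, compare a convolutional layer, whose small filters are reused at every position of the image, to a fully connected layer over the same image (a rough Keras sketch; the sizes are arbitrary):

```python
from keras.models import Sequential
from keras.layers import Conv2D, Dense, Flatten

# Convolutional layer: 32 filters of size 3x3, shared across every
# position of a 224x224 RGB image.
conv = Sequential([Conv2D(32, (3, 3), input_shape=(224, 224, 3))])
conv.summary()   # ~896 parameters: 32 * (3*3*3 + 1)

# Fully connected layer over the same flattened image: a separate
# weight for every pixel-to-unit connection.
dense = Sequential([Flatten(input_shape=(224, 224, 3)), Dense(32)])
dense.summary()  # ~4.8 million parameters: 224*224*3*32 + 32
```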