These courses outline the construction and use of Deep Neural Networks (DNNs) for image processing and text recognition. They’re great!
Here are some of the highights:
DIGITS: This is NVIDIA’s Deep Learning visualization program. It presents a GUI for both building and reviewing DNNs.
Caffe: There are several frameworks for building DNNs, but Caffe seems the most straightforward. Although it is written in C++ and provides a Python interface, no coding is required to get started. This is because Caffe can be configured with Google Protobuf, a JSON-like text format.
Theano: NVIDIA’s course advises that the various Deep Learning frameworks are “more similar than they are different”, but Theano is at least syntactically different than Caffe. Theano is a Python symbolic math library, on top of which various packages offer DNN capability.
Torch: Torch is Facebook’s tool of choice for DNN research, and it gets support from Google, NVIDIA, and other major Deep Learning companies, as well. It uses the Lua programming language (yay, Brazil!).
TensorFlow: In the same way that Torch is Facebook’s go-to DNN tool, TensorFlow fills that role for Google. Like many Google projects, it is Python-based. I am just diving into TensorFlow now, via Udacity’s course, so I may have more to say later.
cuDNN: This is NVIDIA’s library for parallelizing DNN training on the GPU. It is the key to building large neural networks, and all of the DNN frameworks integrate with it. As Google’s Vincent Vanhoucke relates, neural networks went through a period of popularity in the eighties and nineties, and then slumped in the 2000s, as CPUs weren’t able to provide enough power to train large networks. The publication of AlexNet (2012), showed that the use of GPU parallelization could provide massive training acceleration. This revolutionized the field.
Convolutional Neural Networks (CNNs): Convolutional Neural Networks are a building block of DNNs that involve learning on small parts of an image and then tiling the neighboring small parts to learn over the entire image. This blocking and tiling reduces the learning complexity, which is especially important for large images.