
Last spring, Facebook published SEER, a new approach to self-supervised deep learning.
One of the core challenges for most deep learning efforts is securing labeled data. A supervised neural network needs labels during training so it can learn when it is right, when it is wrong, how wrong it is, and then improve.
Unfortunately, lots of datasets don’t come with labels. The common solution is to pay a third-party vendor to ship the data to a country with low labor costs for manual human labeling. Even in the most economical locations, this effort becomes very expensive, and it is surprisingly error-prone.
Over time, most companies have gotten smarter about how to automatically label a lot of data, but human labeling remains important.
Facebook’s SEER approach skips labeling entirely, using a “self-supervised” approach to learn directly from the raw data. Instead of tagging images with “cat”, “dog”, and other descriptors, SEER learns to group similar images together. The basic idea is to extract features from each image and then assign images with similar features to the same cluster.
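To make the idea concrete, here is a minimal, hypothetical sketch of clustering-based self-supervision in PyTorch. It follows a DeepCluster-style recipe (cluster the network’s own features, then train the network to predict those cluster assignments); SEER’s actual training uses the more sophisticated SwAV online-clustering objective and a vastly larger model, so treat this purely as an illustration of the concept. All names and sizes here are made up for the example.

```python
# Hypothetical sketch of clustering-based self-supervised learning.
# Not SEER's implementation: SEER uses SwAV's online clustering at scale.
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_CLUSTERS = 10

# Toy encoder standing in for the large RegNet trunk SEER uses.
encoder = nn.Sequential(nn.Flatten(), nn.Linear(32 * 32 * 3, 64))
classifier = nn.Linear(64, NUM_CLUSTERS)  # predicts cluster assignments
optimizer = torch.optim.SGD(
    list(encoder.parameters()) + list(classifier.parameters()), lr=0.01)

images = torch.randn(256, 3, 32, 32)  # unlabeled images (random stand-ins)

for epoch in range(5):
    # Step 1: extract features for every image with the current encoder.
    with torch.no_grad():
        feats = F.normalize(encoder(images), dim=1)

    # Step 2: cluster the features (plain k-means) to get pseudo-labels.
    centroids = feats[torch.randperm(len(feats))[:NUM_CLUSTERS]]
    for _ in range(10):
        assign = torch.cdist(feats, centroids).argmin(dim=1)
        for k in range(NUM_CLUSTERS):
            if (assign == k).any():
                centroids[k] = feats[assign == k].mean(dim=0)

    # Step 3: train the network to predict its own cluster assignments.
    logits = classifier(encoder(images))
    loss = F.cross_entropy(logits, assign)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

No labels appear anywhere in the loop: the clusters serve as self-generated supervision, which is the core trick the article is describing.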
The second contribution of SEER is an architecture for training a network at Facebook’s scale. The Facebook AI team behind this effort documents their use of RegNets, a family of convolutional networks from Facebook’s earlier network design-space research, to balance compute power against memory capacity and scale the system.
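For a sense of what a RegNet is, torchvision ships implementations of the family (this assumes torchvision 0.11 or later), so you can instantiate a mid-sized member and count its parameters. SEER’s largest trunk, RegNetY-256GF, is far larger than anything torchvision provides.

```python
# Inspect a mid-sized RegNet from torchvision (assumes torchvision >= 0.11).
import torchvision.models as models

model = models.regnet_y_16gf()  # a smaller cousin of SEER's RegNetY-256GF
n_params = sum(p.numel() for p in model.parameters())
print(f"RegNetY-16GF parameters: {n_params / 1e6:.1f}M")
```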
Self-supervised learning seems like it might become important for robotics and autonomous vehicles, particularly in the planning pipeline. That is an area in which it can be hard to even know what labels to assign to raw data. If we could instead design a system that lets the network learn for itself, that would be a big step forward.