Deep learning is everywhere. This branch of artificial intelligence curates your social media and serves your Google search results. Soon, deep learning could also check your vital signs or adjust your thermostat. MIT researchers have developed a system that can bring deep learning neural networks to new – and much smaller – places, like the tiny computer chips in portable medical devices, household appliances, and the 250 billion other objects that make up the "Internet of Things" (IoT).
The system, called MCUNet, designs compact neural networks that, despite limited storage and processing power, offer unprecedented speed and accuracy for deep learning on IoT devices. The technology could make it easier to expand the IoT universe while saving energy and improving data security.
The research will be presented at next month's Conference on Neural Information Processing Systems. The lead author is Ji Lin, a PhD student in Song Han's lab in MIT's Department of Electrical Engineering and Computer Science. Co-authors include Han and Yujun Lin of MIT, Wei-Ming Chen of MIT and National Taiwan University, and John Cohn and Chuang Gan of the MIT-IBM Watson AI Lab.
The Internet of Things
The IoT was born in the early 1980s. Carnegie Mellon University students, including Mike Kazar '78, connected a Coca-Cola machine to the internet. The group's motivation was simple: laziness. They wanted to use their computers to confirm that the machine was stocked before trekking from their office to make a purchase. It was the world's first internet-connected device. "This was pretty much treated as the punch line of a joke," says Kazar, now a Microsoft engineer. "Nobody expected billions of devices on the internet."
Since that Coca-Cola machine, everyday objects have become increasingly networked into the growing IoT. This includes everything from portable heart monitors to smart refrigerators that let you know when you're low on milk. IoT devices often run on microcontrollers – simple computer chips with no operating system, minimal processing power, and less than one-thousandth the memory of a typical smartphone. As a result, pattern recognition tasks such as deep learning are difficult to run locally on IoT devices. For complex analysis, IoT-gathered data is often sent to the cloud, making it vulnerable to hacking.
"How do we use neural networks directly on these tiny devices? It's a new area of research that is getting very hot," says Han. "Companies like Google and ARM are all working in that direction." Han is too.
With MCUNet, Han's group codesigned two components that are required for "tiny deep learning" – the operation of neural networks on microcontrollers. One component is TinyEngine, an inference engine that directs resource management, much as an operating system does. TinyEngine is optimized to run a specific neural network structure, which is selected by the other component of MCUNet: TinyNAS, a neural architecture search algorithm.
System-algorithm codesign
Designing a deep network for microcontrollers is not easy. Existing neural architecture search techniques start with a large pool of possible network structures based on a predefined template and then gradually find those with high accuracy and low cost. The method works, but is not the most efficient. "It can work pretty well for GPUs or smartphones," says Lin. "However, it has been difficult to apply these techniques directly to tiny microcontrollers because they are too small."
So Lin developed TinyNAS, a neural architecture search method that creates custom-sized networks. "We have many microcontrollers with different power capacities and memory sizes," says Lin. "That's why we developed the algorithm (TinyNAS) to optimize the search space for different microcontrollers." Because TinyNAS is customized in this way, it can generate compact neural networks with the best possible performance for a specific microcontroller – without unnecessary parameters. "Then we deliver the final, efficient model to the microcontroller," says Lin.
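The idea of fitting the search to a specific chip can be illustrated with a toy example. The sketch below is not the TinyNAS algorithm; it assumes a hypothetical four-stage convolutional network, a crude cost model with 8-bit weights and activations, and two made-up tuning knobs (a channel width multiplier and an input resolution). It simply keeps the candidates that fit a given memory budget and picks the one that does the most computation, on the assumption that a larger network within budget tends to be more accurate.

```python
from dataclasses import dataclass
from itertools import product
from typing import Optional


@dataclass(frozen=True)
class Budget:
    sram_bytes: int   # working memory available for activations
    flash_bytes: int  # storage available for weights


@dataclass(frozen=True)
class Candidate:
    width: float      # channel width multiplier (hypothetical knob)
    resolution: int   # input image side length in pixels (hypothetical knob)


# Hypothetical toy network: four 3x3 conv stages, each halving the resolution.
BASE_CHANNELS = (16, 32, 64, 128)


def stage_channels(c: Candidate) -> list:
    """Channel counts per stage, starting with the 3-channel input image."""
    return [3] + [max(8, int(ch * c.width)) for ch in BASE_CHANNELS]


def flash_cost(c: Candidate) -> int:
    """Bytes of 8-bit weights: 3*3*in*out per stage."""
    chans = stage_channels(c)
    return sum(9 * cin * cout for cin, cout in zip(chans, chans[1:]))


def peak_sram(c: Candidate) -> int:
    """Largest input-plus-output activation footprint over all stages."""
    chans, side, peak = stage_channels(c), c.resolution, 0
    for cin, cout in zip(chans, chans[1:]):
        out_side = side // 2
        peak = max(peak, side * side * cin + out_side * out_side * cout)
        side = out_side
    return peak


def macs(c: Candidate) -> int:
    """Multiply-accumulate count, used here as a rough proxy for capacity."""
    chans, side, total = stage_channels(c), c.resolution, 0
    for cin, cout in zip(chans, chans[1:]):
        side //= 2
        total += side * side * 9 * cin * cout
    return total


def fits(c: Candidate, b: Budget) -> bool:
    return peak_sram(c) <= b.sram_bytes and flash_cost(c) <= b.flash_bytes


def best_candidate(b: Budget,
                   widths=(0.25, 0.5, 0.75, 1.0),
                   resolutions=(64, 96, 128, 160)) -> Optional[Candidate]:
    """Return the feasible candidate with the most computation, or None."""
    feasible = [Candidate(w, r) for w, r in product(widths, resolutions)
                if fits(Candidate(w, r), b)]
    return max(feasible, key=macs, default=None)


if __name__ == "__main__":
    small_chip = Budget(sram_bytes=64 * 1024, flash_bytes=256 * 1024)
    large_chip = Budget(sram_bytes=320 * 1024, flash_bytes=1024 * 1024)
    print("small chip:", best_candidate(small_chip))
    print("large chip:", best_candidate(large_chip))
```

Running the script with the two made-up budgets shows the point of the quote: the same search yields a different network for each chip, and a chip with more memory never ends up with a smaller one.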
To run that tiny neural network, a microcontroller also needs a lean inference engine. A typical inference engine carries some dead weight – instructions for tasks it may rarely run. The extra code is not a problem for a laptop or smartphone, but it can easily overwhelm a microcontroller. "It has no off-chip memory and no hard drive," says Han. "All in all, it's only a megabyte of flash, so we have to manage such a small resource really carefully." Enter TinyEngine.
The researchers developed their inference engine in conjunction with TinyNAS. TinyEngine generates the essential code required to run TinyNAS's custom neural network. Any deadweight code is discarded, which cuts down on compile time. "We only keep what we need," says Han. "And since we designed the neural network, we know exactly what we need. This is the advantage of system-algorithm codesign." In the group's tests of TinyEngine, the compiled binary was between 1.9 and five times smaller than comparable microcontroller inference engines from Google and ARM. TinyEngine also includes innovations that reduce runtime, including in-place depth-wise convolution. After codesigning TinyNAS and TinyEngine, Han's team put MCUNet to the test.
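The "keep only what we need" idea can be sketched in a few lines. The example below is an illustration, not TinyEngine's actual code generator: the kernel library, the operator names, and the C stubs are all hypothetical. Given a fixed network, it emits source that contains only the kernels that network uses and leaves everything else out of the build.

```python
# Hypothetical kernel library: operator name -> C source stub.
# A general-purpose engine would ship all of these; a generated one would not.
KERNEL_LIBRARY = {
    "conv2d": "void conv2d(void) { /* kernel body omitted in this sketch */ }",
    "depthwise_conv2d": "void depthwise_conv2d(void) { /* omitted */ }",
    "avg_pool": "void avg_pool(void) { /* omitted */ }",
    "fully_connected": "void fully_connected(void) { /* omitted */ }",
    "softmax": "void softmax(void) { /* omitted */ }",
    "lstm": "void lstm(void) { /* omitted */ }",
    "transpose_conv": "void transpose_conv(void) { /* omitted */ }",
}


def required_ops(model: list) -> list:
    """Unique operator names used by the model, in a stable order."""
    return sorted({layer["op"] for layer in model})


def generate_source(model: list) -> str:
    """Emit C source holding only the kernels this model needs."""
    ops = required_ops(model)
    missing = [op for op in ops if op not in KERNEL_LIBRARY]
    if missing:
        raise ValueError(f"no kernel available for: {missing}")
    kernels = [KERNEL_LIBRARY[op] for op in ops]
    # One call per layer, in network order; each kernel is defined once.
    calls = [f"    {layer['op']}();" for layer in model]
    run_fn = "void run_model(void) {\n" + "\n".join(calls) + "\n}"
    return "\n\n".join(kernels + [run_fn]) + "\n"


# Hypothetical network description, as a search step might hand it over.
MODEL = [
    {"op": "conv2d"},
    {"op": "depthwise_conv2d"},
    {"op": "conv2d"},
    {"op": "avg_pool"},
    {"op": "fully_connected"},
]

if __name__ == "__main__":
    source = generate_source(MODEL)
    full_size = sum(len(src) for src in KERNEL_LIBRARY.values())
    print(source)
    print(f"generated: {len(source)} chars; all kernels: {full_size} chars")
```

Because the network is known before any code is emitted, unused operators such as the `lstm` stub here never reach the compiler, which is the effect Han describes.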
The first challenge for MCUNet was image classification. The researchers used the ImageNet database to train the system with labeled images and then test its ability to classify novel ones. On a commercial microcontroller they tested, MCUNet successfully classified 70.7 percent of the new images – the previous state-of-the-art combination of neural network and inference engine was only 54 percent accurate. "Even a 1 percent improvement is seen as significant," says Lin. "So this is a big leap for microcontroller settings."
The team found similar results in ImageNet tests on three other microcontrollers. In terms of both speed and accuracy, MCUNet also beat the competition on audio and visual "wake word" tasks, in which a user starts an interaction with a computer using voice cues (think "Hey Siri") or simply by entering a room. The experiments underline the adaptability of MCUNet to numerous applications.
The promising test results give Han hope that MCUNet will become the new industry standard for microcontrollers. "It has huge potential," he says.
The advance "pushes the boundaries of deep neural network design even further into the computing realm of small, energy-efficient microcontrollers," says Kurt Keutzer, a computer scientist at the University of California at Berkeley, who was not involved in the work. He adds that MCUNet could "bring intelligent computer vision capabilities to even the simplest kitchen appliances, or enable smarter motion sensors."
MCUNet could also make IoT devices more secure. "A key benefit is privacy," says Han. "You don't have to transfer the data to the cloud."
Analyzing data locally reduces the risk of personal information – including personal health data – being stolen. Han envisions smartwatches with MCUNet that not only track users' heartbeat, blood pressure and oxygen levels, but also analyze that information and help them understand it. MCUNet could also bring deep learning to IoT devices in vehicles and rural areas with limited internet access.
In addition, MCUNet's slim computing footprint translates into a slim carbon footprint. "Our big dream is green AI," says Han, adding that training a large neural network can emit as much carbon as five cars do over their lifetimes. MCUNet on a microcontroller would use a small fraction of that energy. "Our ultimate goal is to enable efficient, tiny AI with fewer computational resources, fewer people, and less data," says Han.