The Rise of VPUs: Giving eyes to machines

Because seeing is believing.

By Prasid Banerjee | Published: 28 Mar 2016 | Last updated: 28 Mar 2016

On August 31, 1999, when Nvidia unveiled the GeForce 256, the company called it the "world's first GPU". According to Nvidia's website, "A GPU represents a significant breakthrough in realism. It literally transforms the way you interact with your PC. It accomplishes this by completely offloading all graphics acceleration from the CPU." In essence, a GPU is a specialised chip tasked with taking load off the CPU in order to deliver high-level graphics. It was developed out of a need for such performance in computers, because a single chip couldn't perform every task that was required of it.

It was a turning point in computing, leading to many features that we take for granted today. Now we stand at the same hurdle once again: the need for specialised chips is apparent, thanks in no small part to the obsolescence of Moore's Law. Until now, developments in the chip industry were driven by a prophecy made by Intel's co-founder Gordon Moore, one that has led us from the Intel 4004 (with around 2,300 transistors) to Intel's Skylake, with approximately 1.75 billion transistors.

"Moore's Law: Processing power doubles roughly every two years, as smaller and more transistors are packed on a silicon wafer. This boosts performance and reduces costs"
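To put the doubling rule in perspective, here is a rough back-of-the-envelope check in Python. It is only a sketch: the transistor counts are the ones quoted above, while the launch years (1971 for the 4004, 2015 for Skylake) and the function names are assumptions added for illustration. With these figures, the jump works out to roughly 19.5 doublings over 44 years, or a little over two years per doubling.

```python
import math

# Transistor counts as quoted in the article.
T_4004 = 2_300
T_SKYLAKE = 1_750_000_000

# Launch years are assumptions added for this sketch (not from the article).
YEAR_4004 = 1971
YEAR_SKYLAKE = 2015


def doublings(start_count, end_count):
    """Number of times start_count must double to reach end_count."""
    return math.log2(end_count / start_count)


def years_per_doubling(start_count, end_count, start_year, end_year):
    """Average number of years each doubling took over the period."""
    return (end_year - start_year) / doublings(start_count, end_count)


def projected_count(start_count, years, period=2.0):
    """Transistor count predicted by doubling once every `period` years."""
    return start_count * 2 ** (years / period)


if __name__ == "__main__":
    n = doublings(T_4004, T_SKYLAKE)
    pace = years_per_doubling(T_4004, T_SKYLAKE, YEAR_4004, YEAR_SKYLAKE)
    ideal = projected_count(T_4004, YEAR_SKYLAKE - YEAR_4004)
    print(f"Doublings from 4004 to Skylake: {n:.1f}")
    print(f"Average years per doubling: {pace:.2f}")
    print(f"Strict two-year doubling would predict: {ideal:,.0f} transistors")
```

A strict two-year doubling from the 4004 would predict a higher count than the Skylake figure quoted here, which fits the article's point that the law can no longer be relied upon on its own.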

Death of Moore's law and the rise of VPUs
"Broadly speaking, yes," said Jack Dashwood, Marcom Director, Movidius, when asked whether the obsolescence of Moore's law plays a part in the rise of VPUs. "We are increasingly less reliant on the implicit benefits provided by moving down to a new process node. Purpose built processors and perhaps even more importantly, elegant marriage of software on top of the underlying silicon is going to be a huge source of improvements going forward, both from a technological and economic perspective."

Movidius is a small chip startup that you have probably not heard of yet. It's a European company that produces specialised chips known as Vision Processing Units (VPUs). The chips are meant for areas such as Augmented Reality and Virtual Reality, among others. Movidius' Myriad 2 chip runs on the recently announced DJI Phantom 4 drone, and its Myriad 1 chip was used in Google's first Project Tango device. Dashwood explains that while Movidius is a decade-old company, it turned its attention to VPUs in 2009, and Google's first Project Tango phone was the first device to use its chips. The Myriad 1 and Myriad 2 are high-performance, low-power chips meant specifically for Computer Vision.

The failure of Moore's Law has led companies to look for new ways of adding computing power, and chips such as the Myriad 2 are among the most promising avenues. VPUs like the Myriad 2 may not sound like much but, like GPUs, they come out of a specific need, in this case computer vision. Computer Vision is a branch of computing that deals with processing and understanding real-world elements and images. It is the technology behind the myriad augmented reality apps that you see today (Blippar, for instance). It's also important for intelligent drones and robots that can navigate around and interpret real-world objects by themselves.

"GPUs are actually quite a good analogy. In the early 1990s, people realised that 3D gaming and visualisation was going to be hugely important for both commercial as well as consumer purposes, but existing architectures were not well suited to the types of computation required for rich graphics. In a similar vein, we are now keenly aware of the value of computer vision, but much of the existing hardware and software approaches aren't optimised for such tasks," said Dashwood.

"VPUs are to Computer Vision, what GPUs are to gaming and graphics"

Obstacles to overcome
Using a specialised chip for a particular purpose is easier said than done. Gamers often use complex liquid cooling setups to deal with the heat that GPUs generate. But while GPUs were originally meant for PCs, where there was enough space to implement such cooling methods, VPUs do not enjoy that luxury. They are meant for drones, smartphones and other smaller devices, becoming an integral part of the mobile environment that the world is rapidly progressing towards.

According to Dashwood, that problem has already been solved. "The Myriad 2 has been developed from the ground up to run in a low power envelope, and at temperatures low enough that they can be embedded on wearable devices." The Myriad 2 can process millions of pixels while consuming less than one watt of power. This is significantly lower than the power consumed by smartphone processors today, and it is necessary for a chip that is supposed to run alongside those processors. "In essence, while the multi-core processor on your phone will be responsible for fast boot-up of an augmented reality app, the VPU will be responsible for what that app does," he said.

Heat isn't the only hurdle, though. The problem with implementing specialised chips is that it’s harder to program for them. Dashwood explained that the Myriad 2 is aimed at device manufacturers who are competent in this realm. Programmability of the chip should not be confused with end-applications running on Android OS.

The computing industry is no stranger to specialised logic. Intel's newest chips have dedicated logic meant for video and other tasks; MediaTek's Helio chips come with the CorePilot algorithm to improve performance; Qualcomm, the biggest name in smartphone SoCs, recently introduced a number of enhancements to its chips using specialised algorithms. In its data centres, Microsoft uses a specialised FPGA (Field-Programmable Gate Array) chip for Bing. The company told The Economist that the chip has doubled the number of queries a server can process in a given time. Given the DJI Phantom 4's proficiency in obstacle avoidance and Google's recent showcase of Project Tango devices, it looks like the industry has overcome this hurdle as well.

Lastly, Dashwood says that VPUs take up almost no discernible space, which makes them easier to implement in smaller devices such as smartphones and smartwatches. "The additional sensors often involved are much larger considerations when it comes to space," Dashwood said.

Application in Virtual Reality
While the use of VPUs in augmented reality is apparent, the industry today is primarily focused on virtual reality. For starters, VPUs can help make a VR headset less bulky, Dashwood believes. More importantly, they can add hundreds of ways for Virtual Reality applications to interact with the real world. Room-scaling in HTC's Vive headset is one example of how virtual reality can work in conjunction with the real world. Think of this as a union between Virtual and Augmented Reality. What if the virtual space was built around your real space?

VPUs can help in "all sorts of areas", says Dashwood. He lists positional tracking, gesture, environment mapping, eye tracking and object classification as a few examples. These are some of the essential components of Virtual Reality today. If your environment can be effectively mapped, then the virtual reality space that a headset like the Oculus Rift takes you to can be built around it. This means that if you're in your living room, your Minecraft game will be built based on things in the room. Thinking back to the legendary Age of Empires games, imagine players in a single house, building their empires in separate rooms of the house, while the doorways act as borders between their empires.

Microsoft's HoloLens is another area where VPUs can come in handy. The augmented reality headset seems to be one of the best things to have come out of Microsoft's stable recently, and it essentially depends on recognising the real world and then overlaying the virtual on top of it.

"VPUs are not just possible for VR, they're almost essential"

Application in smartphones
VR is still about a year or so away from truly coming to the mainstream, and consumers today are still more focused on smartphones. A very interesting possibility for VPUs in smartphones is improving their cameras. "Computational photography is an obvious application," says Dashwood. "There are a great deal of ways of working around the physical limitations of optics running on various operations, to construct a visually pleasing photograph." According to him, computational photography has the potential to bring DSLR (or better) quality images to our smartphones.

In essence, Computer Vision allows your smartphone to understand the scene in front of you. A photograph can be passed through additional processing, adding inputs from the VPU to generate more realistic representations.

This could help in two ways. First, it could enhance camera quality without making your phone thicker. One of the main reasons why smartphones cannot attain DSLR-like quality lies in their space constraints: you cannot fit large enough lenses or sensors into them. VPUs can potentially solve this. Second, it could improve low-light photography, a major area of focus for smartphone OEMs. While companies like Apple and Samsung have made a lot of advancements, low light remains the bane of smartphone cameras, and VPUs may help here as well.

We have reached out to some OEMs to get their take on the use of VPUs for such purposes. The story will be updated when their responses are available.

Brains and Brawn
Perhaps the most interesting, and potentially scary, implementation of VPUs lies in machine learning. A neural network is a machine learning algorithm loosely modelled on the human brain, which means that a VPU can act as the eyes for that brain. On January 27, 2016, Movidius announced that it is working with Google to accelerate the adoption of deep learning within mobile devices. The partnership gives Movidius access to Google's neural network technology roadmap, while the search giant will source Movidius' processors and its entire software development network.

"What Google has been able to achieve with neural networks is providing us with the building blocks for machine intelligence, laying the groundwork for the next decade of how technology will enhance the way people interact with the world," said Blaise Aguera y Arcas, head of Google's machine intelligence group. Aguera y Arcas said that working with Movidius allowed Google to expand its technology out of data centres and into the real world. Google is using the MA2450, the most powerful iteration of Movidius' Myriad 2 chip, for this purpose. According to Remi El-Ouazzane, CEO of Movidius, the challenge in embedding Google's advances in machine intelligence into devices comes down to extreme power efficiency. This needs a deep synthesis between the underlying hardware architecture and the neural compute running on it.

In an interview with Digit, David Silver, a research scientist at Google DeepMind, said that it is early days for Artificial Intelligence and that we are "decades away from human level AGI". Silver heads the team that developed DeepMind's AlphaGo algorithm, which recently beat Go champion Lee Sedol in a best-of-five match. Dashwood says that machine learning and VPUs go hand in hand.

"VPUs will, in future, make for an integral part of artificially intelligent robots, working as the eyes for the neural networks to work with"

The booming market
To add the proverbial cherry on the cake, the Computer Vision and VPU markets are at a nascent stage, but they are booming. Google and DJI are two of the best-known names, but there are others exploring these avenues. Dashwood says that currently, Movidius is the only viable solution that offers low power, low thermal characteristics and high performance.

According to him, the company's chief competitors come from the GPU and CPU market, since in some cases they may make for viable solutions for Computer Vision requirements. "In some instances, a CPU or GPU might make for a viable solution for high performance, OR, low power, OR, low thermal characteristics...but all three at the same time? We think we are the only viable solution right now," he said.

By Movidius' account, the MA2450 mentioned above is the only commercial solution of its kind for computer vision in the market today. VPUs won't displace chipmakers like Qualcomm, MediaTek and Intel, and they won't compete against Nvidia and AMD in the GPU segment either. Instead, they're creating a whole new segment for themselves.

Prasid Banerjee

Trying to explain technology to my parents. Failing miserably.