[Cross-post from the author's blog]
The renewed interest in artificial intelligence in the past decade has been a boon for the graphics cards industry. Companies like Nvidia and AMD have seen a huge boost to their stock prices as their GPUs have proven to be very efficient for training and running deep learning models. Nvidia, in fact, has even pivoted from a pure GPU and gaming company to a provider of cloud GPU services and a competent AI research lab.
But GPUs also have inherent flaws that pose challenges in putting them to use in AI applications, according to Ludovic Larzul, CEO and co-founder of Mipsology, a company that specializes in machine learning software.
The solution, Larzul says, is field-programmable gate arrays (FPGAs), an area where his company specializes. An FPGA is a type of processor that can be customized after manufacturing, which makes it more efficient than generic processors. FPGAs, however, are very hard to program, a problem that Larzul hopes to solve with a new platform his company has developed.
Specialized AI hardware has become an industry of its own and the jury is still out on what will be the best infrastructure for deep learning algorithms. But if Mipsology succeeds in its mission, it could be beneficial to many AI developers where GPUs are currently struggling.
The challenges of using GPUs for deep learning
Three-dimensional graphics, the original reason GPUs are packed with so much memory and computing power, have one thing in common with deep neural networks: They require massive amounts of matrix multiplications.
Graphics cards can perform matrix multiplications in parallel, which speeds up operations tremendously. Graphics processors can cut down the time of training neural networks from days and weeks to hours and minutes.
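To make the link between neural networks and matrix multiplication concrete, here is a minimal sketch in plain Python. The layer sizes and the values in `inputs` and `weights` are hypothetical and chosen only for illustration; real frameworks hand this work to optimized GPU kernels rather than Python loops.

```python
def matmul(a, b):
    """Multiply an (n x k) matrix by a (k x m) matrix, both given as lists of rows."""
    n, k, m = len(a), len(b), len(b[0])
    assert all(len(row) == k for row in a), "inner dimensions must match"
    # Each output cell depends on only one row of `a` and one column of `b`,
    # so all n * m cells could be computed at the same time on parallel hardware.
    return [[sum(a[i][p] * b[p][j] for p in range(k)) for j in range(m)]
            for i in range(n)]

# Hypothetical toy layer: a batch of 2 inputs with 3 features each, 2 output neurons.
inputs = [[1.0, 2.0, 3.0],
          [4.0, 5.0, 6.0]]
weights = [[1.0, 0.0],
           [0.0, 1.0],
           [1.0, 1.0]]

# The forward pass of a fully connected layer (bias and activation omitted).
outputs = matmul(inputs, weights)
print(outputs)
```

Because no output cell depends on any other, the work can be spread across thousands of GPU cores, which is where the speedup described above comes from.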
Aside from the rising stock of graphics hardware companies, the appeal of GPUs in deep learning has given rise to a host of public cloud services that offer virtual machines with strong GPUs for deep learning projects.
But graphics cards also have hardware and environmental limitations. “Neural network training is typically conducted in an environment that is not comprehensive of the varying constraints that the system running the neural network will experience in deployment – this can put a strain on the real-world use of GPUs,” Larzul says.
GPUs require a lot of electricity, produce a lot of heat, and use fans for cooling. This is not much of a problem when you’re training your neural network on a desktop workstation, a laptop computer, or a server rack. But many of the environments where deep learning models are deployed are not friendly to GPUs, such as self-driving cars, factories, robotics, and many smart-city settings where the hardware has to endure environmental factors such as heat, dust, humidity, motion, and electricity constraints.
“Some critical applications like smart cities’ video surveillance require hardware to be exposed to environmental factors (like the sun, for example) that negatively impact GPUs,” Larzul says. “GPUs are at the technology limit of transistors, causing them to run at high temperatures and require significant cooling, which is not always possible. This means more power, electricity, maintenance costs, etc.”
Lifespan is also an issue. In general, GPUs last around 2-5 years, which isn’t a major issue for gamers, who usually replace their computers every few years. But in other domains, such as the automotive industry, where there’s an expectation of higher durability, it can become problematic, especially as GPUs can die out faster due to exposure to environmental factors and more intense usage.
“When factoring in the commercial viability of applications like autonomous vehicles, which could require as many as 7-10 GPUs—most of which will fail in less than four years—the cost of a smart or autonomous vehicle becomes impractical for most car purchasers,” Larzul says.
Other industries such as robotics, health care, and security systems face similar challenges.
FPGAs and deep learning
FPGAs are customizable hardware devices that have adaptable components, so they can be optimized for specific types of architectures, such as convolutional neural networks. Their customizability reduces their electricity requirements and gives them higher performance in terms of acceleration and throughput. They also have a longer lifespan, about 2-5 times that of GPUs, and are more resistant to rugged settings and environmental factors.
Some companies are already using FPGAs in their AI products. An example is Microsoft, which provides its FPGA-powered machine learning technology as part of the offerings of its Azure cloud service.
But the problem with FPGAs is that they are very hard to program. Configuring FPGAs requires expertise in hardware description languages such as Verilog or VHDL. Machine learning programs are written in high-level languages such as Python or C, and converting their logic to FPGA instructions is very difficult. Running neural networks modeled with TensorFlow, PyTorch, Caffe, and other frameworks on FPGAs would normally require considerable manual time and effort.
“To program an FPGA, you need to assemble a team of hardware engineers who know how to develop FPGAs, hire a good architect who understands neural networks, spend a few years developing a hardware model, and compile it for an FPGA while facing the problem of reaching high usage or high frequency,” Larzul says. “Meanwhile, you need to have extensive math skills to accurately compute the models with less precision and a team of software people to map the AI framework models to the hardware architecture.”
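To give a sense of what computing a model "with less precision" involves, the sketch below maps floating-point weights to 8-bit integers with a single shared scale. This is a generic textbook technique (symmetric linear quantization), shown only as an illustration; it is not a description of how Zebra or any particular FPGA toolchain works, and the function names and weight values are hypothetical.

```python
def quantize_int8(values):
    """Map floats to integers in [-127, 127] using one shared scale factor."""
    max_abs = max(abs(v) for v in values)
    # Assumption: one scale for the whole list; guard against all-zero input.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    return [round(v / scale) for v in values], scale

def dequantize(quantized, scale):
    """Recover approximate floats from the integer representation."""
    return [q * scale for q in quantized]

# Hypothetical weights, chosen only for illustration.
weights = [0.5, -1.27, 0.02, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# The rounding error per weight is at most half of one quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(q, scale, max_error)
```

The math skill the quote refers to lies in choosing scales and bit widths so that these small per-weight errors do not add up to a noticeable loss of accuracy across a whole network.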
Mipsology, Larzul’s company, aims to bridge that gap with Zebra, a software platform that allows developers to easily port their deep learning code to FPGA hardware.
“We offer a software abstraction layer that conceals the complexity that would normally require high level FPGA expertise,” Larzul says. “Simply load Zebra, type a single Linux command and Zebra goes to work – it requires zero compiling, zero changes to your neural network, and zero new tools to learn. And you can keep your GPU for training.”
Zebra provides an abstraction layer that translates deep learning code to FPGA hardware instructions
The AI hardware landscape
Mipsology’s Zebra platform is one of several efforts that can open the path for many developers to explore the usage of FPGAs in their AI projects. Xilinx, the leading developer of FPGA cards, has endorsed Zebra and integrated it into its boards.
Other companies like Google and Tesla are creating their own specialized AI hardware and offering it in cloud or edge environments. There are also efforts in neuromorphic chips, computer architectures that have been specially designed for neural networks. Intel is leading efforts in neuromorphic computing and already has several working models, but the field is still in early adoption.
There are also application-specific integrated circuits (ASICs), chips manufactured for one very specific AI task. But ASICs lack the flexibility of FPGAs and can’t be reprogrammed.
“We decided to focus on software and how to compute neural networks for performance or lower latency. Zebra runs on FPGAs so that it can power AI inference without having to replace hardware. We get high performance thanks to the great efficiency and short development cycles with each refresh of the FPGA firmware. Also, many choices of FPGAs exist to adapt to the right market,” Larzul says.