Deep Learning Training Efficiency
Deep Learning Requires Learning
From technology to manufacturing, from healthcare to insurance, discussions about the impact of artificial intelligence on a host of industries continue to gather momentum. At present, much of the attention is focused on training accurate models (based on specific data sets) as quickly as possible. And because AI involves a complex matrix of operations and a vast number of iterations, the performance of the computing platform is critical, and we’re not just talking about GPUs. They’re important, of course, but there are many other critical factors that need to be addressed, beginning with GPU processing power vs. data storage performance:
Understanding the characteristics of your chosen DL framework
Dual CPU systems present special challenges, so using two CPUs rather than one is only half the story
Local storage is a major source of bottlenecks in deep learning applications
Selecting the right NVMe drives
Pairing data storage with multi-GPU systems, since adding GPUs is an effective way of accelerating DNN model development
A clear understanding of the inherent latency of your platform
A mismatched storage sub-system can leave the GPU (or multiple GPUs) idle, leading to a severe loss of training efficiency. That makes data richness and DNN training topics, such as aligning data type with storage sub-system performance or solving a latency problem, essential as well.
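To make the idle-GPU point concrete, here is a minimal back-of-the-envelope sketch in Python. It assumes training is limited only by how fast storage can deliver samples, and every number in it (throughput, sample size, GPU count) is hypothetical rather than taken from the whitepaper. The function name `gpu_utilization` is likewise just an illustration.

```python
def gpu_utilization(storage_mb_per_s, samples_per_s_per_gpu, mb_per_sample, num_gpus=1):
    """Rough upper bound on GPU utilization when storage reads are the only limit.

    All inputs are hypothetical planning numbers, not measurements.
    """
    # Data rate the GPUs would consume if they were never kept waiting.
    required_mb_per_s = samples_per_s_per_gpu * mb_per_sample * num_gpus
    # GPUs cannot be busier than the fraction of that rate storage delivers.
    return min(1.0, storage_mb_per_s / required_mb_per_s)


# Assumed workload: 4 GPUs, each able to consume 1000 samples/s at 0.5 MB per sample,
# so the GPUs together would need 2000 MB/s.
slow_storage = gpu_utilization(500, 1000, 0.5, num_gpus=4)
fast_storage = gpu_utilization(3000, 1000, 0.5, num_gpus=4)

print(f"500 MB/s storage:  GPUs busy at most {slow_storage:.0%} of the time")
print(f"3000 MB/s storage: GPUs busy at most {fast_storage:.0%} of the time")
```

Under these assumed numbers, the slower storage caps the four GPUs at a quarter of their potential, which is the kind of mismatch the topics above are meant to avoid. A real platform would also need to account for caching, preprocessing, and latency.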
To be successful, data scientists working to achieve the fastest possible time-to-model need to fully understand the precise hardware configuration necessary for their specific data type. That’s why the new BOXX whitepaper, Deep Learning Training Efficiency: It’s Not Just About GPUs, is essential reading. From GPU processing and data storage performance to data richness and DNN training performance, the deep learning infrastructure experts at BOXX and Cirrascale Cloud Services provide the critical information you need to configure the right AI computing platform.