The exponential growth in the use of large, deep neural networks (DNNs) has made it critical to train these networks in hours, or even minutes.
That kind of speed is out of reach for a single machine: no single node can satisfy the compute, memory, and I/O demands of today's state-of-the-art DNNs.
The answer is scalable, efficient distributed training, enabled by deep learning (DL) frameworks.
Join Intel Software Engineer and deep learning expert Mikhail Smorkalov for an overview of three Intel-optimized DL frameworks—Caffe*, Horovod* (for TensorFlow*), and nGraph—that boost communication performance on distributed workloads compared to existing approaches.
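To give a flavor of what framework-level distributed training looks like, here is a minimal, illustrative sketch of data-parallel training with Horovod and TensorFlow's Keras API. The model and dataset are placeholders chosen for brevity, and the script does not depend on the Intel-optimized builds discussed in the webinar; it is launched with a tool such as `horovodrun -np 4 python train.py`, with one process per worker.

```python
# Minimal sketch: data-parallel training with Horovod on TensorFlow/Keras.
# Model and dataset are placeholders for illustration only.
import tensorflow as tf
import horovod.tensorflow.keras as hvd

hvd.init()  # one process per worker; each process gets a rank

# Common Horovod convention: scale the learning rate by the number of workers.
opt = tf.keras.optimizers.SGD(learning_rate=0.01 * hvd.size())
opt = hvd.DistributedOptimizer(opt)  # averages gradients across workers via allreduce

model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation="relu", input_shape=(784,)),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(loss="sparse_categorical_crossentropy", optimizer=opt)

callbacks = [
    # Broadcast the initial weights from rank 0 so all workers start in sync.
    hvd.callbacks.BroadcastGlobalVariablesCallback(0),
]

(x_train, y_train), _ = tf.keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1, 784).astype("float32") / 255.0

model.fit(x_train, y_train, batch_size=64, epochs=1,
          callbacks=callbacks, verbose=1 if hvd.rank() == 0 else 0)
```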
Additional Resources
Find out more about these optimized frameworks, including how to get them.