Deploying deep learning models at the edge for real-time inference can significantly reduce the cost of communicating with the cloud: network bandwidth, network latency, and power consumption.
But there’s a flip side: edge devices have limited memory, compute, and power. As a result, traditional 32-bit floating-point precision is often too computationally heavy for embedded deep learning inference workloads.
The Intel® Distribution of OpenVINO™ toolkit offers a solution via INT8 quantization—deep learning inference with 8-bit multipliers.
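For intuition, INT8 quantization maps 32-bit floating-point values onto 8-bit integers with a scale factor, so the heavy multiply-accumulate work can run on 8-bit multipliers and be rescaled back to floating point afterward. The snippet below is a minimal NumPy sketch of that idea (symmetric, per-tensor quantization); it is illustrative only and is not the OpenVINO™ calibration flow itself.

```python
import numpy as np

def quantize_symmetric(x, num_bits=8):
    """Map float32 values to signed 8-bit integers using a single per-tensor scale."""
    qmax = 2 ** (num_bits - 1) - 1          # 127 for INT8
    scale = np.max(np.abs(x)) / qmax        # per-tensor scale factor
    q = np.clip(np.round(x / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale

# Toy example: a dot product computed with 8-bit multipliers.
weights = np.random.randn(256).astype(np.float32)
activations = np.random.randn(256).astype(np.float32)

qw, sw = quantize_symmetric(weights)
qa, sa = quantize_symmetric(activations)

# INT8 x INT8 products are accumulated in a wider integer type (INT32),
# then rescaled back to float32 with the product of the two scales.
int_accum = np.dot(qw.astype(np.int32), qa.astype(np.int32))
approx = int_accum * (sw * sa)

print("float32 result:", np.dot(weights, activations))
print("INT8 result:   ", approx)
```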
Join deep learning expert Alex Kozlov for a deeper dive into achieving better performance with less overhead on Intel® CPUs, GPUs, and VPUs using the OpenVINO™ toolkit’s latest INT8 Calibration Tool and Runtime. He’ll cover:
- New features such as asymmetric quantization, bias correction, and weight equalization that improve the accuracy of lower-precision inference workloads (asymmetric quantization is sketched after this list)
- How to make the best use of OpenVINO’s enhanced capabilities in your AI applications
- How INT8 accelerates computation, saves memory bandwidth and power, and provides better cache locality
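As a rough illustration of the asymmetric quantization mentioned above: instead of a single scale centered at zero, an asymmetric scheme also uses a zero point, which lets skewed value ranges (for example, post-ReLU activations that are never negative) use the full 8-bit range. The sketch below is a minimal NumPy example under those assumptions, not the toolkit’s actual implementation.

```python
import numpy as np

def quantize_asymmetric(x, num_bits=8):
    """Map float32 values to unsigned 8-bit integers with a scale and zero point."""
    qmin, qmax = 0, 2 ** num_bits - 1        # 0..255 for unsigned 8-bit
    x_min, x_max = float(np.min(x)), float(np.max(x))
    scale = (x_max - x_min) / (qmax - qmin)
    zero_point = int(round(qmin - x_min / scale))
    q = np.clip(np.round(x / scale) + zero_point, qmin, qmax).astype(np.uint8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover an approximate float32 tensor from its quantized form."""
    return (q.astype(np.float32) - zero_point) * scale

# Post-ReLU activations are non-negative, so a symmetric scheme would waste
# half of the 8-bit range; an asymmetric scheme shifts the range instead.
activations = np.maximum(np.random.randn(1000).astype(np.float32), 0.0)
q, scale, zp = quantize_asymmetric(activations)
print("max abs error:", np.max(np.abs(dequantize(q, scale, zp) - activations)))
```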
Get the software
Download the latest version of the Intel® Distribution of OpenVINO™ toolkit so you can follow along during the webinar.
More resources