Deploying deep learning models at the edge for real-time inference can significantly reduce the cost of communicating with the cloud: network bandwidth, network latency, and power consumption.
But there’s a flip side: edge devices have limited memory, compute, and power. As a result, traditional 32-bit floating-point precision is often too computationally heavy for embedded deep learning inference workloads.
The Intel® Distribution of OpenVINO™ toolkit offers a solution via INT8 quantization—deep learning inference with 8-bit multipliers.
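For intuition, INT8 quantization maps 32-bit floating-point values onto 8-bit integers with a scale factor, so the heavy multiply-accumulate work can run on 8-bit multipliers and be rescaled back to floating point afterward. The snippet below is a minimal NumPy sketch of that idea (symmetric, per-tensor quantization); it is illustrative only and is not the OpenVINO™ calibration flow itself.

```python
import numpy as np

def quantize_symmetric(x, num_bits=8):
    """Map float32 values to signed 8-bit integers using a single per-tensor scale."""
    qmax = 2 ** (num_bits - 1) - 1          # 127 for INT8
    scale = np.max(np.abs(x)) / qmax        # per-tensor scale factor
    q = np.clip(np.round(x / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale

# Toy example: a dot product computed with 8-bit multipliers.
weights = np.random.randn(256).astype(np.float32)
activations = np.random.randn(256).astype(np.float32)

qw, sw = quantize_symmetric(weights)
qa, sa = quantize_symmetric(activations)

# INT8 x INT8 products are accumulated in a wider integer type (INT32),
# then rescaled back to float32 with the product of the two scales.
int_accum = np.dot(qw.astype(np.int32), qa.astype(np.int32))
approx = int_accum * (sw * sa)

print("float32 result:", np.dot(weights, activations))
print("INT8 result:   ", approx)
```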
Join deep learning expert Alex Kozlov for a deeper dive into achieving better performance with less overhead on Intel® CPUs, GPUs, and VPUs using the OpenVINO™ toolkit’s latest INT8 Calibration Tool and Runtime. He’ll cover:
- New features such as asymmetric quantization, bias correction, and weight equalization that improve the accuracy of lower-precision inference workloads (asymmetric quantization is sketched after this list)
- How to make the best use of OpenVINO’s enhanced capabilities in your AI applications
- How INT8 accelerates computation, saves memory bandwidth and power, and provides better cache locality
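As a rough illustration of the asymmetric quantization mentioned above: instead of a single scale centered at zero, an asymmetric scheme also uses a zero point, which lets skewed value ranges (for example, post-ReLU activations that are never negative) use the full 8-bit range. The sketch below is a minimal NumPy example under those assumptions, not the toolkit’s actual implementation.

```python
import numpy as np

def quantize_asymmetric(x, num_bits=8):
    """Map float32 values to unsigned 8-bit integers with a scale and zero point."""
    qmin, qmax = 0, 2 ** num_bits - 1        # 0..255 for unsigned 8-bit
    x_min, x_max = float(np.min(x)), float(np.max(x))
    scale = (x_max - x_min) / (qmax - qmin)
    zero_point = int(round(qmin - x_min / scale))
    q = np.clip(np.round(x / scale) + zero_point, qmin, qmax).astype(np.uint8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover an approximate float32 tensor from its quantized form."""
    return (q.astype(np.float32) - zero_point) * scale

# Post-ReLU activations are non-negative, so a symmetric scheme would waste
# half of the 8-bit range; an asymmetric scheme shifts the range instead.
activations = np.maximum(np.random.randn(1000).astype(np.float32), 0.0)
q, scale, zp = quantize_asymmetric(activations)
print("max abs error:", np.max(np.abs(dequantize(q, scale, zp) - activations)))
```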
Get the software
Download the latest version of the Intel® Distribution of OpenVINO™ toolkit so you can follow along during the webinar.
More resources