Vector-Aware Programming: Tips & Tricks to Streamline the Process on a Petascale System

The trend for today’s CPUs is core count … and lots of it. (Cases in point: 2nd Gen Intel® Xeon® Scalable processors scale up to 48 cores per CPU. And Intel® Xeon Phi™ processors have as many as 72!) In this environment, vectorizing your code is critical to delivering optimal application performance on core-rich nodes.

So how do you write vectorization-friendly code?

You start by identifying and removing barriers to vectorization, such as inefficient memory access patterns and poor cache usage, and by balancing multi-process programming (MPI) with multi-threaded programming (OpenMP).
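
As one illustration of a memory access barrier, the sketch below contrasts two data layouts for the same update. It is not taken from the presentation: the type and function names (ParticleAoS, ParticlesSoA, shift_x_aos, shift_x_soa) are hypothetical, and whether a given compiler actually vectorizes either loop depends on the compiler and its flags. The `#pragma omp simd` hint is standard OpenMP and is ignored when OpenMP SIMD support is not enabled.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Array-of-structures layout (hypothetical): x, y, z of one particle sit next
// to each other, so reading only x walks memory with a stride of three doubles.
struct ParticleAoS {
    double x, y, z;
};

// Structure-of-arrays layout (hypothetical): all x values are contiguous,
// so a loop over x touches memory with unit stride.
struct ParticlesSoA {
    std::vector<double> x, y, z;
};

// Shift every x coordinate by dx using the strided AoS layout.
void shift_x_aos(std::vector<ParticleAoS>& p, double dx) {
    for (std::size_t i = 0; i < p.size(); ++i) {
        p[i].x += dx;
    }
}

// Same update on the SoA layout: consecutive iterations read and write
// consecutive doubles, which maps directly onto vector loads and stores.
void shift_x_soa(ParticlesSoA& p, double dx) {
    assert(p.x.size() == p.y.size() && p.x.size() == p.z.size());
    double* x = p.x.data();
    const std::size_t n = p.x.size();
    #pragma omp simd
    for (std::size_t i = 0; i < n; ++i) {
        x[i] += dx;
    }
}
```

Both functions compute the same result. The difference is only in how many cache lines each vector's worth of x values spans, which is the kind of effect the memory access and cache usage discussion is about.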

This presentation is a deep dive on how to do both, demonstrated on Frontera, the Texas Advanced Computing Center’s newest petascale system, powered by Intel Xeon Scalable processors.

Watch Ian Wang, an HPC specialist at the University of Texas at Austin, discuss these concepts, including:

The basics of vector-aware programming, dependency analysis, and optimization reports
Guidance in using vector units, the proper placement of tasks/threads, the efficient use of memory bandwidth, and the impact of frequency scaling
Software tools of the trade, including Intel® Math Kernel Library and Intel® C++ Compilers
Code samples and step-by-step instructions
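
To make the dependency analysis topic in the list above concrete, here is a minimal sketch of the distinction a compiler has to draw before it vectorizes a loop. The function names (running_sum, scale_add) are hypothetical and the code is not from the presentation. On the Intel compilers, the `-qopt-report` option can be used to see how each loop was treated; the exact report wording varies by compiler version.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Loop-carried dependence: iteration i reads a[i - 1], which iteration i - 1
// has just written. The iterations cannot simply run side by side in vector
// lanes, so a compiler typically leaves this loop scalar as written.
void running_sum(std::vector<double>& a) {
    for (std::size_t i = 1; i < a.size(); ++i) {
        a[i] += a[i - 1];
    }
}

// No loop-carried dependence: each iteration reads and writes only element i,
// so iterations are independent and the loop is a vectorization candidate.
void scale_add(const std::vector<double>& x, std::vector<double>& y, double alpha) {
    assert(x.size() == y.size());
    const double* xp = x.data();
    double* yp = y.data();
    const std::size_t n = x.size();
    #pragma omp simd
    for (std::size_t i = 0; i < n; ++i) {
        yp[i] += alpha * xp[i];
    }
}
```

For example, calling running_sum on {1, 2, 3, 4} leaves {1, 3, 6, 10}, where each result depends on the one before it, while scale_add with alpha = 2 on x = {1, 2, 3} and y = {10, 20, 30} leaves y = {12, 24, 36}, with each element computed independently.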

Download the software

Get Intel® Math Kernel Library, one of five free Intel® Performance Libraries
Get Intel® C++ Compiler as part of Intel® Parallel Studio XE or Intel® System Studio
Get Intel® Advisor as a standalone tool, or as part of Intel® Parallel Studio XE or Intel® System Studio

Ian Wang, Research Associate, HPC Performance & Architectures Group, University of Texas at Austin

Ian joined the Texas Advanced Computing Center (TACC) in 2018 as a Research Associate in the Performance and Architecture Group. He currently focuses on system performance analysis and industry-standard benchmarking on HPC platforms. He also helps TACC users port, analyze, and improve their research software. Before joining TACC, he held a position at the Center for Research in Extreme Scale Technologies (CREST) at Indiana University, where he worked on asynchronous multi-tasking runtime system development and RDMA networking library integration. He also worked as a Research Geophysicist at the Indiana Geological and Water Survey, where he was involved in the collaborative development, with Los Alamos National Lab and the Pervasive Technology Institute, of a science gateway for the simulation and assessment of CO2 capture and storage technologies. Ian holds a Ph.D. in Geophysics from Indiana University.
