Accelerating and optimizing data analytics workflows presents several challenges, depending on your perspective and approach. Here are three examples:
- Database vendors approach these workflows from a storage perspective, treating analytics workload problems as an extension of database problems
- Higher-level programming languages and environments such as the JVM force tradeoffs between performance and ease of programming
- Data analytics workflows have been split between frameworks/tools that focus on analytic computation and those that focus on data visualization
In this talk, OmniSci Founder and CEO Todd Mostak takes us on a comprehensive tour of how to use the latest HPC techniques to accelerate analytic SQL and data visualization simultaneously, a capability the company has been honing since 2013.
Topics covered:
- Key lessons OmniSci has learned by re-examining the nature of data-centric workflows, including how successive generations of hardware accelerators provide opportunities and unique technical challenges
- How these workloads can be viewed as a “co-design” problem requiring an understanding of the hardware/infrastructure characteristics and the workload patterns themselves
- Techniques to accelerate analytic workflows by leveraging hardware optimizations at every stage of the workflow, from I/O acceleration to LLVM-based JIT compilation to large-scale, in-situ data visualization and efficient interfaces with ML/DL workflows
Download the software
Get the Intel® oneAPI Base Toolkit, which includes many optimized tools and libraries for data analytics
Additional resources:
Intel® Optane™ DC technology for data centers