Ibrahim, Khaled Z; Williams, Samuel W; Epifanovsky, Evgeny; Krylov, Anna I

doi:10.1109/hipc.2014.7116881

Analysis and Tuning of Libtensor Framework on Multicore Architectures

2014

Published Web Location

https://doi.org/10.1109/hipc.2014.7116881

Abstract

Libtensor is a framework designed to implement the tensor contractions arising from the coupled-cluster and equation-of-motion methods of computational quantum chemistry. It has been optimized to exploit symmetry and sparsity for memory efficiency, which allows it to run efficiently on ubiquitous, cost-effective SMP architectures. Unfortunately, the movement of memory controllers on chip has endowed these SMP systems with strong NUMA properties. Moreover, the manycore trend in processor architecture demands that the implementation be extremely thread-scalable on node. To date, Libtensor has been generally agnostic of these effects. In this paper, we therefore explore a number of optimization techniques, including a thread-friendly and NUMA-aware memory allocator and garbage collector, tuning of the tensor tiling factor, and tuning of the scheduling quanta. Our optimizations improve the performance of contractions implemented in Libtensor by up to 2× on representative Ivy Bridge, Nehalem, and Opteron SMPs.

Computing Sciences