Tutorial:
Metall – A Persistent Memory Allocator for Accelerating Data Analytics
In this tutorial we introduce Metall, a persistent memory allocator designed to provide developers with an API to allocate custom C++ data structures in both block-storage and byte-addressable persistent memories (e.g., NVMe SSD and Intel Optane DC Persistent Memory). Metall relies on a file-backed mmap mechanism to provide applications with transparent access to the data stored in persistent memory. Additionally, Metall combines state-of-the-art allocation algorithms with the rich C++ interface of Boost.Interprocess and provides persistent memory snapshotting (versioning) capabilities.
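To make the file-backed mmap idea concrete, the following is a minimal sketch using plain POSIX calls. It is not Metall's implementation or API: the helper names map_file and unmap_file are hypothetical, and a real allocator would add its own bookkeeping on top of the mapped region.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

#include <fcntl.h>     // open
#include <sys/mman.h>  // mmap, msync, munmap
#include <unistd.h>    // ftruncate, close

// Hypothetical helper (not part of Metall): maps the file at `path` into
// memory as a shared, file-backed region of `n_bytes`, creating the file
// if it does not exist yet.
void *map_file(const std::string &path, std::size_t n_bytes) {
  assert(n_bytes > 0);
  const int fd = ::open(path.c_str(), O_RDWR | O_CREAT, 0600);
  if (fd == -1) throw std::runtime_error("open failed: " + path);

  // Size the file so that every mapped page is backed by it.
  // An existing file of the same size keeps its contents.
  if (::ftruncate(fd, static_cast<off_t>(n_bytes)) == -1) {
    ::close(fd);
    throw std::runtime_error("ftruncate failed: " + path);
  }

  // MAP_SHARED makes stores to the region reach the underlying file.
  void *addr = ::mmap(nullptr, n_bytes, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
  ::close(fd);  // the mapping stays valid after the descriptor is closed
  if (addr == MAP_FAILED) throw std::runtime_error("mmap failed: " + path);
  return addr;
}

// Hypothetical helper: flushes dirty pages to the file, then removes the mapping.
void unmap_file(void *addr, std::size_t n_bytes) {
  ::msync(addr, n_bytes, MS_SYNC);
  ::munmap(addr, n_bytes);
}
```

With these helpers, a program can store a value through the pointer returned by map_file, call unmap_file, and later map the same path again (in the same or a different process) to read the value back, which is the kind of transparent access described above.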
An often overlooked but common theme among the variety of data analytics platforms is the need to persist data beyond a single process lifecycle. For example, data analytics applications usually perform a data ingestion task, which indexes and partitions data with analytics-specific data structures before performing the analysis. However, the data ingestion task is often more expensive than the analytic itself, and the same or derived data is re-ingested frequently (e.g., when running multiple analytics on the same data or when developing and debugging analytics programs). The promise of persistent memory is that, once constructed, data structures can be re-analyzed and updated beyond the lifetime of a single execution. Thanks to the recent notable performance improvements and cost reductions in non-volatile random-access memory (NVRAM) technology, we believe that leveraging persistent memory in this way brings significant benefits to data analytics applications.
We begin this tutorial by introducing the technology necessary for using persistent memory and Metall. Then, we cover Metall itself: how to allocate data in persistent memory with Metall and how Metall is architected internally. Application case studies using Metall will also be presented. Finally, in a hands-on section, we will stay online to work with anyone wishing to experiment with the provided example code. Metall is available at https://github.com/LLNL/metall.