Data Reduction/Refactoring

Join specialists from Oak Ridge National Laboratory to learn about data reduction/refactoring using ADIOS.

Research talk and Q&A, August 30, from 9:30 – 11:00am AWST / 11:30am – 1:00pm AEST.

Experimental, observational, and computational facilities face a data crisis.

New technologies allow data to be generated at unprecedented rates and scales in fields ranging from high energy physics and nuclear physics to radio astronomy and photon sciences.  The resulting data streams are scientifically rich; however, it is becoming increasingly impractical to store them in their entirety. And, while great strides have been made in data reduction in fields such as digital media, scientific data presents unique challenges.

Furthermore, one of our greatest challenges is to process and annotate all of the data being produced by large scientific instruments. We have entered a new age in which data can be generated at unprecedented rates, and although it is possible to store a large amount of this data, it is impossible to analyze all of it. To fully realize the power and capability of these new scientific facilities, new investments are needed to reduce the size and storage footprint of this data and the time spent querying and analyzing it. Without such investments, ad hoc decisions will be made to remove data, which will inevitably lead to the loss of important information.

R&D is urgently needed.

New methods for reducing streaming and voluminous data sets while maintaining accurate representations of scientifically relevant derived quantities of interest (QoIs) are of critical and growing importance for today’s science. Such methods are of particular importance for the dozens of large scientific facilities, with aggregate costs in the billions, operated by science agencies around the world. Many of these facilities face critical decisions concerning the allocation of resources for moving, processing, and storing data. R&D is urgently needed, both to develop improved data reduction methods and to improve understanding of how these methods can be used effectively within facilities and in scientific campaigns.

The co-designing of critical software infrastructure.

To tackle these goals, Dr. Scott Klasky, Dr. Norbert Podhorszki, and their group have worked closely with many large-scale applications and researchers to co-design critical software infrastructure for these communities. These research artifacts have been fully integrated into many of the largest simulations and experiments, and have increased the performance of these codes by over 10X. This impact was recognized with an R&D 100 award in 2013 and was highlighted in the 2020 US Department of Energy (DOE) Advanced Scientific Computing Research (ASCR) @40 report. The presentation will cover the research behind three major contributions: large-scale self-describing parallel I/O (ADIOS), in situ/streaming data (SST), and data reduction/refactoring (MGARD).

The focus of this 90-minute session.

During this session, Dr. Klasky and Dr. Podhorszki will introduce the overall concepts and present several results from the group’s research, which has been applied and fully integrated into many of the world’s largest scientific applications. They will also go through several examples showing how MGARD can be used with ADIOS to reduce the overall size of fusion data by over 1000X while keeping errors below 10^-8.

Dr. Klasky will conclude the session by describing a new consortium that the group is starting to help tackle these problems for scientific data.

What is ADIOS?

ADIOS is a high-performance publish/subscribe I/O framework that has been designed and developed for the exascale computing era. In brief, ADIOS:

  • Is integrated into most of the popular analysis and visualization packages.
  • Has strict continuous integration practices, providing stable, portable and efficient I/O services.
  • Has a programming interface designed for easy switching from files, to streams on HPC machines, to streams over the wide area network.

ADIOS is also a research framework for new I/O technologies, pushing the boundaries beyond current use cases.

This is a hybrid event – you can join in-person in Perth or online. 

Register your Expression of Interest