New international consortium formed to create trustworthy and reliable generative AI models for science

Written by Charlie Catlett
Originally published by Argonne National Laboratory on 10 November 2023

 

Trillion Parameter Consortium launches with dozens of founding partners from around the world.

The initiative brings together teams of researchers engaged in creating large-scale generative AI models to address key challenges in advancing AI for science.

A global consortium of scientists from federal laboratories, research institutes, academia, and industry has formed to address the challenges of building large-scale artificial intelligence (AI) systems and advancing trustworthy and reliable AI for scientific discovery.

The Trillion Parameter Consortium (TPC) brings together teams of researchers engaged in creating large-scale generative AI models to address key challenges in advancing AI for science. These challenges include developing scalable model architectures and training strategies; organizing and curating scientific data for training models; optimizing AI libraries for current and future exascale computing platforms; and developing deep evaluation platforms to assess progress on scientific task learning, reliability, and trust.

Toward these ends, TPC will:

  • Build an open community of researchers interested in creating state-of-the-art large-scale generative AI models aimed broadly at advancing progress on scientific and engineering problems by sharing methods, approaches, tools, insights, and workflows.
  • Incubate, launch, and coordinate projects voluntarily to avoid duplication of effort and to maximize the impact of the projects in the broader AI and scientific community.
  • Create a global network of resources and expertise to facilitate the next generation of AI and bring together researchers interested in developing and using large-scale AI for science and engineering.

Trillion-parameter models represent the frontier of large-scale AI, with only the largest commercial AI systems currently approaching this scale.

Training large language models (LLMs) with this many parameters requires exascale-class computing resources, such as those being deployed at several U.S. Department of Energy (DOE) national laboratories and at multiple TPC founding partners in Japan, Europe, and elsewhere. Even with such resources, training a state-of-the-art one-trillion-parameter model will require months of dedicated time, which is intractable on all but the largest systems. Consequently, such efforts will involve large, multi-disciplinary, multi-institutional teams. TPC is envisioned as a vehicle to support collaboration and cooperative efforts among and within such teams.

“At our laboratory and at a growing number of partner institutions around the world, teams are beginning to develop frontier AI models for scientific use and are preparing enormous collections of previously untapped scientific data for training,” said Rick Stevens, associate laboratory director of computing, environment and life sciences at DOE’s Argonne National Laboratory. “We collaboratively created TPC to accelerate these initiatives and to rapidly create the knowledge and tools necessary for creating AI models with the ability to not only answer domain-specific questions but to synthesize knowledge across scientific disciplines.”

A list of TPC founding organizations is available here: https://pawsey.org.au/the-founding-partners-of-tpc/