COVID-19 – Galaxy Australia COVID-19 dedicated Pulsar

Galaxy Australia provides an accessible, reproducible, and transparent computational biology service for Australian life scientists. Working with our global Galaxy partners in the USA and Europe (headquartered in Germany), we deployed an agreed and tested set of tools for analysing the SARS-CoV-2 virus: its genetics, gene expression profiles, the evolution of virus variants, chemical complexity profiles and protein complexity profiles. Researchers can bring their own SARS-CoV-2 data or access public repositories of virus data for analysis and reanalysis, to help manage and combat the COVID-19 pandemic.

Principal investigator

Gareth Price g.price@qcif.edu.au

Area of science

Bioinformatics, Computational Biology

Systems used

Nimbus

Applications used

Galaxy Australia (https://usegalaxy.org.au)
Partner Institution: QCIF Facility for Advanced Bioinformatics / The University of Queensland | Project Code: A000203

The Challenge

The democratisation of genetic information, through rapid and cheap DNA sequencing, has led to an explosion of data and, as a by-product, a raft of inconsistent analytical approaches to data processing. This was plainly evident in early 2020, when the global rush to document people with COVID-19 and to describe the SARS-CoV-2 virus led to publications with highly variable (and in some cases completely inappropriate) bioinformatic approaches to analysing the viral genome. The Galaxy Australia dedicated COVID-19 processing resource (a Pulsar in Galaxy language) at Pawsey gives researchers access to globally accepted best-practice workflows, making sure that the most accurate virus information enters the public domain and is used to help manage and combat the COVID-19 pandemic.

The Solution

Galaxy Australia relies on deployments called Pulsars, which run remotely from the head node, to increase the scope and number of jobs that can be run on the service. Galaxy can be configured to send only certain jobs to an individual Pulsar. In April 2020, Galaxy Australia collaborated with its global Galaxy partners to deploy the tools and workflows for SARS-CoV-2 analysis onto Australian cloud infrastructure at the Pawsey Supercomputing Centre. Because resources are allocated specifically to SARS-CoV-2 related tools, researchers analysing the virus have access to the fastest and most appropriately resourced queuing system for the tools they require.
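
The idea of sending only certain jobs to a dedicated Pulsar can be illustrated with a minimal Python sketch. It is not Galaxy Australia's actual configuration: the tool identifiers, the destination names and the choose_destination function are all hypothetical, and the sketch assumes that routing depends only on which tool a job runs.

```python
# Illustrative sketch only: every name below is hypothetical and does not
# come from Galaxy Australia's real job configuration.

# Assumed set of tool identifiers reserved for SARS-CoV-2 analysis.
COVID_TOOLS = {
    "example_read_mapper",
    "example_variant_caller",
    "example_variant_annotator",
}

# Assumed destination names: one dedicated Pulsar, one general-purpose queue.
DEDICATED_DESTINATION = "pulsar_covid"
DEFAULT_DESTINATION = "default_queue"


def choose_destination(tool_id: str) -> str:
    """Return the destination name for a job, based only on its tool id."""
    if tool_id in COVID_TOOLS:
        return DEDICATED_DESTINATION
    return DEFAULT_DESTINATION


if __name__ == "__main__":
    for tool in ("example_variant_caller", "example_text_filter"):
        print(tool, "->", choose_destination(tool))
```

Keeping the list of tools separate from the routing function means that adding a tool to the dedicated resource only requires changing the set, not the logic.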

The Outcome

Since the dedicated Pulsar was created in April 2020, the tools associated with SARS-CoV-2 analysis have been executed 10,744 times, with the Pulsar handling all the data processing. Galaxy Australia also built on the robust nature of the Galaxy platform in a three-way demonstration of the reproducibility of virus analysis with our partners in the USA and Germany, Galaxy Main and Galaxy Europe respectively. Alongside two publications, the Covid-19@Galaxy Project (https://covid19.galaxyproject.org/) website was created to act as a portal to the global Galaxy-based SARS-CoV-2 resources, including those made available through the Pawsey / NCI Covid-19 Accelerated Access Initiatives.

List of Publications

No more business as usual: Agile and effective responses to emerging pathogen threats require open data and open analytics
PLOS Pathogens, August 13, 2020; doi: https://doi.org/10.1371/journal.ppat.1008643

Freely accessible ready to use global infrastructure for SARS-CoV-2 monitoring
Submitted to Nature Methods, bioRxiv 2021.03.25.437046; doi: https://doi.org/10.1101/2021.03.25.437046
(Published in 2021; used the Pawsey Nimbus allocation made available in 2020.)