Unlocking the Genetic Code

Project Leader: A/Prof. Scott G Wilson, University of Western Australia

Susceptibility to human disease has been a cornerstone of genetic research for several decades. Adjunct Associate Professor Scott G. Wilson, from the University of Western Australia’s School of Medicine and Pharmacology, is leading a team of researchers to study genetic and epigenetic data to determine which factors govern a predisposition to particular complex diseases.

 
Partner Institution: University of Western Australia
System: Magnus
Areas of science: Genetic Epidemiology
Applications used: SAMtools, GATK, R statistical package, GCTA, PLINK 1.9, GWAMA, GTOOL, LINKAGE, SNPTEST, GWAF, GEMMA, ANNOVAR

Using the Magnus system at the Pawsey Supercomputing Centre, Professor Wilson and his team have made significant progress in research on the genetic determinants of diseases such as osteoporosis, polycystic ovarian syndrome, autoimmune disease, thyroid metabolism and cancer, and muscle and liver disease.

The Challenge

In collaboration with other researchers, Professor Wilson has undertaken numerous genome-wide association studies on thousands of individuals. Researchers investigated various gene characteristics of endocrine, bone, liver and muscle phenotypes. In examining these characteristics, Professor Wilson aims to uncover the key role that certain genes play in the onset of complex diseases. Such extensive studies, however, generated vast amounts of data for the team to analyse. According to Professor Wilson, the project called for large-scale genetic analyses involving complex mathematical procedures that exceeded the capability of standard computing facilities.

“It’s often difficult for people to comprehend just how much data is generated from whole genome sequencing. For example, while the entire genome of each person fits inside each cell of the body, that DNA code contains approximately 3 billion nucleotide base pairs and the data from sequencing is about 300 gigabytes in size. You’d only fit data for one or two individuals on an average desktop PC and even then wouldn’t have enough memory (RAM) to manipulate the data. We have whole genome sequence data from thousands of people available to us for study, but it’s not practical for research teams to network thousands of PCs together.”
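The scale described in the quote can be checked with some back-of-envelope arithmetic. The sketch below is illustrative only: the 300 gigabytes per genome comes from the quote, while the desktop disk size and the cohort size are assumed values chosen for the example, not figures from the project.

```python
# Back-of-envelope storage estimate for whole genome sequencing data.
GB_PER_GENOME = 300     # from the quote: about 300 GB of sequencing data per person
DESKTOP_DISK_GB = 500   # assumption: disk capacity of an average desktop PC
COHORT_SIZE = 2000      # assumption: a stand-in for "thousands of people"


def genomes_per_disk(disk_gb, gb_per_genome=GB_PER_GENOME):
    """Number of whole genomes that fit on a disk of the given size."""
    return disk_gb // gb_per_genome


def cohort_storage_tb(n_people, gb_per_genome=GB_PER_GENOME):
    """Total storage for a cohort, in decimal terabytes (1 TB = 1000 GB)."""
    return n_people * gb_per_genome / 1000


if __name__ == "__main__":
    print("Genomes per desktop disk:", genomes_per_disk(DESKTOP_DISK_GB))
    print("Cohort storage (TB):", cohort_storage_tb(COHORT_SIZE))
```

Under these assumed values a single desktop disk holds one genome, and a cohort of 2,000 people needs about 600 terabytes, which is consistent with the point that desktop hardware cannot hold or process data at this scale.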

The Solution

The Magnus system, the powerful supercomputer at the Pawsey Supercomputing Centre, provided the resources the research team needed to manage the project's large datasets. As a result, efficient analyses of genetic disease susceptibility became possible.

“Access to the Pawsey Centre supercomputing infrastructure was a key element of our research program because of the profound increase in volume and complexity of the whole exome and whole genome sequencing data that we are now working with in medical research,” said Professor Wilson. “We have access to a number of other international supercomputing facilities, but to my mind the resources available at the Pawsey Supercomputing Centre are state-of-art. I can’t overstate the importance of the assistance we received from the local technical and support staff in getting our analyses running and data processed efficiently.”

Outcome

Using the world-class Pawsey supercomputing facilities, Professor Wilson and his team of researchers were able to process immense amounts of data, streamlining the task of pinpointing relevant genetic information. As a result, hereditary vulnerability to diseases such as thyroid cancer, polycystic ovarian syndrome, and osteoporosis can be identified and that knowledge applied in future medical applications. Unlocking the genetic code underlying complex diseases may help researchers predict familial patterns of illness and lead to processes for disease minimisation and effective treatment. Supercomputers now play a fundamental role in providing large-scale analyses for projects such as Professor Wilson’s.

“Supercomputers are an essential tool for Western Australian researchers and I feel really privileged to have access to this highly sophisticated data processing infrastructure,” said Professor Wilson. “I’m very grateful to the visionary Information Technology leaders and politicians who had the foresight to establish this facility – we are now reaping the rewards from their efforts.”
