Pawsey Internship Alumni: Celebrating Achievement!
Since 2005, more than 250 students have spent more than 2500 weeks researching and contributing to computational algorithms, data science, cloud computing, and more – through the Pawsey Summer Internship Programmes.
The story does not stop at the 2500 hours…
… These Alumni keep on giving through:
- Activities “close to home”, e.g., participating as Pawsey Intern Mentors and Intern poster judges, and
- Activities in the wider community, through their contributions to science, industry, government, education and more.
Pawsey Intern Alumni are recognised and celebrated here. Click on a profile below to view short summaries, or search using the filters.
If you’re a Pawsey Alumni and you would like to have your profile included, contact training@pawsey.org.au. Let’s celebrate you!
The information flow in social networks is extensive, containing thousands of posts and reactions from people. This information is subjective and based on people’s personal experiences or opinions. One of its potential drawbacks is a significant amount of malicious users spreading doctored and fake information. Such conduct often leads to cyberbullying against individuals. Cyberbullying appears in various forms, e.g., posting offensive images or videos or sharing information without the owner’s permission. However, most commonly, it is in a textual format. Such heterogeneous data poses multiple fundamental processing challenges. Aashima’s project involves developing a hybrid model to detect these incidents.
Duplex stainless steel (DSS) alloys are increasingly utilised in oil and gas applications, given their high corrosion resistance. However, the susceptibility of DSS to environmentally assisted degradation (i.e., stress corrosion cracking, SCC) in production environments is not well understood, such that its operating limits in standards are perceived as overly conservative.
In Abraham’s project, a model based on Bayesian Networks (BN) is being developed to provide an explanatory framework for SCC of DSS in production environments. Nonetheless, it is required to conduct comparative studies between the results obtained by BNs and other machine learning (ML) approaches (e.g., XGBoost, Neural Networks), so as to validate the accuracy of preliminary results.
First predicted in 1916 by Albert Einstein, the detection of gravitational waves a hundred years later ushered in a new era of gravitational-wave astronomy. The detection of gravitational waves is now a regular occurrence, and astronomers are busy using this valuable data to learn about the properties of the extreme matter in neutron stars, probe the limits of general relativity and otherwise illuminate the cosmological mysteries of the universe.
Aidan’s project aims to explore the potential for near-term quantum algorithms to assist with gravitational wave-matched filtering. It will investigate the use of several available quantum methods, including the well-known Grover’s search algorithm and the QAOA/QWOA quantum optimisation algorithms.
Cell membranes are the primary site of damage during cryopreservation in liquid nitrogen due to the formation of ice both inside and outside of cells. Cryosolvents are used to promote the formation of a glassy state of water and prevent the formation of damaging ice, but their presence can also be damaging to the integrity of cell membranes.
Aonghas’ project will will use molecular dynamics (MD) simulations to study the mechanism of action of cryosolvents with model cell membranes of complex lipid composition at different levels of desiccation to mimic the process of dehydration that occurs during cryopreservation, with two specific aims: (1) describe in atomistic detail the interactions of cryosolvents with cell membranes and the changes in membrane properties, and (2) conduct 3D stereoscopic visualization of these molecular events. Molecular simulation and 3D visualization will enable the elucidation of what is a complex, challenging and yet fascinating problem in cryogenics.
Modification of stomatal traits improves plant water-use efficiency (WUE). Stomata have thus become a target for improving the drought tolerance of Australian crop varieties.
Brittany’s project aims to identify novel polymorphisms in genes involved in stomatal development as future targets to enhance crop WUE. As part of project outcomes, the student will determine the extent of divergence between stomatal development genes across the grass family and identify polymorphisms in stomatal development genes in barley, wheat, and oat that may have resulted from breeding or natural selection for improved WUE.
Most bushfires in Australia usually occur in remote areas where network infrastructure is unavailable to support ICT services. Adopting Space-Air-Ground Integrated Network (SAGIN) can allow ICT services in an infrastructure-less environment like national parks in Australia to detect bushfires, initiate responses and alert other fire-prone sites. Carl’s project aims to model such a system by interfacing lightweight solar-powered sensing devices residing on the ground with UAVs providing proximate computing facilities and on-demand networking support to reach out to Cloud and Satellites. The project investigating team will also develop the required software solutions to facilitate the proposed interfacing in real-time with an optimized packet loss rate. The feasibility and performance study of different communication standards, including mobile ad-hoc networking and opportunistic 5G multicasting within UAVs, will be conducted to realize the project
The DAliuGE workflow development and execution system (https://daliuge.readthedocs.io) has been developed to support the Square Kilometre Array data processing system on an extreme scale. DALiuGE has been used to run very large workflows on the biggest supercomputers in the world, including Tianhe-2 (China), Summit (US) and, of course, also the various Pawsey systems. Daniil’s project is about enabling DALiuGE to use the Acacia object store directly through a dedicated so-called data component, which can then be used in complex scientific workflows to persist data.
The rising cost of fossil fuels and the general public’s increased awareness of adverse climate changes resulting from burning fossil fuels have driven up interest in renewable energy sources. The Green Electric Energy Park (GEEP) at Curtin University features state-of-the-art renewable energy-based electric power generation technology, including solar photovoltaic arrays, wind turbines, micro-hydro turbines, and fuel cell stacks. All feeds from renewable energy sources and network operating systems have created big datasets of 10 years of real-time data. We aim to effectively analyse the data using cutting-edge machine learning techniques and big data analytics to extract meaningful information. Deepak’s project will focus on advanced and accurate forecasting of renewable energy output using multivariate embedded Deep neural considering various factors such as weather, device conditions, the multivariate association between wind and solar generation, and the effects of clouds.
Dina’s project aims at exploring advances in deep learning to detect liver cancer in patients and predict their survival using a combination of hematoxylin and eosin (H&E) images and drug/therapy response. It is expected that by combining information on the H&R images of tumors and how a patient responds to treatment, we will have a better understanding of whether a cancer is imminent. This project will develop a new deep neural network architecture based on vision transformers, which are known to be state-of-the-arts for sequence-to-sequence modelling. Here, a transformer network will be utilised to predict tumor gene expression from H&E slides. The predicted expression will be then fused with the drug/therapy responses through a fully-connected network to predict the cancer. The transformer network will be pre-trained on images and corresponding transcriptomics data of tumor samples, which will be provided by the Harry Perkins Institute.
Haadi’s project aims to generate a comprehensive collision data set for electron collisions with tin ions that is relevant for modeling a hot and dense laser-driven tin plasma. Such plasma is used in extreme ultraviolet (EUV) lithography (13.5 nm wavelength) that is currently entering high-volume manufacturing to enable the continued miniaturization of semiconductor devices. Another important application of the produced dataset will be in fusion research where tin is used as a marker to monitor the erosion of wall tiles of the fusion reactor. The unique observable spectral signatures of tin ions allow to identify and locate the damage. The applicant will learn about EUV lithography and fusion research and produce a list of tin ions for quantum collision modeling. They will use quantum collision codes developed by the Curtin research group to study electron collisions with tin ions and thereby contribute to a better understanding of various processes happening in tin plasma
About 17% of Australians own cryptocurrency worth 8 billion AUD, although it is difficult to identify whether these digital coins are generated from a legitimate source following Australian rules and regulations. Due to the anonymous nature of blockchain, it becomes further complicated to resist fraud using the cryptocurrencies obtained through ransomware. Since no existing solution provides facilities for such risk assessment within the crypto transactions, Helen’s project aims to develop a software service for on-the-fly fraud and money laundering detection while using cryptocurrencies.
Most molecules can take more than one low-energy conformation, i.e. arrangement of their atoms. Searching for these low-energy conformations is a key problem in drug discovery. In Jack’s project, he will be developing an ultra-fast algorithm for finding low-energy molecular conformations using a combination of molecular fragmentation techniques, quantum chemistry methods and GPU programming.
The Swarm-based Unmanned System is an emerging autonomous system for various applications such as surveillance, emergency response, entertainment, and warfighting, where autonomous entities such as unmanned aerial vehicles (UAV) and unmanned underwater vehicles (UUV) are dynamically deployed with little to no human intervention. The objective of James’ project is to develop a collaborative and resilient Swarm Intelligence (SI) platform that composes and coordinates multimodal swarms for collaborative surveillance. A swarm of UAV/UUVs may act as an array of sensors that can map a large geographical area, which is impossible for a single drone or water bot. Fixed rules do not apply to the swarm systems in uncharted regions. He will analyse the efficiency of different multi-agent coordination architectures for optimal swarm activities (i.e., deform around obstacles, and reform to build the collective array of sensors) using the AIRSIM environment.
Viroporins are pore-forming peptides found in more than 30 viruses of relevance to human health and agriculture and are validated drug targets for the development of antiviral therapies. Because viroporins are challenging to study using traditional structural biology approaches, the development of viroporin inhibitors is hindered by the lack of high-resolution structures
Jiayu’s project will use low-resolution data from impedance spectroscopy experiments with simulations to determine the structure of viroporin in membrane environments.
Marine environments contain many unique micro-organisms that have yet to isolated in the laboratory setting, yet the exploration of the marine metagenome (environmental DNA) has yielded much information into many hitherto uncultivated species. Advances in next generation sequencing and bioinformatic tools have allowed for the recovery of genomic information of individuals species directly from the environment, which has been essential in expanding our understanding of the microbial tree of life. The next stage of sequencing technology is long read sequencing, which not only improves the quality of the recovered genomes but has also been critical in uncovering biosynthetic gene clusters, a series of sequential genes which work in tandem within a metabolic process.
The goal of Joel’s project is to use the Nanopore sequencing platform for long read sequencing, along with the gold standard for DNA sequencing, Illumina, to produce sequence datasets from marine biofilms.
The goal of Kevin’s project is to characterize a biobank of metastatic melanoma samples from patients in WA. The samples have been fresh-frozen and transplanted to immune-compromised mice to generate so called patient-derived xenograft (PDX) mouse models. Tumors growing in the PDXs can be used to study genetics of the disease, as well as serve as source of tumor cells for drug discovery in vitro as well as in vivo. An overarching goal is to couple genetics to the pharmacological response to drugs by metastatic melanoma, to gain insight into potential biomarkers of response. To that end, we will sequence the RNA and DNA (exome) of PDX tumors and DNA of matching normal cells. The sequencing data will be 1) aligned to a reference genome, 2) used for mutation and 3) copy number analyses, as well as, 4) sample clustering analyses based on gene expression. These analyses will be suitable for a ten week project student supervised by an experienced bioinformatician in the group.
Cell membranes are the primary site of damage during cryopreservation in liquid nitrogen due to the formation of ice both inside and outside of cells. Cryosolvents are used to promote the formation of a glassy state of water and prevent the formation of damaging ice, but their presence can also be damaging to the integrity of cell membranes. Lara’s project will will use molecular dynamics (MD) simulations to study the mechanism of action of cryosolvents with model cell membranes of complex lipid composition at different levels of desiccation to mimic the process of dehydration that occurs during cryopreservation, with two specific aims: (1) describe in atomistic detail the interactions of cryosolvents with cell membranes and the changes in membrane properties, and (2) conduct 3D stereoscopic visualization of these molecular events. Molecular simulation and 3D visualization will enable the elucidation of what is a complex, challenging and yet fascinating problem in cryogenics.
This UWA and Pawsey Quantum Computing Center (UP-QCC) initiative will take the Triangulum three-qubit quantum computer out of the physics department and into the hands of students across disciplines. Luke’s project will involve developing an interactive browser-based workshop introducing practical quantum computing concepts in Python for the Jupyter Notebook platform featuring remote access to the Triangulum system via its SpinQKit API. By delivering the activity in-browser and leveraging cloud-based compute, these learning materials will be available cross-platform.
Amphotericin B (AmB) is one of the most effective treatments for life-threatening invasive fungal infections, but the drug’s toxicity causes severe side effects, including chronic organ damage.
Patrick’s project studies Lactofungin (LFG), a peptide that increases the anti-fungal activity of AmB. The AMB-LFG synergy involves interaction with the cell membrane, but the details of the mechanism are not clear.
Patrick will use biomolecular simulations to understand LFG-membrane and LFG-AmB interactions to help better understand the mechanism of synergy.
Sam’s project aims to develop a testbed for realising, evaluating, and fine-tuning red-meat processing operations through Digital Twins. As red-meat processing requires multiple various types of machinery, having potentially conflicting objectives, creating the exact virtual counterpart is highly complex. Hence, the proposed testbed will explicitly leverage the technological advancement in sensing and embedded computing to collect data from the physical assets within the red-meat industry and map them onto the virtual model to investigate performance bottlenecks, optimisation parameters and trade-offs.
Distributed Energy Resources (DERs) refers to various small-scale electricity generation and storage devices, including solar, battery and hydropower, that provide alternative energy sources, supplementing the traditional power grids. However, their efficient management is challenging due to the lack of understanding of how DERs work in diverse situations, locations and weather conditions. Since no existing solution mimics the behaviour of DERs, Sean’s project aims to develop a Digital Twin-based software platform for assimilating DERs’ conflicting operational dependencies and setting a knowledge base for their efficient management.
The rising cost of fossil fuels and the general public’s increased awareness of adverse climate changes resulting from burning fossil fuels have driven up interest in renewable energy sources. The Green Electric Energy Park (GEEP) at Curtin University features state-of-the-art renewable energy-based electric power generation technology, including solar photovoltaic arrays, wind turbines, micro-hydro turbines, and fuel cell stacks. All feeds from renewable energy sources and network operating systems have created big datasets of 10 years of real-time data. We aim to effectively analyse the data using cutting-edge machine learning techniques and big data analytics to extract meaningful information. William’s project will focus on advanced and accurate forecasting of renewable energy output using multivariate embedded Deep neural considering various factors such as weather, device conditions, the multivariate association between wind and solar generation, and the effects of clouds.
In the context of Australia, due to its unique geographical distribution, it is difficult for farmers to closely monitor animal health and welfare. There is a need for innovative and autonomous solutions to keep up with demand in the agricultural industry. Xinyu’s project proposes using Unmanned Aerial Vehicles (UAVs) as an effective solution to offer an integrated wearable sensor data acquisition and analytics platform for livestock monitoring. Using computer vision algorithms, it will deploy UAVs to fly on the farm periodically and detect livestock with collars. The UAVs then fly in close proximity to the animals and establish communication with their wearable sensors. UAVs will also pull out the raw data at high speed and transmit it to the Cloud for further analysis. Data analytics methods will run in real-time at the cloud data centre to infer valuable phenomena and alert the farmers if urgent action is necessary.
Grinding of particles is a critical stage in mineral processing. AI models are increasingly used for online process control as the models can perform multi-variable/objective optimization. However, the current AI models are still largely driven by large amount of data with little physics involved, so their accuracy are limited to the data range/operation conditions. Yaoyu’s project is to develop a digital platform which integrates an AI model with a mechanistic model based on information obtained from Discrete Element Method (DEM) simulations of grinding processes. The platform will be applied for quick predictions of grinding process and accelerating DEM simulations.
Marine environments contain many unique micro-organisms that have yet to isolated in the laboratory setting, yet the exploration of the marine metagenome (environmental DNA) has yielded much information into many hitherto uncultivated species. Advances in next generation sequencing and bioinformatic tools have allowed for the recovery of genomic information of individuals species directly from the environment, which has been essential in expanding our understanding of the microbial tree of life. The next stage of sequencing technology is long read sequencing, which not only improves the quality of the recovered genomes but has also been critical in uncovering biosynthetic gene clusters, a series of sequential genes which work in tandem within a metabolic process.
The goal of Yutathkarn’s project is to use the Nanopore sequencing platform for long read sequencing, along with the gold standard for DNA sequencing, Illumina, to produce sequence datasets from marine biofilms. Upon conclusion of this project, the interns should be familiar with recovering Metagenome-assembled genomes (MAGs) from environmental data from short and long read sequences using bioinformatic tools (read QC, assembly, binning, etc).
The goal of this project was to train the EfficientDet neural network using a synthetic dataset of damaged traffic signs. To do this we used the Pawsey systems. We extended the model to not only detect but also assess the damage of traffic signs.
Quantum Walk-based Optimisation Algorithms allow problems in combinatorial optimisation, such as finding the shortest route to visit a number of cities, to be solved much faster than would be possible classically. This is because in a quantum algorithm, all the possible combinations are simulated simultaneously as part of a wavefunction, and interference allows only the most optimum solution to be returned to the user.
Antonia’s project aims to expand on the current understanding of certain geochemical properties of hydrothermal fluids. Under the supervision of Dr. Yuan Mei and Dr. Fang Huang at CSIRO (Mineral Resources), she will be modelling complex systems using the molecular dynamics approach to generate fluid property data, useful for understanding metal mobility and ore formation processes in the deep Earth.
Phishers impersonate legitimate organisations or trusted senders by sending unsolicited emails to harvest victim credentials. Although recent advances in Artificial Intelligence boost the automatic detection of phishing attempts, it also provides hackers with opportunities to build increasingly sophisticated phishing tactics to bypass security filters. In addition, cyber criminals are exploiting human factors to take advantage of vulnerabilities, for example, during the COVID-19 pandemic phishing attempts would impersonate health organisations and include precautions of coronavirus to lure victims in clicking on links. While phishing attackers are taking advantage of human trust, curiosity, and emotions such as fears and anxiety, human skills and factors can be a powerful component in cyber defence such as cognitive function and professional judgment. The goal is to design a collaborative way that human strengths and AI harness, extend, and complement each other. This will build a sense of responsibility and trust for users and maximise resilience against phishing attacks.
Calum is working with a piece of software called EXtreme-Scale Electronic Structure Software (EXESS), which computes electron behaviour in atomic systems efficiently on large-scale supercomputers. EXESS is designed and optimized for GPU-based supercomputers; Calum’s job is to port it to CPU-based systems.
We already know that Northern Australia is home to several large mineral deposits, but finding new ones comes at a great cost to both bottom lines and local ecosystems. This project is about predicting the location of underground mineral deposits using geological simulations based on small, but informative, observational datasets. Emily will be trying to constrain the physical parameter space that could produce mineral deposits in the Macarthur basin using stratiagraphic models and the finite-element code MOOSE, and she will aim to optimise these geological simulations to run efficiently across the computing clusters at Pawsey.
Our team aims to study the robustness of Federated Learning (FL) based Machine Learning (ML) against adversarial machine learning. In FL, data is distributed among multiple clients who collaboratively train a model based on their local summarised model. Using adversarial machine learning, a participant can corrupt the collaborative model. Our team is developing a novel defence mechanism against adversarial attacks on FL.
Ivan’s project is in the Computational Quantum Physics field. He is working on GPU acceleration of code for an ill-conditioned atomic scattering system. This system concerns antihydrogen formation, where accurate calculations are of interest to several groups at CERN that are engaged in researching major problems in Physics, such as the Matter-Antimatter asymmetry in the Universe and behaviour of antimatter under gravity.
We are using Deep Learning to analyse micro-images of cementitious materials in order to identify relevant features so that the materials can be appropriately classified. This could result in an algorithm having the ability to judge which mixtures of cement are robust simply through viewing an image.
The aim of Jesse’s project is to investigate the magnetic confinement of ions flowing through an electromagnetic field, with application to the design of fusion reactors. To approach the problem, the project will perform large-scale Monte Carlo simulations that take advantage of the parallelism enabled by the Pawsey facilities.
GAMBIT, a global fitting software framework designed to automatically analyse essentially any theory beyond the standard model of particle physics, requires the ability to load C++ classes automatically and dynamically at runtime. Currently, GAMBIT has a Backend On a Stick Script (BOSS) to deal with this, however, it currently has the limitation of not being able to load templated classes. Thus, the project is to extend BOSS so that templated classes can also be dynamically loaded.
John’s project focuses on the application of Tensor Network Notation (TNN) in the approximation and analysis of Quantum Circuits (one of the central mathematical models of Quantum Computing). During the project we will employ TNN to simulate particular Quantum Circuits and investigate how good the approximation is by comparing exact solutions with the results given by TNN.
Joren’s project involves using deep learning on a large dataset containing images of garbage, with the intention of localising and classifying the type of trash. The aim of this project is to, at the very least, provide the bones of a system that could eventually be used to automate garbage collection worldwide via camera footage.
Joshua’s project involves Microstructure Categorization of Cementitious Materials via Deep Learning. Using deep learning networks for microscopic image representation, analysis and generation.
Julia’s project involves Molecular Dynamics simulations to predict interactions of molecules that have the potential to improve selectivity of fungal infection treatments.
Mike’s project involves using machine learning algorithms to collect and dispose of different variants of garbage.
Moritz will be working under Dr Sonny Pham and A/Prof Aneesh Krishna to investigate improvements to machine learning architectures for fast semantic segmentation. This involves experimenting with a wide variety of architectures and components and performing easily comparable benchmarks to assess different designs.
Using the wave-packet convergent close-coupling approach to ion-atom collisions, Nicholas is studying bare beryllium ion scattering on excited atomic hydrogen. Specifically, he will be focussing on calculating state selective electron capture cross sections as a result of these collisions.
Rayna-Jade’s project aims to identify if blocking lipid binding sites in membrane proteins of flaviviruses can stop the membrane proteins from maturing. To identify if this is possible, simulations will be run to confirm the stability of the lipid binding sites in the mature membrane proteins.
Sean’s project is developing a program that takes in an image and returns the top similar images from a database, called a content based image retrieval system. Through the use of deep learning techniques he will improve his model to the current state of the art method, aiming to enhance performance.
Sean is helping out with an image retrieval project, similar to Google’s search by image. The aim is to apply different modern methodologies to better find images similar to a query image, for example, find photos of the same building but from a different angle.
The main goal of Tinula’s project is to develop a program that allows for automatic detection of microplastic with reflectance-FTIR. With Pawsey’s supercomputers, he hopes to develop and reach this aim with deep learning and different methods of unsupervised statistical analysis.
The aim of our project is to use Machine Learning algorithms to investigate player-to-player interactions to direct the creation of a dynamic planning model capable of identifying and predicting the opponent team’s tactics in a live soccer match. Based on this we will aim to build a model recommending temporal tactical formations and potential substitutions based on players’ live performances.
The project will utilise molecular dynamics (MD) simulations to investigate the mechanism of interaction of model cell membranes (phospholipid bilayers) with various sugar alcohols at different concentrations and hydration states.
Yvonne’s role is to improve on previous work by utilising a more appropriate parameter set combination.
Zelun’s project involves updating part of a large physics simulation software GAMBIT to be more adaptable to external libraries. Currently the software works well with external libraries that have standard C++ classes. During the project the project aims to extend the code to work with template classes in C++.
The project investigated the use of GPU-acceleration in molecular dynamics (MD) simulations of complex biomolecular systems. This was done using the AMBER suite of molecular simulation programs, in which a fast GPU MD simulation engine, pmend.cuda, has been developed such that the entirety of the MD calculation is performed on the GPU while the CPU core only drives the simulation.
Polyhydroxyalkanoates (PHAs) are a family of microbially-made polyesters that are meant to quickly degrade in the environment, but this degradation is reliant on microbially-secreted PHA depolymerases, whose taxonomic and environmental distribution have not been well-defined. As a result, the impact of increased PHA production and disposal on global environments is unknown. This Intern Project searched the global databases for metagenomes to analyze the distribution of PHA depolymerase genes in microbial communities from diverse aquatic, terrestrial and waste management systems.
In 2020, a paper entitled ‘Latent Space Phenotyping: Automatic Image-Based Phenotyping for Treatment Studies’ was published in Plant Phenomics (https://spj.sciencemag.org/journals/plantphenomics/2020/5801869/). The paper outlines a novel alternative to traditional image analysis methods for phenotyping without the need for complex and bespoke image analysis pipelines. The source code has been made available (https://github.com/p2irc/lsplab). The project was developed in Python, using Tensorflow and leverages nVidia GPUs (CUDA/cuDNN). This Intern Project looked at modernising that project on supporting infrastructure (a supercomputing environment or a cloud environment).
Semantic change detection (SCD) is an important problem for many industries. For example, rail network operators need to identify early warning signs of deteriorating conditions of supporting structure to avoid derail accidents. Where existing detection problems aim to recognise and locate known objects, SCD aims to recognise and locate characteristics of objects that deviate from what is expected. SCD is a challenging machine learning task. This Intern Project set up baselines for advanced research in SCD, by performing and analysis and then comparing several deep learning-based change detection approaches recently proposed in the literature, such as those based on Generative Adversarial Networks (GANs), Deep Convolutional Autoencoders (CAEs), and Long Short Term Memory (LSTM) models.
Batten disease is a group of genetically inherited, neuro-degenerate diseases, most of which start in early childhood. Batten is always fatal, usually in the late teens or twenties and there is no treatment to reverse or halt disease progression. The overall aim of this Intern Project was to use molecular dynamics (MD) simulations, that combined with wet-lab experiments, to improve understanding of recently discovered lead molecules to treat Batten. In addition, the simulations will assist to develop technology to screen for more lead molecules.
Soccer is a popular sport around the world. Effective soccer analytics could improve team performance regardless of resources by maximizing the potential of existing players, analysing opponents’ game strategies, and highlighting players’ features, which are usually undervalued in predicting winning. In this Intern Project, we explored machine learning (ML) approaches to design feature extraction and prediction models for context-aware and adaptive Soccer Analytics. We explored three types of context-aware machine learning approaches: Bayesian network, Decision Tree, and Deep Neural Networks; and one advanced adaptive machine learning approach: Forest Deep Neural Network (fDNN)
Understanding electron transfer in ion-atom collisions is essential for a variety of applications, ranging from astrophysical processes, such as solar wind and nuclear fusion, to modern cancer treatment techniques like hadron therapy. The goal of this Intern Project was to perform accurate calculations of differential cross sections for electron capture in high-energy (MeV regime) proton-helium collisions using a semiclassical wave-packet convergent close-coupling (WP-CCC) method recently developed in our group [Alladustov et al 2019 Phys. Rev. A 99 052706].
The Intern Project explored applications of deep generative models in geoscientific model development. Applying deep learning algorithms to parameter estimation problems has received active interest from both academia and industry in recent years. Modern deep neural networks allow for fast reconstruction of various subsurface properties with a sufficient degree of accuracy. At the same time, realistic models require large datasets for training, which are not always possible to obtain from real data. Using deep generative models can significantly improve the performance. The project used the latest development in deep learning algorithms and worked with real data.
Genomic prediction has been a staple of plant and animal breeding, but ‘classic’ approaches cannot suitably model large datasets or complex additional datasets, such as time-series weather data. ‘Modern’ approaches such as the construction of Artificial Neural Networks can be highly effective for prediction of complex data. The Intern Project built ensembled Artifical Neural Networks that took already existing genomic and weather data in two streams to calculate phenotypic predictions. The input data was divided into training, testing and hold-out sets, where the neural network was built upon training and testing data and ideally could accurately predict phenotypes from the hold out set. Once built the Neural Networks were further optimised, taking advantage of the GPU backend for optimal hyper-parameter selection, to improve the phenotypic predictions in crop breeding.
The transfer of ions across liquid-liquid interfaces is key to many technological applications, such has heavy metal extractions and sensing. Although, there is a phenomenological understanding of how this process occurs, there is no detailed molecular picture of the ion transfer process in the presence of electric fields. The group has recently developed a method to correctly simulate the effect of an external electric field in heterogenous systems, which was tested on the interfaces between two immiscible liquids, water and 1,2-Dichloroethane (DCE) (the most commonly used system for sensing applications). This Intern Project continued this work by studying the properties of the water/DCE interface in the presence of electrolytes on both sides of the interfaces, focusing on determining how the structure of the interface changes with the concentration of the electrolytes and on computing the transfer potential of various ions from one liquid phase to the other.
This Intern Project focused on computational and theoretical physics based on high-performance computing, specifically implementing a new parallelization framework for the Monte Carlo simulation computer code and performing relevant modeling calculations. In the Project, Reese studied charged particle transport in a hydrogen-helium plasma, which is particularly relevant to fusion research. The present version of the code was implemented on one node with no parallelization. Monte Carlo simulation code is very computationally intensive and appropriate parallelization needed to be implemented. In addition, approaches to visualization of the obtained results were investigated and implemented.
We now know a quantum computer can solve an enormously large set of linear equations, can simulate a wide range of Hamiltonians representing chemical and biological systems, can perform various linear transformations including Fourier transforms, and can efficiently evaluate inner products and distances in super high dimensional vector space, the last of which is particularly useful in machine learning. In this Intern Project, we explored potential applications in combinatorial optimization, which are known to be notoriously difficult to solve, even approximately in general. The group recently developed a promising quantum algorithm, taking advantage of intrinsic quantum correlations and quantum parallelism, to deal with combinatorial optimization problems that scale up exponentially. The Intern Project helped to validate this algorithm through large-scale high-performance simulation of an actual quantum computer.
Thai took the sleep dataset that is being collected globally by students and, with the support of the Pawsey Visualisation Team, developed interactive web-based visualisations to enable high school students to explore and understand the dataset. The Intern Project is being undertaken with the goal of further growing the dataset and expanding the initial portal to include STEM educational materials and “voices” of scientists and experts.
Impact craters across the surface of planetary bodies are of great importance to understand the formation and the evolution of celestial bodies. Secondary craters result from the debris ejection from a primary impact and lead to the formation of long chains of smaller craters on the surrounding ground. The team developed a Crater Detection Algorithm trained on Mars, detecting 94 million impact craters > 25m in diameter. The team is retraining the algorithm on the Moon, and will then turn its attention to Mercury. Mercury exhibits the most unusual secondary crater population in the Solar System. The analysis of secondary craters smaller than 1 km in diameter has never been performed because they are too numerous to be counted by hand. The goal of this Intern Project is to perform analysis on Mercury using the Messenger/MDIS-NAC (1.1m/px) by creating a training dataset using this set of imagery, to retrain the current model. The resulting automatic impact crater catalog will be used by the Bepi-Columbo mission to help target areas of interest.
In recent years, public and government concern in Australia about the potential for tick-borne diseases in people has increased considerably. Uncertainty about Australian Lyme disease-like illness requires evidence-based science to identify the microorganisms responsible and provide conclusive data about the speed of infection after tick attachment. This Intern Project identified appropriate bioinformatic pipelines to assign taxonomy to multiple sourced samples (tick, vertebrate host, microbe), which contributed to ultimately improve diagnostic tests, treatment protocols, and the control of tick-borne diseases
Rayleigh and Raman scattering: This research project in computational and theoretical physics used Pawsey’s high-performance computing, implementing a new parallelization framework for the photon collision computer code and performed relevant modeling calculations. The problem of photon-atom scattering was addressed using a fully quantum approach based on the evaluation of the Kramers-Heisenberg-Waller (KHW) matrix elements. Appropriate parallelization was implemented to improve computational performance
The goal of the Intern Project was to develop a containerised workflow solution for DNA Zoo genome alignments to human genome using the LASTZ sequence alignment program. The project planned to take advantage of the HPC and Nimbus Research Cloud architecture at Pawsey’s to test the primary alignment processing stages, using the DNA Zoo genome assemblies of diverse mammal species to human. This work is foundational to doing any comparative work, and to a key desideratum: mapping conservation in the human genome with single-base-pair resolution
The goal of the Intern Project was to port the EDIP interatomic potential developed by the Curtin Carbon Group to GPU-enabled systems. This project continued the development of HPC capability in the Curtin Carbon Group. A number of years ago the group ported the EDIP interatomic potential to LAMMPS as part of a Pawsey Internship Project. The routines proved extremely valuable, underpinning a successful ARC Discovery Project and establishing the group as international leaders in this field. By expanding our capability into GPUs, the group planned to continue to push the boundaries of what is possible with molecular dynamics simulation.
3D geophysical inversion is a core method for resolution of the subsurface, for a wide range of applications, but in particular is used for minerals and petroleum exploration. The goal of this Intern Project was to derive new workflows to build a “one step” process to generate 3D geophysical inversion models from native-format data, as is collected in airborne surveys. This data is inherently anisotropic as it is collected along long lines, densely sampled (e.g. 10m), but with a much greater separation between lines (e.g. 400 m). Most inversion procedures require a number of pre-processing steps, which are sub-optimal (e.g., time consuming, extensive manual input, numerous assumptions).
Modern approaches are taking advantage of HPC infrastructures that permit much more comprehensive and precise models to be implemented. The ability to rapidly and rigorously build 3D models is burgeoning as “live-data” environments and on demand services become more common
Edric worked on the simulation of quantum statistical algorithms. The key to this project was the calculation of extremely large matrix exponentials using algorithms parallelised by MPI. These codes simulated the quantum statistical algorithms that were proposed by the quantum research group at UWA.
The ability of proteins to fold spontaneously in their native structure or functional state is essential for biological function. Failure to fold in the native shape may lead to misfolding and aggregation of proteins into insoluble aggregates, known as amyloid fibrils. These fibrous deposits have been linked to debilitating and age-related diseases, such as Alzheimer’s, Parkinson’s, type-II diabetes and others. The Intern Project studied the role of mutations on the structure, dynamics and aggregation propensity on the lipid-oriented protein: apolipoprotein A-I (apoA-I). The accumulation of this protein as amyloid fibril has been associated with atherosclerotic plaques. The work was done in collaboration with the experimental research group led by Dr Michael Griffin from the Bio21 Institute and University of Melbourne.
The Intern Project aimed at calibrating a multiphysics geomechanical simulator against experimental data using a Deep Learning (DL) approach. A specificity of the simulator used was its novel constitutive model controlling the mechanical behaviour from state variables like temperature and pore pressure. Since those properties were not directly measured, the calibration could only be obtained through an inversion process. Traditional approaches to inverse problems are largely based on deterministic gradient-based methods, which are limited by non-linearity and non-uniqueness of large-scale problems in high-dimensional parameter spaces. The non-linear physical couplings involved in multiphysics problems make this process extremely challenging, even for expert users, and therefore are particularly suitable for Artificial Intelligence (AI) methods.
This Intern Project evaluated and compared state-of-the-art semantic segmentation methods (DeconvNet, UNet, SegNet, PSPNet, FastSCNN, DeepLabV3) for critical infrastructure monitoring. Semantic segmentation is usually the first task in any scene analysis application, providing useful information about the different foreground and background objects in the scene. The project compared the methods on benchmark segmentation datasets, and then applied them to specific applications where scenes containing critical infrastructure needed to be analysed. The project recommended the most suitable methods based on the overall speed and accuracy.
Repeat expansions of short tandem repeats (STRs) are responsible for over twenty-five human neurological disorders, including Huntington disease, spinocerebellar ataxias and intellectual disabilities (e.g. Fragile-X). Many disorders showing anticipation go undiagnosed as we do not know all the possible repeat expansions. Next-generation sequencing (NGS) may be used to detecting novel repeat expansions but requires computationally intensive algorithms. The goal of this Intern Project was to scan for novel repeat expansions genome-wide in hundreds of NGS samples by creating analysis pipelines using a workflow manager to help analyse NGS samples for evidence of repeat expansions and by incorporating the running of several packages for repeat detection, including HipSTR, STRetch, ExpansionHunter and exSTRa.
The aim of this Intern Project was to automate the detection of new impact craters on the surface of Mars by using a Crater Detection Algorithm. The pipeline of data treatment involved training on images containing already known new impact craters, which were then applied on all high-resolution imagery dataset currently available, with preferential focus on dust-free regions.
Tarun worked on the characterisation and comparative analyses of immune genes in marsupials. The goal of the project was to develop a containerised workflow solution to map the already characterised 800 genes vital to the immune response in the human genome for the 18 marsupial genomes available now. Among these genes are the highly divergent immune genes, such as cytokines, natural killer cell receptors, and antimicrobials. The work revealed the level of complexity of the marsupial immunome as compared to the human.
Images deblurring is a newly arising research area, especially for improving 3D reconstructions which are commonly used in many disciplines. Images are not always captured under ideal circumstances, which will result in lower quality images. The degraded quality images can affect the quality of some 3D reconstruction features; this will eventually lead to difficulty of 3D reconstruction, missing parts and holes in the 3D models. This project is to find an effective deblurring algorithm for images that are taken underwater for 3D reconstruction process. We focus on deblurring blurred images caused by defocus and motion, as either type of these blurred images will destroy details we need in 3D reconstruction process.
Alex participated in the Pawsey Summer Internship Program in 2014/2015. Alex participated through Curtin University, School of Public Health.
Accurate understanding and simulation of airflow through lungs could lead to advances in aerosol medicine delivery, knowledge of lung function and other benefits to public health. Computational fluid dynamics (CFD) simulations of this kind have been carried out previously using the open source software OpenFOAM, however the sensitivity to certain input variables on lung models is yet to be tested and doing so will be important for their validation. This work looks at the sensitivity of lung airflow simulations to the expansion ratio of the lung model itself, i.e. the ratio between the lung before a breath is taken and the lung at full inflation.
Alexander participated in the Pawsey Summer Internship in 2014/2015. He participated through Curtin University, Theoretical Physics.
The probabilities involved in atomic collisions can be calculated through a method known as convergent close coupling (CCC). The existing technique of solving the CCC equation involves integration over a singularity. Here, we present an alternate method using an analytical solution to the Green’s function, which yields the same results as the original formulation and yet is free from singularities.
The Square Kilometre Array (SKA) is a global next-generation radio telescope project aiming to answer fundamental questions about our universe. A key technical challenge in building the SKA is providing a stable clock reference to each antenna. We have designed and tested a system which can stably disseminate a clock signal (10MHz) over a fibre network while actively compensating fluctuations in the fibre. The system has a stability of ~10-4 at 1s for up to 5km lengths, and has recently been verified on the Australian Square Kilometre Array Pathfinder (ASKAP) telescopes.
Hannah participated in the Pawsey Summer Internship Program in 2014/2015. She participated through the Pawsey Supercomputing Centre.
This project aims to explore the plausibility of extracting dimensional measurements of fish from single-camera underwater footage. The extracted data could lead to estimates of body mass/biomass which can be used as indicators of fish health and stress. At present this is generally done by capturing live fish or using expensive stereo-camera setups.
The ability for researchers to monitor a marine habitat less invasively using cameras would reduce the impact they have on the habitat. The ability to analyse single-camera footage could add value to the existing abundance of underwater footage. Large volumes of video could be analysed, producing new data from old resources. Various methods were explored over the course of eothis project: photogrammetry/3D reconstruction from still frames; object recognition using cascading classifiers; and image segmentation using background subtraction.
This project is a first step in investigating the galactic structures of radio galaxies in the Murchison Widefield Array Commissioning Survey (MWACS) and Galaxy And Mass Assembly (GAMA) catalogues. I look at the 69 deg^2 G23 region, yielding 40 matches and a smaller catalogue of 15 sources with redshifts. A full cross-match between all the GAMA regions and the new GLEAM survey is thus predicted to have ~200 matches for which a more comprehensive study of radio host galaxies can be made.
This project explored how large a Laguerre basis is needed in the Convergent Close-Coupling (CCC) method before convergence in the estimate of the SDCS is reached for electron scattering on atomic Hydrogen in the S-Wave model. The study found that a Laguerre basis size of 16 resulted in an estimate that was converging for incident electrons of 44.4 eV, a basis size of 17 was required for 54.4 eV electrons and a basis size of 19 for 64.4 eV electrons.
3D reconstruction using images has become more important in different disciplines such as heritage mapping and marine science. Some of these image datasets are very large and require numerous days of processing. The goal of this project is to implement a parallel solution on the iVEC supercomputers to reduce computing time allowing a time realistic processing of large image datasets.
In this project we look at the performance of the bispectrum method for detecting transient sources in interferometric data. We test a relation predicted to exist between the signal to noise ratio (SNR) of an interferometric detection of a pulsar versus the channel width of visibility data in the correlator. We used data gathered from the Giant Metrewave Radio Telescope (GMRT), and formed the closure triangles from those antennas to synthesise a time-domain signal from the combined array. We test a newly derived analytical relationship between SNR and number of channels and demonstrate that it is possible to use the bispectrum method to achieve better transient detection than ever before.
This project was undertaken to assist Dr Andrew King in the process of developing a Computational Fluid Dynamics (CFD) simulation that analyses and estimates the potential power generation from two wave energy harvesting device designs. Due to the time constraints of the internship, this project focussed on the preliminary aspects of this analysis – calculating the forces on the wave energy harvesting devices by a wave scheme that replicates the conditions found off the coast of Perth, Western Australia.
In order to aid the process of interpreting data across a very broad frequency range, we have developed a new technique for comparing the spectral forms of various types of radio galaxy, and defined a set of classifications based on these forms, successfully linking them back to the current understanding of radio galaxy behaviour.
The environment-dependent interaction potential for carbon (CEDIP) is implemented in LAMMPS, and its parallel efficiency is benchmarked using the Pawsey Centre’s Magnus supercomputer. CEDIP performs significantly better than the original (non-LAMMPS) version for large numbers of CPUs, and the same as (or, for larger systems, slightly poorer than) other carbon potentials.
Developing camera-tracking applications and other interface devices using cheaply available commercial software and hardware that allow participants to have their gestures, movements, and group behaviour be fed into the virtual environment either directly or indirectly in order for presenters to present 3D virtual worlds to remotely located audiences while appearing to be inside those virtual worlds has immediate practical uses.
The interfacial properties of methanol-water and ethanol – water mixtures were studied by using molecular dynamics simulation. The surface tension, density distributions of water, methanol, ethanol and their hydrophobic and hydrophilic groups were analysed. The results show good agreements to the previous literatures. It is followed by studying the angle between the water dipole and the positive z-axis of the simulation box in terms of cosine. The presents of positive peaks at different concentrations of the alcohol confirm the existence of the second water layer. The water density from the vapour phase to the positive peak of the water dipole order was analysed in both systems. This new adoption of the second water layer has helped to quantify the amount of the water molecules, which have specific orientations at different alcohol compositions and confirm a relation to the surface tension.