In 2018 the Australian Government awarded $70 million to upgrade Pawsey’s supercomputing infrastructure, on top of the $80 million granted in 2009 to establish a petascale supercomputing facility.
The Pawsey upgrade, as a major part of the national HPC infrastructure, ensures that Australia continues to enable computationally intensive research.
The Pawsey Capital Refresh is a complex upgrade and will be delivered as a staged process. Some ancillary systems, including storage and network infrastructure, were procured ahead of the main system.
Delivery of the main supercomputer is expected to occur in two phases.
The first phase will provide researchers with a system that is at least equivalent in capacity to what they are currently using, with the latest generation of processors and increased memory per node.
During this phase, researchers with an active allocation on Magnus will transition to the new system.
Phase one is due to be commissioned in mid-2021.
The second phase is expected to be in production by mid-2022. It will provide a major expansion in capacity and state-of-the-art technology.
You can see all the parts that make up the Capital Refresh Project in the infographic displayed at the bottom of this page, which can also be downloaded here.
The Pawsey Supercomputing Centre has announced that it has selected Hewlett Packard Enterprise (HPE) to deliver its new supercomputer as part of the biggest upgrade to the Pawsey computing infrastructure since the centre opened in 2009. The new supercomputer will deliver up to 50 petaFLOPS, or 30 times more compute power than its predecessor systems Magnus and Galaxy, to help power future high-impact Australian research projects.
Pawsey’s new supercomputer will be built using the HPE Cray EX architecture, featuring significantly increased compute power and more emphasis on accelerators, with future-generation AMD EPYC™ CPUs and AMD Instinct™ GPUs, and expanded data storage capabilities provided by the Cray ClusterStor E1000 system.
The new system will be delivered in two stages. Phase 1, available by Q3 2021, will provide a 45 percent increase in raw compute power in one-fifth of the size compared with the Magnus and Galaxy systems. Full commissioning of the system will occur by the second quarter of 2022.
The new supercomputer will be at least ten times more power efficient than its predecessors Magnus and Galaxy, while providing a 30-fold increase in raw compute power. The supercomputers are cooled by a groundwater cooling system developed by CSIRO specifically for the supercomputing centre, which is offset by a 118 kW solar photovoltaic system.
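As a rough back-of-envelope check on how the quoted figures relate to each other, the sketch below derives the combined Magnus and Galaxy baseline implied by "up to 50 petaFLOPS" and "30 times more compute power", and the Phase 1 capacity implied by a 45 percent increase over that baseline. It assumes that "30 times more" means a 30:1 ratio and that both percentages refer to the same baseline; the function and variable names are illustrative only, and the results are not official Pawsey figures.

```python
# Figures quoted in the announcement.
PEAK_PFLOPS = 50.0      # "up to 50 petaFLOPS" for the full system
SPEEDUP_FACTOR = 30.0   # "30 times more compute power" than Magnus and Galaxy
PHASE1_INCREASE = 0.45  # "45 percent increase" provided by Phase 1


def implied_baseline_pflops(peak: float, factor: float) -> float:
    """Baseline capacity implied by a peak figure and a speedup ratio."""
    return peak / factor


def phase1_pflops(baseline: float, increase: float) -> float:
    """Phase 1 capacity, assuming the increase applies to the same baseline."""
    return baseline * (1.0 + increase)


baseline = implied_baseline_pflops(PEAK_PFLOPS, SPEEDUP_FACTOR)
phase1 = phase1_pflops(baseline, PHASE1_INCREASE)

print(f"Implied Magnus + Galaxy baseline: {baseline:.2f} petaFLOPS")
print(f"Implied Phase 1 capacity:         {phase1:.2f} petaFLOPS")
```

Under these assumptions the bulk of the 30-fold increase arrives with the second phase rather than the first.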
To support the move from the existing Pawsey supercomputers (Magnus and Galaxy) to the new Pawsey Supercomputing System (PSS), Pawsey is working on a number of initiatives designed to help users as they migrate to the new technology. The first of these initiatives is the Pawsey Supercomputing Centre for Extreme scale Readiness (PACER) program. PACER will assist researchers with code optimisation and with application and workflow readiness by running Grand Challenge Problems at a previously unavailable scale on the next-generation supercomputer. To solve these problems, Pawsey will co-fund Australian postdoctoral or PhD positions, embedded within a subset of successful PACER projects. The people in these positions will work to solve computational problems in collaboration with researchers.
Applications for PACER will open in November 2020. Detailed information about the application process and the specifications of the allocations will be shared with the call for applications.
To learn more about the PACER program, visit https://pawsey.org.au/about-us/capital-refresh/ and listen to the most recent podcast.
The new MWA Compute Cluster is named “Garrawarla”, meaning “spider” in the language of the Wajarri people, on whose land the Murchison Radio Observatory is located. The new 78-node cluster will provide a dedicated system for astronomers to process in excess of 30 PB of MWA telescope data using Pawsey infrastructure. The new cluster will provide users with enhanced GPU capabilities to power AI, computational work, machine learning workflows and data analytics. Stress testing has been successfully completed and users are now being migrated to the new system.
Pawsey would like to thank the researchers who took part in the Garrawarla early adopters program for their effort and feedback in developing the software stack and environment. We have observed speedups of 2 to 8 times in several MWA workflows, which is a significant improvement in performance and computational capacity given Garrawarla’s total FLOP/s.
Please refer to the Garrawarla documentation for detailed instructions on how to access the Garrawarla cluster, the system details, how to compile and run jobs, and other relevant information. Several packages are available as system modules. Some modules with python/2.7.17 support can be accessed from the /pawsey/mwa/software/mwa_sles12sp4/modulefiles directory.
Pawsey is looking forward to assisting all MWA users to migrate to Garrawarla by the end of October 2020. If you have any questions or issues while migrating, please contact the Pawsey Helpdesk via the Pawsey Service Desk.
The ASKAP ingest nodes procurement was awarded to HPE in July 2020. The ASKAP ingest nodes are one of the most critical components of the pipeline between the ASKAP telescopes and the data store which houses the final data products. They receive the data from the correlators located at the Murchison Radio Observatory and write it to disk, ready for post-processing on the Galaxy supercomputer.
As part of the capital refresh, the sixteen ASKAP ingest nodes are being replaced with nodes with the latest AMD processors designed for I/O. They have twice as much data bandwidth as the previous generation and more memory channels, ensuring that they can keep up with the torrents of data that are produced by the telescopes. Along with three dedicated nodes for providing ancillary services, they will have dedicated storage in the form of the ClusterStor E1000. Approximately half a petabyte of NVMe storage will be dedicated to the ingest process, capable of speeds in excess of 150 GB/s.
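To put the quoted storage figures in perspective, the short sketch below estimates how long it would take to write the whole ingest buffer once at the quoted storage speed. It assumes decimal units (1 PB = 10^15 bytes, 1 GB = 10^9 bytes), treats "approximately half a petabyte" as exactly 0.5 PB, and treats 150 GB/s as a sustained write rate; the names are illustrative only. The telescope's actual data rate is not stated here, so this is a bound set by the storage, not a measured ingest time.

```python
PB = 10**15  # bytes in a petabyte (decimal units, an assumption)
GB = 10**9   # bytes in a gigabyte (decimal units, an assumption)

BUFFER_CAPACITY_BYTES = 0.5 * PB   # "approximately half a petabyte" of NVMe
STORAGE_RATE_BYTES_PER_S = 150 * GB  # "in excess of 150 GB/s"


def seconds_to_write(capacity_bytes: float, rate_bytes_per_s: float) -> float:
    """Time to write a given capacity once at a constant rate."""
    return capacity_bytes / rate_bytes_per_s


fill_seconds = seconds_to_write(BUFFER_CAPACITY_BYTES, STORAGE_RATE_BYTES_PER_S)
fill_minutes = fill_seconds / 60

print(f"Time to write the full buffer once: {fill_seconds:.0f} s "
      f"(about {fill_minutes:.0f} minutes)")
```

In other words, at the quoted speed the NVMe tier could be written end to end in a little under an hour.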
The populated racks, cables, and cooling doors were installed in the whitespace at Pawsey during September and the storage is scheduled to be delivered to Pawsey at the end of October 2020. ASKAP has been provided with some (very) early access to a small number of nodes. Migration to the new infrastructure is scheduled to occur later in Q4 2020.
The Pawsey Long Term Storage RFQs closed at the end of August 2020. Evaluation of responses commenced in September 2020 and is ongoing. Implementation of the planned upgrades to the long-term storage is scheduled for Q1/Q2 2021.
The upgrade to the Nimbus high-throughput computing (HTC) infrastructure is complete. The new infrastructure provides improved computational flexibility, accessibility and speed. The upgrade allows researchers to process and analyse even larger amounts of data through additional object storage and the Kubernetes container orchestrator, building on Pawsey’s existing container technology for its HPC systems. New users can apply for an allocation on Nimbus via apply.pawsey.org.au.
To read the published stories about the project milestones, refer to the list below:
- Powering the next generation of Australian research with HPE 20/10/2020
- PACER – upscaling Australian researchers in the new era of supercomputing 25/07/2020
- New Pawsey Nimbus Cloud infrastructure available for Australian researchers 10/03/2020
- HPE to deliver a dedicated system for astronomy needs 28/02/2020
- Pawsey Capital Refresh Boosts Cloud Infrastructure 21/11/2019
- Tender released for Australia’s new research supercomputer 14/11/2019
- Three times more storage and performance for SKA pathfinders 11/11/2019
- Pawsey Capital Refresh – Reference Groups Established 5/04/2019
- New funding to accelerate science and innovation 28/04/2018
Pawsey is committed to engaging with its diverse stakeholders and keeping them updated on the procurement. Some of the channels the Centre has established to achieve this are the Pawsey user forums, the Capital Refresh Update for potential vendors, Pawsey newsletters and, more recently, our podcasts.
You can listen to the episodes of the Capital Refresh Podcast from the list below:
- Episode 7: PACER, accelerating researchers in the new era of supercomputing
- Episode 6: Spiders for the sky
- Episode 5: Nimbus – HTC Cloud Service Upgrade & Training
- Episode 4: The year that was, the year ahead
- Episode 3: HTC Cloud Procurement
- Episode 2: Capital Refresh Status Update
- Episode 1: What is the Pawsey Capital Refresh?
Below is an infographic showing the current status of the project (last updated on 20/10/2020). It can also be downloaded here.
Pawsey Capital Refresh Status
Tape library expansion
Additional tape storage has been procured to expand each of the existing tape libraries from 50 to 63 petabytes.
Long term storage
Server and hard disk storage and networking will be refreshed, along with the tape library’s total maximum storage capacity. Both ingest and egress data transfer services will be upgraded, improving upload and download times.
Pawsey partnered with Dell EMC to expand its cloud system with 5x more memory and 25x more storage to form a cutting-edge flexible compute system. This expansion provides better service to emerging research areas and communities that benefit more from high-throughput computing.
Astronomy high-speed storage: 3x more storage and performance. The existing Astro filesystem was expanded to service the MWA community. Powered by HPE, it has been upgraded to 2.7 PB of usable space and is capable of reading/writing at 30 GB/s. The new buffer filesystem, a dedicated resource for ASKAP researchers, provides 3.7 PB of usable space and is capable of reading/writing at 40 GB/s. It is manufactured by Dell.
High-speed storage filesystems: Designed to deal with thousands of users accessing them at the same time. The Pawsey high-speed filesystems will be procured as part of the main supercomputer system to increase speed and storage capability for general-purpose science.
Garrawarla, the 546 TeraFlops MWA cluster, is a resource tuned to MWA’s needs, powered by HPE. Procured ahead of the Main Supercomputer, this cluster allows ASKAP to use the full CPU partition of Galaxy.
The high-speed interconnect ties all the pieces of the Pawsey ‘puzzle’ together. When procured, all parts will sit on the same fabric as first-class citizens, allowing Pawsey researchers to run their workflows quicker.
The remote visualisation capability has been procured as part of the main supercomputer. When the new capabilities become available, researchers will be able to visualise their science in real time, while the data is being processed.
This new capability will allow researchers to steer their visualisation while the data is being processed and fine-tune it towards the desired outcome.
The PSS will be built using the HPE Cray EX supercomputer architecture, will deliver 30x more compute power than its predecessors and will be at least 10x more power efficient.
It will be delivered in two phases. Phase 1, available by Q3 2021, will provide researchers with a 45 percent increase in compute power in one-fifth of the size compared with Magnus and Galaxy. Phase 2 will become available in Q2 2022, providing up to 50 petaFLOPS of raw compute power.
Pawsey Data Workflow