In 2018 the Australian Government awarded $70 million to upgrade Pawsey’s supercomputing infrastructure, on top of the $80 million granted in 2009 to establish a petascale supercomputing facility.
The Pawsey upgrade, as a major part of the national HPC infrastructure, ensures that Australia continues to enable computationally intensive research.
The Pawsey Capital Refresh is a complex upgrade that will be delivered in stages. Some ancillary systems, including storage and network infrastructure, have been procured ahead of the main system.
Delivery of the main supercomputer is expected to occur in two phases.
The first phase will provide researchers with a system that is at least equivalent in capacity to what they are currently using, with the latest generation of processors and increased memory per node.
During this phase, researchers with an active allocation on Magnus will transition to the new system.
Phase one is due to be commissioned in mid-2021.
The second phase is expected to be in production by mid-2022. It will provide an exponential expansion in capacity and state-of-the-art technology.
You can see all the parts that make up the Capital Refresh Project in the infographics displayed at the bottom of this page; they can also be downloaded here.
The evaluation of tenders for the Pawsey Supercomputer System (PSS) is progressing well. To support the move from the existing Pawsey supercomputers (Magnus and Galaxy) to the new PSS, Pawsey is working on a number of initiatives designed to help users as they migrate to the new technology. The first of these initiatives is the Pawsey Supercomputing Centre for Extreme scale Readiness (PACER) program. PACER will assist researchers with code optimisation and with application and workflow readiness by running Grand Challenge Problems at a previously unavailable scale on the next-generation supercomputer. To solve these problems, Pawsey will co-fund Australian postdoctoral or PhD positions, embedded within a subset of successful PACER projects. The people in these positions will work to solve computational problems in collaboration with researchers.
Applications for PACER will open when the new Pawsey supercomputing system is announced later in 2020. Detailed information about the application process and the specifications of the allocations will be shared with the call for applications.
To learn more about the PACER program, listen to the most recent podcast at https://pawsey.org.au/about-us/capital-refresh/.
The new MWA Compute Cluster is named “Garrawarla”, meaning spider in the language of the Wajarri people, on whose land the Murchison Radio Observatory is located. The new 78-node cluster will provide a dedicated system for astronomers to process in excess of 30 PB of MWA telescope data using Pawsey infrastructure. The new cluster will provide users with enhanced GPU capabilities to power AI, computational work, machine learning workflows and data analytics. Delivery and installation of this cluster have been delayed by about 4 weeks due to travel restrictions associated with COVID-19 and some issues with power supply to the cluster. MWA Early Adopters have been given access to 30% of the new system, which has allowed them to perform some early work. Power works were performed late in July 2020 and the stress test is being scheduled for early August. Assuming all goes well with the stress test, general MWA users will commence migrating to Garrawarla in August/September.
The ASKAP ingest nodes procurement was awarded to HPE in July 2020. The ASKAP ingest nodes are one of the most critical components of the pipeline between the ASKAP telescopes and the data store which houses the final data products. They receive the data from the correlators located at the Murchison Radio Observatory and write it to disk, ready for post-processing on the Galaxy supercomputer.
As part of the capital refresh, the sixteen ASKAP ingest nodes are being replaced with nodes with the latest AMD processors designed for I/O. They have twice as much data bandwidth as the previous generation and more memory channels, ensuring that they can keep up with the torrents of data that are produced by the telescopes. Along with three dedicated nodes for providing ancillary services, they will have dedicated storage in the form of the ClusterStor E1000. Approximately half a petabyte of NVMe storage will be dedicated to the ingest process, capable of speeds in excess of 150 GB/s.
The populated racks, cables, and cooling doors are scheduled to arrive at Pawsey during September 2020 and migration to the new infrastructure is scheduled to occur during Q4 2020.
The Pawsey Long Term Storage requirements were released to the CSIRO Scientific Panel in July 2020. Implementation of the planned upgrades to the Long Term Storage is scheduled for Q1/Q2 2021.
The upgrade to the Nimbus high-throughput computing (HTC) infrastructure is complete. The new infrastructure provides improved computational flexibility, accessibility and speed. The upgrade allows researchers to process and analyse even larger amounts of data through additional object storage and the Kubernetes container orchestrator, building on Pawsey’s existing container technology for its HPC systems.
To read the published stories about the project milestones, refer to the list below:
- PACER – upscaling Australian researchers in the new era of supercomputing 25/07/2020
- New Pawsey Nimbus Cloud infrastructure available for Australian researchers 10/03/2020
- HPE to deliver a dedicated system for astronomy needs 28/02/2020
- Pawsey Capital Refresh Boosts Cloud Infrastructure 21/11/2019
- Tender released for Australia’s new research supercomputer 14/11/2019
- Three times more storage and performance for SKA pathfinders 11/11/2019
- Pawsey Capital Refresh – Reference Groups Established 5/04/2019
- New funding to accelerate science and innovation 28/04/2018
Pawsey is committed to engaging with its diverse stakeholders and keeping them updated on the procurement. The channels the Centre has established to achieve this include the Pawsey user forums, the Capital Refresh Update for potential vendors, Pawsey newsletters and, more recently, our podcasts.
You can listen to the Capital Refresh Podcast from the list below:
Episodes in the podcast:
- Episode 7: PACER, accelerating researchers in the new era of supercomputing
- Episode 6: Spiders for the sky
- Episode 5: Nimbus – HTC Cloud Service Upgrade & Training
- Episode 4: The year that was, the year ahead
- Episode 3: HTC Cloud Procurement
- Episode 2: Capital Refresh Status Update
- Episode 1: What is the Pawsey Capital Refresh?
Below is an infographic showing the current status of the project (last updated on 1/07/2020). It can also be downloaded here.
Pawsey Capital Refresh Status
Tape library expansion
Additional tape storage has been procured to expand the existing tape libraries from 50 to 63 Petabytes in each library.
Long term storage
Server and hard disk storage and networking will be refreshed, along with the tape library’s total maximum storage capacity. Both ingest and egress data transfer services will be upgraded, improving upload and download times.
Pawsey is partnering with Dell EMC to expand its current cloud system with 5x more memory and 25x more storage to form a new cutting-edge flexible compute system. This expansion aims to better serve emerging research areas and communities that benefit more from high-throughput computing.
Astronomy high-speed storage: 3x more storage and performance. The existing Astro filesystem was expanded to service the MWA community. Powered by HPE, it has been upgraded to 2.7 PB of usable space and is capable of reading/writing at 30 GB/s. The new buffer filesystem, a dedicated resource for ASKAP researchers, provides 3.7 PB of usable space and is capable of reading/writing at 40 GB/s. It is manufactured by Dell.
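To put the capacity and bandwidth figures quoted above into perspective, the short sketch below estimates how long a single full read or write pass over each filesystem would take. It is an illustrative back-of-the-envelope calculation only: it assumes decimal units (1 PB = 1,000,000 GB) and that the quoted peak bandwidth is sustained for the whole pass, which real workloads are unlikely to achieve. The function name `full_pass_seconds` is hypothetical and not part of any Pawsey tooling.

```python
def full_pass_seconds(capacity_pb: float, bandwidth_gb_s: float) -> float:
    """Seconds needed to read or write the whole filesystem once.

    Assumptions (not from the source): decimal units, and the quoted
    peak bandwidth sustained for the entire pass.
    """
    capacity_gb = capacity_pb * 1_000_000  # 1 PB = 1,000,000 GB (decimal)
    return capacity_gb / bandwidth_gb_s


# Figures quoted in the text for the two astronomy filesystems.
astro_hours = full_pass_seconds(2.7, 30) / 3600    # Astro filesystem (MWA)
buffer_hours = full_pass_seconds(3.7, 40) / 3600   # buffer filesystem (ASKAP)

print(f"Astro filesystem:  about {astro_hours:.1f} hours per full pass")
print(f"Buffer filesystem: about {buffer_hours:.1f} hours per full pass")
```

Under these assumptions, both filesystems could in principle be filled or drained in roughly a day, which suggests the capacity and bandwidth of each system have been scaled together.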
High-speed storage filesystems: Designed to cope with thousands of users accessing them at the same time, high-speed filesystems will be procured to increase speed and storage capacity for general-purpose science.
The 546 TeraFlops MWA cluster will be a resource better tuned to MWA’s needs, powered by HPE. Procured ahead of the Main Supercomputer, this cluster will allow ASKAP to use the full CPU partition of Galaxy.
The high-speed interconnect ties all the pieces of the Pawsey ‘puzzle’ together. When procured, all parts will sit on the same fabric as first-class citizens, allowing Pawsey researchers to run their workflows quicker.
The remote visualisation capability will be procured as part of the main supercomputer. When the new capabilities become available, researchers will be able to visualise their science in real time, while the data is being processed. This will allow them to steer their visualisation during processing and fine-tune it towards the desired outcome.
The Pawsey Supercomputer will be delivered in two phases. Phase 1 will be delivered by mid-2021 and will provide researchers with a system that is at least equivalent in capacity to what they are currently using. Phase 2 is expected to be in production by mid-2022; it will provide an exponential expansion in capacity and the latest state-of-the-art technology.
Pawsey Data Workflow