Setonix is the name of Pawsey’s new supercomputer being delivered by Hewlett Packard Enterprise (HPE) as part of the biggest upgrade to the Pawsey computing infrastructure since the centre opened in 2009. Setonix will deliver up to 50 petaFLOPs, or 30 times more compute power than its predecessor systems Magnus and Galaxy, to help power future high-impact Australian research projects.
Setonix is being built using the HPE Cray EX architecture, featuring significantly increased compute power and a greater emphasis on accelerators, with future-generation AMD EPYC™ CPUs and AMD Instinct™ GPUs, as well as expanded data storage capabilities with the Cray ClusterStor E1000 system.
Pawsey commenced transitioning researchers to Setonix Phase 1 in June 2022. Setonix Phase 1 will provide a 45 per cent increase in raw compute power in one-fifth of the physical footprint of the Magnus and Galaxy systems.
Setonix Phase 2 (the full system) will be delivered at the end of 2022. Setonix will be at least ten times more energy efficient than its predecessors Magnus and Galaxy, while providing a 30-fold increase in raw compute power.
During the 2022 ISC HPC conference in Germany – one of the largest international supercomputing conferences – the TOP500 and Green500 lists were released. The top four systems on the Green500 list, which ranks the most energy-efficient systems in gigaFLOPS per watt, were all HPE Cray EX systems and will soon be made available to their research communities.
By working with the same computing architecture, Pawsey is ensuring the researchers’ workflows are exascale-ready for future requirements.
The upgraded compute capability is supported by a number of other upgrades at the Pawsey Centre designed to provide researchers with an improved user experience:
The ASKAP ingest nodes are one of the most critical components of the pipeline between the ASKAP telescopes and the data store that houses the final data products. They receive data from the correlators located at the Murchison Radio Observatory and write it to disk, ready for post-processing on the Galaxy supercomputer.
As part of the capital refresh, the sixteen ASKAP ingest nodes have been replaced with nodes with the latest AMD processors designed for I/O. They have twice as much data bandwidth as the previous generation and more memory channels, ensuring that they can keep up with the torrents of data that are produced by the telescopes. Along with three dedicated nodes for providing ancillary services, they have dedicated storage in the form of the ClusterStor E1000. Approximately half a petabyte of NVMe storage has been dedicated to the ingest process, capable of speeds in excess of 150 GB/s.
The new MWA Compute Cluster is named “Garrawarla”, meaning spider in the language of the Wajarri people, on whose land the Murchison Radio Observatory stands. The new 78-node cluster provides a dedicated system for astronomers to process more than 30 PB of MWA telescope data using Pawsey infrastructure. The cluster provides users with enhanced GPU capabilities to power AI, computational work, machine learning workflows and data analytics.
The upgrade to the Nimbus high-throughput computing (HTC) infrastructure is complete. The new infrastructure provides improved computational flexibility, accessibility and speed. The upgrade allows researchers to process and analyse even larger amounts of data through additional object storage and the Kubernetes container orchestrator, building on Pawsey’s existing container technology for its HPC systems.
Acacia is the new online storage system, providing over 60 PB of object storage for long-term archiving of researcher data. The system is divided into two zones: one designed for data that needs faster access, the other for energy-efficient long-term storage. Acacia went into production in February 2022.
All users will need to migrate to Acacia as the existing /Group filesystem will not be available once researchers migrate to Setonix.
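As a rough sketch of what a migration might look like, the commands below use rclone, a widely used client for S3-compatible object stores such as Acacia. The remote name `acacia`, the project path and the bucket name are hypothetical placeholders, not real Pawsey endpoints; consult the Pawsey documentation for the actual endpoint and credential setup.

```shell
# Hypothetical sketch of staging /group data into Acacia object storage.
# "acacia" is an assumed rclone remote name (configured beforehand with
# "rclone config"); the source path and bucket name are placeholders.

SRC=/group/projectcode/important-results   # placeholder /group directory
BUCKET=acacia:projectcode-archive          # placeholder remote:bucket

# Object stores handle a few large objects better than many small files,
# so bundle the directory into a single archive first.
tar -czf results.tar.gz -C "$SRC" .

# Create the bucket if needed, upload the archive, then list the bucket
# contents to confirm the transfer landed.
rclone mkdir "$BUCKET"
rclone copy results.tar.gz "$BUCKET"
rclone ls "$BUCKET"
```

Archiving before upload also keeps the object count low, which matters for both transfer speed and any per-object quota on the store.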
Banksia is the new offline storage system, replacing the previous storage management software with an open system that provides an expandable platform to build on and leverages the investment in object storage. It reuses the existing tape libraries, preserving Pawsey’s current investment in tapes, and adds a new 5 PB cache to take full advantage of the new 100 GbE network infrastructure.
Pawsey is moving from a monolithic single core router to a spine-leaf architecture with a 400 Gbps backbone and 100 Gbps links to host endpoints. This will allow all network endpoints (i.e. login nodes, visualisation servers, data mover nodes, etc.) to realise a ten-fold increase in bandwidth, moving from 10 Gbps to 100 Gbps Ethernet.
To support all the above upgrades to the Pawsey Supercomputing Centre, the Pawsey building is being upgraded to provide power and cooling for the new infrastructure. This work commenced in April 2021 and is being delivered in phases to support Setonix and the Long-Term Storage upgrades.
It has been another busy month at Pawsey, with all researchers now having access to Setonix Phase 1. At the end of July, 65% of active projects had migrated to the new supercomputer and at least 130 of you had accessed the system, with an average utilisation of around 50% of the entire system during the migration period.
During the 6 weeks since the migration started, 50 of you joined the training sessions live and 20 people asked questions during our Ask.Me.Anything sessions. We also had 633 views of the recorded training modules, which you can access from our Pawsey YouTube Playlist: Setonix Phase 1 Migration.
The last live online training session for the migration took place last week. The recordings are still available on Pawsey’s YouTube channel, and the training materials are on our Pawsey Documentation site: Setonix Migration Training Material. Together with the new documentation, we will continue supporting your migration.
An FAQ section, the result of our recent AMA sessions, is now available here. We still want to hear about your experiences; your feedback has already helped us update content and facilitate the migration process.
Setonix Phase 2 is on its way to Australia and it cannot be installed until Magnus and Zeus are decommissioned. You can support us in delivering the full power of Setonix as soon as possible by migrating your projects to Setonix now.
At present, you have your full allocation available on both Magnus and Setonix. This will continue until mid-September, when we plan to commence the decommissioning of Magnus. Please plan to migrate your projects to Setonix before then.
We expect the rest of 2022 to be busy as we complete migrating Pawsey researchers to Setonix Phase 1 and then work through the challenges of delivering the full power of Setonix in the midst of worldwide supply chain issues, travel restrictions and the impacts of COVID-19 on our workforce.
Thank you for being patient with us on this journey, we are looking forward to delivering the southern hemisphere’s fastest research supercomputer, accelerating Australian science.
To read the stories published about the project milestones, refer to the list below:
- Pawsey provides the first look at Setonix, wrapped in stars 21/09/2021
- Pawsey to deploy 130PB of multi-tier storage 16/08/2021
- Pawsey unveils its super-fast tribute to the quokka 24/02/2021
- Powering the next generation of Australian research with HPE 20/10/2020
- PACER – upscaling Australian researchers in the new era of supercomputing 25/07/2020
- New Pawsey Nimbus Cloud infrastructure available for Australian researchers 10/03/2020
- HPE to deliver a dedicated system for astronomy needs 28/02/2020
- Pawsey Capital Refresh Boosts Cloud Infrastructure 21/11/2019
- Tender released for Australia’s new research supercomputer 14/11/2019
- Three times more storage and performance for SKA pathfinders 11/11/2019
- Pawsey Capital Refresh – Reference Groups Established 5/04/2019
- New funding to accelerate science and innovation 28/04/2018
Pawsey is committed to engaging with its diverse stakeholders and keeping them updated regarding the procurement. Some of the channels the Centre has established to achieve this are the Pawsey user forums, the Capital Refresh Update for potential vendors, Pawsey newsletters and, more recently, our podcasts.
You can listen to the Capital Refresh Podcast from the list below:
Find below an infographic showing the project’s current status (last updated on 04/08/2022). It can also be downloaded here: CapitalRefreshStatusandWorkflow20220801
Pawsey Capital Refresh Status
Banksia – our tape library expansion
Additional tape storage has been procured to expand the existing tape libraries from 50 to 63 Petabytes in each library.
Acacia object storage addition
Warm Tier – a disk-based system powered by Dell, named Acacia after Australia’s national floral emblem, the Golden Wattle (Acacia pycnantha) – provides 60 PB of high-speed object storage for hosting research data online. This multi-tiered cluster separates different types of data to improve data availability. Acacia will be fully integrated with Setonix, enabling a better experience when transferring data between Pawsey Centre systems.
Pawsey partnered with Dell EMC to expand its cloud system with 5x more memory and 25x more storage, forming a cutting-edge, flexible compute system. This expansion provides better service to emerging research areas and communities that benefit from high-throughput computing.
Astronomy high-speed storage: 3x more storage and performance. The existing Astro filesystem, powered by HPE, was expanded to service the MWA community; it has been upgraded to 2.7 PB of usable space and is capable of reading/writing at 30 GB/s. The new buffer filesystem, a dedicated resource for ASKAP researchers manufactured by Dell, provides 3.7 PB of usable space and is capable of reading/writing at 40 GB/s.
High-speed storage filesystems: designed to handle thousands of users accessing them at the same time. The Pawsey high-speed filesystems will be procured as part of the main supercomputer system to increase speed and storage capacity for general-purpose science.
Garrawarla, the 546-teraFLOPS MWA cluster, is a resource tuned to MWA’s needs, powered by HPE. Procured ahead of the main supercomputer, this cluster frees up the full CPU partition of Galaxy for ASKAP.
Pawsey is moving to a Cisco spine-leaf architecture with a 400 Gbps backbone and 100 Gbps links to host endpoints. The network has been designed to be easily expandable to support the object storage platform being purchased as part of the Long-Term Storage procurement, as well as integration with Pawsey’s new supercomputer.
The remote visualisation capability has been procured as part of the main supercomputer. When the new capabilities become available, researchers will be able to visualise their science in real time while it is being processed.
This new capability will allow researchers to steer their visualisations while the data is processing, fine-tuning them to the desired outcome.
Setonix will be built using HPE Cray EX supercomputer architecture, will deliver 30x more compute power than its predecessors and will be at least 10x more power efficient.
Pawsey Data Workflow