Harmonisation and delivery of exploration geochemistry data: approximately one billion assay results held in the DMIRS Minerals Drillholes Database (MDHDB), compiled from company reports covering more than 3 million historic drillholes and more than 7 million surface samples.

Principal investigator

David Howard

Area of science


Systems used


Applications used

Ubuntu, PostgreSQL, Flask
Partner Institution: Geological Survey of Western Australia | Project Code: Nimbus Application A000309

The Challenge

The data in the MDHDB are held as text strings in an ‘attribute name-value pair’ structure to accommodate the large number of uncontrolled attribute (column) names in the more than 70,000 source data files submitted by 3,000 companies. For example, for gold assays in the surface sample data alone there are more than 3,300 ‘attribute name – unit’ combinations in which the attribute name contains ‘au’ or ‘gold’. The data must therefore be harmonised before they can be generally queried or imported into a conventional surface sample or drillhole database, and the number of attributes is such that this can only feasibly be done programmatically. However, internal security and procedural constraints discourage the installation of non-standard software, the development of experimental harmonisation code, and the use of internal infrastructure for experimenting with non-standard data delivery. Furthermore, the database is too large to manipulate on standard in-house computers.
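To illustrate what programmatic harmonisation of ‘attribute name – unit’ combinations can look like, the following is a minimal sketch in Python. It is not the project's actual code: the function name `harmonise_gold`, the unit table `UNIT_TO_PPM` and the choice of ppm as the target unit are all assumptions made for illustration.

```python
import re

# Hypothetical conversion factors to parts per million (ppm).
# A real table would need many more unit spellings.
UNIT_TO_PPM = {
    "ppm": 1.0,
    "g/t": 1.0,       # grams per tonne is numerically equal to ppm
    "ppb": 0.001,
    "%": 10000.0,
    "pct": 10000.0,
}

# Match 'au' or 'gold' only when not embedded in a longer word,
# so 'Au_ppb' and 'GOLD (g/t)' match but 'fault' does not.
GOLD_NAME = re.compile(r"(?<![a-z])(au|gold)(?![a-z])")


def harmonise_gold(attr_name, unit, value):
    """Return a gold assay in ppm, or None if the record is not recognised.

    All three inputs are text strings, as in a name-value pair store.
    """
    if not GOLD_NAME.search(attr_name.strip().lower()):
        return None  # not a gold attribute
    factor = UNIT_TO_PPM.get(unit.strip().lower())
    if factor is None:
        return None  # unknown unit: leave for manual review
    try:
        number = float(value)
    except ValueError:
        return None  # e.g. '<0.01' (below detection limit) needs its own rule
    return number * factor


if __name__ == "__main__":
    for record in [("Au_ppb", "ppb", "250"), ("GOLD", "g/t", "1.5"), ("fault", "ppm", "1")]:
        print(record, "->", harmonise_gold(*record))
```

Returning `None` for unrecognised names, units or values keeps ambiguous records out of the harmonised output so that they can be reviewed separately rather than silently converted.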

The Solution

Experimentation with new processing codes is needed to harmonise the data and store the processed data, and to test various data delivery options, without constraints imposed by standard operating procedures and environments.

The Outcome

The Nimbus facility at the Pawsey Centre provides the means of creating an appropriate virtual machine with sufficient storage, free of physical hardware, procedural and security constraints. This is ideal for developing code and experimenting with alternative approaches.
The result of our experimentation was the harmonisation and storage of open-file geochemistry data in a Nimbus VM, and the development of a proof-of-concept web delivery interface using:
• 8-core, 32 GB VM with 290 GB storage and Ubuntu v18 OS
• PostgreSQL database
• Flask website
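
As an illustration of how harmonised name-value rows might be turned into a conventional one-row-per-sample table for delivery, the sketch below uses conditional aggregation in SQL. It is an assumption-laden example rather than the project's implementation: Python's built-in `sqlite3` stands in for PostgreSQL so that the example is self-contained, and the table `assay_nv`, its columns and the function `pivot_samples` are hypothetical names.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory table of harmonised name-value rows (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE assay_nv (sample_id TEXT, element TEXT, value_ppm REAL)"
    )
    conn.executemany(
        "INSERT INTO assay_nv VALUES (?, ?, ?)",
        [("S1", "Au", 0.25), ("S1", "Cu", 120.0), ("S2", "Au", 1.5)],
    )
    return conn


def pivot_samples(conn, elements):
    """Return one row per sample with one column per requested element."""
    for element in elements:
        # Element symbols are interpolated into the SQL text, so allow letters only.
        if not element.isalpha():
            raise ValueError(f"unexpected element name: {element!r}")
    columns = ", ".join(
        f"MAX(CASE WHEN element = '{e}' THEN value_ppm END) AS {e}_ppm"
        for e in elements
    )
    sql = (
        f"SELECT sample_id, {columns} FROM assay_nv "
        "GROUP BY sample_id ORDER BY sample_id"
    )
    return conn.execute(sql).fetchall()


if __name__ == "__main__":
    for row in pivot_samples(build_demo_db(), ["Au", "Cu"]):
        print(row)
```

The same `MAX(CASE WHEN ... END)` pattern is valid in PostgreSQL; samples with no value for a requested element receive NULL in that column.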

The successful solution has now been moved to a commercial hosting facility. This would have been very difficult to achieve without having first been able to work freely in the Nimbus environment.

List of Publications

LinkedIn post:

Figure 1. Screen image of the online drillhole geochemistry data discovery and delivery application developed in the Nimbus VM environment in 2021.