DMIRS-mdhdb
Harmonisation and delivery of exploration geochemistry data drawn from approximately one billion assay results held in the DMIRS Minerals Drillholes Database (MDHDB). The results come from company reports covering more than 3 million historic drillholes and more than 7 million surface samples.
Principal investigator: David Howard email@example.com
Area of science: Geoscience
Applications used: Ubuntu, PostgreSQL, Flask
The data in the MDHDB are held as text strings in an ‘attribute name-value pair’ structure, which accommodates the large number of uncontrolled attribute (column) names in the more than 70,000 source data files submitted by 3,000 companies. For example, for gold assays in the surface sample data alone there are more than 3,300 ‘attribute name – unit’ combinations in which the attribute name contains ‘au’ or ‘gold’. The data must therefore be harmonised before they can be queried generally or imported into a conventional surface sample or drillhole database, and with this many attributes the only feasible way to do so is programmatically.
Internal security and procedural constraints, however, discourage installing non-standard software, developing experimental harmonisation code, and using internal infrastructure to experiment with non-standard data delivery. The database is also too large to manipulate on standard in-house computers.
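The kind of programmatic harmonisation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's actual code: the attribute names, unit spellings, conversion table and the `harmonise_gold` function are all hypothetical. Matching ‘au’ only as a whole token is also an assumed refinement, since a plain substring match would catch names such as ‘Fault’.

```python
import re

# Assumed conversion factors to parts per million (ppm). Real data would
# need a much larger, curated table of unit spellings.
UNIT_TO_PPM = {
    "ppm": 1.0,
    "g/t": 1.0,      # grams per tonne is numerically equal to ppm
    "ppb": 0.001,
    "pct": 10000.0,
    "%": 10000.0,
}

# Match 'au' only as a whole token (e.g. "Au_ppb", "AU-FA"), or 'gold' anywhere.
GOLD_NAME = re.compile(r"(?:^|[^a-z])au(?:[^a-z]|$)|gold", re.IGNORECASE)


def harmonise_gold(attr_name, unit, value):
    """Return a gold assay in ppm, or None if the row cannot be harmonised."""
    if not GOLD_NAME.search(attr_name):
        return None  # not a gold attribute
    factor = UNIT_TO_PPM.get(unit.strip().lower())
    if factor is None:
        return None  # unknown unit: leave for manual review
    try:
        number = float(value)  # values are stored as text strings
    except ValueError:
        return None  # e.g. below-detection markers such as "<0.01"
    return number * factor


# Hypothetical (attribute name, unit, value) rows in name-value pair form.
rows = [
    ("Au_ppb", "ppb", "250"),
    ("GOLD", "g/t", "1.2"),
    ("Fault", "ppm", "3"),      # contains 'au' but is not a gold attribute
    ("Au", "ppm", "<0.01"),     # non-numeric text is not parsed here
]
harmonised = [harmonise_gold(*row) for row in rows]
```

Rows that cannot be resolved return `None` rather than a guessed value, so they can be set aside for review instead of silently entering the harmonised table.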
Experimentation with new processing codes is needed to harmonise the data and store the processed data, and to test various data delivery options, without constraints imposed by standard operating procedures and environments.
The Nimbus facility at the Pawsey Centre provides the means of creating an appropriate virtual machine and sufficient storage without physical hardware, procedural and security constraints. This is ideal for developing code and experimenting with alternative approaches.
The result of our experimentation was the harmonisation and storage of open file geochemistry data in a Nimbus VM, and the development of a proof-of-concept web delivery interface using:
• 8-core, 32 GB VM with 290 GB storage and Ubuntu v18 OS
• PostgreSQL database
• Flask website
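A delivery layer of this kind largely comes down to a parameterised query over the harmonised table, with the result serialised for the web client. The sketch below uses Python's built-in `sqlite3` module purely as a stand-in for PostgreSQL so that it runs without a database server. The `assay` table, its columns and the `assays_above` function are hypothetical and do not describe the project's schema.

```python
import json
import sqlite3

# In-memory stand-in for the harmonised PostgreSQL table (assumed schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE assay (sample_id TEXT, element TEXT, value_ppm REAL)"
)
conn.executemany(
    "INSERT INTO assay VALUES (?, ?, ?)",
    [("S1", "Au", 0.25), ("S2", "Au", 1.2), ("S3", "Cu", 40.0)],
)


def assays_above(conn, element, min_ppm):
    """Return a JSON string of samples at or above min_ppm for one element."""
    cur = conn.execute(
        "SELECT sample_id, value_ppm FROM assay "
        "WHERE element = ? AND value_ppm >= ? ORDER BY sample_id",
        (element, min_ppm),  # bound parameters, never string formatting
    )
    records = [{"sample_id": s, "value_ppm": v} for s, v in cur.fetchall()]
    return json.dumps(records)


payload = assays_above(conn, "Au", 1.0)
```

In a Flask application, a view function could return a string like `payload` as the response body. With PostgreSQL the same pattern applies, but the driver's own placeholder style replaces the `?` markers.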
The successful solution has now been moved to a commercial hosting facility at http://wamexgeochem.net.au. This would have been very difficult to achieve without having first been able to work freely in the Nimbus environment.