CCF (Core Computational Facility) @ UQ run by ITS / SMP



SMP dogmatix cluster

The SMP dogmatix cluster is accessed through the login node dogmatix.smp.uq.edu.au. The cluster merges the former obelix, asterix and ghost clusters with new hardware purchased by SMP in 2016, consolidating and extending the School's existing computational resources.


System status

You can see the CPU loads on the cluster via the webpage
http://faculty-cluster.soe.uq.edu.au/ganglia/
This page can only be accessed from a computer within the UQ domain, or after starting a UQ VPN session.

System documentation

The dogmatix system documentation is found below, but see also the various howto pages, starting with the Slurm queue.

Connecting to dogmatix

The login host of the cluster is dogmatix.smp.uq.edu.au (alternatively smp-login-0.smp.uq.edu.au). You should be able to log in to it via ssh with your UQ login name and password, once you have contacted ITS to get an account. If you are off campus, you can either (a) ssh in on port 2022, or (b) run a UQ VPN.
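
For example (a sketch, with uqusername standing in for your own UQ login name):

ssh uqusername@dogmatix.smp.uq.edu.au           # from on campus, or over the UQ VPN
ssh -p 2022 uqusername@dogmatix.smp.uq.edu.au   # from off campus without a VPN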

Slave node types

There are 91 compute/slave nodes in total in the dogmatix cluster. These are divided into 3 partitions:

smp
the default partition, comprising the new nodes and the former obelix cluster nodes
asterix
all the nodes from the former asterix cluster, useful primarily for GPU jobs
ghost
nodes from the old ghost cluster, used exclusively for GPU jobs and restricted to staff and students in astrophysics

Obelix had separate queues for different hardware, but this is not needed on dogmatix. If you want your job to run on a high-memory node, specify the amount of memory you need with --mem=memory. If you want your job to run on a specific type of hardware, specify the hardware type as a constraint; see the examples below.
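
For instance (a sketch: the memory value and the constraint name intel are placeholders, so substitute values that actually exist on dogmatix):

sbatch --mem=32G job.sh            # ask for 32 GB so the job is placed on a node with enough memory
sbatch --constraint=intel job.sh   # ask for a specific hardware type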

Recommendations

All jobs should be submitted through the SLURM queueing system to ensure optimal use of resources.

Always specify a maximum run time (--time=[days-]hh:mm:ss). The default run time is unlimited, so that jobs are not killed prematurely, but the scheduler will favour jobs with a shorter maximum run time.

Always specify the memory you need (--mem=memory). The default memory allocation is small to ensure optimal use of resources, and your job will stall or fail if it requires more memory than requested. If you specify an appropriate memory limit, your job will likely run sooner, and it keeps the large-memory nodes free for jobs that really need them.

If you want your job to run on a specific type of hardware, specify it with --constraint=hardware. A batch script combining these recommendations is sketched below.
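
A minimal batch script following these recommendations might look like the following sketch (the job name, resource values and program name are all placeholders):

#!/bin/bash
#SBATCH --job-name=myjob       # placeholder job name
#SBATCH --time=0-04:00:00      # maximum run time of 4 hours
#SBATCH --mem=8G               # memory the job needs
#SBATCH --ntasks=1             # a single task

./my_program                   # placeholder for your own program

Submit it to the queue with:

sbatch job.sh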

Storage

It is important to remember that the files stored on the dogmatix cluster are NOT BACKED UP! This means that you need to keep a backup copy of any important data that you have on the cluster.
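
One way to keep a backup is to pull copies down to your own machine with rsync (a sketch; uqusername, results/ and the destination directory are placeholders for your own account and data):

rsync -av uqusername@dogmatix.smp.uq.edu.au:results/ ~/dogmatix-backup/   # run this on your own machine, not on dogmatix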

You have space to store your files in your home directory and in the shared data partitions.

Note that to get access to /data1/, /data2/, /data3/, /data4/, /data5/, /data6/, you need to specifically request it from ITS and provide a brief justification.

Obelix and Asterix data

You can find your obelix and asterix files in these partitions on dogmatix:
/data/username
the same as on obelix
/data[2-6]/username
the same as on obelix
/obelix-home/username
your home directory from obelix (read only)
/asterix-home/username
your home directory from asterix (read only)
/asterix-data/username
your data directory from asterix (read only)
For example, you can copy files from your obelix home directory with:

cp /obelix-home/username/path path
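
To copy a whole directory tree rather than a single file, add the recursive flag (username and mydir are again placeholders):

cp -r /obelix-home/username/mydir ~/mydir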


This page last updated 13th March 2019.