The Code
WOMBAT and the CosmoPlasmas project are the result of a partnership between academic institutions and Cray Inc., an industry leader in High Performance Computing (HPC). Code design strategies are led by performance engineers to showcase the best way to exploit modern supercomputers using solely open standards. The academic partners lead the algorithmic implementation and define the applicable science.
We use some unusual programming techniques and parallelization strategies in the code. As the HPC community approaches the exascale, these methods will become crucial for exploiting further progress in hardware. Stable versions of the code are released to the public, so the astronomy and astrophysics community can benefit from the progress made with WOMBAT.
Object Oriented Fortran 2008
Fortran is a programming language developed in the early days of scientific computing. Nonetheless, Fortran code can still be up to 20% faster than C or C++ implementations, because its compilers are mature and the language structure allows deeper optimization.
In this project, we use a modern dialect of Fortran (F08), which offers modern features like object orientation. Its CONTIGUOUS attribute tells the compiler that array data is stored without gaps in memory, which in practice lets loops be SIMD-vectorized without compiler directives.
Julialang
Julia is a modern high-level functional language. As a descendant of Matlab, it is Fortran's granddaughter and shares its column-major data layout and 1-based array indexing. Julia is as easy to program as the ever-popular Python, but approaches the speed of C. Thus it is a perfect fit for WOMBAT.
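The shared data layout can be made concrete with a small sketch. Fortran and Julia both store multi-dimensional arrays column by column and count indices from 1, so a flat buffer written by one can be indexed by the other without transposing. The Python below only illustrates the index arithmetic; the function names are hypothetical and nothing here is taken from WOMBAT itself.

```python
# Sketch of the memory layout shared by Fortran and Julia.
# Both use 1-based indices and column-major order; C uses
# 0-based indices and row-major order. All names are illustrative.


def column_major_offset(i, j, nrows):
    """0-based offset of element (i, j), given 1-based indices (Fortran/Julia style)."""
    return (i - 1) + (j - 1) * nrows


def row_major_offset(i, j, ncols):
    """0-based offset of element (i, j), given 0-based indices (C style)."""
    return i * ncols + j


def read_column_major(flat, nrows, ncols):
    """Rebuild a list of rows from a flat buffer stored column by column."""
    return [
        [flat[column_major_offset(i, j, nrows)] for j in range(1, ncols + 1)]
        for i in range(1, nrows + 1)
    ]


# The 2x3 array [[11, 12, 13], [21, 22, 23]] as Fortran or Julia would store it:
# first column, then second column, then third column.
flat = [11, 21, 12, 22, 13, 23]
rows = read_column_major(flat, nrows=2, ncols=3)
print(rows)  # [[11, 12, 13], [21, 22, 23]]
```

In a column-major layout the first index varies fastest in memory, so inner loops should run over the first index in both languages.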
In this project, we aim to do all post-processing exclusively in Julia. Library functionality is more than sufficient for cosmological simulations and high-quality scientific visualization. Importantly, Julia exposes users to key HPC concepts such as vectorization and thread and node parallelism. As most students start their projects by post-processing simulation data in a high-level language, we can teach these concepts naturally and with much less effort than with a compiled language.
Advanced MPI
We use cutting-edge technology based on the open MPI standard to achieve computation-communication overlap at the thread level. MPI is the standard library for inter-process communication, from small compute clusters to the largest supercomputers on the planet.
In this project, we combine one-sided MPI remote memory access (calls to MPI_Put and MPI_Get) in MPI_THREAD_MULTIPLE mode with Cray's brand-new lock-free library. This way, threads can access memory on other nodes without any synchronization and achieve optimal overlap.
OpenMP on Accelerators
OpenMP is an open standard for writing multi-threaded applications in C/C++ and Fortran. Version 4.0 of the standard introduced support for accelerators such as GPUs.
In this project, we use OpenMP for threading support, with only a single parallel region to minimize overhead in the code. Because of our open research policy, we cannot use the popular but proprietary CUDA framework to program accelerators in WOMBAT. Instead, we are working with the Cray OpenMP team to bring accelerator support to WOMBAT using OpenMP 4+.
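The idea behind a single parallel region can be sketched as a persistent thread team: threads are created once, keep a fixed share of the data, and separate the phases of each step with barriers instead of being started and joined for every loop. The Python below is only a structural analogy built on the standard threading module. It is not OpenMP and not WOMBAT code, Python's global interpreter lock means it shows the control flow rather than any speedup, and the function name and the simple averaging update are assumptions made for illustration.

```python
import threading


def run_single_region(data, nthreads=4, nsteps=3):
    """Apply nsteps periodic averaging updates to data with one persistent thread team."""
    n = len(data)
    scratch = [0.0] * n
    barrier = threading.Barrier(nthreads)  # reusable across phases and steps

    def worker(tid):
        # Static block decomposition: each thread owns data[lo:hi] for the whole run.
        lo = tid * n // nthreads
        hi = (tid + 1) * n // nthreads
        for _ in range(nsteps):
            # Phase 1: read data (including a neighbour's cell), write own scratch block.
            for k in range(lo, hi):
                scratch[k] = 0.5 * (data[k] + data[(k + 1) % n])
            barrier.wait()  # every read of data is finished before anyone overwrites it
            # Phase 2: copy own scratch block back into data.
            for k in range(lo, hi):
                data[k] = scratch[k]
            barrier.wait()  # data is fully updated before the next step reads it

    # The team is created once and joined once, like a single parallel region.
    team = [threading.Thread(target=worker, args=(t,)) for t in range(nthreads)]
    for t in team:
        t.start()
    for t in team:
        t.join()
    return data


result = run_single_region([0.0, 4.0, 8.0, 12.0], nthreads=2, nsteps=1)
print(result)  # [2.0, 6.0, 10.0, 6.0]
```

The alternative, starting a fresh team for every phase, pays the thread start-up and join cost in each step; keeping one team alive reduces that cost to two barrier waits per step.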
Agile Software Development
We are integrating techniques from Agile project management and code organization into the project. This helps us coordinate the team across different time zones, focus the work of students, and hold supervisors accountable. Agile development is standard in the IT industry, but not common in numerical astrophysics.
WOMBAT uses Git for source code management. The group is connected via Slack, and the project is managed using a Jira board and monthly sprint meetings. We will look into pair programming as well.