Algorithmic Fidelity

by Julius Donnert

Bug fixing the 2d and 3d code versions, I recently ran the famous Orszag-Tang Vortex test. This is a small simulation that showcases the transition of a flow to 2d MHD turbulence.  Here is a screen shot of the simulation after some time with WENO5 (left) and TVD (right) :

It is a challenge to find differences between the two simulations (there are few though). However the WENO5 simulations has only half the resolution of the TVD run ! 

This nicely showcases the impact of increased algorithmic fidelity on fluid simulations. Even though a WENO5 timestep is nearly 4 times more expensive to compute than a TVD step (exact numbers to come, but likely 1 Million zones/sec/node on Broadwell), similar effective resolution is obtained at double the zone size.

In 3d this means that WENO5 is actually faster than TVD, if the quality of the solution is the only concern. TVD has a factor 8 more zones to compute, with twice as many time steps.  The error introduced by finite precision reduces as well. Additionally, the data size goes down by a factor of 8, which makes post-processing simulations much easier.

We will argue in our paper that WENO5 represents an efficiency optimum for our type of simulations. Nonetheless one is usually bound by memory and runs the largest simulation that fits on the machine at hand. This is where scalability and vectorization come into play, but that is another blog entry ...

Magnetic field constrained transport in WENO

by Julius Donnert

I have been experimenting with CT schemes in our new WENO5 implementation into WOMBAT. We already had the simple CT implementation from Ryu et al 1998, which is second order and not computationally very intensive. In second order codes, a high order upwinded CT scheme is essential to get magnetic field advection right. So I implemented the Lee 2013 scheme, which is an upwinded version of his excellent high order scheme. Given that cosmological simulations are all about advection and isotropy at insane density contrasts, it is a good idea to make sure the code handles this situation accurately.

Here is a video of the magnetic field loop advection in 2d with TVD (Mendygral+ 2017), WOMBAT WENO5 with second order CT and WOMBAT WENO5 with upwinded high order CT like Lee 2013. 

 

As you can see the simple CT scheme performs as well as the high order scheme that costs significantly more flops. There certainly is room for improvement, but the code outperforms Lee's implementation in FLASH already with the low order scheme. 

We are running the CT scheme at every sub-step of the  RK4 time integrator, so my guess is the simple estimate from Ryu's scheme is just good enough to get a non-diffusive field. With these results I won't even test 3D leave a higher order CT scheme for a future paper. We would have the flops to spare, but at the moment its just not worth it.

3D Visualization

by Julius Donnert

In the last weeks, I've been busy developing the visualization infrastructure for 3D Wombat data in Julialang. We plan to use it for all our visualization and post processing, including Terabyte-sized snapshots on a cluster. Here is the 3D blob in a shock tube test visualized with less than 20 lines of Julialang:

Commonly, scientists use a low level language like C/C++ to compute projection maps or ray casted images from their simulations and then produce a graph or movie with a high level language like Python or IDL. 

Here, Julialang enables us to solve this two language problem. For embarrassingly parallel problems, we skip the low level language entirely. Because Julialang is fast and node parallel, so it can process large amounts of 3D data on the fly. This increases productivity of the user dramatically, because debugging is much easier in a high level language than in C or Fortran. 

Julialang makes the user "compiler aware", the language can show why simple Julia code is faster than simple Python or IDL code. Bachelor and master projects usually consist of running WOMBAT, followed by lots of post processing in a high level language. With its introspection abilities, Julialang allows us to introduce students to HPC concepts like vectorization and parallelism in a fun and easy way that does not require sifting through log files. Our visualization routines are already thread parallel (1 macro) and vectorize where possible (2 macros).  

JD