We initialized the values of water height and are ready to get to the core of our simulation–iterating forward in time and calculating the solution at each time step. This consists of two steps:
- Calculate the spatial difference of water height (
dh), including the periodic boundary condition. - Use
dhto calculate the new values of water heighth.
The following listing provides the core of the solver.
Listing 2.12 Iterating the solution forward in time
time_loop: do n = 1, num_time_steps ❶
dh(1) = h(1) - h(grid_size) ❷
do i = 2, grid_size
dh(i) = h(i) - h(i-1) ❸
end do
do i = 1, grid_size
h(i) = h(i) - c * dh(i) / dx * dt ❹
end do
end do time_loop
❶ Iterates over num_time_steps time steps
❷ Applies the periodic boundary condition on the left
❸ Calculates the finite difference of h in space
❹ Evaluates h at the next time step
The outer loop, what we call time_loop, increments the integer n from 1 to num_ time_steps. Although we’re not using n anywhere inside the body of the loop, we use this loop to repeat the body for num_time_steps times. Inside the time_loop, we perform two calculations:
- We calculate the difference of
hin space and store it in the arraydh. We do this in two separate steps:- We calculate the value of
dh(1), which corresponds to the element on the left edge of the domain. Because we’re applying periodic (cyclic) boundary conditions,dh(1)depends on the value ofhfrom the right edge of the domain. - We loop over the remaining elements (from
2togrid_size) and setdh(i)to the difference ofhin space (h(i)-h(i-1)).
- We calculate the value of
- Once we have the array
dhcomputed, we use it to compute the new value ofhand update it. Here we don’t need to store the value ofhfor every time step, and we overwrite the old values with new ones.
Fortran follows the same operator precedence rules as general arithmetic:
- Exponentiation (
**) is evaluated first. - Multiplication (
*) and division (/) are evaluated second. - Addition (
+) and subtraction (-) are evaluated last. - Finally, the precedence can be overridden with parentheses.
Furthermore, Fortran operations of equal precedence are evaluated left to right. For example, this expression
h(i) = h(i) - c * dh(i) / dx * dt
h(i) = h(i) - (((c * dh(i)) / dx) * dt)
A few pages back, I asked you if it’s possible to parallelize this loop in a trivial way:
do i = 1, grid_size
h(i) = exp(-decay * (i - icenter)**2)
end do
What you should look for is whether any iteration depends on data calculated in any other iteration. Here, the right side depends only on the loop counter i and parameters decay and icenter, whereas the variable on the left side (h(i)) is not used on the right side. Could every iteration be carried out in any order without changing the final result? If yes, the computation can be easily parallelized.
The first step is to inform the compiler that this section of the code can be executed in any order it finds most optimal. Fortran 2008 introduced the do concurrent construct for this purpose. This construct uses a slightly modified syntax, as shown in the following listing.
Listing 2.13 Using do concurrent for embarrassingly parallel calculations
do concurrent (i = 1:grid_size)
h(i) = exp(-decay * (i - icenter)**2)
end do
Here, we use a (i = 1:grid_size) syntax instead of i = 1, grid_size. We’ll cover this in more detail in chapter 6, but for now we’ll use this syntax to promote all our parallelizable loops to do concurrent.
What do concurrent is and what it isn’t
What does do concurrent do exactly? It’s a promise from programmer to compiler that the code inside the loop can be safely vectorized or parallelized. In practice, a good compiler would do this using a system threading library or SIMD machine instructions if available.
do concurrent by no means guarantees that the loop will run in parallel! In cases such as short loops with simple computations, the compiler may determine that serial execution would be more efficient. We’ll study explicit, distributed-memory parallelism with coarrays in chapter 7. For now, we use do concurrent as a note for both ourselves and the compiler that some regions of the code are safe to parallelize.
Inside of time_loop, can you find any other loops that could be expressed using do concurrent? If yes, use the modified syntax to rewrite them as do concurrent loops.