Objectives

Let’s set some simple objectives for our data analysis exercise. They should be challenging enough to require data decomposition and communication between parallel processes, but simple enough that we don’t get bogged down in the details of math or statistics. To that end, I’d like to know:

What was the maximum measured wind speed in the Gulf of Mexico in the 2005-2017 period? Which buoy recorded the maximum value?
Which buoys had the strongest average winds, and which had the lowest average winds? What were their respective values?

To find the answers, we’ll need our program to have a few elements:

Reading each CSV file and storing the wind speed data in arrays
Finding the maximum and mean (average) wind speed values for each buoy
Comparing the maximum and mean wind speed between all buoys
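
As a rough illustration of these three elements, here is a minimal serial sketch in Python. The language, the column name wind_speed, the station IDs A, B, C and the sample values are all assumptions made for this example. They are not taken from the actual buoy files, whose layout may differ.

```python
import csv
import io


def read_wind_speeds(stream, column="wind_speed"):
    """Read one buoy's CSV and return its wind speeds as a list of floats.

    The column name is a hypothetical placeholder; adjust it to the real files.
    """
    reader = csv.DictReader(stream)
    return [float(row[column]) for row in reader if row[column] != ""]


def buoy_stats(speeds):
    """Return (maximum, mean) wind speed for a single buoy."""
    return max(speeds), sum(speeds) / len(speeds)


def compare_buoys(stats):
    """Given {station: (max, mean)}, return the stations with the
    highest maximum, the highest mean, and the lowest mean."""
    max_station = max(stats, key=lambda s: stats[s][0])
    high_mean_station = max(stats, key=lambda s: stats[s][1])
    low_mean_station = min(stats, key=lambda s: stats[s][1])
    return max_station, high_mean_station, low_mean_station


# Made-up sample data standing in for three CSV files.
samples = {
    "A": "time,wind_speed\n2005-01-01,3.0\n2005-01-02,9.0\n",
    "B": "time,wind_speed\n2005-01-01,7.0\n2005-01-02,7.0\n2005-01-03,7.0\n",
    "C": "time,wind_speed\n2005-01-01,2.0\n2005-01-02,4.0\n",
}

stats = {
    station: buoy_stats(read_wind_speeds(io.StringIO(text)))
    for station, text in samples.items()
}
max_station, high_mean_station, low_mean_station = compare_buoys(stats)

print("Maximum wind speed measured is", stats[max_station][0], "at station", max_station)
print("Highest mean wind speed is", stats[high_mean_station][1], "at station", high_mean_station)
print("Lowest mean wind speed is", stats[low_mean_station][1], "at station", low_mean_station)
```

With real data, io.StringIO(text) would be replaced by an open file handle for each buoy's CSV.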

You could program each of these tasks without parallel considerations. However, if we execute this program serially, each file will be processed in order, one at a time. With many large files, this approach can become impractically slow. This is where parallel data decomposition will come to our aid!
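
To make the idea of data decomposition concrete, here is a small Python sketch under stated assumptions. The buoys are split into chunks, each worker finds the maximum within its own chunk, and the partial results are then combined. The function names, station IDs and values are hypothetical, and a thread pool is used only to keep the example portable. It may not match the parallel mechanism used elsewhere in this exercise.

```python
from concurrent.futures import ThreadPoolExecutor


def decompose(items, n_workers):
    """Split items into n_workers contiguous chunks of nearly equal size."""
    base, extra = divmod(len(items), n_workers)
    chunks, start = [], 0
    for worker in range(n_workers):
        # The first `extra` workers each take one additional item.
        size = base + (1 if worker < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks


def local_max(chunk):
    """Worker task: return (max speed, station) over this worker's buoys."""
    return max((max(speeds), station) for station, speeds in chunk)


def parallel_max(data, n_workers=2):
    """Find the overall (max speed, station) by combining per-worker results."""
    # Drop empty chunks, which occur when there are more workers than buoys.
    chunks = [c for c in decompose(list(data.items()), n_workers) if c]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(local_max, chunks))
    # Combining the partial results is the "communication" step.
    return max(partials)


# Made-up wind speeds for three hypothetical buoys.
data = {"A": [3.0, 9.0], "B": [7.0, 7.0, 7.0], "C": [2.0, 4.0]}
print(parallel_max(data))
```

Each worker only ever touches its own chunk, so the only shared step is the final comparison of one small result per worker.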

If we implement our program correctly, we should get output like this:

Maximum wind speed measured is    40.9000015     at station 42001
Highest mean wind speed is    6.47883749     at station 42020
Lowest mean wind speed is    5.43456125     at station 42036
