Exercise - Midterm revision
The main goal of this exercise is to revise as many as possible of the topics we have discussed in class so far.
You can download the necessary data files for the exercise here.
Write a module wind_timeseries.py that does not execute when imported (import wind_timeseries), but performs the steps outlined below when run as a script (%run wind_timeseries).
Define two global variables that contain the directory where you saved the data files (a string) and the filenames (a tuple).
Loop over all data files and read in the data (point 4), quality control the data (point 5), calculate wind speed (point 6), and plot the wind speed time series (point 7).
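The module layout described in the first three points can be sketched as follows. This is only a minimal skeleton, not a full solution: the directory name, the filenames, and the helper names build_paths and main are hypothetical placeholders, and the loop body only prints the path instead of processing the file. It also shows one way to build each path with os.path.join.

```python
import os

# Hypothetical values: replace with your own directory and filenames.
DATA_DIR = "data"
FILENAMES = ("sonic_a.csv", "sonic_b.csv")


def build_paths(data_dir, filenames):
    """Join the directory and each filename into a full path."""
    return [os.path.join(data_dir, name) for name in filenames]


def main():
    """Loop over all data files (processing steps left as a placeholder)."""
    for path in build_paths(DATA_DIR, FILENAMES):
        # Reading, quality control, wind speed, and plotting would go here.
        print(f"would process {path}")


# This condition is true when the file is run as a script,
# but false when the module is imported.
if __name__ == "__main__":
    main()
```

Keeping the loop inside a function that is only called under the __name__ check means that importing the module defines the functions and globals without processing any files.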
Use the function read_data at the end of this document to read in the data (either copy and paste it into your module or put it in a separate module and import it into your module). Create the filename using the os.path module.
Write a function to quality control the data.
The function should take one positional argument (the numpy array containing the wind components) and one keyword argument (the numpy array containing the diagnostic flags, with a default value of None). Make sure that the keyword argument cannot be passed as a positional argument.
If the keyword argument is provided, find all the indices where the diagnostic flag is larger than 0 using np.where. Replace the wind components at these times with np.nan. Remember that the indices refer to all three wind components.
Replace all wind components whose magnitude is larger than 10 m s\(^{-1}\) (the components can also be negative) with np.nan.
Do you need to return the modified array containing the wind components with return to make the changes available to the parent function, or not?
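The techniques named in this point can be tried out on a tiny synthetic array. The sketch below is one possible approach, not the reference solution: the function name quality_control, the constant MAX_WIND, and the sample values are all made up for illustration. The bare * in the signature makes diag keyword-only, and the array is changed through index assignment.

```python
import numpy as np

MAX_WIND = 10.0  # threshold in m/s, taken from the exercise text


def quality_control(uvw, *, diag=None):
    """Set flagged or out-of-range wind samples to NaN (hypothetical name)."""
    if diag is not None:
        # For a 1D condition, np.where returns a tuple with one index array.
        bad_rows = np.where(diag > 0)[0]
        # Selecting whole rows blanks all three components at those times.
        uvw[bad_rows, :] = np.nan
    # Compare the absolute value so large negative components are caught too.
    uvw[np.abs(uvw) > MAX_WIND] = np.nan


# Synthetic test data: three time steps, three wind components.
uvw = np.array([[1.0, 2.0, 0.5],
                [3.0, -12.0, 0.1],
                [4.0, 5.0, 0.2]])
diag = np.array([0, 0, 1])

quality_control(uvw, diag=diag)
print(uvw)
```

Printing uvw after the call, without assigning any return value, is a quick way to check for yourself whether the caller's array was changed. Calling quality_control(uvw, diag) raises a TypeError, because diag can only be given by name.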
Write a function to calculate the wind speed.
The function takes two positional arguments: the numpy array containing the wind components and the list of variables.
Calculate the wind speed from the three wind components and add the new 1D array to the array containing the wind components so that it now has the shape 3600 x 4.
Add the new variable name ‘wspd’ to the list of variable names.
Do you need to return the modified array containing the wind components and the modified list of variable names with return to make the changes available to the parent function?
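A small sketch of this step, under two assumptions: wind speed is taken to be the magnitude of the three-component wind vector, and the function name add_wind_speed and the sample values are invented for the example. It uses np.column_stack to attach the new column and list.append to extend the variable names.

```python
import numpy as np


def add_wind_speed(uvw, variables):
    """Append wind speed as a fourth column (hypothetical helper)."""
    # Assumed definition: magnitude of the wind vector for every sample.
    wspd = np.sqrt(np.sum(uvw**2, axis=1))
    # column_stack creates a new array; the input array keeps its shape.
    wind = np.column_stack((uvw, wspd))
    # append changes the list object that was passed in.
    variables.append("wspd")
    return wind, variables


# Synthetic data: two time steps instead of the 3600 in a real file.
uvw = np.array([[3.0, 4.0, 0.0],
                [1.0, 2.0, 2.0]])
names = ["us", "vs", "ws"]

wind, names = add_wind_speed(uvw, names)
print(uvw.shape, wind.shape, names)
```

Comparing uvw.shape and wind.shape after the call shows that the array and the list behave differently here, which is worth keeping in mind when you answer the question about return.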
Write a function to plot the wind speed.
The function takes three positional arguments: a 1D numpy array containing the wind speed, a string containing the variable name (from the list of variables), and the timestamp.
Add a title to the plot that reads, e.g., ‘20-Hz wspd - 2023-06-11 14:00’, where the variable name (‘wspd’) and the timestamp are supposed to come from the function arguments.
Save the plot as a png file.
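The plotting step could look roughly like the sketch below. It assumes matplotlib is available; the function names, the axis labels, and the output filename pattern are illustrative choices rather than requirements of the exercise. The timestamp is formatted with strftime, which works for both datetime objects and the pandas timestamp returned by read_data.

```python
from datetime import datetime

import matplotlib.pyplot as plt
import numpy as np


def make_title(name, timestamp):
    """Build a title such as '20-Hz wspd - 2023-06-11 14:00'."""
    return f"20-Hz {name} - {timestamp.strftime('%Y-%m-%d %H:%M')}"


def plot_wind_speed(wspd, name, timestamp):
    """Plot a wind speed time series and save it as a png file."""
    fig, ax = plt.subplots()
    ax.plot(wspd)
    ax.set_xlabel("Sample number")
    ax.set_ylabel(f"{name} (m s$^{{-1}}$)")
    ax.set_title(make_title(name, timestamp))
    # Hypothetical filename pattern; returned only for convenience.
    outfile = f"{name}_{timestamp.strftime('%Y%m%d_%H%M')}.png"
    fig.savefig(outfile)
    plt.close(fig)  # free the figure when plotting inside a loop
    return outfile


# Synthetic example data, not real measurements.
example_time = datetime(2023, 6, 11, 14, 0)
outfile = plot_wind_speed(np.linspace(1.0, 3.0, 50), "wspd", example_time)
print(outfile)
```

Closing each figure after saving keeps memory use flat when the function is called once per data file.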
Make sure that your code uses meaningful variable names and follows the PEP 8 guidelines regarding indentation, line breaks, whitespace, and import statements.
Function to read in the data
import numpy as np
import pandas as pd


def read_data(ecfile):
    """Read csv file with EC data and store the wind data in a numpy array."""
    # read csv file into pandas dataframe
    data = pd.read_csv(ecfile, index_col=0, parse_dates=True)
    # get the variable names from the dataframe columns
    wind_variables = data.columns.to_list()
    wind_variables.remove('diag')
    # store the three wind components in a single 2D numpy array
    usonic = data['us'].to_numpy()
    vsonic = data['vs'].to_numpy()
    wsonic = data['ws'].to_numpy()
    uvw = np.concatenate(
        (usonic[..., np.newaxis],
         vsonic[..., np.newaxis],
         wsonic[..., np.newaxis]),
        axis=1)
    # sonic diagnostic flag (bad data if > 0)
    diag = data['diag'].to_numpy()
    # timestamp at the start of the data file
    timestamp = data.index[0]
    return uvw, diag, wind_variables, timestamp