Programming project¶
Instructions¶
The programming tasks described below have to be done in groups of two. It's totally okay to ask others for help or advice (also me!), but copy-pasting from your classmates is strictly forbidden. You will see that some tasks require internet searches that go beyond what you've learned in my class notes. This is expected: my goal is for you to be able to solve new problems more or less independently.
Project due date: Sunday 11.06.2023, 23:59 (the return folders on OLAT will close at precisely this time - no exceptions!)
Please return two files:
- the full notebook as `.ipynb`, executed with all outputs (I recommend executing "Restart kernel and run all cells" one last time before submission)
- any additional file you may have downloaded for your project (as a zip or tar file containing several files if too large)

Please return only one project per group (one of the two has to upload the files).
Grading (30% of the final VU grade) will be based on the following criteria:
- correctness of the code (the notebook should theoretically run on my laptop as well, assuming that the data files are in the folder)
- correctness of the solutions (correct plot, correct values, etc.)
- for the plots: correct use of title, axis labels, units, etc.
- clarity of the code (no unused lines of code, presence of inline comments, etc.)
- clarity and layout of the overall document (correct use of the Jupyter Notebook markdown format)
- originality (for the open project)
You can download this notebook to get started. Delete this "Instructions" section, add your name and Matrikelnummer and start coding! For the tasks that ask for a written answer (e.g. some values, answers, etc.), please use the notebook's markdown format to write them down!
Preamble¶
- Names: Korbinian Kerschner & Sarah Schmidhuber
- Matrikelnummer: 62201248 & 12243159
```python
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec
from windrose import WindroseAxes
```
01 - An energy balance model with hysteresis¶
Based on the code we wrote in week 07, code the following extension. For this exercise it's OK to copy-paste my solutions and start from there. Only copy what is necessary (no useless code)!
The planetary albedo $\alpha$ is in fact changing with climate change. As the temperature drops, sea ice and ice sheets extend (increasing the albedo). Conversely, the albedo decreases as the temperature rises. The planetary albedo of our simple energy balance model follows the following equation:
$$ \alpha = \begin{cases} 0.3,& \text{if } T \gt 280\\ 0.7,& \text{if } T \lt 250\\ a T + b, & \text{otherwise} \end{cases} $$
01-01: Compute the parameters $a$ and $b$ so that the equation is continuous at T=250K and T=280K.
We need to solve this system of equations:

$$a \cdot 280 + b = 0.3$$
$$a \cdot 250 + b = 0.7$$

We can do this using a matrix equation.
```python
v1 = np.array([[280, 1], [250, 1]])
v2 = np.array([[0.3], [0.7]])
a, b = np.linalg.solve(v1, v2)  # solving the matrix equation
print(f"The parameter a is: {a}\nThe parameter b is: {b}")
```
The parameter a is: [-0.01333333] The parameter b is: [4.03333333]
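As a quick hand check: the slope of the line through the two points is $a = \frac{0.3 - 0.7}{280 - 250} = -\frac{1}{75} \approx -0.01333\,\mathrm{K^{-1}}$ and $b = 0.7 - 250\,a \approx 4.033$, which matches the numerical solution above.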
01-02: Now write a function called `alpha_from_temperature` which accepts a single positional parameter `T` as input (a scalar) and returns the corresponding albedo. Test your function using doctests to make sure that it complies with the instructions above.
```python
def alpha_from_temperature(T):
    """Compute the albedo for the temperature T (K).

    >>> result = alpha_from_temperature(290)
    >>> np.isclose(result, 0.3)
    True
    >>> result = alpha_from_temperature(240)
    >>> np.isclose(result, 0.7)
    True
    >>> result = alpha_from_temperature(280)
    >>> np.isclose(result, 0.3)
    True
    >>> result = alpha_from_temperature(270)
    >>> np.isclose(result, 0.433333)
    True
    """
    if T < 250:
        alpha = 0.7
    elif T > 280:
        alpha = 0.3
    else:
        alpha = float(a) * T + float(b)
    return alpha

import doctest
doctest.testmod()
```
TestResults(failed=0, attempted=8)
01-03: Adapt the existing code from week 07 to write a function called `temperature_change_with_hysteresis` which accepts `t0` (the starting temperature in K) and `n_years` (the number of simulation years) as positional arguments and `tau` (the atmosphere transmissivity) as keyword argument (default value 0.611). Verify that:
- the stabilization temperature with `t0 = 292` and default tau is approximately 288K
- the stabilization temperature with `t0 = 265` and default tau is approximately 233K
Functions asr() and olr() from week 07:
```python
def asr(alpha=0.3):
    """Absorbed Solar Radiation (W m-2).

    Parameters
    ----------
    alpha : float, optional
        the planetary albedo

    Returns
    -------
    the asr (float)

    Examples
    --------
    >>> print(f'{asr():.2f}')
    238.35
    >>> print(f'{asr(alpha=0.32):.2f}')
    231.54
    """
    s0 = 1362
    return (1 - alpha) * s0 / 4

# Testing
import doctest
doctest.testmod()
```
TestResults(failed=0, attempted=10)
```python
def olr(t, tau=0.611):
    """Outgoing Longwave Radiation (W m-2).

    Parameters
    ----------
    t : float
        the atmosphere temperature (K)
    tau : float, optional
        the atmosphere transmissivity (-)

    Returns
    -------
    the olr (float)

    Examples
    --------
    >>> print(f'{olr(288):.2f}')
    238.34
    >>> print(f'{olr(288, tau=0.63):.2f}')
    245.75
    """
    sigma = 5.67E-8
    return sigma * tau * t**4

# Testing
import doctest
doctest.testmod()
```
TestResults(failed=0, attempted=12)
Now the new function, taking into account the change of alpha caused by the temperature change:
```python
def temperature_change_with_hysteresis(t0, n_years, tau=0.611):
    C = 4.0e+08  # heat capacity (J m-2 K-1)
    dt = 60 * 60 * 24 * 365  # time step: one year in seconds
    years = np.arange(n_years + 1)
    temperature = np.zeros(n_years + 1)
    temperature[0] = t0
    for i in range(n_years):
        # explicit time stepping: dT = dt / C * (ASR - OLR),
        # with the albedo recomputed from the current temperature
        temperature[i + 1] = temperature[i] + dt / C * (
            asr(alpha=alpha_from_temperature(temperature[i]))
            - olr(temperature[i], tau=tau))
    return years, temperature
```
```python
# calculate the temp after 10000 years starting with t0 = 292K
t0_292 = temperature_change_with_hysteresis(292, 10000)[1][-1]
print(f"This way we verify that the stabilization temperature with `t0 = 292` and default tau is approximately {round(t0_292)}K")
```
This way we verify that the stabilization temperature with `t0 = 292` and default tau is approximately 288K
```python
# calculate the temp after 10000 years starting with t0 = 265K
t0_265 = temperature_change_with_hysteresis(265, 10000)[1][-1]
print(f"This way we verify that the stabilization temperature with `t0 = 265` and default tau is approximately {round(t0_265)}K")
```
This way we verify that the stabilization temperature with `t0 = 265` and default tau is approximately 233K
01-04: Realize a total of N simulations with starting temperatures regularly spaced between `t0` = 206K and `t0` = 318K and plot them on a single plot for `n_years` = 50. The plot should look somewhat similar to this example for N = 21.
```python
N = 21
spaced_temp = np.linspace(206, 318, N)
for t0 in spaced_temp:
    years, temps = temperature_change_with_hysteresis(t0, 50)
    plt.plot(years, temps)
plt.title("Climate change scenario with hysteresis")
plt.xlabel("Years")
plt.ylabel("Temperature (K)")
```
Text(0, 0.5, 'Temperature (K)')
You can clearly see that there are two different stabilization temperatures, one at 288K and the other at 233K, as we verified in 01-03. Depending on the initial temperature t0, the system approaches one of them. When t0 ranges from 206K to 318K, as it does in the plot above, the stabilization temperatures are roughly reached after 40 years.
Bonus: only if you want (and if time permits), you can try to increase N and add colors to your plot to create a graph similar to this one.
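A possible approach for the bonus is sketched below: we increase N and color each line by its starting temperature. The colormap choice (`coolwarm`) and N = 101 are our assumptions, not part of the task.

```python
# bonus sketch: more simulations, colored by starting temperature t0
N = 101
spaced_temp = np.linspace(206, 318, N)
cmap = plt.cm.coolwarm  # blue = cold start, red = warm start
for i, t0 in enumerate(spaced_temp):
    years, temps = temperature_change_with_hysteresis(t0, 50)
    plt.plot(years, temps, color=cmap(i / (N - 1)))
plt.title("Climate change scenarios with hysteresis")
plt.xlabel("Years")
plt.ylabel("Temperature (K)")
```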
02 - Weather station data files¶
I downloaded 10 min data from the recently launched ZAMG data hub. The data file contains selected parameters from the "INNSBRUCK-FLUGPLATZ (ID: 11804)" weather station.
You can download the data from the following links (right-click + "Save as..."):
- station data
- parameter metadata
- station list from the ZAMG (in a better format than last time)
Let me open the data for you and display its content:
```python
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
```
```python
df = pd.read_csv('INNSBRUCK-FLUGPLATZ_Datensatz_20150101_20211231.csv',
                 index_col=1, parse_dates=True)
df = df.drop('station', axis=1)
display(df)
```
| time | DD | FF | GSX | P | RF | RR | SO | TB1 | TB2 | TB3 | TL | TP |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2015-01-01 00:00:00 | 232.0 | 1.0 | 0.0 | 965.7 | 96.0 | 0.0 | 0.0 | 3.2 | 3.6 | 5.2 | -2.7 | -3.4 |
| 2015-01-01 00:10:00 | 248.0 | 1.6 | 0.0 | 965.6 | 97.0 | 0.0 | 0.0 | 3.2 | 3.6 | 5.2 | -2.4 | -2.9 |
| 2015-01-01 00:20:00 | 252.0 | 1.6 | 0.0 | 965.6 | 97.0 | 0.0 | 0.0 | 3.2 | 3.6 | 5.2 | -2.4 | -2.9 |
| 2015-01-01 00:30:00 | 267.0 | 1.3 | 0.0 | 965.7 | 97.0 | 0.0 | 0.0 | 3.2 | 3.6 | 5.2 | -2.7 | -3.2 |
| 2015-01-01 00:40:00 | 279.0 | 1.4 | 0.0 | 965.7 | 97.0 | 0.0 | 0.0 | 3.2 | 3.6 | 5.2 | -2.5 | -3.0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 2021-12-31 23:10:00 | 257.0 | 1.1 | 0.0 | 959.8 | 98.0 | 0.0 | 0.0 | 3.1 | 3.4 | 3.4 | -1.7 | -1.7 |
| 2021-12-31 23:20:00 | 106.0 | 0.3 | 0.0 | 960.0 | 100.0 | 0.0 | 0.0 | 3.0 | 3.4 | 3.4 | -2.0 | -2.0 |
| 2021-12-31 23:30:00 | 141.0 | 0.5 | 0.0 | 960.2 | 100.0 | 0.0 | 0.0 | 3.0 | 3.4 | 3.4 | -1.7 | -1.7 |
| 2021-12-31 23:40:00 | 167.0 | 0.5 | 0.0 | 960.2 | 100.0 | 0.0 | 0.0 | 3.0 | 3.4 | 3.4 | -1.6 | -1.6 |
| 2021-12-31 23:50:00 | 275.0 | 0.3 | 0.0 | 960.1 | 100.0 | 0.0 | 0.0 | 3.0 | 3.4 | 3.4 | -1.6 | -1.7 |

368208 rows × 12 columns
In the given code snippet, parse_dates=True is used in conjunction with the index_col=1 argument. This means that the second column of the CSV file will be parsed and used as the index column of the resulting DataFrame, with the values converted into datetime objects.
For example, assuming the CSV file contains a column named 'time' with date-like strings, using parse_dates=True will result in the 'time' column being converted to a datetime index in the DataFrame, allowing for easier manipulation and analysis of time series data.
Note that if you want to specify specific columns to be parsed as dates instead of all columns, you can pass a list of column names or indices to the parse_dates parameter instead of using parse_dates=True.
02-01: after reading the documentation of the respective functions (and maybe trying a few things yourself), explain in plain sentences:
- what am I asking pandas to do with the `index_col=1, parse_dates=True` keyword arguments? Why am I doing this?
- what am I asking pandas to do with `.drop()`? Why `axis=1`?
Answers:
- `index_col=1`: column 1 (here: the datetime column) is used as the index of the DataFrame. With `parse_dates=True` this column is parsed into datetime objects.
- `.drop()` deletes a row or a column; in our case the column 'station', because `axis=1` refers to columns.
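A quick check (a sketch, added by us to illustrate the effect of both arguments):

```python
# after reading with index_col=1, parse_dates=True and dropping 'station':
print(type(df.index))           # a DatetimeIndex, not plain strings
print('station' in df.columns)  # False: the column was dropped
```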
Now let me do something else for you:
```python
dfmeta = pd.read_csv('ZEHNMIN Parameter-Metadaten.csv', index_col=0)
dfmeta.loc[df.columns]
```
| | Kurzbeschreibung | Beschreibung | Einheit |
|---|---|---|---|
| DD | Windrichtung | Windrichtung, vektorielles Mittel über 10 Minuten | ° |
| FF | vektorielle Windgeschwindigkeit | Windgeschwindigkeit, vektorielles Mittel über ... | m/s |
| GSX | Globalstrahlung | Globalstrahlung, arithmetisches Mittel über 10... | W/m² |
| P | Luftdruck | Luftdruck, Basiswert zur Minute10 | hPa |
| RF | Relative Feuchte | Relative Luftfeuchte, Basiswert zur Minute10 | % |
| RR | Niederschlag | 10 Minuten Summe des Niederschlags, Summe der ... | mm |
| SO | Sonnenscheindauer | Sonnenscheindauer, Sekundensumme über 10 Minuten | s |
| TB1 | Erdbodentemperatur in 10cm Tiefe | Erdbodentemperatur in 10cm Tiefe, Basiswert zu... | °C |
| TB2 | Erdbodentemperatur in 20cm Tiefe | Erdbodentemperatur in 20cm Tiefe, Basiswert zu... | °C |
| TB3 | Erdbodentemperatur in 50cm Tiefe | Erdbodentemperatur in 50cm Tiefe, Basiswert zu... | °C |
| TL | Lufttemperatur in 2m | Lufttemperatur in 2m Höhe, Basiswert zur Minute10 | °C |
| TP | Taupunkt | Taupunktstemperatur, Basiswert zur Minute10 | °C |
02-02: again, explain in plain sentences what `dfmeta.loc[df.columns]` is doing, and why it works that way.
- `df.columns` contains the column names of the df DataFrame.
- `dfmeta.loc[]` selects all rows of the dfmeta DataFrame whose index labels are given in the brackets.

Since the column names of df are passed as labels in `dfmeta.loc[df.columns]`, all rows of dfmeta whose index matches one of these names are selected. This way we get a DataFrame containing the explanations of the abbreviations used as column names in df; the source of these explanations is the dfmeta DataFrame.
Finally, let me do a last step for you before you start coding:
```python
dfh = df.resample('H').mean()
```
02-03: explore the `dfh` dataframe. Explain, in plain words, what the purpose of `.resample('H')` followed by `.mean()` is. Explain what `.resample('H').max()` and `.resample('H').sum()` would do.
- `.resample('H').mean()`: the datetime index now has hourly steps, and each column contains the mean of that hour.
- `.resample('H').max()`: this would also change the timestamps to hourly, but this time each value is the maximum that occurred within the hour.
- `.resample('H').sum()`: each value is now the sum over the hour.
02-04: Using `np.allclose`, make sure that the average of the first hour (that you'll compute yourself from `df`) is indeed equal to the first row of `dfh`. Now, two variables in the dataframe have units that aren't suitable for averaging. Please convert the following variables to the correct units:
- `RR` needs to be converted from the average of 10 min sums to mm/h
- `SO` needs to be converted from the average of 10 min sums to s/h
From now on, we will use the hourly data only (and further aggregations when necessary). The 10 min data are great but require a little bit more pandas kung fu (the Chinese term, not the sport) to be used efficiently.

Spend some time exploring the `dfh` dataframe we just created. What time period does it cover? What variables does it contain?

Note on pandas: all the exercises below can be done with or without pandas. Each question can be answered with very few lines of code (often one or two) with pandas, and I recommend to use it as much as possible. If you want, you can always use numpy in case of doubt: you can access the data as a numpy array with `df[column_name].values`.
Checking whether the average of the first hour (computed ourselves) is equal to the first row of dfh:
```python
df_fh = df.head(6)       # the first 6 rows of the 10-min df --> the first hour
mean_f6 = df_fh.mean()   # the mean of the first hour
np.allclose(mean_f6, dfh.iloc[0])  # compare the computed mean to the first row of dfh
```
True
This means that the average of the first hour (that we computed ourselves from df) is indeed equal to the first row of dfh.
```python
# replace the data in the columns RR and SO by hourly sums
dfh[["RR", "SO"]] = df[["RR", "SO"]].resample('H').sum()
dfh
```
| time | DD | FF | GSX | P | RF | RR | SO | TB1 | TB2 | TB3 | TL | TP |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2015-01-01 00:00:00 | 257.333333 | 1.383333 | 0.0 | 965.666667 | 97.000000 | 0.0 | 0.0 | 3.200000 | 3.600000 | 5.2 | -2.533333 | -3.066667 |
| 2015-01-01 01:00:00 | 160.666667 | 0.766667 | 0.0 | 965.966667 | 97.666667 | 0.0 | 0.0 | 3.150000 | 3.600000 | 5.2 | -3.250000 | -3.716667 |
| 2015-01-01 02:00:00 | 227.000000 | 0.816667 | 0.0 | 966.316667 | 98.500000 | 0.0 | 0.0 | 3.100000 | 3.600000 | 5.2 | -3.583333 | -3.933333 |
| 2015-01-01 03:00:00 | 258.833333 | 1.933333 | 0.0 | 966.466667 | 97.000000 | 0.0 | 0.0 | 3.100000 | 3.600000 | 5.2 | -3.250000 | -3.766667 |
| 2015-01-01 04:00:00 | 241.500000 | 1.266667 | 0.0 | 966.416667 | 95.166667 | 0.0 | 0.0 | 3.100000 | 3.600000 | 5.2 | -3.116667 | -3.883333 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 2021-12-31 19:00:00 | 283.500000 | 1.166667 | 0.0 | 957.766667 | 98.833333 | 0.0 | 0.0 | 3.316667 | 3.300000 | 3.4 | 0.900000 | 0.716667 |
| 2021-12-31 20:00:00 | 237.333333 | 0.716667 | 0.0 | 958.100000 | 99.500000 | 0.0 | 0.0 | 3.283333 | 3.300000 | 3.4 | -0.200000 | -0.350000 |
| 2021-12-31 21:00:00 | 149.833333 | 1.083333 | 0.0 | 958.783333 | 99.666667 | 0.0 | 0.0 | 3.200000 | 3.300000 | 3.4 | -0.800000 | -0.916667 |
| 2021-12-31 22:00:00 | 175.833333 | 0.516667 | 0.0 | 959.433333 | 99.166667 | 0.0 | 0.0 | 3.100000 | 3.383333 | 3.4 | -1.083333 | -1.200000 |
| 2021-12-31 23:00:00 | 198.833333 | 0.600000 | 0.0 | 960.016667 | 99.500000 | 0.0 | 0.0 | 3.033333 | 3.400000 | 3.4 | -1.683333 | -1.716667 |

61368 rows × 12 columns
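Note: since each hour contains exactly six 10-minute values, multiplying the hourly means by 6 would achieve the same conversion (assuming no missing values):

```python
# equivalent conversion (do not run both!): mean of six 10-min sums * 6 = hourly sum
# dfh[['RR', 'SO']] = dfh[['RR', 'SO']] * 6
```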
```python
# What time period does it cover?
print(f"The dfh DataFrame covers the time period from {dfh.index[0]} to {dfh.index[-1]}")
print("It covers information on:")
display(dfmeta.loc[df.columns])
```
The dfh DataFrame covers the time period from 2015-01-01 00:00:00 to 2021-12-31 23:00:00 It covers information on:
| | Kurzbeschreibung | Beschreibung | Einheit |
|---|---|---|---|
| DD | Windrichtung | Windrichtung, vektorielles Mittel über 10 Minuten | ° |
| FF | vektorielle Windgeschwindigkeit | Windgeschwindigkeit, vektorielles Mittel über ... | m/s |
| GSX | Globalstrahlung | Globalstrahlung, arithmetisches Mittel über 10... | W/m² |
| P | Luftdruck | Luftdruck, Basiswert zur Minute10 | hPa |
| RF | Relative Feuchte | Relative Luftfeuchte, Basiswert zur Minute10 | % |
| RR | Niederschlag | 10 Minuten Summe des Niederschlags, Summe der ... | mm |
| SO | Sonnenscheindauer | Sonnenscheindauer, Sekundensumme über 10 Minuten | s |
| TB1 | Erdbodentemperatur in 10cm Tiefe | Erdbodentemperatur in 10cm Tiefe, Basiswert zu... | °C |
| TB2 | Erdbodentemperatur in 20cm Tiefe | Erdbodentemperatur in 20cm Tiefe, Basiswert zu... | °C |
| TB3 | Erdbodentemperatur in 50cm Tiefe | Erdbodentemperatur in 50cm Tiefe, Basiswert zu... | °C |
| TL | Lufttemperatur in 2m | Lufttemperatur in 2m Höhe, Basiswert zur Minute10 | °C |
| TP | Taupunkt | Taupunktstemperatur, Basiswert zur Minute10 | °C |
03 - Precipitation¶
In this section, we will focus on precipitation only.
03-01: Compute the average annual precipitation (m/year) over the 7-year period.
1 | dfh["RR"].sum()/7/1000 |
0.920557142857143
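A cross-check (a sketch): computing the annual sums per calendar year first and then averaging them should give the same result, since the record covers full years:

```python
annual = dfh['RR'].resample('A').sum() / 1000  # annual precipitation in m/year
print(annual.mean())
```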
03-02: What is the smallest non-zero precipitation measured at the station? What is the maximum hourly precipitation measured at the station? When did this occur?
```python
without_zero = dfh["RR"]
without_zero = without_zero[without_zero > 0]
print('The smallest non-zero precipitation measured at the station is:', without_zero.min(), 'mm/h')
print('The maximum hourly precipitation measured at the station is:', dfh["RR"].max(), 'mm/h and it occurred on:', dfh["RR"].idxmax())
```

The smallest non-zero precipitation measured at the station is: 0.1 mm/h
The maximum hourly precipitation measured at the station is: 22.2 mm/h and it occurred on: 2021-09-16 22:00:00
03-03: Plot a histogram of hourly precipitation, with bins of size 0.2 mm/h, starting at 0.1 mm/h and ending at 25 mm/h. Plot the same data, but this time with a logarithmic y-axis. Compute the 99th percentile (or quantile) of hourly precipitation.
```python
preci_quantile = dfh['RR'].quantile(0.99)
print(f"The 99th percentile (or quantile) of hourly precipitation is: {preci_quantile}")
```
The 99th percentile (or quantile) of hourly precipitation is: 2.3330000000001747
```python
# bins of width 0.2 mm/h, starting at 0.1 mm/h (the last edge lands just above 25 mm/h)
bins = np.arange(0.1, 25.2, 0.2)
histo = plt.hist(dfh['RR'][dfh['RR'] >= 0.1], bins=bins)
plt.title("Hourly precipitation")
plt.xlabel("Hourly precipitation (mm/h)")
plt.ylabel("Number of occurrences")
```

Text(0, 0.5, 'Number of occurrences')
```python
histo_log = plt.hist(dfh['RR'][dfh['RR'] >= 0.1], bins=bins, log=True)
plt.title("Hourly precipitation with a logarithmic y-axis")
plt.xlabel("Hourly precipitation (mm/h)")
plt.ylabel("Number of occurrences")
```

Text(0, 0.5, 'Number of occurrences')
You can clearly see that the use of a logarithmic y-axis makes much more sense in this case, because small hourly precipitation occurs far more often than larger precipitation. Because of this, large precipitation events are nearly invisible when using a non-logarithmic y-axis.
03-04: Compute daily sums (mm/d) of precipitation (tip: use `.resample` again). Compute the average number of rain days per year in Innsbruck (a "rain day" is a day with at least 0.1 mm/d of measured precipitation).

```python
df_daily = df.resample('D').sum()
print('Average number of rain days per year in Innsbruck:', round(df_daily["RR"][df_daily["RR"] >= 0.1].count() / 7, 2))
```

Average number of rain days per year in Innsbruck: 170.29
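The per-year counts behind this average can be checked with a groupby (a sketch, our addition):

```python
rain_days = df_daily['RR'] >= 0.1
print(rain_days.groupby(df_daily.index.year).sum())  # rain days per calendar year
```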
03-05: Now select (subset) the daily dataframe to keep only daily data in the months of December, January, February (DJF). To do this, note that `dfh.index.month` exists and can be used to subset the data efficiently. Compute the average precipitation in DJF (mm/d), and the average number of rainy days in DJF. Repeat with the months of June, July, August (JJA).
```python
# average in DJF
df_DJF = df_daily["RR"][df_daily.index.month.isin([1, 2, 12])]  # daily RR for December, January, February (DJF)
print(f"Average precipitation in December, January and February (DJF) is {round(df_DJF.mean())} mm/d")
print(f"Average number of rainy days in December, January and February (DJF) is: {round(df_DJF[df_DJF >= 0.1].count() / 7)}\n")

# average in JJA
df_JJA = df_daily["RR"][df_daily.index.month.isin([6, 7, 8])]
print(f"Average precipitation in June, July and August (JJA) is {round(df_JJA.mean())} mm/d")
print(f"Average number of rainy days in June, July and August (JJA) is: {round(df_JJA[df_JJA >= 0.1].count() / 7)}\n")
```

Average precipitation in December, January and February (DJF) is 2 mm/d
Average number of rainy days in December, January and February (DJF) is: 38

Average precipitation in June, July and August (JJA) is 4 mm/d
Average number of rainy days in June, July and August (JJA) is: 53
03-06: Repeat the DJF and JJA subsetting, but this time with hourly data. Count the total number of times that hourly precipitation in DJF is above the 99th percentile computed in exercise 03-03. Repeat with JJA.
```python
# DJF: count hours above the 99th percentile
df_DJF_hour = dfh["RR"][dfh.index.month.isin([1, 2, 12])]
df_DJF_hour_above_99th = df_DJF_hour[df_DJF_hour > preci_quantile].count()
print(f"Hourly precipitation in DJF is above the 99th percentile {round(df_DJF_hour_above_99th / 7)} times per year")

# JJA: count hours above the 99th percentile
df_JJA_hour = dfh["RR"][dfh.index.month.isin([6, 7, 8])]
df_JJA_hour_above_99th = df_JJA_hour[df_JJA_hour > preci_quantile].count()
print()
print(f"Hourly precipitation in JJA is above the 99th percentile {round(df_JJA_hour_above_99th / 7)} times per year")
```
Hourly precipitation in DJF is above the 99th percentile 10 times per year Hourly precipitation in JJA is above the 99th percentile 44 times per year
03-07: Compute and plot the average daily cycle of hourly precipitation in DJF and JJA. I expect a plot similar to this example. To compute the daily cycle, I recommend to combine two very useful tools. First, start by noticing that `ds.index.hour` exists and can be used to categorize data. Then, note that `df.groupby` exists and can be used exactly for that (documentation).
```python
# average daily cycle: group the hourly values by hour of day and average
plt.plot(df_JJA_hour.groupby(df_JJA_hour.index.hour).mean(), label='JJA')
plt.plot(df_DJF_hour.groupby(df_DJF_hour.index.hour).mean(), label='DJF')
plt.xlabel("Hour of day (UTC)")
plt.ylabel("Hourly precipitation (mm/h)")
plt.legend()
plt.title("Precipitation daily cycle Innsbruck 2015-2021")
```
Text(0.5, 1.0, 'Precipitation daily cycle Innsbruck 2015-2021')
04 - A few other variables¶
In this section, we will continue to analyze the weather station data.
04-01: Verify that the three soil temperatures have approximately the same average value over the entire period. Now plot the three soil temperature timeseries in hourly resolution over the course of the year 2020 (example). Repeat the plot with the month of May 2020.
```python
# compare the three means against their common average (note: np.isclose(a, b, c)
# would wrongly pass the third value as rtol); atol = 0.5 °C is our chosen tolerance
means = dfh[["TB1", "TB2", "TB3"]].mean()
print(np.allclose(means, means.mean(), atol=0.5))
```

True
This verifies that the three soil temperatures have approximately the same average value over the entire period.
```python
TB1_2020 = dfh["TB1"][dfh.index.year.isin([2020])]
TB2_2020 = dfh["TB2"][dfh.index.year.isin([2020])]
TB3_2020 = dfh["TB3"][dfh.index.year.isin([2020])]
plt.plot(TB1_2020, label="TB1")
plt.plot(TB2_2020, label="TB2")
plt.plot(TB3_2020, label="TB3")
plt.title("Soil temperatures 2020")
plt.xlabel("Time")
plt.ylabel("Soil temperature (°C)")
plt.legend()
```
<matplotlib.legend.Legend at 0x7f717fb45ae0>
```python
TB1_2020_may = TB1_2020[TB1_2020.index.month.isin([5])]
TB2_2020_may = TB2_2020[TB2_2020.index.month.isin([5])]
TB3_2020_may = TB3_2020[TB3_2020.index.month.isin([5])]
plt.plot(TB1_2020_may, label='TB1')
plt.plot(TB2_2020_may, label='TB2')
plt.plot(TB3_2020_may, label='TB3')  # label fixed (was duplicated as 'TB2')
plt.xticks(['2020-05-01', '2020-05-05', '2020-05-09', '2020-05-13',
            '2020-05-17', '2020-05-21', '2020-05-25', '2020-05-29'],
           [1, 5, 9, 13, 17, 21, 25, 29])
plt.title("Soil temperatures May 2020")
plt.xlabel("Day")
plt.ylabel("Soil temperature (°C)")
plt.legend()
```
<matplotlib.legend.Legend at 0x7f717f9c9a20>
04-02: Plot the average daily cycle of all three soil temperatures.
```python
TB1_cycle = dfh["TB1"].groupby(dfh["TB1"].index.hour).mean()
TB2_cycle = dfh["TB2"].groupby(dfh["TB2"].index.hour).mean()
TB3_cycle = dfh["TB3"].groupby(dfh["TB3"].index.hour).mean()
plt.plot(TB1_cycle, label='TB1')
plt.plot(TB2_cycle, label='TB2')
plt.plot(TB3_cycle, label='TB3')
plt.title('Daily cycle of soil temperatures')
plt.xlabel("time [h]")
plt.ylabel("Soil temperature [°C]")
plt.legend()
```
<matplotlib.legend.Legend at 0x7f717f9c97b0>
This shows that the strength of the daily cycle of the soil temperature decreases with increasing depth.
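This is consistent with heat conduction theory: for a periodic surface forcing, the amplitude of the temperature wave decays with depth $z$ as $e^{-z/d}$, where $d = \sqrt{2\kappa/\omega}$ is the damping depth ($\kappa$: thermal diffusivity of the soil, $\omega$: angular frequency of the daily cycle).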
04-03: Compute the difference (in °C) between the air temperature and the dewpoint temperature. Now plot this difference on a scatter plot (x-axis: relative humidity, y-axis: temperature difference).
```python
dfh_temp_diff = pd.DataFrame()  # container for the temperature difference
dfh_temp_diff["temp_diff"] = dfh["TL"] - dfh["TP"]  # dewpoint depression (°C)
plt.scatter(dfh["RF"], dfh_temp_diff["temp_diff"], s=0.3)
plt.title("Correlation of relative humidity and difference\n between the air temperature and the dewpoint temperature")
plt.xlabel("Relative humidity (%)")
plt.ylabel("Difference between the air and the dewpoint temperature (°C)")
```
You can clearly see that the difference between the air and the dewpoint temperature decreases as relative humidity increases.
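The relationship is monotonic but clearly nonlinear, so a rank correlation (a sketch, our addition) is a suitable way to quantify it:

```python
# Spearman rank correlation between relative humidity and dewpoint depression
print(dfh['RF'].corr(dfh['TL'] - dfh['TP'], method='spearman'))
```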
05 - Free coding project¶
The last part of this semester project is up to you! You are free to explore whatever interests you. I however add three requirements:
- This section should have at least 5 original plots in it. They are the output of your analysis.
- This section should also use additional data that you downloaded yourself. The easiest way would probably be to download another station(s) from the ZAMG database, or data from the same station but for another time period (e.g. for trend or change analysis). You can, however, decide to do something completely different if you prefer (as long as you download and read one more file).
- this section should contain at least one regression or correlation analysis between two parameters. Examples:
- between two different variables at the same station (like we did with the dewpoint above)
- between different stations (for example, average temperature as a function of station elevation)
- between average temperature and time (trends analysis)
- etc.
That's it! Here are a few ideas:
- detection of trends and changes at the station Innsbruck for 1993-2021
- comparison of 5-yr climatologies at various stations in Tirol, taking elevation or location into account
- compute the theoretical day length from the station's longitude and latitude (you can find solutions for this online, just let me know the source if you used a solution online), and use these computations to compare the measured sunshine duration to the maximum day length. This can be used to classify "sunshine days" for example.
- use the python "windrose" library to plot a windrose at different locations and time of day.
- etc.
If you have your own idea but are unsure about whether this is too much or not enough, come to see me in class! In general, the three requirements above should be enough.
My goal with this section is to let you formulate a programming goal and implement it.
The first part of our free coding project deals with snow depths.

For this we downloaded a file called "comparison_SH.csv" from the ZAMG Download Form, which contains snow depth measurements from Sonnblick (3109m), Pitztaler Gletscher (2863m), Obergurgl (1941m) and Katschberg (1635m).

First we created a DataFrame called "com_d", containing snow depth measurements from each of the stations. The names of the columns are the Klima-IDs of the respective stations:
- 15411: Sonnblick (3109m)
- 15715: Katschberg (1635m)
- 17301: Obergurgl (1941m)
- 17315: Pitztaler Gletscher (2863m)
```python
# create a DataFrame containing snow depth for four different stations
df_com = pd.read_csv('comparison_SH.csv', index_col=1)
df_com['time'] = df_com['time'].astype(str)  # make the given time convertible to a datetime object
df_com['time'] = df_com['time'].str[:-6]     # strip the UTC offset
df_com["time_con"] = pd.to_datetime(df_com["time"])  # convert the time to a datetime object
df_com = df_com.drop('time', axis=1)  # delete the old time column
df_com["station"] = df_com.index
com = df_com.pivot(index='time_con', columns="station", values='SH')  # restructure: one column per station
com_d = com.resample('D').mean()
display(com_d)
```
| time_con | 15411 | 15715 | 17301 | 17315 |
|---|---|---|---|---|
| 2018-01-01 | NaN | NaN | NaN | NaN |
| 2018-01-02 | NaN | NaN | NaN | NaN |
| 2018-01-03 | NaN | NaN | NaN | NaN |
| 2018-01-04 | NaN | NaN | NaN | NaN |
| 2018-01-05 | NaN | NaN | NaN | NaN |
| ... | ... | ... | ... | ... |
| 2023-06-05 | 288.020833 | 0.0 | NaN | 173.798611 |
| 2023-06-06 | 286.576389 | 0.0 | NaN | 171.916667 |
| 2023-06-07 | 281.755245 | NaN | NaN | 168.208333 |
| 2023-06-08 | 278.618056 | NaN | NaN | 165.027778 |
| 2023-06-09 | 275.041667 | NaN | NaN | 161.881944 |

1986 rows × 4 columns
Seasonal cycle of snow depth at Pitztaler Gletscher (2863m)¶
The first plot refers exclusively to Pitztaler Gletscher (2863m). It shows the average, maximum and minimum seasonal cycle of snow depth at this station. Additionally, it shows the course of the snow depth for the 2022/23 winter.
```python
piz_SH = com_d[17315]  # snow depth at Pitztaler Gletscher only
sh_group = pd.DataFrame()

sh_max = piz_SH.groupby(piz_SH.index.dayofyear).max()[:-1]  # maximum snow depth for every day of year
sh_max1 = sh_max[244:]  # split so that the series starts in September
sh_max2 = sh_max[:244]
sh_group["max"] = pd.concat([sh_max1, sh_max2])  # sh_group["max"] starts in September and ends in August

sh_min = piz_SH.groupby(piz_SH.index.dayofyear).min()[:-1]
sh_min1 = sh_min[244:]
sh_min2 = sh_min[:244]
sh_group["min"] = pd.concat([sh_min1, sh_min2])

sh_mean = piz_SH.groupby(piz_SH.index.dayofyear).mean()
sh_mean1 = sh_mean[244:]
sh_mean2 = sh_mean[:244]
sh_group["mean"] = pd.concat([sh_mean1, sh_mean2])

sh_2023 = piz_SH["2022-09-01 00:00:00":].groupby(piz_SH["2022-09-01 00:00:00":].index.dayofyear).mean()
sh_2023_1 = sh_2023[244:]
sh_2023_2 = sh_2023[:244]
sh_group["2023"] = pd.concat([sh_2023_1, sh_2023_2])

# reset the index so that September 1st has index 1
sh_group.reset_index(drop=True, inplace=True)
sh_group.index = range(1, len(sh_group) + 1)
```
```python
plt.plot(sh_group["max"], color='darkslategray', label='Maximum')
plt.plot(sh_group["min"], color='dimgray', label='Minimum')
plt.plot(sh_group["mean"], color='orange', label='Average')
plt.plot(sh_group["2023"], color='blue', label='2022/2023')
plt.grid()
plt.fill_between(sh_group.index, sh_group["max"], sh_group["min"], color='gray', alpha=0.3)
plt.xticks([1, 31, 62, 92, 123, 154, 182, 213, 243, 274, 304, 335],
           ["Sep", "Oct", "Nov", "Dec", "Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug"])
plt.title("Seasonal cycle of snow depth Pitztaler Gletscher (2863m)")
plt.xlabel("Time")
plt.ylabel("Snow depth (cm)")
plt.legend()
```
<matplotlib.legend.Legend at 0x7f71820fe0e0>
```python
sh_group["mean"].max()
```
260.6875
You can clearly see that the snow depth is usually 0cm in early November and, on average, builds up to 260cm over the following months. The maximum is usually reached between mid-March and mid-April. From then on, the snow depth decreases faster than it built up and usually reaches 0cm again in July.

It is particularly noticeable that the snow depth was exceptionally low in the 2022/23 winter until April. The 2022/23 winter thus set new minimum snow depths almost constantly from mid-October to April. Keeping in mind that the snow depth measurements we could find at the ZAMG Download Form start in 2018, this makes the 2022/23 winter the one with the lowest snow depth until April at least since 2018. Sadly, we couldn't find snow depth data reaching further into the past at ZAMG.

However, the 2022/23 winter had a late awakening in mid-April, with snow depths being about average since then.
Average seasonal cycle of snow depth at different locations¶
This section compares the average seasonal cycle of snow depth at the following locations:
- Sonnblick (3109m)
- Pitztaler Gletscher (2863m)
- Obergurgl (1941m)
- Katschberg (1635m)
```python
sh_com = pd.DataFrame()
for i in [15411, 17315, 17301, 15715]:
    average_i = com_d[i].groupby(com_d[i].index.dayofyear).mean()  # daily average snow depth for every station
    sh_i_1 = average_i[244:-1]  # September onwards
    sh_i_2 = average_i[:244]    # January until September
    sh_com[i] = pd.concat([sh_i_1, sh_i_2])  # concatenate in this order
# set a new index where September 1st is 1
sh_com.reset_index(drop=True, inplace=True)
sh_com.index = range(1, len(sh_com) + 1)
```
```python
# plotting the seasonal cycle for every station
plt.plot(sh_com[15411], color="blue", label="Sonnblick (3109m)")
plt.plot(sh_com[17315], color="black", label="Pitztaler Gletscher (2863m)")
plt.plot(sh_com[17301], color="red", label="Obergurgl (1941m)")
plt.plot(sh_com[15715], color="green", label="Katschberg (1635m)")
plt.legend()
plt.grid()
plt.xticks([1, 31, 62, 92, 123, 154, 182, 213, 243, 274, 304, 335],
           ["Sep", "Oct", "Nov", "Dec", "Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug"])
plt.title("Average seasonal cycle of snow depth")
plt.xlabel("Time")
plt.ylabel("snow depth (cm)")
```
Text(0, 0.5, 'snow depth (cm)')
1 2 3 4 | print(f"Maximum snow depth: {round(sh_com[15715].max())}m and days of snow coverage: {len(sh_com[15715][sh_com[15715]>0])} at Katschberg") print(f"Maximum snow depth: {round(sh_com[17301].max())}m and days of snow coverage: {len(sh_com[17301][sh_com[17301]>0])} at Obergurgl") print(f"Maximum snow depth: {round(sh_com[17315].max())}m and days of snow coverage: {len(sh_com[17315][sh_com[17315]>0])} at Pitztaler Gletscher") print(f"Maximum snow depth: {round(sh_com[15411].max())}m and days of snow coverage: {len(sh_com[15411][sh_com[15411]>0])} at Sonnblick") |
Maximum snow depth: 68m and days of snow coverage: 209 at Katschberg Maximum snow depth: 95m and days of snow coverage: 216 at Obergurgl Maximum snow depth: 261m and days of snow coverage: 328 at Pitztaler Gletscher Maximum snow depth: 361m and days of snow coverage: 365 at Sonnblick
In this case it is obvious that higher altitude leads to higher snow depths. But that is not necessarily the case for other locations, because the location itself plays a significant role when it comes to precipitation.

Furthermore, it can be seen that the maximum snow depth is reached later in the season as the altitude rises.

At Katschberg (1635m) and Obergurgl (1941m) the maxima of 68cm and 95cm are already reached in early February.

At Pitztaler Gletscher (2863m) and Sonnblick (3109m) the maxima of 261cm and 361cm are reached in mid-March and late April.

Hence, almost all snow at Katschberg (1635m) and Obergurgl (1941m) is gone by the time the maximum snow depth at Sonnblick is reached.

Moreover, Katschberg and Obergurgl are snow covered on 209 and 216 days per year, respectively. With 328 days of snow coverage, the snow-free season at Pitztaler Gletscher is much shorter, whereas there isn't any snow-free day at Sonnblick at all.
Correlation between Precipitation and Pressure (Obergurgl May 2021)¶
This section refers to the correlation between precipitation and pressure at Obergurgl in May 2021. We downloaded a csv file called "Obergurgl_PRR.csv", which contains measurements of precipitation and pressure at Obergurgl in 2021, from the ZAMG Download Form.
```python
dfprr = pd.read_csv('Obergurgl_PRR.csv', index_col=1, parse_dates=True)
dfprr['time'] = dfprr['time'].astype(str)
dfprr['time'] = dfprr['time'].str[:-6]  # strip the UTC offset
dfprr["time_con"] = pd.to_datetime(dfprr["time"])
dfprr = dfprr.drop('time', axis=1)
dfprr = dfprr.set_index("time_con")
```
```python
dfprri = dfprr.loc["2021-05-01 00:00:00":"2021-05-31 23:50:00"]  # selecting May
df_cor = pd.DataFrame()
df_cor["perci"] = dfprri["P"].groupby(dfprri["RR"]).mean()  # mean pressure for each precipitation amount
plt.plot(df_cor["perci"])
plt.xlabel("Precipitation (mm)")
plt.ylabel("Pressure (hPa)")
plt.title('Correlation between Precipitation and Pressure\n Obergurgl May 2021')
```
Text(0.5, 1.0, 'Correlation between Precipitation and Pressure\n Obergurgl May 2021')
It can be clearly seen that higher precipitation amounts go along with lower mean air pressure.
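To quantify this, a simple correlation analysis (a sketch, our addition) can be run on the May 2021 data directly:

```python
# Pearson correlation between the pressure and precipitation measurements
r = dfprri['P'].corr(dfprri['RR'])
print(f"Pearson correlation between pressure and precipitation: {r:.2f}")
```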
In the following plot the pressure and precipitation graphs are superimposed. Here you can clearly see that every rainfall event goes along with a local minimum of pressure.
```python
fig, ax1 = plt.subplots()
ax1.plot(dfprr["RR"]["2021-05-01 00:00:00":"2021-05-31 23:50:00"], color="black", label='precipitation')
ax2 = ax1.twinx()
ax2.plot(dfprr["P"]["2021-05-01 00:00:00":"2021-05-31 23:50:00"], label='pressure')
ax1.legend(loc='upper left')
ax2.legend(loc='upper right')
ax1.set_xticks(["2021-05-01", "2021-05-05", "2021-05-10", "2021-05-15",
                "2021-05-20", "2021-05-25", "2021-05-30"],
               [1, 5, 10, 15, 20, 25, 30])
ax1.set_title("Pressure and precipitation Obergurgl May 2021")
ax2.set_ylabel("Pressure (hPa)")
ax1.set_ylabel("Precipitation (mm)")
ax1.set_xlabel("Day")
```
Text(0.5, 0, 'Day')
Wind direction and wind speed at different locations¶
Now we are going to have a look at the wind direction and wind speed at different locations. We have chosen Patscherkofel and Valluga for this. At first we have to convert/format the data into something more useful:
```python
# load and format the data of Patscherkofel
dp = pd.read_csv('Patscherkofel.csv', index_col=1, parse_dates=True)
dp['time'] = dp['time'].astype(str)  # convert the time to a datetime stamp
dp['time'] = dp['time'].str[:-6]
dp["time_con"] = pd.to_datetime(dp["time"])
dp = dp.drop('time', axis=1)
dp = dp.set_index("time_con")

# load and format the data of Valluga
dv = pd.read_csv('Valluga.csv', index_col=1, parse_dates=True)
dv['time'] = dv['time'].astype(str)
dv['time'] = dv['time'].str[:-6]
dv["time_con"] = pd.to_datetime(dv["time"])
dv = dv.drop('time', axis=1)
dv = dv.set_index("time_con")
```
Now that we have converted the data from Patscherkofel and Valluga into the same format as above, we can use it in the same way. Let's say we are interested in which direction the wind at Patscherkofel comes from compared to the wind at Valluga. If we only calculated the difference between the 2022 means, we would conclude that the wind direction is nearly the same:
```python
print(dv['DD'].mean() - dp['DD'].mean(), '°')
```
-2.465055230445415 °
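A caveat (our addition): wind direction is a circular quantity, so arithmetic means of degrees can be misleading in general. A vector (circular) mean, sketched below, handles the 0°/360° wrap-around:

```python
def circular_mean_deg(deg):
    """Circular mean of directions given in degrees."""
    rad = np.deg2rad(deg.dropna())
    return np.rad2deg(np.arctan2(np.sin(rad).mean(), np.cos(rad).mean())) % 360

print(circular_mean_deg(dv['DD']) - circular_mean_deg(dp['DD']), '°')
```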
But if we plot a windrose diagram for each station, we see that the directions are quite different:
```python
fig = plt.figure(figsize=(15, 10))
G = gridspec.GridSpec(3, 2)

# note: constructing WindroseAxes directly from a SubplotSpec fails with newer
# matplotlib versions; importing windrose registers a "windrose" projection,
# so we create the axes with add_subplot instead
ax1 = fig.add_subplot(G[0:2, 0], projection="windrose")
ax1.bar(dp['DD'], dp['FFAM'], normed=True, opening=0.8, edgecolor="white")
ax1.set_legend()
ax1.set_title('Patscherkofel')

ax2 = fig.add_subplot(G[0:2, 1], projection="windrose")
ax2.bar(dv['DD'], dv['FFAM'], normed=True, opening=0.8, edgecolor="white")
ax2.set_legend(loc='lower right')
ax2.set_title('Valluga')

plt.show()
```
As you can see above, the wind at Patscherkofel comes mainly from the south and NNW, and at Valluga from the southwest. A windrose is a great way to display wind direction and wind speed. Speaking of wind speed: let's have a look at the wind speed over the course of the last year:
```python
dpd = dp.resample('D').mean()  # daily means
dvd = dv.resample('D').mean()
plt.plot(dpd['FFAM'], color='royalblue', label='Patscherkofel')
plt.plot(dvd['FFAM'], color='black', label='Valluga')
plt.title('Seasonal change of wind speed')
plt.ylabel('windspeed [m/s]')
plt.xlabel('date [YYYY-MM]')
plt.legend()
```
<matplotlib.legend.Legend at 0x7f717d5fee90>
```python
print('Patscherkofel average windspeed of 2022 was', round(dp['FFAM'].mean(), 2), 'm/s.')
print('Valluga average windspeed of 2022 was', round(dv['FFAM'].mean(), 2), 'm/s.')
```
Patscherkofel average windspeed of 2022 was 6.16 m/s. Valluga average windspeed of 2022 was 5.02 m/s.
As we can see, the wind speed at Patscherkofel has some high peaks and is overall higher than at Valluga. Now we check whether there is a relationship between wind direction and wind speed:
```python
dayp = dp.loc['2022-09-07 00:00:00':'2022-09-07 23:50:00']
fig, ax1 = plt.subplots()
ax1.set_title('Patscherkofel wind 2022-09-07')
ax1.set_xlabel('time [h]')
ax1.set_ylabel('windspeed [m/s]')
ax1.plot(dayp['FFAM'], color='royalblue', label='windspeed')
ax2 = ax1.twinx()
ax2.set_ylabel('winddirection [°]')
ax2.plot(dayp['DD'], color='black', label='winddirection')
ax1.set_xticks(['2022-09-07 00:00:00', '2022-09-07 03:00:00', '2022-09-07 06:00:00',
                '2022-09-07 09:00:00', '2022-09-07 12:00:00', '2022-09-07 15:00:00',
                '2022-09-07 18:00:00', '2022-09-07 21:00:00', '2022-09-08 00:00:00'],
               [0, 3, 6, 9, 12, 15, 18, 21, 24])
ax1.legend()
ax2.legend()
```
<matplotlib.legend.Legend at 0x7f717db56bf0>
When the wind direction varies a lot at Patscherkofel, the wind speed isn't high, but when the direction is fairly stable (which was wind from the south on 7 September 2022), the speed is much higher.