Programming project

Preamble

Names: Rosina Linhard and Marie Kaucher
Matrikelnummer: 12220308 and 12218322

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

01 - An energy balance model with hysteresis

The planetary albedo $\alpha$ is in fact changing with climate change. As the temperature drops, sea ice and ice sheets extend, which increases the albedo. Conversely, the albedo decreases as the temperature rises. In our simple energy balance model, the planetary albedo is given by:

$$ \alpha = \begin{cases} 0.3,& \text{if } T \gt 280\\ 0.7,& \text{if } T \lt 250\\ a T + b, & \text{otherwise} \end{cases} $$

01-01: Compute the parameters $a$ and $b$ so that the equation is continuous at T=250K and T=280K.

c = np.array([[280,1], [250,1]])
d = np.array([0.3,0.7])
x = np.linalg.solve(c, d)
f'The parameters a and b are {x}'

'The parameters a and b are [-0.01333333  4.03333333]'
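As a cross-check of the linear solve, a and b can also be derived by hand from the two continuity conditions 280a + b = 0.3 and 250a + b = 0.7. The sketch below does this with exact fractions; the variable names are our own and are not part of the exercise.

```python
from fractions import Fraction

# Continuity conditions: a * 280 + b = 0.3 and a * 250 + b = 0.7
alpha_warm, alpha_cold = Fraction(3, 10), Fraction(7, 10)
t_warm, t_cold = 280, 250

# Slope of the line through both points, then the intercept
a_exact = (alpha_warm - alpha_cold) / (t_warm - t_cold)
b_exact = alpha_cold - a_exact * t_cold

print(a_exact)  # -1/75
print(b_exact)  # 121/30
print(float(a_exact), float(b_exact))
```

The decimal values agree with the rounded result of np.linalg.solve above.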

01-02: Now write a function called alpha_from_temperature which accepts a single positional parameter T as input (a scalar) and returns the corresponding albedo. Test your function using doctests to make sure that it complies with the instructions above.

def alpha_from_temperature(T):
    '''Calculate the planetary albedo for a given temperature.

    Parameters
    ----------
    T : scalar
        the temperature (K)

    Returns
    -------
    alpha : the planetary albedo (-)

    Examples
    --------
    >>> print(f'{alpha_from_temperature(240):.2f}')
    0.70
    >>> print(f'{alpha_from_temperature(290):.2f}')
    0.30
    >>> print(f'{alpha_from_temperature(260):.2f}')
    0.57
    '''
    T = np.array(T)
    # Coefficients a and b from exercise 01-01
    alpha = np.where(T >= 280, 0.3, np.where(T <= 250, 0.7, -0.01333333 * T + 4.03333333))
    
    return alpha

import doctest
doctest.testmod()

TestResults(failed=0, attempted=3)

01-03: Adapt the existing code from week 07 to write a function called temperature_change_with_hysteresis which accepts t0 (the starting temperature in K), n_years (the number of simulation years) as positional arguments and tau (the atmosphere transmissivity) as keyword argument (default value 0.611). Verify that:

def asr(alpha=0.3):
    """Absorbed Solar Radiation (W m-2).

    Parameters
    ----------
    alpha : float, optional
        the planetary albedo

    Returns
    -------
    the asr (float)
    """
    s0 = 1362
    
    return (1 - alpha) * s0 / 4

def olr(t, tau=0.611):
    """Outgoing Longwave Radiation (W m-2).

    Parameters
    ----------
    t : float
        the atmosphere temperature (K)
    tau : float, optional
        the atmosphere transmissivity (-)

    Returns
    -------
    the olr (float)
    """
    sigma = 5.67E-8
    
    return sigma * tau * t**4

def temperature_change_with_hysteresis(t0, n_years, tau=0.611):
    """Temperature change scenario after change of transmissivity.

    Parameters
    ----------
    t0 : float
        the starting temperature (K)
    n_years : int
        the number of simulation years
    tau : float, optional
        the atmosphere transmissivity (-)
        
    Returns
    -------
    (time, temperature) : ndarrays of size n_years + 1
    """
    years = np.arange(n_years + 1)
    temperature = np.zeros(n_years + 1)
    temperature[0] = t0
    alpha = alpha_from_temperature(t0)  # initial albedo from the starting temperature
    dt = 60 * 60 * 24 * 365  # one year in seconds
    C = 4.0e+08  # heat capacity
    
    for i in range(n_years):
        temperature[i + 1] = temperature[i] + dt / C * (asr(alpha=alpha) - olr(temperature[i], tau=tau))
        # Update the albedo for the next time step
        alpha = alpha_from_temperature(temperature[i + 1])
        
    return years, temperature

a,b = temperature_change_with_hysteresis(292, 50, tau=0.611)
e,f = temperature_change_with_hysteresis(265, 50, tau=0.611)
g = np.isclose(b[50], 288, rtol=0, atol=0.5)
h = np.isclose(f[50], 233, rtol=0, atol=0.5)
g, h

(True, True)

Using the code above, we checked that the stabilization temperature with t0 = 292 and default tau is approximately 288K and the stabilization temperature with t0 = 265 and default tau is approximately 233K. Both are correct.
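The two stabilization temperatures can also be estimated without running the simulation, by setting the absorbed solar radiation equal to the outgoing longwave radiation and solving for T. The sketch below assumes the same constants as asr and olr (s0 = 1362, sigma = 5.67E-8); the function name equilibrium_temperature is our own.

```python
def equilibrium_temperature(alpha, tau=0.611, s0=1362, sigma=5.67e-8):
    """Temperature (K) at which (1 - alpha) * s0 / 4 equals sigma * tau * T**4."""
    return ((1 - alpha) * s0 / 4 / (sigma * tau)) ** 0.25

# Each branch is only self-consistent if the result lies in the
# temperature range where that albedo applies.
t_warm = equilibrium_temperature(0.3)  # needs T above 280 K
t_cold = equilibrium_temperature(0.7)  # needs T below 250 K
print(f'{t_warm:.1f} K, {t_cold:.1f} K')  # 288.0 K, 233.0 K
```

Both results fall inside their own albedo range, which is why the model has two stable states and the final temperature depends on the starting temperature.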

01-04: Realize a total of N simulations with starting temperatures regularly spaced between t0=206K, and t0=318K and plot them on a single plot for n_years=50. The plot should look somewhat similar to this example for N=21.

n_years = np.arange(0, 51)
t0 = np.linspace(206,318,21)
df = pd.DataFrame(index=n_years)

for t in t0:
    y, temp = temperature_change_with_hysteresis(t, 50, tau=0.611)
    df[t] = temp
    
df.plot(legend=False)
plt.xlabel('Years'); plt.ylabel('Temperature (K)'); plt.title('Climate change scenarios');


Bonus: you can try to increase N and add colors to your plot to create a graph similar to this one.

n_years = np.arange(0, 51)
t0 = np.linspace(206, 318, 200)

# Collect all runs first and build the DataFrame once. Inserting 200
# columns one by one triggers a pandas PerformanceWarning.
runs = {t: temperature_change_with_hysteresis(t, 50, tau=0.611)[1] for t in t0}
df = pd.DataFrame(runs, index=n_years)

df.plot(cmap='gnuplot', legend=False)
plt.xlabel('Years'); plt.ylabel('Temperature (K)'); plt.title('Climate change scenarios');

02 - Weather station data files

I downloaded 10 min data from the recently launched ZAMG data hub. The data file contains selected parameters from the "INNSBRUCK-FLUGPLATZ (ID: 11804)" weather station.

You can download the data from the following links (right-click + "Save as..."):

station data
parameter metadata
station list from the ZAMG (in a better format than last time)

Let me open the data for you and display its content:

df = pd.read_csv('INNSBRUCK-FLUGPLATZ_Datensatz_20150101_20211231.csv', index_col=1, parse_dates=True)
df = df.drop('station', axis=1)

02-01: after reading the documentation of the respective functions (and maybe try a few things yourself), explain in plain sentences:

what am I asking pandas to do with the index_col=1, parse_dates=True keyword arguments? Why am I doing this?
what am I asking pandas to do with .drop()? Why axis=1?

With index_col=1, pandas is asked to use the second column of the file (column numbers start at 0) as the row labels of the data frame. Here this is the date/time column. With parse_dates=True, pandas tries to parse the index as dates, so the index becomes a DatetimeIndex instead of plain strings (the format is year-month-day hour:minute). This is what makes time-based selection and resampling possible later on. .drop() removes rows or columns by label. With axis=1 the label refers to a column, so the 'station' column is dropped. With axis=0 the label would be looked up in the index and a row would be dropped. The 'station' column holds the station number; since the file contains data from a single station, it carries no extra information.
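A minimal sketch of these keyword arguments on a tiny, made-up file. The column name time and the values are assumptions for illustration; only the 'station' column name comes from the code above.

```python
import io
import pandas as pd

# Hypothetical two-row file in a similar layout: station, time, one variable
csv_text = """station,time,TL
11804,2015-01-01 00:00,-1.5
11804,2015-01-01 00:10,-1.6
"""

raw = pd.read_csv(io.StringIO(csv_text))
demo = pd.read_csv(io.StringIO(csv_text), index_col=1, parse_dates=True)
demo = demo.drop('station', axis=1)

print(raw.index)           # default integer index
print(demo.index)          # DatetimeIndex built from the second column
print(list(demo.columns))  # ['TL']
```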

Now let me do something else for you:

dfmeta = pd.read_csv('ZEHNMIN Parameter-Metadaten.csv', index_col=0)
dfmeta.loc[df.columns]

	Kurzbeschreibung	Beschreibung	Einheit
DD	Windrichtung	Windrichtung, vektorielles Mittel über 10 Minuten	°
FF	vektorielle Windgeschwindigkeit	Windgeschwindigkeit, vektorielles Mittel über ...	m/s
GSX	Globalstrahlung	Globalstrahlung, arithmetisches Mittel über 10...	W/m²
P	Luftdruck	Luftdruck, Basiswert zur Minute10	hPa
RF	Relative Feuchte	Relative Luftfeuchte, Basiswert zur Minute10	%
RR	Niederschlag	10 Minuten Summe des Niederschlags, Summe der ...	mm
SO	Sonnenscheindauer	Sonnenscheindauer, Sekundensumme über 10 Minuten	s
TB1	Erdbodentemperatur in 10cm Tiefe	Erdbodentemperatur in 10cm Tiefe, Basiswert zu...	°C
TB2	Erdbodentemperatur in 20cm Tiefe	Erdbodentemperatur in 20cm Tiefe, Basiswert zu...	°C
TB3	Erdbodentemperatur in 50cm Tiefe	Erdbodentemperatur in 50cm Tiefe, Basiswert zu...	°C
TL	Lufttemperatur in 2m	Lufttemperatur in 2m Höhe, Basiswert zur Minute10	°C
TP	Taupunkt	Taupunktstemperatur, Basiswert zur Minute10	°C

02-02: again, explain in plain sentences what the dfmeta.loc[df.columns] is doing, and why it works that way.

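dfmeta is the metadata data frame. Because it was read with index_col=0, its row labels are the parameter short names (DD, FF, ...). .loc selects rows by label. df.columns holds the column names of df, which are exactly these short names, so dfmeta.loc[df.columns] returns the metadata rows (short description, description and unit) for the variables that are present in df, in the same order as the columns of df.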
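The label lookup can be illustrated with two miniature frames. The sketch below uses made-up data; the names meta and data stand in for dfmeta and df.

```python
import pandas as pd

# Hypothetical miniature versions of the metadata and data frames
meta = pd.DataFrame(
    {'unit': ['°', 'hPa', '°C', 'mm']},
    index=['DD', 'P', 'TL', 'RR'],
)
data = pd.DataFrame({'TL': [1.2, 1.1], 'RR': [0.0, 0.1]})

# data.columns is an Index of labels; .loc looks each label up in meta's index
subset = meta.loc[data.columns]
print(subset)
```

Only the rows for TL and RR are returned, in the order of the columns of data.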

02-03: explore the dfh dataframe. Explain, in plain words, what the purpose of .resample('H') followed by mean() is. Explain what .resample('H').max() and .resample('H').sum() would do.

dfh = df.resample('H').mean()

.resample('H') groups the 10 min values into hourly bins, and the method that follows reduces each bin to a single value. With .mean() you get the hourly average, with .max() the largest value within each hour, and with .sum() the sum of all values within each hour.
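A small sketch on synthetic data shows the three reductions side by side. The values are made up, and '60min' is used as the bin width here because it is equivalent to one hour.

```python
import numpy as np
import pandas as pd

# Twelve synthetic 10 min values (0, 1, ..., 11) covering two full hours
idx = pd.date_range('2015-01-01 00:00', periods=12, freq='10min')
ts = pd.Series(np.arange(12.0), index=idx)

hourly = ts.resample('60min')
print(hourly.mean().tolist())  # [2.5, 8.5]
print(hourly.max().tolist())   # [5.0, 11.0]
print(hourly.sum().tolist())   # [15.0, 51.0]
```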

02-04: Using np.allclose, make sure that the average of the first hour (that you'll compute yourself from df) is indeed equal to the first row of dfh. Now, two variables in the dataframe have units that aren't suitable for averaging. Please convert the following variables to the correct units:

RR needs to be converted from the average of 10 min sums to mm/h
SO needs to be converted from the average of 10 min sums to s/h

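# Check before the unit conversion, which changes RR and SO
np.allclose(df[:6].mean(), dfh.iloc[0])

True

# The hourly mean of six 10 min sums, times 6, gives the hourly sum
dfh['RR'] = dfh['RR'] * 6
dfh['SO'] = dfh['SO'] * 6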
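Multiplying the hourly mean by 6 equals the hourly sum only when all six 10 min values are present. The sketch below, on made-up values with one gap, shows how the two differ otherwise: the scaled mean fills the gap with the average of the other values, while the sum treats it as zero.

```python
import numpy as np
import pandas as pd

idx = pd.date_range('2015-01-01 00:00', periods=6, freq='10min')
rr = pd.Series([0.1, 0.2, np.nan, 0.0, 0.3, 0.0], index=idx)  # one gap

mean_times_six = rr.resample('60min').mean().iloc[0] * 6
plain_sum = rr.resample('60min').sum().iloc[0]
print(round(mean_times_six, 2), round(plain_sum, 2))  # 0.72 0.6
```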

Spend some time exploring the dfh dataframe we just created. What time period does it cover? What variables does it contain?
The data cover the period from 2015-01-01 to 2021-12-31 (see the file name). dfh contains the same variables as df, but as hourly averages.

03 - Precipitation

03-01: Compute the average annual precipitation (m/year) over the 7-year period.

dfy = df.resample('Y').sum()
p = dfy['RR'].mean()
p = p * 1e-03
f'The average annual precipitation over the 7-year period was {p:.4f} m/year'

'The average annual precipitation over the 7-year period was 0.9206 m/year'

03-02: What is the smallest non-zero precipitation measured at the station? What is the maximum hourly precipitation measured at the station? When did this occur?

a = dfh['RR'].values
m = np.min(a[np.nonzero(a)])
f'The minimum non-zero value at the station is {m:.2f}mm.'

'The minimum non-zero value at the station is 0.10mm.'

mx = np.max(a)
d = dfh['RR'].idxmax()
f'The maximum hourly precipitation is {mx:.2f}mm and it was on {d}.'

The maximum hourly precipitation occurred on 2021-09-16 22:00:00.

03-03: Plot a histogram of hourly precipitation, with bins of size 0.2 mm/h, starting at 0.1 mm/h and ending at 25 mm/h. Plot the same data, but this time with a logarithmic y-axis. Compute the 99th percentile (or quantile) of hourly precipitation.

dfhs = df.resample('H').sum()
# Bin edges 0.1, 0.3, ..., 25.1 mm/h (width 0.2 mm/h); the x-axis is cut at 25 mm/h
bins = 0.1 + 0.2 * np.arange(126)
ax = dfhs['RR'].plot.hist(bins=bins, xlim=(0.1, 25))
ax.set_xlabel('Hourly precipitation in mm/h')
ax.set_ylabel('Number of hours')
ax.set_title('Histogram of hourly precipitation');

# No lower y-limit of 0 here: it is not valid on a logarithmic axis
ax = dfhs['RR'].plot.hist(bins=bins, xlim=(0.1, 25), logy=True)
ax.set_xlabel('Hourly precipitation in mm/h')
ax.set_ylabel('Number of hours')
ax.set_title('Histogram of hourly precipitation with a logarithmic y-axis');

r = dfhs['RR'].quantile(q=0.99)
f'The 99th percentile is {r:.2f} mm/h'

'The 99th percentile is 2.33 mm/h'

03-04: Compute daily sums (mm/d) of precipitation (tip: use .resample again). Compute the average number of rain days per year in Innsbruck (a "rain day" is a day with at least 0.1 mm / d of measured precipitation).

dfd = df.resample('D').sum()

# Count the days with at least 0.1 mm/d in each year, then average over the years
rd = (dfd['RR'] >= 0.1).groupby(dfd.index.year).sum()
rda = rd.mean()
f'On average there are {rda:.0f} rain days per year.'

'On average there are 170 rain days per year.'
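The counting step can be checked on a tiny example. This sketch uses made-up daily sums for two years; the variable names are our own.

```python
import pandas as pd

# Synthetic daily sums (mm/d), three days in each of two years
days = pd.to_datetime(['2015-01-01', '2015-01-02', '2015-01-03',
                       '2016-01-01', '2016-01-02', '2016-01-03'])
daily = pd.Series([0.0, 0.1, 2.3, 0.05, 0.0, 1.0], index=days)

is_rain_day = daily >= 0.1  # boolean Series, True on rain days
per_year = is_rain_day.groupby(is_rain_day.index.year).sum()
print(per_year.tolist())  # [2, 1]
print(per_year.mean())    # 1.5
```

Summing a boolean Series counts its True values, so the per-year sums are the numbers of rain days.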

03-05: Now select (subset) the daily dataframe to keep only daily data in the months of December, January, February (DJF). To do this, note that dfh.index.month exists and can be used to subset the data efficiently. Compute the average precipitation in DJF (mm / d), and the average number of rainy days in DJF. Repeat with the months of June, July, August (JJA).

i = dfd.index.month.isin([12, 1, 2])
djf = dfd[i]

s = djf['RR'].mean()

# Rain days per calendar year within DJF, then the average over the years
rddjf = (djf['RR'] >= 0.1).groupby(djf.index.year).sum()
rdadjf = rddjf.mean()
f'The average precipitation per day is {s:.2f} mm/d in December, January and February. On average there are {rdadjf:.0f} rain days in December, January and February.'

'The average precipitation per day is 1.77 mm/d in December, January and February. On average there are 38 rain days in December, January and February.'

j = dfd.index.month.isin([6, 7, 8])
jja = dfd[j]

u = jja['RR'].mean()

# Rain days per calendar year within JJA, then the average over the years
rdjja = (jja['RR'] >= 0.1).groupby(jja.index.year).sum()
rdajja = rdjja.mean()
f'The average precipitation per day is {u:.2f} mm/d in June, July and August. On average there are {rdajja:.0f} rain days in June, July and August.'

'The average precipitation per day is 4.02 mm/d in June, July and August. On average there are 53 rain days in June, July and August.'

03-06: Repeat the DJF and JJA subsetting, but this time with hourly data. Count the total number of times that hourly precipitation in DJF is above the 99th percentile computed in exercise 03-03. Repeat with JJA.

i = dfhs.index.month.isin([12, 1, 2])
djfh = dfhs[i]

rhdjf = djfh[djfh['RR'] >= r]
n_djf = len(rhdjf)
f'In December, January and February the hourly precipitation is {n_djf} times above the 99th percentile.'

'In December, January and February the hourly precipitation is 68 times above the 99th percentile.'

i = dfhs.index.month.isin([6, 7, 8])
jjah = dfhs[i]

rhjja = jjah[jjah['RR'] >= r]
n_jja = len(rhjja)
f'In June, July and August the hourly precipitation is {n_jja} times above the 99th percentile.'

'In June, July and August the hourly precipitation is 308 times above the 99th percentile.'

03-07: Compute and plot the average daily cycle of hourly precipitation in DJF and JJA. I expect a plot similar to this example. To compute the daily cycle, I recommend combining two very useful tools. First, start by noticing that ds.index.hour exists and can be used to categorize data. Then, note that df.groupby exists and can be used exactly for that (documentation).

w = djfh.index.hour
avwint = djfh.groupby(w).mean()
su = jjah.index.hour
avsum = jjah.groupby(su).mean()

avwint['RR'].plot(label='December, January, February')
avsum['RR'].plot(label='June, July, August')
plt.xlabel('Hours of the day')
plt.ylabel('Hourly precipitation [mm/h]')
plt.legend()
plt.title('Precipitation daily cycle of Innsbruck 2015-2021');
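How grouping by the hour of the day produces a daily cycle can be seen on two synthetic days. The values in this sketch are made up; each hour of the day ends up with the average of its two values.

```python
import numpy as np
import pandas as pd

# Two synthetic days of hourly data; day 2 equals day 1 plus 2 at every hour
idx = pd.date_range('2020-06-01', periods=48, freq='60min')
values = np.concatenate([np.arange(24.0), np.arange(24.0) + 2])
hourly = pd.Series(values, index=idx)

# All values sharing the same hour of the day are averaged together
cycle = hourly.groupby(hourly.index.hour).mean()
print(cycle.iloc[:3].tolist())  # [1.0, 2.0, 3.0]
print(len(cycle))               # 24
```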

04 - A few other variables

04-01: Verify that the three soil temperatures have approximately the same average value over the entire period. Now plot the three soil temperature timeseries in hourly resolution over the course of the year of 2020 (example). Repeat the plot with the month of May 2020.

n = dfh.index.year == 2020
q = dfh[n]
q['TB1'].plot(label='Soil temperature in 10cm depth')
q['TB2'].plot(label='Soil temperature in 20cm depth')
q['TB3'].plot(label='Soil temperature in 50cm depth')
plt.xlabel('Date')
plt.ylabel('Soil temperature in °C')
plt.legend()
plt.title('The three soil temperatures in hourly resolution 2020');

n = dfh.index.year == 2020
q = dfh[n]
k = (q.index.month == 5)
y = q[k]
y['TB1'].plot(label='Soil temperature in 10cm depth')
y['TB2'].plot(label='Soil temperature in 20cm depth')
y['TB3'].plot(label='Soil temperature in 50cm depth')
plt.xlabel('Date')
plt.ylabel('Soil temperature in °C')
plt.legend()
plt.title('The three soil temperatures in hourly resolution May 2020');

04-02: Plot the average daily cycle of all three soil temperatures.

a = dfh.index.hour
dfha = dfh.groupby(a).mean()
dfha['TB1'].plot(label='TB1')
dfha['TB2'].plot(label='TB2')
dfha['TB3'].plot(label='TB3')
plt.xlabel('Hours of the day')
plt.ylabel('Soil Temperature in °C')
plt.legend()
plt.title('Soil Temperature - daily cycle of Innsbruck 2015-2021');

04-03: Compute the difference (in °C) between the air temperature and the dewpoint temperature. Now plot this difference on a scatter plot (x-axis: relative humidity, y-axis: temperature difference).

df['diffTLTP'] = df['TL'] - df['TP']
df.plot.scatter(x='RF', y='diffTLTP', figsize=(10,5), grid=True, color='pink')
plt.xlabel('Relative humidity in %')
plt.ylabel('Temperature difference in °C')
plt.title('Difference between air temperature and dewpoint temperature in Innsbruck 2015-2021');

05 - Free coding project

We chose two stations near our hometowns and compared their data. The goal is to compare the weather conditions and to test some assumptions based on our own perception.

Rosina	Marie
The nearest weather station to my village is in the city of Regensburg in Bavaria.	I am from Hessen. Kleiner Feldberg is a mountain (880m) near my hometown. It has its own weather station.

Assumptions based on our own perception:

Normally it is colder on Kleiner Feldberg. However, there was a warm winter in Hessen this year, which was probably not the case in Regensburg.
The wind should be stronger on Kleiner Feldberg because the station is on top of a mountain. At both places the wind should come from many different directions.
The precipitation at both places is comparable.
The duration of sunshine should also be comparable.

First we read in the data:

fTU = pd.read_csv('Feldberg_Temperatur.txt', index_col='MESS_DATUM', parse_dates=True, sep=';')
fTU = fTU.drop(['STATIONS_ID', 'eor'], axis=1)

fN = pd.read_csv('Feldberg_Niederschlag.txt', index_col='MESS_DATUM', parse_dates=True, sep=';')
fN = fN.drop(['STATIONS_ID', 'eor'], axis=1)

fSO = pd.read_csv('Feldberg_Sonne.txt', index_col='MESS_DATUM', parse_dates=True, sep=';')
fSO = fSO.drop(['STATIONS_ID', 'eor'], axis=1)

fW = pd.read_csv('Feldberg_wind.txt', index_col='MESS_DATUM', parse_dates=True, sep=';')
fW = fW.drop(['STATIONS_ID', 'eor'], axis=1)

dkf = pd.concat([fTU, fN, fW, fSO], axis=1)
dkf = dkf.drop(dkf.columns[[0,6, 10,13]], axis=1)

rW = pd.read_csv('Regensburg_wind.txt', index_col='MESS_DATUM', parse_dates=True, sep=';')
rW = rW.drop(['STATIONS_ID', 'eor'], axis=1)

rSO = pd.read_csv('Regensburg_Sonne.txt', index_col='MESS_DATUM', parse_dates=True, sep=';')
rSO = rSO.drop(['STATIONS_ID', 'eor'], axis=1)

rN = pd.read_csv('Regensburg_Niederschlag.txt', index_col='MESS_DATUM', parse_dates=True, sep=';')
rN = rN.drop(['STATIONS_ID', 'eor'], axis=1)

rTU = pd.read_csv('Regensburg_Temperatur.txt', index_col='MESS_DATUM', parse_dates=True, sep=';')
rTU = rTU.drop(['STATIONS_ID', 'eor'], axis=1)

dr = pd.concat([rTU,rN,rW,rSO], axis=1)
dr = dr.drop(dr.columns[[0,6, 10,13]], axis=1)

Now we have two pandas Data Frames:

print("Data Frame Kleiner Feldberg")
dkf

Data Frame Kleiner Feldberg

	PP_10	TT_10	TM5_10	RF_10	TD_10	RWS_DAU_10	RWS_10	RWS_IND_10	FF_10	DD_10	DS_10	GS_10	SD_10	LS_10
MESS_DATUM
2021-11-27 00:00:00	891.9	-2.5	-1.8	98.8	-2.7	10	0.22	1	5.9	190	0.0	0.0	0.0	-999
2021-11-27 00:10:00	891.8	-2.5	-1.7	99.2	-2.6	10	0.17	1	5.6	190	0.0	0.0	0.0	-999
2021-11-27 00:20:00	891.7	-2.4	-1.7	98.9	-2.6	10	0.16	1	5.8	200	0.0	0.0	0.0	-999
2021-11-27 00:30:00	891.7	-2.4	-1.7	99.5	-2.5	10	0.10	1	5.2	200	0.0	0.0	0.0	-999
2021-11-27 00:40:00	891.6	-2.3	-1.6	99.2	-2.4	10	0.05	1	5.7	200	0.0	0.0	0.0	-999
...	...	...	...	...	...	...	...	...	...	...	...	...	...	...
2023-05-30 23:10:00	927.8	11.0	9.8	64.1	4.5	0	0.00	0	6.2	90	0.0	0.0	0.0	-999
2023-05-30 23:20:00	927.9	11.0	9.8	63.9	4.4	0	0.00	0	6.2	80	0.0	0.0	0.0	-999
2023-05-30 23:30:00	927.9	11.1	9.8	63.0	4.3	0	0.00	0	5.8	80	0.0	0.0	0.0	-999
2023-05-30 23:40:00	927.8	11.0	9.7	63.9	4.4	0	0.00	0	6.5	70	0.0	0.0	0.0	-999
2023-05-30 23:50:00	927.9	11.0	9.7	64.6	4.6	0	0.00	0	5.9	70	0.0	0.0	0.0	-999

79200 rows × 14 columns

print("Data Regensburg")
dr

Data Regensburg

	PP_10	TT_10	TM5_10	RF_10	TD_10	RWS_DAU_10	RWS_10	RWS_IND_10	FF_10	DD_10	DS_10	GS_10	SD_10	LS_10
MESS_DATUM
2021-11-27 00:00:00	950.2	0.0	-0.6	90.7	-1.3	0	0.0	0	3.9	130	0.0	0.0	0.0	-999
2021-11-27 00:10:00	950.1	0.0	-0.6	90.6	-1.4	0	0.0	0	3.7	130	0.0	0.0	0.0	-999
2021-11-27 00:20:00	950.0	0.0	-0.6	91.2	-1.3	0	0.0	0	3.8	140	0.0	0.0	0.0	-999
2021-11-27 00:30:00	949.9	-0.1	-0.7	92.3	-1.2	0	0.0	0	3.8	130	0.0	0.0	0.0	-999
2021-11-27 00:40:00	949.8	0.0	-0.5	91.3	-1.3	0	0.0	0	4.3	130	0.0	0.0	0.0	-999
...	...	...	...	...	...	...	...	...	...	...	...	...	...	...
2023-05-30 23:10:00	978.4	11.5	8.0	57.2	3.3	0	0.0	0	1.0	20	0.0	0.0	0.0	-999
2023-05-30 23:20:00	978.4	11.7	7.9	58.1	3.7	0	0.0	0	0.7	20	0.0	0.0	0.0	-999
2023-05-30 23:30:00	978.4	11.8	7.8	56.7	3.5	0	0.0	0	1.1	360	0.0	0.0	0.0	-999
2023-05-30 23:40:00	978.3	11.7	7.7	57.8	3.7	0	0.0	0	1.2	10	0.0	0.0	0.0	-999
2023-05-30 23:50:00	978.2	11.4	7.8	57.7	3.4	0	0.0	0	1.3	10	0.0	0.0	0.0	-999

79200 rows × 14 columns

Names of our columns:
MESS_DATUM Zeitstempel yyyymmddhhmi $\Rightarrow$ time stamp (yyyymmddhhmi)

PP_10 Luftdruck auf Stationshöhe (hPa) $\Rightarrow$ pressure at station level (hPa)
TT_10 Lufttemperatur in 2m Höhe °C $\Rightarrow$ air temperature at 2 m height (°C)
TM5_10 Lufttemperatur 5cm Höhe °C $\Rightarrow$ air temperature at 5 cm height (°C)
RF_10 relative Feuchte in 2m Höhe % $\Rightarrow$ relative humidity at 2 m height (%)
TD_10 Taupunkttemperatur in 2m Höhe °C $\Rightarrow$ dew point temperature at 2 m height (°C)
RWS_DAU_10 Niederschlagsdauer in den letzten 10 Minuten (min) $\Rightarrow$ duration of precipitation within the last 10 minutes (min)
RWS_10 Niederschlagshöhe in den letzten 10 Minuten (mm) $\Rightarrow$ precipitation amount of the last 10 minutes (mm)
RWS_IND_10 Regenindex $\Rightarrow$ precipitation index (0 no precipitation, 1 precipitation has fallen, 3 precipitation has fallen and heating of instrument was on)
FF_10 Durchschnittliche Windgeschwindigkeit in den letzten 10 Minuten (m/s) $\Rightarrow$ mean wind speed during the last 10 minutes (m/s)
DD_10 Durchschnittliche Windrichtung in den letzten 10 Minuten (Grad) $\Rightarrow$ mean wind direction during the last 10 minutes (degrees)
DS_10 10min-Summe der diffusen solaren Strahlung (J/cm^2) $\Rightarrow$ sum of diffuse solar radiation during the last 10 minutes (J/cm^2)
GS_10 10min-Summe der Globalstrahlung (J/cm^2) $\Rightarrow$ sum of global radiation during the last 10 minutes (J/cm^2)
SD_10 10min-Summe der Sonnenscheindauer (h) $\Rightarrow$ duration of sunshine during the last 10 minutes (h)
LS_10 10min-Summe der atmosphärischen Gegenstrahlung J/cm^2 $\Rightarrow$ sum of the atmospheric counter-radiation during the last 10 minutes (J/cm^2)

Now we create hourly data from the 10 minute data.

dkfh = dkf.resample('H').mean()
drh = dr.resample('H').mean()
dkfh['RWS_10'] = dkfh['RWS_10'] * 6
dkfh['SD_10'] = dkfh['SD_10'] * 6
drh['RWS_10'] = drh['RWS_10'] * 6
drh['SD_10'] = drh['SD_10'] * 6
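In the tables above the LS_10 column contains only -999, which appears to be the code for missing values. If such values also occur in other columns, they would distort the hourly means. The sketch below, on made-up wind speeds, shows the effect and one way to mask the code before averaging; passing na_values=-999 to pd.read_csv would be an alternative.

```python
import numpy as np
import pandas as pd

idx = pd.date_range('2022-01-01 00:00', periods=6, freq='10min')
# Made-up wind speeds with one missing value coded as -999
raw = pd.DataFrame({'FF_10': [5.0, 6.0, -999.0, 4.0, 5.0, 5.0]}, index=idx)

naive = raw.resample('60min').mean()
# Replace the missing-value code with NaN so that mean() skips it
masked = raw.replace(-999, np.nan).resample('60min').mean()
print(naive['FF_10'].iloc[0])   # about -162.3
print(masked['FF_10'].iloc[0])  # 5.0
```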

1. Assumption: Air temperature at both places
Now we plot the hourly mean air temperature at both places in this year.

n = dkfh.index.year==2023
q = dkfh[n]
q['TT_10'].plot(label='Kleiner Feldberg, Hessen')

k = drh.index.year==2023
r = drh[k]
r['TT_10'].plot(label='Regensburg, Bayern')
plt.xlabel('Date')
plt.ylabel('Air temperature in 2m height (°C)')
plt.legend()
plt.title('The Air temperature in 2m height of this year');

Now we calculate the difference in air temperature between the two places in 2023.

drh['difftemp'] = dkfh['TT_10'] - drh['TT_10']
n = drh.index.year==2023
q = drh[n]
plt.scatter(q.index, q['difftemp'])
plt.xlabel('Date')
plt.ylabel('Temperature difference in °C')
plt.title('Air temperature difference: Kleiner Feldberg minus Regensburg');

Normally it is cooler on average on Kleiner Feldberg (first plot). However, the second plot shows that the winter was warm in Hessen compared with Regensburg, which is normally the warmer place. (Negative values mean that the air temperature was higher in Regensburg; positive values mean that it was higher on Kleiner Feldberg.)

2. Assumption: The wind
Windrose for the Kleiner Feldberg in the year 2022

from windrose import WindroseAxes

n = dkfh.index.year==2022
q = dkfh[n]
ws = q['FF_10']
wd = q['DD_10']
ax = WindroseAxes.from_ax()
ax.bar(wd, ws, normed=True, opening=0.8, edgecolor='white')
ax.set_legend();

Windrose for Regensburg in 2022

m = drh.index.year==2022
p = drh[m]
ws = p['FF_10']
wd = p['DD_10']
ax = WindroseAxes.from_ax()
ax.bar(wd, ws, normed=True, opening=0.8, edgecolor='white')
ax.set_legend();

The wind directions are as we expected from the orography. There is less wind in Regensburg because the station is in the city and is surrounded by higher hills. The station on Kleiner Feldberg is on top of a mountain, so the wind is stronger from all directions. The neighbouring mountains Großer Feldberg in the NE and Altkönig in the SE show up as directions with weaker wind.
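One caveat about the windroses: they use hourly means of DD_10, and an arithmetic mean of angles can be misleading near north, where 350° and 10° average to 180°. A common alternative is to average unit vectors instead. The sketch below is a minimal, unweighted version (wind speed is ignored); the function name mean_wind_direction is our own.

```python
import numpy as np

def mean_wind_direction(directions_deg):
    """Circular mean of wind directions in degrees, in the range 0 to 360."""
    rad = np.deg2rad(np.asarray(directions_deg, dtype=float))
    # Average the unit vectors instead of the angles themselves
    mean_angle = np.arctan2(np.sin(rad).mean(), np.cos(rad).mean())
    return np.rad2deg(mean_angle) % 360

north_winds = [350, 10]
print(np.mean(north_winds))  # 180.0, which points south
circ = mean_wind_direction(north_winds)
# Distance from north on the circle; close to 0 for these two directions
print(min(circ, 360 - circ))
```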

3. Assumption: The precipitation

m = dkfh.index.year==2022
p = dkfh[m]

w = p.index.month
dkfm = p.groupby(w).sum()

n = drh.index.year==2022
q = drh[n]
z = q.index.month
drm = q.groupby(z).sum()

dkfm['RWS_10'].plot.bar(label='Kleiner Feldberg', alpha=0.8, color='C1', ylim=(0,200), align='center')
drm['RWS_10'].plot.bar(label='Regensburg', alpha=0.8, align='edge')


plt.xlabel('Month')
plt.ylabel('Precipitation in mm')
plt.legend()
gridnumber = range(0,12)
plt.xticks(gridnumber)
plt.title('Precipitation over the year 2022');

The plot shows a bigger difference than we assumed. Surprisingly, there is more rain on Kleiner Feldberg over the year. At both places most of the precipitation falls in autumn and winter. In March and May some data are missing.

4. Sunshine

dkfm['SD_10'].plot.bar(label='Kleiner Feldberg', alpha=0.8, color='darkred', ylim=(0,400), align='center')
drm['SD_10'].plot.bar(label='Regensburg', color='gold', alpha=0.8, align='edge')


plt.xlabel('Time in a year')
plt.ylabel('Duration of sunshine in h')
plt.legend()
gridnumber = range(0,12)
plt.xticks(gridnumber)
plt.title('Duration of sunshine over the year 2022');

As we expected, the duration of sunshine is more or less the same at both places. In March, May and November some data are missing again.