Data science of climate change - CO2 analysis¶

Gloria Benoit and Pierre Poulain (2023).

  • This project aims to study climate change, from original data sources to the reproduction of scientific graphics.
  • This notebook focuses specifically on the analysis of atmospheric CO2.

Data source¶

National Oceanic and Atmospheric Administration (NOAA)¶

NOAA is an agency of the U.S. Department of Commerce that “enriches life through science”. It works to keep the public informed of the changing environment around them. In 1972, it created the Global Monitoring Laboratory (GML) as part of the NOAA Environmental Research Laboratories. Its mission is focused on geophysical monitoring for climatic change.

GML created a program called the Carbon Cycle Greenhouse Gases (CCGG), in which the spatial and temporal distributions of greenhouse gases are measured and modeled. The CCGG research area operates the Global Greenhouse Gas Reference Network, measuring the atmospheric distribution and trends of the three main long-term drivers of climate change, carbon dioxide (CO2), methane (CH4), and nitrous oxide (N2O), as well as carbon monoxide (CO) which is an important indicator of air pollution.

CO2 measurements since the 70s¶

Greenhouse gases are measured at four Atmospheric Baseline Observatories and multiple tall towers in the United States. The first measurements of CO2 at the observatories started in 1973. Continuous in-situ measurements of these gases provide great detail in their long-term trends, seasonal and short-term variations, and diurnal cycles.

This data can also be found in GML's CO2 database.

CO2 estimates before real-time measurements: paleoclimatology¶

The most direct method of investigating past variations of the atmospheric CO2 concentration before 1958, when continuous direct atmospheric CO2 measurements started, is the analysis of air extracted from suitable ice cores (Siegenthaler et al., 2005).

The measurement of the gas composition is direct: trapped in deep ice cores are tiny bubbles of ancient air, which we can extract and analyze using mass spectrometers.

From 1000 to 2004¶

In 2010, pre-industrial CO2 data was estimated using two ice cores. Measurements from Law Dome (LD) and Dronning Maud Land (DML) were sufficient to yield estimates of CO2 evolution between 1000 and 2004, smoothed with 50-year splines (Frank et al., 2010).

This data can be found in NOAA's Paleoclimatology database, as study number 10437.

800,000 years before present¶

In 2008, the Antarctic Vostok and EPICA Dome C ice cores provided a composite record of atmospheric carbon dioxide levels over the past 800,000 years (Lüthi et al., 2008).

This data can also be found in NOAA's Paleoclimatology database, as study number 6091.

Unit of measurement¶

When measuring gases, the term concentration is used to describe the amount of gas by volume in the air. The two most common units of measurement are parts per million and percent concentration. All data studied here uses parts per million as its unit of measurement.

Parts-per-million (abbreviated ppm) expresses how many molecules of a given gas are found per million molecules of air. For example, 1,000 ppm of CO2 means that if one could count a million gas molecules, 1,000 of them would be carbon dioxide and 999,000 would be other gases.
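
As a small illustration of the unit, the conversion between ppm, percent and molecule counts can be sketched as follows. The function names are hypothetical helpers introduced only for this example.

```python
import math


def ppm_to_percent(ppm):
    """Convert a fraction expressed in ppm to a percentage."""
    # 1 % = 10,000 ppm
    return ppm / 10_000


def molecules_of_gas(ppm, total_molecules=1_000_000):
    """Number of molecules of the gas among `total_molecules` molecules of air."""
    return ppm * total_molecules / 1_000_000


co2_ppm = 1_000
co2_percent = ppm_to_percent(co2_ppm)
co2_count = molecules_of_gas(co2_ppm)
other_count = 1_000_000 - co2_count

assert math.isclose(co2_percent, 0.1)
print(f"{co2_ppm} ppm = {co2_percent} % "
      f"({co2_count:.0f} CO2 molecules, {other_count:.0f} others per million)")
```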

Modules import¶

from pathlib import Path

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import requests
from scipy import stats

In-situ observatories (1973-2022) data analysis¶

The dataset is stored in a text format, uses spaces as the separator, and contains 20 columns and 593 lines. It contains a site code, which refers to the observatory where the data was recorded; the date, from the year down to the second; the value of the CO2 measurement; and other attributes of the measurement. Many comments are provided as lines starting with #.

We have four different datasets, each specific to a given observatory:

Barrow   Mauna Loa   American Samoa   South Pole
BRW      MLO         SMO              SPO

Their locations can be found directly on GML's site, where in situ observatories are marked by a blue rectangle.

Collect data¶

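def download_data(url, filename):
    """
    Download data file.

    Only if file is not already present on disk.
    """
    if not Path(filename).is_file():
        response = requests.get(url, allow_redirects=True)
        # Stop here if the server returned an error (e.g. 404).
        response.raise_for_status()
        # Use a context manager so the file is properly closed.
        with open(filename, "wb") as data_file:
            data_file.write(response.content)
        print(f"Downloaded file: {filename}")
    else:
        print(f"File already downloaded: {filename}")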
observatories = {"brw": "Barrow",
                 "mlo": "Mauna Loa",
                 "smo": "American Samoa",
                 "spo": "South Pole"}
for obs_code in observatories:
    filename = f"co2_{obs_code}_surface-insitu_1_ccgg_MonthlyData.txt"
    url = (
        "https://gml.noaa.gov/aftp/data/trace_gases/co2/in-situ/"
        f"surface/txt/{filename}"
    )
    download_data(url, filename)
File already downloaded: co2_brw_surface-insitu_1_ccgg_MonthlyData.txt
File already downloaded: co2_mlo_surface-insitu_1_ccgg_MonthlyData.txt
File already downloaded: co2_smo_surface-insitu_1_ccgg_MonthlyData.txt
File already downloaded: co2_spo_surface-insitu_1_ccgg_MonthlyData.txt

Prepare data¶

To keep our dataframe legible, we will keep only the following columns:

  • datetime,
  • time_decimal (useful for the regression),
  • and value.

The file header provides some information related to the value column:

# value:_FillValue : -999.999
# value:long_name : measured_mole_fraction_of_trace_gas_in_dry_air
# value:units : micromol mol-1
# value:comment : Mole fraction reported in units of micromol mol-1 (10-6 mol per mol of dry air); abbreviated as ppm (parts per million).

According to the header, missing values in this column are coded as -999.999.
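
The header attributes can also be read programmatically rather than by eye. The sketch below assumes that header lines follow the `# column:key : text` pattern shown above; `read_header_attributes` is a hypothetical helper, and the sample text is copied from the header excerpt.

```python
import io
import re

SAMPLE_HEADER = """\
# value:_FillValue : -999.999
# value:long_name : measured_mole_fraction_of_trace_gas_in_dry_air
# value:units : micromol mol-1
"""


def read_header_attributes(lines, column="value"):
    """Collect '# column:key : text' header lines into a dictionary."""
    pattern = re.compile(rf"^#\s*{re.escape(column)}:(\S+)\s*:\s*(.*)$")
    attributes = {}
    for line in lines:
        if not line.startswith("#"):
            break  # The comment header is over.
        match = pattern.match(line.rstrip("\n"))
        if match:
            attributes[match.group(1)] = match.group(2).strip()
    return attributes


attrs = read_header_attributes(io.StringIO(SAMPLE_HEADER))
declared_fill_value = float(attrs["_FillValue"])
print(f"Declared fill value: {declared_fill_value} ({attrs['units']})")
```

With a real file, the same function could be called on the open file object instead of the in-memory sample.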

However, if we open the dataset file

df_mlo = pd.read_csv("co2_mlo_surface-insitu_1_ccgg_MonthlyData.txt",
                     comment="#", sep=" ", header=0)

and look closer at the value column, we find no occurrence of the -999.999 missing value code.

(df_mlo["value"] == -999.999).any()
False

But we do find a value of -999.99, which is not compatible with a real CO2 concentration in the atmosphere (a concentration is expected to be a positive number).

df_mlo["value"].min()
-999.99
len(df_mlo[ df_mlo["value"] == -999.99 ])
11

Contrary to what is written in the header, missing values are coded as -999.99. We will take care of this while loading the data file.
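
Because the documented and the actual missing value codes differ, one defensive option is to declare both codes and, as a safety net, mask any non-positive value. This is only a sketch on a made-up three-line sample: the `year` and `month` column names are assumptions, not the real file layout.

```python
import io

import pandas as pd

# Made-up sample in the same spirit as the real file (space-separated).
SAMPLE = """\
site_code year month value
MLO 1974 4 -999.99
MLO 1974 5 333.16
MLO 1974 6 332.17
"""

df_demo = pd.read_csv(
    io.StringIO(SAMPLE),
    sep=" ",
    header=0,
    # Declare both the documented and the observed missing value codes.
    na_values=[-999.99, -999.999],
)
# Safety net: a CO2 mole fraction cannot be zero or negative.
df_demo["value"] = df_demo["value"].where(df_demo["value"] > 0)

print(df_demo)
print(f"Missing values: {df_demo['value'].isna().sum()}")
```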

def read_obs_data(filename):
    """
    Read a dataframe from an observatory measurement file.

    Take care of missing values and column data type.
    """
    df = pd.read_csv(filename,
                     comment="#",
                     sep=" ",
                     header=0,
                     na_values=-999.99)
    # Keep a subset of columns (copy to avoid modifying a view).
    df = df[["datetime", "time_decimal", "value"]].copy()
    # Convert datetime strings to real datetime objects.
    df["datetime"] = pd.to_datetime(df["datetime"])
    print(f"Read {filename}: {df.shape[0]} lines x {df.shape[1]} columns")
    return df
filename = "co2_mlo_surface-insitu_1_ccgg_MonthlyData.txt"
df_mlo = read_obs_data(filename)
Read co2_mlo_surface-insitu_1_ccgg_MonthlyData.txt: 593 lines x 3 columns

Here is a quick view of our data:

df_mlo.plot(
    kind="scatter",
    x="datetime",
    y="value",
    color="red",
    title="Monthly atmospheric CO2 records",
    xlabel="Year",
    style="o",
    s=1,
    ylabel="CO2 (ppm)",
    figsize=(10, 6)
);
[Figure: scatter plot "Monthly atmospheric CO2 records" for Mauna Loa, CO2 (ppm) versus year.]

The record shows strong variations due to the seasonal (winter / summer) cycle.

To smooth the CO2 measurements and reduce seasonal variations, we apply a rolling average over eleven adjacent months.

def add_rolling_average(df):
    """
    Smooth CO2 measurements with an 11-month rolling average.
    
    Average is calculated on:
    - 5 months before,
    - the actual month (as central point),
    - 5 months after.
    """
    df["rolling_mean"] = df["value"].rolling(11, center=True).mean()
    return df
df_mlo = add_rolling_average(df_mlo)
df_mlo.head(15)
datetime time_decimal value rolling_mean
0 1973-08-01 00:00:00+00:00 1973.580822 NaN NaN
1 1973-09-01 00:00:00+00:00 1973.665753 NaN NaN
2 1973-10-01 00:00:00+00:00 1973.747945 NaN NaN
3 1973-11-01 00:00:00+00:00 1973.832877 NaN NaN
4 1973-12-01 00:00:00+00:00 1973.915068 NaN NaN
5 1974-01-01 00:00:00+00:00 1974.000000 NaN NaN
6 1974-02-01 00:00:00+00:00 1974.084932 NaN NaN
7 1974-03-01 00:00:00+00:00 1974.161644 NaN NaN
8 1974-04-01 00:00:00+00:00 1974.246575 NaN NaN
9 1974-05-01 00:00:00+00:00 1974.328767 333.16 NaN
10 1974-06-01 00:00:00+00:00 1974.413699 332.17 NaN
11 1974-07-01 00:00:00+00:00 1974.495890 331.11 NaN
12 1974-08-01 00:00:00+00:00 1974.580822 329.11 NaN
13 1974-09-01 00:00:00+00:00 1974.665753 327.30 NaN
14 1974-10-01 00:00:00+00:00 1974.747945 327.30 330.195455

For the MLO (Mauna Loa) observatory, CO2 measurements start in May 1974. Because the rolling average is calculated on 11 months (5 months before and 5 months after the central month), averaged values start in October 1974.
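
The five-month shift can be checked on a synthetic series. The sketch below uses made-up values (nine leading gaps followed by twenty valid points) and relies on the default behaviour of `rolling()`, where a window yields a value only if all of its points are present.

```python
import numpy as np
import pandas as pd

# Synthetic series: 9 missing months, then 20 valid months.
values = pd.Series([np.nan] * 9 + list(range(20)), dtype=float)

# Centered 11-point window: 5 points before, the central point, 5 points after.
rolling_mean = values.rolling(11, center=True).mean()

first_value = values.first_valid_index()
first_mean = rolling_mean.first_valid_index()
last_mean = rolling_mean.last_valid_index()

print(f"First measurement at index {first_value}")
print(f"First averaged value at index {first_mean}")
print(f"Last averaged value at index {last_mean} (series ends at {values.index[-1]})")
```

The same effect appears at the end of the series: the last five months have no averaged value. Passing `min_periods` to `rolling()` would relax this, at the cost of averaging over incomplete windows.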

Aggregate all four observatories¶

We can now read datasets for the four observatories and aggregate all data into one dataframe.

df_obs = pd.DataFrame(columns=["datetime", "time_decimal", "obs", "value", "rolling_mean"])
for obs_code in observatories:
    filename = f"co2_{obs_code}_surface-insitu_1_ccgg_MonthlyData.txt"
    df_tmp = read_obs_data(filename)
    df_tmp = add_rolling_average(df_tmp)
    df_tmp["obs"] = obs_code
    if not df_obs.empty:
        df_obs = pd.concat([df_obs, df_tmp], axis=0, ignore_index=True)
    else:
        df_obs = df_tmp
Read co2_brw_surface-insitu_1_ccgg_MonthlyData.txt: 593 lines x 3 columns
Read co2_mlo_surface-insitu_1_ccgg_MonthlyData.txt: 593 lines x 3 columns
Read co2_smo_surface-insitu_1_ccgg_MonthlyData.txt: 588 lines x 3 columns
Read co2_spo_surface-insitu_1_ccgg_MonthlyData.txt: 588 lines x 3 columns
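
An alternative way to write this aggregation is to build a list of dataframes and concatenate them once, which avoids the special case for the first observatory. The sketch below is self-contained and uses made-up values; `stack_observatories` is a hypothetical helper.

```python
import pandas as pd


def stack_observatories(frames_by_code):
    """Tag each dataframe with its observatory code and stack them."""
    frames = [df.assign(obs=code) for code, df in frames_by_code.items()]
    return pd.concat(frames, axis=0, ignore_index=True)


# Made-up data standing in for the per-observatory dataframes.
demo_frames = {
    "brw": pd.DataFrame({"time_decimal": [1974.0, 1974.1],
                         "value": [330.0, 331.0]}),
    "mlo": pd.DataFrame({"time_decimal": [1974.0],
                         "value": [329.5]}),
}
df_demo = stack_observatories(demo_frames)
print(df_demo)
```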

Display a few random rows from the dataset:

df_obs.sample(10)
datetime time_decimal value rolling_mean obs
858 1995-09-01 00:00:00+00:00 1995.665753 358.29 361.281818 mlo
1761 2021-12-01 00:00:00+00:00 2021.915068 NaN NaN smo
321 2000-05-01 00:00:00+00:00 2000.330601 376.37 371.304545 brw
963 2004-06-01 00:00:00+00:00 2004.415301 379.94 377.733636 mlo
47 1977-07-01 00:00:00+00:00 1977.495890 330.11 334.587273 brw
819 1992-06-01 00:00:00+00:00 1992.415301 359.44 356.630909 mlo
1457 1996-08-01 00:00:00+00:00 1996.581967 361.16 361.286364 smo
1465 1997-04-01 00:00:00+00:00 1997.246575 362.02 362.075455 smo
284 1997-04-01 00:00:00+00:00 1997.246575 369.72 365.150000 brw
2210 2010-05-01 00:00:00+00:00 2010.328767 385.43 386.020000 spo

Visualize CO2 variations versus time for all four observatories¶

fig, axs = plt.subplots(4, 2, figsize=(12, 8), sharex="col")
# Define subgraphs.
axes = [[0, 0], [0, 1], [1, 0], [1, 1], [2, 0], [2, 1], [3, 0], [3, 1]]
fig.suptitle("Monthly atmospheric CO2 records", fontsize=16)

position = 0
for obs_code in observatories:
    # Select data for a given observatory.
    df = df_obs[ df_obs["obs"] == obs_code ]
    
    # First plot in a row: from 1973 to 2022.
    i, j = axes[position]
    axs[i,j].plot(df["datetime"], df["value"],
                  "-o", color="red", markersize=1)
    axs[i,j].plot(df["datetime"], df["rolling_mean"],
                  color="black")
    axs[i,j].set_ylabel(observatories[obs_code])

    # Second plot in a row: last five years.
    start_year = max(df["datetime"].dt.year) - 5
    df = df[ df["datetime"].dt.year > start_year ]
    i, j = axes[position+1]
    axs[i,j].plot(df["datetime"], df["value"],
                  "-o", color="red", markersize=3)
    axs[i,j].plot(df["datetime"], df["rolling_mean"],
                  color="black")
    position += 2

fig.supxlabel("Year", fontsize=14)
fig.supylabel("CO2 (ppm)", fontsize=14)
plt.show()
[Figure: "Monthly atmospheric CO2 records", a 4 x 2 grid of panels showing CO2 (ppm) versus year for each observatory.]
  • Raw measurements are displayed in red.
  • Rolling averages are shown in black. Averaged values remove seasonal variations.
  • Each row of panels represents one observatory.
  • The two columns show different time frames: 1973-2022 on the left, 2018-2022 on the right.

Forecast CO2 evolution for Mauna Loa¶

target_year = 2040
predicted_years = np.arange(2023, target_year+1)
fig, ax = plt.subplots(figsize=(12, 6))

observatory_code = "mlo" # Mauna Loa observatory.

df = (df_obs
    .query(f"obs == '{observatory_code}'") # Select records for a given observatory.
    .dropna() # Remove missing values for the linear regression.
)

# Make linear regression with Scipy.
# Need to use a decimal number for date.
lin_res = stats.linregress(df["time_decimal"], df["rolling_mean"])

# Display results.
ax.scatter(df["time_decimal"],
           df["value"],
           color="#f48c06",
           s=3,
           label="raw measurements"
)
ax.plot(df["time_decimal"],
        df["rolling_mean"],
        color="#9d0208",
        label="rolling average"
)
ax.plot(df["time_decimal"],
        lin_res.intercept + lin_res.slope*df["time_decimal"],
        color="black",
        label=f"linear regression (slope: {lin_res.slope:.2f})"
)
ax.plot(predicted_years,
        lin_res.intercept + lin_res.slope * predicted_years,
        color="black", linestyle='--',
        label="forecast"
)

ax.set_xlim([2000, target_year])
ax.set_xlabel("Year", fontsize=16)
ax.set_ylabel("CO2 (ppm)", fontsize=16)
ax.legend(loc="upper left")
ax.set_title("CO2 forecast based on linear regression of averaged CO2 records");
[Figure: "CO2 forecast based on linear regression of averaged CO2 records", CO2 (ppm) versus year, 2000-2040.]
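
To read a number off the forecast, the fitted line can be evaluated at any year. The sketch below uses synthetic data (a perfectly linear increase of 2 ppm per year, an assumption made only so the result is easy to check) and the `slope`, `intercept` and `rvalue` attributes returned by `scipy.stats.linregress`.

```python
import numpy as np
from scipy import stats

# Synthetic record: 370 ppm in 2000, rising by exactly 2 ppm per year.
years = np.arange(2000, 2021, dtype=float)
co2 = 370.0 + 2.0 * (years - 2000)

lin_res = stats.linregress(years, co2)


def forecast(year):
    """Evaluate the fitted straight line at a given (decimal) year."""
    return lin_res.intercept + lin_res.slope * year


print(f"Slope: {lin_res.slope:.2f} ppm/year, R^2: {lin_res.rvalue**2:.3f}")
print(f"Forecast for 2040: {forecast(2040):.1f} ppm")
```

On real records, the coefficient of determination gives a first indication of how well a straight line describes the data. If the growth rate changes over time, a single line fitted over the whole record will not capture it.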

Paleoclimatology data analysis¶

1000 to 2004 (paleo1)¶

Collect data¶

The dataset:

  • is stored in a text format,
  • uses tabs as the separator,
  • and contains 23 columns and 1005 lines.

It contains the years of records, from 1000 to 2004, and three main values:

  • the record at Law Dome (LD),
  • the record at Dronning Maud Land (DML)
  • and the average record over those two sites (ALL).

Additional columns are provided with values smoothed with splines ranging from 50 to 200 years with a 25-year step.
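
The column layout described above can be reconstructed from the naming pattern visible in the file header (site, underscore, three-digit spline length). This sketch only builds the expected names; it does not read the file.

```python
sites = ("LD", "DML", "ALL")
# Spline lengths from 50 to 200 years with a 25-year step.
spline_lengths = range(50, 201, 25)

smoothed_columns = [
    f"{site}_{length:03d}"
    for length in spline_lengths
    for site in sites
]
expected_columns = ["Year", "ALL_50_full"] + smoothed_columns

print(f"{len(spline_lengths)} spline lengths x {len(sites)} series "
      f"= {len(smoothed_columns)} smoothed columns")
print(f"Total: {len(expected_columns)} columns")
```

Such a list could be compared with the columns of the loaded dataframe to verify that the file has the expected structure.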

filename = "smoothedco2.txt"
url = f"https://www.ncei.noaa.gov/pub/data/paleo/contributions_by_author/frank2010/{filename}"
download_data(url, filename)

df_paleo1 = pd.read_csv(filename, sep="\t", header=0)
df_paleo1.head()
File already downloaded: smoothedco2.txt
Year ALL_50_full LD_050 DML_050 ALL_050 LD_075 DML_075 ALL_075 LD_100 DML_100 ... ALL_125 LD_150 DML_150 ALL_150 LD_175 DML_175 ALL_175 LD_200 DML_200 ALL_200
0 1000 278.66 NaN NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
1 1001 278.68 NaN NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
2 1002 278.69 NaN NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
3 1003 278.71 NaN NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
4 1004 278.72 NaN NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN

5 rows × 23 columns

Prepare data¶

It looks like there are many missing values.

df_paleo1.isna().sum()
Year             0
ALL_50_full      0
LD_050         254
DML_050        254
ALL_050        254
LD_075         254
DML_075        254
ALL_075        254
LD_100         254
DML_100        254
ALL_100        254
LD_125         254
DML_125        254
ALL_125        254
LD_150         254
DML_150        254
ALL_150        254
LD_175         254
DML_175        254
ALL_175        254
LD_200         254
DML_200        254
ALL_200        254
dtype: int64

Column ALL_50_full has no missing value.
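
Columns without any missing value can also be selected programmatically. The sketch below uses a tiny made-up dataframe (the `LD_050` value is invented) to show the idea.

```python
import numpy as np
import pandas as pd

# Tiny stand-in for the paleo dataframe.
df_demo = pd.DataFrame({
    "Year": [1000, 1001, 1002],
    "ALL_50_full": [278.66, 278.68, 278.69],
    "LD_050": [np.nan, np.nan, 280.0],
})

# Keep the names of columns where every row has a value.
complete_columns = df_demo.columns[df_demo.notna().all()].tolist()
print(complete_columns)
```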

We will keep only the year and the value averaged over both sites with a 50-year spline.

df_paleo1 = df_paleo1[["Year", "ALL_50_full"]]
df_paleo1.describe()
Year ALL_50_full
count 1005.000000 1005.000000
mean 1502.000000 284.887801
std 290.262812 14.310892
min 1000.000000 276.450000
25% 1251.000000 279.600000
50% 1502.000000 280.980000
75% 1753.000000 282.100000
max 2004.000000 372.930000

Display data¶

fig, ax = plt.subplots(figsize=(12, 6))
ax.plot(df_paleo1["Year"],
         df_paleo1["ALL_50_full"],
         color="lightblue",
         linewidth=3
)
ax.axvline(x=1850, linestyle='--', color="grey")
ax.set_title("Global atmospheric CO2 from paleo 1000 - 2004")
ax.set_xlabel("Year", fontsize=16)
ax.set_ylabel("CO2 (ppm)", fontsize=16);
[Figure: "Global atmospheric CO2 from paleo 1000 - 2004", CO2 (ppm) versus year, with a dashed vertical line at 1850.]

Before 1750, atmospheric CO2 oscillated around 280 ppm. After 1850, it rises sharply, reaching a maximum of about 373 ppm in this record, which ends in 2004. Direct measurements reach 418 ppm in 2022.

800,000 years before present¶

Collect data¶

The dataset:

  • is stored in a text format,
  • uses tabs as the separator,
  • and contains 2 columns and 1096 lines.
filename = "edc3-composite-co2-2008-noaa.txt"
url = f"https://www.ncei.noaa.gov/pub/data/paleo/icecore/antarctica/epica_domec/{filename}"
download_data(url, filename)
df_paleo2 = pd.read_csv(filename, comment='#', sep='\t', header=0)
df_paleo2.head()
File already downloaded: edc3-composite-co2-2008-noaa.txt
gas_ageBP CO2
0 137 280.4
1 268 274.9
2 279 277.9
3 395 279.1
4 404 281.9
df_paleo2.describe()
gas_ageBP CO2
count 1096.000000 1096.000000
mean 390905.979015 230.835675
std 262092.947239 27.573616
min 137.000000 171.600000
25% 137133.500000 207.500000
50% 423206.500000 231.450000
75% 627408.000000 251.525000
max 798512.000000 298.600000

Prepare data¶

The gas_ageBP column corresponds to the number of years before present, where "present" means 1950. The most recent sample, 137 BP, is therefore the year 1813 (1950 - 137 = 1813). We need to compute the calendar year to make a comparison with the previous datasets.
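
As a quick check of this conversion, here is a minimal sketch applied to the two extreme ages of the dataset (137 and 798,512 years BP). `bp_to_calendar_year` is a hypothetical helper name.

```python
import numpy as np

REFERENCE_YEAR = 1950  # "Present" in the before-present (BP) convention.


def bp_to_calendar_year(age_bp):
    """Convert ages in years before present (BP) to calendar years."""
    return REFERENCE_YEAR - np.asarray(age_bp)


youngest, oldest = bp_to_calendar_year([137, 798512])
print(f"137 BP     -> year {youngest}")
print(f"798,512 BP -> year {oldest}")
```

Ages greater than 1950 BP give negative years with this simple subtraction, which is sufficient for plotting all datasets on a common axis.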

df_paleo2["Year"] = 1950 - df_paleo2["gas_ageBP"]
df_paleo2.head()
gas_ageBP CO2 Year
0 137 280.4 1813
1 268 274.9 1682
2 279 277.9 1671
3 395 279.1 1555
4 404 281.9 1546

Visualize data¶

fig, ax = plt.subplots(figsize=(12, 6))
ax.plot(df_paleo2["Year"],
        df_paleo2["CO2"],
        color="lightblue")
ax.set_title("Global atmospheric CO2 from paleo -800,000 - 1813")
ax.set_xlabel("Year", fontsize=16)
ax.set_ylabel("CO2 (ppm)", fontsize=16);
[Figure: "Global atmospheric CO2 from paleo -800,000 - 1813", CO2 (ppm) versus year.]

Compare all datasets¶

Aggregate data¶

First, average the values from the Mauna Loa observatory by year. Extract raw and rolling average CO2 values for comparison, but ultimately keep only the raw CO2 values.

# Select Mauna Loa observatory.
df_all = (df_obs
 .query("obs == 'mlo'")
 # Compute the year from the rows selected above.
 .assign(Year=lambda d: d["datetime"].dt.year)
 .groupby("Year")
 .agg(
     CO2=("value", "mean"),
     CO2_rolling=("rolling_mean", "mean"),
 )
 .reset_index()
 .assign(Source="OBS")
)
df_all.head(10)
Year CO2 CO2_rolling Source
0 1973 NaN NaN OBS
1 1974 329.753750 330.246970 OBS
2 1975 331.161818 331.001364 OBS
3 1976 332.039167 332.148831 OBS
4 1977 333.855833 333.823409 OBS
5 1978 335.423333 335.427045 OBS
6 1979 336.836667 336.854773 OBS
7 1980 338.790000 338.767197 OBS
8 1981 340.123333 340.145530 OBS
9 1982 341.475000 341.453712 OBS
df_all = df_all.drop(["CO2_rolling"], axis=1)
df_all.head()
Year CO2 Source
0 1973 NaN OBS
1 1974 329.753750 OBS
2 1975 331.161818 OBS
3 1976 332.039167 OBS
4 1977 333.855833 OBS
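
Note that a yearly mean skips missing months, so a year with gaps (such as 1974, where measurements start in May) is averaged over fewer months and may be affected by the seasonal cycle. One way to make this visible is to count the valid months per year. The sketch below uses made-up monthly values; the `n_months` and `complete` column names are introduced only for this example.

```python
import numpy as np
import pandas as pd

# Made-up monthly record covering 1974 and 1975.
dates = pd.date_range("1974-01-01", "1975-12-01", freq="MS")
values = np.linspace(329.0, 332.0, len(dates))
values[:4] = np.nan  # January to April 1974 are missing.
monthly = pd.DataFrame({"datetime": dates, "value": values})

yearly = (monthly
    .assign(Year=lambda d: d["datetime"].dt.year)
    .groupby("Year")
    .agg(
        CO2=("value", "mean"),        # Mean over available months only.
        n_months=("value", "count"),  # Number of non-missing months.
    )
    .reset_index()
)
yearly["complete"] = yearly["n_months"] == 12
print(yearly)
```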

Then add paleo data for the time period 1000 - 2004.

df = (df_paleo1
 .rename(columns={"ALL_50_full": "CO2"})
 .assign(Source="PALEO1")
)
df_all = pd.concat([df_all, df], axis=0, ignore_index=True)

Finally, add paleo data for the time period -800,000 to 1813.

df = (df_paleo2
 .drop("gas_ageBP", axis=1)
 .assign(Source="PALEO2")
)
df.head()
CO2 Year Source
0 280.4 1813 PALEO2
1 274.9 1682 PALEO2
2 277.9 1671 PALEO2
3 279.1 1555 PALEO2
4 281.9 1546 PALEO2
df_all = pd.concat([df_all, df], axis=0, ignore_index=True)
df_all.sample(10)
Year CO2 Source
1419 -228472 233.90 PALEO2
1397 -208863 243.40 PALEO2
1362 -177167 198.10 PALEO2
272 1222 281.63 PALEO1
1744 -537009 205.00 PALEO2
2013 -710883 218.50 PALEO2
154 1104 281.92 PALEO1
75 1025 279.00 PALEO1
1482 -290455 207.60 PALEO2
322 1272 281.32 PALEO1

Visualize data (full picture)¶

plt.figure(figsize=(12, 6))

sources = ("OBS", "PALEO1", "PALEO2")
names = ("NOAA (Mauna Loa)", "Ice cores", "Paleo")
colors = ("orange", "royalblue", "tab:green")

for source, name, color in zip(sources, names, colors):
    plt.plot(df_all[ df_all["Source"] == source]["Year"]/1000,
             df_all[ df_all["Source"] == source]["CO2"],
             color=color, linewidth=3, label=name, alpha=0.6)

plt.xlim([-800, 5])
plt.title("CO2 values for the last 800,000 years")
plt.xlabel("Year (x$10^3$)")
plt.ylabel("CO2 (ppm)")
max_year = df_all["Year"].max()
max_CO2 =  df_all[ df_all["Year"] == max_year ]["CO2"]
plt.axhline(max_CO2.values[0], linestyle="--", color="gray",
            label="Highest CO2 level")
plt.legend(loc="upper left")
plt.show()
[Figure: "CO2 values for the last 800,000 years", CO2 (ppm) versus year (x10^3), with a dashed line at the highest CO2 level.]

Visualize data (zoom in)¶

plt.figure(figsize=(12, 6))

start_year = 1000
for source, name, color in zip(sources, names, colors):
    plt.plot(df_all[ (df_all["Source"] == source) & (df_all["Year"] >= start_year) ]["Year"],
             df_all[ (df_all["Source"] == source) & (df_all["Year"] >= start_year) ]["CO2"],
             linewidth=4, color=color, label=name, alpha=0.6)

plt.xlim([start_year, 2023])
plt.ylim([270, 425])
plt.title(f"CO2 values since {start_year}")
plt.xlabel("Year")
plt.ylabel("CO2 (ppm)")
max_year = df_all["Year"].max()
max_CO2 =  df_all[ df_all["Year"] == max_year ]["CO2"]
plt.axhline(max_CO2.values[0], linestyle="--", color="gray",
            label="Highest CO2 level")
plt.legend(loc="upper left")
plt.show()
[Figure: "CO2 values since 1000", CO2 (ppm) versus year, with a dashed line at the highest CO2 level.]

Even though CO2 values from paleoclimatology are estimated values only and come from different locations, they are in good agreement with each other and with values from Mauna Loa observatory.
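
The agreement between two sources could be quantified on the years they have in common. The sketch below is one possible approach, shown on made-up values with the same Year / CO2 / Source layout as the aggregated dataframe; `compare_sources` is a hypothetical helper.

```python
import pandas as pd


def compare_sources(df, source_a, source_b):
    """Return CO2 values of two sources side by side for their common years."""
    a = df.loc[df["Source"] == source_a, ["Year", "CO2"]]
    b = df.loc[df["Source"] == source_b, ["Year", "CO2"]]
    # Inner join: keep only years present in both sources.
    merged = a.merge(b, on="Year", suffixes=(f"_{source_a}", f"_{source_b}"))
    merged["difference"] = merged[f"CO2_{source_a}"] - merged[f"CO2_{source_b}"]
    return merged


# Made-up values, for illustration only.
demo = pd.DataFrame({
    "Year": [2000, 2001, 2002, 2000, 2001],
    "CO2": [369.0, 371.0, 373.0, 368.0, 370.5],
    "Source": ["OBS", "OBS", "OBS", "PALEO1", "PALEO1"],
})
overlap = compare_sources(demo, "OBS", "PALEO1")
print(overlap)
print(f"Mean absolute difference: {overlap['difference'].abs().mean():.2f} ppm")
```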

Conclusion¶

Atmospheric CO2 records have been soaring since pre-industrial times (circa 1850).

Library versions¶

%load_ext watermark
# Python implementation and version, and machine architecture
%watermark
# Versions for jupyterlab, imported packages and watermark itself
%watermark --packages jupyterlab,ipywidgets --iversions --watermark
# Name of conda environment
%watermark --conda
Last updated: 2024-01-21T23:30:40.778294+01:00

Python implementation: CPython
Python version       : 3.12.0
IPython version      : 8.20.0

Compiler    : GCC 11.2.0
OS          : Linux
Release     : 6.2.0-39-generic
Machine     : x86_64
Processor   : x86_64
CPU cores   : 8
Architecture: 64bit

jupyterlab: 4.0.10
ipywidgets: 8.1.1

scipy     : 1.11.4
matplotlib: 3.8.2
requests  : 2.31.0
numpy     : 1.26.3
pandas    : 2.1.4

Watermark: 2.4.3

conda environment: data-science-climate-change
