{ "cells": [ { "cell_type": "markdown", "id": "fcc18135-13cb-4fd8-b04b-c5cc9836ee74", "metadata": {}, "source": [ "# RioXarray raster index examples" ] }, { "cell_type": "code", "execution_count": 1, "id": "1b4ca507-2fa4-4809-a66e-28a907eda00e", "metadata": {}, "outputs": [], "source": [ "import xarray as xr\n", "import rioxarray as rio\n", "\n", "xr.set_options(display_expand_indexes=True);" ] }, { "cell_type": "markdown", "id": "9cbecd44-d3a7-4efa-8ac1-9840788bed81", "metadata": {}, "source": [ "## Example with rectilinear and no rotation affine transform\n", "\n", "Both x and y coordinates are 1-dimensional." ] }, { "cell_type": "code", "execution_count": 2, "id": "01685f92-546b-454a-908f-12e64c1a4631", "metadata": {}, "outputs": [], "source": [ "source = \"/vsicurl/https://noaadata.apps.nsidc.org/NOAA/G02135/south/daily/geotiff/2024/01_Jan/S_20240101_concentration_v3.0.tif\"" ] }, { "cell_type": "markdown", "id": "b12bfaff-fe20-43d1-b827-812a6299b87f", "metadata": {}, "source": [ "### Load and inspect the datasets, with and without `RasterIndex`." ] }, { "cell_type": "code", "execution_count": 3, "id": "628796f4-0cd8-405d-99de-ff7a1de976ef", "metadata": {}, "outputs": [], "source": [ "da_no_raster_index = xr.open_dataarray(source, engine=\"rasterio\")\n", "\n", "with rio.set_options(use_raster_index=True):\n", " # note: use open_dataarray() instead of load_dataarray()\n", " # for getting lazy spatial coordinates\n", " da_raster_index = xr.open_dataarray(source, engine=\"rasterio\")" ] }, { "cell_type": "code", "execution_count": 4, "id": "68328dfd-2885-4bd9-8346-44c2f01d5deb", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
<xarray.DataArray 'band_data' (band: 1, y: 332, x: 316)> Size: 420kB\n",
       "[104912 values with dtype=float32]\n",
       "Coordinates:\n",
       "  * band         (band) int64 8B 1\n",
       "  * x            (x) float64 3kB -3.938e+06 -3.912e+06 ... 3.912e+06 3.938e+06\n",
       "  * y            (y) float64 3kB 4.338e+06 4.312e+06 ... -3.912e+06 -3.938e+06\n",
       "    spatial_ref  int64 8B ...\n",
       "Attributes:\n",
       "    AREA_OR_POINT:  Area
" ], "text/plain": [ " Size: 420kB\n", "[104912 values with dtype=float32]\n", "Coordinates:\n", " * band (band) int64 8B 1\n", " * x (x) float64 3kB -3.938e+06 -3.912e+06 ... 3.912e+06 3.938e+06\n", " * y (y) float64 3kB 4.338e+06 4.312e+06 ... -3.912e+06 -3.938e+06\n", " spatial_ref int64 8B ...\n", "Attributes:\n", " AREA_OR_POINT: Area" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "da_no_raster_index" ] }, { "cell_type": "markdown", "id": "28f1f83c-37e7-478f-b676-9800c7eae768", "metadata": {}, "source": [ "The \"x\" and \"y\" coordinates with a raster index are lazy! The repr below shows a few values for each coordinate (those have been computed on-the-fly) but clicking on the database icon doesn't show any value in the spatial coordinate data reprs." ] }, { "cell_type": "code", "execution_count": 5, "id": "5f07a272-895f-40ac-8ae3-07a8d3e07521", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
<xarray.DataArray 'band_data' (band: 1, y: 332, x: 316)> Size: 420kB\n",
       "[104912 values with dtype=float32]\n",
       "Coordinates:\n",
       "  * band         (band) int64 8B 1\n",
       "  * x            (x) float64 3kB -3.938e+06 -3.912e+06 ... 3.912e+06 3.938e+06\n",
       "  * y            (y) float64 3kB 4.338e+06 4.312e+06 ... -3.912e+06 -3.938e+06\n",
       "    spatial_ref  int64 8B ...\n",
       "Indexes:\n",
       "  ┌ x        RasterIndex\n",
       "  └ y\n",
       "Attributes:\n",
       "    AREA_OR_POINT:  Area
" ], "text/plain": [ " Size: 420kB\n", "[104912 values with dtype=float32]\n", "Coordinates:\n", " * band (band) int64 8B 1\n", " * x (x) float64 3kB -3.938e+06 -3.912e+06 ... 3.912e+06 3.938e+06\n", " * y (y) float64 3kB 4.338e+06 4.312e+06 ... -3.912e+06 -3.938e+06\n", " spatial_ref int64 8B ...\n", "Indexes:\n", " ┌ x RasterIndex\n", " └ y\n", "Attributes:\n", " AREA_OR_POINT: Area" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "da_raster_index" ] }, { "cell_type": "code", "execution_count": 6, "id": "9ab08edf-c413-451b-9ddb-b9b894ea139c", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "xarray.core.indexing.CoordinateTransformIndexingAdapter" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "type(da_raster_index.coords.variables[\"x\"]._data)" ] }, { "cell_type": "markdown", "id": "81ccec5b-cfbc-4532-846d-725f62a9fb9b", "metadata": {}, "source": [ "### Compare and align the datasets with and without `RasterIndex`\n", "\n", "`equals` compares variable values without relying on Xarray coordinate indexes. Both dataarrays should thus be equal." ] }, { "cell_type": "code", "execution_count": 7, "id": "95309ec5-b317-4171-85ba-bd01cc9bd649", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "da_raster_index.equals(da_no_raster_index)" ] }, { "cell_type": "markdown", "id": "faa19ff9-404f-4885-90a6-5188c6c2e585", "metadata": {}, "source": [ "Xarray alignment relies on Xarray coordinate indexes. Trying to align both datasets fails here since they each have different index types.\n", "\n", "Maybe Xarray should try aligning the datasets based on coordinate variable data in this case? I don't think this would be easy to implement in a general way... Xarray's alignment logic is already complex! Also the alignment failure here is not necessarily a bad thing (cf. alignment issues with explicit floating-point coordinates)." ] }, { "cell_type": "code", "execution_count": 8, "id": "ac2d1532-2441-4a86-8339-a4d5ad78dd01", "metadata": {}, "outputs": [], "source": [ "# this fails!\n", "\n", "# da_raster_index + da_no_raster_index" ] }, { "cell_type": "markdown", "id": "909e425e-178b-401a-b754-f198f59dce5e", "metadata": {}, "source": [ "### Indexing the dataarray with `RasterIndex`" ] }, { "cell_type": "markdown", "id": "e0bd8ecb-25c1-4ce9-9af8-b354b636d445", "metadata": {}, "source": [ "#### Integer-based selection (isel)" ] }, { "cell_type": "markdown", "id": "6832b3b2-c632-492b-ba51-582c010384eb", "metadata": {}, "source": [ "- *Slicing both x and y*\n", "\n", "Note: this should normally return a single RasterIndex for both x and y (see https://github.com/pydata/xarray/issues/10063)." ] }, { "cell_type": "code", "execution_count": 9, "id": "f2d90e00-72c5-4312-95e8-b9a23fe5d5e3", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
<xarray.DataArray 'band_data' (band: 1, y: 166, x: 3)> Size: 2kB\n",
       "array([[[0., 0., 0.],\n",
       "        [0., 0., 0.],\n",
       "        ...,\n",
       "        [0., 0., 0.],\n",
       "        [0., 0., 0.]]], shape=(1, 166, 3), dtype=float32)\n",
       "Coordinates:\n",
       "  * band         (band) int64 8B 1\n",
       "  * x            (x) float64 24B -3.912e+06 -3.888e+06 -3.862e+06\n",
       "  * y            (y) float64 1kB 4.338e+06 4.288e+06 ... -3.862e+06 -3.912e+06\n",
       "    spatial_ref  int64 8B 0\n",
       "Indexes:\n",
       "    x        RasterIndex\n",
       "    y        RasterIndex\n",
       "Attributes:\n",
       "    AREA_OR_POINT:  Area
" ], "text/plain": [ " Size: 2kB\n", "array([[[0., 0., 0.],\n", " [0., 0., 0.],\n", " ...,\n", " [0., 0., 0.],\n", " [0., 0., 0.]]], shape=(1, 166, 3), dtype=float32)\n", "Coordinates:\n", " * band (band) int64 8B 1\n", " * x (x) float64 24B -3.912e+06 -3.888e+06 -3.862e+06\n", " * y (y) float64 1kB 4.338e+06 4.288e+06 ... -3.862e+06 -3.912e+06\n", " spatial_ref int64 8B 0\n", "Indexes:\n", " x RasterIndex\n", " y RasterIndex\n", "Attributes:\n", " AREA_OR_POINT: Area" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "da_sliced = da_raster_index.isel(x=slice(1, 4), y=slice(None, None, 2))\n", "da_sliced" ] }, { "cell_type": "markdown", "id": "baf29b0e-3062-426d-920d-d0b293143422", "metadata": {}, "source": [ "Slicing keeps both coordinates lazy (it computes a new affine transform):" ] }, { "cell_type": "code", "execution_count": 10, "id": "513ebddc-bc55-4893-8f80-881e1bfa56c2", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "RasterIndex\n", "'x':\n", " \n", "RasterIndex\n", "'y':\n", " \n" ] } ], "source": [ "print(da_sliced.xindexes[\"x\"])\n", "print(da_sliced.xindexes[\"y\"])" ] }, { "cell_type": "markdown", "id": "dbafcfdc-4ffe-4c0c-985d-4f760a81b0bd", "metadata": {}, "source": [ "- *Outer indexing with arbitrary array values*" ] }, { "cell_type": "code", "execution_count": 11, "id": "be610609-4eed-4971-8fb3-bf1fc324d128", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
<xarray.DataArray 'band_data' (band: 1, y: 3, x: 3)> Size: 36B\n",
       "array([[[0., 0., 0.],\n",
       "        [0., 0., 0.],\n",
       "        [0., 0., 0.]]], dtype=float32)\n",
       "Coordinates:\n",
       "  * band         (band) int64 8B 1\n",
       "  * x            (x) float64 24B -3.938e+06 -3.888e+06 -3.838e+06\n",
       "  * y            (y) float64 24B 4.338e+06 4.338e+06 4.312e+06\n",
       "    spatial_ref  int64 8B 0\n",
       "Indexes:\n",
       "    x        RasterIndex\n",
       "    y        RasterIndex\n",
       "Attributes:\n",
       "    AREA_OR_POINT:  Area
" ], "text/plain": [ " Size: 36B\n", "array([[[0., 0., 0.],\n", " [0., 0., 0.],\n", " [0., 0., 0.]]], dtype=float32)\n", "Coordinates:\n", " * band (band) int64 8B 1\n", " * x (x) float64 24B -3.938e+06 -3.888e+06 -3.838e+06\n", " * y (y) float64 24B 4.338e+06 4.338e+06 4.312e+06\n", " spatial_ref int64 8B 0\n", "Indexes:\n", " x RasterIndex\n", " y RasterIndex\n", "Attributes:\n", " AREA_OR_POINT: Area" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "da_outer = da_raster_index.isel(x=[0, 2, 4], y=[0, 0, 1])\n", "da_outer" ] }, { "cell_type": "markdown", "id": "ce3502c0-3af5-4470-ba58-b46743ab3d7e", "metadata": {}, "source": [ "We cannot compute a new affine transfrom given arbitrary array positions. To allow further data selection, pandas indexes are created for indexed spatial coordinates:" ] }, { "cell_type": "code", "execution_count": 12, "id": "9695d517-99f8-4708-85a0-0aedc7a0386b", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "RasterIndex\n", "'x':\n", " PandasIndex(Index([-3937500.0, -3887500.0, -3837500.0], dtype='float64', name='x'))\n", "RasterIndex\n", "'y':\n", " PandasIndex(Index([4337500.0, 4337500.0, 4312500.0], dtype='float64', name='y'))\n" ] } ], "source": [ "print(da_outer.xindexes[\"x\"])\n", "print(da_outer.xindexes[\"y\"])" ] }, { "cell_type": "markdown", "id": "a1b7a75b-94dc-4fad-b901-a7d2e563c5ff", "metadata": {}, "source": [ "- *Basic indexing with scalars*" ] }, { "cell_type": "code", "execution_count": 13, "id": "69a94f72-781d-412d-b02f-8068bef2e0a8", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
<xarray.DataArray 'band_data' (band: 1)> Size: 4B\n",
       "array([0.], dtype=float32)\n",
       "Coordinates:\n",
       "  * band         (band) int64 8B 1\n",
       "    x            float64 8B -3.938e+06\n",
       "    y            float64 8B 4.312e+06\n",
       "    spatial_ref  int64 8B 0\n",
       "Attributes:\n",
       "    AREA_OR_POINT:  Area
" ], "text/plain": [ " Size: 4B\n", "array([0.], dtype=float32)\n", "Coordinates:\n", " * band (band) int64 8B 1\n", " x float64 8B -3.938e+06\n", " y float64 8B 4.312e+06\n", " spatial_ref int64 8B 0\n", "Attributes:\n", " AREA_OR_POINT: Area" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "da_scalar = da_raster_index.isel(x=0, y=1)\n", "da_scalar" ] }, { "cell_type": "code", "execution_count": 14, "id": "55148b6b-6f4c-4d2b-aded-1c7895a65bb5", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
<xarray.DataArray 'band_data' (band: 1, y: 332)> Size: 1kB\n",
       "array([[0., 0., 0., ..., 0., 0., 0.]], shape=(1, 332), dtype=float32)\n",
       "Coordinates:\n",
       "  * band         (band) int64 8B 1\n",
       "    x            float64 8B -3.938e+06\n",
       "  * y            (y) float64 3kB 4.338e+06 4.312e+06 ... -3.912e+06 -3.938e+06\n",
       "    spatial_ref  int64 8B 0\n",
       "Indexes:\n",
       "    y        RasterIndex\n",
       "Attributes:\n",
       "    AREA_OR_POINT:  Area
" ], "text/plain": [ " Size: 1kB\n", "array([[0., 0., 0., ..., 0., 0., 0.]], shape=(1, 332), dtype=float32)\n", "Coordinates:\n", " * band (band) int64 8B 1\n", " x float64 8B -3.938e+06\n", " * y (y) float64 3kB 4.338e+06 4.312e+06 ... -3.912e+06 -3.938e+06\n", " spatial_ref int64 8B 0\n", "Indexes:\n", " y RasterIndex\n", "Attributes:\n", " AREA_OR_POINT: Area" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "da_xscalar = da_raster_index.isel(x=0)\n", "da_xscalar" ] }, { "cell_type": "markdown", "id": "859268ae-cb9f-49bb-ab4d-a716af7e7f1f", "metadata": {}, "source": [ "The RasterIndex is preserved in case of partial dimension reduction. (Note: below the index of \"y\" still wraps an index for \"x\", this is also related to https://github.com/pydata/xarray/issues/10063)." ] }, { "cell_type": "code", "execution_count": 15, "id": "8ad95332-8643-462d-a395-c34deaa06f28", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "RasterIndex\n", "'x':\n", " \n", "'y':\n", " " ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "da_xscalar.xindexes[\"y\"]" ] }, { "cell_type": "markdown", "id": "1402935c-3e70-4ec3-a2e2-8912158a0b11", "metadata": {}, "source": [ "- *Vectorized (fancy) indexing*\n", "\n", "Indexing the spatial coordinates with Xarray `Variable` objects returns a `RasterIndex` (wrapping `PandasIndex`) for 1-dimensional variables and no index for scalar or n-dimensional variables." ] }, { "cell_type": "code", "execution_count": 16, "id": "8e707c5b-6f7b-4027-be1b-683b10d24179", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
<xarray.DataArray 'band_data' (band: 1, z: 2)> Size: 8B\n",
       "array([[0., 0.]], dtype=float32)\n",
       "Coordinates:\n",
       "  * band         (band) int64 8B 1\n",
       "  * x            (z) float64 16B -3.938e+06 -3.912e+06\n",
       "  * y            (z) float64 16B 4.312e+06 4.312e+06\n",
       "    spatial_ref  int64 8B 0\n",
       "Dimensions without coordinates: z\n",
       "Indexes:\n",
       "    x        RasterIndex\n",
       "    y        RasterIndex\n",
       "Attributes:\n",
       "    AREA_OR_POINT:  Area
" ], "text/plain": [ " Size: 8B\n", "array([[0., 0.]], dtype=float32)\n", "Coordinates:\n", " * band (band) int64 8B 1\n", " * x (z) float64 16B -3.938e+06 -3.912e+06\n", " * y (z) float64 16B 4.312e+06 4.312e+06\n", " spatial_ref int64 8B 0\n", "Dimensions without coordinates: z\n", "Indexes:\n", " x RasterIndex\n", " y RasterIndex\n", "Attributes:\n", " AREA_OR_POINT: Area" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "da_points = da_raster_index.isel(x=xr.Variable(\"z\", [0, 1]), y=xr.Variable(\"z\", [1, 1]))\n", "da_points" ] }, { "cell_type": "code", "execution_count": 17, "id": "36cafddb-a167-4e7f-97d6-f4b3420f1597", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
<xarray.DataArray 'band_data' (band: 1, u: 2, v: 2)> Size: 16B\n",
       "array([[[0., 0.],\n",
       "        [0., 0.]]], dtype=float32)\n",
       "Coordinates:\n",
       "  * band         (band) int64 8B 1\n",
       "    x            (u, v) float64 32B -3.938e+06 -3.912e+06 -3.888e+06 -3.862e+06\n",
       "    y            (u, v) float64 32B 4.312e+06 4.312e+06 4.288e+06 4.288e+06\n",
       "    spatial_ref  int64 8B 0\n",
       "Dimensions without coordinates: u, v\n",
       "Attributes:\n",
       "    AREA_OR_POINT:  Area
" ], "text/plain": [ " Size: 16B\n", "array([[[0., 0.],\n", " [0., 0.]]], dtype=float32)\n", "Coordinates:\n", " * band (band) int64 8B 1\n", " x (u, v) float64 32B -3.938e+06 -3.912e+06 -3.888e+06 -3.862e+06\n", " y (u, v) float64 32B 4.312e+06 4.312e+06 4.288e+06 4.288e+06\n", " spatial_ref int64 8B 0\n", "Dimensions without coordinates: u, v\n", "Attributes:\n", " AREA_OR_POINT: Area" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "da_points2d = da_raster_index.isel(\n", " x=xr.Variable((\"u\", \"v\"), [[0, 1], [2, 3]]),\n", " y=xr.Variable((\"u\", \"v\"), [[1, 1], [2, 2]]),\n", ")\n", "da_points2d" ] }, { "cell_type": "markdown", "id": "5e9c553c-a056-4d31-9102-51c44ac1cd37", "metadata": {}, "source": [ "#### Label-based selection (sel)" ] }, { "cell_type": "markdown", "id": "5ecd7d5a-a20e-4cfd-bbf2-87fac330bd08", "metadata": {}, "source": [ "TODO" ] }, { "cell_type": "markdown", "id": "5bdbb1ae-2715-47b7-9743-144fabd335d3", "metadata": {}, "source": [ "## Example with complex affine transformation\n", "\n", "x and y coordinates are both 2-dimensional." ] }, { "cell_type": "markdown", "id": "fb02ebda-7a09-48b2-b95d-7e78a3fd147c", "metadata": {}, "source": [ "TODO" ] }, { "cell_type": "code", "execution_count": null, "id": "6906b70b-07b2-4502-b0c1-c2282d4b94aa", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python (Pixi)", "language": "python", "name": "pixi-kernel-python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.13.2" } }, "nbformat": 4, "nbformat_minor": 5 }