U.S. County-to-County Migration
This notebook is derived from the original deck.gl example, which is written in JavaScript.
This dataset originally came from the U.S. Census Bureau and represents people moving in and out of each county between 2009 and 2013.
This also serves as a notebook for day 10 of the 30 Day Map Challenge.
Imports

```python
import geopandas as gpd
import numpy as np
import pandas as pd
import pyarrow as pa
import requests
import shapely
from matplotlib.colors import Normalize

from lonboard import Map, ScatterplotLayer
from lonboard.experimental import ArcLayer
from lonboard.layer_extension import BrushingExtension
```
Fetch the data from the copy in the `deck.gl-data` repository.

```python
url = "https://raw.githubusercontent.com/visgl/deck.gl-data/master/examples/arc/counties.json"
r = requests.get(url)
source_data = r.json()
```
```python
arcs = []
targets = []
sources = []
pairs = {}

features = source_data["features"]
for i, county in enumerate(features):
    flows = county["properties"]["flows"]
    target_centroid = county["properties"]["centroid"]
    total_value = {
        "gain": 0,
        "loss": 0,
    }

    for to_id, value in flows.items():
        if value > 0:
            total_value["gain"] += value
        else:
            total_value["loss"] += value

        # If number is too small, ignore it
        if abs(value) < 50:
            continue

        pair_key = "-".join(map(str, sorted([i, int(to_id)])))
        source_centroid = features[int(to_id)]["properties"]["centroid"]
        gain = np.sign(flows[to_id])

        # add point at arc source
        sources.append(
            {
                "position": source_centroid,
                "target": target_centroid,
                "name": features[int(to_id)]["properties"]["name"],
                "radius": 3,
                "gain": -gain,
            }
        )

        # eliminate duplicate arcs
        if pair_key in pairs.keys():
            continue

        pairs[pair_key] = True

        if gain > 0:
            arcs.append(
                {
                    "target": target_centroid,
                    "source": source_centroid,
                    "value": flows[to_id],
                }
            )
        else:
            arcs.append(
                {
                    "target": source_centroid,
                    "source": target_centroid,
                    "value": flows[to_id],
                }
            )

    # add point at arc target
    targets.append(
        {
            **total_value,
            "position": [target_centroid[0], target_centroid[1], 10],
            "net": total_value["gain"] + total_value["loss"],
            "name": county["properties"]["name"],
        }
    )

# sort targets by radius large -> small
targets = sorted(targets, key=lambda d: abs(d["net"]), reverse=True)
normalizer = Normalize(0, abs(targets[0]["net"]))
```
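The filtering and deduplication steps in the loop above can be hard to follow on the full dataset. Below is a minimal sketch of the same idea on made-up data: `toy_flows`, `MIN_FLOW`, `seen_pairs`, and `toy_arcs` are hypothetical names, the flow values are invented, and a `set` stands in for the `pairs` dictionary.

```python
# Toy stand-in for the real data: county index -> {other county id: net flow}.
# These numbers are invented for illustration only.
toy_flows = {
    0: {"1": 120, "2": -30},
    1: {"0": -120, "2": 75},
    2: {"0": 30, "1": -75},
}

MIN_FLOW = 50  # same threshold as the loop above
seen_pairs = set()
toy_arcs = []

for i, flows in toy_flows.items():
    for to_id, value in flows.items():
        # Skip flows that are too small to draw
        if abs(value) < MIN_FLOW:
            continue

        # Sorting the two indexes makes (0, 1) and (1, 0) produce the same key
        pair_key = "-".join(map(str, sorted([i, int(to_id)])))
        if pair_key in seen_pairs:
            continue
        seen_pairs.add(pair_key)

        # As in the loop above: a positive value draws the arc from `to_id`
        # into county `i`, a negative value draws it the other way
        if value > 0:
            toy_arcs.append({"source": int(to_id), "target": i, "value": value})
        else:
            toy_arcs.append({"source": i, "target": int(to_id), "value": value})

print(toy_arcs)
```

Each pair of counties contributes at most one arc, no matter which side of the pair the loop visits first.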
We define some color constants, as well as a color lookup array.
A nice trick in numpy is that if you have a two-dimensional array like:
[
[166, 3, 3],
[ 35, 181, 184]
]
you can index it with an array of row indexes to turn a one-dimensional array of indexes into a two-dimensional array of colors. In this case, we'll use `0` and `1` (the two valid indexes along the array's first dimension) to create an array of colors.
So when we call `COLORS[colors_lookup]`, that creates an output array of something like:
[
[166, 3, 3],
[ 35, 181, 184],
[166, 3, 3],
[166, 3, 3]
]
with one color for each row in our dataset. We can then pass this to any parameter that accepts a `ColorAccessor`.
```python
# migrate out
SOURCE_COLOR = [166, 3, 3]
# migrate in
TARGET_COLOR = [35, 181, 184]
# Combine into a single arr to use as a lookup table
COLORS = np.vstack(
    [np.array(SOURCE_COLOR, dtype=np.uint8), np.array(TARGET_COLOR, dtype=np.uint8)]
)
SOURCE_LOOKUP = 0
TARGET_LOOKUP = 1
```
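To see the lookup trick in isolation, here is a small self-contained sketch. The `net` values are hypothetical, and the lowercase names (`colors`, `lookup`, `row_colors`) are illustrative only, chosen so they don't clash with the constants above.

```python
import numpy as np

# Same two colors as above, stacked into a (2, 3) lookup table
colors = np.array([[166, 3, 3], [35, 181, 184]], dtype=np.uint8)

# Hypothetical per-row values; a positive value means "gain"
net = np.array([-12, 40, -3, -8])

# 1 where the value is positive, 0 otherwise
lookup = np.where(net > 0, 1, 0)

# Indexing with an integer array picks one row of `colors` per element of `lookup`
row_colors = colors[lookup]
print(row_colors.shape)  # (4, 3)
```

The result keeps the `uint8` dtype of the lookup table and has one RGB row per input value.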
```python
brushing_extension = BrushingExtension()
brushing_radius = 200000
```
Convert the `sources` list of dictionaries into a GeoPandas `GeoDataFrame` to pass into a `ScatterplotLayer`.

```python
source_arr = np.array([source["position"] for source in sources])
source_positions = shapely.points(source_arr[:, 0], source_arr[:, 1])
source_gdf = gpd.GeoDataFrame(
    pd.DataFrame.from_records(sources)[["name", "radius", "gain"]],
    geometry=source_positions,
    crs="EPSG:4326",
)
# We use a lookup table (`COLORS`) to apply either the target color or the source color
# to the array
source_colors_lookup = np.where(source_gdf["gain"] > 0, TARGET_LOOKUP, SOURCE_LOOKUP)
source_fill_colors = COLORS[source_colors_lookup]
```
Create a `ScatterplotLayer` for source points:

```python
source_layer = ScatterplotLayer.from_geopandas(
    source_gdf,
    get_fill_color=source_fill_colors,
    radius_scale=3000,
    pickable=False,
    extensions=[brushing_extension],
    brushing_radius=brushing_radius,
)
```
```python
targets_arr = np.array([target["position"] for target in targets])
target_positions = shapely.points(targets_arr[:, 0], targets_arr[:, 1])
target_gdf = gpd.GeoDataFrame(
    pd.DataFrame.from_records(targets)[["name", "gain", "loss", "net"]],
    geometry=target_positions,
    crs="EPSG:4326",
)
# We use a lookup table (`COLORS`) to apply either the target color or the source color
# to the array
target_line_colors_lookup = np.where(target_gdf["net"] > 0, TARGET_LOOKUP, SOURCE_LOOKUP)
target_line_colors = COLORS[target_line_colors_lookup]
```
Create a `ScatterplotLayer` for target points:

```python
target_ring_layer = ScatterplotLayer.from_geopandas(
    target_gdf,
    get_line_color=target_line_colors,
    radius_scale=4000,
    pickable=True,
    stroked=True,
    filled=False,
    line_width_min_pixels=2,
    extensions=[brushing_extension],
    brushing_radius=brushing_radius,
)
```
Note: the `ArcLayer` can't currently be created from a GeoDataFrame because it needs two point columns, not one. This is a large part of why it's still marked under the "experimental" module.
Here we pass a numpy array for each point column. This is allowed as long as the shape of the array is `(N, 2)` or `(N, 3)` (i.e. 2D or 3D coordinates).
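As a quick sanity check of that shape requirement, the sketch below builds the two position arrays from a hypothetical `example_arcs` list (the coordinates and values are made up) and verifies that each one is two-dimensional with two or three columns.

```python
import numpy as np

# Hypothetical arcs with [longitude, latitude] endpoints, shaped like the
# entries of the `arcs` list built earlier
example_arcs = [
    {"source": [-122.4, 37.8], "target": [-73.9, 40.7], "value": 120},
    {"source": [-87.6, 41.9], "target": [-95.4, 29.8], "value": -75},
]

example_sources = np.array([arc["source"] for arc in example_arcs])
example_targets = np.array([arc["target"] for arc in example_arcs])

# Each array should be two-dimensional with 2 (or 3) coordinates per row
for positions in (example_sources, example_targets):
    assert positions.ndim == 2 and positions.shape[1] in (2, 3)

print(example_sources.shape)  # (2, 2)
```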
```python
value = np.array([arc["value"] for arc in arcs])
get_source_position = np.array([arc["source"] for arc in arcs])
get_target_position = np.array([arc["target"] for arc in arcs])
table = pa.table({"value": value})

arc_layer = ArcLayer(
    table=table,
    get_source_position=get_source_position,
    get_target_position=get_target_position,
    get_source_color=SOURCE_COLOR,
    get_target_color=TARGET_COLOR,
    get_width=1,
    opacity=0.4,
    pickable=False,
    extensions=[brushing_extension],
    brushing_radius=brushing_radius,
)
```
Now we can create a map from the three layers defined above.
As you hover over the map, it should render only the arcs near your cursor.
You can change the `brushing_radius` value passed to each layer (in meters) to control how large the brush is around your cursor.
```python
map_ = Map(layers=[source_layer, target_ring_layer, arc_layer], picking_radius=10)
map_
```