Building Pipelines#

In this exercise we will explore Panel’s .rx() API to add widgets to our analyses and plots.

We’ll first load the earthquakes DataFrame and filter to those with >=7 magnitude:

import pathlib
import pandas as pd
import xarray as xr
import panel as pn  # noqa

import hvplot.pandas # noqa: adds hvplot method to pandas objects
import hvplot.xarray # noqa: adds hvplot method to xarray objects

pn.extension(sizing_mode="stretch_width")

df = pd.read_parquet(pathlib.Path('../../data/earthquakes-projected.parq'))
columns = ['mag', 'depth', 'latitude', 'longitude', 'place', 'type']
df = df[columns]
most_severe = df[df.mag >= 7]

df.head()

	mag	depth	latitude	longitude	place	type
time
2000-01-31 23:52:00.619000+00:00	0.60	7.800	37.1623	-116.6037	Nevada	earthquake
2000-01-31 23:44:54.060000+00:00	1.72	4.516	34.3610	-116.1440	26km NNW of Twentynine Palms, California	earthquake
2000-01-31 23:28:38.420000+00:00	2.10	33.000	10.6930	-61.1620	Trinidad, Trinidad and Tobago	earthquake
2000-01-31 23:05:22.010000+00:00	4.50	33.000	-1.2030	-80.7160	near the coast of Ecuador	earthquake
2000-01-31 22:56:50.996000+00:00	1.40	7.200	38.7860	-119.6409	Nevada	earthquake

Initial inspection of the depth data#

Declare and display a float slider with the handle depth_slider, named ‘Minimum depth’, that ranges from 0 to 700 kilometers, and verify that the depth values in most_severe lie in this range. Set the default value to the middle of this range.

Hint

You can use the min() and max() methods on the depth Series of most_severe to check the range. To declare the slider, use a pn.widgets.FloatSlider.

depth_slider = ... 
depth_slider

Ellipsis

depth_slider = pn.widgets.FloatSlider(name='Minimum depth', start=0, end=700, value=350)
depth_slider

>>> most_severe.depth.min()
4.2
>>> most_severe.depth.max()
675.4
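The range check described above can be sketched on a small stand-in table. The `sample` DataFrame and its values are hypothetical (only the 4.2 and 675.4 extremes mirror the output shown above); the real check runs on most_severe.

```python
import pandas as pd

# Hypothetical stand-in for most_severe; the real data comes from the parquet file.
sample = pd.DataFrame({"mag": [7.1, 7.5, 8.0], "depth": [4.2, 120.0, 675.4]})

start, end = 0, 700  # the slider bounds used in this exercise

# Smallest and largest depth in the table
lo, hi = sample["depth"].min(), sample["depth"].max()

# True when every depth value falls inside the slider range
in_range = bool(start <= lo and hi <= end)

# The default slider value asked for: the middle of the range
midpoint = (start + end) / 2

print(lo, hi, in_range, midpoint)
```

If in_range were False, the slider bounds would need widening before the slider could address every row.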
    

Exploring a reactive `DataFrame`#

Now we will create a new reactive DataFrame called rdf. (The sizing_mode='stretch_width' default was already set in the pn.extension call above.)

Hint

Use pn.rx on most_severe to create the reactive DataFrame called rdf.

rdf = ... # reactive DataFrame version of most_severe

rdf = pn.rx(most_severe)

Now use this reactive DataFrame to select the earthquakes deeper than the value specified by the depth_slider. Call the filtered DataFrame depth_filtered and, to view it conveniently, use the .head() method to see the first few entries.

Hint

Use the regular pandas idiom to filter a DataFrame with df[mask], where mask is a boolean mask. The only difference is that instead of picking a fixed depth value to filter by, you can use the depth_slider widget.

depth_filtered = ...
# Now display the head of this reactive dataframe

depth_slider = pn.widgets.FloatSlider(name='Minimum depth', start=0, end=700, value=350)

rdf = pn.rx(most_severe)
depth_filtered = rdf[rdf['depth'] > depth_slider]  # keep earthquakes deeper than the slider value

depth_filtered.head()

Plotting the depth filtered data#

For an initial plot, try calling .hvplot() and seeing what happens (which is unlikely to be what you wanted by default!).

# depth_filtered.hvplot()

Now let’s make a more meaningful plot, such as a scatter plot of the magnitude of the filtered earthquakes with red (x) markers.

Hint

The magnitude column is called mag. You can set x markers with marker='x', and to get a scatter plot you can use kind='scatter'. The color argument accepts not just a single color like 'red', but also the name of a column, in which case the points are colored by that column.

# Scatter plot of magnitude, filtered by depth with red cross markers

depth_slider = pn.widgets.FloatSlider(name='Minimum depth', start=0, end=700, value=350)

rdf = pn.rx(most_severe)
depth_filtered = rdf[rdf['depth'] > depth_slider]  # keep earthquakes deeper than the slider value
depth_filtered.hvplot(y='mag', kind='scatter', color='red', marker='x')

Using reactive xarrays#

The .rx interface applies not just to pandas DataFrames, but to essentially any Python object, including an xarray DataArray. Here we load our population raster and perform some simple cleanup:

raw_ds = xr.open_dataarray(pathlib.Path('../../data/raster/gpw_v4_population_density_rev11_2010_2pt5_min.nc'))
cleaned_ds = raw_ds.where(raw_ds.values != raw_ds.nodatavals).sel(band=1)
cleaned_ds = cleaned_ds.rename({'x': 'longitude','y': 'latitude'})
cleaned_ds.name = 'population'
cleaned_ds = cleaned_ds.fillna(0)

One operation we could do on this raster is to collapse one of the two dimensions. For instance, we could view the mean population over latitude (averaged over longitude) or conversely the mean population over longitude (averaged over latitude). To select between these options, we will want a dropdown widget called collapsed_axis.

Hint

A dropdown widget in Panel can be made with a pn.widgets.Select object. The dropdown options are passed as a list of strings to the options argument.

collapsed_axis = ... # Declare a dropdown to select either 'latitude' or 'longitude' and display it

collapsed_axis = pn.widgets.Select(options=['latitude', 'longitude'], name='Collapsed dimension')
collapsed_axis

Now create a reactive xarray DataArray called rds, in the same way as the reactive DataFrame we created earlier.

Hint

As before, the reactive object is created by pn.rx(), but now on an xarray object instead of a pandas object.

rds = ... # A reactive DataArray

rds = pn.rx(cleaned_ds)

Plotting population averaged over either latitude or longitude#

Now we can use the xarray API to collapse either latitude or longitude by taking the mean. To do this, we can use the .mean() method of an xarray DataArray, which accepts a dim argument specifying the dimension over which to apply the mean. After collapsing the dimension specified by the widget, plot the population with a green curve.

Hint

First write and test a static version of your pipeline, where you supply 'latitude' or 'longitude' explicitly to the dim argument of the mean method and then call .hvplot to plot it while specifying color='green'. Then try passing your collapsed_axis widget instead of that fixed string.
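The static first step suggested in this hint might look like the sketch below. It uses a tiny hypothetical 2x3 grid (`pop`, with made-up values) in place of the real raster, and shows that the dimension named in dim disappears while the other one remains.

```python
import numpy as np
import xarray as xr

# Hypothetical 2 x 3 grid standing in for cleaned_ds
pop = xr.DataArray(
    np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]),
    dims=("latitude", "longitude"),
    coords={"latitude": [10.0, 20.0], "longitude": [0.0, 30.0, 60.0]},
    name="population",
)

# Collapsing latitude leaves one mean value per longitude
by_longitude = pop.mean(dim="latitude")

# Collapsing longitude leaves one mean value per latitude
by_latitude = pop.mean(dim="longitude")

print(by_longitude.dims, by_longitude.values)
print(by_latitude.dims, by_latitude.values)
```

Once the static call behaves as expected, the fixed string passed to dim is the only part that needs to be swapped for the widget.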

# Using `rds` plot the population as a green curve where the collapsed dimension is selected by the widget

rds = pn.rx(cleaned_ds)
collapsed_axis = pn.widgets.Select(options=['latitude', 'longitude'], name='Collapsed dimension')
rds.mean(dim=collapsed_axis).hvplot(color='green')

This web page was generated from a Jupyter notebook and not all interactivity will work on this website. Right click to download and run locally for full Python-backed interactivity.
