Linking Plots#

In this exercise we will link plots generated with hvplot from the earthquake data using HoloViews linked selections.

Loading the data as before#

import pathlib
import pandas as pd
import holoviews as hv # noqa

import hvplot.pandas # noqa: adds hvplot method to pandas objects
import hvplot.xarray # noqa: adds hvplot method to xarray objects

df = pd.read_parquet(pathlib.Path('../../data/earthquakes-projected.parq'))
df.index = df.index.tz_localize(None)  # to prevent error in comparison
df = df.reset_index() # treat time like any other column to allow selections on any dimension
most_severe = df[df.mag >= 7]
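
To see what the three preparation steps do without needing the parquet file, here is a minimal sketch on a small made-up DataFrame. The `demo` and `demo_severe` names and all values are hypothetical stand-ins; only the `time` index name and the `mag` column mirror what the cell above relies on.

```python
import pandas as pd

# Made-up stand-in for the earthquake data: a timezone-aware 'time' index
# and a few columns. The values are illustrative, not real measurements.
demo = pd.DataFrame(
    {
        "latitude": [10.0, -20.0, 35.0, 50.0],
        "longitude": [100.0, -70.0, 140.0, 175.0],
        "mag": [7.2, 8.1, 7.0, 6.5],
    },
    index=pd.date_range("2020-01-01", periods=4, freq="D", tz="UTC", name="time"),
)

# Drop the timezone so later comparisons against naive timestamps work.
demo.index = demo.index.tz_localize(None)

# Move 'time' out of the index so it behaves like any other column.
demo = demo.reset_index()

# Keep only the strongest events, as in the exercise.
demo_severe = demo[demo.mag >= 7]
print(demo_severe)
```

After these steps `time` is an ordinary timezone-naive column, which is what lets a selection apply to it in the same way as to latitude or longitude.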

The distribution of earthquakes over the Earth’s surface#

Over latitude#

So far we have seen linked histograms, but the same approach generalizes to any other collection of plot types. This time we shall use kind='points' to generate points plots instead of histograms.

First create a points plot called lat_points that plots the latitudes of the most_severe earthquakes over time. Customize it by using red plus-sign markers (+) and making the plot responsive with a height of 300 pixels. At the end of the cell, display this object.

Hint

The times of the earthquakes are in the 'time' column and the latitudes are in the 'latitude' column of most_severe. The point color is controlled by the color keyword argument, and 'red' is a valid color specification. The height is controlled by the height keyword argument, and marker='+' selects the plus-sign marker style.

lat_points = ...  # Use hvplot here to visualize latitude points from most_severe
lat_points  # Display it


lat_points = most_severe.hvplot(
    x='time', y='latitude', kind='points', color='red', marker='+',
    responsive=True, height=300)
lat_points

Combined with a longitude points plot#

Now make a corresponding points plot called lon_points that plots the longitudes of the most_severe earthquakes over time. Customize it by using blue cross markers (x), and make it responsive with the same height of 300 pixels as before. At the end of the cell, display this object in a layout with the previous lat_points plot, with the longitude plot on the left and the latitude plot on the right.

Hint

This plot is identical to the previous one except for the name of the handle, the use of the 'longitude' column, and the blue cross markers. To combine the plots, use the HoloViews + operator to create a layout from lon_points and the existing lat_points handle.

lon_points = ... # Use hvplot here to visualize longitude points from most_severe
# Display it to the left of lat_points

lon_points = most_severe.hvplot(
    x='time', y='longitude', kind='points', color='blue', marker='x',
    responsive=True, height=300)
lon_points + lat_points

We now have two points plots derived from the most_severe DataFrame, but they are not yet linked.

Linking the points plots#

Now we can use hv.link_selections to link these two points plots together. Create the same layout as before with the longitude points on the left and the latitude points on the right, but this time link them together.

Hint

You will need to make a linked selection instance with hv.link_selections.instance() and pass the points layout to that instance in order to link the plots.

# Display a longitude points plot on the left that is linked to a latitude points plot on the right

Combining the plots as well as the linked selection:

ls = hv.link_selections.instance()

lat_points = most_severe.hvplot(
    x='time', y='latitude', kind='points', color='red', marker='+',
    responsive=True, height=300)

lon_points = most_severe.hvplot(
    x='time', y='longitude', kind='points', color='blue', marker='x',
    responsive=True, height=300)

ls(lon_points + lat_points)

Use the box select tool to confirm that these plots are now linked. Note that you can reset the selection with the reset button in the Bokeh toolbar.

Analysing the filtered selection#

Now that we have linked plots, we can interactively select points in the visualization and then use that selection to filter the original DataFrame. After making a selection in the plot above, show a statistical summary of the points that have been selected.

Hint

The linked selection object has a .filter method that can filter your original DataFrame (most_severe). To compute statistics of a DataFrame, you can use the pandas .describe() method.

# Display a summary of the linked selection made in the plot above using pandas .describe()

Assume the handle to the linked selection is called ls:

ls.filter(most_severe).describe()
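
Because the result depends on an interactive selection, it can help to see the same kind of summary computed from an explicit filter. The following is a rough pandas-only sketch, not what `ls.filter` does internally: the `sample` frame, its values, and the box bounds are all made up for illustration.

```python
import pandas as pd

# Made-up rows standing in for most_severe; values are illustrative only.
sample = pd.DataFrame({
    "time": pd.to_datetime(["2001-01-01", "2004-06-01", "2008-03-15", "2012-09-30"]),
    "latitude": [-10.0, 5.0, 40.0, -35.0],
    "mag": [7.1, 7.8, 7.3, 8.0],
})

# Hypothetical bounds, as if a box had been drawn on the latitude plot.
t_min, t_max = pd.Timestamp("2003-01-01"), pd.Timestamp("2013-01-01")
lat_min, lat_max = -40.0, 10.0

# Keep the rows that fall inside the box on both axes.
inside = sample["time"].between(t_min, t_max) & sample["latitude"].between(lat_min, lat_max)
selected = sample[inside]

# Summarize the numeric columns of the selected rows.
summary = selected[["latitude", "mag"]].describe()
print(summary)
```

The summary covers only the rows inside the box, so its count row tells you how many points the selection captured.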

Extra credit: Points vs. Scatter#

If you change kind='points' to kind='scatter' on one or both plots above, the plots will look exactly the same, so you may wonder why two such similar plot types exist. The difference appears when you make a selection (try it both ways!). We used a points plot above because latitude and longitude do not depend on time, and a scatter plot both requires and conveys such a dependent relationship.

A box selection on a scatter plot actually selects a region of the x axis, because the selection is applied only to the key dimensions (independent variables), not to any dependent variables. The same selection on a points plot applies to both latitude/longitude and time, giving a box-shaped selection rather than a span along the x axis. Differences in behavior like this are why it is important to be explicit about how you assume your data dimensions are related.
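
The two selection behaviors can be mimicked in plain pandas. This is only a sketch of the behavior described here, not HoloViews' implementation; the function names, the `events` frame, and the bounds are hypothetical.

```python
import pandas as pd

# Made-up rows for illustration only; not taken from the earthquake dataset.
events = pd.DataFrame({
    "time": pd.to_datetime(["2001-01-01", "2004-06-01", "2008-03-15", "2012-09-30"]),
    "latitude": [-10.0, 5.0, 40.0, -35.0],
})


def scatter_like_selection(df, t0, t1):
    """Constrain only the independent variable (time); latitude is ignored."""
    return df[df["time"].between(t0, t1)]


def points_like_selection(df, t0, t1, lat0, lat1):
    """Constrain both dimensions, giving a true box."""
    return df[df["time"].between(t0, t1) & df["latitude"].between(lat0, lat1)]


# The same drawn box, interpreted in the two ways.
t0, t1 = pd.Timestamp("2003-01-01"), pd.Timestamp("2013-01-01")
span = scatter_like_selection(events, t0, t1)
box = points_like_selection(events, t0, t1, -40.0, 10.0)

print(len(span), len(box))  # prints: 3 2
```

The scatter-like version keeps the row at latitude 40.0 even though it lies outside the drawn box vertically, which is the effect you see when the selection becomes a span over the x axis.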

When in doubt, use a points plot unless you really do want to claim that (or investigate whether) y depends on x, as a scatter plot assumes. For example, could you swap x and y and still have a meaningful plot? If not, it is a candidate for a scatter plot; otherwise, use a points plot.
