Custom Interactivity#
Using hvPlot allows you to generate a number of different types of plot quickly from a standard API by building HoloViews objects, as discussed in the previous notebook. These objects are rendered with Bokeh, which offers a number of standard ways to interact with your plot, such as panning and zooming tools.
Many other modes of interactivity are possible when building an exploratory visualization (such as a dashboard) and these forms of interactivity cannot be achieved using hvPlot alone.
In this notebook, we will drop down to the HoloViews level of representation to build a visualization directly that consists of linked plots that update when you interactively select a particular earthquake with the mouse. The goal is to show how more sophisticated forms of interactivity can be built when needed, in a way that’s fully compatible with all the examples shown in earlier sections.
First let us load our initial imports:
import pathlib
import numpy as np
import pandas as pd
import hvplot.pandas # noqa
from holoviews.element import tiles
And load the cleaned data, already projected to Web Mercator as before, then filter for magnitude >= 7:
%%time
df = pd.read_parquet(pathlib.Path('../data/earthquakes-projected.parq'))
most_severe = df[df.mag >= 7]
CPU times: user 933 ms, sys: 108 ms, total: 1.04 s
Wall time: 565 ms
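The easting and northing columns in this file come from the earlier projection step. As a rough sketch of what such a projection computes, assuming the standard spherical Web Mercator formulas with a radius of 6378137 m (the helper name lon_lat_to_web_mercator is hypothetical and is not part of hvPlot or HoloViews):

```python
import math

EARTH_RADIUS_M = 6378137.0  # sphere radius used by Web Mercator (assumed)

def lon_lat_to_web_mercator(lon_deg, lat_deg):
    """Convert longitude/latitude in degrees to easting/northing in meters."""
    easting = EARTH_RADIUS_M * math.radians(lon_deg)
    northing = EARTH_RADIUS_M * math.log(
        math.tan(math.pi / 4 + math.radians(lat_deg) / 2)
    )
    return easting, northing

# Coordinates of the Tonga event that appears as the first row of most_severe
easting, northing = lon_lat_to_web_mercator(-174.248, -16.925)
print(round(easting), round(northing))
```

The result should agree with the easting and northing stored for that row, up to the rounding used when the table is displayed.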
Towards the end of the previous notebook we generated a scatter plot of earthquakes across the earth that had a magnitude >= 7, overlaid on top of a map tile source:
high_mag_quakes = most_severe.hvplot.points(x='easting', y='northing', c='mag',
title='Earthquakes with magnitude >= 7')
esri = tiles.EsriImagery().redim(x='easting', y='northing')
esri * high_mag_quakes
And saw how this object is a HoloViews Points object:
print(high_mag_quakes)
:Points [easting,northing] (mag)
This object is an example of a HoloViews Element, which is an object that can display itself. These elements are thin wrappers around your data and the raw input data is always available on the .data attribute. For instance, we can look at the head of the most_severe DataFrame as follows:
high_mag_quakes.data.head()
time | depth | depthError | dmin | gap | horizontalError | id | latitude | locationSource | longitude | mag | ... | magType | net | nst | place | rms | status | type | updated | easting | northing |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
2000-01-08 16:47:20.580000+00:00 | 183.4 | NaN | NaN | NaN | NaN | usp0009kx3 | -16.925 | us | -174.248 | 7.2 | ... | mwc | us | NaN | 117 km SSW of Hihifo, Tonga | 1.25 | reviewed | earthquake | 2022-04-29T19:41:48.270Z | -1.939720e+07 | -1.912096e+06 |
2000-02-25 01:43:58.640000+00:00 | 33.0 | NaN | NaN | NaN | NaN | usp0009nxg | -19.528 | us | 173.818 | 7.1 | ... | mwc | us | NaN | Vanuatu region | 1.20 | reviewed | earthquake | 2024-01-17T20:46:46.644Z | 1.934933e+07 | -2.217199e+06 |
2000-03-28 11:00:22.510000+00:00 | 126.5 | NaN | NaN | NaN | NaN | usp0009qb4 | 22.338 | us | 143.730 | 7.6 | ... | mwc | us | NaN | Volcano Islands, Japan region | 1.22 | reviewed | earthquake | 2022-04-29T18:35:42.761Z | 1.599995e+07 | 2.552155e+06 |
2000-04-23 09:27:23.320000+00:00 | 608.5 | NaN | NaN | NaN | NaN | usp0009rrc | -28.307 | us | -62.990 | 7.0 | ... | mwb | us | NaN | 22 km NW of Añatuya, Argentina | 0.89 | reviewed | earthquake | 2022-04-29T19:37:41.054Z | -7.012015e+06 | -3.287735e+06 |
2000-05-12 18:43:18.120000+00:00 | 225.0 | 4.6 | NaN | NaN | NaN | usp0009suu | -23.548 | us | -66.452 | 7.2 | ... | mwc | us | NaN | 75 km N of San Antonio de los Cobres, Argentina | 0.86 | reviewed | earthquake | 2022-04-29T19:16:27.106Z | -7.397403e+06 | -2.698426e+06 |
5 rows × 23 columns
We will now learn a little more about HoloViews elements, including how to build them up from scratch so that we can control every aspect of them.
An Introduction to HoloViews Elements#
HoloViews elements are the atomic, visualizable components that can be rendered by a plotting library such as Bokeh. We don’t actually need to use hvPlot to create these element objects: we can create them directly by importing HoloViews (and loading the extension if we have not loaded hvPlot):
import holoviews as hv
hv.extension("bokeh") # Optional here as we have already loaded hvplot.pandas
Now we can create our own example of a Points element. In the next cell we plot 100 points with independent normal distributions in the x and y directions:
xs = np.random.randn(100)
ys = np.random.randn(100)
hv.Points((xs, ys))
Note that the axis labels are ‘x’ and ‘y’, the default dimensions for this element type. We can use a different set of dimensions along the x- and y-axis (say ‘weight’ and ‘height’) and we can also associate additional fitness information with each point if we wish:
xs = np.random.randn(100)
ys = np.random.randn(100)
fitness = np.random.randn(100)
height_v_weight = hv.Points((xs, ys, fitness), ['weight', 'height'], 'fitness')
height_v_weight
Now we can look at the printed representation of this object:
print(height_v_weight)
:Points [weight,height] (fitness)
Here the printed representation shows the key dimensions that we specified in square brackets as [weight,height] and the additional value dimension fitness in parentheses as (fitness). The key dimensions map to the axes and the value dimensions can be visually represented by other visual attributes as we shall see shortly.
For more information on HoloViews dimensions, see this user guide.
Exercise#
Visit the HoloViews reference gallery and browse the available set of elements. Pick an element type and try running one of the self-contained examples in the following cell.
Setting Visual Options#
The two Points elements above look quite different from the one returned by hvPlot showing the earthquake positions. This is because hvPlot makes use of the HoloViews options system to customize the visual representation of these element objects.
Let us color the height_v_weight scatter by the fitness value and use a larger point size:
height_v_weight.opts(color='fitness', size=8, colorbar=True, aspect='square')
Exercise#
Copy the line above into the next cell and try changing the points to ‘blue’ or ‘green’ or another dimension of the data such as ‘height’ or ‘weight’.
Are the results what you expect?
The help system#
You can learn more about the .opts method and the HoloViews options system in the corresponding user guide. To easily learn about the available options from inside a notebook, you can use hv.help and inspect the ‘Style Options’.
# Commented as there is a lot of help output!
# hv.help(hv.Scatter)
At this point, we have some insight into the sort of HoloViews object hvPlot is building behind the scenes for our earthquake example:
esri * hv.Points(most_severe, ['easting', 'northing'], 'mag').opts(color='mag', size=8, aspect='equal')
Exercise#
Try using hv.help to inspect the options available for different element types such as the Points element used above. Copy the line above into the cell below and pick a Points option that makes sense to you and try using it in the .opts method.
(Hint)
If you can’t decide on an option to pick, a good choice is marker. For instance, try marker='+' or marker='d'.
HoloViews uses matplotlib’s conventions for specifying the various marker types. Try finding out which ones are supported by Bokeh.
Custom interactivity for Elements#
When rasterization of the population density data via hvplot was first introduced, we saw that the HoloViews object returned was not an element but a DynamicMap.
A DynamicMap enables custom interactivity beyond the Bokeh defaults by dynamically generating elements that get displayed and updated as the plot is interacted with.
There is a counterpart to the DynamicMap that does not require a live Python server to be running, called the HoloMap. The HoloMap container will not be covered in this tutorial, but you can learn more about it in the containers user guide.
Now let us build a very simple DynamicMap that is driven by a linked stream (specifically a PointerXY stream) that represents the position of the cursor over the plot:
from holoviews import streams
ellipse = hv.Ellipse(0, 0, 1)
pointer = streams.PointerXY(x=0, y=0) # x=0 and y=0 are the initialized values
def crosshair(x, y):
    return hv.HLine(y) * hv.VLine(x)
ellipse * hv.DynamicMap(crosshair, streams=[pointer])
Try moving your mouse over the plot and you should see the crosshair follow your mouse position.
The core concepts here are:
The plot shows an overlay built with the * operator introduced in the previous notebook.
There is a callback that returns this overlay, which is built according to the supplied x and y arguments. A DynamicMap always contains a callback that returns a HoloViews object such as an Element or Overlay.
These x and y arguments are supplied by the PointerXY stream, which reflects the position of the mouse on the plot.
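The relationship between a stream and a callback can be mimicked in a few lines of plain Python. This is only a conceptual stand-in and not how HoloViews is implemented; ToyStream and ToyDynamicMap are hypothetical names invented for this sketch:

```python
class ToyStream:
    """Holds the latest parameter values, like a cursor position."""

    def __init__(self, **initial):
        self.contents = dict(initial)
        self.subscribers = []

    def event(self, **updates):
        # Record the new values, then notify everything that depends on them
        self.contents.update(updates)
        for subscriber in self.subscribers:
            subscriber()


class ToyDynamicMap:
    """Re-runs a callback with the stream contents whenever the stream fires."""

    def __init__(self, callback, stream):
        self.callback = callback
        self.stream = stream
        self.current = callback(**stream.contents)
        stream.subscribers.append(self.refresh)

    def refresh(self):
        self.current = self.callback(**self.stream.contents)


def crosshair(x, y):
    # Stand-in for hv.HLine(y) * hv.VLine(x)
    return {'hline': y, 'vline': x}


pointer = ToyStream(x=0, y=0)
dmap = ToyDynamicMap(crosshair, pointer)
pointer.event(x=2.5, y=-1.0)  # simulate moving the mouse
print(dmap.current)
```

The point of the sketch is the direction of data flow: the stream owns the values, and the map calls the callback again each time those values change.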
Exercise#
Look up the Ellipse, HLine, and VLine elements in the HoloViews reference guide and see if the definitions of these elements align with your initial intuitions.
Exercise (additional)#
If you have time, try running one of the examples in the ‘Streams’ section of the HoloViews reference guide in the cell below. All the examples in the reference guide should be relatively short and self-contained.
Selecting a particular earthquake with the mouse#
Now we only need two more concepts before we can set up the appropriate mechanism to select a particular earthquake on the hvPlot-generated Points plot we started with.
First, we can attach a stream to an existing HoloViews element such as the earthquake distribution generated with hvplot:
selection_stream = streams.Selection1D(source=high_mag_quakes)
Next we need to enable the ‘tap’ tool on our Points element to instruct Bokeh to enable the desired selection mechanism in the browser.
high_mag_quakes.opts(tools=['tap'])
The Bokeh default alpha of points which are unselected is going to be too low when we overlay these points on a tile source. We can use the HoloViews options system to pick a better default as follows:
hv.opts.defaults(hv.opts.Points(nonselection_alpha=0.4))
The tap tool is in the toolbar with the icon showing the concentric circles and plus symbol. If you enable this tool, you should be able to pick individual earthquakes above by tapping on them.
Now we can make a DynamicMap that uses the stream we defined to show the index of the earthquake selected via the hv.Text element:
def labelled_callback(index):
    if len(index) == 0:
        return hv.Text(x=0, y=0, text='')
    first_index = index[0]  # Pick only the first one if multiple are selected
    row = most_severe.iloc[first_index]
    text = '%d : %s' % (first_index, row.place)
    return hv.Text(x=row.easting, y=row.northing, text=text).opts(color='white')
labeller = hv.DynamicMap(labelled_callback, streams=[selection_stream])
This labeller receives the index argument from the Selection1D stream, which corresponds to the row of the original dataframe (most_severe) that was selected. This lets us present the index and place value using hv.Text, which we then position at the corresponding easting and northing to label the chosen earthquake.
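The callback relies on positional indexing. The sketch below assumes, as the text above describes, that the index supplied by the stream holds row positions, and it uses a tiny made-up dataframe (its values are hypothetical) to show how the label text is assembled. The helper name label_for is illustrative:

```python
import pandas as pd

# Hypothetical stand-in for most_severe, indexed by time like the real data
quakes = pd.DataFrame(
    {'place': ['Vanuatu region', 'Volcano Islands, Japan region'],
     'mag': [7.1, 7.6]},
    index=pd.to_datetime(['2000-02-25', '2000-03-28']),
)

def label_for(index):
    """Return the label text for a selection, or '' when nothing is selected."""
    if len(index) == 0:
        return ''
    first_index = index[0]            # a position, not a timestamp label
    row = quakes.iloc[first_index]    # .iloc looks rows up by position
    return '%d : %s' % (first_index, row.place)

print(repr(label_for([])))
print(label_for([1, 0]))
```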
Finally, we overlay this labeller DynamicMap over the original plot. Now by using the tap tool you can see the index number of an earthquake followed by the assigned place name:
(esri * high_mag_quakes * labeller).opts(hv.opts.Points(tools=['tap', 'hover']))
Exercise#
Pick an earthquake point above and using the displayed index, display the corresponding row of the most_severe dataframe using the .iloc method in the following cell.
Building a linked earthquake visualizer#
Now we will build a visualization that achieves the following:
The user can select an earthquake with magnitude >= 7 using the tap tool in the manner illustrated in the last section.
In addition to the existing label, we will add concentric circles to further highlight the selected earthquake location.
All earthquakes within 0.5 degrees of latitude and longitude of the selected earthquake (~50km) will then be used to supply data for two linked plots:
A histogram showing the distribution of magnitudes in the selected area.
A timeseries scatter plot showing the magnitudes of earthquakes over time in the selected area.
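The "~50km" figure can be checked with a little arithmetic. This sketch assumes a spherical Earth of radius 6371 km; the helper name box_size_km is illustrative. Note that a degree of longitude covers less ground away from the equator, so the selected area narrows east to west at high latitudes:

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius, spherical approximation (assumed)
KM_PER_DEGREE = 2 * math.pi * EARTH_RADIUS_KM / 360

def box_size_km(lat_deg, degrees=0.5):
    """Approximate north-south and east-west extent of a span given in degrees."""
    north_south = degrees * KM_PER_DEGREE
    east_west = degrees * KM_PER_DEGREE * math.cos(math.radians(lat_deg))
    return north_south, east_west

for lat in (0, 35, 60):
    ns, ew = box_size_km(lat)
    print('lat %2d: %.1f km north-south, %.1f km east-west' % (lat, ns, ew))
```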
The first step is to generate a concentric-circle marker using a similar approach to the labeller above. We can write a function that uses Ellipse to mark a particular earthquake and pass it to a DynamicMap:
def mark_earthquake(index):
    if len(index) == 0:
        return hv.Overlay([])
    first_index = index[0]  # Pick only the first one if multiple are selected
    row = most_severe.iloc[first_index]
    return (hv.Ellipse(row.easting, row.northing, 1.5e6) *
            hv.Ellipse(row.easting, row.northing, 3e6)).opts(
        hv.opts.Ellipse(color='white', alpha=0.5)
    )
quake_marker = hv.DynamicMap(mark_earthquake, streams=[selection_stream])
Now we can test this component by building an overlay of the ESRI tile source, the >= 7 magnitude points and quake_marker:
esri * high_mag_quakes.opts(tools=['tap']) * quake_marker
Note that you may need to zoom in to your selected earthquake to see the localized, lower magnitude earthquakes around it.
Filtering earthquakes by location#
We wish to analyze the earthquakes that occur around a particular latitude and longitude. To do this we will define a function that, given a latitude and longitude, returns the rows of a suitable dataframe that correspond to earthquakes inside a box 0.5 degrees wide centered on that position (that is, within 0.25 degrees in each direction):
def earthquakes_around_point(df, lat, lon, degrees_dist=0.5):
    half_dist = degrees_dist / 2.0
    return df[((df['latitude'] - lat).abs() < half_dist)
              & ((df['longitude'] - lon).abs() < half_dist)]
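A quick check on made-up coordinates (hypothetical values, not taken from the dataset) shows which rows survive the filter. Because the function halves degrees_dist, a row is kept only when it lies less than 0.25 degrees from the point in both latitude and longitude:

```python
import pandas as pd

def earthquakes_around_point(df, lat, lon, degrees_dist=0.5):
    half_dist = degrees_dist / 2.0
    return df[((df['latitude'] - lat).abs() < half_dist)
              & ((df['longitude'] - lon).abs() < half_dist)]

# Four synthetic events around the point (lat=10, lon=20)
events = pd.DataFrame({
    'latitude':  [10.0, 10.2, 10.3, 10.1],
    'longitude': [20.0, 19.9, 20.0, 20.4],
})

nearby = earthquakes_around_point(events, lat=10.0, lon=20.0)
print(nearby.index.tolist())
```

The third event is dropped because it is 0.3 degrees away in latitude, and the fourth because it is 0.4 degrees away in longitude.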
As it can be slow to filter our dataframes in this way, we can define the following function that can cache the result of filtering df (containing all earthquakes) based on an index pulled from the most_severe dataframe:
def index_to_selection(indices, cache={}):
    # The mutable default argument is deliberate: it acts as a simple cache
    if not indices:
        return most_severe.iloc[[]]
    index = indices[0]  # Pick only the first one if multiple are selected
    if index in cache:
        return cache[index]
    row = most_severe.iloc[index]
    selected_df = earthquakes_around_point(df, row.latitude, row.longitude)
    cache[index] = selected_df
    return selected_df
The caching will be useful as we know both of our planned linked plots (i.e., the histogram and the scatter over time) make use of the same earthquake selection once a particular index is supplied from a user selection. This particular caching strategy is rather awkward (and leaks memory!) but it is simple and will serve for the current example. A better approach to caching will be presented in the Advanced Dashboards section of the tutorial.
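The cache={} default works as a cache because Python evaluates a default argument once, when the function is defined, so the same dictionary is shared by every call. A minimal sketch of that behavior, using a hypothetical expensive_lookup function as a stand-in for the filtering step:

```python
calls = []  # records every time the slow path actually runs

def expensive_lookup(index):
    calls.append(index)
    return 'selection for %d' % index

def cached_lookup(index, cache={}):
    # The default dict is created once and persists between calls
    if index in cache:
        return cache[index]
    cache[index] = expensive_lookup(index)
    return cache[index]

first = cached_lookup(3)
second = cached_lookup(3)   # served from the cache, no second slow call
third = cached_lookup(4)
print(calls)
```

The dictionary only ever grows, which is the memory leak mentioned above.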
Exercise#
Test the index_to_selection function above for the index you picked in the previous exercise. Note that the stream supplied a list of indices and that the function above only uses the first value given in that list. Do the selected rows look correct?
Exercise#
Convince yourself that the selected earthquakes are within 0.5° of each other in both latitude and longitude.
For a given chosen index, you can see the distance difference using the following code:
chosen = 235
delta_long = index_to_selection([chosen]).longitude.max() - index_to_selection([chosen]).longitude.min()
delta_lat = index_to_selection([chosen]).latitude.max() - index_to_selection([chosen]).latitude.min()
print("Difference in longitude: %s" % delta_long)
print("Difference in latitude: %s" % delta_lat)
Linked plots#
So far we have overlaid the display updates on top of the existing spatial distribution of earthquakes. However, there is no requirement that the data is overlaid and we might want to simply attach an entirely new, derived plot that dynamically updates to the side.
Using the same principles as we have already seen, we can define a DynamicMap that returns Histogram distributions of earthquake magnitude:
def histogram_callback(index):
    title = 'Distribution of all magnitudes within half a degree of selection'
    selected_df = index_to_selection(index)
    return selected_df.hvplot.hist(y='mag', bin_range=(0,10), bins=20, color='red', title=title)
histogram = hv.DynamicMap(histogram_callback, streams=[selection_stream])
The only real difference in the approach here is that we can still use .hvplot to generate our elements instead of declaring the HoloViews elements explicitly. In this example, .hvplot.hist is used.
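The bin_range=(0,10) and bins=20 options used in the histogram callback imply 20 equal-width bins of 0.5 magnitude units. The sketch below illustrates that binning with numpy.histogram on made-up magnitudes; it assumes equal-width bins and makes no claim about how hvPlot computes the histogram internally:

```python
import numpy as np

magnitudes = np.array([2.1, 2.4, 2.6, 4.9, 7.2])  # hypothetical values

counts, edges = np.histogram(magnitudes, bins=20, range=(0, 10))

bin_width = edges[1] - edges[0]
print(bin_width)             # width of each bin in magnitude units
print(counts[4], counts[5])  # counts in [2.0, 2.5) and [2.5, 3.0)
```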
The exact same principles can be used to build the scatter callback and temporal_distribution DynamicMap:
def scatter_callback(index):
    title = 'Temporal distribution of all magnitudes within half a degree of selection '
    selected_df = index_to_selection(index)
    return selected_df.hvplot.scatter('time', 'mag', color='green', title=title)
temporal_distribution = hv.DynamicMap(scatter_callback, streams=[selection_stream])
Lastly, let us define a DynamicMap that draws a VLine to mark the time at which the selected earthquake occurs so we can see which tremors may have been aftershocks immediately after that major earthquake occurred:
def vline_callback(index):
    if not index:
        return hv.VLine(0).opts(alpha=0)
    row = most_severe.iloc[index[0]]
    return hv.VLine(row.name).opts(line_width=2, color='black')
temporal_vline = hv.DynamicMap(vline_callback, streams=[selection_stream])
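Using row.name works here because a row extracted with .iloc is a Series whose name is that row's index label, which for this data is the event time. A small sketch with hypothetical values:

```python
import pandas as pd

# Hypothetical stand-in for most_severe, indexed by event time
quakes = pd.DataFrame(
    {'mag': [7.2, 7.1]},
    index=pd.DatetimeIndex(['2000-01-08 16:47:20', '2000-02-25 01:43:58'],
                           name='time'),
)

row = quakes.iloc[1]
selected_time = row.name   # the index label of the selected row
print(selected_time)
```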
We now have all the pieces we need to build an interactive, linked visualization of earthquake data.
Exercise#
Test the histogram_callback and scatter_callback callback functions in the following cell by supplying your chosen index, remembering that these functions require a list argument.
Putting it together#
Now we can combine the components we have already built as follows to create a dynamically updating plot together with the associated, linked histogram and temporal scatter plot:
((esri * high_mag_quakes.opts(tools=['tap']) * labeller * quake_marker)
+ histogram + temporal_distribution * temporal_vline).cols(1)
We now have a custom interactive visualization that builds on the output of hvplot by making use of the underlying HoloViews objects that it generates.
Conclusion#
When exploring data it can be convenient to use the .plot API to quickly visualize a particular dataset. By calling .hvplot to generate different plots over the course of a session and then linking such plots together, it is possible to gradually build up a mental model of how a particular dataset is structured.
In the workflow presented here, building such custom interaction is relatively quick and easy and does not involve throwing away prior code used to generate simpler plots. In the spirit of ‘short cuts not dead ends’, we can use the HoloViews-object output of hvplot that we used in our initial exploration to build rich visualizations with custom interaction to explore our data at a deeper level.
These interactive visualizations not only allow for custom interactions beyond the scope of hvplot alone, but they can display visual annotations not offered by the .plot API. In particular, we can overlay our data on top of tile sources, generate interactive textual annotations, draw shapes such as circles, draw horizontal and vertical marker lines and much more. Using HoloViews you can build visualizations that allow you to directly interact with your data in a useful and intuitive manner.