Composing Plots#

So far we have generated plots using hvPlot, but we haven’t discussed what exactly these plots are and how they differ from the output of other libraries offering the .plot API. Here we will see that the .hvplot() output is actually a rich, composable and compositional object that can be used in many different ways, not just as an immediate plot. Specifically, hvPlot generates HoloViews objects rendered using the Bokeh backend (by default) so that they support interactive hovering, panning, and zooming.

In this notebook, we’ll examine the output of hvPlot calls to take a look at individual HoloViews objects. Then we will see how these “elements” offer us powerful ways of combining and composing layered visualizations.

Read in the data#

We’ll read in the data as before, and also reindex by time so that we can more easily do resampling.

import pathlib
import pandas as pd
import numpy as np
%%time
df = pd.read_parquet(pathlib.Path('../data/earthquakes-projected.parq'))
df = df.set_index(df.time)
CPU times: user 3.1 s, sys: 320 ms, total: 3.42 s
Wall time: 1.83 s

Composing plots#

In this section we’ll start looking at how we can group plots to gain a deeper understanding of the data. We’ll start by resampling the data to explore patterns in number and magnitude of earthquakes over time.

import hvplot.pandas  # noqa: adds hvplot method to pandas objects
weekly_count = df.id.resample('1W').count().rename('count')
weekly_count_plot = weekly_count.hvplot(title='weekly count')

The first thing to note is that with hvplot, it is common to grab a handle on the returned output, which isn’t necessarily displayed right away. The value returned from the Matplotlib based .plot API of Pandas is an axis object that is typically discarded, with plotting display occurring only as a side effect if %matplotlib inline is loaded. With hvPlot, there are no side effects; the plot is displayed only if it is returned as the last value in the Jupyter cell (and thus no plot is visible in the above cell’s output). hvPlot objects thus work like any other normal Python object; just as you don’t expect x=2 to display anything, x=df.hvplot() will not display anything; both simply assign a value to x.

Once you have a handle on it, however, you can plot it if you wish to:

weekly_count_plot

But we can also do other things, such as look at its textual representation by printing it:

print(weekly_count_plot)
:Curve   [time]   (count)

:Curve   [time]   (count)” is HoloViews notation for saying that this object is a Curve element with time as a key dimension (kdim, in square brackets) and count as a value dimension (vdim, in parentheses). In other contexts, key dimensions are also called index dimensions or independent variables, while value dimensions are also called dependent variables. HoloViews will be discussed in more detail in a later section.

Now let’s do a similar resampling, but for magnitude:

weekly_mean_magnitude = df.mag.resample('1W').mean()
weekly_mean_magnitude_plot = weekly_mean_magnitude.hvplot(title='weekly mean magnitude')
weekly_mean_magnitude_plot
print(weekly_mean_magnitude_plot)
:Curve   [time]   (mag)

This plot has time on the x axis like the other, but the value dimension is magnitude rather than count. HoloViews objects can be composed into an overlay using a * symbol, with a legend generated to distinguish them:

weekly_mean_magnitude_plot * weekly_count_plot

The two timeseries have quite different value ranges, making it very hard to see the fluctuations in magnitude with an overlay like this. It is possible to zoom in to see the detail in mag, but a more useful form of composition here is a layout of separate plots, using a + symbol to compose HoloViews objects side-by-side with axes linked for any shared dimensions:

plots = (weekly_mean_magnitude_plot + weekly_count_plot).cols(1)
plots

Try zooming in and out (including on the axes) to explore the linking between the plots above.

Interestingly, there are three clear peaks in the monthly counts, and two of them correspond to sudden dips in the mean magnitude, while the third corresponds to a peak in the mean magnitude.

To stop axis linking, you can use plots.opts(shared_axes=False). This will make all axes independent of each other.

As a final note, it is worth mentioning that while the * and + operators are quick and convenient way to compose HoloViews elements, if you need more detailed control over the layout of your plots, you can use Panel containers instead.

Adding a third dimension#

Now let’s filter the earthquakes to only include the really high intensity ones. Using the pandas .plot() API, we can add extra dimensions to the visualization by using color to represent magnitude in addition to the x and y locations:

most_severe = df[df.mag >= 7]

%matplotlib inline
most_severe.plot.scatter(x='longitude', y='latitude', c='mag');
../_images/e83c05d4d5b365dcceeb20d6e0b1435f7f36217a2ee399fdab722f0253b18f1d.png

Here is the analogous version using hvplot where we grab the handle high_mag_scatter so we can inspect the return value:

high_mag_scatter = most_severe.hvplot.scatter(x='longitude', y='latitude', c='mag')
high_mag_scatter

As always, this return value is actually a HoloViews element which has a printed representation:

print(high_mag_scatter)
:Scatter   [longitude]   (latitude,mag)

This representation reveals that even though the scatterplot here looks ok, it’s actually flawed in a subtle way. Earthquakes occur at a particular 2D longitude, latitude location on the earth’s surface and have a measured magnitude, i.e., they are 2D points with some associated value, and should have a representation like “[longitude,latitude] (mag). That is, there should be two key dimensions (independent variables), with the magnitude being a dependent variable (“value dimension”). The problem here is that the Pandas .scatter call makes no distinction between these types of dimensions, which will confuse HoloViews when it is doing automatic linking and other operations that depend on the interpretation of the data. For this purpose, HoloViews provides a separate .hvplot.points call that has the same visual representation but the correct semantics:

high_mag_points = most_severe.hvplot.points(x='longitude', y='latitude', c='mag')
high_mag_points
print(high_mag_points)
:Points   [longitude,latitude]   (mag)

This object now appropriately represents that latitude and longitude are the key dimensions, with one value dimension (the magnitude).

We will learn more about HoloViews objects in the next notebook. But first, let’s adjust the options to create a better plot. First we’ll use colorcet to get a colormap that will have a good contrast with blue oceans when we show earthquakes on a map; the default blue colormap above would get lost against the seas! We can choose one from the website and use the HoloViews/Bokeh-based colorcet plotting module to make sure it looks good.

import colorcet as cc
from colorcet.plotting import swatch

swatch('CET_L4')

We’ll reverse the colors to align dark reds with higher magnitude earthquakes for better contrast.

mag_cmap = cc.CET_L4[::-1]
swatch("CET_L4_r", mag_cmap)

In addition to fixing the colormap, we will add some additional columns to the hover text, and add a title.

high_mag_points = most_severe.hvplot.points(
    x='longitude', y='latitude', c='mag', hover_cols=['place', 'time'],
    cmap=mag_cmap,  title='Earthquakes with magnitude >= 7')

high_mag_points

When you hover over the points you’ll see the place and time of the earthquake in addition to the magnitude and lat/lon. What’s displayed by default corresponds to the dimensions that HoloViews is keeping track of, even though not all have been mapped to visible features of the plot:

print(high_mag_points)
:Points   [longitude,latitude]   (mag,place,time)

Exercise#

Use the colorcet plotting function swatches(group='linear') to choose a different colormap and create a plot with it.

from colorcet.plotting import swatches
swatches(group='linear')

then

most_severe.hvplot.points(
    x='longitude', y='latitude', c='mag', hover_cols=['place', 'time'],
    cmap="bgy",  title='Earthquakes with magnitude >= 7')

Overlay with a tiled map#

The “CET_L4” colormap works well here, and we can kind of see the outlines of the continents, but the visualization would be much easier to parse if we added a base map underneath. To do this, we’ll import a tile element from HoloViews, namely the EsriImagery tiles from ESRI using the Web Mercator projection:

from holoviews.element.tiles import EsriImagery
EsriImagery()

Note that when you zoom, the map becomes more and more detailed, downloading new tiles dynamically as necessary.

In order to overlay on this basemap, we need to project our earthquakes to the Web Mercator projection system used by Bokeh. This dataset already includes columns for Web Mercator easting (meters East of Greenwich) and northing (meters north of the equator) columns calculated from longitude and latitude using the HoloViews hv.util.transform.lon_lat_to_easting_northing function, so we’ll use those and overlay our points on top of the EsriImagery tile source by using the HoloViews * operator:

EsriImagery() * most_severe.hvplot.points(
    x='easting', y='northing', c='mag', hover_cols=['place', 'time'], 
    cmap=mag_cmap, title='Earthquakes with magnitude >= 7', line_color='black')

Actually, for this special but common case of overlaying data on geographic tiles, hvPlot lets you simply specify tiles='EsriImagery' as a string in the hvplot.points call instead of explicitly overlaying a tile source with *:

most_severe.hvplot.points(
    x='easting', y='northing', c='mag', hover_cols=['place', 'time'], 
    cmap=mag_cmap, title='Earthquakes with magnitude >= 7', tiles='EsriImagery',
    line_color='black')

Note that the Web Mercator projection is only one of many possible projections used when working with geospatial data. If you need to work with these different projections, you can use the GeoViews extension to HoloViews that makes elements aware of the projection they are defined in and automatically projects into whatever coordinates are needed for display.

Exercise#

Import and use different tiles.

(Hint)

Try EsriImagery.

Overlay with a raster#

That base map is helpful for orienting ourselves, but it isn’t really adding much new information. We might instead like to overlay the earthquakes on other data, such as a map of global population. We’ll start by reading in a raster of global population, to see how the events might affect people. We’ll use xarray to load this multidimensional raster data file, as such formats are not handled well by Pandas. (Luckily, hvPlot supports xarray just as well as Pandas.)

import xarray as xr
import hvplot.xarray  # noqa: adds hvplot method to xarray objects
ds = xr.open_dataarray(pathlib.Path('../data/raster/gpw_v4_population_density_rev11_2010_2pt5_min.nc'))
ds
<xarray.DataArray (band: 1, y: 4320, x: 8640)>
[37324800 values with dtype=float32]
Coordinates:
  * band     (band) int64 1
  * y        (y) float64 89.98 89.94 89.9 89.85 ... -89.85 -89.9 -89.94 -89.98
  * x        (x) float64 -180.0 -179.9 -179.9 -179.9 ... 179.9 179.9 179.9 180.0
Attributes:
    transform:   [ 4.16666667e-02  0.00000000e+00 -1.80000000e+02  0.00000000...
    crs:         EPSG:4326
    res:         [0.04166667 0.04166667]
    is_tiled:    0
    nodatavals:  -3.4028230607370965e+38
cleaned_ds = ds.where(ds.values != ds.nodatavals).sel(band=1)
cleaned_ds.name = 'population'

The xarray.plot() Matplotlib API is fine for plotting small sections of this dataset, such as the population of Indonesia:

ROI = cleaned_ds.sel(y=slice(10, -10), x=slice(90, 110))
ROI.plot();
../_images/9b527272cd91e7fed0c08c1594434bd5f71d8562ba960cb2b3ed5247a6c0bcb8.png

Matplotlib will struggle with the full dataset, but we can use hvPlot+Datashader to see it all and interactively look at patterns with different spatial scales. Here we will apply a logarithmic colormap to the population counts (similar in appearance to eq_hist in this case, but easier for people to interpret), and set a clim to exclude the lower bound of zero from the log calculation:

rasterized_pop = cleaned_ds.hvplot.image(rasterize=True, cnorm='log', clim=(1, np.nan))
rasterized_pop