HoloViz Talk#
Revealing your data (nearly) effortlessly,
at every step in your workflow

Workflow from data to decision#

If there's no visualization at any of these stages, you're flying blind.
But visualization is often skipped as too hard to construct, particularly for big data.
What if it were simple to visualize anything, anywhere?

Good news: Lots of choices!
Bad news: Too hard to try them all, learn them all, or get them to work together.

HoloViz: Seamless interoperability for browser-based viz tools
Supported by Anaconda, Inc.
HoloViz Goals:#
Full functionality in browsers (not desktop)
Full interactivity (inside and out of plots)
Focus on Python users, not web programmers
Start with data, not coding
Work with data of any size
Exploit general-purpose SciPy/PyData tools
Focus on 2D primarily, with some 3D
Avoid entangling your data, code, and viz:
Same viz/analysis code in Jupyter, Python, HPC, …
Widgets/apps in Jupyter, standalone servers, web pages
Jupyter as a tool, not part of the results
Exploring Pandas Dataframes#
If your data is in a Pandas dataframe, it’s natural to explore it using the .plot() method (based on Matplotlib). Let’s look at a dataset of the number of cases of measles and pertussis (per 100,000 people) over time in each state:
from pathlib import Path
import pandas as pd
df = pd.read_csv(Path('../data/diseases.csv.gz'))
df.head()
| | Year | Week | State | measles | pertussis |
|---|---|---|---|---|---|
| 0 | 1928 | 1 | Alabama | 3.67 | NaN |
| 1 | 1928 | 2 | Alabama | 6.25 | NaN |
| 2 | 1928 | 3 | Alabama | 7.95 | NaN |
| 3 | 1928 | 4 | Alabama | 12.58 | NaN |
| 4 | 1928 | 5 | Alabama | 8.03 | NaN |
Just calling .plot() won’t give anything meaningful, because it doesn’t know what should be plotted against what:
%matplotlib inline
df.plot();

But with some Pandas operations we can pull out parts of the data that make sense to plot:
import numpy as np
by_year = df[["Year","measles"]].groupby("Year").sum()  # total measles rate per year
by_year.plot();

Here it is easy to see that the 1963 introduction of a measles vaccine brought the cases down to negligible levels.
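To make the "pull out parts of the data" step concrete without needing the dataset file, here is a minimal sketch on a tiny made-up table. The frame `toy` and its values are invented for illustration and are not taken from `diseases.csv.gz`; only the column names mirror the real data. It shows the yearly aggregation used above and, as one possible variation, a pivot that keeps each state as its own column.

```python
import pandas as pd

# Tiny made-up table with the same column names as the diseases dataset.
# The numbers are invented for illustration only.
toy = pd.DataFrame({
    "Year":    [1960, 1960, 1960, 1961, 1961, 1961],
    "Week":    [1, 2, 1, 1, 2, 1],
    "State":   ["Alabama", "Alabama", "Alaska", "Alabama", "Alabama", "Alaska"],
    "measles": [3.0, 5.0, 1.0, 2.0, 4.0, 0.5],
})

# Collapse weeks and states: one total per year, indexed by Year.
toy_by_year = toy.groupby("Year")[["measles"]].sum()

# Keep states separate: one column per state, one row per year.
toy_by_state = toy.pivot_table(index="Year", columns="State",
                               values="measles", aggfunc="sum")
```

Because both results are indexed by `Year`, calling `toy_by_year.plot()` or `toy_by_state.plot()` has an unambiguous x axis and draws one line per column.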
Exploring Data with hvPlot and Bokeh#
The above plots are just static images, but if you import the hvplot package, you can use the same plotting API to get fully interactive plots with hover, pan, and zoom in a web browser:
import hvplot.pandas # noqa: adds hvplot method to pandas objects
by_year.hvplot()