Visualizing Tabular Data
Overview
Teaching: 15 min
Exercises: 0 min
Questions
How can I visualize tabular data in Python?
How can I group several plots together?
Objectives
Plot simple graphs from data.
Group several graphs in a single figure.
Visualizing data
The mathematician Richard Hamming once said, “The purpose of computing is insight, not numbers,” and
the best way to develop insight is often to visualize data. Visualization deserves an entire
lecture of its own, but we can explore a few features of Python’s matplotlib library here. While
there is no official Python plotting library, matplotlib is the de facto standard. First, we will
import the pyplot module from matplotlib and use two of its functions to create and display a
heat map of our data:
from matplotlib import pyplot
Note the slightly different syntax of the import statement this time.
Using from, we bring just the plotting module, matplotlib.pyplot, into our program
rather than referring to it through the whole matplotlib package.
With this form, we only have to call it pyplot instead of matplotlib.pyplot.
We could shorten the name further with an alias. For example, to refer to the matplotlib.pyplot
module as just plt, we could use either of the following:
from matplotlib import pyplot as plt
import matplotlib.pyplot as plt
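As a quick check that these import forms are interchangeable, the minimal sketch below binds the module under three names and compares them. The name plt_alt is hypothetical and is used here only so that the two aliases can be compared side by side.

```python
# All of these import forms bind a name to the same module object.
from matplotlib import pyplot
from matplotlib import pyplot as plt
import matplotlib.pyplot as plt_alt  # plt_alt is a made-up name, used only for this comparison

# Each comparison checks object identity, not just equality.
print(pyplot is plt)
print(plt is plt_alt)
```

Because every name points at the same module, the choice between them is purely a matter of how much typing you want to do later.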
Now let’s plot the medical data we read previously:
image = pyplot.imshow(data)
pyplot.show()
Blue pixels in this heat map represent low values, while yellow pixels represent high values. As we can see, inflammation rises and falls over a 40-day period.
To continue, close the plot window.
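To make the mapping from colors to values explicit, a color bar can be attached to a heat map. The sketch below does not use the medical data; it assumes a small made-up array, toy_data, whose values increase from left to right, so the left edge of the image should take the low-value color and the right edge the high-value color.

```python
import numpy
from matplotlib import pyplot

# Made-up stand-in for the real data: 5 rows, each counting from 0 to 9.
toy_data = numpy.tile(numpy.arange(10), (5, 1))

# Draw the heat map, then attach a color bar that acts as a legend for the colors.
image = pyplot.imshow(toy_data)
colorbar = pyplot.colorbar(image)
colorbar.set_label("value")
```

Calling pyplot.show() afterwards displays the figure in the same way as before.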
Let’s take a look at the average inflammation over time:
ave_inflammation = numpy.mean(data, axis=0)
ave_plot = pyplot.plot(ave_inflammation)
pyplot.show()
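The axis=0 argument is what turns a table of patients and days into one value per day. As a small illustration, assuming a made-up array toy_data with 3 patients (rows) and 4 days (columns) rather than the real data set:

```python
import numpy

# Made-up stand-in for the inflammation data: 3 patients (rows) x 4 days (columns).
toy_data = numpy.array([[0.0, 1.0, 2.0, 1.0],
                        [0.0, 2.0, 4.0, 2.0],
                        [0.0, 3.0, 6.0, 3.0]])

per_day = numpy.mean(toy_data, axis=0)      # averages down each column: one value per day
per_patient = numpy.mean(toy_data, axis=1)  # averages across each row: one value per patient

print(per_day.shape)      # one entry for each of the 4 days
print(per_patient.shape)  # one entry for each of the 3 patients
```

Plotting per_day gives a curve over time, which is the shape of result we want for the real data.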
The variable ave_inflammation holds the average inflammation per day across all patients; we then
asked matplotlib.pyplot to create and display a line graph of those values. The result is a
roughly linear rise and fall, which is suspicious: we might instead expect a sharper rise and slower
fall. Let’s have a look at two other statistics:
max_plot = pyplot.plot(numpy.max(data, axis=0))
pyplot.show()
What is this showing? Can you deduce it from the name of the numpy function?
How about if you try the numpy.std function?
std_plot = pyplot.plot(numpy.std(data, axis=0))
pyplot.show()
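The three line graphs above can also be grouped in a single figure. The following is one possible sketch using pyplot.subplots, which returns a figure and an array of axes to draw on. It assumes a made-up random array, toy_data, in place of the medical data, and the figure size and labels are arbitrary choices.

```python
import numpy
from matplotlib import pyplot

# Made-up stand-in for the inflammation data: 10 patients (rows) x 40 days (columns).
rng = numpy.random.default_rng(seed=0)
toy_data = rng.integers(low=0, high=20, size=(10, 40))

# One figure holding three side-by-side sets of axes.
fig, axes = pyplot.subplots(nrows=1, ncols=3, figsize=(10.0, 3.0))

# Draw one per-day statistic on each set of axes.
axes[0].set_ylabel("average")
axes[0].plot(numpy.mean(toy_data, axis=0))

axes[1].set_ylabel("max")
axes[1].plot(numpy.max(toy_data, axis=0))

axes[2].set_ylabel("std")
axes[2].plot(numpy.std(toy_data, axis=0))

# Adjust spacing so the axis labels do not overlap.
fig.tight_layout()
```

Calling pyplot.show() afterwards displays all three graphs in one window; replacing toy_data with data would plot the real measurements instead.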
Key Points
Use the pyplot module from the matplotlib library for creating simple visualizations.