Visualizing Tabular Data

Overview

Teaching: 15 min
Exercises: 0 min
Questions
  • How can I visualize tabular data in Python?

  • How can I group several plots together?

Objectives
  • Plot simple graphs from data.

  • Group several graphs in a single figure.

Visualizing data

The mathematician Richard Hamming once said, “The purpose of computing is insight, not numbers,” and the best way to develop insight is often to visualize data. Visualization deserves an entire lecture of its own, but we can explore a few features of Python’s matplotlib library here. While there is no official Python plotting library, matplotlib is the de facto standard. First, we will import the pyplot module from matplotlib and use two of its functions to create and display a heat map of our data:

from matplotlib import pyplot

Note the slightly different form of the import statement this time. With from, we import only the pyplot module from the matplotlib library, rather than every module in matplotlib.

Furthermore, with this form we can refer to the module simply as pyplot. We could shorten the name further with an alias, e.g., to refer to the matplotlib.pyplot module as just plt, we could use either:

from matplotlib import pyplot as plt

import matplotlib.pyplot as plt
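As a quick check that these import forms are interchangeable, the short sketch below imports the module three ways and compares the results. The name plt_alt is a hypothetical alias used only for this comparison.

```python
# All three statements bind a name to the same module object.
from matplotlib import pyplot
from matplotlib import pyplot as plt
import matplotlib.pyplot as plt_alt  # hypothetical alias, only for comparison

print(pyplot is plt)   # True
print(plt is plt_alt)  # True
```

Whichever form you choose, use the same name consistently in the rest of your script.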

Now let’s plot the medical data we read previously:

image = pyplot.imshow(data)
pyplot.show()

Heatmap of the Data

Blue pixels in this heat map represent low values, while yellow pixels represent high values. As we can see, inflammation rises and falls over a 40-day period.

To continue, close the plot window.
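To see how values map to colors without the medical data, here is a minimal sketch on a small made-up array. The array toy and the file name heatmap_sketch.png are assumptions for illustration, not part of the lesson's data. The sketch also adds a color bar with pyplot.colorbar and writes the figure to a file instead of opening a window, so it can run on a machine without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no window is needed

import numpy
from matplotlib import pyplot

# Made-up data: 3 rows by 4 columns, with values increasing from 0 to 11.
toy = numpy.arange(12).reshape(3, 4)

image = pyplot.imshow(toy)   # draw the array as a heat map
pyplot.colorbar(image)       # add a legend relating colors to values
pyplot.savefig("heatmap_sketch.png")
```

To view the figure interactively instead, remove the matplotlib.use line and replace the pyplot.savefig call with pyplot.show().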

Let’s take a look at the average inflammation over time:

ave_inflammation = numpy.mean(data, axis=0)
ave_plot = pyplot.plot(ave_inflammation)
pyplot.show()

Average Inflammation Over Time

Here, we have put the average inflammation per day across all patients in the variable ave_inflammation, then asked matplotlib.pyplot to create and display a line graph of those values. The result is a roughly linear rise and fall, which is suspicious: we might instead expect a sharper rise and slower fall. Let’s have a look at two other statistics:
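Before moving on, the effect of axis=0 can be checked on a tiny array. The sketch below assumes the same layout as the lesson's data (one row per patient, one column per day). The array toy is made up for illustration.

```python
import numpy

# Made-up data: 2 patients (rows) by 3 days (columns).
toy = numpy.array([[0.0, 2.0, 4.0],
                   [2.0, 4.0, 6.0]])

per_day = numpy.mean(toy, axis=0)      # average down the rows: one value per day
per_patient = numpy.mean(toy, axis=1)  # average across the columns: one value per patient

print(per_day)      # [1. 3. 5.]
print(per_patient)  # [2. 4.]
```

The next two plots use the same axis=0 argument with different numpy functions: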

max_plot = pyplot.plot(numpy.max(data, axis=0))
pyplot.show()

What is this showing? Can you deduce it from the name of the numpy function?

What happens if you try the numpy.std function?

std_plot = pyplot.plot(numpy.std(data, axis=0))
pyplot.show()
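The three plots above can also be grouped in a single figure. The following is one possible sketch, using pyplot.figure to create the figure and add_subplot to place three sets of axes side by side. The small data array, the figure size, and the file name grouped_sketch.png are assumptions chosen so the example runs on its own. In the lesson you would use the data array read previously.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no window is needed

import numpy
from matplotlib import pyplot

# Made-up stand-in for the lesson's data: 4 patients (rows) by 5 days (columns).
data = numpy.array([[0, 1, 3, 1, 0],
                    [0, 2, 4, 2, 0],
                    [0, 1, 5, 2, 1],
                    [0, 2, 4, 3, 0]], dtype=float)

# One figure holding a grid of 1 row and 3 columns of subplots.
fig = pyplot.figure(figsize=(10.0, 3.0))
axes1 = fig.add_subplot(1, 3, 1)
axes2 = fig.add_subplot(1, 3, 2)
axes3 = fig.add_subplot(1, 3, 3)

axes1.set_ylabel("average")
axes1.plot(numpy.mean(data, axis=0))

axes2.set_ylabel("max")
axes2.plot(numpy.max(data, axis=0))

axes3.set_ylabel("std")
axes3.plot(numpy.std(data, axis=0))

fig.tight_layout()  # keep the axis labels from overlapping
fig.savefig("grouped_sketch.png")
```

The three arguments to add_subplot are the number of rows, the number of columns, and the position of the subplot in that grid, counting from 1. To view the figure in a window instead, remove the matplotlib.use line and replace the fig.savefig call with pyplot.show().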

Key Points

  • Use the pyplot module from the matplotlib library for creating simple visualizations.