---
name: Heatmap
status: draft
description: STATISTICAL family. A grid of cells that encodes each value's magnitude as colour, turning a whole matrix of numbers into a field your eye can scan for hot and cold zones.
sources:
  - title: Wilkinson, L. & Friendly, M. (2009), The History of the Cluster Heat Map, The American Statistician 63(2)
    url: https://doi.org/10.1198/tas.2009.0033
  - title: Heuer, Richards J. (1999), Psychology of Intelligence Analysis
    url: https://openlibrary.org/works/OL20202835W
---

# Heatmap

## Why it matters

A heatmap lays a table of numbers out on a grid, with rows down one side and columns across the top, and paints each cell a colour according to how big its value is. The number itself disappears into a shade, and what you get back is a *field of colour*: a whole matrix you can take in at a glance, where your eye finds the hot corner, the cold band, and the cluster long before it could read the cells one by one. It trades the exact figure in each cell for something a table can't give you: the shape of the whole thing at once.

For example: an analyst has a correlation matrix of forty financial indicators, which is sixteen hundred numbers and far too many to read as a table. Rendered as a heatmap with a red-white-blue scale, two things jump out before any number is read: a fully saturated diagonal (every indicator correlates perfectly with itself) and an unmistakable warm block in one quadrant where a cluster of rates all move together. The block *is* the finding. It is there in the table too, but finding it by reading would take a long, deliberate search; the colour field did the pattern-finding for the eye.

- **What it shows.** A whole two-dimensional matrix of values at once, with each cell's magnitude encoded as colour — so concentration, gaps, bands, and clusters become visible structure instead of buried numbers.
- **When to reach for it.** A dense grid of values — a correlation matrix, activity by hour-and-day, a confusion matrix, conditions against outcomes — where you want to *see the pattern* and a table would force you to find it by hand.
- **How to read it.** Read the colour scale first (it's the whole key), then scan the field for the hot and cold zones; reordering rows and columns so similar ones sit together makes the blocks pop out.
- **What you'd miss without it.** The structure that lives in the matrix as a whole — the cluster of variables that move together, the dead row, the diagonal stripe — none of which a cell-by-cell reading reliably surfaces.
- **Where it misleads.** Colour reads less precisely than position, so it's for spotting patterns, not reading exact values; and a badly chosen colour scale can manufacture a band that isn't in the data or hide one that is.

## How to read it

Picture a grid. The **rows** are one set of things (the forty indicators, the senders, the evidence items), the **columns** are another (the same forty indicators, the months, the hypotheses), and every place a row meets a column is a **cell** holding one number. Instead of printing that number, the heatmap fills the cell with a colour drawn from a scale — and the result is that the whole matrix becomes a picture. You read it the way you read a map: not cell by cell, but by letting your eye roam the field for the bright regions, the dark regions, and the boundaries between them.

The **colour scale is everything**: it's the legend that turns shade back into meaning, and the *kind* of scale you choose changes what the picture says. A **sequential** scale runs from light to dark, usually within a single hue (pale to deep blue, say), and is for plain magnitude: low at one end, high at the other. A **diverging** scale runs from one colour through a neutral midpoint to a second colour (red-white-blue) and is for values that sit *above or below a meaningful centre* (a correlation above or below zero, a result above or below target), because it lets the midpoint read as neutral and pushes both extremes to vivid opposite hues. Match the scale to the question: a diverging scale on plain magnitude invents a midpoint that isn't there; a sequential scale on signed data hides the very sign you cared about.
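The scale choice can be sketched in a few lines of matplotlib. This is an illustration under stated assumptions, not Ora's renderer: `choose_scale` and `draw_heatmap` are hypothetical helper names, the data is made up, and `Blues` and `RdBu_r` are simply two of matplotlib's built-in colormaps picked to match the examples in the text.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend, so no display is needed
import matplotlib.pyplot as plt
from matplotlib.colors import Normalize, TwoSlopeNorm


def choose_scale(values, center=None):
    """Return (colormap name, norm) for a matrix of values.

    center=None  -> sequential scale spanning min..max.
    center=<num> -> diverging scale anchored on that midpoint.
    """
    values = np.asarray(values, dtype=float)
    if center is None:
        return "Blues", Normalize(vmin=np.nanmin(values), vmax=np.nanmax(values))
    # Symmetric limits: equal distances above and below the centre
    # get equally strong colour. Fall back to 1.0 if the data is flat.
    half_range = float(np.nanmax(np.abs(values - center))) or 1.0
    norm = TwoSlopeNorm(vmin=center - half_range, vcenter=center,
                        vmax=center + half_range)
    return "RdBu_r", norm  # reversed map: high values red, low values blue


def draw_heatmap(ax, values, center=None, title=""):
    """Draw one heatmap with its colour bar on the given axes."""
    cmap, norm = choose_scale(values, center)
    image = ax.imshow(values, cmap=cmap, norm=norm, aspect="auto")
    ax.figure.colorbar(image, ax=ax)  # the colour bar is the legend
    ax.set_title(title)
    return image


# Made-up data: counts are plain magnitude, correlations are signed.
rng = np.random.default_rng(0)
counts = rng.integers(0, 50, size=(7, 24))
correlations = np.corrcoef(rng.normal(size=(6, 200)))

fig, (left, right) = plt.subplots(1, 2, figsize=(10, 4))
draw_heatmap(left, counts, title="sequential: activity counts")
draw_heatmap(right, correlations, center=0.0, title="diverging: correlations")
fig.tight_layout()
```

Anchoring the diverging norm symmetrically around the centre is a design choice: it keeps a value of +0.3 and a value of -0.3 equally vivid, so neither side of the midpoint looks stronger than it is.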

One move makes a heatmap far more revealing: **reordering the rows and columns** so that similar ones sit next to each other (often by clustering them — grouping the rows that behave alike, then the columns). The raw matrix may look like static; the *reordered* one snaps into blocks, because the eye reads adjacency as kinship and the clusters line up into solid squares of colour. The caveat to hold onto throughout: the human eye judges *position* precisely and *colour* only roughly, so a heatmap is a pattern-spotting instrument, not a readout. A cell that looks "moderately warm" might be a 0.42 or a 0.58 and you cannot tell by looking — which is exactly why the heatmap's job is to point you at the hot zone, not to report its temperature to two decimals.
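The reordering step can be sketched with SciPy's hierarchical clustering. This is one way to do it, assuming average linkage on Euclidean distance; `cluster_order` and `reorder` are hypothetical helper names, and the matrix is made up so that it hides two blocks behind a shuffle.

```python
import numpy as np
from scipy.cluster.hierarchy import leaves_list, linkage


def cluster_order(matrix, method="average"):
    """Order the rows of `matrix` so that similar rows end up adjacent."""
    tree = linkage(matrix, method=method, metric="euclidean")
    return leaves_list(tree)  # leaf order of the dendrogram, left to right


def reorder(matrix):
    """Cluster the rows, then the columns, and return the rearranged matrix."""
    matrix = np.asarray(matrix, dtype=float)
    row_order = cluster_order(matrix)
    col_order = cluster_order(matrix.T)  # columns are rows of the transpose
    return matrix[np.ix_(row_order, col_order)], row_order, col_order


# Made-up data: two 3x3 hot blocks plus a little noise, then shuffled
# so the raw matrix looks like static.
rng = np.random.default_rng(0)
blocks = np.kron(np.eye(2), np.ones((3, 3)))
noisy = blocks + rng.normal(scale=0.05, size=blocks.shape)
row_shuffle, col_shuffle = rng.permutation(6), rng.permutation(6)
shuffled = noisy[np.ix_(row_shuffle, col_shuffle)]

ordered, row_order, col_order = reorder(shuffled)
print(np.round(ordered, 1))  # rows and columns of each block are adjacent again
```

For a packaged version that also draws the dendrograms, seaborn's `clustermap` is worth a look; the sketch above only shows the ordering idea underneath it.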

## When to use it

The heatmap belongs to the **STATISTICAL family** of charts — the ones built to show quantitative structure in data — and within it the heatmap is the *matrix* member: the tool for when your data is a grid of values and the question is "what's the structure in this whole matrix?" That's a different question from the ones its neighbours answer, and the difference is how you choose:

- A **comparison chart** (or a plain table) keeps the *exact values* — reach for it when the reader needs to read each number off precisely, not estimate it from a shade. The heatmap deliberately gives that precision up to gain the whole-matrix view.
- A **scatter plot** is for *two continuous variables* — every point a pair of measurements, position carrying both. The heatmap is for a grid indexed by categories or ordered bins with one quantity per cell, not a cloud of continuous pairs.
- A **distribution plot** shows the shape of *one* variable's values; the heatmap shows one quantity across *two* dimensions at once.

Reach for a heatmap when the data is a dense matrix and the goal is to *see* its structure — correlations among many variables, activity across time-of-day and day-of-week, a confusion matrix of predicted-versus-actual, an analyst's consistency ratings of evidence against hypotheses. Skip it when the reader must read exact numbers (use a table), when you have only one dimension (use a bar or distribution plot), or when the matrix is sparse — a mostly-empty heatmap usually serves the reader worse than a sorted list. The heatmap earns its keep precisely when there are too many numbers to read and a pattern worth seeing.
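To make the "dense versus sparse" test concrete, here is a minimal sketch that bins raw event timestamps into the hour-by-weekday grid mentioned above and reports how much of the grid is filled. The function names and the sample events are hypothetical, and where the cut-off for "too sparse" falls is a judgement call rather than a fixed rule.

```python
from datetime import datetime


def hour_by_weekday(timestamps):
    """Count events into a 7 x 24 grid: rows Monday..Sunday, columns hour 0..23."""
    grid = [[0] * 24 for _ in range(7)]
    for ts in timestamps:
        grid[ts.weekday()][ts.hour] += 1
    return grid


def fill_ratio(grid):
    """Fraction of cells that hold a non-zero value."""
    cells = [value for row in grid for value in row]
    return sum(1 for value in cells if value != 0) / len(cells)


# Made-up events: two on a Monday morning, one on a Tuesday afternoon.
events = [
    datetime(2024, 1, 1, 9, 15),
    datetime(2024, 1, 1, 9, 45),
    datetime(2024, 1, 2, 14, 0),
]
grid = hour_by_weekday(events)
print(f"{fill_ratio(grid):.1%} of cells are filled")
```

With only three events the grid is almost entirely empty, which is the case where a sorted list of the busiest slots would likely serve the reader better than a colour field.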

## How Ora builds it

Ora produces a heatmap from a **semantic spec** — a structured description of the matrix (the row dimension, the column dimension, and the value in each cell), the **colour scale** (sequential for plain magnitude, diverging anchored on a meaningful midpoint for signed data), any **clustering or ordering** to apply to the rows and columns so similar ones sit together, and a **legend** mapping colour back to value. That spec is rendered to a figure (matplotlib- and seaborn-style: a grid of coloured cells with axis labels and a colour bar), with an accessible text description and a keyboard-navigable view of the cell grid, since a colour field alone isn't reachable by a screen reader.
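As a rough picture of what such a spec might hold, here is a hypothetical sketch. The class name `HeatmapSpec`, its field names, and the `describe` helper are all assumptions made for illustration; they are not Ora's actual schema or API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class HeatmapSpec:
    """Hypothetical spec: the matrix, the scale, the ordering, the legend."""
    rows: List[str]                  # row dimension labels
    columns: List[str]               # column dimension labels
    values: List[List[float]]        # one value per cell, row by row
    scale: str = "sequential"        # "sequential" or "diverging"
    midpoint: Optional[float] = None  # required when scale is "diverging"
    cluster: bool = False            # reorder rows and columns by similarity
    legend_label: str = "value"      # what the colour bar measures

    def validate(self):
        if len(self.values) != len(self.rows):
            raise ValueError("need one line of values per row label")
        if any(len(line) != len(self.columns) for line in self.values):
            raise ValueError("each line needs one value per column label")
        if self.scale not in ("sequential", "diverging"):
            raise ValueError("scale must be 'sequential' or 'diverging'")
        if self.scale == "diverging" and self.midpoint is None:
            raise ValueError("a diverging scale needs a meaningful midpoint")
        return self


def describe(spec):
    """Plain-text summary for readers who cannot see the colour field."""
    cells = [(value, row, col)
             for row, line in zip(spec.rows, spec.values)
             for col, value in zip(spec.columns, line)]
    hottest = max(cells, key=lambda cell: cell[0])
    coldest = min(cells, key=lambda cell: cell[0])
    return (f"Heatmap of {spec.legend_label}, {len(spec.rows)} rows by "
            f"{len(spec.columns)} columns. "
            f"Highest: {hottest[1]} / {hottest[2]} ({hottest[0]}). "
            f"Lowest: {coldest[1]} / {coldest[2]} ({coldest[0]}).")


# Made-up example data.
spec = HeatmapSpec(
    rows=["Mon", "Tue"],
    columns=["am", "pm"],
    values=[[3, 9], [1, 4]],
    legend_label="messages sent",
).validate()
summary = describe(spec)
print(summary)
```

Refusing a diverging scale without a midpoint mirrors the rule from the reading guide: the centre has to mean something, so the spec should make the author state it.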

The diagram is the visual face of Ora's **data-analysis** work: when a question takes the natural shape *X by Y, coloured by Z* — and the data is dense enough that a table would make the reader hunt for the pattern by hand — this is the artifact that shows it. The same pipeline renders an **ACH (Analysis of Competing Hypotheses) matrix**, where the rows are pieces of evidence, the columns are competing hypotheses, and each cell's colour encodes how consistent that evidence is with that hypothesis — turning structured analytic judgement into the same scannable colour field.
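A small sketch of how ACH ratings could become heatmap cells follows. The five-step rating codes and their numeric scores are an assumption chosen for illustration, not a scale taken from Heuer or from Ora; `ach_matrix` is a hypothetical helper, and the evidence and hypothesis labels are placeholders.

```python
# Assumed rating codes, from strongly inconsistent to strongly consistent.
RATING_TO_SCORE = {"II": -2, "I": -1, "N": 0, "C": 1, "CC": 2}


def ach_matrix(evidence, hypotheses, ratings):
    """Build an evidence x hypotheses grid of signed consistency scores.

    `ratings` maps (evidence, hypothesis) to a rating code.
    Pairs with no rating are treated as neutral ("N").
    """
    return [
        [RATING_TO_SCORE[ratings.get((item, hyp), "N")] for hyp in hypotheses]
        for item in evidence
    ]


# Placeholder labels and made-up ratings.
evidence = ["E1", "E2", "E3"]
hypotheses = ["H1", "H2"]
ratings = {
    ("E1", "H1"): "C",
    ("E1", "H2"): "II",
    ("E2", "H1"): "CC",
    ("E3", "H2"): "I",
}
matrix = ach_matrix(evidence, hypotheses, ratings)
for item, line in zip(evidence, matrix):
    print(item, line)
```

Because the scores are signed around a neutral zero, this grid calls for a diverging scale with its midpoint at 0, so consistent and inconsistent evidence land on opposite hues.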

The heatmap's lineage runs deep: shaded-matrix displays go back to **Toussaint Loua's 1873 statistical atlas of Paris**, where cells were shaded to show social statistics across the city's arrondissements. The *modern* form — the **cluster heat map**, where rows and columns are reordered by similarity and paired with dendrograms showing the clustering — grew out of numerical taxonomy in the biological sciences (the dendrogram-and-matrix displays associated with **Peter Sneath and Robert Sokal**) and became ubiquitous through gene-expression analysis. That whole history is told in **Wilkinson and Friendly's "The History of the Cluster Heat Map"** (*The American Statistician*, 2009), and the ACH variant traces to **Richards Heuer's** *Psychology of Intelligence Analysis* (1999).

## Related

- **Comparison Chart** — the STATISTICAL-family member that preserves exact values in a table or aligned bars; reach for it when the reader must *read* numbers, not estimate them from colour.
- **Distribution Plot** — shows the full shape of one variable's values, where the heatmap spreads one quantity across two dimensions.
- **Scatter Plot** — the tool for two *continuous* variables as a cloud of points, where the heatmap handles a category-or-bin grid with one value per cell.
- **ACH Matrix** — the structured-analytic display (evidence × hypotheses, coloured by consistency) that Ora renders through this same heatmap pipeline.

## Sources

- [Wilkinson, L. & Friendly, M. (2009), The History of the Cluster Heat Map, The American Statistician 63(2)](https://doi.org/10.1198/tas.2009.0033)
- [Heuer, Richards J. (1999), Psychology of Intelligence Analysis](https://openlibrary.org/works/OL20202835W)