---
name: Base-Rate Neglect
status: draft
territory: hypothesis-evaluation
host_mode: bayesian-hypothesis-network
also_loadable_in: []
msi_wired: true
msi_family: inference
sources:
  - title: Kahneman, Daniel & Amos Tversky (1973), "On the Psychology of Prediction," Psychological Review 80(4):237-251
    url: https://doi.org/10.1037/h0034747
  - title: Bar-Hillel, Maya (1980), "The base-rate fallacy in probability judgments," Acta Psychologica 44(3):211-233
    url: https://doi.org/10.1016/0001-6918(80)90046-3
---

# Base-Rate Neglect

## Why it matters

A specific, vivid case grabs the whole mind. The story fits, the test came back positive, the profile matches — and in that grip we forget to ask the one question that actually governs the answer: *how common is this thing to begin with?* When the thing is rare, that one forgotten question doesn't just shade the answer. It flips it.

For example: a security system flags one account out of millions as fraudulent, and the flag is 99% accurate. It feels like a near-certain catch. But if genuine fraud is one account in a hundred thousand, then for every real fraudster the system catches, it lights up on roughly a thousand honest customers — the rare truth is buried under false alarms that the "99% accurate" number never warned you about. The accuracy is real. It's also almost beside the point. What controls the answer is the frequency you skipped.

- **What it reveals.** Whether a probability judgment was actually anchored in how common the category is — or built entirely on how well the case *matches* the category, with the underlying frequency silently dropped.
- **How it changes the read.** You stop asking *"how strong is the match between this case and the category?"* and start asking *"and of all the cases like this, how many really belong to the category?"* — the prior frequency, before any of the case-specific evidence.
- **When to foreground it.** Whenever a positive result, a profile, or a vivid story is being read as near-certain — especially when the category is rare (a disease, a fraud, a threat), where neglecting the base rate does the most damage.
- **What you'd miss without it.** That a confident-looking flag can be overwhelmingly a *false* flag — that at a low enough base rate, most positives are wrong no matter how accurate the test, and the felt certainty is an illusion the arithmetic dissolves.
- **Where it misleads.** Over-corrected, it becomes its own error: clinging to the base rate and waving off genuinely strong evidence. The base rate is the *starting* point, not the verdict — the evidence is still supposed to move it.

## Realtime examples

See real, dated analyses where this discipline shaped the read on the news → **[Base-Rate Neglect on Main Street Independent](https://mainstreetindependent.com/analyses/lens/inference/base-rate-neglect)**

## How to invoke it in Ora

You have a question with several competing explanations and a pile of evidence, and you want each explanation's probability anchored honestly — starting from how *common* each one actually is, not just how well it fits the story.

Describe the phenomenon and the candidate explanations, and ask:

> "Build a Bayesian hypothesis network for what's behind the spike in flagged transactions — and make sure each hypothesis's prior is the real base rate, not a round number."

Base-rate neglect is one of the always-loaded reasoning tools in the Bayesian Hypothesis Network analysis. When the mode sets a starting probability for each hypothesis, this is the model standing guard at that step: it forces every prior to be anchored in the actual frequency of its reference class rather than a comfortable-feeling number, and it flags the case where a rare category, dressed in strong-looking evidence, is about to be read as near-certain.

One thing to know: you don't summon this guard by saying the phrase "base-rate neglect." What routes you to the host analysis are the words *Bayesian network*, *posterior probability*, *prior and likelihood*, *base rate*, *competing explanations*, or plainly *what's the most likely explanation* — and once you're in the host, this model is always present, keeping the base rate in the prior. If you just want a quick gut-check on a single number, a lighter pass is the better fit; the full network takes ten-plus minutes and a real evidence inventory.

State the frequencies you actually know, even as rough ranges. The guard works best when you hand it a reference class — "out of every thousand accounts," "of all firms this size" — because a base rate stated as a frequency is exactly what intuition drops and what the discipline puts back.

One thing Ora won't do: let a hypothesis carry an invented prior because no base rate was at hand. If the reference class genuinely can't be pinned down, the analysis declares that openly rather than fabricating a frequency — because a confident number with no grounding is the very error this model exists to catch.

## How it works

A patient walks out of a routine screening with a positive result. The test is "95% accurate," the doctor's tone is grave, and everyone in the room — the patient, the family, often the doctor too — reads the result the same way: this is almost certainly the disease. The fear is total. And it is almost entirely misplaced.

Walk the actual people through the room. Imagine a thousand patients just like this one, screened for a disease that strikes about 1 in 1,000. Among those thousand, roughly *one* person truly has it, and assume the test catches them: it lights up positive. But the test also wrongly flags 5% of healthy people, and there are 999 healthy people in that crowd. Five percent of 999 is about 50. So fifty perfectly healthy people also walk out with a positive result. Now line up everyone holding a positive slip: about 51 people, of whom only one is actually sick. For this patient, a positive result means a chance of about 1 in 51 (a bit under 2%) of having the disease, not 95%. The very same "95% accurate" test, on a rare disease, produces about fifty false alarms for every real case.
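The head-count above can be sketched in a few lines of Python. This is a minimal illustration, not part of Ora: the function name `screening_counts` and its parameters are hypothetical, and it assumes (as the walk-through does) that the test catches every true case and that the 5% error rate means 5% of healthy people are flagged.

```python
def screening_counts(population, prevalence, sensitivity, false_positive_rate):
    """Expected head-counts for a screened population (natural frequencies)."""
    sick = population * prevalence
    healthy = population - sick
    true_positives = sick * sensitivity               # sick people the test flags
    false_positives = healthy * false_positive_rate   # healthy people it flags anyway
    all_positives = true_positives + false_positives
    return {
        "sick": sick,
        "healthy": healthy,
        "true_positives": true_positives,
        "false_positives": false_positives,
        "share_of_positives_truly_sick": true_positives / all_positives,
    }


# Numbers from the walk-through: 1,000 patients, disease in about 1 in 1,000.
counts = screening_counts(
    population=1000,
    prevalence=1 / 1000,
    sensitivity=1.0,           # assumption: every true case is caught
    false_positive_rate=0.05,  # assumption: 5% of healthy people test positive
)
print(f"{counts['share_of_positives_truly_sick']:.1%}")  # a bit under 2%
```

Raising `prevalence` to 1 in 10 while leaving the test untouched lifts that share to roughly 69%, which is the point: the same test gives very different answers at different base rates.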

Notice what just happened. The number that felt completely irrelevant, *how rare the disease is in the first place*, turned out to be the number running the whole show. The test's accuracy did move the answer, but only from one in a thousand to about one in fifty; the rarity of the disease kept it nowhere near 95%. And the mind, fixated on the vivid, specific, frightening case in front of it, sailed right past the rarity as if it didn't matter. It's the same blind spot that makes a witness's testimony feel decisive while we forget how unusual the thing she's identifying actually is. Tell people that 85% of a city's cabs are green and only 15% blue, then put a witness on the stand who is 80% reliable and says the cab in the accident was blue, and almost everyone concludes it was probably blue. Work it through and the real chance it was blue is about 41%: of every 100 cabs that might have been involved, the witness would correctly call 12 of the 15 blue cabs blue and wrongly call 17 of the 85 green cabs blue, so only 12 of the 29 "blue" reports are right. The witness pulls the eye; the overwhelming greenness of the streets, the thing that should anchor the answer, slips the mind entirely.
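The 41% figure for the cab problem follows from Bayes' rule. Here is a minimal sketch, assuming the witness is equally reliable for both colours (right 80% of the time whether the cab is blue or green); the function name `posterior` is illustrative.

```python
def posterior(prior, p_evidence_if_true, p_evidence_if_false):
    """Bayes' rule for one hypothesis and one piece of evidence."""
    numerator = prior * p_evidence_if_true
    denominator = numerator + (1 - prior) * p_evidence_if_false
    return numerator / denominator


# Hypothesis: the cab was blue. Evidence: the witness says "blue".
p_blue = posterior(
    prior=0.15,                # base rate: 15% of the city's cabs are blue
    p_evidence_if_true=0.80,   # witness says "blue" when the cab is blue
    p_evidence_if_false=0.20,  # witness says "blue" when the cab is green
)
print(f"{p_blue:.0%}")  # 41%
```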

That blind spot has a name: **base-rate neglect**. The base rate is just the background frequency (how common the category is before you look at this particular case), and the bias is the systematic habit of letting the case-specific evidence crowd it out of the calculation. What makes it so stubborn is that it isn't ignorance. In the original experiments, people who were *handed* the base rate, plainly stated, still dropped it the moment a vivid profile arrived; the specific story simply feels more like the answer than a dry statistic does. And the error is sharpest exactly where it's most dangerous: when the category is rare and the evidence is only moderately diagnostic, the false alarms quietly outnumber the real ones, and a result that looks like near-certainty is mostly noise. The cure is not to throw the evidence away. It's to put the forgotten number back where it belongs: start from how common the thing is, *then* let the evidence move you. Bayesian reasoning, the partner correction to base-rate neglect, makes that discipline precise.

## Framework & implementation

*This section uses Ora's own terms for the parts of an analysis, so that if you open the actual mode and lens files they line up. Each is glossed in plain language on first use.*

### Pipeline execution

Base-rate neglect is one of the **always-loaded mental models** in the Bayesian Hypothesis Network analysis — a thinking tool the mode keeps in hand at every run, not the method itself. It sits in the mode's **`ANALYTICAL PERSPECTIVES`** block under "always loaded," alongside bayesian-reasoning, confirmation-bias, falsifiability, occams-razor, and tetlock-superforecasting. (The mode's *required* lens — the one that structures the evidence-versus-hypothesis matrix — is a separate piece, heuer-ach-diagnosticity; base-rate-neglect is a guard applied at one specific stage, not the scaffold.) This is a **molecular** mode running at **Gear 4** — Ora's most thorough adversarial setting, composing two sub-analyses before it synthesizes.

**Composition and breadth.** The mode first runs a **differential-diagnosis fragment** purely to seed breadth — enumerate the candidate hypotheses widely — then a full **competing-hypotheses** (ACH) pass. At Gear 4 a **Depth analyst** and a **Breadth analyst** work the question in parallel and cross-critique each other. Base-rate-neglect doesn't shape this seeding stage; its moment comes next, when those candidate hypotheses are about to be assigned probabilities.

**Where this model bites: prior-elicitation.** The first synthesis stage, **prior-elicitation**, sets a starting probability for each hypothesis, and this is precisely the seam base-rate-neglect guards, enforced as the mode's **CQ1 (prior anchoring)**. The model's **Application Steps** run here in order: first *identify the base rate* (how common is this category in the relevant population), and only then weigh the case-specific evidence. Its **Detection Signals** are exactly the tells of a fabricated prior: a hypothesis defended by reciting how well it *matches* the evidence while no base rate is cited, or an analyst who can state the frequency on request but never folds it into the working number. This model *is* what stands between a hypothesis's prior and the **prior-fabrication** failure mode, a round-number prior (0.5, 0.33) with no base-rate anchor. A prior that names no reference class does not survive the step; where genuinely no base rate exists, the analysis declares a *flat-prior assumption* explicitly rather than inventing one.
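The anchoring rule described above can be sketched as a small check. This is a hypothetical illustration, not Ora's implementation: the `Hypothesis` fields, the `audit_prior` function, the tolerance, and the sample hypotheses are all assumptions made for the example. Only the round-number values 0.5 and 0.33 and the flat-prior declaration come from the text.

```python
from dataclasses import dataclass
from typing import Optional

# Round-number priors the text names as the tell of prior-fabrication.
ROUND_PRIORS = (0.5, 0.33)


@dataclass
class Hypothesis:
    name: str
    prior: float
    reference_class: Optional[str] = None  # e.g. "out of every thousand accounts"
    flat_prior_declared: bool = False       # explicit flat-prior assumption


def audit_prior(h: Hypothesis) -> str:
    """Classify how a prior is anchored (hypothetical rule set)."""
    if h.reference_class:
        return "anchored"
    if h.flat_prior_declared:
        return "flat-prior assumption declared"
    if any(abs(h.prior - r) < 0.005 for r in ROUND_PRIORS):
        return "prior-fabrication"
    return "no reference class"


# Made-up hypotheses and numbers, for illustration only.
hypotheses = [
    Hypothesis("H1", 0.02, reference_class="out of every thousand accounts"),
    Hypothesis("H2", 0.5),
    Hypothesis("H3", 0.33, flat_prior_declared=True),
    Hypothesis("H4", 0.12),
]
for h in hypotheses:
    print(h.name, audit_prior(h))
```

A check like this only confirms that a reference class was named. Whether it is the right one remains a judgment call.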

**The reference class is itself a decision.** Anchoring the prior in a base rate is only half the discipline; anchoring it in the *right* base rate is the other half. The model's **Critical Questions** force both: *what is the base rate, and is the population it's drawn from the right one for this question?* A prior computed against the wrong reference class (the **wrong-population base rate** failure mode) is as wrong as no anchor at all, so the step demands the reference class be defined precisely before the number is trusted.

**Carrying the rarity downstream.** The error base-rate neglect produces — a rare category, dressed in strong-looking evidence, read as near-certain — has to be made visible in the output, not just corrected silently in the prior. When a hypothesis's base rate is very low, this model's last Application Step applies: *expect most positive signals to be false positives.* That expectation lands in two of the mode's ten output sections in particular — the **Sensitivity analysis** (the mode's **CQ3**, load-bearing), which shows how fragile a low-prior leading hypothesis is to a single piece of evidence, and the **Confidence map**, which reports the residual false-positive risk rather than a falsely crisp verdict. Posteriors over genuinely low base rates are carried as **bands with confidence**, not precise points.
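One way to picture a posterior carried as a band rather than a point is to push a *range* of plausible base rates through the same update. The sketch below is an assumption-laden illustration, not the mode's actual computation: the function names are hypothetical, the base-rate range (1 in 2,000 to 1 in 500) is made up, and the test characteristics are borrowed from the screening example.

```python
def posterior(base_rate, sensitivity, false_positive_rate):
    """Probability the hypothesis is true given one positive signal."""
    true_pos = base_rate * sensitivity
    false_pos = (1 - base_rate) * false_positive_rate
    return true_pos / (true_pos + false_pos)


def posterior_band(base_rate_low, base_rate_high, sensitivity, false_positive_rate):
    """Posterior range implied by an uncertain base rate.

    With the evidence held fixed, the posterior rises with the base rate,
    so evaluating the two endpoints is enough to bound the band.
    """
    return (
        posterior(base_rate_low, sensitivity, false_positive_rate),
        posterior(base_rate_high, sensitivity, false_positive_rate),
    )


# Hypothetical range: the base rate is somewhere between 1 in 2,000 and 1 in 500.
low, high = posterior_band(1 / 2000, 1 / 500, sensitivity=1.0, false_positive_rate=0.05)
print(f"posterior between {low:.1%} and {high:.1%}")  # roughly 1% to 4%
```

Even at the top of the band the positive signal is still far more likely false than true, which is the residual false-positive risk a confidence map would need to report.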

**Where the finding lands.** Across the ten output sections — **Phenomenon or question · Hypothesis nodes with priors · Evidence nodes with likelihoods · Conditional dependencies · Bayesian network (table/diagram) · Posterior distribution · Sensitivity analysis · MECE check · Leading hypothesis with residual uncertainty · Confidence map** — base-rate-neglect bites hardest at **Hypothesis nodes with priors**, where every prior must show its base-rate anchor, and echoes through **Sensitivity analysis** and the **Confidence map** wherever a low base rate means the positives are mostly false. It is the partner correction to bayesian-reasoning, which supplies the update *logic* once the prior is honestly set: this model guards the starting number, Bayesian reasoning moves it.

### Origin and evidence

The bias was isolated by Daniel Kahneman and Amos Tversky in "On the Psychology of Prediction" (1973), the paper that extended their *representativeness heuristic* to prediction. The heuristic is the mental shortcut of judging probability by how well a case resembles a category prototype, rather than by how frequent the category is. In their now-classic demonstrations, subjects told that a described person was drawn from a pool that was 70% lawyers and 30% engineers nonetheless judged probability almost entirely from the personality sketch, treating the stated base rate as if it weren't there. Maya Bar-Hillel's "The base-rate fallacy in probability judgments" (1980) consolidated and extended the finding into a systematic account of *when* base rates get used and when they get dropped. It showed the neglect was robust rather than an artifact of any single problem, and it sharpened the conditions, notably that more "specific"-seeming individuating evidence crowds out the base rate most forcefully. The result has been among the most replicated in the judgment-and-decision-making literature, and it generalizes from the lab to the clinic and the trading desk: the medical-screening and taxicab problems are its canonical illustrations precisely because the arithmetic is unforgiving and the intuitive error is near-universal. The partner discipline that names the correct calculation (start from the prior, move it by the evidence's discriminating power) is Bayesian reasoning, treated in its own paper.

### Applications and common uses

Base-rate neglect is the audit lens wherever a probability is being read off a match rather than computed from a frequency — and it is used both to *catch* an over-confident judgment after the fact and to *design* systems that won't manufacture them.

- **Diagnosis and screening.** Medicine, security, and fraud detection all live or die on the interaction of a test's accuracy with the base rate of what it's testing for. At a low base rate, a highly accurate test still produces mostly false positives; naming that up front is what turns "95% accurate" into the real, far lower probability that a given flag is true.
- **Detection-system design.** Building a screen or alert without specifying its expected false-positive rate *at the operating base rate* guarantees alert overload — operators drown in false alarms and start ignoring all of them. The fix is to size downstream review tiers to the predicted false-positive volume before deployment, not after the alerts are already being tuned out.
- **Forecasting and intelligence.** Anchoring each hypothesis's prior in the frequency of its reference class — the "outside view" — is documented as a core habit of accurate forecasters, and a standing guard against letting a vivid, specific narrative pre-empt the dull statistic that should anchor the estimate.
- **Law and policy.** From eyewitness identification to risk profiling, the same structure recurs: a moderately reliable signal applied to a rare target yields a flagged population that is mostly innocent, and the base rate is what keeps a confident-sounding match from being mistaken for proof.

In every case the payoff is the same: a probability that begins from how common the thing actually is, lets the case-specific evidence move it by the right amount, and refuses to read a rare event as near-certain just because the evidence in front of it is vivid.
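For the detection-system case, the predicted false-positive volume can be estimated before deployment. The sketch below is illustrative only: the function name and parameters are hypothetical, the daily volume (1,000,000 accounts) and reviewer capacity (200 reviews per analyst per day) are made-up inputs, and the "99% accurate" flag from the opening fraud example is read as 99% sensitivity with a 1% false-positive rate.

```python
import math


def expected_alert_load(cases_per_day, base_rate, sensitivity,
                        false_positive_rate, reviews_per_analyst_per_day):
    """Predict daily alert volume and the review staffing it implies."""
    true_alerts = cases_per_day * base_rate * sensitivity
    false_alerts = cases_per_day * (1 - base_rate) * false_positive_rate
    total_alerts = true_alerts + false_alerts
    return {
        "true_alerts": true_alerts,
        "false_alerts": false_alerts,
        # Share of alerts that are real; None if nothing is ever flagged.
        "share_real": true_alerts / total_alerts if total_alerts else None,
        "analysts_needed": math.ceil(total_alerts / reviews_per_analyst_per_day),
    }


load = expected_alert_load(
    cases_per_day=1_000_000,          # hypothetical volume
    base_rate=1 / 100_000,            # fraud rate from the opening example
    sensitivity=0.99,                 # assumption: "99% accurate" on real fraud
    false_positive_rate=0.01,         # assumption: 1% of honest accounts flagged
    reviews_per_analyst_per_day=200,  # hypothetical reviewer capacity
)
print(load)
```

Under these assumptions about ten real cases a day arrive mixed in with roughly ten thousand false alarms, so the review tier has to be sized for the false alarms, not for the fraud.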

### Failure modes and when not to use it

The lens's characteristic ways of going wrong are catalogued in its **Common Failure Modes**:

- **Match-as-classification.** Treating a strong feature match as classification certainty — the core error. The tell is an analyst who cites the matching features at length while never citing the base rate. The correction is procedural: require an explicit base-rate citation in any classification judgment.
- **Wrong-population base rate.** Anchoring the prior in a base rate computed for the wrong reference class: the population it's drawn from either doesn't actually contain the case under judgment or is padded with irrelevant cases. The correction is to define the reference class precisely and re-compute; a base rate is only as good as the population behind it.
- **Base-rate over-correction.** The mirror error: dismissing genuinely strong, diagnostic evidence in favor of the base rate, so the posterior never budges from the prior. The base rate is the starting point, not the answer — the goal is to *integrate* the evidence per Bayes, not to let the prior overrule it.
- **Hidden screening cost.** Deploying a detection system without analyzing its false-positive rate at the operating base rate, so the alerts turn out mostly false and get ignored. The correction is to design review tiers proportionate to the predicted false-positive volume up front.

**The practical fix.** The single most effective corrective is a reframing, not a warning. Gerd Gigerenzer and Ulrich Hoffrage (1995) showed that presenting the same problem in **natural frequencies** ("10 out of every 1,000 people" rather than "a probability of 0.01") sharply reduces the error even in people with no statistical training, because frequencies keep the base rate's whole population in view where percentages quietly discard it. Whenever a judgment has to be made or audited, stating the base rate as a count of a reference population is the cheapest, most reliable guard against the neglect.

**When not to reach for it.** When there is no definable base rate — the category has no stable frequency in any relevant population — there is nothing to anchor, and forcing a number is worse than naming the absence; the honest move is a *flat-prior assumption* declared openly. When the individuating evidence is genuinely overwhelming — a near-conclusive signal, not a moderately diagnostic one — the base rate's pull is correspondingly small, and leaning on it becomes the over-correction failure above. And when the disagreement is really about which reference class is *right* — not whether to use a base rate but which population to draw it from — that's a framing question to settle first, before any number is trusted.
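The difference between moderately diagnostic and near-conclusive evidence is easiest to see in the odds form of Bayes' rule, where the evidence enters as a single likelihood ratio. This is a minimal sketch: the function name `update` and the three likelihood ratios are chosen for illustration, and the prior of 1 in 1,000 is borrowed from the screening example.

```python
def update(prior, likelihood_ratio):
    """Odds-form Bayes: posterior odds = prior odds * likelihood ratio."""
    prior_odds = prior / (1 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)


prior = 0.001  # a rare category: 1 in 1,000

# A likelihood ratio of 20 matches the screening test above
# (flags every sick patient, flags 5% of healthy ones: 1.0 / 0.05).
for likelihood_ratio in (20, 1_000, 1_000_000):
    print(likelihood_ratio, round(update(prior, likelihood_ratio), 3))
# 20 -> 0.02, 1000 -> 0.5, 1000000 -> 0.999
```

With a likelihood ratio of 20 the rare hypothesis stays unlikely; with a ratio in the millions the same low prior is swamped. Holding the posterior near 0.001 in that last case would be the over-correction failure, not caution.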

## Related

- **Bayesian Hypothesis Network** — the analysis this model guards; turns competing explanations into a probabilistic posterior, and this is the tool that keeps each prior anchored in a base rate at the prior-elicitation stage.
- **Bayesian Reasoning** — the partner correction: base-rate neglect is the bias, Bayesian reasoning is the discipline that fixes it — start from the prior, move it by the evidence's discriminating power.
- **Representativeness Heuristic** — the mental shortcut underneath the bias: judging probability by how well a case matches a prototype, the operation that substitutes for weighing the base rate.
- **Regression to the Mean** — a companion inference correction: extreme observations are partly luck and should be expected to move back toward the base rate.

## Sources

- [Kahneman, Daniel & Amos Tversky (1973), "On the Psychology of Prediction," Psychological Review 80(4):237-251](https://doi.org/10.1037/h0034747)
- [Bar-Hillel, Maya (1980), "The base-rate fallacy in probability judgments," Acta Psychologica 44(3):211-233](https://doi.org/10.1016/0001-6918(80)90046-3)
