Multi-Criteria Decision Making (MCDM) Methods

Why it matters

When a choice turns on several things at once and they pull against each other, there is no single number that simply says “best” — and how you combine the scores quietly decides the winner before you have even looked. Choosing the method for combining is itself a decision, and the wrong method gives a confident, official-looking answer to a question you didn’t actually ask.

For example: a family choosing where to relocate scores three cities on jobs, cost of living, schools, and weather. City A wins on jobs and weather but is expensive with mediocre schools; B is cheap with excellent schools but a thin job market; C is middling on everything. Average the scores and C “wins” — not because anyone wants C, but because plain averaging rewards the option with no weaknesses over the options with real strengths. Let a dealbreaker veto instead, or rank by closeness to the ideal, and a different city comes first. Same scores, different method, different decision.

  • What it reveals. Which named method — and which hidden assumption about how criteria combine — actually fits your decision, and the uncomfortable fact that the method, not just the scores, determines the ranking.
  • How it changes the read. You stop asking “what’s the weighted average?” and start asking “can a great score on one criterion buy back a fatal score on another — and does the method I’m using assume that it can?”
  • When to foreground it. Three or more criteria that genuinely trade off; a choice where stakeholders disagree about the weights; any moment a “winner” is about to be declared from one combined score.
  • What you’d miss without it. That a fully compensatory method (a plain weighted sum) will happily rank a dealbreaker option first because its strengths average out its fatal flaw — and that small, defensible changes to the weights can silently flip the winner, a fragile ranking dressed up as a stable one.
  • Where it misleads. The machinery can manufacture false precision — gut-feel weights and shaky scores produce an authoritative-looking ranking no better than its inputs; and forcing a full ranking where two options are genuinely incomparable imposes a decision the evidence doesn’t support.

How to invoke it in Ora

You’re choosing among a handful of discrete options, several things you care about pull in different directions, and no single criterion can settle it — you want to see how the options stack up across all the dimensions, with your weights made explicit.

List the options and what matters, and ask:

“Run a multi-criteria decision analysis on these options — here are the criteria I care about and roughly how much each matters; rank them and show me how stable the ranking is.”

This rides inside the Multi-Criteria Decision analysis. Ora inventories the options, defines each criterion operationally, makes the weights explicit (and traces them to your stated preferences, not a default), scores every option-by-criterion cell, names the aggregation method and why it fits, produces the ranking, runs a sensitivity check to see what weight or score change would flip the top choice, and prunes options that lose on every criterion.

One thing to know: phrases like multi-criteria, MCDA / MCDM, weighted criteria, rank these options across, criteria matrix, AHP, what matters most, or how do they stack up are what route you here. With only one or two criteria the machinery is overkill — a lighter decision read fits better; with three or more that trade off, this is the place.

Say which criteria matter and roughly how much; the analysis is judged on whether the weights came from your preferences. If you genuinely have no weights yet, it will hand you a pairwise-comparison scaffold to fill in rather than inventing weights and passing them off as a verdict.

One thing Ora won’t do: declare a winner from invented weights. If you supply no preferences, it returns the elicitation scaffold and asks for your judgments; and if the top choice flips under a modest, defensible change in the weights, it tells you the ranking is fragile rather than presenting it as settled.

How it works

Suppose you’re choosing a family car, and you do the sensible-looking thing. You pick what matters — price, fuel economy, safety, cargo space — score each car out of ten on each, multiply by how much you care about each, and add it all up. One number per car; highest wins. It feels rigorous. And it can quietly betray you.

Picture a car that’s cheap, sips fuel, and swallows a month of luggage — but scores a 2 out of 10 on crash safety. In a weighted sum, that fatal 2 is just one term among four. Its glittering scores on the other three can outweigh it, and the spreadsheet will cheerfully crown the unsafe car the winner. The method didn’t malfunction. You asked it a precise question — “which car has the highest total?” — and it answered honestly. The trouble is that wasn’t the question you meant. You meant “which good car is also safe enough,” and safety, for you, is not a thing the other strengths are allowed to buy back.
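To make the arithmetic concrete, here is a minimal sketch of that betrayal. The three cars, their scores, the weights, and the safety floor of 5 are all invented for illustration; the floor is a plain pre-screen standing in for "safety is a dealbreaker", not a full outranking method.

```python
# Hypothetical 0-10 scores; every number here is illustrative, not real data.
WEIGHTS = {"price": 0.3, "fuel": 0.2, "safety": 0.3, "cargo": 0.2}
CARS = {
    "Bargain": {"price": 9, "fuel": 9, "safety": 2, "cargo": 9},
    "Steady":  {"price": 6, "fuel": 6, "safety": 8, "cargo": 6},
    "Tank":    {"price": 4, "fuel": 3, "safety": 9, "cargo": 7},
}

def weighted_sum(scores, weights):
    """Fully compensatory total: every criterion is just one more term."""
    return sum(weights[c] * scores[c] for c in weights)

def rank_weighted_sum(options, weights):
    """Option names, best total first."""
    return sorted(options, key=lambda n: weighted_sum(options[n], weights), reverse=True)

def rank_with_floor(options, weights, floors):
    """Drop any option below a minimum score on a dealbreaker criterion, then rank the rest."""
    survivors = {
        name: scores for name, scores in options.items()
        if all(scores[c] >= minimum for c, minimum in floors.items())
    }
    return rank_weighted_sum(survivors, weights)

print(rank_weighted_sum(CARS, WEIGHTS))             # Bargain comes first despite safety = 2
print(rank_with_floor(CARS, WEIGHTS, {"safety": 5}))  # Bargain is screened out before ranking
```

Note that safety already carries the joint-highest weight here, and the unsafe car still wins the plain total. Raising the weight is a different move from declaring a veto.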

This is the secret hiding inside multi-criteria decisions: the method you use to combine the scores carries an assumption, and the assumption decides the winner as much as the scores do. The plain weighted sum is compensatory — it assumes any criterion can be traded against any other, that enough fuel economy can offset a safety failure. Some decisions genuinely work that way. Many do not. So the field built a small catalog of methods, each quietly embodying a different assumption about how preferences are shaped, and the real skill turns out to be choosing the right one rather than filling in the matrix.

A few of the regulars. The Analytic Hierarchy Process doesn’t ask you to pull weights out of the air; it asks you to compare criteria two at a time (“how much more does safety matter than price?”) and then does something clever: it checks your answers for consistency, catching you when you’ve said safety beats price, price beats cargo, and cargo beats safety, a circle that can’t be true. The outranking methods, ELECTRE and PROMETHEE, compare options pair by pair instead of adding everything into one total, which limits how far a strength on one criterion can buy back a weakness on another. ELECTRE goes further and lets a criterion act as an explicit veto: a car that fails crash testing is simply out, no matter how it scores elsewhere, exactly the move the weighted sum couldn’t make. Both can also return “these two options are incomparable” instead of forcing a rank where none exists. TOPSIS takes a geometric view: define the ideal car (best value on every criterion) and the worst, and rank each real option by how close it sits to the ideal and how far from the worst. And the humble weighted sum and its disciplined cousin SMART remain the right tool when the criteria really are independent and really are tradeable, and transparency to everyone in the room is the priority.
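The AHP consistency check can be sketched in a few lines. This is a simplified version under stated assumptions: it approximates the priority vector with row geometric means instead of the principal eigenvector, uses the commonly cited random-index table, and applies the usual rule of thumb that a consistency ratio above roughly 0.10 deserves a second look. The two comparison matrices are invented.

```python
import math

# Commonly cited random consistency index values for matrix sizes 3..9.
RANDOM_INDEX = {3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}

def ahp_weights(matrix):
    """Approximate priority vector: normalised geometric mean of each row."""
    n = len(matrix)
    geo = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo)
    return [g / total for g in geo]

def consistency_ratio(matrix):
    """Consistency index divided by the random index; 0 means perfectly consistent."""
    n = len(matrix)
    if n < 3:
        return 0.0  # one or two criteria cannot form an inconsistent cycle
    w = ahp_weights(matrix)
    # Estimate the principal eigenvalue as the mean of (A w)_i / w_i.
    aw = [sum(matrix[i][j] * w[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(aw[i] / w[i] for i in range(n)) / n
    return ((lambda_max - n) / (n - 1)) / RANDOM_INDEX[n]

# Rows and columns are safety, price, cargo. Entry [i][j] says how many
# times more criterion i matters than criterion j.
coherent = [[1, 2, 4], [1 / 2, 1, 2], [1 / 4, 1 / 2, 1]]
circular = [[1, 3, 1 / 3], [1 / 3, 1, 3], [3, 1 / 3, 1]]  # safety > price > cargo > safety

print(ahp_weights(coherent), consistency_ratio(coherent))
print(ahp_weights(circular), consistency_ratio(circular))
```

The coherent judgments give a ratio of essentially zero. The circular ones give three equal weights and a ratio far above 0.10, which is the check telling you to revisit your answers before trusting the weights.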

None of these is “the correct one.” That is the whole point. You match the method to the shape of the decision: Does any criterion have veto power, or can everything compensate? Are the criteria truly independent, or are two of them secretly measuring the same thing twice? Do you need a complete ranking, or is “these two can’t be honestly compared” the truthful answer? Get that match right and the matrix becomes genuinely clarifying. Get it wrong — reach for the weighted sum by reflex — and you get a crisp, confident number that answers the wrong question, which is a more dangerous thing than no number at all.
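Of the methods above, the distance-from-ideal view is perhaps the easiest to show in code. What follows is a minimal sketch of the standard TOPSIS steps (vector normalisation, weighting, ideal and anti-ideal points, Euclidean distances, closeness score). The car figures and weights are invented, and the function assumes no criterion column is all zeros and that the options are not all identical.

```python
import math

def topsis(matrix, weights, benefit):
    """Closeness to the ideal for each option: 0 is the anti-ideal, 1 is the ideal.

    matrix  : one row per option, one raw value per criterion
    weights : one weight per criterion
    benefit : True where higher is better, False where lower is better
    """
    n_crit = len(weights)
    # 1. Vector-normalise each column so raw units do not dominate, then weight it.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_crit)]
    v = [[weights[j] * row[j] / norms[j] for j in range(n_crit)] for row in matrix]
    # 2. Build the ideal and anti-ideal points, respecting preference direction.
    cols = list(zip(*v))
    ideal = [max(col) if benefit[j] else min(col) for j, col in enumerate(cols)]
    worst = [min(col) if benefit[j] else max(col) for j, col in enumerate(cols)]
    # 3. Score each option by its distance from the worst relative to both distances.
    scores = []
    for row in v:
        d_ideal = math.dist(row, ideal)
        d_worst = math.dist(row, worst)
        scores.append(d_worst / (d_ideal + d_worst))
    return scores

# Hypothetical cars: price (thousands, lower is better), mpg, safety stars, cargo litres.
names = ["Bargain", "Steady", "Tank"]
raw = [[18, 48, 2, 520], [27, 38, 4, 450], [34, 24, 5, 600]]
scores = topsis(raw, weights=[0.3, 0.2, 0.3, 0.2], benefit=[False, True, True, True])
ranking = sorted(zip(names, scores), key=lambda pair: pair[1], reverse=True)
print(ranking)
```

TOPSIS is still compensatory: a low safety score lengthens the distance to the ideal but does not remove the option. If a criterion is a true dealbreaker, this is not the method that will honour it.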

Framework & implementation

This section uses Ora’s own terms for the parts of an analysis, so that if you open the actual mode and lens files they line up. Each is glossed in plain language on first use.

Pipeline execution

The MCDM-methods catalog is the required lens of the Multi-Criteria Decision analysis — it sits in the mode’s lens_dependencies.required, meaning it supplies the analysis’s actual method rather than merely informing it (it is also an always-loaded mental model in the mode’s ANALYTICAL PERSPECTIVES block, beside trade-offs, Arrow’s impossibility theorem, prospect theory, loss aversion, and decision trees). The mode runs at Gear 4, Ora’s most thorough setting — a Depth analyst and a Breadth analyst build the matrix in parallel, critique each other (cross-adversarial evaluation), and revise. The lens’s defining job is method-matching: choosing the aggregation method whose built-in assumption fits the decision, instead of defaulting to one approach for every situation.

Where the lens engages. It activates on its Detection Signals — multiple alternatives that must be compared on multiple, non-commensurable criteria; stakeholders disagreeing on weights; a decision being made by intuition where structure would expose hidden inconsistency; an “obvious” winner that is obvious only because one criterion is silently acting as a veto. Its Application Steps receive the options, criteria, and context; match the method to the preference structure (pairwise-hierarchical → AHP; small set with direct weighting → SMART; veto criteria / non-compensatory → ELECTRE or PROMETHEE; geometric ideal framing → TOPSIS; minimum-overhead transparent → weighted sum); elicit the parameters the chosen method needs; aggregate; run sensitivity analysis; and optionally cross-check with a second method.
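The method-matching step can be pictured as a small lookup. This sketch is purely illustrative and is not Ora's implementation: the PreferenceStructure fields, the function name, and especially the order in which the conditions are checked are assumptions. Only the pairings themselves come from the mapping described above.

```python
from dataclasses import dataclass

@dataclass
class PreferenceStructure:
    """Hypothetical flags describing the shape of a decision."""
    has_veto_criteria: bool = False        # some criterion is a dealbreaker
    pairwise_hierarchical: bool = False    # preferences come as pairwise comparisons
    geometric_ideal_framing: bool = False  # "how close to the ideal option?"
    small_set_direct_weights: bool = False # few options, weights stated directly

def match_method(p: PreferenceStructure) -> str:
    """Return the method family whose built-in assumption fits the flags.

    Checking vetoes first is an assumption of this sketch: a dealbreaker
    rules out the compensatory methods whatever else is true.
    """
    if p.has_veto_criteria:
        return "ELECTRE or PROMETHEE"
    if p.pairwise_hierarchical:
        return "AHP"
    if p.geometric_ideal_framing:
        return "TOPSIS"
    if p.small_set_direct_weights:
        return "SMART"
    return "weighted sum"  # minimum-overhead, transparent default

print(match_method(PreferenceStructure(has_veto_criteria=True)))
print(match_method(PreferenceStructure(small_set_direct_weights=True)))
```

The point of writing it down is that the default branch is reached only after the other questions have been asked, which is the opposite of reaching for the weighted sum by reflex.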

What it contributes to the analysis. Because it is the method lens, it drives the whole output skeleton: the Options inventory, the Criteria definitions (operational, with units and preference direction), the Weights with rationale, the Scoring matrix, the Aggregated ranking (with the named method and why it fits), the Sensitivity analysis (the perturbation that would flip the top choice), and the Dominant and dominated options block (pruning options beaten on every criterion, flagging any that win on every criterion as a no-brainer). It keeps three confidences distinct — scoring, weight, and method-fit uncertainty.
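The dominance block is the simplest part to sketch. The version below uses the standard Pareto definition (at least as good on every criterion and strictly better on at least one), which is slightly broader than "beaten on every criterion"; that choice, the function names, and the example scores are assumptions for illustration. Scores are taken to be higher-is-better on every criterion.

```python
def dominates(a, b):
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def split_dominance(options):
    """Return (dominant, dominated) option names.

    options maps each name to a tuple of scores, higher is better throughout.
    A dominant option dominates every other option; a dominated option is
    dominated by at least one other.
    """
    names = list(options)
    dominated = sorted(
        n for n in names
        if any(dominates(options[m], options[n]) for m in names if m != n)
    )
    dominant = sorted(
        n for n in names
        if len(names) > 1
        and all(dominates(options[n], options[m]) for m in names if m != n)
    )
    return dominant, dominated

# Hypothetical vendors scored on capability, support, and security.
vendors = {"A": (8, 7, 9), "B": (6, 7, 5), "C": (9, 3, 4)}
print(split_dominance(vendors))  # B can be pruned; A and C still trade off
```

Pruning B here needs no weights at all, which is why it is worth doing before any aggregation.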

Cross-adversarial evaluation. At Gear 4 each analyst’s reading is critiqued by the other, which catches the lens’s signature failures, keyed to its Critical Questions and Common Failure Modes and to the mode’s named failure modes: method-by-default (reflexively using a weighted sum regardless of preference structure — the mode’s aggregation-method-opacity); compensatory veto (a fully compensatory method where a criterion is really a dealbreaker); weight elicitation by gut (round-number weights with no discipline — the mode’s weight-imposition); criterion redundancy (two criteria measuring the same underlying attribute — the mode’s criterion-redundancy); sensitivity-analysis omission and rank-reversal blindness (the mode’s false-stability and dominance-blindness); and cross-method disagreement suppression (running two methods and reporting only the flattering one). The evaluator presses the core check: has the method been matched to the decision’s preference structure, or imposed by reflex?

Honesty discipline. The mode treats imposed weights as a near-failure: when the prompt supplies no preference signal, it does not hand back a ranking built on placeholder weights — it emits the AHP pairwise scaffold (a blank comparison matrix plus the criterion definitions) and asks the decision-maker for their judgments. A headline ranking driven by analyst-invented weights is a fail unless the user explicitly declined to weight.
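A pairwise scaffold of this kind might look like the following. The layout, the question wording, and the helper names are hypothetical and are not taken from Ora's files; the 1-to-9 scale is the one conventionally used with AHP.

```python
from itertools import combinations

def pairwise_scaffold(criteria):
    """Return a blank reciprocal comparison matrix and the judgments still needed.

    The diagonal is 1 (each criterion equals itself); every other cell is
    None until the decision-maker supplies a judgment.
    """
    n = len(criteria)
    matrix = [[1.0 if i == j else None for j in range(n)] for i in range(n)]
    questions = [
        f"How much more does {a} matter than {b}? (1 = equally, 9 = extremely; use 1/x if {b} matters more)"
        for a, b in combinations(criteria, 2)
    ]
    return matrix, questions

def record_judgment(matrix, criteria, a, b, value):
    """Store 'a is value times as important as b' and its reciprocal."""
    i, j = criteria.index(a), criteria.index(b)
    matrix[i][j] = float(value)
    matrix[j][i] = 1.0 / float(value)

criteria = ["price", "fuel economy", "safety", "cargo space"]
matrix, questions = pairwise_scaffold(criteria)
for q in questions:
    print(q)
record_judgment(matrix, criteria, "safety", "price", 3)
```

Four criteria produce six questions. Nothing is ranked until those cells are filled, so the weights come from the person making the decision.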

What the analysis will not do. It will not present a method-fragile top choice as stable; will not let a compensatory method silently override a veto criterion; will not manufacture a clear winner where the criteria genuinely conflict (it surfaces the tradeoff instead); and will not force a complete ranking where two options are honestly incomparable.

Origin and evidence

The catalog gathers the canonical methods of multi-criteria decision analysis, each with a traceable origin. Thomas Saaty’s The Analytic Hierarchy Process (1980; compact restatement in Int. J. Services Sciences, 2008) introduced AHP and its pairwise-comparison consistency check. The additive-utility family rests on Ralph Keeney and Howard Raiffa’s Decisions with Multiple Objectives (1976), the foundational multi-attribute utility theory, with Ward Edwards’s “How to Use Multiattribute Utility Measurement for Social Decisionmaking” (1977) popularizing the simple rating technique (SMART). The outranking tradition began with Bernard Roy’s ELECTRE (“Classement et choix en présence de points de vue multiples,” 1968) and continued with Jean-Pierre Brans and Philippe Vincke’s PROMETHEE (Management Science, 1985), both built to limit compensation between criteria and to allow honest incomparability, with ELECTRE adding explicit veto thresholds. Ching-Lai Hwang and Kwangsun Yoon’s Multiple Attribute Decision Making (1981) introduced TOPSIS and its distance-from-ideal geometry. Valerie Belton and Theodor Stewart’s Multiple Criteria Decision Analysis (2002) is the standard survey tying the methods together. The throughline of the evidence is methodological, not empirical: each method is a coherent procedure with known strengths and known pathologies (AHP’s rank-reversal, the weighted sum’s silent compensation), and the discipline’s accumulated lesson is that matching method to decision structure matters more than any single method’s sophistication.

Applications and common uses

MCDM is a working tool wherever discrete options must be compared across criteria that won’t reduce to one number.

  • Procurement and vendor selection. The classic use: scoring bids on cost, capability, risk, and support, with the method chosen to reflect whether any criterion (security, compliance) is a true veto.
  • Hiring and admissions. Ranking candidates across skills, experience, and fit — and catching where averaging would bury a dealbreaker or where two criteria double-count.
  • Engineering and design tradeoffs. Choosing an architecture across performance, cost, maintainability, and risk, with sensitivity analysis showing which preference would have to change to switch the choice.
  • Policy and siting decisions. Comparing options across economic, environmental, and social criteria where stakeholders bring different weights and non-compensatory vetoes are common.
  • Personal high-stakes choices. Jobs, homes, relocations — making the weights explicit so the decision reflects considered preferences rather than whichever option felt best last.

In every case the payoff is the same: not just a ranked list, but a ranking whose method fits the decision, whose weights trace to real preferences, and whose stability (or fragility) under perturbation is stated plainly.

Failure modes and when not to use it

The lens’s characteristic ways of going wrong are catalogued in its Common Failure Modes:

  • Method-by-default. Reaching for the weighted sum or SMART by reflex regardless of preference structure. The tell: no one asked whether veto criteria, criterion interaction, or incomparability matter. Run the method-matching step explicitly.
  • Compensatory veto. Using a fully compensatory method when stakeholders treat a criterion as a dealbreaker. The tell: the top-ranked option is rejected on a single criterion’s score. Switch to an outranking method with explicit veto thresholds.
  • Weight elicitation by gut. Assigning weights with no discipline. The tell: round numbers (10%, 20%, 30%, 40%) summing tidily to 100%. Use swing-weighting or pairwise comparison and document the procedure.
  • Rank-reversal blindness. Using AHP (or other reversal-prone methods) without checking whether adding or removing an option reshuffles the rest. The tell: the ranking changes when an also-ran is dropped. Test rank stability under inclusion/exclusion.
  • Sensitivity-analysis omission. Reporting a ranking as definitive without perturbing it. The tell: no robustness check. Perturb weights and key scores; report what flips the top choice.
  • Cross-method disagreement suppression. Running two methods, getting different rankings, and reporting only one. The tell: the second method’s output is missing. The disagreement is information: report both rankings and explain where they diverge.
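The sensitivity check named in the list above can be sketched as a one-at-a-time weight sweep. This is a minimal illustration, not a full robustness analysis: the cars, scores, and weights are invented, the step size of 0.01 is arbitrary, and it assumes the weights sum to 1 with no single weight equal to 1.

```python
# Hypothetical 0-10 scores and weights; all numbers are illustrative.
WEIGHTS = {"price": 0.3, "fuel": 0.2, "safety": 0.3, "cargo": 0.2}
CARS = {
    "Bargain": {"price": 9, "fuel": 9, "safety": 2, "cargo": 9},
    "Steady":  {"price": 6, "fuel": 6, "safety": 8, "cargo": 6},
    "Tank":    {"price": 4, "fuel": 3, "safety": 9, "cargo": 7},
}

def top_choice(options, weights):
    """Name of the option with the highest weighted sum."""
    return max(options, key=lambda n: sum(weights[c] * options[n][c] for c in weights))

def shifted(weights, crit, new_w):
    """Set one weight to new_w and rescale the others so the total stays 1."""
    scale = (1.0 - new_w) / (1.0 - weights[crit])
    return {c: (new_w if c == crit else w * scale) for c, w in weights.items()}

def flip_points(options, weights, step=0.01, max_steps=50):
    """For each criterion, the smallest weight shift that changes the winner.

    Returns (baseline_winner, {criterion: (shift, new_winner) or None}).
    """
    baseline = top_choice(options, weights)
    flips = {}
    for crit in weights:
        flips[crit] = None
        for k in range(1, max_steps + 1):
            for sign in (1, -1):
                new_w = weights[crit] + sign * k * step
                if 0.0 <= new_w < 1.0:
                    winner = top_choice(options, shifted(weights, crit, new_w))
                    if winner != baseline:
                        flips[crit] = (sign * k * step, winner)
                        break
            if flips[crit] is not None:
                break
    return baseline, flips

baseline, flips = flip_points(CARS, WEIGHTS)
print(baseline, flips)
```

With these made-up numbers, raising the safety weight from 0.30 to 0.34 is enough to hand first place to a different car. A ranking that turns over on a four-point change in one weight is the fragile kind, and the report should say so.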

When not to reach for it. With only one or two criteria, the overhead exceeds the value — a lighter single-criterion read or a constraint map fits better. When one option dominates (beats the others on every criterion), the matrix is ceremony; name the dominance and stop. When the real difficulty is probability and timing under a single dominant criterion, that is decision-under-uncertainty’s job, not MCDM’s. And when the inputs are too soft to score honestly, MCDM’s precision is false precision — better to admit the judgment is qualitative than to dress it in a matrix.

Related

  • Multi-Criteria Decision — the analysis this lens is the method for; compares options across weighted criteria with explicit method choice, sensitivity analysis, and dominance pruning.
  • Trade-offs — the always-loaded mental model underneath the matrix: every criterion gained is another given up, the no-free-lunch discipline that MCDM operationalizes across many criteria at once.
  • Decision Trees — the sibling for a different decision shape: when the choice unfolds over time under uncertainty rather than across criteria at one moment.
  • Arrow’s Impossibility Theorem — the cautionary companion: aggregating several criteria (or several people’s preferences) into one ranking runs into deep limits no method fully escapes.