CIA Tradecraft Red Team

Why it matters

An artifact reviewed only by people who share its assumptions has not really been reviewed. The way to find what’s wrong with a plan before reality does is to assign someone to attack it — sanctioned, in good faith, and bound by a discipline that separates real red-teaming from mere cynicism: attack what the thing actually says, model the actual adversary, and report honestly when an attack finds nothing.

For example: a company is days from launching a product everyone internally loves. The deck is polished, the room nods along, and no one has been tasked to take it apart — so the first person who does is a journalist, a competitor, or a regulator, who finds in ten minutes the obvious objection the team had grown blind to. A red team is that hostile reader hired early and on purpose: a person whose explicit job is to build the strongest case against the artifact while you can still do something about it.

  • What it reveals. The strongest attacks a committed adversary would actually mount against a specific artifact — grounded in what it really says, modeled on who would really oppose it, and honest about where it is genuinely robust.
  • How it changes the read. You stop asking “is this good?” (which a room of insiders always answers yes) and start asking “if someone were paid to destroy this, exactly how would they do it — and would it work?”
  • When to foreground it. A high-stakes artifact is about to be committed without ever having been adversarially tested; a convergent agreement has gone unchallenged; you are preparing for hostile review or debate; the cost of being wrong exceeds the discomfort of structured disagreement.
  • What you’d miss without it. That an internal review shares the artifact’s blind spots by construction — and that the attacks which will actually land come from an adversary who does not share your frame, which generic “what could go wrong” brainstorming never surfaces.
  • Where it misleads. Performed hostility is as dishonest as performed praise — an attack on a distorted version (a straw target), or on a claim the artifact never made (fabrication), collapses the instant a prepared audience checks; and red-teaming by people who quietly share the artifact’s assumptions (mirror-imaging) just launders groupthink.

How to invoke it in Ora

You have a specific artifact — a plan, proposal, strategy, paper — and you need the strongest case against it: ammunition for a debate, preparation for a hostile review, or a brief to talk someone out of a course of action.

Name the artifact and the audience, and ask:

“Red-team this proposal as an advocate — build the strongest case against it for [the board / a skeptical regulator / a debate opponent], ranked by what would actually land.”

This rides inside the Red Team (Advocate) analysis, with the CIA-tradecraft discipline as its required method. Ora models the named audience (their frame, priorities, what persuades them), restates the artifact with quotes so attacks anchor in what it really says, builds attacks grounded in that content, ranks them by persuasive force with that audience (devastating / strong / plausible), supplies phrasing in the audience’s idiom, and names the concessions you must preempt. Its sibling, Red Team (Assessment), runs the same tradecraft to surface vulnerabilities for your own fix-list instead.

One thing to know: phrases like “argue against this,” “make the case against,” “give me ammunition,” “prep me for hostile review,” or “I need to dissuade” are what route you here. The analysis needs a specific, named artifact, because a vague domain can’t be red-teamed; if the target or audience is too thin, it returns a redirect rather than attacking thin air.

Name who the brief is argued in front of — the attacks are ranked and phrased for that audience, so a generic “critic” produces generic, weaker ammunition.
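
The routing and sufficiency rules above can be pictured with a small sketch. It is illustrative only: the function name route_request, the TRIGGER_PHRASES tuple, the substring matching, and the three return labels are assumptions made for this example, not Ora’s actual router, which this page does not specify.

```python
from typing import Optional

# Trigger phrases taken from the text above. Matching them by plain
# substring search is a hypothetical simplification.
TRIGGER_PHRASES = (
    "argue against this",
    "make the case against",
    "give me ammunition",
    "prep me for hostile review",
    "i need to dissuade",
)


def route_request(prompt: str, artifact: Optional[str], audience: Optional[str]) -> str:
    """Return a hypothetical routing label for a request."""
    text = prompt.lower()
    if not any(phrase in text for phrase in TRIGGER_PHRASES):
        return "other-analysis"
    # A vague domain cannot be red-teamed: without a named artifact and a
    # named audience, redirect instead of attacking thin air.
    has_artifact = bool(artifact and artifact.strip())
    has_audience = bool(audience and audience.strip())
    if not (has_artifact and has_audience):
        return "redirect"
    return "red-team-advocate"
```

In this sketch a request such as “Give me ammunition” with no artifact attached comes back as a redirect, which mirrors the behavior described above: the analysis asks for a target before it attacks anything.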

One thing Ora won’t do: perform hostility. Every attack must rest on what the artifact actually says (no straw-target, no fabrication) and pass the test “would a committed opponent actually use this?”; weak attacks don’t get inflated to “devastating,” and the artifact’s strongest defense is named as a concession rather than hidden.
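
A minimal sketch of the grounding and calibration rules, assuming an attack can be represented as a claim, the quoted passage it rests on, and a force label. The names Attack, grounded, and build_brief are hypothetical, and a verbatim substring match is a crude stand-in for the judgment a real reviewer applies when checking that an attack addresses what the artifact says.

```python
from dataclasses import dataclass

# The three force labels come from the text; the numeric ordering used
# for sorting is an assumption of this sketch.
FORCE_ORDER = {"devastating": 0, "strong": 1, "plausible": 2}


@dataclass
class Attack:
    claim: str  # the objection, phrased for the audience
    quote: str  # the passage of the artifact the attack rests on
    force: str  # "devastating", "strong", or "plausible"

    def __post_init__(self) -> None:
        if self.force not in FORCE_ORDER:
            raise ValueError(f"unknown force label: {self.force!r}")


def grounded(attack: Attack, artifact_text: str) -> bool:
    """Treat an attack as grounded only if its quote appears verbatim."""
    return bool(attack.quote) and attack.quote in artifact_text


def build_brief(attacks: list[Attack], artifact_text: str) -> list[Attack]:
    """Drop ungrounded attacks, then rank the rest by persuasive force."""
    kept = [a for a in attacks if grounded(a, artifact_text)]
    return sorted(kept, key=lambda a: FORCE_ORDER[a.force])
```

The point of the design is the order of operations: an attack that quotes something the artifact never said is removed before ranking, so it can never be rated at all, let alone rated “devastating.”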

How it works

In October 1973, the surprise that opened the Yom Kippur War was not really a failure of information — Israeli intelligence had the warning signs. It was a failure of interpretation: a dominant theory (the “Concept” — that Egypt would not attack without certain capabilities it lacked) was so widely shared that contradicting signals were explained away. Everyone agreed, so no one looked again. In the aftermath, Israeli intelligence drew a hard institutional lesson and built a remedy with a name: Ipcha Mistabra — Aramaic for “the opposite seems likely.” A designated unit, sometimes called the devil’s-advocate office or “the tenth man,” was given a standing duty: when the consensus converges, someone is obligated to write the dissent and argue the case nobody wants to hear. The dissent isn’t optional, and it isn’t personal — it’s the job.

That is the heart of red teaming, and the United States built the same muscle through its own surprises. The CIA’s “Team B” exercise in 1976 pitted an outside team against the in-house estimate; after 9/11 the Agency stood up a “Red Cell” specifically to think like the adversary; and the 2009 Tradecraft Primer codified a toolkit of structured techniques — Team A / Team B, Devil’s Advocacy, Key Assumptions Check, “What If” analysis — as institutional defenses against the predictable ways analysis fails. The intuition is ancient (the Catholic Church’s advocatus diaboli argued against candidates for sainthood for centuries), but the modern discipline adds something the old role lacked: rules for doing it honestly.

Because here is the trap. It is easy to perform opposition: to play the hostile critic, score rhetorical points, and feel rigorous without being rigorous. Performed hostility, the tradecraft insists, is just the mirror image of flattery: equally dishonest, equally useless. So real red-teaming runs on a short list of disciplines. No straw targets: attack the artifact as written, not a weakened cartoon of it, because a prepared audience can read the real thing and a straw-target attack collapses on first contact. No fabrication: don’t attack claims the artifact never made or powers it doesn’t have. No mirror-imaging: among the deadliest errors in intelligence analysis is assuming the adversary thinks like you; a red team that quietly shares the artifact’s frame just launders the groupthink it was meant to break, so you model the actual opponent and their priorities, not yours. And, perhaps surprisingly, honest attack-failure: when you mount an attack and it finds nothing, you say so. A documented non-finding is valuable intelligence, because it tells the artifact’s owner where it is genuinely strong and tells the briefer where the audience cannot be moved.

The last move is the one that separates the discipline from cynicism. A red team’s authority comes from being sanctioned — explicitly tasked, so its attacks are read as institutional rigor rather than personal animus — and from being calibrated: it ranks its attacks by how hard they actually land, refuses to inflate a weak objection into a devastating one, and names the artifact’s strongest defense out loud as a concession to preempt. Done this way, red teaming is not the art of finding fault. It is the art of finding the true faults — the ones an adversary would actually exploit — early enough that you can still fix them, and honestly enough that the finding can be trusted.

Framework & implementation

This section uses Ora’s own terms for the parts of an analysis, so that if you open the actual mode and lens files they line up. Each is glossed in plain language on first use.

Pipeline execution

CIA-tradecraft red-teaming is the required lens of the Red Team (Advocate) analysis: it sits in the mode’s lens_dependencies.required and supplies the analysis’s foundational adversarial discipline. In its own file the lens is marked foundational: true, and its applicability field lists both red-team modes, assessment and advocate, which share this tradecraft core. It is a type: engram lens, so instead of the usual Detection-Signals/Application-Steps structure it carries a Core Principle, a When to Apply list, and a How to Apply protocol. The mode runs at Gear 4, Ora’s most thorough setting: a Depth analyst and a Breadth analyst build the brief in parallel, critique each other (cross-adversarial evaluation), and revise. The cross-adversarial step is itself red-teaming wired into the pipeline.
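
To make the relationship between the two files concrete, here is a hypothetical in-memory view of them. The key names type, foundational, applicability, and lens_dependencies.required come from the description above; the identifier strings, the gear and sections keys, and the dictionary layout are assumptions, since the real file format is not shown here.

```python
# Hypothetical view of the lens file. Only the key names "type",
# "foundational", and "applicability" are taken from the text.
LENS = {
    "id": "cia-tradecraft-red-team",  # assumed identifier
    "type": "engram",
    "foundational": True,
    "applicability": ["red-team-assessment", "red-team-advocate"],  # assumed ids
    "sections": ["Core Principle", "When to Apply", "How to Apply"],
}

# Hypothetical view of the mode file. "lens_dependencies" / "required"
# follows the text; "gear" is an assumed key for the Gear 4 setting.
MODE = {
    "id": "red-team-advocate",  # assumed identifier
    "gear": 4,
    "lens_dependencies": {"required": ["cia-tradecraft-red-team"]},
}


def lens_is_required(mode: dict, lens: dict) -> bool:
    """True when the mode requires the lens and the lens lists the mode."""
    required = mode["lens_dependencies"]["required"]
    return lens["id"] in required and mode["id"] in lens["applicability"]
```

The check runs in both directions on purpose: the mode names the lens as required, and the lens names the mode as one it applies to.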

Where the lens engages. It activates on the conditions in its When to Apply list — a high-stakes artifact about to be committed without adversarial testing; convergent agreement no one has challenged; a brief needed against an artifact for an external audience; groupthink or mirror-imaging threatening the product. Its How to Apply protocol runs the tradecraft: define the specific artifact (vague domains can’t be red-teamed); sanction the role; distinguish assessment (vulnerabilities for the owner’s own fix) from advocate (a brief for an external audience); hold the no-fabrication discipline; apply the sycophantic-inverse self-check; avoid mirror-imaging by modeling the named adversary; run a Key Assumptions Check; and disclose honest attack-failures.

What it contributes to the analysis. In the advocate mode it disciplines every output section: the Audience model (model the real audience’s frame, not a generic critic — the mode’s CQ1 / audience-misalignment), the Attacks ranked by persuasive force (honest devastating/strong/plausible calibration — CQ2 / cynical-overreach), the requirement that every attack be grounded in the artifact’s actual content (no straw target, no fabrication — CQ3 / straw-target-trap and no-fabrication-violation), the framework-vs-artifact discipline (CQ4 / framework-attack-trap), and the Concessions section (name the strongest defense rather than hide it — CQ5). The mode’s Input Sufficiency Protocol enforces the lens’s “define a specific artifact” rule, redirecting rather than attacking thin material.

Cross-adversarial evaluation. At Gear 4 each analyst’s brief is critiqued by the other, which catches the tradecraft’s signature failures: the straw-target trap (attacking a weakened version that collapses when the audience reads the real artifact); the sycophantic-inverse trap (performing hostility instead of analyzing — failing the “would a committed opponent actually use this?” check); mirror-imaging (a brief built for a generic critic rather than the modeled adversary); and cynical overreach (inflating weak attacks, omitting concessions). The evaluator presses the core check: is this the attack a committed, prepared opponent would actually mount against the artifact as written — and is its force rated honestly?
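
The build, critique, revise loop can be sketched as follows. This is an assumption-laden illustration, not Ora’s pipeline code: the Analyst class and cross_adversarial function are hypothetical names, the analysts are modeled as plain callables, and the two drafts are built one after the other rather than truly in parallel.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Analyst:
    name: str
    build: Callable[[], str]                 # produce a draft brief
    critique: Callable[[str], list[str]]     # list problems in a peer's draft
    revise: Callable[[str, list[str]], str]  # apply a peer's critique


def cross_adversarial(depth: Analyst, breadth: Analyst) -> dict[str, str]:
    """Build two drafts, swap critiques, and return the revised drafts."""
    drafts = {depth.name: depth.build(), breadth.name: breadth.build()}
    # Each draft is critiqued by the other analyst, never by its author.
    notes_for_depth = breadth.critique(drafts[depth.name])
    notes_for_breadth = depth.critique(drafts[breadth.name])
    return {
        depth.name: depth.revise(drafts[depth.name], notes_for_depth),
        breadth.name: breadth.revise(drafts[breadth.name], notes_for_breadth),
    }
```

The swap is the whole mechanism: an author reviewing their own brief shares its blind spots, which is the same problem the lens exists to solve for the artifact.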

Honesty discipline. The lens’s defining move is Attack-Failure Disclosure: when an attack class is tried and yields nothing, the non-finding is documented, not buried. It tells the owner where the artifact is genuinely robust and the briefer where the audience can’t be persuaded — the inverse of manufacturing objections to look productive.
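
One way to picture Attack-Failure Disclosure is a log that records every attack class tried and reports the failures alongside the hits. The AttackLog class and its field names are hypothetical, chosen for this sketch only.

```python
from dataclasses import dataclass, field


@dataclass
class AttackLog:
    findings: list[str] = field(default_factory=list)
    non_findings: list[str] = field(default_factory=list)

    def record(self, attack_class: str, landed: bool) -> None:
        """Record every attack class tried, whether or not it landed."""
        target = self.findings if landed else self.non_findings
        target.append(attack_class)

    def report(self) -> dict[str, list[str]]:
        """Return hits and documented non-findings side by side."""
        return {
            "landed": list(self.findings),
            "tried_and_failed": list(self.non_findings),
        }
```

Because record has no path that discards an entry, a failed attack cannot silently vanish from the report, which is the property the discipline asks for.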

What the analysis will not do. It will not attack a straw version or a fabricated claim; will not inflate persuasive force; will not omit the artifact’s strongest defense; will not red-team a vague domain (it demands a named artifact); and it keeps the advocate contract distinct from the assessment contract rather than blending severity-for-fix with persuasive-force-for-audience.

Origin and evidence

The discipline is older than its name. The Catholic Church’s advocatus diaboli institutionalized sanctioned opposition for canonization reviews; the Talmudic tradition of arguing the opposite case (Ipcha Mistabra, “the opposite seems likely”) survives in Israeli intelligence doctrine as the duty of the dissenting analyst, formalized after the 1973 intelligence surprise. The modern codification is the CIA’s A Tradecraft Primer: Structured Analytic Techniques for Improving Intelligence Analysis (2009), which catalogs Team A / Team B analysis, Devil’s Advocacy, Red Team Analysis, the Key Assumptions Check, and “What If” analysis as institutional defenses against analytic failure (its companion in the same tradition is Richards Heuer’s Psychology of Intelligence Analysis, 1999). The cross-institutional scholarship is Micah Zenko’s Red Team: How to Succeed by Thinking Like the Enemy (2015), which studies red-teaming across military, intelligence, and corporate settings and catalogs its failure modes (red-team capture, sanitization pressure, irrelevance), and Bryce Hoffman’s Red Teaming (2017), which adapts the practice for business strategy. The throughline of the evidence is institutional: organizations that build sanctioned, disciplined dissent catch their own errors earlier; organizations that let consensus go unchallenged are surprised by adversaries who were not so polite.

Applications and common uses

Red-team tradecraft is a working tool wherever a high-stakes artifact must survive a determined opponent.

  • Intelligence and security. The native ground: stress-testing estimates and plans against an adversary modeled on their actual goals, not a mirror of one’s own.
  • Strategy and major decisions. Attacking a strategy before committing — the sanctioned dissent that breaks the convergent agreement of a leadership team.
  • Debate and hostile-review preparation. Building the strongest case an opponent will make, in their idiom, so you meet it ready rather than ambushed (the advocate use, for which this lens is the required method).
  • Product, security, and safety review. Thinking like the attacker, the abuser, or the failure — finding the exploit before a real adversary does (the assessment sibling’s home).
  • Guarding against groupthink. Any setting where a room of people who share assumptions needs a sanctioned outsider’s-eye view to find what the shared frame hides.

In every case the payoff is the same: the artifact meets its strongest honest opposition early, from someone tasked to find the attacks that would actually land — and learns where it is genuinely robust from the attacks that honestly failed.

Failure modes and when not to use it

The tradecraft’s characteristic ways of going wrong (the named failure modes the host mode guards against):

  • Straw-target attack. Attacking a distorted, weakened version of the artifact. The tell: the attack doesn’t apply to the artifact as written, and collapses the moment the audience reads the real thing. Anchor every attack in quoted content.
  • Fabrication. Resting an attack on a claim the artifact never made or a capability it doesn’t have. The tell: the attacked claim isn’t actually in the artifact. Verify the target before attacking it.
  • Mirror-imaging. Modeling the adversary or audience as sharing the artifact’s frame and priorities. The tell: a brief built for a generic critic that persuades nobody specific. Model the named opponent’s actual frame.
  • Sycophantic-inverse (performed hostility). Playing the hostile critic without analytic content. The tell: attacks that fail the “would a committed opponent actually use this?” test. Drop them.
  • Cynical overreach. Inflating weak attacks to “devastating,” or omitting the artifact’s strongest defense to look one-sided. The tell: a brief that will crumble in front of a prepared audience and take the user’s credibility with it. Calibrate force honestly; name concessions.

When not to reach for it. When there is no specific artifact — only a vague domain or area — there is nothing to red-team; the discipline depends on artifact-specific grounding. When the real question is structural fragility regardless of any attacker (“how could this fail under any pressure?”), a fragility audit fits better than modeling a hostile actor. When what’s needed is the strongest case for an idea, that is steelmanning, the opposite stance. And red-teaming finds the attacks; it does not, by itself, decide the artifact’s fate — weighing the surfaced attacks against the artifact’s merits is the owner’s call, for which this supplies the adversarial half.

Related lenses and analyses

  • Red Team (Advocate) — the analysis this lens is the required method for; builds the strongest case against a named artifact for a named external audience, ranked by persuasive force.
  • Devil’s Advocacy — the role-based ancestor: where this lens supplies intelligence-grade attack discipline, devil’s advocacy supplies the sanctioned-dissent structure that protects the critic from personal animus.
  • Competing Hypotheses — the sibling from the same CIA tradecraft tradition: Heuer’s ACH disciplines which explanation the evidence supports, as red-teaming disciplines how an artifact is attacked.
  • Steelman Construction — the direct opposite stance: the strongest case for the artifact, the constructive complement to red-teaming’s strongest case against.