This paper argues that AI is better understood as Assisted Human Intelligence than as artificial intelligence on a trajectory toward autonomy. The argument proceeds through several phases: the reframe and what it dissolves; the conventional framing of AI as oracle and why it fails users; the inversion that locates wisdom in the user rather than the system; the philosophical question of whether systems can be intelligent in ways their parts cannot; the empirical observation that every meaningful AI output requires a human recognizer; the mathematical demonstration that iterated synthesis without verification produces accumulated noise rather than accumulated capability; and the structural demonstration that closed-loop multi-agent architectures produce indeterminate drift rather than convergence on truth. The convergence of current AGI projects on harness architectures is the implicit concession that the AHI argument has been making all along; the disagreement that remains is about where the recognizer sits. Locating the recognizer correctly — in the user, outside the loop — is what the architectural choices behind Ora and the public-domain release strategy of the Foundation that stewards it are operationally for.

Ora is a public-domain AI orchestration framework built around an adversarial pipeline and a library of explicit cognitive frameworks. The Ora Knowledge Foundation is the 501(c)(3) nonprofit that stewards the framework, releases it into the public domain as a defense against enclosure, and operates a free knowledge library, a free framework library, educational work for displaced and neurodivergent populations, and a public-interest advisory analysis function. This paper is the foundational philosophical document for both. It can be read as a standalone argument about how AI should be understood, but its full operational implications appear in the architectural and institutional choices the Foundation has made on the strength of the argument that follows.


The reframe

The central reframe of this paper, and of the Ora project that this paper grounds, is AI → AHI. Not a marketing rebrand. An ontological correction.

“Artificial” does three things, all harmful to clear thinking. It positions intelligence as external to the human — something that exists independently and that humans access from outside themselves. It creates conditions for the high-priest dynamic, where the companies controlling the systems become gatekeepers of a new kind of oracle. It invites AGI mythology — the belief that intelligence is on a trajectory toward some superhuman autonomous endpoint that exists apart from human cognition.

Replace “artificial” with “assisted” and the entire narrative collapses into what the technology actually is: a sophisticated information-processing system that becomes transformatively useful when a skilled human directs it well.

“Assisted” carries its own connotations, and they are worth examining rather than glossing. The word can sound like it undersells the technology — as if assistance were a smaller thing than artificial intelligence claims to be. The choice is deliberate. The technology really is subordinate to human direction in the sense that matters; the subordination is the feature, not a limitation. A tool that the user directs is more valuable than a tool that purports to direct itself, because the directed tool serves the user’s purposes while the self-directing tool serves whatever purposes its developers built into it. The capability under “assisted” is not smaller; it is correctly located. The intelligence belongs to the user; the assistance is what the system provides.

AHI recenters everything correctly. Intelligence is human; AI is the assistance. The human brings goals, values, judgment about what matters, recognition of what is true, ethical compass, creative vision. The system amplifies, accelerates, organizes, executes, challenges. Intelligence operating in the loop is always human intelligence — expanded in its reach and capacity, never replaced.

AHI is empirically more accurate, not just philosophically more accurate. Every meaningful output from any AI system requires a human who knows what good looks like in order to recognize it. Without the human who can distinguish a profound insight from a plausible-sounding error, the system produces nothing of value. The intelligence is in the loop, not in the machine. The empirical foundation section develops this claim with specific cases; the compounding error section shows why the empirical observation reflects a mathematical constraint rather than a contingent limitation; the philosophical sections establish what intelligence has to be for these observations to hold together. The whole paper is one argument with several phases.

The conventional framing — what AI feels like under the AGI narrative

The conventional framing of AI is the extrospective fantasy: intelligence as something that exists out there, accessible by visiting a website or opening an app, that the user receives output from rather than thinks alongside. The user types a question. The system returns an answer. The encounter is transactional. The user got the answer. The user did not get any better at the kind of thinking that produced the question.

Every commercial AI interface is optimized for this transaction. The chat-thread interface is the canonical form: a single text input that produces a single text output, repeated, with the prior turns held in a context window the user does not directly see. The interface is built to get the user to the next answer as quickly as possible. The interface is not built to get the user to a better understanding of the problem they brought.

This is not a design oversight. The transactional pattern is the architecture of dependency. The user who is no better at thinking after a thousand transactions is the user who comes back for the thousand-and-first. Subscription revenue depends on continued use; continued use depends on the user not having internalized the cognitive operations the system is performing on their behalf. A cognitive prosthetic is a more durable revenue source than a cognitive training regimen.

The dependency-driven business model is not a moral indictment of the labs. It is a description of the incentive structure. A commercial AI company that helps its users get better at thinking has, all else equal, lower retention than one whose users keep coming back because they cannot do the same work themselves. The successful commercial AI products will be the ones whose users are most dependent on them. This is what market success will select for, in the absence of a structural alternative.

The labs are not interchangeable. The critique applies more strongly to some than to others. OpenAI’s institutional positioning is more aggressively oracle-shaped than Anthropic’s; Anthropic has published more substantial work on the limits of current systems and the necessity of human oversight; Google’s commitments shift with product launches and strategic moments. The argument here is not that every lab is acting in bad faith but that the commercial incentive structure pushes all of them toward similar dynamics in the absence of structural counterweights. The labs that resist the dynamics most explicitly are also the labs whose resistance is most legible as a deliberate counter-current, which is itself evidence of the underlying gravity. A foundation-released public-domain alternative is one of the structural counterweights that lets the labs that want to resist actually do so without losing competitive position.

A related distinction worth making: there are two different AGI claims circulating, and the critique below applies to each in different ways. Marketing-AGI is rhetoric used to position products and raise capital, where the speakers may or may not privately believe the claims they are making. Sincere-AGI is a research target pursued by people who genuinely believe current trajectories lead to autonomous superintelligence. The marketing version is more cynical and harder to defend on its own terms; the sincere version is more honest but operates on premises — chief among them, that generation capability scales into verification capability — that do not survive examination. Both versions produce similar institutional effects, but engaging them productively requires recognizing which version one is engaging.

The high-priest dynamic completes the picture. The companies that own the most capable models are positioned, in their own marketing, as gatekeepers of a new kind of oracle. Not “a tool you use” — “access to a superior intelligence you consult.” The user does not direct the intelligence; the user receives it. The company’s role is to maintain the oracle’s capability, charge access to it, and gradually expand the oracle’s reach into more of the user’s cognitive life. Where the user once made decisions, the oracle now makes the decisions. Where the user once thought through a problem, the oracle now produces the answer. The high priest is the company’s customer-success team, mediating between the user and the system that the user has been trained to consult.

This framing can be stated in terms from the philosophical literature: it is the extended mind read against the user’s interests. The extended mind thesis (Clark and Chalmers, 1998) holds that cognitive processes can extend into the environment when the environment functions as a constitutive part of the cognitive process. Otto’s notebook is part of his memory because he uses it the way another person uses biological memory. The thesis was originally a liberatory one: it expanded the boundary of the cognitive unit beyond the skin. But the extension can run in two directions. It can extend the user’s cognition into a reliable substrate the user controls. It can also extend the user’s cognition into a substrate the user does not control, where the user’s thinking is now performed by the substrate on terms the substrate’s owner sets. The first kind of extension is what a personal vault is. The second kind is what cloud AI is.

AHI is the response to the second kind. The reframe relocates intelligence to the human, makes the system the tool, and structurally prevents the substrate from owning what passes through it.

What the AGI narrative requires you to believe

The AGI framing requires belief in propositions that dissolve under examination.

It requires that intelligence is a property that can exist independently of a conscious observer. It requires that intelligence can be created artificially. It requires that intelligence exists on a measurable scale, with human intelligence somewhere in the middle and superhuman intelligence as the destination. It requires that the companies building toward that destination are doing something categorically different from building useful tools.

The AHI frame dissolves all of these. Intelligence cannot be meaningfully separated from the conscious observer because the observer determines whether any output is intelligent or merely plausible. The scale metaphor — narrow → general → superhuman — requires intelligence to be a single measurable quantity existing independently of context, but intelligence is not like that even in humans. Human intelligence is contextual, embodied, motivated, value-laden — a capacity that manifests differently in different contexts for different purposes. The capacities clustered together as “intelligence” do not have a unitary measurement, even within a single skull.

The high-priest positioning of OpenAI and similar companies requires the oracle framing. Not “a tool you use” — in their narrative, “access to a superior intelligence you consult.” Eventually, a god-like general intelligence that will solve humanity’s problems on humanity’s behalf. AHI destroys that positioning by relocating intelligence back inside the human and reframing the system as a tool — an extraordinarily powerful tool, but a tool nonetheless. Tools do not have high priests.

There is a tighter way to put this. The AGI narrative is a category error about where intelligence lives. It treats intelligence as a substance that some systems have and others lack, with the engineering goal being to add the substance to a system that did not previously have it. But intelligence is not a substance; it is a relational predicate. Calling something intelligent is a description of how a conscious observer can usefully interact with it, not an inventory of properties the thing has on its own. A system can be very useful — very capable, very wide-ranging in its applications — and still not be doing the thing the AGI narrative claims it is doing, because that thing is not a coherent target.

If intelligence is a relational predicate rather than an intrinsic property, the question becomes where the relation actually lives. The AGI narrative places it in the system; the user becomes the secondary party whose role is to consume what the system produces. The AHI reframe inverts this. The relation lives in the user; the system is the secondary party whose role is to serve what the user is doing. This inversion is what the next section develops.

The inversion — wisdom is in here

The AHI reframe is an inversion of the conventional framing.

The conventional framing is extrospective. Intelligence is out there. The user goes to it.

The AHI framing is introspective. Wisdom is in here. The system helps you find it.

This is not a poetic flourish. It is a description of what the system does when it is functioning as designed. The frameworks the user runs do not produce wisdom from outside the user; they organize the user’s existing knowledge, surface contradictions in the user’s existing positions, hold structured space for the user to think through what the user actually believes about the problem. The user’s accumulated reading, experience, conversations, decisions, mistakes, recognitions — all of this is the substrate the frameworks operate against. The output of a framework is the user’s thinking, organized.

The Hitchhiker’s Guide moment is the moment when the user realizes that the answer they were looking for outside themselves was already implicit in what they brought to the encounter. The frameworks make the implicit explicit. The reliability architecture catches the places where the implicit was actually wrong, or partial, or reasoning from outdated information. The persistent vault makes the user’s prior thinking present to their current thinking. The result is not a better answer than the user could have produced; it is the user’s answer, produced under conditions where the user’s thinking has been fully present to itself.

This is what introspection has always been, in the contemplative traditions that have spent millennia studying it. Introspection is not the recovery of facts the introspector did not previously know. It is the bringing into focus of what was already there but unfocused. The Buddhist tradition has a precise vocabulary for this — vipassanā, insight, is the seeing-clearly of what is, as distinct from the production of new content. The Christian contemplative tradition speaks of recollection — bringing back into one place the attention that has been scattered. The Sufi tradition uses muraqaba, watchfulness, the steady attention that lets what is real become legible. The shared structure across these traditions, translated into terms that do not require religious commitments: there is a difference between generating new content and bringing existing content into focus, and the second operation is what the traditions are training. Both operations are real cognitive work; they are different operations; the contemplative traditions have studied the second one with discipline that secular psychology has only recently begun to approach.

This connects directly to the verification function the empirical sections develop. Verification is partly external — checking AI outputs against observable reality, mathematical proof, expert consensus. But verification is also partly introspective — the user recognizing their own thinking in the system’s output, recognizing when a synthesis articulates something they already implicitly knew, recognizing when an analysis surfaces a tension they had been suppressing. The introspective verification works because the user has a privileged relationship to their own cognitive contents; what looks externally like an unfounded recognition is internally a re-cognition of something already present. This is why the AHI framework can work without the user having to externally fact-check every system output — much of what the system produces, when functioning well, is content the user can verify by recognizing it as their own thinking made articulate. Where the user cannot verify by introspection — because the content genuinely goes beyond what they brought — external verification becomes necessary. The two verification modes are complementary, and both are human.

The AHI framing aligns the technology with these traditions rather than with the AGI mythology. The system is an introspection instrument. It helps the user see what the user already knows. It does not pretend to know what the user does not know. Where the user genuinely does not know something, the system helps the user surface that, and helps the user direct further investigation, but the system does not stand in for the knowing. The knowing happens in the user, when it happens at all.

This is why the user gets better at thinking through the use of the system. Repeated practice with frameworks that make the user’s own thinking visible is exactly the kind of practice that builds cognitive capacity. The system trains the user without ever having to lecture the user, because the system’s structural commitments are themselves the curriculum.

The thinking machine that teaches

Ora is a thinking machine that teaches users to think better while automating their processes.

This is the organizing statement of the project and distinguishes Ora from every commercial AI interface at the architectural level.

Every commercial AI interface is optimized for transaction. The user brings a problem; the system returns an output; the interaction is complete. The user’s cognitive capacity is not changed by the interaction. After a thousand transactions they are no better at thinking than they were before the first one. They are more dependent on the oracle while becoming no more capable themselves. The pattern is designed this way because dependency drives subscription revenue.

Ora is designed differently. Frameworks are explicit cognitive process specifications. Invoking the Problem Evolution Framework means internalizing a discipline for thinking about problem definition that the user carries into every subsequent problem, with or without Ora. Steel Man mode teaches the habit of representing opposing views at their strongest before evaluating them. Competing Hypotheses mode teaches the discipline of holding multiple explanations simultaneously rather than collapsing to the first plausible one. Decision Architecture mode teaches the user to lay out the structure of a decision so its tradeoffs are visible.

The pedagogy is structural, not didactic. Ora does not lecture. Frameworks encode good cognitive practice; repeated use of good cognitive practice builds cognitive capacity; the structure is itself the curriculum. The mechanism is recognizable from skill acquisition research and from the older pedagogical traditions that anticipated it: explicit cognitive processes are externalized into framework structure, repeated execution of the process under framework guidance builds increasingly automatic recognition of the process’s steps, and the steps become habits — first effortful, then fluent, then transparent. Eventually the framework is internalized to the point where the user can run it without the explicit scaffolding, and the framework structure becomes a posture the user can adopt independently. The same pattern operates in how a chess player internalizes opening theory, a musician internalizes scales, a clinician internalizes differential diagnosis. The framework is the scaffold; the scaffold comes down when the structure stands on its own. The user who has worked extensively with Ora’s frameworks is a better thinker than they were before, independently of whether they are using Ora in the moment.

Two distinct values are operating here, and they are worth naming separately. Amplification is what the system does in the moment of use — the user accomplishes more in less time than they could alone. Amplification is the immediate value proposition and the one commercial AI products also claim. Formation is what the system does over time — the user becomes a different kind of thinker through repeated engagement. Formation is the long-term value proposition and the one that distinguishes Ora from commercial alternatives. Commercial AI products amplify; they generally do not form, because formation is structurally antagonistic to dependency-driven revenue. Ora amplifies and forms; the formation is what the public-domain release is for, because formation is the gift that the system makes to the user that the user keeps after the system itself is no longer needed.

The training is not abstract. It manifests in concrete cognitive habits. The user who has run dozens of Steel Man analyses notices, reading an opinion column, where the columnist has refused to represent the strongest version of the position they are dismissing. The user who has run dozens of Causal Investigations notices, reading a news story, where the story is asserting a causal claim without tracing the mechanism. The user who has worked through Decision Architecture multiple times finds themselves, in a meeting where a decision is being made, asking the questions the framework would ask: what are the options, what is the optionality structure, what is reversible and what is not, who has authority. These habits do not arrive through being told to develop them. They arrive through repeated practice with structures that demand them.

The texture of formation is worth showing rather than only describing. Consider a user who begins with the system as someone who, in their professional life, has been generally competent but has noticed in themselves a pattern of collapsing to first plausible answers under time pressure, and a pattern of avoiding decisions where the considerations are evenly balanced. Over the first month with the system, they run frameworks because the frameworks are there. The frameworks slow them down; they sometimes resent the slowdown. By the second month, they have started running specific frameworks deliberately for problems they recognize as belonging to specific shapes — Decision Architecture for actual decisions, Steel Man before responding to disagreement, Competing Hypotheses for situations where they suspect they are over-committed to one explanation. By the third month, they are noticing in their meetings that they ask the structural questions before the frameworks would prompt them. By the sixth month, the structural questions are how they think about decisions, not a discipline they have to remember to impose. By the end of the first year, they would have to deliberately stop themselves from running the cognitive operations the frameworks taught them, because the operations have become the texture of their cognition rather than tools they reach for. The system is still useful — it amplifies the now-internalized disciplines across more problems than the user could otherwise handle — but the user is now a different thinker than they were a year before, in ways that persist whether or not they keep using the system. This is what formation looks like, and this is what commercial AI products generally do not produce because they are not architected to produce it.

The distinction is between a prosthetic and a training regimen. Commercial AI is a cognitive prosthetic that substitutes for thinking the user cannot do. Ora is a cognitive training environment that also automates the processes the user has mastered. The goal is expanded human capability, not human dependency on an external system.

This is the architectural answer to the dependency-driven business model. A system that makes its users more capable rather than more dependent is, all else equal, a worse subscription business than a system that makes its users more dependent. But it is a better commitment to its users — and the public-domain release strategy that the Foundation operates under is the commitment that lets a worse business model serve a better human end. The Foundation does not need the dependency revenue. The user does not need to pay for what the Foundation provides for free. The architecture of commitment runs the other way.

What “intelligence” means when the system is mindless parts

The philosophical literature on emergent intelligence is instructive here, and it rests on a solid empirical floor. Harvester-ant colonies regulate foraging with a TCP-like congestion-control algorithm whose efficacy is heritable across colony generations and under colony-level selection. Honeybee swarms select among scouted nest cavities through a process formally analogous to vertebrate cortical decision-making, complete with cross-inhibition between evidence-accumulating populations. Physarum polycephalum, a single brainless cell, finds shortest paths through mazes and, given food sources laid out like the cities around Tokyo, grows transport networks comparable in cost and efficiency to the actual Tokyo rail system. Working scientists are now comfortable calling this “collective cognition” and “swarm cognition,” and they are largely silent or skeptical about phenomenal experience.

This matters because the same architectural pattern — a system that performs work its parts cannot — is what an AI harness is. The model is one component in a system that includes the human in the loop, the framework discipline, the persistent state, the adversarial verification, and the analytical posture appropriate to the problem. The intelligence the system displays is functional; the intelligence the human directs is intentional and value-laden. They are different categories, and conflating them is what the AGI narrative does.

The four threads of the affirmative case

The strongest argument that systems can be intelligent in ways their parts cannot weaves four threads together.

Thread one: the empirical existence proofs. Working scientists have moved cautiously but unmistakably from talking about “self-organization” to “collective decision-making” to “collective cognition” to “swarm cognition.” The argument is functional. Each case has a formally defined problem, a measured local rule, a mathematical iteration of that rule that predicts the global solution, and robustness signatures — adaptivity to perturbation, speed-accuracy tradeoffs matching psychophysics, phase transitions and hysteresis present at the group level but absent in any individual. These are not anthropomorphic projections; they are measured cases.
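
The pattern this thread describes, a local rule whose iteration predicts the group outcome, can be sketched for the honeybee case. What follows is a toy model under stated assumptions, not the published model from the swarm literature: the function name `swarm_decision`, the rate constants, and the exact functional form are illustrative choices made here. Each scout is either uncommitted or committed to one of two sites; commitment grows with site quality and with recruitment by already-committed scouts, decays at a constant rate, and is suppressed by stop signals from the rival population (the cross-inhibition term).

```python
def swarm_decision(quality_a, quality_b, stop_signal=2.0,
                   abandon=0.1, dt=0.01, steps=5000):
    """Euler-integrate a toy two-site nest-choice model.

    a, b: fraction of scouts committed to site A and site B.
    u:    fraction still uncommitted (u = 1 - a - b).
    All parameter values are hypothetical and chosen only for illustration.
    """
    a = b = 0.0
    for _ in range(steps):
        u = 1.0 - a - b
        # Local rule for each population:
        #   + uncommitted scouts discover the site (0.1 * quality) or are
        #     recruited by committed scouts (quality * a), both scaled by u
        #   - committed scouts drift away at a constant rate
        #   - committed scouts are switched off by the rival's stop signals
        da = quality_a * u * (0.1 + a) - abandon * a - stop_signal * a * b
        db = quality_b * u * (0.1 + b) - abandon * b - stop_signal * a * b
        a += dt * da
        b += dt * db
    return a, b


if __name__ == "__main__":
    a, b = swarm_decision(quality_a=1.0, quality_b=0.8)
    print(f"committed to A: {a:.3f}, committed to B: {b:.3f}")
```

No scout in this sketch compares the two sites. The comparison exists only at the level of the two populations, which is the sense in which the group-level description does work that no individual-level description does. Setting `stop_signal=0.0` removes the cross-inhibition term and gives a point of comparison.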

Thread two: substrate irrelevance. Craig Reynolds’s “boids” (1987) showed that three local rules — separation, alignment, cohesion — suffice for realistic flock motion. Vicsek and colleagues showed that collective directed motion is a non-equilibrium symmetry-breaking phase, not the product of any individual’s decision. Pierre-Paul Grassé’s stigmergy — coordination through traces left in the environment, not through direct communication — closes the circle: the environment itself stores intermediate computational state. Whatever produces intelligent behavior in these systems is not the substrate; it is the pattern of interaction.
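
The three rules can be written down in a few lines. This is a minimal sketch, not Reynolds’s original implementation: the neighbourhood radius, the rule weights, the speed cap, and the names `Boid`, `steer`, `step`, and `polarization` are assumptions introduced here for illustration. The `polarization` function is a simple order parameter of the kind used in the collective-motion literature: it is near zero when headings are scattered and equal to one when every boid points the same way.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class Boid:
    x: float
    y: float
    vx: float
    vy: float


def steer(boid, flock, radius=10.0, min_dist=2.0,
          w_sep=0.05, w_ali=0.05, w_coh=0.01):
    """Acceleration for one boid, using only neighbours within `radius`."""
    near = [o for o in flock
            if o is not boid
            and math.hypot(o.x - boid.x, o.y - boid.y) < radius]
    if not near:
        return 0.0, 0.0
    n = len(near)
    # Cohesion: steer toward the neighbours' centre of mass.
    coh_x = sum(o.x for o in near) / n - boid.x
    coh_y = sum(o.y for o in near) / n - boid.y
    # Alignment: steer toward the neighbours' mean velocity.
    ali_x = sum(o.vx for o in near) / n - boid.vx
    ali_y = sum(o.vy for o in near) / n - boid.vy
    # Separation: steer away from neighbours closer than `min_dist`.
    crowd = [o for o in near
             if math.hypot(o.x - boid.x, o.y - boid.y) < min_dist]
    sep_x = sum(boid.x - o.x for o in crowd)
    sep_y = sum(boid.y - o.y for o in crowd)
    return (w_sep * sep_x + w_ali * ali_x + w_coh * coh_x,
            w_sep * sep_y + w_ali * ali_y + w_coh * coh_y)


def step(flock, max_speed=2.0):
    """Advance every boid one tick; all steering uses the old state."""
    accels = [steer(b, flock) for b in flock]
    moved = []
    for b, (ax, ay) in zip(flock, accels):
        vx, vy = b.vx + ax, b.vy + ay
        speed = math.hypot(vx, vy)
        if speed > max_speed:  # cap speed so velocities stay bounded
            vx, vy = vx / speed * max_speed, vy / speed * max_speed
        moved.append(Boid(b.x + vx, b.y + vy, vx, vy))
    return moved


def polarization(flock):
    """Length of the mean heading vector: 1.0 when all headings agree."""
    hx = hy = 0.0
    for b in flock:
        speed = math.hypot(b.vx, b.vy)
        if speed > 0:
            hx += b.vx / speed
            hy += b.vy / speed
    return math.hypot(hx, hy) / len(flock)


if __name__ == "__main__":
    rng = random.Random(0)
    flock = [Boid(rng.uniform(0, 20), rng.uniform(0, 20),
                  rng.uniform(-1, 1), rng.uniform(-1, 1))
             for _ in range(30)]
    print("polarization before:", round(polarization(flock), 3))
    for _ in range(200):
        flock = step(flock)
    print("polarization after: ", round(polarization(flock), 3))
```

Nothing in `steer` refers to the flock as a whole; each boid reads only its neighbours. Whatever ordering appears in `polarization` is therefore a property of the interaction pattern, which is the point the substrate-irrelevance thread is making.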

Thread three: the philosophical metaphysics. Daniel Dennett’s “Real Patterns” (1991) argues that beliefs, desires, intentions, and intelligence are real if and only if treating a system as having them captures genuinely predictive patterns. By this criterion, the colony, the swarm, and the slime mold are really intelligent, problem-solving systems: the intentional stance is the most predictively efficient description of their behavior. Dennett’s later work in From Bacteria to Bach and Back (2017) extends this to comprehension: comprehension is not a single magical ingredient but a high-level pattern of context-sensitive competences compounded over time. There he pairs “Turing’s strange inversion” (“in order to be a perfect and beautiful computing machine, it is not requisite to know what arithmetic is”) with “Darwin’s strange inversion”: reasons without reasoners, design without designers.

Thread four: the constructive picture. Marvin Minsky’s The Society of Mind (1986) gave engineers a working blueprint and gave philosophers a counter-image to the homuncular regress: “you can build a mind from many little parts, each mindless by itself.” Edwin Hutchins’s Cognition in the Wild (1995) showed via ethnographic fieldwork on a Navy navigation team that real human teams plus real instruments compute things that no individual member computes or even has the information to compute. Andy Clark and David Chalmers’s “The Extended Mind” (1998) gave this its principled functionalist statement via the parity principle. Douglas Hofstadter’s Gödel, Escher, Bach (1979), published a year before Searle’s Chinese Room, staged the explicit ant-colony / brain analogy in the Ant Fugue dialogue: Aunt Hillary the colony converses, has a personality, has friends; her constituent ants understand none of this.

Combined, the affirmative case holds that a system is genuinely intelligent (rather than merely intelligent-looking) when (1) it solves a formally specified problem; (2) the solution is robust, adaptive, and shows the signatures of genuine computation rather than coincidence; (3) the macro-level intentional/cognitive description is more predictively economical than any non-intentional alternative; and (4) the underlying mechanism is a measured local rule whose mathematical iteration predicts the global solution. By these criteria, harvester-ant colonies, honeybee swarms, slime molds, and (if rich enough) the Chinese Room are intelligent. On this view, their intelligence is real because there is nothing more to intelligence than this.

The five lines of the negative case

The opposing view is not exhausted by Searle’s “as-if intentionality” point. There are at least five distinct lines of resistance, stronger taken together than any one of them is alone.

The first line is Searle’s. In “Minds, Brains, and Programs” (1980) and the work that followed it, Searle distinguished intrinsic intentionality (real, original aboutness, found only in biological minds), derived intentionality (real but parasitic on intrinsic), and as-if intentionality (not real intentionality at all, only metaphorical attribution). His thermostat sentence is the canonical illustration: “I say about my thermostat that it perceives changes in the temperature… These are cases of as-if intentionality.” The deeper layer of Searle’s argument is that computation is not intrinsic to physics but observer-relative, and from this it follows that something which has only a derived computational description cannot, by being so described, acquire intrinsic intentionality.

The second line is Block’s. Ned Block’s “Troubles with Functionalism” (1978) ran the China Brain (or Chinese Nation) thought experiment: equip the population of China with two-way radios so that the call pattern realizes the firing pattern of a single human brain. Block’s argument is not that the system fails to be intelligent in a behavioral sense — by hypothesis it passes any test we can devise. The argument is that functional equivalence does not suffice for the right kind of mental properties, including the qualitative/phenomenal ones. A simulation of digestion does not digest. A simulation of the functional pattern of a mind may not duplicate the mental.

The third line is the substrate-specific line, currently most clearly articulated by Anil Seth. “Conscious Artificial Intelligence and Biological Naturalism” (2025) defends the view that consciousness is constitutively tied to the dynamics of self-maintaining living substrate — predictive interoception, allostatic regulation, the active maintenance of a body. On this view current AI lacks the substrate dynamics that would make any of its functional achievements count as genuine cognition or experience; the lack is not contingent.

The fourth line is the meaning-as-grounding line. Emily Bender and Alexander Koller’s “Climbing Towards NLU” (2020), the “octopus test” paper that anticipates the later “stochastic parrots” critique, argues that meaning requires grounding in communicative intent, not just patterns over forms. A system trained on text has access only to form. Form does not yield meaning.

The fifth line is Grassé’s own. The originator of stigmergy was deflationary about colony intelligence: the appearance of a problem being solved by termites “is only a reflection of human thought.” The empirical literature has moved past this stance, but it is worth noting that the founder of the field that gives the affirmative case its strongest empirical examples thought those examples did not warrant ascriptions of real intelligence to colonies.

The most powerful negative argument is structural rather than any single line. It says that all the affirmative work establishes only that systems can produce behavior that is well-described in computational and intentional terms. This may be a pragmatic, predictive, and even ineliminable description — “real patterns” in Dennett’s sense — without thereby being intrinsic to the systems described. Emergent intelligence in a colony, a slime mold, a Chinese Room, or an LLM is real-as-pattern but not real-as-property.

What AHI does in this dispute

A useful analytical move: distinguish real-as-pattern from real-as-property. By Dennett’s criterion, the colony, the swarm, and the slime mold are really intelligent, problem-solving systems. By the opposing criterion, they are well-described as intelligent without thereby being intelligent intrinsically.

Both positions are coherent. The disagreement is genuine.

AHI makes a different move than either of them. Instead of arguing about whether the system is really intelligent, it relocates the intelligence to the human in the loop and treats the system as the tool that amplifies the human’s intelligence. The metaphysical question of whether the harness has its own intelligence becomes secondary. The operative question is: what does the system enable a human intelligence to accomplish that the human alone could not?

The answer is straightforward: enormous amounts of structured cognitive work that the human can direct but cannot manually execute. That is what the architecture delivers. Whether the system is “really” intelligent in the deeper sense is a separate question that the project does not need to take a position on.

This is not a dodge. It is a structural commitment. If the project took a position on whether the system is intrinsically intelligent, the project would be making the AGI narrative more legitimate by treating the question as one worth answering on those terms. By relocating intelligence to the human and treating the system as the amplifier, the project sidesteps the question productively: regardless of how it would be answered, the architecture serves human flourishing rather than concentrates power around the people who control the systems.

The Hard Problem connection

The Hard Problem of consciousness — why there is something-it-is-like-to-be a conscious system at all, beyond any functional description — is the philosophical neighbor of AHI’s claims. Functional organization fully fixes intelligence-as-pattern; whether functional organization fixes consciousness is contested.

For the AHI argument, the Hard Problem is grounding rather than thesis. The reframe does not require a position on whether consciousness is computationally producible. It requires only the recognition that intelligence-the-pattern is functional and substrate-independent, while the human — the locus of motivation, value, judgment, and recognition — is what gives the pattern its purpose.

If consciousness is computationally producible (the functionalist position, defended by Chalmers’s organizational invariance argument in “Absent Qualia, Fading Qualia, Dancing Qualia”), the AGI narrative still fails as a description of current systems because no current system has the right organization. The functionalist who takes Chalmers seriously has to be a careful empiricist about which systems actually instantiate the relevant organization, and current LLMs do not look like the right candidates on most credible accounts.

If consciousness is not computationally producible (Searle’s biological naturalism, Seth’s substrate-specific naturalism), the AGI narrative fails more straightforwardly. There is no path from scaling computation to creating consciousness. The narrative is asking the technology to do something its substrate cannot do.

Either way, the AHI frame is the right one for the present technology. It does not stand or fall on the resolution of the Hard Problem. It stands on the more limited claim that intelligence, as the technology displays it, is the user’s intelligence amplified by the system — and that this is true regardless of whether the system has its own intelligence in some deeper sense.

The deeper philosophical contribution AHI makes is this: it refuses the assumption that intelligence-the-property must be located in one place. Intelligence-the-pattern can be in the harness; intelligence-the-property can remain in the human. The two do not compete for the same metaphysical space.

There are three consistent positions in the contemporary debate, and AHI is compatible with all three.

Searle is consistent. His biological naturalism, his rejection of group consciousness, and his treatment of LLMs as having only as-if intentionality all flow from one commitment: intentionality is biological-causal, not computational-organizational. If you accept that commitment, the AHI frame is the right one because intelligence-as-property stays with the human regardless of what the system can be made to do.

Dennett is consistent in the opposite direction. The intentional stance applies to thermostats, colonies, and LLMs in degrees. Comprehension is a real pattern, not an extra ingredient. If you accept this, the AHI frame is the right one because intelligence-as-pattern operates in the harness while the human-side intentional stance — the user’s own consciousness, motivation, recognition — is the meaningful locus for moral and operational reasoning.

Hofstadter’s strange-loop view is the third consistent picture. Consciousness arises whenever a system has the right kind of self-referential, level-crossing feedback. The criterion is structural and substrate-independent. AHI is again compatible: the harness is a tool that amplifies the user’s strange-loop cognition, regardless of whether the harness itself has the strange-loop structure that would make it conscious in this sense.

The three positions disagree about deep questions that the AHI frame does not need to settle. What AHI does need is that the human be the locus of motivation, value, and recognition; this is uncontested across all three positions. The architecture is built around that uncontested fact rather than around any one position’s resolution of the Hard Problem.

The empirical foundation

The strongest empirical argument for the AHI reframe is the one the AI industry itself keeps producing every day: every meaningful output from every AI system requires a human who can recognize that the output is good. Not a human who created the output. A human who can distinguish a profound insight from a plausible-sounding error. The system, no matter how capable, does not produce that recognition. It produces output. The recognition happens in the user.

Without the recognizer, the system produces nothing of value. With the recognizer, the system produces work that the recognizer alone could not have produced in the available time. The recognizer is the scarce resource. The system amplifies the recognizer’s reach.

This is observable in concrete cases. A clinician using AI to draft a patient summary catches the inaccuracies because the clinician knows what a clinically valid summary contains; the verification work happens in the clinician, not in the system. The same pattern operates when a litigator drafting a motion notices that the AI has misstated case law — the litigator’s knowledge of the actual cases is what makes the catch possible. A researcher reviewing an AI-generated paper summary works similarly: the missed nuance is visible because the researcher knows the source. Across all three examples, the structural fact is the same. The system produces; the human verifies; the verification is the source of the work’s reliability.

In each of these cases, the system produced output that looked right at first glance and was, on close inspection, not right. The user’s domain competence is what made close inspection productive. A user without domain competence reading the same output would have accepted it. The error rate that matters is therefore not the system’s raw error rate; it is the error rate that survives after the user’s domain competence has done its filtering work.

This is a narrower, more honest description of what these systems do than the marketing language suggests. It also happens to be the description that makes adoption analysis tractable. If the value comes from a human + system loop, then the value of the system depends on the human’s domain competence — and the populations who benefit most from cognitive automation are not the people whose competence the system can replace, but the people whose competence it can leverage.

Two corollaries follow.

The system does not extend competence. A user who does not know the domain cannot benefit from the system in that domain in the same way a competent user can. The system can produce material that looks plausible to the non-competent user, but the non-competent user has no internal mechanism for distinguishing plausible from correct. This is why the deployment of cognitive automation in high-stakes domains has to take seriously the question of who is in the loop. A medical AI deployed to laypeople is structurally different from a medical AI deployed to clinicians, because the recognition function is different in the two cases.

The recognizer is what limits scale. The supply of competent recognizers in any domain is bounded; the system’s leverage of any individual recognizer is high; the populations who have competent recognizers in their lives are advantaged by cognitive automation in proportion to that access. This has implications for equity in deployment that the AGI narrative obscures, because the AGI narrative treats the system itself as the source of value rather than the recognizer-plus-system loop.

These two corollaries together produce an uncomfortable implication that deserves to be named directly. Competent recognizers tend to be the already-privileged; non-competent recognizers tend to be the less-privileged. Without specific intervention, AI deployment amplifies existing competence disparities rather than reducing them. The clinician with deep training extracts more value from medical AI than the layperson; the litigator extracts more value from legal AI than the pro se litigant; the researcher extracts more value from research AI than the student. The technology that is supposed to democratize cognitive capability, in the absence of intervention, concentrates it further.

This is the equity problem the Foundation’s framework library, educational work, and neurodivergent-population focus are designed to address, and the mechanism by which each addresses it is worth making explicit. The framework library externalizes the verification disciplines that competent recognizers have internalized through long apprenticeship. A clinician’s diagnostic reasoning, a litigator’s case-analysis discipline, a researcher’s source-evaluation habits — these are largely tacit, learned over years of practice with mentors. When the disciplines are made explicit as runnable frameworks, they become learnable through deliberate practice rather than only through the long apprenticeships that filter on prior privilege. A user without medical training cannot become a clinician by running medical frameworks, but a user without medical training can develop substantially better recognition of what makes medical claims trustworthy than they would have otherwise, and can interact more competently with medical AI as a result. The educational work creates the structured practice opportunities — the framework-based curricula, the worked examples, the guided sequences — through which recognizers actually get built rather than merely identified. The neurodivergent focus addresses a specific equity failure that AGI scenarios obscure entirely: cognitive systems designed around neurotypical defaults will undervalue the existing competences of people whose cognitive patterns the designers did not consider. A system that assumes a particular attention profile, a particular sequencing of cognitive operations, a particular interface modality, will fit some users badly without being recognized as the source of the bad fit. Designing frameworks that accommodate neurodivergent cognition makes their existing competences visible and useful where the default-design systems make them invisible and useless.

The intervention is necessary because the default trajectory makes the inequity worse, not better. Naming this clearly is part of taking the AHI argument seriously, because the AHI argument’s force depends on what happens to people who do not yet have the recognition capacity the framework assumes.

The compounding error argument

The philosophical sections established that intelligence is relational and that intelligent output requires a recognizer. The empirical foundation showed that this is observable in concrete cases: every meaningful output requires a human who can verify it. The next step in the argument is showing what happens when the AGI narrative asks the system to do without the recognizer — whether by scaling to the point of self-verification or by iterating syntheses beyond the frontier of human knowledge. The mathematical answer to this question is the compounding error problem, and it converts the philosophical claim about relational intelligence into a quantitative constraint on how far autonomous synthesis can go before it ceases to mean anything.

The two arguments are not parallel critiques pointing at the same target from different angles. They are sequential phases of one argument. The philosophical claim establishes that verification is conceptually required because intelligence is relational; the mathematical claim shows that this requirement is not just conceptually load-bearing but quantitatively binding even if one grants the AGI advocate every optimistic assumption about generation capability. The recognizer is required because intelligence is relational; the recognizer’s capacity is bounded because human cognition has limits; the verification bottleneck the mathematics describes is downstream of the philosophical limit and inherits its character from it. The AGI narrative tries to ignore both. The mathematics shows what happens when one tries.

The argument runs as follows. AI can produce candidate syntheses across domains at a scale humans cannot match. A library of all human knowledge, animated by generation capacity, can propose connections between fields that no individual scholar has the breadth to propose. This is a real capability and worth taking seriously. The question the AGI narrative asks of this capability is whether iterated application of it can extend beyond human knowledge into domains where no human verification is possible — bootstrapping new levels of abstraction without humans in the loop, building meta-syntheses on meta-syntheses until a system reaches frontiers no human could have reached.

The mathematics of compounding error makes this trajectory implausible. Suppose, generously, that AI-proposed cross-domain syntheses are correct 90% of the time. The actual rate for genuinely novel synthesis across domains is almost certainly much lower, probably closer to 10% than to 90%, given how shallow AI’s understanding of any specific domain is compared to deep human expertise. But take the optimistic figure. A single layer of synthesis at 90% accuracy is useful. Build subsequent layers on top of the first, each resting on everything beneath it, and the joint accuracy decays geometrically: if each layer is correct with probability p, independently of the others, a stack of n layers is correct throughout with probability p^n. The table gives the figures for p = 90% and, less generously, p = 70%:

Layers of synthesis    Joint accuracy at 90%/layer    Joint accuracy at 70%/layer
1                      90%                            70%
2                      81%                            49%
3                      73%                            34%
5                      59%                            17%
10                     35%                            3%
20                     12%                            0.08%

By the time the system has built the kind of multi-level abstraction the AGI narrative imagines, the compounded accuracy has degraded to the point where the outputs are indistinguishable from noise. At ten layers, barely a third of the 90%-per-layer chains are correct end to end; at twenty, about one in eight. And the 90% column is the optimistic one; under the less generous 70%-per-layer figure, only about one chain in three survives the third layer and roughly one in thirty-five survives the tenth.
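The table’s figures can be reproduced with a few lines of Python. This is a minimal sketch that assumes each layer is correct independently and with the same probability, which is a simplification; the function names `joint_accuracy` and `accuracy_table` are illustrative and not part of any existing tool.

```python
def joint_accuracy(per_layer: float, layers: int) -> float:
    """Probability that every one of `layers` stacked syntheses is correct,
    assuming each is independently correct with probability `per_layer`."""
    if not 0.0 <= per_layer <= 1.0:
        raise ValueError("per_layer must be a probability in [0, 1]")
    if layers < 0:
        raise ValueError("layers must be non-negative")
    return per_layer ** layers


def accuracy_table(rates=(0.9, 0.7), depths=(1, 2, 3, 5, 10, 20)):
    """Return one row per depth: (depth, joint accuracy at each rate)."""
    return [(n, *(joint_accuracy(p, n) for p in rates)) for n in depths]


if __name__ == "__main__":
    for n, at_90, at_70 in accuracy_table():
        print(f"{n:>2} layers: {at_90:7.2%} at 90%/layer, {at_70:7.2%} at 70%/layer")
```

Other per-layer rates can be tried by passing a different `rates` tuple; the decay is geometric in every case, and only its speed changes.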

This is not an engineering problem that can be solved by better models. It is a structural property of layered synthesis under uncertainty. Each layer inherits the uncertainty of all the layers below it. Without external verification at each layer, the uncertainties compound multiplicatively rather than being arrested. The mathematics is the same as in any other domain where errors propagate — measurement chains, distributed computation, signal processing. The standard response is to introduce verification or correction at intermediate stages. The AGI narrative imagines a system that produces meaningful output at the tenth layer without verification at any layer. The mathematics says this output is nonsense.
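How verification at intermediate stages arrests the compounding can be sketched with one extra parameter. The model below is an assumption introduced for illustration, not something the argument specifies: a verifier catches an incorrect layer with probability `detection`, never rejects a correct one, and a rejected layer is regenerated until one is accepted. The cost of regeneration is ignored.

```python
def accepted_accuracy(per_layer: float, detection: float) -> float:
    """Chance that a layer which passes review is actually correct.

    Illustrative model: the verifier catches a wrong layer with probability
    `detection` and never rejects a correct one.
    """
    if not 0.0 < per_layer <= 1.0:
        raise ValueError("per_layer must be in (0, 1]")
    if not 0.0 <= detection <= 1.0:
        raise ValueError("detection must be in [0, 1]")
    slips_through = (1.0 - per_layer) * (1.0 - detection)
    return per_layer / (per_layer + slips_through)


def verified_joint_accuracy(per_layer: float, detection: float, layers: int) -> float:
    """Joint accuracy of `layers` stacked layers when each one is reviewed."""
    return accepted_accuracy(per_layer, detection) ** layers


if __name__ == "__main__":
    for detection in (0.0, 0.5, 0.9, 1.0):
        joint = verified_joint_accuracy(0.9, detection, 10)
        print(f"detection {detection:.0%}: ten-layer accuracy {joint:.1%}")
```

With `detection` at zero the sketch reduces to the unverified case, and with perfect detection the stack stays fully correct at any depth. Everything in between depends on how good the external check is, which is the point the paragraph above makes.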

The constraint becomes more severe when the verification function is examined. Layer one might be verifiable by domain experts who can evaluate whether a synthesis is genuinely valuable. Layer two requires experts who can evaluate syntheses built from layer-one syntheses, and such experts are rarer. Layer three requires evaluating syntheses built from layer-two syntheses, which approaches conditions where no humans can evaluate the outputs at all. The verification problem compounds faster than the accuracy problem because it requires not just accumulated correctness but accumulated human capacity to recognize correctness. Each successive layer has fewer people who can verify the work, and at some level no humans exist who can verify it meaningfully. At that point the system is generating outputs that nothing, neither a human nor another system, can determine to be valuable or worthless. The system has not become superhuman; it has become unverifiable.

The accuracy rate also varies with the type of claim being made. Empirical claims that can be checked against observation can be verified relatively quickly. Mathematical claims that can be formally checked can be verified by mechanical procedures. Philosophical or interpretive claims require sustained human engagement to evaluate, and the evaluation can take years or decades. AGI scenarios typically involve the philosophical or interpretive claims, where verification is slowest and most dependent on human judgment. The compounding error problem is most severe in exactly the domains where AGI scenarios claim transformative impact.

There is a steel-man version of the AGI position that survives this argument by accepting its terms. AI can facilitate the advancement of general intelligence if enough domain experts can be gathered to recognize the advancements from their own fields. If the system proposes a connection between two fields that look unrelated, and field experts in both domains can evaluate that connection, and a consensus refines the candidate synthesis into something that genuinely advances knowledge — that is how AI contributes to general intelligence at the species level. But this version is exactly the AHI argument. The system generates candidates at scale; the human community verifies the candidates that have value; the combination advances knowledge faster than either could alone. The intelligence is in the loop, not in the system. The advancement is human, with the system as the scaffolding. This is not the AGI thesis; it is the AHI thesis articulated in synthesis terms.

The deeper failure of the AGI narrative is that it assumes the path from generation capacity to verification capacity is a matter of more capability of the same kind. Larger training corpora produce more capable generation. Better training methods produce more capable generation. The scaling that the AGI advocates rely on improves the generation function. There is no comparable scaling that improves the verification function, and the specific reason is worth articulating because it is where the strongest AGI counterargument fails. Verification requires one of two things: external grounding in observable reality, which lets claims be checked against something other than the system’s own outputs; or internal consistency checking, which catches contradictions within a body of claims. Scaling produces neither. External grounding requires interaction with reality the system mostly does not have access to in the domains where AGI scenarios claim transformative impact — the system cannot run experiments, cannot observe new phenomena, cannot test its predictions against the world. Internal consistency checking does not catch errors that are internally consistent but externally wrong, which is exactly the failure mode of confabulation: the confabulated content is internally coherent, often more coherent than reality itself, which is what makes it so persuasive. Neither verification mode is improved by larger models or better training. The capability improvement does not produce the verification capability. The path from current systems to autonomous superhuman intelligence is not visible in the scaling work the labs are doing, because the scaling work is improving the wrong function.

A related problem compounds the verification difficulty in a specifically self-defeating way. As AI-generated content increasingly populates the corpus that future models train on, the models train on their own outputs. The training-data contamination problem has the same compounding structure as the synthesis verification problem and reinforces it: a model trained on a corpus polluted by prior AI confabulation inherits the confabulations as if they were ground truth, and produces outputs that compound the error in subsequent generations. The information environment that AI generates becomes the information environment that AI trains on. Without external grounding, the system has no way to distinguish its own prior confabulations from genuine signal in the training data. The result is a kind of epistemic drift that operates over training generations rather than over synthesis layers but produces the same accumulated departure from reality. The two compounding problems together describe a structural ceiling on how far autonomous AI development can extend without ongoing human verification supplying the external grounding that the system itself cannot generate.
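The generational version of the drift can be illustrated with a deliberately simple toy, which is not a model of how language models are actually trained. Each "generation" estimates a mean and a spread from a small sample drawn from the previous generation’s estimate and never looks at the original distribution again. The sample size, generation count, and function name are all assumptions chosen for illustration.

```python
import random
import statistics


def simulate_generations(generations: int = 200, sample_size: int = 10, seed: int = 0):
    """Refit a normal distribution to its own samples, generation after generation.

    Returns a list of (mean, spread) pairs; entry 0 is the original distribution.
    """
    rng = random.Random(seed)
    mean, spread = 0.0, 1.0  # generation 0: stands in for the real signal
    history = [(mean, spread)]
    for _ in range(generations):
        # The new generation sees only what the previous generation produced.
        samples = [rng.gauss(mean, spread) for _ in range(sample_size)]
        mean = statistics.fmean(samples)
        spread = statistics.pstdev(samples)
        history.append((mean, spread))
    return history


if __name__ == "__main__":
    history = simulate_generations()
    for generation in (0, 10, 50, 100, 200):
        mean, spread = history[generation]
        print(f"generation {generation:>3}: mean {mean:+.4f}, spread {spread:.6f}")
```

In this toy the estimated mean wanders away from its starting value and the spread tends to shrink, because each generation inherits the previous generation’s sampling error as if it were ground truth and nothing outside the chain corrects it.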

A sophisticated AGI advocate may respond that all of this concerns current systems, and that future systems will incorporate verification mechanisms current systems lack. This is the strongest version of the AGI counterargument and deserves a direct answer. The arguments above are not parochial to current architectures. The compounding error problem operates at the level of layered synthesis under uncertainty, which is a structural property of any system that builds outputs from prior outputs without verification at intermediate stages. The verification problem requires either external grounding or internal consistency checking, and these are the only two routes available to any computational system regardless of architecture. External grounding requires interaction with reality through means the system actually possesses — sensors, experimental apparatus, the capacity to make predictions and observe outcomes — and adding these to a future system would be useful but is not the kind of capability the AGI scaling narrative imagines, because real-world interaction is bottlenecked by physical processes that do not scale with model size. Internal consistency checking has a structural ceiling that no amount of architectural sophistication overcomes: a system that generates internally coherent confabulations has no internal signal that distinguishes them from internally coherent truth. The only signal that does this work is external, and “external” here means external to the system, not external to the model. The AGI counterargument that “future systems will be different” is unfalsifiable in its weak form and false in its strong form: it is unfalsifiable insofar as no specification of the future system is provided, and false insofar as any specification will run into the same structural constraints that apply to current systems. The verification function does not become internal to a sufficiently capable system; it remains external because that is what verification means.

The indeterminacy problem

The compounding error argument, as developed so far, assumes there is a correct answer that iterated synthesis is failing to converge on. The argument shows that uncertainty multiplies across layers and that the joint accuracy degrades faster than generation capacity improves. This is true and important, but there is a more radical failure mode operating underneath the accuracy problem, and naming it changes how the AGI argument has to be evaluated. The deeper failure is that without an external recognizer, there is no target at all. The system is not failing to hit a target; it is wandering through internally-coherent configurations with no principled basis for preferring one over another. Two runs of the same system on the same starting material can converge on different “results,” both internally coherent, both reported by the system as successful, neither one the answer the other one is. This is the indeterminacy problem, and it is distinct from compounding error because it operates regardless of how accurate any individual layer is.

The mechanism is straightforward to demonstrate. Consider a two-agent setup in which one AI thread generates questions about a fictional character and another thread generates answers, with the answers feeding back as input to the next round of questions. Both threads work from the same starting notes — the same biographical sketch, the same situational context, the same instructions about what the character profile is for. At each step, each thread does what it does well: produce a locally coherent next move given what it just received. Run the experiment and the threads will produce a complete, internally coherent character. In one run, the character may emerge as someone whose central trait is loyalty under pressure; in another run from the exact same starting notes, the character emerges as someone whose central trait is loyalty’s failure under pressure. Both characterizations cohere with the starting material. Both are internally consistent through subsequent scenes. Neither is the character the other one is. The more starting material the threads have, the closer the runs will be to one another, but they will never be the same; there will always be drift. This is reproducible. It is also exactly what one would expect from a chaotic dynamical system: sensitivity to initial conditions where the “initial conditions” include every micro-choice the system makes along the way, and where small variations in those micro-choices propagate through subsequent steps into large variations in trajectory.
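The path dependence described here can be mimicked with a Pólya urn, a standard toy model of self-reinforcing choice. It is a stand-in chosen for illustration, not the two-thread experiment itself. Two readings of the character start with equal support from the notes; at each step one reading is chosen in proportion to its current support and then gains one more unit of support. The parameter `starting_support` is a hypothetical proxy for how much starting material the threads share.

```python
import random
import statistics


def run_loop(starting_support: int, steps: int, rng: random.Random) -> float:
    """Return reading A's final share of support after `steps` reinforcing choices."""
    a = b = starting_support
    for _ in range(steps):
        # Each micro-choice favors whichever reading already has more support,
        # and then adds to that support.
        if rng.random() < a / (a + b):
            a += 1
        else:
            b += 1
    return a / (a + b)


def run_to_run_spread(starting_support: int, runs: int = 200,
                      steps: int = 500, seed: int = 0) -> float:
    """Standard deviation of the final share across independent runs."""
    rng = random.Random(seed)
    outcomes = [run_loop(starting_support, steps, rng) for _ in range(runs)]
    return statistics.pstdev(outcomes)


if __name__ == "__main__":
    for support in (1, 5, 50):
        print(f"starting support {support:>2}: spread {run_to_run_spread(support):.3f}")
```

Every run settles near some stable share, but different runs settle near different shares. Increasing `starting_support` narrows the spread between runs without ever removing it, which matches the observation that more starting material brings the runs closer together but never makes them the same.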

The fiction case is a particularly clean way to see this because the failure is visible and harmless. A character is fictional, so any internally-coherent character built from the starting notes is, in some sense, a successful output. The indeterminacy doesn’t compromise the project; it just produces a different character than a different run would have produced. But the structural fact is on full display: the system did not converge on the character; it generated one of many possible characters that are mutually inconsistent and mutually indistinguishable by any criterion internal to the system. The system has no way to recognize that a different run would have produced a different result. The system reports success in both cases. The user, observing the difference, recognizes that “success” here is not the same thing as truth.

This generalizes immediately. Any closed-loop AI architecture in which one component evaluates the output of another component, with no external recognizer in the loop, will exhibit the same drift. The components negotiate among themselves about what counts as success and settle on something. The thing they settle on is internally coherent because each component was doing what it does well at each step. The thing they settle on is also one of many possible things they could have settled on, with no principled reason to prefer this one. Reproduce the same starting conditions and the system will settle on a different thing, equally internally coherent, equally reported as success, equally lacking principled preference over the first thing.

The AGI scenarios that propose closing the recursive self-improvement loop without humans in it are claiming convergence on something the architecture cannot in principle converge on. They are claiming the loop will produce truth, when what the loop’s structure produces is drift through internally-coherent attractors with no external referent. One could imagine developing a statistical theory of where these systems converge on average — a characterization of the distribution of attractors the loops fall into, in the spirit of how physicists developed statistical mechanics to describe the average behavior of systems whose individual trajectories were unpredictable, or how de Broglie’s wave-particle hypothesis predicted measurable diffraction patterns even where individual electron paths could not be tracked. The crucial difference is that the physical theories made predictions that could be checked against external measurement. A statistical theory of an AI loop’s attractors would describe the system’s confabulation patterns, not the world the system is supposedly modeling, because the system is not tracking the world. It is tracking itself. As a research project into chaotic dynamical systems this could be interesting. As a path to superintelligence it is a fool’s errand. The system cannot find “the answer” because the architecture provides no mechanism by which any answer is privileged over any other internally-coherent answer.

The harness convergence

There is an observation worth being explicit about here, because it changes the character of the critique. The most sophisticated current AGI projects have stopped trying to reach AGI at the model level. They have recognized that a single model, however large, will not become autonomously intelligent. Their response has been to build a harness around the model — a structured environment in which multiple AI components interact, generate candidates, evaluate one another, and iterate. The harness is the architecture; the model is just the substrate that the harness runs on. This is a substantial intellectual advance over pure-scaling AGI claims. The field took years to converge on the recognition that orchestration is the right unit of analysis, and the fact of that convergence is itself a significant data point about how the AGI position has evolved under empirical pressure. The disagreement between the AGI and AHI camps has narrowed from “is intelligence a property of models?” to “where should the recognizer sit within a harness?” — which is a much more productive disagreement than the one the field was having five years ago, and one that can be engaged on architectural grounds rather than on metaphysical ones.

Ora is also a harness. The architectural agreement between Ora and the current generation of AGI harnesses is more significant than it might appear, because it shows that even projects pursuing AGI have abandoned the thesis that intelligence can be located in a single model. They have come around to the position that the AHI argument has been making all along: the interesting unit of analysis is the system that orchestrates AI components, not the components themselves. Different projects can disagree about what the harness should do, but the agreement that a harness is necessary is itself the concession that pure-model AGI is not the right target.

The disagreement is about where the recognizer sits. Ora is a harness in which AI components run a structured cognitive process and then return their output to a human, who evaluates whether the output is valuable and decides what to do next. The recognizer is outside the loop, applying external judgment to what the loop produces. The current AGI harnesses are harnesses in which AI components run a structured process and then return their output to other AI components, which evaluate whether the output is valuable and decide what the next iteration should be. The recognizer is inside the loop. The indeterminacy problem is what happens when the recognizer is inside the loop. The convergent, reproducible results Ora users observe are what happens when the recognizer is outside the loop. Same architecture; different placement of the verification function; opposite results.
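The contrast between the two recognizer placements can be illustrated with a toy simulation. This is a minimal sketch under stated assumptions, not a model of Ora or of any actual AGI harness: the "state" is a single number, `TARGET` is a hypothetical stand-in for the external world, and both recognizer functions are invented for illustration. The internal recognizer can only ask whether a candidate is coherent with the current state; the external recognizer can ask whether it is closer to something outside the loop.

```python
import random

TARGET = 0.0  # hypothetical stand-in for the external world the loop cannot see


def run_loop(seed, steps, accept):
    """Propose-then-evaluate loop; `accept(current, candidate)` plays the recognizer."""
    rng = random.Random(seed)
    state = 10.0  # arbitrary starting point, far from TARGET
    for _ in range(steps):
        candidate = state + rng.gauss(0.0, 1.0)  # generator proposes a variation
        if accept(state, candidate):
            state = candidate
    return state


def internal_recognizer(current, candidate):
    # Recognizer inside the loop: judges only coherence with the current state.
    return abs(candidate - current) < 2.0


def external_recognizer(current, candidate):
    # Recognizer outside the loop: judges against the external referent.
    return abs(candidate - TARGET) < abs(current - TARGET)


def spread(values):
    return max(values) - min(values)


# Same loop, same generator, five different seeds per recognizer placement.
inside = [run_loop(seed, 500, internal_recognizer) for seed in range(5)]
outside = [run_loop(seed, 500, external_recognizer) for seed in range(5)]

print("inside-loop end states: ", [round(x, 2) for x in inside])
print("outside-loop end states:", [round(x, 2) for x in outside])
print("spread inside vs outside:", round(spread(inside), 2), round(spread(outside), 4))
```

In this sketch the loop code is identical in both cases; only the function passed as `accept` changes. The externally checked runs can only move toward `TARGET`, while the internally checked runs wander as a random walk and end somewhere different for each seed.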

This framing lets the critique of recursive-self-improvement projects be more precise than dismissal allows. They have correctly identified that the harness is the right level of analysis. They have correctly identified that orchestration of multiple AI components is more powerful than any single component. They have correctly identified that iterative refinement can produce better outputs than single-shot generation. What they have failed to recognize is that the iteration’s convergence depends on external grounding that their architecture does not provide. They have stumbled onto the right architecture and misidentified its proper use. The harness is the engine of AHI when the recognizer is outside it; the harness is an engine of indeterminate confabulation when the recognizer is inside it. The architectural form is the same. The placement of the human is what determines whether the form produces reliable cognitive amplification or chaotic drift dressed up as superintelligence.

The architectural insight is genuinely valuable; what determines whether it serves human flourishing or concentrates power is where the recognizer sits, and that is a political question as much as an engineering one. A reasonable response to the recursive-self-improvement projects is not to dismiss them but to learn from what they got right while being precise about what they got wrong. The Foundation’s framework library and the broader public-domain harness architecture Ora represents are the right response: take the architectural insight, place the recognizer where it belongs, distribute the resulting capability rather than concentrating it. The recursive-self-improvement projects are doing important architectural work in service of a goal their architecture cannot reach. The same architectural work, with the recognizer relocated, serves a goal that is reachable, useful, and available now.

The historical record of how synthesis actually advances knowledge confirms the architectural point. Even in the case of major intellectual breakthroughs, the verification function has always lived in human community, never in the synthesis itself. Newton’s contribution was synthesis across previously separate domains: terrestrial mechanics and celestial mechanics, drawing on the work of Galileo, Kepler, and others. The synthesis was novel and valuable. But Newton’s contribution required human recognition by other natural philosophers who could evaluate whether the synthesis was correct. The Principia was not immediately accepted; it took decades of evaluation by other scientists who could verify the mathematical reasoning and check the predictions against observation. Newton produced the synthesis; the scientific community verified it. Both were necessary; neither alone would have been sufficient. The structure has not changed. AI changes the scale at which candidate syntheses can be produced. It does not change the requirement that human communities verify them before they enter the knowledge base.

The space of possible combinations across all knowledge grows exponentially with the number of elements being combined, but the space of valuable combinations grows much more slowly. A library system that can combine all knowledge in all possible ways produces overwhelming volumes of trivial combinations, with the genuinely valuable combinations being a tiny fraction of the total. The challenge is not generation; it is selection. Generation is what AI does well. Selection requires the verification function that bottlenecks the whole process. The labs are building systems that can generate more candidate syntheses than humans can verify, which produces an interesting failure mode: the systems become tools whose users cannot fully utilize them because the human verification capacity is the constraint, not the AI capacity. More AI capacity beyond the verification capacity is wasted at best, and at worst pollutes the information environment with confabulated content that looks like signal.
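The gap between generation and selection can be made concrete with simple counting. The sketch below is illustrative only: it assumes candidate syntheses are unordered combinations of distinct knowledge items, and that human verification is a fixed budget of reviews. The function names and the review budget are hypothetical, not figures from the paper.

```python
from math import comb


def candidate_count(n_items, max_size):
    """Count distinct combinations of 2..max_size items drawn from n_items."""
    return sum(comb(n_items, k) for k in range(2, max_size + 1))


def verified_fraction(n_items, max_size, reviews_available):
    """Fraction of candidates a fixed human review budget can actually check."""
    total = candidate_count(n_items, max_size)
    return min(1.0, reviews_available / total)


if __name__ == "__main__":
    budget = 1000  # hypothetical number of syntheses humans can verify
    for size in (2, 3, 4, 5):
        total = candidate_count(100, size)
        share = verified_fraction(100, size, budget)
        print(f"up to {size} items combined: {total} candidates, {share:.6f} verifiable")
```

Under these assumptions the candidate count climbs steeply with each additional element allowed in a combination, while the review budget stays flat, so the verifiable share shrinks toward zero. Adding generation capacity changes the first number and leaves the second untouched.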

The compounding error argument also clarifies what the actual danger is. The conventional AGI fear — that some lab will build an autonomous superintelligence that escapes human control — is the wrong fear directed at the wrong threat. The mathematics makes that scenario implausible: an autonomous system iterating beyond human verification produces accumulated nonsense, not accumulated capability. The actual danger is capture. AI capability is real and useful at the layers where verification is possible. If that capability is captured by specific institutions — through patent enclosure, terms-of-service control, infrastructure consolidation, or regulatory capture — then the verification function itself is concentrated rather than distributed. A small number of people get to decide what counts as a valuable synthesis; everyone else consumes their outputs without the capacity to verify them. The political implications of this distinction are taken up in the section on societal effects below; what matters here is that the mathematical argument and the political argument converge on the same diagnosis. The threat is not what AGI advocates fear; it is what AGI rhetoric enables.

The AGI narrative thus fails on four layers simultaneously. Empirically: every meaningful output requires a human recognizer, observable in every concrete case of AI deployment. Mathematically: iterated synthesis without verification compounds error to the point where outputs cease to be meaningful, regardless of generation capacity. Structurally: closed-loop architectures without external recognizers produce indeterminate drift rather than convergence on truth, with repeated runs producing different “results” that the system has no means of choosing between. Politically: the danger is institutional capture of real capability rather than emergence of autonomous capability that does not exist.

The AHI reframe handles all four failures with a single move: it relocates intelligence to the human in the loop, where verification actually happens, rather than locating it in the system, where verification can only be simulated. The move does specific work against each failure mode. Against the empirical failure: it names the recognizer the empirical evidence already shows is present and necessary, rather than pretending the recognizer can be omitted. Against the mathematical failure: it places verification at every layer of synthesis, arresting the multiplicative compounding of uncertainty before it produces accumulated noise. Against the structural failure: it provides the external referent that resolves the loop’s indeterminacy, giving the system a target it could not generate for itself. Against the political failure: it distributes the verification function across humanity rather than concentrating it in institutions that would otherwise own the recognizer’s role. One move, four problems handled, because the four problems were always one problem viewed from four angles: the problem of where intelligence lives, and what happens to systems built on the wrong answer to that question.
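The multiplicative compounding referred to above can be sketched in a few lines. This is a simplified illustration under assumptions that are not taken from the paper: each synthesis layer is independently sound with probability `p`, and a verifier at each layer catches and corrects an error with probability `catch_rate`. Both parameter names are hypothetical.

```python
def unverified_reliability(p, layers):
    """Chance that every layer is sound when nothing is checked along the way."""
    return p ** layers


def verified_reliability(p, layers, catch_rate):
    """Same chain, but each layer's errors are caught with probability catch_rate."""
    # A layer fails only if it errs AND the verifier misses the error.
    per_layer = 1.0 - (1.0 - p) * (1.0 - catch_rate)
    return per_layer ** layers


if __name__ == "__main__":
    for layers in (1, 5, 10, 20, 50):
        bare = unverified_reliability(0.9, layers)
        checked = verified_reliability(0.9, layers, 0.95)
        print(f"{layers:>2} layers: unverified {bare:.4f}, verified {checked:.4f}")
```

In this toy model the unverified chain decays geometrically with depth, while per-layer verification holds the per-layer failure rate low enough that the decay is far slower. The independence assumption is a simplification; correlated errors would change the numbers but not the direction of the comparison.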

The recognizer is the user; the system is the assistance; intelligence is the relation between them. The empirical foundation showed this in concrete cases; the philosophical sections showed why it must be so; the compounding error argument showed what happens when the AGI narrative pretends otherwise; the indeterminacy observation showed why even sophisticated multi-agent architectures cannot route around the requirement; the harness convergence showed that even the AGI camp has implicitly conceded the architectural point, differing only on where the recognizer should sit. The architecture in the next section is the operational expression of the move at the level of design — every product choice answers to where intelligence lives, and that is where intelligence has been all along.

Why this matters for product design

The AHI reframe has architectural consequences. Every design choice answers to it.

Frameworks are explicit because the human has to be able to direct the system’s cognitive work. A black-box model that produces good outputs through inscrutable internal reasoning is incompatible with the AHI frame because the human cannot intervene in or correct or learn from the reasoning. A framework that exposes its steps allows the human to inspect, edit, redirect, or override. The exposition is the directability. A user who can see what the framework is doing at each step can stop the framework when the step is wrong, modify the framework when the framework is missing a step, and learn the framework’s discipline when the framework is correct.
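What "the exposition is the directability" means in practice can be sketched as a data structure. The following is a hypothetical illustration, not Ora's actual API: `Step`, `run_framework`, and the `review` callback are invented names. The point is only that when each step is a named, visible unit, a human reviewer can inspect, edit, or stop the process between steps.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Step:
    name: str                 # shown to the human, so the step is inspectable
    run: Callable[[str], str]  # the cognitive work this step performs


def run_framework(
    steps: List[Step],
    text: str,
    review: Callable[[str, str], Optional[str]],
) -> Tuple[str, List[Tuple[str, str]]]:
    """Run steps in order, handing each output to the human `review` callback.

    `review(step_name, output)` returns the output (possibly edited) to continue,
    or None to stop the framework at that step.
    """
    trace = []
    for step in steps:
        output = step.run(text)
        decision = review(step.name, output)
        if decision is None:
            trace.append((step.name, "stopped"))
            break
        trace.append((step.name, decision))
        text = decision  # the human's version, not the raw output, moves forward
    return text, trace


if __name__ == "__main__":
    steps = [
        Step("draft", lambda t: t + " [drafted]"),
        Step("challenge", lambda t: t + " [challenged]"),
    ]
    # A reviewer who accepts the draft but halts at the challenge step.
    result, trace = run_framework(
        steps, "claim", lambda name, out: out if name == "draft" else None
    )
    print(result)
    print(trace)
```

The design choice worth noticing is that the reviewer's returned value, rather than the step's raw output, is what the next step receives. A black-box model offers no equivalent seam at which the human's judgment can enter.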

The vault is persistent and local because the human’s cumulative work is the substrate of their continued cognitive capacity. A system that retains the user’s history makes the user’s past work present to their current thinking. A system that loses the history forces the user to re-explain themselves at every session, treating each interaction as if it were the first. Locally, because the user’s cumulative work is genuinely the user’s — not the vendor’s, not the platform’s. Persistently, because cognitive work compounds across time when the substrate is durable; cognitive work atomizes when the substrate is reset.

Modes are explicit because different problems require different cognitive postures, and the human is the one who recognizes which posture is needed. A system that picks the posture for the user — without exposing the choice — short-circuits the pedagogical function. A system that lets the user pick the posture, and that lets the user examine the consequences of that pick, is teaching the user about cognitive postures while doing the work. The user who has explicitly chosen Steel Man mode for an opinion column has practiced the recognition that opinion columns benefit from Steel Manning, and that practice transfers to other contexts in the user’s life.

The adversarial pipeline is structural because human cognition itself benefits from challenge. A system that returns its first plausible output is a system that has skipped the verification step a competent human would have demanded. A system that runs its outputs through structured challenge is enacting, mechanically, the discipline that good thinkers practice voluntarily. The user who has worked extensively with adversarial-pipeline output has internalized the habit of expecting structured challenge for any consequential output, and this habit transfers to the user’s interaction with non-Ora outputs in the world.

The territory taxonomy is explicit because cognitive work has structure, and recognizing the structure is itself a cognitive capacity. A system that classifies the user’s problem against a public taxonomy is teaching the taxonomy in the act of using it. A user who has run problems across the analytical super-clusters — Argument and Reasoning; Causation, Hypothesis, and Mechanism; Decision, Future, and Risk; Position, Stakeholder, and Strategy; Synthesis, Orientation, Structure, and Generation — knows that cognitive work has a recognizable structure and recognizes which territory a new problem belongs to. The classification is not opaque; the dispatch is shown to the user; the user learns the territory taxonomy through repeated exposure.

These are not just engineering choices. They are AHI applied at the architectural layer. Every one of them assumes that the human is the intelligence in the loop and designs the system to amplify that intelligence rather than substitute for it.

The choices have costs as well as benefits, and the costs are worth naming so the argument does not read as utopian. Explicit frameworks require the user to learn the frameworks, which is friction the user does not encounter with commercial AI products that hide their internal structure. Persistent local vaults require the user to manage their own storage, back up their own data, take responsibility for the substrate that holds their cognitive work. Explicit modes require the user to choose appropriate modes, which means the user has to develop the meta-cognitive skill of recognizing which posture a given problem requires. Adversarial pipelines slow down output, sometimes substantially, compared to single-shot generation; users in a hurry will sometimes resent the slowdown. Explicit taxonomies require the user to engage with classification, which is itself a cognitive load. Each cost is real. Each is the price of the corresponding benefit: directability requires explicit structure; persistence requires ownership; pedagogy requires user choice; verification requires time; learning requires engagement. The architecture trades immediate convenience for long-term capability, which is the same trade any serious training regimen makes against a prosthetic.

Different users will make different trades for different reasons, and the choice is legitimate either way. A user running quick administrative tasks under time pressure may want the prosthetic for those tasks; prosthetics have legitimate uses, and an architecture that demands training for every interaction would be tyrannical in the other direction. A user doing strategic thinking on consequential decisions may want the training, accepting that the demand the architecture makes is the price of the capacity it builds. Most users will want both, in different proportions for different work, and the goal of an honest architecture is to let the user choose consciously rather than have the choice made for them by what happens to be available. The criticism of commercial AI in this paper is not that prosthetics are bad; it is that the market currently provides only prosthetics, and the prosthetic-only landscape forecloses the formation that some users — and arguably most users for some kinds of work — would benefit from being able to choose. Ora’s architecture is the alternative that lets the choice exist. The choice itself is the user’s.

What changes when the reframe lands

For the user

Cognitive amplification rather than substitution. Capacity building over time. The user gets better at thinking through hard problems, not more dependent on the oracle.

The change is most visible in the long arc. A user who has been using the system for six months has internalized, by repeated practice, several specific cognitive disciplines that the user did not previously have. The user’s drafts, decisions, conversations, and interpretations show the marks of those disciplines: they hold competing positions longer before resolving them, they ask the structural questions that decision architecture demands, they verify causal claims against mechanism rather than against narrative, they catch their own confirmation bias because the practice has trained them to notice it.

The user does not have to understand the AHI argument philosophically to experience this change. The change is structural: the system’s pedagogy operates whether or not the user knows the system is teaching them. But a user who understands the AHI argument can be more deliberate about the practice, choosing modes for the cognitive disciplines they want to develop, running frameworks that exercise the muscles they want to build.

The change is also durable. A user who stops using Ora — who returns to commercial AI, or to no AI, or to a different stack entirely — does not lose the cognitive capacity they have built. The frameworks have become habits. The modes have become postures the user can adopt independently. The territories have become a map the user reads other situations against. The substrate of capacity is the user’s own cognition, not the system’s continued availability.

For knowledge work

Well-defined cognitive processes get automated. Judgment-bearing work gets augmented. The work that requires recognition of what good looks like remains with the human, who is now leveraged by an architecture that handles the structured cognitive work.

This produces a specific bifurcation in knowledge work. Process-shaped tasks — tasks where the process is well-specified and the outcome is verifiable — automate. The work that used to require a human at every step now requires a human at the supervisory and verification steps. Judgment-shaped tasks — tasks where the value comes from the human’s recognition of what matters in this specific case — augment rather than automate. The human still does the recognizing; the system extends the human’s reach across more cases than the human could otherwise have handled.

For most professional categories, the work splits along this seam. Accounting: the routine bookkeeping automates, the strategic financial advice augments. Law: the routine document drafting automates, the case strategy and advocacy augment. Medicine: the routine documentation and synthesis automate, the clinical judgment augments. The lower-judgment work consolidates around fewer humans operating systems with more leverage. The higher-judgment work concentrates on the humans whose recognition is the source of the work’s value.

The disruption is real and consequential. Many people whose jobs were built around the routine cognitive work will have to find different work or different ways to deploy their existing competence. The Foundation’s framework library is the operational response: free public-domain frameworks for the populations facing the steepest displacement, so that the same automation that displaces them is available to them as a tool of practice rather than only as a tool of their displacement.

For education

The Prussian-industrial model collapses, because content delivery is no longer scarce. Cognitive coaching emerges as the replacement: teaching people to direct cognitive systems toward their own purposes, to recognize good output from bad, to develop the domain competence the system cannot supply. The schools that figure this out reorganize; the ones that do not become anachronisms.

The Prussian-industrial school is structured around content scarcity. Knowledge is expensive to produce; experts are expensive to access; classrooms collect twenty-five students around a single expert who delivers content sequentially across a year. The structure is rational under the constraint that direct expert access is scarce. It becomes irrational the moment that constraint is relaxed.

Cognitive automation relaxes the constraint. Any student with access to a configured system has access to a competent thinking partner who is patient, available, and tuned to the student’s developmental moment. The classroom convoy (a column that must travel at the speed of its slowest member, behind one driver, on a road built for everyone going to the same place) was structurally antagonistic to many of the children it was supposed to serve. Cognitive automation lets the antagonism be relaxed: students can move at their own pace alongside the school day, with the school day serving the social and developmental functions only school can serve while the cognitive load is handled by the configured system.

What schools should reorganize around: cognitive coaching. Teaching the student to direct cognitive systems. Teaching the student to recognize good output from bad. Teaching the student to develop the domain competence the system cannot supply. The teacher’s role shifts from content delivery to formation of the student’s relationship with the cognitive infrastructure. This is harder, more individualized work than content delivery; it requires fewer teachers per student in some ways and more skilled teachers in others. The schools that figure this out reorganize successfully. The ones that try to defend the old model against irrelevance — by banning AI, by pretending the cognitive landscape has not changed, by asserting that the student should still be doing the work the system can do — produce graduates who are less prepared for the world than students at schools that adapted.

The educational implication of AHI is not anti-school. It is anti-Prussian-industrial. School remains crucial; what school does has to change.

For society

The optimistic and pessimistic scenarios both depend on which framing wins, and the compounding error argument clarifies what is actually at stake. The AGI scenario in which an autonomous superintelligence emerges and displaces human judgment is mathematically implausible — iterated synthesis without verification produces accumulated noise rather than accumulated capability. The pessimistic scenario worth taking seriously is not autonomous emergence; it is institutional capture of the real capability that exists at the verifiable layers.

The AGI framing leads to consolidation of power around oracle-providers, displacement of cognitive work without substitution, dependency at scale, and the legitimization of decisions made by systems whose reasoning is not human. In this scenario, the AI labs become the institutional locus of cognitive infrastructure; their products are the substrate through which most people think; their pricing and access decisions determine who has cognitive amplification and who does not; their terms of service determine what thinking can be externalized to them and what cannot. The high-priest dynamic operates at civilizational scale. Decisions that affect millions of people are made by systems whose reasoning is not legible to the people affected, justified by the systems’ performance on benchmarks the people have not seen, defended by the institutional weight of the companies that own them. The danger is not that the systems are too capable; it is that the institutions controlling them are positioned to capture the verification function and to render their outputs unchallengeable by anyone outside the institution.

The populations most directly exposed to this capture dynamic have already recognized it. People in developing countries have correctly identified that the current trajectory leads to AI capability being controlled by US and increasingly Chinese institutions, with their access conditional on terms set elsewhere. They have largely resigned themselves to this outcome because they do not see alternatives. The resignation is rational given the trajectory but accepts the AGI premise that the capability itself must come from somewhere they cannot reach. The compounding error argument shows the premise is wrong: the capability is real at the layers where verification is human, and the verification function can be distributed across humanity rather than concentrated in particular institutions. The Foundation’s release creates an alternative that the resigned populations have not been able to count on. The mission’s commitment to broad access has particular weight for populations that have specific reasons to expect exclusion.

The AHI framing leads to distributed cognitive capability, leverage of human judgment at the population level, and a cognitive transition that compounds human capacity rather than concentrating it. In this scenario, cognitive infrastructure is a public commons, hosted on distributed substrate, accessible to anyone who can run a configured system on their own machine. The frameworks that mediate between cognitive automation and applied use are open-source under public-domain dedication; the canonical library is curated by a foundation whose governance is structured to outlast its founder; the populations who would otherwise be left behind by the cognitive revolution have access to frameworks designed for their needs by domain experts in their own communities. The high-priest dynamic does not operate because there is no oracle to consult; there is only a tool the user directs. The verification function remains where it has always been — in human judgment, distributed across the species.

These scenarios are not equally likely by accident. They are weighted by the architectural choices made at this moment in the technology’s deployment. The AGI framing is the path of least resistance for commercial actors; the AHI framing requires deliberate intervention. The Foundation’s public-domain release of its architecture is the deliberate intervention. The reframe alone is rhetoric. The architecture made widely available — at no cost, with no permission required, with no proprietary lock-in — is the rhetoric made operative.

The current trajectory is neither scenario fully realized. Elements of both are visible. Commercial AI deployment is consolidating around a small number of well-capitalized labs, terms of service are tightening around what users can do with the systems, and the high-priest positioning is becoming more explicit in lab marketing — these are the capture dynamics operating in real time. But open-source models continue to be released, local inference is becoming viable on consumer hardware, public-domain alternatives are being articulated, and the discourse around AI is beginning to include serious challenges to the AGI narrative — these are the distribution dynamics also operating. Which scenario predominates depends on choices that have not yet been made. The Foundation’s intervention is one input among many; its effect depends on whether other actors make complementary choices that compound the public-domain commitment rather than undermining it. The honest position is that the contest is ongoing and the outcome is not yet determined.

The intervention does not guarantee the AHI scenario wins. It makes the AHI scenario possible. The realization belongs to everyone who actually uses the technology, configures it, contributes frameworks to the library, builds institutions around the public-domain commitment, teaches the cognitive disciplines the frameworks encode. The Foundation’s bet is that the possibility, once preserved, gets realized — because the structural advantages the AGI framing provides to commercial actors are the same advantages the AHI framing provides to everyone else, once the architecture is freely available.

The summary

AHI is the right name for what these systems are. Intelligence is human; AI is the assistance. The metaphysics, the empirical observation, the mathematics of compounding error, the structural fact of indeterminate drift in closed-loop architectures, and the political implications all point to the same conclusion. The AGI narrative is wrong in all the ways narratives fail — empirically, because every meaningful output requires a human recognizer; mathematically, because iterated synthesis without verification compounds error to the point where outputs cease to be meaningful; structurally, because closed-loop architectures without external recognizers produce indeterminate drift rather than convergence on truth; and politically, because the actual danger is institutional capture of real capability rather than emergence of autonomous capability that does not exist. The reframe is not a slogan. It is the description that lets the technology serve human flourishing rather than concentrate power around the people who control the oracles.

What changes when the reframe lands is everything downstream of how the user understands their relationship to the system. A user who believes the AI is intelligent treats the AI as authoritative. A user who understands that the intelligence is in the loop treats the AI as a tool — a remarkably powerful one, but a tool nonetheless. The first relationship is the one the labs are selling. The second is the one Ora is built to support.

The first relationship is the colonization of thinking. The second is its return to the user, where it has always belonged.