# The Prior

Suyoung Hwang · 2026-05-28

Today's discourse on artificial intelligence often collapses into a discussion of models, and of language-model parameters in particular. The convenient narrative that begins from the language model itself is reassuring, of course; this essay is about what that convenience misses.

It is not that we think the language model itself has stalled. If anything, it is closer to the opposite. Several times a week, new model evaluations are posted to [artificialanalysis.ai](http://artificialanalysis.ai). We hear, often enough, that the state of the art has changed on some benchmark, or that a new benchmark has appeared. Tasks that no language model could solve are reduced, a few months later, to saturated benchmarks. Amid this explosive growth, everyone speaks of superintelligence, or AGI, and no one tries to define it.

Since ChatGPT launched in November 2022, language models that produce fluent text have steadily settled into everyday human life. As service after service has appeared, the "generation" of language models has begun to take its place, like the compiler or the packet-routing protocol, within the hierarchy of technical infrastructure. The shift is uneven, and adoption will move at a speed that matches the width of the gap each domain presents, but the possibilities and the direction of language models look clear.

The generative capacity of language models is becoming pervasive. Paradoxically, the thinking that ties it to a domain is not.

That thinking is what we call, somewhat loosely, _the prior_. It is less a pure condition of cognition than the accumulated structure that must exist before anything can be put to use as a domain or a function. The prior is not the function itself but the ground on which the function rests. For the compiler, the prior was type theory; for the database, it was query languages, transaction semantics, and relational algebra. For the language model, that prior has not yet been properly built. Most domains go no further than building a substrate that is intuitively coupled to a particular function and leans on the language model's posterior generative capacity.

Here is how we understand what the prior contains.

**(1) An understanding of how the posterior substrate fails.** Language models hallucinate. They were trained to reproduce what they had memorized rather than to solve problems, and they grow unstable wherever training data is sparse. Configured with the greatest care, a language model still offers no guarantee of robust output. It will sometimes generate a run of confident, groundless phrasing. As language models improve, these failures are mitigated or hidden, yet their shape stays much the same. Whether what improved is reliability or a fluency that imitates it is not easily settled.

People commonly say that pouring in computing resources, that is, scaling, will solve these problems. Because no one can run the experiment with unlimited compute, the possibility cannot be logically refuted. But one thing can be said empirically: the cost of identifying and correcting these failures from outside the language model, or through specialized models, is far lower than the cost of training a general-purpose language model free of such failures in the first place. The price that goes by the name of scaling has climbed exponentially and has by now reached an astronomical scale, while its return is logarithmic. We do not place that _bet._
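One way to read the cost argument is as a pair of inverse curves. The form below is an illustrative assumption, not a fitted law: $R$ stands for the return, $C$ for the compute cost, and $a > 0$ and $b$ are hypothetical constants.

$$
R(C) = a \log C + b
\quad\Longrightarrow\quad
C(R) = \exp\!\left(\frac{R - b}{a}\right),
\qquad
\frac{C(R + \Delta)}{C(R)} = e^{\Delta / a}.
$$

Under that assumption, each equal increment $\Delta$ in return multiplies the cost by the same constant factor, which is the sense in which a logarithmic return and an exponentially climbing price describe the same curve.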

**(2) The environment in which language models operate.** A language model call is a nondeterministic primitive operation. To build anything substantial on top of it, you need a runtime that can inspect, constrain, and refine a probabilistic primitive. This is the general view now discussed across the industry under the name _harness._ What differs on top of that general view is how one conceptualizes the environment and how seriously one takes the low-level runtime. When we design our environment, we treat a language model call as a content-addressed object: not a one-off request to an API, but a primitive unit with inputs and outputs set within a flow of information. Language models are nondeterministic, but precisely for that reason, their operating environment must be as robust as we can make it.
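To make "content-addressed" concrete, here is a minimal sketch, not a description of our runtime. It assumes that a call is fully defined by its model name, prompt, and parameters; `call_address`, `CallStore`, and `fake_backend` are hypothetical names, and the backend is a stub rather than a real API.

```python
import hashlib
import json
from dataclasses import dataclass, field
from typing import Callable, Dict, List


def call_address(model: str, prompt: str, params: dict) -> str:
    """Derive a stable address from everything that defines the call."""
    canonical = json.dumps(
        {"model": model, "prompt": prompt, "params": params},
        sort_keys=True,              # key order must not change the address
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


@dataclass
class CallStore:
    """In-memory store mapping a call address to its recorded output."""
    generate: Callable[[str, str, dict], str]  # hypothetical model backend
    records: Dict[str, str] = field(default_factory=dict)

    def run(self, model: str, prompt: str, params: dict) -> str:
        key = call_address(model, prompt, params)
        if key not in self.records:
            # Only the first call with these inputs reaches the backend.
            self.records[key] = self.generate(model, prompt, params)
        return self.records[key]


# Stub backend that returns a different string on every invocation,
# standing in for a nondeterministic model.
calls: List[str] = []


def fake_backend(model: str, prompt: str, params: dict) -> str:
    calls.append(prompt)
    return f"draft {len(calls)}"


store = CallStore(generate=fake_backend)
first = store.run("demo-model", "Summarize the assay.", {"temperature": 0.7})
second = store.run("demo-model", "Summarize the assay.", {"temperature": 0.7})
```

In this sketch the second `run` returns the recorded output instead of sampling again, so downstream steps can refer to a call by its address and inspect or replay it. A real runtime would need persistent storage and a fuller notion of what counts as an input.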

**(3) Domain anchoring.** Before any of the above can mean anything, the substrate has to be in actual contact with a domain. A domain corpus, a specialized database, a model trained to encode the domain's tacit relations, recipes that hand the language model the right material at each step: these are examples of that bridge. What suffices for a useful assistant and what is required for genuinely deep, useful scientific output differ in kind. We are not optimistic that work accumulated on top of one domain will transfer cheaply.

**(4) Composable patterns for scientific reasoning.** By now, the difference between a language model's zeroth response and science springs as much from the structure around the language model as from the model itself. Where to juxtapose retrieval, and where not. Which reasoning steps to withhold on purpose. What to cluster before it ever reaches the language model. We have built and accumulated small structures, in composable form, for orchestrating these uses of the language model. These patterns are partial answers to the questions we keep probing: how a language model can reach a genuinely new idea; how to keep a language model from collapsing onto a single self-evident answer.
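As an illustration of what "composable form" can mean, the sketch below chains small steps that each take and return a state dictionary. It is a toy under stated assumptions, not one of our actual patterns: `compose` and the step names are hypothetical, and grouping passages by their first word is only a stand-in for real clustering.

```python
from functools import reduce
from typing import Any, Callable, Dict, List

State = Dict[str, Any]
Step = Callable[[State], State]


def compose(*steps: Step) -> Step:
    """Chain steps left to right into a single step."""
    def pipeline(state: State) -> State:
        return reduce(lambda acc, step: step(acc), steps, state)
    return pipeline


def cluster_by_first_word(state: State) -> State:
    # Toy clustering: passages sharing a first word land in one group.
    groups: Dict[str, List[str]] = {}
    for passage in state["passages"]:
        groups.setdefault(passage.split()[0].lower(), []).append(passage)
    return {**state, "clusters": groups}


def keep_one_per_cluster(state: State) -> State:
    # Withhold near-duplicates so the model sees one passage per group.
    representatives = [members[0] for members in state["clusters"].values()]
    return {**state, "passages": representatives}


def build_prompt(state: State) -> State:
    context = "\n".join(f"- {p}" for p in state["passages"])
    return {**state, "prompt": f"{state['question']}\n\nContext:\n{context}"}


prepare = compose(cluster_by_first_word, keep_one_per_cluster, build_prompt)
result = prepare({
    "question": "Which pathway is implicated?",
    "passages": [
        "MAPK signaling is elevated.",
        "mapk activity rises in stress.",
        "Wnt signaling is unchanged.",
    ],
})
```

Because every step has the same shape, a step can be swapped, reordered, or removed without touching the others, and the decision about what reaches the language model stays explicit in the pipeline rather than buried in a prompt.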

As has been the case so far, some of this prior will be absorbed by future generations of language models. Machine learning is powerful, of course, but it grows powerful only when one can imagine its output. [Chain-of-thought prompting](https://arxiv.org/abs/2201.11903), [basic tool use (function calling)](https://arxiv.org/abs/2210.03629), simple [retrieval-augmented generation](https://arxiv.org/abs/2005.11401): these once existed as priors, and over the past few years, they have been resolved into parameters and distilled away. A posterior substrate has performed _functions_ of the prior before, but the causal order between prior and posterior has never itself been reversed. Scaling without either capital or purpose has lately led the market by misappropriating "the bitter lesson." Yet the market's headlines have, at times, been benighted from a technical point of view.

And the internal structure of a field, say the tacit consensus within a community, or an exact abstraction of the real world, cannot be internalized in any wholly self-contained way. That structure is not only informal; it shows itself only when it is explicitly chosen. At the frontier, abstraction errors do not merely mislead. They collapse. Where the boundary lies between what scaling absorbs and what it does not cannot be known in advance, and it is not sharp. Even so, we treat that boundary itself as part of the work.

We do not, of course, deny the disruptive power of emergence that scaling carries. Vast capital and compute will at times work out for themselves a domain's tacit order, one humans had not yet perceived, and hand it back to us. But for such an emergence to settle into the orbit of science requires, paradoxically, a more refined and robust vessel to hold it. When scaling raises a great wave, guiding that wave into something real remains the work of those who can surface a domain's underlying structure.

All of the above is written in the grammar of technology. But we do not practice technology for technology's sake. The speed and depth at which humanity understands the world is one of the most important variables there is. Should that variable cross a threshold, the world beyond it is nothing like the present one. We believe the precondition for that change is for AI to become not a tool of science but a participant in it, with agency of its own: a participant as scientist, one that generates hypotheses, designs experiments, and holds a position on what is known and what remains contradictory. The prior is the starting line that makes such participation possible. It is built by people who understand a domain deeply and can make its structure explicit, themselves participants of another kind. This is our definition of **_scientific superintelligence._**

We began this work in biology. Biology is large enough to penalize shallow abstraction, structured enough to reward abstractions that work, and close enough to experiment that a wrong hypothesis comes back, in the end, as a failed validation. One system we have built, for instance, generates scientific proposals that carry an experimental plan, rather than summaries of the literature or simple recombinations of self-evident claims. Some of those proposals are now undergoing validation in real laboratories.

What was once like magic is steadily becoming ordinary. We have been building what lies beyond it.