On April 7, Anthropic announced that Claude Mythos Preview had discovered thousands of zero-day vulnerabilities, including a 27-year-old flaw in OpenBSD, with exploits across every major operating system and browser and over 99% of findings still unpatched. The company declined general release, citing risk. It launched Project Glasswing, a vendor consortium granted privileged access. It set post-preview pricing at $25 per million input tokens and $125 per million output tokens. Within a week, leaked reports of Chinese interest in access had alarmed the White House.
Three things are happening at once. They should not be conflated.
The first is technically real. Mythos is genuinely capable at vulnerability discovery and exploit generation. This is not in dispute. It is a labor displacement event in offensive security, an asymmetric weapon in the offense/defense arms race, and a change in the economics of disclosure.
The second is a signifier operation. “Mythos” is brand naming doing semiotic work. It positions the model as origin story, founding legend, an artifact whose meaning exceeds its function. Coupled with the controlled-release narrative (“over 99% of vulnerabilities not yet patched, so it would be irresponsible for us to disclose”), the naming enrolls the product in a story about transcendence and stewardship. Nirit Weiss-Blatt’s distinction applies. [1] “Panic-as-a-Business”: the model is dangerous, so fund safety. “AI Panic Marketing”: the model is so dangerous it must be powerful; look how seriously we take it. Mythos is the latter, executed at a master-class level.
The third is mythomania. Dario Amodei’s Machines of Loving Grace (October 2024) [2] anticipated a “country of geniuses in a datacenter” by late 2026 or 2027, with intellectual capabilities exceeding Nobel laureates across most disciplines, capable of curing essentially all disease within a decade of arrival. The OSTP submission strengthened this to near-certainty. The February 2026 Dwarkesh interview softened to “I wouldn’t be surprised if I’m off by a year or two.” Each iteration calibrates the prediction downward while the rhetoric scales up. The product is named Mythos. The naming is the tell.
These three readings are not contradictory. Anthropic ships a real capability, deploys it as PR signifier, and exaggerates its trajectory simultaneously. The interesting question is what the capability proves.
Code as the closed domain
The technical fact about Mythos that almost no one states plainly: vulnerability discovery is the LLM domain par excellence. The deeper claim is older than computing. Code is a formal language in Chomsky’s sense, [3] with a fully specified context-free grammar, decidable parsing, and an interpreter that resolves meaning by deterministic procedure. Its execution is machine-verifiable. Its semantics are operational. The training corpus is effectively infinite (every public repository, every CVE writeup, every exploit POC, every academic paper, every conference transcript). The task structure is compression plus pattern recombination, which is what autoregressive transformers do.
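To see how degenerate the signifier-signified relation is, consider a minimal sketch: a toy context-free language whose entire semantics is the deterministic procedure below. The grammar and interpreter are illustrative constructions of mine, not anything from the Mythos announcement.

```python
# A toy context-free language whose meaning is exhausted by a
# deterministic procedure. Grammar (illustrative):
#   expr   := term (('+' | '-') term)*
#   term   := factor (('*' | '/') factor)*
#   factor := NUMBER | '(' expr ')'
import re

def tokenize(src: str):
    return re.findall(r"\d+|[()+\-*/]", src)

def parse_eval(tokens):
    """Recursive-descent evaluator: parsing is decidable, and the
    'semantics' is nothing beyond this procedure's return value."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def eat():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def factor():
        if peek() == "(":
            eat()                 # consume '('
            val = expr()
            eat()                 # consume ')'
            return val
        return int(eat())         # NUMBER

    def term():
        val = factor()
        while peek() in ("*", "/"):
            op, rhs = eat(), factor()
            val = val * rhs if op == "*" else val // rhs
        return val

    def expr():
        val = term()
        while peek() in ("+", "-"):
            op, rhs = eat(), term()
            val = val + rhs if op == "+" else val - rhs
        return val

    return expr()

# Meaning is fixed by the interpreter and verifiable by execution:
assert parse_eval(tokenize("(2+3)*4")) == 20
```

Nothing in this language points outside the system; correctness is settled by running the interpreter, which is exactly the property that makes code the native habitat of a next-token predictor.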
The philosophical pedigree of this domain is long. Leibniz proposed a characteristica universalis: [4] a formal language in which all reasoning would reduce to calculation, all disputes resolvable by calculemus. The project failed for natural language and survived only in mathematics, logic, and code. These are the domains where syntactic manipulation suffices because the signs do not point outside the system. The signifier-signified relation is degenerate. Code does not refer to a world that exceeds it; its world is exhausted by its grammar. Tarski’s hierarchy of object language and metalanguage works cleanly here because each level is fully formal. [5] Frege’s Sinn is preserved by syntactic role; Bedeutung is fixed by the interpreter’s semantics, not by any extensional reach into the world. [6] This is the bounded paradise in which autoregressive transformers are native.
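For concreteness, the schema Tarski’s hierarchy turns on, Convention T, in standard textbook form (my rendering, not a quotation):

```latex
% Convention T: for every sentence \varphi of the object language,
% the metalanguage must prove the corresponding biconditional.
\[
  \mathrm{True}(\ulcorner \varphi \urcorner) \leftrightarrow \varphi
\]
```

When both levels are formal, as with code and its interpreter, the schema closes without remainder; there is no world for the biconditional to miss.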
Mythos finding zero-days in OpenBSD does not contradict Yann LeCun’s argument that autoregressive LLMs are a dead end for general intelligence. [7] It confirms it. LeCun’s claim is that autoregressive systems cannot model the physical, causal, embodied world. Code is none of those. Code is text describing operations on symbols, executed by deterministic machines, against a state space the language fully describes. It is the inverse of what LeCun says LLMs cannot reach.
The doomer framing requires conflating two domains. If Mythos can crack OpenBSD, the argument runs, then it is on a trajectory toward general capability. But the trajectory is bounded by the substrate. LLMs excel inside the symbolic-deterministic envelope (code, mathematics, formal languages, syntactic surfaces with closed grammars). They degrade rapidly outside it (physical reasoning, causal inference, persistent memory, embodied tasks, theory of mind for non-stochastic agents, long-horizon planning). The envelope is structural, not contingent. It maps onto the divide between formal and natural languages that has been understood since the Vienna Circle [8] and rehearsed in every generation of AI since. The Mythos demonstration is evidence the envelope is tighter than the doomer reading suggests, not looser.
What the LLM cannot do
Theory of mind, in the technical sense (modeling the beliefs, intentions, and reasoning of other agents), is not a property Mythos has, however convincingly it performs one. The distinction Premack and Woodruff drew in 1978, [9] between possessing a theory of mind and simulating its outputs, is exactly the distinction the architecture forces here. Mythos is autoregressive sampling over a distribution shaped by RLHF. It does not represent the mental states of code authors. It pattern-matches against the syntactic signatures of vulnerabilities it has seen, or near-neighbors in latent space. This suffices for the task. It is not theory of mind. The Sally-Anne test [10] passes because the textual surface of the test is in the training data, not because the model represents Sally’s beliefs as distinct from Anne’s.
The philosophical structure is Searle’s Chinese Room scaled to transformer architectures. [11] Syntactic competence does not produce semantic understanding, and the absence of original intentionality is not a bug to be engineered around. It is the form of the artifact. The system manipulates symbols whose meaning, if any, is borrowed from the interpreters at either end of the pipeline (the humans who labeled the training data, the humans who read the output). Wittgenstein cuts cleaner: the limits of the LLM’s language are the limits of the LLM’s world. [12] Outside the formal grammar, there is no world for the model to operate on. Heidegger’s distinction between Vorhandenheit and Zuhandenheit lands in the same place from another angle. [13] The model has the present-at-hand: objects as theoretical entities described in tokens. It does not have the ready-to-hand: the involved, embodied, projective comportment that discloses a world in the first place. Merleau-Ponty’s body schema is the precondition for the kind of world-modeling LeCun is pointing at. [14] The model has no body schema. It has token statistics.
The deepest constraint is causal. Judea Pearl’s hierarchy distinguishes three rungs: association (P(Y|X), observational), intervention (P(Y|do(X)), experimental), and counterfactual (P(Y_x | X', Y'), retrospective). [15] Autoregressive LLMs operate at rung one and only rung one. They learn correlational structure in token sequences and sample from it. They have no do-operator, no counterfactual model, no mechanism by which to reason about an intervention that did not occur in their training distribution. Frege’s Sinn/Bedeutung distinction sharpens the same point. The model has Sinn (syntactic role, position in the inferential web of tokens) without Bedeutung (reference to a denotation outside the system). When the syntactic web is closed and self-contained, as in code, this suffices. When the syntactic web is supposed to be about a world (medicine, geopolitics, capital allocation, climate), the missing reference relation is the failure mode.
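A minimal simulation makes the rung-one/rung-two gap concrete. The structural model and its probabilities below are illustrative assumptions of mine: a confounder Z drives both X and Y, so conditioning on X = 1 and intervening to set X = 1 return different answers, and nothing in the observational stream alone separates them.

```python
# A confounder Z drives both X and Y; Y does not depend on X at all.
# Observing X=1 and intervening do(X=1) therefore disagree.
import random

random.seed(0)

def sample(do_x=None):
    z = random.random() < 0.5                                  # confounder
    x = do_x if do_x is not None else (random.random() < (0.9 if z else 0.1))
    y = random.random() < (0.8 if z else 0.2)                  # Y depends on Z only
    return x, y

def estimate(do_x=None, n=100_000):
    hits = total = 0
    for _ in range(n):
        x, y = sample(do_x)
        if do_x is not None or x:   # condition on X=1 in the observational case
            total += 1
            hits += y
    return hits / total

print(f"P(Y=1 | X=1)     ~ {estimate():.2f}")           # ~0.74 (association)
print(f"P(Y=1 | do(X=1)) ~ {estimate(do_x=True):.2f}")  # ~0.50 (intervention)
```

The back-door adjustment P(Y | do(X)) = sum over z of P(Y | X, z)P(z) recovers the interventional value, but applying it requires the causal graph, which is exactly what a rung-one learner does not have.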
The frame problem haunts this from a different direction. McCarthy and Hayes named it in 1969: [16] how does an agent know what is relevant to update when something in the world changes? Dennett’s parable of the robot in the bomb room generalizes it. [17] LLMs paper over the frame problem by being trained on text that already encodes human relevance judgments. They do not solve it. Outside the distribution, the frame problem returns immediately, which is why hallucination is structural and not a calibration issue. Hume’s induction problem [18] and Goodman’s grue paradox [19] describe the epistemic predicament: induction over symbolic surfaces does not yield the categories that pick out causal regularities in the world. The model cannot tell grue from green. It can tell only what the training data labeled.
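Goodman’s predicament can be rendered in a few lines. The cutoff time and predicates below are my toy construction, not anything from Goodman’s text:

```python
# "Green" and "grue" agree on every observation made before time t,
# so no induction over the observed symbols can separate them.
T_SWITCH = 100  # hypothetical cutoff time t

def is_green(color: str, t: int) -> bool:
    return color == "green"

def is_grue(color: str, t: int) -> bool:
    # grue: green if examined before t, blue otherwise
    return color == ("green" if t < T_SWITCH else "blue")

# Every observation so far is consistent with both predicates:
observations = [("green", t) for t in range(T_SWITCH)]
assert all(is_green(c, t) == is_grue(c, t) for c, t in observations)

# Yet they project differently to the unobserved case:
print(is_green("green", 150), is_grue("green", 150))  # True False
```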
Bender et al.’s “stochastic parrots” was always a description of architecture, not a dismissal of utility. [20] The parrot finds your zero-day. It cannot tell you whether a bridge will buckle. Whether a bridge buckles is a physical causation question: material yield strength, load distribution, resonance, real-world boundary conditions outside any closed grammar. An LLM can reproduce structural engineering text fluently. It cannot run the causal model. It associates “bridge” and “load” and “steel grade” with prior text; it does not intervene on the physical system or reason counterfactually about failure modes absent from the training distribution. Pearl’s rung-one constraint is the mechanism. The gap between fluent engineering prose and actual structural prediction is where interface fluency mistaken for substrate access becomes concretely dangerous. The same gap explains why it cannot tell you whether a clinical trial will replicate, whether a regime will fall, whether a customer will renew. These failures are not separate. They are the same failure, expressed across domains where the symbolic surface is not the substrate.
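The gap is easy to state concretely. Predicting buckling means running something like the calculation below (Euler’s critical load for a single pin-ended column; the member properties are hypothetical values of mine), where the answer comes from a physical model and its boundary conditions, not from the distribution of prior engineering prose.

```python
# Euler's critical buckling load, P_cr = pi^2 * E * I / (K * L)^2,
# for a pin-ended steel column. All values are assumed for illustration.
import math

E = 200e9   # Young's modulus for structural steel, Pa (assumption)
I = 8.0e-6  # second moment of area, m^4 (hypothetical cross-section)
K = 1.0     # effective length factor, pin-pin ends
L = 6.0     # unsupported length, m

P_cr = math.pi**2 * E * I / (K * L) ** 2
print(f"Critical buckling load ~ {P_cr / 1e3:.0f} kN")  # ~439 kN
```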
This is the technological substrate question. The economy is not source code. It is physical infrastructure, biological systems, social coordination, embodied labor, capital allocation, regulatory friction. LLMs touch the symbolic interface to those systems. They do not model the systems. The interface is what makes them useful in the closed domain and what makes them dangerous everywhere else, where users mistake interface fluency for substrate access.
The threat model that matters
The economic threat from Mythos is bounded but real.
Red-team and offensive-security labor: compressed. The $125/MTok output price makes industrial-scale exploit generation economically rational for any well-capitalized adversary (see the cost sketch after this list).
Defender economics: asymmetric in the short run. Defenders get the tool via Glasswing, but defender workflows are slower and more constrained by deployment realities. Time-to-patch becomes the binding constraint.
Software supply chain risk: the 27-year OpenBSD bug is the headline. The implication is that hardened legacy systems contain undiscovered vulnerabilities at densities no one has priced in. This is an asset-pricing event for any business whose moat is incumbent software.
Regulatory capture: the Glasswing consortium, the China-access leak, the controlled-release framing all build the case for Anthropic as critical infrastructure, with the regulatory protection and state subsidy that designation implies. This is the most durable economic move in the announcement.
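A back-of-envelope sketch of the adversary economics in the first item above, using the announced $25/$125 per-million-token pricing. The per-attempt token budgets and campaign size are my assumptions, not Anthropic’s figures.

```python
# Rough campaign cost at the announced post-preview pricing.
PRICE_IN = 25 / 1_000_000    # $ per input token (announced)
PRICE_OUT = 125 / 1_000_000  # $ per output token (announced)

tokens_in_per_attempt = 50_000   # assumed: codebase context fed to the model
tokens_out_per_attempt = 8_000   # assumed: candidate exploit plus analysis
attempts = 10_000                # assumed industrial-scale campaign

cost = attempts * (tokens_in_per_attempt * PRICE_IN
                   + tokens_out_per_attempt * PRICE_OUT)
print(f"Campaign cost ~ ${cost:,.0f}")  # ~ $22,500 at ~$2.25 per attempt
```

On these assumptions, an entire industrial-scale campaign prices in the low tens of thousands of dollars, which is the labor compression the first item describes.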
What is not happening: the dissolution of human cognitive labor across the economy. The emergence of general agency. The trajectory toward Amodei’s “country of geniuses.”
The mythomania problem
Naming a model Mythos when the CEO has a track record of mythopoeic predictions about near-term capability is a marketing decision that reads as confession. The product is real. The story it is enrolled in is not. The investors writing the checks ($30B Microsoft, $15B Nvidia, gigawatt-scale TPU buys from Google) need the story to be true. The revenue case requires it. The capex requires it. The narrative is functional, not descriptive.
LeCun, who left Meta in November 2025 to raise $1.03B for AMI Labs on the explicit thesis that autoregressive LLMs are a dead end, is doing the inverse trade. So is Gary Marcus. So, in a different register, are Bender and Hanna. The “godfathers” disagree publicly and durably. Hinton and Bengio sit on one side. LeCun and Marcus sit on the other. As Weiss-Blatt observes, the boosters and the doomers are selling the same product: the model as civilizational lever. They differ only in affect.
Stakes
If the Mythos framing prevails, the regulatory and economic infrastructure built around it (export controls calibrated to LLM capability, enterprise security pricing anchored to existential framing, state subsidization of “critical AI” providers, central bank attention to AI-driven productivity assumptions) outlasts the underlying technology. The myth becomes load-bearing. When the capability ceiling becomes visible, and the LeCun argument suggests it will, the institutions remain. The asymmetry worth pricing is not whether Mythos can find zero-days. It can. It is whether the institutional infrastructure built on the assumption that Mythos generalizes survives the discovery that it does not.
The model is real. The myth is a different artifact.
References
[1] Weiss-Blatt, N. The Techlash and Tech Crisis Communication. Routledge, 2022. The “Panic-as-a-Business” vs. “AI Panic Marketing” taxonomy is developed across chapters 4 and 7.
[2] Amodei, D. “Machines of Loving Grace.” darioamodei.com, October 2024.
[3] Chomsky, N. Syntactic Structures. Mouton, 1957. The formal language hierarchy (regular, context-free, context-sensitive, recursively enumerable) establishes the theoretical basis for treating programming languages as a proper subset of formal systems.
[4] Leibniz, G.W. “Dissertatio de Arte Combinatoria.” 1666. The characteristica universalis program is outlined across his correspondence from the 1670s onward; see Philosophical Papers and Letters, ed. Loemker, Kluwer, 1989.
[5] Tarski, A. “The Concept of Truth in Formalized Languages.” 1933. Translated in Logic, Semantics, Metamathematics. Oxford, 1956.
[6] Frege, G. “Über Sinn und Bedeutung.” Zeitschrift für Philosophie und philosophische Kritik, 100, 1892. Translated as “On Sense and Reference” in Translations from the Philosophical Writings of Gottlob Frege, ed. Geach and Black, Blackwell, 1952.
[7] LeCun, Y. “A Path Towards Autonomous Machine Intelligence.” OpenReview preprint, 2022. Extended across public lectures and interviews through 2025-2026. The thesis that autoregressive LLMs cannot support world modeling, persistent memory, or causal structure underpins his departure from Meta and the AMI Labs founding thesis.
[8] Schlick, M., Carnap, R., Neurath, O., et al. Wissenschaftliche Weltauffassung: Der Wiener Kreis. 1929. The logical empiricist program drew a sharp line between meaningful formal/empirical statements and nonsense, anticipating the symbolic/sub-symbolic divide in AI.
[9] Premack, D. and Woodruff, G. “Does the chimpanzee have a theory of mind?” Behavioral and Brain Sciences, 1(4), 1978.
[10] Baron-Cohen, S., Leslie, A.M., and Frith, U. “Does the autistic child have a ‘theory of mind’?” Cognition, 21(1), 1985. The Sally-Anne false-belief task operationalizes the Premack-Woodruff construct: a subject passes only if they correctly attribute to Sally a belief the subject knows to be false.
[11] Searle, J. “Minds, Brains, and Programs.” Behavioral and Brain Sciences, 3(3), 1980. The Chinese Room argument against strong AI: syntactic symbol manipulation does not constitute semantic understanding or original intentionality.
[12] Wittgenstein, L. Tractatus Logico-Philosophicus. Kegan Paul, Trench, Trubner, 1922. Proposition 5.6: “The limits of my language mean the limits of my world.”
[13] Heidegger, M. Sein und Zeit. 1927. The Vorhandenheit/Zuhandenheit distinction is developed in Division One, sections 15-18. Harper and Row translation by Macquarrie and Robinson, 1962.
[14] Merleau-Ponty, M. Phénoménologie de la perception. Gallimard, 1945. The body schema (schéma corporel) as the pre-reflective condition of spatial and practical world-disclosure is developed in Part One. Routledge translation by Smith, 1962.
[15] Pearl, J. and Mackenzie, D. The Book of Why: The New Science of Cause and Effect. Basic Books, 2018. The three-rung causal hierarchy is introduced in Chapter 1. Formal treatment in Pearl, J. Causality: Models, Reasoning, and Inference. Cambridge, 2000.
[16] McCarthy, J. and Hayes, P.J. “Some Philosophical Problems from the Standpoint of Artificial Intelligence.” Machine Intelligence, 4, 1969.
[17] Dennett, D.C. “Cognitive Wheels: The Frame Problem of AI.” In Minds, Machines and Evolution, ed. Hookway. Cambridge, 1984. The R1/R1D1/R2D1 robots and the bomb-on-wagon scenario illustrate the frame problem as a structural obstacle to general agency.
[18] Hume, D. An Enquiry Concerning Human Understanding. 1748. Section IV: “Sceptical Doubts Concerning the Operations of the Understanding.”
[19] Goodman, N. Fact, Fiction, and Forecast. Harvard University Press, 1955. Chapter 3 introduces “grue” as the new riddle of induction: syntactic regularities in observation do not determine which predicates project legitimately to unobserved cases.
[20] Bender, E.M., Gebru, T., McMillan-Major, A., and Shmitchell, S. “On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?” Proceedings of FAccT 2021. ACM, 2021.