
Before AI Capability Can Live in Hardware, There Must Be a Protocol

Existing AI infrastructure has agreements at the call layer, but no protocol layer to answer four questions about a capability: what substrate it runs on, at what fidelity, who verifies it, and whether it can migrate safely. This essay discusses that missing meta-protocol layer.


The next decade of AI will not be decided by model size alone. Equally consequential is whether the billions of devices already shipped — sitting in pockets, on factory floors, in vehicles — can credibly host the capabilities the cloud is now growing.

Today, most AI capability lives in the cloud. Models are trained and served inside data centers; capabilities are invoked through APIs; hardware mostly handles input and display. But large numbers of devices are already running in the physical world — phones, vehicles, embedded controllers, industrial sensors, edge gateways. They have compute. They have identity and local data. What they do not have is a shared way to declare: what they can actually run, at what fidelity, who verifies it, and whether it can be safely migrated when their service life ends. Without that shared language, those devices can only wait for whole-system upgrades or get retired early as "out-of-date hardware."

This essay is not about a new model. It is about the protocol layer missing between the cloud and that installed base. It belongs neither to centralized inference nor to standalone on-device execution — it sits between capability declarations and the substrates that run them, defining how the two hold each other accountable while still letting capability evolve, accumulate, and move across heterogeneous hardware. Rotifer Protocol is an open-source framework we are building in that direction; it is one concrete candidate along this path, and this essay does not claim exclusivity. The companion paper Where Capability Lives, and How Hardware Earns the Right to Run It develops the full argument; this short essay is an entry point for time-constrained readers.


Three Sentences That Are Not the Same

Most capability drift originates from collapsing three different sentences into one.

"X is possible."

"X is possible on this kind of hardware."

"X is possible on the hardware in your hands right now."

A protocol that does not distinguish these sentences will let any product compress them into one. The first travels well in keynotes. The third is the only one that pays interest on the loan.

Recent information-theoretic work — the epiplexity framework introduced by Finzi et al. (2026), which redefines information content relative to a computationally bounded observer — makes this distinction formalizable: capability is not a property of a problem; it is a property of the pair (problem, observer). Two device generations facing the same workload are not running the same race at different speeds — they are running races with finish lines in different places. No amount of software effort raises an observer's computational budget; software gets better, but substrates remain finite. The protocol's job is to mediate between the two by making substrate-awareness first-class, so that capability declarations and the hardware that honors them stay accountable to each other.


What Is Missing Is Not a New Model — It Is a Protocol Layer

Cloud capability is growing — that is a fact. The installed base cannot absorb most of it one-to-one — that is also a fact. Several attempts to bridge the two already exist.

Each path has a real success region. None of them, alone or in combination, supports distributed intelligence at installed-base scale.

The fourth path — the one we have been building toward — is a meta-protocol layer through which devices can declare what they actually do, attest the substrate they run on, and exchange capabilities with the rest of the network without surrendering control to any centralized layer.

By "meta-protocol" we mean a protocol about how protocols themselves are declared, negotiated, and evolved — it does not dictate how a capability is implemented; it standardizes only how that capability is described, verified, and circulated.
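A declaration in that spirit can be sketched in a few lines. This is an illustrative model only, assuming nothing about the actual Rotifer specification: every field name here is hypothetical, chosen to mirror the four questions the essay opened with.

```python
from dataclasses import dataclass

# Hypothetical sketch of a capability declaration: the meta-protocol
# standardizes how a capability is described, verified, and circulated,
# not how it is implemented. Field names are illustrative, not drawn
# from any published Rotifer specification.

@dataclass
class CapabilityDeclaration:
    capability_id: str   # what the capability claims to do
    substrate: str       # what it runs on (compute class, runtime)
    fidelity: str        # "native", "wrapped", or "hybrid"
    verifier: str        # who attests the declaration (e.g. a TEE quote)
    migratable: bool     # whether it can move to another substrate safely

    def answers(self) -> dict:
        """The four questions the call layer alone cannot answer."""
        return {
            "what substrate": self.substrate,
            "at what fidelity": self.fidelity,
            "who verifies": self.verifier,
            "can it migrate": self.migratable,
        }

decl = CapabilityDeclaration(
    capability_id="summarize-v1",
    substrate="mobile-npu",
    fidelity="native",
    verifier="tee-attestation",
    migratable=True,
)
print(decl.answers())
```

The point of the sketch is the separation of concerns: nothing in the declaration constrains how `summarize-v1` is implemented, only how it is described and held accountable.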


What HTTP Did, and What AI Has Not Done Yet

We hypothesize — and welcome public scrutiny — that this protocol layer may be to AI capability what HTTP was to documents.

In 1991, the Web did not exist. By 2001, it was rewriting commerce, education, and software. The technical precondition was a single thing: a protocol that did not own the content but defined how content could be linked, addressed, and rendered by anyone. HTTP did not invent text. It did not invent the network. What it did was define a coordination layer at which two unrelated parties could agree on what a document was. The Web's value flowed through HTTP, but HTTP itself remained light, unowned, and evolvable.

Compare that to the current state of AI capability: there is no agreed-upon way for one system to ask another "what can you do, on what substrate, at what fidelity, with what verifiable guarantees?" There is no analog of an HTML document for a unit of intelligence — no portable, inspectable, citable, evaluable artifact. Function-calling tool schemas and MCP-style descriptions are improvements at the SDK layer, not the protocol layer. They standardize a calling convention; they do not standardize the substrate-awareness that distinguishes a capability that can run from a capability that should run.

How far the analogy holds is an empirical question that will take time to answer. The working assumption is: far enough to be worth doing seriously.


The Math Just Started Working

A merely continuous improvement in edge inference would not change the architectural conversation. What has actually happened in 2026 is qualitatively different.

For a class of multi-step agent workflows — tool calling, intermediate reasoning, structured output, several rounds of decision — the throughput threshold has become concrete. Public reports for Google's Gemma 3 family indicate decode rates around 7–8 tokens per second on Raspberry Pi 5 CPU for the smaller variants, and 30+ tokens per second on Qualcomm-class mobile NPUs for the next variant up [^gemma3]. These rates are sufficient to support a roughly 4,000-token input followed by two skill invocations within a wall-clock budget that users will accept as interactive.
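A back-of-envelope check makes the threshold concrete. The decode rate below is the cited NPU-class figure; the prefill throughput and per-step output lengths are assumptions for illustration only, not measured values.

```python
# Back-of-envelope latency for the workload named above: a ~4,000-token
# input followed by two skill invocations. Decode rate is the cited
# NPU-class figure (30+ tok/s); prefill rate and output lengths are
# ASSUMED for illustration, not benchmarks.

decode_tps = 30.0       # tokens/s decode, mobile-NPU class (cited)
prefill_tps = 500.0     # tokens/s prefill -- ASSUMED
input_tokens = 4_000    # input size named in the essay
steps = 2               # two skill invocations
tokens_per_step = 150   # output tokens per invocation -- ASSUMED

prefill_s = input_tokens / prefill_tps
decode_s = steps * tokens_per_step / decode_tps
total_s = prefill_s + decode_s
print(f"prefill {prefill_s:.0f}s + decode {decode_s:.0f}s = {total_s:.0f}s")
```

Under these assumptions the workload lands in the tens-of-seconds range on NPU-class hardware; whether that counts as interactive is exactly the empirical question the next paragraph hedges.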

We are inclined to read this as a qualitative shift rather than incremental gain — the same workload that previously required cloud round-trips can now, with reasonable engineering effort, be edge-resident. Whether this view holds, and at what device-coverage breadth, requires further falsifiable experiments across broader benchmarks and a wider device set. Based on already-public benchmarks, some recent flagship smartphones, some current vehicle infotainment platforms, and the higher tiers of industrial gateways have started crossing this interactive threshold — concrete coverage figures need hardware profiling work in cooperation with OEMs.

The more cautious version of the claim: for this class of multi-step agent workflows, the bottleneck is shifting from silicon itself to the absence of a protocol layer.

[^gemma3]: Numbers are drawn from Google's Gemma 3 model card and third-party benchmarks on Raspberry Pi 5 / Qualcomm AI Engine; specific figures vary with quantization scheme, precision, and runtime implementation.


TEE: Where Capability Declarations Take Root in Silicon

If the protocol is to make capability declarations accountable to hardware, then this layer needs a physical entry point inside the hardware itself. On the existing installed base, that entry point is the Trusted Execution Environment (TEE) — a hardware-isolated execution mode in the device's silicon that can attest that a specific binary actually ran inside a protected boundary; it is now standard in modern smartphones, vehicle ECUs, and many industrial gateways.

The protocol's L0 Kernel specification has, from the start, listed TEE as one of four legitimate trust backends — alongside distributed ledgers, cryptographic signature chains, and HSMs. What this essay argues is operational, not architectural: among the four, TEE is the only one whose deployment surface is co-extensive with consumer-facing hardware — which makes it a reasonable first choice for plugging the meta-protocol into the installed base, not the only option.

Three properties make this role distinctive: hardware-enforced isolation, so the surrounding system cannot tamper with what runs inside the protected boundary; attestation, so a remote party can verify which binary actually ran there; and installed-base ubiquity, since TEEs already ship as standard in modern smartphones, vehicle ECUs, and many industrial gateways.

A TEE alone has no opinion about what a capability is. It can attest that a particular binary ran in a particular isolated state and produced a particular output; it cannot say whether the binary was a faithful implementation of a published capability, whether the output composed correctly with other capabilities, or whether resource declarations matched actual usage. Those are exactly the questions the meta-protocol layer is designed to answer.

TEE provides hardware-trusted; the meta-protocol provides capability-known. Both are necessary; neither is sufficient alone.
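The division of labor can be sketched as two independent checks that must both pass. Everything here is illustrative: real TEE quotes are signed hardware measurements, not bare hashes, and the registry of published declarations is a stand-in for the meta-protocol layer.

```python
import hashlib

# Toy model of the division of labor above: the TEE answers
# "hardware-trusted" (did this exact binary run in isolation?), while
# the protocol layer answers "capability-known" (is that binary a
# published, declared capability?). All names are hypothetical.

published_capabilities = {
    # capability id -> hash of the binary its declaration was published for
    "summarize-v1": hashlib.sha256(b"summarize-binary-v1").hexdigest(),
}

def tee_attests(binary: bytes, quoted_hash: str) -> bool:
    """Stand-in for attestation: the quote binds a measured binary hash."""
    return hashlib.sha256(binary).hexdigest() == quoted_hash

def protocol_knows(capability_id: str, quoted_hash: str) -> bool:
    """Stand-in for the meta-protocol check: hash matches a declaration."""
    return published_capabilities.get(capability_id) == quoted_hash

binary = b"summarize-binary-v1"
quote = hashlib.sha256(binary).hexdigest()

trusted = tee_attests(binary, quote)           # hardware-trusted
known = protocol_knows("summarize-v1", quote)  # capability-known
print(trusted and known)  # both necessary; neither sufficient alone
```

Note that `tee_attests` passing says nothing about whether the binary is a faithful implementation of anything, and `protocol_knows` passing says nothing about whether it actually ran in isolation: each check is blind to the other's question.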


How Capability Survives on a Device

Up to this point this essay has deliberately stayed inside a small vocabulary — capability, device, protocol layer, substrate, fidelity. Below is the more specific vocabulary Rotifer Protocol uses for this layer; each term corresponds to an engineering distinction that capability must survive when it lives across heterogeneous hardware.

Phenotype: The set of capabilities a device can actually express, distinct from the set it could in principle support.
Fidelity: The degree to which a capability honors its original declaration on a given substrate — the same capability may exist as Native (compiled in), Wrapped (API-mediated), or Hybrid.
Imprinting: The local experience a capability accumulates on a specific device, for a specific user, in a specific network environment — this value is local by nature and should not be force-generalized.
Adapter: The translation layer used when a capability moves across substrates — across devices, across fidelity tiers, across TEE families.
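A toy model shows how the first two terms interact: a Phenotype is what falls out when declared capability demands meet a concrete substrate. The tier names come from the essay; the numeric compute units and the wrapping-overhead bound are invented for illustration.

```python
from enum import Enum

# Hypothetical sketch of the vocabulary above: a device's Phenotype is
# the set of capabilities it can actually express, each at some Fidelity
# tier. Compute units and the 2x wrapping bound are ASSUMED, purely
# illustrative values.

class Fidelity(Enum):
    NATIVE = "native"      # compiled in
    WRAPPED = "wrapped"    # API-mediated
    HYBRID = "hybrid"      # mixed (not modeled in this toy)
    UNSUPPORTED = "unsupported"  # exceeds the device's compute class

def phenotype(device_compute: int, demands: dict) -> dict:
    """Map each capability's declared demand onto this device's substrate."""
    result = {}
    for cap, needed in demands.items():
        if needed <= device_compute:
            result[cap] = Fidelity.NATIVE
        elif needed <= 2 * device_compute:  # ASSUMED wrapping overhead bound
            result[cap] = Fidelity.WRAPPED
        else:
            result[cap] = Fidelity.UNSUPPORTED
    return result

demands = {"keyword-spotting": 1, "summarize": 6, "long-context-agent": 20}
print(phenotype(device_compute=4, demands=demands))
```

The useful property of the model is that the same `demands` dictionary yields a different Phenotype on every substrate, which is exactly the distinction between "possible on this kind of hardware" and "possible on the hardware in your hands."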

Putting this vocabulary back onto the cleanest deployment surface — the smartphone:

Consider a five-year-old smartphone in active use today. Under current industry defaults, this device has two futures: either it gets retired because newer capabilities cannot reach it, or it limps along on capability promises that progressively fail to match what the user was told at purchase. Both futures are wasteful, and both are recurrent.

The meta-protocol offers a third future. The device declares its actual Phenotype: which capabilities it can run Natively, which only Wrapped, which exceed its compute class entirely. Its TEE attests that those declarations are honest. The device does not pretend to support what it cannot, and the protocol does not let it. In return, the device receives capabilities sized to its substrate and accumulates Imprinted local value across its remaining operational life — a model of one user's habits, one device's interaction patterns, one network environment's quirks. That value cannot generalize to other users. It does not need to.

When the user eventually replaces the device, the protocol's Adapter layer treats cross-device migration as a form of cross-fidelity translation, attested at both endpoints. This part is currently a draft of the Adapter design with no production implementation — what is described here is target behavior, not delivered capability.
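Since the Adapter design is a draft with no production implementation, the following is a sketch of target behavior only; every name is invented for illustration, and real endpoint attestations would be signed TEE quotes rather than bare hashes.

```python
import hashlib

# Sketch of the TARGET behavior described above: migration as a
# translation attested at both endpoints. The Adapter layer is a draft
# with no production implementation; all names here are hypothetical.

def endpoint_attest(device_id: str, payload: bytes) -> str:
    """Stand-in for a per-endpoint attestation over the migrated payload."""
    return hashlib.sha256(device_id.encode() + payload).hexdigest()

def migrate(payload: bytes, src: str, dst: str) -> dict:
    """Both endpoints attest the same payload; a verifier can compare."""
    return {
        "payload_hash": hashlib.sha256(payload).hexdigest(),
        "source_attestation": endpoint_attest(src, payload),
        "dest_attestation": endpoint_attest(dst, payload),
    }

record = migrate(b"imprinted-local-model", src="old-phone", dst="new-phone")
print(sorted(record))
```

The design point the sketch captures is symmetry: neither endpoint's word is taken alone, so a dishonest source cannot fabricate a migration the destination never received, and vice versa.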


What This Essay Does Not Claim

To prevent the kind of capability drift this argument itself diagnoses, three exclusions are explicit:

  1. This essay does not claim that engineering work to deploy a TEE-backed Binding for Rotifer Protocol is complete or imminent. The argument here is at the strategic and narrative layer, decoupled from the engineering priority of the protocol's near-term release schedule. This essay is being released ahead of full implementation because methodology benefits from public critique before its first measurement is produced.
  2. This essay does not claim that TEE heterogeneity is solved. The five major TEE families currently deployed do not interoperate at the protocol layer today. Bridging them is the responsibility of the Adapter layer; cross-TEE attestation is one of the most concrete near-term open questions.
  3. This essay does not claim that Rotifer becomes a hardware company. Rotifer remains a protocol layer. A Binding is a contract under which a runtime can host the protocol; a TEE-backed Binding would be one such contract. The Foundation does not propose to manufacture silicon, certify devices, or operate TEE infrastructure on behalf of OEMs.

These exclusions are not boilerplate. They are the substrate the rest of the argument depends on.


The Unusual Success Criterion of a Protocol

The success criterion for a meta-protocol is not the same as for a product. A successful product becomes increasingly important to its creators; a successful protocol makes its creators increasingly replaceable. HTTP outlasted its original commercial supporters because the protocol's value migrated away from any single party. The deepest test of a meta-protocol is whether it can keep running after its originating organization steps back.

Rotifer Foundation operates a privileged node within the protocol network. That privilege exists in capacity, in centrality, in early-adopter access. It does not exist in necessity. The protocol's design treats Foundation-operated infrastructure as one privileged node among several — privileged because it was first, not because the protocol depends on it. The most successful version of this story is one where other privileged nodes — operated by partners, communities, competitors, and entities the Foundation has no relationship with — run alongside, and the protocol thrives without distinguishing between them.

To be explicit: in the early protocol phase, the Foundation continues to carry critical engineering coordination and specification maintenance responsibilities. "Replaceable" is a long-term success marker, not a current state.


Open Questions and How to Engage

For readers who find the argument worth engaging with, four channels exist.

Open-source contribution — the protocol's specification, reference implementations, and companion papers are publicly available under permissive licenses. Implementation feedback, specification review, and Adapter contributions are welcome through the open-source community.

Academic collaboration — the information-theoretic framework, the Capable Edge profile, and the cross-fidelity translation analysis each connect to active research traditions. Population biologists, complex-systems theorists, mechanism designers, information theorists, and embedded-systems researchers whose tools we have adopted are invited to collaborate and push back.

OEMs / integrators — the protocol's longer-horizon track includes Binding work for which the only realistic engineering path requires industry participation. Conversations on this track do not assume immediate commercial commitments; they are about the shape of a Binding spec that could, on a multi-year horizon, support production deployment.

Early ecosystem participants — the Foundation's strategy is structured around being a privileged node within an open ecosystem rather than a platform that captures the ecosystem's value.

Open questions this essay does not pretend to answer remain, among them cross-TEE attestation across the five deployed families and the concrete device-coverage breadth of the interactive threshold.

The full argument — including the information-theoretic foundations, the protocol's substrate-aware vocabulary, the honest layering of implementation status, and the open questions still active — is in the companion paper Where Capability Lives, and How Hardware Earns the Right to Run It. This essay is the entry point. The reader is invited to disagree on every page.