Stateful Compute
Thoughts on the future of computation engines on planet Earth and beyond.
Intro
I want to gather several thoughts in one place. They are all connected, I promise, although it may not be obvious from the very beginning.
This post will touch on:
Attestation: Secure enclaves and verifiable computation.
Web3: Blockchain, protocols, finality, staking, and slashing.
Durability: Persistent, deterministic, exactly-once execution.
Trust: How users and service providers get what they both want.
And ultimately:
What does the convergence of all of the above look like?
Just Imagine ©
Just imagine a world where no CPU cycles are wasted.
A world where intra-continental user interactions take double-digit milliseconds, near-instant.
And trans-continental user interactions still feel real-time. You probably don’t want to play your favorite first-person shooter or real-time strategy game across the ocean at this ping. But it should at least be playable.
Imagine the smoothest-running video game on your computer, or your gaming console. It’s probably 160 FPS or more these days, and it responds to your keystrokes and mouse/joystick movements instantly. There is no excuse for any (!) other app you use to not be this slick.
Now think back to how long your Slack or Teams or Zoom takes to load. This is what I believe should die and will die, and Stateful Compute is what will accelerate its fading away.
In the world of Web3 the above translates to near-instant token and asset transfers and settlements. It’s a human right, to begin with, that you can transfer what you own anywhere, in no time.
In the world of consumer apps, both load times and cross-user interactions should be instant. We learned how to update software on the go decades ago. It’s a crime against humanity to make the user wait after their intent has been explicitly expressed.
Especially if their intent has a matching counter-intent from another user. Such as Alice wanting to read the message Bob is sending her. Or Charlie wanting Dwayne to sign that contract. Or Emily getting her travel authorization approved. Or Frank submitting his tax forms to the respective authority. Or George accepting the invite to join some online whiteboarding session without having to refresh their browser. None of these should take more than an eyeblink of time. In fact, an eyeblink is far too slow for some of the above.
In the world of security and regulation, every single operation must be signed by the private key of the user who initiated it. And this signature is journaled, so that it can be verified at a later time. At the very least, let’s talk about tech such as Face ID or Touch ID — confirming anything sensitive with them is a must.
As a tech scene, we have converged on a solution that is both easy to use and offers a high degree of baseline security. Yet, I know of exactly zero customer-facing apps where every message or every code commit is signed with Touch ID. If anything, I’d expect that running payroll should involve something easy yet secure, right?
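To make this concrete, here is a minimal sketch of per-operation signing and journaling. It assumes Ed25519 keys and the Python `cryptography` package; in the real flow, the private key would live in a Secure Enclave behind Touch ID rather than in process memory.

```python
# A minimal sketch, not a production design: sign every operation and
# journal it for later audit. Requires the `cryptography` package.
from cryptography.hazmat.primitives import serialization
from cryptography.hazmat.primitives.asymmetric.ed25519 import (
    Ed25519PrivateKey, Ed25519PublicKey,
)

journal: list[tuple[bytes, bytes, bytes]] = []  # (payload, signature, signer public key)

def submit(user_key: Ed25519PrivateKey, payload: bytes) -> None:
    """The user signs the operation; the service journals it verbatim."""
    public_bytes = user_key.public_key().public_bytes(
        serialization.Encoding.Raw, serialization.PublicFormat.Raw)
    journal.append((payload, user_key.sign(payload), public_bytes))

def audit() -> None:
    """Anyone can re-verify every journaled operation later; raises on forgery."""
    for payload, signature, public_bytes in journal:
        Ed25519PublicKey.from_public_bytes(public_bytes).verify(signature, payload)

alice = Ed25519PrivateKey.generate()  # ideally, this key never leaves a Secure Enclave
submit(alice, b'{"action": "run_payroll", "period": "2025-06"}')
audit()
```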
Also, the above has a lot to do with privacy. End-to-end encryption means that bring-your-own-device (BYOD) is the baseline default that should be universally accepted. Not everybody will bring their hardware wallet or their air-gapped security device with them. But no product, in my worldview, truly qualifies for end-to-end encryption if it does not seamlessly support those external security devices.
So that it is physically impossible for, say, WhatsApp, to see the contents of the message that Alice just sent to Bob. For this message can only materialize in plain text on Alice’s and Bob’s secure devices respectively, over which Meta has no control — and the only thing that Alice and Bob are using from WhatsApp is its identity and transport layer. They will probably be paying for it, since BYOD is a premium feature. I’d say one cent per day would be a fair price, as long as Alice and Bob do not text each other more often than once per second.
The above also organically solves the token and asset transfer problem once and for all. But the Web3 crowd already knows it. We just need more awareness worldwide.
The opportunity is massive, and it is waiting for its own next-big-thing-disrupting product.
In a nutshell, in this post I am outlining the worldwide computation model that encompasses the most disjoint parts of today’s state of the art. From pay-for-hardware on-premise clusters and pay-per-time-used cloud compute engines, all the way to pay-per-journaled-record blockchain-based execution.
I do have conviction that these dots will ultimately connect. I do believe that compute and storage should be considered as one whole. And I do believe “database” transactions should always respect “business logic” invariants, which is impossible without blending data and code together.
In other words, everything from CRUD If-Unmodified-Since to compensating transactions and SAGAs should go away. User-defined business logic and database transactions are one. If some code changes some data, this data is changed atomically, for every other piece of code that may be running concurrently and trying to access the same data.
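As a toy illustration of that blending, consider the sketch below. The `transact` helper is hypothetical, and a single process-wide lock stands in for a real transaction engine; the point is only that the invariant check and the mutation commit as one unit.

```python
import threading
from contextlib import contextmanager

_lock = threading.Lock()
accounts = {"alice": 100, "bob": 0}  # toy in-memory state

@contextmanager
def transact():
    """Toy stand-in for 'code and data change together, atomically'."""
    with _lock:  # a real engine would use MVCC or consensus, not one lock
        yield accounts

def transfer(src: str, dst: str, amount: int) -> None:
    # Business logic and the "database" mutation are one atomic unit:
    # no If-Unmodified-Since retry loops, no compensating saga steps.
    with transact() as state:
        if state[src] < amount:
            raise ValueError("insufficient funds")  # invariant enforced in-transaction
        state[src] -= amount
        state[dst] += amount

transfer("alice", "bob", 40)
```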
Or, on a higher level, the Starlink fleet is already well-equipped to provide the users on Earth with low-latency 100%-available services that cover the vast majority of human needs. You are probably reading and liking or sharing this post online. My take is that those middlemen solutions should go away, since it’s the {storage + compute + network} layer, and this layer alone is what is needed. And nobody will ask “does Starlink run Mongo?”, because, well, obviously, no — it runs Stateful Compute.
Attestation and Secure Enclaves
The vast majority of modern computers have hardware security chips built in. And, as an industry, we are starting to use them more broadly.
The natural first example of where security chips come into play is UEFI firmware updates. Firmware is ultra-sensitive, and thus should only be updated in an extremely secure way. In practice, this translates to having the public key of the hardware vendor itself physically whitelisted within the device that ships to the customer. So that this device can confirm with 100% certainty that the update it is being requested to apply is indeed signed with the private key of the hardware manufacturer.
This is not perfectly bulletproof, since the private key of the vendor can be leaked or stolen. It also does not protect end users from sophisticated state actor attacks. Nonetheless, it is far better than nothing.
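The check itself is tiny. A toy sketch follows, with a freshly generated key standing in for the vendor key that was burned into the device at the factory:

```python
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

# Stand-in for the vendor key physically whitelisted in the device.
vendor_key = Ed25519PrivateKey.generate()
BAKED_IN_VENDOR_PUBLIC_KEY = vendor_key.public_key()

def firmware_update_is_genuine(image: bytes, signature: bytes) -> bool:
    """The device applies an update only if the vendor's signature checks out."""
    try:
        BAKED_IN_VENDOR_PUBLIC_KEY.verify(signature, image)
        return True
    except InvalidSignature:
        return False

image = b"uefi-firmware-v2.1"
assert firmware_update_is_genuine(image, vendor_key.sign(image))
assert not firmware_update_is_genuine(image, b"\x00" * 64)  # tampered signature
```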
In practice, most of us do not update UEFI manually. What we do use, however, are passkeys: the “Touch ID to sign in” flow on modern laptop computers. The way it works behind the scenes is quite ingenious. For each passkey, a private/public key pair is generated within the device — not by the CPU, but by the Secure Enclave of this physical device. The software, in this case the Web browser, can never access the private key within this Secure Enclave. Moreover, even an attacker with unconstrained physical access to your device would not be able to get that key — even if they were to tear the device apart under an electron microscope, the hardware makes it next to impossible to extract that private key. The secure chip’s logic does, however, allow the user to sign requests with this private key — via Touch ID.
This way the party communicating with your device can know with certainty that it is you who is currently physically accessing the device. Effectively, every laptop that supports passkeys has a small “hardware wallet” within itself. It does not store DeFi tokens — although it may well do so! — but it does keep enough data to securely identify the user per their physical request.
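Conceptually, the contract looks like the sketch below. Real passkeys use WebAuthn and P-256 rather than Ed25519, and the class here is purely illustrative:

```python
import os
from cryptography.hazmat.primitives import serialization
from cryptography.hazmat.primitives.asymmetric.ed25519 import (
    Ed25519PrivateKey, Ed25519PublicKey,
)

class ToySecureEnclave:
    """Conceptual model: the key pair is generated inside, and the private
    half never leaves; signing is gated on a simulated user-presence
    check, standing in for the Touch ID prompt."""

    def __init__(self) -> None:
        self._key = Ed25519PrivateKey.generate()  # never serialized, never exported

    def public_key_bytes(self) -> bytes:
        return self._key.public_key().public_bytes(
            serialization.Encoding.Raw, serialization.PublicFormat.Raw)

    def sign(self, challenge: bytes, user_present: bool) -> bytes:
        if not user_present:  # the hardware refuses without the user's touch
            raise PermissionError("user presence required")
        return self._key.sign(challenge)

# The relying party (say, a website) verifies a fresh challenge:
enclave = ToySecureEnclave()
challenge = os.urandom(32)  # a fresh nonce prevents replaying old signatures
signature = enclave.sign(challenge, user_present=True)
Ed25519PublicKey.from_public_bytes(enclave.public_key_bytes()).verify(signature, challenge)
```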
Most modern mobile devices have a Secure Enclave as well. Signal is famous for making good use of Apple’s secure chips on iPhones. Specifically, it allows users to see which of their contacts are present on Signal without exposing any user’s phonebook data to Signal’s servers — which is quite a selling point for a pro-privacy messenger.
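Heavily simplified, the contact-discovery idea can be sketched as follows; the real system also has to hide memory access patterns from the host machine, which no short sketch can capture:

```python
import hashlib

def _h(phone: str) -> bytes:
    return hashlib.sha256(phone.encode()).digest()

# The registered-user set, visible only inside the enclave.
REGISTERED = {_h(p) for p in ("+15550001", "+15550002", "+15550003")}

def discover_contacts_inside_enclave(client_contacts: list[str]) -> list[str]:
    """Runs *inside* the attested enclave: the host OS and the service
    operator only see encrypted traffic in and out, never the phonebook."""
    return [p for p in client_contacts if _h(p) in REGISTERED]

print(discover_contacts_inside_enclave(["+15550002", "+15550099"]))  # one match
```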
The next logical step is that the users of cloud providers, such as Amazon Web Services, Google Cloud Platform, or Microsoft Azure, can provision instances with built-in secure enclaves. Without going deeply into the internals, this protects companies that host their services in the cloud — and, by extension, the users of these services — from a wide variety of attacks. Users of those secure enclaves get access to attestation — cryptographically secure proofs that the software code that was run on those machines is exactly what the maintainers of this software have intended it to be: a proof that nobody has messed around with this code and with the user data it is processing.
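A toy model of that attestation handshake, skipping the vendor certificate chains it relies on in practice: the hardware signs a measurement, a hash, of the code it actually loaded, and the user compares that measurement with the hash of the code they audited.

```python
import hashlib
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

hardware_key = Ed25519PrivateKey.generate()  # fused into the chip by the vendor

def enclave_quote(loaded_code: bytes, nonce: bytes) -> tuple[bytes, bytes]:
    """What the enclave hardware produces: a signed measurement of its code."""
    measurement = hashlib.sha256(loaded_code).digest()
    return measurement, hardware_key.sign(measurement + nonce)

def user_verifies(published_code: bytes, nonce: bytes,
                  measurement: bytes, signature: bytes) -> bool:
    if measurement != hashlib.sha256(published_code).digest():
        return False  # the machine is not running the code we audited
    # Simplified: a real verifier checks the vendor's certificate chain.
    hardware_key.public_key().verify(signature, measurement + nonce)  # raises if forged
    return True

code = b"def handle(request): ..."
nonce = b"fresh-random-nonce"
assert user_verifies(code, nonce, *enclave_quote(code, nonce))
```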
Byzantine Fault-Tolerance and Trust
This computation strategy is now getting married to the Web3 community, and the interplay is quite interesting.
For blockchain code, Secure Enclaves, or Trusted Execution Environments (TEEs), truly are a game-changer. Granted, one cannot blindly rely on Secure Enclaves being unhackable — mistakes and leaks do happen. So, qualitatively, one can argue that an enclave is not a true walled garden. But quantitatively, the improvement Secure Enclaves offer over plain non-secure physical machines is massive.
A good analogy is that instead of renting a physical machine to run your code on, you’re renting a Dracula Castle, with physical security and a separate power supply. So, as long as you trust the vendor of this hardware, nobody can see what your code is doing in that castle.
Byzantine trust is about the cost/reward ratio of the most sophisticated attack. If, say, instead of relying on the mathematics of cryptography, the Ethereum ledger were to entrust itself to one of the Secure Enclave vendors, there is a decent chance it would soon be broken into. It is far too lucrative to find the right set of people and companies to influence, threaten, or bribe in order to gain access to the huge amount of wealth locked in users’ private wallets.
On the other hand, the world of the blockchain has found ways to be effective and secure with no Trusted Execution Environments whatsoever. So, naturally, the blockchain folks know like no one else how to best leverage this extra security layer provided by enclaves. Simply put, one Dracula Castle is not secure enough. But a few dozen of them, built and maintained by different vendors, are about as good as a decentralized community of thousands and thousands of John and Jane Does.
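One way to picture “a few dozen castles”: accept an answer only when enclaves from enough distinct vendors agree on it, so that compromising any single vendor buys the attacker nothing. A toy sketch with hypothetical vendor names:

```python
from collections import Counter

def multi_vendor_result(attested_results: dict[str, bytes], quorum: int) -> bytes:
    """attested_results maps enclave vendor -> the (attested) output that
    vendor's enclave produced for the same input. Accept an answer only
    if enclaves from `quorum` distinct vendors agree on it."""
    counts = Counter(attested_results.values())
    answer, votes = counts.most_common(1)[0]
    if votes < quorum:
        raise RuntimeError("no cross-vendor quorum; refusing to trust a single castle")
    return answer

result = multi_vendor_result(
    {"intel-sgx": b"ok", "amd-sev": b"ok", "aws-nitro": b"ok", "vendor-x": b"tampered"},
    quorum=3,
)  # three distinct vendors agree, so b"ok" is accepted
```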
In fact, this is where things get juicy. The Web3 community values trust over everything else. Say what you want about ICOs and NFTs, but, on the protocol level, the degree of durability and resiliency of Web3 solutions is extremely high.
But there is no free lunch. In exchange for maintaining this trust, the protocols must rely on a massively decentralized self-organizing system. And, if anything, this system adds latency to seemingly trivial operations, such as Alice sending tokens to Bob.
To provide a guarantee as trivial as preventing double-spend:
The Bitcoin protocol literally offers no hard finality. Users simply wait for several blocks after the one that contains their transaction. Statistically, it is very unlikely that the longest chain will overwrite a block that is buried under four or five other blocks, but history has documented such cases. Bitcoin simply lives with probabilistic finality, and this decision is deeply ingrained in its core proof-of-work, longest-chain protocol (there is a back-of-the-envelope sketch of these odds after this list).
The Ethereum protocol, after switching to proof-of-stake, can confirm transactions faster. But even though the confirmations arrive quickly, they are still not final: there is a small yet nonzero chance they will be overwritten. The actual finality arrives roughly every fifteen minutes, after two epochs, on Ethereum’s L1 ledger.
Various L2 protocols can offer faster finality. But this offers little in practice, since L2 transactions still have to be rolled up into their respective L1 chains before the transaction truly settles.
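As for the “wait for several blocks” rule above, here is the attacker catch-up probability from section 11 of the Bitcoin whitepaper, which is exactly what that waiting quantifies:

```python
import math

def catch_up_probability(q: float, z: int) -> float:
    """Probability that an attacker controlling fraction q of the hash
    power ever rewrites a transaction buried under z confirmations
    (the formula from section 11 of the Bitcoin whitepaper)."""
    p = 1.0 - q
    lam = z * (q / p)
    prob = 1.0
    for k in range(z + 1):
        poisson = math.exp(-lam) * lam ** k / math.factorial(k)
        prob -= poisson * (1.0 - (q / p) ** (z - k))
    return prob

for z in (1, 2, 4, 6):
    print(f"{z} confirmations, 10% attacker: {catch_up_probability(0.10, z):.6f}")
```

Even against an attacker holding 10% of the hash power, six confirmations push the reversal odds below 0.03%, which is why probabilistic finality is livable, if slow.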
This dichotomy, the wide gap between fast Trusted Execution Environments on the one hand and slow, high-latency transaction settlement on the other, offers a massive arbitrage opportunity.
On the one hand, over the past few decades we have learned how to build systems that reliably handle millions and millions of transactions per second. On the other hand, solutions that consistently demonstrate performance this high are centralized by design.
And where they are not centralized, and perhaps even open source, there is no guarantee that the very code that is run by the system will execute exactly as advertised: the user has to trust the service owner, and the service provider has to trust the cloud provider, and the cloud provider has to trust the hardware manufacturer.
Stateful Compute
We want to be moving towards a world where:
(1) the software we use is low-latency, high-throughput, and transparent in its business logic, and, at the same time,
(2) its users do not have to trust the entire stack of middleware and hardware that powers this software.
The Web3 community has solved (2), but in solving (1) their solutions inevitably became more centralized.
High-performance engineers know how to solve (1), but they got too used to their users trusting them, and they got too used to trusting their own hardware vendors.
Besides, both (1) and (2) present unique — and different! — challenges that affect development velocity badly. Correctly dealing with large-scale distributed systems is hard, especially when “turning it off and back on” is not an option.
If only we could build software such that:
It can assume it runs on a single, planetary-scale computer,
Which is powerful enough to serve all of its users, and which has enough operational memory to never have to go to “the database”,
Which is durable, resilient, and self-healing, so that developers do not need to worry about it malfunctioning or dying midway,
And which executes every operation predictably, deterministically, without repetitions, exactly once (there is a sketch of this contract below).
If the above were true, we would be living our best lives — both the hosts and the users of this software.
Stateful Compute is about enabling these best lives.
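And the “exactly once” bullet has a concrete observable contract. A minimal sketch, with illustrative names, using an idempotency journal; a real engine would persist and replicate this journal durably:

```python
# Every request carries a client-chosen idempotency key; the engine
# journals the result, so a retried request replays the recorded answer
# instead of re-executing the side effect.
results: dict[str, object] = {}  # a durable journal in a real system

def execute_once(idempotency_key: str, operation) -> object:
    if idempotency_key in results:
        return results[idempotency_key]  # duplicate delivery: replay, don't redo
    outcome = operation()  # the operation itself must be deterministic
    results[idempotency_key] = outcome
    return outcome

counter = {"value": 0}
def increment():
    counter["value"] += 1
    return counter["value"]

assert execute_once("req-42", increment) == 1
assert execute_once("req-42", increment) == 1  # retried, not re-applied
assert counter["value"] == 1
```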
Yes, This Solves Distributed Consensus Too
In the interest of keeping this post high-level, I will refrain from talking about pesky low-level details. I wrote about them in several posts before.
Ultimately, to make Stateful Compute a reality, we need to solve three fundamental challenges:
Distributed consensus: Have N machines do more work than one machine, while tolerating failures and presenting themselves as one logical node.
The CAP theorem: How to maintain data consistency in a world where the machines and the network between them are “worst-case unreliable”.
The latency/consistency tradeoff: How to quickly declare completed transactions completed, with no room for the system to have to roll back.
All three problems are hard, even individually. But our expertise in both datacenter-level resiliency and blockchain-grade protocols makes us confident we can do it.
Moreover, we can do it so that you do not have to: by presenting the Stateful Compute solution as an engineering abstraction, enabling developers to build general-purpose applications on top of our stack.
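To make one of the three concrete, here is the classic quorum-overlap trick behind many consistency designs: choose read and write quorum sizes R and W with R + W > N, so every read set intersects every write set. A toy sketch:

```python
# With N replicas, R + W > N guarantees every read quorum overlaps every
# write quorum, so a read always sees the latest acknowledged write
# (assuming versioned values, as below).
N, W, R = 5, 3, 3
assert R + W > N  # the overlap condition

replicas = [{"version": 0, "value": None} for _ in range(N)]

def write(value, version):
    for replica in replicas[:W]:  # a real system picks any W live replicas
        replica.update(version=version, value=value)

def read():
    quorum = replicas[N - R:]  # deliberately the "other end": still overlaps
    newest = max(quorum, key=lambda r: r["version"])
    return newest["value"]

write("hello", version=1)
assert read() == "hello"  # the overlap guarantees the read sees the write
```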
Honorable mention goes to:
The Erlang programming language, for implementing massive parallelism and zero-downtime on-the-fly code updates,
The Ethereum Virtual Machine, for showing that general-purpose trustworthy zero-trust compute is possible,
And Temporal, for cracking open the space of reliable code execution in domains that are far from Web3 and blockchain.
Closing Thoughts
Last but not least: the above will not only be faster. It will also be much more reliable, and much cheaper.
The effective operational costs of something as trivial as Slack are well under one dollar a year for an average customer. It might go into dozens of dollars if they spend hours in video calls, upload gigabytes of multimedia to shared channels, and send thousands of messages every day. And yet, a company such as Slack — or, in this case, Salesforce — charges ~100x this amount to “keep the lights on”.
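Here is a back-of-the-envelope version of that claim. Every number below is an illustrative assumption, not measured data:

```python
# Back-of-the-envelope only; all inputs are illustrative assumptions.
messages_per_day = 200
bytes_per_message = 1_000          # ~1 KB including metadata (assumed)
storage_cost_per_gb_year = 0.25    # assumed commodity object-storage pricing
requests_per_day = 2_000
cost_per_million_requests = 0.50   # assumed commodity serverless pricing

storage_gb_year = messages_per_day * bytes_per_message * 365 / 1e9
yearly_cost = (storage_gb_year * storage_cost_per_gb_year
               + requests_per_day * 365 / 1e6 * cost_per_million_requests)
print(f"~${yearly_cost:.2f} per user per year")  # lands well under a dollar
```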
Some could say it’s because the ongoing product development work is expensive. Others could say it’s because of vendor lock-in and a lack of regulations that force transparency and interoperability.
I say it’s just because we do not have the infrastructure rails that make building the above at scale possible.
Which was quite a shame ten years ago. And in 2025, with Cursor being the fastest-growing product on planet Earth, it’s not just a shame — it’s gross negligence that, I believe, should not be tolerated.
In the history of the Internet, we have had periods when one well-timed tech-enabling product divided the world into before and after. And the after was evolving much faster than the before, hence the famous hockey-stick graphs. Developers from older generations will remember the Apache Web Server, Postgres, PHP, Ruby on Rails. StackOverflow was the greatest thing ever. The younger folks appreciate Redis and Stripe and React and GitHub. The Kool-Aid these days is probably Cursor plus some free-tier-friendly cloud database and hosting providers.
The above has enabled us to ship software quickly. Shipping reliable software, though, has only gotten harder. And this is where we need the next revolution.
Stateful Compute is this revolution. It’s much faster, much cheaper, and universally configurable. So when a school kid in Nigeria builds something their friends can use, the entire planet can also use it, the same day. So that it does not lag and does not have the childhood traumas of broken authentication or leaked payment methods.
I, for one, can’t wait until #StatefulCompute becomes mainstream.

