Monday, May 25, 2026

Research Before Concrete



Why the cheapest line item in artificial intelligence is the one nobody is funding

The AI industry is committing trillions of dollars to buildings, power, and silicon, and a rounding error to the question of whether the architecture inside them is the right one. That ratio — not the buildout itself — is the mistake.


I. What is being funded, and what is not

There is an extraordinary amount of capital moving through the artificial intelligence industry, and it is worth being precise about where it goes.

It goes into concrete and steel: the shells of data centers, rising across three continents at a pace the construction industry has rarely seen. It goes into power — substations, transmission upgrades, transformers with multi-year lead times, electricity contracts that run for decades, and in a growing number of cases dedicated generation built for a single customer. It goes into silicon: hundreds of thousands of accelerators per large facility, each one a small fortune, replaced every few years. Independent estimates put the cumulative figure in the multiple trillions of dollars across the second half of this decade. It is one of the largest concentrations of private capital expenditure in history.

Almost none of it goes into asking whether the architecture being poured into those buildings is the durable one.

That asymmetry is the subject of this essay. The buildout is, in the end, a bet — a vast, concentrated, physical, multi-year, largely irreversible bet — that the way artificial intelligence computes today is the way it will compute for the economic life of the assets being built. And the research that would tell you whether that bet is sound is being funded at a tiny fraction of the rate of the bet itself. We are paying for the city before we have paid for the survey.

Let me be exact about the claim, because the strength of the argument depends on its modesty. The argument is not that the buildout should stop. Demand for AI computation is real, it is here today, and it must be served on the hardware that exists. The argument is not that the current architecture is wrong — it may well prove durable. The argument is narrower, and I believe it is very hard to refute once stated plainly: research into alternative computing architectures is the cheapest hedge available against the most expensive mistake on the table, and it is currently being treated as an afterthought when the scale of the capital at risk makes it a precondition.

To see why, we need to look at three things in turn: how the present architecture became an unquestioned assumption, how wide the space of alternatives actually is, and what it costs — financially — to be wrong.

II. How a practical choice became an unexamined assumption

Modern artificial intelligence runs, almost in its entirety, on one operation performed densely and at colossal scale: multiply, then add. Matrix multiplication is the heartbeat of every large model in production today. Training a model is matrix multiplication; running it is matrix multiplication; the accelerators are matrix-multiplication engines, and the data centers are buildings designed to feed and cool matrix-multiplication engines. The entire industrial base is an investment in performing one operation faster.

It is important to say clearly that this was not a mistake of ignorance. Multiplication earned its place, and it earned it twice over.

It earned it first through the mathematics of learning. The algorithm at the heart of modern deep learning — backpropagation — requires smooth, differentiable operations, so that an error signal can flow backward through a network and adjust millions or billions of parameters in the right direction. Multiplication and addition over continuous numbers are perfectly smooth and differentiate cleanly. The discontinuous, all-or-nothing operations that a more brain-like system might use are hostile to gradient-based learning. Multiplication was not adopted because anyone proved intelligence must be multiplicative. It was adopted because it was the operation that could be made to learn.

It earned it a second time through a historical accident of hardware. When deep learning began to work (the watershed is usually dated to around 2012), the ideal hardware for it already existed, built for a completely unrelated market. Graphics processing units, designed to render video-game imagery, happened to be machines for performing dense matrix multiplication massively in parallel. The AI field did not have to wait a decade for someone to design it a substrate. It inherited one, fully formed, along with a software ecosystem that grew up around it. The fit between the algorithm and the silicon was so good that it produced the fastest technological transformation in living memory.

So the choice was correct. Nothing in this essay disputes that. But watch what happened next, because it is the quiet center of the whole problem.

A practical choice — use dense multiplication, because it learns well and the hardware exists — hardened, through a long sequence of individually reasonable decisions, into infrastructure. The infrastructure, in turn, hardened into an assumption: intelligence runs on dense multiplication, so the path forward is more and faster dense multiplication. No one announced this transition. No committee ratified it. It accreted. Each new chip generation, each new data center, each new financial model built on the last, until the industry was planning at the scale of trillions as though the operation itself were a fixed constant of nature, and only its speed and price were left to vary.

You can see the hardened assumption most clearly in the financial models that project the buildout. They are sophisticated documents. They stress-test the cost of power, the price of land, the lead time of transformers, the depreciation schedule of silicon. And almost without exception they carry today's architecture forward across every future year as a given — varying how fast and how cheap that one operation becomes, never whether it remains the operation at all.

That is what an unexamined assumption looks like from the outside: the thing that was once a variable has silently stopped being treated as one. The remainder of this essay is an argument for putting it back.

III. The arithmetic underneath, and the work that need not be done

Before turning to alternatives, it is worth understanding why the current architecture is expensive in the first place — because the expense is not where most people assume.

In a digital circuit, multiplication and addition are not remotely equal in cost. A hardware multiplier is, in effect, a dense array of adders; the work it does grows roughly with the square of the bit width of the numbers involved. The work of an addition grows only linearly. A single bitwise logic operation, such as an AND or an XNOR, is nearly free by comparison. Measured purely at the arithmetic unit, replacing a full-precision multiply-and-accumulate with bitwise logic and a simple count of set bits is on the order of a thirty- to one-hundred-fold reduction in energy. That figure alone would be reason enough to take alternatives seriously.
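To make "bitwise logic and a simple count of set bits" concrete, here is a minimal sketch in Python. It assumes both vectors hold only +1 and -1 values packed one per bit; the function names are illustrative and not taken from any particular library or paper. It shows only that the result matches an ordinary multiply-and-add, and says nothing about energy.

```python
def pack_signs(values):
    """Pack a list of +1/-1 values into an integer bitmask (bit i set means +1)."""
    mask = 0
    for i, v in enumerate(values):
        if v == 1:
            mask |= 1 << i
    return mask


def xnor_popcount_dot(a_bits, b_bits, n):
    """Dot product of two length-n vectors of +1/-1, given their bitmasks."""
    # XNOR marks the positions where the two signs agree; mask to n bits.
    agree = ~(a_bits ^ b_bits) & ((1 << n) - 1)
    matches = bin(agree).count("1")  # count of set bits (popcount)
    # Each agreement contributes +1 and each disagreement contributes -1.
    return 2 * matches - n


a = [1, -1, 1, 1, -1, -1, 1, -1]
b = [1, 1, -1, 1, -1, 1, 1, -1]

reference = sum(x * y for x, y in zip(a, b))          # multiply, then add
fast = xnor_popcount_dot(pack_signs(a), pack_signs(b), len(a))
print(reference, fast)
```

The two values agree because a product of two signs is +1 exactly when the signs match, so counting matches is enough to recover the sum.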

But the arithmetic unit is not where most of the energy in a modern AI chip goes. The dominant cost is not computing numbers. It is moving them: hauling weights and activations out of memory, across the chip, between chips, and across the network that binds a data center together. Fetching a number from off-chip memory can cost hundreds to thousands of times more energy than the arithmetic operation that then consumes it. This is known in computer architecture as the memory wall, and it has a profound implication for this discussion: the real prize is not merely a cheaper operation. It is smaller data and less movement.

This is why the most promising alternatives attack the problem from two directions at once. A weight simple enough that multiplication collapses into a sign-flip is also a weight that takes a fraction of the space to store, a fraction of the energy to move, and a fraction of the bandwidth to transmit. The cheap operation and the cheap data movement arrive together. An architecture built on radically simpler weights is not 30% more efficient. It can be most of an order of magnitude more efficient, because it shrinks the term that actually dominates the energy budget.
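A small sketch of how a sign-flip can stand in for multiplication, and why the same weights are also cheaper to store. It assumes weights restricted to -1, 0, and +1 (one possible reading of "radically simpler weights"; the essay does not name a specific scheme), a 2-bit encoding per weight, and a 16-bit baseline format. All names and sizes here are illustrative assumptions.

```python
def ternary_matvec(weights, x):
    """Compute W @ x for weights in {-1, 0, +1} using only add and subtract."""
    out = []
    for row in weights:
        acc = 0.0
        for w, xi in zip(row, x):
            if w == 1:
                acc += xi      # +1: add the input as-is
            elif w == -1:
                acc -= xi      # -1: sign-flip, then add
            # 0: the input is skipped entirely
        out.append(acc)
    return out


def storage_bytes(n_weights, bits_per_weight):
    """Bytes needed to store n_weights at the given precision."""
    return n_weights * bits_per_weight / 8


W = [[1, 0, -1],
     [-1, 1, 1]]
x = [0.5, 2.0, -1.0]

y = ternary_matvec(W, x)
y_reference = [sum(w * xi for w, xi in zip(row, x)) for row in W]

# Assumed formats: 16 bits per weight for the baseline, 2 bits for ternary.
ratio = storage_bytes(1_000_000, 16) / storage_bytes(1_000_000, 2)
print(y, y_reference, ratio)
```

Under these assumed formats the same weight count occupies one eighth of the space, which is the sense in which the cheaper operation and the cheaper data movement arrive together.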

There is a second, deeper inefficiency, and it points at an even larger opportunity. A dense matrix multiplication computes every element of its result — including the enormous number of elements that are zero, negligible, or irrelevant to the final answer — because processing the entire grid is simply what dense multiplication hardware does. Yet the networks themselves are not dense in their behavior. In a large language model, only a fraction of the internal units are meaningfully active for any given input; most contribute nothing to that particular result. Today's hardware computes them anyway. It spends energy, at scale, multiplying numbers that do not matter by other numbers, and then adding zero.

An architecture that could skip that work — that touched a unit only when the unit actually had something to contribute — would save not a marginal percentage but a large multiple, because it would simply not perform the majority of the operations that current hardware performs. This is the principle of event-driven, or "lazy," computation: do the work only where and when there is work to do. It is not an exotic idea. It is, as the next section describes, how the most capable intelligence we know of already operates.
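The difference between computing the whole grid and touching only the active units can be sketched by counting operations. This is a toy illustration with made-up numbers, not a model of any real accelerator: it assumes inactive units have an activation of exactly zero, and the function names are hypothetical.

```python
def dense_layer(weights, activations):
    """Multiply every weight by its input, zero or not. Returns (outputs, ops)."""
    ops = 0
    outputs = []
    for row in weights:
        acc = 0.0
        for w, a in zip(row, activations):
            acc += w * a
            ops += 1
        outputs.append(acc)
    return outputs, ops


def event_driven_layer(weights, activations):
    """Touch a column of weights only when its input is active (non-zero)."""
    ops = 0
    outputs = [0.0] * len(weights)
    for j, a in enumerate(activations):
        if a == 0.0:
            continue  # silent input: no work is done for it
        for i, row in enumerate(weights):
            outputs[i] += row[j] * a
            ops += 1
    return outputs, ops


weights = [[1.0, 2.0, 3.0, 4.0],
           [0.5, -1.0, 2.0, -2.0]]
activations = [0.0, 2.0, 0.0, 0.0]   # only one of four inputs is active

dense_out, dense_ops = dense_layer(weights, activations)
event_out, event_ops = event_driven_layer(weights, activations)
print(dense_out, dense_ops, event_out, event_ops)
```

Both versions return the same outputs; the event-driven one performs work only for the single active input. Real hardware pays scheduling and indexing overheads that this count ignores.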

IV. Three reasons to believe the design space is wide

Here is the heart of the matter. If the present architecture were the only workable way to compute intelligence, then committing trillions to it would carry no architectural risk — there would be nowhere else for the workload to go. The case for funding research rests entirely on the opposite being true: that the design space is wide, real, and under-explored. Consider three independent pieces of evidence, drawn from three different domains, that this is so.

The first reason is biological. The one machine we know for certain runs general intelligence — the human brain — does not multiply. A biological neuron does not perform floating-point matrix multiplication. It integrates incoming electrical signals, and when that accumulation crosses a threshold, it fires; then it falls quiet. It is event-driven by nature: a neuron receiving no input does almost nothing and costs almost nothing. At any given moment, the overwhelming majority of the brain's roughly eighty-six billion neurons are silent. The system as a whole sustains language, perception, reasoning, and memory on a power budget of about twenty watts — the draw of a dim light bulb.
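The integrate, cross a threshold, fire, fall quiet cycle described above is often abstracted as an integrate-and-fire unit. The sketch below is a deliberately simplified textbook-style model, not a claim about real neurons: the threshold, the optional leak factor, and the reset-to-zero rule are all assumptions chosen for illustration.

```python
def integrate_and_fire(inputs, threshold=1.0, leak=1.0):
    """Return a 0/1 spike train for a stream of input values.

    leak=1.0 means the accumulated potential is kept in full between steps;
    values below 1.0 let it decay (an optional assumption).
    """
    potential = 0.0
    spikes = []
    for current in inputs:
        potential = potential * leak + current   # integrate incoming signal
        if potential >= threshold:
            spikes.append(1)                     # fire
            potential = 0.0                      # then fall quiet
        else:
            spikes.append(0)
    return spikes


inputs = [0.5, 0.5, 0.0, 0.0, 0.25, 0.75, 0.0]
spikes = integrate_and_fire(inputs)
print(spikes)
```

The unit produces output only on the steps where accumulated input crosses the threshold, and on steps with no input it does nothing beyond carrying its state forward.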

This observation must be handled honestly, because it is easy to overstate. The brain being event-driven does not prove that event-driven computation is superior. "Nature does it this way, therefore it is better" is a weak form of argument with a long record of being wrong; aircraft do not flap their wings, and they fly farther and faster than any bird. Engineering is permitted, and often wise, to diverge from biology. So the claim here is deliberately narrow: the brain is an existence proof. It demonstrates beyond any possible dispute that a system can exhibit the highest known form of general intelligence without dense floating-point multiplication, and on an energy budget some five to eight orders of magnitude below our engineered approach. It does not tell us our approach is wrong. It tells us, with total certainty, that our approach is not the only one — that the operation is contingent, not fundamental.

The second reason is digital, and it is recent. For most of the history of deep learning, the multiplication assumption was safe in practice for a simple reason: no one knew how to build a competitive model without it, and an efficient architecture that cannot match the quality of the dominant one is merely a curiosity. That barrier has now been breached. Research models that run on addition instead of multiplication — using weights so severely constrained that the multiplication effectively disappears, replaced by a sign-flip and a count — have reached quality comparable to conventional networks while reporting roughly an order-of-magnitude reduction in inference energy. This must also be stated with its limits intact: it has been demonstrated convincingly for inference, not for training models at the absolute frontier; it is a credible and rising direction, not a settled victory. But the wall between "dense multiplication is mandatory" and "addition is sufficient" demonstrably now has a door in it, and that door has been walked through and the results published. The contingency that biology asserts in principle, this work demonstrates in engineering practice.

The third reason is physical, and it is the most striking of the three. It is possible to remove the multiplication not merely from the arithmetic, but from the computing substrate altogether: to arrange matters so that the computation falls out of physics directly, with no arithmetic unit performing it at all. Binary values can be encoded as the phase of a beam of light. When two such phase-encoded beams are brought together and allowed to interfere, the result is governed by wave physics: beams in phase reinforce one another into a bright output, beams out of phase cancel into darkness. That interference is a logical comparison, an XNOR, performed by nature as the light propagates, consuming almost nothing. A simple photodetector then counts how many comparisons came out bright. Even the threshold operation that conventionally requires an expensive nonlinear function can be performed by the intrinsic optical nonlinearity of a photonic-crystal cavity. This is the basis of an emerging class of optical binary-attention architectures: designs (and in some cases granted or pending patents) for performing the most computationally expensive part of a transformer with light rather than with transistors.
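The interference step can be checked with a few lines of idealized wave arithmetic. The sketch assumes two unit-amplitude, perfectly coherent beams, with bit 0 encoded as phase 0 and bit 1 as phase pi, no loss, and no noise; the detector is modeled as a simple brightness threshold. These are simplifying assumptions for illustration, not a description of any specific device or patent.

```python
import cmath
import math


def interfere(bit_a, bit_b):
    """Normalized output intensity of two phase-encoded unit beams."""
    field = cmath.exp(1j * math.pi * bit_a) + cmath.exp(1j * math.pi * bit_b)
    # |field|^2 is 4 when the beams are in phase, so divide to normalize to 1.
    return abs(field) ** 2 / 4.0


def bright_count(bits_a, bits_b, bright_threshold=0.5):
    """Idealized photodetector: count the positions that come out bright."""
    return sum(
        1 for a, b in zip(bits_a, bits_b)
        if interfere(a, b) > bright_threshold
    )


bits_a = [0, 1, 1, 0]
bits_b = [0, 1, 0, 1]

for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, round(interfere(a, b), 6))

count = bright_count(bits_a, bits_b)
print(count)
```

Equal bits give full brightness and unequal bits give darkness, which is the XNOR truth table; the count of bright outputs is the same quantity a digital popcount would produce.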

The third reason demands the most careful honesty, and honesty is exactly what makes it persuasive rather than fanciful. This work is largely at the stage of proposed and patented architecture, supported by physical modeling and simulation, not fabricated and mass-produced silicon. Photonic computing is genuinely difficult — optical components are large, sensitive, and hard to integrate at density — and other research groups are already pursuing photonic transformers of various kinds, so this is not virgin territory. The claim is emphatically not "optical computing has arrived and it wins." The claim is precisely the modest one on which this entire essay turns: it is another point in the design space, and a radically distant one, swapping not merely the arithmetic operation but the very physical medium in which computation occurs.

Now set the three side by side. Event-driven instead of dense. Addition instead of multiplication. Light instead of electrons. Each is independently credible. Each rests on a different foundation — one on biology, one on digital engineering results, one on physics and optical design. Each is, today, funded at a small fraction of the rate of the buildout. And each one widens the space of viable architectures that the current concrete-and-silicon commitment is, implicitly and without having said so, betting against. The design space is not a narrow corridor with one obvious path. It is a wide and sparsely mapped territory — and the buildout has staked everything on a single coordinate within it.

V. Two kinds of obsolescence

A wide design space would be merely interesting, rather than financially urgent, were it not for a specific risk it creates. To see that risk clearly, it helps to distinguish two kinds of obsolescence, because the industry has thoroughly internalized one and barely acknowledged the other.

The AI infrastructure industry already worries about obsolescence — intensely, publicly, and with precise vocabulary. Hardware that still functions perfectly but costs far more to operate than the current generation is described as OpEx-obsolete: it is not broken, it is simply uneconomic to run. Industry figures argue openly that AI accelerators have a true economic life of one to two years, even as they are carried on five- and six-year depreciation schedules; prominent investors have warned that the gap between those numbers flatters present earnings and stores up future write-downs. There are already documented cases of paid-for compute sitting underutilized or stranded. The anxiety is real, and the people voicing it are serious and well-informed.

But examine the shape of that anxiety. Every serious analysis of it traces the same line: this chip generation, then a faster one, then a faster one after that. The risk being modeled is that a quicker version of the same machine strands the slower version. Call this incremental obsolescence. It is real, it is expensive, and — this is the crucial point — the industry already knows how to survive it. You survive incremental obsolescence by refreshing your hardware on a schedule and amortizing accordingly. It is painful and capital-hungry, but it is a known game with known rules, and the buildout's financial models account for it.

It is worth pausing on how thoroughly this incremental framing has captured even the buildout's most prominent skeptics — because it shows the blind spot is not confined to the optimists. Mark Cuban, no one's idea of a credulous bull, has questioned the buildout's economics pointedly: he has argued that processing will get faster and cheaper sooner than expected, that "a lot of the numbers being thrown out there aren't going to come to fruition," and that some companies have gone all-in while spending more cash than they have available. He has flagged the circular financing binding the chipmakers, the model labs, and the cloud providers together. These are sharp and valuable criticisms. But notice their shape: every one of them is incremental. Cuban's case is that the same machine will get cheaper faster than the spending assumes — not that the machine itself might change. The most visible bear on the buildout is reasoning entirely within the incremental frame. That is how complete the capture is: the industry's optimists and its skeptics are, for the most part, arguing about the speed of the same assumed architecture, while the question of whether the architecture itself is durable goes almost entirely unasked.

There is a second kind of obsolescence, and the buildout is barely pricing it at all. Architectural obsolescence occurs when the operation itself, or the substrate itself, changes — when the workload migrates to a different computational primitive, or a different physical medium. This is categorically different from incremental obsolescence, and the difference is the entire point. You cannot refresh your way out of architectural obsolescence, because the next generation is not an upgrade of the asset you hold. It is a replacement for the assumption the asset was built upon. A data center optimized for dense electronic multiplication is not threatened by a faster multiplication chip — it can simply purchase one and slot it in. It is threatened by the workload moving to event-driven, addition-based, or optical computation that its silicon, its interconnect, its power delivery, and its very floor plan were never designed to accommodate. You cannot issue a firmware update to a building.

The three pieces of evidence in the previous section are precisely the early indicators of architectural change. And they are precisely the kind of risk the buildout's models leave out. The industry has stress-tested incremental obsolescence with real rigor. It has scarcely examined the architectural kind — even though the architectural kind is the one that strands assets rather than merely depreciating them.

VI. What architectural obsolescence has looked like before

This is not a hypothetical category invented for the occasion. Architectural obsolescence has happened before, and recently, and it is instructive to look at how it behaves when it arrives.

The clearest modern example is specialized cryptocurrency mining. Mining was done first on general-purpose processors, then on graphics cards, then on field-programmable chips, and finally on fully custom application-specific integrated circuits built to do nothing but the one required calculation, with each of the earlier stages lasting only a brief period. At each transition, the previous generation of hardware did not gently decline in value. It collapsed. Once a fundamentally more efficient machine existed, the older hardware could no longer cover the cost of the electricity it consumed, and its market value fell to scrap almost overnight. The lesson is not about cryptocurrency. It is about the dynamics of obsolescence in a competitive, energy-intensive computing market: when the change is architectural, the transition is not a glide path. It is a cliff.

There is an even more relevant example, and it is the rise of the AI industry itself. The shift from training neural networks on general-purpose processors to training them on graphics hardware was, exactly, an episode of architectural obsolescence — the workload migrated to a different kind of machine, and a great deal of prior assumption and infrastructure was left behind. The current industry is, in other words, itself the product of the very phenomenon it is now failing to price into its forward models. It happened once, in this field, within living memory. The proposition that it cannot happen again — that dense electronic multiplication is the final architecture, the place where the music stops — is an extraordinary claim, and it is being assumed rather than argued.

The pattern across these cases is consistent. Architectural transitions are infrequent, they are hard to time, and they are brutal to whoever is holding the superseded infrastructure when they arrive. Infrequent and hard-to-time is not the same as improbable, and it is certainly not the same as safe. It is, in fact, the exact risk profile against which prudent actors buy insurance — a low-frequency, high-severity event that cannot be precisely predicted but can be substantially hedged.

VII. The economics of being wrong

Consider what architectural obsolescence would actually do to the balance sheet of a large AI buildout, because the financial texture of the risk matters.

Ordinary depreciation is an orderly process. An asset loses value predictably over its useful life, the loss is booked in advance, and the business plans around it. Incremental obsolescence accelerates this but does not change its nature: the asset still has a residual value, a resale market, a secondary use. A previous-generation accelerator that is no longer competitive for frontier training can still serve inference, still be sold, still be redeployed. There is a floor.

Stranding is different in kind. A stranded asset is one whose economic value has been destroyed by obsolescence before the end of its planned financial life — and in an architectural transition, the destruction can be close to total, because the asset is not merely slower, it is the wrong kind of thing. A data center purpose-built for dense electronic multiplication has limited value if the workload moves to a substrate it cannot host. The building is specialized. The power and cooling design is specialized. The accelerators are specialized, and unlike a previous-generation chip, they have no graceful secondary market in a world that has moved to a different primitive. The residual value does not glide downward. It can fall through the floor, because in an architectural transition there may be no floor.
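The contrast between an orderly schedule and a sudden write-down can be put in numbers. The sketch below uses straight-line depreciation and entirely hypothetical figures (a cost of 100 units, a five-year life, and a recoverable value of 5 units after stranding); none of these values come from the essay or from any real balance sheet, and the function names are illustrative.

```python
def straight_line_book_value(cost, life_years, salvage=0.0):
    """Book value at the end of each year, from year 0 through life_years."""
    annual = (cost - salvage) / life_years
    return [cost - annual * year for year in range(life_years + 1)]


def write_down_if_stranded(schedule, year, recoverable_value):
    """Loss recognized at once if the asset is stranded at the given year."""
    return max(schedule[year] - recoverable_value, 0.0)


# Hypothetical asset: 100 units of cost, planned five-year life.
schedule = straight_line_book_value(cost=100.0, life_years=5)

# Orderly case: 20 units of value are expensed each year, as planned.
planned_annual_charge = schedule[0] - schedule[1]

# Stranding case: in year 2 the asset is worth only 5 units to anyone.
sudden_loss = write_down_if_stranded(schedule, year=2, recoverable_value=5.0)
print(schedule, planned_annual_charge, sudden_loss)
```

In this toy case the planned charge is 20 units a year, while stranding in year 2 forces a single charge of 55 units, nearly three years of planned depreciation recognized at once.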

This is why the distinction between depreciation and stranding is not accounting pedantry. It is the difference between a cost the buildout has already planned for and a cost that could arrive as a sudden, large, concentrated write-down across an entire class of assets at once. Analysts of the data center sector already speak of a coming bifurcation, in which facilities that remain well-matched to the workload command premium valuations while everything else is marked down or divested. That bifurcation is usually discussed in terms of incremental factors — power efficiency, cooling design, location. Architectural obsolescence is the same bifurcation with the dial turned to its extreme.

The point of laying this out is not to forecast a crash. It is to be honest about the shape of the downside. The buildout's risk is not that it is large; large and well-founded is fine. The risk is that it is large, concentrated, irreversible, and exposed to a low-frequency, high-severity failure mode that the prevailing financial models do not include. That is a precisely insurable situation — and the insurance, in this case, is research.

VIII. Why research is the rational line item

Now the proposal itself, and the arithmetic that makes it nearly self-evident.

Research is astonishingly cheap relative to the buildout. A serious, well-funded, multi-year program — one that pursued event-driven computation, addition-based models, optical and other non-conventional substrates, and, just as importantly, the rigorous and honest benchmarking of all of them against the dominant approach — would cost some small fraction of one percent of the capital already committed to concrete, power, and silicon. Against a multi-trillion-dollar buildout, a genuinely ambitious architectural research program is, in the most literal financial sense, a rounding error.

And what that rounding error purchases is the single thing the trillion-dollar bet most conspicuously lacks: information. The danger of the buildout, to repeat, is not its size. It is the combination of size with uncertainty — an enormous, concentrated, illiquid, irreversible commitment to one architecture, made without having paid to discover whether that architecture is durable. Research does not abolish the uncertainty. But it shrinks it, and it shrinks it precisely where the buildout is most exposed. A few years of well-funded work would establish, before the concrete has fully set, which alternative directions are real and which are dead ends; how close the credible alternatives are to frontier-scale viability; and therefore how much architectural risk the current commitment actually carries. That is exactly the knowledge that separates a sound investment from a reckless one — and at present it is the knowledge nobody is buying.

This is the ordinary logic of insurance, and of exploration before commitment, applied to a situation that plainly calls for both. When you are about to make an enormous, irreversible commitment at a single point in a wide and poorly mapped space, the rational first expenditure is the map. Not because committing is wrong — because committing blind is wrong, when sight is available so cheaply.
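The insurance logic can be written as a break-even calculation. Everything numeric below is a hypothetical placeholder: the capital figure, the research cost (set at a tenth of one percent of capital, consistent with "some small fraction of one percent"), the loss if stranded, and the share of that loss that early warning would avoid. The model is a rough sketch under those assumptions and leaves out the value of confirming that the current architecture is sound.

```python
def expected_cost_unhedged(p_shift, stranded_loss):
    """Expected loss with no research: probability times loss."""
    return p_shift * stranded_loss


def expected_cost_hedged(p_shift, stranded_loss, research_cost, avoided_fraction):
    """Research is paid for certain; it avoids a fraction of the loss if a shift occurs."""
    remaining_loss = stranded_loss * (1.0 - avoided_fraction)
    return research_cost + p_shift * remaining_loss


def breakeven_probability(research_cost, stranded_loss, avoided_fraction):
    """Smallest probability of an architectural shift at which research pays for itself."""
    return research_cost / (stranded_loss * avoided_fraction)


# Hypothetical units: capital of 1000, research at 0.1% of capital.
capital = 1000.0
research_cost = 0.001 * capital
stranded_loss = 500.0        # assumed loss if the workload leaves
avoided_fraction = 0.25      # assumed share of that loss avoided with warning

p_star = breakeven_probability(research_cost, stranded_loss, avoided_fraction)
unhedged = expected_cost_unhedged(0.05, stranded_loss)
hedged = expected_cost_hedged(0.05, stranded_loss, research_cost, avoided_fraction)
print(p_star, unhedged, hedged)
```

With these placeholder inputs the research pays for itself if the chance of an architectural shift exceeds a little under one percent. The exact threshold depends entirely on the assumed inputs, which is itself an argument for measuring them.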

So the call to action is not "stop building." It is this: fund architectural research at a scale genuinely proportionate to the capital it protects — ahead of the irreversible commitments where that is still possible, and in earnest parallel with them everywhere else, rather than as the token afterthought it is today. Treat the architecture of computation as a variable to be actively investigated, not a constant to be quietly assumed across every year of a trillion-dollar projection.

IX. What that research should actually fund

It is fair to ask what a serious architectural research program would concretely consist of, because a proposal that cannot be made specific is not yet a proposal.

It would fund, first, the honest benchmarking of the alternatives — not the optimistic projections of their advocates, but careful, adversarial, reproducible measurement of addition-based and event-driven models against conventional ones, at growing scale, on the metrics that actually matter to a data center operator: quality at a fixed task, energy per unit of useful work, and throughput per watt per dollar. Much of the current uncertainty exists simply because this measurement has not been done at scale by disinterested parties.

It would fund the unsolved problems that currently keep the alternatives from the frontier. Event-driven computation is harder to train and to schedule than dense computation; addition-based models have been shown for inference but not yet for the largest-scale training; optical components are difficult to integrate at density. None of these is obviously insurmountable, and each is exactly the kind of problem that yields to sustained, funded attention — and exactly the kind that languishes without it.

It would fund the substrate work: small-scale fabrication and physical prototyping of non-conventional accelerators, including photonic and event-driven designs, so that the gap between a promising architecture on paper and a manufacturable one is actually measured rather than guessed at.

And it would fund the integration question that may matter most of all: how, and how cheaply, a more efficient architecture could be adopted without requiring the entire software and hardware ecosystem to be rebuilt from nothing. An architecture that is more efficient but strands every existing model and tool faces an adoption barrier that has little to do with its merits. The research that lowers that barrier — translation paths, compatibility layers, hybrid approaches — is as valuable as the architectures themselves.

None of this is exotic. It is the ordinary substance of applied research, and its total cost is small. What is missing is not the feasibility. What is missing is the decision to fund it at a scale that reflects the trillions it would protect.

X. The honest case on the other side

A position is only worth as much as its treatment of the strongest objections to it, so consider, fairly, the case for the buildout proceeding exactly as it is.

The demand is real and it is compounding. Every month of delay in serving it has a genuine cost, in revenue and in competitive position, and "build now on the architecture that works" is a defensible response to a market growing this fast. A buildout sized to that demand is not obviously irrational even if some of it is later repurposed or written down.

Dense multiplication may simply be very hard to beat. It is extraordinarily general — it makes almost no assumptions about the structure of the problem — and generality has real value when the workload itself keeps changing. The alternatives, by contrast, tend to buy their efficiency by exploiting specific structure, and structure-exploiting approaches have a long history of being overtaken by more general ones riding a faster hardware curve. It is entirely possible that the efficient alternatives prove real but niche.

And the alternatives may not pan out at the frontier at all. Addition-based models are unproven for the largest-scale training; optical computing has promised much before and delivered slowly; event-driven systems remain hard to program. A research program is not guaranteed to produce a durable winner. It may simply confirm that the current architecture was the right one all along.

Every one of these points is legitimate, and a serious reader should weigh them. But notice that not one of them argues against funding the research. They are arguments about how the research will turn out — and the entire purpose of research is to find out how it turns out. If dense multiplication is genuinely unbeatable, a few years of well-funded investigation will demonstrate that, and the buildout will proceed with its central assumption validated rather than merely assumed — which is itself worth far more than its rounding-error cost. If the alternatives prove real, the buildout will have been warned in time to adapt. The research has positive value under every outcome. That asymmetry — cheap in all cases, decisive in some — is the definition of a hedge worth buying. The objections argue against betting on the alternatives. They do not, and cannot, argue against finding out.

XI. The honest limits of the argument

The boundaries of this case deserve to be stated as plainly as the case itself, because it is more credible inside its true limits than outside them.

Research cannot pause demand. Customers need serving today, on the hardware that exists. That is why the proposal is research funded in parallel and at meaningful scale, not construction halted until the studies conclude.

The alternative architectures have not won, and this essay does not claim they have. Addition-based models are demonstrated for inference, not frontier training. Optical computing is largely at the design and patent stage, not fabricated at scale. Event-driven computation remains harder to train and schedule than the dense approach. None of this is a finished product, and anyone presenting it as one is selling something this essay is not.

And efficiency does not automatically reduce total spending. When computation becomes cheaper, the world has a strong and well-documented tendency to do more of it; a more efficient architecture may be met with greatly increased usage rather than reduced hardware. The case for research is therefore not a promise of a smaller overall bill. It is a hedge against building the wrong expensive thing — against pouring the trillions into a coordinate in the design space that the workload then leaves.
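One common way to reason about this rebound is a constant-elasticity demand model, sketched below. The model, the function name, and the elasticity values are assumptions introduced here for illustration; the essay does not specify a demand curve. In this model, usage scales with unit cost raised to the power of minus the elasticity, so total spending falls only when the elasticity is below one.

```python
# Rebound-effect sketch under an assumed constant-elasticity demand curve.
# The model and all parameter values are hypothetical illustrations.

def total_spend_ratio(efficiency_gain, elasticity):
    """Ratio of new total spend to old total spend after an efficiency gain.

    efficiency_gain: factor by which cost per unit of computation drops
                     (4.0 means each unit costs one quarter as much).
    elasticity:      assumed price elasticity of demand for computation
                     (magnitude; usage scales as unit_cost ** -elasticity).
    """
    unit_cost_ratio = 1.0 / efficiency_gain
    usage_ratio = unit_cost_ratio ** (-elasticity)
    return unit_cost_ratio * usage_ratio


if __name__ == "__main__":
    for elasticity in (0.5, 1.0, 1.5):
        ratio = total_spend_ratio(4.0, elasticity)
        print(f"elasticity {elasticity}: total spend changes by a factor of {ratio:.2f}")
```

With an elasticity above one, a cheaper architecture raises the total bill in this model, which is why the case for research is framed as a hedge against building the wrong thing and not as a promise of lower spending.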

What survives every one of those subtractions is still decisive. The design space is demonstrably wide, as three independent lines of evidence establish. The current buildout is committed, in concentrated and irreversible form, to a single point within it. The downside if that point proves wrong is stranding rather than ordinary depreciation: a sudden, severe, correlated loss rather than an orderly one. And the research that would measure and shrink that uncertainty costs a rounding error against the buildout it would protect.

XII. The pause worth taking

The most efficient producer in any competitive market does not merely enjoy lower costs. It sets the price, and in doing so strands the infrastructure of every competitor who cannot match it. If a materially more efficient architecture for artificial intelligence exists and is reachable, the first organization to arrive at it will not simply earn a better margin. It will render purpose-built, less efficient infrastructure across the rest of the industry uncompetitive, and then stranded. The competitive logic does not reward the largest buildout. It rewards the most durable architecture.

So the question facing anyone deploying capital at this scale is not whether the alternative architectures are certain — they are not, and certainty is not on offer. The question is whether they can afford to have built the entire expensive thing without first paying the rounding error required to find out what they were building next to.

We did not discover that intelligence is dense electronic multiplication. We discovered that dense multiplication was a workable, differentiable, conveniently supported way to begin, and beginning that way was a genuine and historic achievement. But somewhere in the years since, the field stopped treating the operation as a choice and started treating it as the ground beneath its feet. The twenty-watt machine inside every human skull, the addition-based models now matching conventional ones, and the optical architectures that compute with interfering light are three independent reminders, from biology, from engineering, and from physics, that the ground could be somewhere else, and that an industry committing trillions to a single location, as though the others did not exist, has mistaken a bet for a certainty.

Build. The demand is real, and it must be met. But fund the research first where you still can, and in genuine earnest everywhere else — because architectural research is the cheapest line item on the entire table, and the mistake it guards against is the most expensive one in the industry's history.

Research before concrete.


This is a position piece, not a peer-reviewed result. It argues a risk and a priority, not a certainty: that the artificial intelligence buildout has rigorously priced incremental hardware obsolescence and barely examined architectural obsolescence, and has under-funded architectural research by orders of magnitude relative to the capital exposed to that risk. The energy figures cited are well established at the level of arithmetic; their full end-to-end impact at scale is an active research question with real and stated caveats. The optical architecture described is an emerging, partly patented design and simulation effort, not fabricated hardware. The argument rests on a single modest claim: that the architecture of computation is a variable, demonstrated as such by biology, by current digital research, and by photonic design work alike, and that a rounding-error investment in mapping that variable should precede, not trail, a trillion-dollar commitment.

John Sokol (john.sokol@gmail.com)

The author has worked for three decades on computing architectures in this direction: an addition-based neural network with event-driven, lazy evaluation, in which, as in the biological neuron, only the units that actually fire ever consume computation; and, more recently, an optical binary-attention architecture in which interference between beams of light performs the core comparison directly, with no arithmetic unit involved. Correspondence from researchers, engineers, and others working on the architecture of efficient computation is welcome.
