AI Chips

Why Tech Giants Are Quietly Building Chips to Cut Nvidia Out

The cloud is engineering its own AI silicon — a defensive play that could reshape margins, supply chains, and who wins the AI profit pool

Pedro Marini

June 13, 2026 · 4 min read

Illustration by IMF Alpha editorial · Reviewed by Pedro Marini

A subtle arms race under the AI boom

For years the debate focused on model architecture, datasets, and frameworks. Lately the fight has moved deeper — down to die and wafer. Cloud providers are pouring money into custom AI chips to cut reliance on Nvidia and to lower cost per inference at scale.

This isn't vaporware. Google has run Tensor Processing Units for a long time. Amazon built Inferentia and Trainium to offer lower-cost alternatives inside AWS. Microsoft still leans on Nvidia for many workloads, but it also funds bespoke hardware and tight vendor partnerships. Meta and a few other hyperscalers have quietly prototyped accelerators for internal use. It feels a bit like the server era replaying itself, when hyperscalers shifted from off-the-shelf parts to custom boards and racks.

Why this matters now

Nvidia currently powers most large training runs and inference clusters. That gives it pricing power and creates a potential choke point. Hyperscalers do the math: across billions or trillions of inferences, even small per-unit charges add up to a persistent tax on margin.
Custom silicon is more than cheaper chips. You can design for the exact precision, memory bandwidth, and interconnect patterns a service needs. Over large volumes, modest per-inference gains compound into real margin improvement.
The software environment is less hostile to new silicon than it was five years ago. ONNX and a growing set of open compiler projects make targeting non-Nvidia accelerators feasible in ways that used to be impractical.

What's interesting here is the combination: together, economics and better software make the proposition realistic rather than theoretical.
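To make the compounding argument concrete, here is a minimal back-of-the-envelope sketch. Every number in it is a hypothetical placeholder chosen for round arithmetic, not a figure from any vendor or cloud provider, and the function names are illustrative only.

```python
def annual_inference_cost(inferences_per_day: float, cost_per_1k: float) -> float:
    """Yearly spend in dollars for a daily volume and a price per 1,000 inferences."""
    return inferences_per_day * 365 * cost_per_1k / 1000


def annual_savings(inferences_per_day: float, baseline_cost_per_1k: float,
                   reduction: float) -> float:
    """Yearly dollars saved if per-inference cost drops by `reduction` (0.10 = 10%)."""
    baseline = annual_inference_cost(inferences_per_day, baseline_cost_per_1k)
    custom = annual_inference_cost(
        inferences_per_day, baseline_cost_per_1k * (1 - reduction)
    )
    return baseline - custom


# Hypothetical inputs (assumptions for illustration only).
daily_volume = 1_000_000_000   # 1 billion inferences per day
price_per_1k = 0.10            # dollars per 1,000 inferences
savings = annual_savings(daily_volume, price_per_1k, reduction=0.10)
print(f"${savings:,.0f} saved per year")
```

Under these assumed inputs, a 10% cut in per-inference cost works out to roughly $3.65 million a year for a single service. The point is the shape of the math, not the specific total: the saving scales linearly with volume, so the same percentage gain matters far more at hyperscale.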

Why this isn’t a slam dunk

Building chips is costly and messy. A few counterpoints to keep in mind:

Nvidia’s advantage goes beyond silicon. CUDA, the developer ecosystem, and optimized kernels create heavy switching costs. Models tuned for CUDA often need substantive rework to reach parity on new hardware.
Scale helps specialists. Nvidia’s manufacturing partners, packaging, and thermal design know-how shorten time-to-market. A cloud provider diverting billions into hardware is taking a strategic gamble versus buying proven components.
Fragmentation could slow progress. If each cloud has its own quirks, model portability suffers and startups may prefer one predictable vendor rather than juggling multiple runtimes.

So yes, custom chips can win on cost and control, but the road is full of engineering and ecosystem traps.

Three plausible outcomes

Gradual verticalization: Hyperscalers shift more inference, and some training, to their own silicon, while Nvidia retains the top-tier, research-heavy training market. Cloud margins improve, but Nvidia keeps the highest-end slices.
Faster commoditization: Open standards or fierce pricing force Nvidia to compete on price and openness. That speeds some innovation but compresses Nvidia’s margins.
Fragmented equilibrium: Different clouds optimize for different workloads, and the industry learns to live with heterogeneity. Messy for developers, but fertile ground for chip experimentation.

None of these is guaranteed; the industry could slide between them depending on cost curves, software progress, and manufacturing realities.

What investors and enterprises should watch

Headcount and CapEx in hardware teams at major clouds. A steady rise in chip engineering hires is a clearer signal than a press release.
Tooling investments: compilers, ONNX optimizations, cross-platform runtimes. Those show whether a chip is strategic or just exploratory.
For startups, multi-cloud portability will become a practical requirement sooner rather than later. Betting everything on CUDA-only deployments looks riskier today.
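One way to picture the portability point is a small backend-selection helper. The sketch below is plain Python and assumes a runtime that exposes named execution backends in priority order, as ONNX Runtime does with its execution providers. `CUDAExecutionProvider` and `CPUExecutionProvider` are real ONNX Runtime provider names; `CustomAcceleratorExecutionProvider` and the `pick_providers` helper are hypothetical, introduced here only for illustration.

```python
from typing import List, Sequence


def pick_providers(preferred: Sequence[str], available: Sequence[str],
                   fallback: str = "CPUExecutionProvider") -> List[str]:
    """Return the preferred backends that are actually available, in priority
    order, with the fallback appended last if it is available."""
    chosen = [name for name in preferred if name in available]
    if fallback in available and fallback not in chosen:
        chosen.append(fallback)
    if not chosen:
        raise RuntimeError("no usable execution backend found")
    return chosen


# Hypothetical deployment: prefer a cloud's own accelerator, then CUDA.
preferred = ["CustomAcceleratorExecutionProvider", "CUDAExecutionProvider"]
# In ONNX Runtime this list would come from onnxruntime.get_available_providers().
available = ["CUDAExecutionProvider", "CPUExecutionProvider"]
print(pick_providers(preferred, available))
```

A deployment written this way runs unchanged whether or not the custom accelerator is present, which is the practical meaning of not betting everything on a single vendor's runtime.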

One more nuance

This is not only a hardware contest. It’s about who controls economics and the full stack. Hyperscalers building silicon is a logical next move in their effort to own more of the customer experience and margin. Nvidia won't be displaced overnight, but the stakes are high: whoever controls both chip and stack captures a disproportionate share of downstream value.

I think of cloud silicon as a long game — slow, iterative, and consequential. Expect splashy headlines when new chips ship at scale, but the quieter signals — internal benchmarks, customer price cards, and the toolchain choices — will show the real direction first.
