Synthetic Data

Why Synthetic Data Became Wall Street's Newest Trade

Banks and fintech are swapping real records for fake ones to train AI — a privacy play that creates winners, losers, and a fresh set of regulatory headaches.

Pedro Marini

July 1, 2026 · 3 min read

Illustration by IMF Alpha editorial · Reviewed by Pedro Marini

Tickers mentioned

NVDA +0.00% · SNOW +0.00% · MSFT +0.00% · PLTR +0.00%

Summary

Synthetic data is slipping out of research labs and into trading floors and loan desks. For investors this looks like a structural growth story: more demand for compute, cloud data tooling, and analytics that feed models. But it comes with tricky trade-offs around fidelity, bias, and compliance.

What’s happening now

Large banks and fintechs are increasingly training and validating models on synthetic datasets. Rather than sharing customer records, teams generate artificial-but-plausible profiles that keep the same statistical shape while masking identities. The benefits are clear: lower privacy risk, fewer legal knots to untie, and much faster experimentation. It is not a panacea, though: fidelity, bias, and re-identification risks remain.
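To make "same statistical shape, masked identities" concrete, here is a minimal Python sketch. It is an illustration under loud assumptions, not how any bank or vendor actually does it: the two columns (`income`, `debt`), the toy "real" records, and the choice of a simple bivariate normal are all invented for the example. Production generators are far more elaborate, and a plain Gaussian can produce implausible values such as negative incomes.

```python
import math
import random
import statistics


def fit_shape(records):
    """Summarize two numeric columns by mean, standard deviation, and correlation."""
    incomes = [r["income"] for r in records]
    debts = [r["debt"] for r in records]
    mu_i, mu_d = statistics.fmean(incomes), statistics.fmean(debts)
    s_i, s_d = statistics.stdev(incomes), statistics.stdev(debts)
    cov = sum((a - mu_i) * (b - mu_d) for a, b in zip(incomes, debts)) / (len(records) - 1)
    return {"mu": (mu_i, mu_d), "sigma": (s_i, s_d), "rho": cov / (s_i * s_d)}


def sample_synthetic(shape, n, seed=0):
    """Draw n artificial profiles from a bivariate normal with the fitted shape."""
    rng = random.Random(seed)
    (mu_i, mu_d), (s_i, s_d), rho = shape["mu"], shape["sigma"], shape["rho"]
    tail = math.sqrt(max(0.0, 1.0 - rho * rho))  # guard against rounding past 1
    profiles = []
    for k in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        profiles.append({
            "id": f"synthetic-{k}",  # fresh label, never a customer ID
            "income": mu_i + s_i * z1,
            "debt": mu_d + s_d * (rho * z1 + tail * z2),
        })
    return profiles


# Toy stand-in for real customer records (all numbers invented).
_rng = random.Random(42)
real = []
for k in range(500):
    income = _rng.gauss(60_000, 15_000)
    real.append({"id": f"cust-{k}", "income": income,
                 "debt": 0.3 * income + _rng.gauss(0, 3_000)})

shape = fit_shape(real)
synthetic = sample_synthetic(shape, 5_000)
synthetic_shape = fit_shape(synthetic)

# The two correlations should land close together; no real row is copied.
print(round(shape["rho"], 2), round(synthetic_shape["rho"], 2))
```

The generator only ever sees the summary statistics, not individual rows, which is the intuition behind the privacy claim. It is also why fidelity suffers: anything the summary misses (skew, fat tails, rare events) is lost in the synthetic set.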

Why this matters to markets

Cost and scale: synthetic data makes it easier to spin up large labeled datasets, which pushes demand for GPUs and cloud capacity. Infrastructure vendors win when workloads grow.
Vendor opportunity: firms that can produce high-fidelity synthetic feeds — feeds that models trust — can command recurring fees as customers standardize around them.
Regulatory gray area: banks may be able to move faster, but regulators have yet to decide whether synthetic substitutes remove accountability for bad model outcomes. That ambiguity matters.

Concrete implications (what investors should watch)

Suppliers of compute and data platforms capture much of the economics as synthetic workloads scale.
Look for deals that pair legacy data holders with AI vendors; those partnerships are often the first sign of a durable revenue stream.
Track regulatory guidance and court challenges closely. One high-profile enforcement action could change adoption timelines overnight.

Counterpoints and risks

Synthetic does not equal safe. Poor generators can amplify bias or unintentionally leak patterns that sophisticated attackers can use to re-identify people.
Overreliance on synthetic sets can introduce fragility when real-world distributions shift — classic train/test mismatch, only louder.
Standards and provenance controls are immature. Without open benchmarks, buyers risk vendor lock-in and painful audits.
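The leakage risk above can be screened for, at least crudely. One heuristic sometimes used for this is a distance-to-closest-record check: flag any synthetic row that sits almost on top of a real one. The sketch below is a simplified version under stated assumptions. The column names, the sample rows, the range-based scaling, and the `0.05` threshold are all hypothetical, and passing this check does not prove a dataset is safe from re-identification.

```python
import math


def column_ranges(real_rows, columns):
    """Range of each column in the real data, used to put columns on one scale."""
    ranges = {}
    for col in columns:
        values = [row[col] for row in real_rows]
        ranges[col] = (max(values) - min(values)) or 1.0  # avoid dividing by zero
    return ranges


def distance_to_closest_real(synth_row, real_rows, columns, ranges):
    """Smallest scaled Euclidean distance from one synthetic row to any real row."""
    return min(
        math.sqrt(sum(((synth_row[c] - real_row[c]) / ranges[c]) ** 2 for c in columns))
        for real_row in real_rows
    )


def flag_near_copies(synthetic_rows, real_rows, columns, threshold):
    """Indices of synthetic rows that sit suspiciously close to a real record."""
    ranges = column_ranges(real_rows, columns)
    return [
        i for i, row in enumerate(synthetic_rows)
        if distance_to_closest_real(row, real_rows, columns, ranges) < threshold
    ]


# Invented example rows: the first synthetic row is an exact copy of a real one.
real_rows = [
    {"income": 52_000, "debt": 9_000},
    {"income": 75_000, "debt": 21_000},
    {"income": 61_000, "debt": 15_000},
]
synthetic_rows = [
    {"income": 75_000, "debt": 21_000},
    {"income": 58_000, "debt": 12_500},
]

flagged = flag_near_copies(synthetic_rows, real_rows, ["income", "debt"], threshold=0.05)
print(flagged)  # prints [0]
```

A check like this is the kind of provenance control a buyer might ask a vendor to document, alongside how the threshold was chosen.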

A short history lesson

There is precedent here. Financial firms once hoarded proprietary datasets as moats. Cloud and APIs turned data into a product. Synthetic is the next twist: not hoarding so much as sanitizing and packaging. Think of it as data-wrangling 2.0: fewer gates, more slices to sell. The distinction matters because value shifts from owning the records to producing a trustworthy stand-in for them.

Examples that clarify

Retail lenders can stress-test credit models by generating millions of synthetic borrower journeys that include rare events missing from historical data.
Trading desks can simulate thousands of unusual price and order-flow scenarios without revealing counterparty details.
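A rough sketch of the lender example, for readers who want to see the mechanics. It deliberately over-generates a rare event (a monthly "shock" such as a job loss) so that stressed journeys are plentiful, then carries an importance weight so results can be scaled back to real-world frequencies. The weighting step is an addition for illustration, not something the firms described here are confirmed to use, and every number (`P_REAL`, `P_BOOSTED`, the 12-month horizon) is invented.

```python
import random

MONTHS = 12
P_REAL = 0.01     # assumed real-world monthly chance of a shock (invented)
P_BOOSTED = 0.10  # assumed boosted chance used when generating journeys


def generate_journey(rng):
    """Return (shock flag per month, importance weight) for one synthetic borrower."""
    shocks, weight = [], 1.0
    for _ in range(MONTHS):
        hit = rng.random() < P_BOOSTED
        shocks.append(hit)
        # Ratio of real-world probability to generator probability for this month.
        weight *= (P_REAL / P_BOOSTED) if hit else ((1 - P_REAL) / (1 - P_BOOSTED))
    return shocks, weight


rng = random.Random(7)
journeys = [generate_journey(rng) for _ in range(20_000)]

# Share of journeys with at least one shock, as generated (heavily stressed).
raw_share = sum(any(s) for s, _ in journeys) / len(journeys)

# Same share after reweighting back to the assumed real-world frequency.
weighted_share = sum(w for s, w in journeys if any(s)) / len(journeys)

# Closed-form answer under the assumed real-world rate, for comparison.
exact_share = 1 - (1 - P_REAL) ** MONTHS

print(round(raw_share, 3), round(weighted_share, 3), round(exact_share, 3))
```

The point of the weights is the fragility risk flagged earlier: a model trained on the stressed set without any correction would treat rare events as routine.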

What to watch next — practical checklist

Regulatory memos on de-identification and algorithmic accountability.
Earnings commentary from cloud and GPU providers about how much synthetic work they see.
Partnerships, pilot deals, and M&A among data owners, AI vendors, and banks.

Final take

Synthetic data is not magic. It is a practical lever that can speed innovation and change where value accrues. For investors the smarter approach is not betting on a single vendor but mapping the ecosystem: compute, platforms, synthetic specialists, and compliance tooling. That map will determine who captures long-term value as finance learns to build on fake data that has to behave like the real thing.
