Wall Street’s Secret AI Fuel: Data, Not Just Models
Firms are paying top dollar for proprietary consumer and transaction data to train trading AIs — and that advantage could reshape winners, losers, and regulation.

Illustration by IMF Alpha editorial · Reviewed by Pedro Marini
Lead
Big models grab the headlines. In trading rooms and fintech war rooms, the talk is different: it isn’t only about architecture or parameter counts. It’s about the data that feeds those models. Wall Street has quietly shifted from buying off-the-shelf models to buying sources — transaction histories, merchant receipts, app telemetry, and narrow vertical feeds — and that shift changes the calculus for returns, concentration, and regulatory exposure.
Why this matters now
A long game, not a quick trick
Think of data like crude oil in the 19th century: raw, indispensable, and often messy. But unlike oil, data doesn’t just sit around: it decays if you don’t curate it. A hedge fund that acquires a pile of receipts still faces cleaning, labeling, privacy engineering, and integration. Expensive. Slow. Frequently underestimated when hype takes over.
Who stands to gain
There’s nuance here: owning data isn’t an instant moat unless you keep investing in quality and rights management.
Counterpoints and fragilities
Concrete examples (public behavior, not confidential claims)
Not flashy, but effective in many cases.
What investors and risk managers should watch
A few of these metrics tend to reveal the true durability of a data advantage.
The test is simple: who controls the inputs?
AI-driven trading and fintech products are entering a phase where ownership and curation of data matter as much as model design. Investors should look past slick model demos and ask who controls the inputs — and whether those inputs will still be available next year. Regulators face a delicate trade-off between enabling useful innovation and stopping opaque scraping or commercial resale of highly personal financial footprints.
Historically, the edge in finance has gone to whoever controlled scarce, high-quality inputs. Today that input is permissioned, persistent data. That’s where the next decade’s winners will be made, or litigated over in courts and argued over in capitals.
