Synthetic markets · deterministic · license-free

Synthetic financial market data for agents.

David Data generates deterministic, internally consistent public-company worlds — prices, fundamentals, filings, news, and events — served through a REST API. Perfect ground truth for backtesting and training AI agents, with no vendor licensing limits.

  • 16,143 ticker aliases
  • 720 library scenarios
  • 40 market themes
  • 30+ API endpoints
data.davidhf.com · /prices

$ curl "https://data.davidhf.com/prices?scenario_id=scn_8f3a&ticker=NVDA&interval=day" \
    -H "X-API-KEY: ••••••••"

{
  "ticker": "NVDA",
  "interval": "day",
  "prices": [
    { "date": "2026-01-02",
      "open": 142.18, "high": 145.07,
      "low": 141.55, "close": 144.92,
      "volume": 41203885 },
    { "date": "2026-01-05",
      "open": 145.10, "high": 149.33,
      "low": 144.81, "close": 148.76,
      "volume": 52994117 }
  ]
}

200 OK · X-RateLimit-Remaining: 5998 · deterministic from seed

OHLCV prices · Income statements · Balance sheets · Cash flows · Financial metrics · Earnings & guidance · Analyst estimates · SEC-style filings · Market news · Insider transactions · 13F-like holdings · Corporate actions · Macro & rates · Index funds · Segmented financials · Event timelines

Platform

A market simulator that behaves like the real thing

Most synthetic data is a random walk with a ticker glued on. David Data builds coherent company worlds where the prices, financials, filings, and headlines all agree with each other.

Perfectly reproducible

Pin a scenario and it returns identical prices, statements, filings, and events every time. Reproduce any backtest or training run to the cent.
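One way to verify reproducibility on your side is to fingerprint each response and compare digests across runs. The sketch below is only an illustration: `fingerprint` is a hypothetical helper (not part of the David Data API), and the payloads are trimmed copies of the sample `/prices` response rather than live data.

```python
import hashlib
import json


def fingerprint(payload: dict) -> str:
    """Return a SHA-256 digest of a JSON payload, independent of key order."""
    # Canonical form: sorted keys and fixed separators, so formatting
    # differences between runs do not change the digest.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Two copies of the same bar, as if fetched on separate runs (key order differs).
run_a = {"ticker": "NVDA", "prices": [{"date": "2026-01-02", "close": 144.92}]}
run_b = {"prices": [{"close": 144.92, "date": "2026-01-02"}], "ticker": "NVDA"}

# A payload with one changed value, which should produce a different digest.
run_c = {"ticker": "NVDA", "prices": [{"date": "2026-01-02", "close": 144.93}]}

same = fingerprint(run_a) == fingerprint(run_b)
changed = fingerprint(run_a) != fingerprint(run_c)
print(same, changed)
```

Storing the digest next to a backtest result gives a cheap check that a later rerun saw the same inputs.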

Internally consistent

Accounting identities hold, OHLC invariants are valid, news repeats the structured earnings numbers, and filings carry real lineage to events.
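The page does not spell out which OHLC invariants are enforced, so the sketch below assumes the usual ones: the high is at least the open and close, the low is at most the open and close, and volume is non-negative. `ohlc_violations` is a hypothetical client-side helper, and the valid bars are copied from the sample `/prices` response.

```python
def ohlc_violations(bar: dict) -> list[str]:
    """Return the invariant violations for one OHLCV bar (empty list if valid)."""
    problems = []
    if bar["high"] < max(bar["open"], bar["close"]):
        problems.append("high below open or close")
    if bar["low"] > min(bar["open"], bar["close"]):
        problems.append("low above open or close")
    if bar["low"] > bar["high"]:
        problems.append("low above high")
    if bar["volume"] < 0:
        problems.append("negative volume")
    return problems


# Bars taken from the sample response.
sample_bars = [
    {"date": "2026-01-02", "open": 142.18, "high": 145.07,
     "low": 141.55, "close": 144.92, "volume": 41203885},
    {"date": "2026-01-05", "open": 145.10, "high": 149.33,
     "low": 144.81, "close": 148.76, "volume": 52994117},
]

# A deliberately broken bar (made up for illustration): high is below the open.
broken_bar = {"date": "2026-01-06", "open": 150.0, "high": 149.0,
              "low": 148.0, "close": 148.5, "volume": 1}

report = {bar["date"]: ohlc_violations(bar) for bar in sample_bars}
print(report)
print(ohlc_violations(broken_bar))
```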

Full evidence bundle

Prices, fundamentals, earnings, guidance, analyst revisions, news, SEC-style filings, ownership, insider trades, and corporate actions — all linked.

Real ticker universe

16,143 real exchange-style symbols with company reference metadata sourced from yfinance. Operating companies, ETFs, and benchmarks behave correctly.

Scenario themes & crises

40 market themes — war energy shocks, bank runs, AI IPO manias, Fed pivots, semiconductor blockades — with event-aware macro tapes around each catalyst.

Drop-in compatible

A fully documented OpenAPI surface with API-key auth and rate limits. Standard REST and JSON — point your existing client at it and go.

Coverage

One API, the entire public-company surface

A complete endpoint surface — discovery, statements, screeners, KPI views, ownership, and market snapshots — all from one consistent REST API.

Market data

6 endpoints
  • /prices
  • /prices/snapshot
  • /prices/snapshot/market
  • /index-funds
  • /macro
  • /macro/interest-rates

Fundamentals

6 endpoints
  • /financials/income-statements
  • /financials/balance-sheets
  • /financials/cash-flow-statements
  • /financial-metrics
  • /financials/as-reported
  • /financials/segments

Earnings & estimates

6 endpoints
  • /earnings
  • /kpi/guidance
  • /kpi/metrics
  • /kpi/non-gaap
  • /analyst-estimates
  • /financials/search/screener

Disclosures & artifacts

6 endpoints
  • /news
  • /filings
  • /filings/items
  • /institutional-holdings
  • /company/facts
  • /company/facts/tickers

Plus validation, dataset-quality, and calibration metadata endpoints — full reference in the interactive docs.

Use cases

Ground truth for teams that can't risk dirty data

01 · Quant research

Backtest without survivorship or look-ahead bias

Work across pre-built train/validation/test/holdout splits spanning 720 scenarios. Because the data is clean ground truth, there is no leakage, no restatement noise, and no vendor point-in-time licensing to negotiate.

  • Mixed 30-year, business-cycle, and event-window horizons
  • Theme-anchored historical and forward branches
  • Reproducible to the seed for audit-ready research

02 · AI agents

Train and evaluate financial agents safely

Give LLM agents a complete, self-consistent world to read — filings that cite events, news that repeats the reported numbers, earnings that tie to guidance. Score reasoning against a known answer key.

  • Filing items and news linked to source events
  • Deterministic answer keys for grading
  • No real-data licensing or redistribution risk
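Grading against a deterministic answer key could look roughly like the sketch below. Everything in it is an assumption made for illustration: the field names (`eps`, `revenue`, `guidance_direction`), the values, and the `grade` helper are hypothetical and do not describe the actual answer-key format.

```python
import math


def grade(answer_key: dict, agent_answers: dict, rel_tol: float = 1e-6) -> float:
    """Return the fraction of answer-key fields that the agent matched."""
    if not answer_key:
        raise ValueError("answer_key must not be empty")
    correct = 0
    for field, expected in answer_key.items():
        got = agent_answers.get(field)
        both_numeric = isinstance(expected, (int, float)) and isinstance(got, (int, float))
        if both_numeric:
            # Numeric fields are compared with a relative tolerance.
            if math.isclose(got, expected, rel_tol=rel_tol):
                correct += 1
        elif got == expected:
            # Everything else (labels, strings) must match exactly.
            correct += 1
    return correct / len(answer_key)


# Hypothetical answer key and agent output (illustrative values only).
key = {"eps": 1.25, "revenue": 5_000_000_000, "guidance_direction": "raised"}
agent = {"eps": 1.25, "revenue": 4_900_000_000, "guidance_direction": "raised"}

score = grade(key, agent)
print(round(score, 3))
```

Because the key is fixed for a pinned scenario, the same agent output always receives the same score.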
03 · Product & demos

Build and demo without touching licensed feeds

Stand up dashboards, screeners, and analytics on a realistic dataset you can ship in demos, sandboxes, and CI. Swap in the synthetic endpoint anywhere your client expects a financial data feed.

  • Drop-in REST surface with API-key auth
  • Stable fixtures for CI and integration tests
  • Shareable demo environments, zero leakage

Developer experience

Query a complete market world in two calls

API-key auth, JSON responses, and rate-limit headers on every response. It is standard REST: if you've used a financial data API before, you already know this one.

# 1. Browse the available market-world scenarios
curl "https://data.davidhf.com/scenarios" \
  -H "X-API-KEY: $DAVID_DATA_KEY"

# 2. Query prices from a scenario (set SCN to a scenario ID returned by step 1)
curl "https://data.davidhf.com/prices?scenario_id=$SCN&ticker=AAPL&interval=day" \
  -H "X-API-KEY: $DAVID_DATA_KEY"
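The same two calls can be sketched in Python with only the standard library. This is a minimal sketch, not an official client: `build_request` is a hypothetical helper, `scn_8f3a` is the scenario ID from the sample request, and the `demo-key` fallback is a placeholder. The requests are built but not sent, so the snippet runs without network access.

```python
import os
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "https://data.davidhf.com"


def build_request(path: str, api_key: str, **params: str) -> Request:
    """Build (but do not send) an authenticated GET request."""
    query = f"?{urlencode(params)}" if params else ""
    return Request(f"{BASE_URL}{path}{query}", headers={"X-API-KEY": api_key})


# Read the key from the environment; "demo-key" is only a placeholder.
api_key = os.environ.get("DAVID_DATA_KEY", "demo-key")

# 1. Browse the available market-world scenarios.
scenarios_req = build_request("/scenarios", api_key)

# 2. Query prices from a scenario.
prices_req = build_request(
    "/prices", api_key, scenario_id="scn_8f3a", ticker="AAPL", interval="day"
)

print(scenarios_req.full_url)
print(prices_req.full_url)
# To send a request, pass it to urllib.request.urlopen() and json.load() the result.
```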
Pricing

Start free, scale when you need to

Begin on the free tier with no card required. Every paid plan adds scale, more scenarios, and support — never paywalled features.

Free

Kick the tires and prototype against real data shapes.

$0 forever
  • 1 API key · 60 req/min
  • Core endpoint surface
  • 3 sample library scenarios
  • Real ticker universe
  • Community support
Start free

Builder

For solo developers, demos, and side projects.

$99/mo
  • 1 API key · 6,000 req/min
  • Full endpoint surface
  • 50 library scenarios
  • Real ticker universe
  • Email support
Start building

Most popular

Research

For quant teams and AI labs running real workloads.

$499/mo
  • 5 API keys · per-key rate limits
  • Full 720-scenario library
  • Train/validation/test/holdout splits
  • Validation & dataset-quality audits
  • Priority email support
Get access

Enterprise

For institutions needing scale, SLAs, and custom worlds.

Custom
  • Unlimited keys & dedicated capacity
  • Custom themes & calibration
  • Fully materialized institutional panels
  • On-prem / VPC deployment
  • SLA + named support engineer
Talk to sales

FAQ

Questions, answered

The market data is not real, and that's the point. Every value is synthetically generated to be internally consistent. You get realistic, calibrated dynamics without any licensed market data, so there are no redistribution restrictions or point-in-time licensing to manage. Company reference metadata (names, sectors) is sourced from public yfinance data; all prices, fundamentals, news, filings, and events are synthetic.

Every scenario is generated deterministically, so a given scenario always returns the exact same prices, statements, filings, and events. That makes backtests and AI training runs perfectly reproducible and auditable.

A random walk gives you a price series with nothing behind it. David Data builds a coherent world: accounting identities hold, OHLC invariants are valid, earnings tie to guidance, news repeats the reported numbers, and SEC-style filings carry lineage back to the events that caused them. Validation and dataset-quality audits enforce this consistency.
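As one example of an accounting identity, total assets should equal total liabilities plus shareholders' equity. The sketch below checks that on a single balance-sheet row. The field names (`total_assets`, `total_liabilities`, `shareholders_equity`), the figures, and the rounding tolerance are assumptions made for illustration, not the documented response schema.

```python
import math


def balance_sheet_balances(row: dict, abs_tol: float = 0.5) -> bool:
    """Check assets = liabilities + equity within a small rounding tolerance."""
    rhs = row["total_liabilities"] + row["shareholders_equity"]
    return math.isclose(row["total_assets"], rhs, abs_tol=abs_tol)


# Hypothetical rows (illustrative numbers, not real or generated data).
balanced = {"total_assets": 1000.0, "total_liabilities": 620.0,
            "shareholders_equity": 380.0}
unbalanced = {"total_assets": 1000.0, "total_liabilities": 620.0,
              "shareholders_equity": 370.0}

print(balance_sheet_balances(balanced), balance_sheet_balances(unbalanced))
```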

David Data is a standard REST API with API-key auth, JSON responses, and a documented OpenAPI surface covering discovery, as-reported and segmented statements, screeners, KPI/non-GAAP views, index funds, interest rates, and market snapshots. Most existing clients only need to change the base URL and key.

Generation is calibrated against empirical reference panels for volatility, kurtosis, return clustering, drawdowns, and tail behavior. Calibration is a research workbench result, not an institutional sellability claim — for institutional use we offer custom calibration against licensed vendor panels.
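To compare a generated series with your own reference data, you can compute a few of the statistics named above from daily closes. The sketch below covers annualized volatility, excess kurtosis, and maximum drawdown using common textbook definitions. It is an assumption about how such a check might be done, not a description of the calibration pipeline, and the closes are made-up illustrative values.

```python
import math


def log_returns(closes: list[float]) -> list[float]:
    """Log returns between consecutive closes."""
    return [math.log(b / a) for a, b in zip(closes, closes[1:])]


def annualized_volatility(returns: list[float], periods_per_year: int = 252) -> float:
    """Sample standard deviation of returns, scaled to a yearly horizon."""
    mean = sum(returns) / len(returns)
    variance = sum((r - mean) ** 2 for r in returns) / (len(returns) - 1)
    return math.sqrt(variance * periods_per_year)


def excess_kurtosis(returns: list[float]) -> float:
    """Fourth standardized moment minus 3 (0 for a normal distribution)."""
    n = len(returns)
    mean = sum(returns) / n
    m2 = sum((r - mean) ** 2 for r in returns) / n
    m4 = sum((r - mean) ** 4 for r in returns) / n
    return m4 / m2 ** 2 - 3.0


def max_drawdown(closes: list[float]) -> float:
    """Largest peak-to-trough decline, as a negative fraction of the peak."""
    peak = closes[0]
    worst = 0.0
    for price in closes:
        peak = max(peak, price)
        worst = min(worst, price / peak - 1.0)
    return worst


# Illustrative closes only; substitute the "close" values from /prices.
closes = [100.0, 110.0, 99.0, 104.0, 90.0, 95.0]
returns = log_returns(closes)

print(annualized_volatility(returns))
print(excess_kurtosis(returns))
print(max_drawdown(closes))
```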

The library ships with 40 market themes, including crisis archetypes. On Enterprise, our team designs custom scenarios and themes, runs custom calibration, and delivers fully materialized institutional panels with on-prem or VPC deployment.

Stop fighting your data. Start owning it.

Get an API key and start querying deterministic market worlds in minutes. License-free ground truth for backtests and AI agents.