Perfectly reproducible
Pin a scenario and it returns identical prices, statements, filings, and events every time. Reproduce any backtest or training run to the cent.
David Data generates deterministic, internally consistent public-company worlds — prices, fundamentals, filings, news, and events — served through a REST API. Perfect ground truth for backtesting and training AI agents, with no vendor licensing limits.
$ curl "https://data.davidhf.com/prices?scenario_id=scn_8f3a&ticker=NVDA&interval=day" \
  -H "X-API-KEY: ••••••••"
{
  "ticker": "NVDA",
  "interval": "day",
  "prices": [
    { "date": "2026-01-02",
      "open": 142.18, "high": 145.07,
      "low": 141.55, "close": 144.92,
      "volume": 41203885 },
    { "date": "2026-01-05",
      "open": 145.10, "high": 149.33,
      "low": 144.81, "close": 148.76,
      "volume": 52994117 }
  ]
}
200 OK · X-RateLimit-Remaining: 5,998 · deterministic from seed
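As a quick sanity check on the response shape shown above, the following Python sketch tests basic OHLC invariants (low at or below open and close, high at or above both, non-negative volume). The payload is copied from the sample response. The helper name `ohlc_violations` is illustrative and not part of the API.

```python
import json

# Sample payload copied from the response shown above.
SAMPLE = json.loads("""
{
  "ticker": "NVDA",
  "interval": "day",
  "prices": [
    {"date": "2026-01-02", "open": 142.18, "high": 145.07,
     "low": 141.55, "close": 144.92, "volume": 41203885},
    {"date": "2026-01-05", "open": 145.10, "high": 149.33,
     "low": 144.81, "close": 148.76, "volume": 52994117}
  ]
}
""")


def ohlc_violations(bars):
    """Return the dates of bars that break basic OHLC invariants."""
    bad = []
    for bar in bars:
        ok = (
            bar["low"] <= min(bar["open"], bar["close"])
            and max(bar["open"], bar["close"]) <= bar["high"]
            and bar["volume"] >= 0
        )
        if not ok:
            bad.append(bar["date"])
    return bad


print(ohlc_violations(SAMPLE["prices"]))  # [] for the two sample bars
```

The same function could be run over a full price history as a cheap regression check in CI.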
Most synthetic data is a random walk with a ticker glued on. David Data builds coherent company worlds where the prices, financials, filings, and headlines all agree with each other.
Accounting identities hold, OHLC invariants are valid, news repeats the structured earnings numbers, and filings carry real lineage to events.
Prices, fundamentals, earnings, guidance, analyst revisions, news, SEC-style filings, ownership, insider trades, and corporate actions — all linked.
16,143 real exchange-style symbols with company reference metadata sourced from yfinance. Operating companies, ETFs, and benchmarks behave correctly.
40 market themes — war energy shocks, bank runs, AI IPO manias, Fed pivots, semiconductor blockades — with event-aware macro tapes around each catalyst.
A fully documented OpenAPI surface with API-key auth and rate limits. Standard REST and JSON — point your existing client at it and go.
A complete endpoint surface — discovery, statements, screeners, KPI views, ownership, and market snapshots — all from one consistent REST API.
Plus validation, dataset-quality, and calibration metadata endpoints — full reference in the interactive docs.
Work across pre-built train/validation/test/holdout splits spanning 720 scenarios. Because the data is clean ground truth, there is no leakage, no restatement noise, and no vendor point-in-time licensing to negotiate.
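One way to confirm on your side that splits do not share scenarios is a pairwise overlap check. This is a minimal sketch: the split names come from the text above, but the scenario IDs are made up, and how the API lists split membership is an assumption.

```python
from itertools import combinations


def overlapping_scenarios(splits):
    """Map each pair of split names to the scenario IDs they share."""
    overlaps = {}
    for (name_a, ids_a), (name_b, ids_b) in combinations(splits.items(), 2):
        shared = set(ids_a) & set(ids_b)
        if shared:
            overlaps[(name_a, name_b)] = shared
    return overlaps


# Hypothetical scenario IDs; the real membership lists would come from the API.
splits = {
    "train": ["scn_0001", "scn_0002"],
    "validation": ["scn_0003"],
    "test": ["scn_0004"],
    "holdout": ["scn_0005"],
}
assert not overlapping_scenarios(splits)  # no scenario appears in two splits
```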
Give LLM agents a complete, self-consistent world to read — filings that cite events, news that repeats the reported numbers, earnings that tie to guidance. Score reasoning against a known answer key.
Stand up dashboards, screeners, and analytics on a realistic dataset you can ship in demos, sandboxes, and CI. Swap in the synthetic endpoint anywhere your client expects a financial data feed.
API-key auth, JSON responses, and rate-limit headers on every request. Standard REST — if you've used a financial data API before, you already know this one.
# 1. Browse the available market-world scenarios
curl "https://data.davidhf.com/scenarios" \
-H "X-API-KEY: $DAVID_DATA_KEY"
# 2. Query prices from a scenario
curl "https://data.davidhf.com/prices?scenario_id=$SCN&ticker=AAPL&interval=day" \
  -H "X-API-KEY: $DAVID_DATA_KEY"
Begin on the free tier with no card required. Every paid plan adds scale, more scenarios, and support. Features are never paywalled.
Kick the tires and prototype against real data shapes.
For solo developers, demos, and side projects.
For quant teams and AI labs running real workloads.
For institutions needing scale, SLAs, and custom worlds.
No — and that's the point. Every value is synthetically generated to be internally consistent. You get realistic, calibrated dynamics without any licensed market data, so there are no redistribution restrictions or point-in-time licensing to manage. Company reference metadata (names, sectors) is sourced from public yfinance data; all prices, fundamentals, news, filings, and events are synthetic.
Every scenario is generated deterministically, so a given scenario always returns the exact same prices, statements, filings, and events. That makes backtests and AI training runs perfectly reproducible and auditable.
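A simple way to audit that claim in your own pipeline is to fingerprint each response and compare hashes across runs. The sketch below hashes a canonical JSON serialization; the function name `fingerprint` and the two literal payloads are illustrative stand-ins for parsed API responses.

```python
import hashlib
import json


def fingerprint(payload):
    """SHA-256 of a JSON payload serialized in a canonical form."""
    # Sorted keys and fixed separators make the hash independent of key order.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Two fetches of the same pinned scenario, represented here by literal dicts
# with different key order; a real check would pass parsed API responses.
run_a = {"ticker": "NVDA", "prices": [{"date": "2026-01-02", "close": 144.92}]}
run_b = {"prices": [{"close": 144.92, "date": "2026-01-02"}], "ticker": "NVDA"}

assert fingerprint(run_a) == fingerprint(run_b)
```

Storing the fingerprint alongside each backtest or training run gives a cheap record of exactly which data the run saw.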
A random walk gives you a price series with nothing behind it. David Data builds a coherent world: accounting identities hold, OHLC invariants are valid, earnings tie to guidance, news repeats the reported numbers, and SEC-style filings carry lineage back to the events that caused them. Validation and dataset-quality audits enforce this consistency.
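To illustrate what "accounting identities hold" means in practice, here is a minimal check of the balance sheet identity (assets = liabilities + equity). The field names `total_assets`, `total_liabilities`, and `total_equity` are hypothetical, as are the figures; the actual statement schema is defined in the API docs.

```python
from decimal import Decimal


def balance_sheet_gap(statement):
    """Return assets - (liabilities + equity); zero when the identity holds."""
    # Decimal avoids binary floating-point drift when comparing to the cent.
    assets = Decimal(str(statement["total_assets"]))
    liabilities = Decimal(str(statement["total_liabilities"]))
    equity = Decimal(str(statement["total_equity"]))
    return assets - (liabilities + equity)


# Made-up figures and hypothetical field names, for illustration only.
statement = {
    "total_assets": 1250.40,
    "total_liabilities": 730.15,
    "total_equity": 520.25,
}
assert balance_sheet_gap(statement) == 0
```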
Yes. It's a standard REST API with API-key auth, JSON responses, and a documented OpenAPI surface — discovery, as-reported and segmented statements, screeners, KPI/non-GAAP views, index funds, interest rates, and market snapshots. Most clients only need to change the base URL and key.
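As a sketch of what "change the base URL and key" might look like, the snippet below builds a request with the base URL and key as parameters, using only the Python standard library. The `/prices` path, query parameters, and `X-API-KEY` header match the curl examples on this page; the helper name `build_request` and the `demo-key` fallback are assumptions for illustration.

```python
import os
from urllib.parse import urlencode
from urllib.request import Request


def build_request(base_url, api_key, path, **params):
    """Build a GET request carrying the X-API-KEY header from the curl examples."""
    query = urlencode(params)
    url = f"{base_url.rstrip('/')}/{path.lstrip('/')}"
    if query:
        url = f"{url}?{query}"
    return Request(url, headers={"X-API-KEY": api_key})


req = build_request(
    "https://data.davidhf.com",
    os.environ.get("DAVID_DATA_KEY", "demo-key"),  # placeholder fallback key
    "/prices",
    scenario_id="scn_8f3a",
    ticker="NVDA",
    interval="day",
)
print(req.full_url)
```

Passing `req` to `urllib.request.urlopen` would send it. Pointing the same helper at another provider only requires a different `base_url` and `api_key`.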
Generation is calibrated against empirical reference panels for volatility, kurtosis, return clustering, drawdowns, and tail behavior. Calibration is a research-workbench result, not a claim of institutional-grade fit. For institutional use, we offer custom calibration against licensed vendor panels.
Yes. The library ships with 40 market themes including crisis archetypes. On Enterprise, our team designs custom scenarios and themes, runs custom calibration, and delivers fully materialized institutional panels with on-prem or VPC deployment.
Get an API key and start querying deterministic market worlds in minutes. License-free ground truth for backtests and AI agents.