Market Data Primer
Start here if you are new to crypto exchange market data
Crypto market data is the public record of what exchange APIs publish while orders are submitted, matched, canceled, or amended. This primer gives you the mental model first: what public feeds contain, how orders meet in a matching engine, how trades and book changes are created, and how those events become data you can replay and analyze.
What exchange APIs publish
For real-time data, centralized crypto exchange APIs publish public market data over WebSocket as JSON messages. A client keeps the connection open, subscribes to a market and data type, and receives messages as the exchange publishes them. Exchange documentation uses names such as channels, streams, topics, or tables for those subscriptions.

Where market data comes from
On a centralized exchange, once an order reaches the matching engine, the engine compares it with open orders in that market's limit order book. This primer starts there because public market data is published from the trades and book changes that matching creates.
Inside the book, the core rule is price-time priority: better prices match first; at the same price, older orders have priority. The matching process creates trades and book changes, which public market data feeds then publish.
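As a rough illustration of price-time priority, the sketch below orders resting orders the way a matching engine would consider them. The `Order` class, its field names, and the `seq` arrival counter are hypothetical and chosen only for this example; real engines use their own internal structures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    order_id: str
    side: str     # "buy" or "sell"
    price: float
    size: float
    seq: int      # hypothetical arrival counter; lower means older


def priority_order(orders, side):
    """Return the resting orders of one side in matching priority."""
    same_side = [o for o in orders if o.side == side]
    if side == "sell":
        # Asks: lowest price first, then oldest order first.
        return sorted(same_side, key=lambda o: (o.price, o.seq))
    # Bids: highest price first, then oldest order first.
    return sorted(same_side, key=lambda o: (-o.price, o.seq))


resting = [
    Order("a1", "sell", 100.5, 1.0, seq=1),
    Order("a2", "sell", 100.0, 2.0, seq=2),
    Order("a3", "sell", 100.0, 0.5, seq=3),
    Order("b1", "buy", 99.0, 1.0, seq=4),
    Order("b2", "buy", 99.5, 1.0, seq=5),
]

ask_queue = [o.order_id for o in priority_order(resting, "sell")]
bid_queue = [o.order_id for o in priority_order(resting, "buy")]
```

Here `a2` is ahead of `a3` because both sit at the same price and `a2` arrived first, while `a1` waits behind both because its price is worse for a buyer.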

At a high level:
A participant submits a limit order, market order, cancel, or amendment.
The matching engine either changes the book or matches the order against open orders.
A match creates a trade.
The exchange publishes order book updates when book liquidity changes.
The exchange publishes trades and related market events when they happen.
Orders and the Level 2 book
An order book is the exchange's list of open buy and sell limit orders for one market.
In an L2 book, each row is a price level. A level has three pieces: side, price, and size. If several orders sit at the same side and price, the exchange aggregates them into one level. L2 shows total size at each price, not every individual order.
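The aggregation described above can be sketched as a sum of open order sizes per side and price. The tuple layout and the function name `aggregate_l2` are assumptions made for this example, not an exchange or Tardis format.

```python
from collections import defaultdict
from decimal import Decimal


def aggregate_l2(orders):
    """Collapse individual open orders into L2 levels keyed by (side, price)."""
    levels = defaultdict(Decimal)  # Decimal() starts each level at 0
    for side, price, size in orders:
        levels[(side, Decimal(price))] += Decimal(size)
    return dict(levels)


# Hypothetical open orders: two asks share one price.
open_orders = [
    ("ask", "10100.00", "1.50"),
    ("ask", "10100.00", "0.90"),
    ("bid", "10099.50", "2.00"),
]

l2 = aggregate_l2(open_orders)
ask_size = l2[("ask", Decimal("10100.00"))]
```

The two ask orders become a single level of 2.40, and the fact that two separate orders make up that size is no longer visible.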

Ask level: a price and the total size sellers offer at that price.
Bid level: a price and the total size buyers bid at that price.
Best ask: the lowest ask level; the lowest price to buy immediately.
Best bid: the highest bid level; the highest price to sell immediately.
Spread: the price gap between the best bid and best ask.
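A minimal sketch of these terms, assuming each side of the book is held as a mapping from price to aggregated size (a layout picked for this example only):

```python
from decimal import Decimal

# Hypothetical L2 book: price -> total size at that price.
bids = {Decimal("10099.50"): Decimal("2.00"), Decimal("10099.00"): Decimal("5.10")}
asks = {Decimal("10100.00"): Decimal("2.40"), Decimal("10100.50"): Decimal("1.00")}


def top_of_book(bids, asks):
    """Return (best_bid, best_ask, spread) for a non-empty two-sided book."""
    best_bid = max(bids)  # highest price buyers bid
    best_ask = min(asks)  # lowest price sellers offer
    return best_bid, best_ask, best_ask - best_bid


best_bid, best_ask, spread = top_of_book(bids, asks)
```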
Book depth describes how much detail a feed exposes:
L1 / top of book: best bid and best ask, with sizes when the feed provides them.
L2 / market by price: aggregated size at each price level.
L3 / market by order: individual order-level events, when the exchange provides them.
For Tardis-specific order book depth, L2/L3 coverage, snapshots, and reconstruction details, see the Order Books FAQ.
Book snapshots and updates
Most exchange APIs send an initial order book snapshot over the WebSocket feed, then publish updates as the book changes. The model is simple: snapshot first, then changes.

A snapshot is the book state at one point in time. An update is a message with the price levels that changed since the previous book state. One update message can contain several changes, including ask and bid changes.
To follow the current book:
Start from a snapshot.
Apply the next update.
Read the new book.
Continue with the next update.
In this L2 model, an update gives the new size for each changed price level. If an update reports the ask at 10100.00 as 2.40 BTC, the current size at that level becomes 2.40 BTC. A size of 0 removes the level.
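The snapshot-then-updates loop can be sketched as follows. The book layout and the `(side, price, size)` change tuples are assumptions for this example; real feeds use venue-specific message shapes.

```python
from decimal import Decimal


def apply_update(book, changes):
    """Apply one L2 update: each change sets the new size of a price level."""
    for side, price, size in changes:
        price, size = Decimal(price), Decimal(size)
        if size == 0:
            # A size of 0 removes the level.
            book[side].pop(price, None)
        else:
            # Any other size replaces the previous size at that level.
            book[side][price] = size
    return book


# Hypothetical snapshot.
book = {
    "bids": {Decimal("10099.50"): Decimal("2.00")},
    "asks": {Decimal("10100.00"): Decimal("1.50"), Decimal("10100.50"): Decimal("1.00")},
}

# One update message carrying both ask and bid changes.
update = [
    ("asks", "10100.00", "2.40"),  # level now holds 2.40
    ("asks", "10100.50", "0"),     # level removed
    ("bids", "10099.00", "0.75"),  # new level added
]
apply_update(book, update)
```

Note that the update carries absolute sizes, not deltas, so the 2.40 overwrites the earlier 1.50 instead of being added to it.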
Tardis CSV datasets include incremental L2 book updates and reconstructed order book snapshots for the top 25 and top 5 levels.
Orders and trades
A limit order sets an amount and a limit price: buy or sell up to this amount, but only at this price or better. Any unfilled amount becomes an open order in the limit order book.
A market order requests an immediate buy or sell against available liquidity. It does not set a limit price; it takes from the book until it is filled or there is not enough liquidity left.
A trade happens when an incoming order matches an order already in the book. For example, if the book has a limit sell order at 100103.00 and a market buy arrives for 0.30 BTC, the exchange publishes a trade at 100103.00 for 0.30 BTC.
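The trade example above can be sketched as a market buy walking the ask side. The resting sizes (0.50 BTC at 100103.00, plus a second level) are assumptions added for illustration, and `match_market_buy` is a hypothetical helper, not an exchange API.

```python
from decimal import Decimal


def match_market_buy(asks, amount):
    """Fill a market buy against asks given as [price, size] in priority order.

    Returns the list of (price, filled_size) trades and any unfilled amount.
    """
    remaining = Decimal(amount)
    trades = []
    while remaining > 0 and asks:
        level = asks[0]
        fill = min(remaining, level[1])
        trades.append((level[0], fill))
        level[1] -= fill
        remaining -= fill
        if level[1] == 0:
            asks.pop(0)  # level fully consumed
    return trades, remaining


asks = [
    [Decimal("100103.00"), Decimal("0.50")],  # assumed resting size
    [Decimal("100104.00"), Decimal("1.00")],  # assumed second level
]
trades, unfilled = match_market_buy(asks, "0.30")
```

Under these assumptions the match produces one trade at 100103.00 for 0.30 BTC and leaves 0.20 BTC resting at that price, which would show up as a book update for the level.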

Maker and taker describe the two roles in that match:
The maker order was already in the book and provided liquidity.
The taker order arrived and removed liquidity by matching against it.
In Tardis normalized trade messages, side is the liquidity taker side: buy means the taker bought from asks, and sell means the taker sold into bids.
Raw exchange feeds are venue-specific, so use the exchange's trade-side fields before mapping them.
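As one illustration of such a mapping, some venues report whether the buyer was the maker instead of reporting a taker side. The sketch below assumes a hypothetical raw field named `buyer_is_maker`; check the exchange documentation for the actual field and its meaning before relying on it.

```python
def taker_side(buyer_is_maker: bool) -> str:
    """Map a hypothetical buyer-is-maker flag to the liquidity taker side."""
    # If the buyer was resting in the book, the seller arrived and took liquidity.
    return "sell" if buyer_is_maker else "buy"


# Hypothetical raw trade, not a real exchange payload.
raw_trade = {"price": "100103.00", "amount": "0.30", "buyer_is_maker": False}
side = taker_side(raw_trade["buyer_is_maker"])
```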
Timestamps and message order
When working with market data, separate two time references:
Exchange timestamp: a time field supplied by the exchange. Most messages expose one; some expose several, such as event time and publish time. These are both exchange timestamps, but they are not interchangeable: event time describes when the trade or book change happened; publish time describes when the exchange emitted the message.
Arrival timestamp: when the message reached the recorder or data collector. In Tardis data this is localTimestamp.

Use the exchange timestamp when you care about the time assigned by the exchange. Use the arrival timestamp when you care about collection order or when the message reached the recorder.
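A short sketch of working with both time references. The messages below are invented, and the `timestamp` key is used here as an assumed name for the exchange timestamp; `localTimestamp` is the arrival timestamp mentioned above.

```python
from datetime import datetime, timedelta


def parse_ts(value):
    """Parse an ISO 8601 UTC string; the 'Z' swap keeps older Python versions happy."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


# Invented messages: m1 happened first on the exchange but arrived last.
messages = [
    {"id": "m2", "timestamp": "2024-01-01T00:00:00.200000Z",
     "localTimestamp": "2024-01-01T00:00:00.205000Z"},
    {"id": "m1", "timestamp": "2024-01-01T00:00:00.100000Z",
     "localTimestamp": "2024-01-01T00:00:00.230000Z"},
]


def delay_ms(msg):
    """Milliseconds between the exchange timestamp and arrival at the recorder."""
    delta = parse_ts(msg["localTimestamp"]) - parse_ts(msg["timestamp"])
    return delta / timedelta(milliseconds=1)


by_exchange_time = [m["id"] for m in sorted(messages, key=lambda m: parse_ts(m["timestamp"]))]
by_arrival = [m["id"] for m in sorted(messages, key=lambda m: parse_ts(m["localTimestamp"]))]
delays = {m["id"]: delay_ms(m) for m in messages}
```

The two sort orders disagree in this invented case, which is why it matters to pick the time reference that fits the question being asked.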
The Data FAQ covers timestamp semantics, same-timestamp ordering, and cross-channel synchronization in more detail.
High-frequency and low-frequency data
Frequency describes how much detail a feed preserves, not only how many messages arrive per second.
High-frequency data preserves individual exchange events and book changes. It includes trades, L2 book updates, best bid/ask updates, liquidations, and other event feeds when the exchange publishes them.
Low-frequency data is aggregated or sampled. It includes candles, periodic book snapshots, and other summaries where exact event sequence is not preserved.

Use high-frequency data when the event sequence matters: reconstructing books, replaying market conditions, simulating execution, or debugging feed behavior.
Use low-frequency data when a summary is enough: charting a price series, comparing volume over longer intervals, or inspecting sampled liquidity. A one-minute candle can tell you the open, high, low, close, and volume for that minute. It cannot tell you the order of trades and book changes inside the minute.
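To make the loss of sequence concrete, the sketch below builds a one-minute candle from two different invented trade sequences. The tuple layout `(second, price, amount)` and the function name are assumptions for this example.

```python
from decimal import Decimal


def one_minute_candle(trades):
    """Aggregate trades from one minute, given in event order, into OHLCV."""
    prices = [Decimal(price) for _, price, _ in trades]
    volume = sum((Decimal(amount) for _, _, amount in trades), Decimal("0"))
    return {
        "open": prices[0],
        "high": max(prices),
        "low": min(prices),
        "close": prices[-1],
        "volume": volume,
    }


# Same minute, different paths: up first versus down first.
trades_a = [(1, "100", "1"), (20, "105", "2"), (40, "95", "1"), (59, "101", "1")]
trades_b = [(1, "100", "1"), (20, "95", "1"), (40, "105", "2"), (59, "101", "1")]

candle_a = one_minute_candle(trades_a)
candle_b = one_minute_candle(trades_b)
```

Both sequences yield the same candle, so the candle alone cannot say whether the high or the low came first.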
Tardis.dev focuses on high-frequency, tick-level market data. Low-frequency views such as candles or sampled book snapshots are derived from tick-level data when needed.
Spot markets
Spot markets trade the asset itself against another asset.

In BTC/USDT spot, BTC is the base asset and USDT is the quote asset. The price is quoted in the quote asset per one unit of the base asset. A price of 100000 means 1 BTC = 100000 USDT.
A trade for 0.10 BTC at that price has 10000 USDT notional. Notional is the quote-asset value of a trade or position.
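The arithmetic above, written out as a small sketch. The `BASE/QUOTE` symbol format is only the one used in this section; exchange symbol formats differ.

```python
from decimal import Decimal


def spot_notional(base_amount, price):
    """Quote-asset value of a spot trade: base amount times price."""
    return Decimal(base_amount) * Decimal(price)


base, quote = "BTC/USDT".split("/")
value = spot_notional("0.10", "100000")  # 0.10 BTC at 100000 USDT per BTC
```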
Derivative markets
Derivatives do not trade the asset itself. They trade contracts whose value references an underlying asset, index, or market.
Symbol formats are exchange-specific. Many derivative symbols encode product details such as the underlying market, expiry, strike, or contract type.

A dated future is a contract with a fixed expiry or settlement date. Traders can open and close positions before expiry; positions open at expiry are settled by the exchange's settlement rules.
A perpetual swap is futures-like, but it has no expiry. Because it does not naturally converge to a settlement date, crypto perpetuals use funding payments between long and short positions to keep the contract price close to a reference market.
An option gives the holder the right, but not the obligation, to buy or sell the underlying at a strike price. It has an expiry and an option type such as call or put.
Derivative-specific market data includes:
Funding: periodic payments between long and short perpetual positions.
Open interest: total outstanding derivative exposure; the unit is exchange-specific.
Liquidation: forced position close events when an account no longer satisfies margin requirements.
Mark price: exchange reference price used for margin, PnL, and liquidation logic.
Implied volatility and greeks: option risk measures published by options markets.
Derivative contract units
Do not assume every amount, size, qty, or volume field means the same unit. In derivative feeds, size is defined by the contract specification. Depending on the venue, it counts contracts, base asset, quote value, notional value, or another exchange-defined unit.
A contract multiplier defines what one contract represents. The contract model defines how price changes turn into PnL and which currency the PnL settles in.

Common contract models:
Linear: exposure moves directly with the quoted price, and PnL settles in the quote or margin currency. For a BTCUSDT linear perpetual, a 0.10 BTC position at 100000 USDT has 10000 USDT notional.
Inverse: the contract is quoted in a fiat or stablecoin price, but PnL settles in the underlying asset. For a BTCUSD inverse contract, size is expressed as USD value while profit and loss are paid in BTC.
Quanto: the underlying market, quote currency, and settlement currency are not the same. The exchange multiplier converts price movement into the settlement currency.
Use instrument metadata and the exchange contract specification before comparing derivative sizes across venues.
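A simplified sketch of how the linear and inverse models turn a price move into PnL for a long position. The formulas are the commonly used textbook forms and assume a multiplier of 1 (one unit of base asset for linear, one USD per contract for inverse); actual venues may apply different multipliers, fees, and rounding, so treat this as an illustration only.

```python
from decimal import Decimal


def linear_long_pnl(size_base, entry, exit_price):
    """PnL in the quote currency for a long linear position sized in base asset."""
    return Decimal(size_base) * (Decimal(exit_price) - Decimal(entry))


def inverse_long_pnl(size_usd, entry, exit_price):
    """PnL in the underlying asset for a long inverse position sized in USD."""
    return Decimal(size_usd) * (1 / Decimal(entry) - 1 / Decimal(exit_price))


# 0.10 BTC long on a linear contract, price moves 100000 -> 101000.
pnl_usdt = linear_long_pnl("0.10", "100000", "101000")

# 10000 USD long on an inverse contract, price moves 100000 -> 125000.
pnl_btc = inverse_long_pnl("10000", "100000", "125000")
```

The linear result is in USDT and scales directly with the price change, while the inverse result is in BTC and depends on the difference of reciprocal prices, which is one reason sizes and PnL are not directly comparable across contract models.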
Raw or normalized data
Market data services expose exchange data in two broad formats.
Exchange-native raw data keeps the exchange's own message shape, field names, channels, and product-specific details. It is closest to what a live WebSocket client would have received from the exchange API.
Normalized data maps common market data events into a shared schema. It is easier to compare across exchanges, but it necessarily hides or renames some exchange-specific fields.
Compare the same BitMEX trade in exchange-native and normalized form:
Tardis.dev gives customers access to both exchange-native and normalized data. We record exchange-native source feeds and provide them through raw historical APIs, and we also export normalized CSV datasets and client-library objects built from that source.
Where to go next
Use this primer as the mental model, then choose the reference that matches the next job: