Data
Last updated
We provide the most comprehensive and granular market data on the market, with complete transparency into how the data is being recorded.
The following normalized tick-level data types are available:
order book snapshots (top 25 and top 5 levels)
derivative tick info (open interest, funding rate, mark price, index price)
that is available for subscriptions provides data in . See to learn about captured for each exchange. Each captured channel can be considered a different exchange specific data type (for example , or ).
We also provide the following via our (normalization is done client-side, using as a data source):
trades
order book L2 updates
order book snapshots (tick-by-tick, 10ms, 100ms, 1s, 10s etc)
quotes
derivative tick info (open interest, funding rate, mark price, index price)
liquidations
options summary
OHLCV
volume/tick based trade bars
Raw market data is sourced from exchanges' real-time WebSocket APIs. In cases where an exchange lacks a WebSocket API for a particular data type, we fall back to polling its REST API periodically, e.g., for Binance Futures open interest data.
L2 data (market-by-price) includes bid and ask orders aggregated by price level and can be used to analyze, among other things:
order book imbalance
average execution cost
average liquidity away from midpoint
average spread
hidden interest (i.e., iceberg orders)
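As a small illustration, some of the metrics above can be computed directly from L2 price levels. The levels below are made up for the example; real ones would come from the order book data itself.

```python
# Illustrative L2 levels (made up): (price, amount), best level first
bids = [(359.72, 121.259), (359.70, 50.0)]
asks = [(359.80, 8.101), (359.85, 12.0)]

best_bid, best_ask = bids[0][0], asks[0][0]
spread = best_ask - best_bid
midpoint = (best_bid + best_ask) / 2

bid_volume = sum(amount for _, amount in bids)
ask_volume = sum(amount for _, amount in asks)
# imbalance in [-1, 1]: positive means more resting liquidity on the bid side
imbalance = (bid_volume - ask_volume) / (bid_volume + ask_volume)
```

Average spread, execution cost, and liquidity away from the midpoint follow the same pattern, aggregated over many such book states.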
L3 data (market-by-order) includes every order book order addition, update, cancellation and match, and can be used to analyze, among other things:
order resting time
order fill probability
order queue dynamics
We always collect full depth order book data as long as the exchange's WebSocket API supports it. The table below shows the current state of affairs for each supported exchange.
exchange
order book depth
order book updates frequency
full order book depth snapshot and updates
real-time
full order book depth snapshot and updates
real-time
top 1000 levels initial order book snapshot, full depth incremental order book updates
real-time, dynamically adjusted
top 1000 levels initial order book snapshot, full depth incremental order book updates
real-time, dynamically adjusted
top 1000 levels initial order book snapshot, full depth incremental order book updates
100ms
top 100 levels initial order book snapshot and updates
real-time
top 400 levels initial order book snapshot and updates
real-time
top 400 levels initial order book snapshot and updates
real-time
top 400 levels initial order book snapshot and updates
real-time
top 400 levels initial order book snapshot and updates
real-time
top 150 levels initial order book snapshot and updates
30ms
top 150 levels initial order book snapshot and updates
30ms
top 150 levels initial order book snapshot and updates
30ms
top 150 levels initial order book snapshot and updates
100ms
top 100 levels initial order book snapshot and updates
real-time
top 100 levels initial order book snapshot and updates
real-time
full order book depth snapshot and updates
real-time
full order book depth snapshot and updates
real-time
top 1000 levels initial order book snapshot and updates
real-time
full order book depth snapshot and updates
real-time
full order book depth snapshot and updates
real-time
full order book depth snapshot and updates
real-time
top 25 levels initial order book snapshot and updates
real-time
full order book depth snapshot and updates
real-time
top 15 levels snapshots
real-time
top 30 levels initial order book snapshot and updates
20ms
top 100 levels initial order book snapshot and updates
real-time
top 1000 levels initial order book snapshot, full depth incremental order book updates
100ms
top 20 levels order book snapshots
unknown
top 30 levels order book snapshots
unknown
top 400 levels initial order book snapshot and updates
real-time
full order book depth snapshot and updates
real-time
full order book depth snapshot and updates
real-time
top 1000 levels initial order book snapshot, full depth incremental order book updates
100ms
exchange
available since
notes
2019-03-30
2019-03-30
2019-11-17
2020-06-16
2019-08-01
2020-12-18
collected by polling OKEx REST APIs since liquidations aren't available via WS feeds
2020-12-18
collected by polling OKEx REST APIs since liquidations aren't available via WS feeds
2020-06-24
2020-06-24
2019-09-14
2019-03-30
2020-12-18
A futures contract is a contract with an expiry date (for example, a quarter ahead for quarterly futures). The futures contract price converges to the spot price as the contract approaches its expiration/settlement date. After a futures contract expires, the exchange settles it and replaces it with a new contract for the next period (the next quarter in our example).
A perpetual swap contract, also commonly called "perp", "swap", "perpetual" or "perpetual future" in crypto exchange nomenclature, is very similar to a futures contract but does not have an expiry date (hence "perpetual"). To ensure that the perpetual swap contract price stays near the spot price, exchanges employ a mechanism called the funding rate. When the funding rate is positive, longs pay shorts. When the funding rate is negative, shorts pay longs. This mechanism can be quite nuanced and varies between exchanges, so it's best to study each contract specification to learn all the details (funding periods, mark price mechanisms, etc.).
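As a rough illustration of the mechanism (the numbers are made up, and the actual formulas, funding periods and mark price mechanics vary per exchange, so treat this as a sketch only):

```python
# Illustrative funding payment arithmetic; numbers are made up and the
# exact formulas, periods and mark price mechanics vary per exchange.
position_notional = 10_000.0  # USD value of the perpetual position
funding_rate = 0.0001         # +0.01% for the current funding period

payment = position_notional * funding_rate
# positive funding rate: longs pay shorts; negative: shorts pay longs
paid_by_longs = payment if funding_rate > 0 else 0.0
paid_by_shorts = -payment if funding_rate < 0 else 0.0
```

Here a long holding $10,000 of notional would pay $1 to shorts for this funding period.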
For example, a BitMEX trade message looks like this:
and this is a Deribit trade message:
Sample normalized trade message:
tick-by-tick trades
order book L2 updates
order book snapshots (tick-by-tick, 10ms, 100ms, 1s, 10s etc)
quotes
derivative tick info (open interest, funding rate, mark price, index price)
liquidations
OHLCV
volume/tick based trade bars
channel
field used in the HTTP API and client libs replay functions

UTC, always.
Although it should never happen in theory, in practice, due to various crypto exchange bugs and peculiarities, it can happen (very occasionally); see some posts from users reporting those issues:
Let's take FTX as an example and start with its order book snapshot message (frequently called a 'partial' in exchanges' API docs as well). The remaining bid and ask levels were removed from this sample message for the sake of clarity.
Such a snapshot message maps to the following rows in the CSV file:

exchange,symbol,timestamp,local_timestamp,is_snapshot,side,price,amount
ftx,ETH/USD,1601510401216632,1601510401316432,true,ask,359.8,8.101
ftx,ETH/USD,1601510401216632,1601510401316432,true,bid,359.72,121.259
... and here's a sample FTX order book update message.
Let's see how it maps to the CSV format.

exchange,symbol,timestamp,local_timestamp,is_snapshot,side,price,amount
ftx,ETH/USD,1601510427184054,1601510427204046,false,ask,360.24,4.962
ftx,ETH/USD,1601510427184054,1601510427204036,false,ask,361.02,0
For each row in the CSV file (iterating in the same order as rows appear in the file):
only when the local timestamp of the current row is larger than the previous row's local timestamp (local_timestamp column value) can you read your local order book state as consistent. Why? The CSV format is flat, where each row represents a single price level update, but most exchanges' real-time feeds publish multiple order book level updates in a single WebSocket message, and those need to be processed together before reading the locally maintained order book state. We use the local timestamp value to detect all price level updates belonging to a single 'update' message.
if the current row is part of a snapshot (is_snapshot column value set to true) and the previous one was not, reset your local order book state object that tracks price levels for each order book side - it means there was a connection restart and the exchange provided a full order book snapshot, or it was the start of a new day (each incremental_book_L2 file starts with a snapshot)
if the current row's amount is set to zero (amount column value set to 0), remove that price level (row's price column) from your local order book state, as such a price level does not exist anymore
if the current row's amount is not zero, update your local order book state's price level with the new value, or add a new price level if it does not yet exist - maintain bid and ask order book sides separately (side column value)
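The steps above can be sketched in Python. This is a minimal illustration, not official client code: rows are assumed to be dicts parsed from an incremental_book_L2 CSV file (e.g. via csv.DictReader), and on_book_ready is a hypothetical callback invoked whenever the locally maintained book is consistent.

```python
def reconstruct_book(rows, on_book_ready=lambda bids, asks: None):
    """Replay incremental_book_L2 rows into bid/ask {price: amount} maps."""
    bids, asks = {}, {}
    prev_local_ts = None
    prev_is_snapshot = False
    for row in rows:
        local_ts = int(row["local_timestamp"])
        # a larger local_timestamp means all level updates from the previous
        # WebSocket message were applied - the book is consistent here
        if prev_local_ts is not None and local_ts > prev_local_ts:
            on_book_ready(bids, asks)
        is_snapshot = row["is_snapshot"] == "true"
        if is_snapshot and not prev_is_snapshot:
            # connection restart or start of a new day - reset local state
            bids.clear()
            asks.clear()
        side = bids if row["side"] == "bid" else asks
        price, amount = float(row["price"]), float(row["amount"])
        if amount == 0:
            side.pop(price, None)   # price level no longer exists
        else:
            side[price] = amount    # add or update the price level
        prev_local_ts = local_ts
        prev_is_snapshot = is_snapshot
    return bids, asks
```

Running it over the FTX sample rows shown earlier yields one bid level and two ask levels; the zero-amount 361.02 row ensures that level is absent from the book.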
We always collect and provide data with the most granularity that the exchange can offer via its . High frequency can mean different things for different exchanges due to exchange API limitations. For example, for it can mean (market-by-order), for all order book real-time updates, and for Spot it means order book updates aggregated into 100ms intervals.
See for more details and why .
Recording exchanges' real-time WebSocket feeds allows us to preserve and provide that exchanges' APIs can offer, including data that is simply not available via their REST APIs, like tick-level order book updates. Historical data sourced from WebSocket real-time feeds adheres to what you'll see when trading live and can be used to exactly replicate live conditions, even if it means some occasional causing , real-time data publishing delays especially during larger market moves, or in some edge cases. We find that trade-off acceptable: even if the data isn't as clean and corrected as data sourced from REST APIs, it allows for more insight into market microstructure and various unusual exchange behaviors that simply can't be captured otherwise. A simple example would be latency spikes for many exchanges during periods of increased volatility, where an exchange publishes trade/order book/quote WebSocket messages with larger than usual latency, or simply skips some of the updates and then returns them in one batch. Querying the REST API would result in a nice, clean trade history, but such data wouldn't fully reflect real actionable market behavior and would result in unrealistic backtesting results that break down in real-time scenarios.
See for more details.
We do provide L2 data both in , (top 25 and top 5 levels) as well as in format via client-side.
Historical L3 data is currently available via API for , and - the remaining supported exchanges provide only.
data is sourced from exchanges' WebSocket APIs when supported, with a fallback to polling REST APIs when WebSocket APIs do not support that data type, and can be accessed via ) or as .
collected from channel
collected from channel (trades with liquidation
flag)
collected from stream, since 2021-04-27 liquidation orders streams do not push realtime order data anymore, instead, they push snapshot order data at a maximum frequency of 1 order push per second
collected from stream, since 2021-04-27 liquidation orders streams do not push realtime order data anymore, instead, they push snapshot order data at a maximum frequency of 1 order push per second
collected from channel (trades with liquidation
flag)
collected from channel
collected from channel
collected from channel
collected from channel (trades with liquidation
type)
up until 2021-09-20 collected by polling Bybit REST APIs since liquidations weren't available via WS feeds; starting from 2021-09-20 collected from channel
Yes, we do provide historical options data for and - see CSV data type and and exchange details pages.
We cover all leading derivatives exchanges such as , , , , , , , , , and
See CSV if you'd like to download data for all futures or perpetual swaps as a single file for a given exchange, instead of one by one for each individual instrument.
We are focused on providing the best possible tick-level historical data for cryptocurrency exchanges, and as of now our APIs (both and ) offer access to tick-level data only and do not support time-based aggregated data.
If you're interested in time-based aggregated data (OHLC, interval-based order book snapshots), see our that provide such capabilities, with the caveat that data aggregation is performed client-side from tick-level data sourced from the API, meaning it can be a relatively slow process in contrast to ready-to-download aggregated data.
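As an illustration of the kind of client-side aggregation involved, here is a minimal sketch that buckets tick-level trades into fixed-interval OHLCV bars (the trade tuples and interval are made up for the example):

```python
def to_ohlcv(trades, interval_us):
    """Aggregate (timestamp_us, price, amount) tuples into OHLCV bars.

    Returns {bar_open_timestamp_us: [open, high, low, close, volume]}.
    """
    bars = {}
    for ts, price, amount in trades:
        bucket = ts - ts % interval_us  # bar open time
        if bucket not in bars:
            bars[bucket] = [price, price, price, price, 0.0]
        bar = bars[bucket]
        bar[1] = max(bar[1], price)  # high
        bar[2] = min(bar[2], price)  # low
        bar[3] = price               # close (last trade wins)
        bar[4] += amount             # volume
    return bars
```

Volume/tick-based bars follow the same idea, except the bucket boundary is crossed when accumulated volume or trade count reaches a threshold rather than when time elapses.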
Yes, we're always open to supporting new promising exchanges. and we'll get back to you to discuss the details.
(unified data format for every exchange) is available via our and . Our provides data only in .
The data we provide has contract amounts exactly as provided by exchanges' APIs, meaning in some cases it can be tricky to compare across exchanges due to different contract multipliers (for example OKEx, where each contract has a $100 value) or different contract types (linear or inverse). We'll keep it this way, but we also provide that returns contract multipliers, tick sizes and more for each instrument in a uniform way, allowing you to easily normalize contract amounts client-side without having to go through all kinds of documentation on various exchanges to find this information.
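As a sketch of that client-side normalization (the multiplier, price and contract type below are assumed for illustration; real values should come from the instrument metadata described above):

```python
def contracts_to_base(contracts, multiplier, price, inverse=True):
    """Convert a contract count into base-currency size.

    inverse: multiplier is the contract value in quote currency (e.g. $100)
    linear:  multiplier is the contract size in base currency
    """
    if inverse:
        return contracts * multiplier / price
    return contracts * multiplier

# e.g. 50 inverse contracts worth $100 each, at a price of $20,000
size = contracts_to_base(50, 100.0, 20_000.0)
```

Here 50 inverse contracts at $20,000 correspond to 0.25 units of the base currency.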
Cryptocurrency markets are very fragmented and every exchange provides data in its own bespoke data format, which we call the exchange-native data format. Our and can provide market data in this format, meaning the data you receive is exactly the same as the live data you would have received from the exchanges ("as-is").
See in exchange-native format and .
In contrast, normalized data format means the same, unified format across multiple exchanges. We provide normalized data via our (data normalization is performed client-side) as well as via .
In the process of data normalization, we map the data we collect (exchange-native format) to a normalized/unified format across exchanges that is easier to deal with (one data format across multiple exchanges). from exchange-native to normalized format to make the whole process as transparent as possible.
We support the following normalized data types via our :
and :
order book snapshots (top 25 and top 5 levels)
derivative tick info (open interest, funding rate, mark price, index price)
When publishing real-time data messages, exchanges always publish them for the subscription topics clients have subscribed to. Those subscription topics are also very often called "channels" or "streams" in exchanges' documentation pages and describe the data type a given message belongs to - for example, publishes its trades data via and order book L2 updates data via .
Since we collect the data for all the channels described in the exchanges' details pages (), our and offer filtering capability by those channel names, so for example, to get historical trades for , the channel needs to be provided alongside the requested instrument symbols (via HTTP API or client lib replay function args).
We're doing our best to provide the most complete and reliable historical raw data API on the market. To do so, amongst , we utilize on Google Cloud Platform that offers best-in-class availability, networking and monitoring. However, due to exchanges' API downtimes (maintenance, deployments, etc.), we can experience data gaps and cannot guarantee 100% data completeness, but rather 99.9% (99.99% on most days), which should be more than enough for most of the use cases that tick-level data is useful for.
In rare circumstances, when an exchange's API changes without any notice or we hit new unexpected rate limits, we may also fail to record data during such a period; it happens very rarely and is very specific to each exchange. Use the API endpoint and check the incidentReports field in order to get the most detailed and up-to-date information on that subject.
As long as an exchange's WebSocket API is not 'hidden' behind a Cloudflare proxy (causing relatively frequent "CloudFlare WebSocket proxy restarting, Connection reset by peer" errors), connections are stable for the majority of supported exchanges and there are almost no connection drops during the day. When there is more volatility in the market, some exchanges tend to drop connections more frequently or have larger latency spikes. Overall it's a nuanced matter that changes over time; if you have any questions regarding a particular exchange, please do not hesitate to .
We do track sequence numbers of WebSocket L2 order book messages when collecting the data and restart the connection when a sequence gap is detected, for exchanges that provide those numbers. We observe that even when sequence numbers are in check, bid/ask overlap can occur. When that happens, exchanges tend to 'forget' to publish delete messages for the opposite side of the book when publishing a new level for a given side. We validated that hypothesis by comparing reconstructed order book snapshots that had a crossed order book (bid/ask overlap), for which we removed the overlapping opposite-side levels manually (as the exchange didn't publish that 'delete'), against quote/ticker feeds to check whether the best bid/ask matched (for exchanges that provide those) - see .
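The manual fix described above can be sketched as follows - a hypothetical helper (not actual recording code) that, when a newly published level crosses the book, purges the stale opposite-side levels the exchange 'forgot' to delete:

```python
def apply_level(bids, asks, side, price, amount):
    """Upsert a price level; drop stale opposite-side levels it crosses."""
    book, opposite = (bids, asks) if side == "bid" else (asks, bids)
    if amount == 0:
        book.pop(price, None)
        return
    book[price] = amount
    # remove opposite-side levels the new level overlaps with, emulating
    # the 'delete' messages the exchange failed to publish
    if side == "bid":
        stale = [p for p in opposite if p <= price]
    else:
        stale = [p for p in opposite if p >= price]
    for p in stale:
        del opposite[p]
```

After purging, the reconstructed best bid/ask can be cross-checked against the exchange's quote/ticker feed, which is how the hypothesis was validated.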
That shouldn't happen in theory, but we've detected that for some exchanges, when a new connection is established, sometimes the first message for a given channel & symbol has a newer timestamp than a subsequent message, e.g., an order book snapshot has a newer timestamp than the first order book update. This is why we provide data via and for given date ranges based on (the timestamp of message arrival), which is always monotonically increasing.
Some exchanges occasionally publish duplicated trades (trades with the same ids). Since we collect real-time data, we also collect and provide duplicated trades via if those were published by exchanges' real-time WebSocket feeds. Our have functionality that, when working with , can deduplicate such trades; similarly, for we deduplicate data.
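Client-side deduplication by trade id can be sketched as follows (a minimal in-memory approach; a production version would bound the size of the seen set):

```python
def dedupe_trades(trades):
    """Yield trades, skipping any whose 'id' was already seen."""
    seen = set()
    for trade in trades:
        if trade["id"] in seen:
            continue  # duplicate published by the exchange feed - drop it
        seen.add(trade["id"])
        yield trade
```

Because duplicates tend to appear close together (e.g. around reconnects), a bounded window of recent ids is usually sufficient in practice.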
Historical market data available via provides order book snapshots at the beginning of each day (00:00 UTC) - .
We also provide custom order book snapshots with customizable time intervals from tick-by-tick, milliseconds to minutes or hours via in which case custom snapshots are computed client side from raw data provided via HTTP API as well as via - and .
Order books are collected in streaming mode - snapshot at the beginning of each day and then incremental updates. .
We also provide custom order book snapshots with customizable time intervals from tick-by-tick, milliseconds to minutes or hours via in which case custom snapshots are computed client side from raw data provided via HTTP API as well as via - and .
Cryptocurrency exchanges' real-time APIs vary a lot, but for they all tend to follow a similar flow: first, when the WS connection is established and the subscription is confirmed, exchanges send an initial order book snapshot (all existing price levels or top 'x' levels, depending on the exchange) and then start streaming 'book update' messages (also frequently called deltas). Those updates, when applied to the initial snapshot, result in an up-to-date order book state at any given time.
We do provide initial L2 snapshots in the incremental_book_L2 dataset at the beginning of each day (00:00 UTC, ), but also anytime an exchange closes its real-time WebSocket connection, .
See if you have doubts about how to reconstruct order book state based on the data provided in the incremental_book_L2 dataset.
See also .
In order to reconstruct full order book state correctly from data:
Alternatively, we also provide order book snapshot CSV datasets ready to download.
CSV datasets are available in daily intervals, split by exchange, data type and symbol. In addition to standard currency pair/instrument symbols, each exchange also has special , depending on whether it supports a given market type: SPOT, FUTURES, OPTIONS and PERPETUALS. That feature is useful if someone is interested in, for example, all of Deribit's options instruments' trades or quotes data without the need to request data for each symbol separately, one by one.
Each message received via WebSocket connection is timestamped with 100ns precision using at arrival time (before any message processing) and stored in ISO 8601 format.
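For reference, the microsecond epoch timestamps that appear in the CSV samples above (e.g. the local_timestamp values) can be converted to ISO 8601 like this. Note that Python's datetime is limited to microsecond precision, which matches the CSV columns rather than the full 100ns capture precision:

```python
from datetime import datetime, timedelta, timezone

def micros_to_iso(ts_us):
    """Convert a microsecond epoch timestamp to an ISO 8601 string."""
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    return (epoch + timedelta(microseconds=ts_us)).isoformat()

# e.g. the sample snapshot timestamp from the CSV examples above
micros_to_iso(1601510401216632)  # '2020-10-01T00:00:01.216632+00:00'
```

Note the sample falls one second after 00:00 UTC, consistent with snapshots being taken at the beginning of each day.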
For it's 15 minutes (T - 15min), for a given day are available on the next day around 06:00 UTC.