Exchanges' real-time feeds usually publish updates for multiple order book levels in a single message. If needed, you can recognize such batches by grouping rows by the local_timestamp field.
If you have any doubts about how to correctly reconstruct full order book state from the incremental_book_L2 CSV dataset, please see this answer or contact us.
If you only need order book data for the top 25 or top 5 levels, we provide datasets with already reconstructed snapshots for every update. See book_snapshot_25 and book_snapshot_5.
Deribit FUTURES instruments incremental order book L2 updates for 2020-09-01
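The grouping and reconstruction described above can be sketched roughly as follows. This is a minimal sketch, not a reference implementation. It assumes the CSV has columns named `local_timestamp`, `is_snapshot`, `side`, `price` and `amount`, that `side` is `bid` or `ask`, that an `amount` of 0 removes a price level, and that a snapshot row following an incremental row resets the book. None of these details are listed in this section, so check them against the incremental_book_L2 field list. It also assumes the file holds a single symbol; the sample rows are made up.

```python
import csv
import io
from itertools import groupby


def apply_updates(rows):
    """Yield (local_timestamp, bids, asks) after each batch of rows
    that share the same local_timestamp (one exchange message)."""
    bids, asks = {}, {}
    prev_was_snapshot = False
    # Rows are expected to be ordered by local_timestamp already.
    for local_ts, batch in groupby(rows, key=lambda r: r["local_timestamp"]):
        for row in batch:
            is_snapshot = row["is_snapshot"] == "true"
            # Assumption: a new snapshot after incremental updates resets state.
            if is_snapshot and not prev_was_snapshot:
                bids.clear()
                asks.clear()
            prev_was_snapshot = is_snapshot

            book = bids if row["side"] == "bid" else asks
            price, amount = float(row["price"]), float(row["amount"])
            if amount == 0:
                book.pop(price, None)  # assumption: 0 means level removed
            else:
                book[price] = amount  # amount is the new level size, not a delta
        yield int(local_ts), dict(bids), dict(asks)


# Hypothetical sample data, for illustration only.
SAMPLE = """symbol,timestamp,local_timestamp,is_snapshot,side,price,amount
BTC-PERPETUAL,1,100,true,bid,9999.5,10
BTC-PERPETUAL,1,100,true,ask,10000.0,5
BTC-PERPETUAL,2,200,false,bid,9999.5,0
BTC-PERPETUAL,2,200,false,bid,9999.0,7
"""

states = list(apply_updates(csv.DictReader(io.StringIO(SAMPLE))))
for local_ts, bids, asks in states:
    print(local_ts, max(bids), min(asks))
```

Yielding one state per `local_timestamp` group rather than per row avoids exposing a half-applied message, which matters when a single exchange message touches several levels.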
• book_snapshot_25
Tick-level order book snapshots reconstructed from exchanges' real-time WebSocket order book L2 data feeds. Each row represents the top 25 levels from each side of the limit order book and is recorded every time any of the tracked top 25 bid/ask levels changes.
Binance USDT Futures BTCUSDT top 25 levels order book snapshots for 2020-09-01
• book_snapshot_5
Tick-level order book snapshots reconstructed from exchanges' real-time WebSocket order book L2 data feeds. Each row represents the top 5 levels from each side of the limit order book and is recorded every time any of the tracked top 5 bid/ask levels changes.
instrument symbol as provided by exchange (always uppercase)
timestamp
timestamp provided by exchange in microseconds since epoch - if exchange does not provide one local_timestamp value is used as a fallback
local_timestamp
message arrival timestamp in microseconds since epoch
id
trade id as provided by exchange, empty if exchange does not provide one - different exchanges provide ids as numeric values, GUIDs or other strings, and some do not provide that information at all
side
liquidity taker side (aggressor), possible values:
buy - liquidity taker was buying
sell - liquidity taker was selling
unknown - exchange did not provide that information
OKEx Futures FUTURES instruments trades for 2020-03-01 dataset sample
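Since both timestamp fields are integers counting microseconds since epoch, converting them is a small exercise. The following is a minimal sketch with a made-up row. The helper names are hypothetical, and the column names `timestamp` and `local_timestamp` follow the field list above.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def from_micros(value):
    """Convert microseconds since epoch to a timezone-aware UTC datetime.

    Integer arithmetic via timedelta avoids float rounding of microseconds."""
    return EPOCH + timedelta(microseconds=int(value))


def arrival_delay_us(row):
    """Microseconds between exchange timestamp and message arrival.

    A result of 0 may simply mean local_timestamp was used as the fallback."""
    return int(row["local_timestamp"]) - int(row["timestamp"])


# Hypothetical trade row, for illustration only.
row = {
    "timestamp": "1583020800123000",
    "local_timestamp": "1583020800150000",
    "side": "buy",
}

print(from_micros(row["timestamp"]).isoformat())
print(arrival_delay_us(row))
```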
• options_chain
Tick-level options summary info (strike prices, expiration dates, open interest, implied volatility, greeks etc.) for all active options instruments collected from exchanges' real-time WebSocket options tickers data feeds. Options chain data is available for Deribit (sourced from ticker channel) and OKEx Options (sourced from option/summary and index/ticker channels).
For options_chain data type only 'OPTIONS' symbol is available (one file per day for all options instruments).
Top of the book (best bid/ask) data reconstructed from exchanges' real-time WebSocket order book L2 data feeds. Best bid/ask is recorded every time the top of the book changes.
We deliberately chose this solution over exchanges' native real-time quotes feeds, as those vary a lot between exchanges, can be throttled, are sometimes absent altogether, and are often delayed and published in batches compared to the more granular L2 updates that are the basis for our quotes dataset.
Derivative instrument ticker info (open interest, funding, mark price, index price) collected from exchanges' real-time WebSocket instruments & tickers data feeds.
A row is added to the final dataset every time any of the tracked values changes.
instrument symbol as provided by exchange (always uppercase)
timestamp
timestamp provided by exchange in microseconds since epoch - if exchange does not provide one local_timestamp value is used as a fallback
local_timestamp
message arrival timestamp in microseconds since epoch
funding_timestamp
timestamp of the next funding event in microseconds since epoch, empty if exchange does not provide one
funding_rate
funding rate that will take effect on the next funding event at funding timestamp, for some exchanges it's fixed, for others it fluctuates, empty if exchange does not provide one
predicted_funding_rate
estimated funding rate for the funding event that follows the closest (next) one, empty if exchange does not provide one
open_interest
current open interest, empty if exchange does not provide one
last_price
last instrument price, empty if exchange does not provide one
index_price
index price of the instrument, empty if exchange does not provide one
mark_price
mark price of the instrument, empty if exchange does not provide one
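Because most of the fields above may be empty, it helps to map empty cells to `None` explicitly instead of calling `float()` on them. The following is a minimal sketch. The `parse_ticker_row` helper is hypothetical, the sample header and values are made up, and the first column is assumed to be named `symbol`.

```python
import csv
import io

# Numeric fields that may be empty when the exchange does not provide them.
OPTIONAL_FLOAT_FIELDS = (
    "funding_rate",
    "predicted_funding_rate",
    "open_interest",
    "last_price",
    "index_price",
    "mark_price",
)


def parse_ticker_row(row):
    """Convert one derivative ticker CSV row (dict of strings) to typed values."""
    parsed = {
        "symbol": row["symbol"],
        "timestamp": int(row["timestamp"]),
        "local_timestamp": int(row["local_timestamp"]),
        # Empty cell means the exchange did not provide a funding timestamp.
        "funding_timestamp": int(row["funding_timestamp"]) if row["funding_timestamp"] else None,
    }
    for name in OPTIONAL_FLOAT_FIELDS:
        parsed[name] = float(row[name]) if row[name] else None
    return parsed


# Hypothetical sample data, for illustration only.
SAMPLE = (
    "symbol,timestamp,local_timestamp,funding_timestamp,funding_rate,"
    "predicted_funding_rate,open_interest,last_price,index_price,mark_price\n"
    "BTC-PERPETUAL,1598918400000000,1598918400050000,,0.0001,,1000000,11650.5,11649.8,11650.1\n"
)

parsed = parse_ticker_row(next(csv.DictReader(io.StringIO(SAMPLE))))
print(parsed)
```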
instrument symbol as provided by exchange (always uppercase)
timestamp
timestamp provided by exchange in microseconds since epoch - if exchange does not provide one local_timestamp value is used as a fallback
local_timestamp
message arrival timestamp in microseconds since epoch
id
liquidation id as provided by exchange, empty if exchange does not provide one - different exchanges provide ids as numeric values, GUIDs or other strings, and some do not provide that information at all
In addition to standard currency pairs and instrument symbols that can be requested via the CSV datasets API, each exchange has special grouped symbols available, depending on whether it supports a given market type: SPOT, FUTURES, OPTIONS and PERPETUALS. When such a symbol is requested, the downloaded file contains data for all instruments belonging to the given market type. This is especially useful for options instruments: specifying each option symbol one by one can be a tedious process, whereas using 'OPTIONS' as a symbol gives data for all options available at a given time.
those special symbols are also listed in response to /exchanges/:exchange API call
Datasets API details
all downloadable datasets are gzip compressed
historical market data is available in daily intervals (separate file for each day) based on local timestamp (timestamp of message arrival) split by exchange, data type and symbol
data for a given day is available on the next day around 6h after 00:00 UTC - exact date until when data is available can be requested via /exchanges/:exchange API call (datasets.exportedUntil), e.g., https://api.tardis.dev/v1/exchanges/ftx
datasets are ordered and split into separate daily files by local_timestamp (timestamp of message arrival time)
an empty gzip compressed file is returned when no data is available for a given day, symbol and data type, e.g., exchange downtime, very low volume currency pairs etc.
if timestamp equals local_timestamp, it means that the exchange didn't provide a timestamp for the message, e.g., BitMEX order book updates
cell in CSV file is empty if there's no value for it, e.g., no trade id if a given exchange doesn't provide one
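The points above about gzip compression, empty files and empty cells can be handled with the standard library alone. The following is a minimal sketch. The `read_dataset` and `exchange_timestamp_missing` helpers are hypothetical, the sample bytes are made up, and an "empty" file is assumed to decompress to zero bytes.

```python
import csv
import gzip
import io


def read_dataset(raw_bytes):
    """Decompress a gzip compressed CSV dataset into a list of dict rows.

    An empty dataset (no data for that day) yields an empty list."""
    text = gzip.decompress(raw_bytes).decode("utf-8")
    return list(csv.DictReader(io.StringIO(text)))


def exchange_timestamp_missing(row):
    """True when timestamp equals local_timestamp, i.e. the fallback was used."""
    return row["timestamp"] == row["local_timestamp"]


# Hypothetical sample data, for illustration only.
no_data_file = gzip.compress(b"")
trades_file = gzip.compress(
    b"symbol,timestamp,local_timestamp,id\n"
    b"XBTUSD,1000,1000,\n"
)

print(read_dataset(no_data_file))
rows = read_dataset(trades_file)
print(rows[0]["id"] == "", exchange_timestamp_missing(rows[0]))
```

For files saved to disk, `gzip.open(path, "rt", newline="")` can be passed straight to `csv.DictReader` to stream rows instead of loading the whole file.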
Returns gzip compressed CSV dataset for given exchange, data type, date (year, month, day) and symbol.
Path Parameters
Name
Type
Description
exchange
string
one of https://api.tardis.dev/v1/exchanges (field id, only exchanges with "supportsDatasets":true)
dataType
string
one of datasets.symbols[].dataTypes values from https://api.tardis.dev/v1/exchanges/:exchange API response
year
string
year in format YYYY (four-digit year)
month
string
month in format MM (two-digit month of the year)
day
string
day in format DD (two-digit day of the month)
symbol
string
one of datasets.symbols[].id values from https://api.tardis.dev/v1/exchanges/:exchange API response, see details below
Headers
Name
Type
Description
Authorization
string
For authenticated requests provide Authorization header with value: 'Bearer YOUR_API_KEY'.
Without API key historical datasets for the first day of each month are available to download.
Compared to the HTTP API, the symbols param provided to the datasets API must always be uppercase and have '/' and ':' characters replaced with '-' so that the symbol is URL safe.
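The path parameters, the Authorization header and the symbol rule above can be combined in a few small helpers. This is a minimal sketch and all four helper names are hypothetical. The base URL and the `exchange/dataType/YYYY/MM/DD/SYMBOL.csv.gz` path layout are assumptions that are not stated in this section, so verify them against the endpoint definition before relying on them.

```python
from datetime import date

# Assumption: base URL of the datasets API, not stated in this section.
DATASETS_BASE_URL = "https://datasets.tardis.dev/v1"


def normalize_symbol(symbol):
    """Uppercase the symbol and replace '/' and ':' with '-' so it is URL safe."""
    return symbol.upper().replace("/", "-").replace(":", "-")


def dataset_path(exchange, data_type, day, symbol):
    """Build the request path; the layout is an assumption, see the note above."""
    return f"{exchange}/{data_type}/{day:%Y/%m/%d}/{normalize_symbol(symbol)}.csv.gz"


def auth_headers(api_key=None):
    """Authorization header for authenticated requests, empty without a key."""
    return {"Authorization": f"Bearer {api_key}"} if api_key else {}


def is_free_sample(day):
    """Without an API key only the first day of each month is downloadable."""
    return day.day == 1


day = date(2019, 11, 1)
print(f"{DATASETS_BASE_URL}/{dataset_path('deribit', 'trades', day, 'BTC-PERPETUAL')}")
print(normalize_symbol("eth/usd"), auth_headers(), is_free_sample(day))
```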
# pip install tardis-dev
# requires Python >=3.6
from tardis_dev import datasets, get_exchange_details
import logging

# comment out to disable debug logs
logging.basicConfig(level=logging.DEBUG)


# function used by default if not provided via options
def default_file_name(exchange, data_type, date, symbol, format):
    return f"{exchange}_{data_type}_{date.strftime('%Y-%m-%d')}_{symbol}.{format}.gz"


# customized get filename function - saves data in nested directory structure
def file_name_nested(exchange, data_type, date, symbol, format):
    return f"{exchange}/{data_type}/{date.strftime('%Y-%m-%d')}_{symbol}.{format}.gz"


# returns data available at https://api.tardis.dev/v1/exchanges/deribit
deribit_details = get_exchange_details("deribit")
# print(deribit_details)

datasets.download(
    # one of https://api.tardis.dev/v1/exchanges with supportsDatasets:true - use 'id' value
    exchange="deribit",
    # accepted data types - 'datasets.symbols[].dataTypes' field in https://api.tardis.dev/v1/exchanges/deribit,
    # or get those values from 'deribit_details["datasets"]["symbols"][]["dataTypes"]' dict above
    data_types=["incremental_book_L2", "trades", "quotes", "derivative_ticker", "book_snapshot_25", "book_snapshot_5", "liquidations"],
    # change date ranges as needed to fetch full month or year for example
    from_date="2019-11-01",
    # to date is non inclusive
    to_date="2019-11-02",
    # accepted values: 'datasets.symbols[].id' field in https://api.tardis.dev/v1/exchanges/deribit
    symbols=["BTC-PERPETUAL", "ETH-PERPETUAL"],
    # (optional) your API key to get access to non sample data as well
    api_key="YOUR_API_KEY",
    # (optional) path where data will be downloaded into, default dir is './datasets'
    # download_dir="./datasets",
    # (optional) - one can customize downloaded file name/path (flat dir structure, or nested etc) - by default function 'default_file_name' is used
    # get_filename=default_file_name,
    # (optional) file_name_nested will download data to nested directory structure (split by exchange and data type)
    # get_filename=file_name_nested,
)
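Because the to date is non inclusive, a from/to pair of 2019-11-01 and 2019-11-02 covers exactly one day. The sketch below lists the file names one could expect for such a request, using the default naming scheme shown above. It does not call the tardis-dev library. The `expected_files` helper is hypothetical, and the "csv" format value and the iteration order are assumptions.

```python
from datetime import datetime, timedelta


def default_file_name(exchange, data_type, day, symbol, file_format):
    # Same pattern as the default naming function shown above.
    return f"{exchange}_{data_type}_{day.strftime('%Y-%m-%d')}_{symbol}.{file_format}.gz"


def expected_files(exchange, data_types, from_date, to_date, symbols, file_format="csv"):
    """List file names for the half-open date range [from_date, to_date)."""
    start = datetime.strptime(from_date, "%Y-%m-%d").date()
    end = datetime.strptime(to_date, "%Y-%m-%d").date()
    days = [start + timedelta(days=i) for i in range((end - start).days)]
    return [
        default_file_name(exchange, data_type, day, symbol, file_format)
        for data_type in data_types
        for day in days
        for symbol in symbols
    ]


files = expected_files(
    "deribit",
    ["trades", "quotes"],
    "2019-11-01",
    "2019-11-02",
    ["BTC-PERPETUAL", "ETH-PERPETUAL"],
)
for name in files:
    print(name)
```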
// npm install tardis-dev
// requires node version >=12
// remove it to disable debug logs
process.env.DEBUG = 'tardis-dev*'
const { downloadDatasets, getExchangeDetails } = require('tardis-dev')
;(async () => {
  // returns data available at https://api.tardis.dev/v1/exchanges/deribit
  const deribitDetails = await getExchangeDetails('deribit')
  // console.log(deribitDetails.datasets)
  await downloadDatasets({
    exchange: 'deribit', // one of https://api.tardis.dev/v1/exchanges with supportsDatasets:true - use 'id' value
    dataTypes: ['incremental_book_L2', 'trades', 'quotes', 'derivative_ticker', 'book_snapshot_25', 'book_snapshot_5', 'liquidations'], // accepted data types - 'datasets.symbols[].dataTypes' field in https://api.tardis.dev/v1/exchanges/deribit, or get those values from 'deribitDetails.datasets.symbols[].dataTypes' object above
    from: '2019-11-01', // change date ranges as needed to fetch full month or year for example
    to: '2019-11-02', // to date is non inclusive
    symbols: ['BTC-PERPETUAL', 'ETH-PERPETUAL'], // accepted values: 'datasets.symbols[].id' field in https://api.tardis.dev/v1/exchanges/deribit, or `deribitDetails.datasets.symbols[].id` from object above
    apiKey: 'YOUR_API_KEY', // (optional) your API key to get access to non sample data as well
    // downloadDir: './datasets', // (optional) path where data will be downloaded into, default dir is './datasets'
    // getFilename: getFilenameDefault, // (optional) - one can customize downloaded file name/path (flat dir structure, or nested etc) - by default function 'getFilenameDefault' is used
    // getFilename: getFilenameCustom // (optional) getFilenameCustom will download data to nested directory structure (split by exchange and data type)
  })
})().catch((e) => {
  console.log('download error', e)
})

// function used by default if not provided via options
function getFilenameDefault({ exchange, dataType, format, date, symbol }) {
  return `${exchange}_${dataType}_${date.toISOString().split('T')[0]}_${symbol}.${format}.gz`
}

// customized get filename function - saves data in nested directory structure
function getFilenameCustom({ exchange, dataType, format, date, symbol }) {
  return `${exchange}/${dataType}/${date.toISOString().split('T')[0]}_${symbol}.${format}.gz`
}