Locally runnable server with built-in data caching, providing both tick-level historical and consolidated real-time cryptocurrency market data via HTTP and WebSocket APIs
Tardis-machine is a locally runnable server with built-in data caching that uses the Tardis.dev HTTP API under the hood. It provides both tick-level historical and consolidated real-time cryptocurrency market data via its HTTP and WebSocket APIs and is available via npm and Docker.
Features
efficient data replay API endpoints returning historical market data for whole time periods (in contrast to the Tardis.dev HTTP API, where a single call returns data for a single minute)
WebSocket API providing historical market data replay from any given past point in time with the same data format and 'subscribe' logic as real-time exchanges' APIs - in many cases existing exchanges' WebSocket clients can be used to connect to this endpoint
Docker

    ### running without persistent local cache
    docker run -p 8000:8000 -p 8001:8001 -e "TM_API_KEY=YOUR_API_KEY" -d tardisdev/tardis-machine
Tardis-machine server's HTTP endpoints will be available on port 8000 and WebSocket API endpoints on port 8001. Your API key is passed via the TM_API_KEY environment variable; replace YOUR_API_KEY with the API key you received via email.
The command above does not use persistent volumes for local caching (each Docker restart results in losing the local data cache). To keep the cache across restarts, mount a host directory, for example ./host-cache-dir, as a persistent volume (bind mount) at the container's cache directory (TM_CACHE_DIR, /.cache by default).
Since using volumes can cause issues, especially on Windows, it's fine to run the Docker image without them, with the caveat of a potentially poor local cache hit ratio after each container restart.
Config environment variables
You can set the following environment variables to configure the tardis-machine server:
| name | default | description |
| --- | --- | --- |
| TM_API_KEY | | API key for Tardis.dev HTTP API - if not provided only first day of each month of historical data is accessible |
| TM_PORT | 8000 | HTTP port on which server will be running, WebSocket port is always this value + 1 (8001 with port set to 8000) |
| TM_CACHE_DIR | /.cache | path to local dir that will be used as cache location |
| TM_CLUSTER_MODE | false | will launch cluster of Node.js processes to handle the incoming requests if set to true, by default server runs in single process mode |
| TM_DEBUG | false | server will print verbose debug logs to stdout if set to true |
| TM_CLEAR_CACHE | false | server will clear local cache dir on startup if set to true |
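Because the WebSocket port is always the HTTP port plus one, client code only needs to know the HTTP port to derive both base URLs. A minimal Python sketch of that rule, assuming a server reachable on localhost; the helper name `tardis_machine_endpoints` is hypothetical and not part of tardis-machine:

```python
def tardis_machine_endpoints(http_port=8000, host="localhost"):
    """Return (http_base_url, ws_base_url) for a tardis-machine server.

    Hypothetical helper: it only encodes the documented rule that the
    WebSocket port is the HTTP port + 1.
    """
    return f"http://{host}:{http_port}", f"ws://{host}:{http_port + 1}"


http_url, ws_url = tardis_machine_endpoints(8000)
print(http_url, ws_url)
```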
npm
Requires Node.js v12+ and git installed.
Install and run tardis-machine server via the npx command:

    npx tardis-machine --api-key=YOUR_API_KEY

or install globally via npm:

    npm install -g tardis-machine

and then run:

    tardis-machine --api-key=YOUR_API_KEY
Tardis-machine server's HTTP endpoints will be available on port 8000 and WebSocket API endpoints on port 8001. Your API key is passed via the --api-key config flag; replace YOUR_API_KEY with the API key you received via email.
CLI config flags
You can configure tardis-machine server via environment variables as described in Docker section as well.
You can set the following CLI config flags when starting a tardis-machine server installed via npm:

| name | default | description |
| --- | --- | --- |
| --api-key | | API key for Tardis.dev HTTP API - if not provided only first day of each month of historical data is accessible |
| --port | 8000 | HTTP port on which server will be running, WebSocket port is always this value + 1 (8001 with port set to 8000) |
| --cache-dir | <os.tmpdir>/.tardis-cache | path to local dir that will be used as cache location - if not provided default temp dir for given OS will be used |
| --cluster-mode | false | will launch cluster of Node.js processes to handle the incoming requests if set to true, by default server runs in single process mode |
| --debug | false | server will print verbose debug logs to stdout if set to true |
| --clear-cache | false | server will clear local cache dir on startup if set to true |
| --help | | shows CLI help |
| --version | | shows tardis-machine version number |
Exchange-native market data APIs
Exchange-native market data API endpoints provide historical data in exchange-native format. The main difference between HTTP and WebSocket endpoints is the logic of requesting data:
HTTP API accepts the replay options payload via the options query string param
WebSocket API accepts exchange-specific 'subscribe' messages that define what data will then be "replayed" and sent to the WebSocket client
• HTTP GET /replay?options={options}
Returns historical market data messages in exchange-native format for given replay options query string param. Single streaming HTTP response returns data for the whole requested time period as NDJSON.
In our preliminary benchmarks on AMD Ryzen 7 3700X, 64GB RAM, HTTP /replay API endpoint was returning ~700 000 messages/s (already locally cached data).
Python:

    import asyncio
    import aiohttp
    import json
    import urllib.parse


    async def replay_via_tardis_machine(replay_options):
        timeout = aiohttp.ClientTimeout(total=0)

        async with aiohttp.ClientSession(timeout=timeout) as session:
            # url encode as json object options
            encoded_options = urllib.parse.quote_plus(json.dumps(replay_options))

            # assumes tardis-machine HTTP API running on localhost:8000
            url = f"http://localhost:8000/replay?options={encoded_options}"

            async with session.get(url) as response:
                # otherwise we may get line too long errors
                response.content._high_water = 100_000_000

                # returned data is in NDJSON format http://ndjson.org/
                # each line is separate message JSON encoded
                async for line in response.content:
                    yield line


    async def run():
        lines = replay_via_tardis_machine(
            {
                "exchange": "bitmex",
                "from": "2019-10-01",
                "to": "2019-10-02",
                "filters": [
                    {"channel": "trade", "symbols": ["XBTUSD", "ETHUSD"]},
                    {"channel": "orderBookL2", "symbols": ["XBTUSD", "ETHUSD"]},
                ],
            }
        )

        async for line in lines:
            message = json.loads(line)
            # localTimestamp string marks timestamp when message was received
            # message is a message dict as provided by exchange real-time stream
            print(message["localTimestamp"], message["message"])


    asyncio.run(run())
Node.js:

    const fetch = require('node-fetch')
    const split2 = require('split2')

    const serialize = options => {
      return encodeURIComponent(JSON.stringify(options))
    }

    async function* replayViaTardisMachine(options) {
      // assumes tardis-machine HTTP API running on localhost:8000
      const url = `http://localhost:8000/replay?options=${serialize(options)}`
      const response = await fetch(url)

      // returned data is in NDJSON format http://ndjson.org/
      // each line is separate message JSON encoded

      // split response body stream by new lines
      const lines = response.body.pipe(split2())

      for await (const line of lines) {
        yield line
      }
    }

    async function run() {
      const options = {
        exchange: 'bitmex',
        from: '2019-10-01',
        to: '2019-10-02',
        filters: [
          { channel: 'trade', symbols: ['XBTUSD', 'ETHUSD'] },
          { channel: 'orderBookL2', symbols: ['XBTUSD', 'ETHUSD'] }
        ]
      }

      const lines = replayViaTardisMachine(options)

      for await (const line of lines) {
        // localTimestamp string marks timestamp when message was received
        // message is a message object as provided by exchange real-time stream
        const { message, localTimestamp } = JSON.parse(line)

        console.log(message, localTimestamp)
      }
    }

    run()
We're working on providing more samples and dedicated client libraries in different languages. In the meantime, to consume HTTP /replay API responses in your language of choice, you should:
Provide the URL-encoded JSON options object via the options query string param when sending the HTTP request
Parse the HTTP response stream line by line as it's returned - buffering the whole response in memory may result in slow performance and memory overflows
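The two steps above can be sketched without any HTTP library, which may help when porting them to another client. This is a minimal illustration, assuming the server runs on localhost:8000; `build_replay_url` and `iter_lines` are hypothetical helper names, and `iter_lines` accepts any iterable of byte chunks (for example the chunks yielded by an HTTP client's streaming body).

```python
import json
import urllib.parse


def build_replay_url(options, base_url="http://localhost:8000"):
    """URL-encode the options object as JSON for the options query param."""
    encoded = urllib.parse.quote(json.dumps(options), safe="")
    return f"{base_url}/replay?options={encoded}"


def iter_lines(chunks):
    """Yield complete lines from an iterable of byte chunks without
    buffering the whole response in memory."""
    pending = b""
    for chunk in chunks:
        pending += chunk
        # everything before the last newline consists of complete lines,
        # the remainder stays buffered until more data arrives
        *complete, pending = pending.split(b"\n")
        for line in complete:
            yield line
    if pending:
        yield pending


print(build_replay_url({"exchange": "bitmex", "from": "2019-10-01", "to": "2019-10-02"}))
```

Only the unfinished tail of the current chunk is kept in memory, so memory use stays bounded by the longest single line rather than by the size of the response.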
| name | type | default | description |
| --- | --- | --- | --- |
| from | string | - | replay period start date (UTC) in ISO 8601 format, e.g., 2019-04-01 |
| to | string | - | replay period end date (UTC) in ISO 8601 format, e.g., 2019-04-02 |
| withDisconnects | boolean \| undefined | undefined | when set to true, response includes empty lines (\n) that mark events when real-time WebSocket connection that was used to collect the historical data got disconnected |
Response format
Streamed HTTP response provides data in NDJSON format (new line delimited JSON) - each response line is a JSON with market data message in exchange-native format plus local timestamp:
localTimestamp - date when message has been received in ISO 8601 format
message - JSON with exactly the same format as provided by requested exchange real-time feeds
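As a rough illustration of this response format, the sketch below parses a single response line and treats an empty line as a disconnect marker (relevant only when withDisconnects is true). The function name `parse_replay_line` and the sample payload are made up for the example; real message bodies depend on the exchange.

```python
import json


def parse_replay_line(line):
    """Parse one NDJSON line from the /replay response.

    Returns None for an empty line (a disconnect marker), otherwise a
    (local_timestamp, message) tuple with the timestamp kept as a string.
    """
    text = line.decode("utf-8") if isinstance(line, bytes) else line
    text = text.strip()
    if not text:
        return None
    record = json.loads(text)
    return record["localTimestamp"], record["message"]


# illustrative line only, not captured from a real exchange feed
sample = b'{"localTimestamp":"2019-10-01T00:00:00.123Z","message":{"table":"trade"}}\n'
print(parse_replay_line(sample))
```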
• WebSocket /ws-replay?exchange={exchange}&from={from}&to={to}
Exchanges' WebSocket APIs are designed to publish real-time market data feeds, not historical ones. Tardis-machine WebSocket /ws-replay API fills that gap and allows "replaying" historical market data from any given past point in time with the same data format and 'subscribe' logic as real-time exchanges' APIs. In many cases existing exchanges' WebSocket clients can be used to connect to this endpoint just by changing the URL, and receive historical market data in exchange-native format for date ranges specified in URL query string params.
After the connection is established, the client has 2 seconds to send subscription payloads, and then market data replay starts.
If two clients connect at the same time requesting data for different exchanges and provide the same session key via query string param, then data sent to those clients will be synchronized (by local timestamp).
In our preliminary benchmarks on AMD Ryzen 7 3700X, 64GB RAM, WebSocket /ws-replay API endpoint was sending ~500 000 messages/s (already locally cached data).
You can also use an existing WebSocket client, just by changing the URL endpoint, as shown in the example below that uses ccxws.
Node.js:

    const ccxws = require('ccxws')
    const BASE_URL = 'ws://localhost:8001/ws-replay'
    const WS_REPLAY_URL = `${BASE_URL}?exchange=bitmex&from=2019-10-01&to=2019-10-02`

    const bitMEXClient = new ccxws.bitmex()

    // only change required for ccxws client is to point it to /ws-replay URL
    bitMEXClient._wssPath = WS_REPLAY_URL

    const market = {
      id: 'XBTUSD',
      base: 'BTC',
      quote: 'USD'
    }

    bitMEXClient.on('l2snapshot', snapshot =>
      console.log('snapshot', snapshot.asks.length, snapshot.bids.length)
    )
    bitMEXClient.on('l2update', update => console.log(update))
    bitMEXClient.on('trade', trade => console.log(trade))

    bitMEXClient.subscribeTrades(market)
    bitMEXClient.subscribeLevel2Updates(market)
If you already use a WebSocket client that connects to and consumes a real-time exchange market data feed, in most cases you can use it to connect to the /ws-replay API as well, just by changing the URL endpoint.
Query string params
| name | type | default | description |
| --- | --- | --- | --- |
| exchange | string | - | requested exchange id - use /exchanges HTTP API to get list of valid exchanges ids |
| from | string | - | replay period start date (UTC) in ISO 8601 format, e.g., 2019-04-01 |
| to | string | - | replay period end date (UTC) in ISO 8601 format, e.g., 2019-04-02 |
| session | string \| undefined | undefined | optional replay session key. When specified and multiple clients use it when connecting at the same time, data sent to those clients is synchronized (by local timestamp). |
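For illustration, the query string params above can be assembled as follows. This is a sketch that assumes the default WebSocket port 8001 on localhost; `build_ws_replay_url` is a hypothetical helper and the session key value is arbitrary.

```python
from urllib.parse import urlencode


def build_ws_replay_url(exchange, from_date, to_date, session=None,
                        base_url="ws://localhost:8001"):
    """Build a /ws-replay URL from the documented query string params."""
    params = {"exchange": exchange, "from": from_date, "to": to_date}
    if session is not None:
        # clients that share a session key receive synchronized data
        params["session"] = session
    return f"{base_url}/ws-replay?{urlencode(params)}"


# two connections for different exchanges sharing one (arbitrary) session key
print(build_ws_replay_url("bitmex", "2019-10-01", "2019-10-02", session="my-session"))
print(build_ws_replay_url("deribit", "2019-10-01", "2019-10-02", session="my-session"))
```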
Normalized market data APIs
Normalized market data API endpoints provide data in a unified format across all supported exchanges. Both HTTP /replay-normalized and WebSocket /ws-replay-normalized APIs accept the same replay options payload via query string param. Which protocol to use is mostly a matter of preference, but the WebSocket /ws-replay-normalized API also has a real-time counterpart, /ws-stream-normalized, which connects directly to exchanges' real-time WebSocket APIs. This opens the possibility of seamless switching between real-time streaming and historical normalized market data replay.
• HTTP GET /replay-normalized?options={options}
In our preliminary benchmarks on AMD Ryzen 7 3700X, 64GB RAM, HTTP /replay-normalized API endpoint was returning ~100 000 messages/s, and ~50 000 messages/s when order book snapshots were also requested.
Python:

    import asyncio
    import aiohttp
    import json
    import urllib.parse


    async def replay_normalized_via_tardis_machine(replay_options):
        timeout = aiohttp.ClientTimeout(total=0)

        async with aiohttp.ClientSession(timeout=timeout) as session:
            # url encode as json object options
            encoded_options = urllib.parse.quote_plus(json.dumps(replay_options))

            # assumes tardis-machine HTTP API running on localhost:8000
            url = f"http://localhost:8000/replay-normalized?options={encoded_options}"

            async with session.get(url) as response:
                # otherwise we may get line too long errors
                response.content._high_water = 100_000_000

                # returned data is in NDJSON format http://ndjson.org/ streamed
                # each line is separate message JSON encoded
                async for line in response.content:
                    yield line


    async def run():
        lines = replay_normalized_via_tardis_machine(
            {
                "exchange": "bitmex",
                "from": "2019-10-01",
                "to": "2019-10-02",
                "symbols": ["XBTUSD", "ETHUSD"],
                "withDisconnectMessages": True,
                # other available data types examples:
                # 'book_snapshot_10_100ms', 'derivative_ticker', 'quote',
                # 'trade_bar_10ms', 'trade_bar_10s'
                "dataTypes": ["trade", "book_change", "book_snapshot_10_100ms"],
            }
        )

        async for line in lines:
            normalized_message = json.loads(line)
            print(normalized_message)


    asyncio.run(run())
Node.js:

    const fetch = require('node-fetch')
    const split2 = require('split2')

    const serialize = options => {
      return encodeURIComponent(JSON.stringify(options))
    }

    async function* replayNormalizedViaTardisMachine(options) {
      // assumes tardis-machine HTTP API running on localhost:8000
      const url = `http://localhost:8000/replay-normalized?options=${serialize(options)}`
      const response = await fetch(url)

      // returned data is in NDJSON format http://ndjson.org/
      // each line is separate message JSON encoded

      // split response body stream by new lines
      const lines = response.body.pipe(split2())

      for await (const line of lines) {
        yield line
      }
    }

    async function run() {
      const options = {
        exchange: 'bitmex',
        from: '2019-10-01',
        to: '2019-10-02',
        symbols: ['XBTUSD', 'ETHUSD'],
        withDisconnectMessages: true,
        // other available data types examples:
        // 'book_snapshot_10_100ms', 'derivative_ticker', 'quote',
        // 'trade_bar_10ms', 'trade_bar_10s'
        dataTypes: ['trade', 'book_change', 'book_snapshot_10_100ms']
      }

      const lines = replayNormalizedViaTardisMachine(options)

      for await (const line of lines) {
        const normalizedMessage = JSON.parse(line)

        console.log(normalizedMessage)
      }
    }

    run()
We're working on providing more samples and dedicated client libraries in different languages. In the meantime, to consume HTTP /replay-normalized API responses in your language of choice, you should:
Provide the URL-encoded JSON options via the options query string param when sending the HTTP request
Parse the HTTP response stream line by line as it's returned - buffering the whole response in memory may result in slow performance and memory overflows
Options JSON needs to be an object or an array of objects with fields as specified below. If array is provided, then data requested for multiple exchanges is returned synchronized (by local timestamp).
| name | type | default | description |
| --- | --- | --- | --- |
| exchange | string | - | requested exchange id - use /exchanges HTTP API to get list of valid exchanges ids |
| symbols | string[] \| undefined | undefined | optional symbols of requested historical data feed - use /exchanges/:exchange HTTP API to get allowed symbols for requested exchange |
| from | string | - | replay period start date (UTC) in ISO 8601 format, e.g., 2019-04-01 |
| to | string | - | replay period end date (UTC) in ISO 8601 format, e.g., 2019-04-02 |
| dataTypes | string[] | - | array of normalized data types for which historical data will be returned |
| withDisconnectMessages | boolean \| undefined | undefined | when set to true, response includes disconnect messages that mark events when real-time WebSocket connection that was used to collect the historical data got disconnected |
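To make "synchronized (by local timestamp)" concrete, the sketch below interleaves two per-exchange message lists so that localTimestamp never decreases. It is a client-side illustration of the ordering only, not the server's implementation, and it assumes all timestamps share the same fixed-width ISO 8601 UTC format so that string comparison matches chronological order. The sample messages are invented.

```python
import heapq


def merge_by_local_timestamp(*streams):
    """Interleave already-ordered message streams by localTimestamp."""
    return heapq.merge(*streams, key=lambda message: message["localTimestamp"])


# invented sample messages, each list already ordered by localTimestamp
bitmex = [
    {"exchange": "bitmex", "localTimestamp": "2019-10-01T00:00:00.100Z"},
    {"exchange": "bitmex", "localTimestamp": "2019-10-01T00:00:00.300Z"},
]
deribit = [
    {"exchange": "deribit", "localTimestamp": "2019-10-01T00:00:00.200Z"},
]

for message in merge_by_local_timestamp(bitmex, deribit):
    print(message["localTimestamp"], message["exchange"])
```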
• WebSocket /ws-replay-normalized?options={options}
WebSocket /ws-stream-normalized is the real-time counterpart of this API endpoint, providing real-time market data in the same format, but not requiring an API key, as it connects directly to exchanges' real-time WebSocket APIs.
Python:

    import asyncio
    import aiohttp
    import json
    import urllib.parse


    async def run():
        replay_options = {
            "exchange": "bitmex",
            "from": "2019-10-01",
            "to": "2019-10-02",
            "symbols": ["XBTUSD", "ETHUSD"],
            "withDisconnectMessages": True,
            # other available data types examples:
            # 'book_snapshot_10_100ms', 'derivative_ticker', 'quote',
            # 'trade_bar_10ms', 'trade_bar_10s'
            "dataTypes": ["trade", "book_change", "book_snapshot_10_100ms"],
        }

        options = urllib.parse.quote_plus(json.dumps(replay_options))

        URL = f"ws://localhost:8001/ws-replay-normalized?options={options}"

        async with aiohttp.ClientSession() as session:
            async with session.ws_connect(URL) as websocket:
                async for msg in websocket:
                    print(msg.data)


    asyncio.run(run())
Node.js:

    const WebSocket = require('ws')

    const serialize = options => {
      return encodeURIComponent(JSON.stringify(options))
    }

    const replayOptions = {
      exchange: 'bitmex',
      from: '2019-10-01',
      to: '2019-10-02',
      symbols: ['XBTUSD', 'ETHUSD'],
      withDisconnectMessages: true,
      // other available data types examples:
      // 'book_snapshot_10_100ms', 'derivative_ticker', 'quote',
      // 'trade_bar_10ms', 'trade_bar_10s'
      dataTypes: ['trade', 'book_change', 'book_snapshot_10_100ms']
    }

    const options = serialize(replayOptions)
    const URL = `ws://localhost:8001/ws-replay-normalized?options=${options}`
    const ws = new WebSocket(URL)

    ws.onmessage = message => {
      console.log(message.data)
    }
We're working on providing more samples and dedicated client libraries in different languages. In the meantime, to consume WebSocket /ws-replay-normalized API responses in your language of choice, you should:
Provide the URL-encoded JSON options via the options query string param when connecting to the WebSocket /ws-replay-normalized endpoint
Options JSON needs to be an object or an array of objects with fields as specified below. If an array is provided, then data requested for multiple exchanges is sent synchronized (by local timestamp).
| name | type | default | description |
| --- | --- | --- | --- |
| exchange | string | - | requested exchange id - use /exchanges HTTP API to get list of valid exchanges ids |
| symbols | string[] \| undefined | undefined | optional symbols of requested historical data feed - use /exchanges/:exchange HTTP API to get allowed symbols for requested exchange |
| from | string | - | replay period start date (UTC) in ISO 8601 format, e.g., 2019-04-01 |
| to | string | - | replay period end date (UTC) in ISO 8601 format, e.g., 2019-04-02 |
| dataTypes | string[] | - | array of normalized data types for which historical data will be provided |
| withDisconnectMessages | boolean \| undefined | undefined | when set to true, also sends disconnect messages that mark events when real-time WebSocket connection that was used to collect the historical data got disconnected |
In our preliminary benchmarks on AMD Ryzen 7 3700X, 64GB RAM, WebSocket /ws-replay-normalized API endpoint was returning ~70 000 messages/s, and ~40 000 messages/s when order book snapshots were also requested.
• WebSocket /ws-stream-normalized?options={options}
Doesn't require an API key, as it connects directly to exchanges' real-time WebSocket APIs, and transparently restarts closed, broken or stale connections (open connections without data being sent for a specified amount of time).
When options are provided as an array, it offers consolidated real-time market data streaming: a single real-time data stream for all exchanges specified in the options array.
WebSocket /ws-replay-normalized is the historical counterpart of this API endpoint, providing historical market data in the same format.
Python:

    import asyncio
    import aiohttp
    import json
    import urllib.parse


    async def run():
        data_types = ["trade", "book_change", "book_snapshot_10_100ms"]

        stream_options = [
            {
                "exchange": "bitmex",
                "symbols": ["XBTUSD"],
                "dataTypes": data_types,
            },
            {
                "exchange": "deribit",
                "symbols": ["BTC-PERPETUAL"],
                "dataTypes": data_types,
            },
        ]

        options = urllib.parse.quote_plus(json.dumps(stream_options))

        URL = f"ws://localhost:8001/ws-stream-normalized?options={options}"

        # real-time normalized data for two exchanges via single connection
        async with aiohttp.ClientSession() as session:
            async with session.ws_connect(URL) as websocket:
                async for msg in websocket:
                    print(msg.data)


    asyncio.run(run())
Node.js:

    const WebSocket = require('ws')

    const serialize = options => {
      return encodeURIComponent(JSON.stringify(options))
    }

    // other available data types examples:
    // 'book_snapshot_10_100ms', 'derivative_ticker', 'quote',
    // 'trade_bar_10ms', 'trade_bar_10s'
    const dataTypes = ['trade', 'book_change', 'book_snapshot_10_100ms']

    const streamOptions = [
      {
        exchange: 'bitmex',
        symbols: ['XBTUSD'],
        dataTypes
      },
      {
        exchange: 'deribit',
        symbols: ['BTC-PERPETUAL'],
        dataTypes
      }
    ]

    const options = serialize(streamOptions)
    const URL = `ws://localhost:8001/ws-stream-normalized?options=${options}`
    const ws = new WebSocket(URL)

    // real-time normalized data for two exchanges via single connection
    ws.onmessage = message => {
      console.log(message.data)
    }
We're working on providing more samples and dedicated client libraries in different languages. In the meantime, to consume WebSocket /ws-stream-normalized API responses in your language of choice, you should:
Provide the URL-encoded JSON options via the options query string param when connecting to the WebSocket /ws-stream-normalized endpoint
Options JSON needs to be an object or an array of objects with fields as specified below. If array is specified then API provides single consolidated real-time data stream for all exchanges specified (as in examples above).
| name | type | default | description |
| --- | --- | --- | --- |
| exchange | string | - | requested exchange id - use /exchanges HTTP API to get list of valid exchanges ids |
| symbols | string[] \| undefined | undefined | optional symbols of requested real-time data feed |
| dataTypes | string[] | - | array of normalized data types for which real-time data will be provided |
| withDisconnectMessages | boolean \| undefined | undefined | when set to true, sends disconnect messages anytime underlying exchange real-time WebSocket connection(s) gets disconnected |
| timeoutIntervalMS | number | 10000 | specifies time in milliseconds after which connection to real-time exchanges' WebSocket API is restarted if no message has been received |
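As an illustration of how these fields combine, the sketch below builds one options array covering several exchanges and encodes it into a /ws-stream-normalized URL. The helper name `build_stream_url` and its argument shape are hypothetical; the field names come from the table above, and localhost:8001 is assumed.

```python
import json
import urllib.parse


def build_stream_url(exchange_symbols, data_types, timeout_interval_ms=10000,
                     base_url="ws://localhost:8001"):
    """Build a /ws-stream-normalized URL for one consolidated stream.

    exchange_symbols maps an exchange id to its list of symbols.
    """
    options = [
        {
            "exchange": exchange,
            "symbols": symbols,
            "dataTypes": data_types,
            "withDisconnectMessages": True,
            "timeoutIntervalMS": timeout_interval_ms,
        }
        for exchange, symbols in exchange_symbols.items()
    ]
    encoded = urllib.parse.quote(json.dumps(options), safe="")
    return f"{base_url}/ws-stream-normalized?options={encoded}"


print(build_stream_url(
    {"bitmex": ["XBTUSD"], "deribit": ["BTC-PERPETUAL"]},
    ["trade", "book_change"],
))
```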
• trade

    {
      type: 'trade'
      symbol: string // instrument symbol as provided by exchange
      exchange: string // exchange id
      id: string | undefined // trade id if provided by exchange
      price: number // trade price as provided by exchange
      amount: number // trade amount as provided by exchange
      side: 'buy' | 'sell' | 'unknown' // liquidity taker side (aggressor)
      timestamp: string // trade timestamp provided by exchange (ISO 8601 format)
      localTimestamp: string // message arrival timestamp (ISO 8601 format)
    }
• book_change
Initial L2 (market by price) order book snapshot (isSnapshot=true) plus incremental updates for each order book change. Please note that amount is the updated amount at that price level, not a delta. An amount of 0 indicates the price level can be removed.
    {
      type: 'book_change'
      symbol: string // instrument symbol as provided by exchange
      exchange: string // exchange id
      isSnapshot: boolean // if true marks initial order book snapshot
      bids: { price: number; amount: number }[] // updated bids price-amount levels
      asks: { price: number; amount: number }[] // updated asks price-amount levels
      timestamp: string // order book update timestamp if provided by exchange,
      // otherwise equals to localTimestamp, (ISO 8601 format)
      localTimestamp: string // message arrival timestamp (ISO 8601 format)
    }
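A minimal sketch of how a client might maintain a local order book from book_change messages, following the rules described above: a snapshot replaces the current state, amounts are absolute sizes, and an amount of 0 removes the level. The function name and the dict-based book layout are assumptions made for this example.

```python
def apply_book_change(book, message):
    """Apply a book_change message to a local order book in place.

    book has the shape {"bids": {price: amount}, "asks": {price: amount}}.
    """
    if message["isSnapshot"]:
        # a snapshot replaces whatever state was held before
        book["bids"].clear()
        book["asks"].clear()
    for side in ("bids", "asks"):
        for level in message[side]:
            if level["amount"] == 0:
                # amount 0 means the price level can be removed
                book[side].pop(level["price"], None)
            else:
                # amount is the new absolute size at this price, not a delta
                book[side][level["price"]] = level["amount"]
    return book


book = {"bids": {}, "asks": {}}
apply_book_change(book, {
    "isSnapshot": True,
    "bids": [{"price": 100.0, "amount": 5.0}, {"price": 99.5, "amount": 2.0}],
    "asks": [{"price": 100.5, "amount": 1.0}],
})
apply_book_change(book, {
    "isSnapshot": False,
    "bids": [{"price": 100.0, "amount": 0}, {"price": 99.5, "amount": 3.0}],
    "asks": [],
})
print(max(book["bids"]), min(book["asks"]))  # best bid and best ask
```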
• derivative_ticker
Derivative instrument ticker info sourced from real-time ticker & instrument channels.
    {
      type: 'derivative_ticker'
      symbol: string // instrument symbol as provided by exchange
      exchange: string // exchange id
      lastPrice: number | undefined // last instrument price if provided by exchange
      openInterest: number | undefined // last open interest if provided by exchange
      fundingRate: number | undefined // last funding rate if provided by exchange
      indexPrice: number | undefined // last index price if provided by exchange
      markPrice: number | undefined // last mark price if provided by exchange
      timestamp: string // message timestamp provided by exchange (ISO 8601 format)
      localTimestamp: string // message arrival timestamp (ISO 8601 format)
    }
• book_snapshot_{number_of_levels}_{snapshot_interval}{time_unit}
Order book snapshot for selected number_of_levels (top bids and asks), snapshot_interval and time_unit.
When snapshot_interval is set to 0, snapshots are taken anytime the order book state within the specified levels has changed; otherwise snapshots are taken anytime snapshot_interval time has passed and there was an order book state change within the specified levels. Order book snapshots are computed from exchanges' real-time order book streaming L2 data (market by price).
Examples:
book_snapshot_10_0ms - provides top 10 levels tick-by-tick order book snapshots
book_snapshot_50_100ms - provides top 50 levels order book snapshots taken at 100 millisecond intervals
book_snapshot_30_10s - provides top 30 levels order book snapshots taken at 10 second intervals
quote is an alias of book_snapshot_1_0ms - provides top of the book (best bid/ask) tick-by-tick order book snapshots
quote_10s is an alias of book_snapshot_1_10s - provides top of the book (best bid/ask) order book snapshots taken at 10 second intervals
    {
      type: 'book_snapshot'
      symbol: string // instrument symbol as provided by exchange
      exchange: string // exchange id
      name: string // name with format book_snapshot_{depth}_{interval}{time_unit}
      depth: number // requested number of levels (top bids/asks)
      interval: number // requested snapshot interval in milliseconds
      bids: { price: number; amount: number }[] // top "depth" bids price-amount levels
      asks: { price: number; amount: number }[] // top "depth" asks price-amount levels
      timestamp: string // snapshot timestamp based on last book_change message
      // processed timestamp adjusted to snapshot interval
      localTimestamp: string // message arrival timestamp
      // that triggered snapshot (ISO 8601 format)
    }
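The naming scheme can be decoded mechanically. The sketch below maps a data type name to its depth and interval in milliseconds, including the quote aliases. It is an illustrative parser, not tardis-machine's own; it assumes only the ms and s units that appear in the examples above, and `parse_book_snapshot_name` is a hypothetical name.

```python
import re

# Assumption: only the time units shown in the examples above are handled.
_UNIT_TO_MS = {"ms": 1, "s": 1000}
_BOOK_SNAPSHOT = re.compile(r"^book_snapshot_(\d+)_(\d+)(ms|s)$")
_QUOTE = re.compile(r"^quote(?:_(\d+)(ms|s))?$")


def parse_book_snapshot_name(name):
    """Return (depth, interval_ms) for a book_snapshot or quote data type."""
    match = _BOOK_SNAPSHOT.match(name)
    if match:
        depth, interval, unit = match.groups()
        return int(depth), int(interval) * _UNIT_TO_MS[unit]
    match = _QUOTE.match(name)
    if match:
        interval, unit = match.groups()
        if interval is None:
            # plain quote is an alias of book_snapshot_1_0ms
            return 1, 0
        return 1, int(interval) * _UNIT_TO_MS[unit]
    raise ValueError(f"not a book snapshot data type: {name}")


for name in ("book_snapshot_10_0ms", "book_snapshot_30_10s", "quote_10s"):
    print(name, parse_book_snapshot_name(name))
```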
• trade_bar_{aggregation_interval}{suffix}
Trades data in aggregated form, known as OHLC, candlesticks, klines etc. In addition to the most common time-based aggregation, volume-based and tick-count-based aggregation are supported as well. Bars are computed from tick-by-tick raw trade data; if no trades happened in a given interval, no bar is produced.
Examples:
trade_bar_10ms - provides time based trade bars with 10 milliseconds intervals
trade_bar_5m - provides time based trade bars with 5 minute intervals
trade_bar_100ticks - provides ticks based trade bars with 100 ticks (individual trades) intervals
trade_bar_100000vol - provides volume based trade bars with 100 000 volume intervals
    {
      type: 'trade_bar'
      symbol: string // instrument symbol as provided by exchange
      exchange: string // exchange id
      name: string // name with format trade_bar_{interval}
      interval: number // requested trade bar interval
      kind: 'time' | 'volume' | 'tick' // trade bar kind
      open: number // open price
      high: number // high price
      low: number // low price
      close: number // close price
      volume: number // total volume traded in given interval
      buyVolume: number // buy volume traded in given interval
      sellVolume: number // sell volume traded in given interval
      trades: number // trades count in given interval
      vwap: number // volume weighted average price
      openTimestamp: string // timestamp of first trade for given bar (ISO 8601 format)
      closeTimestamp: string // timestamp of last trade for given bar (ISO 8601 format)
      timestamp: string // end of interval period timestamp (ISO 8601 format)
      localTimestamp: string // message arrival timestamp
      // that triggered given bar computation (ISO 8601 format)
    }
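To show how the price and volume fields relate to the underlying trades, the sketch below aggregates a list of normalized trade messages into those fields. It is an illustration under assumptions, not tardis-machine's implementation: the function name is hypothetical, the trades are assumed to be time-ordered and to belong to one interval, and trades with side 'unknown' are counted toward volume only.

```python
def build_trade_bar(trades):
    """Aggregate a non-empty, time-ordered list of normalized trade
    messages into the price and volume fields of a trade_bar message."""
    if not trades:
        raise ValueError("no trades in the interval, so no bar is produced")
    prices = [t["price"] for t in trades]
    volume = sum(t["amount"] for t in trades)
    notional = sum(t["price"] * t["amount"] for t in trades)
    return {
        "open": prices[0],
        "high": max(prices),
        "low": min(prices),
        "close": prices[-1],
        "volume": volume,
        "buyVolume": sum(t["amount"] for t in trades if t["side"] == "buy"),
        "sellVolume": sum(t["amount"] for t in trades if t["side"] == "sell"),
        "trades": len(trades),
        # volume weighted average price
        "vwap": notional / volume,
        "openTimestamp": trades[0]["timestamp"],
        "closeTimestamp": trades[-1]["timestamp"],
    }


# invented sample trades
sample_trades = [
    {"price": 100.0, "amount": 1.0, "side": "buy", "timestamp": "2019-10-01T00:00:00.100Z"},
    {"price": 102.0, "amount": 2.0, "side": "sell", "timestamp": "2019-10-01T00:00:00.200Z"},
    {"price": 99.0, "amount": 1.0, "side": "buy", "timestamp": "2019-10-01T00:00:00.300Z"},
]
print(build_trade_bar(sample_trades))
```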
• disconnect
Message that marks events when real-time WebSocket connection that was used to collect the historical data got disconnected.