Tardis Machine Server
Locally runnable server with built-in data caching, providing both tick-level historical and consolidated real-time cryptocurrency market data via HTTP and WebSocket APIs
Tardis-machine is a locally runnable server with built-in data caching that uses the Tardis.dev HTTP API under the hood. It provides both tick-level historical and consolidated real-time cryptocurrency market data via its HTTP and WebSocket APIs and is available via npm and Docker.
- efficient data replay API endpoints returning historical market data for whole time periods (in contrast to the Tardis.dev HTTP API, where a single call returns data for a single one-minute time period)
- WebSocket API providing historical market data replay from any given past point in time with the same data format and 'subscribe' logic as real-time exchanges' APIs - in many cases existing exchanges' WebSocket clients can be used to connect to this endpoint
- consistent format for accessing market data across multiple exchanges
- transparent historical local data caching (cached data is stored on disk in compressed GZIP format and decompressed on demand when reading the data)
- support for top cryptocurrency exchanges: BitMEX, Deribit, Binance, Binance Futures, FTX, OKEx, Huobi Global, Huobi DM, bitFlyer, Bitstamp, Coinbase Pro, Kraken Futures, Gemini, Kraken, Bitfinex, Bybit, OKCoin, CoinFLEX and more
### running without persistent local cache
docker run -p 8000:8000 -p 8001:8001 -e "TM_API_KEY=YOUR_API_KEY" -d tardisdev/tardis-machine
Tardis-machine server's HTTP API endpoints will be available on port 8000 and WebSocket API endpoints on port 8001. Your API key is passed via the TM_API_KEY environment variable; simply replace YOUR_API_KEY with the API key you've received via email.

The command above does not use a persistent volume for local caching (each Docker restart will result in losing the local data cache). In order to use, for example, ./host-cache-dir as a persistent volume (bind mount) cache directory, run:

docker run -v ./host-cache-dir:/.cache -p 8000:8000 -p 8001:8001 -e "TM_API_KEY=YOUR_API_KEY" -d tardisdev/tardis-machine
Since using volumes can cause issues, especially on Windows, it's fine to run the Docker image without them, with the caveat of a potentially poor local cache hit ratio after each container restart.
You can set the following environment variables to configure the tardis-machine server:
| name | default | description |
| --- | --- | --- |
| TM_API_KEY | | API key for Tardis.dev HTTP API; if not provided, only the first day of each month of historical data is accessible |
| TM_PORT | 8000 | HTTP port on which the server will be running; the WebSocket port is always this value + 1 (8001 with port set to 8000) |
| TM_CACHE_DIR | /.cache | path to the local dir that will be used as the cache location |
| TM_CLUSTER_MODE | false | launches a cluster of Node.js processes to handle incoming requests if set to true; by default the server runs in single-process mode |
| TM_DEBUG | false | server will print verbose debug logs to stdout if set to true |
| TM_CLEAR_CACHE | false | server will clear the local cache dir on startup if set to true |
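For example, to run the server on a different port with debug logging enabled and a bind-mounted cache directory (the port and path values below are illustrative), you could start the container like this:

docker run -v ./host-cache-dir:/.cache -p 8010:8010 -p 8011:8011 -e "TM_API_KEY=YOUR_API_KEY" -e "TM_PORT=8010" -e "TM_DEBUG=true" -d tardisdev/tardis-machine

With TM_PORT set to 8010, the WebSocket API is served on port 8011 (port + 1), hence both ports are published.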
Requires Node.js v12+ and git installed.
Install and run the tardis-machine server via the npx command:

npx tardis-machine --api-key=YOUR_API_KEY

or install it globally via npm:

npm install -g tardis-machine

and then run:

tardis-machine --api-key=YOUR_API_KEY
Tardis-machine server's HTTP API endpoints will be available on port 8000 and WebSocket API endpoints on port 8001. Your API key is passed via the --api-key config flag; simply replace YOUR_API_KEY with the API key you've received via email.

You can also configure the tardis-machine server via environment variables, as described in the Docker section.
You can set the following CLI config flags when starting the tardis-machine server installed via npm:

| name | default | description |
| --- | --- | --- |
| --api-key | | API key for Tardis.dev HTTP API; if not provided, only the first day of each month of historical data is accessible |
| --port | 8000 | HTTP port on which the server will be running; the WebSocket port is always this value + 1 (8001 with port set to 8000) |
| --cache-dir | <os.tmpdir>/.tardis-cache | path to the local dir that will be used as the cache location; if not provided, the default temp dir for the given OS will be used |
| --cluster-mode | false | launches a cluster of Node.js processes to handle incoming requests if set to true; by default the server runs in single-process mode |
| --debug | false | server will print verbose debug logs to stdout if set to true |
| --clear-cache | false | server will clear the local cache dir on startup if set to true |
| --help | | shows CLI help |
| --version | | shows tardis-machine version number |
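For example, a locally installed server could be started with a custom port, cache directory, and verbose logging (the values below are illustrative):

tardis-machine --api-key=YOUR_API_KEY --port=8010 --cache-dir=./tardis-cache --debug=true

With --port=8010, the WebSocket API endpoints are served on port 8011.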
Exchange-native market data API endpoints provide historical data in exchange-native format. The main difference between the HTTP and WebSocket endpoints is the logic of requesting data:

- HTTP API accepts replay options (exchange, date range, filters) provided as a URL-encoded JSON object in the options query string param
- WebSocket API accepts exchanges' specific 'subscribe' messages that define what data will then be replayed and sent to the WebSocket client
The HTTP GET /replay endpoint returns historical market data messages in exchange-native format for the given replay options query string param. A single streaming HTTP response returns data for the whole requested time period as NDJSON.
In our preliminary benchmarks on an AMD Ryzen 7 3700X with 64GB RAM, the HTTP /replay API endpoint returned ~700 000 messages/s (for already locally cached data).
Python
```python
import asyncio
import aiohttp
import json
import urllib.parse


async def replay_via_tardis_machine(replay_options):
    timeout = aiohttp.ClientTimeout(total=0)

    async with aiohttp.ClientSession(timeout=timeout) as session:
        # url encode replay options as a JSON object
        encoded_options = urllib.parse.quote_plus(json.dumps(replay_options))

        # assumes tardis-machine HTTP API running on localhost:8000
        url = f"http://localhost:8000/replay?options={encoded_options}"

        async with session.get(url) as response:
            # otherwise we may get 'line too long' errors
            response.content._high_water = 100_000_000

            # returned data is in NDJSON format http://ndjson.org/
            # each line is a separate JSON-encoded message
            async for line in response.content:
                yield line


async def run():
    lines = replay_via_tardis_machine(
        {
            "exchange": "bitmex",
            "from": "2019-10-01",
            "to": "2019-10-02",
            "filters": [
                {"channel": "trade", "symbols": ["XBTUSD", "ETHUSD"]},
                {"channel": "orderBookL2", "symbols": ["XBTUSD", "ETHUSD"]},
            ],
        }
    )

    async for line in lines:
        message = json.loads(line)
        # localTimestamp string marks the timestamp when the message was received
        # message is a dict exactly as provided by the exchange real-time stream
        print(message["localTimestamp"], message["message"])


asyncio.run(run())
```
Node.js

```js
const fetch = require('node-fetch')
const split2 = require('split2')

const serialize = options => {
  return encodeURIComponent(JSON.stringify(options))
}

async function* replayViaTardisMachine(options) {
  // assumes tardis-machine HTTP API running on localhost:8000
  const url = `http://localhost:8000/replay?options=${serialize(options)}`
  const response = await fetch(url)

  // returned data is in NDJSON format http://ndjson.org/
  // each line is a separate JSON-encoded message
  // split response body stream by new lines
  const lines = response.body.pipe(split2())

  for await (const line of lines) {
    yield line
  }
}

async function run() {
  const options = {
    exchange: 'bitmex',
    from: '2019-10-01',
    to: '2019-10-02',
    filters: [
      {
        channel: 'trade',
        symbols: ['XBTUSD', 'ETHUSD']
      },
      {
        channel: 'orderBookL2',
        symbols: ['XBTUSD', 'ETHUSD']
      }
    ]
  }

  const lines = replayViaTardisMachine(options)

  for await (const line of lines) {
    // localTimestamp string marks the timestamp when the message was received
    // message is an object exactly as provided by the exchange real-time stream
    const { message, localTimestamp } = JSON.parse(line)
    console.log(message, localTimestamp)
  }
}

run()
```
cURL

curl -g 'localhost:8000/replay?options={"exchange":"bitmex","filters":[{"channel":"orderBookL2","symbols":["XBTUSD","ETHUSD"]}],"from":"2019-07-01","to":"2019-07-02"}'

Click the link below to see the API response in the browser, as long as tardis-machine is running on localhost:8000:

http://localhost:8000/replay?options={%22exchange%22:%22bitmex%22,%22filters%22:[{%22channel%22:%22orderBookL2%22,%22symbols%22:[%22XBTUSD%22,%22ETHUSD%22]}],%22from%22:%222019-07-01%22,%22to%22:%222019-07-02%22}
We're working on providing more samples and dedicated client libraries in different languages, but in the meantime, to consume HTTP /replay API responses in your language of choice, you should:

1. Request the /replay endpoint with replay options provided as a URL-encoded JSON object in the options query string param
2. Parse the HTTP response stream line by line as it's returned; buffering the whole response in memory may result in slow performance and memory overflows
3. JSON-decode each line to get the localTimestamp and message fields of each market data message
Replay options (passed as a URL-encoded JSON object via the options query string param):

| name | type | default | description |
| --- | --- | --- | --- |
| exchange | string | - | requested exchange id, e.g., bitmex |
| filters | {channel:string, symbols?: string[]}[] | [] | optional filters of the requested historical data feed; check historical data details for each exchange and the /exchanges/:exchange HTTP API to get allowed channels and symbols for the requested exchange |
| from | string | - | replay period start date, e.g., 2019-10-01 |
| to | string | - | replay period end date, e.g., 2019-10-02 |
| withDisconnects | boolean \| undefined | undefined | when set to true, the response includes empty lines (\n) that mark events when the real-time WebSocket connection that was used to collect the historical data got disconnected |
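As a minimal sketch of how these options map to a /replay request URL (including the optional withDisconnects flag), assuming the server is running on localhost:8000:

```python
import json
import urllib.parse

# illustrative replay options; channels and symbols must be valid for the exchange
# (the /exchanges/:exchange endpoint mentioned above lists allowed values)
replay_options = {
    "exchange": "bitmex",
    "from": "2019-10-01",
    "to": "2019-10-02",
    "filters": [{"channel": "trade", "symbols": ["XBTUSD"]}],
    "withDisconnects": True,
}

# options are passed as a single URL-encoded JSON object in the 'options' query string param
url = "http://localhost:8000/replay?options=" + urllib.parse.quote_plus(json.dumps(replay_options))
print(url)
```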
The streamed HTTP response provides data in NDJSON format (newline-delimited JSON): each response line is a JSON object with a market data message in exchange-native format plus a local timestamp:

- localTimestamp - date when the message has been received, in ISO 8601 format
- message - JSON with exactly the same format as provided by the requested exchange's real-time feeds
Sample response:

```
{"localTimestamp":"2019-05-01T00:09:42.2760012Z","message":{"table":"orderBookL2","action":"update","data":[{"symbol":"XBTUSD","id":8799473750,"side":"Buy","size":2333935}]}}
{"localTimestamp":"2019-05-01T00:09:42.2932826Z","message":{"table":"orderBookL2","action":"update","data":[{"symbol":"XBTUSD","id":8799474250,"side":"Buy","size":227485}]}}
{"localTimestamp":"2019-05-01T00:09:42.4249304Z","message":{"table":"trade","action":"insert","data":[{"timestamp":"2019-05-01T00:09:42.407Z","symbol":"XBTUSD","side":"Buy","size":1500,"price":5263,"tickDirection":"ZeroPlusTick","trdMatchID":"29d7de7f-27b6-9574-48d1-3ee9874831cc","grossValue":28501500,"homeNotional":0.285015,"foreignNotional":1500}]}}
{"localTimestamp":"2019-05-01T00:09:42.4249403Z","message":{"table":"orderBookL2","action":"update","data":[{"symbol":"XBTUSD","id":8799473700,"side":"Sell","size":454261}]}}
{"localTimestamp":"2019-05-01T00:09:42.4583155Z","message":{"table":"orderBookL2","action":"update","data":[{"symbol":"XBTUSD","id":8799473750,"side":"Buy","size":2333838},{"symbol":"XBTUSD","id":8799473800,"side":"Buy","size":547746}]}}
```
Exchanges' WebSocket APIs are designed to publish real-time market data feeds, not historical ones. The Tardis-machine WebSocket /ws-replay API fills that gap and allows "replaying" historical market data from any given past point in time with the same data format and 'subscribe' logic as real-time exchanges' APIs. In many cases existing exchanges' WebSocket clients can be used to connect to this endpoint just by changing the URL, and will receive historical market data in exchange-native format for the date range specified in URL query string params.
After the connection is established, the client has 2 seconds to send its subscription payloads, and then the market data replay starts.
If two clients connect at the same time requesting data for different exchanges and provide the same session key via a query string param, the data sent to those clients will be synchronized (by local timestamp).
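A minimal sketch of such a synchronized replay across two exchanges, assuming the session key is passed as a session query string param (the exchange-specific subscribe payloads shown are illustrative):

```python
import asyncio
import aiohttp
import json

# both connections share the same session value, so messages sent to them
# are synchronized by local timestamp
BITMEX_URL = "ws://localhost:8001/ws-replay?exchange=bitmex&from=2019-10-01&to=2019-10-02&session=my-session"
DERIBIT_URL = "ws://localhost:8001/ws-replay?exchange=deribit&from=2019-10-01&to=2019-10-02&session=my-session"

async def consume(url, subscribe_payload):
    async with aiohttp.ClientSession() as http_session:
        async with http_session.ws_connect(url) as websocket:
            # subscriptions have to be sent within 2 seconds of connecting
            await websocket.send_str(json.dumps(subscribe_payload))
            async for msg in websocket:
                print(msg.data)

async def run():
    # exchange-specific subscribe messages, formats shown for illustration only
    await asyncio.gather(
        consume(BITMEX_URL, {"op": "subscribe", "args": ["trade:XBTUSD"]}),
        consume(DERIBIT_URL, {"method": "public/subscribe", "params": {"channels": ["trades.BTC-PERPETUAL.raw"]}}),
    )

asyncio.run(run())
```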
In our preliminary benchmarks on an AMD Ryzen 7 3700X with 64GB RAM, the WebSocket /ws-replay API endpoint was sending ~500 000 messages/s (for already locally cached data).
Python
```python
import asyncio
import aiohttp
import json


async def run():
    WS_REPLAY_URL = "ws://localhost:8001/ws-replay"
    URL = f"{WS_REPLAY_URL}?exchange=bitmex&from=2019-10-01&to=2019-10-02"

    async with aiohttp.ClientSession() as session:
        async with session.ws_connect(URL) as websocket:
            # send the subscribe message right after connecting,
            # using the same payload format as the BitMEX real-time API
            await websocket.send_str(
                json.dumps(
                    {
                        "op": "subscribe",
                        "args": [
                            "trade:XBTUSD",
                            "trade:ETHUSD",
                            "orderBookL2:XBTUSD",
                            "orderBookL2:ETHUSD",
                        ],
                    }
                )
            )
```