Crypto data fetcher is a simple wrapper around the Cryptowatch REST API OHLCV endpoint, providing useful features such as:
- Fetching OHLCV data for various exchanges and pairs
- Recording fetched data in an HDF5 file
- Querying only missing data (according to existing h5 records)
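The "missing data only" idea can be sketched roughly as follows. This is an illustration and not the script's actual code: the helper name `build_ohlc_url` is hypothetical, and the endpoint path and the `periods` / `after` query parameters are assumptions about the Cryptowatch REST API that should be checked against its documentation.

```python
from urllib.parse import urlencode

API_ROOT = "https://api.cryptowat.ch"  # base URL mentioned in this README


def build_ohlc_url(exchange, pair, period_seconds, last_stored_time=None):
    """Build an OHLC request URL, optionally limited to data not yet stored.

    Assumption: the endpoint is /markets/<exchange>/<pair>/ohlc and accepts
    `periods` (in seconds) and `after` (Unix timestamp) query parameters.
    """
    params = {"periods": period_seconds}
    if last_stored_time is not None:
        # Skip what is already in the store: only request data after
        # the last stored timestamp.
        params["after"] = int(last_stored_time)
    return f"{API_ROOT}/markets/{exchange}/{pair}/ohlc?{urlencode(params)}"


url = build_ohlc_url("kraken", "btceur", 60, last_stored_time=1600000000)
print(url)
```

When the store is empty, `last_stored_time` is left as `None` and the request is not restricted.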
I personally use this on my Raspi to get 1-minute Kraken OHLCV data for the BTC/EUR pair, as I couldn't find any free dataset available. You can set up a cron job to poll the data. Cryptowatch limits fetched data to a history of 6000 ticks, whatever the period (about 4 days back for 1m data), so keep that in mind when you're choosing your polling interval.
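To pick a safe polling interval, it helps to know how far back 6000 ticks reach for a given period. The sketch below works that out; the function names are made up for illustration, and the only figure taken from above is the 6000-tick limit.

```python
from datetime import timedelta

MAX_TICKS = 6000  # history limit stated above

# Seconds per unit for period strings such as "1m", "4h", "1d", "1w".
UNIT_SECONDS = {"m": 60, "h": 3600, "d": 86400, "w": 604800}


def period_to_seconds(period):
    """Convert a period string like "15m" into a number of seconds."""
    count, unit = int(period[:-1]), period[-1]
    return count * UNIT_SECONDS[unit]


def history_window(period):
    """Return how far back MAX_TICKS ticks reach for the given period."""
    return timedelta(seconds=MAX_TICKS * period_to_seconds(period))


for period in ("1m", "5m", "1h"):
    print(period, history_window(period))
```

For 1m data the window is 6000 minutes, i.e. 4 days and 4 hours, so polling has to happen more often than that to avoid gaps.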
Install required packages:
pip3 install schedule requests numpy pandas h5py --user
Usage:
crypto-data-fetcher.py [flags]
-h, --help show usage and exit
-f, --filepath string file path in which to store the fetched data, or on which to complete existing data (default ./store.h5)
-e, --exchange string exchange symbol on which to fetch the data (exchange listing https://api.cryptowat.ch/exchanges)
-s, --symbol string pair symbol for which to fetch the data (find pair listing for every exchange here https://cryptowat.ch/exchanges). Ex: btceur or etheur or zecbtc etc.
-p, --period string time period of the data, must be one of: "1m", "3m", "5m", "15m", "30m", "1h", "2h", "4h", "6h", "12h", "1d", "3d", "1w"
-l, --logfile string filepath of the logfile in which to write the logs (if none is provided, no log file will be created)
Usage example: fetch 1-minute BTCEUR data from kraken
./crypto-data-fetcher.py -f store.h5 -e kraken -s btceur -p 1m
Fetched data is stored in an h5 file as a chunked table using gzip compression (h5py's default compression level is 4; you can easily change the code to set a higher level if you wish). To read and edit data, you can use a GUI tool such as ViTables (version 3.x advised). If you use the data fetcher several times with the same .h5 file as storage, all the data will be stored in that file. Data is organized in nested groups: exchange / pair / period
Example: running
./crypto-data-fetcher.py -f store.h5 -e kraken -s btceur -p 1m
then
./crypto-data-fetcher.py -f store.h5 -e coinbase -s ethusd -p 3m
will use the same store.h5 file for both runs, but will store the datasets in /kraken/btceur/1m and /coinbase/ethusd/3m respectively.
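Reading a dataset back from Python could look roughly like this. It is only a sketch: `dataset_path` and `read_rows` are hypothetical helpers, and the code assumes nothing about the dataset beyond the exchange / pair / period layout described above.

```python
def dataset_path(exchange, pair, period):
    """Return the HDF5 path of a dataset, e.g. /kraken/btceur/1m."""
    return f"/{exchange}/{pair}/{period}"


def read_rows(filepath, exchange, pair, period):
    """Load one whole dataset from the store into a NumPy array."""
    import h5py  # imported here so the path helper has no dependency

    with h5py.File(filepath, "r") as store:
        # [:] copies the full dataset into memory before the file closes.
        return store[dataset_path(exchange, pair, period)][:]


print(dataset_path("kraken", "btceur", "1m"))
```

A call such as `read_rows("store.h5", "kraken", "btceur", "1m")` would then return the rows stored by the first command above.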
Columns are always in this order: ['time', 'open', 'high', 'low', 'close', 'volume_base', 'volume_quote'] (where 'volume_base' and 'volume_quote' refer to the volume in the base asset currency and the volume in the quote asset currency, respectively).
One important thing to know is that the Cryptowatch API timestamps each data point at the end of its period, which can differ from other datasets. For example, a data point at minute 00:01 from the Kraken API will be at 00:02 in the Cryptowatch API.
All data is recorded with UTC time.
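If you need to line the data up with a source that stamps each candle at the start of its period, subtracting one period from the timestamp is enough. The snippet below is a small sketch of that shift; it assumes the 'time' column holds Unix timestamps in seconds, and both helper names are made up for illustration.

```python
from datetime import datetime, timezone


def to_open_time(close_time, period_seconds):
    """Shift an end-of-period timestamp to the start of that period."""
    return close_time - period_seconds


def to_utc(timestamp):
    """Interpret a Unix timestamp (assumed to be in seconds) as UTC."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc)


close_time = 120  # 00:02:00 UTC on 1970-01-01, end of a 1-minute candle
open_time = to_open_time(close_time, 60)
print(to_utc(close_time).time(), "->", to_utc(open_time).time())
```

Passing `tz=timezone.utc` explicitly avoids the timestamps being rendered in the machine's local time zone.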
The Cryptowatch API is rate-limited to 8 seconds of CPU time per hour. This should be more than enough to fetch data even every minute, but keep that in mind if you ever spam their API. You can query https://api.cryptowat.ch to get your remaining CPU time (in nanoseconds).
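Converting the remaining allowance to seconds makes it easier to compare with the 8-second budget. The sketch below assumes the decoded JSON response carries the value under `allowance` / `remaining`; that field layout is an assumption, and the sample payload is hand-written, not real API output.

```python
def remaining_cpu_seconds(payload):
    """Extract the remaining CPU allowance from a decoded JSON response.

    Assumption: the response looks like {"allowance": {"remaining": <ns>}}.
    """
    remaining_ns = payload["allowance"]["remaining"]
    return remaining_ns / 1e9  # nanoseconds -> seconds


# Hypothetical payload, written by hand for illustration only.
sample = {"result": {}, "allowance": {"cost": 0, "remaining": 7500000000}}
print(remaining_cpu_seconds(sample))
```

With the requests package, `payload` would typically come from `requests.get("https://api.cryptowat.ch").json()`.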
If you set up several crontab jobs to fetch into the same store at the same time (say, every hour), you will probably notice that opening the file often fails, because of concurrent writing/reading of the store file. It seems possible to handle that smoothly (see http://docs.h5py.org/en/stable/mpi.html) but I haven't looked into that yet, so that is left for further improvements. For now, you have to shift the cron jobs so that they don't run at the same time, or simply use different store files.
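Another possible workaround, different from the MPI approach linked above and not something the script currently does, is to serialize access with an advisory lock file. The sketch below uses `fcntl.flock`, so it is Unix only, and it only helps if every job goes through the same (hypothetical) `store_lock` wrapper.

```python
import fcntl
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def store_lock(store_path):
    """Hold an exclusive lock on <store_path>.lock while the block runs."""
    with open(store_path + ".lock", "w") as lock_file:
        # Blocks here until any other job holding the lock releases it.
        fcntl.flock(lock_file, fcntl.LOCK_EX)
        try:
            yield
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)


# Demo path in the temp directory; a real job would pass its own store file.
demo_store = os.path.join(tempfile.gettempdir(), "store.h5")
with store_lock(demo_store):
    # Open, read and write the HDF5 store here.
    print("store locked")
```

Jobs that start at the same minute then simply wait for each other instead of failing to open the file.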