Quandelabra

A tiny tool for quickly downloading large (and free) datasets from Quandl.

Since Quandl limits the ability to download an entire dataset at once to premium offerings, this tool was made to instead download the entire thing in multiple pieces at the same time.
"But wait!" you say, "Doesn't the Quandl Python API allow you to do that already?".
Well, yes, it does.
However, it is painfully slow while doing it and it is not multithreaded by default. Adding threading or even multiprocessing afterwards is tedious and still suffers from the massive overhead the Quandl Python package incurs on every request.

Quandelabra solves this problem by not giving a shit about official packages and requesting data directly via the REST API, circumventing the CPU-monster that is the Quandl-provided API wrapper. Additionally, it makes use of asyncio and aiohttp to deliver buzzword-worthy, blazing-fast speeds by issuing dozens of requests simultaneously. And all this without eating any of your RAM or CPU for breakfast! Everything stays nice, cool and under control, as long as your network can sustain the load, that is.
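To give a rough idea of what "dozens of requests simultaneously" looks like with asyncio, here is a minimal sketch of a worker pool that keeps at most N requests in flight. It is not the actual code from quandelabra.py: the names fetch_piece, worker and download_all are made up for illustration, and fetch_piece merely sleeps where a real version would await an aiohttp request.

```python
import asyncio


async def fetch_piece(code: str) -> str:
    # Hypothetical stand-in for one HTTP request. A real version would
    # await an aiohttp call here instead of sleeping.
    await asyncio.sleep(0.01)
    return f"data for {code}"


async def worker(queue: asyncio.Queue, results: dict) -> None:
    # Each worker keeps pulling codes until the queue is drained.
    while True:
        try:
            code = queue.get_nowait()
        except asyncio.QueueEmpty:
            return
        results[code] = await fetch_piece(code)


async def download_all(codes: list[str], tasks: int) -> dict:
    queue: asyncio.Queue = asyncio.Queue()
    for code in codes:
        queue.put_nowait(code)
    results: dict = {}
    # Spawning `tasks` workers means at most `tasks` requests run at once.
    await asyncio.gather(*(worker(queue, results) for _ in range(tasks)))
    return results


if __name__ == "__main__":
    pieces = asyncio.run(download_all([f"PIECE_{i}" for i in range(20)], tasks=4))
    print(len(pieces))
```

Because everything runs on a single event loop, raising the number of workers costs next to nothing in CPU or RAM; the network and the API's rate limits are what cap the useful value.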

Disclaimer

As with most everything that I put out, these are personal projects and there's no guarantee that they will either:

  • Always work
  • Be maintained forever
  • Do what you want

If this works for you, great!
If you've got improvements, even better, send me a pull request!
If it doesn't work, let me know and I'll try my best to ignore it. Just kidding, I gotta keep my GitHub cred high.

Rate Limits

Please be aware of the rate limits imposed by the API itself.
Free users may only be able to run 3 to 4 requests concurrently, while users with at least one Premium subscription can get away with 50 to 100 requests, depending on their network speed. If the dataset you're downloading is free, the reduced rate limits will still apply.
This tool can also be useful for premium users, since the API only allows 10 bulk downloads of a dataset per hour. If you need more than that, you can use this tool to fetch the dataset piece by piece instead.

Installation

Quandelabra is a simple Python script, so the installation procedure is a very complex and error-prone process which works as follows:

  1. Download: git clone git@github.com:timwedde/quandelabra.git
  2. Execute: cd quandelabra/ && python3 quandelabra.py
  3. ???: python3 quandelabra.py -h
  4. Profit!

Usage

usage: quandelabra.py [-h] -d quandl_code -a key -o dir [-t N]

A tiny tool for quickly downloading large (and free) datasets from Quandl.

optional arguments:
  -h, --help            show this help message and exit
  -d quandl_code, --dataset quandl_code
                        (required) The Quandl Code for the dataset to download
  -a key, --api_key key
                        (required) Your Quandl API key
  -o dir, --output dir  (required) The directory to output data to
  -t N, --tasks N       The amount of tasks to spawn (default: 75)
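
For the curious, a help message like the one above can be produced with a plain argparse setup. The following is a sketch reconstructed from the help text only, not necessarily how quandelabra.py defines its arguments; build_parser is a hypothetical name and SOME_CODE, MY_KEY and out are placeholder values.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical reconstruction based purely on the help text above.
    parser = argparse.ArgumentParser(
        prog="quandelabra.py",
        description="A tiny tool for quickly downloading large "
                    "(and free) datasets from Quandl.",
    )
    parser.add_argument("-d", "--dataset", metavar="quandl_code", required=True,
                        help="(required) The Quandl Code for the dataset to download")
    parser.add_argument("-a", "--api_key", metavar="key", required=True,
                        help="(required) Your Quandl API key")
    parser.add_argument("-o", "--output", metavar="dir", required=True,
                        help="(required) The directory to output data to")
    parser.add_argument("-t", "--tasks", metavar="N", type=int, default=75,
                        help="The amount of tasks to spawn (default: %(default)s)")
    return parser


if __name__ == "__main__":
    # Placeholder values; a free user would likely want a low task count.
    args = build_parser().parse_args(
        ["-d", "SOME_CODE", "-a", "MY_KEY", "-o", "out", "-t", "4"]
    )
    print(args.dataset, args.output, args.tasks)
```

Given the rate limits described earlier, free users probably want to pass something like -t 4 rather than rely on the default of 75.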

Credits

  • Quandl, for providing amazing datasets for free and even more amazing datasets for little money.
  • Candelabra icon made by Freepik and procured from www.flaticon.com, licensed under CC 3.0 BY