Kaggle is a platform for building machine learning notebooks. Some people also use it to crawl data from websites, scheduling the notebook to run repeatedly (e.g. daily or weekly).
However, the Kaggle API can only download the latest output of a notebook, not the output of a specific version. If a notebook is scheduled to run daily, downloading each output manually can take considerable effort. For that reason, this repository provides a tool that automatically fetches all versions of a Kaggle kernel (notebook).
First, install the required libraries:
pip install -r src/requirements.txt
Then run the script as follows:
python src/kaggle-downloader.py \
-u <username> \
-e <email> \
-p <password> \
-n <notebook>
Alternatively, you can provide the user's credentials in a file named credential.json
with the following format:
{
"username": "<username>",
"password": "<password>",
"email": "<email>"
}
Then pass the path to that file when running the script:
python src/kaggle-downloader.py \
-c <credential_path> \
-n <notebook>
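The two ways of supplying credentials shown above can be combined in a single command-line parser. The following is a minimal sketch of how that might look, not the repository's actual implementation: the function names build_parser and resolve_credentials are hypothetical, and only the flags (-u, -e, -p, -c, -n) and the credential.json keys are taken from this README. It assumes the file, when given, takes precedence over the individual flags.

```python
import argparse
import json

# Keys expected in credential.json, as documented in this README.
REQUIRED_KEYS = ("username", "email", "password")


def build_parser():
    """Build a parser accepting the flags shown in this README (sketch only)."""
    parser = argparse.ArgumentParser(description="Illustrative CLI sketch")
    parser.add_argument("-u", dest="username", help="Kaggle username")
    parser.add_argument("-e", dest="email", help="account email")
    parser.add_argument("-p", dest="password", help="account password")
    parser.add_argument("-c", dest="credential_path", help="path to credential.json")
    parser.add_argument("-n", dest="notebook", required=True, help="notebook name")
    return parser


def resolve_credentials(args):
    """Return a dict of credentials from the file if given, else from the flags."""
    if args.credential_path:
        # Assumption: the credential file takes precedence over -u/-e/-p.
        with open(args.credential_path, encoding="utf-8") as f:
            creds = json.load(f)
    else:
        creds = {
            "username": args.username,
            "email": args.email,
            "password": args.password,
        }
    # Fail early with a clear message if any field is absent or empty.
    missing = [key for key in REQUIRED_KEYS if not creds.get(key)]
    if missing:
        raise ValueError("missing credential fields: " + ", ".join(missing))
    return {key: creds[key] for key in REQUIRED_KEYS}


if __name__ == "__main__":
    cli_args = build_parser().parse_args()
    credentials = resolve_credentials(cli_args)
    print("Would fetch all versions of", cli_args.notebook, "as", credentials["username"])
```

Keeping the password in a file rather than on the command line also avoids leaving it in the shell history.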
This repository is owned by phuc16102001.
You are welcome to open a pull request, but please discuss major changes with me first.