ML repository delivering both scheduling via cron jobs and API endpoints for running ETL, training, prediction, evaluation, etc, of various end-to-end models. Developed with FastAPI, TensorFlow, PyTorch, AWS, MongoDB.

tahmid-saj/ml-repository

ML Repository

ML job scheduler that provides both cron-based scheduling and API endpoints for running ETL, training, prediction, evaluation, and related steps for various end-to-end models. Developed with FastAPI, TensorFlow, PyTorch, AWS, and MongoDB.



Directory structure

The directory structure is as follows:

ml-job-scheduler/
├── api/
│   ├── controllers/
│   │   ├── btc_forecast_controller.py
│   │   └── __pycache__/
│   │       └── btc_forecast_controller.cpython-312.pyc
│   ├── models/
│   │   ├── btc_forecast/
│   │   │   ├── btc_forecast_models.py
│   │   │   └── __pycache__/
│   │   │       └── btc_forecast_models.cpython-312.pyc
│   │   ├── sp500_forecast/
│   │   ├── text_analyzer/
│   │   └── text_summarizer/
│   └── services/
│       ├── mongodb.py
│       └── __pycache__/
│           └── mongodb.cpython-312.pyc
├── btc_forecast.py
├── conf/
│   ├── aws/
│   ├── cron_jobs/
│   │   └── cron_jobs_conf.yml
│   └── mongodb/
│       ├── mongodb_conf.py
│       └── __pycache__/
│           └── mongodb_conf.cpython-312.pyc
├── data/
│   └── food_classifier/
│       └── models/
│           ├── keras_metadata.pb
│           └── saved_model.pb
├── Dockerfile
├── food_classifier.py
├── index.py
├── monitoring/
├── notebooks/
│   └── btc_forecast.py
├── README.md
├── requirements.txt
├── scripts/
│   ├── data_ops/
│   │   └── text_summarizer_data_ops.py
│   ├── etl/
│   │   ├── btc_forecast/
│   │   │   ├── btc_forecast_multivariate_etl.py
│   │   │   └── btc_forecast_univariate_etl.py
│   │   ├── food_classifier/
│   │   │   └── food_classifier_etl.py
│   │   ├── sp500_forecast/
│   │   │   ├── sp500_forecast_multivariate_etl.py
│   │   │   └── sp500_forecast_univariate_etl.py
│   │   ├── text_analyzer/
│   │   │   └── text_analyzer_etl.py
│   │   └── text_summarizer/
│   │       ├── text_summarizer_etl.py
│   │       ├── text_summarizer_positional_embedding.py
│   │       └── text_summarizer_tribid_embedding.py
│   ├── evaluation/
│   ├── full_pipeline/
│   │   ├── btc_forecast_multivariate_2_weeks.py
│   │   ├── btc_forecast_multivariate_current_day.py
│   │   ├── sp500_forecast_multivariate_2_weeks.py
│   │   └── __pycache__/
│   │       ├── btc_forecast_multivariate_2_weeks.cpython-312.pyc
│   │       └── btc_forecast_multivariate_current_day.cpython-312.pyc
│   ├── postprocessing/
│   ├── prediction/
│   │   └── text_analyzer_ensemble_prediction.py
│   ├── training/
│   │   └── text_analyzer_ensemble_training.py
│   └── training_prediction/
│       ├── btc_forecast_multivariate_training_prediction.py
│       ├── food_classifier_efficientb0_training_prediction.py
│       ├── sp500_forecast_multivariate_training_prediction.py
│       ├── text_summarizer_prediction_evaluation.py
│       └── text_summarizer_test_prediction.py
├── sp500_forecast.py
├── src/
│   └── mls/
│       ├── assets/
│       │   ├── btc_forecast_assets.py
│       │   ├── food_classifier_assets.py
│       │   ├── sp500_forecast_assets.py
│       │   └── __pycache__/
│       │       └── btc_forecast_assets.cpython-312.pyc
│       ├── data_ops/
│       │   ├── btc_forecast_load_prices.py
│       │   ├── sp500_load_prices.py
│       │   └── __pycache__/
│       │       └── btc_forecast_load_prices.cpython-312.pyc
│       ├── etl/
│       │   ├── btc_forecast_etl.py
│       │   ├── sp500_forecast_etl.py
│       │   ├── text_summarizer_etl.py
│       │   └── __pycache__/
│       │       └── btc_forecast_etl.cpython-312.pyc
│       ├── evaluation/
│       │   ├── btc_forecast_evaluation.py
│       │   ├── food_classifier_evaluation.py
│       │   ├── sp500_forecast_evaluation.py
│       │   ├── text_analyzer_evaluation.py
│       │   ├── text_summarizer_evaluation.py
│       │   └── __pycache__/
│       │       └── btc_forecast_evaluation.cpython-312.pyc
│       ├── model/
│       │   ├── btc_forecast_ensemble_model.py
│       │   ├── food_classifier_efficientb0_model.py
│       │   ├── sp500_forecast_ensemble_model.py
│       │   ├── text_analyzer_ensemble_model.py
│       │   ├── text_summarizer_tribid_embedding_model.py
│       │   └── __pycache__/
│       │       └── btc_forecast_ensemble_model.cpython-312.pyc
│       ├── postprocessing/
│       ├── prediction/
│       │   ├── btc_forecast_prediction.py
│       │   ├── sp500_forecast_prediction.py
│       │   └── __pycache__/
│       │       └── btc_forecast_prediction.cpython-312.pyc
│       └── training/
├── tests/
├── text_analyzer.py
├── text_summarizer.py
├── utils/
│   ├── api-requests/
│   ├── constants/
│   │   ├── btc_forecast_constants.py
│   │   └── __pycache__/
│   │       └── btc_forecast_constants.cpython-312.pyc
│   ├── errors/
│   └── helpers/
├── vercel.json
└── vercel_dev.json


Overview

Design

Figure 1 shows how the service is used in other applications. Similar services are linked below:

Similar services

Figure 1: High-level view and usage in other applications

The ML job scheduler consists of the following models:

  1. BTC forecast: Bitcoin forecasting
  2. S&P 500: S&P 500 forecasting
  3. Food classifier: Food detection / classification
  4. Text analyzer: NLP text analysis
  5. Text summarizer: NLP text summarization

Running the jobs / models:

The jobs / models can be either triggered manually via API calls or scheduled via cron jobs. Each model has an entry-point script in the repository root:

  1. BTC forecast: btc_forecast.py
  2. S&P 500: sp500_forecast.py
  3. Food classifier: food_classifier.py
  4. Text analyzer: text_analyzer.py
  5. Text summarizer: text_summarizer.py
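
As a rough illustration of the two trigger styles, the sketch below maps each job name to its entry-point script (file names taken from the directory structure) and builds either a command for a manual run or a crontab line for a scheduled one. It assumes each script can be run directly with the Python interpreter, which this README does not confirm; `JOB_SCRIPTS`, `build_command` and `crontab_line` are hypothetical helpers and are not part of the repository.

```python
import shlex
import sys

# Entry-point scripts as they appear in the repository root of the directory structure.
JOB_SCRIPTS = {
    "btc_forecast": "btc_forecast.py",
    "sp500_forecast": "sp500_forecast.py",
    "food_classifier": "food_classifier.py",
    "text_analyzer": "text_analyzer.py",
    "text_summarizer": "text_summarizer.py",
}


def build_command(job_name, python=sys.executable):
    """Return the argv list that would run a job's entry-point script.

    Assumption: the script is directly runnable with the interpreter.
    """
    try:
        script = JOB_SCRIPTS[job_name]
    except KeyError:
        raise ValueError(f"unknown job: {job_name!r}") from None
    return [python, script]


def crontab_line(job_name, schedule, python="python3"):
    """Format a crontab entry from a standard 5-field cron expression."""
    if len(schedule.split()) != 5:
        raise ValueError("expected a 5-field cron expression")
    return f"{schedule} {shlex.join(build_command(job_name, python))}"
```

For a manual run, the list returned by `build_command("btc_forecast")` could be passed to `subprocess.run(..., check=True)` from the repository root. For scheduling, `crontab_line("btc_forecast", "0 6 * * *")` produces an entry that runs the script daily at 06:00; the schedules actually used by the project live in conf/cron_jobs/cron_jobs_conf.yml and are not shown here.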
