austinreynolds/analytics_demo

Bottom line up front

  • You probably haven't learned everything you need to from your business data.
  • Dashboards and SQL display results; they are not adequate analysis tools.
  • Machine learning isn't just for making predictions; you can use it for investigation.
  • "What am I missing?" is the most important question in data.
    • Investigators must over-measure, then distill with ML, in order to miss less.
  • Statistical testing (A/B/n) is important to assess an idea, but we need to generate better ideas.

Intro

This project uses a simple Linux GPU setup to model stock market data. It uses the Anaconda Python distribution and data from the Sharadar Core US Equities Bundle on Nasdaq Data Link. The environment variables NASDAQ_DATA_API_KEY and DATA_HOME must be set.
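As a minimal sketch of how the two expected environment variables could be checked before running anything, the snippet below fails early when either is unset and derives the $DATA_HOME/analytics_demo directory. The function name resolve_project_dir and the REQUIRED_VARS tuple are hypothetical helpers for illustration, not part of this repository.

```python
import os
from pathlib import Path

# The two variable names come from this README; everything else is illustrative.
REQUIRED_VARS = ("NASDAQ_DATA_API_KEY", "DATA_HOME")


def resolve_project_dir(env=None):
    """Return $DATA_HOME/analytics_demo, raising if a required variable is unset."""
    env = os.environ if env is None else env
    # Treat empty strings the same as missing values.
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return Path(env["DATA_HOME"]) / "analytics_demo"


if __name__ == "__main__":
    try:
        print(f"Data will be stored under {resolve_project_dir()}")
    except RuntimeError as err:
        print(err)
```

Passing a plain dictionary instead of relying on os.environ keeps the check easy to exercise in a notebook or a test.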

Project structure

The download.py script fetches the tables, stores them as Parquet files, then loads them into a DuckDB database file, all within $DATA_HOME/analytics_demo. The dbt project, located in dbt_sharadar_demo, must then be run to preprocess the data. Lineage graph:

dbt lineage graph
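The Parquet-to-DuckDB loading step described above might look roughly like the following. This is a sketch under stated assumptions, not the contents of download.py: the helper names build_load_statements and load_into_duckdb are hypothetical, and it assumes one Parquet file per table, with the lowercased file stem used as the table name. It also assumes the duckdb Python package is installed in the environment.

```python
from pathlib import Path


def build_load_statements(parquet_dir):
    """Return one CREATE TABLE statement per Parquet file found in parquet_dir."""
    statements = []
    for parquet_file in sorted(Path(parquet_dir).glob("*.parquet")):
        # Assumed naming convention: the lowercased file stem becomes the table name.
        table_name = parquet_file.stem.lower()
        # Note: paths containing a single quote would need escaping here.
        statements.append(
            f'CREATE OR REPLACE TABLE "{table_name}" AS '
            f"SELECT * FROM read_parquet('{parquet_file.as_posix()}')"
        )
    return statements


def load_into_duckdb(parquet_dir, database_path):
    """Execute the generated statements against a DuckDB database file."""
    import duckdb  # imported lazily so the statement builder works without it

    con = duckdb.connect(str(database_path))
    try:
        for statement in build_load_statements(parquet_dir):
            con.execute(statement)
    finally:
        con.close()
```

With DATA_HOME set, a call such as load_into_duckdb(project_dir, project_dir / "sharadar.duckdb") would target the same database path that the dbt profile in the Setup section points to.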

The notebooks:

Setup

  1. Create the Anaconda environment, then activate it.
conda env create -f environment.yml --solver=libmamba
conda activate analytics_demo
  2. Set up the dbt config, typically found at ~/.dbt/profiles.yml, to include the database file path.
dbt_sharadar_demo:
  outputs:
    prod:
      type: duckdb
      path: "{{ env_var('DATA_HOME') }}/analytics_demo/sharadar.duckdb"
      threads: 2
  target: prod
  3. Fetch the tables.
python3 download.py
  4. Move to the dbt folder and run dbt.
cd dbt_sharadar_demo
dbt run
cd ..
  5. Launch Jupyter to host the notebooks if you prefer this over an IDE.
jupyter lab

Tear down

To remove the Anaconda environment, run:

conda env remove --name analytics_demo

See the Anaconda documentation if you also wish to remove Anaconda itself.


Disclaimer: I am not an investment professional. None of my work within or related to this repository should be considered investment advice. It is not.
