A serverless backend that uses a LightGBM model deployed on AWS to return and update predictions for guests in our database. Case workers use these predictions to prioritize the most vulnerable guests, allocate their resources more effectively, and reduce the number of homeless families in Spokane.
Description of current directories:
- remove_predictions_from_exited_guests: single function
  - Checks if the predicted_exit_destination column exists in the guests table and creates it if necessary
  - Updates the predicted_exit_destination column to account for guests that have recently exited from the shelter
- add_predictions_to_non_exited_guests: multiple functions
  - add_predictions_step_1_wrangle_new_data: performs a query to retrieve new guest data, wrangles the data for modeling, and stores the wrangled data in an S3 bucket
  - add_predictions_step_2_make_predictions: retrieves the wrangled data, produces predictions from it using a pickled model, and stores the results in an S3 bucket
  - add_predictions_step_3_update_database: retrieves the prediction data and then updates the guests table's predicted_exit_destination column
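The two actions performed by remove_predictions_from_exited_guests can be sketched roughly as follows. This is a minimal sketch, not the deployed function. It assumes a DB-API connection to PostgreSQL 9.6 or later (for example one returned by psycopg2.connect), a TEXT column type, and a hypothetical exit_date column that marks guests who have left the shelter. It also assumes that "updating" the column for exited guests means clearing their prediction, as the directory name suggests.

```python
# Sketch only: the TEXT column type and the exit_date column are assumptions,
# not details taken from the project's schema.
ADD_COLUMN_SQL = (
    "ALTER TABLE guests "
    "ADD COLUMN IF NOT EXISTS predicted_exit_destination TEXT"
)

CLEAR_EXITED_SQL = (
    "UPDATE guests "
    "SET predicted_exit_destination = NULL "
    "WHERE exit_date IS NOT NULL"  # hypothetical marker for exited guests
)


def remove_predictions_from_exited_guests(conn):
    """Ensure the prediction column exists, then clear it for exited guests.

    `conn` is any DB-API connection to the PostgreSQL database,
    for example one returned by psycopg2.connect(...).
    """
    cur = conn.cursor()
    try:
        cur.execute(ADD_COLUMN_SQL)    # no-op if the column already exists
        cur.execute(CLEAR_EXITED_SQL)  # drop predictions for exited guests
        conn.commit()
    except Exception:
        conn.rollback()                # leave the table untouched on failure
        raise
    finally:
        cur.close()
```

Passing the connection in, rather than opening it inside the function, keeps the SQL easy to exercise against a test database.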
Languages: Python, SQL
Dependencies: Pandas, NumPy, psycopg2, pickle, Boto3, LightGBM
Services: Docker, ElephantSQL, PostgreSQL
AWS: API Gateway, Lambda, S3, CloudWatch
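Build an Amazon Linux image with Python 3.7 and pip, then start a container with the current directory mounted at /aws:
docker build -t example_image_name .
docker run -v "$(pwd)":/aws -ti example_image_name
Inside the container, install the dependencies into /aws and zip the result:
pip install bcrypt aws-psycopg2 pandas -t /aws
zip -r example_filename.zip *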
At this point, head over to the AWS console to create the function with AWS Lambda. A zipped function must not exceed 50 MB to be uploaded directly. If it exceeds this limit, it needs to be saved in an S3 bucket and loaded from there, in which case the unzipped package must not exceed 250 MB.
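The size limits can be checked locally before uploading. The following is a minimal sketch, assuming the 50 MB limit applies to the zipped file and the 250 MB limit to its unzipped contents, with MB treated as 1024 * 1024 bytes. The function name check_package and its return values are hypothetical, chosen only for illustration.

```python
import os
import zipfile

DIRECT_UPLOAD_LIMIT = 50 * 1024 * 1024  # zipped size allowed for direct upload
UNZIPPED_LIMIT = 250 * 1024 * 1024      # unzipped size allowed for any package


def check_package(zip_path,
                  direct_limit=DIRECT_UPLOAD_LIMIT,
                  unzipped_limit=UNZIPPED_LIMIT):
    """Return 'direct', 's3', or 'too_large' for a deployment zip."""
    zipped_size = os.path.getsize(zip_path)
    with zipfile.ZipFile(zip_path) as archive:
        # Sum the uncompressed size of every file in the archive.
        unzipped_size = sum(info.file_size for info in archive.infolist())
    if unzipped_size > unzipped_limit:
        return "too_large"  # must be slimmed down before it can be deployed
    if zipped_size > direct_limit:
        return "s3"         # too big for direct upload, stage it in S3
    return "direct"
```

For example, check_package("example_filename.zip") reports whether the archive built above can be uploaded directly.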
Environment variables for remove_predictions_from_exited_guests:
RDS_HOST = database host
RDS_USERNAME = database username
RDS_USER_PWD = database password
Environment variables for add_predictions_step_1_wrangle_new_data:
RDS_HOST = database host
RDS_USERNAME = database username
RDS_USER_PWD = database password
S3_BUCKET = destination S3 bucket name for wrangled data
Environment variables for add_predictions_step_2_make_predictions:
S3_BUCKET_ORIGIN = S3 bucket name where model and wrangled data are stored
S3_BUCKET_DESTINATION = destination S3 bucket name for prediction data
MODEL_NAME = model file name
WRANGLED_DATA_FILE = wrangled data file name
Environment variables for add_predictions_step_3_update_database:
RDS_HOST = database host
RDS_USERNAME = database username
RDS_USER_PWD = database password
S3_BUCKET_ORIGIN = S3 bucket name where prediction data is stored
PREDICTIONS_FILE = prediction data file name
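Inside a Lambda function these values are read from the process environment. The sketch below shows one way to load them and fail early when one is missing. The helper load_settings and the list name UPDATE_DATABASE_VARS are hypothetical, and the grouping assumes the last set of variables above belongs to the database update step.

```python
import os


def load_settings(names, environ=os.environ):
    """Return a dict holding the requested environment variables.

    Raises KeyError naming every missing variable, so a misconfigured
    function fails at startup rather than midway through a run.
    """
    missing = [name for name in names if name not in environ]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    return {name: environ[name] for name in names}


# Assumed grouping: the variables used when writing predictions to the database.
UPDATE_DATABASE_VARS = [
    "RDS_HOST",
    "RDS_USERNAME",
    "RDS_USER_PWD",
    "S3_BUCKET_ORIGIN",
    "PREDICTIONS_FILE",
]
```

A handler could then call settings = load_settings(UPDATE_DATABASE_VARS) once and pass settings["RDS_HOST"] and the other values to the database and S3 clients.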
MIT