Flight Price Prediction Model using Sagemaker Documentation

Usage

To run the project locally:

Clone the repository: git clone https://github.com/iamdebasishdas123/SageMaker_Flight_Prediction
Install dependencies: pip install -r requirements.txt
Run the Streamlit app: streamlit run app.py

Flight Price Prediction Model using Sagemaker Documentation

Introduction

This document provides an overview of the Flight Price Prediction project, including the dataset, project files, model training, and prediction process. The project aims to predict flight prices based on various features such as airline, date of journey, source, destination, departure time, arrival time, duration, total stops, and additional information. The prediction model is built using XGBoost and deployed as a web application using Streamlit.

Project Structure

Flight_Price_Prediction/
├── .gitignore
├── README.md
├── app.py
├── aws-xgboost-model
├── XGB-model
├── preprocessor.joblib
├── requirements.txt
├── data/
│   ├── flight_price.csv
│   ├── test.csv
│   ├── train.csv
│   └── val.csv
|── Preprocess file/
│   ├── test-pre.csv
│   ├── train-pre.csv
│   ├── val-pre.csv
├── notebooks/
│   ├── AWS-Model-training.ipynb
│   ├── Data_Cleaning.ipynb
│   ├── EDA.ipynb
│   ├── Feature_engineering.ipynb
│   ├── local-Model-training.ipynb
│   └── train-pre.csv
├── utils/
│   └── eda_helper_functions.py
└── .git/

Existing Project Files

app.py: The main application file for the Streamlit web app.
preprocessor.joblib: The saved preprocessor object used for transforming the input data.
XGB-model: The trained XGBoost model for price prediction in local computer.
aws-xgboost-model: The XGBoost model trained on AWS Sagemaker.
requirements.txt: The list of dependencies required to run the project.
notebooks/: Contains Jupyter notebooks for data cleaning, exploratory data analysis (EDA), feature engineering, and model training.

Model Training

The model training process involves several steps, as documented in the Jupyter notebooks:

Data Cleaning (Data_Cleaning.ipynb):
- Handling missing values
- Correcting data types
- Removing duplicates
Exploratory Data Analysis (EDA) (EDA.ipynb):
- Visualizing the distribution of features
- Identifying correlations between features and the target variable
- Detecting outliers
Feature Engineering (Feature_engineering.ipynb):
- Creating new features from existing ones (e.g., extracting date and time components)
- Encoding categorical variables
- Scaling numerical features
Model Training:
- Local Model Training (local-Model-training.ipynb): Training the model on local computer.
- AWS Model Training (AWS-Model-training.ipynb): Training the model on AWS Sagemaker for better performance and scalability.

Preprocessing Pipeline

The preprocessing pipeline is defined using scikit-learn and feature-engine transformers. It includes steps for handling categorical and numerical features, as well as feature selection. The pipeline is saved as preprocessor.joblib.

Training the XGBoost Model

The XGBoost model is trained on the preprocessed data. The trained model is saved as XGB-model and aws-xgboost-model for local and AWS training respectively.

Web Application

The web application is built using Streamlit and allows users to input flight details to get a price prediction.

Prediction

When the user inputs the flight details and clicks the "Predict" button, the app:

Loads the saved preprocessor and model.
Transforms the input data using the preprocessor.
Predicts the flight price using the XGBoost model.
Displays the predicted price.

Example

Route: Delhi to Kolkata
Airline: Air India
Actual Price: 5300 INR
Predicted Price: 5920 INR
Flight Details: Non-stop, duration 2h 35min

Conclusion

The Flight Price Prediction model provides an accurate and user-friendly way to predict flight prices based on various features. The web application allows for easy interaction and quick predictions, making it a valuable tool for travelers and analysts alike.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Usage

Flight Price Prediction Model using Sagemaker Documentation

Introduction

Project Structure

Existing Project Files

Model Training

Preprocessing Pipeline

Training the XGBoost Model

Web Application

Prediction

Example

Conclusion

About

Releases

Packages

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 21 Commits
Dataset		Dataset
Notebook		Notebook
Preprocess file		Preprocess file
.gitignore		.gitignore
README.md		README.md
XGB-model		XGB-model
app.py		app.py
aws-xgboost-model		aws-xgboost-model
preprocessor.joblib		preprocessor.joblib
requirements.txt		requirements.txt
train.csv		train.csv

iamdebasishdas123/SageMaker_Flight_Prediction

Folders and files

Latest commit

History

Repository files navigation

Usage

Flight Price Prediction Model using Sagemaker Documentation

Introduction

Project Structure

Existing Project Files

Model Training

Preprocessing Pipeline

Training the XGBoost Model

Web Application

Prediction

Example

Conclusion

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages