This project demonstrates a basic implementation of Simple Linear Regression to predict salaries based on years of experience. It uses the Salary Dataset from Kaggle.
- Data Preprocessing: The dataset is loaded and converted into NumPy arrays.
- Model Implementation: Simple Linear Regression is implemented from scratch using Python.
- Gradient Descent: The model uses gradient descent to optimize weights.
- Visualization: Graphs of actual vs. predicted salaries are included.
- Model Evaluation: Key metrics such as cost function and ( R^2 ) score are calculated.
The project consists of a single Jupyter Notebook file:
main.ipynb
: Contains the entire implementation of the project, including loading data, model training, visualization, and evaluation.
The dataset is sourced from Kaggle: Salary Dataset
It contains two columns:
YearsExperience
: Years of work experience.Salary
: Corresponding salary.
- Python 3.7+
- Jupyter Notebook
- Required Libraries:
pandas
numpy
matplotlib
- Clone this repository:
git clone https://github.com/KartikAg13/simple_linear_regression.git
- Navigate to the project directory:
cd simple_linear_regression
- Open the Jupyter Notebook:
jupyter notebook main.ipynb
- Follow the cells to run the project step by step.
- Optimal Parameters: The model finds the best values for weight (( w )) and bias (( b )) using gradient descent.
- Prediction Visualization: A graph is plotted showing the regression line (predicted) vs. actual data.
- Evaluation: The cost function and ( R^2 ) score are calculated to assess the model's performance.
This is a basic project designed for educational purposes. Feel free to fork and extend it! Contributions are welcome.