This project performs Exploratory Data Analysis (EDA) on IPL 2023 data to uncover patterns, trends, and relationships in the tournament.
To run this project, ensure you have the following dependencies installed:
- Python 3.x
- numpy
- pandas
- matplotlib
- Clone this repository to your local machine or download the project files.
- Place the IPL 2023 dataset files (IPL2023_Bowler.csv, IPL2023_Batsman.csv, IPL2023_Matches.csv, IPL2023_Match_Scoreboard.csv) in the same directory as the code file.
- Open a terminal or command prompt and navigate to the project directory.
- Execute the following command to install the required dependencies:

    pip install numpy pandas matplotlib
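To confirm the installation worked before running the analysis, a quick check along these lines may help. This is a minimal sketch, not part of the project code: the helper name `missing_packages` is hypothetical, and it only checks that each top-level package can be found on the current import path, not its version.

```python
from importlib.util import find_spec

# Packages listed in the dependency section of this README.
REQUIRED = ["numpy", "pandas", "matplotlib"]


def missing_packages(names):
    """Return the package names that cannot be found on the import path.

    Hypothetical helper for illustration; it does not import the packages,
    it only looks them up.
    """
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are installed.")
```

If anything is reported as missing, rerunning the pip command in the same environment (or virtual environment) that runs the code is usually the fix.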
The code performs the following steps:
- Imports the required libraries: numpy, pandas, and matplotlib.pyplot.
- Reads the IPL 2023 dataset files (IPL2023_Bowler.csv, IPL2023_Batsman.csv, IPL2023_Matches.csv, IPL2023_Match_Scoreboard.csv) using pd.read_csv().
- Performs EDA on the batsman dataset, including data exploration, descriptive statistics, and data visualization.
- Generates various plots and charts to analyze the batsmen's performance, such as scatter plots, bar charts, and pie charts.
- Presents insights on runs scored, balls faced, boundaries (4's and 6's), and players with the most runs, 4's, 6's, and unique match numbers.
- Provides a detailed analysis of the 'out_by' column, identifying duplicates and generating visualizations.
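The steps above can be sketched roughly as follows. This is an illustrative outline under stated assumptions, not the project's actual code: the helper names (`load_datasets`, `top_by_total`, `duplicate_rows`) are hypothetical, and apart from `out_by`, the column names in the usage comments (`batsman`, `run`) are assumed and may differ from the real CSV headers.

```python
from pathlib import Path

import pandas as pd

# File names as listed in the setup steps of this README.
DATASET_FILES = {
    "bowler": "IPL2023_Bowler.csv",
    "batsman": "IPL2023_Batsman.csv",
    "matches": "IPL2023_Matches.csv",
    "scoreboard": "IPL2023_Match_Scoreboard.csv",
}


def load_datasets(data_dir="."):
    """Read each CSV in DATASET_FILES into a DataFrame, keyed by short name."""
    return {
        key: pd.read_csv(Path(data_dir) / name)
        for key, name in DATASET_FILES.items()
    }


def top_by_total(df, player_col, value_col, n=5):
    """Sum value_col per player and return the n largest totals."""
    return df.groupby(player_col)[value_col].sum().nlargest(n)


def duplicate_rows(df, column):
    """Return every row whose value in `column` occurs more than once."""
    return df[df[column].duplicated(keep=False)]


# Example usage (column names other than 'out_by' are ASSUMED):
#
#   data = load_datasets()
#   batsman = data["batsman"]
#   print(batsman.head())        # first rows, for a quick look
#   print(batsman.describe())    # descriptive statistics
#   print(top_by_total(batsman, "batsman", "run"))
#   print(duplicate_rows(batsman, "out_by"))
```

The helpers take column names as parameters so they can be pointed at whatever headers the dataset actually uses; the same `top_by_total` call would work for 4's or 6's by swapping the value column.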
Feel free to modify the code and explore the dataset further to derive additional insights.