This repository contains materials for the advanced class "Big Data within Science and Industry," which will take place on September 22, 2023, at the University of Milano-Bicocca in Milan, Italy.
Class Information: Big Data within Science and Industry
In this class, you will have access to notebooks and resources designed to guide you through the field of Gravitational Wave (GW) analysis. Our primary focus will be on two key aspects: data preprocessing and machine learning applied to GW data.
Before diving into the intricate world of GW data analysis, we'll start by understanding the importance of data preprocessing. This fundamental step involves cleaning, formatting, and organizing the raw GW data to make it suitable for further analysis. You will learn how to:
- Clean noisy data.
- Correct for instrumental artifacts.
- Prepare data for feature extraction and model training.
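The preprocessing steps listed above can be sketched on synthetic data. This is a minimal illustration using only NumPy, not the class notebooks' method: the sampling rate, the band edges, and the helper names `bandpass_fft` and `standardize` are assumptions made for this example, and in practice a dedicated library such as gwpy would typically handle filtering.

```python
import numpy as np

def bandpass_fft(x, fs, f_low, f_high):
    """Keep only Fourier components between f_low and f_high (in Hz)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()                       # remove the constant offset
    window = np.hanning(x.size)            # taper the edges to limit spectral leakage
    spectrum = np.fft.rfft(x * window)
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)
    spectrum[(freqs < f_low) | (freqs > f_high)] = 0.0
    return np.fft.irfft(spectrum, n=x.size)

def standardize(x):
    """Rescale to zero mean and unit variance, a common step before model training."""
    return (x - x.mean()) / x.std()

# Synthetic example (all values are hypothetical): a 200 Hz tone buried in
# slow low-frequency drift and white noise.
fs = 4096.0
t = np.arange(0, 1.0, 1.0 / fs)
rng = np.random.default_rng(0)
signal = np.sin(2 * np.pi * 200.0 * t)
drift = 5.0 * np.sin(2 * np.pi * 2.0 * t)
raw = signal + drift + 0.1 * rng.standard_normal(t.size)

clean = standardize(bandpass_fft(raw, fs, f_low=50.0, f_high=500.0))
```

The band-pass removes the drift below 50 Hz, and the final rescaling puts every example on a comparable scale before feature extraction.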
The second part of our tutorial delves into the realm of machine learning applied to GW data. Throughout this class, you'll explore:
- Supervised machine learning algorithms.
- Feature engineering for GW signal detection.
- Implementing machine learning models for classification tasks.
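As a rough sketch of how feature engineering and supervised classification fit together, the example below builds band-power features from synthetic bursts and classifies them with a nearest-centroid rule. Everything here is an assumption for illustration: the data are synthetic, the band edges and function names are hypothetical, and the class itself may rely on scikit-learn models instead of this hand-written classifier.

```python
import numpy as np

def band_power_features(x, fs, edges):
    """Log power of x in consecutive frequency bands defined by `edges` (Hz)."""
    power = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)
    feats = [power[(freqs >= lo) & (freqs < hi)].sum()
             for lo, hi in zip(edges[:-1], edges[1:])]
    return np.log10(np.asarray(feats) + 1e-12)

def make_example(freq, fs, n, rng):
    """Synthetic burst: a Gaussian-windowed sinusoid plus white noise."""
    t = np.arange(n) / fs
    envelope = np.exp(-((t - t.mean()) ** 2) / (2 * 0.05 ** 2))
    return envelope * np.sin(2 * np.pi * freq * t) + 0.2 * rng.standard_normal(n)

fs, n = 2048.0, 1024
edges = [20.0, 150.0, 300.0, 600.0, 1000.0]
rng = np.random.default_rng(1)

def make_dataset(n_per_class):
    """Two hypothetical classes: bursts centred at 100 Hz (0) and 400 Hz (1)."""
    X, y = [], []
    for label, freq in enumerate((100.0, 400.0)):
        for _ in range(n_per_class):
            X.append(band_power_features(make_example(freq, fs, n, rng), fs, edges))
            y.append(label)
    return np.array(X), np.array(y)

X_train, y_train = make_dataset(50)
X_test, y_test = make_dataset(20)

# Nearest-centroid classifier: assign each test example to the class whose
# mean training feature vector is closest in Euclidean distance.
centroids = np.array([X_train[y_train == c].mean(axis=0) for c in (0, 1)])
dists = np.linalg.norm(X_test[:, None, :] - centroids[None, :, :], axis=2)
y_pred = dists.argmin(axis=1)
accuracy = (y_pred == y_test).mean()
print(f"Test accuracy: {accuracy:.2f}")
```

The split into separate training and test sets is the part that carries over to any supervised model: the classifier is fitted on one set and scored on examples it has not seen.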
As a hands-on project, we will tackle a real-world challenge in the field of GW analysis. Our goal is to develop a Convolutional Neural Network (CNN) model that can classify transient signals generated by different mechanisms of GW emission during Core Collapse Supernovae (CCSN).
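To make concrete what a 1D CNN computes on a time series, here is a minimal forward pass written in plain NumPy. It is only a sketch: the weights are random and untrained, the layer sizes and the number of classes (`n_classes = 3`) are hypothetical, and the project itself would build and train the model with a deep learning framework such as TensorFlow.

```python
import numpy as np

def conv1d(x, kernels, bias):
    """Valid 1D convolution (cross-correlation, as in most deep learning libraries).

    x: (length, in_channels); kernels: (width, in_channels, out_channels).
    """
    width = kernels.shape[0]
    n_out = x.shape[0] - width + 1
    out = np.empty((n_out, kernels.shape[2]))
    for i in range(n_out):
        # Contract each window over time and input channels.
        out[i] = np.tensordot(x[i:i + width], kernels, axes=([0, 1], [0, 1])) + bias
    return out

def relu(z):
    return np.maximum(z, 0.0)

def softmax(z):
    e = np.exp(z - z.max())   # subtract the max for numerical stability
    return e / e.sum()

def forward(x, params):
    """Two convolutional layers, global max pooling, and a dense softmax layer."""
    h = relu(conv1d(x, params["k1"], params["b1"]))
    h = relu(conv1d(h, params["k2"], params["b2"]))
    pooled = h.max(axis=0)    # global max pooling over the time axis
    return softmax(pooled @ params["w"] + params["b"])

rng = np.random.default_rng(0)
n_classes = 3                 # hypothetical number of emission mechanisms
params = {
    "k1": 0.1 * rng.standard_normal((16, 1, 8)),
    "b1": np.zeros(8),
    "k2": 0.1 * rng.standard_normal((8, 8, 16)),
    "b2": np.zeros(16),
    "w": 0.1 * rng.standard_normal((16, n_classes)),
    "b": np.zeros(n_classes),
}

x = rng.standard_normal((512, 1))   # one single-channel time series
probs = forward(x, params)          # one probability per class
```

Training consists of adjusting the kernels and dense weights so that `probs` matches the known labels, which is the part a framework automates.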
Thanks to Alberto Iess for providing the time series dataset for this project.
Contact: Alberto Iess, alberto.iess@sns.it
Before you start, make sure you have the following prerequisites:
To successfully work on the tasks and projects in this course, you need to have a Python environment with the necessary packages installed. If you don't already have this environment set up on your computer, follow these instructions:
- Open the Anaconda Installation Guide.
- Download the Anaconda installer from Anaconda Downloads.
- Open your terminal.
- You have the option to install either Anaconda or Miniconda, depending on your preference:
  - To install Anaconda, refer to the Anaconda Installation Guide for Linux.
  - To install Miniconda, a lightweight alternative, follow the Miniconda Installation Guide for Linux.
- After successfully installing Anaconda or Miniconda, proceed with creating a Python environment for this project.
In your terminal, run the following commands:
- Create a new Conda environment and activate it (replace env_name with your preferred environment name, and change the Python version if necessary):
  - conda create --name env_name python=3.10
  - conda activate env_name
- Install the core packages with pip:
  - pip install numpy pandas matplotlib jupyter tqdm
- Once installed, you can launch a Jupyter Notebook.
You will need to install specific packages required for the tasks to be solved. You can do this by running the following commands in a Jupyter Notebook cell:
!pip install gwpy
!pip install statsmodels
!pip install tensorflow
!pip install scikit-learn
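After installing, it can help to confirm that the packages are importable from the notebook's kernel. The cell below is a small sketch using only the standard library; the `REQUIRED` mapping and the `missing_packages` helper are names chosen for this example. Note that scikit-learn is imported as `sklearn`.

```python
import importlib.util

# Maps each pip package name to the module name used in `import`.
REQUIRED = {
    "numpy": "numpy",
    "pandas": "pandas",
    "matplotlib": "matplotlib",
    "tqdm": "tqdm",
    "gwpy": "gwpy",
    "statsmodels": "statsmodels",
    "tensorflow": "tensorflow",
    "scikit-learn": "sklearn",
}

def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose module cannot be found."""
    return sorted(
        pip_name
        for pip_name, module in required.items()
        if importlib.util.find_spec(module) is None
    )

missing = missing_packages()
if missing:
    print("Missing packages: " + ", ".join(missing))
else:
    print("All required packages are importable.")
```

If a package is reported missing right after installation, the notebook is probably running in a different environment from the one where pip installed it.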