Classification of stars, galaxies, and quasars using spectral characteristics.

IgorKolodziej/stellar_classification

Stellar Classification - SDSS17

Authors

Igor Kołodziej

Kamil Eliaszuk

Introduction

This project focuses on the classification of stars, galaxies, and quasars using spectral characteristics. The dataset used for this project is derived from the Sloan Digital Sky Survey (SDSS) DR17. The goal is to create a robust model that accurately identifies celestial objects to optimize the allocation of resources for further research.

Business Case

Suppose a team of astrophysicists has tasked us with developing a model to classify celestial objects reliably. Given the high cost associated with further research on classified objects, maximizing the precision of the classification model is paramount. The team requires assurance that objects are identified correctly to streamline subsequent investigations.

Folder Structure

  • dataset: Contains the dataset used for training and evaluation.
  • models: Stores trained machine learning models.
  • notebooks: Jupyter notebooks documenting data exploration, model training, and evaluation.
  • scripts: Python scripts for automated data exploration.
  • envs: Anaconda environments used in this project.

Virtual environments

Anaconda environments used in this project are available in the envs directory.

  • Use env_automated_eda to run the scripts
  • Use env for anything else

Best Model

After extensive testing of various models, the random forest achieved the best performance: a weighted precision of 0.977 using only 4 of the dataset's 17 features.
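
For reference, weighted precision is the per-class precision averaged with each class weighted by its number of true instances. The sketch below works through that calculation in plain Python. The labels are made up for illustration and are not the project's results, and the function name `weighted_precision` is hypothetical. It assumes the reported figure follows the usual definition (the one scikit-learn uses for `precision_score(..., average="weighted")`).

```python
from collections import Counter

def weighted_precision(y_true, y_pred):
    """Per-class precision averaged with weights equal to class support."""
    support = Counter(y_true)      # true instances per class
    predicted = Counter(y_pred)    # predictions per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    total = len(y_true)
    score = 0.0
    for label, n_true in support.items():
        # Precision = correct predictions of this class / all predictions of it;
        # treated as 0 when the class is never predicted.
        precision = correct[label] / predicted[label] if predicted[label] else 0.0
        score += precision * n_true / total
    return score

# Illustrative labels only, not taken from the dataset.
y_true = ["GALAXY", "GALAXY", "GALAXY", "STAR", "STAR", "QSO"]
y_pred = ["GALAXY", "GALAXY", "STAR", "STAR", "STAR", "QSO"]
print(round(weighted_precision(y_true, y_pred), 3))  # 0.889
```

In this toy example every GALAXY and QSO prediction is correct, while one of the three STAR predictions is wrong, so the score is 1·3/6 + (2/3)·2/6 + 1·1/6 = 8/9.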

Dataset Overview

The dataset comprises 100,000 observations from the SDSS, each described by 17 feature columns and 1 class column. The features include measurements in the ultraviolet, green, red, and infrared filters, the object's sky coordinates (right ascension and declination angles), its redshift, and identifiers such as the object ID.

  • obj_ID: Object Identifier
  • alpha: Right Ascension angle (at J2000 epoch)
  • delta: Declination angle (at J2000 epoch)
  • u: Ultraviolet filter
  • g: Green filter
  • r: Red filter
  • i: Near Infrared filter
  • z: Infrared filter
  • run_ID: Run Number
  • rerun_ID: Rerun Number
  • cam_col: Camera column
  • field_ID: Field number
  • spec_obj_ID: Unique ID for optical spectroscopic objects
  • class: Object class (galaxy, star, or quasar)
  • redshift: Redshift value
  • plate: Plate ID
  • MJD: Modified Julian Date
  • fiber_ID: Fiber ID
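
As a rough sketch of how these columns might be split for modelling, the example below keeps the eight columns that describe the object itself (position, the five filter measurements, and redshift) and separates them from the class label. The choice of columns to keep, the `split_row` helper, and the two sample rows are assumptions made for illustration. They are not the authors' actual preprocessing, and the 4 features used by the best model are not specified here.

```python
import csv
import io

# Columns kept as candidate model inputs (assumed selection for illustration).
FEATURE_COLUMNS = ("alpha", "delta", "u", "g", "r", "i", "z", "redshift")
TARGET = "class"

def split_row(row):
    """Return (features, label) for one CSV row, ignoring all other columns."""
    features = {name: float(row[name]) for name in FEATURE_COLUMNS}
    return features, row[TARGET]

# Two made-up rows with a subset of the dataset's columns.
sample = io.StringIO(
    "obj_ID,alpha,delta,u,g,r,i,z,redshift,class\n"
    "1,135.6,32.4,23.8,22.2,20.3,19.1,18.7,0.63,GALAXY\n"
    "2,144.8,31.2,24.7,22.8,22.5,21.1,21.6,0.00,STAR\n"
)
rows = [split_row(r) for r in csv.DictReader(sample)]

for features, label in rows:
    print(label, features)
```

To use this with the real data, the `io.StringIO` sample would be replaced by an open file handle to the CSV in the dataset folder.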

Dataset Citation

  • Author: fedesoriano
  • Date: January 2022
  • Dataset: Stellar Classification Dataset - SDSS17
  • Link: Kaggle
