Courses-VU/Machine_learning

Machine_learning

Machine learning project: mRNA expression data

Goal

Test and discuss different classification methods, then select the best-performing one.

Dataset

Gene expression (RNA-seq) of five different cancer types.

Methods

Preprocessing (Unsupervised learning):

  • Principal Component Analysis (PCA)

  • Clustering
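
As a rough illustration of the unsupervised step, the sketch below standardizes the features, projects the samples onto a few principal components, and clusters them with k-means. It assumes scikit-learn is available; the function name `pca_then_cluster`, the choice of k-means, and the synthetic matrix are illustrative assumptions and are not taken from this repository's scripts.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler


def pca_then_cluster(X, n_components=2, n_clusters=5, seed=0):
    """Standardize genes, project samples onto PCs, then cluster the samples.

    Hypothetical helper: X is a (samples x genes) expression matrix.
    """
    X_scaled = StandardScaler().fit_transform(X)
    pca = PCA(n_components=n_components, random_state=seed)
    scores = pca.fit_transform(X_scaled)
    kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    labels = kmeans.fit_predict(scores)
    return scores, labels, pca.explained_variance_ratio_


# Synthetic stand-in for an expression matrix; not the project's data.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(100, 50))
scores, labels, ratios = pca_then_cluster(X_demo)
```

Five clusters are used here only because the dataset covers five cancer types; whether the clusters actually line up with the types would have to be checked against the labels.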

Use different classification methods (Supervised learning):

  • K-Nearest Neighbors

  • Linear Models

  • Naive Bayes Classifiers

  • Decision Trees

  • Kernelized Support Vector Machines (?)
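
One possible way to compare the classifiers listed above is k-fold cross-validation with a shared set of folds. The sketch below assumes scikit-learn; the model settings, the helper names `candidate_models` and `compare_models`, and the synthetic data from `make_classification` are all assumptions made for illustration, not the project's actual configuration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def candidate_models(seed=0):
    """Return one illustrative model per method family in the list."""
    return {
        "knn": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
        "linear": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
        "naive_bayes": GaussianNB(),
        "decision_tree": DecisionTreeClassifier(random_state=seed),
        "svm_rbf": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
    }


def compare_models(X, y, cv=5):
    """Mean cross-validated accuracy for each candidate model."""
    return {
        name: float(cross_val_score(model, X, y, cv=cv).mean())
        for name, model in candidate_models().items()
    }


# Synthetic five-class problem standing in for the five cancer types.
X_demo, y_demo = make_classification(
    n_samples=200, n_features=30, n_informative=10,
    n_classes=5, n_clusters_per_class=1, random_state=0,
)
results = compare_models(X_demo, y_demo)
best = max(results, key=results.get)
```

Scaling is placed inside each pipeline so that it is refit on the training folds only, which avoids leaking information from the held-out fold.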

TODO

Keep track of what we still have to do. Please update this list with new to-dos.

  • Update README.
  • Investigate preprocessing that is applied to the data.
  • Write about preprocessing steps in report.
  • Keep track of references in the report.
  • Reorganize repository (give logical filenames, restructure folders, etc.).
  • Rewrite PCA scripts structure.
  • Calculate the number of PCs needed (PCA script).
  • Review PCA script (especially investigate the explained variance values).
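
For the open item on the number of PCs, one common approach is to pick the smallest number of components whose cumulative explained variance ratio reaches a chosen threshold. The sketch below assumes scikit-learn; the function name `n_components_for_variance`, the 95% default threshold, and the synthetic data are illustrative assumptions, not values from the existing PCA script.

```python
import numpy as np
from sklearn.decomposition import PCA


def n_components_for_variance(X, threshold=0.95):
    """Smallest number of PCs whose cumulative explained variance >= threshold."""
    pca = PCA().fit(X)
    cumulative = np.cumsum(pca.explained_variance_ratio_)
    # searchsorted gives the first index where cumulative >= threshold;
    # add 1 to turn the index into a component count.
    n = int(np.searchsorted(cumulative, threshold)) + 1
    # Guard against floating-point round-off when threshold is close to 1.
    return min(n, len(cumulative)), cumulative


# Synthetic data with three features of very different variance.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(500, 3)) * np.array([10.0, 1.0, 0.1])
n_95, cumulative = n_components_for_variance(X_demo, threshold=0.95)
n_999, _ = n_components_for_variance(X_demo, threshold=0.999)
```

Printing `cumulative` alongside the chosen count is also a quick way to review the explained variance values themselves.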

Workflow

  • PCA: try applying it within each cancer type
  • Find important features → DEGs (differentially expressed genes)
  • KEGG analysis (Pathways)
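
The first two workflow steps could be prototyped roughly as follows. This sketch fits a separate PCA per cancer type and ranks genes with a one-way ANOVA F-test (`f_classif` in scikit-learn). Note that the F-test is only a simple stand-in for a proper differential expression analysis and is an assumption here, as are the helper names, the gene names, and the synthetic data. The KEGG step is not covered.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.feature_selection import f_classif


def pca_within_types(X, y, n_components=2):
    """Fit one PCA per class label, using only that class's samples."""
    return {
        label: PCA(n_components=n_components).fit(X[y == label])
        for label in np.unique(y)
    }


def rank_genes_by_f_score(X, y, gene_names, top_k=10):
    """Rank genes by ANOVA F-score across classes (a stand-in for DEG analysis)."""
    f_scores, p_values = f_classif(X, y)
    order = np.argsort(f_scores)[::-1][:top_k]
    return [(gene_names[i], float(f_scores[i]), float(p_values[i])) for i in order]


# Synthetic data: 5 classes, 20 samples each, 20 hypothetical genes.
rng = np.random.default_rng(0)
y_demo = np.repeat(np.arange(5), 20)
X_demo = rng.normal(size=(100, 20))
X_demo[:, 0] += 3.0 * y_demo  # make gene_0 differ strongly between classes
gene_names = [f"gene_{i}" for i in range(20)]

per_type = pca_within_types(X_demo, y_demo)
top_genes = rank_genes_by_f_score(X_demo, y_demo, gene_names, top_k=5)
```

In this synthetic setup the shifted gene should rank first, which gives a quick sanity check before running the same functions on real expression data.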


Contributors 5