PSTAT197-F23/vignette-anomaly-detection

Outlier and Anomaly Detection

Vignette on implementing outlier and anomaly detection using breast cancer detection data; created as a class project for PSTAT197A in Fall 2023.

Contributors: Kyle Wu, Jimmy Dysart, Azfal Peermohammed, Ryan Sevilla, Navneet Rajagopal

Executive Summary

As the names suggest, outlier and anomaly detection methods identify data points that fall outside the normal range of the data. These anomalous observations are typically rare and exhibit patterns that standard data points do not. As with machine learning models in general, anomaly detection methods fall into three main categories: supervised, unsupervised, and semi-supervised. This vignette and its supporting documents demonstrate how to use anomaly detection methods.

Repository Content

This repository includes a vignette demonstrating the implementation of several anomaly/outlier detection methods. It also includes the data used, as well as scripts containing the end-to-end implementation of the models.

Methods of Interest

In this vignette, we demonstrate how effectively several models identify outliers that may be present in the data. In particular, our models of interest are:

  1. Isolation Forests
  2. Local Outlier Factors
  3. One-class SVM
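
As a rough illustration of how the three methods listed above can be applied side by side, here is a minimal Python sketch using scikit-learn. It is not the vignette's own code. Several things are assumptions made for the example: scikit-learn's bundled breast cancer dataset stands in for the repository's data file, the `CONTAMINATION` value of 0.05 is a hypothetical guess at the share of outliers, and all hyperparameters are library defaults or arbitrary choices.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor
from sklearn.preprocessing import StandardScaler
from sklearn.svm import OneClassSVM

# Assumption: scikit-learn's bundled breast cancer data is used as a stand-in
# for the dataset shipped with this repository.
X, _ = load_breast_cancer(return_X_y=True)

# Standardize features so distance-based methods are not dominated by scale.
X_scaled = StandardScaler().fit_transform(X)

# Hypothetical expected share of outliers; tune this for real data.
CONTAMINATION = 0.05

detectors = {
    "isolation_forest": IsolationForest(
        contamination=CONTAMINATION, random_state=0
    ),
    "local_outlier_factor": LocalOutlierFactor(
        n_neighbors=20, contamination=CONTAMINATION
    ),
    "one_class_svm": OneClassSVM(kernel="rbf", gamma="scale", nu=CONTAMINATION),
}

# fit_predict returns +1 for inliers and -1 for flagged outliers.
labels = {name: det.fit_predict(X_scaled) for name, det in detectors.items()}

for name, y in labels.items():
    n_flagged = int((y == -1).sum())
    print(f"{name}: flagged {n_flagged} of {len(y)} observations")
```

All three detectors are used here in an unsupervised way, so the class labels are discarded. Comparing which observations each method flags is one simple way to see where the methods agree and disagree.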

Dataset

Our dataset was downloaded from the UC Irvine Machine Learning Repository.

References:

For more resources on outlier and anomaly detection, utilize some of the following links:
