Outlier and Anomaly Detection

Vignette on implementing outlier and anomaly detection using breast cancer detection data; created as a class project for PSTAT197A in Fall 2023.

Contributors: Kyle Wu, Jimmy Dysart, Azfal Peermohammed, Ryan Sevilla, Navneet Rajagopal

Executive Summary

As the names suggest, outlier and anomaly detection methods aim to identify data points that fall outside the normal range of the data. These anomalous observations are often rare and exhibit patterns that typical data points do not. As with machine learning models in general, anomaly detection methods fall into three main categories: supervised, unsupervised, and semi-supervised. This vignette and its supporting documents demonstrate how to use anomaly detection methods.

Repository Content

This repository includes a vignette demonstrating the implementation of a number of anomaly/outlier detection methods. The repository also includes the data used, as well as scripts containing the end-to-end implementation of the models utilized.

Methods of interest

In this vignette, we demonstrate several models and assess their ability to identify outliers that may be present in the data. In particular, our models of interest are:

Isolation Forests

Local Outlier Factor

One-Class SVM
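
As a rough illustration of one of the methods listed above, here is a minimal pure-Python sketch of the Local Outlier Factor score. It is not the code from this repository's scripts. The function name `local_outlier_factor`, the parameter `k`, and the toy 2-D points are assumptions made for illustration and are unrelated to the breast cancer data. The sketch also simplifies the method by using exactly `k` neighbours per point and ignoring ties at the k-th distance.

```python
import math


def local_outlier_factor(points, k=3):
    """Return one LOF score per point (hypothetical helper, simplified).

    Scores close to 1 suggest a point is about as dense as its neighbours;
    scores well above 1 suggest a point in a sparser region, i.e. a
    candidate outlier. Assumes len(points) > k and no more than k exact
    duplicates of any point (otherwise a density would be infinite).
    """
    n = len(points)
    # Pairwise Euclidean distances between all points.
    dist = [[math.dist(p, q) for q in points] for p in points]

    neighbours = []  # indices of the k nearest neighbours of each point
    k_distance = []  # distance from each point to its k-th nearest neighbour
    for i in range(n):
        order = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        neighbours.append(order[:k])
        k_distance.append(dist[i][order[k - 1]])

    # Local reachability density: inverse of the mean reachability distance
    # from a point to its neighbours, where the reachability distance to
    # neighbour j is max(k_distance[j], dist[i][j]).
    lrd = []
    for i in range(n):
        reach = [max(k_distance[j], dist[i][j]) for j in neighbours[i]]
        lrd.append(len(reach) / sum(reach))

    # LOF: mean density of the neighbours divided by the point's own density.
    return [sum(lrd[j] for j in neighbours[i]) / (k * lrd[i]) for i in range(n)]


# Toy data (made up): a 3x3 grid of inliers plus one distant point.
points = [(x, y) for x in range(3) for y in range(3)] + [(8, 8)]
scores = local_outlier_factor(points, k=3)

for point, score in zip(points, scores):
    print(point, round(score, 2))
```

On this toy data the distant point receives a much larger score than the grid points, which is the behaviour the method relies on. In practice the choice of `k` and the threshold for flagging a point depend on the data.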

Dataset

Our dataset was downloaded from the UC Irvine Machine Learning Repository.

References:

For more resources on outlier and anomaly detection, see the following links:

