anomaly/gmm: Add GMM-based change detection algorithm #94

Closed
wants to merge 19 commits into from
19 commits
82c2655
anomaly/gmm: Add GMM-based change detection algorithm
jamiesantos May 17, 2023
02f8e9c
caching: add directory for saving files
jamiesantos May 21, 2023
fd225bf
emd: update the PuLP solver to glpk for compatibility with ARM archit…
jamiesantos May 21, 2023
af87439
gmm_change_detection: remove redundant packages
jamiesantos May 21, 2023
98a6b52
caching: streamline file handling
jamiesantos May 22, 2023
162edf0
caching: add predictions for points for complete model
jamiesantos May 22, 2023
7d97666
Remove extraneous code from previous clustering algorithm
jamiesantos May 22, 2023
c363072
gmm: remove extra model fitting
jamiesantos May 22, 2023
2d61373
caching: remove files for GMM cache debugging
jamiesantos May 22, 2023
c35cbde
Merge branch 'develop' of github.com:nasa/isaac into wip/gmm_change_det
marinagmoreira May 26, 2023
f043017
passing python code through linters
marinagmoreira May 26, 2023
5d7940f
Add object disappearance detection
jamiesantos May 28, 2023
1edf24e
wip: preprocess: only save images once point cloud is obtained
jamiesantos May 28, 2023
e4c92d5
Merge branch 'wip/gmm_change_det' of github.com:jamiesantos/isaac int…
jamiesantos May 28, 2023
39ba1a2
artificial_data: randomly generate point clouds with specified changes
jamiesantos May 28, 2023
30bdd53
organize scripts into package structure
marinagmoreira May 30, 2023
74bbaca
adding some headers and converting file from dos to unix
marinagmoreira May 30, 2023
4925183
remote duplicate commented code that is in artificial_data.py
marinagmoreira May 30, 2023
882a125
running on ubuntu 20 the example fake data
marinagmoreira May 30, 2023
30 changes: 30 additions & 0 deletions anomaly/gmm-change-detection/CMakeLists.txt
@@ -0,0 +1,30 @@
# Copyright (c) 2021, United States Government, as represented by the
# Administrator of the National Aeronautics and Space Administration.
#
# All rights reserved.
#
# The "ISAAC - Integrated System for Autonomous and Adaptive Caretaking
# platform" software is licensed under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with the
# License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations
# under the License.

cmake_minimum_required(VERSION 3.0 FATAL_ERROR)
project(gmm)

## Compile as C++14, supported in ROS Kinetic and newer
add_compile_options(-std=c++14)


## Find catkin macros and libraries
find_package(catkin REQUIRED COMPONENTS)

catkin_package()

15 changes: 15 additions & 0 deletions anomaly/gmm-change-detection/README.md
@@ -0,0 +1,15 @@
\page gmm GMM Change Detection

# Overview

This implementation of a GMM-based anomaly detection algorithm was created by Jamie Santos as part of a [Master's thesis]().
The algorithm detects changes in environments such as the ISS using 3D depth point cloud data.

# Requirements
[To-Do]

# Usage

    rosrun gmm gmm_change_detection.py
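
For reviewers unfamiliar with the approach, here is a minimal sketch of the general idea behind GMM-based change detection: score new points against a reference mixture and flag the ones it explains poorly. It is not the code in `gmm_change_detection.py` (which is not part of this excerpt). The function names, the reference parameters, and the threshold of `-10.0` are all assumptions chosen for illustration.

```python
import numpy as np


def gmm_log_likelihood(points, weights, means, covs):
    """Per-point log-likelihood of (N, D) points under a Gaussian mixture."""
    n_points, dim = points.shape
    log_terms = np.empty((len(weights), n_points))
    for k, (w, mu, cov) in enumerate(zip(weights, means, covs)):
        diff = points - mu
        # Squared Mahalanobis distance of every point to component k
        maha = np.einsum("ij,ij->i", diff @ np.linalg.inv(cov), diff)
        _, logdet = np.linalg.slogdet(cov)
        log_terms[k] = np.log(w) - 0.5 * (dim * np.log(2 * np.pi) + logdet + maha)
    # Log-sum-exp over components for numerical stability
    peak = log_terms.max(axis=0)
    return peak + np.log(np.exp(log_terms - peak).sum(axis=0))


def flag_changes(points, weights, means, covs, threshold=-10.0):
    """Boolean mask of points that the reference GMM explains poorly."""
    return gmm_log_likelihood(points, weights, means, covs) < threshold


# Hypothetical reference scene: a single tight cluster at the origin
weights = np.array([1.0])
means = np.array([[0.0, 0.0, 0.0]])
covs = np.array([np.diag([0.01, 0.01, 0.01])])

# One point inside the known cluster, one far away from it
new_points = np.array([[0.0, 0.0, 0.0], [1.5, 1.5, 1.5]])
mask = flag_changes(new_points, weights, means, covs)
print(mask)
```

Only the second point falls below the threshold here, so it is the one flagged as a candidate change. A real pipeline would fit the reference mixture from data and pick the threshold empirically.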


21 changes: 21 additions & 0 deletions anomaly/gmm-change-detection/package.xml
@@ -0,0 +1,21 @@
<?xml version="1.0"?>
<package format="2">
<name>gmm</name>
<version>0.0.0</version>
<description>GMM Change Detection package</description>
<license>
Apache License, Version 2.0
</license>
<author email="astrobee-fsw@nx.arc.nasa.gov">
ISAAC Flight Software
</author>
<maintainer email="astrobee-fsw@nx.arc.nasa.gov">
ISAAC Flight Software
</maintainer>

<buildtool_depend>catkin</buildtool_depend>
<build_depend>roscpp</build_depend>
<build_depend>cv_bridge</build_depend>
<build_export_depend>roscpp</build_export_depend>
<exec_depend>roscpp</exec_depend>
</package>
1 change: 1 addition & 0 deletions anomaly/gmm-change-detection/scripts/__init__.py
@@ -0,0 +1 @@
from .gmm_mml import GmmMml
122 changes: 122 additions & 0 deletions anomaly/gmm-change-detection/scripts/artificial_data.py
@@ -0,0 +1,122 @@
#!/usr/bin/env python
# Copyright (c) 2017, United States Government, as represented by the
# Administrator of the National Aeronautics and Space Administration.
#
# All rights reserved.
#
# The Astrobee platform is licensed under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with the
# License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations
# under the License.

import copy

import numpy as np


def generate_data(n_start, n_disappearances, n_appearances):
    """Generate two point clouds: a baseline and a copy with known changes.

    The second cloud drops the first n_disappearances clusters of the
    baseline and adds n_appearances new clusters.
    """
    N = 1000  # N points in each cluster
    point_set_1 = []
    point_set_2 = []

    # Define first set of points
    means_1 = np.random.uniform(-2, 2, (n_start, 3)).round(2)
    covs_1 = np.zeros(shape=(n_start, 3, 3))
    for i in range(n_start):
        # Variances must be strictly positive for a valid covariance matrix
        covs_1[i] = np.diag(np.random.uniform(0.01, 0.1, (1, 3))[0].round(2))

    # Remove old clusters in second set of points
    means_2 = copy.deepcopy(means_1)
    covs_2 = copy.deepcopy(covs_1)

    for i in range(n_disappearances):
        means_2 = np.delete(means_2, 0, 0)
        covs_2 = np.delete(covs_2, 0, 0)

    # Add new clusters in second set of points
    means_appearances = np.random.uniform(-2, 2, (n_appearances, 3)).round(2)
    covs_appearances = np.zeros(shape=(n_appearances, 3, 3))
    for i in range(n_appearances):
        covs_appearances[i] = np.diag(np.random.uniform(0.01, 0.1, (1, 3))[0].round(2))
    means_2 = np.vstack((means_2, means_appearances))
    covs_2 = np.vstack((covs_2, covs_appearances))

    # Concatenate clusters into point clouds
    for i in range(len(means_1)):
        x = np.random.multivariate_normal(means_1[i], covs_1[i], N)
        point_set_1.append(x)
    points_1 = np.concatenate(point_set_1)

    for i in range(len(means_2)):
        x = np.random.multivariate_normal(means_2[i], covs_2[i], N)
        point_set_2.append(x)
    points_2 = np.concatenate(point_set_2)

    return points_1, points_2


if __name__ == "__main__":
    points_1, points_2 = generate_data(5, 3, 9)
    print(points_1.shape, points_2.shape)

# if fake_data:
# # Generate 3D data with 4 clusters
# # set Gaussian centers and covariances in 3D
# means = np.array([[1, 0.0, 0.0],
# [0.0, 0.0, 0.0],
# [-0.5, -0.5, -0.5],
# [-0.8, 0.3, 0.4]])
# covs = np.array([np.diag([0.01, 0.01, 0.03]),
# np.diag([0.08, 0.01, 0.01]),
# np.diag([0.01, 0.05, 0.01]),
# np.diag([0.03, 0.07, 0.01])])
#
# N = 1000 #Number of points to be generated for each cluster.
# points_a = []
# points_b = []
#
# for i in range(len(means)):
# x = np.random.multivariate_normal(means[i], covs[i], N )
# points_a.append(x)
# points_b.append(x)
#
# points1 = np.concatenate(points_a)
#
# if appearance:
# # Add an extra Gaussian
# means2 = np.array([[1.5, 1.5, 1.5],
# [0.2, 0.2, 0.2],
# [0.8, -.03, -0.4]])
# covs2 = np.array([np.diag([0.01, 0.01, 0.01]),
# np.diag([0.02, 0.01, 0.03]),
# np.diag([0.03, 0.02, 0.01])])
#
# for i in range(len(means2)):
# x = np.random.multivariate_normal(means2[i], covs2[i], N )
# points_b.append(x)
#
# points2 = np.concatenate(points_b)
#
# else:
# # Remove an extra Gaussian
# means2 = np.array([[1, 0.0, 0.0],
# [0.0, 0.0, 0.0],
# [-0.5, -0.5, -0.5]])
# covs2 = np.array([np.diag([0.01, 0.01, 0.03]),
# np.diag([0.08, 0.01, 0.01]),
# np.diag([0.01, 0.05, 0.01])])
# points_b = []
#
# for i in range(len(means2)):
# x2 = np.random.multivariate_normal(means2[i], covs2[i], N )
# points_b.append(x2)
#
# points2 = np.concatenate(points_b)
#
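
Since `generate_data` draws from NumPy's global random state, two runs give different clouds, which makes test failures hard to reproduce. A possible seeded variant is sketched below. The name `generate_seeded_data` and the `seed` and `n_points` parameters are hypothetical and not part of this PR. It also draws variances from a strictly positive range so that every covariance matrix is valid.

```python
import numpy as np


def generate_seeded_data(n_start, n_disappearances, n_appearances, seed=0, n_points=1000):
    """Hypothetical seeded variant of generate_data (illustration only)."""
    rng = np.random.default_rng(seed)

    def random_clusters(count):
        means = rng.uniform(-2, 2, (count, 3))
        # Diagonal covariances with strictly positive variances
        covs = np.array([np.diag(rng.uniform(0.01, 0.1, 3)) for _ in range(count)])
        return means, covs.reshape(count, 3, 3)

    def sample(means, covs):
        # Draw n_points from every cluster and stack them into one cloud
        return np.concatenate(
            [rng.multivariate_normal(m, c, n_points) for m, c in zip(means, covs)]
        )

    means_1, covs_1 = random_clusters(n_start)
    new_means, new_covs = random_clusters(n_appearances)
    # Drop the first n_disappearances clusters, then append the new ones
    means_2 = np.vstack((means_1[n_disappearances:], new_means))
    covs_2 = np.vstack((covs_1[n_disappearances:], new_covs))
    return sample(means_1, covs_1), sample(means_2, covs_2)


points_1, points_2 = generate_seeded_data(5, 3, 9, seed=42)
print(points_1.shape, points_2.shape)  # (5000, 3) (11000, 3)
```

With 5 starting clusters, 3 disappearances, and 9 appearances, the second cloud holds 11 clusters of 1000 points each. Calling the function again with the same seed returns identical arrays.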
84 changes: 84 additions & 0 deletions anomaly/gmm-change-detection/scripts/emd_gmm.py
@@ -0,0 +1,84 @@
# https://towardsdatascience.com/linear-programming-using-python-priyansh-22b5ee888fe0
import numpy as np
from pulp import GLPK_CMD, LpMinimize, LpProblem, LpStatus, LpVariable, lpSum


class EMDGMM:
    def __init__(self, gmm1_weights, gmm2_weights):
        self.warehouse_supply = gmm1_weights  # Supply Matrix
        self.cust_demands = gmm2_weights  # Demand Matrix
        self.n_warehouses = gmm1_weights.size
        self.n_customers = gmm2_weights.size
        self.weight_sum1 = np.sum(self.warehouse_supply)
        self.weight_sum2 = np.sum(self.cust_demands)
        self.distances = None
        self.emd = None

    def get_distance(self, means1, means2):
        """Given two GMMs, generate a distance matrix between all cluster
        representatives (means) of GMM1 and GMM2. The K1 x K2 matrix is
        stored in self.distances."""

        distances = np.zeros((means1.shape[0], means2.shape[0]))
        for i, row1 in enumerate(means1):
            for j, row2 in enumerate(means2):
                distances[i][j] = np.linalg.norm(row1 - row2)
        self.distances = distances

    def calculate_emd(self):
        """Optimize the cost-distance (weight-distance) flow between the
        two GMMs and use the optimized distance as the EMD distance metric.
        The result is stored in self.emd."""

        if self.distances is None:
            raise ValueError("Call get_distance() before calculate_emd()")

        # Cost Matrix
        cost_matrix = self.distances

        # Initialize Model
        model = LpProblem("Supply-Demand-Problem", LpMinimize)

        # Define Variable Names in row-major order so that the variable
        # named X_i_j ends up at allocation[i - 1][j - 1] after the reshape
        variable_names = [
            str(i) + "_" + str(j)
            for i in range(1, self.n_warehouses + 1)
            for j in range(1, self.n_customers + 1)
        ]

        # Decision Variables
        DV_variables = LpVariable.matrix(
            "X", variable_names, cat="Continuous", lowBound=0
        )
        allocation = np.array(DV_variables).reshape(self.n_warehouses, self.n_customers)

        # Objective Function
        obj_func = lpSum(allocation * cost_matrix)
        model += obj_func

        # Constraints
        for i in range(self.n_warehouses):
            # print(lpSum(allocation[i][j] for j in range(self.n_customers)) <= warehouse_supply[i])
            model += lpSum(
                allocation[i][j] for j in range(self.n_customers)
            ) <= self.warehouse_supply[i], "Supply Constraints " + str(i)

        for j in range(self.n_customers):
            # print(lpSum(allocation[i][j] for i in range(self.n_warehouses)) >= cust_demands[j])
            model += lpSum(
                allocation[i][j] for i in range(self.n_warehouses)
            ) >= self.cust_demands[j], "Demand Constraints " + str(j)

        model.solve(GLPK_CMD(msg=0))
        status = LpStatus[model.status]
        if status != "Optimal":
            raise RuntimeError("EMD linear program was not solved: " + status)

        # print("Total Cost:", model.objective.value())
        # for v in model.variables():
        #     try:
        #         print(v.name, "=", v.value())
        #     except:
        #         print("error couldn't find value")

        # for i in range(self.n_warehouses):
        #     print("Warehouse ", str(i+1))
        #     print(lpSum(allocation[i][j].value() for j in range(self.n_customers)))

        total_flow = min(self.weight_sum1, self.weight_sum2)
        self.emd = model.objective.value() / total_flow
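
For reference, the linear program that `calculate_emd` assembles can be written out as below. The notation is chosen here for illustration and does not appear in the PR: $w_i$ are the weights of the first GMM (`warehouse_supply`), $v_j$ the weights of the second (`cust_demands`), $\mu_i$ and $\nu_j$ the component means passed to `get_distance`, and $f_{ij}$ the flow variables held in `allocation`.

```latex
\begin{aligned}
\min_{f_{ij} \ge 0} \quad & \sum_{i=1}^{K_1} \sum_{j=1}^{K_2} f_{ij}\, d_{ij},
  \qquad d_{ij} = \lVert \mu_i - \nu_j \rVert_2 \\
\text{s.t.} \quad & \sum_{j=1}^{K_2} f_{ij} \le w_i, \qquad i = 1, \dots, K_1 \\
& \sum_{i=1}^{K_1} f_{ij} \ge v_j, \qquad j = 1, \dots, K_2 \\[4pt]
\mathrm{EMD} \;=\; & \frac{\sum_{i,j} f^{*}_{ij}\, d_{ij}}
  {\min\!\left(\sum_{i} w_i,\; \sum_{j} v_j\right)}
\end{aligned}
```

Here $f^{*}_{ij}$ is the optimal flow. Because the demand constraints are lower bounds, the program is feasible only when the total supply is at least the total demand, which holds with equality when both weight vectors sum to one.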
46 changes: 46 additions & 0 deletions anomaly/gmm-change-detection/scripts/example_code/emd/emd_gmm.py
@@ -0,0 +1,46 @@
import numpy as np


def earth_movers_distance(m1, m2):
    """Compute a simplified distance between two sets of samples.

    Note: this expression reduces to the Euclidean distance between the two
    sample means. It is a placeholder, not a full Earth Mover's Distance.

    Parameters
    ----------
    m1 : np.ndarray
        (N1, D) array of samples drawn from the first distribution.
    m2 : np.ndarray
        (N2, D) array of samples drawn from the second distribution.

    Returns
    -------
    float
        Distance between the two sample means.
    """
    mean_1 = m1.mean(axis=0)
    mean_2 = m2.mean(axis=0)

    # Compute the mean and covariance of the combined samples.
    mean_combined = (mean_1 + mean_2) / 2
    # The combined covariance is computed but not used by the distance below.
    covariance_combined = (np.cov(m1, rowvar=False) + np.cov(m2, rowvar=False)) / 2

    # Sum of the distances from each sample mean to the combined mean.
    distance = np.linalg.norm(mean_combined - mean_1) + np.linalg.norm(
        mean_combined - mean_2
    )

    return distance


def main():
    m1 = np.random.multivariate_normal(mean=[0, 0, 0], cov=np.eye(3), size=1000)
    m2 = np.random.multivariate_normal(mean=[1, 2, 3], cov=np.eye(3), size=1000)

    # Compute the distance between the two sets of samples.
    emd = earth_movers_distance(m1, m2)

    # Print the distance.
    print("The distance between the two sets of samples is " + str(emd))


if __name__ == "__main__":
    main()
45 changes: 45 additions & 0 deletions anomaly/gmm-change-detection/scripts/example_code/emd/emd_numpy.py
@@ -0,0 +1,45 @@
import numpy as np


def earth_movers_distance(m1, m2):
    """Compute a simplified distance between two sets of samples.

    Note: this expression reduces to the Euclidean distance between the two
    sample means. It is a placeholder, not a full Earth Mover's Distance.

    Parameters
    ----------
    m1 : np.ndarray
        (N1, D) array of samples drawn from the first distribution.
    m2 : np.ndarray
        (N2, D) array of samples drawn from the second distribution.

    Returns
    -------
    float
        Distance between the two sample means.
    """
    mean_1 = m1.mean(axis=0)
    mean_2 = m2.mean(axis=0)

    # Compute the mean and covariance of the combined samples.
    mean_combined = (mean_1 + mean_2) / 2
    # The combined covariance is computed but not used by the distance below.
    covariance_combined = (np.cov(m1, rowvar=False) + np.cov(m2, rowvar=False)) / 2

    # Sum of the distances from each sample mean to the combined mean.
    distance = np.linalg.norm(mean_combined - mean_1) + np.linalg.norm(
        mean_combined - mean_2
    )

    return distance


def main():
    # Draw 1000 samples per distribution so the mean and covariance are meaningful
    m1 = np.random.multivariate_normal((0, 0, 0), np.eye(3), 1000)
    m2 = np.random.multivariate_normal((1, 2, 3), np.eye(3), 1000)
    print(m1.shape)
    print(m1)

    # Compute the distance between the two sets of samples.
    emd = earth_movers_distance(m1, m2)

    # Print the distance.
    print(emd)


if __name__ == "__main__":
    main()
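
The two example scripts above only compare sample means. For contrast, here is a rough sketch of a transport-based EMD between two mixtures, using the same supply and demand formulation as `scripts/emd_gmm.py`. It swaps PuLP and GLPK for `scipy.optimize.linprog`, which is an assumption made here to keep the example short and is not what the PR uses. The function name `emd_between_gmms` and the test weights and means are likewise hypothetical.

```python
import numpy as np
from scipy.optimize import linprog


def emd_between_gmms(weights_1, means_1, weights_2, means_2):
    """EMD between two GMMs, using component means as cluster representatives."""
    k1, k2 = len(weights_1), len(weights_2)
    # Pairwise Euclidean distances between component means, shape (k1, k2)
    dist = np.linalg.norm(means_1[:, None, :] - means_2[None, :, :], axis=2)

    # Flow variables f_ij are flattened row-major: index = i * k2 + j
    supply = np.kron(np.eye(k1), np.ones((1, k2)))  # row i sums f_ij over j
    demand = np.kron(np.ones((1, k1)), np.eye(k2))  # row j sums f_ij over i
    # supply @ f <= weights_1 and demand @ f >= weights_2 (negated into <= form)
    a_ub = np.vstack((supply, -demand))
    b_ub = np.concatenate((weights_1, -weights_2))

    result = linprog(dist.ravel(), A_ub=a_ub, b_ub=b_ub, bounds=(0, None))
    if not result.success:
        raise RuntimeError(result.message)
    # Normalize the optimal cost by the total flow
    return result.fun / min(weights_1.sum(), weights_2.sum())


weights = np.array([0.5, 0.5])
means_a = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
means_b = means_a + np.array([0.0, 1.0, 0.0])  # same layout shifted by 1 along y

emd = emd_between_gmms(weights, means_a, weights, means_b)
print(round(emd, 6))
```

In this toy case every component moves a distance of 1 to its shifted counterpart, so the result should come out at about 1.0, and comparing a mixture with itself should give about 0.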