GMM change detection (#100)

* anomaly/gmm: Add GMM-based change detection algorithm
  Signed-off-by: Jamie Santos <jamiesanto@gmail.com>
* caching: add directory for saving files
  Signed-off-by: Jamie Santos <jamiesanto@gmail.com>
* emd: update the PuLP solver to glpk for compatibility with ARM architecture
  Signed-off-by: Jamie Santos <jamiesanto@gmail.com>
* gmm_change_detection: remove redundant packages
  Signed-off-by: Jamie Santos <jamiesanto@gmail.com>
* caching: streamline file handling
  Signed-off-by: Jamie Santos <jamiesanto@gmail.com>
* caching: add predictions for points for complete model
  With this commit, the caching is now in a functional state.
  Signed-off-by: Jamie Santos <jamiesanto@gmail.com>
* Remove extraneous code from previous clustering algorithm
  Signed-off-by: Jamie Santos <jamiesanto@gmail.com>
* gmm: remove extra model fitting
  Signed-off-by: Jamie Santos <jamiesanto@gmail.com>
* caching: remove files for GMM cache debugging
  Signed-off-by: Jamie Santos <jamiesanto@gmail.com>
* passing python code through linters
* Add object disappearance detection
  Signed-off-by: Jamie Santos <jamiesanto@gmail.com>
* wip: preprocess: only save images once point cloud is obtained
  Signed-off-by: Jamie Santos <jamiesanto@gmail.com>
* artificial_data: randomly generate point clouds with specified changes
  - Can now get random point clouds with specified start, appearances, and disappearances
  Signed-off-by: Jamie Santos <jamiesanto@gmail.com>
* organize scripts into package structure
* adding some headers and converting file from dos to unix
* remove duplicate commented code that is in artificial_data.py
* running on ubuntu 20 the example fake data
* first pass at refactoring the code to simplify it
* removing test data since it can be generated easily
* starting jupyter notebook with basic features and no plot customization
* tested all modes
* merge ground truth with pre-process data

---------

Signed-off-by: Jamie Santos <jamiesanto@gmail.com>
Co-authored-by: Jamie Santos <jamiesanto@gmail.com>
nasa · Jul 19, 2023 · 66aa2f0
1 parent 4c9c3ff
commit 66aa2f0
Showing 15 changed files with 1,971 additions and 0 deletions.
diff --git a/analyst/workspace/gmm-change-detection.ipynb b/analyst/workspace/gmm-change-detection.ipynb
diff --git a/anomaly/gmm-change-detection/CMakeLists.txt b/anomaly/gmm-change-detection/CMakeLists.txt
@@ -0,0 +1,33 @@
+# Copyright (c) 2021, United States Government, as represented by the
+# Administrator of the National Aeronautics and Space Administration.
+#
+# All rights reserved.
+#
+# The "ISAAC - Integrated System for Autonomous and Adaptive Caretaking
+# platform" software is licensed under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with the
+# License. You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+# License for the specific language governing permissions and limitations
+# under the License.
+
+cmake_minimum_required(VERSION 3.0 FATAL_ERROR)
+project(gmm)
+
+## Compile as C++14, supported in ROS Kinetic and newer
+add_compile_options(-std=c++14)
+
+
+## Find catkin macros and libraries
+find_package(catkin REQUIRED COMPONENTS)
+
+# Allow other packages to use python scripts from this package
+catkin_python_setup()
+
+catkin_package()
+
diff --git a/anomaly/gmm-change-detection/README.md b/anomaly/gmm-change-detection/README.md
@@ -0,0 +1,21 @@
+\page gmm GMM Change Detection
+
+# Overview
+
+This implementation of a GMM-based anomaly detection algorithm was created by Jamie Santos as part of a [Master's thesis]().
+The algorithm detects changes in environments such as the ISS using 3D depth point cloud data.
+
+# Requirements
+
+	pip3 install pulp
+	pip3 install scikit-learn
+	pip3 install pyntcloud
+	pip3 install pandas
+	pip3 install open3d
+	apt-get install glpk-utils
+	apt-get install ros-noetic-ros-numpy
+
+# Usage
+
+	rosrun gmm gmm_change_detection.py
+
diff --git a/anomaly/gmm-change-detection/package.xml b/anomaly/gmm-change-detection/package.xml
@@ -0,0 +1,21 @@
+<?xml version="1.0"?>
+<package format="2">
+  <name>gmm</name>
+  <version>0.0.0</version>
+  <description>GMM Change Detection package</description>
+  <license>
+    Apache License, Version 2.0
+  </license>
+  <author email="astrobee-fsw@nx.arc.nasa.gov">
+    ISAAC Flight Software
+  </author>
+  <maintainer email="astrobee-fsw@nx.arc.nasa.gov">
+    ISAAC Flight Software
+  </maintainer>
+
+  <buildtool_depend>catkin</buildtool_depend>
+  <build_depend>roscpp</build_depend>
+  <build_depend>cv_bridge</build_depend>
+  <build_export_depend>roscpp</build_export_depend>
+  <exec_depend>roscpp</exec_depend>
+</package>
diff --git a/anomaly/gmm-change-detection/scripts/gmm/__init__.py b/anomaly/gmm-change-detection/scripts/gmm/__init__.py
diff --git a/anomaly/gmm-change-detection/scripts/gmm/artificial_data.py b/anomaly/gmm-change-detection/scripts/gmm/artificial_data.py
@@ -0,0 +1,68 @@
+#!/usr/bin/env python
+# Copyright (c) 2017, United States Government, as represented by the
+# Administrator of the National Aeronautics and Space Administration.
+#
+# All rights reserved.
+#
+# The "ISAAC - Integrated System for Autonomous and Adaptive Caretaking
+# platform" software is licensed under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with the
+# License. You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+# License for the specific language governing permissions and limitations
+# under the License.
+
+import copy
+
+import numpy as np
+
+
+def generate_data(n_start, n_disappearances, n_appearances):
+    N = 1000  # N points in each cluster
+    point_set_1 = []
+    point_set_2 = []
+
+    # Define first set of points
+    means_1 = np.random.uniform(-2, 2, (n_start, 3)).round(2)
+    covs_1 = np.zeros(shape=(n_start, 3, 3))
+    for i in range(n_start):
+        covs_1[i] = np.diag(np.random.uniform(0.0, 0.1, (1, 3))[0].round(2))
+
+    # Remove old clusters in second set of points
+    means_2 = copy.deepcopy(means_1)
+    covs_2 = copy.deepcopy(covs_1)
+
+    for i in range(n_disappearances):
+        means_2 = np.delete(means_2, 0, 0)
+        covs_2 = np.delete(covs_2, 0, 0)
+
+    # Add new clusters in second set of points
+    means_appearances = np.random.uniform(-2, 2, (n_appearances, 3)).round(2)
+    covs_appearances = np.zeros(shape=(n_appearances, 3, 3))
+    for i in range(n_appearances):
+        covs_appearances[i] = np.diag(np.random.uniform(0.0, 0.1, (1, 3))[0].round(2))
+    means_2 = np.vstack((means_2, means_appearances))
+    covs_2 = np.vstack((covs_2, covs_appearances))
+
+    # Concatenate clusters into point clouds
+    for i in range(len(means_1)):
+        x = np.random.multivariate_normal(means_1[i], covs_1[i], N)
+        point_set_1.append(x)
+    points_1 = np.concatenate(point_set_1)
+
+    for i in range(len(means_2)):
+        x = np.random.multivariate_normal(means_2[i], covs_2[i], N)
+        point_set_2.append(x)
+    points_2 = np.concatenate(point_set_2)
+
+    return points_1, points_2
+
+
+if __name__ == "__main__":
+    points_1, points_2 = generate_data(5, 3, 9)
+    print(points_1.shape, points_2.shape)
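
The generator above builds a "before" cloud from `n_start` Gaussian clusters and an "after" cloud that drops the first `n_disappearances` clusters and appends `n_appearances` new ones. Below is a minimal standard-library sketch of the same idea, not the committed code. The names `generate_clouds`, `points_per_cluster`, and `seed` are hypothetical, and the input check is an added assumption. It relies on the covariances being diagonal, so each axis can be sampled as an independent Gaussian.

```python
import math
import random


def generate_clouds(n_start, n_disappearances, n_appearances,
                    points_per_cluster=1000, seed=None):
    """Stdlib-only sketch of the before/after point-cloud generator.

    Returns two lists of (x, y, z) tuples. Hypothetical helper, not part
    of the gmm package.
    """
    # Assumption: reject requests to remove more clusters than exist
    if not 0 <= n_disappearances <= n_start:
        raise ValueError("n_disappearances must be between 0 and n_start")
    rng = random.Random(seed)

    def random_cluster():
        # Mean in [-2, 2]^3, per-axis variance in [0, 0.1], rounded to 2 places
        mean = [round(rng.uniform(-2, 2), 2) for _ in range(3)]
        variance = [round(rng.uniform(0.0, 0.1), 2) for _ in range(3)]
        return mean, variance

    def sample(clusters):
        # A diagonal covariance means each axis is an independent Gaussian
        return [
            tuple(rng.gauss(m, math.sqrt(v)) for m, v in zip(mean, variance))
            for mean, variance in clusters
            for _ in range(points_per_cluster)
        ]

    clusters_1 = [random_cluster() for _ in range(n_start)]
    # Drop the first n_disappearances clusters, then append the new ones
    clusters_2 = clusters_1[n_disappearances:] + [
        random_cluster() for _ in range(n_appearances)
    ]
    return sample(clusters_1), sample(clusters_2)


points_1, points_2 = generate_clouds(5, 3, 9, seed=0)
print(len(points_1), len(points_2))  # 5000 11000
```

With the arguments used in the `__main__` block, the first cloud has 5 clusters and the second has 5 - 3 + 9 = 11, which gives 5000 and 11000 points respectively.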
diff --git a/anomaly/gmm-change-detection/scripts/gmm/emd_gmm.py b/anomaly/gmm-change-detection/scripts/gmm/emd_gmm.py
@@ -0,0 +1,103 @@
+#!/usr/bin/env python
+# Copyright (c) 2017, United States Government, as represented by the
+# Administrator of the National Aeronautics and Space Administration.
+#
+# All rights reserved.
+#
+# The "ISAAC - Integrated System for Autonomous and Adaptive Caretaking
+# platform" software is licensed under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with the
+# License. You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+# License for the specific language governing permissions and limitations
+# under the License.
+
+# https://towardsdatascience.com/linear-programming-using-python-priyansh-22b5ee888fe0
+import numpy as np
+from pulp import GLPK_CMD, LpMinimize, LpProblem, LpStatus, LpVariable, lpSum
+
+
+class EMDGMM:
+    def __init__(self, gmm1_weights, gmm2_weights):
+        self.warehouse_supply = gmm1_weights  # Supply Matrix
+        self.cust_demands = gmm2_weights  # Demand Matrix
+        self.n_warehouses = gmm1_weights.size
+        self.n_customers = gmm2_weights.size
+        self.weight_sum1 = np.sum(self.warehouse_supply)
+        self.weight_sum2 = np.sum(self.cust_demands)
+        self.distances = None
+        self.emd = None
+
+    def get_distance(self, means1, means2):
+        """Compute the Euclidean distance between every pair of cluster means
+        of GMM1 and GMM2 and store the K1 x K2 matrix in self.distances."""
+
+        distances = np.zeros((means1.shape[0], means2.shape[0]))
+        for i, row1 in enumerate(means1):
+            for j, row2 in enumerate(means2):
+                distances[i][j] = np.linalg.norm(row1 - row2)
+        self.distances = distances
+
+    def calculate_emd(self):
+        """Optimize the cost-distance (weight-distance) flow between the
+        two GMMs and use the optimized distance as the EMD distance metric."""
+
+        # Cost Matrix
+        cost_matrix = self.distances
+
+        # Initialize Model
+        model = LpProblem("Supply-Demand-Problem", LpMinimize)
+
+        # Define Variable Names
+        variable_names = [
+            str(i) + "_" + str(j)
+            for j in range(1, self.n_customers + 1)
+            for i in range(1, self.n_warehouses + 1)
+        ]
+        variable_names.sort()
+
+        # Decision Variables
+        DV_variables = LpVariable.matrix(
+            "X", variable_names, cat="Continuous", lowBound=0
+        )
+        allocation = np.array(DV_variables).reshape(self.n_warehouses, self.n_customers)
+
+        # Objective Function
+        obj_func = lpSum(allocation * cost_matrix)
+        model += obj_func
+
+        # Constraints
+        for i in range(self.n_warehouses):
+            # print(lpSum(allocation[i][j] for j in range(self.n_customers)) <= warehouse_supply[i])
+            model += lpSum(
+                allocation[i][j] for j in range(self.n_customers)
+            ) <= self.warehouse_supply[i], "Supply Constraints " + str(i)
+
+        for j in range(self.n_customers):
+            # print(lpSum(allocation[i][j] for i in range(self.n_warehouses)) >= cust_demands[j])
+            model += lpSum(
+                allocation[i][j] for i in range(self.n_warehouses)
+            ) >= self.cust_demands[j], "Demand Constraints " + str(j)
+
+        model.solve(GLPK_CMD(msg=0))
+        if LpStatus[model.status] != "Optimal":
+            raise RuntimeError("EMD linear program was not solved to optimality")
+
+        # print("Total Cost:", model.objective.value())
+        # for v in model.variables():
+        #    try:
+        #        print(v.name, "=", v.value())
+        #    except:
+        #        print("error couldn't find value")
+
+        # for i in range(self.n_warehouses):
+        #    print("Warehouse ", str(i+1))
+        #    print(lpSum(allocation[i][j].value() for j in range(self.n_customers)))
+
+        total_flow = min(self.weight_sum1, self.weight_sum2)
+        self.emd = model.objective.value() / total_flow
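
As a hand-checkable illustration of what `get_distance` and `calculate_emd` compute, here is a standard-library sketch for the smallest interesting case: two mixtures with two components each and equal total weight. It does not use PuLP or GLPK. Instead it swaps the LP solver for a closed-form check, because a balanced 2x2 transport problem has a single free flow variable and a cost that is linear in it, so the minimum lies at an end of the feasible interval. The names `distance_matrix` and `emd_2x2` are hypothetical and not part of the package.

```python
import math


def distance_matrix(means1, means2):
    """Euclidean distance between every pair of cluster means (K1 x K2)."""
    return [[math.dist(a, b) for b in means2] for a in means1]


def emd_2x2(weights1, weights2, cost):
    """EMD for two 2-component mixtures whose weights have equal sums.

    With balanced totals, every supply and demand constraint is tight, so
    the flow is fixed by one variable t = flow[0][0].
    """
    s1, s2 = weights1
    d1, d2 = weights2
    if not math.isclose(s1 + s2, d1 + d2):
        raise ValueError("this sketch assumes equal total weight")

    def total_cost(t):
        # flow = [[t, s1 - t], [d1 - t, s2 - d1 + t]]
        return (cost[0][0] * t + cost[0][1] * (s1 - t)
                + cost[1][0] * (d1 - t) + cost[1][1] * (s2 - d1 + t))

    # Interval of t that keeps all four flows non-negative
    low, high = max(0.0, d1 - s2), min(s1, d1)
    # Linear cost: the minimum is at one of the two endpoints
    best = min(total_cost(low), total_cost(high))
    # Normalize by the total flow, as calculate_emd does
    return best / min(s1 + s2, d1 + d2)


means = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)]
cost = distance_matrix(means, means)
emd = emd_2x2([0.5, 0.5], [0.25, 0.75], cost)
print(emd)  # 1.25
```

Both mixtures share the same two means, 5 units apart, so the cheapest plan moves 0.25 of the weight across that distance, giving 0.25 * 5 = 1.25. For larger mixtures the endpoint argument no longer applies and a general LP solver, such as the one used in the class above, is needed.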