Skip to content

wkelongws/trajectory_clustering_with_cuspatial

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

10 Commits
 
 
 
 
 
 
 
 

Repository files navigation

Trajectory Clustering Example with CUSPATIAL

Note: This work was done with cuspatial v0.10 (coupled with this article) and may not work as is with newer version of cuspatial. We encourage you to check the offical cuspatial repo for updates. This repo can still serve well as a quickstart read for beginners.


CUSPATIAL is a C++ library with python bindings that is based on CUDA and CUDF. CUSPATIAL provides significant GPU acceleration to common spatial and spatio- temporal operations such as point in polygon, distance between trajectories, and trajectory clustering. The speed-up ranges from 10x to 10000x for different operations.

This repo shows an example of accelerating a clustering problem using CUSPATIAL on a real-world vehicle trajectory dataset. On this particular use case, CUSPATIAL brings the end-2-end computation time from around 11 hours down to less than 15 seconds.

Problem Statement - Trajectory Clustering

The trajectory clustering task is grouping a set of trajectories in such a way that trajectories in the same group are more similar to each other than to those in other groups. It is useful for various problems such as motion pattern extraction, behavior analysis and more.

The clustering task consists of two major components in general:

  1. Similarity Metric
  2. Searching Algorithm

Given a specific similarity metric, different clustering algorithm will have different searching mechanism and different complexity. But in some case people may want to precompute all pairwise similarities. One of the many reasons could be people may need to perform multiple iterations of hyperparameter search for the clustering algorithm to get a good result and we don't want to redo the time-consuming similarity computation each time.

In this trajectory clustering example, we work on a particular vehicle trajectory dataset. Below is a quick summary stats of the dataset (More description of this dataset can be found in the following section):

num_trajectories 10070
max_length 961
min_length 30
avg_length 87

We use Hausdorff distance as the similarity metric between trajectories. We compute the pairwise similarity matrix and apply AgglomerativeClustering (AC) and DBSCAN afterwards.

In this example, CUSPATIAL can significantly accelerate the computation of the Hausdorff distance similarity matrix. Comparing to the typical scikit-learn implementation of Hausdorff distance (single CPU thread), CUSPATICAL reduces the computation time from about 5 hours to 13.5 seconds on this particular dataset.

Since we pre-computed the similarity matrix, we can experiment with different clustering algorithm and different hyperparameter sets much easier. In this example we do it in an interactive way using ipython widgets, you may play with the hyperparameter set and see how the clustering result responds. Below shows one example each for AC and DBSCAN:

alt text

alt text

Dataset Used

The real-world trajectory dataset we used here for this example is collected from a traffic intersection located in Dubuque, Iowa. We applied the following steps to extract vehicle trajectories from multiple cameras implemented at this intersection:

  • Detect vehicle locations in camera domain using AI based detectors;
  • Apply tracking algorithm to assign the same id to the same vehicle;
  • Project the vehicle location from camera domain to latitudes and longitudes;
  • Create trajectories based on vehicle id.

A python pickle file (created using python 3.7) of this dataset can be downloaded here.

This dataset is part of a city-wide traffic video database which is developed to support various smart city and smart traffic research tasks (see AI City Challenge for more details).

Install Dependencies

  1. cuSPATIAL: instructions
  2. scikit-image: pip install scikit-image (DO NOT use conda install)
  3. opencv: conda install -c conda-forge opencv
  4. ipywidgets: conda install -c anaconda ipywidgets
  5. scikit-learn: conda install -c anaconda scikit-learn

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published