fea: documentation & examples folder (#34)

artefactory · Nov 30, 2023 · 55a78f8
2 parents 42829d4 + d0a3324
commit 55a78f8
Showing 18 changed files with 753 additions and 31 deletions.
diff --git a/.gitignore b/.gitignore
@@ -138,10 +138,12 @@ secrets/*
 .DS_Store
 
 
-# Data ignore everythin data/detections and data/frames
-data/detections/*
-data/frames/*
+# Ignore everything in data/ and large files
+data/
 *.mp4
-*.txt
+*.pt
 # poetry
 poetry.lock
+
+
+*.png
diff --git a/README.md b/README.md
@@ -23,6 +23,16 @@ This Git repository is dedicated to the development of a Python library aimed at
   - [Documentation](#documentation)
   - [Repository Structure](#repository-structure)
 
+## trackreid + bytetrack VS bytetrack
+
+<p align="center">
+  <img src="https://storage.googleapis.com/track-reid/assets/output_with_reid.gif" width="400"/>
+  <img src="https://storage.googleapis.com/track-reid/assets/output_no_reid.gif" width="400"/>
+</p>
+
+
+
+
 ## Installation
 
 First, install poetry:

diff --git a/bin/download_sample_sequences.sh b/bin/download_sample_sequences.sh
diff --git a/docs/index.md b/docs/index.md
@@ -1,3 +1,21 @@
 # Welcome to the documentation
 
-For more information, make sure to check the [Material for MkDocs documentation](https://squidfunk.github.io/mkdocs-material/getting-started/)
+This repository aims to implement a modular library for correcting tracking results. By tracking, we mean:
+
+- An initial detection algorithm (e.g., YOLO, Fast R-CNN) is first applied to each image of a sequence.
+- A tracking algorithm (e.g., ByteTrack, StrongSORT) is then applied to the detections to assign a unique ID to each distinct object and to maintain that ID throughout the image sequence.
+
+Overall, state-of-the-art (SOTA) tracking algorithms perform well when objects move at constant speed and their detections change little (bounding-box shapes stay roughly constant), conditions that many real use cases do not meet. In practice, we end up with many ID switches and far more unique IDs than there are distinct objects. We therefore propose a library that re-matches IDs on top of an existing tracking result, reassigning object IDs so that each object keeps a single unique ID.
+
+Here is an example of the track-reid library used to correct the tracking results of juggling balls on a short video.
+
+<p align="center">
+  <img src="https://storage.googleapis.com/track-reid/assets/output_no_reid.gif" width="500"/><br>
+  <b>Bytetrack x yolov8l, 42 tracked objects</b>
+</p>
+<p align="center">
+  <img src="https://storage.googleapis.com/track-reid/assets/output_with_reid.gif" width="500"/><br>
+  <b>Bytetrack x yolov8l + track-reid, 4 tracked objects </b>
+</p>
+
+For more insight on how to get started, please refer to [this guide for users](quickstart_user.md), or [this guide for developers](quickstart_dev.md).
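
The problem statement added to `docs/index.md` (ID switches inflating the number of unique IDs) can be illustrated with a toy sketch. This is not the track-reid API: the data, the `id_mapping` dictionary, and both helper functions are hypothetical and exist only to show what "reassigning object IDs" means on a tracking result.

```python
# Toy tracking output: one (frame_index, track_id) pair per tracked detection.
# All values are invented for illustration only.
tracking_results = [
    (0, 1), (0, 2),
    (1, 1), (1, 2),
    (2, 1), (2, 3),  # object 2 was lost and came back as ID 3 (an ID switch)
    (3, 4), (3, 3),  # object 1 was lost and came back as ID 4
]

# Hypothetical result of a re-matching step: new ID -> ID it should be merged into.
id_mapping = {3: 2, 4: 1}


def count_unique_ids(results):
    """Return the number of distinct track IDs in a tracking result."""
    return len({track_id for _, track_id in results})


def apply_id_mapping(results, mapping):
    """Replace every remapped ID by its target and leave other IDs unchanged."""
    return [(frame, mapping.get(track_id, track_id)) for frame, track_id in results]


corrected_results = apply_id_mapping(tracking_results, id_mapping)
print(count_unique_ids(tracking_results))   # 4 IDs for 2 real objects
print(count_unique_ids(corrected_results))  # 2 IDs after re-matching
```

How the mapping is actually computed is the job of the library and is not shown in this diff; the sketch only covers the final relabeling step.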
diff --git a/lib/.gitkeep → examples/norfair/README.md b/lib/.gitkeep → examples/norfair/README.md
diff --git a/examples/norfair/lib/.gitkeep b/examples/norfair/lib/.gitkeep
diff --git a/lib/bbox/utils.py → examples/norfair/lib/bbox/utils.py b/lib/bbox/utils.py → examples/norfair/lib/bbox/utils.py
diff --git a/lib/norfair_helper/utils.py → examples/norfair/lib/norfair_helper/utils.py b/lib/norfair_helper/utils.py → examples/norfair/lib/norfair_helper/utils.py
@@ -2,9 +2,8 @@
 
 import cv2
 import numpy as np
-from norfair import Detection, get_cutout
-
 from lib.bbox.utils import rescale_bbox, xy_center_to_xyxy
+from norfair import Detection, get_cutout
 
 
 def yolo_to_norfair_detection(

diff --git a/lib/norfair_helper/video.py → examples/norfair/lib/norfair_helper/video.py b/lib/norfair_helper/video.py → examples/norfair/lib/norfair_helper/video.py
@@ -1,9 +1,8 @@
 import cv2
 import numpy as np
-from norfair import Tracker, draw_boxes
-
 from lib.norfair_helper.utils import compute_embeddings, yolo_to_norfair_detection
 from lib.sequence import Sequence
+from norfair import Tracker, draw_boxes
 
 
 def generate_tracking_video(

diff --git a/lib/sequence.py → examples/norfair/lib/sequence.py b/lib/sequence.py → examples/norfair/lib/sequence.py
diff --git a/examples/norfair/norfair_starter_kit.ipynb b/examples/norfair/norfair_starter_kit.ipynb
@@ -0,0 +1,284 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# WIP NOT WORKING"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Value proposition of norfair"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Norfair is a customizable lightweight Python library for real-time multi-object tracking.\n",
+    "Using Norfair, you can add tracking capabilities to any detector with just a few lines of code.\n",
+    "\n",
+    "This means you don't need a SOTA tracker: you can use a basic tracker with a Kalman filter and add the custom logic you want."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Imports and setup"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import sys; sys.path.append('.')\n",
+    "import os\n",
+    "\n",
+    "import cv2\n",
+    "from norfair import Tracker, OptimizedKalmanFilterFactory\n",
+    "\n",
+    "from lib.sequence import Sequence\n",
+    "from lib.norfair_helper.video import generate_tracking_video\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "To test this code on your own detections and frames, structure your data as follows:\n",
+    "\n",
+    "```\n",
+    "data/\n",
+    "   ├── detections/\n",
+    "   │   └── sequence_1/\n",
+    "   │       └── detections_1.txt\n",
+    "   └── frames/\n",
+    "       └── sequence_1/\n",
+    "           └── frame_1.jpg\n",
+    "```\n",
+    "\n",
+    "Each detections `.txt` file uses the following format, with box coordinates scaled between 0 and 1:\n",
+    "\n",
+    "```\n",
+    "class_id x_center y_center width height confidence\n",
+    "```\n",
+    "\n",
+    "If this is not the case, you'll need to adapt this code to your data."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "DATA_PATH = \"../data\"\n",
+    "DETECTION_PATH = f\"{DATA_PATH}/detections\"\n",
+    "FRAME_PATH = f\"{DATA_PATH}/frames\"\n",
+    "VIDEO_OUTPUT_PATH = \"private\"\n",
+    "\n",
+    "SEQUENCES = os.listdir(FRAME_PATH)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "def get_sequence_frames(sequence):\n",
+    "    frames = os.listdir(f\"{FRAME_PATH}/{sequence}\")\n",
+    "    frames = [os.path.join(f\"{FRAME_PATH}/{sequence}\", frame) for frame in frames]\n",
+    "    frames.sort()\n",
+    "    return frames\n",
+    "\n",
+    "def get_sequence_detections(sequence):\n",
+    "    detections = os.listdir(f\"{DETECTION_PATH}/{sequence}\")\n",
+    "    detections = [os.path.join(f\"{DETECTION_PATH}/{sequence}\", detection) for detection in detections]\n",
+    "    detections.sort()\n",
+    "    return detections\n",
+    "\n",
+    "frame_path = get_sequence_frames(SEQUENCES[3])\n",
+    "test_sequence = Sequence(frame_path)\n",
+    "test_sequence"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "test_sequence"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Basic Usage of Norfair"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Tracker\n",
+    "\n",
+    "The Norfair `Tracker` is the customizable object that tracks detections.\n",
+    "Norfair expects a distance function that serves as the metric for matching tracked objects to new detections. You can write your own distance function or use one of the built-in ones, such as Euclidean distance or IoU."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Initialize a tracker with the distance function\n",
+    "basic_tracker = Tracker(\n",
+    "    distance_function=\"mean_euclidean\",\n",
+    "    distance_threshold=40,\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Basic tracking"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "video_path = generate_tracking_video(\n",
+    "    sequence=test_sequence,\n",
+    "    tracker=basic_tracker,\n",
+    "    frame_size=(2560, 1440),\n",
+    "    output_path=os.path.join(VIDEO_OUTPUT_PATH, \"basic_tracking.mp4\"),\n",
+    "    add_embedding=False,\n",
+    ")\n",
+    "video_path"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Advanced tracking"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "def always_match(new_object, unmatched_object):\n",
+    "    return 0 # ALWAYS MATCH\n",
+    "\n",
+    "\n",
+    "def embedding_distance(matched_not_init_trackers, unmatched_trackers):\n",
+    "    snd_embedding = unmatched_trackers.last_detection.embedding\n",
+    "\n",
+    "    # Find last non-empty embedding if current is None\n",
+    "    if snd_embedding is None:\n",
+    "        snd_embedding = next((detection.embedding for detection in reversed(unmatched_trackers.past_detections) if detection.embedding is not None), None)\n",
+    "\n",
+    "    if snd_embedding is None:\n",
+    "        return 1 # No match if no embedding is found\n",
+    "\n",
+    "    # Iterate over past detections and calculate distance\n",
+    "    for detection_fst in matched_not_init_trackers.past_detections:\n",
+    "        if detection_fst.embedding is not None:\n",
+    "            distance = 1 - cv2.compareHist(snd_embedding, detection_fst.embedding, cv2.HISTCMP_CORREL)\n",
+    "            # If the embeddings are even slightly similar, return the distance to the tracker\n",
+    "            if distance < 0.9:\n",
+    "                return distance\n",
+    "\n",
+    "    return 1 # No match if no matching embedding is found between the 2"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "advanced_tracker = Tracker(\n",
+    "    distance_function=\"sqeuclidean\",\n",
+    "    filter_factory = OptimizedKalmanFilterFactory(R=5, Q=0.05),\n",
+    "    distance_threshold=350, # Higher value means objects further away will be matched\n",
+    "    initialization_delay=12, # Wait 12 frames before an object starts to be tracked\n",
+    "    hit_counter_max=15, # Inertia: higher values mean an object takes longer to enter the reid phase\n",
+    "    reid_distance_function=embedding_distance, # Distance function used for re-identification\n",
+    "    reid_distance_threshold=0.9, # Objects are matched when the distance is below this threshold\n",
+    "    reid_hit_counter_max=200, # Higher values mean an object stays in the reid phase longer\n",
+    "    )"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "video_path = generate_tracking_video(\n",
+    "    sequence=test_sequence,\n",
+    "    tracker=advanced_tracker,\n",
+    "    frame_size=(2560, 1440),\n",
+    "    output_path=os.path.join(VIDEO_OUTPUT_PATH, \"advance_tracking.mp4\"),\n",
+    "    add_embedding=True,\n",
+    ")\n",
+    "video_path"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "advanced_tracker.total_object_count"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "track-reid",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.10.13"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}
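
The notebook above describes detection files whose lines follow `class_id x_center y_center width height confidence`, with coordinates scaled between 0 and 1. A minimal sketch of reading one such line and converting it to pixel corner coordinates could look like the following. The repository ships its own helpers (`xy_center_to_xyxy`, `rescale_bbox` in `lib/bbox/utils.py`), but their signatures are not visible in this diff, so the two functions below are hypothetical stand-ins and not the project's implementation.

```python
def parse_detection_line(line):
    """Parse one 'class_id x_center y_center width height confidence' line."""
    class_id, x_center, y_center, width, height, confidence = line.split()
    return {
        "class_id": int(class_id),
        # Normalized box in center format, every value between 0 and 1.
        "bbox": (float(x_center), float(y_center), float(width), float(height)),
        "confidence": float(confidence),
    }


def center_to_pixel_corners(bbox, frame_width, frame_height):
    """Convert a normalized (x_center, y_center, width, height) box
    to pixel (x1, y1, x2, y2) corners."""
    x_center, y_center, width, height = bbox
    x1 = (x_center - width / 2) * frame_width
    y1 = (y_center - height / 2) * frame_height
    x2 = (x_center + width / 2) * frame_width
    y2 = (y_center + height / 2) * frame_height
    return (x1, y1, x2, y2)


# Example line (invented values) on a 2560x1440 frame, the size used in the notebook.
detection = parse_detection_line("0 0.5 0.5 0.25 0.5 0.9")
print(center_to_pixel_corners(detection["bbox"], 2560, 1440))
# (960.0, 360.0, 1600.0, 1080.0)
```

If your files use another column order or pixel coordinates, only `parse_detection_line` needs to change.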
diff --git a/examples/norfair/requirements.txt b/examples/norfair/requirements.txt
@@ -0,0 +1 @@
+norfair
diff --git a/examples/trackreid/data/.gitkeep b/examples/trackreid/data/.gitkeep
diff --git a/examples/trackreid/frames/.gitkeep b/examples/trackreid/frames/.gitkeep
diff --git a/examples/trackreid/requirements.txt b/examples/trackreid/requirements.txt
@@ -0,0 +1,5 @@
+git+https://github.com/artefactory-fr/bytetrack.git@main
+git+https://github.com/artefactory-fr/track-reid.git@main
+opencv-python==4.8.1.78
+ultralytics==8.0.216
+matplotlib==3.8.2