artefactory · nmathieufact · Mar 14, 2024 · Mar 12, 2024 · Mar 14, 2024 · Mar 14, 2024
diff --git a/.gitignore b/.gitignore
@@ -129,4 +129,13 @@ dmypy.json
 .pyre/
 
 # Poetry
-poetry.lock
+poetry.lock
+
+# model
+*.pt
+
+# images
+*.png
+
+# videos
+*.mp4
diff --git a/README.md b/README.md
@@ -1,36 +1,37 @@
-<div align="center">
-<h2>
-  ByteTrack-Pip: Packaged version of the ByteTrack repository
-</h2>
+# ByteTrack starter guide
+
+This repo is a packaged version of the [ByteTrack](https://github.com/ifzhang/ByteTrack) algorithm.
+
+ByteTrack is a multi-object tracking algorithm. It assigns a persistent ID to each unique object detected in a video, so that the object can be followed from frame to frame.
+
 <h4>
-    <img width="700" alt="teaser" src="assets/demo.gif">
+    <img width="700" alt="teaser" src="assets/traffic.gif">
 </h4>
-<div>
-    <a href="https://pepy.tech/project/bytetracker"><img src="https://pepy.tech/badge/bytetracker" alt="downloads"></a>
-    <a href="https://badge.fury.io/py/bytetracker"><img src="https://badge.fury.io/py/bytetracker.svg" alt="pypi version"></a>
-</div>
-</div>
 
-## <div align="center">Overview</div>
-
-This repo is a packaged version of the [ByteTrack](https://github.com/ifzhang/ByteTrack) algorithm.
 ### Installation
 ```
-pip install bytetracker
+pip install git+https://github.com/artefactory-fr/bytetrack.git@main
 ```
 
-### Detection Model + ByteTrack
-```python
+### Object detection and tracking with BYTETracker and YOLO
+```
 from bytetracker import BYTETracker
-
 tracker = BYTETracker(args)
-for image in images:
-   dets = detector(image)
-   online_targets = tracker.update(dets)
+for frame_id, image_filename in enumerate(frames):
+    img = cv2.imread(image_filename)
+    detections = your_model.predict(img)
+    tracked_objects = tracker.update(detections, frame_id)
 ```
-### Reference:
- - [Yolov5-Pip](https://github.com/fcakyon/yolov5-pip)
- - [ByteTrack](https://github.com/ifzhang/ByteTrack)
+
+
+## Copyright
+
+Copyright (c) 2022 Kadir Nar
+
+## ByteTrack License
+
+ByteTrack is licensed under the MIT License. See the [LICENSE](LICENSE) file and the [ByteTrack repository](https://github.com/ifzhang/ByteTrack) for more information.
+
 
 ### Citation
 ```bibtex

diff --git a/assets/traffic.gif b/assets/traffic.gif
diff --git a/examples/requirements.txt b/examples/requirements.txt
@@ -0,0 +1,4 @@
+git+https://github.com/artefactory-fr/bytetrack.git@main
+opencv-python==4.8.1.78
+ultralytics==8.0.216
+matplotlib==3.8.2
diff --git a/examples/test_bytetrack_car.ipynb b/examples/test_bytetrack_car.ipynb
@@ -0,0 +1,294 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Executive Summary: Car Tracking with ByteTrack - An Introductory Guide\n",
+    "\n",
+    "This guide is a beginner-friendly introduction to tracking cars in video footage with ByteTrack. ByteTrack is a multi-object tracking algorithm: it takes the bounding boxes produced by an object detector, here a YOLO (You Only Look Once) model, and associates them across video frames.\n",
+    "\n",
+    "For more information on YOLO and ultralytics, visit [this link](https://github.com/ultralytics/ultralytics).\n",
+    "\n",
+    "For more information on ByteTrack, visit [this link](https://github.com/ifzhang/ByteTrack).\n",
+    "\n",
+    "1. **Frame Extraction**: \n",
+    "   The video is decomposed into frames, turning continuous footage into discrete snapshots for analysis.\n",
+    "\n",
+    "2. **Detection and tracking**: \n",
+    "   We initialize the BYTETracker object and load the pre-trained YOLO model with its parameters. The YOLO model detects objects in every frame of the video. The ByteTrack algorithm then takes the resulting bounding boxes and assigns each one an ID, which makes it possible to follow the object's movement across frames.\n",
+    "\n",
+    "3. **Visualization of Tracking**: \n",
+    "   The frames, with the tracked bounding boxes drawn on them, are reassembled into an MP4 video written to the same folder as the input video.\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "%load_ext autoreload\n",
+    "%autoreload 2\n",
+    "import glob\n",
+    "import matplotlib.pyplot as plt\n",
+    "import cv2\n",
+    "import numpy as np\n",
+    "import pandas as pd\n",
+    "\n",
+    "# YOLO and video packages \n",
+    "from ultralytics import YOLO\n",
+    "from bytetracker import BYTETracker\n",
+    "from bytetracker.basetrack import BaseTrack\n",
+    "from utils import draw_all_bbox_on_image, yolo_results_to_bytetrack_format, scale_bbox_as_xyxy\n",
+    "from IPython.display import Video"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Download the video\n",
+    "VIDEO_PATH = 'videos/traffic.mp4'\n",
+    "!if [ ! -f $VIDEO_PATH ]; then mkdir -p videos && wget https://storage.googleapis.com/bytetrack-data-public/traffic.mp4 -O $VIDEO_PATH; fi"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### Reading video"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "Video(VIDEO_PATH, width=800,embed=True)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### 1. Frame Extraction "
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# This cell only needs to be run once:\n",
+    "# split the video at VIDEO_PATH into PNG frames (12 per second) stored under frames/\n",
+    "!mkdir -p frames && ffmpeg -i $VIDEO_PATH -vf fps=12 frames/%d.png -hide_banner -loglevel panic"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# - list the PNG frames in the 'frames' directory and order them numerically for subsequent processing\n",
+    "# - use glob to find all PNG files, then sort them by the numeric part of their filenames to avoid lexicographic ordering issues"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "available_frames = glob.glob(\"frames/*.png\")\n",
+    "available_frames = sorted(available_frames, key=lambda x: int(x.split(\"/\")[-1].split(\".\")[0]))"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "%matplotlib inline\n",
+    "\n",
+    "MODEL_WEIGHTS = \"yolov8m.pt\"\n",
+    "\n",
+    "model = YOLO(MODEL_WEIGHTS)\n",
+    "results = model(available_frames[0])[0]\n",
+    "\n",
+    "plt.imshow(cv2.cvtColor(results.plot(), cv2.COLOR_BGR2RGB))\n",
+    "plt.show()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### Classes for prediction, indicating which object to detect\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Track only cars (class ID 2 in the COCO label set used by the pre-trained YOLO weights)\n",
+    "CAR_CLASS_ID = 2"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "\n",
+    "   #### BYTETracker Parameters\n",
+    "   - `track_thresh`: Confidence threshold above which a detection is treated as a high-confidence candidate for tracking.\n",
+    "   - `track_buffer`: Number of frames to keep a lost track before discarding it.\n",
+    "   - `match_thresh`: Threshold used when matching new detections to existing tracks from one frame to the next.\n",
+    "   - `frame_rate`: Frame rate of the video or sequence being processed."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "tracker = BYTETracker(track_thresh=0.15, track_buffer=3, match_thresh=0.85, frame_rate=12)\n",
+    "BaseTrack._count = 0  # reset the shared track ID counter so IDs restart when this cell is re-run"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "model = YOLO(MODEL_WEIGHTS, task=\"detect\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### 2. Detection and tracking"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "all_tracked_objects = []\n",
+    "for frame_id, image_filename in enumerate(available_frames):\n",
+    "    img = cv2.imread(image_filename)\n",
+    "    detections = model.predict(img, classes=[CAR_CLASS_ID], conf=0.15, verbose=False)[0]\n",
+    "    detections_bytetrack_format = yolo_results_to_bytetrack_format(detections)\n",
+    "    tracked_objects = tracker.update(detections_bytetrack_format, frame_id)\n",
+    "    if len(tracked_objects) > 0:\n",
+    "        tracked_objects = np.insert(tracked_objects, 0, frame_id, axis=1)\n",
+    "        all_tracked_objects.append(tracked_objects)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### Scaling the bounding boxes to match the original image size"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "df_tracked = pd.DataFrame(np.concatenate(all_tracked_objects), columns=[\"frame_id\", \"x1\", \"y1\", \"x2\", \"y2\", \"track_id\", \"class\", \"confidence\"])\n",
+    "df_tracked[[\"x1\", \"y1\", \"x2\", \"y2\"]] = df_tracked[[\"x1\", \"y1\", \"x2\", \"y2\"]].apply(\n",
+    "    lambda x: scale_bbox_as_xyxy(x[0:4], detections.orig_shape), axis=1, result_type=\"expand\"\n",
+    "    )\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### 3. Visualization of Tracking"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "fourcc = cv2.VideoWriter_fourcc(*'H264')\n",
+    "OUTPUT_WITH_BBOX = \"videos/traffic_tracked.mp4\"\n",
+    "out = cv2.VideoWriter(OUTPUT_WITH_BBOX, fourcc, 12, (1280, 720))  # 12 fps matches the extraction rate; (1280, 720) must be the (width, height) of the frames\n",
+    "for frame_id, image_filename in enumerate(available_frames):\n",
+    "    image = cv2.imread(image_filename)\n",
+    "    if frame_id in df_tracked.frame_id.astype('int').values:\n",
+    "        df_current_frame = df_tracked[df_tracked.frame_id == frame_id][[\"x1\", \"y1\", \"x2\", \"y2\", \"track_id\", \"class\", \"confidence\"]].to_numpy()\n",
+    "        image = draw_all_bbox_on_image(image, df_current_frame)\n",
+    "    out.write(image)\n",
+    "out.release()\n",
+    "print(\"Video with bounding boxes saved at:\", OUTPUT_WITH_BBOX)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "print(\"Number of tracked objects:\", len(df_tracked.track_id.unique()))"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "video_path = \"videos/traffic_tracked.mp4\"\n",
+    "display(Video(video_path, embed=True, width=800))"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "3.8",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.10.13"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}
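
Review note on the notebook: two of its steps, ordering the frames numerically and scaling bounding boxes back to the original image size, can be illustrated in isolation. The sketch below is not the repository's `utils` module, which is not part of this diff. The function names `sort_frames_numerically` and `scale_normalized_xyxy` are hypothetical, and the scaling helper assumes that the boxes are normalized to the [0, 1] range and that `orig_shape` is `(height, width)`. Both assumptions should be checked against the actual `scale_bbox_as_xyxy` helper.

```python
from pathlib import Path

import numpy as np


def sort_frames_numerically(paths):
    """Sort frame paths such as frames/10.png by the integer in the file name.

    Path(...).stem is used instead of splitting on "/" so the key does not
    depend on the path separator of the operating system.
    """
    return sorted(paths, key=lambda p: int(Path(p).stem))


def scale_normalized_xyxy(bbox, orig_shape):
    """Scale a normalized (x1, y1, x2, y2) box to pixel coordinates.

    ASSUMPTION: bbox values are in [0, 1] and orig_shape is (height, width).
    This is a hypothetical stand-in, not the repository's scale_bbox_as_xyxy.
    """
    height, width = orig_shape
    x1, y1, x2, y2 = bbox
    return np.array([x1 * width, y1 * height, x2 * width, y2 * height])


# Lexicographic sorting would place frames/10.png before frames/2.png.
frames = ["frames/10.png", "frames/2.png", "frames/1.png"]
ordered = sort_frames_numerically(frames)

# A box covering the central half of the width and the lower half of a 1280x720 frame.
pixels = scale_normalized_xyxy([0.25, 0.5, 0.75, 1.0], (720, 1280))
```

Keeping both helpers free of any dependency on `ultralytics` or `cv2` makes them easy to unit test separately from the detection loop.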