Multi-Camera Multi-Object Tracking System with Live Camera Streams
A real-time multi-camera object detection and tracking system with WebRTC streaming, computer vision integration, and Bird's Eye View (BEV) transformation capabilities. It is designed to be modular and extensible, allowing you to easily add your own trackers and mergers.
- Multi-Camera Support - Process multiple RTSP streams simultaneously
- Detection and Tracking - Defaults to RF-DETR detection and DeepSORT tracking
- WebRTC Streaming - Low-latency browser-based viewing
- Cross-Camera Merging - Track objects across multiple camera views
- Extensible - Plugin system for custom trackers and mergers
- Easy to Use - Simple Python API inspired by Gradio
- Python 3.10+
- Docker (for MediaMTX server)
- Any virtual environment manager (e.g. venv, conda, or pipenv; we recommend uv)
- CUDA-capable GPU (recommended)
- FFmpeg with hardware acceleration support
- RTSP streams or cameras
For a quick start, install from GitHub:
pip install git+https://github.com/playbox-dev/trackstudio.git
For development or testing, clone this repository and install it in development mode (recommended if you want to try it out without a camera setup):
git clone https://github.com/playbox-dev/trackstudio
cd trackstudio
pip install -e .
# or if using uv (recommended)
uv sync --dev
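To verify the install, a quick check from Python is enough; ts.launch is the entry point used throughout this README:

import trackstudio as ts   # should import without errors after installation
help(ts.launch)            # prints the launch() signature and its docstring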
The Thinklet Cube is a small, highly durable, low-power device that can stream video to the MediaMTX server over RTMP via Wi-Fi or 4G LTE. Visit Fairy Devices for more information. For this setup, we use Thinklet Cubes to stream video to the MediaMTX server over RTMP on the local network through a Wi-Fi connection. You will need two or more Thinklet Cubes connected to the same Wi-Fi network, adb to connect to them, and a PC with a GPU. Detailed instructions on installing adb and connecting to the Thinklet Cube can be found in the Thinklet Developer Portal.
# 1. Start MediaMTX server
docker compose up -d mediamtx
# 2. Connect the Thinklet Cube to the PC and run these commands to start streaming video to the MediaMTX server
adb -s <Device ID for camera0 (shown using adb devices)> shell am start \
    -n ai.fd.thinklet.app.squid.run/.MainActivity \
    -a android.intent.action.MAIN \
    -e streamUrl "rtmp://<server IP>:1935" \
    -e streamKey "camera0" \
    --ei longSide 720 \
    --ei shortSide 480 \
    --ei videoBitrate 1024 \
    --ei audioSampleRate 44100 \
    --ei audioBitrate 128 \
    --ez preview false

adb -s <Device ID for camera1 (shown using adb devices)> shell am start \
    -n ai.fd.thinklet.app.squid.run/.MainActivity \
    -a android.intent.action.MAIN \
    -e streamUrl "rtmp://<server IP>:1935" \
    -e streamKey "camera1" \
    --ei longSide 720 \
    --ei shortSide 480 \
    --ei videoBitrate 1024 \
    --ei audioSampleRate 44100 \
    --ei audioBitrate 128 \
    --ez preview false
# 3. Start streaming: press the middle button on each Thinklet Cube.
# 4. Run TrackStudio with test config
trackstudio run -c test_config.json --vision-fps 10
# 5. Open http://localhost:8000 in your browser
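The same setup can also be launched from Python instead of the CLI. The snippet below is only a sketch: it assumes MediaMTX republishes the two RTMP streams as RTSP on port 8554 under the same stream keys (camera0 and camera1), which is what test_config.json expects; adjust the URLs if your MediaMTX configuration differs.

import trackstudio as ts

# Python equivalent of step 4 (stream URLs are assumptions based on the
# MediaMTX defaults used elsewhere in this README)
app = ts.launch(
    rtmp_streams=[
        "rtsp://localhost:8554/camera0",
        "rtsp://localhost:8554/camera1",
    ],
    camera_names=["Camera 0", "Camera 1"],
    server_port=8000,
)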
For testing without real cameras, clone this repository and use FFmpeg to publish RTSP test streams.
This assumes two videos named cam1.mp4 and cam2.mp4 are inside the tests/videos directory (note: the file names start from 1, whereas the camera names in the config file start from 0). You can change the video files in the stream_camera0_ffmpeg.sh and stream_camera1_ffmpeg.sh scripts.
We tested with videos from the Large Scale Multi-Camera Tracking Dataset [1].
# 1. Start MediaMTX server (in the root directory)
docker compose up -d mediamtx
# 2. Create test streams (this reads the video files from the `tests/videos` directory and publishes them as RTSP streams).
./test_mediamtx.sh
# 3. Run TrackStudio with test config (test_config.json is configured with default settings)
trackstudio run -c test_config.json
# 4. Open http://localhost:8000 in your browser
This creates two test streams at:
rtsp://localhost:8554/camera0
rtsp://localhost:8554/camera1
To stop: docker compose down
import trackstudio as ts
# Launch with default settings
app = ts.launch()
# Custom configuration
app = ts.launch(
    rtmp_streams=[
        "rtsp://localhost:8554/camera0",
        "rtsp://localhost:8554/camera1"
    ],
    camera_names=["Camera 0", "Camera 1"],
    tracker="rfdetr",
    server_port=8000,
)
# Start server with default config
trackstudio run
# Start with custom streams
trackstudio run --streams rtsp://localhost:8554/camera0 --streams rtsp://localhost:8554/camera1
# Generate config file
trackstudio config --output my_config.json
# List available components
trackstudio list
TrackStudio can be configured via:
- Python API parameters
- Command line arguments
- JSON configuration files
Example configuration file:
{
  "cameras": {
    "stream_urls": [
      "rtsp://localhost:8554/camera0",
      "rtsp://localhost:8554/camera1"
    ]
  },
  "vision": {
    "tracker_type": "rfdetr",
    "merger_type": "bev_cluster",
    "fps": 10.0
  },
  "server": {
    "host": "0.0.0.0",
    "port": 8000
  }
}
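The same file can also be generated programmatically with plain Python instead of trackstudio config; a minimal sketch using only the keys shown above:

import json

config = {
    "cameras": {
        "stream_urls": [
            "rtsp://localhost:8554/camera0",
            "rtsp://localhost:8554/camera1",
        ]
    },
    "vision": {"tracker_type": "rfdetr", "merger_type": "bev_cluster", "fps": 10.0},
    "server": {"host": "0.0.0.0", "port": 8000},
}

# Write the config, then run: trackstudio run -c my_config.json
with open("my_config.json", "w") as f:
    json.dump(config, f, indent=2)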
Create custom trackers with auto-registration - no need to modify any existing files!
from trackstudio.vision_config import register_tracker_config, BaseTrackerConfig, slider_field
from pydantic import Field

@register_tracker_config("mytracker")  # Auto-registers with the system!
class MyTrackerConfig(BaseTrackerConfig):
    """Configuration for my custom tracker"""
    detection_threshold: float = slider_field(
        0.5, 0.1, 1.0, 0.1,
        "Detection Threshold",
        "How confident detections need to be"
    )
    max_tracks: int = Field(default=50, title="Max Tracks")
from trackstudio.tracker_factory import register_tracker_class
from trackstudio.trackers.base import VisionTracker
import numpy as np

@register_tracker_class("mytracker")  # Auto-registers with the factory!
class MyTracker(VisionTracker):
    def __init__(self, config: MyTrackerConfig):
        super().__init__(config)
        self.config = config

    def detect(self, frame: np.ndarray, camera_id: int):
        # Your detection logic here
        detections = []
        return detections

    def track(self, detections, camera_id: int, timestamp: float, frame=None):
        # Your tracking logic here
        tracks = []
        return tracks

# That's it! Your tracker is now available system-wide
import trackstudio as ts

app = ts.launch(tracker="mytracker")
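Before wiring the tracker into the full app, you can exercise it directly. The snippet below is only a smoke-test sketch: the blank frame, its shape, and the chosen parameter values are placeholders, and it assumes BaseTrackerConfig adds no further required fields beyond those shown above.

import numpy as np

# Quick local smoke test of the custom tracker defined above
config = MyTrackerConfig(detection_threshold=0.7, max_tracks=100)
tracker = MyTracker(config)

frame = np.zeros((480, 720, 3), dtype=np.uint8)  # placeholder frame instead of a real camera image
detections = tracker.detect(frame, camera_id=0)
tracks = tracker.track(detections, camera_id=0, timestamp=0.0, frame=frame)
print(f"{len(detections)} detections, {len(tracks)} tracks")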
An example of a custom tracker can be found in custom_tracker_examples/demo.py.
TrackStudio uses a modular architecture:
- Vision Pipeline: Detection → Tracking → BEV Transform → Cross-Camera Merging
- Streaming: RTMP Input → WebRTC Output with hardware acceleration
- Web UI: React-based interface with real-time visualization
- Plugin System: Register custom trackers and mergers
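As a rough orientation, one vision-pipeline iteration can be pictured as the sketch below; bev_transform and merger are illustrative placeholders rather than TrackStudio internals, and only the detect/track signatures come from the plugin example above.

# Illustrative per-frame flow (placeholder names, not internal APIs)
def process_frame_set(frames, timestamp, tracker, bev_transform, merger):
    """One vision-pipeline iteration across all cameras (conceptual sketch)."""
    per_camera_tracks = []
    for camera_id, frame in enumerate(frames):
        detections = tracker.detect(frame, camera_id)                    # Detection
        tracks = tracker.track(detections, camera_id, timestamp, frame)  # Tracking
        per_camera_tracks.append(bev_transform(tracks, camera_id))       # BEV Transform
    return merger.merge(per_camera_tracks)                               # Cross-Camera Merging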
- Tracking trail visualization
- Remove hardcoded image and canvas sizes in the BEV canvas and in the input camera streams
- Applications with tracking data
- Detection class handling
- Cloud deployment guides
We welcome contributions!
For developers:
- Development Guide - Setup, linting, and code quality tools
- Code Quality: We use Ruff for linting and formatting
- Quick Setup: Run make dev-setup to get started
- Pre-commit Hooks: Automatic code quality checks on commit
[1] Large Scale Multi-Camera Tracking Dataset
[2] RF-DETR
[3] DeepSORT: Simple Online and Realtime Tracking with a Deep Association Metric
TrackStudio is licensed under the Apache License 2.0. See LICENSE for details.
If you use TrackStudio in your research, please cite:
@software{trackstudio,
  title = {TrackStudio: Multi-Camera Vision Tracking System},
  author = {Playbox},
  year = {2025},
  url = {https://github.com/playbox-dev/trackstudio}
}
- Email: support@play-box.ai
- Issues: GitHub Issues
- Documentation (coming soon)
- Thinklet Cube
- PyPI Package
- GitHub Repository
- Example Notebooks (coming soon)