This is a React and Next.js web application that uses computer vision to detect and track people in real time, with the webcam as the video source. Users can set a minimum confidence threshold to filter the detections, which are displayed with bounding boxes.
To start the application, first install the npm packages:
npm install
Start the development site using:
npm run dev
To build the application, use:
npm run build
This model detects person objects defined in the COCO dataset, which is a large-scale object detection, segmentation, and captioning dataset.
The detected objects will appear in bounding boxes. Tune the minimum score to only show person objects meeting a certain confidence value.
A value of 5 means that a bounding box will appear if the model is at least 5% confident the object is a person.
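The thresholding described above can be sketched as a small filter. This is an illustration only, not the application's actual code: `Detection` is a local type that mirrors the shape of a COCO-SSD prediction, `filterPeople` is a hypothetical helper name, and the sketch assumes the UI value is a percentage while the model reports scores between 0 and 1.

```typescript
// Local type mirroring the shape of a COCO-SSD prediction (assumption for this sketch).
interface Detection {
  bbox: [number, number, number, number]; // [x, y, width, height] in pixels
  class: string; // COCO class name, e.g. "person"
  score: number; // model confidence between 0 and 1
}

// Hypothetical helper: keep only "person" detections at or above the threshold.
// minScorePercent is the UI value, so 5 means "at least 5% confident".
function filterPeople(detections: Detection[], minScorePercent: number): Detection[] {
  const threshold = minScorePercent / 100;
  return detections.filter((d) => d.class === "person" && d.score >= threshold);
}
```

With a threshold of 5, a person detection scored 0.04 is dropped, while one scored 0.5 is kept.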
Built with TypeScript, Next.js, React, and TensorFlow.js.
Created an abstract Base Detector class that can be extended to support other models. It currently has the following implementations:
- Haar Cascade with OpenCV (specifically full face and eyes)
- COCO-SSD with TensorFlow
Creating this class makes it easier to extend to other detectors and re-use the components.
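A minimal sketch of what such a base class could look like is shown below. The names `BaseDetector`, `load`, `detect`, `isReady`, and `StubDetector` are assumptions made for illustration and may not match the class in this repository; the stub stands in for a real model so the example stays self-contained.

```typescript
// Local result type for this sketch (assumption, modelled on COCO-SSD predictions).
interface Detection {
  bbox: [number, number, number, number]; // [x, y, width, height]
  class: string;
  score: number; // 0..1
}

// Hypothetical abstract base: each concrete detector loads its own model
// and turns an input frame into a list of detections.
abstract class BaseDetector<TInput> {
  protected ready = false;

  abstract load(): Promise<void>;
  abstract detect(input: TInput): Promise<Detection[]>;

  isReady(): boolean {
    return this.ready;
  }
}

// Stand-in detector that returns a fixed result instead of running a model.
class StubDetector extends BaseDetector<string> {
  async load(): Promise<void> {
    this.ready = true;
  }

  async detect(_input: string): Promise<Detection[]> {
    if (!this.ready) {
      throw new Error("call load() before detect()");
    }
    return [{ bbox: [0, 0, 100, 200], class: "person", score: 0.9 }];
  }
}
```

Under this design, UI components depend only on the base type, so swapping Haar Cascade for COCO-SSD would not require changes to the rendering code.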
This application was initially built using the Haar Cascade Models with OpenCV.js. The application can be found on the other-models branch of this repository.
The application is responsive; however, it breaks on mobile because the webcam has a different aspect ratio there.
Specifically, the canvas element is not overlaid correctly on the webcam video, so the bounding boxes are drawn in the wrong place.
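One possible way to approach this, offered as a sketch rather than a verified fix: scale each bounding box from the video's intrinsic resolution to the size at which the video is displayed before drawing it. `scaleBox` and `Size` are hypothetical names, and the sketch assumes the video is stretched to fill its element with no cropping or letterboxing.

```typescript
type Box = [number, number, number, number]; // [x, y, width, height]

interface Size {
  width: number;
  height: number;
}

// Hypothetical helper: map a box from source (intrinsic video) coordinates
// to display (canvas) coordinates using independent x and y scale factors.
function scaleBox(box: Box, source: Size, display: Size): Box {
  const scaleX = display.width / source.width;
  const scaleY = display.height / source.height;
  const [x, y, width, height] = box;
  return [x * scaleX, y * scaleY, width * scaleX, height * scaleY];
}
```

In the browser, the source size would typically come from the video element's `videoWidth` and `videoHeight`, and the display size from its `clientWidth` and `clientHeight`.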
This was my first time developing an OpenCV application, especially one in the browser with ReactJS and NextJS.
I initially implemented full-face and eye detection with OpenCV.js and the Haar Cascade models. However, it was slow and laggy, and it often blocked the UI. In addition, there was no way to set the confidence level.
To improve detection with this model, it would be good to run it in a web worker on a separate thread so that the UI stays responsive and interactive.
I learned how to run the pre-trained TensorFlow COCO-SSD model in the browser with help from this repository. A future improvement would be to let users update other parameters of this model, and to create other Detector classes based on the models provided by TensorFlow.js.
Another future improvement would be to use the YOLOv5 model. Take a look at this guide from this repository.
It would be good to show more information in the bounding boxes, such as the type of object detected (e.g. "person") and the confidence level.
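As a rough illustration of that idea, the label text could be built from the class name and score. `formatLabel` is a hypothetical helper, and the sketch assumes the score is a value between 0 and 1, as COCO-SSD reports it.

```typescript
// Hypothetical helper: build a label such as "person 87%" from a class name
// and a confidence score in the range 0..1.
function formatLabel(className: string, score: number): string {
  const percent = Math.round(score * 100);
  return `${className} ${percent}%`;
}
```

The resulting string could then be drawn just above the box with the canvas context's `fillText` method.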
Update the WebcamDetector component so it can work on mobile phones. In addition, update the UI to make it more user-friendly on mobile.
As mentioned, there was a learning curve in figuring out how to work with OpenCV.js in the browser.
In addition, the Coco-SSD detector only allows updating the minimum confidence levels for returning a bounding box of a detected object. Unfortunately, I did not have time to work with other models and play with other settings to adjust the confidence of the models internally.