**This work was done as part of the course "Algorithms of Digital Multimedia Processing".** **Author: Abuykov Z.M.**
- Description:
- Introduction to OpenCV
- Tracking a red object
- Image blurring (GaussBlur)
- Canny algorithm
- Motion detection on video
- Neural network training
- Haar cascades (workKhaara)
- Text recognition (opticalRecognationText)
- Contours (several different operators) (may be of interest to HR)
- Tracking (MIL, CSRT, KCF) and histogram-based color tracking (may be of interest to HR)
- Hand detectors (Detectros) (may be of interest to HR)
This project is a first step into machine learning and should be especially useful for students of the subject. Only the last three works are likely to be of interest to HR.
- Python 3.10
- MATLAB 2023b with extra packages
- Image Labeler
- Webcam
- resnet-50
- SSD
- JS (for blur)
- C++ (for canny)
This folder contains an introduction to OpenCV. Work done:
- Show an image and test 3 different flags for the image and the window.
- Show video (file, webcam), record video, draw a red cross in the center of the frame and apply blur inside the cross.
- Fill the cross with one of the 3 colors (RGB) using the following rule, based on the RGB format:
determine which of the colors red, green or blue the central pixel is closest to, and fill the cross with that color.
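
The fill rule can be sketched as a nearest-color lookup. This is only an illustration: the function name `closest_primary` and the use of squared Euclidean distance are assumptions, and OpenCV frames are stored as BGR, so a pixel would need reordering before being passed in.

```python
# Reference colors in RGB order (assumption: OpenCV pixels are converted from BGR first).
PRIMARIES = {
    "red": (255, 0, 0),
    "green": (0, 255, 0),
    "blue": (0, 0, 255),
}

def closest_primary(pixel_rgb):
    """Return the name of the primary color nearest to pixel_rgb.

    Distance is the squared Euclidean distance in RGB space (an assumed metric).
    """
    def dist2(ref):
        return sum((int(p) - r) ** 2 for p, r in zip(pixel_rgb, ref))

    return min(PRIMARIES, key=lambda name: dist2(PRIMARIES[name]))

print(closest_primary((200, 30, 40)))
```

The returned name can then be mapped to the fill color of the cross.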
Work done:
- Filter the image with the inRange command, leaving only the red part.
- Apply morphological transformations (opening and closing) to the filtered image.
- Find the first-order moments of the resulting image and the area of the object.
- Based on the area of the object, find its center and draw a black rectangle around the object.
- Display the resulting black rectangle on the video, redrawing it on every new frame.
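
The moment-based part of these steps can be sketched with NumPy alone. This is not the project's code: the thresholds in `red_mask` are hypothetical, the mask is built in RGB rather than with OpenCV's inRange, and the morphological step is left out.

```python
import numpy as np

def red_mask(rgb, min_red=150, max_other=80):
    """Binary mask of 'red' pixels; the thresholds are illustrative assumptions."""
    r = rgb[..., 0].astype(int)
    g = rgb[..., 1].astype(int)
    b = rgb[..., 2].astype(int)
    return ((r >= min_red) & (g <= max_other) & (b <= max_other)).astype(np.uint8)

def area_and_centroid(mask):
    """Area (zeroth moment) and centroid (from first-order moments) of a binary mask."""
    ys, xs = np.nonzero(mask)
    m00 = len(xs)                      # area = number of foreground pixels
    if m00 == 0:
        return 0, None
    m10, m01 = xs.sum(), ys.sum()      # first-order moments
    return m00, (m10 / m00, m01 / m00)

def bounding_box(mask):
    """Corners (x_min, y_min, x_max, y_max) of the rectangle around the mask."""
    ys, xs = np.nonzero(mask)
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())

# Synthetic frame with one red rectangle.
frame = np.zeros((20, 30, 3), dtype=np.uint8)
frame[5:10, 10:20] = (200, 10, 10)
mask = red_mask(frame)
print(area_and_centroid(mask), bounding_box(mask))
```

On a real video, the same three functions would be applied to every frame before drawing the rectangle.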
Work done: implementation of a Gaussian filter and comparison with the built-in Gaussian filter.
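
A hand-written Gaussian filter might look like the sketch below. The kernel size, sigma, and the edge-replication border handling are assumptions; the project's own implementation may differ.

```python
import numpy as np

def gaussian_kernel(size, sigma):
    """Normalized size x size Gaussian kernel (size is assumed to be odd)."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    kernel = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()

def gaussian_blur(gray, size=5, sigma=1.0):
    """Blur a 2-D grayscale image; borders are replicated (an assumed choice)."""
    kernel = gaussian_kernel(size, sigma)
    pad = size // 2
    padded = np.pad(gray.astype(float), pad, mode="edge")
    h, w = gray.shape
    out = np.zeros((h, w), dtype=float)
    # Accumulate the weighted, shifted copies of the image.
    for dy in range(size):
        for dx in range(size):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out
```

The output can be compared against a built-in filter with matching kernel size and sigma; small differences near the borders are expected if the border modes differ.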
Work done: Canny algorithm:
- Convert RGB to grayscale
- Apply a Gaussian filter
- Calculate the gradient and the gradient angle
- Round the gradient angle
- Non-maximum suppression
- Double threshold
- Get contours
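
The middle steps of this pipeline (gradient, angle rounding, non-maximum suppression, double threshold) can be sketched in NumPy. This is an illustration under assumptions: Sobel kernels are used for the gradient, angles are rounded to 0, 45, 90 or 135 degrees, and the grayscale conversion, blur, and final contour linking are omitted.

```python
import numpy as np

def gradients(gray):
    """Sobel gradient magnitude and angle in degrees, folded into [0, 180)."""
    g = np.pad(gray.astype(float), 1, mode="edge")
    gx = (g[:-2, 2:] + 2 * g[1:-1, 2:] + g[2:, 2:]) \
       - (g[:-2, :-2] + 2 * g[1:-1, :-2] + g[2:, :-2])
    gy = (g[2:, :-2] + 2 * g[2:, 1:-1] + g[2:, 2:]) \
       - (g[:-2, :-2] + 2 * g[:-2, 1:-1] + g[:-2, 2:])
    return np.hypot(gx, gy), np.degrees(np.arctan2(gy, gx)) % 180.0

def quantize_angle(angle):
    """Round each angle to the nearest of 0, 45, 90, 135 degrees."""
    return (np.round(angle / 45.0).astype(int) % 4) * 45

def non_max_suppression(mag, direction):
    """Keep a pixel only if it is not smaller than both neighbours along the gradient."""
    offsets = {0: (0, 1), 45: (1, 1), 90: (1, 0), 135: (1, -1)}  # (dy, dx)
    out = np.zeros_like(mag)
    h, w = mag.shape
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            dy, dx = offsets[int(direction[y, x])]
            if mag[y, x] >= mag[y + dy, x + dx] and mag[y, x] >= mag[y - dy, x - dx]:
                out[y, x] = mag[y, x]
    return out

def double_threshold(mag, low, high):
    """Label pixels: 2 = strong edge, 1 = weak edge, 0 = suppressed."""
    labels = np.zeros(mag.shape, dtype=np.uint8)
    labels[mag >= low] = 1
    labels[mag >= high] = 2
    return labels

# Vertical step edge as a toy input.
gray = np.zeros((8, 8))
gray[:, 4:] = 255
mag, angle = gradients(gray)
edges = double_threshold(non_max_suppression(mag, quantize_angle(angle)), 100, 500)
print(edges)
```

Weak pixels (label 1) would normally be kept only when connected to a strong pixel; that linking step is what produces the final contours.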
Work done:
- Read a frame, calculate absdiff with the previous frame, find contours.
- Walk over the contours found in the difference frame (frame_diff) and look for a contour whose area is larger than a previously specified parameter.
- If such a contour is found, there was movement, so the frame is written to a file.
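
The decision rule can be sketched without OpenCV. Note the simplification: instead of measuring the area of each contour, this version counts all changed pixels in the difference frame. The names `diff_threshold` and `min_area` and their default values are hypothetical.

```python
import numpy as np

def motion_detected(prev_gray, curr_gray, diff_threshold=25, min_area=50):
    """Return True if enough pixels changed between two grayscale frames.

    Simplification: total changed area is used instead of per-contour area.
    """
    # Same result as an absolute difference of two uint8 frames.
    diff = np.abs(curr_gray.astype(int) - prev_gray.astype(int))
    changed = diff > diff_threshold
    return int(changed.sum()) >= min_area

prev = np.zeros((40, 40), dtype=np.uint8)
curr = prev.copy()
curr[10:20, 10:20] = 200          # a 10x10 object appears
print(motion_detected(prev, curr))
```

Frames for which the function returns True are the ones that would be written to the output file.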
Neural network training. Work done:
- Train a standard NN for digit recognition (a multilayer perceptron built with the Keras library)
- Train a CNN for digit recognition
- Show the digit and the precision
Text recognition. Work done:
- Prepare the dataset: captcha images (included in the folder), standard images
- Apply augmentation to the dataset
- Apply EasyOCR and Tesseract to recognize text in the images
- Write the EasyOCR and Tesseract results to .txt files
Work done: a detector that counts the faces in a video.
Theme: automobiles (logos)
This work uses the Canny algorithm (described above) with different operators, alongside the built-in Canny.
Operators: built-in Canny, Sobel, Prewitt, Kirsch, Scharr.
The results of the different operators are also compared.
Comparison of algorithms on 3 images (a small number of logos, a large number of logos, and cars on the street).
All 3 images were processed with different threshold pairs (10, 100; 100, 200; 150, 230) and kernel sizes (3x3, 5x5, 7x7).
In total, each algorithm processed 27 images (3 images x 3 threshold pairs x 3 kernel sizes).
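
The count of 27 follows from combining every image with every threshold pair and every kernel size. A short sketch of that grid, where the image labels are hypothetical placeholders:

```python
from itertools import product

images = ["few_logos", "many_logos", "street_cars"]   # hypothetical labels
thresholds = [(10, 100), (100, 200), (150, 230)]       # (low, high) pairs from the text
kernels = [3, 5, 7]                                    # kernel sizes 3x3, 5x5, 7x7

runs = list(product(images, thresholds, kernels))
print(len(runs))  # 27
```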
Algorithm | Running time (sec) |
---|---|
Canny | 2.3285114765167236 |
Sobel | 189.11079295476276 |
Scharr | 235.590115070343 |
Prewitt | 237.085066713 |
Kirsch | 1122.984922 |

Algorithm pair | MSE (lower means more similar results) |
---|---|
Canny & Sobel | 42416.498481999944 |
Canny & Prewitt | 3615.576172236691 |
Canny & Scharr | 6270.624080584491 |
Prewitt & Scharr | 7995.047592230903 |
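
For reference, the mean squared error between two edge maps of the same size can be computed as in the sketch below. An MSE of 0 means the images are identical, and larger values mean larger differences. The function name `mse` is an assumption, not the project's code.

```python
import numpy as np

def mse(image_a, image_b):
    """Mean squared error between two images of identical shape."""
    a = np.asarray(image_a, dtype=float)
    b = np.asarray(image_b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("images must have the same shape")
    return float(np.mean((a - b) ** 2))

# Two tiny binary edge maps that differ in one of two pixels.
print(mse([[0, 255]], [[255, 255]]))  # 32512.5
```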
Methods used: MIL, KCF, CSRT, and HSHsTrack (Hand Simple on base Histogram Tracking).
HSHsTrack algorithm:
- Converts the current frame to the HSV color space
- Calculates the back projection of the histogram onto the current frame
- Applies a binarization threshold to highlight the object
- Applies Gaussian blur to reduce noise
- Performs morphological operations
- Finds contours in the binary mask
- Selects the largest contour
- Returns the coordinates of the bounding box
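
The core idea, histogram back projection followed by thresholding, can be sketched in NumPy. This is a reduced illustration, not HSHsTrack itself: it works on a hue channel that is already extracted, skips the blur and morphological steps, and returns the box around the whole mask rather than around the largest contour. The bin count and threshold are assumptions.

```python
import numpy as np

def hue_histogram(hue_roi, bins=16, hue_max=180):
    """Hue histogram of the object region, scaled so the tallest bin is 1."""
    hist, _ = np.histogram(hue_roi, bins=bins, range=(0, hue_max))
    peak = hist.max()
    return hist / peak if peak > 0 else hist.astype(float)

def back_project(hue_frame, hist, hue_max=180):
    """Replace every pixel by the histogram value of its hue bin."""
    bins = len(hist)
    idx = np.minimum((hue_frame.astype(int) * bins) // hue_max, bins - 1)
    return hist[idx]

def track(hue_frame, hist, threshold=0.5):
    """Bounding box (x, y, w, h) of pixels whose back projection passes the threshold."""
    mask = back_project(hue_frame, hist) >= threshold
    ys, xs = np.nonzero(mask)
    if len(xs) == 0:
        return None
    x0, y0 = int(xs.min()), int(ys.min())
    return x0, y0, int(xs.max()) - x0 + 1, int(ys.max()) - y0 + 1

# Toy frame: background hue 100, object hue 10.
hue = np.full((30, 40), 100, dtype=np.uint8)
hue[10:20, 5:15] = 10
model = hue_histogram(hue[10:20, 5:15])
print(track(hue, model))
```

The histogram is built once from the initial object region and then reused for every following frame.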
CSRT | KCF | MIL | HSHsTrack |
---|---|---|---|
~ 13 FPS | ~ 30-32 FPS | ~ 10-12 FPS | ~ 30 FPS |
HSHsTrack result:
hand2.online-video-cutter.com.mp4
Theme: real-time hand detection
- Haar cascade + trained NN (hand classification, trained on the 11k Hands dataset)
- Single Shot Detector (trained and run in MATLAB 2023b with the resnet50, Image Labeler and webcam packages)
- MediaPipe
Trained models:
Detector | Training time | Positive samples | Negative samples | Total time |
---|---|---|---|---|
Haar (trained with a separate training program, worse) | - | - | - | ~ 48 hours |
1. max false positives 0.5 & 8 stages | ~ 8 hours | 8860 | 2001 | - |
2. max false positives 0.5 & 10 stages (best) | ~ 7 hours 41 min | 3367 | 800 | - |
3. max false positives 0.4 & 16 stages | ~ 8 hours | 3322 | 1000 | - |
4. max false positives 0.2 & 16 stages | > 2 days | 11000 | 3000 | not finished (cancelled) |
Haar (trained with MATLAB, better), 16 stages in all runs | - | - | - | ~ 40 hours |
1. | ~ 6 hours | 182 | 2001 | - |
2. | ~ 5 hours | 1000 | 920 | - |
3. | ~ 6 hours | 262 | 920 | - |
4. | ~ 6 hours | 1790 | 920 | - |
5. (best) | ~ 16 hours | 1792 | 3186 | - |
NN classifier to improve precision (helper for Haar) (epochs = 50, activation = relu, base VGG16) | - | - | - | ~ 24 hours |
First model | ~ 8 hours | 1800 | 500 | - |
Continued training of the model (best model) | ~ 16 hours | 11000 | 1000 | - |
Single Shot Detector | - | - | - | ~ 10 hours |
1. | ~ 2 hours | 252 | - | - |
2. (best) | ~ 8 hours | 520 + (augmentation = 3560) | - | - |
The best models are saved in the detectors folder and are ready to use.
Best result for each detector:
- Haar + NN
hand2-3_CtiNPwLL.mp4
- Haar (MATLAB) + NN
hand2_matlab5-1.online-video-cutter.com.mp4
- SSD
res.online-video-cutter.com.mp4
- MediaPipe
out_hand2.online-video-cutter.com.mp4
Time to record a video (39 sec long) with each algorithm:
Haar + NN | SSD | MediaPipe |
---|---|---|
28 min | ~ 45 sec | ~ 40 sec |