Human action recognition has been a topic of interest across multiple fields, ranging from security to entertainment systems. Tracking motion and identifying the action being performed in real time is necessary for critical security systems. In entertainment, especially gaming, immediate responses to actions and gestures are paramount to the success of the system. We show that the Motion History Image (MHI) is a well-established framework for capturing temporal and activity information in multi-dimensional detail, enabling various use cases including classification. We utilize MHI to produce sample data to train a classifier and demonstrate its effectiveness for action classification across six different activities in a single multi-action video. We analyze the classifier's performance, examine cases where MHI struggles to capture the true activity, and discuss mechanisms and future work to overcome those limitations.
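The MHI construction described above can be sketched as a per-frame update: pixels where the inter-frame difference indicates motion are stamped with a duration value tau, while all other pixels decay toward zero, so brighter pixels mark more recent motion. Below is a minimal NumPy-only sketch of this idea; the `tau` and `threshold` values and the `update_mhi` helper are illustrative assumptions, not the settings used in `mhi.py`:

```python
import numpy as np

def update_mhi(mhi, prev_frame, frame, tau=30, threshold=25):
    """One Motion History Image update step (illustrative, not the project's code).

    Pixels whose inter-frame difference exceeds `threshold` are set to
    `tau`; every other pixel decays by 1, floored at 0.
    """
    motion = np.abs(frame.astype(np.int16) - prev_frame.astype(np.int16)) > threshold
    decayed = np.maximum(mhi.astype(np.int16) - 1, 0)
    return np.where(motion, tau, decayed).astype(np.uint8)

# Toy sequence: a bright square moving right across a dark background.
frames = []
for t in range(5):
    f = np.zeros((16, 16), dtype=np.uint8)
    f[4:8, 2 + t : 6 + t] = 255
    frames.append(f)

mhi = np.zeros((16, 16), dtype=np.uint8)
for prev, cur in zip(frames, frames[1:]):
    mhi = update_mhi(mhi, prev, cur)

# The most recent motion holds the highest value; older motion has decayed.
# Thresholding the MHI yields the Motion Energy Image (MEI), a binary
# mask of everywhere motion occurred.
mei = (mhi > 0).astype(np.uint8)
```

The MEI discards the timing information and keeps only the spatial extent of motion, which is why the project stores both MHI and MEI as classifier features.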
If you find this project useful in your research or work, please consider citing it:
@article{gopal2024multiclass,
title={Multi class activity classification in videos using Motion History Image generation},
author={Gopal, Senthilkumar},
journal={arXiv preprint arXiv:2410.09902},
year={2024}
}
mhi.py
- Primary source file; produces the Multi Action Video with Prediction Labels
- Use the conda env set up using cv_proj.yml
- matplotlib=3.0.3 needs to be installed in the environment.
- All the dataset files are already added to the project folder. Reference LOAD_TRAIN_DATA below.
- Run the file mhi.py to perform all the steps.

LOAD_TRAIN_DATA
- Use this flag to switch on/off loading the dataset from the /datasets folder.
Following are the steps that are executed as part of the file mhi.py:

generate_report_images
- Generates the binary images, MHI, MEI for the report/presentation
execute_classifier_based_recognition('KNN')
- Executes training, validation, testing for the KNN classifier
execute_classifier_based_recognition('MLP')
- Executes training, validation, testing for the MLP classifier
predict_multi_action
- Uses the MLP classifier to predict the various actions in the sample video
execute_jogging_only_prediction(mlp_recog)
- Executes the MLP classifier for a sample where the prediction was incorrect for jogging
execute_incorrect_multi_action_prediction
- Executes the MLP classifier for a sample where the multi-action prediction was incorrect
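The KNN step above classifies a flattened MHI/MEI feature vector by comparing it to labeled training vectors. A minimal NumPy-only sketch of that idea (the `knn_predict` helper and the toy "walking"/"jogging" clusters are illustrative assumptions, not the project's implementation, which uses a full classifier pipeline):

```python
import numpy as np

def knn_predict(train_X, train_y, query, k=3):
    """Label a query vector by majority vote among its k nearest
    training vectors under Euclidean distance."""
    dists = np.linalg.norm(train_X - query, axis=1)
    nearest = np.argsort(dists)[:k]
    labels, counts = np.unique(train_y[nearest], return_counts=True)
    return labels[np.argmax(counts)]

# Toy features: two well-separated clusters standing in for two actions.
rng = np.random.default_rng(0)
walk = rng.normal(0.0, 0.1, size=(10, 64))   # label 0
jog = rng.normal(1.0, 0.1, size=(10, 64))    # label 1
X = np.vstack([walk, jog])
y = np.array([0] * 10 + [1] * 10)

pred = knn_predict(X, y, np.full(64, 0.95), k=3)
```

In the real pipeline the feature vectors are flattened MHI/MEI images rather than synthetic clusters, and the MLP variant replaces the vote with a learned network.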
The expected runtime for mhi.py
- Execution with saved dataset - Approx 4 Minutes
- Execution with Full training - Approx 3 Minutes
/input_videos
- Contains the _d1 type of video files useful for training
/my_videos
- Videos that were captured for testing the ML classifier
/input_files
- Text files provided/created for storing frame references
/datasets
- Generated datasets of MHI/MEI and labels for easier loading and training
/output
- Primary output folder containing the confusion matrix, sample video with labels and sample frames
/report
- Images generated for the report
/report/binary
- Binary images used in the report for different videos
/report/mhi
- MHI images used in the report for different videos
/report/mei
- MEI images used in the report for different videos
utility.py
- Utility functions for one-off testing and generation