AdaScan

This repository contains the source code for the paper AdaScan: Adaptive Scan Pooling in Deep Convolutional Neural Networks for Human Action Recognition in Videos by Amlan Kar* (IIT Kanpur), Nishant Rai* (IIT Kanpur), Karan Sikka (UCSD and SRI), and Gaurav Sharma (IIT Kanpur). The code supports multi-GPU training and testing.

Dependencies

Note: skimage and skvideo are required for the preprocessing step

Setup

  • Download the UCF-101 dataset from here and the UCF-101 flow files from here
  • Download the UCF-101 action recognition splits from here (pass this directory using -split_dir)
  • Run the preprocessing script to create the npz files required for training/testing (pass the directory it creates using -data_dir)
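
As a rough illustration of what the preprocessing step above produces, the sketch below packs a clip of frames and its label into a compressed npz file and reads it back. The function names, the array keys (`frames`, `label`), and the (T, H, W, C) frame layout are assumptions made for this example only; the repository's preprocessing script defines the actual format expected under -data_dir.

```python
import os
import tempfile

import numpy as np


def save_clip_npz(frames, label, out_path):
    """Store one clip as a compressed npz file.

    NOTE: the keys 'frames' and 'label' are hypothetical; they are not
    taken from the repository's preprocessing script.
    """
    frames = np.asarray(frames, dtype=np.uint8)
    if frames.ndim != 4:
        raise ValueError("expected frames of shape (T, H, W, C)")
    np.savez_compressed(out_path, frames=frames, label=np.int64(label))


def load_clip_npz(path):
    """Read back the arrays written by save_clip_npz."""
    with np.load(path) as data:
        return data["frames"], int(data["label"])


if __name__ == "__main__":
    # Synthetic stand-in for decoded video frames: 8 RGB frames of 32x32.
    rng = np.random.default_rng(0)
    clip = rng.integers(0, 256, size=(8, 32, 32, 3), dtype=np.uint8)

    path = os.path.join(tempfile.mkdtemp(), "clip_000.npz")
    save_clip_npz(clip, 5, path)

    frames, label = load_clip_npz(path)
    print(frames.shape, label)
```

In a real pipeline the synthetic array would be replaced by frames decoded from the UCF-101 videos (the note under Dependencies mentions skimage and skvideo for this step).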

Training from scratch

  • [RGB training] Download VGG numpy files from here (to be passed using -vgg_npy_path)
  • [Optical Flow training] Download the pre-trained Caffe models for flow from here and convert them to numpy files using this tool
  • Edit sample_train.sh and run it

Testing pre-trained models or self-trained models

  • Download the pre-trained models from the given links below
  • Download the VGG numpy file for RGB and any one of the flow files, and pass it with -npy_path for testing (this extra step does not change the results; we will remove this unnecessary step soon)
  • Edit sample_test.sh and run it

Visualizing on custom video (only for RGB)

python demo.py -ckpt_file path/to/ckpt/file -vid_file vis/vid_file

This should save an image in vis/ that looks like:

Sample visualization

Pre-trained models (Coming Soon)

These models have been trained on UCF-101. We will be releasing the updated models soon.

RGB

Flow

Training/Testing

Self-explanatory sample train and test scripts (sample_train.sh and sample_test.sh) are provided with the code.

Updated Results

After fixing a bug post-submission, we obtained better results with the same configuration as in the original paper. We request that authors cite these numbers.

| Model | UCF-101 | HMDB-51 |
| --- | --- | --- |
| AdaScan | 91.6 | 62.4 |
| AdaScan + iDT | 93.1 | 67.6 |
| AdaScan + iDT + C3D | 94.0 | 69.4 |

Reference

If you use this code as part of any published research, please acknowledge the following paper:

AdaScan: Adaptive Scan Pooling in Deep Convolutional Neural Networks for Human Action Recognition in Videos
Amlan Kar*, Nishant Rai*, Karan Sikka, Gaurav Sharma (*denotes equal contribution)

@inproceedings{kar2016adascan,
  title={AdaScan: Adaptive Scan Pooling in Deep Convolutional Neural Networks for Human Action Recognition in Videos},
  author={Kar, Amlan and Rai, Nishant and Sikka, Karan and Sharma, Gaurav},
  booktitle={CVPR},
  year={2017}
}