
are there any preprocessing to the input video clips? #35

Closed
FesianXu opened this issue Apr 29, 2020 · 10 comments

Comments

@FesianXu

Hi, great work, and it has helped me a lot! However, I still need some help.
I am really not familiar with Caffe2 and could not find out whether the Caffe2 version of the IG65M model applies any preprocessing to the input video clips.
In my experiment, I simply normalized the pixels to [0, 1], but the performance did not look very good (about 92% on UCF101 with the IG65M pretrained model after some fine-tuning on UCF101; without that, the performance was even worse). So I wonder whether we need to apply some specific preprocessing to the video clips, such as subtracting the means or something else?
Thanks for your attention and kind help :)

@daniel-j-h
Member

Yep, check the extract tool:

Normalize(mean=[0.43216, 0.394666, 0.37645], std=[0.22803, 0.22145, 0.216989]),
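To make the constants concrete, here is a minimal pure-Python sketch of what this Normalize step does per channel, assuming the RGB values have already been scaled to [0, 1] (the helper name is illustrative, not from the extract tool):

```python
# Per-channel means and stds quoted from the extract tool above.
MEAN = (0.43216, 0.394666, 0.37645)
STD = (0.22803, 0.22145, 0.216989)

def normalize_rgb(pixel):
    """(r, g, b) floats in [0, 1] -> mean-subtracted, std-divided values."""
    return tuple((c - m) / s for c, m, s in zip(pixel, MEAN, STD))
```

A pixel that equals the channel means maps to (0, 0, 0), which is the usual sanity check for this kind of normalization.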

@FesianXu
Author

@daniel-j-h Thanks so much for your rapid reply :) Have a nice day!

@daniel-j-h
Member

You too! 🤗

@FesianXu
Author

@daniel-j-h Hi Daniel, sorry for bothering you again. I tried normalizing the input video clips following the code you showed me yesterday, but it did not work when I evaluated on the Kinetics 400 dataset. I wonder whether I got the channel order wrong: is it RGB or BGR? Thanks.

@FesianXu
Author

FesianXu commented Apr 30, 2020

Also, I tried using torchvision.transforms.Normalize frame by frame with the same means and stds you provided, instead of the code you provided (simply dropping your code into my project caused some problems, so I used the library method instead). Could that be the root of the problem? BTW, I would like to know whether you have evaluated the results on Kinetics 400, and whether you can reach the accuracy the paper claims. Thanks.

PS: To provide more detail on my preprocessing, the method looks like:

from torchvision import transforms

self.transform = transforms.Compose(
    [
        transforms.ToPILImage(),   # the later Resize expects a PIL image
        transforms.Resize(size=(128, 171)),
        transforms.CenterCrop((112, 112)),
        transforms.ToTensor()
    ]
)

The model I was using is r2plus1d_34_8_kinetics; I think it was fine-tuned on Kinetics 400, and I just wanted to evaluate on Kinetics 400. :)

PPS: I used torchvision.io.read_video() to decode the .mp4 videos in the dataset, but I am not sure whether it matters that I did not use OpenCV to decode them. (I think both of them use FFmpeg.)
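For reference on the channel-order question: torchvision.io.read_video returns RGB frames as a uint8 tensor shaped (T, H, W, C), so no BGR-to-RGB swap should be needed (OpenCV, by contrast, decodes frames in BGR order). A sketch of the layout conversion a 3D model expects, with a random tensor standing in for the decoded video:

```python
import torch

# Stand-in for torchvision.io.read_video(path)[0]: 8 RGB frames,
# uint8, shaped (T, H, W, C). A real clip would come from the decoder.
frames = torch.randint(0, 256, (8, 112, 112, 3), dtype=torch.uint8)

# Rearrange to the (C, T, H, W) layout video models expect
# and scale pixel values to [0, 1] before normalization.
clip = frames.permute(3, 0, 1, 2).float() / 255.0
```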

@FesianXu
Author

FesianXu commented May 2, 2020

I have solved this problem. Thanks for your attention.

@yushuinanrong

@FesianXu
Could you share how you solved the problem? I encountered a similar one. Also, could you share your validation results on Kinetics 400?

@FesianXu
Author

@yushuinanrong Check this link for the validation results on Kinetics 400: #2 (comment)

I just normalize the RGB clips (in the correct RGB channel order) to pixel values in [0, 1], then subtract the means and divide by the stds. The means and stds are:

Normalize(mean=[0.43216, 0.394666, 0.37645], std=[0.22803, 0.22145, 0.216989]),

@yushuinanrong

@FesianXu
Thank you!

@Yueeeeee-1

Hi, did you fine-tune on the UCF101 dataset?
