Are there any preprocessing steps applied to the input video clips? #35
Comments
Yeap, check the extract tool: ig65m-pytorch/ig65m/cli/extract.py, Line 64 in fc749e2
@daniel-j-h Thanks so much for your rapid reply :) have a nice day
You too! 🤗
@daniel-j-h Hi Daniel, sorry to bother you again. I tried to normalize the input video clips following the code you showed me yesterday, but it didn't work when I evaluated on the Kinetics 400 dataset. I wonder if I got the channel order wrong: is the order RGB or BGR? Thanks
also, I had tried to use

PS: to provide more detail on my pre-processing, the method looks like:

```python
self.transform = transforms.Compose(
    [
        transforms.ToPILImage(),  # the Resize below needs a PIL image
        transforms.Resize(size=(128, 171)),
        transforms.CenterCrop((112, 112)),
        transforms.ToTensor(),
    ]
)
```

the model I was using is

PSS: I used
I had solved this problem. Thanks for your attention.
@FesianXu |
@yushuinanrong Check this link for the validation results on Kinetics 400: #2 (comment). I just normalize the RGB clips (in the correct RGB channel order) to pixel values from 0 to 1, then subtract the mean and divide by the std. The mean and std are: Normalize(mean=[0.43216, 0.394666, 0.37645], std=[0.22803, 0.22145, 0.216989])
@FesianXu |
Hi, did you perform fine-tuning on the UCF101 dataset?
Hi, great work, and it gave me lots of help! However, I still need some help.
I am really not familiar with Caffe2 and could not find out whether the Caffe2 version of the IG65M model applied any pre-processing to the input video clips or not.
In my experiment, I simply normalized the pixels to [0, 1], but the performance didn't look very good (about 92% on UCF101 with the ig65m pretrained model after some fine-tuning on UCF101; without the fine-tuning the performance was even worse). So I wonder if we need to do some specific pre-processing on the video clips, like subtracting the means or something else?
Thanks for your attention and kind help :)