run prepro_vqa.py error when split=2 #22

Open · xhzhao opened this issue Dec 19, 2016 · 16 comments

xhzhao commented Dec 19, 2016

I got this error when split=2, while split=1 works fine.
The commands are:
python vqa_preprocess.py --download 1 --split 2
python prepro_vqa.py --input_train_json ../data/vqa_raw_train.json --input_test_json ../data/vqa_raw_test.json --num_ans 1000

The error is:
top words and their counts:
(320161, '?')
(225976, 'the')
(200545, 'is')
(118203, 'what')
(76624, 'are')
(64512, 'this')
(49209, 'in')
(45681, 'a')
(41629, 'on')
(40158, 'how')
(38230, 'many')
(37322, 'color')
(37023, 'of')
(29182, 'there')
(18392, 'man')
(14668, 'does')
(13492, 'people')
(12518, 'picture')
(11779, "'s")
(11758, 'to')
total words: 2284620
number of bad words: 0/14770 = 0.00%
number of words in vocab would be 14770
number of UNKs: 0/2284620 = 0.00%
inserting the special UNK token
Traceback (most recent call last):
  File "prepro_vqa.py", line 292, in <module>
    main(params)
  File "prepro_vqa.py", line 217, in main
    ans_test = encode_answer(imgs_test, atoi)
  File "prepro_vqa.py", line 128, in encode_answer
    ans_arrays[i] = atoi.get(img['ans'], -1) # -1 means wrong answer.
KeyError: 'ans'

idansc commented Dec 31, 2016

There is no 'ans' key in split 2; you should modify the lines that access it.
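
A minimal sketch of one way to guard it (assuming the split-2 test entries simply lack the key; this mirrors encode_answer from the traceback above, with dict.get in place of the direct indexing):

```python
import numpy as np

def encode_answer(imgs, atoi):
    # Guarded version of encode_answer from prepro_vqa.py: img.get('ans')
    # returns None when the key is missing (split 2), and atoi.get(None, -1)
    # then falls back to -1 instead of raising KeyError.
    ans_arrays = np.zeros(len(imgs), dtype='int32')
    for i, img in enumerate(imgs):
        ans_arrays[i] = atoi.get(img.get('ans'), -1)  # -1 means wrong/unknown answer
    return ans_arrays
```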

xhzhao (Author) commented Jan 10, 2017

@idansc Thank you, I fixed this bug in the code and downloaded the pretrained model from here (https://filebox.ece.vt.edu/~jiasenlu/codeRelease/co_atten/model/vqa_model/model_alternating_train-val_vgg.t7).
I submitted the result under the name vqa_OpenEnded_mscoco_test-dev2015_HieCoAttenVQA_results.json, and I got this accuracy:
{"overall": 43.11, "perAnswerType": {"other": 15.74, "number": 29.76, "yes/no": 78.73}}

But the overall accuracy in the paper is 60.1%; I really don't know where this gap comes from.

idansc commented Jan 10, 2017

Did they provide the json and h5 files as well? The model needs to be aligned with the pre-processed files.

xhzhao (Author) commented Jan 10, 2017

Yeah, the json is provided, and I generated the h5 file myself:
th prepro_img_vgg.lua -input_json ../data/vqa_data_prepro.json -image_root /home/jiasenlu/data/ -cnn_proto ../image_model/VGG_ILSVRC_19_layers_deploy.prototxt -cnn_model ../image_model/VGG_ILSVRC_19_layers.caffemodel

idansc commented Jan 10, 2017

That's OK for the images, but what about the h5 file containing the preprocessed question dataset?
If it's not provided, it will cause problems in training (the model has to match the right answers to the right questions).
I believe the gap is caused by some sort of misalignment in the pre-processing.
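
A quick sanity check might look like this (just a sketch; it assumes the provided json carries the usual ix_to_word map that prepro_vqa.py writes):

```python
import json

# If the vocab in the provided json was built in a different run than the
# one the .t7 model was trained with, every word id shifts and accuracy collapses.
with open('../data/vqa_data_prepro.json') as f:
    prepro = json.load(f)
print(len(prepro['ix_to_word']), 'question words in vocab')
# This count should match the word-embedding (lookup table) size inside the
# downloaded model; a mismatch means the json/h5 were rebuilt differently.
```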

xhzhao (Author) commented Jan 12, 2017

@idansc Any idea how to fix this misalignment problem?

idansc commented Jan 15, 2017

Did you try training the model yourself? By the way, are you running on CPU or GPU?

xhzhao (Author) commented Jan 16, 2017

I have tried training this model on a GPU (M40), but the training is very slow (12.5 hours per epoch; the paper uses 250 epochs), and I'm trying to find out where the bottleneck is.

idansc commented Jan 16, 2017

An epoch should take a few minutes on an M40. Check that your stored CNN features are on an SSD, or use DataLoader if you have enough RAM (about 60 GB).
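
For instance, something along these lines (a sketch; the file name and dataset key are my assumptions, so inspect f.keys() first) reads the whole feature file into RAM once instead of seeking on disk every batch:

```python
import h5py

# Hypothetical file/dataset names based on the prepro scripts; adjust to yours.
with h5py.File('../data/vqa_data_img_vgg_train.h5', 'r') as f:
    print(list(f.keys()))         # check the real dataset names first
    feats = f['images_train'][:]  # [:] materializes everything in memory (~60 GB)
print(feats.shape, feats.dtype)
```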

xhzhao (Author) commented Jan 16, 2017

Yes, it should be. I trained another model based on another GitHub repo and the speed was very fast, though the accuracy was not good enough: here.
I will double-check the hardware.

yauhen-info commented

@xhzhao @idansc Sorry for going off-topic, but I really need some help here. I trained the model on a customized VQA dataset, but I'm not sure how to run the evaluation now. I read the README, but it isn't clear from there. I would highly appreciate any help, and if you are open to discussion, I guess we can continue here.

lupantech commented

@xhzhao I ran into the same problem as you, but I have no idea how to deal with it. How did you solve it?

lupantech commented

OK, I got it working :)

panfengli commented Apr 13, 2017

@lupantech @yauhen-info @xhzhao How did you solve the problem? I am using the VQA v1.9 dataset, the eval.lua provided by HieCoAttenVQA, and the vqaEvalDemo.py provided by VT-vision-lab/VQA, and it reports an error in vqa.py:
'Results do not correspond to current VQA set. Either the results do not have predictions for all question ids in annotation file or there is at least one question id that does not belong to the question ids in the annotation file.'
I saw one suggestion to use the eval.lua in VT-vision-lab/VQA_LSTM_CNN, but it is not well suited to the one in HieCoAttenVQA. Thanks!

panfengli commented

I found that the error is because the new question_id values in VQA dataset v1.9 exceed the integer precision of a FloatTensor, which causes a mismatch when copying from data.ques_id. Changing a single line fixes it: local ques_id = torch.DoubleTensor(total_num)
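
To see why (a plain numpy sketch with a made-up id; float32 only represents integers exactly up to 2^24 = 16,777,216, and v1.9 question ids run far past that):

```python
import numpy as np

qid = 458752001                # hypothetical large question_id from VQA v1.9
print(np.float32(qid) == qid)  # False: float32 rounds it to 458752000.0
print(np.float64(qid) == qid)  # True: a double keeps the integer exact
```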

ghost commented Oct 6, 2017

@panfengli Which file is the line "ques_id = torch.DoubleTensor(total_num)" in?
