run prepro_vqa.py error when split=2 #22

Open · xhzhao opened this issue Dec 19, 2016 · 16 comments

xhzhao commented Dec 19, 2016

I got this error when split=2, while split=1 works fine.
The commands are:
python vqa_preprocess.py --download 1 --split 2
python prepro_vqa.py --input_train_json ../data/vqa_raw_train.json --input_test_json ../data/vqa_raw_test.json --num_ans 1000

The error is:
top words and their counts:
(320161, '?')
(225976, 'the')
(200545, 'is')
(118203, 'what')
(76624, 'are')
(64512, 'this')
(49209, 'in')
(45681, 'a')
(41629, 'on')
(40158, 'how')
(38230, 'many')
(37322, 'color')
(37023, 'of')
(29182, 'there')
(18392, 'man')
(14668, 'does')
(13492, 'people')
(12518, 'picture')
(11779, "'s")
(11758, 'to')
total words: 2284620
number of bad words: 0/14770 = 0.00%
number of words in vocab would be 14770
number of UNKs: 0/2284620 = 0.00%
inserting the special UNK token
Traceback (most recent call last):
  File "prepro_vqa.py", line 292, in <module>
    main(params)
  File "prepro_vqa.py", line 217, in main
    ans_test = encode_answer(imgs_test, atoi)
  File "prepro_vqa.py", line 128, in encode_answer
    ans_arrays[i] = atoi.get(img['ans'], -1) # -1 means wrong answer.
KeyError: 'ans'

idansc commented Dec 31, 2016

There is no 'ans' key in split 2; you should modify the lines that access it.
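
A minimal sketch of one way to guard it (assuming the split-2 test entries simply lack the key; this mirrors encode_answer from the traceback above, with dict.get in place of the direct indexing):

```python
import numpy as np

def encode_answer(imgs, atoi):
    # Guarded version of encode_answer from prepro_vqa.py: img.get('ans')
    # returns None when the key is missing (split 2), and atoi.get(None, -1)
    # then falls back to -1 instead of raising KeyError.
    ans_arrays = np.zeros(len(imgs), dtype='int32')
    for i, img in enumerate(imgs):
        ans_arrays[i] = atoi.get(img.get('ans'), -1)  # -1 means wrong/unknown answer
    return ans_arrays
```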

xhzhao (Author) commented Jan 10, 2017

@idansc Thank you, I fixed this bug in the code and downloaded the pretrained model from here (https://filebox.ece.vt.edu/~jiasenlu/codeRelease/co_atten/model/vqa_model/model_alternating_train-val_vgg.t7).
I submitted the result under the name vqa_OpenEnded_mscoco_test-dev2015_HieCoAttenVQA_results.json, and I got this accuracy:
{"overall": 43.11, "perAnswerType": {"other": 15.74, "number": 29.76, "yes/no": 78.73}}

But the overall accuracy in the paper is 60.1%; I really don't know where this gap comes from.

idansc commented Jan 10, 2017

Did they provide the json and h5 files as well? The model needs to be aligned with the pre-processed files.

xhzhao (Author) commented Jan 10, 2017

Yeah, the json is provided, and I generated the h5 file myself:
th prepro_img_vgg.lua -input_json ../data/vqa_data_prepro.json -image_root /home/jiasenlu/data/ -cnn_proto ../image_model/VGG_ILSVRC_19_layers_deploy.prototxt -cnn_model ../image_model/VGG_ILSVRC_19_layers.caffemodel

idansc commented Jan 10, 2017

That's OK for the images, but what about the h5 file containing the preprocessed question dataset?
If it's not provided, it will cause problems in training (the model has to match the right answers to the right questions).
I believe the gap is caused by some sort of misalignment in the pre-processing.
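
A quick sanity check might look like this (just a sketch; it assumes the provided json carries the usual ix_to_word map that prepro_vqa.py writes):

```python
import json

# If the vocab in the provided json was built in a different run than the
# one the .t7 model was trained with, every word id shifts and accuracy collapses.
with open('../data/vqa_data_prepro.json') as f:
    prepro = json.load(f)
print(len(prepro['ix_to_word']), 'question words in vocab')
# This count should match the word-embedding (lookup table) size inside the
# downloaded model; a mismatch means the json/h5 were rebuilt differently.
```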

xhzhao (Author) commented Jan 12, 2017

@idansc Any idea how to fix this misalignment problem?

idansc commented Jan 15, 2017

Did you try training the model yourself? By the way, are you running on CPU or GPU?

xhzhao (Author) commented Jan 16, 2017

I have tried training this model on a GPU (M40), but the training is very slow (12.5 hours per epoch; the paper uses 250 epochs), and I'm trying to find out where the bottleneck is.

idansc commented Jan 16, 2017

An epoch should take a few minutes on an M40. Check that your stored CNN features are on an SSD, or use DataLoader if you have enough RAM (about 60 GB).
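
For instance, something along these lines (a sketch; the file name and dataset key are my assumptions, so inspect f.keys() first) reads the whole feature file into RAM once instead of seeking on disk every batch:

```python
import h5py

# Hypothetical file/dataset names based on the prepro scripts; adjust to yours.
with h5py.File('../data/vqa_data_img_vgg_train.h5', 'r') as f:
    print(list(f.keys()))         # check the real dataset names first
    feats = f['images_train'][:]  # [:] materializes everything in memory (~60 GB)
print(feats.shape, feats.dtype)
```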

xhzhao (Author) commented Jan 16, 2017

Yes, it should be. I trained another model based on another GitHub repo and the speed was very fast, though the accuracy was not good enough: here.
I will double-check the hardware.

yauhen-info commented

@xhzhao @idansc Sorry for going off-topic, but I really need some help here. I trained the model on a customized VQA dataset, but I'm not sure how to run the evaluation now. I read the README, but it isn't clear from there. I would highly appreciate any help, and if you are open to discussion, I guess we can continue here.

lupantech commented

@xhzhao I ran into the same problem as you, but I have no idea how to deal with it. How did you solve it?

lupantech commented

OK, I got it working :)

panfengli commented Apr 13, 2017

@lupantech @yauhen-info @xhzhao How did you solve the problem? I am using the VQA v1.9 dataset, the eval.lua provided by HieCoAttenVQA, and the vqaEvalDemo.py provided by VT-vision-lab/VQA, and it reports an error in vqa.py:
'Results do not correspond to current VQA set. Either the results do not have predictions for all question ids in annotation file or there is at least one question id that does not belong to the question ids in the annotation file.'
I saw one suggestion to use the eval.lua in VT-vision-lab/VQA_LSTM_CNN, but it is not well suited to the one in HieCoAttenVQA. Thanks!

panfengli commented

I found that the error is because the new question_id values in VQA dataset v1.9 exceed the integer precision of a FloatTensor, which causes a mismatch when copying from data.ques_id. Changing a single line fixes it: local ques_id = torch.DoubleTensor(total_num)
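
To see why (a plain numpy sketch with a made-up id; float32 only represents integers exactly up to 2^24 = 16,777,216, and v1.9 question ids run far past that):

```python
import numpy as np

qid = 458752001                # hypothetical large question_id from VQA v1.9
print(np.float32(qid) == qid)  # False: float32 rounds it to 458752000.0
print(np.float64(qid) == qid)  # True: a double keeps the integer exact
```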

ghost commented Oct 6, 2017

@panfengli Which file is the line "ques_id = torch.DoubleTensor(total_num)" in?
