Main file: VQA_model.ipynb

It requires two files:
- the annotations file from https://s3.amazonaws.com/cvmlp/vqa/mscoco/vqa/v2_Annotations_Train_mscoco.zip
- Bert.pkl from https://drive.google.com/file/d/1rStLsUbgAC1uU7Mai5X_-c0qJPxnJk63/view?usp=sharing

Both files must be placed in the Files folder.
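Before opening the notebook, a quick setup check can catch missing files early. The sketch below is a minimal example, assuming the folder is named `Files`, the downloaded file names match the links above, and the archive contains the standard `v2_mscoco_train2014_annotations.json` (an assumption based on the official VQA v2 release):

```python
import json
import pickle
import zipfile
from pathlib import Path

FILES_DIR = Path("Files")  # folder expected by the notebook
ANN_ZIP = FILES_DIR / "v2_Annotations_Train_mscoco.zip"
BERT_PKL = FILES_DIR / "Bert.pkl"

# Extract the annotations archive; the JSON file name inside is
# assumed to follow the official VQA v2 naming convention.
with zipfile.ZipFile(ANN_ZIP) as zf:
    zf.extractall(FILES_DIR)

with open(FILES_DIR / "v2_mscoco_train2014_annotations.json") as f:
    annotations = json.load(f)
print("annotation records:", len(annotations["annotations"]))

# Bert.pkl is a pickled object; just confirm it deserializes.
with open(BERT_PKL, "rb") as f:
    bert_obj = pickle.load(f)
print("Bert.pkl loaded:", type(bert_obj))
```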
The dataset (described in https://github.com/GT-Vision-Lab/VQA/blob/master/README.md) consists of:

- Real
  - 82,783 MS COCO training images, 40,504 MS COCO validation images and 81,434 MS COCO testing images (images are obtained from the [MS COCO website](http://mscoco.org/dataset/#download))
  - 443,757 questions for training, 214,354 questions for validation and 447,793 questions for testing
  - 4,437,570 answers for training and 2,143,540 answers for validation (10 per question; see the annotation-loading sketch below)
- There is only one type of task: the Open-ended task
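To make the numbers above concrete, the following sketch loads the training annotations and checks the 10-answers-per-question structure. The field names (`annotations`, `answers`, `question_id`, `image_id`, `multiple_choice_answer`) follow the official VQA annotation format; the JSON path assumes the archive was extracted into `Files/` as above:

```python
import json
from collections import Counter

# Assumed path after extracting the annotations zip into Files/.
with open("Files/v2_mscoco_train2014_annotations.json") as f:
    ann = json.load(f)

records = ann["annotations"]
# Total human answers in the training split (~4,437,570).
print("training answers:", sum(len(r["answers"]) for r in records))

# Each record carries one question_id, one image_id, and 10 human answers.
sample = records[0]
print(sample["question_id"], sample["image_id"], sample["multiple_choice_answer"])
print("answers per question:", Counter(len(r["answers"]) for r in records))
```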
References:
- https://arxiv.org/abs/1704.03162 (PDF: https://arxiv.org/pdf/1704.03162.pdf)
- https://github.com/iamaaditya/VQA_Demo
- https://github.com/iamaaditya/VQA_Keras
- https://visualqa.org/download.html
- https://arxiv.org/abs/1905.13648
- https://tryolabs.com/blog/2018/03/01/introduction-to-visual-question-answering/
- https://iamaaditya.github.io/2016/04/visual_question_answering_demo_notebook
- https://paperswithcode.com/task/visual-question-answering
- https://keras.io/getting-started/functional-api-guide/ (a minimal functional-API VQA model sketch follows this list)
- https://github.com/Cyanogenoid/pytorch-vqa
- https://github.com/nithinraok/VisualQuestion_VQA
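The linked Keras functional-API guide covers the multi-input pattern that two-branch VQA baselines use: encode the image and the question separately, fuse, and classify over a fixed answer vocabulary. The sketch below illustrates that pattern only; all dimensions (2048-d image features, 768-d BERT-style question vectors, 1,000 answer classes) are illustrative assumptions, not the actual configuration of VQA_model.ipynb:

```python
from tensorflow.keras.layers import Input, Dense, Dropout, concatenate
from tensorflow.keras.models import Model

# Illustrative sizes, not the notebook's actual configuration:
IMG_DIM = 2048      # e.g. pooled CNN features per image
QUES_DIM = 768      # e.g. a sentence vector such as those in Bert.pkl
NUM_ANSWERS = 1000  # classify over the most frequent answers

# Two inputs: precomputed image features and question embeddings.
image_in = Input(shape=(IMG_DIM,), name="image_features")
question_in = Input(shape=(QUES_DIM,), name="question_embedding")

# Project each modality into a shared space, then fuse.
img = Dense(512, activation="relu")(image_in)
ques = Dense(512, activation="relu")(question_in)
fused = concatenate([img, ques])
fused = Dropout(0.5)(fused)
fused = Dense(512, activation="relu")(fused)

# Open-ended VQA is typically framed as classification over answers.
answer_out = Dense(NUM_ANSWERS, activation="softmax", name="answer")(fused)

model = Model(inputs=[image_in, question_in], outputs=answer_out)
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```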
Improvements over this code:
https://github.com/Cyanogenoid/vqa-counting/tree/master/vqa-v2
https://github.com/KaihuaTang/VQA2.0-Recent-Approachs-2018.pytorch