[ICCV 2021] Official implementation of the paper "TRAR: Routing the Attention Spans in Transformers for Visual Question Answering"
A Visual Question Answering model implemented in MindSpore and PyTorch, reimplementing the paper *Show, Ask, Attend, and Answer: A Strong Baseline For Visual Question Answering*. It is our final project for the course DL4NLP at ZJU.
PyTorch implementation of VQA using Stacked Attention Networks: a multimodal architecture that encodes the image with a CNN and the question with an LSTM, then fuses them through stacked attention layers for improved accuracy (54.82%). Includes visualization of the attention layers; a sketch of one attention hop follows below. Uses the VQA v2.0 dataset. Contributions welcome.
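As a rough illustration of the stacked-attention idea this repo implements, here is a minimal sketch of a single attention hop in PyTorch: the question vector attends over image-region features and is refined with the weighted image context. Dimensions and layer names are assumptions for illustration, not the repo's exact code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class StackedAttention(nn.Module):
    """One attention hop in the style of Stacked Attention Networks.
    d_img/d_q/d_attn are hypothetical; the referenced repo may differ."""
    def __init__(self, d_img=1024, d_q=1024, d_attn=512):
        super().__init__()
        self.w_img = nn.Linear(d_img, d_attn, bias=False)
        self.w_q = nn.Linear(d_q, d_attn)
        self.w_p = nn.Linear(d_attn, 1)

    def forward(self, v_img, u_q):
        # v_img: (batch, regions, d_img) image features; u_q: (batch, d_q) question vector
        h = torch.tanh(self.w_img(v_img) + self.w_q(u_q).unsqueeze(1))
        p = F.softmax(self.w_p(h).squeeze(-1), dim=-1)    # (batch, regions) attention weights
        v_ctx = (p.unsqueeze(-1) * v_img).sum(dim=1)      # weighted image context
        return u_q + v_ctx, p                             # refined query; p can be visualized
```

Stacking two such hops and classifying from the final refined query is the standard SAN recipe; the returned weights `p` are what attention-map visualizations are built from.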
Adaptively fine-tuning transformer-based models for multiple domains and multiple tasks
Grid feature extraction for the ICCV 2021 paper "TRAR: Routing the Attention Spans in Transformers for Visual Question Answering"
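For context, "grid features" means taking the backbone's final convolutional feature map and treating each spatial cell as one visual token, instead of using detected object regions. A minimal sketch with a torchvision ResNet-50 follows; the backbone, input size, and grid size are assumptions, not TRAR's actual extraction pipeline.

```python
import torch
from torchvision.models import resnet50, ResNet50_Weights

# Hypothetical grid-feature extractor; TRAR's own pipeline may use a different backbone.
backbone = resnet50(weights=ResNet50_Weights.DEFAULT).eval()
conv_stack = torch.nn.Sequential(*list(backbone.children())[:-2])  # drop avgpool + fc

@torch.no_grad()
def grid_features(images, grid=8):
    """images: (batch, 3, H, W) -> (batch, grid*grid, 2048) grid features."""
    fmap = conv_stack(images)                                  # (batch, 2048, H/32, W/32)
    fmap = torch.nn.functional.adaptive_avg_pool2d(fmap, grid) # pool to a fixed grid
    return fmap.flatten(2).transpose(1, 2)                     # one 2048-d vector per cell

feats = grid_features(torch.randn(2, 3, 448, 448))             # -> (2, 64, 2048)
```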
Leveraging the BLIP Model for Visual Question Answering: A Comparative Analysis on VQA and DAQUAR Datasets
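For readers who want to try BLIP on a VQA example, here is a minimal inference sketch using the Hugging Face `transformers` `BlipProcessor`/`BlipForQuestionAnswering` classes; the checkpoint name and sample image URL are assumptions for illustration, not necessarily the comparison's exact setup.

```python
from PIL import Image
import requests
from transformers import BlipProcessor, BlipForQuestionAnswering

# Hypothetical checkpoint choice; the repo's analysis may use a different one.
processor = BlipProcessor.from_pretrained("Salesforce/blip-vqa-base")
model = BlipForQuestionAnswering.from_pretrained("Salesforce/blip-vqa-base")

url = "http://images.cocodataset.org/val2017/000000039769.jpg"  # sample COCO image
image = Image.open(requests.get(url, stream=True).raw).convert("RGB")

inputs = processor(image, "How many cats are in the picture?", return_tensors="pt")
out = model.generate(**inputs)
print(processor.decode(out[0], skip_special_tokens=True))
```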