Question about training dataset #7

vishaal27 · 2023-08-15T14:24:29Z

Hey, thanks for the great work!

I had a question about the training dataset for the end-to-end VNLI model. In the paper you mention:

"Specifically, we finetune BLIP2 and PaLI-17B using a dataset comprising 110K text-image pairs labeled with alignment annotations. This includes 44K examples from COCO-Con, 3.5K from PickaPic-Con, 20K from COCO t2i and 40K from the training split of the SNLI-VE dataset."

However, I was unable to find the training split on AWS/Huggingface. Are there plans to release it, or if it has already been released, could you please point me to where I can find it?

yonatanbitton · 2023-10-07T21:12:43Z

Hi, we will publish it soon; we are working on the data release and will update here. Thank you :)

yonatanbitton · 2023-12-11T01:27:15Z

Hi, it's now available on the project website. Sorry for the delay; please ask if anything is unclear.
CSV: https://seetrue.s3.amazonaws.com/wysiwyr_train.csv
Images: https://drive.google.com/file/d/1M1CKmYkIdpFYjCOc9JwXHP5Z7E91CJl3/view?usp=drive_link
