<!DOCTYPE html>
<html lang="en">
<head>
<title>VLQA</title>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<link rel="stylesheet" href="static/styles/index.css"> <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/4.7.0/css/font-awesome.min.css"> <link rel="stylesheet" media="screen" href="https://fontlibrary.org/face/hk-grotesk" type="text/css"/>
<link rel="icon" href="static/images/favicon.png">
<link rel="stylesheet" href="https://use.fontawesome.com/releases/v5.1.1/css/all.css" integrity="sha384-O8whS3fhG2OnA5Kas0Y9l3cfpmYjapjI0E4theH4iuMD+pLhbf6JI0jIMfYcK3yZ" crossorigin="anonymous">
<link href="https://afeld.github.io/emoji-css/emoji.css" rel="stylesheet">
<link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/css/bootstrap.min.css" integrity="sha384-BVYiiSIFeK1dGmJRAkycuHAHRg32OmUcww7on3RYdg4Va+PmSTsz/K68vbdEjh4u" crossorigin="anonymous">
<link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/css/bootstrap-theme.min.css" integrity="sha384-rHyoN1iRsVXV4nD0JutlnGaslCJuC7uwjduW9SVrLvRYooPp2bWYgmgJQIXwl/Sp" crossorigin="anonymous">
<!-- JS IMPORTS -->
<script src="https://code.jquery.com/jquery-2.2.4.min.js" integrity="sha256-BbhdlvQf/xTY9gja0Dq3HiwQF8LaCRTXxZKRutelT44=" crossorigin="anonymous"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/mustache.js/2.3.0/mustache.min.js" integrity="sha256-iaqfO5ue0VbSGcEiQn+OeXxnxAMK2+QgHXIDA5bWtGI=" crossorigin="anonymous"></script>
<script src="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/js/bootstrap.min.js" integrity="sha384-Tc5IQib027qvyjSMfHjOMaLkfuWVxZxUPnCJA7l2mCWNIpG9mGCD8wGNIcPD7Txa" crossorigin="anonymous"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/moment.js/2.17.1/moment.min.js" integrity="sha256-Gn7MUQono8LUxTfRA0WZzJgTua52Udm1Ifrk5421zkA=" crossorigin="anonymous"></script>
<style>
p.rank {
padding-left: 30px;
}
body {
font-family: 'HankenGroteskRegular';
background-color: #E0E0E0;
}
</style>
</head>
<body>
<div class="header">
<h1><img src="static/images/qa4.png" width="10%"> VLQA (<u>V</u>isuo-<u>L</u>inguistic <u>Q</u>uestion <u>A</u>nswering)</h1>
<h3> A Dataset for Joint Reasoning over Visuo-Linguistic Context </h3>
</div>
<div class="container">
<div class="row">
<div class="col-md-7 box ">
<center> <h3>What is VLQA? </h3> </center>
<p>VLQA is a dataset for joint reasoning over visuo-linguistic context. It consists of 9K image-passage-question-answer items with detailed annotations, meticulously crafted through combined automated and manual efforts. Questions in VLQA are designed to require both visual and textual information, i.e., ignoring either of them would make the question unanswerable. </p>
<p> Solving this dataset requires an AI model that can (i) understand diverse kinds of images, from simple daily-life scenes and standard charts to complex diagrams, (ii) understand complex texts and relate them to the given visual information, and (iii) perform a variety of reasoning tasks to derive inferences. </p>
<hr>
<center>
<h3> VLQA Paper </h3>
<a href="https://arxiv.org/pdf/2005.00330.pdf" target="_blank"> <button class="button"> <i class="fa fa-newspaper-o" aria-hidden="true"></i> PDF (EMNLP'20 Findings) </button></a>
</center>
<br>
<p>For more details about VLQA dataset creation, annotations, and analysis, please refer to the supplementary material in the paper above. </p>
<center>
<hr>
<h3>Browse Examples</h3>
<a href="https://shailaja183.github.io/vlqa/dataset.html"><button class="button"> <i class="fa fa-eye" aria-hidden="true"></i> Explore VLQA Dataset</button></a>
<hr>
<h3>Download Dataset</h3>
<a href="https://drive.google.com/drive/folders/163Tob6UcYoDD601pZbuAfJxgvYc3ASdQ?usp=sharing"><button class="button"><i class="fa fa-download"></i> Train/Val/Test Set </button></a>
<hr>
<h3>Baseline Models</h3>
<a href="https://github.com/shailaja183/vlqa" target="_blank"><button class="button"><i class="fa fa-code" aria-hidden="true"></i> Code for Baseline Models </button></a>
</center>
<br>
<p> <b>Note (As of September 2022): </b> All of our experimentation was done during the early days of transformers. Many of the baselines we implemented are now part of HuggingFace and may be more convenient to use; check them out <a href="https://huggingface.co/models" target="_blank">here</a>.</p>
<hr>
<center>
<h3>Leaderboard Submission</h3>
</center>
<p>If you would like your model to be part of our leaderboard, create a prediction.csv file containing two columns, 'qid' and 'pred_answer', for all test set instances. Then send the prediction.csv file to <a href="mailto:ssampa17@asu.edu"> ssampa17@asu.edu</a> with a brief model description.</p>
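<p>Below is a minimal sketch of producing such a file in Node.js. It assumes your model's predictions are already available in memory; the qids shown are hypothetical placeholders, not actual test-set ids.</p>
<pre><code>const fs = require('fs');

// Hypothetical predictions: each key is a test-set qid (placeholders here; real
// qids must come from the VLQA test set) mapped to the predicted answer.
const predictions = {
  'test_q_0001': 'A',
  'test_q_0002': 'C'
};

// Build the two-column CSV expected for leaderboard submission and write it out.
const rows = ['qid,pred_answer'];
for (const qid of Object.keys(predictions)) {
  rows.push(qid + ',' + predictions[qid]);
}
fs.writeFileSync('prediction.csv', rows.join('\n') + '\n');
</code></pre>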
<hr>
<center><h3>Distribution and Usage</h3></center>
<p>VLQA is curated from multiple online resources (books, encyclopedias, web crawls, existing datasets, standardized tests, etc.). We provide web references to all such resources used for the images, passages, and question-answer pairs in our dataset (the originally curated content may be altered on a case-by-case basis to better fit the purpose of the dataset). </p>
<p>
The creation of VLQA is purely research-oriented, and so are its distribution and future usage. VLQA is an ongoing effort and we expect the dataset to evolve. If you find our dataset or models helpful, please cite our paper :-)
<br><br>
<h4>Citation:</h4>
<code>
@misc{sampat2020visuo-linguistic,
<br>
title={Visuo-Linguistic Question Answering (VLQA) Challenge},
<br>
author={Shailaja Sampat and Yezhou Yang and Chitta Baral},
<br>
year={2020},
<br>
eprint={2005.00330},
<br>
archivePrefix={arXiv},
<br>
primaryClass={cs.CV}
<br>
}
</code>
</p>
<hr>
</div>
<div class="col-md-5 box">
<div id="container" style="width: 100%"></div>
<script id="template" type="x-tmpl-mustache">
<h3><i class="em em-trophy" aria-role="presentation" aria-label="TROPHY"></i> Top Models on VLQA-Test Set </h3>
<br>
<table class="table table-condensed">
<thead>
<tr>
<th>Rank</th>
<th>Model</th>
<th>Accuracy</th>
</tr>
</thead>
<tbody>
<tr>
<th><span class="date label label-default">OCT 03, 20</span></th>
<td>HUMAN </td>
<td>84.00</td>
</tr>
<tr>
<th><p class="rank">1</p><span class="date label label-default">OCT 03, 20</span></th>
<td>HOLE (Multimodal)
<br> (Sampat et al., 2020)</td>
<td>39.63</td>
</tr>
<tr>
<th><p class="rank">2</p><span class="date label label-default">OCT 03, 20</span></th>
<td>LXMERT (Multimodal)
<br> (Tan and Bansal, 2019) </td>
<td>36.41</td>
</tr>
<tr>
<th><p class="rank">3</p><span class="date label label-default">OCT 03, 20</span></th>
<td>VL-BERT (Multimodal)
<br> (Su et al., 2019)</td>
<td>35.92</td>
</tr>
<tr>
<th><p class="rank">4</p><span class="date label label-default">OCT 03, 20</span></th>
<td>ViLBERT (Multimodal)
<br> (Lu et al., 2019)</td>
<td>34.70</td>
</tr>
<tr>
<th><p class="rank">5</p><span class="date label label-default">OCT 03, 20</span></th>
<td>VisualBERT (Multimodal)
<br> (Li et al., 2019)</td>
<td>33.17</td>
</tr>
<tr>
<th><p class="rank">6</p><span class="date label label-default">OCT 03, 20</span></th>
<td>Random Choice Baseline
<br> -- </td>
<td>31.36</td>
</tr>
<tr>
<th><p class="rank">7</p><span class="date label label-default">OCT 03, 20</span></th>
<td>DQANet (Multimodal)
<br> (Kembhavi et al., 2016) </td>
<td>31.30</td>
</tr>
<tr>
<th><p class="rank">8</p><span class="date label label-default">OCT 03, 20</span></th>
<td>Passage-only (Unimodal)
<br> -- </td>
<td>30.16</td>
</tr>
<tr>
<th><p class="rank">9</p><span class="date label label-default">OCT 03, 20</span></th>
<td>Image-only (Unimodal) <br> -- </td>
<td>29.48</td>
</tr>
<tr>
<th><p class="rank">10</p><span class="date label label-default">OCT 03, 20</span></th>
<td>Question-only (No modality) <br> -- </td>
<td>28.56</td>
</tr>
{{#submissions}}
<tr>
<th><p class="rank">{{{rank}}}</p><span class="date label label-default">{{created}}</span></th>
<td>{{submission.description}}</td>
<td>{{scores.textual_cloze}}</td>
</tr>
{{/submissions}}
</tbody>
</table>
</script>
</div>
</div>
<div class="col-lg-12 col-sm-12">
<div class="col-lg-3 col-md-3 col-sm-3"><br>
<img src="static/images/asu.jpeg" width=120%>
</div>
<div class="col-lg-8 col-md-8 col-sm-8"><br>
<center><h4> Shailaja Sampat <a href="http://shailaja-sampat.mystrikingly.com/" target="_blank"><i class="em em-female-student" aria-role="presentation" aria-label=""></i></a>, Yezhou Yang <a href="https://yezhouyang.engineering.asu.edu/research-group/" target="_blank"><i class="em em-male-teacher" aria-role="presentation" aria-label=""></i></a> and Chitta Baral <a href="https://cogintlab-asu.github.io" target="_blank"><i class="em em-male-teacher" aria-role="presentation" aria-label=""></i> </a> <br>
School of Computing, Informatics, and Decision Systems Engineering (CIDSE) <br> Arizona State University </h4></center>
</div>
<div class="col-lg-11 col-md-11 col-sm-11">
<p style="text-align:left"><h4><br>We are thankful to National Science Foundation (NSF) for supporting this research under grant IIS-1816039.</h4></p>
<br>
<h6>Webpage template inspired by <a href="https://rajpurkar.github.io/SQuAD-explorer/" target="_blank"> SQuAD</a> and <a href="https://hucvl.github.io/recipeqa/" target="_blank">RecipeQA</a> leaderboards.</h6>
<h6>Icon template adapted from <a href="https://www.flaticon.com/authors/freepik" >Freepik</a>.</h6>
</div>
</div>
</body>
</html>
<script type="text/javascript">
(function($) {
var LEADERBOARD_JSON = 'https://hucvl.github.io/recipeqa/leaderboard.json';
var template = $('#template').html();
Mustache.parse(template);
var ms_data = { submissions:[], };
$.getJSON(LEADERBOARD_JSON).done(function (data) {
var rendered = Mustache.render(template, ms_data);
$('#container').html(rendered);
}).fail(function () {
$('#container').html('This leaderboard is not ready yet.');
});
})(jQuery);
</script>
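<!--
A sketch of the entry shape the Mustache template in "#template" expects for a dynamic
leaderboard submission. The field names (rank, created, submission.description,
scores.textual_cloze) are taken from the template itself; the values are hypothetical.

ms_data.submissions.push({
  rank: 11,
  created: 'JAN 01, 21',
  submission: { description: 'YourModel (Multimodal)' },
  scores: { textual_cloze: 32.10 }
});
-->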