Initial import from v0.2
Panupong Pasupat committed Feb 15, 2017
0 parents commit 300cd8c
Showing 10,568 changed files with 2,714,152 additions and 0 deletions.
16 changes: 16 additions & 0 deletions .gitignore
@@ -0,0 +1,16 @@
.DS_Store
.DS_Store?
._*
.Spotlight-V100
.Trashes
ehthumbs.db
Thumbs.db

Session.vim
.netrwhist
*.bak
*~
[._]*.s[a-v][a-z]
[._]*.sw[a-p]
[._]s[a-v][a-z]
[._]sw[a-p]
81 changes: 81 additions & 0 deletions README.md
@@ -0,0 +1,81 @@
WikiTableQuestions Dataset
==========================
Version 0.2

Introduction
------------

The WikiTableQuestions dataset is for the task of question answering on
semi-structured HTML tables as presented in the paper:

> Panupong Pasupat, Percy Liang.
> Compositional Semantic Parsing on Semi-Structured Tables
> Association for Computational Linguistics (ACL), 2015.

Questions and Answers
---------------------

The `data/` directory contains the questions, answers, and the ID of the tables
that each question is asking about.

**Dataset Formats.** Each split of the dataset is stored in 2 formats:

- TSV file. Each row is an example with the following columns:
  - Column 1: Example ID
  - Column 2: Question
  - Column 3: Table ID
  - Column 4, 5, ...: Answer
    (If the answer has multiple entities, multiple columns are used)

- EXAMPLES file. This LispTree format is used internally in our
[SEMPRE](http://nlp.stanford.edu/software/sempre/) code base.
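
As a rough illustration of the TSV layout described above, the following Python sketch parses one row into its four parts. The names `Example` and `parse_example_row` and the sample row are hypothetical and are not part of the dataset or of SEMPRE. The sketch assumes the fields are separated by single tab characters and that the row is not a header line.

```python
from typing import List, NamedTuple


class Example(NamedTuple):
    """One row of a split file (hypothetical container, not a dataset API)."""
    example_id: str
    question: str
    table_id: str
    answers: List[str]


def parse_example_row(line: str) -> Example:
    """Split a tab-separated row into ID, question, table ID, and answers."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 4:
        raise ValueError("expected at least 4 tab-separated columns")
    # Columns 4, 5, ... all hold answers, so collect the remainder in a list.
    example_id, question, table_id, *answers = fields
    return Example(example_id, question, table_id, answers)


if __name__ == "__main__":
    # Invented row for illustration only; real IDs may look different.
    row = "ex-1\twhich albums reached the UK chart?\ttable-0\tRenaissance\tAzure d'Or\n"
    print(parse_example_row(row))
```

Keeping the answers as a list means single-answer and multi-answer rows can be handled by the same code path.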

**Dataset Splits.** We split 22033 examples into multiple sets:

- `training`:
Training data (14152 examples)

- `pristine-unseen-tables`:
Test data -- the tables are *not seen* in training data (4344 examples)

- `pristine-seen-tables`:
Additional data where the tables are *seen* in training data (3537 examples)
(Initially intended to be used as development data, this portion of the dataset
was not actually used in any experiments in the paper.)

- `random-split-*`:
For development, we split training.tsv into 5 random 80-20 splits.
Within each split, tables in the training data (`random-split-seed-*-train`)
and the test data (`random-split-seed-*-test`) are disjoint.
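
One way to obtain a split whose training and test tables are disjoint is to shuffle the table IDs and assign whole tables to one side. The sketch below does that; it is not the script used to produce `random-split-*`. The function name `split_by_table`, the choice to hold out 20% of the *tables* (rather than 20% of the examples), and the seeding scheme are all assumptions made for illustration.

```python
import random
from typing import List, Sequence, Tuple

# Each example is reduced to an (example_id, table_id) pair for this sketch.
Pair = Tuple[str, str]


def split_by_table(
    examples: Sequence[Pair], test_fraction: float = 0.2, seed: int = 1
) -> Tuple[List[Pair], List[Pair]]:
    """Split examples so that no table appears in both train and test."""
    table_ids = sorted({table_id for _, table_id in examples})
    rng = random.Random(seed)  # fixed seed makes the split reproducible
    rng.shuffle(table_ids)
    # Assumption: the held-out fraction is measured in tables, not examples.
    n_test = max(1, round(len(table_ids) * test_fraction))
    test_tables = set(table_ids[:n_test])
    train = [ex for ex in examples if ex[1] not in test_tables]
    test = [ex for ex in examples if ex[1] in test_tables]
    return train, test


if __name__ == "__main__":
    # Invented data: 50 examples spread evenly over 10 tables.
    data = [(f"ex-{i}", f"table-{i % 10}") for i in range(50)]
    train, test = split_by_table(data, seed=1)
    print(len(train), len(test))
```

Because tables differ in how many questions they carry, holding out 20% of the tables only approximates an 80-20 split of the examples.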

For our ACL 2015 paper:

- In development set experiments, we trained on `random-split-seed-{1,2,3}-train`
and tested on `random-split-seed-{1,2,3}-test`, respectively.

- In test set experiments, we trained on `training` and tested on
`pristine-unseen-tables`.

Tables
------

The `csv/` directory contains the extracted tables, while the `html/` directory
contains the raw HTML data.

**Table Formats.**

- `csv/xxx-csv/yyy.csv`:
Comma-separated table (The first row is treated as the column header)

- `csv/xxx-csv/yyy.tsv`:
Tab-separated table

- `csv/xxx-csv/yyy.table`:
Column-aligned table (More human-readable but harder to parse by machines)

- `html/xxx-html/yyy.html`:
Raw HTML file of the whole web page

- `html/xxx-html/yyy.json`:
Metadata including the URL, the page title, and the index of the chosen table
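
A minimal sketch of reading one of the `.csv` tables with Python's standard `csv` module, treating the first row as the column header as described above. The helper names `read_table` and `parse_count` are hypothetical. The sample text copies the first rows of `csv/200-csv/10.csv`, and the handling of thousands separators such as `1,115` is an assumption about how a consumer might want to interpret numeric cells, not something the dataset prescribes.

```python
import csv
import io
from typing import List, Optional, Tuple


def read_table(csv_text: str) -> Tuple[List[str], List[List[str]]]:
    """Return (header, body_rows) from the text of a comma-separated table."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        raise ValueError("table is empty")
    return rows[0], rows[1:]


def parse_count(cell: str) -> Optional[int]:
    """Convert a cell such as '1,115' to 1115; return None if not a plain count."""
    digits = cell.replace(",", "")
    return int(digits) if digits.isdecimal() else None


SAMPLE_CSV = (
    '"year","deaths","# of accidents"\n'
    '"2012","794","700"\n'
    '"2011","828","117"\n'
    '"2010","1,115","130"\n'
)

if __name__ == "__main__":
    header, body = read_table(SAMPLE_CSV)
    deaths = [parse_count(row[header.index("deaths")]) for row in body]
    print(header, deaths)
```

Cells that are not plain counts, such as the `–` placeholders in `csv/200-csv/0.csv`, come back as `None` from `parse_count`, so the caller decides how to treat them.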

15 changes: 15 additions & 0 deletions csv/200-csv/0.csv
@@ -0,0 +1,15 @@
"Year","Title","Chart-Positions","Chart-Positions","Chart-Positions","Comments"
"Year","Title","UK","US","NL","Comments"
"1969","Renaissance","60","–","10",""
"1971","Illusion","–","–","–","1976 (UK)"
"1972","Prologue","–","–","–",""
"1973","Ashes Are Burning","–","171","–",""
"1974","Turn of the Cards","–","94","–","1975 (UK)"
"1975","Scheherazade and Other Stories","–","48","–",""
"1977","Novella","–","46","–","1977 (January in US, August in UK, as the band moved to the Warner Bros Music Group)"
"1978","A Song for All Seasons","35","58","–","UK:Silver"
"1979","Azure d'Or","73","125","–",""
"1981","Camera Camera","–","196","–",""
"1983","Time-Line","–","207","–",""
"2001","Tuscany","–","–","–",""
"2013","Grandine il Vento","–","–","–",""
15 changes: 15 additions & 0 deletions csv/200-csv/0.table
@@ -0,0 +1,15 @@
| Year | Title | Chart-Positions | Chart-Positions | Chart-Positions | Comments |
| Year | Title | UK | US | NL | Comments |
| 1969 | Renaissance | 60 | – | 10 | |
| 1971 | Illusion | – | – | – | 1976 (UK) |
| 1972 | Prologue | – | – | – | |
| 1973 | Ashes Are Burning | – | 171 | – | |
| 1974 | Turn of the Cards | – | 94 | – | 1975 (UK) |
| 1975 | Scheherazade and Other Stories | – | 48 | – | |
| 1977 | Novella | – | 46 | – | 1977 (January in US, August in UK, as the band moved to the Warner Bros Music Group) |
| 1978 | A Song for All Seasons | 35 | 58 | – | UK:Silver |
| 1979 | Azure d'Or | 73 | 125 | – | |
| 1981 | Camera Camera | – | 196 | – | |
| 1983 | Time-Line | – | 207 | – | |
| 2001 | Tuscany | – | – | – | |
| 2013 | Grandine il Vento | – | – | – | |
15 changes: 15 additions & 0 deletions csv/200-csv/0.tsv
@@ -0,0 +1,15 @@
Year Title Chart-Positions Chart-Positions Chart-Positions Comments
Year Title UK US NL Comments
1969 Renaissance 60 10
1971 Illusion 1976 (UK)
1972 Prologue
1973 Ashes Are Burning 171
1974 Turn of the Cards 94 1975 (UK)
1975 Scheherazade and Other Stories 48
1977 Novella 46 1977 (January in US, August in UK, as the band moved to the Warner Bros Music Group)
1978 A Song for All Seasons 35 58 UK:Silver
1979 Azure d'Or 73 125
1981 Camera Camera 196
1983 Time-Line 207
2001 Tuscany
2013 Grandine il Vento
32 changes: 32 additions & 0 deletions csv/200-csv/1.csv
@@ -0,0 +1,32 @@
"Year","Title","Role","Notes"
"1995","Polio Water","Diane","Short film"
"1996","New York Crossing","Drummond","Television film"
"1997","Lawn Dogs","Devon Stockard",""
"1999","Pups","Rocky",""
"1999","Notting Hill","12-Year-Old Actress",""
"1999","The Sixth Sense","Kyra Collins",""
"2000","Paranoid","Theresa",""
"2000","Skipped Parts","Maurey Pierce",""
"2000","Frankie & Hazel","Francesca 'Frankie' Humphries","Television film"
"2001","Lost and Delirious","Mary 'Mouse' Bedford",""
"2001","Julie Johnson","Lisa Johnson",""
"2001","Tart","Grace Bailey",""
"2002","A Ring of Endless Light","Vicky Austin","Television film"
"2003","Octane","Natasha 'Nat' Wilson",""
"2006","The Oh in Ohio","Kristen Taylor",""
"2007","Closing the Ring","Young Ethel Ann",""
"2007","St Trinian's","JJ French",""
"2007","Virgin Territory","Pampinea",""
"2008","Assassination of a High School President","Francesca Fachini",""
"2009","Walled In","Sam Walczak",""
"2009","Homecoming","Shelby Mercer",""
"2010","Don't Fade Away","Kat",""
"2011","You and I","Lana",""
"2012","Into the Dark","Sophia Monet",""
"2012","Ben Banks","Amy",""
"2012","Apartment 1303 3D","Lara Slate",""
"2012","Cyberstalker","Aiden Ashley","Television film"
"2013","Bhopal: A Prayer for Rain","Eva Gascon",""
"2013","A Resurrection","Jessie","Also producer"
"2013","L.A. Slasher","The Actress",""
"2013","Gutsy Frog","Ms. Monica","Television film"
32 changes: 32 additions & 0 deletions csv/200-csv/1.table
@@ -0,0 +1,32 @@
| Year | Title | Role | Notes |
| 1995 | Polio Water | Diane | Short film |
| 1996 | New York Crossing | Drummond | Television film |
| 1997 | Lawn Dogs | Devon Stockard | |
| 1999 | Pups | Rocky | |
| 1999 | Notting Hill | 12-Year-Old Actress | |
| 1999 | The Sixth Sense | Kyra Collins | |
| 2000 | Paranoid | Theresa | |
| 2000 | Skipped Parts | Maurey Pierce | |
| 2000 | Frankie & Hazel | Francesca 'Frankie' Humphries | Television film |
| 2001 | Lost and Delirious | Mary 'Mouse' Bedford | |
| 2001 | Julie Johnson | Lisa Johnson | |
| 2001 | Tart | Grace Bailey | |
| 2002 | A Ring of Endless Light | Vicky Austin | Television film |
| 2003 | Octane | Natasha 'Nat' Wilson | |
| 2006 | The Oh in Ohio | Kristen Taylor | |
| 2007 | Closing the Ring | Young Ethel Ann | |
| 2007 | St Trinian's | JJ French | |
| 2007 | Virgin Territory | Pampinea | |
| 2008 | Assassination of a High School President | Francesca Fachini | |
| 2009 | Walled In | Sam Walczak | |
| 2009 | Homecoming | Shelby Mercer | |
| 2010 | Don't Fade Away | Kat | |
| 2011 | You and I | Lana | |
| 2012 | Into the Dark | Sophia Monet | |
| 2012 | Ben Banks | Amy | |
| 2012 | Apartment 1303 3D | Lara Slate | |
| 2012 | Cyberstalker | Aiden Ashley | Television film |
| 2013 | Bhopal: A Prayer for Rain | Eva Gascon | |
| 2013 | A Resurrection | Jessie | Also producer |
| 2013 | L.A. Slasher | The Actress | |
| 2013 | Gutsy Frog | Ms. Monica | Television film |
32 changes: 32 additions & 0 deletions csv/200-csv/1.tsv
@@ -0,0 +1,32 @@
Year Title Role Notes
1995 Polio Water Diane Short film
1996 New York Crossing Drummond Television film
1997 Lawn Dogs Devon Stockard
1999 Pups Rocky
1999 Notting Hill 12-Year-Old Actress
1999 The Sixth Sense Kyra Collins
2000 Paranoid Theresa
2000 Skipped Parts Maurey Pierce
2000 Frankie & Hazel Francesca 'Frankie' Humphries Television film
2001 Lost and Delirious Mary 'Mouse' Bedford
2001 Julie Johnson Lisa Johnson
2001 Tart Grace Bailey
2002 A Ring of Endless Light Vicky Austin Television film
2003 Octane Natasha 'Nat' Wilson
2006 The Oh in Ohio Kristen Taylor
2007 Closing the Ring Young Ethel Ann
2007 St Trinian's JJ French
2007 Virgin Territory Pampinea
2008 Assassination of a High School President Francesca Fachini
2009 Walled In Sam Walczak
2009 Homecoming Shelby Mercer
2010 Don't Fade Away Kat
2011 You and I Lana
2012 Into the Dark Sophia Monet
2012 Ben Banks Amy
2012 Apartment 1303 3D Lara Slate
2012 Cyberstalker Aiden Ashley Television film
2013 Bhopal: A Prayer for Rain Eva Gascon
2013 A Resurrection Jessie Also producer
2013 L.A. Slasher The Actress
2013 Gutsy Frog Ms. Monica Television film
15 changes: 15 additions & 0 deletions csv/200-csv/10.csv
@@ -0,0 +1,15 @@
"year","deaths","# of accidents"
"2012","794","700"
"2011","828","117"
"2010","1,115","130"
"2009","1,103","122"
"2008","884","156"
"2007","971","147"
"2006","1,294","166"
"2005","1,459","185"
"2004","771","172"
"2003","1,230","199"
"2002","1,413","185"
"2001","4,140","200"
"2000","1,582","189"
"1999","1,138","211"
15 changes: 15 additions & 0 deletions csv/200-csv/10.table
@@ -0,0 +1,15 @@
| year | deaths | # of accidents |
| 2012 | 794 | 700 |
| 2011 | 828 | 117 |
| 2010 | 1,115 | 130 |
| 2009 | 1,103 | 122 |
| 2008 | 884 | 156 |
| 2007 | 971 | 147 |
| 2006 | 1,294 | 166 |
| 2005 | 1,459 | 185 |
| 2004 | 771 | 172 |
| 2003 | 1,230 | 199 |
| 2002 | 1,413 | 185 |
| 2001 | 4,140 | 200 |
| 2000 | 1,582 | 189 |
| 1999 | 1,138 | 211 |
15 changes: 15 additions & 0 deletions csv/200-csv/10.tsv
@@ -0,0 +1,15 @@
year deaths # of accidents
2012 794 700
2011 828 117
2010 1,115 130
2009 1,103 122
2008 884 156
2007 971 147
2006 1,294 166
2005 1,459 185
2004 771 172
2003 1,230 199
2002 1,413 185
2001 4,140 200
2000 1,582 189
1999 1,138 211
28 changes: 28 additions & 0 deletions csv/200-csv/11.csv
@@ -0,0 +1,28 @@
"Award","Category","Nominee","Result"
"Academy Awards, 1972","Best Picture","Phillip D'Antoni","Won"
"Academy Awards, 1972","Best Director","William Friedkin","Won"
"Academy Awards, 1972","Best Actor","Gene Hackman","Won"
"Academy Awards, 1972","Best Adapted Screenplay","Ernest Tidyman","Won"
"Academy Awards, 1972","Film Editing","Gerald B. Greenberg","Won"
"Academy Awards, 1972","Best Supporting Actor","Roy Scheider","Nominated"
"Academy Awards, 1972","Best Cinematography","Owen Roizman","Nominated"
"Academy Awards, 1972","Best Sound","Theodore Soderberg\\nChristopher Newman","Nominated"
"American Cinema Editors, 1972","Best Edited Feature Film","Gerald B. Greenberg","Nominated"
"BAFTA, 1972","Best Actor","Gene Hackman","Won"
"BAFTA, 1972","Best Film Editing","Gerald B. Greenberg","Won"
"BAFTA, 1972","Best Direction","William Friedkin","Nominated"
"BAFTA, 1972","Best Film","Philip D'Antoni","Nominated"
"BAFTA, 1972","Best Sound Track","Christopher Newman\\nTheodore Soderberg","Nominated"
"David di Donatello Award, 1972","Best Foreign Film","Philip D'Antoni","Won"
"Directors Guild of America, 1972","Outstanding Directorial Achievement","William Friedkin","Won"
"Edgar Allan Poe Awards, 1972","Best Motion Picture","Ernest Tidyman","Won"
"Golden Globe Awards, 1972","Best Motion Picture","Phillip D'Antoni","Won"
"Golden Globe Awards, 1972","Best Director","William Friedkin","Won"
"Golden Globe Awards, 1972","Best Actor","Gene Hackman","Won"
"Golden Globe Awards, 1972","Best Screenplay","Ernest Tidyman","Nominated"
"Kansas City Film Critics Circle, 1972","Best Actor","Gene Hackman","Won"
"Kansas City Film Critics Circle, 1972","Best Film","Ernest Tidyman","Won"
"National Society of Film Critics, 1972","Best Actor","Gene Hackman","Nominated"
"New York Film Critics Circle, 1971","Best Actor","Gene Hackman","Won"
"New York Film Critics Circle, 1971","Best Film","Ernest Tidyman","Nominated"
"Writers Guild of America, 1972","Best Drama Adaptation","Ernest Tidyman","Nominated"
28 changes: 28 additions & 0 deletions csv/200-csv/11.table
@@ -0,0 +1,28 @@
| Award | Category | Nominee | Result |
| Academy Awards, 1972 | Best Picture | Phillip D'Antoni | Won |
| Academy Awards, 1972 | Best Director | William Friedkin | Won |
| Academy Awards, 1972 | Best Actor | Gene Hackman | Won |
| Academy Awards, 1972 | Best Adapted Screenplay | Ernest Tidyman | Won |
| Academy Awards, 1972 | Film Editing | Gerald B. Greenberg | Won |
| Academy Awards, 1972 | Best Supporting Actor | Roy Scheider | Nominated |
| Academy Awards, 1972 | Best Cinematography | Owen Roizman | Nominated |
| Academy Awards, 1972 | Best Sound | Theodore Soderberg Christopher Newman | Nominated |
| American Cinema Editors, 1972 | Best Edited Feature Film | Gerald B. Greenberg | Nominated |
| BAFTA, 1972 | Best Actor | Gene Hackman | Won |
| BAFTA, 1972 | Best Film Editing | Gerald B. Greenberg | Won |
| BAFTA, 1972 | Best Direction | William Friedkin | Nominated |
| BAFTA, 1972 | Best Film | Philip D'Antoni | Nominated |
| BAFTA, 1972 | Best Sound Track | Christopher Newman Theodore Soderberg | Nominated |
| David di Donatello Award, 1972 | Best Foreign Film | Philip D'Antoni | Won |
| Directors Guild of America, 1972 | Outstanding Directorial Achievement | William Friedkin | Won |
| Edgar Allan Poe Awards, 1972 | Best Motion Picture | Ernest Tidyman | Won |
| Golden Globe Awards, 1972 | Best Motion Picture | Phillip D'Antoni | Won |
| Golden Globe Awards, 1972 | Best Director | William Friedkin | Won |
| Golden Globe Awards, 1972 | Best Actor | Gene Hackman | Won |
| Golden Globe Awards, 1972 | Best Screenplay | Ernest Tidyman | Nominated |
| Kansas City Film Critics Circle, 1972 | Best Actor | Gene Hackman | Won |
| Kansas City Film Critics Circle, 1972 | Best Film | Ernest Tidyman | Won |
| National Society of Film Critics, 1972 | Best Actor | Gene Hackman | Nominated |
| New York Film Critics Circle, 1971 | Best Actor | Gene Hackman | Won |
| New York Film Critics Circle, 1971 | Best Film | Ernest Tidyman | Nominated |
| Writers Guild of America, 1972 | Best Drama Adaptation | Ernest Tidyman | Nominated |