The number of QA pairs in WebQSP test dataset #86

Closed
novice7 opened this issue Jun 27, 2021 · 20 comments

Comments


novice7 commented Jun 27, 2021

Hi! I find that the number of questions in https://github.com/malllabiisc/EmbedKGQA/blob/master/KGQA/RoBERTa/webqsp_scores_full_kg.pkl is less than the number in the original WebQuestionsSP test dataset!

I know you only need to select 100 questions in the relation-matching part to get the result: 66.6.
I tried using all the questions in webqsp_scores_full_kg.pkl and the result is also 66.6.
I wonder whether the result would still be 66.6 if EmbedKGQA were tested on the original WebQuestionsSP test dataset.
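
For anyone who wants to reproduce the count, here is a minimal sketch for checking how many entries the released score file contains; the exact structure of the pickle is not documented in this thread, so the length check is an assumption about its layout:

```python
import pickle

# Count the entries in the released score file and compare against the 1639
# questions in the official WebQuestionsSP test set.
with open("webqsp_scores_full_kg.pkl", "rb") as f:
    scores = pickle.load(f)

# The pickle might be a list/dict of per-question scores or an array; handle both.
n = len(scores) if hasattr(scores, "__len__") else getattr(scores, "shape", (None,))[0]
print(f"entries in score file: {n} (official test set: 1639 questions)")
```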


xqx1568 commented Jul 8, 2021

Could you provide an answer file for me? I cannot run this code right now, and I want to get all the answers for the WebQSP test set. Thank you!
email: xqx1568@163.com


dungtn commented Jul 10, 2021

Hi, I also have the same question. I ran the code and got ~66% hits@1, but this is on 1596 test questions instead of the 1639 questions in the test set used by other baselines (PullNet, GraftNet, etc.).

I assume the 66% reported in the paper is meant to be comparable to the baselines, so the 43 missing questions (= 1639 - 1596) should be counted as incorrect, which would put the number at roughly 63-64%? Then how do I replicate the 66% reported in the paper? Am I missing something?
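
A back-of-the-envelope version of that adjustment (taking the ~66% hits@1 on 1596 questions as given and counting the 43 missing questions as incorrect):

```python
# Rough adjustment: treat the 43 questions absent from the released test file as wrong.
hits_at_1_observed = 0.66          # hits@1 measured on the 1596-question file
n_answered, n_total = 1596, 1639   # released file vs. official WebQSP test set

adjusted = hits_at_1_observed * n_answered / n_total
print(f"adjusted hits@1 over all {n_total} questions: {adjusted:.3f}")  # ~0.643
```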

novice7 (Author) commented Jul 11, 2021

I'm sorry, maybe my expression was wrong. I mean that with all 1596 questions we can get the 66.6 result.

I am also very confused about the author's results.

However, I noticed that when the author trained the WebQuestionsSP dataset with ComplEx embeddings, the answers to the 43 ignored questions were not in the entity set, and most of them are values rather than entities.

@apoorvumang (Collaborator)

Yes, 43 questions are missing from our dataset for some reason. It might have been due to an incorrect dataset download on my side. On inspection, I found that 42/43 of these questions are the last 42 questions of the original WebQuestionsSP test JSON file. The train file seems to contain all but 1 question (which shouldn't matter IMO).

I am adding these to the test file and will commit the corrected test file once I re-run the experiment on this. I'll also update the results in case there's a significant change from 66.6.

@apoorvumang (Collaborator)

Here's the corrected test file in case you want to use it on your trained models @novice7 @dungtn
qa_test_webqsp_fixed.txt

novice7 (Author) commented Jul 11, 2021

If you get the result of the re-run, please also update data/fbwq_full and embeddings/complex_fbwq_full/checkpoint_best.pt @apoorvumang, because I noticed that some of the entities and answers mentioned in the 43 questions don't seem to be in data/fbwq_full/entities.dict.
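
A rough way to check that, assuming the qa file keeps the head entity in square brackets with tab-separated, '|'-delimited answers, and that entities.dict has the entity name as its last tab-separated column; both formats are assumptions, so adjust the parsing if the files differ:

```python
import re

# Entity vocabulary of the pruned KG; assume the entity name is the last
# tab-separated field on each line (adjust if the column order differs).
with open("data/fbwq_full/entities.dict") as f:
    entities = {line.rstrip("\n").split("\t")[-1] for line in f if line.strip()}

missing = set()
with open("qa_test_webqsp_fixed.txt") as f:
    for line in f:
        if "\t" not in line:
            continue
        question, answers = line.rstrip("\n").split("\t", 1)
        head = re.search(r"\[(.*?)\]", question).group(1)   # head entity in [brackets]
        for ent in [head] + answers.split("|"):
            if ent not in entities:
                missing.add(ent)

print(f"{len(missing)} head/answer entities from the fixed test file are not in entities.dict")
```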


dungtn commented Jul 12, 2021

@apoorvumang Thank you for the prompt reply. Can you also release the code for generating qa_{train, test}_webqsp.txt? In particular, what entity linker did you use to get the head entity? Upon comparison with the gold entities given by WebQSP, only 696/1639 (42.47%) questions have the correct head entity, yet for some of these questions the model is still able to get the correct answers. I'm quite confused at this point, so thank you for being patient with my questions :-)

@apoorvumang (Collaborator)

@dungtn I used the topic entity in the parse of the question as the head entity (as available in the JSON here: https://www.microsoft.com/en-us/download/details.aspx?id=52763).

What method did you use that got you only 696/1639 correct gold entities?

@apoorvumang (Collaborator)

@dungtn I just verified: all 1639 head entities in the file qa_test_webqsp_[fixed].txt are contained as a 'TopicEntityMid' in at least one parse of the corresponding question.

@apoorvumang (Collaborator)

@novice7 Yes, some entities are unfortunately not there, since these questions probably were not considered during data preprocessing, which involved pruning the Freebase KG. It seems like a hard fix right now to add these entities plus their neighbourhoods (as done in the data preprocessing) and then retrain the ComplEx embeddings on that KG. I might do it some time in the future.

Still, since some of those entities are there in the KG, we should be able to get at least a few correct.
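
For anyone attempting that fix locally, a minimal sketch of collecting the missing entities' 1-hop neighbourhood before retraining; it assumes the KG is stored as tab-separated (head, relation, tail) lines, and the seed MIDs and file name below are placeholders, not the actual missing entities or repo paths:

```python
def one_hop_neighbourhood(triples_path, seed_entities):
    """Return all triples that touch any seed entity (its 1-hop neighbourhood)."""
    kept = []
    with open(triples_path) as f:
        for line in f:
            head, rel, tail = line.rstrip("\n").split("\t")
            if head in seed_entities or tail in seed_entities:
                kept.append((head, rel, tail))
    return kept

# Placeholder seeds: in practice these would be the entities from the 43 restored
# questions that are absent from entities.dict.
seeds = {"m.0abc12", "m.0xyz34"}
extra = one_hop_neighbourhood("freebase_triples.txt", seeds)
print(f"{len(extra)} triples to merge into the pruned KG before retraining ComplEx")
```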

@apoorvumang (Collaborator)

I got an accuracy of 0.66182 with the extra questions. I trained from scratch, so I'm still not sure this is the best performance (I need to do hyperparameter tuning). Although this is less than the reported 66.6, I won't change the results yet until I am able to run a hyperparameter search. I'm running a little tight on compute resources right now.


dungtn commented Jul 12, 2021

> @dungtn I used the topic entity in the parse of the question as the head entity (as available in the JSON here: https://www.microsoft.com/en-us/download/details.aspx?id=52763).
>
> What method did you use that got you only 696/1639 correct gold entities?

@apoorvumang I'm using the same gold dataset. I found the problem: since no question id comes with qa_test_webqsp_[fixed].txt, I just compared them line by line. Can you also provide a file with question ids to make verification easier?

@apoorvumang (Collaborator)

Unfortunately I do not have a file with question ids; however, you can use the 'ProcessedQuestion' field as the key @dungtn.

@apoorvumang (Collaborator)

This is the code I used. You can download it like this:

wget http://transfer.sh/1p9wRfh/nb.ipynb
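
In case that link goes stale, here is a small sketch of the same kind of check (not the original notebook): key the official test JSON by 'ProcessedQuestion', as suggested above, and verify that the bracketed head entity appears as a 'TopicEntityMid' in at least one parse. Recovering the question text by simply dropping the bracketed MID is an assumption about the qa file format:

```python
import json
import re

# Official WebQuestionsSP test file from the Microsoft download page linked above.
with open("WebQSP.test.json") as f:
    gold = {q["ProcessedQuestion"]: q for q in json.load(f)["Questions"]}

unmatched, bad_head = 0, 0
with open("qa_test_webqsp_fixed.txt") as f:
    for line in f:
        qa_question = line.rstrip("\n").split("\t")[0]
        head_mid = re.search(r"\[(.*?)\]", qa_question).group(1)
        # Assumption: dropping the bracketed MID recovers the processed question text.
        text = " ".join(re.sub(r"\[[^\]]*\]", "", qa_question).split())
        q = gold.get(text)
        if q is None:
            unmatched += 1
        elif head_mid not in {p.get("TopicEntityMid") for p in q["Parses"]}:
            bad_head += 1

print(f"{unmatched} questions not matched by text; "
      f"{bad_head} head entities not found as a TopicEntityMid in any parse")
```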


dungtn commented Jul 12, 2021

@apoorvumang Thank you! That checks out; the head entities always have a match!

@apoorvumang (Collaborator)

The issue has been fixed in the latest commit. You can use the new score file provided (or your own trained model's scores, of course).

@albert-jin (Contributor)

[screenshot of the dataset file with the missing relation paths circled in red]
I think the problem is caused by the missing relation paths shown in the red circle, so the discrepancy between 43 and 42 may be caused by this issue.

@apoorvumang (Collaborator)

Nice catch! I'll fix this, thanks @albert-jin

@albert-jin (Contributor)

I want to make some improvements on your experiments over the next few weeks, and I plan to publish a paper at WSDM. I hope to get your consent, and I will list you as the third author. Is that OK? @apoorvumang

apoorvumang (Collaborator) commented Jul 17, 2021

Of course, you are welcome to make any improvements/changes @albert-jin. You don't need to add me as an author.
