This repository has been archived by the owner on May 8, 2024. It is now read-only.

Go through the list of MPs and check if they are included in wikidata #245

Closed
MansMeg opened this issue Mar 8, 2023 · 14 comments
Labels
manual annotation An issue that needs some manual work to be fixed

Comments

@MansMeg
Collaborator

MansMeg commented Mar 8, 2023

Here is a list/csv with all MPs in the book "tvåkammar riksdagen":
https://github.com/salgo60/Wikidata_riksdagen-corpus/blob/main/data/WD%20Register%20%C3%B6ver%20riksdagsmannabiografier%20band%201-5.txt

Checking whether they exist in Wikidata can probably also be done programmatically, at least to produce suggestions.
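A minimal sketch of what such a programmatic check could look like, assuming the public Wikidata `wbsearchentities` API is used to look up names. The function names, the User-Agent string, and the choice of Swedish as search language are assumptions for illustration and not part of our pipeline. The hits are only suggestions and still need a manual look, since many people share a name.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://www.wikidata.org/w/api.php"


def build_search_url(name, language="sv", limit=5):
    """Build a wbsearchentities request URL for a person's name."""
    params = {
        "action": "wbsearchentities",
        "search": name,
        "language": language,
        "type": "item",
        "limit": limit,
        "format": "json",
    }
    return API_URL + "?" + urllib.parse.urlencode(params)


def extract_candidates(response):
    """Return (id, label, description) tuples from a parsed API response."""
    return [
        (hit["id"], hit.get("label", ""), hit.get("description", ""))
        for hit in response.get("search", [])
    ]


def search_wikidata(name):
    """Fetch candidate items over the network for one name."""
    request = urllib.request.Request(
        build_search_url(name),
        # Hypothetical User-Agent; replace with a real contact string.
        headers={"User-Agent": "mp-register-check/0.1 (example script)"},
    )
    with urllib.request.urlopen(request) as reply:
        return extract_candidates(json.load(reply))
```

Calling `search_wikidata` with a name from the register would return a short list of candidate items to compare against the book entry by hand.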

Those marked with wikidata_id have already been included. Two people could help out by going through that list and adding the minimum information needed for us to include them in our MP database. So the manual work would be to:

  1. Go through the list and, for each person missing a wikidata_id:
    a. Check whether they already exist in Wikidata.
    b. If they exist: add the wikidata_id to the txt file.
    c. Otherwise: make a minimal addition of the person to Wikidata, including the basic content from the tvåkammarriksdagen book:
      - The name
      - Alias (the name used locally or in the Riksdag, "iort/iriksdagen kallad")
      - Sex, date of birth, date of death (if in the book)
      - described by source, where Tvåkammarriksdagsboken is referred to with the correct.
      - position held. Add one per period in the parliament as an MP (for the correct chamber). See the Tvåkammar book.

Example:
https://www.wikidata.org/wiki/Q1712821
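The minimum information listed above could also be checked mechanically once an item exists. This is a rough sketch, assuming the item is available as a parsed Wikidata entity JSON object with `labels`, `aliases`, and `claims` keys. The function name and the choice of Swedish labels are assumptions. The property IDs are the standard Wikidata ones: P21 (sex or gender), P569 (date of birth), P1343 (described by source), and P39 (position held). Date of death is left out of the check because it is only added if it is in the book.

```python
# Claims that every minimally described MP item should carry.
REQUIRED_CLAIMS = {
    "P21": "sex or gender",
    "P569": "date of birth",
    "P1343": "described by source",
    "P39": "position held",
}


def missing_minimum_fields(entity, language="sv"):
    """List which of the minimum fields are absent from an entity dict."""
    missing = []
    if language not in entity.get("labels", {}):
        missing.append("name (label)")
    if not entity.get("aliases", {}).get(language):
        missing.append("alias")
    claims = entity.get("claims", {})
    for prop, description in REQUIRED_CLAIMS.items():
        if not claims.get(prop):
            missing.append(f"{description} ({prop})")
    return missing
```

An empty list would mean the item has at least one value for each field. It says nothing about whether the values are correct, so that part stays manual.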

MPs currently missing in wikidata can be found here: #249

@MansMeg MansMeg added the manual annotation label Mar 8, 2023
@fredrik1984
Collaborator

fredrik1984 commented Mar 8, 2023

Great! Have I understood correctly that this is the list after the work Sälgö has done with biografibanden for the bicameral riksdag? And that it is for us to do the remaining work manually?

Some 1200 persons in the list do not have a Wikidata link.

@MansMeg
Collaborator Author

MansMeg commented Mar 8, 2023

Exactly. This would be a first step to check that all persons are in our database. So it should be quite straightforward to do. I tested it, and all you need to do is propose an update to Magnus's list if you find a person.

So I guess this is of high priority to do.

@fredrik1984
Collaborator

Ok, good! Yes, this should be of high priority. Maybe Lotta and Mattias can help out here.

@MansMeg
Collaborator Author

MansMeg commented Mar 8, 2023

I think anyone with the tvåkammar book can do this.

@ninpnin
Collaborator

ninpnin commented Mar 13, 2023

Should we also do a sample manually? Programmatic checking is only so reliable.

@MansMeg
Collaborator Author

MansMeg commented Mar 13, 2023

I think we could do that. But we are now going through the register of the tvåkammar book, so I guess it might not be worthwhile, at least not now (given the cost versus the potential benefit), simply because I think we might only have a handful missing in Wikidata. I think running the wiki IDs from salgo's list (after Emil's checks) and comparing them with our database would be more worthwhile. Then we would catch Wikidata people missing in our database and hence find potential errors directly.
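Roughly, the comparison could be a plain set difference in both directions. This is only a sketch: it assumes the IDs in the register file can be picked out with a regular expression for `Q` followed by digits, and that the database IDs are available as a set of strings. Both function names are hypothetical.

```python
import re

# Wikidata item IDs look like "Q" followed by a positive integer.
QID_PATTERN = re.compile(r"\bQ[1-9]\d*\b")


def extract_qids(text):
    """Collect all Wikidata item IDs that occur in a block of text."""
    return set(QID_PATTERN.findall(text))


def compare_ids(register_ids, database_ids):
    """Report IDs present on one side but not on the other."""
    return {
        "missing_in_database": sorted(register_ids - database_ids),
        "missing_in_register": sorted(database_ids - register_ids),
    }
```

IDs under `missing_in_database` would be people in the register that our filter does not pick up. IDs under `missing_in_register` would be people in our database that the register does not list, which could point to errors on either side.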

@ninpnin
Collaborator

ninpnin commented Mar 13, 2023

> Then we would catch wikidata people missing in our database and hence find potential errors directly.

Why would we be missing people that exist in Wikidata? We scrape them directly from there.

@MansMeg
Collaborator Author

MansMeg commented Mar 13, 2023

But I'm not sure how you filter them, i.e. how you define the set of MPs from Wikidata. Is it individuals that have a position held in the parliament? Some persons might not have that statement, so we miss them when we filter.

Does that make more sense?
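To make the concern concrete, here is a sketch of the kind of filter I have in mind. It is not the scraper's actual query. It assumes the set of MPs is defined by a "position held" (P39) statement pointing at one of a few position items. The function name is hypothetical, and the position item IDs are left as a parameter because I have not checked which ones the scraper uses.

```python
import re


def build_mp_query(position_qids):
    """Build a SPARQL query selecting persons holding any given position."""
    for qid in position_qids:
        # Reject anything that is not a plain Wikidata item ID.
        if not re.fullmatch(r"Q[1-9]\d*", qid):
            raise ValueError(f"not a Wikidata item id: {qid!r}")
    values = " ".join(f"wd:{qid}" for qid in position_qids)
    return (
        "SELECT DISTINCT ?person WHERE {\n"
        f"  VALUES ?position {{ {values} }}\n"
        "  ?person p:P39/ps:P39 ?position .\n"
        "}"
    )
```

A person who exists in Wikidata but has no P39 statement for one of those positions would never be returned by a query like this, which is the kind of miss described above.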

@ninpnin
Collaborator

ninpnin commented Mar 13, 2023

Fair enough. But I think missing some people when adding thousands of them is also likely.

@MansMeg
Collaborator Author

MansMeg commented Mar 13, 2023

Yes. But if we look at the differences we are probably going to catch both these types of errors. =)

@MansMeg
Collaborator Author

MansMeg commented Mar 13, 2023

Also, the roll call project will probably capture some missing MPs as well.

@ninpnin
Collaborator

ninpnin commented Mar 13, 2023

I mean the list is not going to include anyone who hasn't been added to Wikidata?

@MansMeg
Collaborator Author

MansMeg commented Mar 13, 2023

Yes, it will. The list is the register from the tvåkammar book. Magnus has extracted all the names from its last pages and has started to go through them to check that they are all in Wikidata. Those that are missing are added. Those that are already in Wikidata just get the Wikidata ID in his file.

@MansMeg MansMeg added this to the v0.5.x milestone Mar 17, 2023
@MansMeg MansMeg modified the milestones: v0.5.x, MP database push Apr 2, 2023
@BobBorges
Collaborator

close?

@MansMeg MansMeg closed this as completed May 24, 2023

5 participants