Update groups, countries, Eurovoc terms #1082

Merged
merged 6 commits into from
Feb 3, 2025


tillprochaska

This fixes the last remaining task in #960.

I’ve also implemented a few small changes to sort records and list values. This should make it much easier to diff the resulting JSON files. (Right now, rerunning some of the `htv load-*` commands can lead to large diffs, even when little or nothing has actually changed.)

I also had to change how we send SPARQL requests, because some SPARQL endpoints now only accept POST requests. (We were using GET requests before.)

@tillprochaska tillprochaska requested a review from linusha January 18, 2025 10:51
@tillprochaska tillprochaska force-pushed the static-data branch 3 times, most recently from 9732af7 to 185220e Compare January 18, 2025 10:56

The issue you were seeing should now be fixed. We should probably reaggregate votes after deployment using `htv aggregate votes` to make sure these changes are also reflected in the search index.

These values are sets, i.e. their order isn’t stable, which means there will often be large diffs even when little or nothing has changed.
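To illustrate the idea (this is a minimal sketch, not the code from this PR): sets can be converted to sorted lists and records sorted by a key before writing JSON, so the output no longer depends on iteration order. The function name `to_stable_json` and the `id` key are assumptions made for this example.

```python
import json


def to_stable_json(records, key="id"):
    """Serialize records deterministically (hypothetical helper).

    Sets are converted to sorted lists and records are sorted by `key`,
    so the same data always produces the same JSON text.
    """
    normalized = []
    for record in records:
        normalized.append(
            {
                name: sorted(value) if isinstance(value, (set, frozenset)) else value
                for name, value in record.items()
            }
        )
    # Sort the records themselves so their order in the file is stable too.
    normalized.sort(key=lambda record: record[key])
    # sort_keys additionally fixes the order of properties within each record.
    return json.dumps(normalized, indent=2, sort_keys=True, ensure_ascii=False)
```

With this approach, two runs that produce the same records in a different order serialize to identical text, so a diff only shows real changes.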
Some of the SPARQL endpoints we use now block GET requests that pass the SPARQL query as a URL query parameter. POST requests work fine for all endpoints.
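For reference, a rough sketch of what sending the query via POST can look like, using only the standard library. This is not the implementation from this PR; the helper names `build_sparql_request` and `run_sparql_query` are made up for the example. It follows the form-encoded POST variant of the SPARQL 1.1 Protocol, where the query goes into the request body instead of the URL.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_sparql_request(endpoint, query):
    """Build a POST request that carries the SPARQL query in the body (hypothetical helper)."""
    body = urlencode({"query": query}).encode("utf-8")
    return Request(
        endpoint,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Accept": "application/sparql-results+json",
        },
    )


def run_sparql_query(endpoint, query, timeout=30):
    """Send the query and return the result bindings (assumes a JSON results response)."""
    with urlopen(build_sparql_request(endpoint, query), timeout=timeout) as response:
        return json.load(response)["results"]["bindings"]
```

Because the query is no longer part of the URL, this also avoids URL length limits for long queries.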
We previously used the DC Terms `identifier` property. However, this property is sometimes ambiguous. For example, the domain concept "AGRICULTURE, FORESTRY AND FISHERIES" has the DC Terms identifier `56`, but the correct ID is `100156`. Instead, we now extract the ID from the resource URI (`http://eurovoc.europa.eu/100156` in this example).
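A small sketch of the URI-based extraction, assuming the ID is always the last path segment of the resource URI and is purely numeric (both are assumptions for this example, as is the function name `eurovoc_id_from_uri`):

```python
from urllib.parse import urlparse


def eurovoc_id_from_uri(uri):
    """Return the last path segment of a EuroVoc resource URI (hypothetical helper)."""
    path = urlparse(uri).path.rstrip("/")
    concept_id = path.rsplit("/", 1)[-1]
    # Assumption: concept IDs are numeric. Fail loudly on anything else.
    if not concept_id.isdigit():
        raise ValueError(f"Unexpected EuroVoc URI: {uri!r}")
    return concept_id


print(eurovoc_id_from_uri("http://eurovoc.europa.eu/100156"))  # prints 100156
```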
@linusha linusha merged commit dec36b3 into main Feb 3, 2025
2 checks passed
@linusha linusha deleted the static-data branch February 3, 2025 15:58