Expand and Refactor Latin Verb Query to Focus on Present Tense Forms #495

Open
wants to merge 5 commits into
base: main

Conversation

KesharwaniArpita
Contributor

Contributor checklist


Description

This PR expands and refactors the existing SPARQL query so that, for now, it covers only present tense forms of Latin (Q397) verbs. The changes include:

  • Simplified the query to filter verb forms by Present Tense using a single VALUES block.
  • Removed redundant specifications for tense in each optional block by centralizing the Present Tense filter.
  • Maintained retrieval of all necessary conjugations (Imperative, Subjunctive, Indicative) while limiting results to Present Tense forms.
  • Updated all OPTIONAL blocks to align with this restriction, ensuring they cover Subjunctive, Imperative, and Indicative for different persons (First, Second, Third; Singular and Plural).
  • Improved query readability and maintainability by centralizing tense filtering.
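
As a rough illustration of the centralized tense filter described in the list above, the following Python sketch assembles a SPARQL query string with a single VALUES block for the tense and one OPTIONAL block per form. It is not the query from this PR: the helper name `build_tense_query`, the variable names, and every QID other than Q397 are assumptions made for illustration and should be checked against Wikidata before use.

```python
# Hypothetical sketch, not the query from this PR.
# All QIDs except Q397 are assumptions; verify each on Wikidata first.
PRESENT_TENSE = "Q192613"  # assumed: present tense

# Output variable name -> feature QIDs other than tense (all assumed).
FORMS = {
    "indicativeFirstPersonSingular": ["Q682111", "Q21714344", "Q110786"],
    "indicativeSecondPersonSingular": ["Q682111", "Q51929049", "Q110786"],
    "subjunctiveFirstPersonSingular": ["Q473746", "Q21714344", "Q110786"],
}


def build_tense_query(forms, tense_qid=PRESENT_TENSE):
    """Return a SPARQL string that states the tense once in a VALUES block."""
    select_vars = " ".join(f"?{name}" for name in forms)
    optional_blocks = []
    for name, features in forms.items():
        feature_list = ", ".join(f"wd:{qid}" for qid in features)
        optional_blocks.append(
            "  OPTIONAL {\n"
            f"    ?lexeme ontolex:lexicalForm ?{name}Form .\n"
            f"    ?{name}Form ontolex:representation ?{name} ;\n"
            f"      wikibase:grammaticalFeature ?tense, {feature_list} .\n"
            "  }"
        )
    return "\n".join(
        [
            f"SELECT ?lexeme ?lemma {select_vars} WHERE {{",
            "  ?lexeme dct:language wd:Q397 ;",
            "    wikibase:lexicalCategory wd:Q24905 ;",
            "    wikibase:lemma ?lemma .",
            "  # The tense is stated once and reused by every OPTIONAL block.",
            f"  VALUES ?tense {{ wd:{tense_qid} }}",
            *optional_blocks,
            "}",
        ]
    )


print(build_tense_query(FORMS))
```

Because the tense is a parameter in this sketch, another tense could in principle reuse the same template by passing a different `tense_qid`, which matches the direction outlined under Future Work.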

Testing:

  • Ran the query on the Wikidata Query Service to ensure the correct present tense verb forms are returned.
  • Verified that forms such as Subjunctive Active, Imperative Active, and Indicative Passive are returned correctly for all persons (Singular and Plural).

Future Work:

  1. Past Tense: A similar refactor can retrieve past tense verb forms (available on Wikidata). These can be added using a VALUES block that filters for the relevant grammatical features, such as:

  2. Future Tense: The query can be extended to retrieve future tense forms by adding future-specific grammatical features:

Both Past and Future tense forms can follow the same pattern as this Present Tense query, with additional VALUES blocks or separate queries focused on those tenses.

Impact:

The refactor improves the clarity and performance of the current query by focusing on a single tense while maintaining flexibility for future enhancements to cover other tenses.

Related issue


Thank you for the pull request!

The Scribe team will do our best to address your contribution as soon as we can. The following is a checklist for maintainers to make sure this process goes as well as possible. Feel free to address the points below yourself in further commits if you realize that actions are needed :)

If you're not already a member of our public Matrix community, please consider joining! We'd suggest using Element as your Matrix client, and definitely join the General and Data rooms once you're in. Also consider joining our bi-weekly Saturday dev syncs. It'd be great to have you!

Maintainer checklist

  • The linting and formatting workflow within the PR checks do not indicate new errors in the files changed

  • The CHANGELOG has been updated with a description of the changes for the upcoming release and the corresponding issue (if necessary)

@andrewtavis
Member

Can you check out the workflow errors and make the needed fixes, @KesharwaniArpita?

@andrewtavis andrewtavis self-requested a review November 19, 2024 02:19
@andrewtavis
Member

Hey @KesharwaniArpita 👋 Checking in here on this :) Do you think you'll have time to address the errors from the query check workflows?

@KesharwaniArpita
Contributor Author

Hi @andrewtavis, I'll definitely do this. I apologize, I got busy with my end-of-semester lab work and missed the email. I'll get back to it. Thanks for checking!

@KesharwaniArpita
Contributor Author

Hi @andrewtavis, I have resolved the errors. I think we are good to go! Thank you :)

@andrewtavis
Member

Ok @KesharwaniArpita :) So there was a minor bug in the forms check workflow that was causing it to not run. There are still many errors in the queries, which you can see here and also run locally with python3 src/scribe_data/check/check_query_forms.py in the project root.

Can you pull down the most recent changes and go through the needed changes as shown in the workflow errors? The big thing here is that it probably makes sense to restrict this as much as we can, such that each individual query returns just four forms. The queries are getting tough to navigate, so splitting them up this way should help.

So steps from here:

  • Bring down the commits I just sent along to your local branch
  • Make each query return at most four forms
  • Make sure that all queries return results for each form and that the results are unique (one row per lexeme)
  • Run python3 src/scribe_data/check/check_query_forms.py locally
  • Fix the errors as dictated by the error messages
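
The restrictions in the steps above can be sketched roughly as follows. This is a hypothetical illustration and not the logic of `check_query_forms.py`: the helper names, the `lexemeID` key, and the shape of the result rows (one dict per row, keyed by form label) are all assumptions.

```python
from collections import Counter

MAX_FORMS_PER_QUERY = 4  # the "four forms at most" limit from the steps above


def chunk_forms(form_labels, size=MAX_FORMS_PER_QUERY):
    """Split a list of form labels into groups of at most `size` labels,
    one group per query."""
    return [form_labels[i : i + size] for i in range(0, len(form_labels), size)]


def duplicate_lexemes(rows):
    """Return lexeme IDs that appear in more than one result row."""
    counts = Counter(row["lexemeID"] for row in rows)
    return sorted(lexeme for lexeme, count in counts.items() if count > 1)


def empty_forms(rows, form_labels):
    """Return the form labels for which no row has a non-empty value."""
    return [
        label
        for label in form_labels
        if not any(row.get(label) for row in rows)
    ]
```

A query would pass these checks when `duplicate_lexemes` and `empty_forms` both return empty lists for its results; the project's own check script remains the authority on what the workflow actually requires.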

Let me know if you have any questions on the above!

@KesharwaniArpita
Copy link
Contributor Author

Sure, @andrewtavis. I'll work on it. Thanks for letting me know.

2 participants