-
Notifications
You must be signed in to change notification settings - Fork 554
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor index string fallbacks for Contributor Index #25834
Conversation
admin/app/jobs/RebuildIndexJob.scala
Outdated
* Concatenating these three strings therefore allows us to use firstName and webTitle as fallbacks | ||
* if lastName is None or "". | ||
* */ | ||
val indexString = tag.lastName.getOrElse("").trim + tag.firstName.getOrElse("").trim + tag.webTitle |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a risk of these being duplicated between contributors? If so do we know the effect that this might have?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good question! my understanding is that these values are just used to bucket contributors by the first letter in their names, if available. (More context on this functionality can be found in a previous PR from @ioannakok: #25176) So there shouldn't be any need for the indexString
value to be unique, if I've understood the functionality correctly.
Perhaps this is a separate issue and I'm misunderstanding something here but even in the current situation I am not sure why contributors with both first name and last name end up in the |
It's a good question! I'm not 100% sure but I believe it's generally going to be because their information has been put in "wrong", so that for instance the |
Ah, I see - thank you! |
admin/app/jobs/RebuildIndexJob.scala
Outdated
* Concatenating these three strings therefore allows us to use firstName and webTitle as fallbacks | ||
* if lastName is None or "". | ||
* */ | ||
val indexString = tag.lastName.getOrElse("").trim + tag.firstName.getOrElse("").trim + tag.webTitle |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
val indexString = tag.lastName.getOrElse("").trim + tag.firstName.getOrElse("").trim + tag.webTitle | |
val indexString = tag.lastName.getOrElse("").trim.concat(tag.firstName.getOrElse("").trim).concat(tag.webTitle) |
Co-authored-by: Ioanna Kokkini <ioannakok@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good - but just had a thought about making these indexes simpler in general. Could we just use the webTitle
?
Looking at the Tag
model - webTitle is required and almost always seems to be some combination of names.
Looking at where we render the text, it is also the webTitle with some sort of deduping.
I did wonder about this, but thought it was best in this PR to focus on the bug and keep the original intent of the functionality. I think there's definitely a case to be made for using Any thoughts on this @jamesgorrie ? |
Seen on FRONTS-PROD, ADMIN-PROD (merged by @bryophyta 3 hours, 15 minutes and 14 seconds ago)
|
I would love to speak to the person who thought an index sorted on multiple, unsurfaced criteria would be easier to search. Saying that, it's a good point of not knowing the intent. Let's go digging for that. I am going to take a guess it was:
|
What does this change?
Replaces a chain of
orElse
statements with string concatenation. The previous structure didn't test whethertags.lastName
was an empty string. Passing an empty string totagPages.alphaIndexKey
doesn't crash the job, but it was leading to an error being logged.@jamesgorrie's PR #25832 will downgrade the error message to an info message.
This PR should remove a significant proportion of the events which were triggering the logging statement in the first place. Using string concatenation here means that we don't need to explicitly test whether
lastName
is empty; if it is empty then the first character infirstName
will be used for the index (and likewise iffirstName
is also empty). This is a less 'lazy' approach as it will often involve evaluating values that don't get used, but the cost of eagerness here seems low because it just involves unwrapping some Options and concatenating three strings.This PR should also affect the contributor indexes that are built by this job, because all the contributors with no
lastName
should now be indexed by theirfirstName
orwebTitle
instead of the fallback index key of1-9
. It looks like this was the intention of the original code, and in theory should be a more 'useful' index. However, it will still provide the 'wrong' bucketing and alphabetical ordering for those contributors who have nolastName
set through human error (as opposed to the case where their name doesn't fit within the convention of a first and last name; in the latter cases ideally the affordances provided here would be used by editorial to produce the conventionally correct ordering based on the specifics of the case).Does this change need to be reproduced in dotcom-rendering ?
Tests
I deployed this to CODE last night and the errors seem to have dropped off since that deployment:
Also, @ioannakok and I have verified that when the job runs with the code in this PR it does rebuild the indexes in S3 with contributors allocated as we'd expect (i.e. most of them have moved from the
1-9
bucket into an alphabet bucket):