Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Task]: Track the length of time an add-on is in the review queue #15174

Closed
1 task
abyrne-moz opened this issue Nov 19, 2024 · 19 comments · Fixed by mozilla/addons-server#22927
Closed
1 task
Assignees
Labels
Milestone

Comments

@abyrne-moz
Copy link

abyrne-moz commented Nov 19, 2024

Description

Measure on a per review basis from when an add-on entered the review queue to a decision being made.

Acceptance Criteria

Milestones/checkpoints

Preview Give feedback

Checks

  • If I have identified that the work is specific to a repository, I have removed "repository:addons-server" or "repository:addons-frontend"

┆Issue is synchronized with this Jira Task

@abyrne-moz abyrne-moz added needs:info repository:addons-server Issue relating to addons-server labels Nov 19, 2024
@diox
Copy link
Member

diox commented Nov 19, 2024

Once #15072 is implemented we should be very close to having that already by looking at when the first NeedsHumanReview was created for a version compared to when it was modified: they should only be modified when is_active is flipped, which is when the review is done.

The only reason this doesn't work right now is that we mass-update multiple NHR at once when a review is done, so auto_now doesn't trigger, so modified isn't updated. But we could trivially add it ourselves.

The query would need to take into account the following things:

  • A given version can have multiple reasons for being in the queue, not necessarily created at the same time
  • Some specific reasons can trigger a version to be re-added to the queue (appeal, developer reply, abuse reports)
  • Versions can enter the queue inheriting the due date of a previously waiting version (since on listed channel there is a maximum of one version awaiting review at all times) - the old version is never reviewed, and the new one should wait for a shorter time

@diox
Copy link
Member

diox commented Nov 19, 2024

Also specifically for DSA-related reasons, those are separate, so an add-on can be reviewed while leaving it in the queue because of those reasons

@diox
Copy link
Member

diox commented Nov 26, 2024

Andreas suggested we could emit an activity log when a version first gets a NHR, then another one when the last NHR is removed (referencing that first one). That would measure how long a version stayed in the queue.

@abyrne-moz
Copy link
Author

This metric should measure the time it takes between an add-on being flagged for human review and it actually having a review decision made.

@diox
Copy link
Member

diox commented Jan 2, 2025

Each time a version gets added to the queue for any reason (gets a due date when it didn't already have one), we'll add one entry to reviewers_reviewqueuehistory table with the created and original_due_date. Each time a version leaves the queue the corresponding entry will get updated with an exit_date. Lastly, if it left because of a reviewer action (and not the developer) the corresponding entry will get a review_decision_log pointing to the activity log we created from the reviewer action.

The difference between created and exit_date for a given row will give us the time spent in the queue. Records with no exit_date should still be in the queue. Records without a review_decision_log haven't been touched by a reviewer, so we'll be able to filter by review_decision_log if we want to only consider versions that left the queue as a result of a reviewer action.

Because add-ons can be in the queue for multiple reasons, and some of these reasons are not cleared for all actions, fields get updated independently. I see 2 edge cases inherent to my approach, all to do with a version leaving the queue because of a developer action but then getting reviewed later:

1) Version getting back in the queue after being disabled by developer

  • Developer submits a version, gets in the queue somehow (either signed and then flagged, or waiting for human review because it should not be auto-approved for some reason) with created date A
  • Developer disables the version at date B, causing the version to leave the queue
  • Developer re-enables the version, it re-enters the queue with created date C
  • The version gets reviewed at date D

In that scenario, we'll have 2 records and both will be considered as being the result of a reviewer action. They will have the correct exit date though:

  • created date A, exit date B (and original due date from first time it entered the queue)
  • created date C, exit date D (and original due date from second time it entered the queue)

2) Version getting reviewed despite no longer being in the queue

  • Developer submits a version, gets in the queue somehow (either signed and then flagged, or waiting for human review because it should not be auto-approved for some reason) with a created date A
  • Developer disables the version at date B, causing the version to leave the queue
  • The version gets reviewed despite not being in the queue at date C

In that scenario, we'll have one record and it will be considered as being the result of a reviewer action, but at the date it first left the queue:

  • created date A, exit date B (and original due date from initial submission as well)

I think both cases are OK considering the exit_date will be correct.

@diox
Copy link
Member

diox commented Jan 2, 2025

To help QA/testing I'm adding a Review Queue Histories admin to visualize this data. In production it would only be available to superadmins, but we'll expose the data in Redash for general consumption.

Testing should involve having an add-on enter the queue (either flagged for review after being signed or waiting for a human approval because it has auto-approved disabled) and then:

  • Keeping it in the queue, not doing any reviewer action, resulting in having an original_due_date matching its due date from the queue, a created date from when it entered the queue, but no exit_date or review_decision_log.
    Or:
  • Making it leave the queue through a reviewer action, resulting in having an exit_date and review_decision_log.

Repeat to test each case, and try to mix and match if you want to find any edge cases I haven't thought about.

@wagnerand
Copy link
Member

For scenario 1, you said

In that scenario, we'll have 2 records and both will be considered as being the result of a reviewer action. They will have the correct exit date though:

Why are both considered the result of a reviewer action? Shouldn't the first one be the result of a developer action (version disabled)?

For scenario 2 and friends (any time an add-on gets reviewed that isn't in the queues), can we avoid attributing it to a former entry in the queue?

I assume an add-on being in the queue due to multiple abuse report categories (or at least one category and e.g. not auto-approved), and a reviewer just either resolving one abuse report category or reviewing the version without resolving the abuse reports will not make the add-on leave the queue and thus will not be recorded this way?

@diox
Copy link
Member

diox commented Jan 2, 2025

Why are both considered the result of a reviewer action? Shouldn't the first one be the result of a developer action (version disabled)?
For scenario 2 and friends (any time an add-on gets reviewed that isn't in the queues), can we avoid attributing it to a former entry in the queue?

Both are side effects of the way I have it implemented right now. It makes it easier to handle a version flagged for different reasons that stays in the queue because only part of those reasons are resolved by the reviewer action (see below).

I assume an add-on being in the queue due to multiple abuse report categories (or at least one category and e.g. not auto-approved), and a reviewer just either resolving one abuse report category or reviewing the version without resolving the abuse reports will not make the add-on leave the queue and thus will not be recorded this way?

The way it works in my patch is that exit_date and review_decision_log are recorded separately, and only set if they were previously empty (and, in the case of exit_date, only set if we're truly exiting the queue - removing the due_date on the Version).

So if reviewing the add-on without resolving the abuse reports, review_decision_log will be set, but not exit_date. If the abuse reports are later resolved, this will set exit_date, but not override the review_decision_log.

Note that all of this is per version (not add-on).

@wagnerand
Copy link
Member

If a developer action makes the add-on/version exit the queue, exit_date should be set, but not review_decision_log.

If an add-on/version is reviewed, but not in the queue, review_decision_log should be set, but not exit_date (since there should be no "open" review).

If the abuse reports are later resolved, this will set exit_date, but not override the review_decision_log.

This is problematic and we probably need to find a way to solve that (cc @abyrne-moz). Resolving an abuse report is a decision as well (and comes with a reviewer action).

@ioanarusiczki ioanarusiczki modified the milestones: 2024.12.12, 2025.01.09 Jan 3, 2025
@diox
Copy link
Member

diox commented Jan 3, 2025

If a developer action makes the add-on/version exit the queue, exit_date should be set, but not review_decision_log.

If an add-on/version is reviewed, but not in the queue, review_decision_log should be set, but not exit_date (since there should be no "open" review).

That's what my patch does. Except when both those scenarios are combined, and something is reviewed, it will set the review_decision_log on the entry that was previously there.

If the abuse reports are later resolved, this will set exit_date, but not override the review_decision_log.

This is problematic and we probably need to find a way to solve that (cc @abyrne-moz).
Resolving an abuse report is a decision as well (and comes with a reviewer action).

If the add-on is only in the queue for an abuse report, and that is resolved, then the review_decision_log will be set correctly. The problem only occurs when something is in the queue for multiple reasons: if we look at it from the perspective of the "queue", there is only one entry. So I had to pick one action to record, even though there were more than one.

@wagnerand
Copy link
Member

I understand, but the numbers need to be accurate, as we might also use them to measure reviewer performance.

@diox
Copy link
Member

diox commented Jan 6, 2025

If you exclude the edge cases highlighted above the numbers should be accurate when looking from the perspective of the queue, which was the goal. Notably, if a version is in the queue for 2 different reasons but only one of those is resolved (as would be the case if unresolved abuse reports are one of the reasons) then the exit_date will be correctly set only when the version is removed from the queue.

@diox
Copy link
Member

diox commented Jan 6, 2025

What I have implemented works for measuring the length of time a version stays in the queue, and we can build aggregates based on that as requested in the issue description. It can be used to measure individual reviewer performance, but that was not in scope, so it's a bonus, and it comes with the edge cases mentioned.

If you're not happy with those edge cases it would have to be a follow-up, as it's going to clash with the fact that a version can be in the queue for multiple reasons that are not resolved at the same time. Addressing that would mean either:

  • a) preventing reviewers from completing an action on the add-on if there are outstanding abuse reports to resolve (including appeals...) - forcing them to resolve all issues at the same time
  • b) having a separate queue for abuse reports and other DSA-related reasons
  • c) measuring time it took for each reviewer action separately (as suggested above comparing action against creation date of the related NHR would be a start - though there are tons of gotchas there as well, as you need to find the right NHR at all times)

@wagnerand
Copy link
Member

wagnerand commented Jan 7, 2025

a) preventing reviewers from completing an action on the add-on if there are outstanding abuse reports to resolve (including appeals...) - forcing them to resolve all issues at the same time

I believe this was what we decided at some point. Did it get intentionally dropped, or just lost along the way?

c) measuring time it took for each reviewer action separately (as suggested above comparing action against creation date of the related NHR would be a start - though there are tons of gotchas there as well, as you need to find the right NHR at all times)

We could also do that and that might be the most elegant solution. However, as you said, it requires to find the right NHR. Do we know which one to pick if there are multiple?

@ioanarusiczki
Copy link

@diox

I tried this mainly on AMO dev and I can see it's working with extensions entering the manual review queue for various reasons and applies for all themes which do not have automatic approval. I had to make some notes from which I'll try to extract what's problematic maybe.

  • a new version is submitted and gets in the queue because auto-approval has been disabled. If the reviewer makes a review decision on the first version (confirms approval) then makes a decision on the new version, exit_date belongs to the confirm approval and the review_decision is correct “API submission on AMO dev Version 2.0 approved.”

entry https://addons-internal.dev.mozaws.net/en-US/admin/models/reviewers/reviewqueuehistory/6/change/
review page https://reviewers.addons-dev.allizom.org/en-US/reviewers/review-listed/635206

Similarly when a block is made on the first version and then the one flagged for HR
Or when a rejection is done on the first version and then one flagged for HR

Entries in reviewqueuehistory would register an exit_date when the first version was blocked/rejected and the review_decision when the other version (which was flagged for HR) is moderated.

  • A version is flagged for HR because of scanner rules but automatically approved. Gets the exit_date from the confirmation of approval which is correct but the review_decision could be wrong (it's approved instead of auto-approval confirmed). That's not a big deal as long as the exit_date is correct.

  • at scenario 1. 2 entries are created, one when version is disabled and another one when it's re-enabled by the developer. If an action is finally taken by the reviewer, both entries would have the same review_decision but different exit_dates. Maybe if the first entry would stay without a review_decision it would indicate that it was not moderated by a reviewer.

  • at scenario 2 When the version is disabled by the developer it gets an exit_date. Then if a reviewer decides to block the version while it's no longer in the queue the review_decision is not registered. Tricky edge case.

  • a version gets in the queue and it's deleted by the developer. Example gets an exit_date but does not have a review_decision. I don't think reviewers would make further actions on deleted ? idk

  • an add-on is made invisible after getting in the queue. Same as for scenario 1 with disabling a version. It would get the exit_date when it was made invisible and won't have a review_decision yet.

  • version with abuse report and following appeal wfm. At appeal it would get a new entry in the reviewqueuehistory. (I'll look around some more when I'll get to do some testing around the dsa feature tomorrow)

  • clear human review and then setting back nhr on the same version wfm. Same version has a new entry.

  • I've one more scenario on -stage to test about temporary delays (I'll update tomorrow)

@diox
Copy link
Member

diox commented Jan 8, 2025

a new version is submitted and gets in the queue because auto-approval has been disabled. If the reviewer makes a review decision on the first version (confirms approval) then makes a decision on the new version, exit_date belongs to the confirm approval and the review_decision is correct “API submission on AMO dev Version 2.0 approved.”

That's an interesting one, bit of a weird edge case. Need to dig a bit more, not sure what we're expecting to happen here since confirming auto-approval clears NHR on all versions...

Similarly when a block is made on the first version and then the one flagged for HR
Or when a rejection is done on the first version and then one flagged for HR
Entries in reviewqueuehistory would register an exit_date when the first version was blocked/rejected and the review_decision when the other version (which was flagged for HR) is moderated.

Same kind of issue: generally most actions in the listed channel (not unlisted) clear NHR for all other versions so it could cause them to be dropped from the queue entirely... That's not new, but maybe there is something to look into.

A version is flagged for HR because of scanner rules but automatically approved. Gets the exit_date from the confirmation of approval which is correct but the review_decision could be wrong (it's approved instead of auto-approval confirmed). That's not a big deal as long as the exit_date is correct.

Ah, that I think is wrong actually. I think we should only record the human actions... unless somehow an automated action takes something out of the queue.

at #15174 (comment). 2 entries are created, one when version is disabled and another one when it's re-enabled by the developer. If an action is finally taken by the reviewer, both entries would have the same review_decision but different exit_dates. Maybe if the first entry would stay without a review_decision it would indicate that it was not moderated by a reviewer.

Known edge case I think we can live with for now as commented above.

at #15174 (comment) the version is disabled by the developer it gets an exit_date. Then if a reviewer decides to block the version while it's no longer in the queue the review_decision is not registered.

If it's not in the queue that's expected.

a version gets in the queue and it's deleted by the developer. Example gets an exit_date but does not have a review_decision. I don't think reviewers would make further actions on deleted ? idk

They could if it had been signed. Here that's not the case, so that's expected.

an add-on is made invisible after getting in the queue. Same as for scenario 1 with disabling a version. It would get the exit_date when it was made invisible and won't have a review_decision yet.

Similar: if the version hasn't been signed yet, then it's no longer in the queue, but wasn't a reviewer action, we're correctly setting an exit date but no review decision (if it gets re-enabled an a review decision is made, we'll land in the same known edge case described above).

@ioanarusiczki
Copy link

ioanarusiczki commented Jan 10, 2025

@diox

Looking a bit more on stage I've noticed today

https://reviewers.addons.allizom.org/en-US/reviewers/review-listed/2244615
https://addons-internal.stage.mozaws.net/en-US/admin/models/reviewers/reviewqueuehistory/47/change/

I could mark this verified since I've noted here what I've noticed while testing and if you need followups let me know.

@diox
Copy link
Member

diox commented Jan 10, 2025

Go ahead and file a follow-up for the recording of the auto-approval decision. We should only record human reviewer decisions.

@diox
Copy link
Member

diox commented Jan 10, 2025

a developer appeal resolved with force enable -> the action is not displayed by review decisions

That's another interesting edge case. Force enable (and only force enable, force disable should work) activity is not tied to one or more versions, it's tied to the add-on itself, despite clearing NHR for all versions as a side-effect. So we don't find which versions to consider updating the history of.

You can also file a follow-up for that, maybe we'll have to add a workaround for that scenario.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants