Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent oversampling of study submissions by any single worker #1080

Merged

Conversation

meta-paul
Copy link
Contributor

@meta-paul meta-paul commented Oct 19, 2023

For some studies (i.e. Tasks spanning multiple TaskRuns) it may be desired to limit number of submissions received from the same worker (for example, to prevent bias in annotation tasks dataset).

The new TaskRunArgs parameter max_submissions_per_worker addresses this need.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Oct 19, 2023
@meta-paul meta-paul force-pushed the prevent-study-oversampling-by-one-worker branch from 3345689 to de9e7ae Compare October 19, 2023 20:55
@meta-paul
Copy link
Contributor Author

meta-paul commented Oct 19, 2023

@JackUrb @xing-liu This is a feature to prevent oversampling. Not part of TaskReviewApp feature, but due to the size of that other PR I've used that branch as the base for now.

@JackUrb
Copy link
Contributor

JackUrb commented Oct 20, 2023

Is this the same as maximum_units_per_worker?

@meta-paul
Copy link
Contributor Author

meta-paul commented Oct 20, 2023

@JackUrb Similar idea, but for a different provider setup. In Prolific we don't know ahead of time who will be working on a unit before releasing it. So sidelining an overcontributing worker should be done differently (via participant group, not at the time of unit assignment because we have no control over that).

@JackUrb
Copy link
Contributor

JackUrb commented Oct 20, 2023

I see - I think in general we should actually try to merge this concept. In #773 I noted that the current way of doing things (based on a count and a "no access" page) wasn't great, and that qualification-based would be preferred. I don't think we should have two separate arguments for this.

@meta-paul
Copy link
Contributor Author

@JackUrb I can attempt at parameter unification there. Should we do this before merging TaskReview app code, or it can be done later?

@JackUrb
Copy link
Contributor

JackUrb commented Oct 23, 2023

I don't think it should be merged as is, and also don't think this feature should block task review.

@meta-paul meta-paul force-pushed the prevent-study-oversampling-by-one-worker branch from ccd522c to 31374ca Compare October 26, 2023 00:45
@meta-paul
Copy link
Contributor Author

meta-paul commented Oct 26, 2023

@JackUrb I've nested my logic under your paramtere name maximum_units_per_worker. That way one parameter will cover both use cases of Mturk and Prolific. Let me know if that makes sense.

(By the way, disregard failing styling check. Not sure why it's failing, as if Github is running on another version of the code - couldn't find failing lines within this branch)

@JackUrb
Copy link
Contributor

JackUrb commented Oct 26, 2023

This consolidation now makes a lot more sense. Just be sure to test it out both on MTurk and prolific.

@meta-paul meta-paul force-pushed the make-task-review-server branch from 8433e98 to 5b07bcc Compare October 26, 2023 22:45
Copy link
Contributor

@mojtaba-komeili mojtaba-komeili left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Had a quick look and it seems reasonable to me. I also see Jack is onboard with the changes. Feel free to merge after adding some tests please.

@meta-paul meta-paul force-pushed the make-task-review-server branch from 42a0931 to f2fe874 Compare November 6, 2023 19:14
@meta-paul meta-paul force-pushed the make-task-review-server branch 3 times, most recently from 6cbce4e to 3494563 Compare November 16, 2023 21:13
@meta-paul meta-paul force-pushed the prevent-study-oversampling-by-one-worker branch from aa0edc3 to 65d3065 Compare November 16, 2023 23:41
@meta-paul meta-paul force-pushed the prevent-study-oversampling-by-one-worker branch 5 times, most recently from 6718c13 to c5c5bc0 Compare November 22, 2023 18:29
@meta-paul meta-paul force-pushed the make-task-review-server branch from 01ba33f to 936effa Compare November 22, 2023 18:32
@meta-paul meta-paul force-pushed the make-task-review-server branch 27 times, most recently from f938fbd to 650326a Compare November 25, 2023 01:06
Copy link
Contributor

@JackUrb JackUrb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall this looks good to me. Thanks for unifying the logic between Prolific and MTurk providers, I think the flow is more obvious now for both.

One future TODO (#773) would be to push an exclusionary qualification to MTurk in the same way you currently do to prolific for qualification groups. This could be an option for a v1.3 feature, and would involve creating per-task-run exclusionary qualifications by default for each MTurk TaskRun. This PR makes the flow for that feature request more straightforward.

@meta-paul meta-paul merged commit e177e7b into make-task-review-server Nov 30, 2023
11 of 21 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants