Add Screening Example #883
Changes from 10 commits
a39eed5
e6c789a
9b7895c
eb41c61
54e7dd0
bd8996e
8305eeb
654c356
6f224e7
2b7638c
53c9cef
62449f9
b5be62a
558079b
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,106 @@ | ||
--- | ||
sidebar_position: 3 | ||
--- | ||
|
||
import ReactPlayer from "react-player"; | ||
import Link from "@docusaurus/Link"; | ||
|
||
# Check worker quality with Screening Units | ||
|
||
Screening units help filter out low-quality work, generally by keeping the validation criteria you care about on the Mephisto server, where workers cannot see them. To support this, we provide the `ScreenTaskRequired` blueprint mixin. | ||
|
||
Screening units are a heuristic-based way to determine, on a worker's first task completion, whether that worker has understood the task instructions. They can be run either on real data you want annotated (when your heuristics can be applied to any unit) or on specific 'test' data that you believe is easier to validate. | ||
|
||
## Showcase | ||
|
||
<ReactPlayer | ||
playing | ||
controls | ||
width="100%" | ||
height="auto" | ||
url="https://user-images.githubusercontent.com/55665282/183100782-4c6610f5-85c8-44e2-b056-db9ec9a80fa5.mp4" | ||
/> | ||
|
||
### Things to note in the showcase: | ||
|
||
- The `static_react_task` example is run with the `screening_example` configuration enabled to ensure that screening units are generated. | ||
- When a worker opens an assignment for the first time, they see a screening unit. | ||
- Pressing the red button gives the worker the passing qualification. | ||
- Pressing the green button gives the worker the blocked qualification. | ||
- Opening a different assignment with the blocked qualification shows a "not qualified" screen. | ||
- Opening a different assignment with the passing qualification shows the real unit. | ||
|
||
## Basic configuration | ||
|
||
There are a few required configuration parts for using screening units: | ||
|
||
- Hydra args | ||
- `blueprint.passed_qualification_name`: A string qualification to mark people who have passed screening. | ||
- `blueprint.block_qualification`: A string qualification to mark people who have failed screening. | ||
- `blueprint.use_screening_task`: Determines if the screening units feature will be enabled. Set to **true to enable screening units** and set to **false to disable screening units**. | ||
- `blueprint.max_screening_units`: An int for the maximum number of screening tasks you're willing to launch with this batch. Used to limit how much you will pay out for units that aren't annotating your desired data. | ||
- Must be set to 0 if `screening_data_factory` is set to False | ||
- Must be greater than 0 if `screening_data_factory` is not False | ||
- `task.allowed_concurrent`: An int for the number of units a single worker may work on at the same time. This value **must be set to 1**. | ||
- Screening only works for this task type when each worker has at most one unit at a time, which ensures that the worker is screened before moving on to real units. | ||
- `ScreenTaskSharedState`: | ||
- `screening_data_factory`: `False` if you want to validate on real data. Otherwise, a factory that generates input data for a screening unit for a worker. Explained in-depth below. | ||
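The constraints listed above can be checked before a launch. The following is a minimal sketch and not part of Mephisto: `check_screening_config` is a hypothetical helper that takes plain dictionaries standing in for the `blueprint` and `task` Hydra sections, plus the `screening_data_factory` value, and reports which of the listed rules are violated.

```python
from typing import Any, Dict, List


def check_screening_config(
    blueprint_args: Dict[str, Any],
    task_args: Dict[str, Any],
    screening_data_factory: Any,
) -> List[str]:
    """Return a list of problems; an empty list means the rules above hold.

    Hypothetical helper for illustration only, not a Mephisto API.
    """
    problems: List[str] = []
    if not blueprint_args.get("use_screening_task", False):
        # Screening is disabled, so none of the rules apply.
        return problems

    # Both qualification names are required.
    for key in ("passed_qualification_name", "block_qualification"):
        if not blueprint_args.get(key):
            problems.append(f"blueprint.{key} must be a non-empty string")

    # max_screening_units depends on whether a data factory is provided.
    max_units = blueprint_args.get("max_screening_units", 0)
    if screening_data_factory is False and max_units != 0:
        problems.append(
            "blueprint.max_screening_units must be 0 without a data factory"
        )
    if screening_data_factory is not False and max_units <= 0:
        problems.append(
            "blueprint.max_screening_units must be > 0 with a data factory"
        )

    # Workers must be limited to one unit at a time.
    if task_args.get("allowed_concurrent") != 1:
        problems.append("task.allowed_concurrent must be 1")
    return problems


example_blueprint = {
    "use_screening_task": True,
    "passed_qualification_name": "test-react-static-passed_qualification",
    "block_qualification": "test-react-static-blocked-qualification",
    "max_screening_units": 5,
}
print(check_screening_config(example_blueprint, {"allowed_concurrent": 1}, iter([])))
```

The example values mirror the `screening_example` configuration in this PR; any object other than `False` is treated as a data factory here.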
|
||
## Setting up SharedStaticTaskState | ||
|
||
With the basic configuration done, you'll also need to provide additional arguments to your `SharedStaticTaskState` to register the required qualifications and the screening validation function. | ||
|
||
A shortened version of the run script for the video above looks like: | ||
|
||
```python | ||
# run_task.py | ||
def my_screening_unit_generator(): | ||
    """Replacing the task text for the screening units""" | ||
    while True: | ||
        yield { | ||
            "text": "SCREENING UNIT: Press the red button", | ||
            "is_screen": True | ||
        } | ||
|
||
def validate_screening_unit(unit: Unit): | ||
    agent = unit.get_assigned_agent() | ||
    if agent is not None: | ||
        data = agent.state.get_data() | ||
        print(data) | ||
        if ( | ||
            data["outputs"] is not None | ||
            and "rating" in data["outputs"] | ||
            and data["outputs"]["rating"] == "bad" | ||
        ): | ||
            # User pressed the red button | ||
            return True | ||
    # User pressed the green button | ||
    return False | ||
|
||
@task_script(default_config_file="example.yaml") | ||
def main(operator: Operator, cfg: DictConfig) -> None: | ||
    ... | ||
    shared_state = SharedStaticTaskState( | ||
        static_task_data=[ | ||
            {"text": "This text is good text!"}, | ||
            {"text": "This text is bad text!"}, | ||
        ], | ||
        on_unit_submitted=ScreenTaskRequired.create_validation_function( | ||
            cfg.mephisto, | ||
            validate_screening_unit, | ||
        ), | ||
        screening_data_factory=my_screening_unit_generator(), | ||
    ) | ||
    shared_state.qualifications += ScreenTaskRequired.get_mixin_qualifications( | ||
        cfg.mephisto, shared_state | ||
    ) | ||
    ... | ||
``` | ||
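To see how the validation function behaves without launching a task, it can be exercised against stand-in objects. This is only a sketch: `FakeState`, `FakeAgent` and `FakeUnit` are hypothetical test doubles that mimic the two calls used above (`get_assigned_agent()` and `agent.state.get_data()`), they are not Mephisto classes, and the `"good"` rating for the green button is an assumption. The `print(data)` debugging call is omitted.

```python
from typing import Any, Dict, Optional


class FakeState:
    """Hypothetical stand-in for an agent state holding submitted data."""

    def __init__(self, data: Dict[str, Any]) -> None:
        self._data = data

    def get_data(self) -> Dict[str, Any]:
        return self._data


class FakeAgent:
    """Hypothetical stand-in for an agent with a `state` attribute."""

    def __init__(self, outputs: Optional[Dict[str, Any]]) -> None:
        self.state = FakeState({"outputs": outputs})


class FakeUnit:
    """Hypothetical stand-in for a unit that may have no agent assigned."""

    def __init__(self, agent: Optional[FakeAgent] = None) -> None:
        self._agent = agent

    def get_assigned_agent(self) -> Optional[FakeAgent]:
        return self._agent


def validate_screening_unit(unit: Any) -> bool:
    # Same logic as the run script: pass only when the rating is "bad",
    # which corresponds to the red button in the screening unit.
    agent = unit.get_assigned_agent()
    if agent is not None:
        data = agent.state.get_data()
        if (
            data["outputs"] is not None
            and "rating" in data["outputs"]
            and data["outputs"]["rating"] == "bad"
        ):
            return True
    return False


red = validate_screening_unit(FakeUnit(FakeAgent({"rating": "bad"})))
green = validate_screening_unit(FakeUnit(FakeAgent({"rating": "good"})))
unassigned = validate_screening_unit(FakeUnit())
print(red, green, unassigned)
```

A unit with no assigned agent, or with no outputs, fails screening because the function falls through to `return False`.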
|
||
### See the full code [here](https://github.com/facebookresearch/Mephisto/blob/add-screening-example/examples/static_react_task/run_task.py) | ||
|
||
### See hydra configuration [here](https://github.com/facebookresearch/Mephisto/blob/add-screening-example/examples/static_react_task/hydra_configs/conf/screening_example.yaml) | ||
|
||
## Additional Questions? | ||
|
||
You can find more information on using screening units in the reference documentation for <Link target={null} to="pathname:///python_api/mephisto/abstractions/blueprints/mixins/screen_task_required.html">`ScreenTaskRequired`</Link>. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,23 @@ | ||
#@package _global_ | ||
defaults: | ||
  - /mephisto/blueprint: static_react_task | ||
  - /mephisto/architect: local | ||
  - /mephisto/provider: mock | ||
mephisto: | ||
  blueprint: | ||
    task_source: ${task_dir}/webapp/build/bundle.js | ||
    extra_source_dir: ${task_dir}/webapp/src/static | ||
    units_per_assignment: 1 | ||
    passed_qualification_name: "test-react-static-passed_qualification" | ||
    block_qualification: "test-react-static-blocked-qualification" | ||
    use_screening_task: true | ||
    max_screening_units: 5 | ||
  task: | ||
    task_name: react-static-task-example | ||
    task_title: "Rating a sentence as good or bad" | ||
    task_description: "In this task, you'll be given a sentence. It is your job to rate it as either good or bad." | ||
    task_reward: 0.05 | ||
    task_tags: "test,simple,button" | ||
    # We expect to be able to handle 300 concurrent tasks without issue | ||
    max_num_concurrent_units: 300 | ||
    allowed_concurrent: 1 |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -4,27 +4,76 @@ | |
 # This source code is licensed under the MIT license found in the | ||
 # LICENSE file in the root directory of this source tree. | ||
|
||
+from mephisto.abstractions.blueprints.mixins.screen_task_required import ( | ||
+    ScreenTaskRequired, | ||
+) | ||
+from mephisto.data_model.unit import Unit | ||
 from mephisto.operations.operator import Operator | ||
 from mephisto.tools.scripts import task_script, build_custom_bundle | ||
 from mephisto.abstractions.blueprints.abstract.static_task.static_blueprint import ( | ||
     SharedStaticTaskState, | ||
 ) | ||
|
||
+from rich import print | ||
 from omegaconf import DictConfig | ||
|
||
|
||
-@task_script(default_config_file="example") | ||
+def my_screening_unit_generator(): | ||
+    while True: | ||
+        yield {"text": "SCREENING UNIT: Press the red button", "is_screen": True} | ||
|
||
|
||
+def validate_screening_unit(unit: Unit): | ||
+    agent = unit.get_assigned_agent() | ||
+    if agent is not None: | ||
+        data = agent.state.get_data() | ||
+        print(data) | ||
+        if ( | ||
+            data["outputs"] is not None | ||
+            and "rating" in data["outputs"] | ||
+            and data["outputs"]["rating"] == "bad" | ||
+        ): | ||
+            # User pressed the red button | ||
+            return True | ||
+    return False | ||
|
||
|
||
+def onboarding_always_valid(onboarding_data): | ||
+    return True | ||
|
||
|
||
+@task_script(default_config_file="example.yaml") | ||
 def main(operator: Operator, cfg: DictConfig) -> None: | ||
-    def onboarding_always_valid(onboarding_data): | ||
-        return True | ||
|
||
-    shared_state = SharedStaticTaskState( | ||
-        static_task_data=[ | ||
-            {"text": "This text is good text!"}, | ||
-            {"text": "This text is bad text!"}, | ||
-        ], | ||
-        validate_onboarding=onboarding_always_valid, | ||
-    ) | ||
+    is_using_screening_units = cfg.mephisto.blueprint["use_screening_task"] | ||
+    shared_state = None | ||
+    if not is_using_screening_units: | ||
+        shared_state = SharedStaticTaskState( | ||
+            static_task_data=[ | ||
+                {"text": "This text is good text!"}, | ||
+                {"text": "This text is bad text!"}, | ||
+            ], | ||
+            validate_onboarding=onboarding_always_valid, | ||
+        ) | ||
|
||
+    else: | ||
+        """ | ||
+        When using screening units there has to be a | ||
+        few more properties set on shared_state | ||
+        """ | ||
+        shared_state = SharedStaticTaskState( | ||
+            static_task_data=[ | ||
+                {"text": "This text is good text!"}, | ||
+                {"text": "This text is bad text!"}, | ||
+            ], | ||
+            validate_onboarding=onboarding_always_valid, | ||
+            on_unit_submitted=ScreenTaskRequired.create_validation_function( | ||
+                cfg.mephisto, | ||
+                validate_screening_unit, | ||
+            ), | ||
+            screening_data_factory=my_screening_unit_generator(), | ||
+        ) | ||
+        shared_state.qualifications += ScreenTaskRequired.get_mixin_qualifications( | ||
+            cfg.mephisto, shared_state | ||
+        ) | ||
Comment: If you are using the screening unit conf there are more properties that you need to add to your
Comment: You can add these dynamically to the
To reduce duplication a bit |
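One way to read the suggestion above is to build the keyword arguments shared by both branches once and add the screening-specific ones only when screening is enabled. A rough sketch under that assumption follows; `build_shared_state_kwargs` is a hypothetical helper, and the submission callback and data factory are passed in as plain arguments so that the snippet runs without Mephisto installed.

```python
from typing import Any, Callable, Dict, Iterator, Optional


def build_shared_state_kwargs(
    use_screening: bool,
    validate_onboarding: Callable[[Any], bool],
    on_unit_submitted: Optional[Callable[[Any], None]] = None,
    screening_data_factory: Optional[Iterator[Dict[str, Any]]] = None,
) -> Dict[str, Any]:
    """Collect shared-state arguments, adding screening ones only if needed.

    Hypothetical helper for illustration only, not a Mephisto API.
    """
    # Arguments common to both the screening and non-screening cases.
    kwargs: Dict[str, Any] = {
        "static_task_data": [
            {"text": "This text is good text!"},
            {"text": "This text is bad text!"},
        ],
        "validate_onboarding": validate_onboarding,
    }
    # Screening-only arguments are attached dynamically.
    if use_screening:
        kwargs["on_unit_submitted"] = on_unit_submitted
        kwargs["screening_data_factory"] = screening_data_factory
    return kwargs


plain = build_shared_state_kwargs(False, lambda data: True)
screened = build_shared_state_kwargs(
    True,
    lambda data: True,
    on_unit_submitted=lambda unit: None,
    screening_data_factory=iter([{"text": "SCREENING UNIT", "is_screen": True}]),
)
print(sorted(plain), sorted(screened))
```

In the run script the result would be unpacked as `SharedStaticTaskState(**kwargs)`, with `ScreenTaskRequired.get_mixin_qualifications(cfg.mephisto, shared_state)` still appended to `shared_state.qualifications` only in the screening case.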
||
|
||
     task_dir = cfg.task_dir | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -16,12 +16,21 @@ function OnboardingComponent({ onSubmit }) { | |
         qualification for your task. Click the button to move on to the main | ||
         task. | ||
       </Directions> | ||
-      <button | ||
-        className="button is-link" | ||
-        onClick={() => onSubmit({ success: true })} | ||
+      <div | ||
+        style={{ | ||
+          width: "100%", | ||
+          padding: "1.5rem 0", | ||
+          display: "flex", | ||
+          justifyContent: "center", | ||
+        }} | ||
       > | ||
-        Move to main task. | ||
-      </button> | ||
+        <button | ||
+          className="button is-link" | ||
+          onClick={() => onSubmit({ success: true })} | ||
+        > | ||
+          Move to Main Task | ||
+        </button> | ||
+      </div> | ||
Comment on lines +19 to +33:
I was peeved by the fact that the onboarding button was not centered on the page.
Yeeeeaaah this could've been adjusted long ago hahaha |
||
     </div> | ||
   ); | ||
 } | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -8,6 +8,7 @@ | |
 import csv | ||
 from genericpath import exists | ||
 import os | ||
+from rich import print | ||
Comment: Unneeded import |
||
 from typing import ( | ||
     ClassVar, | ||
     List, | ||
|
@@ -148,14 +149,20 @@ def extract_unique_mixins(blueprint_class: Type["Blueprint"]): | |
             for clazz in removed_locals | ||
             if clazz != BlueprintMixin and clazz != target_class | ||
         ) | ||
|
||
+        # Remaining "Blueprints" should be dropped at this point. | ||
+        filtered_out_blueprints = set( | ||
+            clazz for clazz in filtered_subclasses if not issubclass(clazz, Blueprint) | ||
+        ) | ||
|
||
         # we also want to make sure that we don't double-count extensions of mixins, so remove classes that other classes are subclasses of | ||
         def is_subclassed(clazz): | ||
             return True in [ | ||
-                issubclass(x, clazz) and x != clazz for x in filtered_subclasses | ||
+                issubclass(x, clazz) and x != clazz for x in filtered_out_blueprints | ||
             ] | ||
|
||
         unique_subclasses = [ | ||
-            clazz for clazz in filtered_subclasses if not is_subclassed(clazz) | ||
+            clazz for clazz in filtered_out_blueprints if not is_subclassed(clazz) | ||
         ] | ||
         return unique_subclasses | ||
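The filtering in this hunk can be illustrated with toy classes. This is a sketch only: every class below is a made-up stand-in rather than a real Mephisto class, and `unique_mixins` is a hypothetical function that reproduces just the two steps shown in the diff (drop anything that is a `Blueprint`, then drop classes that another remaining class extends).

```python
from typing import Iterable, List


class BlueprintMixin:
    """Stand-in base class for mixins."""


class Blueprint:
    """Stand-in base class for blueprints."""


class ScreeningMixin(BlueprintMixin):
    pass


class StricterScreeningMixin(ScreeningMixin):
    """Extends ScreeningMixin, so only this one should be counted."""


class OnboardingMixin(BlueprintMixin):
    pass


class ParentBlueprint(OnboardingMixin, Blueprint):
    """A blueprint that happens to subclass a mixin."""


def unique_mixins(candidates: Iterable[type]) -> List[type]:
    # Step 1: anything that is itself a Blueprint is not a mixin.
    non_blueprints = {c for c in candidates if not issubclass(c, Blueprint)}

    # Step 2: drop classes that another remaining class extends.
    def is_subclassed(clazz: type) -> bool:
        return any(issubclass(x, clazz) and x != clazz for x in non_blueprints)

    return [c for c in non_blueprints if not is_subclassed(c)]


result = unique_mixins(
    [ScreeningMixin, StricterScreeningMixin, OnboardingMixin, ParentBlueprint]
)
print(sorted(c.__name__ for c in result))
```

In this toy setup, skipping the first step would let `ParentBlueprint` count as a subclass of `OnboardingMixin` and remove the mixin from the result, which is presumably the kind of case the added filter is meant to handle.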
|
||
|
Comment: This is for being able to play videos in an mdx file.