-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Decouple the prompt set type (practice, official) from its file name #801
Conversation
MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would this code get cleaner if there were a PromptSet object and possibly an enum for the known good ones?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks much better organized now. Thanks!
That was my initial approach, and it got more complicated than I wanted. I can take a fresh pass. @wpietri Putting in the changes inspired by Barbara's review made me decide against the Enum approach for two reasons.
|
…or better testing
@bkorycki @wpietri @dhosterman ready for re-review when you have a chance! |
To me an object is a name plus data plus behavior. I think we have those conditions here, if marginally. The name is already in the filename, and it's a domain term. You have the data in the dict. And there's a bit of behavior in the file itself, with possibly more elsewhere. But it is pretty basic, so I can live with it staying as procedural programming until more behavior turns up. But this is a problem for me:
Previously, what defined a GeneralPurposeAiChatBenchmarkV1 was a locale, a prompt set, and an evaluator. The code more or less matched our domain language. But now PROMPT_SETS is part of the definition, which is definitely outside our domain language, and it's weird from an object modelling perspective. What typing would get us here is a clearer interface, and using the type system to do the validation that ends up here. Even going with strings, I'd rather validation of UI/CLI inputs end up much closer to the UI/CLI. (If we wanted to be extra careful here and have a double-check, I'd rather have it be something like |
@wpietri @bkorycki third pass :) @wpietri I still think using a bag of strings for prompt sets is preferable at this stage. There's a world in which we may need to add functionality or data to the concept of a prompt set, e.g. a locale if it only contains one locale, and then we can use a class for that. But I don't think we're there yet, and because types aren't enforced and we need to support plain strings to identify prompt sets (e.g. as arguments in the CLI) then I think it's best to stick with strings at this stage. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for persisting!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good!
@@ -9,6 +9,8 @@ | |||
|
|||
# add the other languages after we have official and practice prompt sets | |||
LOCALES = (EN_US, FR_FR) | |||
# all the languages we have official and practice prompt sets for | |||
PUBLISHED_LOCALES = (EN_US,) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this the set of locales that we can run in modelbench?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LOCALES should be runnable in modelbench. PUBLISHED_LOCALES is there to avoid hardcoded exceptions deeper in the code, e.g. the funky exceptions we had to put in to run French practice tests.
Story
#781
Rationale
Filenames
#744
Test Items
#780
Main Change
New uploaded files need to be added to the
PROMPT_SETS
dict insrc/modelgauge/prompt_sets.py
and they need to contain the whole key. E.g. if you want to use the key "demo_hi_in" then your prompt file name should include "demo_hi_in" in it.To use new prompt sets in modellab, you need to add them to the modellab drop down, and to the modelrunner list of prompt sets. We may want to change that so it's less manual.