-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Config] Reduce the duplication in run_controller
argument
#3217
Conversation
Nice, removing them is cleaner! This continues what we did by removing such individual attributes from LLM class for example, once we had a usable
Please tell if I don't understand this correctly, AppConfig is already composed of those things, we have moved vars that used to be defined in the app_config into these smaller ones, and continued to split some more in time. Do you mean we should add more "buckets of settings" like them, such as a Just a question on a particular bit of this PR code though. |
Correct me if @neubig think differently: Yes, if it make sense, I think at some point we can make config = AppConfig(
task_config=task_config,
agent_config=agent_config,
runtime_config=agent_config
)
runtime = Runtime.from_config(runtime_config)
agent = Agent.from_config(agent_config)
state = run_controller(
task=instruction,
task_config=task_config,
agent=agent,
runtime=runtime
)
# OR use them altogether
state = run_controller(
task=instruction,
config=app_config
) While For example, people can just create a |
@@ -118,13 +118,12 @@ def process_instance( | |||
instruction += AGENT_CLS_TO_INST_SUFFIX[agent.__class__.__name__] | |||
|
|||
# Here's how you can run the agent (similar to the `main` function) and get the final task state | |||
config.max_iterations = metadata.max_iterations |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks a bit strange to me. For most of the time, we really have treated the application configuration as immutable, or almost, even though it isn't. I think working that way had definite benefits, it makes it more understandable, and the user should be able to count on the values they set, once fully loaded and initialized, in the right order of precedence.
If we then change this at runtime, IMO it's harder for the users to understand what value came from where or why.
I wrote some more about this, but unfortunately I... lost the contents of that post. 😅 Anyway I guess that since we no longer let the core parse the command line args for the client scripts, the scripts have to do that. I wonder if we can do it better though, at some point (doesn't have to be right now).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I think creating an immutable config definitely has its benefits! But I think it could be slightly tricky to do now bc we might need to reload config from env var, config.toml
etc.
I'm working on an evaluation refactor (should be done in the next couple days) for the new runtime, where I'll go through every config
in the evaluation and initialize the whole config all at once (is this more aligned with what you think?):
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That looks very cool, explicit and self-contained! And no other place in SWE-bench sneaking in some other value in it, right? 😅
I think that's exactly on point, a single place per external script to do their config initialization looks just right. ❤️
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And no other place in SWE-bench sneaking in some other value in it, right? 😅
Yep! "sneaking in some other value" has caused me a lot of trouble a while back 😿 definitely want to get rid of that behavior
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey arguably, this line is sneaky too. 🤣 We can live with it since we're killing it again!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Kill it in 2-5 days 🦾
Just a quick note on this, I kinda doubt we will succeed to make AppConfig only from a few sub-configs, it's really big and some don't mesh together with anything really ("disable_color"? 😆), and if we force them into something it might become less understandable. I could be wrong. IMHO it will be fine if we continue as we do, pick and put stuff together when it becomes obvious it needs it and/or makes too much sense. 😆 |
@enyst totally agree! not trying to do that to the fullest extent - but trying to have that in mind as the final goal regardless of whether we can eventually achieve it 😆 |
Just chiming in to agree! |
What is the problem that this fixes or functionality that this introduces? Does it fix any open issues?
A general rule (after discussion with @neubig in #3184) is that we try to minimize the overlap between different input arguments.
For example, in
run_controller
, now we have two things:max_budget_per_task
(higher priority), which overlaps withconfig.max_budget_per_task
max_iterations
(higher priority), which overlaps withconfig.max_iterations
Give a summary of what the PR does, explaining any non-trivial design decisions
This PR tries to get rid of this overlap and remove the two extra input arguments to make
run_controller
a relatively simpler interface.TODO for future PRs (as discussed with @neubig), we should break the gigantic
AppConfig
into several composable things (e.g., agent_config, llm_config, sandbox_config), which is used to separately initialize different components without relying on the sameAppConfig
.Other references