
Split bash commands by the new line character #4462

Open · wants to merge 18 commits into main

Conversation

@tofarr (Collaborator) commented Oct 17, 2024

End-user friendly description of the problem this fixes or functionality that this introduces

  • Include this change in the Release Notes. If checked, you must provide an end-user friendly description for your change below

When a bash command spans multiple lines, the output is really confusing, so we split each command on the newline character for better readability. My initial approach was to split the command into multiple separate commands and add these to the event stream individually, but this did not play well with the LLM. The behavior was non-deterministic, but sometimes it would basically say "Thank you for running the first command; what about the others?" and then resubmit the commands that had not yet been run, resulting in duplicates in the stream!

So my approach was to improve the output by splitting commands in the runtime client, while still treating them as a single CmdRunAction.
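The approach can be sketched as follows. This is a minimal illustration, not the actual runtime-client code: run_single is a hypothetical stand-in for executing one command in the persistent shell, and the naive newline split stands in for the real split_bash_commands, which also has to handle quoting and heredocs.

```python
def run_cmd_action(command: str, run_single) -> str:
    """Split a multi-line command, run each piece, and aggregate the
    output so the whole thing still reads back as one observation."""
    # Naive split for illustration; the real splitter is smarter.
    parts = [line for line in command.split("\n") if line.strip()]
    chunks = []
    for part in parts:
        output = run_single(part)
        # Echo the command before its output, like an interactive shell.
        chunks.append(f"$ {part}\n{output}")
    return "\n".join(chunks)
```

The key point is that the pieces are executed separately but returned as one aggregated observation, so the event stream still sees a single action/observation pair.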

Example 1:

Print the word "OpenHands" in the bash console in ascii art. Use only the echo command to do this. New lines of text should use the same invocation of the echo command

[screenshot]

Example 2


* ls -lah
* echo "May the force be with you!"
* echo "May the odds be ever in your favor!"
* echo "Live long and prosper!"

[screenshot]

Example 3


* ls -lah
* echo "hello world"
* echo a sample kubernetes deployment yaml into deploy.yaml

(This looks a little uglier because the console interprets each tab as 8 spaces.)
[screenshot]

Example 4 (With confirmation mode!)

Please do the following in a single bash command:

* sleep for 5 seconds
* print "You must Narfle the Garthok!"
* Print the current working directory

[screenshot]

Example 5 (Running user commands)

[screenshot]

@tofarr force-pushed the feat_split_commands_by_new_line branch from e4b2264 to 5b62bff on October 17, 2024 20:26
@tofarr marked this pull request as ready for review on October 17, 2024 20:44
@xingyaoww (Contributor) left a comment:

We actually do something like this on the backend already. What do you think about moving it to the controller here? That is, we break one CmdRunAction into multiple ones and send them over for execution in the agent controller:

commands = split_bash_commands(action.command)
all_output = ''
for command in commands:
    if command == '':
        output, exit_code = self._continue_bash(

            new_action.thought = ''
        else:
            new_action.thought = action.thought
        self.event_stream.add_event(new_action, EventSource.AGENT)
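The controller-side fan-out being proposed could look roughly like this. This is a hedged sketch with a stripped-down stand-in for OpenHands' CmdRunAction; the real plumbing around the event stream is more involved.

```python
from dataclasses import dataclass

@dataclass
class CmdRunAction:
    """Stripped-down stand-in for OpenHands' CmdRunAction."""
    command: str
    thought: str = ''

def fan_out(action, split_bash_commands, event_stream, source):
    """Break one CmdRunAction into several and queue them on the stream."""
    commands = split_bash_commands(action.command)
    for i, cmd in enumerate(commands):
        new_action = CmdRunAction(command=cmd)
        # Keep the thought only on the last piece so the UI does not
        # repeat it once per sub-command.
        new_action.thought = action.thought if i == len(commands) - 1 else ''
        event_stream.add_event(new_action, source)
```

The open question in the thread below is how this interacts with _pending_action and delegate controllers once the sub-actions are in flight.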
Collaborator:

This is an interesting idea, how will it work with longer running commands though? Here _pending_action is set. When the first new action is executed the runtime will put an obs in the stream which will make it unset here... this places them in the stream and continues looping here. Then the step would pass here before all are executed, no? What prevents it?

Collaborator:

Yeah this probably messes with _pending_action quite a bit....

@tofarr you can probably test this by asking the agent something like:

Please run echo hello and ls -lah in the same <execute_bash> block

openhands/controller/agent_controller.py (outdated; resolved)
@enyst (Collaborator) left a comment:

I kept thinking about this PR since the other day, for some odd reason. I'd appreciate it if we take some time to make sure this could work. It seems to me it won't work right, or I just don't see why it would.

For added fun, let's take a delegate's case: if the action is split in 5 in a delegate controller, and mini-action 1 of 5 has an error, then... okay, I think something else is bugged currently, but even with a fix, say like this, the delegate ends here. In this PR, is it possible for obs 2-5 to arrive in the stream after the delegate ends? If not, can you please point out what would prevent it?

Because it seems like the child obs could end up in the parent's history, which is a tad messy. 😅

        continue
    new_action = CmdRunAction(command=cmd)
    if i < len(commands) - 1:
        new_action.thought = ''
Collaborator:

what's this about?

Collaborator (Author):

This was from the original prompt:
.. Each instance should have a blank 'thought' attribute, except for the last one which should have the thought from the original action...

The reason for this is that it prevents the frontend UI from printing the thought multiple times, e.g. (without this functionality):
[screenshot]

@enyst (Collaborator) commented Oct 18, 2024

It is possible that the action includes the same bash command several times, right? Like echo *, I don't know.

I wonder what would happen here. If they're separate in the stream, like mini-action 1, mini-obs 1, mini-action 2... it seems like they can be detected as repeated actions / obs, and the agent should stop with stuck in a loop error. 🤔 We do this for some cases when the LLM gets trapped and responds with the same thing over and over.
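The stuck-in-a-loop check being described could be sketched roughly like this. The names here are illustrative; the real detector in OpenHands is more nuanced than simply comparing the last few pairs.

```python
def looks_stuck(history, n=3):
    """Return True if the last n (action, observation) pairs in the
    history are identical, i.e. the agent appears to be repeating itself.

    history: list of (action_repr, observation_repr) tuples.
    """
    if len(history) < n:
        return False
    tail = history[-n:]
    return all(pair == tail[0] for pair in tail)
```

The concern above is that legitimately repeated sub-commands from one split action could trip a check like this.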

@rbren (Collaborator) commented Oct 18, 2024

It is possible that the action includes the same bash command several times, right? Like echo *, I don't know.

Yeah this is possible, but I can't imagine a scenario where the agent would want to do this :)

@rbren (Collaborator) commented Oct 21, 2024

So this seems to work on the backend, but the frontend isn't splitting things up correctly

[screenshot]

here's my sample prompt:

please run the following commands all at once:

* ls -lah
* echo "hello world"
* echo a sample kubernetes deployment yaml into deploy.yaml

it seems like we get all the CmdRunActions first, then we get all the Observations. Instead, we should get them collated properly
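A hypothetical sketch of the collation being asked for, with invented event shapes: it pairs each command action with its observation so the transcript interleaves them, assuming observations arrive in the same order as their actions.

```python
def collate(events):
    """Interleave command actions with their observations, even if the
    stream delivers all actions before all observations.

    events: list of dicts, each either
      {"kind": "action", "command": ...} or {"kind": "obs", "output": ...}
    """
    actions, observations = [], []
    for ev in events:
        (actions if ev["kind"] == "action" else observations).append(ev)
    lines = []
    # Pair by arrival order; assumes observations come back in order.
    for act, obs in zip(actions, observations):
        lines.append(f"$ {act['command']}")
        lines.append(obs["output"])
    return "\n".join(lines)
```

This produces the shell-transcript style shown later in the thread, with each command's output directly under its prompt line.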

@rbren (Collaborator) commented Oct 21, 2024

also notice the mangled YAML content on the last command there; not sure what the issue could be...

@tofarr (Collaborator, Author) commented Oct 21, 2024

The command posted by @rbren seems to be working now - though the bash console seems to turn tabs into 8 spaces, which makes it hard to read:
[screenshot]

@tofarr (Collaborator, Author) commented Oct 21, 2024

Confirmation mode is now behaving appropriately too:
[screenshot]

@rbren (Collaborator) commented Oct 21, 2024

@tofarr looks like it's still not working right based on your screenshot. The console in the UI should show

$ ls -lah
./foo
./bar
$ echo "hello world"
hello world

etc. That is, we should see the output of one command before the start of the next.

  message.args.is_confirmed !== "rejected"
) {
  store.dispatch(appendInput(message.args.command));
}
Collaborator (Author):

Commands are only output on completion.
