(experimental) test_runtime_client.py to test _execute_bash() [DO NOT MERGE!] #3040

tobitege · 2024-07-19T21:45:34Z

What is the problem that this fixes or functionality that this introduces? Does it fix any open issues?

Enhancements to RuntimeClient's method _execute_bash. Comes with new unit test file.

Belongs to #3031

Give a summary of what the PR does, explaining any non-trivial design decisions

Improved parsing of pexpect output/prompt, with first iteration of interactive prompt detection.

There's one test commented out at the end if someone else wants to try to get it working.

xingyaoww · 2024-07-20T03:41:38Z

opendevin/runtime/client/client.py

-
-        # This should NOT match "PS1=\u@\h:\w [PEXPECT]$" when `env` is executed
-        self.__bash_expect_regex = (
-            r'\[PEXPECT_BEGIN\] ([a-z0-9_-]*)@([a-zA-Z0-9.-]*):(.+) \[PEXPECT_END\]'


There was a reason why i use [a-z0-9_-]*)@([a-zA-Z0-9.-]*):(.+) - because other wise, when agent runs env, it will be stopped by the PS1= env var printed by the env command, causing the parsing to be broken.

Yes! I think this got caught in the test cases - i added that one to make sure it is not broken :D

tobitege · 2024-07-20T18:27:18Z

Some notes:

running the original "runtime_build.py" via command line produced an image of 23+ GB(!) in size
version in this PR tries to "fix" that (down to < 4GB), but somethings still not right with the setup, I think
added tweaks so both Ubuntu 22.04 and 24.04 can get generated (2 packages have different setups: libgl1-mesa-glx and libasound2)

xingyaoww · 2024-07-20T18:52:12Z

opendevin/runtime/client/client.py

+                    break
+                elif index == 1:  # Matched an interactive prompt
+                    self.shell.sendline('n')  # Default to 'no' for safety
+                    output += "(Automatically responded 'n' to prompt)\n"


I'm thinking if we can just return an CmdRunOutputObservation directly to the agent with this prefix (with the (y/n) added there of course): something like

CmdRunOutputObservation(content="XXXX (y/n)", exit_code=None) # We can't get exit code here bc the shell is abnormal

And then we can let agent to decide by either running

<execute_bash> y </execute_bash> OR <execute_bash> n </execute_bash>

That'd be awesome, if that would work!

tobitege · 2024-07-20T18:56:56Z

Ugh, I messed up a line in the dockerfile generation, doh!

xingyaoww · 2024-07-21T17:25:55Z

opendevin/runtime/utils/runtime_build.py

+        '    python-docx PyPDF2 python-pptx pylatexenc openai \\\n'
+        '    python-dotenv toml termcolor pydantic python-docx pyyaml \\\n'
+        '    docker pexpect tenacity e2b \\\n'
+        '    browsergym minio playwright \n'


I actually think we don't need to install these manually because we already have these in the pyproject.toml - and it will automatically installed into the right version (poetry.lock), right? I guess this is why i like having poetry inside the runtime 😄

Right, I had copied several over from the plugins code (jupyterlab, browsergym, playwright etc.).
I haven't checked if they're all in the pyproject file, but if so, they'd be "duplicates" without consequence.

Oh, I don't think all of the browser stuff is in the pyproject, on 2nd thought.

Yep, i think this is the only line that needed to be added, which we already did:
https://github.com/OpenDevin/OpenDevin/blob/a61ac5a21425c21c0d76b213f03d18c80fbb5fad/Makefile#L152-L153

tobitege · 2024-07-21T18:50:09Z

opendevin/runtime/client/client.py

@@ -61,55 +62,102 @@ def __init__(self, plugins_to_load: list[Plugin], work_dir: str) -> None:

    def _init_bash_shell(self, work_dir: str) -> None:
        self.shell = pexpect.spawn('/bin/bash', encoding='utf-8', echo=False)
-        self.__bash_PS1 = r'[PEXPECT_BEGIN] \u@\h:\w [PEXPECT_END]'


@xingyaoww why set the prompt up so complex to begin with? 🤔
Doesn't this make detection harder than it should be?

Hmm... just to avoid PS1= ... :D

just to avoid PS1= ... :D
It is just the agent could possibly run env to check the env var, then found itself dead in the water because of the PS1 pattern 😓

why set the prompt up so complex to begin with?

Do you mean the user@host:/workdir thing, or the [PEXPECT_BEGIN]? For the former, i think it will be useful for the agent to know whoami and where am i - so i keep those user/host/workdir info. For [PEXPECT_BEGIN] - i spend sometime and i think that's probably the easiest way to make the \u@\h:\w regex work.

Yes, user@host. But I see the advantage in extra information for the agent there, which seems the obvious way to do now that you mention it! 👍

James4Ever0 · 2024-07-31T09:04:06Z

My thoughts are be like:

Use a TTY parsing library like pyte, render the terminal as we perceive
Let human pause/stop LLM generation at any point, inject instructions and modify terminal
Develop such a sidecar model to learn human interventions, automate the process

test_runtime_client.py to test _execute_bash()

548bcb7

tobitege requested a review from xingyaoww July 19, 2024 21:45

Merge branch 'main' into pexpect

611958e

xingyaoww reviewed Jul 20, 2024

View reviewed changes

tobitege added 3 commits July 20, 2024 07:37

Merge branch 'main' into pexpect

dfa43bb

runtime_build and runtime tweaks

14b808a

Merge branch 'main' into pexpect

573208e

xingyaoww reviewed Jul 20, 2024

View reviewed changes

fix in docker script

b1acc22

xingyaoww mentioned this pull request Jul 20, 2024

[Arch] Shrink runtime image size #3051

Merged

tobitege added 2 commits July 21, 2024 18:48

added _generate_dockerfile_nodejs() to runtime_build (experimental)

8af2e2f

Merge branch 'main' into pexpect

d5ba379

tobitege changed the title ~~(test) test_runtime_client.py to test _execute_bash()~~ (experimental) test_runtime_client.py to test _execute_bash() [DO NOT MERGE!] Jul 21, 2024

tobitege marked this pull request as draft July 21, 2024 16:50

xingyaoww reviewed Jul 21, 2024

View reviewed changes

tobitege closed this Jul 21, 2024

tobitege commented Jul 21, 2024

View reviewed changes

James4Ever0 mentioned this pull request Jul 31, 2024

[Runtime] Mega-issue to track all issues related to bash Interactive terminal #3031

Closed

10 tasks

tobitege deleted the pexpect branch August 29, 2024 15:26

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

(experimental) test_runtime_client.py to test _execute_bash() [DO NOT MERGE!] #3040

(experimental) test_runtime_client.py to test _execute_bash() [DO NOT MERGE!] #3040

tobitege commented Jul 19, 2024 •

edited

Loading

xingyaoww Jul 20, 2024

xingyaoww Jul 20, 2024 •

edited

Loading

tobitege commented Jul 20, 2024 •

edited

Loading

xingyaoww Jul 20, 2024

tobitege Jul 20, 2024

tobitege commented Jul 20, 2024

xingyaoww Jul 21, 2024

tobitege Jul 21, 2024

tobitege Jul 21, 2024

xingyaoww Jul 21, 2024 •

edited

Loading

tobitege Jul 21, 2024 •

edited

Loading

xingyaoww Jul 21, 2024

tobitege Jul 21, 2024

James4Ever0 commented Jul 31, 2024 •

edited

Loading

(experimental) test_runtime_client.py to test _execute_bash() [DO NOT MERGE!] #3040

(experimental) test_runtime_client.py to test _execute_bash() [DO NOT MERGE!] #3040

Conversation

tobitege commented Jul 19, 2024 • edited Loading

Choose a reason for hiding this comment

xingyaoww Jul 20, 2024 • edited Loading

Choose a reason for hiding this comment

tobitege commented Jul 20, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

tobitege commented Jul 20, 2024

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

xingyaoww Jul 21, 2024 • edited Loading

Choose a reason for hiding this comment

tobitege Jul 21, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

James4Ever0 commented Jul 31, 2024 • edited Loading

tobitege commented Jul 19, 2024 •

edited

Loading

xingyaoww Jul 20, 2024 •

edited

Loading

tobitege commented Jul 20, 2024 •

edited

Loading

xingyaoww Jul 21, 2024 •

edited

Loading

tobitege Jul 21, 2024 •

edited

Loading

James4Ever0 commented Jul 31, 2024 •

edited

Loading