
[Runtime] Mega-issue to track all issues related to bash Interactive terminal #3031

Closed
10 tasks
xingyaoww opened this issue Jul 19, 2024 · 14 comments
Assignees
Labels
bug Something isn't working enhancement New feature or request severity:medium Affecting multiple users tracked Added to internal tracking
Milestone

Comments

@xingyaoww
Collaborator

xingyaoww commented Jul 19, 2024

This is a mega-issue tracker for the interactive terminal issues people run into.

Feel free to expand this list if I missed any relevant issue!


Cause

These are typically caused by the same underlying problem: OpenDevin uses pexpect to interact with bash shells, but the current parsing logic only looks for the next PS1 prompt (e.g., something like root@hostname:/folderABC $).

It will keep looking for that pattern until it times out, causing the following things to break, as listed in the PR above:

  • Opening a new interactive program (e.g., python3), where the prompt changes to >>>
  • Opening a full-screen text editor (e.g., nano, vim), where the display could break completely (I'm not familiar with the protocol here, though)
  • Entering a conda virtual environment: conda prepends the env name (e.g., (base)) to the PS1 prompt, breaking the current pexpect parsing
  • When the agent is asked for a password (e.g., prompts matching Password:)
  • Prompts like (yes/no/[fingerprint]) requesting user confirmation
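As a minimal sketch of why PS1-only matching breaks (the regex below is an illustrative assumption, not OpenDevin's actual pattern): a matcher that only recognizes a bash-style prompt never matches the prompts these programs print, so the expect loop spins until it times out.

```python
import re

# Illustrative PS1 matcher in the spirit of expecting "user@host:/path $";
# this is an assumption, not OpenDevin's actual regex.
PS1_RE = re.compile(r"\w+@[\w.-]+:[^\n]*\$\s*$")

def found_prompt(output: str) -> bool:
    """Return True if the tail of the output looks like a bash PS1 prompt."""
    return PS1_RE.search(output) is not None

# A plain shell prompt is recognized...
assert found_prompt("root@hostname:/folderABC $ ")
# ...but none of the interactive prompts listed above ever match:
assert not found_prompt(">>> ")          # python3 REPL
assert not found_prompt("Password: ")    # password prompt
assert not found_prompt("Are you sure you want to continue connecting (yes/no/[fingerprint])? ")
```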

Fixes

We plan to resolve as many of these as we can once the arch refactor #2404 is completed. The following is a non-exhaustive list of approaches (we cannot explicitly enumerate every pattern for pexpect here):

  1. Try to cover common use cases of these prompts (e.g., the [yes/no] pattern, the conda environment pattern).
  2. Figure out a more general way (rather than hand-written rules) for agents to interact with these. For example, instead of writing every rule explicitly: if we've been waiting for more than 5s with no new output from the terminal, it is probably waiting for user input, and we should hand control over to the agent. Subsequently, we may need to allow the agent to issue special keyboard actions like Ctrl+D, Ctrl+C, etc.
  3. Add something to the prompt that forbids the agent from entering interactive programs (e.g., the interactive Python REPL, vim, nano, etc.).
  4. Detect when the agent accidentally enters such an interactive program, and force it out (we currently send Ctrl+C, which might not work for a large variety of programs like vim).
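For point 4, one possible heuristic (an assumption of mine, not OpenDevin's current check) is to watch the raw output for the xterm alternate-screen escape sequence that many fullscreen programs (vim, less, htop) emit on startup:

```python
import re

# xterm "smcup" sequences used by many fullscreen programs to switch to the
# alternate screen: \x1b[?1049h is the modern form, \x1b[?47h the legacy one.
# This is a heuristic sketch, not OpenDevin's actual detection logic.
ALT_SCREEN_RE = re.compile(rb"\x1b\[\?(?:1049|47)h")

def entered_fullscreen(raw_output: bytes) -> bool:
    """Guess whether the agent just launched a fullscreen interactive program."""
    return ALT_SCREEN_RE.search(raw_output) is not None

assert entered_fullscreen(b"\x1b[?1049h\x1b[2J...vim screen...")
assert not entered_fullscreen(b"total 12\n-rw-r--r-- 1 root root 42 out.txt\n")
```

Detecting this would let the runtime send the right exit keystrokes (e.g., `:q!` for vim) instead of a blanket Ctrl+C.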

If you want to help!

Take a look at our existing bash parsing logic for the new architecture (under development!):
https://github.com/OpenDevin/OpenDevin/blob/8bfa61f3e4beceb690562b4d105aa01dc50d58d7/opendevin/runtime/client/client.py#L62-L111

You can help to:

  1. Write test cases in https://github.com/OpenDevin/OpenDevin/blob/main/tests/unit/test_runtime.py to expose these interactive bash issues
  2. Try to fix them inside client/client.py (and/or ssh_box.py, though we plan to deprecate that soon, so supporting these on EventStreamRuntime should be sufficient!)
@xingyaoww xingyaoww added bug Something isn't working enhancement New feature or request labels Jul 19, 2024
@xingyaoww xingyaoww self-assigned this Jul 19, 2024
@dosubot dosubot bot added the severity:medium Affecting multiple users label Jul 19, 2024
@tobitege
Collaborator

First, great collection of issues to look into!

Fixes
...if we've been waiting for more than 5s...

As to that 2nd point: if the model runs any installations (like with pip), couldn't these 5 seconds become too short?

Add something in the prompt...

We may have to track down the "usual suspects" and check whether they have parameters to suppress interactive mode.
Several package installers do have such parameters. Here are the options for some of the most common ones (thanks to Sonnet):

  1. pip:

    • Use the -q or --quiet flag to suppress output.
    • pip itself doesn't typically prompt for yes/no, so suppressing output with -q is usually all that's needed.
    pip install -q package_name
  2. Poetry:

    • Poetry does not have interactive prompts during installation, so no specific flag is needed to suppress them.
  3. npm:

    • Use the --yes or -y flag to automatically answer yes to prompts (mainly relevant for npm init and npx; npm install itself rarely prompts).
    npm install package_name --yes
  4. yarn:

    • Yarn does not typically have interactive prompts during installation, so no specific flag is needed to suppress them.
  5. conda:

    • Use the -y or --yes flag to automatically answer yes to prompts.
    conda install package_name -y
  6. apt-get (for system packages on Debian-based systems):

    • Use the -y or --yes flag to automatically answer yes to prompts.
    sudo apt-get install package_name -y
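The per-tool flags above could be centralized in a small helper. The sketch below is hypothetical (the mapping is drawn only from this comment's table, and `make_noninteractive` is not part of OpenDevin):

```python
# Hypothetical mapping of install-command prefixes to the noninteractive
# flags listed above; an illustrative subset, not an official table.
NONINTERACTIVE_FLAGS = {
    "pip install": "-q",
    "conda install": "-y",
    "apt-get install": "-y",
}

def make_noninteractive(cmd: str) -> str:
    """Insert the tool's noninteractive flag if it is missing."""
    for prefix, flag in NONINTERACTIVE_FLAGS.items():
        if cmd.startswith(prefix) and flag not in cmd.split():
            return f"{prefix} {flag}" + cmd[len(prefix):]
    return cmd  # poetry, yarn, etc. need no extra flag

assert make_noninteractive("conda install numpy") == "conda install -y numpy"
assert make_noninteractive("poetry add requests") == "poetry add requests"
```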

@xingyaoww
Collaborator Author

xingyaoww commented Jul 19, 2024

if the model runs any installations (like with pip), these 5 seconds could become too short?

Yeah, that's a good point, and it's primarily why we implemented the timeout rather than this.

I guess the T-second timeout should be enforced on the stream of output: if the terminal keeps printing output during the last T seconds (e.g., you see the screen scrolling while an installation command runs), then it is probably working fine without needing interaction. But if it hangs for T seconds with no output, it probably needs someone to take a look and decide whether we need to enter anything.

I mean, we humans do the same thing! If a command hangs unreasonably long, we might just issue Ctrl+C to kill it 😅

Maybe we could (1) switch everything to streaming outputs, (2) wait for T=10 seconds, and (3) if no additional output arrives during those T seconds, conclude we are probably stuck and let the LLM decide whether to (a) interrupt the process, (b) enter something if this is actually a legitimate interactive prompt, or (c) keep waiting. IMO this might be the next level of agents: execution-time aware 😆
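That streaming idle-detection idea could be sketched roughly as follows (a hypothetical POSIX-only sketch, not OpenDevin's implementation; on "idle" the real system would hand control to the LLM rather than return):

```python
import os
import select
import subprocess

def run_with_idle_detection(cmd: str, idle_timeout: float = 10.0):
    """Stream a shell command's output; report "idle" if no new bytes
    arrive for idle_timeout seconds (sketch only; assumes POSIX pipes)."""
    proc = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT)
    chunks = []
    while True:
        # Wait up to idle_timeout seconds for the pipe to become readable.
        ready, _, _ = select.select([proc.stdout], [], [], idle_timeout)
        if not ready:
            # No output for idle_timeout seconds: likely waiting for input
            # or stuck; this is where control would pass to the agent.
            return b"".join(chunks), "idle"
        chunk = os.read(proc.stdout.fileno(), 4096)
        if not chunk:  # EOF: the process exited on its own
            proc.wait()
            return b"".join(chunks), "finished"
        chunks.append(chunk)

out, status = run_with_idle_detection("echo hello", idle_timeout=2.0)
# A command like `sleep 30` would instead report "idle".
```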

@tobitege
Collaborator

Streaming back the output works in my async PR, but the prompt regex would still be the issue.

@SmartManoj
Contributor

SmartManoj commented Jul 20, 2024

@xingyaoww I think this is the right issue to discuss this.

Will we move to Paramiko for windows support? #2059 (comment)

Why are we using pxssh over Paramiko?

[image: pxssh vs. Paramiko comparison] (Source)

Relevant comments:
@iFurySt #1739 (comment)

@xingyaoww
Collaborator Author

We will no longer use the SSH protocol in the new architecture; otherwise we'd need to maintain two separate connections, which could be overly complicated. Instead, we will interact with a local bash shell directly (for ease of dependency management), so SSH-related libraries may no longer be relevant.

@zenitogr

#3176 (sorry for creating a new issue, but I don't see anything here about long-running commands like npm run dev)

@zenitogr

zenitogr commented Jul 30, 2024

For anyone who wants to create a Next.js app, here is the noninteractive way:
npx create-next-app my-app-name --ts --app --eslint --import-alias "@/*" --use-npm --tailwind --no-src-dir
You basically pass all the options you want, but you must pass all of them: --no-flag for the ones you don't want and --flag for the ones you do. If you are using WSL, I would just run create-next-app in the WSL shell and call it a day.

here are all the flags: https://nextjs.org/docs/pages/api-reference/create-next-app

@zenitogr

zenitogr commented Jul 30, 2024

I think that's the number one priority, because in React land you have npm run dev running constantly while you develop the project.

Also, running tasks that need interaction is the de facto standard, even if everybody hates it.

It's just simpler for packages and core components that are complicated enough to ask for user options, preferences, and on-the-fly choices.

Yeah, totally agree! Unfortunately it's not super trivial to fix (otherwise we would have fixed it already), but I agree this is high priority.

You could use two LLMs: one for coordination in a loop of checking things, and the other for coding.

gpt-pilot kind of does that with agents: different agents cooperate in a hierarchy, with a top agent that gives commands to lower agents and coordinates what the next step and next agent are going to be.

They use a different LLM for each agent, like gpt-4 for the top agent with temperature 0.8 and gpt-4o for coding with temperature 0.

@James4Ever0

James4Ever0 commented Jul 31, 2024

#3040 (comment)

@tobitege @xingyaoww @zenitogr @SmartManoj

(PS: This is one of the major features planned on Cybergod)


Update: 8/16/24

Now I have made some progress on the terminal agent. Check out the demo video below:

[demo video: tmux_show_1]

The terminal environment can be captured as an image, with the cursor highlighted in red:

[screenshot: vim_edit_tmux_screenshot]

@rbren rbren added this to the 2024-09 milestone Aug 16, 2024
SmartManoj added a commit to SmartManoj/Kevin that referenced this issue Aug 17, 2024
@James4Ever0

James4Ever0 commented Aug 28, 2024

Now one can obtain terminal input/output statistics in Cybergod, as shown below:

[image: tmux statistics]

With terminal stats, one can build a more efficient event-driven terminal agent, for example by listening for a TerminalIdle event, just like NetworkIdle in Playwright. An interval-driven terminal agent can also be made more intelligent by adding the statistics to the prompt, with conditional prompts based on different stats.
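A toy sketch of such an output-rate statistic (the `TerminalStats` class and the TerminalIdle naming here are hypothetical illustrations, not Cybergod's actual API):

```python
# Toy sliding-window output-rate tracker, in the spirit of the
# TerminalIdle idea above; hypothetical, not Cybergod's implementation.
class TerminalStats:
    def __init__(self, window: float = 5.0):
        self.window = window
        self.events = []  # (timestamp, n_bytes) pairs

    def record_output(self, n_bytes: int, now: float) -> None:
        """Record that n_bytes of terminal output arrived at time `now`."""
        self.events.append((now, n_bytes))

    def bytes_per_sec(self, now: float) -> float:
        """Average output rate over the trailing window."""
        recent = [n for t, n in self.events if now - t <= self.window]
        return sum(recent) / self.window

    def is_idle(self, now: float) -> bool:
        """Fire a TerminalIdle-style signal when the rate drops to zero."""
        return self.bytes_per_sec(now) == 0

stats = TerminalStats(window=5.0)
stats.record_output(4096, now=100.0)
assert not stats.is_idle(now=102.0)  # output 2s ago: still active
assert stats.is_idle(now=110.0)      # nothing in the last 5s: idle
```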

[image: tmux event]

More can be learned here and here.

@mamoodi mamoodi added the tracked Added to internal tracking label Aug 28, 2024
@mamoodi
Collaborator

mamoodi commented Dec 5, 2024

We've made a lot of good progress on this but I don't think this is 100% resolved yet.

@xingyaoww
Collaborator Author

most of these should be fixed by #4881 🤔

@mamoodi
Collaborator

mamoodi commented Jan 6, 2025

@xingyaoww do you think we should close this now that the PR has merged? Or close it when we release?

@xingyaoww
Collaborator Author

Let's close this now! Closed by #4881
