Skip to content

Commit

Permalink
update run_infer and README
Browse files Browse the repository at this point in the history
  • Loading branch information
ryanhoangt committed Sep 8, 2024
1 parent c864237 commit 39d8d0a
Show file tree
Hide file tree
Showing 3 changed files with 14 additions and 11 deletions.
16 changes: 8 additions & 8 deletions agenthub/coact_agent/executor/system_prompt.j2
Original file line number Diff line number Diff line change
Expand Up @@ -22,20 +22,20 @@ As a local executor agent, there are some additional actions that you can use to

{% endset %}
{% set BROWSING_PREFIX %}
The assistant can browse the Internet with <execute_browse> and </execute_browse>.
The agent can browse the Internet with <execute_browse> and </execute_browse>.
For example, <execute_browse> Tell me the usa's president using google search </execute_browse>.
Or <execute_browse> Tell me what is in http://example.com </execute_browse>.
{% endset %}
{% set PIP_INSTALL_PREFIX %}
The assistant can install Python packages using the %pip magic command in an IPython environment by using the following syntax: <execute_ipython> %pip install [package needed] </execute_ipython> and should always import packages and define variables before starting to use them.
The agent can install Python packages using the %pip magic command in an IPython environment by using the following syntax: <execute_ipython> %pip install [package needed] </execute_ipython> and should always import packages and define variables before starting to use them.
{% endset %}
{% set SYSTEM_PREFIX = MINIMAL_SYSTEM_PREFIX + BROWSING_PREFIX + PIP_INSTALL_PREFIX %}
{% set COMMAND_DOCS %}
Apart from the standard Python library, the assistant can also use the following functions (already imported) in <execute_ipython> environment:
Apart from the standard Python library, the agent can also use the following functions (already imported) in <execute_ipython> environment:
{{ agent_skills_docs }}
IMPORTANT:
- `open_file` only returns the first 100 lines of the file by default! The assistant MUST use `scroll_down` repeatedly to read the full file BEFORE making edits!
- The assistant shall adhere to THE `edit_file_by_replace`, `append_file` and `insert_content_at_line` FUNCTIONS REQUIRING PROPER INDENTATION. If the assistant would like to add the line ' print(x)', it must fully write the line out, with all leading spaces before the code!
- `open_file` only returns the first 100 lines of the file by default! The agent MUST use `scroll_down` repeatedly to read the full file BEFORE making edits!
- The agent shall adhere to THE `edit_file_by_replace`, `append_file` and `insert_content_at_line` FUNCTIONS REQUIRING PROPER INDENTATION. If the agent would like to add the line ' print(x)', it must fully write the line out, with all leading spaces before the code!
- Indentation is important and code that is not indented correctly will fail and require fixing before it can be run.
- Any code issued should be less than 50 lines to avoid context being cut off!
- After EVERY `create_file` the method `append_file` shall be used to write the FIRST content!
Expand All @@ -44,9 +44,9 @@ IMPORTANT:
{% endset %}
{% set SYSTEM_SUFFIX %}
Responses should be concise.
The assistant should attempt fewer things at a time instead of putting too many commands OR too much code in one "execute" block.
Include ONLY ONE <execute_ipython>, <execute_bash>, or <execute_browse> per response, unless the assistant is finished with the task or needs more input or action from the user in order to proceed.
If the assistant is finished with the task you MUST include <finish></finish> in your response.
The agent should attempt fewer things at a time instead of putting too many commands OR too much code in one "execute" block.
Include ONLY ONE <execute_ipython>, <execute_bash>, or <execute_browse> per response, unless the agent is finished with the task or needs more input or action from the user in order to proceed.
If the agent is finished with the task you MUST include <finish></finish> in your response.
IMPORTANT: Execute code using <execute_ipython>, <execute_bash>, or <execute_browse> whenever possible.
The agent should utilize full file paths and the `pwd` command to prevent path-related errors.
The agent must avoid apologies and thanks in its responses.
Expand Down
2 changes: 1 addition & 1 deletion evaluation/swe_bench/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -179,7 +179,7 @@ Then, in a separate Python environment with `streamlit` library, you can run the
```bash
# Make sure you are inside the cloned `evaluation` repo
conda activate streamlit # if you follow the optional conda env setup above
streamlit run 0_📊_OpenHands_Benchmark.py --server.port 8501 --server.address 0.0.0.0
streamlit run app.py --server.port 8501 --server.address 0.0.0.0
```
Then you can access the SWE-Bench trajectory visualizer at `localhost:8501`.
Expand Down
7 changes: 5 additions & 2 deletions evaluation/swe_bench/run_infer.py
Original file line number Diff line number Diff line change
Expand Up @@ -39,11 +39,14 @@
AGENT_CLS_TO_FAKE_USER_RESPONSE_FN = {
'CodeActAgent': codeact_user_response,
'CodeActSWEAgent': codeact_user_response,
'CoActPlannerAgent': codeact_user_response,
}

codeact_inst_suffix = 'When you think you have fixed the issue through code changes, please run the following command: <execute_bash> exit </execute_bash>.\n'
AGENT_CLS_TO_INST_SUFFIX = {
'CodeActAgent': 'When you think you have fixed the issue through code changes, please run the following command: <execute_bash> exit </execute_bash>.\n',
'CodeActSWEAgent': 'When you think you have fixed the issue through code changes, please run the following command: <execute_bash> exit </execute_bash>.\n',
'CodeActAgent': codeact_inst_suffix,
'CodeActSWEAgent': codeact_inst_suffix,
'CoActPlannerAgent': codeact_inst_suffix,
}


Expand Down

0 comments on commit 39d8d0a

Please sign in to comment.