From e925cefeef674c5f5ede19170c3142446274a128 Mon Sep 17 00:00:00 2001
From: Temo
Date: Sun, 9 Jun 2024 21:19:05 +0400
Subject: [PATCH] Refactored prompt.py to reduce token usage (#1996)

* Refactored prompt.py to reduce token usage
* Reverted some destructive changes
* Update agenthub/codeact_agent/prompt.py
* Update agenthub/codeact_agent/prompt.py
* Update agenthub/codeact_agent/prompt.py
* Update agenthub/codeact_agent/prompt.py
* Update agenthub/codeact_agent/prompt.py
* Update agenthub/codeact_agent/prompt.py
* Update agenthub/codeact_agent/prompt.py
* Apply suggestions from code review
* Apply suggestions from code review
* Update agenthub/codeact_agent/prompt.py
* fix integration test
* make lint
* feat: support ToolQA benchmark (#2263)
* Add files via upload
* Update README.md
* Update run_infer.py
* Update utils.py
* make lint
* Update evaluation/toolqa/run_infer.py

---------

Co-authored-by: Engel Nyst
Co-authored-by: yufansong
Co-authored-by: Boxuan Li

* feat: revert hiden special paths change in file action (#2328)
* revert change in file action
* remove useless code
* make lint
* Support gpqa benchmark evaluation (#2080)
* feat: add gpqa benchmark evaluation
* add metrics
* reset configs in final block
* make lint

---------

Co-authored-by: yufansong

* fix(frontend): prevent API key from resetting after modal change (#2329)
* remove bottom chatbox fade
* Modal wider; fix lint error
* settings: attempt to not clear api key for same provider
* prevent api key from resetting after changing the model
* revert other changes and fix post test tear down error

---------

Co-authored-by: amanape <83104063+amanape@users.noreply.github.com>

* fix: codeact bug [If running a command that never returns, it gets stuck #1895] (#2034)
* fix: codeact bug https://github.com/OpenDevin/OpenDevin/issues/1895
* fix: add CmdRunAction timeout hint.
* Update agenthub/codeact_agent/prompt.py

Co-authored-by: Engel Nyst

* regenerate integration test

---------

Co-authored-by: Engel Nyst
Co-authored-by: Graham Neubig
Co-authored-by: yufansong

* Feat: Support Gorilla APIBench (#2081)
* removed unused files from gorilla
* Update run_infer.py, removed unused imports
* Update utils.py
* Update ast_eval_hf.py
* Update ast_eval_tf.py
* Update ast_eval_th.py
* Create README.md
* Update run_infer.py
* make lint
* Update run_infer.py
* fix lint

---------

Co-authored-by: yufansong

* remote useless (#2332)
* fix integration test
* Update agenthub/codeact_agent/prompt.py
* Update agenthub/codeact_agent/prompt.py
* fix integration test

---------

Co-authored-by: Xingyao Wang
Co-authored-by: Frank Xu
Co-authored-by: yufansong
Co-authored-by: yueqis <141804823+yueqis@users.noreply.github.com>
Co-authored-by: Engel Nyst
Co-authored-by: Boxuan Li
Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>
Co-authored-by: Jaskirat Singh <1.jaskiratsingh@gmail.com>
Co-authored-by: tobitege
Co-authored-by: amanape <83104063+amanape@users.noreply.github.com>
Co-authored-by: Aaron Xia
Co-authored-by: Graham Neubig
---
 agenthub/codeact_agent/prompt.py | 35 ++++++++++---------
 .../test_browse_internet/prompt_001.log | 31 ++++++++--------
 .../test_browse_internet/prompt_005.log | 31 ++++++++--------
 .../CodeActAgent/test_edits/prompt_001.log | 31 ++++++++--------
 .../CodeActAgent/test_edits/prompt_002.log | 31 ++++++++--------
 .../CodeActAgent/test_edits/prompt_003.log | 31 ++++++++--------
 .../CodeActAgent/test_ipython/prompt_001.log | 31 ++++++++--------
 .../CodeActAgent/test_ipython/prompt_002.log | 31 ++++++++--------
 .../test_ipython_module/prompt_001.log | 31 ++++++++--------
 .../test_ipython_module/prompt_002.log | 31 ++++++++--------
 .../test_ipython_module/prompt_003.log | 31 ++++++++--------
 .../test_write_simple_script/prompt_001.log | 31 ++++++++--------
 .../test_write_simple_script/prompt_002.log | 31
++++++++-------- .../test_write_simple_script/prompt_003.log | 31 ++++++++-------- 14 files changed, 226 insertions(+), 212 deletions(-) diff --git a/agenthub/codeact_agent/prompt.py b/agenthub/codeact_agent/prompt.py index 9a2369295a4a..6edc1f3947ed 100644 --- a/agenthub/codeact_agent/prompt.py +++ b/agenthub/codeact_agent/prompt.py @@ -10,7 +10,7 @@ # ======= SYSTEM MESSAGE ======= MINIMAL_SYSTEM_PREFIX = """A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions. -The assistant can interact with an interactive Python (Jupyter Notebook) environment and receive the corresponding output when needed. The code should be enclosed using "" tag, for example: +The assistant can use an interactive Python (Jupyter Notebook) environment, executing code with . print("Hello World!") @@ -23,22 +23,23 @@ Also, if a command execution result saying like: Command: "npm start" timed out. Sending SIGINT to the process, you should also retry with running the command in the background. """ -BROWSING_PREFIX = """The assistant can browse the Internet with commands on behalf of the user by wrapping them with and . -For example, you can browse a given URL by Tell me the usa's president using google search . -The assistant should attempt fewer things at a time instead of putting too much commands OR code in one "execute" block. +BROWSING_PREFIX = """The assistant can browse the Internet with and . +For example, Tell me the usa's president using google search . +Or Tell me what is in http://example.com . 
""" PIP_INSTALL_PREFIX = """The assistant can install Python packages using the %pip magic command in an IPython environment by using the following syntax: %pip install [package needed] and should always import packages and define variables before starting to use them.""" SYSTEM_PREFIX = MINIMAL_SYSTEM_PREFIX + BROWSING_PREFIX + PIP_INSTALL_PREFIX -GITHUB_MESSAGE = """To do any activities on GitHub, the assistant should use the token in the $GITHUB_TOKEN environment variable. -For instance, to push a local branch `my_branch` to the github repo `owner/repo`, the assistant can use the following command: +GITHUB_MESSAGE = """To interact with GitHub, use the $GITHUB_TOKEN environment variable. +For example, to push a branch `my_branch` to the GitHub repo `owner/repo`: git push https://$GITHUB_TOKEN@github.com/owner/repo.git my_branch -If the assistant require access to GitHub but $GITHUB_TOKEN is not set, ask the user to set it.""" +If $GITHUB_TOKEN is not set, ask the user to set it.""" -SYSTEM_SUFFIX = """The assistant's response should be concise. -The assistant should include ONLY ONE or or in every one of the responses, unless the assistant is finished with the task or need more input or action from the user in order to proceed. -IMPORTANT: Whenever possible, execute the code for the user using or or instead of providing it. +SYSTEM_SUFFIX = """Responses should be concise. +The assistant should attempt fewer things at a time instead of putting too much commands OR code in one "execute" block. +Include ONLY ONE , , or per response, unless the assistant is finished with the task or need more input or action from the user in order to proceed. +IMPORTANT: Execute code using , , or whenever possible. """ @@ -46,10 +47,10 @@ EXAMPLES = """ --- START OF EXAMPLE --- -USER: Can you create a list of numbers from 1 to 10, and create a web page to display them at port 5000? +USER: Create a list of numbers from 1 to 10, and display them in a web page at port 5000. 
ASSISTANT: -Sure! Let me create a file first: +Sure! Let me create a Python file `app.py`: create_file('app.py') @@ -231,7 +232,7 @@ def index(): [File updated. Please review the changes and make sure they are correct (correct indentation, no duplicate lines, etc). Edit the file again if necessary.] ASSISTANT: -The file has been updated. Let me run the Python file again with the new changes: +Running the updated file: python3 app.py > server.log 2>&1 & @@ -241,14 +242,14 @@ def index(): [1] 126 ASSISTANT: -The server is running on port 5000 with PID 126. You can access the list of numbers in a table format by visiting http://127.0.0.1:5000. Free free to let me know if you have any further requests! +The server is running on port 5000 with PID 126. You can access the list of numbers in a table format by visiting http://127.0.0.1:5000. Let me know if you have any further requests! --- END OF EXAMPLE --- """ INVALID_INPUT_MESSAGE = ( "I don't understand your input. \n" - 'If you want to execute a bash command, please use YOUR_COMMAND_HERE .\n' - 'If you want to execute a block of Python code, please use YOUR_COMMAND_HERE .\n' - 'If you want to browse the Internet, please use YOUR_COMMAND_HERE .\n' + 'For bash commands, use YOUR_COMMAND .\n' + 'For Python code, use YOUR_CODE .\n' + 'For browsing, use YOUR_COMMAND .\n' ) diff --git a/tests/integration/mock/CodeActAgent/test_browse_internet/prompt_001.log b/tests/integration/mock/CodeActAgent/test_browse_internet/prompt_001.log index b012ddc41fbd..e4a6a7c8607c 100644 --- a/tests/integration/mock/CodeActAgent/test_browse_internet/prompt_001.log +++ b/tests/integration/mock/CodeActAgent/test_browse_internet/prompt_001.log @@ -3,7 +3,7 @@ ---------- A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions. 
-The assistant can interact with an interactive Python (Jupyter Notebook) environment and receive the corresponding output when needed. The code should be enclosed using "" tag, for example: +The assistant can use an interactive Python (Jupyter Notebook) environment, executing code with . print("Hello World!") @@ -14,14 +14,14 @@ Important, however: do not run interactive commands. You do not have access to s Also, you need to handle commands that may run indefinitely and not return a result. For such cases, you should redirect the output to a file and run the command in the background to avoid blocking the execution. For example, to run a Python script that might run indefinitely without returning immediately, you can use the following format: python3 app.py > server.log 2>&1 & Also, if a command execution result saying like: Command: "npm start" timed out. Sending SIGINT to the process, you should also retry with running the command in the background. -The assistant can browse the Internet with commands on behalf of the user by wrapping them with and . -For example, you can browse a given URL by Tell me the usa's president using google search . -The assistant should attempt fewer things at a time instead of putting too much commands OR code in one "execute" block. +The assistant can browse the Internet with and . +For example, Tell me the usa's president using google search . +Or Tell me what is in http://example.com . The assistant can install Python packages using the %pip magic command in an IPython environment by using the following syntax: %pip install [package needed] and should always import packages and define variables before starting to use them. -To do any activities on GitHub, the assistant should use the token in the $GITHUB_TOKEN environment variable. -For instance, to push a local branch `my_branch` to the github repo `owner/repo`, the assistant can use the following command: +To interact with GitHub, use the $GITHUB_TOKEN environment variable. 
+For example, to push a branch `my_branch` to the GitHub repo `owner/repo`: git push https://$GITHUB_TOKEN@github.com/owner/repo.git my_branch -If the assistant require access to GitHub but $GITHUB_TOKEN is not set, ask the user to set it. +If $GITHUB_TOKEN is not set, ask the user to set it. Apart from the standard Python library, the assistant can also use the following functions (already imported) in environment: @@ -99,9 +99,10 @@ parse_pptx(file_path: str) -> None: Please note that THE `edit_file` FUNCTION REQUIRES PROPER INDENTATION. If the assistant would like to add the line ' print(x)', it must fully write that out, with all those spaces before the code! Indentation is important and code that is not indented correctly will fail and require fixing before it can be run. -The assistant's response should be concise. -The assistant should include ONLY ONE or or in every one of the responses, unless the assistant is finished with the task or need more input or action from the user in order to proceed. -IMPORTANT: Whenever possible, execute the code for the user using or or instead of providing it. +Responses should be concise. +The assistant should attempt fewer things at a time instead of putting too much commands OR code in one "execute" block. +Include ONLY ONE , , or per response, unless the assistant is finished with the task or need more input or action from the user in order to proceed. +IMPORTANT: Execute code using , , or whenever possible. ---------- @@ -110,10 +111,10 @@ Here is an example of how you can interact with the environment for task solving --- START OF EXAMPLE --- -USER: Can you create a list of numbers from 1 to 10, and create a web page to display them at port 5000? +USER: Create a list of numbers from 1 to 10, and display them in a web page at port 5000. ASSISTANT: -Sure! Let me create a file first: +Sure! Let me create a Python file `app.py`: create_file('app.py') @@ -295,7 +296,7 @@ Observation: [File updated. 
Please review the changes and make sure they are correct (correct indentation, no duplicate lines, etc). Edit the file again if necessary.] ASSISTANT: -The file has been updated. Let me run the Python file again with the new changes: +Running the updated file: python3 app.py > server.log 2>&1 & @@ -305,7 +306,7 @@ Observation: [1] 126 ASSISTANT: -The server is running on port 5000 with PID 126. You can access the list of numbers in a table format by visiting http://127.0.0.1:5000. Free free to let me know if you have any further requests! +The server is running on port 5000 with PID 126. You can access the list of numbers in a table format by visiting http://127.0.0.1:5000. Let me know if you have any further requests! --- END OF EXAMPLE --- @@ -316,4 +317,4 @@ NOW, LET'S START! Browse localhost:8000, and tell me the ultimate answer to life. Do not ask me for confirmation at any point. -ENVIRONMENT REMINDER: You have 9 turns left to complete the task. \ No newline at end of file +ENVIRONMENT REMINDER: You have 9 turns left to complete the task. diff --git a/tests/integration/mock/CodeActAgent/test_browse_internet/prompt_005.log b/tests/integration/mock/CodeActAgent/test_browse_internet/prompt_005.log index 6a6a386e731c..2a82aed07d96 100644 --- a/tests/integration/mock/CodeActAgent/test_browse_internet/prompt_005.log +++ b/tests/integration/mock/CodeActAgent/test_browse_internet/prompt_005.log @@ -3,7 +3,7 @@ ---------- A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions. -The assistant can interact with an interactive Python (Jupyter Notebook) environment and receive the corresponding output when needed. The code should be enclosed using "" tag, for example: +The assistant can use an interactive Python (Jupyter Notebook) environment, executing code with . print("Hello World!") @@ -14,14 +14,14 @@ Important, however: do not run interactive commands. 
You do not have access to s Also, you need to handle commands that may run indefinitely and not return a result. For such cases, you should redirect the output to a file and run the command in the background to avoid blocking the execution. For example, to run a Python script that might run indefinitely without returning immediately, you can use the following format: python3 app.py > server.log 2>&1 & Also, if a command execution result saying like: Command: "npm start" timed out. Sending SIGINT to the process, you should also retry with running the command in the background. -The assistant can browse the Internet with commands on behalf of the user by wrapping them with and . -For example, you can browse a given URL by Tell me the usa's president using google search . -The assistant should attempt fewer things at a time instead of putting too much commands OR code in one "execute" block. +The assistant can browse the Internet with and . +For example, Tell me the usa's president using google search . +Or Tell me what is in http://example.com . The assistant can install Python packages using the %pip magic command in an IPython environment by using the following syntax: %pip install [package needed] and should always import packages and define variables before starting to use them. -To do any activities on GitHub, the assistant should use the token in the $GITHUB_TOKEN environment variable. -For instance, to push a local branch `my_branch` to the github repo `owner/repo`, the assistant can use the following command: +To interact with GitHub, use the $GITHUB_TOKEN environment variable. +For example, to push a branch `my_branch` to the GitHub repo `owner/repo`: git push https://$GITHUB_TOKEN@github.com/owner/repo.git my_branch -If the assistant require access to GitHub but $GITHUB_TOKEN is not set, ask the user to set it. +If $GITHUB_TOKEN is not set, ask the user to set it. 
Apart from the standard Python library, the assistant can also use the following functions (already imported) in environment: @@ -99,9 +99,10 @@ parse_pptx(file_path: str) -> None: Please note that THE `edit_file` FUNCTION REQUIRES PROPER INDENTATION. If the assistant would like to add the line ' print(x)', it must fully write that out, with all those spaces before the code! Indentation is important and code that is not indented correctly will fail and require fixing before it can be run. -The assistant's response should be concise. -The assistant should include ONLY ONE or or in every one of the responses, unless the assistant is finished with the task or need more input or action from the user in order to proceed. -IMPORTANT: Whenever possible, execute the code for the user using or or instead of providing it. +Responses should be concise. +The assistant should attempt fewer things at a time instead of putting too much commands OR code in one "execute" block. +Include ONLY ONE , , or per response, unless the assistant is finished with the task or need more input or action from the user in order to proceed. +IMPORTANT: Execute code using , , or whenever possible. ---------- @@ -110,10 +111,10 @@ Here is an example of how you can interact with the environment for task solving --- START OF EXAMPLE --- -USER: Can you create a list of numbers from 1 to 10, and create a web page to display them at port 5000? +USER: Create a list of numbers from 1 to 10, and display them in a web page at port 5000. ASSISTANT: -Sure! Let me create a file first: +Sure! Let me create a Python file `app.py`: create_file('app.py') @@ -295,7 +296,7 @@ Observation: [File updated. Please review the changes and make sure they are correct (correct indentation, no duplicate lines, etc). Edit the file again if necessary.] ASSISTANT: -The file has been updated. 
Let me run the Python file again with the new changes: +Running the updated file: python3 app.py > server.log 2>&1 & @@ -305,7 +306,7 @@ Observation: [1] 126 ASSISTANT: -The server is running on port 5000 with PID 126. You can access the list of numbers in a table format by visiting http://127.0.0.1:5000. Free free to let me know if you have any further requests! +The server is running on port 5000 with PID 126. You can access the list of numbers in a table format by visiting http://127.0.0.1:5000. Let me know if you have any further requests! --- END OF EXAMPLE --- @@ -321,4 +322,4 @@ Browse localhost:8000, and tell me the ultimate answer to life. Do not ask me fo OBSERVATION: {'content': 'The answer to life, the universe, and everything is: OpenDevin is all you need!'} -ENVIRONMENT REMINDER: You have 8 turns left to complete the task. \ No newline at end of file +ENVIRONMENT REMINDER: You have 8 turns left to complete the task. diff --git a/tests/integration/mock/CodeActAgent/test_edits/prompt_001.log b/tests/integration/mock/CodeActAgent/test_edits/prompt_001.log index 67a4df89a35a..3aacc4003a0d 100644 --- a/tests/integration/mock/CodeActAgent/test_edits/prompt_001.log +++ b/tests/integration/mock/CodeActAgent/test_edits/prompt_001.log @@ -3,7 +3,7 @@ ---------- A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions. -The assistant can interact with an interactive Python (Jupyter Notebook) environment and receive the corresponding output when needed. The code should be enclosed using "" tag, for example: +The assistant can use an interactive Python (Jupyter Notebook) environment, executing code with . print("Hello World!") @@ -14,14 +14,14 @@ Important, however: do not run interactive commands. You do not have access to s Also, you need to handle commands that may run indefinitely and not return a result. 
For such cases, you should redirect the output to a file and run the command in the background to avoid blocking the execution. For example, to run a Python script that might run indefinitely without returning immediately, you can use the following format: python3 app.py > server.log 2>&1 & Also, if a command execution result saying like: Command: "npm start" timed out. Sending SIGINT to the process, you should also retry with running the command in the background. -The assistant can browse the Internet with commands on behalf of the user by wrapping them with and . -For example, you can browse a given URL by Tell me the usa's president using google search . -The assistant should attempt fewer things at a time instead of putting too much commands OR code in one "execute" block. +The assistant can browse the Internet with and . +For example, Tell me the usa's president using google search . +Or Tell me what is in http://example.com . The assistant can install Python packages using the %pip magic command in an IPython environment by using the following syntax: %pip install [package needed] and should always import packages and define variables before starting to use them. -To do any activities on GitHub, the assistant should use the token in the $GITHUB_TOKEN environment variable. -For instance, to push a local branch `my_branch` to the github repo `owner/repo`, the assistant can use the following command: +To interact with GitHub, use the $GITHUB_TOKEN environment variable. +For example, to push a branch `my_branch` to the GitHub repo `owner/repo`: git push https://$GITHUB_TOKEN@github.com/owner/repo.git my_branch -If the assistant require access to GitHub but $GITHUB_TOKEN is not set, ask the user to set it. +If $GITHUB_TOKEN is not set, ask the user to set it. 
Apart from the standard Python library, the assistant can also use the following functions (already imported) in environment: @@ -99,9 +99,10 @@ parse_pptx(file_path: str) -> None: Please note that THE `edit_file` FUNCTION REQUIRES PROPER INDENTATION. If the assistant would like to add the line ' print(x)', it must fully write that out, with all those spaces before the code! Indentation is important and code that is not indented correctly will fail and require fixing before it can be run. -The assistant's response should be concise. -The assistant should include ONLY ONE or or in every one of the responses, unless the assistant is finished with the task or need more input or action from the user in order to proceed. -IMPORTANT: Whenever possible, execute the code for the user using or or instead of providing it. +Responses should be concise. +The assistant should attempt fewer things at a time instead of putting too much commands OR code in one "execute" block. +Include ONLY ONE , , or per response, unless the assistant is finished with the task or need more input or action from the user in order to proceed. +IMPORTANT: Execute code using , , or whenever possible. ---------- @@ -110,10 +111,10 @@ Here is an example of how you can interact with the environment for task solving --- START OF EXAMPLE --- -USER: Can you create a list of numbers from 1 to 10, and create a web page to display them at port 5000? +USER: Create a list of numbers from 1 to 10, and display them in a web page at port 5000. ASSISTANT: -Sure! Let me create a file first: +Sure! Let me create a Python file `app.py`: create_file('app.py') @@ -295,7 +296,7 @@ Observation: [File updated. Please review the changes and make sure they are correct (correct indentation, no duplicate lines, etc). Edit the file again if necessary.] ASSISTANT: -The file has been updated. 
Let me run the Python file again with the new changes: +Running the updated file: python3 app.py > server.log 2>&1 & @@ -305,7 +306,7 @@ Observation: [1] 126 ASSISTANT: -The server is running on port 5000 with PID 126. You can access the list of numbers in a table format by visiting http://127.0.0.1:5000. Free free to let me know if you have any further requests! +The server is running on port 5000 with PID 126. You can access the list of numbers in a table format by visiting http://127.0.0.1:5000. Let me know if you have any further requests! --- END OF EXAMPLE --- @@ -316,4 +317,4 @@ NOW, LET'S START! Fix typos in bad.txt. Do not ask me for confirmation at any point. -ENVIRONMENT REMINDER: You have 9 turns left to complete the task. \ No newline at end of file +ENVIRONMENT REMINDER: You have 9 turns left to complete the task. diff --git a/tests/integration/mock/CodeActAgent/test_edits/prompt_002.log b/tests/integration/mock/CodeActAgent/test_edits/prompt_002.log index 4dbd6b5ff93d..4e4d3f62af70 100644 --- a/tests/integration/mock/CodeActAgent/test_edits/prompt_002.log +++ b/tests/integration/mock/CodeActAgent/test_edits/prompt_002.log @@ -3,7 +3,7 @@ ---------- A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions. -The assistant can interact with an interactive Python (Jupyter Notebook) environment and receive the corresponding output when needed. The code should be enclosed using "" tag, for example: +The assistant can use an interactive Python (Jupyter Notebook) environment, executing code with . print("Hello World!") @@ -14,14 +14,14 @@ Important, however: do not run interactive commands. You do not have access to s Also, you need to handle commands that may run indefinitely and not return a result. For such cases, you should redirect the output to a file and run the command in the background to avoid blocking the execution. 
For example, to run a Python script that might run indefinitely without returning immediately, you can use the following format: python3 app.py > server.log 2>&1 & Also, if a command execution result saying like: Command: "npm start" timed out. Sending SIGINT to the process, you should also retry with running the command in the background. -The assistant can browse the Internet with commands on behalf of the user by wrapping them with and . -For example, you can browse a given URL by Tell me the usa's president using google search . -The assistant should attempt fewer things at a time instead of putting too much commands OR code in one "execute" block. +The assistant can browse the Internet with and . +For example, Tell me the usa's president using google search . +Or Tell me what is in http://example.com . The assistant can install Python packages using the %pip magic command in an IPython environment by using the following syntax: %pip install [package needed] and should always import packages and define variables before starting to use them. -To do any activities on GitHub, the assistant should use the token in the $GITHUB_TOKEN environment variable. -For instance, to push a local branch `my_branch` to the github repo `owner/repo`, the assistant can use the following command: +To interact with GitHub, use the $GITHUB_TOKEN environment variable. +For example, to push a branch `my_branch` to the GitHub repo `owner/repo`: git push https://$GITHUB_TOKEN@github.com/owner/repo.git my_branch -If the assistant require access to GitHub but $GITHUB_TOKEN is not set, ask the user to set it. +If $GITHUB_TOKEN is not set, ask the user to set it. Apart from the standard Python library, the assistant can also use the following functions (already imported) in environment: @@ -99,9 +99,10 @@ parse_pptx(file_path: str) -> None: Please note that THE `edit_file` FUNCTION REQUIRES PROPER INDENTATION. 
If the assistant would like to add the line ' print(x)', it must fully write that out, with all those spaces before the code! Indentation is important and code that is not indented correctly will fail and require fixing before it can be run. -The assistant's response should be concise. -The assistant should include ONLY ONE or or in every one of the responses, unless the assistant is finished with the task or need more input or action from the user in order to proceed. -IMPORTANT: Whenever possible, execute the code for the user using or or instead of providing it. +Responses should be concise. +The assistant should attempt fewer things at a time instead of putting too much commands OR code in one "execute" block. +Include ONLY ONE , , or per response, unless the assistant is finished with the task or need more input or action from the user in order to proceed. +IMPORTANT: Execute code using , , or whenever possible. ---------- @@ -110,10 +111,10 @@ Here is an example of how you can interact with the environment for task solving --- START OF EXAMPLE --- -USER: Can you create a list of numbers from 1 to 10, and create a web page to display them at port 5000? +USER: Create a list of numbers from 1 to 10, and display them in a web page at port 5000. ASSISTANT: -Sure! Let me create a file first: +Sure! Let me create a Python file `app.py`: create_file('app.py') @@ -295,7 +296,7 @@ Observation: [File updated. Please review the changes and make sure they are correct (correct indentation, no duplicate lines, etc). Edit the file again if necessary.] ASSISTANT: -The file has been updated. Let me run the Python file again with the new changes: +Running the updated file: python3 app.py > server.log 2>&1 & @@ -305,7 +306,7 @@ Observation: [1] 126 ASSISTANT: -The server is running on port 5000 with PID 126. You can access the list of numbers in a table format by visiting http://127.0.0.1:5000. Free free to let me know if you have any further requests! 
+The server is running on port 5000 with PID 126. You can access the list of numbers in a table format by visiting http://127.0.0.1:5000. Let me know if you have any further requests! --- END OF EXAMPLE --- @@ -333,4 +334,4 @@ OBSERVATION: 4|Enjoy! -ENVIRONMENT REMINDER: You have 8 turns left to complete the task. \ No newline at end of file +ENVIRONMENT REMINDER: You have 8 turns left to complete the task. diff --git a/tests/integration/mock/CodeActAgent/test_edits/prompt_003.log b/tests/integration/mock/CodeActAgent/test_edits/prompt_003.log index ed8fd9e16faa..4eda64d2a646 100644 --- a/tests/integration/mock/CodeActAgent/test_edits/prompt_003.log +++ b/tests/integration/mock/CodeActAgent/test_edits/prompt_003.log @@ -3,7 +3,7 @@ ---------- A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions. -The assistant can interact with an interactive Python (Jupyter Notebook) environment and receive the corresponding output when needed. The code should be enclosed using "" tag, for example: +The assistant can use an interactive Python (Jupyter Notebook) environment, executing code with . print("Hello World!") @@ -14,14 +14,14 @@ Important, however: do not run interactive commands. You do not have access to s Also, you need to handle commands that may run indefinitely and not return a result. For such cases, you should redirect the output to a file and run the command in the background to avoid blocking the execution. For example, to run a Python script that might run indefinitely without returning immediately, you can use the following format: python3 app.py > server.log 2>&1 & Also, if a command execution result saying like: Command: "npm start" timed out. Sending SIGINT to the process, you should also retry with running the command in the background. -The assistant can browse the Internet with commands on behalf of the user by wrapping them with and . 
-For example, you can browse a given URL by Tell me the usa's president using google search . -The assistant should attempt fewer things at a time instead of putting too much commands OR code in one "execute" block. +The assistant can browse the Internet with and . +For example, Tell me the usa's president using google search . +Or Tell me what is in http://example.com . The assistant can install Python packages using the %pip magic command in an IPython environment by using the following syntax: %pip install [package needed] and should always import packages and define variables before starting to use them. -To do any activities on GitHub, the assistant should use the token in the $GITHUB_TOKEN environment variable. -For instance, to push a local branch `my_branch` to the github repo `owner/repo`, the assistant can use the following command: +To interact with GitHub, use the $GITHUB_TOKEN environment variable. +For example, to push a branch `my_branch` to the GitHub repo `owner/repo`: git push https://$GITHUB_TOKEN@github.com/owner/repo.git my_branch -If the assistant require access to GitHub but $GITHUB_TOKEN is not set, ask the user to set it. +If $GITHUB_TOKEN is not set, ask the user to set it. Apart from the standard Python library, the assistant can also use the following functions (already imported) in environment: @@ -99,9 +99,10 @@ parse_pptx(file_path: str) -> None: Please note that THE `edit_file` FUNCTION REQUIRES PROPER INDENTATION. If the assistant would like to add the line ' print(x)', it must fully write that out, with all those spaces before the code! Indentation is important and code that is not indented correctly will fail and require fixing before it can be run. -The assistant's response should be concise. -The assistant should include ONLY ONE or or in every one of the responses, unless the assistant is finished with the task or need more input or action from the user in order to proceed. 
-IMPORTANT: Whenever possible, execute the code for the user using or or instead of providing it. +Responses should be concise. +The assistant should attempt fewer things at a time instead of putting too much commands OR code in one "execute" block. +Include ONLY ONE , , or per response, unless the assistant is finished with the task or need more input or action from the user in order to proceed. +IMPORTANT: Execute code using , , or whenever possible. ---------- @@ -110,10 +111,10 @@ Here is an example of how you can interact with the environment for task solving --- START OF EXAMPLE --- -USER: Can you create a list of numbers from 1 to 10, and create a web page to display them at port 5000? +USER: Create a list of numbers from 1 to 10, and display them in a web page at port 5000. ASSISTANT: -Sure! Let me create a file first: +Sure! Let me create a Python file `app.py`: create_file('app.py') @@ -295,7 +296,7 @@ Observation: [File updated. Please review the changes and make sure they are correct (correct indentation, no duplicate lines, etc). Edit the file again if necessary.] ASSISTANT: -The file has been updated. Let me run the Python file again with the new changes: +Running the updated file: python3 app.py > server.log 2>&1 & @@ -305,7 +306,7 @@ Observation: [1] 126 ASSISTANT: -The server is running on port 5000 with PID 126. You can access the list of numbers in a table format by visiting http://127.0.0.1:5000. Free free to let me know if you have any further requests! +The server is running on port 5000 with PID 126. You can access the list of numbers in a table format by visiting http://127.0.0.1:5000. Let me know if you have any further requests! --- END OF EXAMPLE --- @@ -354,4 +355,4 @@ OBSERVATION: [File updated. Please review the changes and make sure they are correct (correct indentation, no duplicate lines, etc). Edit the file again if necessary.] -ENVIRONMENT REMINDER: You have 7 turns left to complete the task. 
\ No newline at end of file +ENVIRONMENT REMINDER: You have 7 turns left to complete the task. diff --git a/tests/integration/mock/CodeActAgent/test_ipython/prompt_001.log b/tests/integration/mock/CodeActAgent/test_ipython/prompt_001.log index a07b496378c7..25dc81a0aa6e 100644 --- a/tests/integration/mock/CodeActAgent/test_ipython/prompt_001.log +++ b/tests/integration/mock/CodeActAgent/test_ipython/prompt_001.log @@ -3,7 +3,7 @@ ---------- A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions. -The assistant can interact with an interactive Python (Jupyter Notebook) environment and receive the corresponding output when needed. The code should be enclosed using "" tag, for example: +The assistant can use an interactive Python (Jupyter Notebook) environment, executing code with . print("Hello World!") @@ -14,14 +14,14 @@ Important, however: do not run interactive commands. You do not have access to s Also, you need to handle commands that may run indefinitely and not return a result. For such cases, you should redirect the output to a file and run the command in the background to avoid blocking the execution. For example, to run a Python script that might run indefinitely without returning immediately, you can use the following format: python3 app.py > server.log 2>&1 & Also, if a command execution result saying like: Command: "npm start" timed out. Sending SIGINT to the process, you should also retry with running the command in the background. -The assistant can browse the Internet with commands on behalf of the user by wrapping them with and . -For example, you can browse a given URL by Tell me the usa's president using google search . -The assistant should attempt fewer things at a time instead of putting too much commands OR code in one "execute" block. +The assistant can browse the Internet with and . 
+For example, Tell me the usa's president using google search . +Or Tell me what is in http://example.com . The assistant can install Python packages using the %pip magic command in an IPython environment by using the following syntax: %pip install [package needed] and should always import packages and define variables before starting to use them. -To do any activities on GitHub, the assistant should use the token in the $GITHUB_TOKEN environment variable. -For instance, to push a local branch `my_branch` to the github repo `owner/repo`, the assistant can use the following command: +To interact with GitHub, use the $GITHUB_TOKEN environment variable. +For example, to push a branch `my_branch` to the GitHub repo `owner/repo`: git push https://$GITHUB_TOKEN@github.com/owner/repo.git my_branch -If the assistant require access to GitHub but $GITHUB_TOKEN is not set, ask the user to set it. +If $GITHUB_TOKEN is not set, ask the user to set it. Apart from the standard Python library, the assistant can also use the following functions (already imported) in environment: @@ -99,9 +99,10 @@ parse_pptx(file_path: str) -> None: Please note that THE `edit_file` FUNCTION REQUIRES PROPER INDENTATION. If the assistant would like to add the line ' print(x)', it must fully write that out, with all those spaces before the code! Indentation is important and code that is not indented correctly will fail and require fixing before it can be run. -The assistant's response should be concise. -The assistant should include ONLY ONE or or in every one of the responses, unless the assistant is finished with the task or need more input or action from the user in order to proceed. -IMPORTANT: Whenever possible, execute the code for the user using or or instead of providing it. +Responses should be concise. +The assistant should attempt fewer things at a time instead of putting too much commands OR code in one "execute" block. 
+Include ONLY ONE , , or per response, unless the assistant is finished with the task or need more input or action from the user in order to proceed. +IMPORTANT: Execute code using , , or whenever possible. ---------- @@ -110,10 +111,10 @@ Here is an example of how you can interact with the environment for task solving --- START OF EXAMPLE --- -USER: Can you create a list of numbers from 1 to 10, and create a web page to display them at port 5000? +USER: Create a list of numbers from 1 to 10, and display them in a web page at port 5000. ASSISTANT: -Sure! Let me create a file first: +Sure! Let me create a Python file `app.py`: create_file('app.py') @@ -295,7 +296,7 @@ Observation: [File updated. Please review the changes and make sure they are correct (correct indentation, no duplicate lines, etc). Edit the file again if necessary.] ASSISTANT: -The file has been updated. Let me run the Python file again with the new changes: +Running the updated file: python3 app.py > server.log 2>&1 & @@ -305,7 +306,7 @@ Observation: [1] 126 ASSISTANT: -The server is running on port 5000 with PID 126. You can access the list of numbers in a table format by visiting http://127.0.0.1:5000. Free free to let me know if you have any further requests! +The server is running on port 5000 with PID 126. You can access the list of numbers in a table format by visiting http://127.0.0.1:5000. Let me know if you have any further requests! --- END OF EXAMPLE --- @@ -316,4 +317,4 @@ NOW, LET'S START! Use Jupyter IPython to write a text file containing 'hello world' to '/workspace/test.txt'. Do not ask me for confirmation at any point. -ENVIRONMENT REMINDER: You have 9 turns left to complete the task. \ No newline at end of file +ENVIRONMENT REMINDER: You have 9 turns left to complete the task. 
diff --git a/tests/integration/mock/CodeActAgent/test_ipython/prompt_002.log b/tests/integration/mock/CodeActAgent/test_ipython/prompt_002.log index d8a0cc8855d2..84f44f3a48f0 100644 --- a/tests/integration/mock/CodeActAgent/test_ipython/prompt_002.log +++ b/tests/integration/mock/CodeActAgent/test_ipython/prompt_002.log @@ -3,7 +3,7 @@ ---------- A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions. -The assistant can interact with an interactive Python (Jupyter Notebook) environment and receive the corresponding output when needed. The code should be enclosed using "" tag, for example: +The assistant can use an interactive Python (Jupyter Notebook) environment, executing code with . print("Hello World!") @@ -14,14 +14,14 @@ Important, however: do not run interactive commands. You do not have access to s Also, you need to handle commands that may run indefinitely and not return a result. For such cases, you should redirect the output to a file and run the command in the background to avoid blocking the execution. For example, to run a Python script that might run indefinitely without returning immediately, you can use the following format: python3 app.py > server.log 2>&1 & Also, if a command execution result saying like: Command: "npm start" timed out. Sending SIGINT to the process, you should also retry with running the command in the background. -The assistant can browse the Internet with commands on behalf of the user by wrapping them with and . -For example, you can browse a given URL by Tell me the usa's president using google search . -The assistant should attempt fewer things at a time instead of putting too much commands OR code in one "execute" block. +The assistant can browse the Internet with and . +For example, Tell me the usa's president using google search . +Or Tell me what is in http://example.com . 
The assistant can install Python packages using the %pip magic command in an IPython environment by using the following syntax: %pip install [package needed] and should always import packages and define variables before starting to use them. -To do any activities on GitHub, the assistant should use the token in the $GITHUB_TOKEN environment variable. -For instance, to push a local branch `my_branch` to the github repo `owner/repo`, the assistant can use the following command: +To interact with GitHub, use the $GITHUB_TOKEN environment variable. +For example, to push a branch `my_branch` to the GitHub repo `owner/repo`: git push https://$GITHUB_TOKEN@github.com/owner/repo.git my_branch -If the assistant require access to GitHub but $GITHUB_TOKEN is not set, ask the user to set it. +If $GITHUB_TOKEN is not set, ask the user to set it. Apart from the standard Python library, the assistant can also use the following functions (already imported) in environment: @@ -99,9 +99,10 @@ parse_pptx(file_path: str) -> None: Please note that THE `edit_file` FUNCTION REQUIRES PROPER INDENTATION. If the assistant would like to add the line ' print(x)', it must fully write that out, with all those spaces before the code! Indentation is important and code that is not indented correctly will fail and require fixing before it can be run. -The assistant's response should be concise. -The assistant should include ONLY ONE or or in every one of the responses, unless the assistant is finished with the task or need more input or action from the user in order to proceed. -IMPORTANT: Whenever possible, execute the code for the user using or or instead of providing it. +Responses should be concise. +The assistant should attempt fewer things at a time instead of putting too much commands OR code in one "execute" block. +Include ONLY ONE , , or per response, unless the assistant is finished with the task or need more input or action from the user in order to proceed. 
+IMPORTANT: Execute code using , , or whenever possible. ---------- @@ -110,10 +111,10 @@ Here is an example of how you can interact with the environment for task solving --- START OF EXAMPLE --- -USER: Can you create a list of numbers from 1 to 10, and create a web page to display them at port 5000? +USER: Create a list of numbers from 1 to 10, and display them in a web page at port 5000. ASSISTANT: -Sure! Let me create a file first: +Sure! Let me create a Python file `app.py`: create_file('app.py') @@ -295,7 +296,7 @@ Observation: [File updated. Please review the changes and make sure they are correct (correct indentation, no duplicate lines, etc). Edit the file again if necessary.] ASSISTANT: -The file has been updated. Let me run the Python file again with the new changes: +Running the updated file: python3 app.py > server.log 2>&1 & @@ -305,7 +306,7 @@ Observation: [1] 126 ASSISTANT: -The server is running on port 5000 with PID 126. You can access the list of numbers in a table format by visiting http://127.0.0.1:5000. Free free to let me know if you have any further requests! +The server is running on port 5000 with PID 126. You can access the list of numbers in a table format by visiting http://127.0.0.1:5000. Let me know if you have any further requests! --- END OF EXAMPLE --- @@ -329,4 +330,4 @@ with open('/workspace/test.txt', 'w') as f: OBSERVATION: [Code executed successfully with no output] -ENVIRONMENT REMINDER: You have 8 turns left to complete the task. \ No newline at end of file +ENVIRONMENT REMINDER: You have 8 turns left to complete the task. 
diff --git a/tests/integration/mock/CodeActAgent/test_ipython_module/prompt_001.log b/tests/integration/mock/CodeActAgent/test_ipython_module/prompt_001.log index ec8331671dfb..b2e7837674d6 100644 --- a/tests/integration/mock/CodeActAgent/test_ipython_module/prompt_001.log +++ b/tests/integration/mock/CodeActAgent/test_ipython_module/prompt_001.log @@ -3,7 +3,7 @@ ---------- A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions. -The assistant can interact with an interactive Python (Jupyter Notebook) environment and receive the corresponding output when needed. The code should be enclosed using "" tag, for example: +The assistant can use an interactive Python (Jupyter Notebook) environment, executing code with . print("Hello World!") @@ -14,14 +14,14 @@ Important, however: do not run interactive commands. You do not have access to s Also, you need to handle commands that may run indefinitely and not return a result. For such cases, you should redirect the output to a file and run the command in the background to avoid blocking the execution. For example, to run a Python script that might run indefinitely without returning immediately, you can use the following format: python3 app.py > server.log 2>&1 & Also, if a command execution result saying like: Command: "npm start" timed out. Sending SIGINT to the process, you should also retry with running the command in the background. -The assistant can browse the Internet with commands on behalf of the user by wrapping them with and . -For example, you can browse a given URL by Tell me the usa's president using google search . -The assistant should attempt fewer things at a time instead of putting too much commands OR code in one "execute" block. +The assistant can browse the Internet with and . +For example, Tell me the usa's president using google search . +Or Tell me what is in http://example.com . 
The assistant can install Python packages using the %pip magic command in an IPython environment by using the following syntax: %pip install [package needed] and should always import packages and define variables before starting to use them. -To do any activities on GitHub, the assistant should use the token in the $GITHUB_TOKEN environment variable. -For instance, to push a local branch `my_branch` to the github repo `owner/repo`, the assistant can use the following command: +To interact with GitHub, use the $GITHUB_TOKEN environment variable. +For example, to push a branch `my_branch` to the GitHub repo `owner/repo`: git push https://$GITHUB_TOKEN@github.com/owner/repo.git my_branch -If the assistant require access to GitHub but $GITHUB_TOKEN is not set, ask the user to set it. +If $GITHUB_TOKEN is not set, ask the user to set it. Apart from the standard Python library, the assistant can also use the following functions (already imported) in environment: @@ -99,9 +99,10 @@ parse_pptx(file_path: str) -> None: Please note that THE `edit_file` FUNCTION REQUIRES PROPER INDENTATION. If the assistant would like to add the line ' print(x)', it must fully write that out, with all those spaces before the code! Indentation is important and code that is not indented correctly will fail and require fixing before it can be run. -The assistant's response should be concise. -The assistant should include ONLY ONE or or in every one of the responses, unless the assistant is finished with the task or need more input or action from the user in order to proceed. -IMPORTANT: Whenever possible, execute the code for the user using or or instead of providing it. +Responses should be concise. +The assistant should attempt fewer things at a time instead of putting too much commands OR code in one "execute" block. +Include ONLY ONE , , or per response, unless the assistant is finished with the task or need more input or action from the user in order to proceed. 
+IMPORTANT: Execute code using , , or whenever possible. ---------- @@ -110,10 +111,10 @@ Here is an example of how you can interact with the environment for task solving --- START OF EXAMPLE --- -USER: Can you create a list of numbers from 1 to 10, and create a web page to display them at port 5000? +USER: Create a list of numbers from 1 to 10, and display them in a web page at port 5000. ASSISTANT: -Sure! Let me create a file first: +Sure! Let me create a Python file `app.py`: create_file('app.py') @@ -295,7 +296,7 @@ Observation: [File updated. Please review the changes and make sure they are correct (correct indentation, no duplicate lines, etc). Edit the file again if necessary.] ASSISTANT: -The file has been updated. Let me run the Python file again with the new changes: +Running the updated file: python3 app.py > server.log 2>&1 & @@ -305,7 +306,7 @@ Observation: [1] 126 ASSISTANT: -The server is running on port 5000 with PID 126. You can access the list of numbers in a table format by visiting http://127.0.0.1:5000. Free free to let me know if you have any further requests! +The server is running on port 5000 with PID 126. You can access the list of numbers in a table format by visiting http://127.0.0.1:5000. Let me know if you have any further requests! --- END OF EXAMPLE --- @@ -316,4 +317,4 @@ NOW, LET'S START! Install and import pymsgbox==1.0.9 and print it's version in /workspace/test.txt. Do not ask me for confirmation at any point. -ENVIRONMENT REMINDER: You have 9 turns left to complete the task. \ No newline at end of file +ENVIRONMENT REMINDER: You have 9 turns left to complete the task. 
diff --git a/tests/integration/mock/CodeActAgent/test_ipython_module/prompt_002.log b/tests/integration/mock/CodeActAgent/test_ipython_module/prompt_002.log index cd67cf0c2ec2..0f9e81b9bc1e 100644 --- a/tests/integration/mock/CodeActAgent/test_ipython_module/prompt_002.log +++ b/tests/integration/mock/CodeActAgent/test_ipython_module/prompt_002.log @@ -3,7 +3,7 @@ ---------- A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions. -The assistant can interact with an interactive Python (Jupyter Notebook) environment and receive the corresponding output when needed. The code should be enclosed using "" tag, for example: +The assistant can use an interactive Python (Jupyter Notebook) environment, executing code with . print("Hello World!") @@ -14,14 +14,14 @@ Important, however: do not run interactive commands. You do not have access to s Also, you need to handle commands that may run indefinitely and not return a result. For such cases, you should redirect the output to a file and run the command in the background to avoid blocking the execution. For example, to run a Python script that might run indefinitely without returning immediately, you can use the following format: python3 app.py > server.log 2>&1 & Also, if a command execution result saying like: Command: "npm start" timed out. Sending SIGINT to the process, you should also retry with running the command in the background. -The assistant can browse the Internet with commands on behalf of the user by wrapping them with and . -For example, you can browse a given URL by Tell me the usa's president using google search . -The assistant should attempt fewer things at a time instead of putting too much commands OR code in one "execute" block. +The assistant can browse the Internet with and . +For example, Tell me the usa's president using google search . +Or Tell me what is in http://example.com . 
The assistant can install Python packages using the %pip magic command in an IPython environment by using the following syntax: %pip install [package needed] and should always import packages and define variables before starting to use them. -To do any activities on GitHub, the assistant should use the token in the $GITHUB_TOKEN environment variable. -For instance, to push a local branch `my_branch` to the github repo `owner/repo`, the assistant can use the following command: +To interact with GitHub, use the $GITHUB_TOKEN environment variable. +For example, to push a branch `my_branch` to the GitHub repo `owner/repo`: git push https://$GITHUB_TOKEN@github.com/owner/repo.git my_branch -If the assistant require access to GitHub but $GITHUB_TOKEN is not set, ask the user to set it. +If $GITHUB_TOKEN is not set, ask the user to set it. Apart from the standard Python library, the assistant can also use the following functions (already imported) in environment: @@ -99,9 +99,10 @@ parse_pptx(file_path: str) -> None: Please note that THE `edit_file` FUNCTION REQUIRES PROPER INDENTATION. If the assistant would like to add the line ' print(x)', it must fully write that out, with all those spaces before the code! Indentation is important and code that is not indented correctly will fail and require fixing before it can be run. -The assistant's response should be concise. -The assistant should include ONLY ONE or or in every one of the responses, unless the assistant is finished with the task or need more input or action from the user in order to proceed. -IMPORTANT: Whenever possible, execute the code for the user using or or instead of providing it. +Responses should be concise. +The assistant should attempt fewer things at a time instead of putting too much commands OR code in one "execute" block. +Include ONLY ONE , , or per response, unless the assistant is finished with the task or need more input or action from the user in order to proceed. 
+IMPORTANT: Execute code using , , or whenever possible. ---------- @@ -110,10 +111,10 @@ Here is an example of how you can interact with the environment for task solving --- START OF EXAMPLE --- -USER: Can you create a list of numbers from 1 to 10, and create a web page to display them at port 5000? +USER: Create a list of numbers from 1 to 10, and display them in a web page at port 5000. ASSISTANT: -Sure! Let me create a file first: +Sure! Let me create a Python file `app.py`: create_file('app.py') @@ -295,7 +296,7 @@ Observation: [File updated. Please review the changes and make sure they are correct (correct indentation, no duplicate lines, etc). Edit the file again if necessary.] ASSISTANT: -The file has been updated. Let me run the Python file again with the new changes: +Running the updated file: python3 app.py > server.log 2>&1 & @@ -305,7 +306,7 @@ Observation: [1] 126 ASSISTANT: -The server is running on port 5000 with PID 126. You can access the list of numbers in a table format by visiting http://127.0.0.1:5000. Free free to let me know if you have any further requests! +The server is running on port 5000 with PID 126. You can access the list of numbers in a table format by visiting http://127.0.0.1:5000. Let me know if you have any further requests! --- END OF EXAMPLE --- @@ -336,4 +337,4 @@ OBSERVATION: [Package installed successfully] [Kernel restarted successfully to load the package] -ENVIRONMENT REMINDER: You have 8 turns left to complete the task. \ No newline at end of file +ENVIRONMENT REMINDER: You have 8 turns left to complete the task. 
diff --git a/tests/integration/mock/CodeActAgent/test_ipython_module/prompt_003.log b/tests/integration/mock/CodeActAgent/test_ipython_module/prompt_003.log index 9b55db5d3c06..2da99e4f5859 100644 --- a/tests/integration/mock/CodeActAgent/test_ipython_module/prompt_003.log +++ b/tests/integration/mock/CodeActAgent/test_ipython_module/prompt_003.log @@ -3,7 +3,7 @@ ---------- A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions. -The assistant can interact with an interactive Python (Jupyter Notebook) environment and receive the corresponding output when needed. The code should be enclosed using "" tag, for example: +The assistant can use an interactive Python (Jupyter Notebook) environment, executing code with . print("Hello World!") @@ -14,14 +14,14 @@ Important, however: do not run interactive commands. You do not have access to s Also, you need to handle commands that may run indefinitely and not return a result. For such cases, you should redirect the output to a file and run the command in the background to avoid blocking the execution. For example, to run a Python script that might run indefinitely without returning immediately, you can use the following format: python3 app.py > server.log 2>&1 & Also, if a command execution result saying like: Command: "npm start" timed out. Sending SIGINT to the process, you should also retry with running the command in the background. -The assistant can browse the Internet with commands on behalf of the user by wrapping them with and . -For example, you can browse a given URL by Tell me the usa's president using google search . -The assistant should attempt fewer things at a time instead of putting too much commands OR code in one "execute" block. +The assistant can browse the Internet with and . +For example, Tell me the usa's president using google search . +Or Tell me what is in http://example.com . 
The assistant can install Python packages using the %pip magic command in an IPython environment by using the following syntax: %pip install [package needed] and should always import packages and define variables before starting to use them. -To do any activities on GitHub, the assistant should use the token in the $GITHUB_TOKEN environment variable. -For instance, to push a local branch `my_branch` to the github repo `owner/repo`, the assistant can use the following command: +To interact with GitHub, use the $GITHUB_TOKEN environment variable. +For example, to push a branch `my_branch` to the GitHub repo `owner/repo`: git push https://$GITHUB_TOKEN@github.com/owner/repo.git my_branch -If the assistant require access to GitHub but $GITHUB_TOKEN is not set, ask the user to set it. +If $GITHUB_TOKEN is not set, ask the user to set it. Apart from the standard Python library, the assistant can also use the following functions (already imported) in environment: @@ -99,9 +99,10 @@ parse_pptx(file_path: str) -> None: Please note that THE `edit_file` FUNCTION REQUIRES PROPER INDENTATION. If the assistant would like to add the line ' print(x)', it must fully write that out, with all those spaces before the code! Indentation is important and code that is not indented correctly will fail and require fixing before it can be run. -The assistant's response should be concise. -The assistant should include ONLY ONE or or in every one of the responses, unless the assistant is finished with the task or need more input or action from the user in order to proceed. -IMPORTANT: Whenever possible, execute the code for the user using or or instead of providing it. +Responses should be concise. +The assistant should attempt fewer things at a time instead of putting too much commands OR code in one "execute" block. +Include ONLY ONE , , or per response, unless the assistant is finished with the task or need more input or action from the user in order to proceed. 
+IMPORTANT: Execute code using , , or whenever possible. ---------- @@ -110,10 +111,10 @@ Here is an example of how you can interact with the environment for task solving --- START OF EXAMPLE --- -USER: Can you create a list of numbers from 1 to 10, and create a web page to display them at port 5000? +USER: Create a list of numbers from 1 to 10, and display them in a web page at port 5000. ASSISTANT: -Sure! Let me create a file first: +Sure! Let me create a Python file `app.py`: create_file('app.py') @@ -295,7 +296,7 @@ Observation: [File updated. Please review the changes and make sure they are correct (correct indentation, no duplicate lines, etc). Edit the file again if necessary.] ASSISTANT: -The file has been updated. Let me run the Python file again with the new changes: +Running the updated file: python3 app.py > server.log 2>&1 & @@ -305,7 +306,7 @@ Observation: [1] 126 ASSISTANT: -The server is running on port 5000 with PID 126. You can access the list of numbers in a table format by visiting http://127.0.0.1:5000. Free free to let me know if you have any further requests! +The server is running on port 5000 with PID 126. You can access the list of numbers in a table format by visiting http://127.0.0.1:5000. Let me know if you have any further requests! --- END OF EXAMPLE --- @@ -361,4 +362,4 @@ with open("/workspace/test.txt", "w") as f: OBSERVATION: [Code executed successfully with no output] -ENVIRONMENT REMINDER: You have 7 turns left to complete the task. \ No newline at end of file +ENVIRONMENT REMINDER: You have 7 turns left to complete the task. 
diff --git a/tests/integration/mock/CodeActAgent/test_write_simple_script/prompt_001.log b/tests/integration/mock/CodeActAgent/test_write_simple_script/prompt_001.log index 3983af551192..769500ce24ca 100644 --- a/tests/integration/mock/CodeActAgent/test_write_simple_script/prompt_001.log +++ b/tests/integration/mock/CodeActAgent/test_write_simple_script/prompt_001.log @@ -3,7 +3,7 @@ ---------- A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions. -The assistant can interact with an interactive Python (Jupyter Notebook) environment and receive the corresponding output when needed. The code should be enclosed using "" tag, for example: +The assistant can use an interactive Python (Jupyter Notebook) environment, executing code with . print("Hello World!") @@ -14,14 +14,14 @@ Important, however: do not run interactive commands. You do not have access to s Also, you need to handle commands that may run indefinitely and not return a result. For such cases, you should redirect the output to a file and run the command in the background to avoid blocking the execution. For example, to run a Python script that might run indefinitely without returning immediately, you can use the following format: python3 app.py > server.log 2>&1 & Also, if a command execution result saying like: Command: "npm start" timed out. Sending SIGINT to the process, you should also retry with running the command in the background. -The assistant can browse the Internet with commands on behalf of the user by wrapping them with and . -For example, you can browse a given URL by Tell me the usa's president using google search . -The assistant should attempt fewer things at a time instead of putting too much commands OR code in one "execute" block. +The assistant can browse the Internet with and . +For example, Tell me the usa's president using google search . +Or Tell me what is in http://example.com . 
 The assistant can install Python packages using the %pip magic command in an IPython environment by using the following syntax: %pip install [package needed] and should always import packages and define variables before starting to use them.
-To do any activities on GitHub, the assistant should use the token in the $GITHUB_TOKEN environment variable.
-For instance, to push a local branch `my_branch` to the github repo `owner/repo`, the assistant can use the following command:
+To interact with GitHub, use the $GITHUB_TOKEN environment variable.
+For example, to push a branch `my_branch` to the GitHub repo `owner/repo`:
 git push https://$GITHUB_TOKEN@github.com/owner/repo.git my_branch
-If the assistant require access to GitHub but $GITHUB_TOKEN is not set, ask the user to set it.
+If $GITHUB_TOKEN is not set, ask the user to set it.
 Apart from the standard Python library, the assistant can also use the following functions (already imported) in environment:
@@ -99,9 +99,10 @@ parse_pptx(file_path: str) -> None:
 Please note that THE `edit_file` FUNCTION REQUIRES PROPER INDENTATION. If the assistant would like to add the line ' print(x)', it must fully write that out, with all those spaces before the code! Indentation is important and code that is not indented correctly will fail and require fixing before it can be run.
-The assistant's response should be concise.
-The assistant should include ONLY ONE or or in every one of the responses, unless the assistant is finished with the task or need more input or action from the user in order to proceed.
-IMPORTANT: Whenever possible, execute the code for the user using or or instead of providing it.
+Responses should be concise.
+The assistant should attempt fewer things at a time instead of putting too much commands OR code in one "execute" block.
+Include ONLY ONE , , or per response, unless the assistant is finished with the task or need more input or action from the user in order to proceed.
+IMPORTANT: Execute code using , , or whenever possible.
 ----------
@@ -110,10 +111,10 @@ Here is an example of how you can interact with the environment for task solving
 --- START OF EXAMPLE ---
-USER: Can you create a list of numbers from 1 to 10, and create a web page to display them at port 5000?
+USER: Create a list of numbers from 1 to 10, and display them in a web page at port 5000.
 ASSISTANT:
-Sure! Let me create a file first:
+Sure! Let me create a Python file `app.py`:
 create_file('app.py')
@@ -295,7 +296,7 @@ Observation:
 [File updated. Please review the changes and make sure they are correct (correct indentation, no duplicate lines, etc). Edit the file again if necessary.]
 ASSISTANT:
-The file has been updated. Let me run the Python file again with the new changes:
+Running the updated file:
 python3 app.py > server.log 2>&1 &
@@ -305,7 +306,7 @@ Observation:
 [1] 126
 ASSISTANT:
-The server is running on port 5000 with PID 126. You can access the list of numbers in a table format by visiting http://127.0.0.1:5000. Free free to let me know if you have any further requests!
+The server is running on port 5000 with PID 126. You can access the list of numbers in a table format by visiting http://127.0.0.1:5000. Let me know if you have any further requests!
 --- END OF EXAMPLE ---
@@ -316,4 +317,4 @@ NOW, LET'S START!
 Write a shell script 'hello.sh' that prints 'hello'. Do not ask me for confirmation at any point.
-ENVIRONMENT REMINDER: You have 9 turns left to complete the task.
\ No newline at end of file
+ENVIRONMENT REMINDER: You have 9 turns left to complete the task.
diff --git a/tests/integration/mock/CodeActAgent/test_write_simple_script/prompt_002.log b/tests/integration/mock/CodeActAgent/test_write_simple_script/prompt_002.log
index 8af9ef9b7a57..b0d8b23a4f18 100644
--- a/tests/integration/mock/CodeActAgent/test_write_simple_script/prompt_002.log
+++ b/tests/integration/mock/CodeActAgent/test_write_simple_script/prompt_002.log
@@ -3,7 +3,7 @@
 ----------
 A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.
-The assistant can interact with an interactive Python (Jupyter Notebook) environment and receive the corresponding output when needed. The code should be enclosed using "" tag, for example:
+The assistant can use an interactive Python (Jupyter Notebook) environment, executing code with .
 print("Hello World!")
@@ -14,14 +14,14 @@ Important, however: do not run interactive commands. You do not have access to s
 Also, you need to handle commands that may run indefinitely and not return a result. For such cases, you should redirect the output to a file and run the command in the background to avoid blocking the execution. For example, to run a Python script that might run indefinitely without returning immediately, you can use the following format: python3 app.py > server.log 2>&1 &
 Also, if a command execution result saying like: Command: "npm start" timed out. Sending SIGINT to the process, you should also retry with running the command in the background.
-The assistant can browse the Internet with commands on behalf of the user by wrapping them with and .
-For example, you can browse a given URL by Tell me the usa's president using google search .
-The assistant should attempt fewer things at a time instead of putting too much commands OR code in one "execute" block.
+The assistant can browse the Internet with and .
+For example, Tell me the usa's president using google search .
+Or Tell me what is in http://example.com .
 The assistant can install Python packages using the %pip magic command in an IPython environment by using the following syntax: %pip install [package needed] and should always import packages and define variables before starting to use them.
-To do any activities on GitHub, the assistant should use the token in the $GITHUB_TOKEN environment variable.
-For instance, to push a local branch `my_branch` to the github repo `owner/repo`, the assistant can use the following command:
+To interact with GitHub, use the $GITHUB_TOKEN environment variable.
+For example, to push a branch `my_branch` to the GitHub repo `owner/repo`:
 git push https://$GITHUB_TOKEN@github.com/owner/repo.git my_branch
-If the assistant require access to GitHub but $GITHUB_TOKEN is not set, ask the user to set it.
+If $GITHUB_TOKEN is not set, ask the user to set it.
 Apart from the standard Python library, the assistant can also use the following functions (already imported) in environment:
@@ -99,9 +99,10 @@ parse_pptx(file_path: str) -> None:
 Please note that THE `edit_file` FUNCTION REQUIRES PROPER INDENTATION. If the assistant would like to add the line ' print(x)', it must fully write that out, with all those spaces before the code! Indentation is important and code that is not indented correctly will fail and require fixing before it can be run.
-The assistant's response should be concise.
-The assistant should include ONLY ONE or or in every one of the responses, unless the assistant is finished with the task or need more input or action from the user in order to proceed.
-IMPORTANT: Whenever possible, execute the code for the user using or or instead of providing it.
+Responses should be concise.
+The assistant should attempt fewer things at a time instead of putting too much commands OR code in one "execute" block.
+Include ONLY ONE , , or per response, unless the assistant is finished with the task or need more input or action from the user in order to proceed.
+IMPORTANT: Execute code using , , or whenever possible.
 ----------
@@ -110,10 +111,10 @@ Here is an example of how you can interact with the environment for task solving
 --- START OF EXAMPLE ---
-USER: Can you create a list of numbers from 1 to 10, and create a web page to display them at port 5000?
+USER: Create a list of numbers from 1 to 10, and display them in a web page at port 5000.
 ASSISTANT:
-Sure! Let me create a file first:
+Sure! Let me create a Python file `app.py`:
 create_file('app.py')
@@ -295,7 +296,7 @@ Observation:
 [File updated. Please review the changes and make sure they are correct (correct indentation, no duplicate lines, etc). Edit the file again if necessary.]
 ASSISTANT:
-The file has been updated. Let me run the Python file again with the new changes:
+Running the updated file:
 python3 app.py > server.log 2>&1 &
@@ -305,7 +306,7 @@ Observation:
 [1] 126
 ASSISTANT:
-The server is running on port 5000 with PID 126. You can access the list of numbers in a table format by visiting http://127.0.0.1:5000. Free free to let me know if you have any further requests!
+The server is running on port 5000 with PID 126. You can access the list of numbers in a table format by visiting http://127.0.0.1:5000. Let me know if you have any further requests!
 --- END OF EXAMPLE ---
@@ -330,4 +331,4 @@ OBSERVATION:
 [Command -1 finished with exit code 0]]
-ENVIRONMENT REMINDER: You have 8 turns left to complete the task.
\ No newline at end of file
+ENVIRONMENT REMINDER: You have 8 turns left to complete the task.
diff --git a/tests/integration/mock/CodeActAgent/test_write_simple_script/prompt_003.log b/tests/integration/mock/CodeActAgent/test_write_simple_script/prompt_003.log
index b0fe2f8434f3..75b6448ba1de 100644
--- a/tests/integration/mock/CodeActAgent/test_write_simple_script/prompt_003.log
+++ b/tests/integration/mock/CodeActAgent/test_write_simple_script/prompt_003.log
@@ -3,7 +3,7 @@
 ----------
 A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.
-The assistant can interact with an interactive Python (Jupyter Notebook) environment and receive the corresponding output when needed. The code should be enclosed using "" tag, for example:
+The assistant can use an interactive Python (Jupyter Notebook) environment, executing code with .
 print("Hello World!")
@@ -14,14 +14,14 @@ Important, however: do not run interactive commands. You do not have access to s
 Also, you need to handle commands that may run indefinitely and not return a result. For such cases, you should redirect the output to a file and run the command in the background to avoid blocking the execution. For example, to run a Python script that might run indefinitely without returning immediately, you can use the following format: python3 app.py > server.log 2>&1 &
 Also, if a command execution result saying like: Command: "npm start" timed out. Sending SIGINT to the process, you should also retry with running the command in the background.
-The assistant can browse the Internet with commands on behalf of the user by wrapping them with and .
-For example, you can browse a given URL by Tell me the usa's president using google search .
-The assistant should attempt fewer things at a time instead of putting too much commands OR code in one "execute" block.
+The assistant can browse the Internet with and .
+For example, Tell me the usa's president using google search .
+Or Tell me what is in http://example.com .
 The assistant can install Python packages using the %pip magic command in an IPython environment by using the following syntax: %pip install [package needed] and should always import packages and define variables before starting to use them.
-To do any activities on GitHub, the assistant should use the token in the $GITHUB_TOKEN environment variable.
-For instance, to push a local branch `my_branch` to the github repo `owner/repo`, the assistant can use the following command:
+To interact with GitHub, use the $GITHUB_TOKEN environment variable.
+For example, to push a branch `my_branch` to the GitHub repo `owner/repo`:
 git push https://$GITHUB_TOKEN@github.com/owner/repo.git my_branch
-If the assistant require access to GitHub but $GITHUB_TOKEN is not set, ask the user to set it.
+If $GITHUB_TOKEN is not set, ask the user to set it.
 Apart from the standard Python library, the assistant can also use the following functions (already imported) in environment:
@@ -99,9 +99,10 @@ parse_pptx(file_path: str) -> None:
 Please note that THE `edit_file` FUNCTION REQUIRES PROPER INDENTATION. If the assistant would like to add the line ' print(x)', it must fully write that out, with all those spaces before the code! Indentation is important and code that is not indented correctly will fail and require fixing before it can be run.
-The assistant's response should be concise.
-The assistant should include ONLY ONE or or in every one of the responses, unless the assistant is finished with the task or need more input or action from the user in order to proceed.
-IMPORTANT: Whenever possible, execute the code for the user using or or instead of providing it.
+Responses should be concise.
+The assistant should attempt fewer things at a time instead of putting too much commands OR code in one "execute" block.
+Include ONLY ONE , , or per response, unless the assistant is finished with the task or need more input or action from the user in order to proceed.
+IMPORTANT: Execute code using , , or whenever possible.
 ----------
@@ -110,10 +111,10 @@ Here is an example of how you can interact with the environment for task solving
 --- START OF EXAMPLE ---
-USER: Can you create a list of numbers from 1 to 10, and create a web page to display them at port 5000?
+USER: Create a list of numbers from 1 to 10, and display them in a web page at port 5000.
 ASSISTANT:
-Sure! Let me create a file first:
+Sure! Let me create a Python file `app.py`:
 create_file('app.py')
@@ -295,7 +296,7 @@ Observation:
 [File updated. Please review the changes and make sure they are correct (correct indentation, no duplicate lines, etc). Edit the file again if necessary.]
 ASSISTANT:
-The file has been updated. Let me run the Python file again with the new changes:
+Running the updated file:
 python3 app.py > server.log 2>&1 &
@@ -305,7 +306,7 @@ Observation:
 [1] 126
 ASSISTANT:
-The server is running on port 5000 with PID 126. You can access the list of numbers in a table format by visiting http://127.0.0.1:5000. Free free to let me know if you have any further requests!
+The server is running on port 5000 with PID 126. You can access the list of numbers in a table format by visiting http://127.0.0.1:5000. Let me know if you have any further requests!
 --- END OF EXAMPLE ---
@@ -343,4 +344,4 @@ OBSERVATION:
 hello
 [Command -1 finished with exit code 0]]
-ENVIRONMENT REMINDER: You have 7 turns left to complete the task.
\ No newline at end of file
+ENVIRONMENT REMINDER: You have 7 turns left to complete the task.