-
Notifications
You must be signed in to change notification settings - Fork 823
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add complete action verification #845
Conversation
<!-- ELLIPSIS_HIDDEN --> | 🚀 | This description was created by [Ellipsis](https://www.ellipsis.dev) for commit fb1090a272988bbbff21cd2d3140ac18f4e6b634 | |--------|--------| feat: add LLM-based action completion verification ### Summary: Adds LLM-based verification for action completion with new templates, logging, and exception handling updates. **Key points**: - Adds LLM-based verification for action completion in `action_complete.py` and `agent.py`. - Uses `verify_complete_action_via_llm_call()` to check if user goal is achieved. - Updates `agent_step()` in `agent.py` to include LLM verification after actions. - Introduces `verify-complete-action.j2` and `check-user-goal.j2` templates for LLM prompts. - Moves `IllegitComplete` exception from `cloud/exceptions.py` to `skyvern/exceptions.py`. - Logs verification responses and errors in `agent.py` and `action_complete.py`. ---- Generated with ❤️ by [ellipsis.dev](https://www.ellipsis.dev) <!-- ELLIPSIS_HIDDEN -->
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Looks good to me! Reviewed everything up to b7e6c8a in 22 seconds
More details
- Looked at
158
lines of code in3
files - Skipped
0
files when reviewing. - Skipped posting
3
drafted comments based on config settings.
1. skyvern/exceptions.py:507
- Draft comment:
Consider renamingIllegitComplete
to a more descriptive name that clearly indicates the nature of the exception. Additionally, provide a more informative message in the exception to help with debugging. - Reason this comment was not posted:
Confidence changes required:50%
The PR introduces a new exception classIllegitComplete
inskyvern/exceptions.py
. However, the nameIllegitComplete
is not very descriptive and could be improved for better clarity. Additionally, the message in the exception could be more informative.
2. skyvern/forge/agent.py:865
- Draft comment:
Ensure thatverification_response
is a valid JSON object before accessing its keys to prevent potential runtime errors. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable:
The code already includes a check for the presence of specific keys inverification_response
and logs an error if they are missing. This suggests that the potential issue raised by the comment is already addressed. The comment might be redundant since the code handles the situation where the keys are not present.
I might be overlooking the fact that the comment could be suggesting a more proactive validation step before accessing any keys, rather than just handling missing keys after the fact.
The existing error handling for missing keys inverification_response
seems sufficient to prevent runtime errors, making the comment redundant.
The comment is redundant because the code already handles the case where the expected keys are missing inverification_response
. It should be removed.
3. skyvern/forge/prompts/skyvern/check-user-goal.j2:4
- Draft comment:
Use triple backticks for JSON code block formatting instead of single backticks for proper syntax highlighting. - Reason this comment was not posted:
Confidence changes required:30%
The PR introduces a new templatecheck-user-goal.j2
. However, the template uses backticks for code block formatting, which is incorrect for JSON. It should use triple backticks for proper formatting.
Workflow ID: wflow_MwNQyGKCswdvuHCZ
You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Looks good to me! Incremental review on b7e6c8a in 31 seconds
More details
- Looked at
158
lines of code in3
files - Skipped
0
files when reviewing. - Skipped posting
3
drafted comments based on config settings.
1. skyvern/exceptions.py:507
- Draft comment:
TheIllegitComplete
exception is defined here but should be removed since it has been moved to a new location. Ensure it's not duplicated. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable:
The comment is speculative and not based on a definite issue. It also asks for confirmation, which is not allowed. The class was added in this diff, so the comment is not about a change that needs to be made.
I might be missing the context of whether this class was indeed moved from another file, but the diff only shows it being added here.
The diff should be the primary source of truth for this review, and it shows the class being added, not moved.
Remove the comment as it is speculative and not based on a definite issue in the diff.
2. skyvern/forge/agent.py:844
- Draft comment:
Thecheck_user_goal_success
method is defined twice in this file. Ensure it's defined only once to avoid redundancy and potential confusion. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable:
The comment is incorrect because the methodcheck_user_goal_success
is only defined once in the file. The comment does not point to any actual issue in the code, so it should be removed.
I might have missed a subtle duplication or a similar method with a different name that could have been mistaken for a duplicate. It's important to ensure that the method is truly not duplicated in any form.
The file content was thoroughly reviewed, and no duplicate or similar method was found. The comment is indeed incorrect.
The comment is incorrect as the methodcheck_user_goal_success
is only defined once in the file. The comment should be removed.
3. skyvern/forge/agent.py:776
- Draft comment:
Thecheck_user_goal_success
method is defined twice in this file. Ensure it's defined only once to avoid redundancy and potential confusion. - Reason this comment was not posted:
Marked as duplicate.
Workflow ID: wflow_lpPXNrS2Jqk4z8yA
You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
feat: add LLM-based action completion verification
Summary:
Adds LLM-based verification for action completion with new templates, logging, and exception handling updates.
Key points:
action_complete.py
andagent.py
.verify_complete_action_via_llm_call()
to check if user goal is achieved.agent_step()
inagent.py
to include LLM verification after actions.verify-complete-action.j2
andcheck-user-goal.j2
templates for LLM prompts.IllegitComplete
exception fromcloud/exceptions.py
toskyvern/exceptions.py
.agent.py
andaction_complete.py
.Generated with ❤️ by ellipsis.dev