update element hashing logic - subscribe only. otherwise, any element update would break element hashing #1784

wintonzheng · 2025-02-18T08:37:55Z

Important

Update clean_element_before_hashing() in scraper.py to select specific keys for consistent element hashing in subscription scenarios.

Behavior:
- Update clean_element_before_hashing() in scraper.py to select specific keys (frame, tagName, attributes, beforePseudoText, text, afterPseudoText, children) for hashing instead of removing id and rect.
- Ensures consistent element hashing when elements are updated, particularly for subscription scenarios.

^{This description was created by}^{for 42c6afc. It will automatically update as commits are pushed.}

… update would break element hashing

ellipsis-dev

👍 Looks good to me! Reviewed everything up to 42c6afc in 1 minute and 23 seconds

More details

Looked at 20 lines of code in 1 files
Skipped 0 files when reviewing.
Skipped posting 3 drafted comments based on config settings.

1. skyvern/webeye/scraper/scraper.py:159

Draft comment:
Consider using a dictionary comprehension to simplify selective deep-copy (e.g., {k: copy.deepcopy(element[k]) for k in keys if k in element}). Also, add a comment explaining why these keys are chosen.
Reason this comment was not posted:
Confidence changes required: 33% <= threshold 50%
None

2. skyvern/webeye/scraper/scraper.py:159

Draft comment:
Consider using a dict comprehension for building element_copy for clarity. For example: element_copy = {k: copy.deepcopy(element[k]) for k in allowed_keys if k in element}.
Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 20% vs. threshold = 50%
The suggested dict comprehension would make the code more concise and arguably more "Pythonic". However, the current for loop version is also quite clear and straightforward. The dict comprehension would do the exact same thing in a slightly more compact way. This is a relatively minor stylistic suggestion rather than a functional improvement.
The current code is already quite readable and maintainable. The dict comprehension might actually be slightly harder to read for developers less familiar with Python's more advanced features.
While dict comprehensions are a good Python feature, suggesting their use here is more of a personal style preference rather than a clear improvement.
This comment should be deleted as it suggests a minor stylistic change that doesn't meaningfully improve the code.

3. skyvern/webeye/scraper/scraper.py:159

Draft comment:
Ensure the whitelist of keys for hashing (frame, tagName, attributes, beforePseudoText, text, afterPseudoText, children) is complete and well documented, to avoid unintentional omissions.
Reason this comment was not posted:
Comment did not seem useful. Confidence is useful = 0% <= threshold 50%
This comment is asking the PR author to ensure that the whitelist of keys is complete and well documented. This falls under the category of asking the author to double-check things, which is against the rules. Therefore, this comment should be removed.

Workflow ID: wflow_XTONjHfN0hv7GKiW

You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

LawyZheng · 2025-02-18T16:34:22Z

skyvern/webeye/scraper/scraper.py

-    element_copy.pop("rect", None)
+    element_copy = {}
+    # pick up these element keys: frame, tagName, attributes, beforePseudoText, text, afterPseudoText, children
+    for key in ["frame", "tagName", "attributes", "beforePseudoText", "text", "afterPseudoText", "children"]:


why should we keep frame attribute?

i'll remove.

update element hashing logic - subscribe only. otherwise, any element…

42c6afc

… update would break element hashing

ellipsis-dev bot reviewed Feb 18, 2025

View reviewed changes

LawyZheng reviewed Feb 18, 2025

View reviewed changes

Merge branch 'main' into shu/update_hashing_algo

3fbf1f2

suchintan approved these changes Mar 2, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

update element hashing logic - subscribe only. otherwise, any element update would break element hashing #1784

update element hashing logic - subscribe only. otherwise, any element update would break element hashing #1784

wintonzheng commented Feb 18, 2025 •

edited by ellipsis-dev bot

Loading

ellipsis-dev bot left a comment

LawyZheng Feb 18, 2025

wintonzheng Feb 20, 2025

update element hashing logic - subscribe only. otherwise, any element update would break element hashing #1784

Are you sure you want to change the base?

update element hashing logic - subscribe only. otherwise, any element update would break element hashing #1784

Conversation

wintonzheng commented Feb 18, 2025 • edited by ellipsis-dev bot Loading

ellipsis-dev bot left a comment

Choose a reason for hiding this comment

LawyZheng Feb 18, 2025

Choose a reason for hiding this comment

wintonzheng Feb 20, 2025

Choose a reason for hiding this comment

wintonzheng commented Feb 18, 2025 •

edited by ellipsis-dev bot

Loading