Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HTML API: Add method to report depth of currently-matched element. #6589

Closed
wants to merge 11 commits into from

Conversation

dmsnell
Copy link
Member

@dmsnell dmsnell commented May 21, 2024

Trac ticket: Core-61255.

The HTML Processor maintains a stack of open elements, where every element, every #text node, every HTML comment, and other node is pushed and popped while traversing the document. The "depth" of each of these nodes represents how deep that stack is where the node appears. Unfortunately this information isn't exposed to calling code, which has led different projects to attempt to calculate this value externally. This isn't always trivial, but the HTML Processor could make it so by exposing the internal knowledge in a new method.

In this patch the get_current_depth() method returns just that. Since the processor always exists within a context, the depth includes nesting from the always-present html element and also the body, since currently the HTML Processor only supports parsing in the IN BODY context.

This means that the depth reported for the DIV in <div> is 3, not 1, because its breadcrumbs path is HTML > BODY > DIV.

Copy link

Test using WordPress Playground

The changes in this pull request can previewed and tested using a WordPress Playground instance.

WordPress Playground is an experimental project that creates a full WordPress instance entirely within the browser.

Some things to be aware of

  • The Plugin and Theme Directories cannot be accessed within Playground.
  • All changes will be lost when closing a tab with a Playground instance.
  • All changes will be lost when refreshing the page.
  • A fresh instance is created each time the link below is clicked.
  • Every time this pull request is updated, a new ZIP file containing all changes is created. If changes are not reflected in the Playground instance,
    it's possible that the most recent build failed, or has not completed. Check the list of workflow runs to be sure.

For more details about these limitations and more, check out the Limitations page in the WordPress Playground documentation.

Test this pull request with WordPress Playground.

Copy link
Member

@sirreal sirreal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good.

Comment on lines 1266 to 1271
/**
* Return how deep the currently-matched element is in the HTML document.
*
* @since 6.6.0
*
* @return int The depth of the currently-matched element.
Copy link
Member

@sirreal sirreal May 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we be talking about "element" here? It doesn't need to be an element we're at… should this talk about "token" or more generally "position of the processor in the document"?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should say "node". These can be #text nodes here. Is "position" right? I think in places we've used "location in the document" but also in the HTML Processor I feel like we're free to talk more specifically about nodes and elements since that's what it operates on, vs. text tokens.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Node sounds good.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my own confusing is clearer now: I had un-pushed work, including the method itself. I've sent this out in d54d71f

apparently I had already chosen "of the current location" in my work, so I'm leaving it at that unless we prefer "node"

$processor->next_tag( array( 'class_name' => 'target' ) ),
while ( $processor->next_tag() ) {
if ( $processor->has_class( 'target' ) ) {
break;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why this change?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought the other form was broken, causing the test to fail. That's the only reason. I may be mistaken, feel free to revert.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oops, this is a known bug I forgot about. we claim to support class_name but don't

@dmsnell dmsnell force-pushed the html-api/add-get-depth branch from d54d71f to 39f9c9d Compare May 23, 2024 21:52
@dmsnell dmsnell marked this pull request as ready for review May 23, 2024 23:11
Copy link

The following accounts have interacted with this PR and/or linked issues. I will continue to update these lists as activity occurs. You can also manually ask me to refresh this list by adding the props-bot label.

Core Committers: Use this line as a base for the props when committing in SVN:

Props dmsnell, jonsurrell.

To understand the WordPress project's expectations around crediting contributors, please review the Contributor Attribution page in the Core Handbook.

@dmsnell dmsnell force-pushed the html-api/add-get-depth branch from d3be8db to 757d133 Compare May 23, 2024 23:11
pento pushed a commit that referenced this pull request May 23, 2024
The HTML Processor maintains a stack of open elements, where every element,
every `#text` node, every HTML comment, and other node is pushed and popped while
traversing the document. The "depth" of each of these nodes represents how deep
that stack is where the node appears. Unfortunately this information isn't
exposed to calling code, which has led different projects to attempt to
calculate this value externally. This isn't always trivial, but the HTML
Processor could make it so by exposing the internal knowledge in a new method.

In this patch the `get_current_depth()` method returns just that. Since the
processor always exists within a context, the depth includes nesting from the
always-present html element and also the body, since currently the HTML
Processor only supports parsing in the IN BODY context.

This means that the depth reported for the `DIV` in `<div>` is 3, not 1, because
its breadcrumbs path is `HTML > BODY > DIV`.

Developed in #6589
Discussed in https://core.trac.wordpress.org/ticket/61255

Fixes #61255.
Props dmsnell, jonsurrell.


git-svn-id: https://develop.svn.wordpress.org/trunk@58191 602fd350-edb4-49c9-b593-d223f7449a82
markjaquith pushed a commit to markjaquith/WordPress that referenced this pull request May 23, 2024
The HTML Processor maintains a stack of open elements, where every element,
every `#text` node, every HTML comment, and other node is pushed and popped while
traversing the document. The "depth" of each of these nodes represents how deep
that stack is where the node appears. Unfortunately this information isn't
exposed to calling code, which has led different projects to attempt to
calculate this value externally. This isn't always trivial, but the HTML
Processor could make it so by exposing the internal knowledge in a new method.

In this patch the `get_current_depth()` method returns just that. Since the
processor always exists within a context, the depth includes nesting from the
always-present html element and also the body, since currently the HTML
Processor only supports parsing in the IN BODY context.

This means that the depth reported for the `DIV` in `<div>` is 3, not 1, because
its breadcrumbs path is `HTML > BODY > DIV`.

Developed in WordPress/wordpress-develop#6589
Discussed in https://core.trac.wordpress.org/ticket/61255

Fixes #61255.
Props dmsnell, jonsurrell.

Built from https://develop.svn.wordpress.org/trunk@58191


git-svn-id: http://core.svn.wordpress.org/trunk@57654 1a063a9b-81f0-0310-95a4-ce76da25c4cd
@dmsnell
Copy link
Member Author

dmsnell commented May 23, 2024

Merged in [58191]
5d52f19

@dmsnell dmsnell closed this May 23, 2024
@dmsnell dmsnell deleted the html-api/add-get-depth branch May 23, 2024 23:41
github-actions bot pushed a commit to gilzow/wordpress-performance that referenced this pull request May 23, 2024
The HTML Processor maintains a stack of open elements, where every element,
every `#text` node, every HTML comment, and other node is pushed and popped while
traversing the document. The "depth" of each of these nodes represents how deep
that stack is where the node appears. Unfortunately this information isn't
exposed to calling code, which has led different projects to attempt to
calculate this value externally. This isn't always trivial, but the HTML
Processor could make it so by exposing the internal knowledge in a new method.

In this patch the `get_current_depth()` method returns just that. Since the
processor always exists within a context, the depth includes nesting from the
always-present html element and also the body, since currently the HTML
Processor only supports parsing in the IN BODY context.

This means that the depth reported for the `DIV` in `<div>` is 3, not 1, because
its breadcrumbs path is `HTML > BODY > DIV`.

Developed in WordPress/wordpress-develop#6589
Discussed in https://core.trac.wordpress.org/ticket/61255

Fixes #61255.
Props dmsnell, jonsurrell.

Built from https://develop.svn.wordpress.org/trunk@58191


git-svn-id: https://core.svn.wordpress.org/trunk@57654 1a063a9b-81f0-0310-95a4-ce76da25c4cd
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants