HTML API: Add method to report depth of currently-matched node.
The HTML Processor maintains a stack of open elements: every element, every
`#text` node, every HTML comment, and every other node is pushed onto it and
popped from it while traversing the document. The "depth" of a node is the size
of that stack at the point where the node appears. Unfortunately this
information isn't exposed to calling code, which has led different projects to
attempt to calculate the value externally. Doing so isn't always trivial, but
the HTML Processor can make it trivial by exposing its internal knowledge
through a new method.

In this patch the `get_current_depth()` method returns just that. Since the
processor always exists within a context, the depth includes nesting from the
always-present html element and also the body, since currently the HTML
Processor only supports parsing in the IN BODY context.

This means that the depth reported for the `DIV` in `<div>` is 3, not 1, because
its breadcrumbs path is `HTML > BODY > DIV`.
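
The counting rule above can be illustrated with a small, self-contained sketch.
It is a toy model only: `Toy_Open_Elements` and its methods are hypothetical
names invented for this illustration and are not part of the HTML API. It
assumes nothing beyond what is stated above, namely that the stack already
holds `HTML > BODY` and that the depth is the size of the stack.

```php
<?php
// Toy model only: a hypothetical stand-in for the processor's stack of open
// elements, not the WordPress implementation.
final class Toy_Open_Elements {
	/** @var string[] Node names from the root down to the current node. */
	private $stack = array( 'HTML', 'BODY' ); // Fragment parsing starts inside BODY.

	/** Opens a node, making the stack one level deeper. */
	public function push( string $node_name ): void {
		$this->stack[] = $node_name;
	}

	/** Closes the current node, but never the HTML > BODY context itself. */
	public function pop(): void {
		if ( count( $this->stack ) > 2 ) {
			array_pop( $this->stack );
		}
	}

	/** The depth is simply the number of entries on the stack. */
	public function depth(): int {
		return count( $this->stack );
	}

	/** Renders the stack in the same form as the breadcrumbs path above. */
	public function breadcrumbs(): string {
		return implode( ' > ', $this->stack );
	}
}

$open = new Toy_Open_Elements();
$open->push( 'DIV' );
echo $open->breadcrumbs(), ' has depth ', $open->depth(), PHP_EOL;

$open->push( '#text' );
echo $open->breadcrumbs(), ' has depth ', $open->depth(), PHP_EOL;
```

In this model the `DIV` sits at depth 3 and a text node inside it at depth 4,
which matches the "Single element" and "Element then text" cases in the tests
added by this patch.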

Developed in #6589
Discussed in https://core.trac.wordpress.org/ticket/61255

Fixes #61255.
Props dmsnell, jonsurrell.


git-svn-id: https://develop.svn.wordpress.org/trunk@58191 602fd350-edb4-49c9-b593-d223f7449a82
dmsnell committed May 23, 2024
1 parent f95abe3 commit 5d52f19
Showing 2 changed files with 115 additions and 0 deletions.
29 changes: 29 additions & 0 deletions src/wp-includes/html-api/class-wp-html-processor.php
@@ -623,6 +623,35 @@ public function get_breadcrumbs() {
		return $breadcrumbs;
	}

	/**
	 * Returns the nesting depth of the current location in the document.
	 *
	 * Example:
	 *
	 *     $processor = WP_HTML_Processor::create_fragment( '<div><p></p></div>' );
	 *     // The processor starts in the BODY context, meaning it has depth from the start: HTML > BODY.
	 *     2 === $processor->get_current_depth();
	 *
	 *     // Opening the DIV element increases the depth.
	 *     $processor->next_token();
	 *     3 === $processor->get_current_depth();
	 *
	 *     // Opening the P element increases the depth.
	 *     $processor->next_token();
	 *     4 === $processor->get_current_depth();
	 *
	 *     // The P element is closed during `next_token()` so the depth is decreased to reflect that.
	 *     $processor->next_token();
	 *     3 === $processor->get_current_depth();
	 *
	 * @since 6.6.0
	 *
	 * @return int Nesting-depth of current location in the document.
	 */
	public function get_current_depth() {
		return $this->state->stack_of_open_elements->count();
	}

	/**
	 * Parses next element in the 'in body' insertion mode.
	 *
86 changes: 86 additions & 0 deletions tests/phpunit/tests/html-api/wpHtmlProcessor.php
@@ -334,4 +334,90 @@ public static function data_unsupported_special_in_body_tags() {
			'XMP' => array( 'XMP' ),
		);
	}

	/**
	 * Ensures that the HTML Processor properly reports the depth of a given element.
	 *
	 * @ticket 61255
	 *
	 * @dataProvider data_html_with_target_element_and_depth_in_body
	 *
	 * @param string $html_with_target_element HTML containing element with `target` class.
	 * @param int    $depth_at_element         Depth into document at target node.
	 */
	public function test_reports_proper_element_depth_in_body( $html_with_target_element, $depth_at_element ) {
		$processor = WP_HTML_Processor::create_fragment( $html_with_target_element );

		$this->assertTrue(
			$processor->next_tag( array( 'class_name' => 'target' ) ),
			'Failed to find target element: check test data provider.'
		);

		$this->assertSame(
			$depth_at_element,
			$processor->get_current_depth(),
			'HTML Processor reported the wrong depth at the matched element.'
		);
	}

	/**
	 * Data provider.
	 *
	 * @return array[].
	 */
	public static function data_html_with_target_element_and_depth_in_body() {
		return array(
			'Single element'                    => array( '<div class="target">', 3 ),
			'Basic layout and formatting stack' => array( '<div><span><p><b><em class="target">', 7 ),
			'Adjacent elements'                 => array( '<div><span></span><span class="target"></div>', 4 ),
		);
	}

	/**
	 * Ensures that the HTML Processor properly reports the depth of a given non-element.
	 *
	 * @ticket 61255
	 *
	 * @dataProvider data_html_with_target_element_and_depth_of_next_node_in_body
	 *
	 * @param string $html_with_target_element HTML containing element with `target` class.
	 * @param int    $depth_after_element      Depth into document immediately after target node.
	 */
	public function test_reports_proper_non_element_depth_in_body( $html_with_target_element, $depth_after_element ) {
		$processor = WP_HTML_Processor::create_fragment( $html_with_target_element );

		$this->assertTrue(
			$processor->next_tag( array( 'class_name' => 'target' ) ),
			'Failed to find target element: check test data provider.'
		);

		$this->assertTrue(
			$processor->next_token(),
			'Failed to find next node after target element: check tests data provider.'
		);

		$this->assertSame(
			$depth_after_element,
			$processor->get_current_depth(),
			'HTML Processor reported the wrong depth after the matched element.'
		);
	}

	/**
	 * Data provider.
	 *
	 * @return array[].
	 */
	public static function data_html_with_target_element_and_depth_of_next_node_in_body() {
		return array(
			'Element then text'                 => array( '<div class="target">One Deeper', 4 ),
			'Basic layout and formatting stack' => array( '<div><span><p><b><em class="target">Formatted', 8 ),
			'Basic layout with text'            => array( '<div>a<span>b<p>c<b>e<em class="target">e', 8 ),
			'Adjacent elements'                 => array( '<div><span></span><span class="target">Here</div>', 5 ),
			'Adjacent text'                     => array( '<p>Before<img class="target">After</p>', 4 ),
			'HTML comment'                      => array( '<img class="target"><!-- this is inside the BODY -->', 3 ),
			'HTML comment in DIV'               => array( '<div class="target"><!-- this is inside the BODY -->', 4 ),
			'Funky comment'                     => array( '<div><p>What <br class="target"><//wp:post-author></p></div>', 5 ),
		);
	}
}
