Comments welcome #2140

brucemiller · 2023-07-03T17:04:20Z

This PR deals more consistently with comments, whether as Tokens, digested objects, or XML nodes, appearing in places we previously didn't expect them.

The function IsEmpty tests whether a Tokens or List is empty, ignoring comments, and is generally preferred over scalar($x->unlist); element_next($node) and element_prev($node) are usually preferred over $node->nextSibling and $node->previousSibling. Also, avoid splitting XML text nodes with comments by pushing the comments to the front.

Also, remove an obsolete form of DefMathLigature.

dginev · 2023-07-03T20:42:37Z

lib/LaTeXML/Common/XML.pm

+  my $prev;
+  while (($prev = $node->previousSibling) && ($prev->nodeType != XML_ELEMENT_NODE)) {
+    $node = $prev; }
+  return $prev; }


Naming question: Given nextSibling and previousSibling have the naming scheme with the modifier in front and the kind of object second, I'd be tempted to suggest next_element and previous_element (which are in a way abbreviated forms of get_next_element and get_previous_element).

That's roughly how I've internalized the already existing element_nodes which seems to align with childNodes in the libxml wrapper.

I guess at the time I was writing, I was emphasizing "element".
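
For readers following the thread, the sibling walk in the diff above can be tried out without libxml. The following is only a sketch: the hash-based nodes, the make_siblings helper, and the names element_next_sketch and element_prev_sketch are hypothetical stand-ins invented for illustration, not LaTeXML's actual code. It mirrors the loop from the diff, which steps over text and comment siblings until it reaches an element or runs out of siblings.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for libxml node types (not imported from XML::LibXML).
use constant { ELEMENT => 1, TEXT => 3, COMMENT => 8 };

# Build a doubly linked run of sibling nodes from [type, label] pairs.
sub make_siblings {
  my @nodes = map { +{ type => $_->[0], label => $_->[1] } } @_;
  for my $i (0 .. $#nodes) {
    $nodes[$i]{prev} = $nodes[$i - 1] if $i > 0;
    $nodes[$i]{next} = $nodes[$i + 1] if $i < $#nodes;
  }
  return @nodes; }

# Same shape as the loop in the diff: skip non-element siblings going backward.
sub element_prev_sketch {
  my ($node) = @_;
  my $prev;
  while (($prev = $node->{prev}) && ($prev->{type} != ELEMENT)) {
    $node = $prev; }
  return $prev; }

# The forward counterpart, assumed to be symmetric.
sub element_next_sketch {
  my ($node) = @_;
  my $next;
  while (($next = $node->{next}) && ($next->{type} != ELEMENT)) {
    $node = $next; }
  return $next; }

# Two elements separated by a comment and a whitespace text node.
my ($first, $comment, $text, $last) = make_siblings(
  [ELEMENT, 'a'], [COMMENT, 'note'], [TEXT, ' '], [ELEMENT, 'b']);

print "after a:  ", element_next_sketch($first)->{label}, "\n";
print "before b: ", element_prev_sketch($last)->{label},  "\n";
print "after b:  ", (defined element_next_sketch($last) ? 'element' : 'none'), "\n";
```

A plain nextSibling-style lookup from the first element would land on the comment, which is the situation the new helpers are meant to avoid.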

dginev · 2023-07-03T20:50:24Z

lib/LaTeXML/Package.pm

+      return 0 unless IsEmpty($thing->unlist); }
+    elsif ($ref eq 'LaTeXML::Core::Token') {
+      my $cc = $$thing[1];
+      return 0 if ($cc == CC_LETTER) || ($cc == CC_OTHER) || ($cc == CC_ACTIVE) || ($cc == CC_CS); }


Hm, this seems incomplete? T_BEGIN, T_END, T_MATH, T_ALIGN, and T_PARAM aren't empty either. Maybe it's somewhat briefer to invert the condition:

return 0 unless ($cc == CC_COMMENT) || ($cc == CC_MARKER) || ($cc == CC_IGNORE);

though if this is meant to be used for a single Token, I'd also wonder if we shouldn't treat all tokens that have empty text as empty as well.

return 0 unless ($cc == CC_COMMENT) || ($cc == CC_MARKER) || ($cc == CC_IGNORE) || (!length($$thing[0]));

Edit: Or indeed use the regex leveraged a little lower for non-space: $$thing[0] !~ /^\s*$/

I think you have the logic flipped; this line should short-circuit immediately returning "false" if we encounter a token that is not empty.
OTOH, your point is good: should we consider T_LETTER("") to be empty?
OTOOH, I expect this issue may have to be revisited to distinguish between truly empty, and just spacey things? Hopefully I can defer that concern?
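
To make the two proposals in this thread concrete, here is a small sketch that compares them on a few token lists. Everything in it is an assumption for illustration: tokens are modeled as [string, catcode] pairs (matching the $$thing[0] and $$thing[1] accesses in the diff), the catcode numbers follow TeX's standard values with placeholder values for CC_CS and CC_MARKER, and the function names are hypothetical. It covers only the lines quoted above, not the rest of IsEmpty.

```perl
use strict;
use warnings;

# Catcode stand-ins; the CC_CS and CC_MARKER values are placeholders.
use constant {
  CC_BEGIN   => 1,
  CC_IGNORE  => 9,
  CC_SPACE   => 10,
  CC_LETTER  => 11,
  CC_OTHER   => 12,
  CC_ACTIVE  => 13,
  CC_COMMENT => 14,
  CC_CS      => 16,
  CC_MARKER  => 17,
};

# Variant from the diff: a token counts as content only for these catcodes.
sub nonempty_as_in_diff {
  my ($token) = @_;
  my $cc      = $$token[1];
  my $visible = ($cc == CC_LETTER) || ($cc == CC_OTHER) || ($cc == CC_ACTIVE) || ($cc == CC_CS);
  return $visible ? 1 : 0; }

# Variant from the review: everything counts as content except these catcodes.
sub nonempty_inverted {
  my ($token) = @_;
  my $cc        = $$token[1];
  my $ignorable = ($cc == CC_COMMENT) || ($cc == CC_MARKER) || ($cc == CC_IGNORE);
  return $ignorable ? 0 : 1; }

# Short-circuit: report "not empty" at the first token that counts as content.
sub is_empty_sketch {
  my ($nonempty, @tokens) = @_;
  for my $token (@tokens) {
    return 0 if $nonempty->($token); }
  return 1; }

my @comment_only  = (['%note', CC_COMMENT]);
my @comment_brace = (['%note', CC_COMMENT], ['{', CC_BEGIN]);
my @space_only    = ([' ', CC_SPACE]);
my @with_letter   = (['%note', CC_COMMENT], ['x', CC_LETTER]);

for my $case (['comment only', \@comment_only], ['comment + brace', \@comment_brace],
  ['space only', \@space_only], ['comment + letter', \@with_letter]) {
  my ($label, $tokens) = @$case;
  printf "%-17s diff=%d inverted=%d\n", $label,
    is_empty_sketch(\&nonempty_as_in_diff, @$tokens),
    is_empty_sketch(\&nonempty_inverted,   @$tokens); }
```

Under these assumptions the two variants agree on comment-only and letter-containing lists, and differ on a begin-group token and on a lone space, which is where the "truly empty" versus "just spacey" question comes in.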

dginev

Read through the code; the IsEmpty consistency looks nice to have. Just a couple of minor comments.

brucemiller · 2023-07-04T01:00:44Z

You raise good questions, but with the refactors in this PR, it should be easy to find & revisit later if we want to.

brucemiller added 4 commits July 3, 2023 12:55

Generalize and promote IsEmpty to test tokens,lists,xml; Use it when appropriate instead of scalar(x->unlist)

fcf0b3d

Explicitly read pending comments from Gullet during digestion

3bca173

New element_next,element_prev to get only element nodes; use it; avoid splitting xml text nodes with comments; remove obsolete form of DefLigature

8ca7c13

really remove obsolete form of DefMathLigature

e0b8fed

brucemiller requested a review from dginev July 3, 2023 17:04

dginev reviewed Jul 3, 2023

dginev approved these changes Jul 3, 2023

brucemiller merged commit cb0194f into master Jul 4, 2023
26 checks passed

brucemiller deleted the comments-welcome branch July 4, 2023 01:00
