enh(cpp) Improve C++ class template headers #2728

klmr · 2020-10-03T17:30:00Z

This fixes #2716 and it fixes #2102¹. In order to parse class and function headers correctly, it is insufficient to just skip greedily (or, worse, non-greedily) over /<.*>/, since C++ template declarations as well as template instantiations can contain nested template parameter lists. As a further complication, they can even contain the operators <, >, << and >> (as part of non-type template parameters or their defaults (in declarations)).

This PR attempts to be a minimal change — a much more comprehensive refactor of the C and C++ language definitions is somewhat overdue (see #2173 and #2146). But this will take time, and in the meantime the improvement from this PR is probably “good enough”.

This PR is a WIP: there are some open issues that I’ve commented inside the changes.

¹ The failing code from the second issue isn’t added as a test case because hljs still fails to highlight it as a function declaration, but that’s unrelated to templates and rather because hljs currently doesn’t recognise operator overload definitions at all.

src/languages/c-like.js

klmr

Request for comment:

src/languages/c-like.js

klmr · 2020-10-03T17:36:41Z

test/markup/cpp/templates.txt

+template <int a, int b, bool c = (a > b), class T> void f1();
+template <int a, int b, bool c = (a >> b), class T> void f2();
+
+template <int a, int b, bool c = (a < b), class T> void g1();


This test is failing, and I don’t know why: I thought my rules would handle the nested parenthesised expression — and this works for the corresponding test case above, i.e. for (a > b).

I'd guess the < in a < b is starting a new TEMPLATE, messing everything up. I'm not sure I see anywhere in the code where you handle this possibility.

Ah, of course! I actually don’t know how to handle this: I thought about adding ) to the end of the TEMPLATE_USE, mode; this works, but then the mode started by ( won’t terminate, even when I also set excludeEnd: true.

Adding the following to the TEMPLATE_USE mode seems to work:

'on:begin': (matchData, response) => { const rest = matchData.input.slice(matchData.index + 1); let parens = 0; let brackets = 1; // start with `<` for (let i = 0; i < rest.length; i++) { const c = rest.charAt(i); switch (c) { case '(': parens++; break; case ')': parens--; break; case '<': brackets++; break; case '>': brackets--; break; } if (brackets == 0) return; if (parens == -1) { response.ignoreMatch(); return; } } },

However, this feels like overkill (potentially also inefficient?); it also only works in cases where the comparison expression is nested inside parentheses, i.e. it works for this test case but not the next one.

If this approach is taken at all, we’d probably want to add some emergency stop characters (;, {) to the switch.

Yeah, this is too much... and definitely has performance implications.

I think the best maybe we can do here is liberal use of endsWithParent to make sure we escape if we hit a {}, line end, etc... and try to return to normal matching...

klmr · 2020-10-03T17:39:11Z

test/markup/cpp/templates.txt

+template <int a, int b, bool c = (a < b), class T> void g1();
+class C;
+
+template <int a, int b, bool c = a < b, class T> void g1();


This test is also failing, and there’s probably no fix, since it’s context sensitive: a<b, class T> could be a valid template parameter, if a were a template template parameter (i.e. if we had template <typename> typename a instead of int a). Of course in that case we’d be missing an additional terminating >, but I can’t see how to make the parser aware of this since it comes (potentially much) later.

You could do someone like this with a callback that looks ahead and does code analysis, but that's definitely crossing over into "parsing" territory I think so I don't think we'd want to do that.

joshgoebel · 2020-10-03T23:27:11Z

src/languages/c-like.js

+    contains: [
+      {
+        begin: /</, end: />/,
+        endsWithParent: true,


Just so you know this endsWithParent does not auto-recurse... it simply means that within the scope of the <...> rule (line 230/231) that it will attempt to match not only the contains rules on line 235, but ALSO the end rule on line 228. But it doesn't go any deeper than that unless you added endsWithParent at every level...

I'm not sure if this has implications for your runaway guard on line 227 or not...

Good question! So, if I understand you correctly, I could equivalently remove endsWithParent, and change the end rule to /[>;{]/ (i.e. the same as the parent end)? In this case I should probably do that instead, and add the same end to the other contained rules.

Well if the desire is that they end with the parent you'd probably want to just use that... that's what it does - pull the parent "end" down into the children... I was just pointing out that it only goes one layer deep... so if it got "stuck" in a child of a child then your "guard" would not save you... so that needs to be considered and might result in adding some more "false positive" edge case tests.

I think that works now. I needed to add a nested {begin: />/} to prevent a > inside a (…) from terminating the template declaration mode prematurely. That said, the behaviour on my unit tests is actually the same as before, I need to think of some example which works now and would previously have failed.

Damn, I think I’ve just kicked the can down the road now. I think this fundamentally doesn’t work because I would need to nest the parent mode inside the mode that handles (…). With my newest change, the the two first lines parse correctly, but the template declaration in the third line ends prematurely, and the “class title” starts at class T rather than at class C:

template <int A = (B<C>), class T> class C; template <int A = B<(c > D<E>)>, class T> class C; template <int A = B<(c > D<(e > f)>)>, class T> class C;

var TEMPLATE_USE = {}; Object.assign(TEMPLATE_USE, { // .. can now use TEMPLATE_USE recursively

All you need is a reference. :-) Though I'm not sure we do this anywhere else and not sure it's a pattern we should start using now though. ...

I'm guessing the nesting here could be unlimited but is there some practical limits for MOST C++ code as far as what you'd see in the real world?

test/markup/cpp/templates.txt

src/languages/c-like.js

joshgoebel · 2020-10-15T13:10:50Z

Closing this in favor of #2752. Thanks for all the hard work on this though - it's been appreciated! Hopefully you've learned a bit about the grammars and parser that will help you if you continue to contribute. :-)

If we find more failure cases we can always re-review, but this was getting way to complicated.

klmr · 2020-10-15T13:33:26Z

Yes, I’m definitely in favour of closing this for now and re-submitting another PR later (in particular, after splitting the definitions for C and C++), I just need to find some time …

klmr added 2 commits October 3, 2020 18:11

Add C++ template expected markup rendering

be3aa54

Add better handling of templates for C++

577e31a

klmr commented Oct 3, 2020

View reviewed changes

src/languages/c-like.js Show resolved Hide resolved

klmr commented Oct 3, 2020

View reviewed changes

joshgoebel reviewed Oct 3, 2020

View reviewed changes

test/markup/cpp/templates.txt Show resolved Hide resolved

joshgoebel reviewed Oct 3, 2020

View reviewed changes

src/languages/c-like.js Show resolved Hide resolved

klmr changed the title ~~enh(cpp) Improve C++ class headers~~ enh(cpp) Improve C++ class template headers Oct 4, 2020

klmr added 3 commits October 4, 2020 17:23

Simplify template declaration parsing mode

1b8c23a

Fix termination of nested parentheses

f7d84e3

Allow limited nesting of template uses

81c80e1

klmr mentioned this pull request Oct 5, 2020

(c) Declaration of struct variable triggers class title mode #2736

Closed

joshgoebel closed this Oct 15, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

enh(cpp) Improve C++ class template headers #2728

enh(cpp) Improve C++ class template headers #2728

klmr commented Oct 3, 2020 •

edited

Loading

klmr left a comment

klmr Oct 3, 2020

joshgoebel Oct 3, 2020 •

edited

Loading

klmr Oct 4, 2020

klmr Oct 4, 2020

joshgoebel Oct 4, 2020

joshgoebel Oct 4, 2020

klmr Oct 3, 2020

joshgoebel Oct 3, 2020

joshgoebel Oct 3, 2020 •

edited

Loading

klmr Oct 4, 2020 •

edited

Loading

joshgoebel Oct 4, 2020

klmr Oct 4, 2020

klmr Oct 4, 2020

joshgoebel Oct 14, 2020

joshgoebel commented Oct 15, 2020

klmr commented Oct 15, 2020

enh(cpp) Improve C++ class template headers #2728

enh(cpp) Improve C++ class template headers #2728

Conversation

klmr commented Oct 3, 2020 • edited Loading

klmr left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

joshgoebel Oct 3, 2020 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

joshgoebel Oct 3, 2020 • edited Loading

Choose a reason for hiding this comment

klmr Oct 4, 2020 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

joshgoebel commented Oct 15, 2020

klmr commented Oct 15, 2020

klmr commented Oct 3, 2020 •

edited

Loading

joshgoebel Oct 3, 2020 •

edited

Loading

joshgoebel Oct 3, 2020 •

edited

Loading

klmr Oct 4, 2020 •

edited

Loading