-
Notifications
You must be signed in to change notification settings - Fork 4.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Parser: Remove specific support for <!--more--> tag #5061
Conversation
de3b48e
to
68aa510
Compare
@@ -8,5 +8,51 @@ export default function( node ) { | |||
return; | |||
} | |||
|
|||
if ( node.nodeValue.indexOf( 'more' ) === 0 ) { | |||
// Grab any custom text in the comment | |||
const matches = node.nodeValue.match( /more(.*)/ ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this RegExp is pretty much asking for trouble. could we whitelist characters instead of blacklisting them?
<!--moredanghtmlcomments-->
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for having a look. I'm not sure what you mean by whitelisting, though. FWIW, this is how core does it (permalink):
preg_match( '/<!--more(.*?)?-->/', $post, $matches )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this a valid more tag?
<!--moreYou should continue reading below!-->
The PEG didn't allow this - there had to be whitespace following more
if there was custom text in there. If core allows it then we've been wrong so far. At least core was non-greedy on the match.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also, /more(.*)/
is redundant. it matches the same things that /more/
matches because both zero of everything and one of everything and many of everything is .*
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nodeValue.slice(4).trim()
Love this! Thank you! |
857e08a
to
a1c273f
Compare
blocks/library/more/index.js
Outdated
from: [ | ||
{ | ||
type: 'raw', | ||
isMatch: ( node ) => node.dataset.block === 'core/more', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the dataset
is used here and it's not a generic API right? it's just a specific attribute created by the special-comment-converter.js
which is also specific to this bloc.
This makes me wonder if this converter's code shouldn't be written somewhere in this transform instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's not a generic API right? it's just a specific attribute created by the special-comment-converter.js which is also specific to this bloc.
Correct. It is a bit of a fake separation. However, I named it special-comment-converter
instead of more-converter
because I'm anticipating its use for <!--nextpage-->
later.
This makes me wonder if this converter's code shouldn't be written somewhere in this transform instead?
Hm. Maybe I looked at it the wrong way, but basically dataset + custom tag was what I came up with to preserve the more tag and not let it get destroyed by the rest of rawHandler
. This seems like a better trade-off than e.g. making exceptions in the other transformers to ignore more
. What do you think? Also pinging @iseulde for ideas. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that the separation is weird, but can't think of anything better right now... Hm or maybe? The raw handler itself is supposed to return clean HTML which blocks can hook into to transform. We already have shortcodes in this mix too though, but the transform is shortcode
instead of raw
. Maybe we could follow the same pattern and have a comment
transform? If no block transforms the comment, then we should just drop the comment (not pass through raw
transforms). What do you think? This removes the need for the special case comment converter.
* @param {Node} node The node to be processed. | ||
* @return {void} | ||
*/ | ||
export default function( node ) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think we should unit test this or is it already covered elsewhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think some tests are in order 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 👍 This surfaces this though #4958
Okay, After looking a bit at the code, The approach in #5061 (comment) might be simpler too? Also no |
This made sense to me initially, but less so after giving more thought. The main complexity is that, unlike with { type: 'comment', /* isMatch, */, transform( commentNode ) { } } isn't enough, since we need to look further than just the single node. Now, the My complementary question is: how many more comments can we expect to have to parse in Gutenberg to justify the introduction of a new public transform type? Right now we can only expect WordPress legacy markings: |
a1c273f
to
750422f
Compare
} | ||
|
||
// Grab any custom text in the comment | ||
const customText = node.nodeValue.slice( 4 ).trim(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Amusing side note: this fixes a bug in the more
handling currently in the parser, which requires a space between "more" and the custom text. Core doesn't require that space. 🙂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I missed the previous discussion wherein @dmsnell learned yet another wonderful weird thing about Core. 🙃
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍🏻 on this.
// Find the first ancestor to which the More element can be appended; | ||
// appending to the closer P parents fails | ||
let parent = node.parentNode; | ||
while ( parent.nodeName.toLowerCase() === 'p' && parent.parentNode ) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assuming that the intention is to ensure that the more element is at the top level (to convert into blocks later), could we loop through parent nodes until the body node is the parent of that node?
while ( parent.nodeName !== 'BODY' ) {
parent = parent.parentNode;
}
Also not completely sure what "appending to the closer P parents fails" means.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assuming that the intention is to ensure that the more element is at the top level (to convert into blocks later), could we loop through parent nodes until the body node is the parent of that node?
This is more straightforward, I like it.
Also not completely sure what "appending to the closer P parents fails" means.
The <wp-block>
element couldn't be appended to any <p>
; inspecting the parent would show that no children had been added.
} | ||
|
||
function createMore( customText, noTeaser ) { | ||
const node = document.createElement( 'wp-block' ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting thing :) I wonder if we can also do this for shortcakes... Might enable us to get rid of the "pieces" of HTML and parse it all in one go.
750422f
to
f5db665
Compare
I'll wait until we have #4958 fixed before merging this one. Thanks for the reviews! Worth pointing out that, in an impromptu discussion with @mtias, it was brought up that in the long run the
I'm leaning towards number 1. |
f5db665
to
f30caee
Compare
- No exceptions in serializer either - Use dedicated rawHandling transformer for `more`, `noteaser` - Add tests
5876177
to
de32975
Compare
Description
Fixes #2973. This pull request:
<!--more-->
and<!--teaser-->
), thus keeping the parser "Gutenberg-focus", as @pento put it;more
and optionalnoteaser
tags with regular comment demarcations;<!--nextpage-->
; see Introduce a Next Page Block to allow post pagination #4930) block without introducing further parser exceptions.How Has This Been Tested?
Pasting hasn't yet been tested. For conversion of legacy content, try:
and
and
more
tags.customText
andnoTeaser
). ThecustomText
attribute can be observed by looking at the text on the More block's visual representation; thenoTeaser
attribute can be observed with the inspector controls.Caveat
rawHandler
's transformers will get rid of HTML comments, and will attempt to merge or eliminate many elements (e.g. paragraphs, line breaks). In order to get around this without making special exceptions for certain comments,commentRemover
will detectmore
andnoteaser
comments and replace them with a custom element<wp-block>
. This element has no particular definition, is never rendered, and is solely used in order to pass block information throughrawHandler
. The More block'sraw
transform will then match against its shape to replace it with a proper block.commentRemover
happened to be the place where these operations were initially introduced, but the transformer can be split in two.Screenshots (jpeg or gifs if applicable):
Types of changes
Checklist: