Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inserter: use lighter grammar parse to check allowed status #64902

Merged
merged 5 commits into from
Aug 30, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions package-lock.json

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

1 change: 1 addition & 0 deletions packages/block-editor/package.json
Original file line number Diff line number Diff line change
Expand Up @@ -39,6 +39,7 @@
"@wordpress/a11y": "file:../a11y",
"@wordpress/api-fetch": "file:../api-fetch",
"@wordpress/blob": "file:../blob",
"@wordpress/block-serialization-default-parser": "file:../block-serialization-default-parser",
"@wordpress/blocks": "file:../blocks",
"@wordpress/commands": "file:../commands",
"@wordpress/components": "file:../components",
Expand Down
8 changes: 4 additions & 4 deletions packages/block-editor/src/store/private-selectors.js
Original file line number Diff line number Diff line change
Expand Up @@ -20,7 +20,7 @@ import {
checkAllowListRecursive,
getAllPatternsDependants,
getInsertBlockTypeDependants,
getParsedPattern,
getGrammar,
} from './utils';
import { INSERTER_PATTERN_TYPES } from '../components/inserter/block-patterns-tab/utils';
import { STORE_NAME } from './constants';
Expand Down Expand Up @@ -300,10 +300,10 @@ export const hasAllowedPatterns = createRegistrySelector( ( select ) =>
if ( ! inserter ) {
return false;
}
const { blocks } = getParsedPattern( pattern );
const grammar = getGrammar( pattern );
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Grammar parse is lightening fast, but probably still a good idea to cache.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's worth measuring the impact if possible, sometimes caching adds more overhead than the actual operation.

return (
checkAllowListRecursive( blocks, allowedBlockTypes ) &&
blocks.every( ( { name: blockName } ) =>
checkAllowListRecursive( grammar, allowedBlockTypes ) &&
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's funny that checkAllowListRecursive is already made to work with the grammar objects... makes me wonder if we did that at some point before.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like it was used when the method was introduced - #30366. We should probably check why it was removed in case there was an issue with grammar parser.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My guess is that there was a need elsewhere for parsed content, and someone switched it to a full parse. But yes we should try to fish it out of the history.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was removed in #32376.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And it was removed to improve correctness of the block names: fallback names, deprecations that migrate to another block type etc.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why the underlying parser produces them in the first place

I'm now looking at the code and it's totally intentional!

https://github.com/WordPress/gutenberg/blob/trunk/packages/block-serialization-default-parser/src/index.js#L273-L290

There is a token of type void-block and it produces block records only on top level. @dmsnell why does the parser do that? It looks like it has something to do with freeform/classic content.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes the goal of the parser is to represent the source document, so creating a fallback block for whitespace captures that and lets us write it back.

there was considerable discussion about this a number of years ago. the suggestion in each case is just to ignore the whitespace-only fallback blocks.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dmsnell do you remember why the fallback blocks with whitespace are created only on top level, but not in inner blocks?

If the goal of the parser is to represent the source document well, it should also capture whitespace in a nested markup.

Two nice things about using the lightweight parser for parsing patterns:

  • patterns are more likely to have correct and modern block markup, because the concept of patterns is new. It's mostly the post content where dealing with legacy old things is important.
  • if we ever implement block lazy loading, that means the full two-stage parse must be async (promise-returning) because block definitions are loaded by the parser as it needs them. But the first-state lighter parser can remain completely sync.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you remember why the fallback blocks with whitespace are created only on top level, but not in inner blocks?

@jsnajdr are you certain what you are seeing is as you describe it? these aren't "empty blocks" so much as they are HTML content outside of a block, or "HTML soup." the whitespace isn't ignored just because it's inside of another block, but when it's HTML content inside of a block it becomes part of that block's inner HTML and inner content. in fact, this is always the case because the whitespace is just innerHTML

You can explore this at my parser explorer - https://dmsnell.github.io/peg-parser-explorer/ - just make sure to click the Fetch button to load the block grammar.

innerContent is an array containing string chunks of HTML content, and null as a placeholder for inner blocks, allowing the reconstruction of multiple non-nested inner blocks within another block. all whitespace should appear both in innerHTML and in innerContent.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for clarification @dmsnell, now I fully understand what's going on. At the top level the whitespace is represented as void blocks, because there is no innerContent field where we could put that information.

For inner blocks, the whitespace is in innerContent, which is an array that looks like ["\n\t", null, "\n\t", null, "\n"]. The whitespace can be fully reconstructed from this. We don't need void blocks here.

There is also innerHTML field which is a string like "\n\t\n\t\n". This field is only useful for "leaf" blocks that have their own markup that's in innerHTML, but for "container" blocks the information is not very useful, as all the whitespace is concatenated together.

grammar.every( ( { name: blockName } ) =>
canInsertBlockType( state, blockName, rootClientId )
)
);
Expand Down
23 changes: 17 additions & 6 deletions packages/block-editor/src/store/selectors.js
Original file line number Diff line number Diff line change
Expand Up @@ -27,6 +27,7 @@ import {
getAllPatternsDependants,
getInsertBlockTypeDependants,
getParsedPattern,
getGrammar,
} from './utils';
import { orderBy } from '../utils/sorting';
import { STORE_NAME } from './constants';
Expand Down Expand Up @@ -2376,17 +2377,27 @@ export const __experimentalGetAllowedPatterns = createRegistrySelector(
const { getAllPatterns } = unlock( select( STORE_NAME ) );
const patterns = getAllPatterns();
const { allowedBlockTypes } = getSettings( state );

const parsedPatterns = patterns
.filter( ( { inserter = true } ) => !! inserter )
.map( getParsedPattern );
.map( ( pattern ) => {
return {
...pattern,
get blocks() {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm adding this for back compat and also for us. Eventually we should adjust the uses of this selector to not expect blocks and only parse when it's needed.

return getParsedPattern( pattern ).blocks;
},
};
} );

const availableParsedPatterns = parsedPatterns.filter(
( { blocks } ) =>
checkAllowListRecursive( blocks, allowedBlockTypes )
( pattern ) =>
checkAllowListRecursive(
getGrammar( pattern ),
allowedBlockTypes
)
);
const patternsAllowed = availableParsedPatterns.filter(
( { blocks } ) =>
blocks.every( ( { name } ) =>
( pattern ) =>
getGrammar( pattern ).every( ( { blockName: name } ) =>
canInsertBlockType( state, name, rootClientId )
)
);
Expand Down
20 changes: 16 additions & 4 deletions packages/block-editor/src/store/utils.js
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,7 @@
* WordPress dependencies
*/
import { parse } from '@wordpress/blocks';
import { parse as grammarParse } from '@wordpress/block-serialization-default-parser';

/**
* Internal dependencies
Expand All @@ -13,6 +14,7 @@ import { STORE_NAME } from './constants';
export const withRootClientIdOptionKey = Symbol( 'withRootClientId' );

const parsedPatternCache = new WeakMap();
const grammarMapCache = new WeakMap();

function parsePattern( pattern ) {
const blocks = parse( pattern.content, {
Expand All @@ -37,14 +39,24 @@ function parsePattern( pattern ) {

export function getParsedPattern( pattern ) {
let parsedPattern = parsedPatternCache.get( pattern );
if ( parsedPattern ) {
return parsedPattern;
if ( ! parsedPattern ) {
parsedPattern = parsePattern( pattern );
parsedPatternCache.set( pattern, parsedPattern );
}
parsedPattern = parsePattern( pattern );
parsedPatternCache.set( pattern, parsedPattern );
return parsedPattern;
}

export function getGrammar( pattern ) {
let grammarMap = grammarMapCache.get( pattern );
if ( ! grammarMap ) {
grammarMap = grammarParse( pattern.content );
// Block names are null only at the top level for whitespace.
grammarMap = grammarMap.filter( ( block ) => block.blockName !== null );
grammarMapCache.set( pattern, grammarMap );
}
return grammarMap;
}

export const checkAllowList = ( list, item, defaultResult = null ) => {
if ( typeof list === 'boolean' ) {
return list;
Expand Down
1 change: 1 addition & 0 deletions packages/block-editor/tsconfig.json
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,7 @@
"references": [
{ "path": "../a11y" },
{ "path": "../api-fetch" },
{ "path": "../block-serialization-default-parser" },
{ "path": "../blob" },
{ "path": "../components" },
{ "path": "../compose" },
Expand Down
57 changes: 57 additions & 0 deletions test/e2e/specs/editor/various/parsing-patterns.spec.js
Original file line number Diff line number Diff line change
@@ -0,0 +1,57 @@
/**
* WordPress dependencies
*/
const { test, expect } = require( '@wordpress/e2e-test-utils-playwright' );

test.describe( 'Parsing patterns', () => {
test( 'Considers a pattern with whitespace an allowed pattern', async ( {
admin,
editor,
page,
} ) => {
await admin.createNewPost();
await editor.insertBlock( {
name: 'core/buttons',
innerBlocks: [ { name: 'core/button', attributes: { text: 'a' } } ],
} );
await page.keyboard.press( 'ArrowDown' );
await page.getByLabel( 'Toggle block inserter' ).click();

await page.getByRole( 'tab', { name: 'Patterns' } ).click();
await page.evaluate( () => {
window.wp.data.dispatch( 'core/block-editor' ).updateSettings( {
__experimentalBlockPatterns: [
{
name: 'test/whitespace',
title: 'Pattern with top-level whitespace',
description: '',
content: `<!-- wp:button -->
<div class="wp-block-button"><a class="wp-block-button__link wp-element-button">test</a></div>
<!-- /wp:button -->
<!-- wp:button -->
<div class="wp-block-button"><a class="wp-block-button__link wp-element-button">test</a></div>
<!-- /wp:button -->`,
},
],
} );
} );
await page.fill(
'role=region[name="Block Library"i] >> role=searchbox[name="Search for blocks and patterns"i]',
'whitespace'
);
await page
.locator( 'role=option[name="Pattern with top-level whitespace"i]' )
.click();
expect( await editor.getBlocks() ).toMatchObject( [
{
name: 'core/buttons',
innerBlocks: [
{ name: 'core/button', attributes: { text: 'a' } },
{ name: 'core/button', attributes: { text: 'test' } },
{ name: 'core/button', attributes: { text: 'test' } },
],
},
] );
} );
} );
Loading