Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Blocks] Preserve source markup in invalid blocks #38460

Closed
wants to merge 2 commits into from

Conversation

dmsnell
Copy link
Member

@dmsnell dmsnell commented Feb 2, 2022

Status

Currently this is ON HOLD as I prepare a broader change that addresses the shortcomings noted in the discussion, namely the originalContent field and all the little places we need to compare the preserved markup with the re-generated markup, such as in the block comparer during a "Resolve" for an invalid block.

  • Can we avoid updating all the test fixtures? I feel really bad about making an otherwise small PR blow up with changes that are hard to review. I considered modifying our custom snapshot testing runner to exclude the sourceMarkup attribute but wanted to hear thoughts first about it. I wondered if we rely on these fixtures and if they are providing much value to us. If they are, I wonder why we have a custom runner instead of using jest, if that's a carry-over from earlier times.
    • I've updated the test runner to ignore/remove the sourceMarkup property when comparing. On one hand I like this but on the other hand maybe it would be best to leave it in and update the fixtures with the 600+ lines of code that adds to this PR because it would be less surprising to come across. Any thoughts are appreciated.
  • Identify places that are still inconsistent or already consistent. For example, the "Resolve" screen comparing the invalid block against its fix gets things right somehow. Other places might use originalContent and I'd like to audit all those places and even get rid of originalContent if we can.
    • install-button parses originalContent but should we have it parse sourceMarkup instead?
    • block-compare shows a diff view of the innerHTML between a source block and the re-generated markup. showing the difference with sourceMarkup would show the delimiters and also the inner blocks. inner blocks could be noisy, but also essential. showing the delimiters could be confusing.
    • getBlockInnerHTML returns innerHTML if it fails to validate or re-generate block output. can this wipe out inner blocks?
    • ⚠️ validateBlock only looks at innerHTML which means that if we get different innerBlocks we'll accept it as valid.
    • ⚠️ isValidBlockContent overwrites a block's innerBlocks with an empty array for validation, completely ignoring them.
  • Assess more concretely the inner blocks problem whereas we want to be able to edit and modify recognized inner blocks for parent blocks that are invalid. This isn't totally pertinent to this PR so I don't want to hold it up on that account, but I'd like to better understand the implications and future work before merging.
    • for the sake of this PR I don't think we have to consider this problem. we're not making it worse and we're not getting in the way of fixing it. all we'll be doing is making sure that on save an invalid block leaves the original markup.

Description

Previously we have been re-generating a block's save content when
we fail to properly validate that block. While this can work in many
cases it breaks down quickly because of the invalidly-parsed attributes.

Consider a list block whose <ul> tag has been removed. The block will
fail to validate and then also fail to read the content of the list items
because the CSS selector used to source the attribute won't work. The
attribute for the list's content will be empty and getSaveContent will
return an empty list, this despite the fact that the contents were largely
fine in the post HTML when loaded.

In this patch we're threading a new property into the block node called
sourceMarkup whose purpose is to preserve the original HTML in a post
so that we can choose to preserve that on save instead of corrupting or
losing content through the HTML re-generation.

With sourceMarkup we get a less-processed version of the source to
relay into save which should eliminate a common point of frustration:
saving a post erases content.

Testing Instructions

Screenshots

Invalid list before

Screen Shot 2022-02-02 at 11 41 00 AM

Note that the "Resolve" dialog already showed the proper content, but in the editor it looks like an empty list.

Invalid list after

Screen Shot 2022-02-02 at 11 38 57 AM

The editor shows the proper list items and will preserve them on save.

Types of changes

  • Preserving original block attributes, HTML, and inner blocks when saving an invalid block vs. attempting to re-generate that content from the attributes which parsed but failed to validate.
  • Render existing broken content as HTML in editor for invalid blocks instead of showing broken re-generated content.

Checklist:

  • My code is tested.
  • My code follows the WordPress code style.
  • My code follows the accessibility standards.
  • I've tested my changes with keyboard and screen readers.
  • My code has proper inline documentation.
  • I've included developer documentation if appropriate.
  • I've updated all React Native files affected by any refactorings/renamings in this PR (please manually search all *.native.js files for terms that need renaming or removal).
  • I've updated related schemas if appropriate.

@github-actions
Copy link

github-actions bot commented Feb 2, 2022

Size Change: -4 B (0%)

Total Size: 1.14 MB

Filename Size Change
build/block-editor/index.min.js 142 kB -16 B (0%)
build/blocks/index.min.js 46.4 kB +12 B (0%)
ℹ️ View Unchanged
Filename Size
build/a11y/index.min.js 960 B
build/admin-manifest/index.min.js 1.1 kB
build/annotations/index.min.js 2.75 kB
build/api-fetch/index.min.js 2.22 kB
build/autop/index.min.js 2.12 kB
build/blob/index.min.js 459 B
build/block-directory/index.min.js 6.28 kB
build/block-directory/style-rtl.css 1.01 kB
build/block-directory/style.css 1.01 kB
build/block-editor/default-editor-styles-rtl.css 378 B
build/block-editor/default-editor-styles.css 378 B
build/block-editor/style-rtl.css 14.8 kB
build/block-editor/style.css 14.8 kB
build/block-library/blocks/archives/editor-rtl.css 61 B
build/block-library/blocks/archives/editor.css 60 B
build/block-library/blocks/archives/style-rtl.css 65 B
build/block-library/blocks/archives/style.css 65 B
build/block-library/blocks/audio/editor-rtl.css 150 B
build/block-library/blocks/audio/editor.css 150 B
build/block-library/blocks/audio/style-rtl.css 111 B
build/block-library/blocks/audio/style.css 111 B
build/block-library/blocks/audio/theme-rtl.css 125 B
build/block-library/blocks/audio/theme.css 125 B
build/block-library/blocks/block/editor-rtl.css 161 B
build/block-library/blocks/block/editor.css 161 B
build/block-library/blocks/button/editor-rtl.css 470 B
build/block-library/blocks/button/editor.css 470 B
build/block-library/blocks/button/style-rtl.css 560 B
build/block-library/blocks/button/style.css 560 B
build/block-library/blocks/buttons/editor-rtl.css 292 B
build/block-library/blocks/buttons/editor.css 292 B
build/block-library/blocks/buttons/style-rtl.css 275 B
build/block-library/blocks/buttons/style.css 275 B
build/block-library/blocks/calendar/style-rtl.css 207 B
build/block-library/blocks/calendar/style.css 207 B
build/block-library/blocks/categories/editor-rtl.css 84 B
build/block-library/blocks/categories/editor.css 83 B
build/block-library/blocks/categories/style-rtl.css 79 B
build/block-library/blocks/categories/style.css 79 B
build/block-library/blocks/code/style-rtl.css 90 B
build/block-library/blocks/code/style.css 90 B
build/block-library/blocks/code/theme-rtl.css 131 B
build/block-library/blocks/code/theme.css 131 B
build/block-library/blocks/columns/editor-rtl.css 108 B
build/block-library/blocks/columns/editor.css 108 B
build/block-library/blocks/columns/style-rtl.css 406 B
build/block-library/blocks/columns/style.css 406 B
build/block-library/blocks/comment-template/style-rtl.css 127 B
build/block-library/blocks/comment-template/style.css 127 B
build/block-library/blocks/comments-pagination-numbers/editor-rtl.css 123 B
build/block-library/blocks/comments-pagination-numbers/editor.css 121 B
build/block-library/blocks/comments-pagination/editor-rtl.css 222 B
build/block-library/blocks/comments-pagination/editor.css 209 B
build/block-library/blocks/comments-pagination/style-rtl.css 235 B
build/block-library/blocks/comments-pagination/style.css 231 B
build/block-library/blocks/comments-query-loop/editor-rtl.css 95 B
build/block-library/blocks/comments-query-loop/editor.css 95 B
build/block-library/blocks/cover/editor-rtl.css 546 B
build/block-library/blocks/cover/editor.css 547 B
build/block-library/blocks/cover/style-rtl.css 1.55 kB
build/block-library/blocks/cover/style.css 1.55 kB
build/block-library/blocks/embed/editor-rtl.css 293 B
build/block-library/blocks/embed/editor.css 293 B
build/block-library/blocks/embed/style-rtl.css 417 B
build/block-library/blocks/embed/style.css 417 B
build/block-library/blocks/embed/theme-rtl.css 124 B
build/block-library/blocks/embed/theme.css 124 B
build/block-library/blocks/file/editor-rtl.css 300 B
build/block-library/blocks/file/editor.css 300 B
build/block-library/blocks/file/style-rtl.css 255 B
build/block-library/blocks/file/style.css 255 B
build/block-library/blocks/file/view.min.js 322 B
build/block-library/blocks/freeform/editor-rtl.css 2.44 kB
build/block-library/blocks/freeform/editor.css 2.44 kB
build/block-library/blocks/gallery/editor-rtl.css 965 B
build/block-library/blocks/gallery/editor.css 967 B
build/block-library/blocks/gallery/style-rtl.css 1.61 kB
build/block-library/blocks/gallery/style.css 1.61 kB
build/block-library/blocks/gallery/theme-rtl.css 122 B
build/block-library/blocks/gallery/theme.css 122 B
build/block-library/blocks/group/editor-rtl.css 159 B
build/block-library/blocks/group/editor.css 159 B
build/block-library/blocks/group/style-rtl.css 57 B
build/block-library/blocks/group/style.css 57 B
build/block-library/blocks/group/theme-rtl.css 78 B
build/block-library/blocks/group/theme.css 78 B
build/block-library/blocks/heading/style-rtl.css 114 B
build/block-library/blocks/heading/style.css 114 B
build/block-library/blocks/html/editor-rtl.css 332 B
build/block-library/blocks/html/editor.css 333 B
build/block-library/blocks/image/editor-rtl.css 810 B
build/block-library/blocks/image/editor.css 809 B
build/block-library/blocks/image/style-rtl.css 500 B
build/block-library/blocks/image/style.css 503 B
build/block-library/blocks/image/theme-rtl.css 124 B
build/block-library/blocks/image/theme.css 124 B
build/block-library/blocks/latest-comments/style-rtl.css 284 B
build/block-library/blocks/latest-comments/style.css 284 B
build/block-library/blocks/latest-posts/editor-rtl.css 199 B
build/block-library/blocks/latest-posts/editor.css 198 B
build/block-library/blocks/latest-posts/style-rtl.css 447 B
build/block-library/blocks/latest-posts/style.css 446 B
build/block-library/blocks/list/style-rtl.css 94 B
build/block-library/blocks/list/style.css 94 B
build/block-library/blocks/media-text/editor-rtl.css 266 B
build/block-library/blocks/media-text/editor.css 263 B
build/block-library/blocks/media-text/style-rtl.css 493 B
build/block-library/blocks/media-text/style.css 490 B
build/block-library/blocks/more/editor-rtl.css 431 B
build/block-library/blocks/more/editor.css 431 B
build/block-library/blocks/navigation-link/editor-rtl.css 649 B
build/block-library/blocks/navigation-link/editor.css 650 B
build/block-library/blocks/navigation-link/style-rtl.css 94 B
build/block-library/blocks/navigation-link/style.css 94 B
build/block-library/blocks/navigation-submenu/editor-rtl.css 299 B
build/block-library/blocks/navigation-submenu/editor.css 299 B
build/block-library/blocks/navigation-submenu/view.min.js 343 B
build/block-library/blocks/navigation/editor-rtl.css 1.98 kB
build/block-library/blocks/navigation/editor.css 1.99 kB
build/block-library/blocks/navigation/style-rtl.css 1.85 kB
build/block-library/blocks/navigation/style.css 1.84 kB
build/block-library/blocks/navigation/view.min.js 2.81 kB
build/block-library/blocks/nextpage/editor-rtl.css 395 B
build/block-library/blocks/nextpage/editor.css 395 B
build/block-library/blocks/page-list/editor-rtl.css 401 B
build/block-library/blocks/page-list/editor.css 402 B
build/block-library/blocks/page-list/style-rtl.css 175 B
build/block-library/blocks/page-list/style.css 175 B
build/block-library/blocks/paragraph/editor-rtl.css 157 B
build/block-library/blocks/paragraph/editor.css 157 B
build/block-library/blocks/paragraph/style-rtl.css 273 B
build/block-library/blocks/paragraph/style.css 273 B
build/block-library/blocks/post-author/style-rtl.css 175 B
build/block-library/blocks/post-author/style.css 176 B
build/block-library/blocks/post-comments-form/style-rtl.css 446 B
build/block-library/blocks/post-comments-form/style.css 446 B
build/block-library/blocks/post-comments/style-rtl.css 521 B
build/block-library/blocks/post-comments/style.css 521 B
build/block-library/blocks/post-excerpt/editor-rtl.css 73 B
build/block-library/blocks/post-excerpt/editor.css 73 B
build/block-library/blocks/post-excerpt/style-rtl.css 69 B
build/block-library/blocks/post-excerpt/style.css 69 B
build/block-library/blocks/post-featured-image/editor-rtl.css 721 B
build/block-library/blocks/post-featured-image/editor.css 721 B
build/block-library/blocks/post-featured-image/style-rtl.css 153 B
build/block-library/blocks/post-featured-image/style.css 153 B
build/block-library/blocks/post-template/editor-rtl.css 99 B
build/block-library/blocks/post-template/editor.css 98 B
build/block-library/blocks/post-template/style-rtl.css 323 B
build/block-library/blocks/post-template/style.css 323 B
build/block-library/blocks/post-terms/style-rtl.css 73 B
build/block-library/blocks/post-terms/style.css 73 B
build/block-library/blocks/post-title/style-rtl.css 80 B
build/block-library/blocks/post-title/style.css 80 B
build/block-library/blocks/preformatted/style-rtl.css 103 B
build/block-library/blocks/preformatted/style.css 103 B
build/block-library/blocks/pullquote/editor-rtl.css 198 B
build/block-library/blocks/pullquote/editor.css 198 B
build/block-library/blocks/pullquote/style-rtl.css 389 B
build/block-library/blocks/pullquote/style.css 388 B
build/block-library/blocks/pullquote/theme-rtl.css 167 B
build/block-library/blocks/pullquote/theme.css 167 B
build/block-library/blocks/query-pagination-numbers/editor-rtl.css 122 B
build/block-library/blocks/query-pagination-numbers/editor.css 121 B
build/block-library/blocks/query-pagination/editor-rtl.css 221 B
build/block-library/blocks/query-pagination/editor.css 211 B
build/block-library/blocks/query-pagination/style-rtl.css 234 B
build/block-library/blocks/query-pagination/style.css 231 B
build/block-library/blocks/query/editor-rtl.css 131 B
build/block-library/blocks/query/editor.css 132 B
build/block-library/blocks/quote/style-rtl.css 201 B
build/block-library/blocks/quote/style.css 201 B
build/block-library/blocks/quote/theme-rtl.css 223 B
build/block-library/blocks/quote/theme.css 226 B
build/block-library/blocks/read-more/style-rtl.css 132 B
build/block-library/blocks/read-more/style.css 132 B
build/block-library/blocks/rss/editor-rtl.css 202 B
build/block-library/blocks/rss/editor.css 204 B
build/block-library/blocks/rss/style-rtl.css 289 B
build/block-library/blocks/rss/style.css 288 B
build/block-library/blocks/search/editor-rtl.css 165 B
build/block-library/blocks/search/editor.css 165 B
build/block-library/blocks/search/style-rtl.css 397 B
build/block-library/blocks/search/style.css 398 B
build/block-library/blocks/search/theme-rtl.css 64 B
build/block-library/blocks/search/theme.css 64 B
build/block-library/blocks/separator/editor-rtl.css 99 B
build/block-library/blocks/separator/editor.css 99 B
build/block-library/blocks/separator/style-rtl.css 233 B
build/block-library/blocks/separator/style.css 233 B
build/block-library/blocks/separator/theme-rtl.css 172 B
build/block-library/blocks/separator/theme.css 172 B
build/block-library/blocks/shortcode/editor-rtl.css 474 B
build/block-library/blocks/shortcode/editor.css 474 B
build/block-library/blocks/site-logo/editor-rtl.css 744 B
build/block-library/blocks/site-logo/editor.css 744 B
build/block-library/blocks/site-logo/style-rtl.css 181 B
build/block-library/blocks/site-logo/style.css 181 B
build/block-library/blocks/site-tagline/editor-rtl.css 86 B
build/block-library/blocks/site-tagline/editor.css 86 B
build/block-library/blocks/site-title/editor-rtl.css 84 B
build/block-library/blocks/site-title/editor.css 84 B
build/block-library/blocks/social-link/editor-rtl.css 177 B
build/block-library/blocks/social-link/editor.css 177 B
build/block-library/blocks/social-links/editor-rtl.css 674 B
build/block-library/blocks/social-links/editor.css 673 B
build/block-library/blocks/social-links/style-rtl.css 1.37 kB
build/block-library/blocks/social-links/style.css 1.36 kB
build/block-library/blocks/spacer/editor-rtl.css 332 B
build/block-library/blocks/spacer/editor.css 332 B
build/block-library/blocks/spacer/style-rtl.css 48 B
build/block-library/blocks/spacer/style.css 48 B
build/block-library/blocks/table/editor-rtl.css 471 B
build/block-library/blocks/table/editor.css 472 B
build/block-library/blocks/table/style-rtl.css 481 B
build/block-library/blocks/table/style.css 481 B
build/block-library/blocks/table/theme-rtl.css 188 B
build/block-library/blocks/table/theme.css 188 B
build/block-library/blocks/tag-cloud/style-rtl.css 214 B
build/block-library/blocks/tag-cloud/style.css 215 B
build/block-library/blocks/template-part/editor-rtl.css 560 B
build/block-library/blocks/template-part/editor.css 559 B
build/block-library/blocks/template-part/theme-rtl.css 101 B
build/block-library/blocks/template-part/theme.css 101 B
build/block-library/blocks/text-columns/editor-rtl.css 95 B
build/block-library/blocks/text-columns/editor.css 95 B
build/block-library/blocks/text-columns/style-rtl.css 166 B
build/block-library/blocks/text-columns/style.css 166 B
build/block-library/blocks/verse/style-rtl.css 87 B
build/block-library/blocks/verse/style.css 87 B
build/block-library/blocks/video/editor-rtl.css 571 B
build/block-library/blocks/video/editor.css 572 B
build/block-library/blocks/video/style-rtl.css 173 B
build/block-library/blocks/video/style.css 173 B
build/block-library/blocks/video/theme-rtl.css 124 B
build/block-library/blocks/video/theme.css 124 B
build/block-library/common-rtl.css 921 B
build/block-library/common.css 919 B
build/block-library/editor-rtl.css 10.1 kB
build/block-library/editor.css 10.1 kB
build/block-library/index.min.js 167 kB
build/block-library/reset-rtl.css 474 B
build/block-library/reset.css 474 B
build/block-library/style-rtl.css 11.3 kB
build/block-library/style.css 11.3 kB
build/block-library/theme-rtl.css 672 B
build/block-library/theme.css 676 B
build/block-serialization-default-parser/index.min.js 1.09 kB
build/block-serialization-spec-parser/index.min.js 2.79 kB
build/components/index.min.js 215 kB
build/components/style-rtl.css 15.5 kB
build/components/style.css 15.5 kB
build/compose/index.min.js 11.2 kB
build/core-data/index.min.js 13.4 kB
build/customize-widgets/index.min.js 11.4 kB
build/customize-widgets/style-rtl.css 1.5 kB
build/customize-widgets/style.css 1.49 kB
build/data-controls/index.min.js 631 B
build/data/index.min.js 7.44 kB
build/date/index.min.js 31.9 kB
build/deprecated/index.min.js 485 B
build/dom-ready/index.min.js 304 B
build/dom/index.min.js 4.5 kB
build/edit-navigation/index.min.js 16.2 kB
build/edit-navigation/style-rtl.css 3.76 kB
build/edit-navigation/style.css 3.76 kB
build/edit-post/classic-rtl.css 546 B
build/edit-post/classic.css 547 B
build/edit-post/index.min.js 29.9 kB
build/edit-post/style-rtl.css 7.17 kB
build/edit-post/style.css 7.16 kB
build/edit-site/index.min.js 41.6 kB
build/edit-site/style-rtl.css 7.22 kB
build/edit-site/style.css 7.21 kB
build/edit-widgets/index.min.js 16.7 kB
build/edit-widgets/style-rtl.css 4.17 kB
build/edit-widgets/style.css 4.17 kB
build/editor/index.min.js 38.4 kB
build/editor/style-rtl.css 3.71 kB
build/editor/style.css 3.71 kB
build/element/index.min.js 3.29 kB
build/escape-html/index.min.js 517 B
build/format-library/index.min.js 6.58 kB
build/format-library/style-rtl.css 571 B
build/format-library/style.css 571 B
build/hooks/index.min.js 1.63 kB
build/html-entities/index.min.js 424 B
build/i18n/index.min.js 3.75 kB
build/is-shallow-equal/index.min.js 501 B
build/keyboard-shortcuts/index.min.js 1.8 kB
build/keycodes/index.min.js 1.39 kB
build/list-reusable-blocks/index.min.js 1.72 kB
build/list-reusable-blocks/style-rtl.css 838 B
build/list-reusable-blocks/style.css 838 B
build/media-utils/index.min.js 2.92 kB
build/notices/index.min.js 925 B
build/nux/index.min.js 2.09 kB
build/nux/style-rtl.css 747 B
build/nux/style.css 743 B
build/plugins/index.min.js 1.95 kB
build/primitives/index.min.js 917 B
build/priority-queue/index.min.js 582 B
build/react-i18n/index.min.js 671 B
build/react-refresh-entry/index.min.js 8.44 kB
build/react-refresh-runtime/index.min.js 7.31 kB
build/redux-routine/index.min.js 2.65 kB
build/reusable-blocks/index.min.js 2.22 kB
build/reusable-blocks/style-rtl.css 256 B
build/reusable-blocks/style.css 256 B
build/rich-text/index.min.js 11 kB
build/server-side-render/index.min.js 1.58 kB
build/shortcode/index.min.js 1.49 kB
build/token-list/index.min.js 639 B
build/url/index.min.js 1.9 kB
build/viewport/index.min.js 1.05 kB
build/warning/index.min.js 248 B
build/widgets/index.min.js 7.15 kB
build/widgets/style-rtl.css 1.16 kB
build/widgets/style.css 1.16 kB
build/wordcount/index.min.js 1.04 kB

compressed-size-action

@dmsnell dmsnell force-pushed the blocks/preserve-source-markup-when-invalid branch from 259f024 to 4e18a5c Compare February 4, 2022 19:12
@dmsnell dmsnell requested review from youknowriad, gwwar and vcanales and removed request for gwwar February 4, 2022 20:16
@dmsnell dmsnell marked this pull request as ready for review February 4, 2022 21:50
originalUndelimitedContent,
},
innerHTML: rawBlock.blockName ? originalContent : rawBlock.innerHTML,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the reasoning for the removal of the condition here? Is it something that never happens?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

previously we have been re-generating block output from the parsed inputs. this is the reason that the editor can silently corrupt or lose content. for HTML soup outside of the blocks though there's no way to re-generate the block since there is no block. because of this we special-case the HTML soup and use innerHTML directly, which ironically is closer to the correct behavior for regular blocks too.

after preserving the source markup every parsed block contains that original content which we can thread through and it doesn't matter if it's a block proper or not, so no more need for the conditional and special-casing.

@@ -228,6 +230,7 @@ export function parseRawBlock( rawBlock ) {
parsedInnerBlocks
);
parsedBlock.originalContent = normalizedBlock.innerHTML;
parsedBlock.sourceMarkup = normalizedBlock.sourceMarkup;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not clear to me what's the difference between sourceMarkup and originalContent. As far as I remember, originalContent was introduced for the same reasons.

Copy link
Member Author

@dmsnell dmsnell Feb 7, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The introduction of originalContent in #14443 was ironically made to support inner blocks even though we've got a few years to reflect on it and confirm it breaks them. 🙃

I find originalContent as a name confusing particularly because it stands in sharp contrast to how we define "content" in innerContent. That is, innerContent is an array that specifically represents the view of the document including the inner blocks while originalContent is a string that specifically represents the view where the inner blocks are missing.

I considered removing originalContent altogether but I'm worried about what might be using it. As noted in the original PR we should be moving towards removing all use of innerHTML everywhere as it's an historical artifact going back to an original design question revolving around how to represent nesting.

sourceMarkup though has a clear referent that's not muddied by the confusing terminology (confusing not only because it means the opposite of what you should infer based on the block API but also because it implies it came from the original document when in fact it was re-generated on the fly from possibly faulty inputs).

With this change in we should be able to start removing references to originalContent and if possible, I'd rather do that in other work since this PR is fixing a longstanding issue and I'm nervous that if we start updating and changing other parts we up the risk of introducing regressions.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, so the intent is indeed to change the way we compute originalContent and the name change is there to bring clarity.

I think bringing clarity is a great argument but I wonder if we should just continue using originalContent that way we're not stuck supporting this old property forever and add some useless memory consumption to our stores. I'm hesitant personally here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we really want to stick with the renaming, we could potentially make originalContent a getter/setter property in order to add a deprecated call maybe.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you're uncomfortable keeping both I'd be in favor of removing originalContent entirely even at risk of breaking things. I'll check the usage in the missing block and in other places. innerHTML should go away too but I'm not going to change that here.

@@ -53,6 +53,9 @@ function normalizeParsedBlocks( blocks ) {
// Recurse to normalize inner blocks
block.innerBlocks = normalizeParsedBlocks( block.innerBlocks );

// Ignore since this isn't a part of the parser output
delete block.sourceMarkup;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that's fine since we already have a dedicated snapshot for the source file, maybe we could also consider the same thing for originalContent.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this sounds great. in fact I think we can go one step further and more positively talk about the attributes we want to check instead of removing the ones we don't. I'll post an update with what I mean.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In 3a9ffa6 I've adjusted this function but left originalContent for the same reason I didn't want to introduce the sourceMarkup - it turns this 30-line PR into a 700+ line PR. I wouldn't mind doing a follow-up PR though to update all the fixtures to remove the things we don't want and where that testing-methodology change could be reviewed by itself.

@youknowriad youknowriad requested a review from mcsf February 7, 2022 07:38
@dmsnell dmsnell force-pushed the blocks/preserve-source-markup-when-invalid branch from a3b1b6c to 3e38d43 Compare February 7, 2022 22:30
Copy link
Contributor

@youknowriad youknowriad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking good overall for me despite my hesitation about the renaming.

Would love some 👀 from @mcsf here.

dmsnell added a commit that referenced this pull request Feb 8, 2022
Previously we've been asserting in our full-content snapshot tests that
the `originalContent` property doesn't change, but it's faulty and going
away (see #38460).

By removing it from the test fixtures we can relax our constraint that
it exists and doesn't change, leaving future work easier to review and focus.

This is a generated property while loading the blocks and use of it leads
to content corruption and loss on certain invalid blocks and blocks with
inner blocks.

After this change we're still veryifying the full load and save flow
for the block implementations.
dmsnell added a commit that referenced this pull request Feb 9, 2022
Previously we've been asserting in our full-content snapshot tests that
the `originalContent` property doesn't change, but it's faulty and going
away (see #38460).

By removing it from the test fixtures we can relax our constraint that
it exists and doesn't change, leaving future work easier to review and focus.

This is a generated property while loading the blocks and use of it leads
to content corruption and loss on certain invalid blocks and blocks with
inner blocks.

After this change we're still veryifying the full load and save flow
for the block implementations.
dmsnell added a commit that referenced this pull request Feb 9, 2022
Previously we've been asserting in our full-content snapshot tests that
the `originalContent` property doesn't change, but it's faulty and going
away (see #38460).

By removing it from the test fixtures we can relax our constraint that
it exists and doesn't change, leaving future work easier to review and focus.

This is a generated property while loading the blocks and use of it leads
to content corruption and loss on certain invalid blocks and blocks with
inner blocks.

After this change we're still veryifying the full load and save flow
for the block implementations.
Previously we have been re-generating a block's save content when
we fail to properly validate that block. While this can work in many
cases it breaks down quickly because of the invalidly-parsed attributes.

Consider a list block whose `<ul>` tag has been removed. The block will
fail to validate and then also fail to read the content of the list items
because the CSS selector used to source the attribute won't work. The
attribute for the list's content will be empty and `getSaveContent` will
return an empty list, this despite the fact that the contents were largely
fine in the post HTML when loaded.

In this patch we're threading a new property into the block node called
`sourceMarkup` whose purpose is to preserve the original HTML in a post
so that we can choose to preserve that on save instead of corrupting or
losing content through the HTML re-generation.

With `sourceMarkup` we get a less-processed version of the source to
relay into save which should eliminate a common point of frustration:
saving a post erases content.
@dmsnell dmsnell force-pushed the blocks/preserve-source-markup-when-invalid branch from 3a9ffa6 to 62e8bdc Compare February 9, 2022 19:42
@dmsnell
Copy link
Member Author

dmsnell commented Feb 9, 2022

Rebuilt this after the merge of #38638, but I'm still trying to totally eliminate originalContent in another branch.

@dmsnell
Copy link
Member Author

dmsnell commented Feb 9, 2022

Unfortunately as I continue to try and eliminate originalContent I keep finding more and more places where the size of the change blows up. I'm really uncomfortable making such a huge change all at once, figuring we'll certainly miss a lot of regressions. I'm not sure we wouldn't still need sourceMarkup in the end anyway since the current loading process makes changes to the original block through things like legacy autop behaviors.

Tossing it out there that it might be practical to fix the existing problem (the editor wipes out or corrupts content from unrelated changes) with the small fix and then start chiseling away at originalContent and innerHTML which take a lot more effort and care. Having both values in place at the same time also means we can piecemeal break apart those originalContent and innerHTML uses instead of doing it all at once.

Copy link
Contributor

@youknowriad youknowriad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM but I'd appreciate a second opinion here.

@dmsnell
Copy link
Member Author

dmsnell commented Feb 10, 2022

I'm going to wait for another opinion as well. Thank you @youknowriad

The more I think about this though the more I think a different interface is going to be better. Possibly that's just two properties vs. one: innerMarkup and outerMarkup. Replacing originalContent is hard and many places (like the block comparison dialog) rely on having the HTML inside the comment delimiters. This is the difference between originalContent and originalUndelimitedContent except that those values break inner blocks.

This being said, I'm still comfortable pushing this out as-is since it's so small and because it does solve one major problem at hand and if we improve it afterwards and can remove originalContent then we can replace sourceMarkup or change it without changing the behaviors this introduces/fixes.

@dmsnell
Copy link
Member Author

dmsnell commented Feb 18, 2022

@youknowriad I'm closing this in favor of #38923 because I believe it provides a better longer-term solution with its source property, containing the original rawBlock, and doesn't require that we assess the same performance questions sourceMarkup does. Also, sourceMarkup just isn't sufficient on its own to handle all the cases we need for general broken-block preservation.

@dmsnell dmsnell closed this Feb 18, 2022
@dmsnell dmsnell deleted the blocks/preserve-source-markup-when-invalid branch February 18, 2022 23:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants