Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix fixtures for MS Word, and properly handle newlines #17459

Closed
wants to merge 1 commit into from

Conversation

tfrommen
Copy link
Member

Description

This is a fix for #13497.

If you copy-paste text from MS Word, this will result in unwanted newlines in the pasted and processed content.

How has this been tested?

See #13497 (comment) and #13497 (comment).

In this PR, I also updated and expanded fixtures.

Types of changes

Update and fix (!) some fixtures, and include a new RAW paste handler.

Checklist:

  • My code is tested.
  • My code follows the WordPress code style.
  • My code follows the accessibility standards.
  • My code has proper inline documentation.
  • I've included developer documentation if appropriate.

@tfrommen tfrommen added [Type] Bug An existing feature does not function as intended [Feature] Paste [Package] Blocks /packages/blocks labels Sep 16, 2019
return;
}

node.innerHTML = node.innerHTML.replace( /(?<!<br>)\n/g, ' ' );
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we trying to strip line breaks (\n)? Why should it not be preceded by a line break element?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, yes! 🙂 This is exactly what this is about.

Please refer to the initial comment of the related issue.

This behavior, as well as the fix/handling for it, is specific to pasting content from MS Word. As you can see in the mentioned issue, MS Word copies content formatted with a maximum line-length of 80 characters. So there are newlines injected by MS Word itself. Any actual user-defined linebreak is actually a <br>, followed by a newline. And we do not want to replace that.

Does that make sense?

return;
}

if ( ! [ ...node.classList ].find( ( className ) => className.match( /^Mso[A-Z].*/ ) ) ) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think this is a unique MS Word problem? More specifically a paragraph problem? In my opinion, we should be writing a general line break (\n) collapser, as line breaks (\n) in HTML are not visible in any circumstance. The are only used to format the HTML.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are, for sure, other tools that do weird things with line breaks. However, this PR (and the related issue) is specific to pasting content from MS Word, where one gets additional line breaks that have been injected by MS Word itself.

It might be true that HTML, or rather the browser, doesn't care about newlines. But they will also end up in content.raw in the REST API response. And this is where things can break.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, there are other sources that may inject line breaks. That's why I'm suggesting to look for a more general fix. See #17470.

@tfrommen
Copy link
Member Author

With #17470 merged, closing now.

@tfrommen tfrommen closed this Nov 25, 2019
@tfrommen tfrommen deleted the stray-newlines-from-ms-word branch November 25, 2019 16:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
[Feature] Paste [Package] Blocks /packages/blocks [Type] Bug An existing feature does not function as intended
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants