Operation on text randomly merges words #11588

Closed
Siguardius opened this issue Nov 7, 2018 · 29 comments · Fixed by #12166
Assignees
Labels
[Feature] Paste [Feature] Rich Text Related to the Rich Text component that allows developers to render a contenteditable [Type] Bug An existing feature does not function as intended

Comments

@Siguardius

Describe the bug
When interacting (bolding, inserting links) with a text block that contains pasted content with prior formatting, the block randomly merges a few words in the text.

To Reproduce
Steps to reproduce the behavior:

  1. Copy and paste text into the text block directly from a word processor (LibreOffice in my case)
  2. Bold a word
  3. Enjoy the bug

Expected behavior
Words in the text block should not be merged. (Actual behavior: a few words in the text block are merged.)

Screenshots
BEFORE:
[Screenshot 2018-11-07 at 15:16:23]
AFTER:
[Screenshot 2018-11-07 at 15:16:31]

Desktop (please complete the following information):

  • OS: macOS
  • Browser: Chrome
  • Version: 70.0.3538.77

Additional context

  • Present in latest Gutenberg release
  • Avoidable by pasting and matching the style
@desrosj desrosj added the [Type] Bug An existing feature does not function as intended label Nov 7, 2018
@ellatrix
Member

ellatrix commented Nov 8, 2018

Hi @Siguardius! I cannot reproduce the problem, so we'll need some more information. Could you share what was logged in the browser console when you pasted the text from LibreOffice?

@Siguardius
Author

@iseulde This is everything that was in the console.
www.pixelophobia.pl-1541686251826.log

@designsimply
Member

designsimply commented Nov 8, 2018

Noting an additional report of this from a duplicate at #11626 and their testing steps for reference:

  1. Paste text containing soft line breaks (from Word or from a PDF)
  2. Edit the pasted text
  3. All instances of soft line breaks are now deleted

@designsimply designsimply added the Needs Testing Needs further testing to be confirmed. label Nov 8, 2018
@swissspidy
Member

I just noticed the same when copying content which was on separate lines because each line was actually a <div>. So the HTML looks a bit like this:

<div>Some Text <a href="https://example.com"><u>Some link</u></a></div><div>Some Text <a href="https://example.com"><u>Some link</u></a><br /></div><div>Some Text <a href="https://example.com"><u>Some link</u></a><br /></div><div>Some Text <a href="https://example.com"><u>Some link</u></a><br /></div><div>Some Text <a href="https://example.com"><u>Some link</u></a><br /></div>

The "Received plain text:" part in the console properly shows the content on individual lines.

The "Processed HTML piece:" section shows almost identical HTML, but without the divs, which leads to the pasted content ending up on a single line.

@Haldaug

Haldaug commented Nov 12, 2018

Is it possible that the culprit of this bug could be these changes? #10019

@Haldaug

Haldaug commented Nov 12, 2018

This bug can also be reproduced by manually inserting soft line breaks with "shift+enter" when you are editing paragraphs as HTML:

gutenberg-line-break-bug

@Haldaug

Haldaug commented Nov 13, 2018

I found a temporary fix for this. I added this code after line 218 in the raw-handling code:

piece = piece.replace( /\n/g, ' ' );

It replaces all new lines with spaces. This is probably not ideal, but at least it lets you paste content from Word without randomly merging words again.
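A minimal sketch of what this workaround does, applied to a shortened version of the pasted HTML seen in this thread. The function name `newlinesToSpaces` and the sample string are hypothetical and only for illustration; only the `replace` call comes from the comment above.

```javascript
// Hypothetical wrapper around the one-line workaround; not a Gutenberg API.
function newlinesToSpaces( piece ) {
	// Replace every newline with a space so neighbouring words stay separated.
	return piece.replace( /\n/g, ' ' );
}

// Shortened sample in the shape of a "Processed HTML piece".
const piece = '<p>But\nwe <em>should</em>\nknow</p>';
console.log( newlinesToSpaces( piece ) );
// <p>But we <em>should</em> know</p>
```

Note that this can leave doubled or leading spaces behind, which is why it is probably not ideal as a permanent fix.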

@ellatrix
Member

Yes, we'll have to handle this in the raw handling and/or when creating rich text values, maybe along with other non-meaningful whitespace. I think what should happen is stripping leading and trailing non-meaningful whitespace entirely, and collapsing multiple non-meaningful whitespace characters into a single one.

@ellatrix
Member

See also https://medium.com/@patrickbrosset/when-does-white-space-matter-in-html-b90e8a7cdd33.

What we have to do in rich text values seems to be: replace [\n\t] with spaces, then remove trailing and leading spaces, and reduce any other recurring spaces to one.
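The three steps described above could be sketched roughly as follows. This is an illustration on a plain string only, not the fix that shipped; the function name `collapseWhitespace` is hypothetical, and a real rich text implementation would also have to account for element boundaries and preformatted content.

```javascript
// Hypothetical helper; a sketch of the three steps, not the shipped fix.
function collapseWhitespace( text ) {
	return text
		.replace( /[\n\t]/g, ' ' ) // 1. newlines and tabs become spaces
		.replace( /^ +| +$/g, '' ) // 2. strip leading and trailing spaces
		.replace( / {2,}/g, ' ' ); // 3. collapse runs of spaces into one
}

console.log( JSON.stringify( collapseWhitespace( '\nBut\nwe \t should\n' ) ) );
// "But we should"
```

Doing the replacement first matters: once newlines and tabs are spaces, the trimming and collapsing steps only have to deal with one kind of character.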

@ellatrix ellatrix self-assigned this Nov 15, 2018
@ellatrix ellatrix added the [Feature] Rich Text Related to the Rich Text component that allows developers to render a contenteditable label Nov 15, 2018
@ellatrix ellatrix added this to the WordPress 5.0 milestone Nov 15, 2018
@designsimply
Member

Tested and confirmed using WordPress 4.9.8 and Gutenberg 4.3 by copying content from a word processing app to Gutenberg. I noticed the missing spaces match up with the line breaks in the Received HTML after pasting content:

screen shot 2018-11-14 at 2 32 14 pm

LibreOffice document I copied from:

Tested with LibreOffice 6.0.6.2 on macOS 10.13.6.

WordPress post I pasted into:

Seen at http://alittletestblog.com/wp-admin/post.php?post=15818&action=edit running WordPress 4.9.8 and Gutenberg 4.3 using Firefox 63.0.1 on macOS 10.13.6.

Console output for reference:

Received HTML:

 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<html>
<head>
	<meta http-equiv="content-type" content="text/html; charset=utf-8"/>
	<title></title>
	<meta name="generator" content="LibreOffice 6.0.6.2 (MacOSX)"/>
	<style type="text/css">
		@page { margin: 0.79in }
		p { margin-bottom: 0.1in; direction: ltr; font-size: 11pt; line-height: 115%; text-align: left; orphans: 2; widows: 2 }
		a:link { color: #0000ff }
	</style>
</head>
<body lang="en-US" link="#0000ff" dir="ltr">
<p style="margin-bottom: 0in; font-weight: normal; line-height: 100%; page-break-after: avoid">
<font face="Arial, serif"><font size="3" style="font-size: 12pt"><span lang="en-CA">But
we </span></font></font><font face="Arial, serif"><font size="3" style="font-size: 12pt"><span lang="en-CA"><i>should</i></span></font></font><font face="Arial, serif"><font size="3" style="font-size: 12pt"><span lang="en-CA"><span style="font-style: normal">
know that mythology uses contradictory, or at least inconsistent,
versions of the same story, to express alternative perspectives in a
something else blah blah process, rather than “a lie”.</span></span></font></font></p>
<p style="margin-bottom: 0in; font-weight: normal; line-height: 100%">
<br/>

</p>
<p style="margin-bottom: 0in; font-weight: normal; line-height: 100%">
<font face="Arial, serif"><font size="3" style="font-size: 12pt"><span lang="en-CA"><span style="font-style: normal">The
importance of the death of Zeus is that the story emerges exactly from
that point in time and cultural transformation in which Zeus is also
born and at that time it was familiar for a vegetative god,
representing nature blooming in spring and dying in autumn, to die
and be re-born within the immortality of the eternal round of the
year or yearly daemon.</span></span></font></font></p>
</body>
</html>

Received plain text:

 But we should know that mythology uses contradictory, or at least inconsistent, versions of the same story, to express alternative perspectives in a something else blah blah process, rather than “a lie”.

The importance of the death of Zeus is that the story emerges exactly from that point in time and cultural transformation in which Zeus is also born and at that time it was familiar for a vegetative god, representing nature blooming in spring and dying in autumn, to die and be re-born within the immortality of the eternal round of the year or yearly daemon. index.js:50:160096

Processed HTML piece:
 <p>
But
we <em>should</em>
know that mythology uses contradictory, or at least inconsistent,
versions of the same story, to express alternative perspectives in a
something else blah blah process, rather than “a lie”.</p><p>
The
importance of the death of Zeus is that the story emerges exactly from
that point in time and cultural transformation in which Zeus is also
born and at that time it was familiar for a vegetative god,
representing nature blooming in spring and dying in autumn, to die
and be re-born within the immortality of the eternal round of the
year or yearly daemon.</p>
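
To check the observation that the missing spaces line up with line breaks, a small diagnostic like the one below can list the word pairs that are separated only by a newline in a processed piece. This is a hypothetical helper for illustration, not part of Gutenberg, and the tag stripping is a crude regular expression that is only adequate for simple samples like this one.

```javascript
// Hypothetical diagnostic: find word pairs separated only by a newline.
function findNewlineOnlyBoundaries( html ) {
	// Crude tag removal, good enough for this illustration only.
	const text = html.replace( /<[^>]+>/g, '' );
	const boundaries = [];
	// A word, a single newline, then (via lookahead) the next word.
	const pattern = /(\S+)\n(?=(\S+))/g;
	let match;
	while ( ( match = pattern.exec( text ) ) !== null ) {
		boundaries.push( `${ match[ 1 ] } | ${ match[ 2 ] }` );
	}
	return boundaries;
}

const processed = '<p>\nBut\nwe <em>should</em>\nknow that mythology</p>';
console.log( findNewlineOnlyBoundaries( processed ) );
// [ 'But | we', 'should | know' ]
```

Each reported pair is a place where the words would run together if the newline were dropped instead of being treated as a space.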

@ElYi

ElYi commented Nov 19, 2018

EDIT: As mentioned in other threads, this seems to happen with the desktop version of Word and not the browser-based version. EDIT 2: Going back into the article after publishing destroys it again, with spaces removed, when doing it this way.
This also happens with Word. I am using the latest version in Office 365. It's infuriating. It happens in the first paragraph when I paste over a whole article, or in any block that I subsequently click on. Adding links does it too. Horrible bug. It must affect so many people, and there are so many similar reports that keep getting closed before they're fixed.
image
An authoring tool is software that creates an eLearning lesson or course. Once created, the course is usually packaged in a software wrapper before being distributed to clients who then distribute the course via a Learning Management System. They are often very expensive and usually require some training and expertise to use. This, in turn, means they are not particularly flexible or agile and that any courseware created with one is often tricky to update. The nature of the eLearning industry throws other spanners into the works, too: company training courseware is often produced on a one-size-fits-all basis with multiple companies subscribing to it. As such it tends to be rather boring as all modern interactive features must get stripped away to maximise compatibility and market size. What’s typically left is a boring, clunky, unengaging lesson that’s delivered via a bloated SCORM file. It’s not very effective. Fortunately, whether you want an altogether new way of creating courseware or simply a way to enhance what you’ve already got, microlearning can considerably improve matters as the following authoring tool examples demonstrate.

@ellatrix
Member

I'll work on a fix asap.

@Haldaug

Haldaug commented Nov 19, 2018

FYI, my fix from here still works: #11588 (comment)

@ellatrix
Member

Could anyone test #12093?

@sae13

sae13 commented Nov 24, 2018

Could anyone test #12093?

How can I test that?

http://j.mp/2SdQ0Tv

@designsimply
Member

Closed #12293 as a duplicate.

@designsimply
Member

I noticed that #12093 was superseded by #12166. I tested #12166 and left a comment with my findings (#12166 (comment)).

@sae13 the main way to test pull requests would be to install a local development environment as outlined in https://github.com/WordPress/gutenberg/blob/master/CONTRIBUTING.md#getting-started and then use git to checkout the branch for the pull request that needs testing. Setting up a local development environment wouldn't be expected unless you are contributing code or very deep into testing though. 🙂

I also found another way to test but you need to be logged in to GitHub and know the URL (I think). Go to https://github.com/WordPress/gutenberg/tree/try/rich-text-newlines-to-space and click the "Clone or download" button. That will give you a version of the plugin running the try/rich-text-newlines-to-space branch and you can delete Gutenberg and install+activate the downloaded one in order to test.

@gooma2

gooma2 commented Dec 7, 2018

Glad this is being worked on, as it does this in every sentence pasted in. If you copy/paste to WordPad first then it works fine, but having to do 2 steps for what was only 1 step before in Classic Mode seems counterproductive.

@ellatrix
Member

ellatrix commented Dec 7, 2018

The fix will be shipped in WordPress 5.0.1.

@sae13

sae13 commented Dec 8, 2018

now button colors not working 🤣

@designsimply designsimply removed the [Status] In Progress Tracking issues with work in progress label Dec 12, 2018
@MightyGadget

The fix will be shipped in WordPress 5.0.1.

Just to confirm that WordPress 5.0.1 has not fixed this. Copying from Word 2016 still has the problem.

@gooma2

gooma2 commented Dec 13, 2018 via email

@swissspidy
Member

@MightyGadget Since 5.0.1 is a security release only, it will be fixed in 5.0.2 now. ETA is next week.

@MatildaMarseillaise

I thought it was just me! I have upgraded to 5.0.1 and am finding that copying from Word into a WordPress draft article is also merging words.

Also, if I am scheduling a post rather than posting it immediately it doesn't seem to be doing the automatic spell check (which I would hope would pick up these merged words). Any idea where the spellcheck function is now?

@gooma2

gooma2 commented Dec 14, 2018

I had that issue when updating to 5.0 but after 5.0.1 I was able to copy paste from Word into WP. Now it just won't let me edit older articles so back to Classic Editor. It'll get there.

@kylelarkin

I can also confirm that this bug still exists in WordPress 5.0.1 with documents pasted from Word 2016. Copying text from a Word File into a Google doc and then into Gutenberg somehow bypasses the issue.

@jersnav

jersnav commented Dec 19, 2018

It looks like some of my existing posts created before 5.0 are impacted by the missing space issue when the posts are converted to blocks. So the posts are correct until I go to edit the old posts and select "convert to blocks." At that point it seems like a few spaces in between words per paragraph disappear.

@ChapeauDigital

I can also confirm that I get regular complaints from my customers that text pasted from Word is losing the spaces between words.

@designsimply
Member

This issue should have been fixed in WordPress version 5.0.2, released on December 19, 2018. If you are still having trouble after updating to WordPress 5.0.2 or later, please leave a new comment there noting your WordPress version and any relevant details you think may be helpful (e.g. you are using a very old version of Word). Thank you!
