Initial support for rich text sending #701

Merged
merged 21 commits into from
Oct 21, 2020

KitsuneRal
Member

@KitsuneRal KitsuneRal commented Oct 10, 2020

What started as an attempt to improve the experience after #580 grew into an attempt to add full-blown rich text support to Quaternion, if not in the UI then at least in the back end. This PR contains the result:

  • Two new functions in the HtmlFilter namespace, qtToMatrix() and matrixToQt(), that translate between the Qt and Matrix subsets of HTML, taking into account the quirks of each side (read: custom data- attributes in Matrix HTML and strong reliance on CSS in the HTML produced by QTextDocument).
  • The message entry box automatically produces actual hyperlinks when tab-completing nicknames.
  • It also accepts arbitrary rich text (as far as Qt supports it), either by drag-n-drop or by copy-paste, shows it, and sends it as a message (after converting it to proper Matrix HTML as far as possible).
  • The /html command now validates its input: before sending your HTML it filters it, sending only what the spec endorses and complaining about disallowed tags or malformed HTML.
  • /md command is available if Quaternion is built with Qt that supports Markdown (i.e., 5.14 or newer).

The main remaining thing is making the filter a bit more tolerant to mistakes and malformed pieces:

  • There's no feedback when HTML filtering cuts significant pieces from an /html entry, including the malformed HTML case, in which the filter just stops at the malformed position and returns whatever it has produced so far. Quaternion then sends this to the wire without the user ever having a chance to fix the problem (and we still don't have editing in Quaternion, so the UX is really unforgiving).
  • The code that actually transforms Matrix HTML (coming from the wire) to Qt HTML in order to display it in the timeline is not committed yet, partly for the same reason as above: as the filter uses Qt Core's XML parser (there's no reasonably complete/extensible HTML parser in Qt, aside from Qt WebEngine, that is), it is very picky about constructs it considers malformed XML. To feed HTML to it, the messages get some pre-treatment and it works pretty well; but the filter still cuts messages short in some cases, especially around unescaped > and < characters. Moar testin' needed (which is why this PR is now being submitted and will be merged relatively soon).
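To illustrate the kind of pre-treatment mentioned above, here is a minimal sketch that escapes angle brackets that do not look like part of a tag before the text is handed to a strict XML parser. It is not the code from this PR: `escapeStrayBrackets()` is a hypothetical name, it uses only the C++ standard library rather than Qt types, and the heuristic (a `<` starts a tag only if it is followed by a letter or `/` and a `>` arrives before the next `<`) is an assumption made for the example.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper, not part of Quaternion: escapes '<' and '>' that
// do not appear to delimit a tag, so that an XML reader does not choke.
std::string escapeStrayBrackets(const std::string& html)
{
    std::string out;
    for (std::string::size_type i = 0; i < html.size(); ++i) {
        const char c = html[i];
        if (c == '<') {
            // Find the next bracket of either kind after this '<'
            const auto close = html.find_first_of("<>", i + 1);
            const bool looksLikeTag =
                i + 1 < html.size()
                && (std::isalpha(static_cast<unsigned char>(html[i + 1]))
                    || html[i + 1] == '/')
                && close != std::string::npos && html[close] == '>';
            if (looksLikeTag) {
                out.append(html, i, close - i + 1); // copy the tag verbatim
                i = close;
            } else {
                out += "&lt;";
            }
        } else if (c == '>') {
            out += "&gt;"; // a '>' outside of any tag
        } else {
            out += c;
        }
    }
    return out;
}
```

With this sketch, `a < b and <b>c</b> > d` becomes `a &lt; b and <b>c</b> &gt; d`. The heuristic is deliberately simple: a `>` inside an attribute value, comments and similar constructs would need extra handling.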

There were accessors and Q_PROPERTY but no signal and NOTIFY.
void startNewCompletion()->bool initCompletion()
Cloning for history the document that was just sent retains current
formatting in the entry line. As we're moving to using rich text
by default, this leads to unintended consequences: e.g., the hyperlink
for a mention at the end of the previous message gets "stuck" for
the new message (typing new characters extends this hyperlink). This
commit makes KChatEdit create a whole new document to flush whatever
state the previous document was sent in.
Click-to-mention was leaving completion in an ongoing state, so Tab
unnaturally rotated completions in the old place even after mentioning.
Now completion is properly cancelled, and the salutations from
completion and from click-to-mention are chained (if detected).
ChatEdit now stores (and accepts) rich text; mentions and completions
are automatically linkified while editing the message. The sending
code will be updated separately; for now it still takes plain text from
ChatEdit, which means that the code to linkify user ids at sending
no longer works, and will no longer be needed once we make use of
ChatEdit::toHtml().
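
As a rough illustration of what "linkifying" a mention amounts to, the sketch below builds a matrix.to hyperlink for a user id. The function names are hypothetical and the code uses plain `std::string` instead of Qt classes; it is not taken from ChatEdit. It also assumes the user id needs no percent-encoding, which a real implementation would have to handle.

```cpp
#include <string>

// Hypothetical helper: escapes characters that are special in HTML text
// and attribute values.
std::string escapeHtmlText(const std::string& text)
{
    std::string escaped;
    for (const char c : text) {
        switch (c) {
        case '&': escaped += "&amp;"; break;
        case '<': escaped += "&lt;"; break;
        case '>': escaped += "&gt;"; break;
        case '"': escaped += "&quot;"; break;
        default: escaped += c;
        }
    }
    return escaped;
}

// Hypothetical helper: produces a mention hyperlink pointing at matrix.to.
// Assumption: userId contains no characters that require URL encoding.
std::string mentionLink(const std::string& userId,
                        const std::string& displayName)
{
    return "<a href=\"https://matrix.to/#/" + userId + "\">"
           + escapeHtmlText(displayName) + "</a>";
}
```

For example, `mentionLink("@alice:example.org", "Alice")` yields `<a href="https://matrix.to/#/@alice:example.org">Alice</a>`.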
Although Quaternion itself can only produce hyperlinks as yet, this
commit adds a full-fledged cleanup of the HTML taken from the edit box,
in the form of a sanitizeHtml() function that's going to be moved to
libQuotient further down the road. This function is based on two
assumptions:
- the cleanup is specifically made for Matrix: the list of allowed tags
  is taken from https://matrix.org/docs/spec/client_server/latest#m-room-message-msgtypes
- the HTML passed to sanitizeHtml() is _likely_ to be produced by
  the rich text handling engine of Qt, with its idiosyncrasies such as
  a full-blown preamble (DTD, default styles etc.), a top-level `<p>` tag,
  unnecessary hardcoding of hyperlink style, and so on; sanitizeHtml()
  not only removes tags and attributes that Matrix does not endorse but
  also removes those idiosyncrasies in order to produce minimal
  necessary HTML for a given formatting semantics.
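
The allow-list idea behind the first assumption can be sketched as follows. This is not the sanitizeHtml() implementation: the function names are hypothetical, only the C++ standard library is used, and the tag and attribute tables are an abbreviated excerpt of what the Matrix spec permits (the spec linked above remains authoritative).

```cpp
#include <map>
#include <set>
#include <string>

// Abbreviated, illustrative excerpt of the tags the Matrix spec permits
// in formatted message bodies.
bool isTagPermitted(const std::string& tag)
{
    static const std::set<std::string> permittedTags{
        "a", "b", "i", "u", "strong", "em", "del", "code", "pre",
        "p", "br", "blockquote", "ul", "ol", "li", "span", "font"
    };
    return permittedTags.count(tag) > 0;
}

// Attributes are permitted per tag; anything not listed is stripped.
bool isAttributePermitted(const std::string& tag,
                          const std::string& attribute)
{
    static const std::map<std::string, std::set<std::string>>
        permittedAttributes{
            { "a", { "name", "target", "href" } },
            { "font", { "data-mx-bg-color", "data-mx-color" } },
            { "span", { "data-mx-bg-color", "data-mx-color" } },
            { "ol", { "start" } },
            { "code", { "class" } }
        };
    const auto it = permittedAttributes.find(tag);
    return it != permittedAttributes.end()
           && it->second.count(attribute) > 0;
}
```

Under these tables, `<a href=...>` passes while a `style` attribute on the same tag is dropped, which is how the hardcoded hyperlink style that Qt emits would disappear.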

As yet, sanitizeHtml() doesn't know about styles and cannot convert
"font-style", "font-weight" etc. CSS attributes to HTML equivalents
accepted in Matrix. Further on this will have to be covered too,
since Qt 5.14+ uses styles, rather than HTML tags, to translate Markdown
to rich text; and copy-pasted rich text may also have such pieces.
sanitizeHtml() has to learn about styles first.
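
For illustration, converting CSS declarations to equivalent tags could look roughly like the sketch below. It is a guess at the general approach, not code from this PR: `tagsForStyle()` is a hypothetical name, the set of recognised properties is deliberately tiny, and treating a numeric font weight of 600 or more as bold is an assumption.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Strips leading and trailing spaces and tabs.
std::string trimSpaces(const std::string& s)
{
    const auto first = s.find_first_not_of(" \t");
    if (first == std::string::npos)
        return std::string();
    const auto last = s.find_last_not_of(" \t");
    return s.substr(first, last - first + 1);
}

// Hypothetical helper: maps the contents of a style="..." attribute to
// the HTML tags that express the same formatting.
std::vector<std::string> tagsForStyle(const std::string& style)
{
    std::vector<std::string> tags;
    std::istringstream declarations(style);
    std::string declaration;
    while (std::getline(declarations, declaration, ';')) {
        const auto colon = declaration.find(':');
        if (colon == std::string::npos)
            continue; // not a "property: value" pair
        const auto property = trimSpaces(declaration.substr(0, colon));
        const auto value = trimSpaces(declaration.substr(colon + 1));
        if (property == "font-weight") {
            const bool numeric = !value.empty() && value.size() <= 4
                && value.find_first_not_of("0123456789") == std::string::npos;
            // Assumption: "bold" or a weight of 600+ means bold
            if (value == "bold" || (numeric && std::stoi(value) >= 600))
                tags.push_back("b");
        } else if (property == "font-style" && value == "italic") {
            tags.push_back("i");
        } else if (property == "text-decoration") {
            if (value.find("underline") != std::string::npos)
                tags.push_back("u");
            if (value.find("line-through") != std::string::npos)
                tags.push_back("del");
        }
    }
    return tags;
}
```

With this sketch, `font-weight:600; font-style:italic;` maps to `b` and `i`, while unrelated declarations such as `color:#ff0000` produce no tags.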
It doesn't (and shouldn't) depend on anything from the model.
HtmlFilter now converts between Matrix and Qt subsets, recognising
Qt's way of encoding most of Markdown as well. This class will
eventually be moved either to Quotient or, if there is demand for it,
to its own micro-library.

This commit introduces one regression that will be fixed separately:
Quaternion no longer linkifies plain-text URLs even in its own messages.
@KitsuneRal KitsuneRal added the enhancement A feature or change request for Quaternion label Oct 10, 2020
* The filtering functions are now in namespace HtmlFilter.
* HF::matrixToQt() now returns error details next to the processed HTML
  in a dedicated structure; ChatRoomWidget now has an example of using
  these details to provide user feedback where relevant.
* HF::matrixToQt() can also work in two parsing modes: Tolerant and
  Validating. Tolerant mode is aimed at payloads coming from the wire,
  skipping unrecognised tags; Validating mode is used to process user
  input and stops filtering at the first tag disallowed in Matrix
  (but still silently strips non-compliant attributes).
* HF::matrixToQt() now also fixes stray left and right angle brackets
  that would otherwise upset the XML reader.
* Streamlined tag filtering code a bit.
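
The difference between the two parsing modes, and the idea of returning error details next to the result, can be sketched like this. Everything here is hypothetical (names, the result structure, the shortened allow-list), and the input is a pre-tokenised list of tag names rather than real HTML, so this only models the control flow described above, not the actual HtmlFilter code.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

enum class Mode { Tolerant, Validating };

// Hypothetical result structure: the filtered output plus error details.
struct FilterResult {
    std::vector<std::string> keptTags;
    std::string errorString; // empty if nothing went wrong
    int errorIndex = -1;     // index of the offending tag, or -1
};

FilterResult filterTags(const std::vector<std::string>& tags, Mode mode)
{
    // Shortened allow-list, for illustration only
    static const std::set<std::string> permitted{
        "a", "b", "i", "p", "code", "blockquote"
    };
    FilterResult result;
    for (std::size_t i = 0; i < tags.size(); ++i) {
        if (permitted.count(tags[i]) > 0) {
            result.keptTags.push_back(tags[i]);
            continue;
        }
        if (mode == Mode::Validating) {
            // User input: stop at the first disallowed tag and report it
            result.errorString = "Disallowed tag: " + tags[i];
            result.errorIndex = static_cast<int>(i);
            break;
        }
        // Tolerant (payload from the wire): silently skip the tag
    }
    return result;
}
```

A caller handling user input would check `errorString` and show it to the user instead of sending a truncated message.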
@KitsuneRal KitsuneRal force-pushed the kitsune-better-mentions branch from fe61b4a to bd8feb1 Compare October 20, 2020 18:35
Quaternion now officially enforces the HTML subset recommended
by the spec, and falls back to plain text if the incoming message
doesn't follow this subset.
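
A minimal sketch of that fallback, under the assumption that the compliance check has already been made elsewhere and that the plain-text body must be HTML-escaped before it is shown in a rich-text view. The function names are hypothetical and not part of Quaternion.

```cpp
#include <string>

// Hypothetical helper: makes plain text safe to show in a rich-text view.
std::string escapePlainText(const std::string& plain)
{
    std::string escaped;
    for (const char c : plain) {
        switch (c) {
        case '&': escaped += "&amp;"; break;
        case '<': escaped += "&lt;"; break;
        case '>': escaped += "&gt;"; break;
        default: escaped += c;
        }
    }
    return escaped;
}

// Hypothetical helper: picks what to display for an incoming message.
// htmlIsCompliant is assumed to come from an earlier filtering step.
std::string displayHtml(const std::string& plainBody,
                        const std::string& filteredHtml,
                        bool htmlIsCompliant)
{
    return htmlIsCompliant ? filteredHtml : escapePlainText(plainBody);
}
```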
@KitsuneRal KitsuneRal force-pushed the kitsune-better-mentions branch from 6e8c8cf to 68a70df Compare October 21, 2020 08:19
@KitsuneRal KitsuneRal force-pushed the kitsune-better-mentions branch from 68a70df to 20be537 Compare October 21, 2020 08:20
@lgtm-com
lgtm-com bot commented Oct 21, 2020

This pull request introduces 2 alerts when merging 20be537 into 62a0733 - view on LGTM.com

new alerts:

  • 2 for FIXME comment

@lgtm-com
lgtm-com bot commented Oct 21, 2020

This pull request introduces 2 alerts when merging 5408b74 into 62a0733 - view on LGTM.com

new alerts:

  • 2 for FIXME comment

@KitsuneRal KitsuneRal merged commit bff3875 into master Oct 21, 2020
@KitsuneRal KitsuneRal deleted the kitsune-better-mentions branch October 21, 2020 18:09
@KitsuneRal KitsuneRal mentioned this pull request Mar 12, 2021
@KitsuneRal KitsuneRal linked an issue Mar 12, 2021 that may be closed by this pull request
Labels
enhancement A feature or change request for Quaternion
Projects
Status: Version 0.0.95 - Done
Development

Successfully merging this pull request may close these issues.

Parse markup on input
1 participant