Implement LaTeX formula render #5073

Closed
wants to merge 9 commits into from

Conversation

shun2wang
Contributor

@shun2wang shun2wang commented Apr 25, 2023

Implement LaTeX support for JASP results.

@JorisGoosen
Contributor

> Export PDF now needs fixes...

Really? I'd have expected it to be automatic, as it should just print whatever it shows.

@shun2wang
Contributor Author

shun2wang commented Apr 25, 2023

Yeah, right now it exports a broken PDF: in fact it exports an HTML file with a .pdf suffix...

Can you help, @JorisGoosen? An additional question: why don't we use Qt WebEngine's built-in HTML-to-PDF function, which can export everything the browser can render? Are there additional considerations here?
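
A quick way to confirm the "HTML file with a .pdf suffix" symptom is to look at the first bytes of the exported file, since PDF files normally begin with the signature `%PDF-`. The sketch below is only an illustration: `looksLikePdf` is a hypothetical helper, not part of JASP, and the two sample inputs are made up.

```javascript
// Hypothetical helper (not JASP code): true when the bytes start with "%PDF-".
function looksLikePdf(bytes) {
  const signature = [0x25, 0x50, 0x44, 0x46, 0x2d]; // "%PDF-"
  if (bytes.length < signature.length) return false;
  return signature.every((byte, i) => bytes[i] === byte);
}

// Made-up sample inputs; TextEncoder is a global in browsers and in Node.js.
const encoder = new TextEncoder();
const realPdf = encoder.encode("%PDF-1.7\n...");
const htmlInDisguise = encoder.encode("<!DOCTYPE HTML><html>...</html>");

console.log(looksLikePdf(realPdf));        // true
console.log(looksLikePdf(htmlInDisguise)); // false
```

Running such a check on the file produced by a Windows build would show whether the export step wrote HTML or the file was merely mislabeled afterwards.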

@Kucharssim
Member

Kucharssim commented Apr 25, 2023

Additional considerations to keep in mind:

  • Check what we get if we do "Copy to clipboard".
  • Check how LaTeX plays along with jaspTools: would it break tests? How is it rendered in the R HTML viewer? (For the moment I don't think it's critical that we can render LaTeX in the output there, but it may become so if we decide that showing the results in the R HTML viewer is something we want for the syntax mode.)
  • Accessibility: would the equations work with screen readers, etc.? I think we can get the MathML output from KaTeX, right @shun2wang?

@shun2wang
Contributor Author

shun2wang commented Apr 25, 2023

For points 1 and 3: we now get both MathML and HTML in the results page (and on the clipboard), which is set via the KaTeX config option output: "htmlAndMathml".

For point 2: KaTeX renders the HTML element that we add in jaspHtml, which just passes strings along, so it will not break any tests (this works fine, at least in the use case I tried). We cannot render it in the RStudio Viewer at the moment, because rendering is done by Qt WebEngine, but if we want to we can use KaTeX through its R binding package, or TinyTeX.
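
For readers unfamiliar with the option mentioned above, here is a minimal sketch of a KaTeX options object. Only `output: "htmlAndMathml"` comes from this thread; the values of `displayMode` and `throwOnError` are assumptions chosen for illustration, and the commented-out call assumes KaTeX is loaded on the page as `katex`.

```javascript
// KaTeX's `output` option accepts "html", "mathml" or "htmlAndMathml".
const katexOptions = {
  output: "htmlAndMathml", // visual HTML plus MathML (the setting named in this thread)
  displayMode: false,      // assumption: inline math; true gives display-style math
  throwOnError: false,     // assumption: show invalid input in an error color instead of throwing
};

// Assumed usage, with KaTeX available as `katex`:
//   const html = katex.renderToString("\\mathcal{M}_1", katexOptions);
//   element.innerHTML = html;
```

With `htmlAndMathml`, the MathML copy is what assistive technology and the clipboard can pick up, while the HTML copy is what is drawn on screen.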

@shun2wang shun2wang force-pushed the formulaDev branch 3 times, most recently from d39ff6c to 4e6e842 Compare May 19, 2023 09:34
@shun2wang shun2wang marked this pull request as ready for review May 19, 2023 09:35
@shun2wang
Contributor Author

shun2wang commented May 19, 2023

Sorry that this PR has been stuck for a while (I blame it on my addiction to Zelda: Tears of the Kingdom these days 😂)

I think it's now ready for review. The following known issues will be finalized at a later stage, after your feedback.

  • Add docs for developers. I guess this should wait until some developers have tried it out and given feedback.
  • PDF export is still broken and I cannot fix it... maybe Joris can help.
  • Exported math expressions currently always display as block. In fact we need to clean up the resulting HTML (https://github.com/jasp-stats/INTERNAL-jasp/issues/12) and some JS code needs to be refactored; I'll leave that until I have time for it.

@JorisGoosen
Contributor

JorisGoosen commented Jun 1, 2023

> Yeah, right now it exports a broken PDF: in fact it exports an HTML file with a .pdf suffix...
>
> Can you help, @JorisGoosen? An additional question: why don't we use Qt WebEngine's built-in HTML-to-PDF function, which can export everything the browser can render? Are there additional considerations here?

We are actually using the same WebEngine functions for this:

m_view->printToPdf(m_outputPath);
->
resultsView.printToPdf(pdfPath);

connect(m_view.data(), &QWebEngineView::pdfPrintingFinished, this, &Html2PdfConverter::pdfPrintingFinished);
->
QMetaObject::Connection printConnection = QObject::connect(ResultsJsInterface::singleton(), &ResultsJsInterface::pdfPrintingFinished, [&](QString pdfPath)

(QWebEngineView documentation: https://doc.qt.io/qt-6/qwebengineview.html)

I can't seem to get LaTeX to render though:
image

@shun2wang
Contributor Author

shun2wang commented Jun 1, 2023

> I can't seem to get LaTeX to render though:

Sorry, after my discussion with Simon I changed the delimiters from $...$ and $$...$$ to \\(...\\) and \\[...\\], to distinguish them from operators in R. So you can try changing them in the R code (I forgot to update the test modules).
EDIT: once this is merged we will also have a function in jaspBase, by @Kucharssim, that adds these delimiters.
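
To make the delimiter change concrete, here is a small sketch. The `delimiters` array has the shape that KaTeX's auto-render extension accepts; `wrapMath` and `findMathSpans` are hypothetical helpers written for this illustration and are not the jaspBase function or the JASP rendering code.

```javascript
// Delimiter list in the shape accepted by KaTeX's auto-render extension.
const delimiters = [
  { left: "\\[", right: "\\]", display: true },  // display math: \[ ... \]
  { left: "\\(", right: "\\)", display: false }, // inline math:  \( ... \)
];

// Hypothetical helper: wrap a LaTeX string in the inline or display delimiters.
function wrapMath(tex, inline = true) {
  const d = delimiters.find((x) => x.display === !inline);
  return d.left + tex + d.right;
}

// Hypothetical scanner: list the math spans found in a piece of text.
function findMathSpans(text) {
  const pattern = /\\\((.+?)\\\)|\\\[(.+?)\\\]/gs;
  const spans = [];
  for (const m of text.matchAll(pattern)) {
    spans.push({ tex: m[1] !== undefined ? m[1] : m[2], display: m[1] === undefined });
  }
  return spans;
}

const text = "Costs $5; under " + wrapMath("\\mathcal{M}_1") + " we have " + wrapMath("p(y)", false);
console.log(findMathSpans(text).length); // 2
```

Note that the dollar sign in "Costs $5" is left alone, which is one practical benefit of moving away from $...$ delimiters.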

@JorisGoosen
Contributor

Also, PDF export seems to work fine on my system?

latexbaby.pdf

@JorisGoosen
Contributor

> I can't seem to get LaTeX to render though:
>
> Sorry, after my discussion with Simon I changed the delimiters from $...$ and $$...$$ to \\(...\\) and \\[...\\], to distinguish them from operators in R. So you can try changing them in the R code (I forgot to update the test modules). EDIT: once this is merged we will also have a function in jaspBase, by @Kucharssim, that adds these delimiters.

OK, I'll fix the test module.

@shun2wang
Contributor Author

shun2wang commented Jun 1, 2023

> Also, PDF export seems to work fine on my system?
>
> latexbaby.pdf

Ah yes, but then why do the builds on my two Windows PCs just produce a .pdf file that is in fact an HTML file with that suffix... It would be good if it works, because I never found the reason.

@JorisGoosen
Contributor

I'll try it on Windows too then; this was macOS.

@JorisGoosen
Contributor

Well, I don't know what happened on your system, I'm afraid, but this is what I get on Windows:
debug.pdf

@JorisGoosen
Contributor

@Kucharssim so you think the \\[ \\] \\( \\) delimiters won't collide with anything in R code, right?

Do you want to play some more with the LaTeX before approving this?

@JorisGoosen
Contributor

Also, this PR doesn't implement LaTeX for the help files; we do want that, right?

@shun2wang
Contributor Author

> Also, this PR doesn't implement LaTeX for the help files; we do want that, right?

That has already been implemented in my dev branch, and I'll open a PR for it, with some minor adjustments, after this one is merged :-)

@Kucharssim
Member

> @Kucharssim so you think the \[ \] \( \) delimiters won't collide with anything in R code, right?

No, but we'll also have jasp-stats/jaspBase#127, so we would avoid that altogether.

> Do you want to play some more with the LaTeX before approving this?

Yes, I want to play with it before this is merged. Unfortunately I am running out of time this week; I will try to check it out next week.

Member

@Kucharssim Kucharssim left a comment

Looks very nice!

I found some corner cases though, mainly with copy pasting the output. To reproduce, use this code in a test module:

# Paste into the analysis function of a test module; `jaspResults` is the results object JASP passes in.
jaspResults[["html"]] <- createJaspHtml(title = gettextf("Formula for %s", jaspBase::mathExpression(r"{\mathcal{M}_{1}}")))
jaspResults[["html"]]$text <- gettextf(
  "Under %s, we have %s where %s",
  jaspBase::mathExpression(r"{\mathcal{M}_1}"),
  jaspBase::mathExpression(r"{p(\theta \mid y) = \frac{p(\theta) \times p(y \mid \theta)}{p(y)},}", inline = FALSE),
  jaspBase::mathExpression(r"{p(y) = \int p(\theta) \times p(y \mid \theta) d\theta.}", inline = FALSE)
)

# Table with math in the title, the column titles, the cells and the footnote.
jaspResults[["table"]] <- createJaspTable(title = gettextf("Summary for %s", jaspBase::mathExpression(r"{\theta}")))
jaspResults[["table"]]$addColumnInfo(name = "parameter", title = gettext("Parameter"))
jaspResults[["table"]]$addColumnInfo(name = "estimate", title = jaspBase::mathExpression(r"{\mathop{\mathbb{E}}(\theta \mid y)}"), type = "number")
jaspResults[["table"]]$addColumnInfo(name = "variance", title = jaspBase::mathExpression(r"{\mathop{\mathbb{VAR}}(\theta \mid y)}"), type = "number")

jaspResults[["table"]]$setData(
  list(
    parameter = jaspBase::mathExpression(c("\\alpha", "\\beta", "\\gamma", "\\delta")),
    estimate = rnorm(4),
    variance = rexp(4, 1)
  )
)
jaspResults[["table"]]$addFootnote(
  message = gettextf(
    "Posterior expectation for parameter %s is computed as %s",
    jaspBase::mathExpression("\\theta"),
    jaspBase::mathExpression(r"{\bar{\theta} = \frac{\sum \theta_i}{n}}")
  ),
  symbol = jaspBase::mathExpression("\\ast"),
  colNames = "estimate"
)

In JASP, it looks like this (really nice!!!):

image

Issues

  1. Using a subscript in a jaspObject title (e.g., \mathcal{M}_{1}) renders it as a superscript instead:
    Screenshot 2023-06-05 at 16 49 32

  2. Using a math formula in a jaspObject title and then exporting the results to a PDF puts each symbol on a new line:
    image

  3. Results -> Copy -> paste into Google Docs seems to have an issue with copying the table, and the indentation is not correct (display equations are not on a separate line):
    image

  4. Copy jaspHtml object -> paste into a Google Doc copies the LaTeX code, not the symbols:
    image

  5. On the other hand, copy jaspHtml object -> paste into Overleaf copies the correct LaTeX code, but wrapped up in HTML tags, so one has to remove those manually to get the correct output:
    image

  6. Copy table -> paste into Google Docs renders it incorrectly (same as in point 3).

  7. Copy LaTeX of table -> paste into Overleaf gets the correct LaTeX code, but again wrapped up inside HTML code.

Regarding 1 and 2, I'm not sure how important this is, but if we can fix it that would be great.
Regarding 3 and 6, we probably cannot expect to be able to copy every LaTeX symbol into a word processor correctly... But some symbols can be copied from a jaspHtml object and not from a jaspTable, so perhaps there is something wrong there.
Regarding 4, 5, and 7, that's related to #5081: merging this PR would solve 7, but to fix both 4 and 5 we would probably have to implement a "copy to LaTeX" option for jaspHtml as well.

Except for point 1, these issues only affect exporting results from JASP; they do not affect the UX in JASP itself. Considering that that issue is relatively minor, I think we could already merge this and work on the other issues elsewhere, unless @shun2wang wants to polish it here now. @JorisGoosen any comments?

@Kucharssim
Member

Ah, it would also be nice to allow editing a formula that the user wrote in a note by double-clicking it. Currently you can only delete it and write it again. But again, that's just a minor thing.

@JorisGoosen
Contributor

I think the Google Docs thing is because we use a specific KaTeX font to render the formulas, right?
Those fonts probably aren't available within a doc, and thus it fails.

Is there a way to get a render from katex.js? As in, a PNG or something?

@Kucharssim
Member

> I think the Google Docs thing is because we use a specific KaTeX font to render the formulas, right?
> Those fonts probably aren't available within a doc, and thus it fails.

That may be the case for some of the symbols, but some of the same symbols can be copied from a jaspHtml to a Google Doc and not from a jaspTable to a Google Doc, so perhaps something else is going on there as well?

@shun2wang
Contributor Author

shun2wang commented Jun 5, 2023

Hi @Kucharssim, what browser/version are you using to look at the exported HTML? For the Chromium (version 108 or older) used by JASP's Qt WebEngine I took care to use both KaTeX's MathML and HTML output, which I did not do when exporting... So the superscript problem may come from MathML issues. The advantage of using MathML is that you can paste it directly into MS Word and get beautiful formula rendering.

Thanks for pointing these out. KaTeX does have issues with confusing subscripts and superscripts in some cases; I thought I had fixed this one, so I will revisit it 😀

> it would be nice to allow editing a formula that the user wrote in a note by double-clicking it.

I will keep it in mind; in fact there may well be a good solution here, but not now :-)

> I think the Google Docs thing is because we use a specific KaTeX font to render the formulas, right?

Yeah, it may be a font issue; to be honest I haven't tested Google Docs.

> Is there a way to get a render from katex.js? As in, a PNG or something?

Well, I think not: KaTeX does not have that, but MathJax does...

@JorisGoosen
Contributor

> I think the Google Docs thing is because we use a specific KaTeX font to render the formulas, right?
> Those fonts probably aren't available within a doc, and thus it fails.
>
> That may be the case for some of the symbols, but some of the same symbols can be copied from a jaspHtml to a Google Doc and not from a jaspTable to a Google Doc, so perhaps something else is going on there as well?

Good point, I'll have a look.

@Kucharssim
Member

> Hi @Kucharssim, what browser/version are you using to look at the exported HTML?

Exporting to HTML looks nice in Chrome, Safari, and even Edge (run from a VM), so there does not seem to be a problem there. The subscript/superscript confusion only appears in JASP (where we use some Chromium-based browser, I assume) and when I export to a PDF, and only if it's in the title of a jaspObject; the subscript is rendered correctly elsewhere.

> The advantage of using MathML is that you can paste it directly into MS Word and get beautiful formula rendering.

Yeah, I thought that should be the case as well... I did not try MS Word (I don't have it), hence I tried it in Google Docs.

@shun2wang
Contributor Author

Considering that Chrome/Chromium's support for MathML is not very complete up to version 108, and the latest version of Chromium we use in Qt is 108 (actually I guess your build is Qt 6.4 (Chromium 102, see here)), there may be a bit of a surprise. But it's strange, because when MathML and HTML are output at the same time KaTeX only uses the MathML as an accessibility aid instead of displaying it, so I guess something else may be going on in jaspObject.

I'll revisit it anyway. Leave this PR open, or maybe merge it for now; I'll improve it in another PR (the one bringing formula rendering to the help docs).

@shun2wang
Contributor Author

shun2wang commented Jun 6, 2023

Well, I fixed formula rendering in the toolbar/PDF via CSS styles, so maybe there was a conflict there.
image

But I still don't have a good solution for copy-pasting into Google Docs...

The other solution is to output all rendering as MathML, which may solve the styling and copy-paste quirks, and allows for consistent rendering while maintaining accessibility:

For KaTeX we set output: "mathml", and then start JASP.exe with --webEngineArgs --enable-experimental-web-platform-features (set in env; this is no longer needed once Qt WebEngine is updated to version 109+).
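
One possible variation on this idea, sketched under stated assumptions, is to pick the KaTeX output mode from the Chromium version instead of requiring the experimental flag: MathML-only where the engine is 109 or newer, and the combined output otherwise. Both function names are hypothetical, the threshold of 109 is taken from the comment above, and the user-agent string in the example is made up.

```javascript
// Assumption from the comment above: no extra flag is needed from Chromium 109 on.
const FIRST_CHROMIUM_WITHOUT_FLAG = 109;

// Hypothetical helper: choose a KaTeX `output` value for a given Chromium major version.
// An unknown version (null) falls back to the combined output.
function pickKatexOutput(chromiumMajor) {
  return chromiumMajor >= FIRST_CHROMIUM_WITHOUT_FLAG ? "mathml" : "htmlAndMathml";
}

// Hypothetical helper: read the Chromium major version from a user-agent string.
function chromiumMajorFromUserAgent(userAgent) {
  const match = /Chrom(?:e|ium)\/(\d+)/.exec(userAgent);
  return match ? Number(match[1]) : null;
}

// Made-up user-agent string for illustration.
const ua = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36";
console.log(pickKatexOutput(chromiumMajorFromUserAgent(ua))); // "htmlAndMathml"
```

This keeps the current behavior on older engines while allowing MathML-only rendering once the bundled Chromium supports it; whether that trade-off is wanted is a design decision for the maintainers.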

@JorisGoosen
Contributor

image
This is what it looks like in LibreOffice, so it copies something usable.

@EJWagenmakers
Contributor

It's very pretty

@shun2wang
Contributor Author

shun2wang commented Jun 6, 2023

@JorisGoosen In Google Docs I can only paste some HTML code, so I doubt it respects the MIME type ("text/html")?

EDIT: it depends on which browser you are using; Firefox with Google Docs behaves differently...

EDIT2: Google Docs needs an offline editing extension installed in Chrome; if you install it, the behavior is the same as in Firefox.

@JorisGoosen
Contributor

Yeah, there is something off about the copy code; I'm having a look at it now.

However, I also tried the "experimental-web-features" and it doesn't help the copying.

Also, I don't understand why the fractions disappear...

@shun2wang
Contributor Author

shun2wang commented Jun 6, 2023

> However, I also tried the "experimental-web-features" and it doesn't help the copying.

This may require special annotations for some of MathML's media types. If you paste this into MS Word you'll get a nice formula.

<math xmlns="http://www.w3.org/1998/Math/MathML" display="block">
  <mi>x</mi>
  <mo>=</mo>
  <mrow>
    <mfrac>
      <mrow>
        <mo>&#x2212;</mo>
        <mi>b</mi>
        <mo>&#xB1;</mo>
        <msqrt>
          <msup>
            <mi>b</mi>
            <mn>2</mn>
          </msup>
          <mo>&#x2212;</mo>
          <mn>4</mn>
          <mi>a</mi>
          <mi>c</mi>
        </msqrt>
      </mrow>
      <mrow>
        <mn>2</mn>
        <mi>a</mi>
      </mrow>
    </mfrac>
  </mrow>
  <mo>.</mo>
</math>

> Also, I don't understand why the fractions disappear...

Perhaps it has something to do with escaping?

@JorisGoosen
Contributor

<!DOCTYPE HTML>
<html>
	<head>
		<meta http-equiv='Content-Type' content='text/html; charset=utf-8' />
		<title>JASP</title>		<style>			p {margin-top:1em; margin-bottom:1em;}		</style>	</head>
	<body style='display: block; padding: 0px; margin: 0px; '>
<div style="padding: 0px 7.2px; text-align: start; margin-bottom: 7.2px; margin-top: 1.2px; margin-left: 0px; margin-right: 7.2px; display: block; float: none; position: static; ">
<div style="display:inline-block; ">
<div class="jasp-html-primitive jasp-display-primitive">
	<span style="max-width:15cm; display:block;">
		<p></p>

		<p>
		Under 

			<span>
				<span class="katex">
					<span class="katex-mathml">
						<math xmlns="http://www.w3.org/1998/Math/MathML">
							<semantics>
								<mrow>
									<msub>
										<mi mathvariant="script">
										M
										</mi>

										<mn>
										1
										</mn>
									</msub>
								</mrow>

								
							</semantics>
						</math>
					</span>

					
				</span>
			</span>

		, we have 

			<span>
				<span class="katex-display">
					<span class="katex">
						<span class="katex-mathml">
							<math xmlns="http://www.w3.org/1998/Math/MathML" display="block">
								<semantics>
									<mrow>
										<mi>
										p
										</mi>

										<mo stretchy="false">
										(
										</mo>

										<mi>
										θ
										</mi>

										<mo>
										∣
										</mo>

										<mi>
										y
										</mi>

										<mo stretchy="false">
										)
										</mo>

										<mo>
										=
										</mo>

										<mfrac>
											<mrow>
												<mi>
												p
												</mi>

												<mo stretchy="false">
												(
												</mo>

												<mi>
												θ
												</mi>

												<mo stretchy="false">
												)
												</mo>

												<mo>
												×
												</mo>

												<mi>
												p
												</mi>

												<mo stretchy="false">
												(
												</mo>

												<mi>
												y
												</mi>

												<mo>
												∣
												</mo>

												<mi>
												θ
												</mi>

												<mo stretchy="false">
												)
												</mo>
											</mrow>

											<mrow>
												<mi>
												p
												</mi>

												<mo stretchy="false">
												(
												</mo>

												<mi>
												y
												</mi>

												<mo stretchy="false">
												)
												</mo>
											</mrow>
										</mfrac>

										<mo separator="true">
										,
										</mo>
									</mrow>

									
								</semantics>
							</math>
						</span>

						
					</span>
				</span>
			</span>

		 where 

			<span>
				<span class="katex-display">
					<span class="katex">
						<span class="katex-mathml">
							<math xmlns="http://www.w3.org/1998/Math/MathML" display="block">
								<semantics>
									<mrow>
										<mi>
										p
										</mi>

										<mo stretchy="false">
										(
										</mo>

										<mi>
										y
										</mi>

										<mo stretchy="false">
										)
										</mo>

										<mo>
										=
										</mo>

										<mo>
										∫
										</mo>

										<mi>
										p
										</mi>

										<mo stretchy="false">
										(
										</mo>

										<mi>
										θ
										</mi>

										<mo stretchy="false">
										)
										</mo>

										<mo>
										×
										</mo>

										<mi>
										p
										</mi>

										<mo stretchy="false">
										(
										</mo>

										<mi>
										y
										</mi>

										<mo>
										∣
										</mo>

										<mi>
										θ
										</mi>

										<mo stretchy="false">
										)
										</mo>

										<mi>
										d
										</mi>

										<mi>
										θ
										</mi>

										<mi mathvariant="normal">
										.
										</mi>
									</mrow>

									
								</semantics>
							</math>
						</span>

						
					</span>
				</span>
			</span>

		 
		</p>

		<p></p>
	</span>
</div>
</div></div>	</body>
</html>

The <math> blocks in there look totally reasonable...

@shun2wang
Contributor Author

shun2wang commented Jun 6, 2023

@JorisGoosen Using a raw clipboard data viewer built on the browser API, I can see that Chrome receives text/plain when you copy from JASP. The correct type should be text/html, right?
image

In Firefox we get it as text/html:
image
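
To illustrate the MIME-type point, here is a sketch of a clipboard payload that carries the same content under both text/html and text/plain, so that a paste target can pick the richer format. `htmlToPlainText` and `buildClipboardPayload` are hypothetical helpers for this illustration; how JASP actually fills the clipboard is not shown in this thread. The commented-out part assumes a browser context with the async Clipboard API.

```javascript
// Hypothetical helper: crude plain-text fallback made by stripping tags.
function htmlToPlainText(html) {
  return html.replace(/<[^>]*>/g, " ").replace(/\s+/g, " ").trim();
}

// Hypothetical helper: one entry per MIME type offered to the paste target.
function buildClipboardPayload(html) {
  return {
    "text/html": html,
    "text/plain": htmlToPlainText(html),
  };
}

const payload = buildClipboardPayload("<p>a <b>b</b></p>");
console.log(Object.keys(payload)); // [ 'text/html', 'text/plain' ]

// Assumed browser usage with the async Clipboard API:
//   const item = new ClipboardItem({
//     "text/html": new Blob([payload["text/html"]], { type: "text/html" }),
//     "text/plain": new Blob([payload["text/plain"]], { type: "text/plain" }),
//   });
//   await navigator.clipboard.write([item]);
```

If only the text/plain entry is present, editors such as Google Docs have nothing but raw markup to paste, which matches the behavior described above.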

@JorisGoosen
Contributor

Ah yes, I fixed that already. I'll push it.

@shun2wang
Contributor Author

Replaced by #5109

@shun2wang shun2wang closed this Jun 15, 2023
4 participants