| author | date | title |
|---|---|---|
| Peter Ralph | 26 October 2015 | R+markdown gotchas |

These things were confusing to me for a while. They might be to others also.
There are already a lot of different ways to enter math in LaTeX,
and there are more when using pandoc.
This is kinda nice for the newcomer to LaTeX, since it's more forgiving,
but if you want to produce both html and pdf output, you have to do it just right.
For html output, it seems safe to just wrap everything in `$$`s,
but this breaks when passed to pdflatex.
The solution, at least for now, is to use `\begin{aligned}` inside `$$` instead of `\begin{align}`:

- If you are producing pdf, pandoc will pass your LaTeX directly to pdflatex, so your code needs to be valid there.
- If you are producing html, pandoc includes raw LaTeX blocks if `--mathjax` is specified, so you can go ahead and use `align` environments and everything.
- Definitions (`\newcommand`, etc.) go at the top of the file, after the YAML header (but see below).

For instance, this file:

    \newcommand{\R}{\mathbb{R}}
    This is for $\R$eal.
    $$\begin{aligned}
    e^{i\pi} = -1
    \end{aligned}$$

compiles just fine with both

    pandoc test.md -o test.pdf
    pandoc test.md --standalone --mathjax -o test.html

The macros in the above example get expanded because all the math is in simple math environments.
If you want to use ams environments like `\begin{align}`,
then there's no way to include macros directly in the document that works for both html and pdf output.
One nice way around this is to keep your macros in a separate file
and have pandoc insert them in the header using the `-H` flag;
you just have to make sure to wrap them in a math environment for mathjax to use them:

    # for pdf
    pandoc test.md -H macros.tex -o test.pdf
    # for html
    pandoc test.md -H <(echo '\['; cat macros.tex; echo '\]') --mathjax --standalone -o test.html

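
The `<(...)` process substitution in the html command is specific to bash-like shells. Here is a minimal sketch of the same wrapping done through an ordinary file, which should work in any POSIX shell. The name `macros-html.tex` is made up for this example, and the one-line `macros.tex` written here is only a stand-in so that the snippet is self-contained:

```shell
# Stand-in macro file (assumption: yours holds only \newcommand definitions).
printf '%s\n' '\newcommand{\R}{\mathbb{R}}' > macros.tex

# Wrap the macros in \[ ... \] so that MathJax reads them as math.
{ printf '%s\n' '\['; cat macros.tex; printf '%s\n' '\]'; } > macros-html.tex

# Then, assuming pandoc is installed and test.md exists:
#   pandoc test.md -H macros-html.tex --mathjax --standalone -o test.html
```

The pdf build keeps using the unwrapped `macros.tex`; only the html build needs the wrapped copy.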
It's nice and convenient to turn your `.Rmd` files into html using `rmarkdown`'s function `render()`.
And R+markdown is a great way to produce templated reports:
write one `.Rmd` file; apply it to many datasets of the same structure.
But, as I discovered the hard way,
if you call, say,

    render("template.Rmd",output_file="a.html")

and

    render("template.Rmd",output_file="b.html")

at the same time in different R sessions,
with different variables,
you won't get different reports:
you'll get the same one twice, silently.
As far as I can tell, there's no workaround with `rmarkdown`;
the way to do it is to call `knitr` and `pandoc` yourself
(which is what `rmarkdown` does under the hood anyhow).

`knitr` caches figures in a subdirectory, by default called `figure/`.
The same goes for the results of any cached chunk.
You can, and should, change this
by specifying a file-specific prefix for the figures,
for instance as follows:
```{r setup_knitr}
# store figures and cached chunks under per-file directories (e.g. figure/report/)
knitr::opts_chunk$set(fig.path=file.path("figure",gsub("\\.Rmd$","",knitr::current_input()),""),
    cache.path=file.path("cache",gsub("\\.Rmd$","",knitr::current_input()),""))
```
or by defining Makefile rules:

    %.md : %.Rmd
    	Rscript -e 'knitr::opts_chunk$$set(fig.path=file.path("figure","$*",""),cache.path=file.path("cache","$*",""));knitr::knit(basename("$<"),output=basename("$@"))'

    %.html : %.md
    	cd $(dir $<) && pandoc $(notdir $<) $(PANDOC_OPTS) --output $(notdir $@)

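
For the templated-report case (one `.Rmd`, many outputs), the same idea can be keyed on the output name rather than the input file name. The following is only a sketch under that assumption: `knit_expr` is a hypothetical helper, not part of knitr or rmarkdown, and all it does is build the R expression. How each run picks its dataset is left out.

```shell
# knit_expr TEMPLATE LABEL
# Print an R expression that knits TEMPLATE to LABEL.md, keeping figures in
# figure/LABEL/ and cached chunks in cache/LABEL/.
# (knit_expr is a hypothetical helper name; the knitr calls are the same ones
# used in the Makefile rule.)
knit_expr() {
  template=$1
  label=$2
  printf 'knitr::opts_chunk$set(fig.path=file.path("figure","%s",""),cache.path=file.path("cache","%s",""));knitr::knit("%s",output="%s.md")\n' \
    "$label" "$label" "$template" "$label"
}

# Show the expressions for two reports built from one template.
knit_expr template.Rmd a
knit_expr template.Rmd b

# To actually run one (assumes R with knitr, and pandoc, are installed):
#   Rscript -e "$(knit_expr template.Rmd a)" && pandoc a.md --standalone -o a.html
```

Because the two expressions differ in every path they write to, two sessions running them at the same time should not overwrite each other's figures, cache, or markdown output.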
For some versions of pandoc,
the html command above (`pandoc test.md --standalone --mathjax -o test.html`) wouldn't have worked.
That's because pandoc would have inserted the MathJax URL without the `https:` in front,
as discussed here.
This works if the file is being served from a webserver,
but not if you're looking at it on your local computer (using the `file://` protocol).
The solution is to pass in the mathjax URL explicitly (quoted, so the shell leaves the `?` alone):

    pandoc test.md --standalone --mathjax='https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML' -o test.html

... or to update your pandoc.
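
To find out whether a generated file has this problem, it may be enough to look for a script source that starts with `//`. A rough sketch follows: `has_schemeless_mathjax` and `sample.html` are names invented for this example, and the `<script>` line is a stand-in for what an affected pandoc version writes, since the exact markup varies between versions.

```shell
# Stand-in for the <script> line an affected pandoc version emits
# (sample.html is a made-up file name used only for this illustration).
printf '%s\n' '<script src="//cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script>' > sample.html

# Succeeds if FILE loads MathJax from a URL that starts with "//",
# that is, with no "https:" in front.
has_schemeless_mathjax() {
  grep -q 'src="//[^"]*MathJax\.js' "$1"
}

if has_schemeless_mathjax sample.html; then
  echo "MathJax URL has no scheme: math will not render over file://"
fi
```

On real output, replace `sample.html` with the html file that pandoc produced.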