Htxelatex for html output #887

Merged: 32 commits, Jun 27, 2018

Conversation

PDoakORNL
Contributor

Anyway, the HTML output looks pretty good. We can improve it further without having to touch the manual's LaTeX. The .mk4 and .cfg files allow a decent amount of control over the generated HTML and image conversions.

Only tested with a full texlive2018 install.

The HTML builds with

./build_html_manual.sh

There are still some rough edges, especially in handling all the little files created.

@ghost ghost assigned PDoakORNL Jun 11, 2018
@ghost ghost added the in progress label Jun 11, 2018
@qmc-robot

Can one of the maintainers verify this patch?

@markdewing
Contributor

Does this supersede #886? It appears to contain the same changes.

@PDoakORNL
Contributor Author

It's based on that. I will rebase it after #886 is merged.

@PDoakORNL
Contributor Author

Probably not visible outside of ORNL network. Firewall exception is pending.
http://128.219.187.136/manual/qmcpack_manual.html

Contributor

@prckent prckent left a comment

figures/qmca_opt_energy.png does not exist. qmca_opt_energy.pdf is a real PDF file. Check the other png/pdf updates.

@prckent
Contributor

prckent commented Jun 11, 2018

Overall the conversion is fairly usable, but some tidying and improvements are needed. The same comment could be made of the source material, so we are moving in the right direction.

@ye-luo
Contributor

ye-luo commented Jun 11, 2018

@prckent your links are only accessible in ORNL.

@PDoakORNL
Contributor Author

The prep_pdf.sh image conversion needs to work in order to get all the PNGs.

Requirements for prep_pdf.sh:

  • imagemagick
  • pdfcrop (provided by texlive-extra-utils or what spack calls a full install)
  • pdf2svg
mac95788:manual epd$ git checkout develop -- figures/qmca_opt_energy.pdf
mac95788:manual epd$ git checkout develop -- figures/qmca_opt_variance.pdf
mac95788:manual epd$ identify figures/qmca_opt_energy.pdf
figures/qmca_opt_energy.pdf PDF 576x432 576x432+0+0 16-bit sRGB 15614B 0.000u 0:00.000
mac95788:manual epd$ identify figures/qmca_opt_variance.pdf
figures/qmca_opt_variance.pdf PNG 800x600 800x600+0+0 8-bit sRGB 28288B 0.000u 0:00.000
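
The identify output above shows a file with a .pdf extension whose content is actually PNG data. A minimal sketch of a check for that situation, assuming the figures sit in one directory; check_pdf_magic is a hypothetical helper name, not something in the repository, and it only looks at the first four bytes instead of calling ImageMagick:

```shell
#!/bin/sh
# Sketch only: report files named *.pdf whose content does not start with
# the PDF magic bytes "%PDF". check_pdf_magic is a hypothetical helper.
check_pdf_magic() {
    dir=$1
    bad=0
    for f in "$dir"/*.pdf; do
        [ -e "$f" ] || continue          # the glob matched nothing
        magic=$(head -c 4 "$f")          # first four bytes of the file
        if [ "$magic" != "%PDF" ]; then
            echo "not a real PDF: $f"
            bad=$((bad + 1))
        fi
    done
    [ "$bad" -eq 0 ]                     # succeed only if every file passed
}
```

Something like `check_pdf_magic figures || echo "fix the files listed above"` could then run before the image conversion step.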

@prckent
Contributor

prckent commented Jun 11, 2018

@PDoakORNL This was build_manual.sh. We should use the images directly for the PDF generation and regular TeX processing. I wish to avoid extra processing steps here. For HTML generation these steps are OK and I think unavoidable.

@jtkrogel
Contributor

Might not be the right place for this, but what would be needed to get the Nexus manual in the auto-build and online? It would be great for the two manuals to be accessible near each other.

@prckent
Contributor

prckent commented Jun 12, 2018

@jtkrogel Adding Nexus PDF and even HTML should be simple. It is just a matter of updating the new qmcpack.org website with the appropriate links and updating the build scripts on the VM that hosts the files. Perhaps create an issue for me or remind me after we have the QMCPACK manual updating and linked correctly.

@jtkrogel
Contributor

@prckent thanks. See #889.

@@ -0,0 +1,4 @@
*.html
*.png
Contributor

A number of figures are in png format. This exclusion will make it harder to track them.
Should this be in manual/html/.gitignore instead?
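
A minimal sketch of the scoping idea raised here, assuming a throwaway repository and a hypothetical layout with manual/html for generated output and manual/figures for tracked images. A .gitignore placed in a subdirectory only applies beneath that subdirectory, which `git check-ignore` can confirm:

```shell
#!/bin/sh
# Sketch only: scope the ignore patterns to a generated-output directory.
# The directory layout below is an assumption made for illustration.
demo=$(mktemp -d)
git init -q "$demo"
mkdir -p "$demo/manual/html" "$demo/manual/figures"
printf '%s\n' '*.html' '*.png' > "$demo/manual/html/.gitignore"
touch "$demo/manual/html/page.png" "$demo/manual/figures/plot.png"

# check-ignore exits 0 when the path is ignored and 1 when it is not.
ignored_in_html=no
ignored_in_figures=no
git -C "$demo" check-ignore -q manual/html/page.png && ignored_in_html=yes
git -C "$demo" check-ignore -q manual/figures/plot.png && ignored_in_figures=yes
echo "html: $ignored_in_html, figures: $ignored_in_figures"
```

With this placement the PNG figures outside the html directory should stay visible to git, while generated PNG and HTML files are ignored.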

Contributor

@PDoakORNL Please remove this file.

Contributor Author

Yes, Mark.

@markdewing
Contributor

The latest change does not work with TeX Live 2015. The fontspec package exists and is loaded. It's the IfFontExistsTF command that does not exist in the package.

@prckent
Contributor

prckent commented Jun 12, 2018

I'll merge this once I have the conversion running in the VM. I need pdf2svg but have all the other components. Can we close #886?

@PDoakORNL PDoakORNL force-pushed the htxelatex_for_html_output branch from 8bf10a9 to 36cf559 Compare June 14, 2018 15:46
@prckent prckent mentioned this pull request Jun 14, 2018
@prckent
Contributor

prckent commented Jun 14, 2018

I need more space in the VM for the tooling needed to support this, and therefore before I can merge this PR. May take some time.

@prckent
Contributor

prckent commented Jun 15, 2018

Summarizing remaining items for merging this PR

  • Expand space on VM
  • Point spack build_stage at new volume (to not fill /)
  • Obtain working pdf2svg in VM (multiple spack install variants have failed due to bugs or issues in the underlying dependencies)
  • Update texlive in VM: won't be feasible to get to TeX Live 2018 in the VM in a short time frame (initial spack install failed; directly downloaded TeX Live binaries require a newer glibc than the VM's CentOS 6.9 glibc 2.12)
  • Check PR in VM: verify PDF production is still correct
  • Check PR in VM: check HTML production works (fails)
  • Merge PR

Then:

  • Update scripts for HTML production (delay until workaround in place)
  • Link developer versions of HTML and PDF on qmcpack.org website (generate on v3.5 release if not automated by then)
  • Generate and link to PDF versions of older releases on qmcpack.org website (on v3.5 release)

\usepackage{fancyhdr}
\usepackage{hyperref} %for urls
%\usepackage{url}
\usepackage{tabularx}
\usepackage{xltabular}
Contributor

The xltabular package and the xltabular.sty file do not seem to be present in the TeX Live 2015 distribution (Ubuntu 16.04).

@PDoakORNL
Contributor Author

PDoakORNL commented Jun 19, 2018

Recipe for pdf2svg on a clean CentOS 7 VM

  • yum install gcc-c++
  • yum install environment-modules
  • yum install bzip2
  • spack install curl
  • spack load curl
  • sudo yum install libstdc++-static.x86_64
  • spack install gcc@7.3.0 +binutils +piclibs
  • spack load gcc@7
  • sudo yum install cairo-devel.x86_64
  • sudo yum install poppler-glib-devel.x86_64
  • spack install pdf2svg%gcc@7.3.0

Note: if you did not have the whole gcc toolchain installed when you first ran spack, you will need to add g++ and gfortran to ~/.spack/linux/compilers.yaml

\@ifpackageloaded{fontspec}%
{%
\setmainfont{XCharter Roman}%
\@ifpackagelater{fontspec}{2016/01/30}% Guess at when IfFontExistsTF was added.
Contributor

The date on the fontspec package in Ubuntu 16.04 is 2016/02/01 (v2.5a), and it does not have IfFontExistsTF.
Changing the test date to 2016/02/02 works.

(The commit where IfFontExistsTF was added occurs on Dec 27, 2016)
latex3/fontspec@6969d3a
It appears first in tag 2.5c and the v2.5c release date on github is 01/20/2017
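
To make the date reasoning concrete: \@ifpackagelater takes its true branch when the loaded package's date is on or after the date given. The sketch below mimics that comparison in shell for the dates quoted in this comment. date_at_least is a hypothetical helper, and using the v2.5c GitHub release date as a threshold is an assumption, since the date declared inside that release's fontspec.sty may differ:

```shell
#!/bin/sh
# Sketch only: mimic an "on or after" package-date test by comparing
# YYYY/MM/DD dates as integers. date_at_least is a hypothetical helper.
date_at_least() {
    have=$(printf '%s' "$1" | tr -d '/')
    need=$(printf '%s' "$2" | tr -d '/')
    [ "$have" -ge "$need" ]
}

ubuntu_fontspec=2016/02/01    # v2.5a date reported for Ubuntu 16.04

for threshold in 2016/01/30 2016/02/02 2017/01/20; do
    if date_at_least "$ubuntu_fontspec" "$threshold"; then
        echo "$threshold: guard passes, IfFontExistsTF would be attempted"
    else
        echo "$threshold: guard fails, the fallback branch is taken"
    fi
done
```

Only the first threshold lets the Ubuntu 16.04 fontspec through, which matches the failure reported with the original guess of 2016/01/30.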

@PDoakORNL PDoakORNL force-pushed the htxelatex_for_html_output branch 2 times, most recently from 862733d to b0a844c Compare June 20, 2018 13:23
@markdewing
Contributor

Current status on Ubuntu 16.04

  • build_pdflatex_manual - error about unknown graphics extension: .eps. Otherwise builds okay (the eps issue should eventually be addressed by converting the one eps file to pdf)
  • build_manual - fontspec error about font not found (XCharter Roman). This might be because the latex fonts are not available as system fonts. I haven't tried any of the fixes yet. Otherwise builds okay after commenting out the line that uses this font in qmcpack_manual.sty.
  • build_html_manual

@PDoakORNL PDoakORNL force-pushed the htxelatex_for_html_output branch from a7885d3 to 5df85e7 Compare June 22, 2018 15:07
@markdewing
Contributor

For the HTML version of the manual, having tighter requirements is fine, since the primary use is for the project to build and put on the website.

For the PDF version, my concern is adding additional barriers to contributions, especially for more infrequent contributors. If we require updating a TeX installation to build the manual, it makes it that much harder to contribute to the manual.

I'll submit a PR converting the one EPS figure.

I've been out of the academic world for a while, but it seems like submitting a paper that requires the latest TexLive to a publisher is asking for compatibility problems (but maybe publishers are always up to date?) What's the incentive for most people to be using the cutting edge Tex version?

@PDoakORNL
Contributor Author

Bugs are fixed, and hard-to-use, incompatible packages are superseded by easier-to-use, more compatible packages. APS RevTeX, for instance, operates well in an up-to-date environment and doesn't support a number of "deprecated" packages including subfigure and subcaption.

But my understanding is that they don't actually use your nonlocal LaTeX. They keep the math, the in-text formatting, and some of the tags in their style to automatically populate authors etc., but the styles they give us are for collecting metadata and so that you can see more or less how it will look. I think a commercial product or xelatex/luatex is used for the real publication.

The build_pdflatex_manual.sh still builds everything and content can be added with the quality of formatting we had before. It could even still be improved with forward compatible latex. If they stick to basic text, math, tables, pdf figures, and whatever macro I wrap includegraphics in they won't have anything to worry about. If they want to be fancy or have control of html output they need to get up to date.

@ghost ghost assigned prckent Jun 26, 2018
@ye-luo
Contributor

ye-luo commented Jun 27, 2018

@PDoakORNL Please update the PR description to reflect the latest changes. What scripts are added, when are they used, and what are the dependencies?

@prckent
Contributor

prckent commented Jun 27, 2018

I am now happy to merge this. We can make PDFs in the VM, as we do currently. I can successfully generate HTML on my Mac and presumably generic UNIX, but not in the VM yet (details below).

After a variety of experimentation and discussion with @PDoakORNL I am proposing that we establish the following rules:

Rule: the manual must build and create a PDF using texlive2017 full install and no other additions (fonts, packages, tools). This is for ease of contributions, even from ourselves. The automated builder VM is running this version of texlive and we will have to live with these constraints.
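
A minimal sketch of how a contributor might check which TeX Live year is installed before relying on this rule. texlive_year is a hypothetical helper, and the sketch assumes the `--version` banner contains the text "TeX Live" followed by a four-digit year, as TeX Live builds of xelatex print:

```shell
#!/bin/sh
# Sketch only: pull the TeX Live year out of a "--version" banner.
# texlive_year is a hypothetical helper; it reads the banner on stdin.
texlive_year() {
    sed -n 's/.*TeX Live \([0-9][0-9][0-9][0-9]\).*/\1/p' | head -n 1
}

# Falls back to "unknown" if xelatex is missing or the banner differs.
year=$(xelatex --version 2>/dev/null | texlive_year)
echo "TeX Live year detected: ${year:-unknown}"
```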

I know this choice won't keep everyone happy but the barrier for manual contributions must be kept small. It is clear from all the commentary above that there are issues with more aggressive use of newer packages, features, and external tooling.

Background: The issue is that texlive2018 is not available everywhere and not easily installable everywhere. For example, the tl binaries don't work on old systems (glibc versioning), and spack does not actually build the tex binaries, it only installs them. I built tl2018 from source but the barrier is far too high. Clearly, at some point we'll switch to tl2018, but for now it is a little too new and not accessible enough.

For HTML generation we should minimize tooling to simplify administration, but provided we can build the tools in the "old" VM we use for automation, we can probably support it. The preference is to minimize extras as much as possible, since this will make it easier for others to generate and improve the HTML, and will simplify maintenance and reduce the administration cost of rebuilding the VM.

Currently I cannot make HTML in the VM due to some niggling tooling issues. Manually building dvisvgm might be the last needed step. Updating the VM might be another route, but I don't see a reason to delay this PR further.

@prckent
Contributor

prckent commented Jun 27, 2018

@PDoakORNL Can you fix the conflicts please?

@ye-luo Happy? Unless there are show-stoppers we can work on improvements in later PRs.

@ye-luo
Contributor

ye-luo commented Jun 27, 2018

@prckent I'm happy with the rules you set including no spack.

I tried to build pdf on ubuntu 16.04 (texlive 2015).
I got the error that xelatex doesn't exist. I think we can do a bit better than letting the script keep running. Just add a check for whether xelatex exists and, if not, print a message asking the user to install the package or use the legacy script. After apt-get install texlive-xetex, everything went through.
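
The kind of check being asked for could look roughly like the sketch below. require_tool is a hypothetical helper, not code from this PR, and the hint text is only an example:

```shell
#!/bin/sh
# Sketch only: fail early with a readable message if a tool is missing.
# require_tool is a hypothetical helper, not taken from the repository.
require_tool() {
    tool=$1
    hint=$2
    if ! command -v "$tool" >/dev/null 2>&1; then
        echo "error: '$tool' not found on PATH. $hint" >&2
        return 1
    fi
}

# Intended use near the top of a build script (left commented out here):
# require_tool xelatex "Install texlive-xetex or use build_pdflatex_manual.sh." || exit 1
```

`command -v` is specified by POSIX, so the check does not depend on `which` being installed.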

When I tried the HTML, I needed to install pdf2svg and tex4ht. Then I got a LaTeX error at begin document and gave up.

I agree to have this PR merged as my simple improvement request is fulfilled.

However, it is not clear to me why we need to support Unicode. The problem is decoupled from HTML.
I still think Unicode brings more burden than benefit and it is better to avoid it as much as possible. For example, the dash - has many versions of different lengths in Unicode and they are very hard to tell apart by eye.
If someone adds a math symbol from Unicode instead of using a LaTeX command, it may cause portability issues (fonts). If we remove all the UTF-8, do we lower the quality of our manual?

@prckent
Contributor

prckent commented Jun 27, 2018

Good to know that Tex Live 2015 works.

For HTML you will need extra tools and more recent TeX Live. I think of it as the same requirement of up to date tools we have for the application, just as we need people to have C++11/14. Using the packaged (old) versions that come with Linux distros can be a problem, plus they don't come with tlmgr for updates. On my Mac I used mactex/Tex Live 2018, full install, then "tlmgr update --all". Since we will be putting the HTML online, running the generation will only be needed by the robot and by anyone trying to optimize the HTML output in some way.

@PDoakORNL
Contributor Author

As far as UTF-8 goes, I'd much rather leave the manual encoded that way. By default people's names won't be mauled in references. Zotero and other reference managers are natively Unicode, and at least Zotero produces BibTeX with multibyte characters. Unicode characters are trivial to pick out in emacs by toggling whether multibyte characters are displayed as raw bytes (and surely there is something like this for vim), so invisible Unicode characters are not really a problem. Python 3 and texlive are both essentially Unicode environments now.
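
Outside an editor, multibyte characters can also be located from the command line. A minimal sketch, where count_non_ascii is a hypothetical helper and the *.tex glob is an assumption about where the manual sources live:

```shell
#!/bin/sh
# Sketch only: count bytes outside the 7-bit ASCII range in a file.
# count_non_ascii is a hypothetical helper name.
count_non_ascii() {
    # Delete every ASCII byte (octal 000-177); whatever is left is non-ASCII.
    LC_ALL=C tr -d '\000-\177' < "$1" | wc -c | tr -d ' '
}

# Report each TeX source in the current directory that has such bytes.
for f in *.tex; do
    [ -e "$f" ] || continue              # the glob matched nothing
    n=$(count_non_ascii "$f")
    if [ "$n" -gt 0 ]; then
        echo "$f: $n non-ASCII bytes"
    fi
done
```

This counts bytes, not characters, so a single UTF-8 dash or accented letter contributes two or three to the total.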

It's much more descriptive to write ლʕಠᴥಠʔლ rather than the old ASCII bear.

@PDoakORNL
Contributor Author

Found the rebase to be rather messy. So hopefully this merge is ok.

@prckent
Contributor

prckent commented Jun 27, 2018

Still working for me (TeX Live 2018, Mac).

@prckent
Contributor

prckent commented Jun 27, 2018

OK to test

@ye-luo ye-luo merged commit 8dae2dd into QMCPACK:develop Jun 27, 2018
@ghost ghost removed the in progress label Jun 27, 2018
@PDoakORNL PDoakORNL deleted the htxelatex_for_html_output branch July 8, 2019 15:11
6 participants