Change 'loose'/'tight' list parsing to be more like CommonMark #5285

Closed
gwern opened this issue Feb 8, 2019 · 2 comments

Comments

@gwern
Contributor

gwern commented Feb 8, 2019

Initial discussion on pandoc-discussion.

Pandoc currently parses ordered/unordered lists containing block elements in a surprising & inconsistent way that differs from CommonMark: within a list item, continuation paragraphs are not wrapped in <p> except in the final item (leading to subtle rendering problems in HTML). CommonMark wraps every paragraph in <p>.

Simple example:

$ pandoc --version
pandoc 2.6
Compiled with pandoc-types 1.17.5.4, texmath 0.11.2, skylighting 0.7.5
Default user data directory: /home/gwern/.pandoc
Copyright (C) 2006-2019 John MacFarlane
Web:  http://pandoc.org
This is free software; see the source for copying conditions.
There is no warranty, not even for merchantability or fitness
for a particular purpose.

$ cat foo.page
1. Foo

    bar
2. foo

    bar
3. foo

    Bar
4. Foo

    Bar

$ pandoc foo.page
<ol type="1">
<li><p>Foo</p>
bar</li>
<li><p>foo</p>
bar</li>
<li><p>foo</p>
Bar</li>
<li><p>Foo</p>
<p>Bar</p></li>
</ol>

$ cmark foo.page 
<ol>
<li>
<p>Foo</p>
<p>bar</p>
</li>
<li>
<p>foo</p>
<p>bar</p>
</li>
<li>
<p>foo</p>
<p>Bar</p>
</li>
<li>
<p>Foo</p>
<p>Bar</p>
</li>
</ol>
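
The difference between the two outputs above can be checked mechanically. Here is a minimal sketch using only the Python standard library; `MixedItemFinder` and `find_mixed_items` are hypothetical helper names, not part of pandoc or cmark. It flags any <li> that has both a direct <p> child and bare text directly inside it, and it does not look at bare text wrapped in inline tags such as <em>.

```python
from html.parser import HTMLParser


class MixedItemFinder(HTMLParser):
    """Record <li> items that contain both a direct <p> child and bare text."""

    def __init__(self):
        super().__init__()
        self.stack = []   # currently open tags, innermost last
        self.items = []   # one record per currently open <li>
        self.mixed = []   # 1-based indices of mixed <li> items
        self.count = 0    # number of <li> tags seen so far

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            self.count += 1
            self.items.append({"index": self.count, "para": False, "bare": False})
        elif tag == "p" and self.stack and self.stack[-1] == "li":
            self.items[-1]["para"] = True
        self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag not in self.stack:
            return  # stray closing tag, ignore it
        while True:
            closed = self.stack.pop()
            if closed == "li":
                item = self.items.pop()
                if item["para"] and item["bare"]:
                    self.mixed.append(item["index"])
            if closed == tag:
                break

    def handle_data(self, data):
        # Non-whitespace text whose innermost open tag is <li> is "bare".
        if data.strip() and self.stack and self.stack[-1] == "li":
            self.items[-1]["bare"] = True


def find_mixed_items(html):
    """Return the 1-based indices of <li> items mixing <p> and bare text."""
    finder = MixedItemFinder()
    finder.feed(html)
    finder.close()
    return sorted(finder.mixed)


# The pandoc 2.6 output from the example above.
PANDOC_OUTPUT = """<ol type="1">
<li><p>Foo</p>
bar</li>
<li><p>foo</p>
bar</li>
<li><p>foo</p>
Bar</li>
<li><p>Foo</p>
<p>Bar</p></li>
</ol>"""

print(find_mixed_items(PANDOC_OUTPUT))  # [1, 2, 3]
```

Run on the cmark output, the same function returns an empty list, since every paragraph there is wrapped in <p>.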

Possibly related issues:

jgm closed this as completed in 47537d2 on Feb 9, 2019

@gwern
Contributor Author

gwern commented Feb 9, 2019

Looks good, thanks!

@gwern
Contributor Author

gwern commented Feb 24, 2019

Actually, on further examination, it appears this bug may not be quite fixed, although it's possible this is an entirely separate issue which should be in a different bug report.

It seems that if the final item of a list (ordered or unordered) contains a blockquote (but not merely an indented paragraph/continuation), the entire list gets switched back to the <p>-less version. We noticed that the fix wasn't working in my tea reviews list even though it was working in the test case, and I finally minimized it down to a 2-3 item list example.

Lists with final items using a blockquote:

- foo

    foo
- foo

    > foo


1. Foo

    bar
2. foo

    bar
4. Foo

    > Bar

yields the broken behavior:

<ul>
<li><p>foo</p>
foo</li>
<li><p>foo</p>
<blockquote>
<p>foo</p>
</blockquote></li>
</ul>
<ol type="1">
<li><p>Foo</p>
bar</li>
<li><p>foo</p>
bar</li>
<li><p>Foo</p>
<blockquote>
<p>Bar</p>
</blockquote></li>
</ol>

However, adding an additional paragraph after the blockquote, or appending another list item after it, fixes it:

- foo

    foo
- foo

    > foo

    bar

---

- foo

    foo
- foo

    > foo
- bar

yields:

<ul>
<li><p>foo</p>
<p>foo</p></li>
<li><p>foo</p>
<blockquote>
<p>foo</p>
</blockquote>
<p>bar</p></li>
</ul>
<hr />
<ul>
<li><p>foo</p>
<p>foo</p></li>
<li><p>foo</p>
<blockquote>
<p>foo</p>
</blockquote></li>
<li><p>bar</p></li>
</ul>

Pandoc version:

$ ls -al /home/gwern/bin/bin/pandoc
-rwxr-xr-x 1 gwern gwern 102022976 Feb  9 09:27 /home/gwern/bin/bin/pandoc
$ pandoc --version
pandoc 2.6
Compiled with pandoc-types 1.17.5.4, texmath 0.11.2, skylighting 0.7.5

jgm added a commit that referenced this issue Feb 26, 2019
This improves on the original fix to #5285 by preventing
other mixed lists (lists with a mix of Plain and Para
elements) that were allowed given the original fix.
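
To illustrate what ruling out "mixed lists" could look like, here is a toy model in Python. It is not the code from the commit (pandoc itself is written in Haskell), and `normalize_list_items` is a hypothetical name. It assumes blocks shaped like pandoc's JSON output, e.g. {"t": "Plain", "c": [...]}, and assumes one possible rule: if any top-level block in any item is a Para, every top-level Plain in the list is promoted to Para.

```python
def normalize_list_items(items):
    """Make a list uniformly loose if any item holds a top-level Para.

    `items` is a list of list items; each item is a list of block dicts
    such as {"t": "Plain", "c": [...]}. Only top-level blocks of each
    item are inspected; blocks nested inside e.g. a BlockQuote are left
    untouched. The input is not modified.
    """
    is_loose = any(block["t"] == "Para" for item in items for block in item)
    if not is_loose:
        return items  # a fully tight list stays as it is
    return [
        [{**block, "t": "Para"} if block["t"] == "Plain" else block
         for block in item]
        for item in items
    ]


# Rough model of the broken <ul> case: the first item mixes Para and
# Plain, and the final item ends in a BlockQuote.
broken = [
    [{"t": "Para", "c": []}, {"t": "Plain", "c": []}],
    [{"t": "Para", "c": []}, {"t": "BlockQuote", "c": []}],
]
fixed = normalize_list_items(broken)
print([block["t"] for block in fixed[0]])  # ['Para', 'Para']
```

A real fix would also have to decide how nested lists are handled; this sketch deliberately leaves that out.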