Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pathological input: Unclosed inline links (another) #218

Closed
mity opened this issue Jul 12, 2017 · 10 comments · Fixed by github/cmark-gfm#49
Closed

Pathological input: Unclosed inline links (another) #218

mity opened this issue Jul 12, 2017 · 10 comments · Fixed by github/cmark-gfm#49

Comments

@mity
Copy link
Contributor

mity commented Jul 12, 2017

This is similar to #214 but with input generated by this:

$ python -c 'print("[a](<b" * 50000)'

Current master head (88c8ad1):

$ time python -c 'print("[a](<b" * 50000)' | ./cmark > /dev/null

real    0m15.341s
user    0m0.000s
sys     0m0.030s

Again 0.27.1 was not affected:

$ time python -c 'print("[a](<b" * 50000)' | ./cmark > /dev/null

real    0m0.109s
user    0m0.015s
sys     0m0.000s

cc @kivikakk

@mity
Copy link
Contributor Author

mity commented Jul 12, 2017

Given that the kind of link destination enclosed in < and > cannot contain nested <, the parser should stop quite soon, but, I guess, it does not, leading to quadratic complexity.

@mity
Copy link
Contributor Author

mity commented Jul 12, 2017

Or maybe I am wrong and it is just a duplicate of #214. The parser may refuse the link destination enclosed in '<' soon, but then it retries the other link destination type which hits the issue #214.

Sorry I need to leave now so cannot verify. Feel free to close if that's really the case.

@kivikakk
Copy link
Contributor

Looks like it's a valid bug; this is on the latest github/cmark which has the fix for #214 in:

root@6bc5725a6bd5:/src/cmark# git rev-parse HEAD
175542b2f37fb854117580268060d6cd19ef78bf
root@6bc5725a6bd5:/src/cmark# python -c 'print("[a](<b" * 50000)' > path2.md
root@6bc5725a6bd5:/src/cmark# time build/src/cmark-gfm < path2.md > /dev/null

real    0m14.234s
user    0m14.210s
sys     0m0.000s

@kivikakk
Copy link
Contributor

There's no rejection of the link dest in <; it only aborts on EOL, whitespace or >. We should just deny nested < in the <…> autolinks.

@kivikakk
Copy link
Contributor

There's a regression test which expects this to work, actually: https://github.com/jgm/cmark/blob/88c8ad11b687c756146d27d31b5a30f3cf13a75e/test/regression.txt#L77-L85

@jgm, any thoughts on relaxing this?

@jgm
Copy link
Member

jgm commented Jul 13, 2017

The regression test was added in response to #193.

@jgm
Copy link
Member

jgm commented Jul 13, 2017

Looking at this more closely, I think I my fix for the bug in #193 was wrong.
I've reopened that issue; see my comments there.

@jgm
Copy link
Member

jgm commented Jul 13, 2017

Anything that fixes this should add the case to the pathological tests in the test suite.

@mity mity changed the title Yet another pathological input Pathological input: Unclosed inline links (another) Jul 25, 2017
@DemiMarie
Copy link

DemiMarie commented Aug 6, 2017

MD4C is immune to the bug:

$ python -c 'print("[a](<b"*50000)' | time build/md2html/md2html >/dev/null
build/md2html/md2html > /dev/null  0.11s user 0.00s system 91% cpu 0.121 total

But Comrak is vulnerable:

$ python -c 'print("[a](<b"*50000)' | time comrak >/dev/null
comrak > /dev/null  6.78s user 0.01s system 99% cpu 6.810 total

And pulldown-cmark:

$ python -c 'print("[a](<b"*50000)' | time pulldown-cmark >/dev/null
pulldown-cmark > /dev/null  9.80s user 0.00s system 99% cpu 9.831 total

@jgm
Copy link
Member

jgm commented Mar 18, 2019

This seems to be fixed, and there is already a pathological test for this.
Current time for cmark HEAD is 0.12 s for n = 50000.

@jgm jgm closed this as completed Mar 18, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants