Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Header-like structures in content should be escaped #76

Closed
marijnvdwerf opened this issue Nov 5, 2015 · 5 comments
Closed

Header-like structures in content should be escaped #76

marijnvdwerf opened this issue Nov 5, 2015 · 5 comments

Comments

@marijnvdwerf
Copy link

marijnvdwerf commented Nov 5, 2015

Input:

<p>Foo<br>--<br>Bar</p>
<p>Foo<br>Bar<br>--</p>

Actual:

Foo  
--  
Bar

Foo  
Bar  
--

Expected:

Foo  
\--  
Bar

Foo  
Bar  
\--
@andreskrey
Copy link
Contributor

I don't understand why the last -- is not escaped. What's the logic behind this? Only the -- tags should be escaped if they have more text below?

@marijnvdwerf
Copy link
Author

This was changed in version 0.24 of the spec. 'setext' headings couldn't span multiple lines before. Updated the expected output, thanks!

@andreskrey
Copy link
Contributor

I could fix this in a similar way I did for #77, but since there are other escaping issues waiting to be fixed, adding new rules to that same chunk of code will make it quite messy.

I'm thinking of creating a separate function, like escapeSpecialCharacters, that will call other functions (like escapeBlockquotelikeCharacters, escapeHeaderlikeCharacters, etc) but I'm not sure what's the correct approach.

Since I'll be adding new functions to the ParagraphConverter class, should I also create a proper interface for it that would define this escaping functions? Is this necessary? Is there a better approach for this issue? Perhaps instead of escaping this characters at the Converter level, another solution could be to create a function inside the HtmlConverter Class, within the convertChildren function and before the convertToMarkdown call, to catch the string before the convertion and sanitize it there.

Any thoughts @colinodell?

@andreskrey
Copy link
Contributor

I just went ahead and created a PR for this. It's #105

@colinodell
Copy link
Member

Fixed via #105

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants