Change escape function to fix bug with default parse5 parser #85
Yes, `decodeEntities` is an option for htmlparser2, and only when `xmlMode`
or `_useHtmlParser2` is true will cheerio 1.0 use it.
curbengh <notifications@github.com> wrote on Thursday, August 15, 2019 at 5:27 PM:
Can you help me clarify something,
in your previous PR #80, I need to enable
decodeEntities to have a valid render like
<!-- some mandarin comment 取值范围 -->
<LinearLayout xmlns:android="http://schemas.android.com/apk/res/android"
android:layout_width="match_parent"
android:layout_height="match_parent"
android:orientation="horizontal"
>
If I disable decodeEntities, it will be rendered to
&amp;lt;!-- some mandarin comment 取值范围 --&amp;gt;
&amp;lt;LinearLayout xmlns:android=&quot;http://schemas.android.com/apk/res/android&quot;
android:layout_width=&quot;match_parent&quot;
android:layout_height=&quot;match_parent&quot;
android:orientation=&quot;horizontal&quot;
&amp;gt;
------------------------------
In this PR, since decodeEntities is not supported by the default parse5,
enabling or disabling the option has no effect; both result in a valid
rendered file.
So, my understanding is that decodeEntities is only relevant when either
xmlMode or _useHtmlParser2 is true, is that correct?
--
*Alynx Zhou*
*A Coder & Dreamer*
*Intern in SUSE Beijing*
*Student of Beijing Jiaotong University*
*School of Computer and Information Technology*
it is not supported by the default parse5 cheeriojs/dom-serializer#85
For people who need this PR: I found that https://github.com/cheeriojs/cheerio/blob/c635bea08f72f6670977674d40d44a5edd7f4a31/lib/static.js#L106 in cheerio v1.0.0 fixes this problem in a better way, so I will close this PR.
Cheerio 1.0.0-RC3 uses parse5 instead of htmlparser2 for HTML parsing, but parse5 decodes all entities (including those for `<` and `>`) whether `decodeEntities` is true or false, so we cannot use this option to decide whether re-encoding is needed. In addition, users who write non-ASCII characters may not want those characters turned into XML entities.

After this PR, if we are using htmlparser2, it works as before. When we are using parse5 (the default), it only escapes HTML characters like `<>&"'` and does not encode the user's characters.

Also, `normalizeWhitespace` is not supported by parse5 either, so I removed the related test cases.

Solves cheeriojs/cheerio#1198. I suggest reading my comments in that issue to get a clear view of what the problem is and how it happens.
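
To make the two behaviors concrete, here is a minimal sketch of the difference between escaping only HTML characters and also encoding non-ASCII characters. It is an illustration under assumptions, not the code in this PR: the names `HTML_ESCAPES`, `escapeHtmlChars`, and `encodeNonAscii` are hypothetical, and the exact entity forms (for example `&#39;` for the apostrophe) are chosen for the example only.

```javascript
// Hypothetical sketch; none of these names come from dom-serializer.
const HTML_ESCAPES = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#39;',
};

// Escape only the five HTML-significant characters. Everything else,
// including non-ASCII text, passes through unchanged.
function escapeHtmlChars(text) {
  return text.replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}

// Additionally replace every non-ASCII code point with a hexadecimal
// numeric character reference. The `u` flag makes the regex match whole
// code points rather than UTF-16 surrogate halves.
function encodeNonAscii(text) {
  return escapeHtmlChars(text).replace(
    /[^\x00-\x7F]/gu,
    (ch) => `&#x${ch.codePointAt(0).toString(16).toUpperCase()};`
  );
}

console.log(escapeHtmlChars('<!-- 取值范围 -->'));
console.log(encodeNonAscii('<!-- 取值范围 -->'));
```

The first call leaves the Mandarin text readable in the output, while the second turns each character into an entity. Because `escapeHtmlChars` always escapes `&`, it assumes the input text has already been entity-decoded, which matches the parse5 behavior described above.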