[Draft] refactor Entitize method, use HtmlEncode, add unit tests #325

EvgeniyZ
Contributor

@EvgeniyZ EvgeniyZ commented Sep 5, 2019

Proposal with a fix for #323.

@JonathanMagnan hi there, can you please take a look at the changes to the Entitize method? I strongly suggest using the standard System.Net.WebUtility.HtmlEncode method to entitize Unicode characters.
If it looks OK, I can improve the DeEntitize method as well by using System.Net.WebUtility.HtmlDecode instead of the current complicated logic. I also think it would be really useful to get rid of the _entityName and _entityValue dictionaries.
I'm aware of the entitizeQuotAmpAndLtGt parameter, but I wasn't sure when we should skip entitizing these characters.
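
For readers unfamiliar with the standard methods mentioned above, here is a minimal sketch of what System.Net.WebUtility.HtmlEncode and HtmlDecode do on their own, outside HtmlAgilityPack. The class name WebUtilityDemo is only illustrative and is not part of this pull request.

```csharp
using System;
using System.Net;

public static class WebUtilityDemo
{
    public static void Main()
    {
        string raw = "<span>é</span>";

        // HtmlEncode escapes markup characters (<, >, &, quotes) and some
        // non-ASCII characters such as é, so the tags are encoded as well.
        string encoded = WebUtility.HtmlEncode(raw);
        Console.WriteLine(encoded);

        // HtmlDecode reverses the encoding, so this prints the original string.
        Console.WriteLine(WebUtility.HtmlDecode(encoded));

        // HtmlDecode also understands named entities such as &eacute;.
        Console.WriteLine(WebUtility.HtmlDecode("&eacute;"));
    }
}
```

Note that HtmlEncode does not distinguish between markup and text, which is relevant to the entitizeQuotAmpAndLtGt question above.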

@EvgeniyZ EvgeniyZ changed the title refactor Entitize method, use HtmlEncode, add unit tests [Draft] refactor Entitize method, use HtmlEncode, add unit tests Sep 5, 2019
@JonathanMagnan
Member

Hello @EvgeniyZ,

We will look at it, thanks for the pull request ;)

What I'm most afraid of with this change is backward compatibility, but perhaps everything is perfect.

Best Regards,

Jonathan


@JonathanMagnan
Member

Hello @EvgeniyZ ,

The latest version contains your code.

However, you need to turn the UseWebUtility option on to make it work: https://github.com/zzzprojects/html-agility-pack/blob/master/src/HtmlAgilityPack.Shared/HtmlEntity.cs#L27

The reason is backward compatibility. In the past, we made a similar change that we had to roll back because people reported issues in their code.
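
Turning the option on might look like the sketch below. It assumes the HtmlAgilityPack NuGet package is referenced and that UseWebUtility is a static property on HtmlEntity, as the linked source line suggests; the class name UseWebUtilityDemo is hypothetical.

```csharp
using System;
using HtmlAgilityPack;

public static class UseWebUtilityDemo
{
    public static void Main()
    {
        // Assumption: UseWebUtility is a static flag on HtmlEntity, so
        // setting it affects every later Entitize call in the process.
        HtmlEntity.UseWebUtility = true;

        // With the flag on, Entitize is expected to go through the
        // WebUtility-based code path from this pull request.
        Console.WriteLine(HtmlEntity.Entitize("<span>é</span>"));
    }
}
```

Because the flag is opt-in, code that never sets it should keep the previous Entitize behavior.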

Let me know if everything works as expected.

Best Regards,

Jonathan

@escobar5

I am using Entitize to convert all special characters in an HTML string to their respective entities. With UseWebUtility = false, Entitize leaves my HTML tags unencoded, which is what I expect. For example,

HtmlAgilityPack.HtmlEntity.Entitize("<span>é</span>")

returns <span>&eacute;</span>

But with UseWebUtility = true, the same call returns &lt;span&gt;&#233;&lt;/span&gt;

I need the HTML tags left intact, as in the first example. The problem with UseWebUtility = false is that some emojis are not encoded correctly, as described in #323.
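
One possible workaround, sketched here under stated assumptions, is to encode only the non-ASCII characters yourself and leave the markup untouched. The helper below is hypothetical and not part of HtmlAgilityPack. It always emits numeric entities (so é becomes &#233; rather than &eacute;), and it combines surrogate pairs into a single code point so that emojis are encoded as one entity.

```csharp
using System;
using System.Globalization;
using System.Text;

// Hypothetical helper, not part of HtmlAgilityPack.
public static class NonAsciiEntitizer
{
    public static string EntitizeNonAscii(string html)
    {
        if (string.IsNullOrEmpty(html)) return html;

        var sb = new StringBuilder(html.Length);
        for (int i = 0; i < html.Length; i++)
        {
            char c = html[i];
            if (c < 128)
            {
                // ASCII, including <, >, & and quotes, is copied unchanged.
                sb.Append(c);
            }
            else if (char.IsHighSurrogate(c) && i + 1 < html.Length
                     && char.IsLowSurrogate(html[i + 1]))
            {
                // A surrogate pair (e.g. an emoji) becomes one numeric entity.
                int codePoint = char.ConvertToUtf32(c, html[i + 1]);
                AppendEntity(sb, codePoint);
                i++; // skip the low surrogate that was just consumed
            }
            else if (char.IsSurrogate(c))
            {
                // An unpaired surrogate has no valid entity; keep it as-is.
                sb.Append(c);
            }
            else
            {
                AppendEntity(sb, c);
            }
        }
        return sb.ToString();
    }

    private static void AppendEntity(StringBuilder sb, int codePoint)
    {
        sb.Append("&#")
          .Append(codePoint.ToString(CultureInfo.InvariantCulture))
          .Append(';');
    }
}
```

With this sketch, NonAsciiEntitizer.EntitizeNonAscii("<span>é</span>") returns <span>&#233;</span>. The trade-off is that a literal & or < inside text content is not escaped, so it only suits input that is already valid HTML.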
