Furthermore, it does not handle other HTML entities well. There are well over 1,000 HTML entities (see list), and their values are simply tossed out with EvEntityRef. I am processing Wikipedia dumps and will encounter a wide range of them.
Why are only 4/5 entities processed, when any entity can occur in a text field? There shouldn't be a security concern, as a motivation for using entities is security.
Also, why are entities treated as an event at all? It'd be nice to have the option to disable this functionality so one could simply get all the text in a single `EvText()` event.
Hopefully, there can be a way to either:
- enable `EvEntityRef` to process all HTML entities, or
- disable `EvEntityRef` events from occurring and breaking up `EvText()` events.
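Until one of these exists, the split events could be merged back together after parsing. The following is only a minimal sketch, assuming the events arrive as a sequence of text and entity-reference values. The case classes are local stand-ins named after the `EvText` and `EvEntityRef` events discussed here, not the library's own classes, so the snippet compiles on its own. `EvOther`, `EntityCoalescer`, `entityTable`, `resolve`, and `coalesce` are hypothetical names, and the table holds just a handful of entities for illustration.

```scala
// Local stand-ins for the events discussed in this issue (assumption: not the library's classes).
sealed trait Event
final case class EvText(text: String) extends Event
final case class EvEntityRef(entity: String) extends Event
final case class EvOther(label: String) extends Event // hypothetical: any non-text event

object EntityCoalescer {
  // Hypothetical, deliberately tiny table; a real one would cover the full HTML entity list.
  val entityTable: Map[String, String] = Map(
    "amp" -> "&", "lt" -> "<", "gt" -> ">", "quot" -> "\"", "apos" -> "'",
    "nbsp" -> "\u00A0", "eacute" -> "\u00E9", "ndash" -> "\u2013"
  )

  // Resolve an entity name; unknown names are kept verbatim as "&name;" rather than dropped.
  def resolve(name: String): String =
    entityTable.getOrElse(name, s"&$name;")

  // Merge each run of adjacent EvText/EvEntityRef events into a single EvText.
  def coalesce(events: Seq[Event]): Seq[Event] = {
    val out = Vector.newBuilder[Event]
    val buf = new StringBuilder
    // Emit the buffered text, if any, as one EvText and reset the buffer.
    def flush(): Unit = if (buf.nonEmpty) { out += EvText(buf.toString); buf.clear() }
    events.foreach {
      case EvText(t)      => buf.append(t)
      case EvEntityRef(n) => buf.append(resolve(n))
      case other          => flush(); out += other
    }
    flush()
    out.result()
  }
}
```

With this sketch, `EntityCoalescer.coalesce(Seq(EvText("caf"), EvEntityRef("eacute")))` yields a single `EvText("café")`. Keeping unknown entities as `&name;` is a design choice made here so that nothing is silently lost; it is not how the parser itself behaves.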
(Created a new issue from a comment #72 (comment) by @Mark-L6n)
Example problem:
Output: