ngSanitize can't escape all Unicode characters #5088
Comments
I can't think of a reason why we still need to do this. @mhevery any idea? I suggest that you send us a PR and we'll evaluate it there. At first glance it seems safe to remove, but we need to take a careful look before merging any change. In any case, this is a legitimate bug IMO; I'm just not sure of the constraints for the right solution.
Is it possible to construct a valid and safe UTF-8 string that would be an unsafe string in another encoding? I'm not sure I would be that eager to remove this.
Another fix could be to check if
The problem is that JavaScript uses surrogate pairs, see: http://stackoverflow.com/questions/3744721/javascript-strings-outside-of-the-bmp
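To make the surrogate-pair point concrete, here is a small sketch of what charCodeAt reports for a character outside the BMP. The variable names are only illustrative, and the "naive" encoding is an assumption about what a per-code-unit replace ends up producing, not a copy of the sanitize.js code.

```javascript
// U+1F600 (an emoji) written as its two UTF-16 code units.
const emoji = '\uD83D\uDE00';

// charCodeAt works on code units, so one visible character yields two values.
const units = [];
for (let i = 0; i < emoji.length; i++) {
  units.push(emoji.charCodeAt(i));
}

// codePointAt (ES2015) combines the pair into the real code point.
const codePoint = emoji.codePointAt(0);

// Encoding each code unit on its own gives two entities for lone surrogates,
// which is not a valid reference to the original character.
const naive = units.map((u) => '&#' + u + ';').join('');

console.log(emoji.length); // 2
console.log(units);        // [ 55357, 56832 ]
console.log(codePoint);    // 128512
console.log(naive);        // &#55357;&#56832;
```

The two halves (55357 and 56832) are each meaningless alone, which matches the garbled output reported in this issue.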
Hi, I ran into the same issue recently. We could handle surrogate pairs before escaping Unicode characters, as in the following diff. See also http://mdn.beonex.com/en/Core_JavaScript_1.5_Reference/Global_Objects/String/charCodeAt.html Thanks!
The encodeEntities function encodes non-alphanumeric characters to entities with charCodeAt. charCodeAt does not return the full code point when a character's Unicode code point is higher than 65,535. It returns one half of a surrogate pair, and this is why emoji, which have higher code points, are garbled. We need to handle them properly. Closes #5088 Closes #6911
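A minimal sketch of the approach described above, assuming surrogate pairs are matched and combined before the per-character replace runs. The names encodeEntitiesSketch, SURROGATE_PAIR and NON_PRINTABLE_ASCII are hypothetical, and the character class used for "everything else" is a simplification rather than the pattern in sanitize.js.

```javascript
// Hypothetical names; a sketch of the idea, not the actual sanitize.js source.
// A high surrogate followed by a low surrogate forms one code point above U+FFFF.
const SURROGATE_PAIR = /[\uD800-\uDBFF][\uDC00-\uDFFF]/g;
// Simplifying assumption: anything outside printable ASCII becomes an entity.
const NON_PRINTABLE_ASCII = /[^\x20-\x7E]/g;

function encodeEntitiesSketch(value) {
  return value
    // Escape ampersands first so entities produced below are not double-escaped.
    .replace(/&/g, '&amp;')
    // Combine each surrogate pair into its real code point before encoding.
    .replace(SURROGATE_PAIR, (pair) => {
      const hi = pair.charCodeAt(0);
      const lo = pair.charCodeAt(1);
      const codePoint = (hi - 0xD800) * 0x400 + (lo - 0xDC00) + 0x10000;
      return '&#' + codePoint + ';';
    })
    // Remaining characters are single code units, so charCodeAt is sufficient.
    .replace(NON_PRINTABLE_ASCII, (ch) => '&#' + ch.charCodeAt(0) + ';')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

console.log(encodeEntitiesSketch('\uD83D\uDE00')); // &#128512;
console.log(encodeEntitiesSketch('\u00E9'));       // &#233;
console.log(encodeEntitiesSketch('<b> & </b>'));   // &lt;b&gt; &amp; &lt;/b&gt;
```

The order matters: the pair replace has to run before the single-character replace, otherwise each half of the pair would already have been encoded separately.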
ngSanitize has a bug escaping Unicode characters whose code points are outside the range charCodeAt can return as a single value; see the fiddle below. Removing this replace function fixes the problem, but I was wondering why it is done in the first place. Unicode characters can be placed inside documents safely, especially since UTF-8 has become the standard charset these days.
http://jsfiddle.net/jtangelder/SQf7w/
angular.js/src/ngSanitize/sanitize.js
Line 370 in a7e12b7
I can do a PR if it's ok!