This repository has been archived by the owner on Apr 12, 2024. It is now read-only.

Commit

fix(ngSanitize): encode surrogate pair properly
The encodeEntities function encodes non-alphanumeric characters to entities with charCodeAt.
charCodeAt returns a single UTF-16 code unit, so it cannot return the full value of a character whose Unicode code point is higher than 65,535.
Such a character is stored as a surrogate pair, and each half was encoded separately, which is why emoji with higher code points were garbled.
We need to handle them properly.
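
The problem described above can be reproduced with a minimal sketch. This is an illustration, not code from the commit: `naiveEncode` is a hypothetical stand-in for the pre-fix behavior, and it reuses only the `NON_ALPHANUMERIC_REGEXP` pattern shown in `sanitize.js`.

```javascript
// Hypothetical stand-in for the pre-fix encoder; not the actual AngularJS source.
var NON_ALPHANUMERIC_REGEXP = /([^\#-~| |!])/g;

function naiveEncode(value) {
  // Without the `u` flag, the regexp matches one UTF-16 code unit at a time.
  return value.replace(NON_ALPHANUMERIC_REGEXP, function (ch) {
    return '&#' + ch.charCodeAt(0) + ';';
  });
}

// U+1F436 (dog face) is stored as two UTF-16 code units.
var dog = String.fromCharCode(0xD83D, 0xDC36);

var unitCount = dog.length;        // 2 code units for one character
var first = dog.charCodeAt(0);     // 55357 (0xD83D), the high surrogate
var second = dog.charCodeAt(1);    // 56374 (0xDC36), the low surrogate
var garbled = naiveEncode(dog);    // '&#55357;&#56374;', two invalid entities
```

Each surrogate half ends up as its own numeric entity, and neither refers to a valid character on its own, which matches the garbled output the commit message reports.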

Closes #5088
Closes #6911
memolog authored and caitp committed May 2, 2014
1 parent b6aec56 commit 3d0b49c
Showing 2 changed files with 11 additions and 0 deletions.
6 changes: 6 additions & 0 deletions src/ngSanitize/sanitize.js
@@ -161,6 +161,7 @@ var START_TAG_REGEXP =
  COMMENT_REGEXP = /<!--(.*?)-->/g,
  DOCTYPE_REGEXP = /<!DOCTYPE([^>]*?)>/i,
  CDATA_REGEXP = /<!\[CDATA\[(.*?)]]>/g,
  SURROGATE_PAIR_REGEXP = /[\uD800-\uDBFF][\uDC00-\uDFFF]/g,
  // Match everything outside of normal chars and " (quote character)
  NON_ALPHANUMERIC_REGEXP = /([^\#-~| |!])/g;

@@ -399,6 +400,11 @@ function decodeEntities(value) {
function encodeEntities(value) {
  return value.
    replace(/&/g, '&amp;').
    replace(SURROGATE_PAIR_REGEXP, function (value) {
      var hi = value.charCodeAt(0);
      var low = value.charCodeAt(1);
      return '&#' + (((hi - 0xD800) * 0x400) + (low - 0xDC00) + 0x10000) + ';';
    }).
    replace(NON_ALPHANUMERIC_REGEXP, function(value){
      return '&#' + value.charCodeAt(0) + ';';
    }).
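
The arithmetic in the new `SURROGATE_PAIR_REGEXP` callback can be checked in isolation. The sketch below is an illustration rather than part of the commit; `surrogatePairToCodePoint` is a hypothetical helper name that only wraps the same expression.

```javascript
// Hypothetical helper wrapping the expression used in the callback above.
function surrogatePairToCodePoint(hi, low) {
  // Each surrogate carries 10 bits of (codePoint - 0x10000):
  // the high surrogate holds the top 10 bits, the low surrogate the bottom 10.
  return ((hi - 0xD800) * 0x400) + (low - 0xDC00) + 0x10000;
}

// U+1F436 (dog face), the character used in the accompanying test.
var codePoint = surrogatePairToCodePoint(0xD83D, 0xDC36);   // 128054
var entity = '&#' + codePoint + ';';                        // '&#128054;'

// The extremes of the surrogate ranges map to the ends of the supplementary planes.
var lowest = surrogatePairToCodePoint(0xD800, 0xDC00);      // 0x10000
var highest = surrogatePairToCodePoint(0xDBFF, 0xDFFF);     // 0x10FFFF
```

Because the resulting entity consists only of `&`, `#`, digits, and `;`, all of which fall inside the `#-~` range that `NON_ALPHANUMERIC_REGEXP` excludes, the later replace leaves it untouched.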
5 changes: 5 additions & 0 deletions test/ngSanitize/sanitizeSpec.js
@@ -239,6 +239,11 @@ describe('HTML', function() {
  expect(html).toEqual('<div>');
});

it('should handle surrogate pair', function() {
  writer.chars(String.fromCharCode(55357, 56374));
  expect(html).toEqual('&#128054;');
});

describe('explicitly disallow', function() {
  it('should not allow attributes', function() {
    writer.start('div', {id:'a', name:'a', style:'a'});
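
The decimal values 55357 and 56374 in the test are the two surrogate halves of U+1F436. As a hedged illustration of where they come from, the sketch below runs the conversion in the opposite direction; `codePointToSurrogatePair` is a hypothetical helper and does not appear in the commit.

```javascript
// Hypothetical helper: split a supplementary code point into UTF-16 surrogates.
function codePointToSurrogatePair(codePoint) {
  var offset = codePoint - 0x10000;       // 20-bit value
  var hi = 0xD800 + (offset >> 10);       // top 10 bits
  var low = 0xDC00 + (offset & 0x3FF);    // bottom 10 bits
  return [hi, low];
}

var pair = codePointToSurrogatePair(128054);   // [55357, 56374]

// The pair rebuilds the same string the test passes to writer.chars().
var rebuilt = String.fromCharCode(pair[0], pair[1]);
var matchesTestInput = rebuilt === String.fromCharCode(55357, 56374);
```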
