Validate characters passed through the Emoji constructor #793
Abstract
This experiment adds support for Emojis to be validated against Unicode's restrictions, which should prevent misuse of the Emoji class, and prevent accidental 400 errors from Discord when attempting to add a reaction with a bad payload.
Additions
Breaking: Should any string be passed that contains a character that is NOT valid in an Emoji, an exception will be thrown.[0]
This experiment uses a T4 template to fetch the latest Emoji data from Unicode and build a constant collection of valid emoji codepoints that will be cross-checked at runtime. This is exposed as an internal collection on Discord.Emoji.[1]
Changes
When a string is passed into the Discord.Emoji constructor on platforms targeting .NET Standard 1.3+ or .NET Framework 4.5, every character in the argument will be converted to a UTF-32 codepoint and verified against the collection of valid codepoints described above.[2]
Tests
This pull request is bundled with tests that should verify that most aspects of the new parser work as they should, including single-part and multi-part emojis. Discord-style Emotes are now covered by tests as well.[3]
This pull request has been tested in production, where it has parsed both single-part and multi-part Emojis correctly.[4]
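As a rough illustration of the per-codepoint check that these tests exercise, the sketch below walks a string one UTF-32 codepoint at a time and looks each one up in a set. The class name `EmojiCodepointCheck`, the method `IsValid`, and the tiny hand-written set are all hypothetical stand-ins: the PR builds its collection from Unicode's data with a T4 template, and the real constructor throws instead of returning a bool.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not the PR's actual code.
public static class EmojiCodepointCheck
{
    // Hand-written stand-in for the generated collection of valid codepoints.
    private static readonly HashSet<int> ValidCodepoints = new HashSet<int>
    {
        0x1F44D, // THUMBS UP SIGN
        0x1F468, // MAN
        0x1F469, // WOMAN
        0x1F467, // GIRL
        0x200D,  // ZERO WIDTH JOINER (joins multi-part emojis)
        0xFE0F,  // VARIATION SELECTOR-16
    };

    // Returns true when every UTF-32 codepoint in the string is in the set.
    public static bool IsValid(string text)
    {
        if (string.IsNullOrEmpty(text))
            return false;

        for (int i = 0; i < text.Length; i++)
        {
            // An unpaired surrogate cannot be converted to a codepoint.
            if (char.IsSurrogate(text[i]) && !char.IsSurrogatePair(text, i))
                return false;

            int codepoint = char.ConvertToUtf32(text, i);
            if (!ValidCodepoints.Contains(codepoint))
                return false;

            // A surrogate pair occupies two chars; skip the low surrogate.
            if (char.IsHighSurrogate(text[i]))
                i++;
        }

        return true;
    }
}
```

With this sketch, a single-part input such as `"\U0001F44D"` and a multi-part input such as `"\U0001F468\u200D\U0001F469\u200D\U0001F467"` both pass, while a plain string such as `"abc"` does not.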
Possible Caveats:
Some Emojis are only valid when used in conjunction with a marker character, e.g. 1️⃣, which is composed of a DIGIT ONE and a COMBINING ENCLOSING KEYCAP. Under this PR, a single DIGIT ONE will be parsed as valid. I'm unsure whether or not Discord will reject a reaction with this Emoji, since Unicode lists the digit character range (0x30..0x39) as both an Emoji and an Emoji_Component.[5]
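If the bare-digit case does turn out to be a problem, one possible follow-up is to treat keycap bases specially. The sketch below is only an assumption about how that could look and is not part of this PR. `KeycapCheck` and its methods are hypothetical names, and the accepted shape (a base of `0`-`9`, `#` or `*`, an optional VARIATION SELECTOR-16, then COMBINING ENCLOSING KEYCAP) follows Unicode's keycap sequences. Whether Discord accepts or rejects a bare digit is still unknown.

```csharp
using System;

// Hypothetical helper illustrating the caveat; not the PR's actual code.
public static class KeycapCheck
{
    private const int VariationSelector16 = 0xFE0F;
    private const int CombiningEnclosingKeycap = 0x20E3;

    // True for '#', '*' and the digits 0-9, the bases of keycap sequences.
    private static bool IsKeycapBase(char c)
        => c == '#' || c == '*' || (c >= '0' && c <= '9');

    // True when the text is a keycap base, an optional VS16, then U+20E3.
    public static bool IsKeycapSequence(string text)
    {
        if (string.IsNullOrEmpty(text) || !IsKeycapBase(text[0]))
            return false;

        int i = 1;
        if (i < text.Length && text[i] == VariationSelector16)
            i++;

        // Exactly one char must remain, and it must be the keycap mark.
        return i == text.Length - 1 && text[i] == CombiningEnclosingKeycap;
    }

    // True when the text is only a keycap base with no keycap mark,
    // i.e. the input this PR currently accepts as valid.
    public static bool IsBareKeycapBase(string text)
        => text != null && text.Length == 1 && IsKeycapBase(text[0]);
}
```

A caller could use `IsBareKeycapBase` to warn about or reject inputs like `"1"` while still accepting `"1\uFE0F\u20E3"`.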