Classification of Unicode codepoints by Unicode categories and scripts

the-type-founders/unicode-classifier-js

Unicode Classifier

This is a TypeScript module that you can use to classify codepoints by their category and script (writing system).

Installation

npm install @thetypefounders/unicode-classifier --save

Usage

import classify from '@thetypefounders/unicode-classifier';

console.log(classify([48, 49, 50, 65, 66, 67]));

// { Nd: { Common: [ 48, 49, 50 ] }, Lu: { Latin: [ 65, 66, 67 ] } }
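Since classify takes numeric codepoints, text usually has to be converted first. The following is a minimal sketch of one way to do that; toCodepoints is a hypothetical helper written for illustration and is not part of the package.

```typescript
// Hypothetical helper (not provided by the package): convert a string into
// the array of numeric codepoints that classify() takes as input.
function toCodepoints(text: string): number[] {
  // Array.from iterates a string by codepoint rather than by UTF-16 code
  // unit, so characters outside the Basic Multilingual Plane stay whole.
  return Array.from(text, (character) => character.codePointAt(0) as number);
}

console.log(toCodepoints('012ABC'));
// [ 48, 49, 50, 65, 66, 67 ]
```

The resulting array can be passed straight to classify; for '012ABC' it contains the codepoints that appear in the output above.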

If a codepoint is not found in the Unicode data, the classifier files it under the Unknown category and the Unknown script:

import classify from '@thetypefounders/unicode-classifier';

console.log(classify([56845]));

// { Unknown: { Unknown: [ 56845 ] } }
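The result is a nested object keyed by category and then by script, so it can be walked with ordinary object iteration. The sketch below assumes that shape, inferred from the example outputs rather than from the package's type definitions. The Classification type, both helper functions, and the sample value are illustrative assumptions; the sample simply combines the two outputs shown above and does not come from a real call.

```typescript
// Assumed result shape, inferred from the README examples:
// category -> script -> codepoints.
type Classification = Record<string, Record<string, number[]>>;

// Hypothetical helper: collect the codepoints the classifier could not place.
function listUnknown(result: Classification): number[] {
  const scripts = result['Unknown'] ?? {};
  return scripts['Unknown'] ?? [];
}

// Hypothetical helper: count codepoints per category across all scripts.
function countByCategory(result: Classification): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const category of Object.keys(result)) {
    let total = 0;
    for (const script of Object.keys(result[category])) {
      total += result[category][script].length;
    }
    counts[category] = total;
  }
  return counts;
}

// Illustrative value only: the two documented outputs merged by hand.
const sample: Classification = {
  Nd: { Common: [48, 49, 50] },
  Lu: { Latin: [65, 66, 67] },
  Unknown: { Unknown: [56845] },
};

console.log(listUnknown(sample));
// [ 56845 ]
console.log(countByCategory(sample));
// { Nd: 3, Lu: 3, Unknown: 1 }
```

Checking the Unknown bucket first is a simple way to catch invalid input before relying on the other categories.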

Data

Category and script data is downloaded automatically from the Unicode website. To update the data file, change UNICODE_VERSION in src/update.ts and run npm run update.
