An off-the-shelf client-side language identification module for JavaScript.

saffsd/langid.js

Introduction
------------
`langid.js` is a direct port of the language identifier implemented by `langid.py`.
The theory behind the method is described in two published research papers
[1,2]. `langid.js` does not implement model training; instead, it provides a tool,
`ldpy2ldjs.py`, to convert models trained with the `langid.py` training tools.

Demonstration
-------------
Open `demo.html` in a browser. `langid.js` uses typed arrays, so a browser that supports
them is required.
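
To check for typed-array support before loading the demo, a feature test along the following lines may help. This is a minimal sketch: the helper name `supportsTypedArrays` is hypothetical, and the list of constructors is an assumption, since this README does not say which typed arrays `langid.js` relies on.

```javascript
// Hypothetical helper, not part of langid.js.
// Returns true when every typed-array constructor listed below exists on `scope`.
// The list is an assumption; trim it to the types the classifier actually uses.
function supportsTypedArrays(scope) {
  var required = [
    'ArrayBuffer',
    'Uint8Array',
    'Uint16Array',
    'Uint32Array',
    'Float32Array',
    'Float64Array'
  ];
  return required.every(function (name) {
    return typeof scope[name] === 'function';
  });
}

// Pick the global object: `globalThis` where available, otherwise `this`
// (which is `window` in a classic, non-strict browser script).
var globalScope = typeof globalThis !== 'undefined' ? globalThis : this;

if (!supportsTypedArrays(globalScope)) {
  console.log('Typed arrays are not available; langid.js is unlikely to work here.');
}
```

In a page, the same check could be used to show a fallback message instead of running the classifier.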

Usage
-----
The models and the classifier itself are distributed as two separate JavaScript files, and both
must be included in a page for `langid.js` to work. In this repository, I initially provide
`langid-model-acquis.js`, a toy 4-language model based only on JRC-Acquis data, useful for
testing and development, as well as `langid-model-full.js`, the same model that
is packaged by default with `langid.py`.
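
As a rough illustration of including both files, the sketch below builds the two `<script>` tags for a page. It rests on several assumptions not confirmed by this README: that the classifier file is named `langid.js`, that both files are served from the same directory, and that the model should be loaded before the classifier. The helper `langidScriptTags` is hypothetical; check `demo.html` for the arrangement the project actually uses.

```javascript
// Hypothetical helper, not part of langid.js.
// Builds the markup that includes one model file plus the classifier.
// Assumptions: the classifier file is called "langid.js", both files live
// under `basePath`, and the model is loaded first.
function langidScriptTags(modelFile, basePath) {
  var base = basePath || '';
  return [modelFile, 'langid.js'].map(function (file) {
    return '<script src="' + base + file + '"></script>';
  }).join('\n');
}

// Toy model for testing and development:
console.log(langidScriptTags('langid-model-acquis.js'));

// Full model, served from a hypothetical "js/" directory:
console.log(langidScriptTags('langid-model-full.js', 'js/'));
```

Swapping models then only means changing the first file name; the classifier include stays the same.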

References
----------
[1] http://aclweb.org/anthology-new/I/I11/I11-1062.pdf
[2] http://www.aclweb.org/anthology/P/P12/P12-3005.pdf
