Changelog
Xav edited this page May 16, 2015 · 25 revisions
- Remove annoying `console.log`
- Few new Brill rules
- Better looking example page + readme screenshot
- Fix bug that skipped a lot of emoticons when building lexicons
- Verbs
  - Irregular verbs conjugation + integration in lexicon
  - Regular verbs in lexicon
- Basic tense detection (for simple sentences, based on dependency parsing)
- Numerous new Brill rules for PoS tagging (92.519% on Penn Treebank)
- Improved dependency parsing
- Trie class interface
- Bit of code documentation
- Sentence detectors are now applied directly in the analysis sentence loop (no longer in a dedicated second loop)
- New attributes for tokens (`is_verb`, `infinitive`, `is_noun`, `plural`, `singular`)
- `*in` > `*ing` inference (if a word ends with `in`, is not in lexicon, and the same word plus `g` exists in lexicon, then infer it as `VBG`)
- New tests
- Basically working dependency parsing
- Bug fixes/improvements
- Bit of project cleanup
- Move benchmark folder to test/
- Remove find utility (use grep!)
- Improved token PoS tagging (+0.8% on Penn Treebank!):
  - Order of detectors changed
  - Better management of compound words
- First step of scaffolding for dependency parsing feature
- All regular verbs now conjugated (and/or conjugable)
- PoS tagging for verbs greatly improved
- Better packing of verbs and nationalities (-2 kB)
- Better filtering of lexicon (-1 kB)
- Reorganised the project a bit
  - Lexicon data files moved to src/lexicon
  - Compendium data files moved to src/dictionaries
- Lots of new tests (isSingular, verbs, lexicon...)
- Refactored detectors API so it's a bit less verbose
- Better sentiment profiling for mixed sentiment, in particular when using multiple adverbs
- Politeness, dirtiness scores
- Synonyms feature for token normalization
  - Used by PoS tagger in case no other method returned a tag
- Add `interrogative` and `exclamatory` sentence types
- Fix low confidence for obvious PoS tagging (CD, SYM...)
- [Gulpfile] Add test run on live rebuild
- Statistics skips punctuation tokens
- Improve verb inflector
- Better sentiment profiling
- Better breakpoint detection
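
The `*in` > `*ing` inference listed above can be illustrated with a minimal sketch. This is not the library's actual code: `inferGerundTag` is a hypothetical helper, and the lexicon is assumed here to be a plain object mapping lowercase words to PoS tags, which may differ from how Compendium stores it internally.

```javascript
// Hypothetical helper, NOT part of the Compendium-js API.
// Assumption: `lexicon` is a plain object mapping lowercase words to PoS tags.
function inferGerundTag(word, lexicon) {
  const has = (w) => Object.prototype.hasOwnProperty.call(lexicon, w);
  const lower = word.toLowerCase();

  // The rule only applies to unknown words ending with "in".
  if (has(lower) || !lower.endsWith("in")) {
    return null;
  }

  // "walkin" -> "walking": infer VBG only if the extended form is known.
  return has(lower + "g") ? "VBG" : null;
}

// Tiny sample lexicon, made up for this example.
const sampleLexicon = { walking: "VBG", cabin: "NN" };

console.log(inferGerundTag("walkin", sampleLexicon)); // VBG
console.log(inferGerundTag("cabin", sampleLexicon));  // null (already in lexicon)
console.log(inferGerundTag("robin", sampleLexicon));  // null ("robing" is not in lexicon)
```

Returning `null` when the rule does not apply leaves room for other detectors to tag the word.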
Compendium-js, English NLP for Node.js and the browser, MIT Licensed