Release v1.2.0: Alpha tokenizers for Chinese, French, Spanish, Italian and Portuguese · explosion/spaCy

✨ Major features and improvements

NEW: Support for Chinese tokenization via Jieba.
NEW: Alpha support for French, Spanish, Italian and Portuguese tokenization.

🔴 Bug fixes

Fix issue #376: POS tags for "and/or" are now correct.
Fix issue #578: --force argument on download command now operates correctly.
Fix issue #595: Lemmatization corrected for some base forms.
Fix issue #588: Matcher now rejects empty patterns.
Fix issue #592: Added exception rule for tokenization of "Ph.D."
Fix issue #599: Empty documents are now considered tagged and parsed.
Fix issue #600: Add missing token.tag and token.tag_ setters.
Fix issue #596: Added the unicode import that was missing when compiling regexes; its absence led to incorrect tokenization.
Fix issue #587: Resolved bug that caused Matcher to sometimes segfault.
Fix issue #429: Ensure missing entity types are added to the entity recognizer.
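
To illustrate what a tokenizer exception rule such as the one for "Ph.D." (issue #592) does, here is a minimal sketch in plain Python. It is not spaCy's implementation: the names `SPECIAL_CASES`, `PREFIX_CHARS`, `SUFFIX_CHARS`, `tokenize` and `_split_chunk` are hypothetical, and the punctuation sets are assumptions chosen for the example. The idea shown is only that a chunk is checked against an exception table before any further punctuation is split off.

```python
# Hypothetical exception table: strings that must survive as single tokens.
SPECIAL_CASES = frozenset({"Ph.D.", "e.g.", "i.e."})

# Assumed punctuation sets for this sketch (not spaCy's actual rules).
PREFIX_CHARS = "([\"'"
SUFFIX_CHARS = ")]\"'.,;:!?"


def _split_chunk(chunk, special_cases):
    """Peel punctuation off both ends, stopping once an exception matches."""
    prefixes, suffixes = [], []
    while chunk and chunk not in special_cases:
        if chunk[0] in PREFIX_CHARS:
            prefixes.append(chunk[0])
            chunk = chunk[1:]
        elif chunk[-1] in SUFFIX_CHARS:
            suffixes.append(chunk[-1])
            chunk = chunk[:-1]
        else:
            break
    middle = [chunk] if chunk else []
    # Suffixes were collected from the outside in, so reverse them.
    return prefixes + middle + suffixes[::-1]


def tokenize(text, special_cases=SPECIAL_CASES):
    """Split on whitespace, then split punctuation unless an exception applies."""
    tokens = []
    for chunk in text.split():
        tokens.extend(_split_chunk(chunk, special_cases))
    return tokens


if __name__ == "__main__":
    print(tokenize("She holds a Ph.D."))
    print(tokenize("She holds a Ph.D.", special_cases=frozenset()))
```

With the exception table in place, "Ph.D." stays a single token; with an empty table, the trailing period is split off, which is the kind of behaviour an exception rule is meant to prevent.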