Releases · explosion/spaCy

15 Oct 13:04

ines

v2.0.16

48b1bc4

v2.0.16: Fix msgpack-numpy pin

🔴 Bug fixes

Fix msgpack-numpy pin, which could affect serialization on Python 2.7.

15 Oct 12:01

ines

v2.0.15

7bc7fa8

v2.0.15: More wheels and GPU improvements

✨ New features and improvements

Improve version compatibility to support wheels for all spaCy dependencies maintained by us: thinc, cymem, preshed and murmurhash.
Support GPU installation by specifying spacy[cuda], spacy[cuda90], spacy[cuda91], spacy[cuda92] or spacy[cuda10], which will install cupy and thinc_gpu_ops.
Add spacy.prefer_gpu() and spacy.require_gpu() functions.

📖 Documentation and examples

Update GPU installation and usage docs.

13 Oct 21:48

ines

v2.0.13

9cfab59

v2.0.13: Wheels, alpha support for Telugu and Sinhala, rule-based lemmatization for French and Greek, plus various small fixes

✨ New features and improvements

NEW: Pre-built wheels and up to 10 times faster installation! This release starts the journey towards pre-built wheels for all of spaCy's dependencies. Once that's completed, you won't even need a local compiler anymore to install the library. For more details on our wheels process, see explosion/wheelwright.
NEW: Alpha support for Telugu and Sinhala.
NEW: Rule-based lemmatization for Greek and French.
Port over Chinese support (#1210) from v1.x.
Improve language data for Persian, Greek, Swedish, Bengali, Polish, Portuguese, Indonesian, French, German and Russian.
Add Span.ents property for consistency with Doc.ents.
Add --verbose option to spacy train to output more details for debugging.
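
To illustrate what a Span.ents property consistent with Doc.ents might return, here is a minimal plain-Python sketch. It is not spaCy's implementation: entities are modelled as hypothetical (start, end, label) tuples with token offsets, and the sketch assumes an entity only counts if it lies entirely inside the span.

```python
def ents_in_span(doc_ents, span_start, span_end):
    """Return the entities that fall entirely inside [span_start, span_end)."""
    return [
        (start, end, label)
        for start, end, label in doc_ents
        if start >= span_start and end <= span_end
    ]


# Hypothetical document-level entities as (start, end, label) token offsets.
doc_ents = [(0, 2, "PERSON"), (5, 6, "GPE"), (8, 10, "ORG")]
print(ents_in_span(doc_ents, 4, 9))
```

The ORG entity is left out here because it ends past the span boundary; whether partially overlapping entities should be included is a design choice this sketch answers with "no".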

🔴 Bug fixes

Fix issue #653: Introduce bulk merge function.
Fix issue #1445, #1917, #2209, #2362, #2371, #2383, #2501, #2743, #2758: Fix Keras examples.
Fix issue #2261, #2800: Fix bug that could cause a crash with too many entity types.
Fix issue #2540: Improve French stop words.
Fix issue #2582, #2640, #2645, #2657, #2705, #2784, #2815, #2841, #2845: Fix typos and inconsistencies in documentation.
Fix issue #2593: Prevent numpy warning.
Fix issue #2706: Add missing label FAC to spacy.explain glossary.
Fix issue #2709: Pass default option when calling getoption() in conftest.py.
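
The bulk merge introduced for #653 addresses a general problem: merging spans one at a time shifts the indices of every later token, so offsets computed up front go stale. A rough sketch of the single-pass idea in plain Python follows. The function name and the list-of-strings token model are assumptions for illustration, not spaCy's API.

```python
def merge_spans(tokens, spans):
    """Merge several non-overlapping (start, end) spans in a single pass.

    Offsets refer to the original token list, so they stay valid no
    matter how many spans are merged.
    """
    ends = {start: end for start, end in spans}
    merged = []
    i = 0
    while i < len(tokens):
        end = ends.get(i, i)
        if end > i:
            # Collapse tokens[i:end] into one token and skip past them.
            merged.append(" ".join(tokens[i:end]))
            i = end
        else:
            merged.append(tokens[i])
            i += 1
    return merged


tokens = ["I", "like", "New", "York", "and", "San", "Francisco"]
print(merge_spans(tokens, [(2, 4), (5, 7)]))
```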

📖 Documentation and examples

Improve Keras examples.
Update training examples to use minibatching.
Fix various typos and inconsistencies.
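
Minibatching, as mentioned above, means updating the model on small groups of examples rather than on one example or the whole dataset at a time. A minimal standard-library sketch of the batching step, assuming a fixed batch size (the helper is illustrative and is not the code used in the examples):

```python
from itertools import islice


def minibatch(items, size):
    """Yield successive lists of up to `size` items from any iterable."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


train_data = ["doc%d" % i for i in range(7)]
for batch in minibatch(train_data, 3):
    print(batch)  # in a training loop, each batch would feed one update step
```

Because the helper consumes an iterator lazily, it also works with data that is streamed rather than held in memory.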

👥 Contributors

Thanks to @DimaBryuhanov, @kororo, @AndriyMulyar, @katarkor, @giannisdaras, @bphi, @vikaskyadav, @sammous, @EmilStenstrom, @howl-anderson, @ohenrik, @aashishg, @aryaprabhudesai, @steve-prod, @njsmith, @aniruddha-adhikary, @pzelasko, @mbkupfer, @sainathadapa, @tyburam, @grivaz, @filipecaixeta, @aongko, @free-variation, @mauryaland, @pmj642, @keshan, @darindf, @charlax, @phojnacki, @skrcode, @jacopofar, @Cinnamy and @JKhakpour for the pull requests and contributions!

16 Aug 16:25

ines

v2.1.0a1

c0fa990

v2.1.0a1: New models, joint word segmentation and parsing, better Matcher, bug fixes & more

Pre-release

🌙 This is an alpha pre-release of spaCy v2.1.0, available on pip as spacy-nightly. It's not intended for production use.

pip install -U spacy-nightly

If you want to test the new version, we recommend using a new virtual environment. Also make sure to download the new models – see below for details and benchmarks.

✨ New features and improvements

Tagger, Parser & NER

NEW: Allow parser to do joint word segmentation and parsing. If you pass in data where the tokenizer over-segments, the parser now learns to merge the tokens.
Make parser, tagger and NER faster, through better hyperparameters.
Fix bugs in beam-search training objective.
Remove document length limit during training, by implementing faster Levenshtein alignment.
Use Thinc v6.11, which defaults to single-threaded execution with a fast OpenBLAS kernel. Parallelisation should be performed at the task level, e.g. by running more containers.
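
For context on the alignment mentioned above: when the gold-standard tokenization differs from the tokenizer's output, the two token sequences have to be aligned, and edit distance is one way to measure how far apart they are. The sketch below is the textbook dynamic-programming Levenshtein distance over token lists. It is only meant to show what is being computed, and it is not the faster implementation this release refers to.

```python
def levenshtein(a, b):
    """Edit distance between two token sequences (insert/delete/substitute)."""
    previous = list(range(len(b) + 1))
    for i, tok_a in enumerate(a, start=1):
        current = [i]
        for j, tok_b in enumerate(b, start=1):
            cost = 0 if tok_a == tok_b else 1
            current.append(min(
                previous[j] + 1,         # delete tok_a
                current[j - 1] + 1,      # insert tok_b
                previous[j - 1] + cost,  # match or substitute
            ))
        previous = current
    return previous[-1]


# Hypothetical over-segmented tokenizer output versus gold tokens.
predicted = ["New", "York", "-", "based"]
gold = ["New", "York-based"]
print(levenshtein(predicted, gold))
```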

Models & Language Data

NEW: Small accuracy improvements for parsing, tagging and NER for 6+ languages.
NEW: The English and German models are now available under the MIT license.
NEW: Statistical models for Greek.

CLI

NEW: New ud-train command, to train and evaluate using the CoNLL 2017 shared task data.
Check if model is already installed before downloading it via spacy download.
Pass additional arguments of download command to pip to customise installation.
Improve train command by letting GoldCorpus stream data, instead of loading into memory.
Improve init-model command, including support for lexical attributes and word-vectors, using a variety of formats. This replaces the spacy vocab command, which is now deprecated.
Add support for multi-task objectives to train command.
Add support for data-augmentation to train command.
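
spaCy models are installed as regular Python packages, so a check like the one added to spacy download can be approximated with the standard library. This is a sketch of the idea only; is_installed is a hypothetical helper, not the function the CLI uses.

```python
import importlib.util


def is_installed(package_name):
    """Return True if a top-level package can be found without importing it."""
    return importlib.util.find_spec(package_name) is not None


for name in ("json", "en_core_web_sm"):
    status = "already installed" if is_installed(name) else "needs download"
    print(name, "->", status)
```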

Other

NEW: Doc.retokenize context manager for merging tokens more efficiently.
NEW: Add support for custom pipeline component factories via entry points (#2348).
NEW: Implement fastText vectors with subword features.
NEW: Built-in rule-based NER component to add entities based on match patterns (see #2513).
Add warnings if .similarity method is called with empty vectors or without word vectors.
Improve rule-based Matcher and add return_matches keyword argument to Matcher.pipe to yield (doc, matches) tuples instead of only Doc objects, and as_tuples to add context to the Doc objects.
Make stop words via Token.is_stop and Lexeme.is_stop case-insensitive.
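
The new similarity warnings guard against a silent failure: with an all-zero vector, cosine similarity is undefined because the norm is zero. A plain-Python sketch of such a guard, assuming cosine similarity and a fallback value of 0.0 (the helper and the warning text are illustrative, not spaCy's):

```python
import math
import warnings


def cosine_similarity(vec_a, vec_b):
    """Cosine similarity that warns instead of dividing by zero."""
    norm_a = math.sqrt(sum(x * x for x in vec_a))
    norm_b = math.sqrt(sum(x * x for x in vec_b))
    if norm_a == 0.0 or norm_b == 0.0:
        warnings.warn("Similarity computed on an empty vector; returning 0.0")
        return 0.0
    dot = sum(x * y for x, y in zip(vec_a, vec_b))
    return dot / (norm_a * norm_b)


print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))
print(cosine_similarity([0.0, 0.0], [1.0, 2.0]))
```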

🚧 Under construction

This section includes new features and improvements that are planned for the stable v2.1.x release, but aren't included in the nightly yet.

Enhanced pattern API for rule-based Matcher (see #1971).

Improve tokenizer performance (see #1642).

Allow retokenizer to update Lexeme attributes on merge (see #2390).

md and lg models and new, pre-trained word vectors for German, French, Spanish, Italian, Portuguese and Dutch.

🔴 Bug fixes

Fix issue #1487: Add Doc.retokenize() context manager.
Fix issue #1574: Make sure stop words are available in medium and large English models.
Fix issue #1665: Correct typos in symbol Animacy_inan and add Animacy_nhum.
Fix issue #1865: Correct licensing of it_core_news_sm model.
Fix issue #1889: Make stop words case-insensitive.
Fix issue #1903: Add relcl dependency label to symbols.
Fix issue #2014: Make Token.pos_ writeable.
Fix issue #2369: Respect pre-defined warning filters.
Fix issue #2671, #2675: Fix incorrect match ID on some patterns.
Fix serialization of custom tokenizer if not all functions are defined.

⚠️ Backwards incompatibilities

This version of spaCy requires downloading new models. You can use the spacy validate command to find out which models need updating, and print update instructions.
If you've been training your own models, you'll need to retrain them with the new version.
While the Matcher API is fully backwards compatible, its algorithm has changed to fix a number of bugs and performance issues. This means that the Matcher in v2.1.x may produce different results compared to the Matcher in v2.0.x.
Also note that some of the model licenses have changed: it_core_news_sm is now correctly licensed under CC BY-NC-SA 3.0, and all English and German models are now published under the MIT license.

📈 Benchmarks

Model	Language	Version	UAS	LAS	POS	NER F	Vec	Size
`en_core_web_sm`	English	2.1.0a0	91.8	90.0	96.8	85.6	𐄂	28 MB
`en_core_web_md`	English	2.1.0a0	92.0	90.2	97.0	86.2	✓	107 MB
`en_core_web_lg`	English	2.1.0a0	92.1	90.3	97.0	86.2	✓	805 MB
`de_core_news_sm`	German	2.1.0a0	92.0	90.1	97.2	83.8	𐄂	26 MB
`de_core_news_md`	German	2.1.0a0	92.4	90.7	97.4	84.2	✓	228 MB
`es_core_news_sm`	Spanish	2.1.0a0	90.1	87.2	96.9	89.4	𐄂	28 MB
`es_core_news_md`	Spanish	2.1.0a0	90.7	88.0	97.2	89.5	✓	88 MB
`pt_core_news_sm`	Portuguese	2.1.0a0	89.4	86.3	80.1	82.7	𐄂	29 MB
`fr_core_news_sm`	French	2.1.0a0	88.8	85.7	94.4	67.3 ¹	𐄂	32 MB
`fr_core_news_md`	French	2.1.0a0	88.7	86.0	95.0	70.4 ¹	✓	100 MB
`it_core_news_sm`	Italian	2.1.0a0	90.7	87.1	96.1	81.3	𐄂	27 MB
`nl_core_news_sm`	Dutch	2.1.0a0	83.5	77.6	91.5	87.3	𐄂	27 MB
`el_core_news_sm`	Greek	2.1.0a0	84.5	81.0	95.0	73.5	𐄂	27 MB
`el_core_news_md`	Greek	2.1.0a0	87.7	84.7	96.3	80.2	✓	143 MB
`xx_ent_wiki_sm`	Multi	2.1.0a0	-	-	-	83.8	𐄂	9 MB

¹ We're currently investigating this, as the results are anomalously low.

💬 UAS: Unlabelled dependencies (parser). LAS: Labelled dependencies (parser). POS: Part-of-speech tags (fine-grained tags, i.e. Token.tag_). NER F: Named entities (F-score). Vec: Model contains word vectors. Size: Model file size (zipped archive).
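
As a worked example of the two parser metrics: UAS counts a token as correct if its predicted head matches the gold head, while LAS additionally requires the dependency label to match. The sketch below uses made-up annotations, and real evaluations may handle details such as punctuation differently.

```python
def attachment_scores(gold, predicted):
    """Return (UAS, LAS) as percentages for parallel lists of (head, label)."""
    total = len(gold)
    uas_hits = sum(1 for g, p in zip(gold, predicted) if g[0] == p[0])
    las_hits = sum(1 for g, p in zip(gold, predicted) if g == p)
    return 100.0 * uas_hits / total, 100.0 * las_hits / total


# One (head index, dependency label) pair per token; values are invented.
gold = [(1, "nsubj"), (1, "ROOT"), (3, "det"), (1, "dobj")]
predicted = [(1, "nsubj"), (1, "ROOT"), (3, "amod"), (2, "dobj")]
uas, las = attachment_scores(gold, predicted)
print("UAS: %.1f  LAS: %.1f" % (uas, las))
```

Three of the four heads are right but only two tokens have both head and label right, so LAS can never exceed UAS.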

📖 Documentation and examples

Fix various typos and inconsistencies.

👥 Contributors

Thanks to @DuyguA, @giannisdaras, @mgogoulos and @louridas for the pull requests and contributions.

21 Jul 15:18

ines

v2.0.12

1a16162

v2.0.12: Greek, Arabic, Urdu, Tatar, improved language data, better model downloads & various compatibility and bug fixes

We had to release another update to the v2.0.x branch of spaCy to resolve a dependency issue, so we decided to also include and/or backport a bunch of features and fixes that were originally intended for v2.1.0 (see the v2.1.0a0 release for the nightly version).

✨ New features and improvements

NEW: Alpha tokenization and language data for Arabic, Urdu, Tatar and Greek.
NEW: Mecab-based Japanese tokenization and lemmatization.
NEW: Add Norwegian rule-based and lookup lemmatization.
NEW: Add Danish lookup lemmatization based on the Den store danske SprogTeknologiske Ordbase (STO) dataset, courtesy of the University of Copenhagen.
NEW: Romanian lookup lemmatization.
Improve language data for Polish, Turkish, French, Romanian, Swedish and Japanese.
Improve case-sensitive lookup lemmatization in German.
Add Token.sent property that returns the sentence Span the token is part of.
Add remove_extension method on Doc, Token and Span.
Add Doc.is_sentenced property that returns True if sentence boundaries have been applied.
Allow ignoring warning by code via the SPACY_WARNING_IGNORE environment variable.
Add --silent option to info command.
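
A sketch of how an environment variable like SPACY_WARNING_IGNORE can drive warning filtering. The comma-separated format and the example codes (W006, W007) are assumptions made for illustration; check the documentation for the exact syntax.

```python
import os


def ignored_warning_codes(environ=os.environ):
    """Parse a comma-separated list of warning codes from the environment."""
    raw = environ.get("SPACY_WARNING_IGNORE", "")
    return [code.strip() for code in raw.split(",") if code.strip()]


def should_show(code, ignored):
    """Return True unless the warning code has been silenced."""
    return code not in ignored


# A plain dict stands in for os.environ so the example has no side effects.
ignored = ignored_warning_codes({"SPACY_WARNING_IGNORE": "W006, W007"})
print(ignored)
print(should_show("W006", ignored), should_show("W008", ignored))
```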

🔴 Bug fixes

Fix issue #1456: Pass additional arguments of download command to pip and check if model is already installed before downloading it.
Fix issue #2191: Update README section on tests and dependencies.
Fix issue #2194: Ensure that Doc.noun_chunks_iterator isn't None before calling it.
Fix issue #2196: Return data in cli.info and add silent option.
Fix issue #2200: Correct typo in spacy package command message.
Fix issue #2210: Fix bug in Spanish noun chunks.
Fix issue #2211, #2320: Resolve problem in download command and use requests library again.
Fix issue #2219: Fix token similarity of single-letter tokens.
Fix issue #2222, #2223: Fix typos in documentation and docstrings.
Fix issue #2226: Use correct, non-deprecated merge syntax in merge_ents.
Fix issue #2228: Fix deserialization when using tensor=False or sentiment=False.
Fix issue #2238: Correct Swedish lookup lemmatization.
Fix issue #2242: Add remove_extension method on Doc, Token and Span.
Fix issue #2266: Add collapse_phrases option to displaCy visualizer.
Fix issue #2269: Fix KeyError by renaming SP to _SP.
Fix issue #2304: Don't require attrs argument in Doc.retokenize and allow ints/unicode.
Fix issue #2361: Escape HTML tags in displacy.render.
Fix issue #2376: Improve Matcher examples and add section on using pipeline components.
Fix issue #2385: Handle multi-word entities correctly in IOB to BILUO conversion.
Fix issue #2452: Fix bug that would cause displacy arrows to only point in one direction.
Fix issue #2477: Also allow Span objects in displacy.render.
Fix issue #2490: Update Thinc's dependencies for Python 3.7 compatibility.
Fix issue #2495: Fix loading tokenizer with custom prefix search.
Fix issue #2514: Switch from msgpack-python to msgpack to hopefully prevent conda from downloading a two-year-old spaCy version when installing with the latest Anaconda distribution.
Ensure that Doc.is_tagged is set correctly when using Language.pipe.
Fix bug in merge_noun_chunks factory that would return None if Doc wasn't parsed.
Explicitly require pathlib backport on Python 2 only.
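
The IOB to BILUO fix (#2385) concerns a conversion that is easy to get wrong for multi-word entities: the last token of an entity becomes L-, and a single-token entity becomes U-. A minimal sketch of the conversion, written for illustration rather than taken from spaCy:

```python
def iob_to_biluo(tags):
    """Convert IOB tags (B-X, I-X, O) to the BILUO scheme."""
    biluo = []
    for i, tag in enumerate(tags):
        if tag == "O":
            biluo.append(tag)
            continue
        # Split on the first dash only, so labels may contain dashes.
        prefix, label = tag.split("-", 1)
        next_tag = tags[i + 1] if i + 1 < len(tags) else "O"
        continues = next_tag == "I-" + label
        if prefix == "B":
            biluo.append(("B-" if continues else "U-") + label)
        else:  # prefix == "I"
            biluo.append(("I-" if continues else "L-") + label)
    return biluo


print(iob_to_biluo(["B-ORG", "I-ORG", "I-ORG", "O", "B-PER"]))
```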

📖 Documentation and examples

NEW: Edit and execute code examples in your browser – all across the documentation!
NEW: The spaCy Universe, a collection of plugins, extensions and other resources for spaCy.
NEW: Experimental rule-based Matcher Explorer demo – create token patterns interactively, test them against your text and copy-paste the Python pattern code.
NEW: Document Cython API.
Fix various typos and inconsistencies.

👥 Contributors

Thanks to @mollerhoj, @howl-anderson, @pktippa, @skrcode, @miroli, @ivyleavedtoadflax, @5hirish, @therealronnie, @alexvy86, @mn3mos, @polm, @knoxdw, @bellabie, @mauryaland, @LRAbbade, @janimo, @vishnumenon, @tzano, @cclauss, @armsp, @aristorinjuang, @BigstickCarpet, @idealley, @ansgar-t, @mpszumowski, @91ns, @msklvsk, @himkt, @DanielRuf, @nathanathan, @GolanLevy, @nipunsadvilkar, @cjhurst, @aliiae, @mirfan899, @ohenrik, @btrungchi, @kleinay, @DuyguA, @stefan-it, @Eleni170, @datascouting, @tjkemp, @x-ji, @giannisdaras, @kororo and @katarkor for the pull requests and contributions.

21 Jul 14:14

ines

v2.1.0a0

50c367e

v2.1.0a0: New models, joint word segmentation and parsing, better Matcher, bug fixes & more

Pre-release

🌙 This is an alpha pre-release of spaCy v2.1.0, available on pip as spacy-nightly. It's not intended for production use.

pip install -U spacy-nightly

If you want to test the new version, we recommend using a new virtual environment. Also make sure to download the new models – see below for details and benchmarks.

✨ New features and improvements

Tagger, Parser & NER

NEW: Allow parser to do joint word segmentation and parsing. If you pass in data where the tokenizer over-segments, the parser now learns to merge the tokens.
Make parser, tagger and NER faster, through better hyperparameters.
Fix bugs in beam-search training objective.
Remove document length limit during training, by implementing faster Levenshtein alignment.
Use Thinc v6.11, which defaults to single-threaded execution with a fast OpenBLAS kernel. Parallelisation should be performed at the task level, e.g. by running more containers.

Models & Language Data

NEW: Small accuracy improvements for parsing, tagging and NER for 6+ languages.
NEW: The English and German models are now available under the MIT license.

CLI

NEW: New ud-train command, to train and evaluate using the CoNLL 2017 shared task data.
Check if model is already installed before downloading it via spacy download.
Pass additional arguments of download command to pip to customise installation.
Improve train command by letting GoldCorpus stream data, instead of loading into memory.
Improve init-model command, including support for lexical attributes and word-vectors, using a variety of formats. This replaces the spacy vocab command, which is now deprecated.

Other

NEW: Doc.retokenize context manager for merging tokens more efficiently.
NEW: Add support for custom pipeline component factories via entry points (#2348).
NEW: Implement fastText vectors with subword features.
Add warnings if .similarity method is called with empty vectors or without word vectors.
Improve rule-based Matcher and add return_matches keyword argument to Matcher.pipe to yield (doc, matches) tuples instead of only Doc objects, and as_tuples to add context to the Doc objects.
Make stop words via Token.is_stop and Lexeme.is_stop case-insensitive.

🚧 Under construction

This section includes new features and improvements that are planned for the stable v2.1.x release, but aren't included in the nightly yet.

Enhanced pattern API for rule-based Matcher (see #1971).

Built-in rule-based NER component to add entities based on match patterns (see #2513).

Improve tokenizer performance (see #1642).

Allow retokenizer to update Lexeme attributes on merge (see #2390).

md and lg models and new, pre-trained word vectors for German, French, Spanish, Italian, Portuguese and Dutch.

🔴 Bug fixes

Fix issue #1487: Add Doc.retokenize() context manager.
Fix issue #1574: Make sure stop words are available in medium and large English models.
Fix issue #1665: Correct typos in symbol Animacy_inan and add Animacy_nhum.
Fix issue #1865: Correct licensing of it_core_news_sm model.
Fix issue #1889: Make stop words case-insensitive.
Fix issue #1903: Add relcl dependency label to symbols.
Fix issue #2014: Make Token.pos_ writeable.
Fix issue #2369: Respect pre-defined warning filters.
Fix serialization of custom tokenizer if not all functions are defined.

⚠️ Backwards incompatibilities

This version of spaCy requires downloading new models. You can use the spacy validate command to find out which models need updating, and print update instructions.
If you've been training your own models, you'll need to retrain them with the new version.
While the Matcher API is fully backwards compatible, its algorithm has changed to fix a number of bugs and performance issues. This means that the Matcher in v2.1.x may produce different results compared to the Matcher in v2.0.x.
Also note that some of the model licenses have changed: it_core_news_sm is now correctly licensed under CC BY-NC-SA 3.0, and all English and German models are now published under the MIT license.

📈 Benchmarks

Model	Version	UAS	LAS	POS	NER F	Vec	Size
`en_core_web_sm`	2.1.0a0	91.8	90.0	96.8	85.6	𐄂	28 MB
`en_core_web_md`	2.1.0a0	92.0	90.2	97.0	86.2	✓	107 MB
`en_core_web_lg`	2.1.0a0	92.1	90.3	97.0	86.2	✓	805 MB
`de_core_news_sm`	2.1.0a0	92.0	90.1	97.2	83.8	𐄂	26 MB
`de_core_news_md`	2.1.0a0	92.4	90.7	97.4	84.2	✓	228 MB
`es_core_news_sm`	2.1.0a0	90.1	87.2	96.9	89.4	𐄂	28 MB
`es_core_news_md`	2.1.0a0	90.7	88.0	97.2	89.5	✓	88 MB
`pt_core_news_sm`	2.1.0a0	89.4	86.3	80.1	82.7	𐄂	29 MB
`fr_core_news_sm`	2.1.0a0	88.8	85.7	94.4	67.3 ¹	𐄂	32 MB
`fr_core_news_md`	2.1.0a0	88.7	86.0	95.0	70.4 ¹	✓	100 MB
`it_core_news_sm`	2.1.0a0	90.7	87.1	96.1	81.3	𐄂	27 MB
`nl_core_news_sm`	2.1.0a0	83.5	77.6	91.5	87.3	𐄂	27 MB
`xx_ent_wiki_sm`	2.1.0a0	-	-	-	83.8	𐄂	9 MB

¹ We're currently investigating this, as the results are anomalously low.

💬 UAS: Unlabelled dependencies (parser). LAS: Labelled dependencies (parser). POS: Part-of-speech tags (fine-grained tags, i.e. Token.tag_). NER F: Named entities (F-score). Vec: Model contains word vectors. Size: Model file size (zipped archive).

📖 Documentation and examples

Fix various typos and inconsistencies.

👥 Contributors

Thanks to @DuyguA for the pull requests and contributions.

04 Apr 09:52

ines

v2.0.11

0c7fab4

v2.0.11: Alpha Vietnamese support, fixes to vectors, improved errors and more

📊 Help us improve spaCy and take the User Survey 2018!

✨ New features and improvements

NEW: Alpha Vietnamese support with tokenization via Pyvi.
NEW: Improved system for error messages and warnings. Errors now have unique error codes and are referenced in one place, and all unspecified asserts have been replaced with descriptive errors. See #2163 for implementation details, and let us know if you have any suggestions for errors and warnings in #2164!
Improve language data for Polish.
Tidy up dependencies and drop six, html5lib, ftfy and requests.
Improve efficiency (and potentially accuracy) of beam-search training, by randomly using greedy updates for some sentences. This can be controlled by changing the beam_update_prob entry in nlp.parser.cfg. The default value is 0.5, so 50% of beam updates will be done as greedy updates.
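
The idea behind the new error system, keeping every message in one place and prefixing it with a unique code, can be sketched roughly as follows. The class layout, the codes and the message texts are invented for this example and do not reflect spaCy's actual error table.

```python
class Errors:
    """Central registry of error messages, keyed by a unique code."""
    E001 = "No component '{name}' found in pipeline."
    E002 = "Token text can't be an empty string."

    @classmethod
    def get(cls, code, **kwargs):
        """Return the formatted message prefixed with its code."""
        message = getattr(cls, code).format(**kwargs)
        return "[{}] {}".format(code, message)


try:
    raise ValueError(Errors.get("E001", name="ner"))
except ValueError as err:
    print(err)
```

A code in the message makes an error easy to search for, and changing the wording in the registry does not break anything that refers to the code.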

🔴 Bug fixes

Fix issue #1554, #1752, #2159: Fix Token.ent_iob after Doc.merge(), and ensure consistency in Doc.ents.
Fix issue #1660: Fix loading of multiple vector models.
Fix issue #1967: Allow entity types with dashes.
Fix issue #2032: Fix accidentally quadratic runtime in Vocab.set_vector.
Fix issue #2050: Correct mistakes in Italian lemmatizer data.
Fix issue #2073: Make Token.set_extension work as expected.
Fix issue #2100, #2151, #2181: Drop six and html5lib and prevent dependency conflict with TensorFlow / Keras.
Fix issue #2101: Improve error message if token text is empty string.
Fix issue #2121: Fix Language.to_bytes and pickling in Thinc.
Fix issue #2156: Fix hashtag example in Matcher docs.
Fix issue #2177: Don't raise error in set_extension if getter and setter are specified or if default=None, and add error if setter is specified with no getter.
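
The rules from #2177 can be summarised as a small validation routine. The sketch below is a reading of the release note, not spaCy's code: it assumes exactly one of default, method or getter must be given, that default=None counts as a valid default, and that a setter is only allowed alongside a getter.

```python
_UNSET = object()  # sentinel so default=None differs from "not given"


def check_extension_args(default=_UNSET, method=None, getter=None, setter=None):
    """Validate a hypothetical set_extension-style argument combination."""
    if setter is not None and getter is None:
        raise ValueError("A setter requires a getter.")
    defined = [default is not _UNSET, method is not None, getter is not None]
    if sum(defined) != 1:
        raise ValueError("Specify exactly one of default, method or getter.")
    return True


print(check_extension_args(default=None))  # None is accepted as a default
print(check_extension_args(getter=len, setter=lambda obj, value: None))
```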

📖 Documentation and examples

Add example for TensorBoard's standalone embedding projector.
Improve example for training a new entity type.
Add formal CITATION for assigning a DOI via Zenodo.

👥 Contributors

Thanks to @jimregan, @justindujardin, @trungtv, @katrinleinweber and @skrcode for the pull requests and contributions.

24 Mar 18:03

ines

v2.0.10

d566e67

v2.0.10: Built-in factories to merge spans, small improvements and bug fixes

📊 Help us improve spaCy and take the User Survey 2018!

✨ New features and improvements

Improve language data for Turkish and Croatian.
Add built-in factories for merge_entities and merge_noun_chunks to allow models to specify those components as part of their pipeline.

import spacy

nlp = spacy.load('en_core_web_sm')  # any installed model with an 'ner' component
merge_entities = nlp.create_pipe('merge_entities')
nlp.add_pipe(merge_entities, after='ner')

🔴 Bug fixes

Fix issue #2012: Fix Spanish noun_chunks failure caused by typo.
Fix issue #2040: Make sure Token.lemma always returns a hash value.
Fix issue #2063: Correct typo in English lookup lemmatization table.
Fix issue #2103: Correct typo in documentation.
Fix pickling of Vectors class.

📖 Documentation and examples

Add example for visualizing spaCy vectors with the TensorBoard Embedding Projector.
Fix various typos and inconsistencies.

👥 Contributors

Thanks to @thomasopsomer, @alldefector, @DuyguA, @dejanmarich, @justindujardin, @calumcalder, @SebastinSanty, @iann0036, @doug-descombaz and @willismonroe for the pull requests and contributions.

23 Mar 22:48

ines

v1.10.1

abc9398

v1.10.1: Fix compatibility with pip

🔴 Bug fixes

Fix issue #2112: Avoid importing pip to ensure compatibility with pip v9.0.2, which deprecated this usage. See pypa/pip#5081 for more details.
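
One common way to avoid importing pip is to run it as a subprocess of the current interpreter. The sketch below only builds the command and leaves the actual call commented out. It illustrates the general approach and is not necessarily how this fix was implemented; the package name is a placeholder.

```python
import subprocess
import sys


def pip_command(*args):
    """Build a command that runs pip with the current Python interpreter."""
    return [sys.executable, "-m", "pip"] + list(args)


cmd = pip_command("install", "example-package")
print(cmd)
# subprocess.check_call(cmd)  # uncomment to actually run pip
```

Using sys.executable keeps the installation tied to the interpreter that is running the script, which matters inside virtual environments.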

👥 Contributors

Thanks to @mdcclv for the pull request!

22 Feb 17:32

ines

v2.0.9

307aefe

v2.0.9: Fix issue with msgpack dependency

📊 Help us improve spaCy and take the User Survey 2018!

🔴 Bug fixes

Fix issue #2015: Pin msgpack-python to 0.5.4 to avoid conflict with new msgpack release.
