NLP
ConvoKit is a toolkit for extracting conversational features and analyzing social phenomena in conversations. It includes several large conversational datasets along with scripts exemplifying the u…
Stanford NLP Python library for tokenization, sentence segmentation, NER, and parsing of many human languages
Deepparse is a state-of-the-art library for parsing multinational street addresses using deep learning
A comprehensive reference for all topics related to Natural Language Processing
Augmenty is a text augmentation library based on spaCy.
A Python package to run contextualized topic modeling. CTMs combine contextualized embeddings (e.g., BERT) with topic models to get coherent topics. Published at EACL and ACL 2021 (Bianchi et al.).
Leveraging BERT and c-TF-IDF to create easily interpretable topics.
SpikeX - spaCy Pipes for Knowledge Extraction
Python implementation of TextRank algorithms ("textgraphs") for phrase extraction
State-of-the-Art Text Embeddings
Top2Vec learns jointly embedded topic, document and word vectors.
DeText: A Deep Neural Text Understanding Framework for Ranking and Classification Tasks
🏡 Fast & easy transfer learning for NLP. Harvesting language models for the industry. Focus on Question Answering.
An open-source NLP research library, built on PyTorch.
Natural Language Processing Best Practices & Examples
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
A Visual Analysis Tool to Explore Learned Representations in Transformer Models