add sense2vec support too and integrate with POS-config #26

Closed
davidberenstein1957 opened this issue Dec 4, 2022 · 2 comments
Labels
enhancement New feature or request

Comments

@davidberenstein1957
Owner

No description provided.

@prakhar251998

Hi @davidberenstein1957, I saw the capabilities of the sense2vec library and how it would work much better than some of the pretrained GloVe and word2vec models.
My question is: is there a way to add support for more state-of-the-art embeddings, such as sentence transformers, BERT, etc.?

@davidberenstein1957
Owner Author

@prakhar251998 thanks for the suggestion, but sadly this wouldn't be possible. The concise-concepts library relies on a find_most_similar search over pre-defined embeddings, and that search can only return tokens that are present in the embedding model's vocabulary. For word2vec-like models, these tokens are pre-defined/indexed whole words that carry a stand-alone semantic meaning, like apple being used in a similar context as pear. For transformer-based models, the vocabulary is mostly limited to the sub-word/character level, and the vector for a token depends on its surrounding context, so a find_most_similar operation over the vocabulary is not meaningful.
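
To illustrate the point about word-level vocabularies, here is a minimal sketch of a nearest-neighbour lookup over a fixed token index. It is not the concise-concepts implementation: the vectors are made-up toy values and `most_similar_tokens` is a hypothetical helper name. It only shows that every result must be a whole token already stored in the index, and that anything outside the vocabulary cannot be queried.

```python
import math

# Hypothetical toy vectors. A real word2vec-style model stores one
# fixed vector per whole token in its vocabulary.
VECTORS = {
    "apple": [0.9, 0.1, 0.0],
    "pear": [0.8, 0.2, 0.1],
    "banana": [0.7, 0.3, 0.0],
    "car": [0.0, 0.1, 0.9],
    "truck": [0.1, 0.0, 0.8],
}


def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def most_similar_tokens(token, topn=2):
    """Return the topn vocabulary tokens closest to `token` (hypothetical helper)."""
    if token not in VECTORS:
        # Out-of-vocabulary input has no vector, so there is nothing to compare.
        raise KeyError(f"{token!r} is not in the vocabulary")
    query = VECTORS[token]
    scored = [
        (other, cosine(query, vector))
        for other, vector in VECTORS.items()
        if other != token
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:topn]


if __name__ == "__main__":
    print(most_similar_tokens("apple"))
```

With these toy values, the neighbours of "apple" are "pear" and "banana". A sub-word piece such as "##ing" would either be missing from a word-level index or, in a sub-word index, would not name a concept on its own.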

If you would like to use these kinds of embeddings, you could potentially create a semantic-search knowledge base with KNN/ANN and embeddings based on the descriptions of the potential entities, but this may take too much effort.
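
A rough sketch of what such a knowledge base could look like, under stated assumptions: the entity labels and descriptions are invented, and `embed` is a bag-of-words placeholder so that the example runs without extra dependencies. In practice you would swap `embed` for a real sentence encoder and, for large knowledge bases, replace the exact linear scan with an ANN index.

```python
import math
import re
from collections import Counter

# Hypothetical knowledge base: one free-text description per entity label.
KNOWLEDGE_BASE = {
    "fruit": "sweet edible plant product such as apple pear or banana",
    "vehicle": "machine with wheels and an engine such as car truck or bus",
    "instrument": "device played to make music such as guitar piano or violin",
}


def embed(text):
    """Placeholder embedding (word counts). Assumption: a real setup would
    call a sentence encoder here instead."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


# Embed every description once, up front.
INDEX = {label: embed(description) for label, description in KNOWLEDGE_BASE.items()}


def nearest_entities(query, k=1):
    """Exact k-nearest-neighbour search over the description embeddings."""
    query_vector = embed(query)
    scored = [(label, cosine(query_vector, vector)) for label, vector in INDEX.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    print(nearest_entities("I ate a pear and a banana"))
```

The design choice here is that the query is matched against descriptions of entities rather than against single vocabulary tokens, which is why it works with embeddings that have no word-level index.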

@davidberenstein1957 davidberenstein1957 added the enhancement New feature or request label Dec 6, 2022
davidberenstein1957 added a commit that referenced this issue Jan 12, 2023
2 participants