vitaLITy: creating GloVe and Specter document embeddings

Requirements

Python 3.9 - tested with Python 3.9 on macOS Sonoma
pip - package installer for Python
venv - creates isolated Python virtual environments

Setup

Create and activate a Python virtual environment. We have tested with Python 3.9.
brew install gcc
export CC=/opt/homebrew/Cellar/gcc/14.1.0_2/bin/g++-14  # this path will differ between users and systems
export CFLAGS="-Wa,-q"
pip install --upgrade pip setuptools wheel
python -m pip install -r requirements.txt
python -m spacy download en_core_web_sm
python -m nltk.downloader popular
Set OPEN_AI_KEY in config.py.
export TOKENIZERS_PARALLELISM=false

Note: if an installation fails, running pip cache purge first may be needed to start from a fresh installation.
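
Before running anything, it can help to confirm that the environment matches the setup steps above. The following is a minimal sketch, not part of the repository: the function name check_environment is hypothetical, and it only checks the two things the steps state explicitly (Python 3.9 and TOKENIZERS_PARALLELISM=false).

```python
import os
import sys


def check_environment(version_info=sys.version_info, environ=os.environ):
    """Return a list of human-readable problems; an empty list means all checks passed.

    Hypothetical helper: it mirrors the README's setup steps, nothing more.
    """
    problems = []
    # The README reports testing on Python 3.9 only.
    if tuple(version_info[:2]) != (3, 9):
        problems.append(
            f"Python {version_info[0]}.{version_info[1]} detected; "
            "the README was tested on 3.9"
        )
    # The setup steps export TOKENIZERS_PARALLELISM=false before running.
    if environ.get("TOKENIZERS_PARALLELISM", "").lower() != "false":
        problems.append("TOKENIZERS_PARALLELISM is not set to 'false'")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("warning:", problem)
```

Other Python versions may well work; the check only flags that they are untested according to this README.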

Run

Configure file paths in config.py. Sample data files are provided:
- data/sample-dataset-sans-embeddings.tsv - the output of the scraper module, used as the input for computing embeddings.
- data/sample-dataset-with-embeddings.tsv - the output file with computed embeddings.
Run python embed.py
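
To sanity-check the input and output files, a quick look at their columns and row counts is often enough. This is a minimal sketch under stated assumptions: summarize_tsv is a hypothetical helper, and it assumes the files are UTF-8, tab-separated, and start with a header row. The actual column names are not documented here, so the sketch does not rely on any.

```python
import csv


def summarize_tsv(path):
    """Return (column_names, row_count) for a tab-separated file with a header row.

    Assumes UTF-8 text and default CSV quoting; adjust if the files differ.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle, delimiter="\t")
        header = next(reader, [])  # empty list if the file has no lines
        row_count = sum(1 for _ in reader)  # remaining lines are data rows
    return header, row_count
```

For example, calling summarize_tsv("data/sample-dataset-sans-embeddings.tsv") and then summarize_tsv("data/sample-dataset-with-embeddings.tsv") should show which columns embed.py added, and the two row counts would be expected to match if every document received an embedding.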

Credits

vitaLITy was created by Arpit Narechania, Alireza Karduni, Ryan Wesslen, and Emily Wall.

Citation

@article{narechania2021vitality,
  title={vitaLITy: Promoting Serendipitous Discovery of Academic Literature with Transformers \& Visual Analytics},
  author={Narechania, Arpit and Karduni, Alireza and Wesslen, Ryan and Wall, Emily},
  journal={IEEE Transactions on Visualization and Computer Graphics},
  year={2022},
  doi={10.1109/TVCG.2021.3114820},
  publisher={IEEE}
}

License

The software is available under the MIT License.

Contact

If you have any questions, feel free to open an issue or contact Arpit Narechania.
