Abstract2Title is a Seq2Seq model fine-tuned from T5-base that generates titles for machine learning papers from their abstracts. The model is trained on a ~40k-article subset of the arXiv dataset belonging to the cs.AI category. (The full arXiv dataset contains 1.7 million scientific articles.)
The model is hosted as an API on CellStrat Hub, and the interactive web demo can be accessed at https://abstract2title.netlify.app
Check out the Weights & Biases experiment for the complete training and evaluation results. TL;DR: the final evaluation results are:
{'eval_loss': 1.6707532405853271,
'eval_rouge1': 47.1746,
'eval_rouge2': 26.8231,
'eval_rougeL': 41.7727,
'eval_rougeLsum': 41.8263,
'eval_runtime': 220.717,
'eval_samples_per_second': 9.084,
'eval_steps_per_second': 1.137}
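For reference, ROUGE numbers like the ones above can be computed with the Hugging Face `evaluate` library. The snippet below is a minimal sketch of that computation, not necessarily the exact evaluation code used in Train.ipynb; the prediction/reference titles are placeholders.

```python
# Minimal sketch: computing ROUGE scores with the `evaluate` library.
# The prediction/reference pairs below are placeholder examples.
import evaluate

rouge = evaluate.load("rouge")

predictions = ["A Survey of Deep Reinforcement Learning"]   # generated titles
references = ["Deep Reinforcement Learning: A Survey"]      # ground-truth titles

scores = rouge.compute(predictions=predictions, references=references)
print(scores)  # {'rouge1': ..., 'rouge2': ..., 'rougeL': ..., 'rougeLsum': ...}
```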
Before proceeding, make sure to install the required libraries with:
pip install -r requirements.txt
This repository already contains the processed dataset in arxiv_AI_dataset/. If you want to train on other categories, follow the steps below to obtain the full original dataset.
- Download the arXiv dataset from Kaggle
- Extract the downloaded zip file to get a JSON file and put it in a folder called data. The path should look something like this:
data/arxiv-metadata-oai-snapshot.json
- Run the PrepareData.ipynb notebook to filter any specific categories you might want to train on. A minimal sketch of this filtering step is shown below.
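The snippet below sketches what the filtering step roughly looks like, assuming the Kaggle snapshot's newline-delimited JSON format with `categories`, `title`, and `abstract` fields; PrepareData.ipynb may structure this differently, and the output path here is hypothetical.

```python
# Sketch of filtering the arXiv metadata snapshot by category (assumed logic;
# PrepareData.ipynb may differ). Each line of the snapshot is one JSON record.
import json

import pandas as pd

TARGET_CATEGORY = "cs.AI"  # change this to train on a different category
SNAPSHOT = "data/arxiv-metadata-oai-snapshot.json"

records = []
with open(SNAPSHOT) as f:
    for line in f:
        paper = json.loads(line)
        # "categories" is a space-separated string, e.g. "cs.AI cs.LG"
        if TARGET_CATEGORY in paper["categories"].split():
            records.append({"abstract": paper["abstract"].strip(),
                            "title": paper["title"].strip()})

df = pd.DataFrame(records)
df.to_csv("arxiv_AI_dataset.csv", index=False)  # hypothetical output file
print(f"Kept {len(df)} {TARGET_CATEGORY} papers")
```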
To train the model, run the Train.ipynb notebook. For logging to Weights & Biases, change the username (entity) in the wandb.init() cell at the beginning of the notebook; a sketch of that cell is shown below.
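The wandb.init() cell presumably looks something like the following; the project name here is an assumption, and the entity should be replaced with your own Weights & Biases username or team.

```python
import wandb

# Replace `entity` with your own W&B username or team.
# The project name is an assumption; use whatever the notebook specifies.
wandb.init(
    project="abstract2title",
    entity="your-wandb-username",
)
```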
- Download and extract the model weights from here
- Run the Predict.ipynb notebook to perform inference using the inference.py module (a minimal loading sketch is shown below).
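For orientation, here is a hedged sketch of loading the fine-tuned checkpoint with Hugging Face Transformers and generating a title; the checkpoint path, input handling, and generation settings are assumptions, and Predict.ipynb / inference.py may do this differently.

```python
# Sketch of title generation with the fine-tuned T5 checkpoint.
# MODEL_DIR and the generation settings are assumptions.
from transformers import T5ForConditionalGeneration, T5TokenizerFast

MODEL_DIR = "model_weights/"  # hypothetical path to the extracted weights

tokenizer = T5TokenizerFast.from_pretrained(MODEL_DIR)
model = T5ForConditionalGeneration.from_pretrained(MODEL_DIR)

abstract = "We present a sequence-to-sequence model that generates paper titles from abstracts ..."
inputs = tokenizer(abstract, return_tensors="pt", truncation=True, max_length=512)

output_ids = model.generate(**inputs, max_length=32, num_beams=4, early_stopping=True)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```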
- The model is deployed as an API using CellStrat Hub. You can learn more about deployment here. A hypothetical example request is sketched after this list.
- The Next.js frontend app's source code can be found in a2t-app/
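Calling the deployed endpoint presumably boils down to a single POST request along these lines; the URL, authentication header, and request/response schema below are placeholders rather than the actual CellStrat Hub contract, so check the Hub documentation for the real interface.

```python
import requests

# Placeholder endpoint, key, and payload shape; consult the CellStrat Hub
# deployment docs for the real interface.
API_URL = "https://api.example.com/abstract2title"
API_KEY = "YOUR_API_KEY"

response = requests.post(
    API_URL,
    headers={"x-api-key": API_KEY},
    json={"abstract": "We present a sequence-to-sequence model ..."},
)
response.raise_for_status()
print(response.json())  # e.g. {"title": "..."} (assumed response shape)
```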