This playground is a PyTorch implementation of a learning framework for building different models for neural abstractive text summarization and beyond. It is an extension of the NATS toolkit, a toolkit for Neural Abstractive Text Summarization. The goal of this framework is to make it convenient to try out new ideas in abstractive text summarization and other language generation tasks.
Live System Demo http://dmkdt3.cs.vt.edu/leafNATS/
- glob
- argparse
- shutil
- spacy
- pytorch 1.0
We tested different models in LeafNATS on the following datasets. Here, we provide a link to the CNN/Daily Mail dataset and the data processing code for the Newsroom and Bytecup2018 datasets. The preprocessed data is available upon request.
In the dataset, `<s>` and `</s>` are used to separate sentences, and `<sec>` is used to separate summaries from articles. We did not use the JSON format because it takes more space and is difficult to transfer between servers.
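As an illustration of the markers described above, here is a minimal sketch of how one record could be split into fields and sentences. The helper name `parse_line`, the sample text, and the assumption that each record is a single line with the summary before the article are hypothetical and not taken from the LeafNATS code; check the actual data files for the field order.

```python
import re

# Matches the text between one <s> ... </s> pair (non-greedy).
SENT_RE = re.compile(r"<s>(.*?)</s>")


def parse_line(line):
    """Split a record on <sec>, then split each field into sentences.

    Returns a list of fields, each field being a list of sentence strings.
    """
    fields = line.split("<sec>")
    return [[s.strip() for s in SENT_RE.findall(f)] for f in fields]


# Hypothetical record; the summary-then-article order is an assumption.
example = (
    "<s> a short summary . </s>"
    "<sec>"
    "<s> first article sentence . </s> <s> second article sentence . </s>"
)
summary, article = parse_line(example)
print(summary)
print(article)
```

Splitting on `<sec>` first keeps the sentence pattern simple, since each field then contains only `<s>` and `</s>` markers.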
LeafNATS is currently under development. A simple way to run models that have already been implemented is
- Check: Check models we have implemented in this directory.
- Import: In run.py, import the example you want to try. For example, `from nats.pointer_generator_network.main import *`
- Training: `python run.py`
- Validate: `python run.py --task validate`
- Test: `python run.py --task beam`
- Rouge: `python run.py --task rouge`
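The commands above suggest that a single `--task` flag selects the mode. The following is only a sketch of how such a flag could be handled with `argparse`. It is not the actual contents of run.py: the function names are hypothetical, and the `train` default is an assumption inferred from the fact that training is started without any flag.

```python
import argparse

# Task names: "validate", "beam" and "rouge" appear in the commands above;
# "train" as the default is an assumption.
TASKS = ("train", "validate", "beam", "rouge")


def build_parser():
    """Build a parser with a single --task option (hypothetical sketch)."""
    parser = argparse.ArgumentParser(description="run.py-style entry point")
    parser.add_argument(
        "--task",
        default="train",
        choices=TASKS,
        help="which stage to run",
    )
    return parser


def select_task(argv):
    """Return the task name chosen by the given argument list."""
    return build_parser().parse_args(argv).task


if __name__ == "__main__":
    # With no arguments this prints the assumed default task.
    print(select_task([]))
```

Restricting `--task` with `choices` makes a misspelled task name fail immediately instead of silently starting a training run.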
- Engine: Training frameworks
- Playground: Models, pipelines, loss functions, and data redirection
- Modules: Building blocks, beam search, word-copy for decoding
- Data: Data pre-processing and batcher
Here is the pretrained model for our live system: https://drive.google.com/open?id=1A7ODPpermwIHeRrnqvalT5zpr4BCTBi9
Experimental results can be found in the following papers:
- Neural Abstractive Text Summarization with Sequence-to-Sequence Models
- LeafNATS: An Open-Source Toolkit and Live Demo System for Neural Abstractive Text Summarization
@article{shi2018neural,
title={Neural Abstractive Text Summarization with Sequence-to-Sequence Models},
author={Shi, Tian and Keneshloo, Yaser and Ramakrishnan, Naren and Reddy, Chandan K},
journal={arXiv preprint arXiv:1812.02303},
year={2018}
}
@inproceedings{shi2019leafnats,
title={LeafNATS: An Open-Source Toolkit and Live Demo System for Neural Abstractive Text Summarization},
author={Shi, Tian and Wang, Ping and Reddy, Chandan K},
booktitle={Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics (Demonstrations)},
pages={66--71},
year={2019}
}