ARMAN: Pre-training with Semantically Selecting and Reordering of Sentences for Persian Abstractive Summarization

Our paper has been accepted as a long paper at the EMNLP 2021 main conference; the preprint is available on arXiv (2109.04098).

Abstractive text summarization is one of the areas influenced by the emergence of pre-trained language models. Current pre-training approaches for abstractive summarization reward summaries that share more words with the source text and pay less attention to the semantic similarity between the generated sentences and the original document. We propose ARMAN, a Transformer-based encoder-decoder model pre-trained with three novel objectives to address this issue. In ARMAN, salient sentences from a document are selected according to a modified semantic score, then masked to form a pseudo summary. To summarize more accurately and in a way closer to human writing patterns, we applied modified sentence reordering in the best setting. We evaluated our proposed models on six downstream Persian summarization tasks. Experimental results show that our proposed model achieves state-of-the-art performance on all six summarization tasks measured by ROUGE and BERTScore. Our models also outperform prior works in Textual Entailment, Question Paraphrasing, and Multiple Choice Question Answering. Finally, we conducted a human evaluation and show that using the semantic score significantly improves summarization results.
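The sentence-selection idea described above can be illustrated with a small sketch. This is not ARMAN's objective: the scoring function below is a plain bag-of-words cosine similarity used as a stand-in for the modified semantic score, and the mask token, the `mask_ratio` value, and all function names are hypothetical choices made for illustration only.

```python
import math
import re
from collections import Counter

MASK = "<mask_1>"  # hypothetical mask token, chosen for illustration


def split_sentences(text):
    """Naive splitter on ., !, ? and the Persian question mark."""
    parts = re.split(r"(?<=[.!?؟])\s+", text.strip())
    return [p for p in parts if p]


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def make_pretraining_pair(document, mask_ratio=0.3):
    """Mask the highest-scoring sentences; return (masked source, pseudo summary)."""
    sentences = split_sentences(document)
    bags = [Counter(s.lower().split()) for s in sentences]

    # Score each sentence against the rest of the document.
    # Stand-in score only: ARMAN uses a modified semantic score instead.
    scores = []
    for i, bag in enumerate(bags):
        rest = Counter()
        for j, other in enumerate(bags):
            if j != i:
                rest.update(other)
        scores.append(cosine(bag, rest))

    k = max(1, round(mask_ratio * len(sentences)))
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    selected = set(top)

    source = " ".join(MASK if i in selected else s for i, s in enumerate(sentences))
    target = " ".join(sentences[i] for i in sorted(selected))
    return source, target
```

The model is then trained to generate `target` from `source`. Replacing the stand-in score with an embedding-based similarity would be closer in spirit to a semantic score, but the exact formulation is given in the paper, not here.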

Results

Our model ARMAN(MSR) achieves state-of-the-art results on 5 out of 6 Persian abstractive summarization datasets as measured by ROUGE, and on 6 out of 6 as measured by BERTScore.

In the following table, the results are reported using ROUGE-1/ROUGE-2/ROUGE-L metrics.

| Dataset | ARMAN(MSR) | ARMAN(SS-100) | ARMAN(SH) | ARMAN(SS-80) | PEGASUS |
|---|---|---|---|---|---|
| PN-Summary | 46.19/28.41/40.27 | 46.33/28.57/40.38 | 45.89/28.03/39.89 | 45.98/28.2/40.09 | 45.67/27.81/39.71 |
| Wiki-Summary | 32.48/11.86/24.08 | 32.36/11.78/24.1 | 32.04/11.78/23.83 | 32.27/11.72/23.91 | 31.98/11.63/23.79 |
| VOA | 48.23/29.52/44.27 | 47.73/28.95/43.89 | 46.96/27.88/42.93 | 47.91/28.9/43.75 | 47.55/28.68/43.57 |
| Perkey(summary) | 63.59/52.87/60.3 | 62.83/51.92/59.53 | 63.47/52.71/60.16 | 62.97/52.11/59.64 | 62.82/51.96/59.48 |
| Perkey(title) | 54.81/40.17/52.51 | 54.25/39.51/51.92 | 54.5/39.9/52.19 | 54.18/39.39/51.84 | 53.99/39.3/51.72 |
| Tebyan | 37.79/21.85/31.98 | 37.64/21.78/31.94 | 37.6/21.77/31.82 | 37.53/21.73/31.77 | 37.2/21.23/31.47 |
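For readers unfamiliar with the metric, the following is a minimal sketch of how ROUGE-N precision, recall, and F1 are computed from n-gram overlap. It assumes simple whitespace tokenization and returns values in the range 0 to 1, and it does not cover ROUGE-L. It is not the evaluation script used to produce the table above; the function names are illustrative.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(reference, candidate, n=1):
    """Return (precision, recall, F1) of n-gram overlap between two strings."""
    ref = ngrams(reference.split(), n)
    cand = ngrams(candidate.split(), n)
    # Clipped overlap: each n-gram counts at most as often as it occurs in both.
    overlap = sum((ref & cand).values())
    if overlap == 0:
        return 0.0, 0.0, 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

For example, `rouge_n("the cat sat on the mat", "the cat lay on the mat", n=2)` shares three of five bigrams on each side, so precision, recall, and F1 are all 0.6.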

In the following table, the results are reported using Precision-BERTScore/Recall-BERTScore/F1-BERTScore metrics.

| Dataset | ARMAN(MSR) | ARMAN(SH) | ARMAN(SS-80) | PEGASUS |
|---|---|---|---|---|
| PN-Summary | 80.14/79.84/79.93 | 79.95/79.69/79.76 | 80.08/79.74/79.85 | 79.86/79.67/79.7 |
| Wiki-Summary | 74.67/71.55/72.95 | 74.25/71.43/72.68 | 74.24/71.48/72.71 | 74.29/71.31/72.64 |
| VOA | 81.1/81.35/81.16 | 80.64/80.91/80.71 | 81.02/81.13/81 | 80.84/81.13/80.92 |
| Perkey(summary) | 86.54/86.24/86.33 | 86.46/86.22/86.29 | 86.27/86.01/86.09 | 86.13/86.01/86.01 |
| Perkey(title) | 83.93/83.59/83.71 | 83.85/83.49/83.62 | 83.65/83.36/83.46 | 83.68/83.31/83.45 |
| Tebyan | 75.49/75.46/75.4 | 75.48/75.28/75.29 | 75.48/75.32/75.32 | 75.26/75.17/75.14 |

Furthermore, we fine-tuned our models on the ParsiNLU dataset, and the results show that ARMAN models can also be used as language models. Our models achieve state-of-the-art results on 3 out of 4 tasks (mainly on the natural part of the dataset). The results are reported in the following table; results for other models are available in the ParsiNLU paper. Bold results are those that were better than other reported models with at most 400M parameters (our models have around 220M).

| Model | Textual Entailment (natural - translated) | Question Paraphrasing (natural - translated) | Sentiment (food - movie) | Multiple-Choice Question Answering (literature - common knowledge - math & logic) |
|---|---|---|---|---|
| ARMAN(SS-80) | 54.5 - 50.6 | 82.5 - 74.8 | 51.4 - 47 | 37.7 - 25.7 - 47.7 |
| ARMAN(SS-100) | 54.2 - 53 | 79.9 - 72.8 | 50 - 52.9 | 41.4 - 27.4 - 43.1 |
| ARMAN(SH) | 55.5 - 52.9 | 82.6 - 75.1 | 56.7 - 42 | 34.6 - 28.6 - 45.4 |
| ARMAN(MSR) | 54.8 - 51.8 | 79.9 - 75.9 | 52 - 46 | 36.57 - 21.7 - 49.14 |
| PEGASUS | 54.5 - 52.6 | 80 - 76.1 | 51.9 - 56 | 40 - 27.7 - 45.1 |

Our paper also reports results on how well the models summarize in low-resource scenarios. Briefly, our model needs around 1K data points and 2K training steps to perform well on most summarization tasks.

Link to models

The following table lists the pre-trained models.

| model | pre-trained | vocab |
|---|---|---|
| ARMAN(SS-80) | download | download |
| ARMAN(SS-100) | download | download |
| ARMAN(SH) | download | download |
| ARMAN(MSR) | download | download |
| PEGASUS | download | download |

The following table lists the models fine-tuned on summarization tasks.

| model | Perkey(summary) | Perkey(title) | Tebyan | Wiki Summary | VOA headlines | PN Summary | vocab |
|---|---|---|---|---|---|---|---|
| ARMAN(SS-80) | download | download | download | download | download | download | download |
| ARMAN(SS-100) | download | download | download | download | download | download | download |
| ARMAN(SH) | download | download | download | download | download | download | download |
| ARMAN(MSR) | download | download | download | download | download | download | download |
| PEGASUS | download | download | download | download | download | download | download |
| TRANSFORMER | download | download | download | download | download | download | download |
| mT5 | download | download | download | download | download | download | download |

The following table lists the models fine-tuned on NLU tasks.

| model | Entailment | Question Paraphrasing | Multiple Choice | Sentiment (Food) | Sentiment (Movie) | vocab |
|---|---|---|---|---|---|---|
| ARMAN(SS-80) | download | download | download | download | download | download |
| ARMAN(SS-100) | download | download | download | download | download | download |
| ARMAN(SH) | download | download | download | download | download | download |
| ARMAN(MSR) | download | download | download | download | download | download |
| PEGASUS | download | download | download | download | download | download |

Link to Tebyan Dataset

The Tebyan cultural institute, which is affiliated with the organization "Sazman-e Tablighat-e Eslami", is one of the biggest and best-known cultural institutes in Iran. It has cooperated with other cultural institutes in different fields to support cultural festivals and broadcast their activities in the media. The institute's activities take place not only in Tehran but also in the provincial centers, and 1,600,000 users visit its website each day. The Iranian deputy minister supported its activities for sport and youth on the website tebyan.net in Tehran.

We created the dataset by crawling the Tebyan website pages. Then we split it into train/test/validation sets. The dataset is publicly available for research purposes.

| train | validation | test |
|---|---|---|
| 78445 | 6922 | 6922 |
| download | download | download |
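As a quick check on the split sizes above, the following snippet computes each split's share of the full dataset. Only the three counts come from the table; the variable names are illustrative.

```python
# Split sizes as listed in the table above.
splits = {"train": 78445, "validation": 6922, "test": 6922}

total = sum(splits.values())

# Percentage of the full dataset in each split, rounded to one decimal place.
shares = {name: round(100 * count / total, 1) for name, count in splits.items()}

for name, share in shares.items():
    print(f"{name}: {splits[name]} examples ({share}%)")
```

The counts add up to 92,289 examples, which corresponds to roughly an 85% / 7.5% / 7.5% split.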

Pre-training or fine-tuning ARMAN?

The code and guidelines for pre-training or fine-tuning the model are available in the pretraining and models folders.

Hugging Face models

We have converted our TF1 models into PyTorch models using the Hugging Face library. You can find them here. Note that the results reported in our paper were produced using the TF1 models, so we cannot guarantee that you will get the same results using the converted models.
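A minimal sketch of how a converted checkpoint might be used for summarization with the Hugging Face `transformers` library is shown below. It assumes the converted models load through `AutoTokenizer` and `AutoModelForSeq2SeqLM`, which this README does not confirm. The `model_id` argument is a placeholder for whichever hub ID or local path you obtain from the link above, and the generation settings (`max_length`, `num_beams`) are illustrative defaults, not the values used in the paper.

```python
def summarize(text, tokenizer, model, max_length=128, num_beams=4):
    """Generate a summary for `text` with a seq2seq model and its tokenizer."""
    # Tokenize the input, truncating to the model's maximum input length.
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    # Beam search decoding; the settings here are illustrative assumptions.
    output_ids = model.generate(**inputs, max_length=max_length, num_beams=num_beams)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


def load_and_summarize(model_id, text):
    """Load a converted checkpoint and summarize `text`.

    `model_id` is a placeholder: pass the hub ID or local path of the
    converted model you want to use.
    """
    # Imported lazily so that this module can be imported without transformers.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    return summarize(text, tokenizer, model)
```

Keeping `summarize` separate from loading lets you load the model once and reuse it across many documents.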

Citation

If you use this code, please consider citing our paper:

@misc{salemi2021arman,
      title={ARMAN: Pre-training with Semantically Selecting and Reordering of Sentences for Persian Abstractive Summarization}, 
      author={Alireza Salemi and Emad Kebriaei and Ghazal Neisi Minaei and Azadeh Shakery},
      year={2021},
      eprint={2109.04098},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
