This is a PyTorch implementation of the Committed paper (ACL 2021).
To keep the repository manageable, the datasets used in this project are provided via the Download link (password: ev80).
After downloading, decompress the archive into this project. The extracted folder is named dataset and contains the two Chinese datasets used in the paper: Douban and Weibo. The English dataset is available here: Cornell Movie-Dialogs Corpus
The proposed model consists of a Retrieval Model and a Generation Model. To align the utterance and the retrieved pairs at the word-vector level, a novel Selective-Attention Guided Alignment module is proposed.
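A minimal sketch of what such a selective-attention alignment could look like in PyTorch is given below. The module name, projections, and gating scheme are illustrative assumptions for exposition, not the exact implementation in this repository.

```python
# Illustrative sketch of a selective-attention guided alignment module.
# The design (projections + per-token gate) is an assumption, not the paper's exact code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SelectiveAttentionAlignment(nn.Module):
    """Aligns retrieved-pair word vectors to each utterance token via attention,
    then gates how much retrieved information is mixed into each token."""

    def __init__(self, hidden_dim: int):
        super().__init__()
        self.query_proj = nn.Linear(hidden_dim, hidden_dim)
        self.key_proj = nn.Linear(hidden_dim, hidden_dim)
        self.value_proj = nn.Linear(hidden_dim, hidden_dim)
        # Gate decides, per utterance token, how much aligned retrieval info to keep.
        self.gate = nn.Linear(2 * hidden_dim, hidden_dim)

    def forward(self, utterance, retrieval, retrieval_mask=None):
        # utterance: (B, Tq, D), retrieval: (B, Tk, D), retrieval_mask: (B, Tk) or None
        q = self.query_proj(utterance)                                  # (B, Tq, D)
        k = self.key_proj(retrieval)                                    # (B, Tk, D)
        v = self.value_proj(retrieval)                                  # (B, Tk, D)
        scores = torch.bmm(q, k.transpose(1, 2)) / q.size(-1) ** 0.5   # (B, Tq, Tk)
        if retrieval_mask is not None:
            scores = scores.masked_fill(retrieval_mask.unsqueeze(1) == 0, float("-inf"))
        attn = F.softmax(scores, dim=-1)        # word-level alignment weights
        aligned = torch.bmm(attn, v)            # retrieval features aligned to each token
        g = torch.sigmoid(self.gate(torch.cat([utterance, aligned], dim=-1)))
        return g * aligned + (1 - g) * utterance  # selectively fused representation
```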
This project is built on PyTorch and other standard libraries. To train the whole model, open train.ipynb, a Jupyter notebook that can be run directly and lets you inspect intermediate results during training.
We evaluated the proposed model with BLEU, Rouge, Relevance, and Diversity: BLEU and Rouge are word-overlap metrics, Relevance (Average/Extrema/Greedy) is embedding-based, and Diversity (Dist-1/Dist-2) measures the ratio of distinct n-grams. The results are listed below; a sketch of the Dist-n computation follows the tables.
Cornell:

| Rtrv | Gene | Models | BLEU | R-1 | R-2 | R-L | Average | Extrema | Greedy | Dist-1 | Dist-2 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| | √ | S2S+Attn | 38.79 | 37.82 | 17.87 | 33.73 | 0.361 | 0.201 | 0.346 | 0.049 | 0.088 |
| | √ | CVAE | 45.23 | 41.89 | 20.86 | 39.49 | 0.381 | 0.256 | 0.374 | 0.076 | 0.145 |
| | √ | UniLM | 50.65 | 44.24 | 23.07 | 40.27 | 0.401 | 0.294 | 0.387 | 0.121 | 0.189 |
| √ | | Retrieval | 36.91 | 30.81 | 13.87 | 27.33 | 0.296 | 0.152 | 0.311 | 0.103 | 0.249 |
| √ | | Retrieval+Rerank | 38.67 | 34.57 | 18.23 | 32.44 | 0.338 | 0.193 | 0.321 | 0.129 | 0.212 |
| √ | √ | Edit | 49.11 | 45.81 | 21.99 | 43.01 | 0.393 | 0.307 | 0.391 | 0.112 | 0.207 |
| √ | √ | Ours | 52.67 | 48.72 | 24.45 | 43.28 | 0.417 | 0.331 | 0.407 | 0.121 | 0.231 |
Douban:

| Rtrv | Gene | Models | BLEU | R-1 | R-2 | R-L | Average | Extrema | Greedy | Dist-1 | Dist-2 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| | √ | S2S+Attn | 35.36 | 33.74 | 18.16 | 30.62 | 0.341 | 0.182 | 0.368 | 0.061 | 0.081 |
| | √ | CVAE | 43.65 | 42.32 | 21.83 | 38.78 | 0.358 | 0.189 | 0.373 | 0.076 | 0.201 |
| | √ | UniLM | 49.31 | 48.76 | 31.89 | 47.09 | 0.383 | 0.274 | 0.389 | 0.202 | 0.364 |
| √ | | Retrieval | 36.28 | 30.19 | 14.88 | 28.36 | 0.298 | 0.164 | 0.327 | 0.131 | 0.466 |
| √ | | Retrieval+Rerank | 40.17 | 36.49 | 17.67 | 35.44 | 0.362 | 0.211 | 0.378 | 0.137 | 0.431 |
| √ | √ | Edit | 47.65 | 48.27 | 29.81 | 46.53 | 0.378 | 0.243 | 0.366 | 0.134 | 0.189 |
| √ | √ | Ours | 55.1 | 51.79 | 32.07 | 51.35 | 0.394 | 0.364 | 0.391 | 0.188 | 0.297 |
Weibo:

| Rtrv | Gene | Models | BLEU | R-1 | R-2 | R-L | Average | Extrema | Greedy | Dist-1 | Dist-2 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| | √ | S2S+Attn | 37.21 | 36.77 | 20.14 | 35.07 | 0.341 | 0.194 | 0.366 | 0.026 | 0.084 |
| | √ | CVAE | 44.74 | 44.15 | 23.12 | 41.39 | 0.358 | 0.195 | 0.378 | 0.086 | 0.142 |
| | √ | UniLM | 50.06 | 50.33 | 32.19 | 49.81 | 0.388 | 0.237 | 0.387 | 0.142 | 0.342 |
| √ | | Retrieval | 35.79 | 32.41 | 15.21 | 28.03 | 0.304 | 0.162 | 0.315 | 0.111 | 0.472 |
| √ | | Retrieval+Rerank | 38.92 | 38.29 | 18.17 | 35.14 | 0.354 | 0.201 | 0.378 | 0.167 | 0.494 |
| √ | √ | Edit | 50.64 | 50.82 | 26.71 | 48.38 | 0.396 | 0.234 | 0.387 | 0.152 | 0.158 |
| √ | √ | Ours | 56.3 | 52.43 | 29.07 | 50.41 | 0.412 | 0.367 | 0.411 | 0.203 | 0.314 |
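For reference, the Dist-1/Dist-2 diversity scores above follow the standard distinct n-gram formulation (unique n-grams divided by total n-grams). The sketch below shows that common definition; the function name `distinct_n` is illustrative and may differ in detail from the evaluation script used for the numbers above.

```python
# Sketch of the standard Dist-n diversity metric (distinct n-grams / total n-grams).
# This follows the common definition; it is assumed, not taken from this repository's
# evaluation code.
from typing import List

def distinct_n(responses: List[List[str]], n: int) -> float:
    """Corpus-level Dist-n: number of unique n-grams divided by total n-grams."""
    ngrams = []
    for tokens in responses:
        ngrams.extend(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    if not ngrams:
        return 0.0
    return len(set(ngrams)) / len(ngrams)

# Example usage on two tokenized responses:
responses = [["i", "am", "fine", "thanks"], ["i", "am", "good"]]
print(distinct_n(responses, 1))  # Dist-1
print(distinct_n(responses, 2))  # Dist-2
```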