A re-implementation of the NLP Transformer code from The Annotated Transformer (http://nlp.seas.harvard.edu/2018/04/03/attention.html) in MXNet.