Repo of sequential attention mechanisms from various papers, with implementations and studies of how they behave within different model architectures.
The implementations themselves are written primarily in PyTorch (v0.3 for now). They are built so they can be easily imported into and used with any model that follows the documented input/output shapes.
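As a rough sketch of that workflow (the import path, class name, and constructor arguments below are hypothetical placeholders, not this repo's actual API; check each implementation's docs for its real interface and shapes):

```python
import torch

# Hypothetical module path and class name -- substitute the real ones
# from the implementation you want to use.
from attention.bahdanau import BahdanauAttention

attn = BahdanauAttention(encoder_dim=256, decoder_dim=256)

encoder_outputs = torch.randn(1, 20, 256)  # (batch, source length, encoder dim)
decoder_state = torch.randn(1, 256)        # (batch, decoder dim)

# Assumed call signature: returns a context vector over the encoder outputs
# and the attention weights used to build it.
context, weights = attn(decoder_state, encoder_outputs)
```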
The tests and studies of the implementations are done in Jupyter notebooks, which can be viewed without PyTorch installed (though they cannot be run without it).
- Neural Machine Translation by Jointly Learning to Align and Translate (Bahdanau, Cho, and Bengio, 2015): a recurrent variable-length sequential attention mechanism
- Attention-Based Models for Speech Recognition (Chorowski et al., 2015): improvements to the recurrent variable-length sequential attention mechanism
- Attention Is All You Need (Vaswani et al., 2017): a non-recurrent transformer attention mechanism (a minimal sketch of its scaled dot-product core follows this list)
- Self-Attention with Relative Position Representations (Shaw, Uszkoreit, and Vaswani, 2018): improvements to the transformer attention mechanism via positional vectors that represent pairwise relationships between input elements
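For reference, here is a minimal sketch of the scaled dot-product attention at the core of the transformer mechanism (Vaswani et al., 2017), written against a current PyTorch API rather than the v0.3 code in this repo; it is an illustration of the technique, not the repo's implementation:

```python
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(query, key, value, mask=None):
    # query: (batch, target length, d_k)
    # key:   (batch, source length, d_k)
    # value: (batch, source length, d_v)
    # mask:  optional boolean (batch, target length, source length),
    #        True where attending is allowed
    d_k = query.size(-1)
    # Dot-product similarity, scaled by sqrt(d_k) to keep the softmax well-behaved.
    scores = torch.matmul(query, key.transpose(-2, -1)) / d_k ** 0.5
    if mask is not None:
        scores = scores.masked_fill(~mask, float('-inf'))
    weights = F.softmax(scores, dim=-1)
    # Weighted sum of values: one context vector per query position.
    return torch.matmul(weights, value), weights
```

The relative-position variant (Shaw, Uszkoreit, and Vaswani, 2018) keeps this same skeleton but adds learned pairwise position embeddings into the score computation and the weighted sum.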