Embedding scaling is tied to positional encoding for Transformer models #1722

@guillaumekln

Description

When positional encoding is disabled, the embedding scaling is also disabled, even though the two operations are independent:

https://github.com/OpenNMT/OpenNMT-py/blob/1.0.0/onmt/modules/embeddings.py#L48

As a consequence, Transformer models with relative position representations do not follow the reference implementation, which scales the embeddings by default.
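
For illustration, here is a minimal PyTorch sketch of the coupling (not the actual OpenNMT-py code; the class name `ToyEmbeddings` and its parameters are hypothetical):

```python
import math
import torch
import torch.nn as nn


class ToyEmbeddings(nn.Module):
    """Minimal sketch of the coupling described above (not the real
    OpenNMT-py class): the sqrt(dim) scaling is applied inside the
    positional-encoding branch, so turning positional encoding off
    silently turns the scaling off as well."""

    def __init__(self, vocab_size, dim, position_encoding=True, max_len=5000):
        super().__init__()
        self.lut = nn.Embedding(vocab_size, dim)
        self.dim = dim
        self.position_encoding = position_encoding
        # Standard sinusoidal table.
        position = torch.arange(0, max_len, dtype=torch.float).unsqueeze(1)
        div_term = torch.exp(
            torch.arange(0, dim, 2, dtype=torch.float) * -(math.log(10000.0) / dim)
        )
        pe = torch.zeros(max_len, dim)
        pe[:, 0::2] = torch.sin(position * div_term)
        pe[:, 1::2] = torch.cos(position * div_term)
        self.register_buffer("pe", pe.unsqueeze(0))

    def forward(self, tokens):
        emb = self.lut(tokens)                      # (batch, len, dim)
        if self.position_encoding:
            emb = emb * math.sqrt(self.dim)         # scaling is tied to PE
            emb = emb + self.pe[:, : emb.size(1)]
        # With position_encoding=False (e.g. relative position models),
        # the sqrt(dim) scaling above is skipped too.
        return emb
```

Decoupling the two would mean applying `emb * math.sqrt(self.dim)` unconditionally, before the positional-encoding branch, so that Transformer variants with relative position representations keep the embedding scaling of the reference implementation.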
