Embedding scaling is tied to positional encoding for Transformer models #1722

@guillaumekln

Description

When positional encoding is disabled, the embedding scaling is also disabled, even though the two operations are independent:

https://github.com/OpenNMT/OpenNMT-py/blob/1.0.0/onmt/modules/embeddings.py#L48

As a consequence, Transformer models with relative position representations do not follow the reference implementation, which scales the embeddings by default.
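
For illustration, here is a minimal PyTorch sketch of the coupling (not the actual OpenNMT-py code; the class name `ToyEmbeddings` and its parameters are hypothetical):

```python
import math
import torch
import torch.nn as nn


class ToyEmbeddings(nn.Module):
    """Minimal sketch of the coupling described above (not the real
    OpenNMT-py class): the sqrt(dim) scaling is applied inside the
    positional-encoding branch, so turning positional encoding off
    silently turns the scaling off as well."""

    def __init__(self, vocab_size, dim, position_encoding=True, max_len=5000):
        super().__init__()
        self.lut = nn.Embedding(vocab_size, dim)
        self.dim = dim
        self.position_encoding = position_encoding
        # Standard sinusoidal table.
        position = torch.arange(0, max_len, dtype=torch.float).unsqueeze(1)
        div_term = torch.exp(
            torch.arange(0, dim, 2, dtype=torch.float) * -(math.log(10000.0) / dim)
        )
        pe = torch.zeros(max_len, dim)
        pe[:, 0::2] = torch.sin(position * div_term)
        pe[:, 1::2] = torch.cos(position * div_term)
        self.register_buffer("pe", pe.unsqueeze(0))

    def forward(self, tokens):
        emb = self.lut(tokens)                      # (batch, len, dim)
        if self.position_encoding:
            emb = emb * math.sqrt(self.dim)         # scaling is tied to PE
            emb = emb + self.pe[:, : emb.size(1)]
        # With position_encoding=False (e.g. relative position models),
        # the sqrt(dim) scaling above is skipped too.
        return emb
```

Decoupling the two would mean applying `emb * math.sqrt(self.dim)` unconditionally, before the positional-encoding branch, so that Transformer variants with relative position representations keep the embedding scaling of the reference implementation.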
