The goal of this project was to create a multi-modal implementation of the Transformer architecture in Swift for TensorFlow. It was also an attempt to answer the question of whether Swift for TensorFlow is ready for non-trivial work.
The use case is based on the paper "Learning a bidirectional mapping between human whole-body motion and natural language using deep recurrent neural networks" by Matthias Plappert, who created a nice dataset of a few thousand motions, "The KIT Motion-Language Dataset" (paper), website.
The Motion2Language and Lang2motion Transformer-based models were implemented, and some more sophisticated motion generation strategies were tried. A modified version of Andre Carrera's Swift Transformer implementation was used.
- motion 2 language
  - Transformer from motion to annotation
- language 2 motion
  - Transformer from annotation to motion
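The two directions share the same encoder-decoder shape and differ only in which modality is embedded on each side: a motion is a sequence of joint-feature frames, an annotation a sequence of token indices. A minimal Swift for TensorFlow sketch of the motion-to-annotation side is below; the type and layer names are illustrative, not the project's actual API, and the real model inserts full Transformer encoder/decoder stacks where the comment indicates.

```swift
import TensorFlow

// Sketch of the motion2language model shape (names are hypothetical).
// Input:  motion tensor of shape [frames, jointFeatures]
// Output: per-step logits over the annotation vocabulary
struct Motion2LanguageSketch: Layer {
    var motionEmbedding: Dense<Float>  // projects joint features to model dimension
    var output: Dense<Float>           // projects model states to vocabulary logits

    init(jointFeatures: Int, vocabularySize: Int, modelSize: Int) {
        motionEmbedding = Dense(inputSize: jointFeatures, outputSize: modelSize)
        output = Dense(inputSize: modelSize, outputSize: vocabularySize)
    }

    @differentiable
    func callAsFunction(_ motion: Tensor<Float>) -> Tensor<Float> {
        let encoded = motionEmbedding(motion)
        // ... Transformer encoder and autoregressive decoder go here ...
        return output(encoded)
    }
}
```

The language2motion direction mirrors this: token embeddings feed the encoder, and the decoder regresses continuous joint-feature frames instead of emitting vocabulary logits.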
- original: 2017-06-22.zip
- processed:
- annotations and labels:
- vocabulary