Transformer Model for Machine Translation

This repository contains a from-scratch implementation of the Transformer model for machine translation. The implementation follows the architecture described in the paper "Attention Is All You Need" by Vaswani et al.

Table of Contents

  • Introduction
  • Architecture
  • Usage
  • Requirements
  • Installation

Introduction

This project implements a Transformer model for machine translation tasks. Transformers have proven to be highly effective for various natural language processing tasks due to their ability to handle long-range dependencies and parallelize training.

Architecture

We'll build the Transformer from scratch, layer by layer. We'll start with the **Multi-Head Self-Attention** layer since that's the most involved bit. Once we have that working, the rest of the model will look familiar.

Multi-Head Self-Attention

The Transformer model relies heavily on the self-attention mechanism, which allows the model to weigh the importance of different words in a sequence relative to each other. The multi-head self-attention mechanism helps capture various aspects of the relationships between words.

[Figure: multi-head self-attention]
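
Each head computes scaled dot-product attention, softmax(QKᵀ / √d_k)·V, over its own learned projections of the input, and the heads' outputs are concatenated and projected back to the model dimension. Below is a minimal PyTorch sketch of such a layer (the requirements list TensorFlow or PyTorch; this sketch assumes PyTorch, and the class and parameter names are illustrative rather than taken from this repository):

```python
# Minimal sketch of multi-head self-attention (illustrative names, not this repo's API).
import torch
import torch.nn as nn

class MultiHeadSelfAttention(nn.Module):
    def __init__(self, d_model=512, num_heads=8):
        super().__init__()
        assert d_model % num_heads == 0, "d_model must be divisible by num_heads"
        self.num_heads = num_heads
        self.d_head = d_model // num_heads
        # Learned projections for queries, keys, and values, plus the output projection.
        self.w_q = nn.Linear(d_model, d_model)
        self.w_k = nn.Linear(d_model, d_model)
        self.w_v = nn.Linear(d_model, d_model)
        self.w_o = nn.Linear(d_model, d_model)

    def forward(self, x, mask=None):
        batch, seq_len, d_model = x.shape

        # Project, then split the model dimension into heads: (batch, heads, seq_len, d_head).
        def split_heads(t):
            return t.view(batch, seq_len, self.num_heads, self.d_head).transpose(1, 2)

        q, k, v = split_heads(self.w_q(x)), split_heads(self.w_k(x)), split_heads(self.w_v(x))

        # Scaled dot-product attention: softmax(Q K^T / sqrt(d_head)) V.
        scores = q @ k.transpose(-2, -1) / self.d_head ** 0.5
        if mask is not None:
            scores = scores.masked_fill(mask == 0, float("-inf"))
        out = scores.softmax(dim=-1) @ v

        # Merge the heads back together and apply the output projection.
        out = out.transpose(1, 2).contiguous().view(batch, seq_len, d_model)
        return self.w_o(out)
```

For example, MultiHeadSelfAttention(d_model=512, num_heads=8) maps a (batch, seq_len, 512) input to an output of the same shape, so it can be dropped into a residual connection.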

Encoder Block

The encoder block in the Transformer model consists of a multi-head self-attention mechanism followed by a position-wise feedforward neural network. Each sub-layer is wrapped in a skip (residual) connection and layer normalization.

[Figure: encoder block]
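
Under the same assumptions (PyTorch, illustrative names that are not this repository's API), a self-contained sketch of the encoder block looks like this, using torch.nn.MultiheadAttention for the attention sub-layer:

```python
# Minimal sketch of an encoder block: self-attention plus a position-wise feed-forward
# network, each wrapped in a residual (skip) connection and layer normalization.
# Names and default sizes are illustrative, not taken from this repository.
import torch.nn as nn

class EncoderBlock(nn.Module):
    def __init__(self, d_model=512, num_heads=8, d_ff=2048, dropout=0.1):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(d_model, num_heads, dropout=dropout, batch_first=True)
        self.ff = nn.Sequential(nn.Linear(d_model, d_ff), nn.ReLU(), nn.Linear(d_ff, d_model))
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        self.dropout = nn.Dropout(dropout)

    def forward(self, x, src_key_padding_mask=None):
        # Self-attention sub-layer with a post-norm residual connection, as in the paper.
        attn_out, _ = self.self_attn(x, x, x, key_padding_mask=src_key_padding_mask)
        x = self.norm1(x + self.dropout(attn_out))
        # Feed-forward sub-layer with its own residual connection and layer norm.
        x = self.norm2(x + self.dropout(self.ff(x)))
        return x
```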

Decoder Block

The decoder block is similar to the encoder block, but its self-attention is masked so that each position can only attend to earlier positions, and it includes an additional multi-head cross-attention mechanism that attends to the encoder's output.

[Figure: decoder block]
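
Again as an illustrative PyTorch sketch (not this repository's API), the decoder block stacks masked self-attention, cross-attention over the encoder output, and the feed-forward network, each with its own residual connection and layer norm:

```python
# Minimal sketch of a decoder block; names and default sizes are illustrative.
import torch.nn as nn

class DecoderBlock(nn.Module):
    def __init__(self, d_model=512, num_heads=8, d_ff=2048, dropout=0.1):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(d_model, num_heads, dropout=dropout, batch_first=True)
        self.cross_attn = nn.MultiheadAttention(d_model, num_heads, dropout=dropout, batch_first=True)
        self.ff = nn.Sequential(nn.Linear(d_model, d_ff), nn.ReLU(), nn.Linear(d_ff, d_model))
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        self.norm3 = nn.LayerNorm(d_model)
        self.dropout = nn.Dropout(dropout)

    def forward(self, x, enc_out, tgt_mask=None):
        # Masked self-attention: tgt_mask blocks attention to later (future) positions.
        sa, _ = self.self_attn(x, x, x, attn_mask=tgt_mask)
        x = self.norm1(x + self.dropout(sa))
        # Cross-attention: queries come from the decoder, keys and values from the encoder output.
        ca, _ = self.cross_attn(x, enc_out, enc_out)
        x = self.norm2(x + self.dropout(ca))
        # Feed-forward sub-layer with its own residual connection and layer norm.
        x = self.norm3(x + self.dropout(self.ff(x)))
        return x
```

A causal tgt_mask can be built with torch.triu(torch.ones(T, T, dtype=torch.bool), diagonal=1), which nn.MultiheadAttention interprets as positions that are not allowed to attend.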

Usage

To use this model for machine translation, follow the instructions below.

Requirements

  • Python 3.x
  • TensorFlow or PyTorch
  • NumPy
  • Matplotlib

Installation

Clone the repository and install the required dependencies:

git clone https://github.com/SaYanZz0/Transformers-Attention-is-all-you-need.git
cd Transformers-Attention-is-all-you-need
pip install -r requirements.txt
