- Transformer Encoder with Char information for text classification
- This code was created by referring to the implementations of carpedm20 and DongjunLee
- Input words are represented by Char-CNN and Word2vec embeddings concatenated together (64 dimensions each)
- The standard Transformer Encoder from "Attention Is All You Need" is used
- The model is composed of 7 Transformer Encoder layers with 4 attention heads
- A Global Average Pooling layer followed by softmax is used at the end to predict the class
- The Char-CNN component follows the architecture proposed by Yoon Kim; a sketch of the full model is given below
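A hedged sketch of the model described above, in TensorFlow 1.x style to match the TF 1.8 requirement below. Only the numbers from this README are taken as given (64-dim Word2vec + 64-dim Char-CNN per word, 7 encoder layers, 4 heads, softmax classifier); every function name, hyperparameter, and shape choice here is an assumption, and positional encodings, padding masks, and dropout are omitted for brevity.

import tensorflow as tf  # sketch targets TensorFlow 1.8

def char_cnn(char_ids, char_vocab=100, char_dim=16, widths=(2, 3, 4, 5), filters=16):
    # Yoon Kim-style Char-CNN: embed characters, convolve with several
    # filter widths, max-pool over time, concat -> 4 * 16 = 64 dims per word.
    emb = tf.get_variable("char_emb", [char_vocab, char_dim])
    B, S, W = tf.shape(char_ids)[0], tf.shape(char_ids)[1], tf.shape(char_ids)[2]
    x = tf.nn.embedding_lookup(emb, char_ids)            # [B, S, W, char_dim]
    x = tf.reshape(x, [-1, W, char_dim])                 # one word per row
    pooled = [tf.reduce_max(tf.layers.conv1d(x, filters, w, activation=tf.nn.relu), axis=1)
              for w in widths]                           # max over time per width
    return tf.reshape(tf.concat(pooled, -1), [B, S, filters * len(widths)])

def encoder_layer(x, heads=4, d_model=128, d_ff=512, scope="enc"):
    # One standard Transformer encoder block: multi-head self-attention plus
    # a position-wise feed-forward net, each with a residual + layer norm.
    with tf.variable_scope(scope):
        B, S = tf.shape(x)[0], tf.shape(x)[1]
        d_head = d_model // heads

        def split(t):  # [B, S, d_model] -> [B, heads, S, d_head]
            return tf.transpose(tf.reshape(t, [B, S, heads, d_head]), [0, 2, 1, 3])

        q, k, v = (split(tf.layers.dense(x, d_model)) for _ in range(3))
        att = tf.nn.softmax(tf.matmul(q, k, transpose_b=True) / d_head ** 0.5)
        ctx = tf.reshape(tf.transpose(tf.matmul(att, v), [0, 2, 1, 3]), [B, S, d_model])
        x = tf.contrib.layers.layer_norm(x + tf.layers.dense(ctx, d_model))
        ff = tf.layers.dense(tf.layers.dense(x, d_ff, activation=tf.nn.relu), d_model)
        return tf.contrib.layers.layer_norm(x + ff)

def model(word_ids, char_ids, word2vec, num_classes=4, layers=7, heads=4):
    # 64-dim Word2vec lookup concatenated with 64-dim Char-CNN output -> 128 dims
    w_emb = tf.get_variable("word_emb", initializer=word2vec)  # [vocab, 64]
    x = tf.concat([tf.nn.embedding_lookup(w_emb, word_ids),
                   char_cnn(char_ids)], axis=-1)               # [B, S, 128]
    for i in range(layers):                                    # 7 encoder layers
        x = encoder_layer(x, heads=heads, scope="enc_%d" % i)
    pooled = tf.reduce_mean(x, axis=1)                         # global average pooling
    return tf.nn.softmax(tf.layers.dense(pooled, num_classes)) # class probabilities

Given [batch, seq_len] word ids and [batch, seq_len, word_len] char ids, model returns [batch, 4] class probabilities.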
- TensorFlow 1.8.0
- Python 3.6
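- Install TensorFlow (a suggested command, assuming pip)
$ pip install tensorflow==1.8.0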
- Clone the repository
$ git clone https://github.com/MSWon/Transformer-Encoder-with-Char.git
- Unzip data.zip and embedding.zip
$ unzip data.zip
$ unzip embedding.zip
- Train with user settings (char_mode is one of char_cnn, char_lstm, no_char)
$ python train.py --batch_size 128 --training_epochs 12 --char_mode char_cnn
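For reference, a minimal sketch of how these flags could be declared with argparse; only the flag names and example values come from the command above, and the repo's actual train.py may declare them differently.

import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--batch_size", type=int, default=128)      # samples per step
parser.add_argument("--training_epochs", type=int, default=12)  # passes over the data
parser.add_argument("--char_mode", default="char_cnn",          # char representation
                    choices=["char_cnn", "char_lstm", "no_char"])
args = parser.parse_args()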
- The AG’s news topic classification dataset is constructed by choosing the 4 largest classes from the original news corpus
- The 4 classes are ‘world’, ‘sports’, ‘business’ and ‘science/technology’
- Each class contains 30,000 training samples and 1,900 test samples
- In total there are 120,000 training samples and 7,600 test samples
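For orientation, the standard AG’s News CSV release (Zhang et al., 2015) stores one sample per row as (class index 1–4, title, description). A minimal loading sketch assuming that layout and a hypothetical file path; this repo's data.zip may store the data differently.

import csv
from collections import Counter

LABELS = {1: "world", 2: "sports", 3: "business", 4: "science/technology"}

def load_agnews(path):
    # Each row is (class_index, title, description) in the standard release.
    texts, labels = [], []
    with open(path, encoding="utf-8") as f:
        for cls, title, desc in csv.reader(f):
            texts.append(title + " " + desc)
            labels.append(LABELS[int(cls)])
    return texts, labels

texts, labels = load_agnews("data/train.csv")  # hypothetical path
print(Counter(labels))  # expect 30,000 samples per class -> 120,000 total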