This repository contains code for text classification using an attention mechanism in TensorFlow, with TensorBoard visualization.


Requirements

  • Python 3.6
  • TensorFlow 1.2.1
  • NumPy

Project Modules

  • utility_dir: storage for data, vocab files, saved models, TensorBoard logs, and outputs.

  • pre_processing_module: code for pre-processing the text file, which includes sampling infrequent words and creating the training vocab and class dictionaries, stored as pickle files.

  • implementation_module: code for the model architecture, data reader, training pipeline, and test pipeline.

  • settings_module: code to set directory paths (data path, vocab path, model path, etc.), model parameters (hidden dim, attention dim, regularization, dropout, etc.), and the vocab dictionary.

  • run_module: wrapper code to run the end-to-end training and test pipelines.

  • viz_module: code to generate the embedding visualization in TensorBoard.

  • utility_code: other utility code.
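
As a rough illustration of what pre_processing_module produces, the sketch below builds a vocab dictionary and a class dictionary and round-trips the vocab through pickle. The names `build_vocab`, `build_classes`, `min_count` and the `<UNK>` token are assumptions made for this example, not names taken from the repository. It handles infrequent words by mapping them to a shared unknown token, which may differ from how the repository samples them.

```python
import pickle
from collections import Counter

UNK = "<UNK>"  # hypothetical token standing in for infrequent words


def build_vocab(utterances, min_count=2):
    """Map each word seen at least `min_count` times to an integer id."""
    counts = Counter(word for line in utterances for word in line.split())
    vocab = {UNK: 0}
    # Sort so that ids are assigned deterministically.
    for word, count in sorted(counts.items()):
        if count >= min_count:
            vocab[word] = len(vocab)
    return vocab


def build_classes(labels):
    """Map each distinct label to an integer id."""
    return {label: idx for idx, label in enumerate(sorted(set(labels)))}


utterances = ["it is hard to resist", "it is hard to say"]
labels = ["neg", "neg", "pos", "neg"]

vocab = build_vocab(utterances)
classes = build_classes(labels)

# The dictionaries are stored as pickle files, so check a round trip.
restored_vocab = pickle.loads(pickle.dumps(vocab))
```

Words below the threshold ("resist" and "say" here) get no id of their own and would be looked up as `<UNK>` at training time.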


How to run

  • train: python -m global_module.run_module.run_train

  • test: python -m global_module.run_module.run_test

  • visualize in TensorBoard: tensorboard --logdir=PATH-TO-LOG-DIR


Data Sample

  • Utterance file

    • it is hard to resist
    • But something seems to be missing .
    • A movie of technical skill and rare depth of intellect and feeling .
    • Brosnan is more feral in this film than I 've seen him before
    • . . .
  • Utterance label

    • neg
    • neg
    • pos
    • neg
    • . . .
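
The sample suggests that line i of the label file is the class of line i of the utterance file. Assuming that layout, a minimal reader could pair the two files as sketched below. The function name `read_pairs` and the file names `utterances.txt` and `labels.txt` are hypothetical, and the repository's own data reader may work differently.

```python
import os
import tempfile


def read_pairs(utterance_path, label_path):
    """Return (tokens, label) pairs from two line-aligned text files."""
    with open(utterance_path, encoding="utf-8") as f:
        utterances = f.read().splitlines()
    with open(label_path, encoding="utf-8") as f:
        labels = f.read().splitlines()
    if len(utterances) != len(labels):
        raise ValueError("utterance and label files differ in length")
    # The sample text is already tokenized, so whitespace splitting is enough.
    return [(u.split(), y) for u, y in zip(utterances, labels)]


# Demo: write two of the sample lines to temporary files and read them back.
with tempfile.TemporaryDirectory() as tmp:
    utt_path = os.path.join(tmp, "utterances.txt")
    lab_path = os.path.join(tmp, "labels.txt")
    with open(utt_path, "w", encoding="utf-8") as f:
        f.write("it is hard to resist\nBut something seems to be missing .\n")
    with open(lab_path, "w", encoding="utf-8") as f:
        f.write("neg\nneg\n")
    pairs = read_pairs(utt_path, lab_path)
```

Checking the two line counts up front catches misaligned files before they turn into silently wrong labels.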

How to change model parameters

Edit the values in set_params.py.
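
The contents of set_params.py are not shown in this README. As a sketch only, a parameter container covering the settings named under settings_module (hidden dim, attention dim, regularization, dropout) might look like the following. The class name, attribute names, and default values are all placeholders chosen for illustration, not the repository's actual settings.

```python
class ModelParams:
    """Hypothetical container for model settings (names and values assumed)."""

    def __init__(self, hidden_dim=128, attention_dim=64,
                 l2_reg=1e-4, dropout_keep_prob=0.5):
        if hidden_dim <= 0 or attention_dim <= 0:
            raise ValueError("dimensions must be positive")
        if not 0.0 < dropout_keep_prob <= 1.0:
            raise ValueError("dropout_keep_prob must be in (0, 1]")
        self.hidden_dim = hidden_dim
        self.attention_dim = attention_dim
        self.l2_reg = l2_reg
        self.dropout_keep_prob = dropout_keep_prob


# Override only the values to change; the rest keep their defaults.
params = ModelParams(hidden_dim=256, dropout_keep_prob=0.8)
```

Validating values when the object is built surfaces a bad setting before a training run starts.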


Model Graph

[Image: model graph]
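
For readers unfamiliar with the attention step in a classifier like this, here is a NumPy sketch of one common formulation: additive attention that pools a sequence of hidden states into a single vector. The README does not spell out the repository's exact formulation, so treat the function name `attention_pool`, the parameter names `W`, `b`, `u`, and the shapes as assumptions.

```python
import numpy as np


def attention_pool(hidden, W, b, u):
    """Pool (seq_len, hidden_dim) states into one (hidden_dim,) vector.

    W: (hidden_dim, attention_dim), b: (attention_dim,), u: (attention_dim,)
    """
    projected = np.tanh(hidden @ W + b)   # (seq_len, attention_dim)
    scores = projected @ u                # (seq_len,) one score per time step
    scores = scores - scores.max()        # shift for numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum()     # softmax over time steps
    context = weights @ hidden            # weighted sum of hidden states
    return context, weights


seq_len, hidden_dim, attention_dim = 5, 8, 4
rng = np.random.RandomState(0)
hidden = rng.randn(seq_len, hidden_dim)
W = rng.randn(hidden_dim, attention_dim)
b = np.zeros(attention_dim)
u = rng.randn(attention_dim)

context, weights = attention_pool(hidden, W, b, u)
```

The `weights` vector has one entry per token and sums to 1, so it can be read as how much each token contributes to the pooled representation that feeds the classifier.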

Loss and Accuracy Plots

[Images: four loss and accuracy plots]

Histogram

[Image: histogram]

Embedding Visualization

[Images: five embedding visualizations]
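
TensorBoard's embedding projector can label points using an optional metadata file. For a single column of labels, this is a plain text file with one label per line and no header row, in the same order as the embedding rows. The sketch below writes such a file. The function name `write_metadata` and the file name `metadata.tsv` are assumptions, and this is not necessarily how viz_module prepares its labels.

```python
import os
import tempfile


def write_metadata(labels, path):
    """Write one label per line, in the same order as the embedding rows."""
    with open(path, "w", encoding="utf-8") as f:
        for label in labels:
            # Tabs and newlines would break the one-label-per-line layout.
            if "\t" in label or "\n" in label:
                raise ValueError("labels must not contain tabs or newlines")
            f.write(label + "\n")


# Demo: write three labels to a temporary file and read them back.
with tempfile.TemporaryDirectory() as tmp:
    meta_path = os.path.join(tmp, "metadata.tsv")
    write_metadata(["it", "is", "hard"], meta_path)
    with open(meta_path, encoding="utf-8") as f:
        written = f.read()
```

If the row order of the metadata file and the embedding matrix differ, the projector shows the wrong label on each point, so both should be generated from the same vocab ordering.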