text_normalization

Training replication procedure

  1. Clone this repo: git clone https://github.com/maximxlss/text_normalization
  2. cd text_normalization
  3. Install requirements: pip install -r requirements.txt
  4. Install PyTorch
  5. Download ru_train.csv from this Kaggle challenge
  6. Run python preprocess.py (takes time)
  7. Run python train_tokenizer.py (also takes time)
  8. Tweak settings in train.py
  9. Run python train.py
  10. Note that I reset the scheduler manually during training (see train.py), so a single uninterrupted run will not follow the same learning-rate schedule. Details of the training process are recorded in the metrics
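
The three scripts in steps 6, 7 and 9 have to run in that order, so a small driver can save some typing. The following is only a sketch, not part of the repository: the script names come from the steps above, while the `run_pipeline` helper, the `--run` flag, and the assumption that ru_train.csv sits in the current working directory are hypothetical.

```python
import subprocess
import sys
from pathlib import Path

# Script names are taken from the steps above; everything else is illustrative.
PIPELINE = ["preprocess.py", "train_tokenizer.py", "train.py"]


def run_pipeline(data_file="ru_train.csv", dry_run=True):
    """Run the three scripts in order, stopping at the first failure.

    Returns the list of commands that were (or would be) executed.
    """
    # Assumption: the Kaggle file is expected in the current working directory.
    if not dry_run and not Path(data_file).is_file():
        raise FileNotFoundError(f"{data_file} not found; download it first (step 5)")

    commands = [[sys.executable, script] for script in PIPELINE]
    for cmd in commands:
        if dry_run:
            print("would run:", " ".join(cmd))
        else:
            # check=True raises CalledProcessError if a script exits non-zero,
            # so later steps never run on top of a failed earlier step.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    # Dry run by default; pass --run to actually execute the scripts.
    run_pipeline(dry_run="--run" not in sys.argv)
```

Saved under any name in the repository root, the script prints the planned commands when run without arguments and executes them when given `--run`. Settings still need to be adjusted in train.py beforehand (step 8), and the manual scheduler reset is not reproduced here.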
