Google ngram defucker

This parses and compresses the Google Ngrams dataset into a compact binary file that is easy to parse from Rust for later processing, with much more reasonable file sizes.
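As a rough illustration of the parse-then-pack idea, here is a minimal sketch. It is not this repo's actual code or file format. It assumes the 2012-style export, where each line is `ngram<TAB>year<TAB>match_count<TAB>volume_count`. The `Record` struct, the `parse_line` and `encode` functions, and the byte layout are all hypothetical.

```rust
/// One row of the (assumed) 2012-style export.
#[derive(Debug, PartialEq)]
struct Record {
    ngram: String,
    year: u16,
    match_count: u64,
    volume_count: u64,
}

/// Parse one tab-separated line. Returns None if a field is missing,
/// is not a number, or if there are extra fields.
fn parse_line(line: &str) -> Option<Record> {
    let trimmed = line.trim_end_matches(|c: char| c == '\n' || c == '\r');
    let mut fields = trimmed.split('\t');
    let ngram = fields.next()?.to_string();
    let year = fields.next()?.parse::<u16>().ok()?;
    let match_count = fields.next()?.parse::<u64>().ok()?;
    let volume_count = fields.next()?.parse::<u64>().ok()?;
    if fields.next().is_some() {
        return None;
    }
    Some(Record { ngram, year, match_count, volume_count })
}

/// Append a record to `out` using a hypothetical little-endian layout:
/// u32 ngram byte length, ngram bytes, u16 year, u64 match count, u64 volume count.
fn encode(rec: &Record, out: &mut Vec<u8>) {
    let bytes = rec.ngram.as_bytes();
    out.extend_from_slice(&(bytes.len() as u32).to_le_bytes());
    out.extend_from_slice(bytes);
    out.extend_from_slice(&rec.year.to_le_bytes());
    out.extend_from_slice(&rec.match_count.to_le_bytes());
    out.extend_from_slice(&rec.volume_count.to_le_bytes());
}
```

Storing the numbers as fixed-width integers instead of decimal text is one simple way to shrink the data. A real format would probably go further, for example by storing each ngram once instead of once per year.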

Limitations

My computer dies on large files. The tool should probably mmap the input file.
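Memory mapping in Rust needs an external crate or platform-specific calls, so the sketch below shows a different technique that needs only the standard library: streaming the input one line at a time through a `BufRead`. This is not mmap, and it is not what the tool currently does. It assumes tab-separated lines with the match count in the third column, and the function name `sum_match_counts` is hypothetical.

```rust
use std::io::{self, BufRead};

/// Sum the third tab-separated column (assumed to be the match count)
/// while keeping only one line in memory at a time.
fn sum_match_counts<R: BufRead>(mut reader: R) -> io::Result<u64> {
    let mut line = String::new(); // reused for every line to avoid reallocating
    let mut total = 0u64;
    loop {
        line.clear();
        // read_line returns Ok(0) at end of input.
        if reader.read_line(&mut line)? == 0 {
            break;
        }
        // Lines with a missing or non-numeric third column are skipped.
        let count = line
            .trim_end()
            .split('\t')
            .nth(2)
            .and_then(|field| field.parse::<u64>().ok());
        if let Some(count) = count {
            total += count;
        }
    }
    Ok(total)
}
```

For a real file, the reader could be `std::io::BufReader::new(std::fs::File::open(path)?)`. Memory use then stays roughly bounded by the longest line, regardless of file size.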