This dataset consists of 22,624 texts labeled for two tasks:
- Language identification: identify the language in which a given text is written.
- Topic classification: classify the topic of a given text according to its content.
# Load and Pre-process data
python preprocess.py
# Train
python train.py
# Test and results
python test.py
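
The three steps above can also be chained from a single script. The following is only a sketch: it assumes the three scripts take no arguments and are run from the repository root, and the helper names `build_commands` and `run_pipeline` are hypothetical, not part of this repository.

```python
import subprocess
import sys

# Script names come from the steps above, in the order they are listed.
STEPS = ["preprocess.py", "train.py", "test.py"]


def build_commands(python_exe=sys.executable, steps=STEPS):
    """Return one argument list per step, e.g. [python_exe, "preprocess.py"]."""
    return [[python_exe, script] for script in steps]


def run_pipeline(commands):
    """Run each command in order.

    subprocess.check_call raises CalledProcessError on a non-zero exit code,
    so a failing step stops the later steps from running.
    """
    for cmd in commands:
        print("Running: " + " ".join(cmd))
        subprocess.check_call(cmd)
```

Calling `run_pipeline(build_commands())` then runs pre-processing, training, and testing in sequence and stops at the first step that fails.
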
- The test environment is:
  - Python 3.5.2
  - Keras 2.3.1
  - TensorFlow 2.1.0
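
To compare a local setup against the versions listed above, a small check along the following lines may help. It is a sketch rather than part of the repository: the function names are hypothetical, and it assumes that each package exposes a `__version__` attribute. It avoids f-strings so that it also runs on Python 3.5.

```python
import importlib
import sys

# Versions listed in the test environment above.
EXPECTED = {"python": "3.5.2", "keras": "2.3.1", "tensorflow": "2.1.0"}


def installed_versions(packages=("keras", "tensorflow")):
    """Collect the running Python version and each package's __version__.

    A package that cannot be imported is recorded as None.
    """
    versions = {"python": "{}.{}.{}".format(*sys.version_info[:3])}
    for name in packages:
        try:
            module = importlib.import_module(name)
            versions[name] = getattr(module, "__version__", None)
        except ImportError:
            versions[name] = None
    return versions


def find_mismatches(installed, expected=EXPECTED):
    """Return {name: (expected, installed)} for every entry that differs."""
    return {
        name: (want, installed.get(name))
        for name, want in expected.items()
        if installed.get(name) != want
    }
```

Calling `find_mismatches(installed_versions())` returns an empty dictionary when the local environment matches the listed one. Other versions may work as well, since only this combination is reported as tested.
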