This is a supporting repository for this Kaggle notebook, which analyzes Hindi Bible text using NLP.
Hindi NLP-Data preprocessing.ipynb contains the data cleaning steps for the JSON file (dataset).
Files created during data preprocessing are available in the Results folder.
Hindi NLP resources such as the Indic NLP library and Hindi SentiWordNet are required to run the Hindi NLP.ipynb file.
indic_nlp_resources can be downloaded from here
indic_nlp_library can be downloaded from here
Alternative: tokenizing with NLTK also works
Hindi SentiWordNet can be downloaded from here
The Hindi Bible dataset in JSON format is taken from here
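
To give a rough idea of what the tokenization step mentioned above involves, here is a minimal sketch that uses only the Python standard library. It is not the Indic NLP library's tokenizer and not NLTK's; the function names `split_sentences` and `tokenize` and the punctuation set are assumptions made for illustration, and the notebooks may handle this differently.

```python
import re

# Devanagari danda (।) and double danda (॥) close sentences and verses in
# Hindi text; "?" and "!" are included as well. Assumed set for this sketch.
_SENTENCE_END = re.compile(r"[।॥?!]+")

# Punctuation that should become its own token. Hypothetical choice; extend
# it to match whatever symbols appear in the dataset.
_PUNCT = re.compile(r"([,;:()\[\]।॥?!.-])")


def split_sentences(text):
    """Split text into sentences on danda-style terminators."""
    parts = _SENTENCE_END.split(text)
    return [part.strip() for part in parts if part.strip()]


def tokenize(text):
    """Split text into word and punctuation tokens.

    Punctuation is padded with spaces and the result is split on
    whitespace, so Devanagari words stay intact, including their
    vowel signs (matras).
    """
    spaced = _PUNCT.sub(r" \1 ", text)
    return spaced.split()


if __name__ == "__main__":
    sample = "आदि में परमेश्वर ने आकाश और पृथ्वी की सृष्टि की।"
    print(tokenize(sample))
    print(split_sentences("यह पहला वाक्य है। यह दूसरा है।"))
```

Splitting on whitespace rather than on a generic word-character pattern is deliberate here: in Python's `re` module, `\w` does not match Devanagari vowel signs, so a pattern like `\w+` would break Hindi words apart.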