Dataset (Zabra): the data (ISNA news) contains 7 topics, each with 10 documents. In the first part of the code, we build a list for each topic and place the related documents into it.
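The per-topic grouping described above can be sketched as follows; the topic names and document strings here are hypothetical placeholders, since the actual ISNA corpus layout is not shown in this README.

```python
# A minimal sketch of building one list per topic, assuming the corpus is
# available as a mapping from topic name to raw document strings
# (hypothetical sample data; the real dataset has 7 topics x 10 documents).
corpus = {
    "politics": ["document one", "document two"],
    "sports": ["sports news item"],
}

# One list per topic, holding that topic's documents.
topic_docs = {topic: list(docs) for topic, docs in corpus.items()}
print(len(topic_docs["politics"]))  # 2
```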
This code includes 6 phases:
###a
Assuming equal probability for every Persian letter, the entropy per character is calculated.
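Under the equal-probability assumption this reduces to a closed form: with 32 letters in the Persian alphabet, H = log2(32) = 5 bits per character. A minimal sketch:

```python
import math

# Entropy per character when all 32 Persian letters are assumed
# equally likely: H = -sum(p * log2(p)) = log2(32) = 5 bits.
N_LETTERS = 32
p = 1 / N_LETTERS
entropy_uniform = -sum(p * math.log2(p) for _ in range(N_LETTERS))
print(entropy_uniform)  # 5.0
```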
###b
Calculating the entropy of Persian letters and punctuation marks, using probabilities estimated from the dataset.
###g
Same as part b, but restricted to Persian letters (punctuation marks excluded).
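Parts b and g differ only in whether punctuation is kept when counting character frequencies. A hedged sketch of both, using a `letters_only` flag (the function name and alphabet string are assumptions for illustration, not taken from the actual code):

```python
import math
from collections import Counter

# The 32 letters of the Persian alphabet (assumed alphabet for this sketch).
PERSIAN_LETTERS = set("ابپتثجچحخدذرزژسشصضطظعغفقکگلمنوهی")

def char_entropy(text, letters_only=False):
    """Shannon entropy (bits/char) from empirical character frequencies.

    letters_only=False counts punctuation too (part b);
    letters_only=True keeps Persian letters only (part g).
    """
    chars = [c for c in text if not c.isspace()]
    if letters_only:
        chars = [c for c in chars if c in PERSIAN_LETTERS]
    counts = Counter(chars)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())
```

For example, a text with two equally frequent symbols yields exactly 1 bit per character.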
###d
Calculating the expected length of Persian words from the dataset.
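The expected word length is the frequency-weighted average of word lengths; a minimal sketch assuming whitespace tokenization (the real code may tokenize Persian text differently):

```python
def expected_word_length(texts):
    """Expected (mean) word length over whitespace-tokenized words."""
    words = [w for t in texts for w in t.split()]
    return sum(len(w) for w in words) / len(words)

print(expected_word_length(["ab abcd"]))  # 3.0
```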
###h
Histogram of Persian letter frequencies, sorted in descending order.
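The sorted letter histogram can be computed as below; plotting (e.g. with matplotlib's `plt.bar`) is omitted here, and the function name is an assumption for illustration.

```python
from collections import Counter

def letter_histogram(text):
    """Character frequencies sorted in descending order of count,
    ready to feed into a sorted bar chart."""
    counts = Counter(c for c in text if not c.isspace())
    return sorted(counts.items(), key=lambda kv: kv[1], reverse=True)

print(letter_histogram("aab"))  # [('a', 2), ('b', 1)]
```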
###v
Histogram of word lengths.
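A sketch of the word-length histogram bins, again assuming whitespace tokenization:

```python
from collections import Counter

def word_length_histogram(texts):
    """Count how many words have each length; the resulting mapping
    gives the bins for a word-length histogram."""
    lengths = [len(w) for t in texts for w in t.split()]
    return Counter(lengths)

print(word_length_histogram(["ab cd efg"]))  # Counter({2: 2, 3: 1})
```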