Dataset (Zabra): the data (ISNA news) contains 7 topics, each with 10 documents. In the first part of the code, we build a list for each topic and place the related documents into it.
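The per-topic grouping described above can be sketched as follows; the topic names and document strings here are hypothetical placeholders, since the actual ISNA corpus layout is not shown in this README.

```python
# A minimal sketch of building one list per topic, assuming the corpus is
# available as a mapping from topic name to raw document strings
# (hypothetical sample data; the real dataset has 7 topics x 10 documents).
corpus = {
    "politics": ["document one", "document two"],
    "sports": ["sports news item"],
}

# One list per topic, holding that topic's documents.
topic_docs = {topic: list(docs) for topic, docs in corpus.items()}
print(len(topic_docs["politics"]))  # 2
```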
This code includes 6 phases:
###a
Assuming equal probability for every Persian letter, the entropy per character is calculated.
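Under the equal-probability assumption this reduces to a closed form: with 32 letters in the Persian alphabet, H = log2(32) = 5 bits per character. A minimal sketch:

```python
import math

# Entropy per character when all 32 Persian letters are assumed
# equally likely: H = -sum(p * log2(p)) = log2(32) = 5 bits.
N_LETTERS = 32
p = 1 / N_LETTERS
entropy_uniform = -sum(p * math.log2(p) for _ in range(N_LETTERS))
print(entropy_uniform)  # 5.0
```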
###b
Calculating the entropy of Persian letters and punctuation marks, using probabilities estimated from the dataset.
###g
Same as part b, but restricted to Persian letters (punctuation marks excluded).
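Parts b and g differ only in whether punctuation is kept when counting character frequencies. A hedged sketch of both, using a `letters_only` flag (the function name and alphabet string are assumptions for illustration, not taken from the actual code):

```python
import math
from collections import Counter

# The 32 letters of the Persian alphabet (assumed alphabet for this sketch).
PERSIAN_LETTERS = set("ابپتثجچحخدذرزژسشصضطظعغفقکگلمنوهی")

def char_entropy(text, letters_only=False):
    """Shannon entropy (bits/char) from empirical character frequencies.

    letters_only=False counts punctuation too (part b);
    letters_only=True keeps Persian letters only (part g).
    """
    chars = [c for c in text if not c.isspace()]
    if letters_only:
        chars = [c for c in chars if c in PERSIAN_LETTERS]
    counts = Counter(chars)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())
```

For example, a text with two equally frequent symbols yields exactly 1 bit per character.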
###d
Calculating the expected length of Persian words from the dataset.
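The expected word length is the frequency-weighted average of word lengths; a minimal sketch assuming whitespace tokenization (the real code may tokenize Persian text differently):

```python
def expected_word_length(texts):
    """Expected (mean) word length over whitespace-tokenized words."""
    words = [w for t in texts for w in t.split()]
    return sum(len(w) for w in words) / len(words)

print(expected_word_length(["ab abcd"]))  # 3.0
```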
###h
Histogram of Persian letter frequencies, sorted in descending order.
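The sorted letter histogram can be computed as below; plotting (e.g. with matplotlib's `plt.bar`) is omitted here, and the function name is an assumption for illustration.

```python
from collections import Counter

def letter_histogram(text):
    """Character frequencies sorted in descending order of count,
    ready to feed into a sorted bar chart."""
    counts = Counter(c for c in text if not c.isspace())
    return sorted(counts.items(), key=lambda kv: kv[1], reverse=True)

print(letter_histogram("aab"))  # [('a', 2), ('b', 1)]
```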
###v
Histogram of word lengths.
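A sketch of the word-length histogram bins, again assuming whitespace tokenization:

```python
from collections import Counter

def word_length_histogram(texts):
    """Count how many words have each length; the resulting mapping
    gives the bins for a word-length histogram."""
    lengths = [len(w) for t in texts for w in t.split()]
    return Counter(lengths)

print(word_length_histogram(["ab cd efg"]))  # Counter({2: 2, 3: 1})
```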