A simple but efficient method to approximately calculate the users' vocabulary level

hossein-amirkhani/VocabLevel

VocabLevel

This is a simple but efficient method for approximately estimating a user's vocabulary level.

How does it work?

It is based on the subtitles of the Friends series, although it can easily be extended to other corpora. After some simple pre-processing steps, the unique words are sorted by frequency. A binary search over this sorted list then finds the user's vocabulary level. This approach is based on the following simple rules:

  • If the user knows the meaning of a word, they likely also know the meaning of more frequent words.
  • If the user does not know the meaning of a word, they likely also do not know the meaning of less frequent words.
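The steps described so far, frequency ranking followed by a binary search driven by these two rules, can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the function names `rank_words` and `estimate_level`, the tokenization regex, and the `knows` callback (which stands in for asking the user a question) are all assumptions made for the example.

```python
import re
from collections import Counter
from typing import Callable, List


def rank_words(text: str) -> List[str]:
    """Return the unique words of `text`, most frequent first."""
    # Assumed pre-processing: lowercase, keep alphabetic tokens
    # (optionally with one internal apostrophe, e.g. "don't").
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    counts = Counter(words)
    # Sort by descending count; break ties alphabetically for determinism.
    return sorted(counts, key=lambda w: (-counts[w], w))


def estimate_level(ranked: List[str], knows: Callable[[str], bool]) -> int:
    """Binary-search the ranked list and return the estimated number of known words."""
    # Invariant: words before `lo` are assumed known,
    # words from `hi` onward are assumed unknown.
    lo, hi = 0, len(ranked)
    while lo < hi:
        mid = (lo + hi) // 2
        if knows(ranked[mid]):
            lo = mid + 1  # rule 1: more frequent words are assumed known
        else:
            hi = mid      # rule 2: less frequent words are assumed unknown
    return lo


# Hypothetical usage: a tiny corpus and a simulated user.
ranked = rank_words("the the the cat cat dog")
level = estimate_level(ranked, lambda w: w in {"the", "cat"})
print(ranked, level)
```

In an interactive tool, `knows` would prompt the user instead of consulting a fixed set. The returned index is the user's estimated position in the frequency-ranked list.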

Although these rules do not hold exactly, they are a reasonable approximation. They yield a very efficient method that needs only about log2(n) questions for a list of n words: with a list of 10,000 words, for example, you would need to answer about 13 questions (log2(10000) ≈ 13.3).
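The question count can be checked with a short calculation. The sketch below assumes a standard half-open binary search that halves the candidate range after every answer; the helper name `max_questions` is hypothetical, and the exact count in practice depends on how the search is implemented.

```python
def max_questions(n: int) -> int:
    """Worst-case number of yes/no questions for a binary search over n ranked words."""
    # For n >= 1 the worst case is floor(log2(n)) + 1, which is exactly
    # the number of bits needed to write n in binary; for n == 0 it is 0.
    return n.bit_length()


for size in (100, 1_000, 10_000, 100_000):
    print(size, max_questions(size))
```

Under this assumption a 10,000-word list needs at most 14 questions, in line with the rough log2(n) estimate, and a list ten times larger adds only three or four more.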
