Memorize

This is a repository containing code and data for the paper:

B. Tabibian, U. Upadhyay, A. De, A. Zarezade, Bernhard Schölkopf, and M. Gomez-Rodriguez. Enhancing Human Learning via Spaced Repetition Optimization. Proceedings of the National Academy of Sciences (PNAS), March, 2019.

The paper is available from the PNAS website, and the supporting website gives a description of our algorithm in a nutshell.

As a follow-up to this work, we tested a variant of the algorithm presented here (named Select) in the wild by means of a randomized trial and found that it performed significantly better than competitive baselines. We present those findings in the following paper:

U. Upadhyay, G. Lancashire, C. Moser and M. Gomez-Rodriguez. Large-scale randomized experiment reveals machine learning helps people learn and remember more effectively. npj Science of Learning, 6, Article number: 26 (2021).

Pre-requisites

This code depends on the following packages:

numpy
pandas
matplotlib
seaborn
scipy
dill
click

Apart from this, the instructions assume that the Duolingo dataset has been downloaded, extracted, and saved at ./data/raw/duolingo.csv.

Code structure

memorize.py contains the memorize algorithm.
preprocesed_weights.csv contains estimated model parameters for the HLR model, as described in Section 8 of the supplementary materials.
observation_1k.csv contains a set of 1K user-item pairs and, for each pair, the user's total and correct attempt counts on that item. This dataset has been curated from a larger dataset released by Duolingo, available here.
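To give a feel for what HLR model parameters are used for, here is a minimal sketch of the usual half-life regression formulation, in which a weight vector maps item features to a half-life h = 2^(theta . x) and the predicted recall after a gap of delta days is p = 2^(-delta / h). The feature names and weight values below are hypothetical, and the exact parametrization behind the weights shipped in this repository may differ.

```python
import math


def half_life(weights, features):
    """Half-life in days: h = 2 ** (theta . x)."""
    score = sum(weights.get(name, 0.0) * value for name, value in features.items())
    return 2.0 ** score


def recall_probability(days_since_review, h):
    """Predicted recall after a gap: p = 2 ** (-delta / h)."""
    return 2.0 ** (-days_since_review / h)


# Hypothetical weights and features, for illustration only.
weights = {"bias": 1.0, "n_correct": 0.5, "n_wrong": -0.5}
features = {"bias": 1.0, "n_correct": 2.0, "n_wrong": 0.0}

h = half_life(weights, features)   # 2 ** (1.0 + 1.0 + 0.0) = 4 days
p = recall_probability(4.0, h)     # one half-life has elapsed, so p = 0.5
print(h, p)
```

By construction, recall is 1 right after a review and halves every h days.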

Execution

The code can be executed as follows:

python memorize.py

This runs the algorithm with the default value of the parameter q set in the code.
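The role of q can be illustrated with a small sketch. In the paper, the reviewing intensity is u(t) = (1 - m(t)) / sqrt(q), where m(t) is the recall probability; assuming exponential forgetting m(t) = exp(-n * t) with forgetting rate n, the next review time can be sampled by thinning. This is an illustration using only the standard library, not the code in memorize.py, and the function names and parameter values are assumptions.

```python
import math
import random


def intensity(t, n, q):
    """Reviewing intensity u(t) = (1 - m(t)) / sqrt(q), with m(t) = exp(-n * t)."""
    return (1.0 - math.exp(-n * t)) / math.sqrt(q)


def sample_next_review(n, q, horizon, rng):
    """Sample the next review time by thinning; return None if it falls after `horizon`."""
    max_rate = 1.0 / math.sqrt(q)  # upper bound on u(t), since 1 - m(t) <= 1
    t = 0.0
    while True:
        # Propose the next candidate from a Poisson process with rate max_rate.
        t += rng.expovariate(max_rate)
        if t > horizon:
            return None
        # Accept the candidate with probability u(t) / max_rate = 1 - m(t).
        if rng.random() < intensity(t, n, q) / max_rate:
            return t


rng = random.Random(0)
frequent = [sample_next_review(n=1.0, q=0.01, horizon=1000.0, rng=rng) for _ in range(200)]
rare = [sample_next_review(n=1.0, q=100.0, horizon=1000.0, rng=rng) for _ in range(200)]
```

A smaller q raises the intensity and so schedules reviews sooner; a larger q spaces them out.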

Experiments with Duolingo data

Pre-processing

Convert the raw dataset to a Python dict keyed by user_id and lexeme_id, and prune it so that it is quicker to read:

python dataset2dict.py ./data/raw/duolingo.csv ./data/duo_dict.dill --success_prob 0.99 --max_days 30 
python process_raw_data.py ./data/raw/duolingo.csv ./data/duolingo_reduced.csv
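The shape of a dict keyed by user_id and lexeme_id can be sketched as follows. This is not the logic of dataset2dict.py and it does not reproduce the pruning controlled by --success_prob or --max_days; it only shows the nesting. It assumes the CSV has columns named user_id, lexeme_id, timestamp and p_recall, and it reads from a tiny in-memory sample with made-up values rather than the real file.

```python
import csv
import io
from collections import defaultdict

# Tiny in-memory stand-in for the raw CSV; all values are made up.
sample = io.StringIO(
    "user_id,lexeme_id,timestamp,p_recall\n"
    "u1,l1,200,1.0\n"
    "u1,l1,100,0.5\n"
    "u1,l2,150,1.0\n"
    "u2,l1,120,0.0\n"
)


def build_dict(handle):
    """Group review rows as {user_id: {lexeme_id: [reviews sorted by timestamp]}}."""
    by_user = defaultdict(lambda: defaultdict(list))
    for row in csv.DictReader(handle):
        by_user[row["user_id"]][row["lexeme_id"]].append(
            {"timestamp": int(row["timestamp"]), "p_recall": float(row["p_recall"])}
        )
    # Put each (user, item) history in chronological order.
    for items in by_user.values():
        for reviews in items.values():
            reviews.sort(key=lambda review: review["timestamp"])
    # Convert to plain dicts so that missing keys raise instead of being created.
    return {user: dict(items) for user, items in by_user.items()}


duo_dict = build_dict(sample)
print(duo_dict["u1"]["l1"])
```

With this layout, the full review history of one user on one item is a single lookup, duo_dict[user_id][lexeme_id].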

Plots

See the notebook plots.ipynb.
