Several versions of the so-called rational model of categorization are implemented here. Executing any of the Python files in this package launches a test/demo of one of these classic models.
All of these models are essentially forms of the Dirichlet process mixture model with different methods of approximating the posterior. They assume that stimuli are generated by a mixture of an unknown number of underlying distributions: Gaussian on continuous stimulus dimensions and multinomial on discrete stimulus dimensions. The prior over how stimuli are partitioned into clusters (and hence over the number of mixture components) is given by the Dirichlet process.
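As a rough illustration of this generative assumption, here is a hedged sketch; it is not code from this package, and the function names and the alpha parameter are invented for the example. It draws a partition from the Chinese restaurant process representation of the Dirichlet process, then samples one continuous stimulus dimension per item from its cluster's Gaussian:

```python
import numpy as np

def sample_crp_partition(n_items, alpha=1.0):
    """Draw cluster assignments from a Chinese restaurant process.
    (Illustrative only -- not part of this package.)"""
    assignments, counts = [], []
    for i in range(n_items):
        # Existing clusters attract the new item in proportion to their size;
        # a brand-new cluster is opened with probability alpha / (i + alpha).
        weights = np.array(counts + [alpha], dtype=float)
        probs = weights / weights.sum()
        k = int(np.searchsorted(np.cumsum(probs), np.random.rand()))
        if k == len(counts):
            counts.append(1)
        else:
            counts[k] += 1
        assignments.append(k)
    return assignments

def sample_stimuli(assignments):
    """Draw one continuous stimulus dimension per item from its cluster's Gaussian."""
    means = np.random.normal(0.0, 3.0, size=max(assignments) + 1)
    return [np.random.normal(means[k], 1.0) for k in assignments]

if __name__ == '__main__':
    z = sample_crp_partition(20, alpha=1.0)
    print(z)
    print(sample_stimuli(z))
```

Larger values of alpha favor partitions with more clusters; the models below differ only in how they invert this generative process to recover a partition from observed stimuli.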
These should all run in Python >= 2.5, with numpy and scipy. The plotting in some of the demos will require matplotlib.
Anderson's original model is available here. Figure 1 of Anderson (1991) [Anderson] walks through the model's inner workings at each time step as it learns the classic Medin & Schaffer (1978) [MedSchaff] task. Running the file in Python launches a demo which performs the same task, finding the same answers at each step.
Because this model was developed before advanced techniques for approximating intractable Bayesian posteriors were in wide use, it views stimuli sequentially and assigns each one deterministically to the cluster most likely to have generated it.
[Anderson] | Anderson, J. R. (1991). "The adaptive nature of human categorization." Psychological Review, 98:409-429. |
[MedSchaff] | Medin, D. L. and Schaffer, M. M. (1978). "Context Theory of Classification Learning." Psychological Review, 85:207-238. |
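To make the deterministic assignment concrete, here is a minimal sketch of Anderson-style sequential assignment on a single continuous dimension. It is not the code in this repository: the function name, the fixed-variance Gaussian likelihood, and the broad predictive distribution for a new cluster are simplifications invented for illustration (Anderson's model integrates over cluster parameters under conjugate priors). The prior weights do follow Anderson's coupling parameter c: an existing cluster of size n_k gets weight c * n_k, and a new cluster gets weight (1 - c).

```python
import numpy as np
from scipy.stats import norm

def anderson_assign(stimuli, c=0.5, sigma=1.0, sigma0=5.0):
    """Sequentially assign each stimulus to its single most probable cluster
    (local MAP).  Prior weights use Anderson's coupling parameter c; the
    Gaussian likelihoods are a fixed-variance simplification for illustration."""
    clusters, labels = [], []
    for x in stimuli:
        scores = []
        for members in clusters:
            prior = c * len(members)                      # existing cluster: c * n_k
            like = norm.pdf(x, np.mean(members), sigma)   # likelihood around the cluster mean
            scores.append(prior * like)
        # New cluster: prior weight (1 - c) and a broad predictive distribution.
        scores.append((1.0 - c) * norm.pdf(x, 0.0, sigma0))
        k = int(np.argmax(scores))                        # deterministic: take the best cluster
        if k == len(clusters):
            clusters.append([x])
        else:
            clusters[k].append(x)
        labels.append(k)
    return labels

if __name__ == '__main__':
    # Two well-separated groups of stimuli end up in two clusters.
    print(anderson_assign([0.1, 0.2, 3.0, 3.1, 0.15]))
```

Because only the single best partition is kept at each step, this procedure is fast and on-line, but it can never revise an early assignment; the approximation methods below relax exactly that limitation.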
Recently, Sanborn, Griffiths, and Navarro (2006) [sgn] brought the model up to date with two methods for approximating the full posterior over possible partitions of the stimuli. I have implemented the basic model in the file particle.py, and implemented each of these approximation methods as extensions. Since the Anderson model is a special case of the "more rational" model, it can also be run here. Psychologists are particularly interested in the particle filter because it operates on-line, which is generally taken as a necessity for a psychologically plausible algorithm.
- Gibbs sampling: All items are initially assigned to clusters arbitrarily. Sampling then proceeds by removing each item in turn and relabeling it probabilistically. In the limit, the posterior probability of a given partition of the stimuli is approximated by the fraction of samples in which that partition appears. I have implemented this in gibbs.py. Running the script launches a demo which runs the sampler for a few hundred iterations on the Zeithamova and Maddox (2009) [zm] dataset, printing out its partition at each stage. (A hedged sketch of a single Gibbs sweep appears after this list.)
- Particle filtering: Items are viewed sequentially, as in the Anderson (1991) model, but the model tracks many hypotheses about the correct partition and at each stage resamples from its own existing particles. I have implemented this in filter.py. Running this script launches a demo which runs the [zm] task with 6 particles, plotting each particle's partition at the end. (A sketch of the resampling step also appears after this list.)
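For reference, here is a hedged, self-contained sketch of one Gibbs sweep over cluster labels. It is not the code in gibbs.py: the function names, the concentration parameter alpha, and the fixed-variance Gaussian likelihood are all invented for illustration.

```python
import numpy as np
from scipy.stats import norm
from collections import Counter

def canonical(labels):
    """Relabel clusters by order of first appearance so that equivalent
    partitions compare equal regardless of label names."""
    seen = {}
    return tuple(seen.setdefault(k, len(seen)) for k in labels)

def gibbs_sweep(data, labels, alpha=1.0, sigma=1.0, sigma0=5.0):
    """One sweep of Gibbs sampling over cluster labels: each item is removed
    from the partition and relabeled with probability proportional to
    prior weight * (simplified) likelihood."""
    labels = list(labels)
    for i, x in enumerate(data):
        labels[i] = None                                  # pull item i out of the partition
        ks = sorted(set(k for k in labels if k is not None))
        scores = []
        for k in ks:
            members = [data[j] for j, kj in enumerate(labels) if kj == k]
            scores.append(len(members) * norm.pdf(x, np.mean(members), sigma))
        scores.append(alpha * norm.pdf(x, 0.0, sigma0))   # weight for opening a new cluster
        probs = np.array(scores) / np.sum(scores)
        choice = int(np.searchsorted(np.cumsum(probs), np.random.rand()))
        labels[i] = ks[choice] if choice < len(ks) else (max(ks) + 1 if ks else 0)
    return labels

if __name__ == '__main__':
    data = [0.1, 0.2, 3.0, 3.1, 0.15]
    labels = [0] * len(data)                              # arbitrary initial assignment
    visits = Counter()
    for sweep in range(500):
        labels = gibbs_sweep(data, labels)
        visits[canonical(labels)] += 1
    print(visits.most_common(3))                          # most frequently visited partitions
```

The visit counts over canonicalized partitions are what stand in for the posterior over partitions in the limit of many sweeps.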
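And a similarly hedged sketch of the particle-filter idea: each particle carries its own partition of the items seen so far; when a new item arrives, every particle extends its partition probabilistically, and the particles are then resampled in proportion to how well they predicted that item. Again, this is not the code in filter.py, and it reuses the same invented, simplified likelihood as the sketches above.

```python
import numpy as np
from scipy.stats import norm

def particle_filter(data, n_particles=6, alpha=1.0, sigma=1.0, sigma0=5.0):
    """Sequential importance resampling over partitions.  Each particle holds a
    full label history; its weight is the (simplified) predictive mass it
    assigned to the newest item."""
    particles = [[] for _ in range(n_particles)]          # each particle: labels of items seen so far
    for x in data:
        weights = []
        for p in particles:
            ks = sorted(set(p))
            scores = []
            for k in ks:
                members = [data[j] for j, kj in enumerate(p) if kj == k]
                scores.append(len(members) * norm.pdf(x, np.mean(members), sigma))
            scores.append(alpha * norm.pdf(x, 0.0, sigma0))  # option: open a new cluster
            probs = np.array(scores) / np.sum(scores)
            choice = int(np.searchsorted(np.cumsum(probs), np.random.rand()))
            p.append(ks[choice] if choice < len(ks) else len(ks))
            weights.append(np.sum(scores))                # predictive mass for this item
        # Resample particles (with replacement) in proportion to their weights.
        probs = np.array(weights) / np.sum(weights)
        idx = [int(np.searchsorted(np.cumsum(probs), np.random.rand()))
               for _ in range(n_particles)]
        particles = [list(particles[i]) for i in idx]
    return particles

if __name__ == '__main__':
    for p in particle_filter([0.1, 0.2, 3.0, 3.1, 0.15], n_particles=6):
        print(p)
```

With a single particle and an argmax in place of sampling, this roughly reduces to the Anderson procedure sketched earlier, which is the sense in which the original model is a special case of the "more rational" one.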
[sgn] | Sanborn, A. N., Griffiths, T. L., and Navarro, D. J. (2006). "A More Rational Model of Categorization." Proceedings of the 28th Annual Conference of the Cognitive Science Society. |
[zm] | Zeithamova, D. and Maddox, W. T. (2009). "Learning mode and exemplar sequencing in unsupervised category learning." Journal of Experimental Psychology: Learning, Memory, and Cognition, 35:731-757. |
Although this codebase is maintained by me (John McDonnell), it is highly intertwined with Doug Markant's implementation and can be considered jointly authored; as the maintainer, though, I take the blame for any errors that have crept in.
I should also give the usual boilerplate: this code is made available without any warranty of suitability for any particular purpose, and the authors are not responsible for disgraced careers, financial meltdowns, Godzilla attacks, etc., that may arise from using it.