Skip to content

Eigen Memory Trees (EMT)

Mark Rucker edited this page Jul 18, 2023 · 10 revisions

An Eigen Memory Tree (EMT) is a memory based learning reduction that works on examples with simple label types. EMTs will remember previous training examples and use this memory to assign labels to future requested predictions. In this way EMT is similar in kind to other memory based learners such as nearest neighbor, however, unlike nearest neighbors models EMT is able to learn and grow incrementally while preserving O(log n) time complexity. The original EMT paper describing the reduction in detail can be found here.

Experiments

Python based contextual bandit experiments using EMT can be found at: https://github.com/mrucker/emt_experiments. These experiments create contextual bandit learners from an EMT by remembering the rewards received when taking an action in a given context. At prediction time the EMT is queried for every available context action pair. The context action pair that EMT predicts will receive the highest reward becomes the greedy action.

Usage

In order to use the EMT reduction in VW all one needs to do is pass --emt when initializing VW. In addition to this there a few other optional flags.

vw --emt --emt_tree <tree mems> --emt_leaf <leaf mems> --emt_scorer <scorer type> --emt_router <router type> --emt_initial <initial metric>

The first flag --emt is a required indicator that simply tells VW to create an EMT reduction. The first argument, <tree mems>, indicates the number of memories the tree is allowed to have before memories begin to be pruned. This is an optional argument whose default value, 0, means that the an EMT tree will never forget memories. The second argument, <leaf mems>, indicates how many memories a leaf in the tree can have before a leaf will be split into two smaller leaves. Leaf size is a trade-off parameter. In general larger leaf sizes lead to more accurate predictions at the cost of more compute time. The next two arguments <scorer type> and <router type> control how the tree is searched. By default these are set to use self-consistent rank scorer and eigen routing. The final argument, <initial metric> determines the metric used to initialize the scoring function. By default this uses cosine similarity primarily for its insensitivity to feature scaling. In general the default values should work well for most problems.

Label Type

EMT works with simple label types:

[Label] [Importance] [Base]

  • [Label] should be a scalar value.
  • [Importance] determines how large of an update step to take.
  • [Base] is unused by EMT. EMT calculates its own base internally.

Small Performance Suggestions

EMT also works with several existing VW flags due to it reducing to a regression learner. We have found the following flags can improve performance.

--min_prediction 0 --max_prediction 3 --interactions <polymomial namespace interactions>

Due to the mechanism used by EMT to compare memories bounding the min and max prediction used by the reduction learner can greatly improve performance. Additionally, adding a few simple non-linear namespace interactions can also improve the performance of the scorer reduction learner. None of these flags are necessary, but if you are not getting desired results from EMT they are easy to test.

Clone this wiki locally