
Performing xcs.predict(X) alters the model when it should not (supervised learning) #58

Closed
dpaetzel opened this issue Apr 15, 2023 · 4 comments · Fixed by #59

Comments

@dpaetzel
Contributor

dpaetzel commented Apr 15, 2023

Hi! 🙂

When doing supervised learning, I'd expect model.predict(X) not to change model. However, xcs.predict(X) does sometimes change the model. This is especially problematic for large, high-dimensional data sets.

Why is the model changed by xcs.predict(X) at all? Due to how XCSF performs covering: the classifiers created by covering are simply added to the population (and, correspondingly, other classifiers are deleted).

In my opinion, this is generally undesirable when doing supervised learning, where users expect model.predict(X) not to alter the model. Also, in most cases, supervised learning fits the model once and from then on only performs predictions, which means that there will never be a fitness signal for these newly created classifiers.

This is especially problematic for high-dimensional data, where new data points are only matched by existing classifiers with low probability. In my case, I had a large test data set (50,000 test data points, 20 dimensions) and, upon doing xcs.predict(X), the existing, fitted population essentially got erased (all classifiers were replaced by random new classifiers with experience 0 and a correspondingly bad fit). While I'd expect the predictions to be bad in this case, I would definitely not expect the model's state to be erased.

How could this be solved? I guess one way to approach this while keeping the overall XCSF character would be to, upon xcs.predict(X), generate covering classifiers as necessary to perform the prediction but not add them to the population. Another way would be to add a default rule which matches everything and predicts the data mean or something like that.

What do you think?

@dpaetzel
Contributor Author

dpaetzel commented Apr 15, 2023

Note that the first option for solving this may be as simple as changing this line to

if (xcsf->explore) { /* only insert covering classifiers during training (exploration) */
    clset_add(&xcsf->pset, new);
}

as well as freeing up that memory at the end of clset_cover.

@dpaetzel
Contributor Author

dpaetzel commented Apr 15, 2023

My workaround for now looks like this (I do all this in Python; a rough sketch follows below):

  1. Decrease the maximum population size by one, before calling xcs.fit(X, y).
  2. Create a default rule (which predicts the training data mean) with a very small fitness (so predictions where other rules match are not biased towards the default rule too much).
  3. Increase the maximum population size by one again.
  4. Add that default rule to the population via the JSON import.

Increasing the maximum population size again is necessary because otherwise the maximum would be exceeded and a rule would be deleted by roulette-wheel deletion; and since the default rule has a small fitness (which it should, because the fitness is also used as a mixing weight), the probability of the default rule being the one deleted would be rather high.
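
For illustration, the whole thing might look roughly like the sketch below. This is only a sketch: POP_SIZE and json_insert_cl() are my assumptions about how the Python bindings expose the maximum population size and the JSON import, and build_default_rule_json() is a hypothetical helper that would serialise an everything-matching rule with the given prediction and fitness.

import numpy as np

def fit_with_default_rule(xcs, X, y, max_pop_size):
    """Sketch of the workaround above (attribute and function names are assumptions)."""
    # 1. reserve one population slot for the default rule
    xcs.POP_SIZE = max_pop_size - 1  # assumed hyperparameter attribute
    xcs.fit(X, y, True)              # supervised fit; exact signature may differ by version
    # 2. build a default rule that matches everything, predicts the training
    #    data mean, and has a tiny fitness so it barely influences mixing
    default_prediction = np.mean(y, axis=0)
    rule_json = build_default_rule_json(default_prediction, fitness=1e-6)  # hypothetical helper
    # 3. restore the maximum population size so the insert does not trigger deletion
    xcs.POP_SIZE = max_pop_size
    # 4. add the default rule to the population via the JSON import
    xcs.json_insert_cl(rule_json)    # assumed name of the JSON-import call
    return xcs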

Note that I don't think that this is a very good workaround. 😉

@rpreen
Member

rpreen commented Apr 15, 2023

You're right that, generally, we wouldn't expect predict() or score() to result in any modifications to the model (the EA and classifier updates are disabled for those functions for that reason), but, as you point out, covering may still be invoked.

I think a simple temporary solution from Python would be to checkpoint the population set by calling store() right before calling predict() and then call retrieve() to put things back. store() creates a complete copy of the current population set in RAM. retrieve() has only minor overhead because it just deletes the current population set and then points to the stored set; but if you do this in a loop and predict one sample at a time, there might be noticeable overhead in calling store() each time to copy the whole population set for 50,000 iterations...
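
In Python, that checkpointing could look something like the following minimal sketch (using only the store(), predict() and retrieve() calls mentioned above):

def predict_without_side_effects(xcs, X):
    """Checkpoint the population, predict, then roll back any covering (sketch)."""
    xcs.store()                   # copy the current population set in RAM
    predictions = xcs.predict(X)  # may invoke covering and modify the population
    xcs.retrieve()                # drop the (possibly modified) population, restore the copy
    return predictions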

A proper solution for this will require some thought. It would be reasonably simple to make it return user-defined values instead of covering when calling predict() or score(), which could be set as a parameter. Does that sound like the best way? I'm thinking of something like (for y_dim=3) pred = xcs.predict(X, cover=[0,0,0]), where cover would be optional and would define what to return for a sample whose match set is empty instead of invoking covering.

I think in practice, if there is no matching rule, then you would want to impute the value? Something like using the most common class or picking the nearest training sample, etc., which I guess could be done manually in Python afterwards if those samples could be identified.

@dpaetzel
Contributor Author

> checkpoint the population set

Ah, nice, I was not aware of that workaround. Thank you! I'll have to benchmark how much of a performance problem that is when compared to adding a default rule, though.

> make it return user-defined values instead of covering when calling predict() or score() which could be set as a parameter. Does that sound like the best way? I'm thinking of something like (for y_dim=3) pred = xcs.predict(X, cover=[0,0,0]) where cover would be optional and define what to return for a sample if the match set is empty instead of invoking covering.

That would definitely be an option.

> I think in practice if there is no matching rule then you would want to impute the value?

I'm wrapping XCS into an (almost) scikit-learn-compatible Python object anyway, and within that object I'm already computing a default prediction (right now I go with the data mean), which is used to create the default rule. Being able to provide the default prediction directly to xcs.predict would definitely be an improvement.
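
For context, the default prediction inside that wrapper is currently nothing more than the mean of the training targets, roughly along these lines (XCSRegressor is a hypothetical name for the wrapper, not actual code from it):

import numpy as np

class XCSRegressor:
    """(Almost) scikit-learn-compatible wrapper around xcsf (sketch)."""

    def fit(self, X, y):
        # fallback prediction, later used to build the default rule
        self.default_prediction_ = np.mean(y, axis=0)
        # ... fit the wrapped xcsf model here ...
        return self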

dpaetzel added a commit to dpaetzel/run-rsl-bench that referenced this issue May 9, 2023