
Performing xcs.predict(X) alters the model when it should not (supervised learning) #58

Closed
dpaetzel opened this issue Apr 15, 2023 · 4 comments · Fixed by #59

Comments

@dpaetzel
Contributor

dpaetzel commented Apr 15, 2023

Hi! 🙂

When doing supervised learning, I'd expect model.predict(X) not to change model. However, xcs.predict(X) does sometimes change the model. This is especially problematic for large, high-dimensional data sets.

Why is the model changed by xcs.predict(X) at all? Due to how XCSF performs covering: the classifiers created by covering are simply added to the population (and, correspondingly, other classifiers are deleted).

In my opinion, this is generally undesirable when doing supervised learning, where users expect model.predict(X) not to alter the model. Also, in most cases, supervised learning fits the model once and from then on only performs predictions, which means that there will never be a fitness signal for these newly created classifiers.

This is especially problematic for high-dimensional data, where new data points are only matched by existing classifiers with low probability. In my case, I had a large test data set (50,000 test data points, 20 dimensions) and, upon doing xcs.predict(X), the existing, fitted population essentially got erased (all classifiers were replaced by random new classifiers with experience 0 and a correspondingly bad fit). While I'd expect the predictions to be bad in this case, I would definitely not expect the model's state to be erased.

How could this be solved? I guess one way to approach this while keeping the overall XCSF character would be to, upon xcs.predict(X), generate covering classifiers as necessary to perform the prediction but not add them to the population. Another way would be to add a default rule which matches everything and predicts the data mean or something like that.

What do you think?

@dpaetzel
Contributor Author

dpaetzel commented Apr 15, 2023

Note that the first option for solving this may be as simple as changing this line to

if (xcsf->explore) { /* only insert covering classifiers during training (exploration) */
    clset_add(&xcsf->pset, new);
}

as well as freeing up that memory at the end of clset_cover.

@dpaetzel
Contributor Author

dpaetzel commented Apr 15, 2023

My workaround for now looks like this (I do all this in Python; a rough sketch follows below):

  1. Decrease the maximum population size by one, before calling xcs.fit(X, y).
  2. Create a default rule (which predicts the training data mean) with a very small fitness (so predictions where other rules match are not biased towards the default rule too much).
  3. Increase the maximum population size by one again.
  4. Add that default rule to the population via the JSON import.

Increasing the maximum population size again is necessary because otherwise the maximum would be exceeded and a rule would be deleted by roulette-wheel deletion; and since the default rule has a small fitness (which it should, because the fitness is also used as a mixing weight), the probability of the default rule being the one deleted would be rather high.
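
For illustration, the whole thing might look roughly like the sketch below. This is only a sketch: POP_SIZE and json_insert_cl() are my assumptions about how the Python bindings expose the maximum population size and the JSON import, and build_default_rule_json() is a hypothetical helper that would serialise an everything-matching rule with the given prediction and fitness.

import numpy as np

def fit_with_default_rule(xcs, X, y, max_pop_size):
    """Sketch of the workaround above (attribute and function names are assumptions)."""
    # 1. reserve one population slot for the default rule
    xcs.POP_SIZE = max_pop_size - 1  # assumed hyperparameter attribute
    xcs.fit(X, y, True)              # supervised fit; exact signature may differ by version
    # 2. build a default rule that matches everything, predicts the training
    #    data mean, and has a tiny fitness so it barely influences mixing
    default_prediction = np.mean(y, axis=0)
    rule_json = build_default_rule_json(default_prediction, fitness=1e-6)  # hypothetical helper
    # 3. restore the maximum population size so the insert does not trigger deletion
    xcs.POP_SIZE = max_pop_size
    # 4. add the default rule to the population via the JSON import
    xcs.json_insert_cl(rule_json)    # assumed name of the JSON-import call
    return xcs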

Note that I don't think that this is a very good workaround. 😉

@rpreen
Member

rpreen commented Apr 15, 2023

You're right that, generally, we wouldn't expect predict() or score() to result in any modifications to the model (the EA and classifier updates are disabled for those functions for that reason), but, as you point out, covering may still be invoked.

I think a simple temporary solution from Python would be to checkpoint the population set by calling store() right before calling predict() and then call retrieve() to put things back. store() creates a complete copy of the current population set in RAM. retrieve() has only minor overhead because it just deletes the current population set and then points to the stored set; but if you do this in a loop and predict one sample at a time, there might be noticeable overhead in calling store() each time to copy the whole population set for 50,000 iterations...
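
In Python, that checkpointing could look something like the following minimal sketch (using only the store(), predict() and retrieve() calls mentioned above):

def predict_without_side_effects(xcs, X):
    """Checkpoint the population, predict, then roll back any covering (sketch)."""
    xcs.store()                   # copy the current population set in RAM
    predictions = xcs.predict(X)  # may invoke covering and modify the population
    xcs.retrieve()                # drop the (possibly modified) population, restore the copy
    return predictions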

A proper solution for this will require some thought. It would be reasonably simple to make it return user-defined values instead of covering when calling predict() or score(), which could be set as a parameter. Does that sound like the best way? I'm thinking of something like (for y_dim=3) pred = xcs.predict(X, cover=[0,0,0]), where cover would be optional and would define what to return for a sample whose match set is empty instead of invoking covering.

I think in practice, if there is no matching rule, then you would want to impute the value? Something like using the most common class or picking the nearest training sample, etc., which I guess could be done manually in Python afterwards if those samples could be identified.

@dpaetzel
Contributor Author

> checkpoint the population set

Ah, nice, I was not aware of that workaround. Thank you! I'll have to benchmark how much of a performance problem that is when compared to adding a default rule, though.

> make it return user-defined values instead of covering when calling predict() or score() which could be set as a parameter. Does that sound like the best way? I'm thinking of something like (for y_dim=3) pred = xcs.predict(X, cover=[0,0,0]) where cover would be optional and define what to return for a sample if the match set is empty instead of invoking covering.

That would definitely be an option.

> I think in practice if there is no matching rule then you would want to impute the value?

I'm wrapping XCS into an (almost) scikit-learn-compatible Python object anyway, and within that object I'm already computing a default prediction (right now I go with the data mean), which is used to create the default rule. Being able to provide the default prediction directly to xcs.predict would definitely be an improvement.
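
For context, the default prediction inside that wrapper is currently nothing more than the mean of the training targets, roughly along these lines (XCSRegressor is a hypothetical name for the wrapper, not actual code from it):

import numpy as np

class XCSRegressor:
    """(Almost) scikit-learn-compatible wrapper around xcsf (sketch)."""

    def fit(self, X, y):
        # fallback prediction, later used to build the default rule
        self.default_prediction_ = np.mean(y, axis=0)
        # ... fit the wrapped xcsf model here ...
        return self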

dpaetzel added a commit to dpaetzel/run-rsl-bench that referenced this issue May 9, 2023