Limited Results, Low Coverage #30
Hey - good to hear that you are working with our implementation! It hasn't seen much maintenance lately, and we are discussing how to proceed from here.
You've hit a rule that has very high precision with low coverage. This is both a feature and a shortcoming of Anchors, and it gives you valuable insight.
This is usually the case, when the explained instances
I'd guess this is the case for you. Your model focuses on exactly one feature manifestation of your instance, and every time it is changed in its perturbation space, the model takes another feature into account for its decision. Your model is both very confident and unable to generalize this prediction (possibly interpretable as overfitting). Hope that helps. Also, very excited that you're using this project and interested in what for :)
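To make the trade-off described above concrete, here is a minimal, self-contained sketch of how precision and coverage of a single anchor rule could be measured over a set of instances. This is illustrative only: the names and signatures are made up for this example and are not the anchorj API, and the "model" and "rule" are toy predicates.

```java
import java.util.List;
import java.util.function.Predicate;

public class AnchorMetrics {
    // Coverage: fraction of all instances to which the rule applies.
    static double coverage(List<double[]> data, Predicate<double[]> rule) {
        return data.stream().filter(rule).count() / (double) data.size();
    }

    // Precision: among instances covered by the rule, the fraction for which
    // the model's prediction matches the prediction being explained.
    static double precision(List<double[]> data, Predicate<double[]> rule,
                            Predicate<double[]> model, boolean explainedLabel) {
        long covered = data.stream().filter(rule).count();
        if (covered == 0) return 0.0;
        long agree = data.stream().filter(rule)
                .filter(x -> model.test(x) == explainedLabel).count();
        return agree / (double) covered;
    }

    public static void main(String[] args) {
        // Toy data: one feature per instance; toy model predicts true iff x > 5.
        List<double[]> data = List.of(
                new double[]{1}, new double[]{4}, new double[]{6},
                new double[]{7}, new double[]{9});
        Predicate<double[]> rule = x -> x[0] > 6.5;  // a very narrow anchor
        Predicate<double[]> model = x -> x[0] > 5;

        System.out.println("coverage=" + coverage(data, rule));              // 0.4
        System.out.println("precision=" + precision(data, rule, model, true)); // 1.0
    }
}
```

A very specific rule like `x > 6.5` can reach precision 1.0 while covering only a sliver of the data, which is exactly the "high precision, low coverage" situation the comment describes.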
@TobiasGoerke Thanks, this is all helpful information.

@fkoehne It's a business implementation, so I can't share exact details. The idea is to have a batch process that rebuilds the model using h2o and their MOJO/POJO conversion. I load the model.zip file in a Spring Boot app in Java and pull in live data from a streaming process. As a result, my training data and the data to predict are separate. In the titanic example, the explained instance comes from the training data that was put into an AnchorTabular object, so I have to build a TabularInstance object myself to calculate an explainable outcome. It would be nice if the Adapters extension had a build() method that constructed a TabularInstance and created the discretized versions of my data, so I don't have to do that manually.

I ultimately want to use the explanations in a UI. For live predictions, I want to construct a natural-language explanation of the outcome the model suggests, using a combination of Shapley values and Anchors. For example: "The model suggests approving this request because [DATA] is greater than 5". It doesn't have to look exactly like that, but the idea is that there is an easy way for a user to interpret a decision the model made. Where I can see future work here is more integration with h2o and easier data conversion. Feel free to ask more questions. I am really excited about the work you are both doing!
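The natural-language idea above could be prototyped with a simple template over rule conditions. A sketch along these lines is shown below; the `Condition` record and all names are hypothetical and not part of anchorj or any Shapley library:

```java
import java.util.List;

public class ExplanationText {
    // A hypothetical rule condition: "<feature> <op> <threshold>".
    record Condition(String feature, String op, double threshold) {}

    // Render a decision plus its anchor conditions as one English sentence.
    static String explain(String decision, List<Condition> conditions) {
        StringBuilder sb = new StringBuilder(
                "The model suggests " + decision + " this request because ");
        for (int i = 0; i < conditions.size(); i++) {
            Condition c = conditions.get(i);
            if (i > 0) sb.append(" and ");
            sb.append(c.feature()).append(" is ")
              .append(c.op().equals(">") ? "greater than " : "less than ")
              .append(c.threshold());
        }
        return sb.append('.').toString();
    }

    public static void main(String[] args) {
        String text = explain("approving",
                List.of(new Condition("creditScore", ">", 5.0)));
        System.out.println(text);
        // The model suggests approving this request because creditScore is greater than 5.0.
    }
}
```

In a real UI the conditions would come from the anchor result's predicates rather than being hard-coded, but the templating step itself stays this simple.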
Not sure if I understand your request correctly. There's a default builder already that you can use. You'd just need to create a
That sounds great! Combining various mechanisms and visualizations is always a good idea. By the way: I've written a chapter about Anchors in Interpretable Machine Learning. There you'll find a visualization technique for Anchors that we developed. It's currently only available for R, but it may inspire you on how to preprocess the results for your users. We're working on similar things, focusing on MLOps and machine learning lifecycles, improving their maturity (XAI being an important component here) for production use cases in general, using the cloud and tools like Kubeflow.
While we're aware that some users are actively using this project, we haven't received much valuable feedback yet, so thank you; yours is very much appreciated. In case you decide to move forward with this Anchors implementation and bring it to production, we'd be very interested to hear what your journey was like and how the tool helps you in real-life situations. Also, being a consultancy, we'd be happy to help you beyond contributing to open source. Feel free to send me a message in case you're interested and would like to talk about possible options.
I am using this with the Anchors Adapter, and my implementation is similar to the titanic example located at https://github.com/viadee/xai_examples/tree/master. My results come back with very low coverage. That is the first problem. What does this suggest about my data? Or is there a problem with how I encoded it?
A few details: this is running in a streaming application. Data comes in line by line, and I run data preprocessing on it. In order to get an explanation, I am forced to convert it to a TabularInstance along with a discretized version of the data. Here is a method I created.
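(The method itself was not captured in this thread. As a rough, self-contained sketch of what a single-row discretization step like this might look like, using entirely hypothetical names and cut points rather than the anchorj API:)

```java
import java.util.Arrays;

public class RowDiscretizer {
    // Map each numeric value to a bin index given ascending cut points per column.
    // E.g. cuts {10, 20}: value < 10 -> bin 0, 10 <= value < 20 -> bin 1, else bin 2.
    static int[] discretize(double[] row, double[][] cutsPerColumn) {
        int[] bins = new int[row.length];
        for (int col = 0; col < row.length; col++) {
            int bin = 0;
            for (double cut : cutsPerColumn[col]) {
                if (row[col] >= cut) bin++;
            }
            bins[col] = bin;
        }
        return bins;
    }

    public static void main(String[] args) {
        double[] row = {7.0, 25.0};
        double[][] cuts = {{10.0, 20.0}, {10.0, 20.0}};
        System.out.println(Arrays.toString(discretize(row, cuts))); // [0, 2]
    }
}
```

The bin indices would then populate the discretized counterpart of the instance; in a real integration the cut points would come from whatever discretizer was fitted on the training data.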
First off, it would be great if Adapters supported an easy way to convert one line of data using .build(). Second,
tabular.getVisualizer().visualizeResult(anchor));
is giving me an issue because IntegerColumn does not set the discretizer. I wish I could give an easy method for replicating this issue.