Item visibility (private/public) #990
-
Two recent events made me think about the status of items, and whether they should be private or public. The first was reading the documentation (for instance https://skore.probabl.ai/0.5/generated/skore.item.cross_validation_item.CrossValidationItem.html). The second was a comment stating that items should be public (#966 (comment)). In my opinion, items should be private, for the following reasons:
Making items private wouldn't change anything in the user workflow, except that we would need to change
-
Furthermore,
-
Let's list the requirements.
For inspiration, we can look at W&B's public API for images: https://docs.wandb.ai/guides/track/log/media/ They have an
Note that an Item does not "represent a transformation" (ping @thomass-dev). It represents an artifact of your ML development process. We designed
@MarieS-WiMLDS I believe that you are exploring opportunities to make the user experience less confusing. The source of confusion seems to be the potential asymmetry between
Now that we have a clearer idea of what we would like to store, we could try to see what kind of data needs to be wrapped:
All in all, I am in favor of this effort, but I would like to see examples of code.
-
Here is my take on this. 🙊 From a user's perspective, I don't care how items are persisted, and I don't care about the underlying classes. I would like users to write code like:
from sklearn import datasets
from sklearn.linear_model import Lasso
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
import skore
p = skore.open()  # each call to open could create a session that helps users track their progress
# data
diabetes = datasets.load_diabetes()
X = diabetes.data[:150]
y = diabetes.target[:150]
p.track(X, name="X") # tracks a dataframe numpy/polars/...
p.track(y)  # name is optional; it may fall back to the variable name using frame inspection?
# scikit-learn models
pipeline = make_pipeline(StandardScaler(), Lasso())
p.cross_validate(pipeline, X, y)
p.track(pipeline)
# plots and media
pillow_img = ...
p.track(pillow_img) # auto media type to png thx to pillow
plot_object = ...
p.track(plot_object) # as vector if possible, as vega spec if possible fallback to png bytes
# primitive type
markdown = ...
p.track(markdown, name="summary", media_type="text/markdown")
# later
my_old_pipeline = p.find_model(name="pipeline", session="azertyui")  # the session id is visible in the UI, hence one can get back its fitted model
Imo, getting stuff back from skore is not relevant. If you get stuff back from skore, you only get the extra stuff skore gave you (cross-validation plots, table insights, ...), which is already visible in the UI. Fitted models are probably a special case, as it sounds like a good idea to store them securely using skops.
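To make the "auto media type" idea above more concrete, here is a minimal sketch of how a single `track` entry point could infer a media type from the object's type while still letting the caller override it. Everything in it is an assumption made for illustration: `HypotheticalProject`, `infer_media_type`, and the chosen media types are invented names and are not part of skore's API. It only uses the standard library.

```python
from functools import singledispatch


@singledispatch
def infer_media_type(value):
    """Fallback for objects that have no more specific handler."""
    return "application/octet-stream"


@infer_media_type.register
def _(value: str):
    # Plain strings default to text; markdown has to be requested explicitly.
    return "text/plain"


@infer_media_type.register(dict)
@infer_media_type.register(list)
def _(value):
    # Assumption: containers of primitives are stored as JSON.
    return "application/json"


class HypotheticalProject:
    """Illustrative stand-in for a project object; not skore's real class."""

    def __init__(self):
        self._items = {}

    def track(self, value, *, name, media_type=None):
        # An explicit media_type wins; otherwise it is inferred from the type.
        self._items[name] = (value, media_type or infer_media_type(value))

    def media_type_of(self, name):
        return self._items[name][1]


p = HypotheticalProject()
p.track("# Summary", name="summary", media_type="text/markdown")
p.track({"alpha": 0.1}, name="params")
```

With this shape, the item classes stay an internal detail: supporting a new kind of object means registering one more handler, and the user-facing call does not change.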
-
I have even been convinced from the start that "how to display an item" should not be defined programmatically, but in the UI by the user. For objects whose type can't characterize their nature, such
-
It was decided to make items private in order to have an API that is easier to understand and use. The related issue is #1045.