Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding a bin shorthand #1567

Closed
JarnoRFB opened this issue Jun 18, 2019 · 16 comments
Closed

Adding a bin shorthand #1567

JarnoRFB opened this issue Jun 18, 2019 · 16 comments

Comments

@JarnoRFB
Copy link
Contributor

Would it be possible to add a shorthand syntax for binning? The exact proposal would be to to allow to replace

alt.Chart(cars).mark_bar().encode(
    x=alt.X("Horsepower", bin=True), 
    y="count()"
)

by

alt.Chart(cars).mark_bar().encode(
    x="bin(Horsepower)", 
    y="count()"
)

After reading through #763 I understand that changes in this direction are not trivial, but I would suppose that just allowing this simple substitution would not break the conceptual model, as one would still have to use alt.X to customize the binning. However, for the default case this substitution should correspond to what is already being done for aggregations.

@jakevdp
Copy link
Collaborator

jakevdp commented Jun 18, 2019

It's an interesting idea, and I've thought about adding something like this in the past...

One thing I'm not excited about with this approach is that it makes the transition from default binning to configured binning a rather jarring one; you first do

x = 'bin(column)'

and then when you want to control the binnings, it has to be changed to

x=alt.X('column', bin=alt.Bin(maxbins=10))

Not a particularly seamless transition.

One alternative I've thought about is to use an approach similar to the vega-lite Javascript API and allows things like

alt.Chart().encode(
   alt.X('column').bin()
)

which could then be customized using

alt.X('column').bin(maxbins=10)

which would be much more natural. It would also remove the bin=alt.Bin boilerplate. We could do similar things with scale, axis, etc.

The problem with that idea is that currently alt.X.bin is a class attribute that is either Undefined, True, or an alt.Bin() object. Turning it into a callable method could break existing code. But it may be worth the breaking change to make encoding attributes more expressive.

@JarnoRFB
Copy link
Contributor Author

I think the JS API looks very fluent. It is also nice that it takes the builder pattern that is already present at the Chart level, to the more lower levels of the API. I believe adding this kind of API to altair would greatly enhance the user experience.

The problem with that idea is that currently alt.X.bin is a class attribute that is either Undefined, True, or an alt.Bin() object. Turning it into a callable method could break existing code.

What about sticking even closer to the JS API and offering lower-case alternatives for the channel builders? So one could write

alt.Chart().encode(
   alt.x('column').bin(...).scale(...)
)

The, let's call them channel functions, and their methods would then always return a channel builder object.
Then just before the chart is rendered one would have to traverse the chart object and call build() on every builder, so the user does not really have to care about it.
This way one could put the new fluent channel API next to the existing one, without having to worry to much about breaking existing code.

@jakevdp
Copy link
Collaborator

jakevdp commented Jun 28, 2019

I've been playing with this a bit... we might be able to fit this kind of new approach on top of the old one. Something like this (we'd need to adjust method signatures & docstrings, but that won't be too hard):

def propertysetter(name, cls):
  def setter(self, *args, **kwargs):
    if len(args) == 1 and len(kwargs) == 0:
      setattr(self, name, args[0])
    else:
      setattr(self, name, cls(*args, **kwargs))
    return self
  return setter

class X(alt.X):
  bin = propertysetter('bin', alt.Bin)
  axis = propertysetter('axis', alt.Axis)
  scale = propertysetter('scale', alt.Scale)

Then you could define a chart like this:

chart = alt.Chart('data.csv').mark_point().encode(
  x = X('name:Q').bin(maxbins=100).axis(title=None).scale(None)
)

chart.to_dict()
# {'$schema': 'https://vega.github.io/schema/vega-lite/v3.3.0.json',
#  'config': {'mark': {'tooltip': None}, 'view': {'height': 300, 'width': 400}},
#  'data': {'url': 'data.csv'},
#  'encoding': {'x': {'axis': {'title': None},
#    'bin': {'maxbins': 100},
#    'field': 'name',
#    'scale': None,
#    'type': 'quantitative'}},
#  'mark': 'point'}

The main backward incompatibility here is that you could no longer access attribues by their name; i.e. chart.encoding.x.bin would reference the setter function, not the bin value.

@domoritz
Copy link
Member

The main backward incompatibility here is that you could no longer access attribues by their name; i.e. chart.encoding.x.bin would reference the setter function, not the bin value.

How often do people use this? In JavaScript, we often solve this problem by using a function that is an accessor when you don't pass any arguments and a setter when you pass arguments. I don't know how pythonic this would be, though, and it also doesn't solve the backwards compatibility issue.

@jakevdp
Copy link
Collaborator

jakevdp commented Jun 28, 2019

How often do people use this?

Not much, if at all. We don't do any direct property access in Altair's docs or examples, so I think it's a pretty minor incompatibility all told. And my take is it's well worth the benefit of being able to do this:

chart.encode(
  x = alt.X('name:Q').bin(maxbins=100).axis(title=None).scale(None)
)

instead of this:

chart.encode(
  x = alt.X('name:Q',
    bin=alt.Bin(maxbins=100),
    axis=alt.Axis(title=None),
    scale=None
  )
)

@domoritz
Copy link
Member

Just to make sure, autocomplete (in e.g. VSCode or JupyterLab) would work as expected here?

@jakevdp
Copy link
Collaborator

jakevdp commented Jun 28, 2019

Autocomplete would work for systems based on Jedi or similar (i.e. ones that can infer the return type of alt.X() in order to populate autocomplete for its methods.

@kanitw
Copy link
Member

kanitw commented Jun 28, 2019

The main backward incompatibility here is that you could no longer access attribues by their name; i.e. chart.encoding.x.bin would reference the setter function, not the bin value.

We can at least make chart.encoding.x.bin() return current bin value when there is no parameter specified? (This is what VL-API currently does.)

@jakevdp
Copy link
Collaborator

jakevdp commented Jun 28, 2019

We can at least make chart.encoding.x.bin() return current bin value when there is no parameter specified?

Yes, that should work. Although that might mess with the type inference engines behind autocomplete in the case of chained statements, because it means there's more logic required to infer the correct return type.

@kanitw
Copy link
Member

kanitw commented Jun 30, 2019

Although that might mess with the type inference engines behind autocomplete in the case of chained statements, because it means there's more logic required to infer the correct return type.

I see. I think good autocompletion is probably more important. I guess people can always do .to_dict()["bin"] to get the values or we can introduce .get_bin() or something.

@kanitw
Copy link
Member

kanitw commented Jul 1, 2019

@jakevdp

Would this mean that sort, which has two class types (SortByEncoding and EncodingSortField), would support either .sort(encoding="x", order="ascending") or .sort(op=..., field=..., order=...) ?

@jakevdp
Copy link
Collaborator

jakevdp commented Jul 1, 2019

Would this mean that sort, which has two class types

Hmm... that's an interesting question. From the user's perspective, I think it would make the most sense to have a single method, with the return type chosen based on the content of the arguments.

Do you know offhand how the JS API handles this?

@kanitw
Copy link
Member

kanitw commented Jul 1, 2019

Since JS doesn't support keyword arguments and but support dict args, the JS API simply takes either .sort({encoding: 'x', order: 'ascending'}) or .sort({op: ..., field: ..., order: ...}).

Note that we're considering if sort should support channel name (vega/vega-lite#5134), which is a bit relevant to this question about sort.

@jakevdp
Copy link
Collaborator

jakevdp commented Jul 1, 2019

Ah, I see.

Huge +1 on supporting channel name as well; we can make it all work from the Altair side.

@jakevdp
Copy link
Collaborator

jakevdp commented Nov 23, 2019

Prototype & more discussion in #1629

@binste
Copy link
Contributor

binste commented May 26, 2023

The method-based syntax was added in Altair 5 🥳 See https://altair-viz.github.io/user_guide/encodings/index.html#method-based-syntax. Closing this issue

@binste binste closed this as completed May 26, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants