Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat!: supporting channel name string for sorting by another encoding (e.g., sort: "x", sort: "-x") #5134

Merged
merged 4 commits into from
Jul 6, 2019

Conversation

kanitw
Copy link
Member

@kanitw kanitw commented Jun 29, 2019

Fix #4933

@kanitw kanitw requested a review from domoritz June 29, 2019 19:28
@kanitw kanitw changed the base branch from master to 4.0 June 29, 2019 19:28
@domoritz
Copy link
Member

domoritz commented Jul 1, 2019

While I can see that this is nice, couldn't there be some confusion about whether x is an encoding or a field? I'd like to get some thoughts on this from @arvind and @jheer.

@kanitw
Copy link
Member Author

kanitw commented Jul 1, 2019

Interesting point.

It's also worth noting that:

  1. Given that we already use sort string for ascending/descending, we can't use string to denote a field name as there can be a name collision. Channel names (with optional minuses) can't lead to collision though. Thus, if we want to extend sort string, it only makes sense that we use string for channel, which is the more common use case anyway. The question is whether we want to introduce this at all.

  2. As I consider the design of Voyager and our examples gallery, I realize that sort by encoding is by far the most common use case for sort.

  3. From the experience running Altair workshops, it's just painful that such a common action would require an awkward sort=alt.SortByEncoding(encoding="x") Given it's common use, we better make it really easy to use.

That said, as I just saw yesterday that Altair may introduce a chained syntax like .sort(encoding="x"), which makes this shorter syntax less important. (Let's see how the discussion go in Altair.)

  1. I don't think this is too confusing though.

Basically, users will only see examples that show sort strings as channels. They can easily guess that x/y are channels as we never use x/y as the field names in the examples. The document will also states that sort strings are either ascending/descending or a channel names with an optional minus prefix.

In a rare case that the user's plot has a field named "y" mapped to x-encoding or vice versa, it will be confusing anyway no matter we introduce this syntax or not.

@arvind
Copy link
Member

arvind commented Jul 1, 2019

I don't have a strong opinion on this. Besides the point @domoritz made of ambiguity between field/encoding, which does seem problematic, I'm also curious whether this introduces multiple ways of specifying sorting or are we getting rid of the longform? If multiple, then is this sufficiently important to warrant the added complexity (code paths/documentation/learnability/discoverability/etc.)?

Another way of looking at it is perhaps the new Vega-Lite JS API frees us from optimizing the concision of the JSON for hand-writing use cases (and, instead, the JSON can focus on concision with respect to the grammar and issues of abstraction). The fact that Altair is considering adding a shorthand makes me believe that there is less of a need for the JSON to handle it directly.

@jakevdp
Copy link
Contributor

jakevdp commented Jul 1, 2019

Big +1 on this; it would address Altair feature requests as well: vega/altair#884

@kanitw
Copy link
Member Author

kanitw commented Jul 1, 2019

I'm also curious whether this introduces multiple ways of specifying sorting or are we getting rid of the long form?

This is definitely another way of doing things and is the biggest con of this PR IMHO (moreso than the potentil confusion, which I think is not so strong).

If we agree that the shorter sort-by-encoding syntax is more desireable, I think we should deprecate the old SortByEncoding ({encoding: ..., order: ...}) to avoid redundancy.

I haven't deprecated the old syntax in this PR yet, simply for backward compatibility. But I agree we could remove it from docs / schema but maintain the functionality as a macro.

Another way of looking at it is perhaps the new Vega-Lite JS API frees us from optimizing the concision of the JSON for hand-writing use cases (and, instead, the JSON can focus on concision with respect to the grammar and issues of abstraction). The fact that Altair is considering adding a shorthand makes me believe that there is less of a need for the JSON to handle it directly.

I thought about this too. Having the JS APIs provide concise can be useful in some cases, but doesn't come with a cost.

For things that we want the languages to be consistently concise across different wrappers (JS, Altair, and others), it is better to provide some of the concision mechanism in the core JSON syntax, so the syntax will be consistent across wrappers.

@domoritz
Copy link
Member

domoritz commented Jul 1, 2019

So if we deprecate the object for sorting by channel, we will have three ways to sort:

For natural order:

sort: 'descending'

For channels:

sort: '-x'

For fields:

sort: {field: 'foo', order: 'descending', op: 'mean'}

I think we already introduced a shorthand with descending since we could have also done

For natural order:

sort: {order: 'descending'}

For channels:

sort: {encoding: 'x', order: 'descending'}

For fields:

sort: {field: 'foo', order: 'descending', op: 'mean'}

I'm a bit conflicted because the latter is definitely easier to reason about for a machine but the former is easier for people. Right now we are in between two options. I would actually be okay with supporting both and normalize the former to the latter. I am also okay with deprecating the latter. I am also okay with only writing about the former in the docs if we want to keep the latter.

@jakevdp
Copy link
Contributor

jakevdp commented Jul 1, 2019

Regarding the shorthands: what happens if you have a field named "-x" or named "descending"?

@kanitw
Copy link
Member Author

kanitw commented Jul 1, 2019

Regarding the shorthands: what happens if you have a field named "-x" or named "descending"?

As mentioned above, we will use strings only for either (1) ascending/descending or (2) encoding channels (with an optional minus prefix) -- but not field names for this exact reason.

So if we deprecate the object for sorting by channel, we will have three ways to sort:

I think having 3 ways to sort (for 3 distinct use cases) it not a problem.

It's only a problem when we provide two ways for one of the use case.

If we deprecate sort: {encoding: 'x', order: 'descending'}, we still then have only one way for each main use case:

  • sort by the encoded field's values: sort: 'ascending' | 'descending'

  • sort by a field in another channel: sort: 'x' | '-x' | ...

  • sort by a custom field sort: {field: ..., order: ..., op: ...}

So the only change would be that {encoding: ..., order: ...} becomes "x" | "-x" | ....

Note: FWIW, we currently don't support sort: {order: 'descending'} since "sort" and "order" are redundant in this context.

@kanitw kanitw changed the title feat: add a short-form for sort by encoding (e.g., sort: "x", sort: "-x") feat!: use string instead of object for sort by encoding (e.g., sort: "x", sort: "-x") Jul 1, 2019
@kanitw kanitw added the RFC / Discussion 💬 For discussing proposed changes label Jul 5, 2019
@domoritz
Copy link
Member

domoritz commented Jul 5, 2019

Overall, I see that this change is useful for users but sacrifices some easy programmatic handling. I would prefer to see this built into Altair and the VL JS API instead. However, adding it in Vega-Lite directly will increase adoption of the idea.

@kanitw kanitw force-pushed the kw/sort-short branch 3 times, most recently from 3e51fc6 to 49f8d83 Compare July 6, 2019 04:20
@kanitw kanitw removed the RFC / Discussion 💬 For discussing proposed changes label Jul 6, 2019
@kanitw kanitw changed the title feat!: use string instead of object for sort by encoding (e.g., sort: "x", sort: "-x") feat!: supporting channel name string for sorting by another encoding (e.g., sort: "x", sort: "-x") Jul 6, 2019
@domoritz
Copy link
Member

domoritz commented Jul 6, 2019

So @kanitw changed to support both the shorthand and the expanded version. Internally, we expand in the normalizer. I generally don't like having two ways to do the same thing but in this case I think an exception is warranted as it makes things much easier for users and ensures consistency among programming APIs like Altair, Elm Vega, and the VL API.

@domoritz domoritz merged commit 1bd7037 into 4.0 Jul 6, 2019
@github-actions github-actions bot deleted the kw/sort-short branch July 6, 2019 10:49
kanitw added a commit that referenced this pull request Jul 20, 2019
domoritz pushed a commit that referenced this pull request Jul 25, 2019
domoritz pushed a commit that referenced this pull request Jul 25, 2019
willium pushed a commit that referenced this pull request Jul 25, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants