A transform to consolidate ordinal values outside the top n into an “other” category, perhaps in conjunction with the group transform. #144
Comments
I wonder if this should be something you specify as a scale transform rather than a mark transform? Seems handy…
(I don't have access to https://observablehq.com/d/0e0c0dcb66d6714e) Doing it on the scale would just be like specifying .unknown("Others")? Could be interesting for individual marks (like dot), but we need it as a data transform, I think, for aggregate operations (bars).
Sorry, that’s an internal dashboard. But I can try to make another example for you that uses the “other” transform.
My own use case for nominal "Others" is detailed in this “modalities” notebook.
I've made some progress on this idea; seems to work with facets https://observablehq.com/d/0bca2cad63c75fe1
I've tried a few things to achieve this, by passing the domain to the scale transform in plot.js#38, but my conclusion is that it's a dead end. The scale transform is invoked too late: it runs after the grouping, when the aggregation (count) is already done. So even if we map all the individual groups to the same place on the screen, they will not be aggregated. For counts, we could maybe recount (sum the sums in the aggregated channel, but which one is it?), but this would not work for other types of aggregation.
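To make the ordering problem concrete, here is a minimal sketch in plain JavaScript, independent of Plot's internals. The data, the `countBy` helper, and the value of `k` are all made up for illustration. It only shows that relabeling after counting leaves several separate "Others" rows, whereas relabeling before counting lets the reducer merge them.

```javascript
// Hypothetical data: one category label per observation.
const labels = ["a", "a", "a", "b", "b", "c", "d", "e"];
const k = 2; // number of categories to keep (assumed for this sketch)

// Count occurrences per category, preserving first-seen order.
function countBy(values) {
  const counts = new Map();
  for (const v of values) counts.set(v, (counts.get(v) ?? 0) + 1);
  return counts;
}

// The k most frequent categories.
const top = new Set(
  [...countBy(labels)]
    .sort((p, q) => q[1] - p[1])
    .slice(0, k)
    .map(([key]) => key)
);

// Too late: relabeling after counting keeps one row per original category,
// so "Others" appears several times and is never summed.
const relabeledAfter = [...countBy(labels)].map(([key, n]) => [
  top.has(key) ? key : "Others",
  n
]);

// In time: relabeling before counting lets the reducer merge the tail.
const lumpedBefore = [
  ...countBy(labels.map((v) => (top.has(v) ? v : "Others")))
];
```

Here `relabeledAfter` still has five rows, three of them labeled "Others", while `lumpedBefore` has three rows with a single merged "Others" count.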
This solution works on X, where others and k are options of the group reducer.

--- a/src/transforms/group.js
+++ b/src/transforms/group.js
@@ -67,7 +67,7 @@ function groupn(
// The z, fill, and stroke channels (if channels and not constants) are
// greedily materialized by the transform so that we can reference them for
// subdividing groups without having to compute them more than once.
- const {z, fill, stroke, ...options} = inputs;
+ const {z, fill, stroke, others, k = 10, ...options} = inputs;
const [BZ, setBZ] = maybeLazyChannel(z);
const [vfill] = maybeColor(fill);
const [vstroke] = maybeColor(stroke);
@@ -84,6 +84,15 @@ function groupn(
...Object.fromEntries(outputs.map(({name, output}) => [name, output])),
transform: maybeTransform(options, (data, facets) => {
const X = valueof(data, x);
+ if (others && X) {
+ const domain0 = sort(grouper(X, d => d), ([,{length}]) => -length);
+ if (domain0.length > k + 1) {
+ const domain = new Set(domain0.slice(0, k).map(d => d[0]));
+ for (let i = 0; i < X.length; i++) {
+ if (!domain.has(X[i])) X[i] = others;
+ }
+ }
+ }
const Y = valueof(data, y);

EDIT: I don't think we should pursue this direction, since the modalities function defined in this notebook returns both the channel and a domain that we can use in the scale definition. This is enough for the purpose and in line with #271 (comment).
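For readers without access to the notebook, the following is a rough sketch of what a helper returning both a recoded channel and a domain might look like. The name `topKWithOthers`, its signature, and its defaults are hypothetical and are not taken from the notebook's modalities function.

```javascript
// Hypothetical helper (not the notebook's code): lump everything outside the
// k most frequent values into `others`, and return both the recoded channel
// and a domain that could be used for an ordinal scale.
function topKWithOthers(values, k = 10, others = "Others") {
  const counts = new Map();
  for (const v of values) counts.set(v, (counts.get(v) ?? 0) + 1);
  // Distinct values ranked by descending frequency.
  const ranked = [...counts].sort((p, q) => q[1] - p[1]).map(([key]) => key);
  // As in the patch above, only lump when more than one value would be merged.
  if (ranked.length <= k + 1) return {channel: [...values], domain: ranked};
  const keep = new Set(ranked.slice(0, k));
  return {
    channel: values.map((v) => (keep.has(v) ? v : others)),
    domain: [...ranked.slice(0, k), others]
  };
}

const {channel, domain} = topKWithOthers(
  ["a", "a", "a", "b", "b", "c", "d", "e"],
  2
);
```

Because the recoding happens on the data before any grouping, the returned channel could presumably be fed to a group transform as is, with the returned domain supplied to the corresponding scale so that the "Others" category sorts last.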
We now have sort: {fx: {value: …, limit}} in #442; the only thing missing is "others".
Some more pairing on this, led by Fil: https://observablehq.com/d/f3aac7d647ef1c9e
A more advanced experiment here: https://observablehq.com/@observablehq/plot-stacking-others-144
e.g., https://next.observablehq.com/d/0e0c0dcb66d6714e