Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add operation for group-wise normalisation #6124

Merged
merged 12 commits into from
Apr 23, 2024
Merged

Add operation for group-wise normalisation #6124

merged 12 commits into from
Apr 23, 2024

Conversation

maximlt
Copy link
Member

@maximlt maximlt commented Feb 20, 2024

resolves #6123

Very much WIP and I've never written any operation! But it works:

import numpy as np
import panel as pn
import holoviews as hv

from holoviews.operation.normalization import normalize_group

hv.extension('bokeh')

x = np.linspace(0, 10*np.pi)
curves = []
j = 0.5
for group in ['A', 'B', 'C']:
    for i in range(2):
        yvals = j * np.sin(x)
        curve = hv.Curve((x + i*np.pi/2, yvals), label=f'{group}{i}', group=group).opts(
            subcoordinate_y=True,
        )
        curves.append(curve)
        j += 1

os = hv.Overlay(curves).opts(show_legend=False, width=400, height=400)
normalize_group(os)

image

@droumis let's chat more about the CZI use case we're trying to solve, with a much much larger dataset that comes with more constraints.

@codecov-commenter
Copy link

codecov-commenter commented Feb 20, 2024

Codecov Report

Attention: Patch coverage is 89.28571% with 6 lines in your changes are missing coverage. Please review.

Project coverage is 88.12%. Comparing base (c3c211e) to head (3e769fa).
Report is 34 commits behind head on main.

Files Patch % Lines
holoviews/operation/normalization.py 80.64% 6 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #6124      +/-   ##
==========================================
- Coverage   88.70%   88.12%   -0.58%     
==========================================
  Files         314      319       +5     
  Lines       65993    67212    +1219     
==========================================
+ Hits        58537    59232     +695     
- Misses       7456     7980     +524     
Flag Coverage Δ
ui-tests ?

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@droumis
Copy link
Member

droumis commented Feb 20, 2024

We should probably adjust the range limit without changing the data itself. Unless I'm not understanding something about the subcoordinate_y mechanism, it involves setting ylim on each subcoordinate axis group to fit the largest of timeseries in that group.

@maximlt
Copy link
Member Author

maximlt commented Mar 12, 2024

Take 2 on this. This time the operation doesn't update the data but just records the min-max per group in the range tuple parameter of the y dimension. The aim is to rely on a concept HoloViews already has that defining this range is respected when an element is plotted.

image

Setting the range tuple on the elements part of a subcoordinate_y overlay didn't work out of the box. I made a simple (hacky for the moment) change to element.py so that in this context the range value takes precedence over the other approaches.

The code is the same as in the original post, with the same output, except this time the data is unchanged.

image

dims = [dim]
v0, v1 = dim.range
axis_label = str(dim)
specs = ((dim.name, dim.label, dim.unit),)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically ylim should be supported as well...I'll open an issue about this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Heya, have you opened an issue about it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An issue about this bit of code once this PR is merged ;-p

@jlstevens
Copy link
Contributor

Looks good!

Made a few comments and this approach is definitely much more sensible than rescaling the data itself. Once these comments are addressed and a couple of unit tests are added, I would be happy to see this merged.

@droumis droumis requested a review from philippjfr April 5, 2024 16:17
@philippjfr philippjfr self-assigned this Apr 9, 2024
@droumis droumis assigned maximlt and unassigned philippjfr Apr 9, 2024
@maximlt maximlt added the type: enhancement Minor feature or improvement to an existing feature label Apr 15, 2024
@philippjfr
Copy link
Member

Just to clarify, this operation is meant to be invoked explicitly by a user or could we use compositors to automatically apply this if subcoordinate_y is set?

@maximlt
Copy link
Member Author

maximlt commented Apr 17, 2024

Just to clarify, this operation is meant to be invoked explicitly by a user or could we use compositors to automatically apply this if subcoordinate_y is set?

I didn't think about compositors so yes it is meant to be invoked explicitly. If it was applied through a compositor, would it be possible for users to disable it if they don't want this behavior?

(I'll fix the conflicts)

@philippjfr
Copy link
Member

If it was applied through a compositor, would it be possible for users to disable it if they don't want this behavior?

Sure, if we exposed a subcoordinate_group_normalize option (or ideally something nicer) it could be made contingent on that being enabled.

@maximlt
Copy link
Member Author

maximlt commented Apr 17, 2024

If it was applied through a compositor, would it be possible for users to disable it if they don't want this behavior?

Sure, if we exposed a subcoordinate_group_normalize option (or ideally something nicer) it could be made contingent on that being enabled.

So you would prefer this was implemented and applied by default via a registered compositor? If so, I may need some pointers as I just tried that but didn't manage to get the compositor operation registered on an "Overlay" to run.

@philippjfr
Copy link
Member

I guess I'd like to hear what others think. Should this behavior be automatic or toggleable with a plot option?

@jlstevens
Copy link
Contributor

jlstevens commented Apr 17, 2024

My own feeling is that this operation addresses a very niche use-case. Having the operation be explicit helps make it a bit clearer how things work, especially as it is an unusual way to use the group/label system (which we have generally been moving away from over the years).

My own feeling is that we should document this operation and simply tell users how to register a suitable compositor if they really want it (i.e. if they will be using this feature all the time) and not make it a default compositor that is enabled by default.

So in short, my own feelings are:

  • Should this operation be usable in a compositor? Yes
  • Should we have that compositor enabled by default? No (we should just document it for people who need it)

As for a plot option, if we want to add one then definitely it should be disabled by default (imho).

@maximlt
Copy link
Member Author

maximlt commented Apr 18, 2024

tell users how to register a suitable compositor if they really want it

As far as I can see compositors aren't documented outside of the reference API. I'm not really planning to document compositors part of this PR 🙃

Should this operation be usable in a compositor? Yes

I'm not sure how to convert the operation to be usable in a compositor. As far as I can see, compositors apply to an operation to each element of an Overlay. In this case, we need to compute the min-max over multiple elements (those with the same group), I don't know if this is compatible with the compositor approach.

@philippjfr
Copy link
Member

I'm not sure how to convert the operation to be usable in a compositor.

It already is, my request to make it a no-op was the main thing that's needed.

As far as I can see, compositors apply to an operation to each element of an Overlay.

Compositors operate at the level of an Overlay and can combine one or more layers.

@philippjfr
Copy link
Member

I'm fine with merging this without the compositor for now.

@philippjfr
Copy link
Member

I am a little unclear about one thing though, @jlstevens makes it sound this operation is extremely niche but I thought it is the desired behavior for grouped layers when subcoordinate_y is enabled. Is that not the case? Certainly the issue @droumis created explicitly says:

If groups are specified, normalize the y-range of curve-like elements per group, by default.

@maximlt
Copy link
Member Author

maximlt commented Apr 18, 2024

Compositors operate at the level of an Overlay and can combine one or more layers.

Ok I must have messed up then when trying to register is with a compositor since _process called with the elements contained within the overlay.

I'm fine with merging this without the compositor for now.

Nice! Are you done reviewing it?

@droumis I would like to update the subcoordinate-y docs when we're done with the series of changes we're making this month.

@philippjfr
Copy link
Member

I think for the compositor to work we'd have to add support for patterns that match on any number of elements, currently you can only do something like this:

Compositor.register(Compositor("Curve * Curve", subcoordinate_group_ranges, None,
                               'data', transfer_options=True,
                               transfer_parameters=True,
                               output_type=hv.Overlay,
                               backends=['bokeh', 'matplotlib', 'plotly']))

Additionally we'd have to wrap the operation so it checks the layers have subcoordinate_y set.

@philippjfr
Copy link
Member

Nice! Are you done reviewing it?

I'd like an answer to my question above first, as it currently stands this does not address the ask outlined in the issue.

@jlstevens
Copy link
Contributor

jlstevens commented Apr 18, 2024

I suppose I consider all use of subcoordinate_y fairly niche.

Essentially, I am always wary of introducing additional complexity and I do think that this normalization behavior based on group isn't entirely intuitive (good docs will help!). If we do as Philipp suggests and only do anything when subcoordinate_y is enabled, I am happy to see this compositor on by default (if we think this is the right default behavior for subcoordinate_y).

@philippjfr
Copy link
Member

Got you, yes the proposal was never to enable this for all overlays, only for subcoordinate plots. I had vaguely recalled you could make a compositor dependent on a plot option already but I guess that was me misremembering as I can't find any mention of that now.

@maximlt
Copy link
Member Author

maximlt commented Apr 18, 2024

I'd like an answer to my question above first, as it currently stands this does not address the ask outlined in the issue.

Good catch, I'll see with Demetris.

Meanwhile:

we'd have to add support for patterns that match on any number of elements

Yes true! I've also tried "Chart * Chart" but that doesn't work.


I'll make the operation a no-op is the overlay is not a subcoord-y overlay and will chat with Demetris.

@maximlt
Copy link
Member Author

maximlt commented Apr 22, 2024

I'll make the operation a no-op is the overlay is not a subcoord-y overlay

I'm actually not sure how to find whether an overlay is a subcoord-y overlay in the operation. subcoordinate_y is a Parameter of (Bokeh) ElementPlot and propagated to OverlayPlot, except that it seems to propagation happens after the operation runs, overlay.opts.get('plot').kwargs.get('subcoordinate_y', False) is always False as far as I can see.

I've added some warnings inspired by the checks added in #6122.

@droumis can you remind me of the outcome of our discussion on Thursday about applying the normalization per group by default or not? I think we said we're fine with this not being the default behavior but I'd like you to confirm this (my brain 🧀 ). If not, Compositor will need to be extended to:

  • Add support for patterns that match any number of elements. Currently, you can only specify e.g. Curve * Curve * Curve which means it'll apply to the first 3 Curves of an overlay.
  • The operation is meant to work for all Chart element types but as far as I can see patterns are only applied to specific types, i.e. Chart * Chart * Chart doesn't work.

@droumis
Copy link
Member

droumis commented Apr 22, 2024

Yea, we agreed that because of the complexity involved, it would be sufficient (for now) to require group-wise normalization to be explicitly enabled, and so not applied by default.

@maximlt maximlt requested a review from philippjfr April 22, 2024 17:46
@maximlt
Copy link
Member Author

maximlt commented Apr 22, 2024

Yea, we agreed that because of the complexity involved, it would be sufficient (for now) to require group-wise normalization to be explicitly enabled, and so not applied by default.

Thanks! Then I'll just ask @philippjfr to have a look again at this PR 🙏

Copy link
Contributor

@jlstevens jlstevens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me!

Operations typically don't use the plot options but here they are simply checked to generate a warning. That seems entirely reasonable to me.

@maximlt maximlt merged commit eb7ae9d into main Apr 23, 2024
12 of 13 checks passed
@maximlt maximlt deleted the normalize_group branch April 23, 2024 12:05
Copy link

This pull request has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 23, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
type: enhancement Minor feature or improvement to an existing feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Subcoordinate-y normalization per group
5 participants