Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Area/Bar stacking implementation #5235

Open
stas-sl opened this issue Mar 15, 2022 · 4 comments
Open

Area/Bar stacking implementation #5235

stas-sl opened this issue Mar 15, 2022 · 4 comments

Comments

@stas-sl
Copy link
Contributor

stas-sl commented Mar 15, 2022

This is related to holoviz/hvplot#551, but I think it is a more broad issue. For bar charts specifying columns order works as expected (except reversed legend), but for area chart this order is not respected.

df = pd.DataFrame({'a': [2, 3, 1], 'b': [1, 2, 3], 'c': [3, 2, 1]})
df.hvplot.bar(y=['b', 'c', 'a'], stacked=True)

image

df = pd.DataFrame({'a': [2, 3, 1], 'b': [1, 2, 3], 'c': [3, 2, 1]})
df.hvplot.area(y=['b', 'c', 'a'], stacked=True)

image

I believe this is due rather different implementations. For area it goes inside Area.stack and has df.groupby(level=levels) which causes sorting implicitly:

@classmethod
def stack(cls, areas, baseline_name='Baseline'):
"""
Stacks an (Nd)Overlay of Area or Curve Elements by offsetting
their baselines. To stack a HoloMap or DynamicMap use the map
method.
"""
if not len(areas):
return areas
is_overlay = isinstance(areas, Overlay)
if is_overlay:
areas = NdOverlay({i: el for i, el in enumerate(areas)})
df = areas.dframe(multi_index=True)
levels = list(range(areas.ndims))
vdim = areas.last.vdims[0]
vdims = [vdim, baseline_name]
baseline = None
stacked = areas.clone(shared_data=False)
for key, sdf in df.groupby(level=levels):
sdf = sdf.droplevel(levels).reindex(index=df.index.unique(-1), fill_value=0)
if baseline is None:
sdf[baseline_name] = 0
else:
sdf[vdim.name] = sdf[vdim.name] + baseline
sdf[baseline_name] = baseline
baseline = sdf[vdim.name]
stacked[key] = areas[key].clone(sdf, vdims=vdims)
return Overlay(stacked.values()) if is_overlay else stacked

Whereas for bar charts it is handled on plotting level in BarPlot.get_stack:

def get_stack(self, xvals, yvals, baselines, sign='positive'):
"""
Iterates over a x- and y-values in a stack layer
and appropriately offsets the layer on top of the
previous layer.
"""
bottoms, tops = [], []
for x, y in zip(xvals, yvals):
baseline = baselines[x][sign]
if sign == 'positive':
bottom = baseline
top = bottom+y
baseline = top
else:
top = baseline
bottom = top+y
baseline = bottom
baselines[x][sign] = baseline
bottoms.append(bottom)
tops.append(top)
return bottoms, tops

Also I'm not sure why not to just use ad hoc bokeh methods vbar_stack and varea_stack for this.

@jbednar
Copy link
Member

jbednar commented Mar 15, 2022

Would adding sort=False to the groupby call be enough to fix it?

@stas-sl
Copy link
Contributor Author

stas-sl commented Mar 15, 2022

Yes, it indeed solves the issue! I didn't know there is such option.

df = pd.DataFrame({'a': [2, 3, 1], 'b': [1, 2, 3], 'c': [3, 2, 1]})
df.hvplot.area(y=['b', 'c', 'a'], stacked=True)

image

Though on a higher level maybe it worth considering unifying stacking for both these charts? Ideally it seems reasonable to extract this functionality in a common helper method and make use of bokeh methods, or at least to have it on the same level, whereas currently bar chart is stacked on plotting/bokeh level, while area chart is stacked on a more abstract level before specific backend, which might cause inconsistencies when using other backends.

@jbednar
Copy link
Member

jbednar commented Mar 15, 2022

Happy to consider a PR either just making this tiny sort=False fix or making things more consistent, which seems like more work.

stas-sl added a commit to stas-sl/holoviews that referenced this issue Mar 15, 2022
This is a quick fix of holoviz#5235, until a more thorough refactoring (if ever) as discussed there.
@stas-sl
Copy link
Contributor Author

stas-sl commented Mar 15, 2022

I'll proceed with the quick fix option for now, as indeed, refactoring requires much deeper understanding of the library.

philippjfr pushed a commit that referenced this issue Mar 17, 2022
This is a quick fix of #5235, until a more thorough refactoring (if ever) as discussed there.
jlstevens pushed a commit that referenced this issue Apr 1, 2022
This is a quick fix of #5235, until a more thorough refactoring (if ever) as discussed there.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants