Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

histogram op: raise error on non numerical data #5481

Closed
wants to merge 2 commits into from

Conversation

maximlt
Copy link
Member

@maximlt maximlt commented Oct 14, 2022

Fixes holoviz/hvplot#906 in hvPlot

The main goal of this PR was to validate the data to be processed by the histogram operation, that data must be numerical/datetime-like to make sense. Guarding is tricky in HoloViews with all the data types supported, I hope this is alright (the tests passed locally). While I was at it I've also done the same with for the bins parameter.

@codecov-commenter
Copy link

codecov-commenter commented Oct 14, 2022

Codecov Report

Merging #5481 (b7793ad) into main (fba7697) will decrease coverage by 0.00%.
The diff coverage is 55.55%.

@@            Coverage Diff             @@
##             main    #5481      +/-   ##
==========================================
- Coverage   88.22%   88.21%   -0.01%     
==========================================
  Files         302      302              
  Lines       62179    62188       +9     
==========================================
+ Hits        54857    54862       +5     
- Misses       7322     7326       +4     
Impacted Files Coverage Δ
holoviews/operation/element.py 71.42% <55.55%> (-0.27%) ⬇️

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@@ -700,6 +700,14 @@ class histogram(Operation):
Used for setting a common style for histograms in a HoloMap or AdjointLayout.""")

def _process(self, element, key=None):

bad_bins_msg = 'bins only accept numerical data.'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this message could be improved a bit. How about:

'Histogram binning can only be performed on numerical data'

(self.p.bins.dtype.kind in 'SU' or self.bins.dtype.kind == 'O' and not isdatetime(self.bins))):
raise TypeError(bad_bins_msg)
elif isinstance(self.p.bins, (list, tuple)) and any(not is_number(bin) for bin in self.p.bins):
raise TypeError(bad_bins_msg)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given all these boolean expressions, it would be good to pull them out onto separate lines for clarity.

@@ -716,8 +724,16 @@ def _process(self, element, key=None):

if hasattr(element, 'interface'):
data = element.interface.values(element, selected_dim, compute=False)
dtype = element.interface.dtype(element, selected_dim)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is dtype not easily available with compute=False?

@jlstevens
Copy link
Contributor

Made a few comments and overall looks good. Could you add a test to see if the type error is raised when expected?

@hoxbro
Copy link
Member

hoxbro commented Jan 12, 2023

I can see that the improved error message requested for categorial data in holoviz/hvplot#906 have been solved in #5540. E.g, this code now gives a more descriptive error message:

import plotly.express as px
import hvplot.pandas
df = px.data.tips()
df.hvplot.hist("day", height=400)

I can't at a quick glance see if this PR also fixes other issues with categorical data in histograms.

@hoxbro
Copy link
Member

hoxbro commented May 17, 2024

No answer in a year, so I will close this PR.

@hoxbro hoxbro closed this May 17, 2024
Copy link

This pull request has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 23, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Make it very clear that hist only support numerical or datetime fields
4 participants