categorical plots - unused categories mess up element spacing and width #3736
Comments
I think you want to set |
Right - that indeed fixes it, thanks! - though perhaps the default |
Yeah — determining whether dodge is needed is a surprisingly hard problem. Here's the code that's currently doing it; not sure why it isn't working with your example. |
The reason that
Using following modified dataframe for testing:
import seaborn as sns
penguins = sns.load_dataset('penguins')
penguins['island'] = penguins['island'].astype('category')
penguins['island'] = penguins['island'].cat.add_categories(['Uninhabited Island'])
penguins['hue_col'] = penguins['island']
Then
island
Biscoe 168
Dream 124
Torgersen 52
Uninhabited Island 0
Name: count, dtype: int64
And
island hue_col
Biscoe Biscoe 168
Dream Dream 124
Torgersen Torgersen 52
Name: count, dtype: int64
Changing the test in |
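The gap between declared and observed categories that the counts above show can be sketched with pandas alone. This is a minimal illustration on made-up data, not the penguins dataset and not seaborn's internal check. The final step, dropping unused categories before plotting, is a possible workaround that is not proposed anywhere in this thread.

```python
import pandas as pd

# Tiny stand-in for the 'island' column (values are illustrative only).
island = pd.Series(["Biscoe", "Dream", "Torgersen", "Biscoe"], dtype="category")
island = island.cat.add_categories(["Uninhabited Island"])

# Categories the dtype declares, whether or not they occur in the data.
declared = list(island.cat.categories)

# Values that actually occur in the data.
observed = sorted(island.unique())

# value_counts on a categorical Series reports unused categories with a count of 0.
counts = island.value_counts()

# Possible workaround (an assumption, not from the thread): drop unused categories.
trimmed = island.cat.remove_unused_categories()

print(declared)
print(observed)
print(list(trimmed.cat.categories))
```

Any code that compares the declared categories of one variable with the observed levels of another will see a mismatch here, even though both come from the same column.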
Several of seaborn's functions for plotting categorical data don't cope well when the categories list includes unused categories.
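I've noticed two main issues: the plot elements get squeezed to a narrower width, and they no longer line up with their axis ticks.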
It doesn't make a difference if you use vertical or horizontal orientation.
The issue only occurs when the same feature is used for the categorical x/y variable and for the hue. If no hue is provided, or if the hue uses a different feature, there is no issue.
The issues occur for sns.barplot, sns.boxplot, sns.boxenplot and sns.violinplot, whereas sns.pointplot, sns.stripplot and sns.swarmplot are fine.
I've reproduced the issue with the penguins dataset we all know and love from the seaborn docs. In the following MRE, the first col is the raw penguins data. The second col is after converting it to categorical (also works fine). The final col is after adding an unused category to the data, which causes the above two issues:
It looks as though it is failing to recognise that the hue and y are the same, so it makes space on the plot within each y for all the hues. This is what makes each element a) get squeezed, and b) not align nicely with the y ticks. Presumably the unused category is somehow the cause of the confusion.
Code to generate the above plot:
Many thanks as always for the superb library!