Imbalance dataset detection enhancement #276
Comments
@alirezazolanvari @sadrasabouri |
@ssheikholeslami |
How about defining a variable for each class that represents its weight in the main dataset, and calculating the variance of these weights to determine whether the dataset is imbalanced? |
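To make the suggestion above concrete, here is a minimal sketch, assuming a class's "weight" means its share of the total sample count. The function names and the sample counts are hypothetical and are not part of PyCM.

```python
from statistics import pvariance


def class_weights(class_counts):
    """Return each class's share of the total sample count."""
    total = sum(class_counts.values())
    return {label: count / total for label, count in class_counts.items()}


def weight_variance(class_counts):
    """Population variance of the class shares; 0 means perfectly balanced."""
    return pvariance(list(class_weights(class_counts).values()))


# Hypothetical counts, for illustration only.
balanced = {"a": 50, "b": 50}
skewed = {"a": 90, "b": 10}
print(weight_variance(balanced))
print(weight_variance(skewed))
```

A cutoff on this variance would still have to be chosen, and it would probably need to depend on the number of classes, since the shares shrink as classes are added.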
I think only the populations of the most and the least populated classes play a role in imbalance detection. |
I think the method that you mentioned here is not accurate enough in all cases.
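A rough sketch of this max/min idea, assuming class populations are available as a dict of counts. The helper names and the threshold parameter are hypothetical and do not describe PyCM's current implementation.

```python
def max_min_ratio(class_counts):
    """Ratio of the most populated class to the least populated one."""
    largest = max(class_counts.values())
    smallest = min(class_counts.values())
    if smallest == 0:
        # An empty class is treated as infinitely imbalanced.
        return float("inf")
    return largest / smallest


def is_imbalanced_by_ratio(class_counts, threshold):
    """Flag the dataset when the max/min ratio exceeds the given threshold."""
    return max_min_ratio(class_counts) > threshold


# Hypothetical counts, for illustration only.
counts = {"a": 300, "b": 100, "c": 200}
print(max_min_ratio(counts))
print(is_imbalanced_by_ratio(counts, threshold=2.0))
```

This check ignores every class between the two extremes, which is the trade-off being debated in this thread.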
|
Weight is a good idea! |
We can calculate a weighted E and then divide E by the sum of the weights, so that E is normalized with respect to the weights. As for @alirezazolanvari's idea, I couldn't find any exception (at least at first glance). Could you please write down an example that doesn't seem imbalanced but has a large difference between the min and max, and vice versa? |
There is no standard method for imbalanced dataset detection, and @alirezazolanvari's idea is not wrong, but take a look at this example: Class 1 : 3900 The current method recognizes this distribution as balanced (even with a ratio of 1.5)! |
We can also define an |
If it's OK I can work on this issue for version 3.3. |
🥇 |
Description
PyCM's imbalance detection is weak. Example: