Support binary case #200

Closed
wants to merge 1 commit into from
Conversation

ejm714
Collaborator

@ejm714 ejm714 commented Jul 20, 2022

If a user provides only two labels, we assume these are mutually exclusive and train the binary model. We log which column we're keeping.
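
A minimal sketch of the behavior described above, not the code in this PR. The helper name `collapse_to_binary`, the choice to keep the first of the two columns, and the example column names are assumptions made for illustration; only the `species_` column filter comes from the diff.

```python
import logging

import pandas as pd

logger = logging.getLogger(__name__)


def collapse_to_binary(labels: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: keep one of two mutually exclusive label columns."""
    species_cols = labels.filter(regex="species_").columns
    if len(species_cols) != 2:
        # Not the binary case: leave the labels untouched.
        return labels
    # Assumption: keep the first column; the PR does not say which one is kept.
    kept, dropped = species_cols[0], species_cols[1]
    logger.info("Binary case: keeping %s and dropping %s.", kept, dropped)
    return labels.drop(columns=[dropped])


# Hypothetical one-hot labels for three videos.
labels = pd.DataFrame(
    {"species_blank": [1, 0, 1], "species_non_blank": [0, 1, 0]}
)
binary_labels = collapse_to_binary(labels)
```

Because the two columns are assumed to be mutually exclusive, the dropped column carries no extra information: it is always the complement of the one that is kept.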

Outstanding:

  • tests
  • documentation

Note: I gave this a quick try on a dataset of 100 videos balanced evenly between blank and non-blank, and we do indeed see some learning.

      Validate metric             DataLoader 0
─────────────────────────────────────────────────────────
species/val_accuracy/blank            0.75
   species/val_f1/blank        0.7826086956521738
species/val_precision/blank    0.6923076923076923
 species/val_recall/blank              0.9
       val_accuracy                   0.75
         val_loss              0.6593988537788391
       val_macro_f1            0.7442455242966751

One downside of the binary case is that these metrics can look misleading when the classes are imbalanced. I'm not sure of the best way to warn users about that. If users provide highly imbalanced data, models may learn to predict only the default class, which is a problem, but I suppose the same is true in the multilabel case.
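
To make the imbalance concern concrete, here is a small sketch in plain Python. The data are hypothetical (95 videos of one class, 5 of the other) and the `binary_metrics` helper is illustrative only; it is not part of this PR or of the library's metrics code. Undefined precision or recall is reported as 0.0 here, which is a convention chosen for the example.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, and recall for the `positive` class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    correct = sum(t == p for t, p in pairs)
    return {
        "accuracy": correct / len(pairs),
        # Convention for this sketch: report 0.0 when the ratio is undefined.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Hypothetical imbalanced validation set: 95 majority (0), 5 minority (1).
y_true = [0] * 95 + [1] * 5
# A degenerate model that always predicts the majority class.
y_pred = [0] * 100

metrics = binary_metrics(y_true, y_pred)
```

In this setup accuracy is 0.95 even though the model never finds a single minority-class video, so recall for that class is 0. Reporting per-class recall or macro F1 alongside accuracy is one way to surface the problem.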

@netlify

netlify bot commented Jul 20, 2022

Deploy Preview for silly-keller-664934 ready!

Name Link
🔨 Latest commit c8c5055
🔍 Latest deploy log https://app.netlify.com/sites/silly-keller-664934/deploys/62d74d49a7f34b000941e9a5
😎 Deploy Preview https://deploy-preview-200--silly-keller-664934.netlify.app

@ejm714 ejm714 marked this pull request as draft July 20, 2022 00:33
@github-actions
Contributor

@@ -491,6 +491,12 @@ def preprocess_labels(cls, values):
.max()
)

species_cols = labels.filter(regex="species_").columns
# binary case
if len(species_cols) == 2:
Collaborator Author

Add a check to confirm that the row-wise sum is always 1, in addition to checking that there are exactly two columns.
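
One way the suggested check could look, as a sketch under stated assumptions: `is_binary_case` is a hypothetical name, and the example frames are made up. Only the `species_` column filter is taken from the diff above.

```python
import pandas as pd


def is_binary_case(labels: pd.DataFrame) -> bool:
    """Hypothetical check: two species columns and exactly one positive per row."""
    species = labels.filter(regex="species_")
    if species.shape[1] != 2:
        return False
    # Mutually exclusive one-hot labels sum to exactly 1 in every row.
    return bool((species.sum(axis=1) == 1).all())


# Hypothetical examples: the second frame has a row where both labels are set.
exclusive = pd.DataFrame({"species_a": [1, 0], "species_b": [0, 1]})
overlapping = pd.DataFrame({"species_a": [1, 1], "species_b": [0, 1]})
```

With this guard, `exclusive` would be treated as binary, while `overlapping` would fall through to the multilabel path because one of its rows sums to 2.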

@AllenDowney
Contributor

This branch is currently failing several tests (not just code quality), even after pulling updates from master. Rather than debug them, I'm inclined to start with a new branch and create a new PR. Sound OK, @ejm714?

@AllenDowney AllenDowney mentioned this pull request Aug 22, 2022
2 tasks
@ejm714
Collaborator Author

ejm714 commented Aug 22, 2022

Closing as this is superseded by #215

@ejm714 ejm714 closed this Aug 22, 2022
@ejm714 ejm714 deleted the binary-model branch September 12, 2022 18:31