Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PyTorch_NCF fix, normalize running loss #537

Merged
merged 4 commits into from
Oct 24, 2023
Merged

PyTorch_NCF fix, normalize running loss #537

merged 4 commits into from
Oct 24, 2023

Conversation

hieuddo
Copy link
Member

@hieuddo hieuddo commented Oct 23, 2023

Description

Minor fix for PyTorch_NCF, nn.BCELoss(reduction="sum"), default is reduction="mean"

@hieuddo hieuddo requested a review from tqtg October 23, 2023 13:57
@tqtg
Copy link
Member

tqtg commented Oct 23, 2023

@hieuddo any reason why we want to change to sum?

@hieuddo
Copy link
Member Author

hieuddo commented Oct 23, 2023

The current loss being printed is loss = running_loss / num_users. So the reduction of loss should be sum instead of mean.

count += len(batch_users)

@tqtg
Copy link
Member

tqtg commented Oct 23, 2023

I see your point. Two perspectives:

  1. We want to show what's the current loss value on the training dataset which is what the formula says if we're about to do batch learning, although it's not quite what the model is optimizing for when doing mini-batch.
  2. Normalizing the loss by the size of mini-batch will help learning be more stable, less sensitive to the size of mini-batch. Think of them as doing SGD using the centroid of mini-batch.

I would suggest to keep the reduction="mean". If you're not comfortable with the current way of showing the loss value, we can change count to counting the number of mini-batches, and sum_loss to summing all the normalized mini-batch losses.

@hieuddo
Copy link
Member Author

hieuddo commented Oct 23, 2023

Understood. Just wanted to print the correct average loss over users.

One minor is current printed loss is relatively small (i.e., e-[3/4/5]), which could be hard to follow for hyperparams searching. This could be fixed by changing count += 1 I guess.

@hieuddo hieuddo changed the title PyTorch_NCF fix, BCELoss(reduction='sum') PyTorch_NCF fix, normalize running loss Oct 23, 2023
@tqtg
Copy link
Member

tqtg commented Oct 23, 2023

Let's also synchronize this change to the tensorflow backend?

@hieuddo hieuddo merged commit 9b06a51 into PreferredAI:master Oct 24, 2023
12 checks passed
@hieuddo hieuddo deleted the fix_loss_ncfpytorch branch October 24, 2023 03:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants