After using Focal Loss, the network does not converge #811
Comments
@pprp the obj loss is close to zero in your second example, so the network will never learn objectness this way.
@pprp also, if focal loss produces worse results, then clearly don't use it.
What should I do if I want to use focal loss?
@pprp try different settings.
Thank you very much. I will try to fix this problem.
@pprp by the way, I was looking at the focal loss function. I think the reduction setting may need an update now that the loss reduction functions are set to |
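For readers following the reduction discussion, here is a minimal pure-Python sketch of binary focal loss wrapped around BCE-with-logits. It is not the repository's implementation: the names `bce_with_logits` and `focal_bce` and the defaults `gamma=2.0`, `alpha=0.25` are assumptions chosen for illustration. It shows why the reduction has to be applied after the focal weighting, and how strongly easy examples are down-weighted.

```python
import math

def bce_with_logits(logit, target):
    # Numerically stable binary cross-entropy on a raw logit:
    # max(x, 0) - x * t + log(1 + exp(-|x|))
    return max(logit, 0.0) - logit * target + math.log1p(math.exp(-abs(logit)))

def focal_bce(logits, targets, gamma=2.0, alpha=0.25, reduction="mean"):
    # Hypothetical helper: per-element focal weighting first, reduction last.
    losses = []
    for x, t in zip(logits, targets):
        p = 1.0 / (1.0 + math.exp(-x))            # predicted probability
        p_t = t * p + (1.0 - t) * (1.0 - p)       # probability of the true class
        alpha_t = t * alpha + (1.0 - t) * (1.0 - alpha)
        losses.append(alpha_t * (1.0 - p_t) ** gamma * bce_with_logits(x, t))
    if reduction == "mean":
        return sum(losses) / len(losses)
    if reduction == "sum":
        return sum(losses)
    return losses                                  # "none": unreduced list

# An easy negative (confident, correct) contributes almost nothing:
easy = focal_bce([-6.0], [0.0])
plain = bce_with_logits(-6.0, 0.0)
print(easy / plain)
```

With many easy background cells, the focal-weighted obj loss can therefore become very small compared with plain BCE, which may be related to the near-zero obj loss mentioned above.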
Thanks for your reply. I will retrain tomorrow and let you know the final result.
@glenn-jocher I tried the fixed version but got the same problem.
If I use Fdefault, the network gets a non-finite loss error. If I use uFBCE, the network does not converge.
@pprp ah ok. Well, it seems focal loss is not the best choice for your problem. I recommend you stick to the repo defaults (i.e. |
@pprp from experience, Focal Loss is usually not the best way to go. I don't know what you are training on, but I would recommend increasing the img-size, lowering the initial learning rate by an order of magnitude (a factor of 10), or lowering the training IoU.
In my problem, I want to use focal loss to balance the positive and negative samples. I have a question about lobj. In the compute_loss function:
Can you tell me why the loss is calculated between the output and the GIoU? Does this have an effect on the focal loss?
@pprp this is experimental. We are currently testing the effect of the change, and I think we will revert to the original formulation below. Focal loss is independent of this, though.
tobj[b, a, gj, gi] = 1.0
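To make the two objectness targets concrete, here is a small pure-Python sketch. It is only an illustration under assumptions: `obj_target` and its `use_giou` flag are hypothetical names, and clamping negative GIoU to zero is assumed rather than taken from the repository.

```python
import math

def bce_with_logits(logit, target):
    # Numerically stable binary cross-entropy on a raw logit.
    return max(logit, 0.0) - logit * target + math.log1p(math.exp(-abs(logit)))

def obj_target(giou, use_giou=True):
    # Hypothetical helper: objectness target for a matched anchor.
    # use_giou=True  -> soft target equal to the box GIoU (clamped at 0, an assumption)
    # use_giou=False -> original hard target of 1.0
    return max(giou, 0.0) if use_giou else 1.0

giou = 0.6      # example overlap between predicted and ground-truth box
logit = 3.0     # a confident objectness prediction, sigmoid(3.0) is about 0.95

print(bce_with_logits(logit, obj_target(giou, use_giou=False)))  # hard target 1.0
print(bce_with_logits(logit, obj_target(giou, use_giou=True)))   # soft target 0.6
```

With the soft target, the loss is minimized when the predicted objectness equals the GIoU, so a confident prediction on a poorly localized box is penalized. With the hard target, the loss simply pushes objectness toward 1 for every matched anchor. Either way, focal weighting is applied on top of this BCE term and does not depend on which target is used.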
This issue is stale because it has been open 30 days with no activity. Remove Stale label or comment or this will be closed in 5 days.
we get the error below:
After I set the parameters as:
The network works but fails to converge.
What's more, I have only one class, and I use the command below: