[cherry-pick] Fix the CE's bug when axis is specified and weight is provided #36647

@HydrogenSulfate HydrogenSulfate commented Oct 22, 2021

PR types

Bug fixes

PR changes

APIs

Describe

cherry pick from #36344

Background:
When the CrossEntropy loss function is used with hard labels and a weight is passed at the same time, specifying an axis other than -1 causes an error during the calculation.

Problem location:
gather_nd is used to compute the intermediate variable weight_gather, but gather_nd fails when the coordinate values are not in the last dimension. Inputs shaped as described in the background trigger exactly this case.
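
To make the failure mode concrete, here is a minimal pure-Python sketch. It is not Paddle code: `gather_nd_1d` is a hypothetical helper that only imitates how gather_nd reads the last dimension of its index as coordinates, and the shapes (N=2, H=3, C=3) are assumed for illustration.

```python
def gather_nd_1d(weight, index):
    """Imitate gather_nd on a 1-D weight list (illustrative helper only).

    The innermost lists of `index` are read as coordinates, so each one
    must hold exactly one value when `weight` is 1-D.
    """
    if not isinstance(index[0], list):
        if len(index) != 1:
            raise ValueError(
                "coordinate has %d entries but weight is 1-D" % len(index))
        return weight[index[0]]
    return [gather_nd_1d(weight, sub) for sub in index]


weight = [0.5, 1.0, 2.0]  # one weight per class, C = 3 (assumed values)

# Hard labels with the class axis at position 1: shape [N=2, 1, H=3].
label = [[[0, 2, 1]],
         [[1, 1, 0]]]

try:
    gather_nd_1d(weight, label)
    failed = False
except ValueError:
    failed = True  # the innermost dimension has length 3, not 1

# With the size-1 class axis in the last position: shape [N=2, H=3, 1].
label_last = [[[v] for v in sample[0]] for sample in label]
weight_gather = gather_nd_1d(weight, label_last)  # shape [N=2, H=3]
```

In this sketch the lookup only succeeds once the label coordinates sit in the last dimension, which is the condition the original code did not guarantee for axis values other than -1.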

Solution:
When axis is specified as a dimension other than -1, manually construct the correct permutation and pass it to the subsequent gather_nd call, so that weight_gather is produced with the correct shape.
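
The description does not show the permutation itself. One plausible construction, sketched below under the assumption that the goal is to move the class axis to the last position while keeping the other dimensions in order, looks like this. The helper names `axis_to_last_perm` and `permuted_shape` are hypothetical and are not taken from the Paddle source.

```python
def axis_to_last_perm(ndim, axis):
    """Return a permutation that moves `axis` to the last position.

    Negative axes are normalized first, so axis=-1 yields the identity.
    """
    axis = axis % ndim
    return list(range(axis)) + list(range(axis + 1, ndim)) + [axis]


def permuted_shape(shape, perm):
    """Shape of a tensor after transposing it with `perm`."""
    return [shape[p] for p in perm]


# Example: hard labels of shape [N, 1, H, W] with the class axis at 1.
perm = axis_to_last_perm(4, 1)                    # [0, 2, 3, 1]
new_shape = permuted_shape([8, 1, 32, 32], perm)  # [8, 32, 32, 1]
```

Transposing the label with such a permutation puts the class index in the last dimension, where gather_nd expects its coordinates.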

Other modifications:
Fixed some incorrect axis-related condition checks and added a test case to the unit tests.

@paddle-bot-old
Thanks for your contribution!
Please wait for the CI results first. See Paddle CI Manual for details.

@chajchaj chajchaj left a comment

LGTM

@XiaoguangHu01 XiaoguangHu01 merged commit 32fe5a4 into PaddlePaddle:release/2.2 Oct 26, 2021
@HydrogenSulfate HydrogenSulfate deleted the cherry_pick_fix_CE_axis_bug branch October 26, 2021 05:51