
Use Intel OpenMP to speed up seq2batch when WITH_MKL #6622

Merged
merged 2 commits into PaddlePaddle:develop from omp
Dec 16, 2017

Conversation

tensor-tang
Contributor

This is related to #6512.

@tensor-tang tensor-tang requested a review from luotao1 December 14, 2017 09:32
(j + blockSize > seqWidth) ? (seqWidth - j) * sizeof(real)
: blockMemSize);
}
}
Contributor
Can we keep the original code here?
With OpenBLAS, wouldn't the original memcpy be faster? There is now an extra level of looping.

Contributor Author

OK, I'll keep the original code and distinguish between the two cases. Thx.

@tensor-tang
Contributor Author

Done

Contributor

@luotao1 luotao1 left a comment

LGTM

@luotao1 luotao1 merged commit 5ac72d9 into PaddlePaddle:develop Dec 16, 2017
@tensor-tang tensor-tang deleted the omp branch December 17, 2017 10:21