
Use Intel OpenMP to speed up seq2batch when WITH_MKL #6622

Merged
merged 2 commits into PaddlePaddle:develop from omp
Dec 16, 2017

Conversation

tensor-tang
Contributor

This is related to #6512.

@tensor-tang tensor-tang requested a review from luotao1 December 14, 2017 09:32
(j + blockSize > seqWidth) ? (seqWidth - j) * sizeof(real)
: blockMemSize);
}
}
Contributor
Can we keep the original code here?
With OpenBLAS, wouldn't the original memcpy be faster? There is now an extra level of looping.

Contributor Author

OK, I'll keep the original code and distinguish between the two cases. Thx.

@tensor-tang
Contributor Author

Done

Contributor

@luotao1 luotao1 left a comment

LGTM

@luotao1 luotao1 merged commit 5ac72d9 into PaddlePaddle:develop Dec 16, 2017
@tensor-tang tensor-tang deleted the omp branch December 17, 2017 10:21