This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

Additional fix for vector access. #17230

Merged
merged 6 commits into from
Feb 16, 2020
9 changes: 3 additions & 6 deletions 3rdparty/mshadow/mshadow/dot_engine-inl.h
@@ -421,12 +421,9 @@ struct BLASEngine<cpu, double> {
  CBLAS_TRANSPOSE p_transa[GROUP_SIZE] = {cblas_a_trans};
  CBLAS_TRANSPOSE p_transb[GROUP_SIZE] = {cblas_b_trans};

- std::vector<const double*> pp_A;
- std::vector<const double*> pp_B;
- std::vector<double*> pp_C;
- pp_A.reserve(batch_count);
- pp_B.reserve(batch_count);
- pp_C.reserve(batch_count);
+ std::vector<const double*> pp_A(batch_count, nullptr);
@wkcn (Member), Jan 7, 2020:

Thank you for the fix!
It would be faster to use `std::vector<const double*> pp_A(batch_count);` :)

Contributor (PR author):

I did this because it is precisely the same logic the earlier commit used.

That said, I believe the standard guarantees that elements of non-class type are value-initialized (for pointers, set to null) by the constructor form you propose, so I would expect the two forms to perform identically.
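
The point about the two constructor forms can be checked with a small sketch. The helper names below (`all_null`, `make_value_initialized`, `make_explicit_null`) are hypothetical and not part of mshadow; the sketch only compares the resulting contents and does not measure performance.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: true if every pointer in the vector is null.
inline bool all_null(const std::vector<const double*>& v) {
  return std::all_of(v.begin(), v.end(),
                     [](const double* p) { return p == nullptr; });
}

// Form suggested in the review: n value-initialized elements.
// For a pointer type, value-initialization yields a null pointer.
inline std::vector<const double*> make_value_initialized(std::size_t n) {
  return std::vector<const double*>(n);
}

// Form used in the PR: n copies of an explicit nullptr.
inline std::vector<const double*> make_explicit_null(std::size_t n) {
  return std::vector<const double*>(n, nullptr);
}
```

Under these assumptions, `make_value_initialized(8) == make_explicit_null(8)` holds and `all_null` returns true for both, so the choice between the two forms is about style rather than resulting state.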

+ std::vector<const double*> pp_B(batch_count, nullptr);
+ std::vector<double*> pp_C(batch_count, nullptr);

  auto m_k = m * k;
  auto k_n = k * n;
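
Why the change matters for indexed access can be sketched as follows. `reserve()` only allocates capacity and leaves `size()` at zero, so writing `v[i]` afterwards is undefined behavior, while the sized constructor creates real elements that can be indexed. The function names and the stride-based fill loop are assumptions for illustration; the actual loop in `dot_engine-inl.h` is not shown in this excerpt.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Old pattern: capacity is reserved, but the vector still has no elements,
// so indexing into it with operator[] would be undefined behavior.
inline std::vector<double*> make_reserved(std::size_t batch_count) {
  std::vector<double*> v;
  v.reserve(batch_count);
  return v;  // v.size() == 0
}

// New pattern: batch_count null pointers exist and can be indexed safely.
inline std::vector<double*> make_sized(std::size_t batch_count) {
  return std::vector<double*>(batch_count, nullptr);
}

// Hypothetical fill loop: entry i points at base + i * stride.
inline void fill_batch_pointers(std::vector<double*>& pp, double* base,
                                std::size_t stride) {
  for (std::size_t i = 0; i < pp.size(); ++i) {
    pp[i] = base + i * stride;
  }
}
```

With a buffer `double buf[6]`, calling `fill_batch_pointers` on `make_sized(3)` with a stride of 2 sets the three entries to `buf`, `buf + 2`, and `buf + 4`. The same call on `make_reserved(3)` does nothing, because its `size()` is zero.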