-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add prefix sort #7230
add prefix sort #7230
Conversation
skadilover
commented
Oct 25, 2023
•
edited
Loading
edited
- Introducing PrefixSort to improve the performance of memory sort, see Optimize sort #6766
- Supports BIGINT,performance is improved in both single-column and multi-column scenarios,see PrefixSortBenchmark
✅ Deploy Preview for meta-velox canceled.
|
Hi @skadilover! Thank you for your pull request and welcome to our community. Action RequiredIn order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you. ProcessIn order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA. Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with If you have received this in error or have any questions, please contact us at cla@meta.com. Thanks! |
5e06d5b
to
8de29fd
Compare
ce1ffb8
to
fca0ef0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@skadilover Thank you for working on this. This PR iw 2K+ lines long. Is there a way to break it up into smaller pieces?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some initial comments.
@@ -336,6 +336,11 @@ class QueryConfig { | |||
static constexpr const char* kEnableExpressionEvaluationCache = | |||
"enable_expression_evaluation_cache"; | |||
|
|||
static constexpr const char* kEnablePrefixSort = "enable_prefix_sort"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you add documentation for these properties to config.rst?
velox/exec/PrefixSort.h
Outdated
*/ | ||
#pragma once | ||
|
||
#include "PrefixSortAlgorithm.h" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use full paths: "velox/exec/PrefixSortAlgorithm.h".
Ditto other places.
velox/exec/PrefixSortEncode.h
Outdated
* limitations under the License. | ||
*/ | ||
#pragma once | ||
#include "string.h" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add empty line after #pragma once. Ditto other places.
velox/exec/PrefixSortEncode.h
Outdated
#include "velox/common/base/Exceptions.h" | ||
|
||
template <size_t SIZE> | ||
static inline void MemcpyFixed(void* dest, const void* src) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please, review coding guidelines and update to comply. Function and method names should start lowercase.
velox/exec/PrefixSortEncode.h
Outdated
case 246: | ||
return MemsetFixed<246>(ptr, value); | ||
case 247: | ||
return MemsetFixed<247>(ptr, value); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider using a macro to shorten this code.
velox/exec/PrefixSortAlgorithm.h
Outdated
* See the License for the specific language governing permissions and | ||
* limitations under the License. | ||
*/ | ||
#pragma once |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps, this code can be introduced in a separate PR with unit tests.
bool needSortData = false; | ||
}; | ||
|
||
class PrefixSort { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps, this can be another PR with tests.
velox/exec/RowContainer.h
Outdated
@@ -161,6 +161,7 @@ class RowContainer { | |||
public: | |||
static constexpr uint64_t kUnlimited = std::numeric_limits<uint64_t>::max(); | |||
using Eraser = std::function<void(folly::Range<char**> rows)>; | |||
friend class PrefixSort; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we avoid using 'friend'?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is what I have been struggling with for a long time. The reason for using friend is: PrefixSort needs to get data from RowContainer, but I don't want RowContainer to be dependent on the code of PrefixSort. I understand that the current solution that can take into account both performance and code dependencies can only use friend. Do you have any better suggestions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PrefixSort needs to get data from RowContainer, but I don't want RowContainer to be dependent on the code of PrefixSort
@skadilover I haven't read the code yet, but maybe you can explain a bit what data you need and why can't we have some sort of extractXxx API in RowContainer for the purpose.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mbasmanova The key point is that: when extracting data, we need to rely on PrefixSort logic (normalize/variable length process ..) , this part of logic will be a little complex. If the interface is provided by RowContainer, it will depend on the logic of PrefixSort.
@mbasmanova |
@skadilover It is indeed helpful to have the full picture, hence, let's keep this PR for reference and create small PRs for review and eventual merge. |
class PrefixSort { | ||
public: | ||
PrefixSort( | ||
RowContainer* FOLLY_NONNULL rowContainer, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do not use FOLLY_NONNULL
assert(!data.empty()); | ||
exec::test::PlanBuilder builder; | ||
|
||
auto& type = data[0]->type()->as<TypeKind::ROW>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This variable is not used.
RowTypePtr type, | ||
int64_t numVectors, | ||
int32_t numPerVector, | ||
int32_t stringCardinality = 1000) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This parameter is not used.
sortedRows_.begin(), | ||
sortedRows_.end(), | ||
[this](const char* leftRow, const char* rightRow) { | ||
for (vector_size_t index = 0; index < sortCompareFlags_.size(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For a simple case sorting by a few integers, we can optimize this comparison to use memcmp, no? I wonder how would this optimization compare with using PrefixSort. Specifically, I'm curious whether it will give us the most benefits with PrefixSort adding just a little bit of perf boost on top.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, this is the code velox use now see the if body thats PrefixSort impl
if (prefixSortConfig_.has_value())
`
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's right and I assume this is very inefficient. I wonder what happens if we just optimize this comparator. I expect it will be speed things up quite a bit.
761660b
to
a9f14d9
Compare
This pull request has been automatically marked as stale because it has not had recent activity. If you'd still like this PR merged, please comment on the PR, make sure you've addressed reviewer comments, and rebase on the latest main. Thank you for your contributions! |