Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add prefix sort #7230

Closed

Conversation

skadilover
Copy link
Contributor

@skadilover skadilover commented Oct 25, 2023

  1. Introducing PrefixSort to improve the performance of memory sort, see Optimize sort #6766
  2. Supports BIGINT,performance is improved in both single-column and multi-column scenarios,see PrefixSortBenchmark

@netlify
Copy link

netlify bot commented Oct 25, 2023

Deploy Preview for meta-velox canceled.

Name Link
🔨 Latest commit d039ab8
🔍 Latest deploy log https://app.netlify.com/sites/meta-velox/deploys/6581657cc1eeb000081c158a

@facebook-github-bot
Copy link
Contributor

Hi @skadilover!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at cla@meta.com. Thanks!

@skadilover skadilover mentioned this pull request Oct 25, 2023
@skadilover skadilover force-pushed the WIP_add_prefix_sort branch 2 times, most recently from 5e06d5b to 8de29fd Compare November 8, 2023 10:45
@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Nov 8, 2023
@skadilover skadilover force-pushed the WIP_add_prefix_sort branch 21 times, most recently from ce1ffb8 to fca0ef0 Compare November 10, 2023 08:23
@skadilover skadilover changed the title [WIP] add prefix sort add prefix sort Nov 14, 2023
Copy link
Contributor

@mbasmanova mbasmanova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@skadilover Thank you for working on this. This PR iw 2K+ lines long. Is there a way to break it up into smaller pieces?

Copy link
Contributor

@mbasmanova mbasmanova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some initial comments.

@@ -336,6 +336,11 @@ class QueryConfig {
static constexpr const char* kEnableExpressionEvaluationCache =
"enable_expression_evaluation_cache";

static constexpr const char* kEnablePrefixSort = "enable_prefix_sort";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you add documentation for these properties to config.rst?

*/
#pragma once

#include "PrefixSortAlgorithm.h"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use full paths: "velox/exec/PrefixSortAlgorithm.h".

Ditto other places.

* limitations under the License.
*/
#pragma once
#include "string.h"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add empty line after #pragma once. Ditto other places.

#include "velox/common/base/Exceptions.h"

template <size_t SIZE>
static inline void MemcpyFixed(void* dest, const void* src) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please, review coding guidelines and update to comply. Function and method names should start lowercase.

case 246:
return MemsetFixed<246>(ptr, value);
case 247:
return MemsetFixed<247>(ptr, value);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider using a macro to shorten this code.

* See the License for the specific language governing permissions and
* limitations under the License.
*/
#pragma once
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps, this code can be introduced in a separate PR with unit tests.

bool needSortData = false;
};

class PrefixSort {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps, this can be another PR with tests.

@@ -161,6 +161,7 @@ class RowContainer {
public:
static constexpr uint64_t kUnlimited = std::numeric_limits<uint64_t>::max();
using Eraser = std::function<void(folly::Range<char**> rows)>;
friend class PrefixSort;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we avoid using 'friend'?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is what I have been struggling with for a long time. The reason for using friend is: PrefixSort needs to get data from RowContainer, but I don't want RowContainer to be dependent on the code of PrefixSort. I understand that the current solution that can take into account both performance and code dependencies can only use friend. Do you have any better suggestions?

@mbasmanova

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PrefixSort needs to get data from RowContainer, but I don't want RowContainer to be dependent on the code of PrefixSort

@skadilover I haven't read the code yet, but maybe you can explain a bit what data you need and why can't we have some sort of extractXxx API in RowContainer for the purpose.

Copy link
Contributor Author

@skadilover skadilover Nov 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mbasmanova The key point is that: when extracting data, we need to rely on PrefixSort logic (normalize/variable length process ..) , this part of logic will be a little complex. If the interface is provided by RowContainer, it will depend on the logic of PrefixSort.

@skadilover
Copy link
Contributor Author

@skadilover Thank you for working on this. This PR iw 2K+ lines long. Is there a way to break it up into smaller pieces?

@mbasmanova
Because this is the first PR, I hope it can help you have a general understanding of the entire plan. Is it reasonable to split the code in this way and meet the requirements of the community:
pr#1 Algrthm
pr#2 Normalize Encode
pr#3 PrefixSort

@mbasmanova
Copy link
Contributor

@skadilover It is indeed helpful to have the full picture, hence, let's keep this PR for reference and create small PRs for review and eventual merge.

class PrefixSort {
public:
PrefixSort(
RowContainer* FOLLY_NONNULL rowContainer,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do not use FOLLY_NONNULL

assert(!data.empty());
exec::test::PlanBuilder builder;

auto& type = data[0]->type()->as<TypeKind::ROW>();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This variable is not used.

RowTypePtr type,
int64_t numVectors,
int32_t numPerVector,
int32_t stringCardinality = 1000) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This parameter is not used.

sortedRows_.begin(),
sortedRows_.end(),
[this](const char* leftRow, const char* rightRow) {
for (vector_size_t index = 0; index < sortCompareFlags_.size();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For a simple case sorting by a few integers, we can optimize this comparison to use memcmp, no? I wonder how would this optimization compare with using PrefixSort. Specifically, I'm curious whether it will give us the most benefits with PrefixSort adding just a little bit of perf boost on top.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, this is the code velox use now see the if body thats PrefixSort impl
if (prefixSortConfig_.has_value())
`

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's right and I assume this is very inefficient. I wonder what happens if we just optimize this comparator. I expect it will be speed things up quite a bit.

Copy link

stale bot commented Mar 18, 2024

This pull request has been automatically marked as stale because it has not had recent activity. If you'd still like this PR merged, please comment on the PR, make sure you've addressed reviewer comments, and rebase on the latest main. Thank you for your contributions!

@stale stale bot added the stale label Mar 18, 2024
@stale stale bot closed this Apr 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. stale
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants