[SPARK-27675][SQL] do not use MutableColumnarRow in ColumnarBatch #24581
this.columns = columns;
this.writableColumns = null;
}
private final WritableColumnVector[] columns;
Could this extend ColumnarBatchRow to avoid duplication?
I thought about it too. MutableColumnarRow is used in a performance-critical path (hash aggregate), and I'm a little hesitant to add a class hierarchy here, which may hurt performance. cc @kiszk
Overall, looks good to me (+1). One minor point about avoiding code duplication by extending the read-only row to add the write methods, but that's not a blocker for me.
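For readers following the duplication point, here is a minimal sketch of the alternative raised above: a writable row that extends a read-only row and adds only the setters. The names `LongColumn`, `ReadRow`, and `WriteRow` are hypothetical stand-ins, not Spark classes, and the sketch says nothing about the performance concern, which this thread does not measure.

```java
// Hypothetical column type; a stand-in, not Spark's WritableColumnVector.
final class LongColumn {
    private final long[] data;

    LongColumn(int capacity) { this.data = new long[capacity]; }

    long getLong(int rowId) { return data[rowId]; }

    void putLong(int rowId, long value) { data[rowId] = value; }
}

// Read-only base row: getters only. Left non-final so it can be extended.
class ReadRow {
    protected final LongColumn[] columns;
    protected int rowId;

    ReadRow(LongColumn[] columns) { this.columns = columns; }

    // Points the view at a different row without allocating a new object.
    void setRowId(int rowId) { this.rowId = rowId; }

    long getLong(int ordinal) { return columns[ordinal].getLong(rowId); }
}

// Writable row: inherits the getters and adds only the setters.
final class WriteRow extends ReadRow {
    WriteRow(LongColumn[] columns) { super(columns); }

    void setLong(int ordinal, long value) { columns[ordinal].putLong(rowId, value); }
}
```

With this shape the getters are written once, at the cost of a non-final base class, which is the class hierarchy the author was hesitant to introduce on the hash aggregate path.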
/**
 * An internal class, which wraps an array of {@link ColumnVector} and provides a row view.
 */
class ColumnarBatchRow extends InternalRow {
Make it a final class, like ColumnarRow and MutableColumnarRow?
Yes, we can add final. I'll do it the next time I touch the code here.
This approach looks clearer. Looks good.
Test build #105316 has finished for PR 24581 at commit
Merged to master.
## What changes were proposed in this pull request?

To move the DS v2 API to the catalyst module, we can't refer to an internal class (`MutableColumnarRow`) in `ColumnarBatch`. This PR creates a read-only version of `MutableColumnarRow` and uses it in `ColumnarBatch`.

close apache#24546

## How was this patch tested?

existing tests

Closes apache#24581 from cloud-fan/mutable-row.

Authored-by: Wenchen Fan <wenchen@databricks.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
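As a rough illustration of the read-only row idea described above, the sketch below wraps an array of columns in a row view that exposes getters and rejects updates. `IntColumn` and `ReadOnlyRowView` are hypothetical names invented for this example; they are not the `ColumnVector` or `ColumnarBatchRow` code from the PR, and the choice to throw `UnsupportedOperationException` on writes is an assumption.

```java
// Hypothetical stand-in for a column vector; not Spark's ColumnVector API.
final class IntColumn {
    private final int[] values;

    IntColumn(int... values) { this.values = values.clone(); }

    int getInt(int rowId) { return values[rowId]; }
}

// Read-only row view over an array of columns (assumed design, for illustration).
final class ReadOnlyRowView {
    private final IntColumn[] columns;
    int rowId; // repositioned by the owning batch while iterating

    ReadOnlyRowView(IntColumn[] columns) { this.columns = columns; }

    int numFields() { return columns.length; }

    // Reads the value of column `ordinal` at the current row.
    int getInt(int ordinal) { return columns[ordinal].getInt(rowId); }

    // Writes are rejected: the view has no dependency on a writable column type.
    void setInt(int ordinal, int value) {
        throw new UnsupportedOperationException("read-only row view");
    }
}
```

Because the view depends only on the read side of the column type, a public batch class can hand it out without referring to any internal writable class.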