Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-6326][SQL] Improve castStruct to be faster #5017

Closed
wants to merge 2 commits into from

Conversation

viirya
Copy link
Member

@viirya viirya commented Mar 13, 2015

Current castStruct should be very slow. This pr slightly improves it.

@SparkQA
Copy link

SparkQA commented Mar 13, 2015

Test build #28575 timed out for PR 5017 at commit 746fcfb after a configured wait of 120m.

@marmbrus
Copy link
Contributor

Please post benchmark results for PRs with performance improvements.

@viirya
Copy link
Member Author

viirya commented Mar 15, 2015

Simple benchmark conducting 1000000 the struct casting in ExpressionEvaluationSuite:

before pr: 59.149s
after pr: 53.869s

Conducting 1000000 the complex casting in ExpressionEvaluationSuite:

before pr: 33.975s
after pr: 31.879s

It is slightly improved.

// TODO: Could be faster?
buildCast[Row](_, row => {
var i = 0
val fields = row.toSeq.map {(v) =>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will there be faster if we use while loop instead of the .map? At least we can save the intermediate object creation(fields).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Improved with new commit. It is faster.

@viirya
Copy link
Member Author

viirya commented Mar 15, 2015

With new commit.

Conducting 1000000 the struct casting in ExpressionEvaluationSuite:

before pr: 59.149s
after pr: 47.243s

Conducting 1000000 the complex casting in ExpressionEvaluationSuite:

before pr: 33.975s
after pr: 27.51s

var i = 0
while (i < row.length) {
val v = row(i)
newRow.update(i, if (v == null) null else casts(i)(v))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: newRow(i) = if (v == null) null else casts(i)(v)

@viirya
Copy link
Member Author

viirya commented Mar 15, 2015

Unrelated failure. retest it please.

@marmbrus
Copy link
Contributor

test this please

@SparkQA
Copy link

SparkQA commented Mar 18, 2015

Test build #28763 has finished for PR 5017 at commit 385d5b0.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@viirya
Copy link
Member Author

viirya commented Mar 20, 2015

/cc @marmbrus Any other concerns?

@marmbrus
Copy link
Contributor

Merged to master. Thanks!

@asfgit asfgit closed this in 73d5775 Mar 26, 2015
@viirya viirya deleted the faster_caststruct branch December 27, 2023 18:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants