Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[GLUTEN-3620][VL] Support Range operator for Velox Backend #8161

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

ArnavBalyan
Copy link
Contributor

What changes were proposed in this pull request?

  • Currently the range operator fallsback leads to R2C overhead for downstream operators in velox.
  • Added support for RangeExec which produces range in columnar batches.
  • In the future the range generation can itself be offloaded to velox backend.

How was this patch tested?

image
  • Plan reflects the new operator
  • Added unit tests

@github-actions github-actions bot added CORE works for Gluten Core BUILD VELOX labels Dec 5, 2024
Copy link

github-actions bot commented Dec 5, 2024

Thanks for opening a pull request!

Could you open an issue for this pull request on Github Issues?

https://github.com/apache/incubator-gluten/issues

Then could you also rename commit message and pull request title in the following format?

[GLUTEN-${ISSUES_ID}][COMPONENT]feat/fix: ${detailed message}

See also:

Copy link

github-actions bot commented Dec 5, 2024

Run Gluten Clickhouse CI on x86

2 similar comments
Copy link

github-actions bot commented Dec 5, 2024

Run Gluten Clickhouse CI on x86

Copy link

github-actions bot commented Dec 5, 2024

Run Gluten Clickhouse CI on x86

Copy link
Contributor

@CodenameGHOST007 CodenameGHOST007 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does velox currently have any similar operator that we can leverage or a new implementation will be needed in case we want to offload it.

import org.apache.spark.sql.GlutenSQLTestsTrait
import org.apache.spark.sql.Row

class GlutenSQLRangeExecSuite extends GlutenSQLTestsTrait {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The suite would need to be enabled in VeloxTestSettings.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Enabled

@ArnavBalyan
Copy link
Contributor Author

Does velox currently have any similar operator that we can leverage or a new implementation will be needed in case we want to offload it.

Yes, I'm planning to work on the offload implementation. Currently it seems unlikely we will have to introduce code in velox. Will raise PR once possible thanks.
Some minor changes are needed for the tests, will enable them thanks for review

* @param child
* Child SparkPlan nodes for this operator, if any.
*/
case class RangeExecTransformer(
Copy link
Member

@zhztheplayer zhztheplayer Dec 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we name it "ColumnarRangeExec" (or VeloxColumnarRangeExec or something) as it doesn't extend the TransformSupport trait? Thanks.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated this thanks, I am planning to add native support soon, will name it accordingly once it's done

Comment on lines +45 to +54
override protected def doValidateInternal(): ValidationResult = {
val isSupported = BackendsApiManager.getSettings.supportRangeExec()

if (!isSupported) {
return ValidationResult.failed(
s"RangeExec is not supported by the current backend."
)
}
ValidationResult.succeeded
}
Copy link
Member

@zhztheplayer zhztheplayer Dec 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think usually we have to add some code to this validator to make doValidate be used. Can you help check this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am opening #8177 that may simplify the usage of validators. Perhaps could do a rebase once that PR merged.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, rebased since #8177 is merged, I believe we won't need to add to this validator since currently the code does not interact with native or rely on native validation. It may be needed once offloaded to velox

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ArnavBalyan The validator's name is a little bit misleading, though I think it's the one to call doValidate, then doValidateInternal to make sure BackendsApiManager.getSettings.supportRangeExec() is checked. Would you like to double check? Or is there another code path calling RangeExecTransformer.doValidateInternal?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes makes sense, let me double check and get back

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just double checked, the this is defined in offloadothers, which is wired upto validatorBuilder that calls doValidate and then doValidateInternal. This will trigger the check, UTs also confirm it, I see it's currently similarly implemented for other columnar java operators as well thanks!

current += numRows * step

val batch = new ColumnarBatch(vectors.asInstanceOf[Array[ColumnVector]], numRows)
val offloadedBatch = ColumnarBatches.offload(allocator, batch)
Copy link
Member

@zhztheplayer zhztheplayer Dec 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because of the code, the operator's behaviour acutally align more precisely with

  override def batchType(): Convention.BatchType = {
    ArrowNativeBatch
  }

(as ColumnarBatches.offload results in native Arrow batch)

Would you see if we can add the code to this class? If yes we can also remove RangeExecBaseTransformer's default implementation.

Doing this will make query planner to insert an explicit ArrowToVelox c2c transition into query plan so we can easily collect the transition's metrics.

Futhermore, I will suggest remove line 117 and line 118 then return the batch directly, then have the batch type changed as

  override def batchType(): Convention.BatchType = {
    ArrowJavaBatch
  }

So two explicit c2cs (ArrowJava-to-ArrowNative, ArrowNative-to-Velox) can be inserted to query plan.

More details please refer to test code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the feedback this is much more elegant, updated it!
image

import org.apache.spark.sql.GlutenSQLTestsTrait
import org.apache.spark.sql.Row

class GlutenSQLRangeExecSuite extends GlutenSQLTestsTrait {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why we need test suites for all Spark versions?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to keep it only under 33

@zhouyuan zhouyuan changed the title [VL] Support Range operator for Velox Backend [GLUTEN-3620][VL] Support Range operator for Velox Backend Dec 11, 2024
Copy link

#3620

@zhouyuan
Copy link
Contributor

Does velox currently have any similar operator that we can leverage or a new implementation will be needed in case we want to offload it.

Yes, I'm planning to work on the offload implementation. Currently it seems unlikely we will have to introduce code in velox. Will raise PR once possible thanks. Some minor changes are needed for the tests, will enable them thanks for review

We have some early discussion on this to map it to unnest + sequence - but didn't try the real impl at that time
#3620

@zhouyuan
Copy link
Contributor

@ArnavBalyan would you please check the failed unit tests?

thanks, -yuan

@ArnavBalyan
Copy link
Contributor Author

Hi @zhouyuan thanks for the review, interesting thread, I will go through! I'll fix the tests today/tomorrow thanks!

Copy link

github-actions bot commented Jan 1, 2025

Run Gluten Clickhouse CI on x86

2 similar comments
Copy link

github-actions bot commented Jan 1, 2025

Run Gluten Clickhouse CI on x86

Copy link

github-actions bot commented Jan 2, 2025

Run Gluten Clickhouse CI on x86

Copy link

github-actions bot commented Jan 2, 2025

Run Gluten Clickhouse CI on x86

Copy link

github-actions bot commented Jan 2, 2025

Run Gluten Clickhouse CI on x86

@github-actions github-actions bot removed the BUILD label Jan 2, 2025
Copy link

github-actions bot commented Jan 2, 2025

Run Gluten Clickhouse CI on x86

Copy link

github-actions bot commented Jan 2, 2025

Run Gluten Clickhouse CI on x86

@ArnavBalyan
Copy link
Contributor Author

The unit tests are failing due to another bug, this PR will be rebased once the dependent PR is merged. Which will also fix the UTs.
#8453

Copy link

github-actions bot commented Jan 8, 2025

Run Gluten Clickhouse CI on x86

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CORE works for Gluten Core VELOX
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants