-
Notifications
You must be signed in to change notification settings - Fork 28.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-49569][CONNECT][SQL] Add shims to support SparkContext and RDD #48065
Conversation
@@ -52,4 +52,7 @@ package object sql { | |||
f(builder) | |||
column(builder.build()) | |||
} | |||
|
|||
private[sql] def throwRddNotSupportedException(): Nothing = | |||
throw new UnsupportedOperationException("RDDs are not supported in Spark Connect.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make this use the error framework.
client.hijackServerSideSessionIdForTesting(suffix) | ||
} | ||
|
||
/** @inheritdoc */ | ||
override def sparkContext: SparkContext = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make this use the error framework.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you make CIs happy, @hvanhovell ?
[error] /home/runner/work/spark/spark/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/CSVInferSchema.scala:83:18: value aggregate is not a member of org.apache.spark.rdd.RDD[Array[String]]
[error] tokenRDD.aggregate(startType)(inferRowType, mergeRowTypes)
@dongjoon-hyun yeah working on it. SBT does not seem to respect maven exclusions... |
Could you rebase once more, @hvanhovell ? |
@dongjoon-hyun it is a bit more complicated. While SBT compile seems to respect Maven exclusions, test:compile and package do not. Investigating. I'd personally not hold off preview2 because of this. |
Ack! Thank you for sharing the status, @hvanhovell . |
Could you re-trigger the failed CIs?
|
Merging to master. |
This PR has caused the Maven daily test build to fail: xxx.
I am trying to fix it at: #48399 |
### What changes were proposed in this pull request? This PR does two things: - It adds shims for SparkContext and RDD. These are in a separate module. This module is a compile time dependency for sql/api, and a regular dependency for connector/connect/client/jvm. We remove this dependency in catalyst and connect-server because those should use the actual implementation. - It adds RDD (and the one SparkContext) based method to the shared Scala API. For connect these methods throw an unsupported operation exception. ### Why are the changes needed? We are creating a shared Scala interface for Classic and Connect. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Existing tests. I will add a couple on the connect side. ### Was this patch authored or co-authored using generative AI tooling? No. Closes apache#48065 from hvanhovell/SPARK-49569. Authored-by: Herman van Hovell <herman@databricks.com> Signed-off-by: Herman van Hovell <herman@databricks.com>
What changes were proposed in this pull request?
This PR does two things:
Why are the changes needed?
We are creating a shared Scala interface for Classic and Connect.
Does this PR introduce any user-facing change?
No.
How was this patch tested?
Existing tests. I will add a couple on the connect side.
Was this patch authored or co-authored using generative AI tooling?
No.