-
Notifications
You must be signed in to change notification settings - Fork 28.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-49152][SQL][FOLLOWUP] DelegatingCatalogExtension should also use V1 commands #47995
Conversation
@@ -284,7 +284,8 @@ class ResolveSessionCatalog(val catalogManager: CatalogManager) | |||
case AnalyzeColumn(ResolvedV1TableOrViewIdentifier(ident), columnNames, allColumns) => | |||
AnalyzeColumnCommand(ident, columnNames, allColumns) | |||
|
|||
case RepairTable(ResolvedV1TableIdentifier(ident), addPartitions, dropPartitions) => | |||
case RepairTable( | |||
ResolvedTableIdentifierInSessionCatalog(ident), addPartitions, dropPartitions) => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ResolvedTableIdentifierInSessionCatalog(ident), addPartitions, dropPartitions) => | |
ResolvedTableIdentifierInSessionCatalog(ident), addPartitions, dropPartitions) => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: fix indentation
@@ -684,7 +693,8 @@ class ResolveSessionCatalog(val catalogManager: CatalogManager) | |||
} | |||
|
|||
private def supportsV1Command(catalog: CatalogPlugin): Boolean = { | |||
isSessionCatalog(catalog) && | |||
SQLConf.get.getConf(SQLConf.V2_SESSION_CATALOG_IMPLEMENTATION).isEmpty | |||
(isSessionCatalog(catalog) && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think isSessionCatalog(catalog)
should always be checked.
sql/core/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveSessionCatalog.scala
Show resolved
Hide resolved
sql/core/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveSessionCatalog.scala
Show resolved
Hide resolved
sql/core/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveSessionCatalog.scala
Show resolved
Hide resolved
sql/core/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveSessionCatalog.scala
Show resolved
Hide resolved
sql/core/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveSessionCatalog.scala
Show resolved
Hide resolved
sql/core/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveSessionCatalog.scala
Outdated
Show resolved
Hide resolved
val v2Catalog = catalog("spark_catalog").asTableCatalog | ||
val table = v2Catalog.loadTable(Identifier.of(Array("default"), "tbl")) | ||
assert(table.properties().get(TableCatalog.PROP_PROVIDER) == classOf[SimpleScanSource].getName) | ||
val e = intercept[AnalysisException] { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reverts the behavior change in https://github.com/apache/spark/pull/47660/files#diff-eeb429a8e3eb55228451c8dbc2fccca044836be608d62e9166561b005030c940L2107
thanks, merging to master/3.5! |
…se V1 commands ### What changes were proposed in this pull request? This is a followup of #47660 . If users override `spark_catalog` with `DelegatingCatalogExtension`, we should still use v1 commands as `DelegatingCatalogExtension` forwards requests to HMS and there are still behavior differences between v1 and v2 commands targeting HMS. This PR also forces to use v1 commands for certain commands that do not have a v2 version. ### Why are the changes needed? Avoid introducing behavior changes to Spark plugins that implements `DelegatingCatalogExtension` to override `spark_catalog`. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? new test case ### Was this patch authored or co-authored using generative AI tooling? No Closes #47995 from amaliujia/fix_catalog_v2. Lead-authored-by: Wenchen Fan <wenchen@databricks.com> Co-authored-by: Rui Wang <rui.wang@databricks.com> Co-authored-by: Wenchen Fan <cloud0fan@gmail.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com> (cherry picked from commit f7cfeb5) Signed-off-by: Wenchen Fan <wenchen@databricks.com>
@@ -71,4 +71,12 @@ class DataSourceV2SQLSessionCatalogSuite | |||
sql(s"CREATE EXTERNAL TABLE t (i INT) USING $v2Format TBLPROPERTIES($prop)") | |||
} | |||
} | |||
|
|||
test("SPARK-49152: partition columns should be put at the end") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test case name sounds a little mismatched. Is this a correct reproducer for DelegatingCatalogExtension
issue, @amaliujia ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the Spark behavior with the built-in catalog. A DelegatingCatalogExtension
shouldn't change it.
Strictly speaking, this shouldn't be the only issue, as there might be other subtle differences between v1 and v2 CREATE TABLE command, or other commands.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah as @cloud-fan have mentioned, this verifies a specific case so we make sure the behavior is not changed. And there might be a list to verify though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it. It makes sense. Thank you for the explanation.
…be changed by falling back to v1 command ### What changes were proposed in this pull request? This is a followup of #47772 . The behavior of SaveAsTable should not be changed by switching v1 to v2 command. This is similar to #47995. For the case of `DelegatingCatalogExtension` we need it goes to V1 commands to be consistent with previous behavior. ### Why are the changes needed? Behavior regression. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? UT ### Was this patch authored or co-authored using generative AI tooling? No Closes #48019 from amaliujia/regress_v2. Lead-authored-by: Wenchen Fan <wenchen@databricks.com> Co-authored-by: Rui Wang <rui.wang@databricks.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com>
…be changed by falling back to v1 command This is a followup of #47772 . The behavior of SaveAsTable should not be changed by switching v1 to v2 command. This is similar to #47995. For the case of `DelegatingCatalogExtension` we need it goes to V1 commands to be consistent with previous behavior. Behavior regression. No UT No Closes #48019 from amaliujia/regress_v2. Lead-authored-by: Wenchen Fan <wenchen@databricks.com> Co-authored-by: Rui Wang <rui.wang@databricks.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com> (cherry picked from commit 37b39b4) Signed-off-by: Wenchen Fan <wenchen@databricks.com>
What changes were proposed in this pull request?
This is a followup of #47660 . If users override
spark_catalog
withDelegatingCatalogExtension
, we should still use v1 commands asDelegatingCatalogExtension
forwards requests to HMS and there are still behavior differences between v1 and v2 commands targeting HMS.This PR also forces to use v1 commands for certain commands that do not have a v2 version.
Why are the changes needed?
Avoid introducing behavior changes to Spark plugins that implements
DelegatingCatalogExtension
to overridespark_catalog
.Does this PR introduce any user-facing change?
No
How was this patch tested?
new test case
Was this patch authored or co-authored using generative AI tooling?
No