Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-39800][SQL][WIP] DataSourceV2: View Support #44197

Closed
wants to merge 6 commits into from

Conversation

jzhuge
Copy link
Member

@jzhuge jzhuge commented Dec 6, 2023

What changes were proposed in this pull request?

This PR adds support to load, create, alter, and drop views in DataSource V2 catalogs.

  • View substitution rule
  • Create view DDL
  • View SQL DDLs

Why are the changes needed?

Support views stored in DataSourceV2 catalogs. Details in SPIP.

Does this PR introduce any user-facing change?

  • Support views from DataSource V2 catalogs in SQL
  • Support views from DataSource V2 catalogs in DataFrame API

How was this patch tested?

  • New unit tests
  • DDLParserSuite
  • PlanResolutionSuite
  • DataSourceV2SQLSuite

@github-actions github-actions bot added the SQL label Dec 6, 2023
@jzhuge
Copy link
Member Author

jzhuge commented Dec 6, 2023

Had to create a new pull request since #39796 was closed

@jzhuge
Copy link
Member Author

jzhuge commented Dec 6, 2023

Current status:

  • Fix failed unit tests
  • Incorporate all @MaxGekk comments (took care of some)
  • Add unit tests

Follow up in future pull requests:

  • Support user specified column names
  • Support viewSQLConfigs
  • Conform to “v2 command framework” (SPARK-36586). - ResolveCatalogs seem to have too much extra view code, but that cleanup might require migrating more SQL commands to “v2 command framework”.

@@ -0,0 +1,104 @@
/*
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: the file is named VIewSubstitution. Should be ViewSubstitution

if (catalog.viewExists(ident)) {
catalog.dropView(ident)
}
catalog.createView(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be using the new createOrReplace() API that was introduced by #43677:

catalog.createOrReplaceView(
        ident,
        sql,
        currentCatalog,
        currentNamespace,
        viewSchema,
        queryColumnNames,
        columnAliases,
        columnComments,
        newProperties.asJava)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually

catalog.replaceView(viewInfo, true)

ident: Identifier,
ifExists: Boolean) extends LeafV2CommandExec {

override protected def run(): Seq[InternalRow] = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this can be simplified to

override def run(): Seq[InternalRow] = {
    if (catalog.viewExists(ident)) {
      catalog.dropView(ident)
    } else if (!ifExists) {
      throw QueryCompilationErrors.noSuchViewError(ident)
    }

    Seq.empty
  }

catalogManager.currentCatalog.name,
catalogManager.currentNamespace,
c.comment,
c.schema,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
c.schema,
c.viewSchema,

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

otherwise this will be passing the wrong schema here

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for catching this!

@jzhuge
Copy link
Member Author

jzhuge commented Dec 22, 2023

Looking to leverage the newly added ViewInfo more

@jzhuge
Copy link
Member Author

jzhuge commented Dec 29, 2023

Almost done converting to v2 command framework. Seeing great reduction in changes.

@jzhuge
Copy link
Member Author

jzhuge commented Jan 1, 2024

Failed unit tests

  • org.apache.spark.sql.hive.execution.HiveSQLViewSuite
  • org.apache.spark.sql.SQLQueryTestSuite
  • org.apache.spark.sql.execution.PersistedViewTestSuite
  • org.apache.spark.sql.execution.command.PlanResolutionSuite
  • org.apache.spark.sql.execution.GlobalTempViewTestSuite
  • org.apache.spark.sql.connector.DataSourceV2SQLSuiteV1Filter
  • org.apache.spark.sql.execution.LocalTempViewTestSuite
  • org.apache.spark.sql.DataFrameWriterV2Suite
  • org.apache.spark.sql.execution.SimpleSQLViewSuite

@jzhuge
Copy link
Member Author

jzhuge commented Jan 2, 2024

Only 2 unit tests failed but they succeeded on my laptop

  • org.apache.spark.deploy.yarn.YarnClusterSuite
  • org.apache.spark.sql.streaming.RocksDBStateStoreStreamingAggregationSuite

Maybe they are flaky?

@jzhuge
Copy link
Member Author

jzhuge commented Jan 2, 2024

Done:

  • Converted to v2 command framework
  • Incorporated review comments
  • Passed existing unit tests

TODO:

  • ShowViewPropertiesExec: Redact properties and hide reserved properties
  • DescribeViewExec/ShowCreateViewExec: Support user specified columns
  • Add unit tests
  • Investigate ReplaceCharWithVarchar

Future pull requests:

  • Support stored view schema
  • Support viewSQLConfigs
  • (Optional?) Support view-only catalog

Copy link

We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!

@github-actions github-actions bot added the Stale label Apr 12, 2024
@github-actions github-actions bot closed this Apr 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants