Cosmos spark3 DataSourceV2 catalog api implementation #18011

moderakh commented Dec 8, 2020

This PR adds support for the Spark 3 DataSourceV2 Catalog API.

NOTE: this PR is the same as moderakh#15, retargeted at the Azure repo.
The original PR has already been reviewed and signed off by the reviewers.

Example usage:

spark.conf.set(s"spark.sql.catalog.cosmoscatalog", "com.azure.cosmos.spark.CosmosCatalog")
spark.conf.set(s"spark.sql.catalog.cosmoscatalog.spark.cosmos.accountEndpoint", cosmosEndpoint)
spark.conf.set(s"spark.sql.catalog.cosmoscatalog.spark.cosmos.accountKey", cosmosMasterKey)

spark.sql(s"CREATE DATABASE cosmoscatalog.mydb;")
spark.sql("""CREATE TABLE cosmoscatalog.mydb.myContainer (word STRING, number INT) USING cosmos.items
 TBLPROPERTIES(partitionKeyPath = '/mypk', manualThroughput = '1100')""")
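
The catalog settings and the CREATE TABLE statement above follow a regular pattern, so they can be generated for any catalog name. The following is a minimal sketch, not code from this PR: `CosmosCatalogSketch`, `catalogSettings`, and `createTableSql` are hypothetical names introduced for illustration, and only the configuration keys, the catalog class name, and the DDL shape are taken from the snippet above.

```scala
// Hypothetical helper, not part of this PR: it only assembles the strings
// shown in the PR description and has no Spark dependency.
object CosmosCatalogSketch {

  // Builds the settings that register a Cosmos catalog under `catalogName`.
  // The key suffixes and the class name are copied from the PR description.
  def catalogSettings(catalogName: String, endpoint: String, key: String): Map[String, String] = {
    val prefix = s"spark.sql.catalog.$catalogName"
    Map(
      prefix -> "com.azure.cosmos.spark.CosmosCatalog",
      s"$prefix.spark.cosmos.accountEndpoint" -> endpoint,
      s"$prefix.spark.cosmos.accountKey" -> key
    )
  }

  // Builds a CREATE TABLE statement in the same shape as the PR example.
  // Sequences are used so that column and property order is preserved.
  def createTableSql(
      table: String,
      columns: Seq[(String, String)],
      properties: Seq[(String, String)]): String = {
    val cols = columns.map { case (name, sqlType) => s"$name $sqlType" }.mkString(", ")
    val props = properties.map { case (k, v) => s"$k = '$v'" }.mkString(", ")
    s"CREATE TABLE $table ($cols) USING cosmos.items TBLPROPERTIES($props)"
  }
}
```

Assuming an active `spark` session, the settings could be applied with `CosmosCatalogSketch.catalogSettings("cosmoscatalog", cosmosEndpoint, cosmosMasterKey).foreach { case (k, v) => spark.conf.set(k, v) }`, and the generated statement passed to `spark.sql`. The sketch does no escaping of names or property values, so it is only suitable for trusted input.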

Please see CosmosCatalogSpec for end-to-end integration tests.
The integration tests will work once the earlier PR #17952 is merged.

TODO:

  • There are some TODOs in the code (e.g., adding support for altering tables).
  • The resource management of the integration tests needs to be figured out.
  • This PR adds support for catalog metadata operations; we should also validate data operations through the catalog API.

moderakh commented Dec 8, 2020

NOTE: this PR is the same as moderakh#15, retargeted at the Azure repo.
The original PR has already been reviewed and signed off by the reviewers.

I had to raise that earlier PR because the PR it depends on had not been merged yet.

moderakh merged commit 1d93da5 into Azure:feature/cosmos/spark30 Dec 8, 2020
moderakh deleted the users/moderakh/cosmos/catalog-api branch December 8, 2020 07:39
moderakh linked an issue Dec 18, 2020 that may be closed by this pull request
openapi-sdkautomation bot pushed a commit to AzureSDKAutomation/azure-sdk-for-java that referenced this pull request Mar 2, 2022
add xms-ids for RecoveryServicesSiteRecovery (Azure#18011)
Labels
cosmos:spark3 Cosmos DB Spark3 OLTP Connector Cosmos
Development

Successfully merging this pull request may close these issues.

Support DataSourceV2 catalog api