Add the ability to create / repair Hive tables via a Hive JDBC connection #164

Closed
yruslan opened this issue Mar 8, 2023 · 0 comments
Labels: DE, enhancement, Pramen-Scala

yruslan commented Mar 8, 2023

Background

This is primarily relevant for the Standardization sink, but may eventually also apply to EnceladusSink, especially the table creation part.

Sometimes the sink outputs data to an external S3 bucket and Hive table creation or repair is needed, but the current Spark context is attached to a different Hive metastore.

Feature

Add the ability to create / repair Hive tables via a Hive JDBC connection.

Proposed Solution

  1. Use the JDBC configuration used by Pramen JDBC connectors to connect to a Hive cluster.
  2. Use the JDBC Native tools from Pramen core to run Hive table creation and repair queries.
  3. The Hive connection parameters should be specified for the particular sink.
  4. The Hive JDBC driver JAR is expected to be available on the environment's classpath.
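
The steps above could look roughly like the following sketch, which uses only the standard `java.sql` API (callable as-is from Scala). It is not Pramen's actual implementation: the class name `HiveJdbcSketch`, its method names, the Parquet storage format, and the example table layout are all assumptions made for illustration. It builds the `CREATE EXTERNAL TABLE` and `MSCK REPAIR TABLE` statements and runs them over a JDBC connection, relying on the Hive JDBC driver being on the classpath.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper, not part of Pramen.
final class HiveJdbcSketch {
    // Wrap an identifier in backticks, doubling any embedded backtick.
    static String quote(String identifier) {
        return "`" + identifier.replace("`", "``") + "`";
    }

    // Render a "(`col` TYPE, ...)" list, preserving the map's iteration order.
    private static String columnList(Map<String, String> columns) {
        StringJoiner joiner = new StringJoiner(", ", "(", ")");
        columns.forEach((name, type) -> joiner.add(quote(name) + " " + type));
        return joiner.toString();
    }

    // Build DDL for an external table over an existing location (for example an S3 path).
    // Assumption: data is stored as Parquet.
    static String createTableSql(String database, String table,
                                 Map<String, String> columns,
                                 Map<String, String> partitionColumns,
                                 String location) {
        StringBuilder sql = new StringBuilder("CREATE EXTERNAL TABLE IF NOT EXISTS ")
            .append(quote(database)).append(".").append(quote(table))
            .append(" ").append(columnList(columns));
        if (!partitionColumns.isEmpty()) {
            sql.append(" PARTITIONED BY ").append(columnList(partitionColumns));
        }
        sql.append(" STORED AS PARQUET LOCATION '")
           .append(location.replace("'", "\\'"))
           .append("'");
        return sql.toString();
    }

    // Build the statement that registers partitions found under the table location.
    static String repairTableSql(String database, String table) {
        return "MSCK REPAIR TABLE " + quote(database) + "." + quote(table);
    }

    // Run both statements over a Hive JDBC connection.
    // The URL is expected to look like jdbc:hive2://host:10000/default.
    static void createAndRepair(String jdbcUrl, String user, String password,
                                String createSql, String repairSql) throws SQLException {
        try (Connection connection = DriverManager.getConnection(jdbcUrl, user, password);
             Statement statement = connection.createStatement()) {
            statement.execute(createSql);
            statement.execute(repairSql);
        }
    }
}
```

Keeping query generation separate from execution means the generated SQL can be checked without a live Hive cluster, and the connection settings can come from whatever JDBC configuration the sink already defines.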
@yruslan yruslan added enhancement New feature or request Pramen-Scala DE labels Mar 8, 2023
@yruslan yruslan self-assigned this Mar 8, 2023
yruslan added a commit that referenced this issue Mar 14, 2023
yruslan added a commit that referenced this issue Mar 17, 2023
yruslan added a commit that referenced this issue Mar 22, 2023
@yruslan yruslan closed this as completed Apr 11, 2023