Run Python transformations on Databricks #204
pramen/core/src/main/scala/za/co/absa/pramen/core/databricks/DatabricksClient.scala
Just a couple of comments. Since it is a draft, I wasn't too nitpicky.
pramen/core/src/main/scala/za/co/absa/pramen/core/utils/ConfigUtils.scala
pramen/core/src/main/scala/za/co/absa/pramen/core/databricks/Schema.scala
2a4cc7c to c1cd748
…Pramen-Py databricks client
4196794 to 9bb65d5
This is awesome! Great job!
}

private[databricks] def replaceVariablesInMap(map: Map[String, Any]): Map[String, Any] = {
  // in typesafe Config, keys can be set to null (this function will be mainly called on Maps created from
😮
pramen/core/src/main/scala/za/co/absa/pramen/core/pipeline/OperationSplitter.scala
Co-authored-by: Ruslan Yushchenko <yruslan@gmail.com>
Thanks a lot! It was a long journey 😄
A very bare-bones implementation of running Pramen-Py transformations on Databricks. We submit a one-time transient job using a REST API.
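For readers unfamiliar with one-time runs, here is a rough sketch of what such a submission could look like. It is not the code from this PR: the object name `TransientJobSketch`, the helper names, the `pramen_py` task key and the choice of an existing cluster are all assumptions made for illustration. It targets the Databricks Jobs API `runs/submit` endpoint and only builds the request (using the JDK 11+ `java.net.http` classes) without sending it.

```scala
import java.net.URI
import java.net.http.HttpRequest
import java.nio.charset.StandardCharsets

// Illustrative only: names and payload layout are assumptions, not the PR's DatabricksClient.
object TransientJobSketch {
  // Hypothetical helper: wraps a value in quotes, escaping only backslashes and double quotes.
  private def quote(s: String): String =
    "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\""

  // Builds a minimal JSON body for a one-time run with a single Python task.
  def payload(runName: String, clusterId: String, pythonFile: String, params: Seq[String]): String = {
    val args = params.map(quote).mkString("[", ",", "]")
    s"""{"run_name":${quote(runName)},"tasks":[{"task_key":"pramen_py","existing_cluster_id":${quote(clusterId)},"spark_python_task":{"python_file":${quote(pythonFile)},"parameters":$args}}]}"""
  }

  // Builds (but does not send) the POST request that submits the one-time run.
  def submitRequest(host: String, token: String, body: String): HttpRequest =
    HttpRequest.newBuilder()
      .uri(URI.create(s"$host/api/2.1/jobs/runs/submit"))
      .header("Authorization", s"Bearer $token")
      .header("Content-Type", "application/json")
      .POST(HttpRequest.BodyPublishers.ofString(body, StandardCharsets.UTF_8))
      .build()
}
```

Sending the request with `java.net.http.HttpClient` would return a JSON response containing the run identifier, which could then be polled until the run finishes. A real client would more likely serialize the job definition from configuration with a JSON library than build the string by hand.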
Not sure about:
The Responses object, which contains case classes used for the REST API response deserialization. The attributes are snake-case for simplified deserialization, but we can also expand it into a POJO with JSON annotations if required.

More possible features that could be implemented in another PR (this PR was getting really large):
… (pramen.py.cmd). Databricks-specific configuration is already under (pramen.py.databricks).