Improve StandardizationSink #210
yruslan commented on Jun 14, 2023
- Add support for Delta as the format of the raw layer.
- Add support for Delta as the format of the publish layer.
- Add support for customizing partition columns of the publish layer.
- Allow column transformations to use information date in expressions.
@@ -245,7 +247,13 @@ abstract class TaskRunnerBase(conf: Config,
case None => runResult.data
}

val postProcessed = task.job.postProcessing(dfWithTimestamp, task.infoDate, conf)
val dfWithInfoDate = if (dfWithTimestamp.schema.exists(f => f.name.equals(task.job.outputTable.infoDateColumn))) {
This is the only change in the framework itself. The rest is Enceladus-specific.
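For readers who only see the truncated hunk above, here is a minimal sketch of what a check like this could look like in isolation. It assumes that when the information date column is missing, it is added as a date literal, and that the DataFrame is otherwise left untouched. The helper name `withInfoDate`, its parameters, and the sample data are hypothetical and are not part of the framework; the actual branch bodies are not visible in the diff.

```scala
import java.time.LocalDate

import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.lit

object InfoDateColumnSketch {
  // Hypothetical helper (not the framework's API): makes sure the
  // information date column is present before post-processing runs.
  def withInfoDate(df: DataFrame, infoDateColumn: String, infoDate: LocalDate): DataFrame = {
    if (df.schema.exists(f => f.name.equals(infoDateColumn))) {
      // Assumption: an existing column is kept as it is.
      df
    } else {
      // Assumption: a missing column is filled with the task's information date.
      df.withColumn(infoDateColumn, lit(java.sql.Date.valueOf(infoDate)))
    }
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("info-date-sketch")
      .getOrCreate()
    import spark.implicits._

    // Sample data is illustrative only.
    val df = Seq(("A", 1), ("B", 2)).toDF("a", "b")
    val result = withInfoDate(df, "INFO_DATE", LocalDate.of(2022, 2, 18))

    result.printSchema()
    result.show()

    spark.stop()
  }
}
```

With the column guaranteed to exist at this point, post-processing steps and column transformations could refer to it by name, which is what the PR description asks for.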
@@ -336,10 +336,12 @@ class TaskRunnerBaseSuite extends AnyWordSpec with SparkTestBase with TextCompar
"""[ {
|  "a" : "B",
|  "b" : 2,
|  "INFO_DATE" : "2022-02-18",
And this is the test for the new logic: the information date is now available after a job has run, even before it is saved.