Versatile Data Kit 0.2
Summary
Major features include:
- Improvements in control-service security to be compliant with best Kubernetes practices (run jobs as unprivileged/non-root);
- Plugins can now hook into the ingestion process before, during or after the ingestion execution; these hooks can also be chained;
- Added support for Kimball templates (SCD1, SCD2, Snapshot Accumulating Fact Table) for the vdk-impala plugin.
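As a rough illustration of the hook chaining mentioned above, the sketch below passes a payload through a sequence of pre-ingest hooks, then the ingest step, then post-ingest hooks. All names here (`run_ingestion`, `drop_nulls`, `add_source`) are hypothetical and are not the vdk-core plugin API; the real interfaces are defined in the plugin documentation.

```python
from typing import Callable, Dict, List, Optional

Payload = List[Dict[str, object]]
PreHook = Callable[[Payload], Payload]
PostHook = Callable[[Payload, Optional[Exception]], None]


def run_ingestion(
    payload: Payload,
    pre_hooks: List[PreHook],
    ingest: Callable[[Payload], None],
    post_hooks: List[PostHook],
) -> Payload:
    """Hypothetical driver: pre-ingest chain, ingest, then post-ingest hooks."""
    # Chain the pre-ingest hooks: each one receives the previous hook's output.
    for hook in pre_hooks:
        payload = hook(payload)

    # Run the ingest step and remember a failure so post hooks can see it.
    error: Optional[Exception] = None
    try:
        ingest(payload)
    except Exception as exc:
        error = exc

    # Post-ingest hooks always run and receive the outcome.
    for post_hook in post_hooks:
        post_hook(payload, error)

    if error is not None:
        raise error
    return payload


def drop_nulls(payload: Payload) -> Payload:
    # Example pre-ingest hook: remove keys whose value is None.
    return [{k: v for k, v in row.items() if v is not None} for row in payload]


def add_source(payload: Payload) -> Payload:
    # Example pre-ingest hook: tag every row with a (made-up) source name.
    return [{**row, "source": "example"} for row in payload]
```

Calling `run_ingestion(rows, [drop_nulls, add_source], sink.extend, [])` with a list as the sink would send the cleaned and tagged rows to `sink`; reordering the hooks changes the order in which the transformations are applied.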
Package versions
See installation instructions here.
The versions of VDK components released under VDK 0.2 are:
Main components
vdk-heartbeat==0.5.476585195
vdk-core==0.1.476585195
vdk-control-cli==1.2.476585195
pipelines-control-service==1.4.476585195
Plugins
vdk-trino==0.2.476585195
vdk-test-utils==0.2.476585195
vdk-kerberos-auth==0.2.476585195
vdk-ingest-http==0.2.476585195
vdk-impala==0.2.476585195
quickstart-vdk==0.2.476585195
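One way to confirm that an environment matches these pins is to compare them with the installed distributions. The sketch below uses only the standard library `importlib.metadata`; the `PINS` subset and the helper names are illustrative choices, not part of VDK.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Callable, Dict, Optional

# A subset of the pins listed above; extend as needed.
PINS: Dict[str, str] = {
    "vdk-core": "0.1.476585195",
    "vdk-control-cli": "1.2.476585195",
    "vdk-trino": "0.2.476585195",
    "vdk-impala": "0.2.476585195",
}


def installed_version(name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return version(name)
    except PackageNotFoundError:
        return None


def find_mismatches(
    pins: Dict[str, str],
    lookup: Callable[[str], Optional[str]] = installed_version,
) -> Dict[str, Optional[str]]:
    """Map each package whose installed version differs from its pin
    to the version actually found (None when not installed)."""
    mismatches: Dict[str, Optional[str]] = {}
    for name, expected in pins.items():
        actual = lookup(name)
        if actual != expected:
            mismatches[name] = actual
    return mismatches


if __name__ == "__main__":
    for name, actual in find_mismatches(PINS).items():
        print(f"{name}: expected {PINS[name]}, found {actual or 'not installed'}")
```

The `lookup` parameter exists only so the comparison can be exercised without installing anything.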
What's Changed
- vdk-wiki: Created Life Expectancy Scenario by @alod83 in #616
- vdk-wiki: life-expectancy minor fixes to work end-to-end by @tozka in #700
- vdk-kerberos-auth: Fix keytab file in job directory by @doks5 in #721
- vdk-plugins: Update ingestion interfaces used in plugins by @doks5 in #689
- vdk-test-utils: Add test pre-ingest, ingest, post-ingest plugins by @doks5 in #679
- control-service: Allow job builder run as non-root by @doks5 in #625
- control-service: add counter to track data job watching task executions by @tpalashki in #692
- control-service: add job-base-image folder and CI step/job by @tozka in #711
- control-service: add security context to Data Job template by @mivanov1988 in #713
- control-service: add timeouts to shedlock's database operations by @tpalashki in #693
- control-service: enable logging on update cron job failure by @mivanov1988 in #674
- control-service: fix role permissions for pod/logs by @tozka in #699
- control-service: graphQL job executions filter by teamName by @mrMoZ1 in #702
- control-service: remove unnecessary directory of legacy builder by @tozka in #712
- control-service: remove unused code by @tozka in #705
- control-service: revert swagger path changes by @ivakoleva in #704
- control-service: run data job as non-root user by @tozka in #710
- control-service: vdk sdk docker repository secret by @mivanov1988 in #694
- control-service: add amazon-ecr-credential-helper to the job builder by @tpalashki in #723
- control-service: Swagger UI path changes docs update by @ivakoleva in #715
- control-service: publish Swagger UI to /data-jobs path by @ivakoleva in #677
- control-service: publish Swagger UI to /data-jobs path by @ivakoleva in #714
- control-service: redirect Swagger webjars resources by @ivakoleva in #697
- control-service: release a new version by @ivakoleva in #681
- quickstart-vdk: Run vdk-heartbeat before release by @YanaZhivkova in #665
- vdk-control-cli: Expand missing resource error message by @gageorgiev in #706
- vdk-control-cli: Make list command print all jobs on empty team param by @gageorgiev in #709
- vdk-control-cli: Parse contacts with comma "," as a delimiter as well by @tozka in #719
- vdk-control-cli: clarify api token documentation by @tozka in #722
- vdk-core: add flag to enable synchronous/blocking ingestion by @tozka in #698
- vdk-core: adjust defined type for configuration values by @tozka in #684
- vdk-core: fix error message by @tozka in #686
- vdk-core: Add ingestion functional tests by @doks5 in #691
- vdk-core: Implementation of new ingestion interfaces by @doks5 in #690
- vdk-core: Introduce post-ingest-sequence env var by @doks5 in #682
- vdk-heartbeat: Successful data job run test mode by @ivakoleva in #718
- vdk-heartbeat: Fix successful run status check by @YanaZhivkova in #724
- vdk-impala: Impala docker image upgrade by @ivakoleva in #675
- vdk-impala: impala templates by @mrMoZ1 in #671
- vdk-impala: template .sql missing from python distro by @ivakoleva in #695
- vdk-impala: Improve error handling to handle view errors by @doks5 in #717
- vdk-ingest-http: additional request parameters support by @ivakoleva in #701
- vdk-ingest-http: configurable allow JSON float NaN capability by @ivakoleva in #725
- vdk-trino: trino ingest to handle type casting and missing values by @tozka in #685
- versatile-data-kit: Establish release process by @gageorgiev in #673
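The vdk-ingest-http entry on NaN handling touches a general JSON quirk: NaN is not a valid JSON value, yet Python's `json` module emits it by default. The sketch below shows that behaviour with the standard library only. It does not show how the plugin exposes the setting, and `nan_to_none` is a hypothetical helper.

```python
import json
import math

row = {"id": 1, "score": float("nan")}

# Default (allow_nan=True): emits the non-standard token NaN.
lenient = json.dumps(row)

# Strict mode: serialization fails instead of producing invalid JSON.
try:
    json.dumps(row, allow_nan=False)
    strict_error = None
except ValueError as exc:
    strict_error = exc


def nan_to_none(record):
    """Replace float NaN values with None so the output is valid JSON."""
    return {
        key: (None if isinstance(value, float) and math.isnan(value) else value)
        for key, value in record.items()
    }


portable = json.dumps(nan_to_none(row))
```

Whether a receiving HTTP endpoint accepts the `NaN` token depends on its parser, which is presumably why the capability was made configurable.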
New Contributors
Full Changelog: 0.1...0.2