Add missing `delta-storage` dependency and class loader workaround to Delta table ingestion
#16648
Since the upgrade to Kernel 3.2.0, the Druid Delta connector extension has been unable to read and ingest Delta tables.
The Delta Kernel now requires the `delta-storage` dependency, so this PR adds it to our connector. However, even with the dependency added, reading from a Delta table still fails with an error. Please see the upstream issue delta-io/delta#3299 and the fix delta-io/delta#3304. I verified the upstream fix by deploying a custom Kernel jar and reverting this class loader workaround, so we should be able to remove the workaround once we upgrade the Kernel to a later version.
As the comments in this patch note, this is only a temporary workaround until we update to the next upstream Kernel library version. The alternative to this patch would be to revert to 3.1.0, where this issue didn't exist, but we would lose some new features and performance improvements made in 3.2.0.
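For readers unfamiliar with this kind of workaround, here is a minimal sketch of the general pattern: temporarily swap the thread context class loader around a call, then restore it. It assumes the workaround involves the thread context class loader, which this description does not confirm. The class and method names (`ContextClassLoaderSketch`, `withContextClassLoader`) are hypothetical and are not taken from the patch.

```java
import java.util.function.Supplier;

// Hypothetical helper, not the code in this patch: illustrates the common
// "swap the thread context class loader, then restore it" pattern.
final class ContextClassLoaderSketch {
    private ContextClassLoaderSketch() {}

    /**
     * Runs the task with the given loader installed as the current thread's
     * context class loader, and always restores the previous loader afterwards.
     */
    static <T> T withContextClassLoader(ClassLoader loader, Supplier<T> task) {
        final Thread current = Thread.currentThread();
        final ClassLoader previous = current.getContextClassLoader();
        current.setContextClassLoader(loader);
        try {
            return task.get();
        } finally {
            // Restore even if the task throws, so the swap never leaks
            // to unrelated code running later on this thread.
            current.setContextClassLoader(previous);
        }
    }
}
```

In an extension, the loader passed in would typically be the one that loaded the extension's own classes, for example `SomeExtensionClass.class.getClassLoader()`, where `SomeExtensionClass` is a placeholder. Restoring in `finally` keeps the change scoped to the single call, which makes such a workaround easy to remove later.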
Note that this issue wasn't caught by the unit tests in our Druid extension because the test setup doesn't fully model an actual deployment. For example, the `DeltaLakeDruidModule` configuration is skipped. I will follow up with a sanity IT to catch such issues in the future.
This PR has: