[EventHub] Fix keyerror issue in BlobCheckpointStore #15752
Merged
Addressing issue: #13060
Stack Overflow issue: https://stackoverflow.com/questions/63354884/azure-eventhubs-python-checkpointing-with-blob-storage-keyerror-issue-in-ev
Also added a test resource, a v2 storage blob with data lake enabled, to verify the fix.
--- echoing the context here ---
The root cause of the `KeyError` is that `list_blobs`, when called on a v2 storage blob with data lake enabled (hierarchical namespace), returns not only the per-partition checkpoint/ownership blobs but also the parent blob node, which contains no metadata.

To illustrate this better, let's say we have the following blob structures:
In v2 storage with data lake enabled (hierarchical namespace), when the code used the prefix `{<fully_qualified_namespace>/<eventhub_name>/<consumer_group>/ownership` to search for blobs, the `{<fully_qualified_namespace>/<eventhub_name>/<consumer_group>/ownership` directory itself was also returned. It contains no metadata, which led to the `KeyError` when we tried to extract information.

What we want are the per-partition blobs, so the fix is easy: we add a `/` at the end of the prefix search string so that `list_blobs` won't return the parent node: `{<fully_qualified_namespace>/<eventhub_name>/<consumer_group>/ownership/`
(Checkpoint would encounter the same problem)
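
To make the failure mode concrete, here is a minimal in-memory sketch of the behaviour described above. It does not call the Azure SDK: `FakeBlob`, the local `list_blobs` helper, the `owners` function, the sample namespace/hub names, and the `ownerid` metadata key are all assumptions made for illustration, not the actual `BlobCheckpointStore` code.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class FakeBlob:
    """Hypothetical stand-in for a blob listing entry (name + metadata)."""
    name: str
    metadata: Dict[str, str] = field(default_factory=dict)


def list_blobs(blobs: List[FakeBlob], name_starts_with: str) -> List[FakeBlob]:
    """Simulated prefix listing: returns every entry whose name starts with the prefix."""
    return [b for b in blobs if b.name.startswith(name_starts_with)]


def owners(blobs: List[FakeBlob], prefix: str) -> Dict[str, str]:
    """Map partition id -> owner, reading the (assumed) 'ownerid' metadata key."""
    return {
        b.name.rsplit("/", 1)[-1]: b.metadata["ownerid"]
        for b in list_blobs(blobs, prefix)
    }


# Illustrative names only; with a hierarchical namespace the directory
# node itself shows up in the listing and carries no metadata.
BASE = "ns.example.net/hub/$default/ownership"
CONTAINER = [
    FakeBlob(BASE),                                   # parent directory node
    FakeBlob(BASE + "/0", {"ownerid": "consumer-a"}),  # per-partition blob
    FakeBlob(BASE + "/1", {"ownerid": "consumer-b"}),  # per-partition blob
]

if __name__ == "__main__":
    try:
        owners(CONTAINER, BASE)  # old prefix: also matches the directory node
    except KeyError as exc:
        print("KeyError with old prefix:", exc)

    # Fixed prefix: the trailing "/" excludes the parent node.
    print(owners(CONTAINER, BASE + "/"))
```

With the trailing `/`, the directory node no longer matches the prefix, so only the per-partition entries (which do carry metadata) are processed. The same reasoning would apply to the checkpoint prefix.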