Subpartitioning Python Cosmos DB SDK #31121
adding subpartitioning
Fixes some edge cases. Also adds tests for subpartitioning CRUD operations that match the Java SDK, plus some Python-specific edge cases, and adds samples for subpartitioning in Python.
/azp run python - cosmos - tests |
Pull request contains merge conflicts. |
/azp run python - cosmos - tests |
Azure Pipelines successfully started running 1 pipeline(s). |
API change check APIView has identified API level changes in this PR and created following API reviews. |
remove line of code that was used for testing
…k-for-python into subpartitioning
update changelog to include new feature
/azp run python - cosmos - tests |
No commit pushedDate could be found for PR 31121 in repo Azure/azure-sdk-for-python |
fixes for pylint issues
/azp run python - cosmos - tests |
Azure Pipelines successfully started running 1 pipeline(s). |
removes left over debugging code on subpartition test
/azp run python - cosmos - tests |
No commit pushedDate could be found for PR 31121 in repo Azure/azure-sdk-for-python |
/azp run python - cosmos - tests |
No commit pushedDate could be found for PR 31121 in repo Azure/azure-sdk-for-python |
/azp run python - cosmos - tests |
No commit pushedDate could be found for PR 31121 in repo Azure/azure-sdk-for-python |
changing get epk range for prefix partition key to be private
/azp run python - cosmos - tests |
Azure Pipelines successfully started running 1 pipeline(s). |
LGTM
sdk/cosmos/azure-cosmos/azure/cosmos/aio/_cosmos_client_connection_async.py
sdk/cosmos/azure-cosmos/azure/cosmos/aio/_cosmos_client_connection_async.py
Holding off since we found gaps in the supported scenarios, mainly a partial PK spanning multiple partitions.
In the case of large databases, a prefix query involving a container with subpartitioning may involve multiple physical partitions. This allows for a prefix query to properly query items from all the partitions that contain the prefix partition keys.
/azp run python - cosmos - tests |
Azure Pipelines successfully started running 1 pipeline(s). |
sdk/cosmos/azure-cosmos/azure/cosmos/aio/_cosmos_client_connection_async.py
This commit adds better support for the case of a prefix query needing to query multiple physical partitions. It will query each partition with the needed partition key range for each physical partition. New tests were also added to test this functionality.
/azp run python - cosmos - tests |
Azure Pipelines successfully started running 1 pipeline(s). |
sdk/cosmos/azure-cosmos/azure/cosmos/aio/_cosmos_client_connection_async.py
LGTM, just a few nits and questions. Please make sure to validate this against the account we have for subpartitioning testing.
Added a comment explaining the fourth case for the value of the EPK sub-range. In that case the EPK sub-range equals the feed range EPK, because the feed range lies within a single physical partition without spanning the entire physical partition.
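One way to picture the sub-range cases is as an interval intersection between the prefix (feed range) EPK range and each physical partition's EPK range. The sketch below is an illustration under assumptions, not the SDK's private implementation: `EpkRange`, `epk_sub_range`, and `sub_ranges_for_prefix` are hypothetical names, and the boundaries are assumed to be equal-length uppercase hex strings compared lexicographically, with an inclusive start and an exclusive end.

```python
from typing import List, NamedTuple, Optional


class EpkRange(NamedTuple):
    """Hypothetical EPK range: start is inclusive, end is exclusive."""
    start: str
    end: str


def epk_sub_range(prefix: EpkRange, partition: EpkRange) -> Optional[EpkRange]:
    """Return the part of the prefix range that falls inside one partition.

    The intersection covers four overlapping cases:
      1. the prefix range covers the whole partition -> the partition range
      2. the prefix range starts inside the partition -> [prefix.start, partition.end)
      3. the prefix range ends inside the partition -> [partition.start, prefix.end)
      4. the prefix range lies entirely inside the partition -> the prefix range
    It returns None when the two ranges do not overlap.
    """
    start = max(prefix.start, partition.start)
    end = min(prefix.end, partition.end)
    return EpkRange(start, end) if start < end else None


def sub_ranges_for_prefix(prefix: EpkRange, partitions: List[EpkRange]) -> List[EpkRange]:
    """Collect one sub-range per physical partition that the prefix range touches."""
    candidates = (epk_sub_range(prefix, partition) for partition in partitions)
    return [sub_range for sub_range in candidates if sub_range is not None]


# Made-up layout with three physical partitions.
partitions = [EpkRange("00", "40"), EpkRange("40", "80"), EpkRange("80", "FF")]

# A prefix range that spans all three partitions (cases 2, 1, and 3).
spanning = sub_ranges_for_prefix(EpkRange("30", "90"), partitions)

# A prefix range contained in a single partition (case 4).
contained = sub_ranges_for_prefix(EpkRange("45", "50"), partitions)

print(spanning)
print(contained)
```

Each returned sub-range would then be queried against the physical partition it came from, which matches the per-partition querying described in the earlier commit message.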
/azp run python - cosmos - tests |
Azure Pipelines successfully started running 1 pipeline(s). |
LGTM, thanks a lot Bryan!
Any idea when this is going into a non-prerelease? Trying to set up pipelines for the changeover.
Description
This PR adds subpartitioning to the Python SDK (also referred to as hierarchical partitioning or MultiHash partitioning). This PR also includes tests and samples.
To enable subpartitioning, define the partition key with kind MultiHash and pass a list of paths.
ex:
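The original snippet is not preserved here, so the following is a minimal sketch under assumptions. The paths `/state`, `/city`, and `/zipcode` and the container id are hypothetical, and `multihash_partition_key` is an illustrative helper, not part of `azure-cosmos`. It builds the plain-dict shape of a MultiHash definition so that the snippet runs without an account; the comment at the end shows what the corresponding SDK call is expected to look like.

```python
from typing import Any, Dict, List


def multihash_partition_key(paths: List[str]) -> Dict[str, Any]:
    """Build a MultiHash (hierarchical) partition key definition as a plain dict.

    Illustrative helper only; it is not part of azure-cosmos.
    """
    if not paths:
        raise ValueError("at least one partition key path is required")
    for path in paths:
        if not path.startswith("/"):
            raise ValueError(f"partition key path must start with '/': {path!r}")
    return {"paths": list(paths), "kind": "MultiHash"}


# Hypothetical three-level hierarchy.
definition = multihash_partition_key(["/state", "/city", "/zipcode"])
print(definition)

# With the SDK installed, the equivalent is expected to look like this
# (container id and paths are placeholders):
#   from azure.cosmos import PartitionKey
#   database.create_container(
#       id="locations",
#       partition_key=PartitionKey(path=["/state", "/city", "/zipcode"], kind="MultiHash"),
#   )
```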
Container operations must pass in the partition key value as a list that matches the list of paths used to define the partition key.
ex:
or
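The original snippets are missing from this copy, so here is a hedged sketch of the idea. `partition_key_values` is a hypothetical helper (not an SDK function) that reads the partition key values from an item in the same order as the defined paths; the item fields and paths are made up. The trailing comments show how such a list is expected to be passed to container operations.

```python
from typing import Any, Dict, List


def partition_key_values(item: Dict[str, Any], paths: List[str]) -> List[Any]:
    """Collect the item's partition key values in the order of the defined paths.

    Illustrative helper only; nested paths such as "/address/city" are walked
    one segment at a time.
    """
    values = []
    for path in paths:
        node: Any = item
        for segment in path.strip("/").split("/"):
            node = node[segment]
        values.append(node)
    return values


# Hypothetical paths and item.
paths = ["/state", "/city", "/zipcode"]
item = {"id": "item1", "state": "WA", "city": "Redmond", "zipcode": "98052"}

pk = partition_key_values(item, paths)
print(pk)

# With the SDK, the list is expected to be passed as the partition key, e.g.:
#   container.read_item(item=item["id"], partition_key=pk)
# or with a literal list:
#   container.read_item(item="item1", partition_key=["WA", "Redmond", "98052"])
```

Building the list from the paths keeps the value order in sync with the partition key definition, which is the requirement stated above.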
This PR also adds support for prefix partition queries, which let you use an incomplete partition key to perform a query.
example:
The above example will only return a single item despite using 'SELECT * from c' as the query. As with all subpartitioning operations, you have to pass in a list for the partition key value.
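To illustrate the behavior, here is an in-memory sketch of prefix matching, assuming hypothetical paths and items. It does not call the service; `matches_prefix` and `prefix_query` are illustrative names that only mimic how a query scoped to a prefix partition key narrows the results. The final comment shows the SDK call this is expected to correspond to.

```python
from typing import Any, Dict, List, Sequence


def matches_prefix(full_key: Sequence[Any], prefix: Sequence[Any]) -> bool:
    """Return True when the leading values of full_key equal the prefix."""
    return len(prefix) <= len(full_key) and list(full_key[: len(prefix)]) == list(prefix)


def prefix_query(
    items: List[Dict[str, Any]], paths: List[str], prefix: Sequence[Any]
) -> List[Dict[str, Any]]:
    """In-memory stand-in for 'SELECT * from c' scoped to a prefix partition key."""
    results = []
    for item in items:
        full_key = [item[path.lstrip("/")] for path in paths]
        if matches_prefix(full_key, prefix):
            results.append(item)
    return results


# Hypothetical container contents.
paths = ["/state", "/city", "/zipcode"]
items = [
    {"id": "1", "state": "WA", "city": "Redmond", "zipcode": "98052"},
    {"id": "2", "state": "CA", "city": "San Diego", "zipcode": "92101"},
    {"id": "3", "state": "CA", "city": "Los Angeles", "zipcode": "90001"},
]

# Only the first level of the hierarchy is supplied, still as a list.
wa_items = prefix_query(items, paths, ["WA"])
print([item["id"] for item in wa_items])

# With the SDK, the corresponding call is expected to look like:
#   container.query_items(query="SELECT * from c", partition_key=["WA"])
```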
All SDK Contribution checklist:
General Guidelines and Best Practices
Testing Guidelines