Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Board Review: Storage STG73 features #1347

Closed
kasobol-msft opened this issue May 19, 2020 · 13 comments
Closed

Board Review: Storage STG73 features #1347

kasobol-msft opened this issue May 19, 2020 · 13 comments
Assignees
Labels
architecture board-review Request for an Architectural Board Review

Comments

@kasobol-msft
Copy link

kasobol-msft commented May 19, 2020

Thank you for starting the process for approval of the client library for your Azure service. Thorough review of your client library ensures that your APIs are consistent with the guidelines and the consumers of your client library have a consistently good experience when using Azure.

** Before submitting, ensure you adjust the title of the issue appropriately **

To ensure consistency, all Tier-1 languages (C#, TypeScript, Java, Python) will generally be reviewed together. In expansive libraries, we will pair dynamic languages (Python, TypeScript) together, and strongly typed languages (C#, Java) together in separate meetings.

The Basics

About this client library

Artifacts required (per language)

We use an API review tool (apiview) to support .NET and Java API reviews. For Python and TypeScript, use the API extractor tool, then submit the output as a Draft PR to the relevant repository (azure-sdk-for-python or azure-sdk-for-js).

.NET

Additionally PR for new package Azure.Storage.Blobs.Changefeed: Azure/azure-sdk-for-net#11692
APIView of changefeed: https://apiview.dev/Assemblies/Review/85046278a8764a02a11d65ab219b91ae

Java

azure-storage-blob long running, based on latest GA
azure-storage-blob-cryptography long running, based on latest GA
azure-storage-blob-changefeed
new package
azure-storage-common long running, based on latest GA
azure-storage-file-datalake long running, based on latest GA
azure-storage-file-share long running, based on latest GA
azure-storage-blob-batch long running, based on latest GA

Python

Stubgen for all features:
https://github.com/xiafu-msft/azure-sdk-for-python/pull/3/files

APIView per package
azure-storage-file-share:
https://apiview.dev/Assemblies/Review/27d251c31d924249a26cb294cc62eb8c?diffRevisionId=48a51f74be7d447f836b42e2f6067e4e#azure.storage.fileshare.ShareClient

azure-storage-blob:
https://apiview.dev/Assemblies/Review/5f53b3f2e2e8453eb4cd81cb48d16620?diffRevisionId=7eeef5934e5e4d2682bf5e180bfc7061

azure-storage-file-datalake:
https://apiview.dev/Assemblies/Review/b09bb14da49e4f9c82787cd62f90fada?diffRevisionId=05426f402f484eb3ac5c0ef7e328e5ff

azure-storage-blob-changefeed:
https://apiview.dev/Assemblies/Review/14368856308144c0a07f79da963aabc7#azure.storage.blob.changefeed.ChangeFeedClient

JavaScript

per feature:
Blob versioning: Azure/azure-sdk-for-js#7886
Quick Query: Azure/azure-sdk-for-js#9112
Change Feed: https://github.com/Azure/azure-sdk-for-js/pull/9532/files#diff-dd305e6973a778934af8460c60bc7a3a

per pacakage
storage-blob:
https://github.com/Azure/azure-sdk-for-js/pull/9532/files#diff-676a73311cb02df3d5c777d7e678ddfc

storage-file-datalake:
https://github.com/Azure/azure-sdk-for-js/pull/9532/files#diff-d11c95aee3e218ade7c296fbc2cbeced

storage-file-share:
https://github.com/Azure/azure-sdk-for-js/pull/9532/files#diff-b72df24b3c4664e3bd58c593aa3cf7a7

@kasobol-msft kasobol-msft added architecture board-review Request for an Architectural Board Review labels May 19, 2020
@kasobol-msft kasobol-msft changed the title Board Review: Storage STG73 features .NET Board Review: Storage STG73 features May 27, 2020
@seanmcc-msft seanmcc-msft pinned this issue Jun 1, 2020
@amishra-dev
Copy link

@KrzysztofCwalina / @JonathanGiles / @johanste / @tg-msft / @annatisch Gentle reminder.

@amishra-dev
Copy link

/cc @kurtzeborn

@annatisch
Copy link
Member

Thanks @amishra-dev - I've left some questions/comments, however it seems that the generated API views for Python need updating - as they don't reflect some of the changes we made last week. :)

@KrzysztofCwalina
Copy link
Member

I reviewed the APIs and added comments.

@bterlson
Copy link
Member

I have accepted/reviewed everything, though there are some open questions/concerns that need to be worked through as yet.

@tg-msft
Copy link
Member

tg-msft commented Jun 12, 2020

I've added comments and they've all either been addressed or are being addressed. I'm good to ship a Preview for .NET. Thanks @seanmcc-msft and @amnguye!

@annatisch
Copy link
Member

Almost all the comments for Python have been/are being addressed. So I'm confident we can get the preview out next week.
One of the areas that we focused on is quick query, and I have a PR open with our proposal that I'm working through reviewing with @xiafu-msft
Azure/azure-sdk-for-python#11991

In summary we wish to make the query experience align with common patterns used by existing Python CSV tools, to maximize compatibility. The key changes are:

  • Input/output serialization.
    I have renamed these parameters to blob_format and output_format as I felt the concept of "input" was confusing. The types accepted here have been remodeled into two options: CSVDialect (that uses the API of the Python csv.Dialect class) and DelimitedJSON.
  • I have moved the headers_present field into a standalone parameter, as it turns out the service only uses it for input formatting anyway. I have renamed to "has_header" to be consistent with Python csv lib.
  • I am using the error handling currently used for Python encoding, which means you can specify three options:
    • 'strict' (the default), which raises on any error
    • 'ignore', which ignores non-fatal errors, fatal errors are still raised
    • callable(error) which allows for custom handling, with raising or supressing of any error
  • I have added an iterable response function so users can iterate over the records in the streamed response. This behaviour is more consistent with how records are consumed in other Python csv readers.

@scbedd scbedd unpinned this issue Jun 18, 2020
@scbedd scbedd pinned this issue Jun 18, 2020
@JonathanGiles
Copy link
Member

I completed another pass through the Java API changes and nothing jumps out at me, so for Java I'm ok with this preview going out.

@annatisch
Copy link
Member

I believe all the comments for Python have now been addressed, so we should be good to go to preview.

@kasobol-msft
Copy link
Author

The API Views has been updated to reflect latest master (merge of stg73 and few additional features that were contributed after).

@kyle-patterson We'd like to request final review before GA next week. Could you please schedule that?

FYI @kurtzeborn @amishra-dev

@lilyjma
Copy link
Contributor

lilyjma commented Jul 21, 2020

scheduled for 7/27

@JonathanGiles
Copy link
Member

I've left some minor Java feedback that should be reviewed preferably in advance of this arch board. Thanks!

@lilyjma
Copy link
Contributor

lilyjma commented Jul 27, 2020

[MS Internal Only] video

Action Items: @kasobol-msft

Java

BlobChangefeedClientBuilder

  • reference BlobBatchBuilder; okay with current approach if same as BlobBatchBuilder

BlobBeginCopyOptions

  • Java: setSealDestination, isSealDestination
  • C# shouldSealDestination (Brian likes this for JS too)

JS

BlobDownloadResponseParsed, objectReplicationSourceProperties type

  • okay to leave as is if Storage team thinks it's good

C#

namespace Azure.Storage.Blobs.ChangeFeed.Models{}

  • put models in same namespace instead of creating new namespace

DataLakeQueryJsonTextConfiguration: DataLakeQueryTextConfiguration{}

  • adding factory methods and mentioning derived types in doc comments

BlobQueryTextConfiguration{}

  • change to Options

BlobTagItem{}

  • make it a struct and name it BlobId (work on naming further)

BlobContainerStates{}

  • keep for now since will be adding more later

public virtual Response<IDictonary<string,string>> GetTags()

  • sync offline with SDK team

Python

In start_copy_from_url()

  • use seal_destination_blob

upload_blob()

  • watch for email on if_tags naming

datalake: query_expression in query_file()

  • start email thread on this; useful to see it in sample

DataLikeFileQueryError

  • follow up with Johan: slightly odd to get exception passed in through callback

APPROVED for release!

@JoshLove-msft JoshLove-msft unpinned this issue Aug 1, 2020
@seanmcc-msft seanmcc-msft pinned this issue Aug 13, 2020
@seanmcc-msft seanmcc-msft unpinned this issue Sep 14, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
architecture board-review Request for an Architectural Board Review
Projects
None yet
Development

No branches or pull requests

10 participants