Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[BEAM-14036] Read Configuration for Pub/Sub SchemaTransform #17730
[BEAM-14036] Read Configuration for Pub/Sub SchemaTransform #17730
Changes from all commits
0c0ab4b
6c2798d
3099d60
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pls change to getDeadLetterTopic to be consistent with the Pub/Sub read transform.
beam/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/pubsub/PubsubIO.java
Line 865 in d9436c4
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@chamikaramj (cc: @angoenka ) Thank you again for reviewing. May we consider leaving the name
getDeadLetterQueue
?In a previous use of AutoValueSchema with AutoValue in a different project, I observed it needed the getters to be named as get in order for the serialization to work. For example, I had a property called fooName. When I named the getter fooName(), the return value of fooName() was null when invoked in the context of a DoFn. However, when I changed the getter to getFooName() the return value was what I expected. I am not sure if my observation is still valid.
Adding to supporting get instead of with, the design goals of the configuration class are to hold data needed by its corresponding SchemaProvider. The method is not doing any action implied by the with preposition. The getter is simply getting data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, I think the confusion here is that this property does not correspond to that PubSubIO.getDeadLetterTopic I referenced by maps to the dlq property below.
beam/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/pubsub/PubsubMessageToRow.java
Line 81 in 408664b
Is that correct ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see format, protoClass, thriftClass attributes in the original Read config.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@chamikaramj (cc: @angoenka )
I saw https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/pubsub/PubsubSchemaIOProvider.java#L128. Additionally, the original Provider's https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/pubsub/PubsubSchemaIOProvider.java#L209 calls the config::serializer method. In said serializer method, the format, protoClass, and thriftClass attributes are referenced.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These again map to PubSubSchemaIO not the original PubSubIO. Is that plan here to make Pub/Sub SchemaTransform just consistent with existing Pub/Sub SchemaIO ? I'm not sure.
@TheNeuralBit @dpcollins-google WDYT ? Should we just be consistent with Pub/Sub SchemaIO implementation here or should SchemaTransform be a different design ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The plan and replicate like-for-like SchemaIO questions are critical and blocking design decisions that relates to this thread as well. I will hold off on any changes to this PR until we get the feedback needed. Thank you again.