Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BigQueryIO] Use final destination table schema and metadata when creating temp tables #24471

Merged
merged 3 commits into from
Dec 2, 2022

Conversation

ahmedabu98
Copy link
Contributor

@ahmedabu98 ahmedabu98 commented Dec 1, 2022

Fixes #22372

When we create temp tables, the jobs we send will naturally have CREATE_IF_NEEDED disposition. Users are facing a problem when writing with CREATE_NEVER, which makes providing a schema optional. When no schema is provided, the job that creates temp tables is sent with a null schema and an error is thrown.

The changes in this PR uses a DynamicDestinations object that returns the schema of the table when the user-provided schema is null.

Note: this code was removed in #17365 because SchemaUpdateOptions were introduced for temp tables. This PR matches the destination schema only when no SchemaUpdateOptions are provided

@ahmedabu98
Copy link
Contributor Author

R: @MarcoRob
R: @Abacn

@github-actions
Copy link
Contributor

github-actions bot commented Dec 1, 2022

Stopping reviewer notifications for this pull request: review requested by someone other than the bot, ceding control

@ahmedabu98
Copy link
Contributor Author

Run Java PreCommit

1 similar comment
@ahmedabu98
Copy link
Contributor Author

Run Java PreCommit

@Abacn
Copy link
Contributor

Abacn commented Dec 2, 2022

Thanks for the quick fix. Just wondering why we do not need the condition of if (createDisposition.equals(CreateDisposition.CREATE_IF_NEEDED) || createDisposition.equals(CreateDisposition.CREATE_NEVER)) that is removed in https://github.com/apache/beam/pull/17365/files#diff-3edb3abd8909e7075e679e8b93420d4119858d6a9418b6ec10a204a592f062abL682-L692 ?

@ahmedabu98
Copy link
Contributor Author

I believe it's an unnecessary check, since those are the only two existing CreateDispositions:

that condition will always be true

Copy link
Contributor

@Abacn Abacn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! This LGTM

@Abacn
Copy link
Contributor

Abacn commented Dec 2, 2022

java precommit test failure irrelevant flake: org.apache.beam.runners.dataflow.worker.StreamingDataflowWorkerTest.testLatencyAttributionToQueuedState[0: [streamingEngine=false]]

@Abacn Abacn merged commit 464749c into apache:master Dec 2, 2022
prodriguezdefino pushed a commit to prodriguezdefino/beam-pabs that referenced this pull request Dec 6, 2022
…ating temp tables (apache#24471)

* always attempt to match tables with final destination

* do not match when schema update options exist

* spotless
lostluck pushed a commit to lostluck/beam that referenced this pull request Dec 22, 2022
…ating temp tables (apache#24471)

* always attempt to match tables with final destination

* do not match when schema update options exist

* spotless
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Bug]: PR 17365 breaks BQ loads in some cases
2 participants