bigquery: add copy functionality #127

Merged
merged 3 commits into from
Dec 20, 2019

Conversation

@eliotst (Contributor) commented Dec 18, 2019

This commit adds a `copy` method to the
`parsons.google.google_bigquery.GoogleBigQuery` connector. The
`copy` method can be used to load a Parsons table into a BigQuery
table by uploading the file to Google Cloud Storage.
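The described flow can be sketched in a few lines. This is a hedged illustration of the control flow only, not the connector's actual internals: the `copy_via_gcs`, `upload`, `load`, and `cleanup` names are hypothetical, injected as callables to keep the example self-contained.

```python
# Hedged sketch of the flow described above: stage the Parsons table in a
# temporary GCS bucket, run a BigQuery load job against the staged file,
# and remove the temp object even if the load fails. All function names
# here are assumptions for illustration, not the merged implementation.

def copy_via_gcs(table, table_name, tmp_gcs_bucket, upload, load, cleanup):
    """Stage `table` in GCS, load it into BigQuery, and always clean up."""
    gcs_uri = upload(table, tmp_gcs_bucket)   # e.g. 'gs://bucket/tmp.csv'
    try:
        return load(gcs_uri, table_name)      # BigQuery load job result
    finally:
        cleanup(gcs_uri)                      # delete the temporary object
```

Separating the staging, loading, and cleanup steps keeps the temporary GCS object from leaking when the load job raises.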
@jburchard (Collaborator) left a comment:

This is awesome.

    gcs_client: object
        The GoogleCloudStorage Connector to use for loading data into Google Cloud Storage.
    """
    tmp_gcs_bucket = tmp_gcs_bucket or os.environ.get('GCS_TEMP_BUCKET')
@jburchard (Collaborator):

Out of curiosity, is there a reason why you didn't use the `check_env` util?

@eliotst (Contributor, Author):

Just forgot about it. I'll swap that out for this.
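For readers unfamiliar with the util being discussed: the swap replaces the raw `os.environ.get` fallback with a `check_env`-style helper. A minimal pure-Python sketch of that pattern follows; this is an illustration of the idea, not the actual Parsons implementation, which may differ.

```python
import os

def check(env, field=None):
    # Hedged sketch of a check_env-style helper: prefer an explicitly
    # passed value, fall back to the environment variable, and raise a
    # clear error when neither is set. (The real Parsons util may differ.)
    if field is not None:
        return field
    try:
        return os.environ[env]
    except KeyError:
        raise KeyError(f"No {env} found. Set the environment variable "
                       f"or pass the value explicitly.")

# With a helper like this, the line under review becomes roughly:
# tmp_gcs_bucket = check('GCS_TEMP_BUCKET', tmp_gcs_bucket)
```

The advantage over `os.environ.get` is that a missing value fails loudly at the call site with a message naming the variable, instead of surfacing later as a `None` bucket name.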

@jburchard (Collaborator):
One question, actually: are there any advanced configurations for copying that the Google client can take that we could pass as `**kwargs`? Similar to all of the arguments with Redshift, but a little simpler to read?

@eliotst (Contributor, Author) commented Dec 18, 2019

> One question, actually: are there any advanced configurations for copying that the Google client can take that we could pass as `**kwargs`? Similar to all of the arguments with Redshift, but a little simpler to read?

Yeah. There are `location` and `project` parameters, which can be used to override the defaults on the client. There's also a `retry` parameter, which defines how to retry if the call fails. I can pass all of that through.

The other thing I will do is add a `job_config` parameter that will allow passing in custom `LoadJobConfig` options.
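The pass-through described above can be sketched as a small helper that collects only the options the caller actually set, so the google-cloud client's own defaults apply otherwise. The `build_load_kwargs` name is a hypothetical illustration, not the merged code; `job_config` would be a `google.cloud.bigquery.LoadJobConfig` instance.

```python
# Hedged sketch of the kwargs pass-through discussed above (names are
# assumptions, not the merged implementation): forward only the options
# the caller explicitly set, leaving the client's defaults intact for
# everything else.

def build_load_kwargs(job_config=None, location=None, project=None, retry=None):
    opts = {'job_config': job_config, 'location': location,
            'project': project, 'retry': retry}
    return {key: value for key, value in opts.items() if value is not None}
```

Filtering out `None` values matters here: passing `location=None` through to the client would override its configured default rather than fall back to it.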

@eliotst eliotst merged commit 3e65128 into master Dec 20, 2019
@eliotst eliotst deleted the eliots-bigquery_copy branch December 20, 2019 17:24