
v3.0.0

@shaunagm released this 08 Dec 21:06

What's Changed

Breaking Changes

3.0.0 is a major release, which means there are some breaking changes. You can see a full list of the PRs included in this release that have breaking-change implications here.

  1. We now parse boolean types by default instead of coercing them into strings. We detect boolean column types when copying a Parsons Table to a database and create a boolean column in the database. If you want to maintain the old behavior, convert the boolean columns in the table to strings before uploading it to the database, like this: table = table.convert(['bool', 'columns', 'here', ...], str) More: #943

  2. We've made some major updates to the BigQuery and GoogleCloudStorage connectors:

GoogleBigQuery

The GoogleBigQuery connector was written with compatibility in mind and utilizes many of the same functions as the Amazon Redshift connector in order to minimize the difference in user experience between the two cloud service providers. GoogleBigQuery is authenticated with a service account JSON file, which can be generated in the GCP user interface and stored locally.
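
For example, a minimal setup might look like the sketch below. The environment-variable approach is one common way to point the connector at the service account file; the file name here is hypothetical, and the connector may also accept a credentials argument directly.

    # Minimal sketch: authenticate GoogleBigQuery with a locally stored
    # service account JSON file (the file name is hypothetical).
    import os
    from parsons import GoogleBigQuery

    os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "service_account.json"

    bq = GoogleBigQuery()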

There are several subtle differences between GoogleBigQuery and Redshift, most notably in the .query() function, which runs asynchronously in BigQuery. For this reason, we recommend using the .delete_table() function rather than sending a DROP TABLE SQL statement through the .query() function, as the connector will raise an exception when the asynchronous task completes and the table no longer exists; alternatively, you can pass .query(sql=sql, return_values=False) to prevent this exception from being raised.
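
As a rough illustration (dataset and table names are hypothetical, and the bq instance is assumed from the sketch above), the recommended pattern looks something like this:

    # Query results come back as a Parsons Table.
    tbl = bq.query("SELECT name, email FROM my_dataset.supporters LIMIT 10")

    # Preferred: drop a table through the dedicated method.
    bq.delete_table("my_dataset.old_supporters")

    # Alternative: if you send the SQL yourself, skip fetching results so the
    # connector does not raise when the table has already disappeared.
    bq.query(sql="DROP TABLE my_dataset.old_supporters", return_values=False)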

In addition to the familiar .copy() function, the GoogleBigQuery connector includes a .copy_large_compressed_file_from_gcs() function to handle large files in cloud storage, such as the voter file. BigQuery streams large uncompressed files in batches, but cannot do so when a file is compressed. This function decompresses the file in question using the specified compression type parameter (gzip is the default, but zip is also accepted), copies the file to BigQuery, then deletes the decompressed file from cloud storage.
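
A hedged sketch of that call is below; the keyword argument names and the GCS URI are illustrative assumptions, so check the connector reference for the exact signature.

    # Copy a large gzipped file from Google Cloud Storage into BigQuery.
    # Argument names here are assumptions, not the confirmed signature.
    bq.copy_large_compressed_file_from_gcs(
        gcs_blob_uri="gs://my-bucket/voter_file.csv.gz",
        table_name="my_dataset.voter_file",
        compression_type="gzip",  # default; "zip" is also accepted
    )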

GoogleCloudStorage

Similarly, the GoogleCloudStorage connector provides an API to view and manipulate blobs in cloud storage, designed for compatibility with Amazon's S3 connector. Users can create new storage buckets, load blobs into buckets, list their contents, acquire blob metadata, and download blobs from cloud storage to their local environments. This connector handles the decompression steps used by the .copy_large_compressed_file_from_gcs() function outlined above, and also includes helpful utilities to aid in moving data to and from Google Cloud Storage.
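
A short sketch of that blob lifecycle follows; bucket, blob, and file names are hypothetical, and the method names should be checked against the connector documentation.

    # Sketch: create a bucket, upload a blob, list contents, and download it back.
    from parsons import GoogleCloudStorage

    gcs = GoogleCloudStorage()

    gcs.create_bucket("my-example-bucket")
    gcs.put_blob("my-example-bucket", "voter_file.csv", "./voter_file.csv")
    print(gcs.list_blobs("my-example-bucket"))
    gcs.download_blob("my-example-bucket", "voter_file.csv", "local_copy.csv")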

New Connectors

Big thanks to @cmdelrio for adding the new MobileCommons connector (#896) and to @austinweisgrau for the new Catalist Match API connector (#912)!

Other Changes

New Contributors

Cheers to our newest contributors! 🎉 Thanks so much for your help.

Full Changelog: v2.1.0...v3.0.0