bugfix/timestamp-without-time-zone #5

Merged
merged 9 commits into main from bugfix/timestamp-without-time-zone
Mar 26, 2024

Conversation

fivetran-joemarkiewicz
Collaborator

@fivetran-joemarkiewicz fivetran-joemarkiewicz commented Mar 22, 2024

PR Overview

This PR will address the following Issue/Feature: Internally raised Issue

This PR will result in the following new package version: v0.2.0

This likely won't be breaking for the majority of users. However, it changes the datatypes of a number of fields from timestamp with time zone to timestamp without time zone. Therefore, we will mark this as breaking.

Please provide the finalized CHANGELOG entry which details the relevant changes included in this PR:

🚨 Breaking Changes: Bug Fixes 🚨

  • Cast the following timestamp fields in the below models using the dbt.type_timestamp() macro. This is necessary to ensure all timestamps are cast consistently and do not cause datatype mismatches in downstream transformations.
    • stg_qualtrics__contact_mailing_list_membership
      • unsubscribed_at
    • stg_qualtrics__directory_contact
      • created_at
      • unsubscribed_from_directory_at
      • last_modified_at
    • stg_qualtrics__directory_mailing_list
      • created_at
      • last_modified_at
    • stg_qualtrics__distribution_contact
      • opened_at
      • response_completed_at
      • response_started_at
      • sent_at
    • stg_qualtrics__distribution
      • created_at
      • last_modified_at
      • send_at
      • survey_link_expires_at
    • stg_qualtrics__survey_response
      • finished_at
      • is_finished
      • last_modified_at
      • recorded_date
      • started_at
    • stg_qualtrics__survey_version
      • created_at
    • stg_qualtrics__survey
      • last_accessed_at
      • last_activated_at
      • last_modified_at
    • stg_qualtrics__user
      • account_created_at
      • account_expires_at
      • last_login_at
      • password_expires_at
      • password_last_changed_at
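
For illustration, the cast pattern applied to each field above can be sketched as follows. This is a minimal sketch based on the `stg_qualtrics__contact_mailing_list_membership` diff in this PR, not the full model: the upstream model name in `ref()` and the `fields` CTE name are assumptions, and only the columns visible in the diff are shown.

```sql
-- Sketch of a staging model using the cast pattern from this PR.
-- 'stg_qualtrics__contact_mailing_list_membership_tmp' and the CTE name
-- 'fields' are hypothetical placeholders; adjust them to the real model.
with fields as (

    select *
    from {{ ref('stg_qualtrics__contact_mailing_list_membership_tmp') }}

),

final as (

    select
        mailing_list_id,
        name,
        owner_id as owner_user_id,
        -- dbt.type_timestamp() renders the adapter's default timestamp type,
        -- so the column gets one consistent type regardless of how it was synced
        cast(unsubscribe_date as {{ dbt.type_timestamp() }}) as unsubscribed_at,
        unsubscribed as is_unsubscribed,
        _fivetran_synced
    from fields

)

select *
from final
```

Using the macro rather than a hard-coded type name keeps the cast portable across warehouses, since each adapter supplies its own timestamp type.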

Please note: this update will likely only impact Redshift destinations, as it was found that the connector synced these fields as timestamp with time zone when they were in fact timestamp without time zone. Most users will not see any changes following this release, but we marked it as breaking to avoid any possible datatype conflicts downstream.
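
To check whether a Redshift destination is affected, one option is to inspect the column types of the built staging models before and after upgrading. The query below is a rough sketch: it assumes Redshift's `svv_columns` system view and a target schema named `qualtrics_source`, which is a hypothetical name to replace with your own.

```sql
-- List timestamp columns in the Qualtrics staging models and their types.
-- 'qualtrics_source' is a placeholder schema name (assumption).
-- Note: '_' in a LIKE pattern matches any single character, so this
-- pattern is slightly broader than the literal prefix, which is fine here.
select
    table_name,
    column_name,
    data_type
from svv_columns
where table_schema = 'qualtrics_source'
  and table_name like 'stg_qualtrics__%'
  and data_type like 'timestamp%'
order by table_name, column_name;
```

After the upgrade, the fields listed in this changelog would be expected to report timestamp without time zone rather than timestamp with time zone.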

Under the Hood

  • Updated the maintainer PR template to resemble the most up to date format.
  • Added the auto release GitHub Action for easier deployment.

PR Checklist

Basic Validation

Please acknowledge that you have successfully performed the following commands locally:

  • dbt run --full-refresh && dbt test
  • [n/a] dbt run (if incremental models are present)

Before marking this PR as "ready for review" the following have been applied:

  • [n/a] The appropriate issue has been linked, tagged, and properly assigned.
  • All necessary documentation and version upgrades have been applied.
  • docs were regenerated (unless this PR does not include any code or yml updates).
  • BuildKite integration tests are passing.
  • Detailed validation steps have been provided below.

Detailed Validation

Please share any and all of your validation steps:

None of these changes really need any validating in the source package, so the core of the validation is done in combination with the downstream dbt_qualtrics package.

First I recreated the issue by setting up a new Qualtrics connector (for Redshift) and attempted to use the production version of the package. Alas, I saw the same error as documented:
image

I then installed the local version of the qualtrics package with these changes included and saw the errors resolved!
image

If you had to summarize this PR in an emoji, which would it be?

⏲️

@fivetran-joemarkiewicz fivetran-joemarkiewicz marked this pull request as ready for review March 25, 2024 18:30
Contributor

@fivetran-avinash fivetran-avinash left a comment

Some recommendations to the CHANGELOG and one clarifying question, but this should be good to go!

CHANGELOG.md Outdated

## 🚨 Breaking Changes: Bug Fixes 🚨
- Casted the following timestamp fields in the below models using the `dbt.type_timestamp()` macro. This is necessary to ensure all timestamps are consistently casted and do not experience datatype mismatches in downstream transformations.
- stg_qualtrics__contact_mailing_list_membership
Contributor

Could you add backticks (``) to each of these models for better readability? (i.e. `stg_qualtrics__contact_mailing_list_membership`). (Probably the fields too, but I'll leave that up to you!)

Collaborator Author

Yeah, I actually did have `` added to each table and field and it turned into quite the eyesore lol. I think adding the backticks to the tables only should be a good middle ground.

Collaborator Author

Updated

@@ -32,7 +32,7 @@ final as (
 mailing_list_id,
 name,
 owner_id as owner_user_id,
-unsubscribe_date as unsubscribed_at,
+cast(unsubscribe_date as {{ dbt.type_timestamp() }}) as unsubscribed_at,
 unsubscribed as is_unsubscribed,
 _fivetran_synced,
Contributor

Should we cast _fivetran_synced as timestamp as well, or are we 100% confident this field will always be not null?

Collaborator Author

Since we don't use _fivetran_synced anywhere else, I feel confident that we shouldn't have to cast this field. Additionally, since this is a Fivetran-generated field, I feel comfortable that this should be the datatype we assume.

CHANGELOG.md Outdated
- survey_link_expires_at
- stg_qualtrics__survey_response
- finished_at
- is_finished
Contributor

You can remove is_finished; it was not updated.

Collaborator Author

Not sure why that snuck in there. Probably an artifact of copy/paste. Thanks for catching!

Collaborator Author

Updated

integration_tests/dbt_project.yml (resolved)

@fivetran-reneeli fivetran-reneeli left a comment

Looks good

@fivetran-joemarkiewicz fivetran-joemarkiewicz merged commit 7420a38 into main Mar 26, 2024
7 checks passed
@fivetran-joemarkiewicz fivetran-joemarkiewicz deleted the bugfix/timestamp-without-time-zone branch March 26, 2024 17:45