Fix data duplication issue #5

Merged 2 commits on Feb 24, 2021

0xpetersatoshi
Contributor

Description of change

  • Fixes a bug that caused records to be emitted more than once, and updates one stream to use the correct replication key and created-at timestamp field. Before writing records, the tap compared the current bookmark datetime with the datetime from the state file, but it used a ">=" comparison. As a result, records already emitted by the previous run (those whose timestamp equals the saved bookmark) were emitted again. The check now uses ">", so only records with a timestamp later than the previous run's bookmark are emitted. Separately, one stream was using the wrong fields for both its replication key and its created-at timestamp; it now uses the correct ones.
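
The comparison change described above can be sketched roughly as follows. This is an illustration only, not the tap's actual code: the function names, the `updated_at` replication key, and the sample records are all hypothetical.

```python
from datetime import datetime


def parse_ts(value: str) -> datetime:
    """Parse an ISO-8601 timestamp such as '2021-02-24T10:00:00Z'."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def records_to_emit(records, bookmark, replication_key="updated_at"):
    """Return only the records strictly newer than the saved bookmark.

    `bookmark` is the timestamp string read from the state file, or None
    on a first run. `replication_key` is an assumed field name.
    """
    if bookmark is None:
        return list(records)
    cutoff = parse_ts(bookmark)
    # Strict ">" is the fix: ">=" would re-emit the record whose timestamp
    # equals the bookmark, which the previous run already wrote.
    return [r for r in records if parse_ts(r[replication_key]) > cutoff]


# Hypothetical sample data for illustration.
records = [
    {"id": 1, "updated_at": "2021-02-23T10:00:00Z"},
    {"id": 2, "updated_at": "2021-02-24T10:00:00Z"},
]
```

With the bookmark set to the newest record's timestamp, `records_to_emit(records, "2021-02-24T10:00:00Z")` returns an empty list, whereas a ">=" check would return record 2 again.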

Manual QA steps

  • Run the tap once to produce a state file. Run it again with that state file and confirm that streams using the "incremental" replication method emit 0 records.
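
One way to check the second run is to count the RECORD messages per stream in the tap's captured output. The sketch below assumes the output follows the Singer message format (one JSON object per line with a `type` field, and a `stream` field on RECORD messages). The `orders` stream name and the shape of the STATE value are hypothetical.

```python
import json
from collections import Counter


def count_records(lines):
    """Count RECORD messages per stream in Singer-formatted output lines."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the captured output
        message = json.loads(line)
        if message.get("type") == "RECORD":
            counts[message["stream"]] += 1
    return counts


# Hypothetical output captured from a second run made with the state file.
second_run_output = [
    '{"type": "SCHEMA", "stream": "orders", "schema": {}, "key_properties": ["id"]}',
    '{"type": "STATE", "value": {"bookmarks": {"orders": "2021-02-24T10:00:00Z"}}}',
]

counts = count_records(second_run_output)
```

Here `counts["orders"]` is 0 because the second run emitted only SCHEMA and STATE messages. A nonzero count for an incremental stream would indicate that duplicates are still being emitted.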

Risks

  • N/A

Rollback steps

  • Revert this branch.

@cosimon merged commit 1f5d10e into singer-io:master on Feb 24, 2021
2 participants