Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Doesn't allow writing duplicates in bulk writer and corresponding tests #515

Merged
merged 10 commits into from
Jun 21, 2023

Conversation

tvaron3
Copy link
Member

@tvaron3 tvaron3 commented Jun 9, 2023

Type of PR

  • Documentation changes
  • Code changes
  • Test changes
  • CI-CD changes
  • GitHub Template changes

Purpose of PR

The bulk upsert operation will sometimes write data with duplicate id and partition key incorrectly. This change prevents duplicate items to be sent to the bulk upsert operation. It should only send the latest item. The feature is hidden behind a new config that is set to true by default. The flag added is called "connect.cosmos.sink.bulk.compression.enabled".

Observability + Testing

  • What changes or considerations did you make in relation to observability?

  • Did you add testing to account for any new or changed work?
    I added a unit test and a integration test for the changes I made in the bulkwriter.

Review notes

Issues Closed or Referenced

  • Closes # (this will automatically close the issue when the PR closes)
  • References # (this references the issue but does not close with PR)

@tvaron3 tvaron3 added the feature-request New feature or request label Jun 9, 2023
@tvaron3 tvaron3 requested a review from a team as a code owner June 9, 2023 23:04
@tvaron3 tvaron3 self-assigned this Jun 9, 2023
Copy link
Collaborator

@kushagraThapar kushagraThapar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks great, just some minor comments.
Great work @tvaron3 , thanks!!

@TheovanKraay
Copy link

LGTM - nit: can we please add the actual flag in the PR description?

@TheovanKraay
Copy link

LGTM - nit: can we please add the actual flag in the PR description?

I saw you updated, but I meant show the value that customer will need to update in their config file for the connector - I think its connect.cosmos.sink.bulk.duplicates.enabled?

@tvaron3 tvaron3 requested a review from kushagraThapar June 16, 2023 18:59
@tvaron3
Copy link
Member Author

tvaron3 commented Jun 16, 2023

Theo I changed it again and the value they would need to change is "connect.cosmos.sink.bulk.compression.enabled".

kushagraThapar
kushagraThapar previously approved these changes Jun 20, 2023
Copy link
Collaborator

@kushagraThapar kushagraThapar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks @tvaron3 , great work :)

CHANGELOG.md Outdated Show resolved Hide resolved
Fixed wording of flag in documentation and changelog

Co-authored-by: Kushagra Thapar <kuthapar@microsoft.com>
@tvaron3 tvaron3 merged commit dff9960 into dev Jun 21, 2023
@tvaron3 tvaron3 deleted the users/tvaron3/kafka-connector branch June 21, 2023 16:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature-request New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants