Adding support for mysql upsert via AR import #18

Merged: 9 commits, Aug 17, 2021


@mitch-lindsay mitch-lindsay commented Aug 5, 2021

The existing scrubbing process applies changes to each model with an individual UPDATE statement. This works, but it is very time-consuming when scrubbing a large amount of data.
MySQL, PostgreSQL and SQLite all support some variation of bulk updating via an upsert operation. Thankfully, activerecord-import supports this, which makes it easy to implement: https://github.com/zdennis/activerecord-import#duplicate-key-update
At its core, this change adds support for applying scrubbed changes via a bulk upsert instead of individual update commands.
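To illustrate what a bulk upsert looks like on MySQL, here is a minimal sketch that builds a single `INSERT ... ON DUPLICATE KEY UPDATE` statement for a batch of rows. The helper name `build_upsert_sql`, the `users` table, and its columns are hypothetical and not taken from this PR. In the gem the statement is generated by activerecord-import (for MySQL, roughly `Model.import(records, on_duplicate_key_update: [:email])`), not by hand-built SQL.

```ruby
# Hypothetical helper, for illustration only: builds one MySQL upsert
# statement covering a whole batch of rows. Quoting here is naive; real code
# should rely on the database adapter (or activerecord-import) for quoting.
def build_upsert_sql(table, columns, rows, update_columns)
  quote = lambda do |value|
    case value
    when nil then "NULL"
    when Numeric then value.to_s
    else "'#{value.to_s.gsub("'", "''")}'"
    end
  end

  column_list = columns.map { |c| "`#{c}`" }.join(", ")
  values_list = rows.map { |row| "(#{row.map(&quote).join(", ")})" }.join(", ")
  # VALUES(col) refers to the value the row would have inserted for that column.
  updates = update_columns.map { |c| "`#{c}` = VALUES(`#{c}`)" }.join(", ")

  "INSERT INTO `#{table}` (#{column_list}) VALUES #{values_list} " \
    "ON DUPLICATE KEY UPDATE #{updates}"
end

# Two already-scrubbed rows, keyed by primary key, sent in one statement.
sql = build_upsert_sql(
  "users",
  [:id, :email],
  [[1, "a@example.com"], [2, "b@example.com"]],
  [:email]
)
puts sql
```

Because each row carries its primary key, existing rows hit the duplicate-key branch and are updated in place, so one statement replaces many single-row updates.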
The performance implications of this change are significant. In a test against a table with 700,000 rows, the upsert approach was approximately 5x faster (5m13.113s vs 27m49.008s).
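As a quick check of the quoted speedup, the two timings can be converted to seconds and divided. The helper name `to_seconds` is hypothetical; the timings are the ones reported above.

```ruby
# Hypothetical helper: parses a `time`-style duration such as "5m13.113s"
# into seconds.
def to_seconds(duration)
  match = duration.match(/\A(\d+)m([\d.]+)s\z/)
  raise ArgumentError, "unrecognized duration: #{duration}" unless match

  match[1].to_i * 60 + match[2].to_f
end

upsert_seconds = to_seconds("5m13.113s")   # bulk upsert run
update_seconds = to_seconds("27m49.008s")  # per-record update run
speedup = update_seconds / upsert_seconds

puts format("speedup: %.2fx", speedup)
```

The ratio comes out a little above 5, consistent with the "approximately 5x" figure.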
The other changes in this PR reorganize the code so that scrub and scrub:model share a single implementation path, and decompose some of the layers for better separation of concerns.
Note: Because the activerecord-import implementations for MySQL and PostgreSQL are slightly different, I have only added support for MySQL at this time.

Review threads (all resolved):
lib/acts_as_scrubbable.rb
lib/acts_as_scrubbable/task_runner.rb (outdated)
lib/acts_as_scrubbable/tasks.rb (outdated)
lib/acts_as_scrubbable/ar_class_processor.rb (outdated, 3 threads)
lib/acts_as_scrubbable/update_processor.rb (outdated)
@mitch-lindsay

I tweaked the way we determine which update method to use. Instead of inferring it from the application's database and gem configuration, the update method is now configured explicitly, either through the initializer or through an environment variable.
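A minimal sketch of what such explicit configuration could look like follows. Every name here (`SketchScrubConfig`, `use_upsert`, and the `SCRUB_USE_UPSERT` variable) is hypothetical and is not the gem's actual API; the precedence shown (initializer over environment variable) is also an assumption.

```ruby
# Hypothetical configuration module, for illustration only.
module SketchScrubConfig
  class << self
    attr_writer :use_upsert

    # Mirrors the usual initializer style: SketchScrubConfig.configure { |c| ... }
    def configure
      yield self
    end

    # Assumed precedence: an explicit initializer setting wins; otherwise
    # fall back to the environment variable; otherwise default to false.
    def use_upsert?(env = ENV)
      return @use_upsert unless @use_upsert.nil?

      env.fetch("SCRUB_USE_UPSERT", "false") == "true"
    end
  end
end

# With nothing configured, the environment decides.
puts SketchScrubConfig.use_upsert?("SCRUB_USE_UPSERT" => "true")

# An initializer-style setting overrides the environment.
SketchScrubConfig.configure { |config| config.use_upsert = false }
puts SketchScrubConfig.use_upsert?("SCRUB_USE_UPSERT" => "true")
```

Making the choice explicit keeps the library from having to inspect the database adapter or detect whether activerecord-import is loaded.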

@mitch-lindsay mitch-lindsay merged commit b827696 into master Aug 17, 2021
@alexadia alexadia deleted the update_to_use_import branch January 9, 2023 18:42
adamstegman added commits that referenced this pull request on Jul 8, Jul 9, and Jul 10, 2024, each with the same message:
Callbacks should be around the update, not just retrieving scrubbed values.
I think this should have been moved in #18, but wasn't caught.