Support sharding for background migrations
fatkodima committed Dec 13, 2023
1 parent 94cf5c9 commit 395eee6
Showing 27 changed files with 653 additions and 119 deletions.
8 changes: 8 additions & 0 deletions .github/workflows/test.yml
@@ -53,5 +53,13 @@ jobs:
        with:
          ruby-version: ${{ matrix.ruby-version }}
          bundler-cache: true
      - name: Prepare PostgreSQL shards
        run: |
          createdb online_migrations_shard_one
          createdb online_migrations_shard_two
        env:
          PGHOST: localhost
          PGUSER: postgres
          PGPASSWORD: postgres
      - name: Run the test suite
        run: bundle exec rake test
1 change: 1 addition & 0 deletions .rubocop.yml
@@ -124,6 +124,7 @@ Naming/FileName:
  Exclude:
    - lib/online_migrations/version.rb
    - test/support/schema.rb
    - test/support/db/**
    - test/test_helper.rb
    - gemfiles/**.gemfile

12 changes: 12 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,17 @@
## master (unreleased)

- Support sharding for background migrations

Now, if the `relation` inside a background migration definition is defined on a sharded model,
that background migration will automatically run on all the shards.

To get all the new sharding-related schema changes, you need to run:

```sh
$ bin/rails generate online_migrations:upgrade
$ bin/rails db:migrate
```

- Drop support for Ruby < 2.7 and Rails < 6.1
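
The sharding behavior described in the first entry above can be sketched in plain Ruby. This is only an illustration under stated assumptions: `MigrationSketch` and `build_migration` are hypothetical stand-ins, not part of the gem, and the real logic lives in an Active Record callback on the migration model. The idea is that a migration over more than one shard becomes a non-runnable parent with one runnable child per shard, while a single-shard migration stays as it is.

```ruby
# Hypothetical stand-in for the gem's migration record (not the real model).
MigrationSketch = Struct.new(:name, :shard, :runnable, :children, keyword_init: true)

# Builds a migration and, when the model spans several shards,
# turns it into a parent with one child per shard.
def build_migration(name, shard_names)
  parent = MigrationSketch.new(name: name, shard: nil, runnable: true, children: [])

  if shard_names.size > 1
    # The parent only groups the children; it does no work itself.
    parent.runnable = false
    parent.children = shard_names.map do |shard|
      MigrationSketch.new(name: name, shard: shard, runnable: true, children: [])
    end
  end

  parent
end

sharded = build_migration("BackfillProjectIssuesCount", [:shard_one, :shard_two])
single = build_migration("BackfillProjectIssuesCount", [:default])
```

Here `sharded` ends up with two children (one per shard) and is itself not runnable, whereas `single` has no children and remains runnable.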

## 0.10.0 (2023-12-12)
10 changes: 10 additions & 0 deletions README.md
@@ -37,10 +37,20 @@ And then run:
```sh
$ bundle install
$ bin/rails generate online_migrations:install
$ bin/rails db:migrate
```

**Note**: If you do not plan to use the [background migrations](docs/background_migrations.md) feature, you can delete the generated migration and regenerate it later if needed.

### Upgrading

If you're already using [background migrations](docs/background_migrations.md), your background migrations tables may require additional columns. After every upgrade, please run:

```sh
$ bin/rails generate online_migrations:upgrade
$ bin/rails db:migrate
```
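
As a rough sketch of why running the upgrade generator repeatedly is safe, the generator added in this commit only emits a migration when an expected column is missing. The plain-Ruby helper below mirrors that check; `pending_upgrade_migrations` is a hypothetical name, and it takes a plain array of column names instead of querying the database as the real generator does.

```ruby
# Hypothetical helper mirroring the generator's column check.
# `column_names` is the list of columns currently present on the
# background migrations table.
def pending_upgrade_migrations(column_names)
  migrations = []
  # The sharding migration is only needed while the `shard` column is absent.
  migrations << "add_sharding_to_online_migrations" unless column_names.include?("shard")
  migrations
end

before_upgrade = pending_upgrade_migrations(%w[id migration_name arguments])
after_upgrade = pending_upgrade_migrations(%w[id migration_name arguments shard])
```

With the columns above, `before_upgrade` lists the sharding migration and `after_upgrade` is empty, so a second run would generate nothing.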

## Motivation

Writing a safe migration can be daunting. Numerous articles have been written on the topic and a few gems are trying to address the problem. Even for someone who has a pretty good command of PostgreSQL, remembering all the subtleties of explicit locking can be problematic.
28 changes: 27 additions & 1 deletion docs/background_migrations.md
@@ -116,13 +116,15 @@ enqueue_background_migration("MyMigrationWithArgs", arg1, arg2, ...)

## Predefined background migrations

* `BackfillColumn` - backfills column(s) with scalar values (enqueue using `backfill_column_in_background`)
* `BackfillColumn` - backfills column(s) with scalar values (enqueue using `backfill_column_in_background`, or `backfill_column_for_type_change_in_background` when backfilling a column whose type change is in progress)
* `CopyColumn` - copies data from one column(s) to other(s) (enqueue using `copy_column_in_background`)
* `DeleteAssociatedRecords` - deletes records associated with a parent object (enqueue using `delete_associated_records_in_background`)
* `DeleteOrphanedRecords` - deletes records with one or more missing relations (enqueue using `delete_orphaned_records_in_background`)
* `PerformActionOnRelation` - performs specific action on a relation or individual records (enqueue using `perform_action_on_relation_in_background`)
* `ResetCounters` - resets one or more counter caches to their correct value (enqueue using `reset_counters_in_background`)

**Note**: These migration helpers should be run inside a migration that runs against the database where the background migrations tables are defined.

## Testing

At a minimum, it's recommended that the `#process_batch` method in your background migration is tested. You may also want to test the `#relation` and `#count` methods if they are sufficiently complex.
@@ -294,3 +296,27 @@ OnlineMigrations.config.background_migrations.backtrace_cleaner = cleaner
```

If none is specified, the default `Rails.backtrace_cleaner` will be used to clean backtraces.

### Customizing the database where background migrations tables live

If you have multiple databases, you can choose where the tables related to background migrations live
by configuring the parent model:

```ruby
# config/initializers/online_migrations.rb

OnlineMigrations::BackgroundMigrations::ApplicationRecord.connects_to database: { writing: :animals, reading: :animals_replica }
```

If everything is sharded and there is no "common" database, you should run background migrations against one chosen shard:

```ruby
# schedule.rb
every 1.minute do
  runner <<~RUBY
    OnlineMigrations::BackgroundMigrations::ApplicationRecord.connected_to(shard: :shard_one, role: :writing) do
      OnlineMigrations::BackgroundMigrations::Scheduler.run
    end
  RUBY
end
```
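
Because a background migration on a sharded model runs as one child migration per shard, this commit reports the parent's progress as the mean of the children's known progress values. The following is a minimal plain-Ruby sketch of that aggregation; `composite_progress` is a hypothetical helper that takes plain numbers, whereas the gem computes this from the child records.

```ruby
# Hypothetical helper: averages the progress of child migrations.
# Children whose progress is unknown (nil) are ignored; when no child
# has a known progress, the result is nil as well.
def composite_progress(child_progresses)
  known = child_progresses.compact
  return nil if known.empty?

  # Start the sum from 0.0 so the division is never integer division.
  known.sum(0.0) / known.size
end

partial = composite_progress([1.0, 0.5, nil])
unknown = composite_progress([nil, nil])
```

In this sketch `partial` is the mean of `1.0` and `0.5`, and `unknown` is `nil` because no shard has reported progress yet.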
@@ -0,0 +1,18 @@
class AddShardingToOnlineMigrations < <%= migration_parent %>
  def change
    safety_assured do
      remove_index :background_migrations, [:migration_name, :arguments], unique: true

      change_table :background_migrations do |t|
        t.bigint :parent_id
        t.string :shard
        t.boolean :runnable, default: true, null: false

        t.foreign_key :background_migrations, column: :parent_id, on_delete: :cascade

        t.index [:migration_name, :arguments, :shard],
          unique: true, name: :index_background_migrations_on_unique_configuration
      end
    end
  end
end
7 changes: 6 additions & 1 deletion lib/generators/online_migrations/templates/migration.rb.tt
@@ -1,6 +1,7 @@
class InstallOnlineMigrations < <%= migration_parent %>
def change
create_table :background_migrations do |t|
t.bigint :parent_id
t.string :migration_name, null: false
t.jsonb :arguments, default: [], null: false
t.string :batch_column_name, null: false
@@ -13,9 +14,13 @@ class InstallOnlineMigrations < <%= migration_parent %>
t.integer :sub_batch_pause_ms, null: false
t.integer :batch_max_attempts, null: false
t.string :status, default: "enqueued", null: false
t.string :shard
t.boolean :runnable, default: true, null: false
t.timestamps null: false

t.index [:migration_name, :arguments],
t.foreign_key :background_migrations, column: :parent_id, on_delete: :cascade

t.index [:migration_name, :arguments, :shard],
unique: true, name: :index_background_migrations_on_unique_configuration
end

33 changes: 33 additions & 0 deletions lib/generators/online_migrations/upgrade_generator.rb
@@ -0,0 +1,33 @@
# frozen_string_literal: true

require "rails/generators"
require "rails/generators/active_record/migration"

module OnlineMigrations
  # @private
  class UpgradeGenerator < Rails::Generators::Base
    include ActiveRecord::Generators::Migration

    source_root File.expand_path("templates", __dir__)

    def copy_templates
      migrations_to_be_applied.each do |migration|
        migration_template("#{migration}.rb", File.join(db_migrate_path, "#{migration}.rb"))
      end
    end

    private
      def migrations_to_be_applied
        connection = BackgroundMigrations::Migration.connection
        columns = connection.columns(BackgroundMigrations::Migration.table_name).map(&:name)

        migrations = []
        migrations << "add_sharding_to_online_migrations" if !columns.include?("shard")
        migrations
      end

      def migration_parent
        "ActiveRecord::Migration[#{Utils.ar_version}]"
      end
  end
end
1 change: 1 addition & 0 deletions lib/online_migrations.rb
@@ -54,6 +54,7 @@ module BackgroundMigrations
autoload :DeleteOrphanedRecords
autoload :PerformActionOnRelation
autoload :ResetCounters
autoload :ApplicationRecord
autoload :MigrationJob
autoload :Migration
autoload :MigrationJobRunner
13 changes: 13 additions & 0 deletions lib/online_migrations/background_migrations/application_record.rb
@@ -0,0 +1,13 @@
# frozen_string_literal: true

module OnlineMigrations
  module BackgroundMigrations
    # Base class for all records used by this gem.
    #
    # Can be extended to set up a different database where all tables related to
    # online_migrations will live.
    class ApplicationRecord < ActiveRecord::Base
      self.abstract_class = true
    end
  end
end
68 changes: 63 additions & 5 deletions lib/online_migrations/background_migrations/migration.rb
@@ -2,7 +2,7 @@

module OnlineMigrations
module BackgroundMigrations
class Migration < ActiveRecord::Base
class Migration < ApplicationRecord
STATUSES = [
:enqueued, # The migration has been enqueued by the user.
:running, # The migration is being performed by a migration executor.
@@ -15,14 +15,18 @@ class Migration < ActiveRecord::Base
self.table_name = :background_migrations

scope :queue_order, -> { order(created_at: :asc) }
scope :runnable, -> { where(runnable: true) }
scope :active, -> { where(status: [statuses[:enqueued], statuses[:running]]) }
scope :except_succeeded, -> { where.not(status: :succeeded) }
scope :for_migration_name, ->(migration_name) { where(migration_name: normalize_migration_name(migration_name)) }
scope :for_configuration, ->(migration_name, arguments) do
for_migration_name(migration_name).where("arguments = ?", arguments.to_json)
end

enum status: STATUSES.index_with(&:to_s)

belongs_to :parent, class_name: name, optional: true
has_many :children, class_name: name, foreign_key: :parent_id
has_many :migration_jobs

validates :migration_name, :batch_column_name, presence: true
@@ -33,7 +37,7 @@ class Migration < ActiveRecord::Base
validates :batch_pause, :sub_batch_pause_ms, presence: true,
numericality: { greater_than_or_equal_to: 0 }
validates :rows_count, numericality: { greater_than_or_equal_to: 0 }, allow_nil: true
validates :arguments, uniqueness: { scope: :migration_name }
validates :arguments, uniqueness: { scope: [:migration_name, :shard] }

validate :validate_batch_column_values
validate :validate_batch_sizes
@@ -43,6 +47,7 @@ class Migration < ActiveRecord::Base
validates_with MigrationStatusValidator, on: :update

before_validation :set_defaults
before_create :create_child_migrations, unless: :parent_id?

# @private
def self.normalize_migration_name(migration_name)
@@ -58,12 +63,36 @@ def completed?
succeeded? || failed?
end

def composite?
parent_id.nil? && !runnable?
end

# Override the enum-generated method so that it works correctly for composite migrations.
def paused!
return super if !composite?

transaction do
super
children.each { |child| child.paused! if child.enqueued? || child.running? }
end
end

# Override the enum-generated method so that it works correctly for composite migrations.
def running!
return super if !composite?

transaction do
super
children.each { |child| child.running! if child.paused? }
end
end

def last_job
migration_jobs.order(max_value: :desc).first
migration_jobs.order(:max_value).last
end

def last_completed_job
migration_jobs.completed.order(finished_at: :desc).first
migration_jobs.completed.order(:finished_at).last
end

# Returns the progress of the background migration.
@@ -75,6 +104,11 @@ def last_completed_job
def progress
if succeeded?
1.0
elsif composite?
progresses = children.map(&:progress).compact
if progresses.any?
progresses.sum / progresses.size
end
elsif rows_count
jobs_rows_count = migration_jobs.succeeded.sum(:batch_size)
# The last migration job may need to process the amount of rows
@@ -95,6 +129,10 @@ def migration_relation
migration_object.relation
end

def migration_model
migration_relation.model
end

# Returns whether the interval between previous step run has passed.
# @return [Boolean]
#
@@ -170,7 +208,13 @@ def validate_batch_sizes
end

def validate_jobs_status
if succeeded? && migration_jobs.except_succeeded.exists?
if composite?
if succeeded? && children.except_succeeded.exists?
errors.add(:base, "all child migrations must be succeeded")
elsif failed? && !children.failed.exists?
errors.add(:base, "at least one child migration must be failed")
end
elsif succeeded? && migration_jobs.except_succeeded.exists?
errors.add(:base, "all migration jobs must be succeeded")
elsif failed? && !migration_jobs.failed.exists?
errors.add(:base, "at least one migration job must be failed")
@@ -201,6 +245,20 @@ def set_defaults
end
end

def create_child_migrations
shards = Utils.shard_names(migration_model)

if shards.size > 1
children = shards.map do |shard|
child = dup
child.shard = shard
child
end
self.runnable = false
self.children = children
end
end

def next_min_value
if last_job
last_job.max_value.next
@@ -2,7 +2,7 @@

module OnlineMigrations
module BackgroundMigrations
class MigrationJob < ActiveRecord::Base
class MigrationJob < ApplicationRecord
STATUSES = [
:enqueued,
:running,