
Practices Scenarios

Vinllen Chen edited this page Apr 3, 2020 · 1 revision

Starting from version 2.2, MongoShake supports replica set -> replica set, replica set -> cluster, cluster -> replica set, and cluster -> cluster synchronization. The source and destination can also be the same instance.

1. Feature list

The following functions can be combined according to user needs.

1.1 The source is a replica set instance

  • Configure the source replica set with the parameter mongo_urls
  • Configure the destination replica set (it can also be a tcp / rpc / kafka / file channel) with the parameter tunnel.address

Common Functions

  • Support full + incremental migration; full migration includes index migration. Parameter sync_mode = all
  • Support incremental DDL migration, parameter replayer.dml_only = false
  • Support a whitelist and a blacklist, parameters filter.namespace.white and filter.namespace.black
  • Support automatically dropping a table with the same name in the target database, parameter replayer.collection_drop = true
  • Support a target table whose indexes differ from those of the source table: set replayer.collection_drop = false, create the table with its new indexes in the target database first, and then start synchronization (if the target table already exists before the migration, the indexes of that table are not synchronized)
  • Support flow control for full migration, parameters replayer.collection_parallel, replayer.document_parallel and replayer.document_batch_size
  • Support a custom write address for the checkpoint, parameters context.storage.url, context.storage.db and context.storage.collection
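As an illustration of how the common parameters above combine, here is a minimal Python sketch that renders a few of them as `key = value` lines. The parameter names come from the list above; the helper `render_conf`, the host names, the account and the namespace `mydb.orders` are hypothetical, and the output is only a fragment, not a complete MongoShake configuration.

```python
def render_conf(params):
    """Render a dict as `key = value` lines, writing booleans as true/false."""
    lines = []
    for key, value in params.items():
        if isinstance(value, bool):
            value = "true" if value else "false"
        lines.append(f"{key} = {value}")
    return "\n".join(lines)


# Hypothetical addresses and namespace; only the parameter names are from the wiki.
params = {
    "mongo_urls": "mongodb://user:password@source-host:27017",
    "tunnel.address": "mongodb://user:password@target-host:27017",
    "sync_mode": "all",                      # full + incremental migration
    "replayer.dml_only": False,              # also replay DDL
    "replayer.collection_drop": True,        # drop same-named target tables
    "filter.namespace.white": "mydb.orders", # migrate only this namespace
}

conf_text = render_conf(params)
print(conf_text)
```

Keeping the parameters in one dictionary makes it easy to see which functions are switched on together before the fragment is copied into the real configuration file.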

Special Functions

  • Support migration or synchronization of Alibaba Cloud replica set instances
  • Support renaming the destination table, parameter transform.namespace
  • Support DBRef table renaming, which has a large performance impact, parameter dbref = true
  • Support synchronizing the admin database, which is not synchronized unless explicitly enabled, parameter filter.pass.special.db = admin
  • Improve the fault tolerance of incremental synchronization, parameters replayer.executor.upsert = true and replayer.executor.insert_on_dup_update = true
  • Support a tcp / rpc / kafka / file channel as the destination: set the parameter tunnel to the corresponding mode and tunnel.address to the address for that mode. For kafka, the address takes the form topic@brokers1,brokers2
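The kafka address form mentioned in the last item packs a topic and a broker list into a single string. The following Python sketch shows how such a string breaks down. It is not MongoShake's own parsing code; the function name `parse_kafka_address`, the topic name and the broker addresses are assumptions made for illustration.

```python
def parse_kafka_address(address):
    """Split 'topic@broker1,broker2' into (topic, [broker, ...])."""
    topic, sep, broker_part = address.partition("@")
    topic = topic.strip()
    brokers = [b.strip() for b in broker_part.split(",") if b.strip()]
    if not sep or not topic or not brokers:
        raise ValueError(f"expected topic@broker1,broker2, got {address!r}")
    return topic, brokers


# Hypothetical topic and brokers.
topic, brokers = parse_kafka_address("mongoshake@10.0.0.1:9092,10.0.0.2:9092")
print(topic)
print(brokers)
```

A check like this can catch a missing `@` or an empty broker list before the value is written to tunnel.address.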

1.2 The source is a cluster instance

  • Configure the list of source shard nodes with the parameter mongo_urls
  • Configure the source config node with the parameter mongo_cs_url
  • Configure the destination mongos node (it can also be a tcp / rpc / kafka / file channel) with the parameter tunnel.address

Common Functions

  • Support all common functions of replica set instances
  • Support filtering orphan documents on the source (documents left behind by failed move chunk operations), parameter filter.orphan_document = true
  • Support keeping the source balancer enabled during incremental synchronization, parameter movechunk.enable = true (the source balancer must still be disabled during full synchronization)

Special Functions

  • Support all special functions of replica set instances
  • Support migration or synchronization of Alibaba Cloud cluster instances when orphan documents on the source must be filtered (if orphan documents do not need to be filtered, no additional configuration is required). In the parameter mongo_urls, append the replica set name of each shard node to its address as ?replicaSet=xxx, because the read-only account on Alibaba Cloud does not currently have permission to obtain replica set information
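Appending the replica set name to a shard address can be sketched in Python with the standard library's `urllib.parse`. The helper name `with_replica_set`, the account, the host and the replica set name below are hypothetical. How several shard addresses are then joined inside mongo_urls is not covered here.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_replica_set(url, name):
    """Return `url` with its replicaSet option set to `name`."""
    parts = urlsplit(url)
    # Keep existing options, replacing any replicaSet that is already present.
    query = [(k, v) for k, v in parse_qsl(parts.query) if k != "replicaSet"]
    query.append(("replicaSet", name))
    # Put a "/" between the host list and the options if the URL has no path.
    path = parts.path or "/"
    return urlunsplit((parts.scheme, parts.netloc, path, urlencode(query), parts.fragment))


# Hypothetical read-only account, shard address and replica set name.
print(with_replica_set("mongodb://reader:secret@shard1-host:3717", "rs0"))
```

Existing options such as authSource are preserved, so the helper can be applied to addresses that already carry a query string.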

2. Large data migration scenarios

  • Deploy MongoShake in the same data center or region as the source, especially when the source and destination are far apart. Otherwise the migration will be much slower.
  • Choose concurrency parameters that let the migration run at full load, increasing replayer.collection_parallel and replayer.document_batch_size as appropriate. Ideally, the IOPS of the destination are saturated.

3. More complex user scenarios

3.1 The source is a cluster instance

  1. To migrate or synchronize an Alibaba Cloud cluster instance when orphan documents on the source must be filtered, follow these steps:
     1) Apply for a connection address and a read-only account for each shard and cs node in the Alibaba Cloud console.
     2) Fill in each shard's connection address and replica set name in mongo_urls. The replica set name of a shard can be obtained from the config.shards table.
     3) Configure filter.orphan_document = true and start synchronization.

  2. If you want to change the shard key of a table in a cluster instance, can accept a new table name, and cannot migrate to a new instance, follow these steps:
     1) Create a new table with the required shard key and indexes.
     2) Configure the parameters filter.namespace.white and transform.namespace carefully, and start full + incremental synchronization.
     3) Stop the service during an off-peak period of the business, change the table name accessed by the application, and start the service again. Once you observe that no new data is being written to the destination database, stop MongoShake.
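For step 2) of the first scenario, the replica set name of each shard can be read from the `host` field of the documents in config.shards, which has the form `replicaSetName/host1:port,host2:port`. The Python sketch below extracts the names from documents of that shape. The function name `shard_replica_sets` and the sample documents are hypothetical. In practice the documents would come from querying config.shards with a MongoDB client.

```python
def shard_replica_sets(shard_docs):
    """Map each shard _id to the replica set name found in its host field."""
    names = {}
    for doc in shard_docs:
        rs_name, sep, _hosts = doc["host"].partition("/")
        if not sep or not rs_name:
            raise ValueError(f"no replica set name in host field: {doc['host']!r}")
        names[doc["_id"]] = rs_name
    return names


# Hypothetical documents shaped like entries of config.shards.
sample_shards = [
    {"_id": "shard0", "host": "rs0/10.0.0.1:27017,10.0.0.2:27017"},
    {"_id": "shard1", "host": "rs1/10.0.0.3:27017"},
]

print(shard_replica_sets(sample_shards))
```

The resulting names are the values to append to each shard address in mongo_urls.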
