Usage with MongoDB

Luke Lovett edited this page Feb 6, 2015 · 8 revisions

The Basics

Mongo Connector can replicate from one MongoDB replica set or sharded cluster to another using the Mongo DocManager. The most basic invocation looks like this:

mongo-connector -m localhost:27017 -t localhost:37017 -d mongo_doc_manager

Old usage (before the 2.0 release):

mongo-connector -m localhost:27017 -t localhost:37017 -d <your-doc-manager-folder>/mongo_doc_manager.py

This assumes you are running two replica sets or sharded clusters on the local machine: the source on port 27017 and the target on port 37017.
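
If you launch Mongo Connector from a script rather than by hand, the basic command above can be assembled as an argument list. The following is a minimal sketch that uses only the -m, -t, and -d options shown on this page; the helper names build_command and run_connector are hypothetical and are not part of Mongo Connector.

```python
import subprocess


def build_command(source, target, doc_manager="mongo_doc_manager"):
    """Return the basic mongo-connector invocation as an argument list.

    source and target are "host:port" strings for the source and
    target clusters (hypothetical helper, not part of Mongo Connector).
    """
    return ["mongo-connector", "-m", source, "-t", target, "-d", doc_manager]


def run_connector(command):
    """Run the connector and block until it exits.

    Assumes the mongo-connector executable is on the PATH.
    """
    return subprocess.call(command)


command = build_command("localhost:27017", "localhost:37017")
print(" ".join(command))
```

Calling run_connector(command) would then start replication, assuming both clusters are already up on those ports.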

Important Notes

  • Even though replication is mongo-to-mongo, Mongo Connector still needs to insert the _ts and ns fields in order to handle rollbacks and provide renaming features. Note: in version 1.3, the _ts and ns fields do not appear in replicated documents and are instead stored in the __mongo_connector database. This is true as of commit b10b94f3ec3d1bc104d807ac7b8e61aabaa120d8.
  • Mongo Connector is "upsert only." This means that when a document is updated, the original document is overwritten with the latest version of that document on the source cluster. This is not the normal behavior of MongoDB replication, and it can result in short-lived discrepancies between the source and target MongoDB clusters.
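
To make the "upsert only" note more concrete, here is a toy model in plain Python. It is not Mongo Connector's actual code: the dictionaries source and target stand in for collections, oplog is a simplified list of changed document ids, and all function names are hypothetical. It shows the behavior described above, where replaying an update copies the latest version of the document from the source rather than the version that existed when the update was logged.

```python
import copy

source = {}  # stands in for a collection on the source cluster
target = {}  # stands in for the same collection on the target cluster
oplog = []   # simplified oplog: ids of documents that changed, in order


def write_to_source(doc):
    """Store a document on the source and record the change."""
    source[doc["_id"]] = copy.deepcopy(doc)
    oplog.append(doc["_id"])


def replay_one_entry():
    """Upsert-only replay: overwrite the target copy with whatever
    the source holds right now for the changed document."""
    doc_id = oplog.pop(0)
    target[doc_id] = copy.deepcopy(source[doc_id])


# Three writes land on the source before the connector catches up.
write_to_source({"_id": 1, "count": 1})
write_to_source({"_id": 1, "count": 2})
write_to_source({"_id": 1, "count": 3})

# Replay the backlog and record what the target sees after each entry.
seen_on_target = []
while oplog:
    replay_one_entry()
    seen_on_target.append(target[1]["count"])

print(seen_on_target)  # [3, 3, 3]
```

In this model the target never passes through the intermediate states count 1 and count 2, which an operation-by-operation replay would produce. Before the backlog is replayed, the target lags behind the source; that lag is the short-lived discrepancy mentioned above.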

Comparison to Other Tools

MongoDB comes with several other tools that can be helpful in certain situations where Mongo Connector may also apply. These tools include:

For backup purposes, these tools work fine and are probably a lot faster than Mongo Connector. Furthermore, MongoDB Inc. officially supports their use (mongo-connector is not "officially supported"), and they may have fewer bugs. It's even possible to back up or move data from one MongoDB cluster to another without downtime using filesystem snapshots and mongooplog. However, there are certain situations where Mongo Connector really excels. Some of these are:

  • Needing to replicate to a system other than MongoDB
  • Needing to back up or move data from a MongoDB cluster without downtime when filesystem snapshots aren't an option
  • Targeting specific namespaces for live replication
  • Replicating to multiple targets with one tool
  • Migrating databases or collections to have different names without downtime

The take-away: Consider your options before committing to a solution for just moving data around.
