Chris T edited this page Dec 16, 2013 · 9 revisions

Riak MDC Replication FAQ

How can the memory requirements for keylist fullsync be estimated?

A system must be able to hold a bloom filter in memory for at least one partition's worth of keys for the entire duration of the fullsync. The bloom filter is approximately 10 MB. The AAE fullsync strategy has the same requirement: although the two strategies use different mechanisms to determine which keys differ between the two clusters, both build a bloom filter and then fold over the vnode's partition, checking each key for a match in the bloom filter. To find the differing keys, the keylist strategy builds sorted files and then merge-sorts them, so that step needs memory only for the binary chunks of the files being merged (see http://www.erlang.org/doc/man/file_sorter.html).
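The bloom-then-fold pattern described above can be sketched roughly as follows. This is an illustrative Python sketch only, not Riak's Erlang implementation: the BloomFilter class, the keys_to_send function, the hashing scheme, and the default sizes are all assumptions made for the example. The point it shows is that the filter's memory footprint is fixed by its bit count, not by the number of keys in the partition.

```python
import hashlib


class BloomFilter:
    """Tiny illustrative bloom filter (hypothetical; not Riak's implementation)."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        # Memory use is fixed up front: one bit per slot, packed into bytes.
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Derive num_hashes bit positions from salted SHA-256 digests.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + key).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, key):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(key)
        )


def keys_to_send(differing_keys, partition_keys, num_bits=8 * 1024, num_hashes=4):
    """Load the differing keys into a bloom filter, then fold over the
    partition's keys and keep every key that matches the filter."""
    bloom = BloomFilter(num_bits, num_hashes)
    for key in differing_keys:
        bloom.add(key)
    # A bloom filter can report false positives but never false negatives,
    # so every differing key is kept; a few extra keys may be kept as well.
    return [key for key in partition_keys if key in bloom]
```

For example, keys_to_send([b"k1", b"k7"], all_partition_keys) always includes b"k1" and b"k7" if they are present in the partition, and may occasionally include a key that did not differ.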

Can fullsync replication run between clusters having rings of different sizes?

  • No

Can realtime replication run between clusters having different n values?

  • Yes, but very inefficiently

How can it be determined if backpressure is being applied during replication?

  • Look at the netstat -a output and check whether Recv-Q and Send-Q are growing
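One way to make the "are the queues growing" check less manual is to capture netstat -a output twice and compare the queue sizes per connection. The sketch below assumes the Linux net-tools column layout (Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State); other platforms format the output differently. The function names parse_queues and growing_queues are hypothetical helpers, not part of Riak or netstat.

```python
def parse_queues(netstat_output):
    """Map (local, foreign) address pairs to (recv_q, send_q) for TCP rows."""
    queues = {}
    for line in netstat_output.splitlines():
        fields = line.split()
        # Assumed Linux layout: Proto Recv-Q Send-Q Local Foreign State
        if len(fields) >= 5 and fields[0].startswith("tcp"):
            try:
                recv_q, send_q = int(fields[1]), int(fields[2])
            except ValueError:
                continue  # not a data row in the assumed layout
            queues[(fields[3], fields[4])] = (recv_q, send_q)
    return queues


def growing_queues(earlier, later):
    """Return connections whose Recv-Q or Send-Q grew between two samples,
    mapped to the (recv_q, send_q) change."""
    grown = {}
    for conn, (recv_q, send_q) in later.items():
        if conn in earlier:
            old_recv, old_send = earlier[conn]
            if recv_q > old_recv or send_q > old_send:
                grown[conn] = (recv_q - old_recv, send_q - old_send)
    return grown
```

Feed it two captures taken a few seconds apart, for example growing_queues(parse_queues(first), parse_queues(second)). A single growing sample is only a hint; queues that keep growing across several samples are a stronger sign of backpressure.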

How can network performance between source and sink be measured?

  • Use iperf (run iperf -s on the sink and iperf -c <sink host> on the source)

Do replicated puts show up in riak-admin status stats?

  • Yes. Here are the realtime (rt) puts for a single object on a sink with n_val=3:
vnode_puts : 3
vnode_puts_total : 3
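To confirm this on a live sink, the stats can be captured before and after a replicated write and compared. The sketch below assumes the "name : value" line format shown above; parse_stats and stat_delta are hypothetical helper names, and only integer-valued stats are kept.

```python
def parse_stats(status_output):
    """Parse 'name : value' lines into a dict, keeping integer values only."""
    stats = {}
    for line in status_output.splitlines():
        name, sep, value = line.partition(" : ")
        if not sep:
            continue  # not a 'name : value' line
        try:
            stats[name.strip()] = int(value.strip())
        except ValueError:
            continue  # skip non-integer stats (strings, lists, floats)
    return stats


def stat_delta(before, after, name="vnode_puts_total"):
    """Return how much a counter changed between two status captures."""
    return after.get(name, 0) - before.get(name, 0)
```

With a capture taken before the write and another taken after it, stat_delta(parse_stats(before), parse_stats(after)) should come out to 3 for a single replicated object on a sink with n_val=3, matching the numbers above.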

Do bucket hooks on the sink fire?

What happens when the same keys are written into both source and sink?

  • be extra darn careful about resolving siblings on either side!
  • kittens die

How does repl work on EC2?