Allow using AOF (the collection of commands) instead of RDB for full synchronization #59
Comments
@soloestoy I've not looked much into AOF, so my question might be very naive. Wouldn't a user need to pay certain amount of performance penalty (would vary with |
@hpatro your question is very good. Perhaps I didn't express it clearly at the beginning, but I can explain it by answering your question. The AOF file used during full synchronization does not rely on the |
@soloestoy, I think it is a great idea to use a no-preamble AOF and I will definitely consider using it for the atomic slot migration work (#23).
Curious. Do we use the RESP protocol on full sync today? Going through https://github.com/valkey-io/valkey/blob/unstable/src/rdb.c#L3031, my impression is that we don't. If so, are we still bound by |
Hi @PingXie, I mean that when migration tools parse RDB, they have two choices:
|
Got it. This is not about the full sync but the migration tool. That makes sense. I think another benefit of using no-preamble AOF in either the full sync or slot migration is that we can easily achieve the non-blocking behavior. |
Some systems are stuck on Redis 6.2 because they need to support rolling downgrade. Would this feature allow a replica running Redis 6.2 to replicate from Valkey 8.x? Why do people need rolling downgrade? After a rolling upgrade, if anything is not perfect (CPU or memory usage, or some bug), the customer wants to downgrade again and take some time to investigate the problem. This includes systems where Valkey cluster nodes are just one part of a bigger system in which everything is upgraded together in the same rolling upgrade. |
Whenever we introduce new data structures or new encodings, we need to adjust the RDB's version and encoding format. Internally in the server, this is not an issue. However, changes to RDB create a substantial adaptation workload for external tools.
For example, data migration tools like redis-shake, which need to parse the RDB file during full synchronization, have to be adapted whenever the RDB changes (the RDB version as well as the many changes in storage structures). Such a tool needs to parse the RDB to extract key-value pairs and transform them into RESTORE commands before sending them to the target instance.
There are also many detailed issues to address. For example, a RESTORE command's payload cannot exceed 512MB by default (the `proto-max-bulk-len` limit), so for larger values the tool has to analyze the specific storage format, split the value, and reverse-engineer it into commands for replay (for example, a large hash is converted into multiple HSET commands).
Furthermore, instances on old versions cannot parse RDB from new versions and so cannot use the RESTORE command for data migration (some special scenarios may require rolling back to a previous version by using data migration tools).
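To illustrate the kind of splitting a migration tool has to do today, here is a minimal sketch in Python. It assumes the tool has already decoded the hash from the RDB into field/value pairs; `split_hash`, `encode_command`, and the `max_bytes` budget are hypothetical names for illustration, not part of redis-shake or the server.

```python
from typing import Dict, Iterator, List


def encode_command(*args: bytes) -> bytes:
    """Encode one command as a RESP array of bulk strings."""
    out = [b"*%d\r\n" % len(args)]
    for arg in args:
        out.append(b"$%d\r\n%s\r\n" % (len(arg), arg))
    return b"".join(out)


def split_hash(key: bytes, fields: Dict[bytes, bytes], max_bytes: int) -> Iterator[bytes]:
    """Yield HSET commands whose field/value bytes stay within max_bytes.

    max_bytes is a hypothetical per-command budget chosen by the tool.
    A single pair larger than the budget still gets its own command.
    """
    batch: List[bytes] = []
    size = 0
    for field, value in fields.items():
        pair_size = len(field) + len(value)
        # Flush the current batch before it would exceed the budget.
        if batch and size + pair_size > max_bytes:
            yield encode_command(b"HSET", key, *batch)
            batch, size = [], 0
        batch += [field, value]
        size += pair_size
    if batch:
        yield encode_command(b"HSET", key, *batch)
```

Every data type needs its own variant of this logic (lists, sets, sorted sets, streams, and so on), which is part of the adaptation cost described above.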
To solve these problems and to ensure that migration tools are not affected by changes in the RDB format, we can add a new method for full synchronization: using the AOF file (where AOF specifically refers to a collection of commands, not using an RDB preamble). While full synchronization between master and replica would still use the RDB file, data migration tools could declare the file format they wish to use during full synchronization via the REPLCONF command. By doing so, data migration tools can simply forward commands without needing to parse RDB, allowing the full synchronization data to be passed directly through to the target instance, thus simplifying the adaptation work.
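As a rough sketch of what the tool side could look like with a command-only AOF, the Python snippet below reads a stream of RESP-encoded commands and yields each one as a list of arguments. `read_commands` is a hypothetical helper; it assumes the stream contains only RESP arrays of bulk strings and does not handle any non-command lines. The REPLCONF option used to request this format is not specified in this proposal, so it is left out.

```python
import io
from typing import BinaryIO, Iterator, List


def read_commands(stream: BinaryIO) -> Iterator[List[bytes]]:
    """Yield each command in a RESP command stream as a list of arguments."""
    while True:
        header = stream.readline()
        if not header:
            return  # clean end of stream
        if not header.startswith(b"*"):
            raise ValueError("expected RESP array header, got %r" % header)
        argc = int(header[1:].strip())
        args: List[bytes] = []
        for _ in range(argc):
            length_line = stream.readline()
            if not length_line.startswith(b"$"):
                raise ValueError("expected bulk string header, got %r" % length_line)
            length = int(length_line[1:].strip())
            data = stream.read(length + 2)  # payload plus trailing CRLF
            if len(data) != length + 2:
                raise ValueError("truncated bulk string")
            args.append(data[:length])
        yield args


# Usage: parse two commands from an in-memory stream.
sample = b"*3\r\n$3\r\nSET\r\n$1\r\nk\r\n$1\r\nv\r\n*2\r\n$3\r\nDEL\r\n$1\r\nk\r\n"
for command in read_commands(io.BytesIO(sample)):
    print(command)
```

A tool that only forwards data would not even need this parsing step and could relay the bytes unchanged. Parsing is only needed when the tool wants to filter or rewrite commands, and it depends on the command protocol rather than on the RDB version.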