|
| 1 | +--- |
| 2 | +start_date: 2022-12-12 |
| 3 | +rfc_pr: https://github.com/kadalu/rfcs/pull/26 |
| 4 | +status: SUBMITTED |
| 5 | +available_since: (leave this empty) |
| 6 | +--- |
| 7 | + |
| 8 | += Kadalu Storage Volume Rebalance |
| 9 | + |
| 10 | +When a Kadalu Storage Volume expands or shrinks, each Storage unit is assigned a new hash range. Fixing the layout of the new Storage units allows new files to be hashed to them. |
| 11 | + |
| 12 | +After new distribute groups are added, the existing files are no longer uniformly distributed. Some of the files on the existing bricks may belong to the newly added Storage units. Rebalance recalculates the hash of each file and moves the file to the Storage unit it belongs to. |
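
A minimal sketch of why existing files can end up on the wrong Storage unit after an expansion. It assumes the 32-bit hash space is split evenly among distribute groups and uses `zlib.crc32` as a stand-in hash. The names `layout` and `owner` are hypothetical, and the hash function and range assignment used by GlusterFS differ from this simplification.

```python
import zlib

HASH_SPACE = 2 ** 32  # assumption: a 32-bit hash space


def layout(groups):
    """Split the hash space into contiguous, equal ranges, one per group."""
    step = HASH_SPACE // len(groups)
    ranges = {}
    for index, group in enumerate(groups):
        start = index * step
        # The last group absorbs any remainder so the ranges cover the space.
        end = HASH_SPACE - 1 if index == len(groups) - 1 else start + step - 1
        ranges[group] = (start, end)
    return ranges


def owner(file_name, ranges):
    """Return the group whose range contains the hash of the file name."""
    # crc32 is only a stand-in for the real distribute hash.
    file_hash = zlib.crc32(file_name.encode("utf-8")) & 0xFFFFFFFF
    for group, (start, end) in ranges.items():
        if start <= file_hash <= end:
            return group
    raise ValueError("layout does not cover the hash space")


before = layout(["group1"])
after = layout(["group1", "group2"])
names = [f"file-{i}.txt" for i in range(1000)]
# Files whose owner changed were written to group1 but now hash to group2.
misplaced = [n for n in names if owner(n, after) != owner(n, before)]
print(f"{len(misplaced)} of {len(names)} files now hash to the new group")
```

Fixing the layout corresponds to moving from `before` to `after`; moving the `misplaced` files is the job of the files rebalance.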
| 13 | + |
| 14 | +== Fix layout Rebalance (Moana based) |
| 15 | + |
| 16 | +The fix layout rebalance is triggered automatically after a Kadalu Storage Volume is expanded (or while it is being shrunk). Check the status of the fix layout by running the `volume expand status` command. |
| 17 | + |
| 18 | +---- |
| 19 | +kadalu volume create DEV/myvol \ |
| 20 | + replica node1.example.com:/exports/myvol/s1 \ |
| 21 | + node2.example.com:/exports/myvol/s2 \ |
| 22 | + node3.example.com:/exports/myvol/s3 |
| 23 | + |
| 24 | +kadalu volume expand DEV/myvol \ |
| 25 | + replica node4.example.com:/exports/myvol/s4 \ |
| 26 | + node5.example.com:/exports/myvol/s5 \ |
| 27 | + node6.example.com:/exports/myvol/s6 |
| 28 | + |
| 29 | +---- |
| 30 | + |
| 31 | +---- |
| 32 | +kadalu volume expand DEV/myvol status |
| 33 | +---- |
| 34 | + |
| 35 | +The Volume expand API starts a service on the first node in the list of participating nodes (this may change later to take node online status into account). Example of the new service: |
| 36 | + |
| 37 | +[source,crystal] |
| 38 | +---- |
| 39 | +enum RebalanceCmd |
| 40 | + None |
| 41 | + Start |
| 42 | + Stop |
| 43 | + Status |
| 44 | + FixLayoutStart |
| 45 | + StartForce |
| 46 | +end |
| 47 | +---- |
| 48 | + |
| 49 | +---- |
| 50 | +glusterfs -s <volfile-server> --volfile-id <volname> --process-name rebalance \ |
| 51 | + --xlator-option "*distribute.use-readdirp=yes" \ |
| 52 | + --xlator-option "*distribute.lookup-unhashed=yes" \ |
| 53 | + --xlator-option "*distribute.assert-no-child-down=yes" \ |
| 54 | + --xlator-option "replicate.data-self-heal=off" \ |
| 55 | + --xlator-option "replicate.metadata-self-heal=off" \ |
| 56 | + --xlator-option "replicate.entry-self-heal=off" \ |
| 57 | + --xlator-option "*distribute.readdir-optimize=on" \ |
| 58 | + --xlator-option "*distribute.rebalance-cmd=<RebalanceCmd.FixLayoutStart>" \ |
| 59 | + --xlator-option "*distribute.node-uuid=<node-id>" \ |
| 60 | + --xlator-option "*distribute.commit-hash=<unique-id>" \ |
| 61 | + -p <pid-file> \ |
| 62 | + --socket-file <socket-file> \ |
| 63 | + -l <logfile> |
| 64 | +---- |
| 65 | + |
| 66 | +== Fix layout Rebalance (Operator based) |
| 67 | + |
| 68 | +The Operator parses the CRD modification and detects that a Storage Pool expansion has been triggered. After starting all the new Server pods, it starts a Job that triggers the fix layout rebalance. `kubectl-kadalu` can report the status of the rebalance by checking the status of that Job. |
| 69 | + |
| 70 | +The Job runs the example command shown above. |
| 71 | + |
| 72 | +== Files Rebalance (Moana based) |
| 73 | + |
| 74 | +This is a manual process that admins trigger as needed (for example, when the existing Storage units run out of space, or when Cluster usage is minimal). |
| 75 | + |
| 76 | +The `volume expand` command also prints a note with the steps to run the Rebalance command. Rebalance distributes the existing files uniformly among the Storage units. |
| 77 | + |
| 78 | +---- |
| 79 | +kadalu volume rebalance-start <POOL>/<VOLUME> |
| 80 | +---- |
| 81 | + |
| 82 | +Stopping and then starting the rebalance restarts the process from the beginning; the previous progress is not remembered. |
| 83 | + |
| 84 | +---- |
| 85 | +kadalu volume rebalance-stop <POOL>/<VOLUME> |
| 86 | +---- |
| 87 | + |
| 88 | +The `rebalance-start` command calls the respective API, which internally starts a service on the relevant Storage unit nodes. **Note**: These services halt once the Rebalance process completes its job. Do not start the service on every node of the Volume; start it only on the node of the first Storage unit of each distribute group. |
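
The node selection rule above can be sketched as follows, using the Storage units from the earlier `volume create` and `volume expand` example. The function name `rebalance_targets` and the list-of-lists representation of distribute groups are assumptions made for illustration, not the Kadalu data model.

```python
def rebalance_targets(distribute_groups):
    """Pick the first Storage unit of each distribute group."""
    return [group[0] for group in distribute_groups if group]


# Two replica-3 distribute groups, as in the expand example.
volume = [
    [
        "node1.example.com:/exports/myvol/s1",
        "node2.example.com:/exports/myvol/s2",
        "node3.example.com:/exports/myvol/s3",
    ],
    [
        "node4.example.com:/exports/myvol/s4",
        "node5.example.com:/exports/myvol/s5",
        "node6.example.com:/exports/myvol/s6",
    ],
]

for unit in rebalance_targets(volume):
    node, path = unit.split(":", 1)
    print(f"start rebalance service on {node} for {path}")
```

Storage units within a replica group hold the same files, so crawling one unit per group is presumably enough to visit every file once.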
| 89 | + |
| 90 | +The service on each node mounts the Volume locally and runs the command below (this sub-command is covered in more detail in the "Tool to handle the rebalance" section). |
| 91 | + |
| 92 | +---- |
| 93 | +kadalu rebalance-process <storage-unit-path> <mount-path> |
| 94 | +---- |
| 95 | + |
| 96 | + |
| 97 | +== Files Rebalance (Operator based) |
| 98 | + |
| 99 | +Introduce a new CRD to start the rebalance process. |
| 100 | + |
| 101 | +[source,yaml] |
| 102 | +---- |
| 103 | +# File: rebalance-storage-pool-1.yaml |
| 104 | +--- |
| 105 | +apiVersion: kadalu-operator.storage/v1alpha1 |
| 106 | +kind: KadaluStorageRebalance |
| 107 | +metadata: |
| 108 | + # This will be used as Job name |
| 109 | + name: rebal-sp1-dec-2022 |
| 110 | +spec: |
| 111 | + pool: storage-pool1 |
| 112 | +---- |
| 113 | + |
| 114 | +The Operator reads the Storage pool info from the ConfigMap and identifies the first Storage unit (Brick) of each distribute group. Based on the node affinity of each of these Storage units, the Operator starts a Rebalance Job (see the next section). |
| 115 | + |
| 116 | +Delete this custom resource to stop the Rebalance job. |
| 117 | + |
| 118 | +`kubectl-kadalu` collects the status from each Job and shows the aggregated status. |
| 119 | + |
| 120 | +---- |
| 121 | +kubectl kadalu rebalance-status <Pool-name> |
| 122 | +---- |
| 123 | + |
| 124 | +== Tool to handle the rebalance |
| 125 | + |
| 126 | +This tool crawls the Volume from the backend (Storage unit or Brick path) and, for each file, sets a virtual xattr on that file through the mount path. GlusterFS implements this virtual xattr and handles the rebalance of the file. The tool first captures the backend usage by running the `du` command and then estimates the progress after each file is rebalanced. |
| 127 | + |
| 128 | +---- |
| 129 | +kadalu rebalance-process <storage-unit-path> <mount-path> |
| 130 | +---- |
| 131 | + |
| 132 | +**Note**: This tool is a work in progress. This doc will be updated once the tool is available. |
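
Until the tool is available, the crawl described in this section can be sketched roughly as below. This is not the Kadalu implementation. The xattr key in `MIGRATE_XATTR` and its value are assumptions (the RFC does not name the virtual xattr), the function names are hypothetical, and the sketch sums file sizes instead of running `du`. The `trigger` parameter exists so that the xattr call can be replaced during a dry run.

```python
import os
from typing import Callable, Iterator, List, Optional, Tuple

# Assumption: placeholder key and value; the RFC does not name the virtual xattr.
MIGRATE_XATTR = "distribute.migrate-data"
MIGRATE_VALUE = b"force"
# GlusterFS keeps internal metadata under .glusterfs on each brick.
SKIP_DIRS = {".glusterfs"}


def backend_files(storage_unit_path: str) -> Iterator[Tuple[str, int]]:
    """Yield (relative path, size in bytes) for regular files on the backend."""
    for root, dirs, files in os.walk(storage_unit_path):
        # Prune internal directories and keep a stable crawl order.
        dirs[:] = sorted(d for d in dirs if d not in SKIP_DIRS)
        for name in sorted(files):
            full = os.path.join(root, name)
            if os.path.islink(full) or not os.path.isfile(full):
                continue
            yield os.path.relpath(full, storage_unit_path), os.lstat(full).st_size


def rebalance(
    storage_unit_path: str,
    mount_path: str,
    trigger: Optional[Callable[[str], None]] = None,
) -> List[float]:
    """Trigger migration of each backend file through the mount.

    Returns the estimated progress (0.0 to 1.0) after each file.
    """
    if trigger is None:
        # Linux only: set the xattr on the file as seen through the mount.
        def trigger(path: str) -> None:
            os.setxattr(path, MIGRATE_XATTR, MIGRATE_VALUE)

    entries = list(backend_files(storage_unit_path))
    total = sum(size for _, size in entries)
    done = 0
    progress = []
    for relative_path, size in entries:
        trigger(os.path.join(mount_path, relative_path))
        done += size
        progress.append(done / total if total else 1.0)
    return progress
```

A dry run that only prints the paths it would touch is `rebalance("/exports/myvol/s1", "/mnt/myvol", trigger=print)`, where the mount path is an example value. The sketch does not handle errors from individual files or skip distribute link files on the brick, both of which a real tool would need.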