DeComFL is a library for training and fine-tuning deep learning models in federated learning scenarios. Its distinguishing feature is its use of zeroth-order optimization, which limits the communication between server and clients to just a few scalars per round, irrespective of the model's size. This dimension-free communication is the inspiration behind the library's name.
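Concretely, because every perturbation direction can be regenerated from a pseudo-random seed that the server and clients share, a client only has to send back the scalar directional derivative along that direction. Below is a toy PyTorch sketch of the idea (not the library's actual API; `perturbation`, `loss`, and all constants are illustrative):

```python
import torch

# Toy illustration of the "dimension-free" idea (not the DeComFL API):
# a pseudo-random perturbation can be regenerated from its seed, so only
# the seed and one scalar ever need to cross the network.

def perturbation(seed: int, dim: int) -> torch.Tensor:
    gen = torch.Generator().manual_seed(seed)
    return torch.randn(dim, generator=gen)

dim, mu, lr, seed = 10_000, 1e-3, 1e-5, 42
loss = lambda w: (w ** 2).sum()          # stand-in for the real training loss

# --- client side: shared seed in, a single float out ---
theta = torch.zeros(dim)                 # model replica, kept in sync via seeds
z = perturbation(seed, dim)
g = ((loss(theta + mu * z) - loss(theta)) / mu).item()  # directional derivative

# --- server side: reconstruct the full-dimensional update from (seed, g) ---
theta -= lr * g * perturbation(seed, dim)
```

The payload per round is a handful of floats, whether the model has thousands or billions of parameters.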
From Tables 1 and 2, we observe DeComFL's effectiveness in reducing communication cost. We evaluate its performance with five and ten perturbations. It matches or even outperforms MeZO and FedZO on all datasets. Remarkably, DeComFL requires only about 1 MB of communication cost to converge, a significant saving compared with the other algorithms.
**Table 1: Test accuracy and total communication cost (in parentheses).**

| Model | Dataset / Task | MeZO | FedZO with P = 5 | DeComFL with P = 5 | DeComFL with P = 10 |
| --- | --- | --- | --- | --- | --- |
| OPT-125M | SST-2 | 83.99% | 84.11% (0.27 TB) | 84.02% (0.18 MB) | 85.08% (0.36 MB) |
| | CB | 72.49% | 73.97% (0.09 TB) | 74.28% (0.06 MB) | 75.00% (0.12 MB) |
| | WSC | 55.18% | 59.43% (0.27 TB) | 59.13% (0.18 MB) | 59.59% (0.36 MB) |
| | WIC | 53.25% | 53.31% (0.27 TB) | 53.28% (0.18 MB) | 53.38% (0.36 MB) |
| | RTE | 52.91% | 53.42% (0.18 TB) | 54.33% (0.12 MB) | 57.05% (0.24 MB) |
| | BoolQ | 61.46% | 61.20% (0.18 TB) | 61.36% (0.12 MB) | 61.60% (0.24 MB) |
| OPT-1.3B | SST-2 | 90.23% | 90.17% (1937.15 TB) | 90.02% (0.12 MB) | 90.78% (0.24 MB) |
| | CB | 74.01% | 74.41% (2905.73 TB) | 74.40% (0.18 MB) | 75.71% (0.36 MB) |
| | WSC | 58.21% | 59.95% (2905.73 TB) | 60.41% (0.18 MB) | 64.16% (0.36 MB) |
| | WIC | 55.95% | 56.06% (1937.15 TB) | 55.97% (0.12 MB) | 56.14% (0.24 MB) |
| | RTE | 57.57% | 58.88% (1452.86 TB) | 59.42% (0.90 MB) | 60.89% (1.80 MB) |
| | BoolQ | 61.98% | 62.01% (1452.86 TB) | 62.17% (0.90 MB) | 62.50% (1.80 MB) |
**Table 2: Test accuracy and total communication cost (in parentheses) on OPT-125M.**

| Model | Dataset / Task | MeZO | FedZO with P = 5 | DeComFL with P = 5 | DeComFL with P = 10 |
| --- | --- | --- | --- | --- | --- |
| OPT-125M | SST-2 | 85.07% | 85.34% (279.40 TB) | 85.42% (0.18 MB) | 85.44% (0.36 MB) |
| | CB | 69.64% | 70.55% (93.13 TB) | 71.07% (0.06 MB) | 71.43% (0.12 MB) |
| | WSC | 52.66% | 54.61% (93.13 TB) | 54.53% (0.06 MB) | 57.03% (0.12 MB) |
| | WIC | 53.49% | 53.12% (186.26 TB) | 53.08% (0.12 MB) | 53.71% (0.24 MB) |
| | RTE | 50.15% | 50.92% (46.57 TB) | 51.40% (0.03 MB) | 51.40% (0.06 MB) |
| | BoolQ | 60.68% | 60.53% (46.57 TB) | 60.12% (0.03 MB) | 60.78% (0.06 MB) |
We use conda as our cross-platform environment management tool. However, because macOS lacks CUDA support, we provide two environment setup files:

- Use `environment.yml` on macOS or if you do not have CUDA available.
- Use `environment_cuda.yml` otherwise.

Throughout this README, we refer to `environment.yml` whenever an environment file is needed.
1. Make sure `conda` is available; see https://conda.io/projects/conda/en/latest/user-guide/install/index.html for details.
2. At the root of this repo, run `conda env create -f environment.yml -y`.
3. Once installation is finished, run `conda activate decomfl` to use the created virtual env.
4. (Optional) If you see a message like `conda init before activate`, run `conda init`, restart your terminal/PowerShell, then repeat step 3.
5. Run any command provided in the Run Experiments section (a quick Python sanity check is also shown below). If the code works, congratulations, you have successfully set up the environment for this repo!
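After step 3, you can optionally sanity-check the environment from Python (this assumes PyTorch is installed by the environment files; adjust if your setup differs):

```python
# Minimal environment sanity check (assumes PyTorch is in the conda env).
import torch

print("torch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())  # expect True only with environment_cuda.yml on a CUDA machine
```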
- Run zeroth-order random gradient estimation (RGE) + SGD training. Trains a model using ZO RGE (a sketch of the estimator follows this list). Usage example:

  ```bash
  python zo_rge_main.py --dataset=cifar10 --num-pert=10 --lr=1e-6 --mu=1e-3
  ```
- Run DeComFL. Follows the FL routine: splits the data into chunks and trains on different clients (a toy sketch of one round follows this list). Usage example:

  ```bash
  python decomfl_main.py --large-model=opt-125m --dataset=sst2 --iterations=1000 --train-batch-size=32 --test-batch-size=200 --eval-iterations=25 --num-clients=3 --num-sample-clients=2 --local-update-steps=1 --num-pert=5 --lr=1e-5 --mu=1e-3 --grad-estimate-method=rge-forward
  ```
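For reference, the forward-difference random gradient estimator selected by `--grad-estimate-method=rge-forward` works roughly as sketched below (a minimal PyTorch sketch, not the repo's actual optimizer; `loss_fn` and the toy parameters are illustrative):

```python
import torch

# Forward-difference RGE, averaged over P random perturbations:
# grad ≈ (1/P) * sum_p [(L(theta + mu*z_p) - L(theta)) / mu] * z_p
def rge_forward(loss_fn, theta: torch.Tensor, num_pert: int = 10, mu: float = 1e-3) -> torch.Tensor:
    grad_est = torch.zeros_like(theta)
    base_loss = loss_fn(theta)                 # one unperturbed evaluation
    for _ in range(num_pert):
        z = torch.randn_like(theta)            # random direction
        directional = (loss_fn(theta + mu * z) - base_loss) / mu
        grad_est += directional * z
    return grad_est / num_pert

# SGD step with the zeroth-order estimate (lr and mu match the CLI flags above).
theta = torch.randn(100)
loss_fn = lambda w: (w ** 2).sum()             # stand-in for a real training loss
theta -= 1e-6 * rge_forward(loss_fn, theta, num_pert=10, mu=1e-3)
```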
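And one DeComFL communication round, matching the flags above (3 clients, 2 sampled per round, 1 local step, 5 perturbations), can be pictured as follows (a toy sketch with made-up data and loss, not the library's implementation):

```python
import random
import torch

dim, P, mu, lr = 1_000, 5, 1e-3, 1e-5
theta = torch.zeros(dim)                              # global model replica
clients = [torch.randn(dim) for _ in range(3)]        # per-client data chunks (toy)

def z(seed: int) -> torch.Tensor:
    # Regenerate the same perturbation anywhere from its seed.
    return torch.randn(dim, generator=torch.Generator().manual_seed(seed))

seeds = [random.randrange(2**31) for _ in range(P)]   # server broadcasts P seeds
sampled = random.sample(range(3), 2)                  # num-sample-clients = 2

# Each sampled client returns only P scalars (its directional derivatives).
def client_scalars(data: torch.Tensor) -> list[float]:
    loss = lambda w: ((w - data) ** 2).mean()
    return [((loss(theta + mu * z(s)) - loss(theta)) / mu).item() for s in seeds]

all_scalars = [client_scalars(clients[i]) for i in sampled]
avg = [sum(col) / len(col) for col in zip(*all_scalars)]  # server averages scalars

# Everyone applies the same update, reconstructed from seeds + averaged scalars.
for s, g in zip(seeds, avg):
    theta -= lr * g * z(s)
```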
@article{li2024achieving,
title={Achieving Dimension-Free Communication in Federated Learning via Zeroth-Order Optimization},
author={Li, Zhe and Ying, Bicheng and Liu, Zidong and Dong, Chaosheng and Yang, Haibo},
journal={arXiv preprint arXiv:2405.15861},
year={2024}
}
DeComFL is currently developed and maintained by Zidong Liu (ComboCurve), Bicheng Ying (Google), and Zhe Li (RIT), and advised by Prof. Haibo Yang (RIT).