uw-mad-dash/GradCompressionUtility

Setup

  • We provide a public AMI, ami-0eea3ad7fabaa0125, with all dependencies installed.
  • To write results automatically to an S3 bucket, create an IAM role that allows EC2 to write to S3. Details can be found here. If the role is not provided, the code writes to local disk instead.
  • The code can be launched with minor configuration changes in launch_ec2_run_commands.py.
  • If you use the provided AMI, no other changes are needed.
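
The S3-or-local-disk behaviour described above can be sketched as follows. This is a minimal illustration, not the repository's implementation: the function name `save_results`, the JSON format, and the dummy values are assumptions. It relies on boto3 picking up credentials from the instance's IAM role when one is attached.

```python
import json
import os
import tempfile


def save_results(results, filename, local_dir, bucket=None):
    """Write results to local disk, then try S3 if a bucket is given.

    Hypothetical helper for illustration; not part of the repository.
    Returns the location where the results ended up.
    """
    os.makedirs(local_dir, exist_ok=True)
    local_path = os.path.join(local_dir, filename)
    with open(local_path, "w") as f:
        json.dump(results, f)

    if bucket is None:
        return local_path  # no bucket configured: local disk only

    try:
        # Imported lazily so the local-only path has no AWS dependency.
        import boto3

        boto3.client("s3").upload_file(local_path, bucket, filename)
        return f"s3://{bucket}/{filename}"
    except Exception as exc:  # boto3 missing, no IAM role, access denied, ...
        print(f"S3 upload skipped ({exc}); results kept at {local_path}")
        return local_path


# Usage with dummy values: no bucket is given, so the file stays local.
out_dir = tempfile.mkdtemp()
path = save_results({"epoch": 1, "loss": 0.5}, "run0.json", out_dir)
```

Writing locally first means a failed upload never loses results, which matches the fallback the setup notes describe.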

Running the code

Running the code with the AMI: The code borrows much of its structure from Thijs Vogels's code for PowerSGD.

  • To launch EC2 instances automatically, see launch_ec2_run_commands.py.
  • Ideally, you only need to provide your SSH key and the SSH key_path.
  • You also need to provide the bash file that will eventually launch the code.
  • We have already provided the bash file and added the run commands on line 127 of launch_ec2_run_commands.py.
  • See run_ddp.sh and sh for models that use the ImageNet dataset.
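
For orientation, an EC2 launch request for the provided AMI might be assembled as below. This is a sketch under stated assumptions, not the contents of launch_ec2_run_commands.py: the helper `build_launch_request`, the key name `my-key`, the node count, and the instance type are all placeholders. Only the AMI ID comes from this README.

```python
AMI_ID = "ami-0eea3ad7fabaa0125"  # public AMI named in the Setup section


def build_launch_request(key_name, num_nodes, instance_type="p3.8xlarge"):
    """Return keyword arguments for boto3's EC2 run_instances call.

    Hypothetical helper; the instance type default is a placeholder and
    is not taken from the repository.
    """
    if num_nodes < 1:
        raise ValueError("num_nodes must be at least 1")
    return {
        "ImageId": AMI_ID,
        "InstanceType": instance_type,
        "KeyName": key_name,  # name of your EC2 key pair
        "MinCount": num_nodes,
        "MaxCount": num_nodes,
    }


request = build_launch_request(key_name="my-key", num_nodes=4)

# With AWS credentials configured, the request could be submitted as:
#   import boto3
#   boto3.client("ec2").run_instances(**request)
```

AMI IDs are region-specific, so the client would need to target the region where the AMI is published.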

Running code without the AMI:

  • To launch the code without the AMI, install PyTorch 1.8.1+cu111 and the bit2byte extension.
  • Then launch run_ddp.sh manually on each node with the appropriate parameters.
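
Before launching on a manually configured node, it can help to confirm that the installed PyTorch build matches the one named above. The check below is a small sketch; the helper name `check_torch_build` is an assumption, and it works on a plain version string so it can be tried without PyTorch installed.

```python
EXPECTED_TORCH = "1.8.1+cu111"  # build named in this README


def check_torch_build(version, expected=EXPECTED_TORCH):
    """Compare a torch version string with the expected build.

    Hypothetical helper; returns (ok, message).
    """
    release, _, local = version.strip().partition("+")
    exp_release, _, exp_local = expected.partition("+")
    if release != exp_release:
        return False, f"PyTorch release {release} found, {exp_release} expected"
    if local != exp_local:
        return False, f"CUDA tag {local or 'none'} found, {exp_local} expected"
    return True, f"PyTorch {version.strip()} matches"


# On a node with PyTorch installed, the check could be run as:
#   import torch
#   ok, message = check_torch_build(torch.__version__)
#   print(message)
```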

The Sogou dataset is available here.

Models available at:

BERT-Base, Uncased

BERT-Base, Chinese

Acknowledgement

The code borrows much of its structure from the code for PowerSGD. We would like to thank the authors of PowerSGD for providing the code.

Cite

@article{agarwal2021utility,
  title={On the utility of gradient compression in distributed training systems},
  author={Agarwal, Saurabh and Wang, Hongyi and Venkataraman, Shivaram and Papailiopoulos, Dimitris},
  journal={arXiv preprint arXiv:2103.00543},
  year={2021}
}
