nicolaswilde/thread-block-scheduling

Reproducibility Submission of pap223 & apdx190

Experiment Workflow

1 Prepare Hardware & Software Environments

You need a computer or server running Ubuntu 20.04 with CUDA 11.3, Nsight Systems, and a V100, RTX2080Ti, A100, or RTX3090 GPU.
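Before running anything, it can help to confirm that the required tools are actually reachable. The following is a minimal sketch, not part of the repository: it assumes the tools are exposed on PATH under their usual command names (nvidia-smi, nvcc, nsys), and the helper names find_tools and gpu_names are hypothetical.

```python
import shutil
import subprocess

# Command names assumed to be on PATH once the driver, CUDA toolkit,
# and Nsight Systems are installed.
REQUIRED_TOOLS = ("nvidia-smi", "nvcc", "nsys")


def find_tools(names=REQUIRED_TOOLS):
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


def gpu_names():
    """Return the GPU names reported by nvidia-smi, or [] if the query fails."""
    smi = shutil.which("nvidia-smi")
    if smi is None:
        return []
    try:
        proc = subprocess.run(
            [smi, "--query-gpu=name", "--format=csv,noheader"],
            capture_output=True, text=True, timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        return []
    if proc.returncode != 0:
        return []
    return [line.strip() for line in proc.stdout.splitlines() if line.strip()]


if __name__ == "__main__":
    for name, path in find_tools().items():
        print(f"{name}: {path if path else 'NOT FOUND'}")
    print("GPUs:", gpu_names() or "none detected")
```

Any tool reported as NOT FOUND needs to be installed (or added to PATH) before moving on to the next step.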

If local GPUs are not available, you can rent the GPU servers used in this work:

  • Log in to https://en.gpushare.com/ and navigate to console -> Instance and Data -> My instance -> Create instance.
  • Select the GPU you want.
  • Select Instance image -> Official image -> PyTorch 1.12.0 -> Python 3.8 -> Cuda 11.3.
  • You can then access the GPU server through ssh.

CUDA 11.3 is already installed on the GPU servers, but you still need to install Nsight Systems:

2 Execute the code

3 Check the Experiment Results

After execution, some information is printed to the terminal, and the profiling files are written to the result directory. If you are using a rented GPU server, download the profiling files to your local computer and open them with a host (GUI) version of Nsight Systems.
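To see which profiling files were produced before downloading them, a short listing script can be useful. This is a sketch under two assumptions: the reports carry the .nsys-rep extension, as the files in the authors' my-result directory do, and the script is run from the directory that contains result. The helper name list_reports is hypothetical.

```python
from pathlib import Path


def list_reports(result_dir="result"):
    """Return the .nsys-rep files under result_dir (searched recursively), sorted by path."""
    root = Path(result_dir)
    if not root.is_dir():
        return []
    return sorted(root.rglob("*.nsys-rep"))


if __name__ == "__main__":
    reports = list_reports()
    if not reports:
        print("No .nsys-rep files found under ./result")
    for report in reports:
        size_mib = report.stat().st_size / (1024 * 1024)
        print(f"{report}  ({size_mib:.1f} MiB)")
```

The printed paths are the files to copy to your local machine and open in the Nsight Systems GUI.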

Following the workflow above, the authors provide the experiment results on V100, RTX2080Ti, A100, and RTX3090 at https://github.com/nicolaswilde/thread-block-scheduling/tree/main/my-result. You can check the log files and the nsys-rep files.
