You need computers or servers with Ubuntu 20.04, CUDA 11.3, Nsight Systems, and V100/RTX2080Ti/A100/RTX3090 GPUs.
If local GPUs are not available, you can rent the GPU servers used in this work:
- log in to https://en.gpushare.com/ and navigate to console -> Instance and Data -> My instance -> Create instance
- select the GPU you want
- select Instance image -> Official image -> PyTorch 1.12.0 -> Python 3.8 -> Cuda 11.3
- then you can access the GPU server through ssh (a sample login and environment check is sketched below)
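Assuming the rented instance is reachable over ssh (the port and address below are placeholders shown in the gpushare console, not values from this work), you can log in and confirm the environment matches the requirements above:
ssh -p <port> root@<instance-address>   # connection details are listed in the gpushare console
nvidia-smi                              # check the GPU model and driver
nvcc --version                          # should report CUDA 11.3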
CUDA 11.3 is already installed on the GPU servers, but you still need to install Nsight Systems:
- download the Nsight Systems installation package from https://developer.nvidia.com/downloads/assets/tools/secure/nsight-systems/2023_2/nsightsystems-linux-cli-public-2023.2.1.122-3259852.deb/ to your local computer
- if the above link is no longer valid (the version has been updated), click Download on https://developer.nvidia.com/nsight-systems/get-started and download the installation package (Linux CLI only, .deb installer) to your local computer
- upload the Nsight Systems installation package to the GPU server through scp (a sample command is shown after these installation steps)
- install the required dependency:
sudo apt-get install libglib2.0-0:amd64
- install Nsight Systems:
sudo dpkg -i nsightsystems-linux-cli-public-{installation-package-version-number}.deb
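The scp upload mentioned above can be done from your local computer; the port, address, and destination path below are placeholders taken from the gpushare console, not fixed values from this work:
scp -P <port> nsightsystems-linux-cli-public-2023.2.1.122-3259852.deb root@<instance-address>:~/
After running the dpkg command on the server, you can confirm the installation with:
nsys --version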
- get the code from https://github.com/nicolaswilde/thread-block-scheduling
- upload the code to the GPU servers (an end-to-end example is sketched after the build commands below)
- execute:
make GPGPU=V100       # on V100
make GPGPU=RTX2080TI  # on RTX2080Ti
make GPGPU=A100       # on A100
make GPGPU=RTX3090    # on RTX3090
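Putting the last three steps together, an end-to-end run might look like the following sketch; the port, server address, and remote paths are placeholders, and the exact terminal output depends on the repository's Makefile:
# on your local computer: fetch the code and copy it to the GPU server
git clone https://github.com/nicolaswilde/thread-block-scheduling.git
scp -P <port> -r thread-block-scheduling root@<instance-address>:~/
# on the GPU server: build and profile (example for an A100 instance)
cd ~/thread-block-scheduling
make GPGPU=A100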
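If you want to re-profile an individual benchmark outside of the Makefile, a typical Nsight Systems CLI invocation looks like the sketch below; ./some_benchmark and the report name are placeholders rather than binaries from the repository:
nsys profile --stats=true -o result/some_benchmark ./some_benchmark   # writes result/some_benchmark.nsys-rep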
After execution, some information is printed on the terminal, and the profiling files are placed in the result directory. You can download the profiling files to your local computer (if you are using remote GPU servers) and open them with a host (GUI) version of Nsight Systems.
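Assuming you are working on a rented server, downloading and opening the reports might look like the following sketch; the port, address, and remote path are placeholders, and nsys-ui is the GUI launcher shipped with the Linux host version of Nsight Systems:
scp -P <port> -r root@<instance-address>:~/thread-block-scheduling/result ./result   # download the profiling files
nsys-ui ./result/<some-report>.nsys-rep                                              # open a report in the Nsight Systems GUI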
Following the workflow above, the authors provide the experiment results on V100, RTX2080Ti, A100, and RTX3090 at https://github.com/nicolaswilde/thread-block-scheduling/tree/main/my-result. You can check the log files and the nsys-rep files.