Manual
This guide contains a detailed description of the procedure for launching the main scenario of the DLI benchmarking system on remote computing nodes in Docker containers. The example uses the Intel Distribution of OpenVINO toolkit 2022.1 to infer deep models. Note that these instructions are also valid for the other supported frameworks.
This section contains a step-by-step description of the procedure for deploying the DLI system on remote computing nodes using the deployment module included in the system. At the moment, the DLI system supports two modes of launching experiments: directly in the current environment, or inside the corresponding Docker container. If the system has already been deployed, you can skip this step.
- Manually deploy the environment on the computing nodes. On each computing node, execute the commands from the corresponding Dockerfile, which is stored in the docker directory. After that, the DLI system is deployed on each machine in the current environment; a sketch of this step is given below.
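For illustration only, a minimal sketch of what such a manual deployment might look like on a Debian-based node. The package list and the pinned OpenVINO version are assumptions made for this example; the authoritative command sequence is the one in the corresponding Dockerfile.
# Assumed prerequisites; consult the Dockerfile for the exact list.
sudo apt-get update && sudo apt-get install -y python3 python3-pip git
# Pinned to match the toolkit version used in this guide (assumption).
python3 -m pip install openvino==2022.1.0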
- Prepare a Docker image of the system, which will be remotely deployed on the computing nodes. Go to the directory with the Dockerfiles and build the required image, following the instructions located in the corresponding docker directory and specifying all the necessary ARG variables from the corresponding Dockerfile. Save the image to an archive, which will later be copied to the required remote nodes.
docker build -t openvino:2022.1 --build-arg DATASET_DOWNLOAD_LINK=<path> .
docker save openvino:2022.1 > openvino_2022.1.tar
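If you want to sanity-check the archive before handing it to the deployment script, it can be loaded back on any machine with standard Docker commands (an optional check, not part of the DLI workflow itself):
docker load < openvino_2022.1.tar
docker images | grep openvino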
- Prepare the computing nodes for executing experiments. For convenience, we assume that on each machine, for the user itmm, a directory /home/itmm/validation is created, which contains all the necessary files: the DLI system repository and the OpenVINO™ Toolkit - Open Model Zoo repository. In addition to cloning the repositories into this directory, you should create a separate results directory, in which the files with the results of experiments will be stored later. In addition, it is required to mount the shared directories with models and datasets on the remote machines in advance. Please use the following commands:
cd ~
mkdir validation && cd validation
git clone https://github.com/itlab-vision/dl-benchmark.git --depth 1
git clone https://github.com/openvinotoolkit/open_model_zoo.git --recursive --branch 2022.1.0 --single-branch --depth 1
mkdir results
sudo mount -t cifs -o username=itmm,password=itmm //10.0.32.14/linuxshare /mnt
Further, we assume that /mnt contains two new directories: /mnt/models is a directory with models, and /mnt/datasets is a directory with datasets for the module that assesses the accuracy of models.
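A quick, optional way to confirm on each node that the share is mounted and the expected directories are visible:
mount | grep cifs
ls /mnt/models /mnt/datasets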
- Prepare a configuration file for the remote system deployment module. The current experiment assumes three computing nodes. For each node, the file specifies an IP address (tag IP), a login and a password to access the node (tags Login and Password), the OS (tag OS), the name of the working directory with the sources of the DLI benchmarking system (tag DownloadFolder), and the paths to the shared directories with test datasets and models (tags DatasetFolder and ModelFolder). An example of a completed configuration file is presented below.
<?xml version="1.0" encoding="utf-8" ?>
<Computers>
<Computer>
<IP>10.0.32.12</IP>
<Login>itmm</Login>
<Password>itmm</Password>
<OS>Linux</OS>
<DownloadFolder>/home/itmm/validation</DownloadFolder>
<DatasetFolder>/mnt/datasets</DatasetFolder>
<ModelFolder>/mnt/models</ModelFolder>
</Computer>
<Computer>
<IP>10.0.32.13</IP>
<Login>itmm</Login>
<Password>itmm</Password>
<OS>Linux</OS>
<DownloadFolder>/home/itmm/validation</DownloadFolder>
<DatasetFolder>/mnt/datasets</DatasetFolder>
<ModelFolder>/mnt/models</ModelFolder>
</Computer>
<Computer>
<IP>10.0.32.16</IP>
<Login>itmm</Login>
<Password>itmm</Password>
<OS>Linux</OS>
<DownloadFolder>/home/itmm/validation</DownloadFolder>
<DatasetFolder>/mnt/datasets</DatasetFolder>
<ModelFolder>/mnt/models</ModelFolder>
</Computer>
</Computers>
- Execute the deployment module of the DLI benchmarking system, following the manual.
python3 deploy.py -s 10.0.32.15 -l itmm -p itmm \
-i ~/dl-benchmark/docker/OpenVINO_DLDT/openvino_2022.1.tar \
-d /home/itmm/ftp \
-n OpenVINO_DLDT \
--machine_list deploy_config.xml \
--project_folder /home/itmm/validation/dl-benchmark
The script copies the archive with the Docker image to the FTP server (in this example, the IP address of the FTP server is 10.0.32.15), then copies it from the FTP server to all the remote nodes described in the corresponding configuration file, and finally deploys the image on them.
For each node, you can create its own configuration file to run the performance experiments. Let's create the configuration files using the guide. As an example, let's run an experiment for the classical AlexNet model in latency mode on all three computing nodes. Note that to validate the performance of models, a separate private data repository itlab-vision-dl-benchmark-data is used, which is cloned during the system deployment into the /tmp directory.
<?xml version="1.0" encoding="utf-8"?>
<Tests>
<Test>
<Model>
<Task>classification</Task>
<Name>alexnet</Name>
<Precision>FP32</Precision>
<SourceFramework>Caffe</SourceFramework>
<ModelPath>/media/models/public/alexnet/FP32/alexnet.xml</ModelPath>
<WeightsPath>/media/models/public/alexnet/FP32/alexnet.bin</WeightsPath>
</Model>
<Dataset>
<Name>ImageNET</Name>
<Path>/tmp/itlab-vision-dl-benchmark-data/Datasets/ImageNET/</Path>
</Dataset>
<FrameworkIndependent>
<InferenceFramework>OpenVINO DLDT</InferenceFramework>
<BatchSize>1</BatchSize>
<Device>CPU</Device>
<IterationCount>1000</IterationCount>
<TestTimeLimit>180</TestTimeLimit>
</FrameworkIndependent>
<FrameworkDependent>
<Mode>sync</Mode>
<Extension></Extension>
<AsyncRequestCount></AsyncRequestCount>
<ThreadCount></ThreadCount>
<StreamCount></StreamCount>
<Frontend></Frontend>
<InputShape></InputShape>
<Layout></Layout>
<Mean></Mean>
<InputScale></InputScale>
</FrameworkDependent>
</Test>
</Tests>
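Before uploading, it can be worth checking that a hand-edited test configuration is well-formed XML. One minimal, illustrative way to do so (not part of DLI itself) is Python's standard parser; benchmark_config_i3.xml here is one of the files created in this step:
python3 -c "import xml.etree.ElementTree as ET; ET.parse('benchmark_config_i3.xml')"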
Note that the paths in the configuration files are relative to the Docker container if the system is deployed in a Docker environment, and relative to the current environment if the system is deployed directly on the current machine. Configuration files are first copied to an FTP server; during system deployment, the script copies the test configurations to the corresponding remote machines. In this example, the configuration files on the FTP server are stored at /home/itmm/ftp/remote.
scp benchmark_config_i3.xml itmm@10.0.32.15:/home/itmm/ftp/remote
scp benchmark_config_i7.xml itmm@10.0.32.15:/home/itmm/ftp/remote
scp benchmark_config_tower.xml itmm@10.0.32.15:/home/itmm/ftp/remote
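With more nodes, a small shell loop avoids repeating the command (assuming the naming scheme above):
for f in benchmark_config_*.xml; do scp "$f" itmm@10.0.32.15:/home/itmm/ftp/remote; done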
Similarly, we create configuration files for the module that checks the accuracy of deep models. As an example, let's take the same AlexNet model as for the benchmarking subsystem. Note that the path to the model is specified relative to the Docker container environment if the system is deployed in a Docker environment, and relative to the current environment if it is deployed directly on the current computing node. The path to the configuration file for the accuracy checker is always specified relative to the computing node.
<?xml version="1.0" encoding="utf-8"?>
<Tests>
<Test>
<Model>
<Task>classification</Task>
<Name>alexnet</Name>
<Precision>FP32</Precision>
<SourceFramework>Caffe</SourceFramework>
<Directory>/media/models/public/alexnet/FP32</Directory>
</Model>
<Parameters>
<InferenceFramework>OpenVINO DLDT</InferenceFramework>
<Device>CPU</Device>
<Config>/home/itmm/validation/open_model_zoo/tools/accuracy_checker/configs/alexnet.yml</Config>
</Parameters>
</Test>
</Tests>
The configuration files are copied to the FTP server, and the deployment script will subsequently transfer the test configurations to the corresponding remote computing nodes. In the current example, we store the configuration files on the FTP server in the /home/itmm/ftp/remote directory.
scp accuracy_checker_config_i3.xml itmm@10.0.32.15:/home/itmm/ftp/remote
scp accuracy_checker_config_i7.xml itmm@10.0.32.15:/home/itmm/ftp/remote
scp accuracy_checker_config_tower.xml itmm@10.0.32.15:/home/itmm/ftp/remote
To execute the experiments remotely, please follow the instructions described below.
- Prepare the configuration file config.xml for the remote experiment start using the manual, and save it on the FTP server at /home/itmm/ftp/remote/. The example below is presented for the case when the system is deployed in Docker containers. If you deploy the DLI system directly on the computing nodes, you should replace the value of the <Executor>docker_container</Executor> tag with <Executor>host_machine</Executor> everywhere.
<?xml version="1.0" encoding="utf-8" ?>
<Computers>
<Computer>
<IP>10.0.32.16</IP>
<Login>itmm</Login>
<Password>itmm</Password>
<OS>Linux</OS>
<FTPClientPath>/home/itmm/validation/dl-benchmark/src/remote_control/ftp_client.py</FTPClientPath>
<Benchmark>
<Config>/home/itmm/ftp/remote/benchmark_config_i3.xml</Config>
<Executor>docker_container</Executor>
<LogFile>/home/itmm/validation/log_bench.txt</LogFile>
<ResultFile>/home/itmm/validation/result_bench_table.csv</ResultFile>
</Benchmark>
<AccuracyChecker>
<Config>/home/itmm/ftp/remote/accuracy_checker_config_i3.xml</Config>
<Executor>docker_container</Executor>
<DatasetPath>/media/datasets/</DatasetPath>
<DefinitionPath>/home/itmm/validation/open_model_zoo/tools/accuracy_checker/dataset_definitions.yml</DefinitionPath>
<LogFile>/home/itmm/validation/log_ac.txt</LogFile>
<ResultFile>/home/itmm/validation/result_ac_table.csv</ResultFile>
</AccuracyChecker>
</Computer>
<Computer>
<IP>10.0.32.12</IP>
<Login>itmm</Login>
<Password>itmm</Password>
<OS>Linux</OS>
<FTPClientPath>/home/itmm/validation/dl-benchmark/src/remote_control/ftp_client.py</FTPClientPath>
<Benchmark>
<Config>/home/itmm/ftp/remote/benchmark_config_i7.xml</Config>
<Executor>docker_container</Executor>
<LogFile>/home/itmm/validation/log_bench.txt</LogFile>
<ResultFile>/home/itmm/validation/result_bench_table.csv</ResultFile>
</Benchmark>
<AccuracyChecker>
<Config>/home/itmm/ftp/remote/accuracy_checker_config_i7.xml</Config>
<Executor>docker_container</Executor>
<DatasetPath>/media/datasets/</DatasetPath>
<DefinitionPath>/home/itmm/validation/open_model_zoo/tools/accuracy_checker/dataset_definitions.yml</DefinitionPath>
<LogFile>/home/itmm/validation/log_ac.txt</LogFile>
<ResultFile>/home/itmm/validation/result_ac_table.csv</ResultFile>
</AccuracyChecker>
</Computer>
<Computer>
<IP>10.0.32.13</IP>
<Login>itmm</Login>
<Password>itmm</Password>
<OS>Linux</OS>
<FTPClientPath>/home/itmm/validation/dl-benchmark/src/remote_control/ftp_client.py</FTPClientPath>
<Benchmark>
<Config>/home/itmm/ftp/remote/benchmark_config_tower.xml</Config>
<Executor>docker_container</Executor>
<LogFile>/home/itmm/validation/log_bench.txt</LogFile>
<ResultFile>/home/itmm/validation/result_bench_table.csv</ResultFile>
</Benchmark>
<AccuracyChecker>
<Config>/home/itmm/ftp/remote/accuracy_checker_config_tower.xml</Config>
<Executor>docker_container</Executor>
<DatasetPath>/media/datasets/</DatasetPath>
<DefinitionPath>/home/itmm/validation/open_model_zoo/tools/accuracy_checker/dataset_definitions.yml</DefinitionPath>
<LogFile>/home/itmm/validation/log_ac.txt</LogFile>
<ResultFile>/home/itmm/validation/result_ac_table.csv</ResultFile>
</AccuracyChecker>
</Computer>
</Computers>
- Execute the experiments remotely. Please use the screen utility for Linux, which allows you to create background processes; as a result, you can run a script with experiments and then close the terminal. The command below creates a new validation window, in which you can run the experiments remotely.
screen -S validation
To start the benchmark, you need to enter the following command on the FTP server:
python3 remote_start.py -c /home/itmm/ftp/remote/config.xml \
-s 10.0.32.15 -l itmm -p itmm \
-br benchmark_results.csv \
-acr accuracy_checker_results.csv \
--ftp_dir /home/itmm/ftp/results
Thus, the script will go through all the computing nodes described in the configuration file /home/itmm/ftp/remote/config.xml, copy the corresponding configuration files with the descriptions of the experiments, and run them. To detach from this session, press CTRL + A, then D. To return to the session with the benchmark and check its status, enter the following command:
screen -R validation
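If you are not sure which screen sessions exist on the server, the standard way to list them is:
screen -ls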
At the end of the experiments, the validation screen session should display the following lines:
[ INFO ] Ended process on Linux with id 0
[ INFO ] Ended process on Linux with id 0
[ INFO ] Ended process on Linux with id 0
Upon completion of the experiments, CSV tables with the results of the experiments assessing the performance of the deep models and checking their accuracy from each remote node, as well as the aggregated tables benchmark_results.csv and accuracy_checker_results.csv, will be stored on the FTP server in the /home/itmm/ftp/results directory.
These files can be downloaded to your local machine and converted to HTML and/or XLSX format using the following commands.
scp itmm@10.0.32.15:/home/itmm/ftp/results/benchmark_results.csv /tmp
scp itmm@10.0.32.15:/home/itmm/ftp/results/accuracy_checker_results.csv /tmp
cd /tmp/dl-benchmark/src/csv2html
python3 converter.py -t /tmp/benchmark_results.csv -r /tmp/benchmark_results.html -k benchmark
python3 converter.py -t /tmp/accuracy_checker_results.csv -r /tmp/accuracy_checker_results.html -k accuracy_checker
cd /tmp/dl-benchmark/src/csv2xlsx
python3 converter.py -t /tmp/benchmark_results.csv -r /tmp/benchmark_results.xlsx -k benchmark
python3 converter.py -t /tmp/accuracy_checker_results.csv -r /tmp/accuracy_checker_results.xlsx -k accuracy_checker