Bachelor's degree in Data Science and Engineering. Final Thesis
Defended on July 1, 2021
Miquel Escobar Castells, @miquelescobar, miquel.escobar@estudiantat.upc.edu
- Abstract
- Repository structure
2.1. CTE-POWER
2.2. AWS SageMaker
2.3. Figures
- Contact
Combining Reinforcement Learning and Deep Learning is the most challenging Artificial Intelligence research and development area at present. Scaling these types of applications on HPC infrastructure available in the cloud will be crucial for advancing the massive use of these technologies. The purpose of this project is to develop and test various implementations of Deep RL algorithms on a selected case study, using the CTE-POWER cluster at the Barcelona Supercomputing Center (BSC), and to deploy the training pipeline to the cloud in order to analyze its costs and viability in a production environment.
One can find the complete report of the project in the report.pdf file in the root directory, as well as the thesis defense slides in the slides.pdf file.
The directory structure of this repository is shown below. Each of the subsections provides more detailed documentation about its internal structure, as well as instructions on dependency installation and reproducibility.
├───cte-power
│   ├───jobs
│   │   ├───distributed
│   │   ├───err
│   │   ├───gym
│   │   ├───out
│   │   ├───pybullet-envs
│   │   ├───pybulletgym
│   │   └───unity
│   ├───scripts
│   │   ├───plot
│   │   ├───render
│   │   └───train
│   ├───trainings
│   │   ├───gym
│   │   ├───pybullet-envs
│   │   ├───pybulletgym
│   │   └───unity
│   └───results
│       └───humanoid
│
├───aws-sagemaker
│   └───rl-ray-pybullet
│       ├───common
│       │   └───sagemaker_rl
│       └───src
│
└───figures
    ├───plots
    └───videos
In the ./cte-power/ directory one can find all files and scripts required to execute the trainings and metrics evaluations in the CTE-POWER cluster (https://www.bsc.es/user-support/power.php). The scripts can be executed as-is, assuming that the compatibilities and the required dependencies are maintained by the system operators at CTE, since the environment loading is defined in the scripts of the jobs section.
It is worth noting that some files might not be included in this repository due to their size.
To help understand the implementation, each of the modules is described in more detail below.
The ./cte-power/scripts/ directory contains the scripts needed to train RLlib algorithms on different kinds of environments, render the trained agents and plot the evaluation results. Depending on the RL toolkit, different dependencies are required.
There is an adapted script for the training of each of the following toolkits:
- OpenAI Gym: see the ./cte-power/scripts/train/train-gym.py script.
- PyBullet: see the ./cte-power/scripts/train/train-bullet-envs.py script.
- PyBullet Gymperium: see the ./cte-power/scripts/train/train-pybulletgym.py script.
- Unity: see the ./cte-power/scripts/train/train-unity.py script.
The ./cte-power/scripts/render/ directory also provides rendering scripts for the OpenAI Gym, PyBullet and PyBullet Gymperium toolkits.
Finally, the ./cte-power/scripts/plot/ directory contains the scripts used to analyze, slightly transform and plot the data of the obtained training results.
The ./cte-power/jobs/ directory contains the bash scripts used to execute SLURM training jobs in the CTE-POWER cluster. It holds the output and error directories, a directory for each of the toolkits, and the distributed directory, which contains the distributed job definitions and their launching script.
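As a rough illustration of what a SLURM job definition involves, the sketch below assembles a batch script as text. The `#SBATCH` options shown are standard SLURM directives, but the function `build_sbatch_script`, the job name, the module name, the resource values and the way the training script is invoked are all placeholders assumed for the example, not the definitions used in this repository.

```python
def build_sbatch_script(job_name, train_script, nodes=1,
                        time_limit="02:00:00", modules=()):
    """Return the text of a SLURM batch script for one training job.

    Every value passed in is illustrative; the real jobs define their own
    resources and environment loading.
    """
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        # %j is expanded by SLURM to the numeric job ID.
        f"#SBATCH --output=out/{job_name}_%j.out",
        f"#SBATCH --error=err/{job_name}_%j.err",
        f"#SBATCH --nodes={nodes}",
        f"#SBATCH --time={time_limit}",
        "",
    ]
    # The environment is loaded inside the job script itself.
    lines += [f"module load {name}" for name in modules]
    lines.append(f"python {train_script}")
    return "\n".join(lines) + "\n"


script = build_sbatch_script(
    job_name="example-job",                        # hypothetical job name
    train_script="../scripts/train/train-gym.py",  # assumed relative path
    modules=("python",),                           # hypothetical module name
)
print(script)
```

Writing the returned text to a file and submitting it with `sbatch` would queue the job; the standard output and error streams would then land in the out and err directories.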
The trainings directory contains the configurations used as input for the training jobs (hyperparameters, resource allocation, stopping conditions, etc.). They are organized by RL toolkit.
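To make these three kinds of settings concrete, here is a minimal sketch of a configuration with hyperparameters, resource allocation and stopping conditions, together with a basic sanity check. The layout of the file and the helper `validate_training_config` are assumptions made for illustration. The keys `lr`, `gamma`, `num_workers`, `num_gpus`, `training_iteration` and `episode_reward_mean` follow RLlib and Ray Tune naming, while all values are placeholders rather than the settings used in the thesis.

```python
import json

# Hypothetical configuration; values are placeholders, not the thesis settings.
EXAMPLE_CONFIG = """
{
  "env": "HumanoidBulletEnv-v0",
  "hyperparameters": {"lr": 0.0001, "gamma": 0.99},
  "resources": {"num_workers": 4, "num_gpus": 1},
  "stop": {"training_iteration": 100, "episode_reward_mean": 1000.0}
}
"""

REQUIRED_SECTIONS = ("env", "hyperparameters", "resources", "stop")


def validate_training_config(config):
    """Check that a config has every expected section and sane values."""
    missing = [key for key in REQUIRED_SECTIONS if key not in config]
    if missing:
        raise ValueError(f"missing sections: {missing}")
    if config["resources"]["num_workers"] < 0:
        raise ValueError("num_workers must be non-negative")
    if not config["stop"]:
        raise ValueError("at least one stopping condition is required")
    return config


config = validate_training_config(json.loads(EXAMPLE_CONFIG))
print(config["env"], config["stop"])
```

Keeping the stopping conditions in their own section makes it easy to run the same hyperparameters under a different training budget.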
The ./cte-power/results/ directory stores the training results and metrics. The GitHub repository only includes the summarized data extracted from the raw training metrics files, which record metrics at iteration granularity, since each raw file occupies several MB.
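The reduction from iteration-level metrics to summarized data could look like the following sketch, which keeps every n-th iteration plus the final one and reports the best reward seen. It is not the script used in this repository: the function `summarize_metrics`, the sampling step and the synthetic rows are assumptions, and the column names `training_iteration` and `episode_reward_mean` are the ones Ray Tune writes to its per-trial progress files for RLlib runs.

```python
import csv
import io


def summarize_metrics(rows, every=10, reward_key="episode_reward_mean"):
    """Downsample per-iteration rows and compute the best reward."""
    rows = list(rows)
    if not rows:
        return {"samples": [], "best_reward": None}
    # Keep every `every`-th row, and always keep the last one.
    kept = [row for i, row in enumerate(rows) if i % every == 0]
    if kept[-1] is not rows[-1]:
        kept.append(rows[-1])
    samples = [
        (int(row["training_iteration"]), float(row[reward_key]))
        for row in kept
    ]
    best = max(float(row[reward_key]) for row in rows)
    return {"samples": samples, "best_reward": best}


# Synthetic data standing in for a raw metrics file.
raw = io.StringIO(
    "training_iteration,episode_reward_mean\n"
    + "".join(f"{i},{i * 1.5}\n" for i in range(1, 26))
)
summary = summarize_metrics(csv.DictReader(raw), every=10)
print(summary)
```

With the 25 synthetic rows above, the summary keeps iterations 1, 11, 21 and 25, which shows how a summarized file can stay small while still preserving the shape of the learning curve.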
AWS SageMaker is the selected tool in this project for implementing the RL training pipelines in the Cloud. It is an AWS service that provides higher-level modules for machine learning functionalities in the AWS environment.
In this repository one can find an example for executing a complete training pipeline of PyBullet environments.
The ./figures/ directory contains the plots generated from the training results, the architecture diagrams and miscellaneous images, as well as some videos rendering the interactions between trained agents and the environment.
For any inquiries please contact the author of this project at miquel.escobar@estudiantat.upc.edu or miquel.escobar@bsc.es.