Run parameter sweeps easily, in parallel, with JSON parameters, logs, diverse language support, and parameter data frames.
Python 3.7 or above is required for running sweeps. Versions 3.6 and below will encounter errors.
To install: download the sweeps package, navigate to its directory (cd sweeps), and execute the following:
sudo python setup.py install
After installation, sweeps may be invoked from the command line anywhere on your system.
This guide assumes you are working in the top-level of your parameter sweeps directory. For an example, see the directory tree below.
- Initialize this directory by creating bin and rfs directories.
- Add a JSON file to the top-level containing parameter sweep information.
  - An example parameter sweep file, such as sweep_config.json, may be seen in the test folder.
- Add a script file to the bin folder.
- Create run folders (rfs) using sweeps create (below).
- Run the script using sweeps run (below).
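The first step of the list above can also be scripted. The following is a minimal sketch using only the Python standard library; the helper name init_sweep_dir is hypothetical and is not part of sweeps.

```python
from pathlib import Path


def init_sweep_dir(top: Path) -> None:
    """Create the bin and rfs directories under the top-level sweep directory."""
    for name in ("bin", "rfs"):
        # exist_ok=True makes the call safe to repeat on an initialized directory.
        (top / name).mkdir(parents=True, exist_ok=True)


# Example usage from the top-level directory:
#     init_sweep_dir(Path.cwd())
```

The JSON parameter file and the script file still need to be added by hand, as described in the list.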
Run folders (rfs) represent individual runs of a script file for a particular parameter. The folder name is a hash depending on the parameter value and script file.
sweeps . create sweep_config.json
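To illustrate how a folder name can depend on both the parameter value and the script file, here is a sketch of one possible scheme. It is an assumption for illustration only: the hash function (SHA-256), the serialization, the 16-character truncation, and the helper name run_folder_name are not taken from sweeps, whose actual hashing may differ.

```python
import hashlib
import json


def run_folder_name(params: dict, script_text: str, length: int = 16) -> str:
    """Derive a short, deterministic hex name from parameters and script contents."""
    # Serialize with sorted keys so that equal dictionaries give equal bytes.
    payload = json.dumps(params, sort_keys=True).encode("utf-8")
    # A NUL separator keeps the parameter bytes and script bytes from running together.
    digest = hashlib.sha256(payload + b"\0" + script_text.encode("utf-8"))
    return digest.hexdigest()[:length]
```

With a scheme like this, changing either the parameter value or the script contents yields a different folder name, while repeating the same inputs reproduces the same name.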
Requirement: A script file, such as script.py, must be located inside a bin folder in your top-level directory. (See the example directory tree below.)
sweeps . run python script_file.py
Querying shows the status of your run, including the number of rfs completed, queued, running, and failed.
Requirement: A script file, such as script.py, must be located inside a bin folder in your top-level directory. (This is already satisfied if sweeps run was used.)
sweeps . query script_file.py
.
├── bin
│   └── script_file.jl
├── history
│   ├── 2019-12-10_16-34-13.create.json
│   ├── 2019-12-10_16-34-40.run
│   └── 2019-12-10_16-34-40.script
├── rfs
│   ├── 0e37e95b8301883e
│   │   ├── log.txt
│   │   ├── params.json
│   │   └── status.txt
│   ├── 6e733249c3ae5dd1
│   │   ├── log.txt
│   │   ├── params.json
│   │   └── status.txt
│   ├── 7bfacd4db6a44d40
│   │   ├── log.txt
│   │   ├── params.json
│   │   └── status.txt
│   ├── 9ac81a2c5029aa08
│   │   ├── log.txt
│   │   ├── params.json
│   │   └── status.txt
│   └── d73ece6dc1a2f5e8
│       ├── log.txt
│       ├── params.json
│       └── status.txt
└── sweep_config.json
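The per-status counts that sweeps query reports can be approximated by hand from a tree like the one above. The sketch below assumes that each status.txt contains a single status word (for example completed or failed); that file format is an assumption, not something documented here, and tally_status is a hypothetical helper.

```python
from collections import Counter
from pathlib import Path


def tally_status(top: Path) -> Counter:
    """Count run folders by the status word stored in rfs/<hash>/status.txt."""
    counts = Counter()
    for status_file in sorted((top / "rfs").glob("*/status.txt")):
        # Assumption: the file holds one status word, possibly with a trailing newline.
        counts[status_file.read_text().strip()] += 1
    return counts
```

Calling tally_status(Path.cwd()) from the top-level directory returns a Counter mapping each status word to the number of run folders in that state.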
A couple of built-in Python tools exist to extract completed run data in an rfs folder and create a Pandas DataFrame.
In Python, at the top-level directory such as the one in the example structure tree,
>>> import sweeps
>>> import os
>>> cwd = os.getcwd()
>>> run_DataFrame = sweeps.get_DataFrame(cwd)
# Result: run_DataFrame =
                  value
9ac81a2c5029aa08    2.0
7bfacd4db6a44d40    3.0
d73ece6dc1a2f5e8    0.0
6e733249c3ae5dd1    1.0
0e37e95b8301883e    4.0
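For readers who want to see roughly what such a table is built from, the sketch below gathers the params.json of every run folder into a dictionary keyed by folder name, using only the standard library. It assumes each params.json holds a flat JSON object such as {"value": 2.0}, which is inferred from the output above rather than documented; collect_params is a hypothetical helper, not the implementation of sweeps.get_DataFrame, and unlike that function it does not filter by run status.

```python
import json
from pathlib import Path


def collect_params(top: Path) -> dict:
    """Map each run-folder name to the parsed contents of its params.json."""
    table = {}
    for params_file in sorted((top / "rfs").glob("*/params.json")):
        # The parent directory name is the run-folder hash.
        table[params_file.parent.name] = json.loads(params_file.read_text())
    return table
```

If pandas is installed, pandas.DataFrame.from_dict(collect_params(top), orient="index") turns this dictionary into a frame indexed by run-folder name.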
In Python at the top-level directory, to read in a data file saved by your script for a particular run,
>>> result = sweeps.get_data('9ac81a2c5029aa08', cwd)
sweeps.get_data() supports the following data formats:
- HDF5 (.hdf5)
- Matlab (.mat)
- JSON (.json). Note: ensure that the saved data file is not named params.json, since each run folder already contains a file with that name.
- Binary JSON (.bson)
- NumPy array (.npz)
- Python Pickle file (.pklz)
- Julia, using HDF5 encoding (.jld or .jld2) (returns a NumPy array if it is the only object stored in the file, otherwise returns the HDF5 keys)
If no recognized file type is found, any existing data file is returned without any attempt at processing it.
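The extension-based behavior described above can be pictured with a small sketch. It covers only the JSON case plus the unprocessed fallback, and it assumes the fallback means returning the raw file bytes; what sweeps.get_data() actually returns in that case is not specified here. The helper name load_run_file is hypothetical.

```python
import json
from pathlib import Path


def load_run_file(path: Path):
    """Parse a data file by extension, or return its raw bytes if unrecognized."""
    if path.suffix == ".json":
        return json.loads(path.read_text())
    # Unrecognized extension: hand the contents back without processing them.
    return path.read_bytes()
```

A fuller version would add one branch per supported format, each delegating to the library that reads it.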