1.1.10 Release #432

Merged
merged 184 commits into from
Dec 12, 2023


jwhite242
Collaborator

This PR is the merge of release 1.1.10 into the mainline branch.

FrankD412 and others added 30 commits July 14, 2019 12:59
* Addition of hashing to Study parameterization.

* Addition of the hashws option to argparse.

* Addition of a warning note for users who use labels in steps.
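The hashing of parameterized workspaces mentioned above can be pictured roughly as follows. This is a minimal sketch under stated assumptions, not Maestro's implementation: the function name `hashed_workspace_name`, the label format, and the choice of MD5 (used here only as a short, non-cryptographic digest) are all hypothetical.

```python
import hashlib


def hashed_workspace_name(step_name: str, combo_label: str) -> str:
    """Return a fixed-length directory name for one parameter combination.

    Both arguments and the naming scheme are illustrative assumptions.
    """
    # Hash the (possibly very long) parameter label down to 32 hex characters.
    digest = hashlib.md5(combo_label.encode("utf-8")).hexdigest()
    return f"{step_name}_{digest}"
```

The point of hashing in a scheme like this is that the directory name stays the same length no matter how many parameters a combination contains.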
* Addition of a more general flux ScriptAdapter.

* Addition of some casting from int to str

* Corrected "gpus" to "ngpus"

* Rework jobspec construction to make a valid jobspec.

* Check for empty value for cores per task.
…ements. (#157)

* Made pickle and log path string safe for pathing.

* Tweaks to make_safe_path to include a base path.

* Updates to make_safe_path usage

* Correction to not modify the iterator copy.
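A path-sanitizing helper along the lines of the `make_safe_path` commits above might look like the sketch below. The name `make_safe_path_sketch`, its signature, and the set of allowed characters are assumptions made for illustration; the real utility may behave differently.

```python
import os
import re


def make_safe_path_sketch(base_path: str, *segments: str) -> str:
    """Join segments under base_path after replacing unsafe characters.

    Hypothetical stand-in, not the project's make_safe_path.
    """
    # Keep letters, digits, dot, dash, and underscore; replace everything else.
    cleaned = [re.sub(r"[^A-Za-z0-9._-]", "_", seg) for seg in segments]
    return os.path.join(base_path, *cleaned)
```

Only the trailing segments are sanitized here, so the base path can still contain separators.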
…hat breaks printing. (#160)

* Addition of a utility function for formatting times to H:M:S

* _StepRecord time methods now call the new utility function.

* Tweaks to add days to the format to avoid 3 digit hours.

* Tweak to formatting.

* Made the day format more parsable.
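The time formatting described above (adding a day field so hours never reach three digits) can be sketched as follows. The function name `format_elapsed` and the exact output layout are assumptions; the sketch also assumes non-negative durations.

```python
from datetime import timedelta


def format_elapsed(delta: timedelta) -> str:
    """Format a non-negative duration as 'Dd:HHh:MMm:SSs' (illustrative layout)."""
    total = int(delta.total_seconds())
    # Peel off days first so the hour field stays within two digits.
    days, rem = divmod(total, 86400)
    hours, rem = divmod(rem, 3600)
    minutes, seconds = divmod(rem, 60)
    return f"{days}d:{hours:02d}h:{minutes:02d}m:{seconds:02d}s"
```

Labeling each field with its unit is one way to keep the string easy to parse later.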
* Removal of _stage_linear since it is now not needed.

* Addition of linear LULESH samples.

* Update the dev to 1.1.
…on (#152)

* Addition of a utility method to create a dictionary from a list of key-value pairs.

* Addition of the pargs interface for passing parameters to custom parameter generation.

* Addition of a Monte Carlo example that accepts pargs.

* Addition of pargs check for dependency on pgen.

* Addition of clearer error message for malformed parameters.
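The key-value utility and the clearer error for malformed parameters could be sketched like this. The name `pairs_to_dict` and the `:` delimiter are assumptions for illustration, not necessarily what the pargs interface uses.

```python
def pairs_to_dict(pairs, delimiter=":"):
    """Convert ['key:value', ...] into a dict, rejecting malformed entries.

    Hypothetical helper; the delimiter is an assumed convention.
    """
    result = {}
    for item in pairs:
        # partition splits on the first delimiter only, so values may contain it.
        key, sep, value = item.partition(delimiter)
        if not sep or not key:
            raise ValueError(
                f"Malformed parameter '{item}'; expected KEY{delimiter}VALUE"
            )
        result[key] = value
    return result
```

Values stay strings in this sketch; a custom parameter generator would convert them to the types it needs.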

* Update setup.py
Signed-off-by: Peter Robinson <robinson96@llnl.gov>
* Changes to make workspaces reflect relative pathing based on step names.

* Addition of an alternative output format based on step combinations.
Fixes #167 

* added pytest to requirements

added Pipfile and pipenv settings

* Added property key to Abstract.ScriptAdapter (#167)
Also added implementation and tests to verify that existing functionality isn't changed

* updated factory to use key when registering adapters(#167)

* cleaned up line length

* cleaned up imports to be specific to module (#167)

* added tests to verify exception for unknown adapter

* moved adapters tests to individual files

* added test to verify scriptadapter functionality (#167)

updated gitignore to have testing and pycharm ignores

testing existing adapters in factory (#167)

added test to verify factories.keys matches get_valid_adapters (#167)
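The idea of registering adapters in a factory under each adapter's own `key` can be illustrated with the sketch below. All class and method names here are hypothetical apart from `key` and `get_valid_adapters`, which the commit messages mention, and their signatures are assumed.

```python
class ScriptAdapterSketch:
    """Hypothetical base class; each subclass declares a unique key."""

    key = None


class LocalAdapterSketch(ScriptAdapterSketch):
    key = "local"


class SlurmAdapterSketch(ScriptAdapterSketch):
    key = "slurm"


class AdapterFactorySketch:
    """Maps adapter keys to adapter classes."""

    _registry = {}

    @classmethod
    def register(cls, adapter_cls):
        # The adapter's own key is the single source of truth for its name.
        cls._registry[adapter_cls.key] = adapter_cls
        return adapter_cls

    @classmethod
    def get_adapter(cls, name):
        if name not in cls._registry:
            raise ValueError(
                f"Unknown adapter '{name}'; valid adapters: {cls.get_valid_adapters()}"
            )
        return cls._registry[name]

    @classmethod
    def get_valid_adapters(cls):
        return sorted(cls._registry)


AdapterFactorySketch.register(LocalAdapterSketch)
AdapterFactorySketch.register(SlurmAdapterSketch)
```

Because the registry is keyed by the same attribute the adapters expose, the list of valid adapters cannot drift from the registered classes.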

added copyright to file

* updated __init__ modules to do dynamic includes

* removed unneeded imports

* updated dependency versions

* fixed all flake8 errors

* updated to run flake8 and pytest when run locally

* updated tests to have documentation about purpose and function as requested in #170

* fixed line length

* Removal of nose from requirements.

* updated to remove nose from the requirements
* Locking the version of PyYAML to be above 2.1 because of an arbitrary code execution vulnerability.

* Addition of a version condition to pyyaml to patch a vulnerability.

* Update of Pipfile.lock to match Pipfile.
Fixes #173 

* Addition of a loader to the yaml load call.

* Addition of a catch if the loader attribute is missing.
* Moved enum34 to condition dependent on Python<3.4.

* Addition of conditional enum34 install for requirements.txt.

* Correction of requirements.txt syntax for python version.
* Addition of a Dockerfile for quick tutorials.

* Tweaks for Docker and addition of git.

* Tweak to Docker file for caching.

* Addition of Docker documentation.

* Tweaks to Docker documentation.

* Removal of markdown ##
…ten. (#181)

* Take out shebang from shell definition and add it when script is written.

* Include shebang in cmd and fix format of string written to file.
* Extension of shebang feature to allow users to specify shells.

* Addition of debug message to print kwargs.

* Addition of kwargs.

* Addition of basic batch settings to LULESH sample.

* Addition of kwargs to Flux adapters.

* Docstring tweaks.

* Docstring update.
* Docstring correction for LocalAdapter.

* Correction to addition of exec line at top of scripts.
* Addition of a Record class for storing general data.

* Addition of SubmissionRecord type.

* Update to the order of record parameters.

* Changes to StepRecord to expect SubmissionRecord returns.

* Updates to SLURM and local adapters to use SubmissionRecords.

* Slight tweak to LocalAdapter docstring.

* Tweak to have SubmissionRecord initialize its base.

* Addition of CancellationRecord class.

* Changes to CancellationRecord to map based on status.

* Additional interface additions and tweaks.

* Changes to have cancel use CancellationRecords.

* Update to ExecutionGraph to use records.

* Updates to SLURM and local adapters to use SubmissionRecords.

* Slight tweak to LocalAdapter docstring.

* Addition of CancellationRecord class.

* Additional interface additions and tweaks.

* Changes to have cancel use CancellationRecords.

* Cherry pick of execution commit.

* Removal of redundant "get" definition.
FrankD412 and others added 28 commits April 3, 2022 17:14
* initial patch for more complete configuration of jsrun launcher

* First pass at documenting usage of the lsf/jsrun launcher

* Add corresponding maestro steps for each jsrun variant

* Add one example of a memory hungry application

* Build table mapping jsrun to maestro step keys

* Add binding controls, rename keys from snake case, document defaults

* Fix up straggling snake case keys

* Improve debugging info in schema error messages

* Fix rs_per_node, make gpu binding optional since it's new in lsf 10.1

* Update binding flag in examples, add note about gpu binding availability

* Add initial lsfscriptadapter tests

* Initial pass at general batch block documentation

* Remove old commentary

* Remove unneeded nodes/procs math in jsrun launcher substitution

* Remove unneeded logging output

* Remove more debugging log outputs

* Update lsf examples to match json schema for resource specification keys

* Cleanup the cpus per rs machinery, schema

* Add openmp and mpi lulesh study to exercise lsf resource specification keys

* Document the sample lsf lulesh specification
* Removal of setup.py for poetry editable.

* Addition of install testing.

* Make python versions strings to avoid 3.1 Python version.

* Version tick for dev2.
* First pass at re-enabling tests in ci

* Update to newer poetry gh action, tweak cache setting

* Fix missing }

* Fix incorrect gh action repository name

* Disable venv cache

* Test out doing flake8 linting with poetry

* Revert flake8 linting to separate run for nicer reporting

* Test reusable python matrix

* Revert reuse test, sync up steps between pip/poetry

* Remove missing dependency

* force linting pass before running expensive install/pytest steps
Bumps [certifi](https://github.com/certifi/python-certifi) from 2021.10.8 to 2022.12.7.
- [Release notes](https://github.com/certifi/python-certifi/releases)
- [Commits](certifi/python-certifi@2021.10.08...2022.12.07)

---
updated-dependencies:
- dependency-name: certifi
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [cryptography](https://github.com/pyca/cryptography) from 37.0.1 to 38.0.3.
- [Release notes](https://github.com/pyca/cryptography/releases)
- [Changelog](https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst)
- [Commits](pyca/cryptography@37.0.1...38.0.3)

---
updated-dependencies:
- dependency-name: cryptography
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
no scheduled scripts missing #! on lsf
* Updating versions in lock file.

* Increase minimum version of python to 3.8

* Remove missed 3.7 version from matrix.

* Update to readd 3.7 and remove fabric.

* Remove version limit on pytest.

* Updates to versions in lock.

* Add 3.11 to the test matrix

* Bump Sphinx version.
Signed-off-by: grosa1 <g.rosa1@studenti.unimol.it>
* Adds discussion on the philosophy and motivations of Maestro's design: reproducible computational science

* Adds complete documentation of the current study specification

* Improved discussion of dependency types available in env block

* Documents Maestro's minimal workflow language (tokens, step dependencies, ...)

* Expands documentation of how to schedule studies using various HPC schedulers

* Documentation of command line interface

* Comprehensive, progressive tutorial building Maestro workflows from scratch and running them

* Exposes multiple variants of the lulesh specs in the sample directory directly in the documentation

* Adds mini tutorial for porting existing HPC batch workflows and extending them with Maestro features

* Adds how-to guides with more advanced recipes/solutions

* Add mkdocstrings to autogenerate api docs from source as part of docs build

* Adds workflow topology diagrams using mermaid to better highlight runtime expansion of specs with parameters

* New documentation organization/layout inspired by divio documentation system

* Removes sphinx dependency

---------

Co-authored-by: Jeremy White <white242@llnl.gov>
Switch to importlib to get version for maestro's cli from pyproject.toml and remove old out of sync version source
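Reading the installed version through `importlib.metadata` can be sketched as below. The helper name `get_cli_version`, the `"unknown"` fallback, and the distribution name `maestrowf` are assumptions for illustration, not taken from the PR.

```python
from importlib.metadata import PackageNotFoundError, version


def get_cli_version(dist_name: str = "maestrowf") -> str:
    """Return the installed distribution's version, or 'unknown' if absent.

    The default distribution name and the fallback string are assumptions.
    """
    try:
        # Reads the metadata recorded at install time (built from pyproject.toml).
        return version(dist_name)
    except PackageNotFoundError:
        return "unknown"
```

With this approach the version is declared once in the package metadata, so a separate hard-coded version string cannot fall out of sync.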
* Pin mermaid to 9.4 due to api breaking change in 10.x

* Pin readthedocs poetry install to < 1.5 to avoid python2supports import failure in virtualenv pkg
Fixes dependabot flagged security issues:

* Updates lock file to update to certifi version that removes e-Tugra root certificate.

* Updates lock file to update requests dependency to patch unintended leak of proxy-authorization header
…it's resource/uri handling (#415)

Casts flux's JobID objects into string form to enable the dag to be pickled when conductor saves state. JobID objects are tied to flux's use of cffi and are not pickle-able themselves.

The Base58 string form is used instead of the integer it is derived from so that logging matches what users see on the command line, where flux reports job IDs in that format. This requires separately re-instantiating JobID objects upon waking to check job statuses.
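The pattern described here (persist a string, rebuild the handle on wake-up) can be sketched without flux at all. `FakeJobID` below is a stand-in that merely refuses to be pickled; it is not flux's `JobID`, and the ID string and function names are made up for illustration.

```python
import pickle


class FakeJobID:
    """Stand-in for a handle that cannot be pickled (NOT flux's real JobID)."""

    def __init__(self, text: str):
        self.text = text

    def __reduce__(self):
        # Simulate an object backed by foreign (cffi-style) state.
        raise TypeError("FakeJobID cannot be pickled")

    def __str__(self):
        return self.text


def save_state(job_handle) -> bytes:
    # Store only the string form so the saved state stays pickle-able.
    return pickle.dumps({"jobid": str(job_handle)})


def load_state(blob: bytes) -> FakeJobID:
    # Rebuild the handle from its string form after waking up.
    return FakeJobID(pickle.loads(blob)["jobid"])
```

The cost of this design is the extra reconstruction step on every wake-up; the benefit is that the saved state contains only plain types.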

Additional fixes:
* Improve the flexibility of resource specification in the flux adapter to align with the other adapters (i.e. not requiring nodes be specified).
* Change uri handling so env vars are not the only way to schedule jobs
* Updates to get working on flux only machines that require no bootstrapping or uri
  * Changes job submit call to be non-waitable by default on adapter 0.49 and newer as only owner of instance can do so (which is rarely the case on flux managed machines)
* Adds documentation and recipes for how to use Maestro with flux machines and flux instances running on non-flux scheduled machines
* Pins poetry version to 1.4.2 in CI to maintain support for python 3.7
* Fixes parsing of flux broker version to properly handle non-release versions of flux and logs that version into the flux batch script header comments and the maestro logs.
Adds pager functionality to rich rendered status layouts (not legacy layout) with ability to turn pager and theme on/off to remain compatible with systems/setups that don't have full support.
…hecking (#427)

* Adds packaging to toml file as it is now a required non-dev dependency.  Version is constrained to >=22.0 to reflect removal of LegacyVersion which can be returned from parse_version in older packaging versions.
* Improve robustness of flux adapter version error handling by testing against base_version to allow pre-release builds to succeed.
* Add version comparison unit tests for both full and base_version variants
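The difference between comparing a full version and its `base_version` can be shown with the `packaging` library. This is a minimal sketch; the helper name `meets_minimum` and the version strings are illustrative assumptions, not the adapter's actual code.

```python
from packaging.version import Version


def meets_minimum(installed: str, minimum: str) -> bool:
    """Compare on base_version so pre-release builds of `minimum` pass.

    Hypothetical helper; version strings must be PEP 440 compliant.
    """
    # base_version drops pre/post/dev markers, e.g. "0.49.0rc1" -> "0.49.0".
    return Version(Version(installed).base_version) >= Version(minimum)


# A release candidate sorts before its final release when compared in full...
full_compare = Version("0.49.0rc1") >= Version("0.49.0")
# ...but passes once only the base version is considered.
base_compare = meets_minimum("0.49.0rc1", "0.49.0")
```

Under PEP 440 ordering, `full_compare` is false and `base_compare` is true, which is why a check on the full version would reject pre-release builds.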
Update v0.49 and v0.26 flux script adapters to adapt cancellation machinery to work with recent update to submit machinery that converts from flux JobID to native types.

Update general cancellation behaviors:
* Check for in progress steps before declaring cancelled successfully to delay until actual final states can be serialized
* Update cancel logic to mirror failure logic: mark all steps downstream of a cancelled step to also be cancelled
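Propagating a cancellation to everything downstream amounts to a graph traversal. The sketch below assumes the workflow is stored as a plain dict of child lists and that statuses are strings; both the data layout and the function name are hypothetical, not the ExecutionGraph's actual interface.

```python
from collections import deque


def mark_downstream_cancelled(children, statuses, cancelled_step):
    """Mark cancelled_step and every step reachable from it as CANCELLED.

    children: dict mapping step name -> list of dependent step names (assumed).
    statuses: dict mapping step name -> status string, updated in place.
    """
    seen = set()
    queue = deque([cancelled_step])
    while queue:
        step = queue.popleft()
        if step in seen:
            continue  # A step with several cancelled parents is visited once.
        seen.add(step)
        statuses[step] = "CANCELLED"
        queue.extend(children.get(step, ()))
    return statuses
```

Steps that do not depend on the cancelled step, directly or transitively, keep their existing status.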
Slurm adapter bugfixes
* Adds prescribed column lists to both squeue and sacct to avoid parsing issues due to optional user environment variables 'SQUEUE_FORMAT' and 'SACCT_FORMAT'.
* Reduces some unneeded columns in sacct to make it more robust (not all machines have accounts, which broke the parsing)
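Passing explicit column lists on the command line keeps the output shape fixed regardless of `SQUEUE_FORMAT` or `SACCT_FORMAT`. The sketch below only builds the commands and parses sample text; the helper names and the specific columns (job ID and state) are assumptions, not necessarily those chosen in the PR.

```python
def build_squeue_cmd(job_ids):
    # -h drops the header; -o fixes the columns regardless of SQUEUE_FORMAT.
    return ["squeue", "-h", "-o", "%i|%T", "-j", ",".join(job_ids)]


def build_sacct_cmd(job_ids):
    # --format overrides SACCT_FORMAT; -n drops the header; -P uses '|' delimiters.
    return ["sacct", "-n", "-P", "--format=JobID,State", "-j", ",".join(job_ids)]


def parse_states(output):
    """Parse 'jobid|state' lines into a dict, skipping blank lines."""
    states = {}
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        job_id, _, state = line.partition("|")
        states[job_id] = state
    return states
```

Because both commands emit the same two pipe-delimited columns, one parser can serve the primary `squeue` query and the `sacct` fallback.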

Test additions
* Enabled detection of slurm scheduler for live test subset (sched_slurm marker) with auto disablement of marked tests to remain quiet in github ci
* Unit tests added for slurm adapter
  * check use of squeue and sacct job status checking options in presence of custom squeue/sacct format env vars
  * check functionality of job cancellation
* Slurm 'hello world' integration test
jwhite242 merged commit 141eaaa into master Dec 12, 2023
23 checks passed
jwhite242 added a commit that referenced this pull request Feb 6, 2024
1.1.10 Release (#432)

* Sync up read the docs config with dev environments using poetry (#399)
* Print usage on command line when no args are provided (#404)
* Add sacct fallback to slurm adapter to improve robustness of job tracking (#405)
* Update Flux Job State mappings for flux versions >= 0.26 (#407)
* Bump certifi from 2021.10.8 to 2022.12.7 to address security issue (#409)
* Bump cryptography from 37.0.1 to 38.0.3 to address security issue (#410)
* Add missing shebang in unscheduled scripts from lsf adapter (#411)
* Update poetry lockfile to address dependabot flagged security issues (#412)
* Fix for Dockerfile smell DL3006 (#418)
* Port Maestro documentation to mkdocs and expand coverage of features and tutorials (#403)
* Update version info to be driven from pyproject.toml exclusively, and hook up to command line (#419)
* Pin mermaid to < 10.x due to api change (#422)
* Bump lock file certifi from 2022.12.7 to 2023.7.22 to address security issue (#426)
* Refactor flux adapter to avoid using pickle to talk to flux brokers installed in external environments (#415)
   Also adds flux integration tests to exercise against real flux brokers
* Add pager functionality to status command (#420)
* Patch broken flux job cancellation (#428)
* Insulate slurm adapters from user customization of squeue and sacct output formats (#431)
   Also adds live unit and integration tests for slurm adapter

---------

Co-authored-by: Francesco Di Natale <frank.dinatale1988@gmail.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Bruno P. Kinoshita <kinow@users.noreply.github.com>
Co-authored-by: Charles Doutriaux <doutriaux1@llnl.gov>
Co-authored-by: Giovanni Rosa <grosa23@yahoo.com>
Co-authored-by: Brian Gunnarson <49216024+bgunnar5@users.noreply.github.com>