Status tweaks #358

Merged
merged 45 commits into from
Jun 5, 2021
Changes from 43 commits
f8c6cbd
Add bfs/dfs ordered status output
jwhite242 May 7, 2021
c45753c
Prototype rich formatted status tables with cli layout switch
jwhite242 May 7, 2021
c6ab494
Fix bfs ordering option, make it default
jwhite242 May 7, 2021
a3edcc5
Remove unhelpful log output..
jwhite242 May 7, 2021
40b170d
Add rich dependency
jwhite242 May 7, 2021
033ff01
Add rich dependency to setup.py
jwhite242 May 7, 2021
f314cce
Quit whining flake8...
jwhite242 May 7, 2021
d62cea7
More flake8 drama..
jwhite242 May 7, 2021
728267d
Cache status ordering to improve scalability
jwhite242 May 7, 2021
b1c2543
Add bfs/dfs ordered status output
jwhite242 May 7, 2021
a6e096f
Prototype rich formatted status tables with cli layout switch
jwhite242 May 7, 2021
d923c8b
Fix bfs ordering option, make it default
jwhite242 May 7, 2021
dd1ede2
Remove unhelpful log output..
jwhite242 May 7, 2021
013dc49
Quit whining flake8...
jwhite242 May 7, 2021
2993c3a
More flake8 drama..
jwhite242 May 7, 2021
51ee47c
Cache status ordering to improve scalability
jwhite242 May 7, 2021
301cfae
Tweak to python version for rich.
FrankD412 May 7, 2021
b181278
Attach params to step records to enable use in status output
jwhite242 May 8, 2021
1ec26c9
Add param name:value table in narrow status layout
jwhite242 May 8, 2021
8b84ff2
Check for new status column to enable backwards compatibility
jwhite242 May 8, 2021
960e30d
Checkpoint on renderer factory implementation
jwhite242 May 10, 2021
af088db
Cleanup after successful test
jwhite242 May 10, 2021
371704f
Remove extraneous debug logging
jwhite242 May 10, 2021
32cfe2b
Fix bad indent, tweak themes to play nice on different terminal themes
jwhite242 May 13, 2021
212b017
Unit test for flat status layout
jwhite242 May 14, 2021
c7c9887
Fix erroneous whitespace
jwhite242 May 14, 2021
1c04eec
Add step root to workspace in status of parameterized steps
jwhite242 May 18, 2021
199411b
Rework status tests, add narrow layout test, rebaseline
jwhite242 May 18, 2021
b9a8f69
Remove stray debug printing
jwhite242 May 18, 2021
4e0fdff
Add help target and some documentation to the make file
jwhite242 May 18, 2021
162a431
Document layouts, add some test comments
jwhite242 May 18, 2021
29b528d
Rename status renderer test file so pytest can find it automatically
jwhite242 May 19, 2021
b4c122a
Fix bug causing duplicate entries in narrow layout
jwhite242 May 19, 2021
fabdeb1
Update narrow layout test baseline
jwhite242 May 19, 2021
1a3ed51
Add layout screenshots to docs
jwhite242 May 19, 2021
3e2cecb
Add legacy table format back to the status layout options
jwhite242 May 19, 2021
6ee2d19
Convert base renderer to abstract, implement auto registration and
jwhite242 May 21, 2021
9c97ce5
Add some proper google style doc strings to render factory
jwhite242 May 24, 2021
841d1ff
Docstring, arg defaults change for narrow renderer
jwhite242 May 24, 2021
f873ebd
Sync up layout methods and doc strings of status renderers
jwhite242 May 24, 2021
9c76ffc
Fix up docstrings on params for StepRecords
jwhite242 Jun 2, 2021
a683540
Fix incorrect return type documentation
jwhite242 Jun 2, 2021
1cbc9a1
Bug fixes/style tweaks on narrow status layout
jwhite242 Jun 5, 2021
6075067
Remove debug output
jwhite242 Jun 5, 2021
8720d3d
Add documentation for legacy status layout
jwhite242 Jun 5, 2021
24 changes: 17 additions & 7 deletions Makefile
@@ -1,21 +1,31 @@
DOCS = docs
EGG = $(wildcard *.egg-info)

.PHONY: cleanall clean cleandocs docs
.PHONY: cleanall clean cleandocs docs help release

all: cleanall release docs
# Use '#' comments to auto document each target in the help message
help: # Show this help message
@echo 'usage: make [target] ...'
@echo
@echo 'targets:'
@egrep '^(.+)\:\ #\ (.+)' ${MAKEFILE_LIST} | column -t -c 2 -s ':#'

release:
all: # Clean and then build everything
cleanall release docs

release: # Build wheel
python setup.py sdist bdist_wheel

docs:
docs: # Build documentation
$(MAKE) -C $(DOCS) html

clean:
clean: # Clean up release (wheel) build areas
rm -rf dist
rm -rf build

cleandocs:
cleandocs: # Clean up docs build areas
rm -rf $(DOCS)/build

cleanall: clean cleandocs
cleanall: # Clean every thing
clean
cleandocs
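
The ``help`` target above builds its message by matching lines of the form ``target: # description``. The following is a minimal Python sketch of that matching rule, not part of the PR itself; the function names ``extract_help`` and ``format_help`` and the sample text are hypothetical and exist only for illustration.

```python
import re

# Same shape as the egrep pattern in the help target:
# "<target>: # <description>"
HELP_LINE = re.compile(r"^(.+): # (.+)")


def extract_help(makefile_text):
    """Return (target, description) pairs for lines documented with ': # '."""
    entries = []
    for line in makefile_text.splitlines():
        match = HELP_LINE.match(line)
        if match:
            entries.append((match.group(1).strip(), match.group(2).strip()))
    return entries


def format_help(entries):
    """Pad target names to a common width, roughly what 'column -t' does."""
    width = max((len(target) for target, _ in entries), default=0)
    return [f"{target.ljust(width)}  {desc}" for target, desc in entries]


# Hypothetical sample resembling the documented targets in the diff.
SAMPLE = (
    "help: # Show this help message\n"
    "release: # Build wheel\n"
    "\tpython setup.py sdist bdist_wheel\n"
    "docs: # Build documentation\n"
)

if __name__ == "__main__":
    for row in format_help(extract_help(SAMPLE)):
        print(row)
```

Under this pattern, a line only shows up in the help output when the comment follows the colon directly; a line such as ``cleanall: clean cleandocs # ...`` with prerequisites in between would not match.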
84 changes: 29 additions & 55 deletions docs/source/quick_start.rst
@@ -121,39 +121,17 @@ Maestro will launch a conductor in the background using ``nohup`` in order to mo
Monitoring a Running Study
***************************

Once the conductor is spun up, you will be returned to the command line prompt. There should now be a ``.tests/lulesh`` directory within the root of the repository. This directory represents the executing study's workspace, or where Maestro will place this study's data, logs, and state. For a more in-depth description of the contents of a workspace see the documentation about :doc:`Study Workspaces <./maestro_core>`.
Once the conductor is spun up, you will be returned to the command line prompt. There should now be a ``./tests/lulesh`` directory within the root of the repository. This directory represents the executing study's workspace, or where Maestro will place this study's data, logs, and state. For a more in-depth description of the contents of a workspace see the documentation about :doc:`Study Workspaces <./maestro_core>`.

To check the status of a running study, use the ``maestro status`` subcommand. The only required parameter to the status command is the path to the running study's workspace. In this case, the command to find the status of the running study (from the root of the repository) is::

    $ maestro status ./tests/lulesh

The resulting output will look something like below::

Step Name Workspace State Run Time Elapsed Time Start Time Submit Time End Time Number Restarts
---------------------------------- ------------------- ----------- -------------- -------------- -------------------------- -------------------------- -------------------------- -----------------
run-lulesh_ITER.20.SIZE.20 ITER.20.SIZE.20 FINISHED 0:00:00.226297 0:00:00.226320 2018-08-07 12:54:23.233567 2018-08-07 12:54:23.233544 2018-08-07 12:54:23.459864 0
post-process-lulesh post-process-lulesh INITIALIZED --:--:-- --:--:-- -- -- -- 0
post-process-lulesh-trials_TRIAL.9 TRIAL.9 INITIALIZED --:--:-- --:--:-- -- -- -- 0
post-process-lulesh-trials_TRIAL.8 TRIAL.8 INITIALIZED --:--:-- --:--:-- -- -- -- 0
post-process-lulesh-size_SIZE.10 SIZE.10 INITIALIZED --:--:-- --:--:-- -- -- -- 0
post-process-lulesh-trials_TRIAL.1 TRIAL.1 INITIALIZED --:--:-- --:--:-- -- -- -- 0
post-process-lulesh-trials_TRIAL.3 TRIAL.3 INITIALIZED --:--:-- --:--:-- -- -- -- 0
post-process-lulesh-trials_TRIAL.2 TRIAL.2 INITIALIZED --:--:-- --:--:-- -- -- -- 0
post-process-lulesh-trials_TRIAL.5 TRIAL.5 INITIALIZED --:--:-- --:--:-- -- -- -- 0
post-process-lulesh-trials_TRIAL.4 TRIAL.4 INITIALIZED --:--:-- --:--:-- -- -- -- 0
post-process-lulesh-trials_TRIAL.7 TRIAL.7 INITIALIZED --:--:-- --:--:-- -- -- -- 0
post-process-lulesh-trials_TRIAL.6 TRIAL.6 INITIALIZED --:--:-- --:--:-- -- -- -- 0
run-lulesh_ITER.30.SIZE.20 ITER.30.SIZE.20 FINISHED 0:00:00.543726 0:00:00.543743 2018-08-07 12:54:23.469009 2018-08-07 12:54:23.468992 2018-08-07 12:54:24.012735 0
run-lulesh_ITER.10.SIZE.20 ITER.10.SIZE.20 FINISHED 0:00:00.148773 0:00:00.148794 2018-08-07 12:54:23.068119 2018-08-07 12:54:23.068098 2018-08-07 12:54:23.216892 0
post-process-lulesh-size_SIZE.30 SIZE.30 INITIALIZED --:--:-- --:--:-- -- -- -- 0
run-lulesh_ITER.20.SIZE.30 ITER.20.SIZE.30 FINISHED 0:00:01.066736 0:00:01.066757 2018-08-07 12:54:24.892856 2018-08-07 12:54:24.892835 2018-08-07 12:54:25.959592 0
run-lulesh_ITER.30.SIZE.10 ITER.30.SIZE.10 FINISHED 0:00:00.054475 0:00:00.054488 2018-08-07 12:54:23.005877 2018-08-07 12:54:23.005864 2018-08-07 12:54:23.060352 0
make-lulesh make-lulesh FINISHED 0:00:05.416096 0:00:05.416109 2018-08-07 12:53:17.395362 2018-08-07 12:53:17.395349 2018-08-07 12:53:22.811458 0
run-lulesh_ITER.10.SIZE.10 ITER.10.SIZE.10 FINISHED 0:00:00.043584 0:00:00.043610 2018-08-07 12:54:22.905328 2018-08-07 12:54:22.905302 2018-08-07 12:54:22.948912 0
run-lulesh_ITER.20.SIZE.10 ITER.20.SIZE.10 FINISHED 0:00:00.035449 0:00:00.035463 2018-08-07 12:54:22.958755 2018-08-07 12:54:22.958741 2018-08-07 12:54:22.994204 0
run-lulesh_ITER.10.SIZE.30 ITER.10.SIZE.30 FINISHED 0:00:00.812721 0:00:00.812764 2018-08-07 12:54:24.069466 2018-08-07 12:54:24.069423 2018-08-07 12:54:24.882187 0
post-process-lulesh-size_SIZE.20 SIZE.20 INITIALIZED --:--:-- --:--:-- -- -- -- 0
run-lulesh_ITER.30.SIZE.30 ITER.30.SIZE.30 FINISHED 0:00:01.376227 0:00:01.376240 2018-08-07 12:54:25.968730 2018-08-07 12:54:25.968717 2018-08-07 12:54:27.344957 0
The resulting output will look something like below:

.. image:: ./status_layouts/flat_layout_lulesh_in_progress.png
   :width: 1735
   :alt: Flat layout view of status of an in progress study


The general statuses that are usually encountered are:
@@ -163,6 +141,7 @@ The general statuses that are usually encountered are:
- ``FINISHED``: A step that has completed successfully.
- ``FAILED``: A step that during execution encountered a non-zero error code.


Cancelling a Running Study
***************************

@@ -172,30 +151,25 @@ Similar to checking the status of a running study, cancelling a study uses the `

.. note:: Cancelling a study is not instantaneous. The background conductor is a daemon which spins up periodically, so cancellation occurs the next time the conductor returns from sleeping and sees that a cancel has been triggered.

When a study is cancelled, the cancellation is reflected in the status when calling the ``maestro status`` command::

Step Name Workspace State Run Time Elapsed Time Start Time Submit Time End Time Number Restarts
---------------------------------- ------------------- --------- -------------- -------------- -------------------------- -------------------------- -------------------------- -----------------
run-lulesh_ITER.20.SIZE.20 ITER.20.SIZE.20 FINISHED 0:00:00.238367 0:00:00.238549 2018-08-07 17:24:04.178433 2018-08-07 17:24:04.178251 2018-08-07 17:24:04.416800 0
post-process-lulesh post-process-lulesh CANCELLED --:--:-- --:--:-- -- -- 2018-08-07 17:25:06.813454 0
post-process-lulesh-trials_TRIAL.9 TRIAL.9 CANCELLED --:--:-- --:--:-- -- -- 2018-08-07 17:25:06.813207 0
post-process-lulesh-trials_TRIAL.8 TRIAL.8 CANCELLED --:--:-- --:--:-- -- -- 2018-08-07 17:25:06.812957 0
post-process-lulesh-size_SIZE.10 SIZE.10 CANCELLED --:--:-- --:--:-- -- -- 2018-08-07 17:25:06.809833 0
post-process-lulesh-trials_TRIAL.1 TRIAL.1 CANCELLED --:--:-- --:--:-- -- -- 2018-08-07 17:25:06.810962 0
post-process-lulesh-trials_TRIAL.3 TRIAL.3 CANCELLED --:--:-- --:--:-- -- -- 2018-08-07 17:25:06.811659 0
post-process-lulesh-trials_TRIAL.2 TRIAL.2 CANCELLED --:--:-- --:--:-- -- -- 2018-08-07 17:25:06.811368 0
post-process-lulesh-trials_TRIAL.5 TRIAL.5 CANCELLED --:--:-- --:--:-- -- -- 2018-08-07 17:25:06.812205 0
post-process-lulesh-trials_TRIAL.4 TRIAL.4 CANCELLED --:--:-- --:--:-- -- -- 2018-08-07 17:25:06.811927 0
post-process-lulesh-trials_TRIAL.7 TRIAL.7 CANCELLED --:--:-- --:--:-- -- -- 2018-08-07 17:25:06.812708 0
post-process-lulesh-trials_TRIAL.6 TRIAL.6 CANCELLED --:--:-- --:--:-- -- -- 2018-08-07 17:25:06.812458 0
run-lulesh_ITER.30.SIZE.20 ITER.30.SIZE.20 FINISHED 0:00:00.324670 0:00:00.324849 2018-08-07 17:24:04.425894 2018-08-07 17:24:04.425715 2018-08-07 17:24:04.750564 0
run-lulesh_ITER.10.SIZE.20 ITER.10.SIZE.20 FINISHED 0:00:00.134795 0:00:00.135016 2018-08-07 17:24:04.032750 2018-08-07 17:24:04.032529 2018-08-07 17:24:04.167545 0
post-process-lulesh-size_SIZE.30 SIZE.30 CANCELLED --:--:-- --:--:-- -- -- 2018-08-07 17:25:06.810583 0
run-lulesh_ITER.20.SIZE.30 ITER.20.SIZE.30 FINISHED 0:00:00.678922 0:00:00.679114 2018-08-07 17:24:05.129377 2018-08-07 17:24:05.129185 2018-08-07 17:24:05.808299 0
run-lulesh_ITER.30.SIZE.10 ITER.30.SIZE.10 FINISHED 0:00:00.048609 0:00:00.048803 2018-08-07 17:24:03.974073 2018-08-07 17:24:03.973879 2018-08-07 17:24:04.022682 0
make-lulesh make-lulesh FINISHED 0:00:04.979883 0:00:04.980055 2018-08-07 17:22:58.735953 2018-08-07 17:22:58.735781 2018-08-07 17:23:03.715836 0
run-lulesh_ITER.10.SIZE.10 ITER.10.SIZE.10 FINISHED 0:00:00.045598 0:00:00.045783 2018-08-07 17:24:03.853461 2018-08-07 17:24:03.853276 2018-08-07 17:24:03.899059 0
run-lulesh_ITER.20.SIZE.10 ITER.20.SIZE.10 FINISHED 0:00:00.044422 0:00:00.044655 2018-08-07 17:24:03.912904 2018-08-07 17:24:03.912671 2018-08-07 17:24:03.957326 0
run-lulesh_ITER.10.SIZE.30 ITER.10.SIZE.30 FINISHED 0:00:00.359750 0:00:00.359921 2018-08-07 17:24:04.760954 2018-08-07 17:24:04.760783 2018-08-07 17:24:05.120704 0
post-process-lulesh-size_SIZE.20 SIZE.20 CANCELLED --:--:-- --:--:-- -- -- 2018-08-07 17:25:06.810216 0
run-lulesh_ITER.30.SIZE.30 ITER.30.SIZE.30 FINISHED 0:00:00.915474 0:00:00.915682 2018-08-07 17:24:05.818191 2018-08-07 17:24:05.817983 2018-08-07 17:24:06.733665 0
When a study is cancelled, the cancellation is reflected in the status when calling the ``maestro status`` command:

.. image:: ./status_layouts/flat_layout_lulesh_cancelled.png
   :width: 1735
   :alt: Flat layout view of status of a cancelled study


Status Layouts
**************

There are currently two layouts for viewing the status. The default ``flat`` layout shown above can be hard to read in narrow terminals or in studies with long step/parameter combo names. For these cases a ``narrow`` layout has also been implemented, compressing the width and making the status table taller. The extra room also allows the addition of per-step tables of parameter names and values. To switch between the two, use the ``--layout`` option with either ``flat`` or ``narrow`` as shown below::

    $ maestro status ./tests/lulesh --layout narrow


A snippet of the narrow layout for the above study is shown below. These layouts are computed by the status command, so you can alternate between them in the same study without issue:

.. image:: ./status_layouts/narrow_layout_lulesh_cancelled.png
   :alt: Narrow layout view of status of a cancelled study

Note these layouts read some colors from your terminal theme, including the background color. The red for the state, yellow for jobid and the blue color on alternating columns are currently fixed. Snapshots reproduced using the encom theme for iTerm2.
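
Because the layouts are computed at display time, the same step records can be rendered either way. The sketch below illustrates that idea with a small self-registering layout registry. It is not Maestro's implementation: every class and function name here (``BaseLayout``, ``FlatLayout``, ``NarrowLayout``, ``render_status``) and the record fields are hypothetical, and plain strings stand in for the rich tables used by the real status command.

```python
from abc import ABC, abstractmethod


class BaseLayout(ABC):
    """Hypothetical base class; named subclasses register themselves."""

    registry = {}
    name = None

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        if cls.name:
            BaseLayout.registry[cls.name] = cls

    @abstractmethod
    def render(self, records):
        """Return the status as a list of text lines."""


class FlatLayout(BaseLayout):
    name = "flat"

    def render(self, records):
        # One wide row per step.
        return [f"{r['name']} | {r['state']}" for r in records]


class NarrowLayout(BaseLayout):
    name = "narrow"

    def render(self, records):
        # Several short rows per step, with room for parameter name/value pairs.
        lines = []
        for r in records:
            lines.append(r["name"])
            lines.append(f"  state: {r['state']}")
            for key, value in r["params"].items():
                lines.append(f"  {key}: {value}")
        return lines


def render_status(records, layout="flat"):
    """Render the same records with whichever registered layout is requested."""
    try:
        layout_cls = BaseLayout.registry[layout]
    except KeyError:
        choices = ", ".join(sorted(BaseLayout.registry))
        raise ValueError(f"unknown layout {layout!r}; choose from: {choices}") from None
    return layout_cls().render(records)


if __name__ == "__main__":
    # Made-up record shaped like a parameterized step from the example study.
    records = [
        {
            "name": "run-lulesh_ITER.10.SIZE.10",
            "state": "FINISHED",
            "params": {"ITER": 10, "SIZE": 10},
        }
    ]
    for layout in ("flat", "narrow"):
        print("\n".join(render_status(records, layout)))
```

Keeping the records separate from their presentation is what lets a user switch layouts between calls without touching the study's stored state.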