Status tweaks (#358)
* Add bfs/dfs ordered status output

* Prototype rich formatted status tables with cli layout switch

* Fix bfs ordering option, make it default

* Remove unhelpful log output..

* Add rich dependency

* Add rich dependency to setup.py

* Quit whining flake8...

* More flake8 drama..

* Cache status ordering to improve scalability

* Add bfs/dfs ordered status output

* Prototype rich formatted status tables with cli layout switch

* Fix bfs ordering option, make it default

* Remove unhelpful log output..

* Quit whining flake8...

* More flake8 drama..

* Cache status ordering to improve scalability

* Tweak to python version for rich.

* Attach params to step records to enable use in status output

* Add param name:value table in narrow status layout

* Check for new status column to enable backwards compatibility

* Checkpoint on renderer factory implementation

* Cleanup after successful test

* Remove extraneous debug logging

* Fix bad indent, tweak themes to play nice on different terminal themes

* Unit test for flat status layout

* Fix erroneous whitespace

* Add step root to workspace in status of parameterized steps

* Rework status tests, add narrow layout test, rebaseline

* Remove stray debug printing

* Add help target and some documentation to the make file

* Document layouts, add some test comments

* Rename status renderer test file so pytest can find it automatically

* Fix bug causing duplicate entries in narrow layout

* Update narrow layout test baseline

* Add layout screenshots to docs

* Add legacy table format back to the status layout options

* Convert base renderer to abstract, implement auto registration and
auto layout cli choice list

* Add some proper google style doc strings to render factory

* Docstring, arg defaults change for narrow renderer

* Sync up layout methods and doc strings of status renderers

* Fix up docstrings on params for StepRecords

* Fix incorrect return type documentation

* Bug fixes/style tweaks on narrow status layout

* Remove debug output

* Add documentation for legacy status layout

Co-authored-by: Francesco Di Natale <frank.dinatale1988@gmail.com>
jwhite242 and FrankD412 authored Jun 5, 2021
1 parent 6716ec7 commit 89eb9cb
Showing 18 changed files with 1,055 additions and 154 deletions.
24 changes: 17 additions & 7 deletions Makefile
@@ -1,21 +1,31 @@
DOCS = docs
EGG = $(wildcard *.egg-info)

.PHONY: cleanall clean cleandocs docs
.PHONY: cleanall clean cleandocs docs help release

all: cleanall release docs
# Use '#' comments to auto document each target in the help message
help: # Show this help message
@echo 'usage: make [target] ...'
@echo
@echo 'targets:'
@egrep '^(.+)\:\ #\ (.+)' ${MAKEFILE_LIST} | column -t -c 2 -s ':#'

release:
all: # Clean and then build everything
cleanall release docs

release: # Build wheel
python setup.py sdist bdist_wheel

docs:
docs: # Build documentation
$(MAKE) -C $(DOCS) html

clean:
clean: # Clean up release (wheel) build areas
rm -rf dist
rm -rf build

cleandocs:
cleandocs: # Clean up docs build areas
rm -rf $(DOCS)/build

cleanall: clean cleandocs
cleanall: # Clean everything
clean
cleandocs
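
As a rough illustration of what the ``help`` target's ``egrep`` pattern selects, the Python sketch below applies an equivalent regular expression to a small Makefile excerpt. The function name ``documented_targets`` and the ``SAMPLE`` text are assumptions made for this example and are not part of the commit; the real target also pipes the matches through ``column`` for alignment, which is only approximated here with string padding.

```python
import re

# Same shape as the egrep pattern in the help target: "<target>: # <description>"
HELP_LINE = re.compile(r"^(.+): # (.+)")


def documented_targets(makefile_text):
    """Return (target, description) pairs for lines like 'name: # text'."""
    pairs = []
    for line in makefile_text.splitlines():
        match = HELP_LINE.match(line)
        if match:
            pairs.append((match.group(1).strip(), match.group(2).strip()))
    return pairs


# Hypothetical excerpt modelled on the Makefile in this diff
SAMPLE = """\
help: # Show this help message
release: # Build wheel
\tpython setup.py sdist bdist_wheel
docs: # Build documentation
\t$(MAKE) -C $(DOCS) html
.PHONY: help release docs
"""

for target, description in documented_targets(SAMPLE):
    # Pad the target name so descriptions line up, roughly as column -t would
    print(f"{target:<10} {description}")
```

Only lines that carry a ``: # `` separator are listed, which is why recipe lines and the ``.PHONY`` declaration do not appear in the help output.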
60 changes: 31 additions & 29 deletions docs/source/quick_start.rst
@@ -121,39 +121,17 @@ Maestro will launch a conductor in the background using ``nohup`` in order to mo
Monitoring a Running Study
***************************

Once the conductor is spun up, you will be returned to the command line prompt. There should now be a ``.tests/lulesh`` directory within the root of the repository. This directory represents the executing study's workspace, or where Maestro will place this study's data, logs, and state. For a more in-depth description of the contents of a workspace see the documentation about :doc:`Study Workspaces <./maestro_core>`.
Once the conductor is spun up, you will be returned to the command line prompt. There should now be a ``./tests/lulesh`` directory within the root of the repository. This directory represents the executing study's workspace, or where Maestro will place this study's data, logs, and state. For a more in-depth description of the contents of a workspace see the documentation about :doc:`Study Workspaces <./maestro_core>`.

To check the status of a running study, use the ``maestro status`` subcommand. The only required parameter to the status command is the path to the running study's workspace. In this case, the command to check the status of the running study (from the root of the repository) is::

    $ maestro status ./tests/lulesh

The resulting output will look something like below::

Step Name Workspace State Run Time Elapsed Time Start Time Submit Time End Time Number Restarts
---------------------------------- ------------------- ----------- -------------- -------------- -------------------------- -------------------------- -------------------------- -----------------
run-lulesh_ITER.20.SIZE.20 ITER.20.SIZE.20 FINISHED 0:00:00.226297 0:00:00.226320 2018-08-07 12:54:23.233567 2018-08-07 12:54:23.233544 2018-08-07 12:54:23.459864 0
post-process-lulesh post-process-lulesh INITIALIZED --:--:-- --:--:-- -- -- -- 0
post-process-lulesh-trials_TRIAL.9 TRIAL.9 INITIALIZED --:--:-- --:--:-- -- -- -- 0
post-process-lulesh-trials_TRIAL.8 TRIAL.8 INITIALIZED --:--:-- --:--:-- -- -- -- 0
post-process-lulesh-size_SIZE.10 SIZE.10 INITIALIZED --:--:-- --:--:-- -- -- -- 0
post-process-lulesh-trials_TRIAL.1 TRIAL.1 INITIALIZED --:--:-- --:--:-- -- -- -- 0
post-process-lulesh-trials_TRIAL.3 TRIAL.3 INITIALIZED --:--:-- --:--:-- -- -- -- 0
post-process-lulesh-trials_TRIAL.2 TRIAL.2 INITIALIZED --:--:-- --:--:-- -- -- -- 0
post-process-lulesh-trials_TRIAL.5 TRIAL.5 INITIALIZED --:--:-- --:--:-- -- -- -- 0
post-process-lulesh-trials_TRIAL.4 TRIAL.4 INITIALIZED --:--:-- --:--:-- -- -- -- 0
post-process-lulesh-trials_TRIAL.7 TRIAL.7 INITIALIZED --:--:-- --:--:-- -- -- -- 0
post-process-lulesh-trials_TRIAL.6 TRIAL.6 INITIALIZED --:--:-- --:--:-- -- -- -- 0
run-lulesh_ITER.30.SIZE.20 ITER.30.SIZE.20 FINISHED 0:00:00.543726 0:00:00.543743 2018-08-07 12:54:23.469009 2018-08-07 12:54:23.468992 2018-08-07 12:54:24.012735 0
run-lulesh_ITER.10.SIZE.20 ITER.10.SIZE.20 FINISHED 0:00:00.148773 0:00:00.148794 2018-08-07 12:54:23.068119 2018-08-07 12:54:23.068098 2018-08-07 12:54:23.216892 0
post-process-lulesh-size_SIZE.30 SIZE.30 INITIALIZED --:--:-- --:--:-- -- -- -- 0
run-lulesh_ITER.20.SIZE.30 ITER.20.SIZE.30 FINISHED 0:00:01.066736 0:00:01.066757 2018-08-07 12:54:24.892856 2018-08-07 12:54:24.892835 2018-08-07 12:54:25.959592 0
run-lulesh_ITER.30.SIZE.10 ITER.30.SIZE.10 FINISHED 0:00:00.054475 0:00:00.054488 2018-08-07 12:54:23.005877 2018-08-07 12:54:23.005864 2018-08-07 12:54:23.060352 0
make-lulesh make-lulesh FINISHED 0:00:05.416096 0:00:05.416109 2018-08-07 12:53:17.395362 2018-08-07 12:53:17.395349 2018-08-07 12:53:22.811458 0
run-lulesh_ITER.10.SIZE.10 ITER.10.SIZE.10 FINISHED 0:00:00.043584 0:00:00.043610 2018-08-07 12:54:22.905328 2018-08-07 12:54:22.905302 2018-08-07 12:54:22.948912 0
run-lulesh_ITER.20.SIZE.10 ITER.20.SIZE.10 FINISHED 0:00:00.035449 0:00:00.035463 2018-08-07 12:54:22.958755 2018-08-07 12:54:22.958741 2018-08-07 12:54:22.994204 0
run-lulesh_ITER.10.SIZE.30 ITER.10.SIZE.30 FINISHED 0:00:00.812721 0:00:00.812764 2018-08-07 12:54:24.069466 2018-08-07 12:54:24.069423 2018-08-07 12:54:24.882187 0
post-process-lulesh-size_SIZE.20 SIZE.20 INITIALIZED --:--:-- --:--:-- -- -- -- 0
run-lulesh_ITER.30.SIZE.30 ITER.30.SIZE.30 FINISHED 0:00:01.376227 0:00:01.376240 2018-08-07 12:54:25.968730 2018-08-07 12:54:25.968717 2018-08-07 12:54:27.344957 0
The resulting output will look something like below:

.. image:: ./status_layouts/flat_layout_lulesh_in_progress.png
   :width: 1735
   :alt: Flat layout view of status of an in progress study


The general statuses that are usually encountered are:
@@ -163,6 +141,7 @@ The general statuses that are usually encountered are:
- ``FINISHED``: A step that has completed successfully.
- ``FAILED``: A step that during execution encountered a non-zero error code.


Cancelling a Running Study
***************************

@@ -172,7 +151,30 @@ Similar to checking the status of a running study, cancelling a study uses the `

.. note:: Cancelling a study is not instantaneous. The background conductor is a daemon which spins up periodically, so cancellation occurs the next time the conductor returns from sleeping and sees that a cancel has been triggered.

When a study is cancelled, the cancellation is reflected in the status when calling the ``maestro status`` command::
When a study is cancelled, the cancellation is reflected in the status when calling the ``maestro status`` command:

.. image:: ./status_layouts/flat_layout_lulesh_cancelled.png
   :width: 1735
   :alt: Flat layout view of status of a cancelled study


Status Layouts
**************

There are currently three layouts for viewing the status. The default ``flat`` layout shown above can be hard to read in narrow terminals or in studies with long step/parameter combination names. For these cases a ``narrow`` layout is also available; it compresses the width and makes the status table taller. The extra room also allows per-step tables of parameter names and values. To switch between these two, use the ``--layout`` option with either ``flat`` or ``narrow`` as shown below::

    $ maestro status ./tests/lulesh --layout narrow


A snippet of the narrow layout for the above study is shown below. These layouts are computed by the ``status`` command, so you can alternate between them in the same study without issue:

.. image:: ./status_layouts/narrow_layout_lulesh_cancelled.png
   :alt: Narrow layout view of status of a cancelled study

Note that these layouts read some colors from your terminal theme, including the background color. The red for the state, the yellow for the jobid, and the blue on alternating columns are currently fixed. The screenshots were produced using the encom theme for iTerm2.
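
The commit message mentions an abstract base renderer with auto registration and an automatically generated list of layout choices for the CLI. The following is a minimal sketch of how such a pattern can work in general, not Maestro's actual implementation: the class names, the ``registry`` attribute, the ``get_renderer`` helper, and the plain-text output are all hypothetical and chosen only for illustration.

```python
import argparse
from abc import ABC, abstractmethod


class BaseStatusRenderer(ABC):
    """Hypothetical abstract base; subclasses register under a layout name."""

    registry = {}

    def __init_subclass__(cls, layout=None, **kwargs):
        super().__init_subclass__(**kwargs)
        if layout is not None:
            # Auto registration: defining a subclass is enough to expose it
            BaseStatusRenderer.registry[layout] = cls

    @abstractmethod
    def render(self, rows):
        """Return a printable string for a list of step records."""


class FlatRenderer(BaseStatusRenderer, layout="flat"):
    def render(self, rows):
        # One wide line per step
        return "\n".join(f"{r['name']}  {r['state']}" for r in rows)


class NarrowRenderer(BaseStatusRenderer, layout="narrow"):
    def render(self, rows):
        # Taller output: each field on its own line under the step name
        return "\n".join(f"{r['name']}\n  state: {r['state']}" for r in rows)


def get_renderer(layout):
    """Look up and instantiate the renderer registered for a layout name."""
    return BaseStatusRenderer.registry[layout]()


# The CLI choice list is derived from the registry rather than hard-coded
parser = argparse.ArgumentParser(prog="status-sketch")
parser.add_argument(
    "--layout", choices=sorted(BaseStatusRenderer.registry), default="flat"
)

args = parser.parse_args(["--layout", "narrow"])
rows = [{"name": "make-lulesh", "state": "FINISHED"}]
print(get_renderer(args.layout).render(rows))
```

With this design, adding a new layout only requires defining another subclass; the ``--layout`` choices pick it up without further changes to the argument parser.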

Finally, the original status layout is still available via the ``legacy`` option. Note that this renderer can be difficult to read in narrow terminals and in studies with many parameters because its columns do not wrap. An example from lulesh is shown below::

Step Name Workspace State Run Time Elapsed Time Start Time Submit Time End Time Number Restarts
---------------------------------- ------------------- --------- -------------- -------------- -------------------------- -------------------------- -------------------------- -----------------