Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature Request: Add a pause with approval to continue between playbooks in workflow job #1206

Closed
lsantiag86 opened this issue Feb 12, 2018 · 49 comments

Comments

@lsantiag86
Copy link

ISSUE TYPE
  • Feature Idea
COMPONENT NAME
  • UI
SUMMARY

I am requesting a pause feature be included as one of the steps in a job workflow in-between playbooks so that a user can give the +1 to continue on to the next playbook in the job workflow.

This will give the user the unlimited amount of time they need to do their manual steps, but also allow them to continue as fast as they are ready with out having to wait on some other trigger.

@wenottingham
Copy link
Contributor

This is not something we would likely get to in the near future, but it has been requested a few times, including:

  • setting a specific approver != the job submitter
  • setting multiple approvers

@wenottingham
Copy link
Contributor

Note that the best workaround for this case is using wait_for or get_url tasks in Ansible waiting for a flag/port to be available that is set by your approval process.

@ghost
Copy link

ghost commented Sep 1, 2018

This is much needed feature.

The workaround which you are suggesting wait_for, cannot be used in our case. Operations users in our company only have UI access and they are not allowed to login into CLI and set some flag in any way.

@wenottingham
Copy link
Contributor

wait_for and get_url don't require local CLI accesss, they can check ports on remote machines (wait_for) or pretty much any URL state such as a ticket state (get_url).

@piroux
Copy link
Contributor

piroux commented Mar 19, 2019

I second @ghost, this feature would be really useful.
I do believe that wait_for would be imperfect as it will use AWX resources (for a running job template) for polling so that it would not scale well if used for all workflows to run at the same time.

I was thinking that maybe, rather than having this "Approval feature" as a Workflow feature only, having it placed at the Job Template level would be a better idea because it could be used for each node of a Workflow and for single Job Template too.
Until one of the approvers could actually approve the run of the job template, it would then stay in a "Pending Approval" state.

Added to the already available job scheduling features, it would really increase the usefulness of the project, according to me.

@matburt
Copy link
Member

matburt commented Apr 25, 2019

This is most likely going to involve:

  • Adding a job state to indicate to the task manager to not process the job
  • Adding a job state to indicate approved/rejected
  • Updating the task manager to handle not processing the job until it's been approved
  • RBAC for delegating approval
  • Notification support for alerting approvers
  • UI/UX work for approval/rejecting
    • Workflow node to hook approval/rejecting

----- Future ----- ?

  • External approval/rejecting
    • workflow role execution that passes/fails
    • http poll?

@wenottingham
Copy link
Contributor

wenottingham commented Apr 25, 2019

@matburt unless the approval step is an actual task with certain parameters

@mabashian
Copy link
Member

Thoughts on UI impact:

Workflow Editor
We'll need some way of defining or denoting a wait node. I think I would prefer this be it's own node as opposed to being a flag attached to an existing node (with a JT, etc, etc) but am open to discussion there. I think it would make the graph easier to read/understand at a glance. If we do go down the path of making wait it's own node then we'll need to update the add/edit node form to give the user the ability to select a template/project/inv or define the node as wait. Selecting wait may expose some more fields in that form:

  1. Who the approver(s) is/are. Should this be optional? No approvers means anyone with execute access on the wfjt can resume execution?
  2. Notifications - This is a bit unclear to me. Do we want to give users the ability to select a notification to send inside the workflow editor or are we envisioning them interacting with the Notifications tab at the wfjt level to do this?

Question: Can you have more than one wait node in a workflow? If not then we'll need to handle that restriction in the UI.

Jobs List

  1. Show new paused state on workflow jobs where necessary
  2. Add play/resume button to paused workflow jobs when user has the ability to continue the job run.

Question: Should the act of resuming/approving a job take the user to the workflow results page or leave them on the current page?

My View Jobs List

  1. Show new paused state on workflow jobs where necessary
  2. Add play/resume button to paused workflow jobs when user has the ability to continue the job run.

Workflow Results

  1. Show paused state
  2. We probably want some visual indication on the graph itself that the workflow is paused. Maybe we draw attention to the wait node somehow?
  3. If he user viewing the results page can resume/continue the run then we should probably expose a button to do that (maybe up by where the relaunch button is shown). Same button that we would use in the list view.

Question: Do we ever see a situation/need arising whereby a user could establish a wait node but rather than requiring approval it requires additional user input (like prompt on launch)? If so, I'm not sure that it would impact our architecture decisions but it may be something to think about.

I could see an argument being made for providing our users with a more streamlined view of all jobs awaiting your approval rather than making them go digging through the jobs list to find the job. Maybe we could show a toast or banner across the top of the dashboard letting them know that a job is awaiting their approval(?)

@matburt
Copy link
Member

matburt commented May 1, 2019

This is tricky, because in order for this to work on a standard job template the approval/disapproval would have to be associated with the job itself. If we carry that forward into Workflows it would cause the user to have to create a stub job to act as the approval step, but for notifications it would give us something to hang a notification onto.

We could go a different way with this entirely, which I think is what the original proposal is asking for anyway, and require that approvals use a workflow even if they are just looking to run a single job. This is what I'm leaning towards now. The approval node then becomes another node type of WorkflowJobTemplateNode that links to something like a WorkflowLogicNode. This node, at least at first, could encapsulate just this Approval feature (also a Pause-for-time feature?) but in the future could be used to introduce more complex logic to a workflow that doesn't strictly adhere to a traiditional UnifiedJobTemplate.

Given this, you could have as many of these as you want, anywhere in your Workflow. A Workflow overall could also include a timeout value where workflows don't hangout forever... that they eventually expire if no one comes along to approve it.

Given that we'd do this entirely from workflows, it could appear as something distinctive within the Workflow detail view indicating which node needs approval. My intuition is that this will also need some sort of indicator on the Job list view (where we indicate which Workflows are waiting on approval). Probably also from the Dashboard?

Question: Do we ever see a situation/need arising whereby a user could establish a wait node but rather than requiring approval it requires additional user input (like prompt on launch)? If so, I'm not sure that it would impact our architecture decisions but it may be something to think about.

I alluded to this up above, but I could see doing something like this. I think my proposed architecture here would support this in the future without us having to implement anything specific right now.

RBAC
For RBAC we could add another Organization Role for Org-level approvals but also add a role to Workflows to delegate approvals on individual workflows themselves.

@wenottingham
Copy link
Contributor

Yeah, I'm fine with this being just for a workflow node, and if you want it for a single JT, create a workflow with it.

For RBAC, etc. a MVP would be two choices - '"approve by anyone who can execute", and "approve by org admin or above". We could consider other choices, but I don't think we need them initially.

@matburt
Copy link
Member

matburt commented May 1, 2019

Yep. I think Org Admin or higher would work here.

@matburt
Copy link
Member

matburt commented May 17, 2019

NotificationTemplates can/should be bound on the WorkflowJobTemplate abstraction.

@matburt
Copy link
Member

matburt commented May 17, 2019

  • Time outs on these elements as failures

@kdelee
Copy link
Member

kdelee commented May 17, 2019

From an API perspective there will be some way to get a list of nodes that are awaiting approval
The approval will be sent to the node in question
The approval needs to create an activity stream entry including the user who approved + node
The workflow is in “running” even if all it has is a node awaiting approval

@mabashian
Copy link
Member

Proposed UI changes (this likely isn't the final iteration):

Creating a pause node in a workflow job template

  • When creating a new node in a workflow a new option Pause will be added. We currently expose 3 different lists in the node form and we use tabs to all the user to toggle between them. We'll be changing those tabs to a dropdown (default is to show the Templates list) to allow for more options.
  • Selecting the Pause node type will present the user with a few additional options:
  1. A timeout input which will let the user override the default timeout for the pause. This is the amount of time that an approver will have to respond before the workflow continues as if "deny" was selected by the approver.
  2. A series of rbac related options will be presented to the user. I'm envisioning this being a series of checkboxes or radio buttons that let the creator of the workflow define who can approve/pause the continuation of the workflow. The exact roles are still tbd at this point but some ideas might be: org admin or above or anyone who can run - something like that
  3. (Potential) We may expose a multi-select lookup field for specifying notification templates that should be executed on the pause node event (think getting a Slack notification when a workflow has been launched and has hit the point where you need to approve). The UX here would mirror something like the Instance Groups lookup that is already exposed in many forms.
  • There's no prompting associated with this node type
  • You can have more than one pause node in a WFJT. They can even be back to back - no restrictions expected there.
  • The display of the pause node in the workflow should be self-obvious. We'll display some text like: Pause - Await Approval or similar. This applies to both the workflow editor and the results.

Approving/denying continued execution

  • A new status/icon will be added to the application header which will indicate to the user that they have pending jobs awaiting approval. We'll need to fetch this data from the api on certain actions. For sure we'll fetch it after login and when the user clicks on icon itself. We probably want to fetch it on any manual refresh of the page as well. We do the same thing with the config endpoint. There are no plans for websocket support here so there's no expectation that this status/icon will update in real time.
  • When the user clicks on this icon, a drawer will slide out from the right-hand side of the page exposing the list of jobs pending approval. In the list, the workflow job template name and job id will be displayed (same as jobs list) as well as a timestamp for when the job was paused and a timestamp for when the pause expires. Finally, two options will be presented to the user (Approve/Deny). Clicking one of these options will make the corresponding request to the api. After an option is chosen, the row will temporarily remain in the list (displaying which option was chosen). The row will no longer appear in the list if the page is refreshed or if the list of jobs awaiting approval is hidden. Clicking on the job name in the drawer should navigate the user to the job in question.

Jobs List

  • If the API creates a new "state" for a WFJT (paused) then the UI will need to expose that anywhere that we display jobs
  • It's possible we may want to expose approve/deny at the job list level. This is TBD.

Workflow Results

  • If provided by the API, we may want to update the workflow results to show if a pause node was approved or denied and who did the approving/denying(?). This is TBD.
  • It's possible we may want to expose approve/deny at the workflow results level. This is TBD.

Activity Stream

  • (Probably) Events related to approving/denying further execution of a workflow should be logged in the activity stream. If this is the case, we should expose those events in the UI.

@ryanpetrello
Copy link
Contributor

ryanpetrello commented Jul 1, 2019

Discussed some API implementation with @chrismeyersfsu and @beeankha.

Here are the high level notes, @mabashian @kdelee this may interest you.

  1. Add two new models, a WorkflowApprovalTemplate and a WorkflowApproval. These are two new UnifiedJobTemplate and UnifiedJob base classes, respectively.

  2. A WorkflowJobTemplate today has a link to a UJT. Under this new feature, these can now be WorkflowApprovalTemplate objects (instead of just a JobTemplate, ProjectUpdate, etc...).

  3. This new model shouldn't require any actual changes to the task manager. Instead, when the Workflow DAG code encounters a node that represents a WorkflowApproval, that job will just hang out in "pending" forever (or at least, until somebody approves or rejects it). In the eye of the task manager, this is no different than any other job (it's just that under the hood, we're not actually running any Python code that will transition into a successful or failure status).

This means we'll be adding some new APIs. Specifically:

/api/v2/workflow_approval_templates/
/api/v2/workflow_approval_templates/N

These endpoints will allow you to create and edit new approval templates. As we progress in feature development, we'll hang specific features/attributes off of these endpoints, such as "the notification templates for this approval template" or "who can approve".

/api/v2/workflow_approvals/
/api/v2/workflow_approvals/N/
POST /api/v2/workflow_approvals/N/approve/
POST /api/v2/workflow_approvals/N/reject/

@mabashian I expect these endpoints will be useful for "fetching a list of pending approvals" i.e.,

/api/v2/workflow_approvals/?status=pending

Additionally, the UI can use special approve and reject endpoints to actually transition these jobs into the proper state so that workflow execution can continue.

We'll need to work out and answer some questions related to RBAC. Specifically:

  1. Who can create these new approval nodes? Do they have some sort of organizational membership, or are they just sort of created ad-hoc?
  2. When you delete/remove one of these nodes from a workflow, should we go ahead and clean up the underlying WorkflowApprovalTemplate object? It probably doesn't really make sense for them to exist outside of the context of their relationship to an actual node in a workflow. @bianca @chrismeyersfsu how do you feel about making it so that you must specify the node they're attached to at creation time?
  3. How do we make sure we restrict the endpoint functionality? For example, even though this is a common pattern for other UJTs, something like this doesn't really makes sense:

/api/v2/workflow_approval_templates/N/launch/

@beeankha
Copy link
Contributor

Hi @spsingh1982 , we don’t have an exact ETA for when this will be released as part of AWX, but it is something we're currently working on. You can take a look at the work-in-progress PR here: #4264

@ryanpetrello
Copy link
Contributor

ryanpetrello commented Jul 16, 2019

@beeankha @mabashian @chrismeyersfsu @AlanCoding,

After much deliberation, I think this is what we're currently settled on (if I can summarize).

There will not be a global /api/v2/workflow_approval_templates/ endpoint.

To create a new pause node, you do:

POST /api/v2/workflow_job_template_nodes/
{"workflow_job_template": 42, "unified_job_template": null}
POST /api/v2/workflow_job_templates_nodes/X/approval_job_template/
{"timeout": 3600, "name": "Get Manager Approval"}

This means that the only way you can create an approval template is via this endpoint, where it's assigned to a node.

We will continue to have /api/v2/workflow_approvals/ and the approve and deny endpoints we discussed (which @beeankha, you currently have in your PR).

@kdelee kdelee self-assigned this Jul 18, 2019
@dsesami dsesami self-assigned this Jul 18, 2019
@kdelee
Copy link
Member

kdelee commented Jul 18, 2019

@dsesami and I brought up that it looks like we need a explanation field that can be POSTed with the approval/denial.

Additionally this would be used when a approval node fails due to time out as the text field where tower puts in "Failed due to timeout after {num} seconds"

@dsesami
Copy link
Contributor

dsesami commented Jul 18, 2019

I also had the thought that there should be another way to know that a workflow is in a pending state; in the running jobs list, it would be nice to highlight the pending workflow in some manner to signal that action needs to be taken. cc @mabashian

@dsesami
Copy link
Contributor

dsesami commented Jul 18, 2019

@kdelee and I are also wondering how users will review previously approved/denied jobs from the past + view the explanation for why approved/denied.

Seems like clicking on the approval node from the workflow should just take us to a job detail view, just no stdout. Basically it'd be this page, but no stdout block (if hiding the element is difficult for some reason, then just leave it blank maybe?) and then just dropping all the extra vars and just having a few key ones here.

The fields might be something like

Status: Approved
Approver/Denier/Whatever noun: Mike Abashian
Explanation: lorem ipsum...
Decision made at: 2:00:39
Workflow Job: [link]

Screenshot from 2019-07-18 10-38-48

@kdelee
Copy link
Member

kdelee commented Jul 18, 2019

A pile of questions:

  • What would this look like for those who don’t have the power to approve?
  • Is this role assigned per-workflow or is there also an overall approver for all workflows? I think it's a bit of both?
  • What kind of privileges does a workflow admin have to approve/deny?
  • Who can grant approval permissions?
  • What happens with a nested workflow? Where the approval node is one or more levels down?
  • What happens to the parent workflow, will information bubble up that the nested workflow is pending approval? What if nested workflow fails because of timeout or denial?

@ryanpetrello
Copy link
Contributor

@kdelee and I are also wondering how users will review previously approved/denied jobs from the past + view the explanation for why approved/denied.

I think our intention is to use the activity stream to record approve/deny, since it already has a lot of the attributes we care about.

@ryanpetrello
Copy link
Contributor

ryanpetrello commented Jul 18, 2019

I also had the thought that there should be another way to know that a workflow is in a pending state; in the running jobs list, it would be nice to highlight the pending workflow in some manner to signal that action needs to be taken. cc @mabashian

In a recent chat about this, I think we decided to not show approval jobs in the global job list, because the implications of calculating their RBAC (can this be approved by the current user) in the context of the global unified job list worried us from a query performance perspective cc @AlanCoding @matburt.

@ryanpetrello
Copy link
Contributor

ryanpetrello commented Jul 18, 2019

@beeankha or @AlanCoding,

#1206 (comment)

Could one of you summarize some answers here for @kdelee re: the new approval_role implicit fields we're currently working on?

@AlanCoding
Copy link
Member

Is this role assigned per-workflow or is there also an overall approver for all workflows? I think it's a bit of both?

I think we had a very clear answer to this, and the implementation has already begun.

Organizations get a new approval_role role, and Workflow JTs get a new approval_role. Those with either of these roles can approve any pending approval nodes in the workflow.

(this assumes the existing parentage structure is obvious to you, workflows live inside an organization, and the nodes live inside a workflow)

@dsesami
Copy link
Contributor

dsesami commented Jul 18, 2019

I think we decided to not show approval jobs in the global job list, because the implications of calculating their RBAC (can this be approved by the current user) in the context of the global unified job list worried us from a query performance perspective

@ryanpetrello I would argue that this scenario is the exact thing that it's needed for. It should say something like (assuming they can see the WF in the first place) "This workflow is in a pending state until someone with permissions approves it". Basically, IT or whomever can then ask the approver to push the job forward. At least, that's what I'm imagining, feel free to correct me if I'm missing something though

@AlanCoding
Copy link
Member

There are two proposals I worry are getting confused here:

  • Showing "Approval Jobs" in the UI JOBS list
  • Showing a badge / counter / infographic in the workflow job's entry about pending approvals

We decided against showing approval jobs in the list. They have a confusing role (duplicated with the workflow job), and have their own separate place in the UI.

This workflow is in a pending state until someone with permissions approves it

It's more nuanced than that. The workflow job is in the "running" state if an approval job inside of it is in the "pending" state. Also, the blocking by an approval job is specific to 1 branch, so other branches may have running jobs simultaneously.

It is valuable information to the user to see that a workflow job is waiting on 1 or more approvals. I just don't know if showing this in the UI JOBS list is in scope for the initial version of the feature.

@dsesami
Copy link
Contributor

dsesami commented Jul 18, 2019

my main concern is that a user might just see the "running" state on the workflow itself, not see a notification that an approval is pending, and just leave it without knowing something is needed. as long as the pending approval list is marked clearly enough, I'm ok with that.

@ryanpetrello
Copy link
Contributor

ryanpetrello commented Jul 18, 2019

@dsesami one of the things we're adding with this feature is a new item in the top nav that displays pending approvals:

image
image

In the first pass of this feature, we're not planning on showing these approvals as distinct jobs in the global jobs list.

@dsesami
Copy link
Contributor

dsesami commented Jul 18, 2019

That works for me.

@sumkincpp
Copy link
Contributor

Sorry, I've briefly checked current development progress.

Will there be a descriptive message/or jinja template based message that will show current state of workflow, so that reviewer will approve specific actions that will be performed? I.e. configuration diff, or dynamic values calculated to that stage.

@kdelee
Copy link
Member

kdelee commented Jul 18, 2019

@kdelee and I are also wondering how users will review previously approved/denied jobs from the past + view the explanation for why approved/denied.

Seems like clicking on the approval node from the workflow should just take us to a job detail view, just no stdout. Basically it'd be this page, but no stdout block (if hiding the element is difficult for some reason, then just leave it blank maybe?) and then just dropping all the extra vars and just having a few key ones here.

The fields might be something like

Status: Approved
Approver/Denier/Whatever noun: Mike Abashian
Explanation: lorem ipsum...
Decision made at: 2:00:39
Workflow Job: [link]

@ryanpetrello so answer to "Will the approval node in the workflow job view link to anything with a details link like the other nodes is "NO" ?

@ryanpetrello
Copy link
Contributor

ryanpetrello commented Jul 18, 2019

@kdelee we haven't planned any UX interaction where you go view an approval job in the UI on a separate page the way you do job results. Our plan is to have a global notification list where you can view the list of pending approvals and (if you have permissions) approve or deny them.

We'll likely make it so that when you're viewing a workflow that has a pause node, you can also approve/deny it there (if you have permission):

image

@AlanCoding
Copy link
Member

Also, the name field already comes from user entry at the time of creating the node / approval JT. I don't see any need for an explanation field beyond what has already been entered into that field. Related job names are already shown in the UI on all nodes in a workflow job.

If the node times out, job_explanation field is planned to be populated with a message that it times out. I don't know how this particular field will be exposed in the UI as of yet. That's a pertinent question.

@wenottingham
Copy link
Contributor

Will there be a descriptive message/or jinja template based message that will show current state of workflow, so that reviewer will approve specific actions that will be performed? I.e. configuration diff, or dynamic values calculated to that stage.

Not in this version. There's no real mechanism to pass arbitrary data in this way to show the user other than maybe showing set_stats content, and I could see that running afoul of other uses of set_stats.

@dsesami
Copy link
Contributor

dsesami commented Aug 29, 2019

This feature has been merged. PR: #4264

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests