Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Send also workflows as part of unified jobs and send all changes to jobs #6176

Conversation

Ladas
Copy link
Contributor

@Ladas Ladas commented Mar 5, 2020

SUMMARY

Workflows do not have a record in main_job, therefore the JOIN
was ignoring those. We need to do LEFT JOIN to include also
workflows.

It can take several hours for a job to go from pending to successful/failed state and we need to also send the job with a changed state, otherwise the analytics will be incorrect.

ISSUE TYPE
  • Bugfix Pull Request
COMPONENT NAME
  • Analytics
FIXES

#6200

Partially fixing: #6201 (sending workflows as part of unified jobs)

Works with current analytics processor?

Currently the processor will break, since it requires Org name and ID as NOT NULL. I will limit this constraints for only template jobs. (will insert PR here)

TODO

@Ladas
Copy link
Contributor Author

Ladas commented Mar 5, 2020

@softwarefactory-project-zuul
Copy link
Contributor

Build succeeded.

@awxbot awxbot added the type:bug label Mar 5, 2020
@Ladas Ladas closed this Mar 5, 2020
@Ladas Ladas force-pushed the send_also_workflows_as_part_of_unified_jobs branch from 2346c1f to 4fcd2c5 Compare March 5, 2020 13:48
@Ladas Ladas reopened this Mar 5, 2020
@softwarefactory-project-zuul
Copy link
Contributor

Build succeeded.

@Ladas Ladas changed the title Send also workflows as part of unified jobs [WIP] Send also workflows as part of unified jobs Mar 6, 2020
@Ladas Ladas force-pushed the send_also_workflows_as_part_of_unified_jobs branch from 748f6cd to 4a5e596 Compare March 6, 2020 10:44
@softwarefactory-project-zuul
Copy link
Contributor

Build succeeded.

@ryanpetrello
Copy link
Contributor

We may want to hold off on this until @AlanCoding's change here lands: #5726

JOIN main_organization ON main_organization.id = main_project.organization_id
WHERE main_unifiedjob.created > {}
LEFT JOIN main_project ON main_project.unifiedjobtemplate_ptr_id = main_job.project_id
LEFT JOIN main_organization ON main_organization.id = main_project.organization_id
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This join is going to change after @AlanCoding's PR merges.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cool, I am fine fixing this another way. Just keep in mind this was basically filtering out everything except jobs. But we want other stuff like workflowapproval, worflowjob, projectupdate etc. , so we'll need the left join, since not everything has org associated

Also @AlanCoding is your PR going to be backported? We need this backported to all versions sending data to analytics, so we get the workflow and other data.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, we do not plan on backporting. that would be very complicated

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, ok, so can we merge this first? Or it doesn't matter for easier backport?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should wait for @AlanCoding's organization schema to land in devel before changing this in devel.

If you want to open a PR like this in a backport branch, we'll have to carry a different implementation for that database schema anyways.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds good, I'll backport and wait for @AlanCoding here

@AlanCoding
Copy link
Member

It also seems like we are not able to get a link to organizations
from workflows? When looking at:
<tower_url>#/organizations?organization_search=page_size:20;order_by:name

I don't follow that URL. If you were searching from /api/v2/unified_jobs/ or /api/v2/workflow_jobs/, or the corresponding template endpoints, and using organization, then yes, I see why that wouldn't work, and that will change with the other PR I have open because it moves the field from the workflow tables to the unified tables

@Ladas
Copy link
Contributor Author

Ladas commented Mar 6, 2020

@AlanCoding the tower url was just supposed to show that on the organization page, we don't show any links to workflows. And as I was scanning the db, we only store organization relation in workflow template, not workflow_job.

@ryanpetrello
Copy link
Contributor

@Ladas we've merged @AlanCoding's work - this is probably ready for a rebase and another look now.

@AlanCoding
Copy link
Member

why was the created filter changed to modified? I think those were both indexed the same, but this is confusing. What do you think that modified represents? And how does that differ from finished? I'm sorry, I'm still groking what data it is that you are trying to obtain.

@Ladas
Copy link
Contributor Author

Ladas commented Mar 17, 2020

@AlanCoding the use of modified is described in the issue (I suppose finished can be used also). If we use created, then finished/modified can be hours later and if we'd already sent the payload, we wouldn't send the job again. We need to send the job again if it's finished (or change occurred) so the analytics contains the terminal state.

@AlanCoding
Copy link
Member

I may not have access to the issue. But finished is indexed

finished = models.DateTimeField(
null=True,
default=None,
editable=False,
help_text=_("The date and time the job finished execution."),
db_index=True,
)

can't say the same for modified. So it seems like that'll work as a drop-in replacement here.

And the organization field should be on the unified jobs table now, but I think there are more problems you're trying to solve here which I'm not fully aware of.

@Ladas
Copy link
Contributor Author

Ladas commented Mar 18, 2020

@AlanCoding I see. Although finished doesn't have NOT NULL constraint. And from what I see, it is being set only when really finished. So it will not catch jobs stuck in pending etc. (so things we are trying to detect and alert upon)

So it looks like we have 2 paths:

  1. using modified

GOOD: we'll catch all changes
BAD: we'll always have to do a sequential scan of unified jobs

  1. using both created and finished

GOOD: we'll use indexes
BAD: while we should catch all jobs, the state it is stuck on doesn't have to be correct


Looks like 2) is good enough, we just need take in account that when finished is NULL, we might not have the right status. (and e.g. change the alert for too many pending jobs to too many unfinished jobs)


For the orgs, in analytics I expect them to be there for job type. So it's ok that the rest of unified jobs don't have it. But we should send all unified jobs for analytics except what is explicitly filtered out in where condition.

@Ladas Ladas changed the title [WIP] Send also workflows as part of unified jobs Send also workflows as part of unified jobs and send all changes to jobs Mar 18, 2020
@Ladas Ladas force-pushed the send_also_workflows_as_part_of_unified_jobs branch 2 times, most recently from 7223908 to 274fead Compare March 18, 2020 09:23
@softwarefactory-project-zuul
Copy link
Contributor

Build succeeded.

@Ladas
Copy link
Contributor Author

Ladas commented Mar 18, 2020

@AlanCoding ok I changed the condition to (main_unifiedjob.created > {0} OR main_unifiedjob.finished > {0}) and I've added the left join, so the unified jobs without org are not filtered out.

Can you review also the backports please? https://github.com/ansible/tower/pull/4180 and https://github.com/ansible/tower/pull/4179

Btw. we should probably add index to modified column, so we can get the 100% correct data to analytics. (even if that can't be backported)

@wenottingham
Copy link
Contributor

What still needs done here? Just a review?

cc @AlanCoding @rooftopcellist

@ryanpetrello
Copy link
Contributor

@wenottingham likely that and verification from somebody.

@chrismeyersfsu
Copy link
Member

chrismeyersfsu commented Apr 8, 2020

@Ladas I submitted a PR against your PR. https://github.com/Ladas/awx/pull/1/files incorporate that test and I'm good with merging this PR 👍

@Ladas
Copy link
Contributor Author

Ladas commented Apr 9, 2020

@chrismeyersfsu thank you

btw. this PR's condition works for backports nicely. But as extension for the next major AWX release, the best would be if we could add index on modified column and use that instead of created+finished. (then we could track exact state of a job, in a case it gets stuck in a non terminal state)

@softwarefactory-project-zuul
Copy link
Contributor

Build failed.

@ryanpetrello
Copy link
Contributor

@Ladas the CI failures here suggest to me that this branch needs to be rebased against devel.

Ladas and others added 4 commits April 9, 2020 15:20
It can take several hours for a job to go from pending to
successful/failed state and we need to also send the job with
a changed state, otherwise the analytics will be incorrect.
Workflows do not have a record in main_job, therefore the JOIN
was ignoring those. We need to do LEFT JOIN to include also
workflows.

It also seems like we are not able to get a link to organizations
from workflows? When looking at:
<tower_url>#/organizations?organization_search=page_size:20;order_by:name

We don't seem to list a relation to workflows. Is it possible to get it from
somewhere?
Use created and finished, which are indexed, to try to fetch all
states of jobs. If job is not finished, we might not get the
right terminal status, but that should be ok for now.
* unified_jobs output should include derived jobs i.e. project update,
inventory update, job
* This PR adds a test to ensure that.
@Ladas Ladas force-pushed the send_also_workflows_as_part_of_unified_jobs branch from 405ecd3 to 1f9f869 Compare April 9, 2020 13:21
@Ladas
Copy link
Contributor Author

Ladas commented Apr 9, 2020

@ryanpetrello thanks for the hint, rebased

@softwarefactory-project-zuul
Copy link
Contributor

Build succeeded.

@wenottingham
Copy link
Contributor

@ryanpetrello ok for mergeit with @chrismeyersfsu's approval?

@softwarefactory-project-zuul
Copy link
Contributor

Build succeeded (gate pipeline).

@softwarefactory-project-zuul softwarefactory-project-zuul bot merged commit c414fd6 into ansible:devel Apr 13, 2020
AlanCoding pushed a commit to AlanCoding/awx that referenced this pull request Jan 4, 2023
[backport 4.2] removed hostname check when editing hostname on existing host (ansible#13057)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants