Send k8s pod events in TaskExecutionEvent updates #3825
Labels
enhancement
New feature or request
untriaged
This issue has not yet been looked at by the Maintainers
Motivation: Why do you think this is important?
It can be difficult to understand delays in task startup. We recently added runtime metrics to the timeline view to better surface where time is spent, and PodCondition reasons are included in the task state tooltip to explain state transitions.
The full text of this tooltip is of the form
However, this doesn't indicate ongoing node allocation or image pull, two of the most common delays in "happy path" task startup. By comparison, kubectl get events has much richer information.
Goal: What should the final outcome look like, ideally?
The execution closure should include task-specific event details, including scheduling attempts, node allocations, and image pulls.
Describe alternatives you've considered
A more complete solution may overhaul event information in the execution closure so that reasons are not coupled to Flyte state transitions and could instead surface a sink of structured or unstructured event information. This is beyond the scope of this particular issue, but the proposal below does not preclude such an investment in the future.
Propose: Link/Inline OR Additional context
As a potential solution, update DemystifyPending to interleave k8s pod events alongside existing PodCondition reasons.
Note that the reporting interface assumes a single event per state; however, a recent change made it possible to report multiple events within a state by using a phase version.
A relatively naive solution proposed by @hamersaw might be:
Are you sure this issue hasn't been raised already?
Have you read the Code of Conduct?