
Proposal: Revamp task executor interface #63

Closed
irvinlim opened this issue Apr 17, 2022 · 1 comment
Labels
area/api Related to public APIs, including CRD design, configuration, etc component/execution Issues or PRs related exclusively to the Execution component (Job, JobConfig) kind/proposal Proposal for new ideas or features

irvinlim commented Apr 17, 2022

Motivation

The task executor interface has so far been implemented in a generic manner, allowing other task executor implementations to be created. However, not much work has been put into designing and deciding what should constitute a Task from Furiko's point of view.

To focus more on the scheduling aspect, we can and should delegate task execution to other downstream executors when possible. At the same time, not all kinds of "tasks" are suitable to be written using the PodTemplateSpec API.

Proposal

In view of the above, I propose expanding the definition of a "task" to include, but not be limited to, the following examples:

  • Container workloads, grouped together in a Pod.
  • Complex container-based workflows, supporting DAG/dependency-based ordering of container executions.
    • This can be implemented via Argo Workflows.
  • Serverless function invocations, which support auto-scaling based on the number of concurrently running tasks.
    • This can be implemented via Knative.
  • HTTP requests, defined by some cluster-routable URL, request body and headers.
  • Kubernetes API resources, which can cover any other custom task types that need to be supported.
    • Users can write their own operators to reconcile a stub Task object, for example.

A task should minimally support the following semantics:

  1. Can be created and its status reconciled based on the underlying task's state.
  2. Supports at least the following states: Staging, Running, Success, Failed, Killing, Killed (naming can be revised).
  3. Can be killed if it is currently in-progress. This includes killing via external interference, as well as automatic killing (e.g. timeouts).
  4. Has a result, which could be structured or plain text. The result should be parseable, and we should be able to tell if the result is a successful one or not.
  5. Supports substitution of fields based on context variables; the exact behavior may differ between executors.
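As a sketch of point 5, substitution could behave roughly as follows. The `${option.*}` syntax comes from the examples in this proposal, while the helper name and the leave-unknown-tokens-untouched fallback are assumptions:

```python
# Hypothetical sketch of context-variable substitution. Unknown
# variables are left untouched, so each executor can decide how to
# handle them (this fallback is an assumption of the sketch).
import re

def substitute(template: str, context: dict) -> str:
    """Replace ${key} tokens with values from the context map."""
    def repl(match: re.Match) -> str:
        key = match.group(1)
        return str(context.get(key, match.group(0)))

    return re.sub(r"\$\{([a-zA-Z0-9_.-]+)\}", repl, template)

print(substitute("Hello world, ${option.user-name}!", {"option.user-name": "alice"}))
# → Hello world, alice!
```

Whether unresolved variables should fail the task or pass through unchanged would be an executor-level decision.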

API Design

Example of JobConfig API to be updated:

apiVersion: execution.furiko.io/v1alpha1
kind: JobConfig
metadata:
  name: jobconfig-sample
  namespace: default
spec:
  concurrency:
    policy: Forbid
  template:
    spec:
      task:
        # Defines the task template. At most one field can be specified.
        template:
          pod: ...
          workflow: ...
          customResource: ...

pod

Supported by the Pod task executor.

pod:
  metadata:
    annotations:
      custom-annotation: value
  spec:
    containers:
      - args:
          - echo
          - Hello world, ${option.user-name}!
        image: alpine:${option.image-tag}
        name: job-container
    restartPolicy: Never

workflow

Supported by Argo Workflows' Workflow API.

The following example is taken from https://github.com/argoproj/argo-workflows/blob/0b2f2d1f23bead3bc0ab34049192d06594b575ba/examples/template-defaults.yaml.

workflow:
  metadata:
    annotations:
      custom-annotation: value
  spec:
    entrypoint: main
    templateDefaults:
      timeout: 30s   # timeout value will be applied to all templates
      retryStrategy: # retryStrategy value will be applied to all templates
        limit: "2"
    templates:
      - name: main
        steps:
          - - name: retry-backoff
              template: retry-backoff
          - - name: whalesay
              template: whalesay
      - name: whalesay
        container:
          image: argoproj/argosay:v2
          command: [cowsay]
          args: ["hello world"]
      - name: retry-backoff
        container:
          image: python:alpine3.6
          command: ["python", "-c"]
          args: ["import random; import sys; exit_code = random.choice([0, 1, 1]); sys.exit(exit_code)"]

http

Should be implemented by a custom HTTP executor extension service. These executors would pick up HTTPRequest tasks, reconcile them, and trigger the HTTP request from within the cluster.

The API schema could follow the HAR (HTTP Archive) format: https://en.wikipedia.org/wiki/HAR_(file_format)

http:
  # The request field follows HAR spec: http://www.softwareishard.com/blog/har-12-spec/#request
  request:
    method: GET
    url: "http://my-service.default.svc:8080/my-path"
    httpVersion: HTTP/1.1
    headers:
      - name: Content-Type
        value: application/json
    queryString:
      - name: param
        value: value1
        comment: Custom comment
    postData:
      mimeType: application/json
      text: |
        { "mydata": { "field": 1 } }

  # Additional fields so that we can achieve task semantics.
  response:
    # Either json or yaml
    format: json
    # Defines a JSONPath to evaluate if the result is successful
    resultJsonPath: ".result.jobs"

  # Timeouts, not sure if needed
  timeout:
    socketTimeout: 10
    readTimeout: 30

Additional notes:

  • Requires the implementation of "executor" controllers. These controllers should preferably live outside of the core execution-controller, so that they can be horizontally scaled independently of the main controller manager (which uses leader election).
  • We can avoid a central scheduler, since most requests should be fulfillable by any available executor; one idea is to follow Sparrow scheduling (as used with Spark): pdf, video
  • HTTP tasks could support executor affinity, for example to overcome network restrictions.
  • Evaluating an HTTP response as a result may be tricky, since JSONPath supports lookups but not evaluation to a single boolean. One approach is to require all JSONPath expressions to evaluate to true, false, or undefined/null; another is to use truthy/falsy values. The same problem may need to be solved for text/template as well.
  • Killing HTTP tasks will probably require synchronization to ensure that the TCP connection is already severed.
  • If the executor terminates or loses its state prematurely, then the task is considered gone.
  • We probably want to store the HTTP response somewhere, which is also not defined by the task spec currently.
  • Some more considerations: HTTP/2, streaming responses (HTTP keep-alive), TLS/certificates

This is probably a very hard problem to solve, so we may choose not to implement it at all. Alternatively, we could specify a very strict requirement for which endpoints are supported, and fail fast and reliably if they do not conform to the "correct" kind of endpoint.
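As a sketch of the truthy/falsy idea from the notes above: the snippet below stands in for JSONPath with a much simpler dotted-path lookup (an assumption of this sketch; real JSONPath is far more expressive), and treats a present, truthy value as success:

```python
# Minimal sketch of evaluating resultJsonPath with truthy/falsy
# semantics. The dotted-path lookup is a stand-in for JSONPath.
import json

def lookup(doc, path: str):
    """Resolve a dotted path like '.result.jobs' against a parsed body."""
    node = doc
    for part in path.strip(".").split("."):
        if not isinstance(node, dict) or part not in node:
            return None  # undefined
        node = node[part]
    return node

def is_success(body: str, result_json_path: str) -> bool:
    """Treat a present, truthy lookup result as task success."""
    value = lookup(json.loads(body), result_json_path)
    return bool(value)

print(is_success('{"result": {"jobs": [1, 2]}}', ".result.jobs"))  # → True
print(is_success('{"result": {}}', ".result.jobs"))                # → False
```

Note that an empty list or empty string would count as failure under these semantics, which is exactly the kind of ambiguity the notes above warn about.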

customResource

Supports creating custom resources, similar to Argo's ResourceTemplate.

customResource:
  manifest: |
    apiVersion: my.custom.domain/v1alpha1
    kind: Book
    metadata:
      name: google-sre-book
  result:
    successCondition: 
      jsonPath: ".status.phase == \"Succeeded\""

Additional notes:

  • Killing of the task should be equivalent to deleting the resource. Graceful termination can be implemented by using finalizers on the custom resource.
  • Similar to http above, we need a way to determine the task's current state. This could be specified in the customResource specification itself, but it may be quite clunky.
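To illustrate, here is a minimal sketch of evaluating a successCondition like the one above. The `<path> == "<literal>"` syntax and the helper are hypothetical, for illustration only:

```python
# Hypothetical evaluation of a successCondition: split a simple
# '<jsonPath> == "<literal>"' expression and compare it against the
# custom resource's live status.
def condition_met(resource: dict, condition: str) -> bool:
    path, _, expected = condition.partition("==")
    expected = expected.strip().strip('"')
    node = resource
    for part in path.strip().strip(".").split("."):
        if not isinstance(node, dict) or part not in node:
            return False  # field not present yet: not successful
        node = node[part]
    return node == expected

book = {"status": {"phase": "Succeeded"}}
print(condition_met(book, '.status.phase == "Succeeded"'))  # → True
```

A real implementation would likely also need a failureCondition, so that a resource stuck in a non-terminal phase can be distinguished from one that has definitively failed.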

Design Considerations

Embedding Types

There are some issues with embedding third-party types, even including Kubernetes' core types.

  • Depending on external API types will undoubtedly cause headaches for versioning Furiko's own API types in the long run. For example, Argo's API types are still in v1alpha1, which means they are free to break backwards-compatibility guarantees at any time.
  • At the same time, any updates to downstream API types will also require us to update and explicitly version our APIs in line with other third-party APIs.

One solution is to require users to define these third-party API types as YAML string literals, but validate the payload at runtime. If we support substitution of values (which can only happen at execution time), then there may not be a way to validate the JobConfig at save-time.

Example of workflow following the customResource executor:

template:
  workflow:
    manifest: |
      apiVersion: argoproj.io/v1alpha1
      kind: Workflow
      spec: ...

Since we know beforehand how to interpret a Workflow's status, the user only really has to specify the manifest.

Another solution that avoids string literals is to use type: object in the CRD definition without properties, which prevents any schema validation. We could consider doing this for the existing PodTemplateSpec field as well.
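Under the schemaless approach, the CRD's OpenAPI schema for the task template field might look like the following sketch (field names are illustrative; note that Kubernetes structural schemas also require x-kubernetes-preserve-unknown-fields so that unknown fields are not pruned):

```yaml
# Excerpt of a hypothetical CRD schema: the template field accepts
# arbitrary objects because no properties are declared and pruning is
# disabled via x-kubernetes-preserve-unknown-fields.
openAPIV3Schema:
  type: object
  properties:
    spec:
      type: object
      properties:
        template:
          type: object
          x-kubernetes-preserve-unknown-fields: true
```

The trade-off is that the API server can no longer reject malformed task templates at save-time; validation has to happen in a webhook or at execution time.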

Curse of Generality

In some cases, the above proposed design may be too flexible and could open the floor to a whole slew of additional requirements and problems in the future.

The alternative approach is to provide a small surface of well-defined and well-supported executors, which could also be interfaces themselves. One example is to repurpose a Pod into a generic interface for another underlying workload using Virtual Kubelet.

@irvinlim irvinlim added help wanted Extra attention is needed kind/proposal Proposal for new ideas or features component/execution Issues or PRs related exclusively to the Execution component (Job, JobConfig) area/workloads Related to workload execution (e.g. jobs, tasks) labels Apr 17, 2022
@irvinlim irvinlim added this to the v0.1.0 milestone Apr 18, 2022

irvinlim commented Jun 5, 2022

We will consider this closed, as we have made sufficient headway in streamlining the Task interface that would be generic enough to support future task executors. Most of the work required was to make TaskTemplate support generic fields, make PodTemplate schemaless, and to remove the active deadline-related logic from the Task interface.

Any planned task executors will be created in their own dedicated issue.

@irvinlim irvinlim closed this as completed Jun 5, 2022
@irvinlim irvinlim added area/api Related to public APIs, including CRD design, configuration, etc and removed help wanted Extra attention is needed area/workloads Related to workload execution (e.g. jobs, tasks) labels Jun 5, 2022