Ingestion Controller #53

Merged
merged 15 commits into from
Feb 15, 2024

AdheipSingh
Contributor

@AdheipSingh AdheipSingh commented May 4, 2023

Fixes #3.

Current State of PR

  • Supports NativeBatchIndexParallel
  • Supports Basic-Auth

CRD Definition

group: druid.apache.org
version: v1alpha1
kind: DruidIngestion

Scope

  • The CRD is namespace-scoped in the k8s API.
  • A DruidIngestion CR lives in the same namespace as its Druid CR.
  • The relation between a DruidIngestion CR and a Druid CR is one-to-one.

Installation

  • Helm
  • Kubectl

Ingestion Method Support

  • Kafka
  • Kinesis
  • NativeBatchIndexParallel
  • QueryControllerSQL
  • HadoopIndexHadoop

Authentication/TLS With Druid

  • Basic Auth
  • TLS
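
For the Basic Auth option, a request to Druid carries a standard Authorization header. The sketch below builds such a request with Go's net/http. The function name newTaskRequest and the base URL are hypothetical, and although /druid/indexer/v1/task is Druid's task submission endpoint, whether the controller calls exactly this path through the router is an assumption.

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
)

// newTaskRequest is a hypothetical helper: it prepares a POST of an ingestion
// spec to Druid's task endpoint and attaches basic-auth credentials.
func newTaskRequest(baseURL, user, pass string, spec []byte) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodPost, baseURL+"/druid/indexer/v1/task", bytes.NewReader(spec))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	// SetBasicAuth encodes user:pass into the Authorization header.
	req.SetBasicAuth(user, pass)
	return req, nil
}

func main() {
	// Hypothetical router address and credentials, for illustration only.
	req, err := newTaskRequest("http://router.example:8888", "operator", "secret", []byte(`{}`))
	if err != nil {
		panic(err)
	}
	user, _, ok := req.BasicAuth()
	fmt.Println(req.Method, req.URL.String(), user, ok)
}
```

The request is only constructed here, not sent; a caller would pass it to an http.Client, configured for TLS where required.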

Ingestion Controller

Intro

  • The ingestion controller combines an event-driven model with polling to reconcile k8s events against the DruidIngestion custom resource.
  • The ingestion controller is registered with the manager (main.go) and runs its own reconcile loop.
  • Introducing the ingestion controller is fully backward compatible with the Druid CR.

DruidIngestion Custom Resource

suspend: false
druidCluster: tiny-cluster
auth:
    type: basic-auth
    secretRef:
        name: ingestion-secret
        namespace: druid
ingestion:
    type: native-batch
    spec: |-
        ...
        ...
status:
    currentIngestionSpec.json: |-
        ...
        ...
    lastUpdateTime: "2023-05-06T21:34:29Z"
    message: DruidIngestionControllerCreateSuccess
    reason: '{"task":"index_parallel_wikipedia-5_ogmefkld_2023-05-06T21:34:29.497Z"}'
    status: "True"
    taskId: index_parallel_wikipedia-5_ogmefkld_2023-05-06T21:34:29.497Z
    type: DruidIngestionControllerCreateSuccess

State Handling

  • Controllers are stateless, and custom resource status is used to store and reflect state changes. For each reconcile loop, the state is constructed based on observation.

  • Flow for creating and updating a DruidIngestion CR

    1. The controller checks the status of the CR to see if a taskId is associated with the incoming event.
    2. If there is no taskId associated, the controller creates a task by calling the underlying druid API for the specific ingestion method type defined.
    3. Once the request is submitted to the druid API, the response is patched onto the druid ingestion CR. If the response code is 200, it is considered a success, and the taskId is populated, so on the next reconcile the controller knows the task exists. If the response code is not 200, the status is patched with "failed" and the response returned by the druid API.
    4. If a taskId is already associated with the incoming event, the controller uses reflect.DeepEqual() to check whether the ingestion spec has changed. If it has, the spec is updated by calling the druid API, and the responses are patched to the status.
  • Flow for deletion of druidIngestion CR

    1. The DruidIngestion controller uses finalizers for executing logic before the deletion of the CR.
    2. The controller adds a finalizer in the druid ingestion CR if the deletion timestamp is not set.
    3. On deletion of the druid ingestion CR, the deletion timestamp is set, and the controller calls the druid API to shut down the task.
    4. If the response is successful, the finalizer is removed, and the k8s API removes the underlying CR.
  • The Druid service address is derived from the router's Kubernetes service, so the controller expects the druidCluster name to be present on the CR.

  • Basic Auth is supported: users create a secret and reference it in the DruidIngestion CR. The controller looks up OperatorUserName and OperatorPassword in that secret.
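
The creation, update and deletion flows above can be condensed into a single decision over the observed state. The sketch below is a simplification with hypothetical names (observed, Action, decide). It assumes the finalizer is added before any task is created, which the description does not state explicitly, and it leaves out the actual API calls and status patching.

```go
package main

import (
	"fmt"
	"reflect"
)

// Action names the next step the controller would take (hypothetical type).
type Action string

const (
	ActionAddFinalizer Action = "add-finalizer"
	ActionCreate       Action = "create-task"
	ActionUpdate       Action = "update-spec"
	ActionShutdown     Action = "shutdown-task-then-remove-finalizer"
	ActionNone         Action = "none"
)

// observed is rebuilt from the CR on every reconcile; nothing is kept in memory.
type observed struct {
	Deleting     bool   // deletion timestamp is set
	HasFinalizer bool   // the controller's finalizer is present
	TaskID       string // from status; empty if no task was created yet
	DesiredSpec  string // ingestion spec on the CR
	CurrentSpec  string // spec recorded in status after the last submit
}

func decide(o observed) Action {
	switch {
	case o.Deleting && o.HasFinalizer:
		// Shut the task down first; the finalizer is removed only on success.
		return ActionShutdown
	case o.Deleting:
		return ActionNone
	case !o.HasFinalizer:
		return ActionAddFinalizer
	case o.TaskID == "":
		return ActionCreate
	case !reflect.DeepEqual(o.DesiredSpec, o.CurrentSpec):
		// For plain strings DeepEqual behaves like ==; it is used here
		// only because the description names it.
		return ActionUpdate
	default:
		return ActionNone
	}
}

func main() {
	fmt.Println(decide(observed{HasFinalizer: true, DesiredSpec: "v1"}))
	fmt.Println(decide(observed{HasFinalizer: true, TaskID: "t1", DesiredSpec: "v2", CurrentSpec: "v1"}))
	fmt.Println(decide(observed{Deleting: true, HasFinalizer: true, TaskID: "t1"}))
}
```

Because the decision depends only on the CR's spec, status and metadata, repeating a reconcile with the same inputs yields the same action, which matches the stateless design described above.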

@AdheipSingh AdheipSingh marked this pull request as draft May 4, 2023 21:33
@AdheipSingh AdheipSingh marked this pull request as ready for review May 6, 2023 23:22
@AdheipSingh
Contributor Author

AdheipSingh commented May 6, 2023

@itamar-marom @styk-tv @cyril-corbon @cintoSunny @harinirajendran

Your initial reviews are welcome!

@AdheipSingh AdheipSingh mentioned this pull request May 11, 2023
5 tasks
Ingestion Controller Updates
@AdheipSingh AdheipSingh changed the title WIP Ingestion Controller Ingestion Controller Dec 28, 2023
@AdheipSingh AdheipSingh merged commit bb30bd9 into datainfrahq:master Feb 15, 2024
1 check passed
@AdheipSingh AdheipSingh deleted the ingestion branch February 15, 2024 17:05
TessaIO pushed a commit to TessaIO/druid-operator that referenced this pull request Jul 24, 2024
* ingestion spec acc to v3

* task creation

* supprt native batch

* fix router url

* revert license change

* revert go mod change

* fix main

* fix: made some changes as per review comments

* fix: removed unused package from ingestion reconciler

* rebase 1

* add example

* add review

* update dockerfile

---------

Co-authored-by: avtarOPS <avtarsingh12015@gmail.com>
AdheipSingh added a commit that referenced this pull request Jul 28, 2024
* Ingestion Controller (#53)

* ingestion spec acc to v3

* task creation

* supprt native batch

* fix router url

* revert license change

* revert go mod change

* fix main

* fix: made some changes as per review comments

* fix: removed unused package from ingestion reconciler

* rebase 1

* add example

* add review

* update dockerfile

---------

Co-authored-by: avtarOPS <avtarsingh12015@gmail.com>

* Update Docs and Tutorials (#138)

* docs and tutorials

* Refactor/ordering (#123)

* (ordering): refactor code

* (ordering): refactor code

* (ordering): testing

* chore(branch): rebase branch with master

* fix(tests): validate nodes order by regex

* Bump controller-tools version (#140)

* Utilize the DruidIngestion controller in e2e tests (#146)

* adds needed volumes to eks deployment spec and improves getting started documentation by noting minio dependency (#149)

* Add support for annotations on Deployment/StatefulSet resources in DruidNodeSpec (#145)

* Add support for annotations on Deployment/StatefulSet resources

* Support setting ReplicationControllerAnnotations at the cluster-level

* rename replicationControllerAnnotations to workloadAnnotations

* suggestions from code review

* Add support for multi tier nodes with different PVC sizes (#106) (#152)

Co-authored-by: Farhad Farahi <farhad@adjoe.io>

* fix: put Druid crds in the appropriate folder specified by Helm (#162)

Signed-off-by: ahmed.g <ahmed.g@adjoe.io>
Signed-off-by: TessaIO <ahmedgrati1999@gmail.com>

* Adds service account name to each druid node optionally (#164)

* Adds service account to each druid node optionally

* Use controller-gen v0.11.2

---------

Signed-off-by: ahmed.g <ahmed.g@adjoe.io>
Signed-off-by: TessaIO <ahmedgrati1999@gmail.com>
Co-authored-by: AdheipSingh <34169002+AdheipSingh@users.noreply.github.com>
Co-authored-by: avtarOPS <avtarsingh12015@gmail.com>
Co-authored-by: Itamar Marom <46691031+itamar-marom@users.noreply.github.com>
Co-authored-by: Jesper Larsson <4522613+MrLarssonJr@users.noreply.github.com>
Co-authored-by: Sam Wheating <samwheating@gmail.com>
Co-authored-by: Evan Jones <evan.a.jones3@gmail.com>
Co-authored-by: Farhad Farahi <farhad.farahi@gmail.com>
Co-authored-by: Farhad Farahi <farhad@adjoe.io>
Co-authored-by: Sadananda Aithal <111732128+saithal-confluent@users.noreply.github.com>
Successfully merging this pull request may close these issues.

[Proposal] Ingestion Spec controller
3 participants