
[FEATURE] Support for task orchestration #747

Closed
KoenR3 opened this issue Jul 27, 2021 · 11 comments · Fixed by #853

@KoenR3
Contributor

KoenR3 commented Jul 27, 2021

When task orchestration is activated, the new Jobs API must be used.

https://docs.databricks.com/data-engineering/jobs/jobs-api-updates.html

Adding this as a feature request for an upcoming release.

@nfx
Contributor

nfx commented Jul 27, 2021

Hi @KoenR3, do you actively use the current `databricks_job` resource? How big are the workspaces where this resource is intended to be rolled out?

Let's wait a bit until more votes are added for this issue.

@KoenR3
Contributor Author

KoenR3 commented Jul 27, 2021

Hey @nfx, yes, we actively use it, but in our production environment we have not yet switched to task orchestration, so that we can keep using Terraform (we currently use an external Airflow scheduler for the complex scheduling).

We have approximately 15 users on the workspace, with about 30 jobs provisioned through Terraform, and we expect to onboard more in the coming year.

It is not an urgent feature, as I said. Just putting it on the roadmap, because when the feature becomes GA, the switch between APIs might break the current implementation.

@Nestor10

Similar use case as above. Too bad there isn't backward compatibility with the API, but then no one likes putting vX in API paths.

@Nestor10

OK, it turns out I was wrong about the API. I've activated pipelines in our dev environment and Terraform still allows deploying jobs.

@nfx nfx added this to the v0.4.0 milestone Aug 6, 2021
@janaekj

janaekj commented Aug 18, 2021

Our organization would also like to see support for the newly released multi-task jobs, in public preview since the end of July: https://docs.databricks.com/data-engineering/jobs/index.html

@vadivelselvaraj

We at Rivian would also love to see the task orchestration feature supported via Terraform. This would help us define dependency DAGs for our jobs, handle retries, etc.

nfx added a commit that referenced this issue Oct 8, 2021
* the provider has to be initialized with `use_multitask_jobs = true`
* the `task` block of `databricks_job` is currently a slice, so adding and removing different tasks might cause confusing, but still correct, diffs
* we may explore `tf:slice_set` mechanics for `task` blocks, though initial attempts turned out to be harder to test
* the `always_running` parameter still has to be tested for API 2.1 compatibility

This implements feature #747
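As a sketch of the opt-in described in the commit message above (the `use_multitask_jobs` flag comes straight from that message; whether released provider versions accept it is an assumption, and later comments in this thread suggest they may not; host and token are hypothetical):

```hcl
provider "databricks" {
  host  = "https://example.cloud.databricks.com" # hypothetical workspace URL
  token = var.databricks_token                   # hypothetical variable

  # Opt-in flag named in the commit message above; its availability in
  # released provider versions is an assumption of this sketch.
  use_multitask_jobs = true
}
```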
@nfx nfx self-assigned this Oct 8, 2021
@nfx nfx modified the milestones: v0.4.0, v0.3.9 Oct 8, 2021
@nfx
Contributor

nfx commented Oct 9, 2021

Support will be added in v0.3.9.

nfx added a commit that referenced this issue Oct 12, 2021
@nfx nfx closed this as completed in #853 Oct 13, 2021
nfx added a commit that referenced this issue Oct 13, 2021
@dugernierg

Thanks for the release, it's working like a charm!

For future readers of the repo facing the same requirements as me:

While it's not specified in the current docs, you can create as many dependencies as you need by adding more `depends_on {}` blocks. You can declare only one dependency per block, however; trying to declare multiple `task_key`s, or to pass an array of strings instead of a string, won't work.
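To illustrate the point above, here is a minimal sketch of a job where one task depends on two upstream tasks via two separate `depends_on` blocks. Job name, cluster id, and notebook paths are all hypothetical; block names follow the `databricks_job` schema as described in this thread:

```hcl
resource "databricks_job" "example" {
  name = "multitask-example" # hypothetical job name

  task {
    task_key            = "extract"
    existing_cluster_id = "1234-567890-abcde123" # hypothetical cluster
    notebook_task {
      notebook_path = "/Shared/extract" # hypothetical path
    }
  }

  task {
    task_key            = "transform"
    existing_cluster_id = "1234-567890-abcde123"
    notebook_task {
      notebook_path = "/Shared/transform"
    }
  }

  task {
    task_key            = "load"
    existing_cluster_id = "1234-567890-abcde123"

    # One depends_on block per upstream dependency, as noted above --
    # a single block cannot list multiple task_keys.
    depends_on {
      task_key = "extract"
    }
    depends_on {
      task_key = "transform"
    }

    notebook_task {
      notebook_path = "/Shared/load"
    }
  }
}
```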

@ravulachetan

Hey @dugernierg,
What's the Terraform syntax for enabling "Task orchestration in Jobs" in the Databricks Admin Console? I have tried `use_multitask_jobs = true` as part of the provider, but it's not allowed there.

@nfx
Contributor

nfx commented Jan 5, 2022

@ravulachetan it should be enabled on all new workspaces. Otherwise, you can enable it through the UI in workspace settings, or check with your Databricks representative for the undocumented property for `databricks_workspace_conf`.
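For shape only: `databricks_workspace_conf` takes a map of settings via `custom_config`. The actual key for task orchestration is undocumented and not stated in this thread, so the key below is a placeholder, not the real property name:

```hcl
resource "databricks_workspace_conf" "this" {
  custom_config = {
    # Placeholder key -- the real, undocumented property name must come
    # from your Databricks representative.
    "enableSomeOrchestrationSetting" = "true"
  }
}
```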

@ravulachetan

ravulachetan commented Jan 6, 2022

@dugernierg, thanks for the quick response.
I am currently creating the workspace using Terraform and manually enabling "Task orchestration in Jobs" via the UI in workspace settings. My understanding, based on the above comments, is that Terraform started supporting orchestration from provider version 0.3.9. I am looking for the Terraform syntax to enable it.
Let me know if my understanding is wrong and Terraform does not support enabling orchestration yet.

michael-berk pushed a commit to michael-berk/terraform-provider-databricks that referenced this issue Feb 15, 2023