control-service: utilize new deployment tables #2714

Merged
merged 8 commits into main from person/miroslavi/additional-deployment-table on Oct 2, 2023

Conversation

Collaborator

@mivanov1988 mivanov1988 commented Sep 28, 2023

Why

As part of VEP-2272, we need to introduce a process that synchronizes data jobs from the database to Kubernetes.

The initial implementation depends on the data_job_deployment table for both reading and writing. Because the deployment process runs asynchronously and takes an unpredictable amount of time, users may believe their deployment has completed when it has not.
To address this, we are introducing a new table, "desired_data_job_deployment", and renaming the current one to "actual_data_job_deployment". Read operations will read from "actual_data_job_deployment", while write operations will write to "desired_data_job_deployment".
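
To make the intended split concrete, here is a minimal sketch of how reads and writes would be routed (the entity fields and repository methods below are simplified assumptions for illustration, not the actual control-service classes):

```java
import java.util.Optional;

// Hypothetical, simplified types; the real entities and Spring Data repositories
// live in the control-service code base.
record DesiredDataJobDeployment(String dataJobName, String gitCommitSha, boolean enabled) {}
record ActualDataJobDeployment(String dataJobName, String deployedImage, String lastDeployedDate) {}

interface DesiredJobDeploymentRepository {
  void save(DesiredDataJobDeployment deployment);             // write path
}

interface ActualJobDeploymentRepository {
  Optional<ActualDataJobDeployment> findById(String jobName); // read path
}

class DeploymentReadWriteSplitSketch {
  private final DesiredJobDeploymentRepository desiredRepo;
  private final ActualJobDeploymentRepository actualRepo;

  DeploymentReadWriteSplitSketch(DesiredJobDeploymentRepository desiredRepo,
                                 ActualJobDeploymentRepository actualRepo) {
    this.desiredRepo = desiredRepo;
    this.actualRepo = actualRepo;
  }

  // A user-triggered deployment only records the desired state; the asynchronous
  // synchronizer later applies it to Kubernetes.
  void deploy(DesiredDataJobDeployment desired) {
    desiredRepo.save(desired);
  }

  // Status queries return what was actually synchronized to Kubernetes,
  // not what the user most recently requested.
  Optional<ActualDataJobDeployment> readDeployment(String jobName) {
    return actualRepo.findById(jobName);
  }
}
```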

What

We have implemented the new deployment tables and modified the deployment logic to work with them.

Note that this is only the second phase of the implementation. The following enhancements are planned for future PRs:

  • We will annotate the DataJobsSynchronizer.synchronizeDataJobs() method with @Scheduled (see the sketch after this list).
  • Improved exception handling will be integrated.
  • More tests will be added in subsequent updates.
  • The thread pool configuration will be tuned and exposed through application.properties.
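
As a rough illustration of the first and last items, the synchronizer could eventually look something like the sketch below (the fixed-delay value, property names, and pool size are assumptions, not the final configuration):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;

@Component
class DataJobsSynchronizerSketch {

  private final ExecutorService executor;

  // Planned follow-up: tune the thread pool and expose its size through
  // application.properties (the property name below is hypothetical).
  DataJobsSynchronizerSketch(
      @Value("${datajobs.deployment.synchronizer.thread-pool-size:4}") int poolSize) {
    this.executor = Executors.newFixedThreadPool(poolSize);
  }

  // Planned follow-up: run the synchronization on a schedule instead of invoking
  // it manually; requires @EnableScheduling on a configuration class.
  @Scheduled(fixedDelayString = "${datajobs.deployment.synchronizer.delay-ms:60000}")
  void synchronizeDataJobs() {
    // For each data job: read desired_data_job_deployment, apply it to Kubernetes,
    // then record the outcome in actual_data_job_deployment.
  }
}
```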

Testing Done

Integration tests

Signed-off-by: Miroslav Ivanov miroslavi@vmware.com

@mivanov1988 mivanov1988 changed the title from "control-service: additional deployment table" to "[DRAFT] control-service: additional deployment table" Sep 28, 2023
@mivanov1988 mivanov1988 changed the title from "[DRAFT] control-service: additional deployment table" to "control-service: additional deployment table" Sep 28, 2023
@mivanov1988 mivanov1988 force-pushed the person/miroslavi/additional-deployment-table branch 3 times, most recently from 59dd97b to ba58cb2 on September 29, 2023 07:06
@mivanov1988 mivanov1988 changed the title from "control-service: additional deployment table" to "control-service: utilize new deployment tables" Sep 29, 2023
mivanov1988 and others added 2 commits September 29, 2023 14:56
Why
As part of VEP-2272, we need to introduce a process that will synchronize the data jobs from the database to Kubernetes.

What
TODO

Testing Done
TODO

Signed-off-by: Miroslav Ivanov miroslavi@vmware.com
@mivanov1988 mivanov1988 force-pushed the person/miroslavi/additional-deployment-table branch 2 times, most recently from 2acfb21 to f618fdf on September 29, 2023 12:20
@mivanov1988 mivanov1988 force-pushed the person/miroslavi/additional-deployment-table branch from d52c674 to cfc02d3 on September 29, 2023 14:59
@mivanov1988 mivanov1988 force-pushed the person/miroslavi/additional-deployment-table branch from 970141e to 860d117 on September 29, 2023 15:45
@mivanov1988 mivanov1988 force-pushed the person/miroslavi/additional-deployment-table branch from cc58cdf to ea0fc74 on September 30, 2023 12:11
@mivanov1988 mivanov1988 force-pushed the person/miroslavi/additional-deployment-table branch from 53abcbc to 4383dde on October 2, 2023 07:21
@mivanov1988 mivanov1988 enabled auto-merge (squash) October 2, 2023 12:58
Collaborator

@antoniivanov antoniivanov left a comment

Looks great. Left some comments, but they can be addressed in this or subsequent PRs, along with the other TODOs, as you see fit.

If you disagree with or don't understand a comment, please respond.

@mivanov1988
Collaborator Author

Looks great. Left some comments, but they can be addressed in this or subsequent PRs, along with the other TODOs, as you see fit.

If you disagree with or don't understand a comment, please respond.

I totally agree with all the comments and will address them as part of #2742, thanks!

@mivanov1988 mivanov1988 merged commit 105ec6d into main Oct 2, 2023
@mivanov1988 mivanov1988 deleted the person/miroslavi/additional-deployment-table branch October 2, 2023 18:50
mivanov1988 added a commit that referenced this pull request Oct 11, 2023
# Why
We observed the following post-deployment test failure:
https://gitlab.com/vmware-analytics/versatile-data-kit/-/jobs/5192553226.

This change (#2714) is the culprit. The first pipeline that started having this issue is
https://gitlab.com/vmware-analytics/versatile-data-kit/-/jobs/5210024385.
The previous control-service change that triggered the same tests passed fine, and there
weren't any changes to heartbeat or control-cli in between, as far as I can tell. The logs
show that the schedule is written to the data job table for the previous control-service
change that triggered the same tests.


https://gitlab.com/vmware-analytics/versatile-data-kit/-/pipelines/1020826286
https://gitlab.com/vmware-analytics/versatile-data-kit/-/jobs/5192553226

The control-cli makes a PUT request after creating the data job; this updates the
config, which includes the schedule. For some reason, the config doesn't get updated.
This change touches some of the update code: it passes the new DesiredJobDeployment
entity into the JobImageBuilder, which is used in the JobService update method, which
gets called when we do a PUT request to update the job.

# What
Removed the unnecessary @Transactional annotations from the DesiredJobDeploymentRepository
methods, which prevented the transaction started in JobsService.updateJob() from being
committed.
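
For illustration, the shape of the change is roughly the following (the entity stub and the derived method name are assumptions; the point is that the repository methods no longer declare their own @Transactional and instead join the transaction opened by JobsService.updateJob()):

```java
import org.springframework.data.repository.CrudRepository;
import org.springframework.transaction.annotation.Transactional;

// Hypothetical, simplified stand-in for the real entity.
class DesiredDataJobDeployment {
  String dataJobName;
}

// BEFORE (simplified): the repository methods carried their own @Transactional,
// which interfered with committing the transaction started in JobsService.updateJob().
interface DesiredJobDeploymentRepositoryBefore
    extends CrudRepository<DesiredDataJobDeployment, String> {
  @Transactional
  void deleteByDataJobName(String dataJobName);
}

// AFTER (simplified): no method-level @Transactional; the call participates in the
// caller's transaction and is committed together with the rest of the job update.
interface DesiredJobDeploymentRepository
    extends CrudRepository<DesiredDataJobDeployment, String> {
  void deleteByDataJobName(String dataJobName);
}
```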

# Testing done
Manually created a data job through the DataJob API and then updated it through the same
API, using the requests below.

create: localhost:8092/data-jobs/for-team/taurus/jobs
```json
{
  "job_name": "miroslavi-test6",
  "team": "taurus",
  "config": {
    "schedule": {
      "schedule_cron": "* * * * *"
    }
  }
}
```

update: localhost:8092/data-jobs/for-team/taurus/jobs/miroslavi-test6
```json
{
  "team": "taurus",
  "config": {
    "schedule": {
      "schedule_cron": "* * * * 1"
    }
  }
}
```

Signed-off-by: Miroslav Ivanov miroslavi@vmware.com
