Capability to support loading terminated jobs after leader election #595

Merged
sundargates merged 17 commits into master from sundaram/jobs_restart on Dec 8, 2023

sundargates
Contributor

Context

^^^

Checklist

  • ./gradlew build compiles code correctly
  • Added new tests where applicable
  • ./gradlew test passes all tests
  • Extended README or added javadocs where applicable

github-actions bot commented Nov 29, 2023

Test Results

487 tests  (-62)    481 ✔️ passed  (-60)    7m 15s ⏱️ (-10s)
131 suites (±0)       6 💤 skipped (-2)
131 files  (±0)       0 ❌ failed  (±0)

Results for commit f381695. ± Comparison against base commit 1dfc742.

This pull request removes 68 and adds 6 tests. Note that renamed tests count towards both.
io.mantisrx.master.jobcluster.JobClusterTest ‑ testCronTriggersSLAToKillOld
io.mantisrx.master.jobcluster.JobClusterTest ‑ testExpireOldJobs
io.mantisrx.master.jobcluster.JobClusterTest ‑ testGetJobDetailsForArchivedJob
io.mantisrx.master.jobcluster.JobClusterTest ‑ testGetLastSubmittedJob
io.mantisrx.master.jobcluster.JobClusterTest ‑ testGetLastSubmittedJobSubject
io.mantisrx.master.jobcluster.JobClusterTest ‑ testGetLastSubmittedJobSubjectWithWrongClusterNameFails
io.mantisrx.master.jobcluster.JobClusterTest ‑ testGetLastSubmittedJobWithCompletedOnly
io.mantisrx.master.jobcluster.JobClusterTest ‑ testGetLastSubmittedJobWithNoJobs
io.mantisrx.master.jobcluster.JobClusterTest ‑ testJobClusterArtifactUpdate
io.mantisrx.master.jobcluster.JobClusterTest ‑ testJobClusterArtifactUpdateMultipleTimes
…
io.mantisrx.master.jobcluster.CompletedJobStoreTest ‑ testInitializationOfCompletedJobStore
io.mantisrx.master.jobcluster.CompletedJobStoreTest ‑ testJobClusterDeletion
io.mantisrx.master.jobcluster.CompletedJobStoreTest ‑ testLazyLoadingOfNewPages
io.mantisrx.master.jobcluster.CompletedJobStoreTest ‑ testWhenJobGetsCompleted
io.mantisrx.master.jobcluster.CompletedJobStoreTest ‑ testWhenJobIsNotThere
io.mantisrx.server.master.store.KeyValueStoreTest ‑ testUpsertOrdered

♻️ This comment has been updated with latest results.

github-actions bot commented Nov 29, 2023

Uploaded Artifacts

To use these artifacts in your Gradle project, paste the following lines in your build.gradle.

resolutionStrategy {
    force "io.mantisrx:mantis-client:0.1.0-20231207.221317-455"
    force "io.mantisrx:mantis-common-serde:0.1.0-20231207.221317-454"
    force "io.mantisrx:mantis-common:0.1.0-20231207.221317-454"
    force "io.mantisrx:mantis-discovery-proto:0.1.0-20231207.221317-454"
    force "io.mantisrx:mantis-runtime:0.1.0-20231207.221317-455"
    force "io.mantisrx:mantis-remote-observable:0.1.0-20231207.221317-455"
    force "io.mantisrx:mantis-runtime-loader:0.1.0-20231207.221317-455"
    force "io.mantisrx:mantis-testcontainers:0.1.0-20231207.221317-124"
    force "io.mantisrx:mantis-connector-iceberg:0.1.0-20231207.221317-453"
    force "io.mantisrx:mantis-connector-job:0.1.0-20231207.221317-455"
    force "io.mantisrx:mantis-connector-kafka:0.1.0-20231207.221317-455"
    force "io.mantisrx:mantis-network:0.1.0-20231207.221317-454"
    force "io.mantisrx:mantis-control-plane-client:0.1.0-20231207.221317-454"
    force "io.mantisrx:mantis-control-plane-core:0.1.0-20231207.221317-448"
    force "io.mantisrx:mantis-connector-publish:0.1.0-20231207.221317-454"
    force "io.mantisrx:mantis-control-plane-server:0.1.0-20231207.221317-448"
    force "io.mantisrx:mantis-examples-groupby-sample:0.1.0-20231207.221317-448"
    force "io.mantisrx:mantis-examples-jobconnector-sample:0.1.0-20231207.221317-448"
    force "io.mantisrx:mantis-examples-mantis-publish-sample:0.1.0-20231207.221317-448"
    force "io.mantisrx:mantis-examples-core:0.1.0-20231207.221317-448"
    force "io.mantisrx:mantis-shaded:0.1.0-20231207.221317-453"
    force "io.mantisrx:mantis-examples-sine-function:0.1.0-20231207.221317-448"
    force "io.mantisrx:mantis-examples-synthetic-sourcejob:0.1.0-20231207.221317-448"
    force "io.mantisrx:mantis-publish-core:0.1.0-20231207.221317-448"
    force "io.mantisrx:mantis-examples-wordcount:0.1.0-20231207.221317-448"
    force "io.mantisrx:mantis-examples-twitter-sample:0.1.0-20231207.221317-448"
    force "io.mantisrx:mantis-server-agent:0.1.0-20231207.221317-448"
    force "io.mantisrx:mantis-server-worker:0.1.0-20231207.221317-448"
    force "io.mantisrx:mantis-server-worker-client:0.1.0-20231207.221317-448"
    force "io.mantisrx:mantis-source-job-kafka:0.1.0-20231207.221317-448"
    force "io.mantisrx:mantis-source-job-publish:0.1.0-20231207.221317-448"
    force "io.mantisrx:mantis-publish-netty-guice:0.1.0-20231207.221317-448"
    force "io.mantisrx:mantis-publish-netty:0.1.0-20231207.221317-447"
}

@sundargates sundargates had a problem deploying to Integrate Pull Request November 30, 2023 00:01 — with GitHub Actions Failure
@sundargates sundargates had a problem deploying to Integrate Pull Request November 30, 2023 17:10 — with GitHub Actions Failure
@sundargates sundargates had a problem deploying to Integrate Pull Request November 30, 2023 17:15 — with GitHub Actions Failure
// let's load 1 page of completed jobs from DB and then let the rest be loaded lazily
// todo(sundaram): Use a clock here
Instant end = Instant.now();
List<CompletedJob> completedJobs = jobStore.loadCompletedJobsForCluster(name, end.minus(Duration.ofDays(7)), end);
Collaborator

Is this a blocking I/O call? Can it be converted to a future?
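
For illustration only, a minimal sketch of what "converting to a future" could look like. It assumes the load is a blocking call that may throw IOException. `BlockingJobStore`, `loadAsync`, and the use of `String` in place of `CompletedJob` are hypothetical stand-ins, not the Mantis API, and this is not necessarily how the PR resolved the comment.

```java
import java.io.IOException;
import java.time.Duration;
import java.time.Instant;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class AsyncLoadSketch {

    /** Hypothetical stand-in for the blocking store call shown in the diff. */
    interface BlockingJobStore {
        List<String> loadCompletedJobsForCluster(String cluster, Instant start, Instant end)
                throws IOException;
    }

    /** Runs the blocking load on ioPool and exposes the result as a future. */
    static CompletableFuture<List<String>> loadAsync(
            BlockingJobStore store, String cluster, Instant start, Instant end,
            ExecutorService ioPool) {
        return CompletableFuture.supplyAsync(() -> {
            try {
                return store.loadCompletedJobsForCluster(cluster, start, end);
            } catch (IOException e) {
                // A Supplier cannot throw checked exceptions, so wrap the failure.
                throw new CompletionException(e);
            }
        }, ioPool);
    }

    public static void main(String[] args) {
        ExecutorService ioPool = Executors.newSingleThreadExecutor();
        try {
            // Fake store that returns two job ids without doing any I/O.
            BlockingJobStore store =
                    (cluster, start, end) -> List.of(cluster + "-1", cluster + "-2");
            Instant end = Instant.now();
            List<String> jobs =
                    loadAsync(store, "demo", end.minus(Duration.ofDays(7)), end, ioPool).join();
            System.out.println(jobs);
        } finally {
            ioPool.shutdown();
        }
    }
}
```

The dedicated executor keeps the blocking call off the caller's thread; whether that is worthwhile here depends on how the calling code is scheduled, which the excerpt does not show.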

@sundargates sundargates had a problem deploying to Integrate Pull Request November 30, 2023 18:11 — with GitHub Actions Failure
@sundargates sundargates had a problem deploying to Integrate Pull Request November 30, 2023 22:18 — with GitHub Actions Failure
@sundargates sundargates had a problem deploying to Integrate Pull Request December 1, 2023 04:36 — with GitHub Actions Failure
@sundargates sundargates had a problem deploying to Integrate Pull Request December 2, 2023 07:57 — with GitHub Actions Failure
@sundargates sundargates had a problem deploying to Integrate Pull Request December 2, 2023 20:57 — with GitHub Actions Failure
@sundargates sundargates temporarily deployed to Integrate Pull Request December 5, 2023 08:57 — with GitHub Actions Inactive
@sundargates sundargates temporarily deployed to Integrate Pull Request December 6, 2023 05:44 — with GitHub Actions Inactive
@sundargates sundargates force-pushed the sundaram/jobs_restart branch from 805b66a to 962a49d Compare December 7, 2023 07:57
@sundargates sundargates had a problem deploying to Integrate Pull Request December 7, 2023 07:57 — with GitHub Actions Failure
@sundargates sundargates had a problem deploying to Integrate Pull Request December 7, 2023 21:48 — with GitHub Actions Failure
@sundargates sundargates temporarily deployed to Integrate Pull Request December 7, 2023 22:03 — with GitHub Actions Inactive

interface ICompletedJobsStore {

void initialize() throws IOException;
Contributor Author

Add comments


void initialize() {
try {
logger.info("Loading completed jobs for cluster {}", name);
Contributor Author

Change this to debug instead.

@sundargates sundargates had a problem deploying to Integrate Pull Request December 8, 2023 06:59 — with GitHub Actions Failure
@sundargates sundargates merged commit c5c2a9d into master Dec 8, 2023
@sundargates sundargates deleted the sundaram/jobs_restart branch December 8, 2023 07:07
Comment on lines +78 to +87
List<CompletedJob> getCompletedJobs(int limit) throws IOException;

/**
* Gets a list of completed jobs
* @param limit number of jobs to return
* @param endExclusive end job id
* @return list of completed jobs
* @throws IOException if there is an error
*/
List<CompletedJob> getCompletedJobs(int limit, JobId endExclusive) throws IOException;
Collaborator

So this only supports reading the first x jobs? Is there a way to read through the list page by page?
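
For illustration only, a sketch of how the two overloads above might be combined to page through the list, using the last id of each page as the next `endExclusive` cursor. It assumes results come back newest first and that `endExclusive` means "only ids strictly before this one"; the excerpt does not confirm either. `CompletedJobs`, `InMemoryCompletedJobs`, and `readAll` are hypothetical, and plain `long` ids stand in for `JobId` and `CompletedJob`.

```java
import java.util.ArrayList;
import java.util.List;

public class PaginationSketch {

    /** Hypothetical simplified store: ids are longs, returned newest first. */
    interface CompletedJobs {
        List<Long> getCompletedJobs(int limit);
        List<Long> getCompletedJobs(int limit, long endExclusive);
    }

    /** In-memory fake used only to exercise the paging loop. */
    static class InMemoryCompletedJobs implements CompletedJobs {
        private final List<Long> idsDescending;

        InMemoryCompletedJobs(List<Long> idsDescending) {
            this.idsDescending = idsDescending;
        }

        @Override
        public List<Long> getCompletedJobs(int limit) {
            return page(Long.MAX_VALUE, limit);
        }

        @Override
        public List<Long> getCompletedJobs(int limit, long endExclusive) {
            return page(endExclusive, limit);
        }

        // Returns up to `limit` ids strictly below endExclusive, newest first.
        private List<Long> page(long endExclusive, int limit) {
            List<Long> out = new ArrayList<>();
            for (long id : idsDescending) {
                if (id < endExclusive && out.size() < limit) {
                    out.add(id);
                }
            }
            return out;
        }
    }

    /** Reads every job by feeding the last id of each page back in as the cursor. */
    static List<Long> readAll(CompletedJobs store, int pageSize) {
        List<Long> all = new ArrayList<>();
        List<Long> page = store.getCompletedJobs(pageSize);
        while (!page.isEmpty()) {
            all.addAll(page);
            long cursor = page.get(page.size() - 1);
            page = store.getCompletedJobs(pageSize, cursor);
        }
        return all;
    }

    public static void main(String[] args) {
        CompletedJobs store = new InMemoryCompletedJobs(List.of(5L, 4L, 3L, 2L, 1L));
        System.out.println(readAll(store, 2)); // prints [5, 4, 3, 2, 1]
    }
}
```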

while (!jobs.isEmpty()) {
for (CompletedJob completedJob : jobs) {
try {
// todo(sundaram): Clean this up. This is a hack to get around the fact that the job store
Collaborator

Incomplete comment?
