Skip to content
This repository has been archived by the owner on Oct 23, 2024. It is now read-only.

Fixes #2509 by listening for connection drops and abdicating directly #2712

Merged
merged 1 commit into from
Dec 14, 2015

Conversation

aquamatthias
Copy link
Contributor

No description provided.

@aquamatthias aquamatthias self-assigned this Nov 24, 2015
@aquamatthias aquamatthias force-pushed the mv/fix_2509 branch 2 times, most recently from 1af1487 to a688f8a Compare November 25, 2015 09:17
@aquamatthias aquamatthias assigned kolloch and unassigned aquamatthias Nov 25, 2015

// The current candidate implementation does not handle ZK connection loss
// We register a listener on the candidate zk connection and handle disconnetions explicitly
zk.get().register(new Watcher {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to move this out of the Guice logic.

It also looks strange to me that you couple this with creating the MarathonSchedulerService.

Couldn't that simply be inside a WhenLeaderActor that has the ZooKeeperClient and the MarathonSchedulerService as dependencies?

@kolloch
Copy link
Contributor

kolloch commented Nov 26, 2015

@aquamatthias Thanks for taking care of this! I don't like placing the logic in the guice module, though...

@aquamatthias
Copy link
Contributor Author

@kolloch I agree. I have created a separate actor that handles this.
Side Note: The biggest problem here was wiring everything. We need to improve this.

@aquamatthias aquamatthias assigned kolloch and unassigned aquamatthias Dec 2, 2015
processes.get(name).map { process =>
Try(process.destroy())
processes -= name
process.exitValue()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this affected by the race conditions mentioned in L188?

It looks odd that stopAllProcesses actually doesn't call this method for each process.

@gkleiman
Copy link
Contributor

gkleiman commented Dec 7, 2015

LGTM, just a minor comment regarding one of the new tests.

@aquamatthias
Copy link
Contributor Author

@gkleiman Addressed your comment.

@aquamatthias
Copy link
Contributor Author

@kolloch Since Gaston is on vacation - can you review?

@aquamatthias aquamatthias assigned kolloch and unassigned gkleiman Dec 14, 2015
}
}

class AbdicateOnConnectionLossActor(zk: ZooKeeperClient,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor: could be private[impl]

}

When("Zookeeper starts again")
ProcessKeeper.startZooKeeper(config.zkPort, "/tmp/foo/single", wipeWorkDir = false)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are you using a different working directory here? Please clarify in a comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the same path as in SingleMarathonIntegrationTest.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe move it to a constant in SingleMarathonIntegrationTest to make that clear?

@kolloch
Copy link
Contributor

kolloch commented Dec 14, 2015

Hey @aquamatthias, looks good. Mostly minor nitpicks and one race-condition. The only major thing is that I am not sure if the error detection code is exhaustive.

@kolloch kolloch assigned aquamatthias and unassigned kolloch Dec 14, 2015
@aquamatthias aquamatthias assigned kolloch and unassigned aquamatthias Dec 14, 2015
@kolloch
Copy link
Contributor

kolloch commented Dec 14, 2015

Thanks for the changes. I would suggest moving that Zookeeper path to a constant and referencing it in the test. Otherwise I would keep being confused about the meaning of that path in the test. Otherwise looks good!

@aquamatthias
Copy link
Contributor Author

Extracted a def with parameter default values.
Rebased and Squashed.

@kolloch
Copy link
Contributor

kolloch commented Dec 14, 2015

:shipit: 🚀

kolloch pushed a commit that referenced this pull request Dec 14, 2015
Fixes #2509 by listening for connection drops and abdicating directly
@kolloch kolloch merged commit 48b1e4b into master Dec 14, 2015
@kolloch kolloch deleted the mv/fix_2509 branch December 14, 2015 19:55
@marcomonaco marcomonaco added the pr label Mar 6, 2017
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants