Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add local datastore to persist states of LocalScope network #466

Merged
merged 3 commits into from
Sep 22, 2015
Merged

Add local datastore to persist states of LocalScope network #466

merged 3 commits into from
Sep 22, 2015

Conversation

chenchun
Copy link
Contributor

Signed-off-by: Chun Chen ramichen@tencent.com

Fixes #461

@chenchun
Copy link
Contributor Author

Depending on godep updates of libkv.

@mavenugo
Copy link
Contributor

@chenchun thanks for your fast response. The CI failure is expected since the corresponding boltdb changes have not made into libkv project.

FYI, this needs a Godep update. which we can update once docker/libkv#46 is merged

@@ -68,6 +68,7 @@ install-deps:
go get github.com/tools/godep
go get github.com/golang/lint/golint
go get github.com/mattn/goveralls
go get github.com/boltdb/bolt
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should ideally go into godep.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, once bolt gets merged into libkv project, I'll update.

@abronan
Copy link

abronan commented Aug 21, 2015

@chenchun You added all the etcd log files, make sure to exclude the default.etcd folder from your commits.

@abronan
Copy link

abronan commented Aug 21, 2015

Needs a vendoring of boltdb library too with Godep.

@chenchun
Copy link
Contributor Author

@abronan Thanks, updated and removed them.

@abronan
Copy link

abronan commented Aug 21, 2015

This will replace #439

@mavenugo mavenugo mentioned this pull request Aug 21, 2015
Datastore DatastoreCfg
Daemon DaemonCfg
Cluster ClusterCfg
Datastore, LocalDataStore DatastoreCfg
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall we call these GlobalStore and LocalStore ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed this

@chenchun
Copy link
Contributor Author

Also contains the following change:
Removed useless function getNetworkFromStore and getEndpointFromStore from store.go.
Added tests to test default config for local store.
Added KVProviderConfigto netlabel/labels.go and passed LocalScope configuration to opt in controller.RegisterDriver.
Removed network.isGlobalScoped since it's useless now. watchNetworks and watchEndpoints will be invoked only by global store now.

if err := c.initLocalStore(); err != nil {
// Failing to initalize datastore is a bad situation to be in.
// But it cannot fail creating the Controller
log.Debugf("Failed to Initialize LocalDatastore due to %v.", err)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we added default configurations for local store, shall we return err here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think so.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed this.

@mavenugo
Copy link
Contributor

@chenchun thanks for addressing all the comments. its shaping up nicely. I just gave a few more comments. Could you please address them ?
If you have any specific design question, we can chat in #docker-network IRC channel.

@chenchun
Copy link
Contributor Author

@mavenugo, I'll update the PR soon.

@chenchun
Copy link
Contributor Author

@mavenugo, just updated the PR. It's quite later today, if you have more comments, I'll take a look 8 hours later.

if err != nil || !global {
func (c *controller) newNetworkFromStore(n *network) error {
// Check if a network already exists with the specified network name
c.Lock()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove the c.Lock() and Unlock() in this method. It causes deadlock :(

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. And tested against docker, seems ok now.

@chenchun
Copy link
Contributor Author

I agree it tries to do too many things. I think we can split the last commit and focus on introducing localstore. But these should be a way to handle https://github.com/docker/libnetwork/blob/master/drivers/bridge/bridge.go#L887

@mavenugo
Copy link
Contributor

thanks @chenchun. I noticed that you have reverted the bridge driver changes that I reported. Is that expected ? Without the changes to the bridge driver, wouldnt restore fail as I indicated earlier ?
I will not be able to give this patch a try tonight. But it will be good if you can give me some description of the latest changes that you have done to enable with the code-review.

@mavenugo
Copy link
Contributor

@chenchun based on our discussion, i agree that the bridge driver changes aren't required & also the daemon restart is going to cleanup the endpoint anyways.
I discussed this issue with the other maintainers and we feel that it is infact better to not even save the endpoint for local scope (we still need it in the globalstore).
Can you please make that change ? And I think that will solve all the issues and makes it is ready for merge.

@chenchun
Copy link
Contributor Author

also the daemon restart is going to cleanup the endpoint anyways

Correct me if I'm wrong. I think daemon restarts won't delete endpoints currently because daemon only keeps SandboxID/EndpointID now and when daemon is killed manually, daemon won't be able to delete veth devices. So I think it's better to keep endpoints and deletes them upon restart.

@mavenugo
Copy link
Contributor

@chenchun you are right. But, we have other ideas to clean up the stale endpoints. Another reason to keep the endpoints deletion away from this PR is to reduce the complexity and get this in ASAP.
So, it would be good if we get this PR in with network save/restore right away and work on the endpoint cleanup on bootup in docker. Also, we have multiple PRs pending on the libkv upgrade. Shall we get this PR to get Network save and restore completed followed by another for proper endpoint handling ?
am in IRC and we can work together on this.

case "local":
c.Scope = driverapi.LocalScope
c.DataScope = datastore.LocalScope
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thats a very quick rebase :-).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately forget to refactor tests during rebase.

@mavenugo
Copy link
Contributor

Thanks @chenchun for the contribution and addressing all the comments patiently.
I tested e2e with both local and global store.

LGTM.

ping @mrjana @aboch

@aboch
Copy link
Contributor

aboch commented Sep 15, 2015

LGTM

@mavenugo
Copy link
Contributor

@chenchun as we discussed, we were trying to simplify this logic by removing my commit from this PR. With the introduction of #547, we can actually do that ... 🎆

Shall I request you to submit the PR again (with a rebase) and remove my commit (https://github.com/chenchun/libnetwork/commit/652d319c33bef98e58f1223b6f47d25a609f1291) from this PR ? BTW I attempted this change in : https://github.com/mavenugo/libnetwork/commits/defnet_save. Please use it if you find it useful. (You can ignore my last commit that introduces persist option),

This ofcourse needs changes in the docker daemon to call controller.New with driver init params. Which we will take care during vendoring in.

@chenchun
Copy link
Contributor Author

OK, will update the PR shortly.

Signed-off-by: Chun Chen <ramichen@tencent.com>
Signed-off-by: Chun Chen <ramichen@tencent.com>
@mavenugo
Copy link
Contributor

Thanks @chenchun pls let me know if you need any help.

@chenchun
Copy link
Contributor Author

@mavenugo, some problems with unit tests now. Since we changed many unit tests to pass configs to controller.New function, some tests will create localstore now. And cause they are run in a single process, they will hang on creating the same boltdb file lock.

@mavenugo
Copy link
Contributor

@chenchun. Can't we use config Params to point to different Boltdb file for each of the different controllers to make use of ?

@chenchun
Copy link
Contributor Author

Yes, doing it right now.

1. Don't save localscope endpoints to localstore for now.
2. Add common function updateToStore/deleteFromStore to store KVObjects.
3. Merge `getNetworksFromGlobalStore` and `getNetworksFromLocalStore`
4. Add `n.isGlobalScoped` before `n.watchEndpoints` in `addNetwork`
5. Fix integration-tests
6. Fix test failure in drivers/remote/driver_test.go
7. Restore network to store if deleteNework failed
@chenchun
Copy link
Contributor Author

@mavenugo updated now. Will test it against docker tomorrow.

if err == datastore.ErrKeyModified {
return types.InternalErrorf("operation in progress. delete failed for network %s. Please try again.")
}
return err
}

defer func() {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a defer here to restore network.

@mavenugo
Copy link
Contributor

@chenchun thanks for your patience and handling all the comments.
We still need a couple of fixes to make it work properly e2e with Docker Daemon.
(such as mavenugo@76c9c13). But we can handle it via subsequent PRs.

Will merge this PR.

mavenugo added a commit that referenced this pull request Sep 22, 2015
Add local datastore to persist states of LocalScope network
@mavenugo mavenugo merged commit 5df9f84 into moby:master Sep 22, 2015
@chenchun
Copy link
Contributor Author

👍 , Happy to see it finally gets merged. I apologize if I slow down your work during the process of this PR. Thanks for your patient review and help too.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants