Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

File Sync for skaffold dev #1039

Merged
merged 24 commits into from
Oct 5, 2018
Merged

Conversation

r2d4
Copy link
Contributor

@r2d4 r2d4 commented Sep 26, 2018

This PR introduces file sync for skaffold dev. Each artifact config now has an optional sync map. If the changes for the dev cycle are contained entirely in the sync map, then a rebuild and redeploy will be skipped, and those files will be synced to the running container. If only some of the changes are contained in the sync map, a full rebuild and redeploy will happen as normal.

That can optionally take glob patterns in the "key" part of the map (if the "key" is a pattern, the "value" destination must be a directory).

This PR changes the watcher slightly to compute WatchEvents, which are files that are added, modified, deleted, rather than an aggregate yes/no to whether things have changed. We were already computing this information and throwing it away, so it shouldn't be any slower.

It also adds a "webserver" example

Caveats: Files are copied with kubectl cp and kubectl exec rm. kubectl cp requires the tar and cp binaries to be present in the container. kubectl exec rm will require rm to be in the container. This feature will not work for scratch containers.

Future work for another PR will directly infer the "destination" from the Dockerfile if it not present.

@codecov-io
Copy link

codecov-io commented Sep 26, 2018

Codecov Report

Merging #1039 into master will increase coverage by 0.83%.
The diff coverage is 70.9%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1039      +/-   ##
==========================================
+ Coverage   42.32%   43.16%   +0.83%     
==========================================
  Files          72       74       +2     
  Lines        3303     3408     +105     
==========================================
+ Hits         1398     1471      +73     
- Misses       1771     1799      +28     
- Partials      134      138       +4
Impacted Files Coverage Δ
cmd/skaffold/app/cmd/fix.go 0% <ø> (ø) ⬆️
pkg/skaffold/util/util.go 45.09% <0%> (-2.33%) ⬇️
pkg/skaffold/runner/runner.go 51.68% <42.1%> (-1.48%) ⬇️
pkg/skaffold/kubernetes/sync.go 68.42% <68.42%> (ø)
pkg/skaffold/runner/changes.go 83.33% <75%> (-16.67%) ⬇️
pkg/skaffold/sync/sync.go 95% <95%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 616f9f7...9b11bc4. Read the comment docs.

return perform(image, syncMap, deleteFileFn)
}

// TODO(r2d4): kubectl exec doesn't seem to take a namespace flag?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's not documented, but you can pass --namespace to kubectl exec

➜  ~ kubectl exec getting-started --namespace=default -- "ls"     
app
bin
...
➜  ~ kubectl exec getting-started --namespace=asdf -- "ls"   
Error from server (NotFound): pods "getting-started" not found

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, I'll update

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mind opening up an issue on kubectl to document that? That might be a good contribution to upstream kubectl.

if err != nil {
return errors.Wrap(err, "getting k8s client")
}
pods, err := client.CoreV1().Pods("").List(meta_v1.ListOptions{})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we want to plumb through the namespace to these functions and pass it in here. As per my comment above this should work with kubectl exec

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we actually know what namespace we're looking for? We don't actually parse the namespace anywhere, and it can be different for kubectl and helm.

@@ -106,6 +106,13 @@ func ExpandPathsGlob(workingDir string, paths []string) ([]string, error) {
return ret, nil
}

// HasMeta reports whether path contains any of the magic characters
// recognized by filepath.Match.
// This is a copy of filepath/match.go's hasMeta
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the original implementation also includes the filepath separator. why not just use that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -108,8 +108,9 @@ func newCallback() *callback {
}
}

func (c *callback) call() {
func (c *callback) call(e Events) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's this change for?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

callback.call() has to implement ChangeFn which is func(Events) error

for _, c := range p.Spec.Containers {
if strings.HasPrefix(c.Image, image) {
for src, dst := range files {
cmd := cmdFn(p, c, src, dst)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could we call these in parallel?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, I'll make them run in parallel

if !sync {
changed.Add(artifact)
} else {
color.Default.Fprintln(out, "Synced:", "copied", append(e.Added, e.Modified...), "deleted", e.Deleted)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is the case, but just to be sure: if we get here, is it always true that we've performed a sync?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, shouldSync should probably be renamed, since it actually syncs and returns true or doesn't sync and returns false

@dgageot
Copy link
Contributor

dgageot commented Sep 27, 2018

@r2d4 when I tried, it would sync files when I don't have a sync on the artifacts and it would stop syncing as soon as I had one...

@dgageot
Copy link
Contributor

dgageot commented Sep 27, 2018

@r2d4 to make the review easier, could you create a PR that introduces watch events?

return true, nil
}

func intersect(syncMap map[string]string, files []string) (map[string]string, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't work if the artifact is not in . but in a sub-folder. sync destinations are relative to the workspace but file paths are relative to the root of the project

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also if I want to copy all static/*.html files to /html, I end up copying them to /html/static/. It should not work that way.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

intersect function would benefit from a good test coverage

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I rewrote it to fix those issues and added another suite of tests to cover it. I still might rewrite both these functions after I split out watch, they're doing a little too much.

@r2d4
Copy link
Contributor Author

r2d4 commented Sep 27, 2018

I’ll split out the watch changes into a separate PR

@ajbouh
Copy link
Contributor

ajbouh commented Sep 28, 2018

@r2d4 it's possible that you could use similar logic to keep the build context up to date to make container builds on knative really fast

@r2d4
Copy link
Contributor Author

r2d4 commented Sep 28, 2018

Yes @ajbouh we can use the same information from the watcher to only update the file deltas on remote contexts 👍

@r2d4 r2d4 mentioned this pull request Sep 28, 2018
@r2d4 r2d4 force-pushed the file-sync branch 5 times, most recently from 90d2f35 to 0a206c2 Compare October 1, 2018 16:32
@dgageot dgageot added the wip label Oct 2, 2018
@dgageot
Copy link
Contributor

dgageot commented Oct 2, 2018

I added a wip label to make sure we merge #1064 first and rebase this one on top of it

@dgageot
Copy link
Contributor

dgageot commented Oct 2, 2018

ping @r2d4, #1064 is merged.

Copy link
Contributor

@dgageot dgageot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll test it tomorrow. Just one question

if err != nil {
return errors.Wrap(err, "getting k8s client")
}
pods, err := client.CoreV1().Pods("").List(meta_v1.ListOptions{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should you provide the ns here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we have the namespace. In port-forward and log, we have to watch all pods.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"" is all namespaces

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do have a namespace sometimes but it's ok, your proposal of filtering on label is equally good.

@dgageot
Copy link
Contributor

dgageot commented Oct 2, 2018

Sorry, @r2d4 you need to rebase once again...

})
}
if err := e.Wait(); err != nil {
return errors.Wrap(err, "syncing files:")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trailing :

case len(changed.needsResync) > 0:
for _, s := range changed.needsResync {
if err := r.Syncer.Sync(s); err != nil {
logrus.Warnln("Skipping build and deploy due to sync error:", err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When the sync fails I don't see the warning.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fact, sometimes I see it...

}

func deleteFileFn(pod v1.Pod, container v1.Container, src, dst string) *exec.Cmd {
return exec.Command("kubectl", "exec", pod.Name, "--namespace", pod.Namespace, "-c", container.Name, "--", "rm", "-rf", dst)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A debug log would be useful here.


func copyFileFn(pod v1.Pod, container v1.Container, src, dst string) *exec.Cmd {
return exec.Command("kubectl", "cp", src, fmt.Sprintf("%s/%s:%s", pod.Namespace, pod.Name, dst), "-c", container.Name)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A debug log would be useful here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We get a debug log with util.RunCmd. Does that work?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes should be good

logrus.Warnln("Skipping build and deploy due to sync error:", err)
return nil
}
color.Default.Fprintf(out, "Synced files for %s...\nCopied: %s\nDeleted: %s\n", s.Image, s.Copy, s.Delete)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe print the number of files in info and the names in debug

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should there be any message in stdout?

stdout: ?
info: # of files synced
debug: filenames

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like info: # of files synced
debug: filenames

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@r2d4 r2d4 force-pushed the file-sync branch 2 times, most recently from d344b05 to 5da3ccd Compare October 3, 2018 14:33
@bhack bhack mentioned this pull request Oct 3, 2018
@r2d4
Copy link
Contributor Author

r2d4 commented Oct 5, 2018

final rebase :)

@dgageot dgageot merged commit d65ab29 into GoogleContainerTools:master Oct 5, 2018
@r2d4 r2d4 deleted the file-sync branch October 5, 2018 14:16
- image: gcr.io/k8s-skaffold/node-example
context: node
sync:
'*.js': .

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we avoid user data as keys ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is that?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should have justified my comment indeed.
Coming from k8s it is unusual to have user data as keys; it surprised me.

Digging into k8s design doc, we have:
https://github.com/kubernetes/community/blob/master/contributors/devel/api-conventions.md#lists-of-named-subobjects-preferred-over-maps
There are whole discussions linked from there, notably about

  • readability (mostly in case of subobjects though)
  • merging definitions, which may or may not be of concert for skaffold

It lists exceptions to this though: "pure maps [...] as opposed to sets of subobjects".

I suppose if we later want to enrich the format with subobjects it would make things harder with the current format.

I am not (yet) familiar with all the skaffold format, I don't know if there are other cases like that, I couln't find any.

So this is just to open the discussion. Should I move that into an issue?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the link. That actually makes a lot of sense. Please open an issue! I think it’s a good suggestion

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here we go: #1180

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants