Local paths treated like git repos? #952

Closed
camertron opened this issue Jul 10, 2019 · 14 comments
Comments

@camertron

Bug

Current Behavior

I'm trying to point garden at a local path via the repositoryUrl option. Not only does garden expect the path to end with a git ref (e.g. #master), which doesn't make sense to me, it also doesn't pick up local changes I make in that directory. Any changes I make to the Dockerfile, for example, aren't picked up when garden builds the image. I guess garden is checking the repo out at master, discarding my local changes in the process?

It's absolutely possible I'm misunderstanding how garden is supposed to work, but I thought the whole point was to enable local development via k8s. How does that work if it treats all repositoryUrls as remotes?

Expected behavior

I expected garden to be able to "see" changes inside a local directory.

Reproducible example

Point your garden.yml at any local path, make a local change and run garden build. The local change won't make it into the image.

Your environment

Mac OS 10.13.6
kubernetes 1.10.11

garden version = 0.10.0

Not attaching result of get debug-info because it contains sensitive info.

@eysi09
Collaborator

eysi09 commented Jul 11, 2019

Hi @camertron, thanks for the question!

Is the path you're pointing to another git repo that you want to import into your project? If that is the case, you will need to link the repository via the garden link command. So if you're importing another module, you would run something like:

# In main project dir
garden link module my-module /path/to/my-module

You can read more about working with remote sources here.

However, if this is just another service in a monorepo-style project, you don't use the repositoryUrl directive. Instead, you co-locate a garden.yml file with each module, one per module. You can optionally use the include directive to tell Garden explicitly which directories within the module it should include.
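
For illustration, the monorepo layout described above might look like the following sketch. The project and module names (my-project, api, web) and the include patterns are hypothetical, and the garden.yml fields assume the `kind: Module` config format; check the config reference for your Garden version before copying them.

```shell
# Sketch only: names, paths and include patterns are hypothetical.
project_dir="$(mktemp -d)/my-project"
mkdir -p "$project_dir/services/api" "$project_dir/services/web"

# One garden.yml per module, co-located with that module's code.
# No repositoryUrl is set: the module's source is the directory holding the file.
cat > "$project_dir/services/api/garden.yml" <<'EOF'
kind: Module
name: api
type: container
# Optional: limit which files Garden considers part of this module.
include:
  - Dockerfile
  - src/**/*
EOF

cat > "$project_dir/services/web/garden.yml" <<'EOF'
kind: Module
name: web
type: container
EOF

# List the module configs that were created.
find "$project_dir" -name garden.yml | sort
```

The point of the layout is that nothing user-specific, such as an absolute local path, ends up in a committed garden.yml.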

The motivation behind the repositoryUrl directive is to allow users to pull in code from other (usually remote) repos and deploy it with the rest of the stack. If the user then needs to make changes to those remote services, they can link them to a local version on their machine. The reason for the extra link step is that in some cases users don't have or need the source code for the remote on their machine. E.g. you might want to pull in a service that another team is working on to run your whole stack, without necessarily wanting to clone that repo.

That being said, the "remote sources" workflow could definitely be improved. I'd love to hear what your use case is and if you have any suggestions. Also, your comment suggests that we need to document this better.

@camertron
Author

Hey @eysi09 thanks for your quick response :)

The path I'm pointing to is another git repo, but that's sort of just a coincidence. I tried garden link though and that seems to be exactly what I was looking for 🎉

I think potentially just adding a mention of garden link in the documentation about remote sources (maybe it's already there and I missed it?) would be enough.

Thanks for the great tool!

@10ko
Member

10ko commented Jul 12, 2019

Hi @camertron,
I am happy you figured this out with @eysi09 and you like Garden.

I have one question for you though: you mentioned you can't attach the output from garden get debug-info because it contains sensitive information. What kind of info does the output contain?
Could you elaborate on that? We tried to scramble as much as possible, so if the output still contains sensitive data and isn't safe to share, we have to fix it.

It would be nice to know more and also if you have a specific threat model we didn't think of.
Feel free to briefly comment here or open an issue if you feel like :).

Thanks.
Have a nice one.
Emanuele

@camertron camertron reopened this Jul 12, 2019
@camertron
Author

@eysi09 I reopened this issue because it turns out garden link isn't what I was looking for. It appears garden will rsync everything from the linked directory into its .build directory. Complete with assets, the repo I'm working with weighs in at over 7GB and is comprised of hundreds of thousands of files. I literally want garden to point at this directory. It shouldn't need to clone or copy anything. It would also be really cool if I could mount an NFS volume inside kubernetes with garden so my container wouldn't have to be rebuilt or rsynced every time I make a local change. Just a thought.

@10ko The output .zip file contained copies of my garden.yml files, which contained names of repos, git URLs, etc. I'm currently just playing around with garden to see if it works for our use case, i.e. this is essentially throwaway stuff.

@eysi09
Collaborator

eysi09 commented Jul 15, 2019

You should be able to choose what directories you sync with the include directive or alternatively with a .gardenignore file. However, this seems to have broken, see #968. We'll look into it ASAP.

As for the repositoryUrl field, it is generally meant to be used for remote repositories. This is because we assume users commit their garden.yml files and therefore they shouldn't contain values particular to a user's setup, such as a local path. This is why you need that extra link step.

That being said, we should probably detect if the repositoryUrl is a local path and in that case skip cloning the repo and the link step (Garden will still sync the build).

We've also thought about including something like a user profile where users can set local paths to external sources. Garden could then read those and again, skip cloning and linking if applicable. Would that make sense for your use case?

@camertron
Author

camertron commented Jul 16, 2019

@eysi09 ah ok, I wondered about that. I tried using the include directive and .gardenignore but they didn't seem to have any effect. I thought I was using them wrong hehe.

I understand (and agree with) the rationale behind the link command, i.e. keeping user-specific config out of source control is a good thing. For that reason I'm not sure it makes sense to implicitly or automatically perform a link step on the user's behalf. Perhaps garden could print an informative error message when it discovers a local path? Something like "Looks like you're pointing to a local directory. Consider garden link instead?"

A user profile would also be great, maybe a JSON file at ~/.garden?

An update: I've been able to work around the rsync issues by writing a module of type: kubernetes. I have a type: container that builds a very minimal docker image locally, then the type: kubernetes module to actually deploy it. I have an NFS volume mounted inside the container that points to my local copy of the repo and two more regular (i.e. non-NFS) persistent volumes for yarn and bundler dependencies respectively. I'm effectively avoiding all the rsync issues and I'm able to use the many many image and flash assets in my local repo without too much slowdown.

The biggest issues with rsync are 1) slowness, and 2) it's not bi-directional. If I make a change to the repo files (e.g. upload an avatar), that file is written inside the container but doesn't get synced into my local repo, so the next time I run the container the avatar image will be gone.

The local path issues could really be helped by a user profile with variables that could be interpolated into the module definition at runtime.
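
The NFS part of this workaround can be sketched as a plain Kubernetes manifest, of the kind a type: kubernetes module would deploy. Everything specific here is an assumption for illustration: the image name, mount paths, claim names, and the NFS_SERVER and NFS_PATH values are hypothetical, the two persistent volume claims are assumed to exist already, and the NFS export on the host has to be set up separately.

```shell
# Sketch only: all names and addresses below are hypothetical placeholders.
NFS_SERVER="${NFS_SERVER:-192.168.65.2}"        # host's NFS server as seen from the cluster
NFS_PATH="${NFS_PATH:-/Users/me/code/my-app}"   # exported path of the local repo
manifest="$(mktemp)"

# Write a Deployment that mounts the local repo over NFS instead of copying it.
cat > "$manifest" <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          image: my-app-minimal:latest
          volumeMounts:
            - name: source
              mountPath: /app
            - name: yarn-deps
              mountPath: /app/node_modules
            - name: bundler-deps
              mountPath: /usr/local/bundle
      volumes:
        # The local checkout, served from the host over NFS (read-write).
        - name: source
          nfs:
            server: ${NFS_SERVER}
            path: ${NFS_PATH}
        # Regular persistent volumes for dependencies.
        - name: yarn-deps
          persistentVolumeClaim:
            claimName: yarn-deps
        - name: bundler-deps
          persistentVolumeClaim:
            claimName: bundler-deps
EOF

# Apply it directly, or reference the manifest from a module of type: kubernetes.
# kubectl apply -f "$manifest"
```

Because the container reads and writes the host directory directly, files created inside the container (such as an uploaded avatar) land in the local repo, which is the bi-directional behaviour a one-way sync cannot provide.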

Let me know what you think :)

@edvald
Collaborator

edvald commented Jul 16, 2019

That's interesting, and good feedback! We've been pondering how to make external sources easier to use, and separately pondering how we might use data volumes (e.g. for test data) more effectively. It's great to get some data points like this to help guide the design.

Just on the rsync slowness issue, we've been looking into using mutagen ourselves instead of rsync. It's performant and does bi-directional syncing. Could be just what you need for your use-case?

@camertron
Author

@edvald happy to help, and thanks again for garden!

Mutagen looks very cool, but maybe overkill for our use-case? We literally just need docker containers to have access to the host filesystem. I understand NFS isn't the perfect solution (for one thing, it has to be managed separately from garden) but it seems to work well enough that I can just set it up and not think about it again. From garden's perspective, I can see how it would be desirable to continue to keep a local copy of a codebase and its remote repo separate, and I like the concept of being able to "link" a repo. Perhaps the thing to do is include an NFS (or similar) server into garden. A quick google search reveals sdc-nfs, but there may be other options out there too.

@edvald
Collaborator

edvald commented Jul 19, 2019

As it happens, we actually already use NFS for our in-cluster building features, but only within the cluster to sync and then expose the build context to the Docker daemon or Kaniko. Embedding an NFS server on the client-side is something we hadn't considered, but is definitely interesting!

I think the mechanics of syncing a data volume could be solved in a very similar way as to how we sync code, just need to think more about how we handle the relationship between a data "service" and other services in terms of abstractions and config. But this has been coming up more and more, for example in data science related scenarios, so I'm keen to work something out.

Oh and our pleasure :) I'm just happy that people are finding it helpful, it's super motivating to get the feedback.

@iphpdonthitme

At the risk of not giving totally specific feedback on what's broken: it seems a little odd to me that it appears difficult to set up existing local git repos with modified files (i.e. not at a specific committed git ref) outside of a garden project directory. I would guess that this is a fairly common use case. Perhaps an example project that does this would be useful? It's not clear to me whether I'm doing things right or wrong with modules, sources and repositoryUrls. And sources and repositoryUrls, even when set up with "file:///", seem to want to rsync a specific git ref.

@eysi09
Collaborator

eysi09 commented Sep 16, 2019

Closing this issue as the behaviour is documented here: https://docs.garden.io/using-garden/using-remote-sources#local-sources-modules

@eysi09 eysi09 closed this as completed Sep 16, 2019
@camertron
Author

@eysi09 are local sources still rsynced into the .build directory? If so then the concerns I've raised in this issue are still relevant and I think it should be re-opened.

@eysi09
Collaborator

eysi09 commented Sep 19, 2019

All your source code is rsynced into the .build directory (except files and directories that are ignored via .gitignore, .gardenignore and the exclude directive). So modules that are part of the main project get rsynced, as well as anything included via the repositoryUrl directive. It's from there that Garden "operates". For example, when Garden builds the Docker image for a module, it does that in the .build directory.

If we didn't rsync external sources, Garden wouldn't be aware of them.
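
To make the staging step concrete, here is a rough approximation using plain rsync. It is not Garden's actual implementation: the directory names and the ignore pattern are hypothetical, and rsync's exclude patterns only loosely resemble .gardenignore syntax.

```shell
# Sketch only: approximates "copy sources into a staging dir, minus ignored files".
src_dir="$(mktemp -d)"     # stands in for a module's source directory
build_dir="$(mktemp -d)"   # stands in for the module's folder under .build

mkdir -p "$src_dir/assets"
echo 'FROM alpine' > "$src_dir/Dockerfile"
echo 'large binary asset' > "$src_dir/assets/video.bin"
echo 'assets/' > "$src_dir/.gardenignore"   # hypothetical ignore rule

# One-way copy from source to staging, skipping anything matched by the ignore file.
rsync -a --exclude-from="$src_dir/.gardenignore" "$src_dir/" "$build_dir/"

# A file created on the staging side never flows back to the source.
echo 'uploaded avatar' > "$build_dir/avatar.png"

ls -A "$build_dir"
ls -A "$src_dir"
```

The two listings show both properties discussed in this thread: ignored directories never reach the staging copy, and changes made in the copy are not reflected in the source.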

@camertron
Author

@eysi09 yes, I understand. As I've detailed in previous comments, rsyncing is slow, space-consuming, and not bi-directional. I had hoped for a way to point garden at a local directory with no rsyncing. I had also suggested adding NFS support, which would address all my concerns. If local directories without rsync aren't on the roadmap and that's why this issue was closed, then that's fine. If not, then I think my concerns are still relevant.

5 participants