Individual lockfile per workspace #1223
It's best to summarize it - there are a lot of discussions there 🙂 I've seen your comment about cache layers, but I wonder if what you're looking for isn't just a way to compute a "cache key" for a given workspace? |
I'll gladly do that 😄

**Current state of affairs**

Yarn workspaces has a lot of benefits for monorepos, one of them being the ability to hoist third-party dependencies to reduce installation times and disk space consumed. This works by picking a dependency version that fits the most dependency requirements as specified by package manifest files. If a single dependency can't be found that matches all requirements, that's OK; the dependency is kept in a package's node_modules folder instead of the top level and the Node.js resolution algorithm takes care of the rest. With PnP, I'm assuming the Node.js resolution algorithm is patched in a similar way to make it work with multiple versions of dependencies. I'm assuming that because all of these dependencies are managed by a single lockfile.

**What's the problem?**

In various monorepos, it is desirable to treat a workspace as an independent deployable entity. Most deployment solutions out there will look for manifest and lock files to set up required dependencies. In addition to this, some tools, like Docker, can leverage the fact that versions are immutable to implement caching and reduce build and deployment times. Here's the problem: because there is a single lock file at the top level, one can't just take a package (i.e., workspace) and deploy it as one would when not using Yarn workspaces. If there were a lock file at the package level, this would not be an issue.
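For illustration, here is roughly what such a monorepo looks like today; the workspace names (`api`, `web`) are made up, but the key point is the single `yarn.lock` at the root and none inside the individual workspaces:

```
monorepo/
├── package.json        # declares "workspaces": ["packages/*"]
├── yarn.lock           # the single lockfile shared by every workspace
└── packages/
    ├── api/
    │   └── package.json   # no lockfile here
    └── web/
        └── package.json   # no lockfile here
```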
It's not just computing a "cache key" for caching, but also having a lock file to pin versions. For example, if you're deploying a workspace as a Google Cloud Function you would want the lock file to be there so that installation of dependencies was pinned as the lock file specifies. One could copy the entire lock file to pin versions but then the caching mechanism breaks. So the underlying thing we're working with here is that deployment platforms use lock files as a cache key for the third-party dependencies. |
Let's see if I understand this properly (consider that I don't have a lot of experience with Docker - I've played with docker-compose before, but there are many subtleties I'm still missing):
Did I understand correctly? If so, a few questions:
|
I think so, but let me add a bit more context to how the layer caching mechanism works in Docker. When building a Docker image (i.e., running `docker build`), each command in the Dockerfile produces a layer that can be cached and reused. In the Dockerfile, the specification used to build the Docker image, there are various commands available; a simplified example:

```dockerfile
# layer 1
FROM node
# layer 2
COPY package.json yarn.lock ./
# layer 3
RUN yarn install
# layer 4
COPY . .
```

Again, this might be an oversimplification but it gets the point across. Because we first copy the package.json and the lockfile and then run `yarn install`, the install layer stays cached as long as those two files don't change; only the final `COPY . .` layer is rebuilt when source files change.
This is a good question. After much thought, our solution is going to be a private NPM registry. This will not only work for building Docker images but also for using tools like GCP Cloud Functions or AWS Lambda. If Docker were the only tool we were using, we could use the entire monorepo as the build context but still copy just the manifests and the lockfile first so the install layer stays cached.
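One common pattern along those lines (sketched here with hypothetical workspace paths, not necessarily what the commenter used) is to keep the whole monorepo as the build context but copy only the manifests and the shared lockfile before installing:

```dockerfile
FROM node
WORKDIR /app
# Copy only the root manifest, the shared lockfile, and each workspace's
# manifest; the install layer below is reused until one of these changes.
COPY package.json yarn.lock ./
COPY packages/api/package.json packages/api/
COPY packages/web/package.json packages/web/
RUN yarn install
# Source changes no longer invalidate the install layer, but any change to
# the single root yarn.lock still does -- which is the problem being discussed.
COPY . .
```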
It's a best practice to do it within the image. This guarantees native dependencies are built in the appropriate OS and has the benefit of caching to reduce build times in CI. In our current workaround we actually have to build everything outside, move it into Docker, and run
This might be a good workaround for now, perhaps in a postinstall script. Would this keep the hoisting benefits of Yarn workspaces? |
I tried this out with and unfortunately it's not as straightforward since running |
Linking a comment to a related issue here: yarnpkg/yarn#4521 (comment) |
If having an independent deployable entity is the main reason for this, I currently have a plugin that is able to do it. However, I need to work with my employer to get it released. |
@Larry1123 I think your plugin could be very useful to quite a few folks, will your employer allow you to share it? |
I got the OK to release it; I'll have to do it when I have the time. |
@Larry1123 Wondering how you are handling Yarn workspaces. Does your plugin create a yarn.lock for each package in the workspace? |
In a way, yes: it takes the project's lockfile and reruns the install of the workspace in a new folder, as if it were the only workspace in the project, after also removing devDependencies. That way the resulting lockfile matches the project but contains only what is needed for that workspace. It also currently hardlinks the cache and copies what it can keep from the project's .yarn files. |
The monorepo contains several workspaces (e.g., frontend, backend, and common). We're building two Docker images from it: frontend-app and backend-app.
This can be done, and is nicely described in yarnpkg/yarn#5428 (comment) (we furthermore utilize tarball context as a performance optimization), but the issue with a single lockfile stays: a change in one app's dependencies modifies the shared yarn.lock, which invalidates the other app's Docker layers as well.

(We also have other tooling that is affected by this; for example, we compute the versions of frontend-app and backend-app from Git revisions of the relevant paths, and a change to the shared lockfile affects both of them even when only one app's dependencies changed.)

I don't know what the best solution would be, but one idea I had was that workspaces should actually be a two-dimensional construct in package.json:

```json
{
  "workspaces": {
    "frontend-app": ["frontend", "common"],
    "backend-app": ["backend", "common"]
  }
}
```

For the purposes of module resolution and installation, Yarn would still see this as three "flat" workspaces (frontend, backend, and common). When we'd be building a Docker image for frontend-app (or calculating a version number), we'd involve only the files belonging to that workspace set.
It would be awesome if this could work but I'm not sure if it's feasible... As a side note, I previously thought that I wanted to have
In our case, |
I've changed where I stand on this issue and shared my thoughts here: yarnpkg/yarn#5428 (comment). |
@arcanis I'm reading your Yarn 2.1 blog post and there's a section on Focused Workspaces there. I don't have experience with this from either 2.x or 1.x Yarn, but is it possibly solving the Docker use case discussed here? Like, could I create a build context that contains the main lockfile and only the workspaces I need? Or is it still not enough and something like named sets of workspaces would be necessary? |
I think it would, yes. The idea would be to run a focused install (`yarn workspaces focus`) from within the build context so that only the relevant workspaces get installed. I encourage you to try it out and see whether there are blockers we can solve by improving this workflow. I'm not sold on this named workspace set idea, because I would prefer Yarn to deduce which workspaces are needed based on the main ones you want. It's too easy to make a mistake otherwise. |
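A minimal sketch of that flow, assuming the `workspace-tools` plugin is available and a workspace named `backend-app` exists (the name is illustrative):

```sh
# Enable the official plugin that provides the focus command (Yarn 2/3).
yarn plugin import workspace-tools

# Install only backend-app plus the workspaces it depends on,
# skipping devDependencies.
yarn workspaces focus --production backend-app
```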
Agree; if the focus mode works, then it's probably better. Do you have a suggestion on how to construct the common/frontend/backend dependencies to make it the most tricky for Yarn? Like, request |
I don't know if this makes a difference in your reasoning @arcanis, but I thought it would be worth mentioning in case there's something about Yarn's design that would lend itself to this... this issue could also be solved by having a lockfile per worktree instead of per workspace. For example, each deployable workspace can itself be a worktree and specify which workspaces from the project it depends on. Here's an example repo: https://github.com/migueloller/yarn-workspaces. It would be fine to have a lockfile for each worktree.

That being said, based on what I had commented before (#1223 (comment)), one could just have multiple Yarn projects in the same repo and have them all share the same Yarn cache. While it wouldn't be as nice as running a single install at the root, it would still get the job done.

I'm taking the definition of project > worktree > workspace from here. |
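A sketch of the shared-cache variant, assuming each standalone project in the repo gets its own `.yarnrc.yml` pointing at a common cache directory (the path is illustrative):

```yaml
# .yarnrc.yml inside each standalone project
enableGlobalCache: false
cacheFolder: ../.shared-yarn-cache   # one cache directory reused by all projects
```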
Another thought: I also wanted to add another use case in addition to Docker images. If one has a large monorepo where CI jobs are started depending on whether a certain "package" changed, having a shared lockfile makes that a bit hard for the same reasons it's hard on Docker's cache. If we want to check whether a workspace changed, including its dependencies, we would also want to check the lockfile. For example, some security update could've been added that changed the patch version being used but not the version range in package.json. |
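A rough sketch of how such a CI gate tends to look (the workspace path and variable name are hypothetical); note that because the lockfile is shared, any change to it re-triggers every workspace's job, which is exactly the pain point described above:

```sh
#!/bin/sh
# Run the job only if the workspace's files or the shared lockfile changed.
if git diff --quiet "${BASE_SHA}...HEAD" -- packages/api yarn.lock; then
  echo "packages/api and yarn.lock unchanged; skipping CI job"
else
  echo "packages/api (or the shared yarn.lock) changed; running CI job"
fi
```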
That is a good point, and we have a similar use case. Not just for CI: we also e.g. calculate the app versions ("apps" are e.g. the frontend-app and backend-app from my earlier comment) from Git revisions of the relevant paths. |
I'm currently using it within our Dockerfile - one question about determinism (which may expose my misunderstanding of how focus works): is it possible to run focus such that it should fail if the lockfile would need to be modified, similar to an immutable install? |
No (because the lockfile would effectively be pruned from extraneous entries, should it be persisted, so it wouldn't pass the immutable check) - I'd recommend running the full install for that check instead. |
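A minimal CI step along those lines, assuming a Yarn 2+ project (the flag is the standard one, not anything specific to this thread):

```sh
# Full install with the lockfile check; fails if yarn.lock would need changes.
yarn install --immutable
```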
@tabroughton @samarpanda I have now made the plugin I was working on public: https://gitlab.com/Larry1123/yarn-contrib/-/tree/master/packages/plugin-production-install. |
I have a slightly different use case in mind for this feature. Originally wrote it on Discord, but copying it here for posterity: One of the downsides of monorepos seems to be that once you add new developers, you have to give them access to the whole code base, while with single repos you could partition things in a way so they have access to smaller bits and pieces. Now, this could probably be solved with git submodules, putting each workspace in its own git repo. Only certain trusted/senior devs could then have access to the root monorepo and work with it as one. The only problem holding this back seems to be the lack of a dedicated lockfile per workspace.

Seems like there would also be a need to isolate workspace dependencies into separate folders per workspace. I'm not concerned about pushing, more concerned about pulling. I don't want any junior dev to simply pull all the company intellectual property with one simple command. How do you guys partition projects with newer, junior (not yet established/trusted) devs, now that everyone works from home? |
This is something that has been a pain point for my work also. I have been wanting to work out a solution to this but have not had the time to truly work it out. Once I understand Yarn better I had intended to work out a plan of action. A holistic approach, I feel, would have various integrations into things like identity providers, git, git(hub|lab)/Bitbucket, Yarn, and tooling, for zero-trust coordination of internal dependencies and resolutions throughout the super repo. The integration into the git host would be to handle cross-project things, but I'm not sure what level it would need. |
I have created a pretty simple Yarn 2 plugin that will create a separate lockfile for each workspace: https://github.com/andreialecu/yarn-plugin-workspace-lockfile I haven't yet fully tested it, but it seems to create working lockfiles. I would still recommend @Larry1123's plugin above for production deployment scenarios: #1223 (comment), but perhaps someone will find this useful as well. |
I'll mirror my comment from yarnpkg/yarn#5428 (comment) here: My need for this behavior (versioning per workspace, but still have lockfiles in each package) is that I have a nested monorepo, where a subtree is exported to another repo entirely, so it must remain independent. Right now I'm stuck with lerna/npm and some custom logic to attempt to even out versions. It would be nice if Yarn could manage all of them at once, but leave the correct subset of the "entire workspace pinning" in each. (Though, I'm really not sure how this nested workspace will play out if I were to switch to Yarn.)

@andreialecu That plugin looks interesting; it's almost what I'm looking for, though it appears to be directed towards deployment (and not just general development). But it does give me hope that what I'm looking for might be prototype-able in a plugin. |
@jakebailey do note that there are two plugins: for deployment, https://gitlab.com/Larry1123/yarn-contrib/-/tree/master/packages/plugin-production-install; for development, https://github.com/andreialecu/yarn-plugin-workspace-lockfile (mine). Feel free to take either of them and fork them. If you end up testing mine and improving it, feel free to contribute changes back as well. |
Hi folks, I gave a shot at generate-lockfile but it couldn't read the root lockfile. Will try yarn-plugin-entrypoints-lockfiles. Edit: this seems to work much better, see my example monorepo: https://github.com/VulcanJS/vulcan-npm/pull/132/files. Last issue: I hit differences between the root yarn.lock and the generated yarn.vulcan-remix.lock:

```
diff yarn.lock yarn.vulcan-remix.lock
1663,1665c1663,1665
< "@types/react-dom@npm:<18.0.0, @types/react-dom@npm:^17.0.14":
< version: 17.0.17
< resolution: "@types/react-dom@npm:17.0.17"
---
> "@types/react-dom@npm:^17.0.16":
> version: 17.0.16
> resolution: "@types/react-dom@npm:17.0.16"
1668c1668
< checksum: 23caf98aa03e968811560f92a2c8f451694253ebe16b670929b24eaf0e7fa62ba549abe9db0ac028a9d8a9086acd6ab9c6c773f163fa21224845edbc00ba6232
---
> checksum: 2f41a45ef955c8f68a7bcd22343715f15e1560a5e5ba941568b3c970d9151f78fe0975ecf4df7f691339af546555e0f23fa423a0a5bcd7ea4dd4f9c245509936
1672,1674c1672,1674
< "@types/react@npm:^17, @types/react@npm:^17.0.43":
< version: 17.0.47
< resolution: "@types/react@npm:17.0.47"
---
> "@types/react@npm:^17.0.16":
> version: 17.0.44
> resolution: "@types/react@npm:17.0.44"
1679c1679
< checksum: 2e7fe0eb630cb77da03b6da308c58728c01b38e878118e9ff5cd8045181c8d4f32dc936e328f46a62cadb56e1fe4c5a911b5113584f93a99e1f35df7f059246b
---
> checksum: ebee02778ca08f954c316dc907802264e0121c87b8fa2e7e0156ab0ef2a1b0a09d968c016a3600ec4c9a17dc09b4274f292d9b15a1a5369bb7e4072def82808f
5949,5952c5949,5952
< "graphql@npm:^16.3.0, graphql@npm:^16.4.0":
< version: 16.5.0
< resolution: "graphql@npm:16.5.0"
< checksum: a82a926d085818934d04fdf303a269af170e79de943678bd2726370a96194f9454ade9d6d76c2de69afbd7b9f0b4f8061619baecbbddbe82125860e675ac219e
---
> "graphql@npm:^15.6.2":
> version: 15.8.0
> resolution: "graphql@npm:15.8.0"
> checksum: 423325271db8858428641b9aca01699283d1fe5b40ef6d4ac622569ecca927019fce8196208b91dd1d8eb8114f00263fe661d241d0eb40c10e5bfd650f86ec5e
11725c11725
< "vulcan-remix@workspace:.":
---
> "vulcan-remix@workspace:starters/remix":
11727c11727
< resolution: "vulcan-remix@workspace:."
---
> resolution: "vulcan-remix@workspace:starters/remix"
```

To fix this I have to drop

Also @borekb : |
Hi, just to rephrase what I think is needed now to close this issue:
The idea is that you could "trick" Yarn into generating a lockfile, but without actually installing packages. Since this lockfile is NOT named yarn.lock, it would not interfere with the regular install. The process could be as follows:
It could even be simplified like this:
Maybe those options kinda exist today? But I couldn't find anything like that in the docs. |
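As a sketch of what such a process might look like today, assuming Yarn 3+ where `yarn install --mode=update-lockfile` resolves and writes the lockfile without populating `node_modules` (paths and the output filename are illustrative, and `workspace:` dependencies would still need special handling):

```sh
#!/bin/sh
# Copy one workspace's manifest into a scratch project and resolve it there.
mkdir -p /tmp/lockgen
cp packages/api/package.json /tmp/lockgen/
cd /tmp/lockgen
yarn install --mode=update-lockfile
# Store the result under a name that the regular install won't pick up.
cp yarn.lock "$OLDPWD/packages/api/yarn.workspace.lock"
```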
Why not use a generic name for the lock files in individual workspaces, the same way
Executing yarn install from within |
Here is my approach to this. Since we don't exactly have a monorepo, but rather a repo with other repos as submodules, this makes for a quite problematic deployment strategy without individual lockfiles. So my solution is simply a fork of yarn-plugin-entrypoint-lockfiles, but the lockfiles are generated next to each workspace's package.json. Link to the plugin: https://github.com/zaro/yarn-plugin-deploy-lockfiles . |
This is the workaround we use for a metarepository: Meta root:
Individual projects:
Contents of root
Running The individual projects are separate Git repositories and are built separately from each other using the local There is one caveat - updating the per-project |
@arcanis Is there any plan to implement this in the future? The inability to have a lockfile (and an individual .yarn folder) per workspace is such a huge downside that it overshadows any upside that using workspaces can provide us. |
I just started using Yarn Workspaces for some of my projects and have used nohoist/nmHoistLimits: workspaces. Does Yarn enforce the lock file to be generated only in the root repository even for noHoist? This is the behavior I see in my project. Can someone please confirm that the lock file will be generated only in the root project, regardless of the hoist configuration? |
Yes, that is correct. Hoisting only affects the physical files in the dependency project. Personally I use a husky pre-commit hook to copy yarn.lock from the workspace directory to the sub-project folder and name it yarn.workspace.lock for the deployment scripts.
|
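For reference, the pre-commit copy described above might look something like this (paths are illustrative; the hook assumes Husky-managed Git hooks):

```sh
#!/bin/sh
# .husky/pre-commit -- keep a copy of the shared lockfile inside the sub-project
# under a non-conflicting name so its deployment scripts can use it.
cp yarn.lock packages/api/yarn.workspace.lock
git add packages/api/yarn.workspace.lock
```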
True what @bertho-zero said above ☝️; unfortunately we can't upgrade to Yarn 4 because of the removal of the lockfile-renaming option (`lockfileFilename`). I understand why the team wanted to remove it, but there's no other solution to the use cases described in this issue that I would be aware of. |
I'm interested in trying to figure out a proper integrated solution for 4.1. Would it solve your use cases if |
I'm not quite sure. Separate lockfiles, as implemented by https://github.com/JanVoracek/yarn-plugin-entrypoint-lockfiles, are admittedly a bit wasteful (the contents of the lockfiles is partly duplicated, and commands like `yarn install` end up doing duplicate work). It's hard for me to imagine how to support all of this (and those are real use cases BTW) with a single shared lockfile. I have to admit that the idea of separate lockfiles was quite controversial in our team initially, and it's still relatively weird to see several lockfiles in the repo. |
BTW it's great that you're thinking how to implement this for 4.1! |
We are also having an issue with this, and it prevents us from stepping up to Yarn 4, as we have a workspace solution that doesn't work without the ability to rename the lockfile. The use case is that we have a Yarn workspace containing a number of npm package repos. If we commit the workspace-generated lockfile when running yarn install inside the workspace, it becomes a problem when running yarn install in the standalone repo. Being able to rename the lockfile in the workspace directory solved this. Example:
Doesn't work (yarn lockfile generated in subrepo differs when running yarn install inside/outside the workspace)
|
My workspaces are also git submodules in my super repo (for example). Any of the workspaces can be cloned separately on their own. I'd like those workspaces to have their own lock files for when they are cloned separately. The workspace lockfiles would be ignored when installing using the top-level workspace, of course (the top level would ensure they are in sync, and could even throw an error if they are not in sync, and maybe even provide an option to force-overwrite the workspace lock file to fix any issue) |
@trusktr FWIW what you're describing is what I implemented in https://github.com/jakebailey/yarn-plugin-workspace-lockfile via a plugin (modified from someone else's attempt); I ended up not needing it personally (the team I was on didn't end up adopting Yarn), but it did seem to work well at the time, those many years ago. It's possible it still works, or could with modification for Yarn 4. |
Would a |
Has anyone found a solution for this? |
@KevinEdry, yes and no. Your issue could also be solved if Yarn introduced the individual lockfile per workspace requested here. The suggestion (and I believe the original philosophy behind these features) from arcanis (here) is to run a focused install (`yarn workspaces focus`) instead. |
@akwodkiewicz Thanks for the quick reply!
Is there a way to minimize this install process so the |
AFAIK no, there is no way to strip this process down. I totally understand the issue (I encountered a similar scenario to yours) and yeah, until this individual lockfile option is implemented by someone, we're stuck with workarounds.
The only thing I can suggest is fine-tuning Dockerfiles with multi-stage builds and relying on registry caching to avoid as much installation as possible. |
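As an example of that suggestion, a hedged sketch of a multi-stage Dockerfile for a single workspace (the workspace name `api`, the Node image tag, and the file paths are all illustrative; the `focus` command assumes the workspace-tools plugin is available):

```dockerfile
FROM node:20 AS deps
WORKDIR /app
# Copy just enough for dependency resolution so this stage caches well.
COPY package.json yarn.lock .yarnrc.yml ./
COPY .yarn/ .yarn/
COPY packages/api/package.json packages/api/
RUN yarn workspaces focus --production api

FROM node:20 AS runtime
WORKDIR /app
# Reuse the resolved dependencies from the first stage, then add the sources.
COPY --from=deps /app/ ./
COPY packages/api/ packages/api/
CMD ["node", "packages/api/index.js"]
```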
Describe the user story
I believe the best user story are the comments in this issue: yarnpkg/yarn#5428
Describe the solution you'd like
Add support for generating a lockfile for each workspace. It could be enabled via a configuration option like `lockfilePerWorkspace`?
Describe the drawbacks of your solution
As this would be opt-in and follow well known semantics, I'm not sure there are any drawbacks to implementing the feature. One could argue if it should be in the core or implemented as a plugin. I personally don't have a preference of one over the other.
Describe alternatives you've considered
This could be implemented in the existing `workspace-tools` plugin. I'm unsure if the hooks provided by Yarn would allow for this, though.
Additional context
yarnpkg/yarn#5428