Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implemented support for Lerna monorepos #142

Open
wants to merge 7 commits into
base: master
Choose a base branch
from

Conversation

maybephilipp
Copy link

@maybephilipp maybephilipp commented Aug 21, 2020

Hello everyone, it's feature that are needed for our project.

For example, we have the following directory structure:

monorepo
- packages
--- foo
----- node_modules
------- bar
--- bar
--- zoo
----- node_modules
------- bar
------- lar
--- lar
----- node_modules
------- bar
--- kar
----- node_modules
------- bar
------- zoo
--- zar
----- node_modules
------- kar
- package.json
- lerna.json

zar is entry point package that requires all other dependencies.

In my real project I have many more dependencies. And several entry points. Unlike in the example, my entry points don't require all other packages, but many of them.

In the current version, when I'm packaging my entry point, all my symlinks are followed, so the dependencies are duplicated many times. I don't want to release npm packages every time I want to test my release version. So I just want to clone my monorepo and deploy entry point with "local" symlinks.

Now the code generates the following zip archive structure:

zip
- packages
--- foo
----- node_modules
------- bar -> ../../bar
--- bar
--- zoo
----- node_modules
------- bar -> ../../bar
------- lar -> ../../lar
--- lar
----- node_modules
------- bar -> ../../bar
--- kar
----- node_modules
------- bar -> ../../bar
------- zoo -> ../../zoo
- node_modules
--- kar -> ../packages/kar
- package.json

All external dependencies are resolved in the same way as in the current version.

I suggest to hoist dependencies in the monorepo, because it saves from many duplicated deps and decreases zip size.

For my project, this solution is 2x slower than baseline sls package without symlinks, but I think it's because not all my dependencies are hoisted. For example, I have 3x lodash dependencies and every lodash file is iterated 3x times (too much files in each iteration). I should fix versions in my packages and all will be good, I think.

What do you think about this solution?

@maybephilipp
Copy link
Author

Sorry, I didn't run tests. I'll be back soon.

Exactly reverts globby version

This reverts commit 24743c3.
implemented createPackageMap function;
implemented getting package names from package.json
@maybephilipp
Copy link
Author

All tests passed, reopening.

@maybephilipp maybephilipp reopened this Aug 22, 2020
@ryan-roemer
Copy link
Member

ryan-roemer commented Aug 24, 2020

@philippmalkov -- Thanks for the interest and the PR! The change set here is a bit complex and is going to take me a little while to digest, so it's probably going to be at least next week until I can really dig into this.

What would be appreciated and useful to me in the meantime is a more in-depth explanation of the use case and specifically the packaging scenario...

Most of our monorepo projects end up packaging individually: true for each function because doing a service-level package for a monorepo would be super huge and defeat some of the purpose of the monorepo (allowing individual functions to deploy independently). Is there a case that this symlink behavior change would be useful if all functions in a monorepo are packaged and deployed individually?

If you are intending to package with individually: false (or unset) with a service-level package can you explain the goals and ideal workings of that for packaging and deployment?

Thanks!

@ryan-roemer
Copy link
Member

Oh, and one other question -- have you tried tracing mode instead of legacy / drop-in-replacement packaging mode?

Tracing should be faster and produce way smaller ultimate bundles (at the cost of usually some extra configuration). I'm not totally sure how it handles symlinks, but we'd be open to changing the options passed to the resolve() library used in trace-deps (our tracing engine) here: https://github.com/FormidableLabs/trace-deps/blob/master/lib/trace.js#L230-L237

@maybephilipp
Copy link
Author

@philippmalkov -- Thanks for the interest and the PR! The change set here is a bit complex and is going to take me a little while to digest, so it's probably going to be at least next week until I can really dig into this.

What would be appreciated and useful to me in the meantime is a more in-depth explanation of the use case and specifically the packaging scenario...

Most of our monorepo projects end up packaging individually: true for each function because doing a service-level package for a monorepo would be super huge and defeat some of the purpose of the monorepo (allowing individual functions to deploy independently). Is there a case that this symlink behavior change would be useful if all functions in a monorepo are packaged and deployed individually?

If you are intending to package with individually: false (or unset) with a service-level package can you explain the goals and ideal workings of that for packaging and deployment?

Thanks!

Sorry for delay:)

Hm, maybe I have wrong architecture... In my monorepo I have two different types of packages: service and tool. I package and deploy only service packages which depend on tool-packages. It is relation one-to-many.

Each service has many relations and each tool is used many times by multiple services.
When I implement a new fix or release version, I deploy my whole app which includes services directly independent between each other. But they depend on common tools.

When I'm releasing my tools via npm packages or github repos, I have difficulties with every-time-updates of my packages. And downloading a package from git is more slower than from NPM. I thought of a packaging script that will be able to create serverless packages without downloading my tools from git for each service from scratch. And I found your serverless plugin which implements the most part of what I want. Thank you for your work!

@maybephilipp
Copy link
Author

maybephilipp commented Aug 29, 2020

Oh, and one other question -- have you tried tracing mode instead of legacy / drop-in-replacement packaging mode?

Tracing should be faster and produce way smaller ultimate bundles (at the cost of usually some extra configuration). I'm not totally sure how it handles symlinks, but we'd be open to changing the options passed to the resolve() library used in trace-deps (our tracing engine) here: https://github.com/FormidableLabs/trace-deps/blob/master/lib/trace.js#L230-L237

I saw the tracing mode, but I didn't try it because in my project I have many legacy dynamic code injections from directories which are like a "library" for concrete type of classes. I will refactor it later, but currently I can't package with the tracing mode. If I will add all my dynamic dependencies, then serverless configuration will be like a hardcoded scheme of my directories. I don't want it.

@maybephilipp
Copy link
Author

The change set here is a bit complex and is going to take me a little while to digest

I agree with you. I wrote something that is not easy to read. I thought about some refactoring, but I didn't have much time. If the code is too bad, I will refactor it.

@leemhenson
Copy link

leemhenson commented Oct 29, 2020

I've just bumped into this issue while using npm 7.0.3 workspaces and typescript 4.0.3 project references and a project structure such as:

/handlerA
/packages
  /packageB
  /packageC

where handlerA depends on packageB and packageC, while packageB depends on packageC too.

As is, the zip gets produced with the correct structure:

/dist
/node_modules
  /packageB
    /dist
    /node_modules
      /packageC
        /dist
        /node_modules
  /packageC
    /dist
    /node_modules

but packageC is duplicated, which wouldn't necessarily be a problem except my packages use unique symbols which mean that a symbol created by requiring a file in one packageC is not the same as requiring that same file in the other packageC. This leads to lovely bugs.

If I hack trace-deps and set preserveSymlinks: false I can get:

/dist
/node_modules
/packages
  /packageB
    /dist
    /node_modules
  /packageC
    /dist
    /node_modules

and there's no way for handlerA to resolve packageB or packageC at runtime as they don't exist under its node_modules, and there's no way to get to packageC from packageB

What I want to happen is that my symlinked packages get bundled under the handler node_modules, while package-package dependencies are maintained as symlinks in their own node_modules. This means everything maintains the expected field hierarchy for node module resolution without duplicating code. Does that make sense? What's the best way to go about achieving that?

@ryan-roemer
Copy link
Member

@leemhenson -- I think I understand your goal. Would you happen to be able to create a minimal repository that has the organization, packages with jetpack as you'd like to be able to with install and build steps and then a list of zipinfo outputs that (1) are produced in the repository and (2) what you would like to be produced with changes to the library?

I do know this may be a bit of work, but having a working project to dig into and see zipinfo with current project vs. a proposed change will make things so much faster on our end for implementing changes. Thanks!

@maybephilipp
Copy link
Author

maybephilipp commented Oct 29, 2020

What I want to happen is that my symlinked packages get bundled under the handler node_modules, while package-package dependencies are maintained as symlinks in their own node_modules. This means everything maintains the expected field hierarchy for node module resolution without duplicating code. Does that make sense? What's the best way to go about achieving that?

@leemhenson I had the same problem and this branch solves it. If I understood correctly, you will get the following dir structure:

zip (handlerA)
- dist
- packages
--- packageB
----- dist
----- node_modules
------- packageC -> ../../packageC
--- packageC
----- dist
----- node_modules
- node_modules
--- packageB -> ../packages/packageB
--- packageC -> ../packages/packageC

As you can see, all packages can import each other with default node resolution algorithm via symlinks. And code isn't duplicated at all. But this case will not work for you if you need tracing mode...

@maybephilipp
Copy link
Author

maybephilipp commented Oct 29, 2020

UPD: cherry-picked some fixes after testing on my real project. This packaging method works on my project like a charm.

Here is our configuration for serverless-jetpack in <project_root>/packages/bar/serverless.yml (one of our Lambda services):

custom:
  jetpack:
    base: "../.."
    packages:
      - "../pkg1"
      - "../pkg2"
      - "../pkg3"
      - "../pkg4"
      - "../pkg5"

plugins:
  - "@philippmalkov/serverless-jetpack"

@leemhenson
Copy link

@philippmalkov Yeah that looks like exactly the output I'm hoping to achieve, but ideally I'd get it with tracing too.

@ryan-roemer Here's a repro: https://github.com/leemhenson/jetpack-monorepo-repro

  1. npm install in handlerA, packageB and packageC
  2. in handlerA, run tsc --build --verbose --force to build the project
  3. sls package and then check the zip

@ryan-roemer
Copy link
Member

Thanks very much to both of you! I’m fully occupied right now with a project and this issue / PR review is going to take a bit of time and focus for me to dig into, so I appreciate your patience.

I’m guessing the earliest I will start diving into the code and sample repos is next week (I got waylaid from my initial review of the PR by having to address an issue in the serverless core itself and although I have another fix I want to get in serverless core I’m going prioritize this set of issues at the top of my list for when I have available jetpack/serverless time). But having a working PR and sample repo already in place should hopefully make everything go much faster!

@leemhenson
Copy link

@ryan-roemer I've been having a hack around with this in my spare time:

master...smolltd:symlinks

It's a long way from PR-able, but it does produce a zip file containing the structure I want. However, there's an issue in the symlink generation inside the zipfile. I think it must be a bug with archiver or one of its dependencies. If I switch the output type to tar it produces working symlinks from nested node_modules, but with zip the symlinks are invalid. I don't have much experience with archiver, I don't know if this is something you might know more about?

@ryan-roemer
Copy link
Member

@leemhenson i used archiver for parity with built in serverless core packaging but I’m not attached to it if another library produces better zips that otherwise pass the parity test suite jetpack has.

I’m still without time to really dig into this issue yet as my priority work for my team is this https://github.com/FormidableLabs/trace-pkg which is basically jetpack trace mode, just without the serverless framework. I mention this as (1) I’m still delayed - sorry! And (2) I might use that package as a packaging engine for jetpack trace mode instead of straight trace-deps (the underlying tracing lib both jetpack and trace-pkg use) and i do have a symlonk investigation ticket which might be easy to create a failing regression test with using the mock-fs library I use in all the tests. Trace-pkg still has a bit to go but I’m pretty sure I’ll have it out next week and then turn back to this issue in jetpack for y’all

@leemhenson
Copy link

leemhenson commented Nov 9, 2020

Just FYI I've fixed the symlink issue in node-archiver:

archiverjs/node-archiver#469

That's been merged but not released yet. You'll need these fixes in trace-pkg too. The important thing to do is pass the right mode when creating the symlink in the zip:

zip.symlink(link, source, 493);

I've also added a comprehensive test case for my symlinks scenario in the serverless-jetpack tests in my fork, so if you want to develop your own implementation in trace-pkg then you could take that as a data point to guide you:

https://github.com/smolltd/serverless-jetpack/blob/symlinks/test/spec/index.spec.js#L847-L1029

@ryan-roemer
Copy link
Member

Thanks for the note! I haven't forgotten about this, but am swamped.

I've been thinking I may implement support first in https://github.com/FormidableLabs/trace-pkg as that may become the underlying engine for serverless-jetpack since there's so much overlap now.

@leemhenson
Copy link

As per my comment on #187, I don't need this now. Hoisting everything to the root node_modules means no need to traverse inner node_modules + symlinks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants