Yarn Plug 'N Play should generate a static manifest file, not .pnp.js
#6388
Hey! Thank you for raising this issue! It's an important point, and since it goes a bit against the current state of things (even if not entirely - cf
This seems dubious to me. If you can run Yarn, you can run Node. Even in your scenario where you have two images, you still have one image with Node that you use to run Yarn. I don't see what prevents you from doing this conversion as a secondary step. It doesn't have to be a post install script: you could just run it yourself.
Not so much that it's very complex - we just don't expect other package managers to have the same idea of what the implementation should be. npm has a history of doing things its own way, and I don't want to end up in a scenario where projects have to live with X different formats (assuming that PnP gets traction). Standardizing it as a JS API is a good way to circumvent such issues.
I disagree on the "easy to analyze" part. In order to analyze them, you have to understand them, meaning that you have to implement the exact same logic as any other loader, and any shortcut might give one of your users wrong results. An API is much easier to use and less error-prone.
I don't know. What I do know is that until 24h ago, the prospect of removing
I think Node is the most portable thing we can get for Node applications 😉
This is where we disagree. My point is that "analyze a Node project in an environment without Node.js" is not an uncommon edge case. This is my day job, and I've seen the pattern of "the organisation-wide CI does not have the runtime" over and over again across countless organisations, including top technology companies, Fortune 100s, etc. It might help if I give you more context. Here is how I believe you imagine the integration process looks (this is how I thought of it at first too):
Here is how this process actually looks across many companies, 90% of the time (this is the rule, not the exception):
At this point, we have two options for integrating a tool across an organisation:
I cannot overemphasise how common this organisation structure and use case is and how painful it is to integrate on a project-by-project basis. One significant advantage of using a file format is that we can do analysis at the organisation level (and in general, we can do analysis in a portable way, regardless of what environment the tool is running in). On one hand, I can see the argument that this is a people/enterprise problem and not a tool problem. On the other hand, I believe that tools should be built around real problems felt by real users, and not supporting this use case will cause a lot of pain for a lot of users down the line.
There are a couple of subtle effects in practice that I think make analysis a lot less hard than you believe. I'm going to break this down into a couple of points:
I don't think the file format would be significantly harder to understand than the API. Obviously, you know better than me here; but I imagine that you can represent most of the needed state by mapping package names + revisions to file paths.
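To make that concrete, here is a minimal sketch of what such a mapping could look like, written in JavaScript for illustration. The key format and paths are invented for this example; they are not the actual PnP data layout:

```javascript
// Hypothetical static manifest: package name + reference mapped to an
// on-disk location. Any language with a JSON parser could consume this.
const manifest = {
  version: 1,
  packages: {
    "left-pad@1.3.0": ".yarn/cache/left-pad-1.3.0.zip",
    "lodash@4.17.11": ".yarn/cache/lodash-4.17.11.zip",
  },
};

// Resolution becomes a plain dictionary lookup rather than executing code.
function resolvePackage(manifest, name, version) {
  return manifest.packages[`${name}@${version}`] ?? null;
}
```

The point of the sketch is that a consumer needs no runtime, only a JSON parser and a string concatenation.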
One thing that we've learned in practice (and that was very surprising to me) is that users are generally okay with approximately correct answers. As an example, the configuration spec for Bower is absolutely crazy (you can override the

At the very least, users in practice have strongly preferred "approximately correct" to "no answer available" (or even "answer available, but you have to put in a lot more effort").

It sounds to me like the main problem right now is that I haven't convinced you that my use case is common and real. What other context can I share that would help you make that decision?
If you can bundle a tool, why can't you also bundle a runtime for the tool, at the org-wide CI step?
At least for now, Plug'n'Play will require a small integration, in that those teams will have to enable it manually (by adding the
I think it's more that I'm not convinced that it's something that should be fixed at the package manager level. I'm trying to understand why there is no other layer where this can/should happen.
@edmorley: bundling a runtime is possible, but I would rather avoid this step. It's a significant complexity cost that every tool trying to analyse Yarn PnP projects would need to pay.
This sounds like a good intermediate step. The pain here is that building those static resolution tables will still require a project-by-project integration.
My intuition says that the package manager is the right place to fix this, but I'm still trying to gather those thoughts into words. It feels to me like the package manager is the most seamless layer to implement this functionality. As another data point, here's another comment with similar concerns (excerpt below): #2250 (comment)
Hey folks, I put together the following project: https://github.com/arcanis/build-pnm It consumes the PnP API in order to generate a static package-name-maps file. As you can see by checking the source, it's not a lot of code, and you could easily adapt it to output any language you want! In fact, it could likely be generalized to output data based on a template 🙂
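As a rough sketch of this approach (not build-pnm's actual code), here is a hypothetical walk over a PnP-style API object that emits a static table. The method names mirror the real `pnpapi` surface (`getDependencyTreeRoots`, `getPackageInformation`), but the data shapes are simplified assumptions for illustration:

```javascript
// Walk a PnP-style API and flatten it into a static
// "name@reference" -> location map that could be serialized to JSON.
function buildStaticMap(api) {
  const out = {};
  const seen = new Set();
  const queue = [...api.getDependencyTreeRoots()];
  while (queue.length > 0) {
    const locator = queue.pop(); // { name, reference }
    const key = `${locator.name}@${locator.reference}`;
    if (seen.has(key)) continue; // dependency graphs can contain cycles
    seen.add(key);
    const info = api.getPackageInformation(locator);
    out[key] = info.packageLocation;
    // packageDependencies maps a dependency name to its reference.
    for (const [name, reference] of info.packageDependencies) {
      if (reference !== null) queue.push({ name, reference });
    }
  }
  return out;
}
```

The output of such a pass is exactly the kind of static manifest discussed above: generated once (where Node is available), then consumed anywhere.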
This could be alleviated if we had a plugin system. I have something in mind regarding this. I haven't progressed enough to show anything yet, but just know that it's something I'd be super interested to pursue.
It's a somewhat different concern, in that the problems behind the lockfile format being close-to-yaml-but-not-yaml are clear, and the solution is well understood (we should have added a few characters that would have made the file a subset of yaml). The topic here is much more uncharted territory.
The plugin system sounds interesting -- our Maven integration uses this route as well.
Yes, that's correct. My point was that there exist other people out there who want to understand Yarn projects without using Node.
Closing, Yarn v2 has an option (

Note that the manifest should be treated as a safe opaque file, and any assumption regarding the data it contains might lead to breakages (ideally only between major releases, but possibly minors as well).
This is a response to yarnpkg/rfcs#101 (comment). For context, my previous response to the PnP proposal is here.
I won't reproduce in detail the arguments against `.pnp.js` I've previously presented (they can be read in the context comment). To summarise:

- `.pnp.js` makes dependency analysis very difficult in environments without Node.js. I believe this covers a non-trivial number of use cases.
- `.pnp.js` presents security concerns that are not outweighed by its advantages in implementation flexibility.

To directly address @arcanis's concerns:
Correct me if I'm wrong, but I interpreted the rationale to be "the implementation is very complex and might change, so it's easier to provide an executable API".
Helping developers and enterprises use dependency analysis tools is my day job, and I see this use case again and again and again. One of the methods we use for Node.js dependency analysis is basically "run `npm ls --json` and parse the output", and we've repeatedly encountered environments where Node.js isn't available in a CI environment even for projects written in Node.js. CI environments at large enterprises are often complex, multi-stage, and multi-image; integrating polyglot dependency analysis tools at the project-level CI stage where Node.js is available is often infeasible.

This isn't an off-hand edge case. We get bug reports caused by missing Node runtimes during dependency analysis on a regular (~weekly) basis.
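For concreteness, the "parse the output" half of that method can be sketched as below. The tree shape follows the documented `npm ls --json` output (`name`, `version`, nested `dependencies`), though real output carries more fields; the sample tree in the usage is invented:

```javascript
// Recursively collect every "name@version" pair reachable from the root
// of an `npm ls --json` tree. The `out` set also guards against revisiting
// packages that appear at multiple points in the tree.
function flattenTree(node, out = new Set()) {
  for (const [name, child] of Object.entries(node.dependencies ?? {})) {
    const key = `${name}@${child.version}`;
    if (!out.has(key)) {
      out.add(key);
      flattenTree(child, out);
    }
  }
  return out;
}
```

In practice the JSON would come from spawning `npm ls --json` as a child process; the parsing itself is the easy, portable part, which is exactly why needing a Node runtime just to get this data is the sticking point.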
This makes a lot of sense to me, and I sympathise with this concern. I think the critical question here is: how often do you think the data layout will change?
Static manifest files have a lot of advantages. They're easy to parse, easy to analyse, portable (requiring Node means the current API is not really portable), and pose significantly fewer security risks. It seems premature to give up on all of these advantages to gain flexibility that you might not end up really needing.
In what scenarios do you see the data layout changing? How often do you think this will occur?
I think the performance improvement from avoiding a copy is so great that I would be surprised if you needed constant data layout changes to improve performance. In the case where changes are not often needed, it's feasible to add a `version` field to the data structure and provide a resolver library that switches on the `version`.

I would be much less opposed to an executable API if it were portable (e.g. in Bash).
I can't consume the API if I can't run Node. For reasons outside of my control, it's infeasible for me to get all integrating Node projects to install a post-install hook.