Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Set up back end to recollect information from pnpm #2797

Merged
merged 1 commit into from
Feb 4, 2022

Conversation

JerryHue
Copy link
Contributor

@JerryHue JerryHue commented Feb 2, 2022

Issue This PR Addresses

Fixes #2738

Type of Change

  • Bugfix: Change which fixes an issue
  • New Feature: Change which adds functionality
  • Documentation Update: Change which improves documentation
  • UI: Change which improves UI

Description

The service offers a single route: /projects, where it returns a list of all project names that Telescope depends on. The service uses a text file to load this information. The file is named deps.txt. The one I have copied there is for testing purposes, so that you don't have to run the tool if you can't.

The tool that generates this file is a one-line bash script, which calls pnpm list and does some stream processing to reduce the information needed. The tool can be found in the tools folder. Since this is a bash script, Windows users cannot test it. A PowerShell script may be necessary.

Steps to test the PR

As of right now, this service is not integrated completely with Docker. To start it up, you have to do the following:

  1. pnpm install to get all dependencies
  2. Navigate to the src/api/dependency-graph folder
  3. pnpm start to start up the application

Checklist

  • Quality: This PR builds and passes our npm test and works locally
  • Tests: This PR includes thorough tests or an explanation of why it does not
  • Screenshots: This PR includes screenshots or GIFs of the changes made or an explanation of why it does not (if applicable)
  • Documentation: This PR includes updated/added documentation to user exposed functionality or configuration variables are added/changed or an explanation of why it does not(if applicable)

@gitpod-io
Copy link

gitpod-io bot commented Feb 2, 2022

@TueeNguyen
Copy link
Contributor

I notice a few Eslint issues, fix them first

Copy link
Contributor

@DukeManh DukeManh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is a great way to do it. The dependencies list and their information can be extracted using pnpm list and pnpm view cli, and we can use child_process.exec or other libraries to call the cli.
For example

const { exec } = require('child_process');

exec("pnpm list --dep  2 --json", (error, stdout) => {
 stdout.forEach((dep) => {
     exec(`pnpm view ${dep} repository.url`")
})
})

That's untested code but should give you some idea, hope that helps.
See
https://docs.npmjs.com/cli/v7/commands/npm-view
https://pnpm.io/cli/list

src/api/dependency-graph/package.json Outdated Show resolved Hide resolved
src/api/dependency-graph/package.json Outdated Show resolved Hide resolved
src/api/dependency-graph/package.json Outdated Show resolved Hide resolved
src/api/dependency-graph/package.json Outdated Show resolved Hide resolved
src/api/dependency-graph/src/graph-collector/graph.js Outdated Show resolved Hide resolved
src/api/dependency-graph/src/graph-collector/index.js Outdated Show resolved Hide resolved
src/api/dependency-graph/src/router.js Outdated Show resolved Hide resolved
src/api/dependency-graph/src/router.js Outdated Show resolved Hide resolved
src/api/dependency-graph/src/router.js Outdated Show resolved Hide resolved
src/api/dependency-graph/src/graph-collector/index.js Outdated Show resolved Hide resolved
@JerryHue
Copy link
Contributor Author

JerryHue commented Feb 2, 2022

So, the only problem with pnpm list is that it requires the whole node_modules folder installed, so you would need to include the whole repo in the Docker image, run an installation, and then run the pnpm list to get the whole list. That is why I went with the approach of reading the pnpm-lock.yml.

So, if that won't make the image or the container super big, then I can switch it to that approach. Could someone that knows more about Docker let me know if this is true?

cc @humphd, @DukeManh

@humphd
Copy link
Contributor

humphd commented Feb 2, 2022

Yes, this will complicate things.

I think we're going to have to add a script in tools/ that runs pnpm list the way you want in the root (when the node_modules/ is already there) and copies the result over to your microservice directory so that it can be included in the build context we copy in.

So maybe focus on writing a node.js script in tools/ to generate your JSON formatted dependency file from pnpm list, and we'll figure out how to get that into the service once it exists, maybe as part of https://github.com/Seneca-CDOT/telescope/blob/master/tools/autodeployment/deploy.sh.

@manekenpix might have a better idea.

@JerryHue
Copy link
Contributor Author

JerryHue commented Feb 2, 2022

I think we're going to have to add a script in tools/ that runs pnpm list the way you want in the root (when the node_modules/ is already there) and copies the result over to your microservice directory so that it can be included in the build context we copy in.

So maybe focus on writing a node.js script in tools/ to generate your JSON formatted dependency file from pnpm list, and we'll figure out how to get that into the service once it exists, maybe as part of https://github.com/Seneca-CDOT/telescope/blob/master/tools/autodeployment/deploy.sh.

@manekenpix might have a better idea.

So, this was similar to the first approach I wanted to go with. Let me pitch it first.

The first approach was creating a tool that produces a file similar to pnpm list, but decorated with more information, less "tree-like", and more as a flat list. Then this file would be given to the service, which will read it and then serve the information.

The only problem I would see with the aforementioned approach is that there is too much separation between two components that I feel should be together 😞

The current code is like a weird amalgamation between the approach I mentioned and the current one I had.

@JerryHue
Copy link
Contributor Author

JerryHue commented Feb 2, 2022

Is it possible to run pnpm list to provide all the transitive dependencies of Telescope? That includes not only the direct ones mentioned in package.json files, but also the dependencies of those dependencies, and so on.

I would love it to also display a flat list just like how pnpm list --parseable does, so that the json output doesn't repeat so much information.

@humphd
Copy link
Contributor

humphd commented Feb 2, 2022

Is it possible to run pnpm list to provide all the transitive dependencies of Telescope? That includes not only the direct ones mentioned in package.json files, but also the dependencies of those dependencies, and so on.

I would love it to also display a flat list just like how pnpm list --parseable does, so that the json output doesn't repeat so much information.

You just need to specify a depth, and it will do deps of deps.

@JerryHue
Copy link
Contributor Author

JerryHue commented Feb 2, 2022

Is it possible to run pnpm list to provide all the transitive dependencies of Telescope? That includes not only the direct ones mentioned in package.json files, but also the dependencies of those dependencies, and so on.
I would love it to also display a flat list just like how pnpm list --parseable does, so that the json output doesn't repeat so much information.

You just need to specify a depth, and it will do deps of deps.

From the discussion in the pnpm repo, it seems I can specify Infinity to get all dependencies.

One small clarification: do we want to lump all dependencies and say they are part of Telescope, instead of dividing them into the microservices projects? If we do it with the json/parseable way, we will have to group all dependencies this way

@humphd
Copy link
Contributor

humphd commented Feb 2, 2022

From the discussion in the pnpm repo, it seems I can specify Infinity to get all dependencies.

When I tried running it with any depth over 5, it returned one billion responses and took forever. I'm not sure that we have to go that deep?

I don't think we care which part of Telescope uses them, or whether they are a direct or indirect dep. We just want to figure out where to look for things to work on.

@JerryHue
Copy link
Contributor Author

JerryHue commented Feb 2, 2022

From the discussion in the pnpm repo, it seems I can specify Infinity to get all dependencies.

When I tried running it with any depth over 5, it returned one billion responses and took forever. I'm not sure that we have to go that deep?

I think if you ran it with --parseable, some lines were repeated. The same happens with --json.

I had the intention to collect all dependencies of Telescope, and the easy way was by going through the pnpm-lock.yaml. That is why I would like to go as deep as possible into the dependency tree.

@humphd
Copy link
Contributor

humphd commented Feb 2, 2022

Pipe the output through uniq and you can remove the duplicates:

❯ pnpm list --parseable --depth 5 | wc -l
   97762
❯ pnpm list --parseable --depth 5 | uniq | wc -l
    2466
~/repos/telescope sso-dockerfile-fix

@JerryHue JerryHue requested a review from DukeManh February 3, 2022 09:05
src/api/dependency-discovery/deps.txt Outdated Show resolved Hide resolved
tools/pnpm-deps-gen.sh Outdated Show resolved Hide resolved
Copy link
Contributor

@humphd humphd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're going to need to Dockerize this as well, and figure out the logic for calling your script at build-time when we deploy in a follow-up.

But I think this gives us a beginning worth having.

src/api/dependency-discovery/src/server.js Outdated Show resolved Hide resolved
src/api/dependency-discovery/src/dependency-list.js Outdated Show resolved Hide resolved
@@ -0,0 +1,24 @@
#!/bin/bash

# First regex can be read as the following:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤯🤯🤯🤯🤯

tools/pnpm-deps-gen.sh Outdated Show resolved Hide resolved
humphd
humphd previously approved these changes Feb 4, 2022
Copy link
Contributor

@humphd humphd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few minor nits.

// The order of the alternatives is important!
// The regex engine will favor the first pattern on an alternation
// even if the other alternatives are subpatterns
return dependencies.split(/\r\n|\n|\r/).filter((line) => line !== '');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's pretty typical to do /\r?\n/ to deal with Unix vs. Windows, but this is also fine.

tools/pnpm-deps-gen.sh Show resolved Hide resolved
DukeManh
DukeManh previously approved these changes Feb 4, 2022
Copy link
Contributor

@DukeManh DukeManh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just 1 nit: you can chain sed expressions

tools/pnpm-deps-gen.sh Outdated Show resolved Hide resolved
Copy link
Contributor

@DukeManh DukeManh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fantastic work. 👍

Copy link
Contributor

@humphd humphd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. I'll file follow-up issues on things I notice that still need to happen.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Back-end code to gather dependencies from pnpm data
5 participants