-
Notifications
You must be signed in to change notification settings - Fork 2.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add CJS build, add package testing CJS exports #626
Conversation
The latest updates on your projects. Learn more about Vercel for Git ↗︎
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looked through the PR and tried to provide some hopefully useful feedback.
I don't have time to dive as deep as I'd like unfortunately.
a few more high-level pieces of feedback:
1. Do you really want to / need to support CommonJS?
The community is pretty split on this; it's like the old days of the python2 => python3 painful upgrade.
A lot of maintainers like Sindre Sorhus are choosing to explicitly only support ESM. It will make your life so so sooooo much easier. That's what I decided to do for the chatgpt package btw.
If you support both, it puts more work on you guys, and you'll get a lot of random, esoteric build chain issues like "when I import into this webpack app, this fails". This is not the type of stuff you really want to waste your team's dev resources on.
On the flip side, if you only support ESM, you'll get a lot of issues with less experienced devs who don't understand how to use the package. Sindre has a guide he uses for all of his packages that I like. Just make sure to mention it prominently in your docs like I do here:
async function example() {
// To use ESM in CommonJS, you can use a dynamic import like this:
const { ChatGPTAPI } = await import(https://github.com/transitive-bullshit/chatgpt-api)
// You can also try dynamic importing like this:
// const importDynamic = new Function('modulePath', 'return import(modulePath)')
// const { ChatGPTAPI } = await importDynamic('chatgpt')
const api = new ChatGPTAPI({ apiKey: process.env.OPENAI_API_KEY })
const res = await api.sendMessage('Hello World!')
console.log(res.text)
}
It's up to you whether you want to support both; just don't be surprised if the CJS side of things ends up being painful to maintain w/ random build chain bugs down the road.
Optional / peer dependencies
I couldn't really tell, but just wanted to make sure you're aware of NPM's support for optionalDependencies and peerDependenciesMeta, which lets you specify optional peer dependencies.
Misc
Also, two useful toys you want to add to your codebase would be using esbuild
to replace tsc
(still using tsc
to output types), maybe a wrapper around esbuild
like tsup
to help with transpilation & bundling dependencies. Generally relying on tsc
(which I think is what this is currently doing) isn't always enough to really get you what you want, but I know you're doing some custom build script stuff here, so maybe it's nice to have full control. Also check out tsx
which is my go-to for running TS-based scripts with esbuild
without needing a build step. Super fast for local testing && running scripts.
Hope that's useful feedback. Sorry it's not more detailed. Really really love what you guys are building w/ LangChain && excited to see the JS/TS side of things mature 🔥 💕
@@ -56,12 +76,14 @@ | |||
"url": "git@github.com:hwchase17/langchainjs.git" | |||
}, | |||
"scripts": { | |||
"build": "yarn clean && tsc --declaration --outDir dist/ && node create-entrypoints.js && node check-tree-shaking.js", | |||
"build:watch": "node create-entrypoints.js && tsc --declaration --outDir dist/ --watch", | |||
"build": "yarn clean && yarn build:esm && yarn build:cjs && node scripts/create-entrypoints.js && node scripts/check-tree-shaking.js", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(small nit / advice) instead of doing the &&
chaining, I generally use npm-run-all
and build:*
commands which is a lil cleaner
@@ -241,83 +263,104 @@ | |||
"exports": { | |||
".": { | |||
"types": "./index.d.ts", | |||
"import": "./index.js" | |||
"import": "./index.js", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
having all these distinct exports is nice but will prolly be a pain to maintain over time
they're super nice for tree-shaking, though.
I think including a default
as the final option is a best practice (order matters for these btw)
here's an example from one of my packages: https://github.com/transitive-bullshit/chatgpt-api/blob/600b35eaec985bbbfcb6c77776dc30d4614bd085/package.json#L13-L15
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep adding the default in the next PR
@@ -8,6 +8,9 @@ export function listEntrypoints() { | |||
const entrypoints = []; | |||
|
|||
for (const [key, value] of Object.entries(exports)) { | |||
if (key === "./package.json") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this a test file?
if it's part of the build, that would seem like a red flag to me
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a test file, but it's run in the build command because it requires the build output
@@ -0,0 +1,38 @@ | |||
import { resolve, dirname, parse, format } from "node:path"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
totally fine to roll your own build system for this type of stuff; whatever works for you :)
a few other options to consider that are a lil more common in the TS world:
for context, I wrote a bit about the lay of the land of these diff tools not too long ago; most of the advice is still relevant: https://transitivebullsh.it/javascript-dev-tools-in-2022
def not concrete advice btw; using your own scripts is fine and whatever gets the job done. just sharing some hopefully useful context since there are so many options in this space
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea I want to avoid bundling dependencies (for now at least) as that's a bit too opinionated for me
import url from "node:url"; | ||
import path from "node:path"; | ||
import fs from "node:fs/promises"; | ||
import * as url from "node:url"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is great
prettier is super powerful w/ imports btw; I recommend using one of the prettier import sorting plugins ala https://github.com/transitive-bullshit/chatgpt-api/blob/main/.prettierrc.cjs
just completely removes having to think about imports and ordering best practices and is a huge time saver
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh need to set that up
@@ -133,6 +131,8 @@ export class HNSWLib extends SaveableVectorStore { | |||
} | |||
|
|||
async save(directory: string) { | |||
const fs = await import("node:fs/promises"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is it clear which methods and classes require node imports?
what happens if I try to call save
in browser/edge/deno/etc?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In deno it would work. In other envs it wouldn't. One thing I haven't done yet is add some indication in the docs of which methods are node-only, that's planned
// Test CSVLoader | ||
const loader = new CSVLoader(new Blob(["a,b,c\n1,2,3\n4,5,6"])); | ||
|
||
const docs = await loader.load(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
loader.load
feels a lil off; I understand why it's separate from the ctor cause it's async. maybe loader.init
?
not sure..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm yea I think we went with this because it's the name used in python
assert(typeof HNSWLib === "function"); | ||
|
||
// Test dynamic imports of peer dependencies | ||
const { HierarchicalNSW } = await HNSWLib.imports(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this line surprised me; I'm not used to seeing the type of thing happening at runtime. I know you're just testing here, but the fact that imports()
is part of the public api made me glance twice.
dealing w/ deps / peer deps / optional peer deps is really tricky as I'm sure you're well aware, especially across diff environments.
for now, if this abstraction works for you, I'd say run with it. just wanted to point out that it was surprising
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm yea not without its issues, but it has mostly worked for us. Surprisingly hard to do optional deps in JS nowadays
index: new HierarchicalNSW("ip", 3), | ||
}); | ||
|
||
await vs.addVectors( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
all of the actual usage of these classes is so 🔥 btw
"test:cjs:import": "node ./import.js", | ||
"test:ts": "tsc && node dist/index.js", | ||
"format": "prettier --write \"**/*.ts\"", | ||
"format:check": "prettier --list-different \"**/*.ts\"" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
consider adding test:format
that runs prettier --check
like this
also same feedback w/ using npm-run-all
(or a similar alternative to the &&
approach)
FYI I'm aware of the "dual package hazard", our library is entirely free of global state (and we make very little use of
instanceof
checks), we have tests in place that check this on CI. This is following this approach recommended by Node https://nodejs.org/api/packages.html#approach-2-isolate-state