feat(checkpoint-validation): add custom checkpointer validation tool #600

benjamincburns · 2024-10-15T01:11:47Z

What's in the box?

This change adds the package @langgraph/checkpoint-validation-tool, which provides a portable black-box acceptance test suite for validating both internal and external checkpoint saver implementations.

This change fixes #541.

How can I use it?

It can be ran as a CLI tool, or by embedding it into an existing Jest test project. Interfacing between the test suite and custom checkpoint saver code is handled by a user-provided object that contains a set of simple functions that are used for infrastructure and component setup/teardown. Full usage details can be found in the README.md introduced in this PR.

Why is the build failing?

This package is also configured to exercise the existing checkpoint saver implementations during normal test runs (via yarn test). I use TestContainers where needed (MongoDB and PostgreSQL savers) for infrastructure provisioning.

A number of problems were discovered in the existing LangGraph checkpoint saver implementations while working on this PR. I've submitted additional PRs to correct some of the problems that I discovered (#578, #580, #582, #583, #584), and raised GH issues for others (#581, #589, #590, #591, #592, #593, #594, #595).

In the case of the problems that are described in GH issues, I've included implementation-specific carve-outs in the test logic to avoid test failures. These carve-outs are marked with TODO comments in the code.

In the case of the problems for which I've submitted PRs, I have not included any carve-outs in this test suite for those. As a result, the tests introduced by this PR will fail until the above mentioned PRs are accepted.

What's the best way to evaluate this whole mess?

To solve the chicken-and-egg problem that's caused by the dependence of this PR on so many others, I've created a merge branch that contains all of those fixes, along with the changes in this PR. I will endeavor to keep that branch in sync as things change.

benjamincburns · 2024-10-15T01:15:12Z

I should also note that I haven't added any documentation for this beyond what's in the README.md. I'd be happy to add docs, but I wasn't sure where they'd fit best. Guidance would be appreciated.

jacoblee93 · 2024-10-17T16:22:45Z

Thanks so much for this! Will dive in today.

jacoblee93

Some nits but overall I think this looks excellent and well thought out. Will give it a spin a bit later and approve if I'm able to run everything!

Other general nit is that we've generally used snake case for filenames stylistically - would prefer to keep it that way. I can make that change later too though.

jacoblee93 · 2024-10-17T16:37:33Z

libs/checkpoint-validation/src/runner.ts

+
+// passing via global is ugly, but there's no good alternative for handling the dynamic import here
+const initializer = (globalThis as GlobalThis)
+  .__langgraph_checkpoint_validation_initializer;


tiny nit: a symbol might be more idiomatic

Updated, although it's not much better. Jest gives each test file a fresh module cache, so there's no way to insert something into globalThis under a globally unique symbol key. I had to fall back to using Symbol.for, which isn't much different from what I had here, apart from being less ugly to read.

jacoblee93 · 2024-10-17T16:56:23Z

libs/checkpoint-validation/src/spec/data.ts

+        parents: {
+          // I think this is meant to be opaque to checkpoint savers, so I'm just stuffing the data in here to make sure it's stored and retrieved
+          // I think this is roughly what it'd look like if it were generated by the pregel loop, though
+          checkpoint_ns: parentCheckpointId,


nit: checkpoint_ns is a subgraph name pulled from the name of the node in the parent graph rather than a checkpoint id

This was actually a syntax goof on my part. I meant to write [checkpoint_ns]: parentCheckpointId here, as I was basing this off of the TSDoc comment on CheckpointMetadata.parents which reads:

export interface CheckpointMetadata { ... /** * The IDs of the parent checkpoints. * Mapping from checkpoint namespace to checkpoint ID. */ parents: Record<string, string>; }

updated to fix my initial syntax goof, but please lmk if I got the semantics wrong.

jacoblee93 · 2024-10-17T16:58:29Z

libs/checkpoint-validation/src/spec/getTuple.ts

+      const baseConfig = {
+        configurable: {},
+      };
+      initializerConfig = mergeConfigs(


nit: you can use ensureConfig for this

I think actually I need to just remove this (along with the configure method on CheckpointSaverTestInitializer).

My initial thinking was that it'd be handy to pass custom config into the checkpointer, e.g. like auth details that can be used for multi-tenant isolation. Unfortunately it doesn't seem that the core langgraph code merges the RunnableConfig passed on invoke when calling the checkpointer, so this logic doesn't hold any value in the tests.

Updated to remove this logic altogether.

libs/checkpoint-validation/src/tests/postgresInitializer.ts

jacoblee93 · 2024-10-17T17:09:55Z

libs/checkpoint-validation/README.md

@@ -0,0 +1,103 @@
+# @langchain/langgraph-checkpoint-validation


We should add a section to the LangGraph docs around contributing! I can do that in a separate PR after we merge this.

jacoblee93 · 2024-10-17T17:15:00Z

libs/checkpoint-validation/src/cli.ts

+          await runCLI(
+            {
+              _: [],
+              $0: "",


Could we avoid the need for the global by passing the initializer filepath here as an extra option, then do the dynamic import within the test suites? Not sure what the API looks like

I couldn't find any nice way to do this. If Jest had proper support for ESMs (and therefore allowed top-level await in test files), I could've made it much cleaner.

FWIW, vitest does support this, and migration from Jest isn't usually much of a drama. Definitely not advocating for that as a result of this PR, but it might be worth keeping it in mind if you hit more of these sort of issues in the future.

jacoblee93 · 2024-10-17T17:26:00Z

libs/checkpoint-validation/src/tests/mongoInitializer.ts

nit: rename to mongodb_initializer.ts for style

done, and applied this naming convention for all other files.

jacoblee93 · 2024-10-17T17:26:29Z

libs/checkpoint-validation/src/spec/data.ts

nit: move into testUtils.ts or make clearer that this doesn't contain any test files

jacoblee93 · 2024-10-17T17:29:14Z

libs/checkpoint-validation/README.md

+
+**IMPORTANT**: If you kill the test runner early this function may not be called. To avoid manual clean-up, give preference to test infrastructure management tools like [TestContainers](https://testcontainers.com/guides/getting-started-with-testcontainers-for-nodejs/), as these tools are designed to detect when this happens and clean up after themselves once the controlling process dies.
+
+### (Required) `createSaver`: Construct your checkpoint saver


uber nit: have been generally calling these checkpointer rather than saver

Yeah, I wasn't sure what to go with there, so I defaulted to the term used in the type definitions. Easy to switch it out, though.

FWIW, I think it'd make the code a bit more searchable/readable if you were to rename the XyzSaver classes (including BaseCheckpointSaver) to XyzCheckpointer next time you had to make a breaking change in the core checkpoint library.

updated all names/references/readme text/etc to reflect this naming convention.

jacoblee93 · 2024-10-17T18:22:43Z

Just ran locally with MemorySaver and it seems great :)

If you don't have time for the style things I'd feel pretty comfortable merging as is/picking them up myself.

benjamincburns · 2024-10-18T00:43:39Z

Just ran locally with MemorySaver and it seems great :)

Glad to hear it! 😅

If you don't have time for the style things I'd feel pretty comfortable merging as is/picking them up myself.

I'll look over the feedback a bit more closely tonight or tomorrow and let you know if I'll need to put any of it off. From a quick glance through I think the only thing that might take a bit of time is the bit about avoiding passing via globals in the dynamic test gen stuff (runner.ts).

Apart from the feedback that you've given, I also have three lingering "nice to haves" that I didn't really have time to address:

it'd be nicer if library users didn't need to be using Jest to integrate it directly with their existing test suite
it'd also be nicer if the CLI exposed more of the Jest functionality to the user, for configuring things like code coverage, various reporters, etc.
It's probably overkill, but it'd give me slightly more confidence if it also ran some e2e tests that exercised the savers at a higher level, perhaps using some code from the examples, etc.

If you agree that these would be useful improvements I can raise some issues for them so they aren't forgotten.

benjamincburns · 2024-10-18T02:32:54Z

Pushed a couple quick fixes for test logic errors that I found while using this to validate my project's custom DynamoDB checkpointer today

dropped a put test that was validating an invalid assumption that I made early on 😊
- it caused my custom checkpointer to throw while building the pending_sends array (parent not found)
- may be worth adding this back in as a referential integrity failure case, as it seems a bit off that the existing checkpointers allowed this to pass silently?
updated a getTuple failure case test to pass an invalid thread_id
- without this it was essentially duplicating the test just below it

jacoblee93 · 2024-10-20T02:26:56Z

it'd be nicer if library users didn't need to be using Jest to integrate it directly with their existing test suite

I think it's fine for now, it's up to you but would say wait for a few people to ask

it'd also be nicer if the CLI exposed more of the Jest functionality to the user, for configuring things like code coverage, various reporters, etc.

Same as above

It's probably overkill, but it'd give me slightly more confidence if it also ran some e2e tests that exercised the savers at a higher level, perhaps using some code from the examples, etc.

Yeah this would be nice, perhaps something in line with:

https://github.com/langchain-ai/langgraphjs/blob/main/libs/langgraph/src/tests/pregel.postgres_saver.int.test.ts

jacoblee93 · 2024-10-20T02:34:29Z

Seems like some flakiness with testcontainers? Not super familiar with it but am seeing this in CI:

  ● @langchain/langgraph-checkpoint-mongodb › @langchain/langgraph-checkpoint-mongodb#list › {
  description: "should return no tuples when thread_id does not match pushed checkpoint(s), checkpoint_ns matches 'child', limit is 2, before child checkpoint, and filter is undefined",
  thread_id: '1ef8e8ba-72cb-6ca0-fffd-bb7bbb3b207d',
  checkpoint_ns: 'child',
  limit: 2,
  before: [Object],
  filter: undefined,
  expectedCheckpointIds: []
}

    Could not find a working container runtime strategy

      22 |
      23 |   async beforeAll() {
    > 24 |     startedContainer = await container.start();
         |                        ^
      25 |     const connectionString = `mongodb://127.0.0.1:${startedContainer.getMappedPort(
      26 |       27017
      27 |     )}/${dbName}?directConnection=true`;

      at getContainerRuntimeClient (node_modules/testcontainers/src/container-runtime/clients/client.ts:64:9)
      at MongoDBContainer.start (node_modules/testcontainers/src/generic-container/generic-container.ts:81:20)
      at MongoDBContainer.start (node_modules/@testcontainers/mongodb/src/mongodb-container.ts:15:40)
      at Object.beforeAll (libs/checkpoint-validation/src/tests/mongoInitializer.ts:24:24)
      at Object._initializer_beforeAllTimeout (libs/checkpoint-validation/src/spec/index.ts:27:5)

Seems there might be something wrong with the package? I can repro locally, not sure if it's working for you or if it needs additional setup?

benjamincburns · 2024-10-20T20:27:36Z

I can repro locally, not sure if it's working for you or if it needs additional setup?

It does work for me. Do you have docker installed and running when testing locally? If so, you may be encountering a timeout when pulling the image due to the local yarn test applying a 30s timeout via the CLI.

In the GHA log it's failing on the mac runner with the message Could not find a working container runtime strategy. It looks like this is because Docker doesn't ship on the macOS images by default (per actions/runner-images#2150 (comment)). You'll notice that it did pass on the Ubuntu image, however.

In the thread linked above, they did pre-install colima, so if you don't mind the mac build taking an extra minute or two to run, the solution may just be to add brew install docker && colima start to the GHA test run config. That'll install the docker CLI for use with the colima runtime. Otherwise we can ignore these tests in CI when running on mac, mark them as integration tests, or some combination of these options.

benjamincburns · 2024-10-20T23:15:33Z

Colima isn't supported on M-series macs in GHA due to a lack of support for nested virtualization (actions/runner-images#9460 (comment)), so I ignored those tests in CI when running on M-series macs.

Unfortunately it seems that they also fail on Windows, likely due to docker desktop not being installed for licensing reasons (edit: yep - windows containers work fine, linux containers don't - actions/runner#904, and TestContainers doesn't support windows containers). I'll see if there's an alternative container runtime available there. If not, I'll also ignore these in CI on Windows.

benjamincburns · 2024-10-21T05:09:03Z

Current HEAD (0ea8824) passes CI. AFAICT this should be ready for re-review.

it'd be nicer if library users didn't need to be using Jest to integrate it directly with their existing test suite

I think it's fine for now, it's up to you but would say wait for a few people to ask

I did wind up taking a small/simple step in this direction by dropping imports from @jest/globals and using the types TS compiler option. That should make it work with any test lib that injects compatible globals.

nfcampos

This looks good, great work!

nfcampos · 2024-10-21T15:25:41Z

libs/checkpoint-validation/src/spec/get_tuple.ts

+
+        const writesToParent = [
+          {
+            taskId: "pending_sends_task",


I think the library always uses uuids for taskId, are we using something else here on purpose to test it supports arbitrary strings?

benjamincburns force-pushed the checkpoint-acceptance-tests branch 5 times, most recently from fcf1947 to 78309c4 Compare October 15, 2024 01:55

benjamincburns mentioned this pull request Oct 16, 2024

Feature request: Create an acceptance test suite for checkpointers #541

Closed

benjamincburns force-pushed the checkpoint-acceptance-tests branch 4 times, most recently from 957b968 to 34c6b65 Compare October 17, 2024 04:29

jacoblee93 reviewed Oct 17, 2024

View reviewed changes

benjamincburns force-pushed the checkpoint-acceptance-tests branch from 9b541b5 to 53211d7 Compare October 20, 2024 21:58

benjamincburns added 3 commits October 21, 2024 12:13

feat(checkpoint-validation): add custom checkpointer validation tool

624bd62

drop invalid put test

be4ab79

fix invalid getTuple test (supply thread_id)

dd64de1

benjamincburns force-pushed the checkpoint-acceptance-tests branch from 839b908 to aaffbad Compare October 20, 2024 23:13

benjamincburns force-pushed the checkpoint-acceptance-tests branch 2 times, most recently from b3a4833 to 7611cba Compare October 20, 2024 23:42

skip dockerized tests in CI on Windows and M-series mac

ef088b9

benjamincburns force-pushed the checkpoint-acceptance-tests branch from 7611cba to ef088b9 Compare October 21, 2024 00:42

benjamincburns added 3 commits October 21, 2024 15:29

style fixes

2c40e3f

Clean up CLI arg parsing & passing

81f13e0

bump yarn.lock

0ea8824

nfcampos approved these changes Oct 21, 2024

View reviewed changes

nfcampos merged commit d453cf6 into langchain-ai:main Oct 21, 2024
19 checks passed

benjamincburns mentioned this pull request Nov 5, 2024

Firebase Checkpoint Saver #657

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(checkpoint-validation): add custom checkpointer validation tool #600

feat(checkpoint-validation): add custom checkpointer validation tool #600

benjamincburns commented Oct 15, 2024 •

edited

Loading

benjamincburns commented Oct 15, 2024

jacoblee93 commented Oct 17, 2024

jacoblee93 left a comment

jacoblee93 Oct 17, 2024

benjamincburns Oct 21, 2024

jacoblee93 Oct 17, 2024

benjamincburns Oct 20, 2024

benjamincburns Oct 21, 2024

jacoblee93 Oct 17, 2024

benjamincburns Oct 20, 2024

benjamincburns Oct 21, 2024

jacoblee93 Oct 17, 2024

jacoblee93 Oct 17, 2024 •

edited

Loading

benjamincburns Oct 21, 2024

jacoblee93 Oct 17, 2024 •

edited

Loading

benjamincburns Oct 21, 2024

jacoblee93 Oct 17, 2024

benjamincburns Oct 21, 2024

jacoblee93 Oct 17, 2024 •

edited

Loading

benjamincburns Oct 18, 2024

benjamincburns Oct 21, 2024

jacoblee93 commented Oct 17, 2024

benjamincburns commented Oct 18, 2024

benjamincburns commented Oct 18, 2024 •

edited

Loading

jacoblee93 commented Oct 20, 2024

jacoblee93 commented Oct 20, 2024 •

edited

Loading

benjamincburns commented Oct 20, 2024 •

edited

Loading

benjamincburns commented Oct 20, 2024 •

edited

Loading

benjamincburns commented Oct 21, 2024

nfcampos left a comment

nfcampos Oct 21, 2024

		@@ -0,0 +1,103 @@
		# @langchain/langgraph-checkpoint-validation


		IMPORTANT: If you kill the test runner early this function may not be called. To avoid manual clean-up, give preference to test infrastructure management tools like [TestContainers](https://testcontainers.com/guides/getting-started-with-testcontainers-for-nodejs/), as these tools are designed to detect when this happens and clean up after themselves once the controlling process dies.

		### (Required) `createSaver`: Construct your checkpoint saver

feat(checkpoint-validation): add custom checkpointer validation tool #600

feat(checkpoint-validation): add custom checkpointer validation tool #600

Conversation

benjamincburns commented Oct 15, 2024 • edited Loading

What's in the box?

How can I use it?

Why is the build failing?

What's the best way to evaluate this whole mess?

benjamincburns commented Oct 15, 2024

jacoblee93 commented Oct 17, 2024

jacoblee93 left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jacoblee93 Oct 17, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jacoblee93 Oct 17, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jacoblee93 Oct 17, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jacoblee93 commented Oct 17, 2024

benjamincburns commented Oct 18, 2024

benjamincburns commented Oct 18, 2024 • edited Loading

jacoblee93 commented Oct 20, 2024

jacoblee93 commented Oct 20, 2024 • edited Loading

benjamincburns commented Oct 20, 2024 • edited Loading

benjamincburns commented Oct 20, 2024 • edited Loading

benjamincburns commented Oct 21, 2024

nfcampos left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

benjamincburns commented Oct 15, 2024 •

edited

Loading

jacoblee93 Oct 17, 2024 •

edited

Loading

jacoblee93 Oct 17, 2024 •

edited

Loading

jacoblee93 Oct 17, 2024 •

edited

Loading

benjamincburns commented Oct 18, 2024 •

edited

Loading

jacoblee93 commented Oct 20, 2024 •

edited

Loading

benjamincburns commented Oct 20, 2024 •

edited

Loading

benjamincburns commented Oct 20, 2024 •

edited

Loading