Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(remix-node): use atomic writes for file storage #4604

Closed
wants to merge 4 commits into from

Conversation

midgleyc
Copy link

All loaders run in parallel. If multiple loaders that run access the session, there is a potential race condition without atomic file writes:

  • Loader 1 begins overwriting the file, but does not finish (session.set)
  • Loader 2 reads the whole file into a variable and attempts to parse the variable as JSON (session.get), but fails because the file was only partially written

As the test needs to reproduce a race condition involving multiple files, it is difficult to write. I think it should be possible with an integration test, but I will need to spend some more time to understand how it should work.

Closes: #4353

  • Docs
  • Tests

Testing Strategy:

We have the problem described in the attached ticket, and could reliably reproduce the original issue by rapidly switching between pages until it crashed. With the changed session storage, we have not seen the error and the original method to reproduce the issue does not crash.

@changeset-bot
Copy link

changeset-bot bot commented Nov 15, 2022

🦋 Changeset detected

Latest commit: 4d53d2e

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 18 packages
Name Type
@remix-run/node Patch
@remix-run/architect Patch
@remix-run/express Patch
@remix-run/netlify Patch
@remix-run/serve Patch
@remix-run/testing Patch
@remix-run/vercel Patch
@remix-run/dev Patch
create-remix Patch
remix Patch
@remix-run/cloudflare Patch
@remix-run/cloudflare-pages Patch
@remix-run/cloudflare-workers Patch
@remix-run/css-bundle Patch
@remix-run/deno Patch
@remix-run/eslint-config Patch
@remix-run/react Patch
@remix-run/server-runtime Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

@remix-cla-bot
Copy link
Contributor

remix-cla-bot bot commented Nov 15, 2022

Hi @midgleyc,

Welcome, and thank you for contributing to Remix!

Before we consider your pull request, we ask that you sign our Contributor License Agreement (CLA). We require this only once.

You may review the CLA and sign it by adding your name to contributors.yml.

Once the CLA is signed, the CLA Signed label will be added to the pull request.

If you have already signed the CLA and received this response in error, or if you have any questions, please contact us at hello@remix.run.

Thanks!

- The Remix team

@remix-cla-bot
Copy link
Contributor

remix-cla-bot bot commented Nov 15, 2022

Thank you for signing the Contributor License Agreement. Let's get this merged! 🥳

@machour
Copy link
Collaborator

machour commented Nov 15, 2022

Hi @midgleyc and thank you so much for tackling this.
Please rebase this PR on the dev branch

@midgleyc
Copy link
Author

I've made a change to revert the initial creation back to using fs to simplify the change.

As the wx flag is passed, the file is only written to if it did not exist. In practice, this makes it atomic already: if the file does not exist, it is created; if it does exist nothing happens.

@machour machour added the feat:sessions Sessions related issues label Jan 22, 2023
@MichaelDeBoey MichaelDeBoey changed the title fix(sessions): use atomic writes for file storage fix(remix-node): use atomic writes for file storage Mar 9, 2023
@MichaelDeBoey
Copy link
Member

Hi @midgleyc!

Are you still interested in getting this one merged?

If so, please rebase onto latest dev & resolve conflicts

@MichaelDeBoey MichaelDeBoey added the needs-response We need a response from the original author about this issue/PR label May 1, 2023
midgleyc and others added 4 commits May 2, 2023 11:29
All loaders run in parallel. If multiple loaders that run access the
session, there is a potential race condition without atomic file writes:

* Loader 1 begins overwriting the file, but does not finish
* Loader 2 reads the whole file into a variable
* Loader 2 attempts to parse the variable as JSON, but fails
  because the file was only partially written
It is only written if the file does not exist due to the 'wx' flag.
@midgleyc
Copy link
Author

midgleyc commented May 2, 2023

Sure, I've rebased.

@musou1500
Copy link

musou1500 commented May 2, 2023

hello, I am facing same problem.

This PR uses write-file-atomic package to implement atomic write.
write-file-atomic implements atomic write by create and write data to temporary file then rename.
I looked around write-file-atomic and have following concerns

  • this library implements atomic write but does not implements lock(may be not problem)
  • write-file-atomic may overwrites existing temporary file.

more specifically about second concern, write-file-atomic decide temporary filename as follows.

The file is initially named filename + "." + murmurhex(__filename, process.pid, ++invocations)

arguments for hash is filename, pid and invocations but in HA configuration, thay can be the same over instances.
if filename collide, write-file-atomic will overwrites existing file.
A rarer case, but hash collision also be considered.

@github-actions github-actions bot removed the needs-response We need a response from the original author about this issue/PR label May 2, 2023
@midgleyc
Copy link
Author

midgleyc commented May 2, 2023

The error from the integration tests is:

✘ [ERROR] This require call is not allowed because the imported file "../../../node_modules/@jspm/core/nodelibs/browser/worker_threads.js" contains a top-level await

    ../../../node_modules/write-file-atomic/lib/index.js:18:34:
      18 │     const workerThreads = require('worker_threads')
         ╵                                   ~~~~~~~~~~~~~~~~

  The top-level await in "../../../node_modules/@jspm/core/nodelibs/browser/worker_threads.js" is here:

    ../../../node_modules/@jspm/core/nodelibs/browser/worker_threads.js:97:49:
      97 │ ...rkerData, environmentData }] = await once(parentPort, 'message'));

This comes from esbuild, and relates to trying to bundle a (series of) packages with top level await somewhere: evanw/esbuild#253 (comment)

write-file-atomic uses require('worker_threads') inside an IIFE to distinguish between multiple threads running on the same pid.

The integration tests have this import the jspm-core version of worker_threads, which contains a top-level await, which breaks the bundling.

I don't know what JSPM is used for, but the readme mentions "implementing the optimized browser versions of the Node.js builtins", and I think the actual node version does not contain the top level await issue.

@brophdawg11 brophdawg11 removed the request for review from mcansh August 2, 2023 15:59
@brophdawg11
Copy link
Contributor

The current remix v1 design of parallel loader execution means that accessing the same sessions from different loaders is always non-deterministic if one or any of them are writing to the session. The approach now would be to use different sessions to read/write different values in loaders, or only read values from a shared session across loaders (no setting in loaders).

@brophdawg11 brophdawg11 closed this Aug 3, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Occasional error when reading session data from file system
5 participants