Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Firestore: Invalid result returned from getDoc after writing to cache #6739

Closed
KoryNunn opened this issue Oct 31, 2022 · 10 comments
Closed

Firestore: Invalid result returned from getDoc after writing to cache #6739

KoryNunn opened this issue Oct 31, 2022 · 10 comments

Comments

@KoryNunn
Copy link
Contributor

KoryNunn commented Oct 31, 2022

[REQUIRED] Describe your environment

  • Operating System version: Windows WSL2 (but all)
  • Browser version: Any (also node)
  • Firebase SDK version: Reproduced in 9.13.0, 9.10.x
  • Firebase Product: firestore

[REQUIRED] Describe the problem

After writing a document via setDoc(ref, data, { merge: true }), if a write is pending, getDoc will return only the updated fields from the merge, not including any existing data for that document.

Steps to reproduce:

  1. Write an update to a document with setDoc and { merge: true }, don't await the write.
  2. Read the same document immediately.
  3. Only the updated fields will be returned.

Relevant Code:

const { disableNetwork, getFirestore, getDoc, setDoc, doc } = require('firebase/firestore');

const config = {
    "firestoreEmulatorPort": 9098,
    "firebase": {
      "projectId": "<<insert>>",
      "apiKey": "<<insert>>",
      "appId": "<<insert>>",
    }
}

const app = initializeApp(config.firebase);
const db = getFirestore(app);

async function run () {

  // Consistent write to server
  await setDoc(doc(db, "test/foo"), {
      first: "Ada",
      last: "Lovelace",
      born: 1815
  });

  // Get from server
  const snapshot1 = await getDoc(doc(db, "test/foo"));
  console.log(snapshot1.metadata); // SnapshotMetadata { hasPendingWrites: false, fromCache: false }
  console.log(snapshot1.data()); // { last: 'Lovelace', born: 1815, first: 'Ada' }

  // Force opperations against the local cache.
  await disableNetwork(db);

  // Write to cache
  const write = setDoc(doc(db, "test/foo"), {
      foo: "bar"
  }, { merge: true });

  // Get from cache while the write is pending
  const snapshot2 = await getDoc(doc(db, "test/foo"));
  console.log(snapshot2.metadata); // SnapshotMetadata { hasPendingWrites: true, fromCache: true }
  console.log(snapshot2.data()); // returns { foo: 'bar' } -- Expected to return { last: 'Lovelace', born: 1815, first: 'Ada', foo: 'bar' }
}

run();
@KoryNunn
Copy link
Contributor Author

The impact of this bug is that we cannot use the recommended approach of inconsistent writes in our application.

@cherylEnkidu cherylEnkidu self-assigned this Oct 31, 2022
@cherylEnkidu
Copy link
Contributor

Hi @KoryNunn

Thank you so much for filling the support ticket and provide detailed reproduce steps. We have successfully reproduced it and fix is on the way.

@cherylEnkidu
Copy link
Contributor

Hi @KoryNunn

After discussion, our team believes this is the expected behaviour.

The reason for that is with memory persistence, document inside cache gets eagerly garbage collected. So the cache is empty before you call setDoc in the second time.

There are two possible fixes for this:

  1. use indexeddb persistence.
  2. instead of using getDoc in the first time, using onSnapshot. So the query will stay active and the cache won't be cleaned up.

I will close this ticket for now, but please feel free to reopen it if you have any questions.

@KoryNunn
Copy link
Contributor Author

KoryNunn commented Nov 1, 2022

Hi @cherylEnkidu

This would suggest that in node/web without indexDB either:

  1. All writes should be awaited so that getDoc/getDocs work.
    Or
  2. Writes should be assumed eventual, and getDoc/getDocs should not be used.

Neither of these are obvious or clearly documented, and both are very compromised.

Setting cacheSizeBytes also does not mitigate this without indexDB persistance.

Previously, it was Firebase's recommendation that writes shouldnt be awaited if they didn't need to be consistently propagated to the server, but this issue means that there are only some scenarios where that will work, and the way in which applications should be implemented with firestore drastically changes based on this consideration.

The advertising blurb on the hompage:

Offline support | Cloud Firestore caches data that your app is actively using, so the app can write, read, listen to, and query data even if the device is offline. When the device comes back online, Cloud Firestore synchronizes any local changes back to Cloud Firestore.

Is not true unless the developer treads very carefully and ahs configured and used firebase in a very specific way.

To my mind, firestore should not so eagerly garbage collect documents, even in memory mode, when there is plenty of free space in the cache.

Additionally, if you remove the await disableNetwork(db);, the above code will return the whole document, or only the modified document, non-deterministically, making this issue very hard to track down (ask me how I know...)

@cherylEnkidu
Copy link
Contributor

cherylEnkidu commented Nov 1, 2022

Hi @KoryNunn ,

Thank you for the time spend and detailed suggestions. Our team agrees we can optimized the way how memory persistence works. Right now we are trying to prioritize this feature request (b/256897558) among other features. Will keep you posted if we have any news.

@cherylEnkidu cherylEnkidu reopened this Nov 1, 2022
@KoryNunn
Copy link
Contributor Author

FYI we've hacked around this by doing as you suggested, adding a snapshotListener for every single unique read query, then using getDocFromCache after the snapshot resolves it's first value.

This increased our performance by a factor of between 50x and 200x, including writes, which we no longer need to await.

I'd recommend looking into implementing a more official way of achieving this as this now lives up to what is described in the advertising blurb.

@wu-hui
Copy link
Contributor

wu-hui commented Nov 29, 2022

Thanks. We will take this into consideration and hopefully we can have better support for this soon.

@haris-aqeel
Copy link

FYI we've hacked around this by doing as you suggested, adding a snapshotListener for every single unique read query, then using getDocFromCache after the snapshot resolves it's first value.

This increased our performance by a factor of between 50x and 200x, including writes, which we no longer need to await.

I'd recommend looking into implementing a more official way of achieving this as this now lives up to what is described in the advertising blurb.

@KoryNunn Can you assist us in doing the same?

We are using the getDocFromCache, but we are puzzled how to use snapshot listeners with our redux state management.

@KoryNunn
Copy link
Contributor Author

KoryNunn commented Sep 20, 2023

FYI we've hacked around this by doing as you suggested, adding a snapshotListener for every single unique read query, then using getDocFromCache after the snapshot resolves it's first value.
This increased our performance by a factor of between 50x and 200x, including writes, which we no longer need to await.
I'd recommend looking into implementing a more official way of achieving this as this now lives up to what is described in the advertising blurb.

@KoryNunn Can you assist us in doing the same?

We are using the getDocFromCache, but we are puzzled how to use snapshot listeners with our redux state management.

Wrap gets and sets in your own functions and within them:

First: Create a cacheStatus with a Map or similar that maps queries to their cached status.

For gets:

  • If the query is cached, do a getFromCache(), otherwise...
  • Do an onSnapshot for your query, make sure you set includeMetadataChanges: true in the options
  • In the snapshot handler,
    • if snapshot.metadata.fromCache, do nothing
    • Otherwise:
      • unsubscribe the snapshot listener (or queue it for unsubscription)
      • mark this query as cached in your cacheStatus
      • do a getFromCache() and return the value

For sets:

  • Do a setDoc and do not await it.
  • mark this query as NOT cached in your cacheStatus so that subsequent get's know to get it from the server.

Use these get and set functions in redux however you want.

I will note that this is a bad solution and that firestore should just implement cached gets and sets correctly, which would in turn also mean you don't have to wait for a full round trip for updates.

@dconeybe
Copy link
Contributor

As of version 9.21.0 (released April 27, 2023 (release notes)) you can now enable LRU garbage collection when using memory persistence.

With LRU garbage collection, the surprising behavior of "eager" garbage collection (mentioned in earlier comment #6739 (comment)) no longer applies. That is, you should see the behavior you expect without having to resort to unnecessarily using a snapshot listener or unnecessarily enabling IndexedDb persistence.

Here is example code showing how to create a Firestore instance that uses the new LRU garbage collector for memory persistence:

import { FirebaseApp } from 'firebase/app';
import {
  Firestore,
  initializeFirestore,
  memoryLocalCache,
  memoryLruGarbageCollector
} from 'firebase/firestore';

function createFirestoreWithMemoryLruGc(app: FirebaseApp): Firestore {
  const garbageCollector = memoryLruGarbageCollector();
  const localCache = memoryLocalCache({ garbageCollector });
  return initializeFirestore(app, { localCache });
}

I'm going to close this issue now that there is an official solution; however, feel free to leave new comments or re-open if you feel that the issue has not been adequately resolved. If you find bugs with the new LRU garbage collection please open a new issue.

@firebase firebase locked and limited conversation to collaborators Oct 21, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

7 participants