-
-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MongoServerSelectionError exception happen periodically when change stream code is added #11275
Comments
Seems to be related https://jira.mongodb.org/projects/NODE/issues/NODE-3894 |
I'm not too sure if they are related, because the above mentioned issue seems to happened when there is a change of MongoDB primary node. But that is not happening in my case. It happened even when there is no change of primary on MongoDB cluster. |
are you guys able to repro it using my sample? |
I never used lambda functions so I can't test that mvce. I assume that this is a mongodb native driver issue. Mongoose is just adding syntactical sugar over mongodb native driver. Probably the Bugtracker of mongdb is better place for your issue. |
perhaps someone on your team can help to repro this issue using my sample. I also think that the issue may be related to driver. But I'm not sure I could just blame the driver for that. Because the driver guy may come back to me and say that I didn't use the driver correctly and demand an explanation. This is the part that I need your expertise as a Mongoose developer to come in and advice. And it may also turn out that I might not be using Mongoose correctly and hence causing the issue. So, I sincerely hope that the Mongoose team can help to look into this and help with facilitating a fix (in driver/Mongoose) or perhaps advice me on any possible workaround solution. Thanks a lot. |
What do I type to get the repo going? |
repro steps: you need to install serverless framework to deploy to AWS: after you get the code from repo, goto project folder and run "npm install" to install all npm packages. |
Need an AWS account to repro @vkarpov15 |
Seems related to an issue I also have : #11228 |
@newbaluku I've got your sample running on my AWS account and successfully repro-ed your error:
I'm adding some debugging to track this down, will post here when I find out more. Progress will likely be a little slow because, as written, this bug only seems to happen once every 35 minutes, so the feedback loop is pretty slow. In the meantime, try reducing |
Thanks @vkarpov15 for reproducing the bug. I have just tried setting the based on my prelim observations, the frequency of it happening has increased after setting the option. |
Some additional info that may help in your debugging. the bug doesn't seems to be reproducible when I deployed the same original code to run on Alibaba cloud Function compute (Lambda equivalent) that connects to ApsaraDB for MongoDB (MongoDB Atlas equivalent) that uses MongoDB version 4.2 |
We have a similar problem after upgrading Mongoose from 5.11.13 to 6.2.1. When we use Do you think it can be related? |
Some more info on my side: we are running mongoose inside an AWS Lambda and MongoDb Atlas cluster. Also when using |
@davideicardi if you're not using change streams, your issue is unrelated. It is hard to tell without looking at some code samples, but given that you're getting a ServerSelectionError if ServerSelectionTimeoutMS is 5000 but not if you leave it at 30000, I suspect you should be able to fix the "need to wait several seconds to connect" by reducing @newbaluku what I'm seeing so far is that the change stream is getting confused by the warmer. It looks like the change stream starts a I suspect the issue comes down to race conditions with using a change stream as a stream in a Lambda. I don't have an exact answer about the particulars why yet, but change streams don't handle being interrupted mid |
@vkarpov15 Thanks for the update and your efforts in debugging the issue. |
@newbaluku I think I found the right way to do this. I've been discussing with the MongoDB Node driver team and we both agree that running a change stream as a stream in the background of a Lambda function is bad practice. Too many potential places for Lambda to shut down the container in the middle of change stream logic and leave the change stream logic in a broken state. You need a way to process a change stream explicitly in your handler, picking up where the previous change stream left off, and explicitly close the change stream when your handler is done. I've been running the below handler for a few hours and as far as I can tell it picks up changes correctly and without any unpredictable errors. Just need to fix #11527 to make this handler work without having to bypass Mongoose's change stream wrapper. const warmer = require('lambda-warmer');
const Responses = require('./helpers/responses'),
{ connectToDatabase, getChangeStream } = require('./helpers/db'),
{ Country } = require('./model/country'),
{ logd, loge, logi } = require('./helpers/debuglog');
let resumeAfter = undefined;
module.exports.queryCountries = async (event, context) => {
// add this so that we can re-use any static/global variables between function calls if Lambda
// happens to re-use existing containers for the invocation.
context.callbackWaitsForEmptyEventLoop = false;
try {
if (await warmer(event)) {
logd('queryCountries> do nothing just warm up...');
return;
}
console.time('connectToDatabase');
await connectToDatabase();
console.timeEnd('connectToDatabase');
console.time('Country.findById');
const found = await Country.findById('AU');
console.timeEnd('Country.findById');
// Use MongoDB Node driver's `watch()` function, because Mongoose change streams
// don't support `next()` yet. See https://github.com/Automattic/mongoose/issues/11527
const changeStream = await Country.collection.watch([], { resumeAfter });
// Change stream `next()` will wait forever if there are no changes. So make sure to
// stop listening to the change stream after a fixed period of time.
let timeoutPromise = new Promise(resolve => setTimeout(() => resolve(false), 1000));
let doc = null;
while (doc = await Promise.race([changeStream.next(), timeoutPromise])) {
logd('Got:' + JSON.stringify(doc, null, ' '));
}
// `resumeToken` tells you where the change stream left off, so next function instance
// can pick up any changes that happened in the meantime.
resumeAfter = changeStream.resumeToken;
await changeStream.close();
// Test code to make sure this function picks up previous function call's changes
// logd('Run updateMany?');
// const updateRes = await Country.updateMany({}, { $inc: { sentSms: 1 } });
// logd('Updated', updateRes);
logd('queryCountries> return...', resumeAfter);
return { country: found };
}
catch (e) {
loge('queryCountries> Catch', e);
return Responses.internalServerError(e);
}
}; |
I meant to make 9eaa0a9 reference this issue. I added some notes in our change streams docs about this approach. |
@vkarpov15 Thanks for your effort and updates. Will see how to incorporate this into my code. |
@vkarpov15 just to make sure that I understand this correctly. It is not recommended to use ChangeStream with event emitter in Lambda functions. I should iterate the changeStream myself using the above code instead, right? |
@vkarpov15 any comments? |
@newbaluku sorry for missing this. This comment is mostly correct. If you use a change stream in a Lambda function, we recommend closing the change stream at the end of your function invocation, and creating a new change stream instance at the next invocation using refresh tokens to pick up where the last change stream left off. It is easier to do this by iterating the change stream imperatively using |
Do you want to request a feature or report a bug?
bug
What is the current behavior?
when change stream code is included during connection to mongoDB, the AWS Lambda container would crashed periodically with Uncaught exception:
OR (I believe older version of Mongoose reported this)
without the change stream code, no exception will happen.
If the current behavior is a bug, please provide the steps to reproduce.
the above can be repro using sample Lambda project @ https://github.com/newbaluku/repro-mongodb-exception
This project has a 4 mins warmer running to keep the container warm and also a 35 mins timer to trigger DB query to simulate endpoint invocation. On cold boot, it will make DB connection as well as open a change stream for a collection.
What is the expected behavior?
exception should not happen with/without the change stream code.
What are the versions of Node.js, Mongoose and MongoDB you are using? Note that "latest" is not a version.
Node.js v12.22.7
Mongoose v6.1.8 (also repro in v6.1.7 and v6.0.12)
MongoDB v4.4.11
The text was updated successfully, but these errors were encountered: