test killed after 10min on travis with docker mongo #120
Comments
Could it be a problem with the |
I got a similar (but not identical) deadlock & backtrace when running
and
As near as I can tell....
phew. That was fun. I'm pretty sure the bug is that Thoughts? |
Hi @dvic and @KJTsanaktsidis First off - @dvic thanks for the solid report, and @KJTsanaktsidis thanks for diving deeper into mgo than is good for your sanity! We'll take a look at this - we've never seen any deadlocks ourselves but the possibility is definitely there - there's an amazing amount of interplay with the locks (as @KJTsanaktsidis can clearly attest!) Do either of you have any reproducing code we can look at? Dom |
I’ll have a look and see if I can find a solid reproduction next week - maybe a “mongo” server that accepts then closes all connections might trigger this code path? |
@domodwyer I think I've managed to provide a repro in #121 - the test in the first commit fails about 20% of the time when I run it with |
Hi @dvic We're going to merge #121 into development ASAP (thanks to @KJTsanaktsidis !) and cut a hotfix to master once it's tested. In the meantime would you be able to run your tests using the development mgo branch to check if it resolves this issue? Dom |
Hi @domodwyer, sure no problem. Thanks! Will try it now and get back to you. |
Hey @dvic It's not merged just yet - I'll post here when it's done 👍 Dom |
No problem, for now I just used https://github.com/zendesk/mgo/tree/fix_dial_deadlock directly, TravisCI is running.. 🤞 |
Good news: I ran the test suite three times now, each passed without problems 👍 I'll keep them running just to be sure and I can also run it a few times on the dev branch once you're ready. |
@domodwyer Tests keep passing, #121 definitely seems to solve the problem (for me at least). Let me know if you want me to perform additional test runs on the dev branch. |
This is great news - thanks @dvic for reporting and @KJTsanaktsidis for such a comprehensive analysis and fix! Open source communities are alive and well! 👍 I will close this after the hotfix - thanks a lot! Dom |
Really happy to help - having this library be actively maintained helps everyone! |
Hi @dvic, @KJTsanaktsidis Sorry for disappearing, I was out of the country! It looks like this has been fixed (thanks!), but with a direct push to development, so this issue didn't close automatically. I'll find out how that happened (it should be PR only) and am closing this now. I will cut a hotfix release after a test run - thanks again! Dom |
* cluster: fix deadlock in cluster synchronisation (#120)
  For an impressively thorough breakdown of the problem, see: #120 (comment)
  Huge thanks to @dvic and @KJTsanaktsidis for the report and fix.
* readme: credit @dvic and @KJTsanaktsidis
* socket: only send client metadata once per socket (#105)
  Periodic cluster synchronisation calls isMaster(), which currently resends the "client" metadata on every call. The spec specifies: isMaster commands issued after the initial connection handshake MUST NOT contain handshake arguments
  https://github.com/mongodb/specifications/blob/master/source/mongodb-handshake/handshake.rst#connection-handshake
  This hotfix prevents subsequent isMaster calls from sending the client metadata again - fixes #101 and fixes #103.
  Thanks to @changwoo-nam @qhenkart @canthefason @jyoon17 for spotting the initial issue, opening tickets, and having the problem debugged with a PoC fix before I even woke up.
* Merge Development (#111)
  * Brings in a patch on having flusher not suppress errors. (#81) go-mgo#360
  * Fallback to JSON tags when BSON tag isn't present (#91)
    * Fallback to JSON tags when BSON tag isn't present. Cleanup.
    * Add test to demonstrate tagging fallback.
    * Test coverage for tagging test.
  * Cluster abended test 254 (#100)
    * Add a test that mongo Server gets their abended reset as necessary. See https://github.com/go-mgo/mgo/issues/254 and https://github.com/go-mgo/mgo/pull/255/files
    * Include the patch from Issue 255. This brings in a test which fails without the patch, and passes with the patch. Still to be tested, manual tcpkill of a socket.
  * changeStream support (#97): Add $changeStream support
  * readme: credit @peterdeka and @steve-gray (#110)
* Hotfix #120 (#136)
  * cluster: fix deadlock in cluster synchronisation (#120)
    For an impressively thorough breakdown of the problem, see: #120 (comment)
    Huge thanks to @dvic and @KJTsanaktsidis for the report and fix.
  * readme: credit @dvic and @KJTsanaktsidis
* Release/r2018.06.15 (#191)
  * allow ptr in inline structs
  * inline pointer_to_struce mode: update comments. return error on pointer not to struct
  * fix(dbtest): Use os.Kill on windows instead of Interrupt 🐛
    I've added a use for os.Kill, instead of the os.Interrupt signal, when using Windows. I'm currently developing my project on Windows, and using DBServer.Stop() was resulting in: "timeout waiting for mongod process to die". After investigating, I've discovered that os.Interrupt isn't implemented on Windows, and it seems golang has Frozen this issue due to age (2013). They instruct to use os.Kill instead. Using this, the DBServer on my project works with no problem.
  * Respect nil slices, maps in bson encoder (#147)
    * added support for marshalling/unmarshalling maps with non-string keys
    * refactor method receiver
    * added support for json-compatible support for slices and maps. Marshal() func: nil slice or map converts to nil, not empty (initialized with len=0)
    * fix IsNil on slices and maps format
    * added godoc
    * fix sasl empty payload
    * fix scram-sha-1 auth
    * revert fix sasl empty payload
  * Separate read/write network timeouts (#161)
    * socket: separate read/write network timeouts. Splits DialInfo.Timeout (defaults to 60s when using mgo.Dial()) into ReadTimeout and WriteTimeout to address #160. Read/write timeout defaults to DialInfo.Timeout to preserve existing behaviour.
    * cluster: remove AcquireSocket. Only used by tests, replaced by the pool-aware acquire socket functions:
      * AcquireSocketWithPoolTimeout
      * AcquireSocketWithBlocking
    * cluster: use configured timeouts for cluster operations
      * `mongoCluster.syncServer()` no longer uses hard-coded 5 seconds
      * `mongoCluster.isMaster()` no longer uses hard-coded 10 seconds
    * tests: use DialInfo for internal timeouts
    * server: fix fantastic serverTags nil slice bug. When unmarshalling serverTags, it is now an empty slice, instead of a nil slice. `len(thing) == 0` works all the time, regardless.
    * cluster: remove unused duplicate pool config
    * session: avoid calculating default values in hot path. Changes `DialWithInfo` to handle setting default values by setting the relevant `DialInfo` field, rather than calling the respective methods in the hot path for:
      * `PoolLimit`
      * `ReadTimeout`
      * `WriteTimeout`
    * session: remove unused consts
    * session: update docs
  * add URI options: "w", "j", "wtimeoutMS" (#162)
    * add URI options: "w", "j", "wtimeoutMS"
    * change "w" to "j"
  * Add Collation support for calling Count() on a Query (#166)
  * Expand documentation for *Iter.Next (#163)
    The documentation now explains the difference between calling Err and Close after Next returns false. The example code has been expanded to include checking for timeout.
  * add NewMongoTimestamp() and MongoTimestamp.Time(),Counter() (#171)
    code is inspired by go-mgo#202
  * MGO-156 Avoid iter.Next deadlock on dead sockets (#182)
  * Allow passing slice pointer as an interface pointer to Iter.All (#181)
    * Allow passing slice pointer as an interface pointer to Iter.All
    * Reverted to original error message, added test case for interface{} ptr
  * Contributing:findAndModify support writeConcern (#185)
    * findAndModify support writeConcern
    * fix
  * readme: credit everyone (#187)
    * @cedric-cordenier
    * @DaytonG
    * @ddspog
    * @gedge
    * @jefferickson
    * @larrycinnabar
    * @Mei-Zhao
    * @roobre
  * revert: MGO-156 Avoid iter.Next deadlock on dead sockets (#182) (#188)
    This reverts commit 7253b2b.
  * Add support for ssl dial string (#184)
    * Add support for ssl dial string
    * Ensure we dont override user settings
    * update examples
    * update ssl value parsing
    * PingSsl test
    * skip test requiring system certificates
  * readme: credit @tbruyelle (#190)
* strip space of flag
We've seen a deadlock happen occasionally where syncServers needs to acquire a socket to call isMaster(), but the socket acquisition needs to know the server topology, which isn't known yet. See issue globalsign#120 for a detailed breakdown. This replicates the issue by setting up a mongo "server" which closes sockets as soon as they're opened; about 20% of the time, this triggers the deadlock because the socket acquired for isMaster() dies and needs to be reacquired.
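A minimal sketch of the kind of fake "server" described above, assuming nothing beyond the Go standard library. It is not the test from the actual commit; `startClosingServer` and `probe` are hypothetical names used only for illustration. The listener accepts each connection and closes it straight away, and `probe` confirms from the client side that the connection died before any data arrived.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"time"
)

// startClosingServer (hypothetical helper, not part of mgo) listens on an
// ephemeral localhost port and closes every connection right after accepting it.
func startClosingServer() (addr string, stop func(), err error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", nil, err
	}
	go func() {
		for {
			conn, err := ln.Accept()
			if err != nil {
				return // listener was closed
			}
			conn.Close() // drop the client immediately
		}
	}()
	return ln.Addr().String(), func() { ln.Close() }, nil
}

// probe (hypothetical helper) dials addr and reports whether the peer closed
// the connection before sending any data. A read timeout counts as "not closed".
func probe(addr string) (bool, error) {
	conn, err := net.DialTimeout("tcp", addr, 2*time.Second)
	if err != nil {
		return false, err
	}
	defer conn.Close()

	if err := conn.SetReadDeadline(time.Now().Add(2 * time.Second)); err != nil {
		return false, err
	}
	buf := make([]byte, 1)
	_, err = conn.Read(buf)
	if err == nil {
		return false, nil // the peer actually sent something
	}
	var ne net.Error
	if errors.As(err, &ne) && ne.Timeout() {
		return false, nil // the peer kept the connection open
	}
	return true, nil // EOF or reset: the peer closed the connection
}

func main() {
	addr, stop, err := startClosingServer()
	if err != nil {
		panic(err)
	}
	defer stop()

	closed, err := probe(addr)
	fmt.Println("closed by server:", closed, "err:", err)
}
```

To approximate the scenario from the commit message, one would presumably dial mgo against `addr` with a short timeout and run that in a loop, since the deadlock is reported to trigger only some of the time.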
As discussed in issue globalsign#120, isMaster() can cause a deadlock with the topology scanner if the connection it makes dies before running the command; mgo automagically attempts to make another socket in acquireSocket, but this can't work without topology. This commit forces isMaster() to actually run on the intended socket.
Proposed fix for deadlock in globalsign#120
* cluster: fix deadlock in cluster synchronisation (globalsign#120) For an impressively thorough breakdown of the problem, see: globalsign#120 (comment) Huge thanks to @dvic and @KJTsanaktsidis for the report and fix. * readme: credit @dvic and @KJTsanaktsidis
* allow ptr in inline structs
* inline pointer_to_struce mode: update comments. return error on pointer not to struct
* fix(dbtest): Use os.Kill on windows instead of Interrupt 🐛 I've added a use for os.Kill, instead of the os.Interrupt signal, when using Windows. I'm currently developing my project on Windows, and using DBServer.Stop() was resulting in: "timeout waiting for mongod process to die". After investigating, I've discovered that os.Interrupt isn't implemented on Windows, and it seems golang has frozen this issue due to age (2013). They instruct to use os.Kill instead. Using this, the DBServer on my project works with no problem.
* Respect nil slices, maps in bson encoder (globalsign#147)
Hi,
Every once in a while our mongo suite gets killed on Travis CI. We run Go 1.10 and use docker for our test suites. Our Postgres and Neo4j test suites run just fine with this setup, but with mgo and Mongo we're having these issues.
Stacktrace information can be found below. Any idea why this is happening?