-
Notifications
You must be signed in to change notification settings - Fork 298
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CBL 1.1 Mac: Stopping continuous replication consistently triggers timeout warning #758
Comments
The replication should not start again after you call stop, so that sounds like a related part of the bug. FYI, if you want the replication to stop when it goes idle, it seems like you'd be better off just using a non-continuous replication. Then you won't have to manually stop it. |
Oh, that's just how to reproduce. I was using it over longer periods (running while editing documents backed by cblmodel). I've since switched to one-shot after saving each model. However, it didn't matter how long the continuous replication was running, the warning would show regardless. |
In general, the stop behavior is similar to v1.0.4 except that the stop method is now waiting until the background stop notification is received. The waiting timeout is 2 seconds which should be more than enough. There are 2 unit tests (test12_StopIdlePush and test13_StopIdlePullReplication) that test stop the replication right after it's in the idle state as you described above. I just ran the test manually, and I didn't see any issues. Would it be possible to enable 'Sync' and 'SyncVerbose' log and share the log file? It might help us spotting where the issue is. |
Hi Pasin. I have done that. I don't see anything interesting†, but I'm not sure what I'm looking for. I don't want to paste it here, can I mail it in? †Actually, there's a new error about _bulk_docs having issues with a _design doc (the push user isn't an admin) but I wasn't seeing that earlier. |
@mdfw, my email address is pasin at couchbase.com. I'm not sure if the error will cause the delay in receiving the bg stop notification, but timestamp of the events in the log file after the stop method was called might tell something. |
Sent. |
@mdfw Thanks for the log file. Here are some notes:
|
@pasin As far as I can tell, the replicator restarted itself. I walked my code and added some of my own logging and I don't see a point where I call start again. However, I was going to try to do a basic shell of an app just to check to make sure it wasn't my code. I didn't get to that tonight but I'll see what I can do this weekend. |
@pasin I sent you an updated log from an test app I put together. While the warning seems to have disappeared (maybe that's somewhere in my code), the double stopping behavior seems to be there so maybe something is still going on. Here's a few lines (I sent Pasin the full log);
|
@mdfw Thanks for more info. Did the restart issue go away as well? The first STOPPING... is from CBLReplication -stop method. The second are from the internal CBL_Replication when it's released or dealloc, which apparently doesn't do anything mostly as the CBL_Replication has already been stopped. Please let me know if the issue happens again. If that happens again, we may need to add some diagnostic log messages and I may need your help for that. Now let keep our eyes on that. |
I'm closing the issue for now. Feel free to add more comments or reopen the issue. |
@mdfw I have replied your email. I have thought about a couple possibilities that can trigger the warning message.
We should improve or prevent these two behaviors. |
Reopen the issue. |
Ok, I think I figured it out. This is what's called an edge case. There was a one line difference between my app and my test app and it makes all the difference (of course). When I'm shutting down a continuous replication, I do the following:
If I eliminate the switch to continuous, the errors disappear. I put a check for .running and it is still running after calling Why was I doing that? I trap when CBLModel saves and if there's not a running replication, it triggers a one-shot on the same replicant (that stays around). I'm doing things differently now so this bug doesn't affect me, but I did want to report the findings if you want to fix it. |
@mdfw Thanks a lot for finding the root cause. Setting the continuous property will trigger the replicator to restart. That's why the -stop method was called twice (resulted to the warning message) and the replicator restarted itself. https://github.com/couchbase/couchbase-lite-ios/blob/master/Source/API/CBLReplication.m#L92-L97 |
- Calling the -stop method twice or more will cause the 'Timeout waiting for background stop notification' to show up as the _bg_replicator has already been stop so there will not be any more notification sent. To fix the issue, ignore the -stop message if the _bg_replicator is not running. - Fix comments in test16_Restart and rename test12_StopIdlePush to test12_StopIdlePushReplication for naming consistency. #758
I fixed timeout warning issue when calling the -stop method multiple times. I'm closing this issue now. |
I'm having the same problem This is my stop replication method but stills not work |
@vfernandezg, are you using 1.1 or master build binary? The warning message should be fixed on the master branch with 12ffbba commit. The reason that your replication didn't get stopped (actually it got restarted) is that you tried to reset the continuous property after calling stop (See https://github.com/couchbase/couchbase-lite-ios/blob/master/Source/API/CBLReplication.m#L95-L100). Do you have a reason to reset the continuous property after stopping the replication? |
- Calling the -stop method twice or more will cause the 'Timeout waiting for background stop notification' to show up as the _bg_replicator has already been stop so there will not be any more notification sent. To fix the issue, ignore the -stop message if the _bg_replicator is not running. - Fix comments in test16_Restart and rename test12_StopIdlePush to test12_StopIdlePushReplication for naming consistency. #758
@pasin No I have not. I already change my method to only stop but in the 1.10 version it always put a break exception on the CBLReplication stop method |
@vfernandezg Would it be possible to get the full crash log (on debugger, try Also would it be possible to try with the binary built from the current master branch? There was a fix around stop warning message which was not included in v1.1.0. Here is an instruction to build a binary from the master branch: https://github.com/couchbase/couchbase-lite-ios/wiki/Building-Couchbase-Lite Please let me know. |
It would be helpful also if you could paste your current completed code when stopping the replicator to see if I can use to reproduce the issue. |
@pasin Here the complete log https://gist.github.com/vfernandezg/d38143eaf891ddf5fdfc |
@pasin Sorry, this is the code of stop replication method -(void) stopP2P { |
@vfernandezg Thanks for more information, and it's good to know that it's a P2P case. I will need to test this. Anyway, I have quickly looked at the backtrace, but I couldn't find where the crashed thread is. Did it actually crash the application or just wait until timing out? It's weird that you had symbol conflicts as the 'Listener iOS Library' target build settings does use the CouchbaseLitePrefix prefix header, which already renames GCDAsyncSocket to CBL_GCDAsyncSocket. Anyway, I uploaded the latest build binary (from the master) to dropbox here: https://www.dropbox.com/s/gic2pv24i40wu9f/couchbase-lite-ios-community_0.0.0-473.zip?dl=0. You may try this. |
@pasin Yeah I think that it's an exception, look for it at thread 3 |
Yes, you are right. Thread#3 has an exception. Thanks for pointing that out. A couple of things here:
|
@vfernandezg: You're calling the same Couchbase Lite objects from multiple threads, which isn't allowed: the library is not thread-safe that way. This causes "undefined behavior", up to and including crashes. Please look at the concurrency documentation. |
@pasin You were right, that was the problem. Sure, I will add to the prefix and also here is the list of the duplicated symbols: @snej Jens, by my business logic I need to stop a p2p replication and initiate a gateway replication but this interrupt my GUI by a moment. In the previous version of CouchbaseLite I was able to put this methods in async blocks |
As reported in #758 (comment), renamed more GCDAsyncSocket symbols to CouchbaseLitePrefix.h.
As reported in #758 (comment), renamed more GCDAsyncSocket symbols to CouchbaseLitePrefix.h.
Can you show me the source code of the async block? |
Hi @snej Any update about the situation? |
Here are from my opinions:
|
I'm running replications from/to CBL Mac to CouchDB (Cloudant hosted). While testing this bug, the push and pull replications are not running concurrently, but at separate times.
After updating to CBL 1.1 (built it myself from commit: 7949e8d on Yosemite, Xcode 6.3.2) . I'm consistently (every time) getting this warning when I stop a continuous replication:
I don't remember ever seeing this warning on 1.0.x but to be honest, I've just started working with replications so it's possible I just didn't run into it. I may roll back and see if it's there.
Steps to reproduce:
[db createPushReplication:url]
)[self.pushreplication start]
[self.pushreplication stop]
Expected behavior: no warnings.
Note: The replication does appear to actually shut down and stop before or after the warning - it's hard to tell.
Note2: I don't see this on one-shot replications.
Related: It seems after I call stop, a replication runs again? To debug I attached a filter that just logs when the replication runs and returns
YES
. It seems that after the call to[replication stop]
the replication runs again. Is this correct behavior?The text was updated successfully, but these errors were encountered: