Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inconsistency count logic in RedisRegistrationStore.getAllRegistrations() and RedisRegistrationStore.Cleaner. #401

Closed
chuchiliusandc opened this issue Oct 6, 2017 · 12 comments
Labels
question Any question about leshan

Comments

@chuchiliusandc
Copy link

Hi sbernard31,

Remembered your fix for the following.
#393

I took a look at the logic for cleaner.
They look different in its implementation.
Could you take a look at the count logic in Cleaner to see it is correct ?

The reason why I asked is because I keep seeing the following in my log.
Somehow, the device seems to be unregistered by Cleaner, but the device was still alive ... but the server seems not be able to listen to it ...

8115978 [CoapServer#4] INFO com.sandc.enterprise.ts2c.leshan.dm.listener.RegistrationLogger - Device, TCMR-1000001, just updated TCMR-1000001's registration ...
8133873 [CoapServer#1] INFO com.sandc.enterprise.ts2c.leshan.dm.listener.RegistrationLogger - Device, TCMR-09999999, just updated TCMR-09999999's registration ...
8162233 [RedisRegistrationStore Cleaner (60s)] INFO com.sandc.enterprise.ts2c.leshan.dm.listener.RegistrationLogger - Device, TCMR-1000001, just unregistered ...

Thanks :)

@chuchiliusandc
Copy link
Author

btw, i also see this. somehow in that second, the device is unregistered and registered ...
The RedisRegistrationStore class is under cluster package.
Is this class stable ? sometimes, I even feel its cleaner has some issue (not wait long enough) ....

2017-10-06 18:58:05 INFO RegistrationLogger - Device, TCMR-10000002, just unregistered ...
2017-10-06 18:58:12 ERROR KafkaProducerCallback - Error while sending message to topic :null
org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 20000 ms.
Oct 06, 2017 6:58:12 PM org.eclipse.californium.core.network.CoapEndpoint$5 run
SEVERE: Exception in protocol stage thread: no registration for this Id
java.lang.IllegalStateException: no registration for this Id
at org.eclipse.leshan.server.cluster.RedisRegistrationStore.add(RedisRegistrationStore.java:489)
at org.eclipse.californium.core.network.BaseMatcher.registerObserve(BaseMatcher.java:130)
at org.eclipse.californium.core.network.UdpMatcher.sendRequest(UdpMatcher.java:102)
at org.eclipse.californium.core.network.CoapEndpoint$OutboxImpl.sendRequest(CoapEndpoint.java:616)
at org.eclipse.californium.core.network.stack.BaseCoapStack$StackBottomAdapter.sendRequest(BaseCoapStack.java:190)
at org.eclipse.californium.core.network.stack.ReliabilityLayer.sendRequest(ReliabilityLayer.java:98)
at org.eclipse.californium.core.network.stack.BlockwiseLayer.sendRequest(BlockwiseLayer.java:196)
at org.eclipse.californium.core.network.stack.AbstractLayer.sendRequest(AbstractLayer.java:65)
at org.eclipse.californium.core.network.stack.ExchangeCleanupLayer.sendRequest(ExchangeCleanupLayer.java:28)
at org.eclipse.californium.core.network.stack.BaseCoapStack$StackTopAdapter.sendRequest(BaseCoapStack.java:146)
at org.eclipse.californium.core.network.stack.BaseCoapStack$StackTopAdapter.sendRequest(BaseCoapStack.java:140)
at org.eclipse.californium.core.network.stack.BaseCoapStack.sendRequest(BaseCoapStack.java:79)
at org.eclipse.californium.core.network.CoapEndpoint$3.run(CoapEndpoint.java:535)
at org.eclipse.californium.core.network.CoapEndpoint$5.run(CoapEndpoint.java:977)
at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
at java.util.concurrent.FutureTask.run(Unknown Source)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(Unknown Source)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)

2017-10-06 18:58:12 INFO RegistrationLogger - Device, TCMR-1234567, just updated TCMR-1234567's registration ...
2017-10-06 18:58:12 INFO RegistrationLogger - Device, TCMR-10000002, just registered ...

@chuchiliusandc
Copy link
Author

2017-10-06 19:15:19 INFO RegistrationLogger - Device, TCMR-10000099, just updated TCMR-10000099's registration ...
2017-10-06 19:15:49 INFO RegistrationLogger - Device, TCMR-10000099, just unregistered ...
Oct 06, 2017 7:15:59 PM org.eclipse.californium.core.network.CoapEndpoint$5 run
SEVERE: Exception in protocol stage thread: no registration for this Id
java.lang.IllegalStateException: no registration for this Id
at org.eclipse.leshan.server.cluster.RedisRegistrationStore.add(RedisRegistrationStore.java:489)
at org.eclipse.californium.core.network.BaseMatcher.registerObserve(BaseMatcher.java:130)
at org.eclipse.californium.core.network.UdpMatcher.sendRequest(UdpMatcher.java:102)
at org.eclipse.californium.core.network.CoapEndpoint$OutboxImpl.sendRequest(CoapEndpoint.java:616)
at org.eclipse.californium.core.network.stack.BaseCoapStack$StackBottomAdapter.sendRequest(BaseCoapStack.java:190)
at org.eclipse.californium.core.network.stack.ReliabilityLayer.sendRequest(ReliabilityLayer.java:98)
at org.eclipse.californium.core.network.stack.BlockwiseLayer.sendRequest(BlockwiseLayer.java:196)
at org.eclipse.californium.core.network.stack.AbstractLayer.sendRequest(AbstractLayer.java:65)
at org.eclipse.californium.core.network.stack.ExchangeCleanupLayer.sendRequest(ExchangeCleanupLayer.java:28)
at org.eclipse.californium.core.network.stack.BaseCoapStack$StackTopAdapter.sendRequest(BaseCoapStack.java:146)
at org.eclipse.californium.core.network.stack.BaseCoapStack$StackTopAdapter.sendRequest(BaseCoapStack.java:140)
at org.eclipse.californium.core.network.stack.BaseCoapStack.sendRequest(BaseCoapStack.java:79)
at org.eclipse.californium.core.network.CoapEndpoint$3.run(CoapEndpoint.java:535)
at org.eclipse.californium.core.network.CoapEndpoint$5.run(CoapEndpoint.java:977)
at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
at java.util.concurrent.FutureTask.run(Unknown Source)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(Unknown Source)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)

2017-10-06 19:15:59 INFO RegistrationLogger - Device, TCMR-10000099, just registered ...

@sbernard31
Copy link
Contributor

I took a look at the logic for cleaner.
They look different in its implementation.

I re-look at this and it seems ok for me.

The RedisRegistrationStore class is under cluster package.
Is this class stable ?

This is the more stable class of this experimental package ;)

sometimes, I even feel its cleaner has some issue (not wait long enough) ....

I made some minor modification in #402, you should try if this behave better.
Personally, I still think you face some clients which update their registration too late... If after trying this minor fix, you still face this issue.
I also add a new "clemency" feature, then you can wait more than "lifetime" before to expire registration with something like that :

 LeshanServerBuilder builder = new LeshanServerBuilder();
 builder.setRegistrationStore(new RedisRegistrationStore(jedis, RedisRegistrationStore.DEFAULT_CLEAN_PERIOD,
                    93/* seconds => MAX_TRANSMITWAIT */));
 }
 // Create and start LWM2M server
 LeshanServer lwServer = builder.build();

Please try without this clemency feature before.

@chuchiliusandc
Copy link
Author

Thanks :)

@sbernard31
Copy link
Contributor

Any news ?

@chuchiliusandc
Copy link
Author

Some of my co-worker thinks maybe due to the thread pool size from leshan.
They are creating its own thread pool and create asychronous call to do the request sending.
However, I found the following link. Is this still an issue ?
https://dev.eclipse.org/mhonarc/lists/leshan-dev/msg00321.html

Thx :)

@sbernard31
Copy link
Contributor

In Leshan we try to enhance that and do our best to send the request after sending the response (see #249)

Anyway, A client should be able to handle this kind of bad ordering because of UDP (as I explained here : #388 (comment) )

@chuchiliusandc
Copy link
Author

chuchiliusandc commented Oct 18, 2017

My solution was to do it through LESHAN_REQ channel, not through listeners. I feel the code is cleaner.

For #249 you mentioned above,
Could you give me the context what request is that about ?
Sending request refers to sending observer/write/read request to the device ?

Same for the response ? the response refers to the response from registration or registration update ?

By the way, I am a little confused about what you said there. Here, "It" means device and what "LWM2M response" is sent by device here ? Why "It" goes after "LWM2M response" ? Isn't the registration event always going first ?

"It aims to send registration event (register, update, deregister) after we effectively send the LWM2M response"

One more question, what is the future of Q mode for leshan ?
Currently, I am doing it on the top of leshan server.

@sbernard31
Copy link
Contributor

My solution was to do it through LESHAN_REQ channel, not through listeners. I feel the code is cleaner.

You feel wrong and you will face some issue later like missing features because as I said this is not our priority.

"It aims to send registration event (register, update, deregister) after we effectively send the LWM2M response"

Before #249, event (RegistrationListener callback) was execute before the response of register/update/deregister request was sent.

Generally user want to send downlink request (write/read/ ...) to client on this event. If you do that synchronously there was lot of change that your request was sent before the register/update/deregister response. #249 limit this issue.

But again this is just an optimization because of udp reordering. (see eclipse-wakaama/wakaama#24 (comment))

If you device does not handle that correctly, just add a delay before to send your downlink request.

One more question, what is the future of Q mode for leshan ?

For now, unless somebody work again on #265, Q mode will probably not be implemented in the v1.0.0.
For server side, it's not a big deal as you can easily do that on your own. Just with register and notification event.

@sbernard31 sbernard31 added the question Any question about leshan label Oct 19, 2017
@chuchiliusandc
Copy link
Author

chuchiliusandc commented Oct 23, 2017

Hi Sbernard31,

If Q mode is done by myself (on top of leshan server),
the requests in the Queue are forwarded to leshan server through the redis channel, LESHAN_REQ.
Is this correct ?

However, you seemed to say LESHAN_REQ is in experimental stage.
If that is the case, what is the best way for our Q mode implementation to provide the request to leshan server?

Thanks :)

@sbernard31
Copy link
Contributor

You should use :

LeshanServer.send(Registration destination, DownlinkRequest<T> request,
            ResponseCallback<T> responseCallback, ErrorCallback errorCallback)

@chuchiliusandc
Copy link
Author

Thanks :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question Any question about leshan
Projects
None yet
Development

No branches or pull requests

2 participants