aws_iot_shadow_yield() does not account for next ping time when passing timer to iot_tls_read #38
Comments
Hi Tim, We were seeing unexpected behavior when we tried to set this in every call of the loop in the tls_read function. That is already something we expect to fix in a future release. You can enable the commented line here to get the behavior you are looking for, but it's not well tested at the moment on our end. All of that being said, I am not entirely sure I understood your post correctly. You were trying to test the blocking behavior of the tls_read call, is that correct? Was the above information useful? Rahul
Rahul - No, in the port to the embedded platform I'm working on (STM32F401 Nucleo) I had implemented the iot_tls_read(...) call to either:
I noticed when running the shadow example on the port that if I bumped the yield time to a large number, such as 60 seconds, the library ends up losing the connection to the primary server. This is because the library never sends a ping command: the iot_tls_read(...) call spends the full 60 seconds waiting for one of the three conditions above, even though the remote server expected a ping within a shorter interval. (The library should have limited the timeout passed to iot_tls_read(...) to the time remaining until the next scheduled PINGREQ command.) If the iot_tls_read(...) function is not supposed to use the timer at all... then the documentation should be updated. Tim
(The diff above made the sample TLS implementation respect the timer so that the bug can be exposed in the main codebase, since the one bug is hiding the other.)
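For reference, here is a minimal sketch of a timer-respecting read along the lines of the diff Tim describes. The Timer helper (has_timer_expired) comes from the SDK's timer_interface.h; the TLS context plumbing and the exact return codes are illustrative assumptions, not the actual patch:

```c
/* Sketch only: block until data arrives or the caller's timer expires.
 * The tlsDataParams/ssl member access is an assumption about the mbedTLS port. */
IoT_Error_t iot_tls_read(Network *pNetwork, unsigned char *pMsg, size_t len,
                         Timer *timer, size_t *read_len) {
    size_t rxLen = 0;

    while(len > 0 && !has_timer_expired(timer)) {
        int ret = mbedtls_ssl_read(&pNetwork->tlsDataParams.ssl, pMsg, len);
        if(ret > 0) {
            rxLen += (size_t) ret;
            pMsg  += ret;
            len   -= (size_t) ret;
        } else if(ret != MBEDTLS_ERR_SSL_WANT_READ && ret != MBEDTLS_ERR_SSL_WANT_WRITE) {
            return NETWORK_SSL_READ_ERROR;   /* hard TLS error */
        }
        /* else: nothing available yet; keep polling until the timer runs out */
    }

    *read_len = rxLen;
    if(len == 0) {
        return SUCCESS;
    }
    return (rxLen > 0) ? SUCCESS : NETWORK_SSL_NOTHING_TO_READ;
}
```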
Hi Tim, In your particular case, yes, you are correct that the ping timer value is the one that should be used as the timeout. Let me run this by the rest of the team and see what the best way to handle this case would be. At the very least, a documentation update pointing this out is required. Thank you for pointing out this issue. Please let us know if you have any further suggestions. Rahul
This is by design. See the comment below extracted from the code - https://github.com/aws/aws-iot-device-sdk-embedded-C/blob/master/src/aws_iot_mqtt_client_yield.c#L165-L168
Documentation should be updated to call this out clearly, so that yield() is called frequently enough and the wait time passed to aws_iot_shadow_yield() is shorter than the keep-alive timeout.
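In practice, that documented contract amounts to a loop like the following (a sketch based on the SDK samples; the 30-second keep-alive, the 200 ms yield value, and the client variable are illustrative):

```c
/* Keep-alive is configured at connect time (e.g. 30 s); the yield timeout
 * must stay well below it so a PINGREQ can go out on schedule. */
IoT_Error_t rc = SUCCESS;
while(SUCCESS == rc || NETWORK_ATTEMPTING_RECONNECT == rc || NETWORK_RECONNECTED == rc) {
    rc = aws_iot_mqtt_yield(&client, 200);   /* 200 ms << 30 s keep-alive */
    if(NETWORK_ATTEMPTING_RECONNECT == rc) {
        continue;                            /* wait for reconnect to finish */
    }
    /* ... application work between yields ... */
}
```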
In this case, the function has control for the entire keep-alive interval, with additional time beyond that, so it's not exactly a rate issue. One would expect the function to do the right thing when it holds control of the thread for longer than the keep-alive interval. (If, by design, it should never keep control beyond the keep-alive interval, the function could truncate the amount of time it holds control to that amount. Another way of looking at this function is that one is saying "do what you need to do for X seconds" - it should properly manage its internal state to do this, including returning early if it really needs another external invocation, but ideally it would just keep doing its thing internally until the X seconds have elapsed.)
The keep-alive interval is controlled through pClient->pingTimer, which is an independent timer from the yield timeout. As long as the yield wait time is small enough (compared to keepAliveInterval), the keep-alive message rate will remain roughly constant.
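To illustrate the separation being described, the keep-alive check run on each yield cycle has roughly this shape (a paraphrased sketch, not the actual source; the PINGREQ helper name and the keepAliveInterval member path are assumptions, while has_timer_expired and countdown_sec are the SDK's timer helpers):

```c
/* Sketch of the per-cycle keep-alive check driven by pClient->pingTimer. */
if(has_timer_expired(&pClient->pingTimer)) {
    /* Keep-alive window elapsed: send a PINGREQ... */
    rc = send_pingreq(pClient);   /* hypothetical helper, not an SDK function name */
    /* ...and rearm the ping timer for the next keep-alive interval (member path assumed). */
    countdown_sec(&pClient->pingTimer, pClient->clientData.keepAliveInterval);
}
```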
Regardless, a yield()-style function should be able to maintain its internal state when it's given ample timeout periods, such as minutes at a time. (The "rate" wording in the documentation would normally lead one to conclude that yield has to be called within that amount of time so that messages get through, not to keep the client functional - e.g. no gap in control handed to the yield function greater than the keep-alive interval or the incoming message rate. In this case, the function has control of the thread within that "rate", so it should just work.)
I agree that the API documentation should be improved. I think the right strategy for calling (Note, of course, a zero or extremely small yield timeout could lead to the issue you reported in #35, to which I believe I already have a solution. The idea basically is to use a separate timer - a packet receiving timer - to time the packet reads from the 2nd read here onwards. The packet receiving timer is the maximum time we will wait until a full MQTT packet is received; it's a constant timer, so it won't be affected by whatever time the yield cycle has remaining.)
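A rough sketch of that idea, assuming the SDK's Timer helpers (init_timer, countdown_ms) and a hypothetical constant PACKET_RECEIVE_TIMEOUT_MS; the surrounding variables are placeholders for the read loop's state, not actual code:

```c
/* Once the first byte of an MQTT packet has arrived, bound the reads for the
 * remainder of the packet with a constant timer instead of the yield timer. */
Timer packetTimer;
init_timer(&packetTimer);
countdown_ms(&packetTimer, PACKET_RECEIVE_TIMEOUT_MS);   /* hypothetical constant, e.g. 2000 ms */

/* Subsequent reads (remaining length, payload) use packetTimer, so a tiny
 * yield timeout can no longer truncate a packet mid-read. */
rc = iot_tls_read(pNetwork, buf + bytesSoFar, remainingLen, &packetTimer, &read_len);
```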
The only point I disagree with is what the permitted range of the timeout_ms parameter to the ..._yield() call should be by design. For the library to be friendly to folks using it (without needing to know specific quirks of the way it operates), it should accept an arbitrary timeout and not let the value passed there throw the library's internal state out of whack. E.g.

```c
while(1)
{
    aws_iot_mqtt_yield(..., 100);
}
```

should be just as valid as:

```c
while(1)
{
    aws_iot_mqtt_yield(..., 60000);
}
```

but currently they are not equivalent, despite the second case having control of the CPU at the time the next keep-alive should be sent. (Or in replacing the above with ...) I digress.
I have already agreed with you on that part :). We could either improve the documentation or throw an error if the given timeout_ms is greater than the keep-alive interval. In reality, the keep-alive interval is generally far greater than timeout_ms, for cost and efficiency reasons.
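The error-returning option could look roughly like this at the top of yield (a hypothetical guard, not something the SDK currently does; the keepAliveInterval member path is an assumption, and FAILURE stands in for a dedicated error code):

```c
/* Hypothetical up-front check: reject yield timeouts longer than the
 * negotiated keep-alive interval instead of silently missing PINGREQs. */
if(timeout_ms > (uint32_t) pClient->clientData.keepAliveInterval * 1000U) {
    return FAILURE;   /* a dedicated error code would be added for this case */
}
```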
I'm assuming the iot_tls_read(...) call should block until it gets any amount of data or until the passed timer expires. If this assumption is correct, then the _aws_iot_mqtt_internal_yield(...) function needs to be updated as suggested below. (The current sample mbedTLS implementation does not have this behavior and I'm guessing this is a bug - the fix for it is below in order to expose this other bug.)
The _aws_iot_mqtt_internal_yield(...) function invokes the iot_tls_read(...) call (via aws_iot_mqtt_internal_cycle_read(...)) with the full timeout value - it should instead pass the lesser of the time until the next ping needs to be sent and the overall timeout. The example case fixes the mbedTLS implementation to block for the full time and sets the overall timeout for the yield function to a value longer than the time before the unit needs to send another ping to the remote. Here's a patch that exposes the issue:
The resulting behavior is that the client keeps disconnecting because it never sends a ping, and it then has to reconnect when the 60-second timer elapses.
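For concreteness, the clamping described above could look roughly like this inside the yield loop (a sketch under assumptions: left_ms, init_timer, and countdown_ms are the SDK's timer helpers, pClient->pingTimer is the keep-alive timer mentioned earlier in the thread, and the surrounding loop structure is paraphrased):

```c
/* Bound each read by whichever comes first: the remaining yield budget or
 * the time until the next PINGREQ is due. */
uint32_t yieldLeft = left_ms(&timer);                 /* remaining yield time */
uint32_t pingLeft  = left_ms(&pClient->pingTimer);    /* time until next keep-alive ping */

Timer readTimer;
init_timer(&readTimer);
countdown_ms(&readTimer, (pingLeft < yieldLeft) ? pingLeft : yieldLeft);

/* Read with the clamped timer, then let the keep-alive logic run so the
 * PINGREQ still goes out even when the caller asked to yield for minutes. */
rc = aws_iot_mqtt_internal_cycle_read(pClient, &readTimer, &packet_type);
```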