IEEE802.15.4: HW Auto ACK considered harmful #12910

Closed
jia200x opened this issue Dec 9, 2019 · 7 comments

Labels: Area: network, Discussion: RFC, State: stale


@jia200x
Member

jia200x commented Dec 9, 2019

Description

We have most of the radios configured with the Auto ACK feature. This means the radio generates an ACK packet as soon as it receives a valid IEEE802.15.4 packet (even before the packet is fetched from the radio). In most cases, this practice can be harmful.

Sometimes the radio receives a packet, but it gets lost before being processed by the MAC layer. E.g.:

In both cases the radio sends an ACK packet to the sender (i.e. "all good, my MAC layer received the packet"), but the packet never reached the receiver's MAC layer.
This can produce weird behaviors, since the sender's MAC layer believes the packet was received and processed.

Note that hardware frame retransmissions are OK and can be used without any issues.

Thus, I propose to leave Auto ACK as optional and implement ACK response in software. Besides having a more reliable L2, we would automatically add ACK features to radios that don't provide Auto ACK caps.
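
To illustrate the proposal, here is a minimal sketch of how a software ACK could be built once the MAC layer has actually accepted a frame. It assumes the standard immediate ACK layout (2-byte frame control field, sequence number, 2-byte FCS) and the 16-bit ITU-T CRC as FCS. The names `sw_ack_build` and `fcs16` are hypothetical and not part of any existing RIOT API; many radios also append the FCS in hardware, in which case only the first three bytes would be needed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bits of the first Frame Control Field octet (IEEE 802.15.4). */
#define FCF_TYPE_ACK   0x02u  /* frame type: acknowledgment */
#define FCF_ACK_REQ    0x20u  /* Ack Request (AR) bit */

#define ACK_FRAME_LEN  5u     /* FCF (2) + sequence number (1) + FCS (2) */

/* 16-bit ITU-T CRC, bit-reflected (polynomial 0x1021 -> 0x8408), init 0. */
uint16_t fcs16(const uint8_t *data, size_t len)
{
    uint16_t crc = 0;

    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            crc = (crc & 1u) ? (uint16_t)((crc >> 1) ^ 0x8408u)
                             : (uint16_t)(crc >> 1);
        }
    }
    return crc;
}

/* Hypothetical helper: meant to be called only after the MAC layer has
 * accepted `frame`. Writes an immediate ACK into `ack` and returns its
 * length, or 0 if the frame did not request an acknowledgment. */
size_t sw_ack_build(const uint8_t *frame, size_t frame_len,
                    uint8_t ack[ACK_FRAME_LEN])
{
    assert(frame != NULL && ack != NULL);

    if (frame_len < 3 || !(frame[0] & FCF_ACK_REQ)) {
        return 0;
    }

    ack[0] = FCF_TYPE_ACK;  /* ACK frame, no other flags set */
    ack[1] = 0x00;
    ack[2] = frame[2];      /* echo the sequence number of the frame */

    uint16_t fcs = fcs16(ack, 3);
    ack[3] = (uint8_t)(fcs & 0xFFu);  /* FCS goes out low byte first */
    ack[4] = (uint8_t)(fcs >> 8);

    return ACK_FRAME_LEN;
}
```

The point of the sketch is the call site rather than the frame layout: because the ACK is only built after the frame has been handed to the MAC layer, a frame that is lost inside the driver is never acknowledged.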

@bergzand bergzand self-assigned this Dec 9, 2019
@bergzand
Member

bergzand commented Dec 9, 2019

> In most cases, this practice can be harmful.

Realistically speaking, how often does this cause actual harm? What percentage of messages get ack'ed while actually discarded by the MCU?

> The radio switches to TX mode when receiving (#11256)

This case is unique to radios with a shared receive/transmit buffer, right? So it only affects the at86rf2xx class of devices.

> This can produce weird behaviors, …

Do you have some examples where this causes issues?

> Thus, I propose to leave Auto ACK as optional and implement ACK response in software

Do you have some numbers on the impact of this? Flash size/RAM usage, but also power usage, as the MCU has to stay awake longer. Also, how much time is there between frame reception and ACK timeout, and is it realistic to handle ACKs in software if the radio is connected over SPI?

> Besides having a more reliable L2

I'm a bit skeptical about this claim (if that wasn't clear yet from my comment). Handling L2 acks is a soft real-time case (missing deadlines will degrade service), and handling them in software puts a hard requirement on the real-time behaviour of RIOT.

I don't mind software L2 acks if they are a hard requirement for some specific MAC layers, but that's not mentioned in the description here.

@jia200x
Member Author

jia200x commented Dec 9, 2019

Hi @bergzand

> Realistically speaking, how often does this cause actual harm? What percentage of messages get ack'ed while actually discarded by the MCU?

For the upper layers it's not thaaaaat big of a deal (considering they are usually best effort).
However, some MAC and sub-MAC layers (OpenThread, TSCH) use the ACK to update neighbor information.

> This case is unique to radios with a shared receive/transmit buffer, right? So it only affects the at86rf2xx class of devices.

True. I'm just pointing out scenarios where this might happen.

> Do you have some examples where this causes issues?

Giving wrong information to users of the MAC layer (e.g. Mesh Link Establishment) would probably affect the link quality. I'm aware, though, that we don't have such a thing in RIOT yet (and the stacks that use it already implement the feature themselves).

> Do you have some numbers on the impact of this? Flash size/RAM usage, but also power usage, as the MCU has to stay awake longer. Also, how much time is there between frame reception and ACK timeout, and is it realistic to handle ACKs in software if the radio is connected over SPI?

Unfortunately not. Since it's not implemented, I don't have numbers to compare. I only have some test branches with auto ACK, but I haven't tested them in depth.

> I'm a bit skeptical about this claim (if that wasn't clear yet from my comment). Handling L2 acks is a soft real-time case (missing deadlines will degrade service), and handling them in software puts a hard requirement on the real-time behavior of RIOT.

There's some information from TinyOS that software ACKs tend to have more drops (https://vs-git.informatik.uni-kl.de/engel/tinyos/blob/020c6a6d8cc542bf58ca6afb8b1bf24efbe381de/doc/txt/tep126.txt), but at least there are no false positives.
AFAIK the IEEE802.15.4 standard doesn't talk about hard constraints (only about timeout values). If an ACK packet is not delivered in time, it's assumed to be lost. But as said before, I have no information about how well the OS performs with software ACKs. It would be interesting to have some benchmarks.
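
As a rough back-of-the-envelope sketch of those timeout values (not a benchmark), the following assumes the 2.4 GHz O-QPSK PHY (16 µs per symbol, 2 symbols per octet) and the standard's formula for `macAckWaitDuration`. The SPI model is a deliberate simplification and an assumption: it counts only raw clock cycles and ignores interrupt latency, command bytes and radio state transitions. All function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Constants for the 2.4 GHz O-QPSK PHY (62.5 ksymbol/s). */
#define SYMBOL_US             16u  /* duration of one symbol */
#define SYMBOLS_PER_OCTET     2u   /* phySymbolsPerOctet */
#define SHR_SYMBOLS           10u  /* phySHRDuration: preamble (8) + SFD (2) */
#define UNIT_BACKOFF_SYMBOLS  20u  /* aUnitBackoffPeriod */
#define TURNAROUND_SYMBOLS    12u  /* aTurnaroundTime */
#define ACK_OCTETS            6u   /* PHR (1) + ACK PSDU (5) */

/* macAckWaitDuration in symbols: how long the sender waits for an ACK. */
uint32_t ack_wait_symbols(void)
{
    return UNIT_BACKOFF_SYMBOLS + TURNAROUND_SYMBOLS + SHR_SYMBOLS
           + ACK_OCTETS * SYMBOLS_PER_OCTET;
}

/* Time on air of an immediate ACK, including SHR and PHR, in symbols. */
uint32_t ack_airtime_symbols(void)
{
    return SHR_SYMBOLS + ACK_OCTETS * SYMBOLS_PER_OCTET;
}

/* Latest point (in us after the end of the data frame) at which the ACK
 * can start and still finish inside the sender's wait window. */
uint32_t sw_ack_budget_us(void)
{
    return (ack_wait_symbols() - ack_airtime_symbols()) * SYMBOL_US;
}

/* Simplified SPI cost: raw clock time for `octets` bytes, rounded up. */
uint32_t spi_transfer_us(uint32_t octets, uint32_t spi_hz)
{
    assert(spi_hz > 0);
    return (uint32_t)(((uint64_t)octets * 8u * 1000000u + spi_hz - 1u)
                      / spi_hz);
}

/* Time left for software after reading the frame and writing a 5-byte ACK
 * over SPI; a negative value means the budget is already exceeded. */
int32_t sw_ack_slack_us(uint32_t frame_octets, uint32_t spi_hz)
{
    return (int32_t)sw_ack_budget_us()
           - (int32_t)spi_transfer_us(frame_octets, spi_hz)
           - (int32_t)spi_transfer_us(5u, spi_hz);
}
```

Under these assumptions the sender waits 54 symbols (864 µs), which leaves roughly 512 µs before the ACK has to start. A full 127-byte frame read over an 8 MHz SPI bus fits in that window with margin, while a 1 MHz bus does not. The standard's nominal ACK start (`aTurnaroundTime`, 192 µs here) is tighter still, so real numbers would have to come from measurements.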

> I don't mind software L2 acks if they are a hard requirement for some specific MAC layers, but that's not mentioned in the description here.

We need to implement software ACK anyway for radios that don't support Auto ACK and features that are not available in hardware accelerators (I'm not aware of radios that support Enhanced ACKs to be honest).
The question is: do we want this to be optional or mandatory for our default configurations? As described before, the idea is not to remove Auto ACK support but to have better L2 support. We can still enable Auto ACK if we want (that's why I proposed to add radio caps in #11473).

@PeterKietzmann
Member

PeterKietzmann commented Dec 10, 2019

Maybe I am mistaken but the solution seems simple:

  • We implement a (hardware-independent) ARQ -> we want that regardless of this discussion.
  • We compare the performance (and interoperability) of hardware vs software implementations.
  • We decide on a platform-specific default setting, which does not prevent compiling the other.

As a side note: The fact that we never faced the problem highlighted by @jia200x may be related to the missing MAC layers on top of a radio in RIOT. Furthermore, with low-power lossy radios we tolerate a certain amount of loss anyway.

@kb2ma
Member

kb2ma commented Dec 10, 2019

How important are software ACKs for 6TiSCH? IIRC, OpenWSN uses software ACKs partly to track link quality among neighbors.

@jia200x
Member Author

jia200x commented Dec 10, 2019

> How important are software ACKs for 6TiSCH? IIRC, OpenWSN uses software ACKs partly to track link quality among neighbors.

6TiSCH doesn't speak against hardware ACKs, but requires Enhanced ACKs to pass timing information. In OpenWSN, this is handled by the pkg itself.

@PeterKietzmann
Member

*and Enhanced ACKs are not supported by the hardware that I know of.

Furthermore, the problem with hardware acknowledgment capabilities in some cases is that the hardware takes full responsibility for the mechanism without necessarily exposing relevant information, such as whether an ACK was received or the number of retries. With 6TiSCH you might retransmit a frame in another cell, which requires this kind of information. Thus, OpenWSN relies on a limited radio device feature set and implements all other MAC components in software.
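
A minimal sketch of what a hardware-independent retransmission loop could expose to the layer above, assuming a radio that only offers "send" and "wait for ACK" primitives. Everything here (`arq_radio_t`, `arq_send`, the fake radio used as a test double) is hypothetical and not an existing RIOT or OpenWSN interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical minimal radio interface (an assumption, not a real API). */
typedef struct {
    int  (*send)(void *ctx, const uint8_t *frame, size_t len); /* 0 = sent */
    bool (*wait_ack)(void *ctx, uint8_t seq, uint32_t timeout_us);
    void *ctx;
} arq_radio_t;

/* Information a hardware ARQ often hides from the upper layer. */
typedef struct {
    bool     acked;     /* a matching ACK was received */
    unsigned attempts;  /* number of transmissions performed */
} arq_result_t;

/* Send `frame` up to 1 + max_retries times and report what happened. */
arq_result_t arq_send(const arq_radio_t *radio, const uint8_t *frame,
                      size_t len, unsigned max_retries,
                      uint32_t ack_timeout_us)
{
    arq_result_t res = { .acked = false, .attempts = 0 };

    assert(radio != NULL && frame != NULL);
    if (len < 3) {
        return res;  /* too short to carry a sequence number */
    }

    uint8_t seq = frame[2];
    for (unsigned attempt = 0; attempt <= max_retries; attempt++) {
        res.attempts++;
        if (radio->send(radio->ctx, frame, len) != 0) {
            continue;  /* transmission failed, counts as an attempt */
        }
        if (radio->wait_ack(radio->ctx, seq, ack_timeout_us)) {
            res.acked = true;
            break;
        }
    }
    return res;
}

/* Test double: the ACKs for the first `drop` transmissions never arrive. */
typedef struct {
    unsigned sent;
    unsigned drop;
} fake_radio_t;

static int fake_send(void *ctx, const uint8_t *frame, size_t len)
{
    (void)frame;
    (void)len;
    ((fake_radio_t *)ctx)->sent++;
    return 0;
}

static bool fake_wait_ack(void *ctx, uint8_t seq, uint32_t timeout_us)
{
    (void)seq;
    (void)timeout_us;
    fake_radio_t *fake = ctx;
    return fake->sent > fake->drop;
}
```

Because the caller gets both `acked` and `attempts` back, a scheduler such as the one described for 6TiSCH could, for example, call `arq_send` with `max_retries = 0` and decide on its own in which cell to retry.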

@aabadie aabadie added Area: network Area: Networking Discussion: RFC The issue/PR is used as a discussion starting point about the item of the issue/PR labels Jan 3, 2020
@stale

stale bot commented Jul 6, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. If you want me to ignore this issue, please mark it with the "State: don't stale" label. Thank you for your contributions.

@stale stale bot added the State: stale State: The issue / PR has no activity for >185 days label Jul 6, 2020
@stale stale bot closed this as completed Aug 6, 2020