tests: add tests for netdev flooding race-condition #11256
base: master
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. If you want me to ignore this issue, please mark it with the "State: don't stale" label. Thank you for your contributions.
Note that all radios (and probably network devices) are affected by this issue, because the driver ISR and the send messages are handled by the same thread. EDIT: All radios that share the framebuffer for sending/receiving.
As I understand the issue (please correct me if I'm wrong), this is caused by the shared frame buffer of the at86rf2xx radio: it has a single 128 B buffer used for both the TX PDU and the RX PDU. The mrf24j40, cc2420 and nrf52840 radios all have separate transmit and receive buffers (I didn't check, or don't know about, the remaining few), so for those radios it is not possible to overwrite the receive buffer with the PDU from the
If the radio has separate TX and RX frame buffers, then yes, this problem doesn't happen.
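To make the shared-buffer argument concrete, here is a minimal host-side sketch. It is not at86rf2xx driver code: `shared_radio_t`, `split_radio_t` and the helper functions are hypothetical models, and the only detail taken from the discussion is the single 128-byte buffer. It assumes the sequence "frame arrives, send path loads a TX PDU, RX read-out happens afterwards".

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FB_LEN (128U)   /* frame buffer size mentioned in the discussion */

/* Hypothetical model: one buffer used for both TX and RX */
typedef struct {
    uint8_t fb[FB_LEN];
    size_t len;
} shared_radio_t;

/* Hypothetical model: separate TX and RX buffers */
typedef struct {
    uint8_t tx[FB_LEN];
    uint8_t rx[FB_LEN];
    size_t tx_len;
    size_t rx_len;
} split_radio_t;

static const uint8_t rx_pdu[] = { 'P', 'I', 'N', 'G' };  /* frame from the air */
static const uint8_t tx_pdu[] = { 'P', 'O', 'N', 'G' };  /* our own reply */

/* Returns 1 if the delayed read-out still yields the received frame,
 * 0 if it yields the PDU that was loaded for transmission. */
int shared_radio_keeps_rx(void)
{
    shared_radio_t r = { .len = 0 };

    /* 1. a frame arrives and lands in the (only) frame buffer */
    memcpy(r.fb, rx_pdu, sizeof(rx_pdu));
    r.len = sizeof(rx_pdu);
    /* 2. the send path loads its PDU before the RX event was handled */
    memcpy(r.fb, tx_pdu, sizeof(tx_pdu));
    r.len = sizeof(tx_pdu);
    /* 3. the RX event is handled late and reads whatever is in the buffer */
    return (r.len == sizeof(rx_pdu)) &&
           (memcmp(r.fb, rx_pdu, sizeof(rx_pdu)) == 0);
}

/* Same sequence with separate buffers: the TX load cannot touch RX data. */
int split_radio_keeps_rx(void)
{
    split_radio_t r = { .tx_len = 0, .rx_len = 0 };

    memcpy(r.rx, rx_pdu, sizeof(rx_pdu));
    r.rx_len = sizeof(rx_pdu);
    memcpy(r.tx, tx_pdu, sizeof(tx_pdu));
    r.tx_len = sizeof(tx_pdu);
    return (r.rx_len == sizeof(rx_pdu)) &&
           (memcmp(r.rx, rx_pdu, sizeof(rx_pdu)) == 0);
}
```

In this model `shared_radio_keeps_rx()` returns 0 (the read-out yields the node's own PDU) while `split_radio_keeps_rx()` returns 1, which is the difference between the at86rf2xx and the radios with separate buffers described above.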
Side-note: while selective fragment recovery (#12303) works very well for mitigating this bug (still waiting for the high-load results, but when taking it slow I get a 100% success rate even when forwarding), it causes the forwarder to create VRB entries to itself. When a fragment is forwarded, it can happen that, due to this bug, the forwarder receives it itself, determines that it is not the destination of this fragment, and creates a VRB entry for it with its own address as the source address. The only bad thing is that whenever there is an ACK to be forwarded to the same tag that the erroneously read packet was sent to, it is forwarded on the medium to the forwarder itself (causing energy problems), but other than that, some of the problems of this bug are mitigated. For my experiments I will just merge #11264 into the branch my experiments will be based on and use
@fjmolinas sure!
So when I flash those on two
and on the other
So I guess it's working? Might also mean the boards are stuck 😉 edit: Now I got
But I guess that's just another node sending periodic announcements.
#if defined(MODULE_AT86RF2XX)
#define NETDEV_ADDR_LEN (2U)
#define NETDEV_FLOOD_HDR_SEQ_OFFSET (2U)
/* IEEE 802.15.4 header */
#define NETDEV_FLOOD_HDR { 0x71, 0x98, /* FCF */ \
                           0xa1, /* Sequence number 161 */ \
                           0x23, 0x00, /* PAN ID 0x23 */ \
                           0x0e, 0x50, /* 0x500e (start of NETDEV_FLOOD_TARGET) */ \
                           0x0e, 0x12, /* 0x120e (start of NETDEV_FLOOD_SOURCE) */ }
#else
Why is this at86rf2xx specific?
Because this test is trying to point out a problem in the at86rf2xx implementation. For other drivers, other data might be needed. See also https://github.com/RIOT-OS/RIOT/pull/11256/files#diff-f29476e871b447f52f062cd754786b75R53-R54
But the 802.15.4 header should be the same for all 802.15.4 devices - what configuration would that be?
The 802.15.4 header, yes. I don't remember if this data was specific to the error case though.
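For reference, the header bytes in question can be decoded with nothing driver-specific. The sketch below assumes only the standard IEEE 802.15.4 frame control field layout and little-endian field order; `flood_hdr_info_t`, `le16()` and `decode_flood_hdr()` are hypothetical helpers written for illustration, not RIOT APIs, and the byte array is copied from NETDEV_FLOOD_HDR.

```c
#include <stdint.h>

/* Header bytes copied from NETDEV_FLOOD_HDR */
static const uint8_t flood_hdr[] = {
    0x71, 0x98,     /* FCF */
    0xa1,           /* sequence number */
    0x23, 0x00,     /* PAN ID */
    0x0e, 0x50,     /* destination short address */
    0x0e, 0x12      /* source short address */
};

/* Hypothetical container for the decoded fields (not a RIOT type) */
typedef struct {
    uint8_t frame_type;  /* FCF bits 0-2 */
    uint8_t ack_req;     /* FCF bit 5 */
    uint8_t pan_comp;    /* FCF bit 6 */
    uint8_t dst_mode;    /* FCF bits 10-11 */
    uint8_t src_mode;    /* FCF bits 14-15 */
    uint8_t seq;
    uint16_t pan_id;
    uint16_t dst;
    uint16_t src;
} flood_hdr_info_t;

/* Read a 16-bit little-endian value */
static uint16_t le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | ((uint16_t)p[1] << 8));
}

/* Decode a 9-byte header with short addresses and PAN ID compression,
 * i.e. exactly the layout of the bytes above. */
flood_hdr_info_t decode_flood_hdr(const uint8_t *hdr)
{
    uint16_t fcf = le16(hdr);
    flood_hdr_info_t info = {
        .frame_type = (uint8_t)(fcf & 0x7),
        .ack_req    = (uint8_t)((fcf >> 5) & 0x1),
        .pan_comp   = (uint8_t)((fcf >> 6) & 0x1),
        .dst_mode   = (uint8_t)((fcf >> 10) & 0x3),
        .src_mode   = (uint8_t)((fcf >> 14) & 0x3),
        .seq        = hdr[2],
        .pan_id     = le16(&hdr[3]),
        .dst        = le16(&hdr[5]),
        .src        = le16(&hdr[7]),
    };
    return info;
}
```

Decoding `flood_hdr` this way gives a data frame (type 1) with ACK request and PAN ID compression set, short addressing for both destination and source, sequence number 161, PAN ID 0x0023, destination 0x500e and source 0x120e. That agrees with the comments in the macro and with the sequence number sitting at offset 2 (NETDEV_FLOOD_HDR_SEQ_OFFSET).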
Is this still relevant?
I guess, if all 802.15.4 devices are ported to the new
Contribution description
While working on #11068, I noticed a race condition within the state machine (see p. 51 in the datasheet) of the at86rf2xx device driver.
This PR introduces two accompanying applications that reproduce this race condition:
netdev_flood_flooder, which sends IEEE 802.15.4 frames periodically every 5 ms
netdev_flood_replier, which receives those frames and tries to reply to them with different content after a 2 ms delay
If everything worked correctly, the netdev_flood_replier application would only receive the frames sent by netdev_flood_flooder. However, due to the discovered race condition, it may happen that it reads the data it just sent.
Testing procedure
In general: check the READMEs, they should describe it very well. But here is the rundown.
Compile and flash tests/netdev_flood_flooder first.
Check the output with
Then compile and flash tests/netdev_flood_replier too.
Use the make test target to check the output. If the following message is not shown, the test is successful (which it shouldn't be in the current master).
Issues/PRs references
The issue exposed by these tests was found while working on #11068 and is causing problems for it.