Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

drivers/kw2xrf: Improved event callback policy #12083

Closed
wants to merge 3 commits into from
Closed

drivers/kw2xrf: Improved event callback policy #12083

wants to merge 3 commits into from

Conversation

Ilmu011
Copy link

@Ilmu011 Ilmu011 commented Aug 26, 2019

Contribution description

kw2xrf driver; kw2xrf_netdev.c:

The event_callback policy was kind of messy; after a TX operation, depending on internal flags, 0, 1 or 2 event callbacks were made, with different parameters.

This change makes sure that a TX operation (T or TR) always results in exactly one event callback.

Testing procedure

The callback functions currently only increment netstat counters. Making an event callback with the parameter NETDEV_EVENT_TX_MEDIUM_BUSY will increment the fail counter. Making an event callback with the parameter NETDEV_EVENT_TX_COMPLETE will increment the success counter.
To observe faulty behaviour follow these instructions:

  • Compile and flash gnrc_networking example on kw2xrf board
  • Type ifconfig
  • Under "statistics for layer 2" look at the line TX succeeded <x> errors <y>

Scenario 1: 0 callbacks made

  • Disable the NETOPTs AUTOACK, ACK_REQ and AUTOCCA. To do this type these lines into the console:
    ifconfig <interface_id> -autoack
    ifconfig <interface_id> -ack_req
    ifconfig <interface_id> -autocca

  • Send a UDP packet to any address
    udp send <address> <port> <data>

  • Type ifconfig

  • Under "statistics for layer 2" look at the line TX succeeded <x> errors <y>; neither x or y will have incremented but we would have expected x to increment by one.

Scenario 2: 2 callbacks made

  • This scenario happens, when the driver is told by hardware, that a CCA has identified the channel as busy. Place a jammer next to the board. Or, if you don't want to do that, go to drivers/kw2xrf/kw2xrf_netdev.c and in line 694 change

if (dregs[MKW2XDM_IRQSTS2] & MKW2XDM_IRQSTS2_CCA) {

to

if ( !( dregs[MKW2XDM_IRQSTS2] & MKW2XDM_IRQSTS2_CCA ) ) {

and recompile and flash. This will make the driver interpret a clear channel as a busy channel instead.
Edit: This doesn't quite work, because this only influences the ISR assessments. A CCAIRQ will also terminate the sequence, which does not happen, when using this workaround. Jamming the channel seems to be the only safe way to test this.

  • Send a UDP packet to any address
    udp send <address> <port> <data>
  • Type ifconfig
  • Under "statistics for layer 2" look at the line TX succeeded <x> errors <y>; both x and y will have incremented but we only would have expected y to increment by one.

Keep in mind, the board does some stuff in the background too, including receiving and sending data over the radio. This will increment the netstat counters too. To avoid misinterpretations, you may want to enable debug prints.

@Ilmu011
Copy link
Author

Ilmu011 commented Sep 2, 2019

@PeterKietzmann @gebart
Did this slip through?

@PeterKietzmann
Copy link
Member

@Ilmu011 yes it did, I'm sorry. Next time, just tell me personally about your PRs :-). Right now I don't find the time for a review but I'll come back to it very soon.

@PeterKietzmann PeterKietzmann added Area: drivers Area: Device drivers Area: network Area: Networking Type: enhancement The issue suggests enhanceable parts / The PR enhances parts of the codebase / documentation labels Sep 2, 2019
@benemorius
Copy link
Member

Have you checked whether the NOACK event is being called as expected? I don't have the hardware to test with but my understanding is that SEQIRQ will be asserted at the end of the sequence no matter whether it ended with a received ACK or a timeout. That would mean that the check for TMR4IRQ is never reached.

@Ilmu011
Copy link
Author

Ilmu011 commented Sep 16, 2019

Have you checked whether the NOACK event is being called as expected? I don't have the hardware to test with but my understanding is that SEQIRQ will be asserted at the end of the sequence no matter whether it ended with a received ACK or a timeout. That would mean that the check for TMR4IRQ is never reached.

I have checked, whether the NOACK event is being called as expected. It is. Which makes sense, since TMR4IRQ is always checked when SEQIRQ is NOT asserted. Did you miss the "else" at line 705 in kw2xrf.netdev.c or am I missing something?

NOACK is called correctly... but right now it doesn't do anything. From my understanding, the callback function is defined at line 1408 in sys/net/gnrc/netif/gnrc_netif.c.

The only handled events are NETDEV_EVENT_TX_MEDIUM_BUSY and NETDEV_EVENT_TX_COMPLETE.

Everything else will be ignored and treated as an "unhandled event".

@Ilmu011
Copy link
Author

Ilmu011 commented Sep 16, 2019

And another thing; I didn't even realize it, until you pointed it out, but why is TMR4 used to detect an ACK timeout? There is a dedicated timer for detecting timeouts in RX-operations and that is TMR3. I am working on #10364 right now and the kw41z driver uses TMR3 for this purpose, which makes much more sense to me. This is also smoother since a TMR3 timeout also means that the sequence is aborted, asserting the SEQIRQ flag.

This is described on page 130 of the reference manual.

When Sequence R is initiated, the sequence manager warms up the receiver (analog and
digital elements). This process takes 144 us. From this point forward, timer TMR3 can be
enabled as a “bracketing” timer (software option). If a TMR3 timer match occurs, and
software has asserted the TC3TMOUT bit, the sequence manager will warm down the
receiver and return to SEQ_IDLE state.

@PeterKietzmann
Copy link
Member

@benemorius are you okay with @Ilmu011's explanation?

@Ilmu011 for the TMR3 vs TMR4 discussion, please keep that one separate in an own issue or even better in a fixing PR :-)

@benemorius
Copy link
Member

Ok I didn't catch the different usages between TMR3 and TMR4 either but I think you've discovered already that this was the source of my wrong assumption about the check not being reached. If a TMR4 expiration doesn't assert SEQIRQ then indeed my comment is invalid.

@Ilmu011
Copy link
Author

Ilmu011 commented Sep 23, 2019

@PeterKietzmann I created #12287 regarding this topic

@PeterKietzmann PeterKietzmann added this to the Release 2019.10 milestone Sep 24, 2019
@PeterKietzmann
Copy link
Member

As discussed offline, something behaves unexpected following the proposed CCA test procedure. Please double-check and report back.

@kb2ma kb2ma modified the milestones: Release 2019.10, Release 2020.01 Oct 9, 2019
Copy link
Member

@PeterKietzmann PeterKietzmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. The purpose and thus, the use of NETDEV_EVENT_TX_COMPLETE which increments tx_success generally seems a bit imprecise, but this is not a problem of this PR. What is success, a driver that returns? Successful channel access? Received ACK? In any way, this contribution aligns with other drivers.

  2. I've tested the two described scenarios that worked as described. Most importantly, I've placed a jammer next to the DUT and only the error counter incremented.

  3. The logic in _isr_event_seq_tr and _isr_event_seq_t is implemented differently. Why / for what?

if (dregs[MKW2XDM_IRQSTS3] & MKW2XDM_IRQSTS3_TMR3IRQ) {
/* if the sequence was aborted by timer 3, ACK timed out */
DEBUG("[kw2xrf] TC3TMOUT, SEQIRQ, TX failed\n");
netdev->event_callback(netdev, NETDEV_EVENT_TX_NOACK);
} else {
DEBUG("[kw2xrf] SEQIRQ\n");
} else if(!channel_was_busy) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you need to check here if the channel was busy before? Isn't this the case where an ACK was received?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From my understanding there are two scenarios, where a TMR3IRQ does NOT occur:
Either the ACK came in time (success) OR the initial CCA failed and aborted the sequence (failure).

@Ilmu011
Copy link
Author

Ilmu011 commented Oct 29, 2019

@PeterKietzmann
From my understanding, "success" means, that all measures we have taken to ensure the message reaches its destination have all returned positive results. In most cases, there is no way for us to tell whether a message we delivered actually made its way to its destination. For example when we have a sequence T without acknowledgement and without CCA, we just pass the message to the radio and make a NETDEV_EVENT_TX_COMPLETE callback, despite the fact, that the channel could have been busy or the address we sent to does not exist. If we make additional checks though, for example a CCA and it fails, we make a NETDEV_EVENT_TX_MEDIUM_BUSY callback, because we know for sure a transmission would fail and otherwise we "assume" success.

@@ -625,6 +625,9 @@ static void _isr_event_seq_t(netdev_t *netdev, uint8_t *dregs)
netdev->event_callback(netdev, NETDEV_EVENT_TX_COMPLETE);
}
}
else {
netdev->event_callback(netdev, NETDEV_EVENT_TX_COMPLETE);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to see a comment here too, to document the occurrence. If I see it correctly, this is the case where SEQIRQ fired because

This interrupt will assert whenever the Sequence Manager transitions from non-idle to idle state, for any reason

and CCAIRQ did not fire because it is disabled by configuration

CCA Interrupt. A '1' indicates the completion of a CCA operation.

@fjmolinas
Copy link
Contributor

@Ilmu011 approaching release status ping, there are still some unaddressed comments here but maybe you can find to address before hard feature freeze?

@Ilmu011 Ilmu011 closed this Jan 14, 2020
@jia200x
Copy link
Member

jia200x commented Jan 14, 2020

@Ilmu011 why did you close this one?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area: drivers Area: Device drivers Area: network Area: Networking Type: enhancement The issue suggests enhanceable parts / The PR enhances parts of the codebase / documentation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants