
gnrc_netif: use irq_handler instead of IPC #12462

Closed

Conversation

@jia200x (Member) commented on Oct 15, 2019

Contribution description

This PR uses the irq_handler module to process network device events (NETDEV_EVENT_ISR) instead of waiting for NETDEV_MSG_TYPE_EVENT IPC messages in the gnrc_netif thread.

The implications of this are:

  • The receive procedure is preemptive, which is not possible with the current approach in master: if a packet arrives while the gnrc_netif thread is already processing a GNRC_NETAPI_MSG_TYPE_SND, a NETDEV_MSG_TYPE_EVENT message will be queued and only processed after the send procedure finishes. If the network device shares the same frame buffer for RX and TX, the flow then goes like
`dev->driver->isr(dev)` => `dev->event_callback(NETDEV_EVENT_RX_COMPLETE)` => `dev->driver->recv(dev, ...)` /* FRAME BUFFER WAS OVERWRITTEN BY THE SEND PROCEDURE!! */

This is the root cause of #11256. Using the irq_handler module, the packet gets processed first.

  • As in gnrc_netif: Add support for internal event loop #9326, the request to handle the ISR can never get lost.
  • The drivers can directly offload the ISR, so it's possible to avoid using the isr field of the netdev_driver_t structure (that's the goal, but I'm trying to get there in small steps).
  • If this happens, the OS processes the driver events, meaning that the different network stacks (OpenThread, lwIP, etc.) don't need to handle network device events. Thus, we are one step closer to having stack-independent network interfaces.

I'm aware this increases the RAM consumption because of the extra irq_handler thread, but the effect is mitigated if other modules (drivers, etc.) also use the irq_handler mechanism. Also, if the gnrc_netif events are processed from irq_handlers (or directly in the thread that calls gnrc's send, get and set functions), then it becomes possible to remove the gnrc_netif thread altogether. A rough sketch of the two approaches is shown below.
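To make the intended change concrete, here is a minimal sketch (not the actual patch) contrasting how NETDEV_EVENT_ISR is deferred today via IPC, roughly as gnrc_netif's event callback does, with how it would be deferred via the irq_handler module proposed in #11483. The irq_event.h header, the irq_event_t type, IRQ_EVENT_INIT, irq_event_add() and the field names used below are assumptions based on that proposal, and _netif_pid is a placeholder for the gnrc_netif thread's PID.

```c
#include "msg.h"
#include "net/netdev.h"
#include "irq_event.h"   /* hypothetical header for the proposed irq_handler module (#11483) */

static kernel_pid_t _netif_pid;   /* placeholder: PID of the gnrc_netif thread */

/* Current master: the netdev event callback defers the ISR to the gnrc_netif
 * thread via IPC. The message is only handled after a pending send finishes,
 * and it is dropped if the thread's message queue is full. */
static void _event_cb_ipc(netdev_t *dev, netdev_event_t event)
{
    if (event == NETDEV_EVENT_ISR) {
        msg_t msg = { .type = NETDEV_MSG_TYPE_EVENT, .content = { .ptr = dev } };
        msg_send(&msg, _netif_pid);
    }
    /* RX_COMPLETE etc. are then handled inside the gnrc_netif thread */
}

/* This PR (sketch): the callback hands the ISR to the dedicated, higher
 * priority irq_handler thread instead, so the received frame is read out
 * before a queued send can overwrite a shared frame buffer, and the request
 * cannot get lost (one statically allocated event per device). */
static void _netdev_isr(void *arg)
{
    netdev_t *dev = arg;
    dev->driver->isr(dev);   /* runs in the irq_handler thread's context */
}

static irq_event_t _isr_event = IRQ_EVENT_INIT;   /* assumed initializer */

static void _event_cb_irq(netdev_t *dev, netdev_event_t event)
{
    if (event == NETDEV_EVENT_ISR) {
        _isr_event.isr = _netdev_isr;   /* field names are assumptions */
        _isr_event.ctx = dev;
        irq_event_add(&_isr_event);
    }
}
```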

Testing procedure

I would recommend running some of the Release Specs tests for this one (e.g. the ICMP stress test).

Issues/PRs references

#11483

@jia200x added the labels Type: enhancement and Area: network on Oct 15, 2019
@jia200x (Member, Author) commented on Oct 15, 2019

#11256 seems to pass with this mechanism (adding an irq_handler for processing the radio ISR).

@jia200x (Member, Author) commented on Oct 15, 2019

If #12459 gets merged first, I will naturally rebase.

@jia200x requested a review from @benpicco on Oct 15, 2019
@bergzand (Member) commented:

> * The receive procedure is preemptive, which is not possible with the current approach in master: if a packet arrives while the gnrc_netif thread is already processing a GNRC_NETAPI_MSG_TYPE_SND, a NETDEV_MSG_TYPE_EVENT message will be queued and only processed after the send procedure finishes. Then the flow goes like
> `dev->driver->isr(dev)` => `dev->event_callback(NETDEV_EVENT_RX_COMPLETE)` => `dev->driver->recv(dev, ...)` /* FRAME BUFFER WAS OVERWRITTEN BY THE SEND PROCEDURE!! */

> This is the root cause of #11256. Using the irq_handler module, the packet gets processed first.

This feels to me like the current network stack is being changed to solve an issue within a single radio. Why not then use the irq_handler at the at86rf2xx radio level instead of modifying the behaviour for all radios?

@jia200x (Member, Author) commented on Oct 15, 2019

> This feels to me like the current network stack is being changed to solve an issue within a single radio. Why not then use the irq_handler at the at86rf2xx radio level instead of modifying the behaviour for all radios?

The fact is, this issue is present in all network devices. E.g. for the MRF24J40, if an RX IRQ is triggered while the CPU is processing this line, a NETDEV_MSG_TYPE_EVENT will be enqueued. That means that, right after sending, these lines will be processed. At that point, the FB was already overwritten by the TX procedure.

Radio interrupts should be processed ASAP, which is what the irq_handler mechanism is designed for.
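For illustration, here is a rough timeline of the interleaving described above for a radio whose RX and TX share one frame buffer. It is illustrative only and not taken from a specific driver:

```c
/* Illustrative interleaving only; not taken from a specific driver.
 *
 *  gnrc_netif thread                          radio hardware
 *  -----------------                          --------------
 *  handles GNRC_NETAPI_MSG_TYPE_SND
 *    prepares dev->driver->send(dev, ...)     frame arrives -> RX IRQ fires
 *    (NETDEV_MSG_TYPE_EVENT is merely         received frame sits in the
 *     enqueued, not handled yet)              shared frame buffer
 *    TX loads the shared frame buffer         <-- received frame overwritten
 *  handles NETDEV_MSG_TYPE_EVENT
 *    dev->driver->isr(dev)
 *      -> NETDEV_EVENT_RX_COMPLETE
 *        -> dev->driver->recv(dev, ...)       reads TX leftovers, not the frame
 */
```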

@miri64 (Member) commented on Oct 16, 2019

> At that point, the FB was already overwritten by the TX procedure.

Mhhh, but according to @bergzand's claim here, the RX and TX buffers are distinct in this device.

@jia200x (Member, Author) commented on Oct 16, 2019

> Mhhh, but according to @bergzand's claim here, the RX and TX buffers are distinct in this device.

Yes, he is right: some devices have different TX and RX buffers, so those radios wouldn't be affected by this problem. I know this problem is present at least in the at86rf2xx and the LoRa radios.

> This feels to me like the current network stack is being changed to solve an issue within a single radio. Why not then use the irq_handler at the at86rf2xx radio level instead of modifying the behaviour for all radios?

To continue with the discussion from before:

> This feels to me like the current network stack is being changed to solve an issue within a single radio.

Note this is not a change in the network stack but in the way we process the radio events. Processing the radio IRQ shouldn't be a task of the network stack; otherwise we end up with a lot of duplication and with stacks that only support a handful of radios (OpenThread only supports the at86rf2xx, I think only a couple of drivers work with lwIP, etc.).

> Why not then use the irq_handler at the at86rf2xx radio level instead of modifying the behaviour for all radios?

Adding this to the `at86rf2xx` driver would solve the issue.
However, as said, the main intention of this change is to process network device ISRs from the OS instead of from the network stack. That means:

  • The PHY layer can be provided by the OS: network stacks just receive data instead of dealing with radio-specific events.
  • Applications that use netdev directly don't need to care about processing radio events (so we can get rid of the _recv thread here), as sketched below.
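As an illustration of that direction (a sketch under assumptions, not part of this PR): a driver could enqueue its own irq_handler event straight from its GPIO interrupt routine, so that neither the network stack nor the application needs a thread whose only job is to call dev->driver->isr(). As above, irq_event_t, IRQ_EVENT_INIT and irq_event_add() are assumed from the #11483 proposal, and _radio_irq_task/_radio_hw_isr are hypothetical driver internals.

```c
#include "net/netdev.h"
#include "irq_event.h"   /* hypothetical header for the proposed irq_handler module (#11483) */

static netdev_t *_netdev;                            /* the driver's netdev instance */
static irq_event_t _radio_event = IRQ_EVENT_INIT;    /* assumed initializer */

/* Runs in the irq_handler thread: read the device's IRQ status and emit
 * NETDEV_EVENT_RX_COMPLETE / NETDEV_EVENT_TX_COMPLETE to whoever listens. */
static void _radio_irq_task(void *arg)
{
    netdev_t *dev = arg;
    dev->event_callback(dev, NETDEV_EVENT_RX_COMPLETE);   /* simplified */
}

/* GPIO interrupt routine of the radio: only enqueue the event, defer the work. */
static void _radio_hw_isr(void *arg)
{
    (void)arg;
    _radio_event.isr = _radio_irq_task;   /* field names are assumptions */
    _radio_event.ctx = _netdev;
    irq_event_add(&_radio_event);
}
```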

@jia200x (Member, Author) commented on Oct 18, 2019

This has become stalled due to the discussion in #12459. However, the intention is still the same: process the IRQ events of network devices through OS mechanisms instead of through the network stack. That way, network event processing is independent of the chosen network stack.

@jia200x (Member, Author) commented on Apr 15, 2020

Since #13669 is imminent, I propose to close this one.

@jia200x closed this on Apr 15, 2020