Replies: 4 comments 4 replies
-
I was a bit concerned about the interrupt handling in this as well (I simply ported this from a C++ module with very few structural changes). Since LoRa is not a reliable transport, all apps using it should already be implemented to understand that packets might be lost. Given that, unless this causes some other interesting fault (exception? crashed radio? problems waking from sleep?), or it happens frequently, I'm not sure I'd spend much time worrying about it. If it is a frequent problem, a better answer might be to detect that the RX is pending and queue the TX radio transition to occur upon the RX completion. This sounds like work, especially considering the amount of testing that would be needed to verify its reliability. If not a common problem, yet the failure mode causes behavioral problems on following RX/TX packets (meaning, not just the current lost RX and/or TX packet), then perhaps the fix can be simplified to just focus on solving that behavior and not trying to recover the packet. In general, I'd not be in favor of throwing exceptions given that packets are commonly lost anyway and therefore any LoRa app requires handling of that condition already. |
Beta Was this translation helpful? Give feedback.
-
Not that I've ever seen for any of the issues at hand. They provide app notes around physical board layout, antenna matching, and lots of stuff around the LoRa modulation choices. Something that occurred to me is that in the original C++ driver the RXDone interrupt handler most likely was an actual interrupt handler that took care of the received packet in some form and thus the worst of the race conditions (packet received and then clobbered by TX) probably didn't occur in the C++ version. |
Beta Was this translation helpful? Give feedback.
-
In my use case, it will almost never happen. Given the legal requirements (FCC, etc) of the LoRa frequencies, LoRa devices generally use very little time transmitting. I see how in a highly congested area this could be a meaningful problem, but for me, it's a time/value equation - given my current understanding of the problem and my use case, I'm not seeing enough value to put in the time to try to fix it. However ... I totally agree that it would be best to fix this race condition. Might you consider taking on the project? Perhaps start by creating a test that can (somewhat) consistently create the race condition, so the fix can be verified that it doesn't introduce any new failure modes (as that's what worries me most - having special code that attaches to a race condition thereby increasing the risk for edge-case stability issues). |
Beta Was this translation helpful? Give feedback.
-
I believe that the sx127x LoRa radio driver has some race conditions WRT RX/TX switching. I haven't used the driver but have written an FSK driver for that chip, which reminded me of the issue and I looked at the LoRa driver source code to see how it handled things. So perhaps I'm interpreting the code incorrectly and all this is a non-issue :^)
Suppose the app performs the following on a periodic timer:
Assume
enableReceiver = true
in the radio config, for example, the app expects to receive commands via packets.Now suppose a packet is received by the radio during
read_sensors()
orformat_packet()
. The RX interrupt will fire and the interrupt pin's onReadable callback will be queued. The packet is sitting in the FIFO. But the app continues and callsradio.write()
which switches the radio to TX mode, overwrites the FIFO, and transmits.As soon as the app returns from the periodic timer the pin onReadable callback is called. The radio driver thinks TX is done since the radio is in TX mode and ends the TX and switches to RX mode. The result is that the received packet is lost and the transmitted packet is likely aborted before it is fully transmitted (exact outcome depends on timing).
The above situation is not that likely to occur because most uses of the chip are sensors that TX only and that RX very rarely, if ever, and have specific time windows for that where they can suppress TX.
Fixes to consider:
writeIf
method that performs above checks and has a return value to indicate whether TX happened or not, as opposed to throwing an error inwrite
writeIf
returns that TX is not possible (orwrite
throws), guarantee that a freshonWritable
callback is made when TX is possible againBeta Was this translation helpful? Give feedback.
All reactions