-
Notifications
You must be signed in to change notification settings - Fork 250
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HIL testing issues on different targets #1524
Comments
There are FIFO access issues on the S2 with uart, maybe there is something similar here? |
They could definitely be related, is there a tracking issue for that? |
There have been a string of failures for the ESP32-S3, so to keep things moving I have removed the requirement for this check from the merge queue configuration. Once the issue has been resolved we will have to re-add this requirement. |
Did some testing locally and, looks like ❯ git diff
error: cannot run less: No such file or directory
diff --git a/esp-hal/src/spi/master.rs b/esp-hal/src/spi/master.rs
index e6f407a6..a006fc63 100644
--- a/esp-hal/src/spi/master.rs
+++ b/esp-hal/src/spi/master.rs
@@ -2036,7 +2036,6 @@ where
self.update();
reset_dma_before_load_dma_dscr(reg_block);
- self.clear_dma_interrupts();
tx_chain.fill_for_tx(false, write_buffer_ptr, write_buffer_len)?;
tx.prepare_transfer_without_start(self.dma_peripheral(), tx_chain)
.and_then(|_| tx.start_transfer())?;
@@ -2044,6 +2043,7 @@ where
rx.prepare_transfer_without_start(self.dma_peripheral(), rx_chain)
.and_then(|_| rx.start_transfer())?;
+ self.clear_dma_interrupts();
reset_dma_before_usr_cmd(reg_block);
reg_block.cmd().modify(|_, w| w.usr().set_bit()); |
That's odd, when I run it on my S3 it works just fine. What the output when you run |
Here is the full log: Full log
The error occurs both when running the whole test suite and only the |
Ah the stack doesn't show where the panic happened. I may have some time this evening to look further. It's rather odd that this only fails on the S3 but only sometimes, and undoing it breaks async spi on the Esp32 |
I was able to reproduce the failing HIL test and have done a bit of digging. (sorry for the wall of text, I wasn't planning on writing this much 😬 ) The quick fix for the HIL tests is to do This was hinting that there was some kind of lingering broken state in the SPI/DMA driver. I would've made a PR for the quick fix but it's not quite appropriate. Even if we make the HIL tests pass by resetting the GDMA peripheral, the reality is users don't have this luxury. Once they enter the broken state the spi driver stops working as expected. I've narrowed down the broken state to be coming from My current thinking for a proper solution is to remove support for asymmetric transfers in the base driver. The hardware doesn't support it directly. An asymmetric transfer should be implemented as two transfers, one full duplex symmetric and one half duplex. i.e. Transfer + Write or Transfer + Read. (This also makes error handling a bit better since asymmetric transfers trigger the error interrupts) Thoughts? |
One thing I discovered while looking into another (unrelated) future HIL test PR: apparently JTAG reset (which is done by the HIL tests) doesn't reset peripherals on all chips (it does on e.g. ESP32-C6/ESP32-C3 but doesn't on ESP32-S3 - probably it's different for Xtensa and RISC-V based chips).
I added the reset also for I2C which tends to get into weird states on errors. I agree we probably should do it for all drivers - I'll open an issue for that |
probe-rs doesn't yet do a proper system reset for Xtensa targets, that can explain why peripherals aren't reset. It's coming, but I need someone to merge changes over there to progress :) |
Just updated the initial list, only two issues left! |
Don't worry @SergioGasquez we'll have more once ESP32 is on line 🙃 |
The issue is, RcFast is not running. Fixed in #2170 |
ESP32-S2: #2098 The main issue is PDMA writing more data than expected. If the buffer's size is not 1 mod 4, PDMA will transfer a whole word, regardless of how many bytes we asked for. "Fixed" by #2179, by ensuring that DMA buffers are always a) aligned and b) a multiple of 4 bytes, regardless of what the user specified. |
I'm not sure what's wrong here, the tests are enabled and passing on ESP32 |
They were fix with #2131, I will tick them. |
I2S on ESP32-S2 just needed enabling, done in #2181. |
I2S on ESP32 generates two wrong clock pulses. This made the receiver reject the first sample pair, causing the test to fail. Issue worked around in #2194 by assigning the first halfword to the right channel and by outputting the right channel first. Receivers may still get freaked out by the 2 extra clock pulses they encounter before the first sample, but I'm not sure what to do about that. |
If a HIL test fails, would be sensible to automatically rerun it with |
Yeah that's something we can try. But if a test is spotty, the second run might just succeed. Same if the failure relies on some timing circumstances. In case of I2S I'm suspecting dodgy wiring or some default pull resistor getting in the way. |
Ah fair point. The 2nd run's output may actually be misleading too.
I was going to suggest reducing the I2S frequency in case it's that but 16kHz is already quite low. |
Getting useful backtraces would be good |
I would like to solve this in probe-rs if possible, instead of hacking around with reruns and relying on defmt that introduces timing jitter. Even if that timing jitter is actually a good way to inject errors and we might want to be resilient there :D |
Stack traces will be fixed with probe-rs/probe-rs#2831 and probe-rs/probe-rs#2829 so I'll try an update once they are in. |
Failing/Disabled Tests
aes_dma
support #2312dma_mem2mem
support #2313embassy_interrupt_spi_dma
: C2, C3, C6, H2 - need to use a different DMA-enabled peripheral #2289interrupt
HIL tests #2314Solved issues
gpio::test_gpio_interrupt
(GPIO testtest_gpio_interrupt
is failing for Xtensa targets #1413)get_time
(ESP32-H2:Delay::delay_millis
delays too short #1509)i2s
(ESP32-H2: I2S Clock is way too slow #1637): Enablei2s
HIL test foresp32h2
#1755delay
,embassy_timers_executors
andget_time
tests are failing.esp32h2
. #1535spi_full_duplex_dma
test_symmetric_dma_transfer_huge_buffer
If we add the following code:
test_symmetric_dma_transfer
fails if previous test failedIm not sure what triggers this, but I think that when some test failed, the init state is bad and the first test of
spi_full_duplex_dma
fails.spi_half_duplex_write
spi_full_duplex
fails (after the second run iteration). Might be related to thespi_full_duplex_dma
on S2 issue listed above.delay
Intermittently fails the
delay_700millis
test. See https://github.com/esp-rs/esp-hal/actions/runs/8880612772/job/24381385302uart::test_send_receive_different_baud_rates_and_clock_sources
When using the
RcFast
clock source, the code just hangs in https://github.com/esp-rs/esp-hal/blob/main/esp-hal/src/uart.rs#L1108 and then, the test times out. Thereg_update
should:Software write 1 would synchronize registers into UART Core clock domain and would be cleared by hardware after synchronization is done.
but looks like its never cleared because it's not synchronized.spi_full_duplex_dma_async
andspi_full_duplex_dma_pcnt
: fixed with Fix SPI DMA write/read for ESP32, ESP32-S2 #2131i2s
issuei2s
tests fail. Workaround: Enable I2S tests on ESP32 and work around first sample issue #2194delay
: H2: H2: enable TWAI, enable delay test #2199dma_macros
: 32, S2 - Slight general cleanup, enable dma-macros test, allow using virtual mem2mem channel on c2 #2200qspi_read
reads only zeros. Test issue, fixed in Enable QSPI tests on ESP32 and clean up #2198qspi_write
test doesn't make sense on ESP32, panics here because is usesCommand8
(qspi_read_write
combination last two). Test updated in Enable QSPI tests on ESP32 and clean up #2198twai
The text was updated successfully, but these errors were encountered: