-
Notifications
You must be signed in to change notification settings - Fork 6.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix Zbus Benchmark Sample is smashing the stack and is too fast #53563
Conversation
@rodrigopex for reviews? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only small changes are necessary.
@alpsayin maybe we could leave the duration zero occurs. The only thing we need to do is to avoid the zero division and force |
I found a somewhat reliable way to measure time for Eyeballed means/deviations are 30+/-10us for Let's see what CI/CD will end up showing. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would avoid abbreviations for one-time used variables and declare variables inside ifdefs.
Stack sizes need to be dependent on something arch specific such as idle stack size because they're smashing their stacks. Signed-off-by: Alp Sayin <alp.sayin@amd.com>
…time Needed because benchmark finishes faster than a tick, especially on emulated platforms where tick rate is 100Hz. Output units are also updated to print numbers with less digits. Signed-off-by: Alp Sayin <alp.sayin@amd.com>
@alpsayin can you please fix the failing checks? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just waiting for @cfriedt to comment on the native_posix get-time approach.
|
||
#if defined(CONFIG_ARCH_POSIX) | ||
#include "native_rtc.h" | ||
#define GET_ARCH_TIME_NS() (native_rtc_gettime_us(RTC_CLOCK_PSEUDOHOSTREALTIME) * NSEC_PER_USEC) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cfriedt Could you please give us your opinion about line 17? Is this a good practice for native_posix
?
Hi @carlescufi thanks for showing interest.
2nd failure I see here is just me being somehow very unlucky:
|
Let's try it again in some minutes. |
I think we can let this one go, but double-checking with @stephanosio |
#include "native_rtc.h" | ||
#define GET_ARCH_TIME_NS() (native_rtc_gettime_us(RTC_CLOCK_PSEUDOHOSTREALTIME) * NSEC_PER_USEC) | ||
#else | ||
#define GET_ARCH_TIME_NS() (k_cyc_to_ns_near32(sys_clock_cycle_get_32())) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was seeing issues on the ARC QEMU platform where the tick count runs at 100Hz; does this API provide a higher resolution value for that case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems so
-- west build: running target run_qemu
[0/1] To exit from QEMU enter: 'CTRL+a, x'[QEMU] CPU: arcem
*** Booting Zephyr OS build zephyr-v3.2.0-3158-g7bb6557ba710 ***
I: Benchmark 1 to 8: Dynamic memory, SYNC transmission and message size 256
I: Bytes sent = 262144, received = 262144
I: Average data rate: 2.11MB/s
I: Duration: 118061.400us
@118061400
[0/1] To exit from QEMU enter: 'CTRL+a, x'[QEMU] CPU: archs
*** Booting Zephyr OS build zephyr-v3.2.0-3158-g7bb6557ba710 ***
I: Benchmark 1 to 8: Dynamic memory, SYNC transmission and message size 256
I: Bytes sent = 262144, received = 262144
I: Average data rate: 2.12MB/s
I: Duration: 117709.400us
@117709400
[0/1] To exit from QEMU enter: 'CTRL+a, x'[QEMU] CPU: hs6x
*** Booting Zephyr OS build zephyr-v3.2.0-3158-g7bb6557ba710 ***
I: Benchmark 1 to 8: Dynamic memory, SYNC transmission and message size 256
I: Bytes sent = 262144, received = 262144
I: Average data rate: 3.12MB/s
I: Duration: 80011.600us
@80011600
The package download failure has been fixed and can be ignored. The |
…POSIX Because posix emulated execution is significantly faster than posix emulated timer and benchmark finishes before even the first tick occurs. Signed-off-by: Alp Sayin <alp.sayin@amd.com>
The sample is now able to run on all previously excluded platforms. Signed-off-by: Alp Sayin <alp.sayin@amd.com>
Of course 🤦🏼 Dont need |
@rodrigopex I have a feeling I should also compare ticks for good measure. |
Merging this now, additional improvements can be done in a subsequent PR. |
@alpsayin I need to think more about it. We can discuss it in the following days. Ok? |
This doesn't fix the problem on qemu_nios2 where the cycle counter resolution is the the same as the system clock (100Hz by default). Cranking up the tick rate ( Output with minimal libc (with 1kHz tick):
Output with picolibc (with 1kHz tick):
As you can see, when using picolibc, the test completes in 4ms, well under the 10ms system tick. You should be able to verify this using this command:
I'll submit another PR that just increases the TICK value for these tests. |
That would be great. |
I checked, and we get away with (mostly by chance). In the slowest platforms, tests still finish before cycles can roll over. |
ZBus subscriber threads were smashing their threads. Upped their stack with something safe by utilising
IDLE_STACK_SIZE
for subscribers andMAIN_STACK_SIZE
for producer.Benchmark also finishes way too fast for most if not all targets. Used cycles instead of uptime as originally suggested by @rodrigopex and used nanosecond conversions to get around the
duration==0
check.Native posix still returns
start
andend
times as0
though, so with further commits utilisednative_rtc
withRTC_CLOCK_PSEUDOHOSTREALTIME
to get as much resolution as possible.