wifi issues with default compiler optimization flags #288

Open
mamama1 opened this issue Dec 15, 2022 · 3 comments

@mamama1

mamama1 commented Dec 15, 2022

Hi

We have stumbled across WiFi issues which we were able to mitigate either by adding a delay(1) directly in the main loop (i.e. effectively allowing background tasks like WiFi handling to run) or by adding serial output messages at random places.

We found that one simple for loop (iterating from 0 to 10) seems to add to the issue. However, 10 iterations shouldn't really block the ESP8266 for very long, especially since we're only checking a few very simple conditions there (mostly comparing bool variables). Adding a serial output in a place which isn't even executed 99.9% of the time also mitigates the WiFi issue.

So we came to the conclusion that some compiler optimizations must be messing with us, and we tried -O2 instead of the default -Os compiler flag. With that, WiFi works flawlessly, without adding any delays or serial outputs to our code.
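For anyone who wants to try the same swap in a PlatformIO project, a minimal platformio.ini sketch could look like the following. The environment name and board are placeholders, and it assumes that -Os is the optimization flag set by the platform's build script, as described above. `build_unflags` and `build_flags` are the standard PlatformIO options for removing and adding compiler flags.

```ini
; platformio.ini (sketch; environment name and board are placeholders)
[env:d1_mini]
platform = espressif8266
board = d1_mini
framework = arduino
; remove the default size optimization
build_unflags = -Os
; and build with -O2 instead
build_flags = -O2
```

A clean rebuild is advisable after changing these flags, so that no object files compiled with the old setting remain.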

Since our code is huge and complex, I am not able to post a minimal sketch to reproduce the problem. Any small change to the code can completely change the behaviour. But as an example, I can demonstrate which completely illogical changes make WiFi processing work again:

	for (uint8_t n = 0; n < RFM69_TX_QUEUE_LENGTH; n++)
	{
		// delay(1);
		// Only work on packets where NewPacket = true and if they have retries left.
		// Do not work on Packets where ACKReceived = true, since those have already 
		// been sent AND ACKed by the peer. They are only waiting to be cleared by user code.
		if (this->TXQueue[n].NewPacket == false || this->TXQueue[n].TXRetries == 0 || this->TXQueue[n].ACKReceived == true)
			continue;
		
		// LOG("%u", n);
		// ADDITIONAL STUFF HAPPENS HERE....
	}

So this for loop processes waiting packets in a TX queue. Most of the time there are no packets in that queue, so the loop continues right after the first if statement. With the code above and the default compiler flag -Os, WiFi is not working (not connecting).
If I uncomment the LOG output AFTER the if statement (which just continues most of the time, so the LOG doesn't even get executed anyway), WiFi suddenly connects and works again. If I leave the code as it is and use -O2 instead of -Os, WiFi starts working immediately as well.
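As a point of comparison for the delay(1) workaround, the loop can also hand control back to the core explicitly with yield(), which on the ESP8266 Arduino core lets the WiFi and system tasks run. The sketch below is only an illustration: `TXPacket`, `RadioSketch`, `processTXQueue` and the queue length of 10 are hypothetical stand-ins for the real types, and the host-side `yield()` stub exists only so the logic can be compiled off-target. It would not by itself explain why -O2 and -Os behave differently.

```cpp
#include <stdint.h>

#ifdef ARDUINO
#include <Arduino.h>  // provides yield() on the ESP8266 Arduino core
#else
// Host-side stand-in (assumption) so the logic can be tried on a PC.
static unsigned yieldCalls = 0;
static void yield() { ++yieldCalls; }
#endif

// Hypothetical stand-ins for the types used in the snippet above.
constexpr uint8_t RFM69_TX_QUEUE_LENGTH = 10;

struct TXPacket {
    bool NewPacket = false;
    uint8_t TXRetries = 0;
    bool ACKReceived = false;
};

struct RadioSketch {
    TXPacket TXQueue[RFM69_TX_QUEUE_LENGTH];

    // Walks the queue once and returns how many packets still need work.
    uint8_t processTXQueue() {
        uint8_t pending = 0;
        for (uint8_t n = 0; n < RFM69_TX_QUEUE_LENGTH; n++) {
            const TXPacket &p = this->TXQueue[n];
            // Same skip condition as in the original loop.
            if (p.NewPacket == false || p.TXRetries == 0 || p.ACKReceived == true)
                continue;
            pending++;  // the real per-packet work would go here
        }
        yield();  // hand control back to the core once per pass
        return pending;
    }
};
```

Calling yield() once per pass, rather than once per iteration, keeps the cost low while still giving the background tasks a regular chance to run.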

LOG is a macro which calls Serial.printf(), nothing special. Seeing that a change in a completely unrelated place affects WiFi led us to the conclusion that compiler optimizations must be messing with us, and to me it looks like this is indeed the case.

Without deeper analysis of our code - can it be generally said that -Os can be problematic? Is our code probably the issue? What should we be looking for? Is there a way to find out what exactly is blocking wifi from connecting correctly? Should we just use -O2 and be happy?

Can someone please advise whether this is probably an issue with the compiler optimizations (and not our fault) or whether we should dig deeper into our code, and give us some directions?

Thanks!
PS: I made a GitHub issue instead of a forum post because maybe this is really related to the chosen compiler optimization flags.

@TD-er

TD-er commented Jul 18, 2023

Hmm, too bad this post didn't get any reply, as I'm really curious whether others have seen similar behavior.

I can at least confirm that the WiFi stability on ESP8266 may appear to be completely unpredictable from build to build and even a small completely unrelated change somewhere in the code seems to 'fix' or 'break' WiFi code as you also described.

I always thought there might be some bug related to (string) buffer size being one byte too small or not being 0-terminated somewhere which depends on the order of linking object files. But changing optimization flags may also be a good explanation.
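To make the suspected class of bug concrete: a buffer declared as `char ssid[4]` that receives "wifi" through strcpy() gets 5 bytes written to it (4 characters plus the terminating '\0'), and which neighbouring variable is overwritten can depend on memory layout and link order. The sketch below shows one defensive pattern using the standard snprintf(). The helper name `copyTerminated` is hypothetical and not taken from any code in this thread.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical helper: copies src into dst, never writes more than dstSize
// bytes, and always leaves dst 0-terminated (when dstSize > 0).
// Returns true only if the whole string fit without truncation.
bool copyTerminated(char *dst, std::size_t dstSize, const char *src) {
    if (dst == nullptr || src == nullptr || dstSize == 0)
        return false;
    // snprintf returns the length the full string would have needed,
    // not counting the terminating '\0'.
    int needed = std::snprintf(dst, dstSize, "%s", src);
    return needed >= 0 && static_cast<std::size_t>(needed) < dstSize;
}
```

With a 5-byte buffer, "wifi" fits exactly, while a 5-character string is truncated to 4 characters and the function reports false instead of overrunning the buffer.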

@valeros
Member

valeros commented Jul 19, 2023

Not sure how to help here, as we use the same optimization flags as the Arduino core for ESP8266. Without a minimal project that works with the Arduino IDE and doesn't with PlatformIO, we cannot confirm that this behavior has something to do with PlatformIO's build process.

@valeros
Member

valeros commented Jul 19, 2023

@mcspr correctly suggested that an outdated local toolchain package might be the culprit. The new platform v4.2.1 contains a stricter minimal toolchain version requirement so that PlatformIO is forced to use the latest packages with bugfixes.
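For projects that want to make sure they pick up this platform release, a version requirement can be added to the `platform` option in platformio.ini. The fragment below is a sketch: the environment name and board are placeholders, and only the 4.2.1 version number comes from the comment above.

```ini
; platformio.ini (sketch; environment name and board are placeholders)
[env:d1_mini]
; require platform release 4.2.1 or a newer compatible 4.x release
platform = espressif8266@^4.2.1
board = d1_mini
framework = arduino
```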
