Wifi occasionally won't connect to dual-band router with same SSID (IDFGH-4064) #5935
Comments
Can confirm this is happening as well. Unfortunately, I didn't save any logs from when it was happening, but after turning all possible debug output to verbose, I found that it failed with an "auth timeout" and proceeded to blacklist the AP. That is really bad, because I believe you have to restart or reinitialize the whole wifi stack to remove your AP from that blacklist.
Yeah this is stunningly bad / problematic. Thanks for confirming I'm not crazy. I am concerned it might be a corner case based on distance to station and traffic on 2.4ghz. Anyone else confirm this? Anyone supporting esp-idf care to comment? |
I've brought a device home for the weekend to test. Here is a really verbose output. It failed to connect multiple times, but I've got my code set to retry 4 times, so it eventually connected. It practically never connects on the first try, and it did fail completely a few times, but I did not have verbose output turned on yet at that point.
Note: 'wifi_manager' is my module, so take those outputs with a grain of salt, or ignore them.
https://gist.github.com/istokm/bbce565b2492918a159abccf201d03d8
My wifi setup is as follows:
AP2 (TP-Link EAP225)
So essentially, both APs broadcast a dual-band signal with the same SSID.
After the ESP finally connected, the RSSI reported by the AP stabilized at around -59.
I've got my scan method set to fast. If it is set to all channels, the output changes to this: (unfortunately CMD trimmed the initialization, but I think most of the useful data is there)
Looks like in all instances in your gists the AP at 2.4 was found (mac f6 or 26). How are you retrying? My loop:
If I fail to connect, I just retry this code segment. Are you suggesting that if you let it retry long enough, it will connect? Because mine never connects until I temporarily turn off dual-AP mode on my Synology. As soon as I do, it instantly connects. Thanks for the work on this Marek (hope that is correct) ;) |
I'm not using Arduino, but straight ESP-IDF, so my code is a bit different, but essentially the same thing:
The timeout you see in those functions is my internal timer that stops the wifi connection automatically after 5 seconds if it hasn't connected yet. Sometimes it just hangs in a "connected" state, but I never receive an IP from the DHCP, which is when my actual code is meant to start, so that never happens.
Well, here comes the weird part. I've been working on other stuff today, so I turned off verbose output, and it refuses to connect a lot without a restart. The moment I turn on verbose log output it works "fine", like in my previous logs, connecting after a few attempts. I'm starting to think there are some timing issues in the wifi code, otherwise that wouldn't make sense (log output to UART is painfully slow).
Here are the normal failures I tend to see: https://gist.github.com/istokm/c5269a1cf7874d899ca9087d48d2ca19
Edit:
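The retry scheme described in this comment (abandon an attempt after 5 seconds without an IP, retry up to 4 times) can be sketched independently of any Wi-Fi API. This is only an illustration, not code from the thread: `ConnectPolicy`, `ConnectAction`, and their members are hypothetical names, and the caller is assumed to supply a millisecond tick count plus a flag saying whether DHCP has handed out an IP.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not an ESP-IDF or Arduino API.
enum class ConnectAction { Connected, Wait, Retry, GiveUp };

struct ConnectPolicy {
    uint32_t timeout_ms;        // per-attempt limit, e.g. 5000
    int max_attempts;           // e.g. 4
    int attempts_made;
    uint32_t attempt_start_ms;

    ConnectPolicy(uint32_t timeout, int attempts)
        : timeout_ms(timeout), max_attempts(attempts),
          attempts_made(0), attempt_start_ms(0) {
        assert(attempts > 0);
    }

    // Call right before starting (or restarting) a connection attempt.
    void begin_attempt(uint32_t now_ms) {
        ++attempts_made;
        attempt_start_ms = now_ms;
    }

    // Call periodically. "Connected" here means an IP was obtained,
    // not merely that the station associated with the AP.
    ConnectAction poll(uint32_t now_ms, bool got_ip) const {
        if (got_ip) return ConnectAction::Connected;
        // Unsigned subtraction stays correct when the tick counter wraps.
        uint32_t elapsed = now_ms - attempt_start_ms;
        if (elapsed < timeout_ms) return ConnectAction::Wait;
        return attempts_made < max_attempts ? ConnectAction::Retry
                                            : ConnectAction::GiveUp;
    }
};
```

On `Retry` the caller would tear down the current attempt, call `begin_attempt` again, and reconnect; on `GiveUp` it could fall back to whatever recovery the application has, such as reinitializing the Wi-Fi stack.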
Can you post the 2.4G and 5G configuration of the router here? I recommend the MacBook's built-in packet capture tool, which does not require a packet capture card. It can be used to capture 802.11a/b/g/n/ac packets:
Hi @HarveyRong-Esp, I tried to recreate the problem and, of course, after many attempts with 2 different boards, I have been unsuccessful. But this is REAL. Here are my router settings for my Synology AC-2600: Perhaps @istokm is able to do what you're asking. Apologies.
Have you tried to capture unencrypted wireless packets? |
Hey @HarveyRong-Esp, Find the attached pcaps for both 5GHz and 2.4GHz. I reset the esp32 during these captures and saw it re-attach successfully. Hope this helps. MacBook Air_ch3_2020-10-22_08.23.14.209.pcap.gz |
@scubachristopher, @istokm,
This seems to be a connection failure caused by the ESP device not receiving the auth within the specified time. There are two possibilities for not receiving auth: It can be analyzed by capturing packets. @scubachristopher, sorry, but because the captured packets you provided are encrypted, I cannot view the relevant information. Can you provide unencrypted wireless data packets?
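To tell an auth timeout apart from other failures in firmware, a disconnect handler can branch on the reason code. The sketch below is an assumption-laden illustration rather than ESP-IDF code: the constants are local copies whose values are believed to match `wifi_err_reason_t` in ESP-IDF (as delivered in the `reason` field of the station-disconnected event), so check them against the headers of your IDF version. It also assumes the "auth timeout" reported earlier surfaces as the auth-expire code. `FailureKind`, `classify`, and `describe` are hypothetical names.

```cpp
// Local copies of a few disconnect reason codes. Assumed to mirror
// wifi_err_reason_t in ESP-IDF; verify against your IDF version.
enum : int {
    kReasonAuthExpire = 2,
    kReason4WayHandshakeTimeout = 15,
    kReasonNoApFound = 201,
    kReasonAuthFail = 202,
    kReasonHandshakeTimeout = 204
};

// Hypothetical coarse grouping used only for logging and retry decisions.
enum class FailureKind { AuthTimeout, HandshakeTimeout, ApNotFound, AuthFailed, Other };

FailureKind classify(int reason) {
    switch (reason) {
        case kReasonAuthExpire:           return FailureKind::AuthTimeout;
        case kReason4WayHandshakeTimeout:
        case kReasonHandshakeTimeout:     return FailureKind::HandshakeTimeout;
        case kReasonNoApFound:            return FailureKind::ApNotFound;
        case kReasonAuthFail:             return FailureKind::AuthFailed;
        default:                          return FailureKind::Other;
    }
}

const char* describe(FailureKind kind) {
    switch (kind) {
        case FailureKind::AuthTimeout:      return "no auth reply from the AP in time";
        case FailureKind::HandshakeTimeout: return "key handshake timed out";
        case FailureKind::ApNotFound:       return "no matching AP found in scan";
        case FailureKind::AuthFailed:       return "authentication failed";
        default:                            return "other disconnect reason";
    }
}
```

Logging the classified reason on every failed attempt would make field reports like the ones in this thread easier to compare, even when a packet capture is not available.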
I apologize, but I currently have no way of capturing packets... I'll post a capture as soon as I can. |
Hmm -- I am able to open these in Wireshark. Not experienced with packet capture tools, but I've installed Wireshark and did a few more captures. I am able to open and view them, as well as the old ones. Can you try viewing them in Wireshark? |
Hi, @scubachristopher, can you provide some information?
Hi @HarveyRong-Esp, Tried for hours to repeat the scenario and I cannot, unfortunately. 1). sdkconfig.gz 2).
3). ESP_successful_dual.pcapng.gz I tried toggling 5GHz on the router on and off, rebooted as well. Now, of course, I cannot repeat the issue. :/ |
If someone has written a convenient ESP32 code for capturing packets I could try and get the logs and captures that you need - I just don't have the time to write a packet capture tool (that doesn't use a SD card, as I don't have any SD equipped dev boards around), and I also don't have any other packet capture capable devices. |
@scubachristopher, There is no problem with sdkconfig. It is the default configuration. Do the log and the captured packets both show a normal connection? We have purchased a Synology AC2600 on Amazon, but due to customs and other issues, it will not be delivered until the end of the month. Once it arrives, we will try to reproduce the problem locally.
Yeah all connects normally now. Unclear what was going on, apologies for not getting a pcap when it was happening. I was thinking about what @istokm said above:
I did add protection for my EEPROM-stored credentials in my project, as I do switch between different boards. Not sure if that's relevant, as I don't know what dependencies Wifi libraries have on stored data. But that might be a clue. Since I've implemented validation checks (magic string + checksum), I haven't seen it... |
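The "magic string + checksum" validation mentioned in this comment could look roughly like the following. This is a sketch under stated assumptions, not the commenter's code: the `StoredCreds` layout, the `WCFG` marker, and the choice of Fletcher-16 as the checksum are all picked for illustration, and reading or writing the record to EEPROM is left to the caller.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical record layout for credentials kept in EEPROM/flash.
struct StoredCreds {
    char magic[4];       // marker chosen for this sketch: 'W','C','F','G'
    char ssid[33];       // up to 32 bytes plus terminating NUL
    char password[65];   // up to 64 bytes plus terminating NUL
    uint16_t checksum;   // Fletcher-16 over all bytes before this field
};

static const char kMagic[4] = {'W', 'C', 'F', 'G'};

// Fletcher-16 checksum (an illustrative choice; any checksum or CRC would do).
uint16_t fletcher16(const uint8_t* data, size_t len) {
    uint16_t sum1 = 0, sum2 = 0;
    for (size_t i = 0; i < len; ++i) {
        sum1 = static_cast<uint16_t>((sum1 + data[i]) % 255);
        sum2 = static_cast<uint16_t>((sum2 + sum1) % 255);
    }
    return static_cast<uint16_t>((sum2 << 8) | sum1);
}

uint16_t checksum_of(const StoredCreds& c) {
    return fletcher16(reinterpret_cast<const uint8_t*>(&c),
                      offsetof(StoredCreds, checksum));
}

// Builds a sealed record; over-long inputs are truncated.
StoredCreds make_creds(const char* ssid, const char* password) {
    StoredCreds c;
    std::memset(&c, 0, sizeof c);
    std::memcpy(c.magic, kMagic, sizeof kMagic);
    std::strncpy(c.ssid, ssid, sizeof c.ssid - 1);
    std::strncpy(c.password, password, sizeof c.password - 1);
    c.checksum = checksum_of(c);
    return c;
}

// True only if the marker is present and the checksum matches the contents.
bool is_valid(const StoredCreds& c) {
    return std::memcmp(c.magic, kMagic, sizeof kMagic) == 0 &&
           c.checksum == checksum_of(c);
}
```

A record that fails `is_valid` (erased flash, data left over from a different board or firmware) would be ignored, so the device would fall back to its provisioning path instead of trying to connect with garbage credentials.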
Hi @scubachristopher , Our Synology AC2600 has arrived, unfortunately I cannot reproduce it locally. |
@HarveyRong-Esp, yes, my router is configured with dual-band enabled, a single SSID. I am not familiar with the ESP32's Wifi hardware, but I wonder: if the device is only capable of 2.4GHz, how could a 5GHz station create a problem? Disappointing that we both cannot reproduce it. But it is real. Deeply appreciate your effort to solve this -- thank you for putting the effort into this. |
@scubachristopher, I suspect that the router selects the 2.4G or 5G band automatically, so that 2.4G and 5G do not both exist at the same time, and it switches to one frequency band depending on the situation. When the router selects the 5G band, the 2.4G band does not actually exist. Since the ESP32 device does not support 5G, the device connection may fail.
It should be very easy to verify it. |
Guys, what I meant was "since the ESP32 doesn't support 5GHz, how is this happening?" |
Yes, esp32 doesn't currently support 5G frequency band. |
Hi @scubachristopher , |
Our customers with a dual-band router using the same SSID all have this issue, so this ticket should still be open. We use SDK v4.3.2.
Yes. The issue is still there. |
We have the same problem, but with several 2.4GHz routers with the same SSID name. Do you also have several routers (mesh) with the same name, or only one with 2.4+5GHz?
Yes, confirming that is still present. I was unable to connect until I disabled the 5ghz on my router. More information that I found during my tests in order to successfully connect to my router:
I hope this will help in finding a working solution.
Hi - adding to this thread here with an additional use case: trying to get ESPresence working with a Google WiFi network (no control over 2.4GHz / 5GHz bands, multiple 2.4GHz BSSIDs).
In this environment, if I successfully set up a device in my office and then try to move it to another room, there's a very small chance it will wake up and associate properly. If I instead go to a physical location where I plan to deploy an ESP32 sensor and then reset the wireless from there, I can improve those odds, but it would be nice to be able to just give an SSID and have the ESP32 manage across all the different BSSIDs associated with that SSID.
My ESP: AITRIP 5PCS ESP-WROOM-32 ESP32 ESP-32S Type-C USB
Here's a little more info on my network and my setup:
scubachristopher - am I missing something here? I looked for another ticket, but couldn't find one - if this is appropriate, should we reopen or start a new issue?
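One application-level workaround for the multi-BSSID case is to scan all channels, pick the strongest 2.4GHz BSSID that advertises the target SSID, and connect to that BSSID explicitly. The selection step is sketched below as plain C++ so it can be read without the Wi-Fi headers. `ScanRecord` and `pick_bssid` are hypothetical names, and the records are assumed to have been copied out of whatever scan API the firmware uses. This is a possible mitigation, not a confirmed fix for the issue in this thread.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified view of one scan result.
struct ScanRecord {
    std::string ssid;
    std::array<uint8_t, 6> bssid;
    int channel;  // primary channel number
    int rssi;     // signal strength in dBm (closer to 0 is stronger)
};

// Returns the index of the strongest 2.4GHz record with the given SSID,
// or -1 if none matches. Channels 1-14 are treated as the 2.4GHz band.
int pick_bssid(const std::vector<ScanRecord>& records, const std::string& ssid) {
    int best = -1;
    for (std::size_t i = 0; i < records.size(); ++i) {
        const ScanRecord& r = records[i];
        if (r.ssid != ssid) continue;                   // different network
        if (r.channel < 1 || r.channel > 14) continue;  // not 2.4GHz
        if (best < 0 || r.rssi > records[static_cast<std::size_t>(best)].rssi) {
            best = static_cast<int>(i);
        }
    }
    return best;
}
```

The chosen BSSID and channel could then be passed to the connect call, for example through the `bssid` and `bssid_set` fields of the station config in ESP-IDF or the optional channel and BSSID arguments of `WiFi.begin()` in Arduino. Repeating the scan and selection after a disconnect would let a relocated device move to a closer access point.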
@HarveyRong-Esp @Alvin1Zhang |
Yeah, I'm also experiencing the same. It's very frustrating and causes big headaches because now it's standard. |
So I might be experiencing this at a few sites, but I haven't looked into it deeply yet. I do know that a site using Ubiquiti routers seems to cause this a lot more often than other routers so far.
We are experiencing a similar issue with the Google Nest Wifi. Unfortunately, we can't get enough debug data, as the device is in the field and we can't get any logs. The behaviour we are seeing is this: every time the device reboots, it connects successfully with the given credentials, then drops the connection 10-15 minutes after the reboot and fails to connect back again.
We also have hundreds of cases where customers are having this issue on select equipment in dual-band mode, and we have not been able to track it down to anything specific. It only occurs with our esp32-wroom product. Other 2.4GHz-only devices have no issues.
Hi @bad-jesus @abhishekbn @Irfan93, could you provide some details? I wonder whether there is just one dual-band router or several routers.
@Xiehanxin: Are you joking? Is there not enough data above on this issue that has hung out there for 3 YEARS? |
Going to have to agree here. We have issues with a lot of ISP specific routers here in Canada and our US customers also experience issues with various products including Cisco/Meraki and Ubiquiti. We find that these brands are MORE stable but always rock solid when bands are split into different SSID's. A lot of cheaper products by Linksys also have this issue. TP link seems to be rock solid across the product types we have tried. Orbi also has this issue. You should not need any of us to provide more data at this point. It's been years, it's THOUSANDS of conflicts. I am not sure what more you need. The Espressif C5 can not come out fast enough. It keeps being pushed and it's ridiculous. We should have this in testing NOW. |
It is not router specific but VERY device specific. I have four identical ESP32-S3FN8 sitting next to each other, and three will perform an initial connect every time to my dual-band router. One will not. Multiple retries after a delay do get it to connect, because it seems to work only after it has first failed to work. WiFi.begin() with no SSID or PASSWORD after a successful login is much more reliable. I have not worked with the ESP32 since the first version, but I remember having this same issue years ago. Figured it would have been fixed by now.
Hi, has anyone found a solution?
@tibbis I am getting the same issue as you: when it's a mesh setup, it is really unstable. Turn off the mesh (i.e. only one wifi access point) and it works fine. Not sure how wifi drivers are supposed to handle mesh networks, but it seems the ESP32-S3 has an issue with it.
Environment
Development Kit: ESP32-Wrover-Kit (Hazzah Feather32)
Module or chip used: ESP32-WROOM-32
IDF version 2.0 (Platformio)
Build System:
PlatformIO Core 5.0.1
Python 3.8.4-final.0
System Type darwin_x86_64
Platform macOS-10.15.7
platformio.ini
[env:esp32dev]
platform = espressif32
board = featheresp32
framework = arduino
Compiler version: xtensa-esp32-elf-gcc (crosstool-NG crosstool-ng-1.22.0-80-g6c4433a) 5.2.0
Operating System: macOS
IDE: MS Code / Platformio 5.0.1
Power Supply: USB
Problem Description
I have a Synology AC2600 dual-band router advertising a single SSID on 2.4 and 5ghz. OCCASIONALLY (no firmware change), it will simply not connect to Wifi. MOST of the time, it does. When it fails, it is consistent and has the same behavior if I re-flash or even clear the flash. I have a web server running on the softAP that allows me to change the credentials.
I have a Pi that is in AP mode (2.4ghz) and it connects immediately to that. Switch back the creds to the dual-band and it won't connect.
Flashed the same firmware to my Olimex ESP32-EVB and it connected fine. I believe, eventually, it would exhibit the same behavior. I reflashed the same firmware to the Feather32 and the behavior was the same -- won't connect to dual-band. Perhaps this is because the MAC's are different.
So I decided to test this by watching the serial output and disabling my 5ghz advertisement and BAM, it immediately connects.
This is a SUBSTANTIAL issue because these instruments will be in the field connecting to a local AP that may be dual-band as well.
Expected Behavior
It should consistently connect to a dual-band wifi AP with a 2.4ghz station.
Actual Behavior
It OCCASIONALLY, repeatedly, won't connect.
Steps to reproduce
I've seen posts about people solving this by turning off their 5ghz radio on their router as a solution, which is not a solution. Since it's occasional, seems it might be triggered by distance from AP and how crowded the 2.4ghz channel is with other traffic -- dunno.
Code to reproduce this issue
Debug Logs