LITESUN_LA_WF3/BNC-50 device won't boot with firmware newer than Jan 2021. #2436

Closed
davebuk opened this issue Apr 12, 2021 · 13 comments

@davebuk
Contributor

davebuk commented Apr 12, 2021

As the device was not booting correctly after trying the WiFi reconnect code additions, like you said, maybe something got corrupted when I did the factory reset. Anyway, thanks for the help. I'll get back to testing the latest code over the next few days.

Originally posted by @davebuk in #2433 (comment)

Following on from trying to test the WiFi re-connect changes, I have been trying to get the latest dev firmware on this device. It's an HBN BNC 50 (#2022 (comment)), and until now I've been building with the following config and the latest platform:

#elif defined(DB_HBN_SKT)    // LITESUN_LA_WF3

    // Info
    #define MANUFACTURER        "HBN"
    #define DEVICE              "BNC50"
	
    //My config
    #define ALEXA_SUPPORT           0
    #define DOMOTICZ_SUPPORT        0
    #define HOMEASSISTANT_SUPPORT   0
    #define THINGSPEAK_SUPPORT      0

    // Buttons
    #define BUTTON1_PIN         13
    #define BUTTON1_CONFIG      BUTTON_PUSHBUTTON | BUTTON_DEFAULT_HIGH
    #define BUTTON1_RELAY       1
    #define BUTTON1_PRESS       BUTTON_ACTION_NONE
    #define BUTTON1_CLICK       BUTTON_ACTION_TOGGLE
    #define BUTTON1_DBLCLICK    BUTTON_ACTION_NONE
    #define BUTTON1_LNGCLICK    BUTTON_ACTION_NONE
    #define BUTTON1_LNGLNGCLICK BUTTON_ACTION_NONE

    // Relays
    #define RELAY1_PIN          12
    #define RELAY1_TYPE         RELAY_TYPE_NORMAL

    // LEDs
    #define LED1_PIN            4  // 4 blue led
    #define LED1_MODE           LED_MODE_WIFI
    #define LED1_PIN_INVERSE    1
    #define LED2_PIN            5  // 5 red led
    #define LED2_MODE           LED_MODE_RELAY
    #define LED2_PIN_INVERSE    1

Even with the latest dev commit (dfba0de) loaded via serial, the device doesn't boot and nothing is shown via the PuTTY/serial connection, even @ 74880. Power cycling the device force-closes the serial connection, so I can't see any boot data.

I have a .bin file running 1.15.0-dev8e659c94 that I can flash via serial or web OTA and it works fine, but the current dev just doesn't seem to work. The button doesn't do anything, the LEDs aren't coming on, and there are no signs of any WiFi, espurna AP or connection to my network.

Should I look to try and build various firmwares between this working build and the current dev and try to work out which code is making the device fail?

@mcspr
Collaborator

mcspr commented Apr 12, 2021

Should I look to try and build various firmwares between this working build and the current dev and try to work out which code is making the device fail?

2 points of interest:

  • does it work after erasing the last 16KiB of flash? need to back it up first though. for the 1MB chip, where 0xfc000 is the flash addr and 0x4000 is the size in hex:
> esptool.exe read_flash 0xFC000 0x4000 sdk.bin
> esptool.exe erase_region 0xFC000 0x4000
  • can you share the build dir firmware .bin, .elf and .map files? if this is something with the firmware build itself, that may help as well (take note of the built-in settings though)
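
The 0xFC000 address in the first point follows from the flash size: the region starts at the total size minus the region size. A small sketch of that arithmetic, assuming a 1 MB chip is exactly 0x100000 bytes; the constant and function names are illustrative and come from neither esptool nor ESPurna:

```cpp
#include <cstdint>

// Illustrative names only; sizes are in bytes.
constexpr std::uint32_t FlashSize = 0x100000;  // 1 MB chip (assumed)
constexpr std::uint32_t RegionSize = 0x4000;   // 16 KiB

// Start offset of the last `region` bytes of a `total`-byte flash.
constexpr std::uint32_t lastRegionOffset(std::uint32_t total, std::uint32_t region) {
    return total - region;
}

// Checked at compile time: 0x100000 - 0x4000 == 0xFC000.
static_assert(lastRegionOffset(FlashSize, RegionSize) == 0xFC000,
              "last 16 KiB of a 1 MB flash starts at 0xFC000");
```

By the same arithmetic, a 4 MB chip would put the region at 0x3FC000, so the address in the commands needs adjusting for other flash sizes.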

@davebuk
Contributor Author

davebuk commented Apr 13, 2021

db-hbn-skt.zip
Here are the build files.

I'll look at the other test later.

A build using current dev on a 4MB board works fine. I haven't tried any other 1MB boards. I do have a spare Sonoff basic I can try as well.

@mcspr
Collaborator

mcspr commented Apr 13, 2021

Testing the .zip binary and a custom build, it looks like it actually boots ok but stops somewhere around the code quoted below the log and triggers the hardware watchdog
(based on the logs, at least)

Hardware WDT Stack Dump - enabled

...
[001422] [WIFI] Initial
[001434] [MAIN] Uptime: 00y 00d 00h 00m 01s

 ets Jan  8 2013,rst cause:4, boot mode:(3,7)

wdt reset
load 0x4010f000, len 3460, room 16
tail 4
chksum 0xcc
load 0x3fff20b8, len 40, room 4
tail 4
chksum 0xc9
csum 0xc9
v00089890
~ld


Hardware WDT reset


>>>stack>>>

ctx: sys
sp: 3fffeb30 end: 3ffffd30 offset: 0000
...etc...

if (mask & heartbeat::Report::Freeheap) {
    auto stats = systemHeapStats();
    DEBUG_MSG_P(PSTR("[MAIN] %5u / %5u bytes available (%5u contiguous)\n"),
        stats.available, systemInitialFreeHeap(), stats.usable);
}

Removing the block seems to work around the issue. However, these still trigger the wdt since they use the same API:

  • calling terminal command heap
  • connecting to the mqtt and sending freeheap data
  • publishing influxdb data

It does not seem to be able to calculate the heap stats.

@mcspr
Collaborator

mcspr commented Apr 13, 2021

There seems to be some kind of memory corruption going on. Adding Core build flags -DDEBUG_ESP_PORT=Serial -DDEBUG_ESP_OOM enables a more strict environment and it ends up crashing much earlier, but now it is showing something related to the UDPContext.h. This code is only used from three places by default - ArduinoOTA, MDNS and Alexa. Config above already disables Alexa, disabling the rest:

#define OTA_ARDUINOOTA_SUPPORT 0
#define MDNS_SERVER_SUPPORT 0

stops the crashing.

I'll try to figure out what is going on though, since both are kind of nice to have enabled by default.

@davebuk
Contributor Author

davebuk commented Apr 13, 2021

A quick test with the defines below still doesn't allow the device to boot.

#define OTA_ARDUINOOTA_SUPPORT 0
#define MDNS_SERVER_SUPPORT 0

@mcspr
Collaborator

mcspr commented Apr 13, 2021

diff --git a/code/espurna/led.cpp b/code/espurna/led.cpp
index 4eadf41b..6269f2f7 100644
--- a/code/espurna/led.cpp
+++ b/code/espurna/led.cpp
@@ -339,7 +339,7 @@ std::vector<size_t> _led_relays;

 void _ledConfigure() {
 #if RELAY_SUPPORT
-    _led_relays.resize(relayCount(), RelaysMax);
+    _led_relays.resize(_leds.size(), RelaysMax);
 #endif

     for (size_t id = 0; id < _leds.size(); ++id) {

?

Hope that's all that needs to happen

@davebuk
Contributor Author

davebuk commented Apr 13, 2021

#define OTA_ARDUINOOTA_SUPPORT 0
#define MDNS_SERVER_SUPPORT 0

removed from the build and the above change made. Device running 1.15.0-dev.gitdfba0de4 now boots and runs.

The LEDs were set to #0 GPIO 4, OFF/Relay 0 and #1 GPIO 5, Switch status/Relay 1. I changed #0 to Switch Status/Relay 0 and #1 to Switch Status/Relay 0 and tried the toggle. The relay works but the LEDs don't. I did have one reboot where info gave:

ESPURNA 1.15.0-dev.gitdfba0de4 built 2021-04-13 21:56:42
mcu: esp8266 chipid: DC4F22D33EF9
sdk: 2.2.2-dev(38a443e) core: 2.7.4
md5: d7ed11804e1dbab7ea409d02df3c71b0
support: API BUTTON DEBUG_SERIAL DEBUG_TELNET DEBUG_WEB LED MDNS MQTT NTP ARDUINO_OTA OTA_CLIENT RELAY SCHEDULER TELNET TERMINAL WEB 
last reset reason: Exception
extra info: Fatal exception:3 flag:2 (Exception) epc1:0x40100c48 epc2:0x00000000 epc3:0x00000000 excvaddr:0x4004f361 depc:0x00000000

latest crash was at 75580 ms after boot
Reason of restart: 2

Exception (3):
epc1=0x40100c48 epc2=0x00000000 epc3=0x00000000 excvaddr=0x4004f361 depc=0x00000000

>>>stack>>>

ctx: todo
sp: 3fffee60 end: 3fffffb0 offset: 0000
3fffee60:  40000f68 00000030 0000000b ffffffff 
3fffee64:  40000f58 00000000 00000020 00000000 
3fffee68:  3ffee7e8 7fffffff 00000000 00000001 
3fffee6c:  0000017f 00000000 00000020 40100e30 
3fffee70:  00000000 3fffdcc0 3ffea838 00000030 
3fffee74:  00000000 00000000 00000000 4010073c 
3fffee78:  4024dd75 3ffeb04f 00000000 40265e38 
3fffee7c:  00006208 7f5e0001 be10faff 3ffef318 
3fffee80:  00000008 3fff189c 3ffee158 4024dc8c 
3fffee84:  3ffee158 402523f0 3ffee158 3ffebc54 
3fffee88:  3ffebc54 000001a5 00000000 00000022 
3fffee8c:  00000002 00000018 4025bddb 3ffee158 
<<<

Changing each LED in turn to Always ON and saving, the RED or BLUE LED would light, so they are being controlled on their GPIOs, but they don't follow the RELAY.

@davebuk
Contributor Author

davebuk commented Apr 13, 2021

And keys give:

> ledMode0 => "8"
> ledMode1 => "8"
> ledRelay0 => "0"
> ledRelay1 => "0"

@mcspr
Collaborator

mcspr commented Apr 14, 2021

@@ -512,8 +545,6 @@ void ledSetup() {

     DEBUG_MSG_P(PSTR("[LED] Number of leds: %u\n"), leds);
     if (leds) {
-        _ledConfigure();
-
 #if MQTT_SUPPORT
         mqttRegister(_ledMQTTCallback);
 #endif
@@ -526,13 +557,15 @@ void ledSetup() {
 #endif

 #if RELAY_SUPPORT
-        relaySetStatusNotify([](size_t, bool) {
-            ledUpdate(true);
+        relaySetStatusChange([](size_t, bool) {
+            _led_update = true;
         });
 #endif

         espurnaRegisterLoop(ledLoop);
+
         espurnaRegisterReload(_ledConfigure);
+        _ledConfigure();
     }
 }

for relays? does not look like something crash-able though.
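
The second hunk swaps a direct `ledUpdate(true)` call inside the relay callback for a flag that the loop picks up later. A minimal sketch of that callback/loop split, written for a desktop compiler; `LedState`, `makeRelayCallback` and `ledLoopOnce` are hypothetical names for illustration and are not ESPurna's API:

```cpp
#include <cstddef>
#include <functional>

// Hypothetical miniature of the pattern; not ESPurna's actual types.
struct LedState {
    bool pending = false;  // set by the relay callback
    int updates = 0;       // how many times the LEDs were refreshed
};

// Callback handed to the relay module: it only records that work is needed.
std::function<void(std::size_t, bool)> makeRelayCallback(LedState& state) {
    return [&state](std::size_t, bool) { state.pending = true; };
}

// Called from the main loop: refreshes once if any request is pending.
void ledLoopOnce(LedState& state) {
    if (!state.pending) {
        return;
    }
    state.pending = false;
    ++state.updates;
}
```

With this split, several relay changes between two loop iterations collapse into a single refresh, and the refresh never runs from inside the relay module's own call stack.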

raw stack needs a decoder to actually make sense. there's something malloc-like around the exception itself, but idk what is the context
https://github.com/xoseperez/espurna/blob/dev/.github/ISSUE_TEMPLATE/bug_report.md
https://github.com/mcspr/EspArduinoExceptionDecoder

@davebuk
Contributor Author

davebuk commented Apr 14, 2021

Fixed! Well done again!

I must set some time aside to look at the decoder software.

A quick test and all appears to be working without any crashes. I'll keep an eye on it. Was it the multiple LEDs that were the issue?

@mcspr
Copy link
Collaborator

mcspr commented Apr 14, 2021

Yes. _led_relays[1] was being accessed, which overwrote a neighboring RAM address used by something else (which happened to be UDPContext).
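
To illustrate the sizing mismatch on a desktop compiler, here is a minimal sketch assuming one relay and two LEDs, as on this device. The helper names and the value of `RelaysMax` are assumptions for illustration; only the idea of sizing the table by LED count rather than relay count comes from the led.cpp patch:

```cpp
#include <cstddef>
#include <vector>

// Assumed placeholder for the "no relay" fill value; the real value may differ.
constexpr std::size_t RelaysMax = 8;

// Build a per-LED relay table with `slots` entries, all set to "no relay".
std::vector<std::size_t> makeLedRelays(std::size_t slots) {
    return std::vector<std::size_t>(slots, RelaysMax);
}

// True when every LED id in [0, leds) has its own slot in the table.
bool everyLedHasSlot(const std::vector<std::size_t>& table, std::size_t leds) {
    return leds <= table.size();
}
```

Sized by relay count, `makeLedRelays(1)` has a single slot, so writing `table[1]` for the second LED is out of bounds; `operator[]` does no bounds check, so the write lands in whatever memory follows. Sized by LED count, `makeLedRelays(2)` gives each LED a valid slot.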

@davebuk
Contributor Author

davebuk commented Apr 15, 2021

All seems to be working fine without any crashes.

@mcspr
Collaborator

mcspr commented Apr 15, 2021

Thanks!
Merged 0d11932, plus some other minor fixes related to relay sync and led config
