How to detect ENC28J60 module "hang" state #167
Comments
Could you write some sample code to detect this situation? |
It seems like a dirty hack at this time. I changed the utility\Enc28J60Network.h file: I moved readReg() from the private to the public members area. Then I use this piece of code:
|
Thanks |
@zbx-sadman did you flood the data to or from the ENC28J60? I can send a lot of data to the module, but when the module sends just a few bytes it hangs. I also tested the init code above, but it brought no improvement. |
Are you still using your module as described above or have you perhaps found a more stable library for the ENC? |
Yes, I've flooded the module with TCP and UDP traffic. I've found additional "signs" of a hung ENC module: ESTAT.BUFFER and EIR.RXERIF will be raised. It seems that sometimes strange interference just bumps the network chip. Maybe it's all the same electrostatic interference. I don't know. IP and MAC show on the debug output as they should, but the ENC module's registers tell me that an error occurred ... My ENC hung in about the second week after restart, and therefore code debugging is a pain. I still use Arduino_UIP because it has the same commands as the stock Ethernet library and allows using both Ethernet shields with the same code. |
Here https://github.com/TMRh20/arduino_uip/tree/fix_errata12 is a fork of this library with some changes |
@prezeskk are you running the fix_errata12 fork with no lockups? I'm busy stress testing the fork with zbx-sadman's |
@gwest7, maybe your sketch has a memory leak? In my experiments I found out that low memory stops network functions from working. If only 250 bytes remain free after compiling (by the IDE report), the device stops replying to ping. Maybe the compiler does not calculate memory allocation correctly and some classes from included libs eat memory on init... But for my sketch it is a fact: no ping reply if less than 250 bytes of memory is still free. I think this situation requires deeper source code analysis, but I'm not a great programmer and don't have any debug tools :( |
Ok, I'll check my sketch again because it did freeze up again during the night. The |
Mine is connected to an Atmega328 at 5v soldered to a project board. Will make these changes too. Thanks for the effort! |
Maybe I'm suggesting here things you've already tried, but my lockups went away when I didn't use the 3.3v power supply of the Arduino board anymore, and used a 5v->3.3V step down converter to power the board. |
@GunterO, yes, using 3.3V from the Arduino board is a bad idea ) I agree: if the network module and the MCU board are powered from the same supply, the voltage drop on both regulators must be equal or close, and the ENC's TTL level must be high enough for the MCU, but who knows how good a voltage regulator from Aliexpress is ;) The ENC can work at 3.1V (from the datasheet), but that is not enough voltage for a 5V MCU input. So I think a level converter is worth a try. Maybe it will not help, but everything will be wired as the vendor recommends, and we can exclude this possible hardware error. |
I'm still fighting with my ENC. Today I wrote a multi-process app in Perl to test my device further. The script just forks a process that opens a connection to the device, sends a request (~15 bytes) and receives the answer. Nothing more. The device just waits ~1 sec before sending the answer and closing the connection. I watched the chip's registers EPKTCNT, EIR, and some others. I've found that the EIR.RXERIF bit gets raised with two sessions, but my device stays active and answers the requests. The EPKTCNT register shows ~22-23 packets.
The Microchip datasheet says the following: "When a packet is being received and the receive buffer runs completely out of space, or EPKTCNT is 255 and cannot be incremented, the packet being received will be aborted (permanently lost) and the EIR.RXERIF bit will be set to '1'. Once set, RXERIF can only be cleared by the host controller or by a Reset condition." And it's true: RXERIF does not clear when I stop the Perl script. And the device is still working... I increased UIP_CONF_BUFFER_SIZE to 200 with no luck, all the same. So this small traffic (a 15-byte request every second in two threads) causes the ENC buffer overflow? But the device still answers and has no dropped requests. It drives me crazy.
Now I want to test a new strategy: I will monitor the time elapsed since the last successful incoming connection, and if I detect no activity for one minute or so, check the ECON1.RXEN and EIR.RXERIF bits. If the first is cleared and the second is set, just call Enc28J60.init(...). This lets me reinit the chip rarely enough that the device can be pinged without lost packets. I know it looks like a mad action, but I see no other way to avoid hard hangs at this time. |
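The strategy described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the poster's actual code: the `EncWatchdog` struct, its method names and `IDLE_TIMEOUT_MS` are hypothetical, and the raw `econ1` and `eir` values are assumed to come from whatever register read your driver exposes (for example the readReg() made public earlier in the thread). The bit positions are taken from the ENC28J60 datasheet.

```cpp
#include <stdint.h>

// Bit masks per the ENC28J60 datasheet register map.
constexpr uint8_t  ECON1_RXEN_MASK = 0x04;   // ECON1 bit 2: Receive Enable
constexpr uint8_t  EIR_RXERIF_MASK = 0x01;   // EIR bit 0: Receive Error Interrupt Flag
constexpr uint32_t IDLE_TIMEOUT_MS = 60000;  // "one minute or so" (assumed value)

// Hypothetical helper: tracks the last successful incoming connection and
// decides whether the chip looks hung.
struct EncWatchdog {
    uint32_t lastActivityMs = 0;

    // Call whenever an incoming connection was served successfully.
    void noteActivity(uint32_t nowMs) { lastActivityMs = nowMs; }

    // True only when the link has been idle long enough AND the registers
    // show the hang signature: RXEN cleared and RXERIF set.
    bool needsReinit(uint32_t nowMs, uint8_t econ1, uint8_t eir) const {
        // Unsigned subtraction stays correct across the millis() rollover.
        const uint32_t idleMs = nowMs - lastActivityMs;
        if (idleMs < IDLE_TIMEOUT_MS) return false;
        const bool rxDisabled = (econ1 & ECON1_RXEN_MASK) == 0;
        const bool rxError    = (eir & EIR_RXERIF_MASK) != 0;
        return rxDisabled && rxError;
    }
};
```

In a sketch, `nowMs` would be `millis()`, and a `true` result would be the point at which to re-initialize the chip. Checking the registers only after the idle timeout keeps SPI reads and re-inits rare.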
Good luck. Looking forward to see your results. |
@gwest7 Can you email me or give me your address? I want to check a guess, and your "fast hangs" device will be helpful. P.S. my email is on the profile page. Thanks |
@zbx-sadman Thanks a lot for your work on this. Very interesting! |
@GunterO, I have a lot of excess code in my project, but I think this piece of code shows how to detect the "hang" state and reinit the ENC module. I'm also trying to find a simpler solution to clear the RX-error state, not as radical as init()...
|
Great, thanks! |
I forgot to add the Ethernet.maintain() call to my example code. It is fixed now. One more interesting fact: including Ethernet.maintain() in the network loop changes program storage space only slightly (about 40 bytes for my test code) if UIPEthernet is used, but eats MCU flash (~3.3 kB) and RAM (~40 bytes) when I just change the network driver to Wiznet's Ethernet.h. Wiznet's driver simply includes DHCP functionality in the firmware, even if you don't want to use it. So you must detect this situation like this:
|
I want to inform you about my experiment results: the ENC28J60-based device twice rolled over the max millis() period without hanging up. But I need to update the firmware, so I have to stop this run. ...and I really do not understand why the number of re-inits increased dramatically around 24.03. But the device still works, serves requests without rest, and has not lost any sensor data. |
@zbx-sadman Thanks a lot for your work. Very interesting! I'm using an ENC28J60 module with an Arduino Mega, but I have the same "hang" problem. Could you please give me a recap of your solution? Is this procedure enough? |
@enry86cami, I have almost returned to Wiznet modules. The W5100 Mini Red ($3.78) and the ENC28J60 module ($2.57) don't differ dramatically in price. But with Wiznet there is no ENC driver impact on ATmega RAM, there is a bigger network buffer and TCP/IP inside the Wiznet chip, and no power supply / logic level / etc. problems. But my last "ENC28J60 control method" contains register checks and a network configuration testing function
|
Thanks for your reply.
Do you think that the W5100 Mini Red is more reliable? If yes, I'll trash the ENC and use the Wiznet modules. Thanks |
Unfortunately, I don't know anything about Blynk, but I know something about the ENC: if you have a long loop() without calling the driver's tick() procedure, you have a problem. Yes, I guess it is better to use a shield/module based on the Wiznet W5100/W5500 chip in this case. Use the ENC if you can control loop()'s runtime duration. |
@zbx-sadman I use an ENC28J60 shield with the UIPEthernet lib on an Arduino Nano for cyclic UDP packet sending. Basically I collect data from a few One Wire devices and send one packet every 2 seconds to a UDP server on my local network. To make it easier I'm not waiting for any response from the server, only sending data. And a second question: my ENC is on a shield so I don't have direct access to its reset pin, so it would be easier to reset the Arduino with the watchdog to basically reset the ENC as well. Am I correct? For UDP packet sending I use the udp.beginPacket() -> udp.print() -> udp.endPacket() methods. beginPacket() returns true when the hostname and port were resolved correctly. Is that a good way to check whether the ENC is already frozen or not? I am of course aware that in case of network problems it will give me the same information, but it's used only on the local network so I don't think it will be an issue. |
As far as I remember, the ENC catches all packets on the interface and places them in the hardware buffer, because the hardware filter is not activated in Norbert's driver. The driver must have time to read the buffer (via any procedure that calls tick()) and decide each packet's fate. If the hardware buffer is not read out, it will overflow, an error occurs and ESTAT.BUFFER is raised. Then all packets will be dropped until the situation is resolved and some ENC registers are changed from outside (by the microcontroller). So if you have a network packet storm or something like it and do not read out the hardware buffer fast enough, the ENC can hang and needs to be reset (this is the easiest way ;). I use the soft reset method for the ENC and see no problem without access to the hardware reset pin. Unfortunately, I can't say anything about beginPacket(), because I don't use UDP with this driver |
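A rough sketch of the buffer-error check described above, under assumptions: `BufferErrorMonitor` and `poll()` are hypothetical names, the `estat` value is assumed to be read from the chip by your own means, and `softReset` stands for whatever re-init call you use. The mask is the datasheet's ESTAT bit 6 (spelled BUFER there).

```cpp
#include <stdint.h>

// ESTAT bit 6 per the ENC28J60 datasheet: Ethernet Buffer Error Status.
constexpr uint8_t ESTAT_BUFER_MASK = 0x40;

// Hypothetical helper: polls ESTAT and triggers a caller-supplied soft reset
// when the buffer error bit is raised, counting how often that happens.
struct BufferErrorMonitor {
    uint32_t resetCount = 0;  // number of soft resets triggered so far

    // Returns true when a reset was triggered on this call.
    template <typename ResetFn>
    bool poll(uint8_t estat, ResetFn softReset) {
        if ((estat & ESTAT_BUFER_MASK) == 0) return false;  // no buffer error
        softReset();   // e.g. a lambda wrapping your driver's re-init
        ++resetCount;
        return true;
    }
};
```

Calling `poll()` from the main loop, next to the routines that end up calling tick(), keeps the check close to where the buffer is drained. The counter makes it easy to log how often resets happen.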
Thanks for the quick answer. Ok, so I understand that it is necessary to use your tweak in any case, mine included. It doesn't matter whether we're sending or receiving packets. How do you perform that soft reset on the ENC? By using the SPI.transfer(255) command, or do you mean just the init() method? |
Yes, I just use Enc28J60.init(...) |
According what you've told about tick command... |
Another question: Is it necessary to perform Ethernet.begin(mac, ip).... etc after Enc28J60.init(...)? |
Ouch, I took a look at my old and dusty sources. I really used Ethernet.begin() instead of init(), because init() is called from inside the begin() procedure (sorry for the confusion). About the tick() calling strategy... If you search the sources for 'tick()', you can see the many places it is called from. Just call these routines as often as you can and check ESTAT.BUFFER. My project uses firmware with blocking code for Dallas sensors, which calls Ethernet.maintain() on every loop and watches ESTAT.BUFFER, and all seems OK. Sometimes I have up to 10 ENC resets in one hour, sometimes zero. But the whole system has answered incoming connections for months without Arduino board resets. |
Thanks for your help. At this moment my board has been working for 3 days without a problem, but I haven't implemented your checking modification yet. I'm going to wait 2-3 more days to see how it works for now, and then add your piece of code along with an EEPROM counter holding the total number of Ethernet restarts. Hopefully it will work like in your case :-) |
@zbx-sadman Could you provide some latest sources that worked for your case, please? Because now I'm pretty confused which example should I use. |
@kaczy1217, I can point to my code, but it's not a simple example: Zabbuino |
I have been using an ATmega328-based Arduino with an ENC28J60 (revision 6) and Arduino_UIP (fix_errata12) in my hobby project for a long time. My device(s) handle requests from the monitoring service (Zabbix). But I can't achieve stable operation of the device. It can work two weeks on the production (noisy) network and then hang, or work only 2...3 hours and hang again. It can also work for a month on the home network. I have tried changing the power supply, adding capacitors, freeing more RAM (~250 bytes free in the working state) and so on. I had no luck in any case: the activity LED blinks, but there is no data in ethClient.read()
Nevertheless, all attempts to hang the module by hand failed too ;) I have used flood attacks without success: the module stops answering for a short time and works again when the attack is off.
After that I added ENC28J60 register state output to my sketch, connected the device to the noisy network and began to wait. When the module froze, I connected to the UART and saw that ECON1=00000000b.
ECON1 contains the RXEN bit (Receive Enable). If it is 1, packets which pass the current filter configuration will be written into the receive buffer. If it is 0, all packets received will be ignored.
It seems that a cleared RXEN bit is the reason for this behavior: no data received while the activity LED blinks.
In the Arduino_UIP code I did not find any operations that clear the RXEN bit. Now I'll try to call Enc28J60.init() when 0 == ECON1.RXEN.
I hope this information was helpful. And maybe someone will find a workaround to get more stability with the ENC28J60.