Save on IOControl page has stopped working on Chrome? #175

Closed
nielsonm236 opened this issue May 23, 2023 · 9 comments

Comments

@nielsonm236
Owner

nielsonm236 commented May 23, 2023

This morning I noticed that Chrome freezes if I click on Save on the IOControl page. I see that Chrome 113.0.5672.127 was released on May 16, 2023, and my system automatically updates Chrome.

The same Network Modules continue to respond just fine with Edge and Firefox, even while the Chrome window is still frozen. When I run Developer Tools with Chrome it looks like the POST is sent from Chrome, but when I check the IOControl page with Firefox I don't see that the POST was executed - like it was not received at the Network Module from Chrome.

Is anyone else seeing this issue?

@nielsonm236 changed the title from "Save and Refresh have stopped working on Chrome?" to "Save on IOControl page has stopped working on Chrome?" May 23, 2023
@jmcvieira1

I just ran a quick test with Chrome 113.0.5672.127 on Code Revision 20230416 1116 Browser UPG with the PCF8574 attached. Chrome freezes when I click Save, but only on the IOControl page of the PCF8574.

On Code Revision 20230416 1116 MQTT UPG BME, Chrome 113.0.5672.127 worked with no issues.

@nielsonm236
Owner Author

Thanks. I think I know where to start looking.
Mike

@kiralikbeyin

While working with the ESP8266 chip, I ran into a case where the request never finished and eventually timed out, because I did not close the connection after serving the client request from the web browser. I suspect something similar is happening here. Pay attention to the Content-Length, and make sure the connection is completely closed.

If necessary, instead of serving the request inside a loop, a solution might be to run the client in a task and then delete the task completely at the end. (A maximum of 5 clients could connect in the ESP8266 library.)

I haven't looked at your GitHub code, these are just a few ideas.
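
The Content-Length and connection-close advice above can be sketched in C as follows. This is a minimal illustration, not code from the Network Module firmware: `build_response` is a hypothetical helper, and it assumes a platform with a standard `snprintf`. The point is only that the length header is computed from the actual body and that the response announces the connection will be closed, so the browser has no reason to keep waiting.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not taken from the Network Module firmware.
 * Writes a complete HTTP/1.1 response into `out` and returns its length,
 * or -1 if the buffer is too small. */
int build_response(char *out, size_t cap, const char *body)
{
    size_t body_len = strlen(body);
    int n = snprintf(out, cap,
                     "HTTP/1.1 200 OK\r\n"
                     "Content-Type: text/html\r\n"
                     "Content-Length: %zu\r\n"  /* must equal the body byte count */
                     "Connection: close\r\n"    /* server will close after this reply */
                     "\r\n"
                     "%s",
                     body_len, body);
    if (n < 0 || (size_t)n >= cap) {
        return -1;  /* truncated: do not send a reply whose length header is wrong */
    }
    return n;
}
```

After sending the bytes, the server would still have to close the TCP connection itself; the header only tells the client to expect that.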

@nielsonm236
Owner Author

Thank you. I appreciate the pointers from your experience.
Long explanation (just thinking out loud): The situation is similar here, but I'm fairly sure it is a bug where I'm not handling continuation of a TCP datagram properly when it spans multiple packets. Multiple packets are required due to the very limited RAM space available to receive and interpret TCP datagrams. The memory limitation is not a problem from the logical standpoint, just a matter of properly tracking the incoming datagram and reconstructing the POST. If it happened "rarely" I could attribute it to link errors or some such thing and that would point to the need for a timeout and "close on error", but when it happens "every time" I know it is a logical error in the datagram reconstruction. Fixing this kind of problem is very simple in concept, but for some reason my fix from many months ago isn't working so I know I'm not getting to the root of the issue.
You might ask "If it was working, why would it stop working?" If in fact the issue is losing track of datagram reconstruction across multiple packets, and "something happens" to cause this to become a solid error (not intermittent), then perhaps a byte of data is being dropped at the point where a packet break occurs. If the dropped byte is something that would be ignored anyway, no error occurs. But if the dropped byte is something necessary to the POST, then interpretation of the POST cannot complete. So, working code will stop working if the datagram in the series of packets shifts by a few bytes, or perhaps even one byte, causing the critical data to now span a packet break point. This shift can occur because of changes in the browser pre-amble, changes in the maximum packet length, or some other change in the length of "unimportant data" in the overall TCP transaction. Why does it stop working in one browser (say Chrome) but not in another browser (say Firefox)? Because they each have their own pre-amble content of unique length preceding the POST data, and when a browser update occurs that pre-amble length can change. In fact, even if the time/date stamp in the pre-amble changes, the length can change. But the point is that a byte just can't be dropped or not counted. That's a bug.
So, the POST comes in, I lose track of where I am in the datagram (didn't count something properly and dropped a byte), and I don't respond with an error because I'm lost. I do close the connection under normal circumstances. If I built in a timeout it would stop the browser freeze, but I would still have an error on every IOControl Save. Obviously not acceptable :-)
Admittedly I'm supposing this is the issue because it is presenting itself in a way I've seen before. Fortunately most of my debug is still present and commented out from the last time this happened a couple years ago, so once I can get a few uninterrupted hours I should be able to track this down. All just part of making it work reliably in a tiny tiny space. It would be great to be doing this in the "unlimited" space of a regular computer, but that wouldn't make it as much fun, eh?
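
The failure mode described above (losing a byte or a delimiter where a packet ends) can be illustrated with a small resumable parser. This is only a sketch under stated assumptions: the struct, function names, buffer sizes, and field format are all hypothetical, and it stores every field in RAM, which the real firmware almost certainly cannot afford. What it shows is the principle that all parsing state lives in a structure that survives between packets, so a `key=value&key=value` body parses identically no matter where the packet breaks fall.

```c
#include <stddef.h>
#include <string.h>

#define MAX_FIELDS 8
#define MAX_TOKEN  16

/* Hypothetical parser state: everything needed to resume mid-field
 * lives here, never in local variables of the packet handler. */
struct post_parser {
    int    in_value;                  /* 0 = reading a key, 1 = reading a value */
    size_t len;                       /* bytes collected in the current token */
    size_t count;                     /* completed key=value pairs */
    char   key[MAX_FIELDS][MAX_TOKEN];
    char   val[MAX_FIELDS][MAX_TOKEN];
};

void post_parser_init(struct post_parser *p)
{
    memset(p, 0, sizeof *p);
}

/* Feed one packet's worth of body bytes; may be called any number of times. */
void post_parser_feed(struct post_parser *p, const char *data, size_t n)
{
    for (size_t i = 0; i < n && p->count < MAX_FIELDS; i++) {
        char c = data[i];
        if (!p->in_value && c == '=') {
            p->in_value = 1;
            p->len = 0;
        } else if (c == '&') {
            /* The delimiter is handled the same way whether it is the
             * first, middle, or last byte of a packet. */
            p->count++;
            p->in_value = 0;
            p->len = 0;
        } else {
            char *dst = p->in_value ? p->val[p->count] : p->key[p->count];
            if (p->len < MAX_TOKEN - 1) {
                dst[p->len++] = c;    /* buffers were zeroed, so they stay terminated */
            }
        }
    }
}

/* Call once the whole body has arrived to close the final field. */
void post_parser_finish(struct post_parser *p)
{
    if (p->count < MAX_FIELDS && (p->in_value || p->len > 0)) {
        p->count++;
    }
    p->in_value = 0;
    p->len = 0;
}
```

A useful way to exercise a parser like this is to feed the same body split at every possible byte offset and check that the result never changes. That turns the "corner case that only appears after a browser update" into something that can be tested deliberately.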

@kiralikbeyin

Sometimes it's easier to rewrite the code than to find the bug and fix it. I think you could solve it quickly by using the GET method instead of the POST method.

@nielsonm236
Owner Author

Yeah ... that's a lot of re-writing.
In general I agree with you but have not undertaken the task due to the time it would take and the concern that it might end up taking more code space than the current POST processing. However: a) The POST processing code is quite large, so perhaps GET would be smaller, b) It would likely be less error prone, c) It requires re-write of both the C-code and the HTML/Javascript. No rocket science, just work.
As FYI I would still need POST code for the upgradeable versions to transfer new executable files to the server. But that is mostly a separate set of routines so the separation of data-POST and file-POST would not be too difficult.
I will put some more thought into this. I can already sense some users thinking "Ugh ... he's going to tear up the code again".

@nielsonm236
Owner Author

@kiralikbeyin Could you private message me at nielsonm.projects@gmail.com? I'd like to ask a couple of questions about using GET in this case. Perhaps I can draw on your experience to build some code for an experiment.

@nielsonm236
Owner Author

nielsonm236 commented May 28, 2023

I haven't heard back from @kiralikbeyin, but I think I've got this resolved. I found a logic issue in the parsing process. Testing this is very difficult because the failures are always corner cases that alias with packet boundaries (specifically the location of the &'s in a POST relative to the packet boundaries). So, I'm going to do more testing before I release the fix.

@jmcvieira1 I wasn't able to reproduce your report, but I suspect it was the same bug. My inability to reproduce can easily be some change in packet boundaries. HOWEVER - while trying to reproduce I had the PCF8574 appear, then disappear, and I could only get it to reappear with a reset (I was using the reset button on the module to do this). I suspect it may have just been poor wiring in my attachment of the PCF8574, but I will make several more attempts to figure out what happened.

I am intrigued by the suggestion from @kiralikbeyin. I think Jevgeni suggested it once before. But I am not sure how to implement the Javascript to perform multiple GET requests instead of one POST. One of the POSTs returns over 600 characters, and GET requests are limited to 255 characters, so it might take three or more GET requests to return the same data contained in a POST. Additionally, we could end up with some new aliasing problems, as follows: in a POST I'm able to wait until the POST completes before processing the configuration changes, which enables me to check for conflicts in the configuration. With the GET method I might have to come up with a new way of making sure all changes in a given session are reported before processing the changes. But my first impression is that the GET method might free up some Flash space.
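
The "three or more GET requests" estimate can be checked with a little arithmetic. The sketch below is illustrative only: `get_requests_needed` is a hypothetical function, the 255-character limit and 600-character payload are the figures quoted in the comment above, and the per-request overhead (path plus some sequence marker) is an assumed parameter with made-up example values.

```c
#include <stddef.h>

/* Number of GET requests needed to carry `payload_len` characters when each
 * request may hold at most `max_request_len` characters, of which
 * `overhead_len` are spent on the path and a sequence marker.
 * Returns 0 if no payload fits in a request.
 * All parameters are illustrative assumptions. */
size_t get_requests_needed(size_t payload_len, size_t max_request_len,
                           size_t overhead_len)
{
    if (overhead_len >= max_request_len) {
        return 0;
    }
    size_t room = max_request_len - overhead_len;
    return (payload_len + room - 1) / room;   /* ceiling division */
}
```

With no overhead, 600 characters at 255 per request needs 3 requests. If each request spent, say, 60 characters on overhead, the same payload would need 4, which is why the count could exceed three in practice.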

@nielsonm236
Owner Author

Fixed in release 20230603 2058. There was one logic problem in POST parsing resulting from a change in parsing methods long ago that left a gap in detecting aliasing of the '&' delimiters when they occurred at packet boundaries. In addition there was an error in the POST size count for Configuration pages.
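
The second part of the fix concerns the POST size count. As a rough illustration of what that kind of bookkeeping involves (not the actual firmware code, and with hypothetical names throughout), the sketch below tracks how many body bytes are still outstanding against the declared Content-Length as packets arrive. If this count is off by even one byte, the server either waits forever for data that will never come or stops before the last field is complete.

```c
#include <stddef.h>

/* Hypothetical tracker for how much of a POST body is still outstanding. */
struct body_counter {
    size_t remaining;   /* bytes of body not yet received */
};

void body_counter_start(struct body_counter *c, size_t content_length)
{
    c->remaining = content_length;
}

/* Account for `n` body bytes from one packet. Returns the number of bytes
 * that belong to this POST (never more than what was outstanding). */
size_t body_counter_consume(struct body_counter *c, size_t n)
{
    size_t used = n < c->remaining ? n : c->remaining;
    c->remaining -= used;
    return used;
}

/* Non-zero once every declared body byte has been accounted for. */
int body_counter_done(const struct body_counter *c)
{
    return c->remaining == 0;
}
```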

nielsonm236 added a commit that referenced this issue Jun 4, 2023
Issue #175 "Save on IOControl page has stopped working on Chrome?"
Issue #176 "Saving PCF8574 Configuration can cause the PCF8574 to disappear"
3 participants