web2py + IE8 + Downloading Files > 64kb = Corrupt Downloads #1
I have seen this issue before but, unfortunately, could not reproduce it predictably. Prompted by this request, I spent a couple of hours reading the code, and I'm almost sure the problem is an interaction between Python's socket.py wrapper for write() and/or sendall() and Rocket's exception handling, which would (eventually) close() a timed-out socket, flushing the rest of the failed sendall() or something like that.

I haven't been able to work out the exact scenario, but from reading the code it seems impossible to recover from a sendall() timeout, which is what Rocket tries to do (by not terminating the connection immediately). Python's sendall() is not atomic with respect to timeouts: any part of the data might already have been sent when the timeout exception is raised. This is further complicated by the socket.py wrapper, which adds buffering to write()/writeline() and a flush() at close(). If a sequence of write() calls is followed by a sendall() that times out, attempting to close() would flush() the remains of the writes, and the resulting output is unpredictable. (If I'm reading the source correctly, it would double some bytes rather than skip some, but perhaps I'm not reading it correctly.)

In case that is not the problem, the following observations might still help:
While trying to figure this out, I noticed two small problems in rocket.py, neither of which (as far as I can tell) is the cause, but which you might want to fix:
While I haven't been able to find the reason, my suspects are:
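To illustrate the earlier point that sendall() is not atomic with respect to timeouts, here is a minimal sketch that uses an explicit send() loop instead. The helper name `send_tracking` is hypothetical and is not part of Rocket or web2py; it only shows that a timeout can arrive after some, but not all, of the data has gone out, which is exactly the information sendall() does not report.

```python
import socket


def send_tracking(sock, data):
    """Send data with send() in a loop.

    Returns (bytes_sent, error). Unlike sendall(), the caller learns how
    much data was handed to the kernel before a timeout occurred.
    Hypothetical helper for illustration only.
    """
    view = memoryview(data)
    sent = 0
    try:
        while sent < len(view):
            # send() may accept only part of the remaining data
            sent += sock.send(view[sent:])
    except socket.timeout as exc:
        # Some prefix of the data has already been sent at this point
        return sent, exc
    return sent, None
```

With a peer that never reads and a short timeout, this typically returns a byte count strictly between zero and the payload size, together with the timeout exception.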
OK, the culprit is definitely ignoring exceptions raised in sendall(). How to reproduce: you need a WSGI worker that produces output in parts (that is, returns a list or yields parts from a generator), e.g. web2py's "static" file server (which uses WSGI and does not use the FileSystemWorker).
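For anyone who wants a test target other than web2py's static server, a WSGI application that "produces output in parts" might look like the sketch below. The names `chunked_app`, `CHUNK_SIZE` and `NUM_CHUNKS` are made up for this example, and the chunk size simply mirrors the 64k blocks mentioned in this issue. Each chunk is filled with a different byte so that a missing or doubled run is easy to spot in the downloaded file.

```python
CHUNK_SIZE = 64 * 1024  # assumption: mirrors the 64k blocks from this issue
NUM_CHUNKS = 16         # arbitrary; enough to span several blocks


def chunked_app(environ, start_response):
    """Hypothetical WSGI app that streams its body as a generator of parts."""
    headers = [
        ("Content-Type", "application/octet-stream"),
        ("Content-Length", str(CHUNK_SIZE * NUM_CHUNKS)),
    ]
    start_response("200 OK", headers)

    def body():
        for index in range(NUM_CHUNKS):
            # Chunk 0 is all "A", chunk 1 all "B", and so on
            yield bytes([ord("A") + index]) * CHUNK_SIZE

    return body()
```

Served over a slow link with a short socket timeout, an app like this should exercise the same multi-part write path as the static file server.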
A better idea of where the problem is can be seen from the following ugly patch (applied against web2py's "one file" rocket.py):
Running the same experiment with the patched rocket.py will show that files get corrupted if 'exception lost' is printed to web2py's terminal.

Discussion: The only way to use sendall() reliably is to terminate the connection immediately upon any error (including timeout), as there is no way to know how many bytes were sent. (That there is no way to know how many bytes were sent is clearly stated in the documentation; the implication that it is impossible to recover reliably is not.) However, there are sendall() calls all over rocket.py, and some will result in additional sendall() calls following a failed one. The worst offender seems to be WSGIWorker.write(), but I'm not sure the other sendall() calls are safe either.

Temporary workarounds: increase SOCKET_TIMEOUT significantly (the default is 1 second; bump it to e.g. 10), and do not swallow socket.timeout in WSGIWorker.write(). Increasing the chunk size is NOT helpful, because it only changes the number of bytes sent before the first loss (at a given bandwidth); from that point on, the problem is the same.
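The "terminate immediately on any error" rule could be sketched roughly as follows. This is not Rocket's code: `sendall_or_abort` and `ConnectionAborted` are hypothetical names, and the sketch operates on a raw socket object rather than on the buffered wrapper discussed above.

```python
import socket


class ConnectionAborted(Exception):
    """Hypothetical exception: the connection was dropped after a failed send."""


def sendall_or_abort(sock, data):
    """Call sendall(); on any socket error, close the socket and re-raise.

    After a failed sendall() the number of bytes already sent is unknown,
    so the only safe reaction is to give up on the connection.
    """
    try:
        sock.sendall(data)
    except OSError as exc:  # socket.timeout is a subclass of OSError
        sock.close()
        raise ConnectionAborted("sendall() failed; connection closed") from exc
```

The caller is then expected to stop writing to this connection altogether, rather than continuing with the next chunk.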
Many thanks. I'll dig into this soon.
Any updates / insights? Thanks in advance.
web2py users have reported corrupt downloads using Rocket. It seems that only IE8 (and lower versions) are affected. I can reproduce this with web2py but I cannot reproduce it with Rocket alone.
web2py with any other webserver does not exhibit this issue. There is some interaction between Rocket and web2py that causes this.
This also only happens when downloading files. Uploaded files seem to be unaffected.
In this scenario, files are sent to the browser in 64 KB blocks. In seemingly random circumstances, 4 KB may be missing from the beginning of a block. I've never seen this happen with the first block; it typically first shows up in the 4th or 5th block.
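To pin down where a corrupt download first departs from the original, a small comparison helper along these lines may be useful. It is only a sketch: `first_divergence` and `BLOCK` are names invented here, and the 64 KB block size is taken from the description above.

```python
BLOCK = 64 * 1024  # block size taken from the description in this issue


def first_divergence(original, received):
    """Return (block_index, offset_in_block) of the first differing byte.

    Returns None if both byte strings are identical. If one is a strict
    prefix of the other, the position just past the shorter one is reported.
    """
    limit = min(len(original), len(received))
    for i in range(limit):
        if original[i] != received[i]:
            return divmod(i, BLOCK)
    if len(original) != len(received):
        return divmod(limit, BLOCK)
    return None
```

For the symptom described here, one would expect a result whose offset within the block is 0, with a block index of 1 or higher (counting from 0).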
The steps to reproduce this in web2py are detailed here: http://groups.google.com/group/web2py/browse_thread/thread/d7f6faddb841790b/d67ed796649fc3f1?pli=1
Any help with this issue would be much appreciated.