Filesize detection doesn't work for compressed files #50

tstenner · 2019-10-18T08:29:22Z

https://github.com/xdf-modules/xdf-python/blob/a6c5f84860c41de9d63fc3f66ac7cac0f7779bc7/pyxdf/pyxdf.py#L235-L245

For compressed files, reading is aborted as soon as an invalid chunk size is read and the position in the uncompressed stream is past the compressed file size. The new code therefore checks whether more than 1024 bytes are actually readable.
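To illustrate why comparing the stream position against the on-disk size is unreliable for compressed files, here is a minimal sketch using only the standard library. The helper name and the file name are made up for this example and are not part of pyxdf.

```python
import gzip
import os
import tempfile


def position_vs_size_on_disk(payload: bytes):
    """Write payload gzip-compressed, read it back completely, and return
    (position in the uncompressed stream, compressed size on disk)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "example.xdfz")  # hypothetical file name
        with gzip.open(path, "wb") as f:
            f.write(payload)
        with gzip.open(path, "rb") as f:
            f.read()
            pos = f.tell()  # tell() counts uncompressed bytes
        return pos, os.path.getsize(path)


# Highly compressible data: the stream position ends up far beyond
# the size of the file on disk.
pos, size = position_vs_size_on_disk(b"\x00" * 100_000)
```

With a check such as "position >= file size means end of file", a reader would stop long before the compressed stream is actually exhausted.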

cbrnr

LGTM, but what's the special meaning of 1024 bytes here? Where does this number come from? A comment in the code would be helpful.

tstenner · 2019-10-21T10:28:31Z

I have no idea where the 1024 bytes came from. I suspect that looking for a boundary chunk wasn't considered useful when the read position is too close to the end of the file, but I don't know how that led to 1024 specifically.

cbrnr · 2019-10-21T10:34:47Z

Can you add this explanation as a comment (it makes sense to not look for boundary chunks very close to the end of a file)?

tstenner · 2019-10-21T13:36:56Z

It doesn't matter how much data is left, so I changed the check to read only one byte. If there's not enough left for a boundary chunk, _scan_forward will read until the end of the file and then quit. It's a tiny bit slower, but we've gotten rid of one magic number.
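A one-byte probe of this kind could look roughly like the following sketch. The function name has_more_data is an assumption for illustration; only the f.seek(-1, 1) call mirrors the diff in this pull request.

```python
import gzip
import io


def has_more_data(f) -> bool:
    """Return True if at least one more byte can be read from f.

    If a byte was read, the stream position is restored afterwards.
    """
    if len(f.read(1)) == 1:
        f.seek(-1, 1)  # whence=1: step back one byte relative to the current position
        return True
    return False


# Works the same on plain and on gzip-compressed streams, because it never
# consults the size of the underlying file.
plain = io.BytesIO(b"ab")
compressed = gzip.GzipFile(fileobj=io.BytesIO(gzip.compress(b"ab")))
```

Note that seeking backwards in a gzip stream can be slow in general, since the decompressor may have to rewind; for a single byte on a rarely taken error path this should not matter much.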

cbrnr · 2019-10-21T13:38:14Z

And this happens only for corrupt files anyway, right?

tstenner · 2019-10-21T13:43:14Z

Yes, with a tiny bit of fine print. For regular files it shouldn't happen.

cbrnr · 2019-10-21T13:50:05Z

pyxdf/pyxdf.py

+                    f.seek(-1, 1)
+                    if _scan_forward(f):
+                        continue
+                    else:


I guess you could get rid of this else: break block because you have the same thing right in the next else block. I.e. I think you can remove the first else block completely (the one belonging to the if _scan_forward(f)), and also remove the second else and dedent the two subsequent lines.

tstenner · 2019-10-21

Right, I've changed it.
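The simplified control flow discussed in this review thread might be structured roughly as in the sketch below. Everything here is a toy stand-in: the chunk format (one length byte plus payload), the BOUNDARY marker, scan_forward, and read_chunks are assumptions for illustration and are not the pyxdf implementation.

```python
import io

BOUNDARY = b"\xde\xad\xbe\xef"  # hypothetical marker, not the real XDF boundary


def scan_forward(f) -> bool:
    """Stand-in for _scan_forward: move f to just after the next BOUNDARY.

    Returns False if no boundary is found before the end of the file.
    """
    start = f.tell()
    idx = f.read().find(BOUNDARY)
    if idx < 0:
        return False
    f.seek(start + idx + len(BOUNDARY))
    return True


def read_chunks(f):
    """Toy reader: each chunk is one length byte (1..127) followed by payload."""
    chunks = []
    while True:
        head = f.read(1)
        if head and 1 <= head[0] <= 127:
            payload = f.read(head[0])
            if len(payload) == head[0]:
                chunks.append(payload)
                continue
        # Invalid or truncated chunk: try to resume unless we are at end of file.
        if f.read(1):
            f.seek(-1, 1)
            if scan_forward(f):
                continue
        # Single exit path: nothing left to read, or no boundary was found.
        break
    return chunks


data = b"\x02hi" + b"\x00" + b"junk" + BOUNDARY + b"\x03abc"
chunks = read_chunks(io.BytesIO(data))
```

Because the failed scan and the end-of-file case both fall through to the same break, no separate else branches are needed.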

cbrnr · 2019-10-21T14:05:27Z

Thanks @tstenner!

tstenner requested review from cboulay and cbrnr October 18, 2019 08:29

tstenner added 2 commits October 18, 2019 10:33

BF: more robust seeking in compressed corrupted files

668fe57

Add changelog entry

5d1af22

tstenner force-pushed the robustness branch from 32ffb8b to 5d1af22 Compare October 18, 2019 08:34

tstenner mentioned this pull request Oct 18, 2019

Allow loading from already opened file objects #51

Merged

cbrnr reviewed Oct 21, 2019

Resume from boundary chunks in any case except for end of file

ab3ba94

cbrnr reviewed Oct 21, 2019

Minor simplification (thanks to @cbrnr)

497c038

cbrnr merged commit ad98e89 into master Oct 21, 2019

cbrnr deleted the robustness branch October 21, 2019 14:05
