Scan forward seeks a byte before #34

dojeda · 2019-02-26T09:33:10Z

Hi, I have been reading the load_xdf function with great interest. I am hoping to propose a couple of modifications to improve it, but I have found a weird inconsistency, or perhaps I have misunderstood something. Could someone shed some light on this?

I am interested in reading a file without decoding its contents. More precisely, I want to read the info headers for each stream but not their contents. I was digging into the code and found that, when some data is corrupted, a _scan_forward function reads until a boundary chunk is found (see https://github.com/sccn/xdf/wiki/Specifications#boundary-chunk). This functionality is great for robustness, but when I read the code:

def _scan_forward(f):
    """Scan forward through the given file object until after the next
    boundary chunk."""
    blocklen = 2**20
    signature = bytes([0x43, 0xA5, 0x46, 0xDC, 0xCB, 0xF5, 0x41, 0x0F,
                       0xB3, 0x0E, 0xD5, 0x46, 0x73, 0x83, 0xCB, 0xE4])
    while True:
        curpos = f.tell()
        block = f.read(blocklen)
        matchpos = block.find(signature)
        if matchpos != -1:
            f.seek(curpos + matchpos + 15)
            logger.debug('  scan forward found a boundary chunk.')
            break
        if len(block) < blocklen:
            logger.debug('  scan forward reached end of file with no match.')
            break

... I am confused to see that when the pattern is found, the file pointer is moved to the matching position + 15. The pattern is 16 bytes long, so an offset of 15 makes the file reader re-read the final 0xE4 byte. In both places where this function is called, the next instruction is either a continue or the end of the loop, so the next operation is to read a variable-length integer with _read_varlen_int. That read will fail because the next byte, being 0xE4, is not 0, 1, 4 or 8. This, in turn, will warn with "got zero-length chunk, scanning forward to next boundary chunk", which is not an accurate message, and then continue scanning for the next boundary chunk.

This cycle of error, find boundary, fail would repeat until the whole file is consumed.
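To make the off-by-one concrete, here is a minimal sketch that compares the two offsets on an in-memory buffer. The helper scan_forward, its offset parameter, and the test payload are made up for illustration and are not part of the library; the sketch only assumes that the intended behaviour is to land just after the 16-byte signature.

```python
import io

# Boundary-chunk signature, copied from the snippet above.
SIGNATURE = bytes([0x43, 0xA5, 0x46, 0xDC, 0xCB, 0xF5, 0x41, 0x0F,
                   0xB3, 0x0E, 0xD5, 0x46, 0x73, 0x83, 0xCB, 0xE4])


def scan_forward(f, offset, blocklen=2**20):
    """Seek to `offset` bytes past the start of the next signature match.

    Hypothetical helper: `offset` is a parameter only so that 15 and 16
    can be compared. Returns True on a match, False at end of file.
    """
    while True:
        curpos = f.tell()
        block = f.read(blocklen)
        matchpos = block.find(SIGNATURE)
        if matchpos != -1:
            f.seek(curpos + matchpos + offset)
            return True
        if len(block) < blocklen:
            return False


# Made-up payload: 5 filler bytes, the signature, then a single 0x01 byte.
payload = b"\x00" * 5 + SIGNATURE + b"\x01"

f = io.BytesIO(payload)
scan_forward(f, offset=15)
byte_with_15 = f.read(1)  # still inside the signature: its last byte

f = io.BytesIO(payload)
scan_forward(f, offset=len(SIGNATURE))
byte_with_16 = f.read(1)  # first byte after the signature
```

With an offset of 15 the next byte read is the trailing 0xE4 of the signature itself; with len(SIGNATURE) the reader starts at the first byte after the boundary chunk, which is what the caller would then interpret as the length prefix of the next chunk.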

cbrnr · 2019-04-16T10:29:18Z

Fixed in xdf-modules/pyxdf#6.

cbrnr · 2019-04-18T06:53:50Z

@cboulay this can be closed.

dojeda mentioned this issue Feb 26, 2019

Feature request: read headers only #37

Open

dojeda added a commit to dojeda/xdf that referenced this issue Feb 26, 2019

Fix incorrect seek after finding block boundary

52596b3

Fixes sccn#34

cbrnr mentioned this issue Apr 16, 2019

Fix _scan_forward (incorrect seek) xdf-modules/pyxdf#6

Merged

cboulay closed this as completed Apr 18, 2019
