Scan forward seeks a byte before #34

Closed
dojeda opened this issue Feb 26, 2019 · 2 comments

@dojeda

dojeda commented Feb 26, 2019

Hi, I have been reading the load_xdf function with great interest. I would like to propose a couple of modifications to improve it, but I have found a strange inconsistency, or perhaps I have misunderstood something. Could someone shed some light on this?

I am interested in reading a file without decoding its contents. More precisely, I want to read the info headers for each stream but not the stream data. While digging into the code, I found that when some data is corrupted, a _scan_forward function reads until a boundary chunk is found (https://github.com/sccn/xdf/wiki/Specifications#boundary-chunk). This functionality is great for robustness, but consider the code:

def _scan_forward(f):
    """Scan forward through the given file object until after the next
    boundary chunk."""
    blocklen = 2**20
    signature = bytes([0x43, 0xA5, 0x46, 0xDC, 0xCB, 0xF5, 0x41, 0x0F,
                       0xB3, 0x0E, 0xD5, 0x46, 0x73, 0x83, 0xCB, 0xE4])
    while True:
        curpos = f.tell()
        block = f.read(blocklen)
        matchpos = block.find(signature)
        if matchpos != -1:
            f.seek(curpos + matchpos + 15)
            logger.debug('  scan forward found a boundary chunk.')
            break
        if len(block) < blocklen:
            logger.debug('  scan forward reached end of file with no match.')
            break

... I am confused to see that when the pattern is found, the file pointer is moved to the matching position + 15. The pattern is 16 bytes long, so an offset of 15 leaves the pointer on the last byte of the signature, and the reader re-reads the 0xE4 byte. In both places where this function is called, the next instruction is either a continue or the end of the loop, so the next operation is to read a variable-length integer with _read_varlen_int. That will fail because the next byte, being 0xE4, is not 0, 1, 4 or 8. This, in turn, triggers the warning "got zero-length chunk, scanning forward to next boundary chunk", which is not an accurate message, and the code then continues scanning for the next boundary chunk.
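To make the off-by-one concrete, here is a minimal sketch using an in-memory buffer instead of a real XDF file. The signature bytes are copied from the code quoted above; the surrounding bytes (`b"junk"` and the trailing `b"\x01\x02"`) are made up for illustration and do not represent a valid XDF chunk.

```python
import io

# 16-byte boundary signature, copied from the _scan_forward code quoted above.
SIGNATURE = bytes([0x43, 0xA5, 0x46, 0xDC, 0xCB, 0xF5, 0x41, 0x0F,
                   0xB3, 0x0E, 0xD5, 0x46, 0x73, 0x83, 0xCB, 0xE4])

# Hypothetical stream: some junk, the signature, then two arbitrary bytes
# standing in for the start of the next chunk.
data = b"junk" + SIGNATURE + b"\x01\x02"
f = io.BytesIO(data)
matchpos = data.find(SIGNATURE)

# Offset used in the quoted code: lands ON the last signature byte.
f.seek(matchpos + 15)
byte_after_15 = f.read(1)

# Offset equal to the signature length: lands just AFTER the signature.
f.seek(matchpos + len(SIGNATURE))
byte_after_16 = f.read(1)

print(byte_after_15.hex(), byte_after_16.hex())
```

With an offset of 15 the next read returns the trailing 0xE4 of the signature itself; with an offset of 16 it returns the first byte that follows the signature.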

This cycle of error, boundary search, and failure would repeat until the file is consumed.
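One possible correction, assuming the intent is to leave the file pointer immediately after the boundary signature, is to seek by the signature length rather than a hard-coded 15. The sketch below is a standalone variant and not necessarily what the maintainers would choose: the name `scan_forward_fixed`, the `blocklen` parameter, and the boolean return value are introduced here purely for illustration.

```python
import io
import logging

logger = logging.getLogger(__name__)

# 16-byte boundary signature, copied from the _scan_forward code quoted above.
SIGNATURE = bytes([0x43, 0xA5, 0x46, 0xDC, 0xCB, 0xF5, 0x41, 0x0F,
                   0xB3, 0x0E, 0xD5, 0x46, 0x73, 0x83, 0xCB, 0xE4])


def scan_forward_fixed(f, blocklen=2**20):
    """Scan forward until just after the next boundary signature.

    Hypothetical variant of _scan_forward: returns True if a signature
    was found, False if the end of the file was reached first.
    """
    while True:
        curpos = f.tell()
        block = f.read(blocklen)
        matchpos = block.find(SIGNATURE)
        if matchpos != -1:
            # Skip the whole signature (16 bytes), not just 15 of them.
            f.seek(curpos + matchpos + len(SIGNATURE))
            logger.debug('  scan forward found a boundary chunk.')
            return True
        if len(block) < blocklen:
            logger.debug('  scan forward reached end of file with no match.')
            return False


# Usage on a made-up in-memory stream (not a valid XDF file).
stream = io.BytesIO(b"corrupted" + SIGNATURE + b"\x01\x05rest")
found = scan_forward_fixed(stream)
next_byte = stream.read(1)

no_match = scan_forward_fixed(io.BytesIO(b"nothing to see here"))
```

Using `len(SIGNATURE)` keeps the offset correct even if the signature constant is ever edited, which avoids repeating the same kind of off-by-one.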

@cbrnr
Contributor

cbrnr commented Apr 16, 2019

Fixed in xdf-modules/pyxdf#6.

@cbrnr
Contributor

cbrnr commented Apr 18, 2019

@cboulay this can be closed.

cboulay closed this as completed Apr 18, 2019