Hi,
I use your tool on a regular basis and have stumbled over an error that sometimes occurs when I process genomic data files with a certain tool and redirect the uncompressed data stream into a compressor (gzip, pigz, bgzip) while asynchronously creating a gztool index with line numbers in parallel. I tested gztool versions 1.4.3 and 1.6.1.
Decompressing [file].gz does not return any error, and other post-processing tools have never complained about the input. The index shown by gztool -l [file].gz looks fine to me too, but certain line offsets used to retrieve data lead to ERROR: Compressed data error in '[file].gz' when using the following command:
gztool -L [offset] -R 1 [file].gz
Unfortunately, I am unable to reproduce this issue on toy examples, but I always hit the same error when re-running the command above on the same input.
It would be great if you could have a look into it. Please find attached one of my seemingly erroneous files. Offset issues start to happen at line 38234850 (gztool -L 38234850 -R 1 erroneous_data.gz). erroneous_data.gz
Thank you so much!
Hi @koriege
Thank you very much for your detailed review and the data file. I have to study this error carefully.
Meanwhile, you can add -p to the gztool command line with -L, and it will patch the supposed "error" itself 👍
Maybe also -v0 so it doesn't bother you with details about the patching.
From your example: gztool -v0 -p -L [offset] -R 1 [file].gz
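If the extraction is driven from a script, the workaround above could be wrapped roughly as follows. This is only a minimal sketch: the helper names `gztool_line_cmd` and `extract_lines` are hypothetical, it assumes `gztool` is on the PATH, and it uses only the flags already shown in this thread (-v0, -p, -L, -R).

```python
import subprocess


def gztool_line_cmd(gz_path, line, r_value=1, patch=True, quiet=True):
    """Build the argument list for the command shown in this thread.

    Hypothetical helper: it only assembles the flags quoted above and
    does not validate them against any particular gztool version.
    """
    cmd = ["gztool"]
    if quiet:
        cmd.append("-v0")  # suppress details about the patching
    if patch:
        cmd.append("-p")   # let gztool patch the supposed "error" itself
    cmd += ["-L", str(line), "-R", str(r_value), gz_path]
    return cmd


def extract_lines(gz_path, line, r_value=1):
    """Run the command and return its raw stdout as bytes.

    Assumes gztool is installed; raises CalledProcessError if it
    exits with a non-zero status.
    """
    result = subprocess.run(
        gztool_line_cmd(gz_path, line, r_value),
        capture_output=True,
        check=True,
    )
    return result.stdout
```

For the attached file, `extract_lines("erroneous_data.gz", 38234850)` would run the same command as the example above.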
It is very rewarding to know about these uses of gztool 😊
Oh, I overlooked the -p parameter. This solves it for now. You may close this issue, which indeed seems to be related to the gzip blocks introduced by bgzip (I am using v1.16 installed via conda).