feature request: fread should raise an error instead of a warning when reading a gzipped file that does not fit in temporary storage #5415

Open
remomomo opened this issue Jul 6, 2022 · 0 comments


remomomo commented Jul 6, 2022

Hi,

Thanks for developing this awesome package!

fread has supported gzipped files for a while now (see #717), and it usually works great.

Recently, I have been working with containers more often, and these often have limited temporary storage. When reading a gzipped file, I believe fread first decompresses it to temporary storage and then reads the decompressed file from there into memory (I am aware of the tmpdir argument that controls the location). If the file is very large, the temporary storage can fill up. Instead of raising an error, fread then reads the truncated file and prints a cryptic warning saying that the file was truncated. This warning is easily overlooked when working non-interactively.

I think it would be better if fread raised an error in these cases instead of reading a truncated file. The error could then be caught early on. Reading only part of the data often leads to downstream errors, which are then harder to debug. I don't think many users want to read just the first x lines that happen to fit into their temporary storage.

A flag could still force reading a truncated gzipped file, if necessary.
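
Until something like this exists, a possible workaround is to promote warnings to errors around the call. The following is only a minimal sketch: `warnings_as_errors` is a hypothetical helper name (not part of data.table), the file path and tmpdir in the usage comment are placeholders, and the helper turns every warning into an error, not only the truncation one.

```r
# Promote any warning raised while evaluating `expr` to an error.
# NOTE: `warnings_as_errors` is a hypothetical helper, not a data.table function.
# It does not inspect the warning text, so ALL warnings become errors.
warnings_as_errors <- function(expr) {
  tryCatch(
    expr,  # lazily evaluated here, inside the handler's scope
    warning = function(w) {
      stop("warning promoted to error: ", conditionMessage(w), call. = FALSE)
    }
  )
}

# Usage sketch (path and tmpdir are placeholders):
# dt <- warnings_as_errors(
#   data.table::fread("big_file.csv.gz", tmpdir = "/path/with/enough/space")
# )
```

This is blunt, since harmless warnings would also abort the read, which is why I think a dedicated error inside fread would be the better solution.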

Thank you for your time.
