Control-Z as EOF #1612

Closed
daroczig opened this issue Mar 25, 2016 · 7 comments

@daroczig

Some CSV files generated on MS-DOS/Windows can have ^Z as the end-of-file character, e.g. https://www.treasury.gov/ofac/downloads/sdn.csv, which results in an error when calling fread:

Expected sep (',') but new line, EOF (or other non printing character) ends field 1 on line 6 when detecting types: ^Z

Removing that character from the end of the file resolves the problem.

Session info:

> sessionInfo()
R version 3.2.2 (2015-08-14)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 15.10

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] data.table_1.9.4

loaded via a namespace (and not attached):
[1] magrittr_1.5   plyr_1.8.3     tools_3.2.2    reshape2_1.4.1 Rcpp_0.12.3   
[6] stringi_1.0-1  stringr_1.0.0  chron_2.3-47  

I can also reproduce this problem with the most recent dev version of data.table, at 6f58f5c.

@jangorecki
Member

Some awk or sed should be a good workaround for now.
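A minimal sketch of such a pre-processing step, assuming a POSIX shell. It uses `tr` rather than awk or sed because its octal escapes behave the same across implementations. The function name `strip_ctrl_z` and the file names are hypothetical, and the sketch removes every Ctrl-Z byte in the file, not only a trailing one, which should be harmless for a plain-text CSV.

```shell
#!/bin/sh
# strip_ctrl_z is a hypothetical helper, not part of data.table.
# It copies the file given as $1 to the path given as $2, dropping
# every Ctrl-Z (SUB, octal 032) byte along the way.
strip_ctrl_z() {
  tr -d '\032' < "$1" > "$2"
}

# Example usage (placeholder file names):
# strip_ctrl_z sdn.csv sdn_clean.csv
```

The cleaned copy (here `sdn_clean.csv`) can then be passed to fread in place of the original file.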

@daroczig
Author

Yeah, as said, "removing that character from the end of the file resolves the problem" :) But I thought it was worth reporting, as others might have the very same issue. Not high-priority for sure.

@skanskan

skanskan commented Jul 7, 2017

It would be great if fread could remove it automatically.

@mattdowle mattdowle added this to the v1.10.6 milestone Jul 7, 2017
@skanskan

skanskan commented Jul 8, 2017

How can we install version 1.10.6?
I think
install.packages("data.table", type = "source", repos = "http://Rdatatable.github.io/data.table")
would install version 1.10.5

@st-pasha
Contributor

st-pasha commented Jul 9, 2017

Version 1.10.6 hasn't been released yet. There is only 1.10.4 on CRAN, and the "dev" version (1.10.5) -- which is based on the master branch in this repo.

@skanskan

skanskan commented Jul 9, 2017

I said it because I read "mattdowle added this to the v1.10.6 milestone". I thought we could try it in the dev version in some way.

@mattdowle
Member

mattdowle commented Jul 13, 2017

@skanskan When the last number is odd, that's the dev release. v1.10.5 will be renamed v1.10.6 when it is released to CRAN. Otherwise we all get confused when we grab the dev at different times. Only even-numbered versions can be obtained from CRAN, and each one is a guaranteed checkpoint.
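As an illustration of the odd/even convention described above, here is a small sketch in POSIX shell. The function name `is_dev_version` is hypothetical, and it assumes a plain dotted version string such as `1.10.5` whose last component is a number without leading zeros.

```shell
#!/bin/sh
# is_dev_version is a hypothetical helper illustrating the convention:
# it succeeds (exit status 0) when the last version component is odd,
# i.e. a dev release, and fails when it is even, i.e. a CRAN release.
is_dev_version() {
  patch="${1##*.}"              # text after the last dot, e.g. "5" from "1.10.5"
  [ $(( ${patch} % 2 )) -eq 1 ]
}

# Example usage:
# is_dev_version 1.10.5 && echo "dev release"
```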
