Closes #773. fread doesn't warn about last 5 lines when nrow arg is provided.
arunsrinivasan committed Sep 17, 2015
1 parent 13ff7ce commit 11637f2
Showing 4 changed files with 38 additions and 1 deletion.
1 change: 1 addition & 0 deletions README.md
@@ -108,6 +108,7 @@
* deals with quotes more robustly. When reading a quoted field fails, it re-attempts to read the field as if it wasn't quoted. This helps read in those fields that might have unbalanced quotes without erroring immediately, thereby closing issues [#568](https://github.com/Rdatatable/data.table/issues/568), [#1256](https://github.com/Rdatatable/data.table/issues/1256), [#1077](https://github.com/Rdatatable/data.table/issues/1077), [#1079](https://github.com/Rdatatable/data.table/issues/1079) and [#1095](https://github.com/Rdatatable/data.table/issues/1095). Thanks to @Synergist, @daroczig, @geotheory and @rsaporta for the reports.
* gains argument `strip.white` which is `TRUE` by default (unlike `base::read.table`). All unquoted columns' leading and trailing white spaces are automatically removed. If `FALSE`, only trailing spaces of the header are removed. Closes [#1113](https://github.com/Rdatatable/data.table/issues/1113), [#1035](https://github.com/Rdatatable/data.table/issues/1035), [#1000](https://github.com/Rdatatable/data.table/issues/1000), [#785](https://github.com/Rdatatable/data.table/issues/785), [#529](https://github.com/Rdatatable/data.table/issues/529) and [#956](https://github.com/Rdatatable/data.table/issues/956). Thanks to @dmenne, @dpastoor, @GHarmata, @gkalnytskyi, @renqian, @MatthewForrest, @fxi and @heraldb.
* doesn't warn about empty lines when 'nrow' argument is specified and that many rows are read properly. Thanks to @richierocks for the report. Closes [#1330](https://github.com/Rdatatable/data.table/issues/1330).
* doesn't error/warn about not being able to read last 5 lines when 'nrow' argument is specified. Thanks to @robbig2871. Closes [#773](https://github.com/Rdatatable/data.table/issues/773).

6. Auto indexing:
* `DT[colA == max(colA)]` now works again without needing `options(datatable.auto.index=FALSE)`. Thanks to Jan Gorecki and kaybenleroll, [#858](https://github.com/Rdatatable/data.table/issues/858). Test added.
28 changes: 28 additions & 0 deletions inst/tests/issue_773_fread.txt
@@ -0,0 +1,28 @@
AAA|BBB|CCC
4|5|6
7|8|9
1|2|3
1|2|3
1|2|3
1|2|3
1|2|3
1|2|3
1|2|3
1|2|3
1|2|3
1|2|3
1|2|3
1|2|3
1|2|3
1|2|3
1|2|3
1|2|3
1|2|3
31|32|33
21|22|23
ZZZ|YYY
10|11
1|2
1|2
1|2
1|2
7 changes: 7 additions & 0 deletions inst/tests/tests.Rraw
@@ -6948,6 +6948,13 @@ test(1557.3, names(fread(str, col.names=letters[1])), error="Can't assign 1 name
test(1557.4, names(fread(str, col.names=letters[1:3])), error="Can't assign 3 names to")
test(1557.5, names(fread(str, col.names=1:2)), error="Passed a vector of type")

# Fix for #773
f = "issue_773_fread.txt"
ans = data.table(AAA=as.integer(c(4,7,rep(1,17),31,21)),
                 BBB=as.integer(c(5,8,rep(2,17),32,22)),
                 CCC=as.integer(c(6,9,rep(3,17),33,23)))
test(1558, fread(f, nrow=21L), ans) # no warning

##########################


3 changes: 2 additions & 1 deletion src/fread.c
@@ -995,7 +995,8 @@ SEXP readfile(SEXP input, SEXP separg, SEXP nrowsarg, SEXP headerarg, SEXP nastr
         }
     }
     if (i<5) {
-        warning("Unable to find 5 lines with expected number of columns (%s)\n", str);
+        if (INTEGER(nrowsarg)[0]==-1)
+            warning("Unable to find 5 lines with expected number of columns (%s)\n", str);
         continue;
     }
 }
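
The change in src/fread.c can be read as: when fewer than 5 lines with the expected number of columns are found, warn only if the `nrows` argument was left at -1; if the caller supplied a row count, stay silent. The following is a minimal standalone sketch of that decision, not the code in fread.c. The helper names `count_fields` and `should_warn_about_tail` are hypothetical, the field counting ignores quoting, and the reading of -1 as "no row limit requested" is an assumption based on the condition in the diff.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of fields on one line, i.e. separators + 1.
 * Quoted fields are not handled; this is only an illustration. */
int count_fields(const char *line, char sep) {
    assert(line != NULL);
    int n = 1;
    for (const char *p = line; *p != '\0' && *p != '\n'; ++p) {
        if (*p == sep) ++n;
    }
    return n;
}

/* Hypothetical helper: returns 1 if the "Unable to find 5 lines" warning
 * should be raised, 0 otherwise.
 *   lines, nlines: the candidate lines that were inspected
 *   ncol:          expected number of columns
 *   nrows:         requested row count; -1 is assumed to mean "not specified" */
int should_warn_about_tail(const char *const *lines, size_t nlines,
                           char sep, int ncol, int nrows) {
    size_t matching = 0;
    for (size_t i = 0; i < nlines; ++i) {
        if (count_fields(lines[i], sep) == ncol) ++matching;
    }
    if (matching >= 5) return 0;   /* enough consistent lines were found */
    return nrows == -1;            /* warn only when no row count was given */
}
```

With the tail of issue_773_fread.txt (two-field lines such as `10|11` and `1|2`) and three expected columns, this sketch asks for a warning when `nrows` is -1 and suppresses it when `nrows` is 21, which is the situation test 1558 exercises.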