Commit

GroupedDataFrame performs some checks on the grouped_df objects, which prevent some issues related to corrupt `grouped_df` objects as the one made by rbind (#606).
romainfrancois committed Sep 22, 2014
1 parent 3d4b64d commit f716b02
Showing 3 changed files with 32 additions and 1 deletion.
6 changes: 5 additions & 1 deletion NEWS.md
@@ -1,6 +1,10 @@
 # dplyr 0.2.0.9000

-* hashing a numeric column and an integer column was wrong (#450)
+* `GroupedDataFrame` performs some checks on the `grouped_df` objects, which
+  prevent some issues related to corrupt `grouped_df` objects as the one
+  made by rbind (#606).
+
+* hashing a numeric column and an integer column was wrong (#450).

 * `nth` now correctly promotes the result when using dates, times and factors (#509).

14 changes: 14 additions & 0 deletions inst/include/dplyr/GroupedDataFrame.h
@@ -36,6 +36,20 @@ namespace Rcpp {
       group_sizes = data_.attr( "group_sizes" );
       biggest_group_size = data_.attr( "biggest_group_size" ) ;
       labels = data_.attr( "labels" );
+
+      if( !is_lazy ){
+        // check consistency of the groups
+        int rows_in_groups = sum(group_sizes) ;
+        if( data_.nrows() != rows_in_groups ){
+          std::stringstream s ;
+          s << "corrupt 'grouped_df', contains "
+            << data_.nrows()
+            << " rows, and "
+            << rows_in_groups
+            << " rows in groups" ;
+          stop(s.str()) ;
+        }
+      }
     }

     group_iterator group_begin() const {
13 changes: 13 additions & 0 deletions tests/testthat/test-filter.r
@@ -162,3 +162,16 @@ test_that( "$ does not end call traversing. #502", {

 })

+test_that( "GroupedDataFrame checks consistency of data (#606)", {
+  df1 <- data.frame(
+    group = factor(rep(c("C", "G"), 5)),
+    value = 1:10)
+  df1 <- df1 %>% group_by(group) #df1 is now tbl
+  df2 <- data.frame(
+    group = factor(rep("G", 10)),
+    value = 11:20)
+  df3 <- rbind(df1, df2) #df2 is data.frame
+
+  expect_error( df3 %>% filter(group == "C"), "corrupt 'grouped_df', contains" )
+
+})
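
Note on the check: the idea in the `GroupedDataFrame.h` hunk can be tried outside of R. The block below is only a minimal sketch in plain C++ without Rcpp. `GroupedMeta` and `check_group_consistency` are hypothetical names made up for illustration, and `std::runtime_error` stands in for Rcpp's `stop()`. It compares the row count with the sum of the recorded group sizes and builds the same message as the commit.

```cpp
#include <numeric>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for the grouping metadata that a grouped_df carries
// as attributes; this is NOT the real GroupedDataFrame class.
struct GroupedMeta {
    int nrows;                     // rows actually present in the data
    std::vector<int> group_sizes;  // recorded size of each group
};

// Throws when the recorded group sizes do not account for every row.
// std::runtime_error is used here in place of Rcpp's stop().
inline void check_group_consistency(const GroupedMeta& meta) {
    int rows_in_groups =
        std::accumulate(meta.group_sizes.begin(), meta.group_sizes.end(), 0);
    if (meta.nrows != rows_in_groups) {
        std::ostringstream s;
        s << "corrupt 'grouped_df', contains " << meta.nrows
          << " rows, and " << rows_in_groups << " rows in groups";
        throw std::runtime_error(s.str());
    }
}
```

For the data in the new test, and assuming `rbind` carries over the group sizes of `df1` unchanged, the metadata would look like `GroupedMeta{20, {5, 5}}`: 20 rows in the data but only 10 accounted for by the groups, so the check throws.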
