GForce optimization for head/tail with arbitrary n #5060

Closed
myoung3 opened this issue Jul 1, 2021 · 0 comments · Fixed by #5089

myoung3 commented Jul 1, 2021

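Mentioned here: #523 (comment)

GForce currently optimizes head(.SD, n) and tail(.SD, n) by group only when n is 1L. With any other n (2L in the reprex), the verbose output reports "GForce is on, left j unchanged" and j is evaluated once per group.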

library(data.table)
#> Warning: package 'data.table' was built under R version 4.0.5
options(datatable.verbose=TRUE)
d = as.data.table(iris)


d[, head(.SD,1L),by="Species"]
#> Finding groups using forderv ... forder.c received 150 rows and 1 columns
#> 0.000s elapsed (0.000s cpu) 
#> Finding group sizes from the positions (can be avoided to save RAM) ... 0.000s elapsed (0.000s cpu) 
#> lapply optimization changed j from 'head(.SD, 1L)' to 'list(head(Sepal.Length, 1L), head(Sepal.Width, 1L), head(Petal.Length, 1L), head(Petal.Width, 1L))'
#> GForce optimized j to 'list(ghead(Sepal.Length, 1L), ghead(Sepal.Width, 1L), ghead(Petal.Length, 1L), ghead(Petal.Width, 1L))'
#> Making each group and running j (GForce TRUE) ... gforce initial population of grp took 0.001
#> gforce assign high and low took 0.000
#> gforce eval took 0.000
#> 0.000s elapsed (0.000s cpu)
#>       Species Sepal.Length Sepal.Width Petal.Length Petal.Width
#> 1:     setosa          5.1         3.5          1.4         0.2
#> 2: versicolor          7.0         3.2          4.7         1.4
#> 3:  virginica          6.3         3.3          6.0         2.5
d[, head(.SD,2L),by="Species"]
#> Finding groups using forderv ... forder.c received 150 rows and 1 columns
#> 0.000s elapsed (0.000s cpu) 
#> Finding group sizes from the positions (can be avoided to save RAM) ... 0.000s elapsed (0.000s cpu) 
#> lapply optimization changed j from 'head(.SD, 2L)' to 'list(head(Sepal.Length, 2L), head(Sepal.Width, 2L), head(Petal.Length, 2L), head(Petal.Width, 2L))'
#> GForce is on, left j unchanged
#> Old mean optimization is on, left j unchanged.
#> Making each group and running j (GForce FALSE) ... 
#>   memcpy contiguous groups took 0.000s for 3 groups
#>   eval(j) took 0.000s for 3 calls
#> 0.000s elapsed (0.000s cpu)
#>       Species Sepal.Length Sepal.Width Petal.Length Petal.Width
#> 1:     setosa          5.1         3.5          1.4         0.2
#> 2:     setosa          4.9         3.0          1.4         0.2
#> 3: versicolor          7.0         3.2          4.7         1.4
#> 4: versicolor          6.4         3.2          4.5         1.5
#> 5:  virginica          6.3         3.3          6.0         2.5
#> 6:  virginica          5.8         2.7          5.1         1.9


d[, tail(.SD,1L),by="Species"] 
#> Finding groups using forderv ... forder.c received 150 rows and 1 columns
#> 0.000s elapsed (0.000s cpu) 
#> Finding group sizes from the positions (can be avoided to save RAM) ... 0.000s elapsed (0.000s cpu) 
#> lapply optimization changed j from 'tail(.SD, 1L)' to 'list(tail(Sepal.Length, 1L), tail(Sepal.Width, 1L), tail(Petal.Length, 1L), tail(Petal.Width, 1L))'
#> GForce optimized j to 'list(gtail(Sepal.Length, 1L), gtail(Sepal.Width, 1L), gtail(Petal.Length, 1L), gtail(Petal.Width, 1L))'
#> Making each group and running j (GForce TRUE) ... gforce initial population of grp took 0.000
#> gforce assign high and low took 0.000
#> gforce eval took 0.000
#> 0.000s elapsed (0.000s cpu)
#>       Species Sepal.Length Sepal.Width Petal.Length Petal.Width
#> 1:     setosa          5.0         3.3          1.4         0.2
#> 2: versicolor          5.7         2.8          4.1         1.3
#> 3:  virginica          5.9         3.0          5.1         1.8
d[, tail(.SD,2L),by="Species"]
#> Finding groups using forderv ... forder.c received 150 rows and 1 columns
#> 0.000s elapsed (0.000s cpu) 
#> Finding group sizes from the positions (can be avoided to save RAM) ... 0.000s elapsed (0.000s cpu) 
#> lapply optimization changed j from 'tail(.SD, 2L)' to 'list(tail(Sepal.Length, 2L), tail(Sepal.Width, 2L), tail(Petal.Length, 2L), tail(Petal.Width, 2L))'
#> GForce is on, left j unchanged
#> Old mean optimization is on, left j unchanged.
#> Making each group and running j (GForce FALSE) ... 
#>   memcpy contiguous groups took 0.000s for 3 groups
#>   eval(j) took 0.000s for 3 calls
#> 0.000s elapsed (0.000s cpu)
#>       Species Sepal.Length Sepal.Width Petal.Length Petal.Width
#> 1:     setosa          5.3         3.7          1.5         0.2
#> 2:     setosa          5.0         3.3          1.4         0.2
#> 3: versicolor          5.1         2.5          3.0         1.1
#> 4: versicolor          5.7         2.8          4.1         1.3
#> 5:  virginica          6.2         3.4          5.4         2.3
#> 6:  virginica          5.9         3.0          5.1         1.8

Created on 2021-07-01 by the reprex package (v1.0.0)
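
For readers who need the first or last n rows per group in the meantime, one commonly used idiom is to collect row numbers with .I and subset once. The snippet below is only a sketch of that idiom, not something proposed in this issue: the variable names (n, first_idx, last_idx, first_n, last_n) are made up for illustration, and no claim is made here about how its speed compares with a GForce-optimized head/tail.

```r
library(data.table)

d = as.data.table(iris)
n = 2L  # hypothetical group-wise row count; any positive integer

# Row numbers (.I) of the first n rows of each Species group.
# The grouped result stores them in the default column V1.
first_idx = d[, head(.I, n), by = "Species"]$V1
first_n = d[first_idx]

# Same idea for the last n rows of each group.
last_idx = d[, tail(.I, n), by = "Species"]$V1
last_n = d[last_idx]
```

The rows returned match those of d[, head(.SD, 2L), by="Species"] and d[, tail(.SD, 2L), by="Species"] above, except that Species keeps its original column position instead of being moved to the front.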

@mattdowle mattdowle added this to the 1.14.1 milestone Aug 25, 2021
@jangorecki jangorecki modified the milestones: 1.14.9, 1.15.0 Oct 29, 2023