How do I use the “.” operator in Tidier.jl? #148

Closed
gioneves opened this issue Sep 9, 2024 · 1 comment

gioneves commented Sep 9, 2024

We can do this in R:

library(dplyr)

df <- data.frame(
  a = c(5, 2, 6),
  b = c(2, 4, 6),
  c = c(10, 2, 1)
)

df %>%
  mutate(across((a:c), ~ if_else(. < 5, "OK", "NOK"), .names = "new{col}"))

In Julia, I tried that:

using Tidier

df = DataFrame(a = [5, 2, 6], b = [2, 4, 6], c = [10, 2, 1])

@chain df begin
  @mutate(across((a, b), if_else(.<5, "OK", "NOK") ))
end

But:

ERROR: ParseError:

Error @ REPL[102]:2:31

@chain df begin
  @mutate(across((a:c), if_else(.<5, "OK", "NOK") ))

└┘ ── not a unary operator

Stacktrace:
[1] top-level scope
@ none:1

Is it possible to use the “.” operator in Tidier and apply .names (as in R), so that I don't have to write the columns out one by one like this?

@chain df begin
  @mutate(
    newa = ifelse.(a .< 5, "OK", "NOK"),
    newb = ifelse.(b .< 5, "OK", "NOK"),
    newc = ifelse.(c .< 5, "OK", "NOK")
  )
end

Member

kdpsingh commented Sep 9, 2024

Here's what you are looking for:

julia> @chain df begin
         @mutate(across((a, b), x -> if_else.(x .< 5, "OK", "NOK")))
       end
3×5 DataFrame
 Row │ a      b      c      a_function  b_function 
     │ Int64  Int64  Int64  String      String     
─────┼─────────────────────────────────────────────
   1 │     5      2     10  NOK         OK
   2 │     2      4      2  OK          OK
   3 │     6      6      1  NOK         NOK

across() is able to accept anonymous functions like the one shown here. The one caveat with across() is that it operates on entire columns and doesn't auto-vectorize the contents of functions. So when you define the anonymous function, you have to vectorize if_else() as if_else.() and x < 5 as x .< 5.
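
To see why the dots matter, here is a minimal sketch in plain Julia, without Tidier. It assumes Base's `ifelse` as a stand-in for `if_else`, and `label` is a hypothetical helper name, not part of any package. The function receives the whole column as a vector, so both the comparison and the branch have to be broadcast:

```julia
# Plain Julia only; Base.ifelse stands in for Tidier's if_else (assumption).
a = [5, 2, 6]

# Hypothetical helper: the dots broadcast the comparison and the branch
# over each element of the column.
label(x) = ifelse.(x .< 5, "OK", "NOK")

println(label(a))  # ["NOK", "OK", "NOK"]

# Without the dot, the comparison is attempted on the whole vector at once,
# which has no matching method.
undotted_fails = try
    a < 5
    false
catch err
    err isa MethodError
end
println(undotted_fails)  # true
```

The result for `a` matches the `a_function` column in the output above.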

You can also use a named function, like this:

julia> @chain df begin
         @mutate(across((a, b), function okay(x) if_else.(x .< 5, "OK", "NOK") end))
       end
3×5 DataFrame
 Row │ a      b      c      a_okay  b_okay 
     │ Int64  Int64  Int64  String  String 
─────┼─────────────────────────────────────
   1 │     5      2     10  NOK     OK
   2 │     2      4      2  OK      OK
   3 │     6      6      1  NOK     NOK

Hope that helps, and happy to clarify further.
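
On the `.names` part of the question: the outputs above use the default `column_function` naming. If a prefix such as `new` is needed, one possible workaround is to drop down to DataFrames.jl directly. This is a sketch of a swapped-in technique, not the Tidier.jl API; it assumes DataFrames.jl is installed, and the `cols` variable and `new` prefix are only for illustration:

```julia
using DataFrames

df = DataFrame(a = [5, 2, 6], b = [2, 4, 6], c = [10, 2, 1])

# Columns to transform (illustrative choice, mirrors a:c in the R example).
cols = [:a, :b, :c]

# ByRow applies the scalar function to each element, so no dots are needed
# inside it. The last part of each pair supplies the new column name.
out = transform(
    df,
    cols .=> ByRow(x -> x < 5 ? "OK" : "NOK") .=> Symbol.("new", cols),
)

println(names(out))  # ["a", "b", "c", "newa", "newb", "newc"]
```

Here `ByRow` plays the role that the dots play in the anonymous function passed to `across()`.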

kdpsingh closed this as completed Sep 9, 2024