Add Coerce #37

Merged
merged 43 commits into from
Apr 13, 2022
Changes from 42 commits
Commits
5eb55b1
Added Coerce transform
Apr 5, 2022
7d4fc92
Fixed the tests for Coerce transform
Apr 5, 2022
e3be036
Fixed a style issue
Apr 5, 2022
a75ac7b
Updated Coerce transform
Apr 5, 2022
106c92e
Added a missing dependency for tests
Apr 5, 2022
93fa98e
Revised Coerce transform
Apr 6, 2022
81d6c4a
Added a new constructor for Coerce struct
Apr 6, 2022
654cc04
Added missing parameters to Coerce transform & Tidied up Coerce trans…
Apr 6, 2022
bd6ee77
Revised Coerce transform
Apr 7, 2022
5d82ca6
Update src/transforms/coerce.jl
ceferisbarov Apr 8, 2022
4d4b732
Update src/transforms/coerce.jl
ceferisbarov Apr 8, 2022
083bfb9
Update src/transforms/coerce.jl
ceferisbarov Apr 8, 2022
567bf3a
Updated docstring of Coerce transform
Apr 8, 2022
0355178
Update src/transforms/coerce.jl
ceferisbarov Apr 8, 2022
6b6e430
Update test/runtests.jl
ceferisbarov Apr 8, 2022
dc0a308
Updated tests for Coerce transform
Apr 8, 2022
362798d
Merge branch 'coerce' of https://github.com/ceferisbarov/TableTransfo…
Apr 8, 2022
967eda0
Removed an unnecessary dependency from test/runtests.jl
Apr 8, 2022
a194745
Updated the reverse function for Coerce transform
Apr 8, 2022
8a7f82d
Update test/transforms.jl
ceferisbarov Apr 9, 2022
1f949c9
Update src/transforms/coerce.jl
ceferisbarov Apr 9, 2022
530828f
Refactored the revert function of Coerce transform
Apr 9, 2022
16afabf
Update src/transforms/coerce.jl
ceferisbarov Apr 9, 2022
a572062
Update src/transforms/coerce.jl
ceferisbarov Apr 9, 2022
92b8f21
Update src/transforms/coerce.jl
ceferisbarov Apr 9, 2022
bf913d7
Update src/transforms/coerce.jl
ceferisbarov Apr 9, 2022
be90b87
Fixed a typo
Apr 9, 2022
b005beb
Fixed a typo
Apr 9, 2022
eae03d4
Added categorical tests for Coerce transform
Apr 9, 2022
147bb14
Added a missing dependency
Apr 9, 2022
7a9eeb5
Update test/runtests.jl
juliohm Apr 10, 2022
6c9457f
Refactored the reverse function of Coerce transform
Apr 10, 2022
b3d4387
Merge branch 'coerce' of https://github.com/ceferisbarov/TableTransfo…
Apr 10, 2022
580dd2e
Update test/transforms.jl
juliohm Apr 11, 2022
41fec8b
Update test/transforms.jl
juliohm Apr 11, 2022
b98ff15
Update test/transforms.jl
juliohm Apr 11, 2022
1fd2523
Update test/transforms.jl
juliohm Apr 11, 2022
8252563
Update src/transforms/coerce.jl
juliohm Apr 11, 2022
537c335
Added Coerce to README
Apr 12, 2022
7a2d1ab
Update README.md
juliohm Apr 12, 2022
c735b9b
Merge branch 'master' into coerce
ceferisbarov Apr 12, 2022
d99e7e8
Removed a space
Apr 12, 2022
f196d3a
Update README.md
juliohm Apr 13, 2022
1 change: 1 addition & 0 deletions README.md
@@ -151,6 +151,7 @@ Please check the docstrings for additional information.
| `DropMissing` | Drop missings |
| `Rename` | Column renaming |
| `Coalesce` | Replace missings |
| `Coerce` | Coerce scientific types |
| `Identity` | Identity transform |
| `Center` | Mean removal |
| `Scale` | Interval scaling |
1 change: 1 addition & 0 deletions src/TableTransforms.jl
@@ -37,6 +37,7 @@ export
DropMissing,
Rename,
Coalesce,
Coerce,
Identity,
Center,
Scale,
1 change: 1 addition & 0 deletions src/transforms.jl
@@ -215,6 +215,7 @@ include("transforms/select.jl")
include("transforms/filter.jl")
include("transforms/rename.jl")
include("transforms/coalesce.jl")
include("transforms/coerce.jl")
include("transforms/identity.jl")
include("transforms/center.jl")
include("transforms/scale.jl")
48 changes: 48 additions & 0 deletions src/transforms/coerce.jl
@@ -0,0 +1,48 @@
# ------------------------------------------------------------------
# Licensed under the MIT License. See LICENSE in the project root.
# ------------------------------------------------------------------

"""
    Coerce(pairs...; tight=false, verbosity=1)

Return a copy of the table, ensuring that the scientific types of the columns match the new specification.

This transform wraps the `ScientificTypes.coerce` function. Please see its docstring for more details.

```julia
Coerce(:col1 => Continuous, :col2 => Count)
```
"""
struct Coerce{P} <: Transform
  pairs::P
  tight::Bool
  verbosity::Int
end

Coerce(pair::Pair{Symbol,<:Type}...; tight=false, verbosity=1) =
  Coerce(pair, tight, verbosity)

isrevertible(::Type{<:Coerce}) = true

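function apply(transform::Coerce, table)
  # coerce the selected columns to the requested scientific types
  newtable = coerce(table, transform.pairs...;
                    tight=transform.tight,
                    verbosity=transform.verbosity)

  # cache the original column types so that revert can restore them
  types = Tables.schema(table).types

  newtable, types
end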

function revert(transform::Coerce, newtable, cache)
  names = Tables.columnnames(newtable)
  cols = Tables.columns(newtable)

  # convert each column back to its cached original element type
  oldcols = map(zip(cache, names)) do (T, n)
    x = Tables.getcolumn(cols, n)
    collect(T, x)
  end

  𝒯 = (; zip(names, oldcols)...)
  𝒯 |> Tables.materializer(newtable)
end
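
A minimal usage sketch of the transform added in this file, mirroring the tests in this PR. It assumes `Table` comes from TypedTables (as in `test/runtests.jl`) and that `apply` and `revert` are available after `using TableTransforms`, as they are in the test suite. The column values are illustrative only.

```julia
using TableTransforms
using TypedTables
using ScientificTypes: Count

# a small table with two Float64 columns (illustrative data)
t = Table(x1 = [1.0, 2.0, 3.0], x2 = [4.0, 5.0, 6.0])

# coerce only x1 to the Count scientific type
T = Coerce(:x1 => Count)

# apply returns the new table and a cache holding the original column types
n, c = apply(T, t)

# revert converts every column back to its cached element type
tₒ = revert(T, n, c)

# the tests in this PR expect an Int element type after coercing to Count,
# and the original element type after reverting
@show eltype(n.x1) eltype(tₒ.x1)
```

Because the cache stores the element types of all columns, `revert` converts every column with `collect`, including columns that were not coerced.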

2 changes: 2 additions & 0 deletions test/Project.toml
@@ -1,11 +1,13 @@
[deps]
CategoricalArrays = "324d7699-5711-5eae-9e2f-1d82baa6b597"
Distributions = "31c24e10-a181-5473-b8eb-7969acd0382f"
GR = "28b8d3ca-fb5f-59d9-8090-bfdbd6d07a71"
ImageIO = "82e4d734-157c-48bb-816b-45c225c6df19"
LinearAlgebra = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e"
Plots = "91a5bcdd-55d7-5caf-9e0b-520d859cae80"
Random = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c"
ReferenceTests = "324d217c-45ce-50fc-942e-d289b448e8cf"
ScientificTypes = "321657f4-b219-11e9-178b-2701a2544e81"
Statistics = "10745b16-79ce-11e8-11f9-7d13ad32a3b2"
Tables = "bd369af6-aec1-5ad0-b16a-f7cc5008161c"
Test = "8dfed614-e22c-5e08-85e1-65c5234f0b40"
2 changes: 2 additions & 0 deletions test/runtests.jl
@@ -2,6 +2,8 @@ using TableTransforms
using Distributions
using Tables
using TypedTables
using CategoricalArrays
using ScientificTypes: Count, Multiclass
using LinearAlgebra
using Statistics
using Test, Random, Plots
25 changes: 25 additions & 0 deletions test/transforms.jl
@@ -673,6 +673,31 @@
@test ttypes == Tables.schema(tₒ).types
end

@testset "Coerce" begin
  x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
  x2 = [1.0, 2.0, 3.0, 4.0, 5.0]
  x3 = [5.0, 5.0, 5.0, 5.0, 5.0]
  t = Table(;x1, x2, x3)

  T = Coerce(:x1=>Count, :x2=>Count)
  n, c = apply(T, t)
  @test eltype(n.x1) == Int
  @test eltype(n.x2) == Int
  n, c = apply(T, t)
  tₒ = revert(T, n, c)
  @test eltype(tₒ.x1) == eltype(t.x1)
  @test eltype(tₒ.x2) == eltype(t.x2)

  T = Coerce(:x1=>Multiclass, :x2=>Multiclass)
  n, c = apply(T, t)
  @test eltype(n.x1) <: CategoricalValue
  @test eltype(n.x2) <: CategoricalValue
  n, c = apply(T, t)
  tₒ = revert(T, n, c)
  @test eltype(tₒ.x1) == eltype(t.x1)
  @test eltype(tₒ.x2) == eltype(t.x2)
end

@testset "Identity" begin
x = rand(4000)
y = rand(4000)