
Removing transform field and creating TransformedKernel (and ScaledKernel) #32

Merged
merged 24 commits into master from remove-transform on Feb 28, 2020

Conversation


@theogf theogf commented Jan 28, 2020

Following the discussion in #17, I removed the transform field (and the parametrization on it). Now all base kernels default to IdentityTransform.
On top of that, I created TransformedKernel, which contains a transform and a kernel, and ScaledKernel, which contains a premultiplying scalar (see the sketch after this list).
Left to do:

  • Decide on the nomenclature for constructors (for now I implemented a toy example with the same names in lowercase)
  • Create appropriate tests
  • Improve the printing of the kernels (I have some ideas on this)
  • Adapt documentation and examples
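
For reference, a condensed view of the two new wrapper types as they appear in the diffs further down (sketch only; field names are taken from this PR's code):

struct TransformedKernel{Tk<:Kernel,Tr<:Transform} <: Kernel
    kernel::Tk
    transform::Tr # applied to the inputs before evaluating `kernel`
end

struct ScaledKernel{Tk<:Kernel,Tσ<:Real} <: Kernel
    kernel::Tk
    σ::Vector{Tσ} # premultiplying scalar, stored in a length-1 Vector (discussed below)
end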

@theogf theogf requested a review from willtebbutt January 28, 2020 09:55
@theogf theogf self-assigned this Jan 28, 2020

ConstantKernel(t::Tr,c::Tc=1.0) where {Tr<:Transform,Tc<:Real} = ConstantKernel{Tr,Tc}(t,c)
params(k::ConstantKernel) = (k.c)
Member

Is params (and opt_params) supposed to return a tuple? This will just return a scalar value.

Member Author

These are just bad leftovers from something else I was doing. This is related to #31. I will make a more thorough PR at some point.

@theogf theogf added the WIP label Jan 28, 2020
@@ -0,0 +1,20 @@
struct ScaledKernel{Tk<:Kernel,Tσ<:Real} <: Kernel
kernel::Tk
σ::Vector{Tσ}
Member

I'm curious, why is the scaling parameter a vector? Couldn't you just use

Suggested change
σ::Vector{Tσ}
σ::Tσ

?

Member Author

The idea is that it makes the types much more Zygote-compatible. One can then optimize the parameters in place, by passing a reference to the parameter; a scalar would not allow this. Another option would be to use a Ref{Tσ}, but I find this quite ugly in general!

Member

Hmm, I'm pretty sure that not using a vector should be completely fine with Zygote - at least I would be surprised if it's not: https://fluxml.ai/Zygote.jl/dev/#Structs-and-Types-1
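
For instance, this minimal sketch (not from this PR) shows Zygote differentiating through a plain scalar struct field:

using Zygote

struct Scale{T<:Real}
    s::T # plain scalar field, no Vector needed
end
(f::Scale)(x) = f.s * x

Zygote.gradient(f -> f(3.0), Scale(2.0)) # ((s = 3.0,),)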

Member Author

Sorry, I was not clear. It's actually to be more compatible with Flux.
The whole implicit-gradient approach allows you to save references to parameters (via an IdDict) and apply the gradients directly in place without having to build new objects. But this is only possible for mutable objects, so it would not work for a scalar. Or maybe I got everything wrong about Flux 😅
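
A minimal sketch of the Flux workflow described above (the ScaledKernel constructor call, the obsdim convention, and the differentiability of kernelmatrix are illustrative assumptions, not tested code):

using Flux, KernelFunctions

X = rand(3, 10)
k = ScaledKernel(SqExponentialKernel(), [2.0]) # σ stored as a Vector

ps = Flux.params(k.σ) # implicit parameters, tracked by object identity in an IdDict
gs = Flux.gradient(() -> sum(kernelmatrix(k, X; obsdim = 2)), ps)
Flux.Optimise.update!(ADAM(0.01), ps, gs) # mutates k.σ in place; a plain scalar field could not be updated this way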

Member

What's the issue with using a Ref rather than a Vector here? I would have thought that a reference to a Real would be more readable than a Vector{<:Real} that you have to know is always of length 1?

@devmotion

Do we actually need an IdentityTransform? I thought one possibility would be to just implement kernels without considering transforms at all and not having to define any transform function. One could just define general wrappers such as TransformedKernel or ScaledKernel that implement all required methods (such as kappa etc.) by applying the transformations and otherwise forwarding to the corresponding function of the underlying kernel, e.g.:

kappa(κ::TransformedKernel, d) = kappa(κ.kernel, d)
kappa(κ::TransformedKernel, x, y) = kappa(κ.kernel, κ.transform(x), κ.transform(y))

Maybe one could define some more convenient methods such as

transform(f, k::Kernel) = TransformedKernel(k, f)
Base.:*(s::Real, k::Kernel) = ScaledKernel(k, s)
Base.:*(k::Kernel, s::Real) = s * k

In general, I think it might be a bit confusing that TransformedKernel transforms the inputs whereas ScaledKernel transforms the evaluated kernel function. I'm not sure if this could be avoided by different or more explicit names.

@theogf theogf left a comment

Do we actually need an IdentityTransform?

Well, yeah, we could; the advantage so far is that it unifies everything (base kernels and transformed kernels).

Maybe one could define some more convenient methods

Why not for the first one, although I'm starting to feel we are overloading transform with a lot of unrelated stuff.
The second one is already there.
I don't think the third one is a good idea.

In general, I think it might be a bit confusing that TransformedKernel transforms the inputs whereas ScaledKernel transforms the evaluated kernel function. I'm not sure if this could be avoided by different or more explicit names.

Changing ScaledKernel sounds like a good idea, any suggestions?

@devmotion

Why not for the first one although I start to feel we are overloading transform with a lot of unrelated stuff.

Oh, I thought that could be the only implementation of transform. I assumed that the implementation of TransformedKernel could be such that one never has to worry about or deal with transformations: everything regarding transformations is handled by TransformedKernel, and one would always just evaluate kappa, metric, kernelmatrix, etc. for a given kernel.

@theogf

theogf commented Jan 28, 2020

Oh that's a perfectly fair point then!

@@ -0,0 +1,13 @@
struct TransformedKernel{Tk<:Kernel,Tr<:Transform} <: Kernel
Member

Maybe one could remove the Transform type and allow arbitrary types and functions as transformations?

Member Author

The idea is to have a wrapper around all types.

@willtebbutt willtebbutt left a comment

Looks good so far. Just a few things to sort out.

transform::Tr
end

@inline transform(k::TransformedKernel) = k.transform
Member

Do we need a transform function anymore?

src/generic.jl Outdated
@@ -1,5 +1,4 @@
@inline metric(κ::Kernel) = κ.metric

@inline transform(::Kernel) = IdentityTransform()
Member

Do we need a transform function anymore?

kernels::Vector{Kernel}
weights::Vector{Real}
function KernelSum{Tr}(kernels::AbstractVector{<:Kernel},weights::AbstractVector{<:Real}) where {Tr}
new{Tr}(kernels,weights)
function KernelSum(kernels::AbstractVector{<:Kernel},weights::AbstractVector{<:Real})
Member

Can probably get rid of this constructor entirely


@theogf

theogf commented Jan 29, 2020

Do we actually need an IdentityTransform? I thought one possibility would be to just implement kernels without considering transforms at all and not having to define any transform function.

I went to make the changes to drop the need for IdentityTransform and ran into an issue:
we still want to compute the pairwise matrix using Distances.jl (it is more efficient this way), and for this we need to pre-apply the transform.
A solution would be that transform(::BaseKernel, X) simply returns X.

@theogf

theogf commented Jan 29, 2020

About the Ref vs Vector problem:
They have the same size:

sizeof(Ref(3.0)) ## 8
sizeof([3.0]) ## 8

But the functions used to set values on a Ref are different from those for a Vector.
The advantage of Vector would be that when one collects all the parameters of a complex kernel, the result is a homogeneous collection of arrays and not a mix of Refs and Vectors, which would have to be treated differently.
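
The differing mutation APIs in question, as a minimal sketch:

r = Ref(3.0)
r[] = 4.0  # a Ref is read and written with r[]

v = [3.0]
v[1] = 4.0 # a length-1 Vector is read and written with v[1]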

@devmotion

I went to make the changes to drop the need for IdentityTransform and ran into an issue:
we still want to compute the pairwise matrix using Distances.jl (it is more efficient this way), and for this we need to pre-apply the transform.
A solution would be that transform(::BaseKernel, X) simply returns X.

My suggestion would be to define kernelmatrix! etc. for TransformedKernel, I think that should solve the issue. E.g.,

# default implementation without considering transforms at all
function kernelmatrix!(
        K::Matrix,
        κ::Kernel,
        X::AbstractMatrix;
        obsdim::Int = defaultobs
        )
    @assert obsdim ∈ [1,2] "obsdim should be 1 or 2 (see docs of kernelmatrix)"
    if !check_dims(K,X,X,feature_dim(obsdim),obsdim)
        throw(DimensionMismatch("Dimensions of the target array K $(size(K)) are not consistent with X $(size(X))"))
    end
    map!(x->kappa(κ,x),K,pairwise(metric(κ),X,dims=obsdim))
end

# special implementation for TransformedKernel
function kernelmatrix!(
        K::Matrix,
        κ::TransformedKernel,
        X::AbstractMatrix;
        obsdim::Int = defaultobs
        )
    kernelmatrix!(K, κ.kernel, κ.transform(X, obsdim); obsdim = obsdim)
end

@theogf

theogf commented Jan 29, 2020

My suggestion would be to define kernelmatrix! etc. for TransformedKernel, I think that should solve the issue. E.g.,

The problem with this is that we end up having the exact problem we were trying to avoid! 😆
Dispatching kernelmatrix functions is really painful because you need to treat all cases (X,XY,diagX).
Doing:

transform(κ::Kernel,x;obsdim=defaultobs) = x
transform(κ::TransformedKernel,x;obsdim=defaultobs) = transform(κ.transform,x,obsdim=obsdim)

Would produce the exact same result.
If it is only about the function name transform, we can change it to something else.

@devmotion

devmotion commented Jan 29, 2020

By the way, one should probably use RefValue instead of Ref: https://discourse.julialang.org/t/ref-vs-zero-dimensional-arrays/24434/11

julia> struct S
           s::Ref{Int}
       end

julia> struct T
           s::Base.RefValue{Int}
       end

julia> T(x::Int) = T(Ref(x))
T

julia> f(x) = x.s[]
f (generic function with 1 method)

julia> a = S(1)
S(Base.RefValue{Int64}(1))

julia> b = T(1)
T(Base.RefValue{Int64}(1))

julia> @code_warntype f(a)
Variables
  #self#::Core.Compiler.Const(f, false)
  x::S

Body::Any
1 ─ %1 = Base.getproperty(x, :s)::Ref{Int64}
│   %2 = Base.getindex(%1)::Any
└──      return %2

julia> @code_warntype f(b)
Variables
  #self#::Core.Compiler.Const(f, false)
  x::T

Body::Int64
1 ─ %1 = Base.getproperty(x, :s)::Base.RefValue{Int64}
│   %2 = Base.getindex(%1)::Int64
└──      return %2

@willtebbutt

Having now thought about this a bit more, I'm not sure that there's any way around what @devmotion proposes (which is also what I had imagined happening). The reason being that, for example, KernelSum doesn't define kappa or metric because it doesn't make sense for it, and I would very much like to be able to transform the inputs to a KernelSum.

AFAICT, the kappa + metric abstraction works really well for base kernels (I'm going to implement it in Stheno shortly so that I'm closer to being ready for KernelFunctions integration -- very excited about that), but doesn't work for composite kernels such as KernelSum, KernelProduct and TransformedKernel.
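
A sketch of the consequence (hypothetical code, not this PR's implementation): a composite kernel has no single kappa/metric pair, so kernelmatrix can only forward to its components, e.g. for the KernelSum fields shown above:

function kernelmatrix(κ::KernelSum, X::AbstractMatrix; obsdim::Int = defaultobs)
    # no metric/kappa for the sum itself; delegate to each weighted component
    sum(w * kernelmatrix(k, X; obsdim = obsdim) for (w, k) in zip(κ.weights, κ.kernels))
end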

@devmotion

I thought about this a bit more as well, and not having a transform trait still feels like the more natural thing to do. I think @willtebbutt pointed out the strongest argument in favour of dropping transform and just implementing kernelmatrix and kernelmatrix! for TransformedKernel.

The problem with this is that we end up having the exact problem we were trying to avoid! 😆

I thought this PR tries to solve the problem that we have to carry around Transform type parameters for all kernels since some kernels do not allow for a natural/reasonable definition of Transforms, so it seems the same reason also motivates dropping the transform trait?

Dispatching kernelmatrix functions is really painful because you need to treat all cases (X,XY,diagX).

IMO the number of functions that have to be added is not unreasonably large - and at the same time it allows us to remove all uses of transform 🙂

@theogf

theogf commented Feb 5, 2020

Ok, I will implement these changes then.
I just need to put this on hold, as I have a deadline at the end of the week :)

@willtebbutt

No worries. No prizes for guessing what the deadline is though haha

@devmotion

κ.transform(X,obsdim) does not seem natural to me; it looks a lot like OOP, which we don't necessarily want.
Here is my general take given all the inputs:
I replace transform by apply (not exported), and overload kernelmatrix like:

IMO it doesn't matter. It seems there's only an advantage in using apply if the method is not only defined for ::TransformedKernel (i.e., not specifically kernelmatrix(k::TransformedKernel, ...)) but for some more general type of kernels which might require transforms to be applied in a different way (and in that case one would probably want to use kernel(k) instead of k.kernel for the same reasons).

@devmotion

I was playing around with this PR a bit since I would like to move (almost) all kernel-related functionality of my package to KernelFunctions. However, I noticed that the current implementation leads to a severe performance regression when using ScaleTransform. That is both due to the fact that its parameter s is not a scalar and (more importantly) due to the allocation of a large number of vectors when evaluating kappa(kernel, x, y). A simple example that I benchmarked shows:

# using `exponentialkernel` (i.e. no `ScaleTransform`)
1.137 ms (0 allocations: 0 bytes)

# using `exponentialkernel(0.1)`
2.432 ms (39805 allocations: 6.07 MiB)

Maybe one could add more efficient implementations such as

function kappa(κ::TransformedKernel{<:ExponentialKernel,<:ScaleTransform}, x, y)
    k = κ.kernel
    s = κ.transform.s[]
    kappa(k, s * evaluate(metric(k), x, y))
end

(probably in a more modular/general way) for the default kernels? With this implementation and using RefValue I get

1.009 ms (5 allocations: 80 bytes)

By switching to a scalar field s in ScaleTransform I can reduce the allocations to

1.002 ms (0 allocations: 0 bytes)
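
A hypothetical way to reproduce such a benchmark with BenchmarkTools (data, sizes, and kernel choice are assumptions, not the original benchmark code):

using BenchmarkTools, KernelFunctions

xs = [rand(10) for _ in 1:1_000]
k1 = exponentialkernel()    # no ScaleTransform
k2 = exponentialkernel(0.1) # wrapped in a ScaleTransform

@btime sum(kappa($k1, x, y) for x in $xs, y in $xs)
@btime sum(kappa($k2, x, y) for x in $xs, y in $xs)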

@theogf

theogf commented Feb 11, 2020

When you say you use kappa(k,x,y), do you mean you loop over all vectors x and y?
Could you share the benchmarking code you used?

@devmotion

It's different estimators, and I don't need all pairwise distances for all of them, so I'm evaluating just the combinations I need, e.g., in https://github.com/devmotion/CalibrationErrors.jl/blob/1cad3a183ea4c0f34758dd938dfd3b8cec393409/src/skce/generic.jl#L224-L228. The code is a bit outdated; I haven't pushed my latest changes.

@theogf

theogf commented Feb 11, 2020

Ah, I understand now what you propose: you want to have specialized implementations for some specific kappa cases. It would only work for ScaleTransform, actually.
Maybe we can add this in a different PR.

@devmotion

Ah, I understand now what you propose: you want to have specialized implementations for some specific kappa cases. It would only work for ScaleTransform, actually.
Maybe we can add this in a different PR.

Yes, that was my idea. IMO the ScaleTransform is so simple that it should not lead to such a severe performance regression, and hence it could be improved (at least for the default kernels) by some special implementations. I just thought I'd bring it up right away, since I was a bit surprised by the regressions.

@theogf

theogf commented Feb 11, 2020

Ok, I got a two-liner:

kappa(κ::TransformedKernel{<:Kernel,<:ScaleTransform}, x, y) = kappa(kernel(κ), _scale(κ)*evaluate(metric(κ),x, y))
_scale(κ::TransformedKernel) = first(κ.transform.s)^(metric(κ) isa Euclidean ? 1 : 2)

@devmotion

Ok, I got a two-liner:

I'm a bit worried about kernels with different metrics. Maybe we could dispatch on metric(κ.kernel) and add an implementation for ::SqEuclidean and ::Euclidean (these are the only metrics used in KernelFunctions so far?) and use a fallback in all other cases. Something like

kappa(κ::TransformedKernel{<:Kernel,<:ScaleTransform}, x, y) = kappa(κ.kernel, _scale(κ.transform, metric(κ.kernel), x, y))
_scale(t::ScaleTransform, metric::Euclidean, x, y) = first(t.s) * evaluate(metric, x, y)
_scale(t::ScaleTransform, metric::SqEuclidean, x, y) = first(t.s)^2 * evaluate(metric, x, y)
_scale(t::ScaleTransform, metric, x, y) = evaluate(metric, apply(t, x), apply(t, y))


metric(::ConstantKernel) = Delta()
@inline metric(::ConstantKernel) = Delta()
Member

As a side remark, IMO usually it's not required to add @inline, in particular not for small functions such as metric. In these cases, I think usually one can trust the compiler, as stated in the docstring:

help?> @inline
  @inline

  Give a hint to the compiler that this function is worth inlining.

  Small functions typically do not need the @inline annotation, as the compiler does it automatically. By using @inline on bigger
  functions, an extra nudge can be given to the compiler to inline it. 

@theogf

theogf commented Feb 14, 2020

To sum up the massive changes in this PR:

  • Creation of the abstract type BaseKernel, which makes it more practical to create subsequent generic functions.
  • BaseKernels with parameters, for example ConstantKernel, now have a keyword-based constructor, to avoid confusion when, for example, constantkernel(3.0) is called. To get the old behaviour one needs to call constantkernel(3.0, c=1.0). Related to this, I kept the same naming for the variables, but this means some keywords are Unicode characters (alpha, gamma, etc.); I will change them to verbose names.
  • I went with the approach of naming the TransformedKernel wrappers by changing the name to lowercase, but I think we can maybe remove the "kernel" at the end.
  • I added deprecations for calling the previous constructors with a Transform.


### Syntactic sugar for creating matrices and using kernel functions
for k in [:ExponentialKernel,:SqExponentialKernel,:GammaExponentialKernel,:MaternKernel,:Matern32Kernel,:Matern52Kernel,:LinearKernel,:PolynomialKernel,:ExponentiatedKernel,:ZeroKernel,:WhiteKernel,:ConstantKernel,:RationalQuadraticKernel,:GammaRationalQuadraticKernel]
for k in subtypes(BaseKernel)
@devmotion devmotion Feb 14, 2020

Unfortunately, that does not add the syntactic sugar for user-defined BaseKernels - in the future (when support for Julia < 1.3 is dropped) this can be done in a nice way due to JuliaLang/julia#31916: one can just define

(κ::BaseKernel)(d::Real) = kappa(κ, d)

and so on.
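
A minimal self-contained demonstration of that Julia >= 1.3 feature (names hypothetical):

abstract type BaseKernel end

# One definition on the abstract type covers all subtypes,
# including user-defined ones (JuliaLang/julia#31916):
(κ::BaseKernel)(d::Real) = 2d # stand-in for kappa(κ, d)

struct MyKernel <: BaseKernel end
MyKernel()(3.0) # 6.0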

Member Author

Oh, I did not know about this new feature in 1.3.
For now, and as an alternative, I was thinking of creating a macro that would do the same for a user-defined kernel, including creating the TransformedKernel wrapper.

src/generic.jl Outdated
## Constructors for kernels without parameters
for kernel in [:ExponentialKernel,:SqExponentialKernel,:Matern32Kernel,:Matern52Kernel,:ExponentiatedKernel]
for k in Symbol.(subtypes(BaseKernel))
k = Symbol(string(k)[17:end])
Member

I guess slightly less confusing would be

for k in nameof.(subtypes(BaseKernel))

Member Author

Damn, I spent hours looking for a solution to this 😅

src/generic.jl Outdated
for kernel in [:ExponentialKernel,:SqExponentialKernel,:Matern32Kernel,:Matern52Kernel,:ExponentiatedKernel]
for k in Symbol.(subtypes(BaseKernel))
k = Symbol(string(k)[17:end])
new_k = Symbol(lowercase(string(k)))
Member

I'm not completely sold on the idea of using lowercase functions here, but I'm also not fully convinced by using the uppercase k either. IMO, the lowercase definitions add quite a bit of inconsistency: now some kernels can be constructed with a lowercase function whereas others can't, everyone who implements custom BaseKernels has to copy these definitions (and in contrast to the syntactic sugar above, I don't see how this problem could ever be solved by KernelFunctions, even on Julia >= 1.3), and one can construct kernels in two different ways (e.g., sqexponentialkernel() and SqExponentialKernel()), which feels weird. That's why I'm inclined to say maybe one should rather implement the uppercase version, such as $k(ρ::Real; args...) = TransformedKernel($k(;args...),ScaleTransform(ρ)) - even though in this case it might come as a surprise that the return value is not of type $k. On the other hand, that should be fine, I guess?

Member Author

I think that's the real problem; this would create a very problematic type inconsistency...
For the user-defined case, again, I was thinking of making a macro that would automatically create these functions.

Member Author

function wrap_kernel(k) #::KernelFunctions.BaseKernel
    new_k = Symbol(lowercase(string(k)))
    return @eval begin
        (κ::$k)(d::Real) = kappa(κ, d) #TODO Add test
        (κ::$k)(x::AbstractVector{<:Real}, y::AbstractVector{<:Real}) = kappa(κ, x, y)
        (κ::$k)(X::AbstractMatrix{T}, Y::AbstractMatrix{T}; obsdim::Integer=defaultobs) where {T} = kernelmatrix(κ, X, Y, obsdim=obsdim)
        (κ::$k)(X::AbstractMatrix{T}; obsdim::Integer=defaultobs) where {T} = kernelmatrix(κ, X, obsdim=obsdim)
        $new_k(;args...) = $k(;args...)
        $new_k(ρ::Real; args...) = TransformedKernel($k(;args...), ScaleTransform(ρ))
        $new_k(ρ::AbstractVector{<:Real}; args...) = TransformedKernel($k(;args...), ARDTransform(ρ))
        $new_k(t::Transform; args...) = TransformedKernel($k(;args...), t)
        export $new_k
    end
end

Member

That (@eval inside a function) does not feel right, IMO this is the perfect use case for a proper macro, i.e., macro wrap_kernel(T) ... end where T is the type of the custom user-defined BaseKernel. That feels a lot cleaner to me.
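
A minimal sketch of such a macro (name and method list hypothetical; hygiene kept simple):

macro wrap_kernel(T)
    k = esc(T)
    quote
        # make the kernel callable, forwarding to `kappa`
        (κ::$k)(d::Real) = kappa(κ, d)
        (κ::$k)(x::AbstractVector{<:Real}, y::AbstractVector{<:Real}) = kappa(κ, x, y)
    end
end

# Usage, after defining `struct MyKernel <: BaseKernel ... end`:
# @wrap_kernel MyKernel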

Member Author

I am working on this on Slack right now. I am very bad at understanding macros.

Member

What should

@kernel s * Kernel(transform, kernel_params...)

be rewritten to? To

s * TransformedKernel(Kernel(kernel_params...), transform)

? I'm still wondering if this does not lead to problems in cases such as

@kernel s * MaternKernel(42)

It seems a bit unintuitive to assume that a user would actually want to use a length scale of 42 in this case instead of the default constructor of a MaternKernel with parameter 42. IMO in that example an explicit keyword argument such as lengthscale would be less confusing.

Member Author

What should @kernel s * Kernel(transform, kernel_params...) be rewritten to? To s * TransformedKernel(Kernel(kernel_params...), transform)?

Yes, exactly.

It seems a bit unintuitive to assume that a user would actually want to use a length scale of 42 in this case instead of the default constructor of a MaternKernel with parameter 42

Hmm, I don't understand your point; it seems just as unintuitive to me that a user would use nu = 42 for MaternKernel. That's why I think using keywords for kernel parameters is necessary. It is even more evident for kernels with multiple parameters, like PolynomialKernel: what should PolynomialKernel(2.0,3.0) be? Whereas PolynomialKernel(c=2.0,d=3.0) does not leave any doubt.
Having lengthscale as a keyword could be an additional way to do it. But I would be quite against having kernel parameters as positional parameters (as is now the case).

Member

Hmm, I don't understand your point; it seems just as unintuitive to me that a user would use nu = 42 for MaternKernel. That's why I think using keywords for kernel parameters is necessary.

Sure, using keyword arguments for kernel parameters might be more explicit and more intuitive, in particular in the case of kernels with multiple parameters. However, IMO it feels unintuitive and confusing if adding @kernel changes completely what the 42 in MaternKernel(42) means (even if one wants to encourage the use of keyword arguments, usually the default constructors still exist, so MaternKernel(42) might be a completely valid call of the default outer constructor).

But I would be quite against having kernel parameters as positional parameters (as is now the case).

The problem I see is that you might be able to enforce that for the kernels defined in this package (by implementing an inner constructor that invalidates the default outer constructor and only adding an outer constructor with keyword arguments) but it is impossible to ensure that for user-defined kernels. Actually, I'm not even sure if it is a good idea to try to enforce that in the case of kernels with a single clearly defined parameter (such as maybe even MaternKernel).
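
For reference, the constructor pattern described above, as a minimal sketch (type and field names are stand-ins):

struct MyConstantKernel{Tc<:Real} <: BaseKernel
    c::Tc
    # defining an inner constructor suppresses the default positional one
    MyConstantKernel(; c::Real = 1.0) = new{typeof(c)}(c)
end

MyConstantKernel(c = 2.0) # works
# MyConstantKernel(2.0)   # MethodError: no positional constructor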

Member Author

usually the default constructors still exist, so MaternKernel(42) might be a completely valid call of the default outer constructor

My thought was to change the default constructor as well.

it is impossible to ensure that for user-defined kernels.

We can be explicit about that in the docs

Member Author

I think I will finish the PR without the @kernel wrapper and remove all the other wrappers (the lowercase thing). We can add @kernel in a different PR. I will just correct the docs, add tests, and merge the PR.

@theogf theogf merged commit a5bcc63 into master Feb 28, 2020
@theogf theogf mentioned this pull request Mar 1, 2020
@theogf theogf deleted the remove-transform branch March 9, 2020 15:41