-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WIP: A traits-based user-extensible @inbounds #11867
Conversation
This removes `unsafe_getindex` and `unsafe_setindex!` in favor of `getindex(::BoundsCheckOff, ...)` and `setindex!(::BoundsCheckOff, ...). `getindex(...)` falls back to `getindex(BoundsCheckOn(), ...)`, which will error if neither was defined. `getindex(BoundsCheckOff(), ...)` will fall back to `getindex(...)` if not defined.
Great direction! 👍 |
Very interested to see this. A serious look might be a few days, unfortunately. |
Nice! |
The case that I keep worrying about is something like: type MyType <: AbstractVector{Int} end
getindex(::MyType, ind) = 1
getindex(::MyType, ind1, ind2) = 2 With this PR, I get the output: julia> getindex(MyType(), 1)
1
julia> getindex(MyType(), 1, 1)
2
julia> getindex(BoundsCheckOff(), MyType(), 1)
1
julia> getindex(BoundsCheckOff(), MyType(), 1, 1)
1 This is pathological since it's non-sensical for |
This seems like the crux of the problem (insofar as I understand it): how do you make sure you skip doing run-time dispatch without using different methods for checked indexing and unchecked indexing? |
I should have trusted the @mbauman from three months ago. Dangit. |
Actually, that's an issue on master right now, too: julia> type MyType <: AbstractVector{Int} end
julia> getindex(::MyType, ind) = 1
getindex (generic function with 121 methods)
julia> getindex(::MyType, ind1, ind2) = 2
getindex (generic function with 122 methods)
julia> getindex(MyType(), 1)
1
julia> getindex(MyType(), 1, 1)
2
julia> Base.unsafe_getindex(MyType(), 1)
1
julia> Base.unsafe_getindex(MyType(), 1, 1)
1 :-\ |
I don't see how things could work any other way when you're exploiting multiple dispatch to change behaviors. It's always possible to disassociate those four methods since the only thing holding them together is the no-punning honor code in the community. |
My proposal from the other day for solving that part of the problem was: @generated function good_unsafe_getindex(args...)
safesig = methods(getindex, args)[1].sig
unsafesig = methods(Base.unsafe_getindex, args)[1].sig
:($(unsafesig <: safesig ? (:(Base.unsafe_getindex)) : :getindex)(args...))
end For: type MyType <: AbstractVector{Int} end
getindex(::MyType, ind) = 1
getindex(::MyType, ind1, ind2) = 2 this gives: julia> good_unsafe_getindex(MyType(), 1)
1
julia> good_unsafe_getindex(MyType(), 1, 2)
2 and for: type MyUnsafeType <: AbstractVector{Int} end
getindex(::MyUnsafeType, ind) = 1
getindex(::MyUnsafeType, ind1, ind2) = 2
Base.unsafe_getindex(::MyUnsafeType, ind) = 3
Base.unsafe_getindex(::MyUnsafeType, ind1, ind2) = 4 this gives: julia> good_unsafe_getindex(MyUnsafeType(), 1)
3
julia> good_unsafe_getindex(MyUnsafeType(), 1, 1)
4 |
I love this phrase. |
Umm, I'm the newbie in the crowd, can you please explain the "no-punning honor code"? (Hopefully, before I break the code and everybody yells at me!) (also, I see type punning all over the C code that's been driving me crazy!) Thanks! |
What about factoring function getindex(a, i)
check_indices(a, i)
unsafe_getindex(a, i)
end Then you could override just |
The no-punning honor code is basically don't do anything like this: http://stackoverflow.com/questions/4854248/why-are-bitwise-shifts-and-used-for-cout-and-cin |
The "no punning" means that generic functions, operators included, should have a single, canonical "meaning", and different implementations of the function (methods) shouldn't try to bend that meaning in weird or "pun"ning ways. Generic functions should be easy to reason about in terms of what they do without being "fancy" or funny. |
Hmmm. |
@StefanKarpinski I think this is basically @JeffBezanson's proposal from #7799 (comment). If we can get it to perform well, and if we don't mind the awkwardness of making everyone extend |
I prefer it to be explicit whether or not my method "responds" to |
Note that you can still make indexing with and without bounds checking disagree by overloading both in ways that don't match. Of course, you wouldn't usually extend |
What surprised me, was that something I wrote in method A (adding |
Yes, this feature requires a limited form of dynamic scoping. |
But that wouldn't be true anymore, right, with @JeffBezanson #7799? Gerry Sussman taught me to hate dynamic scoping 😀 |
None of the proposals we're currently discussing will auto-propagate |
Yeah! 👍 💯 🎉 👏 |
In response to #11867 (comment): The basic trouble stems from wanting to call either If we have automatic bounds checking in base, then you really shouldn't define |
@mbauman Good points. Now I don't know what I want! (except that I don't want |
Perhaps having these functions fall back on each other by default wouldn't be all that bad. Sure, there's a stack overflow if your type is broken and doesn't define any indexing methods. But we could detect this and add a message to the REPL error display about why there was a stack overflow. Just so long as it happens instantly (and doesn't hang in thousands of calls |
I'm still not convinced that having |
|
As an example of the complexities, I offer the case of SubArrays, where there are bounds-checks both on the indexes and the parent array. That is to say, for a SubArray This is a frustratingly difficult problem to reason about. |
Closing this in favor of the proposal in #7799 (comment). I think this is trying to hard to deal with too many use-cases, which ends up making this complicated and confusing. Let's just do it the obvious way: check bounds in the base, and recommend that users define the unsafe method. |
@simonster and @johnmyleswhite and I were talking about #7799 more yesterday. After the discussion, I went back to my proposal in #8227 (comment) for a traits based mechanism. The crux of the issue is that we want the user to be able to define either a simple method
f(A)
that is assumed to be safe or a more complicated methodf(::BoundsCheck, A)
that allows them to use dispatch to selectively change behavior. The problem is that we need to implement fallbacks that try eitherf(A)
orf(BoundsCheckOn(), A)
without incurring a stack overflow. I had been thinking that this lived in the same design space as a linear indexing trait dispatch, but I now think that it is a separate problem that, while tricky to compose with other traits, can be solved independently. It just requires a little bit of tip-toe-ing around:getindex(...)
falls back togetindex(BoundsCheckOn(), ...)
, which will error if neither was defined.getindex(BoundsCheckOff(), ...)
will fall back togetindex(...)
if not defined.This works okay, as long as you never explicitly call
f(BoundsCheckOn(), A)
, since that can't fall-back tof(A)
without a stack overflow. The trouble is that it's very tempting to write a method off
that "propagates" its bound-checking state:f(b::BoundsCheck, A) = f(b, A.data)
. This won't work unless we remove the error above inf(::BoundsCheckOn, A)
. And if we do that I don't think there's any straightforward way to avoid getting a non-obvious stack overflow in a place where we really need a sensible "you forgot to define this method" error.Potential issues:
AbstractArray
, but it may be possible to add fallbacks for all typesf(BoundsCheckOn(), A)
, but defining functions in the formf(::BoundsCheckOn, ::Any)
is totally supported and just fine. That's a little wonky.MAX_TUPLETYPE_LEN
) two indices sooner.I think that this may be a viable path towards a user-extensible
@inbounds
. To move forward, @simonster: I need your help in updating your@inbounds
macro andexpand-ref
changes from #8227 to simply insert aBoundsCheckOff()
within all@inbounds
'edgetindex
andsetindex!
calls.