
[TF] Added support for advanced indexing and slicing #23684

Merged 31 commits on Apr 17, 2019

Conversation

@eaplatanios commented Mar 30, 2019

This adds support for advanced indexing and slicing to tensors. I plan to add support for setters and gradients soon, but first I wanted to see if this initial design looks good to the Swift for TF team.

I also added a new test and made sure all previous tests using tensor subscripts still pass, using the new subscript implementation.

Some comments/questions:

  • When using a subscript setter in code we get a crash, as in TF-178. I looked into this a bit and it seems the type checker is crashing while trying to diagnose a subscript error: in the failure example, we're trying to use the subscript setter, but no such setter exists, and there is a bug in how this error is diagnosed. There does not seem to be an issue when using the getter. I'm not sure how to check whether this has been fixed upstream, as merging the upstream changes may be nontrivial (although I haven't tried doing so yet).
  • The overloaded subscript functions were required due to the lack of support for variadic generics. Do you know of a better way to do this? Also, what is the convention for line breaks in generic type lists?
  • Regarding gradients, is it possible to use @differentiating to directly provide the gradient of Raw.stridedSlice? Otherwise, I can provide a gradient directly for the subscript function.
  • Swift ranges do not support negative indices and so something like 0 ..< -1 cannot be used, even if only as syntactic sugar. Do we need that precondition for Swift ranges? In principle, the ranges could be allowed to be empty, from a Swift point of view, and have this be used simply as syntactic sugar. This can also be avoided if we define our own ..<, ..., etc. operators that support negative indices and are only used as syntactic sugar for advanced indexing and slicing.
  • Strides are not currently supported, but I can post a question on Swift Evolution about this tomorrow. One very direct but S4TF-specific solution would be to provide our own ..<, ..., etc. operators as discussed above, that also allow for a .. operator for strided ranges.
  • Is there a way to also support ranges over Int32, Tensor<Int32>, and other types? Whenever I try that, I get an error saying I can only define one extension per protocol conformance, irrespective of the conditional constraints on the type.
  • We should probably start organizing the ops in multiple files instead of putting everything in Ops.swift. How about making an Ops directory in TensorFlow and having separate files like Manipulation.swift, Math.swift, etc.?

@eaplatanios eaplatanios changed the title [TF] [WIP] Added support for advanced indexing and slicing [TF] WIP: Added support for advanced indexing and slicing Mar 30, 2019
Two review comments on stdlib/public/TensorFlow/Ops.swift (outdated; resolved).
@rxwei (Contributor) commented Mar 30, 2019

  • When using a subscript setter in code we get a crash, as in TF-178. I looked into this a bit and it seems the type checker is crashing while trying to diagnose a subscript error: in the failure example, we're trying to use the subscript setter, but no such setter exists, and there is a bug in how this error is diagnosed. There does not seem to be an issue when using the getter. I'm not sure how to check whether this has been fixed upstream, as merging the upstream changes may be nontrivial (although I haven't tried doing so yet).

Folks are still working on merging things from master. It'll take a little longer.

Have you tried adding a setter?

  • The overloaded subscript functions were required due to the lack of support for variadic generics. Do you know of a better way to do this? Also, what is the convention for line breaks in generic type lists?

Before we have variadic generics, we can use existentials.

subscript(_ indices: TensorSliceIndexProtocol...)

The Google Swift Style Guide is slightly inconsistent with our local convention: https://google.github.io/swift/#line-wrapping. But you can follow its line-wrapping guide.
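For illustration, here is a minimal sketch of the existential approach on a toy type. The protocol name, its member, and ToyTensor are all hypothetical stand-ins, not the actual S4TF API:

```swift
// Hypothetical protocol standing in for TensorSliceIndexProtocol.
// Each conforming type describes the slice bounds along one dimension.
protocol TensorSliceIndex {
    var sliceBounds: (lower: Int, upper: Int) { get }
}

extension Int: TensorSliceIndex {
    // A single index i selects the half-open range [i, i + 1).
    var sliceBounds: (lower: Int, upper: Int) { (self, self + 1) }
}

extension Range: TensorSliceIndex where Bound == Int {
    var sliceBounds: (lower: Int, upper: Int) { (lowerBound, upperBound) }
}

// A toy container: one variadic existential parameter accepts any
// mix of Int and Range<Int> arguments, with no variadic generics.
struct ToyTensor {
    subscript(indices: any TensorSliceIndex...) -> [(Int, Int)] {
        indices.map { $0.sliceBounds }
    }
}
```

The cost relative to variadic generics is existential boxing and the loss of per-argument static types, which is usually acceptable for an indexing API.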

  • Regarding gradients, is it possible to use @differentiating to directly provide the gradient of Raw.stridedSlice? Otherwise, I can provide a gradient directly for the subscript function.

Not yet. The feature is called "retroactive derivative registration" and will take some time to engineer.

  • Swift ranges do not support negative indices and so something like 0 ..< -1 cannot be used, even if only as syntactic sugar. Do we need that precondition for Swift ranges? In principle, the ranges could be allowed to be empty, from a Swift point of view, and have this be used simply as syntactic sugar. This can also be avoided if we define our own ..<, ..., etc. operators that support negative indices and are only used as syntactic sugar for advanced indexing and slicing.

We should stick with Swift's range formation operators and semantics for now. The precondition that upperBound cannot be -1 (or any number smaller than lowerBound) is in Swift programmers' mental model IMO. Let's delay the discussion about negative indices after this PR.

  • Strides are not currently supported, but I can post a question on Swift Evolution about this tomorrow. One very direct but S4TF-specific solution would be to provide our own ..<, ..., etc. operators as discussed above, that also allow for a .. operator for strided ranges.

I don't think providing our own ..< and ... solves the problem here. I believe we need an operator (of lower precedence than RangeFormationPrecedence) that takes a Range/ClosedRange on the left hand side and takes an integer on the right hand side, forming a StridedRange. The only new things necessary are StridedRange and that operator.
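A sketch of that design in plain Swift follows; the type and operator names here are placeholders for discussion, not the ones this PR ultimately adopted:

```swift
// A stride-attaching operator with precedence below RangeFormationPrecedence,
// so `0..<10 .. 2` parses as `(0..<10) .. 2`.
precedencegroup StridedRangeFormationPrecedence {
    lowerThan: RangeFormationPrecedence
    higherThan: CastingPrecedence
}

infix operator ..: StridedRangeFormationPrecedence

// Placeholder for the StridedRange type described above.
struct StridedRange {
    let lowerBound: Int
    let upperBound: Int
    let stride: Int
}

// Form a StridedRange from a half-open range and a stride.
func .. (range: Range<Int>, stride: Int) -> StridedRange {
    StridedRange(lowerBound: range.lowerBound,
                 upperBound: range.upperBound,
                 stride: stride)
}

// Form a StridedRange from a closed range and a stride
// (the inclusive upper bound becomes an exclusive one).
func .. (range: ClosedRange<Int>, stride: Int) -> StridedRange {
    StridedRange(lowerBound: range.lowerBound,
                 upperBound: range.upperBound + 1,
                 stride: stride)
}
```

Because `..<` and `...` keep their standard RangeFormationPrecedence, no existing range expression changes meaning; the new operator only composes on top of them.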

  • Is there a way to also support ranges over Int32, Tensor<Int32>, and other types? I get an error whenever I try to do that saying I can only define one extension for each protocol conformance, irrespective of the conditional constraint on the type.

For now, I'd recommend against supporting Tensor<Int32> indices since it can be more error-prone. Int32 is the index type for Tensor APIs, and there is no precedent of supporting multiple index types in the Swift standard library. Let's stick with Int32 for now.

  • We should probably start organizing the ops in multiple files instead of putting everything in Ops.swift. How about making an Ops directory in TensorFlow and have separate files like Manipulation.swift, Math.swift, etc.

Actually, our plan is to gradually move everything to tensorflow/swift-apis.

@eaplatanios (Author)

Have you tried adding a setter?

I'll add this today. I can use TensorScatterUpdate using #tfop, but I was wondering why it is not included in RawOpsGenerated.swift. Is it skipped because something is not supported yet?

Not yet. The feature is called "retroactive derivative registration" and will take some time to engineer.

Ok, I can use @differentiable in that case for now. One related question: how do I specify the VJP separately for a subscript getter and setter using @differentiable?

We should stick with Swift's range formation operators and semantics for now. The precondition that upperBound cannot be -1 (or any number smaller than lowerBound) is in Swift programmers' mental model IMO. Let's delay the discussion about negative indices after this PR.

Sounds good.

I don't think providing our own ..< and ... solves the problem here. I believe we need an operator (of lower precedence than RangeFormationPrecedence) that takes a Range/ClosedRange on the left hand side and takes an integer on the right hand side, forming a StridedRange. The only new things necessary are StridedRange and that operator.

Sorry, that's what I meant by the .. operator for strided ranges; I mixed the two comments into one, which made it confusing. Also, we don't necessarily need StridedRange, given that Swift already has the StrideTo<T> and StrideThrough<T> types. We would just need to expose their lower/upper bounds and strides as public properties. I'll include that in the Swift Evolution post.
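For reference, this is Swift's existing strided sequence API; note that StrideTo/StrideThrough currently keep their bounds and stride internal, which is what the Evolution pitch would change:

```swift
// stride(from:to:by:) yields a StrideTo<Int>. Its bounds and stride
// are not public properties today, so a slicing API can only consume
// it as a sequence of indices.
let evens = stride(from: 0, to: 10, by: 2)
let indices = Array(evens)  // [0, 2, 4, 6, 8]
```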

For now, I'd recommend against supporting Tensor<Int32> indices since it can be more error-prone. Int32 is the index type for Tensor APIs, and there is no precedent of supporting multiple index types in the Swift standard library. Let's stick with Int32 for now.

Sounds good, although if we want to avoid using gather (I noticed it's more expensive due to memory allocations, based on the existing comments), we should probably add support for that soon, as it's useful for embedding table lookups.

Also, the strided slice gradient is a dense tensor, so we can support it fine for now, but an advantage of gather is that, once we support sparse tensor gradients, we can compute its gradient as something equivalent to TensorIndexedSlices which would be more efficient in some cases.

Actually, our plan is to gradually move everything to tensorflow/swift-apis.

That sounds great actually! I wonder if we should be making these new additions for ops directly in that repo in that case. Let's keep this PR here for now, but maybe next time I work on adding support for new ops, I may do it there directly.

@eaplatanios (Author)

Also, I just realized that the ScatterUpdate op in the raw ops module is generated incorrectly here. This should instead be the signature of TensorScatterUpdate. For ScatterUpdate, the ref tensor type should not be T but rather the reference type for T, which is not supported in S4TF. This is because that op is meant to perform updates on reference variables.

@rxwei (Contributor) commented Mar 30, 2019

Sorry, that's what I meant by the .. operator for strided ranges; I mixed the two comments into one, which made it confusing. Also, we don't necessarily need StridedRange, given that Swift already has the StrideTo<T> and StrideThrough<T> types. We would just need to expose their lower/upper bounds and strides as public properties. I'll include that in the Swift Evolution post.

Great, we are on the same page. Before StrideTo and StrideThrough go through Swift Evolution, adding a new type is fine.

That sounds great actually! I wonder if we should be making these new additions for ops directly in that repo in that case. Let's keep this PR here for now, but maybe next time I work on adding support for new ops, I may do it there directly.

Feel free to move APIs there in future PRs.

Also, I just realized that the ScatterUpdate op in the raw ops module is generated incorrectly here. This should instead be the signature of TensorScatterUpdate. For ScatterUpdate, the ref tensor type should not be T but rather the reference type for T, which is not supported in S4TF. This is because that op is meant to perform updates on reference variables.

The raw op generation script should be changed to ignore ops that take ref arguments. As to TensorScatterUpdate, I'm not sure whether Raw.tensorScatterUpdate is available in the raw module yet. Maybe we need to regenerate bindings.

@eaplatanios (Author)

I looked into the setter a bit and I realized that TensorScatterUpdate will not be efficient for this operation. We should directly add support for StridedSliceAssign, but for tensors (e.g., we can call it TensorStridedSliceUpdate). This should be easily done by taking the implementation of the existing StridedSliceAssign kernels and adding support for tensor arguments by following a similar approach to the TensorScatterUpdate kernel implementation. Does that sound reasonable?

@eaplatanios (Author)

The raw op generation script should be changed to ignore ops that take ref arguments. As to TensorScatterUpdate, I'm not sure whether Raw.tensorScatterUpdate is available in the raw module yet. Maybe we need to regenerate bindings.

I agree. Also, Raw.tensorScatterUpdate is not available in the raw module yet.

@eaplatanios (Author)

I added support for a new TensorStridedSliceUpdate op here. I haven't tested this yet, but if it works, would we be able to merge it into the main TensorFlow repo and use it for the setter? Otherwise, is there currently a way to add support for new ops and kernels in S4TF?

@rxwei (Contributor) commented Mar 30, 2019

I haven't tested this yet, but if it works, would we be able to merge it into the main TensorFlow repo and use it for the setter?

You can start a PR and go through reviews in core TF with folks like @alextp. This looks right, but I don't own the kernel code in core TensorFlow, so I can't say whether this is the right direction.

Otherwise, is there currently a way to add support for new ops and kernels in S4TF?

There's no way yet. Adding the kernel in core TF will be the first step.

@eaplatanios (Author)

I'm following through with Alex on the core TF PR. In the meantime, how can I add separate VJP registrations for the getter and the setter of a subscript?

@rxwei (Contributor) commented Mar 31, 2019

Setters cannot have a VJP yet.

@eaplatanios (Author)

Ok, so that means I can just use @differentiating on top of subscript itself, right?

@rxwei (Contributor) commented Mar 31, 2019

Use @differentiable(vjp: ... where ...) on top of the subscript.
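For context, a rough sketch of what that S4TF-era syntax looked like; the helper name `_vjpSubscript` and the signatures are illustrative rather than the actual code from this PR, and this attribute form no longer exists (current toolchains use `@derivative(of:)` instead):

```swift
public extension Tensor {
    // Registers _vjpSubscript as the VJP of the subscript getter,
    // constrained to floating-point scalar types.
    @differentiable(vjp: _vjpSubscript where Scalar: TensorFlowFloatingPoint)
    subscript(bounds: Range<Int32>) -> Tensor { ... }

    // Returns the subscript result together with its pullback.
    internal func _vjpSubscript(bounds: Range<Int32>)
        -> (Tensor, (Tensor) -> Tensor)
        where Scalar: TensorFlowFloatingPoint { ... }
}
```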

@eaplatanios (Author)

@rxwei I updated the TensorFlow checkout commit to include the strided slice assign op that was merged, but when I try to compile, I get a linking error saying that dyld: Library not loaded: @rpath/libtensorflow.so.1. Not sure if the dynamic libraries were versioned before, but if they were not, could this be due to something missing from the build script in S4TF?

@rxwei (Contributor) commented Apr 9, 2019

Could you try a bazel build in ../tensorflow?

@eaplatanios (Author) commented Apr 9, 2019 via email

@rxwei (Contributor) commented Apr 9, 2019

You can make a renamed copy of the lib and try that.

@eaplatanios (Author)

But where should I place it? ../tensorflow/bazel-bin/tensorflow already contains libtensorflow.so, libtensorflow.so.1 and libtensorflow.so.13.1, where the first two are symbolic links pointing to the last one.

@eaplatanios (Author)

@rxwei I think this may be related to this PR and not the swift build configuration.

@eaplatanios (Author)

Yes I agree. By the way, what's the difference between a nullary operator and a global variable? I tried to start an evolution topic that included strides, but I guess it may have been too generic and not focused on a single subject.

By the way, I just pushed the requested changes. If these look ok, I can go ahead and merge upstream changes including the Int32 -> Int change.

@rxwei (Contributor) commented Apr 16, 2019

Operators and identifiers are drawn from separate sets of characters; that's the main difference. Operators are functions, so a nullary operator would have to represent a nullary function. Even if we had nullary operators, it still wouldn't solve our problem: we need structural types (including function types) to be able to conform to protocols.

@rxwei (Contributor) left a comment

By the way, I just pushed the requested changes. If these look ok, I can go ahead and merge upstream changes including the Int32 -> Int change.

LGTM! One suggestion is adding tests for the newly added .. operators.

Four review comments on stdlib/public/TensorFlow/Ops.swift (outdated; resolved).
@eaplatanios (Author)

@rxwei This should be ready now. I have also added a couple of tests for the new stride operator.

@rxwei (Contributor) commented Apr 17, 2019

@swift-ci please test tensorflow

(Several similar "@swift-ci please test tensorflow" comments omitted.)

@eaplatanios (Author)

I can't tell if the tests are running on the CI, but I just ran them locally and they all pass (in case this helps).

@rxwei (Contributor) commented Apr 17, 2019

@swift-ci please test tensorflow

@rxwei (Contributor) commented Apr 17, 2019

Now it's triggered!

@rxwei rxwei changed the title [TF] WIP: Added support for advanced indexing and slicing [TF] Added support for advanced indexing and slicing Apr 17, 2019
@rxwei rxwei merged commit 89a9033 into swiftlang:tensorflow Apr 17, 2019
@rxwei (Contributor) commented Apr 17, 2019

Thanks Anthony for driving this!

@eaplatanios (Author)

Thanks for all the help and useful suggestions Richard!

@rxwei (Contributor) commented Apr 18, 2019

This actually failed GPU tests: https://ci-external.swift.org/job/swift-PR-TensorFlow-Linux-gpu/1272/console. Could you look into this?

@rxwei (Contributor) commented Apr 18, 2019

Ok, I think this is because GPU only supports Int32 masks. We need to switch them from Int64 to Int32.

@eaplatanios (Author)

I'm not sure that's the reason, because Int64 is also supported for these masks on GPUs. The error messages look like something else may be wrong.

Labels: tensorflow (This is for "tensorflow" branch PRs.)

4 participants