Implement reference semantics for all Tensors #160
Isn't the biggest disadvantage that it is no longer clear whether a function call modifies its arguments?
All function calls that modify the underlying data (or the metadata) of a Tensor in Arraymancer require a `var` parameter. The only gotcha left is:

```nim
import arraymancer
let a = [1, 2, 3, 4].toTensor
var b = a
b[2] = 10
echo a # Tensor: [1, 2, 10, 4]
```
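For comparison, the same gotcha exists in NumPy, where plain assignment shares the underlying buffer and copies must be requested explicitly (a Python sketch, not Arraymancer code):

```python
import numpy as np

a = np.array([1, 2, 3, 4])
b = a            # no copy: b aliases a's buffer
b[2] = 10
print(a)         # the write through b is visible via a: [ 1  2 10  4]

c = a.copy()     # an explicit copy breaks the sharing
c[0] = 99
print(a[0])      # a is unaffected: still 1
```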
Hm, isn't that gotcha what ruins all guarantees? You can also argue that in the Python world you can simply grep for expressions that modify it.

The biggest need for guaranteed immutability (and the unnecessary-copy workaround) comes from use cases like this: consider you don't know anything about …
I'm in your camp ;), my first non-scripting language was Haskell and I like
my referential transparency and deep immutability guarantee.
I think I've tried everything for weeks:
- Copy-on-write would introduce more troubles than it's worth: #157 (comment)
- Default seq copy-on-assignment requires using `unsafe` everywhere, especially for slicing, to avoid copies (though I can use Nim's move optimization so only a single `unsafe` is needed). Furthermore it makes the library bigger (to learn and maintain) due to the safe + unsafe versions, and CPU and Cuda tensors had different semantics.
- Shallow copy on `let`, deep copy on `var` doesn't work (yet? See below).
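To make the copy-on-write point concrete, here is a minimal sketch of the shared/non-shared-boolean approach in Python (the `COWTensor` class and its methods are invented for illustration; this is not Arraymancer code). It shows both the mechanism and why the bookkeeping is non-trivial:

```python
import copy

class COWTensor:
    """Minimal copy-on-write wrapper: data is shared on assignment
    and copied lazily, just before the first mutation."""
    def __init__(self, data):
        self._data = data        # possibly shared buffer
        self._shared = False     # the "shared/non-shared boolean"

    def view(self):
        # Handing out a view marks BOTH sides as shared.
        other = COWTensor(self._data)
        other._shared = True
        self._shared = True
        return other

    def __getitem__(self, i):
        return self._data[i]

    def __setitem__(self, i, value):
        if self._shared:
            # Detach before mutating so other holders keep the old values.
            self._data = copy.copy(self._data)
            self._shared = False
        self._data[i] = value

a = COWTensor([1, 2, 3, 4])
b = a.view()
b[2] = 10           # b detaches and copies; a keeps the old buffer
print(a[2], b[2])   # 3 10
# Note the bookkeeping flaw: a._shared is still True, so a's next
# write will copy needlessly -- avoiding spurious COW is non-trivial.
```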
I've raised related feature requests:
- nim-lang/Nim#6348
- nim-lang/Nim#6793
Adding your own use cases to them would be valuable.
Another thing that is promising but that I didn't try is write-tracking:
https://nim-lang.org/araq/writetracking.html.
This is a recurrent issue that pops up regularly on the forum (and I bugged
Araq a lot about it on IRC/Gitter):
- 2015: https://forum.nim-lang.org/t/1685/1
- 2017: https://forum.nim-lang.org/t/3374
I'm very well aware of this unfortunate trade-off, which for me can be summarized as:

Value semantics
- Safety, at the cost of slowness (copying/heap allocation) and memory consumption OR unwieldy `unsafe` syntax.

Vs reference semantics
- Speed, at the cost of gotchas (this might have a huge marketing impact if others benchmark typical Arraymancer code against competing frameworks)
- Familiar paradigm for Python/Julia users (but not for R/Matlab users)
- Codebase, API and maintenance simplicity
In the future I hope we can use write tracking to prevent this gotcha.
Current value semantics / copy-on-assignment are not good enough performance-wise: they require `unsafe` all over the place, which is not ergonomic at all.

I tried copy-on-write as well; in-depth monologue/discussion in issue #157. You can implement COW with atomic reference counting or with a shared/non-shared boolean, but it has a few problems detailed there: `=` is overloaded, it is non-trivial to avoid COW, and `let a = b.unsafeView` won't work.

So Arraymancer will move to reference semantics (Tensor data is shared by default; copies must be made explicit).

Benefits:
- No need for `unsafeSlice` all over the place for performance; the `unsafe` procs can be removed, including the unsafe versions of `asContiguous` and `reshape`
- Explicit copies are easy to audit: `grep clone *.nim`

Disadvantages:
- One might forget to `clone` and share data by mistake, and there is no `grep unsafe *.nim` equivalent to find those places.
In the wild
NumPy and Julia have reference semantics; Matlab and R have copy-on-write.
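The NumPy half of that comparison is easy to verify: `np.shares_memory` reports whether two arrays alias the same buffer (a small Python sketch):

```python
import numpy as np

a = np.arange(4)
view = a[1:3]           # slicing returns a view: reference semantics
copied = a[1:3].copy()  # a copy must be requested explicitly

print(np.shares_memory(a, view))    # True
print(np.shares_memory(a, copied))  # False
```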