Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change curry to use functools.partial #150

Closed
wants to merge 6 commits into from

Conversation

eriknw
Copy link
Member

@eriknw eriknw commented Mar 20, 2014

curry instances always return other curry instance if the arguments are not fully applied; previously, they could sometimes return functools.partial instances.

The performance of calling curry instances is better than before, but still not as good as calling partial. To achieve the performance of partial, use curried_func.call.

This PR addresses the idempotency issue from #147. It does not, however, split curry into class Curry/def curry as discussed in #147 (or as indicated by the name of this branch). I think curry performs just fine as a single class.

eriknw added 5 commits March 19, 2014 18:48
…Not finished.

This includes a significant change to curried objects: they never return
`functools.partial` objects.  Instead, `Curry` *uses* `partial`.  This
is because `partial` behaves differently.  Let me illustrate:

`curry(lambda x, y=0: x+y)(y=2)` is valid.
`partial(lambda x, y=0: x+y)(y=2)` returns a TypeError.

More documentation and tests are needed.  PRs (and comments) are welcome.
Calling `call` on a curried object is an efficient way to evaluate the function
These get the attributes from the `functools.partial` instance, and
like `partial` instances, they are now read-only.
@eriknw
Copy link
Member Author

eriknw commented Mar 20, 2014

I performed the following benchmarks for old and new curry, which shows the new curry is significantly improved, but still significantly slower (x2-x3) than partial:

from toolz import curry
from functools import partial

def f(*args, **kwargs):
    pass

c1 = curry(f)
p1 = partial(f)

%timeit p1()
%timeit c1()
if hasattr(c1, 'call'):
    %timeit c1.call()

%timeit p1(1)
%timeit c1(1)
%timeit p1(1, 2)
%timeit c1(1, 2)
%timeit p1(x=1)
%timeit c1(x=1)
%timeit p1(x=1, y=2)
%timeit c1(x=1, y=2)

c2 = curry(f, 1, 2)
p2 = partial(f, 1, 2)

%timeit p2()
%timeit c2()
%timeit p2(3)
%timeit c2(3)
%timeit p2(3, 4)
%timeit c2(3, 4)
%timeit p2(x=3)
%timeit c2(x=3)
%timeit p2(x=3, y=4)
%timeit c2(x=3, y=4)

c3 = curry(f, a=1, b=2)
p3 = partial(f, a=1, b=2)

%timeit p3()
%timeit c3()
%timeit p3(3)
%timeit c3(3)
%timeit p3(3, 4)
%timeit c3(3, 4)
%timeit p3(x=3)
%timeit c3(x=3)
%timeit p3(x=3, y=4)
%timeit c3(x=3, y=4)

@mrocklin
Copy link
Member

While I like the idea of precomputing a partial and keeping it around in call I think I'm mostly against this change.

To me a 2-3x slowdown on curried function calling is a big deal. I actually use map(get(...)) quite a bit and if all I'm doing is basic data selection then it does occasionally become a bottleneck. This slowdown would make me switch down to using list comprehensions.

At the same time I only rare notice that I've been given a partial rather than a curry object, and I don't think I've ever actually cared. Maybe I need more motivation here to understand the value. I guess this would be an issue due to docstrings, function names, and pickling. The inconsistent output interface doesn't actually bother me that much (I'm not sure why not though, I think that it should).

Anyway, I'm -0.5 on this as is. Thoughts?

1. Once again, `Curry` is a class and `curry` is a function.
2. `Curry`and `curry` may return a partial instance if there is a single
   argument remaining and no keywords.
3. `curried_object.call` was changed to `curried_object._call`.  As per
   usual convention, `_call` is an implementation detail that may occasionally
   be useful (as it is explicit and faster), but there are no guarantees
   that it will remain in future releases of `toolz`.
@eriknw
Copy link
Member Author

eriknw commented Mar 20, 2014

Take two. See latest push. I'll discuss more later.

@eriknw
Copy link
Member Author

eriknw commented Mar 20, 2014

I expected this PR would need vetting, and I'm sure it will continue to develop.

I agree that a factor of 2x is important. One approach to achieve this is to sometimes return a partial object. The approach in 9fc06dd is to never return a partial, but instead adds a call attribute, which can be used to quickly call any curried function. The code is simple, the behavior is consistent, and the efficient calling of a curried function is explicit.

Let me illustrate what prompted some of the changes in this PR:

curried_get = curry(get)
fast_cheetah = curried_get('cheetah')
fast_cheetah(zoo)  # this is a fast partial

slow_cheetah = curry(get, 'cheetah')
slow_cheetah(zoo)  # this is a slow curry with the current implementation

While developing the class Curry/def curry split, I thought, "Aha, I can make slow_cheetah a fast partial just like fast_cheetah, and it's also intuitive that curry(get)('cheetah') is equivalent to curry(get, 'cheetah')," so I made it so. With curry as a function, it is able to return a Curry or partial instance. However, this caused some tests to fail.

@curry
def memoize(f, cache=None, key=None):
    ...

add_cache = {}
@memoize(cache=add_cache)
def add(x, y):
    return x + y

@memoize(cache=add_cache) didn't work, because memoize is a partial instance that doesn't incrementally add arguments like curry. This sucks! The approach in 2ca1c7c is similar to the current behavior, except it only returns a partial if the function doesn't have keywords. curry(f, x) and curry(f)(x) are equivalent, @memoize(cache=cache) works, and curry(binop)(1) and curry(binop, 1) are both fast partials. However, your example map(get(...), ...) is a slow Curry, because get has keyword arguments. Hence, this solution may not be ideal either.

I'll continue this discussion later. Allowing curry to sometimes return a partial introduces many questions (more than I initially realized), and there are a large permutation of possible behaviors. Hopefully this post helped illustrate some of the issues and different behaviors that can arise. When faced with so many questions and possibilities, I instinctively turned to a simple, consistent, and explicit implementation-- 9fc06dd --that introduces a way to always call the curried function efficiently.

@eriknw eriknw mentioned this pull request Apr 25, 2014
eriknw added a commit to eriknw/toolz that referenced this pull request May 6, 2014
…#155 pytoolz#159

1. Idempotent such that `curry(curry(f))` is equivalent to `curry(f)`; see pytoolz#147
2. Comparable via equality and hashable based on `func`, `args`, and `keywords`; see pytoolz#159
3. `func`, `args`, and `keywords` attributes are readonly
4. `__name__` and `__doc__` attributes are writable
5. More efficient implementation using `partial`; see pytoolz#150
6. Curried objects never transmogrigy into `partial` objects; see pytoolz#155
mrocklin added a commit that referenced this pull request May 6, 2014
Many improvements to `curry` based on #147 #150 #155 #159
@eriknw
Copy link
Member Author

eriknw commented May 6, 2014

Closing. Superseded by #170.

@eriknw eriknw closed this May 6, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants