Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Many improvements to curry based on #147 #150 #155 #159 #170

Merged
merged 6 commits into from
May 6, 2014

Conversation

eriknw
Copy link
Member

@eriknw eriknw commented May 6, 2014

  1. Idempotent such that curry(curry(f)) is equivalent to curry(f); see Make curry idempotent such that curry(curry(f)) -> curry(f) #147
  2. Comparable via equality and hashable based on func, args, and keywords;
  3. func, args, and keywords attributes are readonly
    • Important because these are used to calculate the hash
  4. __name__ and __doc__ attributes are writable
    • This allows users to customize curried objects to reflect their intended usage
  5. More efficient implementation using partial; see Change curry to use functools.partial #150
  6. Curried objects never transmogrify into partial objects; see Cython implementation of toolz #155

eriknw added 3 commits May 6, 2014 10:53
…#155 pytoolz#159

1. Idempotent such that `curry(curry(f))` is equivalent to `curry(f)`; see pytoolz#147
2. Comparable via equality and hashable based on `func`, `args`, and `keywords`; see pytoolz#159
3. `func`, `args`, and `keywords` attributes are readonly
4. `__name__` and `__doc__` attributes are writable
5. More efficient implementation using `partial`; see pytoolz#150
6. Curried objects never transmogrigy into `partial` objects; see pytoolz#155
assert not isinstance(g.func, curry)
assert not hasattr(g.func, 'func')
# curry makes a new curry object, so everything is distinct but equal
assert f is not g
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we actually want to assert this? Aren't there valid solutions that allow this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. I'll remove the "distinct" checks.

@mrocklin
Copy link
Member

mrocklin commented May 6, 2014

In general I like these changes.

The one thing I'm not super excited about is including partial in curry. What was the original reasoning here? Was it so that we can call f._partial(...) and get faster performance? This is wacky enough so that I'd prefer not to include it if possible. In general I prefer transmorgification but, given the existence of cytoolz, I'm also just willing to settle on a slow toolz.get.

@eriknw
Copy link
Member Author

eriknw commented May 6, 2014

The one thing I'm not super excited about is including partial in curry. What was the original reasoning here?

Simple: it is a faster way to implement curry. We use partial to efficiently combine args and merge kwargs.

Was it so that we can call f._partial(...) and get faster performance?

I had proposed f.call at one point, but I think it's funky too. As the naming implies, f._partial is an implementation detail; cytoolz doesn't need it, so cytoolz doesn't have it.

In general I prefer transmorgification

Even now that curry has even more behaviors that partial doesn't have, like being comparable and hashable in a sane way?

but, given the existence of cytoolz, I'm also just willing to settle on a slow toolz.get.

Fair enough. You've said this enough times now to convince me too :)

@mrocklin
Copy link
Member

mrocklin commented May 6, 2014

Simple: it is a faster way to implement curry. We use partial to efficiently combine args and merge kwargs.

So it's not a performance issue, just more convenient for implementation?

@eriknw
Copy link
Member Author

eriknw commented May 6, 2014

So it's not a performance issue, just more convenient for implementation?

It is a performance issue. Using partial is faster.

@mrocklin
Copy link
Member

mrocklin commented May 6, 2014

But we still use standard calling syntax. Can you back this up with a tiny benchmark?

- return self.func(*args, **kwargs)
+ return self._partial(*args, **kwargs)

@eriknw
Copy link
Member Author

eriknw commented May 6, 2014

Can you back this up with a tiny benchmark?

Certainly. I'll be a few minutes though.

But we still use standard calling syntax.

Yup. That's not where most of the speed improvement comes from. By using partial, all of the below gets skipped prior to the function call:

        args = self.args + args
        if _kwargs:
            kwargs = {}
            if self.keywords:
                kwargs.update(self.keywords)
            kwargs.update(_kwargs)
        elif self.keywords:
            kwargs = self.keywords
        else:
            kwargs = {}

@mrocklin
Copy link
Member

mrocklin commented May 6, 2014

Ah, now I understand.

@eriknw
Copy link
Member Author

eriknw commented May 6, 2014

Benchmarks:

In [2]: def f(*args, **kwargs):
   ...:         pass
   ...: 

In [3]: new_f1 = new_curry(f)

In [4]: old_f1 = old_curry(f)

In [5]: 

In [5]: %timeit new_f1()
1000000 loops, best of 3: 1 µs per loop

In [6]: %timeit old_f1()
1000000 loops, best of 3: 1.39 µs per loop

In [7]: %timeit new_f1(1)
1000000 loops, best of 3: 1.19 µs per loop

In [8]: %timeit old_f1(1)
1000000 loops, best of 3: 1.56 µs per loop

In [9]: %timeit new_f1(1, 2)
1000000 loops, best of 3: 1.15 µs per loop

In [10]: %timeit old_f1(1, 2)
100000 loops, best of 3: 1.62 µs per loop

In [11]: %timeit new_f1(x=1)
1000000 loops, best of 3: 1.54 µs per loop

In [12]: %timeit old_f1(x=1)
100000 loops, best of 3: 2.63 µs per loop

In [13]: %timeit new_f1(x=1, y=2)
100000 loops, best of 3: 1.83 µs per loop

In [14]: %timeit old_f1(x=1, y=2)
100000 loops, best of 3: 2.91 µs per loop

In [15]: 

In [15]: new_f2 = new_curry(f, 1, 2)

In [16]: old_f2 = old_curry(f, 1, 2)

In [17]: 

In [17]: %timeit new_f2()
1000000 loops, best of 3: 1.03 µs per loop

In [18]: %timeit old_f2()
1000000 loops, best of 3: 1.44 µs per loop

In [19]: %timeit new_f2(3)
1000000 loops, best of 3: 1.22 µs per loop

In [20]: %timeit old_f2(3)
1000000 loops, best of 3: 1.59 µs per loop

In [21]: %timeit new_f2(3, 4)
1000000 loops, best of 3: 1.29 µs per loop

In [22]: %timeit old_f2(3, 4)
1000000 loops, best of 3: 1.64 µs per loop

In [23]: %timeit new_f2(x=3)
100000 loops, best of 3: 1.8 µs per loop

In [24]: %timeit old_f2(x=3)
100000 loops, best of 3: 2.69 µs per loop

In [25]: %timeit new_f2(x=3, y=4)
100000 loops, best of 3: 1.87 µs per loop

In [26]: %timeit old_f2(x=3, y=4)
100000 loops, best of 3: 2.97 µs per loop

In [27]: 

In [27]: new_f3 = new_curry(f, a=1, b=2)

In [28]: old_f3 = old_curry(f, a=1, b=2)

In [29]: 

In [29]: %timeit new_f3()
1000000 loops, best of 3: 1.38 µs per loop

In [30]: %timeit old_f3()
1000000 loops, best of 3: 1.63 µs per loop

In [31]: %timeit new_f3(3)
1000000 loops, best of 3: 1.55 µs per loop

In [32]: %timeit old_f3(3)
100000 loops, best of 3: 1.85 µs per loop

In [33]: %timeit new_f3(3, 4)
100000 loops, best of 3: 1.74 µs per loop

In [34]: %timeit old_f3(3, 4)
100000 loops, best of 3: 1.92 µs per loop

In [35]: %timeit new_f3(x=3)
100000 loops, best of 3: 2.16 µs per loop

In [36]: %timeit old_f3(x=3)
100000 loops, best of 3: 3.45 µs per loop

In [37]: %timeit new_f3(x=3, y=4)
100000 loops, best of 3: 2.34 µs per loop

In [38]: %timeit old_f3(x=3, y=4)
100000 loops, best of 3: 3.73 µs per loop

@mrocklin
Copy link
Member

mrocklin commented May 6, 2014

Seems convincing to me!

@eriknw
Copy link
Member Author

eriknw commented May 6, 2014

Seems convincing to me!

Great!

Was it so that we can call f._partial(...) and get faster performance? This is wacky enough so that I'd prefer not to include it if possible.

Do you want to rename it to f.__partial so users aren't tempted to use implementation details? This generally isn't preferred, because Python users are consenting adults. In this particular case, though, we know cytoolz.curry doesn't have _partial.

@mrocklin
Copy link
Member

mrocklin commented May 6, 2014

Do you want to rename it to f.__partial so users aren't tempted to use implementation details? This generally isn't preferred, because Python users are consenting adults. In this particular case, though, we know cytoolz.curry doesn't have _partial.

I don't have an opinion on this either way

@eriknw
Copy link
Member Author

eriknw commented May 6, 2014

I'd stick with _partial.

I consider this PR ready to merge, but I'll let you do the honors when you're ready.

args = func.args + args
func = func.func

if kwargs:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this check? Why not just always throw in the kwargs?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a performance tweak. partial is faster when keywords is None instead of {}.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess that I don't care much about performance tweaks during function creation as I assume that these will happen only a few times. In these cases I think that code cleanliness is more important.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, but this performance tweak isn't for initialization, it is for usage.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand. This is only going to happen whenever I add new args onto the curry, right? I guess that in my use of curried functions this happens only a few times. Can you supply a motivating example?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The performance increase is when the curried function gets called, not upon creating the curried object. Maybe this will clarify:

In [1]: from functools import partial

In [2]: def f():
   ...:     pass
   ...: 

In [3]: kwargs = {}

In [4]: f1 = partial(f, **kwargs)

In [5]: f2 = partial(f)

In [6]: f1.keywords
Out[6]: {}

In [7]: f2.keywords

In [8]: f2.keywords is None
Out[8]: True

In [9]: %timeit f1()
1000000 loops, best of 3: 379 ns per loop

In [10]: %timeit f2()
1000000 loops, best of 3: 291 ns per loop

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, now I understand. Alright, I guess that this makes sense.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it's subtle. One would expect partial to behave the same for f1 and f2 from above, but it doesn't.

eriknw added 2 commits May 6, 2014 15:11
Also, don't return `self` from `__call__`; return a new `curry` object instead.
@mrocklin
Copy link
Member

mrocklin commented May 6, 2014

Alright, looks good to me. +1. Merging in a bit if no comments.

@eriknw
Copy link
Member Author

eriknw commented May 6, 2014

Ah, the docstring wasn't updated to show that curried objects are comparable and hashable. @mrocklin, can I leave this for you to do? Otherwise, +1 from me too.

@mrocklin
Copy link
Member

mrocklin commented May 6, 2014

I'm not particularly concerned about telling users that curries are comparable and hashable. I think that common case users won't care. This is something to think about adding to the curry or serialization docs though. Merging.

mrocklin added a commit that referenced this pull request May 6, 2014
Many improvements to `curry` based on #147 #150 #155 #159
@mrocklin mrocklin merged commit 037ca71 into pytoolz:master May 6, 2014
@mrocklin
Copy link
Member

mrocklin commented May 6, 2014

Thanks for following through on all of these issues @eriknw . This stuff takes times and care.

@eriknw
Copy link
Member Author

eriknw commented May 6, 2014

Thanks for following through on all of these issues @eriknw . This stuff takes times and care.

Absolutely. I know how much you care about curry! I appreciate your thorough engagement too.

eriknw added a commit to eriknw/toolz that referenced this pull request Jun 23, 2014
Also, add a test that ensures `curry` doesn't return a `partial`
object if a single argument remains and keyword arguments are present.
This behavior was added in pytoolz#170.
eriknw added a commit to eriknw/cytoolz that referenced this pull request Jun 24, 2014
Matches curry changes from pytoolz/toolz#170
Matches reduceby changes from pytoolz/toolz#184
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants