Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CLN] More cython cleanups, with bonus type annotations #22283

Merged
merged 14 commits into from
Sep 8, 2018

Conversation

jbrockmendel
Copy link
Member

A few places where string formatting is modernized

Removed remaining # cython: profile=False comments

The main (and possible controversial) thing done here is changing some functions to use type-hint syntax. The affected functions are all currently cpdef, but are never used in cython, so should only be def. But cython's syntax does not allow for specifying a return type in def functions, so this is the cleanest way to remove unnecessary cpdef without losing typing. (some discussion of this occurred in #22180)

@WillAyd
Copy link
Member

WillAyd commented Aug 11, 2018

Out of curiosity how are you verifying that the cpdef functions are only being called from Python? Manual exercise or is there a tool to help identify?

@jbrockmendel
Copy link
Member Author

Out of curiosity how are you verifying that the cpdef functions are only being called from Python? Manual exercise or is there a tool to help identify?

Just text search within the file. Since the relevant functions aren't exposed in a .pxd file, there is no risk of them being called from a different cython file.

@codecov
Copy link

codecov bot commented Aug 11, 2018

Codecov Report

Merging #22283 into master will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##           master   #22283   +/-   ##
=======================================
  Coverage   92.04%   92.04%           
=======================================
  Files         169      169           
  Lines       50782    50782           
=======================================
  Hits        46742    46742           
  Misses       4040     4040
Flag Coverage Δ
#multiple 90.45% <ø> (ø) ⬆️
#single 42.3% <ø> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e2e1a10...3300de1. Read the comment docs.

@gfyoung gfyoung added Internals Related to non-user accessible pandas implementation Clean labels Aug 13, 2018
@gfyoung gfyoung requested a review from jreback August 13, 2018 07:00
@jbrockmendel
Copy link
Member Author

@jreback would this be easier if I separated the no-profile comments into an isolated PR?

@jreback
Copy link
Contributor

jreback commented Aug 14, 2018

no just haven’t had a chance to look yet

@jbrockmendel jbrockmendel changed the title More cython cleanups, with bonus type annotations [CLN] More cython cleanups, with bonus type annotations Aug 16, 2018
""" return the maximum size of elements in a 1-dim string array """
cdef:
Py_ssize_t i, m = 0, l = 0, length = arr.shape[0]
pandas_string v

for i in range(length):
v = arr[i]
if PyString_Check(v):
if isinstance(v, str):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do these make any perf diffs?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cython makes this substitution on its own

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks ok, can you rebase

@jbrockmendel
Copy link
Member Author

@jreback gentle ping; got your OK last week

@jreback
Copy link
Contributor

jreback commented Sep 4, 2018

think this was ok, can you rebase

@jbrockmendel
Copy link
Member Author

Rebased; circleCI fail is hypothesis timeout

@jbrockmendel
Copy link
Member Author

Rebased+green

@@ -587,7 +587,7 @@ def get_dispatch(dtypes):

{{for name, c_type, dtype in get_dispatch(dtypes)}}

cpdef ensure_{{name}}(object arr, copy=True):
def ensure_{{name}}(object arr, copy=True):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are these for sure not called in cython code?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

non-cimported only

@@ -132,6 +132,7 @@ cdef inline void _sipround(uint64_t* v0, uint64_t* v1,
v2[0] = _rotl(v2[0], 32)


# TODO: This appears unused; remove?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

might be unused - it was a part of the hashing at one point iirc

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, will remove in next pass.

@@ -107,7 +107,7 @@ def memory_usage_of_objects(object[:] arr):
# ----------------------------------------------------------------------


cpdef bint is_scalar(object val):
def is_scalar(val: object) -> bint:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this not ever called in cython?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, easy to confirm via grep

@jreback jreback added this to the 0.24.0 milestone Sep 8, 2018
@jreback jreback merged commit ed65364 into pandas-dev:master Sep 8, 2018
@jreback
Copy link
Contributor

jreback commented Sep 8, 2018

thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Clean Internals Related to non-user accessible pandas implementation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants