Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PERF: Construction of a DatetimeIndex from a list of Timestamp with timezone #51247

Merged
merged 22 commits into from
Mar 15, 2023

Conversation

MarcoGorelli
Copy link
Member

@MarcoGorelli MarcoGorelli commented Feb 8, 2023

trying to use https://peps.python.org/pep-0508/#environment-markers, let's see if this works

@MarcoGorelli MarcoGorelli marked this pull request as ready for review February 9, 2023 11:49
pyproject.toml Outdated
@@ -27,7 +27,8 @@ dependencies = [
"numpy>=1.21.0; python_version>='3.10'",
"numpy>=1.23.2; python_version>='3.11'",
"python-dateutil>=2.8.2",
"pytz>=2020.1"
"pytz>=2020.1",
"tzdata>=2022.1; platform_system=='Windows'"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this need to be checked in setup.py? or even just import it in timezones.pyx?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I honestly don't know, I couldn't reproduce this when building from source

Tempted to just ship it, and then check with the nightlies?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lithomas1 thoughts?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I'll tag this as build to run the wheel builders, which should sniff this stuff out.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lithomas1 did you try this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah the wheel builder jobs ran on this PR. Looks like some failures. I don't have Windows access so I can't help more sorry.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the wheel builder jobs ran on this PR

Ah, I see now sorry - thanks! Taking a look

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, you might need a check for this in setup.py?

If you uninstall tzdata and try to run python setup.py develop, does this error correctly?

I don't know if setuptools reads this section of pyproject.toml.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

giving this a go

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this worked

In a new virtual environment on Windows, I:

  • pip installed numpy
  • pip installed cython
  • pip install versioneer[toml]
  • ran python setup.py develop

and tzdata got installed, along with python-dateutil. Here's (part of) my output:

copying build\lib.win-amd64-cpython-38\pandas\_libs\window\aggregations.cp38-win_amd64.pyd -> pandas\_libs\window
copying build\lib.win-amd64-cpython-38\pandas\_libs\window\indexers.cp38-win_amd64.pyd -> pandas\_libs\window
copying build\lib.win-amd64-cpython-38\pandas\_libs\writers.cp38-win_amd64.pyd -> pandas\_libs
copying build\lib.win-amd64-cpython-38\pandas\io\sas\_sas.cp38-win_amd64.pyd -> pandas\io\sas
copying build\lib.win-amd64-cpython-38\pandas\io\sas\_byteswap.cp38-win_amd64.pyd -> pandas\io\sas
copying build\lib.win-amd64-cpython-38\pandas\_libs\json.cp38-win_amd64.pyd -> pandas\_libs
Creating c:\users\user\pandas-dev\.venv\lib\site-packages\pandas.egg-link (link to .)
Adding pandas 2.1.0.dev0+186.g4b054da685 to easy-install.pth file

Installed c:\users\user\pandas-dev
Processing dependencies for pandas==2.1.0.dev0+186.g4b054da685
Searching for tzdata>=2022.1
Reading https://pypi.org/simple/tzdata/
C:\Users\User\pandas-dev\.venv\lib\site-packages\pkg_resources\__init__.py:123: PkgResourcesDeprecationWarning:  is an invalid version and will not be supported in a future release
  warnings.warn(
Downloading https://files.pythonhosted.org/packages/fa/5e/f99a7df3ae2079211d31ec23b1d34380c7870c26e99159f6e422dcbab538/tzdata-2022.7-py2.py3-none-any.whl#sha256=2b88858b0e3120792a3c0635c23daf36a7d7eeeca657c323da299d2094402a0d
Best match: tzdata 2022.7
Processing tzdata-2022.7-py2.py3-none-any.whl
Installing tzdata-2022.7-py2.py3-none-any.whl to c:\users\user\pandas-dev\.venv\lib\site-packages
Adding tzdata 2022.7 to easy-install.pth file

Installed c:\users\user\pandas-dev\.venv\lib\site-packages\tzdata-2022.7-py3.8.egg

@jbrockmendel
Copy link
Member

I'm on board in principle, dont know enough about pep508 to offer an informed opinion

@lithomas1 lithomas1 added the Build Library building on various platforms label Feb 10, 2023
@jbrockmendel
Copy link
Member

@MarcoGorelli IIUC you added a warning in timezones.pyx to address the associated issue(s). do you want to do this in addition to that?

@MarcoGorelli MarcoGorelli added this to the 2.0 milestone Mar 2, 2023
@MarcoGorelli
Copy link
Member Author

@MarcoGorelli IIUC you added a warning in timezones.pyx to address the associated issue(s). do you want to do this in addition to that?

I don't think I did, no

@MarcoGorelli MarcoGorelli marked this pull request as draft March 3, 2023 16:14
@MarcoGorelli MarcoGorelli marked this pull request as ready for review March 6, 2023 07:39
@MarcoGorelli
Copy link
Member Author

MarcoGorelli commented Mar 6, 2023

tzdata is versioned as 2022.1, 2022.2, ... on PyPI, but 2022a, 2022b on conda-forge 🤦

I can't see how to specify a minimum for both, but putting 2022.1 and then installing that dependency from PyPI seems to work

But I'm worried that if someone has already installed tzdata2022a from conda and then also installs tzdata2022.1 from PyPI then they'll get issues from the conflict. Or is this a non-issue?

I'm tempted to close, contact the tzdata conda-forge people and ask them to follow the same versioning as the PyPI package, and then specify a minimum which works for both

Removing from the 2.0.0 milestone anyway, this probably isn't something worth backporting (esp. if we're not sure about it)

@MarcoGorelli MarcoGorelli removed this from the 2.0 milestone Mar 6, 2023
@MarcoGorelli MarcoGorelli mentioned this pull request Mar 9, 2023
1 task
- pip:
- "cython"
- "--extra-index-url https://pypi.anaconda.org/scipy-wheels-nightly/simple"
- "--pre"
- "numpy"
- "scipy"
- "tzdata>=2022.1"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should these somehow be marked as windows-specific?

@@ -657,7 +657,7 @@ Optional libraries below the lowest tested version may still work, but are not c
+-----------------+-----------------+---------+
| tabulate |0.8.9 | X |
+-----------------+-----------------+---------+
| tzdata |2022a | |
| tzdata |2022.1 | |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we still want to edit the 1.5 file?

@MarcoGorelli
Copy link
Member Author

cool, I've gone with 2022.1 then, the conda recipe yaml file can specify 2022a

CI seems to be working fine now

@MarcoGorelli
Copy link
Member Author

reckon we can get this in for the next release candidate?

@lithomas1 lithomas1 added this to the 2.0 milestone Mar 13, 2023
@MarcoGorelli
Copy link
Member Author

I tried this out in a fresh virtual environment on Windows, and tzdata got installed fine: here's my complete unedited traceback https://pastebin.com/JpuCz8PY

- xarray>=0.21.0
- xlrd>=2.0.1
- xlsxwriter>=1.4.3
- zstandard>=0.15.2

- pip:
- tzdata>=2022.1; platform_system=="Windows"
Copy link
Member

@lithomas1 lithomas1 Mar 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's probably better to be consistent and go with the pip installed tzdata everywhere, so we don't get surprises if/when Github updates its runner images.

I guess there's a risk of things breaking for people without the tzdata package installed, but that's more of a zoneinfo/Python stdlib problem.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree, but isn't that what I've done (i.e. pip-installed tzdata)?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant on all platforms not just Windows, sorry if I was unclear.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, sure, I've removed the windows-specific condition, and have installed/required it everywhere

Copy link
Member

@lithomas1 lithomas1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Let's give this a shot!

Copy link
Member

@mroeschke mroeschke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there's one more timezone extra reference in Installing optional dependencies with pip extras in whatsnew/v2.0.0.rst. Otherwise LGTM

@mroeschke mroeschke merged commit 3ea1780 into pandas-dev:main Mar 15, 2023
@mroeschke
Copy link
Member

Thanks @MarcoGorelli

@lumberbot-app
Copy link

lumberbot-app bot commented Mar 15, 2023

Owee, I'm MrMeeseeks, Look at me.

There seem to be a conflict, please backport manually. Here are approximate instructions:

  1. Checkout backport branch and update it.
git checkout 2.0.x
git pull
  1. Cherry pick the first parent branch of the this PR on top of the older branch:
git cherry-pick -x -m1 3ea17805b072d575d1f55f50fe0027d0fdb9b985
  1. You will likely have some merge/cherry-pick conflict here, fix them and commit:
git commit -am 'Backport PR #51247: PERF: Construction of a DatetimeIndex from a list of Timestamp with timezone'
  1. Push to a named branch:
git push YOURFORK 2.0.x:auto-backport-of-pr-51247-on-2.0.x
  1. Create a PR against branch 2.0.x, I would have named this PR:

"Backport PR #51247 on branch 2.0.x (PERF: Construction of a DatetimeIndex from a list of Timestamp with timezone)"

And apply the correct labels and milestones.

Congratulations — you did some good work! Hopefully your backport PR will be tested by the continuous integration and merged soon!

Remember to remove the Still Needs Manual Backport label once the PR gets merged.

If these instructions are inaccurate, feel free to suggest an improvement.

@phofl
Copy link
Member

phofl commented Mar 17, 2023

@MarcoGorelli could you do the backport?

@MarcoGorelli
Copy link
Member Author

MarcoGorelli commented Mar 17, 2023

yup, thanks! guess this didn't make the rc then 😄 oh well

MarcoGorelli added a commit to MarcoGorelli/pandas that referenced this pull request Mar 17, 2023
…rom a list of Timestamp with timezone

---------

Co-authored-by: MarcoGorelli <>
(cherry picked from commit 3ea1780)
MarcoGorelli added a commit to MarcoGorelli/pandas that referenced this pull request Mar 17, 2023
…rom a list of Timestamp with timezone

---------

Co-authored-by: MarcoGorelli <>
(cherry picked from commit 3ea1780)
phofl pushed a commit that referenced this pull request Mar 17, 2023
…ndex from a list of Timestamp with timezone) (#52045)

Backport PR #51247: PERF: Construction of a DatetimeIndex from a list of Timestamp with timezone

---------

Co-authored-by: MarcoGorelli <>
(cherry picked from commit 3ea1780)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Build Library building on various platforms
Projects
None yet
Development

Successfully merging this pull request may close these issues.

PERF: Construction of a DatetimeIndex from a list of Timestamp with timezone
5 participants