Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sun_rise_set_transit_spa behavior changed in v0.11.1 #2238

Open
kandersolar opened this issue Oct 4, 2024 · 4 comments
Open

sun_rise_set_transit_spa behavior changed in v0.11.1 #2238

kandersolar opened this issue Oct 4, 2024 · 4 comments
Labels
Milestone

Comments

@kandersolar
Copy link
Member

Describe the bug
With #2055, pvlib.solarposition.sun_rise_set_transit_spa sunrise etc times are in the next/previous day, depending on the time zone of the input timestamps and how close to midnight the timestamps are.

To Reproduce

import pandas as pd
import pvlib

times = pd.to_datetime(["2000-01-01 18:00", "2000-01-01 19:00"]).tz_localize("Etc/GMT+5")
pvlib.solarposition.sun_rise_set_transit_spa(times, 40, -80)['sunrise']

Output:

v0.11.0:

2000-01-01 18:00:00-05:00   2000-01-01 07:41:50.956689024-05:00
2000-01-01 19:00:00-05:00   2000-01-01 07:41:50.956689024-05:00

v0.11.1:

2000-01-01 18:00:00-05:00   2000-01-01 07:41:50.956689024-05:00
2000-01-01 19:00:00-05:00   2000-01-02 07:41:58.457240320-05:00

In v0.11.1, the dates are based on the UTC equivalent of the input timestamps. Previously, the dates were taken directly from the input timestamps.

Expected behavior
The previous behavior is more intuitive to me. If I am calculating sunrise/sunset, I am likely working in local time, and wanting to calculate sunrise/sunset accordingly.

Unlike dayofyear-based calculations for Earth's orbital position (the focus of #2055), I think sunrise/sunset calculations should respect the timezone of the input timestamps. I think we should revert to the previous behavior.

Additional context
#1631 is somewhat similar, although for a different underlying cause and calculation method.

@cwhanse
Copy link
Member

cwhanse commented Oct 4, 2024

Huh. I agree that v0.11.1 isn't the desired behavior, nor was a change intended.

Now I'm curious about two things:

  • how did tests not detect this change (I haven't looked yet)
  • how did this behavior change due to a seemingly innocent change in these two lines:

Before

    utcday = pd.DatetimeIndex(times.date).tz_localize('UTC')
    unixtime = _datetime_to_unixtime(utcday)

After

    times_utc = times.tz_convert('UTC')
    unixtime = _datetime_to_unixtime(times_utc.normalize())

For the second question, it is because times.date ignores the timezone and simply returns the date part of the datetime string. That is the issue that provoked #2055, when interpolating the monthly turbidity data.

@kandersolar
Copy link
Member Author

In other words: those two snippets both "truncate" the datetimes to midnight (via .date or .normalize()), and convert to UTC. The difference is the order in which those operations are performed.

@cwhanse
Copy link
Member

cwhanse commented Oct 4, 2024

However we fix it, let's add some inline comments.

@yhkee0404
Copy link
Contributor

yhkee0404 commented Oct 7, 2024

Sorry for the unexpected change. In fact, #2055 managed to fix an ambiguous issue of sun_rise_set_transit_spa that had existed before v0.11.1. However, I felt the relevant issues were too big to discuss at once and missed a chance to talk.

Thank you for raising this issue. Please let me clarify the issues one by one. I'll begin by taking the original output of spa_c.spa_calc, which can be interpreted in two ways: like in v0.11.1 affected by #2055 or similarly to sun_rise_set_transit_geometric with an additional ambiguity. Then I will demonstrate that sun_rise_set_transit_spa in v0.11.0 had an even more significant problem that some of the output values did not match those produced by spa_c.spa_calc. Finally, I am about to remind the difference in the output of sun_rise_set_transit_ephem.

import datetime
import pandas as pd
import numpy as np
import pvlib

latitude = 40
longitude = -80
tz = "Etc/GMT+5"

times = pd.date_range("1999-12-31 18:00", "2000-01-01 19:00", freq = 'H', tz=tz)[[0,1,5,6,13,14,-2,-1]]
times_utc = times.tz_convert('UTC')
utcoffset_hours = times[0].utcoffset().total_seconds() / 3600
sunrise_spa_c_local = pvlib.solarposition.spa_c(times, latitude, longitude, raw_spa_output=True)[['sunrise']]
sunrise_spa_c_utc = pvlib.solarposition.spa_c(times_utc, latitude, longitude, raw_spa_output=True)[['sunrise']]
sunrise_spa_c = sunrise_spa_c_local.join(
    sunrise_spa_c_utc.tz_convert(tz), lsuffix='_spa_c_local', rsuffix='_spa_c_utc', how='outer')

assert np.array_equal(sunrise_spa_c_local, sunrise_spa_c_utc)

sunrise_spa_c.head(2)

This shows the sunrise hour returned from spa_c.spa_calc:

                           sunrise_spa_c_local  sunrise_spa_c_utc
1999-12-31 18:00:00-05:00            12.694794          12.694794
1999-12-31 19:00:00-05:00            12.697481          12.697481

The output of spa_c.spa_calc remains consistent across time zones. It seems that the times are being calculated from UTC midnight, with the sun rising around 7 a.m. After applying the UTC offset of -5 hours to the local time:

image

sunrise_spa_c_local_fixed = sunrise_spa_c_local + utcoffset_hours
sunrise_spa_c_fixed = sunrise_spa_c.copy()
sunrise_spa_c_fixed['sunrise_spa_c_local'] = sunrise_spa_c_local_fixed
sunrise_spa_c_fixed.rename(columns={'sunrise_spa_c_local':'sunrise_spa_c_fixed_local'}, inplace=True)

sunrise_spa_c_local_datetime = sunrise_spa_c_local_fixed.apply(
    lambda row: row.apply(lambda x: row.name.normalize() + datetime.timedelta(hours=x)), axis=1)
sunrise_spa_c_utc_datetime = sunrise_spa_c_utc.apply(
    lambda row: row.apply(lambda x: row.name.normalize() + datetime.timedelta(hours=x)), axis=1)
sunrise_spa_c_datetime = sunrise_spa_c_local_datetime.join(
    sunrise_spa_c_utc_datetime.tz_convert(tz), lsuffix='_spa_c_fixed_local', rsuffix='_spa_c_utc', how='outer')

assert np.array_equal(sunrise_spa_c_local_fixed - utcoffset_hours, sunrise_spa_c_utc)
assert not np.array_equal(
    sunrise_spa_c_local_datetime.applymap(lambda x: x.date()),
    sunrise_spa_c_utc_datetime.applymap(lambda x: x.date()))

sunrise_spa_c_fixed.join(
    sunrise_spa_c_datetime, lsuffix='_original_hour', rsuffix='_derived_datetime', how='outer')

There are two possible approaches to adjust sunrise_spa_c_fixed_local_derived_datetime. One method is to simply convert sunrise_spa_c_utc_derived_datetime to the local timezone. The other method involves overriding the UTC output's date with that of the local input. The latter approach, which I used here, presents an issue: we cannot determine the correct local sunrise by dates. For example, there is an ambiguity between 1999-12-31 07:41:41.256873-05:00 and 1999-12-31 07:41:50.931387-05:00, and between 2000-01-01 07:41:50.931387-05:00 and 2000-01-01 07:41:58.431969-05:00. This problem does not arise in UTC.

#2055 in v0.11.1 adopted the former approach for sun_rise_set_transit_spa:

image

sunrise_spa_py_local = pvlib.solarposition.sun_rise_set_transit_spa(times, latitude, longitude)[['sunrise']]
sunrise_spa_py_utc = pvlib.solarposition.sun_rise_set_transit_spa(times_utc, latitude, longitude)[['sunrise']]
sunrise_spa_py = sunrise_spa_py_local.join(
    sunrise_spa_py_utc.tz_convert(tz), lsuffix='_spa_py_local', rsuffix='_spa_py_utc')

sunrise_spa_py_hour = sunrise_spa_py.applymap(lambda x: x.hour + x.minute / 60 + x.second / 3600)

assert np.array_equal(sunrise_spa_py_local, sunrise_spa_py_utc)

sunrise_spa_py_hour.join(sunrise_spa_py, lsuffix='_derived_hour_v0.11.1', rsuffix='_original_datetime_v0.11.1')

While the primary concern with this issue is that the change was unexpected, it successfully resolved the ambiguity of having two sunrises on the same local date.

On the other hand, sun_rise_set_transit_geometric seems to use the latter approach, where the time part of the output changes at UTC midnight but the date part corresponds to the input, differing at either UTC or local midnight:

image

eot = pvlib.solarposition.equation_of_time_spencer71(
    times_utc.dayofyear)  # minutes
decl = pvlib.solarposition.declination_spencer71(times_utc.dayofyear)  # radians

sunrise_geometric_local = pvlib.solarposition.sun_rise_set_transit_geometric(
    times, latitude, longitude, decl, eot)[0].to_series(index=times, name='sunrise').to_frame()
sunrise_geometric_utc = pvlib.solarposition.sun_rise_set_transit_geometric(
    times_utc, latitude, longitude, decl, eot)[0].to_series(index=times_utc, name='sunrise').to_frame()
sunrise_geometric = sunrise_geometric_local.join(
    sunrise_geometric_utc.tz_convert(tz), lsuffix='_geometric_local', rsuffix='_geometric_utc')

sunrise_geometric_hour = sunrise_geometric.applymap(lambda x: x.hour + x.minute / 60 + x.second / 3600)

assert np.isclose(
    sunrise_geometric_hour['sunrise_geometric_local'] - utcoffset_hours,
    sunrise_geometric_hour['sunrise_geometric_utc']).all()
assert not np.array_equal(
    sunrise_geometric_local.applymap(lambda x: x.date()), sunrise_geometric_utc.applymap(lambda x: x.date()))

sunrise_geometric_hour.join(sunrise_geometric, lsuffix='_derived_hour', rsuffix='_original_datetime')

Although neither of these two approaches alters the original pattern of sunrise hours from spa_c.calc, which always change at UTC midnight regardless of the timezone, the prior implementation of sun_rise_set_transit_spa in v0.11.0 tried to differentiate timezones; local hours change at local midnights unlike UTC hours at UTC midnights, which can be seen as a bug since it contradicts the original behavior of spa_c.calc:

image

sunrise_spa_py_local = pvlib.solarposition.sun_rise_set_transit_spa(times, latitude, longitude)[['sunrise']]
sunrise_spa_py_utc = pvlib.solarposition.sun_rise_set_transit_spa(times_utc, latitude, longitude)[['sunrise']]
sunrise_spa_py = sunrise_spa_py_local.join(
    sunrise_spa_py_utc.tz_convert(tz), lsuffix='_spa_py_local', rsuffix='_spa_py_utc', how='outer')

sunrise_spa_py_hour = sunrise_spa_py.applymap(lambda x: x.hour + x.minute / 60 + x.second / 3600)

assert not np.array_equal(
    sunrise_spa_py_hour['sunrise_spa_py_local'] - utcoffset_hours, sunrise_spa_py_hour['sunrise_spa_py_utc'])
assert not np.array_equal(
    sunrise_spa_py_local.applymap(lambda x: x.date()), sunrise_spa_py_utc.applymap(lambda x: x.date()))

sunrise_spa_py_hour.join(
    sunrise_spa_py, lsuffix='_derived_hour_v0.11.0', rsuffix='_original_datetime_v0.11.0', how='outer')

Unlike the other two approaches, which have consistent times and only change at UTC midnights even for local hours, the behavior of sun_rise_set_transit_spa before v0.11.1, i.e. v0.11.0, produced different results when converting the same datetime input across time zones. I wonder if the behavior was truly intentional or intuitive. Please note that sunrise_spa_py_local_derived_hour_v0.11.0 seems to show incorrect values from 19:00 to 23:00:

assert abs(7.694722 + 5 - 12.697222) > 1e-3
assert abs(7.697222 + 5 - 12.699444) > 1e-3

On the contrary, sunrise_spa_py_utc_derived_hour_v0.11.0 looks correct and nearer to spa_c.spa_calc:

assert abs(12.694794 - 12.694722) <= 1e-3
assert abs(12.697481 - 12.697222) <= 1e-3
assert abs(12.699564 - 12.699444) <= 1e-3

PyEphem, on the other hand, takes a different approach. It seems closest to the current behavior of sun_rise_set_transit_spa in v0.11.1, where both the date and time parts of the input change at the same time; but this time immediately after the sunrise, for example. around 7 a.m.:

image

sunrise_ephem_local = pvlib.solarposition.sun_rise_set_transit_ephem(times, latitude, longitude)[['sunrise']]
sunrise_ephem_utc = pvlib.solarposition.sun_rise_set_transit_ephem(times_utc, latitude, longitude)[['sunrise']]
sunrise_ephem = sunrise_ephem_local.join(
    sunrise_ephem_utc.tz_convert(tz), lsuffix='_ephem_local', rsuffix='_ephem_utc')

sunrise_ephem_hour = sunrise_ephem.applymap(lambda x: x.hour + x.minute / 60 + x.second / 3600)

assert np.array_equal(sunrise_ephem_local, sunrise_ephem_utc)

sunrise_ephem_hour.join(sunrise_ephem, lsuffix='_derived_hour', rsuffix='_original_datetime')

To sum up, I've identified four possible approaches to resolve this issue. One option is to modify sun_rise_set_transit_spa and sun_rise_set_transit_geometric to align with PyEphem's methodology as shown in the last screenshot, although spa_c.calc has no choice but to remain an exception. Another approach is to retain the behavior introduced by #2055 in v0.11.1 as shown in the second screenshot, where all values differ at UTC midnights, and to update sun_rise_set_transit_geometric accordingly. A third option would be to preserve the existing sun_rise_set_transit_geometric and modify sun_rise_set_transit_spa to align with it so that only the date part of the output corresponds to the input datetime, similar to the first and third screenshots. Finally, we could revert to the bahavior of v0.11.0, where the output changes based on time zone conversions of the same datetime input, differentiating it from both spa_c.spa_calc and sun_rise_set_transit_geometric.

Thank you for reading! Which option do you think is best?

@cwhanse cwhanse modified the milestones: v0.11.2, v0.11.3 Dec 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants