Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update MIDC variable mapping #2006

Merged
merged 11 commits into from
Jun 7, 2024
Merged

Conversation

AdamRJensen
Copy link
Member

@AdamRJensen AdamRJensen commented Apr 12, 2024

  • [ ] Closes #xxxx
  • I am familiar with the contributing guidelines
  • [ ] Tests added
  • [ ] Updates entries in docs/sphinx/source/reference for API changes.
  • Adds description and name entries in the appropriate "what's new" file in docs/sphinx/source/whatsnew for all changes. Includes link to the GitHub Issue with :issue:`num` or this Pull Request with :pull:`num`. Includes contributor name and/or GitHub username (link with :ghuser:`user`).
  • New code is fully documented. Includes numpydoc compliant docstrings, examples, and comments where necessary.
  • Pull request is nearly complete and ready for detailed review.
  • Maintainer: Appropriate GitHub Labels (including remote-data) and Milestone are assigned to the Pull Request and linked Issue.

I got the following comments from the personnel responsible for operating MIDC. This PR implements the requested changes, which I think makes good sense.

Note, the change in how instruments are mapped is in a way a breaking change,

Just a heads up on the following variable mapping:

  • NWTC does not have PSP anymore (and they also have Direct and Diffuse now).
  • BMS still has the NIP, but the CHP1 is better (we had the CHP1 for a very long time).
  • UOSMRL does not have Diffuse Schenk anymore (they have a Diffuse CMP22). Also they now have an instrument better than NIP that can be used instead (CHP1)

I’m not sure if this variable mapping is used for just current data or historical data, so maybe that is why the NIP was chosen so that it works over a larger time period, however for NWTC and UOSMRL you won’t have the PSP and Schenk anymore.

Also I highly recommend the VTIF be removed since its shutdown (had leveling problems) and since BMS is so close to it there is no reason to use it.

@AdamRJensen AdamRJensen added io remote-data triggers --remote-data pytests labels Apr 12, 2024
@AdamRJensen AdamRJensen requested a review from kandersolar April 12, 2024 09:26
@AdamRJensen
Copy link
Member Author

Test failures are related to upload to Codecov

@cwhanse
Copy link
Member

cwhanse commented Apr 12, 2024

To be clear, MIDC no longer offers data from those old instruments through their API? Or, are the old data still available somehow, and there are also data from new instruments?

@AdamRJensen
Copy link
Member Author

AdamRJensen commented Apr 12, 2024

To be clear, MIDC no longer offers data from those old instruments through their API? Or, are the old data still available somehow, and there are also data from new instruments?

The latter. At some periods that may be two of the same type of measurement, e.g. DNI, so we need to decide which gets mapped to "dni".

@cwhanse
Copy link
Member

cwhanse commented Apr 12, 2024

If someone requests old data (prior to these MIDC changes), they will get old data. If they also ask for variable mapping, pvlib will rename any of the current fields, and leave the old ones alone. Is that right?

@AdamRJensen
Copy link
Member Author

If someone requests old data (prior to these MIDC changes), they will get old data. If they also ask for variable mapping, pvlib will rename any of the current fields, and leave the old ones alone. Is that right?

Correct. No data is changed.

But if you upgrade your pvlib version, the dni data may seem to have changed because the data is renamed from a different column in some instances.

@kandersolar
Copy link
Member

It's not obvious to me that pvlib.iotools should be in the business of choosing which of several DNI options is the "blessed" channel that gets mapped to be mapped to the special name dni. What do we do with other datasets that provide multiple measurements of the same quantity? With get_srml, don't we return names like dni_1, dni_2?

@AdamRJensen
Copy link
Member Author

It's not obvious to me that pvlib.iotools should be in the business of choosing which of several DNI options is the "blessed" channel that gets mapped to be mapped to the special name dni. What do we do with other datasets that provide multiple measurements of the same quantity? With get_srml, don't we return names like dni_1, dni_2?

I believe so. In this case we could maybe do dni_chp1 and the likes?

@kandersolar
Copy link
Member

In this case we could maybe do dni_chp1 and the likes?

I could support this. The alternative I see is to not map at all in cases like this where there are several options.

@AdamRJensen
Copy link
Member Author

AdamRJensen commented Apr 19, 2024

It's not obvious to me that pvlib.iotools should be in the business of choosing which of several DNI options is the "blessed" channel that gets mapped to be mapped to the special name dni.

I want to emphasize that it is the data provider doing it in this case. So I don't see a problem with it. Also in this case there's a significant improvements in accuracy (thus a clear choice) and one instrument was decommissioned

@AdamRJensen
Copy link
Member Author

@kandersolar I've implemented a bit of a compromise. For the stations that have multiple pyrheliometers, I have implemented subscripts for the instrument type, e.g., dni_chp1 and dni_nip.

For the case when the new column does not denote the instrument type, e.g., 'Diffuse [W/m^2], I have chosen to just map this to the corresponding variable name (for the example it would be dhi).

@AdamRJensen AdamRJensen added remote-data triggers --remote-data pytests and removed remote-data triggers --remote-data pytests labels Apr 22, 2024
@AdamRJensen AdamRJensen added this to the v0.10.5 milestone Apr 22, 2024
@kandersolar kandersolar modified the milestones: v0.10.5, 0.11.0 May 3, 2024
@AdamRJensen AdamRJensen marked this pull request as ready for review May 24, 2024 08:54
Copy link
Member

@kandersolar kandersolar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe for cases like the MIDC where there are potentially multiple instruments, the rule should be to always append the instrument name when it is known, even if there is only one measurement of that type? So for example, BMS (which has only one GHI measurement listed) would have 'Global CMP22 (vent/cor) [W/m^2]': 'ghi_cmp22'.

This seems somewhat more consistent to me, and has the added benefit of being future-proof if/when the MIDC adds more instruments again in the future.

Side note: doesn't the BMS have dozens of instruments? Why do we only map 1-2 of each type?

docs/sphinx/source/whatsnew/v0.11.0.rst Outdated Show resolved Hide resolved
pvlib/iotools/midc.py Outdated Show resolved Hide resolved
@AdamRJensen
Copy link
Member Author

Maybe for cases like the MIDC where there are potentially multiple instruments, the rule should be to always append the instrument name when it is known, even if there is only one measurement of that type? So for example, BMS (which has only one GHI measurement listed) would have 'Global CMP22 (vent/cor) [W/m^2]': 'ghi_cmp22'.

This seems somewhat more consistent to me, and has the added benefit of being future-proof if/when the MIDC adds more instruments again in the future.

I don't think it is future proof given that SRRL for example has multiple of the same type, so it would have to be ghi_cmp22_1.

Side note: doesn't the BMS have dozens of instruments? Why do we only map 1-2 of each type?

Yes, I expect they have 20+ GHI sensors, so I think that's not useful to map. However, they specify one instrument as the main instrument, which is what we map in this case. I acknowledge that it is not ideal, but I still think that merging this PR is the current best option

@cwhanse
Copy link
Member

cwhanse commented Jun 6, 2024

merging this PR is the current best option

I agree

@AdamRJensen AdamRJensen merged commit 17aed3a into pvlib:main Jun 7, 2024
24 of 25 checks passed
@AdamRJensen AdamRJensen deleted the update_midc_dict branch June 7, 2024 09:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
io remote-data triggers --remote-data pytests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants