reading hourly model data in aerocom format and colocating based on that crashes #1332

Open
jgriesfeller opened this issue Sep 3, 2024 · 2 comments

Comments

@jgriesfeller
Member

Describe the bug
Please provide a clear and concise description of what the bug is.

  • Pyaerocom version: 0.22.dev0; branch 1330-add-more-aeroval-base-configurations
  • Computing platform:
  • Configuration file:
from pyaerocom.aeroval import EvalSetup, ExperimentProcessor
from pyaerocom.aeroval.config.cameo.base_config import get_CFG
CFG = get_CFG(anayear=2019)
CFG["raise_exceptions"] = False
CFG["add_model_maps"] = False
stp = EvalSetup(**CFG)
ana = ExperimentProcessor(stp)
ana.update_interface()

res = ana.run()
  • Error message
/lustre/storeB/project/fou/kl/CAMS2_40/task4041/EMEP.cameo/renamed/aerocom3_EMEP.cameo_concnh4_Surface_2019_hourly.nc.
Error: repr(Last timestamp of data 2019-12-31T00:00:00.000000 does not lie in end period: 2019-12-31 23:00)

Invalid var_name time for coord None in cube. Overwriting with time
Invalid long_name None for coord time in cube. Overwriting with Time
Invalid long_name latitude for coord lat in cube. Overwriting with Center coordinates for latitudes
Invalid long_name longitude for coord lon in cube. Overwriting with Center coordinates for longitudes
Failed to perform analysis: Traceback (most recent call last):
 File "/home/jang/data/Python3/pyaerocom/pyaerocom/colocation/colocator.py", line 390, in run
   coldata = self._run_helper(
             ^^^^^^^^^^^^^^^^^
 File "/home/jang/data/Python3/pyaerocom/pyaerocom/colocation/colocator.py", line 1068, in _run_helper
   coldata = self._colocation_func(**args)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 File "/home/jang/data/Python3/pyaerocom/pyaerocom/colocation/colocation_utils.py", line 799, in colocate_gridded_ungridded
   all_stats = data_ref.to_station_data_all(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 File "/home/jang/data/Python3/pyaerocom/pyaerocom/ungriddeddata.py", line 1257, in to_station_data_all
   data = self.to_station_data(
          ^^^^^^^^^^^^^^^^^^^^^
 File "/home/jang/data/Python3/pyaerocom/pyaerocom/ungriddeddata.py", line 955, in to_station_data
   merged = merge_station_data(
            ^^^^^^^^^^^^^^^^^^^
 File "/home/jang/data/Python3/pyaerocom/pyaerocom/helpers.py", line 931, in merge_station_data
   merged = _merge_stats_2d(
            ^^^^^^^^^^^^^^^^
 File "/home/jang/data/Python3/pyaerocom/pyaerocom/helpers.py", line 803, in _merge_stats_2d
   merged.merge_other(
 File "/home/jang/data/Python3/pyaerocom/pyaerocom/stationdata.py", line 868, in merge_other
   self.merge_vardata(other, var_name, **kwargs)
 File "/home/jang/data/Python3/pyaerocom/pyaerocom/stationdata.py", line 838, in merge_vardata
   return self._merge_vardata_2d(other, var_name, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 File "/home/jang/data/Python3/pyaerocom/pyaerocom/stationdata.py", line 756, in _merge_vardata_2d
   s0 = pd.concat([s0, s1], verify_integrity=True)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 File "/modules/rhel8/user-apps/aerocom/conda2022/envs/pya-edit/lib/python3.11/site-packages/pandas/core/reshape/concat.py", line 395, in concat
   return op.get_result()
          ^^^^^^^^^^^^^^^
 File "/modules/rhel8/user-apps/aerocom/conda2022/envs/pya-edit/lib/python3.11/site-packages/pandas/core/reshape/concat.py", line 644, in get_result
   new_index = self.new_axes[0]
               ^^^^^^^^^^^^^
 File "properties.pyx", line 36, in pandas._libs.properties.CachedProperty.__get__
 File "/modules/rhel8/user-apps/aerocom/conda2022/envs/pya-edit/lib/python3.11/site-packages/pandas/core/reshape/concat.py", line 702, in new_axes
   return [
          ^
 File "/modules/rhel8/user-apps/aerocom/conda2022/envs/pya-edit/lib/python3.11/site-packages/pandas/core/reshape/concat.py", line 703, in <listcomp>
   self._get_concat_axis if i == self.bm_axis else self._get_comb_axis(i)
   ^^^^^^^^^^^^^^^^^^^^^
 File "properties.pyx", line 36, in pandas._libs.properties.CachedProperty.__get__
 File "/modules/rhel8/user-apps/aerocom/conda2022/envs/pya-edit/lib/python3.11/site-packages/pandas/core/reshape/concat.py", line 766, in _get_concat_axis
   self._maybe_check_integrity(concat_axis)
 File "/modules/rhel8/user-apps/aerocom/conda2022/envs/pya-edit/lib/python3.11/site-packages/pandas/core/reshape/concat.py", line 774, in _maybe_check_integrity
   raise ValueError(f"Indexes have overlapping values: {overlap}")
ValueError: Indexes have overlapping values: DatetimeIndex(['2019-12-26'], dtype='datetime64[s]', freq=None)
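
The `ValueError` at the bottom of the traceback can be reproduced with plain pandas. The sketch below is not pyaerocom code: the two series are made-up stand-ins for `s0` and `s1`, and only the shared timestamp 2019-12-26 is taken from the error message. The de-duplication at the end is one possible workaround, shown as an assumption rather than as the behaviour pyaerocom should adopt.

```python
import pandas as pd

# Two hypothetical series that share one timestamp (2019-12-26),
# standing in for the s0 and s1 series named in the traceback.
s0 = pd.Series([1.0, 2.0], index=pd.to_datetime(["2019-12-25", "2019-12-26"]))
s1 = pd.Series([3.0, 4.0], index=pd.to_datetime(["2019-12-26", "2019-12-27"]))

# verify_integrity=True makes concat raise if the combined index has duplicates.
try:
    pd.concat([s0, s1], verify_integrity=True)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)

# Assumed workaround for illustration only: concatenate without the check,
# then keep the first value for every duplicated timestamp.
combined = pd.concat([s0, s1])
deduplicated = combined[~combined.index.duplicated(keep="first")]
print(deduplicated)
```

Whether dropping the duplicate is acceptable depends on why two station records carry the same timestamp in the first place, which this sketch does not address.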


To Reproduce
Steps to reproduce the behavior:
1. git switch 1330-add-more-aeroval-base-configurations
2. Put the config file above into a Python file
3. Run that file with python

Expected behavior
The run completes without crashing.

Screenshots
None

Additional context
Analysis of hourly data for the CAMEO project

@jgriesfeller jgriesfeller changed the title reading hourly model data in aerocom format and colocating based on that looks crashes reading hourly model data in aerocom format and colocating based on that crashes Sep 3, 2024
@lewisblake
Member

lewisblake commented Sep 16, 2024

Check the times again for precision. More generally speaking, we should probably reconsider how we colocate hourly data: whether we actually want to use the timestamps from the model, and how we deal with what the timestamps represent (beginning, middle, or end of the period).
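
To make the beginning/middle/end question concrete, here is a small pandas sketch. It uses invented hourly periods for the last day of 2019, not the actual model file, and all variable names are hypothetical. It shows how the same 24 averaging periods get different labels under each convention, and how many of those labels still fall inside calendar year 2019.

```python
import pandas as pd

hour = pd.Timedelta(hours=1)

# Start of each hourly averaging period on the last day of 2019 (illustrative data).
period_start = pd.date_range("2019-12-31 00:00", periods=24, freq=hour)

# The same 24 periods under the two other labelling conventions.
period_middle = period_start + hour / 2
period_end = period_start + hour

conventions = [
    ("beginning", period_start),
    ("middle", period_middle),
    ("end", period_end),
]
for name, stamps in conventions:
    in_2019 = int((stamps.year == 2019).sum())
    print(f"{name:>9}: last label {stamps[-1]}, labels inside 2019: {in_2019} of 24")
```

With end-of-period labels the final hour of the year is stamped 2020-01-01 00:00, so a filter on the calendar year keeps only 23 of the 24 periods. Any check on the first or last timestamp therefore has to know which convention the file uses.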

@jgriesfeller
Member Author

I had another look at the times.
Time variable of the old task4041 file:

double time(time) ;
                time:standard_name = "time" ;
                time:long_name = "time at middle of period" ;
                time:units = "days since 1900-01-01" ;
                time:calendar = "standard" ;
                time:axis = "T" ;
data:

 time = "2018-01-01 00:30", "2018-01-01 01:30", "2018-01-01 02:30", 

New file:

double time(time) ;
                time:standard_name = "time" ;
                time:long_name = "time at end of period" ;
                time:units = "days since 1900-01-01" ;
                time:calendar = "standard" ;
                time:axis = "T" ;

data:

 time = "2019-01-01", "2019-01-01 01", "2019-01-01 02", "2019-01-01 03", 

So the difference is that the new data uses the end point (time at end of period), while the old data uses the middle point (time at middle of period).

I don't recall ever seeing the end point explicitly mentioned in the time variable's long_name. I'm testing whether that might be the problem.
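
For reference, a rough sketch of how the two conventions could be brought onto a common footing with pandas. This is not what pyaerocom does: the `OFFSET_TO_MIDDLE` mapping and the raw values are assumptions made for illustration, and only the units string and the two long_name values come from the dumps above. The rounding step is there because fractional days stored as doubles need not land exactly on the hour, which is one way precision problems can creep in.

```python
import pandas as pd

# Units from the file header: "days since 1900-01-01".
origin = pd.Timestamp("1900-01-01")

# Hypothetical raw values: the first four hourly steps of 2019 as fractional days.
first_day = (pd.Timestamp("2019-01-01") - origin) / pd.Timedelta(days=1)
raw_days = [first_day + n / 24 for n in range(4)]

# Decode the numeric values, then round away floating-point noise.
decoded = pd.to_datetime(raw_days, unit="D", origin=origin)
rounded = decoded.round("min")

# Assumed mapping from the long_name wording to the shift that moves
# a label to the middle of its hourly period.
OFFSET_TO_MIDDLE = {
    "time at middle of period": pd.Timedelta(0),
    "time at end of period": -pd.Timedelta(minutes=30),
}

long_name = "time at end of period"  # as in the new file
middle = rounded + OFFSET_TO_MIDDLE[long_name]

print(rounded[0], "->", middle[0])
```

If the long_name is taken at face value, the first stamp of the new file (2019-01-01 00:00) labels the hour that ends at midnight, so its period middle lies on 2018-12-31.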
