-
Notifications
You must be signed in to change notification settings - Fork 39
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Change volume stats to handle and output masked array result #618
Conversation
Thanks @valeriupredoi . I'm still getting NaNs though. This is what the last elements of
I think it's because of the masked |
@sloosvel indeed, In [10]: np.ma.average(np.ma.array([1, 2, 3, 5]), weights=np.ma.array([2,2,2,2], mask=[True, True, True, False]))
Out[10]: 5.0
In [11]: np.ma.average(np.ma.array([1, 2, 3, 5]), weights=np.ma.array([2,2,2,2], mask=[True, True, True, True]))
Out[11]: nan and it looks like the actual logic is correct from the documentation since the statistic is computed using sum(weights) as denominator (obv, weighted mean heh) so in case of a fully masked weights array we are a bit stuck - should probably perform a simple mean in that case since weights are completely irrelevant but that could assign a bit more weight to that particular column- I am not sure how to approach from the statistical point of vew - @ledm and advice? |
or we can just completely mask that column? |
OK I masked it completely to get rid of |
I think the routine is computing the 2D average first and then the average over all levels. So the output of averaging Anyway, I think it's a bit strange to have a completely masked level in volcello. |
so that's the average volume per time point with computed average over a set of columns at different depths, weighted by each column's volume; the code changes mask those values that have weights masked equivalent to saying if the column volume is not viable as a weight there (coz it's masked) it means I may not have enough information about that particular column, and extrapolating this - if all my columns have masked values I don't have enough information to compute an average volume at that time point so I am masking it - I think it's fine |
I don't know that's seems overly complicated to me. I took a closer look and it turns out that depth_volume is a list. I changed its type to np.ma.array and I don't get nans anymore. |
@sloosvel this is the generalized approach, note that even if |
any news on this @sloosvel 🍺 |
I am not sure if a case in which all There are several models that have the last layer of the ocean data fully masked. This routine is not handling well this case because the |
cool! then apply whatever changes are needed here, job done by the sounds of it 🍺 |
…to change_volume_stats_to_masked_result
@ledm @bouweandela Can you please review this? It is blocking us |
…to change_volume_stats_to_masked_result
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok by me, shame I can't approve my own PR that is in fact not my PR 😁 @sloosvel does your implementation sort out all them nan
'd?
I added a test for nans and looks like it's passing |
Indeed it do 😁 What's with all the deleted blank lines? New Flake8 conventions for docstrings? - there has to be a blank line between docstring body and eg |
I copy-pasted code and probably messed the format while doing so, will correct now. |
@jvegasbsc Please feel free to help out with reviewing, it is impossible for me to review every single pull request in a timely manner. |
@bouweandela Well if you want to consider that case I think the easiest is to save the result in line 275 in
@valeriupredoi Your implementation was not working for the datasets that I was testing. Because the average was still being computed with lists instead of masked arrays, so the result was a NaN that was getting masked at the very end. I cannot have |
Yes, that would be a good solution. |
@mattiarighi Could you please test and merge when you're happy with the result? |
@sloosvel can you provide a test case (specific model and variable(s) to test)? |
@mattiarighi Sorry for the delay, our data it's on its way to be published but it's not available yet. The HadGEM3-GC31-LL historical data could be another example of a volcello with a fully masked level, but I'm having concatenation errors with this dataset. |
The changes in code look fine to me. |
Before you start, please read CONTRIBUTING.md.
Tasks
no, it's just being petty, it can bugger off
yamllint
to check that your YAML files do not contain mistakesIf you need help with any of the tasks above, please do not hesitate to ask by commenting in the issue or pull request.
Closes #611