-
Notifications
You must be signed in to change notification settings - Fork 119
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Extended consistency checking #268
Extended consistency checking #268
Conversation
1349e1d
to
9573d3f
Compare
thanks @znicholls, but I'm not sure that this change makes sense - do we really want to "jump" over levels in the variable tree? Wouldn't it make more sense to require that the variable |
I think it would but I also think that ship has sailed. For example, aviation and shipping are often reported at the World level only and you need to take account of that when checking regional sums. Pyam can already do that, this PR just makes it so it can take account variables which are deeper in the hierarchy. That is required in some cases (the current one for me being sorting variables into MAGICC's fossil and afolu boxes: I need to put aviation below fossil but make sure aviation is counted when making aggregates/checking consistency). |
Not sure why windows tests are failing, will try rebasing... |
8dabbc9
to
813fb95
Compare
@znicholls, still not quite convinced...
That, I agree with - but if you have a structure like
|
Yes, but if you have a structure like
then the current setup will fail when checking the aggregation of |
I'm +1 with @znicholls on this one. Bunkers are a consistent pain the in side for all emissions-related calculations, and I think first class support for them in That being said, would it make sense to add a few more tests here @znicholls? At the moment, I see that you added a shipping sector in the test data, but I don't think any tests are being applied explicitly for this feature (correct me if I'm wrong).. Also, would it be useful to add this new capability to the existing aggregation tutorial? I guarantee that folks who don't know it exists a prior will never find it buried either in the code or this PR. |
And ho-boy, it looks like windows tests are failing in our tutorial tests with |
@znicholls can you try adding |
Adding the shipping sector made the test suite fail (before the change) so I think that's testing for this feature in some way. Having said that, happy to add whatever tests you think are sensible (I just couldn't think of other obvious ones beyond what we already have).
Will put it on the todos. |
Ok, music to my ears, but also not obvious! I guess I was thinking of just including a consistency check of the sector total, but maybe not needed? I leave the decision in your expert hands. |
Added a couple more which hopefully make things more obvious. Over to you |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I approve the changes, but I'm still not convinced that this is a good path forward for the check_consistency()
function - it only works on an IamDataFrame
that has
- a completely hierarchical variable tree
(it will fail as soon as you haveFinal Energy|<sectors>
andFinal Energy|<fuels>
) - assumes summation over all variables
- has exactly one "global" region and n subregions (but no sub-sub-regions)
@gidden, over to you for a final review and merge |
I agree that this only works for a specific use case. However that's the use case I currently have (pulling out the CMIP6 data, doing some operations on it, making sure I didn't cock anything up along the way). I think the limitations you point out are great things to go for next, I just have no idea how to implement them. Maybe we make a new issue for this? |
Agreed that there are limitations here, but I see us building up a clean interface. When more features are needed, we can dig into the details, but the community writ large also needs to decide on those details (how do we define sub-sub regions, e.g.). @danielhuppmann you know that a pet peeve of mine is the hierarchical tree anyway, so something that forces us more in that direction is not a negative at least to me =) In any case, regarding that, I do think we can come up with some intelligent "cleaning" operations that basically takes a non-strict hierarchy, makes it strict, does an op, and then deals with additional data after. But I think that is a fight for another time.. |
thanks all for the robust discussion! |
Please confirm that this PR has done the following:
Adding to RELEASE_NOTES.md (remove section after adding to RELEASE_NOTES.md)
Please add a single line in the release notes similar to the following:
Description of PR
Extends the checking of internal consistency of data. The use case is this: a variable is reported at the World level only, but is more than one level deeper in the hierarchy than the variable being considered. Previously, our check_aggregate functions would raise an error, now they find this variable and correctly realise that the data is internally consistent.