extend IamDataFrame to use extra data columns and sub-annual time #132

danielhuppmann · 2018-11-02T08:54:20Z

Please confirm that this PR has done the following:

Tests Added
Documentation Added
Description in RELEASE_NOTES.md Added

Description of PR

This PR extends an IamDataFrame to allow extra (custom) columns in the data dataframe, and to distinguish between two time representations:

years (as integers), possibly with a column subannual (as string)
continuous time, cast using pd.to_datetime()

The "type" of time representation is stored in self.time_col; the names of the additional custom columns in self.data are stored in self.extra_cols.
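A minimal sketch of how the time column and the extra columns could be inferred from long-format input. This is an illustration only, not the code in this PR: the helper names infer_time_and_extra_cols() and cast_time_col(), the IAMC_IDX list, and the column names "time" and "value" are assumptions.

```python
import pandas as pd

# Assumed standard index columns; illustrative, not taken from the PR.
IAMC_IDX = ["model", "scenario", "region", "variable", "unit"]


def infer_time_and_extra_cols(data):
    """Return (time_col, extra_cols) for a long-format data frame (hypothetical helper)."""
    has_year = "year" in data.columns
    has_time = "time" in data.columns
    # Sanity check: exactly one of the two time representations may be used.
    if has_year == has_time:
        raise ValueError("data must have exactly one of 'year' or 'time'")
    time_col = "year" if has_year else "time"
    reserved = set(IAMC_IDX) | {time_col, "value"}
    # Everything that is not a known column is treated as an extra column.
    extra_cols = [c for c in data.columns if c not in reserved]
    return time_col, extra_cols


def cast_time_col(data, time_col):
    """Cast years to int and continuous time to datetime; returns a copy."""
    data = data.copy()
    if time_col == "year":
        data["year"] = data["year"].astype(int)
    else:
        data["time"] = pd.to_datetime(data["time"])
    return data


long_df = pd.DataFrame({
    "model": ["m"] * 2,
    "scenario": ["s"] * 2,
    "region": ["World"] * 2,
    "variable": ["Primary Energy"] * 2,
    "unit": ["EJ/y"] * 2,
    "year": [2005, 2010],
    "subannual": ["winter day", "winter day"],
    "value": [1.0, 6.0],
})
time_col, extra_cols = infer_time_and_extra_cols(long_df)
```

Here time_col is "year" and extra_cols holds the single custom column "subannual".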

One quick fix was required to get things working: read_files() is downgraded to take only one file, not a list of files, because accepting a list would require checking that all imported data has the same time format and the same extra columns (or adding the missing columns on the fly).

ToDo:

change "year" to self.time_col for basic functionality (e.g., in timeseries())
change *_IDX usage in the package to replace "year" by self.time_col and use self.extra_cols
figure out how to refactor append() (see issue above with read_files())
reimplement multiple files in read_files()
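The compatibility check that append() and a multi-file read_files() would need could look roughly like the sketch below. It is not part of this PR; check_compatible() is a hypothetical helper, and it takes the strict route of rejecting mismatched frames rather than adding missing columns on the fly.

```python
import pandas as pd


def check_compatible(frames, time_cols=("year", "time")):
    """Raise ValueError unless all frames share one time column and the same columns."""
    signatures = set()
    for df in frames:
        present = [c for c in time_cols if c in df.columns]
        if len(present) != 1:
            raise ValueError("each frame needs exactly one time column")
        # A frame's signature is its time column plus its full set of columns.
        signatures.add((present[0], frozenset(df.columns)))
    if len(signatures) > 1:
        raise ValueError("frames differ in time format or extra columns")
    return True


by_year = pd.DataFrame({"year": [2005], "value": [1.0]})
by_time = pd.DataFrame({"time": pd.to_datetime(["2005-06-17"]), "value": [1.0]})
with_extra = pd.DataFrame({"year": [2010], "subannual": ["winter day"], "value": [2.0]})

ok = check_compatible([by_year, by_year])
```

Passing [by_year, by_time] or [by_year, with_extra] raises a ValueError in this sketch.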

Related issue and discussion

Closes #123

danielhuppmann · 2018-11-02T09:18:46Z

Help requested

@gidden and @znicholls, what would be an elegant way for running each test three times, once with the existing examples, once with a "subannual" version ("winter day" style), and once for a continuous-time example?
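One possible approach is a parametrized pytest fixture, so that every test requesting the fixture is run once per time format. The sketch below is an assumption about how this could be set up, not the fixture that ended up in the test suite; the sample values and the _long_df() builder are made up for illustration.

```python
import pandas as pd
import pytest

IDX = ["model", "scenario", "region", "variable", "unit"]
ROW = ["a_model", "a_scenario", "World", "Primary Energy", "EJ/y"]


def _long_df(time_col, times, **extra):
    """Build a tiny long-format frame with the given time axis (illustrative data)."""
    df = pd.DataFrame([ROW] * len(times), columns=IDX)
    df[time_col] = times
    for name, value in extra.items():
        df[name] = value
    df["value"] = [1.0, 6.0][: len(times)]
    return df


# One builder per time representation.
TEST_DATA = {
    "year": lambda: _long_df("year", [2005, 2010]),
    "subannual": lambda: _long_df("year", [2005, 2010], subannual="winter day"),
    "datetime": lambda: _long_df(
        "time", pd.to_datetime(["2005-06-17", "2010-07-21"])
    ),
}


@pytest.fixture(params=sorted(TEST_DATA))
def test_df(request):
    """Tests that request this fixture run once per key of TEST_DATA."""
    return TEST_DATA[request.param]()


def test_has_two_rows(test_df):
    # Collected three times by pytest: datetime, subannual, year.
    assert len(test_df) == 2
```

Tests that only make sense for one format can keep using a dedicated, non-parametrized fixture.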

danielhuppmann · 2018-11-03T14:23:14Z

revised as suggested by @znicholls to cover initialising an IamDataFrame from wide format with columns as datetime, see danielhuppmann#8

danielhuppmann · 2018-11-03T14:50:41Z

Help requested

@znicholls, can you implement an appropriate equivalent of the pyam.utils.years_match() function for the new time column formatted as datetime? I guess that this will require some from-to filtering option (e.g., for years, I would use df.filter(year=range(2030, 2051))) to get a downselection of the time period.
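A rough sketch of what such a from-to filter could look like for a datetime column. The name datetime_match() and its start/end arguments are hypothetical and not an existing pyam function; the interval is half-open, mirroring range(2030, 2051).

```python
import datetime as dt

import pandas as pd


def datetime_match(times, start=None, end=None):
    """Return a boolean mask selecting start <= time < end (hypothetical helper)."""
    times = pd.to_datetime(pd.Series(times))
    mask = pd.Series(True, index=times.index)
    if start is not None:
        mask &= times >= pd.Timestamp(start)
    if end is not None:
        mask &= times < pd.Timestamp(end)
    return mask


times = ["2029-12-31", "2030-01-01", "2040-06-17", "2051-01-01"]
mask = datetime_match(
    times, start=dt.datetime(2030, 1, 1), end=dt.datetime(2051, 1, 1)
)
# mask selects the two middle entries only
```

The mask can then be combined with the masks of other filters, in the same way a year-based mask would be.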

danielhuppmann · 2018-11-04T09:42:36Z

rebased on https://github.com/IAMconsortium/pyam/releases/tag/v0.1.1

znicholls · 2018-11-04T10:30:59Z

@danielhuppmann I've got a bit confused here. Do you still need my help with the filter function? I'm guessing we need more tests first as they're currently all passing?

danielhuppmann · 2018-11-04T11:18:18Z

@znicholls, yes, I still need your help with a filter function for the time column. Thanks!

Re the tests, only a few tests currently check for multiple time formats (all those using test_df). So the implementation works insofar as the IamDataFrame now takes a year format (as int) or time (as datetime-castable), with some sanity checks that it can only be one of those. Basic functions like regions(), filter() and append() work (except no filtering for time).

Any other column (string that can't be cast to int) in the input data is now kept in data, with a list of these columns stored as df.extra_cols. These can also be used as args in filter() and can be added with timeseries(iamc_index=False).
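To illustrate the effect of keeping extra columns, here is a pandas-only sketch (not the pyam implementation) of filtering by an extra column and of pivoting to wide format with or without the extra columns in the index. The function to_timeseries() and the sample data are assumptions made for this example.

```python
import pandas as pd

IAMC_IDX = ["model", "scenario", "region", "variable", "unit"]


def to_timeseries(data, time_col="year", extra_cols=(), iamc_index=False):
    """Pivot long data to wide format (illustrative stand-in for timeseries()).

    With iamc_index=False the extra columns become part of the index.
    With iamc_index=True only the standard columns are used, and pandas
    raises a ValueError if rows are then no longer unique.
    """
    index = IAMC_IDX if iamc_index else IAMC_IDX + list(extra_cols)
    return data.set_index(index + [time_col])["value"].unstack(time_col)


data = pd.DataFrame({
    "model": ["m"] * 4,
    "scenario": ["s"] * 4,
    "region": ["World"] * 4,
    "variable": ["Primary Energy"] * 4,
    "unit": ["EJ/y"] * 4,
    "year": [2005, 2010, 2005, 2010],
    "subannual": ["winter day", "winter day", "summer day", "summer day"],
    "value": [1.0, 6.0, 2.0, 7.0],
})
extra_cols = ["subannual"]

# Filtering by an extra column is plain boolean indexing on the data.
winter = data[data["subannual"] == "winter day"]

# Wide format with the extra column as an additional index level.
wide = to_timeseries(data, extra_cols=extra_cols)
```

In this sketch, asking for the pure IAMC index on the unfiltered data fails because the two subannual slices collide, while it works on the filtered winter frame.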

The connections to the IIASA databases are refactored so that they work with the new implementation, including the tutorial.

@znicholls and @gidden, I suggest that you review and merge unless you see major problems. The existing use cases work as expected for the year format (ie, CMIP6 and IPCC SR15), but ensuring that all extra features like check_aggregate() and the plotting library work with both formats will be too much for one PR (and too much for me). So I'd keep the time format as "beta" feature and refactor functions as needed going forward.

danielhuppmann · 2018-11-29T13:13:33Z

@znicholls, I removed the test for extra columns in meta, because this feature is not yet supported and we need to discuss this a bit further...

znicholls · 2018-11-29T13:20:25Z

Yep i get that. But if we mark it as xfail, the testsuite will still pass and we have an easy reminder of what we need to think about. If we just remove it, finding it again is trickier. The difference is marginal so no big deal either way.
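For reference, marking a test as an expected failure could look like the sketch below. The test body and the helper meta_supports_extra_cols() are hypothetical placeholders, not the test that was removed.

```python
import pytest


def meta_supports_extra_cols():
    """Hypothetical stand-in for the behaviour that is not yet supported."""
    return False


@pytest.mark.xfail(reason="meta with extra columns is not supported yet", strict=False)
def test_meta_with_extra_cols():
    # Reported as "xfailed" while unsupported and "xpassed" once it works;
    # with strict=False neither outcome fails the test suite.
    assert meta_supports_extra_cols()
```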

danielhuppmann · 2018-11-29T14:42:10Z

@znicholls , the datetime tests seem to have problems with python 2, can you take a look?

danielhuppmann · 2018-11-29T14:45:51Z

@znicholls, re having a failing test for meta and extra columns

Yep i get that. But if we mark it as xfail, the testsuite will still pass and we have an easy reminder of what we need to think about.

I'd be ok to leave it in if I was convinced that this is the ideal implementation for the long term. But I'd rather have some more discussions on that first...

znicholls · 2018-11-29T23:23:57Z

I'd be ok to leave it in if I was convinced that this is the ideal implementation for the long term. But I'd rather have some more discussions on that first...

cool, will take a look at Python2 stuff now

danielhuppmann · 2018-12-03T11:04:51Z

@gidden and @znicholls, this should be good to go now...

Follow-up issues that we should discuss after this PR is merged:

how to treat meta by extra columns, requires further discussion
refactor the tests, split out into multiple test files? test_core.py is pretty long now, some tests are executed for multiple year/time formats, others are only executed on the year format
write a tutorial for the datetime feature, include a description in the docs

znicholls · 2018-12-03T18:45:00Z

Looks good to me.

Follow-up issues

Can we split these out into issues? You can assign me to the datetime feature tutorial.

danielhuppmann · 2018-12-03T19:13:44Z

@znicholls, yes, I meant that we would add these issues after the merge - just thought that it would be clearer what still needs to be done for the larger picture when I spell it out here. Thanks for volunteering!

znicholls · 2018-12-03T19:20:31Z

just thought that it would be clearer what still needs to be done for the larger picture when I spell it out here

aha very good, one step ahead of me!

* Add test of extra col init behaviour
* Add failing tests of time filtering
* Setup time filtering tests
* Pass test filter year
* Redo tests of time filtering and include super messy first steps towards implementation
* Fill out tests and reset core
* Finish implementation of time filtering, cleaning up needed
* Refactor core so apply filters can use self.time_col

danielhuppmann · 2018-12-19T22:34:28Z

another rebase after merge of #162 to try to appease stickler (which should have been appeased already)

danielhuppmann · 2018-12-20T02:36:35Z

closing in favour of #167

danielhuppmann requested review from gidden and znicholls November 2, 2018 08:54

znicholls mentioned this pull request Nov 2, 2018

Add multiple time axes to test_df fixture danielhuppmann/pyam#8

Closed

danielhuppmann force-pushed the time branch 6 times, most recently from 1878256 to e340236 Compare November 4, 2018 09:42

stickler-ci reviewed Nov 4, 2018

stickler-ci reviewed Nov 4, 2018

danielhuppmann changed the title ~~WIP - extend IamDataFrame to use extra data columns and sub-annual time~~ extend IamDataFrame to use extra data columns and sub-annual time Nov 8, 2018

gidden force-pushed the master branch from 742ae68 to 7f8ca6e Compare November 20, 2018 13:26

stickler-ci reviewed Nov 29, 2018

znicholls approved these changes Nov 29, 2018

danielhuppmann and others added 24 commits December 19, 2018 23:29

enable filtering by extra columns in data

672ddce

add check that no column conflicts exist between meta and data

eaff4cb

raise error when using append() with incompatible time formats

54d4817

docstring clean-up

eb20dc8

pep8

539fb59

test additional time formats

d863b97

fix rebase error

8affc17

add to release notes

3f171f9

clean up json object returned from IIASA db

5613f7b

add kwarg iamc_index to timeseries() for clean or full index

7adccf9

refactor from static to self._LONG_IDX

0211359

fix bug in setting extra_cols from long format

8d74bea

when retrieving data from iiasadb, check that versions are unique

a3d200e

appeasing stickler

590e364

appeasing stickler more

8105395

change default behaviour of timeseries() to include all extra cols

28c910f

add iamc_index as kwarg to to_csv() and to_excel()

4b78be9

remove test for meta with extra columns (behaviour not supported)

e602b60

appeasing stickler

a2917cf

fix bug in error message match for Python 2

2e2fe45

as suggested by @znicholls

update docstring for filter() and fix warning message formatting

30de073

try again to fix bug in error message match for Python 2

1facfb5

refactor _df to _data in __init__()

072d975

danielhuppmann force-pushed the time branch from 65d3619 to 072d975 Compare December 19, 2018 22:30

danielhuppmann mentioned this pull request Dec 19, 2018

extend IamDataFrame to use extra data columns and sub-annual time #167

Merged

5 tasks

danielhuppmann closed this Dec 20, 2018

danielhuppmann deleted the time branch December 20, 2018 11:01
