rewrite the tutorials #302

danielhuppmann · 2019-12-10T20:00:13Z

Please confirm that this PR has done the following:

Tests Added
Documentation Added
Description in RELEASE_NOTES.md Added

Adding to RELEASE_NOTES.md (remove section after adding to RELEASE_NOTES.md)

Please add a single line in the release notes similar to the following:

- (#XX)[http://link-to-pr.com] Added feature which does something

Description of PR

This PR reworks most of the tutorials, in particular the "first-steps" tutorial. It updates the tutorial data source to the IAMC 1.5°C data for consistency with the IIASA-db tutorial, and improves consistency of the formatting and notation.

closes #298
closes #290

coveralls · 2019-12-10T20:17:30Z

Coverage decreased (-0.4%) to 84.721% when pulling 84011be on danielhuppmann:tutorials/first_step_update into 9931f38 on IAMconsortium:master.

danielhuppmann · 2019-12-11T07:47:12Z

@znicholls, can you please take a look at the updated "checking_consistency" notebook?

I created a smaller, more specific example to better highlight what is being checked and why each validation fails. I also replaced the term "database" with "scenario ensemble" to describe the content of an IamDataFrame rather than where it is stored, and made some other edits, hoping to make it easier for new users...

danielhuppmann · 2019-12-11T07:50:06Z

@byersiiasa, can you take a look whether the updated "first-steps" notebook satisfies the issues #290 and #298? (if you have a minute)

jkikstra · 2019-12-11T11:01:39Z

Hi @danielhuppmann. Thanks for reworking these tutorials! I think you have done a great job of making this first steps tutorial accessible.

Some quite minor points:

[filtering]: 'The feature for filtering by model, scenario or region are implemented using exact string matching, where ...' should probably be changed to model, scenario, region, variable (and level), or year to be more complete and not confuse new users.
[filtering]: you could add a tiny example under the 'filtering by year' description, e.g., (df_selectedyears = df.filter(year=range(2010, 2051))). If you want, you can make it a box and then also use it to show a shorter timeseries next to the full display_df
[numpy]: you could provide a hyperlink to the numpy package when you mention it when you're dealing with set_meta_from_data()

In general, when using pyam in practice for processing purposes, there can be a lot of converting between different forms. It would be good to make it as clear as possible when one has a wide or long format, and when it is a pyam.IamDataFrame and when it is a pandas.DataFrame. A new user would probably assume that everything that goes in as a pyam.IamDataFrame also returns one. Thus, every time that this is not the case, it should be very clear (e.g. under 'displaying timeseries data' it is a bit too easy to miss right now). (this also holds for when a pandas.Series [e.g. overshoot] or even a multidimensional dataframe [e.g. df.meta] is returned, as this can cause quite some confusion because it is not always explicit enough)
In the same way, I think it is important to also emphasize what actions return an altered copy and what actions change the df itself.
Perhaps an overview table of some central functions with these two binary options (pyam/pandas, and copy/direct alteration) could be very useful for new users, and is more friendly than going into the API or testing it out.
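To make the wide/long distinction above concrete, here is a minimal pandas-only sketch. It does not call pyam; the column names and all values are assumptions chosen to resemble IAMC-style timeseries data, with one row per data point in long format and one column per year in wide format.

```python
import pandas as pd

# Index columns assumed for this illustration (IAMC-style layout).
IDX = ["model", "scenario", "region", "variable", "unit"]

# Long format: one row per (timeseries, year) data point. Values are made up.
long_df = pd.DataFrame(
    [
        ["model_a", "scen_a", "World", "Primary Energy", "EJ/yr", 2010, 1.0],
        ["model_a", "scen_a", "World", "Primary Energy", "EJ/yr", 2020, 2.0],
        ["model_a", "scen_a", "World", "Primary Energy|Coal", "EJ/yr", 2010, 0.5],
        ["model_a", "scen_a", "World", "Primary Energy|Coal", "EJ/yr", 2020, 0.8],
    ],
    columns=IDX + ["year", "value"],
)

# Wide format: one row per timeseries, one column per year.
wide_df = long_df.set_index(IDX + ["year"])["value"].unstack("year")

# Back to long format: the years become a single "year" column again.
roundtrip = wide_df.reset_index().melt(
    id_vars=IDX, var_name="year", value_name="value"
)

print(wide_df.shape)    # rows = timeseries, columns = years
print(roundtrip.shape)  # rows = data points
```

Both objects here are plain pandas.DataFrame instances, which is the kind of return type the comment asks the tutorial to flag explicitly.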

znicholls

lgtm, small suggestions:

'sum of wind and coal in that region' --> 'sum of wind and coal in region_a and region_b'
'is Primary Energy should be equal to Primary Energy|Coal plus Primary Energy|Coal' --> 'is Primary Energy equal to Primary Energy|Coal plus Primary Energy|Coal'
'Note that the detailed output (i.e., where the aggregation validation fails) is not shown in a notebook when calling the function within a loop.', maybe add a pointer to a resource which explains how notebooks work for those who are confused by this
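The kind of aggregation check discussed in these suggestions can be sketched in plain Python. This is only an illustration of the idea, not pyam's implementation; the function name `check_aggregate_sketch`, the variable names, and the values are hypothetical, and the components are assumed to be coal and wind.

```python
import math

# Hypothetical values (EJ/yr) for one model, scenario, region and year.
values = {
    "Primary Energy": 10.0,
    "Primary Energy|Coal": 6.0,
    "Primary Energy|Wind": 3.0,
}


def check_aggregate_sketch(values, variable, rel_tol=1e-9):
    """Compare `variable` with the sum of its direct sub-categories.

    Returns a tuple (is_consistent, total, component_sum).
    """
    prefix = variable + "|"
    # Keep only direct children, i.e. names with no further "|" after the prefix.
    components = [
        value
        for name, value in values.items()
        if name.startswith(prefix) and "|" not in name[len(prefix):]
    ]
    component_sum = sum(components)
    total = values[variable]
    return math.isclose(total, component_sum, rel_tol=rel_tol), total, component_sum


ok, total, component_sum = check_aggregate_sketch(values, "Primary Energy")
print(ok, total, component_sum)  # False 10.0 9.0
```

Printing the result explicitly, as in the last line, also sidesteps the notebook behaviour mentioned above, where only the value of the last expression in a cell is displayed automatically.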

The one other thing you might want to add is an explainer of the handling of Bunkers. It's super confusing to everyone and having that in there might make it easier for people. Alternately it could be a separate tutorial just for 'advanced users' who have to deal with this stuff.

francescolovat · 2019-12-11T15:21:03Z

Hi @danielhuppmann ,

Great tutorial notebook, very illustrative and clear, even for not-very-experienced readers like myself.

Minor comments:

In exclude_on_fail section, the last sentence of this paragraph is not very clear:

Any scenario (by a particular model) failing the validation criteria is then marked as exclude=True. This "exclusion flag" is implemented in the meta table of the IamDataFrame, which can be used categorization and quantitative indicators (more below).

I think a preposition is missing before categorization. Moreover, I think a clear reference to the meta table section should be included instead of the parenthesis, stating something like: "the second latest cell in XX shows the meta table" (I found it confusing to talk about a meta table whose format was not clarified until the command df.meta.head() at the end of the notebook).

In the Categorization assignment markdown cell I'd list some of the other possible arguments to be used by the plotting library. So the users could experiment with them (I'd have liked to do that).
I'd suggest swapping the order of cells [14] and [15] (and their associated markdowns), which contain the commands df.filter(level=1).variables() and df.filter(variable='Primary Energy*', level='1-').variables(), respectively. This would follow the logic of the Filtering by variables and levels markdown cell, which explains the difference between filtering by both variable and level and filtering by level alone.
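The difference between the two filter calls can be sketched in plain Python. This is an illustration under stated assumptions, not pyam's code: the level is taken to be the number of "|" separators in a variable name, counted relative to the given variable prefix when one is supplied, and '1-' is read as "level 1 or lower". The helper `depth` and the variable names are hypothetical.

```python
def depth(variable, prefix=""):
    """Count "|" separators in `variable` after removing `prefix`.

    Returns None if `variable` does not start with `prefix`.
    """
    if not variable.startswith(prefix):
        return None
    return variable[len(prefix):].count("|")


# Hypothetical variable names for illustration.
variables = [
    "Emissions|CO2",
    "Primary Energy",
    "Primary Energy|Coal",
    "Primary Energy|Coal|Lignite",
]

# Level only: the depth is counted over the full variable name.
level_1 = [v for v in variables if depth(v) == 1]

# Variable and level: the depth is counted relative to "Primary Energy",
# and "1-" is interpreted as "at most 1".
up_to_1 = []
for v in variables:
    d = depth(v, "Primary Energy")
    if d is not None and d <= 1:
        up_to_1.append(v)

print(level_1)  # ['Emissions|CO2', 'Primary Energy|Coal']
print(up_to_1)  # ['Primary Energy', 'Primary Energy|Coal']
```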

danielhuppmann · 2019-12-12T09:28:13Z

response to @znicholls:

'Note that the detailed output (i.e., where the aggregation validation fails) is not shown in a notebook when calling the function within a loop.', maybe add a pointer to a resource which explains how notebooks work for those who are confused by this

Any good reference for this?

Re "handling of bunkers" - this wasn't included in the tutorial before. I'll see if I have time to extend the tutorial when I tackle #299.

znicholls · 2019-12-12T09:53:03Z

Any good reference for this?

Section 2, "Pretty Display of Variables", of https://www.dataquest.io/blog/jupyter-notebook-tips-tricks-shortcuts/ ?

Re "handling of bunkers" - this wasn't included in the tutorial before. I'll see if I have time to extend the tutorial when I tackle #299.

Ah ok cool makes sense.

danielhuppmann · 2019-12-12T17:40:41Z

Thank you @jkikstra and @francescolovat for these great suggestions!

Highlights of my changes:

added a blue-highlighted cell about the expected return type of timeseries(), and some more of these cells throughout the notebook
moved the "filter-by-year" section, add an example
introduce the meta table earlier, so that it can be referenced in the validation and indicators section
add a second plot with the kwarg color='scenario' in the plotting example

About the suggestion by @jkikstra to create an overview table of functions and their return types - I understand that this is confusing for new users, but I'm worried about adding new parts to the documentation, because this will be difficult to maintain and keep up to date...

byersiiasa · 2019-12-17T11:32:11Z

@byersiiasa, can you take a look whether the updated "first-steps" notebook satisfies the issues #290 and #298? (if you have a minute)

thanks - yes, nicely implemented!

gidden · 2019-12-22T18:20:10Z

lgtm @danielhuppmann ! I tried restarting the mac CI that failed, not sure why it is failing or if this is a known issue.

danielhuppmann · 2019-12-23T02:04:14Z

thanks @gidden!

this seems to be related to #281 - building the docs on travis installs kealib, which also installs matplotlib-base-2.2.4 on linux but matplotlib-base-3.1.2 on mac. no idea why this isn't a problem in other PRs

danielhuppmann · 2019-12-23T07:21:04Z

no idea why Appveyor is now failing with ever more confusing error messages - maybe it's because now running on gidden:master...? it passed on "my" instance a few commits ago

merging to see if this resolves the issue

danielhuppmann added 24 commits December 2, 2019 12:43

bump copyright year in README

2b03d8d

harmonize README with the docs

40f45f4

add own (short) section on data model, cross-reference to the docs page

fa873fe

minor docs edits

399a928

flip logger message when detecting a notebook

031cadb

rewrite header of first-steps tutorial for consistency

992ce79

use actual Primary-Energy-values from MESSAGE-CD-LINKS scenarios

add subsection with reference to the docs

d7d62d7

use tutorial data snapshot from IAMC 1.5°C scenario ensemble

724a028

add GENESYS-MOD data to tutorial snapshot

19bac08

minor edits of first section

4619651

rewrite filtering and plotting-gallery sections

4be42c7

rewrite validation section

4bec414

rewrite categorization section

e8103a1

rewrite section on quantitative indicators, one more round of clean-ups

2411c7a

update the legends tutorial

ae369d0

update the ipcc-colors tutorial

37b9c6c

remove output, harmonize formatting in "build the logo" tutorial

4e25b4c

remove output, harmonize formatting in "aggregating & plotting" tutorial

82259af

remove output of "checking consistency" tutorial

c879cf8

rename the "checking consistency" tutorial

6944e08

rewrite "checking consistency" tutorial, remove deprecated data file

d916b12

remove output of "iiasa db" tutorial

e0fa744

extend the readme

dbc4a19

rename consistency-tutorial in the test

0142624

danielhuppmann requested a review from znicholls December 11, 2019 07:44

danielhuppmann added the tutorial label Dec 11, 2019

danielhuppmann requested a review from byersiiasa December 11, 2019 07:48

znicholls reviewed Dec 11, 2019

danielhuppmann added 2 commits December 12, 2019 08:38

implement review comments by @znicholls

66f770c

add to release notes

dbb9134

danielhuppmann added 4 commits December 12, 2019 11:01

implement review comments by @jkikstra

0cc109e

add link to "tips & tricks" when working with notebooks (per @znicholls)

3a7c58e

implement review comments by @francescolovat

2a3e56b

Merge branch 'master' into tutorials/first_step_update

84011be

byersiiasa mentioned this pull request Dec 17, 2019

Add slightly more explicit example of df.filter(exclude=False) in tutorials #298

Closed

danielhuppmann added 5 commits December 23, 2019 03:05

try downgrading matplotlib-base for building the docs on travis

9d56090

try removing kealib for building docs on travis

24e786c

try remove outdated region-plotting dependencies from appveyor tests

889e556

try re-inserting the building docs on travis

11b2078

remove unused imports from plotting

408029f

danielhuppmann merged commit e84eeb4 into IAMconsortium:master Dec 23, 2019
