rewrite the tutorials #302

Merged
35 commits
2b03d8d
bump copyright year in README
danielhuppmann Nov 19, 2019
40f45f4
harmonize README with the docs
danielhuppmann Nov 19, 2019
fa873fe
add own (short) section on data model, cross-reference to the docs page
danielhuppmann Nov 19, 2019
399a928
minor docs edits
danielhuppmann Dec 2, 2019
031cadb
flip logger message when detecting a notebook
danielhuppmann Dec 2, 2019
992ce79
rewrite header of first-steps tutorial for consistency
danielhuppmann Dec 3, 2019
d7d62d7
add subsection with reference to the docs
danielhuppmann Dec 3, 2019
724a028
use tutorial data snapshot from IAMC 1.5°C scemario ensemble
danielhuppmann Dec 3, 2019
19bac08
add GENESYS-MOD data to tutorial snapshot
danielhuppmann Dec 3, 2019
4619651
minor edits of first section
danielhuppmann Dec 4, 2019
4be42c7
rewrite filtering and plotting-gallery sections
danielhuppmann Dec 4, 2019
4bec414
rewrite validation section
danielhuppmann Dec 4, 2019
e8103a1
rewrite categorization section
danielhuppmann Dec 4, 2019
2411c7a
rewrite section on quantitative indicators, one more round of clean-uos
danielhuppmann Dec 4, 2019
ae369d0
update the legends tutorial
danielhuppmann Dec 5, 2019
37b9c6c
update the ipcc-colors tutorial
danielhuppmann Dec 6, 2019
4e25b4c
remove output, harmonize formatting in "build the logo" tutorial
danielhuppmann Dec 10, 2019
82259af
remove output, harmonize formatting in "aggregating & plotting" tutorial
danielhuppmann Dec 10, 2019
c879cf8
remove output of "checking consistency" tutorial
danielhuppmann Dec 10, 2019
6944e08
rename the "checking consistency" tutorial
danielhuppmann Dec 10, 2019
d916b12
rewrite "checking consistency" tutorial, remove deprecated data file
danielhuppmann Dec 10, 2019
e0fa744
remove output of "iiasa db" tutorial
danielhuppmann Dec 10, 2019
dbc4a19
extend the readme
danielhuppmann Dec 10, 2019
0142624
rename consistency-tutorial in the test
danielhuppmann Dec 10, 2019
66f770c
implement review comments by @znicholls
danielhuppmann Dec 12, 2019
dbb9134
add to release notes
danielhuppmann Dec 12, 2019
0cc109e
implement review comments by @jkikstra
danielhuppmann Dec 12, 2019
3a7c58e
add link to "tips & tricks" when working with notebooks (per @znicholls)
danielhuppmann Dec 12, 2019
2a3e56b
implement review comments by @francescolovat
danielhuppmann Dec 12, 2019
84011be
Merge branch 'master' into tutorials/first_step_update
danielhuppmann Dec 12, 2019
9d56090
try downgrading `matplotlib-base` for building the docs on travis
danielhuppmann Dec 23, 2019
24e786c
try removing `kealib` for building docs on travis
danielhuppmann Dec 23, 2019
889e556
try remove outdated region-plotting dependencies from appveyor tests
danielhuppmann Dec 23, 2019
11b2078
try re-inserting the building docs on travis
danielhuppmann Dec 23, 2019
408029f
remove unused imports from `plotting`
danielhuppmann Dec 23, 2019
2 changes: 1 addition & 1 deletion .travis.yml
@@ -37,7 +37,7 @@ script:
- make test
# only test docs once to make sure everything works on most recent python
- cd doc
- if [[ "${PYENV}" == "py37" && "${TRAVIS_OS_NAME}" != 'windows' ]]; then conda install --yes kealib==1.4.7; make html; fi
- if [[ "${PYENV}" == "py37" && "${TRAVIS_OS_NAME}" != 'windows' ]]; then make html; fi
- cd ..

after_success:
74 changes: 43 additions & 31 deletions README.md
@@ -1,5 +1,5 @@
pyam: a Python toolkit for Integrated Assessment Modeling
=========================================================
pyam: analysis and visualization of integrated-assessment scenarios
===================================================================

**Documentation on [Read the Docs](https://pyam-iamc.readthedocs.io)**

@@ -8,43 +8,55 @@ pyam: a Python toolkit for Integrated Assessment Modeling
Overview and scope
------------------

The ``pyam`` package provides a range of diagnostic tools and functions
for analyzing and working with IAMC-format timeseries data.
The open-source Python package ``pyam`` provides a suite of tools and functions
for analyzing and visualizing input data (i.e., assumptions/parametrization)
and results (model output) of integrated-assessment scenarios.

Features:
- Summary of models, scenarios, variables, and regions included in a snapshot.
- Display of timeseries data as pandas.DataFrame with IAMC-specific filtering
options.
- Simple visualization and plotting functions.
- Diagnostic checks for non-reported variables or timeseries data to identify
outliers and potential reporting issues.
- Categorization of scenarios according to timeseries data or meta-identifiers
for further analysis.
Key features:

The package can be used with timeseries data that follows the data template
convention of the [Integrated Assessment Modeling Consortium](http://www.globalchange.umd.edu/iamc/) (IAMC).
An illustrative example is shown below;
see [data.ene.iiasa.ac.at/database](http://data.ene.iiasa.ac.at/database/)
for more information.
- Simple analysis of timeseries data in the IAMC format
(more about it [here](https://pyam-iamc.readthedocs.io/en/stable/data.html))
with an interface similar in feel and style to the widely
used [pandas.DataFrame](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.html)
- Advanced visualization and plotting functions
(see the [gallery](https://pyam-iamc.readthedocs.io/en/stable/examples/index.html))
- Diagnostic checks for scripted validation of scenario data and results

| **model** | **scenario** | **region** | **variable** | **unit** | **2005** | **2010** | **2015** |
|--------------|--------------|------------|----------------|----------|----------|----------|----------|
| MESSAGE V.4 | AMPERE3-Base | World | Primary Energy | EJ/y | 454.5 | 479.6 | ... |
| ... | ... | ... | ... | ... | ... | ... | ... |
Data model
----------

An illustrative example of the timeseries format developed by the
[Integrated Assessment Modeling Consortium](http://www.globalchange.umd.edu/iamc/) (IAMC)
is shown below.
The row is taken from the [IAMC 1.5°C scenario explorer](https://data.ene.iiasa.ac.at/iamc-1.5c-explorer),
showing a scenario from the [CD-LINKS](https://www.cd-links.org) project.
[Read the docs](https://pyam-iamc.readthedocs.io/en/stable/data.html)
for more information on the IAMC format and the ``pyam`` data model.

Tutorial
--------
| **model** | **scenario** | **region** | **variable** | **unit** | **2005** | **2010** | **2015** |
|-----------|--------------|------------|----------------|----------|----------|----------|----------|
| MESSAGE | CD-LINKS 400 | World | Primary Energy | EJ/y | 462.5 | 500.7 | ... |
| ... | ... | ... | ... | ... | ... | ... | ... |

A comprehensive tutorial for the basic functions is included
in [the first tutorial](doc/source/tutorials/pyam_first_steps.ipynb)
using a partial snapshot of the IPCC AR5 scenario database.

Tutorials
---------

An introduction to the basic functions is shown
in [the "first-steps" notebook](doc/source/tutorials/pyam_first_steps.ipynb).

All tutorials are available in rendered format (i.e., with output) as part of
the [online documentation](https://pyam-iamc.readthedocs.io/en/stable/tutorials.html).
The source code of the tutorials notebooks is available
in the folder [doc/source/tutorials](doc/source/tutorials) of this repository.

Documentation
-------------

The documentation pages can be built locally.
See the instruction in [doc/README](doc/README.md).
The complete documentation is hosted on [Read the Docs](https://pyam-iamc.readthedocs.io).

The documentation pages can be built locally,
refer to the instruction in [doc/README](doc/README.md).

Authors
-------
@@ -56,7 +68,7 @@ and Daniel Huppmann ([@danielhuppmann](https://github.com/danielhuppmann/)).
License
-------

Copyright 2017-2018 IIASA Energy Program
Copyright 2017-2019 IIASA Energy Program

The ``pyam`` package is licensed
under the Apache License, Version 2.0 (the "License");
@@ -91,7 +103,7 @@ conda activate pyam # may be simply `source activate pyam` or just `activate p
make -B virtual-environment
```

To check everything has installed correctly,
To check everything has installed correctly, run

```
pytest tests
1 change: 1 addition & 0 deletions RELEASE_NOTES.md
@@ -1,6 +1,7 @@

# Next Release

- [#302](https://github.com/IAMconsortium/pyam/pull/302) Rework the tutorials
- [#301](https://github.com/IAMconsortium/pyam/pull/301) Bugfix when using `to_excel()` with a `pd.ExcelWriter`
- [#297](https://github.com/IAMconsortium/pyam/pull/297) Add `empty` attribute, better error for `timeseries()` on empty dataframe
- [#295](https://github.com/IAMconsortium/pyam/pull/295) Include `meta` table when writing to or reading from `xlsx` files
2 changes: 1 addition & 1 deletion appveyor.yml
@@ -26,7 +26,7 @@ install:
- conda --version
- activate testing
- conda install -y numpy pandas pyyaml xlrd xlsxwriter seaborn==0.9.0 six requests jupyter nbconvert proj4==5.2.0 pywin32
- conda install -y -c conda-forge matplotlib==3.0.3 libiconv gdal fiona "geopandas<0.5.0" cartopy cython pyproj==1.9.6
- conda install -y -c conda-forge matplotlib==3.0.3 pyproj==1.9.6

build: false

Binary file modified doc/source/_static/iamc_template.png
2 changes: 1 addition & 1 deletion doc/source/index.rst
@@ -54,7 +54,7 @@ Key features:
- Simple analysis of timeseries data in the IAMC format (more about it `here`_)
with an interface similar in feel and style to the widely
used `pandas.DataFrame`_
- Advanced visualization and plotting function (see the `gallery`_)
- Advanced visualization and plotting functions (see the `gallery`_)
- Diagnostic checks for scripted validation of scenario data and results

The source code for |pyam| is available on `Github`_.
Binary file removed doc/source/tutorials/_static/AMPERE-Logo.png
Binary file not shown.
Binary file removed doc/source/tutorials/_static/EMF-Logo_v2.1.png
Binary file not shown.
Binary file modified doc/source/tutorials/_static/IAMC_logo.jpg
Binary file removed doc/source/tutorials/_static/IIASA_logo.png
Binary file not shown.
Binary file added doc/source/tutorials/_static/cdlinks_logo.png

Large diffs are not rendered by default.

236 changes: 236 additions & 0 deletions doc/source/tutorials/checking_consistency.ipynb
@@ -0,0 +1,236 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Checking consistency of a scenario ensemble\n",
"\n",
"In previous model comparison exercises, the reported data was sometimes not internally consistent. This can be due to incomplete variable hierarchies, reporting templates incompatible with model specifications, or user error.\n",
"\n",
"In this tutorial, we show how to make the most of **pyam** to check that a scenario ensemble (or just a single scenario) is complete and that timeseries data \"add up\" across regions and along the variable tree (i.e., that the sum of values of the subcategories such as `Primary Energy|*` is identical to the value of the category `Primary Energy`).\n",
"\n",
"<div class=\"alert alert-block alert-warning\">\n",
" This feature of the <b>pyam</b> package currently only supports \"consistency\"\n",
" in the sense of a strictly hierarchical variable tree\n",
" (with subcategories summing up to the category value)\n",
" and subregions of depth 1 adding up to the \"World\" region.\n",
"</div>"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"import pyam"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We start with a hypothetical tutorial data set, which is constructed to highlight the individual validation features below.\n",
"\n",
"The scenario below has two inconsistencies:\n",
"\n",
"1. In year `2010` and regions `region_b` & `World`, the values of coal and wind do not add up to the total `Primary Energy` value\n",
"2. In year `2020` in the `World` region, the values of `Primary Energy` and `Primary Energy|Coal` are not the sums of the respective values in `region_a` and `region_b` <br />\n",
" (but in each sub-region, wind and coal correctly sum to `Primary Energy`)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"tutorial_df = pd.DataFrame([\n",
" ['World', 'Primary Energy', 'EJ/y', 7, 15],\n",
" ['World', 'Primary Energy|Coal', 'EJ/y', 4, 11],\n",
" ['World', 'Primary Energy|Wind', 'EJ/y', 2, 4],\n",
" ['region_a', 'Primary Energy', 'EJ/y', 4, 8],\n",
" ['region_a', 'Primary Energy|Coal', 'EJ/y', 2, 6],\n",
" ['region_a', 'Primary Energy|Wind', 'EJ/y', 2, 2],\n",
" ['region_b', 'Primary Energy', 'EJ/y', 3, 6],\n",
" ['region_b', 'Primary Energy|Coal', 'EJ/y', 2, 4],\n",
" ['region_b', 'Primary Energy|Wind', 'EJ/y', 0, 2],\n",
"],\n",
" columns=['region', 'variable', 'unit', 2010, 2020]\n",
")\n",
"\n",
"df = pyam.IamDataFrame(data=tutorial_df, model='model_a', scenario='scen_a')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Summary\n",
"\n",
"With the [check_internal_consistency()](https://pyam-iamc.readthedocs.io/en/stable/api.html#pyam.IamDataFrame.check_internal_consistency) feature, we can check the internal consistency of a scenario ensemble (i.e., an `IamDataFrame` instance).\n",
"If this method returns `None`, the data is internally consistent (i.e., each total equals both the sum of its sectoral components and the sum of its regional components).\n",
"\n",
"In the rest of this tutorial, we walk through what this method actually does and show the kind of output you can expect."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Checking that variables are the sum of their components\n",
"\n",
"We are going to use the [check_aggregate()](https://pyam-iamc.readthedocs.io/en/stable/api.html#pyam.IamDataFrame.check_aggregate) method of the `IamDataFrame`\n",
"to check that the components of a variable add up to its total.\n",
"This method takes [np.isclose()](https://docs.scipy.org/doc/numpy/reference/generated/numpy.isclose.html) arguments as keyword arguments. We show our recommended settings here."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"np_isclose_args = {\n",
" 'equal_nan': True,\n",
" 'rtol': 1e-03,\n",
" 'atol': 1e-05,\n",
"}"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The [check_aggregate()](https://pyam-iamc.readthedocs.io/en/stable/api.html#pyam.IamDataFrame.check_aggregate) function allows us to quickly verify whether a given variable is the sum of its sectoral components (e.g. `Primary Energy` should be equal to `Primary Energy|Coal` plus `Primary Energy|Wind`). The validation is performed separately for each region.\n",
"\n",
"This section illustrates the first constructed inconsistency in this scenario. The returned [pandas.DataFrame](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.html) indicates where the aggregate is not equal to the sum of components."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"df.check_aggregate('Primary Energy', **np_isclose_args)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In practice, it would now be up to the user to determine the cause of the inconsistency (or confirm that this is expected for some reason)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Checking multiple variables\n",
"\n",
"We can now construct a loop over all variables in this `IamDataFrame`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"for variable in df.variables():\n",
" df.check_aggregate(variable, **np_isclose_args)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The log shows the same message as in the previous example, and it indicates that the other two variables (coal and wind) cannot be assessed because they have no subcategories.\n",
"\n",
"<div class=\"alert alert-block alert-info\">\n",
"Note that the detailed output (i.e., where the aggregation validation fails) is not shown in a notebook when calling the function within a loop.<br />\n",
" Read <a href=\"https://www.dataquest.io/blog/jupyter-notebook-tips-tricks-shortcuts/\">this page</a> for helpful tips and tricks when working with Jupyter notebooks.\n",
"</div>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Checking that timeseries subregions sum to aggregate regions\n",
"\n",
"Just as we checked above that a variable's components sum to its declared total, we can check that summing over subregions returns the value of a region.\n",
"\n",
"To do this, we use the [check_aggregate_region()](https://pyam-iamc.readthedocs.io/en/stable/api.html#pyam.IamDataFrame.check_aggregate_region) function. By default, this method checks that all the regions in the dataframe sum to `World`.\n",
"\n",
"Using this function allows us to quickly check if a regional total for a single variable is equal to the sum of its regional values.\n",
"This section illustrates the second constructed inconsistency in this scenario.\n",
"The returned [pandas.DataFrame](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.html) indicates where the timeseries at the `region='World'` level is not equal to the sum of regional components."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"df.check_aggregate_region('Primary Energy', **np_isclose_args)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Checking complete internal consistency of a scenario (ensemble)\n",
"\n",
"The previous sections illustrated two functions to validate specific variables across their subcategories or regional breakdown. These two functions are combined in the [check_internal_consistency()](https://pyam-iamc.readthedocs.io/en/stable/api.html#pyam.IamDataFrame.check_internal_consistency) feature.\n",
"\n",
"If we have an internally consistent scenario ensemble (or single scenario), the function will return `None`; otherwise, it will return a [pandas.DataFrame](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.html) indicating all detected inconsistencies.\n",
"\n",
"<div class=\"alert alert-block alert-warning\">\n",
" Note that at the moment, this method assumes that all the regions sum to the <b>World</b> region. See <a href=\"https://github.com/IAMconsortium/pyam/issues/106\">this issue</a> for more information.\n",
"</div>"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"df.check_internal_consistency()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The output of this function reports both types of illustrative inconsistencies in the scenario constructed for this tutorial."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.4"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
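
The two checks described in the notebook above can be illustrated without pyam. The following is a minimal pure-Python sketch, not pyam's implementation: the helper names `is_close`, `failed_variable_sums` and `failed_region_sums` are hypothetical, and the tolerance rule is assumed to follow the formula documented for `numpy.isclose`, `|a - b| <= atol + rtol * |b|`, with the `rtol` and `atol` values recommended in the tutorial. The data are the tutorial's constructed scenario.

```python
# Pure-Python sketch of the two aggregation checks (NOT pyam's implementation).
# All helper names below are hypothetical and exist only for this illustration.

ROWS = [
    # (region, variable, value in 2010, value in 2020), copied from the tutorial
    ("World", "Primary Energy", 7, 15),
    ("World", "Primary Energy|Coal", 4, 11),
    ("World", "Primary Energy|Wind", 2, 4),
    ("region_a", "Primary Energy", 4, 8),
    ("region_a", "Primary Energy|Coal", 2, 6),
    ("region_a", "Primary Energy|Wind", 2, 2),
    ("region_b", "Primary Energy", 3, 6),
    ("region_b", "Primary Energy|Coal", 2, 4),
    ("region_b", "Primary Energy|Wind", 0, 2),
]
YEARS = (2010, 2020)

# Index the values as DATA[(region, variable, year)]
DATA = {
    (region, variable, year): value
    for region, variable, *values in ROWS
    for year, value in zip(YEARS, values)
}


def is_close(a, b, rtol=1e-3, atol=1e-5):
    """Tolerance rule documented for numpy.isclose, with b as the reference."""
    return abs(a - b) <= atol + rtol * abs(b)


def failed_variable_sums(data, variable):
    """Return (region, year) pairs where the direct subcategories
    of `variable` do not sum to the value of `variable`."""
    prefix = variable + "|"
    failures = []
    for (region, name, year), total in sorted(data.items()):
        if name != variable:
            continue
        # Direct children only: one level below `variable` in the tree
        parts = [
            value
            for (r, n, y), value in data.items()
            if r == region and y == year
            and n.startswith(prefix) and "|" not in n[len(prefix):]
        ]
        # A variable without subcategories cannot be assessed
        if parts and not is_close(sum(parts), total):
            failures.append((region, year))
    return failures


def failed_region_sums(data, variable, region="World"):
    """Return the years in which all other regions do not sum
    to the value of `region` for `variable`."""
    failures = []
    for year in sorted({y for (_, _, y) in data}):
        parts = [
            value
            for (r, n, y), value in data.items()
            if n == variable and y == year and r != region
        ]
        if parts and not is_close(sum(parts), data[(region, variable, year)]):
            failures.append(year)
    return failures
```

Calling `failed_variable_sums(DATA, "Primary Energy")` flags `World` and `region_b` in 2010 (the first constructed inconsistency), while `failed_region_sums` flags 2020 for `Primary Energy` and `Primary Energy|Coal` but not for `Primary Energy|Wind` (the second). Like the regional check described in the tutorial, the sketch treats every region other than `World` as a depth-1 subregion.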