R style columns #212

Merged: 7 commits, Apr 5, 2019
1 change: 1 addition & 0 deletions RELEASE_NOTES.md
@@ -1,6 +1,7 @@

# Next Release

- [#212](https://github.com/IAMconsortium/pyam/pull/212) Now natively support reading R-style data frames with year columns like "X2015"
- [#202](https://github.com/IAMconsortium/pyam/pull/202) Extend the `df.rename()` function with a `check_duplicates (default True)` validation option
- [#201](https://github.com/IAMconsortium/pyam/pull/201) Added native support for legends outside of plots with `pyam.plotting.OUTSIDE_LEGEND` with a tutorial
- [#199](https://github.com/IAMconsortium/pyam/pull/199) Initializing an `IamDataFrame` accepts kwargs to fill or create from the data any missing required columns
3 changes: 3 additions & 0 deletions pyam/core.py
@@ -57,13 +57,16 @@ class IamDataFrame(object):
        an instance of a TimeSeries or Scenario (requires `ixmp`),
        or pd.DataFrame or data file with IAMC-format data columns.
        A pd.DataFrame can have the required data as columns or index.
        Support is provided additionally for R-style data columns for years,
        like "X2015", etc.
    kwargs:
        if `value=col`, melt `col` to `value` and use `col` name as `variable`;
        else, mapping of columns required for an `IamDataFrame` to:
        - one column in `df`
        - multiple columns, which will be concatenated by pipe
        - a string to be used as value for this column
    """

    def __init__(self, data, **kwargs):
        """Initialize an instance of an IamDataFrame"""
        # import data from pd.DataFrame or read from source
18 changes: 18 additions & 0 deletions pyam/utils.py
@@ -132,6 +132,24 @@ def format_data(df, **kwargs):
    if isinstance(df, pd.Series):
        df = df.to_frame()

    # Check for R-style year columns, converting where necessary
    def convert_r_columns(c):
        try:
            first = c[0]
            second = c[1:]
            if first == 'X':
                try:
                    # bingo! was X2015 R-style, return the integer
                    return int(second)
                except (TypeError, ValueError):
                    # nope, not an int, fall down to final return statement
                    pass
        except (TypeError, IndexError):
            # not a string/iterable/etc, fall down to final return statement
            pass
        return c
    df.columns = df.columns.map(convert_r_columns)

    # if `value` is given but not `variable`,
    # melt value columns and use column name as `variable`
    if 'value' in kwargs and 'variable' not in kwargs:
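
The conversion rule in this hunk can be tried in isolation. The following is a minimal sketch that uses a plain Python list rather than a `pd.DataFrame`, and an `isinstance` check rather than the nested `try` blocks; the name `to_year_label` and the sample labels are assumptions for illustration and are not part of pyam.

```python
def to_year_label(c):
    """Return an int for R-style labels like 'X2015', else return c unchanged."""
    # only strings starting with 'X' are candidates for conversion
    if isinstance(c, str) and c.startswith('X'):
        try:
            return int(c[1:])
        except ValueError:
            # e.g. 'Xfoo': the remainder is not an integer, keep the label
            pass
    return c


# hypothetical column labels: two R-style years, one non-year 'X' label,
# and one label that is already an integer
columns = ['model', 'scenario', 'X2005', 'X2010', 'Xfoo', 2015]
converted = [to_year_label(c) for c in columns]
print(converted)
```

Labels that are not strings, or that do not parse as an integer after the leading `X`, pass through unchanged, which matches the "fall down to final return statement" comments in the hunk.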
18 changes: 18 additions & 0 deletions tests/test_cast_to_iamc.py
@@ -79,3 +79,21 @@ def test_cast_with_variable_and_value(meta_df):

    assert compare(pe_df, df).empty
    pd.testing.assert_frame_equal(df.data, pe_df.data.reset_index(drop=True))


def test_cast_from_r_df(test_pd_df):
    df = test_pd_df.copy()
    # last two columns are years
    df.columns = list(df.columns[:-2]) + ['X{}'.format(c)
                                          for c in df.columns[-2:]]
    obs = IamDataFrame(df)
    exp = IamDataFrame(test_pd_df)
    assert compare(obs, exp).empty
    pd.testing.assert_frame_equal(obs.data, exp.data)


def test_cast_from_r_df_err(test_pd_df):
    df = test_pd_df.copy()
    # last two columns are years
    df.columns = list(df.columns[:-2]) + ['Xfoo', 'Xbar']
    pytest.raises(ValueError, IamDataFrame, df)
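
The renaming used in `test_cast_from_r_df` can be checked without pyam. This sketch assumes hypothetical column labels (the actual columns of `test_pd_df` are not shown in this diff) and swaps in a regular-expression match for the PR's `int()` attempt, so it is stricter than the merged code: only `X` followed by digits is converted. The names `R_YEAR` and `strip_r_prefix` are made up for this example.

```python
import re

# hypothetical pattern: 'X' followed by digits only
R_YEAR = re.compile(r'X(\d+)')


def strip_r_prefix(c):
    """Convert 'X2015'-style labels to int; leave everything else unchanged."""
    if isinstance(c, str):
        match = R_YEAR.fullmatch(c)
        if match:
            return int(match.group(1))
    return c


# assumed labels for illustration; the last two are years, as in the test
original = ['model', 'scenario', 'region', 'variable', 'unit', 2005, 2010]

# same construction as in the test: prefix the last two labels with 'X'
r_style = original[:-2] + ['X{}'.format(c) for c in original[-2:]]

# stripping the prefix should recover the original labels
restored = [strip_r_prefix(c) for c in r_style]
print(r_style[-2:], restored == original)
```

Labels such as `Xfoo` pass through this helper unchanged. In `test_cast_from_r_df_err` it is `IamDataFrame` itself that is expected to raise `ValueError` for such input; this sketch does not reproduce that validation.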