pd.Series interpolate with method='time' returns inconsistent results for first or last NaN #15356

bertrandhaut · 2017-02-09T09:40:04Z

Calling pd.Series.interpolate with method='time' returns inconsistent results when the first or the last value is NaN.

When the leading values are NaN, they are left as NaN. This is the behaviour I expect, since there is no earlier valid value to interpolate from.

pd.Series(index=[datetime(2017,1,1), datetime(2017,1,2), datetime(2017,1,7)], data=[float('nan'), float('nan'), 3]).interpolate(method='time')

2017-01-01 NaN
2017-01-02 NaN
2017-01-07 3.0
dtype: float64

When the trailing values are NaN, they are filled with the last valid value (effectively a forward-fill rather than an interpolation). I expected the NaN values to be kept.

pd.Series(index=[datetime(2017,1,1), datetime(2017,1,2), datetime(2017,1,7)], data=[1, float('nan'), float('nan')]).interpolate(method='time')

2017-01-01 1.0
2017-01-02 1.0
2017-01-07 1.0
dtype: float64
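One way to get the symmetric behaviour described above, regardless of pandas version, is to interpolate and then mask the trailing positions again. The following is a minimal sketch under that assumption; interpolate_keep_edges is a hypothetical helper name, not part of pandas.

```python
from datetime import datetime

import numpy as np
import pandas as pd


def interpolate_keep_edges(s, method="time"):
    """Interpolate, then restore NaN wherever no valid value follows.

    Hypothetical helper for illustration; not a pandas API.
    """
    filled = s.interpolate(method=method)
    # bfill() leaves NaN only at positions with no later valid value,
    # which are exactly the trailing NaNs of the original series.
    has_later_value = s.bfill().notnull()
    return filled.where(has_later_value)


index = pd.DatetimeIndex(
    [datetime(2017, 1, 1), datetime(2017, 1, 2), datetime(2017, 1, 7)]
)
leading = pd.Series([np.nan, np.nan, 3.0], index=index)
trailing = pd.Series([1.0, np.nan, np.nan], index=index)
interior = pd.Series([1.0, np.nan, 3.0], index=index)

print(interpolate_keep_edges(leading))
print(interpolate_keep_edges(trailing))
print(interpolate_keep_edges(interior))
```

With this mask, both the leading and the trailing case keep their NaN values, while a NaN that sits between two valid values is still interpolated on the time axis.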

pd.show_versions()
INSTALLED VERSIONS

commit: None
python: 3.5.2.final.0
python-bits: 64
OS: Linux
OS-release: 4.4.0-53-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
pandas: 0.19.1
nose: None
pip: 9.0.1
setuptools: 27.2.0
Cython: None
numpy: 1.11.3
scipy: 0.18.1
statsmodels: None
xarray: None
IPython: None
sphinx: None
patsy: None
dateutil: 2.6.0
pytz: 2016.7
blosc: None
bottleneck: None
tables: None
numexpr: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.8
boto: None
pandas_datareader: None

jorisvandenbossche · 2017-02-09T12:55:11Z

This does not seem to be specific to 'time' interpolation; simpler cases show the same behaviour:

In [58]: pd.Series([1, 2, np.nan], index=[1,2,4]).interpolate()
Out[58]: 
1    1.0
2    2.0
4    2.0
dtype: float64

In [59]: pd.Series([1, 2, np.nan], index=[1,2,4]).interpolate(method='index')
Out[59]: 
1    1.0
2    2.0
4    2.0
dtype: float64

To me that feels like a bug (or at least a behaviour that is not mentioned in the docs), but I am not too familiar with the interpolate code.
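For readers on a more recent pandas than the 0.19.1 reported here: later releases accept a limit_area argument on interpolate, and limit_area='inside' restricts filling to NaNs that are surrounded by valid values. The sketch below assumes the installed pandas supports that argument; it is an illustration of the option, not a statement about how the underlying issue was resolved.

```python
import numpy as np
import pandas as pd

# Same shape as the example above: the last value is missing.
s = pd.Series([1, 2, np.nan], index=[1, 2, 4])

# Assumption: this pandas version accepts limit_area.
# 'inside' only fills NaNs that have valid values on both sides,
# so the trailing NaN is left untouched.
inside_only = s.interpolate(method="index", limit_area="inside")
print(inside_only)

# An interior gap is still interpolated along the index.
gap = pd.Series([1, np.nan, 4], index=[1, 2, 4])
print(gap.interpolate(method="index", limit_area="inside"))
```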

jorisvandenbossche · 2017-02-09T12:57:15Z

OK, this is a duplicate of #8000. You are always welcome to look into this and submit a PR!

jorisvandenbossche closed this as completed Feb 9, 2017

jorisvandenbossche added the Duplicate Report and Missing-data labels Feb 9, 2017

jorisvandenbossche added this to the No action milestone Feb 9, 2017
