Timeseries operations #1172
Merged
10 commits
3369010 Fixed decimate operation for (Nd)Overlays (philippjfr)
59ba4d7 Added rolling operations (philippjfr)
de6b4ef Add streams parameter to ElementOperation (philippjfr)
65ec54f Allow interpolate_curve operation to work on (Nd)Overlays (philippjfr)
9bf2911 Added resampling timeseries operation (philippjfr)
37a3758 Made timeseries operation efficient for Pandas and DaskInterface (philippjfr)
2105ea4 Expanded functionality and added tests for timeseries ops (philippjfr)
1c64c4d Cleaned up operation.timeseries (philippjfr)
d5faecb Renamed operation _apply methods to _process_layer (philippjfr)
78258ad Fixed small bugs introduced in datashader (philippjfr)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,116 @@ | ||
import param | ||
import numpy as np | ||
import pandas as pd | ||
|
||
from ..core import ElementOperation, Element | ||
from ..core.data import PandasInterface | ||
from ..element import Scatter | ||
|
||
|
||
class rolling(ElementOperation): | ||
""" | ||
Applies a function over a rolling window. | ||
""" | ||
|
||
center = param.Boolean(default=True, doc=""" | ||
Whether to set the x-coordinate at the center or right edge | ||
of the window.""") | ||
|
||
function = param.Callable(default=np.mean, doc=""" | ||
The function to apply over the rolling window.""") | ||
|
||
min_periods = param.Integer(default=None, doc=""" | ||
Minimum number of observations in window required to have a | ||
value (otherwise result is NA).""") | ||
|
||
rolling_window = param.Integer(default=10, doc=""" | ||
The window size over which to apply the function.""") | ||
|
||
window_type = param.ObjectSelector(default=None, | ||
objects=['boxcar', 'triang', 'blackman', 'hamming', 'bartlett', | ||
'parzen', 'bohman', 'blackmanharris', 'nuttall', | ||
'barthann', 'kaiser', 'gaussian', 'general_gaussian', | ||
'slepian'], doc="The type of the window to apply") | ||
|
||
def _process_layer(self, element, key=None): | ||
xdim = element.kdims[0].name | ||
df = PandasInterface.as_dframe(element) | ||
roll_kwargs = {'window': self.p.rolling_window, | ||
'center': self.p.center, | ||
'win_type': self.p.window_type, | ||
'min_periods': self.p.min_periods} | ||
df = df.set_index(xdim).rolling(**roll_kwargs) | ||
if roll_kwargs['win_type'] is None: | ||
rolled = df.apply(self.p.function) | ||
else: | ||
if self.p.function is np.mean: | ||
rolled = df.mean() | ||
elif self.p.function is np.sum: | ||
rolled = df.sum() | ||
else: | ||
raise ValueError("Rolling window function only supports " | ||
"mean and sum when custom window_type is supplied") | ||
return element.clone(rolled.reset_index()) | ||
|
||
def _process(self, element, key=None): | ||
return element.map(self._process_layer, Element) | ||
|
||
|
||
class resample(ElementOperation): | ||
""" | ||
Resamples a timeseries of dates with a frequency and function. | ||
""" | ||
|
||
closed = param.ObjectSelector(default=None, objects=['left', 'right'], | ||
doc="Which side of bin interval is closed") | ||
|
||
function = param.Callable(default=np.mean, doc=""" | ||
The function to apply to each resampled bin.""") | ||
|
||
label = param.ObjectSelector(default='right', doc=""" | ||
The bin edge to label the bin with.""") | ||
|
||
rule = param.String(default='D', doc=""" | ||
A string representing the time interval over which to apply the resampling""") | ||
|
||
def _process_layer(self, element, key=None): | ||
df = PandasInterface.as_dframe(element) | ||
xdim = element.kdims[0].name | ||
resample_kwargs = {'rule': self.p.rule, 'label': self.p.label, | ||
'closed': self.p.closed} | ||
df = df.set_index(xdim).resample(**resample_kwargs) | ||
return element.clone(df.apply(self.p.function).reset_index()) | ||
|
||
def _process(self, element, key=None): | ||
return element.map(self._process_layer, Element) | ||
|
||
|
||
class rolling_outlier_std(ElementOperation): | ||
""" | ||
Detect outliers using the standard deviation within a rolling window. | ||
|
||
Outliers are the array elements outside `sigma` standard deviations from | ||
the smoothed trend line, as calculated from the trend line residuals. | ||
""" | ||
|
||
rolling_window = param.Integer(default=10, doc=""" | ||
The window size within which the rolling std is computed.""") | ||
|
||
sigma = param.Number(default=2.0, doc=""" | ||
Minimum sigma before a value is considered an outlier.""") | ||
|
||
def _process_layer(self, element, key=None): | ||
sigma, window = self.p.sigma, self.p.rolling_window | ||
ys = element.dimension_values(1) | ||
|
||
# Calculate the variation in the distribution of the residual | ||
avg = pd.Series(ys).rolling(window, center=True).mean() | ||
residual = ys - avg | ||
std = pd.Series(residual).rolling(window, center=True).std() | ||
|
||
# Get indices of outliers | ||
outliers = (np.abs(residual) > std * sigma).values | ||
return element[outliers].clone(new_type=Scatter) | ||
|
||
def _process(self, element, key=None): | ||
return element.map(self._process_layer, Element) |
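The outlier logic in `rolling_outlier_std._process_layer` can be tried outside HoloViews. The following is a minimal sketch in plain pandas, assuming the same two-pass approach as the diff above (rolling mean, then rolling std of the residuals). The helper name `rolling_outlier_mask` is hypothetical and not part of this PR; the sample values are the ones used in the PR's tests.

```python
import numpy as np
import pandas as pd


def rolling_outlier_mask(ys, window=10, sigma=2.0):
    """Return a boolean mask marking values more than `sigma` rolling
    standard deviations away from the rolling mean.

    Hypothetical helper mirroring the steps in _process_layer above.
    """
    ys = pd.Series(ys, dtype=float)
    # Smoothed trend line
    avg = ys.rolling(window, center=True).mean()
    # Distance of each point from the trend line
    residual = ys - avg
    # Variation of the residuals within the same window size
    std = residual.rolling(window, center=True).std()
    # Comparisons against NaN are False, so edge points are never flagged
    return (residual.abs() > std * sigma).to_numpy()


values = [1, 2, 1, 2, 10., 2, 1]
mask = rolling_outlier_mask(values, window=2, sigma=1)
outlier_positions = np.flatnonzero(mask).tolist()
print(outlier_positions)
```

Points near the edges, where the window is incomplete, produce NaN residuals or NaN std values and are therefore never reported as outliers.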
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,68 @@ | ||
import pandas as pd | ||
import numpy as np | ||
|
||
from holoviews import Curve, Scatter | ||
from holoviews.element.comparison import ComparisonTestCase | ||
from holoviews.operation.timeseries import (rolling, resample, rolling_outlier_std) | ||
|
||
|
||
class TimeseriesOperationTests(ComparisonTestCase): | ||
""" | ||
Tests for the various timeseries operations including rolling, | ||
resample and rolling_outlier_std. | ||
""" | ||
|
||
def setUp(self): | ||
self.dates = pd.date_range("2016-01-01", "2016-01-07", freq='D') | ||
self.values = [1, 2, 3, 4, 5, 6, 7] | ||
self.outliers = [1, 2, 1, 2, 10., 2, 1] | ||
self.date_curve = Curve((self.dates, self.values)) | ||
self.int_curve = Curve(self.values) | ||
self.date_outliers = Curve((self.dates, self.outliers)) | ||
self.int_outliers = Curve(self.outliers) | ||
|
||
def test_roll_dates(self): | ||
rolled = rolling(self.date_curve, rolling_window=2) | ||
rolled_vals = [np.NaN, 1.5, 2.5, 3.5, 4.5, 5.5, 6.5] | ||
self.assertEqual(rolled, Curve((self.dates, rolled_vals))) | ||
|
||
def test_roll_ints(self): | ||
rolled = rolling(self.int_curve, rolling_window=2) | ||
rolled_vals = [np.NaN, 1.5, 2.5, 3.5, 4.5, 5.5, 6.5] | ||
self.assertEqual(rolled, Curve(rolled_vals)) | ||
|
||
def test_roll_date_with_window_type(self): | ||
rolled = rolling(self.date_curve, rolling_window=3, window_type='triang') | ||
rolled_vals = [np.NaN, 2, 3, 4, 5, 6, np.NaN] | ||
self.assertEqual(rolled, Curve((self.dates, rolled_vals))) | ||
|
||
def test_roll_ints_with_window_type(self): | ||
rolled = rolling(self.int_curve, rolling_window=3, window_type='triang') | ||
rolled_vals = [np.NaN, 2, 3, 4, 5, 6, np.NaN] | ||
self.assertEqual(rolled, Curve(rolled_vals)) | ||
|
||
def test_resample_weekly(self): | ||
resampled = resample(self.date_curve, rule='W') | ||
dates = list(map(pd.Timestamp, ["2016-01-03", "2016-01-10"])) | ||
vals = [2, 5.5] | ||
self.assertEqual(resampled, Curve((dates, vals))) | ||
|
||
def test_resample_weekly_closed_left(self): | ||
resampled = resample(self.date_curve, rule='W', closed='left') | ||
dates = list(map(pd.Timestamp, ["2016-01-03", "2016-01-10"])) | ||
vals = [1.5, 5] | ||
self.assertEqual(resampled, Curve((dates, vals))) | ||
|
||
def test_resample_weekly_label_left(self): | ||
resampled = resample(self.date_curve, rule='W', label='left') | ||
dates = list(map(pd.Timestamp, ["2015-12-27", "2016-01-03"])) | ||
vals = [2, 5.5] | ||
self.assertEqual(resampled, Curve((dates, vals))) | ||
|
||
def test_rolling_outliers_std_ints(self): | ||
outliers = rolling_outlier_std(self.int_outliers, rolling_window=2, sigma=1) | ||
self.assertEqual(outliers, Scatter([(4, 10)])) | ||
|
||
def test_rolling_outliers_std_dates(self): | ||
outliers = rolling_outlier_std(self.date_outliers, rolling_window=2, sigma=1) | ||
self.assertEqual(outliers, Scatter([(pd.Timestamp("2016-01-05"), 10)])) |
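The expected values in the rolling and resample tests above follow directly from pandas semantics. As a hedged illustration, here is a sketch that reproduces them with plain pandas calls on the same fixture data; it does not use HoloViews, and the variable names are illustrative only.

```python
import numpy as np
import pandas as pd

dates = pd.date_range("2016-01-01", "2016-01-07", freq="D")
series = pd.Series([1, 2, 3, 4, 5, 6, 7], index=dates, dtype=float)

# Rolling mean with a window of 2: the first value has no full window,
# so it is NaN. For an even window of 2, center=True does not shift the result.
rolled = series.rolling(window=2, center=True).mean()

# Weekly bins end on Sunday and, by default, are closed and labelled
# on the right edge.
weekly = series.resample("W", label="right").mean()

# Closing the bins on the left moves Sunday 2016-01-03 into the second bin.
weekly_closed_left = series.resample("W", label="right", closed="left").mean()

# Labelling on the left keeps the same bins but names them by their start.
weekly_label_left = series.resample("W", label="left").mean()

print(rolled.tolist())
print(weekly)
print(weekly_closed_left)
print(weekly_label_left)
```

2016-01-01 falls on a Friday, which is why the first weekly bin holds only three observations under the default settings.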
Seems fine, but we do need to document this before 1.7.
Yes, we'll have to add documentation about dynamic operations to our DynamicMap and streams sections.