LOCI fails on MacOS with Python 2.7 (caused by np.count_nonzero) #36

yzhao062 · 2018-12-04T04:17:45Z

Running the LOCI model on macOS with Python 2.7 may fail. One potential cause is the code below: when called without the axis argument, np.count_nonzero returns an int instead of an array.
I am currently investigating a fix. Please stay tuned.

    def _get_alpha_n(self, dist_matrix, indices, r):
        """Computes the alpha neighbourhood points.
        
        Parameters
        ----------
        dist_matrix : array-like, shape (n_samples, n_features)
            The distance matrix w.r.t. the training samples.
        
        indices : int
            Subsetting index
        
        r : int
            Neighbourhood radius
            
        Returns
        -------
        alpha_n : array, shape (n_alpha, )
            Returns the alpha neighbourhood points.       
        """

        if type(indices) is int:
            alpha_n = np.count_nonzero(
                dist_matrix[indices, :] < (r * self._alpha))
            return alpha_n
        else:
            alpha_n = np.count_nonzero(
                dist_matrix[indices, :] < (r * self._alpha), axis=1)
            return alpha_n
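
To make the difference between the two branches concrete, here is a minimal sketch on a made-up 3x3 distance matrix. The matrix values and the `threshold` variable (standing in for `r * self._alpha`) are assumptions for illustration only. Note that the `axis` argument of `np.count_nonzero` requires NumPy 1.12 or later. Wrapping the scalar with `np.atleast_1d` is just one possible way to give both branches the same type, not the fix adopted in this issue.

```python
import numpy as np

# Toy pairwise distance matrix; the values are invented for illustration.
dist_matrix = np.array([[0.0, 0.4, 2.0],
                        [0.4, 0.0, 1.5],
                        [2.0, 1.5, 0.0]])
threshold = 0.5  # hypothetical stand-in for r * self._alpha

# Single integer index, no axis argument: the result is a scalar count.
single = np.count_nonzero(dist_matrix[0, :] < threshold)

# List of indices with axis=1: the result is an array of per-row counts.
multiple = np.count_nonzero(dist_matrix[[0, 2], :] < threshold, axis=1)

# One possible way to get an array back in both cases.
single_as_array = np.atleast_1d(single)

print(type(single), type(multiple), single_as_array.shape)
```

Code that later indexes or iterates over the result needs to handle both return types, or normalise them as in the last step.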

The error output is shown below:

(test27) bash-3.2$ python loci_example.py
/anaconda2/envs/test27/lib/python2.7/site-packages/pyod/models/loci.py:199: RuntimeWarning: divide by zero encountered in double_scalars
  outlier_scores[p_ix] = mdef/sigma_mdef
/Users/zhaoy9/.local/lib/python2.7/site-packages/numpy/core/_methods.py:101: RuntimeWarning: invalid value encountered in subtract
  x = asanyarray(arr - arrmean)
On Training Data:
Traceback (most recent call last):
  File "loci_example.py", line 133, in <module>
    evaluate_print(clf_name, y_train, y_train_scores)
  File "/anaconda2/envs/test27/lib/python2.7/site-packages/pyod/utils/data.py", line 159, in evaluate_print
    roc=np.round(roc_auc_score(y, y_pred), decimals=4),
  File "/anaconda2/envs/test27/lib/python2.7/site-packages/sklearn/metrics/ranking.py", line 356, in roc_auc_score
    sample_weight=sample_weight)
  File "/anaconda2/envs/test27/lib/python2.7/site-packages/sklearn/metrics/base.py", line 77, in _average_binary_score
    return binary_metric(y_true, y_score, sample_weight=sample_weight)
  File "/anaconda2/envs/test27/lib/python2.7/site-packages/sklearn/metrics/ranking.py", line 328, in _binary_roc_auc_score
    sample_weight=sample_weight)
  File "/anaconda2/envs/test27/lib/python2.7/site-packages/sklearn/metrics/ranking.py", line 618, in roc_curve
    y_true, y_score, pos_label=pos_label, sample_weight=sample_weight)
  File "/anaconda2/envs/test27/lib/python2.7/site-packages/sklearn/metrics/ranking.py", line 403, in _binary_clf_curve
    assert_all_finite(y_score)
  File "/anaconda2/envs/test27/lib/python2.7/site-packages/sklearn/utils/validation.py", line 68, in assert_all_finite
    _assert_all_finite(X.data if sp.issparse(X) else X, allow_nan)
  File "/anaconda2/envs/test27/lib/python2.7/site-packages/sklearn/utils/validation.py", line 56, in _assert_all_finite
    raise ValueError(msg_err.format(type_err, X.dtype))
ValueError: Input contains NaN, infinity or a value too large for dtype('float64').
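
The traceback suggests that the division `mdef/sigma_mdef` produced non-finite scores, which `roc_auc_score` then rejected. The sketch below reproduces that pattern with invented arrays (the values of `mdef` and `sigma_mdef` here are hypothetical, not taken from loci.py) and shows how `np.isfinite` can flag such scores before they reach the metric.

```python
import numpy as np

# Hypothetical values standing in for mdef and sigma_mdef in loci.py.
mdef = np.array([0.5, 0.0, 0.2])
sigma_mdef = np.array([0.25, 0.0, 0.0])

# Silence the RuntimeWarnings locally so the results can be inspected.
with np.errstate(divide="ignore", invalid="ignore"):
    outlier_scores = mdef / sigma_mdef  # finite value, 0/0 -> nan, x/0 -> inf

# Mark which scores are safe to pass to a metric such as roc_auc_score.
finite_mask = np.isfinite(outlier_scores)
n_bad = int((~finite_mask).sum())

print("non-finite scores:", n_bad)
```

A check like this only diagnoses the symptom; it does not explain why `sigma_mdef` is zero on the affected setup.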

yzhao062 · 2018-12-13T01:37:00Z

The problem was fixed by updating to the following versions:

Python == 2.7.15
numpy == 1.15.1
sklearn == 0.19.2
scipy == 1.1.0

If you run into issues with Python 2 on macOS, try updating Python and the dependent libraries.
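
To compare an environment against the versions listed above, something like the following sketch can print what is installed. The helper name `report_versions` is made up for this example, and packages that cannot be imported are reported as "not installed" rather than raising.

```python
import importlib
import platform


def report_versions(names=("numpy", "sklearn", "scipy")):
    """Return a dict mapping 'python' and each package name to a version string."""
    versions = {"python": platform.python_version()}
    for name in names:
        try:
            module = importlib.import_module(name)
            versions[name] = getattr(module, "__version__", "unknown")
        except ImportError:
            versions[name] = "not installed"
    return versions


for name, version in sorted(report_versions().items()):
    print("{} == {}".format(name, version))
```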

yzhao062 added the bug label Dec 4, 2018

yzhao062 self-assigned this Dec 4, 2018

yzhao062 closed this as completed Dec 13, 2018
