Input parameter series for IsolationTree #33

Closed
1097872822 opened this issue Apr 26, 2023 · 6 comments
Assignees
Labels
bug Something isn't working

Comments

@1097872822

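Hello, I'd like to ask about the series[] input. With different data sources I sometimes get a NullPointerException in the pathLengthM method. It only happens with some of my data sources: my speed data does not fail, but my temperature data (which contains both positive and negative values) does. Judging from how the tree is built and queried, though, the data source itself should not be the cause, should it? An isolation forest shouldn't impose any restrictions on the series[] input, right?
Finally, thank you for your work, which has been a useful reference. I would be very grateful for a reply. [salute]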

@MezereonXP
Owner

MezereonXP commented Apr 26, 2023 via email

@1097872822
Author

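Hello, thank you for your reply and for fixing the code that may have contained the bug.

While analyzing my data sources, I found that the randomness of the data may be what triggers the null pointer error inside pathLengthM. As I mentioned above, the temperature data tends to stay within a fairly constant range, so it contains very few outliers. My speed data, by contrast, is mostly 0, with only a few isolated jumps in speed, and the isolation forest does detect those outliers. So I suspect that the randomness of the data, or whether the data is uniformly distributed, is what keeps the algorithm from finding outliers. I also suspect this because my temperature data does not raise an error but keeps running forever; from debugging, the cause appears to be a do-while loop whose condition never stops holding. I then constructed a data set similar to my speed data, which led me to the hypothesis above. Thanks again for hearing me out.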
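
The never-ending do-while loop described above is what one would expect if a split is redrawn until it separates the points, since near-constant data may never be separated. The sketch below is an assumption about isolation-tree code in general, not this library's implementation; the class `SplitGuard` and the method `chooseSplit` are hypothetical names. It returns no split when every value is identical, so a caller can turn the node into a leaf instead of retrying.

```java
import java.util.OptionalDouble;
import java.util.Random;

final class SplitGuard {
    /**
     * Draws a split in [min, max), or returns empty when the values
     * cannot be separated (fewer than two points, or all points equal).
     */
    static OptionalDouble chooseSplit(double[] values, Random rng) {
        if (values.length < 2) {
            return OptionalDouble.empty();
        }
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (double v : values) {
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        if (min == max) {
            // Constant data: no threshold can separate it, so do not loop.
            return OptionalDouble.empty();
        }
        double drawn = min + rng.nextDouble() * (max - min);
        // Rounding can push the draw up to max; fall back to min in that case.
        double split = drawn >= max ? min : drawn;
        return OptionalDouble.of(split);
    }

    public static void main(String[] args) {
        Random rng = new Random(7);
        System.out.println(chooseSplit(new double[] {8.87, 8.87, 8.87}, rng));
        System.out.println(chooseSplit(new double[] {8.87, 8.97, 10.05}, rng));
    }
}
```

A caller that sends `v <= split` to the left and `v > split` to the right gets two non-empty halves whenever a split is returned, because the minimum always goes left and the maximum always goes right.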

@MezereonXP
Owner

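Thank you very much for your analysis. Does the problem still occur after updating to the latest version?
If it does, please post a test case and I will try to fix it.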

MezereonXP added the bug label on Apr 27, 2023
MezereonXP self-assigned this on Apr 27, 2023
@1097872822
Author

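Thank you very much for your reply. I am also researching and testing other algorithms at the moment, so I will pull the new version and test again when I find time, then report back.
Thanks again, and happy Labor Day in advance.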

@jiewang-2023

jiewang-2023 commented May 12, 2023

Hello,
With this data set, a null pointer exception still occurs at random.
[8.75, 8.87, 8.89, 8.87, 8.97, 8.97, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 8.87, 10.05, 10.09, 10.05, 10.05, 10.03, 10.03, 10.05, 10.09, 10.09, 10.09, 10.13, 10.03, 10.03, 9.99, 10.05, 10.03, 10.03, 10.05, 9.99, 9.99, 9.99, 10.03, 10.03, 9.95, 10.37, 10.37, 10.37, 10.41, 10.41, 10.41, 
10.37, 10.37, 10.33, 10.37, 10.37, 10.37, 10.37, 10.43, 10.43, 10.43, 10.43, 10.43, 10.41, 10.33, 10.37, 10.37, 10.33, 10.37, 10.37, 10.41, 10.47, 10.47, 10.41, 10.33, 10.41, 10.33, 10.37, 10.43, 10.37, 10.41, 10.33, 10.33]

Code structure

    @Test
    public void timeSeriesAnalyse() {
        System.out.println("Isolation Forest Test is Started ...");
        IsolationTreeTool isolationTreeTool = new IsolationTreeTool();
        // Read the JSON array of readings from disk.
        String str = FileUtil.readString(new File("D://tmp//testdata.json"), StandardCharsets.UTF_8);
        JSONArray arr = JSONUtil.parseArray(str);
        // Drop nulls, round each value to 2 decimal places, and keep at most 360 points.
        double[] array = arr.stream()
                .filter(Objects::nonNull)
                .mapToDouble(e -> new BigDecimal(e.toString()).setScale(2, RoundingMode.HALF_UP).doubleValue())
                .limit(360)
                .toArray();
        System.out.println("array.length = " + array.length);
        System.out.println("array = " + Arrays.toString(array));
        // Run the isolation-forest analysis; the stack trace below starts from this call.
        isolationTreeTool.timeSeriesAnalyse(array);
        DisplayTool.showResult(isolationTreeTool);
        System.out.println("Isolation Forest Test is Finished");
    }

Exception
java.lang.NullPointerException: Cannot invoke "com.anomalydetect.IsolationTree.IsolationTreeNode.isExtenal()" because "node" is null

at com.anomalydetect.IsolationTree.IsolationTree.pathLengthM(IsolationTree.java:74)
at com.anomalydetect.IsolationTree.IsolationTree.pathLengthM(IsolationTree.java:80)
at com.anomalydetect.IsolationTree.IsolationTree.pathLengthM(IsolationTree.java:80)
at com.anomalydetect.IsolationTree.IsolationTree.pathLengthM(IsolationTree.java:80)
at com.anomalydetect.IsolationTree.IsolationTree.pathLengthM(IsolationTree.java:80)
at com.anomalydetect.IsolationTree.IsolationTree.pathLengthM(IsolationTree.java:80)
at com.anomalydetect.IsolationTree.IsolationTree.pathLength(IsolationTree.java:62)
at com.anomalydetect.IsolationTree.IsolationForest.searchForest(IsolationForest.java:52)
at com.anomalydetect.IsolationTree.IsolationTreeTool.cutAnomaly(IsolationTreeTool.java:43)
at com.anomalydetect.IsolationTree.IsolationTreeTool.timeSeriesAnalyse(IsolationTreeTool.java:32)
at com.anomalydetect.IsolationTree.IsolationTreeToolTest.timeSeriesAnalyse(IsolationTreeToolTest.java:53)

@MezereonXP
Owner

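Hello, thank you for providing the test data.

The null pointer occurs because the isolation forest uses sampling when it builds its trees, and the default sample size is 256. With longer arrays, if the sample size is not large enough, looking up a value in a tree can fail.

I have fixed this: when the sample size is smaller than the number of data points, it now defaults to the number of data points.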
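
The fix above widens the sample so that every queried value is present in each tree. A complementary safeguard, sketched here as an assumption and not as the library's code, is to traverse by threshold comparison so that a value missing from the sample still ends at an external node instead of a null child. `IsolationTreeSketch`, `Node`, `build`, and `pathLength` are hypothetical names, and the sketch omits the path-length adjustment that standard isolation forests add at leaves holding several points.

```java
import java.util.Arrays;
import java.util.Random;

final class IsolationTreeSketch {
    static final class Node {
        final double split;     // threshold; unused for external nodes
        final Node left, right; // both null for an external node
        Node(double split, Node left, Node right) {
            this.split = split;
            this.left = left;
            this.right = right;
        }
        boolean isExternal() {
            return left == null;
        }
    }

    /** Builds a 1-D isolation tree; every internal node gets two non-null children. */
    static Node build(double[] data, int depth, int maxDepth, Random rng) {
        if (data.length <= 1 || depth >= maxDepth) {
            return new Node(Double.NaN, null, null);
        }
        double min = Arrays.stream(data).min().getAsDouble();
        double max = Arrays.stream(data).max().getAsDouble();
        if (min == max) {
            // Identical values cannot be split further.
            return new Node(Double.NaN, null, null);
        }
        double drawn = min + rng.nextDouble() * (max - min);
        final double split = drawn >= max ? min : drawn; // keep split in [min, max)
        double[] lower = Arrays.stream(data).filter(v -> v <= split).toArray();
        double[] upper = Arrays.stream(data).filter(v -> v > split).toArray();
        return new Node(split,
                build(lower, depth + 1, maxDepth, rng),
                build(upper, depth + 1, maxDepth, rng));
    }

    /** Counts edges from the root to the external node that x falls into. */
    static int pathLength(Node root, double x) {
        int depth = 0;
        Node node = root;
        while (!node.isExternal()) {
            node = x <= node.split ? node.left : node.right;
            depth++;
        }
        return depth;
    }

    public static void main(String[] args) {
        double[] sample = {8.87, 8.87, 8.89, 8.97, 10.05};
        Node root = build(sample, 0, 8, new Random(42));
        // 10.43 and -3.5 are not in the sample, yet the lookup still terminates.
        System.out.println(pathLength(root, 10.43));
        System.out.println(pathLength(root, -3.5));
    }
}
```

Because the lookup compares against thresholds and never searches for an exact value, it needs no null check as long as the builder creates children in pairs.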

The fixed code has been published and should take about 2-3 hours to become available. Please set the version to 1.1.1. Thank you.
