[Bug] [Datasophon-service] When the alarm is restored in AlertActor, the state modification logic is abnormal #402
Comments
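I'm not sure what issue you're describing. Could you elaborate?
On the dev branch, the state-update logic looks wrong when a service-down alert recovers. In the code below, the state should be changed to RUNNING only when the current state is not RUNNING.
As a result, when a resolved alert is sent for the service, the abnormal state cannot be restored to the normal state. I also compared this with the previous version of the code; this condition was probably written incorrectly during the refactoring.
In our tests, the service did recover from the abnormal state to the normal state. How did the situation you describe occur?
Starting and stopping the service directly from the web page probably won't reproduce the problem. Stopping the service and then starting it in the background (outside the page) should reproduce it. (The likely scenario is that high machine load causes Prometheus scraping to fail; when scraping later succeeds and the alert-resolved message is sent back, the state cannot be reset to normal.)
We reproduced the problem following your description. Can you help us fix it?
I will submit a PR later.
PR link: #404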
What happened
Should the state here be updated only when the role instance is not running? That is, should the condition be if (roleInstance.getServiceRoleState() != ServiceRoleState.RUNNING)? The current code is:
    ClusterServiceRoleInstanceEntity roleInstance = roleInstanceService.getOneServiceRole(labels.getServiceRoleName(), hostname, clusterId);
    // Reported problem: this branch is entered only when the state is already RUNNING
    if (roleInstance.getServiceRoleState() == ServiceRoleState.RUNNING) {
        roleInstance.setServiceRoleState(ServiceRoleState.RUNNING);
        if (nodeHasWarnAlertList) {
            roleInstance.setServiceRoleState(ServiceRoleState.EXISTS_ALARM);
        }
        roleInstanceService.updateById(roleInstance);
    }
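To illustrate the condition proposed in this issue, here is a minimal, self-contained sketch of the state transition on a resolved alert. It is not the project's code and not necessarily what PR #404 does: the class AlertRecoverySketch, the method stateAfterResolvedAlert, and the STOP enum value are hypothetical stand-ins, and the enum only mimics the ServiceRoleState values mentioned above.

```java
public class AlertRecoverySketch {

    // Hypothetical stand-in for the project's ServiceRoleState enum.
    // RUNNING and EXISTS_ALARM appear in the issue; STOP is an assumed "abnormal" value.
    enum ServiceRoleState { RUNNING, STOP, EXISTS_ALARM }

    // Returns the state a role instance should have after a resolved alert,
    // using the proposed "!= RUNNING" condition instead of "== RUNNING".
    static ServiceRoleState stateAfterResolvedAlert(ServiceRoleState current,
                                                    boolean nodeHasWarnAlertList) {
        if (current != ServiceRoleState.RUNNING) {
            // Abnormal state: recover to RUNNING, unless warning alerts remain.
            return nodeHasWarnAlertList
                    ? ServiceRoleState.EXISTS_ALARM
                    : ServiceRoleState.RUNNING;
        }
        // Already RUNNING: nothing to update under the proposed condition.
        return current;
    }

    public static void main(String[] args) {
        // A stopped role with no remaining warnings recovers to RUNNING.
        System.out.println(stateAfterResolvedAlert(ServiceRoleState.STOP, false));
        // A stopped role with remaining warnings becomes EXISTS_ALARM.
        System.out.println(stateAfterResolvedAlert(ServiceRoleState.STOP, true));
        // A role that is already RUNNING is left unchanged.
        System.out.println(stateAfterResolvedAlert(ServiceRoleState.RUNNING, false));
    }
}
```

With the original `==` condition, the first two cases would never enter the update branch, which matches the reported symptom that a resolved alert cannot restore an abnormal state.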
What you expected to happen
The state should be updated only when the role instance is not already RUNNING, i.e. the condition should be if (roleInstance.getServiceRoleState() != ServiceRoleState.RUNNING), so that a resolved alert can restore an abnormal state to RUNNING (or to EXISTS_ALARM if the node still has warning alerts).
Version
dev