This issue was moved to a discussion.


GET warning events for alert have some delay #168

Closed
kingsqv opened this issue Apr 17, 2020 · 2 comments

@kingsqv

kingsqv commented Apr 17, 2020

Hi, I am using shell-operator for my k8s cluster. It's a cool tool.
I use this config to get k8s events:
{ "onKubernetesEvent":[
{"kind":"Event",
"event":["add"]
}
]
}
LEVEL=$(kubectl -n $ns get event/$eventName -o json | jq .type)
When $LEVEL is Warning, the hook sends an alert, but I found that the alerts always arrive with some delay.
What is the reason for this? Is there a problem with my configuration?
Thanks

@diafour
Contributor

diafour commented Apr 17, 2020

Hello! Nice to hear you like it!

I don't know the exact reason for your case, but I can guess what causes the delay:

  1. Legacy v0 configuration is used. The new configuration has these advantages:
  • Existing Event objects are not ignored; a hook receives them on startup with a Synchronization binding context.
  • Support for parallel queues.
  • The binding context contains JSON for the Event object, so there is no need to execute kubectl:
    LEVEL=$(jq .[0].object.type $BINDING_CONTEXT_PATH)
    
  • Binding contexts can be combined (Combine binding contexts to speedup hook execution on high loads #140) so that a hook is executed once for several events. This clears the queue much faster.
cat $BINDING_CONTEXT_PATH
[
  {"name":"kubernetes", "type":"Event", "watchEvent":"Added", "object":{
     "metadata": {"name": "coredns-..." , ... },
     "type": "Normal",
     ...
  }},
  {"name":"kubernetes", "type":"Event", "watchEvent":"Added", "object":{
     "metadata": {"name": "kube-proxy-..." , ... },
     "type": "Normal",
     ...
  }},
  { ... }, ...
]
  2. Slow jq execution in alpine builds. alpine has jq-1.6, which has a known performance problem (JQ1.6 is much slower than JQ1.5, jqlang/jq#2069).

So, to speed things up, do these first:

  • Migrate to the configVersion: v1 configuration.
  • Expect multiple items in BINDING_CONTEXT_PATH.
    • Rewrite your jq expressions or use /framework/shell/hook.sh (no examples yet, sorry).
  • Do not call kubectl; a binding context for v1 already has the full JSON for the Event object.
  • Use the latest ubuntu version (flant/shell-operator:v1.0.0-beta.9); it has jq-1.5, which does not have the performance bug.
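
The advice about multiple items can be illustrated with a minimal sketch. It assumes the binding context layout shown earlier in this comment (a JSON array whose items have `type` and `object` fields); the sample file, the object names, and the `list_warnings` helper are made up for illustration, and in a real hook the path would come from `$BINDING_CONTEXT_PATH`.

```shell
#!/usr/bin/env bash
# Sketch: read every item of a binding context with a single jq call.
# The sample data below is hypothetical; a real hook reads $BINDING_CONTEXT_PATH.
set -euo pipefail

# Print one tab-separated line (namespace, name, reason) per Warning event.
list_warnings() {
  jq -r '.[]
         | select(.type == "Event" and .object.type == "Warning")
         | [.object.metadata.namespace, .object.metadata.name, .object.reason]
         | @tsv' "$1"
}

sample=$(mktemp)
cat > "$sample" <<'EOF'
[
  {"type": "Event", "watchEvent": "Added", "object": {
    "metadata": {"namespace": "kube-system", "name": "coredns-example"},
    "type": "Normal", "reason": "Started"}},
  {"type": "Event", "watchEvent": "Added", "object": {
    "metadata": {"namespace": "default", "name": "web-example"},
    "type": "Warning", "reason": "BackOff"}}
]
EOF

warnings=$(list_warnings "$sample")
printf '%s\n' "$warnings"
rm -f "$sample"
```

With the sample data, only the `web-example` item is printed. Iterating with `.[]` instead of indexing `.[0]` keeps the hook correct when several events arrive in one binding context, and it still starts jq only once.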

Feel free to show your hook, I can help you rewrite it.

One more problem: shell-operator keeps objects in memory. If a huge number of Event objects are added, it can "eat" a lot of memory. We'll fix this in the future, but right now a memory limit for shell-operator's Pod is a must-have.

@diafour added the question (Further information is requested) label Apr 17, 2020
@kingsqv
Author

kingsqv commented Apr 19, 2020

Thanks for your advice.

This is my config now. It works well with no delay, but I found that the same message is sent a second time after 1 hour, always at a 1-hour interval. I checked, and the operator pod has not restarted.
By the way, in the new version (flant/shell-operator:v1.0.0-beta.9) the log output is split line by line.

Like this log (the same message is sent twice):
{"binding":"kubernetes","hook":"shell-hook-events.sh","level":"info","msg":"\"Warning\"","output":"stdout","queue":"main","task":"HookRun","time":"2020-04-19T12:30:29+08:00"}
{"binding":"kubernetes","hook":"shell-hook-events.sh","level":"info","msg":"{\"errcode\":0,\"errmsg\":\"ok\"}","output":"stdout","queue":"main","task":"HookRun","time":"2020-04-19T12:30:29+08:00"}

{"binding":"kubernetes","hook":"shell-hook-events.sh","level":"info","msg":"Hook executed successfully","operator.component":"taskRunner","queue":"main","task":"HookRun","time":"2020-04-19T12:30:29+08:00"}
{"binding":"schedule","event.id":"2cbc3e8a-3d1c-4fc3-a0e6-f2b1231fd406","level":"info","msg":"queue task HookRun:main:kubernetes:shell-hook-events.sh:kubernetes","operator.component":"handleEvents","queue":"main","task.id":"d5172979-4655-4230-89f3-b9ab2ba20333","time":"2020-04-19T13:31:28+08:00"}
{"binding":"kubernetes","hook":"shell-hook-events.sh","level":"info","msg":"Execute hook","operator.component":"taskRunner","queue":"main","task":"HookRun","time":"2020-04-19T13:31:30+08:00"}

{"binding":"kubernetes","hook":"shell-hook-events.sh","level":"info","msg":"\"Warning\"","output":"stdout","queue":"main","task":"HookRun","time":"2020-04-19T13:31:30+08:00"}
{"binding":"kubernetes","hook":"shell-hook-events.sh","level":"info","msg":"{\"errcode\":0,\"errmsg\":\"ok\"}","output":"stdout","queue":"main","task":"HookRun","time":"2020-04-19T13:31:30+08:00"}

{"binding":"kubernetes","hook":"shell-hook-events.sh","level":"info","msg":"Hook executed successfully","operator.component":"taskRunner","queue":"main","task":"HookRun","time":"2020-04-19T13:31:30+08:00"}

#!/usr/bin/env bash

source /cus/config/config
source /hooks/common.sh

if [[ $1 == "--config" ]]; then
  # Hook configuration: subscribe to Event objects (configVersion v1).
  cat <<EOF
{
  "configVersion": "v1",
  "kubernetes": [
    {
      "apiVersion": "events.k8s.io/v1beta1",
      "kind": "Event"
    }
  ]
}
EOF
else
  binding=$(cat "$BINDING_CONTEXT_PATH")
  # Note: only the first item (.[0]) of the binding context is examined.
  type=$(jq -r '.[0].type' "$BINDING_CONTEXT_PATH")
  if [[ $type == "Event" ]]; then
    if [[ $EVENTS == 1 ]]; then
      # jq runs without -r here, so values keep their JSON quotes ("Warning").
      LEVEL=$(jq '.[0].object.type' "$BINDING_CONTEXT_PATH")
      if [[ $LEVEL != '"Warning"' ]]; then
        echo "$LEVEL"
        exit 0
      fi
      APP=Events
      NS=$(jq '.[0].object.metadata.namespace' "$BINDING_CONTEXT_PATH")
      ECOMPONENT=$(jq '.[0].object.deprecatedSource.component' "$BINDING_CONTEXT_PATH")
      EHOST=$(jq '.[0].object.deprecatedSource.host' "$BINDING_CONTEXT_PATH")
      KIND=$(jq '.[0].object.regarding.kind' "$BINDING_CONTEXT_PATH")
      NAME=$(jq '.[0].object.regarding.name' "$BINDING_CONTEXT_PATH")
      REASON=$(jq '.[0].object.reason' "$BINDING_CONTEXT_PATH")
      MESSAGE=$(jq '.[0].object.note' "$BINDING_CONTEXT_PATH")
      FIRSTTIMESTAMP=$(jq '.[0].object.deprecatedFirstTimestamp' "$BINDING_CONTEXT_PATH")
      LASTTIMESTAMP=$(jq '.[0].object.deprecatedLastTimestamp' "$BINDING_CONTEXT_PATH")
      echo "$LEVEL"
      mes="
Project: ${CLUSTER_ROLE}
Level: $LEVEL
Host: $EHOST
Component: $ECOMPONENT
Kind: $KIND
Namespace: $NS
Name: $NAME
Reason: $REASON
FirstTimestamp: $FIRSTTIMESTAMP
LastTimestamp: $LASTTIMESTAMP
Message: $MESSAGE
"
      # whichbot comes from the sourced files and selects $DING_BOT.
      whichbot
      /cus/bin/ding.py "$mes" "$DING_BOT" || echo 'send dingtalk failed'
    else
      echo "EVENTS: $EVENTS"
    fi
  fi
fi
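
The hook above reads only the first item (`.[0]`) and starts jq once per field. As a rough sketch of the "expect multiple items" advice given earlier in the thread, the variant below formats one alert line per Warning event with a single jq process. It is an illustration, not the maintainers' recommended implementation: `send_alert` is a hypothetical stand-in for `/cus/bin/ding.py`, the sample events are invented, and only the field names (`regarding`, `note`, `reason`) are taken from the hook above.

```shell
#!/usr/bin/env bash
# Sketch: send one alert per Warning event, whatever the number of items
# in the binding context. Sample data and send_alert are hypothetical.
set -euo pipefail

# Stand-in for the real sender (the hook above calls /cus/bin/ding.py).
send_alert() { printf 'ALERT: %s\n' "$1"; }

# One jq process formats a single text line for each Warning event.
format_warnings() {
  jq -r '.[]
         | select(.type == "Event")
         | .object
         | select(.type == "Warning")
         | "\(.reason) \(.regarding.kind) \(.metadata.namespace)/\(.regarding.name): \(.note)"' "$1"
}

sample=$(mktemp)
cat > "$sample" <<'EOF'
[
  {"type": "Event", "object": {"type": "Warning", "reason": "BackOff",
    "metadata": {"namespace": "default"},
    "regarding": {"kind": "Pod", "name": "web-example"},
    "note": "Back-off restarting failed container"}},
  {"type": "Event", "object": {"type": "Normal", "reason": "Pulled",
    "metadata": {"namespace": "default"},
    "regarding": {"kind": "Pod", "name": "web-example"},
    "note": "Image pulled"}},
  {"type": "Event", "object": {"type": "Warning", "reason": "FailedMount",
    "metadata": {"namespace": "kube-system"},
    "regarding": {"kind": "Pod", "name": "dns-example"},
    "note": "Unable to attach volume"}}
]
EOF

messages=()
while IFS= read -r line; do
  send_alert "$line"
  messages+=("$line")
done < <(format_warnings "$sample")
rm -f "$sample"

echo "alerts sent: ${#messages[@]}"
```

With the sample data, two alerts are sent and the Normal event is skipped. One limitation of this line-based loop is that a `note` containing a newline would be split into several alerts; emitting compact JSON with `jq -c` and decoding each line would avoid that at the cost of extra jq calls.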

@shurup closed this as completed Jan 15, 2021
@flant locked and limited conversation to collaborators Jan 15, 2021
