Community Note
- Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request.
- Please do not leave "+1" or "me too" comments; they generate extra noise for issue followers and do not help prioritize the request.
- If you are interested in working on this issue or have submitted a pull request, please leave a comment.
Tell us about your request
What do you want us to build?
Since #625 has shipped, we have started scheduling workloads on EKS Fargate. Every time a new Fargate node is created, the node reports the following warning:
invalid capacity 0 on image filesystem.
```
kubectl describe node fargate-ip-172-17-8-64.ap-northeast-1.compute.internal

Events:
  Type     Reason                   Age    From                                                             Message
  ----     ------                   ----   ----                                                             -------
  Normal   Starting                 9m25s  kubelet, fargate-ip-172-17-8-64.ap-northeast-1.compute.internal  Starting kubelet.
  Warning  InvalidDiskCapacity      9m25s  kubelet, fargate-ip-172-17-8-64.ap-northeast-1.compute.internal  invalid capacity 0 on image filesystem
  Normal   NodeHasSufficientMemory  9m25s  kubelet, fargate-ip-172-17-8-64.ap-northeast-1.compute.internal  Node fargate-ip-172-17-8-64.ap-northeast-1.compute.internal status is now: NodeHasSufficientMemory
  Normal   NodeHasNoDiskPressure    9m25s  kubelet, fargate-ip-172-17-8-64.ap-northeast-1.compute.internal  Node fargate-ip-172-17-8-64.ap-northeast-1.compute.internal status is now: NodeHasNoDiskPressure
  Normal   NodeHasSufficientPID     9m25s  kubelet, fargate-ip-172-17-8-64.ap-northeast-1.compute.internal  Node fargate-ip-172-17-8-64.ap-northeast-1.compute.internal status is now: NodeHasSufficientPID
  Normal   NodeAllocatableEnforced  9m24s  kubelet, fargate-ip-172-17-8-64.ap-northeast-1.compute.internal  Updated Node Allocatable limit across pods
  Normal   NodeReady                9m15s  kubelet, fargate-ip-172-17-8-64.ap-northeast-1.compute.internal  Node fargate-ip-172-17-8-64.ap-northeast-1.compute.internal status is now: NodeReady
```
We use BotKube to monitor our EKS clusters, and warnings and errors are sent to our Slack channels. The `InvalidDiskCapacity` warning above is now "spamming" us for every pod scheduled on EKS Fargate.
I'm wondering whether we are the only ones affected by this, whether it is a temporary issue with the EKS Fargate scheduler, and whether AWS plans to address this warning in the near future.
Which service(s) is this request for?
EKS Fargate
Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?
What outcome are you trying to achieve, ultimately, and why is it hard/impossible to do right now? What is the impact of not having this problem solved? The more details you can provide, the better we'll be able to understand and solve the problem.
I'm trying to avoid false-positive alarms on EKS clusters with workloads scheduled on Fargate.
Are you currently working around this issue?
How are you currently solving this problem?
I have implemented a custom BotKube filter to ignore the `invalid capacity 0 on image filesystem` Node event on Fargate.
Here is the custom filter for those who want to have a look: `botkube/pkg/filterengine/filters/custom_node_event_checker.go`
```go
// Package filters: CustomNodeEventsChecker sends notifications on critical node events
package filters

import (
	"strings"

	"github.com/infracloudio/botkube/pkg/events"
	"github.com/infracloudio/botkube/pkg/filterengine"
	"github.com/infracloudio/botkube/pkg/log"
)

const (
	// InvalidDiskCapacity is the event reason reported when a Node has invalid disk capacity
	InvalidDiskCapacity string = "InvalidDiskCapacity"
)

// CustomNodeEventsChecker checks node events and modifies the event structure
type CustomNodeEventsChecker struct {
	Description string
}

// Register filter
func init() {
	filterengine.DefaultFilterEngine.Register(CustomNodeEventsChecker{
		Description: "Sends notifications on node level critical events.",
	})
}

// Run filters and modifies the event struct
func (f CustomNodeEventsChecker) Run(object interface{}, event *events.Event) {
	// Run filter only on Node events
	if event.Kind != "Node" {
		return
	}

	log.Debugf("CustomNodeEventsChecker, object: %+v\n------------", object)
	log.Debugf("CustomNodeEventsChecker, event: %+v\n------------", event)

	switch event.Reason {
	case InvalidDiskCapacity:
		log.Debug("Node has InvalidDiskCapacity, checking whether to ignore it")
		if strings.Contains(event.Name, "fargate-ip-") {
			for _, m := range event.Messages {
				// As of 2021/06/17, skip warning events caused by "invalid capacity 0
				// on image filesystem" during Fargate node creation.
				// See https://github.com/aws/containers-roadmap/issues/1403
				if strings.Contains(m, "invalid capacity 0 on image filesystem") {
					log.Debug("Skipping Node event with InvalidDiskCapacity for EKS Fargate")
					event.Skip = true
				}
			}
		}
	}

	log.Debug("Node Critical Event filter successful!")
}

// Describe returns the filter description
func (f CustomNodeEventsChecker) Describe() string {
	return f.Description
}
```
Additional context
Anything else we should know?
We don't have this issue with self-managed EKS nodes or AWS-managed EKS node groups.
Thanks in advance for your time.
I see the same warning event when I run `kubectl describe node <instance_name>`, where `instance_name` is the DNS name of an EKS Local Clusters control-plane/master node.