Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Liveness probe with subservices #481

Merged
merged 4 commits into from
Nov 22, 2023
Merged

Liveness probe with subservices #481

merged 4 commits into from
Nov 22, 2023

Conversation

zolug
Copy link
Collaborator

@zolug zolug commented Nov 17, 2023

Description

  • NSP, IPAM, Proxy and LB register Liveness probe service at custom gRPC health server.
  • FE no longer checks Target Registry Client as part of Readiness.
  • IPAM, NSP Readiness no longer includes their server status.
  • Liveness probes in IPAM, NSP, Proxy and LB containers rely on service name "Liveness" to check status.
  • Health server in TAPA hosts health services Liveness, Startup and Readiness out of the box. Ambassador server availability is returned as part of Liveness. (Example-target has been updated accordingly.) Note, it's not mandatory to use these health services in the kubernetes probe configuration, but using them could avoid the need of revisiting the probe configuration in the TAPA container if the recommended probe setup changes (the implementation would change in the background).

Note: There's a DR that does not recommend putting evaluation criteria inside startup probe which are related to service operation readiness which later can fail readiness. Thus startup probes do not check for server availability. (Although in case of IPAM and NSP this limitation would no longer stand.)

Issue link

#480

Checklist

  • Purpose
    • Bug fix
    • New functionality
    • Documentation
    • Refactoring
    • CI
  • Test
    • Unit test
    • E2E Test
    • Tested manually
  • Introduce a breaking change
    • Yes (description required)
    • No

-NSP, IPAM, Proxy and LB register Liveness probe service
at custom health server.
-FE no longer checks Target Registry Client as part of Readiness.
-IPAM, NSP Readiness no longer includes their server status.
Liveness probes in IPAM, NSP, Proxy and LB containers rely on
service name "Liveness" to check status.
@@ -56,6 +56,9 @@ MERIDIO_TIMEOUT | time.Duration | timeout of NSM request/close, NSP register/unr
MERIDIO_DIAL_TIMEOUT | time.Duration | timeout to dial NSMgr | 5s
MERIDIO_MAX_TOKEN_LIFETIME | time.Duration | maximum lifetime of tokens | 24h
MERIDIO_LOG_LEVEL | string | Log level | DEBUG
MERIDIO_NSP_ETRY_TIMEOUT | time.Duration | Timeout of the entries registered in NSP | 30senvconfig:"nsp_entry_timeout"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
MERIDIO_NSP_ETRY_TIMEOUT | time.Duration | Timeout of the entries registered in NSP | 30senvconfig:"nsp_entry_timeout"`
MERIDIO_NSP_ETRY_TIMEOUT | time.Duration | Timeout of the entries registered in NSP | 30s"`

Comment on lines 87 to 92
Service | Description
--- | ---
Liveness | A unique service to be used by liveness probe to return status, can aggregate other lesser services
Startup | A unique service to be used by startup probe to return status, can aggregate other lesser services
Readiness | A unique service to be used by readiness probe to return status, can aggregate other lesser services
AmbassadorSvc | Monitor status of the Ambassador server
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Service | Description
--- | ---
Liveness | A unique service to be used by liveness probe to return status, can aggregate other lesser services
Startup | A unique service to be used by startup probe to return status, can aggregate other lesser services
Readiness | A unique service to be used by readiness probe to return status, can aggregate other lesser services
AmbassadorSvc | Monitor status of the Ambassador server
Service | Probe | Description
--- | --- | ---
AmbassadorSvc | Liveness | Monitor status of the Ambassador server

var ProxyLivenessServices []string = []string{NSMEndpointSvc}
var IPAMReadinessServices []string = []string{NSPCliSvc}
var IPAMLivenessServices []string = []string{IPAMSvc}
var NSPReadinessServices []string = []string{}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why keeping the empty sub-services? (NSPReadinessServices, TAPAStartupServices and TAPAReadinessServices)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, could be that these are just confusing, so I will remove them.

Copy link
Member

@LionelJouin LionelJouin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you also add the type of probes for each sub-probe/sub-service in the documentation

TAPA registers health services Liveness, Startup and Readiness,
which can be used by kubernetes probes to check the respective
status.

The evaluation criteria for these services can be changed behind
the secenes, without the need of updating the kubernetes probe
configuration of the TAPA container.

In accordance with the recent changes the example-target checks
Ambassador server availability as part of its liveness status.
Comment on lines +87 to +91
Service | Description
--- | ---
Liveness | A unique service to be used by liveness probe to return status, can aggregate other lesser services
Startup | A unique service to be used by startup probe to return status, can aggregate other lesser services
Readiness | A unique service to be used by readiness probe to return status, can aggregate other lesser services
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure why this table is needed

@@ -115,6 +117,16 @@ func main() {

// create and start health server
sigCtx = health.CreateChecker(sigCtx)
// add health server services Startup, Readiness and Liveness representing probes
if err := health.RegisterStartupSubservices(sigCtx); err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why keeping empty probe services?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Their purpose is to hide implementation, thus not requiring kubernetes probe config changes in the TAPA container (avoiding/limiting user impact).

@zolug zolug merged commit 3f6ce72 into master Nov 22, 2023
@zolug zolug deleted the probe-conv branch March 4, 2024 10:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

2 participants