Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor cluster manager init sequence. #33221

Merged
merged 7 commits into from
Apr 11, 2024

Conversation

yanjunxiang-google
Copy link
Contributor

@yanjunxiang-google yanjunxiang-google commented Mar 29, 2024

This is to address the issue: #33218

Commit Message:
Additional Description:
Risk Level:
Testing:
Docs Changes:
Release Notes:
Platform Specific Features:
[Optional Runtime guard:]
[Optional Fixes #Issue]
[Optional Fixes commit #PR or SHA]
[Optional Deprecated:]
[Optional API Considerations:]

Signed-off-by: Yanjun Xiang <yanjunxiang@google.com>
Signed-off-by: Yanjun Xiang <yanjunxiang@google.com>
@yanjunxiang-google yanjunxiang-google marked this pull request as ready for review April 1, 2024 21:23
@yanjunxiang-google
Copy link
Contributor Author

/assign @wbpcode @htuch

Copy link
Member

@htuch htuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The set of dependencies leading to the need to break the reference circle might benefit from some rethinking longer term, but this seems OK to me. @wbpcode you were interested in reviewing this as well.

Signed-off-by: Yanjun Xiang <yanjunxiang@google.com>
Copy link
Member

@wbpcode wbpcode left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your contribution. This seems good to me. (Although I cannot ensure it completely resolve our problem, the status is very complex when doing the initialization. But now it at least better then original condition)

envoy/upstream/cluster_manager.h Outdated Show resolved Hide resolved
Comment on lines 131 to 134

// clusterManagerFromProto() and init() have to be called consecutively.
cluster_manager_ = cluster_manager_factory.clusterManagerFromProto(bootstrap);
RETURN_IF_NOT_OK(cluster_manager_->init(bootstrap));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if this completely resolve our problem. This ensure the server context could always return a valid reference to the cluster manager. But what will happen when the upstream network filter or upstream http filter try to access the actual content of the cluster manager when doing the initialization?

Copy link
Contributor Author

@yanjunxiang-google yanjunxiang-google Apr 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the comments!

  1. When router filter is trying to call processFilters() for upstream filters:

    context.serverFactoryContext().clusterManager(), *upstream_ctx_, prefix);

    it does access context.serverFactoryContext().clusterManager(). IIUC, this is in the much later stage, so the cluster manager instance is already initialized.

  2. In the ClusterInfoImpl c'tor, which is called during cluster_manager_->init()

    factory_context.clusterManager(), upstream_context_, prefix);

factory_.clusterFromProto(cluster, *this, outlier_event_logger_, added_via_api);

It's accessing the cluster_manager_, i.e, *this, before it is initialized. This means ClusterInfoImpl c'tor only needs an instance of cluster manager, and does not require it is initialized.

Unfortunately, for composite filter in upstream chain case, it needs to access the cluster_manager_ from from outside here:

factory_context, server_factory_context.clusterManager(), false, "http", nullptr);

That's the reason we need to make sure the server_factory_context.clusterManager() is setup before it is initialized.

BTW, for composite filter in upstream case, it also only needs an instance of cluster manager, and does not require it is initialized.

@wbpcode
Copy link
Member

wbpcode commented Apr 3, 2024

Please check the CI. Then we can try to merge it and see the effect :)

Signed-off-by: Yanjun Xiang <yanjunxiang@google.com>
Signed-off-by: Yanjun Xiang <yanjunxiang@google.com>
@wbpcode
Copy link
Member

wbpcode commented Apr 8, 2024

LGTM, Thanks for your update. Could you check the CI then we can merge this PR.

@yanjunxiang-google
Copy link
Contributor Author

/retest

1 similar comment
@yanjunxiang-google
Copy link
Contributor Author

/retest

Signed-off-by: Yanjun Xiang <yanjunxiang@google.com>
@yanjunxiang-google
Copy link
Contributor Author

/retest

@yanjunxiang-google
Copy link
Contributor Author

CI is complaining below. Should be non-related to this change.

Code coverage for source/common/stats is lower than limit of 96.6 (96.5)

…tor_cluster_manager

Signed-off-by: Yanjun Xiang <yanjunxiang@google.com>
Copy link
Member

@wbpcode wbpcode left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. thanks for this contribution. cc @htuch for final pass or merge.

@wbpcode wbpcode assigned htuch and unassigned htuch and wbpcode Apr 10, 2024
@yanjunxiang-google
Copy link
Contributor Author

Kind ping!

Copy link
Member

@htuch htuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@htuch htuch merged commit 42d56bc into envoyproxy:main Apr 11, 2024
53 checks passed
mum4k pushed a commit to envoyproxy/nighthawk that referenced this pull request Apr 17, 2024
- sync files
- add `# unique` in .bazelrc to avoid CI failure https://dev.azure.com/cncf/4684fb3d-0389-4e0b-8251-221942316e06/_apis/build/builds/168119/logs/57
- add `cluster_manager_->initialize(bootstrap_);` to fix a CI crash which is introduced by envoyproxy/envoy#33221 

Signed-off-by: Qin Qin <qqin@google.com>
alyssawilk pushed a commit to alyssawilk/envoy that referenced this pull request Apr 29, 2024
This is to address the issue: envoyproxy#33218

Signed-off-by: Yanjun Xiang <yanjunxiang@google.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants