CLOUDP-342858: Running search test under OM 8.2 #467

lsierant · 2025-09-18T21:17:16Z

Summary

This PR adds testing for running search with Ops Manager 8.0.14 and with mongod 8.2.

Changes to search/search_enterprise_tls.py:

Added ability to run the test with either OM or cloud qa.
Added mongod migration steps from 8.0.14 to 8.2.0.
- This adds automated verification that we can have search running and upgrade from mdb 8.0 (having polyfilled searchCoordinator role) to mdb 8.2 (with the role being a built-in one).

Proof of Work

EVG passing for search_enterprise_tls.py in both cloud-qa and om80 variants: evg link

Also the test is only run in om80 variant and not for earlier OM.

Checklist

Have you linked a jira ticket and/or is the ticket in the title?
Have you checked whether your jira ticket required DOCSP changes?
Have you added changelog file?
- use skip-changelog label if not needed
- refer to Changelog files and Release Notes section in CONTRIBUTING.md for more details

fealebenpae · 2025-09-23T10:44:37Z

docker/mongodb-kubernetes-tests/tests/search/search_enterprise_tls.py

+    if ops_manager is not None:
+        ops_manager.update()
+    ops_manager.om_status().assert_reaches_phase(Phase.Running, timeout=1200)
+    ops_manager.appdb_status().assert_reaches_phase(Phase.Running, timeout=600)


Would get_ops_manager() ever return None provided this step is decorated with skip_if_cloud_manager? Or maybe I'm not understanding why we're guarding against ops_manager being None before calling .update() but not for the following method calls?

you're right, removed the if. The annotation is sufficient.

fealebenpae · 2025-09-23T10:46:32Z

docker/mongodb-kubernetes-tests/tests/search/search_enterprise_tls.py

+
+
+@mark.e2e_search_enterprise_tls
+def test_wait_for_database_resource_ready2(mdb: MongoDB):


Why do we wait again here?

Yeah, it looks it's redundant. Removing it.

fealebenpae · 2025-09-23T10:59:17Z

docker/mongodb-kubernetes-tests/tests/search/search_enterprise_tls.py

 from kubetester.mongodb_user import MongoDBUser
+from kubetester.omtester import skip_if_cloud_manager
 from kubetester.phase import Phase
+from mypyc.irbuild.function import check_native_override


This doesn't seem used.

fealebenpae · 2025-09-23T10:59:38Z

docker/mongodb-kubernetes-tests/tests/search/search_enterprise_tls.py

 from tests.common.search.search_tester import SearchTester
-from tests.conftest import get_default_operator
+from tests.conftest import get_default_operator, get_issuer_ca_filepath
+from tests.opsmanager.conftest import custom_om_prev_version


This doesn't seem used.

fealebenpae · 2025-09-23T11:01:57Z

docker/mongodb-kubernetes-tests/tests/search/search_enterprise_tls.py

+@mark.e2e_search_enterprise_tlssh
+def test_search_assert_search_query_2(mdb: MongoDB):
+    get_user_sample_movies_helper(mdb).assert_search_query(retry_timeout=60)
+


I think we should move this as the last step in TestUpgradeMongod to make sure it doesn't get moved around.

mircea-cosbuc · 2025-09-25T14:59:48Z

docker/mongodb-kubernetes-tests/tests/search/om_deployment.py

+
+    resource.set_version(get_custom_om_version())
+    resource.set_appdb_version(get_custom_appdb_version())
+    resource.allow_mdb_rc_versions()


Is 8.2 an RC version? Is it not available without enabling this?

Oh this is not necessary anymore. Thanks for flagging.

github-actions · 2025-09-29T08:54:54Z

⚠️ (this preview might not be accurate if the PR is not rebased on current master branch)

MCK 1.5.0 Release Notes

New Features

Improve automation agent certificate rotation: the agent now restarts automatically when its certificate is renewed, ensuring smooth operation without manual intervention and allowing seamless certificate updates without requiring manual Pod restarts.

Bug Fixes

MongoDBMultiCluster: fix resource stuck in Pending state if any clusterSpecList item has 0 members. After the fix, a value of 0 members is handled correctly, similarly to how it's done in the MongoDB resource.

m1kola

Looks good overall, but the tests are not easy to follow. Left some suggestions.

docker/mongodb-kubernetes-tests/kubetester/mongodb.py

docker/mongodb-kubernetes-tests/tests/search/fixtures/enterprise-replicaset-sample-mflix.yaml

m1kola · 2025-10-01T09:37:40Z

docker/mongodb-kubernetes-tests/tests/search/search_enterprise_tls.py

 @mark.e2e_search_enterprise_tls
 def test_wait_for_database_resource_ready(mdb: MongoDB):
-    mdb.assert_abandons_phase(Phase.Running, timeout=300)
    mdb.assert_reaches_phase(Phase.Running, timeout=300)


Nit: doesn't look like a very useful test. We do nothing here apart from the assertion. In my mind the test should perform some change and then assert some condition.

it is useful, it's checking the state of the world. Not every test must change the world AND observe it as part of the same test case. Especially when you deal with multiple components deployed asynchronously. In this case we create users, search and mongodb which are reconciled by different controllers

Having smaller test cases makes it easier to develop, as you can retry smaller individual functions if needed. Also we should consider all the functions top to bottom as part of one testing steps and story.

In case of search, after it's deployed, the MongoDB reconciler will pick it up and perform changes to mongod - agents will perform rolling restart of mongods, even without changing to MongoDB CR at all.

it is useful, it's checking the state of the world. Not every test must change the world AND observe it as part of the same test case. Especially when you deal with multiple components deployed asynchronously. In this case we create users, search and mongodb which are reconciled by different controllers

Having smaller test cases makes it easier to develop, as you can retry smaller individual functions if needed. Also we should consider all the functions top to bottom as part of one testing steps and story.

This is an end to end test and we shouldn't be concerned about specific controllers or other implementation details. We should focus on how the system behaves as a whole instead.

Perhaps it is easier to develop/debug the tests like this, but tests structured this way leave the reader guessing. Even if you consider all the functions top to bottom it raises a question - why we do this check here? We already check ready state in test_create_database_resource - why check it again? You explained this later, but imagine not having this context.

In case of search, after it's deployed, the MongoDB reconciler will pick it up and perform changes to mongod - agents will perform rolling restart of mongods, even without changing to MongoDB CR at all.

Given that explanation I would argue that in the end to end test we should be:

Deploying everything (the DB resource, user and the search)

Do the ready state waits and other checks after the waiting.

This is probably how the users will be using it (especially with GitOps). And it eliminates the questions about strange checks.

Your comments made me rethink this and I changed that function to test_wait_for_agents_ready. After search is deployed, mongod parameters are added to all processes and the agents are doing their work for at least few minutes.
Due to how we report (or rather not doing properly) changes to Pending when reconciling, we cannot see the agents are doing anything by checking the spec.status alone.

Therefore I've added in the test mdb.get_om_tester().wait_agents_ready() that checks if all the agents reached the goal state (it's exactly the same logic as the operator is waiting on in the reconcile loop).

I hope now it's more clear what's happening.

m1kola · 2025-10-01T09:43:30Z

docker/mongodb-kubernetes-tests/tests/search/search_enterprise_tls.py

+@mark.e2e_search_enterprise_tls
+def test_validate_tls_connections(mdb: MongoDB, mdbs: MongoDBSearch, namespace: str):
+    validate_tls_connections(mdb, mdbs, namespace)
+
+
+@mark.e2e_search_enterprise_tls
+def test_search_restore_sample_database(mdb: MongoDB):
+    get_admin_sample_movies_helper(mdb).restore_sample_database()
+
+
+@mark.e2e_search_enterprise_tls
+def test_search_create_search_index(mdb: MongoDB):
+    get_user_sample_movies_helper(mdb).create_search_index()


I don't understand the correlation between test actions and the assertions here and in some other tests. Perhaps it is too granular split into separate tests?

I think it will be easier to follow if there is a test which performs an action and checks the outcome of the action.

This is difficult to fix tbh and it would require a more broader alignment how we should write the test...
I don't know how to fix this exact case, but it's just part of the testing story: in order to query the search we need the index in the first place.

I'm not going to insist on changing this. But IMO something like this makes more sense:

def test_can_search(mdb: MongoDB): get_admin_sample_movies_helper(mdb).restore_sample_database() get_user_sample_movies_helper(mdb).create_search_index() get_user_sample_movies_helper(mdb).assert_search_query(retry_timeout=60)

You can immediately see that the user should be able to search after creating a search index.

m1kola · 2025-10-01T10:12:19Z

docker/mongodb-kubernetes-tests/tests/search/search_enterprise_tls.py

+        custom_roles = mdb.get_automation_config_tester().automation_config.get("roles", [])
+        assert len(custom_roles) > 0
+        assert "searchCoordinator" in [role["role"] for role in custom_roles]


Not clear to me - why we assert it here?

I had to look at the controller code to understand it better. I see that for DB versions prior to 8.2 we create a polyfill role. Would it then make more sense to assert this after creating the resource with MDB_VERSION_WITHOUT_BUILT_IN_ROLE? E.g. somewhere in test_create_database_resource?

And then in the upgrade test we can test that there is no polyfill role after the upgrade.

This way test actions are bundled together with a relevant assertions which makes it easier to understand. Also I think a comment with an explanation about the polyfill role wouldn't hurt somewhere around role assertions.

I've added some more comments around and reordered the steps a bit - PTAL.

After looking at this again I now feel like ideally we should not be checking for implementation details in an E2E test: we should not be concerned about polyfill role. It might be an useful unit/integration test, but on the E2E level we shouldn't wory about it. We care about an answer to the question - "does search work after upgrade?".

Not requesting you to change this (but I'll welcome it) as there is no harm in the additional checks apart from them adding a bit of extra complexity to the test.

m1kola · 2025-10-01T10:13:31Z

docker/mongodb-kubernetes-tests/tests/search/search_enterprise_tls.py

+    def test_mongod_version(self, mdb: MongoDB):
+        mdb.tester(ca_path=get_issuer_ca_filepath(), use_ssl=True).assert_version(MDB_VERSION_WITHOUT_BUILT_IN_ROLE)


Not sure what is the value of this test as part of the upgrade test.

Perhaps it is also should be an assertion after initial creation in test_create_database_resource?

And then in the upgrade test we will be:

Taking action: upgrading to a new version

Making assertions: checking running state, custom roles, version, etc

I agree that this assertion alone is of little value as it stands as we deploy this version at the beginning and the assertion could be there.

OTOH, I've prepared this TestUpgradeMongod to be a potentially reusable bit of upgrade test/steps. If you look at this assertion as part of the steps grouped under a single TestClass, then it's OK to test whether we run the test under specific circumstances.

Let me think about the overall structure, perhaps some comments would be needed here.

docker/mongodb-kubernetes-tests/kubetester/mongodb.py

docker/mongodb-kubernetes-tests/tests/search/search_enterprise_tls.py

anandsyncs · 2025-10-01T13:03:45Z

docker/mongodb-kubernetes-tests/kubetester/mongodb.py

+        except client.ApiException as e:
+            if e.status == 404:
+                logger.debug("project config map is not specified, trying my-project as the source")
+                src_cm = read_configmap(self.namespace, "my-project", api_client=api_client)


The whole except section is redundant due to default value assigned to src_project_config_map_name

I refactored this part a bit.

m1kola

Left a few comments. I'll be happy if you address them, but I'll be ok with the PR merged as is because I do not consider these comments critical.

m1kola · 2025-10-08T09:07:30Z

docker/mongodb-kubernetes-tests/tests/search/search_enterprise_tls.py

+def test_wait_for_mongod_parameters(mdb: MongoDB):
+    # After search CR is deployed, MongoDB controller will pick it up
+    # and start adding searchCoordinator role and search-related
+    # parameters to the automation config.
+    def check_mongod_parameters():
+        parameters_are_set = True
+        pod_parameters = []
+        for idx in range(mdb.get_members()):
+            mongod_config = yaml.safe_load(
+                KubernetesTester.run_command_in_pod_container(
+                    f"{mdb.name}-{idx}", mdb.namespace, ["cat", "/data/automation-mongod.conf"]
+                )
+            )
+            set_parameter = mongod_config.get("setParameter", {})
+            parameters_are_set = parameters_are_set and (
+                "mongotHost" in set_parameter and "searchIndexManagementHostAndPort" in set_parameter
+            )
+            pod_parameters.append(f"pod {idx} setParameter: {set_parameter}")
+
+        return parameters_are_set, f'Not all pods have mongot parameters set:\n{"\n".join(pod_parameters)}'
+
+    run_periodically(check_mongod_parameters, timeout=200)


I think both this and test_wait_for_agents_ready are testing implementation details. I understand that test_wait_for_agents_ready is necessary (at least for now, while we are not reporting Pending). But do we need to do test_wait_for_mongod_parameters?

I'm trying to look at the flow from the user perspective. For me as a user it is important that I can:

Deploy MongoDB

Deploy MongoDBSearch

Get some indication that everything is ready (we don't have that at the moment unfortunately as you explained in the comment for test_wait_for_agents_ready).

Create search index

And start searching.

Can we just drop test_wait_for_mongod_parameters and verify "I can search" functionality (you already do that later in the test)?

m1kola · 2025-10-08T09:24:30Z

docker/mongodb-kubernetes-tests/tests/search/search_enterprise_tls.py

+@mark.e2e_search_enterprise_tls
+def test_validate_tls_connections(mdb: MongoDB, mdbs: MongoDBSearch, namespace: str):
+    validate_tls_connections(mdb, mdbs, namespace)
+
+
+@mark.e2e_search_enterprise_tls
+def test_search_restore_sample_database(mdb: MongoDB):
+    get_admin_sample_movies_helper(mdb).restore_sample_database()
+
+
+@mark.e2e_search_enterprise_tls
+def test_search_create_search_index(mdb: MongoDB):
+    get_user_sample_movies_helper(mdb).create_search_index()


I'm not going to insist on changing this. But IMO something like this makes more sense:

def test_can_search(mdb: MongoDB): get_admin_sample_movies_helper(mdb).restore_sample_database() get_user_sample_movies_helper(mdb).create_search_index() get_user_sample_movies_helper(mdb).assert_search_query(retry_timeout=60)

You can immediately see that the user should be able to search after creating a search index.

m1kola · 2025-10-08T09:34:58Z

docker/mongodb-kubernetes-tests/tests/search/search_enterprise_tls.py

+        custom_roles = mdb.get_automation_config_tester().automation_config.get("roles", [])
+        assert len(custom_roles) > 0
+        assert "searchCoordinator" in [role["role"] for role in custom_roles]


After looking at this again I now feel like ideally we should not be checking for implementation details in an E2E test: we should not be concerned about polyfill role. It might be an useful unit/integration test, but on the E2E level we shouldn't wory about it. We care about an answer to the question - "does search work after upgrade?".

Not requesting you to change this (but I'll welcome it) as there is no harm in the additional checks apart from them adding a bit of extra complexity to the test.

lsierant self-assigned this Sep 18, 2025

lsierant force-pushed the lsierant/multi-cloud-qa branch from 94a69a9 to b053f96 Compare September 18, 2025 21:18

lsierant force-pushed the lsierant/search-8.2 branch from 0e3d326 to db23821 Compare September 19, 2025 20:29

lsierant force-pushed the lsierant/multi-cloud-qa branch from b053f96 to 8689e2c Compare September 22, 2025 10:34

lsierant force-pushed the lsierant/search-8.2 branch from c882bd5 to 13e1cdb Compare September 22, 2025 18:10

lsierant force-pushed the lsierant/multi-cloud-qa branch 2 times, most recently from 6efa62e to 9d65420 Compare September 22, 2025 20:02

lsierant force-pushed the lsierant/search-8.2 branch 2 times, most recently from d70d9e6 to 4363fa2 Compare September 22, 2025 20:19

lsierant force-pushed the lsierant/multi-cloud-qa branch from 4964f58 to aaa6b41 Compare September 22, 2025 20:27

lsierant force-pushed the lsierant/search-8.2 branch 2 times, most recently from 6d98958 to 23dc861 Compare September 22, 2025 20:28

lsierant force-pushed the lsierant/multi-cloud-qa branch from aaa6b41 to 3aa8a72 Compare September 22, 2025 20:33

lsierant force-pushed the lsierant/search-8.2 branch from 23dc861 to d9f6731 Compare September 22, 2025 20:34

lsierant force-pushed the lsierant/multi-cloud-qa branch from 3aa8a72 to 8f187b9 Compare September 22, 2025 21:44

lsierant force-pushed the lsierant/search-8.2 branch from d9f6731 to a415c55 Compare September 22, 2025 21:44

lsierant marked this pull request as ready for review September 23, 2025 06:19

lsierant requested a review from a team as a code owner September 23, 2025 06:19

lsierant requested review from m1kola, mircea-cosbuc and fealebenpae and removed request for a team September 23, 2025 06:19

lsierant force-pushed the lsierant/search-8.2 branch from 34a3b86 to bdabc37 Compare September 23, 2025 07:20

fealebenpae approved these changes Sep 23, 2025

View reviewed changes

lsierant force-pushed the lsierant/multi-cloud-qa branch from fa19d3d to de8fd69 Compare September 23, 2025 20:30

lsierant force-pushed the lsierant/search-8.2 branch from bdabc37 to c771f53 Compare September 23, 2025 20:30

lsierant force-pushed the lsierant/multi-cloud-qa branch from de8fd69 to 8398705 Compare September 25, 2025 07:39

lsierant force-pushed the lsierant/search-8.2 branch 2 times, most recently from a7d5aeb to 37d09a7 Compare September 25, 2025 09:55

mircea-cosbuc reviewed Sep 25, 2025

View reviewed changes

lsierant changed the base branch from lsierant/multi-cloud-qa to master September 29, 2025 08:48

lsierant force-pushed the lsierant/search-8.2 branch from 37d09a7 to 422790a Compare September 29, 2025 08:54

lsierant force-pushed the lsierant/search-8.2 branch from 2a7af5e to 0f77d1f Compare September 29, 2025 11:21

lsierant added the skip-changelog Use this label in Pull Request to not require new changelog entry file label Sep 29, 2025

lsierant changed the title ~~Running search test under OM 8.2~~ CLOUDP-342858: Running search test under OM 8.2 Sep 29, 2025

lsierant requested a review from mircea-cosbuc September 30, 2025 12:28

m1kola reviewed Oct 1, 2025

View reviewed changes

lucian-tosa approved these changes Oct 1, 2025

View reviewed changes

docker/mongodb-kubernetes-tests/kubetester/mongodb.py Show resolved Hide resolved

docker/mongodb-kubernetes-tests/tests/search/search_enterprise_tls.py Outdated Show resolved Hide resolved

docker/mongodb-kubernetes-tests/tests/search/search_enterprise_tls.py Outdated Show resolved Hide resolved

anandsyncs reviewed Oct 1, 2025

View reviewed changes

lsierant added 6 commits October 3, 2025 08:47

Allow for more than one deployment per test under cloud-qa

8c95711

Removed any changes to the existing tests

84ff3d6

Running search test under OM 8.2

ac898f4

Review fixes

d6d7c70

Incorporated cloud-qa changes

2046f89

Test fixes

48c5aaa

lsierant force-pushed the lsierant/search-8.2 branch from 2d1d13b to d8329a6 Compare October 3, 2025 06:47

Review fixes

e3fa3e7

lsierant force-pushed the lsierant/search-8.2 branch from d8329a6 to e3fa3e7 Compare October 3, 2025 07:16

lsierant added 4 commits October 3, 2025 10:55

Review fixes

853a47f

Lint

8d3f0b3

Review fixes

f763e34

Waiting for the agents to reach goal state

16af9a2

lsierant requested review from m1kola and anandsyncs October 6, 2025 09:32

m1kola approved these changes Oct 8, 2025

View reviewed changes



		@mark.e2e_search_enterprise_tls
		def test_wait_for_database_resource_ready2(mdb: MongoDB):

		def test_mongod_version(self, mdb: MongoDB):
		mdb.tester(ca_path=get_issuer_ca_filepath(), use_ssl=True).assert_version(MDB_VERSION_WITHOUT_BUILT_IN_ROLE)

CLOUDP-342858: Running search test under OM 8.2 #467

Are you sure you want to change the base?

CLOUDP-342858: Running search test under OM 8.2 #467

Uh oh!

Conversation

lsierant commented Sep 18, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Proof of Work

Checklist

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

github-actions bot commented Sep 29, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

MCK 1.5.0 Release Notes

New Features

Bug Fixes

Uh oh!

m1kola left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

m1kola Oct 2, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

m1kola left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

lsierant commented Sep 18, 2025 •

edited

Loading

github-actions bot commented Sep 29, 2025 •

edited

Loading

m1kola Oct 2, 2025 •

edited

Loading