Uninstall OpenSearch after testing on dashboards and modify the exception handle to skip irrelevant exception #3823

zelinh · 2023-07-27T22:34:30Z

Description

Uninstall OpenSearch after testing on Dashboards, and modify the exception handling to skip irrelevant exceptions.

I noticed that our previous fix for the integ tests logs multiple empty errors, like this:

2023-07-26 22:49:42 ERROR    ()
2023-07-26 22:49:42 ERROR    ()
2023-07-26 22:49:42 INFO     Sending SIGKILL to PID 6410
2023-07-26 22:49:42 INFO     Process killed with exit code None
2023-07-26 22:49:42 ERROR    ()
2023-07-26 22:49:42 ERROR    ()
2023-07-26 22:49:42 ERROR    ()
2023-07-26 22:49:42 ERROR    ()

This happens when we terminate the process: we iterate over all running processes and check that the file has been closed before we remove it.
However, we catch multiple irrelevant exceptions and print them with no further context. Some of these exceptions come from a race condition (e.g. a process ends before we go through its open files), and others come from processes we have no access to. We don't need to worry about these. According to my research, skipping them does no harm to our testing, and we only need to log the case guarded by the if statement self.stderr.name == item.path.

One example use case of Process.open_files:
https://stackoverflow.com/questions/64599502/how-can-i-check-if-a-file-is-actively-open-in-another-process-outside-of-my-pyth
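
A minimal sketch of the scan described above, for illustration only and not the code in src/system/process.py. The names find_file_holders and find_file_holders_on_host are hypothetical. The first function takes any iterable of process-like objects plus the exception types to skip, so the idea can be exercised without psutil; the second shows how it might be wired to psutil, assuming psutil is installed and that NoSuchProcess and AccessDenied are the exceptions worth ignoring.

```python
import logging
from typing import Any, Iterable, List, Tuple, Type


def find_file_holders(
    path: str,
    procs: Iterable[Any],
    ignorable: Tuple[Type[BaseException], ...],
) -> List[Any]:
    """Return the processes from `procs` that still have `path` open."""
    holders = []
    for proc in procs:
        try:
            for item in proc.open_files():
                if item.path == path:
                    # Only this case is worth logging: the file is still in use.
                    logging.error(f"{path} is being used by process {proc}")
                    holders.append(proc)
                    break
        except ignorable:
            # The process exited mid-scan or we are not allowed to inspect it.
            continue
    return holders


def find_file_holders_on_host(path: str) -> List[Any]:
    """Scan every process on the host with psutil (imported lazily)."""
    import psutil  # third-party; assumed to be installed

    return find_file_holders(
        path,
        psutil.process_iter(),
        (psutil.NoSuchProcess, psutil.AccessDenied),
    )
```

Listing the ignorable exception types explicitly means an unexpected error still surfaces instead of being swallowed silently.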

Issues Resolved

closes #3524

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Zelin Hao <zelinhao@amazon.com>

codecov · 2023-07-27T22:39:48Z

Codecov Report

Merging #3823 (4fd5967) into main (e5e214d) will increase coverage by 0.48%.
Report is 128 commits behind head on main.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main    #3823      +/-   ##
==========================================
+ Coverage   91.61%   92.10%   +0.48%     
==========================================
  Files         182      187       +5     
  Lines        5438     5674     +236     
==========================================
+ Hits         4982     5226     +244     
+ Misses        456      448       -8

Files Changed	Coverage Δ
src/system/process.py	`95.29% <100.00%> (+2.35%)`	⬆️
src/test_workflow/test_cluster.py	`85.71% <100.00%> (+0.18%)`	⬆️

... and 14 files with indirect coverage changes

peterzhuamazon · 2023-07-27T23:10:05Z

src/test_workflow/test_cluster.py

@@ -87,6 +87,7 @@ def terminate(self) -> None:
        for service in self.dependencies:
            termination_result = service.terminate()
            self.__save_test_result_data(termination_result)
+            service.uninstall()


Could you make sure that uninstalling the dependencies won't affect __save_test_result_data?
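
A small sketch of the ordering in question, using hypothetical names (terminate_dependencies, save_result) rather than the real TestCluster code. It assumes that the result data must be saved while the installed files still exist, so uninstall runs only after the save step.

```python
from typing import Any, Callable, Iterable


def terminate_dependencies(
    services: Iterable[Any],
    save_result: Callable[[Any], None],
) -> None:
    """Terminate each service, save its result data, then uninstall it."""
    for service in services:
        termination_result = service.terminate()
        # Assumption: saving needs the service's files, so it comes first.
        save_result(termination_result)
        service.uninstall()
```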

Signed-off-by: Zelin Hao <zelinhao@amazon.com>

gaiksaya · 2023-07-28T18:15:34Z

src/system/process.py

@@ -66,9 +66,9 @@ def terminate(self) -> int:
                try:
                    for item in proc.open_files():


I believe this happened because we are checking all the processes running on the host. The host has a number of processes running that may or may not be related to testing. Can you narrow down which processes and open files we are checking?

I think that is how psutil works: it iterates over all running processes, even though some of them are not related to our testing at all. I don't think there is a way to narrow this down to specific processes across platforms. Also, some of the exceptions may be caused by a race condition, and I think we can pass over them since they are already outdated.

You can get the process ID from line 48. Could you check whether the process is a subprocess of it, rather than iterating over all the existing processes?

@gaiksaya Basically, on Windows, according to my research, python.exe was the process holding the file and preventing its removal, not the OpenSearch/Dashboards process, so that process ID is not the one we want to check.

The PID issue needs separate research and a bug fix, as it was initially designed only for tar, not for any other distribution that can be handled by external means.

As of now the PID is not accurate unless the distribution is tar.

Thanks.
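
For reference, the reviewer's suggestion of checking only one process tree could be sketched as below. This is an illustration under assumptions, not what the PR does: the function names are hypothetical, psutil is assumed to be installed, and as noted in this thread the approach would miss the Windows case where python.exe holds the file, and it relies on a PID that is currently accurate only for tar.

```python
from typing import Any, List


def processes_in_tree(root: Any) -> List[Any]:
    """Return `root` followed by all of its descendant processes."""
    return [root] + list(root.children(recursive=True))


def process_tree_for_pid(pid: int) -> List[Any]:
    """Build the process tree for `pid` using psutil (imported lazily)."""
    import psutil  # third-party; assumed to be installed

    return processes_in_tree(psutil.Process(pid))
```

The resulting list could then be scanned for open files in place of every process on the host.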

gaiksaya · 2023-07-28T18:16:10Z

src/system/process.py

-                    logging.error(f"{err.args}")
+                            logging.error(f"stdout {item} is being used by process {proc}")
+                except Exception:
+                    pass


Catching an exception and not doing anything with it does not make sense. Can you elaborate on what you are trying to do here?

The exceptions caught here mainly come from processes that are not related to our testing, or that had already exited during the iteration, so we may not have permission to inspect them. I have seen quite a few examples of using this approach to check whether a certain file is closed or not used by any process, e.g. https://stackoverflow.com/questions/64599502/how-can-i-check-if-a-file-is-actively-open-in-another-process-outside-of-my-pyth. I think it's safe to do this.

Signed-off-by: Zelin Hao <zelinhao@amazon.com>

src/test_workflow/test_cluster.py

Signed-off-by: Zelin Hao <zelinhao@amazon.com>

Narrow the exception handle cases to avoid unwanted print

c5c3fb1

Signed-off-by: Zelin Hao <zelinhao@amazon.com>

zelinh requested review from dblock, peterzhuamazon, bbarani, gaiksaya, rishabh6788, jordarlu, prudhvigodithi, Divyaasm and tianleh as code owners July 27, 2023 22:34

github-actions bot added the distinguished-contributor label Jul 27, 2023

peterzhuamazon reviewed Jul 27, 2023


Update fix for the test_cluster to save all local cluster logs

a885946

Signed-off-by: Zelin Hao <zelinhao@amazon.com>

zelinh force-pushed the win-test-more-fix branch from 707fd13 to a885946 Compare July 27, 2023 23:47

zelinh marked this pull request as draft July 27, 2023 23:54

gaiksaya reviewed Jul 28, 2023


Update tests

85d5664

Signed-off-by: Zelin Hao <zelinhao@amazon.com>

zelinh force-pushed the win-test-more-fix branch from bc1475b to 85d5664 Compare July 28, 2023 21:39

zelinh marked this pull request as ready for review July 28, 2023 21:49

gaiksaya approved these changes Sep 13, 2023


peterzhuamazon reviewed Sep 13, 2023


Move uninstall

4fd5967

Signed-off-by: Zelin Hao <zelinhao@amazon.com>

zelinh force-pushed the win-test-more-fix branch from 6927a10 to 4fd5967 Compare September 13, 2023 23:52

peterzhuamazon approved these changes Sep 14, 2023


peterzhuamazon merged commit 43caf3d into opensearch-project:main Sep 14, 2023

peterzhuamazon deleted the win-test-more-fix branch September 14, 2023 16:20
