DAOS-16265 test: Fix erasurecode/rebuild_fio.py out of space #15020
The erasurecode/rebuild_fio.py test runs out of space in self.test_dir due to the same path being used for the control metadata path in MD on SSD mode. The test log file is also quite large with 24 test variants. Skip-unit-tests: true Skip-fault-injection-test: true Test-tag: EcodFioRebuild Skip-func-hw-test-large-md-on-ssd: false Required-githooks: true Signed-off-by: Phil Henderson <phillip.henderson@intel.com>
Ticket title is '[12-24]-./erasurecode/rebuild_fio.py:EcodFioRebuild.test_ec_online_rebuild_fio tests fail due to daos_server startup problem.'
Test stage Functional Hardware Large completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-15020/1/execution/node/962/log
Test stage Functional Hardware Large MD on SSD completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-15020/2/execution/node/946/log
Test stage Functional Hardware Large completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-15020/2/execution/node/962/log
Required-githooks: true
Skip-unit-tests: true Skip-fault-injection-test: true Test-tag: EcodFioRebuild Required-githooks: true Signed-off-by: Phil Henderson <phillip.henderson@intel.com>
Verified the change by reducing the threshold to 3% in https://build.hpdd.intel.com/job/daos-stack/job/daos/job/PR-15020/7/artifact/Functional%20Hardware%20Large/erasurecode/rebuild_fio.py/job.log:
Skip-unit-tests: true Skip-fault-injection-test: true Test-tag: EcodFioRebuild Skip-func-hw-test-large-md-on-ssd: false Required-githooks: true Signed-off-by: Phil Henderson <phillip.henderson@intel.com>
This just looks like debugging. How is erasurecode/rebuild_fio.py being fixed?
This is optimizing code we've already implemented to debug an erasurecode/rebuild_fio.py issue where we would run out of space in the testing directory. The optimization is to only provide detail about what files are consuming space on nodes that exceed the threshold instead of all the hosts. The original problem is no longer being seen. In fact, even in the most recent https://build.hpdd.intel.com/job/daos-stack/job/daos/job/PR-15020/8/artifact/Functional%20Hardware%20Large%20MD%20on%20SSD/erasurecode/rebuild_fio.py/job.log run the highest use percentage is 7% by the 24th test variant. It also adds an option to adjust the threshold via the test yaml (or extra test yaml). One additional option we could enable is to completely bypass the check if the |
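For illustration only, the "report detail only for hosts above the threshold" behavior described above could be sketched as follows. This is not the code in this PR: `hosts_over_threshold`, `usage_percent`, and the host names are hypothetical, and the 3% and 7% figures are simply the values mentioned in this thread.

```python
import shutil
from typing import Dict, List


def usage_percent(path: str) -> float:
    """Return the used-space percentage of the file system holding path."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def hosts_over_threshold(usage_by_host: Dict[str, float], threshold: float) -> List[str]:
    """Return the hosts whose usage percentage exceeds the threshold.

    Only these hosts would then get the more expensive per-file breakdown.
    """
    return sorted(host for host, pct in usage_by_host.items() if pct > threshold)


# Hypothetical per-host usage of the test directory, in percent.
sample = {"node-1": 7.0, "node-2": 2.0, "node-3": 3.0}
flagged = hosts_over_threshold(sample, threshold=3.0)
print(flagged)
```

With a threshold read from the test yaml, a test could lower the value (as was done for verification with 3%) to exercise the reporting path without actually filling the directory.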
Prevent accumulating large server log files caused by temporarily enabling the DEBUG log mask while creating or destroying pools. Skip-unit-tests: true Skip-fault-injection-test: true Test-tag: EcodFioRebuild EcodOnlineMultFail Skip-func-hw-test-large-md-on-ssd: false Required-githooks: true Signed-off-by: Phil Henderson <phillip.henderson@intel.com>
All tests passed in https://build.hpdd.intel.com/job/daos-stack/job/daos/job/PR-15020/9/testReport/
A fix is now included that stops the DEBUG log mask from being enabled when creating/destroying pools.
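As a rough sketch of the idea (not the actual change in this PR), making the temporary DEBUG mask opt-in might look like the following. The `optional_debug_mask` helper, its `set_mask` callback, and the `"INFO"` default are all assumptions made for the example, not DAOS test framework APIs.

```python
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def optional_debug_mask(set_mask: Callable[[str], None], enable: bool,
                        debug: str = "DEBUG", default: str = "INFO") -> Iterator[None]:
    """Hypothetical helper: raise the log mask only when explicitly enabled."""
    if enable:
        set_mask(debug)
    try:
        yield
    finally:
        # Restore the default mask only if it was changed above.
        if enable:
            set_mask(default)


calls = []
# With enable=False the pool operation runs without touching the log mask,
# so no extra DEBUG output accumulates in the server logs.
with optional_debug_mask(calls.append, enable=False):
    pass  # the pool create/destroy would happen here
print(calls)
```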
Prevent accumulating large server log files caused by temporarily enabling the DEBUG log mask while creating or destroying pools. Skip-unit-tests: true Skip-fault-injection-test: true Test-tag: EcodFioRebuild EcodOnlineMultFail Skip-func-hw-test-large-md-on-ssd: false Signed-off-by: Phil Henderson <phillip.henderson@intel.com>
Prevent accumulating large server log files caused by temporarily
enabling the DEBUG log mask while creating or destroying pools.
Skip-unit-tests: true
Skip-fault-injection-test: true
Test-tag: EcodFioRebuild
Skip-func-hw-test-large-md-on-ssd: false
Required-githooks: true
Before requesting gatekeeper:
Features: (or Test-tag*) commit pragma was used or there is a reason documented that there are no appropriate tags for this PR.
Gatekeeper: