DAOS-15930 test: extend_simple add FI required macro call #14667

Merged
merged 1 commit into from
Jul 3, 2024

Conversation

kccain
Contributor

@kccain kccain commented Jun 28, 2024

In release 2.6 RC1 testing certain daos_test -B (extend_simple)
cases failed (EXTEND6, EXTEND8, EXTEND14). It was found that, although
some extend tests use fault injection via dfs_extend_internal(), they
did not invoke FAULT_INJECTION_REQUIRED(). This can result in
unpredictable execution (e.g., hangs) when running with a daos build
whose BUILD_TYPE=release.

This change adds the FAULT_INJECTION_REQUIRED() macro call,
and makes some minor changes to the test code to make the output
reflect what steps the test executes - to facilitate debugging.
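
As an illustration of the guard described above, here is a minimal sketch in C. It is not the DAOS implementation: `fi_enabled`, `fault_injection_enabled()`, `FI_REQUIRED_OR_SKIP()`, `enum test_result`, and `extend_with_fault()` are hypothetical names invented for this example. It only shows the general idea of leaving a test early when the build cannot inject faults.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical result codes for this sketch. */
enum test_result { TEST_PASSED, TEST_SKIPPED };

/* Hypothetical stand-in for a build-dependent switch: assume a release
 * build has fault injection compiled out, so this would stay false. */
static bool fi_enabled = false;

static bool
fault_injection_enabled(void)
{
	return fi_enabled;
}

/* Hypothetical guard modeled on the idea behind FAULT_INJECTION_REQUIRED():
 * return from the calling test early if faults cannot be injected. */
#define FI_REQUIRED_OR_SKIP()                                              \
	do {                                                               \
		if (!fault_injection_enabled()) {                          \
			printf("fault injection unavailable, skipping\n"); \
			return TEST_SKIPPED;                               \
		}                                                          \
	} while (0)

/* Hypothetical test body that depends on an injected fault. */
static enum test_result
extend_with_fault(void)
{
	FI_REQUIRED_OR_SKIP();
	/* A real test would arm a fault here and wait for its effect.
	 * Without the guard, the test could wait for a fault that never
	 * fires on a build where injection is disabled. */
	printf("running extend step with injected fault\n");
	return TEST_PASSED;
}
```

With `fi_enabled` left at `false`, calling `extend_with_fault()` returns `TEST_SKIPPED` before any fault-dependent step runs. Setting it to `true` lets the body execute and return `TEST_PASSED`.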

Skip-unit-tests: true
Skip-fault-injection-test: true
Test-tag: test_daos_extend_simple
faults-enabled: false

Before requesting gatekeeper:

  • Two review approvals and any prior change requests have been resolved.
  • Testing is complete and all tests passed or there is a reason documented in the PR why it should be force landed and forced-landing tag is set.
  • Features: (or Test-tag*) commit pragma was used or there is a reason documented that there are no appropriate tags for this PR.
  • Commit messages follow the guidelines outlined here.
  • Any tests skipped by the ticket being addressed have been run and passed in the PR.

Gatekeeper:

  • You are the appropriate gatekeeper to be landing the patch.
  • The PR has 2 reviews by people familiar with the code, including appropriate owners.
  • Githooks were used. If not, request that user install them and check copyright dates.
  • Checkpatch issues are resolved. Pay particular attention to ones that will show up on future PRs.
  • All builds have passed. Check non-required builds for any new compiler warnings.
  • Sufficient testing is done. Check feature pragmas and test tags and that tests skipped for the ticket are run and now pass with the changes.
  • If applicable, the PR has addressed any potential version compatibility issues.
  • Check the target branch. If it is master branch, should the PR go to a feature branch? If it is a release branch, does it have merge approval in the JIRA ticket?
  • Extra checks if forced landing is requested
    • Review comments are sufficiently resolved, particularly by prior reviewers that requested changes.
    • No new NLT or valgrind warnings. Check the classic view.
    • Quick-build or Quick-functional is not used.
  • Fix the commit message upon landing. Check the standard here. Edit it to create a single commit. If necessary, ask submitter for a new summary.


Ticket title is 'daos_test/suite.py:DaosCoreTest.test_daos_extend_simple - timeout waiting for rebuild'
Status is 'Reopened'
Labels: '2.6.0rc1,ci_impact,intermittent_test_failure,pr_test,scrubbed_2.8'
https://daosio.atlassian.net/browse/DAOS-15930

In release 2.6 RC1 testing certain daos_test -B (extend_simple)
cases failed (EXTEND6, EXTEND8, EXTEND14). It was found that, although
some extend tests use fault injection via dfs_extend_internal(), they
did not invoke FAULT_INJECTION_REQUIRED(). This can result in
unpredictable execution (e.g., hangs) when running with a daos build
whose BUILD_TYPE=release.

This change adds the FAULT_INJECTION_REQUIRED() macro call,
and makes some minor changes to the test code to make the output
reflect what steps the test executes - to facilitate debugging.

Skip-unit-tests: true
Skip-fault-injection-test: true
Test-tag: test_daos_extend_simple
faults-enabled: false

Signed-off-by: Kenneth Cain <kenneth.c.cain@intel.com>
@kccain kccain force-pushed the kccain/daos_15930_master branch from 80c788b to c6e4f9c Compare June 28, 2024 18:06
@kccain
Contributor Author

kccain commented Jun 28, 2024

Jenkins build https://build.hpdd.intel.com/job/daos-stack/job/daos/job/PR-14667/3/ in progress at priority 2 as discussed with @sbpeirce . I'll proactively push a release/2.6 branch cherry-pick PR.

@kccain kccain added the forced-landing The PR has known failures or has intentionally reduced testing, but should still be landed. label Jun 28, 2024
@daosbuild1
Collaborator

Test stage Functional Hardware Medium UCX Provider completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14667/3/execution/node/1036/log

@kccain
Contributor Author

kccain commented Jun 29, 2024

Test stage Functional Hardware Medium UCX Provider completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14667/3/execution/node/1036/log

This seems to be an instance of cart_ctl command failure in UCX environments noted in this ticket https://daosio.atlassian.net/browse/DAOS-16008

The other provider tests (verbs, and verbs MD on SSD) passed, see https://build.hpdd.intel.com/job/daos-stack/job/daos/job/PR-14667/3/testReport/FTEST_daos_test/DaosCoreTest/

And the same code in the other release/2.6 patch passed the test in all 3 environments (ucx, verbs, verbs MD on SSD)

@kccain kccain marked this pull request as ready for review June 29, 2024 01:16
@kccain kccain requested review from liuxuezhao and liw June 29, 2024 01:16
@kccain kccain requested a review from a team July 2, 2024 17:45
@daltonbohning daltonbohning merged commit f3f959a into master Jul 3, 2024
13 of 15 checks passed
@daltonbohning daltonbohning deleted the kccain/daos_15930_master branch July 3, 2024 16:04
grom72 pushed a commit to grom72/daos that referenced this pull request Jul 25, 2024
…#14667)

In release 2.6 RC1 testing certain daos_test -B (extend_simple)
cases failed (EXTEND6, EXTEND8, EXTEND14). It was found that, although
some extend tests use fault injection via dfs_extend_internal(), they
did not invoke FAULT_INJECTION_REQUIRED(). This can result in
unpredictable execution (e.g., hangs) when running with a daos build
whose BUILD_TYPE=release.

This change adds the FAULT_INJECTION_REQUIRED() macro call,
and makes some minor changes to the test code to make the output
reflect what steps the test executes - to facilitate debugging.

Signed-off-by: Kenneth Cain <kenneth.c.cain@intel.com>
daltonbohning pushed a commit that referenced this pull request Aug 7, 2024
…14668)

In release 2.6 RC1 testing certain daos_test -B (extend_simple)
cases failed (EXTEND6, EXTEND8, EXTEND14). It was found that, although
some extend tests use fault injection via dfs_extend_internal(), they
did not invoke FAULT_INJECTION_REQUIRED(). This can result in
unpredictable execution (e.g., hangs) when running with a daos build
whose BUILD_TYPE=release.

This change adds the FAULT_INJECTION_REQUIRED() macro call,
and makes some minor changes to the test code to make the output
reflect what steps the test executes - to facilitate debugging.

Signed-off-by: Kenneth Cain <kenneth.c.cain@intel.com>
Labels
forced-landing The PR has known failures or has intentionally reduced testing, but should still be landed.

5 participants