DAOS-16097 vos: assign persistent DTX entry in vos_dtx_prepared #14708
Ticket title is 'erasurecode/aggregation.py:EcodAggregationOff.test_ec_aggregation_time - failed to destroy container TestContainer_12: DER_IO(-2001)'
Test stage Fault injection testing on EL 8.8 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14708/2/execution/node/1116/log
Test stage Functional Hardware Large MD on SSD completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14708/3/execution/node/1322/log
Test stage Functional Hardware Large completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14708/7/execution/node/1322/log
Test stage Functional Hardware Large completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14708/10/execution/node/1368/log
Test stage Functional Hardware Large MD on SSD completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14708/10/execution/node/1347/log
Test stage Unit Test on EL 8.8 completed with status UNSTABLE. https://build.hpdd.intel.com/job/daos-stack/job/daos//view/change-requests/job/PR-14708/11/testReport/
Test stage NLT on EL 8.8 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14708/11/execution/node/816/log
Test stage Unit Test with memcheck on EL 8.8 completed with status UNSTABLE. https://build.hpdd.intel.com/job/daos-stack/job/daos//view/change-requests/job/PR-14708/11/testReport/
Test stage Functional Hardware Medium completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14708/14/execution/node/1447/log
Test stage Build on Leap 15.5 with Intel-C and TARGET_PREFIX completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14708/15/execution/node/382/log
Test stage Build on EL 8 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14708/15/execution/node/388/log
Test stage Build RPM on EL 9 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14708/15/execution/node/374/log
Test stage Build RPM on EL 8 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14708/15/execution/node/375/log
Test stage Build RPM on Leap 15.5 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14708/15/execution/node/376/log
Test stage Build DEB on Ubuntu 20.04 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14708/15/execution/node/367/log
Test stage Functional Hardware Medium completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14708/16/execution/node/1446/log
Test stage Functional on EL 8.8 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14708/17/execution/node/1184/log
Assign persistent DTX entry only via vos_dtx_prepared() that will initialize such DTX entry immediately to avoid any potential race between persistently allocating DTX entry and initializing it.

Add some check (for DTX flag) after DTX locally prepared.

Do not allow current transaction to deregister the record that is referenced by another prepared (but non-committed) DTX.

Allow-unstable-test: true
Signed-off-by: Fan Yong <fan.yong@intel.com>
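The third rule in the commit message (no deregistering a record that another prepared, non-committed DTX still references) can be illustrated with a minimal sketch. Every type and function name below is hypothetical and invented for illustration; none of them is taken from the DAOS source, and the real check in VOS may be structured quite differently.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; none of these names come from DAOS. */
enum sketch_dtx_state {
	SKETCH_DTX_PREPARED,	/* prepared locally, not yet committed */
	SKETCH_DTX_COMMITTED
};

struct sketch_dtx {
	unsigned int		id;
	enum sketch_dtx_state	state;
};

struct sketch_record {
	const struct sketch_dtx	*ref;	/* DTX referencing this record, or NULL */
};

/*
 * Return true if the current transaction may deregister the record.
 * Deregistration is refused only when a different DTX that is still
 * prepared (non-committed) references the record.
 */
static bool
sketch_may_deregister(const struct sketch_record *rec, const struct sketch_dtx *cur)
{
	if (rec->ref == NULL || rec->ref == cur)
		return true;
	return rec->ref->state != SKETCH_DTX_PREPARED;
}
```

In this sketch the caller would skip the deregistration (or report an error) when the function returns false; how the actual patch reacts in that case is not stated in the commit message.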
Test stage Functional Hardware Medium completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14708/18/execution/node/1416/log
osa_offline_reintegration failed for DAOS-16007
Ping reviewers, thanks!
@jolivier23 @NiuYawei @janekmi, would you please help review the patch? Thanks!
As far as I can tell this change set looks correct. I was focusing on the persistency correctness though.
Sorry for joining late to the party. I would still appreciate responses to my questions. I am very eager to use every opportunity to learn a bit more about DTX. Thanks!
dck.oid = dsp->dsp_oid;
dck.dkey_hash = dsp->dsp_dkey_hash;
rc = dtx_commit(cont, &pdte, &dck, 1);
if (rc < 0 && rc != -DER_NONEXIST && for_io)
Nitpick: `&& for_io` seems redundant considering the whole block is under this condition.
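The nitpick can be checked with a small sketch that compares the condition as quoted with the shortened form. The function names and the numeric value of `SKETCH_DER_NONEXIST` are placeholders chosen for this illustration, not definitions copied from DAOS headers, and the equivalence holds only because the test sits inside an `if (for_io)` block.

```c
#include <stdbool.h>

/* Placeholder error code for this sketch only; not taken from DAOS headers. */
#define SKETCH_DER_NONEXIST 1005

/* Condition as written in the quoted hunk. */
static bool
keep_for_retry_original(int rc, bool for_io)
{
	return rc < 0 && rc != -SKETCH_DER_NONEXIST && for_io;
}

/*
 * Same condition with the `&& for_io` term dropped. It matches the
 * original only when evaluated where for_io is already known to be true.
 */
static bool
keep_for_retry_simplified(int rc)
{
	return rc < 0 && rc != -SKETCH_DER_NONEXIST;
}
```

For any `rc`, `keep_for_retry_original(rc, true)` and `keep_for_retry_simplified(rc)` agree, which is the reviewer's point.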
if (for_io) {
	rc = vos_dtx_check(cont->sc_hdl, &dsp->dsp_xid, NULL, NULL, NULL, false);
	switch(rc) {
	case DTX_ST_COMMITTABLE:
		dck.oid = dsp->dsp_oid;
		dck.dkey_hash = dsp->dsp_dkey_hash;
		rc = dtx_commit(cont, &pdte, &dck, 1);
		if (rc < 0 && rc != -DER_NONEXIST && for_io)
			d_list_add_tail(&dsp->dsp_link, cmt_list);
		else
			dtx_dsp_free(dsp);
		continue;
	case DTX_ST_COMMITTED:
	case -DER_NONEXIST: /* Aborted */
		dtx_dsp_free(dsp);
		continue;
	default:
		break;
	}
}
I am not too familiar with this part of DAOS, but it seems to me this change is not explained in the commit message. Or is it?
rc = dtx_leader_begin(ioc.ioc_vos_coh, &odm->odm_xid, &epoch,
		      dcts[0].dct_shards[dmi->dmi_tgt_id].dcs_nr, version,
Is this bit not referred to in the commit message?