DAOS-18950 object: migrate_cont_iter_cb() error, drop ds_pool #18224
Draft
kccain wants to merge 1 commit into
Before this change, if migrate_cont_iter_cb() gets an error (e.g., -DER_CONT_NONEXIST) while launching cont_fetch_start_ult(), a reference taken on struct ds_pool (in fetch_arg->pool) still exists and is not dropped in the error handling path. Instead, migrate_cont_iter_cb() returns the error directly, and the cleanup in cont_fetch_end_ult is not run. Later, a pool destroy was observed to hang in ds_pool_stop() because the loop waiting for all references to be dropped never finished.

With this change, the error handling jumps to the free: label, where cont_fetch_end_ult is invoked, which drops the reference to the ds_pool. This is expected to avoid a lingering reference on the ds_pool and the resulting pool destroy hang.

Features: rebuild ec_online_rebuild ec_offline_rebuild

Signed-off-by: Kenneth Cain <kenneth.cain@hpe.com>
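The leak and the fix described above can be illustrated with a minimal, self-contained sketch. This is not the DAOS code: `struct pool`, `pool_get()`, `pool_put()`, `launch_fetch()`, `fetch_end()`, `iter_cb_leaky()` and `iter_cb_fixed()` are hypothetical stand-ins that model only the reference count, and the `launch_rc` parameter exists purely to simulate a failed launch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct ds_pool: only the refcount is modeled. */
struct pool {
	int refs;
};

/* Hypothetical stand-in for the fetch argument that holds the pool reference. */
struct fetch_arg {
	struct pool *pool;
};

static void pool_get(struct pool *p) { p->refs++; }

static void pool_put(struct pool *p)
{
	assert(p->refs > 0);
	p->refs--;
}

/* Simulated launch step: returns launch_rc so callers can force a failure. */
static int launch_fetch(struct fetch_arg *arg, int launch_rc)
{
	(void)arg;
	return launch_rc;
}

/* Cleanup step: drops the pool reference held by the argument, if any. */
static void fetch_end(struct fetch_arg *arg)
{
	if (arg->pool != NULL) {
		pool_put(arg->pool);
		arg->pool = NULL;
	}
}

/* Old shape: an early return on error skips the cleanup and leaks a reference. */
int iter_cb_leaky(struct pool *p, int launch_rc)
{
	struct fetch_arg arg = { .pool = p };
	int rc;

	pool_get(p);
	rc = launch_fetch(&arg, launch_rc);
	if (rc != 0)
		return rc; /* reference taken above is never dropped */

	fetch_end(&arg);
	return 0;
}

/* New shape: every exit path goes through the cleanup label. */
int iter_cb_fixed(struct pool *p, int launch_rc)
{
	struct fetch_arg arg = { .pool = p };
	int rc;

	pool_get(p);
	rc = launch_fetch(&arg, launch_rc);
	if (rc != 0)
		goto free; /* label name mirrors the one mentioned in the description */

free:
	fetch_end(&arg);
	return rc;
}
```

In this model, a pool that starts with one reference ends with two after a failed `iter_cb_leaky()` call, so a loop that waits for the extra references to go away would never finish. After a failed `iter_cb_fixed()` call the count is back to one.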
Ticket title is 'erasurecode/online_rebuild_mdtest.py:EcodOnlineRebuildMdtest.test_ec_online_rebuild_mdtest - tearDown pool destroy time out'