Skip to content

Commit 77dbbcd

Browse files
djwonggregkh
authored andcommitted
xfs: log new intent items created as part of finishing recovered intent items
[ Upstream commit 93293bc ] During a code inspection, I found a serious bug in the log intent item recovery code when an intent item cannot complete all the work and decides to requeue itself to get that done. When this happens, the item recovery creates a new incore deferred op representing the remaining work and attaches it to the transaction that it allocated. At the end of _item_recover, it moves the entire chain of deferred ops to the dummy parent_tp that xlog_recover_process_intents passed to it, but fail to log a new intent item for the remaining work before committing the transaction for the single unit of work. xlog_finish_defer_ops logs those new intent items once recovery has finished dealing with the intent items that it recovered, but this isn't sufficient. If the log is forced to disk after a recovered log item decides to requeue itself and the system goes down before we call xlog_finish_defer_ops, the second log recovery will never see the new intent item and therefore has no idea that there was more work to do. It will finish recovery leaving the filesystem in a corrupted state. The same logic applies to /any/ deferred ops added during intent item recovery, not just the one handling the remaining work. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
1 parent bbf30f4 commit 77dbbcd

4 files changed

Lines changed: 32 additions & 4 deletions

File tree

fs/xfs/libxfs/xfs_defer.c

Lines changed: 24 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -186,8 +186,9 @@ xfs_defer_create_intent(
186186
{
187187
const struct xfs_defer_op_type *ops = defer_op_types[dfp->dfp_type];
188188

189-
dfp->dfp_intent = ops->create_intent(tp, &dfp->dfp_work,
190-
dfp->dfp_count, sort);
189+
if (!dfp->dfp_intent)
190+
dfp->dfp_intent = ops->create_intent(tp, &dfp->dfp_work,
191+
dfp->dfp_count, sort);
191192
}
192193

193194
/*
@@ -390,6 +391,7 @@ xfs_defer_finish_one(
390391
list_add(li, &dfp->dfp_work);
391392
dfp->dfp_count++;
392393
dfp->dfp_done = NULL;
394+
dfp->dfp_intent = NULL;
393395
xfs_defer_create_intent(tp, dfp, false);
394396
}
395397

@@ -552,3 +554,23 @@ xfs_defer_move(
552554

553555
xfs_defer_reset(stp);
554556
}
557+
558+
/*
559+
* Prepare a chain of fresh deferred ops work items to be completed later. Log
560+
* recovery requires the ability to put off until later the actual finishing
561+
* work so that it can process unfinished items recovered from the log in
562+
* correct order.
563+
*
564+
* Create and log intent items for all the work that we're capturing so that we
565+
* can be assured that the items will get replayed if the system goes down
566+
* before log recovery gets a chance to finish the work it put off. Then we
567+
* move the chain from stp to dtp.
568+
*/
569+
void
570+
xfs_defer_capture(
571+
struct xfs_trans *dtp,
572+
struct xfs_trans *stp)
573+
{
574+
xfs_defer_create_intents(stp);
575+
xfs_defer_move(dtp, stp);
576+
}

fs/xfs/libxfs/xfs_defer.h

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -63,4 +63,10 @@ extern const struct xfs_defer_op_type xfs_rmap_update_defer_type;
6363
extern const struct xfs_defer_op_type xfs_extent_free_defer_type;
6464
extern const struct xfs_defer_op_type xfs_agfl_free_defer_type;
6565

66+
/*
67+
* Functions to capture a chain of deferred operations and continue them later.
68+
* This doesn't normally happen except log recovery.
69+
*/
70+
void xfs_defer_capture(struct xfs_trans *dtp, struct xfs_trans *stp);
71+
6672
#endif /* __XFS_DEFER_H__ */

fs/xfs/xfs_bmap_item.c

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -534,7 +534,7 @@ xfs_bui_item_recover(
534534
xfs_bmap_unmap_extent(tp, ip, &irec);
535535
}
536536

537-
xfs_defer_move(parent_tp, tp);
537+
xfs_defer_capture(parent_tp, tp);
538538
error = xfs_trans_commit(tp);
539539
xfs_iunlock(ip, XFS_ILOCK_EXCL);
540540
xfs_irele(ip);

fs/xfs/xfs_refcount_item.c

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -555,7 +555,7 @@ xfs_cui_item_recover(
555555
}
556556

557557
xfs_refcount_finish_one_cleanup(tp, rcur, error);
558-
xfs_defer_move(parent_tp, tp);
558+
xfs_defer_capture(parent_tp, tp);
559559
error = xfs_trans_commit(tp);
560560
return error;
561561

0 commit comments

Comments
 (0)