Skip to content

Commit 5a605fd

Browse files
Darrick J. Wongdchinner
authored andcommitted
xfs: recalculate free rt extents after log recovery
I've been observing periodic corruption reports from xfs_scrub involving the free rt extent counter (frextents) while running xfs/141. That test uses an error injection knob to induce a torn write to the log, and an arbitrary number of recovery mounts, frextents will count fewer free rt extents than can be found the rtbitmap. The root cause of the problem is a combination of the misuse of sb_frextents in the incore mount to reflect both incore reservations made by running transactions as well as the actual count of free rt extents on disk. The following sequence can reproduce the undercount: Thread 1 Thread 2 xfs_trans_alloc(rtextents=3) xfs_mod_frextents(-3) <blocks> xfs_attr_set() xfs_bmap_attr_addfork() xfs_add_attr2() xfs_log_sb() xfs_sb_to_disk() xfs_trans_commit() <log flushed to disk> <log goes down> Note that thread 1 subtracts 3 from sb_frextents even though it never commits to using that space. Thread 2 writes the undercounted value to the ondisk superblock and logs it to the xattr transaction, which is then flushed to disk. At next mount, log recovery will find the logged superblock and write that back into the filesystem. At the end of log recovery, we reread the superblock and install the recovered undercounted frextents value into the incore superblock. From that point on, we've effectively leaked thread 1's transaction reservation. The correct fix for this is to separate the incore reservation from the ondisk usage, but that's a matter for the next patch. Because the kernel has been logging superblocks with undercounted frextents for a very long time and we don't demand that sysadmins run xfs_repair after a crash, fix the undercount by recomputing frextents after log recovery. Gating this on log recovery is a reasonable balance (I think) between correcting the problem and slowing down every mount attempt. Note that xfs_repair will fix undercounted frextents. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
1 parent f34061f commit 5a605fd

3 files changed

Lines changed: 71 additions & 9 deletions

File tree

fs/xfs/xfs_mount.c

Lines changed: 32 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -468,6 +468,8 @@ STATIC int
468468
xfs_check_summary_counts(
469469
struct xfs_mount *mp)
470470
{
471+
int error = 0;
472+
471473
/*
472474
* The AG0 superblock verifier rejects in-progress filesystems,
473475
* so we should never see the flag set this far into mounting.
@@ -506,11 +508,32 @@ xfs_check_summary_counts(
506508
* superblock to be correct and we don't need to do anything here.
507509
* Otherwise, recalculate the summary counters.
508510
*/
509-
if ((!xfs_has_lazysbcount(mp) || xfs_is_clean(mp)) &&
510-
!xfs_fs_has_sickness(mp, XFS_SICK_FS_COUNTERS))
511-
return 0;
511+
if ((xfs_has_lazysbcount(mp) && !xfs_is_clean(mp)) ||
512+
xfs_fs_has_sickness(mp, XFS_SICK_FS_COUNTERS)) {
513+
error = xfs_initialize_perag_data(mp, mp->m_sb.sb_agcount);
514+
if (error)
515+
return error;
516+
}
517+
518+
/*
519+
* Older kernels misused sb_frextents to reflect both incore
520+
* reservations made by running transactions and the actual count of
521+
* free rt extents in the ondisk metadata. Transactions committed
522+
* during runtime can therefore contain a superblock update that
523+
* undercounts the number of free rt extents tracked in the rt bitmap.
524+
* A clean unmount record will have the correct frextents value since
525+
* there can be no other transactions running at that point.
526+
*
527+
* If we're mounting the rt volume after recovering the log, recompute
528+
* frextents from the rtbitmap file to fix the inconsistency.
529+
*/
530+
if (xfs_has_realtime(mp) && !xfs_is_clean(mp)) {
531+
error = xfs_rtalloc_reinit_frextents(mp);
532+
if (error)
533+
return error;
534+
}
512535

513-
return xfs_initialize_perag_data(mp, mp->m_sb.sb_agcount);
536+
return 0;
514537
}
515538

516539
/*
@@ -784,11 +807,6 @@ xfs_mountfs(
784807
goto out_inodegc_shrinker;
785808
}
786809

787-
/* Make sure the summary counts are ok. */
788-
error = xfs_check_summary_counts(mp);
789-
if (error)
790-
goto out_log_dealloc;
791-
792810
/* Enable background inode inactivation workers. */
793811
xfs_inodegc_start(mp);
794812
xfs_blockgc_start(mp);
@@ -844,6 +862,11 @@ xfs_mountfs(
844862
goto out_rele_rip;
845863
}
846864

865+
/* Make sure the summary counts are ok. */
866+
error = xfs_check_summary_counts(mp);
867+
if (error)
868+
goto out_rtunmount;
869+
847870
/*
848871
* If this is a read-only mount defer the superblock updates until
849872
* the next remount into writeable mode. Otherwise we would never

fs/xfs/xfs_rtalloc.c

Lines changed: 37 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1284,6 +1284,43 @@ xfs_rtmount_init(
12841284
return 0;
12851285
}
12861286

1287+
static int
1288+
xfs_rtalloc_count_frextent(
1289+
struct xfs_mount *mp,
1290+
struct xfs_trans *tp,
1291+
const struct xfs_rtalloc_rec *rec,
1292+
void *priv)
1293+
{
1294+
uint64_t *valp = priv;
1295+
1296+
*valp += rec->ar_extcount;
1297+
return 0;
1298+
}
1299+
1300+
/*
1301+
* Reinitialize the number of free realtime extents from the realtime bitmap.
1302+
* Callers must ensure that there is no other activity in the filesystem.
1303+
*/
1304+
int
1305+
xfs_rtalloc_reinit_frextents(
1306+
struct xfs_mount *mp)
1307+
{
1308+
uint64_t val = 0;
1309+
int error;
1310+
1311+
xfs_ilock(mp->m_rbmip, XFS_ILOCK_EXCL);
1312+
error = xfs_rtalloc_query_all(mp, NULL, xfs_rtalloc_count_frextent,
1313+
&val);
1314+
xfs_iunlock(mp->m_rbmip, XFS_ILOCK_EXCL);
1315+
if (error)
1316+
return error;
1317+
1318+
spin_lock(&mp->m_sb_lock);
1319+
mp->m_sb.sb_frextents = val;
1320+
spin_unlock(&mp->m_sb_lock);
1321+
return 0;
1322+
}
1323+
12871324
/*
12881325
* Get the bitmap and summary inodes and the summary cache into the mount
12891326
* structure at mount time.

fs/xfs/xfs_rtalloc.h

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -135,6 +135,7 @@ bool xfs_verify_rtbno(struct xfs_mount *mp, xfs_rtblock_t rtbno);
135135
int xfs_rtalloc_extent_is_free(struct xfs_mount *mp, struct xfs_trans *tp,
136136
xfs_rtblock_t start, xfs_extlen_t len,
137137
bool *is_free);
138+
int xfs_rtalloc_reinit_frextents(struct xfs_mount *mp);
138139
#else
139140
# define xfs_rtallocate_extent(t,b,min,max,l,f,p,rb) (ENOSYS)
140141
# define xfs_rtfree_extent(t,b,l) (ENOSYS)
@@ -145,6 +146,7 @@ int xfs_rtalloc_extent_is_free(struct xfs_mount *mp, struct xfs_trans *tp,
145146
# define xfs_rtbuf_get(m,t,b,i,p) (ENOSYS)
146147
# define xfs_verify_rtbno(m, r) (false)
147148
# define xfs_rtalloc_extent_is_free(m,t,s,l,i) (ENOSYS)
149+
# define xfs_rtalloc_reinit_frextents(m) (0)
148150
static inline int /* error */
149151
xfs_rtmount_init(
150152
xfs_mount_t *mp) /* file system mount structure */

0 commit comments

Comments
 (0)