
Commit 8394a97

Author: Chandan Babu R
Merge tag 'in-memory-btrees-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-6.9-mergeC
xfs: support in-memory btrees

Online repair of the reverse-mapping btrees presents some unique challenges. To construct a new reverse mapping btree, we must scan the entire filesystem, but we cannot afford to quiesce the entire filesystem for the potentially lengthy scan.

For rmap btrees, therefore, we relax our requirements of totally atomic repairs. Instead, repairs will scan all inodes, construct a new reverse mapping dataset, format a new btree, and commit it before anyone trips over the corruption. This is exactly the same strategy as was used in the quotacheck and nlink scanners.

Unfortunately, the xfarray cannot perform key-based lookups and is therefore unsuitable for supporting live updates. Luckily, we already have a data structure that maintains an indexed rmap recordset -- the existing rmap btree code! Hence we port the existing btree and buffer target code to be able to create a btree using the xfile we developed earlier. Live hooks keep the in-memory btree up to date for any resources that have already been scanned.

This approach is not maximally memory efficient, but we can use the same rmap code that we do everywhere else, which provides improved stability without growing the code base even more. Note that in-memory btree blocks are always page sized.

This patchset modifies the kernel xfs buffer cache to be capable of using a xfile (aka a shmem file) as a backing device. It then augments the btree code to support creating btree cursors with buffers that come from a buftarg other than the data device (namely an xfile-backed buftarg). For the userspace xfs buffer cache, we instead use a memfd or an O_TMPFILE file as a backing device.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>

* tag 'in-memory-btrees-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux:
  xfs: launder in-memory btree buffers before transaction commit
  xfs: support in-memory btrees
  xfs: add a xfs_btree_ptrs_equal helper
  xfs: support in-memory buffer cache targets
  xfs: teach buftargs to maintain their own buffer hashtable
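
The idea of keeping page-sized btree blocks in a memory-backed file can be illustrated with a small userspace sketch. This is not the xfsprogs or kernel implementation: `struct blockstore` and the `blockstore_*` helpers are hypothetical names invented for this example, and an unlinked temporary file from `tmpfile()` stands in for the memfd or O_TMPFILE file that the message mentions. Blocks are addressed by index and are always exactly one page long.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical toy block store; not the xfsprogs implementation. */
struct blockstore {
	int	fd;		/* descriptor of an unlinked temporary file */
	size_t	blocksize;	/* always one page */
};

/* Open the backing file. Returns 0 on success, -1 on failure. */
static int blockstore_open(struct blockstore *bs)
{
	FILE *fp = tmpfile();
	long pagesize = sysconf(_SC_PAGESIZE);

	if (!fp || pagesize <= 0) {
		if (fp)
			fclose(fp);
		return -1;
	}
	/* Keep a private descriptor so the stdio stream can be closed. */
	bs->fd = dup(fileno(fp));
	fclose(fp);
	if (bs->fd < 0)
		return -1;
	bs->blocksize = (size_t)pagesize;
	return 0;
}

/* Write one full block at block index blkno. */
static int blockstore_write(struct blockstore *bs, unsigned long blkno,
			    const void *buf)
{
	off_t pos = (off_t)blkno * (off_t)bs->blocksize;
	ssize_t ret = pwrite(bs->fd, buf, bs->blocksize, pos);

	return ret == (ssize_t)bs->blocksize ? 0 : -1;
}

/* Read one full block; never-written ranges come back zero-filled. */
static int blockstore_read(struct blockstore *bs, unsigned long blkno,
			   void *buf)
{
	off_t pos = (off_t)blkno * (off_t)bs->blocksize;
	ssize_t ret;

	memset(buf, 0, bs->blocksize);
	ret = pread(bs->fd, buf, bs->blocksize, pos);
	return ret < 0 ? -1 : 0;
}

/* Release the backing file; its contents vanish with the last reference. */
static void blockstore_close(struct blockstore *bs)
{
	close(bs->fd);
	bs->fd = -1;
}
```

Because the file is never linked into a directory, its storage is released as soon as the descriptor is closed, which suits scratch data that only needs to live for the duration of one repair.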
2 parents aa8fb4b + 0dc63c8 commit 8394a97

21 files changed

Lines changed: 1363 additions & 138 deletions

Documentation/filesystems/xfs/xfs-online-fsck-design.rst

Lines changed: 2 additions & 3 deletions
@@ -2270,13 +2270,12 @@ follows:
    pointing to the xfile.
 
 3. Pass the buffer cache target, buffer ops, and other information to
-   ``xfbtree_create`` to write an initial tree header and root block to the
-   xfile.
+   ``xfbtree_init`` to initialize the passed in ``struct xfbtree`` and write an
+   initial root block to the xfile.
    Each btree type should define a wrapper that passes necessary arguments to
    the creation function.
    For example, rmap btrees define ``xfs_rmapbt_mem_create`` to take care of
    all the necessary details for callers.
-   A ``struct xfbtree`` object will be returned.
 
 4. Pass the xfbtree object to the btree cursor creation function for the
    btree type.
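
The documentation change above describes a shift from a creation function that returns a new object to an init function that fills in a caller-provided structure. The following is a minimal sketch of that calling pattern only. Every name in it (`toy_buftarg`, `toy_btree`, `toy_btree_init`, `toy_rmap_mem_init`, `toy_cursor_init`) is hypothetical, and none of the signatures are taken from the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_BLOCK_SIZE	4096
#define TOY_NR_BLOCKS	4

/* Hypothetical stand-in for an xfile-backed buffer cache target. */
struct toy_buftarg {
	unsigned char	blocks[TOY_NR_BLOCKS][TOY_BLOCK_SIZE];
	size_t		nr_used;
};

/* Hypothetical in-memory btree handle, owned by the caller. */
struct toy_btree {
	struct toy_buftarg	*target;
	size_t			root_blkno;
	unsigned int		nlevels;
};

/* Hypothetical cursor that walks a toy_btree. */
struct toy_cursor {
	struct toy_btree	*tree;
	size_t			cur_blkno;
};

/*
 * Initialize a caller-provided tree and write an empty root block.
 * Nothing is allocated or returned; the caller owns the structure.
 */
static int toy_btree_init(struct toy_btree *tree, struct toy_buftarg *target)
{
	if (target->nr_used >= TOY_NR_BLOCKS)
		return -1;

	tree->target = target;
	tree->root_blkno = target->nr_used++;
	tree->nlevels = 1;
	memset(target->blocks[tree->root_blkno], 0, TOY_BLOCK_SIZE);
	return 0;
}

/* Per-btree-type wrapper that hides the details from callers. */
static int toy_rmap_mem_init(struct toy_btree *tree, struct toy_buftarg *target)
{
	return toy_btree_init(tree, target);
}

/* Cursor creation takes the initialized tree object. */
static void toy_cursor_init(struct toy_cursor *cur, struct toy_btree *tree)
{
	cur->tree = tree;
	cur->cur_blkno = tree->root_blkno;
}
```

With an init-style interface the tree handle can be embedded in a larger caller-owned structure, so there is no separately returned object whose lifetime has to be tracked.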

fs/xfs/Kconfig

Lines changed: 8 additions & 0 deletions
@@ -128,13 +128,20 @@ config XFS_LIVE_HOOKS
 	bool
 	select JUMP_LABEL if HAVE_ARCH_JUMP_LABEL
 
+config XFS_MEMORY_BUFS
+	bool
+
+config XFS_BTREE_IN_MEM
+	bool
+
 config XFS_ONLINE_SCRUB
 	bool "XFS online metadata check support"
 	default n
 	depends on XFS_FS
 	depends on TMPFS && SHMEM
 	select XFS_LIVE_HOOKS
 	select XFS_DRAIN_INTENTS
+	select XFS_MEMORY_BUFS
 	help
 	  If you say Y here you will be able to check metadata on a
 	  mounted XFS filesystem. This feature is intended to reduce
@@ -169,6 +176,7 @@ config XFS_ONLINE_REPAIR
 	bool "XFS online metadata repair support"
 	default n
 	depends on XFS_FS && XFS_ONLINE_SCRUB
+	select XFS_BTREE_IN_MEM
 	help
 	  If you say Y here you will be able to repair metadata on a
 	  mounted XFS filesystem. This feature is intended to reduce

fs/xfs/Makefile

Lines changed: 2 additions & 0 deletions
@@ -137,6 +137,8 @@ endif
 
 xfs-$(CONFIG_XFS_DRAIN_INTENTS) += xfs_drain.o
 xfs-$(CONFIG_XFS_LIVE_HOOKS) += xfs_hooks.o
+xfs-$(CONFIG_XFS_MEMORY_BUFS) += xfs_buf_mem.o
+xfs-$(CONFIG_XFS_BTREE_IN_MEM) += libxfs/xfs_btree_mem.o
 
 # online scrub/repair
 ifeq ($(CONFIG_XFS_ONLINE_SCRUB),y)

fs/xfs/libxfs/xfs_ag.c

Lines changed: 3 additions & 3 deletions
@@ -264,7 +264,7 @@ xfs_free_perag(
 		xfs_defer_drain_free(&pag->pag_intents_drain);
 
 		cancel_delayed_work_sync(&pag->pag_blockgc_work);
-		xfs_buf_hash_destroy(pag);
+		xfs_buf_cache_destroy(&pag->pag_bcache);
 
 		/* drop the mount's active reference */
 		xfs_perag_rele(pag);
@@ -352,7 +352,7 @@ xfs_free_unused_perag_range(
 		spin_unlock(&mp->m_perag_lock);
 		if (!pag)
 			break;
-		xfs_buf_hash_destroy(pag);
+		xfs_buf_cache_destroy(&pag->pag_bcache);
 		xfs_defer_drain_free(&pag->pag_intents_drain);
 		kfree(pag);
 	}
@@ -419,7 +419,7 @@ xfs_initialize_perag(
 		pag->pagb_tree = RB_ROOT;
 #endif /* __KERNEL__ */
 
-		error = xfs_buf_hash_init(pag);
+		error = xfs_buf_cache_init(&pag->pag_bcache);
 		if (error)
 			goto out_remove_pag;
 

fs/xfs/libxfs/xfs_ag.h

Lines changed: 1 addition & 3 deletions
@@ -106,9 +106,7 @@ struct xfs_perag {
 	int pag_ici_reclaimable; /* reclaimable inodes */
 	unsigned long pag_ici_reclaim_cursor; /* reclaim restart point */
 
-	/* buffer cache index */
-	spinlock_t pag_buf_lock; /* lock for pag_buf_hash */
-	struct rhashtable pag_buf_hash;
+	struct xfs_buf_cache pag_bcache;
 
 	/* background prealloc block trimming */
 	struct delayed_work pag_blockgc_work;
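
The hunks above fold the per-AG lock and hash table into a single `struct xfs_buf_cache` whose init and destroy helpers take a pointer to the cache rather than to the perag. The sketch below illustrates why that shape is convenient: any owner can embed its own cache. It is a userspace toy under stated assumptions. All `toy_*` names are hypothetical, a C11 `atomic_flag` stands in for the kernel spinlock, and a fixed array of chains stands in for the rhashtable.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define TOY_CACHE_SLOTS	16

/* Hypothetical cached buffer, keyed by block number. */
struct toy_buf {
	unsigned long	blkno;
	struct toy_buf	*next;
};

/* Lock and index bundled together so any owner can embed one. */
struct toy_buf_cache {
	atomic_flag	lock;	/* toy spinlock protecting slots[] */
	struct toy_buf	*slots[TOY_CACHE_SLOTS];
};

/* Two hypothetical owners, each with a private cache. */
struct toy_perag {
	unsigned int		agno;
	struct toy_buf_cache	bcache;
};

struct toy_mem_buftarg {
	struct toy_buf_cache	bcache;
};

static void toy_buf_cache_init(struct toy_buf_cache *bc)
{
	atomic_flag_clear(&bc->lock);
	for (size_t i = 0; i < TOY_CACHE_SLOTS; i++)
		bc->slots[i] = NULL;
}

static void toy_buf_cache_lock(struct toy_buf_cache *bc)
{
	while (atomic_flag_test_and_set(&bc->lock))
		;	/* spin until the flag is released */
}

static void toy_buf_cache_unlock(struct toy_buf_cache *bc)
{
	atomic_flag_clear(&bc->lock);
}

/* Add a buffer to the head of its hash chain. */
static void toy_buf_cache_insert(struct toy_buf_cache *bc, struct toy_buf *bp)
{
	size_t slot = bp->blkno % TOY_CACHE_SLOTS;

	toy_buf_cache_lock(bc);
	bp->next = bc->slots[slot];
	bc->slots[slot] = bp;
	toy_buf_cache_unlock(bc);
}

/* Find a buffer by block number; returns NULL if it is not cached. */
static struct toy_buf *toy_buf_cache_lookup(struct toy_buf_cache *bc,
					    unsigned long blkno)
{
	struct toy_buf *bp;

	toy_buf_cache_lock(bc);
	for (bp = bc->slots[blkno % TOY_CACHE_SLOTS]; bp; bp = bp->next)
		if (bp->blkno == blkno)
			break;
	toy_buf_cache_unlock(bc);
	return bp;
}
```

Since the helpers only see the cache, the same code serves the per-AG owner and a second owner such as a memory-backed buffer target, and a block number cached in one is invisible to the other.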
