
Commit 4966b46

Merge patch series "fuse: use iomap for buffered reads + readahead"
Joanne Koong <joannelkoong@gmail.com> says:

This series adds fuse iomap support for buffered reads and readahead. This is needed so that granular uptodate tracking can be used in fuse when large folios are enabled, so that only the non-uptodate portions of the folio need to be read in instead of the entire folio.

It is also needed in order to turn on large folios for servers that use the writeback cache. Otherwise there is a race condition that may lead to data corruption: if a partial write is followed by a read that happens before the write has undergone writeback, the folio will not have been marked uptodate by the partial write, so the read will read in the entire folio from disk and overwrite the partial write.

This is on top of two locally-patched iomap patches [1] [2] patched on top of commit f1c864b ("Merge branch 'vfs-6.18.async' into vfs.all") in Christian's vfs.all tree.

This series was run through fstests on fuse passthrough_hp with an out-of-kernel patch enabling fuse large folios.

This patchset does not enable large folios on fuse yet. That will be part of a different patchset.

* patches from https://lore.kernel.org/20250926002609.1302233-1-joannelkoong@gmail.com:
  fuse: remove fc->blkbits workaround for partial writes
  fuse: use iomap for readahead
  fuse: use iomap for read_folio
  iomap: make iomap_read_folio() a void return
  iomap: move buffered io bio logic into new file
  iomap: add caller-provided callbacks for read and readahead
  iomap: set accurate iter->pos when reading folio ranges
  iomap: track pending read bytes more optimally
  iomap: rename iomap_readpage_ctx struct to iomap_read_folio_ctx
  iomap: rename iomap_readpage_iter() to iomap_read_folio_iter()
  iomap: iterate over folio mapping in iomap_readpage_iter()
  iomap: store read/readahead bio generically
  iomap: move read/readahead bio submission logic into helper function
  iomap: move bio read logic into helper function

Signed-off-by: Christian Brauner <brauner@kernel.org>
2 parents 7aa6bc3 + 93570c6 commit 4966b46
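
The race described in the cover letter can be illustrated with a small user-space model. This is only a sketch under stated assumptions: `toy_folio`, `toy_write`, `toy_read_whole`, and `toy_read_granular` are hypothetical names, the block and folio sizes are arbitrary, and nothing here is kernel code. It contrasts a read that refills the whole folio with one that fills only the blocks not yet marked uptodate.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define BLOCK_SIZE   4	/* hypothetical block size, in bytes */
#define FOLIO_BLOCKS 4	/* hypothetical number of blocks per large folio */
#define FOLIO_SIZE   (BLOCK_SIZE * FOLIO_BLOCKS)

/* Toy stand-in for a large folio in the page cache. */
struct toy_folio {
	char data[FOLIO_SIZE];
	bool uptodate[FOLIO_BLOCKS];	/* per-block uptodate bits */
};

/* Buffered write of whole blocks that has not been written back yet. */
static void toy_write(struct toy_folio *f, size_t first_block,
		      size_t nblocks, char c)
{
	memset(f->data + first_block * BLOCK_SIZE, c, nblocks * BLOCK_SIZE);
	for (size_t i = first_block; i < first_block + nblocks; i++)
		f->uptodate[i] = true;
}

/* Read without granular tracking: refill the entire folio from "disk". */
static void toy_read_whole(struct toy_folio *f, const char *disk)
{
	memcpy(f->data, disk, FOLIO_SIZE);
	for (size_t i = 0; i < FOLIO_BLOCKS; i++)
		f->uptodate[i] = true;
}

/* Read with granular tracking: fill only the non-uptodate blocks. */
static void toy_read_granular(struct toy_folio *f, const char *disk)
{
	for (size_t i = 0; i < FOLIO_BLOCKS; i++) {
		if (f->uptodate[i])
			continue;	/* keep data from the pending write */
		memcpy(f->data + i * BLOCK_SIZE, disk + i * BLOCK_SIZE,
		       BLOCK_SIZE);
		f->uptodate[i] = true;
	}
}
```

In this model, writing block 1 and then calling `toy_read_whole()` replaces the written bytes with stale backing-store contents, while `toy_read_granular()` leaves block 1 alone and fills only the other blocks.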

15 files changed

Lines changed: 541 additions & 288 deletions

Documentation/filesystems/iomap/operations.rst

Lines changed: 44 additions & 0 deletions
@@ -135,6 +135,28 @@ These ``struct kiocb`` flags are significant for buffered I/O with iomap:
 
  * ``IOCB_DONTCACHE``: Turns on ``IOMAP_DONTCACHE``.
 
+``struct iomap_read_ops``
+--------------------------
+
+.. code-block:: c
+
+ struct iomap_read_ops {
+     int (*read_folio_range)(const struct iomap_iter *iter,
+                             struct iomap_read_folio_ctx *ctx, size_t len);
+     void (*submit_read)(struct iomap_read_folio_ctx *ctx);
+ };
+
+iomap calls these functions:
+
+  - ``read_folio_range``: Called to read in the range. This must be provided
+    by the caller. The caller is responsible for calling
+    iomap_finish_folio_read() after reading in the folio range. This should be
+    done even if an error is encountered during the read. This returns 0 on
+    success or a negative error on failure.
+
+  - ``submit_read``: Submit any pending read requests. This function is
+    optional.
+
 Internal per-Folio State
 ------------------------
 
@@ -182,6 +204,28 @@ The ``flags`` argument to ``->iomap_begin`` will be set to zero.
 The pagecache takes whatever locks it needs before calling the
 filesystem.
 
+Both ``iomap_readahead`` and ``iomap_read_folio`` pass in a ``struct
+iomap_read_folio_ctx``:
+
+.. code-block:: c
+
+ struct iomap_read_folio_ctx {
+    const struct iomap_read_ops *ops;
+    struct folio *cur_folio;
+    struct readahead_control *rac;
+    void *read_ctx;
+ };
+
+``iomap_readahead`` must set:
+ * ``ops->read_folio_range()`` and ``rac``
+
+``iomap_read_folio`` must set:
+ * ``ops->read_folio_range()`` and ``cur_folio``
+
+``ops->submit_read()`` and ``read_ctx`` are optional. ``read_ctx`` is used to
+pass in any custom data the caller needs accessible in the ops callbacks for
+fulfilling reads.
+
 Buffered Writes
 ---------------
 
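
The callback contract documented in the two hunks above can be sketched in user space. Everything prefixed `toy_` is hypothetical and only loosely mirrors the documented structures: `toy_finish_read()` stands in for `iomap_finish_folio_read()`, whose real signature is not shown in this excerpt, and the driver loop is an assumption about how a caller might invoke the callbacks, not iomap's actual iteration logic. The point illustrated is that `read_folio_range` reports completion on both the success and the error path, and that `submit_read` is called only when it is set.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

struct toy_read_ctx;

/* User-space mirror of the two documented callbacks (types simplified). */
struct toy_read_ops {
	int (*read_folio_range)(size_t pos, struct toy_read_ctx *ctx,
				size_t len);
	void (*submit_read)(struct toy_read_ctx *ctx);	/* optional */
};

struct toy_read_ctx {
	const struct toy_read_ops *ops;
	char *cur_folio;	/* stands in for struct folio * */
	size_t folio_size;
	size_t finished_bytes;	/* bytes reported via toy_finish_read() */
	int first_error;
	void *read_ctx;		/* caller-private data for the callbacks */
};

/* Hypothetical stand-in for iomap_finish_folio_read(). */
static void toy_finish_read(struct toy_read_ctx *ctx, size_t len, int error)
{
	ctx->finished_bytes += len;
	if (error && !ctx->first_error)
		ctx->first_error = error;
}

/* Example callback: copy from a backing buffer passed in read_ctx. */
static int toy_mem_read_folio_range(size_t pos, struct toy_read_ctx *ctx,
				    size_t len)
{
	const char *backing = ctx->read_ctx;
	int error = 0;

	if (!backing || pos + len > ctx->folio_size)
		error = -EIO;
	else
		memcpy(ctx->cur_folio + pos, backing + pos, len);

	/* Completion is reported even when the read failed. */
	toy_finish_read(ctx, len, error);
	return error;
}

/* Assumed driver: read the folio in range_len pieces, then submit. */
static int toy_read_folio(struct toy_read_ctx *ctx, size_t range_len)
{
	int ret = 0;

	if (range_len == 0)
		return -EINVAL;

	for (size_t pos = 0; pos < ctx->folio_size; pos += range_len) {
		size_t len = ctx->folio_size - pos;

		if (len > range_len)
			len = range_len;
		ret = ctx->ops->read_folio_range(pos, ctx, len);
		if (ret)
			break;
	}
	if (ctx->ops->submit_read)	/* optional callback */
		ctx->ops->submit_read(ctx);
	return ret;
}
```

A caller in this model fills in `ops` and `cur_folio`, optionally points `read_ctx` at its private data, and calls `toy_read_folio()`. Leaving `submit_read` as NULL is valid, matching the "optional" wording in the documentation.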
block/fops.c

Lines changed: 3 additions & 2 deletions
@@ -540,12 +540,13 @@ const struct address_space_operations def_blk_aops = {
 #else /* CONFIG_BUFFER_HEAD */
 static int blkdev_read_folio(struct file *file, struct folio *folio)
 {
-	return iomap_read_folio(folio, &blkdev_iomap_ops);
+	iomap_bio_read_folio(folio, &blkdev_iomap_ops);
+	return 0;
 }
 
 static void blkdev_readahead(struct readahead_control *rac)
 {
-	iomap_readahead(rac, &blkdev_iomap_ops);
+	iomap_bio_readahead(rac, &blkdev_iomap_ops);
 }
 
 static ssize_t blkdev_writeback_range(struct iomap_writepage_ctx *wpc,

fs/erofs/data.c

Lines changed: 3 additions & 2 deletions
@@ -371,15 +371,16 @@ static int erofs_read_folio(struct file *file, struct folio *folio)
 {
 	trace_erofs_read_folio(folio, true);
 
-	return iomap_read_folio(folio, &erofs_iomap_ops);
+	iomap_bio_read_folio(folio, &erofs_iomap_ops);
+	return 0;
 }
 
 static void erofs_readahead(struct readahead_control *rac)
 {
 	trace_erofs_readahead(rac->mapping->host, readahead_index(rac),
 					readahead_count(rac), true);
 
-	return iomap_readahead(rac, &erofs_iomap_ops);
+	iomap_bio_readahead(rac, &erofs_iomap_ops);
 }
 
 static sector_t erofs_bmap(struct address_space *mapping, sector_t block)

fs/fuse/dir.c

Lines changed: 1 addition & 1 deletion
@@ -1192,7 +1192,7 @@ static void fuse_fillattr(struct mnt_idmap *idmap, struct inode *inode,
 	if (attr->blksize != 0)
 		blkbits = ilog2(attr->blksize);
 	else
-		blkbits = fc->blkbits;
+		blkbits = inode->i_sb->s_blocksize_bits;
 
 	stat->blksize = 1 << blkbits;
 }
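
The block-size arithmetic in this hunk can be tried out in user space. The sketch below is an illustration only: `toy_ilog2()` is a hypothetical stand-in for the kernel's `ilog2()`, and `toy_stat_blksize()` takes the superblock's block-size bits as a plain parameter in place of `inode->i_sb->s_blocksize_bits`.

```c
#include <stdint.h>

/* floor(log2(v)) for v != 0; user-space stand-in for the kernel's ilog2(). */
static unsigned int toy_ilog2(uint32_t v)
{
	unsigned int bits = 0;

	while (v >>= 1)
		bits++;
	return bits;
}

/*
 * Mirror of the hunk's logic: use the server-reported block size when it
 * is non-zero, otherwise fall back to the superblock's block-size bits.
 */
static uint32_t toy_stat_blksize(uint32_t attr_blksize,
				 unsigned int sb_blocksize_bits)
{
	unsigned int blkbits;

	if (attr_blksize != 0)
		blkbits = toy_ilog2(attr_blksize);
	else
		blkbits = sb_blocksize_bits;

	return (uint32_t)1 << blkbits;
}
```

Because the value passes through a base-2 logarithm and a shift, a reported block size that is not a power of two is rounded down to one in this model.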
