
Commit b9ff43d

Author: Matthew Wilcox (Oracle)
mm/readahead: Fix readahead with large folios
Reading 100KB chunks from a big file (eg dd bs=100K) leads to poor readahead behaviour. Studying the traces in detail, I noticed two problems.

The first is that we were setting the readahead flag on the folio which contains the last byte read from the block. This is wrong because we will trigger readahead at the end of the read without waiting to see if a subsequent read is going to use the pages we just read. Instead, we need to set the readahead flag on the first folio _after_ the one which contains the last byte that we're reading.

The second is that we were looking for the index of the folio with the readahead flag set to exactly match start + size - async_size. If we've rounded this, either down (as previously) or up (as now), we'll think we hit a folio marked as readahead by a different read, and try to read the wrong pages. So round the expected index to the order of the folio we hit.

Reported-by: Guo Xuenan <guoxuenan@huawei.com>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Parent: 170f37d

1 file changed: mm/readahead.c (9 additions, 6 deletions)
@@ -474,7 +474,8 @@ static inline int ra_alloc_folio(struct readahead_control *ractl, pgoff_t index,
 
 	if (!folio)
 		return -ENOMEM;
-	if (mark - index < (1UL << order))
+	mark = round_up(mark, 1UL << order);
+	if (index == mark)
 		folio_set_readahead(folio);
 	err = filemap_add_folio(ractl->mapping, folio, index, gfp);
 	if (err)
@@ -555,8 +556,9 @@ static void ondemand_readahead(struct readahead_control *ractl,
 	struct file_ra_state *ra = ractl->ra;
 	unsigned long max_pages = ra->ra_pages;
 	unsigned long add_pages;
-	unsigned long index = readahead_index(ractl);
-	pgoff_t prev_index;
+	pgoff_t index = readahead_index(ractl);
+	pgoff_t expected, prev_index;
+	unsigned int order = folio ? folio_order(folio) : 0;
 
 	/*
 	 * If the request exceeds the readahead window, allow the read to
@@ -575,8 +577,9 @@ static void ondemand_readahead(struct readahead_control *ractl,
 	 * It's the expected callback index, assume sequential access.
 	 * Ramp up sizes, and push forward the readahead window.
 	 */
-	if ((index == (ra->start + ra->size - ra->async_size) ||
-	     index == (ra->start + ra->size))) {
+	expected = round_up(ra->start + ra->size - ra->async_size,
+			1UL << order);
+	if (index == expected || index == (ra->start + ra->size)) {
 		ra->start += ra->size;
 		ra->size = get_next_ra_size(ra, max_pages);
 		ra->async_size = ra->size;
@@ -662,7 +665,7 @@ static void ondemand_readahead(struct readahead_control *ractl,
 	}
 
 	ractl->_index = ra->start;
-	page_cache_ra_order(ractl, ra, folio ? folio_order(folio) : 0);
+	page_cache_ra_order(ractl, ra, order);
 }
 
 void page_cache_sync_ra(struct readahead_control *ractl,
