Commit 69c34f0

Merge branch 'for-6.11/block-limits' into for-6.11/block
Merge in last round of queue limits changes from Christoph.

* for-6.11/block-limits: (26 commits)
  block: move the bounce flag into the features field
  block: move the skip_tagset_quiesce flag to queue_limits
  block: move the pci_p2pdma flag to queue_limits
  block: move the zone_resetall flag to queue_limits
  block: move the zoned flag into the features field
  block: move the poll flag to queue_limits
  block: move the dax flag to queue_limits
  block: move the nowait flag to queue_limits
  block: move the synchronous flag to queue_limits
  block: move the stable_writes flag to queue_limits
  block: move the io_stat flag setting to queue_limits
  block: move the add_random flag to queue_limits
  block: move the nonrot flag to queue_limits
  block: move cache control settings out of queue->flags
  block: remove blk_flush_policy
  block: freeze the queue in queue_attr_store
  nbd: move setting the cache control flags to __nbd_set_size
  virtio_blk: remove virtblk_update_cache_mode
  loop: fold loop_update_rotational into loop_reconfigure_limits
  loop: also use the default block size from an underlying block device
  ...

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 parents 465478b + 339d394 commit 69c34f0

61 files changed

Lines changed: 576 additions & 730 deletions


Documentation/block/writeback_cache_control.rst

Lines changed: 38 additions & 29 deletions
@@ -46,41 +46,50 @@ worry if the underlying devices need any explicit cache flushing and how
 the Forced Unit Access is implemented. The REQ_PREFLUSH and REQ_FUA flags
 may both be set on a single bio.
 
+Feature settings for block drivers
+----------------------------------
 
-Implementation details for bio based block drivers
---------------------------------------------------------------
+For devices that do not support volatile write caches there is no driver
+support required; the block layer completes empty REQ_PREFLUSH requests before
+entering the driver and strips off the REQ_PREFLUSH and REQ_FUA bits from
+requests that have a payload.
 
-These drivers will always see the REQ_PREFLUSH and REQ_FUA bits as they sit
-directly below the submit_bio interface. For remapping drivers the REQ_FUA
-bits need to be propagated to underlying devices, and a global flush needs
-to be implemented for bios with the REQ_PREFLUSH bit set. For real device
-drivers that do not have a volatile cache the REQ_PREFLUSH and REQ_FUA bits
-on non-empty bios can simply be ignored, and REQ_PREFLUSH requests without
-data can be completed successfully without doing any work. Drivers for
-devices with volatile caches need to implement the support for these
-flags themselves without any help from the block layer.
+For devices with volatile write caches the driver needs to tell the block layer
+that it supports flushing caches by setting the
 
+  BLK_FEAT_WRITE_CACHE
 
-Implementation details for request_fn based block drivers
----------------------------------------------------------
+flag in the queue_limits feature field. For devices that also support the FUA
+bit the block layer needs to be told to pass on the REQ_FUA bit by also setting
+the
 
-For devices that do not support volatile write caches there is no driver
-support required, the block layer completes empty REQ_PREFLUSH requests before
-entering the driver and strips off the REQ_PREFLUSH and REQ_FUA bits from
-requests that have a payload. For devices with volatile write caches the
-driver needs to tell the block layer that it supports flushing caches by
-doing::
+  BLK_FEAT_FUA
+
+flag in the features field of the queue_limits structure.
+
+Implementation details for bio based block drivers
+--------------------------------------------------
+
+For bio based drivers the REQ_PREFLUSH and REQ_FUA bits are simply passed on
+to the driver if the driver sets the BLK_FEAT_WRITE_CACHE flag, and the driver
+needs to handle them.
+
+*NOTE*: The REQ_FUA bit also gets passed on when the BLK_FEAT_FUA flag is
+_not_ set. Any bio based driver that sets BLK_FEAT_WRITE_CACHE also needs to
+handle REQ_FUA.
 
-	blk_queue_write_cache(sdkp->disk->queue, true, false);
+For remapping drivers the REQ_FUA bits need to be propagated to underlying
+devices, and a global flush needs to be implemented for bios with the
+REQ_PREFLUSH bit set.
 
-and handle empty REQ_OP_FLUSH requests in its prep_fn/request_fn. Note that
-REQ_PREFLUSH requests with a payload are automatically turned into a sequence
-of an empty REQ_OP_FLUSH request followed by the actual write by the block
-layer. For devices that also support the FUA bit the block layer needs
-to be told to pass through the REQ_FUA bit using::
+Implementation details for blk-mq drivers
+-----------------------------------------
 
-	blk_queue_write_cache(sdkp->disk->queue, true, true);
+When the BLK_FEAT_WRITE_CACHE flag is set, REQ_OP_WRITE | REQ_PREFLUSH requests
+with a payload are automatically turned into a sequence of a REQ_OP_FLUSH
+request followed by the actual write by the block layer.
 
-and the driver must handle write requests that have the REQ_FUA bit set
-in prep_fn/request_fn. If the FUA bit is not natively supported the block
-layer turns it into an empty REQ_OP_FLUSH request after the actual write.
+When the BLK_FEAT_FUA flag is set, the REQ_FUA bit is simply passed on for the
+REQ_OP_WRITE request; otherwise a REQ_OP_FLUSH request is sent by the block
+layer after the completion of the write request for bio submissions with the
+REQ_FUA bit set.

arch/m68k/emu/nfblock.c

Lines changed: 1 addition & 0 deletions
@@ -98,6 +98,7 @@ static int __init nfhd_init_one(int id, u32 blocks, u32 bsize)
 {
 	struct queue_limits lim = {
 		.logical_block_size	= bsize,
+		.features		= BLK_FEAT_ROTATIONAL,
 	};
 	struct nfhd_device *dev;
 	int dev_id = id - NFHD_DEV_OFFSET;

arch/um/drivers/ubd_kern.c

Lines changed: 1 addition & 2 deletions
@@ -835,6 +835,7 @@ static int ubd_add(int n, char **error_out)
 	struct queue_limits lim = {
 		.max_segments		= MAX_SG,
 		.seg_boundary_mask	= PAGE_SIZE - 1,
+		.features		= BLK_FEAT_WRITE_CACHE,
 	};
 	struct gendisk *disk;
 	int err = 0;
@@ -881,8 +882,6 @@ static int ubd_add(int n, char **error_out)
 		goto out_cleanup_tags;
 	}
 
-	blk_queue_flag_set(QUEUE_FLAG_NONROT, disk->queue);
-	blk_queue_write_cache(disk->queue, true, false);
 	disk->major = UBD_MAJOR;
 	disk->first_minor = n << UBD_SHIFT;
 	disk->minors = 1 << UBD_SHIFT;

arch/xtensa/platforms/iss/simdisk.c

Lines changed: 4 additions & 1 deletion
@@ -263,6 +263,9 @@ static const struct proc_ops simdisk_proc_ops = {
 static int __init simdisk_setup(struct simdisk *dev, int which,
 		struct proc_dir_entry *procdir)
 {
+	struct queue_limits lim = {
+		.features = BLK_FEAT_ROTATIONAL,
+	};
 	char tmp[2] = { '0' + which, 0 };
 	int err;
 
@@ -271,7 +274,7 @@ static int __init simdisk_setup(struct simdisk *dev, int which,
 	spin_lock_init(&dev->lock);
 	dev->users = 0;
 
-	dev->gd = blk_alloc_disk(NULL, NUMA_NO_NODE);
+	dev->gd = blk_alloc_disk(&lim, NUMA_NO_NODE);
 	if (IS_ERR(dev->gd)) {
 		err = PTR_ERR(dev->gd);
 		goto out;

block/blk-core.c

Lines changed: 3 additions & 4 deletions
@@ -782,7 +782,7 @@ void submit_bio_noacct(struct bio *bio)
 		if (WARN_ON_ONCE(bio_op(bio) != REQ_OP_WRITE &&
 				 bio_op(bio) != REQ_OP_ZONE_APPEND))
 			goto end_io;
-		if (!test_bit(QUEUE_FLAG_WC, &q->queue_flags)) {
+		if (!bdev_write_cache(bdev)) {
 			bio->bi_opf &= ~(REQ_PREFLUSH | REQ_FUA);
 			if (!bio_sectors(bio)) {
 				status = BLK_STS_OK;
@@ -791,7 +791,7 @@ void submit_bio_noacct(struct bio *bio)
 		}
 	}
 
-	if (!test_bit(QUEUE_FLAG_POLL, &q->queue_flags))
+	if (!(q->limits.features & BLK_FEAT_POLL))
 		bio_clear_polled(bio);
 
 	switch (bio_op(bio)) {
@@ -915,8 +915,7 @@ int bio_poll(struct bio *bio, struct io_comp_batch *iob, unsigned int flags)
 		return 0;
 
 	q = bdev_get_queue(bdev);
-	if (cookie == BLK_QC_T_NONE ||
-	    !test_bit(QUEUE_FLAG_POLL, &q->queue_flags))
+	if (cookie == BLK_QC_T_NONE || !(q->limits.features & BLK_FEAT_POLL))
 		return 0;
 
 	blk_flush_plug(current->plug, false);

block/blk-flush.c

Lines changed: 16 additions & 20 deletions
@@ -100,23 +100,6 @@ blk_get_flush_queue(struct request_queue *q, struct blk_mq_ctx *ctx)
 	return blk_mq_map_queue(q, REQ_OP_FLUSH, ctx)->fq;
 }
 
-static unsigned int blk_flush_policy(unsigned long fflags, struct request *rq)
-{
-	unsigned int policy = 0;
-
-	if (blk_rq_sectors(rq))
-		policy |= REQ_FSEQ_DATA;
-
-	if (fflags & (1UL << QUEUE_FLAG_WC)) {
-		if (rq->cmd_flags & REQ_PREFLUSH)
-			policy |= REQ_FSEQ_PREFLUSH;
-		if (!(fflags & (1UL << QUEUE_FLAG_FUA)) &&
-		    (rq->cmd_flags & REQ_FUA))
-			policy |= REQ_FSEQ_POSTFLUSH;
-	}
-	return policy;
-}
-
 static unsigned int blk_flush_cur_seq(struct request *rq)
 {
 	return 1 << ffz(rq->flush.seq);
@@ -398,19 +381,32 @@ static void blk_rq_init_flush(struct request *rq)
 bool blk_insert_flush(struct request *rq)
 {
 	struct request_queue *q = rq->q;
-	unsigned long fflags = q->queue_flags;	/* may change, cache */
-	unsigned int policy = blk_flush_policy(fflags, rq);
 	struct blk_flush_queue *fq = blk_get_flush_queue(q, rq->mq_ctx);
+	bool supports_fua = q->limits.features & BLK_FEAT_FUA;
+	unsigned int policy = 0;
 
 	/* FLUSH/FUA request must never be merged */
 	WARN_ON_ONCE(rq->bio != rq->biotail);
 
+	if (blk_rq_sectors(rq))
+		policy |= REQ_FSEQ_DATA;
+
+	/*
+	 * Check which flushes we need to sequence for this operation.
+	 */
+	if (blk_queue_write_cache(q)) {
+		if (rq->cmd_flags & REQ_PREFLUSH)
+			policy |= REQ_FSEQ_PREFLUSH;
+		if ((rq->cmd_flags & REQ_FUA) && !supports_fua)
+			policy |= REQ_FSEQ_POSTFLUSH;
+	}
+
 	/*
 	 * @policy now records what operations need to be done. Adjust
 	 * REQ_PREFLUSH and FUA for the driver.
 	 */
 	rq->cmd_flags &= ~REQ_PREFLUSH;
-	if (!(fflags & (1UL << QUEUE_FLAG_FUA)))
+	if (!supports_fua)
 		rq->cmd_flags &= ~REQ_FUA;
 
 	/*

block/blk-mq-debugfs.c

Lines changed: 0 additions & 13 deletions
@@ -84,28 +84,15 @@ static const char *const blk_queue_flag_name[] = {
 	QUEUE_FLAG_NAME(NOMERGES),
 	QUEUE_FLAG_NAME(SAME_COMP),
 	QUEUE_FLAG_NAME(FAIL_IO),
-	QUEUE_FLAG_NAME(NONROT),
-	QUEUE_FLAG_NAME(IO_STAT),
 	QUEUE_FLAG_NAME(NOXMERGES),
-	QUEUE_FLAG_NAME(ADD_RANDOM),
-	QUEUE_FLAG_NAME(SYNCHRONOUS),
 	QUEUE_FLAG_NAME(SAME_FORCE),
 	QUEUE_FLAG_NAME(INIT_DONE),
-	QUEUE_FLAG_NAME(STABLE_WRITES),
-	QUEUE_FLAG_NAME(POLL),
-	QUEUE_FLAG_NAME(WC),
-	QUEUE_FLAG_NAME(FUA),
-	QUEUE_FLAG_NAME(DAX),
 	QUEUE_FLAG_NAME(STATS),
 	QUEUE_FLAG_NAME(REGISTERED),
 	QUEUE_FLAG_NAME(QUIESCED),
-	QUEUE_FLAG_NAME(PCI_P2PDMA),
-	QUEUE_FLAG_NAME(ZONE_RESETALL),
 	QUEUE_FLAG_NAME(RQ_ALLOC_TIME),
 	QUEUE_FLAG_NAME(HCTX_ACTIVE),
-	QUEUE_FLAG_NAME(NOWAIT),
 	QUEUE_FLAG_NAME(SQ_SCHED),
-	QUEUE_FLAG_NAME(SKIP_TAGSET_QUIESCE),
 };
 #undef QUEUE_FLAG_NAME

block/blk-mq.c

Lines changed: 26 additions & 16 deletions
@@ -4109,14 +4109,26 @@ void blk_mq_release(struct request_queue *q)
 	blk_mq_sysfs_deinit(q);
 }
 
+static bool blk_mq_can_poll(struct blk_mq_tag_set *set)
+{
+	return set->nr_maps > HCTX_TYPE_POLL &&
+		set->map[HCTX_TYPE_POLL].nr_queues;
+}
+
 struct request_queue *blk_mq_alloc_queue(struct blk_mq_tag_set *set,
 		struct queue_limits *lim, void *queuedata)
 {
 	struct queue_limits default_lim = { };
 	struct request_queue *q;
 	int ret;
 
-	q = blk_alloc_queue(lim ? lim : &default_lim, set->numa_node);
+	if (!lim)
+		lim = &default_lim;
+	lim->features |= BLK_FEAT_IO_STAT | BLK_FEAT_NOWAIT;
+	if (blk_mq_can_poll(set))
+		lim->features |= BLK_FEAT_POLL;
+
+	q = blk_alloc_queue(lim, set->numa_node);
 	if (IS_ERR(q))
 		return q;
 	q->queuedata = queuedata;
@@ -4269,17 +4281,6 @@ static void blk_mq_realloc_hw_ctxs(struct blk_mq_tag_set *set,
 	mutex_unlock(&q->sysfs_lock);
 }
 
-static void blk_mq_update_poll_flag(struct request_queue *q)
-{
-	struct blk_mq_tag_set *set = q->tag_set;
-
-	if (set->nr_maps > HCTX_TYPE_POLL &&
-	    set->map[HCTX_TYPE_POLL].nr_queues)
-		blk_queue_flag_set(QUEUE_FLAG_POLL, q);
-	else
-		blk_queue_flag_clear(QUEUE_FLAG_POLL, q);
-}
-
 int blk_mq_init_allocated_queue(struct blk_mq_tag_set *set,
 		struct request_queue *q)
 {
@@ -4307,7 +4308,6 @@ int blk_mq_init_allocated_queue(struct blk_mq_tag_set *set,
 	q->tag_set = set;
 
 	q->queue_flags |= QUEUE_FLAG_MQ_DEFAULT;
-	blk_mq_update_poll_flag(q);
 
 	INIT_DELAYED_WORK(&q->requeue_work, blk_mq_requeue_work);
 	INIT_LIST_HEAD(&q->flush_list);
@@ -4631,13 +4631,15 @@ int blk_mq_update_nr_requests(struct request_queue *q, unsigned int nr)
 	int ret;
 	unsigned long i;
 
+	if (WARN_ON_ONCE(!q->mq_freeze_depth))
+		return -EINVAL;
+
 	if (!set)
 		return -EINVAL;
 
 	if (q->nr_requests == nr)
 		return 0;
 
-	blk_mq_freeze_queue(q);
 	blk_mq_quiesce_queue(q);
 
 	ret = 0;
@@ -4671,7 +4673,6 @@ int blk_mq_update_nr_requests(struct request_queue *q, unsigned int nr)
 	}
 
 	blk_mq_unquiesce_queue(q);
-	blk_mq_unfreeze_queue(q);
 
 	return ret;
 }
@@ -4793,8 +4794,10 @@ static void __blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set,
 fallback:
 	blk_mq_update_queue_map(set);
 	list_for_each_entry(q, &set->tag_list, tag_set_list) {
+		struct queue_limits lim;
+
 		blk_mq_realloc_hw_ctxs(set, q);
-		blk_mq_update_poll_flag(q);
+
 		if (q->nr_hw_queues != set->nr_hw_queues) {
 			int i = prev_nr_hw_queues;
 
@@ -4806,6 +4809,13 @@ static void __blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set,
 			set->nr_hw_queues = prev_nr_hw_queues;
 			goto fallback;
 		}
+		lim = queue_limits_start_update(q);
+		if (blk_mq_can_poll(set))
+			lim.features |= BLK_FEAT_POLL;
+		else
+			lim.features &= ~BLK_FEAT_POLL;
+		if (queue_limits_commit_update(q, &lim) < 0)
+			pr_warn("updating the poll flag failed\n");
 		blk_mq_map_swqueue(q);
 	}
