Commit bff8460

bcodding-rh authored and gregkh committed
nfs/blocklayout: Limit repeat device registration on failure
[ Upstream commit 614733f ]

Every pNFS SCSI IO wants to do LAYOUTGET, then within the layout find the device which can drive GETDEVINFO, then finally may need to prep the device with a reservation. This slow work makes a mess of IO latencies if one of the later steps is going to fail for awhile.

If we're unable to register a SCSI device, ensure we mark the device as unavailable so that it will timeout and be re-added via GETDEVINFO. This avoids repeated doomed attempts to register a device in the IO path.

Add some clarifying comments as well.

Fixes: d869da9 ("nfs/blocklayout: Fix premature PR key unregistration")
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
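The negative-caching idea described in the commit message can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the kernel code: `struct dev_entry`, `lookup_device()`, `try_register()`, and the `RETRY_TIMEOUT` value are all hypothetical names invented here. The sketch also resets an expired entry in place, whereas the kernel deletes the entry and fetches a fresh one via GETDEVINFO.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical value; not the kernel's PNFS_DEVICE_RETRY_TIMEOUT. */
#define RETRY_TIMEOUT 120UL

enum dev_state { DEV_NEW, DEV_READY, DEV_UNAVAILABLE };
enum lookup_result { LOOKUP_OK, LOOKUP_FAIL_FAST, LOOKUP_REFETCH };

/* Hypothetical cache entry standing in for a cached device ID. */
struct dev_entry {
	enum dev_state state;
	unsigned long failed_at;     /* tick at which registration failed */
	unsigned int register_calls; /* how often the slow step actually ran */
};

/* Hypothetical slow registration step; `ok` simulates its outcome. */
static bool try_register(struct dev_entry *d, bool ok)
{
	d->register_calls++;
	return ok;
}

static enum lookup_result lookup_device(struct dev_entry *d,
					unsigned long now, bool register_ok)
{
	if (d->state == DEV_UNAVAILABLE) {
		/* Unsigned subtraction stays correct if the counter wraps. */
		if (now - d->failed_at > RETRY_TIMEOUT) {
			/* Simplification: the kernel deletes the entry instead. */
			d->state = DEV_NEW;
			return LOOKUP_REFETCH;
		}
		/* Negative cache hit: fail without retrying registration. */
		return LOOKUP_FAIL_FAST;
	}
	if (d->state == DEV_READY)
		return LOOKUP_OK;
	if (!try_register(d, register_ok)) {
		/* Record the failure so later lookups skip the slow step. */
		d->state = DEV_UNAVAILABLE;
		d->failed_at = now;
		return LOOKUP_FAIL_FAST;
	}
	d->state = DEV_READY;
	return LOOKUP_OK;
}
```

In this sketch, every lookup made while the entry is negative-cached returns immediately, so the registration step runs at most once per timeout window instead of once per IO.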
1 parent faa4bac commit bff8460

1 file changed

Lines changed: 14 additions & 1 deletion

fs/nfs/blocklayout/blocklayout.c

@@ -571,19 +571,32 @@ bl_find_get_deviceid(struct nfs_server *server,
 	if (!node)
 		return ERR_PTR(-ENODEV);
 
+	/*
+	 * Devices that are marked unavailable are left in the cache with a
+	 * timeout to avoid sending GETDEVINFO after every LAYOUTGET, or
+	 * constantly attempting to register the device. Once marked as
+	 * unavailable they must be deleted and never reused.
+	 */
 	if (test_bit(NFS_DEVICEID_UNAVAILABLE, &node->flags)) {
 		unsigned long end = jiffies;
 		unsigned long start = end - PNFS_DEVICE_RETRY_TIMEOUT;
 
 		if (!time_in_range(node->timestamp_unavailable, start, end)) {
+			/* Uncork subsequent GETDEVINFO operations for this device */
 			nfs4_delete_deviceid(node->ld, node->nfs_client, id);
 			goto retry;
 		}
 		goto out_put;
 	}
 
-	if (!bl_register_dev(container_of(node, struct pnfs_block_dev, node)))
+	if (!bl_register_dev(container_of(node, struct pnfs_block_dev, node))) {
+		/*
+		 * If we cannot register, treat this device as transient:
+		 * Make a negative cache entry for the device
+		 */
+		nfs4_mark_deviceid_unavailable(node);
 		goto out_put;
+	}
 
 	return node;
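The timeout test in the hunk above asks whether `timestamp_unavailable` falls within the window `[jiffies - PNFS_DEVICE_RETRY_TIMEOUT, jiffies]`. A rough userspace re-creation of that check follows, modeled on the signed-difference idiom used by the kernel's jiffies comparison helpers. The function names `tick_after_eq()`, `tick_in_range()`, and `still_unavailable()` are hypothetical, and the cast to `long` assumes the usual two's-complement behavior.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * True if tick a is at or after tick b, even when the counter has
 * wrapped in between (valid while the two are less than half the
 * counter range apart).
 */
static bool tick_after_eq(unsigned long a, unsigned long b)
{
	return (long)(a - b) >= 0;
}

/* True if start <= t <= end in wraparound-aware terms. */
static bool tick_in_range(unsigned long t, unsigned long start,
			  unsigned long end)
{
	return tick_after_eq(t, start) && tick_after_eq(end, t);
}

/*
 * Hypothetical helper mirroring the check in the diff: the entry is
 * still negative-cached while its failure timestamp lies within the
 * last `timeout` ticks.
 */
static bool still_unavailable(unsigned long marked_at, unsigned long now,
			      unsigned long timeout)
{
	return tick_in_range(marked_at, now - timeout, now);
}
```

When `still_unavailable()` would return false, the code in the diff deletes the device ID and retries, which lets the next lookup issue GETDEVINFO again. Comparing signed differences rather than raw values keeps the window test correct when the tick counter overflows.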
