Commit b005d61

spi: cadence-quadspi: Prevent indirect read
Merge series from Mateusz Litwin <mateusz.litwin@nokia.com>:

On the Stratix10 platform, indirect reads can become very slow due to lost interrupts and/or missed `complete()` calls, causing `wait_for_completion_timeout()` to expire.

Three issues were identified:

1) A race condition exists between the read loop and the IRQ `complete()` call: an IRQ can call `complete()` after the inner loop ends but before `reinit_completion()`, losing the completion event and causing `wait_for_completion_timeout()` to expire. The function does not return an error because `bytes_to_read` > 0 (indicating data is already in the FIFO) and the final `ret` value is overwritten by the `cqspi_wait_for_bit()` return value (indicating request completion), masking the timeout.

   For testing purposes, logging was added to print the count of timeouts and the outer loop count.

   $ dd if=/dev/mtd0 of=/dev/null bs=64M count=1
   [ 2232.925219] cadence-qspi ff8d2000.spi: Indirect read error timeout (1) loop (12472)
   [ 2236.200391] cadence-qspi ff8d2000.spi: Indirect read error timeout (1) loop (12460)
   [ 2239.482836] cadence-qspi ff8d2000.spi: Indirect read error timeout (5) loop (12450)

   This indicates that such an event is rare, but possible. Tested on the Stratix10 platform.

2) The `CQSPI_SLOW_SRAM` quirk assumes the indirect read path never leaves the inner loop on SoCFPGA. This assumption is incorrect when using slow flash. Disabling IRQs in the inner loop can cause lost interrupts.

3) The `CQSPI_SLOW_SRAM` quirk disables the `CQSPI_REG_IRQ_IND_COMP` (indirect completion) interrupt, relying solely on the `CQSPI_REG_IRQ_WATERMARK` (FIFO watermark) interrupt. For small transfer sizes, the final data read might not fill the FIFO sufficiently to trigger the watermark, preventing completion and causing `wait_for_completion_timeout()` to expire.

Two patches have been prepared to resolve these issues.
- [1/2] spi: cadence-quadspi: Prevent lost complete() call during indirect read

  Moving `reinit_completion()` before the inner loop prevents a race condition. This might cause a premature IRQ `complete()` call to occur; however, in the worst case, this will result in a spurious wakeup and another wait cycle, which is preferable to waiting for a timeout.

- [2/2] spi: cadence-quadspi: Improve CQSPI_SLOW_SRAM quirk if flash is slow

  The patch re-enables the `CQSPI_REG_IRQ_IND_COMP` interrupt, which resolves the problem for small reads, and removes the disabling of interrupts in the read loop, which addresses the issue with lost interrupts. This marginally increases the IRQ count.

  Test:

  $ dd if=/dev/mtd0 of=/dev/null bs=1M count=64

  Results from the Stratix10 platform with mt25qu02g flash. FIFO size in all tests: 128

  Serviced interrupt call counts:
    Without `CQSPI_SLOW_SRAM` quirk:         16 668 850
    With `CQSPI_SLOW_SRAM` quirk:               204 176
    With `CQSPI_SLOW_SRAM` and this patch:      224 528

Patch 2/2: Delivers a substantial read-performance improvement for the Cadence QSPI controller on the Stratix10 platform.

Patch 1/2: Applies to all platforms and should yield a modest performance gain, most noticeable with large `CQSPI_READ_TIMEOUT_MS` values and workloads dominated by many small reads.
2 parents c81f30b + 5bfbbf0 commit b005d61

1 file changed

Lines changed: 11 additions & 12 deletions

drivers/spi/spi-cadence-quadspi.c

@@ -300,6 +300,9 @@ struct cqspi_driver_platdata {
 					 CQSPI_REG_IRQ_IND_SRAM_FULL	| \
 					 CQSPI_REG_IRQ_IND_COMP)

+#define CQSPI_IRQ_MASK_RD_SLOW_SRAM	(CQSPI_REG_IRQ_WATERMARK	| \
+					 CQSPI_REG_IRQ_IND_COMP)
+
 #define CQSPI_IRQ_MASK_WR		(CQSPI_REG_IRQ_IND_COMP		| \
 					 CQSPI_REG_IRQ_WATERMARK	| \
 					 CQSPI_REG_IRQ_UNDERFLOW)
@@ -381,7 +384,7 @@ static irqreturn_t cqspi_irq_handler(int this_irq, void *dev)
 	else if (!cqspi->slow_sram)
 		irq_status &= CQSPI_IRQ_MASK_RD | CQSPI_IRQ_MASK_WR;
 	else
-		irq_status &= CQSPI_REG_IRQ_WATERMARK | CQSPI_IRQ_MASK_WR;
+		irq_status &= CQSPI_IRQ_MASK_RD_SLOW_SRAM | CQSPI_IRQ_MASK_WR;

 	if (irq_status)
 		complete(&cqspi->transfer_complete);
@@ -757,7 +760,7 @@ static int cqspi_indirect_read_execute(struct cqspi_flash_pdata *f_pdata,
 	 */

 	if (use_irq && cqspi->slow_sram)
-		writel(CQSPI_REG_IRQ_WATERMARK, reg_base + CQSPI_REG_IRQMASK);
+		writel(CQSPI_IRQ_MASK_RD_SLOW_SRAM, reg_base + CQSPI_REG_IRQMASK);
 	else if (use_irq)
 		writel(CQSPI_IRQ_MASK_RD, reg_base + CQSPI_REG_IRQMASK);
 	else
@@ -769,17 +772,19 @@ static int cqspi_indirect_read_execute(struct cqspi_flash_pdata *f_pdata,
 	readl(reg_base + CQSPI_REG_INDIRECTRD); /* Flush posted write. */

 	while (remaining > 0) {
+		ret = 0;
 		if (use_irq &&
 		    !wait_for_completion_timeout(&cqspi->transfer_complete,
 						 msecs_to_jiffies(CQSPI_READ_TIMEOUT_MS)))
 			ret = -ETIMEDOUT;

 		/*
-		 * Disable all read interrupts until
-		 * we are out of "bytes to read"
+		 * Prevent lost interrupt and race condition by reinitializing early.
+		 * A spurious wakeup and another wait cycle can occur here,
+		 * which is preferable to waiting until timeout if interrupt is lost.
 		 */
-		if (cqspi->slow_sram)
-			writel(0x0, reg_base + CQSPI_REG_IRQMASK);
+		if (use_irq)
+			reinit_completion(&cqspi->transfer_complete);

 		bytes_to_read = cqspi_get_rd_sram_level(cqspi);

@@ -811,12 +816,6 @@ static int cqspi_indirect_read_execute(struct cqspi_flash_pdata *f_pdata,
 			remaining -= bytes_to_read;
 			bytes_to_read = cqspi_get_rd_sram_level(cqspi);
 		}
-
-		if (use_irq && remaining > 0) {
-			reinit_completion(&cqspi->transfer_complete);
-			if (cqspi->slow_sram)
-				writel(CQSPI_REG_IRQ_WATERMARK, reg_base + CQSPI_REG_IRQMASK);
-		}
 	}

 	/* Check indirect done status */
