
Commit e2768b7

ryanhrob authored and Marc Zyngier committed
arm64/mm: Modify range-based tlbi to decrement scale
In preparation for adding support for LPA2 to the tlb invalidation routines, modify the algorithm used by range-based tlbi to start at the highest 'scale' and decrement instead of starting at the lowest 'scale' and incrementing. This new approach makes it possible to maintain 64K alignment as we work through the range, until the last op (at scale=0). This is required when LPA2 is enabled. (This part will be added in a subsequent commit.)

This change is separated into its own patch because it will also impact non-LPA2 systems, and I want to make it easy to bisect in case it leads to performance regression (see below for benchmarks that suggest this should not be a problem).

The original commit (d1d3aa9 "arm64: tlb: Use the TLBI RANGE feature in arm64") stated this as the reason for _incrementing_ scale:

  However, in most scenarios, the pages = 1 when flush_tlb_range() is
  called. Start from scale = 3 or other proper value (such as scale =
  ilog2(pages)), will incur extra overhead. So increase 'scale' from 0
  to maximum.

But pages=1 is already special-cased by the non-range invalidation path, which will take care of it the first time through the loop (both in the original commit and in my change), so I don't think switching to decrementing scale should have any extra performance impact after all. Indeed, benchmarking kernel compilation, a TLBI-heavy workload, suggests that this new approach actually _improves_ performance slightly (using a virtual machine on Apple M2).

The table shows time to execute the kernel compilation workload with 8 jobs, relative to the baseline without this patch (a more negative number is a bigger speedup). Repeated 9 times across 3 system reboots:

| counter   |  mean | stdev |
|:----------|------:|------:|
| real-time | -0.6% |  0.0% |
| kern-time | -1.6% |  0.5% |
| user-time | -0.4% |  0.1% |

Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231127111737.1897081-2-ryan.roberts@arm.com
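To make the decrement-scale walk concrete, here is a minimal userspace sketch (an illustration, not code from the patch): range_pages() mirrors what the kernel's __TLBI_RANGE_PAGES() macro computes, the printf() calls stand in for the actual TLBI instructions, and the up-front size guard stands in for the MAX_TLBI_RANGE_PAGES fallback the kernel applies before entering this loop.

    #include <stdio.h>

    /* One range op covers (num + 1) * 2^(5*scale + 1) pages, with num in
     * [0, 31] and scale in [0, 3]; mirrors __TLBI_RANGE_PAGES(). */
    static unsigned long range_pages(int num, int scale)
    {
            return (unsigned long)(num + 1) << (5 * scale + 1);
    }

    static void flush_range(unsigned long pages)
    {
            int scale = 3;  /* start at the highest scale and decrement */

            if (pages > range_pages(31, 3)) {
                    /* the kernel falls back to a full flush here */
                    printf("flush all\n");
                    return;
            }

            while (pages > 0) {
                    if (pages == 1) {
                            /* range ops span an even number of pages, so a
                             * single leftover page needs a non-range op */
                            printf("tlbi  (1 page)\n");
                            pages -= 1;
                            continue;
                    }

                    /* largest chunk at this scale that fits the range */
                    long num = (long)(pages >> (5 * scale + 1)) - 1;
                    if (num >= 0) {
                            printf("rtlbi scale=%d num=%2ld (%lu pages)\n",
                                   scale, num, range_pages(num, scale));
                            pages -= range_pages(num, scale);
                    }
                    scale--;
            }
    }

    int main(void)
    {
            flush_range(74565);  /* arbitrary example: ~291 MiB of 4K pages */
            return 0;
    }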
1 parent 2cc14f5 commit e2768b7

1 file changed: arch/arm64/include/asm/tlbflush.h (10 additions, 10 deletions)
@@ -350,14 +350,14 @@ static inline void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
  * entries one by one at the granularity of 'stride'. If the TLB
  * range ops are supported, then:
  *
- * 1. If 'pages' is odd, flush the first page through non-range
- *    operations;
+ * 1. The minimum range granularity is decided by 'scale', so multiple range
+ *    TLBI operations may be required. Start from scale = 3, flush the largest
+ *    possible number of pages ((num+1)*2^(5*scale+1)) that fit into the
+ *    requested range, then decrement scale and continue until one or zero pages
+ *    are left.
  *
- * 2. For remaining pages: the minimum range granularity is decided
- *    by 'scale', so multiple range TLBI operations may be required.
- *    Start from scale = 0, flush the corresponding number of pages
- *    ((num+1)*2^(5*scale+1) starting from 'addr'), then increase it
- *    until no pages left.
+ * 2. If there is 1 page remaining, flush it through non-range operations. Range
+ *    operations can only span an even number of pages.
  *
  * Note that certain ranges can be represented by either num = 31 and
  * scale or num = 0 and scale + 1. The loop below favours the latter
@@ -367,12 +367,12 @@ static inline void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
                                 asid, tlb_level, tlbi_user)            \
 do {                                                                   \
         int num = 0;                                                   \
-        int scale = 0;                                                 \
+        int scale = 3;                                                 \
         unsigned long addr;                                            \
                                                                        \
         while (pages > 0) {                                            \
                 if (!system_supports_tlb_range() ||                    \
-                    pages % 2 == 1) {                                  \
+                    pages == 1) {                                      \
                         addr = __TLBI_VADDR(start, asid);              \
                         __tlbi_level(op, addr, tlb_level);             \
                         if (tlbi_user)                                 \
@@ -392,7 +392,7 @@ do { \
                         start += __TLBI_RANGE_PAGES(num, scale) << PAGE_SHIFT; \
                         pages -= __TLBI_RANGE_PAGES(num, scale);       \
                 }                                                      \
-                scale++;                                               \
+                scale--;                                               \
         }                                                              \
 } while (0)
 
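As a worked example (illustrative numbers, not from the patch): feeding a hypothetical 74565-page range through the rewritten loop emits range ops of scale=3/num=0 (65536 pages), scale=2/num=3 (8192 pages), scale=1/num=12 (832 pages) and scale=0/num=1 (4 pages), then one non-range op for the final page. The chunks shrink monotonically as scale decrements, which is what lets the intermediate boundaries stay 64K-aligned until the scale=0 tail.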
