
Commit 5815d93

jgunthorpe authored and joergroedel committed
iommupt: Only cache flush memory changed by unmap

The cache flush was happening at every level, across the whole range of
iteration, even if no leaf entries or tables were cleared. Instead, flush
only the sub-range that was actually written. Over-flushing is not a
correctness problem, but it does hurt unmap performance.

After this series, the performance compared to the original VT-d
implementation with cache flushing turned on is:

 map_pages
      pgsz , avg new,old ns , min new,old ns , min % (+ve is better)
      2^12 ,  253,266       ,  213,227       ,  6.06
      2^21 ,  246,244       ,  221,219       ,  0.00
      2^30 ,  231,240       ,  209,217       ,  3.03
  256*2^12 , 2604,2668      , 2415,2540      ,  4.04
  256*2^21 , 2495,2824      , 2390,2734      , 12.12
  256*2^30 , 2542,2845      , 2380,2718      , 12.12

 unmap_pages
      pgsz , avg new,old ns , min new,old ns , min % (+ve is better)
      2^12 ,  259,292       ,  222,251       , 11.11
      2^21 ,  255,259       ,  227,236       ,  3.03
      2^30 ,  238,254       ,  217,230       ,  5.05
  256*2^12 , 2751,2620      , 2417,2437      ,  0.00
  256*2^21 , 2461,2526      , 2377,2423      ,  1.01
  256*2^30 , 2498,2543      , 2370,2404      ,  1.01

Fixes: efa03da ("iommupt: Flush the CPU cache after any writes to the page table")
Reported-by: Francois Dugast <francois.dugast@intel.com>
Closes: https://lore.kernel.org/all/20260121130233.257428-1-francois.dugast@intel.com/
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Tested-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
1 parent 63804fe commit 5815d93

1 file changed

Lines changed: 10 additions & 1 deletion

File tree

drivers/iommu/generic_pt/iommu_pt.h

@@ -931,6 +931,8 @@ static __maybe_unused int __unmap_range(struct pt_range *range, void *arg,
 				       struct pt_table_p *table)
 {
 	struct pt_state pts = pt_init(range, level, table);
+	unsigned int flush_start_index = UINT_MAX;
+	unsigned int flush_end_index = UINT_MAX;
 	struct pt_unmap_args *unmap = arg;
 	unsigned int num_oas = 0;
 	unsigned int start_index;
@@ -986,6 +988,9 @@ static __maybe_unused int __unmap_range(struct pt_range *range, void *arg,
 				iommu_pages_list_add(&unmap->free_list,
 						     pts.table_lower);
 				pt_clear_entries(&pts, ilog2(1));
+				if (pts.index < flush_start_index)
+					flush_start_index = pts.index;
+				flush_end_index = pts.index + 1;
 			}
 			pts.index++;
 		} else {
@@ -999,15 +1004,19 @@ static __maybe_unused int __unmap_range(struct pt_range *range, void *arg,
 			num_contig_lg2 = pt_entry_num_contig_lg2(&pts);
 			pt_clear_entries(&pts, num_contig_lg2);
 			num_oas += log2_to_int(num_contig_lg2);
+			if (pts.index < flush_start_index)
+				flush_start_index = pts.index;
 			pts.index += log2_to_int(num_contig_lg2);
+			flush_end_index = pts.index;
 		}
 		if (pts.index >= pts.end_index)
 			break;
 		pts.type = pt_load_entry_raw(&pts);
 	} while (true);
 
 	unmap->unmapped += log2_mul(num_oas, pt_table_item_lg2sz(&pts));
-	flush_writes_range(&pts, start_index, pts.index);
+	if (flush_start_index != flush_end_index)
+		flush_writes_range(&pts, flush_start_index, flush_end_index);
 
 	return ret;
 }
