Commit 97c5550

pjaroszynski authored and willdeacon committed
arm64: contpte: fix set_access_flags() no-op check for SMMU/ATS faults

contpte_ptep_set_access_flags() compared the gathered ptep_get() value
against the requested entry to detect no-ops. ptep_get() ORs AF/dirty
from all sub-PTEs in the CONT block, so a dirty sibling can make the
target appear already-dirty. When the gathered value matches entry, the
function returns 0 even though the target sub-PTE still has PTE_RDONLY
set in hardware.

For a CPU with FEAT_HAFDBS this gathered view is fine, since hardware
may set AF/dirty on any sub-PTE and CPU TLB behavior is effectively
gathered across the CONT range. But page-table walkers that evaluate
each descriptor individually (e.g. a CPU without DBM support, or an SMMU
without HTTU, or with HA/HD disabled in CD.TCR) can keep faulting on the
unchanged target sub-PTE, causing an infinite fault loop.

Gathering can therefore cause false no-ops when only a sibling has been
updated:
 - write faults: target still has PTE_RDONLY (needs PTE_RDONLY cleared)
 - read faults: target still lacks PTE_AF

Fix by checking each sub-PTE against the requested AF/dirty/write state
(the same bits consumed by __ptep_set_access_flags()), using raw per-PTE
values rather than the gathered ptep_get() view, before returning no-op.
Keep using the raw target PTE for the write-bit unfold decision.

Per Arm ARM (DDI 0487) D8.7.1 ("The Contiguous bit"), any sub-PTE in a
CONT range may become the effective cached translation and software must
maintain consistent attributes across the range.

Fixes: 4602e57 ("arm64/mm: wire up PTE_CONT for user mappings")
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Breno Leitao <leitao@debian.org>
Cc: stable@vger.kernel.org
Reviewed-by: Alistair Popple <apopple@nvidia.com>
Reviewed-by: James Houghton <jthoughton@google.com>
Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Piotr Jaroszynski <pjaroszynski@nvidia.com>
Acked-by: Balbir Singh <balbirs@nvidia.com>
Signed-off-by: Will Deacon <will@kernel.org>
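
The false no-op described in the commit message can be reproduced in a small user-space model. This is only a sketch under stated assumptions: the `model_*` names, the 16-entry block size, and the bit positions are illustrative and are not kernel definitions, and `model_gather()` is a simplified stand-in for the gathering that ptep_get() performs, not the kernel implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative model only: names, block size and bit positions are assumptions. */
#define MODEL_CONT_PTES	16
#define M_RDONLY	(1ULL << 7)
#define M_AF		(1ULL << 10)
#define M_WRITE		(1ULL << 51)
#define M_DIRTY		(1ULL << 55)
#define M_CMP_MASK	(M_RDONLY | M_AF | M_WRITE | M_DIRTY)

typedef uint64_t model_pte_t;

/* Dirty if software-dirty, or writable with the read-only bit cleared. */
static bool model_dirty(model_pte_t pte)
{
	return (pte & M_DIRTY) || ((pte & M_WRITE) && !(pte & M_RDONLY));
}

/* Mark dirty; a writable entry also loses its read-only bit. */
static model_pte_t model_mkdirty(model_pte_t pte)
{
	pte |= M_DIRTY;
	if (pte & M_WRITE)
		pte &= ~M_RDONLY;
	return pte;
}

/* Simplified gathered view: fold AF/dirty from every sub-PTE into block[idx]. */
static model_pte_t model_gather(const model_pte_t *block, int idx)
{
	model_pte_t pte = block[idx];

	for (int i = 0; i < MODEL_CONT_PTES; i++) {
		if (model_dirty(block[i]))
			pte = model_mkdirty(pte);
		if (block[i] & M_AF)
			pte |= M_AF;
	}
	return pte;
}

/* Old-style check: compare the gathered target against the requested entry. */
static bool model_old_is_noop(const model_pte_t *block, int idx,
			      model_pte_t entry)
{
	return model_gather(block, idx) == entry;
}

/* New-style check: every raw sub-PTE must match the requested flag bits. */
static bool model_new_is_noop(const model_pte_t *block, model_pte_t entry)
{
	for (int i = 0; i < MODEL_CONT_PTES; i++) {
		if ((block[i] & M_CMP_MASK) != (entry & M_CMP_MASK))
			return false;
	}
	return true;
}

/* Fill the block with young, writable, clean (still read-only) entries. */
static void model_init_clean(model_pte_t *block)
{
	for (int i = 0; i < MODEL_CONT_PTES; i++)
		block[i] = M_AF | M_WRITE | M_RDONLY;
}
```

In this model, clearing `M_RDONLY` on one sibling and then requesting a dirty, writable entry for a different sub-PTE makes `model_old_is_noop()` report a no-op while the target still carries `M_RDONLY`; `model_new_is_noop()` reports a mismatch instead, which corresponds to the write-fault case listed above.
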
1 parent 0100e49 commit 97c5550

1 file changed

Lines changed: 49 additions & 4 deletions

arch/arm64/mm/contpte.c

@@ -599,6 +599,27 @@ void contpte_clear_young_dirty_ptes(struct vm_area_struct *vma,
 }
 EXPORT_SYMBOL_GPL(contpte_clear_young_dirty_ptes);
 
+static bool contpte_all_subptes_match_access_flags(pte_t *ptep, pte_t entry)
+{
+	pte_t *cont_ptep = contpte_align_down(ptep);
+	/*
+	 * PFNs differ per sub-PTE. Match only bits consumed by
+	 * __ptep_set_access_flags(): AF, DIRTY and write permission.
+	 */
+	const pteval_t cmp_mask = PTE_RDONLY | PTE_AF | PTE_WRITE | PTE_DIRTY;
+	pteval_t entry_cmp = pte_val(entry) & cmp_mask;
+	int i;
+
+	for (i = 0; i < CONT_PTES; i++) {
+		pteval_t pte_cmp = pte_val(__ptep_get(cont_ptep + i)) & cmp_mask;
+
+		if (pte_cmp != entry_cmp)
+			return false;
+	}
+
+	return true;
+}
+
 int contpte_ptep_set_access_flags(struct vm_area_struct *vma,
 					unsigned long addr, pte_t *ptep,
 					pte_t entry, int dirty)
@@ -608,13 +629,37 @@ int contpte_ptep_set_access_flags(struct vm_area_struct *vma,
 	int i;
 
 	/*
-	 * Gather the access/dirty bits for the contiguous range. If nothing has
-	 * changed, its a noop.
+	 * Check whether all sub-PTEs in the CONT block already match the
+	 * requested access flags/write permission, using raw per-PTE values
+	 * rather than the gathered ptep_get() view.
+	 *
+	 * __ptep_set_access_flags() can update AF, dirty and write
+	 * permission, but only to make the mapping more permissive.
+	 *
+	 * ptep_get() gathers AF/dirty state across the whole CONT block,
+	 * which is correct for a CPU with FEAT_HAFDBS. But page-table
+	 * walkers that evaluate each descriptor individually (e.g. a CPU
+	 * without DBM support, or an SMMU without HTTU, or with HA/HD
+	 * disabled in CD.TCR) can keep faulting on the target sub-PTE if
+	 * only a sibling has been updated. Gathering can therefore cause
+	 * false no-ops when only a sibling has been updated:
+	 *  - write faults: target still has PTE_RDONLY (needs PTE_RDONLY cleared)
+	 *  - read faults:  target still lacks PTE_AF
+	 *
+	 * Per Arm ARM (DDI 0487) D8.7.1, any sub-PTE in a CONT range may
+	 * become the effective cached translation, so all entries must have
+	 * consistent attributes. Check the full CONT block before returning
+	 * no-op, and when any sub-PTE mismatches, proceed to update the whole
+	 * range.
 	 */
-	orig_pte = pte_mknoncont(ptep_get(ptep));
-	if (pte_val(orig_pte) == pte_val(entry))
+	if (contpte_all_subptes_match_access_flags(ptep, entry))
 		return 0;
 
+	/*
+	 * Use raw target pte (not gathered) for write-bit unfold decision.
+	 */
+	orig_pte = pte_mknoncont(__ptep_get(ptep));
+
 	/*
 	 * We can fix up access/dirty bits without having to unfold the contig
 	 * range. But if the write bit is changing, we must unfold.
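
The control flow that the two hunks set up can be summarised in a user-space sketch: a masked per-sub-PTE comparison that ignores the differing PFNs, followed by a decision based on the raw target's write bit. Everything prefixed `sk_` or `SK_` is hypothetical, the bit positions and PFN shift are assumptions for illustration, and the three-way outcome is inferred from the comments in the diff rather than copied from the rest of the function, which this page does not show.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical constants for illustration; not the kernel's definitions. */
#define SK_CONT_PTES	16
#define SK_PFN_SHIFT	12
#define SK_RDONLY	(1ULL << 7)
#define SK_AF		(1ULL << 10)
#define SK_WRITE	(1ULL << 51)
#define SK_DIRTY	(1ULL << 55)
#define SK_CMP_MASK	(SK_RDONLY | SK_AF | SK_WRITE | SK_DIRTY)

enum sk_action {
	SK_NOOP,	/* every sub-PTE already matches the request */
	SK_FIXUP_RANGE,	/* write bit unchanged: update flags across the block */
	SK_UNFOLD,	/* write bit changing: the block must be unfolded first */
};

/* Populate a block whose entries share flags but have consecutive PFNs. */
static void sk_fill(uint64_t *block, uint64_t flags, uint64_t base_pfn)
{
	for (int i = 0; i < SK_CONT_PTES; i++)
		block[i] = ((base_pfn + i) << SK_PFN_SHIFT) | flags;
}

/* Compare only the masked flag bits, so per-entry PFNs never cause a mismatch. */
static bool sk_all_match(const uint64_t *block, uint64_t entry)
{
	const uint64_t entry_cmp = entry & SK_CMP_MASK;

	for (int i = 0; i < SK_CONT_PTES; i++) {
		if ((block[i] & SK_CMP_MASK) != entry_cmp)
			return false;
	}
	return true;
}

/* Decide what to do, reading the raw target entry for the write-bit test. */
static enum sk_action sk_decide(const uint64_t *block, int target,
				uint64_t entry)
{
	bool raw_write, want_write;

	if (sk_all_match(block, entry))
		return SK_NOOP;

	raw_write = (block[target] & SK_WRITE) != 0;
	want_write = (entry & SK_WRITE) != 0;

	return raw_write == want_write ? SK_FIXUP_RANGE : SK_UNFOLD;
}
```

The masked comparison is what lets a single requested entry be checked against sixteen entries that legitimately differ in their output address; comparing whole values would never report a match for any sibling.
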
