
Commit 1ca0f9e

hansendc authored and gregkh committed
mm: add a ptdesc flag to mark kernel page tables
commit 27bfafa upstream.

The page tables used to map the kernel and userspace often have very different handling rules. There are frequently *_kernel() variants of functions just for kernel page tables. That's not great and has led to code duplication.

Instead of having completely separate call paths, allow a 'ptdesc' to be marked as being for kernel mappings. Introduce helpers to set and clear this status.

Note: this uses the PG_referenced bit. Page flags are a great fit for this since it is truly a single bit of information. Use PG_referenced itself because it's a fairly benign flag (as opposed to things like PG_locked). It's also (according to Willy) unlikely to go away any time soon.

PG_referenced is not in PAGE_FLAGS_CHECK_AT_FREE. It does not need to be cleared before freeing the page, and pages coming out of the allocator should have it cleared. Regardless, introduce an API to clear it anyway. Having symmetry in the API makes it easier to change the underlying implementation later, like if there was a need to move to a PAGE_FLAGS_CHECK_AT_FREE bit.

Link: https://lkml.kernel.org/r/20251022082635.2462433-3-baolu.lu@linux.intel.com
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Jean-Philippe Brucker <jean-philippe@linaro.org>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "Uladzislau Rezki (Sony)" <urezki@gmail.com>
Cc: Vasant Hegde <vasant.hegde@amd.com>
Cc: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Will Deacon <will@kernel.org>
Cc: Yi Lai <yi1.lai@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
1 parent b3bbbf9 commit 1ca0f9e

1 file changed

Lines changed: 41 additions & 0 deletions

include/linux/mm.h

@@ -2947,6 +2947,7 @@ static inline pmd_t *pmd_alloc(struct mm_struct *mm, pud_t *pud, unsigned long a
 #endif /* CONFIG_MMU */
 
 enum pt_flags {
+	PT_kernel = PG_referenced,
 	PT_reserved = PG_reserved,
 	/* High bits are used for zone/node/section */
 };
@@ -2972,6 +2973,46 @@ static inline bool pagetable_is_reserved(struct ptdesc *pt)
 	return test_bit(PT_reserved, &pt->pt_flags.f);
 }
 
+/**
+ * ptdesc_set_kernel - Mark a ptdesc used to map the kernel
+ * @ptdesc: The ptdesc to be marked
+ *
+ * Kernel page tables often need special handling. Set a flag so that
+ * the handling code knows this ptdesc will not be used for userspace.
+ */
+static inline void ptdesc_set_kernel(struct ptdesc *ptdesc)
+{
+	set_bit(PT_kernel, &ptdesc->pt_flags.f);
+}
+
+/**
+ * ptdesc_clear_kernel - Mark a ptdesc as no longer used to map the kernel
+ * @ptdesc: The ptdesc to be unmarked
+ *
+ * Use when the ptdesc is no longer used to map the kernel and no longer
+ * needs special handling.
+ */
+static inline void ptdesc_clear_kernel(struct ptdesc *ptdesc)
+{
+	/*
+	 * Note: the 'PG_referenced' bit does not strictly need to be
+	 * cleared before freeing the page. But this is nice for
+	 * symmetry.
+	 */
+	clear_bit(PT_kernel, &ptdesc->pt_flags.f);
+}
+
+/**
+ * ptdesc_test_kernel - Check if a ptdesc is used to map the kernel
+ * @ptdesc: The ptdesc being tested
+ *
+ * Call to tell if the ptdesc used to map the kernel.
+ */
+static inline bool ptdesc_test_kernel(const struct ptdesc *ptdesc)
+{
+	return test_bit(PT_kernel, &ptdesc->pt_flags.f);
+}
+
 /**
  * pagetable_alloc - Allocate pagetables
  * @gfp: GFP flags
