Skip to content

Commit 5315644

Browse files
surenbaghdasaryanakpm00
authored andcommitted
mm: do not increment pgfault stats when page fault handler retries
If the page fault handler requests a retry, we will count the fault multiple times. This is a relatively harmless problem as the retry paths are not often requested, and the only user-visible problem is that the fault counter will be slightly higher than it should be. Nevertheless, userspace only took one fault, and should not see the fact that the kernel had to retry the fault multiple times. Move page fault accounting into mm_account_fault() and skip incomplete faults which will be accounted upon completion. Link: https://lkml.kernel.org/r/20230419175836.3857458-1-surenb@google.com Fixes: d065bd8 ("mm: retry page fault when blocking on disk transfer") Signed-off-by: Suren Baghdasaryan <surenb@google.com> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Acked-by: Peter Xu <peterx@redhat.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Jan Kara <jack@suse.cz> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Josef Bacik <josef@toxicpanda.com> Cc: Laurent Dufour <ldufour@linux.ibm.com> Cc: Liam R. Howlett <Liam.Howlett@Oracle.com> Cc: Lorenzo Stoakes <lstoakes@gmail.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Michel Lespinasse <michel@lespinasse.org> Cc: Minchan Kim <minchan@google.com> Cc: Punit Agrawal <punit.agrawal@bytedance.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
1 parent d2658f2 commit 5315644

1 file changed

Lines changed: 27 additions & 19 deletions

File tree

mm/memory.c

Lines changed: 27 additions & 19 deletions
Original file line numberDiff line numberDiff line change
@@ -5104,24 +5104,31 @@ static vm_fault_t __handle_mm_fault(struct vm_area_struct *vma,
51045104
* updates. However, note that the handling of PERF_COUNT_SW_PAGE_FAULTS should
51055105
* still be in per-arch page fault handlers at the entry of page fault.
51065106
*/
5107-
static inline void mm_account_fault(struct pt_regs *regs,
5107+
static inline void mm_account_fault(struct mm_struct *mm, struct pt_regs *regs,
51085108
unsigned long address, unsigned int flags,
51095109
vm_fault_t ret)
51105110
{
51115111
bool major;
51125112

5113+
/* Incomplete faults will be accounted upon completion. */
5114+
if (ret & VM_FAULT_RETRY)
5115+
return;
5116+
51135117
/*
5114-
* We don't do accounting for some specific faults:
5115-
*
5116-
* - Unsuccessful faults (e.g. when the address wasn't valid). That
5117-
* includes arch_vma_access_permitted() failing before reaching here.
5118-
* So this is not a "this many hardware page faults" counter. We
5119-
* should use the hw profiling for that.
5120-
*
5121-
* - Incomplete faults (VM_FAULT_RETRY). They will only be counted
5122-
* once they're completed.
5118+
* To preserve the behavior of older kernels, PGFAULT counters record
5119+
* both successful and failed faults, as opposed to perf counters,
5120+
* which ignore failed cases.
51235121
*/
5124-
if (ret & (VM_FAULT_ERROR | VM_FAULT_RETRY))
5122+
count_vm_event(PGFAULT);
5123+
count_memcg_event_mm(mm, PGFAULT);
5124+
5125+
/*
5126+
* Do not account for unsuccessful faults (e.g. when the address wasn't
5127+
* valid). That includes arch_vma_access_permitted() failing before
5128+
* reaching here. So this is not a "this many hardware page faults"
5129+
* counter. We should use the hw profiling for that.
5130+
*/
5131+
if (ret & VM_FAULT_ERROR)
51255132
return;
51265133

51275134
/*
@@ -5204,21 +5211,22 @@ static vm_fault_t sanitize_fault_flags(struct vm_area_struct *vma,
52045211
vm_fault_t handle_mm_fault(struct vm_area_struct *vma, unsigned long address,
52055212
unsigned int flags, struct pt_regs *regs)
52065213
{
5214+
/* If the fault handler drops the mmap_lock, vma may be freed */
5215+
struct mm_struct *mm = vma->vm_mm;
52075216
vm_fault_t ret;
52085217

52095218
__set_current_state(TASK_RUNNING);
52105219

5211-
count_vm_event(PGFAULT);
5212-
count_memcg_event_mm(vma->vm_mm, PGFAULT);
5213-
52145220
ret = sanitize_fault_flags(vma, &flags);
52155221
if (ret)
5216-
return ret;
5222+
goto out;
52175223

52185224
if (!arch_vma_access_permitted(vma, flags & FAULT_FLAG_WRITE,
52195225
flags & FAULT_FLAG_INSTRUCTION,
5220-
flags & FAULT_FLAG_REMOTE))
5221-
return VM_FAULT_SIGSEGV;
5226+
flags & FAULT_FLAG_REMOTE)) {
5227+
ret = VM_FAULT_SIGSEGV;
5228+
goto out;
5229+
}
52225230

52235231
/*
52245232
* Enable the memcg OOM handling for faults triggered in user
@@ -5247,8 +5255,8 @@ vm_fault_t handle_mm_fault(struct vm_area_struct *vma, unsigned long address,
52475255
if (task_in_memcg_oom(current) && !(ret & VM_FAULT_OOM))
52485256
mem_cgroup_oom_synchronize(false);
52495257
}
5250-
5251-
mm_account_fault(regs, address, flags, ret);
5258+
out:
5259+
mm_account_fault(mm, regs, address, flags, ret);
52525260

52535261
return ret;
52545262
}

0 commit comments

Comments
 (0)