Skip to content

Commit c672a4b

Browse files
sjp38gregkh
authored andcommitted
mm/damon/core: handle zero {aggregation,ops_update} intervals
commit 3488af0 upstream. Patch series "mm/damon/core: fix handling of zero non-sampling intervals". DAMON's internal intervals accounting logic is not correctly handling non-sampling intervals of zero values for a wrong assumption. This could cause unexpected monitoring behavior, and even result in infinite hang of DAMON sysfs interface user threads in case of zero aggregation interval. Fix those by updating the intervals accounting logic. For details of the root case and solutions, please refer to commit messages of fixes. This patch (of 2): DAMON's logics to determine if this is the time to do aggregation and ops update assumes next_{aggregation,ops_update}_sis are always set larger than current passed_sample_intervals. And therefore it further assumes continuously incrementing passed_sample_intervals every sampling interval will make it reaches to the next_{aggregation,ops_update}_sis in future. The logic therefore make the action and update next_{aggregation,ops_updaste}_sis only if passed_sample_intervals is same to the counts, respectively. If Aggregation interval or Ops update interval are zero, however, next_aggregation_sis or next_ops_update_sis are set same to current passed_sample_intervals, respectively. And passed_sample_intervals is incremented before doing the next_{aggregation,ops_update}_sis check. Hence, passed_sample_intervals becomes larger than next_{aggregation,ops_update}_sis, and the logic says it is not the time to do the action and update next_{aggregation,ops_update}_sis forever, until an overflow happens. In other words, DAMON stops doing aggregations or ops updates effectively forever, and users cannot get monitoring results. Based on the documents and the common sense, a reasonable behavior for such inputs is doing an aggregation and an ops update for every sampling interval. Handle the case by removing the assumption. Note that this could incur particular real issue for DAMON sysfs interface users, in case of zero Aggregation interval. When user starts DAMON with zero Aggregation interval and asks online DAMON parameter tuning via DAMON sysfs interface, the request is handled by the aggregation callback. Until the callback finishes the work, the user who requested the online tuning just waits. Hence, the user will be stuck until the passed_sample_intervals overflows. Link: https://lkml.kernel.org/r/20241031183757.49610-1-sj@kernel.org Link: https://lkml.kernel.org/r/20241031183757.49610-2-sj@kernel.org Fixes: 4472edf ("mm/damon/core: use number of passed access sampling as a timer") Signed-off-by: SeongJae Park <sj@kernel.org> Cc: <stable@vger.kernel.org> [6.7.x] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
1 parent 2d339a1 commit c672a4b

1 file changed

Lines changed: 3 additions & 3 deletions

File tree

mm/damon/core.c

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -2001,7 +2001,7 @@ static int kdamond_fn(void *data)
20012001
if (ctx->ops.check_accesses)
20022002
max_nr_accesses = ctx->ops.check_accesses(ctx);
20032003

2004-
if (ctx->passed_sample_intervals == next_aggregation_sis) {
2004+
if (ctx->passed_sample_intervals >= next_aggregation_sis) {
20052005
kdamond_merge_regions(ctx,
20062006
max_nr_accesses / 10,
20072007
sz_limit);
@@ -2019,7 +2019,7 @@ static int kdamond_fn(void *data)
20192019

20202020
sample_interval = ctx->attrs.sample_interval ?
20212021
ctx->attrs.sample_interval : 1;
2022-
if (ctx->passed_sample_intervals == next_aggregation_sis) {
2022+
if (ctx->passed_sample_intervals >= next_aggregation_sis) {
20232023
ctx->next_aggregation_sis = next_aggregation_sis +
20242024
ctx->attrs.aggr_interval / sample_interval;
20252025

@@ -2029,7 +2029,7 @@ static int kdamond_fn(void *data)
20292029
ctx->ops.reset_aggregated(ctx);
20302030
}
20312031

2032-
if (ctx->passed_sample_intervals == next_ops_update_sis) {
2032+
if (ctx->passed_sample_intervals >= next_ops_update_sis) {
20332033
ctx->next_ops_update_sis = next_ops_update_sis +
20342034
ctx->attrs.ops_update_interval /
20352035
sample_interval;

0 commit comments

Comments
 (0)