Commit 273cc94

sched_ext: Call ops.update_idle() after updating builtin idle bits
BPF schedulers that use both builtin CPU idle mechanism and ops.update_idle() may want to use the latter to create interlocking between ops.enqueue() and CPU idle transitions so that either ops.enqueue() sees the idle bit or ops.update_idle() sees the task queued somewhere. This can prevent race conditions where CPUs go idle while tasks are waiting in DSQs.

For such interlocking to work, ops.update_idle() must be called after builtin CPU masks are updated. Relocate the invocation. Currently, there are no ordering requirements on transitions from idle and this relocation isn't expected to make meaningful differences in that direction.

This also makes the ops.update_idle() behavior semantically consistent: any action performed in this callback should be able to override the builtin idle state, not the other way around.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-and-tested-by: Andrea Righi <arighi@nvidia.com>
Acked-by: Changwoo Min <changwoo@igalia.com>

1 parent aa3a7b6 commit 273cc94

1 file changed

Lines changed: 15 additions & 10 deletions

kernel/sched/ext_idle.c

@@ -738,16 +738,6 @@ void __scx_update_idle(struct rq *rq, bool idle, bool do_notify)
 
 	lockdep_assert_rq_held(rq);
 
-	/*
-	 * Trigger ops.update_idle() only when transitioning from a task to
-	 * the idle thread and vice versa.
-	 *
-	 * Idle transitions are indicated by do_notify being set to true,
-	 * managed by put_prev_task_idle()/set_next_task_idle().
-	 */
-	if (SCX_HAS_OP(sch, update_idle) && do_notify && !scx_rq_bypassing(rq))
-		SCX_CALL_OP(sch, SCX_KF_REST, update_idle, rq, cpu_of(rq), idle);
-
 	/*
 	 * Update the idle masks:
 	 * - for real idle transitions (do_notify == true)
@@ -765,6 +755,21 @@ void __scx_update_idle(struct rq *rq, bool idle, bool do_notify)
 	if (static_branch_likely(&scx_builtin_idle_enabled))
 		if (do_notify || is_idle_task(rq->curr))
 			update_builtin_idle(cpu, idle);
+
+	/*
+	 * Trigger ops.update_idle() only when transitioning from a task to
+	 * the idle thread and vice versa.
+	 *
+	 * Idle transitions are indicated by do_notify being set to true,
+	 * managed by put_prev_task_idle()/set_next_task_idle().
+	 *
+	 * This must come after builtin idle update so that BPF schedulers can
+	 * create interlocking between ops.update_idle() and ops.enqueue() -
+	 * either enqueue() sees the idle bit or update_idle() sees the task
+	 * that enqueue() queued.
+	 */
+	if (SCX_HAS_OP(sch, update_idle) && do_notify && !scx_rq_bypassing(rq))
+		SCX_CALL_OP(sch, SCX_KF_REST, update_idle, rq, cpu_of(rq), idle);
 }
 
 static void reset_idle_masks(struct sched_ext_ops *ops)
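
The interlock described in the commit message can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not kernel or BPF scheduler code: `struct cpu_model` and every function below are hypothetical names invented for the illustration, the "CPU" is a pair of C11 atomics, and the interleavings are enumerated sequentially at the granularity of whole steps. It relies on the default sequentially consistent atomics; a real scheduler would have to provide equivalent ordering itself.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical userspace model of one CPU. None of these names are kernel
 * or sched_ext APIs; they only mirror the roles described in the commit.
 */
struct cpu_model {
	atomic_bool idle_bit;	/* stands in for the builtin idle mask bit */
	atomic_int nr_queued;	/* stands in for tasks waiting in a DSQ */
};

static void model_init(struct cpu_model *c)
{
	atomic_init(&c->idle_bit, false);
	atomic_init(&c->nr_queued, 0);
}

/* Stands in for update_builtin_idle(cpu, true). */
static void set_idle_bit(struct cpu_model *c)
{
	atomic_store(&c->idle_bit, true);
}

/* Stands in for ops.update_idle(): does the callback see a queued task? */
static bool update_idle_sees_task(struct cpu_model *c)
{
	return atomic_load(&c->nr_queued) > 0;
}

/* Stands in for ops.enqueue(): queue the task, then test the idle bit. */
static bool enqueue_sees_idle(struct cpu_model *c)
{
	atomic_fetch_add(&c->nr_queued, 1);
	return atomic_load(&c->idle_bit);
}

/* Old order: the callback runs before the idle bit is set. */
static bool old_order_can_strand(void)
{
	struct cpu_model c;
	bool cb, enq;

	model_init(&c);
	cb = update_idle_sees_task(&c);	/* queue is still empty */
	enq = enqueue_sees_idle(&c);	/* idle bit is still clear */
	set_idle_bit(&c);		/* CPU goes idle with a task queued */
	return !cb && !enq;
}

/* New order: the idle bit is set first; try each position for enqueue. */
static bool new_order_can_strand(void)
{
	for (int pos = 0; pos < 3; pos++) {
		struct cpu_model c;
		bool cb, enq = false;

		model_init(&c);
		if (pos == 0)
			enq = enqueue_sees_idle(&c);
		set_idle_bit(&c);
		if (pos == 1)
			enq = enqueue_sees_idle(&c);
		cb = update_idle_sees_task(&c);
		if (pos == 2)
			enq = enqueue_sees_idle(&c);
		if (!cb && !enq)
			return true;	/* neither side noticed the other */
	}
	return false;
}

/* Self-check: the old order has a stranding interleaving, the new one does not. */
static void check_interlock(void)
{
	assert(old_order_can_strand());
	assert(!new_order_can_strand());
}
```

Calling `check_interlock()` from a test harness exercises both orderings. In the old order, an enqueue that lands between the callback and the idle-bit update is missed by both sides; in the new order, wherever the enqueue lands, at least one side observes the other.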
