@@ -81,6 +81,7 @@ Documentation written by Tom Zanussi
8181 .usecs display a common_timestamp in microseconds
8282 .percent display a number of percentage value
8383 .graph display a bar-graph of a value
84+ .stacktrace display as a stacktrace (must by a long[] type)
8485 ============= =================================================
8586
8687 Note that in general the semantics of a given field aren't
@@ -1786,6 +1787,8 @@ or assigned to a variable and referenced in a subsequent expression::
17861787 # echo 'hist:keys=next_pid:us_per_sec=1000000 ...' >> event/trigger
17871788 # echo 'hist:keys=next_pid:timestamp_secs=common_timestamp/$us_per_sec ...' >> event/trigger
17881789
1790+ Variables can even hold stacktraces, which are useful with synthetic events.
1791+
178917922.2.2 Synthetic Events
17901793----------------------
17911794
@@ -1861,7 +1864,7 @@ A histogram can now be defined for the new synthetic event::
18611864The above shows the latency "lat" in a power of 2 grouping.
18621865
18631866Like any other event, once a histogram is enabled for the event, the
1864- output can be displayed by reading the event's 'hist' file.
1867+ output can be displayed by reading the event's 'hist' file::
18651868
18661869 # cat /sys/kernel/tracing/events/synthetic/wakeup_latency/hist
18671870
@@ -1908,7 +1911,7 @@ output can be displayed by reading the event's 'hist' file.
19081911
19091912
19101913The latency values can also be grouped linearly by a given size with
1911- the ".buckets" modifier and specify a size (in this case groups of 10).
1914+ the ".buckets" modifier and specify a size (in this case groups of 10)::
19121915
19131916 # echo 'hist:keys=pid,prio,lat.buckets=10:sort=lat' >> \
19141917 /sys/kernel/tracing/events/synthetic/wakeup_latency/trigger
@@ -1940,6 +1943,157 @@ the ".buckets" modifier and specify a size (in this case groups of 10).
19401943 Entries: 16
19411944 Dropped: 0
19421945
1946+ To save stacktraces, create a synthetic event with a field of type "unsigned long[]"
1947+ or even just "long[]". For example, to see how long a task is blocked in an
1948+ uninterruptible state::
1949+
1950+ # cd /sys/kernel/tracing
1951+ # echo 's:block_lat pid_t pid; u64 delta; unsigned long[] stack;' > dynamic_events
1952+ # echo 'hist:keys=next_pid:ts=common_timestamp.usecs,st=stacktrace if prev_state == 2' >> events/sched/sched_switch/trigger
1953+ # echo 'hist:keys=prev_pid:delta=common_timestamp.usecs-$ts,s=$st:onmax($delta).trace(block_lat,prev_pid,$delta,$s)' >> events/sched/sched_switch/trigger
1954+ # echo 1 > events/synthetic/block_lat/enable
1955+ # cat trace
1956+
1957+ # tracer: nop
1958+ #
1959+ # entries-in-buffer/entries-written: 2/2 #P:8
1960+ #
1961+ # _-----=> irqs-off/BH-disabled
1962+ # / _----=> need-resched
1963+ # | / _---=> hardirq/softirq
1964+ # || / _--=> preempt-depth
1965+ # ||| / _-=> migrate-disable
1966+ # |||| / delay
1967+ # TASK-PID CPU# ||||| TIMESTAMP FUNCTION
1968+ # | | | ||||| | |
1969+ <idle>-0 [005] d..4. 521.164922: block_lat: pid=0 delta=8322 stack=STACK:
1970+ => __schedule+0x448/0x7b0
1971+ => schedule+0x5a/0xb0
1972+ => io_schedule+0x42/0x70
1973+ => bit_wait_io+0xd/0x60
1974+ => __wait_on_bit+0x4b/0x140
1975+ => out_of_line_wait_on_bit+0x91/0xb0
1976+ => jbd2_journal_commit_transaction+0x1679/0x1a70
1977+ => kjournald2+0xa9/0x280
1978+ => kthread+0xe9/0x110
1979+ => ret_from_fork+0x2c/0x50
1980+
1981+ <...>-2 [004] d..4. 525.184257: block_lat: pid=2 delta=76 stack=STACK:
1982+ => __schedule+0x448/0x7b0
1983+ => schedule+0x5a/0xb0
1984+ => schedule_timeout+0x11a/0x150
1985+ => wait_for_completion_killable+0x144/0x1f0
1986+ => __kthread_create_on_node+0xe7/0x1e0
1987+ => kthread_create_on_node+0x51/0x70
1988+ => create_worker+0xcc/0x1a0
1989+ => worker_thread+0x2ad/0x380
1990+ => kthread+0xe9/0x110
1991+ => ret_from_fork+0x2c/0x50
1992+
1993+ A synthetic event that has a stacktrace field may use it as a key in
1994+ histogram::
1995+
1996+ # echo 'hist:keys=delta.buckets=100,stack.stacktrace:sort=delta' > events/synthetic/block_lat/trigger
1997+ # cat events/synthetic/block_lat/hist
1998+
1999+ # event histogram
2000+ #
2001+ # trigger info: hist:keys=delta.buckets=100,stack.stacktrace:vals=hitcount:sort=delta.buckets=100:size=2048 [active]
2002+ #
2003+ { delta: ~ 0-99, stack.stacktrace __schedule+0xa19/0x1520
2004+ schedule+0x6b/0x110
2005+ io_schedule+0x46/0x80
2006+ bit_wait_io+0x11/0x80
2007+ __wait_on_bit+0x4e/0x120
2008+ out_of_line_wait_on_bit+0x8d/0xb0
2009+ __wait_on_buffer+0x33/0x40
2010+ jbd2_journal_commit_transaction+0x155a/0x19b0
2011+ kjournald2+0xab/0x270
2012+ kthread+0xfa/0x130
2013+ ret_from_fork+0x29/0x50
2014+ } hitcount: 1
2015+ { delta: ~ 0-99, stack.stacktrace __schedule+0xa19/0x1520
2016+ schedule+0x6b/0x110
2017+ io_schedule+0x46/0x80
2018+ rq_qos_wait+0xd0/0x170
2019+ wbt_wait+0x9e/0xf0
2020+ __rq_qos_throttle+0x25/0x40
2021+ blk_mq_submit_bio+0x2c3/0x5b0
2022+ __submit_bio+0xff/0x190
2023+ submit_bio_noacct_nocheck+0x25b/0x2b0
2024+ submit_bio_noacct+0x20b/0x600
2025+ submit_bio+0x28/0x90
2026+ ext4_bio_write_page+0x1e0/0x8c0
2027+ mpage_submit_page+0x60/0x80
2028+ mpage_process_page_bufs+0x16c/0x180
2029+ mpage_prepare_extent_to_map+0x23f/0x530
2030+ } hitcount: 1
2031+ { delta: ~ 0-99, stack.stacktrace __schedule+0xa19/0x1520
2032+ schedule+0x6b/0x110
2033+ schedule_hrtimeout_range_clock+0x97/0x110
2034+ schedule_hrtimeout_range+0x13/0x20
2035+ usleep_range_state+0x65/0x90
2036+ __intel_wait_for_register+0x1c1/0x230 [i915]
2037+ intel_psr_wait_for_idle_locked+0x171/0x2a0 [i915]
2038+ intel_pipe_update_start+0x169/0x360 [i915]
2039+ intel_update_crtc+0x112/0x490 [i915]
2040+ skl_commit_modeset_enables+0x199/0x600 [i915]
2041+ intel_atomic_commit_tail+0x7c4/0x1080 [i915]
2042+ intel_atomic_commit_work+0x12/0x20 [i915]
2043+ process_one_work+0x21c/0x3f0
2044+ worker_thread+0x50/0x3e0
2045+ kthread+0xfa/0x130
2046+ } hitcount: 3
2047+ { delta: ~ 0-99, stack.stacktrace __schedule+0xa19/0x1520
2048+ schedule+0x6b/0x110
2049+ schedule_timeout+0x11e/0x160
2050+ __wait_for_common+0x8f/0x190
2051+ wait_for_completion+0x24/0x30
2052+ __flush_work.isra.0+0x1cc/0x360
2053+ flush_work+0xe/0x20
2054+ drm_mode_rmfb+0x18b/0x1d0 [drm]
2055+ drm_mode_rmfb_ioctl+0x10/0x20 [drm]
2056+ drm_ioctl_kernel+0xb8/0x150 [drm]
2057+ drm_ioctl+0x243/0x560 [drm]
2058+ __x64_sys_ioctl+0x92/0xd0
2059+ do_syscall_64+0x59/0x90
2060+ entry_SYSCALL_64_after_hwframe+0x72/0xdc
2061+ } hitcount: 1
2062+ { delta: ~ 0-99, stack.stacktrace __schedule+0xa19/0x1520
2063+ schedule+0x6b/0x110
2064+ schedule_timeout+0x87/0x160
2065+ __wait_for_common+0x8f/0x190
2066+ wait_for_completion_timeout+0x1d/0x30
2067+ drm_atomic_helper_wait_for_flip_done+0x57/0x90 [drm_kms_helper]
2068+ intel_atomic_commit_tail+0x8ce/0x1080 [i915]
2069+ intel_atomic_commit_work+0x12/0x20 [i915]
2070+ process_one_work+0x21c/0x3f0
2071+ worker_thread+0x50/0x3e0
2072+ kthread+0xfa/0x130
2073+ ret_from_fork+0x29/0x50
2074+ } hitcount: 1
2075+ { delta: ~ 100-199, stack.stacktrace __schedule+0xa19/0x1520
2076+ schedule+0x6b/0x110
2077+ schedule_hrtimeout_range_clock+0x97/0x110
2078+ schedule_hrtimeout_range+0x13/0x20
2079+ usleep_range_state+0x65/0x90
2080+ pci_set_low_power_state+0x17f/0x1f0
2081+ pci_set_power_state+0x49/0x250
2082+ pci_finish_runtime_suspend+0x4a/0x90
2083+ pci_pm_runtime_suspend+0xcb/0x1b0
2084+ __rpm_callback+0x48/0x120
2085+ rpm_callback+0x67/0x70
2086+ rpm_suspend+0x167/0x780
2087+ rpm_idle+0x25a/0x380
2088+ pm_runtime_work+0x93/0xc0
2089+ process_one_work+0x21c/0x3f0
2090+ } hitcount: 1
2091+
2092+ Totals:
2093+ Hits: 10
2094+ Entries: 7
2095+ Dropped: 0
2096+
194320972.2.3 Hist trigger 'handlers' and 'actions'
19442098-------------------------------------------
19452099
@@ -2054,11 +2208,11 @@ The following commonly-used handler.action pairs are available:
20542208 wakeup_new_test($testpid) if comm=="cyclictest"' >> \
20552209 /sys/kernel/tracing/events/sched/sched_wakeup_new/trigger
20562210
2057- Or, equivalently, using the 'trace' keyword syntax:
2211+ Or, equivalently, using the 'trace' keyword syntax::
20582212
2059- # echo 'hist:keys=$testpid:testpid=pid: onmatch(sched.sched_wakeup_new).\
2060- trace(wakeup_new_test,$testpid) if comm=="cyclictest"' >> \
2061- /sys/kernel/tracing/events/sched/sched_wakeup_new/trigger
2213+ # echo 'hist:keys=$testpid:testpid=pid:onmatch(sched.sched_wakeup_new).\
2214+ trace(wakeup_new_test,$testpid) if comm=="cyclictest"' >> \
2215+ /sys/kernel/tracing/events/sched/sched_wakeup_new/trigger
20622216
20632217 Creating and displaying a histogram based on those events is now
20642218 just a matter of using the fields and new synthetic event in the
@@ -2191,48 +2345,48 @@ The following commonly-used handler.action pairs are available:
21912345 resulting latency, stored in wakeup_lat, exceeds the current
21922346 maximum latency, a snapshot is taken. As part of the setup, all
21932347 the scheduler events are also enabled, which are the events that
2194- will show up in the snapshot when it is taken at some point:
2348+ will show up in the snapshot when it is taken at some point::
21952349
2196- # echo 1 > /sys/kernel/tracing/events/sched/enable
2350+ # echo 1 > /sys/kernel/tracing/events/sched/enable
21972351
2198- # echo 'hist:keys=pid: ts0=common_timestamp.usecs \
2199- if comm=="cyclictest"' >> \
2200- /sys/kernel/tracing/events/sched/sched_waking/trigger
2352+ # echo 'hist:keys=pid:ts0=common_timestamp.usecs \
2353+ if comm=="cyclictest"' >> \
2354+ /sys/kernel/tracing/events/sched/sched_waking/trigger
22012355
2202- # echo 'hist:keys=next_pid: wakeup_lat=common_timestamp.usecs-$ts0: \
2203- onmax($wakeup_lat).save(next_prio,next_comm,prev_pid,prev_prio, \
2204- prev_comm):onmax($wakeup_lat).snapshot() \
2205- if next_comm=="cyclictest"' >> \
2206- /sys/kernel/tracing/events/sched/sched_switch/trigger
2356+ # echo 'hist:keys=next_pid:wakeup_lat=common_timestamp.usecs-$ts0: \
2357+ onmax($wakeup_lat).save(next_prio,next_comm,prev_pid,prev_prio, \
2358+ prev_comm):onmax($wakeup_lat).snapshot() \
2359+ if next_comm=="cyclictest"' >> \
2360+ /sys/kernel/tracing/events/sched/sched_switch/trigger
22072361
22082362 When the histogram is displayed, for each bucket the max value
22092363 and the saved values corresponding to the max are displayed
22102364 following the rest of the fields.
22112365
22122366 If a snapshot was taken, there is also a message indicating that,
2213- along with the value and event that triggered the global maximum:
2367+ along with the value and event that triggered the global maximum::
22142368
2215- # cat /sys/kernel/tracing/events/sched/sched_switch/hist
2216- { next_pid: 2101 } hitcount: 200
2217- max: 52 next_prio: 120 next_comm: cyclictest \
2218- prev_pid: 0 prev_prio: 120 prev_comm: swapper/6
2369+ # cat /sys/kernel/tracing/events/sched/sched_switch/hist
2370+ { next_pid: 2101 } hitcount: 200
2371+ max: 52 next_prio: 120 next_comm: cyclictest \
2372+ prev_pid: 0 prev_prio: 120 prev_comm: swapper/6
22192373
2220- { next_pid: 2103 } hitcount: 1326
2221- max: 572 next_prio: 19 next_comm: cyclictest \
2222- prev_pid: 0 prev_prio: 120 prev_comm: swapper/1
2374+ { next_pid: 2103 } hitcount: 1326
2375+ max: 572 next_prio: 19 next_comm: cyclictest \
2376+ prev_pid: 0 prev_prio: 120 prev_comm: swapper/1
22232377
2224- { next_pid: 2102 } hitcount: 1982 \
2225- max: 74 next_prio: 19 next_comm: cyclictest \
2226- prev_pid: 0 prev_prio: 120 prev_comm: swapper/5
2378+ { next_pid: 2102 } hitcount: 1982 \
2379+ max: 74 next_prio: 19 next_comm: cyclictest \
2380+ prev_pid: 0 prev_prio: 120 prev_comm: swapper/5
22272381
2228- Snapshot taken (see tracing/snapshot). Details:
2229- triggering value { onmax($wakeup_lat) }: 572 \
2230- triggered by event with key: { next_pid: 2103 }
2382+ Snapshot taken (see tracing/snapshot). Details:
2383+ triggering value { onmax($wakeup_lat) }: 572 \
2384+ triggered by event with key: { next_pid: 2103 }
22312385
2232- Totals:
2233- Hits: 3508
2234- Entries: 3
2235- Dropped: 0
2386+ Totals:
2387+ Hits: 3508
2388+ Entries: 3
2389+ Dropped: 0
22362390
22372391 In the above case, the event that triggered the global maximum has
22382392 the key with next_pid == 2103. If you look at the bucket that has
@@ -2310,15 +2464,15 @@ The following commonly-used handler.action pairs are available:
23102464 $cwnd variable. If the value has changed, a snapshot is taken.
23112465 As part of the setup, all the scheduler and tcp events are also
23122466 enabled, which are the events that will show up in the snapshot
2313- when it is taken at some point:
2467+ when it is taken at some point::
23142468
2315- # echo 1 > /sys/kernel/tracing/events/sched/enable
2316- # echo 1 > /sys/kernel/tracing/events/tcp/enable
2469+ # echo 1 > /sys/kernel/tracing/events/sched/enable
2470+ # echo 1 > /sys/kernel/tracing/events/tcp/enable
23172471
2318- # echo 'hist:keys=dport: cwnd=snd_cwnd: \
2319- onchange($cwnd).save(snd_wnd,srtt,rcv_wnd): \
2320- onchange($cwnd).snapshot()' >> \
2321- /sys/kernel/tracing/events/tcp/tcp_probe/trigger
2472+ # echo 'hist:keys=dport:cwnd=snd_cwnd: \
2473+ onchange($cwnd).save(snd_wnd,srtt,rcv_wnd): \
2474+ onchange($cwnd).snapshot()' >> \
2475+ /sys/kernel/tracing/events/tcp/tcp_probe/trigger
23222476
23232477 When the histogram is displayed, for each bucket the tracked value
23242478 and the saved values corresponding to that value are displayed
@@ -2341,10 +2495,10 @@ The following commonly-used handler.action pairs are available:
23412495 { dport: 443 } hitcount: 211
23422496 changed: 10 snd_wnd: 26960 srtt: 17379 rcv_wnd: 28800
23432497
2344- Snapshot taken (see tracing/snapshot). Details: :
2498+ Snapshot taken (see tracing/snapshot). Details:
23452499
2346- triggering value { onchange($cwnd) }: 10
2347- triggered by event with key: { dport: 80 }
2500+ triggering value { onchange($cwnd) }: 10
2501+ triggered by event with key: { dport: 80 }
23482502
23492503 Totals:
23502504 Hits: 414
0 commit comments