@@ -17,14 +17,21 @@ AMD refers to this feature as AMD Platform Quality of Service(AMD QoS).
1717This feature is enabled by the CONFIG_X86_CPU_RESCTRL and the x86 /proc/cpuinfo
1818flag bits:
1919
20- ============================================= ================================
20+ =============================================== ================================
2121RDT (Resource Director Technology) Allocation "rdt_a"
2222CAT (Cache Allocation Technology) "cat_l3", "cat_l2"
2323CDP (Code and Data Prioritization) "cdp_l3", "cdp_l2"
2424CQM (Cache QoS Monitoring) "cqm_llc", "cqm_occup_llc"
2525MBM (Memory Bandwidth Monitoring) "cqm_mbm_total", "cqm_mbm_local"
2626MBA (Memory Bandwidth Allocation) "mba"
27- ============================================= ================================
27+ SMBA (Slow Memory Bandwidth Allocation) ""
28+ BMEC (Bandwidth Monitoring Event Configuration) ""
29+ =============================================== ================================
30+
31+ Historically, new features were made visible by default in /proc/cpuinfo. This
32+ resulted in the feature flags becoming hard to parse by humans. Adding a new
33+ flag to /proc/cpuinfo should be avoided if user space can obtain information
34+ about the feature from resctrl's info directory.
2835
2936To use the feature mount the file system::
3037
@@ -161,6 +168,83 @@ with the following files:
161168"mon_features":
162169 Lists the monitoring events if
163170 monitoring is enabled for the resource.
171+ Example::
172+
173+ # cat /sys/fs/resctrl/info/L3_MON/mon_features
174+ llc_occupancy
175+ mbm_total_bytes
176+ mbm_local_bytes
177+
178+ If the system supports Bandwidth Monitoring Event
179+ Configuration (BMEC), then the bandwidth events will
180+ be configurable. The output will be::
181+
182+ # cat /sys/fs/resctrl/info/L3_MON/mon_features
183+ llc_occupancy
184+ mbm_total_bytes
185+ mbm_total_bytes_config
186+ mbm_local_bytes
187+ mbm_local_bytes_config
188+
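As a rough illustration of the rule above, user space could treat an event as configurable when mon_features lists a matching "<event>_config" entry. The sketch below assumes the default /sys/fs/resctrl mount point used in the examples; the helper name `configurable_events` is hypothetical and not part of resctrl.

```python
from pathlib import Path
from typing import List

# Assumption: resctrl is mounted at the default location used in the examples.
MON_FEATURES = Path("/sys/fs/resctrl/info/L3_MON/mon_features")


def configurable_events(mon_features_text: str) -> List[str]:
    """Return the events that also have an '<event>_config' entry listed."""
    entries = mon_features_text.split()
    return [e for e in entries if f"{e}_config" in entries]


if __name__ == "__main__":
    # Only touch sysfs when the file is actually present.
    if MON_FEATURES.exists():
        print(configurable_events(MON_FEATURES.read_text()))
```

An empty result suggests that BMEC is not supported, or that monitoring is not enabled for the resource.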
189+ "mbm_total_bytes_config", "mbm_local_bytes_config":
190+ Read/write files containing the configuration for the mbm_total_bytes
191+ and mbm_local_bytes events, respectively, when the Bandwidth
192+ Monitoring Event Configuration (BMEC) feature is supported.
193+ The event configuration settings are domain specific and affect
194+ all the CPUs in the domain. When either event configuration is
195+ changed, the bandwidth counters for all RMIDs of both events
196+ (mbm_total_bytes as well as mbm_local_bytes) are cleared for that
197+ domain. The next read for every RMID will report "Unavailable"
198+ and subsequent reads will report the valid value.
199+
200+ The following event types are supported:
201+
202+ ==== ========================================================
203+ Bits Description
204+ ==== ========================================================
205+ 6 Dirty Victims from the QOS domain to all types of memory
206+ 5 Reads to slow memory in the non-local NUMA domain
207+ 4 Reads to slow memory in the local NUMA domain
208+ 3 Non-temporal writes to non-local NUMA domain
209+ 2 Non-temporal writes to local NUMA domain
210+ 1 Reads to memory in the non-local NUMA domain
211+ 0 Reads to memory in the local NUMA domain
212+ ==== ========================================================
213+
214+ By default, the mbm_total_bytes configuration is set to 0x7f to count
215+ all the event types and the mbm_local_bytes configuration is set to
216+ 0x15 to count all the local memory events.
217+
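The relationship between the bit table and the default values can be checked with a short sketch. The bit descriptions are copied from the table above; the names `BMEC_EVENTS`, `encode_mask` and `decode_mask` are hypothetical helpers for illustration only, not part of any resctrl interface.

```python
from typing import Iterable, List

# Bit positions and descriptions copied from the table above.
BMEC_EVENTS = {
    6: "Dirty Victims from the QOS domain to all types of memory",
    5: "Reads to slow memory in the non-local NUMA domain",
    4: "Reads to slow memory in the local NUMA domain",
    3: "Non-temporal writes to non-local NUMA domain",
    2: "Non-temporal writes to local NUMA domain",
    1: "Reads to memory in the non-local NUMA domain",
    0: "Reads to memory in the local NUMA domain",
}


def encode_mask(bits: Iterable[int]) -> int:
    """Build a configuration value from a collection of bit positions."""
    mask = 0
    for bit in bits:
        if bit not in BMEC_EVENTS:
            raise ValueError(f"unknown event bit: {bit}")
        mask |= 1 << bit
    return mask


def decode_mask(mask: int) -> List[str]:
    """List the event descriptions selected by a value, lowest bit first."""
    if mask & ~0x7F:
        raise ValueError(f"bits outside 0-6 set in {mask:#x}")
    return [BMEC_EVENTS[b] for b in sorted(BMEC_EVENTS) if mask & (1 << b)]


if __name__ == "__main__":
    print(hex(encode_mask(range(7))))   # all seven event types
    print(hex(encode_mask([0, 2, 4])))  # the three local-domain events
    for description in decode_mask(0x15):
        print(description)
```

Setting all seven bits gives the mbm_total_bytes default of 0x7f, and setting bits 0, 2 and 4 (the local NUMA domain events) gives the mbm_local_bytes default of 0x15.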
218+ Examples:
219+
220+ * To view the current configuration:
221+ ::
222+
223+ # cat /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config
224+ 0=0x7f;1=0x7f;2=0x7f;3=0x7f
225+
226+ # cat /sys/fs/resctrl/info/L3_MON/mbm_local_bytes_config
227+ 0=0x15;1=0x15;3=0x15;4=0x15
228+
229+ * To change the mbm_total_bytes configuration to count only reads on
230+ domain 0, bits 0, 1, 4 and 5 need to be set, which is 110011b in binary
231+ (0x33 in hexadecimal):
232+ ::
233+
234+ # echo "0=0x33" > /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config
235+
236+ # cat /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config
237+ 0=0x33;1=0x7f;2=0x7f;3=0x7f
238+
239+ * To change the mbm_local_bytes configuration to count all the slow memory
240+ reads on domains 0 and 1, bits 4 and 5 need to be set, which is 110000b
241+ in binary (0x30 in hexadecimal):
242+ ::
243+
244+ # echo "0=0x30;1=0x30" > /sys/fs/resctrl/info/L3_MON/mbm_local_bytes_config
245+
246+ # cat /sys/fs/resctrl/info/L3_MON/mbm_local_bytes_config
247+ 0=0x30;1=0x30;3=0x15;4=0x15
164248
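The per-domain format shown in the examples above ("<domain>=<hex value>" fields separated by ";") can be handled with a small parser. This is a minimal sketch that assumes the format exactly as it appears in the example output; `parse_domain_config` and `format_domain_config` are hypothetical names.

```python
from typing import Dict


def parse_domain_config(line: str) -> Dict[int, int]:
    """Parse '0=0x7f;1=0x7f' into {0: 0x7f, 1: 0x7f}."""
    config = {}
    for field in line.strip().split(";"):
        if not field:
            continue  # tolerate an empty line or a trailing ';'
        domain, _, value = field.partition("=")
        config[int(domain)] = int(value, 16)
    return config


def format_domain_config(config: Dict[int, int]) -> str:
    """Format {0: 0x33} as '0=0x33', suitable for writing to a *_config file."""
    return ";".join(f"{d}={v:#x}" for d, v in sorted(config.items()))


if __name__ == "__main__":
    current = parse_domain_config("0=0x7f;1=0x7f;2=0x7f;3=0x7f")
    current[0] = 0x33  # count only reads on domain 0
    print(format_domain_config(current))
```

As the examples show, only the domains being changed need to appear in the string written to the file.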
165249"max_threshold_occupancy":
166250 Read/write file provides the largest value (in
@@ -464,6 +548,25 @@ Memory bandwidth domain is L3 cache.
464548
465549 MB:<cache_id0>=bw_MBps0;<cache_id1>=bw_MBps1;...
466550
551+ Slow Memory Bandwidth Allocation (SMBA)
552+ ---------------------------------------
553+ AMD hardware supports Slow Memory Bandwidth Allocation (SMBA).
554+ CXL.memory is the only supported "slow" memory device. With the
555+ support of SMBA, the hardware enables bandwidth allocation on
556+ the slow memory devices. If there are multiple such devices in
557+ the system, the throttling logic groups all the slow sources
558+ together and applies the limit on them as a whole.
559+
560+ The presence of SMBA (with CXL.memory) is independent of the presence of
561+ slow memory devices. If there are no such devices on the system, then
562+ configuring SMBA will have no impact on the performance of the system.
563+
564+ The bandwidth domain for slow memory is L3 cache. Its schemata file
565+ is formatted as:
566+ ::
567+
568+ SMBA:<cache_id0>=bandwidth0;<cache_id1>=bandwidth1;...
569+
467570Reading/writing the schemata file
468571---------------------------------
469572Reading the schemata file will show the state of all resources
@@ -479,6 +582,46 @@ which you wish to change. E.g.
479582 L3DATA:0=fffff;1=fffff;2=3c0;3=fffff
480583 L3CODE:0=fffff;1=fffff;2=fffff;3=fffff
481584
585+ Reading/writing the schemata file (on AMD systems)
586+ --------------------------------------------------
587+ Reading the schemata file will show the current bandwidth limit on all
588+ domains. The allocated resources are in multiples of one eighth GB/s.
589+ When writing to the file, you need to specify the cache id for which you
590+ wish to configure the bandwidth limit.
591+
592+ For example, to set a 2GB/s limit on cache id 1:
593+
594+ ::
595+
596+ # cat schemata
597+ MB:0=2048;1=2048;2=2048;3=2048
598+ L3:0=ffff;1=ffff;2=ffff;3=ffff
599+
600+ # echo "MB:1=16" > schemata
601+ # cat schemata
602+ MB:0=2048;1= 16;2=2048;3=2048
603+ L3:0=ffff;1=ffff;2=ffff;3=ffff
604+
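The unit arithmetic in the example above can be made explicit with a short sketch. It relies only on the statement that values are multiples of one eighth GB/s; the helper names are hypothetical, and the parser is meant for bandwidth lines such as MB only (cache lines such as L3 hold hexadecimal bitmasks and would need different handling).

```python
from typing import Dict, Tuple

# From the text above: schemata values are multiples of one eighth GB/s.
UNITS_PER_GBPS = 8


def parse_bandwidth_line(line: str) -> Tuple[str, Dict[int, int]]:
    """Parse 'MB:0=2048;1=  16' into ('MB', {0: 2048, 1: 16})."""
    resource, _, rest = line.strip().partition(":")
    values = {}
    for field in rest.split(";"):
        cache_id, _, value = field.partition("=")
        # int() ignores the padding the kernel may print before a value.
        values[int(cache_id)] = int(value)
    return resource, values


def to_gbps(units: int) -> float:
    """Convert a schemata value into GB/s."""
    return units / UNITS_PER_GBPS


if __name__ == "__main__":
    resource, values = parse_bandwidth_line("MB:0=2048;1=  16;2=2048;3=2048")
    for cache_id, units in values.items():
        print(f"{resource} cache id {cache_id}: {to_gbps(units)} GB/s")
```

With this conversion, the value 16 written in the example corresponds to 2 GB/s, and the value 2048 shown for the other domains corresponds to 256 GB/s.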
605+ Reading/writing the schemata file (on AMD systems) with SMBA feature
606+ --------------------------------------------------------------------
607+ Reading and writing the schemata file works the same way as without SMBA,
608+ as described in the section above.
609+
610+ For example, to set an 8GB/s limit on cache id 1:
611+
612+ ::
613+
614+ # cat schemata
615+ SMBA:0=2048;1=2048;2=2048;3=2048
616+ MB:0=2048;1=2048;2=2048;3=2048
617+ L3:0=ffff;1=ffff;2=ffff;3=ffff
618+
619+ # echo "SMBA:1=64" > schemata
620+ # cat schemata
621+ SMBA:0=2048;1= 64;2=2048;3=2048
622+ MB:0=2048;1=2048;2=2048;3=2048
623+ L3:0=ffff;1=ffff;2=ffff;3=ffff
624+
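Going the other way, a desired limit in GB/s can be turned into the string written to the schemata file. This is a minimal sketch assuming the one eighth GB/s granularity stated earlier; `schemata_write` is a hypothetical helper, and it does not check the value against any hardware maximum.

```python
from fractions import Fraction

# From the text above: schemata values are multiples of one eighth GB/s.
UNITS_PER_GBPS = 8


def schemata_write(resource: str, cache_id: int, gbps) -> str:
    """Return e.g. 'SMBA:1=64' for an 8 GB/s limit on cache id 1."""
    # Fraction(str(...)) keeps decimal inputs such as 0.125 exact.
    units = Fraction(str(gbps)) * UNITS_PER_GBPS
    if units <= 0 or units.denominator != 1:
        raise ValueError(f"{gbps} GB/s is not a positive multiple of 1/8 GB/s")
    return f"{resource}:{cache_id}={int(units)}"


if __name__ == "__main__":
    print(schemata_write("SMBA", 1, 8))
    print(schemata_write("MB", 1, 2))
```

The two calls reproduce the strings echoed into the schemata file in the MB and SMBA examples.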
482625Cache Pseudo-Locking
483626====================
484627CAT enables a user to specify the amount of cache space that an