@@ -56,14 +56,14 @@ Example usage of perf::
 For HiSilicon uncore PMU v2 whose identifier is 0x30, the topology is the same
 as PMU v1, but some new functions are added to the hardware.
 
-(a) L3C PMU supports filtering by core/thread within the cluster which can be
+1. L3C PMU supports filtering by core/thread within the cluster which can be
 specified as a bitmap::
 
   $# perf stat -a -e hisi_sccl3_l3c0/config=0x02,tt_core=0x3/ sleep 5
 
 This will only count the operations from core/thread 0 and 1 in this cluster.
 
-(b) Tracetag allows the user to choose to count only read, write or atomic
+2. Tracetag allows the user to choose to count only read, write or atomic
 operations via the tt_req parameter in perf. The default value counts all
 operations. tt_req is 3 bits: 3'b100 represents read operations, 3'b101
 represents write operations, 3'b110 represents atomic store operations and
@@ -73,38 +73,42 @@ represents write operations, 3'b110 represents atomic store operations and
 
 This will only count the read operations in this cluster.
 
-(c) Datasrc allows the user to check where the data comes from. It is 5 bits.
+3. Datasrc allows the user to check where the data comes from. It is 5 bits.
 Some important codes are as follows:
-5'b00001: comes from L3C in this die;
-5'b01000: comes from L3C in the cross-die;
-5'b01001: comes from L3C which is in another socket;
-5'b01110: comes from the local DDR;
-5'b01111: comes from the cross-die DDR;
-5'b10000: comes from cross-socket DDR;
+
+- 5'b00001: comes from L3C in this die;
+- 5'b01000: comes from L3C in the cross-die;
+- 5'b01001: comes from L3C which is in another socket;
+- 5'b01110: comes from the local DDR;
+- 5'b01111: comes from the cross-die DDR;
+- 5'b10000: comes from cross-socket DDR;
+
 etc. It is mainly helpful for finding the data source nearest to the CPU
 cores. If datasrc_cfg is used on a multi-chip system, datasrc_skt shall be
 configured in the perf command::
 
   $# perf stat -a -e hisi_sccl3_l3c0/config=0xb9,datasrc_cfg=0xE/,
   hisi_sccl3_l3c0/config=0xb9,datasrc_cfg=0xF/ sleep 5
 
-(d) Some HiSilicon SoCs encapsulate multiple CPU and IO dies. Each CPU die
+4. Some HiSilicon SoCs encapsulate multiple CPU and IO dies. Each CPU die
 contains several Compute Clusters (CCLs). The I/O dies are called Super I/O
 clusters (SICL) containing multiple I/O clusters (ICLs). Each CCL/ICL in the
 SoC has a unique ID. Each ID is 11 bits, including a 6-bit SCCL-ID and a 5-bit
 CCL/ICL-ID. For an I/O die, the ICL-ID is as follows:
-5'b00000: I/O_MGMT_ICL;
-5'b00001: Network_ICL;
-5'b00011: HAC_ICL;
-5'b10000: PCIe_ICL;
 
-(e) uring_channel: UC PMU events 0x47~0x59 support filtering by tx request
+- 5'b00000: I/O_MGMT_ICL;
+- 5'b00001: Network_ICL;
+- 5'b00011: HAC_ICL;
+- 5'b10000: PCIe_ICL;
+
+5. uring_channel: UC PMU events 0x47~0x59 support filtering by tx request
 uring channel. It is 2 bits. Some important codes are as follows:
-2'b11: count the events sent to the uring_ext (MATA) channel;
-2'b01: is the same as 2'b11;
-2'b10: count the events sent to the uring (non-MATA) channel;
-2'b00: default value, count the events sent to both the uring and
-uring_ext channels;
+
+- 2'b11: count the events sent to the uring_ext (MATA) channel;
+- 2'b01: is the same as 2'b11;
+- 2'b10: count the events sent to the uring (non-MATA) channel;
+- 2'b00: default value, count the events sent to both the uring and
+  uring_ext channels;
 
 Users can configure IDs to count data coming from a specific CCL/ICL by setting
 srcid_cmd & srcid_msk, and data destined for a specific CCL/ICL by setting
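As a rough illustration of the filter encodings documented in this patch, the Python sketch below builds a tt_core bitmap from a list of core/thread indices and formats the resulting perf event string. It is a minimal sketch, not part of perf or the kernel: the helper names `tt_core_bitmap` and `l3c_event` and the dictionary `DATASRC` are hypothetical, and only the bit values quoted in the documentation text are used.

```python
# Illustrative helpers only; nothing here is provided by perf or the kernel.

# datasrc_cfg encodings quoted in the documentation text (5-bit field).
# The dictionary keys are made-up labels for readability.
DATASRC = {
    "l3c_this_die": 0b00001,
    "l3c_cross_die": 0b01000,
    "l3c_other_socket": 0b01001,
    "ddr_local": 0b01110,
    "ddr_cross_die": 0b01111,
    "ddr_cross_socket": 0b10000,
}


def tt_core_bitmap(cores):
    """Return a bitmap with bit N set for every core/thread index N."""
    mask = 0
    for core in cores:
        if core < 0:
            raise ValueError("core index must be non-negative")
        mask |= 1 << core
    return mask


def l3c_event(pmu, config, **filters):
    """Format an event string of the form pmu/config=0x..,name=0x../."""
    terms = [f"config={config:#04x}"]
    terms += [f"{name}={value:#x}" for name, value in filters.items()]
    return f"{pmu}/{','.join(terms)}/"


if __name__ == "__main__":
    # Cores/threads 0 and 1, as in the tt_core example in the text.
    print(l3c_event("hisi_sccl3_l3c0", 0x02, tt_core=tt_core_bitmap([0, 1])))
    # Data sourced from the local DDR, as in the datasrc_cfg example.
    print(l3c_event("hisi_sccl3_l3c0", 0xB9, datasrc_cfg=DATASRC["ddr_local"]))
```

The resulting strings can be passed to `perf stat -a -e`, as in the commands shown in the documentation. Hexadecimal digits are emitted in lower case, so the local DDR code appears as 0xe rather than 0xE.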