Commit feafee2

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Will Deacon:
 "There's good stuff across the board, including some nice mm
  improvements for CPUs with the 'noabort' BBML2 feature and a clever
  patch to allow ptdump to play nicely with block mappings in the
  vmalloc area.

  Confidential computing:

   - Add support for accepting secrets from firmware (e.g. ACPI CCEL)
     and mapping them with appropriate attributes.

  CPU features:

   - Advertise atomic floating-point instructions to userspace

   - Extend Spectre workarounds to cover additional Arm CPU variants

   - Extend list of CPUs that support break-before-make level 2 and
     guarantee not to generate TLB conflict aborts for changes of
     mapping granularity (BBML2_NOABORT)

   - Add GCS support to our uprobes implementation.

  Documentation:

   - Remove bogus SME documentation concerning register state when
     entering/exiting streaming mode.

  Entry code:

   - Switch over to the generic IRQ entry code (GENERIC_IRQ_ENTRY)

   - Micro-optimise syscall entry path with a compiler branch hint.

  Memory management:

   - Enable huge mappings in vmalloc space even when kernel page-table
     dumping is enabled

   - Tidy up the types used in our early MMU setup code

   - Rework rodata= for closer parity with the behaviour on x86

   - For CPUs implementing BBML2_NOABORT, utilise block mappings in the
     linear map even when rodata= applies to virtual aliases

   - Don't re-allocate the virtual region between '_text' and '_stext',
     as doing so confused tools parsing /proc/vmcore.

  Miscellaneous:

   - Clean-up Kconfig menuconfig text for architecture features

   - Avoid redundant bitmap_empty() during determination of supported
     SME vector lengths

   - Re-enable warnings when building the 32-bit vDSO object

   - Avoid breaking our eggs at the wrong end.

  Perf and PMUs:

   - Support for v3 of the Hisilicon L3C PMU

   - Support for Hisilicon's MN and NoC PMUs

   - Support for Fujitsu's Uncore PMU

   - Support for SPE's extended event filtering feature

   - Preparatory work to enable data source filtering in SPE

   - Support for multiple lanes in the DWC PCIe PMU

   - Support for i.MX94 in the IMX DDR PMU driver

   - MAINTAINERS update (Thank you, Yicong)

   - Minor driver fixes (PERF_IDX2OFF() overflow, CMN register offsets).

  Selftests:

   - Add basic LSFE check to the existing hwcaps test

   - Support nolibc in GCS tests

   - Extend SVE ptrace test to pass unsupported regsets and invalid
     vector lengths

   - Minor cleanups (typos, cosmetic changes).

  System registers:

   - Fix ID_PFR1_EL1 definition

   - Fix incorrect signedness of some fields in ID_AA64MMFR4_EL1

   - Sync TCR_EL1 definition with the latest Arm ARM (L.b)

   - Be stricter about the input fed into our AWK sysreg generator
     script

   - Typo fixes and removal of redundant definitions.

  ACPI, EFI and PSCI:

   - Decouple Arm's "Software Delegated Exception Interface" (SDEI)
     support from the ACPI GHES code so that it can be used by platforms
     booted with device-tree

   - Remove unnecessary per-CPU tracking of the FPSIMD state across EFI
     runtime calls

   - Fix a node refcount imbalance in the PSCI device-tree code.

  CPU Features:

   - Ensure register sanitisation is applied to fields in ID_AA64MMFR4

   - Expose AIDR_EL1 to userspace via sysfs, primarily so that KVM
     guests can reliably query the underlying CPU types from the VMM

   - Re-enabling of SME support (CONFIG_ARM64_SME) as a result of fixes
     to our context-switching, signal handling and ptrace code"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (93 commits)
  arm64: cpufeature: Remove duplicate asm/mmu.h header
  arm64: Kconfig: Make CPU_BIG_ENDIAN depend on BROKEN
  perf/dwc_pcie: Fix use of uninitialized variable
  arm/syscalls: mark syscall invocation as likely in invoke_syscall
  Documentation: hisi-pmu: Add introduction to HiSilicon V3 PMU
  Documentation: hisi-pmu: Fix of minor format error
  drivers/perf: hisi: Add support for L3C PMU v3
  drivers/perf: hisi: Refactor the event configuration of L3C PMU
  drivers/perf: hisi: Extend the field of tt_core
  drivers/perf: hisi: Extract the event filter check of L3C PMU
  drivers/perf: hisi: Simplify the probe process of each L3C PMU version
  drivers/perf: hisi: Export hisi_uncore_pmu_isr()
  drivers/perf: hisi: Relax the event ID check in the framework
  perf: Fujitsu: Add the Uncore PMU driver
  arm64: map [_text, _stext) virtual address range non-executable+read-only
  arm64/sysreg: Update TCR_EL1 register
  arm64: Enable vmalloc-huge with ptdump
  arm64: cpufeature: add Neoverse-V3AE to BBML2 allow list
  arm64: errata: Apply workarounds for Neoverse-V3AE
  arm64: cputype: Add Neoverse-V3AE definitions
  ...

2 parents fe68bb2 + ea0b391 commit feafee2

96 files changed

Lines changed: 3779 additions & 765 deletions

Documentation/admin-guide/kernel-parameters.txt

Lines changed: 3 additions & 2 deletions
@@ -6406,8 +6406,9 @@
64066406
rodata= [KNL,EARLY]
64076407
on Mark read-only kernel memory as read-only (default).
64086408
off Leave read-only kernel memory writable for debugging.
6409-
full Mark read-only kernel memory and aliases as read-only
6410-
[arm64]
6409+
noalias Mark read-only kernel memory as read-only but retain
6410+
writable aliases in the direct map for regions outside
6411+
of the kernel image. [arm64]
64116412

64126413
rockchip.usb_uart
64136414
[EARLY]

Documentation/admin-guide/perf/dwc_pcie_pmu.rst

Lines changed: 2 additions & 2 deletions
@@ -16,8 +16,8 @@ provides the following two features:
1616

1717
- one 64-bit counter for Time Based Analysis (RX/TX data throughput and
1818
time spent in each low-power LTSSM state) and
19-
- one 32-bit counter for Event Counting (error and non-error events for
20-
a specified lane)
19+
- one 32-bit counter per event for Event Counting (error and non-error
20+
events for a specified lane)
2121

2222
Note: There is no interrupt for counter overflow.
2323

Documentation/admin-guide/perf/fujitsu_uncore_pmu.rst

Lines changed: 110 additions & 0 deletions
@@ -0,0 +1,110 @@
1+
.. SPDX-License-Identifier: GPL-2.0-only
2+
3+
================================================
4+
Fujitsu Uncore Performance Monitoring Unit (PMU)
5+
================================================
6+
7+
This driver supports the Uncore MAC PMUs and the Uncore PCI PMUs found
8+
in Fujitsu chips.
9+
Each MAC PMU on these chips is exposed as a uncore perf PMU with device name
10+
mac_iod<iod>_mac<mac>_ch<ch>.
11+
And each PCI PMU on these chips is exposed as a uncore perf PMU with device name
12+
pci_iod<iod>_pci<pci>.
13+
14+
The driver provides a description of its available events and configuration
15+
options in sysfs, see /sys/bus/event_sources/devices/mac_iod<iod>_mac<mac>_ch<ch>/
16+
and /sys/bus/event_sources/devices/pci_iod<iod>_pci<pci>/.
17+
This driver exports:
18+
- formats, used by perf user space and other tools to configure events
19+
- events, used by perf user space and other tools to create events
20+
symbolically, e.g.:
21+
perf stat -a -e mac_iod0_mac0_ch0/event=0x21/ ls
22+
perf stat -a -e pci_iod0_pci0/event=0x24/ ls
23+
- cpumask, used by perf user space and other tools to know on which CPUs
24+
to open the events
25+
26+
This driver supports the following events for MAC:
27+
- cycles
28+
This event counts MAC cycles at MAC frequency.
29+
- read-count
30+
This event counts the number of read requests to MAC.
31+
- read-count-request
32+
This event counts the number of read requests including retry to MAC.
33+
- read-count-return
34+
This event counts the number of responses to read requests to MAC.
35+
- read-count-request-pftgt
36+
This event counts the number of read requests including retry with PFTGT
37+
flag.
38+
- read-count-request-normal
39+
This event counts the number of read requests including retry without PFTGT
40+
flag.
41+
- read-count-return-pftgt-hit
42+
This event counts the number of responses to read requests which hit the
43+
PFTGT buffer.
44+
- read-count-return-pftgt-miss
45+
This event counts the number of responses to read requests which miss the
46+
PFTGT buffer.
47+
- read-wait
48+
This event counts outstanding read requests issued by DDR memory controller
49+
per cycle.
50+
- write-count
51+
This event counts the number of write requests to MAC (including zero write,
52+
full write, partial write, write cancel).
53+
- write-count-write
54+
This event counts the number of full write requests to MAC (not including
55+
zero write).
56+
- write-count-pwrite
57+
This event counts the number of partial write requests to MAC.
58+
- memory-read-count
59+
This event counts the number of read requests from MAC to memory.
60+
- memory-write-count
61+
This event counts the number of full write requests from MAC to memory.
62+
- memory-pwrite-count
63+
This event counts the number of partial write requests from MAC to memory.
64+
- ea-mac
65+
This event counts energy consumption of MAC.
66+
- ea-memory
67+
This event counts energy consumption of memory.
68+
- ea-memory-mac-write
69+
This event counts the number of write requests from MAC to memory.
70+
- ea-ha
71+
This event counts energy consumption of HA.
72+
73+
'ea' is the abbreviation for 'Energy Analyzer'.
74+
75+
Examples for use with perf::
76+
77+
perf stat -e mac_iod0_mac0_ch0/ea-mac/ ls
78+
79+
And, this driver supports the following events for PCI:
80+
- pci-port0-cycles
81+
This event counts PCI cycles at PCI frequency in port0.
82+
- pci-port0-read-count
83+
This event counts read transactions for data transfer in port0.
84+
- pci-port0-read-count-bus
85+
This event counts read transactions for bus usage in port0.
86+
- pci-port0-write-count
87+
This event counts write transactions for data transfer in port0.
88+
- pci-port0-write-count-bus
89+
This event counts write transactions for bus usage in port0.
90+
- pci-port1-cycles
91+
This event counts PCI cycles at PCI frequency in port1.
92+
- pci-port1-read-count
93+
This event counts read transactions for data transfer in port1.
94+
- pci-port1-read-count-bus
95+
This event counts read transactions for bus usage in port1.
96+
- pci-port1-write-count
97+
This event counts write transactions for data transfer in port1.
98+
- pci-port1-write-count-bus
99+
This event counts write transactions for bus usage in port1.
100+
- ea-pci
101+
This event counts energy consumption of PCI.
102+
103+
'ea' is the abbreviation for 'Energy Analyzer'.
104+
105+
Examples for use with perf::
106+
107+
perf stat -e pci_iod0_pci0/ea-pci/ ls
108+
109+
Given that these are uncore PMUs the driver does not support sampling, therefore
110+
"perf record" will not work. Per-task perf sessions are not supported.
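
Reviewer note: the naming scheme documented in this new file can be sketched in a few lines of Python. This is only a reading aid; the helper names (`mac_pmu_name`, `pci_pmu_name`, `perf_event_spec`, `perf_stat_argv`) are hypothetical and are not part of the driver or of perf. Only the device-name patterns and the example events are taken from the text above.

```python
def mac_pmu_name(iod: int, mac: int, ch: int) -> str:
    """Build a MAC PMU device name of the form mac_iod<iod>_mac<mac>_ch<ch>."""
    return f"mac_iod{iod}_mac{mac}_ch{ch}"


def pci_pmu_name(iod: int, pci: int) -> str:
    """Build a PCI PMU device name of the form pci_iod<iod>_pci<pci>."""
    return f"pci_iod{iod}_pci{pci}"


def perf_event_spec(pmu: str, event) -> str:
    """Format an event for `perf stat -e`.

    `event` is either a symbolic name (str, e.g. "ea-mac") or a raw
    event code (int, e.g. 0x21), matching the two styles in the text.
    """
    if isinstance(event, int):
        return f"{pmu}/event={event:#x}/"
    return f"{pmu}/{event}/"


def perf_stat_argv(pmu: str, event, command: list) -> list:
    """Assemble a system-wide `perf stat` command line as an argv list.

    `-a` mirrors the first examples in the text; the documentation
    states that per-task sessions are not supported.
    """
    return ["perf", "stat", "-a", "-e", perf_event_spec(pmu, event), *command]
```

For instance, `perf_stat_argv(mac_pmu_name(0, 0, 0), 0x21, ["ls"])` yields the argv form of the first documented example command.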

Documentation/admin-guide/perf/hisi-pmu.rst

Lines changed: 47 additions & 2 deletions
@@ -18,9 +18,10 @@ HiSilicon SoC uncore PMU driver
1818
Each device PMU has separate registers for event counting, control and
1919
interrupt, and the PMU driver shall register perf PMU drivers like L3C,
2020
HHA and DDRC etc. The available events and configuration options shall
21-
be described in the sysfs, see:
21+
be described in the sysfs, see::
22+
23+
/sys/bus/event_source/devices/hisi_sccl{X}_<l3c{Y}/hha{Y}/ddrc{Y}>
2224

23-
/sys/bus/event_source/devices/hisi_sccl{X}_<l3c{Y}/hha{Y}/ddrc{Y}>.
2425
The "perf list" command shall list the available events from sysfs.
2526

2627
Each L3C, HHA and DDRC is registered as a separate PMU with perf. The PMU
@@ -112,6 +113,50 @@ uring channel. It is 2 bits. Some important codes are as follows:
112113
- 2'b00: default value, count the events which sent to the both uring and
113114
uring_ext channel;
114115

116+
6. ch: NoC PMU supports filtering the event counts of certain transaction
117+
channel with this option. The current supported channels are as follows:
118+
119+
- 3'b010: Request channel
120+
- 3'b100: Snoop channel
121+
- 3'b110: Response channel
122+
- 3'b111: Data channel
123+
124+
7. tt_en: NoC PMU supports counting only transactions that have tracetag set
125+
if this option is set. See the 2nd list for more information about tracetag.
126+
127+
For HiSilicon uncore PMU v3 whose identifier is 0x40, some uncore PMUs are
128+
further divided into parts for finer granularity of tracing, each part has its
129+
own dedicated PMU, and all such PMUs together cover the monitoring job of events
130+
on particular uncore device. Such PMUs are described in sysfs with name format
131+
slightly changed::
132+
133+
/sys/bus/event_source/devices/hisi_sccl{X}_<l3c{Y}_{Z}/ddrc{Y}_{Z}/noc{Y}_{Z}>
134+
135+
Z is the sub-id, indicating different PMUs for part of hardware device.
136+
137+
Usage of most PMUs with different sub-ids are identical. Specially, L3C PMU
138+
provides ``ext`` option to allow exploration of even finer granual statistics
139+
of L3C PMU. L3C PMU driver uses that as hint of termination when delivering
140+
perf command to hardware:
141+
142+
- ext=0: Default, could be used with event names.
143+
- ext=1 and ext=2: Must be used with event codes, event names are not supported.
144+
145+
An example of perf command could be::
146+
147+
$# perf stat -a -e hisi_sccl0_l3c1_0/rd_spipe/ sleep 5
148+
149+
or::
150+
151+
$# perf stat -a -e hisi_sccl0_l3c1_0/event=0x1,ext=1/ sleep 5
152+
153+
As above, ``hisi_sccl0_l3c1_0`` locates PMU of Super CPU CLuster 0, L3 cache 1
154+
pipe0.
155+
156+
First command locates the first part of L3C since ``ext=0`` is implied by
157+
default. Second command issues the counting on another part of L3C with the
158+
event ``0x1``.
159+
115160
Users could configure IDs to count data come from specific CCL/ICL, by setting
116161
srcid_cmd & srcid_msk, and data desitined for specific CCL/ICL by setting
117162
tgtid_cmd & tgtid_msk. A set bit in srcid_msk/tgtid_msk means the PMU will not
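
Reviewer note: the v3 naming format and the ``ext`` rule added in this hunk can be illustrated with a small Python sketch. The function names are hypothetical, and the sketch rests on two assumptions not spelled out in the text: that only `l3c`, `ddrc` and `noc` devices use the `{Y}_{Z}` form, and that `ext` takes only the values 0, 1 and 2 listed above.

```python
import re

# Pattern for hisi_sccl{X}_<l3c{Y}_{Z}/ddrc{Y}_{Z}/noc{Y}_{Z}> as described above.
_V3_NAME = re.compile(r"^hisi_sccl(\d+)_(l3c|ddrc|noc)(\d+)_(\d+)$")


def parse_v3_pmu_name(name: str) -> dict:
    """Split a v3 PMU device name into SCCL id, device kind, index and sub-id."""
    m = _V3_NAME.match(name)
    if m is None:
        raise ValueError(f"not a v3 PMU name: {name!r}")
    sccl, kind, index, sub_id = m.groups()
    return {"sccl": int(sccl), "kind": kind, "index": int(index), "sub_id": int(sub_id)}


def l3c_event_spec(pmu: str, event, ext: int = 0) -> str:
    """Format an L3C event for `perf stat -e`, enforcing the documented rule:
    event names only work with ext=0; ext=1 and ext=2 need an event code."""
    if ext not in (0, 1, 2):
        raise ValueError("this sketch only knows ext=0, ext=1 and ext=2")
    if isinstance(event, str):
        if ext != 0:
            raise ValueError("ext=1 and ext=2 must be used with event codes")
        return f"{pmu}/{event}/"
    if ext == 0:
        return f"{pmu}/event={event:#x}/"
    return f"{pmu}/event={event:#x},ext={ext}/"
```

With these helpers, `l3c_event_spec("hisi_sccl0_l3c1_0", "rd_spipe")` and `l3c_event_spec("hisi_sccl0_l3c1_0", 0x1, ext=1)` reproduce the two event strings used in the example commands.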

Documentation/admin-guide/perf/index.rst

Lines changed: 1 addition & 0 deletions
@@ -29,3 +29,4 @@ Performance monitor support
2929
cxl
3030
ampere_cspmu
3131
mrvl-pem-pmu
32+
fujitsu_uncore_pmu

Documentation/arch/arm64/booting.rst

Lines changed: 11 additions & 0 deletions
@@ -466,6 +466,17 @@ Before jumping into the kernel, the following conditions must be met:
466466
- HDFGWTR2_EL2.nPMICFILTR_EL0 (bit 3) must be initialised to 0b1.
467467
- HDFGWTR2_EL2.nPMUACR_EL1 (bit 4) must be initialised to 0b1.
468468

469+
For CPUs with SPE data source filtering (FEAT_SPE_FDS):
470+
471+
- If EL3 is present:
472+
473+
- MDCR_EL3.EnPMS3 (bit 42) must be initialised to 0b1.
474+
475+
- If the kernel is entered at EL1 and EL2 is present:
476+
477+
- HDFGRTR2_EL2.nPMSDSFR_EL1 (bit 19) must be initialised to 0b1.
478+
- HDFGWTR2_EL2.nPMSDSFR_EL1 (bit 19) must be initialised to 0b1.
479+
469480
For CPUs with Memory Copy and Memory Set instructions (FEAT_MOPS):
470481

471482
- If the kernel is entered at EL1 and EL2 is present:
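
Reviewer note on the FEAT_SPE_FDS requirements added in this hunk: the three bit conditions can be expressed as a simple check over raw register values. The sketch below is illustrative only. The function name and the convention of passing `None` for a register that does not apply (no EL3, or kernel not entered at EL1 with EL2 present) are assumptions of this note; only the register names and bit positions come from the added text.

```python
ENPMS3_BIT = 42         # MDCR_EL3.EnPMS3
NPMSDSFR_EL1_BIT = 19   # HDFGRTR2_EL2 / HDFGWTR2_EL2 .nPMSDSFR_EL1


def bit_set(value: int, bit: int) -> bool:
    """Return True if the given bit of a register value is 0b1."""
    return bool((value >> bit) & 1)


def spe_fds_boot_problems(mdcr_el3=None, hdfgrtr2_el2=None, hdfgwtr2_el2=None) -> list:
    """List the documented FEAT_SPE_FDS boot conditions that are not met.

    Pass None for a register whose condition does not apply on the
    platform being checked.
    """
    problems = []
    if mdcr_el3 is not None and not bit_set(mdcr_el3, ENPMS3_BIT):
        problems.append("MDCR_EL3.EnPMS3 (bit 42) is not 0b1")
    if hdfgrtr2_el2 is not None and not bit_set(hdfgrtr2_el2, NPMSDSFR_EL1_BIT):
        problems.append("HDFGRTR2_EL2.nPMSDSFR_EL1 (bit 19) is not 0b1")
    if hdfgwtr2_el2 is not None and not bit_set(hdfgwtr2_el2, NPMSDSFR_EL1_BIT):
        problems.append("HDFGWTR2_EL2.nPMSDSFR_EL1 (bit 19) is not 0b1")
    return problems
```

An empty list means every applicable condition from the text holds for the values supplied.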

Documentation/arch/arm64/elf_hwcaps.rst

Lines changed: 4 additions & 0 deletions
@@ -441,6 +441,10 @@ HWCAP3_MTE_FAR
441441
HWCAP3_MTE_STORE_ONLY
442442
Functionality implied by ID_AA64PFR2_EL1.MTESTOREONLY == 0b0001.
443443

444+
HWCAP3_LSFE
445+
Functionality implied by ID_AA64ISAR3_EL1.LSFE == 0b0001
446+
447+
444448
4. Unused AT_HWCAP bits
445449
-----------------------
446450
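
Reviewer note: userspace would typically test the new hwcap by reading the `AT_HWCAP3` auxiliary vector entry and masking it. The Python sketch below shows the idea. Both numeric constants are assumptions of this note and are not shown in the diff: `AT_HWCAP3 = 29` and `HWCAP3_LSFE = 1 << 2` should be checked against the kernel's uapi headers before being relied on. `read_hwcap3()` uses the C library's `getauxval()` through `ctypes` and is only meaningful on Linux.

```python
import ctypes

AT_HWCAP3 = 29        # ASSUMPTION: auxv key; verify against the uapi auxvec.h
HWCAP3_LSFE = 1 << 2  # ASSUMPTION: bit position; verify against arm64 uapi hwcap.h


def has_lsfe(hwcap3: int) -> bool:
    """Return True if the LSFE bit is set in an AT_HWCAP3 value."""
    return bool(hwcap3 & HWCAP3_LSFE)


def read_hwcap3() -> int:
    """Read AT_HWCAP3 for the current process (Linux only).

    getauxval() returns 0 when the entry is absent, so an older kernel
    simply reports no HWCAP3 features.
    """
    libc = ctypes.CDLL(None)
    libc.getauxval.restype = ctypes.c_ulong
    libc.getauxval.argtypes = [ctypes.c_ulong]
    return libc.getauxval(AT_HWCAP3)
```

On an arm64 Linux system, `has_lsfe(read_hwcap3())` would then report whether the kernel advertises the feature, subject to the assumptions above.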

Documentation/arch/arm64/silicon-errata.rst

Lines changed: 2 additions & 0 deletions
@@ -200,6 +200,8 @@ stable kernels.
200200
+----------------+-----------------+-----------------+-----------------------------+
201201
| ARM | Neoverse-V3 | #3312417 | ARM64_ERRATUM_3194386 |
202202
+----------------+-----------------+-----------------+-----------------------------+
203+
| ARM | Neoverse-V3AE | #3312417 | ARM64_ERRATUM_3194386 |
204+
+----------------+-----------------+-----------------+-----------------------------+
203205
| ARM | MMU-500 | #841119,826419 | ARM_SMMU_MMU_500_CPRE_ERRATA|
204206
| | | #562869,1047329 | |
205207
+----------------+-----------------+-----------------+-----------------------------+

Documentation/arch/arm64/sme.rst

Lines changed: 2 additions & 12 deletions
@@ -81,17 +81,7 @@ The ZA matrix is square with each side having as many bytes as a streaming
8181
mode SVE vector.
8282

8383

84-
3. Sharing of streaming and non-streaming mode SVE state
85-
---------------------------------------------------------
86-
87-
It is implementation defined which if any parts of the SVE state are shared
88-
between streaming and non-streaming modes. When switching between modes
89-
via software interfaces such as ptrace if no register content is provided as
90-
part of switching no state will be assumed to be shared and everything will
91-
be zeroed.
92-
93-
94-
4. System call behaviour
84+
3. System call behaviour
9585
-------------------------
9686

9787
* On syscall PSTATE.ZA is preserved, if PSTATE.ZA==1 then the contents of the
@@ -112,7 +102,7 @@ be zeroed.
112102
exceptions for execve() described in section 6.
113103

114104

115-
5. Signal handling
105+
4. Signal handling
116106
-------------------
117107

118108
* Signal handlers are invoked with PSTATE.SM=0, PSTATE.ZA=0, and TPIDR2_EL0=0.

Documentation/devicetree/bindings/perf/fsl-imx-ddr.yaml

Lines changed: 1 addition & 0 deletions
@@ -33,6 +33,7 @@ properties:
3333
- items:
3434
- enum:
3535
- fsl,imx91-ddr-pmu
36+
- fsl,imx94-ddr-pmu
3637
- fsl,imx95-ddr-pmu
3738
- const: fsl,imx93-ddr-pmu
3839
