@@ -18,14 +18,19 @@ model features for SME is included in Appendix A.
 1. General
 -----------
 
-* PSTATE.SM, PSTATE.ZA, the streaming mode vector length, the ZA
-  register state and TPIDR2_EL0 are tracked per thread.
+* PSTATE.SM, PSTATE.ZA, the streaming mode vector length, the ZA and (when
+  present) ZTn register state and TPIDR2_EL0 are tracked per thread.
 
 * The presence of SME is reported to userspace via HWCAP2_SME in the aux vector
   AT_HWCAP2 entry. Presence of this flag implies the presence of the SME
   instructions and registers, and the Linux-specific system interfaces
   described in this document. SME is reported in /proc/cpuinfo as "sme".
 
+* The presence of SME2 is reported to userspace via HWCAP2_SME2 in the
+  aux vector AT_HWCAP2 entry. Presence of this flag implies the presence of
+  the SME2 instructions and ZT0, and the Linux-specific system interfaces
+  described in this document. SME2 is reported in /proc/cpuinfo as "sme2".
+
 * Support for the execution of SME instructions in userspace can also be
   detected by reading the CPU ID register ID_AA64PFR1_EL1 using an MRS
   instruction, and checking that the value of the SME field is nonzero. [3]
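
The HWCAP2_SME2 check added in this hunk could look roughly like the following from userspace. This is a minimal sketch, not part of the patch: the helper names hwcap2_has_sme2() and cpu_has_sme2() are hypothetical, and the fallback bit value is an assumption taken from the arm64 uapi hwcap header, used only when the toolchain headers do not define HWCAP2_SME2.

```c
#include <stdbool.h>
#include <sys/auxv.h>

/*
 * Fallback for toolchains whose headers predate SME2. The bit position is
 * assumed to match the arm64 uapi <asm/hwcap.h>; prefer the real header
 * definition when it is available.
 */
#ifndef HWCAP2_SME2
#define HWCAP2_SME2 (1UL << 37)
#endif

/* Hypothetical helper: test the SME2 bit in an AT_HWCAP2 word. */
static inline bool hwcap2_has_sme2(unsigned long hwcap2)
{
    return (hwcap2 & HWCAP2_SME2) != 0;
}

/* Hypothetical helper: query the aux vector of the running process. */
static inline bool cpu_has_sme2(void)
{
    return hwcap2_has_sme2(getauxval(AT_HWCAP2));
}
```

The result of cpu_has_sme2() is only meaningful on an arm64 kernel, since AT_HWCAP2 bits are architecture specific.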
@@ -44,6 +49,7 @@ model features for SME is included in Appendix A.
         HWCAP2_SME_B16F32
         HWCAP2_SME_F32F32
         HWCAP2_SME_FA64
+        HWCAP2_SME2
 
   This list may be extended over time as the SME architecture evolves.
 
@@ -52,8 +58,8 @@ model features for SME is included in Appendix A.
   cpu-feature-registers.txt for details.
 
 * Debuggers should restrict themselves to interacting with the target via the
-  NT_ARM_SVE, NT_ARM_SSVE and NT_ARM_ZA regsets. The recommended way
-  of detecting support for these regsets is to connect to a target process
+  NT_ARM_SVE, NT_ARM_SSVE, NT_ARM_ZA and NT_ARM_ZT regsets. The recommended
+  way of detecting support for these regsets is to connect to a target process
   first and then attempt a
 
         ptrace(PTRACE_GETREGSET, pid, NT_ARM_<regset>, &iov).
@@ -89,13 +95,13 @@ be zeroed.
 -------------------------
 
 * On syscall PSTATE.ZA is preserved, if PSTATE.ZA==1 then the contents of the
-  ZA matrix are preserved.
+  ZA matrix and ZTn (if present) are preserved.
 
 * On syscall PSTATE.SM will be cleared and the SVE registers will be handled
   as per the standard SVE ABI.
 
-* Neither the SVE registers nor ZA are used to pass arguments to or receive
-  results from any syscall.
+* None of the SVE registers, ZA or ZTn are used to pass arguments to
+  or receive results from any syscall.
 
 * On process creation (eg, clone()) the newly created process will have
   PSTATE.SM cleared.
@@ -134,6 +140,14 @@ be zeroed.
   __reserved[] referencing this space. za_context is then written in the
   extra space. Refer to [1] for further details about this mechanism.
 
+* If ZTn is supported and PSTATE.ZA==1 then a signal frame record for ZTn will
+  be generated.
+
+* The signal record for ZTn has magic ZT_MAGIC (0x5a544e01) and consists of a
+  standard signal frame header followed by a struct zt_context specifying
+  the number of ZTn registers supported by the system, then zt_context.nregs
+  blocks of 64 bytes of data per register.
+
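
To illustrate the ZTn record layout described in the added lines, here is a hedged sketch of how a consumer might locate the register data in a buffer of signal frame records. It is not taken from the patch. The struct definitions are local mirrors of what the kernel headers are assumed to provide (a magic/size header, then a 16-bit nregs field padded to 16 bytes), the names ending in _sketch and find_zt_regs() are hypothetical, and the walk ignores the extra_context indirection mentioned above.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ZT_MAGIC_VALUE 0x5a544e01u /* ZT_MAGIC as stated in the text */
#define ZT_REG_BYTES   64u         /* one 512 bit ZTn register */

/* Assumed layout of the standard signal frame record header. */
struct ctx_head_sketch { uint32_t magic; uint32_t size; };

/* Assumed layout of struct zt_context: header, nregs, padding. */
struct zt_ctx_sketch {
    struct ctx_head_sketch head;
    uint16_t nregs;
    uint16_t reserved[3];
};

/* Total record size: context header plus 64 bytes per register. */
static inline size_t zt_record_size(unsigned int nregs)
{
    return sizeof(struct zt_ctx_sketch) + (size_t)nregs * ZT_REG_BYTES;
}

/*
 * Walk the records in buf and return a pointer to the ZTn register data,
 * storing the register count in *nregs. Returns NULL if no well-formed
 * ZTn record is found before the terminator (magic 0, size 0).
 */
static inline const uint8_t *find_zt_regs(const uint8_t *buf, size_t len,
                                          unsigned int *nregs)
{
    size_t off = 0;

    while (len - off >= sizeof(struct ctx_head_sketch)) {
        struct ctx_head_sketch head;

        memcpy(&head, buf + off, sizeof(head));
        if (head.magic == 0 && head.size == 0)
            break; /* terminator record */
        if (head.size < sizeof(head) || head.size > len - off)
            return NULL; /* malformed record */
        if (head.magic == ZT_MAGIC_VALUE) {
            struct zt_ctx_sketch zt;

            if (head.size < sizeof(zt))
                return NULL;
            memcpy(&zt, buf + off, sizeof(zt));
            if (head.size < zt_record_size(zt.nregs))
                return NULL;
            *nregs = zt.nregs;
            return buf + off + sizeof(zt);
        }
        off += head.size;
    }
    return NULL;
}
```

In a real handler the buffer would be the __reserved[] area of the signal context; real code should use the kernel's own struct zt_context and ZT_MAGIC definitions rather than these mirrors.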
 
 5. Signal return
 -----------------
@@ -151,6 +165,9 @@ When returning from a signal handler:
   the signal frame does not match the current vector length, the signal return
   attempt is treated as illegal, resulting in a forced SIGSEGV.
 
+* If ZTn is not supported or PSTATE.ZA==0 then it is illegal to have a
+  signal frame record for ZTn, resulting in a forced SIGSEGV.
+
 
 6. prctl extensions
 --------------------
@@ -214,8 +231,8 @@ prctl(PR_SME_SET_VL, unsigned long arg)
   vector length that will be applied at the next execve() by the calling
   thread.
 
-* Changing the vector length causes all of ZA, P0..P15, FFR and all bits of
-  Z0..Z31 except for Z0 bits [127:0] .. Z31 bits [127:0] to become
+* Changing the vector length causes all of ZA, ZTn, P0..P15, FFR and all
+  bits of Z0..Z31 except for Z0 bits [127:0] .. Z31 bits [127:0] to become
   unspecified, including both streaming and non-streaming SVE state.
   Calling PR_SME_SET_VL with vl equal to the thread's current vector
   length, or calling PR_SME_SET_VL with the PR_SVE_SET_VL_ONEXEC flag,
@@ -317,6 +334,15 @@ The regset data starts with struct user_za_header, containing:
 
 * The effect of writing a partial, incomplete payload is unspecified.
 
+* A new regset NT_ARM_ZT is defined for access to ZTn state via
+  PTRACE_GETREGSET and PTRACE_SETREGSET.
+
+* The NT_ARM_ZT regset consists of a single 512 bit register.
+
+* When PSTATE.ZA==0 reads of NT_ARM_ZT will report all bits of ZTn as 0.
+
+* Writes to NT_ARM_ZT will set PSTATE.ZA to 1.
+
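
A debugger-side read of the new regset might be sketched as below. This is an illustration under stated assumptions rather than code from the patch: read_zt0() is a hypothetical helper, the NT_ARM_ZT fallback value is assumed to match the kernel's ELF note definitions, and the 64 byte buffer follows from the single 512 bit register described above.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Fallback for older headers; assumed to match the kernel's <linux/elf.h>. */
#ifndef NT_ARM_ZT
#define NT_ARM_ZT 0x40d
#endif

#define ZT0_BYTES 64 /* a single 512 bit register */

/*
 * Hypothetical helper: read ZT0 from a stopped, ptrace-attached thread.
 * Returns 0 on success, -1 on failure (errno is set by ptrace()).
 * A failure is also how a debugger would detect a kernel or CPU without
 * NT_ARM_ZT support.
 */
static inline int read_zt0(pid_t pid, uint8_t zt0[ZT0_BYTES])
{
    struct iovec iov = { .iov_base = zt0, .iov_len = ZT0_BYTES };

    if (ptrace(PTRACE_GETREGSET, pid, (void *)(uintptr_t)NT_ARM_ZT, &iov) == -1)
        return -1;

    /* The kernel updates iov_len to the number of bytes it returned. */
    return iov.iov_len == ZT0_BYTES ? 0 : -1;
}
```

Per the added text, a successful read on a thread with PSTATE.ZA==0 yields all zeros, so a zero result alone does not distinguish "ZA disabled" from "ZT0 holds zero".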
 
 8. ELF coredump extensions
 ---------------------------
@@ -331,6 +357,11 @@ The regset data starts with struct user_za_header, containing:
   been read if a PTRACE_GETREGSET of NT_ARM_ZA were executed for each thread
   when the coredump was generated.
 
+* A NT_ARM_ZT note will be added to each coredump for each thread of the
+  dumped process. The contents will be equivalent to the data that would have
+  been read if a PTRACE_GETREGSET of NT_ARM_ZT were executed for each thread
+  when the coredump was generated.
+
 * The NT_ARM_TLS note will be extended to two registers, the second register
   will contain TPIDR2_EL0 on systems that support SME and will be read as
   zero with writes ignored otherwise.
@@ -406,6 +437,9 @@ In A64 state, SME adds the following:
   For best system performance it is strongly encouraged for software to enable
   ZA only when it is actively being used.
 
+* A new ZT0 register is introduced when SME2 is present. This is a 512 bit
+  register which is accessible when PSTATE.ZA is set, as ZA itself is.
+
 * Two new 1 bit fields in PSTATE which may be controlled via the SMSTART and
   SMSTOP instructions or by access to the SVCR system register:
 