Commit e54debe

Merge tag 'x86_fpu_for_6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fpu updates from Dave Hansen:
 "There's no _actual_ kernel functionality here.

  This expands the documentation around AMX support including some code
  examples. The example code also exposed the fact that hardware
  architecture constants are part of the ABI, but there's no easy place
  that they get defined for apps.

  Adding them to a uabi header will eventually make life easier for
  consumers of the ABI.

  Summary:

   - Improve AMX documentation along with example code

   - Explicitly make some hardware constants part of the uabi"

* tag 'x86_fpu_for_6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  Documentation/x86: Explain the state component permission for guests
  Documentation/x86: Add the AMX enabling example
  x86/arch_prctl: Add AMX feature numbers as ABI constants
  Documentation/x86: Explain the purpose for dynamic features
2 parents 4980c17 + 5fbff26 commit e54debe

2 files changed

Lines changed: 103 additions & 0 deletions

Documentation/arch/x86/xstate.rst

Lines changed: 100 additions & 0 deletions
@@ -11,6 +11,22 @@ are enabled by XCR0 as well, but the first use of related instruction is
 trapped by the kernel because by default the required large XSTATE buffers
 are not allocated automatically.
 
+The purpose for dynamic features
+--------------------------------
+
+Legacy userspace libraries often have hard-coded, static sizes for
+alternate signal stacks, often using MINSIGSTKSZ which is typically 2KB.
+That stack must be able to store at *least* the signal frame that the
+kernel sets up before jumping into the signal handler. That signal frame
+must include an XSAVE buffer defined by the CPU.
+
+However, that means that the size of signal stacks is dynamic, not static,
+because different CPUs have differently-sized XSAVE buffers. A compiled-in
+size of 2KB with existing applications is too small for new CPU features
+like AMX. Instead of universally requiring a larger stack, with the dynamic
+enabling, the kernel can require userspace applications to have
+properly-sized altstacks.
+
 Using dynamically enabled XSTATE features in user space applications
 --------------------------------------------------------------------

@@ -64,6 +80,61 @@ the handler allocates a larger xstate buffer for the task so the large
 state can be context switched. In the unlikely cases that the allocation
 fails, the kernel sends SIGSEGV.
 
+AMX TILE_DATA enabling example
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Below is an example of how userspace applications enable
+TILE_DATA dynamically:
+
+  1. The application first needs to query the kernel for AMX
+     support::
+
+        #include <asm/prctl.h>
+        #include <sys/syscall.h>
+        #include <stdio.h>
+        #include <unistd.h>
+
+        #ifndef ARCH_GET_XCOMP_SUPP
+        #define ARCH_GET_XCOMP_SUPP  0x1021
+        #endif
+
+        #ifndef ARCH_XCOMP_TILECFG
+        #define ARCH_XCOMP_TILECFG   17
+        #endif
+
+        #ifndef ARCH_XCOMP_TILEDATA
+        #define ARCH_XCOMP_TILEDATA  18
+        #endif
+
+        #define MASK_XCOMP_TILE      ((1 << ARCH_XCOMP_TILECFG) | \
+                                      (1 << ARCH_XCOMP_TILEDATA))
+
+        unsigned long features;
+        long rc;
+
+        ...
+
+        rc = syscall(SYS_arch_prctl, ARCH_GET_XCOMP_SUPP, &features);
+
+        if (!rc && (features & MASK_XCOMP_TILE) == MASK_XCOMP_TILE)
+            printf("AMX is available.\n");
+
+  2. After determining support for AMX, an application must
+     explicitly ask permission to use it::
+
+        #ifndef ARCH_REQ_XCOMP_PERM
+        #define ARCH_REQ_XCOMP_PERM  0x1023
+        #endif
+
+        ...
+
+        rc = syscall(SYS_arch_prctl, ARCH_REQ_XCOMP_PERM, ARCH_XCOMP_TILEDATA);
+
+        if (!rc)
+            printf("AMX is ready for use.\n");
+
+Note this example does not include the sigaltstack preparation.
+
 Dynamic features in signal frames
 ---------------------------------
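
Editor's sketch (not part of the commit): after the two steps in the example, an application may want to confirm that permission was actually recorded for the process. The sketch below reads the permission bitmap with ARCH_GET_XCOMP_PERM, which is assumed here to be option 0x1022 (the value between the ARCH_GET_XCOMP_SUPP and ARCH_REQ_XCOMP_PERM numbers used in the example). The helper names has_feature() and tiledata_permitted() are hypothetical, and on builds without SYS_arch_prctl the query simply reports failure.

```c
#include <errno.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Assumed option value; taken from <asm/prctl.h> when available. */
#ifndef ARCH_GET_XCOMP_PERM
#define ARCH_GET_XCOMP_PERM 0x1022
#endif

#ifndef ARCH_XCOMP_TILEDATA
#define ARCH_XCOMP_TILEDATA 18
#endif

/* Hypothetical helper: test one feature number in a feature bitmap. */
static int has_feature(unsigned long mask, int feature_nr)
{
    return (int)((mask >> feature_nr) & 1UL);
}

/*
 * Hypothetical helper: returns 1 if TILE_DATA is permitted for this
 * process, 0 if it is not, and -1 if the query itself failed (for
 * example on a kernel or architecture without these options).
 */
static int tiledata_permitted(void)
{
#ifdef SYS_arch_prctl
    unsigned long perm = 0;

    if (syscall(SYS_arch_prctl, ARCH_GET_XCOMP_PERM, &perm))
        return -1;
    return has_feature(perm, ARCH_XCOMP_TILEDATA);
#else
    errno = ENOSYS;
    return -1;
#endif
}
```

Keeping the bitmap test in its own function makes the check easy to reuse for TILECFG or other feature numbers.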

@@ -72,3 +143,32 @@ entry if the feature is in its initial configuration. This differs from
 non-dynamic features which are always written regardless of their
 configuration. Signal handlers can examine the XSAVE buffer's XSTATE_BV
 field to determine if a feature was written.
+
+Dynamic features for virtual machines
+-------------------------------------
+
+The permission for the guest state component needs to be managed separately
+from the host, as they are exclusive to each other. A couple of options
+are extended to control the guest permission:
+
+-ARCH_GET_XCOMP_GUEST_PERM
+
+ arch_prctl(ARCH_GET_XCOMP_GUEST_PERM, &features);
+
+ ARCH_GET_XCOMP_GUEST_PERM is a variant of ARCH_GET_XCOMP_PERM. So it
+ provides the same semantics and functionality but for the guest
+ components.
+
+-ARCH_REQ_XCOMP_GUEST_PERM
+
+ arch_prctl(ARCH_REQ_XCOMP_GUEST_PERM, feature_nr);
+
+ ARCH_REQ_XCOMP_GUEST_PERM is a variant of ARCH_REQ_XCOMP_PERM. It has the
+ same semantics for the guest permission. While providing a similar
+ functionality, this comes with a constraint. Permission is frozen when the
+ first VCPU is created. Any attempt to change permission after that point
+ is going to be rejected. So, the permission has to be requested before the
+ first VCPU creation.
+
+Note that some VMMs may have already established a set of supported state
+components. These options are not presumed to support any particular VMM.
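
Editor's sketch (not part of the commit): the guest options follow the same calling pattern as the host ones, with the ordering constraint that the request must happen before the first VCPU is created. The sketch below uses the option numbers shown in the prctl.h hunk of this commit; the helper names xcomp_prctl(), request_guest_tiledata() and guest_tiledata_permitted() are hypothetical, and how a given VMM creates its VCPUs is outside the scope of the sketch.

```c
#include <errno.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef ARCH_GET_XCOMP_GUEST_PERM
#define ARCH_GET_XCOMP_GUEST_PERM 0x1024
#endif

#ifndef ARCH_REQ_XCOMP_GUEST_PERM
#define ARCH_REQ_XCOMP_GUEST_PERM 0x1025
#endif

#ifndef ARCH_XCOMP_TILEDATA
#define ARCH_XCOMP_TILEDATA 18
#endif

/* Hypothetical wrapper: reports failure where arch_prctl is unavailable. */
static long xcomp_prctl(int option, unsigned long arg)
{
#ifdef SYS_arch_prctl
    return syscall(SYS_arch_prctl, option, arg);
#else
    (void)option;
    (void)arg;
    errno = ENOSYS;
    return -1;
#endif
}

/*
 * Hypothetical helper: request guest permission for TILE_DATA. A VMM
 * would call this before creating its first VCPU, because the
 * permission is frozen from that point on. Returns 0 on success and
 * -1 with errno set on failure.
 */
static int request_guest_tiledata(void)
{
    return (int)xcomp_prctl(ARCH_REQ_XCOMP_GUEST_PERM, ARCH_XCOMP_TILEDATA);
}

/*
 * Hypothetical helper: returns 1 if guest TILE_DATA permission is set,
 * 0 if it is not, and -1 if the query failed.
 */
static int guest_tiledata_permitted(void)
{
    unsigned long perm = 0;

    if (xcomp_prctl(ARCH_GET_XCOMP_GUEST_PERM, (unsigned long)&perm))
        return -1;
    return (int)((perm >> ARCH_XCOMP_TILEDATA) & 1UL);
}
```

A VMM that checks guest_tiledata_permitted() after a successful request can fail early, before any VCPU exists, rather than discovering the missing permission later.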

arch/x86/include/uapi/asm/prctl.h

Lines changed: 3 additions & 0 deletions
@@ -16,6 +16,9 @@
 #define ARCH_GET_XCOMP_GUEST_PERM       0x1024
 #define ARCH_REQ_XCOMP_GUEST_PERM       0x1025
 
+#define ARCH_XCOMP_TILECFG              17
+#define ARCH_XCOMP_TILEDATA             18
+
 #define ARCH_MAP_VDSO_X32               0x2001
 #define ARCH_MAP_VDSO_32                0x2002
 #define ARCH_MAP_VDSO_64                0x2003
