Commit 1e536e1

kaihuang authored and hansendc committed
x86/cpu: Detect TDX partial write machine check erratum
TDX memory has integrity and confidentiality protections. Violations of this integrity protection are supposed to only affect TDX operations and are never supposed to affect the host kernel itself. In other words, the host kernel should never, itself, see machine checks induced by the TDX integrity hardware.

Alas, the first few generations of TDX hardware have an erratum. A partial write to a TDX private memory cacheline will silently "poison" the line. Subsequent reads will consume the poison and generate a machine check. According to the TDX hardware spec, neither of these things should have happened.

Virtually all kernel memory access operations happen in full cachelines. In practice, writing a "byte" of memory usually reads a 64-byte cacheline of memory, modifies it, then writes the whole line back. Those operations do not trigger this problem.

This problem is triggered by "partial" writes, where a write transaction of less than a cacheline lands at the memory controller. The CPU does these via non-temporal write instructions (like MOVNTI), or through UC/WC memory mappings. The issue can also be triggered away from the CPU by devices doing partial writes via DMA.

With this erratum, additional things need to be done. To prepare for those changes, add a CPU bug bit to indicate this erratum. Note that this bug reflects the hardware, so it is detected regardless of whether the kernel is built with TDX support or not.

Signed-off-by: Kai Huang <kai.huang@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lore.kernel.org/all/20231208170740.53979-17-dave.hansen%40intel.com
1 parent f3f6aa6 commit 1e536e1

2 files changed

Lines changed: 20 additions & 0 deletions

arch/x86/include/asm/cpufeatures.h

Lines changed: 1 addition & 0 deletions
@@ -496,6 +496,7 @@
 #define X86_BUG_EIBRS_PBRSB		X86_BUG(28) /* EIBRS is vulnerable to Post Barrier RSB Predictions */
 #define X86_BUG_SMT_RSB			X86_BUG(29) /* CPU is vulnerable to Cross-Thread Return Address Predictions */
 #define X86_BUG_GDS			X86_BUG(30) /* CPU is affected by Gather Data Sampling */
+#define X86_BUG_TDX_PW_MCE		X86_BUG(31) /* CPU may incur #MC if non-TD software does partial write to TDX private memory */
 
 /* BUG word 2 */
 #define X86_BUG_SRSO			X86_BUG(1*32 + 0) /* AMD SRSO bug */

arch/x86/virt/vmx/tdx/tdx.c

Lines changed: 19 additions & 0 deletions
@@ -33,6 +33,8 @@
 #include <asm/msr.h>
 #include <asm/cpufeature.h>
 #include <asm/tdx.h>
+#include <asm/intel-family.h>
+#include <asm/processor.h>
 #include "tdx.h"
 
 static u32 tdx_global_keyid __ro_after_init;
@@ -1308,6 +1310,21 @@ static struct notifier_block tdx_memory_nb = {
 	.notifier_call = tdx_memory_notifier,
 };
 
+static void __init check_tdx_erratum(void)
+{
+	/*
+	 * These CPUs have an erratum.  A partial write from non-TD
+	 * software (e.g. via MOVNTI variants or UC/WC mapping) to TDX
+	 * private memory poisons that memory, and a subsequent read of
+	 * that memory triggers #MC.
+	 */
+	switch (boot_cpu_data.x86_model) {
+	case INTEL_FAM6_SAPPHIRERAPIDS_X:
+	case INTEL_FAM6_EMERALDRAPIDS_X:
+		setup_force_cpu_bug(X86_BUG_TDX_PW_MCE);
+	}
+}
+
 void __init tdx_init(void)
 {
 	u32 tdx_keyid_start, nr_tdx_keyids;
@@ -1361,4 +1378,6 @@ void __init tdx_init(void)
 	tdx_nr_guest_keyids = nr_tdx_keyids - 1;
 
 	setup_force_cpu_cap(X86_FEATURE_TDX_HOST_PLATFORM);
+
+	check_tdx_erratum();
 }
