Commit c35977b

yghannam authored and bp3tk0v committed
x86/MCE/AMD, EDAC/mce_amd: Decode UMC_V2 ECC errors

The MI200 (Aldebaran) series of devices introduced a new SMCA bank type for Unified Memory Controllers. The MCE subsystem already has support for this new type. The MCE decoder module will decode the common MCA error information for the new bank type, but it will not pass the information to the AMD64 EDAC module for detailed memory error decoding.

Have the MCE decoder module recognize the new bank type as an SMCA UMC memory error and pass the MCA information to AMD64 EDAC.

Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Co-developed-by: Muralidhara M K <muralidhara.mk@amd.com>
Signed-off-by: Muralidhara M K <muralidhara.mk@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20230515113537.1052146-3-muralimk@amd.com
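The check described above can be illustrated outside the kernel with a minimal user-space sketch. It is not the kernel code: the `sketch_*` names and the three-value enum are hypothetical stand-ins for `enum smca_bank_types`, and only the logic (either UMC bank type, ErrCodeExt equal to 0) is taken from the diff in this commit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's enum smca_bank_types values. */
enum sketch_bank_type {
	SKETCH_UMC,
	SKETCH_UMC_V2,
	SKETCH_OTHER,
};

/* ErrCodeExt is taken from status bits [20:16], as in the patched function. */
static inline uint8_t sketch_xec(uint64_t status)
{
	return (status >> 16) & 0x1f;
}

/* True when the bank is either UMC flavour and the extended error code is 0. */
static inline bool sketch_is_umc_ecc_error(enum sketch_bank_type type,
					   uint64_t status)
{
	bool is_umc = (type == SKETCH_UMC || type == SKETCH_UMC_V2);

	return is_umc && sketch_xec(status) == 0x0;
}
```

With this sketch, a `SKETCH_UMC_V2` bank with a zero extended error code is classified the same way as a `SKETCH_UMC` bank, which is the behaviour the commit message asks for.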
1 parent f5e87cd commit c35977b

2 files changed

Lines changed: 6 additions & 3 deletions

arch/x86/kernel/cpu/mce/amd.c

Lines changed: 4 additions & 2 deletions
@@ -715,11 +715,13 @@ void mce_amd_feature_init(struct cpuinfo_x86 *c)
 
 bool amd_mce_is_memory_error(struct mce *m)
 {
+	enum smca_bank_types bank_type;
 	/* ErrCodeExt[20:16] */
 	u8 xec = (m->status >> 16) & 0x1f;
 
+	bank_type = smca_get_bank_type(m->extcpu, m->bank);
 	if (mce_flags.smca)
-		return smca_get_bank_type(m->extcpu, m->bank) == SMCA_UMC && xec == 0x0;
+		return (bank_type == SMCA_UMC || bank_type == SMCA_UMC_V2) && xec == 0x0;
 
 	return m->bank == 4 && xec == 0x8;
 }
@@ -1050,7 +1052,7 @@ static const char *get_name(unsigned int cpu, unsigned int bank, struct threshol
 	if (bank_type >= N_SMCA_BANK_TYPES)
 		return NULL;
 
-	if (b && bank_type == SMCA_UMC) {
+	if (b && (bank_type == SMCA_UMC || bank_type == SMCA_UMC_V2)) {
 		if (b->block < ARRAY_SIZE(smca_umc_block_names))
 			return smca_umc_block_names[b->block];
 		return NULL;

drivers/edac/mce_amd.c

Lines changed: 2 additions & 1 deletion
@@ -1186,7 +1186,8 @@ static void decode_smca_error(struct mce *m)
 	if (xec < smca_mce_descs[bank_type].num_descs)
 		pr_cont(", %s.\n", smca_mce_descs[bank_type].descs[xec]);
 
-	if (bank_type == SMCA_UMC && xec == 0 && decode_dram_ecc)
+	if ((bank_type == SMCA_UMC || bank_type == SMCA_UMC_V2) &&
+	    xec == 0 && decode_dram_ecc)
 		decode_dram_ecc(topology_die_id(m->extcpu), m);
 }
 
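The guarded hand-off in `decode_smca_error()` can be tried in isolation with the rough user-space sketch below. All `sketch_*` names are hypothetical, and the hook signature is simplified to a node id plus a raw status word rather than the kernel's `struct mce`. The only parts taken from the diff are the condition and the fact that `decode_dram_ecc` is tested before being called.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's enum smca_bank_types values. */
enum sketch_bank_type { SKETCH_UMC, SKETCH_UMC_V2, SKETCH_OTHER };

/* Hypothetical decoder hook; stays NULL until a detailed decoder is set. */
typedef void (*sketch_ecc_decoder)(int node_id, uint64_t status);

static sketch_ecc_decoder sketch_decode_dram_ecc;

/* Example decoder that only counts how often it was invoked. */
static int sketch_decoder_calls;

static void sketch_counting_decoder(int node_id, uint64_t status)
{
	(void)node_id;
	(void)status;
	sketch_decoder_calls++;
}

/* Forward the error only for UMC-type banks with ErrCodeExt == 0,
 * and only when a decoder hook is present. */
static inline void sketch_decode_smca_error(enum sketch_bank_type type,
					    int node_id, uint64_t status)
{
	uint8_t xec = (status >> 16) & 0x1f;

	if ((type == SKETCH_UMC || type == SKETCH_UMC_V2) &&
	    xec == 0 && sketch_decode_dram_ecc)
		sketch_decode_dram_ecc(node_id, status);
}
```

Assigning `sketch_counting_decoder` to `sketch_decode_dram_ecc` and calling `sketch_decode_smca_error()` with different bank types shows which errors reach the hook. While the hook is NULL, nothing is forwarded.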