Commit 9c91e6a

Merge tag 'edac_for_5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras
Pull EDAC updates from Borislav Petkov:
 "A lot of changes this time around, details below.

  From the next cycle onwards, we'll switch the EDAC tree to topic
  branches (instead of a single edac-for-next branch) which should make
  the changes handling more flexible, hopefully. We'll see.

  Summary:

   - Rework error logging functions to accept a count of errors
     parameter (Hanna Hawa)

   - Part one of substantial EDAC core + ghes_edac driver cleanup
     (Robert Richter)

   - Print additional useful logging information in skx_* (Tony Luck)

   - Improve amd64_edac hw detection + cleanups (Yazen Ghannam)

   - Misc cleanups, fixes and code improvements"

* tag 'edac_for_5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras: (35 commits)
  EDAC/altera: Use the Altera System Manager driver
  EDAC/altera: Cleanup the ECC Manager
  EDAC/altera: Use fast register IO for S10 IRQs
  EDAC/ghes: Do not warn when incrementing refcount on 0
  EDAC/Documentation: Describe CPER module definition and DIMM ranks
  EDAC: Unify the mc_event tracepoint call
  EDAC/ghes: Remove intermediate buffer pvt->detail_location
  EDAC/ghes: Fix grain calculation
  EDAC/ghes: Use standard kernel macros for page calculations
  EDAC: Remove misleading comment in struct edac_raw_error_desc
  EDAC/mc: Reduce indentation level in edac_mc_handle_error()
  EDAC/mc: Remove needless zero string termination
  EDAC/mc: Do not BUG_ON() in edac_mc_alloc()
  EDAC: Introduce an mci_for_each_dimm() iterator
  EDAC: Remove EDAC_DIMM_OFF() macro
  EDAC: Replace EDAC_DIMM_PTR() macro with edac_get_dimm() function
  EDAC/amd64: Get rid of the ECC disabled long message
  EDAC/ghes: Fix locking and memory barrier issues
  EDAC/amd64: Check for memory before fully initializing an instance
  EDAC/amd64: Use cached data when checking for ECC
  ...
2 parents 752272f + 5781823 commit 9c91e6a

25 files changed

Lines changed: 555 additions & 633 deletions

Documentation/admin-guide/ras.rst

Lines changed: 19 additions & 12 deletions
@@ -330,9 +330,12 @@ There can be multiple csrows and multiple channels.
 
 .. [#f4] Nowadays, the term DIMM (Dual In-line Memory Module) is widely
   used to refer to a memory module, although there are other memory
-  packaging alternatives, like SO-DIMM, SIMM, etc. Along this document,
-  and inside the EDAC system, the term "dimm" is used for all memory
-  modules, even when they use a different kind of packaging.
+  packaging alternatives, like SO-DIMM, SIMM, etc. The UEFI
+  specification (Version 2.7) defines a memory module in the Common
+  Platform Error Record (CPER) section to be an SMBIOS Memory Device
+  (Type 17). Along this document, and inside the EDAC subsystem, the term
+  "dimm" is used for all memory modules, even when they use a
+  different kind of packaging.
 
 Memory controllers allow for several csrows, with 8 csrows being a
 typical value. Yet, the actual number of csrows depends on the layout of
@@ -349,12 +352,14 @@ controllers. The following example will assume 2 channels:
 |            |  ``ch0``  |  ``ch1``  |
 +============+===========+===========+
 | ``csrow0`` |  DIMM_A0  |  DIMM_B0  |
-+------------+           |           |
-| ``csrow1`` |           |           |
+|            |   rank0   |   rank0   |
++------------+     -     |     -     |
+| ``csrow1`` |   rank1   |   rank1   |
 +------------+-----------+-----------+
 | ``csrow2`` |  DIMM_A1  |  DIMM_B1  |
-+------------+           |           |
-| ``csrow3`` |           |           |
+|            |   rank0   |   rank0   |
++------------+     -     |     -     |
+| ``csrow3`` |   rank1   |   rank1   |
 +------------+-----------+-----------+
 
 In the above example, there are 4 physical slots on the motherboard
@@ -374,11 +379,13 @@ which the memory DIMM is placed. Thus, when 1 DIMM is placed in each
 Channel, the csrows cross both DIMMs.
 
 Memory DIMMs come single or dual "ranked". A rank is a populated csrow.
-Thus, 2 single ranked DIMMs, placed in slots DIMM_A0 and DIMM_B0 above
-will have just one csrow (csrow0). csrow1 will be empty. On the other
-hand, when 2 dual ranked DIMMs are similarly placed, then both csrow0
-and csrow1 will be populated. The pattern repeats itself for csrow2 and
-csrow3.
+In the example above 2 dual ranked DIMMs are similarly placed. Thus,
+both csrow0 and csrow1 are populated. On the other hand, when 2 single
+ranked DIMMs are placed in slots DIMM_A0 and DIMM_B0, then they will
+have just one csrow (csrow0) and csrow1 will be empty. The pattern
+repeats itself for csrow2 and csrow3. Also note that some memory
+controllers don't have any logic to identify the memory module, see
+``rankX`` directories below.
 
 The representation of the above is reflected in the directory
 tree in EDAC's sysfs interface. Starting in directory
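
The csrow/rank relationship described in the updated documentation can be sketched in plain userspace C. This is only an illustration of the example table, not EDAC code: the helper names `csrow_index()`, `csrow_populated()` and `print_csrows()` are hypothetical, and the sketch assumes that every slot row spans exactly two csrows and that all DIMMs have the same number of ranks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Assumption: as in the example table, every slot row spans two csrows. */
#define CSROWS_PER_SLOT_ROW 2

/*
 * Hypothetical helper: csrow index for a slot row (0 = DIMM_A0/DIMM_B0,
 * 1 = DIMM_A1/DIMM_B1) and a rank (0 or 1) within the DIMM.
 */
static int csrow_index(int slot_row, int rank)
{
	assert(slot_row >= 0);
	assert(rank >= 0 && rank < CSROWS_PER_SLOT_ROW);
	return slot_row * CSROWS_PER_SLOT_ROW + rank;
}

/*
 * Hypothetical helper: a csrow is populated when the DIMMs in its slot row
 * have a rank at that position, so single ranked DIMMs fill only the first
 * csrow of each pair.
 */
static bool csrow_populated(int csrow, int ranks_per_dimm)
{
	assert(csrow >= 0);
	return (csrow % CSROWS_PER_SLOT_ROW) < ranks_per_dimm;
}

/* Print which of the four csrows are populated for a given rank count. */
static void print_csrows(int ranks_per_dimm)
{
	for (int csrow = 0; csrow < 2 * CSROWS_PER_SLOT_ROW; csrow++)
		printf("csrow%d: %s\n", csrow,
		       csrow_populated(csrow, ranks_per_dimm) ?
		       "populated" : "empty");
}
```

Calling `print_csrows(1)` models single ranked DIMMs in every slot, and `print_csrows(2)` models the dual ranked layout shown in the table.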

drivers/edac/altera_edac.c

Lines changed: 9 additions & 143 deletions
@@ -14,6 +14,7 @@
 #include <linux/interrupt.h>
 #include <linux/irqchip/chained_irq.h>
 #include <linux/kernel.h>
+#include <linux/mfd/altera-sysmgr.h>
 #include <linux/mfd/syscon.h>
 #include <linux/notifier.h>
 #include <linux/of_address.h>
@@ -275,7 +276,6 @@ static int a10_unmask_irq(struct platform_device *pdev, u32 mask)
 	return ret;
 }
 
-static int socfpga_is_a10(void);
 static int altr_sdram_probe(struct platform_device *pdev)
 {
 	const struct of_device_id *id;
@@ -399,7 +399,7 @@ static int altr_sdram_probe(struct platform_device *pdev)
 		goto err;
 
 	/* Only the Arria10 has separate IRQs */
-	if (socfpga_is_a10()) {
+	if (of_machine_is_compatible("altr,socfpga-arria10")) {
 		/* Arria10 specific initialization */
 		res = a10_init(mc_vbase);
 		if (res < 0)
@@ -502,68 +502,6 @@ module_platform_driver(altr_sdram_edac_driver);
 
 #endif /* CONFIG_EDAC_ALTERA_SDRAM */
 
-/**************** Stratix 10 EDAC Memory Controller Functions ************/
-
-/**
- * s10_protected_reg_write
- * Write to a protected SMC register.
- * @context: Not used.
- * @reg: Address of register
- * @value: Value to write
- * Return: INTEL_SIP_SMC_STATUS_OK (0) on success
- *	   INTEL_SIP_SMC_REG_ERROR on error
- *	   INTEL_SIP_SMC_RETURN_UNKNOWN_FUNCTION if not supported
- */
-static int s10_protected_reg_write(void *context, unsigned int reg,
-				   unsigned int val)
-{
-	struct arm_smccc_res result;
-	unsigned long offset = (unsigned long)context;
-
-	arm_smccc_smc(INTEL_SIP_SMC_REG_WRITE, offset + reg, val, 0, 0,
-		      0, 0, 0, &result);
-
-	return (int)result.a0;
-}
-
-/**
- * s10_protected_reg_read
- * Read the status of a protected SMC register
- * @context: Not used.
- * @reg: Address of register
- * @value: Value read.
- * Return: INTEL_SIP_SMC_STATUS_OK (0) on success
- *	   INTEL_SIP_SMC_REG_ERROR on error
- *	   INTEL_SIP_SMC_RETURN_UNKNOWN_FUNCTION if not supported
- */
-static int s10_protected_reg_read(void *context, unsigned int reg,
-				  unsigned int *val)
-{
-	struct arm_smccc_res result;
-	unsigned long offset = (unsigned long)context;
-
-	arm_smccc_smc(INTEL_SIP_SMC_REG_READ, offset + reg, 0, 0, 0,
-		      0, 0, 0, &result);
-
-	*val = (unsigned int)result.a1;
-
-	return (int)result.a0;
-}
-
-static const struct regmap_config s10_sdram_regmap_cfg = {
-	.name = "s10_ddr",
-	.reg_bits = 32,
-	.reg_stride = 4,
-	.val_bits = 32,
-	.max_register = 0xffd12228,
-	.reg_read = s10_protected_reg_read,
-	.reg_write = s10_protected_reg_write,
-	.use_single_read = true,
-	.use_single_write = true,
-};
-
-/************** </Stratix10 EDAC Memory Controller Functions> ***********/
-
 /************************* EDAC Parent Probe *************************/
 
 static const struct of_device_id altr_edac_device_of_match[];
@@ -1008,16 +946,6 @@ static int __maybe_unused altr_init_memory_port(void __iomem *ioaddr, int port)
 	return ret;
 }
 
-static int socfpga_is_a10(void)
-{
-	return of_machine_is_compatible("altr,socfpga-arria10");
-}
-
-static int socfpga_is_s10(void)
-{
-	return of_machine_is_compatible("altr,socfpga-stratix10");
-}
-
 static __init int __maybe_unused
 altr_init_a10_ecc_block(struct device_node *np, u32 irq_mask,
 			u32 ecc_ctrl_en_mask, bool dual_port)
@@ -1033,34 +961,10 @@ altr_init_a10_ecc_block(struct device_node *np, u32 irq_mask,
 	/* Get the ECC Manager - parent of the device EDACs */
 	np_eccmgr = of_get_parent(np);
 
-	if (socfpga_is_a10()) {
-		ecc_mgr_map = syscon_regmap_lookup_by_phandle(np_eccmgr,
-							      "altr,sysmgr-syscon");
-	} else {
-		struct device_node *sysmgr_np;
-		struct resource res;
-		uintptr_t base;
-
-		sysmgr_np = of_parse_phandle(np_eccmgr,
-					     "altr,sysmgr-syscon", 0);
-		if (!sysmgr_np) {
-			edac_printk(KERN_ERR, EDAC_DEVICE,
-				    "Unable to find altr,sysmgr-syscon\n");
-			return -ENODEV;
-		}
-
-		if (of_address_to_resource(sysmgr_np, 0, &res)) {
-			of_node_put(sysmgr_np);
-			return -ENOMEM;
-		}
+	ecc_mgr_map =
+		altr_sysmgr_regmap_lookup_by_phandle(np_eccmgr,
+						     "altr,sysmgr-syscon");
 
-		/* Need physical address for SMCC call */
-		base = res.start;
-
-		ecc_mgr_map = regmap_init(NULL, NULL, (void *)base,
-					  &s10_sdram_regmap_cfg);
-		of_node_put(sysmgr_np);
-	}
 	of_node_put(np_eccmgr);
 	if (IS_ERR(ecc_mgr_map)) {
 		edac_printk(KERN_ERR, EDAC_DEVICE,
@@ -1125,9 +1029,6 @@ static int __init __maybe_unused altr_init_a10_ecc_device_type(char *compat)
 	int irq;
 	struct device_node *child, *np;
 
-	if (!socfpga_is_a10() && !socfpga_is_s10())
-		return -ENODEV;
-
 	np = of_find_compatible_node(NULL, NULL,
 				     "altr,socfpga-a10-ecc-manager");
 	if (!np) {
@@ -2178,33 +2079,9 @@ static int altr_edac_a10_probe(struct platform_device *pdev)
 	platform_set_drvdata(pdev, edac);
 	INIT_LIST_HEAD(&edac->a10_ecc_devices);
 
-	if (socfpga_is_a10()) {
-		edac->ecc_mgr_map =
-			syscon_regmap_lookup_by_phandle(pdev->dev.of_node,
-							"altr,sysmgr-syscon");
-	} else {
-		struct device_node *sysmgr_np;
-		struct resource res;
-		uintptr_t base;
-
-		sysmgr_np = of_parse_phandle(pdev->dev.of_node,
-					     "altr,sysmgr-syscon", 0);
-		if (!sysmgr_np) {
-			edac_printk(KERN_ERR, EDAC_DEVICE,
-				    "Unable to find altr,sysmgr-syscon\n");
-			return -ENODEV;
-		}
-
-		if (of_address_to_resource(sysmgr_np, 0, &res))
-			return -ENOMEM;
-
-		/* Need physical address for SMCC call */
-		base = res.start;
-
-		edac->ecc_mgr_map = devm_regmap_init(&pdev->dev, NULL,
-						     (void *)base,
-						     &s10_sdram_regmap_cfg);
-	}
+	edac->ecc_mgr_map =
+		altr_sysmgr_regmap_lookup_by_phandle(pdev->dev.of_node,
+						     "altr,sysmgr-syscon");
 
 	if (IS_ERR(edac->ecc_mgr_map)) {
 		edac_printk(KERN_ERR, EDAC_DEVICE,
@@ -2270,18 +2147,7 @@ static int altr_edac_a10_probe(struct platform_device *pdev)
 		if (!of_device_is_available(child))
 			continue;
 
-		if (of_device_is_compatible(child, "altr,socfpga-a10-l2-ecc") ||
-		    of_device_is_compatible(child, "altr,socfpga-a10-ocram-ecc") ||
-		    of_device_is_compatible(child, "altr,socfpga-eth-mac-ecc") ||
-		    of_device_is_compatible(child, "altr,socfpga-nand-ecc") ||
-		    of_device_is_compatible(child, "altr,socfpga-dma-ecc") ||
-		    of_device_is_compatible(child, "altr,socfpga-usb-ecc") ||
-		    of_device_is_compatible(child, "altr,socfpga-qspi-ecc") ||
-#ifdef CONFIG_EDAC_ALTERA_SDRAM
-		    of_device_is_compatible(child, "altr,sdram-edac-s10") ||
-#endif
-		    of_device_is_compatible(child, "altr,socfpga-sdmmc-ecc"))
-
+		if (of_match_node(altr_edac_a10_device_of_match, child))
 			altr_edac_a10_device_add(edac, child);
 
 #ifdef CONFIG_EDAC_ALTERA_SDRAM
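
The last hunk replaces a chain of `of_device_is_compatible()` calls with a single table lookup. The idea can be sketched in self-contained userspace C as below. This is an analogy only and does not use the kernel's devicetree API: `struct compat_id`, `ecc_device_ids` and `match_compat()` are hypothetical names, the table entries are copied from the removed condition rather than from `altr_edac_a10_device_of_match` (whose contents are not shown in this diff), and the exact `strcmp()` comparison is a simplification of what the kernel helpers do.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct of_device_id: compatible string only. */
struct compat_id {
	const char *compatible;
};

/*
 * Sentinel-terminated table. The strings are taken from the removed
 * if-chain; the real match table in the driver may differ.
 */
static const struct compat_id ecc_device_ids[] = {
	{ "altr,socfpga-a10-l2-ecc" },
	{ "altr,socfpga-a10-ocram-ecc" },
	{ "altr,socfpga-eth-mac-ecc" },
	{ "altr,socfpga-nand-ecc" },
	{ "altr,socfpga-dma-ecc" },
	{ "altr,socfpga-usb-ecc" },
	{ "altr,socfpga-qspi-ecc" },
	{ "altr,socfpga-sdmmc-ecc" },
	{ NULL } /* sentinel */
};

/*
 * Return the first table entry whose string equals @compatible, or NULL
 * when no entry matches. Supporting a new device then means adding one
 * table row instead of extending a condition.
 */
static const struct compat_id *match_compat(const struct compat_id *table,
					     const char *compatible)
{
	for (; table->compatible; table++) {
		if (strcmp(table->compatible, compatible) == 0)
			return table;
	}
	return NULL;
}
```

With such a table, the call site reduces to one condition, for example `if (match_compat(ecc_device_ids, name))`, which mirrors the shape of the new `of_match_node()` line.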
