Commit a7bfae2

rmurphy-arm authored and willdeacon committed
perf/arm-cmn: Reduce stack usage during discovery
Arnd reports that Clang's aggressive inlining of arm_cmn_discover() can lead to stack frame size warnings, and while we could simply prevent such inlining to hide the issue, it seems more productive to actually heed the warning and do something about the overall stack footprint.

The xp_region array is already rather large, and CMN_MAX_XPS might only grow larger in future, however it only serves as a convenience to save repeating the first level's worth of register reads in the second pass of discovery. There's no performance concern here, and it only takes a small tweak to the flow to re-extract the offsets instead of stashing them, so let's just do that and save several hundred bytes of stack.

Reported-by: Arnd Bergmann <arnd@kernel.org>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-and-tested-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Link: https://lore.kernel.org/r/e7dd41bf0f1b098e2e4b01ef91318a4b272abff8.1751046159.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
1 parent b6e37b2 commit a7bfae2

1 file changed

Lines changed: 8 additions & 7 deletions

drivers/perf/arm-cmn.c

@@ -2245,12 +2245,11 @@ static enum cmn_node_type arm_cmn_subtype(enum cmn_node_type type)
 
 static int arm_cmn_discover(struct arm_cmn *cmn, unsigned int rgn_offset)
 {
-	void __iomem *cfg_region;
+	void __iomem *cfg_region, __iomem *xp_region;
 	struct arm_cmn_node cfg, *dn;
 	struct arm_cmn_dtm *dtm;
 	enum cmn_part part;
 	u16 child_count, child_poff;
-	u32 xp_offset[CMN_MAX_XPS];
 	u64 reg;
 	int i, j;
 	size_t sz;
@@ -2302,11 +2301,12 @@ static int arm_cmn_discover(struct arm_cmn *cmn, unsigned int rgn_offset)
 	cmn->num_dns = cmn->num_xps;
 
 	/* Pass 1: visit the XPs, enumerate their children */
+	cfg_region += child_poff;
 	for (i = 0; i < cmn->num_xps; i++) {
-		reg = readq_relaxed(cfg_region + child_poff + i * 8);
-		xp_offset[i] = reg & CMN_CHILD_NODE_ADDR;
+		reg = readq_relaxed(cfg_region + i * 8);
+		xp_region = cmn->base + (reg & CMN_CHILD_NODE_ADDR);
 
-		reg = readq_relaxed(cmn->base + xp_offset[i] + CMN_CHILD_INFO);
+		reg = readq_relaxed(xp_region + CMN_CHILD_INFO);
 		cmn->num_dns += FIELD_GET(CMN_CI_CHILD_COUNT, reg);
 	}
 
@@ -2332,11 +2332,12 @@ static int arm_cmn_discover(struct arm_cmn *cmn, unsigned int rgn_offset)
 	cmn->dns = dn;
 	cmn->dtms = dtm;
 	for (i = 0; i < cmn->num_xps; i++) {
-		void __iomem *xp_region = cmn->base + xp_offset[i];
 		struct arm_cmn_node *xp = dn++;
 		unsigned int xp_ports = 0;
 
-		arm_cmn_init_node_info(cmn, xp_offset[i], xp);
+		reg = readq_relaxed(cfg_region + i * 8);
+		xp_region = cmn->base + (reg & CMN_CHILD_NODE_ADDR);
+		arm_cmn_init_node_info(cmn, reg & CMN_CHILD_NODE_ADDR, xp);
 		/*
 		 * Thanks to the order in which XP logical IDs seem to be
 		 * assigned, we can handily infer the mesh X dimension by
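
The trade-off described in the commit message can be illustrated outside the kernel with a small userspace sketch. This is not the driver code: `mock_cfg`, `mock_readq()`, `MOCK_MAX_XPS` and `MOCK_CHILD_ADDR` are hypothetical stand-ins, and the array size of 144 entries and the 28-bit mask are assumptions chosen only to make the arithmetic concrete. The first function stashes every offset in an on-stack array between the two passes; the second re-reads each entry in the second pass and re-extracts the offset, which is the shape of the change above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; both values are assumptions for illustration. */
#define MOCK_MAX_XPS	144u
#define MOCK_CHILD_ADDR	0x0fffffffULL	/* plays the role of CMN_CHILD_NODE_ADDR */

/* Mock of the root node's child pointer table: one 64-bit entry per XP. */
struct mock_cfg {
	const uint64_t *child_ptrs;
	unsigned int num_xps;
};

/* Stand-in for readq_relaxed(cfg_region + i * 8). */
static uint64_t mock_readq(const struct mock_cfg *cfg, unsigned int i)
{
	assert(i < cfg->num_xps);
	return cfg->child_ptrs[i];
}

/* Old flow: pass 1 stashes each offset on the stack for pass 2 to reuse. */
static uint64_t sum_offsets_stashed(const struct mock_cfg *cfg)
{
	uint32_t xp_offset[MOCK_MAX_XPS];	/* 144 * 4 = 576 bytes of stack */
	uint64_t sum = 0;
	unsigned int i;

	assert(cfg->num_xps <= MOCK_MAX_XPS);
	for (i = 0; i < cfg->num_xps; i++)	/* pass 1 */
		xp_offset[i] = (uint32_t)(mock_readq(cfg, i) & MOCK_CHILD_ADDR);
	for (i = 0; i < cfg->num_xps; i++)	/* pass 2 */
		sum += xp_offset[i];
	return sum;
}

/* New flow: pass 2 repeats the read and re-extracts the offset instead. */
static uint64_t sum_offsets_reread(const struct mock_cfg *cfg)
{
	uint64_t sum = 0;
	unsigned int i;

	for (i = 0; i < cfg->num_xps; i++)	/* pass 1: value used, not kept */
		(void)mock_readq(cfg, i);
	for (i = 0; i < cfg->num_xps; i++)	/* pass 2: read the entry again */
		sum += mock_readq(cfg, i) & MOCK_CHILD_ADDR;
	return sum;
}
```

Both functions return the same result as long as the table does not change between the passes; the second simply trades one extra read per entry for not keeping the array on the stack.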
