evo2 SAE steering analysis: dose-response / selectivity harness + tested metrics by polinabinder1 · Pull Request #1635 · NVIDIA-BioNeMo/bionemo-framework

polinabinder1 · 2026-06-12T03:36:39Z

Summary

Steering analysis — tested effect-metrics + a harness that persists results. Stacked on the engine (#1622).

scripts/steer_analysis.py — pure, CPU-tested metrics: divergence, dose_response, selectivity
scripts/steer.py — harness: reuses the production Evo2SAE.generate decode-only sae.steering clamp, scores with the above, writes steering_results.json
tests/test_steer_analysis.py (CPU)

How to run

pytest recipes/evo2/tests/test_steer_analysis.py                 # CPU metrics (no model)
python steer.py --evo2-ckpt-dir $CKPT --sae-checkpoint $SAE --layer 26 \
    --sequence ATGGCC... --feature 29244 --controls 33918,12000,5000 \
    --strengths 0,50,100,200,400 --out steering_results.json     # GPU: dose-response + selectivity

Writes steering_results.json (target, dose-response curve, target-vs-control selectivity) — the REQ-6 steering evidence once run on 7B/L26.

Summary by CodeRabbit

Release Notes

New Features
- Added feature steering harness for Evo2 models enabling controlled sequence generation with adjustable feature clamping and dose-response analysis
- Introduced metrics for measuring feature steering effects, including divergence analysis and selectivity measurement against control features
Tests
- Added comprehensive test coverage for feature steering metrics

Expected output

steer.py … --out steering_results.json prints: the sequence's top active features, then a dose-response table (strength → diverges@N, %changed) where %changed rises with strength for a real steering feature, then selectivity (target %changed ≫ mean control; ratio > 1). Writes the same to JSON.
pytest tests/test_steer_analysis.py → 3 passed.

The dose-response / selectivity numbers are pending the 7B/L26 GPU run (REQ-6 evidence). Expected shape: monotone dose-response + target ratio clearly above controls.

The analysis layer steering lacked (steer.py used to only print). Adds steer_analysis.py — pure, CPU-tested metrics: divergence (continuation departs from baseline), dose_response (effect vs clamp strength), selectivity (target vs control features). steer.py reuses the production Evo2SAE.generate clamp, scores with these, and writes steering_results.json so the evidence is persisted/reproducible — the steering analog of probe.py annotate. Stacked on #1622. Signed-off-by: Polina Binder <pbinder@nvidia.com>

copy-pr-bot · 2026-06-12T03:36:43Z

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

coderabbitai · 2026-06-12T03:36:46Z

Important

Review skipped

Auto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 5992e03f-89e7-4750-9edb-4cd2f4be07a9

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

🔍 Trigger review

📝 Walkthrough

Walkthrough

This PR introduces a complete SAE steering harness for Evo2: a pure metrics module for quantifying feature-driven text changes, a CLI script that orchestrates encoding, generation sweeps, and causal analysis, and corresponding unit tests. The workflow measures dose-response and selectivity of target feature clamping against baseline and control features.

Changes

SAE Steering Harness Implementation

Layer / File(s)	Summary
Steering metrics module `bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/scripts/steer_analysis.py`	Pure functions `divergence`, `dose_response`, and `selectivity` compute character-level divergence between text strings, quantify per-strength dose-response curves, and measure selectivity of target features against control features at matching clamp strengths.
Steering harness script and orchestration `bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/scripts/steer.py`	CLI-driven harness that loads Evo2SAE, encodes DNA sequences, auto-selects or accepts target features, generates baseline and clamped continuations, invokes metrics to measure dose-response and selectivity, and optionally writes results to JSON.
Metrics test coverage `bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/tests/test_steer_analysis.py`	Three unit tests verify divergence correctness, dose-response sorting and monotonicity, and selectivity ratio behavior under target vs control activation patterns.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

A steering harness takes the stage,
With clamps and metrics on each page,
We measure what each feature sways,
In DNA through many ways. 🧬✨

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately summarizes the main changes: adding tested steering analysis metrics (dose-response/selectivity) and a harness for Evo2 SAE steering.
Docstring Coverage	✅ Passed	Docstring coverage is 87.50% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Description check	✅ Passed	The pull request description provides clear context on the changes, includes usage examples, and documents expected outputs and objectives.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch pbinder/evo2-steering-analysis

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

…nder/evo2-steering-analysis Signed-off-by: Polina Binder <pbinder@nvidia.com>

polinabinder1 · 2026-06-12T04:30:04Z

@coderabbitai review

coderabbitai · 2026-06-12T04:30:12Z

✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai

Actionable comments posted: 6

🧹 Nitpick comments (2)

bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/tests/test_steer_analysis.py (1)

27-50: ⚡ Quick win

Consider adding edge case tests.

The current tests cover happy paths well. Consider adding tests for edge cases such as:

Empty strings in divergence (e.g., divergence("", "") or divergence("A", ""))
Empty control_steered dict in selectivity (verifies the edge case flagged in steer_analysis.py)
dose_response with empty steered_by_strength dict

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/tests/test_steer_analysis.py`
around lines 27 - 50, Add unit tests covering edge cases: call divergence with
empty strings and with one empty vs non-empty (e.g., divergence("", "") and
divergence("A", "")) and assert expected behavior; add a test for selectivity
that passes an empty control_steered dict to ensure the function handles/raises
as intended; and add a test for dose_response with an empty steered_by_strength
dict to verify it returns an empty/appropriate response. Place these new tests
in the existing test_steer_analysis.py alongside the other tests and reference
the same functions divergence, selectivity, and dose_response so edge behaviors
are validated.

bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/scripts/steer_analysis.py (1)

33-68: ⚡ Quick win

Expand docstrings to Google-style format.

The coding guidelines require Google-style docstrings with Args, Returns, and (if applicable) Raises sections. The current brief docstrings are functional but could be more explicit about parameter types, return structure, and edge cases.

📚 Example expansion for `divergence`

def divergence(a: str, b: str) -> tuple[int, float]:
    """Return divergence metrics between two strings over their shared prefix.

    Args:
        a: First string to compare.
        b: Second string to compare.

    Returns:
        A tuple of (first_differing_index, fraction_differing) where:
            - first_differing_index is the index of the first character that differs,
              or the shared prefix length if strings are identical up to that point.
            - fraction_differing is the fraction of characters that differ within
              the shared prefix length (min of the two string lengths).
    """

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/scripts/steer_analysis.py`
around lines 33 - 68, Update the short docstrings for divergence, dose_response,
and selectivity to full Google-style docstrings: for divergence, add an Args
section documenting a: str and b: str, a Returns section describing the tuple
(first_differing_index: int, fraction_differing: float) and explain edge cases
(identical prefix => index == shared prefix length, fraction computed over
min(len(a), len(b)) and uses max(1, n) to avoid division by zero); for
dose_response, document Args (baseline: str, steered_by_strength: dict[float,
str]) and Returns (list[dict] with keys "strength": float, "first_divergence":
int, "frac_changed": float) and note rows are sorted by ascending strength; for
selectivity, document Args (baseline: str, target_steered: str, control_steered:
dict[int, str]) and Returns (dict with keys "target_frac_changed",
"control_frac_changed", "mean_control_frac_changed", "selectivity_ratio") and
mention handling of empty controls (mean_control_frac_changed==0 and
selectivity_ratio uses denominator max(mean_c, 1e-9)). Ensure each docstring
uses Google-style sections (Args, Returns, and mention Raises if applicable) and
references the existing function names divergence, dose_response, and
selectivity so reviewers can locate the changes.

Source: Coding guidelines

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/scripts/steer_analysis.py`:
- Around line 54-68: The selectivity() function currently divides by a tiny
epsilon when control_steered is empty, producing a misleading huge
selectivity_ratio; modify selectivity() to detect when controls is empty (e.g.,
if not control_steered or if len(controls)==0) and set "selectivity_ratio" to
None (or another sentinel) instead of computing target / max(mean_c, 1e-9); keep
the other fields the same (you may also set "mean_control_frac_changed" to None
if you prefer), and update any type hints or callers of selectivity to accept a
nullable selectivity_ratio.
- Line 32: The top-level imports in steer_analysis.py currently have only one
blank line following them; update the module by inserting a second blank line
after the import block so there are exactly two blank lines before the first
subsequent statement (e.g., before any module-level constants, function or class
definitions) to satisfy the isort configuration.

In
`@bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/scripts/steer.py`:
- Line 38: The import block in steer.py currently has only one blank line after
the imports which violates the isort rule requiring two blank lines; update the
top-of-file imports section by adding one additional newline so there are two
blank lines separating the imports from the rest of the code (i.e., ensure the
imports block in steer.py is followed by two blank lines).
- Around line 77-78: The parsing of comma-separated inputs for controls and
strengths (variables controls and strengths derived from a.controls and
a.strengths) lacks error handling and will raise ValueError on malformed tokens;
wrap the list comprehensions in a try/except that catches ValueError, validate
each token with .strip() before converting (skip empty tokens), convert controls
with int() and strengths with float() inside the try block, verify the two lists
have the expected matching lengths and raise or exit with a clear error message
including the offending token and the original input (use the argparse object a
for context) so the script fails gracefully instead of crashing.
- Around line 71-76: The code can leave target as None when no labeled features
are in the top activations; after the loop that inspects vals, ids and sets
target from a.feature or eng.labels, add a validation: if target is still None,
either set a sensible fallback (e.g., target = int(ids[0]) to pick the
highest-activation feature) or raise a clear error asking the user to provide
--feature; ensure the check references target, a.feature, vals, ids and
eng.labels so downstream code that uses target as a dict key/for formatting
won't get a TypeError.

In
`@bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/tests/test_steer_analysis.py`:
- Line 21: The file test_steer_analysis.py has only one blank line after the
import block which violates the isort config requiring two blank lines; open
test_steer_analysis.py, locate the import section at the top of the module (the
import statements in the test_steer_analysis module), and add one additional
blank line so there are two consecutive blank lines immediately following the
imports before the first test or code block.

---

Nitpick comments:
In
`@bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/scripts/steer_analysis.py`:
- Around line 33-68: Update the short docstrings for divergence, dose_response,
and selectivity to full Google-style docstrings: for divergence, add an Args
section documenting a: str and b: str, a Returns section describing the tuple
(first_differing_index: int, fraction_differing: float) and explain edge cases
(identical prefix => index == shared prefix length, fraction computed over
min(len(a), len(b)) and uses max(1, n) to avoid division by zero); for
dose_response, document Args (baseline: str, steered_by_strength: dict[float,
str]) and Returns (list[dict] with keys "strength": float, "first_divergence":
int, "frac_changed": float) and note rows are sorted by ascending strength; for
selectivity, document Args (baseline: str, target_steered: str, control_steered:
dict[int, str]) and Returns (dict with keys "target_frac_changed",
"control_frac_changed", "mean_control_frac_changed", "selectivity_ratio") and
mention handling of empty controls (mean_control_frac_changed==0 and
selectivity_ratio uses denominator max(mean_c, 1e-9)). Ensure each docstring
uses Google-style sections (Args, Returns, and mention Raises if applicable) and
references the existing function names divergence, dose_response, and
selectivity so reviewers can locate the changes.

In
`@bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/tests/test_steer_analysis.py`:
- Around line 27-50: Add unit tests covering edge cases: call divergence with
empty strings and with one empty vs non-empty (e.g., divergence("", "") and
divergence("A", "")) and assert expected behavior; add a test for selectivity
that passes an empty control_steered dict to ensure the function handles/raises
as intended; and add a test for dose_response with an empty steered_by_strength
dict to verify it returns an empty/appropriate response. Place these new tests
in the existing test_steer_analysis.py alongside the other tests and reference
the same functions divergence, selectivity, and dose_response so edge behaviors
are validated.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: ea342482-73ae-48e5-b854-6dc5e460dff3

📥 Commits

Reviewing files that changed from the base of the PR and between f310289 and 46f5426.

📒 Files selected for processing (3)

bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/scripts/steer.py
bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/scripts/steer_analysis.py
bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/tests/test_steer_analysis.py

coderabbitai · 2026-06-12T04:34:37Z

+
+from __future__ import annotations
+
+


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add a second blank line after imports.

The isort configuration requires 2 blank lines after imports. Currently only 1 blank line is present.

📝 Proposed fix

from __future__ import annotations + def divergence(a: str, b: str) -> tuple[int, float]:

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

from __future__ import annotations

def divergence(a: str, b: str) -> tuple[int, float]:

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/scripts/steer_analysis.py` at line 32, The top-level imports in steer_analysis.py currently have only one blank line following them; update the module by inserting a second blank line after the import block so there are exactly two blank lines before the first subsequent statement (e.g., before any module-level constants, function or class definitions) to satisfy the isort configuration.

Source: Coding guidelines

coderabbitai · 2026-06-12T04:34:37Z

+def selectivity(baseline: str, target_steered: str, control_steered: dict[int, str]) -> dict:
+    """Target effect vs control features clamped to the same strength.
+
+    ``selectivity_ratio`` > 1 means the target feature moves generation more than the average
+    control — evidence the steering is feature-specific, not a generic "any clamp perturbs output".
+    """
+    target = divergence(baseline, target_steered)[1]
+    controls = {int(c): round(divergence(baseline, seq)[1], 4) for c, seq in control_steered.items()}
+    mean_c = sum(controls.values()) / len(controls) if controls else 0.0
+    return {
+        "target_frac_changed": round(target, 4),
+        "control_frac_changed": controls,
+        "mean_control_frac_changed": round(mean_c, 4),
+        "selectivity_ratio": round(target / max(mean_c, 1e-9), 2),
+    }


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Guard against misleading selectivity_ratio when controls are empty.

When control_steered is empty, Line 62 sets mean_c = 0.0, and Line 67 computes target / max(0.0, 1e-9) = target / 1e-9, which can produce an artificially large ratio (e.g., if target is 0.5, ratio becomes 50 million). This may mislead users into thinking the target is highly selective when in fact there were no controls to compare against.

🛡️ Proposed fix

def selectivity(baseline: str, target_steered: str, control_steered: dict[int, str]) -> dict: """Target effect vs control features clamped to the same strength. ``selectivity_ratio`` > 1 means the target feature moves generation more than the average control — evidence the steering is feature-specific, not a generic "any clamp perturbs output". """ + if not control_steered: + raise ValueError("control_steered must contain at least one control feature") target = divergence(baseline, target_steered)[1] controls = {int(c): round(divergence(baseline, seq)[1], 4) for c, seq in control_steered.items()} mean_c = sum(controls.values()) / len(controls) if controls else 0.0 return { "target_frac_changed": round(target, 4), "control_frac_changed": controls, "mean_control_frac_changed": round(mean_c, 4), "selectivity_ratio": round(target / max(mean_c, 1e-9), 2), }

Alternatively, return None or a sentinel value for selectivity_ratio when there are no controls.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/scripts/steer_analysis.py` around lines 54 - 68, The selectivity() function currently divides by a tiny epsilon when control_steered is empty, producing a misleading huge selectivity_ratio; modify selectivity() to detect when controls is empty (e.g., if not control_steered or if len(controls)==0) and set "selectivity_ratio" to None (or another sentinel) instead of computing target / max(mean_c, 1e-9); keep the other fields the same (you may also set "mean_control_frac_changed" to None if you prefer), and update any type hints or callers of selectivity to accept a nullable selectivity_ratio.

coderabbitai · 2026-06-12T04:34:37Z

+import sys
+from pathlib import Path
+
+


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add a second blank line after imports.

The isort configuration requires 2 blank lines after imports. Currently only 1 blank line is present.

📝 Proposed fix

_HERE = Path(__file__).resolve().parent sys.path.insert(0, str(_HERE)) sys.path.insert(0, str(_HERE.parent / "src")) # recipes/evo2/src -> evo2_sae package sys.path.insert(0, str(_HERE.parents[2] / "sae" / "src")) + from steer_analysis import dose_response, selectivity # noqa: E402

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

_HERE = Path(__file__).resolve().parent

sys.path.insert(0, str(_HERE))

sys.path.insert(0, str(_HERE.parent / "src")) # recipes/evo2/src -> evo2_sae package

sys.path.insert(0, str(_HERE.parents[2] / "sae" / "src"))

from steer_analysis import dose_response, selectivity # noqa: E402

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/scripts/steer.py` at line 38, The import block in steer.py currently has only one blank line after the imports which violates the isort rule requiring two blank lines; update the top-of-file imports section by adding one additional newline so there are two blank lines separating the imports from the rest of the code (i.e., ensure the imports block in steer.py is followed by two blank lines).

Source: Coding guidelines

coderabbitai · 2026-06-12T04:34:37Z

+    target = a.feature
+    for v, i in zip(vals.tolist(), ids.tolist()):
+        lab = eng.labels.get(int(i))
+        print(f"  feat {int(i):6d}  {str(lab):18s}  max_act {v:7.2f}")
+        if target is None and lab:
+            target = int(i)


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Validate that a target feature is selected.

If the sequence has no labeled features in its top 10 activations and the user did not specify --feature, Line 76 leaves target = None. This will cause a TypeError later when target is used as a dictionary key (Line 93) or in string formatting (Line 95).

🛡️ Proposed fix

for v, i in zip(vals.tolist(), ids.tolist()): lab = eng.labels.get(int(i)) print(f" feat {int(i):6d} {str(lab):18s} max_act {v:7.2f}") if target is None and lab: target = int(i) + if target is None: + print("Error: No labeled feature found in top activations. Specify --feature explicitly.", file=sys.stderr) + sys.exit(1) controls = [int(c) for c in a.controls.split(",") if c.strip()]

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/scripts/steer.py` around lines 71 - 76, The code can leave target as None when no labeled features are in the top activations; after the loop that inspects vals, ids and sets target from a.feature or eng.labels, add a validation: if target is still None, either set a sensible fallback (e.g., target = int(ids[0]) to pick the highest-activation feature) or raise a clear error asking the user to provide --feature; ensure the check references target, a.feature, vals, ids and eng.labels so downstream code that uses target as a dict key/for formatting won't get a TypeError.

coderabbitai · 2026-06-12T04:34:37Z

+    controls = [int(c) for c in a.controls.split(",") if c.strip()]
+    strengths = [float(s) for s in a.strengths.split(",")]


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add error handling for parsing controls and strengths.

Lines 77-78 parse comma-separated strings into integers and floats without error handling. If the user provides malformed input (e.g., --controls "12345,abc"), the script will crash with a ValueError.

🛡️ Proposed fix

- controls = [int(c) for c in a.controls.split(",") if c.strip()] - strengths = [float(s) for s in a.strengths.split(",")] + try: + controls = [int(c) for c in a.controls.split(",") if c.strip()] + except ValueError as e: + print(f"Error parsing --controls: {e}", file=sys.stderr) + sys.exit(1) + try: + strengths = [float(s) for s in a.strengths.split(",")] + except ValueError as e: + print(f"Error parsing --strengths: {e}", file=sys.stderr) + sys.exit(1)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/scripts/steer.py` around lines 77 - 78, The parsing of comma-separated inputs for controls and strengths (variables controls and strengths derived from a.controls and a.strengths) lacks error handling and will raise ValueError on malformed tokens; wrap the list comprehensions in a try/except that catches ValueError, validate each token with .strip() before converting (skip empty tokens), convert controls with int() and strengths with float() inside the try block, verify the two lists have the expected matching lengths and raise or exit with a clear error message including the offending token and the original input (use the argparse object a for context) so the script fails gracefully instead of crashing.

coderabbitai · 2026-06-12T04:34:37Z

+import sys
+from pathlib import Path
+
+


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add a second blank line after imports.

The isort configuration requires 2 blank lines after imports. Currently only 1 blank line is present.

📝 Proposed fix

from pathlib import Path + sys.path.insert(0, str(Path(__file__).resolve().parents[1] / "scripts"))

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

from pathlib import Path

sys.path.insert(0, str(Path(__file__).resolve().parents[1] / "scripts"))

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/tests/test_steer_analysis.py` at line 21, The file test_steer_analysis.py has only one blank line after the import block which violates the isort config requiring two blank lines; open test_steer_analysis.py, locate the import section at the top of the module (the import statements in the test_steer_analysis module), and add one additional blank line so there are two consecutive blank lines immediately following the imports before the first test or code block.

Source: Coding guidelines

…-commit CI) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Signed-off-by: Polina Binder <pbinder@nvidia.com>

Merge remote-tracking branch 'origin/pbinder/evo2-sae-serve' into pbi…

1dbd741

…nder/evo2-steering-analysis Signed-off-by: Polina Binder <pbinder@nvidia.com>

This was referenced Jun 12, 2026

evo2 SAE serve: FastAPI server + CLI (on the engine) #1637

Open

evo2 SAE recipe: live inference engine + steering server + CLI #1622

Open

Merge remote-tracking branch 'origin/pbinder/evo2-sae-serve' into pbi…

46f5426

…nder/evo2-steering-analysis Signed-off-by: Polina Binder <pbinder@nvidia.com>

coderabbitai Bot reviewed Jun 12, 2026

View reviewed changes

polinabinder1 marked this pull request as ready for review June 12, 2026 05:32

polinabinder1 requested review from jstjohn, jwilber, pstjohn, savitha-eng and trvachov as code owners June 12, 2026 05:32

polinabinder1 removed request for jstjohn, pstjohn and trvachov June 12, 2026 05:32

style(core): ruff-format blank line after in-function import (fix pre…

14a430a

…-commit CI) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Signed-off-by: Polina Binder <pbinder@nvidia.com>

polinabinder1 requested review from jstjohn and pstjohn June 12, 2026 05:56


	from __future__ import annotations


	def divergence(a: str, b: str) -> tuple[int, float]:

+_HERE = Path(__file__).resolve().parent
+sys.path.insert(0, str(_HERE))
+sys.path.insert(0, str(_HERE.parent / "src"))  # recipes/evo2/src -> evo2_sae package
+sys.path.insert(0, str(_HERE.parents[2] / "sae" / "src"))
+from steer_analysis import dose_response, selectivity  # noqa: E402

		controls = [int(c) for c in a.controls.split(",") if c.strip()]
		strengths = [float(s) for s in a.strengths.split(",")]


	from pathlib import Path


	sys.path.insert(0, str(Path(__file__).resolve().parents[1] / "scripts"))

Conversation

polinabinder1 commented Jun 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Contents

How to run

Summary by CodeRabbit

Release Notes

Expected output

Uh oh!

copy-pr-bot Bot commented Jun 12, 2026

Uh oh!

coderabbitai Bot commented Jun 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review skipped

Walkthrough

Changes

Estimated code review effort

Poem

Uh oh!

polinabinder1 commented Jun 12, 2026

Uh oh!

coderabbitai Bot commented Jun 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 12, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 12, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 12, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 12, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 12, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 12, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

polinabinder1 commented Jun 12, 2026 •

edited

Loading

coderabbitai Bot commented Jun 12, 2026 •

edited

Loading

coderabbitai Bot commented Jun 12, 2026 •

edited

Loading