Create adding_new_model_tutorial.md#1784
Conversation
Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
…re always defined Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
…l and mip commands Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
…ection in mip_progress.py Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
…t masked by grep exit code Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
📝 WalkthroughWalkthroughAdds Qwen3.5 (0.8B and 2B) as supported Puzzletron models via a new converter, model descriptor, layer descriptor dataclasses, and complete Hydra config trees. Simultaneously introduces an experimental ChangesQwen3.5 model support
Puzzletron agent skill
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related issues
Possibly related PRs
Suggested reviewers
Caution Pre-merge checks failedPlease resolve all errors before merging. Addressing warnings is optional.
❌ Failed checks (1 error, 1 warning, 1 inconclusive)
✅ Passed checks (3 passed)
✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Warning
CodeRabbit couldn't request changes on this pull request because it doesn't have sufficient GitHub permissions.
Please grant CodeRabbit Pull requests: Read and write permission and re-run the review.
Actionable comments posted: 2
🧹 Nitpick comments (1)
modelopt/torch/puzzletron/anymodel/models/qwen3_5/__init__.py (1)
16-17: 🧹 Nitpick | 🔵 Trivial | ⚡ Quick winAdd an explicit
__all__in this package__init__.py.This new
__init__.pyre-exports symbols but does not define its public surface explicitly.Suggested patch
+from .qwen3_5_converter import __all__ as _qwen3_5_converter_all from .qwen3_5_converter import * +from .qwen3_5_model_descriptor import __all__ as _qwen3_5_model_descriptor_all from .qwen3_5_model_descriptor import * + +__all__ = [*_qwen3_5_converter_all, *_qwen3_5_model_descriptor_all]As per coding guidelines: “Define the public API with
__all__at the top of each module and re-export viafrom .module import *in package__init__.pyfiles.”🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@modelopt/torch/puzzletron/anymodel/models/qwen3_5/__init__.py` around lines 16 - 17, The package __init__.py file uses wildcard imports from qwen3_5_converter and qwen3_5_model_descriptor modules but does not explicitly define the public API surface. Add an explicit __all__ list to this __init__.py file after the import statements that specifies which symbols from those modules should be publicly exported by the package. This __all__ list should contain the names of the symbols that are intended to be part of the public API for users importing from this package.Source: Coding guidelines
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.agents/skills/puzzletron/SKILL.md:
- Line 130: The AutoConfig.from_pretrained calls have hardcoded
trust_remote_code=True which is a security vulnerability when handling untrusted
model sources. Change both occurrences of AutoConfig.from_pretrained (at the
initial config loading and the retry logic) to default to
trust_remote_code=False. Wrap the AutoConfig.from_pretrained calls in try-except
blocks to catch custom modeling code errors, and when such an error occurs,
prompt the user to confirm they trust the model source before retrying with
trust_remote_code=True if they explicitly approve.
In
`@examples/puzzletron/configs/qwen3_5-0.8b_pruneffn_memory/pruning/ffn_pruning.yaml`:
- Around line 18-20: The intermediate_size_list in the ffn_pruning.yaml config
contains values (5120) that exceed the 0.8B teacher model's intermediate size of
3584, which causes the pruning strategy to allow expansion instead of
compression. Update the intermediate_size_list to use only values that are
smaller than the teacher size of 3584 to ensure proper FFN pruning behavior for
the 0.8B model, and remove or replace any values that are equal to or larger
than the teacher intermediate size.
---
Nitpick comments:
In `@modelopt/torch/puzzletron/anymodel/models/qwen3_5/__init__.py`:
- Around line 16-17: The package __init__.py file uses wildcard imports from
qwen3_5_converter and qwen3_5_model_descriptor modules but does not explicitly
define the public API surface. Add an explicit __all__ list to this __init__.py
file after the import statements that specifies which symbols from those modules
should be publicly exported by the package. This __all__ list should contain the
names of the symbols that are intended to be part of the public API for users
importing from this package.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: b1c0e3bb-f37c-4a69-b9a2-40102d8d5134
📒 Files selected for processing (32)
.agents/skills/puzzletron/README.md.agents/skills/puzzletron/SKILL.md.agents/skills/puzzletron/adding_new_model_tutorial.md.agents/skills/puzzletron/all_progress.py.agents/skills/puzzletron/mip_losses.py.agents/skills/puzzletron/mip_progress.py.agents/skills/puzzletron/mip_sweep.py.claude/skills/puzzletronCHANGELOG.rstexamples/puzzletron/README.mdexamples/puzzletron/configs/llama-3_1-8B_pruneffn_memory/llama-3_1-8B_pruneffn_memory.yamlexamples/puzzletron/configs/llama-3_1-8B_pruneffn_memory/validate_model_defaults.yamlexamples/puzzletron/configs/qwen3_5-0.8b_pruneffn_memory/pruning/attn_pruning.yamlexamples/puzzletron/configs/qwen3_5-0.8b_pruneffn_memory/pruning/ffn_pruning.yamlexamples/puzzletron/configs/qwen3_5-0.8b_pruneffn_memory/pruning/hidden_dim_pruning.yamlexamples/puzzletron/configs/qwen3_5-0.8b_pruneffn_memory/pruning/pruning_defaults.yamlexamples/puzzletron/configs/qwen3_5-0.8b_pruneffn_memory/qwen3_5-0.8b_pruneffn_memory.yamlexamples/puzzletron/configs/qwen3_5-0.8b_pruneffn_memory/qwen3_5.yamlexamples/puzzletron/configs/qwen3_5-0.8b_pruneffn_memory/validate_model_defaults.yamlexamples/puzzletron/configs/qwen3_5-0.8b_pruneffn_memory/validate_solutions_defaults.yamlexamples/puzzletron/configs/qwen3_5-2B_pruneffn_memory/pruning/attn_pruning.yamlexamples/puzzletron/configs/qwen3_5-2B_pruneffn_memory/pruning/ffn_pruning.yamlexamples/puzzletron/configs/qwen3_5-2B_pruneffn_memory/pruning/hidden_dim_pruning.yamlexamples/puzzletron/configs/qwen3_5-2B_pruneffn_memory/pruning/pruning_defaults.yamlexamples/puzzletron/configs/qwen3_5-2B_pruneffn_memory/qwen3_5-2B_pruneffn_memory.yamlexamples/puzzletron/configs/qwen3_5-2B_pruneffn_memory/qwen3_5.yamlexamples/puzzletron/configs/qwen3_5-2B_pruneffn_memory/validate_model_defaults.yamlexamples/puzzletron/configs/qwen3_5-2B_pruneffn_memory/validate_solutions_defaults.yamlmodelopt/torch/puzzletron/anymodel/models/__init__.pymodelopt/torch/puzzletron/anymodel/models/qwen3_5/__init__.pymodelopt/torch/puzzletron/anymodel/models/qwen3_5/qwen3_5_converter.pymodelopt/torch/puzzletron/anymodel/models/qwen3_5/qwen3_5_model_descriptor.py
| import sys; sys.path.insert(0, '.') | ||
| from modelopt.torch.puzzletron.anymodel.model_descriptor import ModelDescriptorFactory | ||
| from transformers import AutoConfig | ||
| cfg = AutoConfig.from_pretrained('<hf_model_path>', trust_remote_code=True) |
There was a problem hiding this comment.
CRITICAL: Remove hardcoded trust_remote_code=True from AutoConfig loading.
Lines 130 and 148 hardcode trust_remote_code=True when loading the model config. Per SECURITY.md, this is an RCE vector if the model source is untrusted or user-supplied. The flag should default to False and let the user explicitly opt in if needed.
Recommended fix: Default to trust_remote_code=False. If the user encounters a "custom modeling code" error, explain the situation and ask if they trust the model source before retrying with trust_remote_code=True.
# Step 1 (line 126-134): First attempt with default safe setting
cfg = AutoConfig.from_pretrained('<hf_model_path>', trust_remote_code=False)
# If that fails with a custom modeling error, ask the user to confirm trust
# Step 2 (line 146-154): Same approachAlso applies to: 148-148
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In @.agents/skills/puzzletron/SKILL.md at line 130, The
AutoConfig.from_pretrained calls have hardcoded trust_remote_code=True which is
a security vulnerability when handling untrusted model sources. Change both
occurrences of AutoConfig.from_pretrained (at the initial config loading and the
retry logic) to default to trust_remote_code=False. Wrap the
AutoConfig.from_pretrained calls in try-except blocks to catch custom modeling
code errors, and when such an error occurs, prompt the user to confirm they
trust the model source before retrying with trust_remote_code=True if they
explicitly approve.
Source: Coding guidelines
| # teacher_intermediate_size is 6144 | ||
| intermediate_size_list: [1280, 2560, 3584, 5120] | ||
| mlp_init_mode: "PruneByActivationsLog" |
There was a problem hiding this comment.
Align 0.8B FFN candidates with the same tree’s teacher size.
Line 18/Line 19 use 2B-scale defaults (6144, 5120) that conflict with the 0.8B teacher size documented in qwen3_5-0.8b_pruneffn_memory.yaml Line 25 (3584). This preset becomes unsafe when used directly because it allows expansion instead of pruning.
Suggested fix
-# teacher_intermediate_size is 6144
-intermediate_size_list: [1280, 2560, 3584, 5120]
+# teacher_intermediate_size is 3584
+intermediate_size_list: [768, 1536, 2048, 3072]📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| # teacher_intermediate_size is 6144 | |
| intermediate_size_list: [1280, 2560, 3584, 5120] | |
| mlp_init_mode: "PruneByActivationsLog" | |
| # teacher_intermediate_size is 3584 | |
| intermediate_size_list: [768, 1536, 2048, 3072] | |
| mlp_init_mode: "PruneByActivationsLog" |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In
`@examples/puzzletron/configs/qwen3_5-0.8b_pruneffn_memory/pruning/ffn_pruning.yaml`
around lines 18 - 20, The intermediate_size_list in the ffn_pruning.yaml config
contains values (5120) that exceed the 0.8B teacher model's intermediate size of
3584, which causes the pruning strategy to allow expansion instead of
compression. Update the intermediate_size_list to use only values that are
smaller than the teacher size of 3584 to ensure proper FFN pruning behavior for
the 0.8B model, and remove or replace any values that are equal to or larger
than the teacher intermediate size.
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1784 +/- ##
===========================================
+ Coverage 58.45% 76.08% +17.62%
===========================================
Files 510 514 +4
Lines 56271 56911 +640
===========================================
+ Hits 32896 43300 +10404
+ Misses 23375 13611 -9764
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
what's different between 0.8B and 2B yamls? Mostly looks same. Can we have just one and mention a readme in this directory that mentions what to change if users want to try pruning a larger qwen3.5 variant? We dont want to maintain pruning yamls for all qwen3.5 variants if they only different in size and everything else is same
What does this PR do?
BEFORE REVIEWING IT:
Type of change: new feature.
#1780
Usage
see: .agents/skills/puzzletron/adding_new_model_tutorial.md
Testing
Tested manually. Go through this tutorial before accepting this MR.
Before your PR is "Ready for review"
Summary by CodeRabbit
Release Notes
New Features
/puzzletroncommand for AI code agents enabling model pruning and compression pipelines with support for Qwen3.5 (0.8B, 2B) and Llama models.Documentation