Add: suppot trt-rtx-abi ep#1783
Conversation
Signed-off-by: haoxiz <haoxiz@nvidia.com>
📝 WalkthroughWalkthroughThe PR extends ONNX PTQ calibration to support a new ChangesNvTensorRtRtx-abi Execution Provider Support
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes Caution Pre-merge checks failedPlease resolve all errors before merging. Addressing warnings is optional.
❌ Failed checks (1 error, 1 warning, 1 inconclusive)
✅ Passed checks (3 passed)
✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
|
There was a problem hiding this comment.
Warning
CodeRabbit couldn't request changes on this pull request because it doesn't have sufficient GitHub permissions.
Please grant CodeRabbit Pull requests: Read and write permission and re-run the review.
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@modelopt/onnx/quantization/__main__.py`:
- Around line 119-120: The help text string for the calibration_eps parameter
contains a malformed token where `dml:x` is missing quotes while all other
device options like `'trt'`, `'cuda:x'`, `'cpu'`, `'NvTensorRtRtx'`, and
`'NvTensorRtRtx-abi'` are quoted. Add quotes around `dml:x` to make it `'dml:x'`
to maintain consistent formatting in the user-facing help message.
In `@modelopt/onnx/quantization/ort_utils.py`:
- Around line 355-357: Replace the substring matching condition in the elif
statement that checks for NvTensorRtRtx-abi with an exact equality comparison.
Change the condition from using the 'in' operator to using the equality operator
(==) to ensure that only the exact provider string "NvTensorRtRtx-abi" matches,
not variations or malformed values like "NvTensorRtRtx-abi:0". This ensures
consistency with how the ABI registration is triggered in the quantize()
function through exact membership checking.
- Around line 317-325: The _check_for_nv_tensorrt_rtx_abi_libs function accepts
a user-supplied ep_path parameter and loads it as a shared library via
ort.register_execution_provider_library without establishing a trust boundary.
Add either explicit trust validation (such as signature verification) or an
inline comment documenting why the library path is safe (for example, confirming
it is internally-generated and not directly user-supplied), then request
security team review as indicated in SECURITY.md for dynamic component loading
from user-provided artifact paths.
In `@modelopt/onnx/quantization/quantize.py`:
- Around line 495-497: The abi_ep_path parameter documentation references
NvTensorRtRtx-abi as a valid value in calibration_eps, but the calibration_eps
parameter documentation in the same docstring does not list NvTensorRtRtx-abi in
its allowed values. Update the calibration_eps documentation to include
NvTensorRtRtx-abi in the list of supported execution provider values to maintain
consistency with the abi_ep_path documentation.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: ebba326b-603d-4522-9b99-7b7ce1107279
📒 Files selected for processing (4)
examples/windows/onnx_ptq/genai_llm/quantize.pymodelopt/onnx/quantization/__main__.pymodelopt/onnx/quantization/ort_utils.pymodelopt/onnx/quantization/quantize.py
| "Any subset of ['trt', 'cuda:x', dml:x, 'cpu', 'NvTensorRtRtx', " | ||
| "'NvTensorRtRtx-abi'], where 'x' is the device id." |
There was a problem hiding this comment.
Fix malformed --calibration_eps help text token.
Line 119 shows dml:x without quotes while all other choices are quoted. This looks like a typo in the user-facing guidance.
Suggested fix
- "Any subset of ['trt', 'cuda:x', dml:x, 'cpu', 'NvTensorRtRtx', "
+ "Any subset of ['trt', 'cuda:x', 'dml:x', 'cpu', 'NvTensorRtRtx', "🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@modelopt/onnx/quantization/__main__.py` around lines 119 - 120, The help text
string for the calibration_eps parameter contains a malformed token where
`dml:x` is missing quotes while all other device options like `'trt'`,
`'cuda:x'`, `'cpu'`, `'NvTensorRtRtx'`, and `'NvTensorRtRtx-abi'` are quoted.
Add quotes around `dml:x` to make it `'dml:x'` to maintain consistent formatting
in the user-facing help message.
| def _check_for_nv_tensorrt_rtx_abi_libs(ep_path: str): | ||
| logger.info("Checking for NvTensorRtRtx ABI EP library") | ||
| if not ep_path: | ||
| raise FileNotFoundError("Need to provide abi_ep_path to use NvTensorRtRtx-abi") | ||
| if not os.path.isfile(ep_path): | ||
| raise FileNotFoundError(f"NvTensorRtRtx ABI EP library not found: {ep_path}") | ||
|
|
||
| ort.register_execution_provider_library("NvTensorRTRTXExecutionProvider", ep_path) | ||
| return True |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# First, let's examine the file and the exact code at the specified lines
cd modelopt && cat -n onnx/quantization/ort_utils.py | sed -n '310,330p'Repository: NVIDIA/Model-Optimizer
Length of output: 1205
🏁 Script executed:
# Check how ep_path is passed and where _check_for_nv_tensorrt_rtx_abi_libs is called
rg "_check_for_nv_tensorrt_rtx_abi_libs" -B 3 -A 3 modelopt/Repository: NVIDIA/Model-Optimizer
Length of output: 1109
🏁 Script executed:
# Check the function signatures that call this function to understand the trust boundary
rg "abi_ep_path" -B 5 -A 5 modelopt/ | head -100Repository: NVIDIA/Model-Optimizer
Length of output: 6303
🏁 Script executed:
# Check if there's any documentation or config validation for abi_ep_path
rg "register_execution_provider_library" -B 5 -A 5 modelopt/Repository: NVIDIA/Model-Optimizer
Length of output: 965
Add explicit trust boundary for ABI EP library loading.
Line 324 loads a user-supplied shared library path into the process via register_execution_provider_library. As documented in SECURITY.md, user-provided artifact paths (including the external abi_ep_path DLL) must be treated as untrusted. Security-sensitive dynamic component loading requires an explicit trust contract—either a trusted-source gate with signature verification or an inline comment documenting why the path is safe (e.g., confirming it is internally-generated and not user-supplied)—plus security codeowners review.
Currently the code only validates file existence, with no trust boundary or safety justification. This creates an RCE vector if a malicious DLL is placed at the user-supplied path.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@modelopt/onnx/quantization/ort_utils.py` around lines 317 - 325, The
_check_for_nv_tensorrt_rtx_abi_libs function accepts a user-supplied ep_path
parameter and loads it as a shared library via
ort.register_execution_provider_library without establishing a trust boundary.
Add either explicit trust validation (such as signature verification) or an
inline comment documenting why the library path is safe (for example, confirming
it is internally-generated and not directly user-supplied), then request
security team review as indicated in SECURITY.md for dynamic component loading
from user-provided artifact paths.
Source: Coding guidelines
| elif "NvTensorRtRtx-abi" in ep: | ||
| providers.append("NvTensorRTRTXExecutionProvider") | ||
| logger.debug("Added NvTensorRtRtx ABI EP") |
There was a problem hiding this comment.
Use exact matching for NvTensorRtRtx-abi EP selection.
Using substring matching here accepts malformed values (for example, NvTensorRtRtx-abi:0), but ABI registration is triggered via exact membership in quantize(). That creates a mismatch where provider selection can proceed without registration and fail later at runtime.
Suggested fix
- elif "NvTensorRtRtx-abi" in ep:
+ elif ep == "NvTensorRtRtx-abi":
providers.append("NvTensorRTRTXExecutionProvider")
logger.debug("Added NvTensorRtRtx ABI EP")
- elif "NvTensorRtRtx" in ep:
+ elif ep == "NvTensorRtRtx":📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| elif "NvTensorRtRtx-abi" in ep: | |
| providers.append("NvTensorRTRTXExecutionProvider") | |
| logger.debug("Added NvTensorRtRtx ABI EP") | |
| elif ep == "NvTensorRtRtx-abi": | |
| providers.append("NvTensorRTRTXExecutionProvider") | |
| logger.debug("Added NvTensorRtRtx ABI EP") |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@modelopt/onnx/quantization/ort_utils.py` around lines 355 - 357, Replace the
substring matching condition in the elif statement that checks for
NvTensorRtRtx-abi with an exact equality comparison. Change the condition from
using the 'in' operator to using the equality operator (==) to ensure that only
the exact provider string "NvTensorRtRtx-abi" matches, not variations or
malformed values like "NvTensorRtRtx-abi:0". This ensures consistency with how
the ABI registration is triggered in the quantize() function through exact
membership checking.
| abi_ep_path: | ||
| Path to an external NvTensorRtRtx ABI execution-provider library. Required when | ||
| ``NvTensorRtRtx-abi`` is present in ``calibration_eps``. |
There was a problem hiding this comment.
Keep calibration_eps documentation aligned with the new ABI EP option.
The new abi_ep_path docs are added, but the allowed calibration_eps list in this same docstring still omits NvTensorRtRtx-abi. That makes the public API docs internally inconsistent.
As per coding guidelines, public APIs should be clearly documented and kept accurate.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@modelopt/onnx/quantization/quantize.py` around lines 495 - 497, The
abi_ep_path parameter documentation references NvTensorRtRtx-abi as a valid
value in calibration_eps, but the calibration_eps parameter documentation in the
same docstring does not list NvTensorRtRtx-abi in its allowed values. Update the
calibration_eps documentation to include NvTensorRtRtx-abi in the list of
supported execution provider values to maintain consistency with the abi_ep_path
documentation.
Source: Coding guidelines
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1783 +/- ##
==========================================
- Coverage 77.09% 75.71% -1.39%
==========================================
Files 511 511
Lines 56168 58257 +2089
==========================================
+ Hits 43302 44108 +806
- Misses 12866 14149 +1283
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
What does this PR do?
Add support for onnx quantization to support trt-rtx-abi ep
Usage
Testing
Tested on 4 popular llm models on all popular quantization method(int4, fp8, int8)
Before your PR is "Ready for review"
Summary by CodeRabbit
NvTensorRtRtx-abiexecution provider in ONNX quantization workflows--abi_ep_pathcommand-line argument to specify the ABI execution provider library pathNvTensorRtRtxandNvTensorRtRtx-abiexecution provider variants