
Add: suppot trt-rtx-abi ep#1783

Open
haoxiz-nvidia wants to merge 1 commit into main from haoxiz/onnx-ptq-trt-abi

Conversation

@haoxiz-nvidia haoxiz-nvidia commented Jun 22, 2026

Contributor

What does this PR do?

Add support in ONNX quantization for the trt-rtx-abi execution provider (EP).

Usage

python -m modelopt.onnx.quantization --onnx_path="path\to\model.onnx" --quantize_mode=int8 --output_path="path\to\output\model.onnx" --calibration_eps=NvTensorRtRtx-abi --use_external_data_format --high_precision_dtype=fp32 --abi_ep_path="path\to\trt-rtx-abi-dll"

Testing

Tested on 4 popular LLM models with all popular quantization methods (int4, fp8, int8).

Before your PR is "Ready for review"

  • Is this change backward compatible?: ✅
  • Did you write any new necessary tests?: ❌
  • Did you update Changelog?: ❌
  • Did you get Claude approval on this PR?: N/A

Summary by CodeRabbit

  • New Features
    • Added support for NvTensorRtRtx-abi execution provider in ONNX quantization workflows
    • Introduced --abi_ep_path command-line argument to specify the ABI execution provider library path
    • Extended input shape profile generation to support both NvTensorRtRtx and NvTensorRtRtx-abi execution provider variants
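
The relationship between the two flags above can be sketched in isolation. This is a minimal illustration, not the parser from `__main__.py`: `build_parser` and `parse_args` are hypothetical names, the `nargs="+"` form and the defaults are assumptions, and only the flag names and the EP string come from this PR.

```python
import argparse

ABI_EP = "NvTensorRtRtx-abi"  # EP name taken from the PR description


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring only the two flags discussed in this PR."""
    parser = argparse.ArgumentParser(prog="quantize-sketch")
    # Assumption: EPs are passed as one or more space-separated values.
    parser.add_argument("--calibration_eps", nargs="+", default=["cpu"])
    parser.add_argument("--abi_ep_path", type=str, default=None)
    return parser


def parse_args(argv: list[str]) -> argparse.Namespace:
    parser = build_parser()
    args = parser.parse_args(argv)
    # Fail early: the ABI EP cannot be registered without a library path.
    if ABI_EP in args.calibration_eps and not args.abi_ep_path:
        parser.error(f"--abi_ep_path is required when {ABI_EP!r} is in --calibration_eps")
    return args
```

Rejecting the missing path at argument-parsing time surfaces the problem before any model is loaded; whether the PR validates at this point or later inside `quantize()` is not stated in the description.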

Signed-off-by: haoxiz <haoxiz@nvidia.com>
@haoxiz-nvidia haoxiz-nvidia self-assigned this Jun 22, 2026
@haoxiz-nvidia haoxiz-nvidia requested review from a team as code owners June 22, 2026 04:48
@coderabbitai

coderabbitai Bot commented Jun 22, 2026

Contributor

Review Change Stack

📝 Walkthrough

The PR extends ONNX PTQ calibration to support a new NvTensorRtRtx-abi execution provider. A new register_abi_ep function validates and registers an external shared library via ONNX Runtime's register_execution_provider_library. The quantize() API and both CLI entrypoints (__main__.py and the Windows example script) are updated to accept and thread through an --abi_ep_path argument.
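
As a rough illustration of the validate-then-register flow described in this walkthrough, the sketch below separates path validation from the ONNX Runtime call. It is not the PR's `register_abi_ep`: the function name `register_abi_ep_sketch` and the injectable `register` parameter are assumptions added so the validation can be exercised without ONNX Runtime installed. The registration name and the `ort.register_execution_provider_library` call are the ones quoted in the review below.

```python
import os
from typing import Callable, Optional

# Registration name as it appears in the diff under review.
EP_REGISTRATION_NAME = "NvTensorRTRTXExecutionProvider"


def register_abi_ep_sketch(
    ep_path: Optional[str],
    register: Optional[Callable[[str, str], None]] = None,
) -> None:
    """Validate ep_path, then hand it to a registration callable."""
    if not ep_path:
        raise ValueError("abi_ep_path is required to use NvTensorRtRtx-abi")
    if not os.path.isfile(ep_path):
        raise FileNotFoundError(f"ABI EP library not found: {ep_path}")
    if register is None:
        # Imported lazily so the validation logic can be tested without ONNX Runtime.
        import onnxruntime as ort

        register = ort.register_execution_provider_library
    register(EP_REGISTRATION_NAME, ep_path)
```

Passing a stub for `register` makes the missing-path and missing-file branches unit-testable, which may help with the coverage gap reported further down.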

Changes

NvTensorRtRtx-abi Execution Provider Support

Layer / File(s) Summary
ABI EP registration and provider list selection
modelopt/onnx/quantization/ort_utils.py, modelopt/onnx/quantization/quantize.py
Adds _check_for_nv_tensorrt_rtx_abi_libs and register_abi_ep to validate and register the ABI EP shared library. Extends _prepare_ep_list with an elif branch for NvTensorRtRtx-abi. Updates quantize() imports, signature (`abi_ep_path: str
CLI and example script wiring
modelopt/onnx/quantization/__main__.py, examples/windows/onnx_ptq/genai_llm/quantize.py
Adds --abi_ep_path argument and updates --calibration_eps help text in __main__.py, forwarding the value into quantize(). The Windows example script receives the same treatment: imports register_abi_ep, broadens parse_calibration_eps and input-shape-profile conditions to include NvTensorRtRtx-abi, adds ABI EP registration in main(), and adds the --abi_ep_path CLI argument.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes


Caution

Pre-merge checks failed

Please resolve all errors before merging. Addressing warnings is optional.

  • Ignore

❌ Failed checks (1 error, 1 warning, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Security Anti-Patterns | ❌ Error | The code loads user-supplied abi_ep_path via ort.register_execution_provider_library() at line 324 of ort_utils.py without an inline comment explaining why the file is safe. Per SECURITY.md, se... | Add inline comment to line 324 explaining why it's safe to load the user-supplied DLL path, or request @NVIDIA/modelopt-setup-codeowners review with explicit justification in the PR description. |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 60.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Title check | ❓ Inconclusive | The title contains a typo ('suppot' instead of 'support') and is extremely vague; it doesn't clearly convey that this PR adds TensorRT RTX ABI execution provider support for ONNX quantization. | Correct the typo and provide a more specific title like 'Add TensorRT RTX ABI execution provider support for ONNX quantization' to clearly communicate the main change. |
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
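
For readers unfamiliar with the Docstring Coverage check, the metric is a ratio of documented definitions to all definitions. The sketch below is one plausible way to compute it with the standard `ast` module; how CodeRabbit actually counts (for example, whether modules or private functions are included) is not stated here, so `docstring_coverage` is an assumption for illustration only.

```python
import ast


def docstring_coverage(source: str) -> float:
    """Percentage of functions and classes in `source` that have a docstring."""
    tree = ast.parse(source)
    nodes = [
        node
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
    ]
    if not nodes:
        return 100.0  # nothing to document
    documented = sum(1 for node in nodes if ast.get_docstring(node) is not None)
    return 100.0 * documented / len(nodes)


sample = '''
def a():
    """Documented."""

def b():
    pass
'''
print(f"{docstring_coverage(sample):.2f}%")  # 50.00%
```

Under this counting rule, adding a docstring to `b` would raise the sample to 100%.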

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions

Contributor
PR Preview Action v1.8.1

🚀 View preview at
https://NVIDIA.github.io/Model-Optimizer/pr-preview/pr-1783/

Built to branch gh-pages at 2026-06-22 04:51 UTC.
Preview will be ready when the GitHub Pages deployment is complete.

@coderabbitai coderabbitai Bot left a comment

Contributor

Warning

CodeRabbit couldn't request changes on this pull request because it doesn't have sufficient GitHub permissions.

Please grant CodeRabbit Pull requests: Read and write permission and re-run the review.

👉 Steps to fix this

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@modelopt/onnx/quantization/__main__.py`:
- Around line 119-120: The help text string for the calibration_eps parameter
contains a malformed token where `dml:x` is missing quotes while all other
device options like `'trt'`, `'cuda:x'`, `'cpu'`, `'NvTensorRtRtx'`, and
`'NvTensorRtRtx-abi'` are quoted. Add quotes around `dml:x` to make it `'dml:x'`
to maintain consistent formatting in the user-facing help message.

In `@modelopt/onnx/quantization/ort_utils.py`:
- Around line 355-357: Replace the substring matching condition in the elif
statement that checks for NvTensorRtRtx-abi with an exact equality comparison.
Change the condition from using the 'in' operator to using the equality operator
(==) to ensure that only the exact provider string "NvTensorRtRtx-abi" matches,
not variations or malformed values like "NvTensorRtRtx-abi:0". This ensures
consistency with how the ABI registration is triggered in the quantize()
function through exact membership checking.
- Around line 317-325: The _check_for_nv_tensorrt_rtx_abi_libs function accepts
a user-supplied ep_path parameter and loads it as a shared library via
ort.register_execution_provider_library without establishing a trust boundary.
Add either explicit trust validation (such as signature verification) or an
inline comment documenting why the library path is safe (for example, confirming
it is internally-generated and not directly user-supplied), then request
security team review as indicated in SECURITY.md for dynamic component loading
from user-provided artifact paths.

In `@modelopt/onnx/quantization/quantize.py`:
- Around line 495-497: The abi_ep_path parameter documentation references
NvTensorRtRtx-abi as a valid value in calibration_eps, but the calibration_eps
parameter documentation in the same docstring does not list NvTensorRtRtx-abi in
its allowed values. Update the calibration_eps documentation to include
NvTensorRtRtx-abi in the list of supported execution provider values to maintain
consistency with the abi_ep_path documentation.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: ebba326b-603d-4522-9b99-7b7ce1107279

📥 Commits

Reviewing files that changed from the base of the PR and between 9048d13 and c5208da.

📒 Files selected for processing (4)
  • examples/windows/onnx_ptq/genai_llm/quantize.py
  • modelopt/onnx/quantization/__main__.py
  • modelopt/onnx/quantization/ort_utils.py
  • modelopt/onnx/quantization/quantize.py

Comment on lines +119 to +120
"Any subset of ['trt', 'cuda:x', dml:x, 'cpu', 'NvTensorRtRtx', "
"'NvTensorRtRtx-abi'], where 'x' is the device id."

Contributor

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Fix malformed --calibration_eps help text token.

Line 119 shows dml:x without quotes while all other choices are quoted. This looks like a typo in the user-facing guidance.

Suggested fix
-            "Any subset of ['trt', 'cuda:x', dml:x, 'cpu', 'NvTensorRtRtx', "
+            "Any subset of ['trt', 'cuda:x', 'dml:x', 'cpu', 'NvTensorRtRtx', "

Comment on lines +317 to +325
def _check_for_nv_tensorrt_rtx_abi_libs(ep_path: str):
    logger.info("Checking for NvTensorRtRtx ABI EP library")
    if not ep_path:
        raise FileNotFoundError("Need to provide abi_ep_path to use NvTensorRtRtx-abi")
    if not os.path.isfile(ep_path):
        raise FileNotFoundError(f"NvTensorRtRtx ABI EP library not found: {ep_path}")

    ort.register_execution_provider_library("NvTensorRTRTXExecutionProvider", ep_path)
    return True

Contributor

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's examine the file and the exact code at the specified lines
cd modelopt && cat -n onnx/quantization/ort_utils.py | sed -n '310,330p'

Repository: NVIDIA/Model-Optimizer

Length of output: 1205


🏁 Script executed:

# Check how ep_path is passed and where _check_for_nv_tensorrt_rtx_abi_libs is called
rg "_check_for_nv_tensorrt_rtx_abi_libs" -B 3 -A 3 modelopt/

Repository: NVIDIA/Model-Optimizer

Length of output: 1109


🏁 Script executed:

# Check the function signatures that call this function to understand the trust boundary
rg "abi_ep_path" -B 5 -A 5 modelopt/ | head -100

Repository: NVIDIA/Model-Optimizer

Length of output: 6303


🏁 Script executed:

# Check if there's any documentation or config validation for abi_ep_path
rg "register_execution_provider_library" -B 5 -A 5 modelopt/

Repository: NVIDIA/Model-Optimizer

Length of output: 965


Add explicit trust boundary for ABI EP library loading.

Line 324 loads a user-supplied shared library path into the process via register_execution_provider_library. As documented in SECURITY.md, user-provided artifact paths (including the external abi_ep_path DLL) must be treated as untrusted. Security-sensitive dynamic component loading requires an explicit trust contract—either a trusted-source gate with signature verification or an inline comment documenting why the path is safe (e.g., confirming it is internally-generated and not user-supplied)—plus security codeowners review.

Currently the code only validates file existence, with no trust boundary or safety justification. This creates an RCE vector if a malicious DLL is placed at the user-supplied path.
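
One possible shape for such a trust gate is sketched below. It uses a pinned SHA-256 digest rather than the signature verification mentioned above, because a digest check needs only the standard library; it is a weaker, simpler substitute and not what SECURITY.md necessarily requires. The names `sha256_of_file` and `verify_library_digest`, and the idea that the caller supplies an expected digest, are assumptions for illustration.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so large libraries are not read into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_library_digest(ep_path: str, expected_sha256: str) -> Path:
    """Return the resolved path only if the file matches the pinned digest."""
    path = Path(ep_path).resolve(strict=True)  # raises FileNotFoundError if missing
    if not path.is_file():
        raise FileNotFoundError(f"Not a regular file: {path}")
    actual = sha256_of_file(path)
    if not hmac.compare_digest(actual, expected_sha256.lower()):
        raise PermissionError(f"Digest mismatch for {path}; refusing to load")
    return path
```

A caller would run this check and pass the returned path to the registration call. The file could still be swapped between the check and the load, so this narrows the exposure rather than closing it.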


Source: Coding guidelines

Comment on lines +355 to +357
elif "NvTensorRtRtx-abi" in ep:
    providers.append("NvTensorRTRTXExecutionProvider")
    logger.debug("Added NvTensorRtRtx ABI EP")

Contributor

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Use exact matching for NvTensorRtRtx-abi EP selection.

Using substring matching here accepts malformed values (for example, NvTensorRtRtx-abi:0), but ABI registration is triggered via exact membership in quantize(). That creates a mismatch where provider selection can proceed without registration and fail later at runtime.

Suggested fix
-        elif "NvTensorRtRtx-abi" in ep:
+        elif ep == "NvTensorRtRtx-abi":
             providers.append("NvTensorRTRTXExecutionProvider")
             logger.debug("Added NvTensorRtRtx ABI EP")
-        elif "NvTensorRtRtx" in ep:
+        elif ep == "NvTensorRtRtx":
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-        elif "NvTensorRtRtx-abi" in ep:
-            providers.append("NvTensorRTRTXExecutionProvider")
-            logger.debug("Added NvTensorRtRtx ABI EP")
+        elif ep == "NvTensorRtRtx-abi":
+            providers.append("NvTensorRTRTXExecutionProvider")
+            logger.debug("Added NvTensorRtRtx ABI EP")
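
The mismatch this comment describes can be reproduced in isolation. The two helper functions below are hypothetical stand-ins: one mirrors the substring check quoted from `_prepare_ep_list`, the other mirrors the exact membership check that the review attributes to `quantize()`.

```python
REGISTERED_NAME = "NvTensorRtRtx-abi"


def selected_by_substring(ep: str) -> bool:
    # Mirrors the `in` check quoted from _prepare_ep_list.
    return REGISTERED_NAME in ep


def triggers_registration(calibration_eps: list[str]) -> bool:
    # Mirrors the exact membership check the review attributes to quantize().
    return REGISTERED_NAME in calibration_eps


eps = ["NvTensorRtRtx-abi:0"]  # malformed value from the review comment
print(selected_by_substring(eps[0]))  # True: the provider would be added
print(triggers_registration(eps))  # False: the library is never registered
print("NvTensorRtRtx" in "NvTensorRtRtx-abi")  # True: branch order matters with `in`
```

The last line shows why the ABI branch has to come before the plain `NvTensorRtRtx` branch while substring matching is in use; exact comparison removes that ordering dependency.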

Comment on lines +495 to +497
abi_ep_path:
Path to an external NvTensorRtRtx ABI execution-provider library. Required when
``NvTensorRtRtx-abi`` is present in ``calibration_eps``.

Contributor

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Keep calibration_eps documentation aligned with the new ABI EP option.

The new abi_ep_path docs are added, but the allowed calibration_eps list in this same docstring still omits NvTensorRtRtx-abi. That makes the public API docs internally inconsistent.

As per coding guidelines, public APIs should be clearly documented and kept accurate.


Source: Coding guidelines

@codecov

codecov Bot commented Jun 22, 2026


Codecov Report

❌ Patch coverage is 33.33333% with 12 lines in your changes missing coverage. Please review.
✅ Project coverage is 75.71%. Comparing base (cfc823d) to head (c5208da).
⚠️ Report is 45 commits behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| modelopt/onnx/quantization/ort_utils.py | 21.42% | 11 Missing ⚠️ |
| modelopt/onnx/quantization/quantize.py | 66.66% | 1 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1783      +/-   ##
==========================================
- Coverage   77.09%   75.71%   -1.39%     
==========================================
  Files         511      511              
  Lines       56168    58257    +2089     
==========================================
+ Hits        43302    44108     +806     
- Misses      12866    14149    +1283     
| Flag | Coverage Δ |
| --- | --- |
| examples | 41.82% <16.66%> (-0.14%) ⬇️ |
| gpu | 57.68% <27.77%> (-0.63%) ⬇️ |
| unit | 54.42% <33.33%> (+0.08%) ⬆️ |



1 participant