fix(model_provider): add attention_backend and attention_softmax_in_fp32 to provider#2155

Open
lyzustc wants to merge 1 commit into THUDM:main from lyzustc:main

lyzustc (Contributor) commented Jun 30, 2026

Forward attention_backend and attention_softmax_in_fp32 to the bridge model provider

When --megatron-to-hf-mode bridge is used, the model provider is built by AutoBridge and only a subset of args is manually copied onto it. As a result, attention_backend and attention_softmax_in_fp32 were silently dropped: bridge mode ignored the user's choices and fell back to the bridge defaults. This diverges from the default (non-bridge) path, where these values are forwarded automatically via core_transformer_config_from_args.

This change copies both values from args onto the provider alongside the existing parallelism overrides. The copy happens before provider.finalize(), so finalize-time invariants (e.g. apply_query_key_layer_scaling forcing attention_softmax_in_fp32=True) still hold. The fix is scoped entirely to the bridge branch and does not affect any other code path.
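The ordering described above can be illustrated with a minimal, self-contained sketch. This is not the code from this PR: `StubProvider`, `forward_attention_args`, and `FORWARDED_FIELDS` are hypothetical names, the provider is a stand-in for whatever AutoBridge returns, and `attention_backend` is modelled as a plain string for simplicity. The stub's `finalize()` only mimics the invariant mentioned in the description.

```python
from dataclasses import dataclass
from types import SimpleNamespace


@dataclass
class StubProvider:
    """Hypothetical stand-in for the provider built in bridge mode."""

    attention_backend: str = "auto"            # assumed default, for illustration
    attention_softmax_in_fp32: bool = False    # assumed default, for illustration
    apply_query_key_layer_scaling: bool = False
    finalized: bool = False

    def finalize(self) -> None:
        # Mimics the finalize-time invariant named in the description:
        # query/key layer scaling forces the softmax to run in fp32.
        if self.apply_query_key_layer_scaling:
            self.attention_softmax_in_fp32 = True
        self.finalized = True


# Fields this sketch forwards from args to the provider.
FORWARDED_FIELDS = ("attention_backend", "attention_softmax_in_fp32")


def forward_attention_args(args, provider):
    """Copy the attention settings from args onto provider, if present."""
    for name in FORWARDED_FIELDS:
        if hasattr(args, name):
            setattr(provider, name, getattr(args, name))
    return provider


# User-selected values, as they might arrive from the command line.
args = SimpleNamespace(attention_backend="flash", attention_softmax_in_fp32=True)

provider = forward_attention_args(args, StubProvider())
provider.finalize()  # finalize runs after the overrides, so invariants still apply
```

Copying before `finalize()` means a user who passes `attention_softmax_in_fp32=False` together with query/key layer scaling still ends up with the fp32 softmax, because the invariant is applied last.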
