fix(model_provider): add attention_backend and attention_softmax_in_fp32 to provider by lyzustc · Pull Request #2155 · THUDM/slime

lyzustc · 2026-06-30T17:22:39Z

Forward attention_backend and attention_softmax_in_fp32 to the bridge model provider

When --megatron-to-hf-mode bridge is used, the model provider is built by AutoBridge and only a subset of args is manually copied onto it. As a result, attention_backend and attention_softmax_in_fp32 were silently dropped: bridge mode ignored the user's choices and fell back to the bridge defaults. This diverges from the default (non-bridge) path, where both values are forwarded automatically via core_transformer_config_from_args. This change copies both values from args onto the provider alongside the existing parallelism overrides. The copy happens before provider.finalize(), so finalize-time invariants (e.g. apply_query_key_layer_scaling forcing attention_softmax_in_fp32=True) still hold. The fix is scoped entirely to the bridge branch and does not affect any other code path.
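The ordering described above can be illustrated with a minimal, self-contained sketch. It does not use the real AutoBridge provider or slime code: `StubProvider`, `apply_overrides`, the string-valued backend, and the default values are all hypothetical stand-ins chosen for illustration. Only the attribute names and the finalize-time invariant come from the description.

```python
from dataclasses import dataclass
from types import SimpleNamespace


@dataclass
class StubProvider:
    """Hypothetical stand-in for the bridge model provider (not the real class)."""

    tensor_model_parallel_size: int = 1
    attention_backend: str = "auto"           # assumed placeholder for the bridge default
    attention_softmax_in_fp32: bool = False   # assumed placeholder for the bridge default
    apply_query_key_layer_scaling: bool = False
    finalized: bool = False

    def finalize(self) -> None:
        # Models the invariant named in the description: query-key layer
        # scaling forces the softmax to run in fp32.
        if self.apply_query_key_layer_scaling:
            self.attention_softmax_in_fp32 = True
        self.finalized = True


def apply_overrides(provider: StubProvider, args: SimpleNamespace) -> StubProvider:
    """Copy user-selected values from args onto the provider, then finalize."""
    # Existing kind of override: parallelism settings.
    provider.tensor_model_parallel_size = args.tensor_model_parallel_size
    # The two values this change forwards, set before finalize() runs.
    provider.attention_backend = args.attention_backend
    provider.attention_softmax_in_fp32 = args.attention_softmax_in_fp32
    provider.finalize()
    return provider


# Hypothetical parsed CLI args; the values are illustrative only.
args = SimpleNamespace(
    tensor_model_parallel_size=2,
    attention_backend="flash",
    attention_softmax_in_fp32=False,
)
provider = apply_overrides(StubProvider(apply_query_key_layer_scaling=True), args)
```

In this sketch the user's backend choice survives, while `finalize()` still overrides `attention_softmax_in_fp32` to `True` because query-key layer scaling is enabled. Copying the values after `finalize()` instead would leave the softmax flag at the user's `False` and break that invariant.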


fix(model_provider): add attention_backend and attention_softmax_in_f…

c54d421

lyzustc wants to merge 1 commit into THUDM:main from lyzustc:main
