Pull requests: THUDM/slime
feat(update-weight): add mooncake transfer engine transport
#2159
opened Jul 1, 2026 by
Vincent-Bo-ali
(fix): handle none/dict rewards in _compute_zero_std_metrics
#2157
opened Jul 1, 2026 by
atoniolo76
fix(model_provider): add attention_backend and attention_softmax_in_fp32 to provider
#2155
opened Jun 30, 2026 by
lyzustc
Contributor
fix: support multi-head MTP weight mapping in MimoBridge (closes #2131)
#2154
opened Jun 30, 2026 by
botbikamordehai2-sketch
[docker] Upgrade to sglang v0.5.14
run-ci-image
#2149
opened Jun 29, 2026 by
zhuzilin
Contributor
docs: fix dead examples/README link to low_precision
#2142
opened Jun 29, 2026 by
aoshen02
Contributor
docs(examples): fix broken markdown links in rollout_buffer and examples
#2137
opened Jun 27, 2026 by
CalvinXKY
Contributor
docs(examples): list coding_agent_rl in examples/README
#2133
opened Jun 26, 2026 by
aoshen02
Contributor
fix(mtp): support multi-head MTP loss logging (mtp-num-layers > 1)
#2132
opened Jun 26, 2026 by
ZiyiTsang
Skip entropy gradient computation when entropy_coef == 0
#2130
opened Jun 25, 2026 by
CSUN1997
Support partial rollout resume in Search-R1 example
#2128
opened Jun 23, 2026 by
OLIVER-XYP
Reduce entropy logging memory when entropy coef is zero
#2127
opened Jun 23, 2026 by
none0663
Contributor
fix(data): prompt-length filtering crashes on VLM dataset with apply_chat_template
#2126
opened Jun 23, 2026 by
Meihan-chen
fix(partial-rollout): cap max_new_tokens by prior response length
#2122
opened Jun 23, 2026 by
none0663
Contributor
fix(retool): coerce list prompt to str in reward_func
#2120
opened Jun 23, 2026 by
mvanhorn
fix(delta-sync): surface failed engine apply results instead of silently discarding them
#2119
opened Jun 22, 2026 by
tanishkasinghhh
fix(rm_hub): grade the final ###Response segment in deepscaler reward
#2116
opened Jun 22, 2026 by
SuperMarioYL
fix(rm_hub): guard deepscaler reward against a missing response
#2115
opened Jun 21, 2026 by
vjsai