Pull requests: InternLM/lmdeploy

Pull requests list

feat: configure cudagraph capture batch sizes
#4573 opened May 8, 2026 by CUHKSZzxy Collaborator Draft
[WIP]: Support mtp fp8
#4572 opened May 8, 2026 by RunningLeon Collaborator
Fix health latency under concurrent VL request preparation (label: Bug:P0)
#4570 opened May 7, 2026 by CUHKSZzxy Collaborator
LLM evaluation skill on text datasets
#4566 opened Apr 30, 2026 by lvhan028 Collaborator
Fix the reprefill of evicted seqs with invalid draft tokens
#4564 opened Apr 29, 2026 by RunningLeon Collaborator
FP8 kv cache quantization
#4563 opened Apr 29, 2026 by CUHKSZzxy Collaborator Draft
Add Qwen3.5 Moe lite awq
#4561 opened Apr 28, 2026 by 43758726 Collaborator
[Feature] Add guided decoding support for speculative decoding (label: enhancement)
#4559 opened Apr 28, 2026 by windreamer Collaborator
4 tasks done
Update turbomind modeling infrastructure (label: improvement)
#4557 opened Apr 27, 2026 by lzhangzz Collaborator
[WIP]DeepSeek V4 support
#4554 opened Apr 24, 2026 by grimoire Collaborator Draft
[Feature] Implement /v1/embeddings endpoint for OpenAI-compatible API (label: enhancement)
#4550 opened Apr 23, 2026 by ZhijunLStudio Contributor
2 of 4 tasks
bump version to v0.13.0
#4549 opened Apr 23, 2026 by lvhan028 Collaborator
Test: update video sleep/wakeup and abort scenarios
#4528 opened Apr 15, 2026 by littlegy Contributor
style: add autopep8 pre-commit hook and apply PEP 8 formatting fixes
#4524 opened Apr 14, 2026 by windreamer Collaborator
make fp8 model quantized by llm-compressor can be inferenced in turbomind (label: enhancement)
#4509 opened Apr 8, 2026 by 43758726 Collaborator
Integrate deep-ep nccl backend (label: enhancement)
#4477 opened Mar 27, 2026 by irexyc Collaborator
feat: Turbomind linear gdn prefix caching (label: enhancement)
#4465 opened Mar 25, 2026 by lapy Contributor
refactor get_ppl (label: improvement)
#4461 opened Mar 25, 2026 by lvhan028 Collaborator
Support multi stop words (label: improvement)
#4454 opened Mar 24, 2026 by lvhan028 Collaborator
[WIP] Support qwen3-omni
#4411 opened Mar 13, 2026 by CUHKSZzxy Collaborator Draft
Add model deployment best practice section in user guide (label: documentation)
#4399 opened Mar 9, 2026 by lvhan028 Collaborator Draft
Fix Structured Output for GPT-OSS Models
#4386 opened Mar 2, 2026 by windreamer Collaborator