InternLM / lmdeploy Public

Fork 653
Star 7.6k

Code
Issues 511
Pull requests 51
Discussions
Actions
Projects
Security 1
Insights

Pull requests: InternLM/lmdeploy

Labels 34 Milestones 0

51 Open 2,010 Closed

Pull requests list

[WIP]: support glm4.7 with mtp (label: WIP)

#4346 opened Feb 10, 2026 by RunningLeon • Draft

add custom noaux kernel

#4345 opened Feb 10, 2026 by grimoire

Support MiniMax-M2 in TurboMind engine

#4343 opened Feb 10, 2026 by zh-nj

fix qwen3-vl-moe long context (label: Bug:P1)

#4342 opened Feb 9, 2026 by grimoire

Fix authorization

#4338 opened Feb 9, 2026 by lvhan028

[WIP]Support torch compile

#4336 opened Feb 8, 2026 by grimoire • Draft

add preliminary support for EP(single-node) of turbomind backend

#4332 opened Feb 6, 2026 by irexyc

Qwen Dense/Moe model fp8 quant online

#4324 opened Feb 5, 2026 by 43758726

Negative KV sequence length error in Attention op

#4316 opened Feb 2, 2026 by jinminxi104

Compatible with transformers 5.0 at TurboMind side (label: improvement)

#4304 opened Jan 28, 2026 by lvhan028

change ascend paged attention from BSH format to TND format for better performance

#4295 opened Jan 27, 2026 by jinminxi104 • Draft

Support ignore layers in quant config for qwen3 models (label: improvement)

#4293 opened Jan 26, 2026 by RunningLeon

return BadRequest for all invalid inputs (label: Bug:P2)

#4291 opened Jan 26, 2026 by lvhan028

support repetition ngram logits processor

#4288 opened Jan 23, 2026 by grimoire

fix dllm mask on set_step

#4278 opened Jan 18, 2026 by grimoire

[ascend] fix awq and smoothq

#4277 opened Jan 16, 2026 by wanfengcxz • Draft

test: add mixing guided and non-guided tests

#4267 opened Jan 12, 2026 by windreamer

Update benchmark serving script for proxy_server

#4173 opened Dec 1, 2025 by lvhan028

[WIP]: Support prefix caching with routed experts

#4171 opened Nov 28, 2025 by RunningLeon • Draft

Support fp32 head for qwen and internlm models (label: improvement)

#4160 opened Nov 27, 2025 by RunningLeon

fix: fix lora weight loading for internvl

#4106 opened Nov 6, 2025 by windreamer • Draft

Update installation.md

#4095 opened Nov 3, 2025 by krescent

Add step_map to track token decoding order in DLLM

#4057 opened Oct 21, 2025 by Auraithm

4 tasks done

[POC] Encoder Disaggregation

#4047 opened Oct 17, 2025 by CUHKSZzxy • Draft

2 of 7 tasks

quant blocked fp8 (label: enhancement)

#4018 opened Sep 29, 2025 by CUHKSZzxy

4 of 5 tasks
