Commit Graph

3 Commits

Author SHA1 Message Date
sunxin c29e86fc9d [Feature] Support mtp overlap schedule (#7001) 2026-04-01 14:24:26 +08:00
sunxin 0dc7034ce0 [Model Runner] Deprecate not_need_stop (#6356)
* Deprecate not_need_stop
2026-03-05 10:55:42 +08:00
huicongyao 0f718baaf2 [Speculative Decoding] Reformat input preprocess for spec decode (#6501)
* add speculate_pre_process kernel

* reduce one slice

* make d2h async && fix mtp bug for new pre_process

* fix

* add unit test

* fix: code style formatting

* fix

* fix: thread race in speculate_preprocess && rename d2h event
2026-03-03 10:22:07 +08:00