Files
FastDeploy/custom_ops/gpu_ops/speculate_decoding
huicongyao 0f718baaf2 [Speculative Decoding] Reformat input preprocess for spec decode (#6501)
* add speculate_pre_process kernel

* reduce one slice

* make d2h async && fix mtp bug for new pre_process

* fix

* add unit test

* fix: code style formatting

* fix

* fix: thread race in speculate_preprocess && rename d2h event
2026-03-03 10:22:07 +08:00