Files
FastDeploy/custom_ops/gpu_ops/speculate_decoding/draft_model
freeliuzc e87ce4b8cd [Speculative Decoding] refactor MTP and optimize spec-decoding postprocess (#6973)
* support new mtp

* refactor(speculate_decoding and mtp): optimize mtp structure logic. Update spec-branch status-process

* fix cuda-graph for spec-decoding

* fix xpu mtp and fix some note

* fix unittest and optimize note

* fix model status update in eos-branch
2026-03-24 10:19:01 +08:00
2026-03-04 21:55:31 +08:00