[Speculative Decoding] Support MTP for GLM-4.5-Air (#6047)

* glm mtp
* add spec neox partial rope
GoldPancake
2026-01-16 14:35:24 +08:00
committed by GitHub
parent b2a2e11551
commit bda38aa519
9 changed files with 627 additions and 31 deletions
@@ -465,6 +465,8 @@ def post_process_specualate(
step_idx=share_inputs["step_idx"],
limit_think_status=share_inputs["limit_think_status"],
accept_num=share_inputs["accept_num"],
stop_flags=share_inputs["stop_flags"],
eos_token_ids=share_inputs["eos_token_id"],
think_end_id=think_end_id,
line_break_id=line_break_id,
)