freeliuzc
|
cf7934a4b2
|
[Speculative Decoding] Unify Spec and non-spec branch (#6685)
* optimize spec-inference architecture
* delete debug log
* optimize spec_method usage && fix unit_test
* add claude unit-test skill
* fix some ugly bugs
* enhance robustness and bounds check
* unify method & spec_method into method to avoid bugs
* activate CI
* fix unit test
* Unify logprobs computation for naive and speculative decoding, fix CUDA kernel
* fix logprob bug && optimize verify kernel
* fix exist_decode() check
|
2026-03-10 23:58:44 -07:00 |
|