freeliuzc
|
cf7934a4b2
|
[Speculative Decoding] Unify Spec and non-spec branch (#6685)
* optimize spec-inference architecture
* delete debug log
* optimize spec_method usage && fix unit_test
* add claude unit-test skill
* fix some ugly bugs
* enhance robustness and bounds check
* unify method & spec_method to method to avoid bug
* activate CI
* fix unit test
* Unify logprobs computation for naive and speculative decoding, fix CUDA kernel
* fix logprob bug && optimize verify kernel
* fix exist_decode() check
|
2026-03-10 23:58:44 -07:00 |
|
GoldPancake
|
2178f2829b
|
[Speculative Decoding] Support suffix decoding (#6403)
* support suffix decoding
|
2026-02-26 11:42:05 +08:00 |
|
yangjianfengo1
|
ba5c2b7e37
|
[Docx] add language (en/cn) switch links (#4470)
* add install docs
* update docs
* update docs
|
2025-10-17 15:47:41 +08:00 |
|
freeliuzc
|
46911f903d
|
[MTP]update hybrid-mtp-with-ngram (#4047)
|
2025-09-15 17:13:31 +08:00 |
|
Zero Rains
|
25698d56d1
|
polish code with new pre-commit rule (#2923)
|
2025-07-19 23:19:27 +08:00 |
|
Jiang-Jia-Jun
|
92c2cfa2e7
|
Sync v2.0 version of code to github repo
|
2025-06-29 23:29:37 +00:00 |
|