2 Commits

Author SHA1 Message Date
AIbin 3c54a41131 [Docs][Feature] add fastdeploy-llm-integration skill & research-report skill (#7287)
* add fastdeploy-llm-integration skill & research-report skill
2026-04-10 11:24:23 +08:00
freeliuzc cf7934a4b2 [Speculative Decoding] Unify Spec and non-spec branch (#6685)
* optimize spec-inference architecture

* delete debug log

* optimize spec_method usage && fix unit_test

* add claude unit-test skill

* fix some ugly bugs

* enhance robustness and bounds check

* unify method & spec_method into method to avoid bugs

* activate CI

* fix unit test

* Unify logprobs computation for naive and speculative decoding, fix CUDA kernel

* fix logprob bug && optimize verify kernel

* fix exist_decode() check
2026-03-10 23:58:44 -07:00