chenjian
|
74d0f1c01f
|
[Optim] Robust sync status when preemption happens (#5796)
* [Bug fix] Sync status for caching output cache
* fix
* fix
* fix bug
* fix
* fix
* support xpu
* fix
* fix
* fix
* fix
* fix
* fix ci
* fix ci
* fix xpu
---------
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com>
|
2026-01-14 12:07:33 +08:00 |
|
freeliuzc
|
582aebd48b
|
[MTP] support mtp chunk_prefill_v1 (#4366)
* support mtp chunk_prefill_v1
* fix mtp chunkprefill output, fix unit test
* fix unit test
* fix save_output
|
2025-10-15 13:21:32 +08:00 |
|
freeliuzc
|
52eda7fdb3
|
[Feature][MTP] support new speculative decoding method named hybrid mtp with ngram (#3610)
|
2025-08-26 14:29:22 +08:00 |
|
Jiang-Jia-Jun
|
92c2cfa2e7
|
Sync v2.0 version of code to github repo
|
2025-06-29 23:29:37 +00:00 |
|
jiangjiajun
|
684703fd72
|
[LLM] First commit of the LLM deployment code
|
2025-06-09 19:20:15 +08:00 |
|