Yonghua Li
7c8c0a3c02
[BugFix] replace ftok with custom_ftok in get_output/save_output ops (#6822)
...
* [BugFix] replace ftok with custom_ftok in get_output/save_output ops
* [Test] add unit test for custom_ftok
* [Chore] create custom_ftok.h
* [Chore] reorganize header file
* [Fix] fix cache messager msg_queue_id+rank_id conflict
2026-03-16 14:22:18 +08:00
Juncai
d67388a479
[PD Disaggregation] Distinguish the pipelines for sending kv signal in different prefill (#5514)
...
* Distinguish the pipelines for sending kv signal in different prefill
* up
2025-12-12 14:05:36 +08:00
Yonghua Li
43097a512a
[BugFix] [PD Disaggregation] fix v1 scheduler prefill node profile run & ipc transfer protocol (#5132)
...
* [fix] fix v1 scheduler profile run for append attention in prefill node
* [fix] skip send_signal if kv signal not inited for gpu and xpu
* [fix] extend fix to flash_attn & mla_attn
* [fix] fix v1 pd run in ipc transfer protocol
* [ci] add test for v1 pd profile run using ipc transfer protocol
* [style] fix code style check
* [style] fix code style again
* [fix] fix profile run
* [update] remove --num-gpu-blocks-override in example script
* [chore] rename forward_meta is_profiling to is_dummy_or_profile_run
2025-11-20 21:39:22 +08:00
Zero Rains
25698d56d1
polish code with new pre-commit rule (#2923)
2025-07-19 23:19:27 +08:00
freeliuzc
d49f8fb30a
[Feature][MTP] Support cacheKV transfer in per_chunk mode (#2890)
...
* support chunk_prefill both normal and speculative_decoding(mtp)
* optimize pd-disaggregation config
* fix bug
2025-07-17 17:58:08 +08:00
Jiang-Jia-Jun
92c2cfa2e7
Sync v2.0 version of code to github repo
2025-06-29 23:29:37 +00:00
jiangjiajun
684703fd72
[LLM] First commit the llm deployment code
2025-06-09 19:20:15 +08:00