[BugFix] [PD Disaggregation] fix v1 scheduler prefill node profile run & ipc transfer protocol (#5132)
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled

* [fix] fix v1 scheduler profile run for append attention in prefill node

* [fix] skip send_signal if kv signal not inited for gpu and xpu

* [fix] extend fix to flash_attn & mla_attn

* [fix] fix v1 pd run in ipc transfer protocol

* [ci] add test for v1 pd profile run using ipc transfer protocol

* [style] fix code style check

* [style] fix code style again

* [fix] fix profile run

* [update] remove --num-gpu-blocks-override in example script

* [chore] rename forward_meta is_profiling to is_dummy_or_profile_run
This commit is contained in:
Yonghua Li
2025-11-20 21:39:22 +08:00
committed by GitHub
parent 01c30f6b87
commit 43097a512a
12 changed files with 512 additions and 94 deletions
-1
View File
@@ -68,7 +68,6 @@ nohup python -m fastdeploy.entrypoints.openai.api_server \
--cache-transfer-protocol "rdma" \
--rdma-comm-ports "$((P_PORT + 4))" \
--pd-comm-port "$((P_PORT + 5))" \
--num-gpu-blocks-override 2000 \
--router "0.0.0.0:${ROUTER_PORT}" \
2>&1 >${FD_LOG_DIR}/nohup &