[BugFix] fix cache transfer manager updating/clearing (#5930)

* [fix] fix cache transfer manager updating/clearing

* [fix] fix code style

* [fix] fix config

* [fix] fix engine client

* [fix] let worker update kv cache status signal

* [fix] update worker process

* [fix] fix clear/update for the case where the comm group is shut down

* [fix] update dynamic weight manager

* [fix] fix port

* [fix] add num_cpu_blocks arg for async_llm, and remove unnecessary waiting
Yonghua Li
2026-01-13 21:09:29 +08:00
committed by GitHub
parent 6da06abc17
commit 456637002d
8 changed files with 165 additions and 74 deletions
@@ -38,6 +38,7 @@ python -m fastdeploy.entrypoints.openai.api_server \
--gpu-memory-utilization 0.9 \
--model "$MODEL_PATH" \
--no-shutdown-comm-group-if-worker-idle \
--swap-space 10 \
--load-strategy ipc_snapshot \
--dynamic-load-weight &