[BugFix] fix cache transfer manager updating/clearing (#5930)

* [fix] fix cache transfer manager updating/clearing
* [fix] fix code style
* [fix] fix config
* [fix] fix engine client
* [fix] let worker update kv cache status signal
* [fix] update worker process
* [fix] fix clear/update for case if comm group is shutdown
* [fix] update dynamic weight manager
* [fix] fix port
* [fix] add num_cpu_blocks arg for async_llm, and remove unnecessary waiting
2026-04-23 00:17:25 +08:00 · 2026-01-13 21:09:29 +08:00
parent 6da06abc17
commit 456637002d
8 changed files with 165 additions and 74 deletions
@@ -38,6 +38,7 @@ python -m fastdeploy.entrypoints.openai.api_server \
       --gpu-memory-utilization 0.9 \
       --model "$MODEL_PATH" \
       --no-shutdown-comm-group-if-worker-idle \
+       --swap-space 10 \
       --load-strategy ipc_snapshot \
       --dynamic-load-weight &