mirror of
https://github.com/PaddlePaddle/FastDeploy.git
synced 2026-04-23 00:17:25 +08:00
[RL][Cherry-Pick] Support Fully Async and PrefixCache (#6599)
* cherry-pick Support Fully Async and PrefixCache step 1
* copy routing_indices_cache.py from 2.4
* cherry-pick [RL] R3 Fix the bug for determining the end of a request (#6388)
* cherry-pick [RL] Clear Requests status of R3 (#6569)
* delete code
* fix rename bug
* fix status shape bug
* fix ci
@@ -658,7 +658,7 @@ class PaddleDisWorkerProc:
         if num_blocks_local <= 0:
             raise ValueError(
-                "The total number of blocks cannot be less than zero. "
+                f"The total number of blocks cannot be less than zero bug got {num_blocks_local}. "
                 "Please increase gpu_memory_utilization "
                 "Or decrease max_num_batched_tokens(max model length)."
             )
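The change in this hunk puts the offending value into the error message. A minimal standalone sketch of the same check follows. The helper name `check_num_blocks` is hypothetical and is not part of FastDeploy, and the message wording is tidied for illustration rather than copied from the commit.

```python
def check_num_blocks(num_blocks_local: int) -> int:
    """Return num_blocks_local unchanged, or raise if it is not positive.

    Hypothetical helper for illustration only; in the commit the check
    is written inline in PaddleDisWorkerProc.
    """
    if num_blocks_local <= 0:
        # Report the actual value so the failure is easier to diagnose.
        raise ValueError(
            f"The total number of blocks must be greater than zero, but got {num_blocks_local}. "
            "Please increase gpu_memory_utilization "
            "or decrease max_num_batched_tokens (max model length)."
        )
    return num_blocks_local
```

Calling `check_num_blocks(0)` raises a `ValueError` whose message contains the rejected value, while a positive count is returned unchanged.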