[RL][Cherry-Pick] Support Fully Async and PrefixCache (#6599)

* cherry-pick Support Fully Async and PrefixCache step 1

* copy routing_indices_cache.py from 2.4

* cherry-pick [RL] R3 Fix the bug for determining the end of a request (#6388)

* cherry-pick [RL] Clear Requests status of R3 (#6569)

* delete code

* fix rename bug

* fix status shape bug

* fix ci
RAM
2026-03-12 16:13:30 +08:00
committed by GitHub
parent 1ed6073d94
commit cdaf6dd400
7 changed files with 641 additions and 237 deletions
+1 -1
@@ -658,7 +658,7 @@ class PaddleDisWorkerProc:
         if num_blocks_local <= 0:
             raise ValueError(
-                "The total number of blocks cannot be less than zero. "
+                f"The total number of blocks cannot be less than zero but got {num_blocks_local}. "
                 "Please increase gpu_memory_utilization "
                 "or decrease max_num_batched_tokens (max model length)."
             )
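The hunk above replaces a static error message with an f-string that reports the offending block count, which makes the failure actionable. A minimal, self-contained sketch of this validation pattern (the standalone function name `check_num_blocks` is hypothetical; only `num_blocks_local` and the message text come from the diff):

```python
def check_num_blocks(num_blocks_local: int) -> None:
    # Reject non-positive block counts and include the actual value
    # in the message, mirroring the f-string introduced in the hunk.
    if num_blocks_local <= 0:
        raise ValueError(
            f"The total number of blocks cannot be less than zero but got {num_blocks_local}. "
            "Please increase gpu_memory_utilization "
            "or decrease max_num_batched_tokens (max model length)."
        )

# A positive count passes silently; zero or negative raises ValueError.
check_num_blocks(128)
```

Note the guard uses `<= 0`, so a count of exactly zero is also rejected, not just negative values.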