[RL][Cherry-Pick] Support Fully Async and PrefixCache (#6599)

* cherry-pick Support Fully Async and PrefixCache step 1
* copy routing_indices_cache.py from 2.4
* cherry-pick [RL] R3 Fix the bug for determining the end of a request (#6388)
* cherry-pick [RL] Clear Requests status of R3 (#6569)
* delete code
* fix rename bug
* fix status shape bug
* fix ci
2026-04-23 00:17:25 +08:00 · 2026-03-12 16:13:30 +08:00
parent 1ed6073d94
commit cdaf6dd400
7 changed files with 641 additions and 237 deletions
@@ -658,7 +658,7 @@ class PaddleDisWorkerProc:

            if num_blocks_local <= 0:
                raise ValueError(
-                    "The total number of blocks cannot be less than zero. "
+                    f"The total number of blocks cannot be less than zero but got {num_blocks_local}. "
                    "Please increase gpu_memory_utilization "
                    "Or decrease max_num_batched_tokens(max model length)."
                )
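
The check in this hunk can be exercised in isolation. Below is a minimal sketch, assuming a hypothetical standalone helper named `check_num_blocks` (not part of this commit or of `PaddleDisWorkerProc`); it only mirrors the condition and the intent of the message shown in the diff, with the wording adjusted to match the `<= 0` condition.

```python
def check_num_blocks(num_blocks_local: int) -> int:
    """Hypothetical helper: reject a non-positive local block count.

    Mirrors the `num_blocks_local <= 0` check from the hunk above; the
    function name and return value are assumptions for illustration.
    """
    if num_blocks_local <= 0:
        # Including the offending value makes the failure easier to diagnose.
        raise ValueError(
            f"The total number of blocks must be greater than zero, but got {num_blocks_local}. "
            "Please increase gpu_memory_utilization "
            "or decrease max_num_batched_tokens (max model length)."
        )
    return num_blocks_local


if __name__ == "__main__":
    print(check_num_blocks(8))
    try:
        check_num_blocks(0)
    except ValueError as exc:
        print(f"rejected: {exc}")
```

Note that the condition also rejects a count of exactly zero, which is why the sketch words the message as "must be greater than zero".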