[RL] Support Rollout Routing Replay (#5321)

* [RL] Support Rollout Routing Replay

* add routing indices cache

* fix config bug and moe forward bug

* R3 Support GLM

* support eb4.5

* fix merge bug

* Apply suggestion from @Copilot

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Apply suggestion from @Copilot

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Apply suggestion from @Copilot

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Apply suggestion from @Copilot

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* add routing replay ci

* support glm topk

* support orther top_k

* fix ci bug

* pre-commit

* only support chatcmpl

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Yuanle Liu <yuanlehome@163.com>
This commit is contained in:
RAM
2025-12-05 20:01:33 +08:00
committed by GitHub
parent 8545b705ed
commit 96d2d4877b
24 changed files with 592 additions and 24 deletions
+2
View File
@@ -65,6 +65,7 @@ class RolloutModelConfig:
data_parallel_size: int = 1,
num_nextn_predict_layers: int = 0,
eplb_config: str = {},
routing_replay_config: str = None,
):
# Required parameters
self.model = model_name_or_path
@@ -113,6 +114,7 @@ class RolloutModelConfig:
self.plas_attention_config = plas_attention_config
self.num_nextn_predict_layers = num_nextn_predict_layers
self.eplb_config = eplb_config
self.routing_replay_config = routing_replay_config
def __str__(self):
return "\n".join(f"{k}: {v}" for k, v in self.__dict__.items())