Mirror of https://github.com/PaddlePaddle/FastDeploy.git (synced 2026-04-22 16:07:51 +08:00)
[BugFix] Fix clear_parameters hang issue in MTP during weight cleanup in RL (#7522)
* Fix MTP clear-graph bugs in RL
@@ -2910,12 +2910,19 @@ class GPUModelRunner(ModelRunnerBase):
# Clear CUDAGraph
if self.use_cudagraph:
self.model.clear_graph_opt_backend()
if (
self.speculative_decoding
and self.spec_method == SpecMethod.MTP
and self.graph_opt_config.draft_model_use_cudagraph
):
self.proposer.model.clear_graph_opt_backend()
# Clear parameters and Send single
self.dynamic_weight_manager.clear_parameters(
pid, self.fd_config.parallel_config.shutdown_comm_group_if_worker_idle
)
if self.spec_method == SpecMethod.MTP:
self.proposer.model.clear_graph_opt_backend()

# NOTE(wangyanpeng): MTP cache must be cleared before clearing the main KV cache
if self.speculative_decoding and self.spec_method == SpecMethod.MTP:
self.proposer.clear_mtp_cache()
self.clear_cache()
paddle.device.cuda.empty_cache()
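The cleanup order this hunk appears to aim for can be sketched with a stand-in object that only records call names. This is a minimal illustration, not FastDeploy code: `FakeRunner` and `cleanup_order` are hypothetical names, the real methods are replaced by strings, and the added/removed markers of the diff did not survive on this page, so the sketch assumes (from the commit title) that the draft-model graph is cleared before `clear_parameters` and that the later unconditional clear is dropped. Whether the MTP guard is nested under `use_cudagraph` is also not recoverable here; the sketch places it at the same level as an assumption.

```python
from dataclasses import dataclass, field
from enum import Enum


class SpecMethod(Enum):
    # Stand-in for the enum referenced in the diff; only MTP matters here.
    MTP = "mtp"
    NONE = "none"


@dataclass
class FakeRunner:
    """Hypothetical stand-in that records cleanup steps instead of touching a GPU."""

    use_cudagraph: bool = True
    speculative_decoding: bool = True
    spec_method: SpecMethod = SpecMethod.MTP
    draft_model_use_cudagraph: bool = True
    calls: list = field(default_factory=list)

    def cleanup_order(self):
        # 1. Clear the main model's CUDA graph.
        if self.use_cudagraph:
            self.calls.append("model.clear_graph_opt_backend")
        # 2. Clear the draft (MTP) model's graph, only if it was captured.
        #    Assumed to run BEFORE the parameters are released.
        if (
            self.speculative_decoding
            and self.spec_method == SpecMethod.MTP
            and self.draft_model_use_cudagraph
        ):
            self.calls.append("proposer.model.clear_graph_opt_backend")
        # 3. Release the weights.
        self.calls.append("dynamic_weight_manager.clear_parameters")
        # 4. MTP cache goes before the main KV cache (per the NOTE in the diff).
        if self.speculative_decoding and self.spec_method == SpecMethod.MTP:
            self.calls.append("proposer.clear_mtp_cache")
        self.calls.append("clear_cache")
        self.calls.append("paddle.device.cuda.empty_cache")
        return self.calls


if __name__ == "__main__":
    for step in FakeRunner().cleanup_order():
        print(step)
```

With the draft-model flag disabled (`FakeRunner(draft_model_use_cudagraph=False)`), the sketch skips the proposer graph step entirely, which mirrors the extra condition in the guarded block.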