mirror of
https://github.com/PaddlePaddle/FastDeploy.git
synced 2026-04-23 00:17:25 +08:00
[RL] add pause, update_weights, resume interface for async RL (#6052)
* support dynamic run_control_request through zmq from apiserver to common_engine
* support pause/resume/is_paused/update_weights in apiserver->common_engine by common run_control_method
* change /is_paused from HTTP POST method to GET method
* add pause, resume, is_paused implementation
* support engine <==> worker communication (request & response)
* support sync weights through RDMA from checkpoint_transfer
* support specified version, rsync_config in update_weights rpc call
* add pause, update_weights, resume interface for async RL
* bug fix: update_weights supports using default arguments
* fix typo
* typo fix
* typo fix
* typo fix
* add unit tests for control request/response, localscheduler.get_inflight_requests, resource_manager_v1.preempted_all
* add "rsync" to LoadConfig.load_strategy Literal type hints

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* typo fix
* typo fix
* Apply suggestion from @Copilot

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* check version/rsync params
* add error log when version.txt does not exist

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* raise specified ValueError when parameter check fails

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* tp barrier after run_control_method
* encode 'engine_worker_queue_port' to unique name of worker2engine fmq queue
* typo fix
* typo fix

---------

Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@@ -195,6 +195,10 @@ class EngineArgs:
     """
     dynamic load weight strategy
     """
+    rsync_config: Optional[Dict[str, Any]] = None
+    """
+    rsync weights config info
+    """
     quantization: Optional[Dict[str, Any]] = None
     guided_decoding_backend: str = "off"
     """
@@ -812,6 +816,12 @@ class EngineArgs:
             default=EngineArgs.load_strategy,
             help="Flag to dynamic load strategy.",
         )
+        model_group.add_argument(
+            "--rsync-config",
+            type=json.loads,
+            default=EngineArgs.rsync_config,
+            help="Rsync weights config",
+        )
         model_group.add_argument(
             "--engine-worker-queue-port",
             type=lambda s: s.split(",") if s else None,