[PD Disaggregation][RL] Register to router with version and support RDMA eager connect for PD (#6718)

* [Feature] Register to router with version info for PD disaggregation

Add RegisterManager for PD (Prefill-Decode) disaggregated deployment:
- All instances (Prefill/Decode) register to Router with heartbeat
- Prefill instances fetch Decode instance list from Router
- Prefill instances establish eager RDMA connections to Decode instances
- Register info includes: host_ip, port, role, version, is_paused, connected_decodes
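The registration flow above can be sketched roughly as follows. This is a minimal illustration, not the actual `RegisterManager` code: the `RegisterInfo` dataclass, the helper names, the `"ip:port"` address format, and the role strings are all assumptions made for the example; only the six field names come from the list above.

```python
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class RegisterInfo:
    """Hypothetical container for the register-info fields named above."""

    host_ip: str
    port: int
    role: str  # assumed values: "prefill" or "decode"
    version: str
    is_paused: bool = False
    # Assumed format: "ip:port" strings of Decode instances already connected.
    connected_decodes: List[str] = field(default_factory=list)


def build_heartbeat_payload(info: RegisterInfo) -> dict:
    """Return a plain dict that could be sent to the Router on each heartbeat."""
    return asdict(info)


def decodes_to_connect(info: RegisterInfo, decode_list: List[str]) -> List[str]:
    """Return Decode addresses from the Router's list that are not yet connected."""
    already = set(info.connected_decodes)
    return [addr for addr in decode_list if addr not in already]
```

A Prefill instance would call `decodes_to_connect` with the list fetched from the Router, open connections to the returned addresses, and append them to `connected_decodes` so that the next heartbeat reports them.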

Changes:
- Add RegisterManager class for managing PD registration and RDMA connections
- Add version field to ModelConfig for model version tracking
- Add connected_decodes to register_info for tracking connected Decode instances
- Add FD_ENABLE_PD_RDMA_EAGER_CONNECT environment variable
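As an illustration of how a flag like `FD_ENABLE_PD_RDMA_EAGER_CONNECT` might gate the eager-connect path, here is a small sketch. The helper name, the default of disabled, and the set of accepted truthy strings are assumptions; the commit message only states that the variable exists.

```python
import os
from typing import Mapping, Optional


def rdma_eager_connect_enabled(environ: Optional[Mapping[str, str]] = None) -> bool:
    """Return True if eager RDMA connect is switched on (hypothetical helper).

    Assumption: the flag is off unless set to a common truthy string.
    """
    env = os.environ if environ is None else environ
    raw = env.get("FD_ENABLE_PD_RDMA_EAGER_CONNECT", "0")
    return raw.strip().lower() in {"1", "true", "yes", "on"}
```

Passing the environment as a parameter keeps the helper easy to test without mutating `os.environ`.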

Test fixes:
- Add None checks for load_config in FDConfig.__init__
- Add version attribute to test mock model configs
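The two test fixes follow a common pattern, sketched below under stated assumptions. `read_load_setting` and the `dynamic_load_weight` attribute are hypothetical names chosen for the example and are not taken from `FDConfig`; the point is only that a `None` config falls back to a default and that mocks carry a `version` attribute.

```python
from types import SimpleNamespace
from typing import Any, Optional


def read_load_setting(load_config: Optional[Any], name: str, default: Any = None) -> Any:
    """Read an attribute from load_config, tolerating load_config being None."""
    if load_config is None:
        return default
    return getattr(load_config, name, default)


# A test mock that carries the version attribute the registration code expects.
mock_model_config = SimpleNamespace(version="v1")
```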

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* refine

* remove test

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
jc
2026-03-17 14:43:35 +08:00
committed by GitHub
parent b152baeeee
commit 950366e58d
14 changed files with 507 additions and 97 deletions
-25
@@ -1152,31 +1152,6 @@ class TestCommonEngineAdditionalCoverage(unittest.TestCase):
         eng._control_update_weights(ControlRequest(request_id="ctrl", method="update_weights"))
         self._detach_finalizer(eng)

-    def test_register_to_router_disabled(self):
-        eng = self._make_mixed_engine()
-        eng.cfg.router_config.router = None
-        with (
-            patch.object(eng, "llm_logger") as mock_logger,
-            patch("fastdeploy.engine.common_engine.threading.Thread") as thread_mock,
-        ):
-            eng._register_to_router()
-        mock_logger.info.assert_called()
-        thread_mock.assert_not_called()
-        self._detach_finalizer(eng)
-
-    def test_register_to_router_enabled_starts_thread(self):
-        eng = self._make_mixed_engine()
-        eng.cfg.router_config.router = "http://router"
-        with patch("fastdeploy.engine.common_engine.threading.Thread") as thread_mock:
-            eng._register_to_router()
-        thread_mock.assert_called_once()
-        thread_mock.return_value.start.assert_called_once()
-        self._detach_finalizer(eng)
-
     def test_insert_zmq_task_to_scheduler_normal_request(self):
         eng = self._make_mixed_engine()
         eng.running = True