[Engine] [Feature] Refactor async_llm:cross-process with EngineService, based on zmq communication (#4868)

* Refactor async_llm:cross-process with EngineService
* fix: async_llm output process
* fix: return prompt_token_ids and prompt_tokens in first res
* optimize common_engine start func
2026-04-23 00:17:25 +08:00 · 2025-12-09 10:53:40 +08:00
parent 2f208db4e9
commit 5d9b5e4a5b
8 changed files with 2217 additions and 1790 deletions
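
The hunk shown for `ZmqServerBase.recv_result_handle` swaps `while True` for `while self.running`, which presumably lets the receive loop exit once the server is asked to stop; the commit message does not state the motivation. A minimal sketch of that pattern follows. It uses a standard-library `queue.Queue` in place of a ZMQ socket, and `PollingWorker`, `inbox`, `handled`, and `close` are hypothetical names that do not come from this commit.

```python
import queue
import threading
import time


class PollingWorker:
    """Hypothetical stand-in for a server with a background receive loop."""

    def __init__(self):
        self.running = True
        self.inbox = queue.Queue()  # assumption: replaces the ZMQ socket
        self.handled = []

    def recv_loop(self):
        # Re-check the flag on every iteration so close() can end the loop.
        while self.running:
            try:
                # Non-blocking read, analogous to recv with zmq.NOBLOCK.
                item = self.inbox.get_nowait()
            except queue.Empty:
                time.sleep(0.001)  # nothing to read yet; yield briefly
                continue
            self.handled.append(item)

    def close(self):
        # Clearing the flag lets recv_loop return on its next iteration.
        self.running = False


worker = PollingWorker()
thread = threading.Thread(target=worker.recv_loop)
thread.start()

worker.inbox.put("req-1")
while not worker.handled:  # wait until the loop has picked up the item
    time.sleep(0.001)

worker.close()
thread.join(timeout=1.0)
```

With `while True` the thread in this sketch would never return from `recv_loop`, so `join` would time out; with the flag it exits shortly after `close()` is called.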
@@ -191,7 +191,7 @@ class ZmqServerBase(ABC):
            return str(e), None

    def recv_result_handle(self):
-        while True:
+        while self.running:
            try:
                with self.response_token_lock:
                    client, _, request_id = self.socket.recv_multipart(flags=zmq.NOBLOCK)