[Feature] support v1 update/clear api for RL (#6761)

* [Feature] support v1 update/clear api for RL

* [fix] fix execute_model and add sleep/wakeup api

* [fix] fix mtp and key_prefix

* [chore] move _update_key_prefix to resume method

* [fix] make the interface safe to call multiple times

* [fix] fix some tiny bugs

* [chore] make small changes against pr review

* [docs] add docs for weight update

* [test] add some tests and update docs

* [style] fix code style check

* [test] fix ci

* [fix] fix stale control responses when control method timed out

* [chore] remove unused code

* [chore] fix code style

* [chore] optimize tags and key_prefix

* [test] fix ci

* [chore] fix code style

* [test] fix ci

* [fix] fix ep control

* [fix] fix ep control for engine cache queue
Yonghua Li
2026-03-25 19:18:46 +08:00
committed by GitHub
parent 48cfb608aa
commit a7f52c300d
26 changed files with 1857 additions and 392 deletions
@@ -122,12 +122,21 @@ class ChatResponseProcessor:
                 else:
                     self._audio_buffer[req_id] = [token_ids]
             else:
-                yield self.data_processor.process_response_dict(
-                    response_dict=request_output,
-                    stream=stream,
-                    include_stop_str_in_output=include_stop_str_in_output,
-                    request=request,
-                )
+                if self._is_async_processor:
+                    response = await self.data_processor.process_response_dict(
+                        response_dict=request_output,
+                        stream=stream,
+                        include_stop_str_in_output=include_stop_str_in_output,
+                        request=request,
+                    )
+                else:
+                    response = self.data_processor.process_response_dict(
+                        response_dict=request_output,
+                        stream=stream,
+                        include_stop_str_in_output=include_stop_str_in_output,
+                        request=request,
+                    )
+                yield response
         elif stream:
             decode_type = request_output["outputs"].get("decode_type", 0)
             token_ids = request_output["outputs"]["token_ids"]
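The hunk above makes the response path work with both synchronous and asynchronous processors: when `process_response_dict` is a coroutine, the caller must `await` it before yielding. A minimal runnable sketch of the same dispatch pattern (class names here are hypothetical, and it detects the coroutine at call time via `inspect`, whereas the real code consults a precomputed `self._is_async_processor` flag):

```python
import asyncio
import inspect

class SyncProcessor:
    """Plain (blocking) processor: returns the result directly."""
    def process_response_dict(self, response_dict, **kwargs):
        return {**response_dict, "processed": True}

class AsyncProcessor:
    """Async processor: the same method is a coroutine and must be awaited."""
    async def process_response_dict(self, response_dict, **kwargs):
        await asyncio.sleep(0)  # stand-in for real async work
        return {**response_dict, "processed": True}

async def handle(processor, request_output):
    # Dispatch on whether the method is a coroutine function, awaiting
    # only in the async case -- the branch structure mirrors the diff.
    if inspect.iscoroutinefunction(processor.process_response_dict):
        response = await processor.process_response_dict(response_dict=request_output)
    else:
        response = processor.process_response_dict(response_dict=request_output)
    return response

if __name__ == "__main__":
    print(asyncio.run(handle(SyncProcessor(), {"id": 1})))
    print(asyncio.run(handle(AsyncProcessor(), {"id": 2})))
```

Caching the check in a flag, as the PR does, avoids repeating the introspection on every token of a stream.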