FastDeploy

mirror of https://github.com/PaddlePaddle/FastDeploy.git synced 2026-05-07 16:08:58 +08:00

Files

memoryCoderC be3be4913a [Optimization] refactor(chat_handler,completion_handler): extract base classes and use AsyncLLM (#5195)

* [Optimization] refactor(chat_handler,completion_handler): extract base classes and use AsyncLLM

* [Optimization] refactor(chat_handler,completion_handler): rename class

2025-12-25 16:28:15 +08:00

test_get_save_output_v1.py

[Feature] Support cache kv cache for output tokens (#4535)

2025-12-04 20:53:08 +08:00

test_pooler.py

[Docs] Add License in Unittest (#4957)

2025-11-12 10:44:09 +08:00

test_process_batch_draft_tokens.py

[Speculative Decoding] split draft_tokens into standalone post-processing path (#5205)

2025-11-27 11:22:41 +08:00

test_process_batch_output_use_zmq.py

[PD Disaggregation] Add timestamp for analyzing splitwise deployment (#5317)

2025-12-08 10:08:44 +08:00

test_process_batch_output.py

[Feature] Tracing: Fine-Grained Tracing for Request Latency Part1 (#5458)

2025-12-16 16:36:09 +08:00

test_stream_transfer_data.py

[Docs] Add License in Unittest (#4957)

2025-11-12 10:44:09 +08:00

test_token_processor_trace_print.py

[PD Disaggregation] Add timestamp for analyzing splitwise deployment (#5317)

2025-12-08 10:08:44 +08:00

test_token_processor.py

[Optimization] refactor(chat_handler,completion_handler): extract base classes and use AsyncLLM (#5195)

2025-12-25 16:28:15 +08:00