FastDeploy

mirror of https://github.com/PaddlePaddle/FastDeploy.git synced 2026-04-23 17:11:21 +08:00

Author	SHA1	Message	Date
memoryCoderC	be3be4913a	[Optimization] refactor(chat_handler,completion_handler): extract base classes and use AsyncLLM (#5195 ) * [Optimization] refactor(chat_handler,completion_handler): extract base classes and use AsyncLLM * [Optimization] refactor(chat_handler,completion_handler): rename class	2025-12-25 16:28:15 +08:00
zhouchong	5d9b5e4a5b	[Engine] [Feature] Refactor async_llm:cross-process with EngineService，based on zmq communication (#4868 ) * Refactor async_llm:cross-process with EngineService * fix: async_llm output process * fix: return prompt_token_ids and prompt_tokens in first res * optimize common_engine start func	2025-12-09 10:53:40 +08:00
Juncai	80efe98f8d	[PD Disaggregation] Add timestamp for analyzing splitwise deployment (#5317 ) * Add timestamp for analyzing splitwise deployment * up * up * up * up * up * up * fix format * fix	2025-12-08 10:08:44 +08:00
Yuanle Liu	41c63f6056	remove fastsafetensors (#5371 )	2025-12-04 19:22:04 +08:00
Longzhi Wang	add524d80c	[Feature] support chunked moe (#4575 ) * [Feature] support chunked moe * update * update * fix and add test * update * fix conflict and modity test * fix fused_moe * fix fused_moe * fix docstring * fix * fix typo * fix test * fix * fix * fix test * fix test	2025-12-01 15:17:18 +08:00
bukejiyu	1539fd6056	[BugFix]Set default OMP_NUM_THREADS=3 and fix extra GPU memory usage in DeepSeek (#5219 ) * fix bug * update * update * update * fix copy * update	2025-11-28 14:22:04 +08:00
kevin	8e4e3ff510	[Feature] support eplb in api_server (#4782 ) * support eplb in api_server * update code * add eplb test case * update eplb * support tp+dp eplb * update test cese * update code * update code * fix bug * update copilot review * update test case name	2025-11-24 20:22:29 +08:00
Juncai	36822fa49c	[PD Disaggregation] remove splitwise deployment on single node and refine the code (#4891 ) * remove splitwise deployment on single node and refine the code * up * up * up * add test * up	2025-11-14 09:56:53 +08:00
bukejiyu	b09ebb2813	refactor pt loading (#4532 )	2025-11-11 21:30:39 +08:00
Yuanle Liu	3dc0ffa46d	[TSP] Support qwen3 moe tsp + cudagraph (#4871 ) * support qwen3_moe tsp mode * fix * fix * update * update * update * fix * support external_rmsnorm * update * fix	2025-11-10 23:37:51 +08:00
chenjian	78895e2c7d	[Bug Fix] fix bug for PD EP (#4823 ) * fix bug for PD EP * fix * optimize perf for engine worker queue * fix bug * fix internode ll two stage * fix for ci * fix bug	2025-11-10 15:33:29 +08:00
chenjian	cc8f5312f5	[Feature] Add timestamp for profiler (#4726 ) * [Feature] Add timestamp for profiler * fix bug for offine inference * fix for ci * fix * fix ci	2025-11-05 12:04:59 +08:00
chen	1c3ca48128	[Feature][Executor] GPU Model Runner Supports prompt_logprobs and max_logprobs (#4769 )	2025-11-05 10:43:25 +08:00
zhouchong	35286ce31a	fix total_block_num init error in worker_process (#4687 )	2025-10-30 19:53:09 +08:00
chen	5c63a089f6	[Feature] Support logprobs_mode (#4567 )	2025-10-27 14:27:48 +08:00
zhouchong	dce988824d	[Feature] Support AsyncLLM (#4458 ) * add async_llm * apply review * update engine config * Adapt to latest engine.py changes * add more unit tests * Increase unit test coverage --------- Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>	2025-10-22 15:50:12 +08:00

16 Commits