FastDeploy

mirror of https://github.com/PaddlePaddle/FastDeploy.git synced 2026-04-23 08:21:53 +08:00

Author	SHA1	Message	Date
Zhang Yulong	30db3e9d8f	[benchmark] update tools (#7512 )	2026-04-20 19:40:17 +08:00
Zhang Yulong	738c658c54	[Benchmark] Update seed argument handling in benchmark_serving.py (#7356 )	2026-04-13 16:05:50 +08:00
Zhang Yulong	7614175e13	Disable fixed random seed in benchmark_dataset.py (#7263 ) Commented out the random seed initialization to allow for varied randomness in benchmarks.	2026-04-10 13:56:14 +08:00
Zhang Yulong	f422f835e8	[benchmark] update tools (#7211 )	2026-04-07 16:25:44 +08:00
xiegegege	209e5cf7f4	[CE]add 21b mooncake yaml (#7033 ) * [CE]add 21b cpu cache ,glm mtp,glm for rl config * [CE]add 21b tp2 yaml * [CE]add 21b mooncake yaml * add fastdeploy benchmark,paddletest-155 * [CE] adjust vl wint4 config * [CE]add glm mtp with updatemodel config * [CE]fix * fix * test * test * test --------- Co-authored-by: xiegegege <>	2026-03-26 20:01:05 +08:00
Zhang Yulong	6f5aa883f7	[benchmark] update benchmark tools (#6991 ) * [benchmark] update benchmark tools * [benchmark] update benchmark tools	2026-03-24 20:56:27 +08:00
Zhang Yulong	2b10ebc1f1	[benchmark] Refactor debug logging and payload handling (#6949 ) * Refactor debug logging and payload handling * Update backend_request_func.py	2026-03-20 15:04:10 +08:00
Zhang Yulong	3a4e139f65	[Benchmark] fix multi turn (#6948 )	2026-03-20 13:22:30 +08:00
xjkmfa	3b203994e2	[Benchmark] Update Qwen3 vl 32k yaml (#6946 )	2026-03-20 11:48:53 +08:00
xjkmfa	a81116ad90	[Benchmark] Update Qwen3 vl dense yaml (#6945 )	2026-03-20 11:26:47 +08:00
Zhang Yulong	051bbbeead	[Benchmark] Update backend_request_func.py (#6575 )	2026-02-28 19:51:55 +08:00
Zhang Yulong	ce8123cb7f	[Benchmark] Update backend_request_func.py (#6566 )	2026-02-28 14:54:30 +08:00
Zhang Yulong	ff20a3cc02	[benchmark] update tool call (#6519 )	2026-02-26 17:06:54 +08:00
Zhang Yulong	96bfa0d5b9	[benchmark] Update benchmark_serving.py (#6467 )	2026-02-11 20:10:46 +08:00
Zhang Yulong	02c61f8346	[Benchmark] Update backend_request_func.py (#6441 )	2026-02-10 19:58:50 +08:00
Zhang Yulong	66c9e11998	[benchmark] update tools (#6437 )	2026-02-10 17:48:55 +08:00
Zhang Yulong	26ba019e66	Update README.md (#6343 )	2026-02-04 15:57:18 +08:00
Zhang Yulong	16d03c3127	update (#6335 )	2026-02-03 21:59:32 +08:00
xiegegege	51c6fa8afc	[CE]add 21b cpu cache ,glm mtp,glm for rl config (#6328 )	2026-02-03 20:10:47 +08:00
xjkmfa	e27a7cc5b0	[Benchmark] Ce qwen3 vl (#6288 ) * [CE]qwen3-vl	2026-02-03 14:17:28 +08:00
ophilia-lee	1705d0af7a	[benchmark] Support retrieving cached tokens from SGLang/VLLM (#6240 ) * Benchmark tool supports specifying response_format for constrained decoding scenarios * Update backend_request_func.py: make the output.success check handle empty reply content when overlong thinking content is truncated * Update benchmark_serving.py: update benchmark_metrics * Support the Completions API * Support the Completions API * Support the Completions API * [Benchmark] Support the Completions API * [Benchmark] Support the Completions API * [Benchmark] async_request_eb_openai_completions: increase the aiohttp default read buffer size to 4M to fix the "Chunk too big" error when streaming chunks are too large * [Benchmark] Increase the aiohttp default read buffer size to 10M to fix the "Chunk too big" error when streaming chunks are too large * [Benchmark] Support retrieving vLLM/SGLang cached_tokens [Benchmark] Support retrieving vLLM/SGLang cached_tokens * [benchmark] Support retrieving cached tokens from SGLang/VLLM [benchmark] Support retrieving cached tokens from SGLang/VLLM --------- Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com>	2026-01-27 14:57:20 +08:00
xiegegege	e22c4e29bb	[CE]add paddleocr config yaml (#6097 )	2026-01-19 20:07:42 +08:00
jc	e911ac2ce7	[BugFix] Refine the preparation of cpu and storage cache (#5777 ) * Refine the preparation of cpu and storage cache * fix error * fix error * up * fix * up docs * fix unittest * remove debug info	2026-01-05 10:13:30 +08:00
Zhang Yulong	2da32f2a35	Update benchmark_serving.py (#5861 )	2026-01-04 20:07:56 +08:00
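ophilia-lee	d5f5dc4f6e	[Benchmark] Increase the aiohttp default read buffer size to 10M to fix the "Chunk too big" error when streaming chunks are too large (#5771 ) * Benchmark tool supports specifying response_format for constrained decoding scenarios * Update backend_request_func.py: make the output.success check handle empty reply content when overlong thinking content is truncated * Update benchmark_serving.py: update benchmark_metrics * Support the Completions API * Support the Completions API * Support the Completions API * [Benchmark] Support the Completions API * [Benchmark] Support the Completions API * [Benchmark] async_request_eb_openai_completions: increase the aiohttp default read buffer size to 4M to fix the "Chunk too big" error when streaming chunks are too large * [Benchmark] Increase the aiohttp default read buffer size to 10M to fix the "Chunk too big" error when streaming chunks are too large --------- Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com>	2025-12-25 19:36:11 +08:00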
Juncai	412867fd99	[Feature] Support KV Cache Storage (#5571 ) * Support Mooncake Store * up * up * add op * fix conflict * fix error * up for comments * avoid thread lock * up * fix unittest * fix unittest * remove debug info * consider tp_size > 1 * add default rdma_nics * add utils * up * fix error --------- Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com>	2025-12-25 16:30:35 +08:00
ophilia-lee	99258e19c8	[Benchmark] Support the Completions API (#5700 ) * Benchmark tool supports specifying response_format for constrained decoding scenarios * Update backend_request_func.py: make the output.success check handle empty reply content when overlong thinking content is truncated * Update benchmark_serving.py: update benchmark_metrics * Support the Completions API * Support the Completions API * Support the Completions API * [Benchmark] Support the Completions API * [Benchmark] Support the Completions API --------- Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com>	2025-12-23 19:46:23 +08:00
Zhang Yulong	48f3e9797e	Update backend_request_func.py (#5633 )	2025-12-18 16:21:34 +08:00
Zhang Yulong	c89a62e550	Update backend_request_func.py (#5631 )	2025-12-18 14:20:17 +08:00
Zhang Yulong	f45c131ddf	update (#5625 )	2025-12-17 21:38:14 +08:00
xiegegege	97e340eb14	[CE]add pd router and wint4 tp4 config (#5554 )	2025-12-15 15:25:14 +08:00
tianlef	13cc7dacfd	[Doc]add text/vl cinn ce config (#5532 )	2025-12-12 16:16:06 +08:00
Zhang Yulong	510b82173a	[Benchmark] Update benchmark (#5496 ) * update benchmark * update benchmark	2025-12-11 11:53:12 +08:00
SunLei	5fb93d84f5	[Feature] [Benchmark]: add ZMQ-based FMQ implementation and benchmark tools (#5418 ) * feat(fmq): add ZMQ-based FMQ implementation and benchmark tools * move FMQ_CONFIG_JSON to envs * fix top_p_candidates (#5400) Co-authored-by: freeliuzc <lzc842650834@gmail.com> * [RL] Support Rollout Routing Replay (#5321) * [RL] Support Rollout Routing Replay * add routing indices cache * fix config bug and moe forward bug * R3 Support GLM * support eb4.5 * fix merge bug * Apply suggestion from @Copilot Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Apply suggestion from @Copilot Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Apply suggestion from @Copilot Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Apply suggestion from @Copilot Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * add routing replay ci * support glm topk * support orther top_k * fix ci bug * pre-commit * only support chatcmpl --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Co-authored-by: Yuanle Liu <yuanlehome@163.com> * [Bug fix] Fix the multi-input accuracy issue in the pooling model. (#5374) * fix multi-inputs * fix threshold * fix threshold * fix * [BugFix]remove _execute_empty_input (#5396) * Revert "[RL] Support Rollout Routing Replay (#5321)" (#5402) This reverts commit `96d2d4877b`. 
* [New][RL] Support Rollout Routing Replay (#5405) * [RL] Support Rollout Routing Replay * add routing indices cache * fix config bug and moe forward bug * R3 Support GLM * support eb4.5 * fix merge bug * Apply suggestion from @Copilot Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Apply suggestion from @Copilot Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Apply suggestion from @Copilot Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Apply suggestion from @Copilot Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * add routing replay ci * support glm topk * support orther top_k * fix ci bug * pre-commit * only support chatcmpl * Revert "Revert "[RL] Support Rollout Routing Replay (#5321)" (#5402)" This reverts commit `c45e064f3d`. * Fix XPU and NPU bug --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Co-authored-by: Yuanle Liu <yuanlehome@163.com> * bf16 deepseek (#5379) * fix deepseek (#5410) * Update tests/inter_communicator/test_fmq_factory.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Update benchmarks/benchmark_fmq.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Update fastdeploy/inter_communicator/fmq.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> --------- Co-authored-by: GoldPancake <56388518+Deleter-D@users.noreply.github.com> Co-authored-by: freeliuzc <lzc842650834@gmail.com> Co-authored-by: RAM <gstian5555@outlook.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Co-authored-by: Yuanle Liu <yuanlehome@163.com> Co-authored-by: lizexu123 <39205361+lizexu123@users.noreply.github.com> Co-authored-by: 周周周 <39978853+zhoutianzi666@users.noreply.github.com> Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com> Co-authored-by: bukejiyu <52310069+bukejiyu@users.noreply.github.com>	2025-12-08 22:04:49 +08:00
xiegegege	b7e1e6c953	[CE]change yaml name	2025-12-04 19:14:11 +08:00
tianlef	04d35ace5e	[CE]add wint4 ep (#5355 )	2025-12-03 15:17:47 +08:00
Zhang Yulong	5b49142988	update (#5298 )	2025-11-28 18:29:16 +08:00
xiegegege	eae34a416c	[benchmark]add qwen3-235b pd+ep yaml (#5225 )	2025-11-25 19:53:30 +08:00
tianlef	de43577a7c	[Docs] add ebvlthinking yaml (#5120 )	2025-11-19 15:27:46 +08:00
Zhang Yulong	83532e1d01	[Benchmark] Enhance benchmark output logging (#4682 ) * Enhance benchmark output logging Add print statements to display the number of discarded outputs before and after filtering. * Update benchmark_serving.py	2025-11-06 16:53:31 +08:00
Juncai	08ca0f6aea	[Feature] [PD] add simple router and refine splitwise deployment (#4709 ) * add simple router and refine splitwise deployment * fix	2025-11-06 14:56:02 +08:00
zhang-prog	4c2ad15258	add paddleocr_vl benchmark (#4833 ) * add paddleocr_vl benchmark * fix * fix * fix * fix	2025-11-05 19:37:45 +08:00
ophilia-lee	412097c1b8	Benchmark tool supports specifying response_format for constrained decoding scenarios (#4718 )	2025-10-31 12:26:24 +08:00
Ryan	28de91b50f	[Graph Optimization] SOT+CUDAGraph support ERNIE4.5T VL 28B / 424B (#4645 ) * 45TVL support sot+CUDAGraph * mv unitest from ce_deploy 2 e2e * add test_EB_VL_Lite_sot_serving * rm useless line * add openai_client * fix unitest && reduce computing resources	2025-10-31 11:38:43 +08:00
kxz2002	a2870ed4a9	[Feature] Unify the registration name recognition for tool_parser and reasoning_parser to “-” (#4668 ) * parser register name unify * change ernie_x1 to ernie-x1 * change ernie4_5_vl to ernie-45-vl * fix unit test	2025-10-31 10:45:27 +08:00
xjkmfa	19df1aec2b	[Docs] add Qwen25vl yaml (#4662 ) * Add ci case for min token and max token * 【CI case】include total_tokens in the last packet of completion interface stream output * 【CE】add qwen25-vl * 【CE】add qwen25-vl --------- Co-authored-by: xujing43 <xujing43@baidu.com>	2025-10-29 17:39:40 +08:00
RAM	86d5006a57	[Graph Optimization][Speculative Decoding] Update yaml and fix typo (#4612 )	2025-10-28 11:43:26 +08:00
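ophilia-lee	70aa7423f8	Adapt the benchmark tool to the SGLang framework (#4607 ) * Adapt the benchmark tool to the SGLang framework * Adapt the benchmark tool to the SGLang framework * Adapt the benchmark tool to the SGLang framework	2025-10-27 18:52:56 +08:00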
tianlef	2676a918f0	[Doc]fix deepseek ce (#4560 )	2025-10-23 14:09:11 +08:00
tianlef	153f15db39	[Doc]add deepseek wint4 ce (#4517 )	2025-10-21 16:41:51 +08:00

78 Commits