FastDeploy

mirror of https://github.com/PaddlePaddle/FastDeploy.git synced 2026-04-23 00:17:25 +08:00

Author	SHA1	Message	Date
xjkmfa	3b203994e2	[Benchmark] Update Qwen3 vl 32k yaml (#6946)	2026-03-20 11:48:53 +08:00
xjkmfa	a81116ad90	[Benchmark] Update Qwen3 vl dense yaml (#6945)	2026-03-20 11:26:47 +08:00
Zhang Yulong	66c9e11998	[benchmark] update tools (#6437)	2026-02-10 17:48:55 +08:00
xiegegege	51c6fa8afc	[CE]add 21b cpu cache ,glm mtp,glm for rl config (#6328)	2026-02-03 20:10:47 +08:00
xjkmfa	e27a7cc5b0	[Benchmark] Ce qwen3 vl (#6288) * [CE]qwen3-vl	2026-02-03 14:17:28 +08:00
xiegegege	e22c4e29bb	[CE]add paddleocr config yaml (#6097)	2026-01-19 20:07:42 +08:00
xiegegege	97e340eb14	[CE]add pd router and wint4 tp4 config (#5554)	2025-12-15 15:25:14 +08:00
tianlef	13cc7dacfd	[Doc]add text/vl cinn ce config (#5532)	2025-12-12 16:16:06 +08:00
xiegegege	b7e1e6c953	[CE]change yaml name	2025-12-04 19:14:11 +08:00
tianlef	04d35ace5e	[CE]add wint4 ep (#5355)	2025-12-03 15:17:47 +08:00
xiegegege	eae34a416c	[benchmark]add qwen3-235b pd+ep yaml (#5225)	2025-11-25 19:53:30 +08:00
tianlef	de43577a7c	[Docs] add ebvlthinking yaml (#5120)	2025-11-19 15:27:46 +08:00
Juncai	08ca0f6aea	[Feature] [PD] add simple router and refine splitwise deployment (#4709) * add simple router and refine splitwise deployment * fix	2025-11-06 14:56:02 +08:00
kxz2002	a2870ed4a9	[Feature] Unify the registration name recognition for tool_parser and reasoning_parser to “-” (#4668) * parser register name unify * change ernie_x1 to ernie-x1 * change ernie4_5_vl to ernie-45-vl * fix unit test	2025-10-31 10:45:27 +08:00
xjkmfa	19df1aec2b	[Docs] add Qwen25vl yaml (#4662) * Add ci case for min token and max token * 【CI case】include total_tokens in the last packet of completion interface stream output * 【CE】add qwen25-vl * 【CE】add qwen25-vl --------- Co-authored-by: xujing43 <xujing43@baidu.com>	2025-10-29 17:39:40 +08:00
RAM	86d5006a57	[Graph Optimization][Speculative Decoding] Update yaml and fix typo (#4612)	2025-10-28 11:43:26 +08:00
ophilia-lee	70aa7423f8	Adapt benchmark tool to the SGLang framework (#4607) * Adapt benchmark tool to the SGLang framework * Adapt benchmark tool to the SGLang framework * Adapt benchmark tool to the SGLang framework	2025-10-27 18:52:56 +08:00
tianlef	2676a918f0	[Doc]fix deepseek ce (#4560)	2025-10-23 14:09:11 +08:00
tianlef	153f15db39	[Doc]add deepseek wint4 ce (#4517)	2025-10-21 16:41:51 +08:00
RAM	775edcc09a	[Executor] Default use CUDAGraph (#3594) * add start intercept * Adjustment GraphOptConfig * pre-commit * default use cudagraph * set default value * default use cuda graph * pre-commit * fix test case bug * disable rl * fix moba attention * only support gpu * Temporarily disable PD Disaggregation * set max_num_seqs of test case as 1 * set max_num_seqs and temperature * fix max_num_batched_tokens bug * close cuda graph * success run wint2 * profile run with max_num_batched_tokens * 1.add c++ memchecker 2.success run wint2 * updatee a800 yaml * update docs * 1. delete check 2. fix plas attn test case * default use use_unique_memory_pool * add try-except for warmup * ban mtp, mm, rl * fix test case mock * fix ci bug * fix form_model_get_output_topp0 bug * fix ci bug * refine deepseek ci * refine code * Disable PD * fix sot yaml	2025-10-21 14:25:45 +08:00
tianlef	14eb8b4f8b	add x1 a3b quantization (#4397)	2025-10-14 15:04:06 +08:00
tianlef	8a964329f4	add glm benchmark yaml (#4289)	2025-09-26 14:23:29 +08:00
tianlef	e79a1a7938	x1_a3b config (#4135)	2025-09-16 19:44:46 +08:00
xiegegege	d682c97dd3	[benchmark]add lite-vl and x1 yaml (#4130)	2025-09-16 16:38:36 +08:00
tianlef	83bf1fd5aa	[Doc]add plas attention config (#4128)	2025-09-16 15:55:12 +08:00
tianlef	0bc7d076fc	[CE]add x1 w4a8c8 benchamrk config (#3607) * [CE]add x1 w4a8c8 benchamrk config * [CE]add x1 w4a8c8 benchamrk config * [CE]add x1 w4a8c8 benchamrk config	2025-08-26 11:27:32 +08:00
Zhang Yulong	9ff2dfb162	Create eb45-8k-fp8-tp1-dp8_ep.yaml (#3485) Hybrid-architecture EP parallel yaml	2025-08-20 14:33:54 +08:00
xiegegege	e3a843f2c5	[benchmark] add quantization for benchmark yaml (#2995)	2025-07-24 13:26:34 +08:00
Zero Rains	25698d56d1	polish code with new pre-commit rule (#2923)	2025-07-19 23:19:27 +08:00
RAM	0fad10b35a	[Executor] CUDA Graph support padding batch (#2844) * cuda graph support padding batch * Integrate the startup parameters for the graph optimization backend and provide support for user - defined capture sizes. * Do not insert max_num_seqs when the user specifies a capture list * Support set graph optimization config from YAML file * update cuda graph ci * fix ci bug * fix ci bug	2025-07-15 19:49:01 -07:00
ophilia-lee	33db137d0b	Add vLLM default request parameters yaml	2025-07-15 19:31:27 +08:00
Divano	be5cabaf80	add quick benchmark (#2703) Test scripts do not need to go through CI	2025-07-04 09:32:36 +08:00
Jiang-Jia-Jun	92c2cfa2e7	Sync v2.0 version of code to github repo	2025-06-29 23:29:37 +00:00

33 Commits