YuBaoku
cae2709eff
[CI] Update stable test workflow and run.sh script ( #6352 )
2026-02-05 11:01:15 +08:00
Divano
ba9d2a9e5a
[CI] add update weights tests ( #6242 )
2026-01-27 20:54:21 +08:00
Yonghua Li
833d00e2d7
[BugFix] move cache creation back to cache transfer process and adapt clear/update ( #6144 )
...
* [fix] move cache creation back to cache transfer process
* [fix] fix clear cache
* [chore] change some log level
* [fix] fix clear cache
* [fix] fix clear cache for blockwisefp8 and mtp
* [fix] fix c8
* [fix] fix clear_mtp_cache args
* [chore] update cache_transfer_manager
* [fix] fix update mtp cache
2026-01-24 21:59:13 +08:00
fxyfxy777
4c92035f2d
[Feature] Unify fp8 block_wise quant ops ( #5991 )
...
* quant stash
* blockwise_quant
* precommit
* rm tensor.cut
* tp ok
* add swiglu
* rm outdate code
* fix activate ut
* change baseline
* fix baseline error
2026-01-15 05:50:37 -08:00
Ryan
0d1a5e70bc
[Graph Optimization] Add full_cuda_graph to control subgraph split ( #6027 )
2026-01-14 11:43:59 +08:00
Yonghua Li
456637002d
[BugFix] fix cache transfer manager updating/clearing ( #5930 )
...
* [fix] fix cache transfer manager updating/clearing
* [fix] fix code style
* [fix] fix config
* [fix] fix engine client
* [fix] let worker update kv cache status signal
* [fix] update worker process
* [fix] fix clear/update for case if comm group is shutdown
* [fix] update dynamic weight manager
* [fix] fix port
* [fix] add num_cpu_blocks arg for async_llm, and remove unnecessary waiting
2026-01-13 05:09:29 -08:00
Yonghua Li
a8d3e3ba12
[BugFix] fix shm opened but not closed in set_data_ipc ( #5826 )
2025-12-29 23:35:07 +08:00
YuBaoku
4c22a5afb8
[CI] Disable GPU cleanup due to CI machine limitations ( #5781 )
2025-12-26 00:11:06 +08:00
YuBaoku
7247dc5f3a
[CI] Add retry and robust cleanup for removal ( #5725 )
...
* [CI] Add retry and robust cleanup for removal
* [CI] Ensure clean GPU memory by killing leftover processes
2025-12-25 17:08:27 +08:00
YuBaoku
0410c42a9a
[CI] Refactor RL tests to reuse stable_test ( #5516 )
...
* [CI] Refactor RL tests to reuse stable_test
2025-12-24 19:18:00 +08:00
Divano
6b0fba8294
Update run.sh
2025-12-24 15:35:17 +08:00
Nyakku Shigure
11227e00bb
[GraphOptimization] Wrap deep gemm and triton as python op ( #5673 )
...
* [GraphOptimization] Wrap deep gemm and triton as python op
* add unitest to _base_test && compatibility
* paddle.static.MetaTensor -> "paddle.static.MetaTensor"
* mv register_custom_python_op
* rename yaml
---------
Co-authored-by: DrRyanHuang <zihaohuang@aliyun.com >
2025-12-24 15:23:46 +08:00
ophilia-lee
99258e19c8
[Benchmark]支持Completions接口 ( #5700 )
...
* benchmark工具支持受限解码场景指定response_format
* Update backend_request_func.py
output.success判断兼容思考内容超长截断时回复内容为空的情况
* Update benchmark_serving.py
更新benchmark_metrics
* 支持Completions接口
* 支持Completions接口
* 支持Completions接口
* [Benchmark]支持Completions接口
* [Benchmark]支持Completions接口
---------
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com >
2025-12-23 19:46:23 +08:00
Ryan
d01cb274d6
[Graph Optimization][CI] Add ERNIE45T 21B sot test ( #5538 )
CE Compile Job / ce_job_pre_check (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Publish Job / publish_pre_check (push) Has been cancelled
Publish Job / print_publish_pre_check_outputs (push) Has been cancelled
Publish Job / FD-Clone-Linux (push) Has been cancelled
Publish Job / Show Code Archive Output (push) Has been cancelled
Publish Job / BUILD_SM8090 (push) Has been cancelled
Publish Job / BUILD_SM8689 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8090 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8689 (push) Has been cancelled
Publish Job / Run FD Image Build (push) Has been cancelled
Publish Job / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
Publish Job / Run FastDeploy LogProb Tests (push) Has been cancelled
Publish Job / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
Publish Job / Run Base Tests (push) Has been cancelled
Publish Job / Run Accuracy Tests (push) Has been cancelled
Publish Job / Run Stable Tests (push) Has been cancelled
CI Images Build / FD-Clone-Linux (push) Has been cancelled
CI Images Build / Show Code Archive Output (push) Has been cancelled
CI Images Build / CI Images Build (push) Has been cancelled
CI Images Build / BUILD_SM8090 (push) Has been cancelled
CI Images Build / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
CI Images Build / Run FastDeploy LogProb Tests (push) Has been cancelled
CI Images Build / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
CI Images Build / Run Base Tests (push) Has been cancelled
CI Images Build / Publish Docker Images Pre Check (push) Has been cancelled
2025-12-13 00:43:15 +08:00
Ryan
e58fed3665
[Graph Optimization][BugFix][CI] Fix 0size bug && add unitest ( #5495 )
2025-12-11 16:25:26 +08:00
kevin
f7e832efaf
[BugFix] fix mm cudagraph ( #5266 )
...
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
* fix mm cudagraph
* fix test_prompt_ids bug
* update code
* update ci code
* update ci code
* update ci code
2025-12-09 11:51:00 +08:00
Divano
b935101008
Create test_prompt_ids.py
2025-11-28 10:11:51 +08:00
Zhang Yulong
5aa73d32f4
Update deploy.py ( #4850 )
2025-11-06 19:09:28 +08:00
YuBaoku
9eff788658
[CI] fix some ci yaml ( #4747 )
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
2025-11-02 21:28:04 +08:00
ApplEOFDiscord
14f8cddaf1
[Feature] add mm token usage ( #4570 )
...
* add mm token usage
* fix unit test
* fix unit test
* fix unit test
* fix model path
* fix unit test
* fix unit test
* fix unit test
* remove uncomment
* change var name
* fix code style
* fix code style
* fix code style
* fix code style
* fix unit test
2025-10-29 14:37:12 +08:00
SunLei
2a9ed72533
feat: add support for API usage with multimodal models ( #4548 )
...
* feat: add support for API usage with multimodal models
* completion_tokens contains num_image_tokens
* remove test_request.py
* fix: paddle.device.is_compiled_with_cuda()
* fix test_unstream_without_logprobs
2025-10-28 20:23:46 +08:00
Zhang Yulong
b4014834a9
Extend sleep time to 10 seconds in switch_service ( #4618 )
...
Increase sleep duration before switching services.
2025-10-28 15:19:21 +08:00
kxz2002
327fa4c255
[DataProcessor] add reasoning_tokens into usage info ( #4520 )
...
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
* add reasoning_tokens into usage info initial commit
* add unit tests
* modify unit test
* modify and add unit tests
* fix unit test
* move steam usage to processor
* modify processor
* modify test_logprobs
* modify test_logprobs.py
* modify stream reasoning tokens accumulation
* fix unit test
2025-10-25 16:57:58 +08:00
RAM
8a02ab43a8
[FDConfig]Turn on the CUDAGraph + RL switch ( #4508 )
...
* Turn on the CUDAGraph + RL switch
* reduce max_num_seqs and number of request
2025-10-23 11:08:07 +08:00
Divano
2b53c4d684
【CI】Add test cases for n parameter and streaming validation ( #4503 )
...
* add repitation early stop cases
* add repitation early stop cases
* add structure test openai
* add n parameters case
2025-10-21 15:33:29 +08:00
RAM
775edcc09a
[Executor] Default use CUDAGraph ( #3594 )
...
* add start intercept
* Adjustment GraphOptConfig
* pre-commit
* default use cudagraph
* set default value
* default use cuda graph
* pre-commit
* fix test case bug
* disable rl
* fix moba attention
* only support gpu
* Temporarily disable PD Disaggregation
* set max_num_seqs of test case as 1
* set max_num_seqs and temperature
* fix max_num_batched_tokens bug
* close cuda graph
* success run wint2
* profile run with max_num_batched_tokens
* 1.add c++ memchecker 2.success run wint2
* updatee a800 yaml
* update docs
* 1. delete check 2. fix plas attn test case
* default use use_unique_memory_pool
* add try-except for warmup
* ban mtp, mm, rl
* fix test case mock
* fix ci bug
* fix form_model_get_output_topp0 bug
* fix ci bug
* refine deepseek ci
* refine code
* Disable PD
* fix sot yaml
2025-10-21 14:25:45 +08:00
kxz2002
b5b993e48e
【feature】support n parameter ( #4273 )
...
CE Compile Job / ce_job_pre_check (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Publish Job / publish_pre_check (push) Has been cancelled
Publish Job / print_publish_pre_check_outputs (push) Has been cancelled
Publish Job / FD-Clone-Linux (push) Has been cancelled
Publish Job / Show Code Archive Output (push) Has been cancelled
Publish Job / BUILD_SM8090 (push) Has been cancelled
Publish Job / BUILD_SM8689 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8090 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8689 (push) Has been cancelled
Publish Job / Run FD Image Build (push) Has been cancelled
Publish Job / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
Publish Job / Run FastDeploy LogProb Tests (push) Has been cancelled
Publish Job / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
Publish Job / Run Base Tests (push) Has been cancelled
Publish Job / Run Accuracy Tests (push) Has been cancelled
Publish Job / Run Stable Tests (push) Has been cancelled
CI Images Build / FD-Clone-Linux (push) Has been cancelled
CI Images Build / Show Code Archive Output (push) Has been cancelled
CI Images Build / CI Images Build (push) Has been cancelled
CI Images Build / BUILD_SM8090 (push) Has been cancelled
CI Images Build / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
CI Images Build / Run FastDeploy LogProb Tests (push) Has been cancelled
CI Images Build / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
CI Images Build / Run Base Tests (push) Has been cancelled
CI Images Build / Run Accuracy Tests (push) Has been cancelled
CI Images Build / Run Stable Tests (push) Has been cancelled
CI Images Build / Publish Docker Images Pre Check (push) Has been cancelled
* support n parameter
* pre-commit check
* pre-commit check
* restore format_and_add_data
* update n_param
* bug fix index - str to int
* bug fix del child_task
* bug fix metrics
* add debug info
* add debug info2
* remove debug info
* change connecting symbol to '-'
* bugfix change connecting symbol
* bugfix change connecting symbol2
* unit tests fix
* unit test fix2
* unittest add param n=2
* n param add unit tests and adapt to echo
* pre-commit fix
* resolve review
* adjust stop reason
* add unittest for _create_chat_completion_choice
* modify unittest
* solve confict
* solve conflict
* resolve conflict
---------
Co-authored-by: LiqinruiG <37392159+LiqinruiG@users.noreply.github.com >
Co-authored-by: gaoziyuan <m13689897706@163.com >
2025-10-17 20:51:59 +08:00
Ryan
49cea8fb1c
[SOT][Cudagraph] Remove BreakGraph of #3302 && update CustomOp ( #3694 )
...
* rm inplace info && to(gpu)
* update append_attention
* unpin paddle version
* add full_cuda_graph=False
* add blank line
---------
Co-authored-by: SigureMo <sigure.qaq@gmail.com >
2025-10-17 10:57:55 +08:00
LiqinruiG
4251ac5e95
【Fix】 remove text_after_process & raw_prediction ( #4421 )
...
* remove text_after_process & raw_prediction
* remove text_after_process & raw_prediction
2025-10-16 19:00:18 +08:00
yinwei
20c7b741f4
[XPU] Support W4A8C8-TP4-300B Model ( #4068 )
...
* support w4a8
* delete ep block attn
* delete moe_topk_select
* update note
* update
* delte useless info
* update
* add some note
* fix some format
* update scale info
* add ans baseline
---------
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com >
2025-10-10 15:41:32 +08:00
co63oc
c4830ef24c
fix typos ( #4176 )
...
* fix typos
* fix
2025-09-22 14:27:17 +08:00
Divano
0b62648924
test xly ci
2025-09-22 14:13:00 +08:00
Divano
8e49d99009
Addcase ( #4112 )
...
logprob 没跑,不影响,增加校验openai 异常情况下 错误输出格式字段的case
2025-09-16 16:12:14 +08:00
xiaolei373
9ac539471d
[format] Valid para format error info ( #4035 )
...
* feat(log):add_request_and_response_log
* 报错信息与OpenAI对齐
2025-09-12 19:05:17 +08:00
Zhang Yulong
2359c8d21c
update ci ( #3962 )
...
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com >
2025-09-09 10:09:13 +08:00
Zhang Yulong
349aa6348b
add cache queue port ( #3904 )
...
CE Compile Job / ce_job_pre_check (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Publish Job / publish_pre_check (push) Has been cancelled
Publish Job / print_publish_pre_check_outputs (push) Has been cancelled
Publish Job / FD-Clone-Linux (push) Has been cancelled
Publish Job / Show Code Archive Output (push) Has been cancelled
Publish Job / BUILD_SM8090 (push) Has been cancelled
Publish Job / BUILD_SM8689 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8090 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8689 (push) Has been cancelled
Publish Job / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
Publish Job / Run FastDeploy LogProb Tests (push) Has been cancelled
Publish Job / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
Publish Job / Run Base Tests (push) Has been cancelled
Publish Job / Run Accuracy Tests (push) Has been cancelled
Publish Job / Run Stable Tests (push) Has been cancelled
CI Images Build / FD-Clone-Linux (push) Has been cancelled
CI Images Build / Show Code Archive Output (push) Has been cancelled
CI Images Build / CI Images Build (push) Has been cancelled
CI Images Build / BUILD_SM8090 (push) Has been cancelled
CI Images Build / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
CI Images Build / Run FastDeploy LogProb Tests (push) Has been cancelled
CI Images Build / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
CI Images Build / Run Base Tests (push) Has been cancelled
CI Images Build / Run Accuracy Tests (push) Has been cancelled
CI Images Build / Run Stable Tests (push) Has been cancelled
CI Images Build / Publish Docker Images Pre Check (push) Has been cancelled
* add cache queue port
* add cache queue port
* add cache queue port
2025-09-05 21:17:06 +08:00
Divano
17731a8acd
add concurrency cases ( #3689 )
2025-08-28 18:30:19 +08:00
ltd0924
3d92fb09f7
[BugFix] fix parameter is 0 ( #3592 )
...
* Update engine_client.py
* fix
* Update common_engine.py
2025-08-28 09:52:36 +08:00
xjkmfa
afb9f327ef
【CI case】for echo finish_reason text_after_process and raw_prediction check ( #3630 )
...
* Add ci case for min token and max token
* 【CI case】include total_tokens in the last packet of completion interface stream output
* echo&finish_reason&text_after_process&raw_prediction check
* echo&finish_reason&text_after_process&raw_prediction check
* echo&finish_reason&text_after_process&raw_prediction check
* echo&finish_reason&text_after_process&raw_prediction check
* echo&finish_reason&text_after_process&raw_prediction check
---------
Co-authored-by: xujing43 <xujing43@baidu.com >
2025-08-27 15:21:16 +08:00
Ryan
a5b4866ff1
[CudaGraph][SOT] Add unit tests for splitting the static graph into piecewise graphs that support cuda_graph ( #3590 )
...
* add unitest
* change sot_warmup_sizes
* wtf; add missed commit
2025-08-26 11:25:04 +08:00
chen
9cab3f47ff
[Feature] Add temp_scaled_logprobs and top_p_normalized_logprobs parameters for logits and logprobs post processing ( #3552 )
...
* [feature] Add temp_scaled_logprobs and top_p_normalized_logprobs parameters for logits and logprobs post processing
* infer engine support temp_scaled_logprobs and top_p_normalized_logprobs
* delete some code
* code check
* code check and add doc
* fix tokenizer.decoder(-1), return 'Invalid Token'
* add ci for temp_scaled and top_p logprobs
* check test
* check seq len time shape
* logprob clip inf
---------
Co-authored-by: sunlei1024 <sunlei5788@gmail.com >
2025-08-25 14:11:49 +08:00
YuBaoku
7821534ff5
[CI] add sot test ( #3579 )
...
* [CI] add sot test
* [CI] add sot test
2025-08-25 10:14:50 +08:00
Zhang Yulong
3cc182236a
update ci ( #3519 )
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
2025-08-21 20:05:50 +08:00
Yzc216
466cbb5a99
[Feature] Models api ( #3073 )
...
* add v1/models interface related
* add model parameters
* default model verification
* unit test
* check model err_msg
* unit test
* type annotation
* model parameter in response
* modify document description
* modify document description
* unit test
* verification
* verification update
* model_name
* pre-commit
* update test case
* update test case
* Update tests/entrypoints/openai/test_serving_models.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Update tests/entrypoints/openai/test_serving_models.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Update tests/entrypoints/openai/test_serving_models.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Update tests/entrypoints/openai/test_serving_models.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Update fastdeploy/entrypoints/openai/serving_models.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
---------
Co-authored-by: LiqinruiG <37392159+LiqinruiG@users.noreply.github.com >
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
2025-08-21 17:02:56 +08:00
Zhang Yulong
b7eee3aec1
Update CI ( #3474 )
...
* update CI cases
* update CI cases
* update CI cases
* update CI cases
* Merge upstream/develop and resolve directory rename conflict
* Merge upstream/develop and resolve directory rename conflict
* Merge upstream/develop and resolve directory rename conflict
* update deploy
* update deploy
* update deploy
* update deploy
* update deploy
2025-08-21 16:49:20 +08:00
YUNSHEN XIE
3a6058e445
Add stable ci ( #3460 )
...
* add stable ci
* fix
* update
* fix
* rename tests dir;fix stable ci bug
* add timeout limit
* update
2025-08-20 08:57:17 +08:00