Yonghua Li
0c01cccc32
[BugFix] fix double shutdown of comm group when rank0 clears weights slower than other ranks ( #5715 )
2025-12-25 21:48:53 +08:00
kevin
5538dda3c8
[Feature] pd support dy-c8 ipc ( #5750 )
...
* pd support dy-c8 ipc
* update code
* support v0
* update code
2025-12-25 21:22:34 +08:00
kevin
4fa76296d9
[BugFix] fix mm splitwise scheduler bug ( #5604 )
...
* fix mm splitwise scheduler bug
* fix test case bug
* update code
* update code
2025-12-25 04:08:11 -08:00
ophilia-lee
d5f5dc4f6e
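[Benchmark] Increase aiohttp default read buffer size to 10M to fix "Chunk too big" errors on oversized streaming response chunks ( #5771 )
...
* benchmark tool supports specifying response_format for constrained decoding scenarios
* Update backend_request_func.py
output.success check now handles empty reply content when overlong reasoning content is truncated
* Update benchmark_serving.py
Update benchmark_metrics
* Support Completions API
* Support Completions API
* Support Completions API
* [Benchmark] Support Completions API
* [Benchmark] Support Completions API
* [Benchmark] async_request_eb_openai_completions: increase aiohttp default read buffer size to 4M to fix "Chunk too big" errors on oversized streaming response chunks
* [Benchmark] Increase aiohttp default read buffer size to 10M to fix "Chunk too big" errors on oversized streaming response chunks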
---------
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com >
2025-12-25 19:36:11 +08:00
Copilot
1cbf448178
[Feature] Add startup version check mechanism for Paddle ( #5769 )
...
* Initial plan
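* Implement version check mechanism: add get_version_info function and check the Paddle version at startup
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com >
* Address code review feedback: improve error handling and logging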
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com >
* Change comments and warning messages from Chinese to English
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com >
* Update fastdeploy/__init__.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com >
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com >
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
2025-12-25 19:29:04 +08:00
freeliuzc
9018ccf74e
[Speculative Decoding] Fix attn_mask_offset for multi-step MTP in mixed and PD-split modes ( #5738 )
...
* fix attn_mask_offset in mtp with multi-step and pd-split-mode
* fix xpu operator register
* update pmtp multi-step mtp strategy in pd-split-mode
* add note
* fix xpu register
2025-12-25 01:54:59 -08:00
YuBaoku
7247dc5f3a
[CI] Add retry and robust cleanup for removal ( #5725 )
...
* [CI] Add retry and robust cleanup for removal
* [CI] Ensure clean GPU memory by killing leftover processes
2025-12-25 17:08:27 +08:00
Juncai
412867fd99
[Feature] Support KV Cache Storage ( #5571 )
...
* Support Mooncake Store
* up
* up
* add op
* fix conflict
* fix error
* up for comments
* avoid thread lock
* up
* fix unittest
* fix unittest
* remove debug info
* consider tp_size > 1
* add default rdma_nics
* add utils
* up
* fix error
---------
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com >
2025-12-25 16:30:35 +08:00
memoryCoderC
be3be4913a
[Optimization] refactor(chat_handler,completion_handler): extract base classes and use AsyncLLM ( #5195 )
...
* [Optimization] refactor(chat_handler,completion_handler): extract base classes and use AsyncLLM
* [Optimization] refactor(chat_handler,completion_handler): rename class
2025-12-25 16:28:15 +08:00
Jiaxin Sui
8fc789bb3f
[iluvatar][CI] refactor iluvatar_ci ( #5588 )
...
* refactor iluvatar_ci
* refactor iluvatar_ci
* refactor iluvatar_ci
* refactor iluvatar_ci
* refactor iluvatar_ci
* refactor iluvatar_ci
* refactor iluvatar_ci
* refactor iluvatar_ci
* refactor iluvatar_ci
* Update Docker image tag in iluvatar_test workflow
* Update default Docker image version in workflow
* Update iluvatar_test.yml
* Update default Docker image in workflow config
* Update model path in run_ernie300B_4layer.py
* Update model path in offline inference check
* Add model_data directory and copy model files
Create model_data directory and copy necessary files.
* Update run_ernie_vl_28B.py
* Update run_ernie300B_4layer.py
* Update paddlepaddle installation method in script
* Change wget command to include proxy option
* Modify paddle package installation in CI script
Updated installation commands for paddle packages.
* Update paddlepaddle and paddle-iluvatar-gpu versions
* Delete .github/workflows/ci_iluvatar.yml
* Rename workflow from ILUVATAR Test to ILUVATAR-CI
* Update installation commands for paddlepaddle and iluvatar
2025-12-25 15:10:34 +08:00
qw86972190
135e47d551
[XPU]ZMQ logprob ( #5628 )
...
* [XPU]ZMQ logprob
2025-12-25 14:50:01 +08:00
Yuanle Liu
75b3180280
[BugFix] Fix _disable_sequence_parallel_moe_if_needed ( #5740 )
2025-12-24 20:02:22 -08:00
MingkunZhang
e48e306134
[Metax] update ci bash ( #5760 )
...
Co-authored-by: root <root@lt-wks-10-0-180-15.pub.metax-tech.com >
2025-12-25 11:47:38 +08:00
bukejiyu
f0bbdce849
[Loader]Fix bug in MTP weight loading ( #5744 )
...
* fix torch mtp
* fix
* update
2025-12-25 11:32:17 +08:00
RuohengMa
e154c03416
[XPU] refine moe_expert_ffn ut ( #5743 )
2025-12-25 10:35:24 +08:00
YuBaoku
9624bf3c6e
[CI] Fix image build to use the correct upstream artifacts
2025-12-24 22:44:34 +08:00
chenjian
b90a922f98
[Bug fix] Set enable_cache_output as false by default ( #5751 )
2025-12-24 21:37:24 +08:00
YuBaoku
6e39f88ca0
[CI] Fix ci_image_update error of no depends
2025-12-24 21:28:38 +08:00
YuBaoku
0410c42a9a
[CI] Refactor RL tests to reuse stable_test ( #5516 )
...
* [CI] Refactor RL tests to reuse stable_test
2025-12-24 19:18:00 +08:00
freeliuzc
2dc2ba49b5
[Speculative Decoding] Fix multistep MTP in splitwise-prefill mode ( #5723 )
2025-12-24 02:45:54 -08:00
YuBaoku
e75f93d302
[CI] Refactor RL tests to reuse test_metrics ( #5741 )
2025-12-24 17:08:40 +08:00
chen
c7ab32d154
check ( #5736 )
2025-12-24 16:49:20 +08:00
Divano
6b0fba8294
Update run.sh
2025-12-24 15:35:17 +08:00
Nyakku Shigure
11227e00bb
[GraphOptimization] Wrap deep gemm and triton as python op ( #5673 )
...
* [GraphOptimization] Wrap deep gemm and triton as python op
* add unittest to _base_test && compatibility
* paddle.static.MetaTensor -> "paddle.static.MetaTensor"
* mv register_custom_python_op
* rename yaml
---------
Co-authored-by: DrRyanHuang <zihaohuang@aliyun.com >
2025-12-24 15:23:46 +08:00
bukejiyu
ba4b7afb3a
[Others] Rename tensor_parallel_degree to tensor_model_parallel_size for paddleformers 0.4.1 ( #5727 )
2025-12-23 23:19:11 -08:00
GoldPancake
a0fed22ddb
[Feature] Add entropy calculation script
2025-12-24 15:00:06 +08:00
xunyoyo
8acdd9f156
[CI] 【Hackathon 9th Sprint No.41】NO.41 Add unit tests for functional modules -new
...
Add splitwise connector tests
Co-authored-by: CSWYF3634076 <wangyafeng@baidu.com >
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com >
2025-12-24 14:05:32 +08:00
YuBaoku
672620cdfe
Revert "[CI] Adapt vl_model baseline changes due to Paddle update ( #5576 )" ( #5732 )
...
This reverts commit 63fff8df70 .
2025-12-24 11:59:27 +08:00
周周周
922a73ddd6
[Others] clean code ( #5691 )
2025-12-24 11:28:47 +08:00
GoldPancake
23d488c488
[Feature] Entropy calculation support ( #5692 )
...
* support entropy
* fix bug
---------
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com >
2025-12-23 21:19:47 +08:00
bukejiyu
d1c6e57341
[Others] upgrade paddleformers to 0.4.0 ( #5599 )
2025-12-23 05:08:01 -08:00
ming1753
85db9d5e56
[Others] reschedule preempt task support optional func ( #5649 )
...
* [Others] reschedule preempt task support optional func
* fix bug
* fix bug
2025-12-23 20:45:52 +08:00
Copilot
5cec66adb8
[Docs] Update environment variable documentation to sync with the latest code ( #5713 )
...
* Initial plan
* Update environment variable documentation to match the latest code
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com >
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com >
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com >
2025-12-23 19:49:20 +08:00
ophilia-lee
99258e19c8
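[Benchmark] Support Completions API ( #5700 )
...
* benchmark tool supports specifying response_format for constrained decoding scenarios
* Update backend_request_func.py
output.success check now handles empty reply content when overlong reasoning content is truncated
* Update benchmark_serving.py
Update benchmark_metrics
* Support Completions API
* Support Completions API
* Support Completions API
* [Benchmark] Support Completions API
* [Benchmark] Support Completions API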
---------
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com >
2025-12-23 19:46:23 +08:00
ming1753
04c30521dd
[Others] plugin raise error msg ( #5675 )
2025-12-23 18:56:54 +08:00
kesmeey
f15edbb6ef
[CI]【Hackathon 9th Sprint No.40】Add unit tests for functional module fastdeploy/entrypoints/openai/api_server.py ( #5567 )
...
* Add tests for openai api_server coverage
* update
* Update tests for openai api_server
* fix bugs
* test: disable some api_server lifespan/controller tests for local env
* Format test_api_server with black
* update
* update
* test: narrow envs patch in api_server tests to avoid side effects
* fix: separate MagicMock creation to avoid missing req argument
* fix: patch TRACES_ENABLE env var in api_server tests
* fix: use os.environ patch for TRACES_ENABLE
* test: use fake fastdeploy.envs in api_server tests
* test: pass fake Request into chat/completion routes
* test: increase coverage for tracing and scheduler control
* fix: set dynamic_load_weight in tracing headers test
* ci: add retry and validation for FastDeploy.tar.gz download
* ci: fix indentation in _base_test.yml
* refactor: simplify test_api_server.py (807->480 lines, ~40% reduction)
* fix: restore missing args attributes (revision, etc.) in _build_args
* fix: patch sys.argv to prevent SystemExit: 2 in api_server tests
* improve coverage
* Remove docstring from test_api_server.py
Removed unnecessary docstring from test_api_server.py
---------
Co-authored-by: CSWYF3634076 <wangyafeng@baidu.com >
2025-12-23 18:06:43 +08:00
Copilot
e9f5397bc9
[Docs] Update parameters documentation with latest code defaults and new parameters ( #5709 )
...
* Initial plan
* Update parameters documentation with correct default values and new parameters
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com >
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com >
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com >
2025-12-23 17:31:44 +08:00
Divano
c1aa66df02
Revert "[Optim] Remove limitation of number of kvcache blocks ( #5612 )" ( #5702 )
...
This reverts commit 9da89a374b .
2025-12-23 15:41:33 +08:00
Jiaxin Sui
0bef9b684f
[Metax][CI]fix CI bug ( #5698 )
...
* Update run_ci_metax.sh
* Fix pull request branch reference in CI workflow
2025-12-23 14:56:34 +08:00
RuohengMa
2c3c983b96
[XPU] modify speculate_verify ( #5522 )
2025-12-23 14:50:30 +08:00
MingkunZhang
945a1bc4e2
[Metax] update ci name ( #5679 )
...
* [Metax] update ci name
* Update CI_METAX workflow for pull request handling
* Update ci_metax.yml
* Update CI_METAX workflow for pull request handling
* Remove commented-out code in run_ci_metax.sh
* Add environment to Jenkins trigger job
* Change trigger event from pull_request_target to pull_request
* Fix environment name casing in CI workflow
* Change environment name from Metax-ci to Metax_ci
* Modify CI_METAX workflow for PR targeting and concurrency
Updated workflow to use pull_request_target event and added concurrency settings.
---------
Co-authored-by: Jiaxin Sui <95567040+plusNew001@users.noreply.github.com >
2025-12-23 14:00:48 +08:00
bukejiyu
6c36a17369
[Others]Prevent core dumps during Paddle version check ( #5657 )
2025-12-22 21:57:45 -08:00
Jiang-Jia-Jun
9da89a374b
[Optim] Remove limitation of number of kvcache blocks ( #5612 )
...
* [Optim] Remove limitation of number of kvcache blocks
* Update fastdeploy/envs.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Update fastdeploy/worker/iluvatar_worker.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Add docs
* Update fastdeploy/worker/worker_process.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* fix ci case
---------
Co-authored-by: Jiang-Jia-Jun <jiangjiajun@baidu.com >
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
2025-12-23 11:18:29 +08:00
ddchenhao66
4a74f5ab9b
[XPU]Set top_p=0.0 by default on XPU to optimize performance ( #5686 )
...
Co-authored-by: ddchenhao66 <dhaochen163.com>
2025-12-23 11:01:01 +08:00
xunyoyo
3aee5c4bf5
[CI] 【Hackathon 9th Sprint No.37】NO.37 Add unit tests for functional modules ( #5059 )
...
* Add unit tests for TokenProcessor functionality
* Add trace stubs for token processor tests
* Increase token processor test coverage
* Clean up imports in test_token_processor.py
Remove unnecessary path manipulation in test file.
* Cleanup: Remove unused imports in test_token_processor
Removed unused imports from the test file.
* Add trace_carrier to task in test cases
Added trace_carrier attribute to task in multiple test cases to ensure proper handling of trace information.
* Refine token processor tests for safe coverage
* Expand postprocess coverage
* Add ZMQ logprob parsing test
---------
Co-authored-by: CSWYF3634076 <wangyafeng@baidu.com >
Co-authored-by: Tao Luo <luotao02@baidu.com >
2025-12-23 10:35:16 +08:00
Jiaxin Sui
f16077a939
[XPU][CI] Xpu ci update ( #5690 )
...
* Enhance run_ci_xpu.sh with caching and prefill options
* Update model path and configuration in run_ci_xpu.sh
* Add '北朝' keyword to assertion in run_45vl.py
* Enhance process termination logic in run_ci_xpu.sh
* Set timeout for CI_XPU job to 60 minutes
* Remove extra newline in stop_processes function
* Update paddlepaddle-xpu installation command
Comment out the previous paddlepaddle-xpu installation command and replace it with a specific version installation due to EP parallel error.
* Update PaddlePaddle installation command
* Remove max_tokens from model response configuration
Removed max_tokens parameter from the model response call.
2025-12-23 10:19:39 +08:00
xiaolei373
dfe8ea941c
[log]console log to llm log ( #5680 )
2025-12-23 10:05:45 +08:00
RAM
131defa122
Revert "Revert "[Feature] Use paddle.compat.enable_torch_proxy in `fastdepl…" ( #5606 )
...
This reverts commit 021399f7c9 .
2025-12-22 22:37:51 +08:00
ddchenhao66
a1535c7e7e
[XPU][CI] xpu add ci test for pd + TP2 ( #5653 )
...
Co-authored-by: ddchenhao66 <dhaochen163.com>
2025-12-22 19:27:10 +08:00
Yuanle Liu
8beb0158fa
[BugFix] fix rl signal ( #5681 )
2025-12-22 00:35:54 -08:00