Yonghua Li
0c01cccc32
[BugFix] fix double shutdown of comm group when rank0 clears weights slower than other ranks ( #5715 )
2025-12-25 21:48:53 +08:00
kevin
5538dda3c8
[Feature] pd support dy-c8 ipc ( #5750 )
...
* pd support dy-c8 ipc
* update code
* support v0
* update code
2025-12-25 21:22:34 +08:00
kevin
4fa76296d9
[BugFix] fix mm splitwise scheduler bug ( #5604 )
...
* fix mm splitwise scheduler bug
* fix test case bug
* update code
* update code
2025-12-25 04:08:11 -08:00
Copilot
1cbf448178
[Feature] Add startup version check mechanism for Paddle ( #5769 )
...
* Initial plan
* Implement version check mechanism: add get_version_info function and check the Paddle version at startup
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com >
* Address code review feedback: improve error handling and logging
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com >
* Change comments and warning messages from Chinese to English
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com >
* Update fastdeploy/__init__.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com >
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com >
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
2025-12-25 19:29:04 +08:00
freeliuzc
9018ccf74e
[Speculative Decoding] Fix attn_mask_offset for multi-step MTP in mixed and PD-split modes ( #5738 )
...
* fix attn_mask_offset in mtp with multi-step and pd-split-mode
* fix xpu operator register
* update pmtp multi-step mtp strategy in pd-split-mode
* add note
* fix xpu register
2025-12-25 01:54:59 -08:00
Juncai
412867fd99
[Feature] Support KV Cache Storage ( #5571 )
...
* Support Mooncake Store
* up
* up
* add op
* fix conflict
* fix error
* up for comments
* avoid thread lock
* up
* fix unittest
* fix unittest
* remove debug info
* consider tp_size > 1
* add default rdma_nics
* add utils
* up
* fix error
---------
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com >
2025-12-25 16:30:35 +08:00
memoryCoderC
be3be4913a
[Optimization] refactor(chat_handler,completion_handler): extract base classes and use AsyncLLM ( #5195 )
...
* [Optimization] refactor(chat_handler,completion_handler): extract base classes and use AsyncLLM
* [Optimization] refactor(chat_handler,completion_handler): rename class
2025-12-25 16:28:15 +08:00
qw86972190
135e47d551
[XPU]ZMQ logprob ( #5628 )
...
* [XPU]ZMQ logprob
2025-12-25 14:50:01 +08:00
Yuanle Liu
75b3180280
[BugFix] Fix _disable_sequence_parallel_moe_if_needed ( #5740 )
2025-12-24 20:02:22 -08:00
bukejiyu
f0bbdce849
[Loader]Fix bug in MTP weight loading ( #5744 )
...
* fix torch mtp
* fix
* update
2025-12-25 11:32:17 +08:00
chenjian
b90a922f98
[Bug fix] Set enable_cache_output as false by default ( #5751 )
2025-12-24 21:37:24 +08:00
freeliuzc
2dc2ba49b5
[Speculative Decoding] Fix multistep MTP in splitwise-prefill mode ( #5723 )
2025-12-24 02:45:54 -08:00
Nyakku Shigure
11227e00bb
[GraphOptimization] Wrap deep gemm and triton as python op ( #5673 )
...
* [GraphOptimization] Wrap deep gemm and triton as python op
* add unit test to _base_test && compatibility
* paddle.static.MetaTensor -> "paddle.static.MetaTensor"
* mv register_custom_python_op
* rename yaml
---------
Co-authored-by: DrRyanHuang <zihaohuang@aliyun.com >
2025-12-24 15:23:46 +08:00
bukejiyu
ba4b7afb3a
[Others] Rename tensor_parallel_degree to tensor_model_parallel_size for paddleformers 0.4.1 ( #5727 )
2025-12-23 23:19:11 -08:00
GoldPancake
23d488c488
[Feature] Entropy calculation support ( #5692 )
...
* support entropy
* fix bug
---------
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com >
2025-12-23 21:19:47 +08:00
bukejiyu
d1c6e57341
[Others] upgrade paddleformer to 0.4.0 ( #5599 )
2025-12-23 05:08:01 -08:00
ming1753
85db9d5e56
[Others] reschedule preempt task support optional func ( #5649 )
...
* [Others] reschedule preempt task support optional func
* fix bug
* fix bug
2025-12-23 20:45:52 +08:00
ming1753
04c30521dd
[Others] plugin raise error msg ( #5675 )
2025-12-23 18:56:54 +08:00
Divano
c1aa66df02
Revert "[Optim] Remove limitation of number of kvcache blocks ( #5612 )" ( #5702 )
...
This reverts commit 9da89a374b .
2025-12-23 15:41:33 +08:00
RuohengMa
2c3c983b96
[XPU] modify speculate_verify ( #5522 )
2025-12-23 14:50:30 +08:00
bukejiyu
6c36a17369
[Others]Prevent core dumps during Paddle version check ( #5657 )
2025-12-22 21:57:45 -08:00
Jiang-Jia-Jun
9da89a374b
[Optim] Remove limitation of number of kvcache blocks ( #5612 )
...
* [Optim] Remove limitation of number of kvcache blocks
* Update fastdeploy/envs.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Update fastdeploy/worker/iluvatar_worker.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Add docs
* Update fastdeploy/worker/worker_process.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* fix ci case
---------
Co-authored-by: Jiang-Jia-Jun <jiangjiajun@baidu.com >
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
2025-12-23 11:18:29 +08:00
ddchenhao66
4a74f5ab9b
[XPU]Set top_p=0.0 by default on XPU to optimize performance ( #5686 )
...
Co-authored-by: ddchenhao66 <dhaochen163.com>
2025-12-23 11:01:01 +08:00
xiaolei373
dfe8ea941c
[log]console log to llm log ( #5680 )
2025-12-23 10:05:45 +08:00
RAM
131defa122
Revert "Revert "[Feature] Use paddle.compat.enable_torch_proxy in `fastdepl…" ( #5606 )
...
This reverts commit 021399f7c9 .
2025-12-22 22:37:51 +08:00
Yuanle Liu
8beb0158fa
[BugFix] fix rl signal ( #5681 )
2025-12-22 00:35:54 -08:00
Sunny-bot1
04035e4ebf
support w4afp8 two stage ( #5608 )
2025-12-22 15:13:05 +08:00
Sunny-bot1
40f3897a4e
support w4afp8 moe offline permute & load ( #5613 )
2025-12-22 15:12:57 +08:00
ming1753
81384ef29e
[BugFix] fix download feature bug ( #5669 )
2025-12-22 13:46:39 +08:00
freeliuzc
6eada4929d
[Speculative Decoding]Support multi-step mtp with cudagraph ( #5624 )
...
* support multi-step mtp with cudagraph
* fix usage
* fix unit test
2025-12-22 11:34:04 +08:00
Yonghua Li
4f830aa505
[RL] provide options for whether shutdown comm group after weights cleared ( #5663 )
...
* [rl] provide options for whether shutdown comm group after weights cleared
* [fix] fix args hardcode
* [fix] change args type
* [fix] add worker process args
2025-12-19 07:06:48 -08:00
kevin
807e404369
[BugFix] fix eb5 mm prefix cache bug ( #5638 )
...
* fix eb5 mm prefix cache bug
* update code
---------
Co-authored-by: Jiaxin Sui <95567040+plusNew001@users.noreply.github.com >
2025-12-19 14:57:37 +08:00
RichardWooSJTU
6bd772b93f
fix eplb weight updating ( #5529 )
...
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com >
2025-12-19 14:30:32 +08:00
Yuanle Liu
689f54f671
[RL] Update worker_process.py ( #5651 )
2025-12-18 20:07:58 -08:00
fmiao2372
a8fce47195
[Intel HPU] enable kv cache scheduler v1 for hpu ( #5648 )
...
* [Intel HPU] enable kv cache scheduler v1 for hpu
* fix copilot comments
2025-12-19 12:03:39 +08:00
bukejiyu
fc452c8e29
[RL]Fix RL load_weights ( #5642 )
...
Co-authored-by: Jiaxin Sui <95567040+plusNew001@users.noreply.github.com >
2025-12-18 19:16:18 -08:00
Yuanle Liu
b47674c796
[BugFix] fix rl model_weights_signal to support tp>1 ( #5639 )
2025-12-18 04:43:58 -08:00
bukejiyu
4aa2c6871b
[RL]Support loading weights via the load_weights function for RL ( #5549 )
...
* RL support load_weights
* fix
2025-12-18 02:27:05 -08:00
yzwu
ac013803f3
[Iluvatar] Support V1_KVCACHE_SCHEDULER and paddleocr-vl rope mode ( #5555 )
2025-12-18 02:14:25 -08:00
lizan1999
e1a9b282eb
fix bug for EP+MTP ( #5605 )
...
Co-authored-by: lizan1999 <lizan03@baidu.com >
2025-12-18 14:34:54 +08:00
Longzhi Wang
d8587e987e
[Model] tp+ep support v1_loader ( #5465 )
...
* [Model] tp+ep support v1_loader
* fix
* fix mtp_linear
* fix mtp_linear
* fix
* fix
* fix v0 loader
* fix
* Add get_tensor for ep
* fix linear weight_loader
* fix typo
* fix
2025-12-18 14:31:54 +08:00
zhupengyang
8735cb5045
[XPU] refactor moe ffn ( #5501 )
...
- remove BKCL_DISPATCH_ALL_GATHER
- support sparse mode
- support moe quant_method
2025-12-18 14:14:05 +08:00
MingkunZhang
d0a7834a17
[Metax] fix metax runner issue ( #5629 )
2025-12-17 21:32:54 -08:00
qw86972190
c606df59f5
[XPU]logprob bug ( #5626 )
2025-12-18 12:07:20 +08:00
megemini
111955ec0c
[BugFix] Remove duplicate PaddleOCRVLProcessor initialization code
2025-12-17 18:58:02 +08:00
fmiao2372
404cf0ece4
[Intel HPU] enable tensor_wise_fp8 ( #5324 )
...
* [Intel HPU] enable tensor_wise_fp8
* update code based on comments
* fix code style issue
* fix bug about PR 5138
* mv kv_cache modifications to HPU backend
* fix FP8 Precision Issues
* fix FP8 Precision Issues
* Add quantization UT
---------
Co-authored-by: yanfeich <yanfei.cheng@intel.com >
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com >
2025-12-17 16:45:03 +08:00
freeliuzc
15f5112ecb
[Speculative Decoding]Support different inferseed in speculate decoding ( #5568 )
...
* fix mtp entropy drop in RL
* optimize usage and fix unit test
* optimize padding_sampling_params speed(vectorized)
---------
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com >
2025-12-17 16:14:29 +08:00
Yonghua Li
0c8c6369ed
[Feature] [PD Disaggregation] simplify configuration for pd-disaggregated deployment, and refactor post-init and usage for all ports ( #5415 )
...
* [feat] simplify configuration for pd-disaggregated deployment, and refactor post-init and usage for all ports
* [fix] fix some bugs
* [fix] fix rdma port for cache manager/messager
* [fix] temporarily cancel port availability check to see if it can pass ci test
* [feat] simplify args for multi api server
* [fix] fix dp
* [fix] fix port for xpu
* [fix] add tests for ports post processing & fix ci
* [test] fix test_multi_api_server
* [fix] fix rdma_comm_ports args for multi_api_server
* [fix] fix test_common_engine
* [fix] fix test_cache_transfer_manager
* [chore] automatically setting FD_ENABLE_MULTI_API_SERVER
* [fix] avoid api server from creating engine_args twice
* [fix] fix test_run_batch
* [fix] fix test_metrics
* [fix] fix splitwise connector init
* [test] add test_rdma_transfer and test_expert_service
* [fix] fix code syntax
* [fix] fix test_rdma_transfer and build wheel with rdma script
2025-12-17 15:50:42 +08:00
周周周
e29b005520
[Others] Clean code && remove GPU sync code ( #5548 )
2025-12-16 21:09:37 +08:00
Yuanle Liu
867803ae10
[BugFix] fix speculate_limit_thinking_content_length ( #5590 )
...
* fix speculate_limit_thinking_content_length
* update
2025-12-16 04:31:45 -08:00