YuBaoku
0359794e08
[CI] Sync _log_softmax_batch_invariant with paddle update ( #6893 )
2026-03-17 23:03:57 +08:00
yzwu
901b38c936
[Iluvatar] Optimize decode group_gemm and Support cuda graph for ernie ( #6803 )
2026-03-12 19:21:17 +08:00
yzwu
67388ce2f3
[Iluvatar][CI] Replace ci in ernie-300B-4layer with ernie-21b. ( #6747 )
2026-03-10 17:25:52 +08:00
yzwu
81acdb62bd
[Iluvatar][CI] Do not specify FD_LOG_DIR ( #6665 )
2026-03-06 11:54:44 +08:00
YuBaoku
16a393e90e
[CI] Fix non-deterministic test and skip failed_tests.log in log print ( #6672 )
2026-03-05 18:47:18 +08:00
YuBaoku
56ceeda80c
[CI] Adjust model-specific diff threshold and include iluvatar XPU paths in coverage ( #6663 )
2026-03-05 10:02:54 +08:00
YuBaoku
5c8f5184d9
[CI] Add pytest timeout and enable workflow rerun ( #6645 )
2026-03-04 21:30:16 +08:00
yzwu
3345641f4e
[Iluvatar][CI] fix the dim error of seq_lens_encoder and seq_lens_decoder ( #6637 )
2026-03-04 14:00:40 +08:00
YuBaoku
9a48a41abc
[CI] Fix accidental deletion of failed_tests.log during log cleanup ( #6634 )
2026-03-03 22:06:26 +08:00
YuBaoku
c3d6d706d5
[CI] Add nightly workflow for golang_router tests and improve log handling ( #6608 )
...
* [CI] Add nightly workflow for Golang router tests
* [CI] Improve pytest script stability and log handling
2026-03-03 19:36:57 +08:00
yzwu
6674131b0b
[Iluvatar] Support CudaGraph and optimize flash_attn_unpadded and fused_neox_rope_embedding ( #6553 )
2026-03-02 14:07:17 +08:00
Yuqiang Ge
1f931e05cd
[CI] Add retry logic for pip install in iluvatar CI script ( #6500 )
2026-02-25 16:01:41 +08:00
yzwu
60e75ea8e8
[Iluvatar][CI] Fix cannot import get_stop ( #6165 )
2026-02-10 16:57:23 +08:00
MingkunZhang
6e28b5ef4f
[Metax][CI] update metax ci files ( #6364 )
2026-02-05 17:16:31 +08:00
MingkunZhang
43e3886ef9
[Metax][CI] fix run_ci_metax.sh error ( #6341 )
2026-02-04 15:43:48 +08:00
MingkunZhang
2ffcb3d9ed
[Metax][CI] update ci test files ( #6340 )
2026-02-04 13:58:07 +08:00
Jiaxin Sui
20074d301f
[XPU] [CI] add xpu logprobs case ( #6187 )
...
* add xpu case
2026-01-23 19:40:55 +08:00
YuBaoku
1cfb042045
[CI] Add ep4_mtp e2e test ( #6153 )
...
* [CI] Add ep4_mtp e2e test
2026-01-22 14:54:18 +08:00
yzwu
837ddca273
[Iluvatar][CI] Fix the error max_tokens_per_expert referenced before assignment ( #6083 )
2026-01-21 16:01:29 +08:00
Jiaxin Sui
b0fc9cadb5
[XPU][CI] update paddle version ( #6044 )
...
* Remove cache queue port from test configuration
Removed cache queue port configuration from test.
* Remove cache queue port from test_vl_model.py
Removed cache queue port argument from test configuration.
* Update test_w4a8.py
* Remove cache queue port from test_mtp.py
Removed cache queue port configuration from test.
* Remove cache queue port from test_logprobs_21b_tp4
Removed cache queue port configuration from test.
* Remove cache queue port from test configuration
Removed cache queue port configuration from test.
* Update test_ep4tp4_online.py
* Update run_xpu_ci_pytest.sh to comment out installations
Comment out PaddlePaddle installation and XVLLM download steps.
2026-01-15 15:17:48 +08:00
Jiaxin Sui
becd8c3803
[XPU][CI] Update XVLLM_PATH setup in run_xpu_ci_pytest.sh ( #6018 )
...
Download and set XVLLM_PATH from output.tar.gz instead of hardcoded path.
2026-01-13 15:42:52 +08:00
Yonghua Li
60ee72f682
[BugFix] [MultiAPIServer] fix rdma script and port check for multi api server ( #5935 )
...
* [fix] fix rdma script and add more error log for multi api server
* [fix] log
* [fix] fix test_multi_api_server
* [fix] fix multi api server port check
---------
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com >
2026-01-12 10:38:52 +08:00
MingkunZhang
384ffd6952
[Metax] add ci test file & update run_ci_metax.sh ( #5975 )
...
Co-authored-by: root <root@lt-wks-10-0-180-15.pub.metax-tech.com >
2026-01-09 18:47:06 +08:00
Jiaxin Sui
e93a7d3b6b
Lock PaddlePaddle version in run_xpu_ci_pytest.sh ( #5964 )
...
Locked PaddlePaddle version to 20260107 due to compatibility issues with the updated xhpc framework.
2026-01-09 10:41:34 +08:00
mouxin
0a92e96f20
[Feature] Add Golang-based Router for Request Scheduling and Load Balancing ( #5882 )
...
* [Feature] add golang router
* [Feature] Add Golang-based Router for Request Scheduling and Load Balancing
---------
Co-authored-by: mouxin <mouxin@baidu.com >
2026-01-07 21:28:08 +08:00
yzwu
29898372e9
[Iluvatar] remove CUDA_VISIBLE_DEVICE in run_ci_iluvatar.sh ( #5916 )
2026-01-07 14:10:47 +08:00
GoldPancake
e78e22ebd5
[BugFix] Fix entropy bugs ( #5818 )
...
* fix entropy bugs
* fix ut
* fix
2025-12-29 20:44:29 -08:00
yzwu
7b6cc11952
[Iluvatar] Fix FD launch error when specifying CUDA_VISIBLE_DEVICE ( #5735 )
2025-12-26 14:01:27 +08:00
Jiaxin Sui
8fc789bb3f
[iluvatar][CI] refactor iluvatar_ci ( #5588 )
...
* refactor iluvatar_ci
* Update Docker image tag in iluvatar_test workflow
* Update default Docker image version in workflow
* Update iluvatar_test.yml
* Update default Docker image in workflow config
* Update model path in run_ernie300B_4layer.py
* Update model path in offline inference check
* Add model_data directory and copy model files
Create model_data directory and copy necessary files.
* Update run_ernie_vl_28B.py
* Update run_ernie300B_4layer.py
* Update paddlepaddle installation method in script
* Change wget command to include proxy option
* Modify paddle package installation in CI script
Updated installation commands for paddle packages.
* Update paddlepaddle and paddle-iluvatar-gpu versions
* Delete .github/workflows/ci_iluvatar.yml
* Rename workflow from ILUVATAR Test to ILUVATAR-CI
* Update installation commands for paddlepaddle and iluvatar
2025-12-25 15:10:34 +08:00
MingkunZhang
e48e306134
[Metax] update ci bash ( #5760 )
...
Co-authored-by: root <root@lt-wks-10-0-180-15.pub.metax-tech.com >
2025-12-25 11:47:38 +08:00
GoldPancake
a0fed22ddb
[Feature] Add entropy calculation script
2025-12-24 15:00:06 +08:00
Jiaxin Sui
0bef9b684f
[Metax][CI] fix CI bug ( #5698 )
...
* Update run_ci_metax.sh
* Fix pull request branch reference in CI workflow
2025-12-23 14:56:34 +08:00
MingkunZhang
945a1bc4e2
[Metax] update ci name ( #5679 )
...
* [Metax] update ci name
* Update CI_METAX workflow for pull request handling
* Update ci_metax.yml
* Update CI_METAX workflow for pull request handling
* Remove commented-out code in run_ci_metax.sh
* Add environment to Jenkins trigger job
* Change trigger event from pull_request_target to pull_request
* Fix environment name casing in CI workflow
* Change environment name from Metax-ci to Metax_ci
* Modify CI_METAX workflow for PR targeting and concurrency
Updated workflow to use pull_request_target event and added concurrency settings.
---------
Co-authored-by: Jiaxin Sui <95567040+plusNew001@users.noreply.github.com >
2025-12-23 14:00:48 +08:00
YuBaoku
b57deb671d
[CI] Update check_approval.sh
2025-12-22 15:52:04 +08:00
MingkunZhang
46d83be065
[Metax] update ci test ( #5652 )
2025-12-19 17:25:47 +08:00
yzwu
ac013803f3
[Iluvatar] Support V1_KVCACHE_SCHEDULER and paddleocr-vl rope mode ( #5555 )
2025-12-18 02:14:25 -08:00
zhupengyang
8735cb5045
[XPU] refactor moe ffn ( #5501 )
...
- remove BKCL_DISPATCH_ALL_GATHER
- support sparse mode
- support moe quant_method
2025-12-18 14:14:05 +08:00
kesmeey
d81341b9b3
[CI] [Hackathon 9th Sprint No.14] Add unit tests for functional module fastdeploy/rl/rollout_model.py ( #5552 )
...
* Add rollout model unit tests
* test: update rl rollout_model tests
* test: fix cache_type_branches unsupported platform case
* test: fix rl rollout_model test indent
* Delete tests/spec_decode/test_mtp_proposer.py
* chore: format test_rollout_model
* chore: translate rollout test comments to English
* test: guard rollout_model import by disabling auto registry
* chore: reorder imports in rl rollout test
* test: isolate env for RL rollout tests
* style: format rollout RL tests with black
* update
* test: remove RL rollout unit tests causing collection issues
* test: add lightweight rollout_model RL unit tests
* fix(coverage): filter test file paths and handle collection failures
- Only extract real test file paths (tests/.../test_*.py) from pytest collect output
- Filter out ERROR/collecting prefixes to prevent garbage in failed_tests.log
- Add proper error handling for pytest collection failures
- Exit early if no test files can be extracted
- Preserve collection error output for debugging
* update
* style: fix code style issues in test_rollout_model.py
- Remove unused 'os' import
- Remove trailing blank lines
---------
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com >
2025-12-18 10:57:53 +08:00
FocusLuo
c3aaa7e441
[BugFix] Fixed build script issue on Intel HPU platforms ( #5455 )
...
* [INTEL HPU] Fixed build script issue for non-gpu platforms
Signed-off-by: Luo, Focus <focus.luo@intel.com >
* [INTEL HPU] PR CI HPU will not use fixed version of fastdeploy_intel_hpu
Signed-off-by: Luo, Focus <focus.luo@intel.com >
---------
Signed-off-by: Luo, Focus <focus.luo@intel.com >
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com >
2025-12-11 16:36:37 +08:00
YuanRisheng
f7c6b8c4ec
modify approve ( #5443 )
...
Co-authored-by: root <root@yqlcc01-sys-rpm12rzmwjd.yqlcc01.baidu.com >
2025-12-09 16:52:10 +08:00
Jiaxin Sui
b5a7abe624
[XPU] [CI] Change Paddle Version to Nightly ( #5346 )
...
* Enhance run_ci_xpu.sh with caching and prefill options
* Update model path and configuration in run_ci_xpu.sh
* Add '北朝' keyword to assertion in run_45vl.py
* Enhance process termination logic in run_ci_xpu.sh
* Set timeout for CI_XPU job to 60 minutes
* Remove extra newline in stop_processes function
* Update paddlepaddle-xpu installation command
Comment out the previous paddlepaddle-xpu installation command and replace it with a specific version installation due to EP parallel error.
* Update PaddlePaddle installation command
2025-12-05 13:01:29 +08:00
zccjjj
e927c65742
[XPU] [Optimization] [EP] EP communication optimization. ( #5145 )
2025-12-05 10:03:45 +08:00
Longzhi Wang
f6544c0b1b
[CI] Add RD in env CI. ( #5345 )
...
* test
* [CI] modify env ci(add RD)
* test done
2025-12-03 13:18:17 +08:00
YuBaoku
dfeabee123
[CI] Allow occasional distributed worker exit_code ( #5341 )
2025-12-03 10:56:59 +08:00
Longzhi Wang
21f138f68b
[CI] Add env ci ( #5331 )
...
* test
* [CI] Add env ci
* test done
2025-12-02 19:31:25 +08:00
fmiao2372
429dd2b1db
[Intel HPU] add example benchmark scripts for hpu ( #5304 )
...
* [Intel HPU] add example benchmark scripts for hpu
* Revise the code based on the copilot comments
* update code based on comments
* update ci ops version
2025-12-02 18:00:01 +08:00
Jiaxin Sui
8e0f4dfd0c
[XPU] [CI] Xpu Ci Refactor ( #5252 )
...
* add xpu ci
* add case
* add case
* fix ci bug
* Update Docker image tag to 'latest' in CI workflow
* Fix set -e usage in run_xpu_ci_pytest.sh
* add pd case
* add case
* Configure pip to use Tsinghua mirror for dependencies
Set the global pip index URL to Tsinghua mirror.
* fix ci bug
* fix bug
* fix bug
---------
Co-authored-by: suijiaxin <suijiaxin@Suis-MacBook-Pro.local >
Co-authored-by: root <root@gajl-bbc-onlinec-com-1511964.gajl.baidu.com >
Co-authored-by: root <root@gajl-bbc-onlinec-com-1511972.gajl.baidu.com >
2025-12-02 17:15:51 +08:00
Yuanle Liu
54119cf07e
[CI] Remove need approve by yuanlehome ( #5310 )
2025-12-01 01:44:43 -08:00
ddchenhao66
fc88eebc32
[CI][XPU] add pd disaggregation ( #5179 )
...
* [CI][XPU] add pd disaggregation
* Clarify comments and install iproute2
Updated comments to clarify script purpose and added installation of iproute2.
---------
Co-authored-by: ddchenhao66 <dhaochen163.com>
Co-authored-by: Jiaxin Sui <95567040+plusNew001@users.noreply.github.com >
2025-11-28 10:44:27 +08:00
Jiaxin Sui
07cb11e51d
[XPU][CI] Set pip index URL to Tsinghua mirror ( #5277 )
...
* Set pip index URL to Tsinghua mirror
* Update ci_xpu.yml
* Update Docker image version in CI workflow
* Update Docker image tag in CI workflow
2025-11-27 22:12:20 +08:00