YuBaoku
91b8bf20f0
[CI] Add pytest failure log collection and persistence ( #7405 )
2026-04-16 22:56:17 +08:00
YuBaoku
17002edc47
[CI] Add approval check for logging-related modifications ( #7429 )
2026-04-16 14:50:22 +08:00
Echo-Nie
8819a039c9
[Others] Fix typo ( #7280 )
...
* typo
2026-04-14 17:28:22 +08:00
YuBaoku
1269eda2f9
[CI] Ensure container cleanup after job to avoid resource leakage ( #7315 )
...
* [CI] Ensure container cleanup after job to avoid resource leakage
* [CI] Use prebuilt wheels to install xgrammar==0.1.19 and torch==2.6.0
2026-04-10 22:32:18 +08:00
YuBaoku
ee73623c76
[CI] Set high-risk OOM tests for sequential execution ( #7268 )
2026-04-09 22:22:57 +08:00
YuBaoku
db808f2080
[CI] Optimize log cleanup and isolation in unittest ( #7132 )
2026-04-01 22:07:55 +08:00
yzwu
ceaf5df350
[Iluvatar] Fix cuda graph error for tp > 1 in ernie models ( #7126 )
2026-04-01 19:13:34 +08:00
YuBaoku
c6f0c5c3a6
[CI] Optimize test execution with single-GPU parallelism ( #7085 )
...
* [CI] Optimize test execution with single-GPU parallelism and log collection
* remove export CUDA_VISIBLE_DEVICES
* fix path error
* fix log_* path and debug
* [CI] Optimize test execution with single-GPU parallelism and log collection
2026-04-01 14:18:40 +08:00
yzwu
8789329457
[Iluvatar] Support wi4a16 group_gemm ( #7078 )
2026-03-30 19:03:51 +08:00
YuBaoku
2b84a4276e
[CI] Optimize CI: add timeout and cancel on PR close ( #6933 )
2026-03-19 15:54:30 +08:00
yzwu
8b890c0d72
[Iluvatar] refactor attn and moe code ( #6887 )
2026-03-18 10:31:00 +08:00
YuBaoku
0359794e08
[CI] Sync _log_softmax_batch_invariant with paddle update ( #6893 )
2026-03-17 23:03:57 +08:00
yzwu
901b38c936
[Iluvatar] Optimize decode group_gemm and Support cuda graph for ernie ( #6803 )
2026-03-12 19:21:17 +08:00
yzwu
67388ce2f3
[Iluvatar][CI] Replace ci in ernie-300B-4layer with ernie-21b. ( #6747 )
2026-03-10 17:25:52 +08:00
yzwu
81acdb62bd
[Iluvatar][CI] Do not specify FD_LOG_DIR ( #6665 )
2026-03-06 11:54:44 +08:00
YuBaoku
16a393e90e
[CI] Fix non-deterministic test and skip failed_tests.log in log print ( #6672 )
2026-03-05 18:47:18 +08:00
YuBaoku
56ceeda80c
[CI] Adjust model-specific diff threshold and include iluvatar XPU paths in coverage ( #6663 )
2026-03-05 10:02:54 +08:00
YuBaoku
5c8f5184d9
[CI] Add pytest timeout and enable workflow rerun ( #6645 )
2026-03-04 21:30:16 +08:00
yzwu
3345641f4e
[Iluvatar][CI] fix the dim error of seq_lens_encoder and seq_lens_decoder ( #6637 )
2026-03-04 14:00:40 +08:00
YuBaoku
9a48a41abc
[CI] Fix accidental deletion of failed_tests.log during log cleanup ( #6634 )
2026-03-03 22:06:26 +08:00
YuBaoku
c3d6d706d5
[CI] Add nightly workflow for golang_router tests and improve log handling ( #6608 )
...
* [CI] Add nightly workflow for Golang router tests
* [CI] Improve pytest script stability and log handling
2026-03-03 19:36:57 +08:00
yzwu
6674131b0b
[Iluvatar] Support CudaGraph and optimize flash_attn_unpadded and fused_neox_rope_embedding ( #6553 )
2026-03-02 14:07:17 +08:00
Yuqiang Ge
1f931e05cd
[CI] Add retry logic for pip install in iluvatar CI script ( #6500 )
2026-02-25 16:01:41 +08:00
yzwu
60e75ea8e8
[Iluvatar][CI] Fix cannot import get_stop ( #6165 )
2026-02-10 16:57:23 +08:00
MingkunZhang
6e28b5ef4f
[Metax][CI] update metax ci files ( #6364 )
2026-02-05 17:16:31 +08:00
MingkunZhang
43e3886ef9
[Metax][CI] fix run_ci_metax.sh error ( #6341 )
2026-02-04 15:43:48 +08:00
MingkunZhang
2ffcb3d9ed
[Metax][CI] update ci test files ( #6340 )
2026-02-04 13:58:07 +08:00
Jiaxin Sui
20074d301f
[XPU] [CI] add xpu logprobs case ( #6187 )
...
* add xpu case
2026-01-23 19:40:55 +08:00
YuBaoku
1cfb042045
[CI] Add ep4_mtp e2e test ( #6153 )
...
* [CI] Add ep4_mtp e2e test
2026-01-22 14:54:18 +08:00
yzwu
837ddca273
[Iluvatar][CI] Fix the error max_tokens_per_expert referenced before assignment ( #6083 )
2026-01-21 16:01:29 +08:00
Jiaxin Sui
b0fc9cadb5
[XPU][CI] update paddle version ( #6044 )
...
* Remove cache queue port from test configuration
Removed cache queue port configuration from test.
* Remove cache queue port from test_vl_model.py
Removed cache queue port argument from test configuration.
* Update test_w4a8.py
* Remove cache queue port from test_mtp.py
Removed cache queue port configuration from test.
* Remove cache queue port from test_logprobs_21b_tp4
Removed cache queue port configuration from test.
* Remove cache queue port from test configuration
Removed cache queue port configuration from test.
* Update test_ep4tp4_online.py
* Update run_xpu_ci_pytest.sh to comment out installations
Comment out PaddlePaddle installation and XVLLM download steps.
2026-01-15 15:17:48 +08:00
Jiaxin Sui
becd8c3803
[XPU][CI] Update XVLLM_PATH setup in run_xpu_ci_pytest.sh ( #6018 )
...
Download and set XVLLM_PATH from output.tar.gz instead of hardcoded path.
2026-01-13 15:42:52 +08:00
Yonghua Li
60ee72f682
[BugFix] [MultiAPIServer] fix rdma script and port check for multi api server ( #5935 )
...
* [fix] fix rdma script and add more error log for multi api server
* [fix] log
* [fix] fix test_multi_api_server
* [fix] fix multi api server port check
---------
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
2026-01-12 10:38:52 +08:00
MingkunZhang
384ffd6952
[Metax] add ci test file & update run_ci_metax.sh ( #5975 )
...
Co-authored-by: root <root@lt-wks-10-0-180-15.pub.metax-tech.com>
2026-01-09 18:47:06 +08:00
Jiaxin Sui
e93a7d3b6b
Lock PaddlePaddle version in run_xpu_ci_pytest.sh ( #5964 )
...
Locked PaddlePaddle version to 20260107 due to compatibility issues with the updated xhpc framework.
2026-01-09 10:41:34 +08:00
mouxin
0a92e96f20
[Feature] Add Golang-based Router for Request Scheduling and Load Balancing ( #5882 )
...
* [Feature] add golang router
* [Feature] Add Golang-based Router for Request Scheduling and Load Balancing
---------
Co-authored-by: mouxin <mouxin@baidu.com>
2026-01-07 21:28:08 +08:00
yzwu
29898372e9
[Iluvatar] remove CUDA_VISIBLE_DEVICES in run_ci_iluvatar.sh ( #5916 )
2026-01-07 14:10:47 +08:00
GoldPancake
e78e22ebd5
[BugFix] Fix entropy bugs ( #5818 )
...
* fix entropy bugs
* fix ut
* fix
2025-12-29 20:44:29 -08:00
yzwu
7b6cc11952
[Iluvatar] Fix FD launch error when specifying CUDA_VISIBLE_DEVICES ( #5735 )
2025-12-26 14:01:27 +08:00
Jiaxin Sui
8fc789bb3f
[Iluvatar][CI] refactor iluvatar_ci ( #5588 )
...
* refactor iluvatar_ci
* Update Docker image tag in iluvatar_test workflow
* Update default Docker image version in workflow
* Update iluvatar_test.yml
* Update default Docker image in workflow config
* Update model path in run_ernie300B_4layer.py
* Update model path in offline inference check
* Add model_data directory and copy model files
Create model_data directory and copy necessary files.
* Update run_ernie_vl_28B.py
* Update run_ernie300B_4layer.py
* Update paddlepaddle installation method in script
* Change wget command to include proxy option
* Modify paddle package installation in CI script
Updated installation commands for paddle packages.
* Update paddlepaddle and paddle-iluvatar-gpu versions
* Delete .github/workflows/ci_iluvatar.yml
* Rename workflow from ILUVATAR Test to ILUVATAR-CI
* Update installation commands for paddlepaddle and iluvatar
2025-12-25 15:10:34 +08:00
MingkunZhang
e48e306134
[Metax] update ci bash ( #5760 )
...
Co-authored-by: root <root@lt-wks-10-0-180-15.pub.metax-tech.com>
2025-12-25 11:47:38 +08:00
GoldPancake
a0fed22ddb
[Feature] Add entropy calculation script
2025-12-24 15:00:06 +08:00
Jiaxin Sui
0bef9b684f
[Metax][CI] fix CI bug ( #5698 )
...
* Update run_ci_metax.sh
* Fix pull request branch reference in CI workflow
2025-12-23 14:56:34 +08:00
MingkunZhang
945a1bc4e2
[Metax] update ci name ( #5679 )
...
* [Metax] update ci name
* Update CI_METAX workflow for pull request handling
* Update ci_metax.yml
* Update CI_METAX workflow for pull request handling
* Remove commented-out code in run_ci_metax.sh
* Add environment to Jenkins trigger job
* Change trigger event from pull_request_target to pull_request
* Fix environment name casing in CI workflow
* Change environment name from Metax-ci to Metax_ci
* Modify CI_METAX workflow for PR targeting and concurrency
Updated workflow to use pull_request_target event and added concurrency settings.
---------
Co-authored-by: Jiaxin Sui <95567040+plusNew001@users.noreply.github.com>
2025-12-23 14:00:48 +08:00
YuBaoku
b57deb671d
[CI] Update check_approval.sh
2025-12-22 15:52:04 +08:00
MingkunZhang
46d83be065
[Metax] update ci test ( #5652 )
2025-12-19 17:25:47 +08:00
yzwu
ac013803f3
[Iluvatar] Support V1_KVCACHE_SCHEDULER and paddleocr-vl rope mode ( #5555 )
2025-12-18 02:14:25 -08:00
zhupengyang
8735cb5045
[XPU] refactor moe ffn ( #5501 )
...
- remove BKCL_DISPATCH_ALL_GATHER
- support sparse mode
- support moe quant_method
2025-12-18 14:14:05 +08:00
kesmeey
d81341b9b3
[CI] [Hackathon 9th Sprint No.14] Add unit tests for functional module fastdeploy/rl/rollout_model.py ( #5552 )
...
* Add rollout model unit tests
* test: update rl rollout_model tests
* test: fix cache_type_branches unsupported platform case
* test: fix rl rollout_model test indent
* Delete tests/spec_decode/test_mtp_proposer.py
* chore: format test_rollout_model
* chore: translate rollout test comments to English
* test: guard rollout_model import by disabling auto registry
* chore: reorder imports in rl rollout test
* test: isolate env for RL rollout tests
* style: format rollout RL tests with black
* update
* test: remove RL rollout unit tests causing collection issues
* test: add lightweight rollout_model RL unit tests
* fix(coverage): filter test file paths and handle collection failures
- Only extract real test file paths (tests/.../test_*.py) from pytest collect output
- Filter out ERROR/collecting prefixes to prevent garbage in failed_tests.log
- Add proper error handling for pytest collection failures
- Exit early if no test files can be extracted
- Preserve collection error output for debugging
* update
* style: fix code style issues in test_rollout_model.py
- Remove unused 'os' import
- Remove trailing blank lines
---------
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com>
2025-12-18 10:57:53 +08:00
FocusLuo
c3aaa7e441
[BugFix] Fixed build script issue on Intel HPU platforms ( #5455 )
...
* [INTEL HPU] Fixed build script issue for non-gpu platforms
Signed-off-by: Luo, Focus <focus.luo@intel.com>
* [INTEL HPU] PR CI HPU will not use fixed version of fastdeploy_intel_hpu
Signed-off-by: Luo, Focus <focus.luo@intel.com>
---------
Signed-off-by: Luo, Focus <focus.luo@intel.com>
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com>
2025-12-11 16:36:37 +08:00