Commit Graph

3087 Commits

Author SHA1 Message Date
YUNSHEN XIE 46664985fc Modify the existing coverage collection method (#3573)
fix cov report
2025-08-25 10:35:35 +08:00
YuBaoku 7821534ff5 [CI] add sot test (#3579)
* [CI] add sot test

* [CI] add sot test
2025-08-25 10:14:50 +08:00
lengxia 137e539456 [Feature][XPU] add custom kernels for mtp (#3537) 2025-08-25 10:14:17 +08:00
bukejiyu bdbac0aa3d support qwen2 weight only (#3571)
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
Publish Job / publish_pre_check (push) Has been cancelled
Publish Job / print_publish_pre_check_outputs (push) Has been cancelled
Publish Job / FD-Clone-Linux (push) Has been cancelled
Publish Job / Show Code Archive Output (push) Has been cancelled
Publish Job / BUILD_SM8090 (push) Has been cancelled
Publish Job / BUILD_SM8689 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8090 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8689 (push) Has been cancelled
Publish Job / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
Publish Job / Run FastDeploy LogProb Tests (push) Has been cancelled
Publish Job / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
Publish Job / Run Base Tests (push) Has been cancelled
Publish Job / Run Accuracy Tests (push) Has been cancelled
2025-08-24 11:14:34 +08:00
bukejiyu 77514e3e1e [V1 Loader] support weight_only (#3413)
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
Publish Job / publish_pre_check (push) Has been cancelled
Publish Job / print_publish_pre_check_outputs (push) Has been cancelled
Publish Job / FD-Clone-Linux (push) Has been cancelled
Publish Job / Show Code Archive Output (push) Has been cancelled
Publish Job / BUILD_SM8090 (push) Has been cancelled
Publish Job / BUILD_SM8689 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8090 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8689 (push) Has been cancelled
Publish Job / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
Publish Job / Run FastDeploy LogProb Tests (push) Has been cancelled
Publish Job / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
Publish Job / Run Base Tests (push) Has been cancelled
Publish Job / Run Accuracy Tests (push) Has been cancelled
* support wint4/wint8

* delete smoe case

* update ci

* print log
2025-08-23 13:13:41 +08:00
Jiang-Jia-Jun 93e1b63200 Revert "[UnitTest][Copilot] Improve unit test coverage for entrypoints module…" (#3564)
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
This reverts commit 36325e9ea7.
2025-08-23 10:44:23 +08:00
YuanRisheng e481b7a779 fix sot (#3556) 2025-08-23 08:37:06 +08:00
Zero Rains 79f0dbbb55 [V1 Loader] Support qwen2(bf16) (#3502)
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
* support qwen2(bf16)

* merge bias_loader and weight_loader
2025-08-23 01:08:23 +08:00
YUNSHEN XIE cb166053ba fix test name (#3493)
* fix test name

* update

* update

* fix

* fix

* update

* update

* update

* update

* update

* fix

* update
2025-08-22 23:43:47 +08:00
Copilot 36325e9ea7 [UnitTest][Copilot] Improve unit test coverage for entrypoints modules (#3546)
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
Publish Job / publish_pre_check (push) Has been cancelled
Publish Job / print_publish_pre_check_outputs (push) Has been cancelled
Publish Job / FD-Clone-Linux (push) Has been cancelled
Publish Job / Show Code Archive Output (push) Has been cancelled
Publish Job / BUILD_SM8090 (push) Has been cancelled
Publish Job / BUILD_SM8689 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8090 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8689 (push) Has been cancelled
Publish Job / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
Publish Job / Run FastDeploy LogProb Tests (push) Has been cancelled
Publish Job / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
Publish Job / Run Base Tests (push) Has been cancelled
Publish Job / Run Accuracy Tests (push) Has been cancelled
* Initial plan

* Add comprehensive unit tests for entrypoints utilities

Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>

* Complete entrypoints test coverage improvement with tool parser tests

Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>

* Apply pre-commit formatting to test files - fix trailing whitespace and long lines

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
2025-08-22 19:20:51 +08:00
zhink df7c31012b Modified to support custom all reduce by default (#3538) 2025-08-22 16:59:05 +08:00
lddfym 27666ee586 [Feature] Add Qwen25-VL Processor (#3501)
* add qwen-2.5-vl processor

* add qwen25-vl processor

* add qwen25-vl processor

* add qwen25-vl processor

* add qwen25-vl processor position_ids

* add qwen25-vl processor

* add qwen25-vl processor

* position_ids

* add test for qwen25-vl

* organize comments

* formatted

* qwen_vl_processor

* add qwen_vl_processor unittest

* update model path

* update model path

* update qwen_vl_processor unittest

* add unittest and bug fix

* add unittest and bug fix

* Update fastdeploy/input/qwen_mm_processor/image_processor.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update fastdeploy/input/qwen_vl_processor.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-08-22 16:49:42 +08:00
YuanRisheng 5b66462f0e Fix fdconfig bugs (#3528)
* fix config

* fix parallel

* fix ips

* fix rl

* open code
2025-08-22 16:17:15 +08:00
plusNew001 7ae41e9daf [CI] fix xpu ci bug (#3535) 2025-08-22 15:08:39 +08:00
freeliuzc 76759108c9 [Feature][SpeculativeDecoding]Support tree-attention (#3514)
* support tree-attention

* fix merge bug

* fix unit-test api

* fix merge bug
2025-08-22 13:36:41 +08:00
YuBaoku cc88671507 [CI] add container naming and cleanup logic in workflows (#3526) 2025-08-22 11:42:57 +08:00
YUNSHEN XIE 2630260616 disable stable test (#3529) 2025-08-22 11:38:18 +08:00
YuanRisheng 85fbf5455a [V1 Loader]Ernie VL support loader v1 (#3494)
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
* ernie vl support new loader

* add unittest

* fix test
2025-08-22 11:16:57 +08:00
Zhang Yulong 3cc182236a update ci (#3519)
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
2025-08-21 20:05:50 +08:00
YuanRisheng c389a4013c Unify server-side and model-side Config(Part-5) (#3497)
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
Publish Job / publish_pre_check (push) Has been cancelled
Publish Job / print_publish_pre_check_outputs (push) Has been cancelled
Publish Job / FD-Clone-Linux (push) Has been cancelled
Publish Job / Show Code Archive Output (push) Has been cancelled
Publish Job / BUILD_SM8090 (push) Has been cancelled
Publish Job / BUILD_SM8689 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8090 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8689 (push) Has been cancelled
Publish Job / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
Publish Job / Run FastDeploy LogProb Tests (push) Has been cancelled
Publish Job / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
Publish Job / Run Base Tests (push) Has been cancelled
Publish Job / Run Accuracy Tests (push) Has been cancelled
* move config

* fix xpu

* fix

* fix vl

* fix vl

* fix unitest

* fix args

* add unitest

* fix test
2025-08-21 19:00:21 +08:00
yangjianfengo1 e5aa7087db 【bug fix】修复w4a8编译慢 (#3510)
* 修复w4a8编译

* code style

* 修复tma copy
2025-08-21 18:50:14 +08:00
Zhang Yulong a5692e8b7d Add PD CI case (#3490)
* Create test_ernie_03b_pd.py

* Update test_ernie_03b_pd.py
2025-08-21 18:48:34 +08:00
李泳桦 8bea4b1e25 [fix] fix output tokens count in streaming completion api (#3507) 2025-08-21 18:19:13 +08:00
李泳桦 e4f0b755b4 [fix] setting disable_chat_template while passing prompt_token_ids led to response error (#3228)
* [fix] setting disable_chat_template while passing prompt_token_ids led to response error

* [fix] code syntax

* [test] add test case for this bug

* [test] add test case for empty message list

* [test] fix test case for empty message list
2025-08-21 17:30:51 +08:00
luukunn 371fb3f853 [Feature] add tool parser (#3483)
* add tool parser

* add x1 enable_thinking

* restart ci

* fix vl reasoning parser

* modify call style

* modify call style

* add offline enablethinking

* fix completion

* fix

* fix unit test

* fix unit test

* fix unit test

* fix vl reasoning parser

* fix vl reasoning parser
2025-08-21 17:25:44 +08:00
Yzc216 466cbb5a99 [Feature] Models api (#3073)
* add v1/models interface related

* add model parameters

* default model verification

* unit test

* check model err_msg

* unit test

* type annotation

* model parameter in response

* modify document description

* modify document description

* unit test

* verification

* verification update

* model_name

* pre-commit

* update test case

* update test case

* Update tests/entrypoints/openai/test_serving_models.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update tests/entrypoints/openai/test_serving_models.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update tests/entrypoints/openai/test_serving_models.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update tests/entrypoints/openai/test_serving_models.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update fastdeploy/entrypoints/openai/serving_models.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: LiqinruiG <37392159+LiqinruiG@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-08-21 17:02:56 +08:00
Zhang Yulong b7eee3aec1 Update CI (#3474)
* update CI cases

* update CI cases

* update CI cases

* update CI cases

* Merge upstream/develop and resolve directory rename conflict

* Merge upstream/develop and resolve directory rename conflict

* Merge upstream/develop and resolve directory rename conflict

* update deploy

* update deploy

* update deploy

* update deploy

* update deploy
2025-08-21 16:49:20 +08:00
qw86972190 c83381d650 revert pr (#3481)
Co-authored-by: iosmers <yinwei_hust@163.com>
2025-08-21 14:19:50 +08:00
ltd0924 51f68ae593 [Feature] add dealer manager to reuse the connection (#3471)
* [BugFix] fix control signal release failed

* [BugFix] fix control signal release failed

* update

* update

* update

* [Feature] add dealer manager to reuse the connection

* fix

* fix

* fix

* fix

* fix

* fix

* Create test_dealer_connection_manager.py

* Delete test/entrypoints/openai directory

* Update test_dealer_connection_manager.py

* Update test_dealer_connection_manager.py
2025-08-21 13:11:13 +08:00
YUNSHEN XIE 985b1265c3 CE 编译任务(合入触发) (#3491)
* add ce compile job

* fix

* update
2025-08-21 11:33:26 +08:00
memoryCoderC 31f639f10b [Feature] add prompt_tokens and completion_tokens (#3504)
Deploy GitHub Pages / deploy (push) Has been cancelled
2025-08-21 10:23:27 +08:00
Zero Rains 30b3f2dc07 [BugFix][V1 Loader] fix the bug in creat weight for block_wise_fp8 (#3486)
Deploy GitHub Pages / deploy (push) Has been cancelled
2025-08-20 05:52:54 -07:00
Ryan bcdfc1d6b9 Add custom op declaration for all_reduce (#3473)
* add custom op declaration

* roll back try except
2025-08-20 20:29:58 +08:00
Zhang Yulong 33ff0bfe38 Update disaggregated.md (#3495)
修复文档错误
2025-08-20 19:39:18 +08:00
YUNSHEN XIE e197894977 add e2e cases (#3476)
* add e2e cases

* fix
2025-08-20 18:50:14 +08:00
Zhang Yulong 9ff2dfb162 Create eb45-8k-fp8-tp1-dp8_ep.yaml (#3485)
Deploy GitHub Pages / deploy (push) Has been cancelled
Publish Job / publish_pre_check (push) Has been cancelled
Publish Job / print_publish_pre_check_outputs (push) Has been cancelled
Publish Job / FD-Clone-Linux (push) Has been cancelled
Publish Job / Show Code Archive Output (push) Has been cancelled
Publish Job / BUILD_SM8090 (push) Has been cancelled
Publish Job / BUILD_SM8689 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8090 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8689 (push) Has been cancelled
Publish Job / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
Publish Job / Run FastDeploy LogProb Tests (push) Has been cancelled
Publish Job / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
Publish Job / Run Base Tests (push) Has been cancelled
Publish Job / Run Accuracy Tests (push) Has been cancelled
混合架构EP并行yaml
2025-08-20 14:33:54 +08:00
YuBaoku 33d369586b [CI] remove useless case (#3482) 2025-08-20 14:20:30 +08:00
xiaolei373 5d131485d8 add error log to file (#3431)
Deploy GitHub Pages / deploy (push) Has been cancelled
* feat(log):add_request_and_response_log

* feat[log]:add error log to file
2025-08-20 09:52:34 +08:00
YUNSHEN XIE 3a6058e445 Add stable ci (#3460)
* add stable ci

* fix

* update

* fix

* rename tests dir;fix stable ci bug

* add timeout limit

* update
2025-08-20 08:57:17 +08:00
kevin 67298cf4c0 add error traceback info (#3419)
Deploy GitHub Pages / deploy (push) Has been cancelled
* add error traceback info

* update error msg

* update code

---------

Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
2025-08-19 19:32:04 +08:00
yangjianfengo1 b047681c5d 【New Feature】支持Fp8 group Gemm 24稀疏 (#3463)
Deploy GitHub Pages / deploy (push) Has been cancelled
Publish Job / publish_pre_check (push) Has been cancelled
Publish Job / print_publish_pre_check_outputs (push) Has been cancelled
Publish Job / FD-Clone-Linux (push) Has been cancelled
Publish Job / Show Code Archive Output (push) Has been cancelled
Publish Job / BUILD_SM8090 (push) Has been cancelled
Publish Job / BUILD_SM8689 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8090 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8689 (push) Has been cancelled
Publish Job / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
Publish Job / Run FastDeploy LogProb Tests (push) Has been cancelled
Publish Job / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
Publish Job / Run Base Tests (push) Has been cancelled
Publish Job / Run Accuracy Tests (push) Has been cancelled
* 支持24稀疏

* code style

* 增加stmatrix 宏定义判断

* code style
2025-08-19 02:54:47 -07:00
ltd0924 d587fb257f [CI] add test generation demo (#3270)
* Create test_generation.py

* update

* update

* format

* Update test_generation.py

* Update test_generation.py

* Update test_generation.py

* Update test_generation.py

* Update test_generation.py

* Update test_generation.py

* Update test_generation.py

* Update test_generation.py

* Update setup.py

* Delete test/plugins/test_model_runner_register.py

---------

Co-authored-by: YUNSHEN XIE <1084314248@qq.com>
2025-08-19 17:12:40 +08:00
Zero Rains fef447e350 [V1 Loader] Support MOE parameters create and load for DeepGemm and marlin backend (#3447)
* support deepgemm backend

* support marlin backend

* remove print

* fix process_prequanted_weights
2025-08-19 14:15:53 +08:00
chen 6735626014 fix request_output sampling_params (#3154) (#3464) 2025-08-19 13:52:50 +08:00
ltd0924 bca8905b40 [BugFix] fix control signal release failed (#3390)
* [BugFix] fix control signal release failed

* [BugFix] fix control signal release failed

* update

* update

* update
2025-08-19 13:51:38 +08:00
Zero Rains 8b12c80f90 [FixBug] compute early stopping with real batch size (#3418)
* [FixBug] compute early stopping with real batch size

* update

* fix test_sampler
2025-08-18 22:09:21 -07:00
luukunn 3a7a20d191 [Feature] Pass through the chat_template_kwargs to the data processing module (#3421)
* fix chat_template_args

* fix args

* add offline

* add offline

* fix

* fix

* fix default enable_thinking value

* fix default enable_thinking value

* modify condition

* Revert "modify condition"

This reverts commit 26430bdeb1.

* fix unit test
2025-08-19 10:50:01 +08:00
lizexu123 a053ab889b [BugFix] fix num_running_requests in cuda_graph (#3457)
* fix cuda_grpah

* add note

---------

Co-authored-by: RAM <gstian5555@outlook.com>
2025-08-19 10:47:22 +08:00
AIbin beec24fd89 【Inference Optimize】DeepSeek-v3 model inference performance optimization (#3455)
* DSK_OPT_01

* update FA3
2025-08-19 10:42:42 +08:00
zhuzixuan c95b3395e9 【BugFix】completion接口echo回显支持 (#3245)
* wenxin-tools-511,修复v1/completion无法回显的问题。

* 支持多prompt的回显

* 支持多prompt情况下的流式回显

* 补充了 completion 接口支持 echo 的单元测试

* pre-commit

* 移除了多余的test文件

* 修复了completion接口echo支持的单测方法

* 补充了单元测试文件

* 补充单测

* unittest

* 补充单测

* 修复单测

* 删除不必要的assert.

* 重新提交

* 更新测试方法

* ut

* 验证是否是正确思路单测

* 验证是否是正确思路单测

* 验证是否是正确思路单测3

* 优化单测代码,有针对性地缩小单测范围。

* 优化单测代码2,有针对性地缩小单测范围。

* 优化单测代码3,有针对性地缩小单测范围。

* support 'echo' in chat/completion.

* update

* update

* update

* update

* update

* update

* 补充了关于tokenid的单元测试

* update

* 修正index错误

* 修正index错误
2025-08-19 10:41:51 +08:00