FastDeploy

mirror of https://github.com/PaddlePaddle/FastDeploy.git synced 2026-04-23 00:17:25 +08:00

Commit Graph

Author

SHA1

Message

Date

Bingoo

6b891da02b

[Optimization] enable trtllm_all_reduce fusion kernel in glm model (#6660)

* enable trtllm_all_reduce fusion kernel in glm model

* fix conflict

* format update

* fix a bug

* modify test

* modify test

* support empty tensor and modify test

* fix test_linear config issues

* modify test name

* add edge test case

* modify format

* fix conflict

* modify default max token num in trtllm_allreduce_fusion

* add max token num branch for trtllm_allreduce_fusion

* fix format

* fix rmsnorm config issue

* modify 2025 to 2026

* using compat guard

* Lazily import flashinfer.comm and fix test config issue

* fix test issues

* add flashinfer cache dir clean machine

* fix some issues

2026-04-16 14:10:19 +08:00

xunyoyo

ff61a7f5a1

[CI] [Hackathon 10th Spring No.40] Add unit tests for the functional module fastdeploy/model_executor/layers/linear.py (#6107)

* Add linear layer tests for model executor

* Refine linear layer tests for uncovered branches

* Refactor and enhance tests for linear layers

Refactor test_linear.py by removing unused imports and redundant code, and updating model configuration parameters. Add new tests for linear layers and their loading mechanisms.

* test: patch row-parallel alltoall in unit test

* test: avoid alltoall reshape failure in row-parallel

* test: expand linear coverage targets

* Refine linear tests per review feedback

* Fix linear tests for pre-sharded config and qkv fixture

---------

Co-authored-by: CSWYF3634076 <wangyafeng@baidu.com>

2026-02-27 16:25:23 +08:00

2 Commits