Commit Graph

2 Commits

Author SHA1 Message Date
Bingoo 6b891da02b [Optimization] enable trtllm_all_reduce fusion kernel in glm model (#6660)
* enable trtllm_all_reduce fusion kernel in glm model

* fix conflict

* format update

* fix a bug

* modify test

* modify test

* support empty tensor and modify test

* fix test_linear config issues

* modify test name

* add edge test case

* modify format

* fix conflict

* modify default max token num in trtllm_allreduce_fusion

* add max token num branch for trtllm_allreduce_fusion

* fix format

* fix rmsnorm config issue

* modify 2025 to 2026

* using compat guard

* Lazily import flashinfer.comm and fix test config issue

* fix test issues

* add flashinfer cache dir cleanup mechanism

* fix some issues
2026-04-16 14:10:19 +08:00
xunyoyo ff61a7f5a1 [CI] [Hackathon 10th Spring No.40] Add unit tests for functional module fastdeploy/model_executor/layers/linear.py (#6107)
* Add linear layer tests for model executor

* Refine linear layer tests for uncovered branches

* Refactor and enhance tests for linear layers

Refactor test_linear.py by removing unused imports and redundant code, and updating model configuration parameters. Add new tests for linear layers and their loading mechanisms.

* test: patch row-parallel alltoall in unit test

* test: avoid alltoall reshape failure in row-parallel

* test: expand linear coverage targets

* Refine linear tests per review feedback

* Fix linear tests for pre-sharded config and qkv fixture

---------

Co-authored-by: CSWYF3634076 <wangyafeng@baidu.com>
2026-02-27 16:25:23 +08:00