FastDeploy/tests/ci_use
lizexu123 6619298b50 【Optim】Optimize grid dimensions using max_tokens_per_expert for MoE models (#6007)
* update w4afp8

* build.sh ok

* support cuda_graph

* fix

* add test

* fix max_tokens_per_expert

* >=70

* fix

* compute_max_tokens_from_prefix_sum in w4afp8

* compute_max_tokens use cub
2026-01-15 19:18:42 +08:00