lizexu123
|
6619298b50
|
【Optim】Optimize grid dimensions using max_tokens_per_expert for MoE models (#6007)
* update w4afp8
* build.sh ok
* support cuda_graph
* fix
* add test
* fix max_tokens_per_expert
* >=70
* fix
* compute_max_tokens_from_prefix_sum in w4afp8
* compute_max_tokens use cub
|
2026-01-15 19:18:42 +08:00 |
|