mirror of
https://github.com/PaddlePaddle/FastDeploy.git
synced 2026-04-23 00:17:25 +08:00
6cff780fdb
* [RL] Add fused stack-transpose-quant for BlockWiseFP8 MoE weight quantization * update * update * update * support custom topk inDeepGemmFusedMoeMethod apply_tp * apply_ep_prefill support moe_topk_select * update * add ut * add ut * add ut * modity doc * fix env and docs * add ut --------- Co-authored-by: zhanghonggeng <zhanghonggeng@baidu.com>