FastDeploy/custom_ops/gpu_ops
fxyfxy777 2ada119a38 [Optimize] optimize mask_quant & swiglu (#6222)
* optimize mask_quant op, speed up 1.5x

* fix calculate sequence

* add fused

* rm log

* push kernel code

* add ut

* accuracy ok

* add ue8m0

* add ut

* add merge develop

* rm ut of mask_per_token_quant
2026-02-02 13:52:38 +08:00