fxyfxy777
|
2ada119a38
|
[Optimize] optimize mask_quant & swiglu (#6222)
* optimize mask_quant op (1.5x speedup)
* fix calculation order
* add fused
* rm log
* push kernel code
* add ut
* accuracy ok
* add ue8m0
* add ut
* merge develop
* rm ut of mask_per_token_quant
|
2026-02-02 13:52:38 +08:00 |
|