RichardWooSJTU
9f0778f991
[Feature] Support EP prefill with num_worst_tokens (#6574)
...
* support num worst tokens
* support num worst tokens
* fix build error
* support num worst tokens: fix errors
* support num worst tokens: fix field
* support num worst tokens: delete requirements
* replace permute and depermute op by pure cuda
* replace permute and depermute op by pure cuda
* fix ci
* fix op
* fix nan
* fix code style
---------
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com>
2026-03-11 17:09:07 +08:00
fxyfxy777
36547cfdb3
[Feature] FD_USE_PHI_FP8_QUANT (#6320)
...
* add ut
* add use_fd_quant env
* rm mask_per_token_quant
* add make ops list
* USE_FD_FP8_QUANT -> FD_USE_PHI_FP8_QUANT, defaults to true
* modify comments
* use bool type
* Add function declaration
2026-02-03 22:33:03 -08:00
fxyfxy777
4c92035f2d
[Feature] Unify fp8 block_wise quant ops (#5991)
...
* quant stash
* blockwise_quant
* precommit
* rm tensor.cut
* tp ok
* add swiglu
* rm outdated code
* fix activate ut
* change baseline
* fix baseline error
2026-01-15 05:50:37 -08:00
Yuanle Liu
cdc0004894
Revert "[Feature] add ue8m0 for per_token_quant_fp8 (#5563)" (#5611)
...
This reverts commit 73e1d6aa90.
2025-12-17 13:59:06 +08:00
fxyfxy777
73e1d6aa90
[Feature] add ue8m0 for per_token_quant_fp8 (#5563)
...
* ue8m0
* add default arg
---------
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com>
2025-12-16 18:40:12 +08:00
Echo-Nie
ff653503ff
[Docs] Add License in Unittest (#4957)
...
* add copyright
* add CopyRight
2025-11-12 10:44:09 +08:00
ooo oo
460809070c
[Hackathon 9th No.54, 57] add unit tests for per_token_quant and per_token_quant_padding (#3746)
2025-09-04 11:46:38 +08:00