[Feature] FD_USE_PHI_FP8_QUANT (#6320)

* add unit tests

* add use_fd_quant env

* remove mask_per_token_quant

* add make ops list

* rename USE_FD_FP8_QUANT -> FD_USE_PHI_FP8_QUANT; default is true

* modify comments

* use bool type

* Add function declaration
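
For readers unfamiliar with default-on boolean switches like the one this commit message describes, here is a minimal sketch of how such a flag might be read in Python. The helper name `env_flag` and the accepted truthy strings are assumptions for illustration only; the actual parsing in this commit is not shown in this excerpt and may differ.

```python
import os


def env_flag(name: str, default: bool = True) -> bool:
    """Hypothetical helper: read a boolean switch from the environment.

    Returns `default` when the variable is unset; otherwise treats a small
    set of common truthy strings as True and everything else as False.
    """
    raw = os.getenv(name)
    if raw is None:
        return default
    return raw.strip().lower() in ("1", "true", "yes", "on")


# Assumed usage: the flag is on unless the user explicitly disables it.
use_phi_fp8_quant = env_flag("FD_USE_PHI_FP8_QUANT", default=True)
```

Under these assumptions, leaving `FD_USE_PHI_FP8_QUANT` unset keeps the default (true), while exporting `FD_USE_PHI_FP8_QUANT=0` turns it off.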
fxyfxy777
2026-02-04 14:33:03 +08:00
committed by GitHub
parent 2ffcb3d9ed
commit 36547cfdb3
8 changed files with 634 additions and 51 deletions
+1
@@ -294,6 +294,7 @@ elif paddle.is_compiled_with_cuda():
"gpu_ops/cpp_extensions.cc",
"gpu_ops/share_external_data.cu",
"gpu_ops/fused_mask_swiglu_fp8_quant_kernel.cu",
"gpu_ops/per_token_quant_fp8.cu",
"gpu_ops/update_split_fuse_input.cu",
"gpu_ops/text_image_index_out.cu",
"gpu_ops/text_image_gather_scatter.cu",