FastDeploy

apps/FastDeploy

Fork 0

mirror of https://github.com/PaddlePaddle/FastDeploy.git synced 2026-05-07 16:08:58 +08:00

Commit Graph

Author	SHA1	Message	Date
Cheng Yanfei	fbcccaa750	[Intel HPU] enable MoE EP for hpu (#5855 ) * enable HPU MoE EP * MoE intermediate_scale stack * enable loader_v1 esp for tensor_wise_fp8 TP or EP * modify activation_scale name	2026-01-15 13:08:00 +08:00
fmiao2372	404cf0ece4	[Intel HPU] enable tensor_wise_fp8 (#5324 ) * [Intel HPU] enable tensor_wise_fp8 * update code based on comments * fix code style issue * fix bug about RP 5138 * mv kv_cache modifications to HPU backend * fix FP8 Precision Issues * fix FP8 Precision Issues * Add quantization UT --------- Co-authored-by: yanfeich <yanfei.cheng@intel.com> Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com>	2025-12-17 16:45:03 +08:00

Author

SHA1

Message

Date

Cheng Yanfei

fbcccaa750

[Intel HPU] enable MoE EP for hpu (#5855 )

* enable HPU MoE EP

* MoE intermediate_scale stack

* enable loader_v1 esp for tensor_wise_fp8 TP or EP

* modify activation_scale name

2026-01-15 13:08:00 +08:00

fmiao2372

404cf0ece4

[Intel HPU] enable tensor_wise_fp8 (#5324 )

* [Intel HPU] enable tensor_wise_fp8

* update code based on comments

* fix code style issue

* fix bug about RP 5138

* mv kv_cache modifications to HPU backend

* fix FP8 Precision Issues

* fix FP8 Precision Issues

* Add quantization UT

---------

Co-authored-by: yanfeich <yanfei.cheng@intel.com>
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com>

2025-12-17 16:45:03 +08:00

2 Commits