FastDeploy

mirror of https://github.com/PaddlePaddle/FastDeploy.git synced 2026-04-23 00:17:25 +08:00

SuperNova 805f29a06c [Feature] refactor metax_gpu attention and moe and remove some useless code (#3688)

Co-authored-by: yongqiangma <xing.wo@163.com>

2025-09-12 14:40:25 +08:00

…

__init__.py

…

block_wise_fp8.py

…

kv_cache.py

2025-09-09 05:25:08 -07:00

mix_quant.py

cache feature (#3857)

2025-09-07 18:52:46 +08:00

quant_base.py

…

tensor_wise_fp8.py

…

w4a8.py

2025-09-05 17:07:58 +08:00

w4afp8.py

2025-09-05 17:07:58 +08:00

w8a8.py

fix w8a8.py (#3733)

2025-09-03 10:57:26 +08:00

weight_only.py

2025-09-12 14:40:25 +08:00

wfp8afp8.py

2025-09-11 20:08:09 +08:00

wint2.py

…