apps/FastDeploy
Mirror of https://github.com/PaddlePaddle/FastDeploy.git (synced 2026-04-23 00:17:25 +08:00)
20de04e249d94f846c1f81d220aa2eab5b27e4ce
FastDeploy/fastdeploy/model_executor/layers/quantization
Latest commit: lizexu123, acdf0cd1d9, fix hadamard_block_size (#5888), 2026-01-06 14:12:14 +08:00
..
ops                  …
__init__.py          fix hadamard_block_size (#5888)                                   2026-01-06 14:12:14 +08:00
block_wise_fp8.py    [GraphOptimization] Wrap deep gemm and triton as python op (#5673)  2025-12-24 15:23:46 +08:00
kv_cache.py          [Intel HPU] enable tensor_wise_fp8 (#5324)                        2025-12-17 16:45:03 +08:00
mix_quant.py         support w4afp8 moe offline permute & load (#5613)                 2025-12-22 15:12:57 +08:00
quant_base.py        …
tensor_wise_fp8.py   [Intel HPU] enable tensor_wise_fp8 (#5324)                        2025-12-17 16:45:03 +08:00
w4a8.py              [XPU] refactor moe ffn (#5501)                                    2025-12-18 14:14:05 +08:00
w4afp8.py            [Feature] support w4afp8 v1_loader and v0_loader(tp>1) (#5757)    2025-12-30 14:11:52 +08:00
w8a8.py              …
weight_only.py       [XPU] refactor moe ffn (#5501)                                    2025-12-18 14:14:05 +08:00
wfp8afp8.py          …
wint2.py             …