Mirror of https://github.com/PaddlePaddle/FastDeploy.git (synced 2026-05-10 01:21:55 +08:00)
FastDeploy/fastdeploy/model_executor/layers/quantization at commit 8f40dfa9bf0a487018926fc1a2209963d07bced9
Latest commit: Sunny-bot1, 4ffe41a747, WINT4/WINT8 dense gemm default use Machete (#4451), 2025-10-23 17:57:59 +08:00
Name                Last commit                                                                                  Date
ops                 WINT4/WINT8 dense gemm default use Machete (#4451)                                           2025-10-23 17:57:59 +08:00
__init__.py         [BugFix]fix v1 loader moe bf16, and supoort dynamic_load_weight create quant param (#4229)   2025-09-24 14:12:05 +08:00
block_wise_fp8.py   [v1 loader]qwen Offline fp8 (#4036)                                                          2025-09-15 13:44:11 +08:00
kv_cache.py         [XPU] Support W4A8C8-TP4-300B Model (#4068)                                                  2025-10-10 15:41:32 +08:00
mix_quant.py        [v1 loader]qwen Offline fp8 (#4036)                                                          2025-09-15 13:44:11 +08:00
quant_base.py       …
tensor_wise_fp8.py  …
w4a8.py             [XPU] Support W4A8C8-TP4-300B Model (#4068)                                                  2025-10-10 15:41:32 +08:00
w4afp8.py           …
w8a8.py             …
weight_only.py      WINT4/WINT8 dense gemm default use Machete (#4451)                                           2025-10-23 17:57:59 +08:00
wfp8afp8.py         [BugFix]Fix wfp8afp8 triton moe group_topk renormalized=True (#4449)                         2025-10-16 23:17:48 +08:00
wint2.py            …