apps/FastDeploy
Mirror of https://github.com/PaddlePaddle/FastDeploy.git (synced 2026-04-23 17:11:21 +08:00)
b1a5b756a3566ff65bc49c7ec195db0d36fdbd08
FastDeploy/fastdeploy/model_executor/layers/quantization
Sunny-bot1 b1a5b756a3 [Optimize] Support WINT8 and group scale for Machete (#3905) 2025-09-15 12:01:34 +08:00
..
Name | Last commit | Date
ops | [Optimize] Support WINT8 and group scale for Machete (#3905) | 2025-09-15 12:01:34 +08:00
__init__.py | … | …
block_wise_fp8.py | … | …
kv_cache.py | [BugFix]Fix load kv cache quant scale (#4077) | 2025-09-12 17:44:03 +08:00
mix_quant.py | cache feature (#3857) | 2025-09-07 18:52:46 +08:00
quant_base.py | … | …
tensor_wise_fp8.py | … | …
w4a8.py | load hadamard_block_size from config (#3797) | 2025-09-05 17:07:58 +08:00
w4afp8.py | load hadamard_block_size from config (#3797) | 2025-09-05 17:07:58 +08:00
w8a8.py | fix w8a8.py (#3733) | 2025-09-03 10:57:26 +08:00
weight_only.py | [Optimize] Support WINT8 and group scale for Machete (#3905) | 2025-09-15 12:01:34 +08:00
wfp8afp8.py | [Feature] GLM-45-AIR Support Mix Quantization(Dense wfp8afp8 and wint8 triton_moe_backend) (#4051) | 2025-09-11 20:08:09 +08:00
wint2.py | … | …