Mirror of https://github.com/PaddlePaddle/FastDeploy.git (synced 2026-04-23 08:21:53 +08:00)
Commit: 3214fb5393f711c6d5579288fda58ab39ac1d75e
Path: FastDeploy/fastdeploy/model_executor/layers/quantization
Latest commit: 3214fb5393 by Yuan Xiaolan, 2025-07-29 21:54:37 +08:00
support model loading for w4a8 offline quant (#3064)
...
Support loading offline-quantized weights for W4A8 EP
| Name | Last commit | Date |
| --- | --- | --- |
| ops | polish code with new pre-commit rule (#2923) | 2025-07-19 23:19:27 +08:00 |
| __init__.py | Sync v2.0 version of code to github repo | 2025-06-29 23:29:37 +00:00 |
| block_wise_fp8.py | polish code with new pre-commit rule (#2923) | 2025-07-19 23:19:27 +08:00 |
| kv_cache.py | support c4 attn && fix cache | 2025-07-24 12:00:52 +08:00 |
| mix_quant.py | support model loading for w4a8 offline quant (#3064) | 2025-07-29 21:54:37 +08:00 |
| quant_base.py | polish code with new pre-commit rule (#2923) | 2025-07-19 23:19:27 +08:00 |
| tensor_wise_fp8.py | polish code with new pre-commit rule (#2923) | 2025-07-19 23:19:27 +08:00 |
| w4a8.py | support model loading for w4a8 offline quant (#3064) | 2025-07-29 21:54:37 +08:00 |
| w4afp8.py | polish code with new pre-commit rule (#2923) | 2025-07-19 23:19:27 +08:00 |
| w8a8.py | polish code with new pre-commit rule (#2923) | 2025-07-19 23:19:27 +08:00 |
| weight_only.py | polish code with new pre-commit rule (#2923) | 2025-07-19 23:19:27 +08:00 |
| wfp8afp8.py | polish code with new pre-commit rule (#2923) | 2025-07-19 23:19:27 +08:00 |
| wint2.py | polish code with new pre-commit rule (#2923) | 2025-07-19 23:19:27 +08:00 |