apps/FastDeploy
Mirror of https://github.com/PaddlePaddle/FastDeploy.git (synced 2026-04-24 01:29:57 +08:00)
8995a38fa4d9a636a855e1dac8e766948379ecf6
FastDeploy/fastdeploy/model_executor/layers/quantization
zhupengyang | 27b00cf385 | [XPU] glm-4.5-air (#7071) | 2026-04-14 11:31:49 +08:00
ops | [BugFix][Optimization] Replace silent failures with catchable exceptions and informative error messages (#6533) | 2026-03-16 21:32:43 +08:00
__init__.py | [XPU] glm-4.5-air (#7071) | 2026-04-14 11:31:49 +08:00
block_wise_fp8.py | [TI-consistent] support quant use pow2scale (#7308) | 2026-04-13 00:01:53 -07:00
fp8_utils.py | [Cleanup] Replace torch proxy alias with public compat API (#7348) | 2026-04-13 11:43:26 +08:00
kv_cache.py | …
mix_quant.py | …
mxfp4.py | [Cleanup] Replace torch proxy alias with public compat API (#7348) | 2026-04-13 11:43:26 +08:00
nvfp4.py | [Cleanup] Replace torch proxy alias with public compat API (#7348) | 2026-04-13 11:43:26 +08:00
quant_base.py | [BugFix] fix flashinfer-cutedsl moe nvfp4 (#7120) | 2026-04-03 15:43:19 +08:00
tensor_wise_fp8.py | …
w4a8.py | …
w4afp8.py | …
w8a8.py | …
weight_only.py | [Iluvatar] refactor attn and moe code (#6887) | 2026-03-18 10:31:00 +08:00
wfp8afp8.py | …
wint2.py | …