apps/FastDeploy
mirror of https://github.com/PaddlePaddle/FastDeploy.git synced 2026-04-24 01:29:57 +08:00
8995a38fa4d9a636a855e1dac8e766948379ecf6
FastDeploy/fastdeploy/model_executor/layers/quantization
zhupengyang 27b00cf385 [XPU] glm-4.5-air (#7071)
2026-04-14 11:31:49 +08:00
ops
[BugFix][Optimization] Replace silent failures with catchable exceptions and informative error messages (#6533)
2026-03-16 21:32:43 +08:00
__init__.py
[XPU] glm-4.5-air (#7071)
2026-04-14 11:31:49 +08:00
block_wise_fp8.py
[TI-consistent] support quant use pow2scale (#7308)
2026-04-13 00:01:53 -07:00
fp8_utils.py
[Cleanup] Replace torch proxy alias with public compat API (#7348)
2026-04-13 11:43:26 +08:00
kv_cache.py
…
mix_quant.py
…
mxfp4.py
[Cleanup] Replace torch proxy alias with public compat API (#7348)
2026-04-13 11:43:26 +08:00
nvfp4.py
[Cleanup] Replace torch proxy alias with public compat API (#7348)
2026-04-13 11:43:26 +08:00
quant_base.py
[BugFix] fix flashinfer-cutedsl moe nvfp4 (#7120)
2026-04-03 15:43:19 +08:00
tensor_wise_fp8.py
…
w4a8.py
…
w4afp8.py
…
w8a8.py
…
weight_only.py
[Iluvatar] refactor attn and moe code (#6887)
2026-03-18 10:31:00 +08:00
wfp8afp8.py
…
wint2.py
…