apps/FastDeploy
Mirror of https://github.com/PaddlePaddle/FastDeploy.git, synced 2026-05-01 12:56:36 +08:00
FastDeploy/fastdeploy/model_executor/layers/moe at e4e3cede7f3044b7fb6448cf9dcf1970940674a8
Latest commit: gaoziyuan, d85ef5352a, 【BugFix】fix ep buffer clear (#4450), 2025-10-21 10:56:00 +08:00
    * fix
    * fix
__init__.py                     support w4afp8 EP inference (#3044)                                              2025-08-25 11:27:45 +08:00
ep.py                           Fix noaux_tc cuda Error 700 in CUDAGraph (#4174)                                 2025-09-23 18:41:33 +08:00
fused_moe_backend_base.py       【BugFix】fix ep buffer clear (#4450)                                             2025-10-21 10:56:00 +08:00
fused_moe_cutlass_backend.py    Support GPT-OSS-BF16 (#4240)                                                     2025-10-20 14:44:58 +08:00
fused_moe_deepgemm_backend.py   [BugFix]Dev fix custom ar unstable result (#4437)                                2025-10-17 11:47:16 +08:00
fused_moe_marlin_backend.py     [BugFix]Dev fix custom ar unstable result (#4437)                                2025-10-17 11:47:16 +08:00
fused_moe_triton_backend.py     [BugFix]Dev fix custom ar unstable result (#4437)                                2025-10-17 11:47:16 +08:00
fused_moe_wint2_backend.py      [BugFix]Dev fix custom ar unstable result (#4437)                                2025-10-17 11:47:16 +08:00
moe.py                          Support GPT-OSS-BF16 (#4240)                                                     2025-10-20 14:44:58 +08:00
triton_moe_kernels.py           [OPs] MoE support wfp8afp8(channelwise) and improve per_token_quant_fp8 (#4238)  2025-09-24 16:39:51 +08:00