apps/FastDeploy
Mirror of https://github.com/PaddlePaddle/FastDeploy.git (synced 2026-04-23 00:17:25 +08:00)
367d37b523a79167672cc99d7b1b3df3eddd6c63
FastDeploy/fastdeploy/model_executor
Latest commit: ae2f9f4d22 by sunxin, 2026-04-06 21:07:38 -07:00
[BugFix] Enable moe_gate_fp32 using FD_ENABLE_RL (#7130)
* rl gate fp32
* clean
Name | Last commit | Date
graph_optimization | [Feature] Support mtp overlap schedule (#7001) | 2026-04-01 14:24:26 +08:00
guided_decoding | [Optimization] Use a separate driver when using Triton with Paddle (#6897) | 2026-03-24 10:56:00 +08:00
layers | [OP][Optimization] Remove ENABLE_PREFILL template parameter in multi_query_append_attention_warp1_4_kernel (#7201) | 2026-04-07 11:21:57 +08:00
logits_processor | [Bugfix] Align thinking_budget behavior with ERNIE reasoning flow (#6934) | 2026-03-23 14:15:55 +08:00
model_loader | [Iluvatar] Fix cuda graph error for tp > 1 in ernie models (#7126) | 2026-04-01 19:13:34 +08:00
models | [BugFix] Enable moe_gate_fp32 using FD_ENABLE_RL (#7130) | 2026-04-06 21:07:38 -07:00
ops | [Iluvatar] Support wi4a16 group_gemm (#7078) | 2026-03-30 19:03:51 +08:00
__init__.py | polish code with new pre-commit rule (#2923) | 2025-07-19 23:19:27 +08:00
entropy_utils.py | [Bugfix] Fix entropy calculation bugs (#5941) | 2026-01-08 20:57:35 +08:00
forward_meta.py | [Models]support GLM4.7 Flash && Ernie_MLA (#7139) | 2026-04-03 17:41:33 +08:00
load_weight_utils.py | add reconstruct (#6675) | 2026-03-10 11:25:37 +08:00
pre_and_post_process.py | [Feature] Support mtp overlap schedule (#7001) | 2026-04-01 14:24:26 +08:00
utils.py | [Feature] Support NVFP4 Flashinfer-cutedsl MoE on SM100 (#6963) | 2026-03-30 11:37:04 +08:00
xpu_pre_and_post_process.py | [XPU] Refactor pre process (#6993) | 2026-04-01 20:29:55 +08:00