apps/FastDeploy
mirror of https://github.com/PaddlePaddle/FastDeploy.git, synced 2026-04-23 00:17:25 +08:00
7b0baced179deca3b5131180d1a516ccc482fc2d
FastDeploy/fastdeploy/model_executor
Echo-Nie  8819a039c9  [Others] Fix typo (#7280)  2026-04-14 17:28:22 +08:00
* typo * typo * typo * typo
graph_optimization  [FDConfig] Support CLI args for quantization params and add cudagraph validation (#7281)  2026-04-10 14:13:42 +08:00
guided_decoding  [Optimization] Use a separate driver when using Triton with Paddle (#6897)  2026-03-24 10:56:00 +08:00
layers  [BugFix] fix mm rope (#7274)  2026-04-14 11:36:08 +08:00
logits_processor  [Bugfix] Align thinking_budget behavior with ERNIE reasoning flow (#6934)  2026-03-23 14:15:55 +08:00
model_loader  [Loader] add multi-thread model loading (#6877)  2026-04-09 23:40:15 -07:00
models  [XPU] glm-4.5-air (#7071)  2026-04-14 11:31:49 +08:00
ops  [Others] Fix typo (#7280)  2026-04-14 17:28:22 +08:00
__init__.py  polish code with new pre-commit rule (#2923)  2025-07-19 23:19:27 +08:00
entropy_utils.py  [Bugfix] Fix entropy calculation bugs (#5941)  2026-01-08 20:57:35 +08:00
forward_meta.py  [Models]support GLM4.7 Flash && Ernie_MLA (#7139)  2026-04-03 17:41:33 +08:00
load_weight_utils.py  [Loader] add multi-thread model loading (#6877)  2026-04-09 23:40:15 -07:00
pre_and_post_process.py  [Speculative Decoding] Support mtp super ultra overlap in pd-split mode with insert_task overlap (#7323)  2026-04-13 19:41:17 +08:00
utils.py  [Feature] Support NVFP4 Flashinfer-cutedsl MoE on SM100 (#6963)  2026-03-30 11:37:04 +08:00
xpu_pre_and_post_process.py  [Others] Fix typo (#7280)  2026-04-14 17:28:22 +08:00