FastDeploy/fastdeploy/model_executor at 54f7d9f62128cf8317d170eb87605914f5d059de - FastDeploy - 子说镜像小站

apps/FastDeploy

mirror of https://github.com/PaddlePaddle/FastDeploy.git synced 2026-05-06 23:49:39 +08:00

YuBaoku 54f7d9f621 [CI] Sync mm_batch_invariant with paddle.mm update (#6557)

2026-02-28 14:56:42 +08:00

graph_optimization

[Speculative Decoding] Support suffix decoding (#6403)

2026-02-26 11:42:05 +08:00

guided_decoding

…

[CI] Sync mm_batch_invariant with paddle.mm update (#6557)

2026-02-28 14:56:42 +08:00

logits_processor

[Feature] Support ThinkingBudget Logits processor to control thinking content length (#6367)

2026-02-25 14:17:09 +08:00

[BugFix] Fix model loading error for 300B FP8 EP parallel test case (#6382)

2026-02-10 11:32:57 +08:00

add dsv3 mixed deploy as EP16 TP8 (#6525)

2026-02-27 14:08:25 +08:00

[Fix] Use paddle.device.get_device_properties for multi-platform compatibility (#6400)

2026-02-09 19:15:41 +08:00

__init__.py

…

entropy_utils.py

…

forward_meta.py

…

load_weight_utils.py

[loader]supoort wint2 backend (#6139)

2026-02-08 22:42:36 -08:00

pre_and_post_process.py

[OP][Feature] Unify the limit_thinking_content_length CUDA operator; support response length limits and injected sequences (#6493)

2026-02-25 21:36:50 +08:00

utils.py

[BugFix] Fix model loading error for 300B FP8 EP parallel test case (#6382)

2026-02-10 11:32:57 +08:00

xpu_pre_and_post_process.py

[XPU] Fix PD + MTP (#6495)

2026-02-27 19:07:35 +08:00