FastDeploy/fastdeploy/model_executor at 18ae6aa4d6c403f651bebee810fe62b0b503d5ea - FastDeploy - 子说镜像小站

apps/FastDeploy

mirror of https://github.com/PaddlePaddle/FastDeploy.git synced 2026-04-23 00:17:25 +08:00

google-labs-jules[bot] 18ae6aa4d6 perf: avoid unnecessary dtype casting in RMSNorm

Added checks before calling `.astype` in `fastdeploy/model_executor/layers/normalization.py`. In PaddlePaddle, calling `.astype` allocates a new tensor even if the input already has the target dtype, so skipping these redundant casts avoids memory allocations and kernel launches on the hot path.

2026-04-19 15:16:05 +00:00
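
The guard described in this commit message can be sketched as follows. This is a minimal illustration, not the code from `normalization.py`: `FakeTensor` is a hypothetical stand-in that only mimics the "`.astype` always allocates" behavior stated above, and `cast_if_needed` is an illustrative helper name that does not come from the repository.

```python
class FakeTensor:
    """Hypothetical stand-in for a tensor; counts how often astype allocates."""

    allocations = 0

    def __init__(self, dtype):
        self.dtype = dtype

    def astype(self, dtype):
        # Mimics the behavior described in the commit message:
        # a new object is created even when the dtype is unchanged.
        FakeTensor.allocations += 1
        return FakeTensor(dtype)


def cast_if_needed(x, dtype):
    """Return x unchanged when it already has the requested dtype."""
    if x.dtype == dtype:
        return x  # skip the redundant cast and its allocation
    return x.astype(dtype)


x = FakeTensor("float32")
same = cast_if_needed(x, "float32")  # dtype matches: no cast is issued
half = cast_if_needed(x, "float16")  # dtype differs: one real cast
```

With a real tensor library the same check would compare the tensor's dtype attribute against the target before casting; whether that comparison is cheaper than the cast is the assumption the commit relies on.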

graph_optimization

[RL] Add clear_graph_opt_backend for glm4_mtp (#7378)

2026-04-15 19:44:15 +08:00

guided_decoding

[Optimization] Use a separate driver when using Triton with Paddle (#6897)

2026-03-24 10:56:00 +08:00

perf: avoid unnecessary dtype casting in RMSNorm

2026-04-19 15:16:05 +00:00

logits_processor

[Bugfix] Align thinking_budget behavior with ERNIE reasoning flow (#6934)

2026-03-23 14:15:55 +08:00

[Loader] add multi-thread model loading (#6877)

2026-04-09 23:40:15 -07:00

[Optimization][DeepSeekV3.2] Reducing slot_mapping compute frequency from twice per layer to a single pre-processing step. (#7367)

2026-04-16 19:54:12 +08:00

[Others] Fix typo (#7280)

2026-04-14 17:28:22 +08:00

__init__.py

polish code with new pre-commit rule (#2923)

2025-07-19 23:19:27 +08:00

entropy_utils.py

[Bugfix] Fix entropy calculation bugs (#5941)

2026-01-08 20:57:35 +08:00

forward_meta.py

[Optimization][DeepSeekV3.2] Reducing slot_mapping compute frequency from twice per layer to a single pre-processing step. (#7367)

2026-04-16 19:54:12 +08:00

load_weight_utils.py

[Loader] add multi-thread model loading (#6877)

2026-04-09 23:40:15 -07:00

pre_and_post_process.py

[Speculative Decoding] Add MTP logprob support for PD disaggregation (#7442)

2026-04-17 21:37:38 +08:00

utils.py

[Optimization] enable trtllm_all_reduce fusion kernel in glm model (#6660)

2026-04-16 14:10:19 +08:00

xpu_pre_and_post_process.py

[XPU] Unify Spec and non-spec branch. (#6947) (#7180)

2026-04-16 14:58:38 +08:00