FastDeploy/fastdeploy/model_executor/layers at bf7e2424d0c32ba1ef452b9ec30d1e3eba3cd68e - FastDeploy - 子说镜像小站

apps/FastDeploy

mirror of https://github.com/PaddlePaddle/FastDeploy.git synced 2026-04-23 00:17:25 +08:00

周周周 1c38da2118 Make seq_lens_this_time/decoder/encoder equal shape (#6942)

2026-03-20 15:31:52 +08:00

Make seq_lens_this_time/decoder/encoder equal shape (#6942)

2026-03-20 15:31:52 +08:00

[Iluvatar] refactor attn and moe code (#6887)

2026-03-18 10:31:00 +08:00

batch_invariant_ops

[CI] Sync _log_softmax_batch_invariant with paddle update (#6893)

2026-03-17 23:03:57 +08:00

opt wfp8afp8 triton moe (#6938)

2026-03-20 11:07:25 +08:00

…

remove load_up_proj_weight_first (#6932)

2026-03-19 17:21:34 +08:00

[Feature][Sampling] Extend top-k_top-p sampling to all backends and unify greedy decoding with top_k=1 (#6894)

2026-03-19 01:43:10 -07:00

__init__.py

…

activation.py

[Feature] use phi permute/unpermute & rm swiglu (#6361)

2026-03-12 02:01:57 -07:00

embeddings.py

[Feature][OP] Add batch-invariant RMSNorm kernel and TP embedding Custom AR path (#6749)

2026-03-13 14:34:44 +08:00

linear.py

clean nvfp4 related code (#6644)

2026-03-05 15:48:33 +08:00

lm_head.py

…

mtp_linear.py

…

normalization.py

[RL] support qkrmsnorm use proxy-norm (#6862)

2026-03-18 23:27:26 -07:00

pooler.py

…

rotary_embedding.py

[Other] Adjust GPUModelRunner to enhance compatibility (#6851)

2026-03-16 14:49:19 +08:00

utils.py

[Others] support import deepgemm/deepep from fleet ops (#6351)

2026-02-09 11:53:13 +08:00