FastDeploy/fastdeploy/model_executor/layers at 2ae7ab28d2637bd327f2db9293be3673290063f5 - FastDeploy - 子说镜像小站

apps/FastDeploy

mirror of https://github.com/PaddlePaddle/FastDeploy.git synced 2026-05-10 01:21:55 +08:00

Kane2011 2ae7ab28d2 [MetaxGPU] adapt to the latest fastdeploy on metax gpu (#3492)

2025-08-25 17:44:20 +08:00

Add custom op declaration for all_reduce (#3473)

2025-08-20 20:29:58 +08:00

[MetaxGPU] adapt to the latest fastdeploy on metax gpu (#3492)

2025-08-25 17:44:20 +08:00

support w4afp8 EP inference (#3044)

2025-08-25 11:27:45 +08:00

support w4afp8 EP inference (#3044)

2025-08-25 11:27:45 +08:00

[Feature] Add temp_scaled_logprobs and top_p_normalized_logprobs parameters for logits and logprobs post processing (#3552)

2025-08-25 14:11:49 +08:00

__init__.py

[LLM] First commit the llm deployment code

2025-06-09 19:20:15 +08:00

activation.py

[Polish Code] Remove useless notes

2025-08-14 14:04:52 +08:00

embeddings.py

[V1 Loader] support weight_only (#3413)

2025-08-23 13:13:41 +08:00

linear.py

support qwen2 weight only (#3571)

2025-08-24 11:14:34 +08:00

lm_head.py

[V1 Loader] support weight_only (#3413)

2025-08-23 13:13:41 +08:00

mtp_linear.py

polish code with new pre-commit rule (#2923)

2025-07-19 23:19:27 +08:00

normalization.py

polish code with new pre-commit rule (#2923)

2025-07-19 23:19:27 +08:00

rotary_embedding.py

[MetaxGPU] Support FastDeploy on metax gpu (#3241)

2025-08-13 11:11:54 +08:00

utils.py

[V1 Loader] support weight_only (#3413)

2025-08-23 13:13:41 +08:00