apps/FastDeploy
mirror of https://github.com/PaddlePaddle/FastDeploy.git synced 2026-05-10 01:21:55 +08:00
2ae7ab28d2637bd327f2db9293be3673290063f5
FastDeploy/fastdeploy/model_executor/layers
Kane2011 2ae7ab28d2 [MetaxGPU] adapt to the latest fastdeploy on metax gpu (#3492)
2025-08-25 17:44:20 +08:00
attention
Add custom op declaration for all_reduce (#3473)
2025-08-20 20:29:58 +08:00
backends
[MetaxGPU] adapt to the latest fastdeploy on metax gpu (#3492)
2025-08-25 17:44:20 +08:00
moe
support w4afp8 EP inference (#3044)
2025-08-25 11:27:45 +08:00
quantization
support w4afp8 EP inference (#3044)
2025-08-25 11:27:45 +08:00
sample
[Feature] Add temp_scaled_logprobs and top_p_normalized_logprobs parameters for logits and logprobs post processing (#3552)
2025-08-25 14:11:49 +08:00
__init__.py
[LLM] First commit the llm deployment code
2025-06-09 19:20:15 +08:00
activation.py
[Polish Code] Remove useless notes
2025-08-14 14:04:52 +08:00
embeddings.py
[V1 Loader] support weight_only (#3413)
2025-08-23 13:13:41 +08:00
linear.py
support qwen2 weight only (#3571)
2025-08-24 11:14:34 +08:00
lm_head.py
[V1 Loader] support weight_only (#3413)
2025-08-23 13:13:41 +08:00
mtp_linear.py
polish code with new pre-commit rule (#2923)
2025-07-19 23:19:27 +08:00
normalization.py
polish code with new pre-commit rule (#2923)
2025-07-19 23:19:27 +08:00
rotary_embedding.py
[MetaxGPU] Support FastDeploy on metax gpu (#3241)
2025-08-13 11:11:54 +08:00
utils.py
[V1 Loader] support weight_only (#3413)
2025-08-23 13:13:41 +08:00