apps/FastDeploy
mirror of https://github.com/PaddlePaddle/FastDeploy.git, synced 2026-05-09 08:55:00 +08:00
ad9b95e6dd482c9e98ebf65321802c157ec9fb52
FastDeploy/fastdeploy/model_executor/layers
yangjianfengo1 e81046fdad [New Feature] Centralized support for w4afp8 (#3644)
* Support tp w4afp8
* code style
2025-08-28 10:53:24 +08:00
attention | Revert "[Feature] block sparse attention (#3209)" (#3647) | 2025-08-27 17:35:04 +08:00
backends | [NewFeatures] support eplb (#3547) | 2025-08-26 16:19:30 +08:00
moe | [New Feature] Centralized support for w4afp8 (#3644) | 2025-08-28 10:53:24 +08:00
quantization | [Optimize]support machete weight only gemm (#3561) | 2025-08-28 09:49:58 +08:00
sample | [Feature] Add temp_scaled_logprobs and top_p_normalized_logprobs parameters for logits and logprobs post processing (#3552) | 2025-08-25 14:11:49 +08:00
__init__.py | [LLM] First commit the llm deployment code | 2025-06-09 19:20:15 +08:00
activation.py | [Polish Code] Remove useless notes | 2025-08-14 14:04:52 +08:00
embeddings.py | Supports DP+TP+EP hybrid parallel deployment strategy (#3489) | 2025-08-26 00:04:01 -07:00
linear.py | Supports DP+TP+EP hybrid parallel deployment strategy (#3489) | 2025-08-26 00:04:01 -07:00
lm_head.py | [Precision] Support lm_head layer running in float32 (#3597) | 2025-08-27 11:34:53 +08:00
mtp_linear.py | polish code with new pre-commit rule (#2923) | 2025-07-19 23:19:27 +08:00
normalization.py | adaptive rms_norm's dtype (#3617) | 2025-08-26 15:29:15 +08:00
rotary_embedding.py | [MetaxGPU] Support FastDeploy on metax gpu (#3241) | 2025-08-13 11:11:54 +08:00
utils.py | [V1 Loader] support weight_only (#3413) | 2025-08-23 13:13:41 +08:00