apps/FastDeploy
mirror of https://github.com/PaddlePaddle/FastDeploy.git, synced 2026-05-07 16:08:58 +08:00
d8841b7b40993afa492066720b7158c09eb542ab
FastDeploy/fastdeploy/model_executor/layers
bukejiyu  bcaa98ff9c  V1 loader default (#4251)  2025-10-15 16:49:17 +08:00
* v1 laoder * update * update
attention            [Optimization] Fuse get_max_len and get_kv_max_len (#4369)  2025-10-13 20:35:00 +08:00
backends             [XPU] fix ep (#4393)  2025-10-15 11:41:05 +08:00
moe                  V1 loader default (#4251)  2025-10-15 16:49:17 +08:00
pool                 [Feature] support qwen3-embedding model load (#4202)  2025-09-23 00:14:35 -07:00
quantization         [XPU] Support W4A8C8-TP4-300B Model (#4068)  2025-10-10 15:41:32 +08:00
sample               [Executor]CUDAGraph support Speculate Decode (#3769)  2025-10-09 21:18:29 +08:00
__init__.py          [LLM] First commit the llm deployment code  2025-06-09 19:20:15 +08:00
activation.py        [Intel HPU] Support intel hpu platform (#4161)  2025-09-24 12:27:50 +08:00
embeddings.py        [BugFix] fix qwen3-embedding model tp>1 (#4223)  2025-09-24 14:13:26 +08:00
linear.py            fix machete pre quant (#4295)  2025-09-28 16:11:09 +08:00
lm_head.py           [Feature] support qwen3-embedding model load (#4202)  2025-09-23 00:14:35 -07:00
mtp_linear.py        support tmp (#3675)  2025-08-28 19:42:32 +08:00
normalization.py     adaptive rms_norm's dtype (#3617)  2025-08-26 15:29:15 +08:00
pooler.py            [Feature] support pool (#3827)  2025-09-22 14:09:09 +08:00
rotary_embedding.py  [Intel HPU] Support intel hpu platform (#4161)  2025-09-24 12:27:50 +08:00
utils.py             [OPs] MoE support wfp8afp8(channelwise) and improve per_token_quant_fp8 (#4238)  2025-09-24 16:39:51 +08:00