apps/FastDeploy
mirror of https://github.com/PaddlePaddle/FastDeploy.git synced 2026-04-23 00:17:25 +08:00
Path: FastDeploy/fastdeploy/model_executor
Commit: 20de04e249d94f846c1f81d220aa2eab5b27e4ce
Latest commit: zccjjj 20de04e249 [XPU] move xpu_attn_backend.py to FastDeploy/fastdeploy/model_executor/layers/backends/xpu (#5878), 2026-01-09 16:34:57 +08:00

graph_optimization/            [Others] add assert and only count the actual load in cuda_graph (#5445)    2025-12-10 11:22:54 +08:00
guided_decoding/               …
layers/                        [XPU] move xpu_attn_backend.py to FastDeploy/fastdeploy/model_executor/layers/backends/xpu (#5878)    2026-01-09 16:34:57 +08:00
logits_processor/              …
model_loader/                  [Loader] Fix bug in MTP weight loading (#5744)    2025-12-25 11:32:17 +08:00
models/                        Revert "Revert "[TSP] last_norm allgather move to model.py (#5924)" (#5961)" (#5972)    2026-01-09 15:58:22 +08:00
ops/                           [INTEL HPU] support only one release package of PaddleCustomDevice (#5910)    2026-01-08 11:57:13 +08:00
__init__.py                    …
entropy_utils.py               [Bugfix] Fix entropy calculation bugs (#5941)    2026-01-08 20:57:35 +08:00
forward_meta.py                [Intel HPU] enable chunked prefill (#5903)    2026-01-06 21:01:50 +08:00
load_weight_utils.py           [V1 Loader] Support loading static C8 scale JSON (#5909)    2026-01-06 19:49:30 -08:00
pre_and_post_process.py        [Metax] optimize flash attention backend (#5876)    2026-01-06 09:52:09 +08:00
utils.py                       [INTEL_HPU] supported ERNIE-4.5-21B-A3B-Thinking (#5891)    2026-01-07 21:31:53 +08:00
xpu_pre_and_post_process.py    [XPU] Speculative Decoding with PD (#5856)    2026-01-05 17:31:03 +08:00