FastDeploy

mirror of https://github.com/PaddlePaddle/FastDeploy.git synced 2026-04-23 00:17:25 +08:00

Files

Ryan 49cea8fb1c [SOT][Cudagraph] Remove BreakGraph of #3302 && update CustomOp (#3694)

* rm inplace info && to(gpu)

* update append_attention

* unpin paddle version

* add full_cuda_graph=False

* add blank line

---------

Co-authored-by: SigureMo <sigure.qaq@gmail.com>

2025-10-17 10:57:55 +08:00

cpu_ops

fix typos (#3951)

2025-09-08 15:22:41 +08:00

gpu_ops

[SOT][Cudagraph] Remove BreakGraph of #3302 && update CustomOp (#3694)

2025-10-17 10:57:55 +08:00

iluvatar_ops

[Iluvatar GPU] Optimize attention performance and fix moe load ckpt error (#3651)

2025-09-22 21:13:59 +08:00

metax_ops

[Metax] support cutlass moe & optimize flash attention (#4208)

2025-09-29 11:22:43 +08:00

third_party

[setup optimize] Support git submodule (#4033)

2025-09-11 17:41:16 +08:00

utils

[Fix bug] Pin w4afp8 nblock to 256 and add a mask parameter to fa3 append attn (#3771)

2025-09-02 19:17:01 +08:00

xpu_ops

[XPU] refine fused moe (#4219)

2025-10-16 19:04:07 +08:00

0001-DeepGEMM-95e81b3.patch

[feat] support fa3 backend for pd disaggregated (#2695)

2025-07-03 22:33:27 +08:00

MANIFEST.in

…

setup_ops_cpu.py

polish code with new pre-commit rule (#2923)

2025-07-19 23:19:27 +08:00

setup_ops.py

[Hackathon 9th No.86] autogen MultiQueryDecoderAttention template_instantiation -part (#4383)

2025-10-16 17:08:19 +08:00