FastDeploy

mirror of https://github.com/PaddlePaddle/FastDeploy.git synced 2026-05-08 08:23:25 +08:00

Files

freeliuzc 15f5112ecb [Speculative Decoding] Support different inferseed in speculate decoding (#5568)

* fix mtp entropy drop in RL

* optimize usage and fix unit test

* optimize padding_sampling_params speed(vectorized)

---------

Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com>

2025-12-17 16:14:29 +08:00

test_activation.py
test_append_attention_with_output.py
test_append_attention.py
test_attention_layer.py
test_ep_moe_expert_dispatch_fp8.py
test_ffn.py
test_fusedmoe.py
test_guided_decoding.py
test_min_sampling.py
test_moba_attention_backend.py
test_native_paddle_backend.py
test_plas_attention.py
test_quantized_linear.py
test_repetition_early_stopper.py
test_sampler.py
test_speculative_sampler.py
test_w4a8_moe.py
test_w4afp8_moe.py