FastDeploy

mirror of https://github.com/PaddlePaddle/FastDeploy.git synced 2026-05-06 15:40:33 +08:00

Author	SHA1	Message	Date
Haonan Luo	82057cb71f	Support MXFP4 for GPT-OSS (#5435 ) * support mxfp4 in gpt-oss * support mxfp4 in gpt-oss * add scope for flashinfer * remove torch code * update envs.FD_MXFP4_BACKEND * update process_weights_after_loading * update env name * support tp in gpt-oss, add e2e test * add flashinfer-python-paddle in requirements * fix import error * add test * add test * add test * add test	2026-01-22 14:21:01 +08:00
K11OntheBoat	490a6551dc	rename params of normalization layer (#6133 ) Co-authored-by: “liuruian” <liuruian@baidu.com>	2026-01-21 17:18:35 +08:00
sunxin	9dc1c74d36	fix opt qknorm (#6080 )	2026-01-19 12:07:20 +08:00
sunxin	2533836dbb	[Optimization] Accelerate Qwen3 QK RMSNorm via Fused Triton Kernel (#5880 ) * qk rmsnorm fused * inplace * glm * fix * add qknorm layer * fix * update * fix qwen3 vl * update rl baseline * fix qwen3 vl moe * test * fix qwen vl moe rl * fix	2026-01-12 05:10:21 -08:00
Yuanle Liu	d4a386dfc4	Revert "Revert "[TSP] last_norm allgather move to model.py (#5924 )" (#5961 )" (#5972 ) This reverts commit `8c3513a410`.	2026-01-09 15:58:22 +08:00
Yuanle Liu	8c3513a410	Revert "[TSP] last_norm allgather move to model.py (#5924 )" (#5961 ) This reverts commit `2bb838fed9`.	2026-01-09 15:20:40 +08:00
xiaoluomi	2bb838fed9	[TSP] last_norm allgather move to model.py (#5924 ) * support_lastnorm_gather_split_dev * support_lastnorm_gather_split_dev1 * support_lastnorm_gather_split_dev3 * support_lastnorm_gather_split_dev4 * support_lastnorm_gather_split_dev5	2026-01-07 23:36:33 -08:00
Longzhi Wang	d8587e987e	[Model] tp+ep support v1_loader (#5465 ) * [Model] tp+ep support v1_loader * fix * fix mtp_linear * fix mtp_linear * fix * fix * fix v0 loader * fix * Add get_tensor for ep * fix linear weight_loader * fix typo * fix	2025-12-18 14:31:54 +08:00
周周周	fb7f951612	[UNITEST] add test (#5305 )	2025-12-02 17:59:01 +08:00
Yuanle Liu	3dc0ffa46d	[TSP] Support qwen3 moe tsp + cudagraph (#4871 ) * support qwen3_moe tsp mode * fix * fix * update * update * update * fix * support external_rmsnorm * update * fix	2025-11-10 23:37:51 +08:00
zhupengyang	b54eb7ad81	[XPU] ep+tp all2all (#4836 )	2025-11-06 17:26:14 +08:00
Yuanle Liu	56e2d7e668	adaptive rms_norm's dtype (#3617 ) * adaptive rms_norm's dtype * adaptive rms_norm's dtype * add approve coverage --------- Co-authored-by: liuyuanle <liuyuanle@baidu.com>	2025-08-26 15:29:15 +08:00
Zero Rains	25698d56d1	polish code with new pre-commit rule (#2923 )	2025-07-19 23:19:27 +08:00
Yuanle Liu	61b3997b85	refactor rl get_name_mappings_to_training (#2847 ) * refactor rl get_name_mappings_to_training * fix tp>1 * change variable name(ffn1->up_gate_proj/ffn2->down_proj) * change variable name(linear_weight->weight/linear_bias->bias) * add rl names mapping for vl * fix ernie 0.3B error * fix develop code * fix	2025-07-15 07:31:42 -07:00
EnflameGCU	d0f4d6ba3a	[GCU] Support gcu platform (#2702 ) baseline: `e7fa57ebae` Co-authored-by: yongqiangma <xing.wo@163.com>	2025-07-08 13:00:52 +08:00
liddk1121	1b54a2831e	Adapt for iluvatar gpu (#2684 )	2025-07-07 16:53:14 +08:00
Jiang-Jia-Jun	05c670e593	[Sync] Update to latest code (#2679 ) * [Sync] Update to latest code * Add new code files * Add new code files * update code * Try to fix build.sh * Try to fix build.sh * Update code * Update requirements.txt * Update code --------- Co-authored-by: Jiang-Jia-Jun <jiangjiajun@baidu.com>	2025-07-03 15:43:53 +08:00
Jiang-Jia-Jun	92c2cfa2e7	Sync v2.0 version of code to github repo	2025-06-29 23:29:37 +00:00
jiangjiajun	684703fd72	[LLM] First commit the llm deployment code	2025-06-09 19:20:15 +08:00

19 Commits