yzwu
|
8b890c0d72
|
[Iluvatar] refactor attn and moe code (#6887)
|
2026-03-18 10:31:00 +08:00 |
|
yzwu
|
901b38c936
|
[Iluvatar] Optimize decode group_gemm and Support cuda graph for ernie (#6803)
|
2026-03-12 19:21:17 +08:00 |
|
yzwu
|
6674131b0b
|
[Iluvatar] Support CudaGraph and optimize flash_attn_unpadded and fused_neox_rope_embedding (#6553)
|
2026-03-02 14:07:17 +08:00 |
|
yzwu
|
7b6cc11952
|
[Iluvatar] Fix FD launch error when specifying CUDA_VISBLE_DEVICE (#5735)
|
2025-12-26 14:01:27 +08:00 |
|
yzwu
|
ac013803f3
|
[Iluvatar] Support V1_KVCACHE_SCHEDULER and paddleocr-vl rope mode (#5555)
|
2025-12-18 02:14:25 -08:00 |
|
yzwu
|
504461b6b5
|
[Iluvatar GPU] Optimize attention performance and fix moe load ckpt error (#3651)
|
2025-09-22 21:13:59 +08:00 |
|
chen
|
f0f00a6025
|
[OPs] Universal optimization and Fix early_stop cuda 700 (#3375)
* delete nonzero
* delete setup_ops_base.py
* check if
* check gcp infer_seed.cpu()
* fix repetition_early_stopper_kernel cuda 700
|
2025-08-14 22:40:44 +08:00 |
|
yzwu
|
ce9180241e
|
[Iluvatar GPU] Modify the names of some variables (#3273)
|
2025-08-13 11:38:02 +08:00 |
|
yzwu
|
fbdd6b0663
|
[Iluvatar GPU] Optimize attention and moe performance (#3234)
|
2025-08-08 10:51:24 +08:00 |
|
Zero Rains
|
25698d56d1
|
polish code with new pre-commit rule (#2923)
|
2025-07-19 23:19:27 +08:00 |
|
Yuanle Liu
|
61b3997b85
|
refactor rl get_name_mappings_to_training (#2847)
* refactor rl get_name_mappings_to_training
* fix tp>1
* change variable name(ffn1->up_gate_proj/ffn2->down_proj)
* change variable name(linear_weight->weight/linear_bias->bias)
* add rl names mapping for vl
* fix ernie 0.3B error
* fix develop code
* fix
|
2025-07-15 07:31:42 -07:00 |
|
liddk1121
|
1b54a2831e
|
Adapt for iluvatar gpu (#2684)
|
2025-07-07 16:53:14 +08:00 |
|