Files
FastDeploy/custom_ops/gpu_ops
AIbin 1fb8194191 [OP][Models][Optimization] Optimize RoPE CUDA kernel and update DeepSeek V3 config (#7359)
* dsk del prefill mask

* dsk support 1M+ seq_len rope

* update rope tests

* Replace max_position_embeddings with max_model_len

* Use a 1D grid: gridDim.x can be as large as 2^31-1, far more than the actual number of tokens.
2026-04-13 19:12:36 +08:00