2 Commits

Author SHA1 Message Date
chen 193886e745 only cuda run triton op (#5846) 2025-12-31 14:17:31 +08:00
chen 0bcf924e10 [Optimization] Optimization for gather_logprob by 10GB (#5817) 2025-12-30 15:33:34 +08:00
* opt logprobs gather_logprob, reduce device memory usage by 10GB when token_num=8k