FastDeploy/fastdeploy/model_executor/layers/quantization
Haonan Luo 82057cb71f Support MXFP4 for GPT-OSS (#5435)
* support mxfp4 in gpt-oss

* add scope for flashinfer

* remove torch code

* update envs.FD_MXFP4_BACKEND

* update process_weights_after_loading

* update env name

* support tp in gpt-oss, add e2e test

* add flashinfer-python-paddle in requirements

* fix import error

* add test
2026-01-22 14:21:01 +08:00