Support MXFP4 for GPT-OSS (#5435)

* support mxfp4 in gpt-oss
* support mxfp4 in gpt-oss
* add scope for flashinfer
* remove torch code
* update envs.FD_MXFP4_BACKEND
* update process_weights_after_loading
* update env name
* support tp in gpt-oss, add e2e test
* add flashinfer-python-paddle in requirements
* fix import error
* add test
* add test
* add test
* add test
2026-04-23 17:11:21 +08:00 · 2026-01-22 14:21:01 +08:00
parent 309c7d9764
commit 82057cb71f
13 changed files with 670 additions and 25 deletions
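
The ModelConfig hunk in this commit adds a branch that treats a checkpoint as Hugging Face ("torch") format when its config.json declares `quant_method` as "mxfp4" under `quantization_config`. A minimal standalone sketch of that check follows, assuming the config has already been loaded into a plain dict. The helper name `is_mxfp4_checkpoint` is hypothetical and not part of this commit. Unlike the chained `in` tests in the diff, the sketch also returns False when `quantization_config` is present but not a mapping.

```python
from typing import Any, Mapping


def is_mxfp4_checkpoint(model_config: Mapping[str, Any]) -> bool:
    """Return True if the config declares MXFP4 quantization.

    Hypothetical helper mirroring the condition added in the diff:
    quantization_config.quant_method == "mxfp4".
    """
    quant_cfg = model_config.get("quantization_config")
    # Tolerate a missing or non-mapping entry instead of raising.
    if not isinstance(quant_cfg, Mapping):
        return False
    return quant_cfg.get("quant_method") == "mxfp4"


if __name__ == "__main__":
    cfg = {"quantization_config": {"quant_method": "mxfp4"}}
    # Same mapping the new branch applies: MXFP4 implies "torch" format.
    model_format = "torch" if is_mxfp4_checkpoint(cfg) else None
    print(model_format)
```

Keeping the check in one small predicate would make it easy to unit test separately from the rest of the format-detection chain, which still falls through to the ValueError when no branch matches.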
@@ -411,6 +411,13 @@ class ModelConfig:
                else:
                    self.model_format = "paddle"
                    logger.info("The model format is Paddle")
+            elif (
+                "quantization_config" in self.model_config
+                and "quant_method" in self.model_config["quantization_config"]
+                and "mxfp4" == self.model_config["quantization_config"]["quant_method"]
+            ):
+                self.model_format = "torch"
+                logger.info("The model format is Hugging Face")
            else:
                raise ValueError(
                    "Unknown model format. Please ensure your config.json contains "