[Feature] support pooling model dummy_run (#4345)

* support qwen3-embedding

* fix ci bug

* support pooling dummy_run

* fix

* delete print

* parallel_config.max_model_len

* delete is_pooling_model in dummy_run

* fix

* fd_model

* fix embedding load

* fix

* fix post_process
lizexu123
2025-10-17 13:30:55 +08:00
committed by GitHub
parent 15b6b8dc25
commit c234b995ab
10 changed files with 291 additions and 126 deletions
@@ -303,7 +303,9 @@ class Qwen3ForCausalLM(ModelForCasualLM):
    if model_param_name not in params_dict:
        continue
    param = params_dict[model_param_name]
    weight_loader = getattr(param, "weight_loader", default_weight_loader(self.fd_config))
    weight_loader(param, loaded_weight, shard_id)
    break
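
The context lines in this hunk look up a per-parameter `weight_loader` and fall back to a default loader when the parameter does not define one. The following is a minimal, self-contained sketch of that lookup pattern only. `Param`, `make_stacked_loader`, `load_one`, and this toy `default_weight_loader` are hypothetical stand-ins written for illustration and are not the FastDeploy implementations.

```python
# Hypothetical stand-ins for illustration; these do not mirror the real
# FastDeploy classes or loaders.
class Param:
    """A minimal parameter holding a flat list of values."""

    def __init__(self, size):
        self.data = [0.0] * size


def default_weight_loader(config=None):
    """Return a loader that copies the whole weight and ignores shard_id."""

    def loader(param, loaded_weight, shard_id=None):
        param.data[:] = loaded_weight

    return loader


def make_stacked_loader(shard_offsets):
    """Return a loader that writes one shard into its slice of a fused parameter."""

    def loader(param, loaded_weight, shard_id=None):
        start = shard_offsets[shard_id]
        param.data[start:start + len(loaded_weight)] = loaded_weight

    return loader


def load_one(param, loaded_weight, shard_id=None):
    # Same lookup shape as in the hunk: prefer the parameter's own loader,
    # otherwise use the default. The default is constructed eagerly, even
    # when the parameter already carries a loader.
    weight_loader = getattr(param, "weight_loader", default_weight_loader())
    weight_loader(param, loaded_weight, shard_id)


# A plain parameter has no custom loader, so the default copies everything.
plain = Param(2)
load_one(plain, [1.0, 2.0])

# A fused parameter carries its own loader and places each shard by id.
fused = Param(4)
fused.weight_loader = make_stacked_loader({"q": 0, "k": 2})
load_one(fused, [3.0, 4.0], "k")
load_one(fused, [1.0, 2.0], "q")
```

In this sketch the fused parameter ends up with the `q` shard in its first half and the `k` shard in its second half, regardless of load order, while the plain parameter is overwritten wholesale by the default loader.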