[Feature] Consider multimodal models during dummy run (#6045)

* add mm do profile

* update code

* update code

* update code

* update code

* update test case

* update code

* update code

* fix xpu bug

* update code

* add mm do profile

* update test case

* update code
kevin
2026-02-09 17:49:55 +08:00
committed by GitHub
parent 783d56e28a
commit d60daca4a8
25 changed files with 166 additions and 19 deletions
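The first hunk of this commit registers a `--mm_max_tokens_per_item` flag with `type=json.loads`, so the value is passed on the command line as a JSON string and arrives in the parsed namespace as a Python object. The sketch below isolates that one argument to show the behaviour. The helper names `build_parser` and `parse_mm_limits` and the `"image"` / `"video"` keys are assumptions made for illustration; the commit excerpt does not show which keys the project expects.

```python
import argparse
import json


def build_parser() -> argparse.ArgumentParser:
    """Build a parser holding only the flag added in this commit."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--mm_max_tokens_per_item",
        type=json.loads,  # the raw CLI string is decoded as JSON
        default=None,     # stays None when the flag is omitted
        help="Maximum tokens per item in mm input.",
    )
    return parser


def parse_mm_limits(argv):
    """Hypothetical helper: return the decoded flag value for an argv list."""
    return build_parser().parse_args(argv).mm_max_tokens_per_item


if __name__ == "__main__":
    # Hypothetical modality keys; the real key set is not shown in the diff.
    limits = parse_mm_limits(
        ["--mm_max_tokens_per_item", '{"image": 1024, "video": 4096}']
    )
    print(limits)
```

Because `json.JSONDecodeError` is a subclass of `ValueError`, argparse reports malformed JSON as a normal usage error and exits with status 2 instead of raising a traceback. Note that `json.loads` also accepts non-object JSON such as `5` or `[1, 2]`, so code consuming the value may want to check that it received a dict.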
+7
@@ -1026,6 +1026,13 @@ def parse_args():
        help="Enable output of token-level entropy.",
    )
    parser.add_argument(
        "--mm_max_tokens_per_item",
        type=json.loads,
        default=None,
        help="Maximum tokens per item in mm input.",
    )
    parser.add_argument(
        "--num_cpu_blocks",
        type=int,