mirror of
https://github.com/PaddlePaddle/FastDeploy.git
synced 2026-04-23 00:17:25 +08:00
[BugFix] Cap nvcc -t threads to avoid compilation failures on high-co… (#6885)
* [BugFix] Cap nvcc -t threads to avoid compilation failures on high-core machines

  On machines with many cores (e.g. 192), the nvcc -t flag was set to os.cpu_count(), causing each nvcc process to spawn that many internal threads. Combined with Paddle's ThreadPoolExecutor launching parallel compilations (also based on cpu_count), this leads to ~28K+ threads, resource exhaustion, and silent compilation failures. The linker then cannot find the missing .o files, but a second build succeeds because already-compiled objects are cached.

  Cap nvcc -t at 4 to keep total parallelism reasonable.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Potential fix for pull request finding

  Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: gongweibao <gognweibao@baidu.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
@@ -363,8 +363,11 @@ elif paddle.is_compiled_with_cuda():
 "-Igpu_ops",
 "-Ithird_party/nlohmann_json/include",
 ]
-worker_threads = os.cpu_count()
-nvcc_compile_args += ["-t", str(worker_threads)]
+# Limit nvcc internal threads to avoid resource exhaustion when Paddle's
+# ThreadPoolExecutor also launches many parallel compilations.
+# Total threads ≈ (number of parallel compile jobs) × nvcc_threads, so cap nvcc_threads at 4.
+nvcc_threads = min(os.cpu_count() or 1, 4)
+nvcc_compile_args += ["-t", str(nvcc_threads)]
 
 nvcc_version = get_nvcc_version()
 print(f"nvcc_version = {nvcc_version}")
 
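The thread arithmetic described in the commit message can be sketched roughly as follows. This is an illustration only, not code from the repository: the helper names `capped_nvcc_threads` and `estimated_total_threads` are hypothetical, and the sketch assumes one parallel compile job per core, which is a simplification of whatever Paddle's ThreadPoolExecutor actually does. The resulting numbers are rough estimates under that assumption, not measurements, and need not match the ~28K+ figure reported above.

```python
import os

# Cap chosen in this commit for nvcc's -t value.
NVCC_THREAD_CAP = 4


def capped_nvcc_threads(cpu_count=None, cap=NVCC_THREAD_CAP):
    """Hypothetical helper mirroring `min(os.cpu_count() or 1, 4)` from the diff.

    `os.cpu_count()` may return None, so `or 1` supplies a safe fallback.
    """
    if cpu_count is None:
        cpu_count = os.cpu_count()
    return min(cpu_count or 1, cap)


def estimated_total_threads(parallel_jobs, nvcc_threads):
    """Rough estimate: total threads ≈ parallel compile jobs × nvcc threads."""
    return parallel_jobs * nvcc_threads


if __name__ == "__main__":
    cores = 192  # example core count from the commit message
    # Assumption: one compile job per core (not confirmed by the source).
    before = estimated_total_threads(cores, cores)
    after = estimated_total_threads(cores, capped_nvcc_threads(cores))
    print(f"before cap: ~{before} threads, after cap: ~{after} threads")
```

With the cap in place, the estimate grows linearly with the number of compile jobs instead of quadratically with the core count, which is the point of the change.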