Commit 2b72119

CI: Cannot set --env CUDA_VISIBLE_DEVICES for docker run when using --gpus "device=${CUDA_VISIBLE_DEVICES:-all}" - dropping --env arg.
To see why, suppose the host has export CUDA_VISIBLE_DEVICES=1. Passing --env CUDA_VISIBLE_DEVICES to docker run forwards that value, so the container environment contains CUDA_VISIBLE_DEVICES=1. However, docker run --gpus "device=${CUDA_VISIBLE_DEVICES:-all}" exposes only GPU 1 to the container and renumbers it as device 0. CUDA code running inside the container therefore sees a single GPU with device ID 0, while CUDA_VISIBLE_DEVICES still points at device ID 1, which does not exist there, and the result is an (uncaught) exception.
1 parent 70c3ed1 commit 2b72119

1 file changed

Lines changed: 1 addition & 1 deletion

.github/workflows/docker-devito.yml

@@ -23,7 +23,7 @@ jobs:
 tag: 'nvidia-nvc'
 # Respect CUDA_VISIBLE_DEVICES set by the runner and hard-limit docker to that device.
 # (--env without value forwards host var; --gpus maps only that device)
-flag: --init --env CUDA_VISIBLE_DEVICES --gpus "device=${CUDA_VISIBLE_DEVICES:-all}"
+flag: --init --gpus "device=${CUDA_VISIBLE_DEVICES:-all}"
 test: 'tests/test_gpu_openacc.py tests/test_gpu_common.py'
 runner: ["self-hosted", "nvidiagpu"]
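
The index mismatch described in the commit message can be illustrated with a small toy model. This is only a sketch, not the CUDA runtime's actual logic: the function name visible_devices and its parameters are hypothetical, it handles integer indices only (no GPU UUIDs), and it assumes the documented behaviour that an invalid index in CUDA_VISIBLE_DEVICES hides that entry and every entry after it.

```python
from typing import List, Optional


def visible_devices(container_gpu_count: int,
                    cuda_visible_devices: Optional[str]) -> List[int]:
    """Toy model of CUDA_VISIBLE_DEVICES filtering (integer indices only).

    container_gpu_count: number of GPUs exposed to the container, which
    are numbered 0..count-1 inside it regardless of their host index.
    """
    # Variable not set: every GPU exposed to the container is visible.
    if cuda_visible_devices is None:
        return list(range(container_gpu_count))

    visible = []
    for token in cuda_visible_devices.split(","):
        token = token.strip()
        # An invalid entry hides itself and everything listed after it.
        if not token.isdigit() or int(token) >= container_gpu_count:
            break
        visible.append(int(token))
    return visible


# Host runner has CUDA_VISIBLE_DEVICES=1, so --gpus "device=1" exposes
# exactly one GPU, which the container sees as device 0.
print(visible_devices(1, "1"))   # old flags: the forwarded value names index 1
print(visible_devices(1, None))  # new flags: the variable is not forwarded
```

Under these assumptions the first call yields no usable device, because index 1 does not exist among the single renumbered GPU, while the second call yields device 0. That matches the motivation for dropping the --env argument.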
