[Feature]: Support multi-architecture container images via CVMFS multiarch layout #8454

@aldbr

User Story

As a DIRAC site administrator deploying ARM (aarch64) and other non-x86_64 worker nodes, I want DIRAC to automatically resolve the correct container image for the worker node's architecture, so that Pilot-Jobs can seamlessly execute payloads on heterogeneous clusters without manual per-architecture configuration.

Feature Description

DIRAC currently hardcodes or configures a single container image path (via ContainerRoot) that implicitly assumes x86_64. With the introduction of ARM resources and the CVMFS unpacked.cern.ch support for multi-architecture images, DIRAC needs to:

  1. Detect the worker node architecture at runtime (using platform.machine() / uname -m).
  2. Map it to the OCI architecture name used by the CVMFS multiarch layout (e.g. x86_64 maps to amd64, aarch64 maps to arm64).
  3. Resolve the correct image path under the CVMFS .multiarch directory structure: /cvmfs/unpacked.cern.ch/.multiarch/<arch>/<registry>/<image>:<tag>
    For example: /cvmfs/unpacked.cern.ch/.multiarch/arm64/registry.hub.docker.com/library/alma9:latest
    Note: Variant handling is deferred. Initially, DIRAC only resolves by architecture (uname -m), not by variant. Variant support can be added later if needed.
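
The three steps above can be sketched as follows. This is only an illustration of the intended resolution logic, not existing DIRAC code: the names `MACHINE_TO_OCI_ARCH`, `oci_arch` and `multiarch_image_path` are hypothetical, and the mapping contains only the two architectures mentioned in this issue.

```python
import platform
from pathlib import PurePosixPath

# Root of the CVMFS multiarch layout, as described in this issue.
CVMFS_MULTIARCH_ROOT = PurePosixPath("/cvmfs/unpacked.cern.ch/.multiarch")

# Hypothetical mapping from `uname -m` values to OCI architecture names.
# Only the two examples given in the issue are listed (assumption: more
# entries would be added as new architectures are supported).
MACHINE_TO_OCI_ARCH = {
    "x86_64": "amd64",
    "aarch64": "arm64",
}


def oci_arch(machine=None):
    """Return the OCI architecture name for a machine string.

    If `machine` is not given, it is detected with platform.machine().
    """
    machine = machine or platform.machine()
    if machine not in MACHINE_TO_OCI_ARCH:
        raise ValueError(f"No OCI architecture known for machine {machine!r}")
    return MACHINE_TO_OCI_ARCH[machine]


def multiarch_image_path(image_ref, machine=None, root=CVMFS_MULTIARCH_ROOT):
    """Build <root>/<arch>/<registry>/<image>:<tag> for the given image.

    `image_ref` is the "<registry>/<image>:<tag>" part, for example
    "registry.hub.docker.com/library/alma9:latest".
    """
    # Strip a leading slash so that image_ref cannot replace the root.
    return root / oci_arch(machine) / image_ref.lstrip("/")


if __name__ == "__main__":
    # Passing `machine` explicitly makes the example reproducible on any host.
    print(multiarch_image_path(
        "registry.hub.docker.com/library/alma9:latest", machine="aarch64"
    ))
```

Raising on an unknown architecture is a design choice made for the sketch; a real implementation might prefer to fall back to the configured image instead.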

The change should affect both code paths that launch containers:

  • SingularityComputingElement (job execution via the JobAgent)
  • dirac-apptainer-exec (Pilot-Job command execution)

It should maintain backward compatibility with the existing ContainerRoot configuration option: if ContainerRoot is set and the new multiarch path doesn't exist, fall back to ContainerRoot.
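
A minimal sketch of this fallback rule, under stated assumptions: `select_container_root` is a hypothetical helper, not an existing DIRAC function. Preferring the multiarch path whenever it exists, and returning `None` when neither option is usable, are assumptions of the sketch rather than requirements stated in this issue.

```python
import os
from typing import Callable, Optional


def select_container_root(
    multiarch_path: str,
    container_root: Optional[str],
    path_exists: Callable[[str], bool] = os.path.isdir,
) -> Optional[str]:
    """Choose the image directory to hand to the container runtime.

    `multiarch_path` is the resolved path under the .multiarch layout.
    `container_root` is the value of the ContainerRoot option, if set.
    `path_exists` is injectable so the logic can be tested without CVMFS.
    """
    # Assumption: the architecture-specific image wins when it is present.
    if path_exists(multiarch_path):
        return multiarch_path
    # Backward compatibility: fall back to the configured ContainerRoot.
    if container_root:
        return container_root
    # Assumption: with no usable image, the caller decides how to fail.
    return None
```

For example, `select_container_root(path, "/some/configured/root", path_exists=lambda p: False)` returns the configured root, which is the behaviour existing single-architecture sites would keep.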

Definition of Done

  • Multi-architecture images hosted in CVMFS unpacked are supported
  • DIRAC should pick the right container based on the underlying architecture
  • Unit tests
  • Documentation
  • Backward compatibility (no breaking change)

Alternatives Considered

No response

Related Issues

First attempt: #7589. CVMFS unpacked support for multi-architecture images did not exist at that time; we are now in the process of testing it with LHCb images.

Additional Context

No response
